Prediction of potential genes in microbial genomes
Time: Sat May 28 19:01:48 2011
Seq name: gi|226333018|gb|ACII01000001.1| Ruminococcus sp. 5_1_39B_FAA cont1.1, whole genome shotgun sequence
Length of sequence - 12879 bp
Number of predicted genes - 15, with homology - 15
Number of transcription units - 8, operones - 2 average op.length - 4.5

 N Tu/Op    Conserved   S      Start     End  Score
            pairs(N/Pv)
 1  1 Tu 1 .        + CDS    166 -   399     91 ## gi|253578426|ref|ZP_04855698.1| transposase
                     + Prom   489 -   548    6.8
 2  2 Tu 1 .        + CDS    617 -  1354    217 ## gi|253577769|ref|ZP_04855041.1| conserved hypothetical protein
                     + Term  1366 -  1421    8.4
 3  3 Tu 1 .        - CDS   1418 -  1747    259 ## Cphy_1320 small acid-soluble spore protein alpha/beta type
                     - Prom  1902 -  1961    8.6
                     + Prom  1918 -  1977    9.4
 4  4 Op 1 .        + CDS   2037 -  2432    525 ## COG1803 Methylglyoxal synthase
 5  4 Op 2 14/0.000 + CDS   2429 -  2980    263 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29
                     + Term  3035 -  3064    2.1
                     + Prom  3013 -  3072    1.7
 6  4 Op 3 .        + CDS   3107 -  3601    408 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19
 7  4 Op 4 .        + CDS   3660 -  4292   1062 ## EUBELI_01034 hypothetical protein
                     + Term  4332 -  4366    6.0
                     - Term  4319 -  4353    6.0
 8  5 Tu 1 .        - CDS   4401 -  5171    255 ## COG3314 Uncharacterized protein conserved in bacteria
                     + Prom  5316 -  5375    5.7
 9  6 Tu 1 .        + CDS   5497 -  6948   1772 ## COG2195 Di- and tripeptidases
                     + Term  6954 -  7003   11.9
                     + Prom  7061 -  7120    5.7
10  7 Tu 1 .        + CDS   7148 -  7960    734 ## COG2357 Uncharacterized protein conserved in bacteria
                     + Prom  7971 -  8030    3.1
11  8 Op 1 .        + CDS   8063 - 10009   1394 ## COG1032 Fe-S oxidoreductase
12  8 Op 2 .        + CDS  10006 - 10659    508 ## EUBELI_01028 hypothetical protein
13  8 Op 3 .        + CDS  10741 - 11766    718 ## COG0564 Pseudouridylate synthases, 23S RNA-specific
14  8 Op 4 .        + CDS  11690 - 12250    574 ## COG0681 Signal peptidase I
15  8 Op 5 .
+ CDS 12276 - 12809 432 ## COG3331 Penicillin-binding protein-related factor A, putative recombinase Predicted protein(s) >gi|226333018|gb|ACII01000001.1| GENE 1 166 - 399 91 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578426|ref|ZP_04855698.1| ## NR: gi|253578426|ref|ZP_04855698.1| transposase [Ruminococcus sp. 5_1_39B_FAA] # 1 77 406 482 482 149 93.0 7e-35 MEKYMQNHLLFLHDSRIPATNNEAERLLRNYKLKQAQAVTFRSFESIDYLCQCMSMLVLM RLEEPANIFDRVSRIFG >gi|226333018|gb|ACII01000001.1| GENE 2 617 - 1354 217 245 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253577769|ref|ZP_04855041.1| ## NR: gi|253577769|ref|ZP_04855041.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 245 1 245 245 467 100.0 1e-130 MKKVRLILCSFLALLMMVLTAPVITKAATEPVCLKKATLRYLKWTNYVVDGREKFSTIAW DPLYIGNLSGSAVITNIKSSNKHFTARKGVGVNAIYVEEKPDYVIKDGIKTTISFNVTQN GRTYKLSSRITFKPYGAFFKSFKLGTKDYASDVSGYRLKKMSPLKRNTGKIQITPSNGYK IDEIIVNYPNKSRKVKNGSSISLKNVRKISVWYHTTKKPAYYQNPGKDYRGWTGSPLNYC FDLSF >gi|226333018|gb|ACII01000001.1| GENE 3 1418 - 1747 259 109 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1320 NR:ns ## KEGG: Cphy_1320 # Name: not_defined # Def: small acid-soluble spore protein alpha/beta type # Organism: C.phytofermentans # Pathway: not_defined # 6 62 3 59 66 85 70.0 7e-16 MSNTKSSNQTAVPEAKGALDRFKFEVANELGVPLTDGYNGNLTSKQNGSVGGYMVKKIVS EQPLNFNSIFYIIPRVIVNYRTTFPGISLLIFLFHPLLKASLVAQSSLV >gi|226333018|gb|ACII01000001.1| GENE 4 2037 - 2432 525 131 aa, chain + ## HITS:1 COG:lin2020 KEGG:ns NR:ns ## COG: lin2020 COG1803 # Protein_GI_number: 16801086 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Listeria innocua # 1 127 1 127 134 140 50.0 6e-34 MNLGIIAHNSKKVLIEDFCIAYKNILAKHEVYATGTTGRRIEEATSLHVHKFLAGSIGGD KQFMEMVERQDLDMVILFYNPVMIDPKEPDITSIVRRCDQYNIPVATNIATAELLILGLA RGDLDWRLNLK >gi|226333018|gb|ACII01000001.1| GENE 5 2429 - 2980 263 183 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 
151 13 164 199 105 41 1e-22 MRVIAGKARRLNLKTIPGIDTRPTTDRIKETLFNILQPELLECRFLDLFSGSGGIGIEAL SRGASYAVFVEKNPKAAACIRENLAFTKLAEDGKLLNMDVLQALRSLEGKGVFDIIFMDP PYNNELERQVLEYLKDSTVADKNTLIIVEADLQTDFSYVESLGYRQLRSKEYKTNKHVFL ERA >gi|226333018|gb|ACII01000001.1| GENE 6 3107 - 3601 408 164 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 2 156 4 158 164 161 52 2e-39 MKIAVYPGSFDPATYGHLDVIRRAAVSFDKVIVGVLHNSSKSPLFSVQERVNILEKATRD VPNVEVKPFEGLSVNFARENHAQVIIRGLRAVTDFEYELQMAQTNRVLAPDVDTVFLTTS LEYAYLSSTILKEVAHFGGDLSKFAPAEITDAVIEKIRLTADNK >gi|226333018|gb|ACII01000001.1| GENE 7 3660 - 4292 1062 210 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01034 NR:ns ## KEGG: EUBELI_01034 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 201 2 203 203 165 50.0 1e-39 MSSRIEQIIGEIEEYVDSCKFQPLSTTKIVVNKEEIEELLRELRLKTPDEIKRYQKIINN KDAILEDAQTKADALIADAQARAQELVTQHEIMQKAYAQANDTINAANKQAQEILDSATQ DANSIRLSAITYTDDMMANIGSVLNTTLEDAGVKYKTFIDSLQSCLDVVNHNRQELAPQT AQAAAITGSYHDNDGAKDDDDDLLDDLGEN >gi|226333018|gb|ACII01000001.1| GENE 8 4401 - 5171 255 256 aa, chain - ## HITS:1 COG:BH2588 KEGG:ns NR:ns ## COG: BH2588 COG3314 # Protein_GI_number: 15615151 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 23 256 83 349 411 60 24.0 3e-09 MLTHTGHIRKLLLPFESFFHIFPGVSIWGGYVFVLGMLCGYPLGAKLASDLYGLHKISKK EALYLTTFCNNPSPAFIITYLCKSCLKDTVPAGMVFASIFSANLICILFFRFFVYQNATI SDSPGFSPEISSRKNILDFSIMNGFETITRLGGYILLFSILAGCIRYYAPFPLLYQYLLL GFTEITSGLSLISVSGLPYMTRILLSAATTASGGLCILAQTKGVLHHDLSILPYFISKCI CTFLTAGILYCLISYL >gi|226333018|gb|ACII01000001.1| GENE 9 5497 - 6948 1772 483 aa, chain + ## HITS:1 COG:FN1277 KEGG:ns NR:ns ## COG: FN1277 COG2195 # Protein_GI_number: 19704612 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Fusobacterium nucleatum # 4 482 5 486 486 347 39.0 3e-95 
MAVLENCEPKRVFHYFEEICKIPHGSGNTKQISDYLVQFAKDHDLKYVQDEMNNVVIYKP GTAGYENAPVVIIQGHMDMVCEKRPDVEHDFTKDGLNLSVEGDYVSANGTTLGGDDGIAV AYGLALLESDTIAHPPLEVFITVDEEIGLLGAVGFDCSVLKGRRFINLDSEAEGSLWISC AGGLSGVSHIPVTRLDAEGEKLTVKISGLMGGHSGAEIDKNRANANSLLGKFLHGLDEKT DFELISVQGGQKDNAITREATAEILVLEENVDAVREYAASVQGAWREEYAGTDEGITVTV EDEGKQEVRVLHPTSKEKVIFFLVNVPYGVQKMSGTIKGLVETSTNIGILKTSENEVMGS SSIRSSVETARDSLSDKIAYLTEFLGGEYERQGVYPAWEYRKDSPLRDKMVEVYEEMYGQ KPNVVAIHAGLECGLFYKKMEGLDCVSLGPDMKDIHTSEEVLSISSTERVWKYLVKVLEA LKE >gi|226333018|gb|ACII01000001.1| GENE 10 7148 - 7960 734 270 aa, chain + ## HITS:1 COG:CAC0642 KEGG:ns NR:ns ## COG: CAC0642 COG2357 # Protein_GI_number: 15893930 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 6 216 6 216 262 201 52.0 1e-51 MEIQLWRSILCPYELAVRELEVKFNHIIDECKENDVYCPIEQVEGRVKSVSSILEKMQRK HIPMERMEEELEDIAGVRIICQFEEDIETVASLIQKRSDMVIKSEKNYLKHIKQSGYRSY HLIIYYTVDTIKGPKKLQAEIQIRTMAMNFWATIEHSLQYKYKGDMPGHVAERLSKAADA INALDHEMSSVRNEIMDAQNSSQMQSNLVKDILINIENLYKIANKREIMKIQDEFLRVFK TKDLQQLKRFHRQLDIISEGYRAQAVYHHV >gi|226333018|gb|ACII01000001.1| GENE 11 8063 - 10009 1394 648 aa, chain + ## HITS:1 COG:MA4618 KEGG:ns NR:ns ## COG: MA4618 COG1032 # Protein_GI_number: 20093399 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Methanosarcina acetivorans str.C2A # 4 595 126 737 742 600 49.0 1e-171 MQDFLPVTKKEMKQRGWEQVDFAYITGDAYVDHPSFGTAIISRLLESRGYKVGIIPQPDW RKKESIQVFGEPRLGFLVSAGNMDSMVNHYTVSKKHRQKDSYSPGGQMGLRPDRAVIVYS NLIRQTYKKTPIILGGIEASLRRLAHYDYWENKVKHSVLLDSGADMISYGMGEHSIIEIA DALASGLPVEELTYIAGTVFKCRDLSRVYDPIILPSYEEVKVNKKVYADSFAIQYQNTDP FSARPMVESYGTKGYIIQNPPALPLTQEEMDDVYALPYTEKVHPMYEKLGGIPALEEIKF SLTSNRGCFGGCNFCALTFHQGRILQTRSHESLIEEATRMTNDPEFKGYIHDVGGPTADF RQPSCQKQLTKGVCKNKQCLFPTPCKNLTVDHSDYVSLLRKLRKIPGVKKVFIRSGVRFD YVVADRDKTFLRELVEHHVSGQLRVAPEHVSDQVLKYMGKPSHSVYQQFLKEYDAANRQT GKQQYAVPYFMSSHPGCTMKEAVKLAEYVRDLGFTPEQVQDFYPTPSTLSTCMYYTGIHP 
LTGEKVYVPKNPHEKAIQRALMQYKNPANRELVLEGLKMTGRMDLVGYGPKCLIRPLREN HGGQQHTQGSSRNAKNSRNSKNNKNSSASAKKNTIRNIHKKNSRGGRK >gi|226333018|gb|ACII01000001.1| GENE 12 10006 - 10659 508 217 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01028 NR:ns ## KEGG: EUBELI_01028 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 19 177 3 161 191 86 32.0 6e-16 MTEKKNRNKKQSKAKTSSKVSKGNFGYFRSEKRRRLIITAILFAVPLFIFFTSWFYFKTR MTVWTVVTVVGCLPACKSMVNLIMILKCRPMDAGLYQKIHEHQGSLDMAYELYMTFYEKS AYIDAAAVCGNTVVAYSSDPKIDASFMETNSQKIIRKNGYKVTVKIFTDLRPFLERLDSM NDHRESLEEGIKFTPDEKYPDLSRNELIRQTILALCL >gi|226333018|gb|ACII01000001.1| GENE 13 10741 - 11766 718 341 aa, chain + ## HITS:1 COG:CAC1015 KEGG:ns NR:ns ## COG: CAC1015 COG0564 # Protein_GI_number: 15894302 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Clostridium acetobutylicum # 3 309 2 314 318 179 35.0 1e-44 MKEFIIEENEAGQRFDKYLAKLLREAPKSFFYKMLRKKNITLNGKKATGNEKLLKGDTIK LFLSDETFDKFAGSSQAARAYCELDIVYEDPDIIIINKPAGMLSQPAYDGEPSLVEYLTG YLLKKGDLTEEQLKTFRPSVCNRLDRNTSGMVCAGKSLAGLQFLSRIFHDRTLHKYYICL TKGKIEKPDHIRGYLHKDKKTNKVIVSRQEFKDSLPIETKYCPLGSNGKITLLEVELITG RTHQIRAHLAGTGHPLLGDTKYGDSEFNKQYIRHGVRHQLLHAYRLVIPETDQTFVAPAP ELFCKIIKEENLEEAYHENLERNMGLRKNDHHSSSDRNACE >gi|226333018|gb|ACII01000001.1| GENE 14 11690 - 12250 574 186 aa, chain + ## HITS:1 COG:alr2975 KEGG:ns NR:ns ## COG: alr2975 COG0681 # Protein_GI_number: 17230467 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Nostoc sp. 
PCC 7120 # 2 184 16 187 190 116 35.0 3e-26 MRIWKEIWDYAKMIIIVVVIVTLVNSVVLINAKIPSESMEKTIMTGDRIFGFRLAYGLNL DFFGHEISKKIKDPERFDIVIFKYPDDESKLFIKRVIGLPGEKVQIKDGKVYINDSEIPL DDSFVPEKPRGSFGPYEVPENSYFVLGDNRNHSKDSRCWKSTSFVTFDEIVGKAVIRYYP SVKLLK >gi|226333018|gb|ACII01000001.1| GENE 15 12276 - 12809 432 177 aa, chain + ## HITS:1 COG:BH3539 KEGG:ns NR:ns ## COG: BH3539 COG3331 # Protein_GI_number: 15616101 # Func_class: R General function prediction only # Function: Penicillin-binding protein-related factor A, putative recombinase # Organism: Bacillus halodurans # 10 147 8 144 168 110 41.0 1e-24 MATWNSRGLRGSTLEDLVNRTNEQYREKGLALIQKIPTPITPVRMDKENRHITLAYFEQR STVDYIGAVQGIPVCFDAKECAADTFPLANIHPHQVEFMQAFEKQGGVAFFLIFYSHENQ FYYLTLRSLLTFWNRMQEGGRKSFRREELEPQYYLNKKSGFLVPFLDGIQIDLDERE Prediction of potential genes in microbial genomes Time: Sat May 28 19:02:23 2011 Seq name: gi|226333017|gb|ACII01000002.1| Ruminococcus sp. 5_1_39B_FAA cont1.2, whole genome shotgun sequence Length of sequence - 41393 bp Number of predicted genes - 36, with homology - 36 Number of transcription units - 14, operones - 9 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 497 - 556 8.2 1 1 Op 1 11/0.000 + CDS 651 - 1769 1141 ## COG3705 ATP phosphoribosyltransferase involved in histidine biosynthesis 2 1 Op 2 18/0.000 + CDS 1773 - 2420 905 ## COG0040 ATP phosphoribosyltransferase 3 1 Op 3 6/0.000 + CDS 2423 - 3727 1775 ## COG0141 Histidinol dehydrogenase + Prom 3748 - 3807 3.9 4 1 Op 4 . + CDS 3835 - 4422 636 ## COG0131 Imidazoleglycerol-phosphate dehydratase 5 1 Op 5 . + CDS 4480 - 4833 404 ## COG0139 Phosphoribosyl-AMP cyclohydrolase + Term 4867 - 4905 1.0 + Prom 5246 - 5305 3.9 6 2 Tu 1 . + CDS 5402 - 6799 2077 ## COG4822 Cobalamin biosynthesis protein CbiK, Co2+ chelatase + Term 6824 - 6869 9.8 + Prom 6964 - 7023 5.5 7 3 Op 1 . 
+ CDS   7124 -  8143   1409 ## EUBREC_1930 hypothetical protein
 8  3 Op 2 33/0.000 + CDS   8222 -  9361   1298 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component
                     + Prom  9363 -  9422    3.1
 9  3 Op 3 35/0.000 + CDS   9501 - 10520   1008 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component
10  3 Op 4 .        + CDS  10600 - 11757   1172 ## COG1120 ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components
                     + Term 11794 - 11844   -0.0
                     + Prom 11859 - 11918    3.6
11  3 Op 5 .        + CDS  11942 - 12712    460 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
                     + Term 12717 - 12789   17.1
                     + Prom 12804 - 12863    6.6
12  4 Op 1 .        + CDS  13012 - 13275    402 ## gi|253577794|ref|ZP_04855066.1| predicted protein
13  4 Op 2 .        + CDS  13354 - 14976   1470 ## gi|253577795|ref|ZP_04855067.1| conserved hypothetical protein
                     + Prom 15135 - 15194    7.1
14  5 Tu 1 .        + CDS  15251 - 17092   2186 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains
                     + Term 17136 - 17186   11.1
15  6 Op 1 .        - CDS  17180 - 18256    431 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
                     - Term 18276 - 18313    7.8
16  6 Op 2 .        - CDS  18317 - 18817    532 ## DSY0622 hypothetical protein
                     - Prom 18875 - 18934    4.7
                     + Prom 19510 - 19569    4.0
17  7 Tu 1 .        + CDS  19652 - 21247   1391 ## EUBREC_0128 hypothetical protein
                     + Term 21275 - 21337   13.1
                     + Prom 21298 - 21357    6.5
18  8 Op 1 .        + CDS  21454 - 22254    807 ## lin0468 hypothetical protein
19  8 Op 2 .        + CDS  22323 - 23654   1134 ## COG3610 Uncharacterized conserved protein
                     + Term 23676 - 23736   17.1
                     + Prom 23825 - 23884    6.5
20  9 Tu 1 .        + CDS  23919 - 24869   1137 ## COG0039 Malate/lactate dehydrogenases
                     + Prom 25283 - 25342    5.1
21 10 Op 1 7/0.000  + CDS  25466 - 26260    942 ## COG2048 Heterodisulfide reductase, subunit B
22 10 Op 2 2/0.000  + CDS  26257 - 26646    361 ## COG1150 Heterodisulfide reductase, subunit C
23 10 Op 3 2/0.000  + CDS  26653 - 28641   2165 ## COG1148 Heterodisulfide reductase, subunit A and related polyferredoxins
24 10 Op 4 2/0.000  + CDS  28632 - 29075    248 ## COG1908 Coenzyme F420-reducing hydrogenase, delta subunit
25 10 Op 5 3/0.000  + CDS  29063 - 30037   1140 ## COG1145 Ferredoxin
26 10 Op 6 6/0.000  + CDS  30041 - 31072    941 ## COG1145 Ferredoxin
27 10 Op 7 .        + CDS  31072 - 31926   1100 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases
28 11 Op 1 .        + CDS  32105 - 32782    781 ## Nther_2666 ferredoxin
29 11 Op 2 .        + CDS  32797 - 34410   2076 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit
                     + Term 34436 - 34485   12.3
                     + Prom 34433 - 34492    6.1
30 12 Op 1 11/0.000 + CDS  34547 - 35389    502 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase
31 12 Op 2 .        + CDS  35391 - 35552    155 ## COG1838 Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain
32 12 Op 3 .        + CDS  35494 - 35958    274 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase
                     + Term 35979 - 36030   14.6
                     + Prom 36026 - 36085   11.8
33 13 Tu 1 .        + CDS  36310 - 36981    482 ## COG3619 Predicted membrane protein
                     + Term 36994 - 37055   14.2
                     + Prom 37089 - 37148    5.7
34 14 Op 1 23/0.000 + CDS  37304 - 37783    631 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit
35 14 Op 2 1/0.000  + CDS  37780 - 39651   1848 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit
36 14 Op 3 .
+ CDS 39667 - 41355 1793 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain Predicted protein(s) >gi|226333017|gb|ACII01000002.1| GENE 1 651 - 1769 1141 372 aa, chain + ## HITS:1 COG:CAC0935 KEGG:ns NR:ns ## COG: CAC0935 COG3705 # Protein_GI_number: 15894222 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase involved in histidine biosynthesis # Organism: Clostridium acetobutylicum # 6 361 7 364 407 248 35.0 1e-65 MQRIFHTPEGVRDIYNGECSQKRKVQEKIHRVFHQYGYEDIETPTFEYFEVFSREVGTIP SKELYKFFDREGNTLVLRPDFTPSVSRACATYFSPDKEPVSLCYTGNTFINNTSYRGHLK ETTQMGVERIGDESADADAELLAMTVECLLAAGLTEFQVSVGQVDYFKSLLKEAELGPEA EERLRVLISQKNSFGVEEFVEEQKLKDSMQKAFTEIPQMFGSEEVLKKARSLTNNACALE AVSRLEEIYEIMKNYGYEKYISFDFGMLSKYQYYTGIIFQAYTYGTGEPMIKGGRYNGLM KHFGKPAASIGFALEVDNLLLALSSQKLISEKEEKPEVIEYEPQNRTEAIKEAQKLRAEG RCVALRLKKEVR >gi|226333017|gb|ACII01000002.1| GENE 2 1773 - 2420 905 215 aa, chain + ## HITS:1 COG:CAC0936 KEGG:ns NR:ns ## COG: CAC0936 COG0040 # Protein_GI_number: 15894223 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 2 214 5 214 215 196 49.0 3e-50 MRYLTVALTKGRLANKTMEMFEKAGITCEEMKDKDSRKLIFTNEELKMKFFLAKGPDVPT YVEYGAADIGIVGKDTILEEGRKLYEVMDLGFGKCKMCVCGPESAREVLENNQLIRVATK YPNIAKDYFFNRKHQTVDLIKLNGSIELAPIVGLSEVIVDIVETGSTLKENGLKVLEEVC PLSARMVVNQVSMKMENERIRKLIEDLRRVLQEEM >gi|226333017|gb|ACII01000002.1| GENE 3 2423 - 3727 1775 434 aa, chain + ## HITS:1 COG:CAC0937 KEGG:ns NR:ns ## COG: CAC0937 COG0141 # Protein_GI_number: 15894224 # Func_class: E Amino acid transport and metabolism # Function: Histidinol dehydrogenase # Organism: Clostridium acetobutylicum # 15 429 19 431 431 466 59.0 1e-131 MRTVKLTKESTKDILENLLKRSPNNYGKFESTVAQILDKVKNEGDAALFAYTKEFDKADV TKETIRVTDAEIEEAYAQIDPALLGVIRKALVNIRQYHEKQIQNSWFTSTTDGTMLGQKV TPLNRVGVYVPGGKAVYPSSVLMNIVPAKVAGVPHIVMTTPPGKDGKVCASTLVAAKEAG ADEIYKVGGAQAIGALAFGTESIPKVDKIVGPGNIFVALAKKAVYGYVSIDSIAGPSEIL 
VLADETANPHFVAADLLSQAEHDELACAILITTSEEFAKKVDEEVKGFVEVLSRKEIIQK SLDNFGYILIAEDMDEAIEAANEIAPEHMEIVTANPFEDMMKVKNAGAIFIGEYSSEPLG DYFAGPNHVLPTNGTAKFFSALSVDDFIKKSSIVYYSKSALRNIHKDIIQFATSEQLTAH ANSIAVRFEDEDKE >gi|226333017|gb|ACII01000002.1| GENE 4 3835 - 4422 636 195 aa, chain + ## HITS:1 COG:CAC0938 KEGG:ns NR:ns ## COG: CAC0938 COG0131 # Protein_GI_number: 15894225 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate dehydratase # Organism: Clostridium acetobutylicum # 3 195 5 197 197 222 54.0 4e-58 MAREASAERNTKETEIKLKINLDGTGYSDIETGVGFFNHMLDGFTRHGLFDLSVRVHGDL QVDDHHTIEDTGIVLGTALKEAIGDKKGIKRYGSCILPMDESLVLCAIDLSGRPYFVWDA EFTTDRIGDMSTEMVKEFFYAISYACGMNLHIRVLAGGNNHHVAEAMFKSFAKALDAATS YDPRITDVLSTKGSL >gi|226333017|gb|ACII01000002.1| GENE 5 4480 - 4833 404 117 aa, chain + ## HITS:1 COG:CAC0942 KEGG:ns NR:ns ## COG: CAC0942 COG0139 # Protein_GI_number: 15894229 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosyl-AMP cyclohydrolase # Organism: Clostridium acetobutylicum # 20 114 13 107 115 132 60.0 2e-31 MENTVNLKSSIAWSDFKLNSDGMVPVIAQDYQTNEVLMLAYMNEEAFNTTLETGKMTYWS RSRNELWTKGLTSGHLQFVKSLTLDCDNDTILAKVEQIGAACHTGNRTCFFKALTEV >gi|226333017|gb|ACII01000002.1| GENE 6 5402 - 6799 2077 465 aa, chain + ## HITS:1 COG:CAC1373 KEGG:ns NR:ns ## COG: CAC1373 COG4822 # Protein_GI_number: 15894652 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiK, Co2+ chelatase # Organism: Clostridium acetobutylicum # 146 422 5 262 278 155 31.0 1e-37 MKRKSMLALLLAVSMTCSMTAATGTVAFASDDAATEEAADDTEAAADDADAADTEEASDD TTEASDDDQKAADEVGALIDKIYVQERTDTTDEDCKAAKEAWDKLTDAQKELVTGEEASP EYFGRDTGDASKDDPRNQDEIGENELLVVSFGTSFNDSRAEDIKGIEDALAKAYPDWSVR RAFTAQIIINHVQARDDEVIDNMQQALDRAVANGVKNLVVQPTHLMHGAEYDEMTEAIDG YKDKFESVAIAEPMLGEVGDDATVINDDKKAVAQAITDEACKEAGFDDMKAAADAGTAFV FMGHGTSHTANVTYDQMQTQMDDLGFTNAFIGTVEGEPEDTACDKVIEKVKEAGFKNVIL RPLMVVAGDHANNDMAGDDADSWKSQFEASGDFDSVDCQIAGLGRIAAVEDLYVAHTKAA 
IDSLGASDDAAAEDTDAKATDDSADDAQADDAAETTADTAEADAE >gi|226333017|gb|ACII01000002.1| GENE 7 7124 - 8143 1409 339 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1930 NR:ns ## KEGG: EUBREC_1930 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 41 333 32 296 303 248 54.0 3e-64 MKKRSFTLLMSALTISATLIGTSPVAAKTDTDTATEETTDKEETTAKDADSENTDNKETD ASKSDTLEDGTYTAEFDTDSGMFHVNEANDGKGTLTVKEGKMTIHVSLASKKIVNLFVGK AEDAQKDGAEVLEPTTDTVTYSDGMSEEVYGFDIPVPAIDEEFDVALIGTKGKWYDHKVS VKNPVKSDDTDAKKDDKENKDADSKADDTDKDAKDSKTSEGKTLADLNLEDGDYTMDVTL TGGSGRATIDSPAAINVEGDKATATIVWSSPNYDYMLVDGEKYEPVNKDGNSTFEIPVSV FDAEMEVTADTVAMSEPHEIDYTLNFDSTTAKEAEKTAK >gi|226333017|gb|ACII01000002.1| GENE 8 8222 - 9361 1298 379 aa, chain + ## HITS:1 COG:CAC2441 KEGG:ns NR:ns ## COG: CAC2441 COG0614 # Protein_GI_number: 15895706 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Clostridium acetobutylicum # 33 362 134 455 471 147 29.0 3e-35 MSVALFIICICTVCTAYAADKDNAKTPVDIPGLTYDHSMELSYAEEFSVDYYNDGYALIT IDQEGQFLVVPEGKKAPKGLEKDITVIQQPLNNIYLVATSAMDLFRAIDGIDSIRLSGTQ ENGWYIQEAKDAMESGKMIYAGKYNAPDYELILDEGCGLAIESTMIHHNPEVEEKLEQFG IPVMVERSSYESHPLGRTEWMKLYAVLLGKEDVAEKAFKEQTDKLDKVLTSDDKDTGKTV AFFYINSTGAVNVRKNGDYVSNMIELAGGKYVPEDTGESDNALSTMNMQMEEFYAKAKDA DYIIYNSTIDGELSTIDELLAKSNLLADFKAVKDGNVWCTGKNLFQETTELGTMIEDIHT ILTTDDDSLDELAYMHKLK >gi|226333017|gb|ACII01000002.1| GENE 9 9501 - 10520 1008 339 aa, chain + ## HITS:1 COG:CAC2442 KEGG:ns NR:ns ## COG: CAC2442 COG0609 # Protein_GI_number: 15895707 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Clostridium acetobutylicum # 6 335 7 338 342 241 44.0 1e-63 MDDKKRTARYWMAFLILAVFLMILFVWNVNAGSIHLSVSEILNIIFRHQGDATAYNIIWE IRLPRILSVIILGGALSVSGFLLQTFFNNPIAGPFVLGISSGAKLVVALLMIYFLSRSIS MGSAAMIIAAFIGSMISMGFVLLVAKKVHNMSMLVISGVMIGYICSAVTDFVVTFADDSN 
IVNLHNWSLGSFSGMSWGNVKVMAIVVLLTMVIVFFMSKPIGAYQLGEVYAQNMGVNIKL FRIGLILLSSILSACVTAFAGPISFVGIAVPHIVKSLLKTARPLLVIPACFLGGAVFCLF CDLIARTVFAPTELSISSVTAVFGAPVVIYIMIRRQKRN >gi|226333017|gb|ACII01000002.1| GENE 10 10600 - 11757 1172 385 aa, chain + ## HITS:1 COG:CAC2443 KEGG:ns NR:ns ## COG: CAC2443 COG1120 # Protein_GI_number: 15895708 # Func_class: P Inorganic ion transport and metabolism; H Coenzyme transport and metabolism # Function: ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components # Organism: Clostridium acetobutylicum # 25 378 4 356 387 288 40.0 1e-77 MSEKIADVIKETENNKADRQEQYFFRTDQLTVGYDGKPLIREINIQLKKGEILTLIGPNG AGKSTILKSITRQLATISGTVYLDKEKMAKMTNKEVSQKLAVVLTERMRPELMTCEDIVA TGRYPYTGTLGILSAEDKTKVKKSMETVHAWDLKDRDFTAISDGQRQRILLARAICQEPE IIVLDEPTSFLDIRHKLELLTILKQMVLDHQLTVIMSLHELDLAQKISDKVICVHGEYIE KYGAPEEIFTSEYIRKLYGITRGSYNAAFGCVEMDPPRGEPQVFVIGGNGSGIPYYRKLQ RQGIPFAAGVLHTNDVDCQVAGALADQVITEKPFESISQESYDKAVELMKKCQKVICPLK DFGTVNAANRELLRLARELGILAEL >gi|226333017|gb|ACII01000002.1| GENE 11 11942 - 12712 460 256 aa, chain + ## HITS:1 COG:SPy0794 KEGG:ns NR:ns ## COG: SPy0794 COG0463 # Protein_GI_number: 15674837 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pyogenes M1 GAS # 18 234 1 210 231 80 32.0 2e-15 MYMYAIMDTINHPKERNMDKLYIVIPAYNEQDNIEQVINDWYPVIEKHNGNGQSRLIVID DGSKDSTYEKLKQCTKTRPLLIPITKPNGGHGATVLYGYKYALKNGADYIFQTDSDGQTL PEEFEPFWKRRQKYDMVIGWRKDRQDGISRVFVTKTLKLVIRICFGVNLTDANTPYRLMK AETMARYIHLIPKDFNLSNVLLAVIYKKKGCSIKFLPVTFRPRQGGVNSINMKKICKIGK QAVIDFVKLNRIIRRT >gi|226333017|gb|ACII01000002.1| GENE 12 13012 - 13275 402 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253577794|ref|ZP_04855066.1| ## NR: gi|253577794|ref|ZP_04855066.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 87 1 87 87 148 100.0 9e-35 MNKMNELCEMELTGRYEEKLSALKAEISAADEYIRQDNSDLTEFENRYERLDIPYRVKRF LDDYIACMQTKYERLADLCYLAGEQGL >gi|226333017|gb|ACII01000002.1| GENE 13 13354 - 14976 1470 540 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253577795|ref|ZP_04855067.1| ## NR: gi|253577795|ref|ZP_04855067.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 540 1 540 540 967 100.0 0 MYRKKNFISLGMSLVLAAGITVTPVTAADFTDGTVSEFNSDDATEFDVQQETADNAEFAS DTEDSISDKEGSNEADSDEEVFAEAVIPDENGAVDNRQLLEKYLNDILYGQDPDTQPAAS LGEDVLNGQQQTLYEELKSKISAVASKGGSTRFKVSSDLGLTWQTTASGNALKREVAAHL NELDISKIIDCLAVDCPYELYWYEKTAATTWQYGYSTKAAGNKTTVKIENLSISMPVISS YASGKYKVNAGIVQLAKSAAENARMIVKEFQGYTETEKLYGYKEMICYLTSYNDDVTEDD EYGDPWQLIYVFDGDDSTNVVCEGYSKAFQYLCDLSDITCYTVTGIMNGGTGEGPHMWNI VANNGKYYMADITNSDEGTVGEDGGLFLDTPISGSISRGYTYATDSANIYFAYDKEAKSL YGTGKNSILYMTQETYSADDLTAKPAKTSITGISNRTAGKLVITWKKAKSADGYEIYRRA GTTNPYVRAAVIKSGSTTKYTNAKLKKGHTYYYRIRSYRYDENGRKIYSGWSTVKKQKVK >gi|226333017|gb|ACII01000002.1| GENE 14 15251 - 17092 2186 613 aa, chain + ## HITS:1 COG:CAC0158 KEGG:ns NR:ns ## COG: CAC0158 COG0449 # Protein_GI_number: 15893453 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Clostridium acetobutylicum # 1 613 1 608 608 641 52.0 0 MCGIVGFTGNHQAAPILLYGLSRLEYRGYDSAGLAVRDGEGDTEVIKAKGRLKVLADKTN NGESVPGTCGIGHTRWATHGEPSETNAHPHMSDDGNVVAVHNGIIENYQELKDKLIRNGY EFYSSTDTEVAVKLVDYYYKKYLGTPVDAINHAMVRIRGSYALAIMFKDYPGEIYVARKD SPMILGVENGESYIASDVPAILKYTRNVYYIGNLEMARIRKGEITFYNLDGDEIQKEPKT IEWDAEAAEKAGFEHFMIKEIHEQPKAVRDTLNSVLKDDRIDLSEVGLTNEEIKKISQIY IVACGSAYHVGMAAQYVIEDLTRIPVRVELASEFRYRNPILDPEGLVVIVSQSGETADSL AALREAKQRGIRTLGIVNVVGSSIAREADNVFYTLAGPEISVATTKAYSTQLIASYALAV QFGKVREQITEEKYQELIAELKTLPDKIAKIIEDDKERIQWFAAKQANARDAFFIGRGID YAISMEGSLKMKEVSYVHTEAYAAGELKHGTISLIEDGTLVIGILTQNHLYEKTVSNMVE CKSRGAYLMGLTTYGHYNIEDTADFTVYIPKTDPHFATSLAVIPLQLLGYYVSVAKGLDV DKPRNLAKSVTVE 
>gi|226333017|gb|ACII01000002.1| GENE 15 17180 - 18256 431 358 aa, chain - ## HITS:1 COG:lin0696 KEGG:ns NR:ns ## COG: lin0696 COG0463 # Protein_GI_number: 16799771 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Listeria innocua # 4 149 5 148 637 97 36.0 3e-20 MISLSLCMIVKNEEKVLPRILKPMKEIVDKILICDTGSTDRTKEIIREFTAEVYDFPWKN DFSAARNFISEKVSTDYWMWLDADDMITQENLYRLKQLKETLSPNTDMVMMDYVTDFDEW NHAAFSCYRERILKTSRNFRWRGRVHESIIPTGNILYSPIQIEHRKIKPCSSFRNLHIYQ QMIEEGEPLEPRDLFYYGRELFYHKQYEYAICVLKKFLKEPDGWIENRLDSCLVLSYCYQ ASGNDQHALEILFHSFMSDIPRAEICCEIGKIFFMKRNFSMAAHWYHQALLAPDNNQNGG FYIPDCHNFIPFLQLCVCLDKSGMHKEAFEFHRKARTLKPEHPSVIQNQIYFHEILGF >gi|226333017|gb|ACII01000002.1| GENE 16 18317 - 18817 532 166 aa, chain - ## HITS:1 COG:no KEGG:DSY0622 NR:ns ## KEGG: DSY0622 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 31 145 313 427 705 72 56.0 4e-12 MLFGRHTSDDPELSGTSCCNNNNRCSCDCQGPQGPRGCPGPTGPRGCPGPVGPMGPQGPQ GEPGLQGYPGPVGPVGATGATGPVGATGATGPAGPSGAIGPTGPAGPIGPTGAAGPAGPT GATGATGPAGTVAPAAAVADATGQTDIVTQFNQLLANLRAAGLLAT >gi|226333017|gb|ACII01000002.1| GENE 17 19652 - 21247 1391 531 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0128 NR:ns ## KEGG: EUBREC_0128 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 526 22 547 550 682 61.0 0 MGMFVNPNAAAFQCAVNSEIYIDKTGLLEYTNKVLGTNARFICNSRPRRFGKSVTVDMLT AYYSKGCDTEKMFSGLEISKCPDFYEHLNKYDVIHFDVQWCCISAGSSENLISYITNIVV SELRETYPEVNLVENSTIYGAMARINTVLGKQFIVIIDEWDVLIRDEAHNQAAQEMYIDF LRNMFKGIEASKYIALAYLTGILPIKKLKTQSALNNFTQFTMLNAGPLTPYIGFNEDEVI DLCKKYEIDFAEVKRWYDGYRLGDYHVYNPNVLVNLTIMRTFQSYWSQTGTYLSILPLIN MDFDGLRTSIIEMLSGSSVEVNVNEFQNDMVSFADKDDVLTLLIHLGYLAYDQRTQRAYI PNEEIRQEFRAATKRNKWNELIEFQQESEKLLEATWEMDAETVAEQIEKIHAEYTSVIQY NNENSLSSVLSIAYLSAIKYYFKPIRELPTGRGFADFVFIPKPLYRDYYPALLVELKWNK DAETALNQIKERKYPESLQQYTGDILLVGINYDKKEKVHQCVIEKWFPLCI >gi|226333017|gb|ACII01000002.1| GENE 18 21454 - 22254 807 
266 aa, chain + ## HITS:1 COG:no KEGG:lin0468 NR:ns ## KEGG: lin0468 # Name: not_defined # Def: hypothetical protein # Organism: L.innocua # Pathway: not_defined # 1 258 1 254 254 224 42.0 2e-57 MSTVLPREEAVETIFSKILASPEASGRLSGVFYDHIDDDHRLTDNDRDHFLQVLFHAYQN GDISALLLELCGRSMFDLLREAYLIPKKFHGKAGENPVLLTDAAGELLPGEKVSAREYAK FKETYEHHECAPRSALYLADGYDLVRTYTEGLNITEEKDNRKRGVLALYALPDTCKLGLT EAQAYAVVWDAFQKIQEEAPRAMVYYGQETGLKKENPDKPYDEIGILLPIHEFEKKMLQH LDEIDGIVLACREKMMEKAGNDSLQL >gi|226333017|gb|ACII01000002.1| GENE 19 22323 - 23654 1134 443 aa, chain + ## HITS:1 COG:SA0699 KEGG:ns NR:ns ## COG: SA0699 COG3610 # Protein_GI_number: 15926421 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 303 434 5 135 164 65 34.0 2e-10 MPQEVIKKNHMDVAWHEYTDENGGNVPVVDSSIAEKASIIGRVGIMFLSCGTGAWRVRSS MNTLAEALGITCTEDIGLMSIEYTCYDGENGFTQSLCLTNTGVNTLKLNRLEHFIRNFEK EGKHMSGEQLHTFLDNIEKTHGLYSPPALGLAAAIACCGFTFLLGGGPIEMFCAFVGAGI GNYLRCKLTKHHFTLFLCIVSSVSLACFAYAGLLKLGEILFGISVQHEAGYICAMLFIIP GFPFITSGIDLAKLDMRSGIERLTYALIVILVATMSAWIMALILHLKPVDFIKLSLSTEQ WILFRLLASFCGVFGFSIMFNSPVRLAVAAAGIGAVANTLRLELVDLANIPPAAAAFIGA LTAGLLASLLKNKVGYPRISVTVPSIVIMVPGLYLYRGFYNLGIMSLSVAASWFASAILI IAALPLGLIFARILTDKAFRYCT >gi|226333017|gb|ACII01000002.1| GENE 20 23919 - 24869 1137 316 aa, chain + ## HITS:1 COG:CAC0267 KEGG:ns NR:ns ## COG: CAC0267 COG0039 # Protein_GI_number: 15893559 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Clostridium acetobutylicum # 2 313 4 312 313 269 45.0 4e-72 MSSKITIIGAGSVGSTIAYTLSSQDIASEIVLIDINKKKAEGEVLDIIQGTCFRDPISII AGDYEDARDSDIVIITSGIARKPGQTRLELTQTNVNILKSITPEIVKAAPNALYLIVSNP VDIMTYVFTKISGLPENQILGSGTILDSARLRCGLSEHFQIAQSNIHAYVFGEHGDTSFI PWSGAYISGVSVDEYYDLEKKLGKDIEPIDKEAMLQYVQKSGGEIISKKGATFYAVSSSV CKLCSLLVSSSESISTVSTMMHGEYGIDDVCLSTLTLVGPNGVQGKVPMRMTKAEIEQLK KSADALKEIIAQIDLN >gi|226333017|gb|ACII01000002.1| GENE 21 25466 - 26260 942 264 aa, chain + ## HITS:1 COG:SSO2358 KEGG:ns NR:ns ## 
COG: SSO2358 COG2048 # Protein_GI_number: 15899116 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit B # Organism: Sulfolobus solfataricus # 1 261 4 275 293 160 33.0 2e-39 MKYSYYPGCTLRNKAKDLDEYARASARALGFELEEIEDWQCCGGVYPLGTDEIATKLSSV RALNQAKEKGQDLVTICSACHHVIKRVNDDMKNVEDIRTRANNYMQLDEPYAGETTVLHY LEVLRDKVGFDKLKEKVVNPLTGKKIGAYYGCLLLRPGKIMAFDDPENPTIMEDFIRALG AEPVIYPYRNECCGGYISLKEKDMSQNMCQKIEDSAKGFGADMLITACPLCKYNLNKNAS SELPVYYFTELLAEALGVKEEVAK >gi|226333017|gb|ACII01000002.1| GENE 22 26257 - 26646 361 129 aa, chain + ## HITS:1 COG:MA3127 KEGG:ns NR:ns ## COG: MA3127 COG1150 # Protein_GI_number: 20091945 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit C # Organism: Methanosarcina acetivorans str.C2A # 21 96 40 114 247 68 35.0 3e-12 MRNQHSAAEKIKEISGTNPLKCMKCGKCSATCPSFNEMDIKPHQFVSYVVNEDIEALVNS KSLWKCLSCFACVERCPRDVKPGKIIDAARQLVVREKGGDYLKADEIPQLLDPELPQQLL VSAFRRYRR >gi|226333017|gb|ACII01000002.1| GENE 23 26653 - 28641 2165 662 aa, chain + ## HITS:1 COG:MK0265 KEGG:ns NR:ns ## COG: MK0265 COG1148 # Protein_GI_number: 20093705 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit A and related polyferredoxins # Organism: Methanopyrus kandleri AV19 # 3 662 7 653 656 643 50.0 0 MQRIGVFVCHCGSNIAATVDVKKVVELAAKEPGVVHAEDYQYMCSEAGQAKIQEAIKEKK LTGVVVCSCSPRMHEATFRKAAERAGLNPYMVEIANIREHCSWIHKDMEEATKKAVILAR AAIAKVNLNAPLQPGESRVTKRALIIGGGIAGIQTALDIADAGYEVDIVEKTPSIGGRMS QLDKTFPTLDCSACILTPKMVEAAAHEKINIYTYSEVEHVSGFVGDFTVDIRKKARSVNM DKCTGCGVCQEKCPSKKIPNEFNRGLNNRTAIYTPFAQAIPNVPVIDRENCLKFKTGKCG VCSKVCQAGAIDYDQQDEIVTQKYGAIVVATGFDTIKLDKYDEYAYSQSKDVITSLELER IMNAAGPTKGHLERLSDGKAPKELVFIQCVGSRCSDDRGKPYCSKICCMYTAKHAMLIRD KYPDTNVTVFYIDVRTPGKNFDEFYRRAVEQYGVNYIKGQVGKVIPQPDGSLLVQGSDLI DNKQILKKADMVVLATAIEPNPDVRKIATMLTASIDTNNFLTEAHAKLRPVESPTAGIFL SGVCQGPKDIPETVAQAGAAAVKAIGLLAKDKLTTNPCTAKSDELLCNGCSTCANVCPYG AISYEDKQVNDHGIRETRRVAVVNTALCQGCGACTVACPSGAMDLQGFSNRQIIAEVDAI CR 
>gi|226333017|gb|ACII01000002.1| GENE 24 28632 - 29075 248 147 aa, chain + ## HITS:1 COG:MTH1138 KEGG:ns NR:ns ## COG: MTH1138 COG1908 # Protein_GI_number: 15679149 # Func_class: C Energy production and conversion # Function: Coenzyme F420-reducing hydrogenase, delta subunit # Organism: Methanothermobacter thermautotrophicus # 9 135 3 129 136 162 58.0 2e-40 MSIETNEEFRPKIVAFCCNWCSYAGADLAGSSRLSYPADVKIIRVPCSCRVNPMFILRAF EKGADGVIMCGCHPGDCHYSTGNYYARRRMTLLFSMLDYIGVENGRTRVEWVSAAEGVKF SQTMNEFVEKIHSLGKNVRLEDLRCRS >gi|226333017|gb|ACII01000002.1| GENE 25 29063 - 30037 1140 324 aa, chain + ## HITS:1 COG:MTH1139_2 KEGG:ns NR:ns ## COG: MTH1139_2 COG1145 # Protein_GI_number: 15679150 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanothermobacter thermautotrophicus # 207 284 15 93 116 60 35.0 5e-09 MQELINRAKELLADGTVARVLGWKAGDLPYNPEPAYFETEESLKDFVYNGFCGANLSKYM IEASKLEGKTLVCLKPCDTYSFNQLIKEHRVDREKAYIIGVGCKGKLDIERIRKQGIKGI ESISGAEITDEAETLTIQTIYGEKTCAYKDAMLERCHVCKGKEHKIYDELIGESKDTKDA DRFAEVEKIEAMSPEEKFAFFQSQLSKCIRCNACRNVCPACSCRKCVFDSNKFDSAQKAN VDSFEEKMFHIIRAFHVAGRCTDCGECSRVCPQGIPLHLFNRKFIKDIDAFYGEYQAGED TTSKGPLTNFTFEDVEPSIVAERG >gi|226333017|gb|ACII01000002.1| GENE 26 30041 - 31072 941 343 aa, chain + ## HITS:1 COG:PAB0638 KEGG:ns NR:ns ## COG: PAB0638 COG1145 # Protein_GI_number: 14521154 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Pyrococcus abyssi # 3 334 5 327 334 169 33.0 8e-42 MYKIAAENLQALFQKIAADQDLYLPVKVSDQVNFKAWSEDAEVCLDTLKTVKSPKDAFFP QSENLYTCKKEEKNISIEPQALQEKNFVVFGMKACDIQGVKVLDNVFLSDPIDTFYAARR DHGTIVALACHEPEESCFCKVFGIDCADPVADVATWMIEGELYWKPLTEKGEALTKAVAE LLNDADEAKVEEEKAAIRAIVEKLPYSNLSLEGWGQEDYMDRFNSPVWEELYKPCLACGT CTFVCPTCQCYDIKDYDTGHGVQRYRCWDSCMYSDFTMMAHGNNRNSQMQRFRQRFMHKL VYYPVNNNGMFSCVGCGRCVEKCPSSLNIVKVIKAFENQGGEK >gi|226333017|gb|ACII01000002.1| GENE 27 31072 - 31926 1100 284 aa, chain + ## HITS:1 COG:PAB1785 KEGG:ns NR:ns ## COG: PAB1785 COG0543 # Protein_GI_number: 14521075 # Func_class: H 
Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Pyrococcus abyssi # 35 282 37 289 292 190 41.0 3e-48 MSECTCHKNYTLDTLIPKVGVITDIRQETPDVKTFRVNAPEGGKLFEHMPGQCAMVCAPG VSEGMFSITSSPTNKEYQEFSIKKCGALTDYLHSLQVGDEITVRGPYGNHFPVEDKLKGK NLLFIAGGIGLAPLRSVINYVLDNRADYGTVDILYGSRSVDDLVQLKEIQEVWMKAEGVN VYLTIDREQEGWDGHVGFVPSYLKEIGFDTNKTALVCGPPIMIKFVLAGLEELGFSREQV YTTLELRMKCGIGKCGRCNIGSKYVCKDGPVFRCDEIDELPAEY >gi|226333017|gb|ACII01000002.1| GENE 28 32105 - 32782 781 225 aa, chain + ## HITS:1 COG:no KEGG:Nther_2666 NR:ns ## KEGG: Nther_2666 # Name: not_defined # Def: ferredoxin # Organism: N.thermophilus # Pathway: not_defined # 2 225 16 239 239 258 52.0 8e-68 MEKNMVNVYFFGKKYSVPAELTIMTAMEYAGYTLKRGCGCRHGFCGACATIYRIKGENEL KTCLACQTQVQEGMYVASIPFFPTDKRLYNIEDLKPNQQVMMELYPEIYSCIGCNACTKA CTQDLNVMQYIAYAQRGELEKCAEESFDCVSCGCCSVRCPAGISHPMVGLLARRLTGKYI APKSEHLEKRVEEIHEGKYDDLIEQIMQKPITDMQELYNNREIEK >gi|226333017|gb|ACII01000002.1| GENE 29 32797 - 34410 2076 537 aa, chain + ## HITS:1 COG:MK0828 KEGG:ns NR:ns ## COG: MK0828 COG1053 # Protein_GI_number: 20094264 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Methanopyrus kandleri AV19 # 117 490 30 403 556 236 39.0 8e-62 MYAPYPENMMESIKKVEATRAQRMATEPRRLTAEEKDALLEKFHPDYNPDAFAEIKVGPN KGQKAPLELAEMLHSTSRLINEKIDLTKIDYDVDVLVIGGGGAGSSCAIEAHNAGADVMI VTKLRIGDANTMMAEGGIQAADKENDSPVQHYLDCFGGGHFAAKPELVKRLVMEAPDAIK WLNDLGVEFDKAEDGTMVTTHGGGTSRKRMHAAKDYSGSEIMRTIRDEVLNLKIPVVEFT SAVELLKDEKGQVAGAVLLNMETGDYSVARAKTVVIATGGAGRMHYQGFPTSNHYGATAD GLVLGYRAGASLLYQDSIQYHPTGAIYPSQIMGALVTEKVRSVGAQLVNANGEAYIHPLE TRDVNASGVIRECEEGRGVEVPGGLKGVWLDTPMIEILGGEGTIEKRIPAMFRMYMNYGI DMRKVPIVIYPTLHYQNGGLEINGDGFTKEIPNLLVAGEAAGGIHGRNRLMGNSLLDVIV FGRNAGKKAAAKCKNVELGEMNLDHIYKYADELDKAGVHTDKVSPLLLPKYARHEQR >gi|226333017|gb|ACII01000002.1| GENE 30 34547 - 35389 502 280 aa, chain + ## PROTEIN SUPPORTED 
## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 25 275 26 282 508 197 44 6e-50 MREVEVSRLTDVIEKLCIEANEHLPEDVKCAIKTCRACEDGEIAKGILDNIIENFDIADN ENVPICQDTGMACVFLEIGQDVHFVGGDLTDAINEGVRRGYDKGYLRKSVVKDPVRRGNT GDNTPAMIYTEIVPGDQVKITVGPKGFGSENMSQIRMFKPSAGLQGIKDFILEVVETAGP NPCPPMVVGVGIGGTFDKCALLAKKALMRPLDTQNPDPFYADLEKEMLEKVNKLGIGPQG FGGKTTAIGLNIETMPTHIAGMPCAVNINCHVTRHKSEVI >gi|226333017|gb|ACII01000002.1| GENE 31 35391 - 35552 155 53 aa, chain + ## HITS:1 COG:aq_1679 KEGG:ns NR:ns ## COG: aq_1679 COG1838 # Protein_GI_number: 15606776 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain # Organism: Aquifex aeolicus # 4 53 3 52 185 60 58.0 5e-10 MERKKITLPLTRELARTLHAGDQVLLTGTIYTSRDAGHKRMCEALAKRGKNSI >gi|226333017|gb|ACII01000002.1| GENE 32 35494 - 35958 274 154 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 5 140 350 486 508 110 40 6e-30 MQDTRECVRPLQKGEKIPFDPTDATIYYVGPTPAKPGAVIGSAGPTTSGRMDAYAPTMMS VGARGMIGKGARLPEVVDAMKKYDGVYFGAIGGAGALLAKCIKKAELIAYEDLGAEALRK LYVEDMPLVVIIDSEGNNLYEMGKAEYLKEHGDK >gi|226333017|gb|ACII01000002.1| GENE 33 36310 - 36981 482 223 aa, chain + ## HITS:1 COG:lin0467 KEGG:ns NR:ns ## COG: lin0467 COG3619 # Protein_GI_number: 16799543 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 1 212 7 216 221 152 41.0 3e-37 MSEAFLNSAFLALSGGFQDAYTYNARNEVFCNAQTGNVVLMSQHFMAGELTAGLGYFFPI IFFALGVWVAEKIQANYKYAQKLHWRQGVLLAEILILFTVGFLPTEYNMLANAMASFACA MQVQSFRKVNGYSYASTMCIGNLRSGTAALSVYFREKKSKQLKQAMYYFGIILMFALGAG IGGNLSIRYGIRMIWVSCGFLMISFLLMFIEKYKHFHEEHETK >gi|226333017|gb|ACII01000002.1| GENE 34 37304 - 37783 631 159 aa, chain + ## HITS:1 COG:TM1424 KEGG:ns NR:ns ## COG: TM1424 COG1905 # Protein_GI_number: 15644175 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # 
Organism: Thermotoga maritima # 4 155 5 155 164 140 44.0 8e-34 MLDQSYYRKADEIIEHYGRTAASLIPIMQDIQAEYRYLPGELLTYVAKEIGVKEAKAYSV ATFYENFSFEPKGKYVIKVCDGTACHVRKSMPVKEALMKELGLSNKKHTTDDMLFTVETV SCLGACGLAPTLTVNDEVHPKMTPEKAVELLNKLRGECV >gi|226333017|gb|ACII01000002.1| GENE 35 37780 - 39651 1848 623 aa, chain + ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 24 553 5 527 527 595 55.0 1e-170 MSIKSKEDLLNKQAEVNEQIKSYTCRVLVCSGTGCIASGAQKIYEEMSVLCERIDGVTVE MQKDVPHVGVIKTGCQGLCELGPLMRIEPYDYQYVHVQPEDCKEIVERTILEGKPVERLF YRDNNTVCPHPDDIPFLNQQTRIVLENCGKIDAESIEEYIAVGGYQALAKVMGSMSPQEV IDEVTKSGLRGRGGAGFPAGKKWSQVARQAEKTRYVVCNGDEGDPGAFMDGSVMEGDPYK MIEGMTIAAYAVGAENGYIYVRAEYPLSVKRLRMAIEQAEKYGLLGDNILNSGVNFHLHI NRGAGAFVCGEGSALTASIEGNRGMPRVKPPRTVEKGLWGKPTVLNNVETYANVPKIILQ GADWFRTIGTEGSPGTKTFSLTGSIENTGLIEVPMGTTLRHIIYDIGGGLKSGAAFKGVQ IGGPSGGCLILDQLDAPLDFDSVKKLDAIMGSGGLVVMDENTCMVEVARFFMNFTQRESC GKCVPCREGTKRMLEILERIVDGKGEMSDLDELEELANMVQNMALCGLGKSAPLPVISTL KRFRNEYEEHIRDKKCRAKVCTALRQFHINPEFCIGCGKCAKNCPAGAISGKIKHPYHID NDICIKCGACKDNCNFDAVYVEA >gi|226333017|gb|ACII01000002.1| GENE 36 39667 - 41355 1793 562 aa, chain + ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 223 558 16 357 372 265 42.0 2e-70 MGHMIIDGRKVEFTDEKNVLSVIRKAGINIPTLCYHSEVSTFGACRLCTVEDERGKTFAS CSEQPRDGMVIYTNSGRVKKYRKLIVELLLAAHCRDCTTCVKSGECVLQELAHRLGVQTI RFKNTREYHELDTSSPSLVRDPNKCILCGNCVRACEELQGIGALGFAFRGTEAMVMPAFN KKIAETECVNCGQCRVYCPTGAIAIKTHMDEAWEALADPNIRVVAQIAPAVRVAVGDHYG LTKGKSVMGKIVNALHRMGFDEVYDTSFSADLTIMEESAEFLDRIKKGEKLPLLTSCCPA WVKFITDQYKEYIPNLSTCRSPQGMLSAVIKEYFRDPEHAGGKKTVMISIMPCTAKKAEA VRPNSFTDGEQDTDIVITTTELLRMIDNFGLDFATLDPEACDMPFGFGSGGGVIFGVTGG 
VTEAVLRRLTPDHSKETMHEIAECGVRGDEGIKEFTVPYEGMNLNICVASGLANARTVME QVKNGEKEYHLIEIMACRRGCIMGGGQPTRSGDRTKALRAKGLYNADNTTIIKKSDENPL VLELYNGLLKGKEHHLLHNDSY Prediction of potential genes in microbial genomes Time: Sat May 28 19:03:14 2011 Seq name: gi|226333016|gb|ACII01000003.1| Ruminococcus sp. 5_1_39B_FAA cont1.3, whole genome shotgun sequence Length of sequence - 17774 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 9, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 4.6 1 1 Tu 1 . + CDS 99 - 1568 1352 ## Cphy_2385 stage IV sporulation protein A + Term 1730 - 1763 2.1 2 2 Tu 1 . - CDS 1650 - 2021 383 ## COG5341 Uncharacterized protein conserved in bacteria - Prom 2167 - 2226 7.8 + Prom 2320 - 2379 3.5 3 3 Op 1 . + CDS 2426 - 3922 1705 ## COG0554 Glycerol kinase + Term 3925 - 3959 1.3 + Prom 3936 - 3995 8.1 4 3 Op 2 . + CDS 4102 - 4971 1101 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase + Term 4990 - 5046 11.5 + Prom 5032 - 5091 4.9 5 4 Op 1 . + CDS 5187 - 6167 839 ## gi|253577821|ref|ZP_04855093.1| predicted protein 6 4 Op 2 . + CDS 6207 - 6452 251 ## EUBREC_0233 hypothetical protein 7 4 Op 3 . + CDS 6468 - 7517 731 ## COG0502 Biotin synthase and related enzymes 8 5 Op 1 . + CDS 7643 - 9064 1450 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 9 5 Op 2 . + CDS 9116 - 10324 920 ## COG1160 Predicted GTPases + Term 10365 - 10417 9.2 + Prom 10412 - 10471 8.1 10 6 Tu 1 . + CDS 10665 - 10940 396 ## Cphy_1196 hypothetical protein + Prom 11024 - 11083 6.0 11 7 Op 1 . + CDS 11136 - 11345 288 ## EUBELI_00572 hypothetical protein + Prom 11350 - 11409 4.8 12 7 Op 2 22/0.000 + CDS 11443 - 11664 289 ## COG1918 Fe2+ transport system protein A + Prom 11753 - 11812 4.3 13 7 Op 3 . + CDS 11838 - 13997 2513 ## COG0370 Fe2+ transport system protein B + Term 14017 - 14053 3.4 + Prom 14131 - 14190 6.5 14 8 Op 1 . 
+ CDS 14211 - 14342 110 ## 15 8 Op 2 . + CDS 14429 - 14902 305 ## EUBELI_00576 hypothetical protein + Term 14940 - 14987 6.5 + Prom 15057 - 15116 7.3 16 9 Op 1 . + CDS 15249 - 15584 513 ## EUBELI_00577 hypothetical protein 17 9 Op 2 . + CDS 15584 - 17680 2325 ## COG2217 Cation transport ATPase Predicted protein(s) >gi|226333016|gb|ACII01000003.1| GENE 1 99 - 1568 1352 489 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2385 NR:ns ## KEGG: Cphy_2385 # Name: not_defined # Def: stage IV sporulation protein A # Organism: C.phytofermentans # Pathway: not_defined # 1 489 1 491 491 518 52.0 1e-145 MENFQVYRDIQARTGGDIYIGVVGPVRTGKSTFIRRFMELVALPQMSDTKQAEIRDQLPL SGSGKIITTAETKFIPKEAVPITLGEDQQVKIRLIDSVGFLVKGASGQTEDGKERMVKTP WFEQAIPFREAARIGTQKVIQEHSTIGIVVTTDGSFGELPRDNFPEAEEKTIQELKKQQK PFIVLVNSQMPYKDAALKTAEEIQQKYKVTALTVNCDQLRKEDIARILEKVLYEFPVSQI QFFIPRWVEMLPLEHELKQQILSQIRDKMKSMQHIRDITKESVKLSGPYVQDSLLEDVGL SDGTVKVRIRIKEEYYYRMLSQMSGIEMESEYELIHTMQELVHMKEEYVKVQAALEAVRG TGYGVVVPNLDEIEIAQPEVIRQGNKYGVKIKSKSPSIHMIKANIETEIAPIVGTEQQAK DLIQYIDEGSQRGESIWETNIFGKSIEQLVQDGIRSKIAAISEESQVKLQDTMQKIVNDS KGGLVCIII >gi|226333016|gb|ACII01000003.1| GENE 2 1650 - 2021 383 123 aa, chain - ## HITS:1 COG:lin2788 KEGG:ns NR:ns ## COG: lin2788 COG5341 # Protein_GI_number: 16801849 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 36 121 51 144 144 60 37.0 1e-09 MKKKEFIFIGGILILALAMWGGMSLINKGSHGSIRITVDGKEYGTYSLEKDQTIKINDTN VCEIKDGQARMISAQCPDHLCMKQKAIDEKGGTIVCLPNKVVIEGEKSTDSKKTDEPVID AVT >gi|226333016|gb|ACII01000003.1| GENE 3 2426 - 3922 1705 498 aa, chain + ## HITS:1 COG:FN1839 KEGG:ns NR:ns ## COG: FN1839 COG0554 # Protein_GI_number: 19705144 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Fusobacterium nucleatum # 3 494 2 495 497 657 64.0 0 MGKYIMALDAGTTSNRCILFNEKGERLSVAQREFTQYFPKPGWVEHDADEIWASMLGVAV EAMTKIGADASKIAAIGITNQRETAIVWDKNTGVPVYHAIVWQCRRTSEYCDSLKSRGLT 
EKFREKTGLVIDAYFSATKIKWILDNVEGAREKAQKGQLLFGTVETWLIWKLTKGRVHVT DYSNASRTMLFNINTLQWDQEILNELDIPKSMLPKPMPSSCIYGKADPAVLGGAIPIAGA AGDQQAALFGQTCFNAGEAKNTYGTGCFLLMNTGEKPVFSKNGLVTTIAWGLDGKVTYAL EGSIFVAGAAIQWLRDELKVIDSSEDSEYMANKVSDTHGCYVVPAFTGLGAPHWDQYARG TIVGITRGVNKYHIIRATLESIAYQVNDVLNAMKADSGINLAALKVDGGASANNFLMQTQ ADIINAPVNRPCCVETTAMGAAYLAGLAVGYWSSKEEVRHNWSIDQKFYPSITEETREQK IKGWNKAVKYSYGWAKDE >gi|226333016|gb|ACII01000003.1| GENE 4 4102 - 4971 1101 289 aa, chain + ## HITS:1 COG:Cgl2543 KEGG:ns NR:ns ## COG: Cgl2543 COG0152 # Protein_GI_number: 19553793 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Corynebacterium glutamicum # 1 289 5 289 297 296 50.0 3e-80 MQEMKPIKEGKVREIYDNGDSLIMVATDRISCFDVILNNQVTKKGTVLTQMSKFWFDMTE DILPNHMISVDVKDMPEFFQQEKFDGNSMMCRKLEMLPIECIVRGYITGSGWESYKKTGK VCGIELPEGLKESDKLPEPIYTPSTKAEIGDHDENISYEQSIAHLEKYFPGRGEELAAKL RDNTIALYKKCAEYALSKGIIIADTKFEFGLDKDGNMVIGDEMLTPDSSRFWPAEGYEPG HGQPSFDKQFARDWLKANPDNDWTLPQDIVDKTIGKYLQGYEMLTGKKL >gi|226333016|gb|ACII01000003.1| GENE 5 5187 - 6167 839 326 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253577821|ref|ZP_04855093.1| ## NR: gi|253577821|ref|ZP_04855093.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 326 1 326 326 635 100.0 1e-180 MKTLKKLLFIIMAVAVLAMPVFGVQAAESDIPVTEQPGVSPTPSPVPIRELVTKGNKIYY YYKGKMVKNKWKRYNGYKYYFGANGNAVRGGQRINNVVYVFDEKGRLFENKQNKIVKSGS NIYHIRTEHGRASIGYFIYKNNLYYADPKGRLYQKKSRQNGQLYFTNSGAARKDYNALLK MRVMQIVSSITNSGMSQSQKLYACWKYVVYGGFYYGGPDPNIYQSGWARSEALRMFRTGY GNCYGFSCIFAALAREIGYTPYMICGRVPGSRDGAADGFTRHCWVEINGLYYDPEAQYAG WMTGVYGYDYYPISHQILRVVNFCKF >gi|226333016|gb|ACII01000003.1| GENE 6 6207 - 6452 251 81 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0233 NR:ns ## KEGG: EUBREC_0233 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 79 1 79 81 104 69.0 1e-21 MESRIAVISIIVENSDSVECLNRILHEYGIYIIGRMGIPYRQRGISIISVAIDAPLDIIN ALTGKIGRLAGVSAKTVYSKA >gi|226333016|gb|ACII01000003.1| GENE 7 6468 - 7517 731 349 aa, chain + ## HITS:1 COG:CAC1631 KEGG:ns NR:ns ## COG: CAC1631 COG0502 # Protein_GI_number: 15894909 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Clostridium acetobutylicum # 15 345 14 339 350 281 42.0 1e-75 MASAELQIPEHIQETQNITREELEMILHTQNPLVTEELRRRAQQTAKAIYGNKIYVRGLI EFTNYCKNNCYYCGIRCSNTDADRYRLSLEQILSCCETGWNLGFRTFVLQGGEDPHFTDE HICQIVRTIKEIYPDCAVTLSIGEKSEESYQAYFNAGADRYLLRHETADEDHYKKLHPEQ MSLVFRKNCLKNLKKIGYQTGCGFMVGSPGQTVDTLYRDFLFIKELKPEMIGIGPFIPHK DTPFAKELPGTLEQTLRLLSIIRLIHPHALLPATTALGTIDPKGREKGILAGANVVMPNL SPVNVRDKYTLYDNKICTGDEAAECKACMAARMKSIGYEVVTDRGDYKA >gi|226333016|gb|ACII01000003.1| GENE 8 7643 - 9064 1450 473 aa, chain + ## HITS:1 COG:CAC1356 KEGG:ns NR:ns ## COG: CAC1356 COG1060 # Protein_GI_number: 15894635 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 1 473 1 472 472 707 73.0 0 MYNPKSLKAEEFISDEEIRETLAYADANKENVALIDQILAKAKECKGLTHREASVLLACP IPEKIQEMYDLAAEIKKEFYGNRIVLFAPLYLSNYCVNGCVYCPYHKKNQHIARKKLTQE EIVKEVTALQDMGHKRLAIEAGEDPINNPIEYILECIQTIYSIKHKNGAIRRVNVNIAAT 
TVENYRKLKDAGIGTYILFQETYHKESYETLHPTGPKHNYAYHTEAMDRAMEGGIDDVGL GVLFGLELYRYEFAGLLMHAEHLEAVHGVGPHTISVPRIKHADDIDPSAFDNSISDEIFA KICALIRIAVPYTGMIISTRESKAVREKVLPLGVSQISGASRTSVGGYDVPETEDEVTSA QFDVSDQRSLDEIVNWLMGMGDIPSFCTACYRAGRTGDRFMSLCKSKQISNCCHPNALMT LMEYLQDYASPETKKVGLELIQKELENIPGEKIRKLTEEHLADIVNGQRDFRF >gi|226333016|gb|ACII01000003.1| GENE 9 9116 - 10324 920 402 aa, chain + ## HITS:1 COG:CAC1651 KEGG:ns NR:ns ## COG: CAC1651 COG1160 # Protein_GI_number: 15894928 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 5 401 4 397 411 331 43.0 1e-90 MAVTLNETPSGNRLHIGIFGKTNSGKSSFINTFSGQKVSIVADVAGTTTDPVYKPMEIYP LGPCIIIDTAGFDDEGELGALRIEKTSLAAQKTELAIILFCEDEMVQELKWYNYFKKRQT PVIPVLGKADLYTQEQKEYLIQMIQKNTGETVCPVSSETGEGIRKLKELLAEKIPEGYGN RMITGNLVSKDDLVLLVMPQDIQAPKGRLILPQVQTLRELLDKRCLIMSVTTDRYPAALQ TLSRPPKLIITDSQVFPYIYENKPKESMLTSFSVLFAAYKGDLSYYIEGAKAIDSLKETS HVLIAECCTHAPLKEDIGRVKIPRMLKKRFGENLQVDLVSGTDFPDDLTGYDLIIQCGAC MFNRKYVMYRIDRAKKQQVPVTNYGITIAHLTGILDHVALPE >gi|226333016|gb|ACII01000003.1| GENE 10 10665 - 10940 396 91 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1196 NR:ns ## KEGG: Cphy_1196 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 90 1 91 95 120 67.0 1e-26 MSIVIVGGHDRMVGQYKKICKSYKCKCKVFTQMEADFGKKIGCPDLLVLFTNTVSHKMVK CALDEVGASTDIVRCHTSSGNALNGILEARC >gi|226333016|gb|ACII01000003.1| GENE 11 11136 - 11345 288 69 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00572 NR:ns ## KEGG: EUBELI_00572 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 69 15 83 83 88 71.0 7e-17 MPLSMVNAGEPFTIKRIGGKEEVKRHLEELGFVVGGVVTVVSEINGSLIVNVKESRVAIG KDMASKIMV >gi|226333016|gb|ACII01000003.1| GENE 12 11443 - 11664 289 73 aa, chain + ## HITS:1 COG:L192240 KEGG:ns NR:ns ## COG: L192240 COG1918 # Protein_GI_number: 15672170 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Lactococcus 
lactis # 4 72 80 148 152 71 50.0 4e-13 MKTLKEVHVGETVKVQKLTGEGPVKRRIMDMGITKGVEIYVRKVAPLGDPVEVTVRGYEL SLRKADAEMIVVE >gi|226333016|gb|ACII01000003.1| GENE 13 11838 - 13997 2513 719 aa, chain + ## HITS:1 COG:L190009 KEGG:ns NR:ns ## COG: L190009 COG0370 # Protein_GI_number: 15672169 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Lactococcus lactis # 5 703 4 702 709 697 50.0 0 MSVKIALAGNPNCGKTTLFNALTGSNQFVGNWPGVTVEKKEGKLKGHKDVVLTDLPGIYS LSPYTLEEVVARNYILQEKPDAIINIVDGTNIERNLYLTTQILELGIPVIMAVNMMDIVE KNGDTIHIDKLKKKLGCEVVTISALKGTGITEAANKAVSIAQSHRKVTPVHEFCDKAEEI IGAVENKLTGVVPEEQKRFFAIKLLERDDKIIEQMNTSVNVSAEIKEMENEFDDDTESII TNERYVYISSIIGQCVTKARKDKLSTSDKIDRIVTNRWLALPIFAVVMFIVYYVSVTTVG AFVTDWTNDVLFGEIIPPAIESGLNAIGCAAWLQGLILDGIVAGVGAVLGFVPQMLVLFA FLAFLESCGYMARVAFIMDRIFRKFGLSGKSFIPMLIGSGCGVPGVMASRTIENDRDRKM TIMTTTFVPCGAKLPIIAMIAGAFFGNSGWVATSCYFVGIAAIICSGIILKKTKMFAGDP APFVMELPAYHWPTVSNILRSMWERGWSFIKKAGTIILLSTIILWFLMSFGWEGGSFGMV EELNESILSKIGTAIAWIFAPLGWTKAGEGWKMAVAAVTGLIAKENVVATFGMLYGFAEV AEDGAEIWGNLAAAMTPIAAYGFLVFNLLCAPCFAAMGAIKREMNNTKWFLFAIGYQCGL AYLVSLCIYQIGTLVTGGGFGIWTVVAFAIIIGFLYLLFRPYKESGSLNVKVKGMAKAR >gi|226333016|gb|ACII01000003.1| GENE 14 14211 - 14342 110 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGTLIVGAILILIVGLIIRGIVRDKKSGKSSCGGDCSHCRGCH >gi|226333016|gb|ACII01000003.1| GENE 15 14429 - 14902 305 157 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00576 NR:ns ## KEGG: EUBELI_00576 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 24 145 9 128 142 128 56.0 7e-29 MGAAALQRQENCKEPNYQRTRMQKDVVIQELKKRGCRITKQRRMLLEVILENECSCCKEI YYNASRLDPGIGSATVYRMVNLLEEIGAISRRNMYKIVRNPDCKDESVCTIELDDNTVQK LSKDSFNNVVLTGLKACGYIDKQEIERMFVNSCDFKH >gi|226333016|gb|ACII01000003.1| GENE 16 15249 - 15584 513 111 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00577 NR:ns ## KEGG: EUBELI_00577 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 7 90 30 113 114 104 
73.0 9e-22 MIKFDGKKTGIFAAGVAFGTAGIKILTSKDAKKLYTNCTAAVLRAKACVMKTASAIQENA EDIYAEAQQINEDRAAAEEVVEDTADEAAEETEETTDFKEETEETEEGSEE >gi|226333016|gb|ACII01000003.1| GENE 17 15584 - 17680 2325 698 aa, chain + ## HITS:1 COG:SP2101 KEGG:ns NR:ns ## COG: SP2101 COG2217 # Protein_GI_number: 15901916 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pneumoniae TIGR4 # 100 688 98 682 687 394 36.0 1e-109 MKFAIKHEIKGRIRIHLAQKHMTYRQADILQCYLEKEANISKASVYVRTQDVAVVYTGSR DGLIRLLSRFSYETAQVSESALENSGRELNETYKEKLINSVVMRTLNKMFLPYPVRVAIT TVKSVKYIYHGVRTLMKGKLEVPVLDATAIGVSMLRGDFNTAGSTMFLLGIGEILEEWTH KKSVNDLARSMSINAEKVWFVNEDGQEVLISASSVKAGDLIRVHMGNVIPFDGDVAAGEA MVNQTSLTGESAPVRKAEGSFAYAGTVLEEGELTVKVKEAAGSSRYEKIVNMIEDSEKLK SSVESNAEHLADRLVPYTLAGTGITYLLTRNMTKTLAVLMVDFSCALKLAMPVSVLSAIR EASTHSITVKGGKYMEAMADATTIVFDKTGTLTKAKPVVSDVVSFSEELSSDELLRIAAC MEEHFPHSMARAVVNAAKEKCLVHEEMHSKVEYIVAHGISTTIEGKKAVIGSWHFVMEDE KCVIPEGMEDRFEHLPLECSHLYLAIEGKLAAVICVEDPLREEAADAVRELKEAGITKVV MMTGDSERTAAAIAKRVGVDEYYSEVLPEDKASFVEKEKELGHKVIMIGDGINDSLALSS ASVGIAISDGAAIAREIADVTISADNLHEIVTLKRLSTALMKRIHNNYRTIIGINGGLIA LGVTGIIMPTTSALLHNMSTLTISLRSMRNLLPKEEEA Prediction of potential genes in microbial genomes Time: Sat May 28 19:03:50 2011 Seq name: gi|226333015|gb|ACII01000004.1| Ruminococcus sp. 5_1_39B_FAA cont1.4, whole genome shotgun sequence Length of sequence - 9029 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 6, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 371 383 ## COG1321 Mn-dependent transcriptional regulator - Prom 547 - 606 6.9 + Prom 428 - 487 8.4 2 2 Tu 1 . + CDS 696 - 914 323 ## EUBELI_01434 hypothetical protein 3 3 Op 1 5/0.000 + CDS 1028 - 2893 2238 ## COG2217 Cation transport ATPase 4 3 Op 2 . + CDS 2953 - 3315 368 ## COG0640 Predicted transcriptional regulators + Term 3321 - 3374 13.3 - Term 3309 - 3362 13.3 5 4 Tu 1 . 
- CDS 3371 - 3853 630 ## COG4769 Predicted membrane protein - Prom 3903 - 3962 8.4 6 5 Tu 1 . + CDS 4234 - 6210 2155 ## COG5492 Bacterial surface proteins containing Ig-like domains + Term 6242 - 6279 5.4 + Prom 6304 - 6363 7.5 7 6 Op 1 . + CDS 6424 - 7020 591 ## COG3976 Uncharacterized protein conserved in bacteria 8 6 Op 2 1/0.000 + CDS 7017 - 7793 466 ## COG0348 Polyferredoxin + Term 7841 - 7893 -0.8 + Prom 7821 - 7880 3.2 9 6 Op 3 . + CDS 7925 - 8959 922 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis Predicted protein(s) >gi|226333015|gb|ACII01000004.1| GENE 1 2 - 371 383 123 aa, chain - ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 3 116 1 115 122 108 53.0 2e-24 MAIHESGEDYLEAILMIKKRNGNVRSIDIAHELSFSKPSVSVAMKNLKASNYITIDENGF INLTETGQEIADKIYERHTFLTGWLTSIGVDPKVAAEDACKMEHAISAESFAAIKKSIAG THE >gi|226333015|gb|ACII01000004.1| GENE 2 696 - 914 323 72 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01434 NR:ns ## KEGG: EUBELI_01434 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 72 1 72 72 107 93.0 1e-22 MKKTYKIDVDCANCANKMEEAARNTAGVKDATVNFMMLKMIVEFEDGQDPKAVMKDVLKN CKKVEDDCEIYL >gi|226333015|gb|ACII01000004.1| GENE 3 1028 - 2893 2238 621 aa, chain + ## HITS:1 COG:BS_yvgW KEGG:ns NR:ns ## COG: BS_yvgW COG2217 # Protein_GI_number: 16080402 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Bacillus subtilis # 8 620 93 701 702 669 57.0 0 MNKKQKKMLIRIIIAAVLIVVFSLLPAEGYLRFVLFMIPYLVIGYDILKKAFKGILNKQV FDENFLMAVATVGAILLGDYSEGVAVMLFYQIGELFQSYAVGKSRRNISELMDIRPDYAN IEKDGTLEQVDPDEVEIGTIIVVQPGEKVPIDGVITEGTSTLNTSALTGESLPRDAKAGD EVISGCINMTGLLKIRTTKEFGESTVSKILELVENSSSRKSKSENFISKFAKYYTPAVCY GALALAFIPPIVLLIMGKPAMWGDWIYRALTFLVISCPCALVISIPLSFFAGIGGASNQG 
ILVKGSNYLETLAQTKYVVFDKTGTMTQGVFEVSGIHHNEMPDEKLLEYAALAECSSSHP ISKSLQKAYGKPIDRNRVTDIEEISGNGVIAKVDGISVAAGNTKLMNRLGIAYQDCHHVG TVVHMAIDGKYAGHILISDIIKPHAKEAIAELKKAGISKTVMLTGDSKRVADQVAGELGI QEVYSELLPADKVSRVEELLNQKSEKAKLAFVGDGINDAPVLSRADIGIAMGALGSDAAI EAADIVLMDDDPLKISKAIKIARKCIRIVYENIYFAIGIKILCLILGALGIANMWVAIFA DVGVMILAVLNAIRTLFVKNL >gi|226333015|gb|ACII01000004.1| GENE 4 2953 - 3315 368 120 aa, chain + ## HITS:1 COG:CAC2242 KEGG:ns NR:ns ## COG: CAC2242 COG0640 # Protein_GI_number: 15895510 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 7 119 9 121 122 128 59.0 2e-30 MEVCKEELCESNEIHEDLLKIVDETLPEETELYDLAELFKVFGDSTRIRILFVLFEAEVC VCDLAKALNMTQSAISHQLRILKQNKLVKSRREGKSIFYSLADDHVRTIINQGREHIEED >gi|226333015|gb|ACII01000004.1| GENE 5 3371 - 3853 630 160 aa, chain - ## HITS:1 COG:L179010 KEGG:ns NR:ns ## COG: L179010 COG4769 # Protein_GI_number: 15673323 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 8 158 9 161 176 77 37.0 1e-14 MKQKTACLGLFSAVAIILGYVESLVPVFAGIPGIKLGLANLGVLFILKKYSFREAALVSV VRILVIGFMFGNLFSILYSLAGAALSMTVMTLMLKKTSFSLIGVSVAGGVSHNIGQLIIA MLIVQNASVFVYAPALLVAGVAAGVVIGGLTNEILKRVHL >gi|226333015|gb|ACII01000004.1| GENE 6 4234 - 6210 2155 658 aa, chain + ## HITS:1 COG:CAC2367 KEGG:ns NR:ns ## COG: CAC2367 COG5492 # Protein_GI_number: 15895634 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 527 645 172 303 752 79 43.0 2e-14 MTNKKLMKKIAPVVMSAAVAMTSMPYAVLASDFTDSEVLTDSTDFSVSEEFAAEDAQEEF TTEDVQDAFAAEAQQGEAYVLMNIPYDDFYKAELKNNDVKVDAFTSATLNKSRTAGMMNG NSAYHVDPNGTDVTGVTFPVKVSDLSLLKDQKQVTDSDSVTITVTNRGQTSSNTYTGKDS LIENASYSYYILSETPSYYKELTVNADGSYSFSAMKGAQTKNVTIDATLKTETKYGDYEL DLNNEAFAAVIDTNTDKIYGVTVNTTDGTNYGLRHLENIWRGSMLAWGTGYTTEVHGCPV SSAHYKSIMGKTIDSVTYYTDKGMITFDVPDVKVQTTTGIKATVADIMNTDSSAAVTFDQ ALPADFKAQYTVDGTAVSCTDGKLAVGALALGTHKVEIADASGKYAAIVTEFTVNTDKMP 
ASYDSNETKLVAAKGITAEEFSAYIKSISKVKVDDTEYAATGRGSKVIVKEDGTLDLTDI QVTDATVFEITAVGYKNNLTFTYKEAENTFELNTSSKTLYTKGSTKTTLKVTTNLTDKIT WKSSNTKVASVNSKGVVTAKAKGTAVITASCGEYQVTCKITVKNPSLKLSKSSATVKVGK TTKISAKATPSGKVTYKSSNSKIATVSSKGVVKGKKKGTAKITVTCNGVKKVFTVKVK >gi|226333015|gb|ACII01000004.1| GENE 7 6424 - 7020 591 198 aa, chain + ## HITS:1 COG:CAC2762_2 KEGG:ns NR:ns ## COG: CAC2762_2 COG3976 # Protein_GI_number: 15896018 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 110 196 46 132 132 58 41.0 8e-09 MNNQNFYLRLIALLLIIMAVFFYNGTVKDKEQAQDIADLAAKTESLEKQQDQILTALKET YEKQKTAAESNASSDVSSEADNKSEKDSAEDKENADQADSEETDDSDNVYKNGTFEGSGT GYGGTITVQVTLEDDTITAVSVVSAPGEDSAYLSQGENVINSVISEQSTDVDTISGATFS STGILEAVNDALSKAENA >gi|226333015|gb|ACII01000004.1| GENE 8 7017 - 7793 466 258 aa, chain + ## HITS:1 COG:CAC2762_1 KEGG:ns NR:ns ## COG: CAC2762_1 COG0348 # Protein_GI_number: 15896018 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Clostridium acetobutylicum # 1 254 1 250 253 115 33.0 1e-25 MKKKQRKVRLIRAAIQLIFFIAAPSLFSTAFAGIKTIFLAIGGQQSVTWNSFLDITALLL IITILFGRHFCGYACAFGSLGDALYELTAFIRAKCFGKEKKHGYPEEWVHRLQKVKYVIL AFLLLSCITGFYSKLQGMSPWDVFSMLTTGRLPKSTYIVGTVLLILIMAGMCTQERFFCQ FLCPMGAVFAIMPIIPGALFKRNRPNCAPKCTLCKKRCPAHLDIDGDTAHSGECICCHAC TAVCPRKNIHTGTVIDKN >gi|226333015|gb|ACII01000004.1| GENE 9 7925 - 8959 922 344 aa, chain + ## HITS:1 COG:CAC2766 KEGG:ns NR:ns ## COG: CAC2766 COG1477 # Protein_GI_number: 15896021 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Clostridium acetobutylicum # 38 339 11 314 319 162 32.0 6e-40 MTNRRSFVLGLFIIASLLCTSRTAWGNSEINTYTKTDFAMDTVVSETLYTTGEDITADII SALKDVEENWISWTKESSQIYQINQNAGNTTTVSDETATCLKQVLDLSKASGGAMDPTMG RVIRLWDIDGENPHIPSDDELNSLLENVGYDKVTLDGNKVTMPEGVTLDLGAAGKGIGCD AAQKILDADKNVSGMILNLGGSSVMSYGSKPDGSAWQVAVTDPRDTEGDYLGVVTLNGTE 
FLSTSGDYEKYFIEDGVRYHHILDPATGYPARSGLTSVTVVCNDGLNADGLSTACFVLGK EKAEELLKKYKADGLFVDDSDHVWMTEGMKERFQLLKDTYSIGE Prediction of potential genes in microbial genomes Time: Sat May 28 19:03:54 2011 Seq name: gi|226333014|gb|ACII01000005.1| Ruminococcus sp. 5_1_39B_FAA cont1.5, whole genome shotgun sequence Length of sequence - 9984 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 6, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 35 - 1924 2099 ## COG5492 Bacterial surface proteins containing Ig-like domains + Term 1940 - 1987 3.0 + Prom 1948 - 2007 4.2 2 2 Op 1 . + CDS 2057 - 3559 1551 ## COG3976 Uncharacterized protein conserved in bacteria + Prom 3600 - 3659 7.8 3 2 Op 2 . + CDS 3732 - 3992 365 ## Ccur_12930 copper chaperone + Term 4020 - 4074 13.2 4 3 Tu 1 . - CDS 3937 - 4128 64 ## - Prom 4161 - 4220 2.1 + Prom 4394 - 4453 7.4 5 4 Op 1 . + CDS 4504 - 4890 289 ## gi|253577848|ref|ZP_04855120.1| predicted protein 6 4 Op 2 . + CDS 4921 - 6222 208 ## Dhaf_2876 peptidase M56 BlaR1 + Term 6232 - 6285 9.1 + Prom 6336 - 6395 9.5 7 5 Tu 1 . + CDS 6608 - 7909 1581 ## COG0148 Enolase + Term 7976 - 8023 9.1 - Term 7963 - 8011 7.1 8 6 Op 1 . - CDS 8038 - 8622 420 ## gi|153855421|ref|ZP_01996552.1| hypothetical protein DORLON_02566 9 6 Op 2 . 
- CDS 8675 - 9895 935 ## COG1680 Beta-lactamase class C and other penicillin binding proteins - Prom 9920 - 9979 6.5 Predicted protein(s) >gi|226333014|gb|ACII01000005.1| GENE 1 35 - 1924 2099 629 aa, chain + ## HITS:1 COG:CAC3086 KEGG:ns NR:ns ## COG: CAC3086 COG5492 # Protein_GI_number: 15896337 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 480 629 185 345 498 84 40.0 6e-16 MRKGNLIKMAAPAILSAAMMVSGLPVMAADFTSDTAVETEAQTDEFDAADDTADAFAGDE ETPFVDDQAAAEDEFAATAETGYKYVYAGLTWAQYWASEGVYNAANTASNGTKDSHDELD KGGFDTVTRATVNHGLHRGSYQCEAIMYDKNGGSYEISYWTSANDAVLTDGTTVKLNQPE RGQITKADGSVAEFDHYSVVGLKYVPVAVKAEDYEAFKSQFKIVEDGSTIAGGFGENNLQ SYSAVADVTAETNGLKEAVKGSDGTFTFKARTNGTGSGLKDQALQTASNVTATVKEASGS YGEFLRVDITGDGYGALGAKMYAVKWDYYGNGDKVLASYGTKFAADNWMHKAMGIQLGLT DSYRCQLPKGTDGTGKWKITVYAMGYADTVFEVNATDANIVKPEAGEADTTALKAAVEKA EALKEADYTADSWKAMQLELQEAKDLLAKEKPTQAEVDEATTHLNAAVEALVKADTKVTV TLNKKTATVYKGKTTTLKATVTGADAPKVTFTSSNPKVAAVNKTTGKVTAKAKGSAVITA KCGDVKVTCKVTVKNPTLTLSKTAVSVKVGKTTKITAKAAPSGKVTYKSGNKKIATVSSN GTIKGIKKGTAKITVTCNGVTKTVKVTVK >gi|226333014|gb|ACII01000005.1| GENE 2 2057 - 3559 1551 500 aa, chain + ## HITS:1 COG:CAC2762_2 KEGG:ns NR:ns ## COG: CAC2762_2 COG3976 # Protein_GI_number: 15896018 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 132 228 35 131 132 72 53.0 2e-12 MDKNNEQKNNKTPEMKPKQDSFRPWLKLIPAAAVFAAVCVTCYQADKSPVETTLVSNNNV MSTDEIKELISQGTADKDFDSEDSSETNGTSISKTSKKTKKTSKIKTGTKKNSSTGSSAN GGTAGGGAGSTVTPTTEVPAGGYADGTYTGSGTGFGGTITVQVTVTDHKIAAINIVDASN ETASYFANAQGVISKILASQSPNVDAVSGATYSSNGIITAVQNALSQAIPSGNQAVVTPT PTPLPKPTKKPSPIPKPGDEQIYKDGTYTGTGKGYSGTITLTAKIKKGVIKSLEAEHTDT PMFFKKAWDILENEIIQNQSVDGIDTVSGATYSSKGIINAMKDIQKQAEKGTTKVTPTPT PEVTVTPIPEATPIPTPEETPTPEVTPTPEETPAPTPTPEETPEPTPEPTGPYIDGTYTG SSYGYSGRVNVTVTIQGGQIASIEQSNSDSPEYFDYAWETIYPQIMGNQSADGIDAASGA TYSSEGILGAIQKALAQALA >gi|226333014|gb|ACII01000005.1| GENE 3 3732 - 
3992 365 86 aa, chain + ## HITS:1 COG:no KEGG:Ccur_12930 NR:ns ## KEGG: Ccur_12930 # Name: not_defined # Def: copper chaperone # Organism: C.curtum # Pathway: not_defined # 1 81 1 82 82 92 53.0 6e-18 MIKTTVKVDGMMCGMCESHVNDAVRKAFQVDKVTSSHSKGETVIISDGPVDKAKLKTAIS ATGYEVKGITSEPYEKKKGLFSFLHK >gi|226333014|gb|ACII01000005.1| GENE 4 3937 - 4128 64 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTIDFIYQRHISGCNKVSDKVKSQERTVQSLNNSCLVSHGKLKRIFIYEEKKTVLFSFHM ALM >gi|226333014|gb|ACII01000005.1| GENE 5 4504 - 4890 289 128 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253577848|ref|ZP_04855120.1| ## NR: gi|253577848|ref|ZP_04855120.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 128 1 128 128 225 100.0 6e-58 MKKKIELVCWILFTVFSGIGFFINFMSKPPVNDTHTFTLYMIGAAVVSLLYPASMFVLGL RNRYLYDKLNELAEEKGETEVAKSAHTFRSSASGYTVPDMDILVSALTSDQKLTEEEKEY LLEYIKKK >gi|226333014|gb|ACII01000005.1| GENE 6 4921 - 6222 208 433 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_2876 NR:ns ## KEGG: Dhaf_2876 # Name: not_defined # Def: peptidase M56 BlaR1 # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 49 427 64 444 465 82 24.0 4e-14 MSIAGSIPVMICLFLYLIQRGNYNYILGRRLLLTGVFFYLVPVQLVKYLLPKDALPETML IGRKADSYLSGTLSFWSEKQGDYIWIPQWFNIFARIWLAGIIIFAIYEIIKYWRGAHSIR NYIFEKIEDPEDNLTYYLIPDEICGPCTIGFFRQKIVFPESFPLHPDFIMVYKHEHTHLK NHDNLVKLLCLFVLCLHWMNPVAYLLLFLYIDTAEIVSDSVAVDGCTKEKRQDYASLLVL EAATSDIRPAVWKNNLSGHKNNKEGKDFKTLKRRINYMMKEKRKGMLQRGIMVAVSALTV MASVGTVMAYEPIQSSDVSVSDSISDYDLDFTDFGFEADNNIVDNIDTSIIDFSETNDVF ISEDGTQIPIKDLSAPRAICTHSMINGYLNNHSSNGSGGCTVYVYKCQRCEKCGYLANAT LYSTNTYVKCPHK >gi|226333014|gb|ACII01000005.1| GENE 7 6608 - 7909 1581 433 aa, chain + ## HITS:1 COG:CAC0713 KEGG:ns NR:ns ## COG: CAC0713 COG0148 # Protein_GI_number: 15894001 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Clostridium acetobutylicum # 2 432 4 431 431 537 65.0 1e-152 MYLEIEKVIGREILDSRGNPTVEAEVYLMDGTVARGTAPSGASTGEFEALELRDGDKARY LGKGVQKAVENINTTINEAIEGLDASDIYAVDAAMIAADGTKDKSKLGANAILAVSIACC 
RAAAASLEIPLYRFLGGVSGNRLPVPMMNIVNGGCHALSSGLDVQEFMIMPVGAPSFKEC LRWCAEVFHALASILKERGLATSVGDEGGFAPALKSDEEAIETILEAVKKAGYEPGKDFR IAMDAASSEWKSEKGKGYYKLPKAGTEYTSEELIEHWAKLCEKYPIISIEDGLDEEDWEG WQKLTARLGDKVQLVGDDLFVTNTERLAKGISLGAGNAILIKLNQIGSVSETLEAIKMAH KAGYTAISSHRSGETADTTIADLAVALNTCQIKTGAPSRSERVAKYNQLLRIEEELGASA VYPGMKAFNVKQD >gi|226333014|gb|ACII01000005.1| GENE 8 8038 - 8622 420 194 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153855421|ref|ZP_01996552.1| ## NR: gi|153855421|ref|ZP_01996552.1| hypothetical protein DORLON_02566 [Dorea longicatena DSM 13814] # 4 193 451 652 661 149 39.0 7e-35 MRAVSGRTYVMEQQNIGIAPLFVQVFHNNMTDGISEISFTYDAGNFCVSFTEGEVIHKLP VGFGKAADGCVDLHGEHYLVATLGEFARDENDIPVLKLEVTFIEECVKRKAHIFFHEDDK IEIRWNETPGKKMILAGLSSITEELSGNFLYNSLLGDHNITTELLHRLMKQTIEPVVRGY LKRPEETGSIDTDE >gi|226333014|gb|ACII01000005.1| GENE 9 8675 - 9895 935 406 aa, chain - ## HITS:1 COG:lin1811 KEGG:ns NR:ns ## COG: lin1811 COG1680 # Protein_GI_number: 16800879 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Listeria innocua # 104 319 56 290 323 112 34.0 1e-24 MAKEQIAVAELVLNMILGKTSGTRVDYFPQKPDFPFDAVYEQAFVRATPESQGISSNLFA ALLRELDASKDTEMHHFMALRHGKVICECNFAPYPKGMWHITHSMCKSITGMAIGMLIEE EKLKLDENIYDIFPDHINAFSKIFRPVITVENLLTMTSGVTFNESGIVSGNDWLGSFLNA SVNGKPGTEFQYNSLNTYVLSAIVTKRTGETLTEYLTPRLFGPLGITKYYWETCPKGITK GGWGLFLCAEDMAKLGQLYLQRGKWNGQQLVSEYWIEISTARHLKTQNGTYGYGYQLWME QRPGSFEYNGMLGQNVIIYPDMDMVLVTNAGNKEMFQDCIMLNIIRKYFPVNYHPADVLP ENPLSYSLLKRLCGELENGENNNRSTSLRGRWKRNVVSRRKHSDKK Prediction of potential genes in microbial genomes Time: Sat May 28 19:04:23 2011 Seq name: gi|226333013|gb|ACII01000006.1| Ruminococcus sp. 5_1_39B_FAA cont1.6, whole genome shotgun sequence Length of sequence - 6777 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 54 - 90 5.0 1 1 Tu 1 . 
- CDS 109 - 1494 1489 ## COG0372 Citrate synthase + Prom 1713 - 1772 10.6 2 2 Tu 1 . + CDS 1805 - 2341 651 ## COG1607 Acyl-CoA hydrolase + Term 2342 - 2382 8.7 - Term 2334 - 2364 3.0 3 3 Tu 1 . - CDS 2375 - 3394 682 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) - Prom 3526 - 3585 6.6 + Prom 3643 - 3702 5.6 4 4 Op 1 . + CDS 3780 - 3998 219 ## gi|253577855|ref|ZP_04855127.1| predicted protein 5 4 Op 2 . + CDS 4044 - 5996 1389 ## COG2200 FOG: EAL domain + Term 6085 - 6140 13.2 - Term 6073 - 6128 13.2 6 5 Tu 1 . - CDS 6155 - 6661 586 ## gi|253577857|ref|ZP_04855129.1| predicted protein - Prom 6713 - 6772 6.0 Predicted protein(s) >gi|226333013|gb|ACII01000006.1| GENE 1 109 - 1494 1489 461 aa, chain - ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 23 461 7 441 441 481 54.0 1e-136 MEKHEMMSLEDSGQLRDKIRFFEQELLKNHHIDPNLYVEYNVKRGLRDSAGKGVLTGLTE ISDVNGYNLINGRQIPADGRLYYQGINVQDIISGLNGRRFGFEETIYLLIFGKLPDKEEL SRFLDMMSDMEELGGRFVRDVVMKGTNANIMNAMQRCVLALYTYDDNPEDISPENVLRQS LELIAKLPEIAVYSYHAYRHFRKDDTLFIRNPQKGLSLAENILLMLRPDGKYTELEAKVL DIALILHAEHGGGNNSTFTTHVVTSSGTDTYSSTAASIGSLKGPRHGGANLKVQNMFADL KSHVDQDHWDNEDEIITYLKKVLNKEAFDHAGLIYGMGHAVYTLSDPREVILKRFAQALA EEKGMTEEFELYNRVENIAGKLIMEHRKLFKNVCANVDFYSGFVYSMLGIPEELFTPIFA IARMPGWSAHRLEELINANKIIRPAYKYVGHHTDFVAFDER >gi|226333013|gb|ACII01000006.1| GENE 2 1805 - 2341 651 178 aa, chain + ## HITS:1 COG:BH0798 KEGG:ns NR:ns ## COG: BH0798 COG1607 # Protein_GI_number: 15613361 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA hydrolase # Organism: Bacillus halodurans # 28 147 7 127 157 93 42.0 2e-19 MERVIIANRKHDKGEVRKEMSERKMKRVEDSLTEQSHLLMPKCLNAAGYLFGGQLLAWID ETAGIVAKRHAEMNVVTVAVDNMYFKAGARVNDTIVLIGRLTHVGRSSMEVRIDTYCEAL DGTRTMINRAYFIMVGTDEHQHPVEVPGLIIEGVTQQIENEAAQKRAKLWKIRRQEGF >gi|226333013|gb|ACII01000006.1| GENE 3 2375 - 3394 682 339 aa, chain - ## HITS:1 
COG:CAC0946 KEGG:ns NR:ns ## COG: CAC0946 COG2333 # Protein_GI_number: 15894233 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # Organism: Clostridium acetobutylicum # 1 255 40 294 320 223 46.0 5e-58 MAVHFIDVGQGNAILVQSGGQNLLYDGGDQSHADLIISYLQEQNVENIDYMIASHYDEDH IGGLVPCIDNFSVSNIFGPDYVHTSNLFNNFMNTATANAIIVQYPSVGETFDFGTGSFTV LAPNGISQNNNDNSLVIKLENGSNSFIFTGDAEETSEQDMISTGMNLDCDVLSVGHHGSA SSTTWDFLEATSPSYAVISCGINNQYNHPSADTMGRLSDMGIPVFRTDKQGTIIAVSDGT NISWSQEPCNDYSSGDSSVNASAGGTGGNSRQEETTTSDPVPEQEESNNADLGTMVWIPA TGEKYHSIPNCGRMNPDTARQVSRSEAEAMGYGPCSKCY >gi|226333013|gb|ACII01000006.1| GENE 4 3780 - 3998 219 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253577855|ref|ZP_04855127.1| ## NR: gi|253577855|ref|ZP_04855127.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 72 6 77 77 125 100.0 6e-28 MPLNLAQFNIKYKISKVCGQEKQRHYLESLGFIAGSEIELLSEIHGYYIVMVKGSKIGIE KSMAKRIILYAA >gi|226333013|gb|ACII01000006.1| GENE 5 4044 - 5996 1389 650 aa, chain + ## HITS:1 COG:sll0267_6 KEGG:ns NR:ns ## COG: sll0267_6 COG2200 # Protein_GI_number: 16331091 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 403 646 9 250 253 145 33.0 2e-34 MEKNKITIRLCYIELIIFAGILVTMIFHFRCGFEKSFFALMIIAALVGCGVSLALKDYLK SMIQSSAFDVAGVHNKKALEKQLQQIQDEDDTLDTGIMMFDLNNLKIINDTYGHEEGDVF IQTFASFLTRILTENSYLARFGGDEFLIIQRNTTWSQLEQMNIQLQTLIDEHNQTADHPL SYAVGYDISCKNHYYLIMDLLKIADEKMYQDKKYKKQQLAQKEQYLTRSGLAQSISSDSL KEKIFTILTNANGEKEYAFIMTDISKFHLINDYWGYETGTKILNFVLRRMELFSASLFVN RYHSDVFVAVLDITGRKSSDVKKQIEEYNRQIIQEILKTFQINYFHLNTGIYYLKDTSIP AEQIISHTNIVREKAKTELSGVCEYNNDIALNEQHCADTIHSFKSALKQKEFKIYFQPKI CGKDQTIASAEALVRWQRGNDTMWYPDMFLSILEETGEIRALDYYVYEETFAWMNQRQKE GKRVIPVSLNVSPVHFGNIHSFAEKVMALVKQYEINPYYVIFEITETTFIHNIKSVNEMI RFFHEQNIRISMDDFGSGYSSLNALKDILFDEVKIDKRFLSDELSENGKIVLQEIFHLLK RTNKSIVCEGVETKEMVDFLVEEGCDELQGYYYYKPMSQNAFEKLLENIA >gi|226333013|gb|ACII01000006.1| GENE 6 6155 - 6661 586 168 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|253577857|ref|ZP_04855129.1| ## NR: gi|253577857|ref|ZP_04855129.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 168 1 168 168 169 100.0 7e-41 MRHIDEIQIAGTAILLCGAMFAVSAVAPTGNTVSKAVTNFTEAPETAAEAQAQAEAQAEE DGFTIYTQESNYDSDYDNSYNDTSYDDSSYNTDTTQDGSLDDTGQENDNTDTGDGSSDDG NNDQTSSDETGDDSSDQIDYPIDDGSGDSTDGSLDNSDTWEDNGEEVQ Prediction of potential genes in microbial genomes Time: Sat May 28 19:04:40 2011 Seq name: gi|226333012|gb|ACII01000007.1| Ruminococcus sp. 5_1_39B_FAA cont1.7, whole genome shotgun sequence Length of sequence - 8181 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 74 - 133 6.8 1 1 Tu 1 . + CDS 188 - 1921 1769 ## COG0366 Glycosidases + Prom 1975 - 2034 8.8 2 2 Tu 1 . + CDS 2139 - 2513 556 ## COG0251 Putative translation initiation inhibitor, yjgF family + Term 2576 - 2625 8.0 + Prom 2600 - 2659 5.6 3 3 Op 1 . + CDS 2814 - 3647 402 ## BAA_A0205 pXO1-133 4 3 Op 2 . + CDS 3700 - 4116 201 ## gi|291546196|emb|CBL19304.1| hypothetical protein + Prom 4118 - 4177 6.5 5 4 Op 1 1/0.000 + CDS 4238 - 6109 2048 ## COG1032 Fe-S oxidoreductase 6 4 Op 2 2/0.000 + CDS 6093 - 6800 930 ## COG5011 Uncharacterized protein conserved in bacteria 7 4 Op 3 . 
+ CDS 6817 - 7998 929 ## COG1530 Ribonucleases G and E Predicted protein(s) >gi|226333012|gb|ACII01000007.1| GENE 1 188 - 1921 1769 577 aa, chain + ## HITS:1 COG:CAC2686 KEGG:ns NR:ns ## COG: CAC2686 COG0366 # Protein_GI_number: 15895944 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Clostridium acetobutylicum # 1 440 1 447 451 400 44.0 1e-111 MSWYNEAIFYHIYPLGLTGAPKHNDYSAPVSRLNTLLPWIDHIREIGCTALYIGPLFESV GHGYETTDYKKLDSRLGTNEDLKNFVAACHEKEIKVIFDGVFNHTGRDFFAFKDIQQNRE HSRYLNWYCNVNFGGNTEYNDGFSYENWGGYNLLVKLNQRNPEVQNYICDVIRFWVSEFD VDGIRLDAADVLDFDFMRALRRTAAEVKEDFWLMGEVIHGDYSRWVNGETLHSVTNYALH KALYSGHNDHNYFEIAHTVKYLQNMGNIDLYNFVDNHDVERIYTKLSNKAHFAPVHVLLY TLPGVPSIYYGSEFGIEGKKEKFSDDSLRPALDIKDYADAVQKNPCTALIAALGKIRQHT PALSYGSYAELQLTNRQFAFARDLDGIRVIVTVNNDDNAADMSLPAGNCAEYIGTLTGRK VPVQDGRINVTVAANSGEIWVPAGEMPEYISVKTETADIKKVQEETEETTSTQTESPAQK TITAAAKAEDIQPEKTADTSATSAENSFPENTEAAVEKEKTVIVDLNKSPEDMTVDELQQ AILAKMAGNGPVTDQMKKTVYDNIWHDSLVNWLKSFH >gi|226333012|gb|ACII01000007.1| GENE 2 2139 - 2513 556 124 aa, chain + ## HITS:1 COG:SP1567 KEGG:ns NR:ns ## COG: SP1567 COG0251 # Protein_GI_number: 15901410 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Streptococcus pneumoniae TIGR4 # 2 121 3 123 126 140 58.0 4e-34 MKTLETKKAPAAIGPYSQAKMTGNFLFASGQIPVDPATGEVAGDKIETQAEQSCKNVGAI LEEAGLTFDNVIKTTCFLADMADFAAFNAVYEKYFTSKPARSCVAVKQLPKNVLCEVEAI AVAE >gi|226333012|gb|ACII01000007.1| GENE 3 2814 - 3647 402 277 aa, chain + ## HITS:1 COG:no KEGG:BAA_A0205 NR:ns ## KEGG: BAA_A0205 # Name: not_defined # Def: pXO1-133 # Organism: B.anthracis_A0248 # Pathway: not_defined # 20 204 58 237 485 127 38.0 6e-28 MKALVRQLESHMTKVCSLRFFYSYQIPKLGKEFDLLQIKDDQIINIELKSGAVSEEAIRK QLMQNRYYLSVLGRSIQSYTYISSQNRLVRLTNHDHVAEADWTELCGSLQKESSDYQGNI DDLFQAELYLISPITEPARFLKKEYFLTSQQRDIQRQILKKLRISRFEYFCFTGLPGTGK TLLLYDIAMKLSRRQQICMIHCGNAGKEWKILHKRLQRIAFLSDNQLTENTELKHYSAVL 
VDEAHLLSSEKLQILLTQSGGGVSCNIFQRFRRCHMS >gi|226333012|gb|ACII01000007.1| GENE 4 3700 - 4116 201 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291546196|emb|CBL19304.1| ## NR: gi|291546196|emb|CBL19304.1| hypothetical protein [Ruminococcus sp. SR1/5] # 2 133 337 477 482 151 54.0 2e-35 MFHLTNRIRTNAELSSFIQNMIHLTDRKTSKPYPHVSVVYANNEEETAALLEDYIHQGYE YEITAVRDIKRLVIILDERYYYDQNRYLRSKHFKEGRSDVRNLFHWLNQAKEELSIIVRE NTYVYETLLTLLQPDTVR >gi|226333012|gb|ACII01000007.1| GENE 5 4238 - 6109 2048 623 aa, chain + ## HITS:1 COG:CAC1254 KEGG:ns NR:ns ## COG: CAC1254 COG1032 # Protein_GI_number: 15894536 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 8 621 6 615 622 699 54.0 0 MRTLALSDEILMKVDKAARYIGGEVNSVMKDKNDVDIRFAMCFPDVYEIGMSNLGMMILY NMFNEREDVWCERVFSPWMDLDEIMREEHIPLFALESQESVKEFDFLGITLGYEMCYTNV LQVLDLSHVSLLAKDRKEDDPIVIGGGACAYNPEPIAEFFDMFYIGEGETVYDALFDAYK ANKAAGGSRADFLFAASQIPGIYVPSLYNVTYKEDGTIASFTPAKEGVPEKVCKQLITDV TKDYRAIKAPVVPFIKATQDRVTLEIQRGCIRGCRFCQAGMIYRPTRERNVEDLKASARE MLQNTGHEEISLSSLSSSDYSELKELVNFLIEEFHGNAVNISLPSLRIDAFALDVMSKVQ DVKKSSLTFAPEAGSQRLRNVINKGLTEENILHGAGEAFKGGWNQVKLYFMLGLPTETED DMKGIAHLAQKIAETYYEEVPKEKRNGKVQVNVSTSFFVPKPFTPFQWAGMYREEDFVEK AKVVKSEIRAQLNQRSIRYSWHEPDVTILEGFLARGDRRCSKVILRAYEKGAIYDAWSES FDYNIWKESFAETNTDIDFYTLRERSTDEILPWDFIDAGVTKKFLIHEWEQAKKETVTPN CRQKCSGCGAMRYGGGVCYEGKN >gi|226333012|gb|ACII01000007.1| GENE 6 6093 - 6800 930 235 aa, chain + ## HITS:1 COG:CAC1255 KEGG:ns NR:ns ## COG: CAC1255 COG5011 # Protein_GI_number: 15894537 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 216 1 226 238 110 34.0 2e-24 MKVRIKFSKEGPVKFVGHLDTMRYFQKAIRRANLPVAFSGGYSPHMIMSFAAPLGVGTES LGEYFDLELAETVPTSEITRRLDAVMVEGVHVLSTRQVEDGKAGKAMSLVAAADYYVEFR PGKEPEISWKDKISDFLAQPEIKVMKKTKRSEKEIDIRPFIYKMELQGDKIFMMLASASA NYTKPELVTDTFFSWLGIELPEFAYSIKRLEVYADKGTEEEHKFVTLEALGEEVE >gi|226333012|gb|ACII01000007.1| GENE 7 6817 - 7998 929 393 aa, 
chain + ## HITS:1 COG:XF1125 KEGG:ns NR:ns ## COG: XF1125 COG1530 # Protein_GI_number: 15837727 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Xylella fastidiosa 9a5c # 12 392 17 411 497 204 32.0 2e-52 MNIRQIPCYVCAYEEDGKVMELRFEPVGEKSSLGNIYVGQIENIAANIGAAFVQISAGEK CYLQLSDAPNAIYASVKKGDRPLKAGDEILVQISREAMKGKLPAVTTNLNFTGKYLVLTT GDKKFGLSSKLSNDDRSRISKWMEAEVNRPDKEFGIIVRTNAADASKEEILKELEYLKGL YHKAAVDGRSRTCFSCVYRTEPFYISAVRDANSRNLEEIITDIPEISRQISDYLNSNSPE EKEKLRFYDDKLLPLYKLYRLETVLEEIQHEKVWLNSGGFLVIQQTEAFVSIDVNSGKFT GKKKMQETYRKINLEAAKEIARQLRLRNLSGIILIDFINMENQDHQDELFHVLQKYLRKD PVKAKAVDITPLHILELTRKKVRKPVIEEIREL Prediction of potential genes in microbial genomes Time: Sat May 28 19:04:54 2011 Seq name: gi|226333011|gb|ACII01000008.1| Ruminococcus sp. 5_1_39B_FAA cont1.8, whole genome shotgun sequence Length of sequence - 8176 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 5, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 17 - 76 1.7 1 1 Op 1 . + CDS 121 - 426 401 ## PROTEIN SUPPORTED gi|240144148|ref|ZP_04742749.1| 50S ribosomal protein L21 2 1 Op 2 . + CDS 439 - 759 186 ## PROTEIN SUPPORTED gi|228005639|ref|ZP_04052605.1| predicted ribosomal protein 3 1 Op 3 14/0.000 + CDS 764 - 1051 400 ## PROTEIN SUPPORTED gi|160880681|ref|YP_001559649.1| 50S ribosomal protein L27 + Term 1063 - 1109 10.4 + Prom 1087 - 1146 9.8 4 2 Op 1 1/0.000 + CDS 1286 - 2581 1660 ## COG0536 Predicted GTPase 5 2 Op 2 7/0.000 + CDS 2655 - 2951 218 ## PROTEIN SUPPORTED gi|212638657|ref|YP_002315177.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein 6 2 Op 3 9/0.000 + CDS 2965 - 3600 557 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase 7 2 Op 4 6/0.000 + CDS 3608 - 4198 625 ## COG1713 Predicted HD superfamily hydrolase involved in NAD metabolism + Prom 4261 - 4320 3.1 8 2 Op 5 . 
+ CDS 4371 - 4730 578 ## COG0799 Uncharacterized homolog of plant Iojap protein + Term 4777 - 4828 2.5 - Term 4963 - 5020 16.3 9 3 Tu 1 . - CDS 5141 - 5761 773 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) - Prom 5856 - 5915 6.2 10 4 Tu 1 . - CDS 6063 - 6254 103 ## - Prom 6382 - 6441 4.8 + Prom 6048 - 6107 2.4 11 5 Op 1 . + CDS 6157 - 7041 666 ## EUBREC_1735 hypothetical protein 12 5 Op 2 . + CDS 7016 - 7243 222 ## gi|153810903|ref|ZP_01963571.1| hypothetical protein RUMOBE_01287 13 5 Op 3 . + CDS 7215 - 7586 355 ## Cphy_2363 hypothetical protein 14 5 Op 4 . + CDS 7602 - 8175 564 ## COG0325 Predicted enzyme with a TIM-barrel fold Predicted protein(s) >gi|226333011|gb|ACII01000008.1| GENE 1 121 - 426 401 101 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240144148|ref|ZP_04742749.1| 50S ribosomal protein L21 [Roseburia intestinalis L1-82] # 1 100 1 100 101 159 76 8e-39 MYAIIATGGKQYKVAEGDVIRVEKLGVEAGETVTFDNVIAVSNDGLKAGEDVKDASVTAT VVENGKAKKVIVYKYKRKTGYHKKNGHRQQYTKVKIEKING >gi|226333011|gb|ACII01000008.1| GENE 2 439 - 759 186 106 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|228005639|ref|ZP_04052605.1| predicted ribosomal protein [Alicyclobacillus acidocaldarius subsp. 
acidocaldarius DSM 446] # 1 94 1 94 106 76 42 7e-14 MITITVKKRNGNYLEFVSKGHAGYAEEGQDIVCAAVSVLVINTVNSLETFTDDQFEVQED DGYVSFHFTAPVTERGTLLMDSLILGLTEIEHSYNNRYLTVKVKEV >gi|226333011|gb|ACII01000008.1| GENE 3 764 - 1051 400 95 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160880681|ref|YP_001559649.1| 50S ribosomal protein L27 [Clostridium phytofermentans ISDg] # 1 95 1 95 95 158 83 1e-38 MMKMNLQLFAHKKGVGSTKNGRDSESKRLGAKRADGQFVKAGNILYRQRGTKIHAGENVG CGKDYTLFALKDGVVRFTRKGRDKKQVSIVPVEAE >gi|226333011|gb|ACII01000008.1| GENE 4 1286 - 2581 1660 431 aa, chain + ## HITS:1 COG:CAC1260 KEGG:ns NR:ns ## COG: CAC1260 COG0536 # Protein_GI_number: 15894542 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 1 431 1 424 424 397 49.0 1e-110 MFADRAKIYIRSGKGGDGHVSFRRELYVPNGGPDGGDGGRGGDVIFEVDEGQNTLGDYRH RRKYKAEDGQEGGKKRCHGADGKDVVLKVPEGTVIMDAESGKVIADMSGENKRQIVLRGG RGGKGNQHYATATMQVPKYAQPGQPAQELEVLLELKVIADVGLVGFPNVGKSTFLSRVTN AQPKIANYHFTTLSPNLGVVDTENGGFVIADIPGLIEGASEGVGLGHEFLRHIERTRVII HIVDAASTEGRDPIDDIYKINKELEAYNPEIAARPQVIAANKIDCIYTEDGEESPIDKLK AEFEPKGIQVYPISAVSGQGVRELLFHVKELLDSCPQEPVVFEQEFFPEDMLIGENLPYT VEQSPEDPTIFVVEGPKIEKMLGYTNLDSEKGFAFFQKFLKEGGILEELEKAGIQEGDTV RMYGFDFDYYK >gi|226333011|gb|ACII01000008.1| GENE 5 2655 - 2951 218 98 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|212638657|ref|YP_002315177.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein [Anoxybacillus flavithermus WK1] # 1 97 2 97 97 88 47 1e-17 MTSKQRAYLKGLAMTMDPILQLGKGGLTPENTAAVDEALAARELIKISVLQNCLEDPRQM AEVLAERTRSQIVQVIGKKIVLYREGKNEKKKIELPRK >gi|226333011|gb|ACII01000008.1| GENE 6 2965 - 3600 557 211 aa, chain + ## HITS:1 COG:CAC1262 KEGG:ns NR:ns ## COG: CAC1262 COG1057 # Protein_GI_number: 15894544 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Clostridium acetobutylicum # 9 204 5 199 200 138 35.0 6e-33 
MADIKHRIGIMGGTFDPIHLGHLILGEKAYEQFRLEKVLFMPSGNPPHKRNRQGRATDEE RVEMVRRAITGNPHFELSLTEMHENGYTYTYHTLEMLKEKNPDTDYYFIIGADSLYDFDT WREPERICRNCILVTAVRNHFTIAELEAEMNRLSLKYNGTFLTLNTTNLDVSSEMLRNWI SEDKSVRYYIPDPVIEYIRENQIYSLAEKKG >gi|226333011|gb|ACII01000008.1| GENE 7 3608 - 4198 625 196 aa, chain + ## HITS:1 COG:CAC1263 KEGG:ns NR:ns ## COG: CAC1263 COG1713 # Protein_GI_number: 15894545 # Func_class: H Coenzyme transport and metabolism # Function: Predicted HD superfamily hydrolase involved in NAD metabolism # Organism: Clostridium acetobutylicum # 1 169 1 166 189 122 39.0 5e-28 MAEYDFIKMQKKLAKYLDEDRYEHTLGVMFTCAALAMVHDCDLITAQTAGLLHDCAKCIP NKKKLKMCSQHHISVSEFEQEHPFLLHAKLGAYVAKAKYDVTDENILSAITWHTTGKPEM TLLEKIVYIADYIEPKRDKAPNLAIVRKLAFQDLDECMYKILGDTLAYLEENPKDIDNAT KDAFLYYRDLHMKRQS >gi|226333011|gb|ACII01000008.1| GENE 8 4371 - 4730 578 119 aa, chain + ## HITS:1 COG:SP1744 KEGG:ns NR:ns ## COG: SP1744 COG0799 # Protein_GI_number: 15901576 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Streptococcus pneumoniae TIGR4 # 4 117 3 116 117 93 37.0 7e-20 MNRELEMAKLACRALDEKKGKDIKVIDIHEVSVIADYFVIASASNQNQVQAMVDNVDETL GRAGFEAKQIEGTRNSSWVLMDYGDMIVHVFDEENRLFYDLERIWRDGKVLDVNEFLEK >gi|226333011|gb|ACII01000008.1| GENE 9 5141 - 5761 773 206 aa, chain - ## HITS:1 COG:BH2356 KEGG:ns NR:ns ## COG: BH2356 COG1974 # Protein_GI_number: 15614919 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Bacillus halodurans # 5 203 3 204 207 221 55.0 7e-58 MPYGRITPKQQEILDYIKNEILNRGFPPAVREICEAVNLKSTSSVHSHLEALEKNGYIRR DATKPRAIEIIDDNFNLVRREVVNVPIIGTVAAGQPLLAVENIEGYFPIPAEFMPNSQSF LLKVKGESMINAGIFDGDQVLVKQQSTAEDGDIVVALIDDGATVKTFHKEKGYYRLQPEN DTMEPILVHEGLQILGKVFGVFRLYE >gi|226333011|gb|ACII01000008.1| GENE 10 6063 - 6254 103 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MALNAIYMNKMPNMIVPIFSPTTMRSLARIPGIFFDLLSDNGMSPFCAVHNHSITVQRFH SDC 
>gi|226333011|gb|ACII01000008.1| GENE 11 6157 - 7041 666 294 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1735 NR:ns ## KEGG: EUBREC_1735 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 14 287 17 284 462 135 31.0 1e-30 MPGILARLRMVVGLNIGTIMFGILFIYMAFSAILYFTTTHIESYQVTSGPLSRNETYTGL AIREESLCKADSSGYITYYAREGSKINASGAVYGLSDTKTASAPASLAPEELSKVRTDMM SFSKGFSSSKFNSTYSFKYELKGNILQYAESGNTSSAPLTSDEDGEGNDSGDNDKTADIY PGNQTICQSQSDGIVLYSMDNYEGKTVDAVKAEDFDQNSYHETDLKTSDKVKAGDDIYTI ITDERWSLLIPLSDRQVEKLKDRSTIRVKFLKDDMTQNGDFSIITIDGEQIWTD >gi|226333011|gb|ACII01000008.1| GENE 12 7016 - 7243 222 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|153810903|ref|ZP_01963571.1| ## NR: gi|153810903|ref|ZP_01963571.1| hypothetical protein RUMOBE_01287 [Ruminococcus obeum ATCC 29174] # 4 73 254 324 443 95 70.0 1e-18 MVNKYGQIDFNKGLIRYASDRFLDIELVTNTVVGLKIPLTSIVTKDFYVIPSRMATTQDN QTGFTLYEGKNKNNI >gi|226333011|gb|ACII01000008.1| GENE 13 7215 - 7586 355 123 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2363 NR:ns ## KEGG: Cphy_2363 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 38 123 377 462 462 83 53.0 3e-15 MRGKIKTTFKNVSIYASIDDSSVSKLAVDEAEQPQIYYVDKSSFKEGDALINTDTGEKYV IGETDTLEGVYCINKGYAVFRRIEILDENEEYAIVSKNTTYGLSRYDHIVRNADKVKEED ILY >gi|226333011|gb|ACII01000008.1| GENE 14 7602 - 8175 564 191 aa, chain + ## HITS:1 COG:CAC2121 KEGG:ns NR:ns ## COG: CAC2121 COG0325 # Protein_GI_number: 15895390 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Clostridium acetobutylicum # 24 191 15 181 221 160 52.0 1e-39 MLADKLNLVKKNIEEACDTAGRSPQEVTLIAVSKTKPVEMLKEAYDAGARVFGENKVQEI VDKYDQMPSDVQWHMIGHLQRNKVKYIIDKVSMIHSVDSVRLAEAIEKEAAKKDICMPVL IEVNVAGEESKFGLSVEEVLPFLEEISSYEHLQVKGLMTIAPFVANPEENREVFQKLKKL SVDIATKNINN Prediction of potential genes in microbial genomes Time: Sat May 28 19:05:19 2011 Seq name: gi|226333010|gb|ACII01000009.1| Ruminococcus sp. 
5_1_39B_FAA cont1.9, whole genome shotgun sequence Length of sequence - 26935 bp Number of predicted genes - 34, with homology - 33 Number of transcription units - 10, operones - 9 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 14/0.000 + CDS 3 - 152 123 ## COG0325 Predicted enzyme with a TIM-barrel fold 2 1 Op 2 . + CDS 172 - 831 849 ## COG1799 Uncharacterized protein conserved in bacteria 3 1 Op 3 . + CDS 838 - 1590 324 ## PROTEIN SUPPORTED gi|227874237|ref|ZP_03992436.1| possible ribosomal protein S4e 4 1 Op 4 . + CDS 1618 - 2709 989 ## COG0337 3-dehydroquinate synthetase 5 1 Op 5 15/0.000 + CDS 2706 - 3230 354 ## COG0597 Lipoprotein signal peptidase 6 1 Op 6 . + CDS 3235 - 4149 350 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 7 1 Op 7 . + CDS 4216 - 4854 710 ## EUBELI_01147 cytidylate kinase + Term 4868 - 4914 11.0 + Prom 5017 - 5076 12.0 8 2 Op 1 . + CDS 5099 - 6556 1043 ## COG0628 Predicted permease 9 2 Op 2 . + CDS 6616 - 8148 1106 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 8208 - 8250 0.4 + Prom 8188 - 8247 7.0 10 3 Op 1 . + CDS 8282 - 8707 359 ## COG0824 Predicted thioesterase 11 3 Op 2 . + CDS 8739 - 8903 72 ## gi|253577884|ref|ZP_04855156.1| conserved hypothetical protein + Term 9054 - 9098 6.6 + Prom 9213 - 9272 7.3 12 4 Tu 1 . + CDS 9293 - 10264 781 ## COG0582 Integrase - Term 10427 - 10461 1.4 13 5 Op 1 . - CDS 10531 - 10788 284 ## Cphy_3270 hypothetical protein - Prom 10816 - 10875 10.4 - Term 10844 - 10890 1.2 14 5 Op 2 . - CDS 10937 - 11356 598 ## EUBREC_1736 hypothetical protein - Prom 11416 - 11475 9.1 15 6 Op 1 . - CDS 11740 - 12402 352 ## COG0582 Integrase 16 6 Op 2 . - CDS 12386 - 12883 331 ## gi|253577889|ref|ZP_04855161.1| predicted protein 17 6 Op 3 . - CDS 12901 - 13545 342 ## Caci_5527 hypothetical protein 18 6 Op 4 . - CDS 13623 - 13901 62 ## - Prom 13942 - 14001 4.9 - Term 13978 - 14015 4.0 19 7 Op 1 . 
- CDS 14052 - 14336 291 ## EUBREC_3088 hypothetical protein 20 7 Op 2 9/0.000 - CDS 14361 - 14636 60 ## COG3041 Uncharacterized protein conserved in bacteria 21 7 Op 3 . - CDS 14633 - 14902 281 ## COG3077 DNA-damage-inducible protein J - Prom 14930 - 14989 9.1 - Term 15233 - 15270 -0.0 22 8 Op 1 . - CDS 15517 - 15714 155 ## gi|253577894|ref|ZP_04855166.1| predicted protein 23 8 Op 2 . - CDS 15800 - 16126 370 ## gi|253577896|ref|ZP_04855168.1| predicted protein - Prom 16357 - 16416 6.7 24 9 Op 1 . - CDS 16508 - 16843 281 ## Acfer_1364 hypothetical protein 25 9 Op 2 . - CDS 16900 - 17046 155 ## Acfer_1364 hypothetical protein - Prom 17139 - 17198 3.5 - Term 17168 - 17217 13.5 26 10 Op 1 . - CDS 17224 - 18423 994 ## COG1686 D-alanyl-D-alanine carboxypeptidase 27 10 Op 2 . - CDS 18420 - 19580 1031 ## COG0772 Bacterial cell division membrane protein 28 10 Op 3 . - CDS 19574 - 19861 104 ## Cphy_2369 cell division topological specificity factor MinE 29 10 Op 4 . - CDS 19887 - 20675 856 ## COG2894 Septum formation inhibitor-activating ATPase 30 10 Op 5 . - CDS 20736 - 23630 2851 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 31 10 Op 6 . - CDS 23615 - 24145 479 ## EUBREC_1743 hypothetical protein 32 10 Op 7 22/0.000 - CDS 24135 - 25046 1011 ## COG1792 Cell shape-determining protein 33 10 Op 8 4/0.000 - CDS 25064 - 26077 845 ## COG1077 Actin-like ATPase involved in cell morphogenesis 34 10 Op 9 . 
- CDS 26100 - 26798 499 ## COG2003 DNA repair proteins - Prom 26859 - 26918 5.7 Predicted protein(s) >gi|226333010|gb|ACII01000009.1| GENE 1 3 - 152 123 49 aa, chain + ## HITS:1 COG:CAC2121 KEGG:ns NR:ns ## COG: CAC2121 COG0325 # Protein_GI_number: 15895390 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Clostridium acetobutylicum # 3 45 177 219 221 62 67.0 1e-10 ATKNINNVNMSVLSMGMTNDYQVAVEEGATMVRVGTGIFGERDYSIKED >gi|226333010|gb|ACII01000009.1| GENE 2 172 - 831 849 219 aa, chain + ## HITS:1 COG:SA1032 KEGG:ns NR:ns ## COG: SA1032 COG1799 # Protein_GI_number: 15926772 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Staphylococcus aureus N315 # 69 206 45 177 187 69 28.0 5e-12 MGVLDKFLDAIKLNDDYDDDGFLDDDLLDEEEDDDFLDDDFDEKPKKKFFDKFSKKKESD DNDDFDDVEEKAVKTAPKQASAPKQTASAPSKVTVKQERAERQTRPAASSKITPMRSSRK SNQGPNMEVCVIKPSSMEDTREIADTLVDNSTVILNLEGIDVELAQRIIDFTSGACYSLG GSLQKVSSYIFVLGPYNVDITGDLQNILGGSAPSVRAGY >gi|226333010|gb|ACII01000009.1| GENE 3 838 - 1590 324 250 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227874237|ref|ZP_03992436.1| possible ribosomal protein S4e [Oribacterium sinus F0268] # 1 249 1 253 254 129 33 2e-29 MEKDEFFLKRIRELANLSWQRDIVTFSDFLNLNEQNIVSSLKHQFPQIVMETSGGYENAE RQMVAFHPDALAFTWEYPIDCLRIEPKALKFSEELSHRDYLGALLNLGVDRSVIGDIVVQ KKAAWFFCQNKMTEFFLENLCRVRHTNILITKVEDSDEFPRPVLESVSGTCASVRLDSLI SLAFKTSRSSMVSYIEGGQVFVNGKLITSNGYEPKDGDIISVRGKGRFIFDGVSHQTKKG RCSVRIMRYV >gi|226333010|gb|ACII01000009.1| GENE 4 1618 - 2709 989 363 aa, chain + ## HITS:1 COG:slr2130 KEGG:ns NR:ns ## COG: slr2130 COG0337 # Protein_GI_number: 16330660 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Synechocystis # 39 352 37 354 361 266 43.0 6e-71 MAEQKLLVKREGDFHYPIYFKNDFQDLAGAIREEGLENRKICIVTDSHVAPLYHEAVKSA LQEISSEIFSFVFEAGEKNKNLNTVQELYKTLIENEMDRKGLLVALGGGVVGDLTGFGAS TYLRGIDFIQVPTTLLAQVDSSVGGKTGVDFLQYKNMVGAFHQPRLVYMNMSTLQSLPNR 
EFTCGMGEILKTGLICDEEFFRFVCKNQPEISKLDLSMLSRMIRRCCEIKAGVVERDPKE QGERALLNLGHTVGHAVEKMKNFQLLHGQCVGVGLIAAAYLSMQRGLLTEEEYEEIRKGC HSYNLPLSVDSLNAGDVLAATKKDKKMEAGHIKFILMDGIGKSFIDKTVTDEELLQAIRE ILI >gi|226333010|gb|ACII01000009.1| GENE 5 2706 - 3230 354 174 aa, chain + ## HITS:1 COG:SPy0826 KEGG:ns NR:ns ## COG: SPy0826 COG0597 # Protein_GI_number: 15674864 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Streptococcus pyogenes M1 GAS # 20 166 10 150 152 84 36.0 1e-16 MTAVSSVSSCMHWIRGCIIILLLTILDQGSKSLVLAQLKDHPDISLIPGVLQLRYLENRG MAFGLFEGKIPVFVILCLLFFGVFIYVYARIPKNRYYLPLSVTALVMVSGALGNFIDRVC RGYVVDFIYFSLIDFPVFNIADMYVVCSGILLVMLVCFRYKNDEDYDFLRIKKD >gi|226333010|gb|ACII01000009.1| GENE 6 3235 - 4149 350 304 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 28 298 27 285 285 139 36 2e-32 MEILNYQVSDGQSGIRIDRYLSEMNEELSRSYIQKLLKEQKITVNGSAVKANYKVQEGDE ISVAVPDIKEPDILPEDIPLDILYEDDDVLIVNKPKGMVVHPSAGHTSGTLVNAIMFHCK DNLSGINGVLRPGIVHRIDKDTTGALLVCKNDNAHRNLAEQLKEHSIRRRYRAIVAGVLK EDEGTIEGPIGRHPIDRKKMAVNYKNGKDAVTHYKVLERFKNATYVECRLETGRTHQIRV HMTSIGHPLLGDEVYGSGKNPYHLQGQTLHAMILGFVHPSTGEYMEFTAPLPEYFVKLLE KLRK >gi|226333010|gb|ACII01000009.1| GENE 7 4216 - 4854 710 212 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01147 NR:ns ## KEGG: EUBELI_01147 # Name: not_defined # Def: cytidylate kinase # Organism: E.eligens # Pathway: Pyrimidine metabolism [PATH:eel00240]; Metabolic pathways [PATH:eel01100] # 6 205 5 203 212 253 62.0 2e-66 MNSTSSIITIGREYGSGGRQIGQEVAKYFGIKCYDKELLEHAANESGICKELFENHDERP TNSFLYSLVMDTYSFGYSSSGFTDMPMNHKVFLAQFDAIKKLASEGPCVMVGRCADYALS EYKDCFSVFVHADTDWRINRLSQKHNKTAKEAKDMINKTDKSRSSYYNYYTNKKWGAASS YNLCVDSSKLGIDGAAKAIIQAIEIFDSIKNK >gi|226333010|gb|ACII01000009.1| GENE 8 5099 - 6556 1043 485 aa, chain + ## HITS:1 COG:CAC0730 KEGG:ns NR:ns ## COG: CAC0730 COG0628 # Protein_GI_number: 
15894017 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Clostridium acetobutylicum # 18 393 13 383 383 134 27.0 4e-31 MKFRWDNRYLHWGVTAFLVIAASMLFYYGIFHMKTLIVGIKTFLGIMAPIIYGVILAYIL SPLINLFEQKLIYPQLEKHNIKLQKKGKRAIRWGCVLFSMFLFWIIIYALLMMVLPQLIR SIMSIIYSFPYYVKVIEKWLNSFVEHGWKLNPEMLDMINQYSVKAQEYLTTDILPQMQDM LKNVSAGIFDILIFMKNFLIGAIVALYVLADKEKFVAKSKMMVYAILPHKWANMLIRVMR FTDKTFGGFIYGKLLDSAIIGILCYFGMLLLDLPYPILISVIIGMTNVIPFFGPYIGAIP CILLILVVDPIKGLYFAIFILLLQQFDGNILGPKILGESTGLSSFMVIVAIMIGGGLFGV PGMIAGVPVFAVLYALIWRLINHSLNDKKMPAEEETYINIDCLDEQTGEVLLLCKEEKHK VSPEELEARKNNFFMKIWNPLSGLVITIWKYLLRILNIIWGYIKKAGIFLKNRYVELKNK REKNK >gi|226333010|gb|ACII01000009.1| GENE 9 6616 - 8148 1106 510 aa, chain + ## HITS:1 COG:CAC1079_2 KEGG:ns NR:ns ## COG: CAC1079_2 COG5263 # Protein_GI_number: 15894364 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Clostridium acetobutylicum # 32 341 25 321 2566 135 36.0 1e-31 MTAFLLVLALALQGGIFSGFSVPVQAAATSKKQTGFVKKNGSWYYYDKNGKKATGWYKSA AGNQYYFGKTGAAKAGILTISGKKYCFNEKGKMLTTWQTVNGKTYFFDEKKGYMHTGWVT TAAGNKYYFWNDGVIRSGFHKVNNVYYCFNEKGKMYKNCFRKSGNSTYYLQANGTMAKGR LKVKGSWYSFNRNTGQLVRSGWYKETDGSYYYAASNGKLVKGFYKPDSYYRYFRKSDCKL VTGWQTINGHKYYFKKTNGIRYDDIILKSSSGKRYYFESDGKLASSKWITKNSAHYYAKS DGVLASGWLTVDGKKYYMNPSTCERESGWVTVDEEKYYLNSSTGALVTNQWIDDNHYVGE DGSLIPDYQNGVSFRWPLSSGYSYISSYFGNRESPGGIGSTNHKGIDIPAPTGTPIYAAA SGTIVAMLSPASSGGAGYYTKINHDGKGLITEYMHQSKFNPNLSVGDKVKKGDIIGYVGS TGNSTGPHLHFGVMVNGVNQNPLNYVKRPS >gi|226333010|gb|ACII01000009.1| GENE 10 8282 - 8707 359 141 aa, chain + ## HITS:1 COG:SA1185 KEGG:ns NR:ns ## COG: SA1185 COG0824 # Protein_GI_number: 15926931 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Staphylococcus aureus N315 # 14 132 9 128 155 92 36.0 2e-19 MKEESEKTYTYYRKAQYHETDQMGIIHHSNYVKWMEEARIGYMSRMGFSYKKVEELGVIS PVVEISVAYRKQVFFDDEIRIRVGIKQYNGISLEFNYEFFNASRNEICTTAYSRHCFLKD GKLIALKKELPELNRIITDCL 
>gi|226333010|gb|ACII01000009.1| GENE 11 8739 - 8903 72 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253577884|ref|ZP_04855156.1| ## NR: gi|253577884|ref|ZP_04855156.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 54 1 54 54 105 100.0 9e-22 MIKSRLGQPIAAFSGDDVYWEVPEDLPKMTKLMVDGKALYIHRANYQIIDKKLL >gi|226333010|gb|ACII01000009.1| GENE 12 9293 - 10264 781 323 aa, chain + ## HITS:1 COG:L57903 KEGG:ns NR:ns ## COG: L57903 COG0582 # Protein_GI_number: 15673214 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Lactococcus lactis # 22 312 7 316 356 90 27.0 3e-18 MDKYIDYYNDSKKVTYHEQTDISNILKLRAVLKTLPDFCKDYFRAMDPLTTTKTRISYAY DIRIFFQFLLDENPAFKDYSMKDFTVDVLDQLKAIDIEEYQEYLKVYKNGDKTETNGERG LKRKISALRSFYAYYYKHEFIQTNPTVLVDVPKTHEKNIIRLDADEVAMLLEHIEHCGDE LTGQKRVYYEKTKERDLAIVTLLLGTGIRVSECVGLDVEDVDFKNNGIKVTRKGGNEMVV YFGHEVEKALKKYLEVRKNIVPVAGHEHALFYSTQRKRIGIQAVENLVKKYAGAITTTKK SHHTSFAVLTVLLCIRKQAIFTL >gi|226333010|gb|ACII01000009.1| GENE 13 10531 - 10788 284 85 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3270 NR:ns ## KEGG: Cphy_3270 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 85 1 85 85 75 45.0 6e-13 MEELYRIIEQKIKDAGYPRAISGEMVYGDICDQIEGKENGSYVLLSKVEDDVIFEYHITV MDDEFNLGLLTIRTPEGVFETDFDK >gi|226333010|gb|ACII01000009.1| GENE 14 10937 - 11356 598 139 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1736 NR:ns ## KEGG: EUBREC_1736 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 139 1 140 140 157 59.0 9e-38 MVELIVGKKGKGKTKVLLDKVNGAVKEANGSIVYLDKSTKHMYELNNKVRLIDVSGFPIK NADEFVGFICGILSQDHDLEQIYLDSFLKVAKLEDKDITGTLGQLDEIGKQFNVTFVISV SLDKEEIPEALHGKISVAL >gi|226333010|gb|ACII01000009.1| GENE 15 11740 - 12402 352 220 aa, chain - ## HITS:1 COG:SA1835 KEGG:ns NR:ns ## COG: SA1835 COG0582 # Protein_GI_number: 15927603 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Staphylococcus aureus N315 # 20 215 199 379 390 
81 27.0 1e-15 MVQQDNFLTYIKNSKYAYYYPLFVFILETALRVGEVCGLTWDDVDLEHGFVNITHQLQYN NVGKDACQYQILTPKSKSGVRNVPLSNAARDALMEQMKNQIEMDKKTEIEIDGYKDFVFS NRQNKPFITQTIGRILNRIVDDYNSDIKKFGGTIEPLPHIHPHLLRHTSCSRMAEAGVDP RTLQDIMGHASMKMTMELYNHVTDERLTNEIQKLNNRRIS >gi|226333010|gb|ACII01000009.1| GENE 16 12386 - 12883 331 165 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577889|ref|ZP_04855161.1| ## NR: gi|253577889|ref|ZP_04855161.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 165 1 165 165 318 100.0 7e-86 MNEKFRTIGLDTEDKITEFIDYIDDLEDTEEIKSDVYGFFEEESSYNENHIPAYVVMTKY KGKYIVWVGDQCCDFAISYPPYNLCDIFSTYDEANEYYKSLAHSGIKPLSEQVEECKSQL IDEVDLRVSTMVPNVLNIKVWYKNCLIIEHLTQFDVQYINNGAAG >gi|226333010|gb|ACII01000009.1| GENE 17 12901 - 13545 342 214 aa, chain - ## HITS:1 COG:no KEGG:Caci_5527 NR:ns ## KEGG: Caci_5527 # Name: not_defined # Def: hypothetical protein # Organism: C.acidiphila # Pathway: not_defined # 17 205 19 190 197 81 29.0 2e-14 MNDNKMITQSRIIEEYGWTKSLISKFLPDPVLKANPHYRKAAPMRLWDEDTVKQVMTTSE FQDAMEKANKRKKSASKAVETKYSNMNHSVQLFIDSITIKVLSDEELKTKALKAKQEWYQ CHPCSLNGDWIDEPYMKNAYTANEETINRWIVNYIRHNLVSYDNFLGSIDGKVGSVEAYP EVKIAVLEKIANTYPKYKDECERQIFFVDFNADR >gi|226333010|gb|ACII01000009.1| GENE 18 13623 - 13901 62 92 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGKSYNGADNIAIISPESSTNLCSILSITSQEYADFLVKHANELDPYLILIKPSEIRGDE EYVCKVLRMIVNQSIKLDKEIEIPSRKYVVEL >gi|226333010|gb|ACII01000009.1| GENE 19 14052 - 14336 291 94 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3088 NR:ns ## KEGG: EUBREC_3088 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 94 1 94 94 153 81.0 2e-36 MNEYFPTVVQVIPLDNFHVQVFFDDGKIVDYDATEDLKTEIFKPLRDIDAFKEACTVMNG TLAWDLSGNRDESSCIDIDPFTVYELESINNLIA >gi|226333010|gb|ACII01000009.1| GENE 20 14361 - 14636 60 91 aa, chain - ## HITS:1 COG:HP0892 KEGG:ns NR:ns ## COG: HP0892 COG3041 # Protein_GI_number: 15645510 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Helicobacter pylori 26695 # 4 91 3 90 90 
83 44.0 7e-17 MKYTVKPTSKFQKDLKRIQKRGYDMRLMTDIIKKLANGEILPPKNRDHNLSGNYSNCREC HIAPDWLLIYEVYEDELFLYLTRTGSHSDLF >gi|226333010|gb|ACII01000009.1| GENE 21 14633 - 14902 281 89 aa, chain - ## HITS:1 COG:SP0275 KEGG:ns NR:ns ## COG: SP0275 COG3077 # Protein_GI_number: 15900209 # Func_class: L Replication, recombination and repair # Function: DNA-damage-inducible protein J # Organism: Streptococcus pneumoniae TIGR4 # 3 89 5 87 87 57 35.0 5e-09 MANVSIRMDDNLKKQAEDLFNDLGMNLTTAFTIFVKQAIREQGIPFEITREIPNSETISA INEVQQMKQNPSLGKAYTDVDKMMEELLA >gi|226333010|gb|ACII01000009.1| GENE 22 15517 - 15714 155 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577894|ref|ZP_04855166.1| ## NR: gi|253577894|ref|ZP_04855166.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 65 1 65 65 115 100.0 1e-24 MRCKDCEYVKCYAVNKKMYYCDNRNRVNLMGKLGEDNLPETSPEWCPMADRTIDENNMAE KGLEK >gi|226333010|gb|ACII01000009.1| GENE 23 15800 - 16126 370 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577896|ref|ZP_04855168.1| ## NR: gi|253577896|ref|ZP_04855168.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 108 1 108 108 202 100.0 6e-51 MNKTIYEAEFTKMVHDKMREANRFKEYERIPKNRIGGDYWNTYWTIRYMLRTIEDILAKG DKLVLLGFFTVEPKFYKEKKHVLEWKEQAKMCMRFQKDIRQSLNLVLC >gi|226333010|gb|ACII01000009.1| GENE 24 16508 - 16843 281 111 aa, chain - ## HITS:1 COG:no KEGG:Acfer_1364 NR:ns ## KEGG: Acfer_1364 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 3 106 71 175 185 85 40.0 7e-16 MAFACLLHDASECYLSDVPRPFKKTLSGYKEQEKNLLDLIYQKYLGSPLNAKEKQLLKEI DDDMLWFDLTYLLNEHQEREAPEIHITIDYTVRAFEETEKEYLELYSFFTH >gi|226333010|gb|ACII01000009.1| GENE 25 16900 - 17046 155 48 aa, chain - ## HITS:1 COG:no KEGG:Acfer_1364 NR:ns ## KEGG: Acfer_1364 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 1 47 1 47 185 75 70.0 7e-13 MNNNCITTYTGRHIDPLHPDPDMICIEDIAHALSLICRGNGQVKTFFQ >gi|226333010|gb|ACII01000009.1| GENE 26 17224 - 18423 994 399 aa, chain - ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 153 395 37 282 425 131 34.0 3e-30 MRTKKKSSGKIPEKKSAVPYKSAAPHRHKPSRPLQTVKLPKSQMPSRQTEEHQALLRTKY HKKRQRLKHIFRILVGVFVILAAAGGVFYGCNILSGNYFGTPEQAFDSSEVFTNALAAKE NMRTESFAQKLCVSSKGNVDCIKNAQLEEGQKGLLFSLSNHKVLYANGIYDKVYPASITK IMTAMLALQSGKLNDTVTITQDNVTLEDGSQVCGFVAGDQVTLDQLLHCLLVYSGNDAAS AIAEYVGGSTENFVQMMNDYAAKLGCTGTHFSNPHGLQDENHYTTPYDIYLMLNEAFTYP EFTEITELPSYTVTYTGSDGTEKSTTLTATDHYLTGEATAPKDVTILGGKTGTTEVAGNC LAILTQNAYGKTFVSIVMGAATKELLYQEMNSLLQNINS >gi|226333010|gb|ACII01000009.1| GENE 27 18420 - 19580 1031 386 aa, chain - ## HITS:1 COG:CAC1251 KEGG:ns NR:ns ## COG: CAC1251 COG0772 # Protein_GI_number: 15894533 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Clostridium acetobutylicum # 40 371 52 372 373 168 32.0 2e-41 
MLKQYKLRFYNFRLIIFLLAISLIGVTLVGTAASDLRSKQFAGVILGLIVMLILSLLDYS WIMNFQWIMYGFNIIMLIVVRIAGDSANGAARWIGIGSFRFQPTELSKIILIVFFAKFFM DHEETLNTLKTLALSGVLLAVPLFLILEQPDLKNTITVVVIFCIMIYIAGLSYKIIGGAL LIAVPLTIIFLSIVVQPDQKLLKDYQRSRIMSFLYPENEEYSDDIEQQNNSKTAIASGEL VGKKLSGDDSTTSVNQGNFVAENQTDFIFAVAGEEYGFIGCVIIVLLLLAIAFECIRMSL RAKDLSGKVLCCGIGGLIALQSFINICVATGLAPNTGTPLPFVSYGLTSLVSLYIGMGLV LNVGLQSSTYNKEIIQKEIDRREDYL >gi|226333010|gb|ACII01000009.1| GENE 28 19574 - 19861 104 95 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2369 NR:ns ## KEGG: Cphy_2369 # Name: not_defined # Def: cell division topological specificity factor MinE # Organism: C.phytofermentans # Pathway: not_defined # 4 81 6 90 90 63 40.0 2e-09 MKAFFKEKKRSAGYARDRMKLLLISERIDCSPQMMKMLRNDMIHTVKKYLTIDEEQVKIQ ITQEPAVLHAYIPVLNKKDNRLVSSQLLKRTDKLC >gi|226333010|gb|ACII01000009.1| GENE 29 19887 - 20675 856 262 aa, chain - ## HITS:1 COG:BH3027 KEGG:ns NR:ns ## COG: BH3027 COG2894 # Protein_GI_number: 15615589 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Bacillus halodurans # 1 260 1 261 264 318 61.0 5e-87 MSEVIVITSGKGGVGKTTTVANIGTGLAMLGKKVVVVDTDIGLRNLDVVLGLENRIVYNL VDVINGSCRLRQALIKDKRHPELCLLPSAQTKDKSAVSPEQMIKLTDDLREQFDYILLDC PAGIEQGFKNAIAGADKALVVTTPEVSAIRDADRIIGLLEANEIRDISLIINRLRPDMIA RGDMMSVDDVTDILAVDLIGTILDDEQIVIATNQGEPLSGKSSQAEEEYMNICKRLLGEE IPFADINRKQGLLRKIGSFFKK >gi|226333010|gb|ACII01000009.1| GENE 30 20736 - 23630 2851 964 aa, chain - ## HITS:1 COG:RC0852 KEGG:ns NR:ns ## COG: RC0852 COG0768 # Protein_GI_number: 15892775 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Rickettsia conorii # 615 923 253 570 593 148 29.0 4e-35 MGTKIKRFFKKIRIKRTTVLVLVFVFMSATLVRQLFELQIIQGEAYISKFESRTTKKRVI KSTRGNIYDRNGEELATNVLAYSVTFEDNGTYDSTREKNLTLNGIAYKVLKILESNGDSI STGFHIVLDESGNYTFDVDEGFTLNRFRADIYGEPLIDNLTKKQKNATADQMIEFMSGKE 
GFSIVLYGDNAYTKEELTSHGLPENLSKQDVLELSKIRYALSTNSFKKYMAVTIASNISE KSVAAIRENQPELQGIDIVEDSVRKYIDDASMGPILGYTGQASSEELEELRQKNPDYSND AIIGKSGIEKYMETSLQGTDGEETVTVDNLGKVLKIDDSTRVEPVAGNDTYLTIDSSWQS AIYQILKQRVAGILLSKIEASKTYDFSVNDAAQIKIPIYDVYNALIANSVIDISKFSDAN ASDTEKNLYAKFQQKQQQVFDTITNRLTAENPPTVKDEDDQIQEYLTYICDDLLRDTLGI ISKNAIDTSDSTYQKWTTDKDISLKDYLTYAASQNWIDISKFSTEGDYLDSDEVYQALTD YLIDYLKKDTNFSKLLYKYMLQEDTISGSEICLVLYEQGVLSKDDGSYEALASGSMTAYD FMINKIYTLEIEPAQLALEPCSASAVITDVKTGDVLACVSYPGYDNNRLVNNMDTDYYAK LSTDKSSPFFNKATQQTAAPGSTFKILSTIAGMSEGVIDDGTYINCTGSFDLVTPPINCW NKQGHGEIEIREAIEQSCNYYFNMVGFKLGQDADGNFSENRSLSVLQKYASEIGLDKKTG IEITESAPHVSDSYAVPSYIGQGTNAYTTSQLARYATAIATSGNVYDLTLLDRQTDSKGN TLKKYEPDIINTVDVSQNVWDDIHDGMYRVVQTHRQFDGLGVDVAGKTGTAEVNIYHPNH GMFVGYAPASDPEYAIAVRIENGYTSGNACLAADDIFKYIFELTDEKSILTGVAASDTSD TSND >gi|226333010|gb|ACII01000009.1| GENE 31 23615 - 24145 479 176 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1743 NR:ns ## KEGG: EUBREC_1743 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 157 1 157 173 135 50.0 9e-31 MKSKIALFFMILISFLLQCTLLHVISIGSITPNLILILCISMGLMRGRKSGLWTGFFCGF LVDMFYGSVFGFYALIYMYIGFLSGYAHRICYDDDIKVPVMLAGIGDLLYGLSVYALQFL LRGRLGLGTYLYRIILPEIFYTVILTLIVYRVFHYINYHFMKNPSMKESESIWVLK >gi|226333010|gb|ACII01000009.1| GENE 32 24135 - 25046 1011 303 aa, chain - ## HITS:1 COG:CAC1243 KEGG:ns NR:ns ## COG: CAC1243 COG1792 # Protein_GI_number: 15894526 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Clostridium acetobutylicum # 44 280 35 273 283 103 28.0 4e-22 MKNLKKKVHFRIKLKSKHLLVIMSIFCVSCIAATFSSGVTTAPLQDAAGILIVPFENSIN RISSVLKGMQDRMRDKEEILSENENLKAQIDSLTEQNNKLIQDQTEYVRLQQMYNLDQQY TEYPKIAAEIISKDPGNWYDTFMINRGSADGIRVDNNVLADKGLVGIVTEVGTHWATVRS IIDDSSNVSAMTVSTQDNCVVEGDLELIDEGKLSFSQLYDQNNKVTVGERIVTSNISEKY VEGLFIGYVSEVEQNSNNLTKTGTIVTPVDFQHLKNVLVITVNKQDSVNDNAQAAQEATA NEE >gi|226333010|gb|ACII01000009.1| GENE 33 25064 - 26077 845 337 aa, chain - ## HITS:1 COG:FN1577 KEGG:ns NR:ns ## 
COG: FN1577 COG1077 # Protein_GI_number: 19704898 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Fusobacterium nucleatum # 5 315 10 323 342 170 33.0 4e-42 MLFEKTYGIDLGSSSVKVYSFFRNKTYIEKNMIASKGHTIIAMGNEAYDMFEKSPTDITV TSPMTFGMIANLELQEIVLYSMIRKIDHILGFGATMYFTVPLDMTAVEKRAYFHVANGHW LKKNRVFMVEAPIADAIAMGVNLKDPEGNMIVNIGAQSTEVSIITGGKIIISRKIPLGGR QMNESVCSEIRKRYNLQIGTRTAKRLKMVMGRLSDPKKEVRKVVGIDCISGLPREEIISS YVVNDGIMNCLNEIAAEMKTFLERIPPQISYHIAKQGIYITGGSTRLPYIDKYLASYTGF TFNLSDFYETSAVTGLEKIIRNKELREWAVPVTQRKL >gi|226333010|gb|ACII01000009.1| GENE 34 26100 - 26798 499 232 aa, chain - ## HITS:1 COG:BH3032 KEGG:ns NR:ns ## COG: BH3032 COG2003 # Protein_GI_number: 15615594 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Bacillus halodurans # 1 225 8 227 232 181 39.0 9e-46 MKQLPREQRPYEKCFMQGEGSLNDTELLAVILRSGTSGKNSLALAQEILKFMESSSYPGL MGLMHVSVQDLMKIHGIGQVKAVQLKCIGELSKRIAKAAARPQIVMNNPSSIAAYYMEEL RHEEQELVICMMSDVKGHFLGDKILTRGTATGSLVTPREIFMEALRRHAVSLILIHNHPS GDPTPSPDDLQITARIYQAGELLGIHLLDHIVIGDQRYCSFREEGLWNTCTE Prediction of potential genes in microbial genomes Time: Sat May 28 19:06:14 2011 Seq name: gi|226333009|gb|ACII01000010.1| Ruminococcus sp. 5_1_39B_FAA cont1.10, whole genome shotgun sequence Length of sequence - 18910 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 6, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 62 - 1657 1421 ## COG2509 Uncharacterized FAD-dependent dehydrogenases 2 1 Op 2 . - CDS 1654 - 2844 863 ## COG2081 Predicted flavoproteins 3 1 Op 3 . 
- CDS 2841 - 4115 1547 ## COG4100 Cystathionine beta-lyase family protein involved in aluminum resistance - Prom 4149 - 4208 3.5 4 2 Op 1 12/0.000 - CDS 4211 - 5155 1024 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 5 2 Op 2 6/0.000 - CDS 5175 - 7226 2000 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) 6 2 Op 3 . - CDS 7233 - 9863 2991 ## COG0249 Mismatch repair ATPase (MutS family) - Prom 9935 - 9994 8.0 + Prom 10500 - 10559 5.2 7 3 Tu 1 . + CDS 10596 - 11099 399 ## Cphy_0744 ATP-dependent OLD family endonuclease + Term 11110 - 11143 5.4 - Term 11098 - 11131 5.4 8 4 Tu 1 . - CDS 11139 - 12383 893 ## COG4826 Serine protease inhibitor - Prom 12487 - 12546 2.3 - Term 12425 - 12457 -0.1 9 5 Op 1 8/0.000 - CDS 12579 - 13733 703 ## COG0675 Transposase and inactivated derivatives 10 5 Op 2 . - CDS 13672 - 14343 316 ## COG2452 Predicted site-specific integrase-resolvase - Prom 14437 - 14496 9.1 - Term 14685 - 14738 11.3 11 6 Op 1 . - CDS 14985 - 18407 2142 ## COG5492 Bacterial surface proteins containing Ig-like domains 12 6 Op 2 . 
- CDS 18404 - 18748 287 ## gi|253577919|ref|ZP_04855191.1| predicted protein - Prom 18813 - 18872 3.7 Predicted protein(s) >gi|226333009|gb|ACII01000010.1| GENE 1 62 - 1657 1421 531 aa, chain - ## HITS:1 COG:L195271 KEGG:ns NR:ns ## COG: L195271 COG2509 # Protein_GI_number: 15673161 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Lactococcus lactis # 1 528 1 531 535 487 47.0 1e-137 MIRISQLKLPITHTKAQLEKKIAKTLKNPGNSFTYEIKKQSLDCRHKNDKIFVYTVDVTI RDEQKLAKKVNNNNIMLTKEKPYEFPSPGETPLLHSPVIVGSGPAGLFCGWYLAKAGYCP IILERGEEAEKRQKTVENFWKNGVLDPESNVQFGEGGAGTFSDGKLNTLVKDPYGRNHEV LKRFVAAGAPEEIIYQQKPHLGTDVLVGIVEKMRHEIEEMGGKFCFRSKVTDLIFENNTL KEIEINNDKRIPAQVCVLAPGHSARDTFEMLQKRGVYMEPKSFAVGLRIEHPQEMINMDL YGEPENELLGAASYKVTHKCANGRGVYSFCMCPGGYVVNASSEPGRLAVNGMSYQARDSR NANSAMIVTVTPEDFPDKGILGGVEFQRDLEKKAWELGEGKIPVQLFGDFCKNQASEALG EVTPCMKGEYILTNVRSVLPKAVGDSIEEGVRAFGKRISGFDREDALLSGIESRTSSPVR IVRDQELTANMAGIYPCGEGAGYAGGITSAAMDGIKVAETVCRKYAAIKNN >gi|226333009|gb|ACII01000010.1| GENE 2 1654 - 2844 863 396 aa, chain - ## HITS:1 COG:CAC3590 KEGG:ns NR:ns ## COG: CAC3590 COG2081 # Protein_GI_number: 15896824 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 5 388 2 403 405 293 39.0 6e-79 MKKRQVVITGAGACGLMAAIMAARNGAAVTVLEQNEKPGKKICATGNGRCNFSNLARPDD AYRGEHPEFVEDALREFSVENTIEFFKEIGIYPLNRNGYLYPRSNQAQSVVDVLCMEAAS LGVKIKTNEQVTEIKTGTNEKNFQILTKGWHYDADALILANGSRASSVSGSDGSGYMLAE SFQHRIVPVYPALTALKCKGSSFKAWAGVRTEGEISLFTDGKFCKSEHGELQLTEYGISG IPVFQLSTYAVRAVREGHKAELRINFMPELSEEELKKLLYARKKACPYKKEKELLVGLFP EKLIKILISQKQLVSAIREFPLEVQDGMSFSQAQVCSGGVDTSQVNSQTMESKLCRGLYF AGELLDIDGTCGGYNLQWAWSSGAVAGKNAAKEEKN >gi|226333009|gb|ACII01000010.1| GENE 3 2841 - 4115 1547 424 aa, chain - ## HITS:1 COG:BS_ynbB KEGG:ns NR:ns ## COG: BS_ynbB COG4100 # Protein_GI_number: 16078807 # Func_class: P Inorganic ion transport and metabolism # Function: Cystathionine beta-lyase family protein involved in 
aluminum resistance # Organism: Bacillus subtilis # 20 422 17 421 421 442 53.0 1e-124 MKEMYAQLGISSEVYDFGHKIEESLKERFEKFDKTAEYNQMKVLLAMQKNKVSAECFGSS SGYGYDDIGRETLEKVYADTFHTEACLVRPQITCGTHALAIALFGNLRPGDELLAPAGKP YDTLEEVIGIRPSKGSLAEYGVSYRQVELLEDGTFDYDSIRKAINEKTRLVEIQRSKGYQ TRPSFSVKQIGELISFVKSIKPDVICMVDNCYGEFVDTIEPSDVGADMIVGSLIKNPGGG LAPIGGYIAGTKECVENASYRMTCPGLGSEVGATLGVNRSFFQGFFLAPMVTKGALKGAV FAANIYEKLGFPVIPDSTEPRQDIIQAVSLGTPEGLIAFCKGIQAAAPVDSYVDPEPWDM PGYDSQVIMAAGAFVQGSSIELSADGPMKPPYAVYFQGGLTWEHAKLGVLMSLQKMVDKG LVTL >gi|226333009|gb|ACII01000010.1| GENE 4 4211 - 5155 1024 314 aa, chain - ## HITS:1 COG:CAC1835 KEGG:ns NR:ns ## COG: CAC1835 COG0324 # Protein_GI_number: 15895110 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Clostridium acetobutylicum # 3 305 2 303 309 313 50.0 2e-85 MKKPLIVLTGPTAAGKTHLSIALAKAVNGEIISADSMQVYKYMDIGSAKIRPEEMQGVKH YLVDELLPQEEFHIVKFQQMAKAAMEEIYAKGKIPVLVGGTGFYIQAITKDIDFTQAEQE DGYRQELEQLAAEKGNEYLHQMLLNVDPVSAGEIHANNVKRVIRALEFYHQNQSPISAHN QEQKEHETPYNLAYFVLNVPRELLYKRIDDRIDEMLKEGLLEEVQKLKDMGYHRGMVSMQ GLGYKEILAYLDGEYPLEEAVRILKRDTRHFAKRQLTWFRREKDTIWMNKDEFDYNEDRI LDEMLKVCRDRGIL >gi|226333009|gb|ACII01000010.1| GENE 5 5175 - 7226 2000 683 aa, chain - ## HITS:1 COG:MA0522 KEGG:ns NR:ns ## COG: MA0522 COG0323 # Protein_GI_number: 20089411 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Methanosarcina acetivorans str.C2A # 3 681 7 654 656 370 34.0 1e-102 MRKIAVLDQQTIDKIAAGEVVERPSSIVKELVENAIDAGATAVTVEITDGGKKMIRITDN GGGMERDQVPLAFLRHATSKIEKVEDLEHIASLGFRGEALSSIAAVAQVELITKTPSALS GVRYVINGGVQESLEDMGAPEGTTFLVRNLFYNTPARSKFLKSDTTEGNYVSTLMEQLAL SHPEISFKYIQNKQVKLHTSGNYNVKDVIYNIYGRDITKALLEVSYENDFMKIEGFVGKP EISRGNRTFENYYINGRFVKNRIIAKGIEDAYKGFLMQHKFPFVSLHIQMEGNDLDVNVH PSKMEVRFARGTEVYDAVYETVHKALTTREMIQTVPFGKEEPIKKQQPVVKPGDVPEPFE TRRRAEIPEYRPQTVNTASRVSEHTALPRGTVTMAEQAVREQQIYQTRDPLTKAEEQLFE 
GTLSDKKNEEQPVFSETKQEIAEQKTAEYLKTAVDSEGNRKQIEQNKNVISEEIRPQQLE LFEEKLLAPESRSRHQLIGQIFDTYWLVQFEDKFFIIDQHAAHEKVYYERFVKRFREQTV ESQYLSPPLIVSLNLQEEALLKANRKYFEDFGFEIEPFGGKEYCINAVPSNLYGLTEEEL FLEMLDNLASEKDKDPLGIFASRLATMACKAAVKGNNQMSDREANALIDELLTLENPYHC PHGRPTIISMTKTELEKKFKRIV >gi|226333009|gb|ACII01000010.1| GENE 6 7233 - 9863 2991 876 aa, chain - ## HITS:1 COG:CAC1837 KEGG:ns NR:ns ## COG: CAC1837 COG0249 # Protein_GI_number: 15895112 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 1 873 1 867 869 833 49.0 0 MAMSPMMQEYCKTKEQYKDCILFYRLGDFYEMFFDDALLVTRELEITLTGKDCGLEERAP MCGVPYHAAETYINRLIERGHKVAICEQVEDPKKAKGLVKREVVRIVTPGTTLDAAALDE TKNNYLMSIVSMEEHFGCAIADITTGDCFLTEVDKPQKLLDEINKFVPAEIICNDSFYMS NIDTDDLQNRLGICVFSLDSWYFDDELCRRTLKDHFHVGSLEGLGVGDYDCGIIAAGALF LYLKETQKTALSHMTTIRPYAAEKYMLIDSSSRRNLELVETLREKQKRGSLLWVLDKTKT AMGARTLRSYVEQPLIDRDEIEQRLEALEELNKNGMLRDEIREYLGPVYDLERLISRISY KSANPRDLIAFASSLEMLPYIKQVLKEFKTPLLQKIYEDMDSLEDVTDLIKRAIVEDPPL AQKDGGIIKEGYNEDVDKFRRSRTDGKKWLSELEAKERERTGIKTMKIKYNRVFGYSLEI TNTFKDLVPDNYIRKQTLTNAERYITQELKELEDLILGAEDKLYALEYELFCDVRDAVGK EVMRIQKTAKAVAALDVFASLALVAERNHFVRPKTNTTGVIDIKNGRHPVVEQMIENDMF IANDTYLDNHKKRVSIITGPNMAGKSTYMRQTALIVLMAQIGSFVPAEKANIGIVDRIFT RVGASDDLASGQSTFMVEMTEVANILRNATARSLLILDEIGRGTSTFDGLAIAWAVIEHI SNTKLCGAKTLFATHYHELTELEGKIPGVNNYCIAVKEKGDDIVFLRKIVQGGADKSYGI QVAKLAGVPDSVIQRAKELVEELSDADITAAVKDLTSAKKKKPVYDQMDMAQMSLFDTVK DNDIIDEIKGLDMGNMTPIEAMNTLYNLQNKIKNRW >gi|226333009|gb|ACII01000010.1| GENE 7 10596 - 11099 399 167 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0744 NR:ns ## KEGG: Cphy_0744 # Name: not_defined # Def: ATP-dependent OLD family endonuclease # Organism: C.phytofermentans # Pathway: not_defined # 1 165 457 622 642 192 57.0 3e-48 MIRDGDGKDAEELASSLCRYYEARNREDMDRLPRVTRENVLILKYYSFENYFLDPKIMEK IGVIKSEDDFYEILLKKWNEYLYKLKSGQHLTEMIGHALKNTTDIREHMEEIRICLRGHN LYDIFYGRFRKNETEILKSYIEEAPRDTFKDILDAIDRFVYFENRKK >gi|226333009|gb|ACII01000010.1| GENE 8 11139 - 
12383 893 414 aa, chain - ## HITS:1 COG:all0778 KEGG:ns NR:ns ## COG: all0778 COG4826 # Protein_GI_number: 17228273 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Serine protease inhibitor # Organism: Nostoc sp. PCC 7120 # 70 411 27 371 374 141 30.0 2e-33 MIRKQLLIGILAAMTVALPAVGIYAGSAEASEIQSSNEDTVSGNKGASIDKAVTLQNSVK MTDQISAAVNGENIMFSPTSLNFALGMIAEGAEGETKEVLCNYLGTDDFASYAKEYLNKI KEYNTEDESYGYKSKLKIADAVWVDNNLTLQEEFKNSVTNGFGAEVENVDFSAAEKTCGI INSWCDKNTEGLIPKIITPDLINDTTGLCLTNSLYFESGWSGEPWNVSDTEEKFGNNEKT KYMTCAGDRYYENDKATAFDREYANGLSFVGILPVDEGDFTLEDLDIGGLLKSQPEYDEV QCKMPKLDFETTAILNDILSSLGLDNIFSSNADFSGIADKNVNVDTILQKTKLELDENGT KAAAVTAVIMECMSAVEEKEPVIKNVELTRPFAFLIYDRSNNEILFMGKVMTVS >gi|226333009|gb|ACII01000010.1| GENE 9 12579 - 13733 703 384 aa, chain - ## HITS:1 COG:TM1044 KEGG:ns NR:ns ## COG: TM1044 COG0675 # Protein_GI_number: 15643802 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Thermotoga maritima # 1 375 4 369 405 176 32.0 5e-44 MRKSFKTEINPSEEQKIKIRKTIGTCRFIYNFYLAHNKELYNNGEKFMSSNKFRVWLNNE YLPQHPEYSWIKEAYSKAVTQAVNNGQTAFTRFFNHESAFPNFKKKGRSDVKMYFVKNNP KDCRCERHRINIPSLGWVRIKEKGYIPTTKDGYVIKSGSVSIKADRFYVSVLVEIPDNKI TDDPNEGIGIDLGLKEFAIVSNGKTYKNINRSARLKKLEKQLIRKQRSLSRKYENLKKGE STQRANLQKQKRKVQKLHHKIDNIRTDYINKTIAEIVKTKPSYITIEDLNVKGMMKNRHL SKAVASQKFYEFRTKLQAKCREKGIELRVVDRWYPSSKTCHCCGAVKKDLKLSDRIFKCS CGYVEDRDFNAALNLRDVTTYEIA >gi|226333009|gb|ACII01000010.1| GENE 10 13672 - 14343 316 223 aa, chain - ## HITS:1 COG:MJ0014 KEGG:ns NR:ns ## COG: MJ0014 COG2452 # Protein_GI_number: 15668185 # Func_class: L Replication, recombination and repair # Function: Predicted site-specific integrase-resolvase # Organism: Methanococcus jannaschii # 8 207 7 206 213 138 38.0 8e-33 MNTSNITNYKPKDFAELLGVSVKTLQRWDREGILKADRTPTDRRYYTYDQYLQFKGINTE NDMRQVVIYTRVSTRNQKDDLQNQVAFLRQFCNAKGIIVDQCIEDYGSGLNYNRKEWNKL LDAVMEGKIKTIIITHKDRFIRFGYDWFEKFCMKFNTTIVAVNNEELSPQEELVQDIVSI 
LHVFSCRLYGLRKYKKQIERDEEIAKELQDGNKPVRGTEDQDP >gi|226333009|gb|ACII01000010.1| GENE 11 14985 - 18407 2142 1140 aa, chain - ## HITS:1 COG:CAC2367 KEGG:ns NR:ns ## COG: CAC2367 COG5492 # Protein_GI_number: 15895634 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 787 936 171 318 752 102 43.0 4e-21 MKKKCFKKTVSWALSIVMSATMAVTPVLADTFTDGSQLIVSEDAGEETPAATISDAEDEG MMEPEHTFEIEDNCEDSGRTGFSSENEASGFSDDSVLAAQTEEKAAVYLDPSSGNDGNTG ESAKSAVNTLSKAVELAQGGDIFLLDCVEVDSDIKLSNVTFRPGKSNMSGMLYISRGKVT LENIIINNKTPDGKSCQFSNYPIEVTGRLATLTIADGTEIGPFPGNSCIIVSHDSTVNLN GGKILGDKQNTVEYGGGIYAEHATVNVNGGTISGHSATYGGGIFSSYSNIKINGGTISDN QSIRGSGIYAINDSNVSLKGGSIIHNKSGQGAGVYLSISKLFINGESCQITQNTVTTEPS PKDMRGEGGGIYLIYSEAEIESGTINKNTAIAAAPDENGNIQGGLGGAISAKYSRVTIKG DTKIRNNSAGNRGGAIYTEGTNNDSSLLNIEGGTISGNHVNGSGAGIFAICSRGNKHNMD VNISGGTITNNYSGTGENEEENAIVLMGWDPNLTEDTGFADLHLSDSPVITGSVTLSDDN NYGPRIYVGKSLQLSDKHILVTPTYGKADLIAVEYENDSAAESFESQFYSNGMSKLVRDG KYLKWALVKPKVQVSADKEKGCPSSKIVLTAKATHVLDDKGITYSYQWYKDDQILNSQTG ETLTVSEAGTYKVEVTATSQAGVNSTETASIVIPAFEHSYSWQFDKTNHWEHCSIGNENT TQEAHTFGNWVVTKQASIGAEGEKERTCTVCGYTEKETIPSIYIPSYPVTGIKVSQDTLM LTKKDETAQLSAEVVPSYADNTRVTWKSSDESVVTVDEKGKVTAVGNGTATITVTSVSGN YTATVAVTVKIPVEIEKISIEAEKETLTKIGESTELKVKIEPENADAQKLIWKSDNEMVA AVDENGKVTAIGNGMAIITVTTEYGKNTASIIITVKILDKPVINKTKGFGRLKVRSVNQT KTSITLEWSKLDGVDGYLVYGNRCNTSTKTYKYKKLATITNGRTWTHKNLKKGTFYKYIV KAYKIVDGKKVVTDTSASIHVITQGSKYGIARSVSVTKIGNKKNVSKITLKKGKTAQITV KEIKKDKKIRHHRNLCYESSNTAVATVTPEGLIQAVGKGTCTIWVYAQNGVYAALTVTVK >gi|226333009|gb|ACII01000010.1| GENE 12 18404 - 18748 287 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577919|ref|ZP_04855191.1| ## NR: gi|253577919|ref|ZP_04855191.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 114 4 117 117 207 100.0 1e-52 MYDTEKRIELVKKRMHEYHRRQERRTVCRLSVLCTLLFLSLVGTMGMMQSQPIDITGMYG TILLHEDAGGYVLVAVISFTVAVVITALCIKFRKRGQKSQDVEDHVLKKGKVEV Prediction of potential genes in microbial genomes Time: Sat May 28 19:06:30 2011 Seq name: gi|226333008|gb|ACII01000011.1| Ruminococcus sp. 5_1_39B_FAA cont1.11, whole genome shotgun sequence Length of sequence - 25222 bp Number of predicted genes - 24, with homology - 24 Number of transcription units - 8, operones - 6 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 332 91 ## gi|253577920|ref|ZP_04855192.1| conserved hypothetical protein 2 1 Op 2 . - CDS 329 - 955 188 ## gi|253577921|ref|ZP_04855193.1| conserved hypothetical protein 3 1 Op 3 . - CDS 939 - 1307 354 ## gi|253577922|ref|ZP_04855194.1| conserved hypothetical protein 4 1 Op 4 . - CDS 1294 - 1842 491 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 1920 - 1979 7.9 - Term 2134 - 2159 -0.5 5 2 Op 1 . - CDS 2219 - 3274 1070 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 6 2 Op 2 4/0.000 - CDS 3294 - 5948 3021 ## COG2217 Cation transport ATPase 7 2 Op 3 . - CDS 6068 - 6385 471 ## COG1937 Uncharacterized protein conserved in bacteria - Prom 6568 - 6627 5.5 - Term 6746 - 6811 4.0 8 3 Op 1 . - CDS 6823 - 8169 1167 ## COG0534 Na+-driven multidrug efflux pump 9 3 Op 2 . - CDS 8264 - 9664 424 ## PROTEIN SUPPORTED gi|229233241|ref|ZP_04357664.1| SSU ribosomal protein S12P methylthiotransferase - Prom 9706 - 9765 8.1 10 4 Tu 1 . - CDS 9808 - 10707 831 ## BpOF4_05905 peptidase M14 carboxypeptidase A - Prom 10791 - 10850 8.1 - Term 10800 - 10837 6.2 11 5 Tu 1 . - CDS 10859 - 12919 2555 ## gi|253577930|ref|ZP_04855202.1| predicted protein 12 6 Op 1 . - CDS 13042 - 14292 1417 ## COG0014 Gamma-glutamyl phosphate reductase 13 6 Op 2 . - CDS 14307 - 14864 516 ## COG0212 5-formyltetrahydrofolate cyclo-ligase 14 6 Op 3 . 
- CDS 14842 - 15063 401 ## EUBREC_1649 hypothetical protein - Prom 15092 - 15151 1.9 15 6 Op 4 . - CDS 15157 - 16014 1062 ## COG0263 Glutamate 5-kinase - Prom 16035 - 16094 2.3 16 7 Op 1 17/0.000 - CDS 16119 - 16613 764 ## COG0319 Predicted metal-dependent hydrolase 17 7 Op 2 . - CDS 16610 - 17620 1134 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase - Prom 17700 - 17759 3.6 18 8 Op 1 . - CDS 17761 - 18945 643 ## Cphy_2612 putative stage IV sporulation YqfD 19 8 Op 2 . - CDS 18939 - 19223 233 ## gi|253577940|ref|ZP_04855212.1| conserved hypothetical protein 20 8 Op 3 . - CDS 19297 - 19833 364 ## DSY4699 hypothetical protein 21 8 Op 4 . - CDS 19872 - 20351 577 ## COG1576 Uncharacterized conserved protein 22 8 Op 5 . - CDS 20402 - 21493 624 ## EUBREC_1956 hypothetical protein 23 8 Op 6 1/0.000 - CDS 21507 - 24206 2462 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 24 8 Op 7 . - CDS 24249 - 25127 813 ## COG1968 Uncharacterized bacitracin resistance protein - Prom 25159 - 25218 9.2 Predicted protein(s) >gi|226333008|gb|ACII01000011.1| GENE 1 2 - 332 91 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577920|ref|ZP_04855192.1| ## NR: gi|253577920|ref|ZP_04855192.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 110 16 125 125 194 100.0 2e-48 MTYIENVFLCIASPLLIAALCMGKRQRKFFLFCLAGMGACLLSAYINTFFAALYRADTFT TTTEIAPVVEEVMKFLPLLFYLLIFEPKREQIKNTAVVIALSFATFENVC >gi|226333008|gb|ACII01000011.1| GENE 2 329 - 955 188 208 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577921|ref|ZP_04855193.1| ## NR: gi|253577921|ref|ZP_04855193.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 208 1 208 208 363 100.0 3e-99 MELISNLFQFAVTLLGFCLSGIWYLKDRKQTCFLLTCFYGCFALGSLYWTLYLLLFSETP QVFYVSEFGWIASVIFLYLLQYTLSSAEERDFSTRKSLIAPLIGIPLCVFYCTFGDVLSN LLWCSMMIVVSYHSIRGLIYALIQTGTARKIRYFHIGVLCYVAVEYALWLSGCLWPGYSI SNPYCWFDLLLTGCLFALLPATGKAVQT >gi|226333008|gb|ACII01000011.1| GENE 3 939 - 1307 354 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577922|ref|ZP_04855194.1| ## NR: gi|253577922|ref|ZP_04855194.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 2 122 1 121 121 221 99.0 1e-56 MLKNEERIVEVKRRIEEKERQQRLRHRRIVSAFCIAACLAVIVGVSFVMPGIVGQITPGT SSGFETAATILGGGTALGYMVIGLLAFILGVCVTILCFRIRQLNKEEQTEEQKGDNGDGA DQ >gi|226333008|gb|ACII01000011.1| GENE 4 1294 - 1842 491 182 aa, chain - ## HITS:1 COG:Cgl1096 KEGG:ns NR:ns ## COG: Cgl1096 COG1595 # Protein_GI_number: 19552346 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Corynebacterium glutamicum # 42 181 63 205 213 69 29.0 3e-12 MATDEELYGQYLSGDETGLELLIKKYGDPLTLYIDGYLHDVHEAEDLMMETFSWLFTKKP RIRDGCFKAYLYKAARHMALRHKSRRRIIFSLDDLTREPEAQTLVEEVIRTKERNQILHL CMDELNSDYREALYLTYFEGMSYQQAAEVMGKSVKQITNMVYRGKERLRGLLQREGITDV EK >gi|226333008|gb|ACII01000011.1| GENE 5 2219 - 3274 1070 351 aa, chain - ## HITS:1 COG:FN1711 KEGG:ns NR:ns ## COG: FN1711 COG0275 # Protein_GI_number: 19705032 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Fusobacterium nucleatum # 53 310 9 257 314 161 41.0 2e-39 MENQNQTPHKRRVRYKGKYPKKFEEKYKELQPEKYKDTIEHVIQKGNTPAGMHISIMVKE ILDFLNIQPGETGFDATLGYGGHTKAMLQCLNGKGHIYATDVDPEESAKTKKRLAEQGFG EELLTVKLQNFCTIDEIAKEVGGFDFILADLGVSSMQIDNPKRGFSFKTDGPLDLRLNQE TGISAAQRLDTISREELAGMLCENSDEPYCEELAKAITDEIRRGNHIDTTTKLRQVIEKT LDFLPEKEKKETIKKTCQRTFQALRIDVNHEFEVLYEFMEKLPDALRPGGRVAILTFHSG EDKLVKKALKAGYKDGIYSDYAKDVIRPSAKECTQNPRARSTKMRWAIKAE >gi|226333008|gb|ACII01000011.1| GENE 6 3294 - 5948 3021 884 aa, 
chain - ## HITS:1 COG:CAC3655 KEGG:ns NR:ns ## COG: CAC3655 COG2217 # Protein_GI_number: 15896888 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 6 748 78 813 818 627 48.0 1e-179 MEQYNVTGMSCAACSSRVEKAVSKVPGVTSCSVSLLTNSMGVEGTAGAGAVIKAVQDAGY GASLKGASGEQISASAAEEALEDHETPALKRRLIASVGFLLVLMYFSMGHMMWGWPLPAW FNDNHIAMGLMQLLLAGIIMVINQKFFISGFKSLWHRAPNMDTLVALGSMASFLWSVYVL FAMTRAQVDGDSAAVMNYMMEFYFESAAMILTLITVGKMLEARSKGKTTDALKGLMKLAP KTAVVVRDGQEATVPIEQVRKGDVFVVHPGENIPVDGVVLGGNSAVNEAALTGESIPVDK NPGDAVSAATVNQSGFIRCEATRVGEDTTLSQIIKMVSDAAATKAPIAKIADRVSGVFVP AVISIAVVTTIVWLLAGKEFGYALARGISVLVISCPCALGLATPVAIMVGNGMGAKNGIL FKTAVSLEEAGKIQMVALDKTGTITKGEPQVTDMIPAEGISEEELLGCAYALEKKSEHPL AKAIIARAEEKKTVLQEVSDFQALPGNGLRAVLNSDVLTGGNMKFISNETSVSSELISKA QKLAEEGKTPLLFAKGGKFLGMIAVADVIKEDSPQAIKELQNMGIRVVMLTGDNERTAKA IGAQAGVDDVIAGVLPDGKESVIRSLKEQGKVAMVGDGINDAPALTRADIGIAIGAGTDV AIDAADVVLMKSRLSDVPAAIRLSRATLRNIHENLFWAFFYNVIGIPLAAGVWIPIFGWT LNPMFGAAAMSLSSFCVVTNALRLNLFKVHDASRDKKIKQNVEEIHYISANAEMKNVTEN KSLKAENPDFCNSEIHDPKDQENIKENKENKEMTTITVNVTGMMCGHCEAHVTKAVKEAF GVEDVVSSHEKGTTVIHAPEKLDEDKIREVIKEAGYEVTGITQE >gi|226333008|gb|ACII01000011.1| GENE 7 6068 - 6385 471 105 aa, chain - ## HITS:1 COG:lin1968 KEGG:ns NR:ns ## COG: lin1968 COG1937 # Protein_GI_number: 16801034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 19 105 10 96 97 72 42.0 2e-13 MADTQEKKVCPCCTKHTMRSEEERKKLINRLKRVEGQIRGIIGMLENDAYCNDILIQSAA VNAAVNAFNKELLASHIRNCVARDIRRGKDDTIDELVATLQKLMK >gi|226333008|gb|ACII01000011.1| GENE 8 6823 - 8169 1167 448 aa, chain - ## HITS:1 COG:FN0667 KEGG:ns NR:ns ## COG: FN0667 COG0534 # Protein_GI_number: 19704002 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 409 3 411 426 234 34.0 4e-61 MKQNKYEIDMCNGTIMDKLISFSLPLMLSGILQLMFNAVDIIVVGRFSGSQALAAVGSTT 
ALINVFTNLFIGISLGANVLAARFYAAGKDREMSDTVHTAVTLALVSGIVMAFVGLIFSR WALELMGTPDDVIGQSALYMKIYFLGMPFFMLYNYGAAILRAVGDTKRPLIFLVISGVVN AVLNLILVIMFHMDVAGVAIATVISQLISCILVLRCLRTSKTSYQLHFGKLRINTVYLKQ IFQVGIPAGIQSTVINLSNALLQSSVNSFGSTAMAGYTAANNIFGFLYVAVNSVTQACMS FTSQNYGVHKFKRMDKVLVDCLIISVVTSFSLGCGAYFFGSEILKIYTADPEVIRCGLEI LSYTTVPYFLCGIMDLFPGALRGMGHSGVPMILSVIGTVGTRIVWIFGIFPHHRSLAVLF ISYPASWMLTIIMQVICFYFVRRKVHRV >gi|226333008|gb|ACII01000011.1| GENE 9 8264 - 9664 424 466 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229233241|ref|ZP_04357664.1| SSU ribosomal protein S12P methylthiotransferase [Chitinophaga pinensis DSM 2588] # 22 432 1 396 435 167 27 5e-41 MNNNTQESKIQLEYINKCTELVKEKYEKAPAFFIQNAGCQMNSLQTDTVAGIVKRMGYTE VSREEDADVVIYNTCTVRENANLKIYGHLGHLKSIKKKNPELKIILFGCMMQEPEVIEKI HKEYSFVDLVFGTHNFHKFPELFYRSLNTEGQIIDVWKESDEIVEGMPSDRKYSFKTGVN IMFGCNNFCSYCIVPYVRGREKSREPEAIIEEIKGLVADGVTEVMLLGQNVNSYGKTLEH PVTFAQLLKQVEAIEGLKRIRFMTSHPKDLSDELIRTMAESKKVCHHLHLPMQSGSSRIL KIMNRRYDKEKYLELVAKIRNAVPDISLTTDIIVGFPGETEEDFQDTLDVVEKCDFDSAF TFIYSKRSGTPAAKMENQVPEDVVKDRFDRLLALVQEKGRKASSRFEGTVQEILVEEESR EKGIFTGRTEYNLLVHFPGCQDLIGKYVKVKLDTCKGFYYFGSLAE >gi|226333008|gb|ACII01000011.1| GENE 10 9808 - 10707 831 299 aa, chain - ## HITS:1 COG:no KEGG:BpOF4_05905 NR:ns ## KEGG: BpOF4_05905 # Name: not_defined # Def: peptidase M14 carboxypeptidase A # Organism: B.pseudofirmus # Pathway: not_defined # 1 230 32 271 524 78 23.0 3e-13 MSTEQMGTYEKIYFALWELGQRYGNFVQFRVIGRSHDDRMIPMLEIGKGDTCIICLSGVE SGDRNLPEYLLSIAKDYCRSYESNWTIGESYEVRKLLDKVRICMIPMLNPDSYEICEYGY GAIHNPIHRQMLKMQDRPVEEYECNARGIDLRRNFPTNYYQRKRVNQEPASENETRALIS IFQEYSSLGLLTFSYSRGKIVYCRQEKGFAYNQKNYRLARHLQKCSGYRLEKGIAGGARV KKAGAKPEMGSPEQFYAEVIRQPSLAIEIPEYRKDDMEELRLIPLEYLYSLNSGILANA >gi|226333008|gb|ACII01000011.1| GENE 11 10859 - 12919 2555 686 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577930|ref|ZP_04855202.1| ## NR: gi|253577930|ref|ZP_04855202.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 686 1 686 686 1043 100.0 0 MGGHTMRKKQLFALLMAGALSVGMAPTATFAADTAAETSAEGELEGELDSADTEEPSEDP AETPEEPAETPEDPAETPAETTADPTEAPAAEETDEQAAEAGSAIVIGGDSFTSLEEAFA VVPDCEDMINGEPTYVKLKGTIEVNNTIKVPEKKNIMLVAAEDNTTIKRAAGFTESMFTV NGGNLQMAGGSVTDSDGNAIGSGSLTVDGTGDNVTGSIVEVASGNYALIDGTTLTGNTTT GNGGAVNNAAGANVYLLGGTITANSAAAGGAIYSEGTVNIRGTVSVTGNTVTNSFEVASN LVLAKDGVINVSGAVTGSAIGVAVQEANAGRTVVKLGDAVTDVKLADVLSQITYEGDSSL KIGEDGTLVSTTEPSPTPTPAEEKLKVTGKECKWSGSGTVKIKFQSNVKGTYYIDWVKRG EKAPTIDTSRVGAPIEADTNVTAKVTDLPDYDVDIYVCVISDKDKSNYGSVMFQPDSKER PVTPTPSHTPVVPDVKESVVQGFEKALVFYPNTFYDFKVIGAGTQNNNPGEGDVRWVPVG WSMSSNPSTWNTSWKIGAKSGIYTDAEKAYTIYIKYAKQVYSGNDWQETDASEVLPYQFK AAPLTQATTTPGANGTNGDGTGSGDGTTDGGTTDVTPTTYADGTNGTAKSAVSTGDESPV GTMLALAAASVLAGGYVLIRRRKKEM >gi|226333008|gb|ACII01000011.1| GENE 12 13042 - 14292 1417 416 aa, chain - ## HITS:1 COG:lin1227 KEGG:ns NR:ns ## COG: lin1227 COG0014 # Protein_GI_number: 16800296 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: Listeria innocua # 1 415 1 414 415 467 57.0 1e-131 MNEMLNRLGMNAKAAETEMRNLSTNKKNEVLLAVADKLVKDAQALINANRLDVETGKRNH MPEGLIDRLLLTESRIEGMAEGLRQVAALDDPVGEVTGMKKRPNGLLIGQKRVPLGVIGI IYEARPNVTADAFALCFKTGNVVILKGGSDAIHSNEAIVNCIRETLNEQGVTEDAIQLIS DTSRETAAEFMKMNQYVDVLIPRGGRGLIKAVVEQSTIPVIETGTGNCHIYVDETADLEM AADIIMNAKTQRVGVCNACESVLVHKDVKDALLPVLAKRLQEKHVEIRADEAAYALIPGA VHATEEDWGKEYLDYILSIKVVSSVEEAIAHINKYNTKHSEAIITNNYEHAQKFLDEVDA AAVYVNASTRFTDGFEFGFGAEIGISTQKLHARGPMGLLALTTTKYIIYGNGQVRP >gi|226333008|gb|ACII01000011.1| GENE 13 14307 - 14864 516 185 aa, chain - ## HITS:1 COG:BH1417 KEGG:ns NR:ns ## COG: BH1417 COG0212 # Protein_GI_number: 15613980 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Bacillus halodurans # 1 180 1 184 186 118 36.0 5e-27 MESKKDIRKKIFAERKLHTDEQIEAMSRTITDKVTALPAFKNADRILVYADYNHEVVTEY LIKEAWKAGKEVAVPKVVGKDMVFYKLTDFARLEPGYFGIPEPVSGEIVNWSKALMIMPG 
VAFDRANHRVGYGGGFYDRYLEKHPQLERVAIAFSFQMLPEVPTEPTDICPQIIVTEEEI CYLTD >gi|226333008|gb|ACII01000011.1| GENE 14 14842 - 15063 401 73 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1649 NR:ns ## KEGG: EUBREC_1649 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 73 1 73 74 64 60.0 1e-09 MEVEKIDRINALAHKAKSVGLTEEEKKERELLRKEYIASVRMNLRAQLDNVDVQQEDGTI INLGEKYGIKKRH >gi|226333008|gb|ACII01000011.1| GENE 15 15157 - 16014 1062 285 aa, chain - ## HITS:1 COG:lin1228 KEGG:ns NR:ns ## COG: lin1228 COG0263 # Protein_GI_number: 16800297 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Listeria innocua # 4 273 2 269 276 241 48.0 8e-64 MNYRERLKDKKRIVIKIGSSSLTHSETGRLNLRKLEVLARELSDLRNQGKDVILVSSGAI ATGVAALGMHEKPTELKGKQACAAVGQARLMMIYQKLFSEYNQLSAQILMTKNTMVNNVN RKNAQNTFNELLSLGVIPIVNENDSISTYELQNLEKFGDNDTLSAMVAALVRADLLILLS DIDGLFTDDPNTNPDAKFIDVVENLDDNLLNMGKGTSGSKVGTGGMATKLTAAQIASAAG VDMVIANGADFHIIHKITEGRRYGTLFVSQSKEEVYLIDIIDRLL >gi|226333008|gb|ACII01000011.1| GENE 16 16119 - 16613 764 164 aa, chain - ## HITS:1 COG:MYPU_3780 KEGG:ns NR:ns ## COG: MYPU_3780 COG0319 # Protein_GI_number: 15828849 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Mycoplasma pulmonis # 17 164 16 149 149 91 37.0 5e-19 MNMQIDYETERELGIDYETLAVKVADKVLEMEKCPYDAQVNLVLTDNEEIERVNTEFRDI ARPTDVLSFPMIPFETPAGYDIVEEDESYFDLDTDELLLGDIMISVDKVFAQAEEYGHSV TREFCFLVAHSMLHLLGYDHMTPGEATVMEAKQAKALEELGITR >gi|226333008|gb|ACII01000011.1| GENE 17 16610 - 17620 1134 336 aa, chain - ## HITS:1 COG:BS_phoH KEGG:ns NR:ns ## COG: BS_phoH COG1702 # Protein_GI_number: 16079588 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Bacillus subtilis # 19 315 20 316 319 322 53.0 7e-88 MAGNIIELHTEVPAELEANVFGQFDEHLKLIERTLNVTVISRDGILKILGNEQNAASAKK LIEELTVLAKRGNTITKQNVNYALSLAMEQRNEVLTEIDKDFICNTIQGRPIKPKTLGQK 
DYVEQIRKKMIVFGVGPAGTGKTYLAMAMAVTAFRNEEVSRIILTRPAIEAGEKLGFLPG DLQSKVDPYLRPLYDALYQIMGADSFAKNMERGLIEVAPLAYMRGRTLDNAFIILDEAQN TTPAQMKMFLTRIGFGSKVIITGDASQKDLPRDTTSGLDVAMKVLKKIDDIGFCQLTSKD VVRHPLVQQIVQAYDAYEKKQKPERPARRRNGVNER >gi|226333008|gb|ACII01000011.1| GENE 18 17761 - 18945 643 394 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2612 NR:ns ## KEGG: Cphy_2612 # Name: not_defined # Def: putative stage IV sporulation YqfD # Organism: C.phytofermentans # Pathway: not_defined # 15 393 1 394 412 192 29.0 2e-47 MLKFLKLIYGYVIILVRGENLERFLNLCKSRKVYMEKIRYREDGQLMAQMQAADFFRLRP LRNKTGVHIQIIQRRGMPFFFLRNKKRKAFFTGMILGGILIFFLTGRIWNIHIEGNVRNS TGEILDFLDKQDINHGMSKKKINCSEVAAAVRKNFPEITWVSARIEGTRLILNIQEGIIP PKINSNTSPCNLLADKDGVITDMIVRRGIPVKKPGDSCKKGELLVSGELHIMNDSQEVLR KEYVHADADIFISRQISYYQEFPLKYRTEIPAGKKKKEMYFRFGHWYLGLYPTLHTGQKR ITEEIPLRITENFILPVWIGWTETTDYTIKEETYTRKEAEKEAVRRFHQYEKKLLQNGIK ISENHITTRLTESLCITGGTLLIIEQTGEKSRIT >gi|226333008|gb|ACII01000011.1| GENE 19 18939 - 19223 233 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577940|ref|ZP_04855212.1| ## NR: gi|253577940|ref|ZP_04855212.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 94 11 104 104 186 100.0 3e-46 MKKAGQWKNRMVLALELPPDLAYNETIVTLTGQTEAVIENYKSVFKYTPSEIVIQSFCGK VTISGKNLEIRWYNSSAMKVTGSIYDIKAGDSEC >gi|226333008|gb|ACII01000011.1| GENE 20 19297 - 19833 364 178 aa, chain - ## HITS:1 COG:no KEGG:DSY4699 NR:ns ## KEGG: DSY4699 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 21 177 23 182 182 89 29.0 4e-17 MNIEKFEEFISQYPIFEYRILDTKDITIQERVRTICQEECERYDSTWACPPAVGTLQECE KKIKSYPQAVFFSSVAEVSDIMNMEEMLSTRRAHEDLTTEVGKYLNTEGYETYILSTESC DICEKCAYKEGKPCRYPDRMHPCLESHGVVVNEIVERESMEYNLGGNTILWFSMVLFR >gi|226333008|gb|ACII01000011.1| GENE 21 19872 - 20351 577 159 aa, chain - ## HITS:1 COG:SP2238 KEGG:ns NR:ns ## COG: SP2238 COG1576 # Protein_GI_number: 15902041 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 1 159 1 159 159 176 54.0 2e-44 MEIRIVTVGKIKEKYLCDGIAEYAKRLSRYCRLTFCQVADEKTPDKASEALNTQIKNTEG ERLMKHIREQDYVIALAIDGKTPDSVELSQKIEKLGVSGISSIAFVIGGSLGLSESVLKR ADYKLSFSRMTFPHQLMQMILLEQIYRSYRIMNHEPYHK >gi|226333008|gb|ACII01000011.1| GENE 22 20402 - 21493 624 363 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1956 NR:ns ## KEGG: EUBREC_1956 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 93 361 157 432 433 129 32.0 2e-28 MKCPKCHKEVEKGSLYCPYCLAEIPWVREFSTVETLMKKEQQNRPSEKKQKTKIIKYFKH PKRRKLKFSRKQLLCLLLCAATLLGVFCYRQLNTFSALYSRAKKQYAQQNYEEAQRIAEN ALDKNPKNEAANLLLAKSMEKSGDKRSALLVLRPFIQNKTAGTGIYKEYVKLLTQEGKTN EVRLILKSADREVQNACAEYICETPVSNPAPGTYTTTQTLKLEGNCQKIYYTLDGSTPTR KSKVYTEPIILREGTTELKAFGVNDKNIESDVISRKYVIVLNAPKAPKVTPKSGDYNKKT EIKITVPDGCKAYYAFDSEPDLNSTVYEQPISMPVGYHRLNVILVAANGKTSKMTAMEYY LQY >gi|226333008|gb|ACII01000011.1| GENE 23 21507 - 24206 2462 899 aa, chain - ## HITS:1 COG:CAC1812 KEGG:ns NR:ns ## COG: CAC1812 COG1674 # Protein_GI_number: 15895088 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related 
proteins # Organism: Clostridium acetobutylicum # 418 892 298 765 765 552 60.0 1e-156 MAASKSKNTGHSGRNTKKKTAMEQEVGRFQTEIILFVILAACVILMASNFGAGGFVGDAI SNFCFGLMGLMAYLFPIVFFLESAFILINKTNRLAYKKLAAAFVMFLFLCGAAQLLTDGY MQSTTLLDYYALSADYKTGGGLIGGAICISVTSAFGTIGGYVIIILAFVVCMIIITQRSL LDFVTRLVMKVCDFVKDRHMRYMQGQPERERYREERARERQLIREQRREERVCRLEEEIG DDEFSDDDFLMDPSEAKKMKKGGFLEGTKLAGYSEQKPKKKGKLLKKAADTDKAAAETVE AAEDTENTGIKSSDDADNKPYIDAPMNFEIHRADTSPDTYEADNESQQQEDLPFEDAVTD RKKSVVSEPEEDLPPFQDEEAVRPPSKNPKSSEKEIQSGIVNIQHEITRQEAAVKKEYKF PALNLLKKGSSKAQGDSDAYLRKTAKKLQEVLHNFGVNVTVTNVSCGPTVTRYELQPEMG VKVSKIVGLADDIKLNLATPDIRIEAPIPGKAAVGIEVPNKENSTVMLRDLLQSEEFQKA KSKLSFAVGKDIAGKTVVADIAKMPHLLIAGATGSGKSVCINTLIISILYKANPDEVKLI MIDPKVVELSVYNGIPHLFIPVVTDPKKAAGALNWAVQEMTNRYNTFAEYGVRNLDEYNR KAEQIKAAGAEEEPVKMPQIVIIVDELADLMMVAPGEVEDAICRLAQLARAAGIHLIIAT QRPSVNVITGLIKANMPSRIAFSVSSGVDSRTILDMNGAEKLLGKGDMLFYPQGYQKPAR LQGAFVSDDEVSAVVEFLADKNPGVQYNQQIEQQVNSPVTTGMSGDERDIHFEEAGKFII EKEKASIGMLQRMFKIGFNRAARIMDQLCDAGVVGPEEGTKPRKVLMSMEEFQNYLENQ >gi|226333008|gb|ACII01000011.1| GENE 24 24249 - 25127 813 292 aa, chain - ## HITS:1 COG:BH1521 KEGG:ns NR:ns ## COG: BH1521 COG1968 # Protein_GI_number: 15614084 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Bacillus halodurans # 12 286 12 266 278 92 28.0 1e-18 MIVVYLIFLAVIQAVTEFLPVSSLGHLCILEQYMGIGHETGLLSETMLHLGSAAAIIFLF RKDLKKIGLGLLGMFMDMIGNLNLYIHNKKTGESLGYARIVTGTYRKTGALLVISMIPTA LLGLTGKRLALLAADSQIVPGIGFLLTGVFLLVTDLNKSGGKKGPREASYDSAMWIGICQ GIAVFPGISRMGFTLCAALLCGYNRKFAVRFSVFMSLPAIIGAFFTEIGNFGASEMTAGL GFTYVFAMIIAGFAGCLVIRNTIAMTQNIKLRYFAYYSFIVGIITLALNFAL Prediction of potential genes in microbial genomes Time: Sat May 28 19:07:51 2011 Seq name: gi|226333007|gb|ACII01000012.1| Ruminococcus sp. 
5_1_39B_FAA cont1.12, whole genome shotgun sequence Length of sequence - 21588 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 10, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 895 609 ## COG1295 Predicted membrane protein - Prom 991 - 1050 7.2 - Term 1035 - 1097 -0.6 2 2 Tu 1 . - CDS 1127 - 2188 1190 ## COG2017 Galactose mutarotase and related enzymes 3 3 Tu 1 . - CDS 2541 - 4031 1846 ## COG0498 Threonine synthase - Prom 4139 - 4198 7.6 - Term 4201 - 4264 16.5 4 4 Op 1 . - CDS 4315 - 5151 723 ## COG3773 Cell wall hydrolyses involved in spore germination - Prom 5215 - 5274 7.5 - Term 5288 - 5325 -0.8 5 4 Op 2 1/0.000 - CDS 5348 - 7774 1982 ## COG0500 SAM-dependent methyltransferases - Prom 7902 - 7961 7.5 - Term 8263 - 8309 10.1 6 5 Op 1 11/0.000 - CDS 8323 - 9063 569 ## COG1180 Pyruvate-formate lyase-activating enzyme 7 5 Op 2 . - CDS 9082 - 11337 2803 ## COG1882 Pyruvate-formate lyase - Prom 11442 - 11501 6.9 - Term 11582 - 11634 5.7 8 6 Tu 1 . - CDS 11743 - 13395 1778 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases - Prom 13461 - 13520 5.3 - TRNA 13850 - 13923 70.5 # Pro GGG 0 0 - Term 13936 - 13988 -0.2 9 7 Tu 1 . - CDS 14088 - 14624 451 ## Aboo_0140 PKD domain containing protein - Prom 14673 - 14732 6.5 10 8 Tu 1 . - CDS 14749 - 15990 1380 ## COG4198 Uncharacterized conserved protein - Prom 16074 - 16133 8.3 + Prom 16130 - 16189 10.4 11 9 Op 1 4/0.000 + CDS 16249 - 17478 1570 ## COG1171 Threonine dehydratase + Prom 17500 - 17559 3.8 12 9 Op 2 . + CDS 17624 - 18934 1557 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases + Prom 19061 - 19120 7.0 13 10 Op 1 8/0.000 + CDS 19140 - 20333 1319 ## COG0078 Ornithine carbamoyltransferase + Term 20374 - 20419 4.1 + Prom 20375 - 20434 5.1 14 10 Op 2 . 
+ CDS 20535 - 21479 1151 ## COG0549 Carbamate kinase + Term 21486 - 21525 -0.2 Predicted protein(s) >gi|226333007|gb|ACII01000012.1| GENE 1 1 - 895 609 298 aa, chain - ## HITS:1 COG:lin1818 KEGG:ns NR:ns ## COG: lin1818 COG1295 # Protein_GI_number: 16800885 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 20 269 24 274 289 140 32.0 3e-33 MKKKNKTRSVTGLLFGFMERVSGDHVGAYATQAAYFLIMSFIPCILFLTTLVRYTPLTYN VVRDAIVAFIPENLQSFVLGIVVDVYKRSTAIVPLSAIVALWSAGKGMQSIINGLNTIYH VKETRNWLLTRIYAVFYTLSLVIAVIVSLLMLVLGNEIQKVVSRYIPFLGRVLGKILGAR ALLVFGVLFLVFLMLYKVLPNRKATFKSQIPGALIIAVAWSIFSYGFSFYFEIFPGFSNM YGNLATIIMVMVWLYICMNLLLYGAEINAYFEKEFRIAHKSVQELLAHEKEKKQQENE >gi|226333007|gb|ACII01000012.1| GENE 2 1127 - 2188 1190 353 aa, chain - ## HITS:1 COG:CC1418 KEGG:ns NR:ns ## COG: CC1418 COG2017 # Protein_GI_number: 16125667 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Caulobacter vibrioides # 3 353 24 378 378 246 41.0 4e-65 MSVSERKFGVLQTGETVKIFHLENKSGAYAEVTDFGAILVKVCVPDKDGTLTDVVLGYDD LASYEVNGCFFGSTIGRNGNRIGGAKFSVNGKEVVLAQNENDNNLHSGPDGFEKKLWKVA EISDDKNSVIFNRISPDGENGFPGEFDVSVKYEFTEDNELRIHYQGICDEPTVANMTNHS YFNLNGEGSGTAMDQYLTIHAKYYTPVADSHSIPTGVYEEVAGTPMDFRTAKQIGQDIEA EFEQLKFTGGYDHNYVTDNYAKGNRRLIATAYSDKTGIAMDVTSDCPCVQFYAANFVENE HGKNGHTYNKRDAFCLETQVEPNAVNVEDFHSPILNAGEQYDSVTAYHFYIRK >gi|226333007|gb|ACII01000012.1| GENE 3 2541 - 4031 1846 496 aa, chain - ## HITS:1 COG:CAC0999 KEGG:ns NR:ns ## COG: CAC0999 COG0498 # Protein_GI_number: 15894286 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Clostridium acetobutylicum # 2 494 3 495 496 587 60.0 1e-167 MQVLYKSTRGKEETVTASMAILKGLSEDGGLFVPTEIPKLDVPMDELAKMSYQETAYEVM KCFLTDFTEEELKNCINNAYDEKFDTKEIAPLHEADGAYFLELYHGATIAFKDMALSILP HLMTTAAKKNQVKNEIVILTATSGDTGKAAMAGFADVPGTKIIVFYPKHGVSPIQEKQMV TQKGNNTYVVGITGNFDDAQTAVKKMFNDKELEAELDAAGYQFSSANSINIGRLVPQIVY 
YVYAYASLVRQSKIKDGQEINVVVPTGNFGNILAAYYAKQMGLPVHKLICASNDNKVLYD FFRTGTYDRKRDFILTTSPSMDILISSNLERLIYRIAGGDAKKCAELMQSLTAGGEYTIT EDMKAQLADFYGNYCTEDETAKTIAEIFKDSHYVIDTHTAVAAGVYDKYVKDTNDTTPTV IASTASPYKFTRSVMEALGADTDGKDDFALADELSALSGVKLPQAVETIRTAPVLHNRVV DAPDMPKAVKDILGIR >gi|226333007|gb|ACII01000012.1| GENE 4 4315 - 5151 723 278 aa, chain - ## HITS:1 COG:CAC3081 KEGG:ns NR:ns ## COG: CAC3081 COG3773 # Protein_GI_number: 15896332 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall hydrolyses involved in spore germination # Organism: Clostridium acetobutylicum # 141 270 71 194 194 63 34.0 3e-10 MKVQKKESKQSGNPARSLILVTVCTCTAIAFAAGGRIADFNGKNKVYAASENREKTFSSD ADETDVLPTGIAGVVSGMNDTASPKKSVTRIGTSCEQVMVGQRIKVIEDNMAEFDVSSSM ESSINELDARTAALAESARMMSDEDYDTLLHIVEAEAGTEDVKGRILVANVIMNRIKNKE FPDTVTEVVWQNTNGVPQFSPTYDGRINEVTVTDETREAVRQALEGVDYSEGALFFIQKS EAESQNVSWFEKDLKRLFKYGVHEFYTYPDSAESSTGS >gi|226333007|gb|ACII01000012.1| GENE 5 5348 - 7774 1982 808 aa, chain - ## HITS:1 COG:MA3459 KEGG:ns NR:ns ## COG: MA3459 COG0500 # Protein_GI_number: 20092272 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 2 223 30 249 249 161 37.0 6e-39 MQKGAWLEVLKGQFPEKAKDEIKILDIGTGPGFFPVILAEAGYKVTAVDYTQEMLDTAKR NAGNLCERISFYKMDAQNLEFEDGVFDVVISRNLTWNLKDPKRAYEEWCRVLKLGGKLLN FDANWYGYLYDEEKRLSYEEDRKSVESEHLDDHYLCTDIDRMEKIALQMPLSSINRPSWD RKFLKENGFESVAVDTGIWQRVWSQEEKLNYHSTPMFMISAVKEEKNVWSENDGMGDSDS GYDRKRDLEDAMLCAAPGMKKSGFLRLGGGEFSLPYTVICGSHPGKTVLITAAVHGGEYV GIQAAVELADKLKPEKIHGRVILVKTVCRKEFEERSGSVCPEDDKNLNRVFPGNPQGTRM DRLAYEVVQKLHSAADYYIDLHSGDDYEQLTPYIYYAGCADEDVVQMSRKMAEQADVPYM VKSNVASGGSYNYAAACGIPSVLIERGQMGSWSPEEVHSTRKDVRNILCALGVYDGMRSY SNYYPMEIEDVRYQSASVSGLWYPAKKPGDIIKVGEYLGCVKDYEGNILETSLSDLNGVV LYQAGSLQVIKDGPMITYGSFSRRKDERKEKITNYWAKRSDSFMEQRRAELHSDMADKWL KEIGTFLPDGKLRILDVGCGAGFFSILLAKLGHEVTGIDLTPDMIIHSRELAKEENASCT 
FEVMDAENPDFPDGTFDVIVSRNLTWTLPDAARAYKEWIRVLKTGGILINADANYGADDF SDTADLPANHAHFTVGDAMMQECEEIKRQLPISSYVRPAWDLETLGKLGINRFSIDLGIS SRIYTKKDEFYNPTPMFLICGEKNKCNN >gi|226333007|gb|ACII01000012.1| GENE 6 8323 - 9063 569 246 aa, chain - ## HITS:1 COG:SP1976 KEGG:ns NR:ns ## COG: SP1976 COG1180 # Protein_GI_number: 15901799 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Streptococcus pneumoniae TIGR4 # 1 243 6 250 264 309 56.0 3e-84 MDNKQIKGYVHSLESFGSVDGPGVRYVIFLSGCAMRCQFCHNPDTWKMKQGELYTADELL KKALRYKGYWGSKGGITVSGGEPLLQMDFLTEFFKKAKAEGVHTNLDTSGNPFTDQEPWH SGWLELMKYTDLVMLDIKQIDEQEHIKLTGHSNKNILAMARELSDMKKPVWIRHVLVPGG SDKDEYLHRLADFIHTLSNVERVEVLPYHTLGKFKWENLGLSYPLEGVNPPTQERIDNAR KILGAI >gi|226333007|gb|ACII01000012.1| GENE 7 9082 - 11337 2803 751 aa, chain - ## HITS:1 COG:lin1443 KEGG:ns NR:ns ## COG: lin1443 COG1882 # Protein_GI_number: 16800511 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Listeria innocua # 5 751 3 743 743 984 63.0 0 MKNFEQWNGFKGNRWKEKIDVRNFIGMNYTPYDGDASFLEGPTEATNKLWGKLQALQKEE RAKGGVLDMETEVVTSLTAYGPGYIDEETKDLEKVVGLQTDKPLKRAFMPYGGIKMAEQA CETYGYKVSDKIKDVFHNYEFKTHNQGVFDIYTPEMKVARHNKILTGLPDTYGRGRIVGD YRRVALYGIDALIEGKQKDFAACDRQGMRRYDFQLREEIADQIRALKGMKVMAESYGYDI SKPAKDAREAFQWLYFGYLAAIKTQNGAAMSVGRISTFLDIYIERDLQNGTLTEKEAQEL VDHMVMKFRMVKFARIPSYNQLFSGDPVWATLEVAGMGQDGRSMVTKNDYRFLHTLEDMG PAPEPNLTVLYSSRLPENFKKYAANISVTTSSVQYENDDVMRPVWGDDYSICCCVSATET GKEMQFFGARANLAKCLLYAINGGVDEKTKQQVGPNYQPITSEYLDYDEVIAKYDKMMDW LAHLYVGTLNMIHYMHDKYYYEAAEMALIDTKVERSFATGIAGFSHVVDSLSAIKYAKVK AIRDEDGITTDFQVEGDFPRYGNDDDRADEIATTLLSTFLEKLKHIHTYRDSKPTTSILT ITSNVVYGKATGSLPDGRKAGEPLAPGANPSYGAEQNGLLASLNSVAKLDYEDALDGISN TQTINPDALGHSDDERSENLVHVLDGYFDQGAHHLNVNVFGKEKLIDAMEHPEKEEYANF TIRVSGYAVKFIDLTREQQLDVIARTCHDRM >gi|226333007|gb|ACII01000012.1| GENE 8 11743 - 13395 1778 550 aa, chain - ## HITS:1 COG:CAC0273 KEGG:ns NR:ns ## COG: CAC0273 COG0119 # Protein_GI_number: 
15893565 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Clostridium acetobutylicum # 6 548 3 553 558 579 53.0 1e-165 MNPSKYQRQYFMPPVKCMKWAEKEYVDKAPVWCSVDLRDGNQALVIPMSLEQKIEFFKLL VKIGFKEIEVGFPAASETEYEFLRTLIEQNLIPQDVTIQVLTQAREHIIRKTFEAVKGAP KAIVHVYNSTSVAQREQVFKKSKEEILKIAVDGAALLKKLADETEGNFQFEYSPESFTGT EPEYALEVCNAVLDVWQPTADNKCIINLPVTVQHSMPHVYASQVEYMCENLKYRENVIVS LHPHNDRGCGVADSEMGLLAGADRIEGTLFGNGERTGNVDIVTLGMNMYSQGVNPKLDFS DMPHICEIYEECTGMKVGERSPYSGALVFAAFSGSHQDAIAKGMHWRDDKDPDHWNVPYL PIDPTDVGRNYDADVIRINSQSGKGGVGYILETKFGLNLPPKMREAMGYATKAVSDHKHK ELHPDEIFNLFKQTFENITEPYSINEVHFQQKDGGIVTKVTSTFRGKTITTEASGNGRLD AVSNALKKAYELKYSLETYQEHALERSSSSKAIAYVGIKKPDGTLAWGAGVDADIIRASI DALVTAINNR >gi|226333007|gb|ACII01000012.1| GENE 9 14088 - 14624 451 178 aa, chain - ## HITS:1 COG:no KEGG:Aboo_0140 NR:ns ## KEGG: Aboo_0140 # Name: not_defined # Def: PKD domain containing protein # Organism: A.boonei # Pathway: not_defined # 5 130 1117 1242 1362 90 29.0 2e-17 MEDAELRNILFSIQGSIAGFQNDMSDVKNDIADMKTDIANMKTDITNMKTDITNMKTDIT NMKADITNMKTDIANMKTDITNMKTDITNMKADITNMKTDITNMKADITNMKSDIVQLKG SVKTIELKLENEIDRKISALFDARMDELRYRKENKETREKVFELDMRVDNLEKAIMPA >gi|226333007|gb|ACII01000012.1| GENE 10 14749 - 15990 1380 413 aa, chain - ## HITS:1 COG:CAC0016 KEGG:ns NR:ns ## COG: CAC0016 COG4198 # Protein_GI_number: 15893314 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 413 1 413 414 471 56.0 1e-132 MAVFHAFRALRPTPEKAADVAALPYDVVNREEAKSIGDENPLSFLHIDRPEMDLEPETDL YDDRVYEKAKENLDNMEEKGILVQDQKACYYIYELVRKGKTQTGIVGCSSIDDYMNGVVK KHELTREDKEQDRIHHVDSCNANTGPIFLACRYPDSLLTLMNNWKDHHEAAYDFTEEDQI THRVWVIDEDEVISEINKEFAGVDSLYIADGHHRAASAVKVGLKRREQNPGYTGEEEFNY FLSVVFPYDQLCILPYNRIVKDLNGLTVKAFLGALKFNFELMLMPGFPCKPVEKHCMGMY VDGQWYHLKAWPDIYDKKDVVGQLDVSILQEKVLRPVLGIEDPRTDQRISFVGGSHKAAE LAEIADRTGGVAFVMYPTSMEDLMKIADENKLMPPKSTWFEPKLRSGLFIHKL >gi|226333007|gb|ACII01000012.1| GENE 11 16249 - 17478 
1570 409 aa, chain + ## HITS:1 COG:STM1002 KEGG:ns NR:ns ## COG: STM1002 COG1171 # Protein_GI_number: 16764362 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Salmonella typhimurium LT2 # 34 399 36 401 404 383 51.0 1e-106 MTEGLKWTVNEVPKSDDKHLELMSEENVKKANEFHRSFPQYSVTPLQNLSSLAKYLGVKN IFCKDESYRFGLNAFKVLGGSYAMGRYIAKELGRDISELPYNVLSSDKLREEFGQATFFT ATDGNHGRGVAWAANRLGQKAVVRMPKGTTKTRFDNIAKEGATVTIEEVNYDDCVRMAAA EAAKTEHGIIVQDTAWDGYEEIPSWIMQGYGTLVLEADQQLKEMGVERPTHVFVQAGVGS LAGAVVGYFAHKYKDNPPVMAVCEASAADCLYRSAVAKTGNLVNVTGDLQTIMAGLACGE GNTIGWDILKNHVDVFASCPDWMSAKATRIYANPLGDDPRVVSGESGSVPLGFCFTALHD EDAKDLKEALKLDENSVVLVISTEGDTDPVRYREIVWDGLYGTDESLSK >gi|226333007|gb|ACII01000012.1| GENE 12 17624 - 18934 1557 436 aa, chain + ## HITS:1 COG:ECs3745 KEGG:ns NR:ns ## COG: ECs3745 COG0624 # Protein_GI_number: 15832999 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli O157:H7 # 6 434 10 399 403 568 64.0 1e-161 MDYAKINEAAMGYKDDMTKFLRDLVRIPGESCGEKGHIMRIKEEMEKVGFDKVQIDPMGN ILGYMGTGKTLIGFDAHIDTVGIGNIKNWEFDPYEGFESDEEIGGRGTSDQMGGIVSAVY GAKIMKDLGLLNDKYQVLVTGTVQEEDCDGLCWQYIIHEDGVRPEFVVSTEPTDGGIYRG QRGRMEIRVDVKGVSCHGSAPERGDNAIYKMADILQDIRALNENDAADTTEIKGLVKMLD EKYNPEWEEARFLGRGTVTTSEIFFTSPSRCAVADSCSVSLDRRMTAGETWESCLDEIRA LPAVQKYGDDVTVSMYKYNRPSYTDLVYPIECYFPTWVIPKDHKVTQALEEAYKGLYGEE RLGNAETEGMRKARPLTDKWTFSTNGVTIMGRNSIPCIGFGPGAEAQAHAPNEKTWKIDL VRCAAVYAALPTAYCK >gi|226333007|gb|ACII01000012.1| GENE 13 19140 - 20333 1319 397 aa, chain + ## HITS:1 COG:ECs3743 KEGG:ns NR:ns ## COG: ECs3743 COG0078 # Protein_GI_number: 15832997 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Escherichia coli O157:H7 # 1 395 2 395 396 547 65.0 1e-155 MKTLQDYIDKLNSLNFKEMYNNDFFWTWDKTDDELEAVFTVADALRFMRENNISTKVFES GLGISIFRDNSTRTRFSFASACNLLGLEVQDLDEKKSQIAHGETVRETANMVSFMADVIG 
IRDDMYIGKGHAYQKEFMEAVTEGNKDGILEQRPTLVNLQCDVDHPTQCMADMLHIIHEF GGVENLKGKKIAMTWAYSPSYGKPLSVPQGVIGLMTRFGMDVVLAHPEGYDVMPEVEEIA KKNAEKNGGSFTKTNDMAEAFKDADIVYPKSWAPFAAMEKRTDLYAEGDFDGIDKLEKEL LAQNAQHKDWACTEELMKTTKDGKALYLHCLPADITGVSCETGEVDASVFDRYRIPLYKE ASFKPYIIAAMIMLSKFENPQDILKKLEVKAAPRILK >gi|226333007|gb|ACII01000012.1| GENE 14 20535 - 21479 1151 314 aa, chain + ## HITS:1 COG:ECs3747 KEGG:ns NR:ns ## COG: ECs3747 COG0549 # Protein_GI_number: 15833001 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli O157:H7 # 5 312 3 308 310 387 64.0 1e-107 MFTRKRIVIALGGNALGNTLPEQMVAVKTTAKALCDLIEEGHQVVVVHGNGPQVGMINNA MSALSREDENQPNTPLSVCVAMSQAYIGYDLQNALREELRIRGFVRTPVVTVVTQVRVDP NDPAFENPSKPIGKFLTKEEADHQAKAYGHIMKEDAGRGYRRVVASPKPVEIVEQDAINS LVDANKIVICCGGGGIPVVLNGHHLKGASAVIDKDYASCLLAKELDADMLIILTAVEKVA VNFGKENEEWLDDITVDDAKKYINEGQFAPGSMLPKVQAAVDFASSKEGRTAMITLLQKA KDGIQGKTGTKIHL Prediction of potential genes in microbial genomes Time: Sat May 28 19:08:03 2011 Seq name: gi|226333006|gb|ACII01000013.1| Ruminococcus sp. 5_1_39B_FAA cont1.13, whole genome shotgun sequence Length of sequence - 31722 bp Number of predicted genes - 24, with homology - 24 Number of transcription units - 7, operones - 6 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 682 509 ## COG2068 Uncharacterized MobA-related protein 2 1 Op 2 . - CDS 689 - 1405 782 ## CLL_A0643 conserved hypothetical protein TIGR03172 3 1 Op 3 2/0.000 - CDS 1365 - 2243 689 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family 4 1 Op 4 15/0.000 - CDS 2240 - 2740 548 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 5 1 Op 5 12/0.000 - CDS 2737 - 3621 921 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 6 1 Op 6 . 
- CDS 3637 - 5910 2350 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs - Prom 6090 - 6149 9.5 - Term 6187 - 6226 -0.0 7 2 Op 1 . - CDS 6250 - 6741 435 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 8 2 Op 2 . - CDS 6744 - 9326 2997 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs - Prom 9542 - 9601 7.7 - Term 9643 - 9677 3.2 9 3 Op 1 1/0.000 - CDS 9755 - 12754 3553 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 12800 - 12859 7.1 10 3 Op 2 . - CDS 12867 - 14198 1423 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases - Prom 14303 - 14362 6.3 11 4 Op 1 . - CDS 14506 - 15105 527 ## COG0406 Fructose-2,6-bisphosphatase 12 4 Op 2 . - CDS 15098 - 15478 423 ## EUBELI_20177 hypothetical protein 13 4 Op 3 8/0.000 - CDS 15489 - 16265 737 ## COG0368 Cobalamin-5-phosphate synthase 14 4 Op 4 2/0.000 - CDS 16329 - 16853 510 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase 15 4 Op 5 . - CDS 16853 - 17905 1263 ## COG2038 NaMN:DMB phosphoribosyltransferase 16 4 Op 6 7/0.000 - CDS 17930 - 18346 430 ## COG2246 Predicted membrane protein 17 4 Op 7 1/0.000 - CDS 18360 - 19295 901 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 19441 - 19500 9.1 18 4 Op 8 . - CDS 19606 - 22341 2028 ## COG4485 Predicted membrane protein - Prom 22387 - 22446 6.2 - Term 22702 - 22744 9.1 19 5 Op 1 . - CDS 22836 - 26372 4221 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 26409 - 26468 9.2 20 5 Op 2 . - CDS 26647 - 27054 486 ## COG1959 Predicted transcriptional regulator - Prom 27189 - 27248 10.4 + Prom 27146 - 27205 9.9 21 6 Tu 1 . + CDS 27406 - 29046 1665 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 29048 - 29107 21.5 - Term 29044 - 29086 13.5 22 7 Op 1 . 
- CDS 29089 - 29607 747 ## COG2606 Uncharacterized conserved protein 23 7 Op 2 . - CDS 29558 - 30334 1054 ## gi|253577983|ref|ZP_04855255.1| conserved hypothetical protein 24 7 Op 3 . - CDS 30410 - 31588 1106 ## COG5279 Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain - Prom 31618 - 31677 8.4 Predicted protein(s) >gi|226333006|gb|ACII01000013.1| GENE 1 1 - 682 509 227 aa, chain - ## HITS:1 COG:SSO2432 KEGG:ns NR:ns ## COG: SSO2432 COG2068 # Protein_GI_number: 15899180 # Func_class: R General function prediction only # Function: Uncharacterized MobA-related protein # Organism: Sulfolobus solfataricus # 1 225 1 181 189 78 28.0 8e-15 MKIAMIMLAAGNSRRFGANKLLYEIDGIPMYRHVLEQLDDTKKKIENIYSEYSDITEDNN NDNNSDIVQLNNLYRNNITAKIICNIIVITQYDAIAEAAKTKEIQVLYNPHPEDGISSSV KIGLNASLDADAVLFTVSDQPWLTSETICELIHVFLNTSKGIACVSCQGKMGNPCIFDRK YYNELLSLEGDKGGKKVIMKHLDDTQIYEIEAGRELEDIDYYESIAV >gi|226333006|gb|ACII01000013.1| GENE 2 689 - 1405 782 238 aa, chain - ## HITS:1 COG:no KEGG:CLL_A0643 NR:ns ## KEGG: CLL_A0643 # Name: not_defined # Def: conserved hypothetical protein TIGR03172 # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 7 223 17 248 265 124 38.0 3e-27 MKHQVDFLELLQIDYEKYPVIAVVGGGGKTSLIYRLTDELIDKGKRVIITTTTHMAGESE LPFARGGDAVRVKELLDKERYVIAAEYEEDTGKYASLTDEKLEELRELCDVMLVEADGAK HHPVKVPEKWEPVIPRCADIVISVIGLDCLGQPISQSAYRMERTSEFLRKSLEAPITEED IVKIATSICGLFKDVEERVYRVYLNKSDILREKEPAEHIVEELERKNTVAAYGSLLEE >gi|226333006|gb|ACII01000013.1| GENE 3 1365 - 2243 689 292 aa, chain - ## HITS:1 COG:yqeB KEGG:ns NR:ns ## COG: yqeB COG1975 # Protein_GI_number: 16130777 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Escherichia coli K12 # 18 273 270 524 541 239 48.0 6e-63 MTNRYTNVPKSDNKPENVVIIRGGGDLASGTIHRLYRCGYRLLVLECEKPTAIRRMVSFC EAVYDGQSSVEGVLCRKVDSVEECEAVWKAGEIPLMADTEGTVLKKYRPAALIDAILAKK 
NLGTTREMADLTVGLGPGFVAGEDVDYVVETMRGHNLARIITKGAAMPNTGVPGIIGGFG KERVLHAPAAGKIHCISKIADIVEKDQVLAWIGDTPVRASLTGVLRGMIRDGFTVPKGMK IADIDPRKEQKKNCFTISDKARCIAGSVLEILLSEGVMPYEAPGRFSGTAAD >gi|226333006|gb|ACII01000013.1| GENE 4 2240 - 2740 548 166 aa, chain - ## HITS:1 COG:ygeU KEGG:ns NR:ns ## COG: ygeU COG2080 # Protein_GI_number: 16130770 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Escherichia coli K12 # 8 150 9 152 159 156 55.0 1e-38 MKKQMQLVRCTVNGREVEEYVDVRESLLDMLRNRLGLTSVKKGCEVGECGACTVIIDGET IDSCIYLAVWAEGKNIRTLEGLEKNGELTRIQEAFIEEGAIQCGFCTPGFIMSATVLLER GEKLSRDAIRKHMAGNLCRCTGYENIVNAVEKVMDENETRRGKTVR >gi|226333006|gb|ACII01000013.1| GENE 5 2737 - 3621 921 294 aa, chain - ## HITS:1 COG:ECs3740 KEGG:ns NR:ns ## COG: ECs3740 COG1319 # Protein_GI_number: 15832994 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli O157:H7 # 1 291 1 289 292 231 36.0 1e-60 MYDIENYYQAKSVKDAVRLLNEHPDARIISGGSDVLIKIREGKFAGTSLVSIHGIKEIQG VKMADNGDIYIGAGTVFSHITNDAIIRKYIPVLGEAVDQVGGPQVRNIGTIGGNICNGAV SADSAPTVFSLNALLRLEDGKEGRLVPVKDFYLGPGRVDLRQGEILTYVIIPAKEYEGYH GHYIKYSMRNAMDIATISCSVMSKVNQQAGVIEDVRITFGVAAPVPYRCTKAEEALKGMK IEESLYEKTAQLIREEINPRDSWRASRAFRLQIGGEIAARALRRTVELAGGDRI >gi|226333006|gb|ACII01000013.1| GENE 6 3637 - 5910 2350 757 aa, chain - ## HITS:1 COG:ygeS KEGG:ns NR:ns ## COG: ygeS COG1529 # Protein_GI_number: 16130768 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli K12 # 8 757 2 752 752 710 47.0 0 MIGKSVPRVDAYDKVTGRAKFTDDLVPANCLVAKIYHAKIGNGIVKSIDTSKAEALDGVV KVVTFKDVPNHCYPTPGHPWSVELAHQDVADRNILTGRVRYYGDDIAAVVAEDEITAQRA LRLIEVEYEEYPVEIHPRKSMYGSKPPIHFDKKDNVLVRSNYFIGNVEEGLKESETVIKR SYKTPIVQHCHIENPISCAYMENGRIIVVTSTQIPHIVRRVIGQALGIPWGKIRVIKPYI 
GGGFGNKQDVLYEPLNAFLTTQVGGRPVKLDITREETFCNTRTRHSIEYDLTAGVDSEGH LLAKDMLAISNQGAYASHGHAIAANGLTAWRLQYACPNIKGEAYTVYTNTPVGGAMRGYG IPQVCFAIECFMDDIAKEIGMDPLEFRKKNLIKGYYEDAYLKPIAANTNGIFECLEKGAD YIHWEEKRKEYTNQTGNIRRGVGMSLFSYKTGVWPISLEIAGARMTLNQDGSVGLMLGAT EIGQGADTVFSQMTAEVLNMDISDIHIQTVQDTDVTPFDTGAYASRQSFVTGTVVKDCAN LLKEKILDYAKELLPEETRKLRLDHGWIMAGKDPAISLGNLALESYYSLGNSSVITAEVS EQVKKNTIATGCCFAEVEVDIKLGKIKILHIINVHDSGKLINPKLAEMQVQGGMSMGMGY GLSEQLILDEKTGRPLNGTLLDYKLMTMMDTPDIKADFVELDDPIGPFGNKALGEPPAIP GAPAIRNAVLHATGVAFNENPLTPQRLIDGFIRAGLI >gi|226333006|gb|ACII01000013.1| GENE 7 6250 - 6741 435 163 aa, chain - ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 24 152 133 261 264 85 38.0 3e-17 MAVKKKFPYRAAVVACNGGCRSAEGEAGCADGCIGCKACIDKCKFGAISINEYGVAEVDE EKCIGCGACAKVCPQKVIHVHECANYIVVKCSNKKKGAEAKKQCDVSCIGCGICEKTCTA GAIKVKDSCAVIEESICLSCGMCAVKCPRHVIHDLHGILTEIR >gi|226333006|gb|ACII01000013.1| GENE 8 6744 - 9326 2997 860 aa, chain - ## HITS:1 COG:BH0748 KEGG:ns NR:ns ## COG: BH0748 COG1529 # Protein_GI_number: 15613311 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Bacillus halodurans # 169 851 15 740 760 336 33.0 1e-91 MAVFTVNGQKVEPTGNQKLLRFLRDELHLTSVKDGCSEGACGACTVIIDGKTCKACVPDT DLLDGRTVITVEGLTEWEREVYTYAYGKAGAVQCGFCIPGMVMCTKALLDVNKEPTDDEI KYALRNNYCRCTGYVKIMDAVRLAAKVLKTGVIPDDLDPDWNIGHRVSRVDVEEKVLGTG KYPDDFYFDGMLYGVALRSKYPRARVLEIDTAAAKALSGVEAVLTAEDIPGENKIGHLKH DQYSLIPVGGLTHYLGDAIAVIAAKDRETAEKAKKLIKVKYEVLPHIRTIEEAAAEDAPK VFDEEENNICAHKHISRGNADEAIRNSKYVISHHFETPWTEHAFLEPECAVALYDEDGDI FVYSTDQSAHQTLHECSLLLGTDRVKVQNALVGGGFGGKEDMTVQHLASLLTYATKKPVK MKLTRAESLLVHPKRHPFYMDMTMGCDENGNIMGVKAKVASDTGAFASLGGPVLERACTH AAGPYHYENFEIEGTAYYTNNPPAGAFRGFGVTQTCFATETLLNMMADKVGITPWEIRYR 
NAIRPGETLPNGQIVDNSTGLVETLEAVKKKYDAALEEGKPVGIGCAMKNAGVGVGIPDT GRVKLVFEKDKKLHIYSGASCIGQGLGTVLTQMVVTNTDLKHEDIVYERSNTWFAPDSGT TSGSRQTLVTGEACRRACDKVMEDRNAGKTIDDVIGKIYYGEYLAKTDPLGANVPNPVSH VAYGYATQVCILDKKTGKIEEMVAAHDVGKAVNPLSCEGQIEGGVVMSIGFALRERYPID ENCKPIDKYGSLGLFRSHEIPKIDAIVVDKPGLNVACGAIGIGEITSIPTAPAIADAYFR WNGERQYSLPLTGTPYERKC >gi|226333006|gb|ACII01000013.1| GENE 9 9755 - 12754 3553 999 aa, chain - ## HITS:1 COG:ygfK_2 KEGG:ns NR:ns ## COG: ygfK_2 COG0493 # Protein_GI_number: 16130780 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli K12 # 441 962 1 535 582 403 42.0 1e-112 MSEVMTCMPFEQLMNWVLEEKKTKGTVFGQHRAYAAGTDRKLNIFERNLETPIGPAAGPH TQLTQNIVASYYAGARFFELKTVQKMDGAELAACINRPCILADDEGYNCEWSTELYVPQA MGEYIKAWFILHVIAKEFDLGAQDGFQFNISVGYDLAGIKEPKVNTFIDSMMEAKDTEIF KECKQWLLDNVDKFEKVTKEDIEAIPSDICNSATISTLHGCPPNEIESIANHLFKEKHLN TFIKCNPTLLGYEFARKTMDDMGYDYMVFGDFHFKDDLQYEDAIPMFKRLQALADELNLA FGVKITNTFPVDVTRNELPSEEMYMSGKSLFPLSISLAARLSREFDGKLRIAYSGGADYY NIDRIVGCGVWPVTVATTLLKPGGYQRFTQMAEKVMADGVKEWKGIDVAALEQLAEDAKK DAHHVKSIKPLPKRKTDSEVPLLDCFFAPCEEGCPIHQDITTYVKLAGEGDYAQALRVIL EKNALPFITGTLCAHNCMYKCTRNFYEEPVNIRNTKLIAAQNGYDTVIGEIKAGTADGKK VAVVGAGPAGIASAYFLARAGASVTVFEKEEKAGGVIRYVIPGFRISDDAIDKDVSFIQK MGVEIKTGTEVKSVQELKAQGYDAVIVATGANKPGTLKLEKGETINALKFLRDFKATDGK VALGKNVVVIGGGNTAMDTARAAKRTEGVEHVYLVYRRTKRYMPAAEDELLEVLEEGVEF KELLSPVSLENGKLLCKKMELGSMDASGRAGVTETGETEEVLADTVIVAVGEKVPTEFYE ANGIAVNERGKARINDKTMETSAEGVYVVGDGARGAATIVEAIRDAQVAAKAILGHDIVK GQPVPGTEKDCYSKKAILKESKDAANESERCLTCNKVCENCVDVCPNRANISIKVPGMAM NQVIHVDYMCNECGNCKSFCPYASAPYKDKFTLFASEADMADSTNDGFAVINPETKECKV RLLGQISDCKADDANDKLYEGLRRLICAVIDDYSYLIMK >gi|226333006|gb|ACII01000013.1| GENE 10 12867 - 14198 1423 443 aa, chain - ## HITS:1 COG:ssnA KEGG:ns NR:ns ## COG: ssnA COG0402 # Protein_GI_number: 16130781 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # 
Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli K12 # 24 437 45 456 464 229 32.0 8e-60 MLVIGNGRMITQDASNPFLENGAVAMDGNTIVMVGVTEEVKKVYPDAEFVDAKGGVIMPA FINAHEHIYSSFARGLSINGYNPQGFLDILDGLWWTVDRHLTLEQTKLSAYATYIDSIKN GVTTVFDHHASFGHITGSLNAIEEAAKDLGVRTCLCYEISDRDGMEKSRESVMENVNFIK HALADDSDMIAGMMGMHASFTISDETMALCNELKPEGVGYHIHVAEGIYDLHQCLKEHGK RIVDRLHDWNILGPKTLLGHCIYVNEHEMDLIKDTDTMVVHNPESNMGNACGCPPTMRMV QKGILTGLGTDGYTHDMMESWKVANVLHKHSLCDPNAAWGEVPQMLFEGNAKIANRYFKK QLGVLKEGAAADVIVIDYDPLTPMNESNINGHLMFGVNGSMVQTTVCNGKVLMKDREVLV CDEAKVMADCRQAAKELADDING >gi|226333006|gb|ACII01000013.1| GENE 11 14506 - 15105 527 199 aa, chain - ## HITS:1 COG:BH1593 KEGG:ns NR:ns ## COG: BH1593 COG0406 # Protein_GI_number: 15614156 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Bacillus halodurans # 6 191 9 194 209 86 30.0 3e-17 MLKLWLIRHGKTEGNKLSRYIGTTDEPLCQEGTEFLHKMDYPKVQAVYVSPLKRCVQTAE ILFPGEPVHIIEELAECDFGEFENKNYKELEGNPHYQEWIDSNGTLPFPGGESREGFKSR NLRGFDRVVSGCIRSHVAEAALVIHGGTIMNIMEEYADIQKPFYEWHVRNGGGYEVELDE NLWKNGRKQLRVNSAIMEG >gi|226333006|gb|ACII01000013.1| GENE 12 15098 - 15478 423 126 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20177 NR:ns ## KEGG: EUBELI_20177 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 124 1 124 127 86 38.0 3e-16 MEMIIGGAYQGKTEYARKQYPQLRWSDGGSVTEEELMNAQGVTGFQQYIRSELENGKDVS DLAEKIICKNPDIVLVSQEVGYGVVPVDAFDRKYREAVGRVCTKLAAYSHKVTRVICGIG TVIKDA >gi|226333006|gb|ACII01000013.1| GENE 13 15489 - 16265 737 258 aa, chain - ## HITS:1 COG:lin1112 KEGG:ns NR:ns ## COG: lin1112 COG0368 # Protein_GI_number: 16800181 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin-5-phosphate synthase # Organism: Listeria innocua # 15 238 11 227 248 73 32.0 3e-13 MKSLWNNFKVAFAMFSKIPMPRADWTKENMKYMFCFFPFIGTVIGGLTMLVAYLGLRFGY QPGFVTAVLVLVPVFVTGGIHVDGLLDTSDALSSWQERSRRLEILKDSHAGAFAVITAAV 
YFIAWYGAYSQLFNSAENMRAIGILSLGFMVSRCLSGIGVITFPKAKADGTVAEFSRNSS DVLSRNVLIVYLVLLAAGMILIRPVWGSLAFVGALLIFWYYHHIAMKYFGGTTGDISGFF LCICEVGMALILAVVSNF >gi|226333006|gb|ACII01000013.1| GENE 14 16329 - 16853 510 174 aa, chain - ## HITS:1 COG:CAC1383 KEGG:ns NR:ns ## COG: CAC1383 COG2087 # Protein_GI_number: 15894662 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 2 167 4 182 185 89 33.0 4e-18 MLTLVTGGSGSGKSAFAEDRVLSFGDAQRIYIATMHPFDEESHKRIERHQKMRAGKGFET VECYTGLKNVKLPAGCVVLLECMSNLVANEMFEEQGAHDRTVSEVTKGIENLLEQAAHVV IVTNEIFSDAVVFDGDMDSYLEYLGKINQAAAQRADEVVEVVYGIPVFHKSTDT >gi|226333006|gb|ACII01000013.1| GENE 15 16853 - 17905 1263 350 aa, chain - ## HITS:1 COG:CAC1372 KEGG:ns NR:ns ## COG: CAC1372 COG2038 # Protein_GI_number: 15894651 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 1 350 1 352 352 317 49.0 3e-86 MTLEEAIAKIKPLDHNAMEIAQKRWDSIAKPLHSLGKLETLLIQIAGITGNAEVDLSRRG LIAMCADNGVVEEGVTQTGQEVTAIVAENFLKYDTSVGVMCKQNHAEIFPVDMGMVTDTK VRTDHKIAYGTQNMTKGPAMTREQAVKGLEAGIDMVRELNDKGYRILATGEMGIGNTTTS SAVASVLLKQPVEEMTGRGAGLTSEGLVCKINAIKKAIALNEPDPEDAIDVLAKVGGLDI AGMAGVFLGGAVYGIPVVMDGFISCVSALIAMRICPAARDYILASHVSKEPAAHLILENM GKEAIIHADMCLGEGTGAVALFPILDLAAAVYHSMSTFDDIHVEQYEELK >gi|226333006|gb|ACII01000013.1| GENE 16 17930 - 18346 430 138 aa, chain - ## HITS:1 COG:lin2694 KEGG:ns NR:ns ## COG: lin2694 COG2246 # Protein_GI_number: 16801755 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 12 136 20 145 145 78 38.0 3e-15 MKQWIRKMYESSVIRYIFFGGCTTMVNLVSFFILRKLKVELNIANIISIILAILFAYVVN SKYVFQDKCETLKDHVQPFCKFVSARLVTMVIEVGGVWLLVSVMGLNDMIGKFLTQFIVL ILNYVFSKFFVFTTGKSK >gi|226333006|gb|ACII01000013.1| GENE 17 18360 - 19295 901 311 aa, chain - ## HITS:1 COG:BS_ykcC KEGG:ns NR:ns ## COG: BS_ykcC COG0463 # Protein_GI_number: 16078354 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 3 302 6 307 323 228 39.0 1e-59 MSKLSVVLPAYNEELMVGKTCRVLAEVLTEAKIPYELVVVNDGSGDRTWEEIQKAGERDA NVTGVLFSRNFGKEAAIFAGLAQATGDVVAVMDCDLQHPPQTLIEMYRLWQEGYEVIEGV KADRGKEGFLHKECAGFFYDIMSKATKVNMKDASDFKMMDRKAVDSILSMPERNMFFRAT SSWVGYKTTSVEFEVQEREAGVSKWSPWALVKYAFTNIVAFTTFPLQFVTITGAICFICS LVLMVYSLVQYFTGSAVEGYTTLLMVLLLVGSAMMISLGIIGYYISKIYEEVKRRPRYII SKVIKNGNIQN >gi|226333006|gb|ACII01000013.1| GENE 18 19606 - 22341 2028 911 aa, chain - ## HITS:1 COG:BS_yfhO KEGG:ns NR:ns ## COG: BS_yfhO COG4485 # Protein_GI_number: 16077927 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 78 910 19 815 819 136 21.0 2e-31 MRKFVSRDSKKDFYLLYTLVFGVISFFIYYQFAGNGKSLVWSHDGIPQHLNSLAYYGRYL REVLHTIFVEHKLELPMWDMNIGYGSDILTTLHYYVIGDPLTLLSVFVPADKTEVLYEVL IFLRIYLAGISFSVFCFYHKNPKQATFMGTLIYIFAGWTIYAAMKHPYFSNPMIYLPLVL MGIDKIYKREKPWLFIWATAVSAMSNFYFFYMICIFMFIYAAFRYFGIFSERSVKDVLGW LVKFIGYYAVALMIAAVIFLPVIMTIFGTDRFQAENYVPLLYDKIYYEKYLGDLIGENMI QWGVAGYSAVAMAGVFVLFSRKKKYLDLKLGFGLLNLFLLVPFAGHVLNGFSYVSNRWIW AYGMMIAYIFVKAYPELFTLTVREKKKIFVMVVVYCVLALFAKAARTQRNMAGVLVLVLA VFTVTSFGNIFLQGKYMCGLLSALLVVSIMLNVSYQYSYEKDYLSEFAAEEESLDKLESN TDKAVLAAGDSGVYRYDQYGALPYDNTSMYMGTNSTAYYFSLANSSISDFFSEMYLNTPW EQHYENLDGRTILDRLASVKYFVISGDNFRYLSYGYNREKGSAGKGKSECRAYENENALP LGYTYDSYIPVSEYEKMDVVKKQQALMDGVVLEESTLPEASVDADNENIQYRMEAGDGCA LSKGAIRVTKEGAQMKLVFHGLTDSENYLIADNLDYDSLSPRELIGNSQWKKMSEYDQNK VLDEDSRWRYWKESKEAAMTVSSNDVTKTIKIFTDKYNAYSGRHDFLCNMGYSRSGVRTM TITFANTGVYTYDKLRVVSQPVQGIEEKTVKLGEEALENVKMGTNEITGDISVSEKKALV LSVPYSKGFTAYVDGKETKLQKANTMFMALELEPGSHEIRLTYCTPYLKAGMLLSVLGLA IYVMLVFRKKK >gi|226333006|gb|ACII01000013.1| GENE 19 22836 - 26372 4221 1178 aa, chain - ## HITS:1 COG:CAC2229_1 KEGG:ns NR:ns ## COG: CAC2229_1 COG0674 # Protein_GI_number: 15895497 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and 
related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Clostridium acetobutylicum # 3 410 2 413 413 562 67.0 1e-159 MARKMKTMDGNHAAAHASYAYSDVAAIYPITPSSVMAEATDEWATQGRKNIFGQEVQVTE MQSEAGAAGAVHGSLAAGALTTTYTASQGLLLMIPNLYKIAGEQLPAVFNVSARALASHA LSIFGDHSDVMACRQTGCAMLCESSVQEVMDLTPVAHLAAIKGKVPFINFFDGFRTSHEI QKIETWDYEDLKDMADLEAIAEFRNHALNPNHPCQRGSAQNPDIFFQAREACNPYYDALP AVVQEYMDKVNAKIGTDYKLFNYYGAADAEHVIVAMGSVNDTIEETIDYLAAAGKKVGVV KVRLYRPFCAQALIDAIPESVKQITVLDRTKEPGALGEPLYLDVVAALKDSKFGGVKVFS GRYGLGSKDTTPAQIVAVYENTAKEKFTIGIVDDVTNLSLETGAPIVTTPEGTTNCKFWG LGADGTVGANKNSIKIIGDNTDMYAQAYFDYDSKKSGGVTMSHLRFGKKPIKSTYLIHKA NFVACHNPSYVNKYNMVQELVDGGTFLLNCAWDMEGLEKHLPGQVKAFIANHNIKFYTID GVKIGIETGMGPTRINTILQSAFFKLTGIIPEDQAIDLMKAAAKATYGRKGDDVVQKNWA AIDAGAKQVVEIQVPESWKNAEDEGLHMTHATEGRQDVIDFVNNIQAKVNAQEGNSLPVS AFKDYVDGTTPSGSAAYEKRGIAVNVPVWNPENCIQCNRCAYVCPHAVIRPVALTAEEAA NAPEGMKTLDLTGMKEYKFTMSVSALDCTGCGSCVNVCPGKKGAKALAMENLEASADEQK YFDYTVKLPVKEDVIAKFKEATVKGSQFKQPLLEFSGACAGCGETPYAKLITQLFGDRMY IANATGCSSIWGNSSPSTPYTVNAKGQGPAWSNSLFEDNAEFGYGMLLAQRAIRDGLKAK VEDVVANGTNEDVKAAGQEWLDTFAVGATNGAATDKLVAALEACGCDKAKEILAQKDFLA KKSQWIFGGDGWAYDIGFGGVDHVLASGRDINVMVFDTEVYSNTGGQSSKSTPTGAIAQF AAGGKETKKKDMASIAMSYGYVYVAQISMGADFNQTVKAIAEAEAYPGPSLIIAYAPCIN HGIKKGMAKAQTEEELAVKVGYWHNFRFNPAAEGNKFSLDSKAPSMEDYQAFLDGEVRYN SLKRQNPEKAARLFAKNEAEAKARFEYLQKLIALHSAE >gi|226333006|gb|ACII01000013.1| GENE 20 26647 - 27054 486 135 aa, chain - ## HITS:1 COG:CAC0115 KEGG:ns NR:ns ## COG: CAC0115 COG1959 # Protein_GI_number: 15893411 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 1 133 1 134 137 59 37.0 2e-09 MFITRECDYAVRVVRALWGESRLSVSDICEKEAITAPFAYKILKKLQKAEIVKGYRGVHG GYSLNRGLDELTLYEVYSAIDPDMFIIECLDPKYNCVRDGQDGLPCLVHRELLSVQDELI SLLKRKTIQQIMEEA >gi|226333006|gb|ACII01000013.1| GENE 21 27406 - 29046 1665 546 aa, chain + ## HITS:1 COG:BS_ykpA KEGG:ns NR:ns ## COG: BS_ykpA COG0488 # Protein_GI_number: 16078507 # Func_class: R General function prediction only # 
Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 532 1 533 540 790 71.0 0 MISANNITLRVGKKALFEDVNIKFTEGNCYGLIGANGAGKSTFLKILSGQLEPTNGDVVI TPGQRLSFLQQDHFKYDAYTVLDTVIMGNKRLYEIMKEKDAIYAKADFTDEDGIRASELE GEFAEMNGWEAESDAATLLNGLGIETDLHYSQMADLTGSQKVKVLLAQALFGNPDILLLD EPTNHLDLPAIEWLEEFLINFDNTVIVVSHDRYFLNKVCTHTADIDYGKIQLYAGNYDFW FESSQLLIKQMKEANKKKEEKIKELQEFISRFSANASKSKQATSRKRALEKIQLDDMRPS SRKYPYIDFRPNREIGNEVLMVENLSKTIDGVKVLDNISFTLGHDDKVAFVGANEQAITT LFKILVGEMEPDEGNYKWGVTTSQAYFPKDNTAEFDNDLTITDWLTQYSEIKDATYVRGF LGRMLFPGEDGIKRVRVLSGGEKVRCLLSKMMISGANILILDEPTNHLDMESITALNNGL IKFPGVILFTSHDHQFVQTTANRIMEILPNGTMIDKITTYDEYLASDEMAKKRHVFEINE EDASDN >gi|226333006|gb|ACII01000013.1| GENE 22 29089 - 29607 747 172 aa, chain - ## HITS:1 COG:SA0652 KEGG:ns NR:ns ## COG: SA0652 COG2606 # Protein_GI_number: 15926374 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 12 168 1 158 160 137 44.0 1e-32 MPEDRKDRREKMAKEAKTNAMRMLDRQKVKYEAFSYECDEFIDGIHSADKIGAPYDQSFK TLVMEGKSGGYFVFVVPIEKEVDRKAAAKAVGEKTVDMIHVKDITKITGYVRGGCSPLGM KKPYPVVFDASAGEFEEIYVSGGRIGLTLKVPLADLLKVTGGKLADIIMKTE >gi|226333006|gb|ACII01000013.1| GENE 23 29558 - 30334 1054 258 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577983|ref|ZP_04855255.1| ## NR: gi|253577983|ref|ZP_04855255.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 258 1 258 258 417 100.0 1e-115 MVNMTDIFGNALSIAGSFMDSDLVNRFMMVIFAVILVFGVLNCILGYRLLRFWMMLGGFL VGGGLALVVVHTMGIQEKSTMLIAALAAGVVFAVIAFLIYKAGVFILAAGIGWAASIYFL HPTSSAVFFACILIGVALGSMAVKYCREVLIVATSLIGGIMAGVSLAQLGNLADIPYGLG MSVGFAVLGMLIQFAINKPVSDEEDEETVDEESTEQIYRRNQELRQDQSFEQSQDRDVEE SIYARGQKGQARKNGKRS >gi|226333006|gb|ACII01000013.1| GENE 24 30410 - 31588 1106 392 aa, chain - ## HITS:1 COG:SPy0210 KEGG:ns NR:ns ## COG: SPy0210 COG5279 # Protein_GI_number: 15674407 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain # Organism: Streptococcus pyogenes M1 GAS # 42 353 44 372 410 89 23.0 8e-18 MSKKRFSGLRLILMVPVVVLLAGGLIIGGRALQEPGEVRTLKKQEVAQTNEGHREYYFGL LNENEQRGYREILEGIRSFEDKFYLSLSGDNEIDRVYHAVLKDHPELFWVHNREKVYKTT YSGRDYCQFSPGYTYTEEQRQEITQAMENAYQEVLSQIPDGADDYTKVMTVYTYVIDNTE YVISDDDQSIAGAFWKKQAVCAGYAGAVQYLLERLDIPCIYVEGDAANSTQGHAWNIVEL NGQYYYVDATNGDQPDFLEGDATLRAEHKTTIYDYLCPFPQEYEENYTASDEFPVPACTA TDMNFYVRNGACFDTYDWSSVYDLCRLRIDSDAAVVRFKFGSQGAYDEAYMDLIDSNHIQ DIAKYYMEVHGLSQISYHYGVIDNMKTLYFMF Prediction of potential genes in microbial genomes Time: Sat May 28 19:08:37 2011 Seq name: gi|226333005|gb|ACII01000014.1| Ruminococcus sp. 5_1_39B_FAA cont1.14, whole genome shotgun sequence Length of sequence - 55749 bp Number of predicted genes - 49, with homology - 46 Number of transcription units - 29, operones - 11 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 85 - 144 4.8 1 1 Tu 1 . + CDS 229 - 867 593 ## COG4869 Propanediol utilization protein + Term 909 - 960 6.2 - Term 897 - 947 9.0 2 2 Tu 1 . - CDS 1053 - 1169 105 ## + Prom 1056 - 1115 4.1 3 3 Tu 1 . + CDS 1168 - 1356 94 ## - Term 1182 - 1207 -0.1 4 4 Op 1 . - CDS 1264 - 1665 252 ## gi|253577987|ref|ZP_04855259.1| predicted protein 5 4 Op 2 . 
- CDS 1662 - 1859 190 ## gi|253577988|ref|ZP_04855260.1| predicted protein - Prom 1938 - 1997 6.1 - Term 1978 - 2020 5.1 6 5 Tu 1 . - CDS 2067 - 3122 760 ## CLH_1452 hypothetical protein - Prom 3218 - 3277 13.9 + Prom 3122 - 3181 10.9 7 6 Op 1 . + CDS 3356 - 3541 189 ## gi|253577990|ref|ZP_04855262.1| predicted protein 8 6 Op 2 . + CDS 3603 - 4094 323 ## gi|253577991|ref|ZP_04855263.1| conserved hypothetical protein + Term 4104 - 4168 16.0 9 7 Tu 1 . - CDS 4702 - 5535 983 ## COG3711 Transcriptional antiterminator - Prom 5564 - 5623 6.8 - Term 5932 - 5986 6.2 10 8 Tu 1 . - CDS 6003 - 8246 2537 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 8443 - 8502 6.7 11 9 Tu 1 . - CDS 8722 - 10683 1436 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain - Term 12072 - 12113 1.0 12 10 Tu 1 . - CDS 12132 - 14006 2119 ## gi|253577995|ref|ZP_04855267.1| conserved hypothetical protein - Prom 14212 - 14271 6.1 - Term 14291 - 14331 7.2 13 11 Tu 1 . - CDS 14357 - 14947 360 ## EUBELI_20285 hypothetical protein - Prom 15018 - 15077 6.2 14 12 Tu 1 . - CDS 15079 - 15318 336 ## EUBREC_2810 hypothetical protein - Prom 15372 - 15431 4.7 - Term 15404 - 15440 4.0 15 13 Tu 1 . - CDS 15469 - 16026 493 ## COG4636 Uncharacterized protein conserved in cyanobacteria - Prom 16054 - 16113 6.6 16 14 Op 1 . - CDS 16224 - 17156 963 ## COG1897 Homoserine trans-succinylase 17 14 Op 2 . - CDS 17182 - 19140 1840 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) 18 14 Op 3 . - CDS 18938 - 19882 415 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases - Prom 19998 - 20057 6.4 + Prom 19977 - 20036 3.2 19 15 Tu 1 . + CDS 20064 - 21308 1063 ## COG0726 Predicted xylanase/chitin deacetylase + Term 21400 - 21445 3.1 - Term 21375 - 21445 25.1 20 16 Tu 1 . 
- CDS 21456 - 22361 717 ## gi|253578003|ref|ZP_04855275.1| conserved hypothetical protein - Prom 22603 - 22662 9.3 - TRNA 22486 - 22557 64.7 # Arg CCG 0 0 - Term 22682 - 22723 8.5 21 17 Tu 1 . - CDS 22732 - 24354 1933 ## COG1757 Na+/H+ antiporter - Prom 24386 - 24445 5.6 22 18 Op 1 . - CDS 24472 - 25320 771 ## gi|253578005|ref|ZP_04855277.1| predicted protein 23 18 Op 2 . - CDS 25339 - 26430 707 ## gi|253578006|ref|ZP_04855278.1| conserved hypothetical protein 24 18 Op 3 11/0.000 - CDS 26427 - 28844 2656 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 25 18 Op 4 . - CDS 28847 - 29314 488 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 26 18 Op 5 . - CDS 29304 - 30122 846 ## EUBELI_01495 xanthine dehydrogenase - Prom 30200 - 30259 7.3 + Prom 30555 - 30614 10.3 27 19 Tu 1 . + CDS 30722 - 31309 676 ## BDI_1616 hypothetical protein 28 20 Op 1 . - CDS 31389 - 32489 1027 ## COG2006 Uncharacterized conserved protein 29 20 Op 2 . - CDS 32587 - 33135 208 ## PROTEIN SUPPORTED gi|52081538|ref|YP_080329.1| ribosomal protein S2 - Prom 33184 - 33243 6.1 30 21 Tu 1 . + CDS 33422 - 34759 1252 ## COG0534 Na+-driven multidrug efflux pump + Term 34812 - 34859 6.2 - Term 34798 - 34846 10.2 31 22 Op 1 . - CDS 34920 - 35351 670 ## gi|253578015|ref|ZP_04855287.1| predicted protein - Prom 35398 - 35457 1.8 32 22 Op 2 36/0.000 - CDS 35460 - 37853 2384 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 33 22 Op 3 . - CDS 37865 - 38563 282 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) - Prom 38615 - 38674 4.9 + Prom 38842 - 38901 9.2 34 23 Op 1 6/0.000 + CDS 38962 - 40197 1293 ## COG1454 Alcohol dehydrogenase, class IV + Prom 40206 - 40265 4.4 35 23 Op 2 . + CDS 40312 - 41691 1245 ## COG1012 NAD-dependent aldehyde dehydrogenases - Term 41603 - 41648 6.3 36 24 Tu 1 . 
- CDS 41701 - 42441 479 ## COG2357 Uncharacterized protein conserved in bacteria - Prom 42551 - 42610 5.7 + Prom 42775 - 42834 6.2 37 25 Tu 1 . + CDS 42924 - 43592 415 ## COG1787 Predicted endonuclease distantly related to archaeal Holliday junction resolvase and Mrr-like restriction enzymes 38 26 Op 1 . - CDS 43598 - 44182 432 ## PROTEIN SUPPORTED gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) 39 26 Op 2 . - CDS 44279 - 47608 2906 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases - Prom 47673 - 47732 3.9 - Term 47845 - 47892 1.2 40 27 Op 1 . - CDS 47920 - 48417 437 ## COG0582 Integrase 41 27 Op 2 . - CDS 48463 - 48924 397 ## NT01CX_0673 hypothetical protein 42 27 Op 3 . - CDS 48924 - 49382 326 ## NT01CX_0674 hypothetical protein 43 27 Op 4 27/0.000 - CDS 49429 - 50610 426 ## COG0732 Restriction endonuclease S subunits 44 27 Op 5 . - CDS 50630 - 52135 1226 ## COG0286 Type I restriction-modification system methyltransferase subunit 45 27 Op 6 . - CDS 52137 - 53192 582 ## COG3177 Uncharacterized conserved protein - Prom 53226 - 53285 6.2 - Term 53251 - 53299 11.8 46 28 Op 1 . - CDS 53364 - 54041 565 ## CA_C0729 hypothetical protein 47 28 Op 2 . - CDS 54034 - 54159 182 ## - Prom 54194 - 54253 1.6 48 29 Op 1 . - CDS 54257 - 55573 898 ## TBFG_12832 hypothetical protein 49 29 Op 2 . 
- CDS 55607 - 55747 109 ## gi|253578033|ref|ZP_04855305.1| CRISPR-associated protein Predicted protein(s) >gi|226333005|gb|ACII01000014.1| GENE 1 229 - 867 593 212 aa, chain + ## HITS:1 COG:TM0375 KEGG:ns NR:ns ## COG: TM0375 COG4869 # Protein_GI_number: 15643143 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Propanediol utilization protein # Organism: Thermotoga maritima # 5 192 23 210 210 183 51.0 3e-46 MSIKIPIETSARHLHISQEDFEKLFGEGSKPHFIKELSQPGQYLCQERVTVKGPKGTFEN MALLGPFRSDTQVEMSLTDTRKLGIPSVIRQSGDIEGTPGCILSGPCGDIEIPKGVIVAK RHIHMTPDEALALHIKDNDEVFVLTKSYGRALIYADVVVRVHRNYHLAMHVDTDEANAFN SDTEPYGVIVRFFDSNFNTDKWIEDELSGIRR >gi|226333005|gb|ACII01000014.1| GENE 2 1053 - 1169 105 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSIFEYDKEEEEQKLQATYEKIGEKRGEKQGELKRQKA >gi|226333005|gb|ACII01000014.1| GENE 3 1168 - 1356 94 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAIISAFQKPRTGVAVLYQSEAFRIWGYKVFTSIQIKELSLTDVYLCNGLRFFFRAVVNV LA >gi|226333005|gb|ACII01000014.1| GENE 4 1264 - 1665 252 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577987|ref|ZP_04855259.1| ## NR: gi|253577987|ref|ZP_04855259.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 133 1 133 133 235 100.0 6e-61 MTDSNLVGTSFAGTNTKDLELQLYIESDDGKKVEGIGGENTKVKEIHLPVYKAGERQAEY EDGYVYVNSENVYDFTLNELQNYESYLKTSAGYKGKVTLYAKISCKYIYYCTEKEAQSIA QINICQRQLFDLD >gi|226333005|gb|ACII01000014.1| GENE 5 1662 - 1859 190 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577988|ref|ZP_04855260.1| ## NR: gi|253577988|ref|ZP_04855260.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 65 1 65 65 103 100.0 4e-21 MEKKLFVNTIIAAYNAQAVNPELMFVKQSDRNAAQEKTVYYTLDNVNFNESQELIKNKPL EFHLK >gi|226333005|gb|ACII01000014.1| GENE 6 2067 - 3122 760 351 aa, chain - ## HITS:1 COG:no KEGG:CLH_1452 NR:ns ## KEGG: CLH_1452 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 33 350 35 377 380 85 25.0 3e-15 MEIGGFFPYQPVGSEPNNYVEHTCPDPGDTAHLMSGRCAIYYCLRDLMLTDQKRVAYLPS YDCETVLGCFVKAGYEIHYYDFDKNLSPVFEEEMIPQISVLLVCGYYGYPTYDENFVKKC KASGVSVIVDTTHTAFSPLPACPDSDYIAVSLRKWMGVASGGLAIKRKGTFGVKPIPINK EQFSLRDQALQSRETYEHTGNEDYNKEGNDVFWKAEFMLREIFDIQEGDEASLKTILHYP FRDAIRKRQENYHYLLEHLPERSDIRPVFPYLSDDICPMFFPFFVENRDALIKHLADRHI PPKVYWPVPPFVKVEDYPGAQYIYSHIMSVSCDERFNTDDMQKIVEAFENY >gi|226333005|gb|ACII01000014.1| GENE 7 3356 - 3541 189 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253577990|ref|ZP_04855262.1| ## NR: gi|253577990|ref|ZP_04855262.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 61 1 61 61 113 100.0 4e-24 MNTHSSGYYSIDKDYNILSYNDTAKQLYPNLQPGVKCYKALMGLDSPCPPCPVALGIQCE I >gi|226333005|gb|ACII01000014.1| GENE 8 3603 - 4094 323 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253577991|ref|ZP_04855263.1| ## NR: gi|253577991|ref|ZP_04855263.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 163 25 187 187 280 100.0 2e-74 MGPLNDSKLLKSTLNNNNNFELDSNATSSTMVPAISTLIKNDSLLKNGTWAYLGSGKNKN RRYFFWTSVNVSNFPTGTKIPVIICRADNEYYVSESTTAERTGDKITTLYKVIADHQYNA ESYQKIADKGTKYDSLQAAYDAYVEKITEDSNYSQFKDTLSKN >gi|226333005|gb|ACII01000014.1| GENE 9 4702 - 5535 983 277 aa, chain - ## HITS:1 COG:CAC1355 KEGG:ns NR:ns ## COG: CAC1355 COG3711 # Protein_GI_number: 15894634 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Clostridium acetobutylicum # 2 272 8 277 287 146 30.0 4e-35 MYRVSKVLNNNGVIAIDMDENKEYVILGKGVGFGKKVSQRFDKPEGCTTYRLEQETERGS AKELVKGIEPEYLEIADAILTESQKVFGDSIDRGILFPLADHISFAVARIRRNEQISNPL TEDIKVLFYSEFKVAETLKTILRERLQIEIDDHEVGYVALHIHSAIGDEKVSVAMQTARA VRECIDMLEKATGKPIDVLSLSYNRLMNHMKYMVARASTGEKLNLDMNEYMLDQYPQAYK VATDICKNLEGCIGHKLDETETGYLAMHIQRVYKDAV >gi|226333005|gb|ACII01000014.1| GENE 10 6003 - 8246 2537 747 aa, chain - ## HITS:1 COG:SA0233_1 KEGG:ns NR:ns ## COG: SA0233_1 COG1263 # Protein_GI_number: 15925945 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Staphylococcus aureus N315 # 5 433 4 422 437 448 54.0 1e-125 MKDKIFGVLQRVGRSFMLPIAILPVAGLLLGIGSSFTNETTIATYGLQKILGSGTLLNSL LIIMNKVGSAVFDNLPLIFAVGVAIGMAKKEKEVAALSALIAYFVMNVAVSAMLLINNEI TADGQIAADVLEGTITSVCGIQSLQMGVFGGIIVGLGVAALHNRFHKIVLPNALSFFGGS RFIPIISTIVYMFVGILMYFVWPAVQNGIYALGGLVTGSGYLGTLIFGIIKRALIPFGLH HVFYMPFWQTAVGGTMEVAGQVVQGGQNIFFAQLADSANIAHFSADATRYFSGEFIFMIF GLPGAALAMYRCAKPEKKKAAGGLLLSAALACMFTGITEPLEFSFLFVAPALFVVQVILA GAAYMIAHMLNIAVGLTFSGGFLDLFLFGILQGNAKTSWLRIIPVGIIYFILYYVIFTSM IRKFNFKTPGREDDDTETKLYTKADVNARKEAGKTAGAAVATSGDPVSELITRGLGGKKN IVDVDCCATRLRITVAEPERVRDELLKQTDSRGIVKKGQGVQVIYGPHVTVIKAKLEEYL ETAPNEFADETPDNIQAQENIAAENADGQNSAEVEDKNAGTAQSLAPVKEEKIRKTAIIY SPVDGIAADLSTAPDEGFAGKMMGDGAVVTPTEGTVYAPADGEVEFIFDTKHAIGFQTDS GIPMLLHMGIDTVKLEGKGFEILVTEGQKVKKGDPMMKLDLEFLTANAPSITSPILDIEP EDNQRIRLLANGEIKAGEPLFAVETLE >gi|226333005|gb|ACII01000014.1| GENE 11 
8722 - 10683 1436 653 aa, chain - ## HITS:1 COG:VC1348 KEGG:ns NR:ns ## COG: VC1348 COG3437 # Protein_GI_number: 15641360 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and an HD-GYP domain # Organism: Vibrio cholerae # 289 630 65 415 441 215 37.0 2e-55 MTMQEAENKMEVLREIFDVVRLLKGSDLQAAGMESKLAGRENPCQCYAFWNKDKRCENCI SLKALEEKKQTSKIEFLDSDMYQVFARYLEIDSEPYVMEMLKKLDENTLTDEEGYEKLTE KLTVYSEKLYKDVLTGAYNRRYFEEKVKNMSLNAGIAVIDLDDFKLFNDTYGHDGGDLVL ITVVNVIRHYIRRTDILVRYGGDEFLLILPGIEKEVFSQKLRMIQKKIHATHIPGFNRLK LSVSIGGAMFTHGRLEEAITKADRLMYMAKGHKNIVVTRWDQKKNTDEMEKRNRLQLLVV DDSEMNREILKEILGKEYRILEACDGEEALKILEQYGPEISLVLLDIIMPKMDGFEVLAY MNRDKWIEDIPVIMISSEGSESYIRRAYELGASDYISRPFDAKVVYQRVINMIKLYAKQR RLIHLVTDQIYEKEKNNRMMTGILSQIVEFRNGESGLHVLHINILTQLLLEKLMRKSENY DLSWSQQHMIATASALHDIGKIGIDEKILNKPGKLTKEEFEAMKQHTIIGARMLDSLEMY HDEEMMKYAYEICRWHHERYDGKGYPDGLKGEEIPISAQVVSLADVYDALVSDRVYKKAY SHEKAIEMILNGECGMFNPLLLECLVEIQDKVRKELGIKDVNECLEYLEERNR >gi|226333005|gb|ACII01000014.1| GENE 12 12132 - 14006 2119 624 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577995|ref|ZP_04855267.1| ## NR: gi|253577995|ref|ZP_04855267.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 624 12 635 635 998 100.0 0 MKNKYLTRFLCLTLMSGMVLSGPASVLAAEDTAAYTADGSDFSDSGDFDSGSEDPAPADP TPAEPTPAPVDPTPADPTPAPVDPTPTPAEPTPAPADPTPTPTDPTPTPANPTPTPSSQI TKAAQDVIDRIKELNAQTITLEKKATVQSIRTAYNALSDTEKAQVSNYNLLVQAEATIAQ LEAQKNNANTNNPFTSNTSAAQTGTPVYYASNIHAGKDFYLDSLKNNYNLTFSDDFASVM DEIEKEYKEKNKVADASDVNGSKTTTSADSLLVRNWQDILAVYVYQQSQDGKTEFTLDSS CKKDLAKIFAEMNPIVRDKQDITHVTYANRKINYYIKKNKIAKKDRTILKKYVETDCKLL CAVVTAANGFVRESVGDDVSEERVNVISAAYSLVGKVGYFWGGKSTVIGEDPGWGTSEKV SAEGSKSTGTIRAYGLDCSGFVTWAVINGYQDKAMQEAVGDGTSDQWEKANVVTEADAQP GDLVFQKGPEAGSDNHVGILCGKTDAGDWIAVHCSSGKNGVTVGEAYSASFRYIRQPSFY PTQEQVQEMQSSSTVSTTTGDAAASTGTDIFSANVTVSNTLQDAMKSNTSDSTGFTTSES TEKTSILPDKIDLSKEKVAVFKAK >gi|226333005|gb|ACII01000014.1| GENE 13 14357 - 14947 360 196 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20285 NR:ns ## KEGG: EUBELI_20285 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 186 36 212 232 65 26.0 1e-09 MRKKSHILLARYLADQMQTTVSLQSHRKAFCLGSILPDIKPSFITRRHEFYGTFEDVKNR MKELTDIRPDESNQRVYWRRFGEVIHYMADYFTFPHNKTYTGSFSQHNHYEKVLKNRLKE CIQQGEAHAYLEPAIRFADFSTLIDYIEATHEKYLNKLRSVEEDIRFILNMCFQVVQGLI QICIGNKNFAGAIQAA >gi|226333005|gb|ACII01000014.1| GENE 14 15079 - 15318 336 79 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2810 NR:ns ## KEGG: EUBREC_2810 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 6 77 2 73 74 63 45.0 2e-09 MNIGNNIGNIMKITGAWNTFKSNHPKFPAFCQAVSRKGLKEGSIIEIAITTPEGEKIETN LKVKDSDLELLKQLSDLKM >gi|226333005|gb|ACII01000014.1| GENE 15 15469 - 16026 493 185 aa, chain - ## HITS:1 COG:aq_1194 KEGG:ns NR:ns ## COG: aq_1194 COG4636 # Protein_GI_number: 15606437 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in cyanobacteria # Organism: Aquifex aeolicus # 9 144 20 154 180 67 30.0 1e-11 MPLLEEEYRKEEKINGVIYDMSPSPNYQHGLVDGNIYRIISTGLQGTLCLAFMENLDYKY HAQENNDYVIPDVMIICDRKHLKGGSYTGTPRFIVETLSPATALRDMTVKKEIYQAAGVE EYWIISPKERAVQIYYLEDGKYDLKYSYILQDDPEEEHYNADTVVTLKDFPKISMTLAEM FENVE 
>gi|226333005|gb|ACII01000014.1| GENE 16 16224 - 17156 963 310 aa, chain - ## HITS:1 COG:CAC1825 KEGG:ns NR:ns ## COG: CAC1825 COG1897 # Protein_GI_number: 15895101 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Clostridium acetobutylicum # 1 301 1 301 301 380 59.0 1e-105 MPIKIQSDLPVKEILEKENIFVMDESRASHQDIRPIQILILNLMPLKEETELQLLRSLSN TPLQVDVTFMAVQSHEAKNTSVSHLNKFYQTFPELKNNKYDGMIITGAPVEQMEFEEVDY WKELVEIMDWTNTHVTSTIYLCWAAQAGLYHHYGLKKRKLNKKMFGLFWHKVMNRKIPLV RGFDDMFLAPHSRHTEVPIEDIHNCKELTVLAESDEAGLFLAMADGGRKIFVMGHPEYDR VTLDGEYKRDVSKNLPIEIPENYYKDNNPENRPLLMWRAHANNLYTNWLNYYVYQSTPFD LYGTPDFSEI >gi|226333005|gb|ACII01000014.1| GENE 17 17182 - 19140 1840 652 aa, chain - ## HITS:1 COG:SP1117 KEGG:ns NR:ns ## COG: SP1117 COG0272 # Protein_GI_number: 15900984 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Streptococcus pneumoniae TIGR4 # 8 643 4 646 652 366 34.0 1e-100 METAAIMRMKELVQKLDRAAKAYYQQDTEIISNREYDQMYDELQALEKETGTVLANSPTV SVGYEAVDQLPKEEHESPMLSLDKTKDREVLREFIGEHKTLLSWKLDGLTIVLTYENGGL AKAVTRGNGVTGEVVTNNARVFKNVPLKIPYQGRLVLRGEAIITYSDFERINESIEDVDA KYKNPRNLCSGSVRQLNNEITARRNVRFYAFTLVSADGVDFHNSRARQFEWLKEQGFDVV EYRTVTASTLDEAMEYFSTAITENDFPSDGLVALYDDIAYGDSLGRTAKFPRNAFAFKWA DEIRETHLLEIEWSPSRTGLINPVAVFEPVELEGTTVSRASVHNVSILKELQLGIGDTIT VYKANMIIPQIAENLTRSSKLEIPDTCPACGHGTQIQKVNDVEALYCTNPDCAAKKIKSF ALFASRDAMNIDGLSEATLEKFIARGFIHDFGDIFEIGKHRDEIVEMDGFGEKSFENLMT SLDKAKETTLAKVIYSLGIANIGLANAKVICRHFDDDLEKIRHADKEEISSIDTIGPVIA GSLTDYFSNEDNNRKLDHLMSHLTVKKEEKTGEQIFRNMNFVITGSVEHFANRAQAKEFI ESLGGKVTGSVTSKTNYLINNDTTSNSSKNKKAKELGIPILSEEDFLKMAQQ >gi|226333005|gb|ACII01000014.1| GENE 18 18938 - 19882 415 314 aa, chain - ## HITS:1 COG:VC2223 KEGG:ns NR:ns ## COG: VC2223 COG1187 # Protein_GI_number: 15642221 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # 
Organism: Vibrio cholerae # 6 257 20 270 340 258 51.0 1e-68 MDEKIRINKYLSEAGICSRREADRMIEEGRITVNGKKAESGQKVSLEDEICADNIPVHKN EKKVLLLFNKPRGIVCSTKQQFDETTVTDYLDYPLRVYPVGRLDKESQGLLLLTNEGDLV NKIMRAGNYHEKEYFVTVNKPVDREFVRRMSKGVPVLDTVTRPCRVVQTGECSFRIILTQ GLNRQIRRMCRYLGYEVQKLKRIRIMNLTLDGIREGEYREITAQEWEELNHLLESSTSET VIRTGEQNGNSSDHANERAGAKAGQGSKGVLPAGYRDHKQQRVRSDVRRASGSGERNRHC SGKQPNRKRRVRGS >gi|226333005|gb|ACII01000014.1| GENE 19 20064 - 21308 1063 414 aa, chain + ## HITS:1 COG:BS_yjeA_2 KEGG:ns NR:ns ## COG: BS_yjeA_2 COG0726 # Protein_GI_number: 16078275 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Bacillus subtilis # 214 398 27 212 217 151 40.0 2e-36 MSDHLPRTNRWLIALSVILCLILGGIFTRSSAVTSQADAISDAGAFVSEDQDSAEFTSEE TLTDSNDNFTDSSSGKLFGRTSSDSSETSPGGPALSTDISPEASEGSWASSGSNWMFLVD DKPYTGWFTDTDGKQYYMDETGIMQTGWTDIGKKRYYFDMDGILQTGTVIIDKKTYELDT DGSLKGYTPKKKSSKKKSSDKSATSDKSGTSTAKKSVALTFDDGPSSFTDRLLDCLEENN AKATFFMVGTEIASFPDEVKRMKKLGCELGNHTYDHKDLATLSSDEISSEIARVDEQLVN LTGEGASVVRPPYGSVNDTVKSTVGTPMILWSIDTLDWKTQDVESTVEEVMNNVKDGSII LMHDIFSTSVDAAEILIPQLIEEGYQLVTVHELASLHQTELSTGVTYGEFNRIK >gi|226333005|gb|ACII01000014.1| GENE 20 21456 - 22361 717 301 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578003|ref|ZP_04855275.1| ## NR: gi|253578003|ref|ZP_04855275.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 133 301 1 169 169 302 99.0 2e-80 MDDFREWLSDNLRYFMLGGAILIIVVVLFCGIRACSGSNKGNSGDEQKTTSEDQGNVPSS PISEGESDEKKEDANPMETADADVTALITSYYQALGEKDIATLKTLEDDFTPSDESKVTN LKDYIEGYEVGDVYTKKGMTDDSYVVYACFSYICQGVETKAPALTQFYVYKNSEGNWVIN NGALQDSEISAYMEKQLSDSDVSALIKKVQNELDQAQQSDPSLEEFLNGLGEEAGVSTEA EDGTMLTASEECNVRAEASTDADILGVISAGDQVQKTGTDGEWVQIDYDGQTGYIRGDLL E >gi|226333005|gb|ACII01000014.1| GENE 21 22732 - 24354 1933 540 aa, chain - ## HITS:1 COG:BH3449 KEGG:ns NR:ns ## COG: BH3449 COG1757 # Protein_GI_number: 15616011 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Bacillus halodurans # 41 532 3 505 516 327 40.0 3e-89 MKSNIKRLSLCLSALMLLILASPLTVLAAEEEVAASAKFYGTFWSLVPPLVAIALALITK EVYSSLFVGILVGALFYSNFNFEGTILHIFEGGMISVLTDSYNMGILIFLVILGTMVCLM NRAGGSAAFGRWAGKRIKSRVGAELSTIILGVLIFIDDYFNCLTVGSVMRPVTDKHNVSR AKLAYLIDATAAPVCIIAPISSWAAAVSGFVEGEDGITLFVHAIPYNFYALLTIFMMVAM VLMKVEFGTMGVHETNALKGDIYTTPARPYANAAEDEKSNPRGKVIDLLIPIVSLVICCV IGMIYTGGFFSGTDFVTAFSQSDASVGLVLGSFFGLVITIVLYLIRRVMNFRDCMACIPD GFKAMVPAILILTFAWTLKAMTDSLGASVFVSAIVKSSAGGLMNFLPAIIFVIGAVIAFA TGTSWGTFGILIPIVVNVFSGTNHELMIISISACMAGAVCGDHCSPISDTTIMASAGAQC EHVNHVSTQLPYAMVAAAVSFLTYIIAGFVQTAWIALPAGIILMAITLVIIRLKFDDLKM >gi|226333005|gb|ACII01000014.1| GENE 22 24472 - 25320 771 282 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578005|ref|ZP_04855277.1| ## NR: gi|253578005|ref|ZP_04855277.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 282 1 282 282 575 100.0 1e-162 MAGVTIQELGLRFVDACMKLYWGSCWYPLLFAVGLICTLVLGRKRKSGIFIGYTVFLLLT VYNPLVVKYLIAKAKFENEYYRFIWILPVIPAVAYYGVRLVTAFRKTWIKVLMAAAVLTG FVILGNPLDGVVTNFAMAENIYKVPNDLRAVCDVIHQNQEDDFPRVVFDGSLNSIVRQYD AGIATVISRNASIYRSGSTVAGNYDENSSFYKRQKALLDVIDYHIYEDKEGFQAALKKSK TDFVVTQIGLVDHDFLTECGCELIAQTESCYIYRFDYSNPRK >gi|226333005|gb|ACII01000014.1| GENE 23 25339 - 26430 707 363 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578006|ref|ZP_04855278.1| ## NR: gi|253578006|ref|ZP_04855278.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 363 1 363 363 644 100.0 0 MRELCLACAGFALEFIVFYAAGIVLVRILKVKGDCSLTFILGYLVYFAVFELVIVPMTLK WVSLTAAAYIWAGIMALVICAAVICTIKKQRRIRAGQTETCSGIFGSISEIWKNHSVMII LAGAVVLLQCLIVIFYEDTTVDAAYYVGTVSTSVYTDTLGRYNPFNGAIQKAFQARYVFS AYPMHNAVWCRLLGIHPIVQAKQVMSCMNVVTANLIIYQIGKRLFDGNRKKADLMLVFVC VLQLFCGTIYSSGTFFFTRSYEGKAILANIAIPSVLMCAVWYLQEKNSRNVWIILFVTAV SALTFSGSAIIFPIVIAAGMAPAAVMNKRFSGLVYCAVCMLPSALYAAVFFACRIGLLTL AAS >gi|226333005|gb|ACII01000014.1| GENE 24 26427 - 28844 2656 805 aa, chain - ## HITS:1 COG:Z4220_2 KEGG:ns NR:ns ## COG: Z4220_2 COG1529 # Protein_GI_number: 15803418 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli O157:H7 EDL933 # 10 792 12 789 790 412 34.0 1e-114 MKYVNQPVRKTDAMSLVTGKPVYTDDLAPRDCLIVKILRSPHANAWVEDIKTGTAEKVAG IACILTYKDVPQKRFTLAGQTAPEMSPDDRLILDRHVRFVGDAVAIVAGETEEAVDKALK RIRVDYRVETPVLDIHKAKDNPILVHPEDDWKLKIPLGGDNKRNLCASAVEEHGNVDEVL AKCKYTVERTYHTRANQQTMMETFRTACYMDHFGRLIVLSSTQIPFHVRRIVGRALDIPA SKVRVIKPRIGGGFGAKQTSVSEIYPAIVTWKTGRPSKMIFSRYESMICSSPRHEMEITV RAGADENGIIKAIDLYTLSNTGAYGEHSSTTVGLSGHKSIALYRHTEAHRFAFDVVYTNV QAAGAYRGYGATQGIFAVESAVNELAHKMGMDPVKVKEMNMPVEGGPLPGYPDVPYAQSC SMDRCMARAKEMMDWDSKYPCRDMGNGKVRGVGVAMAMQGSSIAGVDVGGADIKLNEDGS YTLALGCTDMGTGCDTVMAQIAADCLNTPMDNIVVFSVDTDISPYDSGSYASATTYTTGV AVMKACEELKKKICKLGAEMMEVDEGKVDFDGSCVFYDETLEISDGADGNAAGHMTDAEE GYSVDMDAISSENAQKKVSLEDIALKSTFFNNIELQVVKQHSSPISPPPFMVGMAEVEVD TETGSVDVLDYVAVVDCGTPINPNLARVQTEGGIAQGIGMALWEDVQYTDKGKIRNNSFM QYKIPTRQDIGNIRVDFECSYEKSGPFGAKSIGELVIDTPCPALAEAIYNATGVRLTELP MTPEKIAMEILKNRESIRSGTKVEE >gi|226333005|gb|ACII01000014.1| GENE 25 28847 - 29314 488 155 aa, chain - ## HITS:1 COG:SSO2433 KEGG:ns NR:ns ## COG: SSO2433 COG2080 # Protein_GI_number: 15899181 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sulfolobus solfataricus # 1 135 13 149 171 121 42.0 5e-28 
MNINFWLNGVNRHAEISPDTLLIDFLRKQGCFSVKRGCETANCGLCTVLMDGRPVLSCST LAARIEGKKIVTLEGMQREAQEFGSFLANEGAEQCGFCNPGFIMNVFAMLNEIKDPTEEQ IKEYLAGNLCRCSGFVSQTRSILKFLKYKKAQEEA >gi|226333005|gb|ACII01000014.1| GENE 26 29304 - 30122 846 272 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01495 NR:ns ## KEGG: EUBELI_01495 # Name: not_defined # Def: xanthine dehydrogenase # Organism: E.eligens # Pathway: not_defined # 1 265 1 261 264 290 55.0 3e-77 MLKIKSYVKVNSLAEAYELNQKKTARILGGMVWMKMGNRAISTAIDLSGLGLDAISENDD EFVIGCMTPLHDLETNEALNTYTHGAMKESLCHIVGVQFRNCATVGGSIYGRYGFSDVLT MFLGMDTWVELYDAGRIPLTEFVNMKKDNDILVNIIVRKEPLLTCYLSQRNIKTDFPVLT CAASVIGNEARTVIGARPARAMIVEDKKQILKNFRNMTKKQKEEAIEAFAEYAAENVPTA GNMRGSKEYRTLLVKVLTRRAWEAVGGMKNEY >gi|226333005|gb|ACII01000014.1| GENE 27 30722 - 31309 676 195 aa, chain + ## HITS:1 COG:no KEGG:BDI_1616 NR:ns ## KEGG: BDI_1616 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 2 192 3 200 215 145 40.0 6e-34 MSKRVITINRMLGSNGRLIGKALAEELGFAFYDKELIDIAAQKENIPFDLLAKVDEKRGS QWRFPIDHELQMDSDFHFVPMNDVLFDLQRKIILDAAEKEDCVIVGRCANHILDNKCLSV FIHAPFEYRVQTVIERTGREEKSAQKLVKKVDKERRTYYEYFTDKKWKDISQYHLAFDSS KFEQDKIIQMIKMAL >gi|226333005|gb|ACII01000014.1| GENE 28 31389 - 32489 1027 366 aa, chain - ## HITS:1 COG:MA1031_1 KEGG:ns NR:ns ## COG: MA1031_1 COG2006 # Protein_GI_number: 20089906 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 15 252 17 266 295 67 25.0 4e-11 MNKNEIYIKSGTNYKEMTKELLAQCGLASVISDKKMQIGIKPNLVSPSEASWGATTHPEI VAGIIEYLQENGYGNIAILEGSWVGDKTIEAYEVCGYRELTEKYQVPFWDMQKDKGIERD CRGMKLNVCERAANIDFLINVPVLKGHCQTKITCALKNMKGLIPNTEKRHFHAMGLHEPI AHLNAGLHQDFVVVDNICGDLDFEDGGNPVVMNRIWAGTDPVLIDSYVCQIMHYTTKDVP YIELAEKLGVGSTDLKNSRIIYCEENARKELPKSRKVVELQDAVEEVESCSACYGYLIPA LEMLKNDGLFEKLDTKICIGQGYRGKTGKLGVGACTCKFEHNVKGCPPTENQIYDFLKQY ILGENK >gi|226333005|gb|ACII01000014.1| GENE 29 32587 - 33135 208 182 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|52081538|ref|YP_080329.1| 
ribosomal protein S2 [Bacillus licheniformis ATCC 14580] # 1 172 3 173 174 84 33 1e-15 KFMATLQQLDMNILLWIQEHLRVDALTPFWRAVTFLGNGGWFWIVLCVLLICFGKTRKTG VTAALSLLSGFLITNLLIKNAVARPRPFDTYTQIIPLITRPKDYSFPSGHTCASFAVALV CLRMLPGKWGILPVVLAGMIAFSRLYLGVHYPGDVLAGFLVALLTSTVACRLMRRTFSPQ KS >gi|226333005|gb|ACII01000014.1| GENE 30 33422 - 34759 1252 445 aa, chain + ## HITS:1 COG:FN1789 KEGG:ns NR:ns ## COG: FN1789 COG0534 # Protein_GI_number: 19705094 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 6 437 25 457 459 269 37.0 7e-72 MLKLTIPIIIQNLLSAAVNSADVIMLNYVGQSAISAVSLAANYSNILFMVYYGLGTGASL LCAQYFGKNNMQAIHAIEGIALRFSIIISGIVALIAFTAPQLMMKIFTSDQELISIGSSY LRIMGITYLCWGITEIYLAILRSIGRVTISMALNMLAFILNIILNATFIFGLFGAPKLGA TGVAIATAASRLIELAACVIVSFLSKDVKLNLTFMFIRSKVLFGDFVRLSLPALGNDVSW SVAFSMYSVILGHLGTDAVAANSLVTVVRNIGSVFCFAIASAGTILLGNVMGKGNLEKSK VYASRMLKMTIIAGAIGGVIVFAITPLVLRFASLSDGAMHYLKYMLLINTYYIMGSAVNT ALIAGVFRAGGDTKFGLICDTIDMWVYAVPLGFFAAFVLKLPVLWVYFLLCTDEFVKWPW VFHHYRQGKWAQNITREDLFEQKAS >gi|226333005|gb|ACII01000014.1| GENE 31 34920 - 35351 670 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578015|ref|ZP_04855287.1| ## NR: gi|253578015|ref|ZP_04855287.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 143 1 143 143 151 100.0 1e-35 MRKRTTSRAKETIKEATTKATEAVKETVEKTTVKASEVIEEAKDAAADIAKAPVVEEAKE TVKKAVKEVAKNIDIKKFAETTIELPGFAVTMSSIEEAIKKDIKDKGIEGKEISVYVNVE QKAAYYTVDGQGSDDYRIDLNTL >gi|226333005|gb|ACII01000014.1| GENE 32 35460 - 37853 2384 797 aa, chain - ## HITS:1 COG:CAC1534 KEGG:ns NR:ns ## COG: CAC1534 COG0577 # Protein_GI_number: 15894812 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 5 797 3 745 746 204 23.0 5e-52 MKNPLRKRLPRELKGELGKYLVVFILMVASIGFVSGFLVADNSMLIAYNEGFEKYNIEDG NFRTAEQVHKTQREEIEALGVKLYDNYYVEEPLDNGSTMRFFKNRQQVDKVCLMKGELPA GTGEIAIDRMYADNNNLSVGDTLRSGKRTWKITGLVALSDYSCLFQNNNDSMFDAVKFGV SVVTEEEFDSLDQEKLQYNYSWIYDEKPKTEKEEKEVSEDLMEDMGKIVTLEAFVPRYLN QAITFTGDDMGGDKAMMIMLLYIIMVIMAFVFGITISNTIRKEAGVVGTLRASGYTRQEL ILHYMTLPVLVTFVGALIGNILGYTVLKDVCADMYYGSYSLPTYVTVWNGEAFGLTTLVP VVIMLVVNYGVLRHKLKLSPLKFLRRDLSGRKQKRAIYLSPKMKIFSRFRLRVIFQNMSN YMVLFIGILFANLLLMFGLLLPSALSHYQVEIQNNMLAKYQYMLQVPVSAVGGNKFDGLI SLLEFYMDSRTDNEDAEEFSAYSLNTLPGKYKSEEVLLYGIEPDSRYVTIDFNNTKDKKD EAGNKEKADNKNTANAEKESAAVYISSAYADKFLLHVGDTITLKEKYEKEKYSFKIAGVY DYTAALCVFMPRSELNDIFDLGEDYYSGYFSDTELTDIKSQYIGSVVDLDALTKISRQLD VSMGSMMGMVNGFAIMIYMVLIYLLSKIIIEKNAQSISMVKILGYTNGEISKLYIMSTSL VVVFCLLLSLPLETVIMKVLFREMMLSSISGWITLWIDPMIYVQMFAAGIITYGIVALLE FRRVKKVPMDEALKNVE >gi|226333005|gb|ACII01000014.1| GENE 33 37865 - 38563 282 232 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 21 197 20 200 223 113 37 3e-24 MFLQIQGIQKSFGAGDSRVDVLKGLDLDIERGEFCVLLGPSGSGKSTLLNIIGGIDSADE GNLVIDGERLADMSEKNLSLYRRKHLGYIFQMYNLIPNLTVRENIEVGAYLSDKPLDVDE LLHTLGIYEHQRKLPNQLSGGQQQRTAIGRAIVKNPDILLCDEPTGALDYNTSKEILKLI ETVNQKYGNTIVMVTHNDAIKDMADRVVKLRDGMIRKNYLNEHKIPAAELDW >gi|226333005|gb|ACII01000014.1| GENE 34 38962 - 40197 1293 411 aa, chain + ## HITS:1 COG:CAP0059 KEGG:ns NR:ns ## COG: CAP0059 COG1454 # 
Protein_GI_number: 15004763 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Clostridium acetobutylicum # 4 392 1 390 396 354 48.0 2e-97 MNTLRKIYCRAFQKAFHIAIPFLPYRKPKIAGSVKELPEIIMRHKCTHVLIITDGGIMKL GLTRRLEKALKEAGIPYTIYDKTVANPTTVNVREALELYHKEGCDAIIGFGGGSSMDCAK AVGACAVKPNQSLAQMKGILKVHKKLPLLMAVPTTAGTGSETTLAAVITDADTRYKYAIN DFPLIPRYAVLDSKVTLSLPPFITATTGMDALTHAVEAYIGNSTTIDTRRDALKAVKLIF ENIDIAYEHGDNIQARRNMLHASFYAGCAFTKSYVGYVHAVAHSLGGQYNVPHGLANAIL LPLVLREYGSCIDKKLHRLAIAAGLADKNTPDHEAAELFIRAIEEMKERFGIVNIVKEIQ ETDIPKLAHYADKEANPLYPVPKLMDASELEKFYYMLMSLTSKENTSEAEK >gi|226333005|gb|ACII01000014.1| GENE 35 40312 - 41691 1245 459 aa, chain + ## HITS:1 COG:alr3672 KEGG:ns NR:ns ## COG: alr3672 COG1012 # Protein_GI_number: 17231164 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Nostoc sp. PCC 7120 # 4 459 7 460 460 464 47.0 1e-130 MTETEIKEIVQKQRSYFYTGATLPVEKRLDALKKLKTCIQKYEPLINEALKRDLGKSNFE SYMCEVGLVLSELSYMIRHTPSYAKEKTVLTPLAQFASRSYKKPSPYGVVLVMSPWNYPF LLTIDPLIDAIAAGNTVVLKPSAYSPHTSRVIETIITECFDPQYVAVVNGGRAENNTLLN EHFDYIFFTGSQHVGREVMAKASVHLTPVTLELGGKSPCIVEKSANLKLAARRIVFGKYL NCGQTCVAPDYIYCDREIKDELIRQIQKQIRKQFGSTPLNNKNYGKFINEKHFTRICNLI DPSKVVCGGDNNPGALQIAPTVMDNVTFGDAVMQEEIFGPVLPVLTYDSLDEAIEKVNSM AHPLALYIFTSDKEAAEKVTSRCGFGGGCVNDTIIHLATSEMGFGGFGESGMGSYHGKDG FHTFSHYKSIVDKKTWLDLPMRYQPYRKIYEKLLRKFLK >gi|226333005|gb|ACII01000014.1| GENE 36 41701 - 42441 479 246 aa, chain - ## HITS:1 COG:BH1885 KEGG:ns NR:ns ## COG: BH1885 COG2357 # Protein_GI_number: 15614448 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 46 242 21 215 251 175 48.0 8e-44 MKYFEEIMSERKNRIEARKDELIRRSLLSEEFLEMIQKNKRPFDLLMSYYQCAVMEIETK FRVLNQEYSLEYDKNPIEGIKTRIKSYDSIVRKIRRKNIPMSLTAIEENIKDIAGVRVIC SFPEDIYELADSFLRQDDIVLIEKKDYIKNPKPSGYRSLHLIVQVPIFLQKNKKMVNVEV QFRTIAMDFWASLEHKLRYKKDIPADQAQQLQEELLACATQSAQLDNRMQEIRNQLVSRA DKGNQS 
>gi|226333005|gb|ACII01000014.1| GENE 37 42924 - 43592 415 222 aa, chain + ## HITS:1 COG:jhp0345 KEGG:ns NR:ns ## COG: jhp0345 COG1787 # Protein_GI_number: 15611413 # Func_class: V Defense mechanisms # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase and Mrr-like restriction enzymes # Organism: Helicobacter pylori J99 # 1 98 79 175 189 79 42.0 6e-15 MDGYEFEYQCAKILKRKHFSKIKVTQSSGDQGIDIIAFKHRKKYGIQCKYYTHPVGNKAV QEAYAGANFYDCDKAVVMTNITFTKSATQLAQKLDVELWPNCPVTPFYSLPFKIMSSLNA VLFLYAVLAFLSLKPEFDFIHLPFVPDTLTLVIILTASISGIFGWSFLLCNLIAAGAYLY LAFHTLILPTYAETGSCTNIHMLSLIPAGVYLLHAGWIKIRK >gi|226333005|gb|ACII01000014.1| GENE 38 43598 - 44182 432 194 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) [Campylobacter concisus 13826] # 9 181 3 180 185 171 50 1e-41 MKWTDGKIRCCWANPRNERYIRYHDEEWGIPVHDDRKLFEMLILECFQAGLSWECVLNKR DAFRVAFDNFDTEKVSAYTEDKLEALRSNPGIIRNRLKIQAAVTNAWAFLEIQKEYGSFS DYIWHWTDGKVVCETGRTSSELSDCISKDLKKRGMKFVGTTVIYAYLQAVGVLCSHEQGC FLANQEAEHYNSYS >gi|226333005|gb|ACII01000014.1| GENE 39 44279 - 47608 2906 1109 aa, chain - ## HITS:1 COG:SP0892 KEGG:ns NR:ns ## COG: SP0892 COG4096 # Protein_GI_number: 15900775 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Streptococcus pneumoniae TIGR4 # 33 1106 8 1087 1091 811 43.0 0 MTNFDFLKNEPKFKDFADVAISAERIFQFDYAASVLNCRRAMEFAVKWMYSVDSALVMPW DDRLVSLMSTDEFRDIIDADLLKRMEFIRKTGNFAAHTGQKVKKEQAELCLQNLYIFLDF VAYCYADHYTVSNYDSTLLQENSVHTADQSVSQPVGDQAGQSETDLKLEQLLKENAALRE ELTARREEQQQTYVTKPLDISEYETRKIYIDSMLQDAGWTEGQNWINEYRIPGMPNKSQV GYADYVLLGDDGKILAVIEAKRTCADVAKGRQQAKLYADLIEQKQGRRPVIFLTNGFDVR ICDNQYPERKIAAIYSKRDLEKLFNLRQMRSGLGQISIDKNIAGRYYQEAAIKAVCDSFD TRNRRKALLVMATGSGKTRTVIALCKVLSEQGWVKNILFLADRNSLVTQAKRSFVNLLPD LSVTNLCEEKDNYAAHCVFSTYQTMINCIDTIQDKEGKLFTCGHFDLVICDEAHRSIYNK 
YRDIFNYFDAPLIGLTATPKDEIDKNTYEIFELENGVPTYGYELAQAVKDGYLVDFMSVE TTLKFIEHGIVYDELSPEDKEVYEDTFEDENGEMPESIGSAALNEWLFNEDTIKKVLHIL MTNGLKIDYGNKIGKTIIFAKNHRHAEKIYEVFGKEYPHLTGYAMVIDNRLKYAQSAIDE FSDSKKLPQIAISVDMLDTGIDVPEVLNLVFFKKVMSKAKFWQMIGRGTRLCPALMDGED KQKFYIFDFCGNFEFFRMNKGRATANMIALQGAIFQLEFEIAYKLQGAAYQTERLITYRK SLVEHMAGKVQELNRENFAVRQHLKYVDIYAVEENYKVLTYEDTLAVRDELSPLIRPEND DAKALRFDALLYGIELAYLAEKKYGRGRKDLLKKVSALAGISNIPEIMMQAELIDKILHG GYLDNAGIPEFEHIRENLRDLMKYIPDADRMIYDTNFDDEILNMEWKDSELENDDLKNYK AKAEYYVRQHQDNAVIAKLKTNIPLDNNDMRTLEKILWSEVGTKQDYEAEYGQKPLGEFV REIVGLDMNAAKAAFAQYLDDTCLDSRQIYFVNQIVEYIVHNGMLKDLSVLQESPFTDQG SIVDIFTDLNVWLGIRKTIEQVNANAIAA >gi|226333005|gb|ACII01000014.1| GENE 40 47920 - 48417 437 165 aa, chain - ## HITS:1 COG:SP0890 KEGG:ns NR:ns ## COG: SP0890 COG0582 # Protein_GI_number: 15900773 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 1 165 1 164 321 138 43.0 4e-33 MEEKVVKILNEMSEYLSIAQMKKLQEVMLKVCAENEADKVEIPNKDFLEMFLDAKKIEGC SERTLQYYRVTVEHLLSQMGNSVRKVTTEEIRTYLADYQKNSNCSNVTIDNIRRNISSFF SWLEEEDYILKSPMRRIHKIKTKTVVKNTISDESIEKLRDHCTEI >gi|226333005|gb|ACII01000014.1| GENE 41 48463 - 48924 397 153 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_0673 NR:ns ## KEGG: NT01CX_0673 # Name: not_defined # Def: hypothetical protein # Organism: C.novyi # Pathway: not_defined # 5 147 3 145 151 166 59.0 3e-40 MVGLGREDKKDNDLFYTCSLIDYISRKTKNIRADVVNQLGRKRLEKIYDLADVYHCDNID QVSEDFIAEAHIPTGRFDNVKECKYSIPSHWDIGKVYKRLIKQVAASEKIEVVDALIKVY NSFISEKIDDYNSSVYYENPSYIYESYRENKML >gi|226333005|gb|ACII01000014.1| GENE 42 48924 - 49382 326 152 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_0674 NR:ns ## KEGG: NT01CX_0674 # Name: not_defined # Def: hypothetical protein # Organism: C.novyi # Pathway: not_defined # 1 152 1 154 158 227 68.0 1e-58 MAKMTVYHGGYQPVKKPEIRKGRNTKDFGYGFYCTIIKEQAQRWAKRYETKAISIYDVRM NTNLNIKEFTEMSEEWLDFIVDCRSGKDHNYDIVIGAMADDQIYNFVSDYIDGTITREQF WVLAKFKYPTHQIAFCTKDALKCLEYRDCEVL >gi|226333005|gb|ACII01000014.1| GENE 43 
49429 - 50610 426 393 aa, chain - ## HITS:1 COG:MA2103 KEGG:ns NR:ns ## COG: MA2103 COG0732 # Protein_GI_number: 20090947 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Methanosarcina acetivorans str.C2A # 128 392 23 284 290 107 28.0 4e-23 MKKIRLGDACDILNGFAFKSENYVDSGIRVIRIANVQKGYIEDNTPVFYPLETNELDKYM LEEGDLLMALTGNVGRVAILKKEFMPAALNQRVACLRLKTDRVAKDYLFHVLNSAFFEQQ CIQSSKGVAQKNMSTEWLKDYEIPMYPKEQQELIADILDKTRNIIISRNYELKKLDDLIK ARFVEMFGDAYLNEFGWKKIKIKNAVTVEPQNGMYKPQSDYVTDGSGIPILRIDGFYDGV VTDFSSLKRLRCSENERQKYLLYEDDVVINRVNSIEYLGKCAHINGLLEDTVYESNMMRM HFDSTRFHPVYVCRLLCSRFVYDQIVNHAKQAVNQASINQKDVLDFDIYEPPLKLQIQFA DFVRAVDKSKVEIQKALDKTQMLFDSLMQEYFG >gi|226333005|gb|ACII01000014.1| GENE 44 50630 - 52135 1226 501 aa, chain - ## HITS:1 COG:SP0886 KEGG:ns NR:ns ## COG: SP0886 COG0286 # Protein_GI_number: 15900769 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Streptococcus pneumoniae TIGR4 # 1 498 1 496 497 580 59.0 1e-165 MVTGELKSKIDNLWEIFWTGGLTNPLDVIEQMTYLMFIRDLDDADNIHAKEAAMLGLPHK SIFAGEIQIGDRKIDGSQLKWSTFHDFPAAKMYSTMQEWVFPFIKNLHGDKESAYSKYMR DAIFKVPTPLMLDKIVTTMDAIYEQMEQIKSADTRGDVYEYLLSKLATAGVNGQFRTPRH IIRMMVEMMDPKADEIICDPACGTSGFLVAASEYLRDKKKQEVLFNRQNKEHYMNHMFHG YDMDRTMLRIGAMNMMTHGVENPYIEYRDSLSDQNTDKEKYSLILANPPFKGSLDYDIVS ADLLKVCKTKKTELLFLALFIRMLKIGGRCACIVPDGVLFGSSTAHKAIRKALVEENRLE AVISMPSGVFKPYAGVSTAILIFTKTGHGGTDKVWFYDMKADGFSLDDKRTETKENDIPD IIERFRNLDKETDRKRTDQSFFVPKEEIAENGYDLSINKYKEIEYVPVEYPSTQEIMTEL HEIEMKIGEEMQALEELLGIN >gi|226333005|gb|ACII01000014.1| GENE 45 52137 - 53192 582 351 aa, chain - ## HITS:1 COG:MA1868 KEGG:ns NR:ns ## COG: MA1868 COG3177 # Protein_GI_number: 20090718 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 35 311 152 418 473 90 30.0 4e-18 MRSFDFEKEYSKLLTPDIVSYITQIHEYRGRYSADNVQKELLSELLEIARIQSTEASNRI EGIITTDDRLKKIVREKTLPKTRSEKEIAGYRDVLATIHENYEYIPVSSNMILQLHRDLY 
KFSGNGYGGNYKSVDNVIAEELPDGTKRVRFQPVPAWETAETVNDLCNAFKDAVNKQFMD ILLLFPMFILDFLCIHPFNDGNGRMSRLLTLLILYKSEYYVGQYISIERLISDTKENYYE ALQESSWEWHEGRNDYAPFVKYMLGVIVAAYREFADRVEILTDKTISKSDRIEDIIRNSY GRITKAEIAAQCPEISQTTIQRTLNELLKNNQIIKIGGGRYTSYVWNRENE >gi|226333005|gb|ACII01000014.1| GENE 46 53364 - 54041 565 225 aa, chain - ## HITS:1 COG:no KEGG:CA_C0729 NR:ns ## KEGG: CA_C0729 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 8 200 1 200 448 86 30.0 8e-16 MNKKDTKINRNDMLELTRRMTLSRNCFVRAAGAYIDDEGYIDNTFNVSFKNMSKHDQQVN LDLAKTIPFSKTNEQLKCYKFPEEDMAYDSIHKLLMALSQYGLKDDLLVQTLCEQLAEGI CDTKDSSDKTNRLQGQEFGIYIYHGIYDIPKKGSDHSEQWESEEIYSFLICVVAPVDKDY NAGTPKCGFIFPAFEDRSAEPGKIDVFFEIPEYPDEGFLNRIIGQ >gi|226333005|gb|ACII01000014.1| GENE 47 54034 - 54159 182 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKKGYSRPGLFGTMNYYNENGHKTGHSDKGIFGGWNHFDE >gi|226333005|gb|ACII01000014.1| GENE 48 54257 - 55573 898 438 aa, chain - ## HITS:1 COG:no KEGG:TBFG_12832 NR:ns ## KEGG: TBFG_12832 # Name: not_defined # Def: hypothetical protein # Organism: M.tuberculosis_F11 # Pathway: not_defined # 6 436 3 410 415 152 29.0 2e-35 MAKKYLFSPIGNTDPIKYLYDGSMIHICRYYQPDVVYLYLSKEMMENHKKDNRYVKTLEL LGEFLHHKFEIHIIENSDMIDVQQYDIFYNEFHRIIAEIEEQKGPEDILLVNMASGTPAM KSALLVMATLSEYRFLPIQVSTPQKKSNLEHEERDEYDVDANWELNMDNEEAAENRCQEV KCLNLMRLLKIDMIKKHLLAYDYHAALAVGKEIKEDLSPVAYQWLETADARSLLDWTRMN RVLPENNGIITAVRGENEKKVLFEYMLALDLKVKRGEYADFIRAITPLGVDLLEIVLEQS CDIDITRYYKRNNQRIWDKNRLVGEILDILNQKFYPFRYGPVYSAHLLEIIQKKCTDTLM VQRIQELVNIEQNVRNVAAHNIVSVTPEWIKERTGKSVDDILWILKYVCEQVKINTRKEN WNSYDSMNKRIINELDKD >gi|226333005|gb|ACII01000014.1| GENE 49 55607 - 55747 109 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578033|ref|ZP_04855305.1| ## NR: gi|253578033|ref|ZP_04855305.1| CRISPR-associated protein [Ruminococcus sp. 
5_1_39B_FAA] # 4 46 1 43 43 71 97.0 1e-11 YNKLIEEIPRYIDKTEDNVRIYKITGKGKVTSWGEVPEFDEEIVLI Prediction of potential genes in microbial genomes Time: Sat May 28 19:11:07 2011 Seq name: gi|226333004|gb|ACII01000015.1| Ruminococcus sp. 5_1_39B_FAA cont1.15, whole genome shotgun sequence Length of sequence - 2143 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 13/0.000 - CDS 1 - 193 116 ## COG1343 Uncharacterized protein predicted to be involved in DNA repair 2 1 Op 2 . - CDS 171 - 1172 641 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair Predicted protein(s) >gi|226333004|gb|ACII01000015.1| GENE 1 1 - 193 116 64 aa, chain - ## HITS:1 COG:MT2883 KEGG:ns NR:ns ## COG: MT2883 COG1343 # Protein_GI_number: 15842357 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Mycobacterium tuberculosis CDC1551 # 23 62 25 64 113 60 72.0 9e-10 MSQLSYDDEDYYFQISDELESDKEFVLIIYDIVDNRKRVKLAKLLSGYGKRVQKSAFEAM LTTQ >gi|226333004|gb|ACII01000015.1| GENE 2 171 - 1172 641 333 aa, chain - ## HITS:1 COG:MT2884 KEGG:ns NR:ns ## COG: MT2884 COG1518 # Protein_GI_number: 15842358 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Mycobacterium tuberculosis CDC1551 # 1 325 1 324 338 187 34.0 2e-47 MSLLYVNDSGATIGIEGNCCTVKQKDGSKRMLPIESLDGITIMGQSQMTTQCAEECMQRG IPVSYFSKGGKYFGRLISTGHVNVERQRKQCALYDTGFAVELAMKILSAKIKNQSVVLRR YEKSKGLNLEEEQKMLAICRNKVLTCDRIEEMIGFEGQAAKYYFKGLSACIDENFTFQGR NRRPPRDEFNSMISLGYSILMNEVYCKVEMKGLNPYFGFIHRDAEKHPTLASDMIEEWRA IIVDATAMSMINGHEILKDHFYFNMDEPGCYITKDGLKLYLNKLERKFQTEVRYLKYVDY AVSFRRGIFLQMEHLAKAIEEGDASLYEPIIIR Prediction of potential genes in microbial genomes Time: Sat May 28 19:11:13 2011 Seq name: gi|226333003|gb|ACII01000016.1| Ruminococcus sp. 
5_1_39B_FAA cont1.16, whole genome shotgun sequence Length of sequence - 23043 bp Number of predicted genes - 28, with homology - 25 Number of transcription units - 9, operones - 5 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 672 - 1835 654 ## COG1332 Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) 2 1 Op 2 7/0.000 - CDS 1768 - 2559 289 ## COG1567 Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) - Prom 2647 - 2706 1.7 3 1 Op 3 . - CDS 2709 - 3374 478 ## COG1337 Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) 4 1 Op 4 . - CDS 3387 - 3782 363 ## Swol_1755 hypothetical protein 5 1 Op 5 2/0.000 - CDS 3772 - 6156 1325 ## COG1353 Predicted hydrolase of the HD superfamily (permuted catalytic motifs) 6 1 Op 6 . - CDS 6149 - 6598 203 ## COG5551 Uncharacterized conserved protein - Prom 6634 - 6693 5.9 - Term 6943 - 6991 2.0 7 2 Tu 1 . - CDS 7061 - 7213 138 ## EUBREC_1425 PTS system, glucose subfamily, IIA subunit - Term 7239 - 7288 11.6 8 3 Op 1 . - CDS 7339 - 9051 516 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 9 3 Op 2 . - CDS 9059 - 9241 152 ## 10 3 Op 3 . - CDS 9267 - 9485 208 ## gi|253578044|ref|ZP_04855316.1| predicted protein 11 3 Op 4 . - CDS 9499 - 9705 139 ## gi|253578045|ref|ZP_04855317.1| predicted protein 12 3 Op 5 . - CDS 9721 - 9906 116 ## 13 3 Op 6 . - CDS 9903 - 10397 499 ## gi|253578046|ref|ZP_04855318.1| predicted protein 14 3 Op 7 . - CDS 10379 - 11206 379 ## BLJ_1240 hypothetical protein - Prom 11415 - 11474 7.3 + Prom 11322 - 11381 9.9 15 4 Tu 1 . + CDS 11409 - 11642 237 ## Elen_1909 putative transcriptional regulator, XRE family + Term 11698 - 11734 0.5 - Term 12205 - 12244 5.2 16 5 Tu 1 . - CDS 12250 - 12963 460 ## CLJ_A0004 putative resolvase - Prom 13010 - 13069 5.3 - Term 13104 - 13149 5.1 17 6 Op 1 . 
- CDS 13290 - 13769 385 ## gi|253578050|ref|ZP_04855322.1| predicted protein 18 6 Op 2 . - CDS 13787 - 14599 509 ## gi|253578051|ref|ZP_04855323.1| predicted protein 19 6 Op 3 . - CDS 14596 - 14889 91 ## gi|253578052|ref|ZP_04855324.1| predicted protein 20 6 Op 4 . - CDS 14882 - 15247 170 ## gi|253578053|ref|ZP_04855325.1| predicted protein - Prom 15302 - 15361 1.9 - Term 15679 - 15742 0.7 21 7 Op 1 . - CDS 15751 - 16824 692 ## Swol_1312 hypothetical protein 22 7 Op 2 . - CDS 16842 - 17027 70 ## gi|253578055|ref|ZP_04855327.1| predicted protein 23 7 Op 3 . - CDS 16963 - 17157 64 ## 24 7 Op 4 . - CDS 17172 - 17378 91 ## gi|253578056|ref|ZP_04855328.1| predicted protein 25 7 Op 5 . - CDS 17344 - 17610 306 ## gi|253578057|ref|ZP_04855329.1| predicted protein - Prom 17649 - 17708 5.9 - Term 19027 - 19064 1.1 26 8 Op 1 1/0.000 - CDS 19075 - 21012 870 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 21036 - 21095 5.4 27 8 Op 2 . - CDS 21180 - 21551 336 ## COG2337 Growth inhibitor - Prom 21631 - 21690 3.5 - Term 22065 - 22107 8.5 28 9 Tu 1 . 
- CDS 22137 - 22565 259 ## MGAS2096_Spy1146 putative cytoplasmic protein - Prom 22793 - 22852 1.8 Predicted protein(s) >gi|226333003|gb|ACII01000016.1| GENE 1 672 - 1835 654 387 aa, chain - ## HITS:1 COG:MT2886 KEGG:ns NR:ns ## COG: MT2886 COG1332 # Protein_GI_number: 15842360 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) # Organism: Mycobacterium tuberculosis CDC1551 # 24 386 1 371 375 108 26.0 2e-23 MKKETVENIPYFDMQKLSSWGYNMERKLKTYQIHLKVNGPVFVGDGNEIQKKEYMFLNRN TIGVIDGAKFYMLAKKLHLQNDFERFMIDDTREDLKHWCFRNHVSQNDLKNCMKYVENVG DRSEEKGKLQVMTCITDPYGNPYIPGSSLKGMLRTILLGRDILQHREKYRTDTRQIRSDL EVNRINRRILNNNIVKIEKNAFNSVRSSGKETVDFDIMSGVIVGDSEPLSREDIILCQKW EQHTDGTYKTLNLLRECIKPGTVIKSTLTIDETLCNIKKKDILEAVQLFYEQYYQNFQKK FPRSDRRKPNTVFLGGGSGFVSKTVIYPLFGEKEGIETVKNIFDRTNVPKTHQHYKDTRM GVSPHILKCTRYQGKEYMMGECELNII >gi|226333003|gb|ACII01000016.1| GENE 2 1768 - 2559 289 263 aa, chain - ## HITS:1 COG:MT2887 KEGG:ns NR:ns ## COG: MT2887 COG1567 # Protein_GI_number: 15842361 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) # Organism: Mycobacterium tuberculosis CDC1551 # 18 263 60 297 302 160 35.0 3e-39 MKKQDRFLKQVSDGKIIFSDAFPYIGKNYYIPKPMIAVQTEDESKQGDSRQKKKFKNLNY IPVSTLADFVNGTFPEEHMEDMKYLGLYDMKVSVGIHGMEEPEPFRVNTWHFNTGTGLYV LAGYENESDRELLDELFESLQYTGIGGKKSSGLGRFVYKVCKVPEDMMGYLKKKSEKNIL LSTALPEDEELEQVMKGSSYLLIKRSGFVDSSEYALQQMRKRDLYVFSSGSCFAHTFRGR IIEERNGGKHPVFRYAKAFFMGV >gi|226333003|gb|ACII01000016.1| GENE 3 2709 - 3374 478 221 aa, chain - ## HITS:1 COG:MT2888 KEGG:ns NR:ns ## COG: MT2888 COG1337 # Protein_GI_number: 15842362 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair (RAMP superfamily) # Organism: Mycobacterium tuberculosis CDC1551 # 2 212 5 230 236 194 46.0 9e-50 
MYGKIRISGNILVKTGMHIGGSGAFAAIGAVDSPIIRDARTNLPMIPGSSLKGKMRSLLA KEFNERVVEPNEDHERITRLFGSSKRGNIKRSRLLFSDMVLKNEEELRDAGLQSMTEVKF ENSISRATAVANPRQIERAVRGSVFKLDLIYELEEEQEFLEDMETLAEAMKLLQYDYLGG NGSRGYGKIQFQDIYTDAVVGHVDNTLLEKCNAILNEAVKE >gi|226333003|gb|ACII01000016.1| GENE 4 3387 - 3782 363 131 aa, chain - ## HITS:1 COG:no KEGG:Swol_1755 NR:ns ## KEGG: Swol_1755 # Name: not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 7 130 38 158 158 102 44.0 6e-21 MRINENNYVDKAEEAIKSLVEESKQKCRGKVNIVTTSKIRNLLAMTADIYNQVLTYTSEK LDDEICGRIEYLRIRFIYECGREPKVKAFVKQAEILEILKEIRQSKKNYLLFSKYMEALI AFHKYYGGKEQ >gi|226333003|gb|ACII01000016.1| GENE 5 3772 - 6156 1325 794 aa, chain - ## HITS:1 COG:MT2890 KEGG:ns NR:ns ## COG: MT2890 COG1353 # Protein_GI_number: 15842364 # Func_class: R General function prediction only # Function: Predicted hydrolase of the HD superfamily (permuted catalytic motifs) # Organism: Mycobacterium tuberculosis CDC1551 # 1 792 6 817 817 580 40.0 1e-165 MIDQKIKLITGSLLHDIGKVVYREGSDRRNHSVSGYDFLKEEGKIEDKEILDCVRYHHFS EIKSAHLKEDNLAYIVYLADNIAAFADRRKKEDTEEKGFDLSVPLQSVFNVLNGNNQSFY YQPGDMDDQGKINYPASEKKPFSREFYMKICQRMLDNFRGMNWSEEYLNSLLAVMEANLS YMPSSTSNEELSDISLFDHVKLTAAISSCIYDYLNENHLSYKTELFDKKDFYDRNAFLLC SMDISGIQSFIYTISSKNALKTLRARSFYLEIMMEHIIDLLLDRLQLSRANLLYSGGGHC YLLFANTTETVIQIQKFQEELKEWLLKQFQTDLFVAFGYVQCTGNAFRNVPEGSYTELFR KMSKMLSLQKQHRYTAREIIRLNNRKEQDYSRECKVCKKIGHVDEEGVCPLCRKIEKLSK NVLYADFFSVVLENPDEREDAMPLPGGYCLAADDEKKLCRRMENDDYFVRAYSKNKLYTG KHIATKLWVGDYSTGSTFEEFAREAEGISRIGVLRADVDNLGQAIVSGFHNAKNGDRYMT LSRTATLSRQLSLFFKYYIRFILENGEYSLEGKNGGKKRQATIVYSGGDDVFIVGAWNDI IELSVDLRRKFEQYTQGTLSISAGIGVYDFSYPIAAIAEETGMMESESKRMPEKNAVTLL QDGEIHLVDDGDEEKEISDGTYSWKELEEGVVQEKYRALCDFFEGIDETRGMSFLYRMME LVRGHEEKINFARMMYLLSRLEPTEEGTKKEKYRQLSQKMYRWIQSDQDCRQLKTAINLY AYIHRTKGEHRDEN >gi|226333003|gb|ACII01000016.1| GENE 6 6149 - 6598 203 149 aa, chain - ## HITS:1 COG:MT2891 KEGG:ns NR:ns ## COG: MT2891 COG5551 # Protein_GI_number: 15842365 # Func_class: S Function unknown # Function: 
Uncharacterized conserved protein # Organism: Mycobacterium tuberculosis CDC1551 # 13 141 173 300 314 101 38.0 6e-22 MHLKNDNELITEFYEKKCPKYLEIKFQTPTAFKSDGKYVIYPDLGLIYASLMRKYSAVSE AFDMFDEETLEALVEQSEIVRYRLQTVPFPLEKVQITGFTGSICIHIRGPETMARYLRML FKFGEFAGVGIKTGMGMGAMKYVRRNEDD >gi|226333003|gb|ACII01000016.1| GENE 7 7061 - 7213 138 50 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1425 NR:ns ## KEGG: EUBREC_1425 # Name: not_defined # Def: PTS system, glucose subfamily, IIA subunit # Organism: E.rectale # Pathway: Amino sugar and nucleotide sugar metabolism [PATH:ere00520]; Phosphotransferase system (PTS) [PATH:ere02060] # 7 45 414 452 750 68 76.0 8e-11 MSESFNYVIFAFMIRRFDFKTPECEDDDTETKLYTKADVNVRKEAGKIAG >gi|226333003|gb|ACII01000016.1| GENE 8 7339 - 9051 516 570 aa, chain - ## HITS:1 COG:SA0057 KEGG:ns NR:ns ## COG: SA0057 COG1961 # Protein_GI_number: 15925764 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 9 481 7 463 542 126 27.0 1e-28 MIEEKPTNRKVVIYARVSTEHEAQLSALENQKDWYKPILQQHPEWDIVKMYVDEGITGTS AKKRPQFMQMIQDASSGNFDLILTREVSRFARNTVDTLQYTRELKSKGVEVFFLNDNIKT FDGDGELRLTIMATLAQDESRKTSVRVKSGQQTSMENGVFYGNGNILGYDRVGKDMVINP EQAKTVRMIFDWYLDGWGMRKIQFELEKAGRLTAMGKSNWHVSNISKILRNSFYCGIITY HKEFTPDFLEQKKIRNFGDMEFTQVRGNHKPIITEEEYDQAQKRINSRRKTLSVDTSGQR MYGEKPPSDVWVKLLECECGHKFNRKMWHNTTEGKQYGYQCYSSIRTGTVRTRLNKGLPI DGICQTPMIAGWKLQMMAKYIFKNYLSDTDQVLSLAESMLEKHIDDEEEVCDNTDIINQK KEELQKLLKRLDNLIEMRADGELTRDIFILKKESTENSISQLKKELAELEPSEIESEIDE VTSKEKITILKYALDQYTNFDSDADIPESVIEAFVEKIIARKDGFDWYLRFNPEESPMGC LVDGKRKRGAKVSSFCSPQHRQLSLRRMHN >gi|226333003|gb|ACII01000016.1| GENE 9 9059 - 9241 152 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQKANGKIFSALKMIEALYLDNKISYIILRNILNDYQDIIDLSEFSCYTDSHIKAKCGNM >gi|226333003|gb|ACII01000016.1| GENE 10 9267 - 9485 208 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578044|ref|ZP_04855316.1| ## NR: gi|253578044|ref|ZP_04855316.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 72 1 72 72 119 100.0 4e-26 MEEPIKERIISTMFERWCSQIDHDSEQREAMKEAMEKPSEETVFHLAAVTEARAYKAGFR DCLDMIKELFNK >gi|226333003|gb|ACII01000016.1| GENE 11 9499 - 9705 139 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578045|ref|ZP_04855317.1| ## NR: gi|253578045|ref|ZP_04855317.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 68 1 68 68 122 100.0 6e-27 MDMIQSLYSLWTSSRTPLPETKVSQLLPDKEAQRQLCTYVEEAEKRAFAVGFKAALGLVQ QLDDIEDI >gi|226333003|gb|ACII01000016.1| GENE 12 9721 - 9906 116 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNNQRRKEVLLTYSEWERIFERRVKLYMGRKLHDFMYGTALFLLLVVPPLLMMVHYIVTG Y >gi|226333003|gb|ACII01000016.1| GENE 13 9903 - 10397 499 164 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578046|ref|ZP_04855318.1| ## NR: gi|253578046|ref|ZP_04855318.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 164 1 164 164 281 100.0 7e-75 MASRKVNLTLELPEEDLKDILFKVAADGTTLSELLEGFISDLVCGAHHGSDECDRAIAYY DRCSYGLGQEDSFLRFLLKSGYMDEYLALLDDIKTYQGWELQDGEVYGKELAAAAEEKAS YYEEWAEGYKVPPQTIEEAYQQVEEWREGYEAFMKSCERARDKA >gi|226333003|gb|ACII01000016.1| GENE 14 10379 - 11206 379 275 aa, chain - ## HITS:1 COG:no KEGG:BLJ_1240 NR:ns ## KEGG: BLJ_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 9 263 83 370 381 83 27.0 7e-15 MTEYSGGKFRKKKVNFSMISNEIIRDDTVSLKTKGLYALIQSYITMDDFTLYKGFLMSKC PEGRRSFDSAWNQLKQSGYLIQYRMKDEKNHFYYEYELLDVPVHQNVTPEKTGTSVVSVC DSSDCVVHKVDDTQNDAIYKTLQNNTVQNNTVSSHISESAVMDMIGYDASLHDDFIENLV LLIVEVLNMPDQATIRVNQTDQKAELVKERFRKLRYKHIEYVRLVFSEFTGEISSMRNYL ITSLYNAVSTCDIYFKQRVQHDMYEKGDKTWQVVK >gi|226333003|gb|ACII01000016.1| GENE 15 11409 - 11642 237 77 aa, chain + ## HITS:1 COG:no KEGG:Elen_1909 NR:ns ## KEGG: Elen_1909 # Name: not_defined # Def: putative transcriptional regulator, XRE family # Organism: E.lenta # Pathway: not_defined # 9 71 4 68 73 68 58.0 8e-11 MEKEKKNENVKYYKLFDMLNRKGLKKKDLLSVLSPATITKLSKHGIVTTDTIGRLCDFLD CQPGDIMEYEPPAKEEK >gi|226333003|gb|ACII01000016.1| GENE 16 12250 - 12963 460 237 aa, 
chain - ## HITS:1 COG:no KEGG:CLJ_A0004 NR:ns ## KEGG: CLJ_A0004 # Name: not_defined # Def: putative resolvase # Organism: C.botulinum_Ba4 # Pathway: not_defined # 1 232 1 224 230 146 41.0 8e-34 MGKTYGYARISKPKQSLQLQIDAIKKMAADAVIFSEAYTGSTNDRPEWKKLLKKVEKGDT IIFYMVSRMSRSAEEGTKDYFDLYNKGINLIFIKDRHIDTESYKQSFEKAGFSVDFDNDG SAESNLITKILDALKEFMQAKVNDDIKKAFEEAEKELKNNHKRTSDGMKASGAGKKIAEA RLGNRYETKKSKEMKGKIRKMAKCYDGTMTDKEVMETLKLARNTYYKYKREIDAEMQ >gi|226333003|gb|ACII01000016.1| GENE 17 13290 - 13769 385 159 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578050|ref|ZP_04855322.1| ## NR: gi|253578050|ref|ZP_04855322.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 159 1 159 159 325 100.0 4e-88 MEIREFSRNDRVKVDNCYAWDIGFRSSVTREDITIPAGAKGYALLTMDQVEEEIMRQNIF FVGTDGLGDHAGLKICDPDMYKYLFRTEDKTHHISKDTLLEVLKLSGKQKFMDAVAELIA TSSEARMCLYFMTDIDWSEYLGWKYDVLVERCREFMPVA >gi|226333003|gb|ACII01000016.1| GENE 18 13787 - 14599 509 270 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578051|ref|ZP_04855323.1| ## NR: gi|253578051|ref|ZP_04855323.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 270 14 283 283 543 100.0 1e-153 MIENSCTFSLLAGDVPGRRKVKVVLHEIYPDRTHWNRNGITYLEQYTRDNADSIKGMPLC AEFLDDDKEVPYGHGLTGQLRNMPVFEDSVQVGSFEDWSIEDIEIDGEVHRCLCATGYIN ESRYPNFVRWLEKQLESGSPVYGSVEFSGTKDNDGEIIYDGGWKEQGRVPMVYDYIGHSI ISIAPADDVALVTAFDMARGEFDGIFDDKKVESCTSDFDDIFDMPDVISVVGASKEEKDR PLSKREYDLVIAAGLMEEKEELENFDNIFK >gi|226333003|gb|ACII01000016.1| GENE 19 14596 - 14889 91 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578052|ref|ZP_04855324.1| ## NR: gi|253578052|ref|ZP_04855324.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 97 1 97 97 189 100.0 5e-47 MTRSEMIRRYSEIQCCSYEMAADAVDTVVTLISEGLEQDGKVSLKYFGRFEIRPFKAHRG YDFSKGESIEIPAHNTIRFTPCEAWRKEVLRKEIVKK >gi|226333003|gb|ACII01000016.1| GENE 20 14882 - 15247 170 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578053|ref|ZP_04855325.1| ## NR: gi|253578053|ref|ZP_04855325.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 121 89 209 209 233 100.0 2e-60 MDLNEYGRSNIRNMNSKKFAENRGRTLTQSAFIRRLARSNDLQIKDADIITKQVFRELIS CLKDGFCIKFEGFGTFGTYTRKARSVKHPVTQERCTIKEKAVVKFKPSAKLNASFQEGKD D >gi|226333003|gb|ACII01000016.1| GENE 21 15751 - 16824 692 357 aa, chain - ## HITS:1 COG:no KEGG:Swol_1312 NR:ns ## KEGG: Swol_1312 # Name: not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 15 286 21 303 356 151 36.0 4e-35 MSKILDLDEVDTSQLSEGLEVKNYRALCELVGEEPKTGNGKKAQLKEWERFFAMEKVKGS QKMIVTEIYERPKERKDKRLEGVYVKSIELILLYELAKMSGYRATFTKNRLWHRLGMVND NYKKIAAADLKGIDTCITDFEINHFYQRSDSRLNKILMDALNSLKRRWVLDYQEQYRIVD RQGNRRIAEDIDIGNIITLKNRVARSLGCKDEREVFMKMKTAEFYKALTYYYGYYFGWAY VYKEYKLIFNQEIIKEEIPEVEVELQKELLNASVAEAIERTATQNYITCKAKDIKLPKLY QKAQGLLIDALIRLDGNEAEIASKRISDDTEEPDTIRMLTCEESSQLDDELDALFGM >gi|226333003|gb|ACII01000016.1| GENE 22 16842 - 17027 70 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578055|ref|ZP_04855327.1| ## NR: gi|253578055|ref|ZP_04855327.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 61 1 61 61 107 100.0 2e-22 MEREKKSVTVFSQDLCARLMLKGHRLQGMAADRRYPERNVFFFRNDAEVLADIAEIAQKN K >gi|226333003|gb|ACII01000016.1| GENE 23 16963 - 17157 64 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIYCSKLNIARLVQGAYSKDIKLECRETALFSWCLFCVEREVKRGARKEKCDGFFTRLMR PSDA >gi|226333003|gb|ACII01000016.1| GENE 24 17172 - 17378 91 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578056|ref|ZP_04855328.1| ## NR: gi|253578056|ref|ZP_04855328.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 9 68 1 60 60 94 100.0 2e-18 MSCSKRQQMTVDLTPKEVEWISTILADRCDKSTRIFEISAIKTILAKLSVKYDWQPKEKL KVVSSVDT >gi|226333003|gb|ACII01000016.1| GENE 25 17344 - 17610 306 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578057|ref|ZP_04855329.1| ## NR: gi|253578057|ref|ZP_04855329.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 88 1 88 88 169 100.0 4e-41 MKDKDKDITLQLVDVFNEYLDLNGLTVSYAARCTKIPRTSLAEWAKGQRCLTAENARKVK QFLKGGFLIDIDTIVNYLVMQQEAADDC >gi|226333003|gb|ACII01000016.1| GENE 26 19075 - 21012 870 645 aa, chain - ## HITS:1 COG:SA0057 KEGG:ns NR:ns ## COG: SA0057 COG1961 # Protein_GI_number: 15925764 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 24 490 5 446 542 113 25.0 1e-24 MNTAVNEYNYRFKLTDYAQFDRNRARNVAIYGRVSTEHEAQLSALENQLQWYDDQVKYHP NWTVCERYIDEGITGTQAKKRPAFLKMIEDAKQGKFDLIVTREVCRFARNVVDTLVVTRE LKGIGVEVFFIDDNIWTMDGDGELRLSLMATLAQEESRKTSERVKAGQKISRDNGVLYGN GNILGYDRVGDTYVINEEQAETVRMIYDLYLQGYGSMKIAKILTERKRKTASGLIKWSVS NIMRAIKNATYTGTKCYNKSRSNNFLEQKRINNLDMSTYEYVEGDFPAIISQEIWDKAQA IRESRVKPVCVSAGKNTHSKRDSRDIWVNKLRCSCGSSYRKNKWHTKLDGKTSYGYQCYN QLNNGIKRKRAEIGLDTEGYCDSRMITDWKLDFMAKALLEHLWTERKESVLIALDLIKCY YKEEKSNETQSDTVTIQAKLDKANSRLTNLIAMRADGEISKDEYQAMRSPVDEEIKKLQK ALDEIPQEKSNPKGLDIEGIRSTLNSLVDFSGSTISHDVVNRFVYLITPTSDTTFDWYVN LNGTADVKATFTAEGRKKNCIIKLEEIEKISSVHRGENEDNAHFIKNPHIFTYLHRQLLT KVGNNSFTKMYEYVINLADAKSYLYTYSTKHRIFRWEDINVSIFM >gi|226333003|gb|ACII01000016.1| GENE 27 21180 - 21551 336 123 aa, chain - ## HITS:1 COG:CAC0494 KEGG:ns NR:ns ## COG: CAC0494 COG2337 # Protein_GI_number: 15893785 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Clostridium acetobutylicum # 1 116 1 116 122 138 61.0 3e-33 MERIIKRGDIYYAELNPVIGSEQGGTRPVLIISNDIGNKHSPTVIIAAITSRVHTKAKLP THTAVSNYEGLDKDSVILLEQIRTIDKQRLKQYMGMMPNNIMARVDKALAVSVSLKGGQN GQY >gi|226333003|gb|ACII01000016.1| GENE 28 22137 - 22565 259 142 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1146 NR:ns ## KEGG: MGAS2096_Spy1146 # Name: not_defined # Def: putative cytoplasmic protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 140 1 140 143 127 48.0 2e-28 MAYNKAREEKKWRLWKEAEEKQLRSLGVSEDTIEQLRVHDWAIFNSDRRYYQRMQEAGTY LDEVAEDTTPPEIKTVEDFLDNIENQQLYQVLIKVDRLTLQAVLLQLQGFSIAEIALMLD 
MKEYTVYKRLSRLKEKIKKLLG Prediction of potential genes in microbial genomes Time: Sat May 28 19:12:44 2011 Seq name: gi|226333002|gb|ACII01000017.1| Ruminococcus sp. 5_1_39B_FAA cont1.17, whole genome shotgun sequence Length of sequence - 3086 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 40 - 95 5.1 1 1 Tu 1 . - CDS 315 - 1172 464 ## COG1408 Predicted phosphohydrolases - Prom 1221 - 1280 5.3 - Term 1245 - 1289 9.0 2 2 Tu 1 . - CDS 1321 - 2802 457 ## COG0577 ABC-type antimicrobial peptide transport system, permease component - Prom 2895 - 2954 4.7 Predicted protein(s) >gi|226333002|gb|ACII01000017.1| GENE 1 315 - 1172 464 285 aa, chain - ## HITS:1 COG:lin0757 KEGG:ns NR:ns ## COG: lin0757 COG1408 # Protein_GI_number: 16799831 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Listeria innocua # 7 285 2 287 293 198 38.0 1e-50 MKKRLSKKTLISVCTLIAALIIWTMWSNTALMLSTVTVSSNRIPAAFNGFRIAQISDLHN AVFGEDNAELLQILSECEPNIIVITGDLVDAEHTDIDVALDFAKEAVQIADTYYVTGNHE ASLSQYDELKAELENTGVVVLEDKAMQLEYNGDDITLIGLSDPSFTLKGDMFGEVPAMVD TKLRGLIGDKDNYTILLSHRPELFEAYVNCGVDLVLSGHAHGGQFRLPFIGGLVAPNQGL FPKYDAGLYTKGDTNMIVCRGLGNSIIPIRFNNRPEIVLLELIAE >gi|226333002|gb|ACII01000017.1| GENE 2 1321 - 2802 457 493 aa, chain - ## HITS:1 COG:CAC0454 KEGG:ns NR:ns ## COG: CAC0454 COG0577 # Protein_GI_number: 15893745 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 1 488 290 826 832 90 19.0 8e-18 MVRCIGATPKQVMQLVRKEVMNWCRFAIPLGVIGGMVLVWVLCFILRQLSPQYFGGMPAF SISYPSIIAGIVVGLLTVLLAARSPAKKASKVSPLAAVSGNANDLQPVKKAANTKFFKID TSLGIHHAKASKKNFILMISSFSISIILFLAFSVTIDFMNHTLTPLQPWTADISVISPED TCSVKAEFIEELQQNLAVKNVYGRMFAYNVPVIVNGEEKVVDLISYEDMQFDWAKDYVLS GSLEKAQSESNTGLIVFEPQNDIEVGNVITLNLNGSQADMEIVGMLSDCPFNNRQDVGKL 
ICSEDTFRKLTGETDYSIIDIQLSQNATENDVNEIHRLVGSQYTFSDERMGNSNTRGAYY CFGLFIYGFLVLIALITLFNIINSIAMSVVAKMKQYGVFRAIGLSNRQLAKMIIAEASTY AITGTICGSVLGIVCNKILFSKLITYKWGDAWSVPLVQLGIIIVVVAIAVILAVRNPIKK LEKTSIVDIISVQ Prediction of potential genes in microbial genomes Time: Sat May 28 19:12:53 2011 Seq name: gi|226333001|gb|ACII01000018.1| Ruminococcus sp. 5_1_39B_FAA cont1.18, whole genome shotgun sequence Length of sequence - 33606 bp Number of predicted genes - 32, with homology - 31 Number of transcription units - 5, operones - 4 average op.length - 7.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 595 257 ## EUBREC_0763 putative ABC transport system permease protein 2 1 Op 2 4/0.000 - CDS 579 - 1274 325 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 3 1 Op 3 40/0.000 - CDS 1337 - 2491 255 ## COG0642 Signal transduction histidine kinase 4 1 Op 4 . - CDS 2563 - 3264 442 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 5 1 Op 5 . - CDS 3254 - 3439 128 ## CLK_0105 transposase and inactivated derivatives - Prom 3462 - 3521 9.0 + Prom 3544 - 3603 6.5 6 2 Tu 1 . + CDS 3635 - 3994 345 ## MGAS2096_Spy1120 MerR family transcriptional regulator + Term 4110 - 4138 -0.1 - Term 3969 - 4002 2.1 7 3 Op 1 . - CDS 4021 - 5796 1282 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 8 3 Op 2 . - CDS 5832 - 6605 669 ## COG4509 Uncharacterized protein conserved in bacteria 9 3 Op 3 . - CDS 6620 - 7024 191 ## MGAS2096_Spy1123 hypothetical protein 10 3 Op 4 . - CDS 7040 - 9373 1811 ## COG3451 Type IV secretory pathway, VirB4 components 11 3 Op 5 . - CDS 9357 - 9809 278 ## gi|240145273|ref|ZP_04743874.1| conserved hypothetical protein 12 3 Op 6 . - CDS 9824 - 10660 618 ## MGAS2096_Spy1127 hypothetical protein 13 3 Op 7 . 
- CDS 10752 - 11108 473 ## MGAS2096_Spy1129 aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase subunit A (EC:6.3.5.-) 14 3 Op 8 . - CDS 11123 - 12913 965 ## COG3505 Type IV secretory pathway, VirD4 components 15 3 Op 9 . - CDS 12900 - 13097 197 ## gi|240145269|ref|ZP_04743870.1| conserved hypothetical protein 16 3 Op 10 . - CDS 13113 - 13976 1034 ## MGAS2096_Spy1131 hypothetical protein 17 3 Op 11 . - CDS 13983 - 14795 534 ## MGAS2096_Spy1132 hypothetical protein 18 3 Op 12 . - CDS 14737 - 16149 564 ## MGAS2096_Spy1133 relaxase 19 3 Op 13 . - CDS 16149 - 16478 174 ## MGAS2096_Spy1134 relaxosome component 20 3 Op 14 . - CDS 16438 - 16590 159 ## gi|253578086|ref|ZP_04855358.1| conserved hypothetical protein - Term 16631 - 16665 3.1 21 4 Op 1 . - CDS 16682 - 16999 295 ## gi|253578087|ref|ZP_04855359.1| conserved hypothetical protein 22 4 Op 2 . - CDS 17050 - 18039 726 ## MGAS2096_Spy1135 LtrC - Term 18058 - 18104 7.1 23 4 Op 3 . - CDS 18117 - 19274 385 ## GTNG_2836 hypothetical protein - Prom 19305 - 19364 8.6 - Term 19347 - 19387 7.1 24 5 Op 1 . - CDS 19396 - 19743 215 ## MGAS2096_Spy1138 hypothetical protein 25 5 Op 2 . - CDS 19803 - 27566 4075 ## COG4646 DNA methylase 26 5 Op 3 . - CDS 27614 - 27820 320 ## gi|253578092|ref|ZP_04855364.1| conserved hypothetical protein 27 5 Op 4 . - CDS 27835 - 28542 442 ## MGAS2096_Spy1154 hypothetical protein 28 5 Op 5 . - CDS 28535 - 28753 106 ## gi|253578094|ref|ZP_04855366.1| conserved hypothetical protein 29 5 Op 6 . - CDS 28671 - 28943 283 ## MGAS2096_Spy1155 hypothetical protein 30 5 Op 7 . - CDS 28954 - 29256 228 ## gi|253578096|ref|ZP_04855368.1| conserved hypothetical protein - Term 29265 - 29298 2.7 31 5 Op 8 . - CDS 29306 - 29398 156 ## 32 5 Op 9 . 
- CDS 29402 - 33544 3947 ## COG4932 Predicted outer membrane protein Predicted protein(s) >gi|226333001|gb|ACII01000018.1| GENE 1 1 - 595 257 198 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0763 NR:ns ## KEGG: EUBREC_0763 # Name: not_defined # Def: putative ABC transport system permease protein # Organism: E.rectale # Pathway: not_defined # 1 198 1 199 786 248 58.0 6e-65 MKSYLDLVPISAKVHRKQNRMSIICIVLAVFLVTAIFGMADMFIRGQILQAQQENGNWHI GIQNISDEDAMLISSRPEVATVADYGTLNFRGEQEYTLHGKNVSICGGNESIVTEIFNVL DEGTFPKTENEAMISINAKDTLKLKIGEQIVITTPNGTEFSYIISGFMKNSANLMREDVY GVFLTTDGFRTIYSNETD >gi|226333001|gb|ACII01000018.1| GENE 2 579 - 1274 325 231 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 223 1 221 245 129 33 2e-29 MEILKVQNLCKTYGTGEAKVDALKNVSFSLEKGEFVAVVGESGSGKSTLLNCIGALDVPT SGTVTMDGNNLFSMKEEERTIFRRRNIGFIFQSFQLVSELTVEQNIAFPLLLDYRKPKAS DVEEILELLGLTERRNHLPSQLSGGQQQRVAIGRALITKPKLILADEPTGNLDSKNSQDV IDLLTKASRHYQQTILMITHNNGLTSMVDRVLRVTDGVLTDLGGNANEKLS >gi|226333001|gb|ACII01000018.1| GENE 3 1337 - 2491 255 384 aa, chain - ## HITS:1 COG:CAC0451 KEGG:ns NR:ns ## COG: CAC0451 COG0642 # Protein_GI_number: 15893742 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 79 383 110 413 416 183 35.0 4e-46 MGIAVIDTQTAATKEMFLTHNNAIATSLLEEGISKEVIATALMNTEAGTTGNKFLAEIGL SNSTDIENLPYVSAVKNTILTSLFVMLFLWTLLLFLGTMLFLYRRERLYRESENIIKDYI DGNYTVHLPQSNEGTIYQLLSSVDQLATMLQAKNDTEQKTKEFLKNTISDISHQLKTPLS ALMMYQEIIENEPENFETVKEFSLKIGTALKRMEQLIQSMLKITRIDAGSISFEKSNYSI PNIINHAISELTTRADNEKKEIVIDGDLEQMLYCDIEWTGEAIGNIIKNALDHTDTGGKI SISWEQTPAMIRIFITDNGHGISQEDIHHIFKRFYRSKNTSESQGIGLGLPLAKAIVEGQ GGILSVQSELLQGTTFTLSFLTES >gi|226333001|gb|ACII01000018.1| GENE 4 2563 - 3264 442 233 aa, chain - ## HITS:1 COG:CAC0524 KEGG:ns NR:ns ## COG: CAC0524 COG0745 # Protein_GI_number: 15893814 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response 
regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 5 227 4 220 228 186 42.0 2e-47 MENRILLLEDDVSLIDGLQYSLKKNGFDVDFVRSVQETLTYLSEIEKYDMLILDVTLPDG TGFEICEKVRKNGSQIPIIFLTASDEEVSIIRGLDCGGDDYITKPFKLGELCSRIRALLR RAGLSTQGTGAVMECGDIKIDLLGSRVLLKGKILELTAAEYRLICLLVKNANRIITRDII LDELWDSTGDYVDNNTLSVYVRRLREKIETDPSHPEHLLTVRGFGYQWKEVDE >gi|226333001|gb|ACII01000018.1| GENE 5 3254 - 3439 128 61 aa, chain - ## HITS:1 COG:no KEGG:CLK_0105 NR:ns ## KEGG: CLK_0105 # Name: not_defined # Def: transposase and inactivated derivatives # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 1 58 118 178 182 75 57.0 4e-13 MFEYMTAQEAAELWGISVRRVQRLCKENRIEGVLNINRVWLIPKSSKKPVDKRRKVTDYG E >gi|226333001|gb|ACII01000018.1| GENE 6 3635 - 3994 345 119 aa, chain + ## HITS:1 COG:no KEGG:MGAS2096_Spy1120 NR:ns ## KEGG: MGAS2096_Spy1120 # Name: not_defined # Def: MerR family transcriptional regulator # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 119 1 119 119 134 58.0 8e-31 MDARNNDFDFDFTPIGQAIKKARTAKGMTRDELSRIVDYDPRHLQAIENEGQKPSLELFI QLVTMFGVSVDEYIFPDNEVKKSSVRRRLDAELDKLNDKELLVIEATVSGLCKAKEPEE >gi|226333001|gb|ACII01000018.1| GENE 7 4021 - 5796 1282 591 aa, chain - ## HITS:1 COG:BS_yddH_1 KEGG:ns NR:ns ## COG: BS_yddH_1 COG0741 # Protein_GI_number: 16077564 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Bacillus subtilis # 291 441 53 191 205 126 43.0 2e-28 MADIKTRDAVKGTIKTIDKAAIASERMKSAYVGIKEKAEQGYYADENSATEYAADRISYA ADRVKDEGIHQFNKQGQKAVKTTQENIGKAKDKITDFKQSRAVKAAEQKAAQNMSEQHGL QIRHGAASRSSAPDVSQTAKSQLIKTRQQGQKMIKTTARNAEKAVKTTAKGTVKTTEKGI KTAQATSKAAIKTTETSVKTAQAAAKASAKTAQKAAQAAKATAKATAEATKATVRATIAA VKAIIAGTKALISALIAGGWIAVVIILIVVLLGCAVSLFGGGSGSNAYTPVSAEVEAYEP LIQKYAKQYGIPEYVELIKAVMMQESGGRGLDPMQAAEGSFNTRYPHEPNGIQDPEYSIE CGVQELKAALISAEVENPIDMEHIKLALQGYNFGNGYISWAKTNYGGYSYANAVEFSTMQ 
AARLGWDSYGDTQYPAHVLRYYPYGRAFTSGGNQAIVEVALTQLGNEGGQPYWSWYGFEG RVEWCACFVSWCADQCGYIESGIIPKFAGCVDGANWFKGNGQWKDRSYEPSAGDIIFFDW EGDGETDHVGIVEKCENGVVYTVEGNSGDTCRQNQYTVGSSSIYGYGVPAY >gi|226333001|gb|ACII01000018.1| GENE 8 5832 - 6605 669 257 aa, chain - ## HITS:1 COG:SA0982 KEGG:ns NR:ns ## COG: SA0982 COG4509 # Protein_GI_number: 15926719 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Staphylococcus aureus N315 # 24 254 22 239 244 119 29.0 7e-27 MKNYKSIICVAAAVCLLGTAAFCGFRIYHHYAQVDEQTEAFEEIAEMVEQAPTEETMPDD APVSEGEDVLAKYQELYLQNEDMVGWLSIAGTIINYPVMQSRNNPNFYLKHNFEKEYSDL GTPYVQENCDIAESDNLVIYGHHIKGGKMFGALEDYKSKSFYEEHKNIQFDTLTEQEEYE IVAVFKTVAYSSEGFRYYDFVDAENEEDFNSYVGKCKELALYDTGVTAEYGDRLITLSTC EYSAQNGRLVVVAKKVG >gi|226333001|gb|ACII01000018.1| GENE 9 6620 - 7024 191 134 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1123 NR:ns ## KEGG: MGAS2096_Spy1123 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 134 1 134 134 224 79.0 6e-58 MADIKFRDTAHRDFFLENMMKCRVNDCYHRAFFYVMGIASETRANINQMFNFKEDCIEPE GMHGGWQTSGTVKVCHLAFNLWNGYAEEGRERYFTPEELFCCEFAPYFMEGIKVRYPEYC WELPAPRKQTEISR >gi|226333001|gb|ACII01000018.1| GENE 10 7040 - 9373 1811 777 aa, chain - ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 208 754 24 600 617 118 22.0 5e-26 MLRTLKNLFKQDREKFVVPKSVQNVIPIKTIWDDGIFLVGRNKYAKTFKFEDINYAVASR EDKEAMFLEYSELLNSLDSGATTKITINNRRLNKADFEQTILIPMADDGLDKYRKEYNKM LLDKATGANSIVQDKYVTVSVCKKNIEEARNYFARVGADLIAHFNRLGSKCVELDAGDKL RIFHDFYRTGEETAFHFDITQTMRKGHDFKDFICPDTFEFESDCFRMGDRYGRVIFLREY AAYIKDSMVAELCELNRNMMLSVDIIPVPTDEAVREVENRLLGVETNITNWQRKQNQNNN FSAVIPYDLEQQRKESKEFLDDLTTRDQRMMFAVLTMVHTADTKEQLDNDTEALLTTARK HLCQFAVLKYQQMDGLNTALPFGVRKIDALRTLTTESLAVFIPFRVQEIYHENGVYYGQN 
VISKNMIIANRRQLLNGNSFILGVSGAGKSFTAKEEMTNIILTDPNADIIIIDPEREYSP LVKAMQGEVIHISATSENHINAMDMNSDYGDGANPVILKSEFILSLCEQLIGGTNLGAKQ KSIIDRCTASVYRYYQQGNYMGTPPTLQDFREELLKQNEPEAQEIALAIELFTDGSLNTF AKHTNVDTHSRLICYDILDLGKQLQPIGMLVVLDSILNRITQNRAKGRNTFIFIDEIYLL FQHEYSANFLFTLWKRVRKYGAYCTGITQNVDDLLQSHTARTMLANSEFIIMLNQASTDR LELAKLLNISDLQMSYITNVGAGQGLLKVGSSLVPFVNKFPRNTELYKLMTTKFGEV >gi|226333001|gb|ACII01000018.1| GENE 11 9357 - 9809 278 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|240145273|ref|ZP_04743874.1| ## NR: gi|240145273|ref|ZP_04743874.1| conserved hypothetical protein [Roseburia intestinalis L1-82] # 1 150 1 150 150 259 100.0 3e-68 MEVKINKEIRNYTESMFFGLSLRQFIFSVLACGVAVGLYFLLRPRFGTETLSWVCILGAF PFAAMGFIKYNGMTAEQFVWAWIKSEFLMPKKLMFLPDNLYYETMKPTIEAHEKGLPTVR KKQKGRTPKPKKTKKKKAKRSKEVNNAENS >gi|226333001|gb|ACII01000018.1| GENE 12 9824 - 10660 618 278 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1127 NR:ns ## KEGG: MGAS2096_Spy1127 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 278 6 283 283 481 92.0 1e-134 MSDNWVVQNLENALNTWNEKLAEIWQLITQSPENFKGGTIWNVIVDIHGAVQAIGLALLV LFFVVGVMRTCGNFAEVKRPEQALKLFIRFAIAKGAVTYGLELMMALFKIVQGMISTIMN AAGFGSAQQTVLPQEIVTAVEDCGFFESIPLWAVTLIGGLFITVLSFIMIMSVYGRFFKL YIYTAIAPVPLSAFAGEPSQSVGKSFIKSYAAVCLEGAVIVLACIIFSLFASSPPVVNPD AAAVTMVWSYIGELVFNMLVLVGAVKMADRVVREMMGL >gi|226333001|gb|ACII01000018.1| GENE 13 10752 - 11108 473 118 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1129 NR:ns ## KEGG: MGAS2096_Spy1129 # Name: not_defined # Def: aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase subunit A (EC:6.3.5.-) # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 118 1 123 123 183 85.0 2e-45 MKLFKKNENKKTTKLPMTKKVKKGFRYYCMAVMAMTFVLGTSVTAFAANDPIQVVNNLSD FIFGLVRAVGMIMLGFGIVQIGLSLKSHDPSQRANGFLTLAGGVVITFAKEILTLITG >gi|226333001|gb|ACII01000018.1| GENE 14 11123 - 12913 965 596 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular 
trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 62 536 75 556 591 233 33.0 8e-61 MRNDDQKTSLILSVCGILPVVWLALLTAPYVSGGLVEIIRGLPVAMGNPFEIMVCEDSVK TVLIFLLAYAMGIGVYFSTRRNYRRREEHGSAKWGNAGALNKKYRDKDPSANKLLTQNVR IGLDGKKHRRNLNILVCGGSGAGKTRFFCKPNAMQCNTSFVILDPKGEIVRDIGGLLENK GYEVRVLDLINMHRSHCYNPFVYLRNDNDVQRLVTNLFKATTPKGSQSQDPFWDTAASML LLALVFYLKYEAPPDEQNFPMVMELLRAGEVREDDDSYVSPLDELFDRLEMVNPEHIALK YYRDYHSGSAKTLKSIQITLAARLEKFNLESLAGLTATDELDLPSLGEKKVALFALIPDN DTSFNFLVSILYTQLFQQLFYLADHKYGGSLPVHCHFIMDEFANVSLPDDFDKILSVMRS RGVSVSIILQNLAQLKALFEKQWESIVGNCDEFLYLGGNEQSTHKYVSELLGKETIDTNT YGKSSGRSGNYSTNYQISGRELMTPDEVRMLDNRYALLFVRGERPVMDFKYDILKHPNVK LTADGGQPPYIHGEPTQAVATLVFDSDIPDNAVSVKAVSTSYELLSDEDLEEIFNL >gi|226333001|gb|ACII01000018.1| GENE 15 12900 - 13097 197 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|240145269|ref|ZP_04743870.1| ## NR: gi|240145269|ref|ZP_04743870.1| conserved hypothetical protein [Roseburia intestinalis L1-82] # 1 65 1 65 65 104 100.0 1e-21 MREERYHIALDKYDKNIVINALNTLRTRQLQEERPTEPVDELISKVAHAPTKKVKVVHCR CNEER >gi|226333001|gb|ACII01000018.1| GENE 16 13113 - 13976 1034 287 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1131 NR:ns ## KEGG: MGAS2096_Spy1131 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 287 1 286 286 417 78.0 1e-115 MSIYDVFGGGNRPFDKEEWAAAKQAQRKEAYELIDNTCSEMMASGDSFRQYLNVQGRFDR YSVANAILVSAQMPEATQLKEYSKWKANRVYVNKDAQKIIILEPSKEYTREDGTKGISYN AKVVYDISETSAKDRQQEQEPKSMRELVSALIDASPVPCVPVGELELPAYYDSSQQTIFV KKGLSEEVLFVSMAKEVSAAVYNFKHNESRDASDFKSFCVAYMVSSRYGVDTRGFDFSRL PREYAEMDTQAFKGELGTMRDVLGEIQSDMYKSMEKNKPAKNKEQER >gi|226333001|gb|ACII01000018.1| GENE 17 13983 - 14795 534 270 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1132 NR:ns ## KEGG: MGAS2096_Spy1132 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 270 1 272 272 287 79.0 3e-76 
MNTGGEAAEQIVRMSLEGFEVAAKITGAGAKNIAILLYSILKEEKKTKGKARLTSMLRSG KELKVFTVKSGDLKKFTQEAKKYGVLYCVLTDRKNKDPNAEVDVIARAEDASKISRIVER FNLASVDTASIVTEAEKSKDAKDGQSEPDIGVQEKAEKDKLLDELMGKPIQKEENAPNPS VAKTEKSPQSEPTSEQPKKFAEGATMTKEKPSVREELRKIKENRKEQKAEIGTSALDKSG ASDRAKSANGKTEHKQPQRKKKKPKSKETR >gi|226333001|gb|ACII01000018.1| GENE 18 14737 - 16149 564 470 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1133 NR:ns ## KEGG: MGAS2096_Spy1133 # Name: not_defined # Def: relaxase # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 470 22 491 491 776 82.0 0 MAVCEIWDVRGRLDHPIDYAENPEKTANPKYTEADLQAMVDVMEYATNKNKTEQRFFVTG VNCDPTTARDEMMITKAGWNDMSEIVCYHGFQSFKHGEVTPEQAHEVGVKLAERMWGDRF QVIVATHLNTDCLHNHFVVNSVSFADGMHYHDNKANLRLLRQRSDELCREYALSVIEHPS GKKKPYALYQAEKQGRPTRDNVARQAVDEAISKSFTLKDFDRQLAEMGYRINFDPNRKYW TIIGKGWQRPKRLYKLGEKYTNDRIMERISKNSYAVKFQSFSEPQLQVKVYRVKGSLKNA KKIGGLRGLYLHYCYKLGILPKGRKQNYARLHYLLKDDLMKMEAITQETRLLCRNHIDTA EQLCSYKGSLKTEMSALLQKRKELYSKSRRTSGEEKEAVKAELSDISGRLKIIRKEVRLC EGISARSDTLKEKLQTIRADEHEQQRKELVKNEHRRRSGRTNRPNELGGL >gi|226333001|gb|ACII01000018.1| GENE 19 16149 - 16478 174 109 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1134 NR:ns ## KEGG: MGAS2096_Spy1134 # Name: not_defined # Def: relaxosome component # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 109 1 109 110 162 82.0 4e-39 MRKRNVQILFRLTEEEAEHLNELVRKSGRSKEAFLREMVRGYQLCEKPDPEFYKMMRELS AIGNRINQLAVKANALGFVDTPMLREEARKWHEFQIDIRKRYLLPRRSS >gi|226333001|gb|ACII01000018.1| GENE 20 16438 - 16590 159 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578086|ref|ZP_04855358.1| ## NR: gi|253578086|ref|ZP_04855358.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 50 1 50 50 94 100.0 2e-18 MSEFLFFIIGTMLGGLFGVVCMCCVQINRLSERKEVDNAKKKCADTFPSD >gi|226333001|gb|ACII01000018.1| GENE 21 16682 - 16999 295 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578087|ref|ZP_04855359.1| ## NR: gi|253578087|ref|ZP_04855359.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 105 1 105 105 180 100.0 2e-44 MLKARNEPLKKYDVTLTAHYRKTVSVYAESPEQAKKKTKIILFDTDLINFTDDDFVCGEA DITEQSEDGLDCAGENTLQENEDCSDCPYFCPVCGECMYEDDCEE >gi|226333001|gb|ACII01000018.1| GENE 22 17050 - 18039 726 329 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1135 NR:ns ## KEGG: MGAS2096_Spy1135 # Name: not_defined # Def: LtrC # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 329 1 329 329 487 72.0 1e-136 MRTKQPYVQIDPDVLERIRQIDLLSYLREFEPSNLVKVKGTSNVYCTAEHDSLKISNGKW YWWSRGFGGYSALDYLIKVKEYDFVEAVEILTGQTMADWKPPPAPKKDEPKVLLLPPKNK DCNRVIQYLFGRGIDYHLIQECIADGTLYESADYHNAVFIGKDESGTPKYAALRSTLGST FKQDASGSDKRYSFRLLAREPTDTVHLFEAAIDLLSYATYLKCEGKDYKSESLLSLSGVY QPKKEMKDSKIPIALTTFLSANPQIKTIVLHLDNDKVGRLCTATLKELLQKDYKIVDDPP PVGKDFNDFLLSYLGIARPKPARERSDTR >gi|226333001|gb|ACII01000018.1| GENE 23 18117 - 19274 385 385 aa, chain - ## HITS:1 COG:no KEGG:GTNG_2836 NR:ns ## KEGG: GTNG_2836 # Name: not_defined # Def: hypothetical protein # Organism: G.thermodenitrificans # Pathway: not_defined # 4 331 9 334 378 149 29.0 1e-34 MLMKYFKIYDDELCPCGSGKTYLECCKNRKDRPVKKSKKPLNIQIMEQFRKNQIKCCLYP DSTRCVKHIKEAHALQNNKIISQLSEDGHVYILNPNKPPQVIPIENEEPEVLTMIDRVGV NHATTATCFCDIHDDEVFAPIEKGAPAFDKDSDEHKYIYAYKAFIFEYYKKLVEQKVLRE NITNTPSLLKTPQYIQYYRAISMTLEEMNEVKSFFDKGIVSKDYTGLETCVIEIPESIKF ATYACIAVDFDLKGKKIRHTKRGFISRLFITIFPEETKSYIILSCLSEDIKIYKKFFQQL QTTNLDKVKYYFDLILPLYTENNVLSPRLWEKWGEEQQIGYTFFANRKGKQFLLYKQILK FGMYNLRKKESGFGTGNRGKLDLFE >gi|226333001|gb|ACII01000018.1| GENE 24 19396 - 19743 215 115 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1138 NR:ns ## KEGG: MGAS2096_Spy1138 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 11 115 5 109 109 128 58.0 6e-29 MSEVKGVLDNFRFETFVDVHSNIFAEYLSSVIAKLPKENPEYHFIEERIEELYKEYPKVM AVLDTEKPSDLSEQECKALIEVLELRNRLSDMQQEAIYFRGCYDSVGYLKKAGIL >gi|226333001|gb|ACII01000018.1| GENE 25 19803 - 27566 4075 2587 aa, chain - ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # 
Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1295 2557 1 1305 1315 644 33.0 0 MAAKYQLITELYRRTGVAVAKNPQAWQGFLSSACRNYKCRFDEQLLIYAQRPDAVAVAKL ETWNRQFKRWVNKDSKGIAVFDPKGRRNTLKYYFDVSDTHEGYYGSRPVPIWQMDERYEQ AVMERLSDRFGDVESTDLASALMETAKNAVEDNLQDYFSQLKDCTKDSFLEELDDFNIEV VYRRLAANSVAFMLISRCGLDTNEFFDRDDFADIVNFNTPATINAIGVATSDIAEMALRE ISQSIRNVQMAEKDQNRTFAQRTQAQYDKGRQQPERSEYNEQNHLQQTGGLSYSRPNITD RARASAWQVRFDAQGLSGEAQASDLSQSADIGQAERASARGRADSTPEVGASDEAALSRA GRDRGTERESTDAVGRTDEQHPQPSGGSDTDRTDLQVSVAKEDEVRVNLPTVDEQIEMIA KAEDEKASAFAISKEDIDSVLQKGSGVADGKYRIYRQFQKGVDRQKNIEFLKNEYGTGGG THIFPDGFSGHSWHDSKGLAIDRNGTYTNHDLVLKWSQVEKRLRELIKDNRYLNPKEKDH YADYLESVSAPQYEIDTQRKIARQRFIDAHRDLPPADKRDTLALRLSDFIRDLDRYEKDL LSAVERSDLADVTAEQMEQHLSDPSTVQQLIDFLAQVQWKTTSVFSRSNGWKFTEELREL HPLRYLYNEGDVVYIGADKYEIATLTEEKVYLQNAEFPILGQEYSRADFEEKLTENPAND HLKVVVTEKQRTETPSEKKQDGIQFSIGFSEHPAFYDRQFNDRYTDLSFALGNKLLGILD EKQHREREGDKNIGWYHKTDFVIKAVIGGEEFNYEGRFDIGDGEGDLIAHIKNFYDYALS PKGEQLYGDDRESLLRGRGEFIPFLEQHTELTKEDEKLLDEIMATESDWYRTAEEAKEKP QAYADIMNGSEAPAIEMEKSADDLIGREIIIDNRKYLIENIGKISGDVSLRDITFQNNVG FPINRVEKIGYIQKLLEQEKTELLPEEKTEAPAVDRHNFRINDDAIGVGGAKEKFRNNMA AINLLHELEIENRLATPEEQEVLSQYVGWGGLSMAFDEHNAAWAEEFKELYASLSPEEYR AAMESTLTAFYTPPVVIKAMYDALDRLGFSQGNILEPSCGTGNFFGLLPESMQNSKLHGV EIDSLTGRIAKQLYQKANIAIEGFEKTNLPDDHFDVVLGNVPFGEIRVNDSRYNAQKFLI HDYFFAKALDKVRAGGVVMFITSKGTMDKASPEVRKYIAQRAELLGAIRLPDNTFKANAG TEVTSDILILQKRDRVMDIEPDWVHLDTDENGVTMNRYFVEHPEMVLGEIKMENTRFGTF EPVCKARKDIPLSELLSNAVQRINGEIPELDNRVDEISDEQELSVPADPNVRNFSFTLVD GKVYFRENDRMQPASVSMTAENRIKGLIQIRDCVRKLIEYQTEDYPEEMICTEQENLNRL YDAYTAKYGLINSRGNYLAFASDESYFLLCSLEVLDDEGNFKRKADMFTKRTIKPHREVT SVETASEALALSIGEKARVDLPYMEQLTGKTQAELVQDLQGVIFKVPNCEPVSYVAADEY LSGNVRNKLTVAELAAKNDPELAVNVDALKKVIPKDLSAAEISVRLGATWIPQEDIQRFV MELLTPSSYAAGRLKVRYTPINGDWFIENKSSDMGNVKADSTYGTKRASAYRIIEDTLNL RDTRIFDYVYDEHSNKKAVFNAKETTAAQAKQEVIKQAFQDWIWKAPERRNRLVRYYNDT 
FNSVRPREYDGSHITFGGISPEITLRPHQVNAIAHILYGGSTLLAHKVGAGKTFEMVAAA QESKRLGLCQKSMFVVPNHLVGQWASEYLRLYPSANILVTTKQDFETGNRKKFCGRIATG DYDAVIIGHSQFEKIPMSIERQREQLEKQLDDIERGIDDVQASKGEQFTVKQLMKTRKAI KTKLEKLNDTKRKDTVIDFEQLGVDRLFIDESHFYKNLYLYTKMRNVGGIAQTEAQKSSD LFMKCRYLDEITGNRGTVFATGTPVSNSMVELYSVQRYLQYDTLAQNGLQHFDSWASTFG ETVTALELAPEGTNYRAKTRFAKFYNLPELMHMFREVADIQTADMLKLPVPKVNYHNIKT KPSEIQTEMVASLAKRAEKVRARLVEPNIDNMLKITNDGRKLALDQRMIDPMLPDDPDSK VNACVDNVYRIWEEHADTKATQLVFCDLSTPKNDGTFNVYDDMREKLIARGIPAEQIRFI HEATTDAQKKELFGKVRSGEVRVLFGSTPKMGAGTNVQDRLIAIHNLDCPWRPSDLEQRQ GRIERQGNMFPEVEVYRYVTEQTFDAYLYQLVESKQKFISQIMTSKSPVRSAEDVDEVAL SFAEVKMLATGDARFKEKMDLDIQVSKLRVLKQSYLSEHYDLEDRVLKYYPQTIKEYEER IAGYENDAALAEQHKPQGEDKFCSMTLKGVTYTEKADAGEMLLAICKDYPMSAPTEIGSY RGFRMEIYYDTVNAHYCMNLCGKAKHKVDLGADALGNLTRIENELSKLPARLEAAKTKKA ETIAQLETAKEEIKKPFAFEDELKEKTERLNALNIELNLNEKDTSVMDTEPEQTEEQPER KCASRER >gi|226333001|gb|ACII01000018.1| GENE 26 27614 - 27820 320 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578092|ref|ZP_04855364.1| ## NR: gi|253578092|ref|ZP_04855364.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 68 1 68 68 101 100.0 2e-20 MAVNQKAVKVLNKVLEAGFTDEKAIAAMTMDDILSMQGITVADITLINDLQKSIKSNKVI AFLGGGAE >gi|226333001|gb|ACII01000018.1| GENE 27 27835 - 28542 442 235 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1154 NR:ns ## KEGG: MGAS2096_Spy1154 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 5 235 1 231 231 397 82.0 1e-109 MSNEIKTRPLTPTEQKYTYAQSMQLEGQTGTIGHLRGDFATTGYGFYTTWFDTRPQWKSD EFKADLDTVINALRKDKGLLHNRYDMSAFARHFPESAIKGNYCTEYGFRVDTEKHAFLLR CNPTKGDYNFYCYCYVKEWLDKHIQKAEQGIRFIDPQYKELFRIPDGGKVIVKTSWGEKR EYPCRFIDEYHTEVGSNLYHICEFAERMQKNGATYEPKPAEQTPQKTPKHKDLER >gi|226333001|gb|ACII01000018.1| GENE 28 28535 - 28753 106 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578094|ref|ZP_04855366.1| ## NR: gi|253578094|ref|ZP_04855366.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 72 1 72 72 107 100.0 2e-22 MNDTSQSAVCPIISLHKSRKTVMTVSDKSKYDCTTCPYPRYKDGLVIFCDVCIRKILDEQ KEKKERKEQPNE >gi|226333001|gb|ACII01000018.1| GENE 29 28671 - 28943 283 90 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1155 NR:ns ## KEGG: MGAS2096_Spy1155 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 90 1 90 90 128 76.0 6e-29 MDEEKRSNQNYEIIESCTIGSTELVIGHNPNAPNPYVCWYCKGGSNYFWGYYTNELDDAR KKLNERYQSECRMPYNQPAQKQKNGDDRER >gi|226333001|gb|ACII01000018.1| GENE 30 28954 - 29256 228 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578096|ref|ZP_04855368.1| ## NR: gi|253578096|ref|ZP_04855368.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 100 1 100 100 174 100.0 1e-42 MKNLKTIEKKVRAVLEKNEDARNDDMVLYLALCNACLKDAGAMPLAEIMTQHKYLGLPSF ESVSRTRRKLQAQHPELAGSRPVQKMRATGEKAYRRYAKE >gi|226333001|gb|ACII01000018.1| GENE 31 29306 - 29398 156 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDAKTIIAIVLVVFIVGAAVWLNIRNRKKK >gi|226333001|gb|ACII01000018.1| GENE 32 29402 - 33544 3947 1380 aa, chain - ## HITS:1 COG:L148778 KEGG:ns NR:ns ## COG: L148778 COG4932 # Protein_GI_number: 15672133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Lactococcus lactis # 505 1336 1084 1830 1983 115 26.0 4e-25 MKIPKLFKKAAAFVMAAVTALSIMPATAFAAGDIGTICFSHTYDSNGNAMRYNSSANIGG YTAGGTGNYKYRMFVDGENAFCIQPGVPLKTGNTLKKASSDTWNALSANQKKVVGLALLY GYQGNRNNLSGSDDEKWLATQTLVWEFVTGCREATGSYNQTSTTVYSLHFGSNYANSGAR AVYDQIVAMLREHNTIPSFMSGGKNDITKELAYKDGKYSITLTDSNGVLSDYSFSSSDSN VSVSKSGNKLTISSTVAISGSVRITAKRNNVPTVSSSAKLIAYGDPNLQDLVTGVENADT VSAYINIETPTGTIALKKTSEDGVVEGISFTIKGDNFNKTVKTGKDGSVSVEGLFPGTYT VTEQSIDRYEPQKTQTVTLIGGKTSTVTFSNTLKRGSLEIVKTSEDNLVEGMKFHLYGTS LSGLPVDEYAVTDKNGLAKFENVLISGDTPYVVEEVDTAVRYVVPASQTAPIEWNKVTKR SFDNVLKKFQVTVTKTDAETGSPQGDASLAGAVYGIYKGEELIDTYTTDENGQFTTKYYI CDNDWTVREISPSEGYLLDTAIHKVGAEPELYTVELNSTANDVNEQVIKGNIALIKHTDN 
GETQIETPEEGAVFEVFLKSAGSYENAKETERDVLTCDENGFAQTKDMPYGIYTVRQTFG WEGRELMKDFDVFISKDGQTYRYLINNANFESYIKIVKKDAETGNTIPYAGAGFQIYDPN GNLVTMTFTYPEVTTIDTFYTTADGDLITPQTLEYGKGYSLVEVQAPYGYVLNSEPVYFD VVQENSEEESGITVIEVVRSNMAQKGTITVEKSGEVFSSVAGDKGLYQPIFSVSGLEGAV YEITAAEDIVTLDGTVRANKGEVVDTVTTGKDGTAKSKELYLGKYEVKEITAPYGMVLNE EVRSVELVYAGQNVDVTETATSFYNEKQRVEIDLIKSLAIDEAYGIGKNGEIFDVTFGLY AAEELTAADGKTIPADGLIEVISLDESGHGKAISDLPMGSYYVQETSTNSAYIVSDAKYP VIFEYAGQDTETVRIIANEGEAITNDIIYGSVSGKKSDEDGKALGSAVIGIFKTGTTEFT KENAIVATTSKDDGSFSFAKVPYGTWIIREIESPKGYVLSEEEIAVTIGKVDEVVEIELV NYFIKGNIALTKVDEDYPDNKLSGAVFEVYSDTNGDGKLDKDDTLLGEMKELDGGVYQMS ELRFGKYLVKETKAPTGFVLDENVYSVSVEENGKTYTVENKAGVGFINAAQKGSLKIVKT SSDGKVEGFSFRVTGVDYDQTFKTDKNGEIVIEGLRIGDYTVSEVSNKASAGYILPADKQ ATVKVDATAIVQMHNEFRDTPKTGDDFNLGLWVSLAALSVVGAGVLGFVGYKNRKKKKED Prediction of potential genes in microbial genomes Time: Sat May 28 19:14:24 2011 Seq name: gi|226333000|gb|ACII01000019.1| Ruminococcus sp. 5_1_39B_FAA cont1.19, whole genome shotgun sequence Length of sequence - 20153 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 7, operones - 2 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 25/0.000 - CDS 120 - 1067 732 ## COG1475 Predicted transcriptional regulators 2 1 Op 2 . - CDS 1021 - 1848 676 ## COG1192 ATPases involved in chromosome partitioning - Term 1906 - 1972 10.1 3 2 Tu 1 . - CDS 1974 - 2153 137 ## gi|240145248|ref|ZP_04743849.1| conserved hypothetical protein - Prom 2242 - 2301 6.0 - Term 2216 - 2263 10.4 4 3 Tu 1 . - CDS 2405 - 5029 3447 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase - Term 5314 - 5367 5.1 5 4 Tu 1 . - CDS 5385 - 6641 1592 ## COG0205 6-phosphofructokinase - Prom 6762 - 6821 8.6 + Prom 7023 - 7082 3.3 6 5 Tu 1 . + CDS 7127 - 8002 803 ## COG2017 Galactose mutarotase and related enzymes + Prom 8008 - 8067 2.5 7 6 Tu 1 . 
+ CDS 8151 - 9023 1112 ## COG0646 Methionine synthase I (cobalamin-dependent), methyltransferase domain - Term 8999 - 9038 0.3 8 7 Op 1 . - CDS 9281 - 10414 1174 ## COG2006 Uncharacterized conserved protein 9 7 Op 2 . - CDS 10420 - 12054 1558 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 10 7 Op 3 4/0.000 - CDS 12080 - 13114 1488 ## COG0687 Spermidine/putrescine-binding periplasmic protein - Prom 13134 - 13193 1.9 11 7 Op 4 8/0.000 - CDS 13204 - 15048 2083 ## COG0687 Spermidine/putrescine-binding periplasmic protein 12 7 Op 5 30/0.000 - CDS 15060 - 15875 768 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 13 7 Op 6 8/0.000 - CDS 15875 - 16918 1221 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 14 7 Op 7 36/0.000 - CDS 16966 - 17739 673 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 15 7 Op 8 30/0.000 - CDS 17736 - 18599 767 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 16 7 Op 9 . 
- CDS 18586 - 19794 1169 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components Predicted protein(s) >gi|226333000|gb|ACII01000019.1| GENE 1 120 - 1067 732 315 aa, chain - ## HITS:1 COG:SP2240 KEGG:ns NR:ns ## COG: SP2240 COG1475 # Protein_GI_number: 15902043 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 29 259 1 209 252 74 28.0 3e-13 MARSRETKIELTAYDDLFQTDESREEAKLSKIRDIPISEIDEFPDHPFKVLMDEDMEQLV ESIKRNGVMTPATVRLKEDGRYELISGHRRKKACELAGLETLKCEVKELTRDEAIIVMVE SNLQRSVILPSEKAFAYKMRLEAMKRQAGRPPKENASPLATNLSKGRSDEELGELVGESK DQIRRYIRLTELVPEILQMVDERQIAFRPAVEISYLTEEQQYTLLEAMEYNDATPSLAQA IKMKKYNQYGKLTSEVIQSIMEEEKPNQKGKPAFRDERITKLIPKTVPRGQETDFVVKAL EFYNRHLQRNKAHER >gi|226333000|gb|ACII01000019.1| GENE 2 1021 - 1848 676 275 aa, chain - ## HITS:1 COG:BS_soj KEGG:ns NR:ns ## COG: BS_soj COG1192 # Protein_GI_number: 16081149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus subtilis # 5 262 3 253 253 197 44.0 2e-50 MSNCKVIALTNQKGGVGKTTTAVNLGVSLVQQGKKVLLIDADAQANLTMALGYNRPDDIP ITLSTVMQNIIDDKTLDASQGIIHHREGVDLLPSNIELSGFEVRLINAMSRERVLKTYVN EVKKNYDYVLIDCMPSLGMITINALAAADSVIIPTQPHYLSAKGLELLLRSVSMVKRQIN PKLRIDGILMTMVMPRTNISKEITATVKSAYGQKIKVFDTEIPHSIRAVEATAEGKSIFA YDKSGKVAAAYEQFGKEVAEIGEKQRNQNRADCIR >gi|226333000|gb|ACII01000019.1| GENE 3 1974 - 2153 137 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|240145248|ref|ZP_04743849.1| ## NR: gi|240145248|ref|ZP_04743849.1| conserved hypothetical protein [Roseburia intestinalis L1-82] # 1 59 1 59 59 103 100.0 3e-21 MREKFNHLYLDSHERKLLIHSLVELKNQLIQQGRYTDCIDELIFKVINAPTKRMKIEYV >gi|226333000|gb|ACII01000019.1| GENE 4 2405 - 5029 3447 874 aa, chain - ## HITS:1 COG:SMc00025 KEGG:ns NR:ns ## COG: SMc00025 COG0574 # Protein_GI_number: 15964685 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate 
dikinase # Organism: Sinorhizobium meliloti # 1 874 1 887 898 1003 56.0 0 MAKWVYMFTEGNANMRNLLGGKGANLAEMTNLGLPVPQGFTITTEACTQYYEDGRQINDE IMAQIMEAITKMEGVTGKKFGDKENPLLVSVRSGARASMPGMMDTILNLGLNEDVVNVLA EKSGNARWAWDCYRRFIQMYSDVVMEVGKKYFEELIDEMKAKRGVKQDVELTAEDLHELA EQFKAEYKAKIGADFPTDPKEQLMGAIKAVFRSWDNPRANVYRRDNDIPYSWGTAVNVQM MAFGNMGDDCGTGVAFTRDPATGANGLFGEFLTNAQGEDVVAGVRTPMHISEMEQKFPEA FVQFKQVCETLEKHYRDMQDMEFTVEHGKLYMLQTRNGKRTAQAALKIACDLVDEGMRTE EEAVAMIDPRNLDTLLHPQFDAAALKAATPMGKGLGASPGAACGKIVFTADDAVEWAERG EKVVLVRLETSPEDITGMKSAQGILTVRGGMTSHAAVVARGMGECCVSGCGDIAMDEENK KFTLAGKEFHEGDFISIDGTTGNIYDGIIPTVDATIAGEFGRIMAWADKYRTMKVRTNAD TPADAKKAVELGAEGIGLCRTEHMFFGEGRIDAFREMICSETAEEREKALEKVLPYQQDD FKGLFEALEGNPVTIRFLDPPLHEFVPTDEADIKKLADAQGKTVEQIKTIIASLHEFNPM MGHRGCRLAVTYPEIAKMQTTAVIRAAIEVKKAHPDWTIKPEIMIPLVGDVKELKYVKKF VVETADAEIKAAGSDLQYEVGTMIEIPRAALTADEIAKEADFFCFGTNDLTQMTYGFSRD DAGKFLNAYYDAKIFENDPFAKLDQVGVGKLMKMAIELGKPVNPNLHVGICGEHGGDPSS VEFCHKIGLNYVSCSPFRVPIARLAAAQAAIANR >gi|226333000|gb|ACII01000019.1| GENE 5 5385 - 6641 1592 418 aa, chain - ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 4 410 13 414 427 260 40.0 4e-69 MAKKRNIIVGQSGGPTAVINSSLAGVYKNAIERGFDKVYGMLHGIQGLLDEQYIDLSTQI HSELDIELLKRTPSAFLGSCRYKLPEIHEDKAIYEKIFEILNKLDIYAFIYIGGNDSMDT IKKLSDYAILTGQTQKFLGVPKTIDNDLALTDHTPGFGSAAKYIGASTKEVIRDALGLTY KKNMITIMEIMGRNAGWLTGATALAKSEDCDGPDLIYLPEVPFDVEKFLAKVKDLLNKKA SIVIAVSEGIKLADGRYVCELGSVGDYVDAFGHKQLQGTATYLANFLAAECGCKTRAVEL STLQRSASHMASRVDIDEAFMVGGAAVKAADEGDTGKMVVIDRVSDDPYMAATGIYDVHR IANEEKLVPREWMNKDATNVTKDFVDYIKPLIQGDYQPIMVNGMPRHLVLNMKKGRKK >gi|226333000|gb|ACII01000019.1| GENE 6 7127 - 8002 803 291 aa, chain + ## HITS:1 COG:lin1322 KEGG:ns NR:ns ## COG: lin1322 COG2017 # Protein_GI_number: 16800390 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Listeria 
innocua # 5 290 4 289 290 189 37.0 4e-48 MIHTIENDYLRVSVDDHGAELCSIFDKVHNREVIWQADPAYWKRHAPVLFPNVGRHFEDH YRINGVEYPSSQHGFARDSEFTCVDMTADSITHRLKSSDATRENYPYDFELKIKHVLEKN QVSVCWEVISLNDETMYFTIGGHPAFNVPAGGIGSQEQYHLTFDGRDSLSYLLIDMSSGT AVADKAYTLELENGSCLIDAHMFDKDALIFDDQIEKAGIAFPDGTPYVELICHGFPSFGI WSVPGSPFVCLEPWMGRCDDFGFKGELPEKKYINTLDKNEIFTASYEIKIY >gi|226333000|gb|ACII01000019.1| GENE 7 8151 - 9023 1112 290 aa, chain + ## HITS:1 COG:TM0268_1 KEGG:ns NR:ns ## COG: TM0268_1 COG0646 # Protein_GI_number: 15643038 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I (cobalamin-dependent), methyltransferase domain # Organism: Thermotoga maritima # 3 281 5 280 285 168 34.0 1e-41 MTREEFQKLTQDVVLLDGATGSNLMAAGMPRGICTEAWIMEHKEVLQNLQKAYVEAGSQI VYAPTFGGNRYSLGLHGLQDKLAEMNHALVNISREAVGHKVYVAGDITTTGKMMEPAGDL TYEMAYETYCEQIKVLEDAGVDLIAAETMINIEETLAALDAAASVSSLPVMCTMTVEADG SIFSGGNAVEAAIALEGAGAVAVGINCSVGPDQLVSVVRNIKENVSIPVIAKPNAGMPTI DDQGNAIYSMDAKSFAEHMKVLIENGASVVGGCCGTTPEFIHEISRSLGR >gi|226333000|gb|ACII01000019.1| GENE 8 9281 - 10414 1174 377 aa, chain - ## HITS:1 COG:MA1031_1 KEGG:ns NR:ns ## COG: MA1031_1 COG2006 # Protein_GI_number: 20089906 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 279 1 291 295 149 33.0 1e-35 MKSKVVLLPCREYDEEKIYMLLKQGLDFLGGVETLIPKDAKILLKPNLLKKAEVEKAVIT HPVVVGVFAGILRESGYENIVLADSCGHGTTQAVIRGTGMDTYLEKYHIPAVDYSEGVKT AYPQGVQAKEFILPKELLEQDCVISLSKMKTHALERITGAVKNSYGFVYGFHKAKGHTQY PSADSFARMLIDLNKCVAPKLYVMDGIVAMEGNGPGSGDPVPMNVLLMSTDPVALDSVFS RLVYLKPEMVPTNYHGEKMGLGTWKEEEITLLTPDGEISMAEAVKKYGNPAFNVDRTEVR KNIWTRMAGALNIFQKKPYIEADKCVRCGICVQSCPVPGKAVDFRKGKGKPPVYDYRKCI RCFCCQEMCPKKAIKVK >gi|226333000|gb|ACII01000019.1| GENE 9 10420 - 12054 1558 544 aa, chain - ## HITS:1 COG:CAC3087 KEGG:ns NR:ns ## COG: CAC3087 COG1080 # Protein_GI_number: 15896338 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in 
bacteria) # Organism: Clostridium acetobutylicum # 3 532 2 536 539 361 38.0 2e-99 MDVYKGTGAFSGIAIGKILYYHKSEYQMRQYEITDVKTELNIFRQARTRVMEQLEDLYEK NCIIQGTQAEIFLRQKNLLEGKSFQRAIESVIQSEKVNSAYAVMTTRDEMLKTFRTLEEP AIKERLDNMREISDRLIGELGGISPRIDLGDEPVIVVTESITPTELMEMDKDKLMAIVTH HGSDMSHAAIMAKTMNIPALLEIDTDSEWDGRQAIVDGYTGTLYIDPDEEVKKEYEIRRQ ADKEEREELLKLRREPDITKDGRKIEIYANIGNMDDLNSVLYYGAAGIGLLRSEFQYLGR ENYPRENELFLAYKKIAETMGEKLTVIRTADLGADKQAEYLNIPDETNPIMGNRGIRLCL DRKRMFKAQLRAIFRASAYGNLALMYPMISSEEEMDEIEEIIREVKSGLDEKGIPYKHIK TGIMIETPAAVMISRELARRVDFLSLGTNDLSQYTLAMDRQNPLLRKKYNDHHPAVLRMI QMVIEAGHAENRRVCICGELAADTALTEELLRMGVDCLSVVPACILPVRKALRQVDLSGD DKGA >gi|226333000|gb|ACII01000019.1| GENE 10 12080 - 13114 1488 344 aa, chain - ## HITS:1 COG:STM1222 KEGG:ns NR:ns ## COG: STM1222 COG0687 # Protein_GI_number: 16764577 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Salmonella typhimurium LT2 # 25 342 23 347 348 180 32.0 3e-45 MKKRVLAVVMCALMAGSVFTGFKDAGDDKELVLFTWEGMFPQEVLDDFEKETGVKVVYSN FDTDETMLEKLSMAKGGDYDIVIADDYILETAIQEGLAEKLDKDSLENIENINPLYQGQF YDPDDEYTVPYGAGIPLIVYDPEQVDIDIKGYKDLWDPSLEDSIALTANYRVINGITQLS MGESMNEEDVDVIKKTGEKLLELAPNVRLIQDDNTQNALLNGEASVAFLYTSQVTAALAE NPDLKVVYPEEGLGFGIMGMFIPSEAPDKDAAYSFMDYIMQPEVAAKCTDYIGYYSTNKA ADELVNPDLVVPDDVTKGEIVQNVSQDADAQYQKNWTEFKAACD >gi|226333000|gb|ACII01000019.1| GENE 11 13204 - 15048 2083 614 aa, chain - ## HITS:1 COG:CAC0837 KEGG:ns NR:ns ## COG: CAC0837 COG0687 # Protein_GI_number: 15894124 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Clostridium acetobutylicum # 295 614 35 353 354 258 41.0 3e-68 MKRNNKFFSMLYVVLCLAFFYLPILVTMIFSFNSSKSLTRFTGFSLRWYQELLGNGEVIK AVYVSVTIAIIATIVSTILGTITAIGLSKSKKVIKELLLNINNIPILNPEIVTAISLMLL FSSLGFRKGYLTMLLAHIAFCTPYVITSVYPKVRALDPNMANAAMDLGATPFQALTKVIV PMIKEGIFAGALLAFTMSFDDFVISYFVSGNGVKNISIVVYNMTKRINPTINALSTIVIV VIIVVLLLSNLLPKFKNKARKLNRKAVKIVSVVLVVAVMAGLIKWGFVAQSTHVLKVYNA 
GEYMDLSLLEDFEKEYDCTIVYETFESNEMMYTKLSSGETYDVLIPSDYMIERLIKEEYL QALDWKEIPNKKNLLNDVMNQSYDPGNRYSCPYFWGTVGILYDKTVVDEADLKDGWNLLC NPKYKGNIYMYDSERDSFMIALKALGYSMNTTNEAEIEAAYQWLINQRDTMDPIYAGDDV IDNMISGNKALAVVYSGDASYIISENPNLDYFTPEQGTNRWYDAMVITKDCSEVQLAHKF INFMISDKSALSNTEEVGYTSTVKSAFETMKEGDYAGISSYIPDTTNPNNEIFAYQQPKI KQKFAELWTKVKAK >gi|226333000|gb|ACII01000019.1| GENE 12 15060 - 15875 768 271 aa, chain - ## HITS:1 COG:CAC0839 KEGG:ns NR:ns ## COG: CAC0839 COG1176 # Protein_GI_number: 15894126 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Clostridium acetobutylicum # 1 268 1 265 277 248 51.0 1e-65 MKQYRSLIKPYLVWAFVVIVIPLILIALYAFTEDGNEVLTFSFTLENFRKFLEATYMSVI FKSFKLGILTTVICLGLGYPLAYIISKCPEKSQSLLILLVTIPEWINMLLRTYAWMNLLS DNGIINHLLGLLGISPVTMMYTDFSVVVGLVCNFMPFMIIPIHTSLSKMDKSFIEAAYDL GANKFQTFTKVIWKLSIPGVLNGIMMVFLLSISTFVIPKLLGGGQYMLIGNLIESQFISV GDWNFGSAISLILAVLILIFMSLMKKMDKDE >gi|226333000|gb|ACII01000019.1| GENE 13 15875 - 16918 1221 347 aa, chain - ## HITS:1 COG:CAC0840 KEGG:ns NR:ns ## COG: CAC0840 COG3842 # Protein_GI_number: 15894127 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Clostridium acetobutylicum # 4 343 6 345 352 409 59.0 1e-114 MNKLIELKNLTKNFDDQQVLRGINLDIYEKEFLTLLGPSGCGKTTTLRIIAGFEEPSDGE VLFNGIEISKLPPYKREVNTVFQKYALFPHLNVAENIGFGLNLKKVDKTVIAQKVERMLK MVGLEGFGKRDVTLLSGGQQQRVAIARALVNEPKVLLLDEPLGALDAKIRKQMQVELKKI QQEVGITFIYVTHDQEEALSMSDTVVVMNNGEIQQIGSPTDIYNEPENRFVAGFIGESNI IEGTMIRDYLVAFDGFEFECVDKGFEDNEEIEVVLRPEDLDIVDPGQAKIRGIVRNITFK GVHYEILIETELRTYMVHTTDYAEVGREVGLKFGPEDIHVMCKMGSY >gi|226333000|gb|ACII01000019.1| GENE 14 16966 - 17739 673 257 aa, chain - ## HITS:1 COG:BMEI0414 KEGG:ns NR:ns ## COG: BMEI0414 COG1177 # Protein_GI_number: 17986697 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # 
Organism: Brucella melitensis # 5 257 6 259 273 195 41.0 7e-50 MKKNSRFAGFYMGLIFFLMYLPIAVVIVFSFNESKLPVKFTGFSLKWYQELIHDRAMLEA LVNSLILGVLSCLVSAVIGTLGAVGLSRIHWKSKGALEYISILPLMIPEIILGMVLMAFF YMLNLPFGMLTLLIGHTVFCVPYILMEVKARLAGMDPALEEAARDLGAGPFRAFRDITLP LIMPAVISGSLLAFAMSMDDVVISIFVNGPRLSTLPIKVYTQLKTGVTPEINALCTILLA ATLVIFIVYSLITKKKK >gi|226333000|gb|ACII01000019.1| GENE 15 17736 - 18599 767 287 aa, chain - ## HITS:1 COG:STM1225 KEGG:ns NR:ns ## COG: STM1225 COG1176 # Protein_GI_number: 16764580 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Salmonella typhimurium LT2 # 23 278 20 275 287 208 42.0 9e-54 MKKRRNHSDSSAWMALPLYIFTIVFVVCPLIYMVALSFATPSRGFGVTWKFTLDNYRNIL EPVYLNTFVESLKLAFTSTIVIALIGYPFGYFMARLPEHRKKKAQLLLSTPFWVNSLIRL YGWIIILQKKGLLNFVLIKLGIIEKPLSILYSYPAIVIGMIYVLLPFMIMSVYSSAEKLD WSYVEAARDLGASRLQAFFTITLKLTLPGLLSGVILTFVPSMGLFFIADILGGNKVVLVG SLIQDQMTRGSNWPFAAALAVVMMVLTTVLIMIYRKVTNARELEGIG >gi|226333000|gb|ACII01000019.1| GENE 16 18586 - 19794 1169 402 aa, chain - ## HITS:1 COG:ECs1571 KEGG:ns NR:ns ## COG: ECs1571 COG3842 # Protein_GI_number: 15830825 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Escherichia coli O157:H7 # 6 369 18 344 378 294 44.0 2e-79 MAEVSLELKEIKKSFTEGEAVLDNISLEISKGEFITLLGSSGCGKTTTLRIIAGLEQPDA GSVWLDGREVTGLEPNQRDVNTVFQNYALFPHMNVAENIGYGLKLKKVPKSEIRKKVSQM LELVQLEGYEKRKPSELSGGQKQRVAIARALVNNPKVLLLDEPLGALDLQLRRAMQIELK HLQKKLGITFIYITHDQEEAINMSDRIAVMKDGRIEQIGTPDEIYNHPKTSYVATFVGNA NILHGAAESIQGQNAIVKIGNDKVIVKLETSQQNTEDTGGKQHLAAGEKVTLAVRSENIL LQEAAVIGNTGTDNGDTVDIRVAGGISDIHDANSISGLQATVTEKNFAGGQLRVTLKLSD GTQLIASRYGIDASVAEGQTVRCSFLPTDAVLVDREDIHEEA Prediction of potential genes in microbial genomes Time: Sat May 28 19:14:31 2011 Seq name: gi|226332999|gb|ACII01000020.1| Ruminococcus sp. 
5_1_39B_FAA cont1.20, whole genome shotgun sequence Length of sequence - 8463 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 89 - 161 82.6 # Thr CGT 0 0 1 1 Tu 1 . - CDS 222 - 2486 2251 ## COG0296 1,4-alpha-glucan branching enzyme - Prom 2593 - 2652 8.3 + Prom 2614 - 2673 6.5 2 2 Tu 1 . + CDS 2740 - 3714 1279 ## COG0524 Sugar kinases, ribokinase family + Term 3743 - 3784 4.5 + Prom 3777 - 3836 2.7 3 3 Tu 1 . + CDS 3857 - 4885 828 ## Swol_1080 ATP-dependent transcriptional regulator-like protein + Prom 4897 - 4956 5.6 4 4 Tu 1 . + CDS 4991 - 8462 2116 ## gi|253579950|ref|ZP_04857218.1| predicted protein Predicted protein(s) >gi|226332999|gb|ACII01000020.1| GENE 1 222 - 2486 2251 754 aa, chain - ## HITS:1 COG:sll0158 KEGG:ns NR:ns ## COG: sll0158 COG0296 # Protein_GI_number: 16331275 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Synechocystis # 39 709 40 769 770 497 37.0 1e-140 MDEKVYGYMDWPRIEAIVYGEETAPRDVMGPRITQDGVLIQGFFPDAEEVSVISGKKTYA CEKEDEAGYFAVLLPVRKVPEYRFLIKMGETEKECYDPYAFPCQITEEEEKAFCAGVYYE AYKKLGAHPMEIKGVKGTLFAVWAPNAVSVNIAGDFNGWIGRATIMHRMPMSGIFELFVP GVEAGTHYKYEIKVKGGEVLLKADPYGNSADHDPEGASVVADVSAFQWNDGDWMKERHRF DDRKQPVSIYETSLEEWKSAEELVEFLAEEDFTHVELHPVMEYLDDITGGYSTYAYYAPT SRFGSVADFQKLVDELHQAGIGVILDWTPAQFPRYASGLEKFDGTPLYERQNPAEAIHPF WGTLLYNYGSPMVKDFLISNACFWAEVYHADGLRMDDVDAMLYLDYGRNPGEWTPNIYGT NENLDALEFLKHLNSVIKERNPGLLLVAQENGLWPELTDSVGNDHLGFDYKWSGGWTKDL LEYLSKDPIERKNYHDQLTMSMLYAYCEHYILTLGSRDVGTLKDFADKLPGSEEQKNAQI REAYVYMMLHPGCKMMAPDKDMPKELEVFVKDLNNMYLAHPALYQLDDEYDGFEWVQLMK YEENVIAFMRKTEKPEETILAVCNFAAIPYENYNVGVPFAGKYKEIFNSDDKKYGGNGVV NTRVKAAKKAECDEREYSITVKLPALGVAVFTCTPEETEKKPAAEHSQIKKSITKTRTVR KAAGKTKAAVKTAVKPVTKTVSEIPVKKDLTEKK >gi|226332999|gb|ACII01000020.1| GENE 2 2740 - 3714 1279 324 aa, chain + ## HITS:1 COG:BH1857 KEGG:ns NR:ns ## COG: BH1857 COG0524 # 
Protein_GI_number: 15614420 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Bacillus halodurans # 8 321 5 315 319 209 38.0 8e-54 MGNKAFDVTALGELLIDFTENGNSAQGNPILEANPGGAPCNVLAMLEKLGKKTAFIGKVG NDMFGTQLKNAVEEVGIDTRNLVIDNEVHTTLAFVHTYPDGDRDFSFYRNPGADMMLTKD EIQEDLIRDSRIFHFGTLSSTHEGVREATRYAIDVAKEAGCIVSFDPNLRPPLWKSLDDA KAEIEYGLGKCDILKISDNEVEFLFGTTDYDKGAALLKEKYNIPLILITLGKDGSRAYYK DMKVEAAPFLQEKTIETTGAGDTFCASSLNYVLEHGLDNLTEENLKELLTFANAAASLIT TRKGALRVMSTKEEVLDFMKSRGC >gi|226332999|gb|ACII01000020.1| GENE 3 3857 - 4885 828 342 aa, chain + ## HITS:1 COG:no KEGG:Swol_1080 NR:ns ## KEGG: Swol_1080 # Name: not_defined # Def: ATP-dependent transcriptional regulator-like protein # Organism: S.wolfei # Pathway: not_defined # 167 336 648 818 823 67 27.0 9e-10 MNSSDNFQDSALSRLMPLMNSSFTPGQAQATVDNFQDLDQRQIAQAELYYFSGRAEECRN IAELYLQDKDLCLRLSAALLYSFSNLTLGNPSASRMGFRNIQECLLVAKNSSAPKGIMAS CVFANYLAMVLMHLPTDGLPPLQDFLPSLPSGLRAYAVYVLAHNAYLHKEYTRALGLCQS VFLMLDGCYPVAMEYLYCVIIMCLINLKQQDEAREALIKAWNMAKPDGFLEPFIEHHGLM LGQIEACIKPAEPESYRQLSQAVIAFSRGWMAIHNPQLQSSVTDKLTPMEYSIAMLASKG WTNQEIAKQLSLSPNTIKHYLSRIFHLLDIEKREELKPFVNK >gi|226332999|gb|ACII01000020.1| GENE 4 4991 - 8462 2116 1157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579950|ref|ZP_04857218.1| ## NR: gi|253579950|ref|ZP_04857218.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 631 1157 609 1144 1185 827 81.0 0 MRKKLLAGVMALALCSTNMLPQTILAGEFTSGNLEEVSEETPEIFSDDNQEVIEETEEEL SVFSSESIPEFSNEANIMPATADGTDHINEINMDEINNKPYNKNNGKWEITSVGSYRFNG KSPNDAPIIIDNIDSGTVKVYLNNVNIETASGPALQITSDVQAQVCIYLENENKLISKHR DSAALQKDNNANLTIDNATNTTPGTLTVQTYFTDYSKSGFGAGIGSGFGNVSSGSCSNIT INGGSVNASSTNGTDIGSGRAAFLTGRRGSCSNITISGGSVNAKNIGCIPHQTLDSAGNY NPDSPEVYLCTIPNEENDPIFIDDTPWSPYSHIAVDPNNKNLYAWLTGKTHTISVGDTTQ TYYFDKNNNNNNNKEFKPITNRDFAFTPTDLTYNGTAQKAPLECKIEDIKDEITLSYYNK DDSQQSNEAINAGTYTVTANIAGNSFADESWTFTIKPRNLNISIDDKERKYGEDNPELTY LISPDTPLVANDTTDILGITLTTTANKDTAANTTWPITMYYTTKNYTINATLGKLTIKPA TFNENSIKITAYSGTYDGSEHPVIKEISVYPAGSTVEYSTETGSSNWNTTCPKVKNVSDS KKTKVWIRISKDNYEPWFSGEIQATISPTQIENLPLNPQILVPWTCKKVADITTKLPTNW NWQDSSKDLQLGSNSVIAVYTGPDAGNENYERETVTYTITRSKCTHEHTAGRYYSSPSCT SSGYSGDTYCKDCNETISYGYTISAYGHDYDSGVITTEPTTETDGIITYTCKRCKHQDTK VLGKLGDGEPYIEGSFQKKGWDAVNDLIKASKEKDTISIILNGAKILPASVLSGIKGKDI SLNLDIENGFIWKINGTSITAETPADTDLSVTNTAEHIPAALYSLISTNQNDFGFHLGRS RAFDFPAVLSVKADASCAGLMANLFWYDVENGVLQCIQTVTVGGAFERSIPYADFTLSKG QDYFIAFGTESLNGRDIHTDGSITDENGAYLRPANTKISSHSIDRNKLTVKLSKGCAGAQ GYDFVISKKSNMLQTGKFSKTVSSTGKPQASFRYLAKGTWYVAARSWVLDAQGNKVYGSW TKVKKIKITVVTPQQPKIKDITVKENTVTVTYTKCKNATGYEILLGNKYKTSAGEKYPVK KYVKRTEGKNTVTVTFT Prediction of potential genes in microbial genomes Time: Sat May 28 19:15:21 2011 Seq name: gi|226332998|gb|ACII01000021.1| Ruminococcus sp. 5_1_39B_FAA cont1.21, whole genome shotgun sequence Length of sequence - 33328 bp Number of predicted genes - 26, with homology - 26 Number of transcription units - 13, operones - 6 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 31 - 1194 1692 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 2 1 Op 2 . 
- CDS 1194 - 2276 1204 ## COG1932 Phosphoserine aminotransferase 3 1 Op 3 6/0.000 - CDS 2260 - 3951 1990 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 4 1 Op 4 1/0.000 - CDS 4027 - 5700 2266 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase - Prom 5737 - 5796 4.6 5 1 Op 5 . - CDS 5808 - 6890 1490 ## COG0473 Isocitrate/isopropylmalate dehydrogenase - Prom 6928 - 6987 9.0 - Term 7006 - 7067 13.1 6 2 Op 1 15/0.000 - CDS 7079 - 7378 386 ## COG1862 Preprotein translocase subunit YajC - Prom 7556 - 7615 8.1 7 2 Op 2 . - CDS 7627 - 8778 1231 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase - Prom 8806 - 8865 5.8 + Prom 8852 - 8911 7.8 8 3 Tu 1 . + CDS 9004 - 13248 4480 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) + Term 13323 - 13361 -0.9 - Term 13472 - 13522 3.3 9 4 Op 1 . - CDS 13561 - 15189 1567 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 10 4 Op 2 . - CDS 15235 - 15909 640 ## COG0692 Uracil DNA glycosylase 11 4 Op 3 2/0.000 - CDS 15942 - 16700 972 ## COG0107 Imidazoleglycerol-phosphate synthase 12 4 Op 4 . - CDS 16702 - 17340 705 ## COG0118 Glutamine amidotransferase - Prom 17441 - 17500 6.5 - Term 17444 - 17504 11.2 13 5 Op 1 40/0.000 - CDS 17560 - 19104 1543 ## COG0642 Signal transduction histidine kinase 14 5 Op 2 . - CDS 19101 - 19781 867 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 19827 - 19886 7.3 - Term 20638 - 20688 -0.3 15 6 Tu 1 . - CDS 20765 - 20908 120 ## gi|253578137|ref|ZP_04855409.1| predicted protein - Prom 20991 - 21050 6.4 + Prom 20892 - 20951 5.7 16 7 Tu 1 . + CDS 21124 - 21672 169 ## PROTEIN SUPPORTED gi|163783284|ref|ZP_02178277.1| 50S ribosomal protein L16 + Term 21678 - 21738 8.3 - Term 21872 - 21907 4.0 17 8 Tu 1 . 
- CDS 21914 - 22921 1076 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family - Prom 22960 - 23019 4.6 - Term 22938 - 22988 11.1 18 9 Op 1 . - CDS 23039 - 23596 719 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) 19 9 Op 2 . - CDS 23740 - 24264 629 ## COG0703 Shikimate kinase 20 9 Op 3 . - CDS 24261 - 24764 444 ## COG2179 Predicted hydrolase of the HAD superfamily - Prom 24796 - 24855 1.9 21 9 Op 4 . - CDS 24859 - 26220 1349 ## COG0372 Citrate synthase - Prom 26307 - 26366 6.4 + Prom 26655 - 26714 13.3 22 10 Tu 1 . + CDS 26756 - 28090 1745 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + Term 28126 - 28175 14.1 - Term 28110 - 28165 8.2 23 11 Tu 1 . - CDS 28227 - 28676 474 ## COG1327 Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains - Prom 28804 - 28863 7.4 + Prom 28677 - 28736 6.7 24 12 Op 1 1/0.000 + CDS 28957 - 29211 147 ## COG1873 Uncharacterized conserved protein + Prom 29222 - 29281 6.2 25 12 Op 2 . + CDS 29431 - 30210 910 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit + Term 30236 - 30287 -0.3 - Term 30229 - 30269 5.5 26 13 Tu 1 . 
- CDS 30350 - 33238 2697 ## COG1887 Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC - Prom 33268 - 33327 3.0 Predicted protein(s) >gi|226332998|gb|ACII01000021.1| GENE 1 31 - 1194 1692 387 aa, chain - ## HITS:1 COG:lin2956 KEGG:ns NR:ns ## COG: lin2956 COG0111 # Protein_GI_number: 16802015 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Listeria innocua # 1 387 1 389 395 415 54.0 1e-116 MFQYHCLNPIAEKGLGLFDEEYKKSEDLQECDAVLVRSAKMHDMELPESVKVIARAGAGV NNIPVKDCAEKGVVVFNTPGANANGVKELVLAGMLLASRDIVGGIEWVAKEKDQEDIDKL AEKQKKQFAGCEIMGKKLGIIGLGAIGAMVANAASALGMEVYGYDPYISIDAAWNLSRTI KHIKSLDEIYSQCDYITIHVPLLDSTKEMINKEALDKMKDGVVLLNFARDLLVDEDALIE ALDSGKVKKYVTDFANHTVAGHKGILVTPHLGASTEESEENCAVMAVKEVRDFLENGNIK NSVNFPNCDMGTCVAVGRIAICHKNIPNMISQFTKILGAEGLNIADMTNKSKGEYAYTLI DLESAASREALDELKAIEGVSRVRVVK >gi|226332998|gb|ACII01000021.1| GENE 2 1194 - 2276 1204 360 aa, chain - ## HITS:1 COG:RSc0903 KEGG:ns NR:ns ## COG: RSc0903 COG1932 # Protein_GI_number: 17545622 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Ralstonia solanacearum # 2 360 13 378 378 399 52.0 1e-111 MSRVYNFSAGPAVLPEEVLKEVAEEMMDYQGTGMSVMEMSHRSADFQKIIDEAEQDLRDL MKIPDNYKVLFLQGGASQQFAAIPMNLMKNKVADYIVTGQWAKKAYQEAQKYGKANKIAS SEDKTFSYIPDCSDLPISPDADYVYICENNTIYGTKFKKLPNTKGKTLVADVSSCFLSEP VDVSKYGIIYGGVQKNIGPAGMVISIIREDLITDDVLEGTPTMLKFKTQADAGSLYNTPN CYCIYVCGKVFKWLKKMGGLEEMQRRNIEKAKILYDFLDQSKLFKGTVVPEDRSLMNVPF VTGDKDMDAKFVKEAKEAGLVNLKGHRTVGGMRASIYNAMPKEGVEALIAFMKKFEEENA >gi|226332998|gb|ACII01000021.1| GENE 3 2260 - 3951 1990 563 aa, chain - ## HITS:1 COG:CAC3169 KEGG:ns NR:ns ## COG: CAC3169 COG0028 # Protein_GI_number: 15896417 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring 
enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Clostridium acetobutylicum # 1 550 1 550 554 645 59.0 0 MQLTGAEIIVECLKEQGVDTVFGYPGGAILNVYDALYEYKDQITHVLTSHEQGASHAADG YARATGKVGVCLATSGPGATNLVTGIATAYMDSVPMVAITCNVGTPLLGKDSFQEVDITG IVMPITKHSFIVKDITTLADTVRRAFYIAKSGRPGPVLVDVTKDVTGAKYEYTACAPAEI IPKTDTIKDEDIATAVQMIKEAKRPFIFTGGGTIISEASREVTELAHRIDAPVCDSLMGK GGFSGEDPLYTGMLGMHGTKTSNLGVSGCDLLIVLGARFSDRVTGNTKTFAKNAKILQID VDAAEINKNIVVDASVVGDEKEVLKRILAEVPEMRHPEWTAHISELKEKYPLRYDHSQLT GPYIIEKLYELTKGDAIITTEVGQNQMWAAQYFKYKEPRTFLSSGGLGTMGYGLGAAIGA KMGRPDKTVVNIAGDGCFRMNMNEIATAVRCGRPLIEIVLNNHVLGMVRQWQTLFYDHRY SHTILNDAVDFVKLAEAMGAKAIRITKMEEVEPALKEALVCPGPIVLDCMIDQDLSVFPM VPAGASIDDIFDEEDMKNNEQSV >gi|226332998|gb|ACII01000021.1| GENE 4 4027 - 5700 2266 557 aa, chain - ## HITS:1 COG:CAC3170 KEGG:ns NR:ns ## COG: CAC3170 COG0129 # Protein_GI_number: 15896418 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Clostridium acetobutylicum # 1 555 1 551 552 659 61.0 0 MRSDAVKTGTQQAPHRSLFNALGMTKEEMDRPLVGIVSSYNEIVPGHMNLDKITQAVKLG VAMAGGTPVMFPAIAVCDGIAMGHVGMKYSLVTRDLIADSTEAMAMAHQFDALVMIPNCD KNVPGLLMAAARLNVPTVFVSGGPMLAGHLNGHKTSLSSMFEAVGAYAAGKLDEDGLTEC EMKTCPTCGSCSGMYTANSMNCLTEVLGMGLKGNGTIPAVYSERIRLAKHAGMQVMEMYR KNIRPRDIMTKEAILNALIVDMALGCSTNSMLHLPAIAHEIGMDFDISFANEISAKTPNL CHLAPAGPTYIEDLNEAGGVYAVMNELNKKGLLHTECMTVTGKTVGENIKDCVNLNPEVI RPIDNPYSQTGGLAVLKGNLAPDGGVVKRSAVVEEMMVHEGPARVFDCEEDAIAAIKGGK IVEGDVVVIRYEGPKGGPGMREMLNPTSAIAGMGLGSSVALITDGRFSGASRGASIGHVS PEAAVGGPIALVEEGDIISINIPELKLEIKVSDEEMQARKAKWQPREPKVTTGYLARYAA MVTSGNRGAILEVPKAK >gi|226332998|gb|ACII01000021.1| GENE 5 5808 - 6890 1490 360 aa, chain - ## HITS:1 COG:PA3118 KEGG:ns NR:ns ## COG: PA3118 COG0473 # Protein_GI_number: 15598314 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: 
Isocitrate/isopropylmalate dehydrogenase # Organism: Pseudomonas aeruginosa # 1 360 1 357 360 424 60.0 1e-118 MDYKIALIPGDGIGPEIVREAKKVLDKVCEKYGHSFSYSEVLLGGASIDVHGVPLTDEAI ATAKSSDAVLMGSIGGDAKTSPWYKLEPSKRPEAGLLAIRKALNLFANLRPAYLYNELRK ACPLRDEIIGDGFDMIIVRELTGGLYFGERQTIEENGVKKAIDTLSYNENEIRRIAIKAF EIARKRRNKVTSVDKANVLDSSRLWRKVVEEVAKDYPDVTLEHMLVDNCAMQLVRDPKQF DVILTENMFGDILSDEASMVTGSIGMLSSASLNETKFGLYEPSHGSAPDIAGQDKANPIA TILSAAMMLRFSLDMDKEADAVEAAVQKVLTEGYRTGDIMSEGCKLVGTKEMGDLIAAAL >gi|226332998|gb|ACII01000021.1| GENE 6 7079 - 7378 386 99 aa, chain - ## HITS:1 COG:BS_yrbF KEGG:ns NR:ns ## COG: BS_yrbF COG1862 # Protein_GI_number: 16079823 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Bacillus subtilis # 8 94 4 88 89 70 37.0 6e-13 MNASSGMGMVGAIVWMVVLFGIMYFLMLRPQKKEQKRLQAMLNDMEVGDSIVTTGGFYGV VIDMTEEDVIVEFGNNKNCRIPMRKQAIAEVEKAGSAAE >gi|226332998|gb|ACII01000021.1| GENE 7 7627 - 8778 1231 383 aa, chain - ## HITS:1 COG:CAC2282 KEGG:ns NR:ns ## COG: CAC2282 COG0343 # Protein_GI_number: 15895550 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Clostridium acetobutylicum # 4 374 2 372 376 629 76.0 1e-180 MAEYKLLKTEGRAKRAEFHTVHGTIQTPVFMNVGTIGAIKGAVSTMDLQQIGTQVELSNT YHLHVRPGDKIVKQLGGLHKFMVWDKPILTDSGGFQVFSLAKLRKIKEEGVYFNSHIDGH KIFMGPEESMQIQSNLASTIAMAFDECPSSVADRDYVQKSVDRTTRWLARCKAEMARLNS LPDTINKEQLLFGINQGAVYEDIRIEHAKAISEMDLDGYAVGGLAVGETHEEMYRILDAV VPYLPQNKPTYLMGVGTPANILEGVERGIDFFDCVYPSRNGRHGHVYTNHGKMNLFNAKY ELDTRPIEEGCQCPACQHYSRAYIRHLLKAKEMLGMRLCVLHNLYFYNHMMEEIRDALDA GNFAEYKKMRLEGFEEGKINSKR >gi|226332998|gb|ACII01000021.1| GENE 8 9004 - 13248 4480 1414 aa, chain + ## HITS:1 COG:CAC2401_1 KEGG:ns NR:ns ## COG: CAC2401_1 COG1924 # Protein_GI_number: 15895667 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Clostridium 
acetobutylicum # 6 661 7 663 663 847 61.0 0 MNYKTLGIDIGSTTVKIAILDENENLIFADYKRHFANIQETLADLLQEAFDQYGEMTLHP VITGSGGLTLANHLGVPFTQEVIAVSTSLQKLAPKTDVAIELGGEDAKIIYFENGNVEQR MNGICAGGTGSFIDQMASLLQTDATGLNEYAKDYKALYPIAARCGVFAKTDIQPLINDGA TKPDLSASIFQAVVNQTISGLACGKPIRGHVAFLGGPLHFLSELRESFIRTLKLDEEHTI IPENSHLFAAIGSALNAKEDNSISLTEMKDRLRNSIKLDFEVERMEPLFASQKEYDAFAK RQSSYKVKTGDLATYKGNCYLGIDAGSTTTKTALVGEDGTLLYSFYSSNNGNPLKTTIRS IKEIYKLLPKDAKIAYSCSTGYGEALIKAALLLDEGEVETVSHYYAAAFFDPEVDCILDI GGQDMKCIKIRNKTVDSVQLNEACSSGCGSFIETFAKSLNYTVQDFAKAAVFAQHPIDLG TRCTVFMNSKVKQAQKEGAEVSDISAGLAYSVIKNALYKVIKVSDASELGKHIVVQGGTF YNDAVLRSFEKIAGCEAIRPDIAGIMGAFGAALIARERHQEGAETTMLSIDKINELKYTT SMANCRGCTNNCRLTINKFSGGRQYVSGNRCERGIGKEKNKDHIPNLYEYKYKRIFSYTP LTADKATRGKVGIPRVLNMFENYPFWYTFFTELKYEVVLSPTSTRKIYELGIESIPSESE CYPAKLAHGHVTWLIRNGVKFIFYPCIPYERNEFPDAVNHYNCPIVTSYAENIKNNVDEL NDPSITFRNPFLAFTSEEILANRLVEEFKDIPAEEVKAAVHKGWEEMAAARRDVQKKGEE TLKYLEDTGRHGIVLAGRPYHIDPEIHHGIPDLINSYGIAVLTEDSISHLAPVERPIRVN DQWMYHSRLYAAANYVKTRDDLDLIQLNSFGCGLDAVTTDEVYEILDGSDKIYTCLKIDE VNNLGAARIRIRSLIAAIRAKKAQGQKRTVKPASIDKVSFTKEMRKDYTILCPQMSPFHF SLLQAAFNSCGYNLEVLPNDNKHAVDVGLKYVNNDACYPSLIVVGQIMDALLSGKYDLNK TAVVMSQTGGGCRASNYIAFIRRALKKAGMEQIPVISVNLSGLESNPGFKLTLPLVKKVA YGAVFGDILMKCVYRMRPYELEEGIVNRKHKIWEQRVISFLSGSSVSHSQFKKMCREMVH EFDTIPISDVKKPRVGIVGEILVKFLPAANNHLADLLESEGAEAVVPDLIDFMCYCFYNQ NFKVENLGFKKSKATMANWGIKAIEWVRKPASEALAQSRHFAPPADIRDLAKMASPIVST GNQTGEGWFLTGEMMELIHGDVPNIVCIQPFGCLPNHIVGKGVIKEIRREYPTANIVAID YDPGASEVNQLNRIKLMLSTAQKNLKKVEEKMHK >gi|226332998|gb|ACII01000021.1| GENE 9 13561 - 15189 1567 542 aa, chain - ## HITS:1 COG:BH3277 KEGG:ns NR:ns ## COG: BH3277 COG2244 # Protein_GI_number: 15615839 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Bacillus halodurans # 5 507 4 512 539 161 24.0 3e-39 MNQKNNLVKNASFLMIAAMISKVIGLLYKSPLSNIIGSLGMGYTSLAQNAYMILLMIASF SIPQAVSKLISERIALKDYRNAHKFFKGAMIYAMVIGGVVGLFCLFGAGLIIPQNQKDAI 
PALQILAPTIFLSGILGVFRGYFQAYRNMMPTSISQIIEQVFNAAVSLLAAWGFINAFSD GTENSIAKWGAAGSTVGTGAGVVTALAFMLLVYGVNRRKILHRVEKDHRHQEESYKEIFR VIILVVSPIILSAFVYNVNAYINGYLFSDILGRRGGDAAQIGILYAEYATYFMTIINIPL TLSSTAPTSMIPEVSALYATGDIEATRECIDQTVQLSMVVSAPCAMGLAVLAQPIVFLLY GNSTGLAANLLILGSFSILLNGMSNISNGVLQAIGQQRIPVITAAIALVVDIVVVVVLLF TTNLGVYALLIAMVIYSVVVCVLNDRAMKKYLQYKNPWKEGYLYPILASVPMGIVAGCIC YGLNIFVKSNFICLIVSIPVAAVVYLFAYLIISKPSESQLRRIPGGSYLIRIAEKLPFWQ NN >gi|226332998|gb|ACII01000021.1| GENE 10 15235 - 15909 640 224 aa, chain - ## HITS:1 COG:lin1190 KEGG:ns NR:ns ## COG: lin1190 COG0692 # Protein_GI_number: 16800259 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Listeria innocua # 1 224 1 224 224 295 62.0 5e-80 MSAISGDWLEALKGEFHQPYYAKLYKTVMTEYQTRKIFPPADDLFNAFHFTPLNEVKVVI LGQDPYHNDGQAHGLCFSVKRGVETPPSLVNIYQELHDDCGCYIPNNGYLEKWAKQGVLL LNTVLTVRAHQANSHRGIGWEQFTDAAIQVLNEQDRPIVFLLWGRPAQMKKSMLNNPKHL ILEAPHPSPLSAYRGFFGCKHFSKTNEFLVANGLEPIDWQIENI >gi|226332998|gb|ACII01000021.1| GENE 11 15942 - 16700 972 252 aa, chain - ## HITS:1 COG:aq_181 KEGG:ns NR:ns ## COG: aq_181 COG0107 # Protein_GI_number: 15605750 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Aquifex aeolicus # 1 251 1 250 253 333 65.0 1e-91 MFTKRIIPCLDVNNGRVVKGVNFVQLRDAGDPVEIAKAYDAAGADELVFLDITASCEQRD TVVDMVRRVAANVFIPFTVGGGIRTVDDFKKLLREGADKISVNSAAIDRPELISEAADKF GSQCVVVAIDAKRREDGGWNIYKHGGRLDTGIDAIEWAKKVEALGAGEILLTSMDCDGTK AGYDLALTRAIADAVSIPVIASGGAGTLEHFYDALTEGGADAALAASLFHYKELEISQVK DYLSDRGISVRR >gi|226332998|gb|ACII01000021.1| GENE 12 16702 - 17340 705 212 aa, chain - ## HITS:1 COG:MJ0506 KEGG:ns NR:ns ## COG: MJ0506 COG0118 # Protein_GI_number: 15668683 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Methanococcus jannaschii # 1 198 3 194 198 202 52.0 4e-52 MIAIIDYDAGNIKSVEKALHYLGEETTVTRDPQTLLNADKVILPGVGSFGQAMENLHTYG 
LVPVIHEIVEKKTPFLGICLGLQLLFESSEETPGVEGLGILKGKIVKIPPAPGLKIPHMG WNSLHFQNNGRLFQGIPEQTYVYFVHSYYLQAEEPEIVKATTEYSTCIHASVEKDNVFAC QFHPEKSSKWGLKILENFAAVGKDGNTEKEGK >gi|226332998|gb|ACII01000021.1| GENE 13 17560 - 19104 1543 514 aa, chain - ## HITS:1 COG:CAC2434 KEGG:ns NR:ns ## COG: CAC2434 COG0642 # Protein_GI_number: 15895699 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 13 510 6 489 492 257 31.0 4e-68 MTGNGKTDKKFHTLQHQISVVFICLLLLSIFTITLINGLFLEKYYVSKKVEVLEEAKEVL SQMNLDDILQYDTDIEEDKKGATDEISDEIERSSSRNNLTWIIVNEENSGYYYWGENNMA KMLRSKLFGYINNLDQDMQHSRVLKKTDTCTMWQVHDRFAGMEYVECWGQFDNGYYFLIR SPLESIKESASISNSFYFIVGIIIIVVSGIVILVMTNRITRPISELTKLSEKMSDLDFDA RYQSRAGNEIDVLGDNFNKMSRKLESTISELKSANNKLQKDIEDKIKIDEMRKEFLDNVS HELKTPIALIQGYAEGLNENISDDPESREFYCEVIMDEASKMNKLVKNLLTLNQLESGKD APVMERFDIVSLIRGVLGSMHIMIEQKEATVIFEETEPVYVWADEFKIEEVVTNYTSNAL NHLDGERKVEIKVLQEEDCVKVTVFNTGTPIPEEDIPNLWNKFYKVDKARTREYGGSGIG LSIVKAIIESMNQKYGVCNYDNGVEFWFTLDCRQ >gi|226332998|gb|ACII01000021.1| GENE 14 19101 - 19781 867 226 aa, chain - ## HITS:1 COG:CAC2435 KEGG:ns NR:ns ## COG: CAC2435 COG0745 # Protein_GI_number: 15895700 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 6 226 5 224 224 228 51.0 1e-59 MDTLKILVVDDESRMRKLVKDFLTKKNFQVLEAGNGEEAMDIFYEEKDIALIILDVMMPK MDGWEVCREIRKNSKVPIIMLTARSDERDELLGFDLGVDEYISKPFSPKILVARVEAILR RTGQNNPEDVISAGGIVIDKAAHLATVDGKPMELSFKEFELLTYFLENQGIALSREKILN SVWNYDYFGDARTIDTHVKKLRSKMGDKGEYIKTVWGMGYKFEVDE >gi|226332998|gb|ACII01000021.1| GENE 15 20765 - 20908 120 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578137|ref|ZP_04855409.1| ## NR: gi|253578137|ref|ZP_04855409.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 47 1 47 47 86 100.0 4e-16 MRVNGADSVTTAVLIGNTPEVNENFYTYDVSESDYKKELIRKVSLCG >gi|226332998|gb|ACII01000021.1| GENE 16 21124 - 21672 169 182 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163783284|ref|ZP_02178277.1| 50S ribosomal protein L16 [Hydrogenivirga sp. 128-5-R1-1] # 6 158 5 157 185 69 29 2e-11 MALLKEESYTIEDIYALPDGERAELIDGHIYYMAPPSYKHQKLVMELSAIIRNYIKQHKG TCEVLPAPFAVYLDEINNTYVEPDISVICDPNKLDDKGCKGAPDWIIEIVSPASRKMDYL LKLFKYRSAGVREYWIVDIAKNRITVYNFNHDYSIEEYSFTDTVKAGIYEDLSIDFSEIN IL >gi|226332998|gb|ACII01000021.1| GENE 17 21914 - 22921 1076 335 aa, chain - ## HITS:1 COG:MA0427 KEGG:ns NR:ns ## COG: MA0427 COG1453 # Protein_GI_number: 20089319 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Methanosarcina acetivorans str.C2A # 1 333 1 380 400 150 30.0 3e-36 MEYVTLGKTGLRVSRMGLGGIPIQKIDAEGTKTLLHKLADRGVNYIDTARGYTVSEEYLG YALEGIRDRFIIATKSMARTKEAMAEDIEKSLHNLRTDHIELYQVHNPSMEQLDQVQAAG GALEALQEARAAGKIGHIGLTAHSVEVFARALELDWVETIMFPYNIVESQGEELISKCEE KNIGFIDMKPLAGGAIENGTLALRYVCSNKNVTIVIPGMAEAKEIEENIKACLDDSPLSE KELEEVQEVRNQLGTNFCRRCNYCAPCSVGIDISGVFLFAGYLNRYGLGDWARNRYSSLA VKASACIECGKCETRCPYHLPIREMLKECAQQFGE >gi|226332998|gb|ACII01000021.1| GENE 18 23039 - 23596 719 185 aa, chain - ## HITS:1 COG:CAC2094 KEGG:ns NR:ns ## COG: CAC2094 COG0231 # Protein_GI_number: 15895364 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Clostridium acetobutylicum # 1 185 1 185 186 213 57.0 2e-55 MISAGDFKNGITLEIDGNVCQIVEFQHVKPGKGAAFVRTKYKNIITGAVLEKSFRPTEKF PAARIERVDMQYLYDDGDLYYFMNVETYDQVGLTKDQVGDSLKFVKENEMVKICSHNGNV FAIEPPLFVELEVTETEPGFAGNTAQGATKPATVETGAQVQVPLFVNQGDKLKIDTRTGE YLSRV >gi|226332998|gb|ACII01000021.1| GENE 19 23740 - 24264 629 174 aa, chain - ## HITS:1 COG:NMA0648 KEGG:ns NR:ns ## COG: NMA0648 COG0703 # Protein_GI_number: 15793634 # Func_class: E Amino acid transport and 
metabolism # Function: Shikimate kinase # Organism: Neisseria meningitidis Z2491 # 4 146 8 149 170 97 36.0 1e-20 MNHIVIIGFMGSGKTRVGKRLAKDFNLPFVDVDRVVSKKMNLTMKEIFDRFGEPFYRALE TTVIKALIDDPEQKIISLGSGLPMQEQNAKYIKKLGTVVYLKGSYATLKKRLENSSSNPL IEGEDKEDKIRKLLKQRDPVYTKFADVEMITGDKPFEDLIGQLEEKLKTCVKNS >gi|226332998|gb|ACII01000021.1| GENE 20 24261 - 24764 444 167 aa, chain - ## HITS:1 COG:BH1322 KEGG:ns NR:ns ## COG: BH1322 COG2179 # Protein_GI_number: 15613885 # Func_class: R General function prediction only # Function: Predicted hydrolase of the HAD superfamily # Organism: Bacillus halodurans # 1 163 1 164 171 109 34.0 2e-24 MFNCFFPDEYLDSTYVINFDDLYAQGYRGLLFDIDNTLVPHGAPADERACALFAHLKELG FKCCFLSNNQYERVSSFNDAIGAQFIENAHKPSTKNYIRAMELLGTDRSNTVFIGDQLFT DIYGAKRTGIRNILVKPLNPKEEIQIVLKRYLERIVLYFYRKEEGNS >gi|226332998|gb|ACII01000021.1| GENE 21 24859 - 26220 1349 453 aa, chain - ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 21 453 12 441 441 538 59.0 1e-152 MAGFKTKVTPEIENLTEVCKEHTSLDLSLYAKYDVKRGLRDINGKGVLAGLTQVSNVKAT KIVDGKEVPCAGSLSYRGYDIKDLTRGFIEDDRYGFEEVTYLLLFGSLPDKKQLADFTQL LANQRSLPTNFVRDVIMKAPSKDIMNGLSRSVLTLYSYDTNPDDTSLPNVLRQCLNLISV FPLLSVYGYQAYNHYIKDKSLYIHNPKKELTTAENILRMLRPDKKYSHLEAKILDIALIL HMEHGGGNNSTFTTHVVSSSGTDTYSAIAAALGSLKGPKHGGANIKVVRMFDDMKKEVKD WKDEDEVRTYLKRLLHKEAFDRKGLIYGMGHAIYSVSDPRAEVFKAYVETLAREKGRMKD YALYSMVERLAPEVIAEERRIYKGVSANVDFYSGFVYSMLDLPLELYTPMFAVARIVGWS AHRMEELINADKIIRPAYKNVLEPAVYVPLNER >gi|226332998|gb|ACII01000021.1| GENE 22 26756 - 28090 1745 444 aa, chain + ## HITS:1 COG:CAC0737 KEGG:ns NR:ns ## COG: CAC0737 COG0334 # Protein_GI_number: 15894024 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Clostridium acetobutylicum # 1 442 1 442 443 564 65.0 1e-160 MSYTEEVYERVVAQNPNEPEFHQAVKEVLDSLKVVIDKNEEKYRKLSILERLVEPERIIS 
FRVPWVDDKGVVQVNKGYRVQFNSAIGPYKGGLRFHPSVNQGILKFLGFEQIFKNSLTGL PIGGGKGGSNFDPKGKSDREVMAFCQSFMTELSKYIGADMDVPAGDIGVGGREIGYLYGQ YKRNRGLYEGVLTGKGLSYGGSLIRTQATGYGLVYMLNEMVKAKNDTISGKTLIVTGSGN VAIYAVEKAAELGGKVVAMCDSNGYIYDPEGIKLDIVKDIKEVKRGRIKEYADRVEGATY TEGKGIWNIKCDIYLPCATQNELDLDAVKILIANGCKYVAEGANMPTTREATDYLMANGV TFMPGKAANAGGVATSALEMCQNSARLSWTAEEVDTKLHQIMVDIFHKVDDASKRYDMEG NYVAGANIAGFEKVVDAMIAQGIV >gi|226332998|gb|ACII01000021.1| GENE 23 28227 - 28676 474 149 aa, chain - ## HITS:1 COG:lin1597 KEGG:ns NR:ns ## COG: lin1597 COG1327 # Protein_GI_number: 16800665 # Func_class: K Transcription # Function: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains # Organism: Listeria innocua # 1 148 1 148 154 160 58.0 1e-39 MKCPFCNQDNTRVVDSRPVEDTNSIRRRRLCDACGRRFTTYEKVESIPLTVIKKDQNREQ YNRSKIQSGILRACYKRPISIDKIEEMMDAIEGEIFNTEEKEISSTRIGEIVMEHLKDLD AVAYVRFASVYREFKDVSTFMDELKKFMN >gi|226332998|gb|ACII01000021.1| GENE 24 28957 - 29211 147 84 aa, chain + ## HITS:1 COG:BS_ylmC KEGG:ns NR:ns ## COG: BS_ylmC COG1873 # Protein_GI_number: 16078600 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 77 2 78 81 63 35.0 1e-10 MKICELKQKEVINICTCRSLGCPIDAEFDCKSSQLTALILPGPGRFCCLFGRDNEYIIPW ECISQIGDDIILVKIDEEKCFHKG >gi|226332998|gb|ACII01000021.1| GENE 25 29431 - 30210 910 259 aa, chain + ## HITS:1 COG:CAC1696 KEGG:ns NR:ns ## COG: CAC1696 COG1191 # Protein_GI_number: 15894973 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Clostridium acetobutylicum # 1 257 1 257 257 338 66.0 9e-93 MALNKVEICGVNTSKLPVLTAEEKAELFRRIKNGDEQARELYIKGNLRLVLSVIRRFQNS SENADDLFQIGCIGLIKAIDNFDTTLQVKFSTYAVPMIIGEIRRYLRDNNSIRVSRSLRD IAYKAIYTRENYMKQHLKEPTVTEIAQEIGIEKEMIVYALDAIQNPVSLFEPVYTEGGDT LYVMDQISDKKNKEERWIENLALREAMQRLNSREKHIIELRFYEGKTQMEVAQEIGISQA QVSRLEKNALKSMRNYLRP >gi|226332998|gb|ACII01000021.1| GENE 26 30350 - 33238 2697 962 aa, chain - ## HITS:1 COG:L146188 KEGG:ns NR:ns ## COG: 
L146188 COG1887 # Protein_GI_number: 15672903 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: Lactococcus lactis # 417 951 7 537 807 285 32.0 4e-76 MKICIWCTKIFDLGGTKRVVTLLANELVKEHDVTIMVYQDRFKEDRNMYHMSEDIKVDFI DNNEFVNRHHTPAFCWRYLVQKLNAKYGIFNKEKYNDILADAIFPKKTREKWVKYLNEQN YDIIITTASLSLRLGMLAPELNAKTIGWQHNCYAGYLDVPNVVFWKQECLLQEYLPKLDR YIVLSDYDKRDYKKFLDIDTEVKINPRSFVSERKCDPKSKRFLMATRFVYAKGLDLMMES FEEFCKQDDEWQLDIIGAGDLWNQIVADAKRRGIEDRVNFVGYTNEPEKYYLNSSVFLLP SRWEGWPMVIMEAFEFGLPVIAFHTGAMDLIIDDGKTGYLPEAFDTKKFTDAMLKLAHDE ELRREMSRNAIWKSEDFAIEKAVKEWNRLFNRVMGIKTFYMKNEEQILECREKYPLRTSY AEFVKEYQIRDNTILYEAFGGRGMICNPYALFLYLLEKEEYQDYTHIWVLEDFEDNRKQI EKYEQYPNVRFVKYKSKEYCKELATVKYLVNNVSFPSYFLKREGQVLIDTWHGTPLKNMG FDIPGANISQGNTARNLLSADYIVSSGPYMTKTAYKDSYKMQNLYEGTVLEEGFPRNDKL FDSDRAEVIQELKDCGVDVKEDKKIILYAPTWRGEQYSRPDTDLQDVYKLINVMENSIDT NEYQIFVKLHQIVYHYMKENAMEPGDAQTKFIPATMDTNEILSVTDVLISDYSSIFYDFM LTGKPILFYVPDAENFEDYRGLYFGFDKLPGPAVSTPEKLGELLKDLPGVAASCKEKYEK AREQTCPRDDGKACKRIAEVLLDGKEPVNPIYLNQTDKVKLLVYAGDFSDTQETKAFYEF LNKVDYEHFDVTLIGNGAKEEESSEKLDSLPKEIRVLYWKRSYPATDEEYVCHKKFMKSK KTEVPEMLLDFYRRELRRMIGMSKFDYALVFTDMKKFFPAMSGALDVKQIFNIENWQNLL KC Prediction of potential genes in microbial genomes Time: Sat May 28 19:15:40 2011 Seq name: gi|226332997|gb|ACII01000022.1| Ruminococcus sp. 5_1_39B_FAA cont1.22, whole genome shotgun sequence Length of sequence - 46067 bp Number of predicted genes - 39, with homology - 36 Number of transcription units - 18, operones - 10 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.200 - CDS 59 - 1348 1259 ## COG1887 Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC 2 1 Op 2 1/0.200 - CDS 1359 - 1775 455 ## COG0615 Cytidylyltransferase 3 1 Op 3 . 
- CDS 1836 - 3095 827 ## COG1887 Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC 4 1 Op 4 . - CDS 3076 - 4110 428 ## Shel_12340 hypothetical protein 5 1 Op 5 . - CDS 4118 - 5239 840 ## COG0438 Glycosyltransferase 6 1 Op 6 . - CDS 5287 - 6381 598 ## gi|253578156|ref|ZP_04855428.1| predicted protein 7 1 Op 7 26/0.000 - CDS 6424 - 7164 223 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 8 1 Op 8 . - CDS 7227 - 7979 607 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component - Prom 8027 - 8086 6.7 - Term 8200 - 8263 3.7 9 2 Op 1 . - CDS 8290 - 9132 788 ## COG1512 Beta-propeller domains of methanol dehydrogenase type 10 2 Op 2 . - CDS 9148 - 10236 1004 ## Acfer_0837 hypothetical protein 11 2 Op 3 . - CDS 10261 - 11550 1433 ## COG4260 Putative virion core protein (lumpy skin disease virus) - Prom 11708 - 11767 6.6 12 3 Op 1 . - CDS 11917 - 12324 394 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 13 3 Op 2 . - CDS 12407 - 12709 344 ## Fisuc_0373 anti-sigma-factor antagonist - Prom 12839 - 12898 4.0 - Term 12843 - 12887 10.1 14 4 Op 1 . - CDS 12994 - 13629 750 ## COG0110 Acetyltransferase (isoleucine patch superfamily) - Prom 13649 - 13708 4.1 15 4 Op 2 . - CDS 13762 - 15114 898 ## COG0534 Na+-driven multidrug efflux pump - Prom 15152 - 15211 7.6 + Prom 15139 - 15198 10.2 16 5 Tu 1 . + CDS 15232 - 16140 269 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 16304 - 16350 6.2 17 6 Tu 1 . - CDS 16423 - 16845 625 ## COG0716 Flavodoxins - Prom 16897 - 16956 5.8 18 7 Tu 1 . - CDS 16994 - 17545 524 ## EUBELI_00570 hypothetical protein - Prom 17721 - 17780 6.4 - Term 17790 - 17831 5.5 19 8 Op 1 28/0.000 - CDS 17899 - 21252 2847 ## COG0419 ATPase involved in DNA repair 20 8 Op 2 . - CDS 21249 - 22328 883 ## COG0420 DNA repair exonuclease 21 8 Op 3 . - CDS 22336 - 22440 57 ## 22 8 Op 4 . 
- CDS 22453 - 23196 827 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 23416 - 23475 5.5 - Term 23422 - 23462 8.4 23 9 Op 1 2/0.200 - CDS 23491 - 23748 496 ## COG1925 Phosphotransferase system, HPr-related proteins 24 9 Op 2 19/0.000 - CDS 23808 - 25721 2540 ## COG1299 Phosphotransferase system, fructose-specific IIC component 25 9 Op 3 . - CDS 25739 - 26512 914 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) 26 9 Op 4 . - CDS 26509 - 26649 194 ## gi|153854730|ref|ZP_01995964.1| hypothetical protein DORLON_01962 - Prom 26773 - 26832 6.0 - Term 26876 - 26936 9.1 27 10 Tu 1 . - CDS 26961 - 28376 1292 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 - Prom 28408 - 28467 4.9 - Term 28819 - 28864 7.2 28 11 Op 1 4/0.000 - CDS 28879 - 30558 1817 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 29 11 Op 2 . - CDS 30551 - 31822 1153 ## COG1775 Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB - Prom 31974 - 32033 7.3 + Prom 31939 - 31998 6.5 30 12 Op 1 35/0.000 + CDS 32067 - 33881 194 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 31 12 Op 2 . + CDS 33878 - 35644 1471 ## COG1132 ABC-type multidrug transport system, ATPase and permease components + Term 35663 - 35717 9.5 - Term 35656 - 35698 4.8 32 13 Op 1 . - CDS 35732 - 37519 1624 ## COG0006 Xaa-Pro aminopeptidase 33 13 Op 2 . - CDS 37516 - 38364 932 ## COG1606 ATP-utilizing enzymes of the PP-loop superfamily - Prom 38384 - 38443 4.4 34 14 Tu 1 . - CDS 38834 - 39688 422 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 39708 - 39767 6.3 + Prom 39553 - 39612 5.0 35 15 Tu 1 . + CDS 39744 - 39821 78 ## + Term 39864 - 39901 3.1 - Term 39919 - 39977 9.9 36 16 Op 1 1/0.200 - CDS 39986 - 41383 1199 ## COG2211 Na+/melibiose symporter and related transporters - Prom 41443 - 41502 3.1 37 16 Op 2 . 
- CDS 41551 - 44589 1902 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 44745 - 44804 8.7 + Prom 44535 - 44594 7.5 38 17 Tu 1 . + CDS 44793 - 45656 643 ## COG2207 AraC-type DNA-binding domain-containing proteins 39 18 Tu 1 . - CDS 45516 - 45773 103 ## - Prom 45971 - 46030 6.3 Predicted protein(s) >gi|226332997|gb|ACII01000022.1| GENE 1 59 - 1348 1259 429 aa, chain - ## HITS:1 COG:MTH365 KEGG:ns NR:ns ## COG: MTH365 COG1887 # Protein_GI_number: 15678393 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: Methanothermobacter thermautotrophicus # 65 420 56 401 409 108 27.0 2e-23 MKKRIKKKIKKLKKSINKINKVRVQKGRKAMLQETWATVTYGLYDNRKIRRLKKFPPALV ENRMLFETNDDFTDNGRALFDYLIEKGYNKKYEIVWLVHEPSKYKEYQFENVKFVQNFKK GSTIRRVEAYKYALTSKYIFYTQAFNWIGMSRRNQLFIDLWHGCGYKANKNGRKVFFDYC LVPGDIFIKTKMEFFGCTSKKLLSFGYPRYDMMLKGSERADEYKKKLLKETDSEKLILWM PTYRHASSERLNEETLNNEFNIPIIDDADKLLELNKFCKENHILIVIKKHYLQVPYDFGE NVLTNIVYLENGDLADNGLQLYEFINCSDALVSDYSSVAIDYLLLDRPLGFTLDDYEAYT ESRGWVFDDPLEYMPGEHMYNMQDFENFILDIKNGKDNYKEQRASVRAKTHNVCDNYCQR VLDYFNITM >gi|226332997|gb|ACII01000022.1| GENE 2 1359 - 1775 455 138 aa, chain - ## HITS:1 COG:aq_1368 KEGG:ns NR:ns ## COG: aq_1368 COG0615 # Protein_GI_number: 15606564 # Func_class: M Cell wall/membrane/envelope biogenesis; I Lipid transport and metabolism # Function: Cytidylyltransferase # Organism: Aquifex aeolicus # 4 135 6 130 168 92 43.0 2e-19 MEKKIGYTQGTFDMFHIGHLNLIRNAKKHCDYLIVGVNADDLVESYKNKRPIVPLEERAE IVRAIRYVDEVIVTTTLDKKQVWEKVHFNEIYIGDDWKGNARWEKTGKEMEELGAKLVFL PYTKDTSSTMLREKLKEF >gi|226332997|gb|ACII01000022.1| GENE 3 1836 - 3095 827 419 aa, chain - ## HITS:1 COG:MTH365 KEGG:ns NR:ns ## COG: MTH365 COG1887 # Protein_GI_number: 15678393 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: 
Methanothermobacter thermautotrophicus # 59 395 62 384 409 112 27.0 1e-24 MTKSAIKKYTERLGFFGMILHYIGRFMTKTEQQVIGKLTVRFGKVKKNRLVFKNREMMDC TDNPEAFYEYLISSGYNKKYEIIWLVSEKRKFRNQNTGNVKFVTAENKYGWSSPLAYYYG ATAGFFFYSHNSAGLNRYRCKGQTVVNLWHGCGYKDAEQGKKKQNIKPDFDYALVPGPVF VKTKSGLWNCEPDRLLMMGYPRYDWMLHPSMSKDEILDSLFGWKGKKVVLWMPTFRKSDL GGCAENEIELPCQLPAIQDMNELKELDSYLREQEIILIIKKHPLQTEWDENEQEFTNIRY VAEALLEKKQIKLYELIGISDGLLSDYSSVAVDYLLLDRPLGYVLADYNIYKEKRGFVFE DPLEYMPGEKIYNACDIRKFMKHLTDGTDSYRQERAKNLKQMHNKTENYCKRLADYLQL >gi|226332997|gb|ACII01000022.1| GENE 4 3076 - 4110 428 344 aa, chain - ## HITS:1 COG:no KEGG:Shel_12340 NR:ns ## KEGG: Shel_12340 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 4 330 8 337 337 293 46.0 5e-78 MDKIDFVLPWVDGSDSAWIKQRNEYLGIKNNQTQDSRFRDWENLQYWFRGVEKFAPWVNH IYFVTWGHIPSWLNTDHPKLTVVKHEDYIPKQYLPTFSSHPIELNMHRIRGLSEQFVYFN DDTFIINKMEPEDFFRNGLPRDYCIETALVQDDINNPFACILMNNAALVNMHYSKREVIG RNWKKWFHPAYGKMVFRNMLMLPYREFSSFKYSHISSSFLKSTFEEVWREEGEVLDRVCR TRFRSPGDVNQYVMKYWQYMEGKYEPQSPKIGKFFTIGLHDRQIHDVLRNQKCKILCIND TENIGDFRQQKRNIKDSFESILPEKSAFELSYKDSGGMYDKKCD >gi|226332997|gb|ACII01000022.1| GENE 5 4118 - 5239 840 373 aa, chain - ## HITS:1 COG:SP0353 KEGG:ns NR:ns ## COG: SP0353 COG0438 # Protein_GI_number: 15900282 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 5 369 1 363 372 181 30.0 2e-45 MGERIMKILQIGMGNVAGGLEAFVMNYYRVLVHMDIQFDFVCMYDKIAYEEEIRKLGGRI YYIPNVKKNYQGYIKELKKILKETKYDAVHVNMLSAANIVPLRVAHTMKVPKIIAHSHSS SCPGLIRKIMDNWNRPKIAKYATDRIACGEMAGRWLFGDKAFQSGQVTLINNAIQAEKFS FSEKDRDSLRKELGWENKTVIGHVGRFDIPKNHDRMLDIFQQIVSEKKDVMLCLVGPKEG LYKEIKEKTVQKGLEDKVYFAGKQENIRRYLSAMDVFLFPSVFEGVPFALIEAQANGLAC VMSEAVSEEAVVFPERVRRLSLDRDNIQWAAAVMAMSSMNREAADLIKMRLSDAHFNIET EAKRLKDLYYHTR >gi|226332997|gb|ACII01000022.1| GENE 6 5287 - 6381 598 364 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578156|ref|ZP_04855428.1| ## NR: gi|253578156|ref|ZP_04855428.1| 
predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 364 1 364 364 677 100.0 0 MTNGLKKEDSAALKGLAVLLLIFHHCYRLADRIERYQVDLCGLTTEQLVAIAECCKICVA IFAFVSGYGLMYGYSAKMKNKEKYAVSEWISGHLLSTMSGFWFTAAVSYLIYFGLGLKDP SKWGETFYERGFAVFADILGISRLLETESLNGAWWYMSAAFVFIILLPLLDGTIEKFGGI FCIAVIFLLPRILGIGFQGGSKPYSFLLIFVAGMLCCKYNFFQKLHEYRRKKLKFICLSA LLCAGLFLYHKIDLKIFWEFRYVLVPFLLILFCVEYLFRITPVSLFLQYLGKHSMNIWLV HTFVRDSLGKYVFALKKFWLIPVAVLVISLGISYCLDFLKKITGYQKLIRLLKKKVTEHY AVRE >gi|226332997|gb|ACII01000022.1| GENE 7 6424 - 7164 223 246 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 36 243 18 236 318 90 27 2e-17 MSKQPAIIVDNVSMKFNLSKEKVDSLKDYIIKSIKKEIKYNEFWALQNVSFTVEKGDRVG ILGLNGAGKSTLLKVIAGVFKPTEGSVTKHGKMVPLLELGAGFDQQYTGKENIYLYGAML GYSKEFIDEKYDEIVKFSELKDFIDVPIKNYSSGMKSRLGFSIATVVSPKILILDEVLSV GDAKFRKKSEKKVLSMFDSGVTVLFVSHSLAQVQRICNKAMILEKGKLIAYGDIDTISEQ YEKMTN >gi|226332997|gb|ACII01000022.1| GENE 8 7227 - 7979 607 250 aa, chain - ## HITS:1 COG:CAC2329 KEGG:ns NR:ns ## COG: CAC2329 COG1682 # Protein_GI_number: 15895596 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Clostridium acetobutylicum # 1 246 9 254 258 109 33.0 6e-24 MQYRFLLSELVKKGIKLKYRRSYLGMIWSMLEPLLTMIVLTIVFGTLYGNTDRTFPVYIL TGRLLYSFFSQSTKAALKSIRQNSAMIKKVYVPKYLYPLSSVLFNYVIFLISLIVLAMVS VILGVKPTFYLLQAPIALILILIMSYGCGMILATIGVFFRDMEYLWSVALMLVMYTCAIF YYPEKLLKSGWAWILKYNPLYCVIDIFRCSVFGKAMNIHYFAYALIFSVVAMVIGLFCFK KKQDDFILYI >gi|226332997|gb|ACII01000022.1| GENE 9 8290 - 9132 788 280 aa, chain - ## HITS:1 COG:BH1807 KEGG:ns NR:ns ## COG: BH1807 COG1512 # Protein_GI_number: 15614370 # Func_class: R General function prediction only # Function: Beta-propeller domains of methanol dehydrogenase type # Organism: Bacillus halodurans # 40 248 30 237 271 68 26.0 2e-11 MLKKKIIPVFLASVLTVSGTTGFVTEAFSENVYAAEEVHTERLADFADLLDDGQEEELEA 
KLDQVSEDYGCDVVVVTEETLDGAVPQDYADDFFDYNDYGMGEDKSGILFLITMSERKWC ISTHGEAIQIFTDAGQEYMTDSFGSYLSDGEYYEGFMKFADLCEEFIIQAQSGEPYDVEN LPEETIPFYMIFLISLVVGFVIALIVTGVMRSRMKTVHMKPDAADYMKDGSLHINRSRDI FLYHQVTRTAKPKEESSGGGGSSTHTSSSGETHGGSSGSF >gi|226332997|gb|ACII01000022.1| GENE 10 9148 - 10236 1004 362 aa, chain - ## HITS:1 COG:no KEGG:Acfer_0837 NR:ns ## KEGG: Acfer_0837 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 7 354 8 360 369 294 43.0 3e-78 MSDLREYKCPACGGAIEFDSKSQKMKCPYCDTEFELETLKELDAQMEREAGQQDDLSGWQ TDAGGEWQEGETDGMNVYTCQSCGGEIIADENTGASNCPYCGNPVIMTEKFKGALRPDLV IPFKLDKKAAKEAYYRHIKGRTFLPKAFRRENHIDEIKGLYVPFWLFDGDVDADVRYKAT KVRMWSDHDYDYTETSYYSVERSGEMTFVSVPVDGSEKMADDLMESIEPFKISESVDFQT AYLSGYLADKYDVSEKESINRAHDRMKKSAEEVLADTVKGYASVVPENTNVNISGGKAQY ALYPVWILNTTWKDKKYIFAMNGQTGKMTGDLPIDRGIYLKWLAGLTAVFTVVLCLAGLL IF >gi|226332997|gb|ACII01000022.1| GENE 11 10261 - 11550 1433 429 aa, chain - ## HITS:1 COG:BH1805 KEGG:ns NR:ns ## COG: BH1805 COG4260 # Protein_GI_number: 15614368 # Func_class: S Function unknown # Function: Putative virion core protein (lumpy skin disease virus) # Organism: Bacillus halodurans # 1 426 1 429 433 249 35.0 7e-66 MGLIKAAAGAFGGTMADQWKEFFYCDAIDKDVLVVKGEKRVGGRSSNKKGSDNIISSGSG IAVADGQCMIIVEQGKVVEVCAEPGQFTYDASTEPSIFAGSLGEGIHRTFDTVKKRFTFG GDTGKDQRVYYFNTKELVDNKFGTANPIPFRVVDRNIGLDIDVSVRCNGVYSYKIVDPLL FYTNVCGNVEQQYDREEIEVQLKTEFVSALQPAFAKISELQIRPSAIPGHVLELCDAMNE ALTKKWQQTRGLAVVSIAMNPVTLPEEDAQLIKDAQKNAILRDPTMAAATLAGAQADAMK SAASNTAGAMTGFMGMGMAGQAGGANMQNLYQMGAQQQAAQQQAAQNMQPASGAGTWKCE CGAENTGKFCSECGKPKPQREEWICSCGAVNTGKFCSECGSPRPTGKWTCSCGAVNTGKF CAECGKPRQ >gi|226332997|gb|ACII01000022.1| GENE 12 11917 - 12324 394 135 aa, chain - ## HITS:1 COG:slr1861 KEGG:ns NR:ns ## COG: slr1861 COG2172 # Protein_GI_number: 16330247 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Synechocystis # 25 135 25 137 143 68 33.0 4e-12 
MKSITEEAKIENIAVITDFVNSILEANGCSAKVQMEIDIAIDEIFGNIAYYAYTPKTGEA TVQVEIKNFPERLELTFIDKGIPYNPLENKDPDVTLDIEKRKIGGLGIFLVKEMMDEVSY EYADGKNILKLKKNL >gi|226332997|gb|ACII01000022.1| GENE 13 12407 - 12709 344 100 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_0373 NR:ns ## KEGG: Fisuc_0373 # Name: not_defined # Def: anti-sigma-factor antagonist # Organism: F.succinogenes # Pathway: not_defined # 3 99 1 97 98 84 48.0 1e-15 MSLKIEKKNEGTKDTVFLTGRLDTATAPELDTFAEKELINTQELVLDFAGLEYISSAGLR VILKMQKFMNVKGTMKLIHVSDIIQDVFDITGFADILSIE >gi|226332997|gb|ACII01000022.1| GENE 14 12994 - 13629 750 211 aa, chain - ## HITS:1 COG:CAC2692 KEGG:ns NR:ns ## COG: CAC2692 COG0110 # Protein_GI_number: 15895950 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Clostridium acetobutylicum # 1 199 1 198 204 268 67.0 5e-72 MNQRERMLSGLPYKAWLDGLEEDRQACKQKIYDFNQLPPSRQSKEAPQMIKNIFGKTGEN VWVEAPFHCDYGWNIEVGENFYSNYNLTILDVGKVTCGKNVQIAPNVSIYTAGHPVHPDS RNSGYEYGIPVTVGDNVWIGGNTVILPGVTVGSNVVIGAGSVVSKDIPDNTIAAGNPCKV IRKITDEDRIYYFKKQKFDDEAWEEVKNKNK >gi|226332997|gb|ACII01000022.1| GENE 15 13762 - 15114 898 450 aa, chain - ## HITS:1 COG:FN1789 KEGG:ns NR:ns ## COG: FN1789 COG0534 # Protein_GI_number: 19705094 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 4 450 12 459 459 288 41.0 2e-77 MESEKRIFYRKLWGLVFPIAIQNLMTALVSASDAFMLGFVSQTSLSAVSLATQIQFVHNL FMLALTIGATTLAAQYWGKGDTDSVEEILAIVLKISMAVSVVFFIAAMFFSGFLMRIFTN DIRLIQAGIPYLRIVSVSYLFMGFSQIYLCIMKNSGRTAKSTIYGSVAVVINIGFNVIFI FGLAGFPAMGIAGAALATTVSRALELLLTIYENMHRSLVCVRLKYIRNSSKKLKKDFWHY TTPVLGNELVWGCGFTMFSVIMGHLGSDAVAANSVANILKNIIACVCNGIGIGAGIIVGN ELGKGEMERATEYGNRLFKLAVFAGAVSGLILLAVSPVLRIFTGSLSAQAHSYLKNMMYI CTYYMIGKSVNATVIAGVFCAGGDTKFGLKCDAVTMWVILIPIGMITAFVLKLPIMVVYF IISMDEIIKLPAVYRHYKKYNWVRNLTELN >gi|226332997|gb|ACII01000022.1| GENE 16 15232 - 16140 269 302 aa, chain + ## HITS:1 COG:BS_ydeC KEGG:ns NR:ns ## COG: BS_ydeC COG2207 # Protein_GI_number: 
16077582 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 41 295 39 290 291 132 29.0 6e-31 MGLIRTELMPDSSEVVPYDHAGIPLYIRAAHLASYYNMCAPCHWHDDIEWIYIITGKMRY YVNGKRLILNEGDSLMVNARQMHYGYAFEQQDCYFLCILFHPSLFGNNQILQEKYFLPFF ENTSLEFHHFYSNDEAGVKVGRYLQEVLLLKEDSDSGYELGVISCMLELWSFLIRTDLLS AAKNKSESEQELNIQKDMVSFIYQHYPEKISLTDIAAAGHVGRSKCCQIFKHYMQQSPVD FLNTYRLKISCRLLCTTQKSITEIAILCGFNHLSYFSKYFMECYGCTPREYRTLHEHETL LT >gi|226332997|gb|ACII01000022.1| GENE 17 16423 - 16845 625 140 aa, chain - ## HITS:1 COG:CAC0587 KEGG:ns NR:ns ## COG: CAC0587 COG0716 # Protein_GI_number: 15893876 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Clostridium acetobutylicum # 1 140 1 140 142 116 46.0 2e-26 MSKVAVVYWSSTGNTEAMANAVADGVKGKGGEAVLHTCEDFDGSKVTEYDAIAFGCPAMG DEVLEDTEFEPMFDGCKDALKGKNIALFGSYGWGDGEWMRNWEDSCKEAGANLVCESVIC QEEPDDEATEACKALGAALV >gi|226332997|gb|ACII01000022.1| GENE 18 16994 - 17545 524 183 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00570 NR:ns ## KEGG: EUBELI_00570 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 181 10 189 189 181 48.0 7e-45 MLEKSVIEHCSPTLASIKTGNLFTYKYESEEELWKSVEGFNECVREKGVSLTVLRRSEKK ALIYVCRFSSLERDLKKPGVANFIKKYGYESTDPAYALERLRSRLAQREDFPHEIGLFLG YPLGDVIGFIKNAGQNCKCVGCWKVYCNECEAIKAFARFKKCTNVYVRLWNQGRSVRQLT VAA >gi|226332997|gb|ACII01000022.1| GENE 19 17899 - 21252 2847 1117 aa, chain - ## HITS:1 COG:VCA0521 KEGG:ns NR:ns ## COG: VCA0521 COG0419 # Protein_GI_number: 15601281 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Vibrio cholerae # 1 1104 1 1004 1013 263 28.0 2e-69 MKPLKLTLSAFGPYADETVIDFTQLGGQGLFLVTGDTGAGKTTIFDGITFALYGETSGGV REASMLRSKYAKPETPTYAEYIFEYKDAVYTVKRSPDYERPKTRGTGMTTQKGEALLTFS DGRAPVTKLKEVNQAVTELIGLDMKQFTQIAMIAQGDFRKLLLADTEERSNIFRKLFHTD IYKVIQEKLKFEAGGLDKAYKELLRSIRQYTEQVRYSTEDQLGQQWILMEKNGFEGNLEE 
SLEILEQFLKADKDNLKAVNKQIKENDTELEKVNQLLGKAHKEADAKARKEQLLQEREGL LPQLEQAQKDAENGAKEPEEIQRLILSIQKEKENLKRYTNLENLRREIENVSKQIEKLAI DSIALQNRQKEKKTVLEQKQKEYQTTDSAEKEKAETDFRKEKIDTLYAGVYKNCGNLEFL QTKISELKIQSGQEEQLAKQIKKDLTDLEEKIVSGGNLEIKENSLVVCLQQIQKIEDQQK RYQDLEKKAESKKKAYLTASQKRAEIKEILNKMEQAYLDGQAGILAAGLQDGMPCPVCGS VHHPKLTQTPKEVPTEEQLKKQKKLTEAAEKAASDASVQAGEAAGLLQRCREELTEGFKS YAAQFLPQEETQEILRNELSDHELCLFIKNQESSVQTGLEEIQKQKKVYQDLLKKKEEFT TQSEQHAKEFQKLQISLEKRKSQLESIQEQLNNQLKEPVLSWIYEEDARQRMKTDPELLQ DHSSETVYQKETGNVSQVLLQGKKAAEWLKQQQDILEQKQKDLTALLEQRKLLEMQIGKT QTELTTITDQINQNQQQTAAEKSRYEQLQKQRENMEKELGERTKEEILSQIQLKTEEQKK LEIHYKTVTEAREVLEKRMTELDSVIASLTEQLKESISISAEELNAQKENFAQQKKSLSE NRDEIHARLEINSDMYEKIRRQQTELLKTETRWKWMKSLSDTANGTITGKARIMLETYIQ MQYFDRILARANIRLMTMSSGQYELVRRKENKSRVGKTGLELDVVDHYNGTVRSVKTLSG GETFQASLSLALGLSDEIQSSSGGIQLDTMFVDEGFGSLDEDALDQAIRALKDLSQGSRL VGIVSHVAELKERIDKKIIVTKKRTEDGVGSTICIEG >gi|226332997|gb|ACII01000022.1| GENE 20 21249 - 22328 883 359 aa, chain - ## HITS:1 COG:lin1687 KEGG:ns NR:ns ## COG: lin1687 COG0420 # Protein_GI_number: 16800755 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Listeria innocua # 1 356 20 370 374 220 35.0 4e-57 MLEDQRYIMDQMMKIFAEQKVDGVLIAGDVYDKTVPSAEAVQLFDEFITGLAKAEIPVYM ISGNHDSAERLSFGAKLFESSDIYISQVYDGEMKRIVLKDQYGPISVYLLPFLKPAAVRH ALQRDDINTYEEGVMAALQECEIDTTQRNVLVAHQFVTGADRSDSEETWVGGLDNVSAEV FKDFDYVALGHIHRSQKMGRETLRYSGTPLKYSFSEADHKKSVTIVELLEKGNVRVSTVP LIPRRDMRKLRGTYMDVTAKDHYTAENKMDYLQITLTDEEDVPGALQKLRTVYPNLMRLE YDNKRTRENREVQAVEVQEQKSELELFEEFYELLNNEPMKEEQTEFVEKLIQDLKEVRV >gi|226332997|gb|ACII01000022.1| GENE 21 22336 - 22440 57 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPPADRAGRNDIKIGEMWDEVCAYWGFAYWKKGA >gi|226332997|gb|ACII01000022.1| GENE 22 22453 - 23196 827 247 aa, chain - ## HITS:1 COG:BS_fruR KEGG:ns NR:ns ## COG: BS_fruR COG1349 # Protein_GI_number: 16078502 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar 
metabolism # Organism: Bacillus subtilis # 1 246 1 248 251 176 42.0 5e-44 MLTEQRFDIILKLLEEKKSVTVTELRELLDASESTVRRDITALDKAGKLTKVFGGAVALN HKVTAYEPTVAQKSELNKEEKQKIAKYAASLITPDDFVYLDAGTTTGLMLEYLAGVRAAF VTNAVSHAQTLAKMGIRVYLIGGELKSSTEAVVGSQAMQMIQMYHFTKGFFGTNGITRRE GFTTPDTSEAIVKSTAMKQCKDVYILTDKSKFGEVSSVTFGGFTDAKILTEEIPEEYQDS KNILKVK >gi|226332997|gb|ACII01000022.1| GENE 23 23491 - 23748 496 85 aa, chain - ## HITS:1 COG:SA0934 KEGG:ns NR:ns ## COG: SA0934 COG1925 # Protein_GI_number: 15926669 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Staphylococcus aureus N315 # 1 81 1 81 88 67 41.0 7e-12 MKEFKYVVTDNEGIHARPAGELVKLVKGFESTISIEKEGKKVDCRKLLALMGLGVKKGHE VTFTIEGADEEAAYEAVSKFMQDNL >gi|226332997|gb|ACII01000022.1| GENE 24 23808 - 25721 2540 637 aa, chain - ## HITS:1 COG:BH0828_3 KEGG:ns NR:ns ## COG: BH0828_3 COG1299 # Protein_GI_number: 15613391 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Bacillus halodurans # 298 636 2 334 336 333 55.0 8e-91 MRITDLLDRRSVSLTAAPKTKSETLDMAVDLMVKSEKISNREAYRKQVYLREEESTTGIG EGIAIPHGKCDAVKKPGLAAMVIKNGVEFEALDDEPVTLLFLIAAPNTEDNIHLDVLSKL SVMLMDEEFTESLRNASSVDEFMDIIDRADSEKKDIDERLADTGSTEGTQAKILAVTSCP TGIAHTYMAAEGIEKAAKAKNCFIKIETRGSGGAKNVLTDQEIADADCIIVAADAKVPME RFDGKKVIECQVSDGISKADQLIDRAINGDAPVYHAAAGSQTSSAGNKSGGSVGHKIYMQ LMNGVSHMLPLVVGGGILIAIAFLIDGLSVDLSALPADQRANFGTITPVAALFKGIGGTA FGFMLPILAGFIAMAIGDRPALALGLVGGMMAANGKSGFLGALLAGFAAGYIILGLRKLC DKLPEALEKIAPVLIYPVVGILVMGLLMTFIVEPVMGGINTGLNNALTGMGSSSKVILGI VLGGMMAIDMGGPFNKAAYVFGTAAIAAGNYDIMAAVMIGGMTPPCAIALATLLFKDKFT KEERDAGPTNFIMGLAFITEGAIPFAASDPLRVLPSCIVGSAVAGAISMAFDCTLMAPHG GIFVFPVVGNAVMYLVALVAGTVISAVLLGLLKKKAA >gi|226332997|gb|ACII01000022.1| GENE 25 25739 - 26512 914 257 aa, chain - ## HITS:1 COG:BS_fruB KEGG:ns NR:ns ## COG: BS_fruB COG1105 # Protein_GI_number: 16078503 # Func_class: G Carbohydrate transport and metabolism # Function: 
Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Bacillus subtilis # 5 254 51 298 303 211 44.0 1e-54 MNLGIENTALGFTAGFVGKEIIRRLHEMGVKSEFIQISEGVSRINLKLKSIDGTEINGAG PVISKDKVEELMKKLEELGEGDVLFLAGSIPSSMPDDMYEQIMARLDGKGVMIVVDATKD LLVNVLKYHPFLVKPNNHELGEIFGIELKTRKDVIPYGKKLQEKGARNVLISMAGEGAVL VAEDGQVFEEPAPKGRLVNGVGAGDSMVAGFVAGWMEKKDYEHAFHMGIAAGSASAFSEN LARKEEIEAVYRQITEK >gi|226332997|gb|ACII01000022.1| GENE 26 26509 - 26649 194 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153854730|ref|ZP_01995964.1| ## NR: gi|153854730|ref|ZP_01995964.1| hypothetical protein DORLON_01962 [Dorea longicatena DSM 13814] # 1 36 1 36 301 75 100.0 9e-13 MIYTVTFNPSLDYIVSVDDFKLGLTNRTSSELMLPGERELMFPQSL >gi|226332997|gb|ACII01000022.1| GENE 27 26961 - 28376 1292 471 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 5 464 3 441 456 502 55 1e-141 MWEAINSFLTTVDNLVWGVPLMVLIMAGGILLTARLGLLQMRRLPLALKWMFKNEEEGDG EISSFAALCTALSATIGTGNIVGVATAVCAGGPGALFWMILAAFFGMATKFSEGLLAVKY RVVAEDGHSLGGPFYYIEKGMGTKWKWLAKIFAFFGVCVGLFGIGTFSQVNGISSAVQNF FDPNKSNCVNIPFLGEYSWAVVISSLILAACVAAILIGGLKRIATVSQIIVPFMAVIYLI FSVALILLNITEIPAAIVTVVKGAFNPSAVTGGVVGTMIVSMQKGVARGIFSNEAGLGSA PIAAAAAQTKEPVRQGLVSMTGTFIDTIVICTMTGLSIVLTGAWQVEGLEGVQVTTYAFQ HGLPIPGQISAFVLMICLVFFAFTTILGWDYYSERCLEYLTNGHKKTILTYRWLYILAVF IGPYMTVSAVWTIADIFNGLMAIPNMIALFALSGVIVKETKAFFAAEKHKM >gi|226332997|gb|ACII01000022.1| GENE 28 28879 - 30558 1817 559 aa, chain - ## HITS:1 COG:FN0206 KEGG:ns NR:ns ## COG: FN0206 COG1924 # Protein_GI_number: 19703551 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Fusobacterium nucleatum # 298 553 1 256 265 229 45.0 1e-59 MNKTYYVCKYTPIELLESLGGECENFNHMPEGFDLADQVSHPNICGFGKTLLEGVMEGKI KELVLVSCCDTIRSVYDILLESGKLDFLYILDVLHCDSACSRERMAVQLKGLAKAYAQYK GTEFDAGKFRAAFHAQEKITKSHIAVLGARMGQELFEMTSKAMPLPVENDTCVHNRSVGN 
ILPPEGASFDELMDWYAGEILGQIPCMRMMDPTGRKKLYNDPSVAGIIYHTVKFCDFYSF EYAEIKNHTDVPLLKIESDYTIQSSGQLLTRLEAFAESIQPETLEQTIDGDTKGERKMGK GYFAGIDSGSTSTDVVILNKDHEIVTSIILPTGAGAAIGADRALAEALKEAGLQREDIDA LVTTGYGRTAIKNGDKSITEITCHARGAHFLNPEVRTVIDIGGQDSKVIRLDENGAVANF VMNDKCAAGTGRFLEMMARTMEMNLDQMSECGLEFKEDITISSMCTVFAESEVVSLIAQN KATDDIVHGLNKAVAVKTAALTRRVGGEEKYMMTGGVSKNKGLVKTLEEKLGTKLVISDK AQLCGALGAALFAADMADV >gi|226332997|gb|ACII01000022.1| GENE 29 30551 - 31822 1153 423 aa, chain - ## HITS:1 COG:AF1958 KEGG:ns NR:ns ## COG: AF1958 COG1775 # Protein_GI_number: 11499540 # Func_class: E Amino acid transport and metabolism # Function: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB # Organism: Archaeoglobus fulgidus # 78 417 37 390 391 103 28.0 5e-22 MEIIETFGHYVENLTDKKPQSARRLLKTGWGAQNLKFRYMQDKRLMPADRHLANLMMDTM LWPLQKPEDSVIVSIFTPCEMMQEVGLHPYNVEGFSCYLSASKAERAFLQQAENQGIAET LCSYHKTFLGAAQKGLLPKPKCIVYTNLTCDANLLTFRTLADFYQVPVFAIDVPWNQTTE NVQYVADQLKDLKIFLEKNTGKTISEDRLKERLACSKRTLENYKKYQQMRADRYVPSDLV TPLYAGMTNNILLGTAEEEKYTQMLLEDIKKAPAAKGKHIYWMHTIPFWSDAVRQELCFS EKAQIVGCELAETCEPDFDPENPFEAMAERMVYHSLNGSALRRIEAGIRHAKQAGADGAV WFGHWGCKHTLGAAQLAKKKFEEAGLPLLILDGDGCDRSHGGEGQTSTRLGAFLEMLGGA ENE >gi|226332997|gb|ACII01000022.1| GENE 30 32067 - 33881 194 604 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 346 585 279 525 563 79 27 4e-14 MRSLAVYLKNYKKESILAPFFKFLEVVFDLIVPIVIAQIIDVGIANHDNKYIVQRFFILI LMAALGLASSITAQFFAAKASVGFATKLRQAVYDHVQHLSFTELDTLGTDTLITRLTDDI NQVQNGLNMGLRLLLRSPFIVLGSMVMAFTIDFKCALVFAVAIPFLFLVVFVIMFFSIPL FKKVQGKLDTVTGLTRENLTGVRVIRAFCREKEAVAEFDTRNQELTKLNLFVGKLSALLN PVTYVLINIATVILIQKAGVEVNLGGMQQGQVVALYNYMAQMIVELIKLASLIITLNKSA ASAGRVADILKVKSTMDYPSSSTYSAQISKDLATATADASLSENAIVFNDVTFEYSKAGA PSLSHISFSVKKGQTVGIIGGTGSGKTTLVNLISRFYDASKGTVLLDGQNIENYTRSDLR SRIGVVPQKAALFKGSIRDNLKWGREDASDEDLWQALTTAQGKEVVEGKPGQLDFMLEQN GKNLSGGQKQRLTIARALVKKPEILILDDSASALDFATDAALRKSLNRLDWKVTTFLVSQ RSSSIQQADLILVLDNGTLAGKGTHAELLRTCDTYREIYFSQFPEERARYSSVSAGNIMK EVTV >gi|226332997|gb|ACII01000022.1| GENE 31 33878 - 35644 1471 588 aa, chain + ## HITS:1 COG:CAC2392 KEGG:ns NR:ns ## COG: CAC2392 COG1132 # Protein_GI_number: 15895658 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 17 583 5 575 579 527 45.0 1e-149 MSRSTIKDPAKSKSSETFFKVLRFIGRYRFLLIFSIILAAVSVILQLYVPILFGNAIDQV IAQHQVNFEMMWYYLSRILVMVILSSAATWLMNVINNRMTYQTVKDIRAKAIRHIQVLPL SYLDGHSTGDIISRIIADTDILSDGMLLGFTQLFSGIVTIIGTLIFMFSKNFWITLMVIV LTPVSFLVAKFISSHTFQMFRKQSDTRGRQTALIEEMIGNQKVVQAFGYEEKSSERFAKV NKELQECSLKAVFYSSLTNPSTRFVNNVIYACVALVGAFIIPGGRLTVGGLSVLLSYANQ YMKPFNDISSVVTELQNALACAGRIFNLLEEEPEVAEVSETLKDVKGEVDIKNVCFHYDE SKKLIEDFNLHVKPGMRVAIVGPTGCGKSTFINLLMRFYDVNSGSISVDGKPIRDITRHS LRSSYGMVLQETWIKNGTVRDNISIGKPEATDTEIIEAAKLTHSWEFIRRLPKGLDTMLN EDSLSQGQKQLLCITRVMLCLPPMLILDEATSSIDTRTELQIQEAFERMMEGRTSFIVAH RLSTIRNADLILVMKDGSIIEQGTHDELIAKGGFYHTLYNSQFAKVSE >gi|226332997|gb|ACII01000022.1| GENE 32 35732 - 37519 1624 595 aa, chain - ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 595 1 584 584 467 42.0 1e-131 MTVKERIAALRARMKETGIDAYLIPTDDFHGSEYVGEYFKCRKYITGFTGSAGTAVIMQD 
MAGLWTDGRYFIQAADQLEGTGITLFKMGEPEVPTVHEFLKKNLTQGRCLGFDGRTVSAK EAAELEKMLDENGVSLSVDHDLAGDIWENRPVLSCEPVTELDIKWAGESRADKCARIRKA MEKKGADLFVLTSLDDIAWLLNIRGGDIHCCPVVLSYLVMTKTEIRLFANEKAFQTDVLE ALEKDGVTLFPYDSIYEYVKTFKKDKKVLLCKKKVNSRLVSNIPADTRILDEENLTLLPK ATKNPVEVENERIAHIRDGVAVTKFIYWLKKNVGRIPITELSAAEKLYEFRSEQEDFIDN SFDPIIAYGKHAAIVHYFATPETDIPLEPSGFLLADTGGHYKEGTTDITRTVVMGPTTEE EKKYFTAVLRGTLNLGAARFLHGCTGVNLDILARQPLWEMGEDFKHGTGHGVGYLLNVHE GPNSFRWKIVPGGNAVLEEGMITSDEPGYYREDEFGIRHENLMVCKKAEKTEYGQFMCFE FLTMVPFDLDGVVSELMSVRERNLLNDYHAQVYEKISPYLNEEEKEWLKDATRAI >gi|226332997|gb|ACII01000022.1| GENE 33 37516 - 38364 932 282 aa, chain - ## HITS:1 COG:CAC0775 KEGG:ns NR:ns ## COG: CAC0775 COG1606 # Protein_GI_number: 15894062 # Func_class: R General function prediction only # Function: ATP-utilizing enzymes of the PP-loop superfamily # Organism: Clostridium acetobutylicum # 4 267 6 267 271 239 47.0 4e-63 MENKLEHLKEYLKGLGSVAVAFSSGVDSTFLLKVAHDVLGDKAIAITAQSCSFPKRELNE AKAFCEKEGIQHVICQSEELEIEGFSKNPPNRCYLCKKELFEKIGDIAKKNGIEYIAEGS NMDDNGDYRPGLIAVKELGVKSPLREAKLTKAEIREYSKELGLSTWDKQSFACLSSRFVY GETISEEKLGMVEKAEQLLLDYGFHQVRVRIHGSLARIEIMPEEFPKLLEKNVREDIAKK FKEYGFTYVTMDILGYRMGSMNETLGENDKTTADSSLKGKNE >gi|226332997|gb|ACII01000022.1| GENE 34 38834 - 39688 422 284 aa, chain - ## HITS:1 COG:mlr4907 KEGG:ns NR:ns ## COG: mlr4907 COG2207 # Protein_GI_number: 13474103 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Mesorhizobium loti # 171 275 190 294 305 75 34.0 8e-14 MLAKSKEDIDKCCKGEYFMQLKYVDTKEEIIAINRKSKHIEPHLHNALEIVCVTSGALEF GVGQELYHMEKGDIGFVFPDVIHHYQVLTPGVNKATYLIASPFTIAKFADIMQSMAPEYP IIKEEKVEPEVYRVINAILETEQSDITVAQAYLQIVLARCIGKLNLVEKSNVGSNDLIYQ TVSYISANFKKKFSLEEMAKDLGVSKYVLSRLFSKTFHRNFNQYLNDARLNYACHRLENT SDSITNICLDSGFESQRTFNRVFKERYKISPSDYRSTCVKEMLL >gi|226332997|gb|ACII01000022.1| GENE 35 39744 - 39821 78 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSNLKKYLNDLLVFYTEYFEHPYHV >gi|226332997|gb|ACII01000022.1| GENE 36 39986 - 41383 1199 465 aa, chain - ## 
HITS:1 COG:BS_ynaJ KEGG:ns NR:ns ## COG: BS_ynaJ COG2211 # Protein_GI_number: 16078820 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 2 446 4 441 463 158 27.0 3e-38 MEEKKYLKWYNKIGYGSGDIAGNVVYAFLTSFMMVYLTDSVGLAAGVVGTLIAVSKLFDG FTDIFFGSMIDKTHSKMGKAKPWMLYGYIGCAITLVCCFAIPVSFGTTAKYAWFFISYTL LNGVFYTANNIAYSALTSLITKNSKERVQMGSYRFIFAFSTSLLIQAITVGFVDKCGGDA AAWRTVAIIYAIIGLVINTISALSVKELPEEELNEGEVKDDNEKYGMVQAFKLLVKNKYY MMICGTYILQQLYGAMIGAGIYYMTWVLKNKNLFGQFAWAVNIPLIIALIFTPTLVGKWK GMYKLNLRGYVLAVIGRALVVIAGYMGSVPLMMAFTALAALGQGPWQGDMNAVIASCSEY TYLTQGKRIDGTMYSCTSLGVKIGGGIGTAVVGWMLEFSGYVGTNATQPQSALDMMQFMY LWLPLIFDVLIMFVLSRMNVEDANKKLKAEKEIAADEVTDASDIN >gi|226332997|gb|ACII01000022.1| GENE 37 41551 - 44589 1902 1012 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 3 1006 4 983 1087 551 32.0 1e-156 MIVPRYYEDLSVLHENTMPARAYFIPASKRMDNLVEHREESDRMQLLNGTWKFQYFNSIY DVQEPFFEKDYDTENFDEIQVPSVWQMAGYDTHQYTNIRYPFPFDPPYVPQDIPCGTYAH TFVYHKDENAPKAFLNFEGIDSCFYVWINGSYVGYSQVSHMTSEFDITDLLRDGENSIAV LVMKWCDGSYLEDQDKFRMSGIFRDVYILKRPKQAISDYHIKTRIEDMLAKVEIEMKFYS PLNVKISIEDRNGAVVALGSIAEEGTAVLEIASPELWNTENPYLYKLILETENEVIVDHI ALRKIEIKDQVIYLNGQKIKFRGVNRHDSDPVTGFTINLEQITTDLTLMKQHNFNAIRSS HYPNAPFFYEMCDKYGFMVIDEADIEAHGPFMIYRKEDTDYNRFKRWNEKIADDPVWEEA IVDRVKLMVERDKNRFCIVMWSMGNESAYGCNFEKALEWTKNFDPDRITQYESARYRNYD ETYDYSNLDVYSRMYPALSEIQEYLDKDGSKPFLLVEYCHSMGNGPGDFEDYFQMIQDND KMCGGFVWEWCDHAIAHGTAENGKTIYAYGGDHGEEIHDGNFCMDGLVYPDRTVHTGLLE YKNVYRPARVISYDKESGELVLHNYMDFDDLKDYVKISYELTQDGLVISKGKLPEVSAAP HSEGKINLKINVPESGKCYLKFIYHLKKELPLLDEDHILGFDEIEVSQKDAKCQLAEKWV EKTVTDSELQVSEDDTQIHIKGREFAYTIDRRTALFTEMKFAGREYLNHPMELNIWRAPT DNDMYIKSEWKKAHYDKAYTRAYTTEVVQGKHGVKITSHASVVAETVQKILDVTITWKIE AAGKIDADIAVTKDDEFPDLPRFGVRMFLDKKLSAARYFGMGPQESYCDKHQAASHGLYH 
ANVDDLHEDYIRPQENGSHYDCEYVELNNSRYGIVVSAENAFSFNASYYTQEELEKKTHN YELTESDSVVFCVDYALNGIGSNSCGPVVLDQYRFDDVLFRFQFTLIPYVKG >gi|226332997|gb|ACII01000022.1| GENE 38 44793 - 45656 643 287 aa, chain + ## HITS:1 COG:BH2229 KEGG:ns NR:ns ## COG: BH2229 COG2207 # Protein_GI_number: 15614792 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 48 282 40 276 287 86 28.0 6e-17 MYMNTGYLNHSHMDFKDKSRPLVVGSCGTYRLSSHPKLPTYRPKGRLDYQIIYITAGCGH FHFDNVDNETIVPAGNIVLYRPKELQKYEYYGKDKTEVYWIHFTGNNVKNILRQYGFPDK ERVFQVGTSMEYEQIFKRIIIELQRCQDNYEEMLVLLLRHLLIIFHRELTREHISKNEYL DHEMDNAVTFFSENYNQNISIDDYAASRGMSVSWFIRNFKKYTGSTPMQFIVGIRINNAQ MLLETTTYSINEISKIVGYDNQLYFSRLFHKLKGYSPREYRKIRNKF >gi|226332997|gb|ACII01000022.1| GENE 39 45516 - 45773 103 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSYFKIKVTADLDTSQVEAKLMTLQNRKITIQANAILWILEFVSYLSVLSWGISFQLVEK PAEIKLIIIPDDFRNFIDGIGCCFK Prediction of potential genes in microbial genomes Time: Sat May 28 19:16:39 2011 Seq name: gi|226332996|gb|ACII01000023.1| Ruminococcus sp. 5_1_39B_FAA cont1.23, whole genome shotgun sequence Length of sequence - 55796 bp Number of predicted genes - 60, with homology - 57 Number of transcription units - 27, operones - 16 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 228 - 785 574 ## COG4636 Uncharacterized protein conserved in cyanobacteria 2 1 Op 2 . - CDS 819 - 1334 338 ## COG4636 Uncharacterized protein conserved in cyanobacteria 3 1 Op 3 . - CDS 1368 - 1541 128 ## gi|253578188|ref|ZP_04855460.1| predicted protein 4 1 Op 4 . - CDS 1538 - 1711 267 ## gi|253578189|ref|ZP_04855461.1| predicted protein 5 1 Op 5 . - CDS 1708 - 2253 188 ## PROTEIN SUPPORTED gi|163783284|ref|ZP_02178277.1| 50S ribosomal protein L16 6 1 Op 6 . - CDS 2307 - 2954 306 ## PROTEIN SUPPORTED gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 7 1 Op 7 . 
- CDS 2970 - 3152 131 ## gi|253578192|ref|ZP_04855464.1| predicted protein - Term 3170 - 3204 2.8 8 2 Op 1 . - CDS 3234 - 4724 1356 ## COG4468 Galactose-1-phosphate uridyltransferase 9 2 Op 2 . - CDS 4718 - 4921 139 ## gi|291546014|emb|CBL19122.1| Protein of unknown function (DUF2500). - Prom 4990 - 5049 1.9 10 3 Op 1 . - CDS 5115 - 6443 826 ## AFE_1328 transposase, IS605 OrfB family protein, putative 11 3 Op 2 . - CDS 6489 - 6908 177 ## COG1943 Transposase and inactivated derivatives 12 3 Op 3 . - CDS 6991 - 7371 207 ## EUBREC_3508 hypothetical protein - Term 7407 - 7443 5.5 13 4 Op 1 . - CDS 7465 - 8910 863 ## BAA_0729 hypothetical protein 14 4 Op 2 . - CDS 8907 - 9395 223 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 9521 - 9580 5.3 - Term 9816 - 9874 7.8 15 5 Op 1 3/0.000 - CDS 9876 - 11624 881 ## COG4219 Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component 16 5 Op 2 . - CDS 11631 - 12020 308 ## COG3682 Predicted transcriptional regulator - Prom 12063 - 12122 5.8 - Term 12080 - 12130 13.0 17 6 Tu 1 . - CDS 12153 - 12995 899 ## gi|253578202|ref|ZP_04855474.1| conserved hypothetical protein - Prom 13038 - 13097 8.3 - Term 13495 - 13534 5.8 18 7 Op 1 . - CDS 13624 - 13935 242 ## Cphy_3169 hypothetical protein 19 7 Op 2 . - CDS 13998 - 14114 69 ## 20 7 Op 3 . - CDS 14107 - 15339 844 ## EUBELI_20456 hypothetical protein - Prom 15388 - 15447 2.9 + Prom 16142 - 16201 7.4 21 8 Tu 1 . + CDS 16382 - 17134 175 ## LSL_1851 hypothetical protein + Term 17290 - 17322 -0.9 22 9 Tu 1 . - CDS 17096 - 17839 195 ## gi|253578207|ref|ZP_04855479.1| predicted protein - Prom 17911 - 17970 8.6 23 10 Tu 1 . - CDS 18086 - 19003 473 ## COG4974 Site-specific recombinase XerD - Prom 19069 - 19128 7.2 - Term 19116 - 19164 2.2 24 11 Tu 1 . - CDS 19172 - 20446 413 ## COG1204 Superfamily II helicase - Prom 20479 - 20538 5.0 25 12 Tu 1 . 
- CDS 21726 - 22718 502 ## gi|253578210|ref|ZP_04855482.1| predicted protein - Prom 22740 - 22799 3.5 - Term 22791 - 22847 11.6 26 13 Op 1 . - CDS 22871 - 23089 57 ## gi|153813353|ref|ZP_01966021.1| hypothetical protein RUMOBE_03772 27 13 Op 2 . - CDS 23101 - 23796 477 ## COG3505 Type IV secretory pathway, VirD4 components - Prom 23966 - 24025 3.5 + Prom 24092 - 24151 4.2 28 14 Op 1 12/0.000 + CDS 24295 - 24798 189 ## COG0582 Integrase 29 14 Op 2 . + CDS 24795 - 25538 294 ## COG0582 Integrase 30 15 Tu 1 . - CDS 25449 - 25799 115 ## - Prom 25864 - 25923 5.1 - Term 25893 - 25935 7.2 31 16 Op 1 . - CDS 25959 - 27404 774 ## Csal_2906 hypothetical protein - Prom 27488 - 27547 3.0 - Term 27422 - 27465 0.3 32 16 Op 2 . - CDS 27567 - 29366 1145 ## COG0367 Asparagine synthase (glutamine-hydrolyzing) - Prom 29497 - 29556 9.7 33 17 Tu 1 . - CDS 29660 - 29848 239 ## gi|253578217|ref|ZP_04855489.1| predicted protein - Prom 29883 - 29942 6.5 + Prom 30177 - 30236 5.3 34 18 Op 1 . + CDS 30256 - 30420 262 ## gi|253578218|ref|ZP_04855490.1| predicted protein 35 18 Op 2 . + CDS 30445 - 31365 609 ## gi|253578219|ref|ZP_04855491.1| transposase + Term 31381 - 31418 0.8 36 18 Op 3 . + CDS 31419 - 32117 274 ## COG3436 Transposase and inactivated derivatives + Term 32124 - 32173 2.2 37 19 Op 1 . - CDS 33198 - 33836 206 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 38 19 Op 2 . - CDS 33851 - 34711 393 ## DSY2148 hypothetical protein - Prom 34795 - 34854 3.2 - Term 34762 - 34801 -0.3 39 20 Op 1 . - CDS 34858 - 35148 222 ## gi|253578224|ref|ZP_04855496.1| predicted protein 40 20 Op 2 . - CDS 35161 - 36258 783 ## gi|253578225|ref|ZP_04855497.1| predicted protein 41 20 Op 3 . - CDS 36194 - 36322 63 ## 42 20 Op 4 . - CDS 36326 - 36718 253 ## DSY3434 hypothetical protein - Prom 36764 - 36823 3.6 43 21 Op 1 . - CDS 37185 - 37892 437 ## gi|253578227|ref|ZP_04855499.1| conserved hypothetical protein 44 21 Op 2 . 
- CDS 37889 - 39187 665 ## EUBREC_0964 hypothetical protein 45 21 Op 3 . - CDS 39190 - 39906 224 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 46 21 Op 4 . - CDS 39884 - 40612 413 ## EUBREC_0961 hypothetical protein 47 21 Op 5 . - CDS 40631 - 41521 349 ## EUBREC_0962 hypothetical protein 48 21 Op 6 . - CDS 41523 - 43007 966 ## EUBREC_0960 hypothetical protein - Prom 43033 - 43092 5.4 49 22 Op 1 40/0.000 - CDS 43107 - 44318 725 ## COG0642 Signal transduction histidine kinase - Prom 44385 - 44444 6.5 50 22 Op 2 5/0.000 - CDS 44525 - 45190 755 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 45349 - 45408 5.1 - Term 45385 - 45427 10.0 51 22 Op 3 . - CDS 45528 - 45869 295 ## COG0784 FOG: CheY-like receiver - Prom 45915 - 45974 1.9 52 23 Op 1 . - CDS 46040 - 47368 598 ## AFE_1328 transposase, IS605 OrfB family protein, putative 53 23 Op 2 2/0.000 - CDS 47410 - 47829 266 ## COG1943 Transposase and inactivated derivatives - Prom 47861 - 47920 7.0 54 23 Op 3 . - CDS 47967 - 48845 439 ## COG0642 Signal transduction histidine kinase 55 23 Op 4 . - CDS 48942 - 49343 386 ## EUBELI_20437 hypothetical protein - Prom 49370 - 49429 3.9 56 24 Tu 1 . - CDS 49453 - 49803 292 ## EUBELI_20438 polar amino acid transport system substrate-binding protein - Prom 49953 - 50012 2.9 - Term 49956 - 49997 2.0 57 25 Op 1 40/0.000 - CDS 50075 - 51352 505 ## COG0642 Signal transduction histidine kinase 58 25 Op 2 . - CDS 51396 - 52094 403 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 52165 - 52224 1.8 59 26 Tu 1 . - CDS 52269 - 53879 590 ## EUBREC_2038 hypothetical protein - Prom 53970 - 54029 3.4 60 27 Tu 1 . 
- CDS 54097 - 55632 897 ## gi|253578244|ref|ZP_04855516.1| predicted protein - Prom 55717 - 55776 4.1 Predicted protein(s) >gi|226332996|gb|ACII01000023.1| GENE 1 228 - 785 574 185 aa, chain - ## HITS:1 COG:all1618 KEGG:ns NR:ns ## COG: all1618 COG4636 # Protein_GI_number: 17229110 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in cyanobacteria # Organism: Nostoc sp. PCC 7120 # 10 139 23 153 189 69 35.0 3e-12 MPLLEEEYRKEEKINGVIYDMSPSPNYQHGIVDGNIYSIIKSGLKGTLCLVFMENLDYKY HAQENNDYVIPDVMIICDRKHLKGGSYTGTPRFIVETLSPATALRDKTVKKELYQNAGVE EYWIISPRERAVEIYYLENGKYDLKYSYILQDDSEEEHYNADTVVTLKDFPKISMTLAEM FENVE >gi|226332996|gb|ACII01000023.1| GENE 2 819 - 1334 338 171 aa, chain - ## HITS:1 COG:alr4912 KEGG:ns NR:ns ## COG: alr4912 COG4636 # Protein_GI_number: 17232404 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in cyanobacteria # Organism: Nostoc sp. PCC 7120 # 4 166 30 186 188 68 26.0 6e-12 MDWNLKITDMAGATPEHSSVIVNFVAAVRYQLKNSTCYVYSDNVQYRFKDAEGNNKIVIP DASINCRTKSRRGNTFTDAPRFVMEVLSPSTEKYDRTEKMELYAQQEIEEYWIVDWRTKT IEIYELDYDENDIPKYYLWKTISEENKDELRLVHFPNIKITFEDLFDEIDY >gi|226332996|gb|ACII01000023.1| GENE 3 1368 - 1541 128 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578188|ref|ZP_04855460.1| ## NR: gi|253578188|ref|ZP_04855460.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 57 1 57 57 87 100.0 2e-16 MIILNKEHREIYSREVTREIPNSETISAINEVQQMKQNPSLGKSYTDIDKMMIDLTI >gi|226332996|gb|ACII01000023.1| GENE 4 1538 - 1711 267 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578189|ref|ZP_04855461.1| ## NR: gi|253578189|ref|ZP_04855461.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 57 1 57 57 73 100.0 3e-12 MSDNMTKRAKEYMNADFVYEDYKNLAEAQKIADVEDVTAISERLIKQNMEAYKELAK >gi|226332996|gb|ACII01000023.1| GENE 5 1708 - 2253 188 181 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163783284|ref|ZP_02178277.1| 50S ribosomal protein L16 [Hydrogenivirga sp. 
128-5-R1-1] # 6 157 5 158 185 77 32 2e-13 MPLLKEEAYTIDDIYALPDGERAELIDGRMYMMAPPSRKHQRISTRLVSIIDRYIEEHKG KCEVYAAPFAVYLDERSNTYVEPDISVICDPDKLDDRGCKGAPDWIIEIVSPASKKMDYL LKLLKYRFAGVKEYWIVDPEKNRVIVYNFTGDESVNDYTLQDFVKVGIYEDFVIDFGSME L >gi|226332996|gb|ACII01000023.1| GENE 6 2307 - 2954 306 215 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 [Haemophilus influenzae PittAA] # 21 189 20 190 205 122 38 4e-27 MIKFENVVFGYGEEKNALNYVSFHISEGEQVGLIGANGAGKSTLMKAMLGLIPAKGKITV DEIEVNQENSSRIWQCLRYVFQDSDNQMFMPTVYEDMIFGPLNYGKSRKEADQLAREALE ELDLIHLKDRQNYKMSGGEKRMAAIATILAMNPKVMLMDEPSTALDPKNRRRLIRILQKL PTTKIIATHDLESKMTAADADIYICFVRIKKQVAE >gi|226332996|gb|ACII01000023.1| GENE 7 2970 - 3152 131 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578192|ref|ZP_04855464.1| ## NR: gi|253578192|ref|ZP_04855464.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 60 1 60 60 100 100.0 3e-20 MDRAVALYESMELRGFHGEFYYAEKVYNKLVSWGYVIFWVSVFILMRTCSVTVLLGRMFV >gi|226332996|gb|ACII01000023.1| GENE 8 3234 - 4724 1356 496 aa, chain - ## HITS:1 COG:CAC2961 KEGG:ns NR:ns ## COG: CAC2961 COG4468 # Protein_GI_number: 15896214 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose-1-phosphate uridyltransferase # Organism: Clostridium acetobutylicum # 1 496 1 497 497 563 54.0 1e-160 MLSESIAKLVQYGITTGLTPECERNYTTNLLLDVFHEDDYEKPDNIEESVNLEATLGELL DEAVKRGLIEDSIVYRDLFDTRLMNCLMPRPGQVQKEFWDKYKESPKEATDYFYKLSQDS NYIRRYRVEKDQKWKVDSPYGEIDITINLSKPEKDPKAIAAARNVKSGSYPKCLLCPENE GYAGRVNHPARQNHRIIPITINDTPWGFQYSPYVYYNEHCIVFNCQHTPMKIERNAFIKL FDFVKLFPHYFLGSNADLPIVGGSILSHDHFQGGHYTFAMAKAPIEQHVVLSGFEDVEAG IVKWPLSVLRIRHKDSNRLVDLATHVLEVWRGYTDEAAFIYAETNGEPHNTITPIARKVG DIYELDLTLRNNITTEEHPLGVYHPHAEYHHIKKENIGLIEVMGLAVLPARLKGEMELLE EYILEGKDISSNEQIEKHAEWVKKFLPKYPEITKENIHGILQKEIGIVFTHVLEDAGVYK CTTEGREAFMRFLETL >gi|226332996|gb|ACII01000023.1| GENE 9 4718 - 4921 139 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291546014|emb|CBL19122.1| ## NR: gi|291546014|emb|CBL19122.1| Protein 
of unknown function (DUF2500). [Ruminococcus sp. SR1/5] # 1 67 50 116 116 132 98.0 1e-29 MQSEAHVPWFRVVFEMTDGSLMEFQMQQGDFRELEKGDRGMLTYQGNRYICYQKYKFKSE AQEEEKC >gi|226332996|gb|ACII01000023.1| GENE 10 5115 - 6443 826 442 aa, chain - ## HITS:1 COG:no KEGG:AFE_1328 NR:ns ## KEGG: AFE_1328 # Name: not_defined # Def: transposase, IS605 OrfB family protein, putative # Organism: A.ferrooxidans_ATCC23270 # Pathway: not_defined # 18 399 2 391 439 170 31.0 1e-40 MRTVSSYGVELRKQNIPIRQTLDIYRSAVSCLIEIYAQIWDELAVITEPKNRFNTAEHLV HTTKKNQARFDFDLRFPKMPSYLRRAAIQHALGSVSSYKTRMELWKKTDQKGGRPKLVCD NHAMPVFYRDVMYREDTEEKDGACLKLYDGHDWKWFRVSLSHTDMEYLRKNWAGKKAAAP VLEKRHRKYFLRFSYTEEVPLTKTPVKEQIICSVDLGLNTDAVCTIMRADGTVLGRKFID FPSEKDRMYRVLGRISRFQRKHGSAQVKSRWAYAKRLNTELGRKIAGAVTGYAEENHVDV IVFEYLEIKGKISGRKKQKLHLWRKRDIQKRCEHQAHRRGMRISRICAWNTSRLAYDGSG TVVRDPGNHSLCTFQNGKRYNCDLSASYNIGARYFIRELLKPIPVTERSLLEAKVPSVKR RISCVYADLRELFLEMELLRAA >gi|226332996|gb|ACII01000023.1| GENE 11 6489 - 6908 177 139 aa, chain - ## HITS:1 COG:DR0667 KEGG:ns NR:ns ## COG: DR0667 COG1943 # Protein_GI_number: 15805694 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Deinococcus radiodurans # 8 139 9 140 140 156 51.0 1e-38 MFNINNNELSYGRGYVYSLQYHLVWCTKYRKKVLKDGIDAECKEMLCDLAEEYKFQILAM EVMPDHIHLLVDCRPQFYISDMIKIMKGNIARQMFLAHPELKKELWGGHLWNPSYCAVTV SDRSREQVCSYIEGQKEKQ >gi|226332996|gb|ACII01000023.1| GENE 12 6991 - 7371 207 126 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3508 NR:ns ## KEGG: EUBREC_3508 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 107 263 369 396 190 92.0 1e-47 MLYSIINLPNINLGQLQIILAAAGLLSVLATVSCTLFLSAKFKDTLTVLLISIVVLFMPL FAYVAMGATWFSTILPSAGIGMQNNFLYQLANFNYLHIGGMSFWTPYGFRQELNNCASHS RYLVMS >gi|226332996|gb|ACII01000023.1| GENE 13 7465 - 8910 863 481 aa, chain - ## HITS:1 COG:no KEGG:BAA_0729 NR:ns ## KEGG: BAA_0729 # Name: not_defined # Def: hypothetical protein # Organism: B.anthracis_A0248 # Pathway: 
not_defined # 3 457 19 449 477 111 25.0 5e-23 MRLEDMKKDIPETPEFIHTMIQSEVKKQLQDTKVVNIQTRRVKKWTGARVAAAIAVCVLA TSTIVYAGTKLYHMFLEKQGAYSIVTGIKADGSTGKINLPERIHDIDISAGYIPEGMEWI DEFHLEYPEHDRTGGFSFASVLLDEDDLSKVMQDKNVVDCEERTFGNYEGVYLKYNDLAE DGSFNQRIYLLRPDVYRVITVYIGDDISKEDAIKVVENLVITENDTMIETAGLYTWSEMV SPEESSGEAVMTSIADNKLLMHQIGEVFDISASGEDRDGNYIENDKISVCVDAVQVEDNL QLLGQNNVPEEWTDAVGTDGNLVNNTLSYIKSGNGIDSVDEIVKTESVKQKLVYATVTYT NKSDEEINHMLYIGTLLLMDHEDGSYQIYDPTEQSGDDYDRVIWDGVARTAEMTYNSISE DYGNGGNYISSLKPGESIQVNMAWIVNENDLNNMYLNLNGDGAAYEFSDSMLKTGLVDIY Q >gi|226332996|gb|ACII01000023.1| GENE 14 8907 - 9395 223 162 aa, chain - ## HITS:1 COG:mll8140 KEGG:ns NR:ns ## COG: mll8140 COG1595 # Protein_GI_number: 13476734 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 8 156 17 177 208 77 32.0 8e-15 MTKEELGTLILNSERQLYSTAKTILINDQDCADAIQETIVKAFSKIGTLRNDKYAKTWLI RILINECYTLLRKSSKLVSLEGMSEMTEIETDQRTDYSDLYRAVNSLKEELRMPVILYYI EDFNIKEIAQILEITEGAVQKRLARARGKLKRNLQESEELTV >gi|226332996|gb|ACII01000023.1| GENE 15 9876 - 11624 881 582 aa, chain - ## HITS:1 COG:CAC3437 KEGG:ns NR:ns ## COG: CAC3437 COG4219 # Protein_GI_number: 15896678 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component # Organism: Clostridium acetobutylicum # 4 491 7 496 541 127 23.0 7e-29 MEVLVLHILKLNIIAAIVILLVKVLATLFKGRVSARWKYFIWLLITISLCVPVRLPANLA LVDFKVLRSSQQNTQNPKITDYAVRPEESQKITETVSSASNDMKNLTEIPEQKRTVRYES DKWLGIVAVIFVAVWLSVAVLKLTGELLAYYFSIRNLERMSLQVSDTVSIQMYRAACQKK HVRRIPELRQNAGLTTPLLAGLLHTKLYLPATGYSAEERKLIFYHELTHYCHRDLWYKML LRICASIYWFNPFLLIMLKEADKDIENLCDTAVVRRVNKKEHKLYRQLLLRTVAMENQIP YVTASLNDSEMVFKDRILYMVNIRKLRKGILPGILVTLLLAGGNLVFNVSAGTDTVSVET EKSGIEKNADPEKNNVPDYAPFSEMVTMQKAAETQDEGGVTENTDTEDSVDEEEKADAET NDEPAMENSGQVSETTDDGITSDNNGSVSSYENLPAGVPYTSGFTTTSGVASIVAPGGGD 
EESRVLYDNGDGTYSDYYGSRYSYQGDGNWADANGNSYRTWNDEDYYSGNQLEQHELQGS NGTVTVKETTNGDYYYCDADGVGYTDNGDGTWTDENGNIYTE >gi|226332996|gb|ACII01000023.1| GENE 16 11631 - 12020 308 129 aa, chain - ## HITS:1 COG:CC1640 KEGG:ns NR:ns ## COG: CC1640 COG3682 # Protein_GI_number: 16125886 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Caulobacter vibrioides # 1 126 18 142 144 69 30.0 1e-12 MKKTYQRLPESELDIMLVLWNNTPPMTRPEIEKVINTKKKLASTTILSLLTRLENKNFVE VTKQGKMNLYTPLVSQSDYQAHESQSVLEKLYGNSLKKFVTSLYHGKKISSEEVQDLSEF LKNLEDREE >gi|226332996|gb|ACII01000023.1| GENE 17 12153 - 12995 899 280 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578202|ref|ZP_04855474.1| ## NR: gi|253578202|ref|ZP_04855474.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 280 1 280 280 402 100.0 1e-110 MKTKFMTLMTVATLGIFSAATVMAAGNNDFYTQPPAGSEWEKADTEGTYASYTNGNDYVN IYKYSLEDVKTGIAKCNDTYDACYQTIYSEGNSLYMAVGYAKDADDISDVRKFVEYISYP GNPSLTRQENENQTDDSSDDSDKATTDTASQKSGGDSDILTTSPMTLVDSDGNTIEINYI LHKDYSCEYRGDDGMNYTDNGDETYTDENGNTYSALNDDTHHLGYTLEQHDLQDANGNTV TVTETTNGDYYFQDENGTGFTDNGDGTWSDENGSSYTEID >gi|226332996|gb|ACII01000023.1| GENE 18 13624 - 13935 242 103 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3169 NR:ns ## KEGG: Cphy_3169 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 11 90 145 223 231 72 45.0 6e-12 MYNRKTVAGFPYEIYYFSRNMEHVLHDRAENLTDDEKEDLAFDIADQYTDQLEKFLEYLH DDGFYVCGTYKDTWEFIMDGNNSLNRYCSYCNVAVFFEQLEIG >gi|226332996|gb|ACII01000023.1| GENE 19 13998 - 14114 69 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTKKVILFLVEGPTDEDALAVIFTKLVNDHERNILMHI >gi|226332996|gb|ACII01000023.1| GENE 20 14107 - 15339 844 410 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20456 NR:ns ## KEGG: EUBELI_20456 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 406 68 475 477 214 33.0 7e-54 MVLLKYLLSGRQIPSNFYYYINEASDTSTVKYRFELKIEEKCYLVEYEIGLLKNGKKSFC 
ISKEKFSQREMSEGKRITPVFDYQKGKKELFRPLKLYERFSKDIQNVIALGVAEQATQNY NEEKGVPEVSSFLFSRKAQEVFEKAEGEAALLALLSQCFQKYGIYALAVVENAQYGYLSL NLESMPVNIDWSDTIPKKGEGIFLRFTDINVVPKEVFPYVANTIKQINIVMKSLIPEINI EIYNAFDKLLKDGKDGVQFEIIAMRGENRIPLLYESAGIKKLISICSNLVDCYNRESYCL VVDELDSGIYEYLLGECLEVMQDKAKGQLIFTSHNLRPLEILENDSLLYTTVNPENCYIK STYIKNTQNTRLSYLRTIKLGGQKEKLYNETNIYEMELAMRRAGRIRAND >gi|226332996|gb|ACII01000023.1| GENE 21 16382 - 17134 175 250 aa, chain + ## HITS:1 COG:no KEGG:LSL_1851 NR:ns ## KEGG: LSL_1851 # Name: not_defined # Def: hypothetical protein # Organism: L.salivarius # Pathway: not_defined # 48 242 28 208 226 90 33.0 6e-17 MYIDENVSQDFYYFVEDLKKRYNITCLCASKDGMLNIINCLNGKTKEKKVFISGSYDSVT ENTNNFADKLSCALVNHLYNSGYRISTGVGKRLGTYITGYANQYLAERNISNTPKYLSMR PFPFHMNLSEEKKETYRKLMQRDCSAAIFLFGQSRNMNKAGMFDDNIAHFSTGTYQEFQI AKANNFVIIPVGATGYEANIIWKEVKANINQFYYLSKKIDILRYETEPQKISEVIVNILN SVTEYRCIKQ >gi|226332996|gb|ACII01000023.1| GENE 22 17096 - 17839 195 247 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578207|ref|ZP_04855479.1| ## NR: gi|253578207|ref|ZP_04855479.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 247 1 247 247 473 100.0 1e-132 MLFFCFSSKDRHSIVESILFHLTNYGLPVWYDRHKMLLGDERDNKNFDEGVKACNYSIII LSANTIASECANEEIDLIYQRYKQHKMYVFPIFFNIKASQLPEKYCWMKRLVYKELTVAN DSRSACNHIICKVTLDELQKYKIKTINEYLKLYKNNKAFSYLTELIDSYCKISDENHNAQ IALLYAGCLYIKEKYNCCLEIPEFYYKGIQRLFEETRLNLPIDLRETLIFERLFLLLFNA AIFGYTI >gi|226332996|gb|ACII01000023.1| GENE 23 18086 - 19003 473 305 aa, chain - ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 3 296 1 296 297 170 32.0 4e-42 MRLQDKLVSYLEYCTYRKELDQKTVKAYRIDLNQYFTFVACEEPDKEKIEEYITELHKKY KQKTVKRKIASVKAYYSYLEENELIKESPFRKIKVKFKENLILPRIIPREEIEHLLNHMY GCLQQASETVYKYCLRDVAVIELFFATGARVYEISNIRAENVNLNTGLIQLMGKGAKERY VQIGSTSVLNILRKYYAENRAAIKKSGYFFVNGRGNRYTEQSIRLMLKKYTAQAGIERNI TPHMFRHSFATYLIEEGVDISCVQRILGHSSIKTTQIYIHVAARKQAEILRDMHPRNNMK IIGAA >gi|226332996|gb|ACII01000023.1| GENE 24 19172 - 20446 413 424 aa, chain - ## HITS:1 COG:SMa2245 KEGG:ns NR:ns ## COG: SMa2245 COG1204 # Protein_GI_number: 16263663 # Func_class: R General function prediction only # Function: Superfamily II helicase # Organism: Sinorhizobium meliloti # 2 415 423 837 845 308 37.0 1e-83 MDSLPYFDSLADRYEEQACPNSELINKRIAQKIEQGIGRGVRGEKDYCAILIIGSELVRF MRSIATNKFFSPQTRKQIDIGIEIADMAKEDKTESPIKVVLSLIKQMLVRDEGWKEYYAS EMDTIVEDNAESQVYDRLLKERQAEQFFCEGEYEKAISSMQRLIDGLNVDDTEKGWYLQQ LARYTYPTSIAESIKIQKSAFKKNTQLLKPSTGIDYTKISYIHQDRLNNIRTYMRKFSDY GELFLSVNATLDNLSFGIEATKFEAALKDVGALLGYVSQRPDKEIRKGPDNLWCGSNDHY LLFECKSEVSGTRQEITKHEAGQMNNHCAWFEDQYGPNANVDRFMIISTKTLSYEGDFTH KVKIIRPNKLKILKDQIKGFIKELKPFSLDEISDDTLHEFLTLHHLNVEDFSEQYSEEYY HKTR >gi|226332996|gb|ACII01000023.1| GENE 25 21726 - 22718 502 330 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578210|ref|ZP_04855482.1| ## NR: gi|253578210|ref|ZP_04855482.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 330 1 330 330 575 100.0 1e-162 MGRNMEDNYLKLFGNMFYNIKEDQDNKKIIEEASEVLASIIIKYGELINEITARTSAQDE FVDTVICLFLRKIMEQLDALNILIDAGAFSPMQVIVRTLLENTVGLEFILKEDTRKRAAA YYLEQHYKEIETGEKLFARKSEKSKNAIKLLGEAQFTEDSKRFQRKKDAFQHLIDTNELF KEVADAREKAIESKRKYWKKKKRKNINKIHIQWYEIGTNVQNFREMMRETGHEKYYEGIY GGLSYEIHALNAARAIQAVADGLYLQRIRNPQNGGNIFAFVCDFSLIALDKVYKYLDDDE AKRAEFKRYVMDCQKKKVHIIATMDQINIL >gi|226332996|gb|ACII01000023.1| GENE 26 22871 - 23089 57 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153813353|ref|ZP_01966021.1| ## NR: gi|153813353|ref|ZP_01966021.1| hypothetical protein RUMOBE_03772 [Ruminococcus obeum ATCC 29174] # 1 72 35 106 106 134 95.0 2e-30 MEKILLAESCIEVYENIEEFLEKSGWRRDNPELCSECYLFENHICRKIQGKIWYFSRIQY ENGLRGMKQGFN >gi|226332996|gb|ACII01000023.1| GENE 27 23101 - 23796 477 231 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 2 199 384 577 591 172 47.0 4e-43 MLATLFTSFLSIKLVRLADRTEERVLPVPISFILDEFPNIGVVPDFKKKLATARSRRIGM SILFQNIPQLQNRYPDNQWEEILGGCDFSIFLGCNDMTTATYYSDRIGEITVSVASIRKN FFTIRATDYVPEYSESRSIGKRQLLLPDEILRYPLDQGILIIRGHNVLRFKKMDYTEHAD AGKLKMEKVVDHIPDWYRKWEKQQQEFQILDEQVYEAMDSEFCLPRNLDFI >gi|226332996|gb|ACII01000023.1| GENE 28 24295 - 24798 189 167 aa, chain + ## HITS:1 COG:mlr9321 KEGG:ns NR:ns ## COG: mlr9321 COG0582 # Protein_GI_number: 13488299 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 8 153 264 409 416 77 30.0 8e-15 MVMIGLKLGFRSSNVINLKLSDIDWMNKKISITQYKTKVSISLPLSVDVGNAVFRYMKYG RPECDDPHVFVRHKAPYGILSGKICSNALNRVLSTVSDGSSVKFHTLRKTFATSILKNNA GVERVIDALGNRDSTTVNVYLTYDELHMRKCALSLKEMSIPMGGATK >gi|226332996|gb|ACII01000023.1| GENE 29 24795 - 25538 294 247 aa, chain + ## HITS:1 COG:mll9330 KEGG:ns NR:ns ## COG: mll9330 COG0582 # Protein_GI_number: 13488151 # Func_class: L 
Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 14 194 12 189 323 68 25.0 7e-12 MILEEYIFTSCFLNDIQGLIEQKKQSGYIYDSAKYILIQFDKFCVENHIDRAIITKELSD TWLSYHADEAKSRRASRMSVLRQLVQYMNSQGSECYIPSKFSAKSYRVPYVMNEREIREF FTVADDYIPKVNADRFSILAEEYKVLFRFIYCCGLRVSEARKLKLEDIDFERKTALILRS KGDKDRLIYIADDVCNMCLDMLDLLHDKYHFYSEYLFPSSDPNTPLQVASINKKFQEFCN NIVYICF >gi|226332996|gb|ACII01000023.1| GENE 30 25449 - 25799 115 116 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYYEKETADKPEKWPANAREQILDTLCECVEKFEKNPSYKTREVLLSLTCEHDLNLNENF GLVRVTEYEVGILNFLYLVGNTYQISSLKTYIYNIIAEFLKFFVYRCHLQGGIGIR >gi|226332996|gb|ACII01000023.1| GENE 31 25959 - 27404 774 481 aa, chain - ## HITS:1 COG:no KEGG:Csal_2906 NR:ns ## KEGG: Csal_2906 # Name: not_defined # Def: hypothetical protein # Organism: C.salexigens # Pathway: not_defined # 3 459 5 467 480 172 27.0 2e-41 MKLYHYRSINSALLEIENGTFHFASKEELNDPLEGFVRVFWQGDKMAWEGLFRHYIYSVA RALELYILKADDETLYHGTLVADVHCYKNNFFEKILLKLGEEFITDTDVQNLAGVYGDNC LKVSEKELQYILFYIHNNALIKCLEEFKKNKFVPAEEAEKQIKLLNFSLSVEKLVDAIKK VFSNEKMRVQTIESMEEIFEEMKEFSYIMKGAENDIFLHGKGSEEQIYNNDGNSVVQQHR KWLIVMADFPKVFVAQLRDMIYPKSYVVCFSKKNDNSAMWGNYADCHKGVCLIYDTGDEA KLKVGGRHIPLDVRAISYGGESIECNFFQTLGRLTMVQIREWLLGVDGVSSCYEAFSDVE EWRKRYWKIYDAKIYRKTKNWEHEKEFRVAVSNTFGEFDVPQKQNMSFDWNLLKGVIFGI RTSEYDKKQILDKLIKHKDELSDFTFYQAEYSAEEQKIKIRKKKFWRLINYKGKVDGTEK V >gi|226332996|gb|ACII01000023.1| GENE 32 27567 - 29366 1145 599 aa, chain - ## HITS:1 COG:BS_asnB KEGG:ns NR:ns ## COG: BS_asnB COG0367 # Protein_GI_number: 16080106 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Bacillus subtilis # 1 590 26 609 632 512 45.0 1e-145 MMKAIEHRGPDSEGSFCREKITLGFRRLSIIDLEDGQQPMESADGNLHIVFNGEIYDYKE LRAELEASGISFCTHSDTEVLVNTIQQYGEKALDKLRGMFGFAVWNEKEQTLMLARDFFG IKPVYYAEIDGHFVFASEIKSILAFPGYERKVNQKALEQYLSFQYAPLEETFFKGIYKLM PGHMLLYKNGNYEIKTYFKTKLTPDKWNRKTNMDQLRSQLAEILKDSVKHHMLSDVEVGA FLSGGVDSGYLASASGADQAFTVGFDEGDRYSEVNKAAKVAEKASLKHHVKIISKQEFWD 
ALPDVMYHMDEPLGDASAVALYFLSKEAAGHVKVVLSGEGADELFGGYNIYREPEALKKV AWIPFVLRRAVRKLAAKLPDVKGRDFLIRAGMKVEERFIGNAYIYCEKEKAQILKNKVTG PSTQEYLSQFYEELESENRGSLQDMEKMQSVDLNYWLPGDILQKADKMSMAHSLEVRVPF LDKEVFNFAAKLPKEAKIAAGTTKYIFRKAVSEFLPQETNERKKLGFPIPIRVWLRQNDW YQMVTDLFTSKEAEEFFHTEKLLQLLNDHKDKKADNSRKIWTVLTFLIWYDRFFTTCSK >gi|226332996|gb|ACII01000023.1| GENE 33 29660 - 29848 239 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578217|ref|ZP_04855489.1| ## NR: gi|253578217|ref|ZP_04855489.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 62 1 62 62 110 100.0 2e-23 MKKTYQRLPESELDIMLVLWNNTPPMNRMEIVAAGEGDEESRVLYNNGDGTWIDENGNLH EE >gi|226332996|gb|ACII01000023.1| GENE 34 30256 - 30420 262 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578218|ref|ZP_04855490.1| ## NR: gi|253578218|ref|ZP_04855490.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 54 1 54 54 69 100.0 5e-11 MEIDFIQELKKRDELLDGYLRQIEIQEEFIQKQKELIECLEDHISKITDIVGSV >gi|226332996|gb|ACII01000023.1| GENE 35 30445 - 31365 609 306 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578219|ref|ZP_04855491.1| ## NR: gi|253578219|ref|ZP_04855491.1| transposase [Ruminococcus sp. 
5_1_39B_FAA] # 1 306 1 306 306 571 100.0 1e-161 MPDTYDHITLMCRLKAAQRRNKELESGERYIQLEELHQKEYNVYEHKIEKLKKELADAHK ETIRVRNYWFQVLEDMLREFEKAQKRSAQELRKMEIRALNAEKQREDALDKAAVFRHQFY EAASRLEEEQGKNLKLRAQINRDYENSSIPSSKAVRRKKITNNREKTGRRPGGQPGHKGH CRKRQEPTQPVILLPPPEEVLEDCAFKKTARTIVKQMVSIRMVLNVTEYHADVYYNSHTG ERAHAAFPDGVIDDVNYDGSIRAFLFLLNNDCCTSIDKSRAFLSDLTGGKLNISKGMISR LNRSLL >gi|226332996|gb|ACII01000023.1| GENE 36 31419 - 32117 274 232 aa, chain + ## HITS:1 COG:MA1425 KEGG:ns NR:ns ## COG: MA1425 COG3436 # Protein_GI_number: 20090285 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 1 216 232 455 477 83 28.0 3e-16 MHTDCTNARENGKSCQVYVCATPDGKALYFAREKKGHEGVKGTVAEDYQGILVHDHDITF YNYGADHQECLAHVLRYLKASIENEPDRTWNKKMHALVQEMVHFRNGCRQPHGPDPEIVS GFEKRYREILETARKEYENIPANDYYKDGYNLFLRMEKYMHNHLLFLHDIRVPATNNEAE RLLRNYKRKQAQAVTFRSFENIDYLCQCMSMLVLMRLEDPANIYDRVSRIFG >gi|226332996|gb|ACII01000023.1| GENE 37 33198 - 33836 206 212 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 8 204 12 213 318 84 29 2e-15 MNEKKYIIEVSHVNKNFKNNKVLKDVTLRCESGRIYGLVGHNGSGKTVLFKTICGFLSCD EGTVSVNGKVMGKDKDMLTEAGIIIEDPGFLRNWSAYHNLEFLYTIRNRPDKSYLHSVLN TVGLDSKLKRPVGKFSLGMRQRLALAQAIMEDPPILILDEPMNGLDKNGVAEIREFLLKM KNENRLIILASHNREDIEILCDEVYEMEYFGK >gi|226332996|gb|ACII01000023.1| GENE 38 33851 - 34711 393 286 aa, chain - ## HITS:1 COG:no KEGG:DSY2148 NR:ns ## KEGG: DSY2148 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 95 256 91 255 284 79 31.0 1e-13 MDTYIYAVEMSGKMIAYAFCASAFAAVFCEDLENKYLRYSIGRGNLKKYVLSKVIVIYVS SIITMILGTLLFIMYLRLQLPWTSDLATSSISGMYHAIRVGRQKWGLGQIGGIFFRSLIM VLITIACTILVLLPKVEWSNEWGKLLRTAATTNALSEYQSSVLIYYDIFSEYTPIELILL EILLCTLICTFVGVLMFLFSLYINKIIAIALFPVLNMHPLICYKLARFVPTVWAELARIA TVDHTYYWLPSLSYMVVFLLVGITIMTGLIMHKIRYMQFNWENEDI >gi|226332996|gb|ACII01000023.1| GENE 39 34858 - 35148 222 96 aa, chain 
- ## HITS:1 COG:no KEGG:no NR:gi|253578224|ref|ZP_04855496.1| ## NR: gi|253578224|ref|ZP_04855496.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 96 1 96 96 176 100.0 5e-43 MDGSTIPYYVVEERKKLNRCIEEAPNDYEKEIYRAIRNMPVPYEKLNEILENSRDIITSK QNRKAKNRKFFATGIPIIFMILVVAVYTLLNKQGDD >gi|226332996|gb|ACII01000023.1| GENE 40 35161 - 36258 783 365 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578225|ref|ZP_04855497.1| ## NR: gi|253578225|ref|ZP_04855497.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 365 9 373 373 654 100.0 0 MVKKVLLCFMLGVATISSGCGKQVSSEKSNVVDMLESDDSKVKDTFPDTYNAESESGKVK FNCTLELPENMNTRTIQKTTVEGVHSYDKDKAYSLLAEGKEISNKEQYDGDNGEIISYTF SDGASLYLDYNITWTSATSSLYAYLGVQQSDYIDLFSSDSVSLDKDKYISEIKKDMNELG YDTENLSFQAIPLSVDAMKKLRDQELNNGLLEKGKTNEPTSEDEAYFIYAYQENTGIPVF HELMSVAKQMSDDSPDNAPVQAIYSARGLESLTIDYIYNFKNEQNTVTLKPFDEIASVVE EKYDNILNDVNYEVTRAKLYERVYTGEDQKYAEEPIWYFEVMENGSNKTVMLVNAETGKE INLPS >gi|226332996|gb|ACII01000023.1| GENE 41 36194 - 36322 63 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNSCTGKVVDWILNVSKGEIDYGKKSFAVFYVRSRNHKFRMR >gi|226332996|gb|ACII01000023.1| GENE 42 36326 - 36718 253 130 aa, chain - ## HITS:1 COG:no KEGG:DSY3434 NR:ns ## KEGG: DSY3434 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 18 120 183 283 396 100 49.0 2e-20 MDDTIASKTKPSSQALHPIEDAYFLCDCGYVSEKMINTFVQKGFHTIGALKTNRLLYPSG MKKKLSELAAGLSVTEKVFDLVTVKKRKYYVYRYEGNLNGIENAVVLLSYPEKAFGNPKA PEGLAKFRRK >gi|226332996|gb|ACII01000023.1| GENE 43 37185 - 37892 437 235 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578227|ref|ZP_04855499.1| ## NR: gi|253578227|ref|ZP_04855499.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 26 235 1 210 210 401 100.0 1e-110 MKRKKIDSFTLFVILLLALILTLLGMLLAGMKEEKRQFTVLLPLGDLAYDEDFLQKAKKI KGIKEIWPVIEVPVVIKIEDYTETTTFSGIDMNAFGKNPTQNELGKMPLLLLGNGSLRDM KDYNNHAISKKQQEKFLEMGENLNIFYSLDEKEKDTSKATDDLTTLSGNSARGPQTSYMP CKATVVTEGNEIYIPISQAQDLCREIGEPSEISKVYLKINGKNNLENAKKILSGI >gi|226332996|gb|ACII01000023.1| GENE 44 37889 - 39187 665 432 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0964 NR:ns ## KEGG: EUBREC_0964 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 137 407 215 493 506 104 24.0 5e-21 MRNLFVKTALVLCTTIIFLPATVFATEPQNTEITSAEEGFTDSDPSVSYENDNSLTLDEK EDDSTSTEPPAEDEKTPDKADTDKENPDPDIFTSGDDFGDGSKSNPFHVVIDSDNPEDGD APQITDPVIDNGGYSGGGGGGNSTSDSSSEKILHKPQVLLEDSNLSGGRLEAGSTTEMSL TFRNKSCSQNVFGLKISLSTENKGIEFEKNSFYVQVLAPGEAITLEQMITITENCDPGQA VITFSLDYEDSKATATTGTELLSFQIIQPVRVKLEASDIPSVLYTMDTVEIPVKAMNLGR DKIYNASVSLKADELSSNDTAFLGTIEAGDLSQGSLRIYVKGKTGKCAGQLILTYEDADG KSYEEEIKFDSEIKETQIKSLKVEDDQEKTNNWWYSIAAVVAVLLISIILILVRKLHKKN ILLEEARKAASH >gi|226332996|gb|ACII01000023.1| GENE 45 39190 - 39906 224 238 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 44 238 28 226 318 90 31 1e-17 MEDWKKSEGTFNLACDLERASGMMQNTCVKIEHVTKRFGEDMVLQEVNLSLKTGNVYGIV GNNGSGKTVLMKCICGFLPVTTGTIFVFGKKIGHDVDFPESLGVIIETPGFLTQYTGFKN LEILADMNGRISKADIRIVLKRVGLNPDMKKPVGKYSLGMRQRLGIAQAIMEDPLFLILD EPFNGLDKHGVEEIRELLLDLKAAGKTILLASHNEEDIRILCDHVYEMDGGILRERTA >gi|226332996|gb|ACII01000023.1| GENE 46 39884 - 40612 413 242 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0961 NR:ns ## KEGG: EUBREC_0961 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 41 242 64 261 261 97 32.0 5e-19 MLGETKRMFFNKNFLAAWLIACISIAIGQTYPSLKKALTCGTFISILEGSLKSQAVSFAI PVAAVLPWSDSILQEYKSGFLKEILPRTTRRQYIEGKVFSIMVSGFMVWTLSILTVLFIN FIVFYPLEIKGAIQKDALLDLLMKAFRTGLIGSIISTFGGSCGILWDSAYMTYGIPFVSF YFGIILHDRYFKNQIWFYPVEWILADENWGPDKIGVWLFLLLFLLVMLGILGGVLYGRLE EI >gi|226332996|gb|ACII01000023.1| 
GENE 47 40631 - 41521 349 296 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0962 NR:ns ## KEGG: EUBREC_0962 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 293 1 292 296 233 42.0 5e-60 MRQVFGIVRYNFFGFLRNPKVIFIFLLEFVLSFLLTGRIMVVMETYDTPVQAIEPFIWTF GDGIAVLSSSLLLLLMFSDLPKMTPITPYQLTRTTKRKWLLGQFIYVILVTVLYTVLMLL FTAVLCMKKSYPGNLWSETAAMLGYSELGKNLQVPSTVRVMESISPYGCMLQVFFLLFCY SLTLSFVILVGNLYKGKTKGMIFGLLYSVFVFLLDPKVLAAILHKEKYEMYQVNVLICWI SPLNQAVYGKHNLGYDPLLPKISHSYMFFLGILVLLAVFAINRMKKYTFEFLGEKT >gi|226332996|gb|ACII01000023.1| GENE 48 41523 - 43007 966 494 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0960 NR:ns ## KEGG: EUBREC_0960 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 494 1 492 492 106 25.0 3e-21 MRKNIKKFAGIVLCIGMFTGCAQTPDSSLVKPKGSKAVDAYTEADDVSGSKGEASESKND DGSESKTTIRNLIDAPQNYKSHVEDDSAKLVVNTDASVEIPEVEKISAISVTAAEVTQEL LDRITNAFFSEAKLYTMDSYYVKTKDEIKKTLDELKEDVANGNLDPYNYGTDDDGNYVYD IYEDIETWEQEYETAPEKKTLTEAKPVAGNFFDCIAQMPDDSQFYYKISSDGSDTLSVKI KKAANKGGEKIPEDAMWCDYGYSEEEEKPTEESIGLSLDEAKKLVKEKVEKMGITDLQFS NWNYAVCKSFEGDNSSGNFGNGYRIDYARTINGVPVTQTVADGGALEDMDSTMETWSYES LCFYVDKDGIESMTYSNPYTIGKIKTENLNLLSFSEVMKIYEKMMVVTNADNMQYENSRV YNIDRIVLGYARIYEPSTDAHTGVLIPVWDFFGSMTSESEYNGETESNTSKDPNESFLTI NAVDGSIIDRDLGY >gi|226332996|gb|ACII01000023.1| GENE 49 43107 - 44318 725 403 aa, chain - ## HITS:1 COG:CAC1507 KEGG:ns NR:ns ## COG: CAC1507 COG0642 # Protein_GI_number: 15894785 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 128 401 198 471 473 172 37.0 8e-43 MRPQIVTYAFRQVFHDSAVLYLDGKMTYNGTVYEFDVKKLRELCASKANMVDHGSNSDGH GPMISQINGRTLLLFYYGNVNMGYQIVTYKDITDVIEKSHLLFFQTGGFTLSLIVVMGII LYLSLRKITAPLTSLREATLLVSEGIYDFKVPSEGNTELAQVGATFNFMTGKIKEQIESL SNINRTQKQLIGSLAHELKTPMTAIIGYADTLLTVRLTPERQEKALIYIENECRRLSRLS MKMLELTGLYEASEDSFNPVEIQVDNFLKEVKELIDCRLQEKNISLDVFCEPKELVKKFD QDLMISVVTNLIDNAVKASRKESKIVLEATPDHLMVQDFGKGIPKEDLEMVTEPFYMVDK 
SRSRAKGSVGLGLSLCHKIIELHDFQLKIESNQGKGTTVSVFW >gi|226332996|gb|ACII01000023.1| GENE 50 44525 - 45190 755 221 aa, chain - ## HITS:1 COG:CAC1506 KEGG:ns NR:ns ## COG: CAC1506 COG0745 # Protein_GI_number: 15894784 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 5 218 4 216 217 163 41.0 2e-40 MKARILVIEDEMSINDLLCMNLEIAGYETVGYLDGNEAYEAIIQDHAFQCALVDIMLPGR DGYSLLPKLKQYEIPVIFLTAKGDIASKIKGLKDGAEDYIVKPFEMLEVMVRLEKVLERN GTVQEQIQIGDVIIDTASRTVTKLGMEIYLKPMEYNCLMVFAKNPNKALTRNQILQELWN EEFCGETRTVDAHVGRIRKKLGWQDKIKTIPRIGYRLEVDL >gi|226332996|gb|ACII01000023.1| GENE 51 45528 - 45869 295 113 aa, chain - ## HITS:1 COG:alr3761_6 KEGG:ns NR:ns ## COG: alr3761_6 COG0784 # Protein_GI_number: 17231253 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Nostoc sp. PCC 7120 # 18 110 196 288 315 75 44.0 3e-14 MEIAEFLLTNAGAVIIRASNGKKAVEAFAKSGVGEVNVILMDIMMPVMDGMEATREIRTM NRSDASAVVIIAMTANAFTEDKTKALKPGMDDYLTKPLESEKLIRALSRYKHN >gi|226332996|gb|ACII01000023.1| GENE 52 46040 - 47368 598 442 aa, chain - ## HITS:1 COG:no KEGG:AFE_1328 NR:ns ## KEGG: AFE_1328 # Name: not_defined # Def: transposase, IS605 OrfB family protein, putative # Organism: A.ferrooxidans_ATCC23270 # Pathway: not_defined # 18 399 2 391 439 171 31.0 5e-41 MRIVSSYGVELRKQNIPICQTLDIYRSAVSCLIGIYAQTWDELAVITEPKKRFNTAEHLV HTTKKNQARFDFDLRFPKMPSYLRRAAIQHALGSVSSYKTRLGLWKKTDQKGGRPKLVCD NHAIPVFYRDVMYREDTEEKDRVCLKLYDGHDWKWFRVSLSHTDMEYLRKNWAGKKAAAP VLEKRHRKYFLRFSYTEEVPLTKTPAKERIICSVDLGLNTDAVCTIMRSDGTVLGRRFID FPSEKDRMYSVLGRISRFQREHGSAQVKSRWAYAKRLNTELGRKIAGAVTGYAEENHADV IVFEYLETKGKISGRKKQKLHLWRKRDIQKRCEHQAHRRGIRISRICAWNTSRLAYDGSG TVVRDPGNHSLCTFQNGKRYNCDLSASYNIGARYFIRELLKPLPVTERFLLEAKVPSVKR RISCVYADLRELFSEMEILRVA >gi|226332996|gb|ACII01000023.1| GENE 53 47410 - 47829 266 139 aa, chain - ## HITS:1 COG:DR0667 KEGG:ns NR:ns ## COG: DR0667 COG1943 
# Protein_GI_number: 15805694 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Deinococcus radiodurans # 8 139 9 140 140 155 50.0 2e-38 MFNINNNELSYGRGYVYSLQYHLVWCTKYRKKVLKDGIDTECKEMLCDLAEEYKFQILAM EVMPDHIHLLVDCRPQFYISDMIKIMKGNIARQMFLAHPELKKELWGGHLWNPSYCAVTV SDRSREQVCSYIEGQKERQ >gi|226332996|gb|ACII01000023.1| GENE 54 47967 - 48845 439 292 aa, chain - ## HITS:1 COG:PA0928_1 KEGG:ns NR:ns ## COG: PA0928_1 COG0642 # Protein_GI_number: 15596125 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 46 289 257 491 509 137 34.0 2e-32 MTKGRDYYIYMFIDEQMAFASVFKNLLVVLLLYIFVAASVFAFRRRSNAQLLKIQMMQEE ACKQKLLQEAQRADMANNAKTEFLQRMSHDIRTPINGIRGMIEVADYYKDDLKKQDECRR KIWDASGYLLELINEVLDMSKLESEEIVLENKDFDINRLLHEIREVIEKQVYERGISLSE KEYNLKYRYLNGSPVHLKRILMNIISNAVKYNKEKGEILLSYYETQVDNRHILFEFQCKD TGVGMSKEFQNHIFEPFTQETGGARSVYGGTGLGMPITKKLIEKMGGTIKFE >gi|226332996|gb|ACII01000023.1| GENE 55 48942 - 49343 386 133 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20437 NR:ns ## KEGG: EUBELI_20437 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 133 45 234 734 106 35.0 2e-22 MDETLKYVKEQCYIYNRYNEASQTKGLMRAIENAQQVNRNLTHTKDEVTKDKLKTYAASI VLEDGSQVDLAAYGRSDEKGIIITFYHTSAEYARNYNLTIQSLLSGYDTYSSGMIVVTDG KNVVASNDENLIG >gi|226332996|gb|ACII01000023.1| GENE 56 49453 - 49803 292 116 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20438 NR:ns ## KEGG: EUBELI_20438 # Name: not_defined # Def: polar amino acid transport system substrate-binding protein # Organism: E.eligens # Pathway: not_defined # 10 111 158 259 263 110 49.0 1e-23 MHVLYYIFYVKDVYSLENRKLLFTSLGKGYVDAIAAHETSVQQYIKDYGAKIRILDEAIL TTGIGVAFPENTDSELPEKLTEIFKEMRKDGSEEKILKKYLPETSGYLEVDKIENN >gi|226332996|gb|ACII01000023.1| GENE 57 50075 - 51352 505 425 aa, chain - ## HITS:1 COG:BS_yrkQ KEGG:ns NR:ns ## COG: BS_yrkQ COG0642 # Protein_GI_number: 16079695 # Func_class: T Signal 
transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 21 413 37 427 432 181 28.0 2e-45 MQGLLAGLLVLSVTYFGGKGLLQRFFFRTGYVYAEETERAEELQEYVTQNKLSASDYKML KKWGTERNIDDFTISRGKWLLFDISYNGKIMYGSREIPNLTWRMYHRISFKDGTADVYIY EGTADKYFNILMVFSVVLGVAVCIGIVVSGMYENVKYIQCLMKEVNIISRGNLQGNVTVQ GTDEIAQLASGLEHMRQTLVKKEQIEYDLKSAQEKLVLGMSHDLRTPLTGLMAYLEILKK QQKEGAVTQEYINKAFDRVLQIRDLSEQMFEYFFVNSRHHIELEPAEDIMCAFGDYLSEL CALLEYEGFKVNTDRLEWRQISVCINTDFIGRIMNNIISNIEKYGKHENEVCIQLCYGNE EVGISIQNSITRFDSEHQKTGIGLQNISIMMEQMDGRMEVNADDLTYCIVLYFPEKKEYH KISKV >gi|226332996|gb|ACII01000023.1| GENE 58 51396 - 52094 403 232 aa, chain - ## HITS:1 COG:BS_yrkP KEGG:ns NR:ns ## COG: BS_yrkP COG0745 # Protein_GI_number: 16079696 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 1 232 1 229 231 244 55.0 9e-65 MDNHILVVEDDAAIREGVRILLEGEGYIVQEAEDGYQCLKKVSDDIDLVILDIMMPGISG IKTCEEIRKISHVPVLFLTAKSQESDKLLGLMAGGDDYLVKPFSYAELLARVKALLRRYH VYSQTLQGEKTKEHWIEYGKVKVNTQCNEVFRNEKEISLREMEYRILLLMMENPKKVFSV KNLYESLWEEPFLYSSGNTVMVHIRRLRMKIEDDPQKPKMIQTVWGKGYRLG >gi|226332996|gb|ACII01000023.1| GENE 59 52269 - 53879 590 536 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2038 NR:ns ## KEGG: EUBREC_2038 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 317 532 44 259 263 175 41.0 4e-42 MKIKCTFHIAIAFILIQLTDEFSIAFAYGICEKNPYVRILYVTFAILSLLTMIYLYTKYI IRIQMQDLFLGMPLPKKKWCITAVLMPVLICLFYIFFTKGTFEKGNLTYKEIIDVVTISI LSSGIKAAVTEEIVFRGMILRLLQNTFGKKCAVFVSSFLFAVYHFDGIDTSDIKKVIMMI ISVEIAGIALALVTEQTGSIWSSVMIHCLYNILSGESQIFHIDVAQNFPAIWMYTVNSKN MLLTGINGSVGFTASIPTMAGFTGIILMVIYDEKKKKTYGKSCSKIILLILILAAWIVVT VRAKKTEEGIILTDAYKKQIMESAEWKKIFLHTENYPDILLEDLKRNPEMLEFVEGYNDV HKKSSEGLTFEEQKKKVPLFIQWDKRWGYEPYGTSDIGISGCGPTCMAMVIYSLTRNTEA IPPVLAQKSMNEGYYVDGIGTSWKFMREAALDYGVIASQFDMLGEQEMADRLKDGNLIIC 
AMGPGDFTNSGHFIVIYDYSRKKFSVNDPFSYTNSSKKWEYTTLISQCQQIWVYAA >gi|226332996|gb|ACII01000023.1| GENE 60 54097 - 55632 897 511 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578244|ref|ZP_04855516.1| ## NR: gi|253578244|ref|ZP_04855516.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 511 1 511 511 965 100.0 0 MVVTEPESTAENVENITSYPITPEVVDWNQSMLTVKEAYFDGGGLSFVGEPSNEAKNYMI TCCDHATVNGKDALASLNKIEGEDLYYGHITLYDDALIEKILNEDSEVELSMTIQAYPVY DGKIYYTWKDTEAYEKIFKTGSFTDSEGRVSYALPAEDVYRGYTPQTLTLTLPLKDEVRD IIEKYRKEGKLQAADTGVEPSSGESTQNTTDLESSSNTTSVNEKYDPEKTMLQVKENHVT ASLPGKSGTLTIEADITGNTDDVSKGVLKKCSISEKDICSWFGGKDGWEKSVTDETVWEN KEKGLQLFISDGMIECDGLSEKDLGFLGENTEDEKLNIINQLFSKADLSGTSVRKPISDM TEGYDYYDTEVMLNGIPAGGLSQYRYQGSTGFGLKPEDCYFKIPIPLQVKEKETVTMLPM EEIMKSVEQYVKEGKIGFFTEEDTTEKTEPITIPVTKIRLKYYIDETADGIVYRPVWSFC CPYQWKDSPEEQELFYIDGETGALIRDAFGW
Prediction of potential genes in microbial genomes
Time: Sat May 28 19:20:08 2011
Seq name: gi|226332995|gb|ACII01000024.1| Ruminococcus sp. 5_1_39B_FAA cont1.24, whole genome shotgun sequence
Length of sequence - 9115 bp
Number of predicted genes - 11, with homology - 10
Number of transcription units - 8, operones - 3 average op.length - 2.0
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 133 - 624 402 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
- Prom 646 - 705 8.5
- Term 1030 - 1087 9.1
2 2 Op 1 . - CDS 1088 - 1717 230 ## Amuc_1661 hypothetical protein
3 2 Op 2 . - CDS 1717 - 1959 219 ## gi|253578247|ref|ZP_04855519.1| predicted protein
- Prom 2048 - 2107 5.6
4 3 Tu 1 . - CDS 2149 - 2589 283 ## gi|253578248|ref|ZP_04855520.1| conserved hypothetical protein
- Prom 2682 - 2741 8.5
- Term 2728 - 2777 5.4
5 4 Tu 1 . - CDS 2798 - 4201 936 ## COG1376 Uncharacterized protein conserved in bacteria
- Prom 4265 - 4324 1.9
6 5 Op 1 . - CDS 4364 - 5791 1202 ## CPR_0524 hypothetical protein
7 5 Op 2 . - CDS 5791 - 6309 508 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
- Prom 6345 - 6404 4.4
- Term 6510 - 6559 12.1
8 6 Tu 1 . - CDS 6585 - 6923 129 ## Dfer_1856 glycoside hydrolase family protein
+ Prom 6900 - 6959 1.6
9 7 Tu 1 . + CDS 7018 - 7089 79 ##
+ Term 7101 - 7138 1.1
- Term 7089 - 7124 2.3
10 8 Op 1 . - CDS 7350 - 8456 985 ## gi|153813339|ref|ZP_01966007.1| hypothetical protein RUMOBE_03758
11 8 Op 2 . - CDS 8450 - 8992 314 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
- Prom 9040 - 9099 4.2
Predicted protein(s)
>gi|226332995|gb|ACII01000024.1| GENE 1 133 - 624 402 163 aa, chain - ## HITS:1 COG:BH0620 KEGG:ns NR:ns ## COG: BH0620 COG1595 # Protein_GI_number: 15613183 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 1 163 1 167 177 72 25.0 4e-13 MDALITEAKNGDKEAFVRLIRMNKQSLYKTAWIYLRDEQDIADALQNTILSCYENIQRLR EPKYFKTWLMRILINECKDILRQKNHAAELDDFAEGVHCEELNLCEWKQLLLCLDKESRK IVELYYFDEFSVKEISALLEMNSNTVSTKLSRARAKLKKVIRR >gi|226332995|gb|ACII01000024.1| GENE 2 1088 - 1717 230 209 aa, chain - ## HITS:1 COG:no KEGG:Amuc_1661 NR:ns ## KEGG: Amuc_1661 # Name: not_defined # Def: hypothetical protein # Organism: A.muciniphila # Pathway: not_defined # 10 202 1 192 215 118 36.0 1e-25 MSESTNSTPVFKEEYQKQIADIYKQYQQTVKPYVAQLEVMENEFPIEILNEVRAIMSHIA KCYEITNEELIQKNIGKAKSHMKRCVLDCYKYLCLAYSDYYENFVHKYRFTDLTVVDNGE FWSDLCETVSKAKKQLILAKQKEGMVEDVEDAYNEFEAAYNQYHRVYEIIENSYRHLIKL KRKTFWKVAISVLAWIIPLVLSIVLFFLG >gi|226332995|gb|ACII01000024.1| GENE 3 1717 - 1959 219 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578247|ref|ZP_04855519.1| ## NR: gi|253578247|ref|ZP_04855519.1| predicted protein [Ruminococcus sp.
5_1_39B_FAA] # 1 80 5 84 84 119 100.0 7e-26 MYKINSSSQKILEKMLGKKYREIVDMDYDDEISYIQKKNNRKLVFSNKVDHRKVGRGNPL LSRRRIKTMDKVNAGLSRIK >gi|226332995|gb|ACII01000024.1| GENE 4 2149 - 2589 283 146 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578248|ref|ZP_04855520.1| ## NR: gi|253578248|ref|ZP_04855520.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 146 1 146 146 277 100.0 1e-73 MVVKKKVTIVLGIISIVAISTMIMFSIYKSSEAYRKANAAIQWDCSVTCAEESTPDSYVI TYSDAKILSKTGVLTVQNRNDFNIIVHLLCEGKQELVSGSIPAGGCYSFLNVTDKEHTVG IHADVGENTDIKVFVYDGKDTEPYTR >gi|226332995|gb|ACII01000024.1| GENE 5 2798 - 4201 936 467 aa, chain - ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 18 467 30 466 466 297 32.0 2e-80 MKIVVMLIVMILVMSCCVYAGISYYYSYHFFLGTTINGIDSSNKTAYEVEQEIAGKKDNY VIQVSARMQEPQTITGKDIDYQYVSSGEILQLLKTQKPWEWIRGFFETKNYMVQEETVFS REKLEEQVSSLNCAKKENQIAPENAYVSFSNSEFTIVPETEGSELNAKEAYQMISRAIDN EAADVDLGSNPKAYKEADVTRDSSELQNMVNMYNSLAKVNITYTFGDETVTLDGNTIKNW LQFDEKGQLLPDDGAFRQHVVDYVAQLAADHDTVGTERQFETTSGRIVYVYGSAYGWKID QDKEAAQLMQEIQSGTQTTREPVYSMRANAHGINDLGDTYIEVDLTEQYMWYYQNGNIIF QSEIVSGLPGDPDRKTPPGIFTLNSKSSPSVLRGEMTANGTYSYEQPVTYWMPFNGGIGF HDADWQPYFGGDRYLTGGSHGCINLPPENAGQLYSLIQYDVPIICFY >gi|226332995|gb|ACII01000024.1| GENE 6 4364 - 5791 1202 475 aa, chain - ## HITS:1 COG:no KEGG:CPR_0524 NR:ns ## KEGG: CPR_0524 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 7 399 2 301 362 69 22.0 2e-10 MMTLNYNEKDFQKKLQKDIEMPEIVHEHINQAYRLIENNAVLQKKASKDPYHWMKSGGRI AGGMAAVLAVGFVFCAINPVMAKNIPVVGGLFEILQDNVSFFGDFSDHATTLEAVDGTKT DSSETDGAKTDHTNNDAIYTKTADGLTITCSEVYANSQAIYVTMQFKSDKPFPETETGAE SGTPVIDLDMTGGVDFNSEASPVIDGQVEGKFLDDNTYACIFRYDLAEAAKDYTEYNEKY NEMTQQVMDEMGITQADLDDQTDEGYALLEEFINKLSERGGEYQKYIKDIDIPDTFNLHL DITKVRGLEANYQWSEEDEKKYGRDAGYYKYEGDWSFDIPVTVDDSQTEVLELNDTNDAG 
IGLKSVIRTPYELTVNELYKEGSNSDCFMVALDANGNTLPYNESTGNCNNFAIQDRDIST VDIYFLDYVQYMDELKGQQNFDNPTKEDGQKWKKLLEENAKYHKTLHFDNDNVKN >gi|226332995|gb|ACII01000024.1| GENE 7 5791 - 6309 508 172 aa, chain - ## HITS:1 COG:BH0263 KEGG:ns NR:ns ## COG: BH0263 COG1595 # Protein_GI_number: 15612826 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 1 159 4 180 187 89 32.0 3e-18 MNEILIRKAKKGDKDAFCRLIDENVQSMYKVAAAYLKNDEDVADAIQDTILSCYENLKSL KQNRYFKTWMIRILINKCKDMIQKKKLVTYTDQMPETPFHEEKYAAMEWIQALEPLDSKY RLVVLLYYMEGFGIREISDILDMKESTVKSRLYRGRKQIAEMYGYKVKEGRA >gi|226332995|gb|ACII01000024.1| GENE 8 6585 - 6923 129 112 aa, chain - ## HITS:1 COG:no KEGG:Dfer_1856 NR:ns ## KEGG: Dfer_1856 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: D.fermentans # Pathway: not_defined # 2 94 679 772 781 124 58.0 1e-27 MGGGIYPNMLCAHPPFQIDGNFGFAAAVAEMLIQSRKGYILLLPALPDEWKDGKVQGMKA QGDITVDFEWREGRIHRVRLCSSHEQKVTLECNGISKTVFLKPDRTENMIFD >gi|226332995|gb|ACII01000024.1| GENE 9 7018 - 7089 79 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRFKAYIEYVMNFYANYYRFYR >gi|226332995|gb|ACII01000024.1| GENE 10 7350 - 8456 985 368 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153813339|ref|ZP_01966007.1| ## NR: gi|153813339|ref|ZP_01966007.1| hypothetical protein RUMOBE_03758 [Ruminococcus obeum ATCC 29174] # 1 231 1 232 232 404 90.0 1e-111 MLENKWNNIAPEVPQDFHEKFEETLRHIEQADAVDKKYKRKRFSGRLLIAAAVICTGMGV TVAAKEFFKWNDHLLKRLEPSEEQQETMQESNYIQNIDQSVTQKGVTVTLTDSIQDQGFL YAFFEVKTEDSISMTDHTSFEEITHFKIDGKEVYAVDDNRFGSFNTGVGQDMLLKTDQNS THLKYFNACISYDGDYDLSNKTVEITLKNLTEEGDYDTTVITDGTWDFKWTLGAVKPPTT LEVNRKCDFGGYEITVKKMEVTPLLWSLYLDYDEAMKVYEDEKNKFEYTGTDYGMDLYAR TSIDQVRYKDGTVLTLDRTMGGIAGGGEKQDKENGVMIIRNSFPQLVDVDNLQAVHFGNI DQWLEVRE >gi|226332995|gb|ACII01000024.1| GENE 11 8450 - 8992 314 180 aa, chain - ## HITS:1 COG:mll8140 KEGG:ns NR:ns ## COG: mll8140 COG1595 # Protein_GI_number: 13476734 # Func_class: K Transcription # Function: 
DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 17 155 26 178 208 77 30.0 1e-14 MTKDVFIKEVRDAEAMLYHISKSILKNDSDCGDAVQETILKAYEKLPTLKKEKYFRTWIT KILINECNGILRKRKNVIPYEEYMDNMRLTEEDRYSHLYMAIMELPEDLRILVTLYYLEG FSQKEISEALDIPEGTIKSRLSRAREFLKAQLSDEEEKRPMPGKNSKLSKRITGKGKMSC Prediction of potential genes in microbial genomes Time: Sat May 28 19:20:58 2011 Seq name: gi|226332994|gb|ACII01000025.1| Ruminococcus sp. 5_1_39B_FAA cont1.25, whole genome shotgun sequence Length of sequence - 30750 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 12, operones - 8 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 5 - 436 280 ## gi|253578255|ref|ZP_04855527.1| predicted protein - Term 477 - 515 1.2 2 2 Tu 1 . - CDS 542 - 721 117 ## gi|253578256|ref|ZP_04855528.1| predicted protein - Prom 765 - 824 3.2 - Term 778 - 809 -0.1 3 3 Op 1 . - CDS 830 - 1372 424 ## Shel_16390 hypothetical protein 4 3 Op 2 . - CDS 1388 - 1915 330 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 - Prom 1997 - 2056 4.8 - Term 2031 - 2069 6.4 5 4 Op 1 38/0.000 - CDS 2106 - 3044 501 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts 6 4 Op 2 . - CDS 3145 - 3888 1064 ## PROTEIN SUPPORTED gi|240145469|ref|ZP_04744070.1| ribosomal protein S2 - Prom 4029 - 4088 5.8 - Term 4039 - 4069 -1.0 7 5 Op 1 . - CDS 4215 - 4982 835 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 8 5 Op 2 . - CDS 4979 - 7036 1500 ## BMD_2227 hypothetical protein 9 5 Op 3 8/0.000 - CDS 7011 - 7913 754 ## COG1131 ABC-type multidrug transport system, ATPase component 10 5 Op 4 . - CDS 7910 - 8284 465 ## COG1725 Predicted transcriptional regulators - Prom 8366 - 8425 8.6 - Term 8564 - 8616 2.3 11 6 Tu 1 . 
- CDS 8635 - 9285 524 ## Cthe_2176 abortive infection protein - Prom 9318 - 9377 4.0 12 7 Op 1 . - CDS 9474 - 10121 768 ## COG0546 Predicted phosphatases 13 7 Op 2 . - CDS 10188 - 11216 389 ## gi|253578268|ref|ZP_04855540.1| predicted protein 14 7 Op 3 . - CDS 11200 - 12456 689 ## COG0515 Serine/threonine protein kinase - Prom 12566 - 12625 8.2 - Term 12649 - 12687 4.6 15 8 Op 1 . - CDS 12729 - 14495 2087 ## COG0018 Arginyl-tRNA synthetase 16 8 Op 2 . - CDS 14520 - 15392 1044 ## COG1307 Uncharacterized protein conserved in bacteria - Prom 15466 - 15525 6.7 - Term 15519 - 15556 4.6 17 9 Op 1 . - CDS 15601 - 17217 2069 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 18 9 Op 2 . - CDS 17272 - 17529 292 ## STH955 PTS system phosphocarrier protein HPr - Prom 17619 - 17678 9.7 - Term 17750 - 17822 13.4 19 10 Op 1 . - CDS 17827 - 19086 419 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 20 10 Op 2 . - CDS 19106 - 21127 1758 ## EUBREC_2392 hypothetical protein 21 10 Op 3 24/0.000 - CDS 21133 - 21804 221 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 22 10 Op 4 . - CDS 21801 - 23279 1750 ## COG0845 Membrane-fusion protein - Prom 23316 - 23375 7.1 + Prom 23319 - 23378 9.1 23 11 Tu 1 . + CDS 23563 - 24774 756 ## gi|253578278|ref|ZP_04855550.1| conserved hypothetical protein + Term 24866 - 24926 5.4 - Term 24853 - 24914 5.6 24 12 Op 1 . - CDS 24942 - 25700 629 ## COG0730 Predicted permeases 25 12 Op 2 13/0.000 - CDS 25703 - 27499 2348 ## COG0173 Aspartyl-tRNA synthetase - Prom 27521 - 27580 3.4 26 12 Op 3 . - CDS 27596 - 28852 1578 ## COG0124 Histidyl-tRNA synthetase 27 12 Op 4 . - CDS 28785 - 29447 433 ## COG2206 HD-GYP domain 28 12 Op 5 . 
- CDS 29465 - 30748 1180 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases Predicted protein(s) >gi|226332994|gb|ACII01000025.1| GENE 1 5 - 436 280 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578255|ref|ZP_04855527.1| ## NR: gi|253578255|ref|ZP_04855527.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 143 1 143 143 279 100.0 3e-74 MAGYGDHRIGEVTNLNGNKIVITESIVSYSLGINAINFTYEYVNGKFVPTSRYGSYKEIY SADGSSRHFTVSSDLPAYTRPGATAVNTTLKTGSLTKIIKCALISGKMYIQLECDGEIYW IKALENPPISDNERQFMEVRYAG >gi|226332994|gb|ACII01000025.1| GENE 2 542 - 721 117 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578256|ref|ZP_04855528.1| ## NR: gi|253578256|ref|ZP_04855528.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 59 29 87 87 106 98.0 4e-22 MENIGYLGFYFYEIVVPQAKETALSMAADGMKAEKIAHYLKVSAAMVQNWIDESMSVVQ >gi|226332994|gb|ACII01000025.1| GENE 3 830 - 1372 424 180 aa, chain - ## HITS:1 COG:no KEGG:Shel_16390 NR:ns ## KEGG: Shel_16390 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 1 178 5 181 188 171 50.0 1e-41 MLGNSFTTANHMPDMLDELTGAEIVQHTRGGERLADQLNPKTKMGKRTQEALQNEKWDFV ILQEMSNGPITSKASFLENAEKLCEKIRENGAKPVFYATWAYQRGGKKLETFGMDYDEMY QKLYEAYHLAADRNHTLIADVGKKFYELSDKVNLYADDGCHPNEKGSQIAAEIIAEVLMN >gi|226332994|gb|ACII01000025.1| GENE 4 1388 - 1915 330 175 aa, chain - ## HITS:1 COG:RSc0334 KEGG:ns NR:ns ## COG: RSc0334 COG2110 # Protein_GI_number: 17545053 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Ralstonia solanacearum # 4 169 8 168 171 194 57.0 9e-50 MELLKTIRGDITKITDVQAIVNAANNSLLGGGGVDGAIHRAAGPELLEECRTLHGCETGK AKITKAYKLPCEYVIHTVGPIWNGGNQNEKELLASCYLSSMQLALEHKIRKIAFPSISTG VYSFPVGLATKIAVNTVAGFLKEHPDDFDLVEWVLFDEHTELVYAIEVNAHNKGI >gi|226332994|gb|ACII01000025.1| GENE 5 2106 - 3044 501 312 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts 
[Haemophilus influenzae R2866] # 3 308 4 277 283 197 43 7e-50 MAITAAQVKELREMTGAGMMDCKKALTATEGDMDKAVEFLREKGLATAQKKASRVAAEGL CKTLVSEDGKKAVVVEVNAETDFVAKNEKFQNYVADVAAQAMNTTAADIDAFLAEAWALD TTKTVKEALAAQIAVIGENMNIRRFAQVTEENGFVASYTHMGGKIGVLVDVETDVVNDAV KEMAKNVAMQIAALKPQYTSDSEVSAEYIEHEKEILMAQIQNDPKESQKPAKVIEGMITG RIKKELKEICLLDQTYVKAEDGKQSVAKYVEQVAKENGAKITVKGFVRYETGDGIEKKEE NFAEEVAKQMAQ >gi|226332994|gb|ACII01000025.1| GENE 6 3145 - 3888 1064 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145469|ref|ZP_04744070.1| ribosomal protein S2 [Roseburia intestinalis L1-82] # 1 247 1 242 247 414 83 1e-115 MSVISMKQLLEAGVHFGHQTRRWNPKMAPYIYTERNGIYIIDLQQSVGMVDDAYNAIADI VADGGSILFVGTKKQAQDAIKTEAERCGQFYVNERWLGGMLTNFKTIQSRIARLKEIEAM EADGTFDVLPKKEVIELKKELAKLQKNIGGIKEMKRLPDAIFIVDPKKERICVQEAHTLG IPLIGICDTNCDPEELDYVIPGNDDAIRAVKLIVSKMADAVIEAKQGTADADGEIEAESE EFAATEE >gi|226332994|gb|ACII01000025.1| GENE 7 4215 - 4982 835 255 aa, chain - ## HITS:1 COG:L109527 KEGG:ns NR:ns ## COG: L109527 COG1187 # Protein_GI_number: 15674222 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Lactococcus lactis # 3 250 30 273 273 205 49.0 6e-53 MKKAVRLDKFLADAGAGTRSEVKKFIQKGKVQVNGVPAKKSEIKVSEEDEVVLDGNRISQ APEFVYYLLHKPAGYVSATEDKRDKTVMELVASDRKDLFPVGRLDKDTEGLLLITDDGAL AHELLSPKKHVDKTYYAVTDGCMTKEDVQRFADGLEIGEKNPTMPAKLKILSTRKVEETE LEQYPSGWSSEIQLTIKEGKFHQVKRMTEAVGKKVVYLKRISMGVLTLPDDLKKGECRQL TAEEEKRLKESVVIK >gi|226332994|gb|ACII01000025.1| GENE 8 4979 - 7036 1500 685 aa, chain - ## HITS:1 COG:no KEGG:BMD_2227 NR:ns ## KEGG: BMD_2227 # Name: not_defined # Def: hypothetical protein # Organism: B.megaterium_DSM319 # Pathway: not_defined # 65 624 66 627 669 87 20.0 1e-15 MKSKTFSGRSLRSSLKGSTWILVLLLLGFMVAFPVAELMLIGNQTDEIHRMTFAMICSYL IVPGFLVTMLAAVVNALNEFWYLFSRDKIDFYHSLPVTRSRSFWEKAIRGLVLYLVPYVI MELITMAIAVSKGHGSHLITAAGKMFLEHLLMYLLLYFGAVLALAIAGNILAGILSLCCV 
YLYGPVLGILLWVLEMMYFRTNMGLKEGMAEKISVFLSPVSISVALRTYSGQKNFWIIIV GGILLLIVLAVCAYLAYTKRPAEKTGKSFVYGFLEPILLFMVVIPAALAIGTMFALIGPE ENRTGWWIFGLVLGTVVFYGILQVIFAMDFRKMAAHKLQLLLLGICVAVSAWILHTDAIG YDTRIPTMAKTEGISLNLEWIGTESVNEPQMEVSSGSYKLDRLFYFMGGNYGRWTDAGMS DKIYEVLKEIASYQNSKECSGTEIGVQFKKKSGFDITRQYIVTAEQLGRLLEACYEQGTL KDNKYDILEKYRQKVSFITVDPLNELDDQYSVTLEKSDSQKLLDLLKQDIAEASPQELIG IPCGQMELYATSYADMDEHIAPESYAEVGRYIFPTFKRTLVFLKEKGYAFVMEKENLKQY DYSVTYNAEEMDVTDSEQKEELAQSLIRELECPAWLETEAGVSVKVALNSTESAGESLNG IEFAVLKAKEPEFIKKIVETGEEEE >gi|226332994|gb|ACII01000025.1| GENE 9 7011 - 7913 754 300 aa, chain - ## HITS:1 COG:lin2342 KEGG:ns NR:ns ## COG: lin2342 COG1131 # Protein_GI_number: 16801405 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Listeria innocua # 17 294 16 290 291 149 30.0 5e-36 MIEIRNCSKAFGEIHAIDHVLMEIGERQIFGLVGTNGAGKSTLLRMAAGVIKPDEGEILV DQEPVYENEKIKQQIFYISDGGYFPQGYTAADLKDFYREYYPDFMTDRYSKLMGQFHLDE KRRISSYSKGMKKQLLVIMGISAGTKYLLCDETFDGLDPVMRQAVKSLFVSEIMNRDFTP VIASHNMRELEDVCDHVGLLHKGGILLSRELEDMKFHIHKIQCVLPDQTKEETLLKELDV LKKEHQGSLLIFTARGTREEILQKVQTQNPIFCEVIPLTLEEIFISETEVAGYEVKNIFW >gi|226332994|gb|ACII01000025.1| GENE 10 7910 - 8284 465 124 aa, chain - ## HITS:1 COG:BS_ytrA KEGG:ns NR:ns ## COG: BS_ytrA COG1725 # Protein_GI_number: 16080098 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 1 117 1 119 130 79 34.0 1e-15 MIVLDYQDRRPIYEQVTDKFQILILNGVLPPGSQMPSVRQLATELSVNPNTIQRAYMELE KMGLIYPVKGRGNFVADSSQVQKINRESYRKEFTALIQKGRNTGFNREELEKMFAEIFEK EDES >gi|226332994|gb|ACII01000025.1| GENE 11 8635 - 9285 524 216 aa, chain - ## HITS:1 COG:no KEGG:Cthe_2176 NR:ns ## KEGG: Cthe_2176 # Name: not_defined # Def: abortive infection protein # Organism: C.thermocellum # Pathway: not_defined # 64 175 142 252 292 73 38.0 4e-12 MIPALYFYRRDKVRRIAGGLIPAQKKVPLSLLEIILLLLAGAGFAQYGNFLMAILQSFIN SSAYQESMTRITEGKSLLMMIFWMGIIAPAAEEMIFRWLIYLRLRDWLKMPVAAVISGLI 
FGIYHGNIVQGIYASILGTAFAWILEMSGNIYSSMLLHMGANIWSLLISEYALDLYSMKY GVQILILIYLILLISVVYCLTHFEKMCSIRKKKRMI >gi|226332994|gb|ACII01000025.1| GENE 12 9474 - 10121 768 215 aa, chain - ## HITS:1 COG:PA0065 KEGG:ns NR:ns ## COG: PA0065 COG0546 # Protein_GI_number: 15595263 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Pseudomonas aeruginosa # 2 200 8 204 221 202 49.0 5e-52 MYKVILFDLDGTLTESGEGITKSVQYALERIGKPEPDLEKLKVFIGPPLMEMFMQYAQID EATAKQAVEIYRERYSVTGIFENAVYPGIENMLAQLEKKGYILAVASSKPEVYVRQILDH FGLTRYFTEIGGSELNGRRTNKTEVIEDVLKRLNMDKHRDQVIMVGDKEHDVYGARKAGL ECIAVSFGYGTEEELKQAKPLKIVDSAEGIVDFFA >gi|226332994|gb|ACII01000025.1| GENE 13 10188 - 11216 389 342 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578268|ref|ZP_04855540.1| ## NR: gi|253578268|ref|ZP_04855540.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 342 1 342 342 426 100.0 1e-117 MKKNHNRLWIVTFVCLCLCLTSTISAMQMGGFDIDVGNGENSSWQWHEEQQWNQSSWENT WENNRQTGDNTSETPAQNDQTQDNGYQYDSDTGNGTQDNIQQWNGVQQNNTDWNNSSQWN NNSQWDNNSQNNNVQNNNAQNNNVQSNNARNNNSAWNAQPQTDNNMQITSRTDNESAVKI SEKPTPTSIPVPTPKPAKNPKPTKTPKSTQPPTPKPSKKKNKKKQNKAKKTEKENKNTGN SKVFQNKENQGDTVEYTHVKDEAVKFHCTQNTDSFHTPRIQIISRGSVQILSFRLNKTEC PWHWEGDWIIPDTEDENVKEIELLVISQGGKLIKMNPRIFSS >gi|226332994|gb|ACII01000025.1| GENE 14 11200 - 12456 689 418 aa, chain - ## HITS:1 COG:CAC0404_1 KEGG:ns NR:ns ## COG: CAC0404_1 COG0515 # Protein_GI_number: 15893695 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Clostridium acetobutylicum # 1 258 18 304 306 155 35.0 2e-37 MNLSGTVLKGRYCILDPVGKGGGGKVYLARDLELGVLWAVKEIPVSDKSEARMLLKLSHP SLPKMIDYIEDNRKCYLVMEYVRGKSLGEYLRGGKHFSINEIVKYGMEISDVFSYFHGRK PPVYYGDLKPDNLMLGENGRLYLIDLGGAVNGYKYHHKVCTGTAGFAAPEQYEGKINAAT DIYTFGKTLSALCGKTDCLLFIRNMSLFWLIFRCCMKKPEMRYQNMKTVQKKLSRIQKRQ NQSKIKNMLVLAGSSIAFAVITVFLLFSSDLVSGSKTFDFYEELTDITDLYYQEEFQNGS 
KADREKICEQADTELRKLQKQCTEKEESRKILQILAVNSEYQENYENAGFYYEQMLLYDE TYRAGYGEYGMFLFRTGQKEAGQALWTEYKSKETMLDDTVSRNLRLWEKEMTKSEEKS >gi|226332994|gb|ACII01000025.1| GENE 15 12729 - 14495 2087 588 aa, chain - ## HITS:1 COG:CC3359 KEGG:ns NR:ns ## COG: CC3359 COG0018 # Protein_GI_number: 16127589 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Caulobacter vibrioides # 1 588 1 600 600 489 42.0 1e-138 MKKLMEQMAEELSAAFEKSGYDPSYGKVTVSNRPDLCEFQCNGAMAGAKAYKKAPFMIAD DVVAYLKDSQVFSMAEVVKPGFINLKVKGEFLADYLKEMAADEKLSVSAAKEPKTIIIDY GGPNVAKPLHVGHLRSAIIGESIKRIGRFVGHKVIGDVHLGDWGLQMGLIITELKHRKPE LVYFDDSYTGEYPEEAPFTISELEEIYPCASGKSKEDEAYRNEALEATHLLQQGKPGYMA LWNHIMNVSVTDLKRNYDKLNVSFDLWKKESDAQPYIPGMVEEMKEKGFAYVDQGALVVD VKEENDTKEIPPCMLLKSDGASLYTTTDLATIVERMKLFDPDEILYVVDKRQELHFIQVF RCARKTGLVKDDTKLSFLGFGTMNGKDGKPFKTREGGVMRLETLIKDINEEMFTKIVENR SVKDKDAKETAEIVGLSAIKYGDLSNQATKDYIFDIDRFTSFEGNTGPYILYTIVRIKSI LNRFAEEGGNLEAGTILDPVNDSQKNLMLQLTGFGATVENAFEEKAPHKICAYIYEVSNA FNSFYHETKILSEENEAQKASFIQLLKLTKKVLETCIDLLGFSAPDRM >gi|226332994|gb|ACII01000025.1| GENE 16 14520 - 15392 1044 290 aa, chain - ## HITS:1 COG:DR1986 KEGG:ns NR:ns ## COG: DR1986 COG1307 # Protein_GI_number: 15806984 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 4 282 3 275 281 125 32.0 1e-28 MGKIAIVTDSNSGITQDQARELGISVIPMPFYINEKLYLEGITLSQEEFYERLKNDEAIS TSQPGPADVCGLWNTLLEKYDEVVHIPMSSGLSASCDTAMGLAQEYDGRVHVVDNQRISV TQRQSVLDALTLRDAGRNATQIKQVLEEQKLDSSIYITLETLKYLKKGGRITPAAAAIGT VLNLKPVLQIQGEKLDAYSKTRGKKQAKRVMLKAMQNDWENRFAEYVKRGEMCLQSAYTG NQEEAEEFKKEIAEVFPEIEIVQNPLSLSVACHIGHGAIAVACSRKVVVE >gi|226332994|gb|ACII01000025.1| GENE 17 15601 - 17217 2069 538 aa, chain - ## HITS:1 COG:CAC3087 KEGG:ns NR:ns ## COG: CAC3087 COG1080 # Protein_GI_number: 15896338 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Clostridium 
acetobutylicum # 3 534 2 536 539 450 44.0 1e-126 MRVLEGKSVFSGIAIGKISILQKADTSVKRTKVENPEAEITRVQEAKEKAVEQLQKLYDK ALREVGESGAAIFEVHQMMLDDEDYLDSIDNIIRTENVNAEYAVATTGDNFADMFAQMDD DYMKARAADVKDISDRLVRVLSGHDEGDMDAAEPSIIVAEDLAPSETVQMDKSKVLAFVT RKGSSNSHTAILARTMNIPALINIEYDDSMDGKMAVVDGKTGSLIVEPDADTLKKYQDQK DEELRQRAMLKELKGKTTETKSGHKIHLYANIGSTGDVASVLANDAEGIGLFRSEFIYLE KDNYPTEEEQFQIYKAVAQNMAGKKVIIRTLDIGADKQIDYFDMAHEENPAMGYRAIRIC LDRPEVFKTQLRALFRASMFGNISIMYPMIISVTEVKQIKAIVAEVKKELTEQGIPFKDD VEQGVMIETPAAVMISDLLAKEVDFFSIGTNDLTQYTLAIDRQNAKLDNIYDSHHEAVLR MIQMVIDNAHKEGIWAGICGELGADTTLTERFIQMGIDELSVSPTFVLPVRKIVRELD >gi|226332994|gb|ACII01000025.1| GENE 18 17272 - 17529 292 85 aa, chain - ## HITS:1 COG:no KEGG:STH955 NR:ns ## KEGG: STH955 # Name: not_defined # Def: PTS system phosphocarrier protein HPr # Organism: S.thermophilum # Pathway: not_defined # 1 85 1 85 85 80 49.0 1e-14 MVSFTYIIKDKFGIHARPGLLLVQEAGKLTSNITIFKGTDSGDAKRMFCVMNLAVKQGDQ ITVHVEGENEEADAEVMREFLKNNL >gi|226332994|gb|ACII01000025.1| GENE 19 17827 - 19086 419 419 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 3 419 7 413 413 166 28 2e-40 MLENIRLAFQGIWSHMMRSFLTMLGIIIGIASIIAIVSTIKGTSEQIKEDLIGAGSNTVK IYLYDGDNQYDAEYGNSSKTPILTDDVKEEVKDLDHVENATFYYNSTSASIYNNNNSLEG GSVYGIDVNYLATCGYIIQSGRGFVKDDYKNYNKVALIDDNAAQNLFPSQNPVGQTVEIN QEPYTIVGVIRQDENYQPTINSIDEYYTYYQSIVGSVMIPDSCWPISFKFDVPQNVIVKA DSTDTMSSVGNAAADVLNNLIAALNTDGNTTMKYKADNIMEQVKNMQKLSESTNNQLIWI ASISLLVGGIGVMNIMLVSVTERTKEIGLKKAIGARKKTILGQFLTEASVLTSIGGIIGV ITGIGLSKVVGKVTGSPTAISVPSIVIAVLFSTVIGVVFGLLPSVKAANLNPIDALRSE >gi|226332994|gb|ACII01000025.1| GENE 20 19106 - 21127 1758 673 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2392 NR:ns ## KEGG: EUBREC_2392 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 311 671 175 525 525 135 30.0 7e-30 MGKVKKILAGVLVVVIAGSGITAGLMQLKKNSQKTVAVTPVSGLLQEFYTPSTTLDGTIT SSATQTVNGDKDLIIDQIYVTKGDAVKKGDPLVSFDTTLVEMELNIAKLKKQKLEQDLNK 
AVNRLNSLQNGGPVEETDAGTDADNLNSKTGKTDNDTGDDDMTPDDTLSSTADMSGSYLA TAMYPFLLSAFTDGDAVDNAGSDNSSSKSDLPGDDASTDTGNTGGSSGGTSADNDIPVYS DPSANGFSDGEKDDFNSGKNDTPELSPTPTPTLDDRTTYFDPYYRKGDPNITDGDEPFYQ KLDADSVPFTGSGTEDDPYVYLCSSAKEKVTVMGSFFNKLAGYSPDGTKVVNQGGYWYRL EFHQNDTIADYDDLKTSCTGYYLVNGSLLEKPVYEFAEVEFTLADADKYDENPDDGGDDN PGGNDEPTGTPVSRADAIKYQKSKIASLKLDLQESDIKISQLEKKANKKLITAKLDGTVT YVGDSGSGETTDGKALIKIKSSDGFYVVGSVSELMLDDFKEGTKLNCTSYTSGSFEADVM DVSEYPVTGNSFYGSSNPNVSYYAFTAVVSDKTLQLEDQDWVTVNYEASAAENIMVIQKA FVRNDNNKNYVYKDDNGILKKQEISVGAAVDSGYSVIVKGGLSEDDLIAFPYGKDVKEGA KTKEVTLDQMYGY >gi|226332994|gb|ACII01000025.1| GENE 21 21133 - 21804 221 223 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 1 214 7 231 563 89 27 2e-17 MILELKGIYKEYQQGKMKVPVLKDVNFSMEEGEYVAIMGPSGSGKTTLMNIIGCLDKQTE GTFFLDGVDIKACSENEMSDIRLNKIGFVFQSFHLLPKQSALANVEMPLNYARVPKKERR ERALKALDRVGLADRVDFKPNQLSGGQMQRVAIARAIVNNPKLLLADEPTGALDSKSGAQ VMELFQRLNDEGVSVLMITHDADIAAHAKRVVTIKDGILQEKR >gi|226332994|gb|ACII01000025.1| GENE 22 21801 - 23279 1750 492 aa, chain - ## HITS:1 COG:BS_yvrP KEGG:ns NR:ns ## COG: BS_yvrP COG0845 # Protein_GI_number: 16080382 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Bacillus subtilis # 7 412 13 392 397 126 28.0 1e-28 MKKRIIIVLGILIAAGGIGTGVWYHFKGNGQSGTGDSIVYVSKVSVITGSETTAVNRFAG VVEPQKTVNVKIESGRKVKEVEVKTGEEVKAGQLLFEYDLSSIEDDLKQAQLDLDRLKNE QISLTEQIATLEAEKKKAKAEDQLSYTIEIETNKMNLKKNEYSQKSKQSEIDKLQSATQN TEVRSEIDGVIQKIDTSKMTSDDGDSVDDSSAMDSSMSSGDGSSDSSAFITILSTGAYRV KGKVNEQNRDSIVPGEAVIIRSRVDSSKTWKGTMGSVDVNNGTSDDSSNDMYMGMASTSS DDQTTSTSYPFYVELDSSENLMLGQHVYIERDIGQDEKKDGLWLSDYYILDTDTNEPYVW AASDKNRLEKRYVTLGEHDDDLGEYKIVEGLTKKDAIAFPTAALEEGEKTEIGDLAQTMS GGADGITDMDADHGNMDDEQPDSADGEAYTDPDMEEGSEDDSSDDSIDPNEELVPIDEAP GMSAGEDVEGIE >gi|226332994|gb|ACII01000025.1| GENE 23 23563 - 24774 756 403 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578278|ref|ZP_04855550.1| ## NR: 
gi|253578278|ref|ZP_04855550.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 403 1 403 403 619 100.0 1e-175 MNINKHLILLFSVLFFLFAVSVSSGCFRKNESDVKSVVRNELDQLKNLDSETTQKYIPYT ELFPDATENTDLTDEINKTFSLFFRKFNYKISDVIVGTTNHSATVSVKLTTIDSKVLARD FKAELLRTQITESAQAQKGNIKDSSRSLEAHYLILNHLLNTNDYDTAETDCNIQLVNTGN DKKEKWKIQRTNSLEDDLVGGLIADLADPDILSPEDTLTVYLDTLQKLDLKEMTSYLGVV NIMNTSDTAKNSIASALAEQIHKNFNYVIKSSSENGYNATVTTEITTFDSDSILADYQEK LDKYLASADAVIDGSQKRYEKSFEILLNSINDNTVTTVNDVDFVLINDGVSWKLQDEGNT LGNAIFGTLTDSPLETSDSEDENISADTDKQTDDNTSTESSSN >gi|226332994|gb|ACII01000025.1| GENE 24 24942 - 25700 629 252 aa, chain - ## HITS:1 COG:FN1706 KEGG:ns NR:ns ## COG: FN1706 COG0730 # Protein_GI_number: 19705027 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 248 1 248 254 146 37.0 3e-35 MELTIQTFLIVCPLVFLAGLVDSIAGGGGLISLPAYLLAGVPMHNAIATNKLSSATGTAI STARLCKNKFVDWGVALPCISMALVGSFAGAHIALLASDKILKWMLIPVLPIVAFYVMKK KNLDDNSNVEISRKKQWILCAVCSLAVGCYDGFYGPGTGTFLLILYTGVAKLPVAKASGT MKLANLSSNIMALVVFLFSGKIVIYLGLAASVFSIAGHYVGSGMVMKNGNKIVRPIILIV LVLLFIKIITGM >gi|226332994|gb|ACII01000025.1| GENE 25 25703 - 27499 2348 598 aa, chain - ## HITS:1 COG:CAC2269 KEGG:ns NR:ns ## COG: CAC2269 COG0173 # Protein_GI_number: 15895537 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 595 1 593 595 713 58.0 0 MAESMQGLHRTCRCAEVTKEMIGKEVVLMGWVQKARDKGGIIFVDLRDRSGIMQLIFENG SIDEEGFAKAGKLRSEFVIAVTGVVEERGGAVNKNLATGEIELRVKSLRVLSEAEVPPFP IEENSKTKDEIRLKYRYLDLRRPDLQKNIMLKSKVMMITRQFFAKEGFLEIETPMLGKST PEGARDYLVPSRVHPGHFYGLPQSPQLYKQLLMCSGYDRYIQIARCFRDEDLRADRQPEF TQIDMELSFVDVDDVIDVNERFLAYLFKEVLDIDVKLPIQRITWQEAMDRFGSDKPDMRF GMELHDVSDVVKNCGFSVFTSALENGGSVRGINAEGQGAMPRKKIDKLVEFAKGYGAKGL AYIAIAEDGTRKSSFAKFMTDEEMDALVAAMDGKAGDLLLFAADKKKLVYDVLGALRLEL AKQMDLLDKNEYRFVWVTEFPLLEWSEEENRFTAMHHPFTMLMEEDLPLLDTDPGKVRAK AYDIVLNGNEIGGGSVRIHQDDIQERMFEALGFTKEAAHEQFGFLLDAFKYGVPPHAGLA 
YGLDRLVMLMAKVDSIRDVIAFPKVKDASCLMTQSPSRVSEEQLKELELEVRPEEVTE >gi|226332994|gb|ACII01000025.1| GENE 26 27596 - 28852 1578 418 aa, chain - ## HITS:1 COG:APE0662 KEGG:ns NR:ns ## COG: APE0662 COG0124 # Protein_GI_number: 14600873 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Aeropyrum pernix # 4 344 8 332 438 214 36.0 2e-55 MALKKKPVTGMKDILPKEMEIRNYVMNMIRETYGTFGFSSIETPCVEHIENLSSKQGGEN EKLIFKILKRGEKLKLAEAEKESDLVDSGLRYDLTVPLSRYYSNNANELPAPFKALQMGN VWRADRPQRGRFRQFMQCDIDILGEPTYLAEIELVLATTTLLGKLDFHNFTIRINDRKIL KAMAQYSGFPQESFDTVFIILDKMDKIGLEGVAEELEKEGFAKESIDTYLGLFKEITSDI EGVRYCKEKLKDVLDPKVAEDLETIISTVDSVKTADFKMSFDPTLVRGMSYYTGPIFEIS MDEFGGSVGGGGRYDEMIGKFTGNNTSACGFSIGFERIVMLLLERNYEIPTKNGKKAYLI EKNMPADKMLTILKQAQEERNAGTQINISIMKKNKKFQKDQMTAEGYTEFVEFFKDRM >gi|226332994|gb|ACII01000025.1| GENE 27 28785 - 29447 433 220 aa, chain - ## HITS:1 COG:RSc2515 KEGG:ns NR:ns ## COG: RSc2515 COG2206 # Protein_GI_number: 17547234 # Func_class: T Signal transduction mechanisms # Function: HD-GYP domain # Organism: Ralstonia solanacearum # 26 184 163 322 402 105 35.0 5e-23 MDTVQKTAAGSQQRILEFDLASELHHGMLVSNLAYAVAEEMGLPHEQCYELAIAGMLHDI GKLKLTSYINGQEQDPLVIEELKYVRMHSALGYEELKGQGYSDFVLESILYHHENYDGSG YPSNKAGDDIPIGARILRVCDVYAALISDRPYRRGFDISSVMELMIDEVKNFDMQVFLAF QRVVHNGDGQNFYKRGGQHGIKEETGNRYEGHFTKRDGNP >gi|226332994|gb|ACII01000025.1| GENE 28 29465 - 30748 1180 427 aa, chain - ## HITS:1 COG:CAC2271 KEGG:ns NR:ns ## COG: CAC2271 COG0635 # Protein_GI_number: 15895539 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 18 419 61 467 476 337 42.0 3e-92 DILSKENAGCGIQDAHALRKENKDNIKYALYQLLVKLTGRTLPWGNLTGIRPAKLAMGMI ESGMKNTEAAREMRERYLVSPQKTALAITIANREREILKDIDYENGYSLYIGIPFCPSIC LYCSFSSYPLKVWEKRTDEYVEALCREVRETALMMKGRKLDTIYIGGGTPTTLLPHQIRK LLDTVGEAFGYEGLAEFTVEAGRPDSITREKLMAIREYPVTRISVNPQTMNQETLDIIGR 
RHTVEQTKEAFHLARELGYDNINMDLIVGLPGEDIHMVERTLEQVRELAPDSITVHSLAV KRAARLNIFKEKYQEMSFENNQEIMDLTMKTAYEMGMGPYYLYRQKNMKGNFENVGYAKV DKAGIYNILIMEEKQPIIALGAGGSSKLVFDHGQRIERVENVKDVSNYISRIDEMIERKR TAIATWL Prediction of potential genes in microbial genomes Time: Sat May 28 19:22:21 2011 Seq name: gi|226332993|gb|ACII01000026.1| Ruminococcus sp. 5_1_39B_FAA cont1.26, whole genome shotgun sequence Length of sequence - 42931 bp Number of predicted genes - 47, with homology - 45 Number of transcription units - 22, operones - 8 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 224 126 ## gi|153810163|ref|ZP_01962831.1| hypothetical protein RUMOBE_00544 2 1 Op 2 . - CDS 217 - 846 482 ## COG0491 Zn-dependent hydrolases, including glyoxylases 3 1 Op 3 9/0.000 - CDS 911 - 3241 2404 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 4 1 Op 4 7/0.000 - CDS 3279 - 3803 630 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins - Prom 3850 - 3909 8.9 5 1 Op 5 3/0.000 - CDS 3951 - 5663 1542 ## COG0608 Single-stranded DNA-specific exonuclease - Term 5674 - 5710 7.3 6 1 Op 6 1/0.000 - CDS 5745 - 8153 2940 ## COG0342 Preprotein translocase subunit SecD 7 1 Op 7 . - CDS 8265 - 9632 1451 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) - Term 9649 - 9690 8.9 8 2 Op 1 . - CDS 9713 - 9853 156 ## 9 2 Op 2 . - CDS 9927 - 10277 472 ## gi|253578291|ref|ZP_04855563.1| conserved hypothetical protein 10 2 Op 3 . - CDS 10327 - 11835 1357 ## gi|253578292|ref|ZP_04855564.1| conserved hypothetical protein - Prom 12005 - 12064 7.0 + Prom 11929 - 11988 8.3 11 3 Tu 1 . + CDS 12180 - 13463 1352 ## COG1253 Hemolysins and related proteins containing CBS domains 12 4 Tu 1 . - CDS 13467 - 14066 526 ## COG1434 Uncharacterized conserved protein - Prom 14261 - 14320 3.5 13 5 Tu 1 . 
- CDS 14337 - 15101 806 ## COG0860 N-acetylmuramoyl-L-alanine amidase - Prom 15334 - 15393 7.9 + Prom 15146 - 15205 8.9 14 6 Op 1 9/0.000 + CDS 15365 - 16351 1281 ## COG2984 ABC-type uncharacterized transport system, periplasmic component + Term 16445 - 16483 1.3 + Prom 16370 - 16429 4.6 15 6 Op 2 13/0.000 + CDS 16502 - 17419 1116 ## COG4120 ABC-type uncharacterized transport system, permease component 16 6 Op 3 . + CDS 17422 - 18216 263 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 18255 - 18299 8.5 - Term 18039 - 18076 -0.9 17 7 Tu 1 . - CDS 18291 - 19241 887 ## EUBELI_01204 hypothetical protein - Prom 19355 - 19414 4.4 - Term 19314 - 19348 4.0 18 8 Tu 1 . - CDS 19418 - 19894 374 ## COG0394 Protein-tyrosine-phosphatase 19 9 Tu 1 . - CDS 20080 - 21279 778 ## COG0582 Integrase - Prom 21311 - 21370 2.4 - Term 21291 - 21346 12.5 20 10 Tu 1 . - CDS 21391 - 21624 270 ## CD1092 excisionase - Prom 21746 - 21805 6.8 21 11 Op 1 1/0.000 - CDS 21867 - 23204 548 ## COG5545 Predicted P-loop ATPase and inactivated derivatives 22 11 Op 2 . - CDS 23173 - 23817 386 ## COG0358 DNA primase (bacterial type) - Term 23833 - 23881 5.3 23 11 Op 3 . - CDS 23884 - 25464 626 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 24 12 Tu 1 . + CDS 25315 - 25638 78 ## CDR20291_1783 hypothetical protein - Term 25521 - 25563 3.2 25 13 Tu 1 . - CDS 25649 - 25870 271 ## gi|153810254|ref|ZP_01962922.1| hypothetical protein RUMOBE_00635 - Prom 25953 - 26012 3.5 - Term 25962 - 26020 4.7 26 14 Op 1 . - CDS 26188 - 26484 91 ## EUBREC_0795 hypothetical protein 27 14 Op 2 . - CDS 26477 - 26818 286 ## CDR20291_1781 hypothetical protein 28 14 Op 3 . - CDS 26815 - 27003 120 ## gi|160894302|ref|ZP_02075079.1| hypothetical protein CLOL250_01855 - Prom 27040 - 27099 3.0 + Prom 27390 - 27449 5.7 29 15 Tu 1 . + CDS 27487 - 28200 431 ## EUBREC_0800 hypothetical protein + Term 28302 - 28352 4.8 - Term 28511 - 28546 2.5 30 16 Tu 1 . 
- CDS 28584 - 29525 531 ## Clos_2295 hypothetical protein - Prom 29549 - 29608 7.7 31 17 Op 1 . - CDS 29703 - 30269 -33 ## gi|257439656|ref|ZP_05615411.1| CAAX amino protease family protein 32 17 Op 2 . - CDS 30317 - 30694 224 ## CDR20291_1777 hypothetical protein 33 18 Tu 1 . - CDS 31059 - 32258 335 ## COG4974 Site-specific recombinase XerD - Prom 32312 - 32371 4.7 34 19 Op 1 . - CDS 32394 - 32603 163 ## CD1092 excisionase 35 19 Op 2 . - CDS 32600 - 33415 657 ## COG1484 DNA replication protein 36 19 Op 3 . - CDS 33415 - 33636 123 ## gi|253578319|ref|ZP_04855591.1| conserved hypothetical protein 37 19 Op 4 . - CDS 33716 - 33904 81 ## gi|253578320|ref|ZP_04855592.1| conserved hypothetical protein - Term 34232 - 34274 6.1 38 20 Op 1 3/0.000 - CDS 34342 - 34755 354 ## COG0394 Protein-tyrosine-phosphatase 39 20 Op 2 . - CDS 34748 - 35491 637 ## COG0798 Arsenite efflux pump ACR3 and related permeases 40 20 Op 3 36/0.000 - CDS 35516 - 36004 195 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit - Prom 36053 - 36112 4.8 - Term 36023 - 36061 -0.7 41 20 Op 4 . - CDS 36223 - 37767 893 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 42 20 Op 5 . - CDS 37764 - 38255 278 ## CKR_0690 hypothetical protein 43 20 Op 6 . - CDS 38314 - 38832 455 ## gi|253578327|ref|ZP_04855599.1| conserved hypothetical protein 44 20 Op 7 5/0.000 - CDS 38849 - 41203 1899 ## COG2217 Cation transport ATPase 45 20 Op 8 . - CDS 41263 - 41622 245 ## COG0640 Predicted transcriptional regulators - Prom 41661 - 41720 7.3 - Term 41629 - 41668 4.7 46 21 Tu 1 . - CDS 41754 - 42029 107 ## gi|166030809|ref|ZP_02233638.1| hypothetical protein DORFOR_00483 - Term 42198 - 42238 6.6 47 22 Tu 1 . 
- CDS 42386 - 42487 64 ## - Prom 42511 - 42570 4.6 Predicted protein(s) >gi|226332993|gb|ACII01000026.1| GENE 1 2 - 224 126 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153810163|ref|ZP_01962831.1| ## NR: gi|153810163|ref|ZP_01962831.1| hypothetical protein RUMOBE_00544 [Ruminococcus obeum ATCC 29174] # 3 63 10 70 516 75 59.0 9e-13 MCKIGILFKNRDFEHDVYELIKAFYPEAEIHTLYENGEAEYDLRFSVERDNDSYIIKYER TENLSKQETEQKGV >gi|226332993|gb|ACII01000026.1| GENE 2 217 - 846 482 209 aa, chain - ## HITS:1 COG:BH2820 KEGG:ns NR:ns ## COG: BH2820 COG0491 # Protein_GI_number: 15615383 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Bacillus halodurans # 11 209 8 208 211 155 41.0 6e-38 MKNLELQKCILGPVYTNCYLLKNKETGELIIVDPADCLEKIEMKISRMNGKPVAILLTHG HFDHILAAQAVKEKYNIPIYACRQEEEMLREPSINMTVHYGQGCSIVPDVFLEDLDVIRL AGFSVQMIHTPGHTKGSCCYYLKDEGVLFSGDTVFYGSVGRTDFPGGSTAEIVRSLHKLV DSLPEETEVFPGHDASTTIGYEKRYNPFV >gi|226332993|gb|ACII01000026.1| GENE 3 911 - 3241 2404 776 aa, chain - ## HITS:1 COG:CAC2274 KEGG:ns NR:ns ## COG: CAC2274 COG0317 # Protein_GI_number: 15895542 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Clostridium acetobutylicum # 58 774 17 738 740 729 51.0 0 MTEITKSTEAKKDEMQVEKEKLESVKRADAAVKTMHDFTSPEVLYKELINSVLKYHPSTD ISMIEKAYKVASEAHEGQKRKSGEPYIIHPLCVAIILADLELDKETIVAGLLHDAVEDTW MTYEEVEKEFGSEVALLVDGVTKLGQLSYSADKVEVQAENLRKMFLAMAKDIRVILIKLA DRLHNMRTLQYMRPEKQQEKARETMDIYAPIAMRLGISKIKVELDDLSLKYLKPDVYYDL VHKVALRKSEREQFVGAIVKEVKKHMDDANIKAQVDGRVKHFFSIYKKMVNQDKTIDQIY DLFAVRILVDTVKDCYAALGVIHEMYKPIPGRFKDYIAMPKPNMYQSLHTTLIGPNGQPF EIQIRTYEMHRTAEYGIAAHWKYKESSDGKAPVGKSEEEKLNWLRQILEWQRDMSDNKEF MSLLKNDLDLFADSVYCFTPQGDVKTLPSGSTPVDFAYSVHSAVGNKMVGARVNGKLVPI EYEIKNGDRIEIITSQNSQGPSRDWLKLVKSTQAKNKINQWFKKELKEDNILKGKEMLAQ YARAKGFKIANYTKTQYLEAVLRKYGFRDWDSVLAAIGHGGLKEGQVFNKLVEAYDKENK KNLTDEQVLEAASETQEKLHIAKSKSGIVVKGIHDVAVRFSKCCNPIPGDEIVGFVTRGR 
GITIHRTDCVNVLNMSETDRTRLIEAEWQQPDTKEKEKYMAEIQVYANNRTGLLVDLSKI FTERKIDLRSINSRTSKQEKATISMSFEIGSKEELRSLIEKIRQVESVIDVERTTG >gi|226332993|gb|ACII01000026.1| GENE 4 3279 - 3803 630 174 aa, chain - ## HITS:1 COG:FN1483 KEGG:ns NR:ns ## COG: FN1483 COG0503 # Protein_GI_number: 19704815 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Fusobacterium nucleatum # 4 171 3 170 170 188 54.0 5e-48 MKKIEDYVLSIPDFPEPGIIFRDITSVLQDADGLQLAIDSMQDCLKDIDVDVIAGTESRG FIFGVPIAYNLHKPFVPIRKKGKLPRETVSVSYDLEYGSAEIEMHKDSIKPGQKVVIVDD LIATGGTIEAAIKLVEQLGGEVVKVVFLMELAGLKGRERLNGYDVASVICYDGK >gi|226332993|gb|ACII01000026.1| GENE 5 3951 - 5663 1542 570 aa, chain - ## HITS:1 COG:CAC2232 KEGG:ns NR:ns ## COG: CAC2232 COG0608 # Protein_GI_number: 15895500 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Clostridium acetobutylicum # 1 569 1 586 587 582 49.0 1e-166 MEKWVIAAKRADFNQIGQQFHIDPVIARLIRNRDVIGEDKIREYLSGTVQELPSPWLMKD MEKAVDILEKKISQKAKVRIIGDYDIDGVTSTYILMKGLARIGADVDTYIPDRVADGYGI HAHLIERAETDRIDTIVTCDNGIAAAAEIQMAKDKGMTVIITDHHEVPYREEKGERQMVL PPADAILNPKQYDCPYPNKNLCGAVVAFKYIAALYERFGVPAEELEDYYELAAIATVGDV MDLQGENRILVKEGLCRLKNTKNQGLQELIRANSLEDAKITAYHIGFVLGPCINASGRLD TAARSLALLNAKTKEEAARLAGDLTALNQSRKALTEKGKEEAIRIIETTELKNDRVLVVY LPDCHESLAGIIAGRIREKYHRPAFVLTGGETSAKGSGRSIESYSMYEELVKCADLMIQF GGHPMAAGLSIEEKNIEEFRRRLNVNCTLTDEELRPKIVIDVPMPVSYITKELVEQISLL EPFGKGNTKPVFAQKNLRVLDHSIIGKNKNVVKLKLLDPQGISVEGIYFGEAEDFVNFIR EKDSISVTYYPEINRFRGRESLQIIIQNYC >gi|226332993|gb|ACII01000026.1| GENE 6 5745 - 8153 2940 802 aa, chain - ## HITS:1 COG:CAC2278 KEGG:ns NR:ns ## COG: CAC2278 COG0342 # Protein_GI_number: 15895546 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Clostridium acetobutylicum # 2 424 3 393 417 253 37.0 1e-66 
MKKSKGIVVLLLTLILTVFFCFTAAVGIGPTGTGAAKHINTGLDLAGGVSITYQAKESNP TSEEMSDTIYKLQKRVEQYSTEAQVYQEGSDRITVEIPGVTDADTILNDLGKPGSLYFIT KEDADGNQNFTTGQDGYVLSRTMDEIKESGSVVLEGSDVADAQGGAIQNQNTSAKEYVVS LTLTSDGENSGKTKFAEATKNNVGKQIAIVYDNNIISAPNVNEEISGGKAQISGMSDLEE AQNLASYIRIGSLGLELEELRSSVVAAQLGEEAISTSVLAGAIGLIIVIIFMIIAYRVPG VVAGIALILYTSLMLITLNAFDITLTLPGIAGIILGIGMAVDANVIIYARIREEIGAGVS VRNSIKSGFSKAFSAIFDGNITTLIAAFVLMWLGSGTVKGFAYTLALGIVISMFTALVVS RLIVNALYAVGVRDPKFYGSAKERKAVDFLGKKKVFFAISIILILCGPAAMFANSHAGNK ALNYSLEFSGGTSTTVTFNEDMDIKTIDSEVTPVVEEVTGDKNVQPTKVVGTNQVVIKTR SLNQSEREALQSALVEKFGVDDSTISTESISSTVSSEMRRDAIVAVIVATICMLLYIWFR FKDIRFASSAVLALLHDVLVVLAFYAIARVSVGNTFIACMLTIVGYSINATIVIFDRIRE NLHSGSREKLAEIVNTSITQTLTRSIYTSFTTFVMVAVLYIMGVSSIREFAAPLMVGVLV GAYSSVCITGALWYVMKKKFAGKGQPAIAAASASAKSSSKPKKARDVSQKDPTQPKKKNR KRVAERLAAEEAAKKQQNDDAE >gi|226332993|gb|ACII01000026.1| GENE 7 8265 - 9632 1451 455 aa, chain - ## HITS:1 COG:CAC2279 KEGG:ns NR:ns ## COG: CAC2279 COG0641 # Protein_GI_number: 15895547 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Clostridium acetobutylicum # 1 452 1 452 454 483 52.0 1e-136 MSLIHRYKSNGFNIVLDINSGCIHLVDEVTYEVLPYLEEGLGTEAIAEKLENKYNREDIE TSVRECNKLKEDGMLFTKDVYENVIEEFSNNRQTVVKALCLHIAHDCNLACRYCFAEEGE YHGRRALMSYEVGKKALDFLIANSGSRKNLEVDFFGGEPLMNWQVVKDLVKYGREQEKLH NKKFRFTLTTNGVLLNDEVMEFCNKEMGNVVLSVDGRKEVHDYMRPFRKGAGSYDLIMPK FQKFAESRNQDKYYVRGTFTHHNLDFSKDVLHLADLGFKQISVEPVVAADTEEYAIREED IPQIMEEYDTLAKEMIKREKEGKGFNFFHFMIDLTGGPCVYKRLSGCGSGTEYLAVTPWG DFYPCHQFVGEEQYLMGNVDEGITKPEIQKEFGGCNVYTKKDCKDCFARFYCSGGCAANA FHFQGDIKGNYEIGCELQRKRVECAIMIKAALACD >gi|226332993|gb|ACII01000026.1| GENE 8 9713 - 9853 156 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEHIKTLNTKNLQNTVKKGGCGECQTSCQSACKTSCTVGNQSCEQK >gi|226332993|gb|ACII01000026.1| GENE 9 9927 - 10277 472 116 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578291|ref|ZP_04855563.1| ## NR: gi|253578291|ref|ZP_04855563.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 105 1 105 116 136 100.0 5e-31 MKVRAILKSLLFAYALTGVALLLLALVLFAFDLGETAVDAGIIIIYVLACFMGGFMAGKI VRKDKYLWGVITGLAYYVLLLTVSVMAKGGWDMSAAHAVTTFFMCTGGGTLGGMLS >gi|226332993|gb|ACII01000026.1| GENE 10 10327 - 11835 1357 502 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578292|ref|ZP_04855564.1| ## NR: gi|253578292|ref|ZP_04855564.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 502 1 502 502 698 100.0 0 MSNNINRIRKNKLKGTGIFAAAALTAALIAGNNIPGVTRQVSAAESQENTVSSLIPDNIT LDKPVELADVELPATEYGTFEWADGSFVPAQRVQSCEVLLIPAEGQDLSHVDGWDPESGA VVGYINVIVNSITDGDNVDKTEGSGKTDPTENADSTETPETSSEKPAADENADTDNKEDK ESKSDQDNKTDTDNEENKIEEDKTDNKNNKDNNSNADKTDKDTETTGKLEPADDMTQDKE DTKKDDEDGADDNIFDNPVLPDDKDDRPTDAEDSLSDEEKEERAALNHTCDGITVSGVNL PWYVQFRVSSGDSYQFTNESDAMIFQSYEFELWDLKNNTEYEIPNGDYISVTVPVKSGYQ YTIEHLLDNGATETIIPSVENGIMVFSTHSFSPFGIAGSRQLVGPDAGNDTDDTDAVTPA PASSDSQSSDSGKNNTVSMNDSSDSNNGSSSDQNDNSTENKAVKNNNTVNTGDTTSIMPF VILFVAAAVIAGVAVFIKKRKK >gi|226332993|gb|ACII01000026.1| GENE 11 12180 - 13463 1352 427 aa, chain + ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 19 424 17 426 426 239 38.0 8e-63 MDSSIAFQAVAILILIALSSFFSSAETAMTTVNKIRIQSLAEQGNKRAVILEKIISDSPK MLSTVLIGNNIVNMSVSSLMTTLTIKILGNAYVGITTGILTLLILIFGEITPKNLATIHA EKLSLAYSRIIYGLMILLTPVVFIVNKITEGVLVILHVNPDEKANAMTEHELRTLVNVGE KDGVIENEEKQMIYNVFDFGDSTAKDVMIPRIDMTFIDINFSYDELMAVFSEDMHTRFPV YEDNTDNVIGIINMKDLLVYPKDKPFSIRNILREPYFTYEYKATADLMIEMRKASVNLAI VLDEYGATAGLVTLEDLLEEIVGEIRDEYDEDEVEDIKEIQPEREYVVQGSAKLDDINEA LHINLESEDYDSIGGYIIEQLDCLPKEGQSVTLESGIRLVVDRLDKNRIELVHIWLPEKK TETEEQP >gi|226332993|gb|ACII01000026.1| GENE 12 13467 - 14066 526 199 aa, chain - ## HITS:1 COG:AGl144 KEGG:ns NR:ns ## COG: AGl144 COG1434 # Protein_GI_number: 15890180 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Agrobacterium 
tumefaciens strain C58 (Cereon) # 4 197 72 231 252 62 24.0 5e-10 MNDVYEKFLKSSEEFIFAENKPEKSDIIFVPGNGYPQMAEKAAELFKKGMADWILPSGKY SVVNGKFSGVLEKSNVYDKEYGTEWEFLRDVLIKNGVPDQKILKEDQATFTYENAIYSRQ VTDHAELEIERAILCCKSYHARRCLMYYQLLYPKTEFYVVPVNADGITRENWRKNEEGID AVTGELSRIVKQFSLMLER >gi|226332993|gb|ACII01000026.1| GENE 13 14337 - 15101 806 254 aa, chain - ## HITS:1 COG:BS_cwlC KEGG:ns NR:ns ## COG: BS_cwlC COG0860 # Protein_GI_number: 16078804 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus subtilis # 5 252 4 252 255 134 35.0 1e-31 MAYSIMLDAGHGGQDPGAVYKGRQEKNDNLKLALAVGEILKNKGIDISYTRTGDVYQTPF EKAQLANQAGVDYFISFHRNSSPKENQYNGAEVLIYDKSGIKYQMAENILGALGEVGFRE IGVKERPGLVVLRRTRMPALLIETGFINSEEDNKLFDNKFSDIAQGIADAILGSLDEETV QEPLYYRVQTGAFRKKENADRMLYQLLEKGYPSFMLHENGLYKVQTGAYQQIGNAVNMEQ RLRDDGYSTVIVTK >gi|226332993|gb|ACII01000026.1| GENE 14 15365 - 16351 1281 328 aa, chain + ## HITS:1 COG:Cgl2198 KEGG:ns NR:ns ## COG: Cgl2198 COG2984 # Protein_GI_number: 19553448 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Corynebacterium glutamicum # 28 321 39 328 330 201 47.0 1e-51 MKRKALAITLAAAMAMGTCAVTASAADGDKYTIGICQLVQHEALDAATQGFKDEVTKELG EDAVTFDEQNAQGDSNTCSTIINSFVSNNVDLILANATPALQAAAAGTSDIPILGTSVTE YGVALGLDDFDGTVGGNISGTADLAPLDQQAAMLNELFPDAKNVGLLYCSAEANSQYQVD TVKAELEKLGYTCEYYAFSDSNDLSSVATTATDASDVIYVPTDNTVASNTEIINNICLPA KVPVITGEEGICSGCGVATLSISYYDLGVTTGKMAVKILKDGEDISTMPIEYAPNFTKEY NKDICEELGIEVPDDYVAIGEAAEETAK >gi|226332993|gb|ACII01000026.1| GENE 15 16502 - 17419 1116 305 aa, chain + ## HITS:1 COG:Cgl2197 KEGG:ns NR:ns ## COG: Cgl2197 COG4120 # Protein_GI_number: 19553447 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Corynebacterium glutamicum # 12 302 3 289 296 187 43.0 3e-47 MQIITLITALPGAVAQGLIWGIMAIGVYITFRILDIADLTVDGTMCTGGAVCIMMMLSGH 
NVWISMLAATLAGMLAGLATGIFHTFMGIPAILAGILTQLSLYSVNLKIMGKANQAINVD KYNLLVSLRHIKNASLSKNTIFIVAVMCILLIAILYWFFGTELGCSLRATGCNPNMSRAQ GINTNRNKVLGLMISNGLVGLSSALLAQYQGFADVNMGRGSIVIGLAAVIIGEAIFGRIF RNFALRLLAVIFGSILYYLVLQIVIWMGIDTDLLKMLSAIVVALFLAFPYWKGKYFNKPA KRGGK >gi|226332993|gb|ACII01000026.1| GENE 16 17422 - 18216 263 264 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 228 1 218 245 105 33 3e-22 MLELKNIYKTFNPGTINEKRALNGLNLKLNEGDFVTVIGGNGAGKSTMLNAVAGTWPVDE GQILIDNIDVTKLSEHKRATYLGRVFQDPMTGTAATMGIEENLALAKRRGKSRLLRSGIT KAEREEYKELLKILGLGLEDRLTSKVGLLSGGQRQALTLLMATLQKPKLLLLDEHTAALD PKTAAKVLEITDMIVNRDHLTTLMITHNMKDAIAHGNRLIMMMEGKIILDIQGEEKKKLT VKNLLDQFEKASGEEFSNDSALLG >gi|226332993|gb|ACII01000026.1| GENE 17 18291 - 19241 887 316 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01204 NR:ns ## KEGG: EUBELI_01204 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 59 300 53 291 298 68 22.0 3e-10 MQKDYETFVEKLEIGIYEATGIPRENISFEKEGGRFAPVGDRLLVKFAEHDDAWEVCGLY TQELFKSYQNGSPFEEIIKEITDDLNRIKKADIYEKTRVIRDYEKTKPRLFIRLLNADKY SADLQDAVYKTLGDIALVLYMKVTEYEGCATSTKIRQGMLEQWGKECDEVFQEAILNTYF MSPPRIYRWEQMIFNPEYEGESFMNLGDKCELKKDAMGNCLSTTKKTNGAVAVFLPGVAE QLAYMLDSDFYMVFTSVHEVMIHNDKFVEPEDLQCVLRDTIREATPKEDYLTSRIYQYNR ETHKFICVTPLEKDEK >gi|226332993|gb|ACII01000026.1| GENE 18 19418 - 19894 374 158 aa, chain - ## HITS:1 COG:lin0937 KEGG:ns NR:ns ## COG: lin0937 COG0394 # Protein_GI_number: 16800007 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Listeria innocua # 1 155 1 152 152 123 43.0 1e-28 MIKVLFVCHGNICRSPMAEFIFKDMVSKQGLSDRFYIASAATSTEEIWNGIGNPVYPPAR EELAKHGIDCKGKRAVQLTKADYDKYDYILGMDHWNLKNMLRILKSDPEDKVKLLLDYSD DPRDIADPWYTGGFDVTYSDVVEGCEAFLEYLKDKKKL >gi|226332993|gb|ACII01000026.1| GENE 19 20080 - 21279 778 399 aa, chain - ## HITS:1 COG:SP1129 KEGG:ns NR:ns ## COG: SP1129 COG0582 # Protein_GI_number: 15900995 # Func_class: L Replication, 
recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 93 393 80 377 387 63 22.0 6e-10 MSEKRRDNRNRILREGEYQRKDGRYRFRYIDEDGKEKNVYSWRLDKNDPMPKGKKREPSL REKEKQIEADLFDRIVTNGGNYTVLELVEKYVSLKTGVRHNTVAGYKTVINMLKKESFGN LRIDKVRLSDAKAWLIKLQQIDGRGYSSIHSIRGVLRPAFQMAVDDDLLRKNPFEFELAS VIVNDSVTREAITRKQQRDLLKFIQEDKHFSRYYDAIYILFHTGLRISEFCGLTVSEIEF GEMRIKVDHQLQRTAQMQYVIEEPKTDKGIRYVPMTEAVAACFRRIIANRKTPKVEPMVE GYAGFLFLDKNDMPMVALHWEKYLEHIIQKYNRIYRIQMPKVTPHVCRHTFCSNMAKSGM NPKTLQYIMGHADISVTLNTYTHVNFDDAKEEVYRIANS >gi|226332993|gb|ACII01000026.1| GENE 20 21391 - 21624 270 77 aa, chain - ## HITS:1 COG:no KEGG:CD1092 NR:ns ## KEGG: CD1092 # Name: xis # Def: excisionase # Organism: C.difficile # Pathway: not_defined # 12 77 2 67 67 76 53.0 2e-13 MANNKLNIEGKNYSDIPVWRRYTLTIEEAARYYHIGEGKLRTLIDTHPNEDFYVMNGNRA LIKREKFERYLDQATAV >gi|226332993|gb|ACII01000026.1| GENE 21 21867 - 23204 548 445 aa, chain - ## HITS:1 COG:L109011 KEGG:ns NR:ns ## COG: L109011 COG5545 # Protein_GI_number: 15672499 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Lactococcus lactis # 5 353 30 383 480 145 28.0 1e-34 MNAMQPPQSIEEIKAGLETTEKGGVRQSIRNCLTVFQRDPLLSGAIAYNILTDRKDIIKP IGFHRDSTALNDTDMKYLLLYLEETYGLTNEKKIDNAIGIVANENKYHPIRDYLSSLVWD GTERIRFCLRHFLGADTDDYTYEALKLFLLGAISRAFQPGCKFEIMLCLVGGQGAGKSTF FRLLAVRDEWFSDDLRKLDDDNVYRKLQGHWIIEMSEMMATANAKSIEEIKSFLSRQKEV YKIPYETHPADRPRQCVFGGTSNALDFLPLDRSGNRRFIPVMVYPEQAEVHILEDEAASR AYIGQMWAEAMEIYRSGRFKLAFSPAMQRYLKEHQRDFMPEDTKAGMIQAYLDKYTGSMV CSKQLYKEALNHAFDEPKQWEIREINEIMNQCISGWRYFQNPRMFSEYGRQKGWERENPA TDLCNPSEKTMDGFVEITEQMELPF >gi|226332993|gb|ACII01000026.1| GENE 22 23173 - 23817 386 214 aa, chain - ## HITS:1 COG:RC1330 KEGG:ns NR:ns ## COG: RC1330 COG0358 # Protein_GI_number: 15893253 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Rickettsia conorii # 1 127 15 135 595 72 34.0 7e-13 
MNVFEAVKQSVTTRQAAEHYGIHVGRNGMACCPFHHDKTPSMKLDRRYHCFGCGADGDVI DFAAALYGLGKKEAAVQLAQDFGLSYEDWKPPGKAKKPKPRQKSPEEQFQEAKNRCFRIL ADYLHLLRVWRKEYAPHSPEEVFHPRFVEALQKQDHVEYLLDVLLFGETEEKAALITEYG KDVIQLEQRMAELAAADAARTKKHHERHAAAPEH >gi|226332993|gb|ACII01000026.1| GENE 23 23884 - 25464 626 526 aa, chain - ## HITS:1 COG:AGpT237 KEGG:ns NR:ns ## COG: AGpT237 COG0507 # Protein_GI_number: 16119945 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 265 22 290 1117 130 34.0 9e-30 MPCPHNEISIVQRSQQQSAVAAAAYQSGEKLFCEYDQEVKHYPEKRGIVHNEILLPANAP PEYADRNTLWNAAEAVEKQWNSQLARRWVLTIPREIPPDQYAVLVREFCQQQFVSKGMIA DFAIHDPHPPGHNPHAHVLLTMRAMDEHGKWLSKSRKVYDLDENGERIKLPSGRWKSHKE DTVDWNDQKYCEIWRHEWEVIQNRYLEANDRPERVDLRSYARQGLDIIPTVHEGAAVRQM EKRGIQTNIGNLNREIRAANRLMKSIRQLIQNLKGWITELGEKRKELLAQKAAEEATLLP NLLMKYMEIRKEERKDWTRAGQNRGTSQDLKAVSEALSYLRQKGLSTVEDLEAFLESSGK SAADYRSQMKPKEARSKVIDGLLSARTDCKECKPVYEKYQKIFFKKTKEKFNQEHPEVAR YAKAAAYLAKHPDDKDSTQKKLQEEQETLLEEIAALKTPLIEVQADLKKLRDIRYWVRKA TPGTEESKEPPKKQPIKEVLQDKADEKKAQRTAPAQAKHRQQDMEL >gi|226332993|gb|ACII01000026.1| GENE 24 25315 - 25638 78 107 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1783 NR:ns ## KEGG: CDR20291_1783 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 52 104 1 53 53 97 86.0 1e-19 MDDTSLFWVVLHFLVVFTEQLFAALVGSGGNRRLLLATLNNRDFVVWTGHFVSLLSMLGG WGGVLPPHFGQKKSRRPFSDFLLGVPLCKRRVFSCESSGQVDPMCVE >gi|226332993|gb|ACII01000026.1| GENE 25 25649 - 25870 271 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153810254|ref|ZP_01962922.1| ## NR: gi|153810254|ref|ZP_01962922.1| hypothetical protein RUMOBE_00635 [Ruminococcus obeum ATCC 29174] # 1 73 1 73 73 126 100.0 4e-28 MFKKAFWVPYEDSANYPTLAKTMEAISKYCEENGKSYTFINDDEVEINGKRYEIYRGYEN GSRGNYGIKCKEK >gi|226332993|gb|ACII01000026.1| GENE 26 26188 - 26484 91 98 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0795 
NR:ns ## KEGG: EUBREC_0795 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 98 1 98 98 117 93.0 2e-25 MPDTSKLEKLNRELEKSEKKLRKAINDEKALQHQLKQLTRKERTHRLCIRGGMLESFLQE PERLTDDDVKVLLKIIFHRQDTQELLKKLLERKKPETP >gi|226332993|gb|ACII01000026.1| GENE 27 26477 - 26818 286 113 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1781 NR:ns ## KEGG: CDR20291_1781 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 113 1 113 113 209 97.0 2e-53 MNFEFMTIDTPLPPCMPFPRALTGFPVSSTAKVMYCRMLDAMLSKGQEDENGILFVCFPV TAIATVLSRSPMTVKRSLNELETAGLIMRVRQGIGEPNRIYVLIPGKENAALA >gi|226332993|gb|ACII01000026.1| GENE 28 26815 - 27003 120 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160894302|ref|ZP_02075079.1| ## NR: gi|160894302|ref|ZP_02075079.1| hypothetical protein CLOL250_01855 [Clostridium sp. L2-50] # 1 59 38 96 98 99 84.0 4e-20 MPRMSKKRKHELSFYLNDRGRVTYNELCRKCQHGCRQSFRAVVIDCPRYLSKRAKKKEEH TE >gi|226332993|gb|ACII01000026.1| GENE 29 27487 - 28200 431 237 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0800 NR:ns ## KEGG: EUBREC_0800 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 224 10 233 277 410 97.0 1e-113 MNGATTIQERLKDLRLNKRLKLEELAEQTGISKSALGSYEKDDYKEINHGNLILLADFYG VSLDYLFCRTENMVEINTPLRELHLSDEMVALLKSGRINNRLLCELATHKDFIKFLADIE IYVDGIATMQIQNLNALVDTVRHEIIERYRPGEDDPHLKVLQAAHISDDEYFSHMVRDDL NLIIRDIREAHKKDSESAPQTTVADELKENLEAVENFKGSRDGKARCTLLQTARHQL >gi|226332993|gb|ACII01000026.1| GENE 30 28584 - 29525 531 313 aa, chain - ## HITS:1 COG:no KEGG:Clos_2295 NR:ns ## KEGG: Clos_2295 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 216 310 135 228 228 120 61.0 9e-26 MKKKLCSMVCLCYFVSIMLCACGRKEQGNPIRLPAREDIVSIGVSDGDKYAISPNTEGEA TEFIDEFLSMLMDMETTSQQSINDAPVNKDSITININCDGEAGTTLFYYVDKGIEYVEQP YQGIYKPTPALGNCITEMLASADNRPLMVTFQASVIETNHDSIIVKPVDGSLELDSADKF YISNEENLELQIGDFVEISYNGEIMESYPAQLGEVYKITVIEQTEANAMWDRIPMVRIDG 
KLYYDTGRESIMDARCGTMDGEITSTVDGTEIPTEDNQSNFGSGFGYQYGADDTIEIYMN EKWFVFEYREESE >gi|226332993|gb|ACII01000026.1| GENE 31 29703 - 30269 -33 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257439656|ref|ZP_05615411.1| ## NR: gi|257439656|ref|ZP_05615411.1| CAAX amino protease family protein [Faecalibacterium prausnitzii A2-165] # 1 188 1 188 188 286 97.0 6e-76 MLKSIGLYFRKIDFLNFAIGAIMPIIVLFIVYSSVKSNIILQDTDFLSLLMNHKGKFIFY FFVSFIEEVIFRGIIFGLLLQKCKNKYLSCVIAALIFTLPHIFNTDNISVLVMFIFPFLY GIFANEMFYTTKSIWMPTGFHWLWNYTITSLFLVTGTQSFIYVHIIAAMIIMIPLFYIVI GKTRLSGD >gi|226332993|gb|ACII01000026.1| GENE 32 30317 - 30694 224 125 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1777 NR:ns ## KEGG: CDR20291_1777 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 125 1 125 125 216 92.0 3e-55 MSELKPRITENGIDYILVGDYYIPDLKLPEERRPIGKYGRMHREYLREVHPARLNTLILT GELWTYLADLNEQAQERLDTIMEQMKIAEGVTEKLKCTRQMEWVRRCNNIHNRAEEIVLH EIIYS >gi|226332993|gb|ACII01000026.1| GENE 33 31059 - 32258 335 399 aa, chain - ## HITS:1 COG:YPO0892 KEGG:ns NR:ns ## COG: YPO0892 COG4974 # Protein_GI_number: 16121197 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Yersinia pestis # 219 391 142 290 299 69 27.0 1e-11 MSEKRRDHRGRILHNGEIQLSDGRYRFKYVDEMGKERCVYSWRLDHNDATPKGKRRTLSL REMEKKIEADHFEQIATNGGNMTVLELVEKYTSTKTGVRPTTVAGYGTVINLLKKDPFGK RRIDTVRISDAKCWLIHLQQVEKKSYSSIHSIRGVLRPAFQLAVDDDLIRKNSFQFQLME VVVNDSVTREAISRAEERKFLRFVKEDPHFCRYYEGIYILFKTGLRISEFCGLTISDIDF KEHTINIDHQLQKKSKIGYYIQETKTTSGTRKIPMTADVEECFRKIIEKRNPPKAEPMVD GKSGFLYFDKNESICYSLHWEHYFQHIIQKYNNTYKVQMPVITPHVCRHTYCSNMAKSGM NPKALQYLMGHSDISVTLNTYTHVNLEDAREEVARIQVV >gi|226332993|gb|ACII01000026.1| GENE 34 32394 - 32603 163 69 aa, chain - ## HITS:1 COG:no KEGG:CD1092 NR:ns ## KEGG: CD1092 # Name: xis # Def: excisionase # Organism: C.difficile # Pathway: not_defined # 1 69 1 67 67 67 44.0 1e-10 MNHTEKTIPVWEKYSLNVSEAAEYYRIGEKRLRQIAGENEGADFILEVGSHIRFKRKLFE DYLDTASTV 
>gi|226332993|gb|ACII01000026.1| GENE 35 32600 - 33415 657 271 aa, chain - ## HITS:1 COG:CAC1933 KEGG:ns NR:ns ## COG: CAC1933 COG1484 # Protein_GI_number: 15895206 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 18 258 25 271 282 117 32.0 2e-26 MREEDTVCVGSDGLKYCKVCGEAKEAFFPKGGFMGMKKHSRQCACDRKVYEEEQKYFKDK EHRELVSRNTSICFDESRMEEWTFENADMSDTVMHRAKKYVDNWEEMKRNHIGCLFWGPV GTGKSFIAGCIANELLKQEVMVKMTNFNTIIDDIFPLADKTEYINALASYQLLIIDDLGV ERNSEYALGIIFSVIDRRIRSGRPLIITTNLPLKEIKNETMLDKRRIYDRILEMCTPMYV GGTSKREVIASMKMEKAKTLLNTNRGEEDCE >gi|226332993|gb|ACII01000026.1| GENE 36 33415 - 33636 123 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578319|ref|ZP_04855591.1| ## NR: gi|253578319|ref|ZP_04855591.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 73 7 79 79 131 100.0 2e-29 MLQYFQFPKFLLKLRISQTAKFLYMILYDRARISRKNSWIDKYGNVAPSKVTEKYKSNKY HEVNYCYGEGESL >gi|226332993|gb|ACII01000026.1| GENE 37 33716 - 33904 81 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578320|ref|ZP_04855592.1| ## NR: gi|253578320|ref|ZP_04855592.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 62 78 139 139 124 98.0 3e-27 MLSNIQLTEKIKKISSGGCKNASQRFVLMKGYLGGKNKKLSSRACIYALHLLNICKDIFE PT >gi|226332993|gb|ACII01000026.1| GENE 38 34342 - 34755 354 137 aa, chain - ## HITS:1 COG:CAP0105 KEGG:ns NR:ns ## COG: CAP0105 COG0394 # Protein_GI_number: 15004808 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Clostridium acetobutylicum # 3 131 2 129 136 147 58.0 4e-36 MNKKKVAFICVHNSCRSQIAEALGKHLASDVFESYSAGTETKLQINQDAVRIMKELYGID MEAEGQYSKLIDEIPVPDIAISMGCNVGCPFIERPFDDNWGLEDPTGKSDEEFKKVIDEI RMQIFILKQRLDEFEEN >gi|226332993|gb|ACII01000026.1| GENE 39 34748 - 35491 637 247 aa, chain - ## HITS:1 COG:pli0038 KEGG:ns NR:ns ## COG: pli0038 COG0798 # Protein_GI_number: 18450320 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenite efflux pump ACR3 and related permeases # Organism: Listeria innocua # 8 238 110 340 352 246 66.0 2e-65 MYEENQKLKTWITPDLATEYLAGAVLLGAAPCTAMVFVWSTLTKGNPAYTVVQVASNDII ILFAFVPIVKFLLGVSNVSVPFQTLFMSVVLFVVIPLAAGIITRLLVVRKKGTEYFEQTF LHKFDSATTVGLLLTLILIFMFQGEVVLDNPLHIILIAVPLIIQTFFIFFLAFGVCRIIR LPYDIAAPAGMIGASNFFELAVAVAVALFGTSSPAALATTVGVLTEVPVMLTLVKIANKL KEKFKYE >gi|226332993|gb|ACII01000026.1| GENE 40 35516 - 36004 195 162 aa, chain - ## HITS:1 COG:CPn0790 KEGG:ns NR:ns ## COG: CPn0790 COG0479 # Protein_GI_number: 15618699 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Chlamydophila pneumoniae CWL029 # 1 149 73 232 258 101 35.0 5e-22 MLINERPRLACSTFLHTLKGSTITLEPLSKFPLVRDLIVDRSILFENLKKLNLWLESEAY MNRWTHEPRYQSARCLMCGCCLEVCPNFSANGTFAGAVAAVNAFRILNEEQESTHLNEIS AEYKKKYFEGCGKSLSCHDICPIGLPVEELLVRSNAAAVWGK >gi|226332993|gb|ACII01000026.1| GENE 41 36223 - 37767 893 514 aa, chain - ## HITS:1 COG:BH3092 KEGG:ns NR:ns ## COG: BH3092 COG1053 # Protein_GI_number: 15615654 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # 
Organism: Bacillus halodurans # 1 507 1 562 589 367 38.0 1e-101 MSKTIIIIGAGLAGLSAALQAAENGCNVKLVSSLPSERAQSVMAEGGINAALNTKDENDS PEEHFTDTIKAACGLADPNAVWGMTQAAPELVHWLLKLGVKFNMSGYDDVDLRNFGGQKK KRTAFAQSDTGKQIMTAMIDAVRRKEASGMVERFSHHSFLTLRLCGNICCGCVIRDEYSQ ETVELPGDAVIVATGGMHGLFGNTTGSLSNTGEVTAELFRLGVPLANSEMIQYHPTTVKC GGKRMLISEAARGEGGRLFAMKDGKQWYFMEEKYPELGNLMPRDITAREIWKVSHESEVF LDMTEISEEIISNKLSGLADDCMTYLHKDIRKEPVSVLPGIHYFMGGILVDEQHRTPIQN LYAAGECCAQYHGANRLGGNSLLGALYGGRVAAKSAREQADVVDLSCATQIDFPPTSQIS EIKQLNKVMQDTMGVVRNENTLLNGIQTVQALTGNLPLLGMAVLKSALARKESRGAHWRE DYPKSNDDDYLKTTVARFDGKQIQISFVPVPERR >gi|226332993|gb|ACII01000026.1| GENE 42 37764 - 38255 278 163 aa, chain - ## HITS:1 COG:no KEGG:CKR_0690 NR:ns ## KEGG: CKR_0690 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 10 163 1 150 150 155 58.0 4e-37 MIHGIMGSFMLNGVGSSAGKLLAWIGVGILVVHTVIGVILTVQSLQTAKQSGKMYLKQNA IFWARRASGLAILILLFFHIGLFGKVQNGTYILFPFTTVKMVTQLLFVAAIFVHIFINIR PLLVSLGIISYKERRGDIYLILSVLLLFIAGAVIFYYIGWQYL >gi|226332993|gb|ACII01000026.1| GENE 43 38314 - 38832 455 172 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578327|ref|ZP_04855599.1| ## NR: gi|253578327|ref|ZP_04855599.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 172 1 172 172 308 100.0 1e-82 MKRKLILLVVTIVFLVGFGVILHSPPSMIDAVTGATPKSKKAAQASAQLEGSYVLGINMM SDGLDNENTRNKLKELALDDSETNETDLMKTDISFRLYVSETDYPLVSYAKKLCDRLKQA GFSVDLKEYSNTMMLSRVVSGKYDVFLASDDFIDVTTLTQMDYMIMDSEEMR >gi|226332993|gb|ACII01000026.1| GENE 44 38849 - 41203 1899 784 aa, chain - ## HITS:1 COG:BS_yvgW KEGG:ns NR:ns ## COG: BS_yvgW COG2217 # Protein_GI_number: 16080402 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Bacillus subtilis # 78 783 4 698 702 728 54.0 0 MKRIFLLKGLDCPNCSAKIEKEVGELDGVQSSVVNLMKQTITVNVTQTAADTIASQIETI VYSHEPDVEVQEETVMNVTKSYSLKGLDCPNCSAKIEKEVGELDGVQSSVVNLMKQTLTI NVAQTAADTIASQIETIVHSHEPDVEVSEIVQESYIPEKKQEANESYNNEDKKLTVRLAT GAAIYAIGMALTVFAKVPLPIELAFLIVSYVILGGDIVWQAVRNISKGRVFDEHFLMSVS TIGAFVIGEYPEAVAVMLFYQVGEFFQSLAVKRSRKSISDLMDIRPDSATVRRNGELITI SPENVSIGEIIIVKPGEKIPLDGVVLDGDSMLDTRALTGESVPRSVHKGDEALSGCMNQT GVLMIKTTKAFGESTASKIIDLVENASSRKAPTENFITTFARYYTPVVVILAAFLAILPP IILGGGWTEWIRRGFVFLVVSCPCALVISIPLTFFGGIGAASKRGVLVKGSNYLEALNNV SVIVFDKTGTLTKGVFNVTDILPANGFSKEQVLEYAAEAESFSNHPIAKSILAAYEKEID QSVISDYKEISGYGISVMAGEKKVFAGNTKLMDTECIEYTTCEKAGTKVYLAVDGQYAGC ILITDEVKPDSKKAISDLKHIGVEKTVMLTGDDEKIGKSVAEELQLDKYYAQLLPDQKVE KVELLDSKKRPGSKLAFVGDGINDAPVLARADVGIAMGGLGSDAAIEAADVVLMTDEPSK LVDAIEVAKATKQIVMQNIVIALGIKSVFLILGALGIAGMWEAVFGDVGVTIIAVLNAMR ILKK >gi|226332993|gb|ACII01000026.1| GENE 45 41263 - 41622 245 119 aa, chain - ## HITS:1 COG:FN0260 KEGG:ns NR:ns ## COG: FN0260 COG0640 # Protein_GI_number: 19703605 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 3 116 5 118 125 119 53.0 2e-27 MPKTSYICNCDVIHEDIVNDVKSKMQPKDDYIQLASLFKLFGDGTRVQILHALEQSEMCV CDLAVLLGVTKSAISHQLKALRLANLVKFRKEAQIVYYSLADDHVKEIIDKGFEHLWQK >gi|226332993|gb|ACII01000026.1| GENE 46 41754 - 42029 107 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|166030809|ref|ZP_02233638.1| ## NR: gi|166030809|ref|ZP_02233638.1| hypothetical protein DORFOR_00483 [Dorea formicigenerans 
ATCC 27755] # 1 91 55 145 145 165 98.0 7e-40 METRLSTKDFYKNAKKSILTNNQFYFDARHVYANAKRKLPCLKRLPELTNEMVCVLKNME TMTITYKLSVTNLEFTGSVTRKPKRYTGKKD >gi|226332993|gb|ACII01000026.1| GENE 47 42386 - 42487 64 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIEYNKIKNRNVLDVRASLKVGLQMEQGKLPSH Prediction of potential genes in microbial genomes Time: Sat May 28 19:24:11 2011 Seq name: gi|226332992|gb|ACII01000027.1| Ruminococcus sp. 5_1_39B_FAA cont1.27, whole genome shotgun sequence Length of sequence - 3284 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 45 - 104 5.7 1 1 Op 1 . + CDS 139 - 432 129 ## gi|253578331|ref|ZP_04855603.1| conserved hypothetical protein 2 1 Op 2 . + CDS 423 - 596 129 ## gi|153810199|ref|ZP_01962867.1| hypothetical protein RUMOBE_00580 + Term 620 - 656 -0.4 - Term 810 - 850 -0.3 3 2 Tu 1 . - CDS 975 - 1334 250 ## PROTEIN SUPPORTED gi|223038821|ref|ZP_03609113.1| 30S ribosomal protein S8 - Prom 1358 - 1417 6.2 + Prom 1912 - 1971 2.0 4 3 Tu 1 . + CDS 2025 - 3282 539 ## CD1102 putative mobilization protein Predicted protein(s) >gi|226332992|gb|ACII01000027.1| GENE 1 139 - 432 129 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578331|ref|ZP_04855603.1| ## NR: gi|253578331|ref|ZP_04855603.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 97 1 97 97 191 100.0 9e-48 MIIYWDKKTNSKNRIGQRVKELRKAHNLTQKTLAAKLQLAGYDFNDLAILRISSRSDGRS CKIIQLYCCPFYAKLWIDTSEQHTLFINRMIGVILWN >gi|226332992|gb|ACII01000027.1| GENE 2 423 - 596 129 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|153810199|ref|ZP_01962867.1| ## NR: gi|153810199|ref|ZP_01962867.1| hypothetical protein RUMOBE_00580 [Ruminococcus obeum ATCC 29174] # 1 57 1 57 57 84 100.0 3e-15 MELTKKVPTTDKNQENNHLLRMLDRGIDDMEAGRELPLEDAFRKITELRDARRNARI >gi|226332992|gb|ACII01000027.1| GENE 3 975 - 1334 250 119 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223038821|ref|ZP_03609113.1| 30S ribosomal protein S8 [Campylobacter rectus RM3267] # 1 114 2 116 118 100 39 1e-21 MKREEIFQYVKEQYGTEPEYLWKKDPDSAVLRHKNGKWYAIIMAVEKKTLGLEEDGKINI LDVKCDPDLVGMLIQTYGFLPGYHMNKRHWITILLDESVSEAKTLDFLDMSYDLIDSGK >gi|226332992|gb|ACII01000027.1| GENE 4 2025 - 3282 539 419 aa, chain + ## HITS:1 COG:no KEGG:CD1102 NR:ns ## KEGG: CD1102 # Name: not_defined # Def: putative mobilization protein # Organism: C.difficile # Pathway: not_defined # 1 416 1 408 449 266 37.0 1e-69 MAVTKIHGIKTTVDKAIEYICNPDKTDQNLYISSFACSPETAALDFKYTLDHTHDCRDPH NTNKAFHLIQAFSPGEVSYEEAHQIGRELADRLLEGKYSYVLTTHTDKGHVHNHLIFCSA DNITFSHYHDCKKNYWKIRNLSDTLCQEHNLSTIMPNGKKGMKYNEWAANKSESSKKTQL RKDINQTIRIVSTYSEFLTFMEAKGYEIKNAEFGENSRKYITFRSPDMSRPVRGSAKSLG KNFTKERIKERINNKLHRTTVPSVRNKLIDTNTPNIAGNIGLQKWANKENLKIVSAEYNK MLTHNLHNFSELEDRIALLHMQQKEVNTSVVSLESQIRHLREMLKYAKQYQKNKIYDDHY KSSKDPDRYFRKYESQIILFAGAEHILQENGMDLKHLNTNKLQEQLSDLISQKKSLNTQ Prediction of potential genes in microbial genomes Time: Sat May 28 19:24:30 2011 Seq name: gi|226332991|gb|ACII01000028.1| Ruminococcus sp. 5_1_39B_FAA cont1.28, whole genome shotgun sequence Length of sequence - 14163 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 7, operones - 4 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 181 - 357 63 ## CD1091 integrase - Prom 377 - 436 3.4 2 2 Op 1 . 
- CDS 470 - 1348 446 ## EUBREC_3527 site-specific recombinase 3 2 Op 2 . - CDS 1201 - 2241 113 ## COG1686 D-alanyl-D-alanine carboxypeptidase 4 2 Op 3 . - CDS 2285 - 2788 372 ## EUBREC_3554 hypothetical protein 5 2 Op 4 36/0.000 - CDS 2816 - 4009 719 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 6 2 Op 5 . - CDS 3990 - 4703 221 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 7 2 Op 6 . - CDS 4708 - 5541 312 ## EUBREC_3557 hypothetical protein - Prom 5610 - 5669 5.3 - Term 5577 - 5645 12.2 8 3 Op 1 40/0.000 - CDS 5700 - 6785 348 ## COG0642 Signal transduction histidine kinase 9 3 Op 2 . - CDS 6772 - 7473 442 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 10 3 Op 3 . - CDS 7479 - 7691 209 ## COG3655 Predicted transcriptional regulator - Prom 7714 - 7773 6.6 11 4 Op 1 . - CDS 7850 - 9040 790 ## EUBREC_3561 hypothetical protein 12 4 Op 2 . - CDS 9018 - 9212 183 ## EUBREC_3562 hypothetical protein 13 4 Op 3 . - CDS 9242 - 9316 65 ## - Prom 9418 - 9477 5.4 - Term 9450 - 9499 7.7 14 5 Op 1 . - CDS 9512 - 11461 1875 ## EUBREC_3563 hypothetical protein 15 5 Op 2 . - CDS 11451 - 12236 541 ## COG1192 ATPases involved in chromosome partitioning - Prom 12317 - 12376 4.2 16 6 Tu 1 . - CDS 12494 - 12934 242 ## CD0369 hypothetical protein - Prom 12962 - 13021 2.9 17 7 Tu 1 . 
- CDS 13045 - 13671 135 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component Predicted protein(s) >gi|226332991|gb|ACII01000028.1| GENE 1 181 - 357 63 58 aa, chain - ## HITS:1 COG:no KEGG:CD1091 NR:ns ## KEGG: CD1091 # Name: int # Def: integrase # Organism: C.difficile # Pathway: not_defined # 4 58 343 397 397 89 67.0 4e-17 MTHPHVMRHTFCTRLANAGMNPKALQYVMGHSNITITLNLYTHASLETVKSELQRFVA >gi|226332991|gb|ACII01000028.1| GENE 2 470 - 1348 446 292 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3527 NR:ns ## KEGG: EUBREC_3527 # Name: not_defined # Def: site-specific recombinase # Organism: E.rectale # Pathway: not_defined # 1 292 234 525 525 538 99.0 1e-151 MQFQRFDTIKKKHWSPTTVSAIVRDEIYIGTRIWGKTRCSMHTGHKAVLNDETEWVRLEN HHTAIIDRELFEKANEMHPKKKRSVAESRTNFTLERRKKQPALLICANCGHSLLKETEHL LKCSDARTNGDPVCRSLVIRREPMEENILGLVRQYAASMLKKGKKVSSKRQCEYKEINTT ELQKQSRQLTSEKMKLYDDYKDGRIDRDSYKQRAEKISVQLDEIKRKIEDAKNSKKLLEQ NELSDKIKLKDFLGIQKFDTEKLREVIKVIRVHSQDEIEIEWNFDDIFSEQR >gi|226332991|gb|ACII01000028.1| GENE 3 1201 - 2241 113 346 aa, chain - ## HITS:1 COG:CPn0672 KEGG:ns NR:ns ## COG: CPn0672 COG1686 # Protein_GI_number: 15618582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Chlamydophila pneumoniae CWL029 # 54 293 28 279 436 71 28.0 3e-12 MQRKRRKKKRFGVFLFLLVVCIGIGIMFVPWAMQPENFREVKNKINGFLYKEDFPDSYNA KSLILVDCSDDEIFVSKNENEPQIPASLAKLFVIEYASTLADLDSIVVADYGAISLTKPG SSVAQIKEKEYFLHNLFAAMLVPSGNDAAYVVADYCGGLLSPQAESCQERVRIFMENLNL HLQQQGYLDTVLYDPSGFDMEALTTTLDLKEVVYRLLEYSWFREIVSQNTYTATLPDGST QIWQNTNAFLDPTSEYYNENVCGIKTGSLSDDYNLIVLYQQHRRLLLVMSRNYQEGIRCS SRGLTRSKRNTGALRQYRQSSGMRFILEPESGAKHVAVCIRAIKQF >gi|226332991|gb|ACII01000028.1| GENE 4 2285 - 2788 372 167 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3554 NR:ns ## KEGG: EUBREC_3554 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 19 167 1 149 149 285 99.0 3e-76 
MKRTISKSERPYRLLLCVMISLLVIMLAGCSTSSDSDTNTRGFTDFATIEEEYLTTIESL NWPEGFTPPDALEGEDTGASFQIGYGDTRASNLWEYSWMQEWLDTYNTNSERAAKALAEL EKAFDMPYMGTDRCDDATRKYLRDNIDKAKLGDPSGFTECIQANYAD >gi|226332991|gb|ACII01000028.1| GENE 5 2816 - 4009 719 397 aa, chain - ## HITS:1 COG:FN0828 KEGG:ns NR:ns ## COG: FN0828 COG0577 # Protein_GI_number: 19704163 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Fusobacterium nucleatum # 1 397 6 406 408 79 25.0 1e-14 MLKMILKDLRLSPLRSILTSVSMLVGIIAMLGSVLVGTLGREYLISVNAQVYGWSPTYSF VITESDFHDRNKMEQLFQRFEAIDDVAAVTFSMGEDIRFAPMKDLTPIPPNDVYQNLMAF DLVCTTEAYSQVYNLPMTSGRWLEPSSEGGSLEVVINKEAKNYFNDSPYAAGNVKSTLSL TPFNIVGVVNDGRDFPTIYADSAAILNFAPAMWQVQNANVYWHPTTGLTIEQIHSALGDI LTDTIGGYWESAGRSDIGDTYDSVLSILQLGLLVTSLLLLFVSVLGQINIGLSSLEQRTH ELLIRRAIGASRTNIVALVLGSQLILSIFVCLAAILISLILVHCIGALLPVDSPVGTPSY PISVAVVAVAVSVLTALLGGLLPALKAAKLEPALALR >gi|226332991|gb|ACII01000028.1| GENE 6 3990 - 4703 221 237 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 1 230 261 498 563 89 27 1e-17 MKGISEKTLIKIKDLKASVKLNNGDMLTTVTNANMELQRGRSYAIVGKSGSGKTSLISII GLLNREYEGEYLYDGISISALKDRDLSILRANNIGFVFQNYSLIKHLRVWENIELPLLYA KKSFTAKQRHEIITGLLKSVGLESKENDYPINLSGGEQQRVAIARALAVSPEAILCDEPT GALDKKTGTQIMELLHSVVKENGIMLLLVTHDPDIADTCDTIFEMDGGRITCAKNDT >gi|226332991|gb|ACII01000028.1| GENE 7 4708 - 5541 312 277 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3557 NR:ns ## KEGG: EUBREC_3557 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 277 9 285 285 466 99.0 1e-130 MLLCSILFLVFTLSACSAIGQTDENNLTDSATSAEKTVSVVRGTITPTVSTQTTIVPAVP FIISSPENGIFNTAVELEEKITAGQIIGTVNGKELKSPVDGTITSIAPSNESVPSNYPVA IVHYTGFALNVEADNFLSTLPEYAELKAKFQVYDGVGPTDMIAVVSPAADENAFTGIVPQ EGILQCLISQTVDVKSGQSATVVITATTRNDVLILPLSVIAGRQGTGLVTVITPNGERVE TKVTLGVTDGANIEILSGLEEGDVVSATPPNLDPRGI >gi|226332991|gb|ACII01000028.1| GENE 8 5700 - 6785 348 361 aa, chain - ## HITS:1 COG:CAC0565 KEGG:ns NR:ns ## COG: CAC0565 COG0642 # Protein_GI_number: 15893855 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 39 350 178 489 499 162 36.0 7e-40 MKLNKKEKNYLLSTLFKGYVIQCAICSVLIFGGTFLLGIMAEEILRKYVLYYYLYYRTVY LYIVAVIVWGGCIIYMTYLLLKKVVAYVYEVQAATGKMFDQNVSYIEMSPELSEIAANIN QLKQEAESNARLAKENEQRKNDLIMYLAHDLKTPLSSVIGYLTLLRDESQISKELREKYL SITLGKAERLEDLINEFFEITRFNIYDITLQYTKINLTRLLEQLVYEFKPMLKSKNLQCN LCVDDDIMLRCDADKIQRVFDNLLRNAVIYSFENTDITISAQCQEDTVSIIFCNHGDTLP EEKLNRIFEQFYRLDAARSTSSGGAGLGLAIAKQIVELHNGTIVAESQEDQNKFSITLPL A >gi|226332991|gb|ACII01000028.1| GENE 9 6772 - 7473 442 233 aa, chain - ## HITS:1 COG:CAC0564 KEGG:ns NR:ns ## COG: CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 233 3 230 233 222 48.0 6e-58 MNKKILIVDDEKEIVDLLEVYLSNDGYSVYKCYNGLEAMKCIKQSQIDLAILDIMLPDID 
GFRLCQKIREKFYFPIIMLTAKIEDSDKIMGLTIGADDYITKPFNPLEVVARVKTQLRRY QSYNSPNLSQSEEKDEYDIRGLLINRVSHKCYLYGKEIALTPLEFSILWYLCEHQGKVVA SEELFEAVWKEKYLRNSNNTVMAHIGRLREKLNEPSKNPKFIKTVWGVGYEIE >gi|226332991|gb|ACII01000028.1| GENE 10 7479 - 7691 209 70 aa, chain - ## HITS:1 COG:SPy0544 KEGG:ns NR:ns ## COG: SPy0544 COG3655 # Protein_GI_number: 15674643 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 1 68 1 68 69 76 57.0 9e-15 MKVRYNKLWKLLIDKGMKKSQLREAVGASKSTFAKLGKNENVTLPVLLNICEYLECDFGD IMEAVPENEV >gi|226332991|gb|ACII01000028.1| GENE 11 7850 - 9040 790 396 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3561 NR:ns ## KEGG: EUBREC_3561 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 396 1 396 396 698 100.0 0 MDSIITVKTAQRAAQAARAAAKAAVVTVKAIAKATALAIKAIIAGTKALIAAIAAGGWVA VLVIIVICLIGMILGSVFGIFFSDEDSGTGMSMQTVVQEINTEYDTKLQEEKNSVSYDVL EMSGSRAVWKEVLAVYSVKTNTDQDNPQEVATMDDGKKQLLKDIFWEMNQISSRTESKTE TVITETDDGHGNIVETESTVTQTYLYITVSHKTAEEMAAQYGFNEEQKGYLAELLADENN YLWSQVLYGITGGDGQIVTVALSLVGNVGGQPYWSWYGFNSRVEWCACFVSWCANECGYI DAGVIPKYAGCVNGVQWFKDRGQWLDGSAEPAPGMIIFFDWADESGQDGLSDHTGIVQKV ENGKVYTVEGNSGDSCRVNEYSIGYYEILGYGAPAY >gi|226332991|gb|ACII01000028.1| GENE 12 9018 - 9212 183 64 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3562 NR:ns ## KEGG: EUBREC_3562 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 64 1 68 68 74 94.0 1e-12 MNDYDANGKKIGHSSPSFFGGMNHYDNNGHKVGESRPGFFGGLNHYDDKGHSNPGFLGGF NHYG >gi|226332991|gb|ACII01000028.1| GENE 13 9242 - 9316 65 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKKGYSRPGFFGSINHYDANGKK >gi|226332991|gb|ACII01000028.1| GENE 14 9512 - 11461 1875 649 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3563 NR:ns ## KEGG: EUBREC_3563 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 17 649 286 918 918 1130 100.0 0 MPSKKRATPISLQPLDGMQIKKEDYKLVYVGELSGNMSLDDIFERFNIDRPEDFRGHSLS 
VSDIIVLNDGEKVTAHFVDSISFEQLDSFLNLEEQVLSELAYEVGERYFAIQRTEEGYDY SFYDEDFRLMDGGVYENDQISIEEAAEELLEDEGWTGERIRGDYDQLMEKVEEMDEVVMA EIQKSQGEYKPLAKVEELEEANYNMIDNVLNNMPPKKEPYLEYFATECDEFHDMGAYEKS TDVNQIAAVYEKYRENPKTAYLGCSMGIIYRDPEDSYYDEAEFAIVKGNTVFGNLMDDVR FYGELALVREGIEKIHEALPDYKYVPMRDVREAMYPEKMTTEQLAEALDEIAEAFDPYEY RDNVERGENTVQEVMLDLRSGNIHSYISYLKDIVDEECDLSVRAGVLIERLKAYEPELPK DMEPMVYVNYCEESDLMNPRCQKLSELDAKTAEMDKEWYAKRDPKTEEPTKIAKIYVTVY YAEKGEQMLHHFKKSMDIGNGHGGIVSQLKYDNEMKLTDEYWINYQKGKGSEEFQKYMED LTDMQNHVLPYLQSFCNLEEKGVKERREQQIAERYEGRADERVTSTEANEVVKDVGKTDR KPAQQKQAVDGKDKKLSIHERLEINKRIIQEKQGKDKTERGVDLGVRTV >gi|226332991|gb|ACII01000028.1| GENE 15 11451 - 12236 541 261 aa, chain - ## HITS:1 COG:lin2923 KEGG:ns NR:ns ## COG: lin2923 COG1192 # Protein_GI_number: 16801982 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Listeria innocua # 1 257 1 250 253 186 41.0 3e-47 MARIISIVNQKGGTGKSACTANLAVGLAQKNMKVLIVDADPQSDVSAGFGYRDCDDSNET LTALMDAVMKDEDIPSECFIRHQAEGIDIICSNIGLAGTEVQLVNAMSREYVLKQILYGI KDQYDVVIIDCMPSLGMITINALAASDEVLIPVEASYLPIKGLQQLLKTIGKVRKQINPK LQVGGILFTMVDAHTNDARNNMELLRNVYGSQIHIFDNYIPFSVRMKEAVREGQSIFSYD PKGKATEAYRRVTEEVLKDAI >gi|226332991|gb|ACII01000028.1| GENE 16 12494 - 12934 242 146 aa, chain - ## HITS:1 COG:no KEGG:CD0369 NR:ns ## KEGG: CD0369 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 67 8 74 81 96 89.0 3e-19 MISLRKRDIKKALKAGAKAGGVSLADVRADIEATIDEAMDSADPEVQTNFKKYFGNKRPT PEEYIYKTMSDEVYQAFQNILKHRRKQKNLVIDGYSVFLFINRNGNPQVAVNYEAVFRKL VDKYNSKHEEPLPKITPHGRVIIRTS >gi|226332991|gb|ACII01000028.1| GENE 17 13045 - 13671 135 208 aa, chain - ## HITS:1 COG:BH0817 KEGG:ns NR:ns ## COG: BH0817 COG1277 # Protein_GI_number: 15613380 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Bacillus halodurans # 24 163 47 186 230 68 32.0 
1e-11 MILGFLYTFALIAYAGGEVDAIEFSTYENIWTLINALQTVAYAVLISVMFSTFIIKEYAG KLNLLLFTYPVDKSTLLKAKVLLVAFSGIGGMLLGSALIFLIFFITERIFPLVPDTFSFG NIIKIWSELLLCSVYALCVGIVATKIGFIKMSVQTTIVSAIIIASCSANVISTMSYTNFP FLFLWGIMAIISIISYCNIVHKTIRMDI Prediction of potential genes in microbial genomes Time: Sat May 28 19:25:17 2011 Seq name: gi|226332990|gb|ACII01000029.1| Ruminococcus sp. 5_1_39B_FAA cont1.29, whole genome shotgun sequence Length of sequence - 52486 bp Number of predicted genes - 48, with homology - 48 Number of transcription units - 23, operones - 14 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 6.8 1 1 Tu 1 . + CDS 94 - 465 252 ## COG3682 Predicted transcriptional regulator + Prom 876 - 935 5.7 2 2 Tu 1 . + CDS 1160 - 2272 435 ## COG2602 Beta-lactamase class D + Term 2280 - 2323 8.2 - Term 2119 - 2170 3.1 3 3 Op 1 . - CDS 2300 - 3547 492 ## gi|253578355|ref|ZP_04855627.1| conserved hypothetical protein 4 3 Op 2 . - CDS 3534 - 4691 648 ## gi|253578356|ref|ZP_04855628.1| conserved hypothetical protein 5 3 Op 3 . - CDS 4732 - 6831 1530 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 - Prom 6860 - 6919 8.2 6 4 Tu 1 . - CDS 7559 - 7804 324 ## gi|253578359|ref|ZP_04855631.1| conserved hypothetical protein - Prom 7924 - 7983 7.9 + Prom 7964 - 8023 4.9 7 5 Tu 1 . + CDS 8095 - 9201 457 ## COG0471 Di- and tricarboxylate transporters - Term 9240 - 9295 12.2 8 6 Op 1 . - CDS 9319 - 10086 702 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain 9 6 Op 2 1/0.000 - CDS 10115 - 10585 472 ## COG0716 Flavodoxins - Prom 10615 - 10674 6.0 - Term 10614 - 10661 8.1 10 7 Op 1 1/0.000 - CDS 10712 - 11590 424 ## COG0583 Transcriptional regulator - Prom 11709 - 11768 4.3 11 7 Op 2 . - CDS 11812 - 12132 345 ## COG1694 Predicted pyrophosphatase - Prom 12152 - 12211 2.7 12 8 Op 1 . - CDS 12225 - 14042 1019 ## Cthe_1185 hypothetical protein 13 8 Op 2 . 
- CDS 14039 - 14989 569 ## COG0657 Esterase/lipase - Prom 15034 - 15093 7.8 - Term 15066 - 15120 11.1 14 9 Op 1 . - CDS 15131 - 16279 1009 ## CPF_2665 isoaspartyl dipeptidase (EC:3.4.19.5) 15 9 Op 2 . - CDS 16312 - 17832 1673 ## COG1288 Predicted membrane protein - Prom 17883 - 17942 6.5 - Term 17882 - 17929 -0.3 16 10 Tu 1 . - CDS 17970 - 18959 726 ## Fisuc_0324 RelA/SpoT domain protein - Prom 18990 - 19049 4.0 17 11 Op 1 24/0.000 - CDS 19060 - 19647 434 ## COG1116 ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component - Term 19659 - 19695 0.5 18 11 Op 2 21/0.000 - CDS 19719 - 20510 534 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 19 11 Op 3 . - CDS 20485 - 21459 1103 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 21481 - 21540 3.5 20 12 Op 1 . - CDS 21550 - 22398 798 ## Elen_2053 cell wall hydrolase/autolysin 21 12 Op 2 . - CDS 22400 - 23254 1323 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase - Prom 23362 - 23421 8.0 - Term 23344 - 23384 -1.0 22 13 Op 1 . - CDS 23427 - 24056 865 ## COG3404 Methenyl tetrahydrofolate cyclohydrolase 23 13 Op 2 . - CDS 24149 - 24817 654 ## CDR20291_3182 hypothetical protein - Prom 24863 - 24922 4.9 24 14 Tu 1 . - CDS 24924 - 26864 1918 ## COG3894 Uncharacterized metal-binding protein - Prom 26894 - 26953 1.7 - Term 26896 - 26939 8.1 25 15 Tu 1 . - CDS 26998 - 27783 1096 ## COG1410 Methionine synthase I, cobalamin-binding domain - Prom 27810 - 27869 6.4 - Term 27833 - 27872 10.5 26 16 Op 1 5/0.000 - CDS 27885 - 29243 1653 ## COG1456 CO dehydrogenase/acetyl-CoA synthase gamma subunit (corrinoid Fe-S protein) 27 16 Op 2 . - CDS 29266 - 30210 1298 ## COG2069 CO dehydrogenase/acetyl-CoA synthase delta subunit (corrinoid Fe-S protein) - Prom 30275 - 30334 6.3 28 17 Op 1 4/0.000 - CDS 30645 - 32771 2640 ## COG1614 CO dehydrogenase/acetyl-CoA synthase beta subunit 29 17 Op 2 . 
- CDS 32798 - 33562 1000 ## COG3640 CO dehydrogenase maturation factor 30 17 Op 3 . - CDS 33607 - 35499 2164 ## COG1151 6Fe-6S prismane cluster-containing protein - Prom 35545 - 35604 9.0 - Term 35719 - 35758 7.5 31 18 Op 1 . - CDS 35766 - 36536 989 ## COG3640 CO dehydrogenase maturation factor - Prom 36558 - 36617 5.1 - Term 36579 - 36622 -0.7 32 18 Op 2 . - CDS 36629 - 36811 325 ## gi|253578386|ref|ZP_04855658.1| conserved hypothetical protein 33 18 Op 3 . - CDS 36825 - 37220 414 ## gi|253578387|ref|ZP_04855659.1| conserved hypothetical protein - Prom 37300 - 37359 6.5 + Prom 37259 - 37318 10.3 34 19 Tu 1 . + CDS 37566 - 38393 788 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase + Term 38436 - 38485 11.2 - Term 38487 - 38521 0.0 35 20 Op 1 1/0.000 - CDS 38584 - 39603 1201 ## COG0240 Glycerol-3-phosphate dehydrogenase 36 20 Op 2 2/0.000 - CDS 39618 - 40262 555 ## COG0344 Predicted membrane protein 37 20 Op 3 1/0.000 - CDS 40265 - 41590 1754 ## COG1160 Predicted GTPases 38 20 Op 4 . - CDS 41553 - 42935 1273 ## COG1625 Fe-S oxidoreductase, related to NifB/MoaA family - Prom 43126 - 43185 9.4 39 21 Tu 1 . + CDS 43251 - 44420 1350 ## COG0462 Phosphoribosylpyrophosphate synthetase + Term 44438 - 44493 17.3 - Term 44426 - 44481 19.1 40 22 Op 1 . - CDS 44550 - 45401 662 ## COG1496 Uncharacterized conserved protein 41 22 Op 2 1/0.000 - CDS 45462 - 45866 197 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 42 22 Op 3 8/0.000 - CDS 45937 - 46692 751 ## COG0164 Ribonuclease HII 43 22 Op 4 2/0.000 - CDS 46689 - 47534 1045 ## COG1161 Predicted GTPases 44 22 Op 5 5/0.000 - CDS 47541 - 48185 493 ## COG0681 Signal peptidase I 45 22 Op 6 1/0.000 - CDS 48220 - 48567 464 ## PROTEIN SUPPORTED gi|238916980|ref|YP_002930497.1| large subunit ribosomal protein L19 - Prom 48593 - 48652 1.6 46 22 Op 7 . - CDS 48695 - 49570 864 ## COG0024 Methionine aminopeptidase - Prom 49638 - 49697 5.3 47 23 Op 1 . 
- CDS 49779 - 50711 1040 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins 48 23 Op 2 . - CDS 50727 - 52367 1742 ## EcE24377A_0285 hypothetical protein - Prom 52425 - 52484 4.3 Predicted protein(s) >gi|226332990|gb|ACII01000029.1| GENE 1 94 - 465 252 123 aa, chain + ## HITS:1 COG:CAC3438 KEGG:ns NR:ns ## COG: CAC3438 COG3682 # Protein_GI_number: 15896679 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 5 123 6 124 124 86 39.0 9e-18 MSLPQISEAEFEVMKIVWKYAPINTNEITEKLTLTTSWSPKTIQTLIKRLVSKKALSYEK QSRVFVYTPLVQEDEYIRQESNSFLKRYYNGNLSSMLASYLEDDKLSSTEIDNLRHLLSK HQK >gi|226332990|gb|ACII01000029.1| GENE 2 1160 - 2272 435 370 aa, chain + ## HITS:1 COG:SA0039_2 KEGG:ns NR:ns ## COG: SA0039_2 COG2602 # Protein_GI_number: 15925746 # Func_class: V Defense mechanisms # Function: Beta-lactamase class D # Organism: Staphylococcus aureus N315 # 134 368 18 251 252 198 41.0 1e-50 MVISNILYWFNPFVWYTLKEILCDREIACDSAVLQMISADEYQAYGTTLINFAEKISSFS SPLAVGMSGNFRQMKRRILNIAVFRKETLYQKMRALIIYLVISAVFIGCTPILSIDASTQ SVYHFSDTDKNISLLDISETFGTYDGSFVLYDNNLDSWKIYNLEEATKRIPPESTYKIYD ALLGLESGIITPEHSSMTWNGDAFPFPSWEADQDLNSAMQNSVNWYFQAIDSQLGINRVQ GFLNKIEYGNQTTSSNLDLYWSDFSLKISPLEQVTLLKKFNTNEFHLNSQNVFSVKNAIK IASTPNGNFYGKTGTGRVNGQDINGWFIGYVETSDNSYYFATNIQSDSNATGKKAFEITS DILKKLHIWN >gi|226332990|gb|ACII01000029.1| GENE 3 2300 - 3547 492 415 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578355|ref|ZP_04855627.1| ## NR: gi|253578355|ref|ZP_04855627.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 415 1 415 415 827 100.0 0 MEIIDMSTKRKLKKMVSVLFILGCFFIGNTKCKGADLEYISQETANYAVQERGYDLPVDE VVKEEAIEDCKNVMNQMKAIYQKADKGTSSNVVVSETVMEEMQEVLKEKNVPVITSAPYS NMANYSKMEEFLFRAEQDLTGDIVLYRINRDGGIERLKFNYDGTDMYLLAVKAVWGMNDN LSIVYVSYTRIEEWKYTEKGWFGYTLCVPKYPEVSEAVDGSSMIRIKPLSDECREVSKKC VYLLGYQGNNLLCSDWDRSDMEGLDYNGLYEYLYRMKYGERYEFSGNSSGIPAEEFENLI MEFLPITAEQIKKWAVFDSEHQTYDWERLGCLNYSPTYFGTSLPEVVEIRDSGEGNNVLV VDAVCDTFICNDAVITSELTVKFNDDKSFKYMGNKILNNGTKEVPKYQYRIKRKN >gi|226332990|gb|ACII01000029.1| GENE 4 3534 - 4691 648 385 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578356|ref|ZP_04855628.1| ## NR: gi|253578356|ref|ZP_04855628.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 385 1 385 385 731 100.0 0 MKKIWKIFWGVIFLAVCSGCGIKKEQKKTIEDTKEKIYRECEMLAEDYQSIYENAAKENA LYELSTIQKSMDYFGKYGYAVIDSYNQLDMVQSIKVDDFLKKAEKEKNGKTTIFQVIAGD HFIRYDLKTKQGKIDVEVSSFKWKEDTWQETYYHEFRANSWKYTENGYFFMEEYHPAGYD GPSGYRAFRVVPLNKKCRELNRKYILPFGYTLNKLFTSNWSEKNYDGINFYDVFDRLLSM EEKTDEFKEGKTYEIPKESFEAIFQKYFNISAEILQTGTVFHTETQTYRYRTRGIVYDFA PTPYIPYPEVVSYIENQDGTITLEVNAVWPQKELDQAFCHSVTIRLLDKDRFQYVSNYVS RSEIEVTWYTERLLDEKWEECYGDN >gi|226332990|gb|ACII01000029.1| GENE 5 4732 - 6831 1530 699 aa, chain - ## HITS:1 COG:CAC3683 KEGG:ns NR:ns ## COG: CAC3683 COG0768 # Protein_GI_number: 15896915 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 37 695 25 669 671 520 42.0 1e-147 MRKTKRKKHTKTITKIVLFSGILIGGGIGIVTIMNRNVPEKRLMEYMKYIEKGEYEQMYA MLDQKKSSMNSKEEFIERNSKIYEGIEMSDLSITDITAKRKENGNAAVSYTTNMQTAAGN VEFTNNAVFSHNWTGYHLIWQDQLIFPELSATDKVQVTLEEAKRGNILDRNGRQLAGEGT ASSVGIVPGRMENREDTIKKLAEYLGIGADEIEDKLKADWVKADSFVPVATIPKIQEVDL LTVNPDETVLEEKEKQDTLLKIPGIMLSDVKVRTYYLKEAASHLVGYVQAVTAEDLQEHK GEGYRTNSVIGKTGLETLYEKELKGTDGCEICIVDANGNKKSVIAYEPKKDGEDIHTTID GDLQSTLYEQFKEDRGCSVALNPYTGEVLALVSTPSYDNNDFVRGMDNSQWSALNENEDR PLYNRFRQTWCPGSTFKPVIAAIGLKVGAFTANDDFGNEGLAWQKDSSWGDYTVTTLHDY 
APVILKNALIYSDNIYFAKAALKIGADQLMQSLNQIGFNQELPFDIKMSESQYSNTDKIE TEIQLADSGYGQGQILVNPLHLASIYTAFLNDGNMIKPYLHADGSTSSEIWIKDAFSPQI VLEVMEGLEGVVNNPEGTGYGACREDIRLAGKTGTAELKATKEDTSGTEIGWFTVFTTDR DTENPILLISMVENVKDIGGSGYVVEKDKAILDEYLGNE >gi|226332990|gb|ACII01000029.1| GENE 6 7559 - 7804 324 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578359|ref|ZP_04855631.1| ## NR: gi|253578359|ref|ZP_04855631.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 81 1 81 81 140 100.0 3e-32 MMNLAMNFDTDECSVTAMFDKGNRNDTMEAIDHIIPFLKGDADMIVLVCNTLRKLFCMSD EGYETFLMDLEDYKSELEEGE >gi|226332990|gb|ACII01000029.1| GENE 7 8095 - 9201 457 368 aa, chain + ## HITS:1 COG:L20481 KEGG:ns NR:ns ## COG: L20481 COG0471 # Protein_GI_number: 15673760 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Lactococcus lactis # 36 363 59 383 392 112 29.0 1e-24 MSIAVILALLSMFWIRPDKVYFNYIDFRTLGLLFCLMSIVAGLKAIGVFDMLAKKLLMGT SGTVGVIRLLVLLCFFLSMVITNDVALITFVPLALIIVHKLPKELTNYWLLKIVAMQTIA ANLGSMLTPIGNPQNLYLYARAGMSAAELITLMLPYSATALILLLIWIQVAAAKAPHVCG SEKDKTLLGFSDRKELNMEYLAAYLILFTICLLTVARIIPYQIPLVLVLIYMLLRNRENI SRVDYSLLATFIALFIFIGNLGRIPQFSSFLERIMAGRETLTAVLASQIMSNVPAALLLS GFTDNYRALIVGTNIGGLGTLIASMASLISFKYIAKENRNLRGKYLGIFTASNIIFMIFM LILYFFLR >gi|226332990|gb|ACII01000029.1| GENE 8 9319 - 10086 702 255 aa, chain - ## HITS:1 COG:MA0416 KEGG:ns NR:ns ## COG: MA0416 COG1917 # Protein_GI_number: 20089309 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Methanosarcina acetivorans str.C2A # 123 255 8 141 141 134 50.0 2e-31 MGKIVQTAGRNTLGEFAPEFAHFNDDVLFGENWNNQDIDVKTRSIITVVALMASGITDSS LRYHLQNAKNHGVTQKEIAAVITHAAFYAGWPKAWAVFNLAKEVWETGEGDLPYEEEAMR AHAKEMVFPIGAPNDGFAQYFSGRSFLAPISTSQVGIFNVTFEPGCRNNWHIHHAKSGGG QILVCVAGRGYYQVEGKEAVEMKPGDCINIPAEVKHWHGAAPDEWFSHLAIEVPGENSSN EWLEPVSDEEYRKLK >gi|226332990|gb|ACII01000029.1| GENE 9 10115 - 10585 472 156 aa, chain - ## HITS:1 COG:YPO2003 
KEGG:ns NR:ns ## COG: YPO2003 COG0716 # Protein_GI_number: 16122245 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Yersinia pestis # 13 156 85 231 235 100 35.0 8e-22 MSKKLVAFFSASGTTKKVAQMIAEEAKADLFEIEPKVPYTKLDLDWMNKKSRSSVEMSDK KYRPAIMKKEMDMSSYDEILLGFPIWWYVAPTIINTFLEAYDFSGKKIVLFATSGGSGFG NTVKELQSSASDAVITEGRLLKCGTKQEITEWVNSL >gi|226332990|gb|ACII01000029.1| GENE 10 10712 - 11590 424 292 aa, chain - ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 287 1 285 291 177 34.0 3e-44 MELRLLRYFLTVAKEQSFTKAAEQLHITQPTLSRQMAAFEEDLGITLFIRNGKKISLTDE GILLKRRALEILNLEERTLEELKGKEEVVESTITIGCGEFAAVETLAKICKTYKEKYPLV QIVLHTATADAVYEMMNKGLVDIALFMEPVDTEGLDYIRITDCDHWCVGMRPDDPLAEKE FIKKEDLIGKPLILPERMNVQSELANWFGKDFSKLQIAFTSNLGTNAGVMAANGLGYPIS IEGAAKYWREDILVQRRISPEIKTSTVIAWRRNIPYSLAVRKMIEEINAFQA >gi|226332990|gb|ACII01000029.1| GENE 11 11812 - 12132 345 106 aa, chain - ## HITS:1 COG:BH3997 KEGG:ns NR:ns ## COG: BH3997 COG1694 # Protein_GI_number: 15616559 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Bacillus halodurans # 6 93 8 99 101 78 51.0 2e-15 MKYETIDRIRKFTEDRNWDQFHSPANLAKSIVIEAAELLECFQWSDEEYDLQHVKEELAD VLVYSQNLLDKLELDADEIINMKMSQNEAKYPVDKAKGSAAKYDQL >gi|226332990|gb|ACII01000029.1| GENE 12 12225 - 14042 1019 605 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1185 NR:ns ## KEGG: Cthe_1185 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 4 415 5 423 426 302 38.0 3e-80 MSKKKARWRKLDNAAKLYSAASNKKDTRVFRFYCELKEEVNPDVLQEALNQTIETFPTFL MVLRKGLFWHYLEPCNLRPIVKEEYKEPCSRLYIKDKKTLLFEVTYYKKRINFEVFHVLT DGTGATEFLKELVKNYLYLIHKVNGLEPVSLLPEDMTVQDQEVDSFLKYYSKDQKRPEKR KLHAFQIRRKKKDGNHLHVHESVVSVQAVLKRSRELGVSMTVFLTALFMMAINEEMSKMQ KKKPVVLMVPVNLRKFFPSLSMLNFFNWIEPGYNFTTQDQSFEAILKYTKEFFETELTKE 
KMSAHISELLALELHPILRLAPLELKNLCIQAGAKYSEKNTTAIFSNMSAVKMHASYVPY IERFGVYTNTPKFELCLCSFQDKLSFAFTSRYDTVNVERNFYRLLKEQGIASEKVKPEFP KTNEPSEQEMKVYKIYSFLCIAIVAAMLVTEYNFHPRIRWTLFTAGGVVTMWIASSIGFF KRYNLLKNAMWQLFIGTIICFIWDALTGWHSWSVDLVLPIMSVSTLTAMFVIAKVRKCPV REYLIYEIMAAGYGLILPGILLLCKVVKNPTVSMFGALICFLFLVAVILFKGREFKEEMQ KNLHV >gi|226332990|gb|ACII01000029.1| GENE 13 14039 - 14989 569 316 aa, chain - ## HITS:1 COG:SSO2517 KEGG:ns NR:ns ## COG: SSO2517 COG0657 # Protein_GI_number: 15899253 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Sulfolobus solfataricus # 53 292 5 228 251 171 39.0 2e-42 MINQTAKKILKALSFDGVEVEAFRHMADLKRLDPMKIFYKKIDEKIYNGSHAVPIRIFFP TEKSFRESEEKKHNIQDKKMLLFIHGGGWVTESIDNYERICARLANATGQYVVAVEYRLA PEDKFPAGLEDCYAVAKTLYSGDFMLNIKPENITLIGDSAGGNLCAALSLMARDCGEFMP KRQILIYPATYNDYTENSPFPSVKENGTDYLLTAGKMQDYIDLYARNKEDKKNPYFAPYI AEDLANQPDTLILTCEFDPLRDEGEAYGKRLKEAGNYVEIHRIKDALHGYFALGIKQLHV QESFTYINEFLKEETL >gi|226332990|gb|ACII01000029.1| GENE 14 15131 - 16279 1009 382 aa, chain - ## HITS:1 COG:no KEGG:CPF_2665 NR:ns ## KEGG: CPF_2665 # Name: iadA # Def: isoaspartyl dipeptidase (EC:3.4.19.5) # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 1 379 2 376 389 366 50.0 1e-100 MKLIQNIDVYAPQHLGKKDVLIINDKIVKIKDAGSICADGFLSEAEMINGEGLLLTPGFI DCHVHVLGGGGEGGFANRTPEATMEGLTKFGVTTVVGCLGTDGIGRDMCALVAKTKGLNE QGMSAYCYTGSYQIPVHTLTDSIVKDIMMIQEIIGTGEIAISDHRSSQPTFEEFARVVAD TRLGGVLSGKAGIVNVHLGDSPRCLDLIERVVDETEIPASQILPTHINRNEMLFGKSIEY ALKGGAVDFTGNEDIDYWETICDEVRVCNGIKRMLDAGVNPDRMTISSDGQGSLPMYSSD GEFLGMGVGQSSCLLKEVKECVFKTEIPLEIAISTITSNPADILRLKGKGKVEEGYDADL CILDQELQLVEVIAKGNTVYTK >gi|226332990|gb|ACII01000029.1| GENE 15 16312 - 17832 1673 506 aa, chain - ## HITS:1 COG:FN0023 KEGG:ns NR:ns ## COG: FN0023 COG1288 # Protein_GI_number: 19703375 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 14 506 2 498 499 509 56.0 1e-144 MSKKEGKGLKSFNKGFQVPDTYIIIFLVVVVAALLTFLVPKGFYETQDISYMINGVEKTR 
TVIKDGSFQYLTDDAGNVVTEGVALFSGDGGTGFFNYMYNGIVSSSAIEIIAFLMVVGGA FGIMIRTGAIESGLIGLIRKAKGAEKLLIPILFVLFSLGGAVFGMGEEALPFTMILCPLF VAVGYDSVIAVLVTYVATQIGFGSSWMNPFSVGIAQGIAGIDVFSGAGFRMVMWVVFTAL GCGMTMFYASKIKKTPTISIAYKTDAYFREQNEKTGIDEGHSFGLGHILVLLTLAATVVW VVWGVMTQGYYMPEIATQFFIMGIVSGVIGVIFKLNDMKLNDIATSFKDGAKDLIGAALV VAMAQGIMQVLGGSDPTTPTVINTIMYNISNALSGVSGAVAAVLMYLFQSVFNFFVVSGT GQAAITMPIMAPLSDLLGVSRQTAVVAFQLGDAFTNLIVPTSGCLIGSLAIAKIEWSNWI KFMWKFLGVLMIGAIITILIAVGTGF >gi|226332990|gb|ACII01000029.1| GENE 16 17970 - 18959 726 329 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_0324 NR:ns ## KEGG: Fisuc_0324 # Name: not_defined # Def: RelA/SpoT domain protein # Organism: F.succinogenes # Pathway: not_defined # 1 329 1 328 328 436 65.0 1e-121 MITLNDYLYSGDTVLKILQKYTRDLRKEAKLTHNEIDLMHVNFLIQITELLEHNDFLTAQ SQKIREFYKYMAHEYPFLAFTFKGRIKSLIRAEEKFNGYIVEYIYDYYTKHGTYPPVSEL KNKLSCFRDFIAYRIVLCMPKCHLKPGEDQETASIRYLYEIANVLPGFLEERGFTVESAY GVKKSTSPLLNEDVKTYYRDYICGTSGEGYQSLHITVYDNSSRSFMEVQLRTQKMDDTAE IGPANHLGYEKKQQDERARRDAIPEGECVYFDEAYERGMRLLQLELADLDVNMFSAVDNS LINDGCGLFRGRLILPYEHLSRFQNDVID >gi|226332990|gb|ACII01000029.1| GENE 17 19060 - 19647 434 195 aa, chain - ## HITS:1 COG:AF0638 KEGG:ns NR:ns ## COG: AF0638 COG1116 # Protein_GI_number: 11498246 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component # Organism: Archaeoglobus fulgidus # 5 189 9 194 235 124 39.0 1e-28 MDIKVDHISKAYGEQQVLQDLSCVFPEGKTTCIRGRSGCGKTTLIRLLLRLDVPDKGKIE GVGDRKLSAVFQEDRLCENLSAASNIRLVCTETITDRELEEAYKAVALTEIWQKPVRELS GGMRRRVSILRALLAESDCVIMDEPLRGLDEKTRAKTIDYILKKTEGKTLIFVTHEEKEA VWLKADRTVDILTKH >gi|226332990|gb|ACII01000029.1| GENE 18 19719 - 20510 534 263 aa, chain - ## HITS:1 COG:AGpT116 KEGG:ns NR:ns ## COG: AGpT116 COG0600 # Protein_GI_number: 16119871 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 
20 241 66 288 313 85 26.0 1e-16 MMHSITNDRAKTDNGTKYTGIRILAVLFWIAIWQFASMYLKQEILLASPVSVIRKLFELI FSGNFWQSVGFSFVRIVTGFLLAVLLGIFLAVWAYWSKTVEILTAPVIAVVKSTPVASFI ILCLIWIPSKNLSVFISFLMVLPVIYTNILEGIRQTDRKILEMAKVFRVNLRRQIRYIYV SQVLPYFLSACRLSLGMCWKAGVAAEVIGVPSGSIGEKLYNAKIYLNTPDLFAWTIVIIV ISFVFEKCFLGIVSRVVYMIEHR >gi|226332990|gb|ACII01000029.1| GENE 19 20485 - 21459 1103 324 aa, chain - ## HITS:1 COG:TM0202 KEGG:ns NR:ns ## COG: TM0202 COG0715 # Protein_GI_number: 15642975 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Thermotoga maritima # 64 323 48 300 300 82 25.0 8e-16 MKRKLMICAMALCICAGLTGNTGEVSVWASEGSSSDAVRIGALKGPTAMGMAQLLDEDGY DFTIAASPDEIVPMVVQDKLDIAAVPANLAATLYQKTDKDVSVLAVNTLGVLYLVENGDS VKSVEDLKGKTIYASGKGATPEYALNSVLKANGIDPEKDVTVEFKSEHAEVVSALVQDQT AVGLLPQPFVTTALMKNDKLKVALDLNKLWEDSMDDGSKLVTGVVIANNEFVQDHADKVN DFMDAYKESVDFVNSDTEAAAQIIGDHDIIAKEVAQKAIPDCSIVFIEGDKMKTMLSGYL ATLDDQNPEIIGGQLPDDAFYYKR >gi|226332990|gb|ACII01000029.1| GENE 20 21550 - 22398 798 282 aa, chain - ## HITS:1 COG:no KEGG:Elen_2053 NR:ns ## KEGG: Elen_2053 # Name: not_defined # Def: cell wall hydrolase/autolysin # Organism: E.lenta # Pathway: not_defined # 98 280 969 1150 1805 171 52.0 2e-41 MKKVPGKYFWLLMTITGALMFGAPALVSASEGIASTDTENSAADNAADTPDDDTIAKTTD SAEKNSQETDHSGEGQISKPKYLENAENNGEEIISDEDPTDNTYGYYTIMGGTTVSVDEM CSLYNSQGCTYPSEELSGGGASDIDTFCNIIVEEANAENVRGEVVFAQAMLETGWLSFGG DAGIDQFNFAGLGTTGGGVKGIAFPDVRTGIRAQVQHLKAYASTDSLNQECVDERFDYVT RETAPYVEWLGIQENPYGGGWAAGKDYGSKLRKILADLKSGI >gi|226332990|gb|ACII01000029.1| GENE 21 22400 - 23254 1323 284 aa, chain - ## HITS:1 COG:SP0825 KEGG:ns NR:ns ## COG: SP0825 COG0190 # Protein_GI_number: 15900713 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Streptococcus pneumoniae TIGR4 # 1 283 1 283 285 234 45.0 1e-61 
MAKQLLGKEVTAALNEKIKANVAELQGKGVNPTLCIIRVGENPSDISYERGATKRCETLG VACEKILLPEDVSQEELLATIDKVNKDDSIHGVLLFRPLPKHLDQAVIENALAPEKDVDC MTDLSMSGVFTGKKIGFPPCTPQACMEILDHYGIDCTGKKAVVIGRSLVVGKPAAMMLVK KNATVTICHTRTVDMPSVAREADIIIVAAGRAGVVGAEYVKEGQTIIDVGINVNAEGKLC GDVDYAAVEPIVDAITPVPGGVGSVTTSVLVGHVVEAAMRKVNA >gi|226332990|gb|ACII01000029.1| GENE 22 23427 - 24056 865 209 aa, chain - ## HITS:1 COG:SPy2084 KEGG:ns NR:ns ## COG: SPy2084 COG3404 # Protein_GI_number: 15675842 # Func_class: E Amino acid transport and metabolism # Function: Methenyl tetrahydrofolate cyclohydrolase # Organism: Streptococcus pyogenes M1 GAS # 1 205 1 205 208 150 42.0 1e-36 MGFSTSTCTEFVEVLASKAPVPGGGGASALVGAVGTALGNMVGSLTVGKKKYADVEDEMY ELKAKCDQLQKDFLRLIERDAEVFEPLSKAYGMPKETEEEKAEKARVMEIVLKDACSVPM EIMEKCCEAIELIVEFGAKGSKLAISDAGVGAAFCKAALQGASLNVYINTKSMADREYAE ELNRKADAMLEKYTKIADDTFNSVLGRLK >gi|226332990|gb|ACII01000029.1| GENE 23 24149 - 24817 654 222 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3182 NR:ns ## KEGG: CDR20291_3182 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 18 213 12 202 209 118 32.0 2e-25 MAFSLSDLNYEKDSKERMPWEHYTQEFATADPKEIASRLSIPYDEEAKKLTLTFLGTQYQ ITWPDFEVTHTPDDKGFYPLENMIYARILTIRFLLNGVKSESSGKFKTYREMPWGEVYLR QFDGRCIKRLAFSYGNRPGDFRAIMEHISAIPVKHGDIAYEVEIFPEYKIQMILWEGDEE FPPSSQILFSDNFPVSFQAEDMAVMGDVIIGSLKAYLKCVQK >gi|226332990|gb|ACII01000029.1| GENE 24 24924 - 26864 1918 646 aa, chain - ## HITS:1 COG:AF0010 KEGG:ns NR:ns ## COG: AF0010 COG3894 # Protein_GI_number: 11497631 # Func_class: R General function prediction only # Function: Uncharacterized metal-binding protein # Organism: Archaeoglobus fulgidus # 64 625 65 594 597 298 31.0 3e-80 MFKVTFSFEDGSMVETFANAGDNLLEVARSANVAIDAPCSGNGACGKCRVQLKSGELESK KTLHISDEEYQAGWRLSCCSKISADVNVLVPDIASAYKSRMKVADLSSKEEIAIFENAKS DIQLAGIELKNSLEVVDVLMDVPSLDDTMPDNERLTRALRKYLNINRVRIPYVVLKKLPD VLRENNFAVKCVIRATSDDMYVYDIFGKDEDVVIGGLAIDIGTTTVSAVLINMENGEILA KSSAGNGQIRFGADVINRIVESQKPGGQKKLQDAVIKETINPMIHEMCKSAKFPKDHIYR 
MCVASNTTMNHLFAGINADPLRTEPYIPAFFKTNSLFASDVGVDINKDAHIIMAPNIGSY VGGDITAGTLVSQIWNRPEFSLFIDLGTNGELVFGNSDFMMSCACSAGPAFEGGDISCGM RATDGAIEACTIDKETMEPTYKIVGDPGTKPVGLCGSGIIDVISELYICGIINPKGKFIR EGKRIKHDKYGMGSYILAFEEEAGSVKDVEITEVDIDNFIRAKGAIFSAIRTMLTSLDFD VSMIDDVYVAGGIGSGINMQNAVNIGMFPDIPIEKFHYIGNSSLTGAYLMLLSTPAEKKT YELAANMTYMELSTVPIYMDEFVGACFIPHTDTSMFPTVMEEIQNR >gi|226332990|gb|ACII01000029.1| GENE 25 26998 - 27783 1096 261 aa, chain - ## HITS:1 COG:SMc04342 KEGG:ns NR:ns ## COG: SMc04342 COG1410 # Protein_GI_number: 15965739 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Sinorhizobium meliloti # 4 250 21 270 320 102 30.0 7e-22 MAKFVTIGERLSTTAPAVNKAFTERDPEPIIKRAKQQLDAGATYLDVNIGPAENDGEDLM KWAVQLLQSEFDNVPLALDTTNKKAIEAGISVYNRAKGKPIVNSADAGDRITNIDLAAAN DAICIALCSSGMVAADNDERMMHCQNMLERGMSLGMEAEDLWFDPLFLVVKGMQDKQMEV LEAIKMFSDMGLNSTGGLSNNSNGMPKHIRPIMDSALVAMAMMNGLTSAIVNPNDLRLME TIKSCDVFKGNTLYADSYLEI >gi|226332990|gb|ACII01000029.1| GENE 26 27885 - 29243 1653 452 aa, chain - ## HITS:1 COG:MK0723 KEGG:ns NR:ns ## COG: MK0723 COG1456 # Protein_GI_number: 20094160 # Func_class: C Energy production and conversion # Function: CO dehydrogenase/acetyl-CoA synthase gamma subunit (corrinoid Fe-S protein) # Organism: Methanopyrus kandleri AV19 # 3 444 4 453 462 255 34.0 1e-67 MALNGIQIFKLTPKKNCKECGCPTCMAFSMKVAQGAMKIEQCPHMSDEALASLSEATAPP MKTIKIGTGAEEHTLGGETVLFRHEKTFVSKPRYAVALCTCMDDATVEAKLAEIPKVDYD RIGERMYAELVYVNCGEGADAAKYTALVEKAAGLGRTLVLECKDPEIAKAALAVCKDSKP VLNGADASNYEAMNAVATEAGVVLGVSGKDLNELYDTTAALEKLGNKNLVLDTTGADIKE TFANTVQVRRAALKDQDRTFGYPSIVNLVKIAKGDLHLQAALASMFTMKYGSIIVMEQMT YAEALPLYGLRQNVFTDPQKPMKVEPGIYPLNGADENAVVVTTVDFALTYFVVSGELERS GVPLNLVINDAGGLSVLTSWAAGKFSGNSISAYIKENVEPKVKSRKLIIPGKVAVLKGDL EAKLPGWEIIVGPREAVQLVKFLKDMQADGQI >gi|226332990|gb|ACII01000029.1| GENE 27 29266 - 30210 1298 314 aa, chain - ## HITS:1 COG:AF0377 KEGG:ns NR:ns ## COG: AF0377 COG2069 # Protein_GI_number: 11497989 # Func_class: C Energy production 
and conversion # Function: CO dehydrogenase/acetyl-CoA synthase delta subunit (corrinoid Fe-S protein) # Organism: Archaeoglobus fulgidus # 20 314 139 434 452 140 32.0 4e-33 MPFNQKLQKFNAKINTVTIGSGDKTVTIGGDSTYPFYSFDAPSENTPKIGVEISDMGLEN VVSEGIKAYYDGASTIGEMAKKAAAMEGADFVALILEGGDPNGVNKSVDELIAVVKDVAD SIDAPLVVEGCKNVEKDAELLPKVAEALQGRNVLILSEKEENYKAIGAAAGLAYDQIVGA ESAVDINLAKQLNVVTTQLGVNAQKIVMNIGSAAAGYGYEYVVSTMDRIKGAALSQNDNM LQMPIITPVSAETWGVKEATASEADMPEWGPEEERGIDMEVMTAAADLAAGSDAVILRHP EAVATISRMIKALA >gi|226332990|gb|ACII01000029.1| GENE 28 30645 - 32771 2640 708 aa, chain - ## HITS:1 COG:MJ0156 KEGG:ns NR:ns ## COG: MJ0156 COG1614 # Protein_GI_number: 15668328 # Func_class: C Energy production and conversion # Function: CO dehydrogenase/acetyl-CoA synthase beta subunit # Organism: Methanococcus jannaschii # 297 700 4 393 469 353 45.0 9e-97 MTLFESVFAGNDAVYGLTEGAINQAIEKYGADKAVSFPDTAYSLPCYYAVTGTKVGTLAE LKDALAVVKTLMTREKRTNDVFMSGVATALCAEFIEVLKYIDGAVPYEAPCYGHLADAVI RELGVPLVTGDIPGVAVILGKAPSAQEAADLVKSYQSQGILITLVGDVIDQLEEVGYKTG ANVRVIPLGKDVTAVIHVVSVALRAALIFGNVKPGDAATLMDYTMKRVPAFVNAFAPLND VIVACGAGAIALGFPVITNQEDVARVPKSLICQKDISKWNATSLEARDIKIKITNIDIPV AFASAFEGEIIRRKDMQVEFDGSRVDCAELVHTCDASEVEDHKITVVGPEVDDMELGSKN SIAYVVKVAGKNMQPDFEPVIERKFHNYINCIEGVYHTGQRDMQRIRISKDAFAAGFKIK HIGEVLYTQVKNEFDAVVDKCEVTIYTDPAECTRIRHEVAIPIFEKRDDRLNTLTDESVD VYYSCILCQAFSPSHVCVVTPERLGLCGAVSWLDAKATHELDPNGPCQVITKERPIDENL GSYEDVDEAVKKLSQGALEHVTLYSIMQDPMTSCGCFECICGIEPFSNGVVIANREYAGM TPLGMTFSEMASMTGGGVQTPGFMGHGKHFISSKKFMKAEGGIERIVWMPKELKEFVAER LNNTAKELYGIENFTDMIGDETIATDPETLVGFLTEKGHPALALDPMM >gi|226332990|gb|ACII01000029.1| GENE 29 32798 - 33562 1000 254 aa, chain - ## HITS:1 COG:MA1967 KEGG:ns NR:ns ## COG: MA1967 COG3640 # Protein_GI_number: 20090815 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: CO dehydrogenase maturation factor # Organism: Methanosarcina acetivorans str.C2A # 1 251 3 246 262 206 46.0 3e-53 MKIAVTGKGGVGKTTFAATLARLYAAEGRPVLAADVDPDANLGLALGFDEETLDSIVPIS 
KMRKLVEERTGATAGNQFYHLNPKVDDIPEKYGKVCNGVRLLVLGTVETGGGGCVCPEHV MLKRIINNLVLRREDVVILDMEAGLEHLGRGTTEGVDAFVVVIEPGARSVQTYKNVKRLA KDLGVSQVKVVANKVRNEDDENFIRERIPQEDLLGMIHYNTEVIDADRQGKSPYDFSKTV TDEIIKIKEKIDRQ >gi|226332990|gb|ACII01000029.1| GENE 30 33607 - 35499 2164 630 aa, chain - ## HITS:1 COG:MA3282 KEGG:ns NR:ns ## COG: MA3282 COG1151 # Protein_GI_number: 20092097 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Methanosarcina acetivorans str.C2A # 3 630 4 627 628 556 47.0 1e-158 MSEFKLTTVEEFEEATARLLETGAKVGADAWQFRVKNQTPHCKFGEQGVCCRICAMGPCR ITPKAPRGICGCDVHGIVGRNFLKFTAGGAATHSDHGREICHTLYCAKDGGNYQVKDPEK LIRIAKEWGVETEGKDIYDLAHEMAETGLMEYGKPFGYQRFLDRMPAGQKEKLIENEIAP RAIDREVSTSLHMAHMGCSSLPEALVKQSIRCGMADGWGGSMMGTEFSDVLFGTPKPIDT EANLGVMVEENVNIVVHGHDPSLSEMICEYADSKEMIDYAKSVGAKGITVSGVCCTSNEV AMRRGVPMAGNFLQQENVVLTGACEAIVVDVQCIFPALGPLSKCFHTKFITTSPIAQMPD AEFIRFNAETAGENAKAIVKMAIDNFKNRKPELVHIPQLKQKATVGYSVEAIVKVLDGVT NSQVDETGTTKPLLECITSGVIRGAVAMVGCNNPKIRPDYAHIELMKKCIANDIVVIASG CSAQAAAKAGLMDKSAKDLCGAGLKRVCELADIPPVLHMGSCVDISRMMILAAELAKDAG LQINQLPVVGCAPEWMSEKAVSIGNYVVGTGIDTFLGVDPYVSGSSEMGELLTEGTRKWT GAAYTVETDIEKLVDLMIERIEEKRTALGI >gi|226332990|gb|ACII01000029.1| GENE 31 35766 - 36536 989 256 aa, chain - ## HITS:1 COG:MTH1711 KEGG:ns NR:ns ## COG: MTH1711 COG3640 # Protein_GI_number: 15679705 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: CO dehydrogenase maturation factor # Organism: Methanothermobacter thermautotrophicus # 4 254 6 249 252 191 42.0 1e-48 MAHVIAVAGKGGVGKTTLCGMLIQYLCEKGKGPILAVDADANSNLNEVLGVKVETTLGDV REEIARAELAKENPIPTGMSKADYAEMRFEDALVEDDDFDLLVMGRTQGKGCYCYVNGLL QTQLAKYQNNYPYIVVDNEAGMEHISRGVLPSMQTAILVSDCSRRGVQAVGRIAELIKEC DMHPDTVGLIINRAPKGELNKGIQEEIANQGLTLLGVVPQDETVYEYDCEGRPTSTLPED NPVKTALRAIVDNLKL >gi|226332990|gb|ACII01000029.1| GENE 32 36629 - 36811 325 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578386|ref|ZP_04855658.1| ## NR: gi|253578386|ref|ZP_04855658.1| 
conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 60 1 60 60 94 100.0 2e-18 MCLATVYKESDDSVIFKNVSRIDVDGNKLILRDIMGDERVVEGTILMVDLANSIVKLKCD >gi|226332990|gb|ACII01000029.1| GENE 33 36825 - 37220 414 131 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578387|ref|ZP_04855659.1| ## NR: gi|253578387|ref|ZP_04855659.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 131 1 131 131 168 100.0 1e-40 MHLIKDENGNLVQHGHGHTHEHTHEDGMTHTHEHTHGEGHDHGHSHTDHACDTAHCGSCS EGDCKNETLALLTYMLQHNEHHAQELDQMADNLAKMGMDAACKTIKEAVADFQKGNMRLG LALTLVKEEMK >gi|226332990|gb|ACII01000029.1| GENE 34 37566 - 38393 788 275 aa, chain + ## HITS:1 COG:BS_yvgN KEGG:ns NR:ns ## COG: BS_yvgN COG0656 # Protein_GI_number: 16080393 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Bacillus subtilis # 1 270 5 274 276 256 49.0 3e-68 MDNTVFLNTNREMPLLGLGVYKATGENEAENAIISAVESGYRLIDTASVYKNEENVGRGI ASCGIPRNELFITTKVWNTAQRLGDIQGAFERSLDRLKLDYVDLYLIHWPVPGCYLSTWK VLEEIQKSGRALSIGVSNFEIRHLEELEANSGIIPAVNQIECHPLCYPKELIDYCQDKGI QVQAYAPLARGAYLDNDVMCVLGTKYAHTPAQIGLRWATQKGISVIPKSVHPDRIRSNGN IFDFTIEQEDMDLIDTLNENFHSSHVPEDLRDIAF >gi|226332990|gb|ACII01000029.1| GENE 35 38584 - 39603 1201 339 aa, chain - ## HITS:1 COG:FN0906 KEGG:ns NR:ns ## COG: FN0906 COG0240 # Protein_GI_number: 19704241 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Fusobacterium nucleatum # 1 329 1 331 335 402 61.0 1e-112 MAKISVLGAGSWGTALSILLYNNGHEVILWSALGEEITTLREKREHVSKLPGVKIPEPID ITIDLERSLRDADVAILAVPSPYTRSTCKRMAPFVKNGQKIVNVAKGVEEKTLLTLSEII EQELPQANVSVLSGPSHAEEVGKGLPTTCVVSSHQRETAEYLQGIFMSPVFRVYTTPDML GVELGAALKNVIALAAGTADGLGYGDNTKAALITRGIAEISRLGVKMGARMETFYGLSGI GDLIVTCASVHSRNRKAGYLMGKGYTMEQAMAEVKMIVEGVYSAKAAKELAEKYDVEMPI ITEINKVLFEGKSAAEAVIDLMVRDKKVETPMLPWEDGQ >gi|226332990|gb|ACII01000029.1| GENE 36 39618 - 40262 555 214 aa, chain - ## HITS:1 COG:TM1447 KEGG:ns NR:ns ## COG: TM1447 COG0344 # 
Protein_GI_number: 15644196 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Thermotoga maritima # 9 212 9 196 196 111 35.0 1e-24 MERLICIAIGYACGLFQTAYILGKIYHIDIRKQGSGNLGSTNVLRTLGKKAGALNLLCDC LKCIAAILIVRAIFGNSYGDILPLLSLYAAAGCILGHNFPFYLNFKGGKGVAASVGLVIA FDWRIFIICAIVFFGLFFLTHYVSLCSVSAYLAAFISMIIFGQRGSYHMDSYHMIELYIV MGLLTVLAFYRHKKNIKRLLDGTESKIYLSKKNK >gi|226332990|gb|ACII01000029.1| GENE 37 40265 - 41590 1754 441 aa, chain - ## HITS:1 COG:lin2051 KEGG:ns NR:ns ## COG: lin2051 COG1160 # Protein_GI_number: 16801117 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Listeria innocua # 1 437 1 435 436 508 59.0 1e-144 MSKPVVAIVGRPNVGKSTLFNALAGEKISIVKDTPGVTRDRIYADVTWLDKTFTMIDTGG IEPDSSDIILSQMREQAQIAIDTADVIIFITDVHQGLVDSDAKVADMLRRSGKPVVLVVN KVDSIQKFMMDVYEFYNLGIGEPIPISAANRMGLGDMLDEVAKHFPEDADTEEDDEIPRI AIVGKPNVGKSSLVNKLLGEDRVIVSDIAGTTRDAVDTRVKWQGKDYIFIDTAGLRRKGK VKEEIERYSVIRTVTAVERADVVIVMIDASEGVTEQDAKIAGIAHERGKGVIIAVNKWDA IEKNDKTIYKHTNRIREVLAYMPYAELVFISAKTGLRISRMFETIDAVIENQTLRIQTGV LNEILSEAVAMQQPPSDKGKRLKIYYMTQVAVKPPTFVIFVNDKELMHFSYTRYLENKIR DAFGFAGTSLKFIIRERGEKE >gi|226332990|gb|ACII01000029.1| GENE 38 41553 - 42935 1273 460 aa, chain - ## HITS:1 COG:CAC1710 KEGG:ns NR:ns ## COG: CAC1710 COG1625 # Protein_GI_number: 15894987 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase, related to NifB/MoaA family # Organism: Clostridium acetobutylicum # 8 434 5 430 437 399 46.0 1e-111 MKKNLHIISRVLPDSIGEELELEPGDALLSINGQPVEDVFDYRYLMNDEFVTLLIRKKNG EEWELEVEKEYEDDLGVEFENSLMDEYRSCSNHCIFCFIDQMPPGMRETLYFKDDDSRLS FLQGNYVTLTNMSDYDLDRIIKFHLSPINVSFQTMNPKLRCKMLHNRFAGDALAKVDRLY KGDVTMNGQIVLCKGINDRDELEYSLEKLSEYAPVLQSVSIVPVGLSRYRKGLYPLESFD KEDARYLISQVERWQKIMVKKHGIHFVHASDEWYILAGYELPEEGRYDGYLQLENGVGMM RLLETEVKERLEQLEGDDREVNATVATGRLAAPYIGKMIKLVQKKFPNVQAEVYAVKNNF FGEKITVSGLITATDLMDQLAQRNLGEKVLIPCNMLRSGEDVFLDDLTVEDVRQALKTEV VVVDEPGADLVNCLIEAPDHKKIRRRQIYEQTGSSDSGQT >gi|226332990|gb|ACII01000029.1| GENE 39 
43251 - 44420 1350 389 aa, chain + ## HITS:1 COG:CAC0819 KEGG:ns NR:ns ## COG: CAC0819 COG0462 # Protein_GI_number: 15894106 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Clostridium acetobutylicum # 18 389 16 369 371 376 50.0 1e-104 MPNIELLEKSMPVAPIRIAALGCRELAEEVDRKLVKFRKELVERKQLSVIPQGYSEESFI VECECPRFGTGEGKGYIKESVRGTDLYIMVDVTNYSESYTVCGHENHMSPDNHFQDLKRI ISAATGKAHRINVIMPFLYEGRQHRRTKRESLDCALALKELSDMGVSNIITFDAHDPRVQ NSIPLNGFDNFFPTYQFLKALIKSVPDLRVDNDHLMIISPDEGAMSRAVYFSNILGVDMG MFYKRRDYSTIVNGKNPIVAHEFLGDSVEGKDVVVIDDMISSGGSMLDVAKQLKERNAKR VFVCTTYGLFTDGLDKFDEYYEKGWLDRVITTNLNYRIPELLERPYYVEANMSKYLASII DIINHDVSVEKVRSSNEKIMSLMEKVNSR >gi|226332990|gb|ACII01000029.1| GENE 40 44550 - 45401 662 283 aa, chain - ## HITS:1 COG:BS_ylmD KEGG:ns NR:ns ## COG: BS_ylmD COG1496 # Protein_GI_number: 16078601 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 35 282 30 276 278 188 37.0 1e-47 MKIKWNSEQIPKMNVKNNKDVTFLTYPAYEEMNWLVHGFSTRFGGVSRGIYSSMNLSFTR GDEESCVKENYRRISDAMGFDMNSIVTSDQTHTNNVRKVTEKDRGKGIVIPRDYTDTDGM ITNVPGLVLATFYADCVPLYFADPANHAIGLSHSGWRGTVQKIGAVTIEKMSEEYGSNPK DLKVAIGPSICQECYEVSEDVIEEFEKVFDKKYRNKLFYRKENGKYQLNLWMANKIIFLE AGIPEENISMPNLCTCCNPEFLFSHRASHGKRGNLGAFLGIRS >gi|226332990|gb|ACII01000029.1| GENE 41 45462 - 45866 197 134 aa, chain - ## HITS:1 COG:VC0580 KEGG:ns NR:ns ## COG: VC0580 COG0792 # Protein_GI_number: 15640602 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Vibrio cholerae # 17 127 5 118 122 88 43.0 3e-18 MPTYNIKRNGWKNLHKNTRSTGSCYERKAADYLKQQGLFILRYNYRCRFGEIDLIARDGE YLVFVEVKYRKDNSSGYSLAAVNPAKQKTICKVARYFLTVEYHNVDIPCRFDVAGIDGDE IHWVKNAFEYIAIL >gi|226332990|gb|ACII01000029.1| GENE 42 45937 - 46692 751 251 aa, chain - ## HITS:1 COG:BS_rnh KEGG:ns NR:ns ## COG: BS_rnh COG0164 # Protein_GI_number: 
16078669 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Bacillus subtilis # 3 246 5 248 255 213 50.0 2e-55 MKTIAEIKAEFAAADMSSYPALYEIYKEDTRSGVQKLIAQCRKKEAALEAEKQRIENMKV YEHKYEHLGYLCGIDEVGRGPLAGPVVACAVILPQDSRILYLNDSKKLSASKREELYDVI MREAVSVGIGMRSPERIDEINILQATYEAMREAVSKLEVVPQLLLNDAVTIPQITIPQVP IIKGDAKSVSIAAASIVAKVTRDRMMEEYDKVFPEYGFASNKGYGSADHIAALKKYGETP IHRKTFITHFI >gi|226332990|gb|ACII01000029.1| GENE 43 46689 - 47534 1045 281 aa, chain - ## HITS:1 COG:BH2476 KEGG:ns NR:ns ## COG: BH2476 COG1161 # Protein_GI_number: 15615039 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Bacillus halodurans # 1 277 1 278 284 280 48.0 3e-75 MNVQWYPGHMTKAKRQMQEDIKLIDLIIELVDARVPLSSRNPDIDELGKNKSRLILLNKS DLADERQNEAWKTYFQAKGFYVVKVDSRNGSGMKAINNVIQEACKEKIERDRRRGIKNRP IRAMVVGIPNVGKSTFINTFAGRACAKTGNKPGVTKGKQWIRLNKGVELLDTPGILWPKF EDQQVGIRLACVGSIKDDILNMEELALWLIDYLRENYKGILAERYQITEEGTSVEVLEAI ARARGCLKKQEELDYAKASLILFDDFRSGKMGRITLEWAPL >gi|226332990|gb|ACII01000029.1| GENE 44 47541 - 48185 493 214 aa, chain - ## HITS:1 COG:BS_sipT KEGG:ns NR:ns ## COG: BS_sipT COG0681 # Protein_GI_number: 16078505 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Bacillus subtilis # 25 205 1 185 193 103 36.0 3e-22 MYCPSFSICQEIWMNIRKWKDSPALKETRDMLEIKLEDKKVRGILNWIFQIAVVLIFGIL AGIALFQSVTVQESTMEPTLQVGERFFVNRAVYKVSSPERGDIIVYKTSGSDDAALHIGR VIGLPGETVQISNGAVLINGEVYNENKNFPEISNAGLASDGVSLESGEYFVLGDNRNNSE DSRYGDIGNINKKYIVGKVWFVISPKDKFGFLKG >gi|226332990|gb|ACII01000029.1| GENE 45 48220 - 48567 464 115 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916980|ref|YP_002930497.1| large subunit ribosomal protein L19 [Eubacterium eligens ATCC 27750] # 1 115 1 115 115 183 81 2e-45 MNEIIKNIEAAQLKAEVPAFNTGDTVRVHALIKEGNRERVQVFEGTVLKKQGGSNRATFT VRKSSNGVGVEKTWPLHSPHVVKVEVIRRGKVRRAKLNYLRDRVGKAAKVKELVK >gi|226332990|gb|ACII01000029.1| GENE 46 48695 - 49570 864 291 aa, chain - ## HITS:1 
COG:CT851 KEGG:ns NR:ns ## COG: CT851 COG0024 # Protein_GI_number: 15605587 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Chlamydia trachomatis # 7 289 3 286 291 258 46.0 8e-69 MSLKIGRNDPCWCGSGKKYKKCHEAFDEKMEIMKQKGYAVIDHDLIKTPEQIAKIKESCK INIAVLDYVEEHIKPGVSTEEIDRWVHEETVRHGGIPAPLNYEGFPKSVCTSVNEVVCHG IPDEEQILKEGDIINVDVSTIYNGYYSDSSRMFCIGKVSPEKEKLVKVTKECVEIGISQV KPWTPIGNMGSAVHKHAVENGYSVVREIGGHGVGVEFHEDPWVSFVSEENTGVLMVPGMM FTIEPMVNMGSDEIYTDEIDEWTVRTEDGLPSAQWEVTVLVTETGCEVICW >gi|226332990|gb|ACII01000029.1| GENE 47 49779 - 50711 1040 310 aa, chain - ## HITS:1 COG:Cj0599_2 KEGG:ns NR:ns ## COG: Cj0599_2 COG2885 # Protein_GI_number: 15791959 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Campylobacter jejuni # 99 294 1 186 191 81 33.0 2e-15 MRKRKKSEEGGFNVWRSYSDMIAGVLLLFILIMCVTLFQAQKSYNESIKERDDKIALQEQ YTAEILAQQSELDEKNSKLSSQDEQLDKQKKLLAELAAQLKEQQASLDEKTSLLATQQKK IDNIIGVKADVIEALQKEFSKNNVSVDIDAQTGALTLNANVLFDYDKSELTDEGKQELAD ILPIYCKVLLQKDYKKYLAEIIIDGYTDTDGDYDYNLELSQKRSLAVAQYLTEIRENFLS SDEISELQNYLTVNGHGSANPVLDSDGNVDKDASRRVEVKFRLKDDEMINELNQIMNTDK TSDSEKSGSN >gi|226332990|gb|ACII01000029.1| GENE 48 50727 - 52367 1742 546 aa, chain - ## HITS:1 COG:no KEGG:EcE24377A_0285 NR:ns ## KEGG: EcE24377A_0285 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 129 520 133 563 739 79 21.0 4e-13 MGKKIMNTVLFLVVLGGCVAMTLLTGNGSSGTMIYNFAFLAVMAVIYLAGLFGGMFRVDS IGEALARGTQELSSIFKIPGKAKSEDLSCLKGMFEHKYLDDRMDSFVDAMEKNQEGIGDV EDYINEDEIDLHVHKKILEMAPDIFTSLGILGTFIGLVWGLKSFEPSSYETMTTSVSALV DGIKVAFLTSIYGIAFALIYSSGMKSVYSGMDAKLQGFLERFHLYVLPAAESESRNLMLA SQKVQTKAMKQMAEQLTSQMAESFEKAINPTFQKMTESLEILTESVTKCQEDVVQDILRT FLREMNGSFKMQFTDFNEALVQLKEAQKENIDYTTRLYQTMSNQLNTSYAKQSEAMKDLV NELGQVQGRYMSTATRITQDNQEIQKLQQQDYQRIADYLHESEKSSAKFWVACNQTMQKY VEAAAQGMEKVSAANQAGKDVLEANRKLVEQLDVKMKDFVSYQKMTYETMEEVRRLFADI 
SVQKEDNNIYLSGGKSAQSAAQKESVEEVRKLLEEQSERQEALLEEMNKNMKNISKNQKG KFSLFK Prediction of potential genes in microbial genomes Time: Sat May 28 19:26:43 2011 Seq name: gi|226332989|gb|ACII01000030.1| Ruminococcus sp. 5_1_39B_FAA cont1.30, whole genome shotgun sequence Length of sequence - 23166 bp Number of predicted genes - 21, with homology - 20 Number of transcription units - 14, operones - 5 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 19 - 72 17.4 1 1 Op 1 . - CDS 89 - 703 598 ## COG0406 Fructose-2,6-bisphosphatase 2 1 Op 2 40/0.000 - CDS 703 - 3120 3166 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit - Prom 3147 - 3206 3.3 3 1 Op 3 . - CDS 3225 - 4250 1231 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit - Prom 4383 - 4442 4.1 4 2 Op 1 5/0.000 - CDS 4719 - 6113 1765 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 5 2 Op 2 . - CDS 6113 - 7000 1156 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases - Prom 7028 - 7087 5.0 - Term 7123 - 7161 1.0 6 3 Tu 1 . - CDS 7203 - 7955 819 ## COG0740 Protease subunit of ATP-dependent Clp proteases - Prom 7993 - 8052 4.6 7 4 Op 1 . - CDS 8066 - 8797 1160 ## COG0217 Uncharacterized conserved protein 8 4 Op 2 . - CDS 8839 - 10152 1284 ## COG1253 Hemolysins and related proteins containing CBS domains - Prom 10193 - 10252 4.6 9 5 Tu 1 . - CDS 10353 - 10700 404 ## COG2337 Growth inhibitor - Prom 10812 - 10871 2.9 10 6 Op 1 1/0.500 - CDS 10886 - 12031 1318 ## COG0787 Alanine racemase - Term 12054 - 12085 -0.5 11 6 Op 2 . - CDS 12109 - 13608 654 ## PROTEIN SUPPORTED gi|225086616|ref|YP_002657886.1| ribosomal protein S15 - Term 13642 - 13676 4.0 12 6 Op 3 . - CDS 13698 - 14345 659 ## COG2344 AT-rich DNA-binding protein - Prom 14369 - 14428 10.6 + Prom 14430 - 14489 7.7 13 7 Tu 1 . 
+ CDS 14606 - 16516 1950 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 16664 - 16694 1.3 - Term 16487 - 16531 9.4 14 8 Tu 1 . - CDS 16633 - 17841 1502 ## COG0538 Isocitrate dehydrogenases - Prom 17873 - 17932 6.3 15 9 Tu 1 . - CDS 17957 - 18802 579 ## COG1404 Subtilisin-like serine proteases + Prom 19023 - 19082 3.2 16 10 Tu 1 . + CDS 19113 - 19280 67 ## - Term 19245 - 19295 15.1 17 11 Op 1 . - CDS 19310 - 20041 689 ## COG0500 SAM-dependent methyltransferases 18 11 Op 2 . - CDS 20062 - 21540 1455 ## COG2326 Uncharacterized conserved protein - Prom 21646 - 21705 7.5 - Term 21871 - 21909 -0.6 19 12 Tu 1 . - CDS 22084 - 22431 213 ## COG0350 Methylated DNA-protein cysteine methyltransferase - Prom 22569 - 22628 6.5 - Term 22572 - 22610 -0.4 20 13 Tu 1 . - CDS 22753 - 22905 69 ## gi|253578422|ref|ZP_04855694.1| predicted protein 21 14 Tu 1 . - CDS 22986 - 23165 178 ## gi|153812050|ref|ZP_01964718.1| hypothetical protein RUMOBE_02446 Predicted protein(s) >gi|226332989|gb|ACII01000030.1| GENE 1 89 - 703 598 204 aa, chain - ## HITS:1 COG:lin0293 KEGG:ns NR:ns ## COG: lin0293 COG0406 # Protein_GI_number: 16799370 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Listeria innocua # 3 154 8 157 211 90 39.0 2e-18 MLIYVLRHGITQWNKLKKVQGAMDIPLAPEGIELAKRTGEVLKDVPFDICFTSPLARARQ TAHYVLGNRQIPVIEDKRIQEIDFGVLEGSRFKDEQGKIISHEMEIFFEEPQKFERPQNG ENISDILKRTREFWVEKTTDPALADKTILVSSHGCAVRALLQNVYQDPEHFWHGCVPPNC SINLLEVKDGKARFLEEDKVYSEI >gi|226332989|gb|ACII01000030.1| GENE 2 703 - 3120 3166 805 aa, chain - ## HITS:1 COG:CAC2356_2 KEGG:ns NR:ns ## COG: CAC2356_2 COG0072 # Protein_GI_number: 15895623 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Clostridium acetobutylicum # 154 804 3 654 654 576 47.0 1e-164 MNTSLSWIKMYVPDLDVTAQEYTDAMTLTGTKVEGFEKLDADLDKIVIGQIEKIEKHPDA 
DKLIICQVNIGTETVQIVTGAPNVKEGDKVPVVLDGGRVAGGHDGKKTPGGIEIKKGKLR GIDSYGMMCSIEELGSTREMYPEAPEYGIYIFPEDAVVGESAIKALGLDDVVFEYEITSN RVDCYGVLGIAREAAATFNEKFCPPVVECKGNDEKASDYVKVTVEDPKLCPRYCARVVKN VKIGPSPKWMQRCLASNGIRPINNLVDITNYVMEEFGQPMHAYDLDRIAGKEIVVRRAEN DEKFVTLDGQERTMDDQVLMICDGEKAVGIAGIMGGENSMITDDVKTVLFEAACFDGTNI RLSSKRIGLRTDASGKFEKGLDPNNAQAAIDRACQLMEELGAGEVVGGMVDICNEVREPS RVPFKPEKINALLGTDLTPEEMLAYLAKVELTYDEKTNEIVAPTFRQDIHYVADVAEEVA RFFGYDKIPTTLPTGEATTGKLPFKLRIEAVARDIAEYCGFSEGMTYSFESPKVFDKLRL PADSKLRQGIVISNPLGEDYSVMRTTTLNGMLSSLATNYNRRNKDVRLYELGNVYRPKAL PLTELPDERMHFTLGMYGNGDFFDMKGVCEEFFEKVGMRDKVDYDPNSGKTYLHPGRQAN MVYDGKVVGYLGEVHPLVADTYGIGDKAYVAVIDILTVLEFASFNHKYTGIAKYPAVTRD LSMVVPKTVLAGQIEHILTQRGGKILESYQLFDIYEGNQIKGGYKSMAYSLVFRHHDKTL EESEITAAMKKILNGLTDLGIELRS >gi|226332989|gb|ACII01000030.1| GENE 3 3225 - 4250 1231 341 aa, chain - ## HITS:1 COG:CAC2357 KEGG:ns NR:ns ## COG: CAC2357 COG0016 # Protein_GI_number: 15895624 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Clostridium acetobutylicum # 3 341 1 339 339 452 61.0 1e-127 MDMKERLQELAEAAGKRISESEGLEKLNEVRVAYLGKKGELTAILKSMKDVAPEERPKVG QLVNETRSKIEALLDESKQKLEEAAREEKMKQEVIDVTLPAKKAKIGHRHPNTIALEEVE RIFIGMGYEVVEGPEVEYADYNFTKLNIPENHPARDEQDTFFINDSILLRSQTSPVQART MEKGKLPIRMIAPGRVFRSDEVDATHSPSFHQIEGLVIDKNITFADLKGTLQEFAQELFG PETKTKFRPHHFPFTEPSAEVDVSCFKCGGKGCRFCKGSGWIEILGCGMVHPNVLSMCGI DPEEYTGFAFGVGLERIALLKYEIDDMRLLYENDIRFLKQF >gi|226332989|gb|ACII01000030.1| GENE 4 4719 - 6113 1765 464 aa, chain - ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 8 461 4 459 468 508 58.0 1e-143 MADAKMLKKVPVREQDPKVRATNFEEVCLGYNQEEAMEEAQRCLGCKKPKCVEGCPVSIN IPGFIEEIKEGKIEEAYKVIGLSSALPAICGRVCPQESQCEGKCIRGVKGEAVSIGKLER 
FVADYALEHDIKPVGSEVKNGHKVAVIGSGPSGLTCAGDLAKAGYDVTVFEALHELGGVL VYGIPEFRLPKQKVVKKEIEKVKELGVKFETNVVIGKSTTIDQLIEDEGFEAVFIGSGAG LPMFMGIPGENASGVFSANEYLTRSNLMKAFDDSYDTPIAAGKKVAVVGGGNVAMDAART ALRLGAEVHIVYRRSEAELPARAEEVHHAKEEGIIFDLLTNPKEILVDENGHVKGMKVVK MELGEPDASGRRRPVEIPGSEYDMDVDTVIMSLGTSPNPLISSTTKGLEVNKRRCIIAEE ETGKTSKDGVYAGGDAVTGAATVILAMGAGKAGAKGIDEYLKNK >gi|226332989|gb|ACII01000030.1| GENE 5 6113 - 7000 1156 295 aa, chain - ## HITS:1 COG:PAB1737 KEGG:ns NR:ns ## COG: PAB1737 COG0543 # Protein_GI_number: 14521153 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Pyrococcus abyssi # 1 279 1 272 278 260 52.0 2e-69 MYKILKAETLAAKIHLMVVHAPRVASHCQPGQFIIVKLDEKGERIPLTICDYDREEGTIT IVFQEIGVSTIKMAELKAGDAFRDFVGPLGCPSEFVEEDIEELKKKKMVFVAGGVGTAPV YPQVKWLHEHGITADVILGAKTKDMVILEKELGEVSNLYVTTDDGSYGRAGMVTKTLEDL VTNEGKHYDVCVAIGPMIMMKFVCLLTKKLGIHTIVSMNPIMVDGTGMCGACRLQVGDEI KFACVDGPEFDGHLVDFDQAMKRSQMYKTEEGRALLKLQEGDTHHGGCGHCGGDK >gi|226332989|gb|ACII01000030.1| GENE 6 7203 - 7955 819 250 aa, chain - ## HITS:1 COG:BS_ymfB KEGG:ns NR:ns ## COG: BS_ymfB COG0740 # Protein_GI_number: 16078742 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Bacillus subtilis # 19 248 1 231 241 225 50.0 5e-59 MSENRYGEKREESETGMKLQNCETQESENKIQESTNEQIKDMGQVVLNSNSRKHKVELLT IIGEVEGHESAPSHSKTTKYEHVLPKLAIIEDDEEIEGLLILLNTVGGDVEAGLAIAEMI ASLSIPTVSLVLGGGHSIGVPMAVSADYSFAVPSATMVIHPVRSNGMFIGVAQTYRNMEK IQDRITTFISEHSKASQKRLEELMLDTSQLVKDVGTMLEGKEAVKEGLIDEVGGIKEALA KLYEMIEKEC >gi|226332989|gb|ACII01000030.1| GENE 7 8066 - 8797 1160 243 aa, chain - ## HITS:1 COG:BS_yrbC KEGG:ns NR:ns ## COG: BS_yrbC COG0217 # Protein_GI_number: 16079834 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis 
# 1 240 1 238 240 231 51.0 9e-61 MSGHSKFANIKHKKEKNDAKKGKIFTVIGRELVMAVKAGGPDPNNNSKLRDVIAKAKANN MPNDTIERGIKKAAGADSADNYEKAVYEGYGPNGVAIIVETLTDNKNRTAGNVRNAFTKG NGSIGTQGCVSYMFDQKGQIIIDKEECEMDADEIMMLALDAGAEDFSEEEDSYEILTAPD DFSAVREALEKAGLPIAEAQVAMIPQTYVELTDEQALKDLQRTLDLLDDDDDVTDYWTNW DEQ >gi|226332989|gb|ACII01000030.1| GENE 8 8839 - 10152 1284 437 aa, chain - ## HITS:1 COG:CAC0460 KEGG:ns NR:ns ## COG: CAC0460 COG1253 # Protein_GI_number: 15893751 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Clostridium acetobutylicum # 13 429 14 435 443 192 33.0 8e-49 MEDGSSMPLWGLFVLLLLLWLNGIFYGFAAALRNISENDTQKKAEEGDKKAQMLMALIDK PAQFVNAIPLIVMACGICFGTFLVPYAVDAFYPYIKHVPALILVMALCVIFLAAIGILAF RRVGTYHPEAYAYKYLNLVHFWLNLLKPFTVSVTWIARLAAVPFGVEINRTEKSVTEEEI ISIVDEAHEQGVIQENEAEMIQNIISFNETEAHDIMTHRKNVVAFDEEILLKNMIDTMLE EGNSRYPVYEENIDNIKGIVHYKDALKFMTQNPWAKFKPLKELPGIIRQASLIPETRGIG DLFHTMQARKIHMAIVVDEYGQTAGIVTMEDILEEIVGDILDEYDEDEITIRAQKDNSLI IDGLAYLEDVEEELDADFGDTEFETLNGYLTNILGHIPADKDVNTSIKAIGYCFTILSIG NKTIGKVKVERDNNVAV >gi|226332989|gb|ACII01000030.1| GENE 9 10353 - 10700 404 115 aa, chain - ## HITS:1 COG:BS_ydcE KEGG:ns NR:ns ## COG: BS_ydcE COG2337 # Protein_GI_number: 16077533 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Bacillus subtilis # 1 113 1 113 116 167 69.0 5e-42 MVIKRGDIFYADLRPVVGSEQGGIRPVLIIQNNVGNRHSPTVICAAITSKMNKAKLPTHI EINAELYGIEKDSVILLEQLRTIDKRRLKDKVCHLDDAMLDKVNHALEISLELTF >gi|226332989|gb|ACII01000030.1| GENE 10 10886 - 12031 1318 381 aa, chain - ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 7 381 8 380 386 307 41.0 3e-83 MKKDSRVKAVISLDAVEHNFREMRKNIAEETKMIAVIKADAYGHGAVPVAHLIEDYDYIW GFAAATAEEAIHLREAGITKPILILGIVFDEYFPELVRYEIRPAVCEYEEARKLSDEAVL 
QKKTVHIHIALDTGMTRIGFADAQENVEEIKKISELPNLEIEGMFTHFARADEYDRSPAM VQLERYLDFSRRVEEAGVEIPLHHCSNSAGIIRMPEANLNIVRAGITIYGIYPSSEVERD IVKLEPVMELKSHVTYVKDVPAGAAISYGGTYVADRKLRVATVPVGYADGYSRQLSNKGW VLIHGQKAPILGRVCMDQFMVDVTEIGDVKKGDEVTLIGRDGDEFIGIEEIGDLCGRFSY EFACDISPRVPRVYIKNGKEI >gi|226332989|gb|ACII01000030.1| GENE 11 12109 - 13608 654 499 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225086616|ref|YP_002657886.1| ribosomal protein S15 [gamma proteobacterium NOR5-3] # 3 499 8 497 497 256 34 1e-67 MREIVTAVEMKELDHNTIQKAGISSPVLMERAALKTVEEIVQRLKNKEKEKILVVCGTGN NGGDGLAIARLLQLRGVRTWYYIAGKEEKMTAETAGQLKTAEYYQVPRVHNLDPDEYTTI VDAIFGVGLSRPVEGKYAELISTMNRASAYKVAVDIPSGIDSDTGFEKGIAFRADLTVTF AFRKRGLCFYPGRMYGGEIVVTDIGIYSIPGKKICMHHLEREDLKMLPERVPYGNKGTFG KVLLIAGSEGMCGAAYLSAAAALKSSAGMVRIQTVEANRIPLQTLLPEAMITCDFDEKKN QAMLDWCDVLVIGPGLGTGAQSRERAQWFLAKAAECKKTVILDADGLNLLSVHDNWKCFI GEHVILTPHIGEMSRLCGKEIGKIQDCLAQTAADYAKESGAVCVLKDACTVVADAFGDMY LNISGNAGMATAGSGDVLSGVLAGIQCMYLAAEKKFSPVILAALGVYVHGLAGDLAAETV GQRGMTAGDIICFLPEILR >gi|226332989|gb|ACII01000030.1| GENE 12 13698 - 14345 659 215 aa, chain - ## HITS:1 COG:CAC2713 KEGG:ns NR:ns ## COG: CAC2713 COG2344 # Protein_GI_number: 15895970 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Clostridium acetobutylicum # 2 213 3 214 214 229 52.0 3e-60 MEAKEISKAVIKRLPRYYRYLGELMEENVERISSNDLSKKMHVTASQIRQDLNNFGGFGQ QGYGYNVRYLYTEIGKILGLDTVHPMIILGAGNLGQALANYVDFEKRGFKLVGIFDVNPV LEGIAVRGIEIQMLSDLPLFLKENEVEIAILTLPKIKAKEMAKILIDNGIKAIWNFAHID LDAPEDVMVENVHLSESLMTLSYNLSEYRKEHEGM >gi|226332989|gb|ACII01000030.1| GENE 13 14606 - 16516 1950 636 aa, chain + ## HITS:1 COG:CAC2714 KEGG:ns NR:ns ## COG: CAC2714 COG0488 # Protein_GI_number: 15895971 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 636 2 638 643 545 47.0 1e-154 MILSCQNICKSFGEKIILKDASFHIEDREKAALIGNNGAGKTTLLRIIMHELSPDSGNVV 
LAKDKNIGYLAQYQDIHGHHTIYEELISTKQHIIDMENRLRSLEQEMKTTTGDALDSLMN TYTRLSHQFELENGYAYKSEIMGVLKGLGFTEEDFERQIETLSGGQKTRVALGKLLLASP DILLLDEPTNHLDMESISWLETYLLNYPGAVFIVSHDRYFLDKVVTKVVEIEAGNMRMYS GNYSAYALKKAQLRDAQYKAYLNQQRDIKHQEAVIAKLKSFNREKSIKRAESREKMLDKV QRIDKPQEIQNQMRISLEPRFVSGNDVLTVESLSKSFPGQTLFSDISFEIKRGERVALIG NNGTGKTTMLKIINGLIDADSGRFTLGSKVQIGYYDQEHHVLHMEKTIFEEISDAYPTLT ETEIRNMLAAFLFTGDDVFKLISALSGGERGRVSLAKLMLSEANFLILDEPTNHLDIASK EILEEALNSYTGTVFYVSHDRYFINQTATRILDLTNQAIVNYIGDYDYYLEKKEELTEKY APAAAQAVTEAKQASDNKLSWQQQKEEQARQRKRENELKKVEARIEELETRDKEIDETMI LPDICTNVAECAKLSKEKAAITEELEGLYEKWEELA >gi|226332989|gb|ACII01000030.1| GENE 14 16633 - 17841 1502 402 aa, chain - ## HITS:1 COG:TM1148 KEGG:ns NR:ns ## COG: TM1148 COG0538 # Protein_GI_number: 15643905 # Func_class: C Energy production and conversion # Function: Isocitrate dehydrogenases # Organism: Thermotoga maritima # 1 402 1 399 399 484 56.0 1e-136 MKKIQMQTPLVEMDGDEMTRILWKIIKDELLLPYIDLNTEYYDLGLEYRNETDDQVTVDA AEATKKYGVAVKCATITPNKARMEEYTLKKMYKSPNGTIRAILDGTVFRAPIVVKGIEPC VKNWKKPITIARHAYGDVYKNTEMYIDGPGDAYLVFEGADGQQRKELIHHYEGPGVLQGM HNLDDSITSFARCCFNYALDTKQNLWLGGKDTISKIYDGRFKEIFATIYEDEFKEKFEAA GIEYFYSLIDDIVARVMKAEGGFIWACKNYDGDVMSDMVSSAFGSLAMMTSVLVSPQGYY EYEAAHGTVQRHYYRYLKGEETSTNSVATIFAWTGALRKRGELDGNKELMEFADKLEKAT LETIESGKMTKDLALITSLEDVTVLNSKDFILAIRERLDEIL >gi|226332989|gb|ACII01000030.1| GENE 15 17957 - 18802 579 281 aa, chain - ## HITS:1 COG:BH1930 KEGG:ns NR:ns ## COG: BH1930 COG1404 # Protein_GI_number: 15614493 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Bacillus halodurans # 10 256 144 414 444 187 38.0 2e-47 MKNSDDSGYTGKHVGVCVLDTGIFPHIDFTGRILAFQDFIGHRIRPYDDNSHGTHVCGII GGDGRASEGRIRGIAPGCSLIVLKVLDRTGNGRKEDVLQAFRWILENKRYYGIRVVNISV GTTCRRAEDHRVLIAGVEQLWDAGLVVVAAAGNQGPKAGSVTAPGSSRKIITVGSSDLLT GRTAISGRGPTFECVCKPDLVAPGNHVLACAPGADNGYGVKSGTSMSTPLVAGAAALMLE KNPELTNVQIKMKLKESARDMGLPKNQQGWGELDLERFMEL >gi|226332989|gb|ACII01000030.1| GENE 16 19113 - 
19280 67 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTALLLILVFLLFQTFVLLCCVAAGNDAASQEISDQEQMEFIREWKKIHCSSPQK >gi|226332989|gb|ACII01000030.1| GENE 17 19310 - 20041 689 243 aa, chain - ## HITS:1 COG:PA3680 KEGG:ns NR:ns ## COG: PA3680 COG0500 # Protein_GI_number: 15598876 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Pseudomonas aeruginosa # 16 241 20 253 261 134 38.0 1e-31 MNDKITVCLGKGGQREMAESFARKNNVPIMDKPGEHLTVMFDSRGVSLTGYGLTYQGDFE GMLHRVTNGRLSHEMLVRAVKTEGEHLKAIDATAGMGEDGFLLAAYGYEVTLYEQNPVIA ALLKDALRRARKHPVLKDIASRMKLVEGDSVSCMEKLMDPVDVIYLDPMFPKRQKSGLIN KKLQLIQKLEPPCSEEKDLFDAAIKAGPSRIIVKRPLKSVCLDGREPSYILKGKAIRYDC YVM >gi|226332989|gb|ACII01000030.1| GENE 18 20062 - 21540 1455 492 aa, chain - ## HITS:1 COG:MA2391 KEGG:ns NR:ns ## COG: MA2391 COG2326 # Protein_GI_number: 20091222 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 489 67 567 569 396 43.0 1e-110 MLTTWTPAHMPEKEEIKIRLQTARSQLYDFQMKIKEHKIPVIVLFEGWGSSGKGSTIGKV IKNIDPRFFKVATMSAPTAEELRYPFLYRYFKQIPEAGKFTFLDSGWMEQTCKDCLSGEI EGEGYTQRIDSIRRFERQLTDNGYLVLKFFMQIDRDEQKRREKLLLDHKDTRWRVSDFDK WQNEHYKKCAKIFSQYLSDTNVSSAPWYIIDASDRKWAELQVLETMVSNIEVAMQNQAHS VPILQNVFPLEQIPRLSDIDLRDKTMDDEEYRGELKQLQSKLGELHNKLYRRRIPVIITY EGWDAAGKGGNIKRITEALDPRGYEVHPIASPEPHEKARHYLWRFWTRLPKNGHIAIFDR TWYGRVMVERLEGFCSENDWKRAYNEMNEFEKELKDWGAVIIKFWVQIDKDTQLERFQER QNNPEKQWKITDEDWRNREKWDAYEVAVNEMLQKTNTSFAPWHVLESVDKKYARIKALRI VISEIEKALKEN >gi|226332989|gb|ACII01000030.1| GENE 19 22084 - 22431 213 115 aa, chain - ## HITS:1 COG:Ta0460_1 KEGG:ns NR:ns ## COG: Ta0460_1 COG0350 # Protein_GI_number: 16081579 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Thermoplasma acidophilum # 4 102 12 105 120 73 43.0 1e-13 MISESYDLVTYSPTKKIYEAVKQIPKGCVATYGQVAEMAGNPRMSRAVGNALHKNPDPGH 
IPCYRVVNFRGELSGAFAFGGKDVQKKLLEADGIEVVNGTVDLKKYGLTQRDDKL >gi|226332989|gb|ACII01000030.1| GENE 20 22753 - 22905 69 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578422|ref|ZP_04855694.1| ## NR: gi|253578422|ref|ZP_04855694.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 50 1 50 50 72 100.0 1e-11 MWDRELKEIRKQEQYRKAISCMEEEQYDKAIILFEGLPRDYEDVRNNFRE >gi|226332989|gb|ACII01000030.1| GENE 21 22986 - 23165 178 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153812050|ref|ZP_01964718.1| ## NR: gi|153812050|ref|ZP_01964718.1| hypothetical protein RUMOBE_02446 [Ruminococcus obeum ATCC 29174] # 1 59 282 340 341 89 71.0 7e-17 YLTKEVLKELLELIRKAGDEKDKTAMVSINWPQAVIRVQYTDMQVQESVTIEEANEGEK Prediction of potential genes in microbial genomes Time: Sat May 28 19:27:00 2011 Seq name: gi|226332988|gb|ACII01000031.1| Ruminococcus sp. 5_1_39B_FAA cont1.31, whole genome shotgun sequence Length of sequence - 14096 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 10, operones - 6 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 429 262 ## Shel_22380 hypothetical protein 2 1 Op 2 . - CDS 459 - 731 301 ## EUBELI_20612 hypothetical protein - Prom 813 - 872 3.7 - Term 910 - 959 1.1 3 2 Tu 1 . - CDS 966 - 2360 688 ## COG3436 Transposase and inactivated derivatives - Prom 2604 - 2663 2.7 4 3 Op 1 . - CDS 2668 - 2832 298 ## gi|253578427|ref|ZP_04855699.1| predicted protein 5 3 Op 2 . - CDS 2822 - 3196 266 ## EUBELI_00417 hypothetical protein - Prom 3229 - 3288 2.3 6 4 Op 1 . - CDS 3323 - 4315 912 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 7 4 Op 2 24/0.000 - CDS 4383 - 4787 256 ## COG1116 ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component 8 4 Op 3 . - CDS 4859 - 5581 382 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component + Prom 5715 - 5774 9.4 9 5 Op 1 . 
+ CDS 5906 - 6742 799 ## COG4820 Ethanolamine utilization protein, possible chaperonin 10 5 Op 2 . + CDS 6785 - 7273 470 ## gi|253578433|ref|ZP_04855705.1| conserved hypothetical protein - Term 7272 - 7326 7.4 11 6 Tu 1 . - CDS 7344 - 7847 596 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily - Prom 7921 - 7980 3.0 - Term 7926 - 7990 6.7 12 7 Op 1 . - CDS 7994 - 8281 390 ## gi|253578435|ref|ZP_04855707.1| conserved hypothetical protein 13 7 Op 2 . - CDS 8306 - 8605 280 ## gi|253578436|ref|ZP_04855708.1| conserved hypothetical protein - Prom 8654 - 8713 5.8 - Term 8887 - 8914 -0.8 14 8 Tu 1 . - CDS 8992 - 10497 929 ## Ccel_3295 hypothetical protein - Prom 10531 - 10590 7.5 - Term 10538 - 10575 2.1 15 9 Op 1 . - CDS 10659 - 13100 2011 ## COG0474 Cation transport ATPase 16 9 Op 2 . - CDS 13094 - 13279 141 ## gi|291546512|emb|CBL19620.1| Cation transporter/ATPase, N-terminus. - Prom 13331 - 13390 9.3 17 10 Tu 1 . - CDS 13913 - 14095 190 ## gi|153812050|ref|ZP_01964718.1| hypothetical protein RUMOBE_02446 Predicted protein(s) >gi|226332988|gb|ACII01000031.1| GENE 1 3 - 429 262 142 aa, chain - ## HITS:1 COG:no KEGG:Shel_22380 NR:ns ## KEGG: Shel_22380 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 3 140 72 210 397 72 26.0 7e-12 MEQYYYNHMNKAQQAAYHSILSGVKNLADEFQIPALEGEELYNVFFQMRLDHPEIFWVSS YKYRYYKDSPNLIFIPEYLFDKKKICEHQKAMTARVEKLIRPAQKLSEWEKEKYVHDFIC ENIRYDKLKKSYSHEIIGPLGQ >gi|226332988|gb|ACII01000031.1| GENE 2 459 - 731 301 90 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20612 NR:ns ## KEGG: EUBELI_20612 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 80 5 81 94 80 48.0 1e-14 MASERTEELYRVLLSKGYPKELCAEIAYKNMNTDYTATRMLGYLYRYTNPKIEDLVDEML AILSDRAQIIEKKESEHAQAVISEMYRKGL >gi|226332988|gb|ACII01000031.1| GENE 3 966 - 2360 688 464 aa, chain - ## HITS:1 COG:MA1425 KEGG:ns NR:ns ## COG: MA1425 COG3436 # Protein_GI_number: 20090285 # Func_class: L 
Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 19 448 21 455 477 120 27.0 4e-27 MEKRALKAEKQRDDALDKVKELQHQFYETAVRLEEEQGKNLKLRAQINRDYENSSIPSSK TLRKKKITNSREKTGRKPGGQPGHKGHCRKKQEPTRPAILLPPPEIVLEDNSFKKTSKTI IKQRVGIRMLLDVTEYHADVYYSSQTGERVHAPFPAGVIDDVNYDGSIRAFLFLLNNDCC TSIDKSRQFLSGLTGGKLNISKGMVSRLSREFALKTEAERRAAYADMLLSPVMHTDCTNG RENGKGCQIYVCATPDGKALYFAREKKGHEGVKDTVTEDYQGILVHDHDRTFYNYGTDHQ ECLAHVLRYLKGSMDNEPDRTWNKDMHSLVQEMIHFRNGLQPSEELDPCKVSEFEERYRK ILETARKEYENVPANDYYRDGYNLFLRIEKYMQNHLLFLHDLRVPATNNEAERLLRNYKR KQAQAVTFRSFESIDYLCQCMSMLVLMRLEEPENIFDRVSRIFG >gi|226332988|gb|ACII01000031.1| GENE 4 2668 - 2832 298 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578427|ref|ZP_04855699.1| ## NR: gi|253578427|ref|ZP_04855699.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 54 1 54 54 68 100.0 1e-10 MNIDLREELRKRDELLDGYLKQIEIQEEFIQKQKEMIEFLEDHISKITDIISGV >gi|226332988|gb|ACII01000031.1| GENE 5 2822 - 3196 266 124 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00417 NR:ns ## KEGG: EUBELI_00417 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 99 10 116 232 168 79.0 6e-41 MNKAYEQEIEMWNQVFKECTPVDLRKLDLKVETMFDEALKLFAEKTTNVLDFGCGTGDIS THKVLGIDASKVGIQFANETAKLSEYKTATFLEGGEHMISFRQFAFFLCFCYTFVRKERT AYEH >gi|226332988|gb|ACII01000031.1| GENE 6 3323 - 4315 912 330 aa, chain - ## HITS:1 COG:RSc0194 KEGG:ns NR:ns ## COG: RSc0194 COG1063 # Protein_GI_number: 17544913 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Ralstonia solanacearum # 3 318 17 332 345 210 38.0 3e-54 MEKPKPVLTHERDAVVKVTLASICSSDLHIKHGSVPRAVPGITVGHEMVGIVESVGSAVT HVKPGDRVTVNVETFCGECFFCKKGYVNNCTDENGGWALGCRIDGGQAEYVRVPFADQGL NKIPEGVTDRQALLVGDVLATGYWAARISEITEEDIVLIIGAGPTGICTLLSVMLKNPKC IIMCEKDENRIHFINEHYPEVLTVSPEECFDFVQDHSDHGGADVVLEVAGAESTFRLAWE 
CARPNATVTVVALYDKAQVLPLPDMYGKNLTFKTGGVDGCDCEETLKLIAEGKINTEPLI THTYPLSRIEEAYELFENKRDGVIKVAVEC >gi|226332988|gb|ACII01000031.1| GENE 7 4383 - 4787 256 134 aa, chain - ## HITS:1 COG:BMEII0798 KEGG:ns NR:ns ## COG: BMEII0798 COG1116 # Protein_GI_number: 17989143 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component # Organism: Brucella melitensis # 1 89 12 106 266 68 36.0 3e-12 MDIKVDHVSKAYGEQQILRDLCCVFPEGKTTCIRGRSGCGKTTLIRLLLGLDIPDKGKIE VISDRKISAVFQEDRLCENLSAASNIRLVCTETIDYILKKTEGKTLIFVTHEEQEAVWLK ADKTLKFMKDHLIK >gi|226332988|gb|ACII01000031.1| GENE 8 4859 - 5581 382 240 aa, chain - ## HITS:1 COG:AGpT116 KEGG:ns NR:ns ## COG: AGpT116 COG0600 # Protein_GI_number: 16119871 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 218 76 288 313 84 27.0 1e-16 MAVFFWITIWQFASMYLGQEILLASPVSVVRKLFELSFTGNFWQSVGFSFVRIVTGFLLA MFLGIFLAVLAYWSKTVEILIVPVIAVVKSTPVASFIILCLIWIPSRNLSVFISFLMVLP VIYTNILEGIRQTDSKILEMAKVFQVNPGRKIRYIYVSQVLPYFLSACRLSLGMCWKAGV AAEVIGVPSGSIGEKLYNAKIYLNTPDLFAWTIVIIVISFVFEKCFLGIVSRVVYMIEHK >gi|226332988|gb|ACII01000031.1| GENE 9 5906 - 6742 799 278 aa, chain + ## HITS:1 COG:FN1783 KEGG:ns NR:ns ## COG: FN1783 COG4820 # Protein_GI_number: 19705088 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin # Organism: Fusobacterium nucleatum # 5 266 7 268 274 249 47.0 4e-66 MELKNRTLSNFADLVQTGECKKFKGRLKVGVDLGTANTVLAVVDTTNRPIAGISAPSQAI RDGVIVNYYESVQLVTRLKAELEEKLKTELPYAAAAIPPGVSEGSSKSIQYVLEGAGFEV SNIVDEPTAAAAVLKISDGAVVDVGGGTTGISILKNGKVIYTDDEATGGSHMTMTVAGHY NIPYEEAEILKTDRSKEAEIFPVIKATVEKMATITQKFLTGYQVPAVYVVGGSASFEDFT GVFEKKLGLPVYRPVHPLLVTPLGIAYHCDLPPVTHKK >gi|226332988|gb|ACII01000031.1| GENE 10 6785 - 7273 470 162 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|253578433|ref|ZP_04855705.1| ## NR: gi|253578433|ref|ZP_04855705.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 162 1 162 162 331 100.0 8e-90 MRELKATKINAADFAPFGTFFSMTEPEGYPLHGEIHKFYPDRISGTCMGSIGFSPIAVHK DERIVKAAEYHTTTWEGIVALDDDMIIHVAPASAGAPVPELTRAFIVPKGTMVKINTAIW HLCPLPLNNEVLHAMIILPECTYANDCTVVDFEEKDWFKITV >gi|226332988|gb|ACII01000031.1| GENE 11 7344 - 7847 596 167 aa, chain - ## HITS:1 COG:MJ0304 KEGG:ns NR:ns ## COG: MJ0304 COG0663 # Protein_GI_number: 15668479 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Methanococcus jannaschii # 8 162 2 156 159 144 49.0 8e-35 MDYKSVKISEDAKIARQSVVIGDVTIGRDSCVLHYAVIRGDDAPIVIGEESNVQENCTIH VSRNMPVHIGNNVTVGHNAVLHGCMIGDRTLIGMGAVVLDGARIGKDCIIGAGSLVTKNT VIPDGSLVMGSPAKIKRNLTWEEKLGNLENSKEYVSVSAEMQKQGVL >gi|226332988|gb|ACII01000031.1| GENE 12 7994 - 8281 390 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578435|ref|ZP_04855707.1| ## NR: gi|253578435|ref|ZP_04855707.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 95 1 95 95 161 100.0 1e-38 MSYGALGLVEVVGSCNAVVVLDQMLKTSDVEFRTWHTKCGGHATIFLSGDIAAVKAAVDA VKGNPPCEIIAAAVISAPSEETSRLVKEEEEKHHK >gi|226332988|gb|ACII01000031.1| GENE 13 8306 - 8605 280 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578436|ref|ZP_04855708.1| ## NR: gi|253578436|ref|ZP_04855708.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 99 1 99 99 155 100.0 6e-37 MAKNYDGMAVGAFELDNLVACFVALDAASKAANVTIQSVERNRLKSGACVKIRGSVSDVN AAMEVALETAKPLGKIVSHTVIASPSADTEVALKMTINK >gi|226332988|gb|ACII01000031.1| GENE 14 8992 - 10497 929 501 aa, chain - ## HITS:1 COG:no KEGG:Ccel_3295 NR:ns ## KEGG: Ccel_3295 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 289 499 410 640 643 90 31.0 2e-16 MIVNKCSENLLKKSKKLYENYRDNCTVVQRMLEKYKKIYPNISDYSIMHFIDIAEFCDLI MDRKKLEGLNEDECYCLLLAALFAHTGFGLNQEIMNRYINRLGIQKQTQSLTFLQIMSKY HVLFSACLIEEYGDIFEFPSEKHKYAITSMLYFIGGNSDDINQLEEILVLDNKNTVRLKD LAAVLVVGNQLAELKNINPDLDYEDFDKYNSEEIVGFVERNVVRSISVKDGKLVIEAGGS DSAYALIERKVNIIINNFNKILRSLTHESGDNTSLFCIDSIELKCVLSAENSEKIYLNKE IEESWTEEDIEFFHKLSFEERVFYADYLSTEFEITKFLMDFAKINKAVLAGLEYRVKSPK SLYNKLYQRVEKSFFDSIADVIRYTVILEPKEYVEQIRSVTDALYEKNWKIYSLKNYWVN DSFPYNGVNAKFKNSRNYRIEIQFHTQESFDVKMSEEDHKLYEQRRVLEPGCDEYNRILQ LQLKLYSDMEYPENIADLEKI >gi|226332988|gb|ACII01000031.1| GENE 15 10659 - 13100 2011 813 aa, chain - ## HITS:1 COG:FN1022 KEGG:ns NR:ns ## COG: FN1022 COG0474 # Protein_GI_number: 19704357 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 1 806 60 862 862 762 52.0 0 MLILIASAVISGMLGDVESAAVIVIVITINAILGTVQTVKAEQSLQSLKKLSGPEAKVLR DGVAVQLPARELVVGDVILLEAGDMIPADGRLIENASLKVDESALTGESLAVEKSMDTIL AEAPLGDRTNMLFSGSFVTYGRGKAVVTDVGMQTEVGKIAGLLKSTSEKQTPLQANLDDF GKKLSILILIFCGILFAINVFRGEKISSAFMFAVALAVAAIPEALSSIVTIVLSFGTQKM AKEHAIIRKLQAVEGLGSVSVICSDKTGTLTQNKMTVEDYYIDGKRISAAAIDAADPAQR CLLDYSILCNDSTNENGVEIGDPTETALINLGSRCGIEAADVRNLYPREGELPFDSDRKM MSTLHRIDGENRMIVKGAVDRLLELTERIWTKDGVREITEADKEKIQMQNQDFSMEGLRV LAFTYREIPENHILTMEDEKHLVFLGLIAMMDPPREESKAAVAECIKAGIRPVMITGDHK ITAAAIAKRIGILHDLSEACEGADIENMSDEQLREFVPNISVYARVSPEHKIRIVRAWQE KGMIVAMTGDGVNDAPALKQADIGVAMGVTGTEVAKDAAAMVLTDDNFATIVKAVENGRN LYQNIKYAIQFLLSGNFGAILAVLCASLAGLPVPFAPVHLLFINLLTDSLPAIALGVEPH SSEVMNEKPRSADESILTKDFLGKIGLEGFVIGVMTMIGFLTGYYQNGALLGSTYAFGTL 
CLARLFHGFNCKSDHPVIFTKRFFNNKWLQGAFALGSVLITAVLTVPGLHFLFKVETLNL MQLGCVYLYAFASLPIIQMLKWIRMKLRKRGER >gi|226332988|gb|ACII01000031.1| GENE 16 13094 - 13279 141 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291546512|emb|CBL19620.1| ## NR: gi|291546512|emb|CBL19620.1| Cation transporter/ATPase, N-terminus. [Ruminococcus sp. SR1/5] # 1 61 1 61 61 87 86.0 2e-16 MKEIYQQTVEEVLDHVESRESGLTSEQVERSRENCGWNELAEGKKKVFYRFSLNSIRIFW C >gi|226332988|gb|ACII01000031.1| GENE 17 13913 - 14095 190 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153812050|ref|ZP_01964718.1| ## NR: gi|153812050|ref|ZP_01964718.1| hypothetical protein RUMOBE_02446 [Ruminococcus obeum ATCC 29174] # 1 60 282 341 341 90 71.0 2e-17 YLTKEVLKELLELIRKAGDEKDKTAMVSINWPQAVIRVQYTDMQVQESVTIEEANEGEKE Prediction of potential genes in microbial genomes Time: Sat May 28 19:27:46 2011 Seq name: gi|226332987|gb|ACII01000032.1| Ruminococcus sp. 5_1_39B_FAA cont1.32, whole genome shotgun sequence Length of sequence - 8468 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 4, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 43 - 102 6.1 1 1 Tu 1 . + CDS 234 - 1574 950 ## COG0534 Na+-driven multidrug efflux pump + Term 1581 - 1643 4.1 - Term 1565 - 1635 15.9 2 2 Op 1 . - CDS 1694 - 2716 517 ## COG2267 Lysophospholipase 3 2 Op 2 . - CDS 2796 - 4154 1065 ## COG2211 Na+/melibiose symporter and related transporters 4 2 Op 3 . - CDS 4180 - 4941 623 ## COG0483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family 5 2 Op 4 . - CDS 4991 - 6580 1035 ## EUBREC_0128 hypothetical protein - Prom 6631 - 6690 5.3 - Term 6593 - 6652 4.1 6 3 Op 1 . - CDS 6732 - 7535 212 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 7 3 Op 2 . - CDS 7544 - 7657 56 ## 8 3 Op 3 . - CDS 7644 - 8009 365 ## EUBELI_20143 hypothetical protein - Prom 8043 - 8102 3.3 - Term 8068 - 8103 0.1 9 4 Tu 1 . 
- CDS 8130 - 8444 244 ## gi|253578447|ref|ZP_04855719.1| predicted protein Predicted protein(s) >gi|226332987|gb|ACII01000032.1| GENE 1 234 - 1574 950 446 aa, chain + ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 408 1 409 452 155 27.0 1e-37 MSKKEISFTEGPVFQSLLRFAIPVLGALILQAAYGAVDLLIVGKFGNASSISAVGTGSSF MQMATFIITSLAMGSTVIIGHHIGEQKPKEAGNAVGTTIILFLIIAVIMTFVLEFAAGGI AHLLQAPAESFDKTILYIRICSAGIVIIIAYNVISGVLRGVGNANLPLLFVGIACVINIL GDLLLVGVFHMDVAGAAIATVFAQFVSVVCSVVVLRKQDMPIEFSREQCRIYKEELGKIL NVGIPIALQETTVQISFLVVNSVINQMGLMPSAGYGIAQKIVSFIMLVPSSIMQSVSAFV AQNIGAGKKSRAWKGFYTAIITGCSVGIIIFMVGFFGGGVLSSFFTNDSEVIAQSAAYLK GFSADCILTCVLFSSIGYFNGCGKSIPVMIQGITSAFCIRIPVSIIMSRLPETSLTYVGM ATPITTVYGIIFFVICFRLLNKNSAK >gi|226332987|gb|ACII01000032.1| GENE 2 1694 - 2716 517 340 aa, chain - ## HITS:1 COG:CAC2246 KEGG:ns NR:ns ## COG: CAC2246 COG2267 # Protein_GI_number: 15895514 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Clostridium acetobutylicum # 4 332 47 361 363 175 33.0 1e-43 MEIIHENEYLYKMNDEIEPYLATHRQEGYISGAEDHLHRKDTGIVGKIHVQRYLAEKPKG VIIISHGFTEAAPKYEEMIYYFLKAGYHVYMPEHMGHGQSYCLTADPSLVHIDTWKRYVR DFLKICHVIKKTYPELPLVLFAHSMGGAIGTIAAAWEPQLFQKIILNSPMLRPLTGNVPW PLVIAIAQTKCLLGREEDYVAGQKPYDGSETFETSGCTSRPRFEKYNEMRKENKKFQVSA ASYGWLLASVKMSWYIKYCGWKKLTAPVLLFQAERDDFVSVNALRKFAEKINHRGITSCE YVYLPGTRHETYRSDDRTMEMYLDKIMKFMDSGMVSDSCT >gi|226332987|gb|ACII01000032.1| GENE 3 2796 - 4154 1065 452 aa, chain - ## HITS:1 COG:FN0222 KEGG:ns NR:ns ## COG: FN0222 COG2211 # Protein_GI_number: 19703567 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Fusobacterium nucleatum # 1 411 1 396 448 145 29.0 1e-34 MKKLTTGKIWQFAAGQFGWAMLSGIISNWLVYFYQPDKTAISQGQTVFIPQGLVVFGIFT IIGGITAFGRIFDAFTDPMIASLSDRCTSKNGRRIPFLKWASLPLALSTVLVFWSPVNKN 
SWINGVFLLIMILAYYLSITAYCTPYNALIPELGHTQQERLNISTVISFTFIAGTAVAYL APTIWGMFIPAFGRVGAIRMTFTLMAAIAFICMLVPVFCIREKDYVDTVPAKESAFGSLA ATFKNGEFRKFVASDIFYWIALTMFQTGLPFFVTSLLKLPETMTTLYFVGMTALSLVFYI PVNKLTPKLGKKKMILFAFAVFSLAYFYTGFMGKMSFIPASVQGLILTVVAALPMAIFGI LPQAVVADISQSDSITSGSNREGMFYAARTFAFKLGQSISMLIFTAVSTIGAAEGTGYRV AAFGAAVFCCIGGIILVFYNEKKINGIINRHQ >gi|226332987|gb|ACII01000032.1| GENE 4 4180 - 4941 623 253 aa, chain - ## HITS:1 COG:PM0315 KEGG:ns NR:ns ## COG: PM0315 COG0483 # Protein_GI_number: 15602180 # Func_class: G Carbohydrate transport and metabolism # Function: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family # Organism: Pasteurella multocida # 11 237 4 232 267 143 36.0 3e-34 MELKEKTPHTILNEIIEAAKECGQVMLQADRANFGIKDKAGKANFVTKYDCKIQEMLEKK LSEILPEAEFWGEEEDCQINRNAEYIFVVDPIDGTTNFIKDYHMSCVSIGLIRNGKRYMG VVHNPYLNETFYAISGEGAYMNGNAIHVSKDDLANSIVLFGSSPYNTELAKASFELAYEY FQKCLDIRRSGSAALDLCAIASGRAEIYFELILSPWDFAAGALIVEEAGGIVTTVEGEEL PCLEKSSILARNK >gi|226332987|gb|ACII01000032.1| GENE 5 4991 - 6580 1035 529 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0128 NR:ns ## KEGG: EUBREC_0128 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 528 22 549 550 707 64.0 0 MGRFVNPDNSAFQVALNSPIYVDKTGLLEYTNSVLNTTEAYICNCRPRRFGKSYAANMLA AYYSKGCNSEEMFSGLDISRESDFKTHLNKYDVIHLDIQWFLANCDNVDNVVAFITKSVQ AELREIYPGVLPEEEISLSESLSRIKNIVGQKFIIIIDEWDVLIRDEAANKKVQEKYINF LRAMFKGTEPTKYIQLVYLTGILPIKKEKTQSALNNFDEFTMLDAGVMAPYIGFTEAEVK DLCERYHRDFEKVKYWYDGYLLEDYQVYNPKAVVSVCVRGRFRSYWSETASYEAIVPFIN MNYDGLKNAIIEILSGASIKVNTAAFKNDTVNIQSKDDVLTYLIHLGYLGYNQNRRTAFV PNEEIRQELTMAVESRKWNEMITFQQESEHLLEATLDMDEEAVEEGIEKIHTEYVSNIQY NNENSLSSVLAIAYLSSMEYYFKPVRELPTGRGFADFVFIPKPEYVSSYPALVVELKWNK NVKTALQQIKEKQYPDSVLDYTGDILLVEINYDKKTKEHQCLIEKYEKL >gi|226332987|gb|ACII01000032.1| GENE 6 6732 - 7535 212 267 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 5 220 2 238 242 86 30 7e-17 
MESKQKRTALVTGGSSGIGRCTVAALSKAGYIVYEFSRRDVPVEGVKHMCVDVTDEASVQ EAVGQILLERGSIEILVNCAGFGISGAVEFTELKQAKAQFDVNFFGTVNVSRAVLPSMRR QHRGHIVNISSVAAVAHIPFQAFYSASKAAVSSYSCALDNEVSPYGVRVTAVELGDIHTG FTQARQKIVLGDDEYGGRISHGVSQMEKDELSGMSPEIIGTYIARIAQKKNCAPICVAGV KYKILRFLCKILPCTLRGKIVGSIYAK >gi|226332987|gb|ACII01000032.1| GENE 7 7544 - 7657 56 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MICGDVMAQDEWCRQLFKNEEIRQKLTRVVRRKNGMR >gi|226332987|gb|ACII01000032.1| GENE 8 7644 - 8009 365 121 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20143 NR:ns ## KEGG: EUBELI_20143 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 121 109 228 228 213 89.0 1e-54 MISFYKSIVDQDRASIVICNLKHEIIYMNPAAVNSYAKRGGDKLIGRSLLDCHNQESRDK IQKVVDWFAADESHNLVYTFHNEKQNKDVYMVALREEGRLIGYYEKHEYRNAETMKQYDL W >gi|226332987|gb|ACII01000032.1| GENE 9 8130 - 8444 244 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578447|ref|ZP_04855719.1| ## NR: gi|253578447|ref|ZP_04855719.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 104 1 104 104 181 100.0 2e-44 MTENVEFADRFMNKNGSLRKTPLKKKRAISKLEFFIPEYEYRRLKKMKDPIETLERPVEH MTVYRNDGSSVTLTAENGRVSIVDSREKNVRHIIEADYFVSKIL Prediction of potential genes in microbial genomes Time: Sat May 28 19:28:07 2011 Seq name: gi|226332986|gb|ACII01000033.1| Ruminococcus sp. 5_1_39B_FAA cont1.33, whole genome shotgun sequence Length of sequence - 15488 bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 9, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 62 - 247 152 ## gi|253578448|ref|ZP_04855720.1| predicted protein - Prom 282 - 341 4.9 - Term 521 - 557 1.2 2 2 Tu 1 . - CDS 682 - 810 64 ## - Prom 864 - 923 2.6 3 3 Op 1 . - CDS 1234 - 2703 403 ## BF3278 hypothetical protein 4 3 Op 2 . - CDS 2710 - 3765 782 ## BF3279 hypothetical protein - Prom 3856 - 3915 5.9 5 4 Op 1 . 
- CDS 4113 - 4985 379 ## EUBREC_0159 hypothetical protein 6 4 Op 2 1/0.000 - CDS 4988 - 7483 1728 ## COG1061 DNA or RNA helicases of superfamily II 7 4 Op 3 . - CDS 7453 - 8049 309 ## COG0500 SAM-dependent methyltransferases - Prom 8157 - 8216 7.5 - Term 8296 - 8357 13.3 8 5 Op 1 . - CDS 8407 - 9009 697 ## COG2316 Predicted hydrolase (HD superfamily) 9 5 Op 2 . - CDS 9097 - 9618 730 ## COG1335 Amidases related to nicotinamidase - Prom 9698 - 9757 3.9 10 6 Op 1 . - CDS 9782 - 10258 458 ## COG4728 Uncharacterized protein conserved in bacteria 11 6 Op 2 . - CDS 10310 - 10831 327 ## Ppha_2232 hypothetical protein 12 6 Op 3 . - CDS 10828 - 11850 763 ## Dhaf_0648 AAA ATPase - Prom 11877 - 11936 7.6 - Term 11867 - 11911 -0.2 13 7 Tu 1 . - CDS 12023 - 12364 362 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family - Prom 12498 - 12557 5.5 14 8 Tu 1 . - CDS 12623 - 13387 644 ## gi|253578460|ref|ZP_04855732.1| predicted protein - Prom 13413 - 13472 1.8 + Prom 13695 - 13754 7.4 15 9 Tu 1 . + CDS 13847 - 15451 1908 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) Predicted protein(s) >gi|226332986|gb|ACII01000033.1| GENE 1 62 - 247 152 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578448|ref|ZP_04855720.1| ## NR: gi|253578448|ref|ZP_04855720.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 61 1 61 61 103 100.0 2e-21 MKLFLIGGMEDLNFNYCYKIIYESGETYDRRRNELSVEISKEDYKKIITGVLQERSIDQI E >gi|226332986|gb|ACII01000033.1| GENE 2 682 - 810 64 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQVVGGPNSTGEGMNKNKYWREVERELCLAVLACETISKRKH >gi|226332986|gb|ACII01000033.1| GENE 3 1234 - 2703 403 489 aa, chain - ## HITS:1 COG:no KEGG:BF3278 NR:ns ## KEGG: BF3278 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 81 471 106 500 535 114 27.0 8e-24 MLERIRNQITELRKEEERRYTYEAEFSFSKELPVQYSEWMNTIWDISNRDKIKIYLVSEG GESFDLKENVEEEYAEFLKGLESDETVEVSLTIDKKIVNGYLSVYCFEKFAEDISSLPID KVLTAFSNFMKKAGRSITFELFDSRDMFYTKTMFFLTAGSRPLDLDFDRTNRLLECRENA YFYNQDNYELLPDDFKIEIGYEGNPLKDLFLKMETILSACLLASNSFIQDGKLKLQIMGQ RSMEYHDTLENIQGNNNLYKIYNWIYSGGSIVDKVIIARNIICLHCKYESLLNISDGVMA SIQTNYNLYLRDNVTQYLELKNKVAEFITDIVSRTGEYATEMLDKFKTNLLAILGFLFSV ILANIVSDQPLDNIFTKDITILSEIVLIASFGYLFISYQQSKYELQKVYSSYDKLKDSYR EILTEDDIRECFQNDDIKTEMKRTIDRSVRIYLIIWGGILLSLLILVEYMSSEPFIWPIL KSLFQITKK >gi|226332986|gb|ACII01000033.1| GENE 4 2710 - 3765 782 351 aa, chain - ## HITS:1 COG:no KEGG:BF3279 NR:ns ## KEGG: BF3279 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 83 347 84 343 343 95 27.0 2e-18 MEIIAQAVRVIDYENNSVQRRNVNMMPKFKDYIEQLIVHVSNNVSIREYHTQSTRTEVIA GILEIAGNLETPDAIEEKIDSIAERLVRNEREAQARIGHMNISVQKGSLIFALFEDNGNR KCILAKIEHTGFFDERDYSERFGFSKDTKKIWKTCMFDLNNLNADQFQAKIYSNTVAKYW WFDFLELVECQNDRVNTYKTYDSVEKCLNRKLKKTAPYDNRVIQNAMYAYMNNANREMFD YDNMIEKIFTGYMPNDMTGEQKRELHERLNRLPEEKGFDRQFQAIPDAIKKKAIRTYKIN TGIMLKISDIGNENDISAYEDAYGKKYLKIRLSDDRTFQTFARNNEDNREE >gi|226332986|gb|ACII01000033.1| GENE 5 4113 - 4985 379 290 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0159 NR:ns ## KEGG: EUBREC_0159 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 169 285 128 244 251 108 45.0 3e-22 MEKMRMESENYVYKKEIDWSTLMEGFTLPLDNQVIFLRNMENFLQRGQSKIIHFFMNGKT 
YDAKIVNMNNSVEKRKKDAYQIRYPRNGELSQALQQYFFKSMSYIKMIRENRDPKDRSYI KMPDGLKEYLAIYTTEYEDTFLLEPIAQDDFQVMKKAIQGMRERTVENEIEYEMEDKSSG IEKKLQIVKIRKLNRKIGENLKLLYGYRCQICGQVIGEKYGSHIAEAHHIDYFVNSLNND ANNQMIVCPNHHSVIHDANPVFDRRRMVYRFDNGVEERISLNKHLFIYVK >gi|226332986|gb|ACII01000033.1| GENE 6 4988 - 7483 1728 831 aa, chain - ## HITS:1 COG:CAC2824_2 KEGG:ns NR:ns ## COG: CAC2824_2 COG1061 # Protein_GI_number: 15896079 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Clostridium acetobutylicum # 221 826 3 606 616 564 49.0 1e-160 MAEFTSAEKVNIGYENEYSTDAMTGGPDKRRQLYYQLIQSMKKAESVDIIVSFLMESGVR MLLKELEHTLKRGAKIRILTGNYLGITQPSALYLIKRKLGDRVDLRFYCEKGRSFHPKSY IFHYTDYSELYIGSSNISRSALTSGIEWNYRFSSQKDPENYKEFFRTFEDLFVNHSIIID DKELKRYSQNWHRPAVAKDLDRYDFTESETNDTKIKPLYEPRGAQIEALCALEDTRAEGA QKALIQAATGIGKTFLAAFDSKKYEKVLFVAHREEILKQAAVSFQNVRNSKDYGFFMGAE KCTDKPLIFASVASLGKPEYLNEKYFAPDYFNYVVIDEFHHAVNDQYRRIVEYFRPQFLL GLTATPERMDGKNIYEICDYNVPYEISLKDGINKGMLVPFHYYGIYDETDYTKLHIVKGK YAEEELNRTYIGNAYRHELIYKYYCKYGSSQALGFCCSRKHAEEMAKEFGKRGIPSAAVY SNADGEFSMDRAEAIEKLKNGEIKVIFSVDMFNEGVDIPSVDMVMFLRPTESPVVFLQQL GRGLRRSKGKEYLNVLDFIGNYEKAGRVRYLLIGKNKTGKETYYPADKADYPDECFVDFD MRLIDLFAEMDKKQLTIKEQIRNEYFRIKELLGKQPTRMDLFTYMDDDVYQMAVTHSNEN PFKKYLNYLEELDELTDEQKKFSQGMGKDFINLLENTNMSKVYKMPVLMAFYNHGNVRME VSEVELLASWKKFFSTGTNWKDLEKEITYEEYRKISDKNHIQKIMKMPVRFLLKSGEEFF VKKDGAALALRDEMEEIVKEPVLAEQMKDVIEYRAMDYYRRRYKEQIKALL >gi|226332986|gb|ACII01000033.1| GENE 7 7453 - 8049 309 198 aa, chain - ## HITS:1 COG:VC0813 KEGG:ns NR:ns ## COG: VC0813 COG0500 # Protein_GI_number: 15640831 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Vibrio cholerae # 7 198 2 192 192 159 41.0 2e-39 MADIDKTLKYYNENAQSFASGTVSVKFTKVQDKFLEKLNPDAYILDFGCGAGRDTKYFLS RGYQVDAVDGSEQLCRIASEYTEIKVRQMLFQELDEKEKYDGIWACASILHLPKKQLREV 
LKNMYAALKSKGWIYTSFKYGEFEGERNGRYFTDFTTYTFKDFIHDMHGLKIEEHWITGD VRPGRGEEKWLNLLLQKK >gi|226332986|gb|ACII01000033.1| GENE 8 8407 - 9009 697 200 aa, chain - ## HITS:1 COG:DR2421 KEGG:ns NR:ns ## COG: DR2421 COG2316 # Protein_GI_number: 15807410 # Func_class: R General function prediction only # Function: Predicted hydrolase (HD superfamily) # Organism: Deinococcus radiodurans # 1 185 27 202 205 134 40.0 1e-31 MKTITRDEAFALLKKYNKDPFHIQHALTVEAVMKWYASELGYGEDAEYWGIVGLLHDIDF ELYPEEHCLKAPEMLRKAGVGEDVIHSVVSHGYGITVGCGATIDVAPEHEMEKVLFAADE LTGLIWAAALMRPSKSTKDMELKSLKKKYKSKGFAAGCSREVIERGAEQLGWELQKLLTM TLQAMADCEDEIKSEMDAEE >gi|226332986|gb|ACII01000033.1| GENE 9 9097 - 9618 730 173 aa, chain - ## HITS:1 COG:L67226 KEGG:ns NR:ns ## COG: L67226 COG1335 # Protein_GI_number: 15672251 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Lactococcus lactis # 1 171 2 168 171 164 54.0 8e-41 MQKVLVVVDMQNDFIDGALGTKEAVAIVPGVKEKIENFDGVVLFTRDTHETYYLDTQEGQ KLPVPHCIRDTEGWQIRSELDALRKTEPIDKETFGSTDLAGELVAMNEDNEIKSITFVGL CTDICVISNALLAKASLPEVPIIVDAKCCAGVTPESHENALKAMEACQIQIDR >gi|226332986|gb|ACII01000033.1| GENE 10 9782 - 10258 458 158 aa, chain - ## HITS:1 COG:CAC1208 KEGG:ns NR:ns ## COG: CAC1208 COG4728 # Protein_GI_number: 15894491 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 20 157 22 149 173 72 32.0 4e-13 MAKIRIIKKNNDCSMEYEVGDICEVTGTWYGGVHINGRSGIPVSLDKEEYMELNTEPEEP DSAENENTLETPEHDIHAGDIVQHFKREWVSPDTSEYLYKVLAVAHHTENGEKLMIYQAL YAPFKICARPYDMFMSEVDRDKYPDIKQKYRFEKYTEK >gi|226332986|gb|ACII01000033.1| GENE 11 10310 - 10831 327 173 aa, chain - ## HITS:1 COG:no KEGG:Ppha_2232 NR:ns ## KEGG: Ppha_2232 # Name: not_defined # Def: hypothetical protein # Organism: P.phaeoclathratiforme # Pathway: not_defined # 1 166 1 159 165 71 31.0 1e-11 MSVIPESGMNFGKYDEKDLFHIETSEIYKKLGDGIPTVEFILKYNGNNIVFLEAKKSCPN 
VANRYESKEKELKFEEYYGSITEKFISSLQIYLAAIMNRYQDTSEIGDNLRSVSNMKDVK LKFILVVKNAEDITWLAGPLAELRARLLKVRKIWGVEVVVLNEELAGEYNLTC >gi|226332986|gb|ACII01000033.1| GENE 12 10828 - 11850 763 340 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_0648 NR:ns ## KEGG: Dhaf_0648 # Name: not_defined # Def: AAA ATPase # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 3 333 1 332 340 371 56.0 1e-101 MEMPLTKIAAENFTVFEDIKIPFCEGLNVLVGENGVGKTHIMKLAYAACQASKHDVSFSQ KTTMLFRPDQSGIGRLVNRGKSGSNKAMVSVESDTAKIGMTFSTKTRKWDAEINSEEKWE KQMSDLTSVFIPAKEILSNAWNLDAAVKMGNVEFDDTYLDIIAAAKIDISRGVDSTVRKK YLDILQKISNGKVTLHEDRFYLKPGTQAKLEFNLVAEGLRKIALLWQLIKNGTLEKGSVL FWDEPEANINPKYIPVLAELLIMLETEGVQIFVSTHDYFLSKYIEVKRRQDSKLQYISLY KNENNKVLCEIAPEFELLEHNAIMDTFRQLYREEIGVALK >gi|226332986|gb|ACII01000033.1| GENE 13 12023 - 12364 362 113 aa, chain - ## HITS:1 COG:CAC0417 KEGG:ns NR:ns ## COG: CAC0417 COG1393 # Protein_GI_number: 15893708 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Clostridium acetobutylicum # 1 112 1 111 112 116 59.0 8e-27 MNIQIFGTKKNFDSKKAERYFKERGIKYQFIDMKEKGLSKGEFQSVCQAIGGYDKLIDTD CKDKDLLALITYIAEEDKAEKILENQSIIKVPVVRNGKQATVGYQPEIWKSWK >gi|226332986|gb|ACII01000033.1| GENE 14 12623 - 13387 644 254 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578460|ref|ZP_04855732.1| ## NR: gi|253578460|ref|ZP_04855732.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 254 1 254 254 398 100.0 1e-109 MKIIKTYYLIDYENVGSEGFKGCEKLRETDIIHLFYTDNSRKIDLDIINDHGESKLITHK VPTGNQSADMHLGSYLGYLIGKECTGQDEECKIVVISKDTGFDHIIEFWKAEENVKISRN EKISGKQVQTRKQVKKQTSKEKDRQLAEQTDQKTEKQLNEQSEKQSVKQSENQNTKKPKQ KIIDMKAGTKKLQQELVLVLDKAGMPKEVVNYVSTTVEKNAEDKNRKQQIYRSIISKYGQ TKGLNIYNHIKKKI >gi|226332986|gb|ACII01000033.1| GENE 15 13847 - 15451 1908 534 aa, chain + ## HITS:1 COG:VC2738 KEGG:ns NR:ns ## COG: VC2738 COG1866 # Protein_GI_number: 15642731 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Vibrio cholerae # 2 534 10 541 542 787 71.0 0 MAKIDLSKYGITGTTEIVYNPSYEVLFEEETKPELEGFEKGQVSELGAVNVMTGIYTGRS PKDKYIVMDENSKDTVWWTTDEYKNDNHPATQEAWNAVKEIAKKELSNKRLFVVDAFCGA NKDTRMAIRFIVEVAWQAHFVTNMFIQPTAEELENFEPDFVVYNASKAKVENYKELGLNS ETAVMFNITTKEQVIVNTWYGGEMKKGMFSMMNYFLPLKGMASMHCSANTDMNGENTAIF FGLSGTGKTTLSTDPKRLLIGDDEHGWDDNGVFNFEGGCYAKVINLDKESEPDIYNAIKR NALLENVTLDAEGKIDFADKSVTENTRVSYPINHIENIVRPISSAPAAKNVIFLSADAFG VLPPVSILTEAQTQYYFLSGFTAKLAGTERGITEPTPTFSACFGQAFLELHPTKYAEELV KKMEKSGAKAYLVNTGWNGTGKRISIKDTRGIIDAILNGDILNAPTKKIPFFDFEVPTEL NGVDTGILDPRDTYADASEWEEKAKDLAGRFIKNFAKYEGNAAGKALVAAGPQL Prediction of potential genes in microbial genomes Time: Sat May 28 19:29:01 2011 Seq name: gi|226332985|gb|ACII01000034.1| Ruminococcus sp. 5_1_39B_FAA cont1.34, whole genome shotgun sequence Length of sequence - 51140 bp Number of predicted genes - 39, with homology - 39 Number of transcription units - 24, operones - 10 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 20 - 54 5.2 1 1 Tu 1 . - CDS 103 - 1014 820 ## COG1893 Ketopantoate reductase - Prom 1047 - 1106 5.5 + Prom 947 - 1006 6.7 2 2 Tu 1 . 
+ CDS 1191 - 2240 251 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 3 3 Op 1 18/0.000 - CDS 2237 - 2722 674 ## COG0054 Riboflavin synthase beta-chain - Prom 2749 - 2808 6.5 4 3 Op 2 15/0.000 - CDS 2819 - 4018 1171 ## COG0108 3,4-dihydroxy-2-butanone 4-phosphate synthase - Prom 4046 - 4105 3.4 - Term 4115 - 4160 1.0 5 3 Op 3 16/0.000 - CDS 4182 - 4829 463 ## COG0307 Riboflavin synthase alpha chain 6 3 Op 4 . - CDS 4810 - 5919 560 ## COG1985 Pyrimidine reductase, riboflavin biosynthesis - Prom 5962 - 6021 6.0 7 4 Tu 1 . - CDS 6385 - 7875 1766 ## COG0466 ATP-dependent Lon protease, bacterial type - Prom 7933 - 7992 8.2 8 5 Tu 1 . + CDS 8225 - 9286 1034 ## COG0180 Tryptophanyl-tRNA synthetase + Term 9380 - 9432 4.1 - Term 9362 - 9423 10.6 9 6 Tu 1 . - CDS 9500 - 11656 2227 ## EUBREC_0138 hypothetical protein - Prom 11887 - 11946 9.7 + Prom 11857 - 11916 8.1 10 7 Op 1 . + CDS 12049 - 12888 642 ## COG2207 AraC-type DNA-binding domain-containing proteins 11 7 Op 2 . + CDS 12893 - 14095 758 ## Cbei_2712 FliB family protein + Term 14130 - 14175 12.4 - Term 14118 - 14163 7.5 12 8 Op 1 . - CDS 14257 - 16755 2791 ## COG1472 Beta-glucosidase-related glycosidases - Prom 16775 - 16834 3.5 - Term 16785 - 16829 4.0 13 8 Op 2 38/0.000 - CDS 16840 - 17685 861 ## COG0395 ABC-type sugar transport system, permease component 14 8 Op 3 . - CDS 17689 - 18642 997 ## COG1175 ABC-type sugar transport systems, permease components 15 9 Tu 1 . - CDS 18774 - 20135 1590 ## EUBELI_20530 hypothetical protein - Prom 20222 - 20281 5.7 + Prom 20272 - 20331 8.8 16 10 Op 1 . + CDS 20548 - 23247 2269 ## COG3459 Cellobiose phosphorylase 17 10 Op 2 . + CDS 23326 - 25737 1963 ## COG3459 Cellobiose phosphorylase - Term 25901 - 25941 1.1 18 11 Op 1 . - CDS 25997 - 26794 593 ## COG2207 AraC-type DNA-binding domain-containing proteins 19 11 Op 2 . - CDS 26804 - 27472 518 ## Cphy_0177 hypothetical protein 20 11 Op 3 . 
- CDS 27495 - 27872 552 ## Cphy_3229 hypothetical protein - Prom 27892 - 27951 4.3 21 12 Tu 1 . - CDS 27975 - 30179 2235 ## COG0550 Topoisomerase IA - Prom 30297 - 30356 3.1 - Term 30192 - 30239 4.5 22 13 Tu 1 . - CDS 30391 - 31059 752 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 31226 - 31285 9.0 + Prom 31224 - 31283 13.5 23 14 Tu 1 . + CDS 31358 - 32041 350 ## COG0671 Membrane-associated phospholipid phosphatase + Term 32057 - 32111 12.1 - Term 32045 - 32098 4.3 24 15 Op 1 . - CDS 32170 - 33186 1113 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 25 15 Op 2 . - CDS 33183 - 34241 513 ## PROTEIN SUPPORTED gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 - Prom 34338 - 34397 7.0 26 16 Tu 1 . - CDS 34437 - 34733 288 ## gi|253578487|ref|ZP_04855759.1| conserved hypothetical protein - Prom 34838 - 34897 5.2 - Term 34848 - 34887 1.1 27 17 Op 1 1/0.000 - CDS 34910 - 36475 1407 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 36640 - 36699 6.2 - Term 36590 - 36635 4.8 28 17 Op 2 . - CDS 36771 - 38129 1255 ## COG0534 Na+-driven multidrug efflux pump - Prom 38183 - 38242 5.7 - Term 38258 - 38299 7.3 29 18 Tu 1 . - CDS 38357 - 39898 2052 ## COG0696 Phosphoglyceromutase - Prom 40001 - 40060 5.9 - Term 40091 - 40129 1.0 30 19 Op 1 . - CDS 40241 - 41536 1582 ## COG0460 Homoserine dehydrogenase 31 19 Op 2 . - CDS 41590 - 41850 139 ## gi|253578493|ref|ZP_04855765.1| predicted protein - Prom 41897 - 41956 7.5 32 20 Tu 1 . + CDS 41957 - 42178 186 ## gi|253578494|ref|ZP_04855766.1| conserved hypothetical protein + Prom 42381 - 42440 7.0 33 21 Tu 1 . + CDS 42609 - 43979 1342 ## COG2509 Uncharacterized FAD-dependent dehydrogenases + Term 44035 - 44086 4.1 - Term 44023 - 44072 7.5 34 22 Op 1 . - CDS 44080 - 44553 333 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 35 22 Op 2 . 
- CDS 44618 - 45778 1046 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase - Prom 45811 - 45870 10.2 - Term 46324 - 46385 2.6 36 23 Op 1 13/0.000 - CDS 46447 - 47196 1014 ## COG0149 Triosephosphate isomerase 37 23 Op 2 26/0.000 - CDS 47241 - 48434 1620 ## COG0126 3-phosphoglycerate kinase - Prom 48478 - 48537 2.5 - Term 48519 - 48557 9.1 38 23 Op 3 . - CDS 48593 - 49612 1377 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase - Prom 49749 - 49808 7.0 + Prom 49825 - 49884 9.3 39 24 Tu 1 . + CDS 49931 - 50749 690 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family + Term 50890 - 50924 -0.1 Predicted protein(s) >gi|226332985|gb|ACII01000034.1| GENE 1 103 - 1014 820 303 aa, chain - ## HITS:1 COG:BH1763 KEGG:ns NR:ns ## COG: BH1763 COG1893 # Protein_GI_number: 15614326 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Bacillus halodurans # 1 291 1 287 304 106 24.0 5e-23 MKYLIIGAGGTGGVTGYYMKKAGKDVTLIARGEHLKKIQKQGLTLEKMWDKTEENISIPA TDMEHYAEHPDVIFVCVKGYSLQETIPFIKRIARKNTIVIPVLNIYGTGGKMQKELPDLL VTDGCIYVSANIKEPGKLLQHGQILHIVFGVRDKAEFRPELKEIQKDLCDSHIDGTLSEN IRREALEKFSYVSPIGAAGLYYHATAADFQKEGEARELFKTMIKEIAALAEAMGVPFQKD MAEVNLKILSNLAPEATTSMQRDIMAGKQSEIDGLVYEVVRMGEEYHVPVPAYEKVAEKL RAL >gi|226332985|gb|ACII01000034.1| GENE 2 1191 - 2240 251 349 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 43 330 41 279 285 101 29 1e-20 MERYLEFRIDEQTTELKMEAFLKKNAGLTKRQISQAKFRPDGITRNGVRCRVTETIYPGD VIRVCLEEAGTASAHLENYNDGIRSKDLSDFHSNRASASISETSFSLNILYEDTDILAVD KPAGMVTHPTGMHYTDSLSNLVAGYFREKNEQVCVRPVGRLDQETSGIVVFAKNQVAASR LQSVKSPCKIHKQYLAAVSGTLPVDMPGVWHTLDLPLMQDPENHLKMKTAADLSLHSSAL SGKIKTAVTHYHTLFSSQDWSLITLKLDTGRTHQIRVHMASTGHPLLGDTLYHSDDKISS CASFSRAALHAWKVSFQHPFRDEMLLLEAPLPFDFRELVSLSGSDLLSI >gi|226332985|gb|ACII01000034.1| GENE 3 2237 - 2722 674 161 aa, chain - ## 
HITS:1 COG:SP0175 KEGG:ns NR:ns ## COG: SP0175 COG0054 # Protein_GI_number: 15900112 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Streptococcus pneumoniae TIGR4 # 7 161 1 155 155 234 75.0 5e-62 MKGPEKMKTLEGKLVSDGMKVGIVAARFNEFITSKLVSGAMDGLTRHNVKEEDIQVAWVP GAFEIPLIASKMAKSGKYDAVICVGAVIRGNTSHYDYVCNEVSKGIASVSLETGVPVLFG VLTTENIEQAIERAGTKAGNKGYDCALSAIEMVNLIKEFEA >gi|226332985|gb|ACII01000034.1| GENE 4 2819 - 4018 1171 399 aa, chain - ## HITS:1 COG:SP0176_1 KEGG:ns NR:ns ## COG: SP0176_1 COG0108 # Protein_GI_number: 15900113 # Func_class: H Coenzyme transport and metabolism # Function: 3,4-dihydroxy-2-butanone 4-phosphate synthase # Organism: Streptococcus pneumoniae TIGR4 # 4 203 3 202 203 275 64.0 1e-73 MNGFDSVEEALEELRNGKIILVTDDPDRENEGDFICAAEFATTENINFMATHGKGLICMP MSEEYVRKLQFPQMVAENSDNHETAFTVSVDHISTTTGISAAERSVTAMKCVDENAKPED FRRPGHMFPLLAKKNGVLERNGHTEATVDLCRLAGLKQCGLCCEIMREDGTMMRTSELRE LAGKWNLKFITIKDIQNYRKCHDILVDRVTTTKMPTRYGEFMAYGFVNRLNGEHHVALVK GEIGDGENVLCRVHSECLTGDVFGSQRCDCGQQFAAAMAQIEEEGRGILLYMRQEGRGIG LINKLRAYELQEQGMDTLEANLALGFAGDEREYYIGAQILRNLGVKELRLLTNNPDKVYQ LEEFGIRIKQRVPIQMNATAYDLRYLKTKQLRMGHILKY >gi|226332985|gb|ACII01000034.1| GENE 5 4182 - 4829 463 215 aa, chain - ## HITS:1 COG:L0164 KEGG:ns NR:ns ## COG: L0164 COG0307 # Protein_GI_number: 15672976 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Lactococcus lactis # 1 215 1 215 216 231 53.0 6e-61 MFTGIVEELGTVKKIKKGANSAVFTIRAEKILDDLKTGDSVAVNGICLTVTACLEDGFTA DVMHETLNRSALIQLSLGQHVNLERAMPANGRFGGHIVAGHVDGTGKITEIRRDDNAVWY TIQASPQIMKYIVTKGSVTVDGISLTVAKVSETDFSISAIPHTVKITVLGERKEGDIVNL ETDIIGKYVEKLITPVKEQPIKSNITRDFLNKYGF >gi|226332985|gb|ACII01000034.1| GENE 6 4810 - 5919 560 369 aa, chain - ## HITS:1 COG:SP0178_2 KEGG:ns NR:ns ## COG: SP0178_2 COG1985 # Protein_GI_number: 15900115 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine reductase, riboflavin biosynthesis # Organism: 
Streptococcus pneumoniae TIGR4 # 148 361 4 216 223 258 58.0 2e-68 MEKDRQYMKMALELAQKGMGFTAPNPMVGAVIVKRGKVIGQGYHRKYGEPHAEREALASC TEQPEGASIYVTLEPCCHYGKQPPCVNAILESGIRRVIIGSSDPNPLVSGKGIRILKEHG IEVTENVLKEECDKLNEAFFYYIQNQKPYVVMKYAMTMDGKIATYTGASRWVTGEAARMH VQKQRLKYTGIMAGVGTVLADDPMLTCRLENGRNPVRIICDSHLRTPLNSRIVKTASTIP TIFATSSKDQQKIKNYEDMGCKVLDVPEKNGHIDLNRLMELLGAAKIDSILLEGGGTLNW SALESGIVQKVQTYIAPKLFGGESAKTPIEGKGFPDPASAILLKNSEIIRIGDDFLIESE VKSNVHGNC >gi|226332985|gb|ACII01000034.1| GENE 7 6385 - 7875 1766 496 aa, chain - ## HITS:1 COG:CAC0456 KEGG:ns NR:ns ## COG: CAC0456 COG0466 # Protein_GI_number: 15893747 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Clostridium acetobutylicum # 181 487 255 562 786 157 31.0 7e-38 MPVFDFSGASDKKPGVQSECTYTTFQSNEQETSPEKPIMVLDNSSRQWKHHSCGIFTNPI KRTSFEFKEDDGTFAADILSIDSRFVSLLKWLGENHINVRLSGQNQEKGYAVYKIRETAF GGGTKLSAEDGFLQFMIERLLASSAPAEIAEDEDEEETGDEMKLTSIQSITDFMTCAGRT LPDNIRLWARRNLAVARSHEVSPEERRHAQRALSIMMNVQWKSNYFEAIDPQEARRILDE ELYGMESVKQRIIETIIQINRTHTLPAYGLLLVGPAGTGKSQIAYAVARILKLPWTTLDM SSINDPEQLTGSSRIYANAKPGIIMEAFSAAGESNLVFIINELDKAASGKGNGNPADVLL TLLDNLGFTDNYMECMVPTVGVYPIATANDKSQISAPLMSRFAVIDIPDYTSEEKKIIFS KYVLPKVLKRMSLKAEECVVTEDGLEAIVEKHKNTSGIRDLEQAAEHIAANALYQIEVDH LTGVTFNAEMVRELLS >gi|226332985|gb|ACII01000034.1| GENE 8 8225 - 9286 1034 353 aa, chain + ## HITS:1 COG:L0358 KEGG:ns NR:ns ## COG: L0358 COG0180 # Protein_GI_number: 15672048 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Lactococcus lactis # 5 346 6 340 341 421 61.0 1e-118 MSKIILTGDRPTGRLHVGHYVGSLSERVRLQNSGLYDEIYIMIADAQALTDNAEHPEKIR QNILQVALDYLACGIDPEKSTIFIQSMVPELTELTFYYMNLVTVARVQRNPTVKSEIQMR NFEASIPVGFFCYPISQAADITAFRATTVPVGEDQLPMLEQCKEIVHKFNTVYGETLTEP EIILPSNKACLRLPGIDGKAKMSKSLGNCIYLSDEEADVKKKVMSMFTDPNHLRVEDPGR VEGNPVFIYLDAFCKPEYFAEFLPEYQNLDELKAHYTRGGLGDVKVKKFLNKVLQAELSP 
IRERRKYWEQHLDDVYDILKEGSKVAEAKAAQTLHDVRSAMQINYFDDNSLIK >gi|226332985|gb|ACII01000034.1| GENE 9 9500 - 11656 2227 718 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0138 NR:ns ## KEGG: EUBREC_0138 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 4 718 7 721 721 1146 75.0 0 MGKGRVTIPTDLDVVPQTLEMIEEWGADAIRDCDGTEFPQKLKDTGAKIYATYYTTRKDN AWAKAHPEEIQQMYIMTSFHTAVADTLEIHLMDHLYPNMLAVNTRDDIRRWWEVMDRTTG EPVLTDQWSYEEETGNVVILSAKRFHEYTVSFLAYIMWDPVHMYNAVVNDWKDVEPQITF DVRQPKTKAHSLERLRRFLDTHEYVDVVRFTTFFHQFTLIFDELAREKYVDWFGYSASVS PYILEQFEKEVGYPFRPEYIIDQGYMNNTYRIPSKEFKDFQAFQRREVAKLAKEMVDIVH EYGKEAMMFMGDHWIGMEPFMDEFASIGLDAVVGSVGNGATLRLFSDIKNVKYTEGRFLP YFFPDTFHEGGDPVKEAKVNWVTARRAILRSPIQRIGYGGYLKLALEFPDFVQYIKEVCE EFRTLYDNIQGTTPYCVKRVAVLNCWGKMRSWGNHMVHHAIYYKQNYSYFGIIEALSGAP FDVSFISFDDIKADKDLLKKFDVVINVGDADTAQSGGENWIDETIITAVREFVYNGGGFI GVGEPAAHQWQGRFIQLDDVLGVEEEHGFNLNTDKYNWEEHRDHFILKDAEGEVDFGEGK KNIYALPETEILIQKDQEVQMAVKTFGKGRGVYISGLPYSFKNSRVLYRAVLWSASAEEE LHCWYSTNYNVEVHAYVKNGKYCVVNNTYEPQDTVVYRGDRSSFRLHMEANEIKWYQI >gi|226332985|gb|ACII01000034.1| GENE 10 12049 - 12888 642 279 aa, chain + ## HITS:1 COG:STM4423 KEGG:ns NR:ns ## COG: STM4423 COG2207 # Protein_GI_number: 16767669 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella typhimurium LT2 # 26 274 8 266 274 97 29.0 3e-20 MSSQSFRFSPEHPTAQNITLLNTSSSRYEGDWPSIPHSHAFTELFYVRDGQGDFLIEDQI FPISKDDLVIINPHINHTEISKGSPPLSYFTVGVDGLCFSFNGHKEYRIFNCREKKTDLL FYFNSLFQELDRQSEGYETICSHMLDILILQLRRITDSPFELITAQHPSKECAHIKRYLD SNYGENITLDHLSELSHLNKYYLSHEFTKYYGISPINYLNRKRIEVCKELLENTDYGISD IAHLTGFSSQSYLSQSFRKSCGMTAGTYRKLKKKRKDPI >gi|226332985|gb|ACII01000034.1| GENE 11 12893 - 14095 758 400 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2712 NR:ns ## KEGG: Cbei_2712 # Name: not_defined # Def: FliB family protein # Organism: C.beijerinckii # Pathway: not_defined # 1 397 1 381 386 246 35.0 1e-63 MQYTFPNYYKEFSCIAGACPDTCCAGWQIVIDDPALKKYQHFKGPFRNRLHNDIDWKQHV 
FRQYNRRCAFLNEENLCDIYTEAGPEMFCRTCRNYPRHIEEFEGLREISLSLSCPEAARI LLSQKEKVHFITKEKKTKEEVYDDFDYFLFTALMDTRDMLIDIIQDRAVPMQKRLWKLLA AAHDFQLCVNKNELFKWEEMRKRHEDSGYGDRFCNKIYSRINADNIENSSAASACANTPE QLFKKMWKTVVPEMEVLRPGWQEYLKNCLTPLYNGNTDPQSDSGNLYSWQKSEFDFSYPD WQIQEEQLLVYWIYTYFCGAVYDNEIFAKVKMAVVCTLFIHELNVGTYLKNNRQFKLDDQ IRICYQFSRELEHSDLNLNRFEELMTEKEIFSFENLLKIC >gi|226332985|gb|ACII01000034.1| GENE 12 14257 - 16755 2791 832 aa, chain - ## HITS:1 COG:TM0025 KEGG:ns NR:ns ## COG: TM0025 COG1472 # Protein_GI_number: 15642800 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 491 824 2 304 721 265 45.0 4e-70 MKKWIYSGTREEQPSERELMHGKLARKAASEGIVLLKNEGILPLKKDTAAALLGYGAEKT VKGGIGSGDVNNRKNTSIYQGLKEAGVKIVSEDWISDYHNRYEQAREAWKEKVLEEAKKV DNPFDAYAENPFAMPLGRKVAEEDICEADVVIYVISRISGEGKDRRKVKGDYYLSEREEE DLRYLAEMNKPVILILNAGGPVELTDILEQTDNIKGILNISQLGQEGGDALADVLLGKEV PGGKLTTTWARRYEDYPASEEYGYLNGNLEKEKYKEGIYVGYRYFDSFDKKVMFPFGFGL SYTMFEMMCCSINMEESKIRAEVQVTNTGNEYAGKEVVQIYVTLPQTELEKEYKRLAGFA KTRLLKPGETQTLTVEIPQKQLASFNEKTHTWIVEKGKYGILVGNSSDKLKLEAVLVVSD DTVLEQMDKICPLQEELKQIHLTKELKEKSVQRQEKLITAQVPEYYFKPAMIPAKSENAG KNQENLTEEEKRFVSVLEDRTTEELIPLLYGKISENISTLGAAGIRVPGSAGETCGTLEE YGIPSLVMADGPAGIRLRQWYEVDKEADSIYEMGVLGSLENGILEPGVHHENADTYYQYC TAFPVGTALAQTWDTDLMTEFGKAIAEEMEEFHVNLWLAPGMNIHRNPLCGRNYEYYSED PYLSGMLAAAVIRGVQSKSGCGVTIKHFACNNQEDNRMGVNSCVSERALREIYLRGFEIA VKEGNPVSIMTSYNLINGIHAANSKDLCMTVARKEWGFDGAIMSDWNTTVPEDGSVPWKC VAAGNDIIMPGNPDDDKNIRQAYKEGKLTEEEIRNCAGHLVSMIRRLERTDC >gi|226332985|gb|ACII01000034.1| GENE 13 16840 - 17685 861 281 aa, chain - ## HITS:1 COG:SPy0255 KEGG:ns NR:ns ## COG: SPy0255 COG0395 # Protein_GI_number: 15674435 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 12 280 11 275 276 132 31.0 1e-30 MKPNTSGRVRSIFVHLVLIFLSFLCLFFFYILIVNATRSHADLQKGFSALPGKYFLENLK 
NVANDGSFPMFRGILNSVVVSSCSAALCTYFSSLTAYGLYAYDFKMKKAAFTFIMAILVM PTQVTAMGFLRLITKMGMYDSLLPLIIPSIASPAVFYFMYSYLQSSLPLSLVEAARIDGS GEFRTFNSIVLPIMKPAVAVQAIFTFVGSWNNYFVPALIIQSKSKMTVPILIATLRGADY MNFDMGKIYMMITVAIVPIIIVYLLLSKYIIAGVTLGGVKE >gi|226332985|gb|ACII01000034.1| GENE 14 17689 - 18642 997 317 aa, chain - ## HITS:1 COG:BH3689 KEGG:ns NR:ns ## COG: BH3689 COG1175 # Protein_GI_number: 15616251 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 4 260 8 258 300 120 31.0 2e-27 MKKKKSISYAKWGYIFILPFFITFLIFSLIPLVDTVRYSFYEYYRSGIKEIGPNFIGIAN YLSLLKSDMLKYSANTLILWVIGFVPQIVIALVLACWFTDARLKIHGQQFFKVVIYLPNL IMASAFALLFFTMFSTNGPINSILMSLGWVKNPIDFMGSVIGTRSLVGFMNFLMWFGNTT IMLMAAVMGISMDIFEASELDGCNSIKRFFYITLPLIRPILAYTLITSIIGGLQMFDVPQ ILTNGQGNPDRTSMTLIMFLNSHLKSKNYGMAGALSVYLFIVSGILCMIVYKMTNDTDPD GSKKAAKKKAKEERRRR >gi|226332985|gb|ACII01000034.1| GENE 15 18774 - 20135 1590 453 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20530 NR:ns ## KEGG: EUBELI_20530 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 448 1 451 451 329 40.0 1e-88 MKKKALAVLLTASVIAGTMGGTATIVHADEEGKVINIYSWNDEFRQRLEAVYPEVKETSK DGTVTTLKDGTEIHWIINPNQDGVYQQKLDEALMNQADASADDKVDIFLSETDYVYKYTD AAADVAMPLKDLGIDPDKDLADQYDFTKTTASDADGVQRGSTWQCCPGLLVYRRDIAKDV FGTDDPEEVGKRMKDWDTAKATAEELKAKGYYTFASYADTFRLYGNSISQPWVEDGATTV NVDQQVMNWIKDSKEWLDAGYLNKTVKGQWNDDWNKAMGSESKVFAFLFPAWGIDFTLKP NWDGEDGEWAVTNPPQEYNWGGSYIHAATGTDNPEHAKDIILALTGNKDNLLKISKDYSD FTNTKSGMQEVATDDTFASDFLGGQNPFEYFAPVAENIKIAPLSAYDQGCVELLQNAFSD YFQGQVDYDKAKSNFETAIKERYPEIQEVNWAE >gi|226332985|gb|ACII01000034.1| GENE 16 20548 - 23247 2269 899 aa, chain + ## HITS:1 COG:AGc4949 KEGG:ns NR:ns ## COG: AGc4949 COG3459 # Protein_GI_number: 15889981 # Func_class: G Carbohydrate transport and metabolism # Function: Cellobiose phosphorylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 231 842 2230 2792 2802 116 23.0 2e-25 
MKHIKFIDNHGSFCVNRPENTSYLYFPLASEAGLKSSVTPVLGGDSKIDQDTFLLEPVSS ENLHNNRSSRNFWIVKEGHIPYSVTGSSAQQEADRFTDRQEDSELTAGFMWQTLKRTSRE LGLISTVTSFIPKSDNVEIMYVTVENQTDTVQKFTAYGAIPVYGRSADNIRDHRNVTSML HRIETTEHGINVCPTMSFDERGHQPNHKIYYVNGCSGKGENPISYYPTVEDFIGEGGTYT HPRTVYEGYKGVPAHSLAAGKEAMGAFCFSSFTLAPGEKTDFILLSGIADSKEEIENIFE KYNTSDKVKNALEETRIYWKKQVNVDFHTQDPDFDSYMKWVSFQPFLRRLFGCSFLPHHD YGRGGRGWRDLWQDCLSLLLMNPQNVGAMIEKNYGGVRIDGTNATIIGDGDGNFIADRNG IARVWMDHALWPLITTSLYINQTGDIEILKKQVPYFKDAQTMRGTEIDTLWNDAYGNKQR TEDGQVYTGSVLEHILIQQLAAFYDVGVHNIYRLRGADWNDALDMAAENGESVAFTCAYA GNLHTLASILRLMESAGETSIPLSEEIEILLNDQTDMFDSVSEKKKILTEYAKSCRHNLS GRKKNFSLSTLAANLIQKSNWLTNHIRSQEWIDGKDSEEGWYNSYYDNHGEAVEGFHHEQ VRMMLTGQVFSIMGNVATDAQIRKIVKSADHYLYRKEIGGYRLNTDFQELKFDMGRMFGF AYGEKENGAVFSHMAVMYANALYQRGFAKEGWKALRSLSDTALDFDTSHIYPGIPEYFRS DGRGMYHYLTGAASWYLFTMITEVFGVRGVFGNLLVHPKLLAEQFDEAGTASVSVTFAGK QFHVTYRNTDRKDYGSYHIAAADCDKTAIEIVEDSHIMISREFINNLSSDLHEITVTLA >gi|226332985|gb|ACII01000034.1| GENE 17 23326 - 25737 1963 803 aa, chain + ## HITS:1 COG:VC0612 KEGG:ns NR:ns ## COG: VC0612 COG3459 # Protein_GI_number: 15640632 # Func_class: G Carbohydrate transport and metabolism # Function: Cellobiose phosphorylase # Organism: Vibrio cholerae # 1 803 1 800 801 514 36.0 1e-145 MKYGYFDNENREYVITNPATPAPWVNYLGSPEYGAIISNNAGGYSFAKSGANGRILRYVF NNFDQPGRYIYIRDDRSSDFWSASWQPVGKDLAQYKSECRHGIGYTKISADYSDIHSEAL YYVPLGKSYEVWALSVTNHSDHERNLTLSGYAEFTNHSNYEQDQVNLQYSLFISRTLFEG NRITQQIHGNLDAIPENENVDEKNVTERFFGLAGAEVSSYCGDKNEFLGSYHGYGNPEGI VCGNLGNKTSYNENSCGALSCKITLKAGETRTIAFLLGMKPSSEAAEVIRCYENPAPTVA EELEALKADWNSKLDNLKVSTPSPEFDTMINTWNAYNCFMTFIWSRAASFIYCGLRNGYG YRDTVQDIQGIIHLAPEMALEKIRFMLSAQVNNGGGLPLVKFTHTPGKEDTPDDASYVQE TGHPAYRADDALWLFPTVYKYISETGNTAFLDEVIPFANKDEATVYEHLKRAVAFSLNHL GPHGMPAGLYADWNDCLRLGANGESSFVALQFYYAMTILKIFAEYKKDTEYVQYLEETQK AFGEKVQELCWDNDRFVRGYTESGERIGEAAAPEANMWLNPQSWAVISGLATPEQGDAAL NNVYERLNTEYGAILMDPPYHAHAFDGALAVIYNQGTKENSGIFSQSQGWLILAEALRGH GERAFTYFMENSPAAQNDCAEIRRLEPYCYGQFTEGKASKHFGRSHVHWLTGTASTVMVG 
CVEGILGLRPDLEGLRLSPSVPKSWKHFEIEKVFRGKQLHISVENPDEKESGCYSLVLNG QKLADDYIPEKLLQDKNEVILTL >gi|226332985|gb|ACII01000034.1| GENE 18 25997 - 26794 593 265 aa, chain - ## HITS:1 COG:SMa0319 KEGG:ns NR:ns ## COG: SMa0319 COG2207 # Protein_GI_number: 16262625 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Sinorhizobium meliloti # 147 238 240 331 333 85 44.0 1e-16 MSYQNEWLLFEAEDEFDEIKHREPTEELLFYRAVASGDIKTVKKNCERQRFTECDGVGVL SKNAVTNMKYHFVITTAMITRLCKQNGMEMEQAFRLSDFYIQKLDGIHTVEEVQSLHDEM VMDYTEKMRRYFRDNTYSKHINASKEYIYSHIKERITIEDLADSLGVSASYLSRLFKKET GISVSAYIRRQKIEMAKNLLQYSDYPIIEIANRLSFSSQSHFIQQFREVTGMTPGKYRDE HYMIHWDIEKELCVASESELKKEQD >gi|226332985|gb|ACII01000034.1| GENE 19 26804 - 27472 518 222 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0177 NR:ns ## KEGG: Cphy_0177 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 219 1 219 224 274 57.0 2e-72 MRREQTLEEISDGKLYDSNDMVKADCHDCEGCCDCCQGMGDSVLLDPYDVYRLSAGLQKS AEQLLQEYLELGVTDGNILPHLRMTGVKEQCIFLNSEGRCHIHSIRPGFCRLFPLGRFYE NGSFKYILQIHECPKTNRSKIKVKKWIDTPDLKNYEKFVNDWHYFLLDVQEVLYNAEDPD LIRNLNLFVVNRFYLKPYDQNQDFYIQFYERLKEGKELLALA >gi|226332985|gb|ACII01000034.1| GENE 20 27495 - 27872 552 125 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3229 NR:ns ## KEGG: Cphy_3229 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 4 118 3 115 130 62 37.0 3e-09 MAVQDDHMIRNIQDVGRLIAKLLLHEQQPNYKLPEKEADYTEADRLFATIMKLAEEGKIN EAENELYMGMVEDDVDYLELALTFYLYLNDMDGDFLDDNGYSREEVLEGMKDLASDWGVT GLEAF >gi|226332985|gb|ACII01000034.1| GENE 21 27975 - 30179 2235 734 aa, chain - ## HITS:1 COG:CAC2947_1 KEGG:ns NR:ns ## COG: CAC2947_1 COG0550 # Protein_GI_number: 15896200 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 626 3 618 618 712 57.0 0 MKSLVIAEKPSVARDIARVLHCTQKGNGTLEGKDYVVTWALGHLVTLADPEEYDKKYMKW 
EISTLPMMPDRMKLVVIRQTGKQYNVVKTQLYRKDINDIIIATDAGREGELVARWILDKA DCHKPIRRLWISSVTDKAIKEGFGNLKNGHEYDNLYRAAVARAEADWLVGMNGTRALTCK YNAQLSCGRVQTPTLAMIAKREEEIRAFKPKEYYGITLKAGDITWTWKEPKSKSFRTFNK ERAEEISDVLRNQSLEVTSVTSKEKKSFAPGLYDLTTLQREANRKYGYSAKQTLNIMQRL YENHKVLTYPRTDSRYIGKDIVPTIKERLRACATGPYRKLAGALMNQPVRANGSFVDDKK VSDHHAIIPTEQFVQLDHMTNEERKIYDMVVRRFLSVLYPPFVYEQVSMEGIVAGELFAA SGKVVKSAGWKDVYENTGDTEEDEDTADDAQKLKDQKLPQMKKGERLHIENVSMNTGRTK PPARFTEATLLAAMENPVRYMESHDTKAAKTLGETGGLGTVATRADIIEKLFNSFLMEKK GNEIHLTSKAKQLLELVPEELRKPELTADWEMKLGNIAKGKMKQETFLKEIRNYTCDIVD EIKTGQGTFRHDNLTNKICPNCGKKLLAVNGKNAKMLVCQDRECGYKETLSRTTNARCPK CHKRMEMYVKGKEETFICACGYKEKLSAFQARRAKEGAGVNKRDVQKYMRQQQKEAKEPV NNAFAQALAGLKLE >gi|226332985|gb|ACII01000034.1| GENE 22 30391 - 31059 752 222 aa, chain - ## HITS:1 COG:CAC1700 KEGG:ns NR:ns ## COG: CAC1700 COG0745 # Protein_GI_number: 15894977 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 220 6 228 232 178 43.0 7e-45 MIYLVEDDESIRELVIYTLKSQGFEAKGFERPSHFWKALEKEKPALLLLDVMLPEEDGIS ILKKLRMRPDTRKLPIIMLTAKSSEYDTVVGLDSGADDYIPKPFRMMELISRIRALLRRT EDDGAEEYRTGNLYVCPSRHIATVDKKNVNLTLKEYEVLCLLLKNSGTVLSRTQLLNQVW GYEFDGESRTVDVHIRTLRQKLGSAGELVETVRGVGYKINMK >gi|226332985|gb|ACII01000034.1| GENE 23 31358 - 32041 350 227 aa, chain + ## HITS:1 COG:CAC1489 KEGG:ns NR:ns ## COG: CAC1489 COG0671 # Protein_GI_number: 15894768 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Clostridium acetobutylicum # 36 196 42 196 219 64 31.0 1e-10 MKRFISKYRHGLVIAVYSIIYILLFSYLEHRPVHSYHIVHTVFDDWIPFCEFFIVPYMLW FPYMILAVIYFIFFNKNKHEYYQLAFNLMMGMTVFLIVSYVYPNAQHIRPTEFPRDNIFT DIVKWLYSTDTPTNILPSIHVFNSLAIHMSLTNCETLRNKKFIKAASFTLTALIIMSTMF LKQHSVIDVCMGATLALFGFLVFYPRRVSVSSGSVCFREVKRKKSEN >gi|226332985|gb|ACII01000034.1| GENE 24 32170 - 33186 1113 338 aa, chain - ## HITS:1 COG:CAC1488 
KEGG:ns NR:ns ## COG: CAC1488 COG0463 # Protein_GI_number: 15894767 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 1 338 1 338 338 376 50.0 1e-104 MKLLSIAIPCYNSQDYMENCIESLLVGGEEVEILIVDDGSSDRTAEIADAYARKYPTIVK AIHQENGGHGEAVNAGIRNATGLYFKVVDSDDWVNKEAYVQILKTLYELLRGPQTVDLLI SNFVYEKQGATRKKIMQYRKCLPQDRIFGWEEVKHMPKGKYLLMHSMIYRTKLLHECNLE LPKHTFYVDNLFAFEPLPYVKNLYYLDVNFYRYFIGRDDQSVNEKVMIKRIDQQIRVNKL MVDAYINCKNTNKQLKAYMFSYLDIITTISSIMLIRANTEESLQKKKELLEYIRTENKQV YRKLRHSLFGRVMNLPGRGGRKMSVAAYKISQKFYGFN >gi|226332985|gb|ACII01000034.1| GENE 25 33183 - 34241 513 352 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 [Haemophilus influenzae 7P49H1] # 10 334 15 334 378 202 33 4e-51 MVLEEIVQPLVKWYRDNKRILPWRDKDNAYYTWVSEIMLQQTRVEAVKPYFQRFITELPD IQSLAECPEEKLLKLWEGLGYYNRVRNMQEAAKTVKDEYNGRLPEDYQALLSLKGIGSYT AGAIASIAYGEKVPAVDGNVLRVISRITESTEDISRQSVRRKIEQQVSQIMPSDCPGDFN QALMELGAVICVPNGQAKCAECPIAFTCLAHRHDKADMIPVKAPKKARTQDNRTVFIIQD GECTAIRKRPEKGLLAGLYELPNTQGHLKSEDALLYVKELGLDPLYIEELPPAKHIFSHI EWRMQAYRIKVSSLKTTQDKELIFVSKEQSGKQYAIPSAFGAYAKYIKEETR >gi|226332985|gb|ACII01000034.1| GENE 26 34437 - 34733 288 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578487|ref|ZP_04855759.1| ## NR: gi|253578487|ref|ZP_04855759.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 98 1 98 98 148 100.0 9e-35 MNFDSLNIILTIAAFIVGIMLLTGHGDIFLKGGNSDISKKLYDEEKMAKASGVALILIGV VSAIDLFTTALAFKIAYIAALLIIIVGLVCYLRLKCKK >gi|226332985|gb|ACII01000034.1| GENE 27 34910 - 36475 1407 521 aa, chain - ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 45 521 5 466 466 306 33.0 7e-83 MGNKEEKNNTVPNGSMEITGRRHKSARPGRISYVPISDEDLAPYEPPVRKKHKGLKITGI AVAMVILSAGAAYAGMSYYYSDKFFEGTSINGIDCSGKTAYEAEQKIAKTVENYSIEVDS RNLDPQTISGDQIGYSYVSDGSVLKLLKDQKPYEWIRGFFEKKSYTAAENTTYDKEKLKK QVKALNCAQKENQVAPENAYVAYGDSQFEIVPETEGSKLDLKNAYNVLSEAVSGNKTSVD FDSEPDVYVKADITSDNPDLQASLNACNNFTKASITYTFGDETETLDGNTIKDWLNFDEK GQLIMDDTSFRQHIADYVAQLAAAHDTVGTEREFQTTSGRTVSVYGSAYGWQIDQTSEVA QLTQEIQSGTQTTREPVYSMTANAHGYNDIGNTYIEVDLSEQHMYFYQNGEDIFESDIVS GDMRYSDRQTPAGIYTIYYKKSPDVLRGKQLANGKYEYEQPVTYWMPFNGGIGFHDANWQ PYFGGDRFMEGGSHGCINMPPEKAAELYNIIDCNIPIVCFY >gi|226332985|gb|ACII01000034.1| GENE 28 36771 - 38129 1255 452 aa, chain - ## HITS:1 COG:L170983 KEGG:ns NR:ns ## COG: L170983 COG0534 # Protein_GI_number: 15672149 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Lactococcus lactis # 10 448 3 441 446 256 38.0 6e-68 MGGFIGMQTDMTVGKPMKMILDFTIPVFIGNVFQQFYNMADAIIVGKFVGTKGLAAVGST GTIMFLIIGFLMGLTAGFTVLTSQKFGAGKMDEMRQSVGNAALLSIIISVIMTAVSMVGM HSLLTLMHTPEDIFQDAYTYIMIICGGIFAQVLYNILASILRALGNSKTPLYFLILAALL NIVLDLTFIIVFHMGVAGAAWATITAQGVSGVLCLIYIIKCVPELKLKADDWRFRSEIAK KEILVGIPMGLQYSITAIGTMMVQSALNILGSYAVAAFTAGNKIENIFTQAYVAIGTTMA TYNAQNIGARKLDRVRQGFNSAHIIGITYAIVTGVIIFFFGKYLSYLFISDNAAEVIPMV DTYVKCVAVFFIPLHFVNALRNGIQGMGYGLLPMLAGVAELAGRGITAMIAAAHKSYFGA CMASPMAWVLAGGLLIVMYFYVMKDMKRRLGL >gi|226332985|gb|ACII01000034.1| GENE 29 38357 - 39898 2052 513 aa, chain - ## HITS:1 COG:CAC0712 KEGG:ns NR:ns ## COG: CAC0712 COG0696 # Protein_GI_number: 15894000 # Func_class: G Carbohydrate transport and metabolism # 
Function: Phosphoglyceromutase # Organism: Clostridium acetobutylicum # 1 510 1 509 510 635 62.0 0 MSKKPTVLMILDGYGLNDKKEGNAVYLAKTPVMDKLMAEYPFVKGNASGLAVGLPDGQMG NSEVGHMNMGAGRIVYQELTRITKEIQDGDFFKNEALLAAMKNAKENNSAVHFMGLLSDG GVHSHNTHLYGLLEMAKREGVEKVYVHCFLDGRDTPPASGKEFVEALEAEMKKIGVGEIA TVSGRYYAMDRDNRWDRVELAYNALTTGEGVKGTDAPAAVQASYDNDKTDEFVLPTVIEK DGQPTGVISDKDSVVFFNFRPDRAREITRAFCDDDFQGFERKARPQVTFVCFTDYDDTIQ NKQVAFHKVQLHNTFGEYLAAHNMTQARIAETEKYAHVTFFFNGGVEEPNKGEDRILVKS PKVATYDLQPEMSAPQVCEKLVDAIKSDKYDVIVINFANPDMVGHTGVQEAAIKAVETVD ECVGKAVEALKEVDGQMFICADHGNAEQLIDYETGAPWTAHTTNPVPFILVNADPKYTLR ENGCLADIVPTLIQLMGMEQPAEMTGKSLLVEK >gi|226332985|gb|ACII01000034.1| GENE 30 40241 - 41536 1582 431 aa, chain - ## HITS:1 COG:BH3422 KEGG:ns NR:ns ## COG: BH3422 COG0460 # Protein_GI_number: 15615984 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Bacillus halodurans # 5 430 4 429 431 376 46.0 1e-104 MNENVIKAALLGFGTVGTGVYKVLKNQEKEMSAKIGCKVEIKKILVRNIDKAAAKIEDAS LLTSQWDEIINDPEIEIVIELMGGINPAKEYILSALKAGKHVVSANKDLIAVHGHELLDA AHESKVDFLFEAAVAGGIPIIRPLKQCLAGNHMTEVMGIVNGTTNFILTKMTQEGMEFKD ALALATELGYAEADPTADIEGLDAGRKVAILASVAFNSRVVFNDVYTEGIAKITSKDIHY AKEMGRDIKLLGVARNEADGIEAYVCPMLIPSSHPLATVNDSYNAVFVHGDAVEDAMFFG RGAGELPTASAVVGDVFDIVRNILAGCCGRIGCTCYKNIPVKKMEDTHNRYFLRLKVEDR CGVLAEMTAIFAKYQVSVAQIIQKDTKVEGCAEVVVITEKVREGDFRTAIEELRNRDSVR KISTIIRVYGK >gi|226332985|gb|ACII01000034.1| GENE 31 41590 - 41850 139 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578493|ref|ZP_04855765.1| ## NR: gi|253578493|ref|ZP_04855765.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 86 1 86 86 163 100.0 3e-39 MKTIRKIMEGVAAACISVAMYYCFLSGAPAGGTERTVYMILDGIFCISAWVAGVGILWYV RWLRRRNKELENMIKILLKNADKLDK >gi|226332985|gb|ACII01000034.1| GENE 32 41957 - 42178 186 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578494|ref|ZP_04855766.1| ## NR: gi|253578494|ref|ZP_04855766.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 73 12 84 84 145 100.0 8e-34 MISYRKLWKTMKERKISQYALYTHYGISTSFLDKLRHNENVEIRSLDILCSILDCDFGDI VEHIPDNAEPEEK >gi|226332985|gb|ACII01000034.1| GENE 33 42609 - 43979 1342 456 aa, chain + ## HITS:1 COG:CAC3595 KEGG:ns NR:ns ## COG: CAC3595 COG2509 # Protein_GI_number: 15896829 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Clostridium acetobutylicum # 1 454 1 454 457 587 62.0 1e-167 MKYDVVIIGAGPGGIFSAYELMKQNNDLKIAVFEAGNPLSKRHCPIDGDKVKTCIKCKTC AIMSGFGGAGAFSDGKYNITNDFGGTLYEHIGKKEALDLMRYVDDINVAYGGQDTKMYST AGTKFKKLCMQNKLKLLDASVRHLGTDINYVVLENLYSKLKEHVDFYFNTPVERLEVLDD GYRIVTKNDTTDCSKCIVSVGRSGSKWMEQICKELDIPTKSNRVDLGVRVELPAVIFSHL TDELYESKIVYRTEKFEDNVRTFCMNPYGIVVNENTNGIVTANGHSYEDPSKQTENTNFA LLVAKHFSEPFKDSNGYGESIARLSNMLGGGVIVQRFGDLVRGRRSTVARIEEGMVRPTL AATPGDLSLVLPKRILDGIIEMIYALDKIAPGTANDDTLLYGVEVKFYNMEVDIDENLES RYKGLYIIGDGSGVTHSLSHASASGVYVARQIVENL >gi|226332985|gb|ACII01000034.1| GENE 34 44080 - 44553 333 157 aa, chain - ## HITS:1 COG:CAC2751 KEGG:ns NR:ns ## COG: CAC2751 COG0454 # Protein_GI_number: 15896008 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 157 1 166 167 90 29.0 1e-18 MEIRKASVQDLGQIMQVYDKARKFMRENGNAEQWGEDYPSTELIEHDIDKMYLCMSEGRI ACVFYYAAEEDEDYKEINGKWLNEEPYGVVHRVASTGIVRGAASFCLDWAYAQTLNLRMD TYSDNIPMQKLLEKCGFQYCGSFERLGMDKWMAYQKI >gi|226332985|gb|ACII01000034.1| GENE 35 44618 - 45778 1046 386 aa, chain - ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 8 383 78 457 458 332 41.0 9e-91 MEEKVTKCSVSKKCGSCQYQGVPYKEQLAAKQKRMKKLLGKFANVKPIIGMDDPFYYRNK VHAVFDRDKKGNIICGTYEAKTHKVVPIEECMIEDKISQEIIRTIRDMLKSFRIKTYDED 
TGYGLLRHVLVRRGFFTDEIMVVIVIGSPIFPSKNNFVKALRKKYPQITTVVLNVNDKKT SMVLGERDIVIYGKGYIRDTLCGCTFRISPQSFYQVNPVQTEILYKTAIEYAGLGRKETV IDAYCGIGTIGLVAAKRAKNVIGVELNPDAVRDARINAKENKITNARFYQGDAGEFMENM AENGEHADVVFMDPPRTGSDKKFMSSVIKLNPSRIVYISCGPETLARDLEYLTKHGYDVR KIQPVDMFSFTDHCENICLLTKKFEK >gi|226332985|gb|ACII01000034.1| GENE 36 46447 - 47196 1014 249 aa, chain - ## HITS:1 COG:CAC0711 KEGG:ns NR:ns ## COG: CAC0711 COG0149 # Protein_GI_number: 15893999 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Clostridium acetobutylicum # 3 248 2 248 248 259 56.0 3e-69 MARKKIIAGNWKMNMTPSEAVKLVETLKPLVVNDEVDVVFCVPAIDIIPVVEATKGTNIQ VGAENMYFEEKGAYTGEISPAMLVDAGVKYVVLGHSERREYFGETNEDVNKKVLKAFEHG ITPIMCCGETLTQREQGVTMDFIRQQVKVGFQGVTADQAKTAVIAYEPIWAIGTGKTATT EQAQEVCAGIRACIAEIYDEATAEAIRIQYGGSVNPATAPDLFVQNDIDGGLVGGASLKA DFGKIVNYK >gi|226332985|gb|ACII01000034.1| GENE 37 47241 - 48434 1620 397 aa, chain - ## HITS:1 COG:CAC0710 KEGG:ns NR:ns ## COG: CAC0710 COG0126 # Protein_GI_number: 15893998 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Clostridium acetobutylicum # 2 397 3 397 397 555 73.0 1e-158 MLNKKSVDDINVKGKKVLVRCDFNVPLQDGKITDENRLVAALPTIKKLIADGGKVILCSH LGKPKGEPKPELSLAPVAKRLSELLGQEVKFAADPEVVGPNAKAAVAEMKDGDVILLENT RYRAEETKNGDEFSKELASLCDVFVNDAFGTAHRAHCSNVGVTKYVDTAVVGYLMQKEID FLGNAVNNPERPFVAILGGAKVSSKISVINNLLDKVDTLIIGGGMSYTFSKAMGGHIGTS LCEDDYLQYALDMMKKAEDKGVKLLLPVDNRIGDNFSNDCNIQVVKRGEIPDGWEGMDIG PETEKLFADAVKDAKTVVWNGPMGCFEMPNFAHGTEAVAKALAETDATTIIGGGDSAAAV NILGYGDKMTHISTGGGASLEFLEGKELPGVAAANDK >gi|226332985|gb|ACII01000034.1| GENE 38 48593 - 49612 1377 339 aa, chain - ## HITS:1 COG:NMA0246 KEGG:ns NR:ns ## COG: NMA0246 COG0057 # Protein_GI_number: 15793264 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Neisseria meningitidis Z2491 # 1 334 1 331 334 436 66.0 1e-122 
MAVKVAINGFGRIGRLAFRQMFGAEGFEIVAINDLTSPKMLAHLLKYDSTQGKYALADTV TAGEDSITVDGKEIKIYAKANAAELPWGEIGVDVVLECTGFYTSKDKAQAHIDAGAKHVI ISAPAGNDLKTIVYNVNHETLTKEDHIISAASCTTNCLAPMAKALNDLAAIKSGIMCTIH AYTGDQMTLDGPQRKGDLRRSRAAAVNIVPNSTGAAKAIGLVIPELNGKLIGSAQRVPTP TGSTTILTAVVEGNVTKEQINAAMKAASNESFGYNEDEIVSSDIVGMRFGSLFDATQTMV LPLENGTTEVQVVSWYDNENSYTSQMVRTIKHFGKLLNA >gi|226332985|gb|ACII01000034.1| GENE 39 49931 - 50749 690 272 aa, chain + ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 8 266 5 259 269 144 32.0 1e-34 MDRQILWDCHMHSSFSADSDTPMEVMIHRAVETGLQGITFTEHLDPDYPVTPDNLDFSLN IPAYKEKLAELSDIYKDKIQVRFGIELGLQMHLGEYFDSLLSQTPFDFVIGSSHLVHGYD PYYPEFFEGRKEFLCYMEYFESILENISAYDGFDVYGHIDYVVRYGPNRNREYSYGRYKD ILDEILKKLISMGKGIELNTGGYHYGLGEPNPCTAVIRRYRELGGEIITIGADAHTPDKI ACAFDKAASVLEACGFRYYTIFKDRKPEFISL Prediction of potential genes in microbial genomes Time: Sat May 28 19:29:50 2011 Seq name: gi|226332984|gb|ACII01000035.1| Ruminococcus sp. 5_1_39B_FAA cont1.35, whole genome shotgun sequence Length of sequence - 15504 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 10, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1918 1688 ## COG0366 Glycosidases + Term 1947 - 2010 13.1 - Term 1944 - 1990 12.8 2 2 Tu 1 . - CDS 1995 - 4529 2217 ## COG0474 Cation transport ATPase - Prom 4560 - 4619 5.5 - Term 4625 - 4673 1.2 3 3 Op 1 20/0.000 - CDS 4725 - 5174 591 ## COG0822 NifU homolog involved in Fe-S cluster formation - Prom 5195 - 5254 3.4 4 3 Op 2 13/0.000 - CDS 5260 - 6444 1520 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 5 3 Op 3 . 
- CDS 6462 - 6938 422 ## COG1959 Predicted transcriptional regulator - Prom 7114 - 7173 6.0 + Prom 6934 - 6993 3.3 6 4 Tu 1 . + CDS 7124 - 8455 647 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid + Term 8536 - 8574 -0.6 + Prom 8621 - 8680 6.4 7 5 Tu 1 . + CDS 8720 - 9751 1139 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase + Term 9923 - 9971 5.0 - Term 9911 - 9957 8.4 8 6 Tu 1 . - CDS 10001 - 10885 610 ## COG4974 Site-specific recombinase XerD - Prom 10999 - 11058 7.7 9 7 Tu 1 . - CDS 11139 - 11732 350 ## gi|253578511|ref|ZP_04855783.1| conserved hypothetical protein - Prom 11844 - 11903 6.0 - Term 11845 - 11902 11.3 10 8 Tu 1 . - CDS 11946 - 13505 1672 ## COG1418 Predicted HD superfamily hydrolase - Prom 13546 - 13605 6.5 11 9 Tu 1 . - CDS 13680 - 14135 353 ## EUBREC_2227 recombination regulator RecX 12 10 Tu 1 . - CDS 14270 - 15334 1187 ## COG0468 RecA/RadA recombinase - Prom 15378 - 15437 6.8 Predicted protein(s) >gi|226332984|gb|ACII01000035.1| GENE 1 2 - 1918 1688 638 aa, chain + ## HITS:1 COG:DR0933 KEGG:ns NR:ns ## COG: DR0933 COG0366 # Protein_GI_number: 15805957 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Deinococcus radiodurans # 25 597 19 597 644 392 36.0 1e-108 ETSKVEPVKVETSKVEPVKLDVPVDLTLYNERFQKHYDELKWLYCELYQDRDDVMTYLHD LTSNMEAFYNSRNSALKASDKKREADPDWYKRNDLVGMMMYVNNFAHTLKGLEEHLDYVE ECNVNYLHLMPLLASPKGKSDGGYAVADFRTVQPELGTMEDFSELTSKCHERGINICLDF VMNHTSEEHEWAKRARAGEKEYQDRYFFFDTYDVPALYEKTCPQVFPTTAPGNFTWLDDL HKHVMTTFYPYQWDLNYRNPVVFNEMVYNMLYLANQGVDIVRLDAVPYIWKQLGTNCRNL PQVHTIVRMMRMICEIVCPGVLLLGEVVMAPEKVVPYFGTLEKPECHILYNVTTMASTWH TVATKDVSLLRRQLDILGSLPKEYIFQNYLRCHDDIGWGLDYDFLKNFSIDEVSHKKFLN DFFTGKYPDTFGRGELYNDDPRLGDARLCGTTASLCGIEKYGFEGNVVGVDRSVRYDITL HAFMLSQSGIPVIYSGDEIGQVNDYSYKDDPDKAVDSRYLHRGEFNWSLAPNRNIADTVQ GKLFHALDHLEHIRSSHSVFDTTADLRTIDTWDSSILAIVRENAEEKFIGIYNFSDQDKV AWINEDDGIYTDLISGHEMEAKGVMIPAFGCFWLCKKK >gi|226332984|gb|ACII01000035.1| GENE 2 1995 - 4529 
2217 844 aa, chain - ## HITS:1 COG:L168650 KEGG:ns NR:ns ## COG: L168650 COG0474 # Protein_GI_number: 15672557 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Lactococcus lactis # 1 822 3 775 775 544 39.0 1e-154 MTGLTNEQVQERIAEGKVNVNENPNTRTYKQIILENTLTFFNFLNIALLVLVLFVRSYKN SMFMGIILINTVIGIIQEIRAKKTIDKLAILTESKTVVLREGKKWSISTEKLVLDDLIFL KTGDQVPADVKVLEGTVEVNESLLTGESDNLSKSQGDELFSGSFVTSGEACCQVIHVGKD NYASQITSEAKEFKRHNSELRNSLNAILKVISIIIVPLGAMLFYKQYMIVGDTLKDSVVN MVAAVLGMIPEGLVLLTSVALTLGSMVLATKKTLVQELYCIETLARVDTLCLDKTGTITE GTMKVEDVQLYDTAQTTVVQHTAKFDPETGEPVQNVSALKPEVTVSAEKENGQIQETVNL ETVSQEERQKLQEIDHIMGNMMSVLHDQNATADALRKRFPSRNDLKLIHAIPFSSDRKYS GAVFEGRGTYLMGAAQFLFPEGNEELLEHCSSYAQEGYRILVLAHSEQETKGTERPTGLE PLGLFLITDVIREEAPDTLAFFDSQGVDLKVISGDDPVTVSAIAKKAGLKNANHYIDATT IKTPEEMQRAVAECSVFGRVTPQQKKQMVQALQSQKHTVAMTGDGVNDVLALKEADCSIA MAAGSDAAKNIANVVLLDSNFGAMPHIVNQGRRVVNNIRSAASMFLIKTIFSVLLSLITI FFGDAYPFEPIQMSLISACAVGIPTFLLTQENNYNKIDHTFLRHVFMNAFPAAVTITGCV FTIMLVCQDVYHSNVMLNTACVLVTGWNYMSALRTVYSPLNTYRKVIIYGMQFAFFISAV VLQDLLTLGSLEFGMIILVFVLMTFSPILIETITEWIRRIYEKSLDREDENKGFFARFLE RVQK >gi|226332984|gb|ACII01000035.1| GENE 3 4725 - 5174 591 149 aa, chain - ## HITS:1 COG:MA2717 KEGG:ns NR:ns ## COG: MA2717 COG0822 # Protein_GI_number: 20091541 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Methanosarcina acetivorans str.C2A # 1 127 1 125 128 166 64.0 1e-41 MAYSEKVMDHFQHPRNVGEIENASGVGTVGNAKCGDIMRIFLDIDDETHIIKDCKFKTFG CGAAVATSSMATEMVMGKTIEEAMEVTNKAVMEALDGLPPVKVHCSLLAEEAIHAALWDY AQKHHIEIKGLKKPKSDIHEGEEAEEEEY >gi|226332984|gb|ACII01000035.1| GENE 4 5260 - 6444 1520 394 aa, chain - ## HITS:1 COG:MA2718 KEGG:ns NR:ns ## COG: MA2718 COG1104 # Protein_GI_number: 20091542 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 3 384 4 384 392 439 56.0 
1e-123 MGKMIYLDNAATTKTAPEVVQAMLPYFSEYYGNPSSIYDFAGKSKEAITKGREQIAEVLG AKKEEIYFTAGGSESDNWALKAAFEAYKAKGNHIITTKIEHHAILHTCEYLEKRGAKITY VDVDENGIVKLDELEKAITPETILISVMFANNEIGSIQPIKEIGRIAKEHGVLFHTDAVQ AFCQVPINVDECNIDMLSSSGHKINGPKGIGFLYIRKGVKIRSFVHGGAQERKRRAGTEN VPGIVGYGVAAARANASMKERTDKEIAIRDHLIHRIETEIPYVKVNGDRIKRLPNNVNVS FQFVEGESLLLMLDNYGICASSGSACTSGSLDPSHVLLAIGLPHEIAHGSLRMTLSEETT MEDVDFVVDRLKEIVAHLRSMSPLYEDFMKKQNK >gi|226332984|gb|ACII01000035.1| GENE 5 6462 - 6938 422 158 aa, chain - ## HITS:1 COG:CAC1675 KEGG:ns NR:ns ## COG: CAC1675 COG1959 # Protein_GI_number: 15894952 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 1 142 1 139 139 126 43.0 1e-29 MKLSTRARYGLKALIDLGLHSENEAISLQSIAERQDISTSYLEQLMAMLKKAGLVKSSRG AYGGYQLGKPADEISVGEVLRVLEGSLEAAACPGIENDGTCHGSDVCVAKLVWKRINDSI TNAVDTLMLGQLIEESRRVHEEKMGSQGMKENVPDILK >gi|226332984|gb|ACII01000035.1| GENE 6 7124 - 8455 647 443 aa, chain + ## HITS:1 COG:BS_spoVB KEGG:ns NR:ns ## COG: BS_spoVB COG2244 # Protein_GI_number: 16079820 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Bacillus subtilis # 1 433 1 436 518 145 24.0 1e-34 MSKKLFIKGTLLLTFAGLLSRLMGFFYRIFLSHTIGAHGLGIFQLILPLQILIMSICASG IQTAISRLTAAEKVSKNPSRHISDYFVVGTVFSVVFSLVFSWFLYSYADFWAVQILKESQ TSGLIRILSLSIPLSTLHTCISSYYLGRKQAGFPAGVQLLEQLFRTGGCYILYLICASQG RNITPAIAVGGNLIGEIASSFISLFAVSMHFKVTHYNIRNIRRPLSVLRKLLHISVPLTM NKILLTLLGSIEVILIPARLQMSGLTPKDSLSVYGIFTGMALPLILFPATLTNSAAAMLI PSITQLQTLGYQKRIQYVIGKIFRYCLLLGGSCLVFFLFLGEFLGNFLFHSQTAGIYIHT MSYICPFLYLNTTLTSVLNGLGKSGICLLHSVISVCIRISFVLFAIPVLGIRGYLYGILC SELALSILHSLQILQNNYSNPRK >gi|226332984|gb|ACII01000035.1| GENE 7 8720 - 9751 1139 343 aa, chain + ## HITS:1 COG:SP1700 KEGG:ns NR:ns ## COG: SP1700 COG0722 # Protein_GI_number: 15901534 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: 
Streptococcus pneumoniae TIGR4 # 13 339 13 337 343 363 52.0 1e-100 MGFDFIKKLPTPEEIRNQYPIDAEIQAVKDARDKELRDVFTGKSDKFLAIIGPCSADNED AVLDYLTRLRNVQEKIADKVLIVPRVYTNKPRTTGEGYKGMVHQPDPEKQPNLLAGLVAI RKMHIHAIRTSGMTCADEMLYPENYRYLSDLLSYVAVGARSVEDQQHRLTVSGMEVPAGM KNPTSGDLAVMLNSVVAAQGGHRFIYRSWEVETTGNELAHTILRGAVNKHGEAIPNYHYE DLRLLWEKYQEKNLKNPAVIVDTNHSNSNKQYDQQVRIAKEVLHSRQIDPELHTLVKGLM IESYIEPGNQKIGCDHIYGKSITDPCLGWEESERLLYTIAEMC >gi|226332984|gb|ACII01000035.1| GENE 8 10001 - 10885 610 294 aa, chain - ## HITS:1 COG:BS_ripX KEGG:ns NR:ns ## COG: BS_ripX COG4974 # Protein_GI_number: 16079408 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus subtilis # 1 293 1 295 296 262 45.0 5e-70 MDIEIREFITYLHNTKKTSANTEISYQRDLKKMAEFLKERGIRNYKDVKELELEGYISYM EREKFASSSISRSVASMRAFFQYLWKEGVIAEDPADNLKPPKVEKRAPEILTIEEVDKLL QQPKLDTPKGIRDSAMLELLYATGMRVSEMLHLQIFDVNLQFGYVVCNENGKERIIPIGI PCKKAMERYLQTARTVFVKDEKETALFTNCSGKAMSRQGFWKVLKGYADDAGIKRDIAPH TLRHSFAVHMLQNGADIRSVQEMLGHSDISTTQVYLGMNMNKMRDVYMKTHPRH >gi|226332984|gb|ACII01000035.1| GENE 9 11139 - 11732 350 197 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578511|ref|ZP_04855783.1| ## NR: gi|253578511|ref|ZP_04855783.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 197 1 197 197 349 100.0 7e-95 MKKNRTGTKKFPALILFLLGFLAGNLIPNIIWKAKWQQKTWASVYFLSTFAGKNTGNIEY LKEILKYRGVFYLLNIICGFSVFGAPLAVITLLGSGLYAGMIMTVSILEFGFAGGVIGMG LLLPQYLFYIPVWLYSMEQEWKISSEIWRNRGLISGEVSIYLKKMCIAAVGYFLGILIEC YVNPLIIDIILKYIKIF >gi|226332984|gb|ACII01000035.1| GENE 10 11946 - 13505 1672 519 aa, chain - ## HITS:1 COG:CAC1816 KEGG:ns NR:ns ## COG: CAC1816 COG1418 # Protein_GI_number: 15895092 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 2 519 4 514 514 507 58.0 1e-143 MTTTIIVAIVAVVVALLIAVPVTCKVAVNNKIQKDAEIVGTAEDKARSIIDEALKTAETK KREALLEVKEESLRTKNELEKETKERRNELQKYENRVLAKESAVDKKADAVEKREAECTA RAAEVQKKEKRVEELEQKGVQELERVSGLTSEQAKEELLRSVEDDVKVDVARLYRELESR AKEEAGKKAKEYVVNAIQKCAVDHVAETTISVVQLPSDEMKGRIIGREGRNIRTLETLTG VDLIIDDTPEAVVLSAFDPIRREVARVALEKLIVDGRIHPARIEEMVEKAQREVEAQIRE DGENAAMDVGVHGIHPELLKLLGRMKFRSSYGQNALRHSIEVAQLSGMLAGEVGVDVRMA KRAGLLHDIGKSIDHEVEGSHIQIGVDLCKKYKESQIVINTVASHHGDCEPESLIACIVQ AADAISAARPGARRETLETYTNRLKQLEDITNQFKGVDKSFAIQAGREIRVMVVPEHVND ADMVLLARDIAKQIEAELEYPGQIKVNVIRESRAIDYAK >gi|226332984|gb|ACII01000035.1| GENE 11 13680 - 14135 353 151 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2227 NR:ns ## KEGG: EUBREC_2227 # Name: not_defined # Def: recombination regulator RecX # Organism: E.rectale # Pathway: not_defined # 1 138 65 199 202 95 40.0 6e-19 MRLLEHMDRTEKGLREKLRQAGFTSQAVDHALTYVEAYGYIDDERYARTYIAYRMDTKSR QKIIRELMGKGIDRKTAINAWEEEVALNMPDEKEILYRTIEKKYPPNTELDEKKMRRLYG YLVRRGFGYSDIADTLENMNIRLVHTYSDES >gi|226332984|gb|ACII01000035.1| GENE 12 14270 - 15334 1187 354 aa, chain - ## HITS:1 COG:AGc3441 KEGG:ns NR:ns ## COG: AGc3441 COG0468 # Protein_GI_number: 15889174 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 354 68 415 416 466 66.0 1e-131 MINEEKQKALEAALGQIEKHYGKGSVMKLGESGANMQVETVPTGSLSLDIALGVGGVPKG 
RIIEIYGPESSGKTTVALHMVAEVQKRGGIAGFIDAEHALDPVYAKNIGVDIDNLYISQP DNGEQALEITETMVRSGAVDIVIVDSVAALVPKAEIDGDMGDSHVGLQARLMSQALRKLT AVISKSNCVVIFINQLREKVGVMFGNPETTTGGRALKFYSSIRMDVRRVETLKQGGEMVG NHTRIKVVKNKVAPPFKQAEFDIMFGTGISKEGDILDLAAECGIVNKSGAWYAYNGDKIG QGRENAKIFLKEHPDICDEIEKQVRIHYHLLPDEEGQAKEPVPATGDAPDASEE Prediction of potential genes in microbial genomes Time: Sat May 28 19:30:14 2011 Seq name: gi|226332983|gb|ACII01000036.1| Ruminococcus sp. 5_1_39B_FAA cont1.36, whole genome shotgun sequence Length of sequence - 34799 bp Number of predicted genes - 36, with homology - 35 Number of transcription units - 6, operones - 4 average op.length - 8.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 17 - 72 10.1 1 1 Op 1 . - CDS 99 - 1547 1582 ## COG0297 Glycogen synthase 2 1 Op 2 2/0.000 - CDS 1601 - 2407 729 ## COG0784 FOG: CheY-like receiver - Prom 2459 - 2518 5.2 3 1 Op 3 3/0.000 - CDS 2628 - 3845 1123 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 - Prom 3933 - 3992 3.4 - Term 3931 - 3982 10.0 4 2 Op 1 8/0.000 - CDS 4000 - 5691 1921 ## COG0497 ATPase involved in DNA repair - Prom 5714 - 5773 5.1 5 2 Op 2 1/0.000 - CDS 5775 - 6233 570 ## COG1438 Arginine repressor 6 2 Op 3 5/0.000 - CDS 6252 - 7112 743 ## COG0061 Predicted sugar kinase 7 2 Op 4 6/0.000 - CDS 7136 - 7945 794 ## COG1189 Predicted rRNA methylase 8 2 Op 5 13/0.000 - CDS 7945 - 9813 1897 ## COG1154 Deoxyxylulose-5-phosphate synthase 9 2 Op 6 . - CDS 9827 - 10714 747 ## COG0142 Geranylgeranyl pyrophosphate synthase 10 2 Op 7 . - CDS 10698 - 10937 326 ## gi|253578525|ref|ZP_04855797.1| conserved hypothetical protein 11 2 Op 8 1/0.000 - CDS 10915 - 12144 1293 ## COG1570 Exonuclease VII, large subunit 12 2 Op 9 10/0.000 - CDS 12141 - 12518 600 ## COG0781 Transcription termination factor 13 2 Op 10 . - CDS 12628 - 13014 522 ## COG1302 Uncharacterized protein conserved in bacteria 14 2 Op 11 . 
- CDS 13083 - 14672 1777 ## COG4108 Peptide chain release factor RF-3 - Prom 14761 - 14820 6.7 - Term 14704 - 14733 1.2 15 3 Op 1 . - CDS 14828 - 15580 872 ## EUBREC_2230 hypothetical protein 16 3 Op 2 . - CDS 15591 - 16028 369 ## EUBELI_01091 hypothetical protein 17 3 Op 3 . - CDS 15892 - 16458 322 ## gi|253578532|ref|ZP_04855804.1| conserved hypothetical protein 18 3 Op 4 . - CDS 16491 - 17720 808 ## Cphy_2520 sporulation stage III, protein AE 19 3 Op 5 . - CDS 17717 - 18103 438 ## Cphy_2521 stage III sporulation protein AD 20 3 Op 6 . - CDS 18156 - 18350 230 ## EUBREC_2235 hypothetical protein 21 3 Op 7 . - CDS 18432 - 18950 342 ## Cphy_2523 hypothetical protein 22 3 Op 8 . - CDS 18935 - 19873 1017 ## COG3854 Uncharacterized protein conserved in bacteria 23 3 Op 9 . - CDS 19908 - 21038 999 ## LCABL_23800 hypothetical protein 24 3 Op 10 . - CDS 21042 - 21593 521 ## COG2091 Phosphopantetheinyl transferase 25 3 Op 11 . - CDS 21590 - 23416 1192 ## COG0210 Superfamily I DNA and RNA helicases 26 3 Op 12 . - CDS 23419 - 24870 1731 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Prom 24952 - 25011 6.2 + Prom 24850 - 24909 6.3 27 4 Tu 1 . + CDS 25010 - 25156 102 ## 28 5 Op 1 4/0.000 - CDS 25627 - 27342 1542 ## COG0777 Acetyl-CoA carboxylase beta subunit - Prom 27431 - 27490 7.2 29 5 Op 2 4/0.000 - CDS 27525 - 28871 1417 ## COG0439 Biotin carboxylase 30 5 Op 3 4/0.000 - CDS 28875 - 29306 762 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases 31 5 Op 4 4/0.000 - CDS 29326 - 29787 629 ## COG0511 Biotin carboxyl carrier protein - Prom 29807 - 29866 5.2 32 5 Op 5 11/0.000 - CDS 29875 - 31110 1595 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 33 5 Op 6 26/0.000 - CDS 31125 - 31865 245 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 34 5 Op 7 6/0.000 - CDS 31867 - 32787 1202 ## COG0331 (acyl-carrier-protein) S-malonyltransferase - Prom 32818 - 32877 9.7 - Term 32891 - 32932 8.6 35 5 Op 8 . 
- CDS 32954 - 33187 421 ## COG0236 Acyl carrier protein - Prom 33376 - 33435 8.3 36 6 Tu 1 . - CDS 33437 - 34633 1242 ## COG0772 Bacterial cell division membrane protein - Prom 34677 - 34736 8.5 Predicted protein(s) >gi|226332983|gb|ACII01000036.1| GENE 1 99 - 1547 1582 482 aa, chain - ## HITS:1 COG:CAC2239 KEGG:ns NR:ns ## COG: CAC2239 COG0297 # Protein_GI_number: 15895507 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Clostridium acetobutylicum # 4 475 3 473 477 475 51.0 1e-133 MKNILFATSEAVPFIKTGGLADVAGSLPKYFDKRYFDIRVILPKYACMKQEWKDKMNYIT HFYMDLGYKNCYVGIMHMEYEGIQFYFIDNEYYFSGPKPYDGGTWDLEKFAFFSKAVLSV LPVIGFRPDIIHCHDWQTGLVPVYLHDSFQQNEFFWNIKTIMTIHNLKFQGVWDVQTIKN ITGLSDYYFTADKLEAYKDANYLKGGIVFADAVTTVSNTYAEEIKTPFYGEKLDGLMCAR ANSLRGIVNGIDYNEFNPETDPYITKTYNATTFRKEKVKNKLQLQRDLGLQEDPKTMMIG IVSRLTDQKGFDLIAYVMDELCQDAIQLVILGTGDERYENMFRHFDWKYHGKVSAQIYYD EKMSHRIYASADAFLMPSLFEPCGLSQLMSLRYGTLPIVRETGGLKDTVVPYNEYEGTGN GFSFRNYNAHEMLATVRNAERIYYDKKREWNKMVDRAMAADFSWGNSARQYEEMYNWLIG DK >gi|226332983|gb|ACII01000036.1| GENE 2 1601 - 2407 729 268 aa, chain - ## HITS:1 COG:BS_spo0A_1 KEGG:ns NR:ns ## COG: BS_spo0A_1 COG0784 # Protein_GI_number: 16079478 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Bacillus subtilis # 1 121 1 120 120 124 49.0 1e-28 MEKLNVAIADDNEKMVEVLGQIIEEDKDLELVGKAHNGEEICNIIREKEPDVVVLDIIMP KMDGLAVMEKFANDKSLKKIPSFIVVSAVGQERITENAFNLGADYYILKPFDNQMLLNRI KHVRRAGERRIRQIGRQPERTEDNPVPVRNLETDVTNIIHEIGVPAHIKGYQYLRDAIIL SVNDMEMLNSITKILYPTIAKKHQTTASRVERAIRHAIEVAWSRGKMDTIDELFGYTVST GKGKPTNSEFIALIADKIRLEYKNRSFQ >gi|226332983|gb|ACII01000036.1| GENE 3 2628 - 3845 1123 405 aa, chain - ## HITS:1 COG:CAC2072 KEGG:ns NR:ns ## COG: CAC2072 COG0750 # Protein_GI_number: 15895342 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Clostridium acetobutylicum # 77 400 65 391 395 229 41.0 6e-60 
MNKRKYRLFVICCLVLDLLVMLISGYRYLDRKIPDEIQISRGKKTEDVTEVLSTPFVTFE EAVTVSQDGGYILPCKLLGYIPFKEIKVTPADDQEIYVSGSTIGIYMQTEGVLVIDTGEI QNRNGETEEPARNIIRQGDYIISFNGEKISTKRELIDDISELDGSEVTLGISRKGESIPV SVTPVKDKKGDYKLGIWVRDDTQGIGTLTYVDQNGNYGALGHGISDIDTAQLLNIRNGAL YKARILAINKGSKGNPGELAGYICYDDRNILGTIEANSRNGIYGQFTGIADDAITLKKMP AAYKQEVKIGTATILCSTDGEVKEYDAEIRKIDLNHEDTNKSFVIKVTDKELLEATGGIV QGLSGSPVIQNGKIIGAVTHVFVQDASSGYGIFIENMLKNTERLF >gi|226332983|gb|ACII01000036.1| GENE 4 4000 - 5691 1921 563 aa, chain - ## HITS:1 COG:BH2776 KEGG:ns NR:ns ## COG: BH2776 COG0497 # Protein_GI_number: 15615339 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Bacillus halodurans # 1 558 1 562 565 350 38.0 5e-96 MLAHLHVKNLALIEEIEVEFGPGLNILTGETGAGKSILLGSMQLILGAKTSKNMIRENAS YALVELLFQVENEKALETLKALDICPEDGQVLLSRKIMDGRSINKINGETSTVGQMKAAA ACLLDIHGQHEHQSLLYQDKQLAILDAYGKEKILPAKEKVSLAYKEYSKCKTELGSMNMN EEQRNRELAFLEFEIKEIEKANLQPGEDEELEARYRKMSNAKLIVDSLQLVHNLTGYESK EGAGETVGEALKEFSHVTQYDPELTPLAETLTSVDGLLNDFNRELSAYLDELTFDEGEFY ETERRLDLINGLKAKYGRTIEEIFTYRKLQEEKLEKLHKYEENLQELKEHLRELENILEK KSDELAKIRKEYSKQLEQKIIQGLKDLNFLDVNFAIDFRKKKNYTDNGTDDIQYLISTNP GETLKPLGQIVSGGELSRIMLALKAILADRDEIETLIFDEIDTGISGRTAQKVSEKMAVI GQHHQVLCITHLPQIAAMADSHFEIEKHLQGTETITQIHVLKEHDSIRELARLLGGAEIT PAVLENAKEMKELAQKQKNTRFK >gi|226332983|gb|ACII01000036.1| GENE 5 5775 - 6233 570 152 aa, chain - ## HITS:1 COG:CAC2074 KEGG:ns NR:ns ## COG: CAC2074 COG1438 # Protein_GI_number: 15895344 # Func_class: K Transcription # Function: Arginine repressor # Organism: Clostridium acetobutylicum # 1 148 1 148 150 115 43.0 3e-26 MKVARHEKIIELIHQYDIDTQEELASRLNEAGFKVTQATVSRDIRALKLTKVAGKDGKSR YAIINNESGSLGEKYTRVLEDTLLSIDVGQNIIVIKTVSGMAMGVAAALDALKWPEILGC IAGDDTIMCAAKTADMALGAAEKLRSFVRLNK >gi|226332983|gb|ACII01000036.1| GENE 6 6252 - 7112 743 286 aa, chain - ## HITS:1 COG:RSc2650 KEGG:ns NR:ns ## COG: RSc2650 COG0061 # Protein_GI_number: 17547369 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted 
sugar kinase # Organism: Ralstonia solanacearum # 51 283 63 290 302 157 33.0 1e-38 MDRFYIITNSDKDKKLEITEKIADYLKTHHKNCEVQQAERKHEGSFHYTDPDKVPDDTQC IIVLGGDGTLLQAARDVVHKEIPLLGINLGNLGFLAEVNQTSLYSALDQLMADDYEVEER MMLEGRVYRGRKLIGQDIALNDIVIGRDGHLRVVRFKNYVNDVYLNSYNADGIIISTPTG STGYSLSAGGPIVSPNAAMTIMTPIAPHTLNTRSIIFPAQDVITVEIGKGRHCDCEKGIA SFDGDTFIPMVTGDCIQIRQADVKTKILKLNHLSFVEVLRRKMRDS >gi|226332983|gb|ACII01000036.1| GENE 7 7136 - 7945 794 269 aa, chain - ## HITS:1 COG:CAC2076 KEGG:ns NR:ns ## COG: CAC2076 COG1189 # Protein_GI_number: 15895346 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase # Organism: Clostridium acetobutylicum # 2 269 5 267 267 315 59.0 4e-86 MKKRLDMLMMERALAPSREKAKAFIMAGDVYVDGQKEDKAGTMFPETVKIEVRGNTLPYV SRGGLKLEKAMKNFDVTLDSKVCMDVGASTGGFTDCMLQNGAVKVYSIDVGYGQLDWKLR NDPRVVCMEKTNIRYVVPEDLGEPADFSSIDVSFISLTKVLLPVRNLLTDEGEIVCLIKP QFEAGREKVGKKGVVRDPAVHQEVIEKVRDYAMSIFMEPCHLSFSPIKGPEGNIEYLLHL KKHPEGTGVTDSLKVSVEAVVAEAHGQLD >gi|226332983|gb|ACII01000036.1| GENE 8 7945 - 9813 1897 622 aa, chain - ## HITS:1 COG:CAC2077 KEGG:ns NR:ns ## COG: CAC2077 COG1154 # Protein_GI_number: 15895347 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Clostridium acetobutylicum # 2 614 4 615 619 617 49.0 1e-176 MILEKIKKPNDIHKIPLEDFEPLAAEIRDFLIRSVSQTGGHLASNLGVVELTLALHNVLD FPEDKLIWDVGHQAYTHKILTGRKDEFKNLRQEGGLSGFPKRSESPCDAYDAGHSSNSIS AGLGYVHARDILGQKHHVVSVIGDGALTGGMAYEALNNAAELKTNFIIIINDNNMSISRN VGGMSTYLSALRTAEAYTGMKMGVTKALKKVPKVGTALVDTMRKTKSSVKQLFIPGMLFE NMGLTYLGPVDGHNMRQMMKLFNEAKRVEGPVVVHVLTKKGRGYEPASAYPDRFHGTGPF DIKTGRVLQKKTVPGYTDVFSKALVSLGEKNKKLTAITAAMPDGTGLVEFSRRFPDRFFD VGIAEEHAVSFAAGLALGGLVPVVAIYSSFLQRAVDQILHDVCMQKLHVIFAVDRAGLVG ADGETHQGCFDLSYLSMMPNMTVLAPKNDRELEEMLAFAVSFDGPIAIRYPRGSAHQGLR EYQAPVEYGRSEIIRKGKKIAVLGVGSMIPSCMEICKGLKDDGYDPTFVNARFVKPLDVD LLDELAKDHSLFVTVEENVKNGGYGEHVSAYMEACHPEIRVLSAAVWDRFVPQGNVESLR SRIGLGVEDIRQAIEDSEELRE 
>gi|226332983|gb|ACII01000036.1| GENE 9 9827 - 10714 747 295 aa, chain - ## HITS:1 COG:alr0213 KEGG:ns NR:ns ## COG: alr0213 COG0142 # Protein_GI_number: 17227709 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Nostoc sp. PCC 7120 # 34 293 47 307 309 213 44.0 5e-55 MNFQDELAKRTEETEKVIRSFLPAEAGFAGTMAQAMNYSMLAGGKRLRPMLIRETYRLFD GKEEVVKPFMAGMEMIHTHSLIHDDLPALDDDDYRRGRLTTHKVYGEAMGVLSGVALLNY AYETMFQAFALTKEQDRVIHALRVVSQKTGIHGMLGGQSVDVENDGKPLEKEMLDYIYRN KTSALIEASMMTGAILAGANEQEVSAVEKAAGNIGLAFQIQDDILDVTSTAEELGKPVHS DEKNNKVTYVTLFGTEKAAEQVEELSEKAIDLLKSLNKNNEFLYLLIEKLINRRK >gi|226332983|gb|ACII01000036.1| GENE 10 10698 - 10937 326 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578525|ref|ZP_04855797.1| ## NR: gi|253578525|ref|ZP_04855797.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 79 1 79 79 103 100.0 4e-21 MAKKENCEQLQEEKSLEEMFAELDLLAEKLENRDTSLEDSFILYRQGMELLKLCSGKLDT VEKKMLQLNEDGTFSEFSR >gi|226332983|gb|ACII01000036.1| GENE 11 10915 - 12144 1293 409 aa, chain - ## HITS:1 COG:CAC2082 KEGG:ns NR:ns ## COG: CAC2082 COG1570 # Protein_GI_number: 15895352 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Clostridium acetobutylicum # 6 393 7 395 399 311 40.0 2e-84 MKNTYSVGQVNRYIKNMFTQDYLLQKIYVKGEVSNCKYHTSGHIYFSLKDETGTLNCVMF AGHRRGLAFAMKNGDKVIVGGSVDVYERDGRYQMYAKEITLEGAGILYERYLALKQELED MGMFAQEYKQPIPRFIRRLGVVTAPTGAAVQDIRNISYRRNPYLQIILYPALVQGAGAAE SIVKGIQMLDKTDVDVIIVGRGGGSIEDLWAFNEEIVARAIFECSTPIISAVGHETDFTI ADFAADLRAPTPSAAAELAVDDYRSVIEAVSIYRQRLYRAMSGKTDLYRSRLEHFQTKFA YLSPENRLREQRQRLADLENAVQNGMNRKLQDERHRLSVYLERFAGLSPLKKLNQGYSYV ADKYKKTLTSVEQVQNGDTIYISVTDGTIEADVTGTVKEERIYGEERKL >gi|226332983|gb|ACII01000036.1| GENE 12 12141 - 12518 600 125 aa, chain - ## HITS:1 COG:TM1765 KEGG:ns NR:ns ## COG: TM1765 COG0781 # Protein_GI_number: 15644510 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Thermotoga maritima # 
3 123 20 138 142 80 41.0 6e-16 MTEFNSEEEMSEQLSLYFDTLEELSEKDQEYMSRKYRHVLEKLDEIDALLNETSNGWKTK RMSRVDLTALRLAVYELKYDKDVPTGVAINEAVELAKRFGGETSGSFVNGILGKIANSES EEKSE >gi|226332983|gb|ACII01000036.1| GENE 13 12628 - 13014 522 128 aa, chain - ## HITS:1 COG:BH2786 KEGG:ns NR:ns ## COG: BH2786 COG1302 # Protein_GI_number: 15615349 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 16 128 16 128 132 79 40.0 1e-15 MAEKRITYKIQDLGGIGEVQIADEVVAIIAGLAATEVNGVASMAGNITNELVSKLGMKNL SRGVKVTVLEGVVTVDLNLNIEYGKNILETSKKVQEKVKSSIENMTGLEVADVNIHIASV DMENEKGK >gi|226332983|gb|ACII01000036.1| GENE 14 13083 - 14672 1777 529 aa, chain - ## HITS:1 COG:CAC0630 KEGG:ns NR:ns ## COG: CAC0630 COG4108 # Protein_GI_number: 15893918 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: Clostridium acetobutylicum # 1 529 1 526 526 711 65.0 0 MSKYTNEIEKRRTFAIISHPDAGKTTLTEKFLLYGGAINLAGSVKGKATARHAVSDWMEI EKERGISVTSSVLQFHYDGYCINILDTPGHQDFSEDTYRTLMAADSAVMVIDGSKGVEAQ TRKLFKVCVMRHIPIFTFINKMDRDANDTFDLLDEIEKELGIATCPINWPIGSGKKFRGV YDRNTEKILTFSDTQKGTKEGVIEEIPLSDSRADEIMDPEQKAQLLDEIELLDGASADFD QELVSKGKLTPVFFGSALTNFGVETFLEHFLKMTTSPLPRVSDIGEIDPMDDDFSAFVFK IQANMNKAHRDRIAFMRICSGKFDASQEVYHVQGDKKMRLLQPQQMMAESRHVVDEAYAG DIIGVFDPGIFSIGDTICAPGKKFAFEGIPTFAPEHFARVRQIDTMKRKQFVKGINQIAQ EGAIQIFQEFNTGMEEIIVGVVGVLQFDVLKYRLENEYNVDIRLETLPYEHIRWIANKEE VKVDKIIGTSDMKKVTDLKGNPLLLFVNSWSVGMTEDRNPGLILTEFSK >gi|226332983|gb|ACII01000036.1| GENE 15 14828 - 15580 872 250 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2230 NR:ns ## KEGG: EUBREC_2230 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 206 1 207 208 105 40.0 1e-21 MKKVIGKNQVIITSLAILIAVAGYLNFADVDLGFRDKEASTDSSSILEDAGYDLTDETAL LDENQADGGLTDNSLTDSQETDTPGEAVFTGSTGFAAQAKISREQVRSQNKADLQDIINN EEIDDEEKQEAIHTMVSMTDLSEKEAAAELLLEAKGFKNVVVNLTGETADVVIPEAELSD AQRAQIEDIVKRKTGIAPENIVITPLKESEDAEDTTQTDETSADTSYEEDSETSAQPYED 
TAIDTTDIYD >gi|226332983|gb|ACII01000036.1| GENE 16 15591 - 16028 369 145 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01091 NR:ns ## KEGG: EUBELI_01091 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 6 144 3 133 135 77 37.0 1e-13 MRWWKSGDLKKTEFVSRLSEMNPAKWGILLLIGVLLAVAAMPVSSKQQKKNTAGEQSEDM QKNVLEEKLEAFLENTEGVGKVQVILMTDEKKDRQSFYNSETIQVTGVLISAEGGGNPVV VQNIQEAVMALFQVDAHRIRIMKMK >gi|226332983|gb|ACII01000036.1| GENE 17 15892 - 16458 322 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578532|ref|ZP_04855804.1| ## NR: gi|253578532|ref|ZP_04855804.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 188 1 188 188 331 100.0 1e-89 MKEELYLWIRNLAVFYIFFTAVLNLIPDQKYEKYVRFFMGLLLIFMMSTPIFSILGKGSE LTESFLDNFSEENREKELREFQNIQKVYLEKGYELELEQKIRETLEKRGIEVYKVKVNIE GEETQANLVLKTEISQEERKELKDALVEEWGLKENRICIQIVRNESGKMGNPVAHRSTSG SSGDACIQ >gi|226332983|gb|ACII01000036.1| GENE 18 16491 - 17720 808 409 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2520 NR:ns ## KEGG: Cphy_2520 # Name: not_defined # Def: sporulation stage III, protein AE # Organism: C.phytofermentans # Pathway: not_defined # 68 407 35 375 380 186 32.0 1e-45 MRQDKERSIREQADLWKGYWCILACILLVGRVSYLTFFQNHIVSVYAAAQTEQKEDSSVQ ILQENLMEEIQIDDVQKMLDEIMDDHVFSVRKALMNIINGEEPVSKETVRSFLYSLFFSD IENEKGLILKLLLVIFIAAVLAEFADVFGNGQAGSISFYIVYLALFTMLMENFSRLGSTL TNWLLGLTDFMKVLSPAYFMTVAASTGSSTAAAFYEGILLMIWAVQWLLANLFLPAVNLS LLLKMVNYLSKEEMLTKMAELLDVAVNWGLKTLLGAIVGLQIVRNMVSPVMDAMKRSAVG KAASAIPGIGNAVTAVTELVLTSAVMVRNSFGAVIVILLLLIGAGPIIHYGSLSLVYRFL AAIAQPISDKRVVGALSTMGEGCAMLLKLLLTAEVLCMLTFLIVVVSVS >gi|226332983|gb|ACII01000036.1| GENE 19 17717 - 18103 438 128 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2521 NR:ns ## KEGG: Cphy_2521 # Name: not_defined # Def: stage III sporulation protein AD # Organism: C.phytofermentans # Pathway: not_defined # 5 127 15 137 139 98 43.0 7e-20 MDIVKVSIMGICGMMLGFILKETRPEFAALVTMMTGFLILGLAAGKVSYLFETMNRLRES FPIDSSYLTVLVKIIGITYIGQFSSAICKDAGYQMIGTQIDLFCKLSVMVLSMPVLLAIL DTISEFMI 
>gi|226332983|gb|ACII01000036.1| GENE 20 18156 - 18350 230 64 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2235 NR:ns ## KEGG: EUBREC_2235 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 64 2 65 65 68 67.0 9e-11 MEVSLIFKIAAVGILVSVLCQILKHSGRDEQAFLVSLAGLLMVLFWIVPYIYELFESIQQ LFVL >gi|226332983|gb|ACII01000036.1| GENE 21 18432 - 18950 342 172 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2523 NR:ns ## KEGG: Cphy_2523 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 170 6 174 176 95 32.0 6e-19 MLRTVGLCVIVISGVGCGFAMSNELGRRKKMMEMILRMIILLRGEIRYGNKSLYDAFTGA SGKLEGKYREFFILTAQKMKEKTGDSFGTIFRESAGKCLDLDCLSQEERDRFYSLGDQLG YLGLDMQLKQMDLMEKETEYAIRELRKDFCEKRKLYRSMGILGGIFVAVFFW >gi|226332983|gb|ACII01000036.1| GENE 22 18935 - 19873 1017 312 aa, chain - ## HITS:1 COG:CAC2093 KEGG:ns NR:ns ## COG: CAC2093 COG3854 # Protein_GI_number: 15895363 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 27 303 23 295 305 248 43.0 8e-66 MRHIEAQQIENLFAGNIRNLLMDAKLDYDKLYEIRLRVGRPLFLTYDGGECFLRKPGQEQ YLVTTEDLKETLEYVTGYSLYAYEDEIRQGFISVQGGHRVGVTGKVVLDRGRIMGMKYIS CINVRLAHEIQGCADKVMPYIRTEKWVANTLIISPPRCGKTTLLRDIIRQLSNGWANTQG LTIGVVDERSELAGCYQGIPQNDLGMRTDVLDCCPKSEGMQMLVRSMSPDVVAVDELGCE EDFKAVDSVIHCGCKLIATAHGSCLEDVLDQPFFYKLMGEKVFERYIILDKNDQAGCLKA VLDENGKVCLEQ >gi|226332983|gb|ACII01000036.1| GENE 23 19908 - 21038 999 376 aa, chain - ## HITS:1 COG:no KEGG:LCABL_23800 NR:ns ## KEGG: LCABL_23800 # Name: not_defined # Def: hypothetical protein # Organism: L.casei_BL23 # Pathway: not_defined # 33 365 30 376 388 188 34.0 2e-46 MNRQTKKEIRRLAVKILVAVLALGLAFWGYAFVYRGKTEPPAKYPVTNQDAAVYSLRGQI QMLSQQEELNTFAFEGRKKEKEYGTYIIPGLRYTRTFLNVEGTKQAVCTSMTPQGLAVTD EYVLVSAYCHTEKHNSVIYVINKETHRFIKEVVLPGLPHVGGLAYDQEHDMLWYSSNTNG IAQAISIKMDVLRDYNYADNHMPVQVNQTCSLYGIVRDSFMTFYKGCLYVGCFNKYTEST IARYAVDSKGDLVNTFNEELGMMFEMAVPLDYSTISEQAQGMAFYNDKLLLSHSFGILPS 
RIVFYEQSDKRLYVNENSARSYRMPEMIEQIVVEGDELYVLFESGAYAYRGYAGNVVDRV LKLNLIKMEESYQEEY >gi|226332983|gb|ACII01000036.1| GENE 24 21042 - 21593 521 183 aa, chain - ## HITS:1 COG:BS_sfpm KEGG:ns NR:ns ## COG: BS_sfpm COG2091 # Protein_GI_number: 16081163 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheinyl transferase # Organism: Bacillus subtilis # 16 162 39 187 224 65 25.0 7e-11 MNKGIIYCTKIREEYEGAHMEHMIAEKLLEIALKKEYGINLYNEPRAEGEHGKPFLSYRP SLHYNISHSGKYVVCILADQEVGIDVQIHCRANYERMLRRMVPREQYLEILSDINVEKKF FEQWVLREAYIKWTGEGLSRDMRTISMDEGSHMLLDMEDGYSGAVWAMNPMEINWKYEDI ILG >gi|226332983|gb|ACII01000036.1| GENE 25 21590 - 23416 1192 608 aa, chain - ## HITS:1 COG:BS_yjcD KEGG:ns NR:ns ## COG: BS_yjcD COG0210 # Protein_GI_number: 16078247 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Bacillus subtilis # 4 603 137 746 759 321 35.0 2e-87 MKRNPSQLRAITHLSGPMMVLAGPGSGKTSVIVERTAYMINEGKIPASSILVVTFSRAAA TEMKERFLKFVGQNRSEVTFGTFHGIFYGILKAAYHLSAANILSEEEKFSILREMTEKYG QEMAQEGDFIEEVAREISVVKGNCISPEHYYASCCSDEIFRDIFHGYKQALKAKRKLDFD DMILCCYELFSQRQDILNAWRRKFVYILVDEFQDINSLQYKILQMLAAPVNNLFIVGDDD QSIYHFRGARPEIMLNFTKDYPKAETVLLNVNYRCSKNILRTAMEVIGCNTRRFKKQLDT PNEEGMPVTCKEFDNPREEYMCVVAALKKRMERGEDLLNTAILLRTNQESEGLINVLMEY QVPFTMKEQLPNLFRHWICRALLAYLEMAAGDRSRKNFLEIMNRPNRYISREALNSSQIN FNELREYYKDKDWMCDRITTLETHLRILGTLSPFAAINFIRKGMGFEEYLREYAQYRKIK PEELLETLDRIHESAKGMKTLAKWQAYIEEYTKRLNEQARKQQDKKEGVTISTLHAVKGL EYDIVYILNVNEGSIPYRKAVLAEAVEEERRLFYVGMTRAKKKLVLAYVKRQYEKEREPS RFLEETGL >gi|226332983|gb|ACII01000036.1| GENE 26 23419 - 24870 1731 483 aa, chain - ## HITS:1 COG:CAC0990 KEGG:ns NR:ns ## COG: CAC0990 COG0008 # Protein_GI_number: 15894277 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 3 483 5 485 485 632 63.0 0 MTKIRTRFAPSPTGRMHVGNLRTALYAYLITKHEGGDFILRIEDTDQERYVEGAVDIIYR 
TLAKTGLLHDEGPDKDKGFGPYVQSERQAQGIYLKYAQQLVEQGDAYYCFCKKEELEGMK RVVAGKEISIYDKRCLKLSKEEVQRRLDAGEPHVIRFNMPTEGTTTFHDVIYGDITVNNE ELEDLILIKSDGYPTYNFANVIDDHLMEITHVVRGNEYLSSAPKYNRIYEAFGWDIPVYV HCPLITDETHQKLSKRCGHSSYEDLLDQGFVSEAIVNYVALLGWSPSDNREIFTLDELVQ AFDYHHINKSPAVFDIAKLRWMNGEYIKKMDADEFYERALPYMKEVLKKDYNFKKIAGMV QTRIETFPDIPALIDFFEEVPEYDSAMYCHKKMKTNEETSLTVLKEVLPVLEEQKDYTND PLFETLSAFVKEHGYKNGYVMWPLRTAVSGKQMTPAGATEIMEIIGKDETIKRVKAAIEK LSK >gi|226332983|gb|ACII01000036.1| GENE 27 25010 - 25156 102 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFFGKGRMGSNYVWGLSFGTIVFLGIREAAKIMSRSLKLAPLQTPLRQ >gi|226332983|gb|ACII01000036.1| GENE 28 25627 - 27342 1542 571 aa, chain - ## HITS:1 COG:L0180 KEGG:ns NR:ns ## COG: L0180 COG0777 # Protein_GI_number: 15672761 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase beta subunit # Organism: Lactococcus lactis # 18 280 21 282 288 330 61.0 3e-90 MADLKRFFKKTVSEDRHKQAASVSVPEGLWIKCPKCGEMVYREDVISNAYVCPKCGGYFR ISAKRRIKMIADKGSFKEWFTGIDNSNPLDYPDYPQKIEDLREKTHLDEAILIGEATIGG IRTVIGAMDTRFLMASMGYVVGEKITRSFEEATKQGLPVILFCCSGGARMQEGIVSLMQM AKTSAAIKRHSQAGLLYTSVLTDPTTGGVTASFAMLGEIILAEPGSLIGFAGPRVIEQTI GQKLPEGFQRPEFLLEHGFLDGIVERGSMRDVLSLILKLHDTRKCYGEFPQETSARSFDT FGKKKKQKKQKNLTAWDRVQIARSSDRPSAAEYIDAIFDDFMEFHGDRGVRDDNAIIGGI ATLDGQPVTVIGIRKGHTTKENIRCNFGMPSPEGYKKALRLMKQAEKFGRAVIAFVDTPG AFCGLEAEERGQGEAIARNLLEMSDLKVPVLSVVIGEGGSGGALALAVGNEVWMMENATY TILSPEGFASILWKDSRKAPEAAEVMKVTAAHLKELGIIERIIPEEYPASEENLPEIAEY MKIRMKQFLEKQAGKSGEQIAQERYERFRKM >gi|226332983|gb|ACII01000036.1| GENE 29 27525 - 28871 1417 448 aa, chain - ## HITS:1 COG:CAC3570 KEGG:ns NR:ns ## COG: CAC3570 COG0439 # Protein_GI_number: 15896804 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Clostridium acetobutylicum # 1 443 1 443 447 548 60.0 1e-155 MFNKILIANRGEIAVRIIRACREMGIRTVAVYSEADRDALHTQLADEAVCIGKAQSSDSY LNMERILSATIATKAEAIHPGFGFLSENSRFVEMCEKCNVAFIGPSAEVISRMGNKSEAK NTMRKAKVPVVPGTKEPVYTVEQALEAVKEIGFPVMIKASAGGGGKGMRVAGNEQEFAKL 
FETAQQESVHSFSDNTMYLERFVENPRHVEVQILADKYGNVVHLGERDCSVQRRHQKMIE ESPCIAISDDLRKKMGETAVRAAKAAGYESAGTIEFLLDQSGEFYFMEMNTRIQVEHPVT EFVSGVDLIKEQICIAAGEPLSVQQKDIEICGHAMECRINAEDPERHFMPCPGKITDLHL PGGNGIRVDTAVYNDYAIPPYYDSMIAKIIVYDKDRRSAIRKMISALGEVAIEGVKTNVD FLYELLNQPDFQEGNITTDFIPQHYPDL >gi|226332983|gb|ACII01000036.1| GENE 30 28875 - 29306 762 143 aa, chain - ## HITS:1 COG:SA1901 KEGG:ns NR:ns ## COG: SA1901 COG0764 # Protein_GI_number: 15927673 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Staphylococcus aureus N315 # 7 141 9 142 146 158 63.0 3e-39 MKLTTKEIMEIIPHRHPFLLIDTIEELVPGVKATGKKCVTYNEPHFAGHFPQEPVMPGVL IVEALAQTGAVAILSKPENKGKIAYFASINNAKFKNKVVPGDTLTLEVEIIKEKGPMGVG KAKATNQDGKVAVIAELTFAVGA >gi|226332983|gb|ACII01000036.1| GENE 31 29326 - 29787 629 153 aa, chain - ## HITS:1 COG:DR0118 KEGG:ns NR:ns ## COG: DR0118 COG0511 # Protein_GI_number: 15805158 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Deinococcus radiodurans # 77 153 111 187 187 102 62.0 2e-22 MELANILELIHAVSDSDLTEFNLQDDTLNISMSKEKTIVQQMAVNADPADAQQYQPVVQQ AVHVESVNAVNDEVQTGSVVKSPLVGTYYAASSPENPPFVKVGDKVSKGQVLGIVEAMKL MNEIESEFDGTVKEILVENEQMVEFGQPMFVIE >gi|226332983|gb|ACII01000036.1| GENE 32 29875 - 31110 1595 411 aa, chain - ## HITS:1 COG:BS_yjaY KEGG:ns NR:ns ## COG: BS_yjaY COG0304 # Protein_GI_number: 16078199 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Bacillus subtilis # 2 410 4 412 413 461 55.0 1e-129 MRRVVITGMGAITPVGLSVEEFWNSVKEGKTNFAEVTRFDSTNYRAHMAAEINNFNPKDY MDFKSAKRMELFSQYAVAAAKEAIEQSGLDMTKEDPFRVGCSVGSGIGSMQVVEKSCEIL NTKGPGRLNPLMIPLLISNMASGNVSIQFGLRGKNINVVTACATGTHSIGEAYRTIQCSD ADVMVAGGTEAAITPTGFGGFAALTALTSSTDPERCSIPFDKERSGFVMGEGAGVVVLEE LEHAKARGAKILGEVVGYGATGDAYHITSPAEDGSGAAKAMEWAVKEAGISTDNVWYVNA 
HGTSTHHNDLFETIAIKKTFGEHAKEMKINSTKSITGHLLGAAGAVEVITCVKELQEGLI HQTVGYKVPDEQCDLDYVGNGNVKMDIKYALTNSLGFGGHNASLLIKKYED >gi|226332983|gb|ACII01000036.1| GENE 33 31125 - 31865 245 246 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 2 246 1 242 242 99 30 4e-20 MLKNKIALITGAGRGIGRAIAIALAKEGAEVVINYNGSEERAKEVKQTIEENGGKASIYK CNVSDFAACEAMIKDIVKEYGHLDILVNNAGITKDGLIMKMKEEDFDSVLNVNLKGTFNT IRHSARQMLKQRSGKIINISSVSGILGNVGQANYAASKAGVIGLTKTMARELGSRGITVN AIAPGFVDTEMTEVLSEEIRENACKQIILGRFGKPEDIANTAVFLASDKADYITGQVISV DGGMNV >gi|226332983|gb|ACII01000036.1| GENE 34 31867 - 32787 1202 306 aa, chain - ## HITS:1 COG:CAC3575 KEGG:ns NR:ns ## COG: CAC3575 COG0331 # Protein_GI_number: 15896809 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 1 302 1 307 308 306 52.0 4e-83 MSKIAFVFPGQGAQYTGMAKDFYEKYAVSRKVFENASKASGLDVEALCFEENDRLNITEY TQIAMLTAEIAILRAVEEAGIRSQVNAGLSLGEYGALVASGVMKEEDAFTVVRKRGIFMQ EAYPTGGAMSAVLGTDAELIEKICNETQGIVSIANYNCPGQIVITGEETAVAAAGEALKA AGARRVIPLKVSGPFHCELLKGAGEKLGQELEKVEIQSFTVPYVTNVTAQYVTGPEQVKE LLVSQVSSSVRWQQCVEQMIDDGVDTFIEIGPGKTLTGFLKKINRNVKALHVEKTEDLDE VRKECL >gi|226332983|gb|ACII01000036.1| GENE 35 32954 - 33187 421 77 aa, chain - ## HITS:1 COG:aq_1717a KEGG:ns NR:ns ## COG: aq_1717a COG0236 # Protein_GI_number: 15606797 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Aquifex aeolicus # 3 73 5 75 78 66 54.0 1e-11 MLEKMSEMIAEQLNCDAAGITAETSFKDDLGADSLDLFELVMALEDEYNIEIPAEDLTDL TTVGAVMDYLKNKGVEA >gi|226332983|gb|ACII01000036.1| GENE 36 33437 - 34633 1242 398 aa, chain - ## HITS:1 COG:BH3275 KEGG:ns NR:ns ## COG: BH3275 COG0772 # Protein_GI_number: 15615837 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: 
Bacillus halodurans # 21 395 3 388 398 189 35.0 9e-48 MENTKSQKKSYSTASSKSRARRKTKTDYYDYSLVAVIVLLTCFGLIMLYSTSSYMAQINY GSDMYFFKKQAIISVACIIMALIISRLNYRILNRFSTALYVAALVLMALVKTPLGQSSHG AQRWLNLGPVQFQPAELAKIAVIVCLPYMIVHMGKKVHTLKGCMVLAVVGGGLALAAYVF TDNLSTAIIIFCITAGLIFVAHPDIKIFIIIAGVVIALAVIGVIFLNATVSVDGSGSFRL RRIMVWLHPEEYADSWGYQTIQALYAIGSGGFFGRGLGNSIQKLGSVPEAQNDMIFSIIC EELGIFGGLIVLMLYAYLLYRLFVIAQNAPDMFGSLMVSGIFIHIALQVILNIAVVVNLM PNTGVTLPFISYGGTSIVFLMAEMGLALSVARQIKFEE Prediction of potential genes in microbial genomes Time: Sat May 28 19:31:07 2011 Seq name: gi|226332982|gb|ACII01000037.1| Ruminococcus sp. 5_1_39B_FAA cont1.37, whole genome shotgun sequence Length of sequence - 50172 bp Number of predicted genes - 56, with homology - 54 Number of transcription units - 27, operones - 16 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 62 - 3076 3320 ## COG1511 Predicted membrane protein 2 1 Op 2 . - CDS 3135 - 5225 2081 ## COG1033 Predicted exporters of the RND superfamily - Prom 5409 - 5468 6.1 + Prom 5359 - 5418 9.4 3 2 Tu 1 . + CDS 5554 - 6213 599 ## Cphy_1910 TetR family transcriptional regulator + Term 6257 - 6307 13.1 - Term 6203 - 6261 2.5 4 3 Tu 1 . - CDS 6291 - 7091 815 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III - Prom 7149 - 7208 8.9 + Prom 7373 - 7432 4.8 5 4 Op 1 . + CDS 7456 - 7779 194 ## gi|253578555|ref|ZP_04855827.1| conserved hypothetical protein 6 4 Op 2 . + CDS 7776 - 8051 191 ## Amet_3654 spore coat peptide assembly protein cotJB + Term 8205 - 8245 3.5 - Term 7980 - 8027 2.0 7 5 Op 1 . - CDS 8255 - 9688 1048 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) - Term 9703 - 9732 -0.3 8 5 Op 2 . - CDS 9733 - 10026 463 ## gi|253578558|ref|ZP_04855830.1| conserved hypothetical protein 9 5 Op 3 . - CDS 10100 - 10255 135 ## gi|253578560|ref|ZP_04855832.1| predicted protein - Prom 10279 - 10338 4.5 10 6 Tu 1 . 
- CDS 10365 - 10598 94 ## GALLO_0429 putative cro repressor + Prom 10671 - 10730 11.8 11 7 Op 1 . + CDS 10873 - 11256 244 ## CLL_A2790 putative phage repressor 12 7 Op 2 . + CDS 11258 - 11578 109 ## EUBREC_3101 hypothetical protein 13 7 Op 3 . + CDS 11578 - 11805 219 ## EUBREC_3102 hypothetical protein 14 7 Op 4 . + CDS 11847 - 12314 327 ## CD2951 phage protein + Prom 12336 - 12395 2.8 15 8 Tu 1 . + CDS 12453 - 13283 464 ## gi|253578566|ref|ZP_04855838.1| predicted protein + Term 13477 - 13511 0.6 16 9 Op 1 1/0.000 + CDS 13713 - 14255 172 ## PROTEIN SUPPORTED gi|163783284|ref|ZP_02178277.1| 50S ribosomal protein L16 + Term 14345 - 14388 4.0 + Prom 14343 - 14402 9.2 17 9 Op 2 . + CDS 14425 - 15840 902 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Prom 15843 - 15902 6.7 18 9 Op 3 . + CDS 15940 - 16305 301 ## COG3546 Mn-containing catalase - Term 16320 - 16372 8.5 19 10 Op 1 . - CDS 16374 - 17942 758 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Term 17966 - 18012 0.5 20 10 Op 2 . - CDS 18024 - 18167 76 ## gi|240145600|ref|ZP_04744201.1| conserved hypothetical protein - Prom 18190 - 18249 3.9 21 11 Op 1 . - CDS 18551 - 18790 308 ## CD3370A putative conjugative transposon excisionase 22 11 Op 2 . - CDS 18792 - 19211 386 ## CD3371 conjugative transposon protein - Prom 19290 - 19349 3.6 23 12 Tu 1 . + CDS 19774 - 20130 444 ## CD3330 putative transposon-related DNA-binding protein + Term 20295 - 20339 -0.9 - Term 20177 - 20233 3.6 24 13 Op 1 . - CDS 20432 - 20944 287 ## gi|240145605|ref|ZP_04744206.1| hypothetical protein ROSINTL182_07497 25 13 Op 2 40/0.000 - CDS 20995 - 22329 217 ## COG0642 Signal transduction histidine kinase 26 13 Op 3 . - CDS 22317 - 22991 331 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 23073 - 23132 9.9 27 14 Op 1 . - CDS 23208 - 24110 873 ## CDR20291_3467 conjugative transposon protein 28 14 Op 2 . 
- CDS 24125 - 25000 172 ## PROTEIN SUPPORTED gi|145223395|ref|YP_001134073.1| NLP/P60 protein 29 15 Op 1 . - CDS 25129 - 27276 1040 ## CD0373 conjugative transposon protein 30 15 Op 2 . - CDS 27273 - 29726 1749 ## CD0374 conjugative transposon protein 31 15 Op 3 . - CDS 29713 - 30102 421 ## CD0375 conjugative transposon protein 32 15 Op 4 . - CDS 30038 - 30253 71 ## gi|253578584|ref|ZP_04855856.1| conserved hypothetical protein 33 15 Op 5 . - CDS 30216 - 30719 626 ## CDR20291_3463 conjugative tranposon protein 34 15 Op 6 . - CDS 30736 - 31131 176 ## EF2335 hypothetical protein 35 15 Op 7 . - CDS 31147 - 31638 339 ## CDR20291_3462 conjugative transposon protein 36 15 Op 8 . - CDS 31644 - 31865 325 ## CD3387A conjugative transposon protein 37 15 Op 9 . - CDS 31862 - 31996 141 ## 38 15 Op 10 . - CDS 32000 - 33232 1013 ## COG2946 Putative phage replication protein RstA - Prom 33254 - 33313 3.4 39 16 Tu 1 . - CDS 33453 - 33866 65 ## gi|253578591|ref|ZP_04855863.1| conserved hypothetical protein - Prom 34007 - 34066 6.7 + Prom 33895 - 33954 7.2 40 17 Tu 1 . + CDS 33974 - 34174 232 ## gi|166033009|ref|ZP_02235838.1| hypothetical protein DORFOR_02730 - Term 33902 - 33937 -0.9 41 18 Tu 1 . - CDS 34166 - 35557 926 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins - Prom 35594 - 35653 6.1 - Term 35575 - 35611 4.2 42 19 Op 1 . - CDS 35661 - 36056 320 ## gi|253578594|ref|ZP_04855866.1| conserved hypothetical protein 43 19 Op 2 . - CDS 36123 - 36500 140 ## CPF_0994 hypothetical protein - Prom 36532 - 36591 6.8 44 20 Op 1 . - CDS 36594 - 36977 464 ## CD0384 conjugative transposon protein 45 20 Op 2 . - CDS 36993 - 37316 312 ## CDR20291_3454 conjugative transposon protein 46 21 Op 1 . - CDS 37493 - 40546 2634 ## CD3392 putative collagen-binding surface protein - Term 40571 - 40606 3.5 47 21 Op 2 . - CDS 40611 - 41066 147 ## gi|253578599|ref|ZP_04855871.1| predicted protein - Prom 41149 - 41208 3.8 48 22 Tu 1 . 
- CDS 41216 - 41488 145 ## - Prom 41512 - 41571 7.6 + Prom 41967 - 42026 7.5 49 23 Op 1 17/0.000 + CDS 42143 - 43501 1379 ## COG0569 K+ transport systems, NAD-binding component 50 23 Op 2 . + CDS 43514 - 44953 1283 ## COG0168 Trk-type K+ transport systems, membrane components + Term 45030 - 45071 1.0 51 24 Tu 1 . - CDS 45129 - 46163 834 ## COG1313 Uncharacterized Fe-S protein PflX, homolog of pyruvate formate lyase activating proteins 52 25 Op 1 . - CDS 46272 - 46982 628 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 53 25 Op 2 . - CDS 47000 - 47857 186 ## Cphy_2470 peptidase U4 sporulation factor SpoIIGA - Prom 47910 - 47969 5.8 + Prom 47996 - 48055 8.0 54 26 Tu 1 . + CDS 48153 - 49025 700 ## COG0169 Shikimate 5-dehydrogenase - Term 48862 - 48905 4.0 55 27 Op 1 . - CDS 49041 - 49574 458 ## COG1434 Uncharacterized conserved protein 56 27 Op 2 . - CDS 49513 - 50028 621 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) - Prom 50107 - 50166 7.1 Predicted protein(s) >gi|226332982|gb|ACII01000037.1| GENE 1 62 - 3076 3320 1004 aa, chain - ## HITS:1 COG:DR0075 KEGG:ns NR:ns ## COG: DR0075 COG1511 # Protein_GI_number: 15805116 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Deinococcus radiodurans # 267 937 193 819 1467 175 27.0 4e-43 MNRNRKKAAAATLAAGMAVVLGAGTVMAAASTGDNGLQKEETVYVNTTAGGEVTDVTVSD WLKNSGDSSNSEVKDASDLQDIKNVKGDETFTQNGNDLTWSTDGKDIYYQGKTEKDLPVS VGIKYYMDGTEVSPEALAGKTGHLKMEVTYTNTSKTTKTVNGKKTDIYSPFVMVTGMILS TDNFSNVTVDNGKVISDGSRNVVVGFGMPGMKESLDMSSDIADEVNIPEGFTVEADVTDC EMNSTFTVALTDIFKDIDLNDVDGLDELKDSMKDLTDAAVKLVDGTKDLYDGTNKLNDKY KEFYDGIGTLKSGVSDLNDGAKELDDGAKELSSGAQTLNTGAQTLNSGVSSYTAGADTLS DGIKQYTAGVDTLDSKMAEYEAGMKSLSDGVNAYVAGGNQLAGGVQSYVTGTTTLTDGMQ QYIAGVNQLAGGVGQYAGGIKSLVDGIGQVVPGIDTAVNGIQTAVEKMTALSSKIDANQI ANFSDSMTAYTTGVSDYTSAVGQYIESVEQLASGLSQLESSLENMGASTASEIEESANAM SGQLNDSAETMNSQADAIESWISDSAEASVDASSVDAGVAAVSSQLEAAESALYNIDTSN MDDETAAAVEAAIDGALGSIQAAEDSASNIDTSTITASAPDTSALQSAADALRASASSAE 
STADAQEEAVNSSVEQMKQQLQALTQGMKTVQETVSQLNAGAKELTSNDGTLKASGTQLA QSGTQIATAVNQMSTLFTGLGQQLPEMLQGIDQLKAGSDQLTTANKSLQSGYESLAKAST EQMISGADTLLSSGKTLTDGIAQLQGENNVNAQTLLKGAKTLKSARKSLTSGMSQINSAT GQITSGVETLASNSGTLYSGADQLSSNSSTLNSGASQLASGTQTLVSGAQTLASGTGSLV SGTQTLMDGADKLSDGSDQLKDGISDLNDGAKELNKGQKKFYKEGIKKLDDTVNGDLQDI LDRFEVIQSDDVMYTTFTDRSGDMDGNVKFIFESDPIEISDDEK >gi|226332982|gb|ACII01000037.1| GENE 2 3135 - 5225 2081 696 aa, chain - ## HITS:1 COG:BH0720 KEGG:ns NR:ns ## COG: BH0720 COG1033 # Protein_GI_number: 15613283 # Func_class: R General function prediction only # Function: Predicted exporters of the RND superfamily # Organism: Bacillus halodurans # 1 682 1 676 687 296 28.0 1e-79 MIKVGKWIAKHKILIIIIGIALLIPSYLGMAATRINYDVLSYLPDTLETVDGQDIMVDQF GMGAFSMIVVEDMELKDVAALKEKIEAVDHVKKVLWYDSVLDVSVPVEMLPKDLREAFFN GDATMMIALFDNTTSSETSMEAVTQIRKITSKNCFVSGMTGIVTDIKQIALKEMPIYVVI ASCLSLIVLLLAMDSLVVPVLFLLCIGVAVLYNLGSNIFLGQISYITQALTAVLQLGVTM DYSIFLLNSYEENKKRFEGDKERAMAHAINATFKSVIGSSITTVAGFIALCFMTFSLGRD MGIVMAKGVVIGVLCCVTLLPALVLTFDGAIEKTTHKPLLPSFDKASAFITKHYKVWLII FLVMLFPAIHGNNNLKNYYNIDKSLPSTLDSNVANAKLQETFDMNNVHMVLLKKGLTSKE KTEMLDQIKDVDGVKWALGMDSFVGAAFPDTMIPSNLKKMLESDEYEMEFICSKYKTATD EVNNQIDEIQKIVKGYSPESMVIGEAPLTKDLQDVTDVDLKIVNTISILAIFIIIMVVFK SISLPFILVAVIEFAIFVNMSVPYYQGKEVVFVASIVIGCIQLGATVDYAILMTSRYQKE RGLGKSKKESIAIAHKTSMKSIVISGCSFFAATFGVGLYTQVDMIGSITNLLSRGAVIST LVVLFILPAMFMIFDSLICHTSIGFAKKKPAKEKKQ >gi|226332982|gb|ACII01000037.1| GENE 3 5554 - 6213 599 219 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1910 NR:ns ## KEGG: Cphy_1910 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: C.phytofermentans # Pathway: not_defined # 1 195 1 195 207 200 49.0 3e-50 MGKLELNKKKKKDALFNTAFELFTTKGLTKTTISDIVDQAGVAKGTFYLYFKDKYDIRNK LVSHKTGELFFRAHEAVQNAHISEFDRQVHFIIDYILTELNKDRTLLLFISKNLAWGVFK GAFEEKMPDDEYNFYQSYLDMIAQSGLHYKNPELMLFTIIELVGSTCYSCILYQQPVSLA EYRPYLHRTISGIMETFSQDHTTCEVLSSDTKTHVDHTA >gi|226332982|gb|ACII01000037.1| GENE 4 6291 - 7091 815 266 aa, chain - ## HITS:1 COG:CAC1330 KEGG:ns NR:ns 
## COG: CAC1330 COG1234 # Protein_GI_number: 15894609 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 3 266 4 267 268 278 51.0 5e-75 MKLTILGTGNAKVTKCYNTCFALNEDKEYFMIDGGGGNTILKQLEDAGISWKEIRTIFVT HKHIDHLLGIIWMLRMYCQGMARGQFGGEVTIYGHDEVISLLKQMAEMLLTPKETKFIGD KVHLEEVKDGEDRTILGHRVTFFDILSTKAKQFGFCMEYAEGKKLTCCGDEPYNECEQKY AKNSTWLLHEAFCLFSQADKFKPYEKHHSTVKDACELAQKLEAENLILYHTEDKNISRRK ELYTEEGRQYYKGNLWIPDDLEIYKL >gi|226332982|gb|ACII01000037.1| GENE 5 7456 - 7779 194 107 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578555|ref|ZP_04855827.1| ## NR: gi|253578555|ref|ZP_04855827.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 57 107 1 51 51 104 100.0 2e-21 MENYRMGCSGNRPYNRTCGMNMPQPSNRNMNSSECSVRNTTDCSCTIPGVKEKKHKMFSH LQYLEPAMAYVPCQKFTENFPLQYALNVGTIFPQLCKPFCGKRGIRR >gi|226332982|gb|ACII01000037.1| GENE 6 7776 - 8051 191 91 aa, chain + ## HITS:1 COG:no KEGG:Amet_3654 NR:ns ## KEGG: Amet_3654 # Name: not_defined # Def: spore coat peptide assembly protein cotJB # Organism: A.metalliredigens # Pathway: not_defined # 7 87 8 87 88 65 43.0 6e-10 MKTDCSQKQLLNRIDQVSFAVNDMTLYLDTHPCDEKALTYCHELVQERKKLLKEYAEAYG PLIIDITDQTGESIWKWMEQPFPWEKEGACR >gi|226332982|gb|ACII01000037.1| GENE 7 8255 - 9688 1048 477 aa, chain - ## HITS:1 COG:ECs2905 KEGG:ns NR:ns ## COG: ECs2905 COG3757 # Protein_GI_number: 15832159 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli O157:H7 # 3 193 70 257 275 92 31.0 2e-18 MEIRGIDVSAWQGKIDWKTVADYGMGFAILRITEAGNVIDSYFEQNFSECRKYNIPVGAY KYSYAMTVAEIQSEARKVVEVLNGRKLQYPVWLDLEWNNQRSLGAEQIHKLAEAFEKIIT AAGYKFGIYCNVDWYLNVICSHLKKYDFWIARYPASDNGTLQERLRPDFGVGWQYSSKAK IPGISGTVDRNIFYKDYNEAKDIKKENAVMTKSEAINVVLGIAKEEIGYLEKKNNSKLDS KTGNAGSANYTKYWRDIKPSYQGQPWCAAFISWCFMKAFGLDNAKKLLKHWPYVYCPTLG VLFVKNANPKVGDIVIFKHGGTFTHTGFVTKVAGDRFWTIEGNTSGASGIVANGGGVCQK 
SYYNSNLPGTKFCTPDYSIVLSADKDETDKTTNPEGGSYMFNPETVKAGDKNTSVLLLQE ILRARGFKGKNGKALKLTWTADVNTIYALKAYQESRKEVLEVDGICGSATWKDLLAI >gi|226332982|gb|ACII01000037.1| GENE 8 9733 - 10026 463 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578558|ref|ZP_04855830.1| ## NR: gi|253578558|ref|ZP_04855830.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 97 3 99 99 156 100.0 4e-37 MEQITNYVKPELIVVAIALYFVGMALKQAQAVKDKYIPLILGGISIAICAIYVFATCTCG TGQDIAMAIFTAITQGILIAGLSTYVNQIVKQANKDE >gi|226332982|gb|ACII01000037.1| GENE 9 10100 - 10255 135 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578560|ref|ZP_04855832.1| ## NR: gi|253578560|ref|ZP_04855832.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 51 4 54 54 80 100.0 3e-14 MSELKLVTRNIRINGIQHKASDMSEEEIKCLLIQRQDIILLNMNYERKAAG >gi|226332982|gb|ACII01000037.1| GENE 10 10365 - 10598 94 77 aa, chain - ## HITS:1 COG:no KEGG:GALLO_0429 NR:ns ## KEGG: GALLO_0429 # Name: not_defined # Def: putative cro repressor # Organism: S.gallolyticus # Pathway: not_defined # 1 64 1 64 74 72 50.0 4e-12 MAFDYNKLRGRIVEIFNTQSNFASAMGWSERILALKMNGMCSWKQIDICKAIQLLKLTIE DIPIVCTRVGHYSVLTG >gi|226332982|gb|ACII01000037.1| GENE 11 10873 - 11256 244 127 aa, chain + ## HITS:1 COG:no KEGG:CLL_A2790 NR:ns ## KEGG: CLL_A2790 # Name: not_defined # Def: putative phage repressor # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 1 72 39 106 127 65 47.0 7e-10 MHKSDISQYCSGKTEPNQEKLFILGNALNVSEAWLMGFDVPMERTPYKAESVQNSSVSAQ CKEIIEICNQLSPHNQRKVLAYSKNLLSAQQMEEDLLAAHARTDVEQTPEGVQHDLDIMN DDSKWEE >gi|226332982|gb|ACII01000037.1| GENE 12 11258 - 11578 109 106 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3101 NR:ns ## KEGG: EUBREC_3101 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 105 1 105 105 144 65.0 7e-34 MALDILELRKLCIPKNIRITLHAAKRLEQRRIFLKDVIACIMNGEIIEQYPDDYPYPSCL ILGMSIEDKYLHVVIGNHESDLFLITAYFPSFDKWESDFKTRKENA >gi|226332982|gb|ACII01000037.1| GENE 13 11578 - 11805 219 75 aa, chain + ## HITS:1 COG:no 
KEGG:EUBREC_3102 NR:ns ## KEGG: EUBREC_3102 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 75 1 75 75 93 64.0 2e-18 MTCFYCKGNIESSTTTYMTDYQGCYIIIKNVPCEKCSQCGEEYLNGETLERIEEIIQKVK GMLTEIAVVDYKQTA >gi|226332982|gb|ACII01000037.1| GENE 14 11847 - 12314 327 155 aa, chain + ## HITS:1 COG:no KEGG:CD2951 NR:ns ## KEGG: CD2951 # Name: not_defined # Def: phage protein # Organism: C.difficile # Pathway: not_defined # 2 144 3 147 157 153 54.0 1e-36 MNYEQLLTAADQEGLLVKEQPLTEHDGLIRGSHIAIRKDIETQAEKSCVLAEEIGHYRTS SGNILDQNKVESRKQEYRARLYGYNLKIGLTGLISAYEAGCGNLYEMAEYLNATEEYLKE AIQCYHSKYGVYAVVDNYVIYFEPFAVIHMISSAD >gi|226332982|gb|ACII01000037.1| GENE 15 12453 - 13283 464 276 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578566|ref|ZP_04855838.1| ## NR: gi|253578566|ref|ZP_04855838.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 276 1 276 276 516 100.0 1e-145 MKKTLLKYFTVALITISTISMPLTVKAAQKSNISIRPNVAYSKYDITGDGKADKIRINFK SESYLNIEVNGKKSFSLNAQNIYLVNADLYTLNGNKHFLKLKCQDIDNDHIDYDKLLTYK SGKLVSAVNLMSHRKGAFNARHNSFTQKVGANYIQIRMQSMPGGVGSIQYTITYKLSGSS LKLSKTTYPVTYSKSYNPLLGGQNMWKCAKSLNIKNAPNGNIIYTTDAYEVCTVNKIKYS GGSAYIYIRAEDADISGWVRCPNSYTSRFFEESLFI >gi|226332982|gb|ACII01000037.1| GENE 16 13713 - 14255 172 180 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163783284|ref|ZP_02178277.1| 50S ribosomal protein L16 [Hydrogenivirga sp. 
128-5-R1-1] # 6 180 5 174 185 70 29 1e-11 MPLPKERIYTIDDIYALPDGERAELIDGQIYMMAPPNTRHQVIVGELYATIRNYIKSKNG SCKPYVSPFAVFLNEDNKNYVEPDLTVVCSPDKVDEKGCHGAPDWVIEVVSPATQSKDYG IKLFKYRMSGVREYWIINPMKGIVNVYDFENESGTGLYSFDDEILVCIYPDLSIVISELL >gi|226332982|gb|ACII01000037.1| GENE 17 14425 - 15840 902 471 aa, chain + ## HITS:1 COG:SA0057 KEGG:ns NR:ns ## COG: SA0057 COG1961 # Protein_GI_number: 15925764 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 1 436 4 473 542 100 24.0 6e-21 MSSKVACLYIRVSTEDQTELSPDAQKRLLLDYAQKNDMIVSGDFIFTESVSGRHAQKRPE FQKMIALAKQPSHPIDVILVWKFSRFARNQEESIVYKSMLKKDNVDVISVSEPLIEGPFG SLIERIIEWMDEYYSIRLSGEVLRGMKEKALQKGYQTSPCLGYTAVGHGKPYIINEAEYA IVSYIMDLYDNQNLDETAIARRCNDLGYRTKRGKLFERRSVDRILGNPFYCGTVVWNGVE FEGNHEVRLSRERYEKRQKLITSRKRPVKARNVSACKHWLSGLLKCSVCGATLSYTGNNK CPYFQCWKYAKGFHKTSVALSVKKAEEAVISYFDHILDGAEFTYVCKKKKTDHSLQIEQL QKEISKLTMRESRIKEAYEAGVDTLEEYKNNKDRLVSDRLELTAALSQLLQEEQAEQPDT EEILKEIRSVADVLKNPDVGYEEKGNLIRSVVEQIIYDKESGKMSFDIIIS >gi|226332982|gb|ACII01000037.1| GENE 18 15940 - 16305 301 121 aa, chain + ## HITS:1 COG:CAC1338 KEGG:ns NR:ns ## COG: CAC1338 COG3546 # Protein_GI_number: 15894617 # Func_class: P Inorganic ion transport and metabolism # Function: Mn-containing catalase # Organism: Clostridium acetobutylicum # 7 117 26 136 200 144 60.0 6e-35 MRDYYRVLHSGGPDGEIGASLRYLSQRFTMPNRTTSALLNDIGTEELSHLEMVSTIVHQL TRDLSMEEIEKSGFGPYYIDHTVGVWPQAAGGVPFNACEFQSKGDPITDLFEDLAAEGAI V >gi|226332982|gb|ACII01000037.1| GENE 19 16374 - 17942 758 522 aa, chain - ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 9 304 10 300 301 177 36.0 5e-44 MIESKSRVAIYCRLSEEDRNKQSETDDSNSIQNQKSMLLQYSLEHGWEVYNIYSDDDYTG SDRRRPEFNRLLEDAKNRKFDIVLCKTQSRFTRELELVEKYIHGLFPIWGIRFISIVDNA 
DTANKGNKKSRQINGLVNEWYLEDMSENIKSVLTDRRKNGHHIGAFALYGYKKDPDVKGH LIIDEEAAEVVREVFTLFSQGYGKTAIARMLNDRGIPNPTEYKRLHGLRYKQPKTKNSTL WKYFAISDMLVNEIYIGNMVQGKYGSVSYKTKQNKPRPKDEWYRVEGTHEPIIDRELWDR VQALVAQKAKPFTVGTIGLFARKARCMNCGYTMRSSKNHGKHYLQCSNRHVAKDACIGSF ISVDKLEKAVIDELNKLSAEYLDKDELEQNVQFNNDLRGQKEALETEIAAYQKKIAEYTK GIRELYLDKVKGILSELDYLDLSKDFSTQKERLEKLMIDTQKQLDVIERKMLIGDNRRQL IEQYTNLEHLDRETVEKLIDYVLVGKKDPVTKEVPIEIHWNF >gi|226332982|gb|ACII01000037.1| GENE 20 18024 - 18167 76 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|240145600|ref|ZP_04744201.1| ## NR: gi|240145600|ref|ZP_04744201.1| conserved hypothetical protein [Roseburia intestinalis L1-82] # 1 47 1 47 47 79 100.0 7e-14 MANDAKVVCKNVFKNCDKAAFTKAFTLKWIELINQYEKNKGRATPAR >gi|226332982|gb|ACII01000037.1| GENE 21 18551 - 18790 308 79 aa, chain - ## HITS:1 COG:no KEGG:CD3370A NR:ns ## KEGG: CD3370A # Name: not_defined # Def: putative conjugative transposon excisionase # Organism: C.difficile # Pathway: not_defined # 1 79 1 79 79 142 92.0 5e-33 MMKSTKKCPLFSTISLAADGDEVAIEKILNHYDAYISKASLRPFYDEHGNMYIVVDMELK GRIRAALIKAILGFEVRVK >gi|226332982|gb|ACII01000037.1| GENE 22 18792 - 19211 386 139 aa, chain - ## HITS:1 COG:no KEGG:CD3371 NR:ns ## KEGG: CD3371 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 138 1 138 138 188 71.0 4e-47 MKPSSFENAIRLQFDCLARKVIGRTVKNYNKELARRAKHEISFCEIPELELNQLGVSDEY SLEFTSFDVFGTEVRVYDEKLCEAIKKLSERRRNVVLMFYFLELPDAEIAEILDISRNSV YRNRMCSLKLIRDMYEEEL >gi|226332982|gb|ACII01000037.1| GENE 23 19774 - 20130 444 118 aa, chain + ## HITS:1 COG:no KEGG:CD3330 NR:ns ## KEGG: CD3330 # Name: not_defined # Def: putative transposon-related DNA-binding protein # Organism: C.difficile # Pathway: not_defined # 1 117 1 117 118 176 81.0 3e-43 MRKTEDKYDFRAFGLAIKAARMKQGLTREQVGAKIEIDPRYLTNIENKGQHPSLQVFYDL VTLLNVSVDEIFLPTSDKVKSTRRRQLEQQLDTFDDKDMVVMESVAAGIIKSKEVGED >gi|226332982|gb|ACII01000037.1| GENE 24 20432 - 20944 287 170 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|240145605|ref|ZP_04744206.1| ## NR: gi|240145605|ref|ZP_04744206.1| hypothetical protein ROSINTL182_07497 [Roseburia intestinalis L1-82] # 1 170 1 170 170 284 100.0 1e-75 MKKIICIILITVTSILSGCNSTSDEEIISSQDFKENYEVSSYGTEEKINTLSDVELTSSE KEYSDLENAKFFLENNSNKEYHYSKAYFEIEAEQSETWYQLTQLYDPSKDNEDDAVINPT ERLSLPFDISSVYGELPSGHYRIIVSISYFESPKDWDYDTYYLACEFTLK >gi|226332982|gb|ACII01000037.1| GENE 25 20995 - 22329 217 444 aa, chain - ## HITS:1 COG:CAC0317 KEGG:ns NR:ns ## COG: CAC0317 COG0642 # Protein_GI_number: 15893609 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 153 438 202 493 498 99 25.0 1e-20 MEKVRRKNSRHYIDNMSIKNSFVFYALISLVIGIVIGVISITLVDDYRINLNYKYEDMTT RYDIPENGSFTAKYNKDQTEYTIYNASEKEVCNFVVDYQKERPVQEYIYPNHVSYIEVSP KFTKHDRIVDSALGVVNIVTIPIVLSISMILCVTIFVNRKLVRPIKLLTNAYRKVENNEL DFTLPYPYKDEMGRLCLAFEKMKNCLYQNNQKMIRQFTEQRRLNAAFSHDLRTPLTILKG HTTMLLSFIPKGLVSQQEVLDELSTVRNNVERLEKYVSAMTNLYRLEDIEIEKENINFDF LLKTLSNTTEMLCSDIEYTIKTNCNKQQTLFINLEIIIQIYENLLSNSIRYAKSMIAIDV NKQEEFLLITVSDDGCGFKSTDIERVTLPFYKPSQDTTSEHLGLGLNICKILCERHGGTI KISNNHKGGACVTACIKIAYVDEK >gi|226332982|gb|ACII01000037.1| GENE 26 22317 - 22991 331 224 aa, chain - ## HITS:1 COG:BH1944 KEGG:ns NR:ns ## COG: BH1944 COG0745 # Protein_GI_number: 15614507 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 3 223 5 226 229 174 42.0 9e-44 MAQSILIVDDERDIVSMLNQYFCKIGYVVYTAINGKEALNAITKHPDIILLDINMPDMNG FTICEKIRNFVSCPIIFLTARIEDCDKIKGFAVGGDDYVVKPFSVDELEARVAAHLRREK RHGSSAIVQFDNNIVIDYSSRVVFYRNAEINFTKKEFDIIEFLSQNKGIIFDRETIYEKV WGLDGSGDNSVITEHIRRIRTKFLSIGDNPYIETVWGCGYKWKK >gi|226332982|gb|ACII01000037.1| GENE 27 23208 - 24110 873 300 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3467 NR:ns ## KEGG: CDR20291_3467 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile_R20291 
# Pathway: not_defined # 1 299 1 296 297 406 74.0 1e-112 MFKKKEKTEKAPKNKKVRTMKVGTHKKSVLLLWAVLLASTSFGVYKNFTAIDTHTVHEKE IIQLRLNDTNGIENFVKNFAKAYYSWDTSKEAIEARTTEISKYLTKELQDLNADTIRTDI PTSVTVTNVLVWNVEQSGMNDFTVAYEVDQQVKEGEQTQAVTENYTVTVHVDKDGAMVIT QNPTLAPAVQKSKYEPKVQEADVSVSSDTVKDATAFLETFFKLYPTATEKELAYYVKDGV LAPVSGDYVFSELVNPVFTKDGDNLKVSVSVKYLDNKSKMTQISQYELMLHKDDNWKIVE >gi|226332982|gb|ACII01000037.1| GENE 28 24125 - 25000 172 291 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145223395|ref|YP_001134073.1| NLP/P60 protein [Mycobacterium gilvum PYR-GCK] # 173 277 240 340 348 70 37 1e-11 MNLSADVLKHQPMVEKYARENGISEYVNVLLAIIQVESGGTATDVMQSSESLGLPPNSLS TEESIKQGCKYFASLLSSCKAKGMTDINVVIQSYNYGGGYADYVAKNGKKHSFNLAENFA KNKSGGTKVTYTNPIAVSKNGGWRYNYGNMFYVELVNQYLTVKQFSNETVQAVMNEALKY QGWKYVYGGSNPNTSFDCSGLTQWCYGKAGISLPRTAQAQYDATQHIPLSQAQAGDLVFF HSTYNTSDYVTHVGIYVGNNQMYHAGNPIGYTDLTSAYWQQHIICVGRIKQ >gi|226332982|gb|ACII01000037.1| GENE 29 25129 - 27276 1040 715 aa, chain - ## HITS:1 COG:no KEGG:CD0373 NR:ns ## KEGG: CD0373 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 4 715 9 735 736 978 71.0 0 MMKSITREKVFHVLKLVLLAVTVTLVLLSLLGTVAHASGLVDDTVNADNLYSKYPLSNYQ LDFYVDNSWSWLPWNWLDGIGKSVQYGLYCITNFVWTISLYLSNATGYVVQQAYKLDFIN DMADSIGKSIQTLAGVTEHGFSSSGFYVGFLLIIILIVGVYTAYTGLLKRETSKALHAVI NFVVVFIVSASFIAYAPNYIQKINDFSSDISTASLDLGTKIMLPDSQSKGKDSVDLIRDT LFAIQVEKPWLLLQFGNSDTEEIGADRVEALVSASPSDEDGETRENVVKTEIEDNDNDNL TIPQVVNRLGMVFFLLIFNLGITIFIFLLTGMMLFSQILFIIYAMFLPVSFLLSMIPTYE NMAKQAVVRVFNAIMTRAGITLIVTVAFSISSMFYNISTDYPFFMVAFLQIICFAGIYMK LGELMSMFSLNANDSQQIGRRIFRRPMVFMRHRARRMERRIARAVGTGSMVGAGAGAIAG SAYNHSRSTHKNTPARPQRNNEVSMGSRVGFAVGAVMDTKNKVRDSASSLKENVKDLPTQ AGYAVHSAKQKAKDNVSDFKRGVVEERENRQEQRTQKRNLHRENISQKKQELQKAQEARQ TVHTNGSATAGATRSHERPVATPVPKPAQTDIVTKPDMKRPATSPVIKNAEIKAGKETVR TNIRQEQQVKSVARTNQPNVAESRSNRKKTTVQKQVNQKQNRKSVTKQPEKGRKK >gi|226332982|gb|ACII01000037.1| GENE 30 27273 - 29726 1749 817 aa, chain - ## HITS:1 COG:no KEGG:CD0374 NR:ns ## KEGG: CD0374 # Name: 
not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 811 1 811 816 1444 91.0 0 MFPIKYIDNNLVWNKDNEVFAYYELIPYNYSFLSPEQKYLVHDSFRQLIAQSREGKIHAL QIATESSIRSIQEQSKKLVTGKLREVAIQKIDDQTEALVSMIGDNQVDYRFFLGFKLMVT EDEVNLKNIKKSVFLTFREFLNEVRHTLMNDFVSMSNDEINRYVKMEKLLENKISRRFKI RRLEAKDFAYLMEHLYGRDGIAYEDYVYPLPKRKLKRETLIKYYDLIRPTRCVVEESQRY LRLEHEDSESYVSYFTVNAIVGELDFPSSEIFYFQQQQFTFPVDTSMNVEIVGNKKALTT VRNKKKELKDLDNHAYQAGNETSSNVVEALDSVDELETDLDQSKESMYKLSYVIRVSAPD LDELKRRCDEVKDFYDDLNVKLVRPAGDMMGLHGEFLPASRRYINDYIQYVKSDFLAGLG FGATQMLGENTGIYIGYSVDTGRNVYLQPSLASQGVKGTVTNALASAFVGSLGGGKSFCN NLLVYYSVLFGGQAVILDPKSERGNWKETLPEIAEEINIVNITSDSSNQGLLDPYVIMKD VKDAESLAIDILTFLTGISSRDGEKFPVLRKAVRTVSQNQNHGLLQVIEELRKEDTAVSR NIADHIESFTDYDFAQLLFSDGSVENAISLDNQLNIIQVADLVLPDKDTTFEEYTTIELL SVSILIVISTFALDFIHSDRSIFKIVDLDEAWAFLNVAQGETLSNKLVRAGRAMNAGVYF VTQSSGDVSKESLKNNIGLKFAFRSTDTNEIKQTLEFFGLDSEDENNQKRLRDLENGQCL MQDLYGRVGVVQIHPVFVELLHAFDTRPPIKSEVDLE >gi|226332982|gb|ACII01000037.1| GENE 31 29713 - 30102 421 129 aa, chain - ## HITS:1 COG:no KEGG:CD0375 NR:ns ## KEGG: CD0375 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 129 1 129 132 209 86.0 2e-53 MKKIRSYTSIWNVEKVLYAINDVNLPFPVTFTQITWFVLTEFIIILFADVPPLSMIEGAF LKYFGIPVALTWFMSQKTFDGKKPYSYIKTMVLYALRPKVTYAGKAVNLHKQKFEETITA VRSVTYVPD >gi|226332982|gb|ACII01000037.1| GENE 32 30038 - 30253 71 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578584|ref|ZP_04855856.1| ## NR: gi|253578584|ref|ZP_04855856.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 8 71 1 64 64 122 98.0 9e-27 MQQTLVSVRYCVKGWQISTATGATVSFSLGGSLGNPAVPVVMKMKKGAFTSEENPKLYKY LERGKGSVCHQ >gi|226332982|gb|ACII01000037.1| GENE 33 30216 - 30719 626 167 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3463 NR:ns ## KEGG: CDR20291_3463 # Name: not_defined # Def: conjugative tranposon protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 165 1 165 167 224 73.0 1e-57 MIDDMQVYIANLGKYNEGELVGDWFSFPLDEEVIAERIGLNAEYEEYAIHDTDNFPMEIS EYTSIEELNRIYEQLEELPDYLLDDLDSFISCYGSLEELVEHKDDIILYSGCETMTDLAY YLIDEEQVLGEIPSSLQNYIDYEAYGRDLDIEGTFIATNAGICEVLR >gi|226332982|gb|ACII01000037.1| GENE 34 30736 - 31131 176 131 aa, chain - ## HITS:1 COG:no KEGG:EF2335 NR:ns ## KEGG: EF2335 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 1 114 1 117 166 70 35.0 2e-11 MAEMKVFVINLDEQEKDTGCAWFTLPCNIEALKQSIGLPPDSDRYLISDYDFPFEILQDT DLDLLNNVCLAISESEIPPEDIPAIQREWFSNLQELEAGLCNITYHWNCSDMEETSEHFL CVHGRFYEYNE >gi|226332982|gb|ACII01000037.1| GENE 35 31147 - 31638 339 163 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3462 NR:ns ## KEGG: CDR20291_3462 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 163 1 164 165 109 35.0 4e-23 MEECSVLIETTKSAEDKTSRWFDLPIDYELFRDLLGVEADSKDYQITDMKLPFAGDIVRT TSVRRLNKLYFAYTDLSPEVQQAYKELISYCGGVEDLLQKSEEFLFYPECHNIMDVARYR LEHNIEFSALSEKGKKYFNLEAYAHELEEKGRYALCNNGMFKL >gi|226332982|gb|ACII01000037.1| GENE 36 31644 - 31865 325 73 aa, chain - ## HITS:1 COG:no KEGG:CD3387A NR:ns ## KEGG: CD3387A # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 73 1 73 73 102 94.0 6e-21 MNFGQNLYNWFLSNAQSLVLMAIVVIGIYLGFKREFSKLIGFLVVALVAVGLVFNASGVK DVLLQLFNKIIGA >gi|226332982|gb|ACII01000037.1| GENE 37 31862 - 31996 141 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQRLLFDFLFFSLGGTVGVVAMCILQAGRQSDRKMMELKGEKKV >gi|226332982|gb|ACII01000037.1| GENE 38 32000 - 33232 1013 410 aa, chain - ## HITS:1 COG:BS_ydcR KEGG:ns 
NR:ns ## COG: BS_ydcR COG2946 # Protein_GI_number: 16077554 # Func_class: L Replication, recombination and repair # Function: Putative phage replication protein RstA # Organism: Bacillus subtilis # 74 406 25 352 352 235 39.0 1e-61 MSQFFRKGGIALNDTEWIQDFADRRLQYGVSQTKLAVMAGISREHLSRIESSKVAVTEEM KVKLLEALEKFNPEAPLTMLFDYVRIRFPTLDIGHIIKDILQLNIQYMIHEDFGHYSYTE HYYIGDIFVYTSPDEEKGVLLELKGKGCRQFESYLLAQERSWYDFLMDALVDGGVMKRLD LAINDHTGMLDIPELTEKCRNEECVSVFRSFKSYASGELVKHEEQDKAGMGYTLYIGSLK SEVYFCVYEKSYEQYIKLGIPIEEAPIKNRFEIRLKNERAYYAVRDLLTYYDAERTAFSI INRYVRFVDKEADKKRSDWKLSVRWAWFIGENREPLKLTTKPEPYTLDRTLRWIQRQVDP TLKMLETITAKTGIDYLKEIRKSTKLTEKHYKIIEQQTTSTEDVILEKEN >gi|226332982|gb|ACII01000037.1| GENE 39 33453 - 33866 65 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578591|ref|ZP_04855863.1| ## NR: gi|253578591|ref|ZP_04855863.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 137 1 137 137 220 100.0 2e-56 MIRKRTVIIAITILLLLAGSLSYLQWGRQMDVSSSSGSNSDNHYELRVSVILNTLVVTDQ QKCAEQIFEKCRDNSFHSVRFSYDIQIPHALSVTVYKNQKDAESDNSAFSFSYRQENQID GSYNIVDNPKKFTLEME >gi|226332982|gb|ACII01000037.1| GENE 40 33974 - 34174 232 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|166033009|ref|ZP_02235838.1| ## NR: gi|166033009|ref|ZP_02235838.1| hypothetical protein DORFOR_02730 [Dorea formicigenerans ATCC 27755] # 1 66 1 66 66 102 100.0 8e-21 MTVYMDLENLLKEKNISKNKVCEACNLQRTQLNNYCKNKVTRIDFSILAKLCEYLDCTPN DILKLK >gi|226332982|gb|ACII01000037.1| GENE 41 34166 - 35557 926 463 aa, chain - ## HITS:1 COG:BS_ydcQ KEGG:ns NR:ns ## COG: BS_ydcQ COG1674 # Protein_GI_number: 16077553 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Bacillus subtilis # 30 443 31 445 480 414 48.0 1e-115 MKVWKNSKGNRIRASDKSLVYHFCIGWLLLLFVAVFLLLNLRQLLVTDWKDFNLLHAGIT WTAYNSITVLIATGVCALVVFLYYRYGYDRIKRLLHRQKLARMVLENKWYEAENTKDSGF FTDLQSRSREKIVWFPKIYYQMDNGLLHILCEITMGKYQEQLLSLEDKLESGLYCELTDK 
TLHDGYIEYTLLYDMIANRISIDEVIAENGGLRLMKNLVWEYDSLPHALICGGTGGGKTY FLLTIIEALLRTNADLYILDPKNADLADLGTVMGNVYHTKDDMIECVNTFYEGMVTRSEE MKLHPNYRTGENYAYLGLAPQFLIFDEYVAFLEMLTTKESTALLSQLKKIVMLGRQAGYF LIVACQRPDAKYFGDGIRDNFNFRVGLGRMSELGYGMLFGSDVKKQFFQKRIKGRGYCDV GTSVISEFYTPLVPKGYDFLGTIGELAERRTESKALEQSEALL >gi|226332982|gb|ACII01000037.1| GENE 42 35661 - 36056 320 131 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578594|ref|ZP_04855866.1| ## NR: gi|253578594|ref|ZP_04855866.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 131 1 131 131 202 100.0 6e-51 MNFTDEKDYIMRMIKEMVRVLFSLAFGKKYVSVELEKENKYEVSGKNLKIFLNMIDLGQI NEAENILLDSIDYTNKNEVMAVALFYQYLSEKDNQFLENNNYTKEEVLSGFKQLLMKSGY SDLLYLLKYDE >gi|226332982|gb|ACII01000037.1| GENE 43 36123 - 36500 140 125 aa, chain - ## HITS:1 COG:no KEGG:CPF_0994 NR:ns ## KEGG: CPF_0994 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 1 125 1 125 127 150 60.0 1e-35 MGFWIFMLIMDLLLPFTMIGFGRYFMKKAPKEINSVFGYRTSMSMKNKDTWEFAHKYCGK VWYVCGMVMLPITVIFMLLVIGKNEDCVGSIGGIICGVQLIPLIGSILPTEIALKKNFDK NGTRR >gi|226332982|gb|ACII01000037.1| GENE 44 36594 - 36977 464 127 aa, chain - ## HITS:1 COG:no KEGG:CD0384 NR:ns ## KEGG: CD0384 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 127 3 129 129 186 73.0 2e-46 MRLANGIVIDKEKTFGVLKFSALRREVHVQNEDGTVSEEIKERTYDLKCNTQGRMIQVSV PATVPLKDYDYNAEVELINPVADTVANANYRGADVDWYVKADDIVLKNKGTHVGNLQNNV SQQPPKK >gi|226332982|gb|ACII01000037.1| GENE 45 36993 - 37316 312 107 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3454 NR:ns ## KEGG: CDR20291_3454 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 107 1 107 107 173 74.0 2e-42 MELRFVVPNMEKTFGNLEFAGENTTEQQRINGRMAVITRSYNLYSDVQRADDVVVTLPAK AGEKHFSPEQKVKLINPRITTDGYKIGERGFVNYILLADDMLPAESK >gi|226332982|gb|ACII01000037.1| GENE 46 37493 - 40546 2634 1017 aa, chain - ## HITS:1 COG:no KEGG:CD3392 NR:ns ## KEGG: 
CD3392 # Name: not_defined # Def: putative collagen-binding surface protein # Organism: C.difficile # Pathway: not_defined # 1 1016 1 1012 1014 1467 75.0 0 MNKLVKRLLTGTLAFATILTALPVTAVHASGNQYWTESAERVGYIEHVMNDGSIKSTFNE GHMKVEGETAYCVDINTSFKNGYKTRSDASTRMSSDQIADVALSLEYVKQYTASHTNLNY KQGYLLEQCVVWQRLSEQLGWQCDNVRASYNEISQAVQNEVYAGAKAFVKANKERYECGG YIYTGEGQDIGQFWAKLNVGNAKVKKTSSNPTVTDGNASYSFEGAIFGVYSDKGCNSQLA TLTADGNGDTKEVEVKAGTVYIKELSAPKGYKLDSTVHALNVEVGKTATLTVADTPKVTE TLIDLFKIDMETGKSTPQGTASLEGAEFTWSYYDDYYNADNLPAKATRTWTTKTVAKKDS DGTIHYVSRPADSYKVSGDSFYTQDGKNVLPLGTLTVTESKAPNGYLLDGAYMQADGSSE QIKGTYLTQISEDGELAVLSGSNQYSVSDKVIRGGVKIQKRDLETKDTKAQGSATLQYTE FNIISLNDSPVLVEGKLYNKNEIVKKIQTGIDGIASTSADLLPYGNYRLEESKAPEGYLT NGAKTIDFSITEDGKIVDLTDKSHSVYNQIKRGDIEGVKIGAGTHKRLAGVPFRITSKTT GESHIVVTDKNGQFSTASSWASHKVNTNAGKSSEDGVWFGTSEPDDSKGALLYDTYVIEE LKCDSNAGFKLIPAFEVVVSRNKVTVDLGTLTDEYEKEITIHTTATDKKTGEKMIVAGKD IKIVDKVTLDGLEAGTKYKLSGWQMLKEENAELLIDGKRVDSNYTFTADSEKMTVEITYS FNGSALGGQNLVTFEELYDISNPKEPVKVAEHKDINDDGQTVLITERIIKIHTTATDKNG KKEIEAGKDVTIVDKVTLDGLEIGTKYKLSGWQMLKEENAELLIDGKKVSNDYEFTADNE KMTVEIAFTFDGSSLGGKSLVTFEELYDMTNPDEPKKVTEHKDITDDGQTVTIKEVPEVP DTPKDNDTPDTPSTVTKTSDSPKTGDNTNIYAYLAMLGLSCVGLGGMLYFKRRRKKL >gi|226332982|gb|ACII01000037.1| GENE 47 40611 - 41066 147 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578599|ref|ZP_04855871.1| ## NR: gi|253578599|ref|ZP_04855871.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 151 1 151 151 280 100.0 2e-74 MIKEFSVYKNNNCTDCEIAIRTTFPQEIKEGILTQKGIIWFGQVYDVDFVRGDYSNTFST DMTYGSEDFYYEVWLQIKEYGRQREEFGLSPEQELGLDVLDYEFENFRKFTSPLVEFIKG INYQNKYTAIFETLSLLQDEIFKNIDDVTIL >gi|226332982|gb|ACII01000037.1| GENE 48 41216 - 41488 145 90 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDNLFQFLDKILPLISTLLGAYITYFVTVSSKKSELKVNAQTKARDEYWIPCSIAISNLQ KKIVELNKKKKIVMLPFKVKIVVSRNFRNY >gi|226332982|gb|ACII01000037.1| GENE 49 42143 - 43501 1379 452 aa, chain + ## HITS:1 COG:FN0242 KEGG:ns NR:ns ## COG: FN0242 COG0569 # Protein_GI_number: 19703587 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Fusobacterium nucleatum # 1 452 1 452 452 292 37.0 8e-79 MQIIIVGCGKVGRTLAEQLQEEESDITLIDVSSNVITSLQNDIDAMGIVGNGASINTLVE AGIENADILIAVTGSDEMNLLCCLIAQKTGHCQTIARVRNPIYAKEISFIKKRLGVTMII NPELAAAQEISRLLRFPSAIKIDTFARGRVEMLKFKVLPEFNLDGMTISRITEALKCDVL FCAVESRDHVSIPGGNQVIHDGDMVSILASPVNAAAFFKKIGLKTNQVKNAIIVGGGTIS YYLTKALLDMNISVKIIEQNESRCETLSDLLPEATIINGDGTNRSLLMEEGLSRTEAFVS LTNMDEENVFLSLFAKTVSNAKLVAKVNRLAFDDVIDNLDIGSVIYPKYITADYILQYVR AMQNSIGSNIETLYHILDNQAEALEFAIRENSPVTGIPLSKLNLKKNLLVGYLNRNGQVK IPRGQDTIQVGDTVIIVTSQKGLRDITDILEK >gi|226332982|gb|ACII01000037.1| GENE 50 43514 - 44953 1283 479 aa, chain + ## HITS:1 COG:FN0993 KEGG:ns NR:ns ## COG: FN0993 COG0168 # Protein_GI_number: 19704328 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 1 478 1 481 483 388 48.0 1e-107 MNYSIIIYIIGMILEIEAVFMALPAITALIYQETSGVAFLITIALCLVIGLPLTRKKPTR KAFYTKEGFVTVALSWIVLSIIGAIPFVISRSIPNPVDALFETVSGFTTTGASILSDVEA LPHCMLMWRSFTHWIGGMGVLVFILSLLPLTGGYHMNLMKAESPGPSVSKLAPKVQSTAK ILYSIYFVMTVIQILLLLVGGMPLFDSICTAFGTAGTGGFGIKNDSMASYSTYLQVVITV FMILFGVNFNAYFFIITKKFAQAFKMEEVRYYFGIIGIAILIITCNIYHIFGSAAKAFQQ AAFQVGSIITTTGYATTDFNTWPEISRTILVLLMFIGACAGSTGGGIKVSRILILCKTVR KELHIFLHPNAVKKIKMDGKAIPHEVVRSTNIFFIVYMLIFASSIFLIAFDDFNLITNFT 
AVAATFNNIGPGLELVGPTGNFGMFSWFSKLILTFDMLAGRLEIFPLLILFVRDTWKKF >gi|226332982|gb|ACII01000037.1| GENE 51 45129 - 46163 834 344 aa, chain - ## HITS:1 COG:CAC3242 KEGG:ns NR:ns ## COG: CAC3242 COG1313 # Protein_GI_number: 15896487 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S protein PflX, homolog of pyruvate formate lyase activating proteins # Organism: Clostridium acetobutylicum # 46 336 4 294 298 344 53.0 1e-94 MFAVGDSQISSRGKKKGQKRKVKVNKIQTEENKDKTLRHDTESCILADCSLCPRNCHVDR TAGKTGYCGMDQKVKIARAALHMWEEPCISGTRGSGAVFFTGCNLRCCFCQNREIAIGDS GLEITEERLAEIFLELQEKDAANINLVTGTHYIPQIIAALDCAKKHGLNIPVVYNCGGYE NTETLKLLDGYVDIYLPDYKYAESELAVKFSHANDYPERAMEAVNEMVRQTGMPQFDAEG YMKRGTIVRHLILPGHTRNSKKVLKLLHKTFGKQIYISIMNQYTPVFEQKEYTELNRCVT RREYEKVLDYAFELGIENGFFQDGETARESFIPAFDYEGVRKMP >gi|226332982|gb|ACII01000037.1| GENE 52 46272 - 46982 628 236 aa, chain - ## HITS:1 COG:BH2556 KEGG:ns NR:ns ## COG: BH2556 COG1191 # Protein_GI_number: 15615119 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus halodurans # 16 234 20 235 237 274 67.0 8e-74 MSCYPLQVVSRMVMNRSGEVFYIGGNDVLPAPLEPKREAFILQKLGTGQEEEAKSVLIEH NLRLVVYIAKKFDNTGVGVEDLISIGTIGLIKAINTFNPVKNIKLATYASRCIENEILMY LRRNSKTKMEVSIDEPLNVDWDGNELLLSDILGTDEDVISRRLEDEVEISLLSKAIRKLS PREQTIIRLRFGLGKENAAEKTQKEVADLLGISQSYISRLEKRIMKRLKKEIVKYE >gi|226332982|gb|ACII01000037.1| GENE 53 47000 - 47857 186 285 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2470 NR:ns ## KEGG: Cphy_2470 # Name: not_defined # Def: peptidase U4 sporulation factor SpoIIGA # Organism: C.phytofermentans # Pathway: not_defined # 1 285 1 281 293 77 26.0 4e-13 MHYEIYIDVVFFTNLLIDYILIRFTGILFRCGRSRKRALLGAAAGAVFSCWIIYLQSFLE SGIFLPALVLLHGGCAAGMLVIGCNLKRGSLLLKAILTLYFAAFLCGGFWEVIVSDKLTV KMFLILAAATWSGITALSYLMDSMRIRTKNIYPITIYYKGCGYPFQGFYDSGNLLMDSVN GKPVSVGTEKVLSEICSEETVSCLKHLKENPGESGKPGTAGLHPHFTFFHSIGKEQGMML AVTFERLCIQTPAEVVCIDNPVFAFSFETSAFGKEYEVLLNSRLL >gi|226332982|gb|ACII01000037.1| GENE 54 48153 - 49025 700 
290 aa, chain + ## HITS:1 COG:lin0493 KEGG:ns NR:ns ## COG: lin0493 COG0169 # Protein_GI_number: 16799568 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Listeria innocua # 4 287 9 291 291 312 52.0 4e-85 MNNITGHTGLTALLGSPVAHSISPLMHNESFRLLGLDYVYLCFDVNEETLPAAVAGLKTC GIRGFNLTMPNKNKIVELLDELSPEAQLIGAVNTVLNDNGKLIGYNTDGYGFMQSVRDAG HDITGKNITVMGVGGASMAICAQSALDGAASVHIFARRTSRYWNRTQKFAENLSAKTGCQ VCLFDNDDHTALRDSIAQSYLLINATSVGMVPDIDDTIIKDTSLFHPELVVADVVYEPQE TRLLREAKATGCQTFNGMYMLLHQGAKAFQIWTGQEMPVSVIKEKYFSVK >gi|226332982|gb|ACII01000037.1| GENE 55 49041 - 49574 458 177 aa, chain - ## HITS:1 COG:CAC0441 KEGG:ns NR:ns ## COG: CAC0441 COG1434 # Protein_GI_number: 15893732 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 25 170 103 252 259 107 39.0 1e-23 MILVLAEAARNINIAAVRMRKHPNLEYIIVLGAHVEGTRLTKALLERTRRALQYMEENPE TKAVLSGGKGDGESITEAQAMCNYLVEHGIDRERLILEEKSTSTTENLKFSLGMIGLNHS VGIVTNNFHVFRGTAIGKKCGCREIYPIPSRYRSWRLLIYIPREILAIIKDKLMGNM >gi|226332982|gb|ACII01000037.1| GENE 56 49513 - 50028 621 171 aa, chain - ## HITS:1 COG:CAC3537 KEGG:ns NR:ns ## COG: CAC3537 COG0653 # Protein_GI_number: 15896773 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Clostridium acetobutylicum # 27 170 23 166 166 151 50.0 5e-37 MSDKTILEQWRAIAYDQQADRNKLQRFWANYFNIEKGIYEQLLSNPDEVVTGTVKELAEK YGQEVLTMVGFLDGINDSLKIPNPIETMDENTKVSLCFDKELLYKNMVDARADWLYELPQ WDAIFTPEKRKELYLEQKKSGTVVKAKKIGRNDPCPCGSGKKYKYCCGKNA Prediction of potential genes in microbial genomes Time: Sat May 28 19:33:41 2011 Seq name: gi|226332981|gb|ACII01000038.1| Ruminococcus sp. 
5_1_39B_FAA cont1.38, whole genome shotgun sequence
Length of sequence - 12287 bp
Number of predicted genes - 9, with homology - 9
Number of transcription units - 4, operones - 3 average op.length - 2.7
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 49 - 876 733 ## COG1011 Predicted hydrolase (HAD superfamily)
- Term 887 - 923 7.1
2 2 Op 1 1/0.000 - CDS 970 - 2250 1595 ## COG0104 Adenylosuccinate synthase
- Prom 2333 - 2392 4.1
3 2 Op 2 . - CDS 2395 - 3828 1830 ## COG0015 Adenylosuccinate lyase
- Prom 3866 - 3925 4.6
4 3 Op 1 34/0.000 - CDS 3957 - 4643 238 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16
5 3 Op 2 31/0.000 - CDS 4710 - 5360 694 ## COG0765 ABC-type amino acid transport system, permease component
- Prom 5407 - 5466 1.6
- Term 5385 - 5429 2.2
6 3 Op 3 . - CDS 5496 - 6392 1339 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain
- Prom 6431 - 6490 6.4
- Term 6442 - 6477 -1.0
7 4 Op 1 . - CDS 6500 - 7294 806 ## EUBREC_1824 hypothetical protein
8 4 Op 2 . - CDS 7308 - 8555 948 ## EUBREC_1825 hypothetical protein
9 4 Op 3 .
- CDS 8539 - 12126 3209 ## EUBREC_1826 hypothetical protein - Prom 12183 - 12242 6.2 Predicted protein(s) >gi|226332981|gb|ACII01000038.1| GENE 1 49 - 876 733 275 aa, chain - ## HITS:1 COG:CAC3581 KEGG:ns NR:ns ## COG: CAC3581 COG1011 # Protein_GI_number: 15896815 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Clostridium acetobutylicum # 77 274 2 199 201 130 38.0 3e-30 MKKDNNKIKKIGAWIAIIILLLACCMPMIFAFGNGEDSQVYFKASLAVAIMVPIMAYAIW IVYKLLNRNKKVVDSDMENIIFDVGQVLVKYDWETYLDSFGFPKEERDKIAEVVFQSNTW NERDRSSETEQYYVDQMVKAAPEYEKDIREVMRRSDETIEKTDYAETWVRYLKDKGYHVY ILSNYATDTLERTEDKLTFLKYVDGAVFSCQVKQIKPEPEIYKTLLGRYHLDPEKSVFLD DRAENCEAARKQGIHAIQFKSFKQAAAELEKLGVN >gi|226332981|gb|ACII01000038.1| GENE 2 970 - 2250 1595 426 aa, chain - ## HITS:1 COG:SP0019 KEGG:ns NR:ns ## COG: SP0019 COG0104 # Protein_GI_number: 15899967 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Streptococcus pneumoniae TIGR4 # 5 419 6 418 428 373 44.0 1e-103 MVKAVVGANWGDEGKGKITDMLAEKADIIVRFQGGANAGHTIVNNYGKFALHTLPSGVFY DHTTSVIGNGVALNIPVFFKEYNEVVSRGVPAPKILISERAQMVMPYHILFDQYEEERLG GKSFGSTKSGIAPFYSDKYAKIGFQVSELFDEEHLKEKLASVCATKNVLLEHLYHKPLLN VDELFAELMEYKKMVEPYVCDVSLYLWNALKEGKEVLLEGQLGSLKDPDHGIYPMVTSSS TLAGYGAVGAGLPPYEIKQIVTVCKAYSSAVGAGAFVSEIFGDEADELRRRGGDGGEFGA TTGRPRRMGWFDCVASKYGCRLQGTTDVAFTVLDVLGYLDEIPVCTGYEIDGKVTTEFPT TTLLEKAKPVLETLPGWKCDIRGIKKYEDLPENCRKYVEFVEKHIGFPITMISNGPGRDD IIYRNK >gi|226332981|gb|ACII01000038.1| GENE 3 2395 - 3828 1830 477 aa, chain - ## HITS:1 COG:CAC1821 KEGG:ns NR:ns ## COG: CAC1821 COG0015 # Protein_GI_number: 15895097 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Clostridium acetobutylicum # 4 477 3 476 476 634 64.0 0 MSTDRYVSPLSERYASKEMQYIFSPDMKFRTWRRLWIALAETEKELGLNITQEQIDELKA HAEDINYDVAKERERQVRHDVMSHVYAYGVQCPKAKGIIHLGATSCYVGDNTDIIVMTEA LKLVKKKLVNVIAELSAFADKYKDQPTLAFTHFQPAQPTTVGKRATLWTQEFLLDLEDLE 
YVLGTMKLLGSKGTTGTQASFLELFDGDQETIDKIDPMIAEKMGFKECYPVSGQTYSRKV DTRVANILAGIAASAHKMSNDIRLLQHLKEVEEPFEKSQIGSSAMAYKRNPMRSERIASL SRYVMVDALNPAITSATQWFERTLDDSANKRLSIPEGFLAIDGILDLCLNVVDGLVVYPK VIEKHMMAELPFMATENIMMDAVKAGGDRQELHERIRELSMEAGKTVKVEGKDNNLLELI AADPAFNLSLEDLQRSMDPKKYIGRAKEQTERFVNTVVQPILDSHKELLGVKAEINV >gi|226332981|gb|ACII01000038.1| GENE 4 3957 - 4643 238 228 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 6 210 14 230 312 96 28 1e-19 MVLFEMKNIKKSFGSLEVLKDISLQVEEGEVLSIIGPSGSGKSTLLRCATGLETPDSGEI IKQGDVGLVFQNFNLFPHFSVLKNITDAPIRVQKRKKEEVYAQARTLLKQMGLADRENAY PFQLSGGQQQRVSIARALCMNPKILFFDEPTSALDPELTSEILKVIKDLAAEHITMVIVT HEMTFARDISDHIIFMDKGLIAVEGSPKEVFESEHARMKEFLGKFHQG >gi|226332981|gb|ACII01000038.1| GENE 5 4710 - 5360 694 216 aa, chain - ## HITS:1 COG:lin1851 KEGG:ns NR:ns ## COG: lin1851 COG0765 # Protein_GI_number: 16800918 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Listeria innocua # 6 216 7 213 213 153 43.0 3e-37 MSLSVMLQQLASGMVVSVEIFVVTLLFSLPLGLLICFGRMSKNPAIRGIVSAYISVMRGT PLMLQLMVVYFGPYFIFGIRISMGYSLIAVFIAFSINYAAYFAEIYRGGIESISAGQYEA AKILGYSKAQTFFRIILPQVIKRILPSVTNEVITLVKDTSLAFVVAVSEMFTIAKQIASA QTTMMPFVIAAVFYFVFNLLVAIVMDKIEKKLNYYR >gi|226332981|gb|ACII01000038.1| GENE 6 5496 - 6392 1339 298 aa, chain - ## HITS:1 COG:SPy1274 KEGG:ns NR:ns ## COG: SPy1274 COG0834 # Protein_GI_number: 15675229 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Streptococcus pyogenes M1 GAS # 32 254 41 266 278 126 34.0 6e-29 MKKRSLAVLMAGVMAASMMTGVVSVYADADSKTLTVGFDAQYPPFGYKDDNGEYVGFDLD LAQEVCDNLGWELVKKPINWDSKDMELNSGSIDCIWNGFTINGREDDYTWSNPYLNNEQV MVVAADSGIEKLDDLAGKNVVVQAASAALDALNSDDNKDLTASFASLTENPDYNTAFMNL DSGAADAIAVDIGVAKYQLSQREEGKYVILDEPIQSEEYGIGFKKGNDELKDTVWEEVLK 
LYDAGEVDKLAEKYEVADMLCIGDDEKSDDKKDDAAEDDASDADTAKEDTADTAKDAE >gi|226332981|gb|ACII01000038.1| GENE 7 6500 - 7294 806 264 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1824 NR:ns ## KEGG: EUBREC_1824 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 218 1 218 259 105 30.0 2e-21 MSGYILCQTKKAQHPYFIENISMNIYSIEELCYYLYHNLYLADHTVFNEELCNWLRDELE LVHLAAKLKQNLERNVSVEDMIYPVFKEINYLTYEEMKGFNSRIVTYGKEKAAVRQKRKG DALTENGMYVNAIRVYQKLLEREDLSEQRKGFAASVRYNLGCAYSYLFQMEKAQECFLEA YREAHSKDALKAYIIAYSSVHDKTDYDKVMEELEVDEELKKDIKEEIRQSLKAFESVPEE KTDEKNLDALLERLMKDYHRSTGS >gi|226332981|gb|ACII01000038.1| GENE 8 7308 - 8555 948 415 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1825 NR:ns ## KEGG: EUBREC_1825 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 9 409 10 406 412 187 27.0 6e-46 MNEIRDLIVGIDFGKEYSQICYYDRKGDEARSLSMKAGGNQYEAPCILCWRPEQKDYCVG LEADYFVREKGGVMIDDVYGICEKSDSVLVEGQELAPWEILAQFLQGMLKFLGVAELVKN TKCVAVTLPGLTEIQVENFQKAFEIIGFPHEKYMLLDYGESFYYYVLSQKRETWNRSVGW YAFTPENVSFRKMTMNGATKPVTVKLEEAVETELPAEGQERDSEFGKFVQKTLGRDLFSS IQITGDGFSQDWAEQSVRLLCYQKRKVFYGNNLFAKGACAAGKEKTENKALKGYRFLSDS LVMADVGMEMRVMGSPTYYPLIEAGNNWYDSKASCEFILDDTEELVFIVSTYHRPEKRRV GMMLPGIPLRPDKTTRLRLELQYVSSAECKVTVTDLGFGEMFPSSGKTWTETVEW >gi|226332981|gb|ACII01000038.1| GENE 9 8539 - 12126 3209 1195 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1826 NR:ns ## KEGG: EUBREC_1826 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 1173 1 1178 1181 491 26.0 1e-137 MKKTIEELLSGNFRHEHPQLLFSQDKIEVTLKAGDVYKGELYFGTEDNQKIRGYITSSNR RVVPGTEKFSGTTVRLQYGIDGAGMHPGERHEGWLCFTTDIGEYKLPFLIQTEKAELKSA AGDVPDMDTFVNIAKDDFKEAYRVFTDRRFELLLRNAGQKEKALYKGLSKQPVTFQHVEE FLIGTGKKDPVKIELKADQNSFYDISESVRETFAVQRSGWGHLRLEVEAKGDFLEVSRHV VTDEDFIGSYYQVEYVIRKEKLKKGRQFGEITVKSPYQQLTYQITASMESKVQLKTEIHV KRHKLELLRDLAEYSCKRMDSKTWIGSSRFVLNQLREEGCDYPEYQMYEAYILHMEEKDE QAREILKKFKHQNFVRENLELAGAYLYLCVLTGLHKDREQAIRRLYNFFLQKEDSFLLFK 
LWLEMNPEDKGSPSRLVFMMEELFEKGCRSPFLYLEAWNYISSDTTLLHRMSSFWTQVFL FAGEKGLLTEELVMRFAYLTGYEREFCTSMYRALAMGYDAFESDDTLEAICRYIMLGNPR KPEYFRWFSLAVEKGIRLTRIYEYYVETMDTSYQRELPKPLLMYFTYNNTSLADDKKAFI YASIVGNRGRQPQTYADYRDHMKIFAMRKAKEGRMNENYAVLYQEFLSNPKTEGQAKLLV QKLFTHRLYCDDKKIRYVIVRHEQLREEEIYPCIQGVAYPRIYTDDAVILFQDGKQRRYV STVDYNIKKLFDERELTDKVLGFHIEEPGLVLHCSEHTELKPENLQAFQRIPESEEFSDE YQKCIREKLLSYYSEHTRAEEADNYLRQMDYKKYAAVDRTALLEVLISRGMYQQAMSIVS QFGYEGIRIESQLKLTSRMLTRCEMEEDDELLALASDVYRRGKYDEVILKYLMEYRFGPV DELISVWKSAQGFEMDTYELEEKLLGLLMFTSDYRKEGEKILEDYVHHSGKERITGAYLT QTAYGAFVKEYPMSVFVRSLLERAYDEKWPVDFVCSLALLEAYSKEKKLEKKQLCNAEEI LQKCVKQGMYFAFFGKLPVSVLSPYQLDDKTFVEYHADPSAKVTLYYALDAGLGNQVKYQ TEPLRDLYEGIFAKSFTLFYGETLRYYFESVTGERVNRTQERVLTMNKIEGTPVSKYHMI NQMLSARRLDKKKEVFSQVKKYLRQEQYVKKMFVIEKEEQEQIVLKPGGTNERNS Prediction of potential genes in microbial genomes Time: Sat May 28 19:34:05 2011 Seq name: gi|226332980|gb|ACII01000039.1| Ruminococcus sp. 5_1_39B_FAA cont1.39, whole genome shotgun sequence Length of sequence - 15123 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 10, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 7.9 1 1 Tu 1 . + CDS 144 - 950 649 ## gi|253578616|ref|ZP_04855888.1| conserved hypothetical protein + Term 1051 - 1098 9.0 + Prom 1134 - 1193 4.7 2 2 Tu 1 . + CDS 1234 - 2595 1235 ## COG0534 Na+-driven multidrug efflux pump + Term 2610 - 2669 10.1 - Term 2597 - 2655 14.1 3 3 Tu 1 . - CDS 2706 - 3755 1164 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase - Prom 3801 - 3860 7.6 - Term 3882 - 3932 7.4 4 4 Op 1 10/0.000 - CDS 3939 - 5999 1474 ## COG0642 Signal transduction histidine kinase 5 4 Op 2 . - CDS 6036 - 8699 2035 ## COG0642 Signal transduction histidine kinase - Prom 8719 - 8778 5.2 - Term 8745 - 8790 1.4 6 5 Tu 1 . - CDS 8834 - 8941 59 ## - Prom 8988 - 9047 3.3 7 6 Tu 1 . 
- CDS 9081 - 9773 256 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 9815 - 9874 6.5 + Prom 9724 - 9783 6.8 8 7 Op 1 . + CDS 9828 - 10589 411 ## COG1145 Ferredoxin 9 7 Op 2 . + CDS 10604 - 12202 1492 ## COG1151 6Fe-6S prismane cluster-containing protein + Term 12237 - 12280 2.3 - Term 12225 - 12268 8.2 10 8 Tu 1 . - CDS 12270 - 13211 314 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 13232 - 13291 10.8 + Prom 13204 - 13263 8.3 11 9 Tu 1 . + CDS 13322 - 14548 684 ## BLD_1837 fucose permease + Term 14596 - 14639 2.3 - Term 14584 - 14627 3.6 12 10 Tu 1 . - CDS 14705 - 15073 236 ## EUBREC_3110 hypothetical protein Predicted protein(s) >gi|226332980|gb|ACII01000039.1| GENE 1 144 - 950 649 268 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578616|ref|ZP_04855888.1| ## NR: gi|253578616|ref|ZP_04855888.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 268 1 268 268 512 100.0 1e-144 MKKLKVFSISAILIAICCSFLFSVSVSAASKKRNKFDHKPSGNIYYYDENGHTVKGLVTI RGKKYYFNEKGIQQNGWQKIKGDYYFFQIRNGCYASMVTSRRVNGIYLTKSGEARYNSEE KRKLNLMVTANQVMRRVTKRNMSKPEKLWRCYLKAVSYGYGGTGNDYDFRYYYSNWDVSY AEDMFYRGHGDCFAFASAFAYLANAVGFEAKVISSGGHGWAEIKGEVCDPNWAKGTGHIE RYYRMSYDLSGVDGRPYYRGNRAYVITI >gi|226332980|gb|ACII01000039.1| GENE 2 1234 - 2595 1235 453 aa, chain + ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 3 437 2 433 448 280 38.0 4e-75 MQNNKNFLGTEPVGKLLLKLSIPTVIAQLINMLYNIVDRIYIGHIPGEGSLALTGVGVCM PIIMIVSAFAALVSSGGAPRASIYMGKQDNKSAENILGNCFILQIIISVILTAILLIWGR DLLLAFGASENTISYATDYMHIYAFGTLFVQLTLGMNAFITAQGFTTFSMVSVLIGAVCN IVLDPVFIFVFHMGVRGAALATVISQAISTLWVVLFLCGKKTQLHLRKKHLHLEAKVVLP CIALGLAAFIMQSSESIVTVCFNSSLLRYGGDIAVGAMTILTSVMQFAMLPLQGIAQGAQ PISSYNYGAKNADRVKKTFRLLLITCLSYSVLLWAAVQLVPRVFVSIFTADTDLIGFTAP 
MLKIYLGGLFLFGIQIACQMTFTSLGKAVNSIIVAVVRKFVLLIPLIYIMPHMISDSTTG VYMAEPIADITAVIFTSVLFTFQFKKALAEIQN >gi|226332980|gb|ACII01000039.1| GENE 3 2706 - 3755 1164 349 aa, chain - ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 4 348 6 347 355 362 53.0 1e-100 MKNFKIGDKITRVPLIQGGMGVGISLGRLAGSVAKAGGVGIISTAQIGYREEDFDRNPAA ANERAIAGEMKKAREISGDGIIGYNIMVALKEYASHVKAAVKAGADIIISGAGLPTELPE LVKGSLTKIAPIVSTEKSAKVILKYWDRKYKRTADLVVIEGPQAGGHLGFHKEELEKYTE ESYSEEIKKIITTVKSYAEKYGTEIPVIVAGGIYNREDVQKVDNLGADGIQVATRFITTE ECDADIRYKEAHLKAKESDIAIVKSPVGMPGRAIMNKFMTRVMNGEQIPHSSCHGCLVKC SPKEIPYCITDGLINAVKGNVDEGLLFCGAKAWKAERLQTVQEVINDLF >gi|226332980|gb|ACII01000039.1| GENE 4 3939 - 5999 1474 686 aa, chain - ## HITS:1 COG:PA3462_2 KEGG:ns NR:ns ## COG: PA3462_2 COG0642 # Protein_GI_number: 15598658 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 309 593 1 276 385 161 33.0 4e-39 MKINSTAAKHIKKYCYIIGLLSFLIAFFACSQMLRYQEKEKKASGTYMAQSTIRRVKAKL DNYIMISDFLGNYIIDGSDMDENTFSELAEKIPNEEGVVKAFELAPEGIITEIYPMQGNS EALGLDVLREHERKKDAVLAKETGEYTLGGPYQLKQGGTGALLFKPVYRTDNFGESSFWG FVLQVIDWDRFMSDINLKSLSEADFSYKIWSYDRSSGDKVILAQSQDDMPENSLTIECEI PNNTWYFDIAPSNGWTPVSQWILSIIASYIFSLLIAMVFYQISSKKHREKQYAAELEKAA ELAKSANEAKTRFLFNMSHDIRTPMNAIIGFSGLLEKNLQNEEKAKEYLGKICSSGNLLM TIINQILEIARIESGTTALQLKAEDINTLFHTVNTVFEEDVRKKNLQYSVDLDVYHTFIL CDRVKLQEIMLNIISNAIKYTSDGHGVHVKIYEKDSEDPRKARLIFTCEDTGIGMSEEYL PHIFEEFSREHTTTENKVAGTGLGLPIVKSMIELMGGSIRVESTQGAGTKFTVDISFDTA SEEDVYRNQISEQPDALKRMEGKRILLAEDNDLNAEIAIELLAEQKIIADRAMDGADCLD KLEKAASGYYDMILMDIQMPVMDGYDATARIRRMKDEEKASIPIIAMTANAFAEDRQKAI STGMNDHVAKPIDMKVLLPVMAKYIR >gi|226332980|gb|ACII01000039.1| GENE 5 6036 - 8699 2035 887 aa, chain - ## HITS:1 COG:VC1349_3 KEGG:ns NR:ns ## COG: VC1349_3 COG0642 # Protein_GI_number: 15641361 # 
Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 497 734 1 236 260 157 34.0 9e-38 MKKFKANDNKSEEHEINYRQFNKKLELLNQILSLISDINQASDEKEVKDCIPALLQYTGE YTDADRVYIFELMSEKQDYYSNTFEWCREGIVPQIDNLIRISADSMPVWHEKFLRGETII IHDLEKICETMPSEYDILKVQDIHSLLVLPLFANTQLSGFIGLDNPDLVNPGVSISLLSN AGGHLASTLNNLRMFRMLEEKQKTLESNLEELQKEKHILEALCVDYTSFYLCDLANDTLE TIKQSILSNGTEIDEMTEQKSSYTSRIRYYYDHYVIRESAPDFLRKMERHALIEYLQNHK RFAYRYQSVKNHAGHEYFELQVIRLETGNRDDYKIIMGFRYIDDIVQEDMKKKQQMEETM ADLKLNNEIISAISKMYRIIYRMDLIHDTYEEISSQESMHRLTGRSGKTSVQFTKAREKF IAPEFQERMREFLDVSTLAERLKNREEVSTEYRAITGVWYQARFIVKLRNEAGEVTNVLY VTRDINDQKISELESREELRRTAQEAEKANLAKTDFLRRMSHDVRTPINGIQGCVDIADR YPDDLELQQEARTRIRTASSYLLNLVNDVLDMNKLESGNIQLERKHFNLLELLHNTNEII RMQANEAGVNYYVEAGEIIHTQLIGSPVHFQQILMNIGSNAVKYNHDGGSVTVTCREISY TDTIAEYELTCTDTGIGMSEEFQKRAFEPFAQEGQSARTKYAGTGLGLAIARKLTELQEG TLFFESIQGKGTKFILHISFEISAEEPEKKSHMYQNCSVEGIQVLLVEDNELNMEIAEFL LKEEKMIVTKAWNGREAVEIFENSEPGYFDVILMDLMMPQMGGLEATRRIRKMDREDAES IPIFAMTANAFLDDIAQSKAAGMNEHFSKPLQMEKVIDTIRFYCTAR >gi|226332980|gb|ACII01000039.1| GENE 6 8834 - 8941 59 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTFAETKVSLNDIYKYTSSNGLKGSFMDTERYEYK >gi|226332980|gb|ACII01000039.1| GENE 7 9081 - 9773 256 230 aa, chain - ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 2 223 3 220 229 79 25.0 6e-15 MKKYLPILKNSPFFTGLTDNEILSILHCVKATAVSQKRDSYIFRAGDSTEVMGLVVSGCV LVIQEDLWGHRNILSKCHTGDFFGEPYAASPGAVLNVSVVADTDCEIILLNVQRLLISCP TACEHHQKLIRNLVSVLANKILILNDKITHVGKRTTRDKLLSYLSAESIRHSSLSFDIPF DRQQLADYLCIDRAAMSTEISKLQKEGFIKTNRNHFELTVCNNSDSEIKV >gi|226332980|gb|ACII01000039.1| GENE 8 9828 - 10589 411 253 aa, chain + ## HITS:1 COG:CAC0885_1 KEGG:ns NR:ns ## COG: CAC0885_1 COG1145 # Protein_GI_number: 15894172 # 
Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 15 121 1 105 115 120 57.0 2e-27 MPTETKVKSIDGGKMIRKIIRIDKEKCNGCGACADACHESAIDIINGKAELVREHFCDGL GDCLPECPTGAISFEEREAPEYDEEAVKEAQKKIFAKNQAITAHAGCPGSKSMQIQRSEI PKSGKPSSTTGQVSNLQNWPVQIKLAPVSAPYFNGAKLLIAADCTAYAYAGFHQDFIRNK VTLIGCPKLDQVDYSEKLTAIIQNNDIQSVTIVRMEVPCCGGLEMATRKALQNSGKFIPW QVITIGIDGNIIE >gi|226332980|gb|ACII01000039.1| GENE 9 10604 - 12202 1492 532 aa, chain + ## HITS:1 COG:MJ0765 KEGG:ns NR:ns ## COG: MJ0765 COG1151 # Protein_GI_number: 15668946 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Methanococcus jannaschii # 4 530 9 547 548 489 46.0 1e-138 MDKKMFCYQCEQTVGCTGCTGNAGVCGKKAGTAKLQDELTGALIGLARAVDSAAAISKST SQIIIEGLFTTVTNVSFDDAAIENMIQKVRAEKERLVPGCSKCQSPCGKSDEYDMQQLWN ADEDIRSLKSLILFGIRGMAAYAYHANVLNYEDAEVNRFFCEALFKIGYEESMDTLLPTV LKVGEINLKCMALLDKANTKTYGTPEPTAVTLTVEKGPFIVVTGHDLKDLQLLLEQTEGK GINIYTHGEMLPAHAYPLLKKFSHLKGNFGTAWQNQQKEFDHLPAPILYTTNCLMPPKSS YADRVFTTEVVAFPGAVHIDEKKNFTPVIEKALELGGYKEDQTRTGINGGTKVTTGFGHA AILSHADTVVEAVKSGAIRHFFLVAGCDGAKPGRNYYTEFVKQTPSDSIVLTLACGKFRF NDLELGEIGGLPRLMDMGQCNDAYGAIQVAVALADAFGCSVNELPLSFVLSWYEQKAVCI LLTLLHLGIKNIRLGPSLPAFLSPNILNYLVENYGIAPITTPEEDIKALMNQ >gi|226332980|gb|ACII01000039.1| GENE 10 12270 - 13211 314 313 aa, chain - ## HITS:1 COG:BH0724 KEGG:ns NR:ns ## COG: BH0724 COG2207 # Protein_GI_number: 15613287 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 49 303 43 296 298 114 28.0 3e-25 MEMDYNKIYNKILTDEEFMEIKQHGSSDYPFQYYYDNLELFDFHCIEWHWHREFEFLYVE SGEVTCGIGEKQIILSEGEAIFINSKILHRFYAPSGGIIPNFVCMPEFIAPENSLIYKKY ILPIISSNIYFQRFQTDELWQTKIIQTMIKIMEIQGNEKIRELATLALMQDLWLTFYENV KLSDKGDVQTVDEVAQKRVQLIMQYIHENYNHNLSLDEIASHIGISKSTALNLFHRFLHT TPVNYLIGYRLQAASWLLKNTNKKVKTIAYESGFHNVDYFCRLFKKRYHLTPSEYRCTCL KLLSADPSLQTHK >gi|226332980|gb|ACII01000039.1| GENE 11 13322 - 14548 684 408 aa, chain + ## 
HITS:1 COG:no KEGG:BLD_1837 NR:ns ## KEGG: BLD_1837 # Name: fucP4 # Def: fucose permease # Organism: B.longum_DJO10A # Pathway: not_defined # 8 389 2 384 402 346 47.0 9e-94 MIFKLGNSIKKADYHKTKLACYMGFITQAIAANFAPLLFLKFHSDYHISLGNIALISTFF FFTQLLVDLFCAKFVDHIGYRVCIVTSEIFSALGLLGLAFLPDFLPDPFIGIICSVIVYA IGSGLIEVLCSPIIEACPFENKEATMSLLHSFYCWGAVGTILISSLFFLIFGIDNWKWLA VIWAIIPAVNTYNFMTCPIEPLVDNGSGMGIKNLFSRPFFWVAICLMICSGASELAMAQW ASAYAEAALGLSKALGDLAGPCMFAVTMGISRIIFGKYGEQLDLMKFMSGSGILCVVCYL LAALSSSPIIGLIGCIACGFSVGIMWPGTISISSKTFPTGGTAMFSLLAMAGDLGGSIGP GIVGRITQNAGNNIRIGMGFGLIFPVILLFMLFLLYRKKSTQPQISKH >gi|226332980|gb|ACII01000039.1| GENE 12 14705 - 15073 236 122 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3110 NR:ns ## KEGG: EUBREC_3110 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 114 1 114 116 141 58.0 9e-33 MKTCCLVWEIPVIEGEHYNPIMNVIYMQKAKKFLNQLNRYFQDENMDWKCVLDRSACTYN EIFSSENQAVIFVPEAKTRQWLYKKEIQAVSIPKYYLDFAEYNEGKLDKIAIFFMNLKND IQ Prediction of potential genes in microbial genomes Time: Sat May 28 19:34:30 2011 Seq name: gi|226332979|gb|ACII01000040.1| Ruminococcus sp. 5_1_39B_FAA cont1.40, whole genome shotgun sequence Length of sequence - 5481 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 5 - 51 9.8 1 1 Tu 1 . - CDS 81 - 632 576 ## COG3760 Uncharacterized conserved protein - Prom 658 - 717 6.7 - Term 1015 - 1045 2.0 2 2 Tu 1 . - CDS 1053 - 1910 549 ## gi|253578630|ref|ZP_04855901.1| conserved hypothetical protein - Prom 1955 - 2014 5.4 - Term 2083 - 2125 9.3 3 3 Op 1 . - CDS 2180 - 3088 743 ## COG1533 DNA repair photolyase 4 3 Op 2 1/0.000 - CDS 3076 - 3765 317 ## COG4422 Bacteriophage protein gp37 5 3 Op 3 . - CDS 3762 - 4199 266 ## COG1846 Transcriptional regulators - Prom 4266 - 4325 5.2 6 4 Tu 1 . 
- CDS 4443 - 5480 332 ## COG0582 Integrase Predicted protein(s) >gi|226332979|gb|ACII01000040.1| GENE 1 81 - 632 576 183 aa, chain - ## HITS:1 COG:SMc03800 KEGG:ns NR:ns ## COG: SMc03800 COG3760 # Protein_GI_number: 15966936 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 1 179 54 222 224 81 27.0 1e-15 MELQKGRPENTDNRLDKEIRVYDFLDKLGVQYQRIDHEAAMTMEACEEIDHALGDNTTIC KNLFLCNRQETDFYLLLMPGDKPFKTKDLSAQIHSARLSFAKPEYMEKYLDITPGSVSVL GLMNDSEKKVQLLIDEDVMKEPYFGCHPCINTSSLKFTTEDLMQKIIPALEHEPVTVTLP VPE >gi|226332979|gb|ACII01000040.1| GENE 2 1053 - 1910 549 285 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578630|ref|ZP_04855901.1| ## NR: gi|253578630|ref|ZP_04855901.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 285 18 302 302 534 100.0 1e-150 MTVFEQSHTWDELIQHFVDSYVMETDKKAVSAFYDRDYIAERLKGLETELSLECRITLNG EERWVRNVIIRGEIEDSEYAMIFLRDITEAKVESARHLQMAADNASMEQLIQSIVRLVDR FVVCDLENDRYESYNLNGQMIYKPLGFYHDFQMQVLERYKTLEAIDILIAPDNIRKKLKS ENDIYKFEYCSLDEKTYKIASYIPLEWKNGKLEKVLLASMDVTQEKKAEIESRQALKEAY RSAENANRAKTEFLSNMSHVLLCLDWLYLIDAAEVDKKGRINLCI >gi|226332979|gb|ACII01000040.1| GENE 3 2180 - 3088 743 302 aa, chain - ## HITS:1 COG:MJ0683 KEGG:ns NR:ns ## COG: MJ0683 COG1533 # Protein_GI_number: 15668864 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Methanococcus jannaschii # 22 241 19 232 259 138 37.0 1e-32 MEEVINGILIREVETKNIMTKSSLPVGGYSVNPYVGCTHACKYCYASFMKRFTGHKEEWG TFLDVKHWPEIKNPKKYVGQRVVIGSVTDGYNPQEEQFGNTRKLLEQLIGSDADILICTK SDLVVRDIDLLKKLGCVTVSWSINTLDENFKNDMDSASSIEHRIAAMKQVYEAGIRTVCF VSPVFPGITNFEAIFERVKDQCDLFWLENLNLRGGFKKTIMDYIAGKYPDLVPLYDEIYN KHNRSYFEALEVKAEKMAKKYDCAFVDNEMPYGRVPQGHPVIVDYFYHEEIRGTENTGKR NR >gi|226332979|gb|ACII01000040.1| GENE 4 3076 - 3765 317 229 aa, chain - ## HITS:1 COG:SMa2239 KEGG:ns NR:ns ## COG: SMa2239 COG4422 # Protein_GI_number: 16263660 # Func_class: S Function unknown # Function: Bacteriophage protein gp37 # Organism: 
Sinorhizobium meliloti # 13 176 14 204 259 74 30.0 1e-13 MSICIKDQIQNMNIVIGCTVGCTYCYARNNVKRWHMIDDFADPEFFPGKLKMMEKKRPQN FLLTGMSDLSGWKPEWRDEVFAKIRENPQHQFLFLTKRPDLLDFDTDLENAWFGVTVTRK AERWRIDALRKNVRAKHYHVTFEPLFDDPGTVDLSGINWIVVGTMTGAQSRKIHTEPEWA WSLTDQAHKLGIPVFMKEDLVPIIGDENMIQEMPEEFNKVLEVQKSWKK >gi|226332979|gb|ACII01000040.1| GENE 5 3762 - 4199 266 145 aa, chain - ## HITS:1 COG:FN2010 KEGG:ns NR:ns ## COG: FN2010 COG1846 # Protein_GI_number: 19705306 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 1 144 1 144 160 134 52.0 5e-32 MEMNGGFLVTKIKQLGDRIFEKILSEKNIDAFNGAQGRILYVLWQKDGISIRSLSTKCGL AITSLTTMLERMENQGLISRVQSETDKRKTLLFLTEKAHALKGEYDSVSDEMGSIYYKGF SEEEIIRFEECLDRIRKNLEEWQKS >gi|226332979|gb|ACII01000040.1| GENE 6 4443 - 5480 332 345 aa, chain - ## HITS:1 COG:L48477 KEGG:ns NR:ns ## COG: L48477 COG0582 # Protein_GI_number: 15672029 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Lactococcus lactis # 138 342 226 392 394 62 26.0 1e-09 VENYLGTIRNIKKHPLAERKLKNVTSEHLQSFFDLLSFGGVHPDGKERKGYSKDYIHSFS AVMQQSFRFAVFPKQYITFNPMQYIKLRYQTDEVDLFSDEDMDGNIQPISREDYERLLAY LKKKNPAAILPIQIAYYAGLRIGEACGLAWQDVNLEEQCLTIRRSIRYDGSKRKYIIGPT KRKKVRIVDFGDTLVEIFRNARKEQLKNRMQYGELYHTNYYKAVKEKNRVYYEYYCLDRT EEVPADYKEISFVCLRPDGCLELPTTLGTVCRKVAKTLEGFEGFHFHQLRHTYTSNLLAN GAAPKDVQELLGHSDVSTTMNVYAHSTRDAKRKSVRLLDKVVGND Prediction of potential genes in microbial genomes Time: Sat May 28 19:34:42 2011 Seq name: gi|226332978|gb|ACII01000041.1| Ruminococcus sp. 5_1_39B_FAA cont1.41, whole genome shotgun sequence Length of sequence - 1436 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 573 501 ## EUBREC_3495 hypothetical protein - Prom 597 - 656 9.2 + Prom 647 - 706 9.1 2 2 Tu 1 . + CDS 745 - 975 312 ## EUBREC_3496 hypothetical protein 3 3 Tu 1 . 
+ CDS 1116 - 1430 334 ## EUBREC_3497 hypothetical protein Predicted protein(s) >gi|226332978|gb|ACII01000041.1| GENE 1 3 - 573 501 190 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3495 NR:ns ## KEGG: EUBREC_3495 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 190 1 190 209 313 89.0 2e-84 MKDKELRKLIGSRIKQRRLELNLTQPYVAEKMGVTASTILRYENGSIDNTKKMVLEGLSE ALHVSVEWLKGETDEYETDITDKRELQIRDAMGDILEQLPLALTKEEDAFSKDLLLLMLK QYGLFLDSFQFACKNFKGNDGQTDIAKTIGFESNDEYNEIMFLREITHTINAFNEMADIV RLYSKKPKTA >gi|226332978|gb|ACII01000041.1| GENE 2 745 - 975 312 76 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3496 NR:ns ## KEGG: EUBREC_3496 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 75 1 75 76 119 93.0 4e-26 MTQRKIALSIEEAADYTGIGRNTLRQLVEWKKLPVLKVGRKVLIKTDILEMFMEANEGRD LRDRGNVKAVTRTVAN >gi|226332978|gb|ACII01000041.1| GENE 3 1116 - 1430 334 104 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3497 NR:ns ## KEGG: EUBREC_3497 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 104 33 136 136 117 77.0 2e-25 MAKSTKSYEERMLEMEKREQESLEKAKRYAAQKKELLKRKKTEESKKRTHRLCQIGGAVE SVLGAPIEEKDIPKLIVFLKRQEANGRFFSKAMQKETNTDMEEV Prediction of potential genes in microbial genomes Time: Sat May 28 19:34:49 2011 Seq name: gi|226332977|gb|ACII01000042.1| Ruminococcus sp. 5_1_39B_FAA cont1.42, whole genome shotgun sequence Length of sequence - 4286 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 65 - 1786 636 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 2 1 Op 2 . + CDS 1506 - 2471 477 ## EUBREC_3500 hypothetical protein - Term 2564 - 2614 11.6 3 2 Tu 1 . 
- CDS 2667 - 4277 598 ## COG0577 ABC-type antimicrobial peptide transport system, permease component Predicted protein(s) >gi|226332977|gb|ACII01000042.1| GENE 1 65 - 1786 636 573 aa, chain + ## HITS:1 COG:AGpT237 KEGG:ns NR:ns ## COG: AGpT237 COG0507 # Protein_GI_number: 16119945 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 307 18 260 1117 109 29.0 1e-23 MAIFHYTIKIVGRSKGKSVISASAYLNGDVMKNEETGRISYYTSKKEVVYTRLMMCENAP PEWQIVPEENIKRFQKSVRYKRSEDKEAALKKFKITFQKQRLWNEVLKIEKNADAQLGRS FEFALPKEWSRQEQIQYTTDYIQKTFVDRGMCADWSIHDKGDGNPHVHLLLTMRPFNPDH SWGKKEVKDWDFVRDKNGNIVIDESHPNWWQDKKNPDRHGIRIPVLDENGIQKIGARNRL QWKRVLTDATGWNNPKNCELWRSEWAKVCNEHLPLHNQVDHRSYEKQGKLQIPTIHEGAD ARKIEQKFFAGQEIKGSWKVAENQIIKQQNTLLQKILNTFGKVSGALSLWKERLNDIRRK PGNYTLNGIHDWSNRRTAEPNGRNDSGNAEPGHSTLSYAKTESEITKIKQRVIRAAQHFA KYRGTAFQDGRTENEDRAFGKRKSAMAEIGTEAEQRKQFITETEHRIAELEQQIEKGRDI DERIQRIKERRTVGRTSALDRGDTRRTRTERTAYRGTEDAAQRISDLEREIKQREQSREY SSIKERLEAGRQSIADREREVAKRKRHDRGMSR >gi|226332977|gb|ACII01000042.1| GENE 2 1506 - 2471 477 321 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3500 NR:ns ## KEGG: EUBREC_3500 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 87 321 1 235 236 268 57.0 1e-70 MNESRESKNAEQLEELLLLTEEIQEELEQKEQLIVELKTQLSESLTLNEKLNKENRAGNI QALKNDLKLADRALRIEKEKLRSANVMIEECQDKICYLTQERDYARTHQKIVEIPVEKPV LYEKCEACNRTAYQNAKAKYETQKERLAGQYKAKTVMFQTTLFLLAWYSLTTTLFQAVQS DMFLVDCKSFFNDLASFIQTFVGWTIDAGHSAAQISTKIPNAFIAGMVYWLLLILIVGIC MAGTGMLAILIEIKVIELYKKNCWDVITLLMILTSVAIVIYFGESIKKVLSVNLLLLFVL SQGAYVGIRCYLKVWLEKRPY >gi|226332977|gb|ACII01000042.1| GENE 3 2667 - 4277 598 536 aa, chain - ## HITS:1 COG:CAC0454 KEGG:ns NR:ns ## COG: CAC0454 COG0577 # Protein_GI_number: 15893745 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium 
acetobutylicum # 11 536 260 827 832 70 22.0 9e-12 MSFSNYAVYGVVAAILFLAAALVIYSIFYISVGQKVAEFGQLRTIGASKKQIYKIVLKQG YMLAVPGILIGSIMGTIISYCLQSKGWSVFAFIVSLCGACLFGILLVYISVRKPAKIAAN TSPISALKNPVGIVNYRTHKRHRITPVYLAKISFSRNRKKSLLTILSLGLCGVIFFLAAS YQSSFNAESMARYWDMKYGDFKISIDLENESENLDALLQKEYFSDYVKRVGDIEGVSNIF TYATVPVDFSTDNDISDSTLMLGYNEKNLASLNAAVLSGNITDDTELVVSDPERIYDVYH WKPQIGDTVTFNFKNHSGKTITKAFKIGAITSSNDGMGGYIFRMPENMLKELAGYDCTYA IEIQAEAEYQEIIEQELKLLVSDNRDVRLQTLQDFIVEHQSDNRAGFTLAYAIAAILWIF AVINQINLTVTNLLSQKQEMGTLKSIGMTNKQLKQAFMMEGLFTTLIALLITAVVGIPGG YAIGIFLKNAGMSTGFVFPVVAFTLFAVAMFMLESLMTILLIHSWKKQSVIAIMRN Prediction of potential genes in microbial genomes Time: Sat May 28 19:34:55 2011 Seq name: gi|226332976|gb|ACII01000043.1| Ruminococcus sp. 5_1_39B_FAA cont1.43, whole genome shotgun sequence Length of sequence - 3551 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 726 479 ## gi|167768471|ref|ZP_02440524.1| hypothetical protein CLOSS21_03030 2 1 Op 2 4/0.000 - CDS 741 - 1412 203 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 - Prom 1457 - 1516 5.6 3 1 Op 3 40/0.000 - CDS 1558 - 2589 153 ## COG0642 Signal transduction histidine kinase 4 1 Op 4 . - CDS 2586 - 3275 296 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 3348 - 3407 8.0 Predicted protein(s) >gi|226332976|gb|ACII01000043.1| GENE 1 3 - 726 479 241 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167768471|ref|ZP_02440524.1| ## NR: gi|167768471|ref|ZP_02440524.1| hypothetical protein CLOSS21_03030 [Clostridium sp. 
SS2/1] # 1 241 1 241 775 470 100.0 1e-131 MNYPFENDTSAVIKKLARKSVRFDKKKNLFCLSAIIVAVAMIMMSLLTVQNIIYQNQKEI EGLHQGIFFDITQDSKEKLLANEEVKSVGLSCNIKTVKENSKELSLIYYDDDMLSLIPAF DGKYPEKASEIAVTDAFFASENKSPEINAIVQLNLDGTLKNYTIVGIYHDKTSTAYPVFV SLEKCRELRGNNLLNGYVWLENADTLTKDEATEILSQISEETGLNNWTVSSYYDYVNTTL S >gi|226332976|gb|ACII01000043.1| GENE 2 741 - 1412 203 223 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 222 7 222 318 82 28 4e-16 MNMSILQTIDLKKYYGTEPNITRALNGVNFTVEQGEFVGVVGTSGSGKSTLLHMMGGLDT PTSGSVIVRGEELAKKNDDELTIFRRRNIGFIFQNYNLVPILNVYENIVLPVELDGDTVD QKFMDEVVYMLALEDKLENMPNNLSGGQQQRVAIARALITKPAIILADEPTGNLDSKTSA DVLGLLKHTSGEFNQTIVMITHNNEIAQLADRIVRIEDGKIVG >gi|226332976|gb|ACII01000043.1| GENE 3 1558 - 2589 153 343 aa, chain - ## HITS:1 COG:CAC0451 KEGG:ns NR:ns ## COG: CAC0451 COG0642 # Protein_GI_number: 15893742 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 79 340 148 410 416 153 31.0 5e-37 MNLNNLSAKKICRLVSAGMVFSMLVITGIFGGIMQDIRIFIVGGTLTLCAFFWIWLLVLI FGKRLSHFTSNLCRTLDNMIDGNEELQKSNDSETLFARINHRLIRLYEIMQKNRHKVDME RQELQMLISDISHQVKTPVSNLQMVTDTLLTKPVSEEERMDFLQGIRSQTDKLDFLFQAL VKTSRLETGAIRLEKKDSSLFHTLAQAMSSIVYAAEKKEIAVSVDCPENLIISHDSKWTS EALFNLLDNAVKYTPSGGKISVSVVQWEMYVEVKVTDTGKGISESNQAAIFRRFYREEEV HDQQGVGIGLYLAREIVTRQGGYIKVVSELRQGSEFSIMLPVR >gi|226332976|gb|ACII01000043.1| GENE 4 2586 - 3275 296 229 aa, chain - ## HITS:1 COG:CAC0450 KEGG:ns NR:ns ## COG: CAC0450 COG0745 # Protein_GI_number: 15893741 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 226 1 225 227 160 39.0 2e-39 MKHILIVEDDNLLNKTLTYNLELDGYTITSVLNARTAAESLKTNIFDLVLLDINLPDGNG YDLCRLIKPEHPDTVVIFLTANDQESNQIRGYEAGAVDYITKPFSISALQRKIKAMFAML 
EHHKPAKDIYEDGSLFLDFSEQFASLNGKPLALSAMEYKMLNLFLKNPKQVLTRQQFLEK LWDVDEKYVDEHTLTTSISRIRSKIEADGDTYIKTVYGMGYQWTGGEKK Prediction of potential genes in microbial genomes Time: Sat May 28 19:35:05 2011 Seq name: gi|226332975|gb|ACII01000044.1| Ruminococcus sp. 5_1_39B_FAA cont1.44, whole genome shotgun sequence Length of sequence - 783 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 159 62 ## gi|153811529|ref|ZP_01964197.1| hypothetical protein RUMOBE_01921 + Term 183 - 227 1.1 - Term 169 - 214 3.0 2 2 Tu 1 . - CDS 266 - 412 131 ## gi|226325591|ref|ZP_03801109.1| hypothetical protein COPCOM_03404 - Prom 555 - 614 6.1 + Prom 324 - 383 8.3 3 3 Tu 1 . + CDS 585 - 773 63 ## Predicted protein(s) >gi|226332975|gb|ACII01000044.1| GENE 1 1 - 159 62 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|153811529|ref|ZP_01964197.1| ## NR: gi|153811529|ref|ZP_01964197.1| hypothetical protein RUMOBE_01921 [Ruminococcus obeum ATCC 29174] # 1 52 66 117 117 102 98.0 6e-21 EEISLSTGDWLKIAPTAKRQFFASDISGITYICIQVKENSLEHFTAEDAVIG >gi|226332975|gb|ACII01000044.1| GENE 2 266 - 412 131 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|226325591|ref|ZP_03801109.1| ## NR: gi|226325591|ref|ZP_03801109.1| hypothetical protein COPCOM_03404 [Coprococcus comes ATCC 27758] # 1 48 1 48 48 76 97.0 4e-13 MKGIKIELLGITIIPLGIAVTTNNFWGYVLGVLGFGVAVVGCFLKDNH >gi|226332975|gb|ACII01000044.1| GENE 3 585 - 773 63 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFLSIFSTFSQFSISKIALTLSLDGTFAEKFIFLLSTRGDFEECQTIFYEIKKETLTVIV SA Prediction of potential genes in microbial genomes Time: Sat May 28 19:35:21 2011 Seq name: gi|226332974|gb|ACII01000045.1| Ruminococcus sp. 
5_1_39B_FAA cont1.45, whole genome shotgun sequence Length of sequence - 13487 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 480 - 522 7.1 1 1 Tu 1 . - CDS 714 - 4136 2625 ## HMPREF0868_0283 Listeria/Bacterioides repeat protein - Prom 4321 - 4380 6.7 2 2 Op 1 . - CDS 4454 - 5743 802 ## Nther_0049 hypothetical protein - Prom 5773 - 5832 4.0 3 2 Op 2 . - CDS 5842 - 7527 1141 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 4 2 Op 3 . - CDS 7543 - 7728 204 ## gi|253578649|ref|ZP_04855920.1| predicted protein - Prom 7792 - 7851 7.7 - Term 7946 - 8005 8.1 5 3 Op 1 . - CDS 8022 - 8783 944 ## COG1402 Uncharacterized protein, putative amidase 6 3 Op 2 . - CDS 8805 - 10283 1424 ## COG4145 Na+/panthothenate symporter 7 3 Op 3 . - CDS 10284 - 10532 272 ## LHK_00849 hypothetical protein - Prom 10621 - 10680 13.6 + Prom 10597 - 10656 11.7 8 4 Tu 1 . + CDS 10701 - 11579 488 ## COG1737 Transcriptional regulators + Term 11600 - 11650 1.5 - Term 11584 - 11642 13.4 9 5 Tu 1 . - CDS 11658 - 12797 1026 ## CCV52592_0185 isoaspartyl dipeptidase (EC:3.4.19.5) + Prom 13054 - 13113 7.5 10 6 Tu 1 . 
+ CDS 13140 - 13343 299 ## gi|253578655|ref|ZP_04855926.1| predicted protein Predicted protein(s) >gi|226332974|gb|ACII01000045.1| GENE 1 714 - 4136 2625 1140 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0868_0283 NR:ns ## KEGG: HMPREF0868_0283 # Name: not_defined # Def: Listeria/Bacterioides repeat protein # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 220 790 195 761 1160 405 45.0 1e-111 MKKRMAILMSMAIAANMIASVPVCAADIFTDSMTTDSAITQEADDDSFSDGLGTPDFADE QENVQAAQDEGTQSAKEPPEVIAEYESGSFANVCKLKTSLNSRYVEQITGISINGTEWNE VTYSWDLTGTTYIKKDSDGYVGLSPDRLEVGNEIVITSTGYKDLTLKVTGTGKHWNVEKV TSQSEPEQTKDAPTFAGDQVEISGTDYKIIKPADENNVSDYISKIKEISVDGTVWDKTDY AIALYGRKAYCPDSSNNRVVFDSTVLHTGNVITIKSDGYKELYLKVTAEGNDFKVVTADS GETNINGASNGVNTLHVRLKGYFESAVTGQQKYDAVSGASTSVSSNKNSNVVVEAADLPD GQEPVEADWKELKETGVKIDTKNTKVNIDSESGMAGMYSTYDSSLSLSGTPAKIGTYPVN VTVTDESGRTVTSNELTFKVYSTKEKLADHLKLENATKTADGKYMYDMDPWVIPDFNDTD DIVTVPAEIKAWYGSHTSGTYGELGYAVSEGEATTQTLIIPAGCDLTMVNMKVLSSVKIK VENGGTLNLRDSSIYGQIEVENGGTISVNHDDYSGKFLTGTSINGQLILNDGATIRNSMI YSNTNFLPNGTQARHNTSPVVVISGNVKVEGKVYIKGDEAATGTDPATGKSYSGQPALKV SSGTLTIGEGSQLAAYGGGNIATTSVGGAAVILDNGTISGAGTLIAVAGRGDGDNGGNAV EGTGNVEIAKAYLEGGSTSVMNKNAEPGKAYTESVTISHSTKGTAVNGKKITSNSESAPD TYWSDITETPDNKIQNCTISDTTIIQTVNPTPSVNPTPSVNPTPSVNPSPSVNPSPSVNP SPSVNPTVTPVPTVAPIPDTYPEGTEKDKNGNLVTPEGIVISSDGTVTLPDGTELTPDAD GKKPTVNKDETVTDTKGNTYSTDGSITDSDGNYTRPAKAVIKNVSVSKNSVKMVLNEECK GALKYDYVIGTSEDMLKTGQYTKVLKNQEELKSAFAYMDKGTWYVACHAWFKGEDGKKVF GQWSEIHQVDVSTITPQTPVISKVTVKGSTVTVKYTKSKNSQGYDVVLSDTLTRTNGQKR PAVSGENYYVKKIKGNTVTVTFKNVKSGTYYIGLHAWNRTSEDNSKTFSEWSNVRKVKMK >gi|226332974|gb|ACII01000045.1| GENE 2 4454 - 5743 802 429 aa, chain - ## HITS:1 COG:no KEGG:Nther_0049 NR:ns ## KEGG: Nther_0049 # Name: not_defined # Def: hypothetical protein # Organism: N.thermophilus # Pathway: not_defined # 163 313 3 152 153 66 30.0 2e-09 MANTKKRWTADEIEILRKYYPAEGDQVCERLSGRTAKSCRIQASKMEISKNISSSATVQN DVKYVRLWTEAEDKILREYYAAEGSKVRYRLNQRSAESCRGRAAALGLTYSGANRKWSSS 
EDEILRNYYPTEGAEVQKRLEGRSKEACNKRAAKLGLCIQGRKEWTAEEDEIIRTYYKAE GRKAAERLEGRTLTACQSRAKYLNVRSDNNRLPWTQEEDAIISRYYEEKGAGYVHTLIPE RSLQACKYRAMILGLSRKTNDPWTKAEEKILKKYYPVEGSNVYKRLNNRTRNACISKAAQ LHLQYGLPMYKGPVLDEWIILAVKQGITKVRQSMTYSEITFKPGKLPELHLRLECDETGK FRVYEKSVLLDEKEYDRQRVENLADTMGLQIEGNKITLNSSEKEIEDDIFRLIAGCYLIY YLHYFMELP >gi|226332974|gb|ACII01000045.1| GENE 3 5842 - 7527 1141 561 aa, chain - ## HITS:1 COG:SP1040 KEGG:ns NR:ns ## COG: SP1040 COG1961 # Protein_GI_number: 15900911 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Streptococcus pneumoniae TIGR4 # 21 465 5 460 559 115 23.0 2e-25 MDIYATREELKSRQKNIFDLKLNVAYYARVSTDKEEQKTSIEHQWTFFEQLISDVKNWTL VDGYIDEGISGITVTHREEFQRMLQDAKAGKIDLIITKEITRFARNVLDSIRYTRELLDC GTAVWFRNDNINTLDEDSELRLSIMSGIAQEESRKLSSRVRFGHARSIQNGVVLGNSHIY GWDKRNGKLFLNPEEAKMVQLIFEKYALGNWSTHSLEQFLWECGYRNYKGGKIDSHVIAN IIRNPKYKGYYVGGKVKIVDLFTKKQKFLPEDEWVMYKDDGTHVPAIVDEKLWEDANRIM EERSKNIKLKKTSYKQDNLFTGLIHCAEDKAAYWLKVRTVRGKAAKTWVCSHRIKQGAVS CRSKPVKEEILLEMISDVYNNLAQSNKSILHKYLELYEKEINKTEGTEEKIDNIQHEITV LEQKRDKLLDYNMGGYISDAEFLSRNKVFTQQIAEKKANLEQLKSMKPLSKTELTDQMDR ILDYAKEYSCVKPKDITRAMIENTINNIEITPVNEKKAEVMIVFKSGSLHSPESVIPALG QEKTYTTETGYEYVVHSGISM >gi|226332974|gb|ACII01000045.1| GENE 4 7543 - 7728 204 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578649|ref|ZP_04855920.1| ## NR: gi|253578649|ref|ZP_04855920.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 61 3 63 63 93 100.0 3e-18 MTLEQVVLTMINDKMLDNGQISEKLHSSIQKDIITLSTFNTAVAKNGEQMDSSDEYCQRI S >gi|226332974|gb|ACII01000045.1| GENE 5 8022 - 8783 944 253 aa, chain - ## HITS:1 COG:MK0183 KEGG:ns NR:ns ## COG: MK0183 COG1402 # Protein_GI_number: 20093623 # Func_class: R General function prediction only # Function: Uncharacterized protein, putative amidase # Organism: Methanopyrus kandleri AV19 # 20 249 17 222 224 102 29.0 5e-22 MYLAKMTTAQAKEAFEKDPIIVIPVGSTEQHGTQGALGTDFMVPSYLADHVEDVDNVIVA PTVPYGVCPYHLSFEGSINIGYEGLYMVLHGIMDSLMQHGAKRFVVLNGHGGNTPSIDRA ALEVYHKGGVCASVDWWSLVAQLDKKFDGGHGDVLETSAMMAIAPESVHLELSKPINAQD PSENMKAAYIQAVNFKGGIVRLVRDTKEIAPSGWFGPFDPKDSSAELGQEALDLAVSYIR DFLEEMKQIELSS >gi|226332974|gb|ACII01000045.1| GENE 6 8805 - 10283 1424 492 aa, chain - ## HITS:1 COG:FN0685 KEGG:ns NR:ns ## COG: FN0685 COG4145 # Protein_GI_number: 19704020 # Func_class: H Coenzyme transport and metabolism # Function: Na+/panthothenate symporter # Organism: Fusobacterium nucleatum # 41 490 37 482 484 184 29.0 3e-46 MNGAAVIQPAPIPFYTVLVLYLGIMAFIGWYAGRKTNNIGDFFVLSGKAGVVVSGIAYFS TQFSMGTFLGTPGTIYGVGYAGMAISVPGAVFCMILPALLIGRKLITHGHKYGFLTMADY LTDRYHSKNMSGVLGVMMLFFLVPMMGAQIIGAGVIVHVFTGLPEWVGVVGMGIIVILYC MTGGMKGAMMTDVIQGSLMIATAVVTFIVSIVMGGGFSNINHTLQSMNEAYLTFPGANGY MPWTYYISNIVLWSFFTMGQPHLFTKFFAMKDHKTMFKAILLGTAGMFFSATLIEWAGVN GIASIQNIEKADQIIPMILQRGMNPFLASIFIAGIVAAGMSTIDGILVTTTGAVTRDIYQ KIINKNATDEAVMSLSKVVTVIIGIVVICFGVFQPGSIFEINLFAFSGMAIFVVPILFGI YWKKATAKGAIASVIVGIISLLLFTLNPSVKALAMGFHALFPTTIIASIVMIVVSKFTET PPQETIDRHFTV >gi|226332974|gb|ACII01000045.1| GENE 7 10284 - 10532 272 82 aa, chain - ## HITS:1 COG:no KEGG:LHK_00849 NR:ns ## KEGG: LHK_00849 # Name: not_defined # Def: hypothetical protein # Organism: L.hongkongensis # Pathway: not_defined # 9 82 8 81 85 77 44.0 1e-13 MKKLFEKHFERTWLIIFLIMFVLIMIPFPFFYSETYIPAFGGVPLYIFGWIVHTAITFVL IIVYYRMCMKRKEYHTYDEEDK >gi|226332974|gb|ACII01000045.1| GENE 8 10701 - 11579 488 292 aa, chain + ## HITS:1 COG:BH2940 KEGG:ns NR:ns ## COG: BH2940 COG1737 
# Protein_GI_number: 15615502 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 10 284 1 263 284 107 27.0 2e-23 MRHAKKGHRVISMDILEHLQKLYPALTKKQKTIADYLIANPEEISYITLAQLSHQTGTSE LTLLRFCQKLGYSNFLDLKEQFRDYTQKMIKQASSSSFFLPERTSGEDTEREQMLHEICN TEAAVFSDFITNINLESVIKASDEIRKSTRIYIFAHDISLVPGKFLQSRLEILYLNSALI DLSDLEVTQKIIQQLTDGDLVIFFSFPRYYFPIGNIAKKAAKSGVPILTITDSDTSPAAA HSTLLLLCPTSTKLFYNSMTAPMAMLNILASCLVIDSVSPSERQIFIDTLPS >gi|226332974|gb|ACII01000045.1| GENE 9 11658 - 12797 1026 379 aa, chain - ## HITS:1 COG:no KEGG:CCV52592_0185 NR:ns ## KEGG: CCV52592_0185 # Name: iadA # Def: isoaspartyl dipeptidase (EC:3.4.19.5) # Organism: C.curvus # Pathway: not_defined # 1 378 1 377 378 384 52.0 1e-105 MLLVKNAEIYAPEYLGKKDLLICGGKIECIQDSIRELPVECEVLDAEGKILTPGFLDQHV HITGGGGEGSFHTRTPELQMSELVENGITTVVGLLGTDGITRSVDNLYAKTRVLCEEGVS AYMLTGAYGYPSPTITGETDRDIVFVNEILGVKLAISDHRAPNVTGDQLVQIASKARVAG MLSGKPGIVVLHMGDDKDGLAPVFRALEVSSVPVRIFRPTHVNRNEKLLEEGYEFLKRGG YIDLTCGMHTSPGECVLEAKKRGLPTEHITMSSDGHGSWSNYAEDGSLLEIGVSGVDALY KELKYMVQVLGMTLEEALPYMTCQVAEGLDLLGIKGTVAEGADADLLLFDQDLTLDTYVA RGKIFMKHGEVIRKGTYEK >gi|226332974|gb|ACII01000045.1| GENE 10 13140 - 13343 299 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578655|ref|ZP_04855926.1| ## NR: gi|253578655|ref|ZP_04855926.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 67 12 78 78 114 100.0 2e-24 MEEKILNFILECAEVQKLVPFSLIEEEFNLILDEALKSVITDALWDNDTISDVTIGTDGF TVTFFEN Prediction of potential genes in microbial genomes Time: Sat May 28 19:35:59 2011 Seq name: gi|226332973|gb|ACII01000046.1| Ruminococcus sp. 5_1_39B_FAA cont1.46, whole genome shotgun sequence Length of sequence - 2419 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 762 635 ## COG1145 Ferredoxin - Prom 791 - 850 5.1 - Term 1055 - 1095 1.2 2 2 Tu 1 . 
- CDS 1141 - 2106 575 ## COG2199 FOG: GGDEF domain - Prom 2185 - 2244 7.1 Predicted protein(s) >gi|226332973|gb|ACII01000046.1| GENE 1 3 - 762 635 253 aa, chain - ## HITS:1 COG:MA4170 KEGG:ns NR:ns ## COG: MA4170 COG1145 # Protein_GI_number: 20092963 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanosarcina acetivorans str.C2A # 2 234 12 247 294 76 27.0 4e-14 MVVYFSGTGNSKYIAERIAGSLQEKLLCMNERIKSGDTGSVKTRENLVVVVPTYAWRIPR VVSDWIGQTEFVGAKNVWYVMSCGSGIGGADIYNRKLSEKKGLKHMGTAQIIMPENYIAM FNAPDVEKAKKIVVAAGPDIAKAVLAIKHGEKLPSKSGFGASFESGLVNDIFYAAFVKAK AFYADQTCTGCGKCVKVCPLNNVTMKNKKPVWEKHCTHCMACICYCPAEAIEYGRKSVGK PRYCFENLGLEKY >gi|226332973|gb|ACII01000046.1| GENE 2 1141 - 2106 575 321 aa, chain - ## HITS:1 COG:DR0267 KEGG:ns NR:ns ## COG: DR0267 COG2199 # Protein_GI_number: 15805298 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Deinococcus radiodurans # 156 315 338 501 511 83 32.0 5e-16 MMIQIIRLQGTARVINYAGLVRGATQREVKLEITGNQNEELIKYLDDIFLGLRYQDGHYD LVKLNDEEYLEKLQIQSDYWDKLKKEIEAVRNKGYENTDIVNMSEIYFTMADETVSAAES YSEKIAIKIRTIELLSTLDMLSLVILVIMQTLRAMQMAMQNRLLEQKAFIDAYTGLPNKN ACNEILNKKDIITDPTACIMFDLNNLKTVNDTMGHSAGNQLILNFAKLLRSVIPEKDFVG RYGGDEFIAVIYHTSEAEIKVILKSLYREKNRLNSYENQIPIDYACGWALSSDDMSGTMQ MLLDEADAYMYKNKQLCKKYN Prediction of potential genes in microbial genomes Time: Sat May 28 19:36:01 2011 Seq name: gi|226332972|gb|ACII01000047.1| Ruminococcus sp. 5_1_39B_FAA cont1.47, whole genome shotgun sequence Length of sequence - 8331 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 51 - 554 217 ## COG3022 Uncharacterized protein conserved in bacteria - Prom 596 - 655 3.4 2 2 Op 1 . - CDS 812 - 1267 511 ## COG1225 Peroxiredoxin 3 2 Op 2 . - CDS 1271 - 1483 178 ## EUBREC_1224 hypothetical protein - Prom 1684 - 1743 6.6 4 3 Op 1 . 
- CDS 1763 - 2200 348 ## EUBELI_20567 hypothetical protein 5 3 Op 2 10/0.000 - CDS 2248 - 5079 2142 ## COG0642 Signal transduction histidine kinase 6 3 Op 3 . - CDS 5150 - 8017 1755 ## COG0642 Signal transduction histidine kinase - Prom 8108 - 8167 6.9 Predicted protein(s) >gi|226332972|gb|ACII01000047.1| GENE 1 51 - 554 217 167 aa, chain - ## HITS:1 COG:NMB0895 KEGG:ns NR:ns ## COG: NMB0895 COG3022 # Protein_GI_number: 15676791 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis MC58 # 1 164 91 258 259 127 41.0 1e-29 MAPSVFEDSQFEYVQNHLRILSAFYGVLKPLDGVTPYRLEMQAKVGIGDAKNLYEYWGDM LYCSVIDNSRIIINLASKEYSKSIEKYLTLRDKYITIVFCELSGDKLVTKGTYAKMARGE MVRFMAENSIENPEDIKKFDRLGYAFRHDLSSDTEYIFERRIEITSY >gi|226332972|gb|ACII01000047.1| GENE 2 812 - 1267 511 151 aa, chain - ## HITS:1 COG:CAC0327 KEGG:ns NR:ns ## COG: CAC0327 COG1225 # Protein_GI_number: 15893619 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Clostridium acetobutylicum # 5 150 6 151 151 146 52.0 1e-35 MLEVGVKAPDFELPDQNGKIHRLSDYAGKKVVLYFYSKDNTAGCTKQACGFSERYPQFIE KGAVILGVSKDSVSSHKRFEEKYGLAFTLLADPERKVIEAYDVWKEKKNYGKVSMGVVRT TYLIDEQGVIIKANDKVKAADDPENMLKELD >gi|226332972|gb|ACII01000047.1| GENE 3 1271 - 1483 178 70 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1224 NR:ns ## KEGG: EUBREC_1224 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 69 26 94 94 92 62.0 4e-18 MKAHKYWSVGALITMIGTFYTGHKGLKSSHKYFALSSMLCMTMAIYTGHKMISGNRKKKV NKEKVTEREE >gi|226332972|gb|ACII01000047.1| GENE 4 1763 - 2200 348 145 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20567 NR:ns ## KEGG: EUBELI_20567 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 78 1 78 114 94 56.0 1e-18 MTMQECYKAIGGNYEAVLGRLHSEALIQRFTLKFLEDQSYLQLKQTLENKNYEDAFRSAH TLKGVCQNLSFDRLYEVSDKSLPNQIYSQNLIKVMQEKIDFFKSNSGINSIDYNASSGQL TIINEKQKIIYQREDPGFDVFKVFE 
>gi|226332972|gb|ACII01000047.1| GENE 5 2248 - 5079 2142 943 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 528 927 262 654 676 173 31.0 1e-42 MISSAKKYMAVLLVFVCVCMIFSPLTAYAAEDSSQHETVKVGFFAMDGYHMMDEEGNRSG YGYDFLRLMARYWDVDYEYVGYDQSWDDMQQMLEDGEIDMVTSARKTPDREEKFDFSRPI GTNYGMLTVRSDNSTIVDDNYSTYNGMRVALLNGNTRNEEFADFADNKGFTYVPSYFDTT AEMEEALQSEKVDAIVTSSLRKTNNERIVDKFGSSDFYVIVKKGNTELLNEINYAIDQMN AVEGDWKTTLYNKNYESIETKNLEYTEQEKSIIAQYSKDNPLHVLCDPSRYPYSYNENGE MKGIIPDYFRKIADYAGISYEFLTPATRDEYIAYQKNKEVTNISIDARLETDNYAETKKW GLTAPFITMQLARVTRRDFDGEINVVATVDQTASNSIADAMAPGAEKLMCSTRQEMMEAV RKGKADAAFVYYYMAQAFVNSDTTGTMTYTLLEQPTFTYRMVVSSTENHALAGILTKAMY AMPQNLVEDLAARYTTYKAAELTFVDWIRLHPVATVWVLLIFGWLLTTMAMIAMRLSARK KAQKAAQETAEEMAELAEHAQAANKAKTAFLSHMSHDMRTPMNAIMGFTGIAMKNNPSDE VKNCLEKIDESSEHLLSLINDVLDLTRVESGKVKYNPVPTDLKSITDFALDITKGFLTNR DISFKIQREEAKIPNVLVDSVRLRDVLVNILSNAVKFTPDGETITFEARCQEKGGDGYIN MHYRISDTGIGMSEEFTKEVFEEFAQEDSDVRTQYHGAGLGMAIVKKYVDMMGGTISVQS KKHEGTTFTVDIPLEITDKECNKSDIGFSEKVNLTGVKVLLAEDNELNAEITTVQLEEFG MNVERAVDGKNAVEIFRNHPEGTFDVILMDIMMPEMNGYEAAKAIRAMNDRPDGKNIPII AMTANSFAEDVQASLDAGMNAHLSKPIVIEEVIKTILRYVHND >gi|226332972|gb|ACII01000047.1| GENE 6 5150 - 8017 1755 955 aa, chain - ## HITS:1 COG:VC1349_3 KEGG:ns NR:ns ## COG: VC1349_3 COG0642 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 563 800 2 236 260 163 39.0 2e-39 MKKCKLVQNIMNFKFCYIIFCTIICSIVLSVPTFAKENSDNVIRVGSFEETYNTVNEKGE RSGYGYEYLQDIAGYAGWTYKYITSDWKNCFTQLENGEIDILGDISYTDERAENMLFSDM PMGEEKYYIYTDASNMDLTAGNLDSFEGKNIGVSKDNIVEDVLNEWESKYGLHTKHINVS TTTEIMDKLSKHEIDCFVSVEESRWEESDISPLTSIGETEIYFAINPERPDIKEALDSAM RRIKDDNPFYTDDLYRRYLSAQSSSFLSKEEREWLGRHGALRIGYLNQDGGISSVDSSTG KLTGVITDYVDLAENCLQGQTLEFELNGYDTRSELLQALQDGKIDLIFHANQNPYFAETN 
GFALSDTLLTLNMAAITAKDSFDENKENIVAVEKDSFALKAYLSYNYPQWEVVEYETSDA AVKAMQKGETDCIVSNSGTVSDYLKNNKLHSVFLTKEADVSFAVQQGEPVLLSILNKTLT SMPTTQFSGAVVSYNASSRKVTAKDFIQDNLLAVSLIVGISIFVALCIILDSLKKSKRAE EKSKKSAEQALKLNQELEEKQQELQNALVEAQSANKAKTSFLNNMSHDIRTPINGIMGML TILEKSGNDGERAKDCLNKINESSKLLLSLVNDVLDMAKLESDTVVFSDESINLDQVCQE ITESLSFQAEEKGLHVIGEHDDYSGIYVWSNAVHLKKILMNLFTNSMKYNKVNGFIYMSM RTIERSEDHMTCEFKIKDNGIGMSEEFIKNELFTPFVQADNSPRSDYNGTGLGMPIVKQL VEKMGGTITVESKLGEGSCFTVILPFKIDTNARLEEKEDFNADISGVRILLVEDNELNAE IAEFILTENGAKVETVKNGLEAVQYFEACESGTYDVILMDVMMPVMDGLTATKTIRSLER QDAKTIPIIAMTANAFREDAEKCMEAGMNAHLAKPLDDKTIKQTICEELRSSRDR Prediction of potential genes in microbial genomes Time: Sat May 28 19:36:07 2011 Seq name: gi|226332971|gb|ACII01000048.1| Ruminococcus sp. 5_1_39B_FAA cont1.48, whole genome shotgun sequence Length of sequence - 5366 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 45/0.000 - CDS 18 - 920 601 ## COG0842 ABC-type multidrug transport system, permease component 2 1 Op 2 2/0.000 - CDS 920 - 1597 542 ## COG1131 ABC-type multidrug transport system, ATPase component 3 1 Op 3 . - CDS 1623 - 1832 95 ## COG1131 ABC-type multidrug transport system, ATPase component 4 1 Op 4 . - CDS 1859 - 2041 103 ## - Prom 2103 - 2162 4.8 - Term 2046 - 2088 3.2 5 2 Tu 1 . - CDS 2317 - 2451 87 ## gi|226322848|ref|ZP_03798366.1| hypothetical protein COPCOM_00620 - Prom 2556 - 2615 3.5 - Term 2574 - 2620 1.2 6 3 Op 1 . - CDS 2645 - 4576 1043 ## COG0642 Signal transduction histidine kinase - Prom 4602 - 4661 4.5 7 3 Op 2 . 
- CDS 4667 - 5314 368 ## COG0655 Multimeric flavodoxin WrbA Predicted protein(s) >gi|226332971|gb|ACII01000048.1| GENE 1 18 - 920 601 300 aa, chain - ## HITS:1 COG:Cgl2007 KEGG:ns NR:ns ## COG: Cgl2007 COG0842 # Protein_GI_number: 19553257 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Corynebacterium glutamicum # 4 286 2 279 289 75 26.0 1e-13 MRTVLALMDRNRKLFFKDKGMLFTSMITPVILIVLYATFLAKVFKDSFTAAIPDMITISD KLINGTVAAQLTASLMAVSCITVTFCVNLTMVQDKANGTRKDFNVAPVSKEKIYLGYFLS TVANSLMVNGLAFVLCLGYLLKMGWYMNTADILWVLFDMILLVLFGSTLSSIISFPLTTQ GQLSAVGTIVSAGYGFLCGAYMPISNFGPGLQKALSYLPSTYATSLIKNHMLHGVFREME RKNYPDEMVEAIRDTLDCNPVFHGNVVSINQMIGIMMGSIAVFGIIYYVVTLLSAGEGRR >gi|226332971|gb|ACII01000048.1| GENE 2 920 - 1597 542 225 aa, chain - ## HITS:1 COG:lin0985 KEGG:ns NR:ns ## COG: lin0985 COG1131 # Protein_GI_number: 16800054 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Listeria innocua # 2 216 77 296 303 176 43.0 4e-44 MLGVVFQDSVLDKPLTVKENLKSRAALYGITGNAFDKRLQELVEILDFDEFLNRPVGKLS GGQRRRIDIARALLHRPEILILDEPTTGLDPQTRQLIWNVIEKLQKTENMTVFLTTHYME EAANAGYVVILDKGSIAAEGTPFELKNDYVQDIVSVYGISEDEIKSLNREYKKIRDGYQI KVRNTKEATKLIVEHQDLFTDYEVVKGGMDDVFLAVTGKKLGGER >gi|226332971|gb|ACII01000048.1| GENE 3 1623 - 1832 95 69 aa, chain - ## HITS:1 COG:MA4206 KEGG:ns NR:ns ## COG: MA4206 COG1131 # Protein_GI_number: 20092997 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Methanosarcina acetivorans str.C2A # 1 67 1 65 333 62 47.0 2e-10 MESDIIKISHLNKSFSEVKAVNDLSFRVKKGELFAFLGVNGAGKSTTISILCGLQEKRQR NGSSKRNRN >gi|226332971|gb|ACII01000048.1| GENE 4 1859 - 2041 103 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLVLDPIYLCLLSGAFGGGDVNGILYLMPQAWIWTIFAFILLFNYMDSIKKGKSNVYCVI >gi|226332971|gb|ACII01000048.1| GENE 5 2317 - 2451 87 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|226322848|ref|ZP_03798366.1| ## NR: 
gi|226322848|ref|ZP_03798366.1| hypothetical protein COPCOM_00620 [Coprococcus comes ATCC 27758] # 1 44 12 55 55 69 93.0 8e-11 MKILENNNKSNSLEENLILVETIPIKEYTYNRFVVSFGGKLWII >gi|226332971|gb|ACII01000048.1| GENE 6 2645 - 4576 1043 643 aa, chain - ## HITS:1 COG:sll1228_2 KEGG:ns NR:ns ## COG: sll1228_2 COG0642 # Protein_GI_number: 16330678 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 255 496 2 245 267 174 38.0 6e-43 MSSMKKTKEGINLKKLLRLKNTGSIFVIAGLLMIFVIASYTYILQSSYTKTALETEITRD TASADAVHKLVNGRIGKEDFDQIKDQSDEKKQLYKDISSYFNEIRTLNSTRYIYTAKKNE EGKLVYVVDGLDPDADDVRRPGDYIEEEMVPYIDRAISGENVYSQDIIDTTWGPIFTACY PVSANYDGTGEIIGAFCIEMDMQSAYGMVEKTNHISIICGLVAGAVLLLICLYTYYVYQK SKAEEQKQKQLLMTAAEEADAANKAKSAFLLSISHDIRTPMNAIIGFTNIALHQNTVSDI HDSLEKVQKSSNHLLSLLNDVLDFTRIESGKVTISPEPMDITQLTDNVQAIMNGLLYNRD LKFEVHREISKNPYVLADVVRIREVLVNLLGNAVKFTKDGGKITLDISSYPGADEKHIIT RYVVRDNGIGMSEEFQKKLFDPFSQEDDANARTQYKGTGLGMAITKKYVDMMGGSIAVES KKGVGSTFTVEIPLELTEQVIQSEQKQHLHRDLTGIHVLMAEDNDLNAELATIMLEDAGM TVTRASDGKEVVNLFKNHPRGTYDFILMDIMMPNMDGHQAAKAIRALGIERSDAVTIPIV ALSANAFIDDIQESLDSGMNDHISKPINMEELIDTITKYIKHD >gi|226332971|gb|ACII01000048.1| GENE 7 4667 - 5314 368 215 aa, chain - ## HITS:1 COG:CAC3341 KEGG:ns NR:ns ## COG: CAC3341 COG0655 # Protein_GI_number: 15896584 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 209 1 208 208 225 51.0 6e-59 MKVMLVNGSSHLHGTTMTALEEMIKVFHSEEVETEVVQLGGKPIADCLQCNVCQKTGKCV IKNDGVNEFVEKAKTADGFVFATPVYFAHPSGRIYDFLDRVFYSAGGYDAFKFKPGAAVA VARRGGTTAALDGINKYLGIAQMPVAGSTYWNMVYGLYGEEAFQDEEGMQTMRNLARNMI WMMRCFKLGRESGILYPETETDACTNFIKRSDTLR Prediction of potential genes in microbial genomes Time: Sat May 28 19:36:17 2011 Seq name: gi|226332970|gb|ACII01000049.1| Ruminococcus sp. 
5_1_39B_FAA cont1.49, whole genome shotgun sequence Length of sequence - 1498 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 8/0.000 - CDS 2 - 1088 458 ## COG0675 Transposase and inactivated derivatives 2 1 Op 2 . - CDS 1036 - 1497 352 ## COG2452 Predicted site-specific integrase-resolvase Predicted protein(s) >gi|226332970|gb|ACII01000049.1| GENE 1 2 - 1088 458 362 aa, chain - ## HITS:1 COG:MA0258 KEGG:ns NR:ns ## COG: MA0258 COG0675 # Protein_GI_number: 20089156 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 1 362 1 340 370 149 31.0 8e-36 MVKAIKVMLIPNNVQKTRMFQYAGASRFAYNWALAREIKNYEKGGGFITDAELRKEFTKL RHSDEYAWLLNISNNVTKQAIKDACTAYKNFFRGLQKFPRFKSKKRSMPKFYQDNVKIRF SNTHVKFEGFSSSRKANKQKMNWVRLAEHGRIPTNAKYMNPRISFDGLNWWISVCVEFPD CRETLNDDGVGIDLGIKDLAVCSDGTKYKNINKSQKVKKSEKQKRRLQRSISRSYEKNKK GESYCKTNNVIKKEKLLLKRNHRLTNIRKNYLNQTISESVDRKPRFICIEDLNVSGMMKN RHLSKAVQAQGFFWFRKQLEYRCSDKGIQLIVADRFYPSSKLCSCCGNIKKDLKLSDRVY RC >gi|226332970|gb|ACII01000049.1| GENE 2 1036 - 1497 352 153 aa, chain - ## HITS:1 COG:PAB2076 KEGG:ns NR:ns ## COG: PAB2076 COG2452 # Protein_GI_number: 14520623 # Func_class: L Replication, recombination and repair # Function: Predicted site-specific integrase-resolvase # Organism: Pyrococcus abyssi # 1 134 70 208 212 110 47.0 1e-24 VSSHKQKDDLERQIDNVKTYLLAKGQPFEVISDIGSGIDYKKKGLQELIRRISQNQVEKV VVLYKDRLLRFGFELIEYIASLYNCEIEIIDNTEKSEQQELVEDLVQIITVFSCKLQGKR ANKAKKLIRELIQEETDGKSHKSNVDTKQCTEN Prediction of potential genes in microbial genomes Time: Sat May 28 19:36:20 2011 Seq name: gi|226332969|gb|ACII01000050.1| Ruminococcus sp. 
5_1_39B_FAA cont1.50, whole genome shotgun sequence Length of sequence - 8785 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 197 - 237 6.7 1 1 Tu 1 . - CDS 250 - 1497 975 ## CDR20291_2878 hypothetical protein - Prom 1588 - 1647 3.0 2 2 Op 1 . - CDS 1660 - 2544 286 ## gi|253578673|ref|ZP_04855944.1| predicted protein - Term 2552 - 2592 5.3 3 2 Op 2 . - CDS 2603 - 2803 171 ## gi|253578674|ref|ZP_04855945.1| predicted protein - Prom 2835 - 2894 7.9 - Term 2967 - 3006 -0.5 4 3 Tu 1 . - CDS 3064 - 3993 1053 ## Bmur_2337 auxin efflux carrier - Prom 4024 - 4083 4.7 + Prom 4049 - 4108 10.1 5 4 Op 1 1/1.000 + CDS 4137 - 5072 758 ## COG0679 Predicted permeases 6 4 Op 2 . + CDS 5124 - 5849 412 ## COG0583 Transcriptional regulator - Term 5700 - 5748 9.3 7 5 Tu 1 . - CDS 5837 - 7246 873 ## COG0534 Na+-driven multidrug efflux pump - Prom 7288 - 7347 7.8 + Prom 7281 - 7340 5.7 8 6 Tu 1 . + CDS 7411 - 7605 324 ## gi|253578679|ref|ZP_04855950.1| predicted protein - Term 7521 - 7566 7.4 9 7 Op 1 . - CDS 7619 - 7990 280 ## EUBREC_3448 hypothetical protein 10 7 Op 2 . - CDS 8007 - 8585 291 ## COG2135 Uncharacterized conserved protein 11 7 Op 3 . 
- CDS 8594 - 8785 110 ## EUBREC_3447 hypothetical protein Predicted protein(s) >gi|226332969|gb|ACII01000050.1| GENE 1 250 - 1497 975 415 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2878 NR:ns ## KEGG: CDR20291_2878 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 244 379 29 163 343 87 40.0 9e-16 MTLWICLGTGFTVNAAQREDTKLLKIYDGKSIEASQIQFKAEVATPDSNRIDIYRSEKPV SQGGKLEKLDSFRVIGDFWEVLGDNLYMNYGDNFKVQCYSHDTCVEGNLTFVDTTAKAGH QYSYRLVYGDMVYDPDSGEYSVEIVSNIIDVKANLQTPELYKCYSTDNKTVNLSWSYIAQ ADGYRVYRYDNGKWSFLKNVKKRNVISTTDKNVQAGKTYQYRVLAYRVIKGKNIYSSKSK ARKITLKTATVKGDYQYGSVYGPYLDAQHLAQVRSVVQSFKINYIRKGMSDYDRVLTAYN YLRSNCSYAYKGWQYNYANTAWGALVYGEAQCSGYARAMKALCDAIGVDCRYVHADSKAS NPSHQWNQVRVGGKWYILDAQSGGFLLGSRTWKKKAGMSWDTKGLPTCSVTDYKK >gi|226332969|gb|ACII01000050.1| GENE 2 1660 - 2544 286 294 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578673|ref|ZP_04855944.1| ## NR: gi|253578673|ref|ZP_04855944.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 294 1 294 294 521 100.0 1e-146 MYFSGTKKKVKLASLKPDTDTTKITLKVKEKKEFPCTGISYLSNVKYSSDNKKVAKVNSS GEIQAKKAGTTYIRCKVKQYGDTYNLVCKVTVKKEGNIKDPTSAYRNLIQSYEKKYGEAQ LNEQKQFWTGLCYAKLLDFNNDGINELILTYQTERYNKDKVQYHVELWKYGGKSAKRVTS RISWSGNNMPYFGGLGICKYNGKYLLELTGNACGDNYYYGTKKDGSVGLVHKFIWKGDAM EGDWYMDGKKTSGNMYETYYKKYHANATWYSFAQSSNNNLIRKELSNTKQKLGM >gi|226332969|gb|ACII01000050.1| GENE 3 2603 - 2803 171 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578674|ref|ZP_04855945.1| ## NR: gi|253578674|ref|ZP_04855945.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 66 1 66 66 94 100.0 3e-18 MEKRYTIGVMIGNANSPHTMNLMQGIFHAAKSMDVNVLFFLGFTVVITINLILAKIRKMI MIISSM >gi|226332969|gb|ACII01000050.1| GENE 4 3064 - 3993 1053 309 aa, chain - ## HITS:1 COG:no KEGG:Bmur_2337 NR:ns ## KEGG: Bmur_2337 # Name: not_defined # Def: auxin efflux carrier # Organism: B.murdochii # Pathway: not_defined # 3 305 4 305 307 105 25.0 3e-21 MNVLEIVLPVLVMIVVGMLCRKWKILTRDGINNMKVLVTNVMLPVAIFHALATAEYNKET GILILIMFVMLVVSFGLGFLLKPFLKGTYQKYLPFMVSVYEGGLMAYPLYTSLCGSENLS RIAVLDIAGLLFGFSVYMGMLGQVENGEKIDVKKLFFSALKTPAFIASILGIIAGLSKVI LAVLDSPFGGVYQSVENILTTSVTAIILLVVGYSMELNAKLLKPCIATIILRVLLQAVMI AGVLVAVHYLVGDDRLVNLAIISYMSAPATFSMQTFMKDEEGSAYVSTTNSMYCLVSILV YIIMAVTIY >gi|226332969|gb|ACII01000050.1| GENE 5 4137 - 5072 758 311 aa, chain + ## HITS:1 COG:CAC2949 KEGG:ns NR:ns ## COG: CAC2949 COG0679 # Protein_GI_number: 15896202 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 70 303 69 300 305 100 28.0 4e-21 MNTFSILTAQIGMFVIYMLAGVILIRTRVMNRENLEVISKFVIKLALPVMIFINTVNGVE RKTLFHSLSIFLIAGIMYICLFLLSYISGIFFHLHGNHRQLYSAMSVFGNVGFMGIPIVT SIYPENGMLYICVFTIIDQLMLWTAGVRLTSGTDSQKNRFDFRKLINPVTVSILLAVICV LTGIRLPDVLNNSLQKIGQTATPLAMIYLGGVFACIDVLKNIQRLDYYGIVILKMLLFPL FFYVLLGYLPVSSEIRTTMALTSAMPVMSSVVMMANTYGSDGDYAMGGILVTTICSVFTL PFIYWIFHFIG >gi|226332969|gb|ACII01000050.1| GENE 6 5124 - 5849 412 241 aa, chain + ## HITS:1 COG:aq_1038 KEGG:ns NR:ns ## COG: aq_1038 COG0583 # Protein_GI_number: 15606329 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Aquifex aeolicus # 4 238 75 305 306 82 27.0 5e-16 MVYQLYDSAFRILDIRNNLLENFTGSRKQIIDLAASTIPSSYLLPELLAGFGRMYPDVYF HSWQTDSSGAINRVLDGTVDLALTGQNTGDDSCVFIPFCQDNMVIATPVNDHYLNLKQKE QPVVFQDFIKDPVIIREKGSGTKKEMDIFLENAGIEPSSLNVVARMNDLESIKKSIVNGL GFSILSARSVVDLQKTKQILLFPLEESAHKRSFYIVYSKNRILKSHVRQFIHYVKEYYQT V >gi|226332969|gb|ACII01000050.1| GENE 7 5837 - 7246 873 469 aa, chain - ## HITS:1 COG:FN0667 KEGG:ns NR:ns ## COG: FN0667 COG0534 # 
Protein_GI_number: 19704002 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 18 417 6 405 426 253 37.0 8e-67 MNGGMRVAAYEKSKSVKLKSISTMTEGTIWRQIFLFSIPLILGNLLQQMYNTVDSIIVGN FVGSNALAAVGSSTSLICLLIAFSMGASAGAGVIVSQFYGAGDENGVQRSAHTALMLALI LGIVLTIAGIVFSPAILRWMRTPEEVMNQSVLYLRIYSYGLVFNVIYNMAAGILNAVGNS RRSLMYLAVASFSNIFLDLWLIGGMHMGVEGAAIATDISQVLSCIFALWFLMRVPDIYRI NPKKLSIDRTMAGRIIQVGLPTAIQNTVISFSNVLVQSGVNGFGASAMAGFGAYLKVDGF NILPVMSFSMAATTFTGQNYGAGKTDRIRKGLRVTLGMSVLYTIVTGILLLTFSRPIIGI FSSDKEVIYYGAQAMKYFCPFYFLLGIMHSLAGTIRGTGKTVPPMVVMLLALCLFRILWI QFVVPHFTSIDIIYVLYPVSWAVGMVLMLIYAWKGNWMPKKYQNINQTV >gi|226332969|gb|ACII01000050.1| GENE 8 7411 - 7605 324 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578679|ref|ZP_04855950.1| ## NR: gi|253578679|ref|ZP_04855950.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 64 1 64 64 121 100.0 2e-26 MAFDITPYVDMSPKKVRELIRKGVIDFPTAGMCRGYAQANMDAGISSMTASISIVRNEIL FLFM >gi|226332969|gb|ACII01000050.1| GENE 9 7619 - 7990 280 123 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3448 NR:ns ## KEGG: EUBREC_3448 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 120 1 120 126 213 86.0 1e-54 MTKDEWYKQLFERLENSRFRSRFHLKQNDIDYIHEKGLDTIRKHAEDFITKREAPAYIPN DGKQTPMRGHPVFIAQHATATCCRECIRKWHKIQPGRELSQVQQEYLVDVIMTWIERELE NAR >gi|226332969|gb|ACII01000050.1| GENE 10 8007 - 8585 291 192 aa, chain - ## HITS:1 COG:AGc1954 KEGG:ns NR:ns ## COG: AGc1954 COG2135 # Protein_GI_number: 15888401 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 164 1 188 253 77 32.0 1e-14 MCSRYYIDLDMMEEISRVVQNIDGRIRLTQGDIRPTDVAPVIGQSEHRLELGMCRWGYPL SKGKNPVINARVETVMDKPSFQNGILYHRLLIPAGGFYEWNSLKEKSTFTRPDSSVLYMA GFCDWFENERRFVILTTTANNSMKKIHDRMPLILEREQITDWFDNNKMPVLLRQTPTLLN RQTEYEQQSLFS >gi|226332969|gb|ACII01000050.1| GENE 11 8594 - 8785 110 63 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3447 NR:ns ## 
KEGG: EUBREC_3447 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 63 181 243 243 100 73.0 1e-20 VNQKGVEYGWDVAVYSSPEHIYGYDHVTSCYKEDPRTSWGKIVDYMKQLYPEAKDMQIRK ILK Prediction of potential genes in microbial genomes Time: Sat May 28 19:36:53 2011 Seq name: gi|226332968|gb|ACII01000051.1| Ruminococcus sp. 5_1_39B_FAA cont1.51, whole genome shotgun sequence Length of sequence - 1870 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 567 345 ## EUBREC_3447 hypothetical protein - Prom 592 - 651 10.6 2 2 Tu 1 . + CDS 961 - 1824 794 ## COG1680 Beta-lactamase class C and other penicillin binding proteins Predicted protein(s) >gi|226332968|gb|ACII01000051.1| GENE 1 3 - 567 345 188 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3447 NR:ns ## KEGG: EUBREC_3447 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 186 1 186 243 293 76.0 2e-78 MGNEAGEWIIHGVTWDDPKCIHDVDEAVEYINRVGFLPLFKNEIPGFSLEERTVAEYWWS DDPVRDPWRWRIAIAKRHDVLYGKFFAKKAGFISKKWLPVFANYRRDGYDFDALFEDEKA PIKHKNIMDHFMENDAEIYSYELKKLAGFGKDGEKGFDGAITSLMMQTYLCNCDFRKRNQ SKRRGVRL >gi|226332968|gb|ACII01000051.1| GENE 2 961 - 1824 794 287 aa, chain + ## HITS:1 COG:lin1811 KEGG:ns NR:ns ## COG: lin1811 COG1680 # Protein_GI_number: 16800879 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Listeria innocua # 3 267 15 303 323 107 30.0 3e-23 MDISDFVKKAKEYNVLGIKISKDNELAAEWYSEPECRRNIYSATKSFTSCAVGFAVQEGL IDLNEKLTEAFSGDLPECIDENLKEATVRDLLTMCLGQEKGSLMGEQRPLYEEDNWVKMS LAIPFKYKPGTHFVYNNVGPYLAGILVQRRSGCDLVSYLTPRLFSKIGIKRPTWETDPLG NSFGAGGLFLTLSELHKFGLFYLNKGKWNGKQILSEKWIEESTKAADVGYYGYLFWRGEY NSFRADGKYSQISMILPKKNAVVSFVSECRRGDELLKTVYELVCAKL Prediction of potential genes in microbial genomes Time: Sat May 28 19:36:59 2011 Seq name: 
gi|226332967|gb|ACII01000052.1| Ruminococcus sp. 5_1_39B_FAA cont1.52, whole genome shotgun sequence Length of sequence - 9660 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 6, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 59 - 118 4.9 1 1 Tu 1 . + CDS 198 - 1043 438 ## COG2267 Lysophospholipase + Term 1225 - 1271 0.3 - Term 1206 - 1266 16.5 2 2 Op 1 . - CDS 1302 - 1643 192 ## Cyan7425_0300 hypothetical protein 3 2 Op 2 . - CDS 1622 - 2083 233 ## Cyan7425_0300 hypothetical protein - Prom 2108 - 2167 3.8 4 3 Tu 1 . - CDS 2187 - 3194 685 ## COG0618 Exopolyphosphatase-related proteins - Prom 3227 - 3286 6.6 - Term 3662 - 3711 2.2 5 4 Tu 1 . - CDS 3739 - 6072 2375 ## COG0370 Fe2+ transport system protein B - Prom 6146 - 6205 5.2 - Term 6203 - 6257 9.4 6 5 Tu 1 . - CDS 6385 - 7044 720 ## gi|253578690|ref|ZP_04855961.1| predicted protein - Prom 7120 - 7179 8.8 - Term 7125 - 7167 3.2 7 6 Tu 1 . - CDS 7268 - 9322 1816 ## COG1200 RecG-like helicase - Prom 9346 - 9405 5.8 Predicted protein(s) >gi|226332967|gb|ACII01000052.1| GENE 1 198 - 1043 438 281 aa, chain + ## HITS:1 COG:PA3301 KEGG:ns NR:ns ## COG: PA3301 COG2267 # Protein_GI_number: 15598497 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Pseudomonas aeruginosa # 16 281 28 302 316 88 25.0 2e-17 MHLEIYAPEYKGDLKGLIQICHGMTEYMGRYVEFAKYFTSRGYIVFGNDIISHGHSTTPR SSCLYINDWNDTVKDMVSAREYVVRKYPNLPIYLLGFSLGSFIVRTNADLTPYKKEILIG TGAQSAFLMRIMRTWIGKKYTGKMSCASDKIYDLMFGTYGKKFKGRPANYWLLTDNEKRR EYADDSLVRQDVSPAFFCEFSKGMECASRNLKNPNNTIPTLFLYGKKDPVSGFGKGVRKV YKAYKENNPDTEIRSFPGTHDILHDSGYESVFKAIADYLRK >gi|226332967|gb|ACII01000052.1| GENE 2 1302 - 1643 192 113 aa, chain - ## HITS:1 COG:no KEGG:Cyan7425_0300 NR:ns ## KEGG: Cyan7425_0300 # Name: not_defined # Def: hypothetical protein # Organism: Cyanothece_PCC7425 # Pathway: not_defined # 5 113 243 348 350 73 37.0 2e-12 
MPGTEDVEKPVYRKEPVYATKYYYEIDKWTVVDTAKSSGNDQNPSWPEPKLKDGQRTGAE EEHYFVTATYEKKKGKTETGRYEMDFSQWKELKKGEKIELKIDAAGFAEINQK >gi|226332967|gb|ACII01000052.1| GENE 3 1622 - 2083 233 153 aa, chain - ## HITS:1 COG:no KEGG:Cyan7425_0300 NR:ns ## KEGG: Cyan7425_0300 # Name: not_defined # Def: hypothetical protein # Organism: Cyanothece_PCC7425 # Pathway: not_defined # 1 97 1 100 350 63 37.0 2e-09 MGKIIEGLWDCPFCGNKRIRAGQKTCPDCGHPQDENTKFYMPDEIKYVSEEEAEKISRNP DWQCSFCGSLNSDDLNVCKNCGATKEDSERNYFEMRQQEEEKKRKKEEKKESCQKNIPQN TPKKKPLLRRVLLILGIFAAIIFGMMSCLAPKM >gi|226332967|gb|ACII01000052.1| GENE 4 2187 - 3194 685 335 aa, chain - ## HITS:1 COG:lin1610 KEGG:ns NR:ns ## COG: lin1610 COG0618 # Protein_GI_number: 16800678 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Listeria innocua # 15 322 8 299 311 98 27.0 1e-20 MTEFNFDTVFDETMQTHNLDTVVILGHLNPDGDAAGSVMSLAHYIHVNYPQYRVFPYLAK TLERGPKKMVVEDKIFVPFEMPDISGKQYCVIVCDNATLERMIGLEYYQNAAASIVVDHH ASNEGYGDVNWTKVSEACAENVYHMLSSELLLNAAQKEAYPNAADYIYLGILHDTGGLAR ANQGIFQAVADLMAMGVDHGRIMKTLHSDTLDILYKRADILHGAVRAMDGRVAYVIMGQK EIAEKDITYEDIHCISTILRDCDDIELGFTMYEEEENGWRCSFRSDGKWINVNELLKPFG GGGHISAAGLKYQTDNVEELKKQILDRVAEMKKKQ >gi|226332967|gb|ACII01000052.1| GENE 5 3739 - 6072 2375 777 aa, chain - ## HITS:1 COG:CAC1031 KEGG:ns NR:ns ## COG: CAC1031 COG0370 # Protein_GI_number: 15894318 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Clostridium acetobutylicum # 116 777 6 679 683 514 39.0 1e-145 MTLKDLEIGKSAVITSVGGKGALRQHFLDMGMIPGAEVTVVKFAPMGDPMELQVHGYELT LRLAEAEQIEIEQIPKRSRSHAGIEAVSDLAHPGLGEDGKYHSREDEHPLPEGTTLTYAL VGNQNCGKTTLFNQLTGSSQHVGNFPGVTVDRKTGSIKGHPDTEITDLPGIYSMSPYSSE EIVSRNFVLDEKPSAIINIVDATNIERNLYLTMQLLEMDIPMVVAMNMMDEVLGNQGSID VNKMESLLGVPVIPISAAKNEGVDELVDHALHIAKYQEHPLRQDFCDKSDHDGAVHRCIH AVMHLIEDHAAQADIPTRFAATKAIEGDPLILEKLKLDTNETEMLEHIVQQMETERQVDR SAAIADMRFDFIERLCEQTVVKPKESKERIRSEKIDKILTGKYTAIPCFIGIMVLVFYLT 
FNVIGAWLQGLLEIGIDKVSALVEAALTAANVNSAMHSLVIDGIFTGVGSVLSFLPIIVT LFFFLSLMEDSGYVARVAFVMDKLLRKIGLSGRSIVPMLIGFGCTVPAVMATRTLTSERD RKMTILLTPFMSCTAKLPIYSFFVSAFFPKKGGFIMAGLYLLGILVGILAAFLYNRTLFK GDPVPFVMELPNYRLPGARNVGQLLWEKAKDFLQRAFSIILLASMIIWFLQSFDLHLNLV KDSADSILAMVAGVLAPIFKPIGLGDWRICTALISGFMAKESVVSTLEVLFSGNIASVLT PLAAASLLVFSLLYTPCVAAIASIKREMGSKWAVGVVVWQCVIAWVAALIVHLIGMI >gi|226332967|gb|ACII01000052.1| GENE 6 6385 - 7044 720 219 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578690|ref|ZP_04855961.1| ## NR: gi|253578690|ref|ZP_04855961.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 219 1 219 219 166 100.0 1e-39 MATKATTADLNTSTAKTTEVKEAPAKTTAEEVAVNKEANKKETAETETVKKETTAKKPAA RRTRKTAEKKSAEAEKETEKKETAKAADTVEKKPAETEKEAEKKETAKTTKTAAKPTKTT KRTTKKAPAKTAEKPAEETAAEKKTTRRTTKKEATAEVYIQFMGREVYTKNILSNVRRIW TDELGKKEKDLKDIKIYIKPEENKAYYVINGDITGSVWI >gi|226332967|gb|ACII01000052.1| GENE 7 7268 - 9322 1816 684 aa, chain - ## HITS:1 COG:CAC1736 KEGG:ns NR:ns ## COG: CAC1736 COG1200 # Protein_GI_number: 15895013 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Clostridium acetobutylicum # 1 653 3 650 678 495 42.0 1e-139 MSTSLRTLKGVGEKTEKLFAKIGVTDMESLLSYYPRNYDAYEEPVEIRSLEEGAVVAISV AVITGVYVNQVRNLQVITTTVADLTGKISVTWFNAPYLRSAVRKGSRFVLRGRVVRKQGK LQMEHPEIFTPAAYEEILHSLQPIYGLTAGLSNKTIVKLIHQVLDEQKLQTEYLADEYKE RYHLADRNFAIPAIHFPKNMQELLAARRRLVFDEFLLFILAVQSLKEKTEEAPNAFPMHP VWTTEQIIESLPYDLTKAQLNVWHEIERDLSGQALMSRLVQGDVGSGKTILAFLAMIMTV ENGYQAVLMAPTEVLARQHFQAMEKLLQEQNIDFGHPVLLTGSDTAKEKREKYVLIASKE ANLVIGTHALIQEKVQYNNLGLVITDEQHRFGVKQREALTTMGNPPNVLVMSATPIPRTL AIIIYGDLDISVIDELPAQRLPIKNCVVDTSYRPKAYSFMEKQIRQGRQVYVICPMVEES EGMDGENVLDYTLKLRNVFSPDIKIASLHGKMKAKEKNVIMEAFAAGEIQILVSTTVVEV GVNVPNATVMMVENAERFGLAQLHQLRGRVGRGEYQSYCIFMQGNGAKEISKRLQILNKS NDGFYIAGEDLKLRGPGDLFGIRQSGLLEFKLGDIYQDADILKAASETASEILSLDGDLS LPQNEELQRRLSAYMKEDLQNLGL Prediction of potential genes in microbial genomes Time: Sat May 28 19:37:22 2011 Seq name: gi|226332966|gb|ACII01000053.1| 
Ruminococcus sp. 5_1_39B_FAA cont1.53, whole genome shotgun sequence
Length of sequence - 17883 bp
Number of predicted genes - 19, with homology - 19
Number of transcription units - 7, operones - 5 average op.length - 3.4

N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Op 1 9/0.000 - CDS 1 - 1618 2068 ## COG1461 Predicted kinase related to dihydroxyacetone kinase
2 1 Op 2 . - CDS 1638 - 2018 524 ## COG1302 Uncharacterized protein conserved in bacteria
        - Prom 2061 - 2120 5.5
        + Prom 2070 - 2129 6.8
3 2 Tu 1 . + CDS 2206 - 2394 232 ## PROTEIN SUPPORTED gi|160881022|ref|YP_001559990.1| ribosomal protein L28
        + Term 2420 - 2462 6.5
4 3 Op 1 . - CDS 2728 - 3099 489 ## COG3874 Uncharacterized conserved protein
5 3 Op 2 . - CDS 3102 - 4130 479 ## EUBELI_00953 hypothetical protein
6 3 Op 3 . - CDS 4135 - 4446 358 ## gi|253578698|ref|ZP_04855969.1| conserved hypothetical protein
        - Prom 4477 - 4536 1.5
        - Term 4501 - 4540 5.4
7 4 Op 1 1/0.000 - CDS 4551 - 5006 392 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
        - Prom 5043 - 5102 4.9
        - Term 5145 - 5182 -0.6
8 4 Op 2 19/0.000 - CDS 5187 - 6674 1497 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2
9 4 Op 3 . - CDS 6598 - 7971 1427 ## COG0772 Bacterial cell division membrane protein
10 4 Op 4 . - CDS 7977 - 10172 1675 ## COG0826 Collagenase and related proteases
        - Prom 10206 - 10265 5.3
        - Term 10251 - 10300 5.0
11 5 Tu 1 . - CDS 10301 - 10708 510 ## EUBREC_1658 hypothetical protein
        - Prom 10759 - 10818 4.5
12 6 Op 1 29/0.000 - CDS 10843 - 11847 888 ## COG2255 Holliday junction resolvasome, helicase subunit
13 6 Op 2 .
- CDS 11927 - 12535 778 ## COG0632 Holliday junction resolvasome, DNA-binding subunit - Prom 12566 - 12625 4.3 - Term 12618 - 12665 10.2 14 7 Op 1 12/0.000 - CDS 12671 - 13462 904 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 15 7 Op 2 3/0.000 - CDS 13476 - 14051 726 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA 16 7 Op 3 13/0.000 - CDS 14058 - 14768 875 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 17 7 Op 4 12/0.000 - CDS 14768 - 15376 952 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 18 7 Op 5 12/0.000 - CDS 15373 - 16350 1167 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 19 7 Op 6 . - CDS 16366 - 17682 1552 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC - Prom 17732 - 17791 6.4 Predicted protein(s) >gi|226332966|gb|ACII01000053.1| GENE 1 1 - 1618 2068 539 aa, chain - ## HITS:1 COG:CAC1735 KEGG:ns NR:ns ## COG: CAC1735 COG1461 # Protein_GI_number: 15895012 # Func_class: R General function prediction only # Function: Predicted kinase related to dihydroxyacetone kinase # Organism: Clostridium acetobutylicum # 1 539 1 529 547 449 47.0 1e-126 MNNKTIDARILSRMFLAGAKNLEAKKEWINELNVFPVPDGDTGTNMTMTIMAAAAEVGSL GEPDMESLAKAISSGSLRGARGNSGVILSQLLRGFTRSVKNSKELDAIDIAAAMEKGVET AYKAVMKPKEGTILTVAREAAAKAVELAETAEDLDTFFQSVIAHAEETLAKTPEMLPVLK EAGVVDSGGQGLLEVYHGAYDGFLGKEIDYTQFDKAKGPAVTKIDAQTEADIKFGYCTEF IILLNQPMSEETEHEFKKFLMSLGDSIVLVADDEIVKVHVHTNHPGQAIEKALTFGALSR MKIDNMREEHQEKLIKDAEKMAKEQAEQEEKEEAAQPPKEVGFISVSVGKGMTEIFKELG VDYLIEGGQTMNPSTEDMLNAIAKVNAKTIYIFPNNKNIVLAANQARDLTEDKEIVVVPT KTIPQGITAMISYVPEKNSAENTEAMLQAIEHVKTGQITYAVRDTRIDDKEIHEGDMMGI GDHGILAVGKDRMEVAKETVAQMVDDESEVISIYYGADTEEAEAEELATALEEAYPDCD >gi|226332966|gb|ACII01000053.1| GENE 2 1638 - 2018 524 126 aa, chain - ## HITS:1 COG:BH2499 KEGG:ns NR:ns ## COG: BH2499 COG1302 # Protein_GI_number: 15615062 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus 
halodurans # 8 126 1 119 120 91 40.0 4e-19 MSIGGNHMNGRIDSGLGQIIIDTDVIATYAGSVAVECFGIVGMAAVSMKDGLVKLLGKNS LKHGISVRITEDNKIRLNFHVIVAYGVNISTIADNLVSNVKYKVEAFTGMEIDKIDIYVE GVRAID >gi|226332966|gb|ACII01000053.1| GENE 3 2206 - 2394 232 62 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881022|ref|YP_001559990.1| ribosomal protein L28 [Clostridium phytofermentans ISDg] # 1 62 1 65 65 94 72 7e-19 MAKCAICEKAAHFGNNVSHSHRRSNKMWKSNVKSVKIKTAEGGARKTYVCTSCLRSGKVE RA >gi|226332966|gb|ACII01000053.1| GENE 4 2728 - 3099 489 123 aa, chain - ## HITS:1 COG:CAC3264 KEGG:ns NR:ns ## COG: CAC3264 COG3874 # Protein_GI_number: 15896509 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 107 1 115 133 57 36.0 9e-09 MANENNFKETVNSLFKGMDSFISAKTVVGEAIHVGDTIILPLMDVSFGVGAGAFSGEKKD NGGGGMAGKMTPCAVLVIQNGTTKLVNVKNQDGLTKILDMVPDFVDRFTSGKGDEESVGE SEK >gi|226332966|gb|ACII01000053.1| GENE 5 3102 - 4130 479 342 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00953 NR:ns ## KEGG: EUBELI_00953 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 336 1 327 327 83 24.0 1e-14 MLHILWLILKFILIILGILLGLLLLAVLLVLFCPVRYKASAVKAEGGWKQTEGEGIVSWL FHGISLRAEWKDQQMEISFHLFGIPVDKLLKKRQEKRAVTKKSGKNPEKSGAEKVSVSGT DGNREEEHKKSVVDHPKSENMETEDLKTDRNSASDTSADTERIRNSQISGQQELNADDTD QNSLKKRQFFIKRIYDRIRNIFQLIRTKLQNIRRTFGKIKKNVSWWKAFIEHPRVTAARK LVWKHGKFLLKHIFPTRIEGQVTFSFEDPALTGAALAVLGMTIPFHKNCIQINPRFDSEN YLYGNIRAKGRIYGFVPVRAAVSIYFNKNIKYVIKRWKTRRA >gi|226332966|gb|ACII01000053.1| GENE 6 4135 - 4446 358 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578698|ref|ZP_04855969.1| ## NR: gi|253578698|ref|ZP_04855969.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 103 1 103 103 195 100.0 9e-49 MLKWQENFLAGESVKDPEKIKKKLNSGKPVLGIYLLTLAENPVNLMDIIPAAMLIQKSFY GICPKIIGMAKGKEEALEMVRSLIDEMYRETGTFATAEYIENR >gi|226332966|gb|ACII01000053.1| GENE 7 4551 - 5006 392 151 aa, chain - ## HITS:1 COG:aq_158 KEGG:ns NR:ns ## COG: aq_158 COG0494 # Protein_GI_number: 15605731 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Aquifex aeolicus # 1 133 1 127 134 79 37.0 3e-15 MIEATSCGGVVIHRGKILTLYKSYRNRYEGWVLPKGTVEPGETHEQTALREVMEEAGVRA TIVKYIGKSQYNFTVPEDVVFKEVHWYLMTADNYHSKPQREEFFVDSGYYKFHEIYHLLR FSNEKQIVERAYQEYLEMRSNGLWGKQHKYY >gi|226332966|gb|ACII01000053.1| GENE 8 5187 - 6674 1497 495 aa, chain - ## HITS:1 COG:CAC0506 KEGG:ns NR:ns ## COG: CAC0506 COG0768 # Protein_GI_number: 15893797 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 42 471 15 460 482 265 34.0 2e-70 MRDIEEKERRRELKKRQKRKAREISEKRRPRNREYTLVSCFFVLIFVSMIGYLIYFNYAK SDDFINSPYNTRQDTFSDRVVRGKIISADGQVLAQTNVYEDGTEERTYPYANMFAHVVGY DTNGKSGLESEANFQLLTSHEFFLNQMKNEFKNQKNTGDSVNTTLNADLQSTAYNALGDR RGAVVAIEPSTGKILVEMSRPDFDPNTISQNWDTLVNDSNDSSLLNRATNGAYPPGSTFK VVTALDYFRTKGSLEGFSYLCEGSITREDHTIRCYGGTVHGQEDFYSAFANSCNSAFAEI GTMLGGSSLKKTAEDLLFNKSLPLSSYKKSTFTLNGKSSVAEVMQTAIGQGNTLVSPMHM ALITSAIANGGELMKPYLIDSVVSSDGETVKTTEPENYKRLMTTNEANILGKLMKGVVEN GTASALNGRGYTVAGKTGSAEFDENGSSHSWFIGYSNVDDPDLVVAIIVENGGTGSEAAV PIAEQIFDAYYYDKQ >gi|226332966|gb|ACII01000053.1| GENE 9 6598 - 7971 1427 457 aa, chain - ## HITS:1 COG:CAC0505 KEGG:ns NR:ns ## COG: CAC0505 COG0772 # Protein_GI_number: 15893796 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Clostridium acetobutylicum # 41 420 8 393 400 187 33.0 3e-47 MINVIVELSKYVILTLMVIYAFHCFYLVKQQSEEERNESLRQQLMLIFFMDFTAFLVIYL 
KTGKFQVVLFYVEMMAFFAGIQILYRIFYKKASILLLNNMCMLLSVGFIMLCRLDVSTAT RQLVIVAGVNVVALIVPVLIRKMKFLKDLTWVYAGIGIVLLAAVFVLAKTSYGAKLSLMG IQPSEAIKITFVFFMAALLRKGADFSKVVQATVVAGLHVGILVLSRDLGSAVIFFAAYLV MIYVASRNVGYLALGLAGGSAGAVVAYHLFGHVRQRVTAWKDPMAVYQNEGYQIVQSLFA IGTGGWFGMGLCQGSPEKIPVVKNDFIFSAICEELGGIFAICLILVCMSFFLMIVNIALK IKKPFYKLIALGLGTEYAFQVFLTIGGATKFIPMTGVTLPLVSYGGSSVACTVLMLAIIQ GLYILREDEDEEIERYRRKGTAQRVEEAPEAQSPGDF >gi|226332966|gb|ACII01000053.1| GENE 10 7977 - 10172 1675 731 aa, chain - ## HITS:1 COG:MA0538 KEGG:ns NR:ns ## COG: MA0538 COG0826 # Protein_GI_number: 20089427 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Methanosarcina acetivorans str.C2A # 1 347 1 342 855 286 47.0 1e-76 MRREEIELLAPAGSYEGFEAAIGAGADAVYVGGAAFGARAYAKNFGEEELLRAIDTAHIH GRKLYLTVNTLLKNRELSEQLYDYLLPYYKEGLDAVIVQDMGVFKMIREMFPGMHLHAST QMTVTGPEGMKFLEEQGASRVVTARELSLEEIRRMHETSPIEIESFIHGALCYSYSGQCL MSSILGGRSGNRGRCAQPCRLPYQTALCCGENGRRSEKRKAQELCPLSLKDISTIEILPQ ILEAGVTSLKIEGRMKQPGYTAGVTSVYRKYLDLLFEKGAENYRVAEEDKRYLLDLFNRG GSCTGYYQMQNGPSMMAFSNEKKTGDVSPVLRKKKEKIQGTFILFPGSPAILDVSCRGIH GFASVGEVQYAQNQPLTEERIRSQMEKLGNTEYEWENLEIQMDENIFIPMKMLNEARREA LESLENELLKPYKREENNGKKRLSEDAGRKADSPKQKNLPIYISCEEKSTALALYKREGI HGMYLNADAMEVCLDDCVSRGMEMYLSLPHIMRGEMPRELLTRIREWMDRGMTGFLVRNL ETFAVLRKAGLAEKCVLDHSMYTWNDEAADFWKDQKVLRNTVPLELNEGEIRHRDNRDSE MLIYGYLPLMISAQCVHKNLYGCNHKEEGVTLKDRYDKEFTAKCICNPWKTENTDFCIPC YNIIYNSIPYGLLKEKSQIDRLGVSSLRLAFTIEKPQDAVKIYEEFRAVYRDGKNPPKRE YTKGHFKRGAE >gi|226332966|gb|ACII01000053.1| GENE 11 10301 - 10708 510 135 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1658 NR:ns ## KEGG: EUBREC_1658 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 134 1 134 157 96 47.0 2e-19 MAVKNTAKVIIGGKIITLGGYESEEYFQKVASYINKKMDELSAMPGYSRQPMETKHTLIS LNITDDYFKAKKQAEVFEQDLQQKDQEMYDLKHELISLRMQIEEAQKHEQEALEQKSLLE GKNKELEKQIDELLK >gi|226332966|gb|ACII01000053.1| GENE 12 10843 - 11847 888 334 aa, chain - ## HITS:1 
COG:CAC2284 KEGG:ns NR:ns ## COG: CAC2284 COG2255 # Protein_GI_number: 15895552 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Clostridium acetobutylicum # 1 326 1 325 349 438 67.0 1e-123 MEKRIITTDVTEEDFSLEGNLRPQTLDDYIGQEKTKSTLKVYIEAAKQRHDALDHVLFYG PPGLGKTTLSGIIANEMGVHMKVTSGPAIEKPGEMAAILNNLQEGDILFVDEIHRLNRQV EEVLYPAMEDYAIDIMIGKGASARSIRLDLPKFTLVGATTRAGLLSAPLRDRFGVMHHLE FYTHEELKTIIIRSAQVLGVEIDEKGAAEIAKRSRGTPRLANRLLKRVRDFAQVKYDGRI TYDVACFALNLLEVDQYGLDKIDRRILQTMIVNFQGGPVGLETLAAAIGEDSGTLEDVYE PYLLQNGFLNRTPRGRMASALAYEHLGYQMPEKN >gi|226332966|gb|ACII01000053.1| GENE 13 11927 - 12535 778 202 aa, chain - ## HITS:1 COG:CAC2285 KEGG:ns NR:ns ## COG: CAC2285 COG0632 # Protein_GI_number: 15895553 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Clostridium acetobutylicum # 1 200 1 197 201 129 38.0 4e-30 MIAFVHGMAVDMTESSVVVEAGGIGYEIYMTGADLSEIRMGEDVKVHTYFSVREDAMKLY GFRAKDDLQMFKLLLGVNGVGPKAALGVLAGITADELRFAILSDDVKTLSKAPGIGKKTA QKLILELKDKMKLEDAFELKLVHEQERAAVGAGEVSDGRQEAVEALVALGYSSADALRAV RKVTDVSPDDVEGLLKAALKNF >gi|226332966|gb|ACII01000053.1| GENE 14 12671 - 13462 904 263 aa, chain - ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 6 263 5 262 264 197 46.0 2e-50 MNIGAIIAATVLVAAVGLFIGVFLGVAGKKFAVEVDEKEVAVREALPGNNCGGCGYPGCD GLAAAIAKGEAPVNGCPVGGEPVGKVIAAIMGQEVVETTRQVAYVKCAGTCEKTKDNYEY TGVEDCEMMAFIPGGGAKSCTYGCLGFGSCVKACPFGAIDVVNGVAVVDKEACKACGKCV AKCPKHLIELVPYEQTTFVQCSSHAKGKAVTSACEVGCIGCKKCEKTCPNGAITVDNFCA HIDYSKCTNCGACKEVCPRHIIQ >gi|226332966|gb|ACII01000053.1| GENE 15 13476 - 14051 726 191 aa, chain - ## HITS:1 COG:FN1592 KEGG:ns NR:ns ## COG: FN1592 COG4657 # Protein_GI_number: 19704913 # Func_class: C Energy production and 
conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Fusobacterium nucleatum # 19 190 21 192 194 186 61.0 3e-47 MKELLIILVSSAIVNNVVLSQFLGLCPFLGVSKSVETAAGMGGAIIFVITLSSFVTGIIY NAILVPTNLTYLQTIVFILLIAALVQFVEMFLKKTMPSLYQALGVYLPLITTNCAVLGVA LTNVQKEYNVLQGTVNGFATAVGFTISIVLMAGIREKIAYNDIPKSFQGFPTVLLTAGLM AIAFFGFSGLI >gi|226332966|gb|ACII01000053.1| GENE 16 14058 - 14768 875 236 aa, chain - ## HITS:1 COG:TM0247 KEGG:ns NR:ns ## COG: TM0247 COG4660 # Protein_GI_number: 15643019 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Thermotoga maritima # 10 211 7 197 200 204 53.0 1e-52 MKPNTPAERLYNGIIKENPTFVLMLGMCPTLAITTSATNGIGMGLTTTVILAASNLMISL LRNFIPDRVRMPAFIVVVASFVTVVQLLLQGFIPSLYDSLGIYIPLIVVNCIILGRAEAY ASKNKPIASLFDGIGMGLGFTLSITCIGAVRELIGAGSLFGHQILPLADAAAGKAGYEPI TIFILAPGAFFVLAALSALQNKFKIGAAKRGIDPSNPDCGGSCAACGNTMCKGKRG >gi|226332966|gb|ACII01000053.1| GENE 17 14768 - 15376 952 202 aa, chain - ## HITS:1 COG:FN1594 KEGG:ns NR:ns ## COG: FN1594 COG4659 # Protein_GI_number: 19704915 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Fusobacterium nucleatum # 1 197 1 174 177 74 30.0 1e-13 MSNRIIKDTIAITVITLVAGLALGVVQDITADPIAKQEAQAKQDAYKAVFADADSFETVD VDADAMQSYLDENGYAAQSIDETMLAKDASGNELGYAFTVTTSEGYGGDIQFAMGIQDDG TLNGISILSIGETAGLGMRANTDAFKDQFKDKNVDKFEYTKTGATADDQIDALSGATITT NAMTNGVNAGLCAFQYEKGGNQ >gi|226332966|gb|ACII01000053.1| GENE 18 15373 - 16350 1167 325 aa, chain - ## HITS:1 COG:TM0245 KEGG:ns NR:ns ## COG: TM0245 COG4658 # Protein_GI_number: 15643017 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Thermotoga maritima # 11 320 9 315 318 210 43.0 4e-54 MDQMLHVSSNPHVRDKMTTSKIMQLVVLALLPTTLFGIYNFGLRALLVVVITVASSVFFE WIYDKLMHKKNTITDFSAVVTGLLLALNMPASIPLWMPVLGSAFAIIVVKQLFGGLGQNF 
MNPALAGRCFLMISFAGKMTDFAVSDSFRGVVDTVSGATPLAALKQNGFTDSSVSVLHMF IGDIQGTIGETSALAILIGAAILLVFKVIDLKIPLTYIGSFAVFVILYMLGTGKGFDVNY LFSHIFGGGLMLGAWFMATDYVTTPITPRGQYVYGVCLGVLTAVFRLFGGSAEGVSYAII FCNLLVPMIERVTRHPAFGKGGKKA >gi|226332966|gb|ACII01000053.1| GENE 19 16366 - 17682 1552 438 aa, chain - ## HITS:1 COG:FN1596 KEGG:ns NR:ns ## COG: FN1596 COG4656 # Protein_GI_number: 19704917 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Fusobacterium nucleatum # 1 436 7 441 441 342 40.0 8e-94 MGILTFKGGVHPYDGKELSKNSPVVKYLPKGDMVYPLSQHIGAPAAAIVQKGDHVLAGQK IADAAGFVSSPIHSSVSGTVKGIEPRLTATGSMVNSIIIENDQQYESVEFTPARLEDLSK EEILARIKEGGVVGMGGAGFPTQVKLAPKEPDKIDHILVNGAECEPYLTSDYRRMLDDSE KLIEGLRVMLKLFDNAKGYICIEDNKPDCIAKLKEMVKDIPRIEVAELMTKYPQGGERFL IKAVTDREINSAMLPADAGCVVDNVDTVIAVYEAVILGKPVMDRIVTVTGQGIKNPQNFD VLSGTDMAELIEAAGGLTDNVSKVISGGPMMGFAMYDTHVPCIKTSSACLCLEKDDVADA EQTACINCGRCVGVCPGHVIPARLATFAEHGDMESFQKFDGMECCECGCCSYICPAKRPL TQMIKSMRKMVLASRKKK Prediction of potential genes in microbial genomes Time: Sat May 28 19:37:48 2011 Seq name: gi|226332965|gb|ACII01000054.1| Ruminococcus sp. 5_1_39B_FAA cont1.54, whole genome shotgun sequence Length of sequence - 45890 bp Number of predicted genes - 39, with homology - 34 Number of transcription units - 23, operones - 10 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 140 - 961 624 ## COG3359 Predicted exonuclease 2 2 Op 1 . - CDS 976 - 1623 369 ## COG2091 Phosphopantetheinyl transferase 3 2 Op 2 . - CDS 1635 - 1937 435 ## Fisuc_0373 anti-sigma-factor antagonist 4 2 Op 3 . - CDS 1957 - 10941 8430 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins 5 2 Op 4 . - CDS 10955 - 12700 1867 ## COG0534 Na+-driven multidrug efflux pump - Prom 12758 - 12817 7.0 - Term 12755 - 12793 1.1 6 3 Op 1 . - CDS 12833 - 14392 1178 ## Cphy_2085 pyrrolo-quinoline quinone 7 3 Op 2 . 
- CDS 14424 - 14705 116 ##
        - Prom 14770 - 14829 6.3
        - Term 14762 - 14796 -0.5
8 4 Tu 1 . - CDS 14857 - 15237 361 ## gi|253578718|ref|ZP_04855989.1| conserved hypothetical protein
9 5 Tu 1 . - CDS 15354 - 16190 929 ## COG0648 Endonuclease IV
        - Prom 16284 - 16343 6.8
        + Prom 16243 - 16302 7.6
10 6 Tu 1 . + CDS 16450 - 16656 113 ##
        - Term 16439 - 16487 12.5
11 7 Tu 1 . - CDS 16580 - 17041 145 ## gi|253578720|ref|ZP_04855991.1| conserved hypothetical protein
        - Prom 17182 - 17241 9.4
12 8 Tu 1 . + CDS 17522 - 17959 133 ##
        - Term 17683 - 17735 -0.4
13 9 Tu 1 . - CDS 17902 - 18609 358 ## COG4252 Predicted transmembrane sensor domain
        - Prom 18638 - 18697 5.0
14 10 Op 1 . - CDS 18716 - 18847 279 ##
15 10 Op 2 . - CDS 18863 - 20554 1064 ## COG5421 Transposase
        - Prom 20583 - 20642 5.6
        - Term 20701 - 20729 -0.1
16 11 Op 1 3/0.000 - CDS 20755 - 21510 282 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
17 11 Op 2 . - CDS 21524 - 24262 2208 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family
        - Prom 24369 - 24428 9.2
        + Prom 24404 - 24463 6.2
18 12 Tu 1 . + CDS 24693 - 26087 627 ## COG2508 Regulator of polyketide synthase expression
        + Term 26092 - 26139 0.9
        - Term 26086 - 26118 2.3
19 13 Op 1 . - CDS 26212 - 27195 1134 ## COG0385 Predicted Na+-dependent transporter
20 13 Op 2 12/0.000 - CDS 27199 - 28038 894 ## COG0414 Panthothenate synthetase
21 13 Op 3 . - CDS 28148 - 28588 519 ## COG0853 Aspartate 1-decarboxylase
        - Prom 28740 - 28799 6.3
22 14 Tu 1 . + CDS 29567 - 29776 199 ## EUBREC_1224 hypothetical protein
        + Term 29800 - 29856 16.2
        - Term 30164 - 30209 3.9
23 15 Op 1 3/0.000 - CDS 30371 - 30922 169 ## COG5619 Uncharacterized conserved protein
24 15 Op 2 . - CDS 31021 - 33138 1093 ## COG3972 Superfamily I DNA and RNA helicases
        - Prom 33261 - 33320 4.1
        - Term 33259 - 33298 0.2
25 16 Op 1 . - CDS 33427 - 34680 341 ## STER_0748 hypothetical protein
26 16 Op 2 . - CDS 34718 - 34951 137 ## gi|253578735|ref|ZP_04856006.1| predicted protein
27 16 Op 3 . - CDS 34960 - 35367 102 ## gi|253578736|ref|ZP_04856007.1| conserved hypothetical protein
        - Prom 35457 - 35516 4.2
28 16 Op 4 . - CDS 35563 - 35916 263 ## gi|253580918|ref|ZP_04858180.1| predicted protein
        - Prom 35946 - 36005 13.5
29 17 Op 1 . - CDS 36228 - 36473 364 ## gi|153811885|ref|ZP_01964553.1| hypothetical protein RUMOBE_02278
30 17 Op 2 . - CDS 36489 - 38183 1233 ## COG5421 Transposase
        - Prom 38274 - 38333 7.3
31 18 Op 1 . - CDS 38510 - 38947 205 ## Exig_0319 hypothetical protein
32 18 Op 2 . - CDS 38949 - 39206 152 ## gi|291524428|emb|CBK90015.1| HNH endonuclease.
33 18 Op 3 . - CDS 39190 - 40557 777 ## COG3950 Predicted ATP-binding protein involved in virulence
        - Prom 40633 - 40692 9.8
        - Term 40809 - 40853 1.1
34 19 Tu 1 . - CDS 40900 - 41466 106 ## gi|253578742|ref|ZP_04856013.1| predicted protein
        - Prom 41608 - 41667 4.9
        - Term 41538 - 41587 0.6
35 20 Tu 1 . - CDS 41750 - 42049 259 ## gi|253578743|ref|ZP_04856014.1| conserved hypothetical protein
        - Prom 42083 - 42142 6.7
36 21 Op 1 . + CDS 42378 - 42677 200 ## c3658 hypothetical protein
37 21 Op 2 . + CDS 42711 - 43139 299 ## COG3436 Transposase and inactivated derivatives
        + Prom 43469 - 43528 8.3
38 22 Tu 1 . + CDS 43667 - 43795 87 ##
39 23 Tu 1 .
- CDS 44249 - 45829 1245 ## EUBREC_0128 hypothetical protein Predicted protein(s) >gi|226332965|gb|ACII01000054.1| GENE 1 140 - 961 624 273 aa, chain + ## HITS:1 COG:CAC0978 KEGG:ns NR:ns ## COG: CAC0978 COG3359 # Protein_GI_number: 15894265 # Func_class: L Replication, recombination and repair # Function: Predicted exonuclease # Organism: Clostridium acetobutylicum # 2 134 72 203 274 77 33.0 3e-14 MAENANEEETILRIFSQFLQQCDLLISYNGDRFDQPYLEARYEKYGIPSPFTGKQSLDLY LTLKPLKSLLKLSAMKQPCMEEFLGIKDRIYDNGKECIKLYKDFLKKRDAFTADEILGHN LEDVLGLGRIFDMLGYLCIYDGDYEVTYSEFDGDNLILKLKLPCTLPQEFSNGNTDFYLT GKDEEINLIIKTTDGKLKQYYADYKDYYYLPEEDTVIPKSLGSGIDRKHRKAATRNTCYT WFTCSDAFLSSPVQQKQYLTYTLSCLIETLERV >gi|226332965|gb|ACII01000054.1| GENE 2 976 - 1623 369 215 aa, chain - ## HITS:1 COG:BS_sfpm KEGG:ns NR:ns ## COG: BS_sfpm COG2091 # Protein_GI_number: 16081163 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheinyl transferase # Organism: Bacillus subtilis # 3 196 2 206 224 110 34.0 2e-24 MQKLYRININHFEDPLKNERILDLVGIRRREKVIRYRMPDDRKRSLGAGIIIRKILDENG LTESCLRYSDNEKPVVDGLFFNVSHAGDYVVGVLSDCEVGCDIEKNANAPLEVAEHYFYH SELAYIKAAKNKDKAFFTLWTLKESYMKMTGRGMSLPLDSFEVVPTAEGFVLGKSSERPC FFETMEFDGYIFSVSSERAFSLHNPLEIVFDDIYV >gi|226332965|gb|ACII01000054.1| GENE 3 1635 - 1937 435 100 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_0373 NR:ns ## KEGG: Fisuc_0373 # Name: not_defined # Def: anti-sigma-factor antagonist # Organism: F.succinogenes # Pathway: not_defined # 1 91 1 91 98 68 42.0 6e-11 MEIITKTEGNKATMEISGWLDTQTAPQLGEALSGLEDSITSLVFDFTNLEYISSAGLRQV IVAYKKMAQKDGFKIINVSDEVYDVFSLTGFDSKIDIQKK >gi|226332965|gb|ACII01000054.1| GENE 4 1957 - 10941 8430 2994 aa, chain - ## HITS:1 COG:BS_srfAA_3 KEGG:ns NR:ns ## COG: BS_srfAA_3 COG1020 # Protein_GI_number: 16077417 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Bacillus subtilis # 1598 2601 1 1033 1498 521 31.0 1e-146 
MKNINERNNFVFEFRPVTELFEEQVKLHPEKCAVVSGKESFTYAQLNERANRIANALVEK GVRRETIVGVVLERCCDFYAVRQGILKAGGAFAVATPDYPDDRIRYIFEDSRAPFIITTK EIAEERRELFAKLPCTVLLIEELLENENTENPQVKIGEHDLCYCIYTSGSTGKPKGVLIE HINLANFVNPDPKNAETYGYVSRGSVSLSMAAMTFDVSVLEEFLPLTNGMTAVIASDEEI LNPLMLGELIVRNKVDIMTTTPTYLSNMIDLPQLEKAVSQIKVFDVGAEAFPPALYDKIR RVNPDAYIMNGYGPTETTISCTMKVITDSRNITIGTPNGNVKGYIVDKENKILPDGETGE LVIAGLGVGRGYMNLPDKTEAVFIDLNGERAYKTGDLARISPEGEIEFFGRIDNQIKLRG LRIELGEIEEVINSYEGIITSITLPVDNKFLCCYFMADRQINTEELSAYASESLAHYMVP EVFVQLEKMPVTQNGKIDKKALPKPVAQPKNLKEPQTPMQKKIFEIVADVVENDFFGTDT SFYRAGLSSISAMKLCILISEEFGVTVKTSDIHENNTVEKLEKYVMLAPKIRTYEKREVY PLTGSQKGIFAECMKNPESTVYNIPFLFELESSVDVQKLSDAISQMIAAHPYLLTKVYLS DSGEMVQKPCEEAFVPEVVQTTNEQFEKMKDELVRSFKLEKGRLFRAGIYVTEDRKYLFT DFHHILADGNSYDIIFEDIDRAYLGEKLEKESYTGFDAALDEEQQMKEGKYKKAEKYYDS IFEGIETESLPLPDCSGKTPERGYLSMPVDTGESLVLSACEKLGVTPNILFTGVFGILMS RYSNSEESLFATIYNGRNDSRLENTVCMLVKTLPVYCSCDPKVTIQAYMTELSEQILSSM ANDIFPFSDICAKYGINSDLVFAYQAELGDDFPIGDTVARGHDLSPDMSKMPLLIQVREY DHKYVLTAEYRSDMYSEAFVRGMLESYEAAMDSLLKAKYISEVSVLSQNGADKISEFNNT ACEYDRSKTIADMFEELVQTIPDHTAVVFKDKKYTYRELDEISDRLGKYIASQGIGSEDV VSILIPRCEYMAIAPMGVIKAGAAYQPLDPTYPRDRLMYMMEDSSAKLLIVDRELLPLVD GYKGPVLFTDQIWQLEDRDVVLKKPQLHDLFILLYTSGSTGVPKGCMLEYGNITAFCHWF KRYYGIDSECRIAAYASFGFDACMMDIYGAITNGAQLHIIPEEIRLDFIGLQRYFEENGI THSFMTTQVGRQFALEMDCKSLRYLSVGGEKLVPCEPPKDYKFINAYGPTEATIFTTVFE VDKYYPNVPIGKALDNVKLYITDKLGHMLPPGACGELMITGWQVSRGYLNKPEKTAEVYT KNLYDDTPGYEVMYHSGDVARFLPDGNIQIIGRKDSQVKIRGFRIELSEVEEVIRRYEGI RDATVVAFDDPNGGKYIAAYVVSDSTVDINALNDFIKETKPAYMVPAVTMQIDKIPLNQN QKVNKKALPLPEKKAAEIIKPENEVQQILFDCIAEVLGYTDFGITTDIYEAGLTSITAIK LNILISKAFDIVIKTSDIKNHPTIRMMEEFVKTAGKESKREVQESYPLTNTQEGIFIECT ANMGSTIYNIPYLLKLDNKVDLDRLAEAIDSTVEAHPYLKTRLFMDDNGNVLQKRNDGLS YKTPILNGMNRDTLVRPYMLFNEQLFRFEIYRTCDGNYLFLDLHHIVADGTSLAIIINDI NRAYSGEKLEPEGYTSYDLALDNRDALAGDIYKNAENYYKSVFEHAGGSISFYPDKNGAA PTAELYHRETRSISVQDVKEFCKKHGITENVFFISAFGITLGKYNFKKDAVFTTIYHGRN DSRLSETVGMLVKTLPVFCDFSGTAKDCLSGVQKQLIDSMNNDIYPFSQISHEFDIKADA 
MVIYQGDNFAFDTIGGEYAQEEPVSLNMAKAPVSVSISIEKNRFVFEIEYRGDMYHEETI RYLADNLETTAEGILRECDPADIRLMFEEKTQMEDIPEHAGKTFIDLFKEMAARYPDRPA VRDDSGDFTYRELDRMSDYIAQKLTENGFGPEQAAGILCGRTKEYTVAYVGVMKAGGAYV PLDPEYPQSRIEYMLKDSGARNLLVIDQYQNLVEFYDGNVISLDSVPDEAEDFELSAELI SPKPENLAYMIYTSGSTGKPKGVMIEHRNLLNLIEYITLSRNTSPDDIVAEFASFCFDAS VIDLFAPLTAGAVLYILPESIRKDAIAISRYIKEKEITTVTFPTQMGELVTELLEDAPAL KFVTLGGEKFKHYRNRTYQMINGYGPTENTVSSTEFLVDRQYDNIPIGKSQRNVRSYIVD ENLNRLPVGASGELCHAGRQIARGYHNLPEKTASVFVENPFAVCEQESRLYRTGDMVRMK GDGNIEYIGRIDSQVKIRGYRVELGEIEGALLKHELVKNAAVTVIEKGGNKYITAYYTGE AIPEDELKTFLEPLIPDYMMPSFFVSIEEMPVTPGGKIDKKALPMPEVTTNTASYVEPVT AAQRALCEIFEKALGIERVGIEDNFFELGGSSLTASKVAVMCLSKNISLVYADIFKYPTV RELAALVDDDGVTEAAQSKNEFSDYNYNRIQNVISANTEENADRITKEKLGDIMVTGATG FLGIHVLKAFIDNYDGKVYCLVRKGKYESSEKRMMNMLMYYFDDPYKELFESRIICVDGD ITSKEQVTGFSEYKFSTIINCAACVKHFAADDVLERINVQGVENLIDFCKNNGRRLIQIS TVSVAGEGSDGVPPMSRVFCENDLYIGQNITNEYIRTKFLAERAVLEAVSEGLDGKVVRV GNLMSRNSDGEFQINFITNGFLRSLRGYAAVGKFPMGGMHEVAEFSPIDSTALAVLRLAQ TDRRFTVFHACNSHHIYMADLIYAMRSYGFRIDIVRDEDFEAAVKEFAKTGQDSDAVSGL IAYTSHNENEIYTIDYSNRFTAQVLYRLDYKWPVTDDRYLENAIAALDRLTFFD >gi|226332965|gb|ACII01000054.1| GENE 5 10955 - 12700 1867 581 aa, chain - ## HITS:1 COG:BH2163 KEGG:ns NR:ns ## COG: BH2163 COG0534 # Protein_GI_number: 15614726 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 18 404 24 409 464 116 25.0 1e-25 MEERNNRLLNAKLNKYIVPGIMMSLALQLGNIVDTIFVSNLIGVEAMSAVTLSLPVETIV QLTGYCLGIGGSIAVGNMLGKRDKEGASKLFSATFIVTLVVGIIFSICALPAAEPVARLL VSGDGVLTGYTRDYIRVSMLGAPVIGIGLMMVNYLGVENHPELASAYLIAANVINLVLDY IFLRYTPLGITGASLSTVLGFLFAMGIFVLYIRSDKRNLHLILLKPGELGITKEAIITGV PMLIFMAANFVKALGLNTIIMNQLGEEGMAVFTVCDNVLLIVEMLTGGIIGVIPNVAGIL FGEKDYVGIRVLCKKMLKYSYILLAVIFVFIMLFTEEITVMFGSGGGELGSHMVQALRIF ALCVAPYLWNKFIISYYESIEETAIASFATFLENAVVVLPATLVGILVWKQIDGIGIDGI AAGFVATEIITAVAACIFRKIRHKNTSFYIVPDKNPGINLDFSIKSTMEEAQTVHKRIIE FCQEQGASKSKANLAAVCAEEMTVNIIRFGGKTSNWIDINLCLEDDLCRLRIRDNGVNFN PLEYQYDSEDFDIHGIELVKKVSKSMDYIRAIDMNNTIISF 
>gi|226332965|gb|ACII01000054.1| GENE 6 12833 - 14392 1178 519 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2085 NR:ns ## KEGG: Cphy_2085 # Name: not_defined # Def: pyrrolo-quinoline quinone # Organism: C.phytofermentans # Pathway: not_defined # 13 519 59 569 569 435 45.0 1e-120 MVEMKKTKEINFQPHCVDSTKPENLISYTNIEVDGNVLENIGDYKADGDISFNIGQDYTD VDGIVTFRGNSFRDTPSHGYADMTDFRLNKLWSADTGSLSSGSAVWTGSGWTGQPLMMKW SKEVKAHMNMTEKAKADDELVEVIYACMDGYVYFLDLRTGEKTRDPLYLGYTFKGAGALD PRGYPIMYVGAGYNSNEGTAKVFVVNLLDCSVMYTFGDNDEFSLRGSLSFFDGSALVDAE TDTLIYPGENGILYLIRLNTKYDLNSGTLSIDPGRTVKWRYYGTRSSTESFWLGMEDSAA IYKGYIFMTDNGGNLMCLDLNTLQLVWVQDTLDDSNSTPVLEIEKGHLYLYVSTSFRLGW RSYDTATVPVWKIDAETGEIVWHTDYECTTDDGVSGGVQSTIACGKNSLADYIYVTVAKT SDNASGVLACLRKSDGSKAWEDSSSYAWSSPVCVYNKDGSGKVLYCNSTGDIRLLDGKTG KQEDVLSVSDGVIEASPAVYDNYAVVGTRDCKIWGIRLQ >gi|226332965|gb|ACII01000054.1| GENE 7 14424 - 14705 116 93 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNKKKNVDSGRKSARPDYIQKTISQKSSGKRSGKVFFPKTLIKTGGIIIFVLLLGFAGNK FFLSVSKTDSGKVKAVSASGADEEKEGRIIEKN >gi|226332965|gb|ACII01000054.1| GENE 8 14857 - 15237 361 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578718|ref|ZP_04855989.1| ## NR: gi|253578718|ref|ZP_04855989.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 126 7 132 132 222 100.0 6e-57 MQQRQGYQAVEKLRKTLDEKYLWEVILLYAGESFKTYTGLPFTYEVRKGRNGDYTRELWI DRRENSKSLALSSVLLALRNIKKVGAVVDRPKALGDIRGVSYIYGMFYCFGLIDVPDTAR QKMKCL >gi|226332965|gb|ACII01000054.1| GENE 9 15354 - 16190 929 278 aa, chain - ## HITS:1 COG:lin1487 KEGG:ns NR:ns ## COG: lin1487 COG0648 # Protein_GI_number: 16800555 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Listeria innocua # 1 273 1 283 297 218 42.0 1e-56 MLYIGNHTSSSRGYLAMGKQMLANGGNTFAFFTRNPRGGKAKDIDPQDVQAFGQLAEENH FGKLVAHAPYTMNCCAAKENLRDFAREIMADDLRRLEMTPGNYYNFHPGSHVGQGAEVGI SKIAEILNEVLTKDQSTIVLLETMSGKGTEVGRNFEELRQIIDRVELKDKLGVCLDTCHV WDGGYDIVNDLDSVFEEFDRIIGLDRLKAIHLNDSMNPLGSHKDRHARIGEGEIGLEALV RVINHPATEGIPFILETPNDDEGWTHEIALLRNEYKQK >gi|226332965|gb|ACII01000054.1| GENE 10 16450 - 16656 113 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKAAPADSRDGFQNASQKLLSFERDLPEIAIQDDLSVFHLVLPFTFHAPTLTSYSPTNSP AYIPNYYE >gi|226332965|gb|ACII01000054.1| GENE 11 16580 - 17041 145 153 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578720|ref|ZP_04855991.1| ## NR: gi|253578720|ref|ZP_04855991.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 153 2 154 154 265 99.0 1e-69 MIRFFSMFLLYFFSRKRGSLLTFLAIVFGSTVFTRTPGIRQYQLEIFWSWKEILGIGTCG RLGSTTGDGLLQENLLNILLLFPAGILLPGLTGRKLKWWMGLLVGIAVSSAIEISQLLLC RGLFEFDDIIHNSLGCMLGSLLGNRMLRLVHGK >gi|226332965|gb|ACII01000054.1| GENE 12 17522 - 17959 133 145 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKLVTINNALLLKYTGDSEVLQKSTRPYVLVVRLKYKDACYDFAVPIRSNIPASAPKDQY FALPPRPQTRPKNRHGLHYIKMFPVTKQYLVRYRTEGNSFATLIHDIIEKNSKQIVDECQ HYLCIYLYICLIIPAISSATSMTDR >gi|226332965|gb|ACII01000054.1| GENE 13 17902 - 18609 358 235 aa, chain - ## HITS:1 COG:alr1575 KEGG:ns NR:ns ## COG: alr1575 COG4252 # Protein_GI_number: 17229067 # Func_class: T Signal transduction mechanisms # Function: Predicted transmembrane sensor domain # Organism: Nostoc sp. 
PCC 7120 # 105 214 237 342 1047 62 32.0 6e-10 MEMLAIFVNNNPYNFEFYIAVVYDIPKEEHVEKMIGLFENAGLVLRNDLTFFMFMNSMKD RRFNSCAFLGEQDIDLKFENMFDFVEIEDLKKKGYYMQPFGKEKQVFISHSSKDKKDVEM IIPYLNGQDLPVWFDKYSIPVGASITEQVQRGIEESDMVIFWVTDNFLNSNWCQMEMKAY ISRMIQENIRICIVMDDDIEIKKLPLFLRDIKHIRRDHRSVIEVAEEIAGIIKHM >gi|226332965|gb|ACII01000054.1| GENE 14 18716 - 18847 279 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MARTRKTSKEALEEKIEKAEVDVIAAKKKYDQATANLKELLGK >gi|226332965|gb|ACII01000054.1| GENE 15 18863 - 20554 1064 563 aa, chain - ## HITS:1 COG:MA3502 KEGG:ns NR:ns ## COG: MA3502 COG5421 # Protein_GI_number: 20092312 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Methanosarcina acetivorans str.C2A # 29 545 17 517 517 112 25.0 2e-24 MYLKSRVKIPDVHGKLIRLRRGDITYIKYEYDRYYNPEKKYTYPQRATIGKVCADDDTMM IPNEAFLKYFPDAVLSDMEDRTERSSCLRVGTYAAIHKIVSNYSLDKMLGQFFSKRDMAL LLDLASYFIVEESNVAQHYPDYAYNHPLYSSGMKIYSDASVSDFFKSITKDKSTGFLNEW NAKRNRRERIYISYDSTNKNCQAGEIELAEFGHAKADIGSAIINYSVGYDLKNKEPLFYE AYPGSINDVSQLRAMLGRITGYGYKKIGFILDRGYFSRGNIRYLDECGYSFVIMAKGTSS FISDLILENKGTFENKRSCDMNAYGVYGKTVQKTLFEGDEKKRYIHIYHSISKDADERES FENALRERTTLLMSHQNETVEFGSAYEKYFYLHYDKDGVFLYPEEKTTVTEREISLCGYF VIITSERMTAKEALHLYRSRDVSEKLFASDKSFLGNKSMRSHTNEGVEGRIFTQFIALII RNKIYTALQEENEKLEKKQNYMTVPAAIRELEKIEMTRQTDNIYRLDHAVTANQKVILKA FGLDANSIKYFASELSKELKKSE >gi|226332965|gb|ACII01000054.1| GENE 16 20755 - 21510 282 251 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 6 246 4 238 242 113 34 2e-24 MKFENKVAIITGSTRGIGRATAELFAKEGAKVIVVGTKEELGESCVNAIKAAGGEAIFCK TDVTSDESLDNLVKTALDTYGKIDILVNNAGVGGTTANMDQITMDEWNTVLATNLTAPFV LCKKIIPIMEKQGAGTIVNVASMAATAAGRGGLAYTSAKHGLLGLTRQMSLDHGRTGVRI NAVLPGPIATDMIARVLAIPQHPVTMKIGMSPAKRPGEPIEVAQAIAFLASDEASFIHGA ALAVDGGYTIF >gi|226332965|gb|ACII01000054.1| GENE 17 21524 - 24262 2208 912 aa, chain - ## HITS:1 COG:CAC1044_1 KEGG:ns NR:ns ## COG: CAC1044_1 COG1902 # Protein_GI_number: 15894331 # Func_class: C 
Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Clostridium acetobutylicum # 1 357 25 383 413 379 52.0 1e-104 MGTSLAEMDGSPSEDMIAFYEARAIGGAALIIPEIARVNDLHGAGMMRQLSVSKDRHIEG LAKLAETVHKYGTKIFIQLHHPGRETVTALTGGPVVSASAIPCKYLQQETRALSTEEVKA LIQQFISGAVRVKKAGCDGVELHAAHGYLLQQFLSPYTNKREDEYGGSFENRLRMITEII NGIRVQCGPDFPIGVRLSVEEFLDKTGVTEDYIHIQDGVKIAMALEKCGIDFIDVSVGLY ETGSVCVEPISFPQGWRKDLIKAVKDHVSIPVIGVSCIREPQVAESFLEGGIVDFVSMGR SWLADEQWGLKVLEGRENEICKCINCLRCFESLNAWMGAGIPAECAVNPRACRERLYGNA EYDTQEHKVVVVGGGPCGMIAAKTLAERGMKVTLVDRQSELGGTINLAKKPPFKERMGWI ADYYNVEFEKLGVELKLDMEATADKIAEMNPDAVLVATGSKSVIPGSIPGITGENVYTIE DILSGKAGLKNKKVMIIGAGVTGLETAEYLCHEGNTVVLADMLDKVAPNANHTNVADVCG RLKEYNAQFMMAHALKEIKKDGVVLERLNDKINVEVAADAVVLSLGFRPDNHLVEELKEK GIHAQAIGNAVKDGTIAPASRSGFEAARKLFKTSVKTPSFITAPEEMPNFGKISLMKNQE GIYLAYLTDPAAIAKVLPAPLKPFSVPVVTVSVCHVKEPTFADDYYEAILGVYCTYGTQL GLYPIGLVLGGTGAEMAVQCGRDNGSIPKKLGSEFVIRRNNDHVTAQVCRRGTELVNIDL KIGEYNNAMTGMLYQFPEAGKKTYGGGFYFHLDREPDKEGKSHFQNGALLQNLCEYNYHS WEPGFAAIKLQSSIDDPWGELPIRTVIGGAYSSNDLMVHKLNLCEEIDADVLAPYLLTAR YDRTAFMETGRR >gi|226332965|gb|ACII01000054.1| GENE 18 24693 - 26087 627 464 aa, chain + ## HITS:1 COG:CAP0121 KEGG:ns NR:ns ## COG: CAP0121 COG2508 # Protein_GI_number: 15004824 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 64 456 129 538 543 76 21.0 1e-13 MQPSQCILVRGENYLFPYSCTDRATVQAKELFAVFNAAKQFLQTHEQVSLYEELVRKGHS SHSLDEVINSASIKLGYSLVFCDKKFRILSYSTSIPVTDKLWKKNIERGYCTYDFIKNVM SLEAVQGASASIEAVEVSCPESPYRKCSSKVFLNGVQVGFVLMIENNIPFTAEHLAAMSD VSRAISSVVGHYHPELFQYTSADQQLLEALLIGTPADILADSISHLRVSDAMYALCIQPK TFLKENKLKQQLYPVIRHIFPEAYMTVHDNALVLLVPDQSSVIFVDTQKIMLKTMAEYHL DVGISLVFSKMDEFYRAYQQARTVLEWNHRIPDNLKKEISWDQRYPMEHQIHRYSEIEFY EMMNKIKEPEKLADYIHPALYRLNKYDQRTGNELYHTLEVYLQCFHNNKETANILCIHRN SLAYRMEKIIEIGKADLNDPMTEFLLRMSFKLVEYLKIRDLDLP 
>gi|226332965|gb|ACII01000054.1| GENE 19 26212 - 27195 1134 327 aa, chain - ## HITS:1 COG:BS_yocS KEGG:ns NR:ns ## COG: BS_yocS COG0385 # Protein_GI_number: 16078995 # Func_class: R General function prediction only # Function: Predicted Na+-dependent transporter # Organism: Bacillus subtilis # 8 306 4 300 321 265 48.0 9e-71 MENILKKIGKISSFASRYIGIIIIAFSCLAFFWRDGFAWMTNYTSVFLGVIMFGMGLTIR LEDFRAIFSRPKEVIIGAVAQYTIMPVVAWVLCKVMNLPADLALGVILVGCCPGGTASNV ITYIAGGDVALSVGMTIVSTLAAPVMTPFLVYILAGAWVEVSFWAMVLSVVKVILVPVLL GILLRTLAGDHVDKVSDVMPLISVVAIVMIIGGIVAINAEKILSCGVLVLGVVAIHNFCG MMLGLLAAKIFHVEYTRATAIAIEVGMQNSGLAVSLAAANFVANPLATLPGAIFSVWHNI AGSIFAGIRRSGVENRTRTGENGVSAA >gi|226332965|gb|ACII01000054.1| GENE 20 27199 - 28038 894 279 aa, chain - ## HITS:1 COG:CAC2915 KEGG:ns NR:ns ## COG: CAC2915 COG0414 # Protein_GI_number: 15896168 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate synthetase # Organism: Clostridium acetobutylicum # 1 279 1 281 281 350 58.0 2e-96 MQVTKTVEETRKQIKAWKKEGKTIGLVPTMGFLHEGHASLIKKCREQNDIVVVSDFVNPT QFGPTEDLEAYPRDFERDSKLCESLGTDLIFCPEPSEMYHDPHAFVSIDTLSETLCGKTR PIHFKGVCTVVTKLFHIVAPDRAYFGQKDAQQLAIIRKMVQDLNFDIEIVGCPIVREEDG LAKSSRNTYLSDEERKAALCLSRSVKLGQEIIHAGISAEELLGKMRAVIEAEPLAKIDYV SMVDALTMQPVEKADHNVLVAMAVYIGKTRLIDNFSYEV >gi|226332965|gb|ACII01000054.1| GENE 21 28148 - 28588 519 146 aa, chain - ## HITS:1 COG:CAC2916 KEGG:ns NR:ns ## COG: CAC2916 COG0853 # Protein_GI_number: 15896169 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate 1-decarboxylase # Organism: Clostridium acetobutylicum # 17 142 1 126 127 173 64.0 1e-43 MVLLDSDFSRRKGLYKMTIEMLKGKIHRATVIQAELDYVGSITVDEELLEAAGILEYEKV QIVDVNNGSRFETYTICGKRGSGMICLNGAAARCVSTGDKIIIMAYAGYEPEEARTHKPA VVFVDEENKISRVTNYEKHGLLKDMA >gi|226332965|gb|ACII01000054.1| GENE 22 29567 - 29776 199 69 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1224 NR:ns ## KEGG: EUBREC_1224 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 69 26 94 94 90 65.0 2e-17 
MKAHKYWSLGALAAMIGTFYTGYKNMKTAHKYFACSSLLCMIMAIYSGHKMISGKSRKKK NSVSKESAE >gi|226332965|gb|ACII01000054.1| GENE 23 30371 - 30922 169 183 aa, chain - ## HITS:1 COG:XF2664 KEGG:ns NR:ns ## COG: XF2664 COG5619 # Protein_GI_number: 15839253 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 2 148 41 186 222 124 38.0 1e-28 MDIVWEKYQNISLYLRNMDYSTIYDEIEKNHNYNVKLPDGGIIQLMYRFNRTGKELISHR LGYYPSPSYELYQNDPELYDVDYIYGDILNKSVLPVIIRADYNRDPEESELHHPYSHITL GGYKNCRIPVDRPISPMKFVKFIMEHFYYVPSSQLEFNFEIEGIVAFEEHIAEKDINKSR IIV >gi|226332965|gb|ACII01000054.1| GENE 24 31021 - 33138 1093 705 aa, chain - ## HITS:1 COG:AGc765 KEGG:ns NR:ns ## COG: AGc765 COG3972 # Protein_GI_number: 15887781 # Func_class: R General function prediction only # Function: Superfamily I DNA and RNA helicases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 694 11 709 718 506 39.0 1e-143 MLTIVNGVTSKQVLANEIINLLETMGLDGYFYLGYPVLGGIDGKIKVDALLVSEQTGIVL FDLETLAEENMEDKIQLLDELYNNMEAKLKRYGYLSKRRVLQVPINVLSYAPLYKTKSDE ICTSIEEVKEYLESLEWKQGEEYYKKLLEAIQMISQLKKRGGRKNLQKASSRGSKLKAIE DQISCLDKYQSKAVIETVEGVQRIRGLAGSGKTIVLALKVAYLYTMYENKMIAVTFNSRA LKGQFIQLITNFIVENTNEEPDWSRIKIIHAWGSKNSEGLYFNFCKANNLVCYDYMDACQ KYGRNSSAFDRVCEEAVESVTNPQPLYDVILVDEAQDFSKYFLQMCYMSLPHESRMLVYA YDELQSLDNKNVESPEDIFGYSNGRPNVVLDNSNGKAEDIVLSKCYRNSRPVLITAHSLG FGIYRKKEAREETSLVQLFEDKQLWEDIGYTVKEGVIRDGEFVTLYRTEETSPAFLEDHS SIDDLIQFRRFENNVQEADWIVEDIEKNLHDEELRYQDIMIIHPDPKITKSYVSHIRVKL MEKGIKSHIVGVQTTPDDFFQEDSIAISQIYRAKGNEAAIVYLINADLCARGINLSRKRN IIFTAMTRSKAWVRVSGIGDDMDLLIQEFNSVKERNFQLEFTYPDKSQRKNMRIIHRDMT QNELRDIRKSNSNLADITSKLANGEIRKEDLDEDTVNRLKDVLYN >gi|226332965|gb|ACII01000054.1| GENE 25 33427 - 34680 341 417 aa, chain - ## HITS:1 COG:no KEGG:STER_0748 NR:ns ## KEGG: STER_0748 # Name: not_defined # Def: hypothetical protein # Organism: S.thermophilus_LMD9 # Pathway: not_defined # 3 416 5 374 378 87 27.0 1e-15 MNILIVGNGFDLAHGLPTKYADFLKFIDFFYKHKAQESSGLELIAGEDINCYKYFTDLFN 
SKQDSEFDQYLYDQSRKTIHELSDLCKDNAWIKYFSEVYKSREQKGKDGWIDFESEISLI IQTFNSVSRDIQETIQKGGVGTVLSQRQLNVLALFLEKMDSSSGMATHVWKKEEIDFWKQ KLLEDLNKLTRALEIYLSDYISNFMLGNGLPDIKNLPYLDKILSFNYTCTYQRIYGEHPF LEFDYVHGKADLRNDIQSTNMVLGIDEYLEGDARDKDLEFIEFKKFFQRIHKETGGLYEG WLEEIQSEKKIYEISAIVKENGIVKKHHRVVKYHKVFIFGHSLDITDKDILRKFILNENV KIIIFYTDKEDYKKKIINLIKIIGQDELVKRTGGKNKTIVFQKINTCTLESDSMREK >gi|226332965|gb|ACII01000054.1| GENE 26 34718 - 34951 137 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578735|ref|ZP_04856006.1| ## NR: gi|253578735|ref|ZP_04856006.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 77 1 77 77 133 100.0 4e-30 MYAWYEKTTHMCCAFFYDITQIFYIVLAGQEKQIIEADESVTQSIRKIQLFIRRHEDYWK DHVKQKIELEHRLEAML >gi|226332965|gb|ACII01000054.1| GENE 27 34960 - 35367 102 135 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578736|ref|ZP_04856007.1| ## NR: gi|253578736|ref|ZP_04856007.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 135 66 200 200 228 100.0 7e-59 MQNYKNEGIKDKEFRIDSWESDLLEIIIVRLGGEGQEEKGLLDLLYGLFSGNQKKVLSYW EFERLSDTSKKYILENISKDILENWNLLFEKKKKKHENLREIWFSGYFSLILNSLQYDLK NKEEWKLSFLNYEKY >gi|226332965|gb|ACII01000054.1| GENE 28 35563 - 35916 263 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580918|ref|ZP_04858180.1| ## NR: gi|253580918|ref|ZP_04858180.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 67 1 67 70 105 77.0 7e-22 MDVKDSISVTLKTIDGKETKDQSAKNIFSYKPIMSNILKYTVEEYRDCSLEEIMNCIEGD TIQTGTALVEEDMAKTIRGEDTEFHTTDEAPAAFDILFRSLNPKAEGKLLVNLHIDF >gi|226332965|gb|ACII01000054.1| GENE 29 36228 - 36473 364 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153811885|ref|ZP_01964553.1| ## NR: gi|153811885|ref|ZP_01964553.1| hypothetical protein RUMOBE_02278 [Ruminococcus obeum ATCC 29174] # 1 81 1 81 81 91 98.0 2e-17 MARTRKTSKEALEEKIEKAEADVIAAKKKYDQATANLKELLDKRDAIRKEEILSAVVKSG RSYEEIMNFLNAGSDGEKTEK >gi|226332965|gb|ACII01000054.1| GENE 30 36489 - 38183 1233 564 aa, chain - ## HITS:1 COG:MA2942 KEGG:ns NR:ns ## COG: MA2942 COG5421 # Protein_GI_number: 20091761 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Methanosarcina acetivorans str.C2A # 17 546 9 521 521 95 24.0 3e-19 MYLDFLVKVPTVKGKITRRKKSNVVYIEYEYDRVYDPSRKYTFLKRVTIGKLSDTDPELM KPNQNFLKYFPDAELPESKNRTSRSSCLRVGTYFVLRKIIEECSLNDILGKYFGSRDMGL FLDLAVYSIIAENNAAQYYPDYTYNHPLFTEKMKQYSDSTVLDFLNSVTDDQSTGFLNTW NESRNYHEKIYISYDSTNKNCQAGDIKMVEFGHPKVDMGEPVFNYAVAYDTHNQEPLFYE KYPGSLNDISQLQFMLDKASGYGYKKIGFILDRGYFSCENIQYMDKCGYSFVIMVKGMSA LVNELILENKGTFENKRVNNIYEYGVYGKTIRHKLYASDKKERYFHLYHSISKESAERIE IENRINQMTQYLKKYQNKVKEFGPGFEKYFNLYYDEKSQAFILPEERCSVVERELDLAGY FCIVTSEKMSAKEAIELYKSRDVSEKLFRGDKSYLGNKSIRVYSEESARAKIFVEFVAMI VRCKMYIKLKEEMKKLDKKLNYMTVPAAIKELEKIEMVRQLDNIYRLDHAVTANQKVILK TFGLDANSIKYFASELSKELREAE >gi|226332965|gb|ACII01000054.1| GENE 31 38510 - 38947 205 145 aa, chain - ## HITS:1 COG:no KEGG:Exig_0319 NR:ns ## KEGG: Exig_0319 # Name: not_defined # Def: hypothetical protein # Organism: E.sibiricum # Pathway: not_defined # 1 136 92 222 235 78 34.0 9e-14 MWPDEDNTAIAYSYINGIPKVNEELLIKLDSTGDYLKRARNTYNLVGLGNLPMGKDRDRR FGQRNTAYQKALNSLEHWNHMKDLSKEYQNDMKNQIIMTALGDGFFSIWMEVFCDEPEIR LALIEAFPGTNLNYYDEKGCVKEII >gi|226332965|gb|ACII01000054.1| GENE 32 38949 - 39206 152 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291524428|emb|CBK90015.1| ## NR: gi|291524428|emb|CBK90015.1| HNH 
endonuclease. [Eubacterium rectale DSM 17629] # 1 82 1 82 235 134 75.0 2e-30 MRPINKGESPYKKINEYKDALPYLERRIGMYCSYCEFSIPHVPEVEHVVSKSKGGDLTDW NNLNLGCKYCNTRKKAQTMPENKKN >gi|226332965|gb|ACII01000054.1| GENE 33 39190 - 40557 777 455 aa, chain - ## HITS:1 COG:STM3753 KEGG:ns NR:ns ## COG: STM3753 COG3950 # Protein_GI_number: 16767037 # Func_class: R General function prediction only # Function: Predicted ATP-binding protein involved in virulence # Organism: Salmonella typhimurium LT2 # 6 360 3 370 396 136 27.0 9e-32 MCEFNIQEIELFNYRQFEDKKFVLNSRMNVFAGKNGSGKTTVLEAANVVLGAYLAAFKTY VPSRFVFNIKSTDARRKTQISDDSRILTAGAIPQYPCKVSCKAAWNIDYESIGFQRVLLK QDGRTKFGGKNPMQQTVIEWESKIANADHSDEDVILPLVLYLSSARLWKDGNKRTVKTGV LSRTDAYSHCLDAQHGLDIAFRYIDTLKAVAVEENDGKIFPAYEVILWAVNEAFRDELKS GEKIIFSTKYEDDIIALRTSEGTVLPFQMLSDGYRNVIKIILDIAARMCILNPYLKEKAL QETPGIVLIDEIDLSLHPTWQKRIIGILKQLFPKIQFICATHSPFIIQSLEEGELITLDQ PLESEYSGESIEDISEDIMGVVLPQYSEKKRKIYEASKLYFEALTQAKSQEDIDRLGKRL AELEAEYSENPAYMAWVHQQYLEQKWKVKKNETNK >gi|226332965|gb|ACII01000054.1| GENE 34 40900 - 41466 106 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578742|ref|ZP_04856013.1| ## NR: gi|253578742|ref|ZP_04856013.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 188 1 188 188 370 100.0 1e-101 MIESELYYLVKSYRDAIEQAHKKNRFCKWPFNQFPEACCGQASDILARFLGEQGIYTKVV SGSDSNTTHAWLVLDDSILQKNIQRAEEATDYEEVNKMNRILEGYGYNKEINIYPYCDLD SILRGCTIIDITGSQSQFRLDSKYFNYNIPIYIGEMDDFHNLFDINSINEFRNDDKLDNI YRIVQQYL >gi|226332965|gb|ACII01000054.1| GENE 35 41750 - 42049 259 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578743|ref|ZP_04856014.1| ## NR: gi|253578743|ref|ZP_04856014.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 99 1 99 99 147 100.0 2e-34 MQENMKEYLTEKEAKEKRLEVYTEKIQEKVLSRSSETIRQLVEERHRQKMTQQEIADITG IKPSNMARFESGGRVPTLVVLEKYANALGKHIEIKICDD >gi|226332965|gb|ACII01000054.1| GENE 36 42378 - 42677 200 99 aa, chain + ## HITS:1 COG:no KEGG:c3658 NR:ns ## KEGG: c3658 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 3 98 306 390 533 66 38.0 3e-10 MIYEYQKDRDHQKPLEFYRDYKGILVTDGLQQYHLVDKKLPDVTNANCWAHARRDFADAV KAMDKKDPSAGHSSVAYTALQKIGGFYTADTELKKLSSE >gi|226332965|gb|ACII01000054.1| GENE 37 42711 - 43139 299 142 aa, chain + ## HITS:1 COG:ECs1336 KEGG:ns NR:ns ## COG: ECs1336 COG3436 # Protein_GI_number: 15830590 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 # 1 110 73 180 199 86 38.0 1e-17 MVEDFFAWVKEQLSQCTVPPKSKTGQGLQYLVNQELYLKVFLTDGDVPIDNSASERSIRT FCIGKKNWMFHNTANGASANAMVYSISETAKLNSLRPYYYFRHILTELPKRCDVNGKINP AELDDLMPWSEELPDECRKSRR >gi|226332965|gb|ACII01000054.1| GENE 38 43667 - 43795 87 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTVSFVIFTIVYGTVSSILGTYIVRILDKKCKNDRHSSDSDR >gi|226332965|gb|ACII01000054.1| GENE 39 44249 - 45829 1245 526 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0128 NR:ns ## KEGG: EUBREC_0128 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 525 22 546 550 701 63.0 0 MGMFVNPDNLAFQAALNAKIYVDKSGILNYTNSVLGSTDAFICNSRPRRFGKSVTANMLT AYYSKGCDSEEMFSRLKISQAEDFRKHLNQYDVIHWDIQWCMGPANGPEKVVSYISEKTI LELREYYPDVLPAENHSLAETLARINTVTGRKFIVIIDEWDVLIRDEAAKEDIKNEYIRF LRGIFKGTEPTKYIQLAYLTGILPIKKEKTQSALNNFDEFTMVSPSILAQYIGFTEAEVQ KLSDKYHQDFDKVKKWYDGYLLKDYQVYNPRAVVSVMLKGEYKSYWSETASYEAIVPFIN MNYDGLKTAIIEMLSGSSVKVNTATFKNDTVNIQSKDDVLTYMIHLGYLGYDQTEKIAFV PNEEIRQELTNAVKSKSWNEMLMFQQESETLLDATLDMDNETVAAQIEKIHNEYASVIRY HDENSLSSVLAIAYLGTMQYYFKPVREFPTGRGFADFIFIPKPEYKSAYPALVVELKWNQ KVQTAIQQIKDRKYPSSISDYTGDILLVRINYDKKSKKHQCIIEKV Prediction of potential genes in microbial genomes Time: Sat May 28 
19:39:34 2011 Seq name: gi|226332964|gb|ACII01000055.1| Ruminococcus sp. 5_1_39B_FAA cont1.55, whole genome shotgun sequence Length of sequence - 4488 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 285 - 1613 350 ## COG3666 Transposase and inactivated derivatives 2 1 Op 2 . - CDS 1576 - 1779 98 ## Bcer98_1656 transposase IS4 family protein - Prom 1979 - 2038 6.6 - Term 1877 - 1936 -0.1 3 2 Tu 1 . - CDS 2081 - 2431 298 ## COG1482 Phosphomannose isomerase - Prom 2635 - 2694 3.2 + Prom 2420 - 2479 1.8 4 3 Tu 1 . + CDS 2545 - 3003 395 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Term 3075 - 3123 -0.8 5 4 Tu 1 . - CDS 3294 - 3677 405 ## COG0346 Lactoylglutathione lyase and related lyases - Prom 3723 - 3782 5.8 + Prom 3961 - 4020 8.1 6 5 Tu 1 . + CDS 4075 - 4486 142 ## gi|253581324|ref|ZP_04858557.1| conserved hypothetical protein Predicted protein(s) >gi|226332964|gb|ACII01000055.1| GENE 1 285 - 1613 350 442 aa, chain - ## HITS:1 COG:MA3799 KEGG:ns NR:ns ## COG: MA3799 COG3666 # Protein_GI_number: 20092595 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 193 432 4 234 243 104 30.0 3e-22 MDVMQKTPILLFKYLLLKYLYNLSDNGVVERSRYDMSFKYFLELTPEEEVIHPSLLTKFR KQRLKDENILDLLINKSVELAINQGVLKSNTIIVDSTHTEARYHKTTQREMLKKASTSLK AALKAALKDSDKDIYLPDEPEKRASLDEYNEYCHELLNTVQESPYSELPTIKEESSMLSE ILDNQIDCYDQSIDPDARIGHKTVNSSFYGYKTHLAMTDERIITAATITSGEAFDGQELP VLVQKSKAAGATVNEVIADTAYSTLENLKDAQSNDYKLISKLNPSIIKGTRSEDGFIFNK DADTMQCPAGHLAIKYRIDKRSNQKKNTRIKYFFDINKCHVCPYRNGCYKENAKTKTYSI TIKSDYHKNQEAFQKTQYFKERFKTRYMIEAKNSELKNIHGYSRCDAAGLSNMQLQGAVS IFAVNLKRILKLKGEVCPNEEK >gi|226332964|gb|ACII01000055.1| GENE 2 1576 - 1779 98 67 aa, chain - ## HITS:1 COG:no KEGG:Bcer98_1656 NR:ns ## KEGG: Bcer98_1656 # Name: not_defined # Def: transposase IS4 
family protein # Organism: B.cereus_NVH # Pathway: not_defined # 6 63 1 58 486 74 58.0 9e-13 MSEVEMLSSQLKLSLSYYSDLYDMIVPKDHILRKIRELVDFSFIYDELKNKYCLDNGRNA KNPYIAI >gi|226332964|gb|ACII01000055.1| GENE 3 2081 - 2431 298 116 aa, chain - ## HITS:1 COG:SP0736 KEGG:ns NR:ns ## COG: SP0736 COG1482 # Protein_GI_number: 15900631 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Streptococcus pneumoniae TIGR4 # 4 109 152 254 314 92 42.0 2e-19 MKYLQNVPIHKDDLFFIPAGTIHAIGAGALVAEIQESSNLTYRLYDYDRIGKDGKKRELH IDKALDVADLHGSAEPRQPLRVLKYRPGMASELLIRCKYFEVYRMLINGVCQEVQR >gi|226332964|gb|ACII01000055.1| GENE 4 2545 - 3003 395 152 aa, chain + ## HITS:1 COG:AF0813 KEGG:ns NR:ns ## COG: AF0813 COG0111 # Protein_GI_number: 11498419 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Archaeoglobus fulgidus # 1 151 163 313 527 140 45.0 1e-33 MKVIAYDPYINETAAQELNVTPVTLEQLLKDSDLISVHCPLTKDTYHLIGKEEMTLLKPN AILVNTARGGIIDEAALIEALQNGKISGAGVDVFENEPVTPEHPLLHMDNVIATPHSAWY SETAIHTLQRKVAEEVVNVLNGNPPFNCVNMK >gi|226332964|gb|ACII01000055.1| GENE 5 3294 - 3677 405 127 aa, chain - ## HITS:1 COG:CAC0249 KEGG:ns NR:ns ## COG: CAC0249 COG0346 # Protein_GI_number: 15893541 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Clostridium acetobutylicum # 1 127 1 126 126 162 62.0 1e-40 MNLSKIHHIAIIVSDYEVAKDFYVNKLGFSVIRENYRPERKDWKLDLRVNEHTELEIFAE ENPPKRANRPEACGLRHLAFCVESVEQTVNELAEVGIECEPIRMDDYTGKKMTFFHDPDG LPLELHE >gi|226332964|gb|ACII01000055.1| GENE 6 4075 - 4486 142 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581324|ref|ZP_04858557.1| ## NR: gi|253581324|ref|ZP_04858557.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 137 1 137 138 247 98.0 2e-64 MNRTNICKNIVQSIKDYITDQVKLEPHRMEKHFVRRRKLSLLQVIIYLFFSSKASMFQNL SQIREELGTLSFPDVSKQALSKARQFINPSLFKELYYLSVDLFYSQISSRKLWQGYHLFA VDGSRIELPNSKSTFDF Prediction of potential genes in microbial genomes Time: Sat May 28 19:39:43 2011 Seq name: gi|226332963|gb|ACII01000056.1| Ruminococcus sp. 5_1_39B_FAA cont1.56, whole genome shotgun sequence Length of sequence - 757 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 7 - 757 542 ## PSPA7_0119 transposase DDE domain-containing protein Predicted protein(s) >gi|226332963|gb|ACII01000056.1| GENE 1 7 - 757 542 250 aa, chain + ## HITS:1 COG:no KEGG:PSPA7_0119 NR:ns ## KEGG: PSPA7_0119 # Name: not_defined # Def: transposase DDE domain-containing protein # Organism: P.aeruginosa_PA7 # Pathway: not_defined # 1 237 129 376 423 90 26.0 7e-17 MSSYPDPNRRYTMGLASIIYDVLDDYILHASIHKFLSSERAAALEHLKVLEDMGLYNNSI IIFDRGYYSEDMFRYCVEHGHLCVMRLKEGINLSKKCNGDMISILQGTSKEGTSDVPIRV LEIPLDDGTKEYLATNLFDPAVTKDMFRELYFYRWPVELKYKELKSRFAMEEFSGATAVS IQQEFYINMLLSNLASLIKNEADEEIQISAKSTNKFRYQANRAFIIGRIKSIVPKILCGL FELSIIEQLY Prediction of potential genes in microbial genomes Time: Sat May 28 19:39:48 2011 Seq name: gi|226332962|gb|ACII01000057.1| Ruminococcus sp. 5_1_39B_FAA cont1.57, whole genome shotgun sequence Length of sequence - 4300 bp Number of predicted genes - 8, with homology - 6 Number of transcription units - 5, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 162 - 206 8.5 1 1 Op 1 . - CDS 224 - 577 259 ## gi|253578752|ref|ZP_04856023.1| predicted protein 2 1 Op 2 . - CDS 581 - 1525 669 ## Cthe_1159 hypothetical protein - Prom 1591 - 1650 2.0 + Prom 1551 - 1610 9.3 3 2 Tu 1 . + CDS 1764 - 2384 206 ## EUBELI_20054 hypothetical protein + Term 2401 - 2448 8.4 4 3 Tu 1 . 
- CDS 2459 - 2836 520 ## COG0346 Lactoylglutathione lyase and related lyases - Prom 2876 - 2935 8.4 5 4 Op 1 . - CDS 3027 - 3269 140 ## gi|253578756|ref|ZP_04856027.1| predicted protein 6 4 Op 2 . - CDS 3285 - 3413 90 ## - Prom 3548 - 3607 5.9 7 5 Op 1 . - CDS 3630 - 3827 168 ## - Term 3870 - 3903 2.0 8 5 Op 2 . - CDS 3927 - 4241 280 ## EUBREC_3010 hypothetical protein Predicted protein(s) >gi|226332962|gb|ACII01000057.1| GENE 1 224 - 577 259 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578752|ref|ZP_04856023.1| ## NR: gi|253578752|ref|ZP_04856023.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 117 1 117 117 201 100.0 1e-50 MGEIRVVLIFNEDNERQREIGRYLKSQKRCKTALITELVYAWLHKENAPVSSYNNANVSV EELKQQLLQDTDFIQQIKKSVVIEEDAGKVIEEEQSDGLDMDEGMLMAGLSMFENNL >gi|226332962|gb|ACII01000057.1| GENE 2 581 - 1525 669 314 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1159 NR:ns ## KEGG: Cthe_1159 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 3 301 4 294 303 134 29.0 3e-30 MAEIIMGIDHGNGNMKGVHVNFPCGLVRYTSEPGRFMNEDILEYNGVYYTLSETRMPFKA DKTVDEDYFILTLFSLALEAKERGITLSGKDIVLGVGLPPADFGQQAPGFKKYFMEHSKH GISFRFNGKPVNCYLKDILVSPQNFAAVMCYKASLFRQYRTVNCIDIGDGTVDLLVIRKG QPDLSVRVSDRSGMAILRSEISNSIQQNYGIHLESSDVEQVLMQEATILDEEVVREIQKM AGDWMQRIINKLHAYVPDFRTNPAVFLGGGSLLLKQQIEKSSDFKYIEFIEDIRANAVGY EKMTALHIQKRQGV >gi|226332962|gb|ACII01000057.1| GENE 3 1764 - 2384 206 206 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20054 NR:ns ## KEGG: EUBELI_20054 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 117 13 121 154 73 40.0 3e-12 MNTGEIIKYFRLARNMTQEQLAQDAEISFSTLRKYEANERNPKYEQLSKIADALGISVNL FMDFEIQSVSDLFSILFKMEGQADLEIATSKDLAVLSEDALFLHFKNGYINHILSNYHAS VESIQDMDAQYREKALSEIQNRLLDDNSSIHTDSPALSADSGKPALSLSSADQTWFELSS DCSDEEKEQMIKAATFIKSCLRHSDK >gi|226332962|gb|ACII01000057.1| GENE 4 2459 - 2836 520 125 aa, chain - ## HITS:1 COG:CAC0249 KEGG:ns NR:ns ## COG: CAC0249 COG0346 # Protein_GI_number: 15893541 # Func_class: E Amino 
acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Clostridium acetobutylicum # 2 125 3 126 126 172 65.0 1e-43 MFDTIHHIAIIGSDYEKSKHFYVDLLGFKVIRENYRKERDDYKIDLACGPQEIELFIIKN APARVNYPEALGLRHLAFKVKSVDDTVKELNGKGIETEPVRLDDYTGKKMTFFHDPDGLP LEIHE >gi|226332962|gb|ACII01000057.1| GENE 5 3027 - 3269 140 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578756|ref|ZP_04856027.1| ## NR: gi|253578756|ref|ZP_04856027.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 12 80 12 80 80 131 100.0 1e-29 MEKAERKKKEEYQIINPVKYQRFIYLSERAIQFSKISGCNIKIQTFPDMSALIKMQTGYI WLLSDGESALEDKETMKNLL >gi|226332962|gb|ACII01000057.1| GENE 6 3285 - 3413 90 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNIEKFYYDEQTEIYEQMKKEQDEHEQSLSVEEREKEHQKLF >gi|226332962|gb|ACII01000057.1| GENE 7 3630 - 3827 168 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTVLQIFFIFKSEYIIGCTIKNIAQFFNGKDCDVLIFPELIEGVLVNSFVNQGILGNVAL FHGFP >gi|226332962|gb|ACII01000057.1| GENE 8 3927 - 4241 280 104 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3010 NR:ns ## KEGG: EUBREC_3010 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 10 98 7 95 100 115 66.0 7e-25 MNGGMAMSSYKDYKKRALQNPEVKAEYDALQPEYDIIQAMIDARVQQNMTQKDLSAKTGI TQADISRIENGTRNPSLSMVKKLAQGLGMQLKLEFVPMPTKNKM Prediction of potential genes in microbial genomes Time: Sat May 28 19:40:16 2011 Seq name: gi|226332961|gb|ACII01000058.1| Ruminococcus sp. 5_1_39B_FAA cont1.58, whole genome shotgun sequence Length of sequence - 2965 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 302 - 361 5.4 1 1 Tu 1 . + CDS 431 - 1807 1085 ## COG0534 Na+-driven multidrug efflux pump 2 2 Tu 1 . 
- CDS 1945 - 2964 952 ## COG2605 Predicted kinase related to galactokinase and mevalonate kinase Predicted protein(s) >gi|226332961|gb|ACII01000058.1| GENE 1 431 - 1807 1085 458 aa, chain + ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 12 437 13 437 457 229 33.0 1e-59 MNATANLGKAPVRTLVLQLAIPSMTAQFVNVLYSIIDRMYIGHISENGALALAGAGICGP IVTLLTSFGTLVGIGGSITMGIRMGEKNHKKAEQVLANSFLMLSVFSVLLTVSFLLLKDK LLMLFGASGATFPFANTYLTIYTTGTFFALMATGLNYFINCQGFPLVGMFTVLIGAGCNI LSDPIFIFGFHMGISGAAIATVISQILSCIFAMTFLLGKKVPVKITFGQYNLKVIRKILS LGFSPFLILATDSLILIFMNTVLQKYGGASQGDMLITCATIAQSYLLLITSPMIGISGGT QAILSYNYGARCVNRVKDAERNILGVMLIFTTIMFLLSRFIPRYFVLLFTTDPQYIRLSV WVIRTYTLMIIPLSFQYVFVDGLTALECPKTGLALSVTRKSLFVISAAVLPSFFGAKAAF FAEPVADGISSVLSTIVFLLLINRHLDKRCRMDKKESL >gi|226332961|gb|ACII01000058.1| GENE 2 1945 - 2964 952 339 aa, chain - ## HITS:1 COG:CAC3055 KEGG:ns NR:ns ## COG: CAC3055 COG2605 # Protein_GI_number: 15896306 # Func_class: R General function prediction only # Function: Predicted kinase related to galactokinase and mevalonate kinase # Organism: Clostridium acetobutylicum # 1 282 19 274 364 98 27.0 1e-20 DTPPYCNEKGGTVLNVAILLNGEKPVEVTLERIPEQKVVFDSRDMDVHGEFDTIKPLQAT GDPYDPFALQKACLLACGIIPREGHTLGEILERLGSGFVMHSEVTNVPKGSGLGTSSILS AACVKAVFEFMGIAYTEEDLYAHVLAMEQIMSTGGGWQDQVGGITSGLKYITSMPGLQQH LQVAHIELSPQTKKELDERFVLIYTGQRRLARNLLRDVVGRYVGNEPDSLFALEEIQKTA VLMRFELERGNVDGFAKLLDYHWELSKKIDAGSSNTLIEQIFSSIEELVDGKLVCGAGGG GFLQVILKKGVTRQMVEERLKEVFMDSLVGVADCKLVWE Prediction of potential genes in microbial genomes Time: Sat May 28 19:40:20 2011 Seq name: gi|226332960|gb|ACII01000059.1| Ruminococcus sp. 
5_1_39B_FAA cont1.59, whole genome shotgun sequence Length of sequence - 12549 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 5, operones - 3 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 239 106 ## gi|253578774|ref|ZP_04856045.1| LOW QUALITY PROTEIN: L-fucokinase domain-containing protein - Term 274 - 313 1.6 2 2 Tu 1 . - CDS 342 - 824 259 ## Swol_0146 RNA-directed DNA polymerase (reverse transcriptase) - Prom 868 - 927 2.5 - Term 878 - 922 6.6 3 3 Op 1 2/0.000 - CDS 950 - 2206 538 ## COG3344 Retron-type reverse transcriptase 4 3 Op 2 . - CDS 2727 - 3461 443 ## COG3344 Retron-type reverse transcriptase 5 3 Op 3 . - CDS 3521 - 3850 116 ## gi|253578766|ref|ZP_04856037.1| predicted protein - Prom 4054 - 4113 2.9 - Term 4090 - 4116 -1.0 6 4 Op 1 . - CDS 4126 - 5799 1590 ## COG1109 Phosphomannomutase 7 4 Op 2 . - CDS 5840 - 6838 591 ## COG1482 Phosphomannose isomerase 8 4 Op 3 . - CDS 6885 - 8201 1186 ## COG0836 Mannose-1-phosphate guanylyltransferase 9 4 Op 4 14/0.000 - CDS 8226 - 9170 903 ## COG0451 Nucleoside-diphosphate-sugar epimerases 10 4 Op 5 . - CDS 9189 - 10241 1166 ## COG1089 GDP-D-mannose dehydratase - Prom 10316 - 10375 2.8 11 5 Op 1 . - CDS 10406 - 11281 506 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 12 5 Op 2 . - CDS 11304 - 12548 1229 ## COG2605 Predicted kinase related to galactokinase and mevalonate kinase Predicted protein(s) >gi|226332960|gb|ACII01000059.1| GENE 1 2 - 239 106 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578774|ref|ZP_04856045.1| ## NR: gi|253578774|ref|ZP_04856045.1| LOW QUALITY PROTEIN: L-fucokinase domain-containing protein [Ruminococcus sp. 
5_1_39B_FAA] # 4 75 401 472 472 150 94.0 3e-35 MKKLDVQDQVIPDNVVLHGLKQRNGKFIVRILGVNDNPKENRLFGRNLDELEDTLGVKFW EENGQAHTLWSAALYQESD >gi|226332960|gb|ACII01000059.1| GENE 2 342 - 824 259 160 aa, chain - ## HITS:1 COG:no KEGG:Swol_0146 NR:ns ## KEGG: Swol_0146 # Name: not_defined # Def: RNA-directed DNA polymerase (reverse transcriptase) # Organism: S.wolfei # Pathway: not_defined # 1 160 274 433 443 140 48.0 1e-32 MGHFGLSLEEEKSRLIEFGRYAKERCRKAGTKPGTFTFLGFTHYCSKSRNGRFRVKRKTS KKKFAKKCREVNLLLVKMRTCRLKEIIEKLNQILTGYYHYYGITDNTESITAFRYNIMRS LFYCLNRRSQKKSYNWVEFLNMLDNSYPLVRPRIYVNIYE >gi|226332960|gb|ACII01000059.1| GENE 3 950 - 2206 538 418 aa, chain - ## HITS:1 COG:MA2797 KEGG:ns NR:ns ## COG: MA2797 COG3344 # Protein_GI_number: 20091619 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Methanosarcina acetivorans str.C2A # 1 414 1 415 415 500 57.0 1e-141 MSEAKQFDISKKAVIAAFQAVKENAGSYGADEQTIKEFEEHLNNNLYKLWNRMASGSYFP KPVRAVAIPKKNGGIRILGIPTVEDRIAQMVAKMYFEPLVEPMFYNDSYGYRPNKSAIQA VGQARERCFKRDWALELDIKGLFDNIKHGYLMYMVEKHTQIKWLILYIKRWLTVPFIMSD GSVAERRSGTPQGGVISPVLANLFLHYVFDDFMTKAYPNIWWERYADDGVLHCQSYKQAA FIKQKLEERFQQFGLELNKEKTRIVYCKDNRRPQNYSCTQFTFLGYTFRPRLNKNKEGKF FVGFTPAVSEKAKTAMKQKIREWKIQLKADLSLKDIGNMINKVVQGWINYYTHYYKSEFY EVLRYINQCLIKWVRRSYKKKNTRSRAEHWLGAVARRDRNLFAHWKFGILPSVGEGAV >gi|226332960|gb|ACII01000059.1| GENE 4 2727 - 3461 443 244 aa, chain - ## HITS:1 COG:BH0039 KEGG:ns NR:ns ## COG: BH0039 COG3344 # Protein_GI_number: 15612602 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Bacillus halodurans # 6 235 2 223 418 124 32.0 2e-28 MVFTSLGHLINKEMLKSCHAQMDGTKAVGIDGITKEEYGRRLEENISALIERLKKKSYKP KPARLVEIPKDNGKMRPLSIYCYEDKLVQEALRRILEAVFEPIFYDEMMGFRPNRGCHKA IRKLNLMLERKPTSYVLDADIKGFFQHLDHEWIIRFIGSRIKDPNIIRLVRRMLKAGIMN NYEFEETEEGSGQGSVCSPVISCIYMHYVLVWWFKEVITPKLKGYAGLVVYADDCAPRRR KLVT >gi|226332960|gb|ACII01000059.1| GENE 5 3521 - 3850 116 109 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|253578766|ref|ZP_04856037.1| ## NR: gi|253578766|ref|ZP_04856037.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 109 1 109 109 219 100.0 3e-56 MVWMEDNICIYRYGEILHTSAVLVGTIHVIQGIYINWREPAFSSVVCKQKVCLTTKRRMV KEMQVVGGPNITDEGMNKINTRGKSRENCSSQFWPAKLYQRGNISYTQR >gi|226332960|gb|ACII01000059.1| GENE 6 4126 - 5799 1590 557 aa, chain - ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 1 542 1 556 575 517 47.0 1e-146 MKIKKEYERWLANAVADTDVAAELKAMDYVEVEDAFYRDLAFGTGGLRGVIGAGTNRMNI YTVAKASQGLADYLKKHFKEPSVAIGYDSRIKSDVFAKVAAGVFAANGVSVNIWPVLMPV PTVSFATRYLHTSAGVMVTASHNPGKYNGYKVYGADGCQITTEAAAEILAEIEKLDIFAD VKTSNFEAGVANGSIRYIPDEVYTAFVEQVKSQSVLFGENVNKNVAIVYSPLNGTGLKPV TRTLQEMGYTNITVVKEQEQPDGHFPTCPYPNPEIKEAMALGMEYARKCNADLLLATDPD CDRVGIAVKNKAGEYELLTGNQTGMLLLDYICSQRIKHGKMPADPVMIKTIVTMDMGEQI AAHYGLRTVNVLTGFKFIGEQIGKLEQLGKADSYVFGFEESYGYLTGSYVRDKDGVDGAY MICEMFSYYATQGISLLDRLDELYKTYGYCLNTLHSYEFDGSAGFARMQSIMQAFRGGIK EFGGKKVARLLDYVQGLDGLPKSDVLKFLLEDNCSLVVRPSGTEPKLKIYISVSAENKEM AEEVEDEIAKTAEKWIC >gi|226332960|gb|ACII01000059.1| GENE 7 5840 - 6838 591 332 aa, chain - ## HITS:1 COG:lin2215 KEGG:ns NR:ns ## COG: lin2215 COG1482 # Protein_GI_number: 16801280 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Listeria innocua # 10 324 3 307 318 211 36.0 1e-54 MDVCMEKRNESLLLRPAGKDYLWGGKRLNDEYGKNIELSPLAETWECSTHPDGVSTVRCG TFDKMDLTAVIKAHPEYLGERHKGETTLPILVKLIDARKDLSVQVHPDDDYAKMKEHGQL GKTEMWYVLDAARDAKLIYGLRQDCTKKEMQKALAEGTVMKYLQKVPIHKDDLFFIPAGT IHAIGAGALVAEIQESSNLTYRLYDYDRIGKDGKKRELHIDKALDVADLHGSAEPRQPLR VLKYRPGMASELLIRCKYFEVYRMLINTERRQTVHYRADRMAFRVLLCMDGCGTISYDEG TVNFYKGDCVFVPADSEVLTIHGQAQFLDIRG >gi|226332960|gb|ACII01000059.1| GENE 8 6885 - 8201 1186 438 aa, chain - ## HITS:1 COG:aq_589_1 KEGG:ns NR:ns ## COG: aq_589_1 COG0836 # Protein_GI_number: 15606035 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Aquifex aeolicus # 1 311 1 320 320 157 32.0 5e-38 MNIILLSGGSGKRLWPLSNDIRSKQFIKIFKKEDGTYESMVQRVYRQIKKVDTDATVTIA TGKTQVSAIHNQLGEDVGISVEPCRRDTFPAIALATAYLVDVMHVAPSESVVVCPVDPYV EDGYFEALKELSEQADKCEANLVLMGIEPTYPSEKYGYIIPESTDHVAEVRTFREKPTAD VAEQYIAQGALWNGGVFAYKMGYVMDKAHEMMDFTDYQDLFDKYDTMKKISFDYAVAEHE PKIQVVRFAGMWKDLGTWNTLTEAMDESIIGKGELNDKCSGVHIINEMDVPVLAMGLHDV VISASPEGILVSDKEQSSYIKPFVDKFEQQIMFAEKSWGSFKVVDVETTSLTIKVTLNPG HSMNYHSHRNRDEVWVVISGEGKTIVDGMEQMVKPGDVITMSAGCRHTVIADTELKLIEV QLGADINVHDKQKFELEQ >gi|226332960|gb|ACII01000059.1| GENE 9 8226 - 9170 903 314 aa, chain - ## HITS:1 COG:ECs2857 KEGG:ns NR:ns ## COG: ECs2857 COG0451 # Protein_GI_number: 15832111 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Escherichia coli O157:H7 # 7 311 5 318 321 360 54.0 3e-99 MMEKNAKIYVAGHRGMVGSAIVRELQRQGYMNIVTRTHKELDLTRQDAVEKFFATEKPEY VFLAAAKVGGIVANQNALADFMYDNMILEMNVIHSAWKNGCRKLEFLGSSCIYPRMAPQP MKESCLLTSELEKTNEAYALAKISGLKYCEFLNRQYGTDYISVMPTNLYGPNDNYHPTHS HVLPALIRRFHEAKESGAASVTCWGDGSPLREFLYVDDLANLCVFLMNNYSGNETVNAGT GKELTIKELTELVAKVVGYNGEIKWDPDKPNGTPRKLLDVSKATNLGWTYKTELEDGIRL AYEDFLNNPMRAER >gi|226332960|gb|ACII01000059.1| GENE 10 9189 - 10241 1166 350 aa, chain - ## HITS:1 COG:ECs2858 KEGG:ns NR:ns ## COG: ECs2858 COG1089 # Protein_GI_number: 15832112 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: GDP-D-mannose dehydratase # Organism: Escherichia coli O157:H7 # 1 348 1 362 373 446 60.0 1e-125 MSKVALITGITGQDGSYLAEFLLEKGYEVHGIVRRASISNTARIDHLIEKNVIKLHDGDL SDSSGLIRLVGEIRPDEIYNLAAQSHVQVSFDAPEYSGDVDALGVLRVLEAVRVCGLTKT CKVYQASTSELYGKVEEVPQKETTPFHPYSPYAVAKQYGFWMVKEYRDAYGMFAVNGILF NHESERRGENFVTRKITLAAGRIAEGLQDHLELGNMDSLRDWGYAKDYVECMWLIMQQEK PEDFVIATGVQHTVRDFTEKAFAANGMTIRWEGTGIDEKGYDAATGKMLVCVNPQWFRPT DVDNLWGDPTKAKTVLGWNPQKTSYEELVEIMAKNDRELAKREKAMKAVM 
>gi|226332960|gb|ACII01000059.1| GENE 11 10406 - 11281 506 291 aa, chain - ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 189 287 417 515 530 89 38.0 1e-17 MKDLFVNTQLPVVTSDRILYTPSNFARTSLLYLQEIGSLQVHKPHSTGRSNLSSYLFFTV TDGEGELICEGRTYSLRTGDCVFIDCRKQYAHSSSEKLWSLSWIHFYGPNMTAIYDKYLE RGGRAVFHPVSVSGFLEICHRIYELADSDDYIRDMKINSELNYLLTLLMNESWHPGEKHR DADLKKQNVLPVKQYLDEHYQEKISLDFLAEHFFISKYYLTRVFREQFGVSISSYLMNRR VTKAKYMLRFSNEKIKTIGYKCGLGEPHYFSRVFRQMEGMSPLEYREKWEK >gi|226332960|gb|ACII01000059.1| GENE 12 11304 - 12548 1229 414 aa, chain - ## HITS:1 COG:CAC3055 KEGG:ns NR:ns ## COG: CAC3055 COG2605 # Protein_GI_number: 15896306 # Func_class: R General function prediction only # Function: Predicted kinase related to galactokinase and mevalonate kinase # Organism: Clostridium acetobutylicum # 63 357 6 274 364 115 28.0 1e-25 KRLRKADFGEKMRLHYYLGVILEDENEVQECFRIIQSEVLEATIKSLAYNEQARIVTDHH TVRLPLRVNWGGGWSDTPPYCNEKGGTVLNAAILLNGEKPVEVTLERIPEQKVVFDSRDM DVHGEFDTIEPLQATGDPYDPFALQKACLLACGIIPREGHTLGEILERLGSGFVMHSEVT NVPKGSGLGTSSILSAACVKAVFEFMGIAYTEEDLYAHVLAMEQIMSTGGGWQDQVGGIT SGLKYITSMPGLQQQLQVAHIELSPQTKKELDERFVLIYTGQRRLARNLLRDVVGRYVGN EPDSLFALEEIQKTAALMRFELERGNVDGFAKLLDYHWELSKKIDAGSSNTLIEQIFSSI EELVDGKLVCGAGGGGFLQVILKKGVTRQMVEERLKEVFMDSLVGVADCNLVWE Prediction of potential genes in microbial genomes Time: Sat May 28 19:40:40 2011 Seq name: gi|226332959|gb|ACII01000060.1| Ruminococcus sp. 5_1_39B_FAA cont1.60, whole genome shotgun sequence Length of sequence - 23085 bp Number of predicted genes - 21, with homology - 20 Number of transcription units - 7, operones - 4 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1431 807 ## Acid_0569 bifunctional fucokinase/L-fucose-1-P-guanylyltransferase - Prom 1512 - 1571 3.7 2 2 Op 1 . 
- CDS 1653 - 2264 204 ## gi|253578775|ref|ZP_04856046.1| conserved hypothetical protein 3 2 Op 2 . - CDS 2271 - 3176 837 ## Cphy_0247 hypothetical protein - Prom 3307 - 3366 5.5 - Term 3485 - 3521 -0.6 4 3 Tu 1 . - CDS 3637 - 3816 214 ## - Prom 3844 - 3903 4.6 5 4 Op 1 9/0.000 - CDS 3955 - 4563 698 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 6 4 Op 2 11/0.000 - CDS 4581 - 5495 932 ## COG1091 dTDP-4-dehydrorhamnose reductase 7 4 Op 3 . - CDS 5510 - 6529 783 ## COG1088 dTDP-D-glucose 4,6-dehydratase - Prom 6584 - 6643 6.1 8 5 Tu 1 . + CDS 6906 - 8003 654 ## Amet_1537 hypothetical protein 9 6 Op 1 . - CDS 8050 - 8946 858 ## COG1209 dTDP-glucose pyrophosphorylase 10 6 Op 2 1/0.000 - CDS 8951 - 10057 302 ## COG1035 Coenzyme F420-reducing hydrogenase, beta subunit - Prom 10177 - 10236 6.0 11 6 Op 3 4/0.000 - CDS 10246 - 11319 297 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid - Prom 11370 - 11429 6.3 12 6 Op 4 . - CDS 11664 - 12830 477 ## COG2327 Uncharacterized conserved protein 13 6 Op 5 . - CDS 12907 - 14154 35 ## gi|253578785|ref|ZP_04856056.1| predicted protein 14 6 Op 6 1/0.000 - CDS 14202 - 15422 473 ## COG3754 Lipopolysaccharide biosynthesis protein 15 6 Op 7 . - CDS 15496 - 16629 403 ## COG0438 Glycosyltransferase 16 6 Op 8 . - CDS 16613 - 17869 452 ## COG1035 Coenzyme F420-reducing hydrogenase, beta subunit 17 6 Op 9 . - CDS 17862 - 19022 209 ## BT_0603 hypothetical protein 18 6 Op 10 3/0.000 - CDS 19035 - 20117 435 ## COG0438 Glycosyltransferase 19 6 Op 11 . - CDS 20120 - 20887 407 ## COG1922 Teichoic acid biosynthesis proteins - Prom 21102 - 21161 4.7 20 7 Op 1 . - CDS 21223 - 22677 1145 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis - Term 22703 - 22750 11.2 21 7 Op 2 . 
- CDS 22762 - 23085 367 ## Elen_2445 NusG antitermination factor Predicted protein(s) >gi|226332959|gb|ACII01000060.1| GENE 1 3 - 1431 807 476 aa, chain - ## HITS:1 COG:no KEGG:Acid_0569 NR:ns ## KEGG: Acid_0569 # Name: fkp # Def: bifunctional fucokinase/L-fucose-1-P-guanylyltransferase # Organism: S.usitatus # Pathway: not_defined # 30 472 37 513 1035 175 31.0 3e-42 MTNLNSLFLSQAYKDCWDDYNRSLKLRSFPRWDYVILTASNEQQAEGFRKQIADRQNFLP RGTKFIAIPDRDGRRVGSGGATLEVLRYLHEQEGSFDSLRVLVIHSGGDSKRVPQYSALG KLFSPVPHELPNGRSSTLFDEFMICMSSVPSRIREGMVLLSGDVLLLFNPLQIDYNNVGA AAISFKENVETGKNHGVYLNGPDGNVKCCLQKKSVEVLRKAGAVNESGCVDIDTGALIFS TDIMKSLYSLIETDADYDRNVNERTRLSLYADFLYPLASDSTLEDFYREKPEGEFCPELT AARTRVWEVLRPYRMKLLRLAPAKFIHFGTTREILELMNGGVDEYHYLGWSRKVGSSIRS DVSGYNSVLSSRASVGKDCYLEVSYVHGNSRIGSHSVLSYIDVQDQVIPDNVVLHGLKQR NGKFIVRIFGVNDNPKENRLFGRDLDELEDTLGVRFWEENGQAHTLWSAALYQEAD >gi|226332959|gb|ACII01000060.1| GENE 2 1653 - 2264 204 203 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578775|ref|ZP_04856046.1| ## NR: gi|253578775|ref|ZP_04856046.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 203 2 204 204 362 100.0 7e-99 MKETIDLLGKIFTNILTALYEPFGFSLLLSFLAMFFYLYAYKPIEAGKGWKNAMVTWYRE FKESIFFRKLFFLAFVTSLILFRTLLNRNLWMNPLSNVMGGWGIWEIVNGEQQLTTECIE NVIMMVPFSAVVMWTFHGKAGNGWNNILWPSGKIAFVFSLMIEILQLLLRLGTFQLSDIF YNTVGGVLGGMCYWGIIKIKKGL >gi|226332959|gb|ACII01000060.1| GENE 3 2271 - 3176 837 301 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0247 NR:ns ## KEGG: Cphy_0247 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 299 1 306 306 303 52.0 6e-81 MKTTTLLIMAAGIGSRFGTGIKQLEPVDDTDHIIMDYSIHDAIEAGFNHVVFIIRKDIEK EFKEVIGDRIASVCSAHDVTVDYVFQDIKDIPGTLPEDRTKPWGTGQAVLAAKKVIKTPF IVINADDYYGREGFKAVHEYLMNGGKSCMAGFVLSNTLSDNGGVTRGICKMDENGNLTEI VETKNIVKTADGAEADGVSVDVNSLVSMNMWGLTPGFLDVLEKGFKEFFDKEVPSNPLKA EYLIPVFIGELLEQGKMSVKVLKTNDTWYGMTYHEDVASVKDSFRKMQENGVYKADLFSD L >gi|226332959|gb|ACII01000060.1| GENE 4 3637 - 3816 214 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSEAIGTREFAEKFGVTPATVSKWCREGKIPNCNQDGKESPWHIPKDAVPPVGYKRRKK >gi|226332959|gb|ACII01000060.1| GENE 5 3955 - 4563 698 202 aa, chain - ## HITS:1 COG:CAC2331 KEGG:ns NR:ns ## COG: CAC2331 COG1898 # Protein_GI_number: 15895598 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Clostridium acetobutylicum # 1 198 1 180 185 231 61.0 7e-61 MGQIKVEKNVGGIEGLCVIEPAVHGDARGYFMETYNEKDMKEAGIDIHFVQDNQSMSTKG VLRGLHFQKQYPQCKLVRAVRGTVFDVAVDLRSNSETYGKWYGVTLSAENKKQFLIPEGF AHGFLVLSDEAEFCYKVNDFWHPNDEGGMAWNDPEIGIEWPELKGEYKGSASAEGYTLKD GTALNLSDKDQKWLGLKDTFKF >gi|226332959|gb|ACII01000060.1| GENE 6 4581 - 5495 932 304 aa, chain - ## HITS:1 COG:SPy0784 KEGG:ns NR:ns ## COG: SPy0784 COG1091 # Protein_GI_number: 15674829 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Streptococcus pyogenes M1 GAS # 4 301 3 280 284 199 41.0 8e-51 MKFFVTGVGGQLGHDVMNELLKRGHEGVGSDIQENYSGVADDSAVTKAPYIALDITDKDA 
AEKVITEVNPDAVIHCAAWTAVDMAEDDDKVAKVRAINAGGTQNIADVCKKLNCKMTYIS TDYVFDGQGTEPWQPDCKDYKPLNVYGQTKLEGELAVSQTLEKYFIVRIAWVFGLNGKNF IKTMLNVGKTHDTVRVVNDQIGTPTYTYDLARLLVDMNETEKYGYYHATNEGGYISWYDF TKEIYHQAGYKTAVLPVSTEEYGLSKAARPFNSRLDKSKLVEAGFTPLPTWQDALSRYLK EIEQ >gi|226332959|gb|ACII01000060.1| GENE 7 5510 - 6529 783 339 aa, chain - ## HITS:1 COG:MTH1789 KEGG:ns NR:ns ## COG: MTH1789 COG1088 # Protein_GI_number: 15679777 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Methanothermobacter thermautotrophicus # 3 339 4 333 336 461 64.0 1e-130 MNIIVTGGAGFIGSNFVFHMLKKYPDYRIICLDKLTYAGNLSTLAPVMDNPNFRFVKADI CDREAVDKLFEEEHPDIVVNFAAESHVDRSIEDPGIFLQTNIIGTSVLMDACRKYGIQRY HQVSTDEVYGDLPLDRPDLFFTEETPIHTSSPYSSSKAAADLLVLAYHRTYGLPVTISRC SNNYGPYHFPEKLIPLMIANALADKPLPVYGEGLNVRDWLYVEDHCKAIDLIIHKGRVGE VYNVGGHNEKQNIEIVKIICKELGKPESLITHVGDRKGHDMRYAIDPTKIHNELGWLPET KFEDGIKKTIQWYLDNREWWETIISGEYQNYYEKMYSNR >gi|226332959|gb|ACII01000060.1| GENE 8 6906 - 8003 654 365 aa, chain + ## HITS:1 COG:no KEGG:Amet_1537 NR:ns ## KEGG: Amet_1537 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 3 345 84 423 444 262 39.0 1e-68 MNITAKVALRMIPDSLATQPIFLCVDDTMVAKTGTRFENVSKLFDHAAHNGSNYLNGHCF VNIMLCIPVWKNDRVSYLSLPLGYRMWQKKESKLELAASMIRQVMPEFQEKKQAIILCDS WYTKQNLGSIVDEYQNLDLIGNARIDSVMYDPAPAHTGRRGRPAKHGKRLSVETDFAFSN EKIGDYYIGAHRVITNLFGCREIQAYVTATEKEHGTKRLFFSTIFPEDLQIFCACQEKEP LNQTGSDRMKYIPLFLYSFQWNIETSYYEQKTIWSLCSYMVRSCKGIEMLVNLINICYCA MKILPYQDEQFSEYRTKSVQEFRFELSQGIRSQIFLTNFVRNIETHIKSNVIIKALKQLI RQQVY >gi|226332959|gb|ACII01000060.1| GENE 9 8050 - 8946 858 298 aa, chain - ## HITS:1 COG:L197041 KEGG:ns NR:ns ## COG: L197041 COG1209 # Protein_GI_number: 15672176 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Lactococcus lactis # 1 285 1 282 289 412 69.0 1e-115 MKGIILAGGSGTRLYPLTKVTSKQLLPIYDKPMIYYPMSVLMNAGIRDILIISTPQDTPR 
FENLLGDGHQFGVNLTYAVQPSPDGLAQAFIIGADFIGDDSVAMVLGDNIFAGHGLKKRL QAAVQNAENGKGATVFGYYVDDPERFGIVEFDKNGKAISIEEKPEHPKSNYCVTGLYFYD NRVVEFAKNLKPSARGELEITDLNRIYLEDGTLNVELLGQGFTWLDTGTHESLVDATNFV KTVEQHQHRKIACLEEIAYLNGWICRDELMEVYEVMKKNQYGQYLKDVMDGKYQEHLY >gi|226332959|gb|ACII01000060.1| GENE 10 8951 - 10057 302 368 aa, chain - ## HITS:1 COG:MTH341 KEGG:ns NR:ns ## COG: MTH341 COG1035 # Protein_GI_number: 15678369 # Func_class: C Energy production and conversion # Function: Coenzyme F420-reducing hydrogenase, beta subunit # Organism: Methanothermobacter thermautotrophicus # 31 356 71 402 406 128 28.0 2e-29 MIDGGFPNVDVKKGESLEFYHSVCPVFYYKGKQKHDIWGNIEKALIGYSSNKKIRFKAAS GGALTEICCYLLENKKVDAIIHTTYDPDDQTKTISCVSTTVEEVISRCGSRYGISVPLKD ILQIVQSDKKYAFVGKPCDVMALRRYLNKNEKLTKNIIYLLSFFCAGEPSVNAQDELLKK MGTSRQGCDTLVYRGNGWPGFTTVNTKDGRELKMEYKVAWGQYLGRDIRYICRFCMDGTG ELADIVCADFWQLDNNNHPDFSEHEGRNIIIARNELGKQLLDATVKSGRINVEEDFTTKI DSEFYLYQPAQLKRKGTMKTTITAMKLCGRDTPKYSKSFLNKYASHVDLKVKFDFFIGII KRVIKGRL >gi|226332959|gb|ACII01000060.1| GENE 11 10246 - 11319 297 357 aa, chain - ## HITS:1 COG:MA4461 KEGG:ns NR:ns ## COG: MA4461 COG2244 # Protein_GI_number: 20093247 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Methanosarcina acetivorans str.C2A # 105 289 216 404 490 77 26.0 6e-14 MFEIPPVMVYVMVLYFLFVPAYQFWSGRQRYEYKYKALSIITVCIGLFSLLLGVLFVLNA SSENEAIARVCAMEGVNIVVGIFFFMIIAMKAQFKLRIDYCIYALKFNIPLIPHYMSMYI LASSDRIMITKMIDTTSTAIYSVAYTVASVIQILWTSIEASLSPWIYERLDVQDEESVRK TSGQIMLTFAVFCLGCTLFAPEIMTILAPDSYKAGIYAIPAIAGGVYFTAVYSLYMRIEL FYKKTTFATVASTIAAVSNIILNYIFIKLFGFVAAGYTTMVCYALLALLHYINVKNKKYD DAIDNKKILAISIVVIVVSILISILYSHRLLRYSFILIISLAAFIKRKAIIVMLRKS >gi|226332959|gb|ACII01000060.1| GENE 12 11664 - 12830 477 388 aa, chain - ## HITS:1 COG:MTH340 KEGG:ns NR:ns ## COG: MTH340 COG2327 # Protein_GI_number: 15678368 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanothermobacter 
thermautotrophicus # 57 357 90 368 400 72 25.0 2e-12 MIIGLFGFDFISSNKGCEALTYTFLSMMERICDEKKLYVYNFTYNKTLGRIPEQFPEIEF KQCNINLKNPRYWKNAFKLLKTCDVFFDATFGDGFSDIYGKKWNIKTDLIKQMVIWSGTP LVLVPQTYGPYNNLVLKKWAMRLIRKADLVYSRDNLSAKVIKEQSGVEIKVGSDMAFKLP YDRTKYKIDNERINIGINVSSLLWDSQWAKENHFGLTVDYKQYHIKILEWLIEQSKYKIH IIPHVIDLEQPNARENDYRVCLQLKKKYGDKIEIAPPFNTPIEAKSYIANMDVFIGSRMH ATIGSISSGVATIPFSYSRKFEGLFGNLEYPYVISARNFSLEEALEKSKSWIQNYEILKG CGQKSVVESYAKLELLENDIRCLLEKRK >gi|226332959|gb|ACII01000060.1| GENE 13 12907 - 14154 35 415 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578785|ref|ZP_04856056.1| ## NR: gi|253578785|ref|ZP_04856056.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 415 1 415 415 723 100.0 0 MKLNIRNTIIFLVFGIYPIIPQYFGIMGISAFKLMCLGIVLFACVALGISKSVATSCRNI QIAFGIWLIIMFINAILFKGVVEYFYEILCYYLVGFVIIKCLNTRKRFLRAVDLLIIGAV VASIIGIVESITGFNVFHLLNNMGAQITLQPLRFGFRRIISFTYQTISFCNYCMFALGLI VYRISVCSKGNERKQKYGIAYGFVFIAALLTLSRSILICIIISQLILLYLCGYKVLLKKL LIITSVGMLAVIICSIVLPEILNILQNVAYMLLAVFDDNYAAMLGNVDGTGSGDRKELLK WVWESIDNKWIGMGGSAEFAYSLTERSGIYSYIRTKTSIENQYLNLLYHYGIIGLISMIW VYIQVFIKSIIQAVKNPSEWEGRISFPKMLVVILGTYYISFFGVHQIDEKRFFSV >gi|226332959|gb|ACII01000060.1| GENE 14 14202 - 15422 473 406 aa, chain - ## HITS:1 COG:CC0633 KEGG:ns NR:ns ## COG: CC0633 COG3754 # Protein_GI_number: 16124886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis protein # Organism: Caulobacter vibrioides # 21 400 221 566 818 234 34.0 2e-61 MEKVRTFEIELLLKDVGGTMKIIAFYLPQFHDIPENDEWWGKGFTEWVNVKKAQPLYKGH EQPRIPMNENYYNLLDDNVKIWQANIAKEYGIYGFCYYHYWFGGKLLLEKPMEQMLANPK VDIPFCISWANEPWTKAWVNESKVLIPQFYGGKKEWKEHFDYLLPFFKDNRYIKEDNKPL FIIYRAEVIDCLNDMLDYWTELARQNGFSGMKYAYQNLTFDLMPNRDDSRFDYNIEFQPS YAWNDLNNKSAVQKSKLWNFLRNIKRRIYAETEKRLGFDLQRYFNHKGKEKNASVLKTDY DEAWKAILEHIPENEKNIPGAFVGWDNTPRKGHRGQVYIGDTPEKLNKYMSKQIQRAKSI YKKDMIFMYAWNEWAEGGYLEPDERTGYKNLEAIRDALKANNEFPW >gi|226332959|gb|ACII01000060.1| GENE 15 15496 - 16629 403 377 aa, chain - ## HITS:1 COG:SP0353 KEGG:ns NR:ns ## COG: 
SP0353 COG0438 # Protein_GI_number: 15900282 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 11 335 2 320 372 157 29.0 5e-38 MKQQINGNNPIRVLHMAPIRMGGITELVLTLSEKIDFEKVQFDYLTFEDTFDQASKRAKE AHSTIYRVDLQTVKNPIVRGLKKFFGIIKLVHSEQIKIIHINSSTPYEVLVAFAAKIGGA KKIIYHSHNSSYYPGQKLVKLSEVFKWFMPFVITDYLACSDAAARFMFPKKIVDAENYTV IKNGVDYDKHSFKPEVRTELRKKYGLENNFVLGHVGRFNVQKNHEFLIDIFYEVYKHDNS ARLFLVGIGETENQIKEKVKRLGIEEVVIFHGLSDEVNRLWNMLDVFVMPSLYEGLPVAG VEAQANGLPLIVSNTITQELKITDAVEYIGLDEKTSKWCEVIERYNGYKRKNTRMEITNA GFNIDDTAIRLERIYLS >gi|226332959|gb|ACII01000060.1| GENE 16 16613 - 17869 452 418 aa, chain - ## HITS:1 COG:MTH341 KEGG:ns NR:ns ## COG: MTH341 COG1035 # Protein_GI_number: 15678369 # Func_class: C Energy production and conversion # Function: Coenzyme F420-reducing hydrogenase, beta subunit # Organism: Methanothermobacter thermautotrophicus # 10 190 11 211 406 67 26.0 4e-11 MIEINDKVNCSGCTACYAVCPQSAIEMKLDEEGFKYPRVDKNRCVECGLCNSVCPILNKR KIGEETTSGYIVQNKNDSIRLKSTSGAAINAIAEYVISCGGVVFGCEFSDDRVCRHTMAT DISDLGKFQGSKYVQSELGDCFKSIKEQLDADKKVLFIGMPCQVAGLESYIKRNKENLYL IDLACHGVPSPGVFKDFIQFLEHKYKGKISNFVFRDKTYGYSATNIKVYFANGKTIDCRN DIKTFTRLMFKGITLRPSCYECAFKTKHRVSDITIFDCALVGSYNKDMDDDIGTTSILCH SDKGKKLLEQPDVIDKMRIKSVDSEELILTEGSMLIASAKKSENREAFFHDRELMTYEQL TNKYAPITLKMIIGDIVKRSLRYTGPIGKAVILYNKKKSIEKYQREFMGRSGENETAD >gi|226332959|gb|ACII01000060.1| GENE 17 17862 - 19022 209 386 aa, chain - ## HITS:1 COG:no KEGG:BT_0603 NR:ns ## KEGG: BT_0603 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 382 1 368 370 158 31.0 3e-37 MKIGICTCYDNHNYGSMLQAYATYVKFVELGYDCEFLCYRKKLNTIQKIKWLPRLLNKYL MKEKFRLIEKKIQLKKHPDIKMKDAIRQRAFDRFLENFLKDKVSAPYIGFEALRAGSKNY DVVVAGGDQLWIPAGLPTNFFNLMFADEKVKRVSYSTSFGVTEVPFYQKRRTIEYLNRIE MLGVREKSGAELIERLTGRKAVVVLDPTLLLNKEEWAQAIPVKPVLENDYIFCYFLGANS EHRKVAEELSRKTGIPLYDMPHIDEFVPYDLQFNAEHLYDIDSEQFVNLIRYAKYICTDS 
FHCTVFSILHHKQFLAFDRYSDPLMMSRNSRIANLCDITGLSSRRYTGDVLEQMKKPIDY DRVDERLEKEKAISVAYIREALGKND >gi|226332959|gb|ACII01000060.1| GENE 18 19035 - 20117 435 360 aa, chain - ## HITS:1 COG:VC0925 KEGG:ns NR:ns ## COG: VC0925 COG0438 # Protein_GI_number: 15640941 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Vibrio cholerae # 6 356 15 363 365 177 29.0 4e-44 MKRVCMIVPNRMVKGGIAAVVNGYRGSQLEKDYEITYVESYKDGSKFDKLLKGICGYFHF AYVLMFHKPDVVHIHSSFGPSFYRKMPFIYMASWRKIPIVNHIHGADFDEFYVNAPEEKK AKIKKVYSKCNVLIALSEEWKERLSQIVPEDRIEIIENYSVLHEDALEERMQRECNNTVL FLGELGKRKGCYDIPAVIAQVKKSIPDVIFVLAGAGSEADEKAIKELIAEKGISDNVKFP GWVRGDTKDKLLREADVFFLPSYNEGMPMSVLDAMGYGLPVVSTNVGGIPKIVHDGENGY CCDPGNVNQFAKGITEILLDRKERKSFGEASWKIVKEGYSLEAHLNRIEQAYKQVLLMDV >gi|226332959|gb|ACII01000060.1| GENE 19 20120 - 20887 407 255 aa, chain - ## HITS:1 COG:slr1118 KEGG:ns NR:ns ## COG: slr1118 COG1922 # Protein_GI_number: 16329226 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Teichoic acid biosynthesis proteins # Organism: Synechocystis # 45 247 35 237 251 169 39.0 4e-42 MEYKRKVDKKCIPTCNIMGVNIAAINMEWLVDYLEKNISEIKGDYVCVSNVHTTVTSFED ADYCAIQNGGLMAIPDGGPLSTVGQKRGHKNMERTTGPSLMGEIFEISAKKGYRHYFYGS KEETLELLQKKLMEKYPEIQIAGMYSPPFRPLTEEEDKVIIERINETKPDFVWVGLGAPK QEKWMAAHQGKIDGLMLGVGAGFDYYAENIKRAPMWMQKHNLEWVYRLVQDPKRLFKRYW STNTKFIWNAMIRGK >gi|226332959|gb|ACII01000060.1| GENE 20 21223 - 22677 1145 484 aa, chain - ## HITS:1 COG:all4160_2 KEGG:ns NR:ns ## COG: all4160_2 COG2148 # Protein_GI_number: 17231652 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. 
PCC 7120 # 277 480 33 221 226 208 50.0 2e-53 MYRKKSRGWYKHKDFILIDLACLFMALILAYLIRNGSVRNLFTLGIYRNVMIFVILVDLF LIILFESYKGVLRRGRYQELRSVIQQMILIELASGLYLFTVSGGHTFSRIILYLMGIFYV LLSYCTRIIWKKRLLHKMAEGGEHSLYIVTNYDLASKVIQNVKEHNYNRYNINGLILIDK DMTGKEIAGVQVVADLNNAPSFICQQWVDEVFVNVDETYPYPQELIEELLEMGMPVHVNL AKVRSTPGQKQFVEAIGGYTVLTTTMNYATDRQALAKRVLDILGGLVGCFLTGIIFIFIA PAIYISSPGPIFFSQTRIGKNGKPFKMYKFRSMYMDAEERKAELMAQNKMSDGRMFKLDF DPRVIGNKILPDGTRKTGIGEFIRKTSLDEFPQFWNVLNGSMSLVGTRPILQDELRQYEL HHRARIAIKPGITGMWQVSGRSDITDFEEVVRLDTEYISNWNFGLDIKILFKTVIMVLKR EGSV >gi|226332959|gb|ACII01000060.1| GENE 21 22762 - 23085 367 107 aa, chain - ## HITS:1 COG:no KEGG:Elen_2445 NR:ns ## KEGG: Elen_2445 # Name: not_defined # Def: NusG antitermination factor # Organism: E.lenta # Pathway: not_defined # 4 107 67 171 172 87 46.0 1e-16 AQNPEKLVNGLRKVIGLTKLIGTGDEIVPLVQEEIDLLMKIGTDKQLVEMSSGIIENDRV QILAGPLMGMEGNIRRIDRHKRTAYLEIEMFGRTVEMKGGLEIIRKE Prediction of potential genes in microbial genomes Time: Sat May 28 19:41:35 2011 Seq name: gi|226332958|gb|ACII01000061.1| Ruminococcus sp. 5_1_39B_FAA cont1.61, whole genome shotgun sequence Length of sequence - 8785 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 360 - 393 4.5 1 1 Tu 1 . - CDS 416 - 1042 837 ## gi|253578794|ref|ZP_04856065.1| predicted protein - Prom 1112 - 1171 7.2 - Term 1124 - 1182 0.1 2 2 Op 1 . - CDS 1188 - 2219 904 ## COG1316 Transcriptional regulator 3 2 Op 2 . - CDS 2256 - 3218 897 ## gi|253578796|ref|ZP_04856067.1| predicted protein 4 2 Op 3 5/0.000 - CDS 3282 - 3992 683 ## COG0489 ATPases involved in chromosome partitioning 5 2 Op 4 2/0.000 - CDS 4014 - 4793 864 ## COG3944 Capsular polysaccharide biosynthesis protein - Prom 4873 - 4932 4.0 6 2 Op 5 . 
- CDS 4934 - 5680 618 ## COG4464 Capsular polysaccharide biosynthesis protein - Prom 5739 - 5798 4.6 - Term 5745 - 5804 17.0 7 3 Op 1 4/0.000 - CDS 5828 - 6904 1129 ## PROTEIN SUPPORTED gi|227872165|ref|ZP_03990534.1| possible ribosomal protein S1 8 3 Op 2 2/0.000 - CDS 6885 - 7763 553 ## PROTEIN SUPPORTED gi|229229955|ref|ZP_04354520.1| SSU ribosomal protein S1P; 4-hydroxy-3-methylbut-2-enyl diphosphate reductase - Prom 7826 - 7885 4.7 9 3 Op 3 . - CDS 7952 - 8623 217 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 - Prom 8710 - 8769 4.8 Predicted protein(s) >gi|226332958|gb|ACII01000061.1| GENE 1 416 - 1042 837 208 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578794|ref|ZP_04856065.1| ## NR: gi|253578794|ref|ZP_04856065.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 208 1 208 208 363 100.0 2e-99 MKIAKKFMTLALAAALGVGCATVGVGAAPSNTSDLAAVAPSRTSETTVDSEHSSIYEVVK ELEKTQGFADLQAKYKDLADAFKQINDGTMKMEDFIKVLKSLSVDEADKADLEDAIAKLE GKMIVTFVNEFNVLKPEEAKPNDDGTYEVKLNVPSLTDTMEGIQLWVYSKDGTKKFVVID PVNIDKEGKTLTVNLNDGDFFMVIADAK >gi|226332958|gb|ACII01000061.1| GENE 2 1188 - 2219 904 343 aa, chain - ## HITS:1 COG:BH3670 KEGG:ns NR:ns ## COG: BH3670 COG1316 # Protein_GI_number: 15616232 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 19 320 10 279 304 74 27.0 2e-13 MKKNTRKAIRQRKICMLLIIILIIVLSILGGGYYLLSQKNAALPKIGGQNSDSRNQTDLN QNSDTVEYKGETYKYNDHLSNYLFLGIDTREAVDTYQSQADAGQADAIFLVSMDRATEKI KVLFLPRDSMTRIEVFNPYGQSLGETTDHLNIQYAFGDGKEKSCELMKTAVSNMLDGLPI QGYCSMNMDGISVITDFVGGIQLTIPDDSLADVNPEYKKGAVVDITGETAEQFVRYRDID KTQSALVRQERQKTFLQALVQKAQEKAGEDAGFVTGLYDSVKSYTVTNMGNDIFAKLLAA SQNGITDTETVPGEGTHGENFDEYHIDEDALSDLIISMFYEKI >gi|226332958|gb|ACII01000061.1| GENE 3 2256 - 3218 897 320 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578796|ref|ZP_04856067.1| ## NR: gi|253578796|ref|ZP_04856067.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 12 320 22 330 330 533 100.0 1e-150 MSKIRKKKIKKTNILLGFICILLAVLLGFILVTSYKSEQKESRHLSEIADDQQSGIEDYE AVKERAKELEEISSEEAEDTEDIKSKNTEDTKDTEKTDTADTEDADKTEDTKDKTTAASG IVCWGDDLINGEESNTYSYMTVLQKLLTDNGYNMTVINKTLQGGGTLSMMKMAGVSDETI QSYITKHQQAANGAQLNVTETGIRDLTEEQTTRNDMDCIPVIFMGYYGGWNHDPAELADQ QEQILNTFQNKDQFIVVGTRPMDGSVTSEALDQVLSQKWGEHYISLADVTAQPSSTYEAQ QAMAEAILQKLQELNYISKN >gi|226332958|gb|ACII01000061.1| GENE 4 3282 - 3992 683 236 aa, chain - ## HITS:1 COG:SP0349 KEGG:ns NR:ns ## COG: SP0349 COG0489 # Protein_GI_number: 15900278 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Streptococcus pneumoniae TIGR4 # 18 225 19 224 227 160 40.0 1e-39 MKKLNLTDKRNPDYFYSEAIKTLRTNIQLSGQSIKTILVTSCYPNEGKSDIVLSLAQELG SIGKKVLLLDADIRKTAYAGRLGVEEEVKGLSQLLSGQVGLQEIIYSTNFPNMDIIFGGP SAPNPSGLLSENIFKVFLKEIREYYNYILIDTPPIGTVIDAAVIGRCCDGAVFLIEPGNV RYRDAQKAFKQLERSGCRILGAVMNKIDTSDDKYYSSYYKHYGEYYHRSEEEAQIK >gi|226332958|gb|ACII01000061.1| GENE 5 4014 - 4793 864 259 aa, chain - ## HITS:1 COG:SP0348 KEGG:ns NR:ns ## COG: SP0348 COG3944 # Protein_GI_number: 15900277 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 22 239 5 225 230 122 34.0 5e-28 MLDNLNSNFESSKAQRPDANDDEVEIDLLEIFYALKKKILLVLMVALAGGCIAAAYTQFL MTPIYSSTSSILVLSKETTLTSLADLQLGASLTSDYTVLITSTPVMEQVISDLDLDMTAE QLKGSVSINNPTDTRILEITVNNTDSKMAKKIVDEIANVSSSYIGDKMEVIPPKIIEVGK IATVRTSPSVKKNAALGFLLGFLACAAIVVVYAVMDDTIKTEEDIEKYLGVSVLAKVPDR KDFINSKNRKSKNKNKKHH >gi|226332958|gb|ACII01000061.1| GENE 6 4934 - 5680 618 248 aa, chain - ## HITS:1 COG:SP0347 KEGG:ns NR:ns ## COG: SP0347 COG4464 # Protein_GI_number: 15900276 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 8 238 3 238 243 168 40.0 9e-42 
MSIKGIYDIHCHIVPGVDDGATDIGETVKLLRMEYEQGVRTVIATPHFRFRMFETPAEKV REQFRLVEKAASEISPDLHVYLGCEFHANMEMLPMLREQKVMTMAGSRYVLTEFSHNSEE SYIRERLGALLSGGYKPIMAHIERYEATRNSLDFVEELADMGVYMQINADSITGKDGFFT KRYCNKIMKDGLLHFVGSDCHNSTKRSSRIGEAYRMVSAKFGQDYADELFIHNPAEILKQ KQKHEATD >gi|226332958|gb|ACII01000061.1| GENE 7 5828 - 6904 1129 358 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227872165|ref|ZP_03990534.1| possible ribosomal protein S1 [Oribacterium sinus F0268] # 1 354 1 353 367 439 62 1e-123 MSELSFEQMLEDSVKTIRNGEIVQGTVIDVKEDEIILNIGYKADGIITKNEYSNDASLVL TDAVHVGDTMEAKVLKVNDGEGQVTLTYKRLAAEKGNKRLEEAFENQEVLKAPVTQVLDG GLCVNVDEARVFIPASLVSDTYEKDLSKYADQEIEFVITEFNPRRRRIIGNRKQLLLAEK AEKQKELMEKIHVGDTVEGTVKNVTDFGAFIDLGGADGLLHISEMSWGRVENPKKVFTVG DKVKVLIKEINGEKIALSLKFPEANPWLTAAEDFAVGNVVKGKVARMTDFGAFVELAPGV DALLHVSQISREHVAKPSDVLSIGQEIEAKVVDFNGEDKKISLSMKALEAPAEEATEE >gi|226332958|gb|ACII01000061.1| GENE 8 6885 - 7763 553 292 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229229955|ref|ZP_04354520.1| SSU ribosomal protein S1P; 4-hydroxy-3-methylbut-2-enyl diphosphate reductase [Desulfotomaculum acetoxidans DSM 771] # 1 288 1 274 676 217 41 1e-134 MKVKVAETAGFCFGVKRAVDKVYELIGTEQKPIFTLGPIIHNEGVVADLEARGVHVITEA DLDSPDDTLQNGTVVIRSHGVGKAIYDKLKEKNISYVDVTCPFVLKIHRIVEKESLAGNH IIIIGDKDHPEVQGICGWCQGPYTVIRNKEEAEAFVPPKGKKISIVSQTTFNYNKFKDLV EILCKKRYDNNVLNILNILNTICNATEERQREAKNIAGEVDTMLVVGGRHSSNTQKLFEI CKKECGNTYYIQTPVDLDSEMFQCSSYVGITAGASTPNKIIEEVQEHVRIKF >gi|226332958|gb|ACII01000061.1| GENE 9 7952 - 8623 217 223 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. 
Nichols] # 5 213 37 279 863 88 26 2e-17 MGCNIAIDGPAGAGKSTIAKKVAKELSFIYVDTGAMYRAMALYLLNHGVNGENQEEIEAV CSGADISIEYKNGEQIVILNGENVNAMIRTEQVGNMASKSSANPKVRAHLLKLQRTLAEK NDVVMDGRDIGTTILPNAEVKIYLTASADTRAKRRALEYEQKGESFDLDQIRKDIIERDE RDMNREISPLKQADDAVLVDSSEMGIDQVVDTILDVYNKKVQK Prediction of potential genes in microbial genomes Time: Sat May 28 19:42:16 2011 Seq name: gi|226332957|gb|ACII01000062.1| Ruminococcus sp. 5_1_39B_FAA cont1.62, whole genome shotgun sequence Length of sequence - 65134 bp Number of predicted genes - 57, with homology - 55 Number of transcription units - 25, operones - 14 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1335 1054 ## COG2081 Predicted flavoproteins - Prom 1476 - 1535 3.8 - Term 1484 - 1517 4.5 2 2 Op 1 4/0.000 - CDS 1552 - 2964 1676 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase - Prom 3041 - 3100 2.6 - Term 3026 - 3050 -1.0 3 2 Op 2 . - CDS 3107 - 4255 1561 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 4 2 Op 3 . - CDS 4272 - 4640 661 ## COG4770 Acetyl/propionyl-CoA carboxylase, alpha subunit 5 2 Op 4 . - CDS 4668 - 5420 871 ## EUBELI_00921 hypothetical protein 6 2 Op 5 . - CDS 5435 - 6871 1920 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) - Prom 6929 - 6988 7.2 + Prom 7400 - 7459 4.9 7 3 Op 1 1/0.200 + CDS 7535 - 9535 2472 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 8 3 Op 2 8/0.000 + CDS 9572 - 10450 1082 ## COG0169 Shikimate 5-dehydrogenase + Term 10455 - 10495 2.2 + Prom 10605 - 10664 4.9 9 3 Op 3 . + CDS 10748 - 11482 865 ## COG0710 3-dehydroquinate dehydratase + Term 11640 - 11687 7.2 - Term 11632 - 11669 5.5 10 4 Op 1 . - CDS 11684 - 13003 1176 ## COG3681 Uncharacterized conserved protein 11 4 Op 2 . - CDS 13075 - 13992 819 ## COG4989 Predicted oxidoreductase 12 4 Op 3 . 
- CDS 14038 - 15441 960 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 13 4 Op 4 . - CDS 15422 - 16969 742 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 14 4 Op 5 38/0.000 - CDS 17015 - 17860 555 ## COG0395 ABC-type sugar transport system, permease component 15 4 Op 6 35/0.000 - CDS 17860 - 18744 422 ## COG1175 ABC-type sugar transport systems, permease components - Prom 18769 - 18828 4.6 - Term 18790 - 18839 0.1 16 4 Op 7 . - CDS 18886 - 20136 1424 ## COG1653 ABC-type sugar transport system, periplasmic component 17 4 Op 8 . - CDS 20159 - 21106 978 ## GALLO_1275 hypothetical protein - Prom 21278 - 21337 9.5 + Prom 21221 - 21280 8.5 18 5 Tu 1 . + CDS 21343 - 23169 779 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Term 23175 - 23208 2.3 - Term 23163 - 23196 2.3 19 6 Op 1 . - CDS 23201 - 23479 158 ## LGAS_0243 hypothetical protein 20 6 Op 2 . - CDS 23463 - 23735 232 ## Acfer_0671 hypothetical protein 21 6 Op 3 . - CDS 23793 - 25157 1421 ## CDR20291_1888 hypothetical protein - Prom 25205 - 25264 6.5 22 7 Op 1 4/0.000 - CDS 25390 - 26589 1314 ## COG0477 Permeases of the major facilitator superfamily 23 7 Op 2 1/0.200 - CDS 26610 - 27494 1066 ## COG0169 Shikimate 5-dehydrogenase - Prom 27593 - 27652 6.4 - Term 27650 - 27685 1.8 24 8 Op 1 . - CDS 27772 - 29697 2652 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family - Prom 29725 - 29784 5.3 - Term 29751 - 29797 -0.0 25 8 Op 2 . - CDS 29827 - 30669 1223 ## COG1082 Sugar phosphate isomerases/epimerases - Prom 30738 - 30797 10.0 26 9 Tu 1 . - CDS 30806 - 31009 58 ## + Prom 30697 - 30756 9.4 27 10 Tu 1 . + CDS 30968 - 31876 1054 ## COG0583 Transcriptional regulator + Term 32121 - 32156 -0.2 - Term 31888 - 31953 19.5 28 11 Op 1 . - CDS 32028 - 32816 945 ## COG0703 Shikimate kinase - Prom 32921 - 32980 7.1 29 11 Op 2 . 
- CDS 33071 - 34303 1120 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 34346 - 34405 2.7 30 12 Op 1 . - CDS 34484 - 35062 486 ## gi|253578830|ref|ZP_04856101.1| conserved hypothetical protein 31 12 Op 2 . - CDS 34963 - 35277 148 ## gi|253578831|ref|ZP_04856102.1| conserved hypothetical protein 32 12 Op 3 21/0.000 - CDS 35296 - 35865 642 ## COG1386 Predicted transcriptional regulator containing the HTH domain 33 12 Op 4 . - CDS 35909 - 36700 881 ## COG1354 Uncharacterized conserved protein 34 12 Op 5 . - CDS 36746 - 37576 521 ## COG1408 Predicted phosphohydrolases 35 12 Op 6 . - CDS 37635 - 38669 1244 ## COG0303 Molybdopterin biosynthesis enzyme - Prom 38716 - 38775 9.3 36 13 Tu 1 . - CDS 38820 - 39977 701 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 40078 - 40137 2.7 - Term 40135 - 40184 12.4 37 14 Op 1 . - CDS 40188 - 41786 1374 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes - Term 41817 - 41851 4.0 38 14 Op 2 . - CDS 41865 - 42041 280 ## PROTEIN SUPPORTED gi|227872333|ref|ZP_03990687.1| ribosomal protein S21 - Prom 42114 - 42173 4.5 - Term 42239 - 42290 1.8 39 15 Op 1 . - CDS 42331 - 44970 3544 ## COG0013 Alanyl-tRNA synthetase 40 15 Op 2 . - CDS 45039 - 46268 867 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 41 16 Tu 1 . - CDS 46706 - 48079 1525 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 48175 - 48234 7.2 + Prom 48088 - 48147 4.3 42 17 Tu 1 . + CDS 48347 - 48667 265 ## + Term 48683 - 48718 6.5 43 18 Tu 1 . - CDS 48742 - 48927 195 ## gi|253578844|ref|ZP_04856115.1| predicted protein - Prom 49033 - 49092 7.7 + Prom 48839 - 48898 8.3 44 19 Op 1 . + CDS 49093 - 49473 249 ## gi|253578845|ref|ZP_04856116.1| conserved hypothetical protein 45 19 Op 2 . 
+ CDS 49466 - 49738 339 ## gi|253578846|ref|ZP_04856117.1| conserved hypothetical protein + Term 49829 - 49875 2.1 - Term 49816 - 49861 6.2 46 20 Op 1 6/0.000 - CDS 49894 - 50418 290 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 47 20 Op 2 2/0.000 - CDS 50435 - 50974 342 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 48 20 Op 3 . - CDS 50958 - 52295 1081 ## PROTEIN SUPPORTED gi|229230948|ref|ZP_04355465.1| SSU ribosomal protein S12P methylthiotransferase - Prom 52316 - 52375 3.3 - Term 52301 - 52349 9.2 49 21 Op 1 . - CDS 52383 - 52742 558 ## EUBREC_1674 hypothetical protein 50 21 Op 2 4/0.000 - CDS 52785 - 53408 834 ## COG0194 Guanylate kinase 51 21 Op 3 4/0.000 - CDS 53401 - 53673 287 ## COG2052 Uncharacterized protein conserved in bacteria 52 21 Op 4 . - CDS 53685 - 54566 793 ## COG1561 Uncharacterized stress-induced protein - Prom 54629 - 54688 5.0 + Prom 55051 - 55110 4.2 53 22 Tu 1 . + CDS 55173 - 56918 1258 ## COG1293 Predicted RNA-binding protein homologous to eukaryotic snRNP + Term 56993 - 57041 5.5 + Prom 56981 - 57040 4.0 54 23 Op 1 . + CDS 57063 - 58619 1687 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 55 23 Op 2 . + CDS 58647 - 60011 1196 ## COG0534 Na+-driven multidrug efflux pump + Term 60037 - 60093 -0.1 56 24 Tu 1 . - CDS 60091 - 62493 3174 ## COG0495 Leucyl-tRNA synthetase - Prom 62527 - 62586 5.0 - Term 62627 - 62674 7.3 57 25 Tu 1 . 
- CDS 62761 - 64851 2578 ## COG0480 Translation elongation factors (GTPases) - Prom 64958 - 65017 10.8 Predicted protein(s) >gi|226332957|gb|ACII01000062.1| GENE 1 3 - 1335 1054 444 aa, chain - ## HITS:1 COG:CAC1849 KEGG:ns NR:ns ## COG: CAC1849 COG2081 # Protein_GI_number: 15895124 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 14 444 1 389 393 394 49.0 1e-109 MSKVIIIGGGAAGMMAGVFAARNHHEVHILEKNEKLGKKVFITGKGRCNVTNACDTEELF PAMMSNPKFLYSSFYSFTPQDVMEFFEKAGVPLKVERGNRVFPQSDHSSDIIRALERELK KAGAKIHLHTAVQEIVKKPVTDSANTLESEATLTESGSDAGKSRKGKKSPDIPQEKITGV ILTDGTFMEGDAVIVATGGFSYQSTGSTGDGYRFARELGLKVTDIAPSLVPLKTKEDYVP KLQGLSLKNTGLTIKNGKKVLYEDFGEMMFTHFGVTGPMILSASAHIGAKLAKAPNGELS AYLDLKPALTKEQLDARILREFEAGPNKQFKNVIGVLFPSSLTPVMLELGGIPAEKKIHD ISREERQHFIDLIKAFPFTITGMGEFKEAIITRGGVSVKEINPGTMESKKISGLYFAGEV LDLDAVTGGYNLQIAWSTAYLAAQ >gi|226332957|gb|ACII01000062.1| GENE 2 1552 - 2964 1676 470 aa, chain - ## HITS:1 COG:FN1376 KEGG:ns NR:ns ## COG: FN1376 COG5016 # Protein_GI_number: 19704711 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Fusobacterium nucleatum # 9 453 4 448 448 538 59.0 1e-153 MAELEKKPVKIVETILRDAHQSQIATRMTTEQMLPIVDKLDKVGYHAVECWGGATFDASL RFLHEDPWERLRKLRDGFKNTKLQMLFRGQNILGYRPYADDVVEYFVQKSAANGIDIIRI FDCLNDLRNLQTAVSAANKEKAHAQVALSYTLGDAYTLEYWTDIAKRIEDMGADSICIKD MAGLLLPNKATELVTALKETVKIPIDLHTHYTSGVASMTYLKAVEAGVDIIDTAMSPFAL GTSQPATEVMVETFKDTPYDTGFDQKLLSEIADYFRPIRDEALDSGLLNPKNLGVNIKTL LYQVPGGMLSNLTSQLKEQHAEDKFYDVLEEVPRVRKDLGEPPLVTPSSQIVGTQAVFNV LMGERYKMVTKETKDILLGKYGATVKPFNPEVQKKCIGDEKPITCRPADLLDNELEKLES EMAQYKEQDEDVLTYALFPQVAMDFFKYRQAQKTKVDEKAADKKNGAYPV >gi|226332957|gb|ACII01000062.1| GENE 3 3107 - 4255 1561 382 aa, chain - ## HITS:1 COG:PAB1772 KEGG:ns NR:ns ## COG: PAB1772 COG1883 # Protein_GI_number: 14521092 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta 
subunit # Organism: Pyrococcus abyssi # 6 381 5 400 400 356 53.0 3e-98 MSYVTETLSNLIHQTAFFNLTWGNYLMIAVACVFLYLAIKKGFEPLLLVPIAFGMLLVNI YPDIMANPEDMSNGVGGLLHYFYILDEWSILPSLIFMGVGAMTDFGPLIANPISFLMGAA AQLGIYAAYFLAIFLGFNGKAAAAISIIGGADGPTSIFLAGKLGQSALMGPIAVAAYSYM SLVPIIQPPIMKLCTTEKERKIKMDQLRPVSKLEKILFPIVITIVVCLILPTTAPLVGML MLGNLFRECGVVKQLTETASNALMYIVVILLGTSVGASTSAEAFLNADTLKIVVLGLVAF AIGTFGGCMLGKLLCKLTHGKINPLIGSAGVSAVPMAARVSQKVGAEADPTNFLLMHAMG PNVAGVIGTAVAAGTFMAIFGV >gi|226332957|gb|ACII01000062.1| GENE 4 4272 - 4640 661 122 aa, chain - ## HITS:1 COG:VNG1532G KEGG:ns NR:ns ## COG: VNG1532G COG4770 # Protein_GI_number: 15790515 # Func_class: I Lipid transport and metabolism # Function: Acetyl/propionyl-CoA carboxylase, alpha subunit # Organism: Halobacterium sp. NRC-1 # 2 122 488 610 610 65 36.0 3e-11 MKSYTITVNGTAYEVTVEETGSVSAPAAAPKAAPKAAPAAAPKAAAPAAGAGAVKVTASV PGKVVKVAASVGQAVKAGDSVVILESMKMEIPVVAPQDGTIASIDVAEGASVENGDTLAT MN >gi|226332957|gb|ACII01000062.1| GENE 5 4668 - 5420 871 250 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00921 NR:ns ## KEGG: EUBELI_00921 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 72 250 82 263 263 110 42.0 3e-23 MKMIKKITSVFMTFLVMAALVIGSTSVVFAADKVDDTVKSTLVTTAEGLTDAIVQLKDEE IENYMSSGDDFTTSAMQSWQTSKDELGAKKDSNGETTVTFKDDQYTVTVPLKFEKADANF VYVFDSQGTPTSMSVDVQYGMGKTLQRAGLNTLMGIGTVFVMLILLSLLISLFRFIPNPE AKKAAEAKAAKAAKEAEAAAIAAAPAQAEENLADDGELVAVIAAAIAAAEGTTTDGFVVR SIRKVKRNRR >gi|226332957|gb|ACII01000062.1| GENE 6 5435 - 6871 1920 478 aa, chain - ## HITS:1 COG:TM0716 KEGG:ns NR:ns ## COG: TM0716 COG4799 # Protein_GI_number: 15643479 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Thermotoga maritima # 9 478 37 513 515 357 41.0 3e-98 MSNATQSLAGTRITSLLDANSFVEIGQGVTARSTDFNMTANKAPSDGVITGYGVIDDKLV YVYSQDASVLNGTVGEMHAKKITRLYDLAMKTGAPVIGLVDCAGIRLQEATDALEAFGQI YLKQTLASGVIPQITAIFGTCGGGMAVIPSLTDFTFMESKKGKMFVNTPNALEGNNTDKC 
DTAAADFQSKETGVIDGVGTEDEILGQIRSLVSLLPSNNEDTDNYTECTDDLNRVCADLA NCAGDTAIALSQIADNGEFFETKADYAKDMVTGFIRLNGATVGAVANRSEVYDAEGKKTE TFDGSISARGARKAADFVKFCDAFDIPVLTLTNATGFMATLCSEKMMAKSVGELVAAFAD ATVPKVNVIIGKAYGTAYVAMNSKSIGADLVYAWDNAEIGMMDASLAAKIMYADADAAEL NEKAAQYRELQNGVASAAARGYVDTVISPADTRKYVIGAFEMLFTKREDRPSKKHGTV >gi|226332957|gb|ACII01000062.1| GENE 7 7535 - 9535 2472 666 aa, chain + ## HITS:1 COG:lin0492_1 KEGG:ns NR:ns ## COG: lin0492_1 COG1902 # Protein_GI_number: 16799567 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Listeria innocua # 3 364 2 364 364 519 66.0 1e-146 MKSNYQHIFTPLTVKNMTIKNRIVMMPMGTNYGEQNGEMSFLHINYYKERAKGGTGLIIV ENASVDSPQGSNGTTQLRIDHDNYLPRLYKFCEEIHKYGTCIAIQINHAGASAVSARTNM QPVSASDIPSKEGGEIPRPLSVEEIHHIVKKYGEAAKRAQAAGFDAVEIHAGHSYLISQF LSPLTNKRTDEFGGSVENRTRFCRMVIEEVRKQVGPFFPIMLRLSADELMEGGNTLEDTL EYLEYVQDEVDIFDVSCGLNGSIQYQIDANYMKDGWRSYMPKAVREKFGKPCISMGNIRN PKVAEQIIADGDADLIGMGRGLIAEPAWVNKVATGRECDLRKCISCNIGCAGNRIGFNRP IRCTVNPAVLEGDVYKNQKVNKNCNVVVIGGGTAGLEAACTAAEVGCNTFLLEKGETLGG LASVISKIPAKKRLADFPNYLIHRAEQLENLYIFTNTEGTPENIRKFHPNLIVSSTGSAP LLPPIRGLHDRIDKEGSKVASILGMINHINDYPEDMTGKKVVVVGGGAVGLDVVEFFAAR NAEISIVEMMDQIGRDLDPVTKNDMKDQMKKHHVAQLTKTALQEVKDSSFLVKDAEGERE LPFDYGFVCLGMRAQGQLFAELSDAFVSDDVEILNIGDSKRARRIIDGTLEGRNILNTLT QMGYLQ >gi|226332957|gb|ACII01000062.1| GENE 8 9572 - 10450 1082 292 aa, chain + ## HITS:1 COG:lin0493 KEGG:ns NR:ns ## COG: lin0493 COG0169 # Protein_GI_number: 16799568 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Listeria innocua # 1 286 5 290 291 387 65.0 1e-107 MAERITGHTELIGLMAYPIRHSSSPAMQNEAFAKLGYDYAYLAFEVGADEIEDAVKAIRT LKMRGSNVSMPNKTLVGKYLDELSPAAELCGAVNTIVNDNGHLTGHITDGIGFMSALKDN DIDVIGKKMTIVGAGGAATAIEIQAALDGVAEITIFNRKDEFWDRAVSTVEKINTKTSCH AVLYDLADLDKLKAEMDDSFIFVNATGVGMKPLEGQSVVPDKSYFRPELIVIDVPYSPLE TKMRSMAKEVGCKTMNGLGMMLFQGSAAFELWTGEPMPIEHMKEILHISYDD >gi|226332957|gb|ACII01000062.1| GENE 9 10748 - 11482 865 
244 aa, chain + ## HITS:1 COG:STM1358 KEGG:ns NR:ns ## COG: STM1358 COG0710 # Protein_GI_number: 16764709 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase # Organism: Salmonella typhimurium LT2 # 1 243 10 252 252 268 57.0 5e-72 MIGEGRPKICVPIVGKTKTDILEEAKKITTLPVDVVEWRVDWFDDVFATEKVLETAKELQ EVLKDIPVLLTFRTSKEGGEKEISVNDYAALNIAAAQSGYVDLIDVEAFTGDEVVKTIIN AAHEAGVKVIASNHDFFKTPEKEEIIRRLRMMQDFGADIPKMAVMPTCKQDVLTLLSATL EMSEKYADRPIITMSMAGTGVVSRLTGETFGSALTFGAASKASAPGQVGVHELKQVLDII HSSL >gi|226332957|gb|ACII01000062.1| GENE 10 11684 - 13003 1176 439 aa, chain - ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 # Protein_GI_number: 19704482 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 20 433 3 409 411 310 41.0 5e-84 MEKKVYESYLEILREELVPALGCTEPIAIAYATATAASVLEKVIKTDDDGNKKYEGHLNL YCSGNIIKNVKSVTVPNSGGMRGIEAAAVLGMVGGQAERKLEVLEGITEAERIQTAKLLE AQFCTCYLEEGVENLYIRAELTQNGHRAVVVIENKHTNIVLIKKDDEVLLEKKGFQQKKT EKNQDKRDLLNVADILEFAKIVNIKEIEEVIGRQIAMNTAISEEGLRNHYGAEVGRTLLR AYGDDVKVRARAKAAAGSDARMNGCSMPVVINSGSGNQGMTCSLPVIEYARELKVPKEKM YRALVVSNLVAIHQKTYIGSLSAYCGAVSAACGAGAAISWLNGGDYNAISKTITNALANT SGIVCDGAKSSCAAKIASAVDAAIMAYALEDADHCFQAGEGLVRDNVEDTIRNMGYVGRV GMKDTDVTILNLMINQKKV >gi|226332957|gb|ACII01000062.1| GENE 11 13075 - 13992 819 305 aa, chain - ## HITS:1 COG:CAC3378 KEGG:ns NR:ns ## COG: CAC3378 COG4989 # Protein_GI_number: 15896620 # Func_class: R General function prediction only # Function: Predicted oxidoreductase # Organism: Clostridium acetobutylicum # 1 305 1 306 306 389 60.0 1e-108 MKMFQISDKIDNVSRIGIGCMRITGLSDEKAVRSLIEGALECGINFFDHADIYAGGEAET LFGNALTPQLREKMVIQTKCAIRPGICYDFSKEYILNSVDGSLKRLKTDYVDILLLHRPD ALMEPEEVAEAFEILEKSGKVKAFGVSNHNPMQIELLNQYCGGKICIDQIQFSAAHCPTI DAGLNVNIHNDAGCDRDGGIIEYARLKKMTLQAWSPFQYGMFEGIFIGNEKFPELNKVLD RLAEKYGVTQNAIAVAWIMRHPAGIQTIVGSTNLKRIQDISKASDIVLSREEWYEIYLAA GKMLP >gi|226332957|gb|ACII01000062.1| GENE 12 14038 - 15441 960 467 aa, chain - ## 
HITS:1 COG:FN0278 KEGG:ns NR:ns ## COG: FN0278 COG0624 # Protein_GI_number: 19703623 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Fusobacterium nucleatum # 18 461 1 452 452 213 32.0 7e-55 MNPGKVEWNEENKPQPDMELLEQWISRQEYQMREDIAALIRIPSIADQQEAIPGAPYGKK CREALEQMRAFGLREQMETEDIDGHCLTIATGKGNVEIGIWNHLDVVPEGKGWIYPPYTC TEKDGYLIGRGVQDNKGPAVAVLYAMKYCREKEILNNIKVRQILGCQEESGMTDVEYYLK YKKAPEYSFVADCGFPVCCGEKGHCIVLMETVSAVEGILEFDGGTAPNSVPSFAHAVAEN KIGQKSEETAVGISGHAAFPDGTQNAIGILCGKLKMQNFSKQTKRALEFVERLSEGGYGE KTEICCSDECSGRLTCNIGQAFLHEGHLRIVIDIRYPVTKKTEDFLPKLQTEAAKSGIRI IGIEDSRPYYMNPEQPFIQTLMEAWREVTGLEGRPFVMGGGTYARKIPNAVAFGPGQERN LDRLGLPKGHGNCHCADEAELFENLKNAVKIYVFALKKLDKRIKEIR >gi|226332957|gb|ACII01000062.1| GENE 13 15422 - 16969 742 515 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 2 500 3 520 525 131 24.0 4e-30 MKVLVVEDEFYARKAMIRQIKKYDQDGLFEISEASNGEEGWEFCEKYKPELVLTDIRMPK MDGIQLLQEIRNNELEIKVIILSAYSDFEYARSAIVNGASDYLLKPIDDTALTECLNKFV TQHKIERKEALLSRKDMATQYILNSIQESKYSGFIEKNMFERVFPQYQLGVFLFLHDKPR QEIFLTELEESCGSIMLTKIRFVELKPDMWILLVRPEGDMLFFWRRIRKLLEKEDSQVKI GISNVYGANASVLDAFREAVTAIKSRIYKRESLIFAKEIKQEDFSEYYLEKEIERELEQY LKEGNESKTGTTLDKLFKDIEKVLPIRIECMELLYSQIILIYRRTIRMENHDKELDNLSE SFLKFDSVLEIESYLRRIAENICHMQPQRQSDELTNTGMDIVEKMSEYAVNNYNMDITIR NLAENVFFMNQSYISHLFAEKKGISFSAFLRMIRICHAKEFLADARWSVTDVALMSGYND TSQFIRIFRQETGMTPKKYRIFLKNGGSENESRES >gi|226332957|gb|ACII01000062.1| GENE 14 17015 - 17860 555 281 aa, chain - ## HITS:1 COG:SP1895 KEGG:ns NR:ns ## COG: SP1895 COG0395 # Protein_GI_number: 15901722 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Streptococcus 
pneumoniae TIGR4 # 8 280 5 278 278 210 42.0 2e-54 MKIKAGNDERVWLFFRNIILIGGLLLILCPLWMVLINSFKTLEEAGKNFFALPSKLNLEN YIELFTNSNYWIFLKNSFKITVITIILILIFVPSVSYSIARNFNRKYYKTIYFYIMMGLF VPSQVVLLPVTKMMSKLNMLNHAGLIILYVAFSLTQGVFLFVNYIRGLPYEIEESAQIDG CSVFQTYVKIVLPLVKPMISTLLIMDTLWIWNDFMLPLLILNRSQAIWTLPLFQYNFKTE YSFNYTMAFTAYLLAMLPMLIIYCMGQKYIVKGLTAGSVKG >gi|226332957|gb|ACII01000062.1| GENE 15 17860 - 18744 422 294 aa, chain - ## HITS:1 COG:SP1896 KEGG:ns NR:ns ## COG: SP1896 COG1175 # Protein_GI_number: 15901723 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Streptococcus pneumoniae TIGR4 # 7 290 10 293 296 215 42.0 7e-56 MNNTKRKKLSGRVTFFLFTLIPVVLYTFFYVVSVINGIRYSFTDWDGMAQKMNFIGWKNY KILLKNPNFWNAIKTTVIYSVLLVIGVIVISLILSISLNSLKKFKTLVKSVYFVPAMIGG VTIALIWDQIYYRVIPVIGKALGISWLSNSVLMSGDTALPAVVFVQIWQAVAMPTVIFIA GLQQIPEEQYESAKIDGATAFQRFRYVTFPYLLPTVTVNLILTIKQGFTSFDFPYALTGG GPVRATEVIGILIYNDAFKNMRFSMANAEACILFVIVAIFSLTQLKLTSKGGTD >gi|226332957|gb|ACII01000062.1| GENE 16 18886 - 20136 1424 416 aa, chain - ## HITS:1 COG:SP1897 KEGG:ns NR:ns ## COG: SP1897 COG1653 # Protein_GI_number: 15901724 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 28 351 30 352 419 144 29.0 3e-34 MGKKSLAALLCATMAMTACMLPATAAMADEPLKIEFFQQKGEEGPQKGYKEIIDKFNKEN SDIEIEMNTVPDAGTVLTSRISSGDIPVIFSDFPTQTQFKQKVANGYVQDLSDQDFLKNV NESALEMTKQEDGGYYALPYSRNYIGVYYNKKTFEDNGLEIPTTWEEFTAVCDKLKEAGI TPVGMHGKDPARVGHLFQAATVAWAPDGVETIGKVVSGEAKIEGDEEFKNVFEKMNTLLS YANEDALALSDTTCYENFVNGEYAMTITGSYARGTIQSINPNLEIGVFPLPNDTYDDTKC LSGIDAAICVSAQASDKEKDAAYRFLSYLADPENAQIFCDNDGAPSCITGVTSNDDGIKP MVDMINAGKTHDWMASTIDNNVTTDLYNVVQGFWANKDVDAVMKDMDASIEISSAQ >gi|226332957|gb|ACII01000062.1| GENE 17 20159 - 21106 978 315 aa, chain - ## HITS:1 COG:no KEGG:GALLO_1275 NR:ns ## KEGG: GALLO_1275 # Name: not_defined # Def: hypothetical protein # Organism: S.gallolyticus # 
Pathway: not_defined # 2 307 3 331 333 150 33.0 8e-35 MTKKERVLAVMKKEQVDMIPAGFWFHYKSDYTVQQMIDEHMKLFRTTDMDIIKIMQDYPY PISGKITCADDWYHIQVKGTDSEEFAKMAEIIRGIRKEAGKDVLIFQTMFGPFKAASMTF GDDVLMKYSKEAPEAVAAGVKIIADALEEWTKGYLEAGADGIYYSAQFGEIGRFEKNEWE ELVCPYDLQILKVAEEMPEKYNILHICGEPEYQFETHVEWFKNYPADLINWSVKDNHFSL EQGRELYSSAILGGMNNKGNVLNGSEDAIREEVKGILDAFGTKGIMIGADCTIQGENISL DLIKTAVEAAHAYKK >gi|226332957|gb|ACII01000062.1| GENE 18 21343 - 23169 779 608 aa, chain + ## HITS:1 COG:BH0792 KEGG:ns NR:ns ## COG: BH0792 COG2972 # Protein_GI_number: 15613355 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 173 581 159 572 587 140 26.0 9e-33 MQIFLSSLILVIFPTVLLSTINAISKTSTITSEYNTSTAATLTQMNQSLDTLLENALKIA DTPLLNDDARKAMITNYEKDYLSYAQDFNVFRNLMRQTNQLNSSMLTVYFQNRYGYSFEY NVKTAQQRHTIESNMEKWKKIAETSKNRTYFAPLQTDSSTGHSILPMVKILLDRYDYQET GLCYAEIDFTPIMEILSSSCETQNTLLIYNADNKLTCTINLASFSEADISSSVLSKLEDF SNTLTSQDAIDQSTLKTSLGQFVINGCINNTTQWHIVQIISNEKIAHTFHDTIISYLGIF LFCALLGLILAIFLSRILTRPVSNLCHEIDILDPSDGTQIDLKSCGSNQELRKLIDSFNG MSQRLFLSLKQNYEIQITEQQMRVQMLQFQINHHFLHNTLNVIKSLAEIHDVPEIETIAT CMSELVRYNLEKFPVATLQEELQQVQRYMTIQNIRFPGKFCYDINVPPEFLQMNLPVFLF QPLVENSVEHGFSNKENECYISISCQLENELLHFLVADNGSGMSQEQLKTFQCSGKTSPK GHHSIGLANVNQRLRSYYGEEYGLLVESIPGEGTIIDIVLPSSAINIEPSFFQPELSKTI SLSQNPSQ >gi|226332957|gb|ACII01000062.1| GENE 19 23201 - 23479 158 92 aa, chain - ## HITS:1 COG:no KEGG:LGAS_0243 NR:ns ## KEGG: LGAS_0243 # Name: not_defined # Def: hypothetical protein # Organism: L.gasseri # Pathway: not_defined # 1 85 1 83 93 79 42.0 3e-14 MPQILRIGPYSIYFWSNEGDPLEPIHVHVSEGRASATATKIWITSTGKTILSNNNSKIPE KILKRLMRMIEANSSDIIDEWLNRFGEIRYFC >gi|226332957|gb|ACII01000062.1| GENE 20 23463 - 23735 232 90 aa, chain - ## HITS:1 COG:no KEGG:Acfer_0671 NR:ns ## KEGG: Acfer_0671 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 1 87 1 86 89 102 55.0 6e-21 
MMYPFMTLNDDTEITHSEMKADGKVKVYIETPDEFGGFHNATCWLPDYKWEDIEGYSDTE MAYFKQLIRNNAHLIIEFSQEGGILNAANS >gi|226332957|gb|ACII01000062.1| GENE 21 23793 - 25157 1421 454 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1888 NR:ns ## KEGG: CDR20291_1888 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 5 443 3 435 463 367 41.0 1e-100 MAHTRTQRRENELPYIPFGPFQIRLPFIHYKIESVEFIQGLILGVTALAAVPYLEQYLGL PYELAWSCVIIETFLYLLHSLLGDPVVPGWITPTLPLTIVFLEGFPMGKERIQAMIALQM LVGLVFIFMGVTRLADKFVHAVPNSVKGGILLAAPVTVMAGQLGENGNMHKYPLAIVAGV GLLILISFSDKYQEKRKNNKFLDLVARYGNLFPYLIAMVVGLLVGELDAPGLEIGTFIKI PQFKDIIDQVSIFGVGIPPASMFIKALPLALVCYVIAFGDFVTTETLVSEARQARDDEYI DFNSSRSNLVSGLRNLILSIIAPFPPLSGPLWVGMTVSVSMRYKEGKKAMKSLLGGMASF RLATFLSVLIIPVVSFFRPIFGVGSSITLMFQAFVCARIGMDYCKSDRDKMIAGVMAAVL ATQGTAWASAWALAVGFGLNVFLSNWNPLEKNHR >gi|226332957|gb|ACII01000062.1| GENE 22 25390 - 26589 1314 399 aa, chain - ## HITS:1 COG:lin2339 KEGG:ns NR:ns ## COG: lin2339 COG0477 # Protein_GI_number: 16801402 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Listeria innocua # 12 393 9 389 402 414 59.0 1e-115 MKQKTNKMLIFTSLAAYFTYFIHGIGASILGQYKPEFAAAWGAKPLADGTLDVSMVVSVI AALGLGRLISLPFSGPLSDKYGRKLSGIIGVLCYVAYFFGMANSTSMAMGYAFAVIGGIA NSFLDTCVSPTCMEIYVNNPSVANLFTKFSICLSQFLLPFLIGIVASANMSYKTIFIVAG IAILIDGILILILPFPARKKAVQAKADKKKSGHKISPAAIAAILIGFTSSSTFMLWLNCN QELARSYGMADPSKIQSLYALGTATAILATAAFIKKGLKEINVLILYPLISTIMLALCYF IQAPFICLVGGFVIGYAGAGGVLQLAVSTTAEFFPENKGTATSLVMIASSIANYTILSLA GYITKVGGSSAPRMILLLNMAVTIIGILLALFVKKNRNK >gi|226332957|gb|ACII01000062.1| GENE 23 26610 - 27494 1066 294 aa, chain - ## HITS:1 COG:lin2338 KEGG:ns NR:ns ## COG: lin2338 COG0169 # Protein_GI_number: 16801401 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Listeria innocua # 1 285 1 284 289 
373 63.0 1e-103 MEQRIKGTTGLMALIGSPVGHSGSPAMYNFSFRHHNLDYAYMAFDIKEDQVPAALDAIRL FKMRGANVTMPCKNEAAKHMDELSPAARIIGAVNTIVNEDGKLVGHITDGIGFVRNLKEH GVDVKGKKMVVLGAGGAATAIQVQCALDGAESISIFNPKDPFFARAESTAEKLSKETPDC KVSVFDLADEAKLKEEVANADILVNATLAGMKPHEELTLIKDKSMFRPDLVVADVVYNPA ETRMVKEAKEAGCKLAIGGKGMLLWQGAAAYKLYTGLEMPTAEYQKFQEENENK >gi|226332957|gb|ACII01000062.1| GENE 24 27772 - 29697 2652 641 aa, chain - ## HITS:1 COG:lin2337_1 KEGG:ns NR:ns ## COG: lin2337_1 COG1902 # Protein_GI_number: 16801400 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Listeria innocua # 1 361 1 361 361 515 67.0 1e-145 MKFNAMFQPINIGPMTVKNRFVVPPMGNNFANTDGSMSEQSAAYYRERAKGGFGLITIEA TVVHKGAKGGPRKPCLYDDSTIESFRKVVDACHAEGAKVSVQLQNAGPEGNAKNAGAPIE AATSIPSDCGRDTPKEVTTEEVYELVKGYGLAAKRAMEAGVDAVEIHMAHGYLVSTFLSP RTNKRLDEFGGSFENRMRFSRLIIEEVKKMTEGKIAVLARINSCDEVPGGLDVHDSAAIA AYLEECGLDAIHVSRAVHIRDEFMWAPTVTHGGFSASQVEEIKRAVSIPVITVGRYTEPQ FAELMVKEGRCDLVAFGRQSLADPYMPLKAQEERLEDMIPCIACLQGCVANMYAGNPVCC LVNPFLGHEAEGIAPAEKAKKVMVIGGGVAGLCAAFIAQEKGHQVTIYEASDKLGGNMRL AAYPPGKGDITNMIRSYIVRCQKAGVTIKMNQEVTLDLIREEKPDSVIVASGSRTLILPI EGIDNPAIIHGSDLLDGKRAAGKKVLVVGGGMVGCETAAFLGEQNHDVTVIEFRDTVGAD VIHEHRVYLMKDFEDYKIKEITGAKVCKFYEDGVEYETADGQRHESRGYDSVILSMGFRN YNPFGEEIKAIVPDTYVIGDATRARRALDATKEAYEVASTL >gi|226332957|gb|ACII01000062.1| GENE 25 29827 - 30669 1223 280 aa, chain - ## HITS:1 COG:lin2336 KEGG:ns NR:ns ## COG: lin2336 COG1082 # Protein_GI_number: 16801399 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Listeria innocua # 3 278 9 281 284 305 56.0 9e-83 MSKEFPITISSWTLGDQCKFEERVIAAKNAGYEGIGLRAETYVDALNEGLFDKDILAILD KHGMKVTEVEYIVQWAEEHRSYEQKYKEQLCFHMCELFDVKQINCGLMENYSVEYTAQKL RELCQRAGKYIIGVEPMPYSGIPDMKKGWAVVKAADCENAKLIMDTWHWVRANQPVDLSV IEDIPADKIVSIQINDVWERPYATTILRDESMHDRLAPGTGIGCTAAFVKMVKEKGIKPN AIGVEVISDAILAKGLEYAANHTYENTKKVLEEAWPEVLQ >gi|226332957|gb|ACII01000062.1| GENE 26 30806 - 31009 58 67 aa, 
chain - ## HITS:0 COG:no KEGG:no NR:no MSQRRKVPQMIQVHRKTPFSDKYFISDYILCAGFHSPLMLRKTIIIVTLAQIIRAVFVIY VDYIMMY >gi|226332957|gb|ACII01000062.1| GENE 27 30968 - 31876 1054 302 aa, chain + ## HITS:1 COG:lin2335 KEGG:ns NR:ns ## COG: lin2335 COG0583 # Protein_GI_number: 16801398 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 292 1 290 292 332 56.0 5e-91 MNLYHLRYFSTLAHIEHYTKAADILAITQPSLSYAISTLEEELGVKLFEKNGRNVTLTKY GKVFLNDVEEVLNRLDSSVNSLKLAGKGEGCIDVVFLRTLGIDFLPKIMRGFLEENPTKK IDFNLYCDKVLTSDILNGLKEKKYDLGFCSKLDNEPLIEFIPVARQELVVIVPLDHPLAV KDEVRLEDTIPYKQIIFKKRSGLRQIIDGLFECIGQTPDVAYEIDEDQVAAGFVSNGFGI CVAPNIPILQSLNVKILPLVSPSWQRNFYMAMLKNVYHPPVVEAFKKYVIEQAQKEWDYK TS >gi|226332957|gb|ACII01000062.1| GENE 28 32028 - 32816 945 262 aa, chain - ## HITS:1 COG:PA5039 KEGG:ns NR:ns ## COG: PA5039 COG0703 # Protein_GI_number: 15600232 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Pseudomonas aeruginosa # 92 258 3 169 172 102 35.0 8e-22 MNQLEILRESLGQCDEIILDALIMRNRIVEDIMAYKEANELQILQPEQEAKQKEWLEKRM EGRRHKDEVSDVFECIRTNSKRIQARKLFNYNIVLIGFMGAGKSTISDFLKNVFAMDVVE MDQIIAQRQGMSISDIFETYGEQYFRDLETNLLIEMQSRSNVVISCGGGTPMRECNVVEM KKNGRVVLLTAKPETILDRVKNNHDRPLIENNKTVPFIADLMEKRRAKYEAAADIIIETD GKNELEICEELVHRLRTMDEEK >gi|226332957|gb|ACII01000062.1| GENE 29 33071 - 34303 1120 410 aa, chain - ## HITS:1 COG:BH1573 KEGG:ns NR:ns ## COG: BH1573 COG1686 # Protein_GI_number: 15614136 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 30 399 19 365 382 170 32.0 5e-42 MKHIAAVIVVTAVLLFTQTYTSARGAEYKIPQTVDMTPVAEEPAELYALSAVLMDGESGR VLYEKDGERPLANASTTKVLTCIVALENSPGDDYVQVSQNAASQPEVKLGLQKGEQYYLE DLLYSLMLKSHNDTAVAIAEHCGGSVEGFARMLNRKAKQIGCKNTYFITPNGLDAEDENG KHHTTAKDLALIMRYAIKNETFLHIAQTRDYTFSEITGKRTFSVHNANAFLDMRDGVLAG KTGYTSQAGYCYVCAWEKEGKTFIVSLLGCGWPNHKTYKWSDTEKLLDFGDYNYEYETYW KEPQTGKILVTDGVEDDQDIGTKIYLRGKCSVTAYDREKEVLLKKGETVICKIEIPQKVS 
APVLKGEKLGRIAYYLDGKLIDFYPVYAEKSVEKISFKWYTEKVFHDFFH >gi|226332957|gb|ACII01000062.1| GENE 30 34484 - 35062 486 192 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578830|ref|ZP_04856101.1| ## NR: gi|253578830|ref|ZP_04856101.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 192 1 192 192 389 100.0 1e-107 MRQVRRNNDEALFAYLLGFGCHYILDSACHPYVNKMAAEGVIPHIVLEKEFDRVLMEETG KDPDHYYPACGIMPKMEYARVIHRAIPLVKTINIYISVRMMKILTNFMVCDDHGRKRRIL GKLLRLGGESIGSVIEHFMTAEAVEQAKAPMPELERLYREAVPEAVEYLGELYTLREGAY HLSKRWDRTYNG >gi|226332957|gb|ACII01000062.1| GENE 31 34963 - 35277 148 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578831|ref|ZP_04856102.1| ## NR: gi|253578831|ref|ZP_04856102.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 104 1 104 104 206 100.0 3e-52 MPTTYAHDLFGKEVYKRLPSDMKALIRRHGDLYRIGLHGPDILFYYMVSKNPVTQFGIEM HHEKARAFLKKGCGRSEEITMRHSLHIFLGLAVIIFWTPHAIPM >gi|226332957|gb|ACII01000062.1| GENE 32 35296 - 35865 642 189 aa, chain - ## HITS:1 COG:CAC2060 KEGG:ns NR:ns ## COG: CAC2060 COG1386 # Protein_GI_number: 15895330 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing the HTH domain # Organism: Clostridium acetobutylicum # 7 184 21 198 202 127 39.0 1e-29 MKIKETEAAIEAILFAMGGSVELPRIARAIGVDEKTTGRIIRNMMDRYQEENRGIQIIEL ENSFQMCTKKEYYQYLINIALHPQKPALSDVMLETLSIIAYKQPVTKAEIEKIRGVKCDH AINKLVEYSLVRELGRLDAPGRPILLGTTEEFLRCFGVQDLESLPVPDPVQIEDFKAEAE EEIQLKLDV >gi|226332957|gb|ACII01000062.1| GENE 33 35909 - 36700 881 263 aa, chain - ## HITS:1 COG:CAC2061 KEGG:ns NR:ns ## COG: CAC2061 COG1354 # Protein_GI_number: 15895331 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 244 1 243 249 150 41.0 2e-36 MGIPVKLEVFEGPLDLLLHLIEKNKIDIYDIPIVEITDQYMEYIHAMEREDLGIMSEFMV MAATLLDIKCKMLLPKEVNEEGEEEDPRAELVEKLLEYKMYKFMSYELKDKMDDAANVFF KEPTLPDEVLQYREPVDPKELLAGLTLEKLNAIYKSIIRRQEDKVDPIRSKFGTIEKEEV SLSDKMIEIKDFARTHRKFSFRNLLESQCSKVQVIVTFLSILELMKMGHIHVEQDGLFDD 
ISVEVQTDPDTWKNLTEFAEDEP >gi|226332957|gb|ACII01000062.1| GENE 34 36746 - 37576 521 276 aa, chain - ## HITS:1 COG:CAC2775 KEGG:ns NR:ns ## COG: CAC2775 COG1408 # Protein_GI_number: 15896030 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 4 266 25 285 287 142 31.0 5e-34 MKRFKTVHYTIHSNKIKDARGIRFAVIADLHGMEFGPDNRKLSETIHRYRPDGILIAGDM VVRNDSASLKTASSLLRSLAQQYPVYYALGNHEYKLYRTDPQENCMAARYQEYEKELKTA GIHILHNESCSLQVGKTSLTVYGLEVPLIYYKKPFSPQLKREEIRELIGEPSSDSLNILL AHSPKYGDTYFDWGADLILSGHYHGGIVRIGRHNGLLSPQLHPFPKFCCGDFHRKEQHML VSAGAGEHTIPVRIHNPRELLMVDLKPNAENIVTVC >gi|226332957|gb|ACII01000062.1| GENE 35 37635 - 38669 1244 344 aa, chain - ## HITS:1 COG:mlr0093_1 KEGG:ns NR:ns ## COG: mlr0093_1 COG0303 # Protein_GI_number: 13470396 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Mesorhizobium loti # 4 330 6 321 330 87 24.0 5e-17 MKEIRTEDAVGHILCHDITQIIKDVKKGVLFKKGHIVREEDIPLLLSVGKEHLYVWEKKE GILHENEGAEILYKICAGDSDTMHGSDIKEGKIELIADIDGVIKIRREALLAVNSLGEMM IASRHGDFPVKKGDKLAGTRIIPLVIEKEKMDRAAEVAGTEPIFSILPYKKKKVGIVTTG SEVQKGLITDTFTPVLRDKFAKFPSEVIGQTKPGDDMEQITADILKFIEEGADMVVCSGG MSVDPDDRTPGAIKATGTRIVSYGAPVLPGAMMLVSYYEKDGRQIPILGLPGCVMYAKNT IFDLLLPRLMADDEITLSEINQLGEGGLCLNCDVCTFPNCGFGK >gi|226332957|gb|ACII01000062.1| GENE 36 38820 - 39977 701 385 aa, chain - ## HITS:1 COG:BH1535 KEGG:ns NR:ns ## COG: BH1535 COG1686 # Protein_GI_number: 15614098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 2 381 11 379 387 275 41.0 2e-73 MVCTITTPVYAAGIFLPEESNTRQAESTDLIEAPSGILMEAQTGTVVYQKDADTRRSPAS ITKIMTLILIFDALDKGSLKMDDTVTTSAHAKSMGGSQVFLEEGEIQTVETLIKCIVIAS GNDASVAMAEHICGSEQEFVRHMNERAEGLGMKNTHFEDCCGLTESPDHYTTARDIAIMS RELITKYPKILEYSSIWMENITHVTRQGTKEFGLTNTNKLIRSYEGCVGLKTGSTSIAKY CLSAVAKRNDITLIAVVMAAPDYKVRFKDAASMLNYGFSKCSLYIDKKMEALPEIPVRNG 
KKKTVSLVYEEQFHYLDTTGQNIGNVKRKLRIHREVKAPVKKNTLAGEMIYSVDGKELGR VRILYGENAGKATYPDCLKKVFMRL >gi|226332957|gb|ACII01000062.1| GENE 37 40188 - 41786 1374 532 aa, chain - ## HITS:1 COG:CAC3316 KEGG:ns NR:ns ## COG: CAC3316 COG1502 # Protein_GI_number: 15896559 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Clostridium acetobutylicum # 25 532 6 510 510 498 48.0 1e-140 MKAENNKILKEKKNWILLLFKKLAKLIFNRIFYVAVAMLVQLGWILMMVLRLAAYSRYID IGLRLIGIVLVLWILNKEINPSYKLAWTMLILILPILGVVLYFVFGRSRIAAIMQQHFEQ RIEESREYLRDRPQTRQKLEALDPSASNQSRYISDVSRFPVHENTTAEYFQVGDDMFPVL VRELKQAKKYIFIEYFIINDGVMWQTILNILEKKAAEGVDVRLIYDGFGCLTTLPHKYYE ELQKKGIKCQVFNPFRPILNIIQNNRDHRKLCIIDGWVGFTGGINLADEYINQKERFGHW KDTAVMLKGEAVWNMTVMFLHMWAVIGRSEESVDYEAYFPHRYHEGEFESDGFVQPFCDT PLDEEVVGEDVYLNIINKAKKYVYICTPYLIIDNEMMTALCLAAKSGVDVRIMTPGIPDK KLVFILTQSYYRQLLEAGVKIYEYQPGFLHAKSFVSDDEIGVVGTINLDYRSLYLHFEDG VWIYRNRVIQDIKDDFIQTMEYCRQIELEFCLNRNIGLCIMQNIFRVFAPLM >gi|226332957|gb|ACII01000062.1| GENE 38 41865 - 42041 280 58 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227872333|ref|ZP_03990687.1| ribosomal protein S21 [Oribacterium sinus F0268] # 1 58 1 58 58 112 94 5e-24 MSNVIVKENETLDSALRRFKRSCAKAGIQQEIRKREHYEKPSVRRKKKSEAARKRKYN >gi|226332957|gb|ACII01000062.1| GENE 39 42331 - 44970 3544 879 aa, chain - ## HITS:1 COG:CAC1678 KEGG:ns NR:ns ## COG: CAC1678 COG0013 # Protein_GI_number: 15894955 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 879 1 881 881 949 55.0 0 MKKYGVNELRQMFLDFMESKGHLVMKSFSLVPQGDKSLLLINAGMAPLKPYFTGAEIPPR TRVSTCQKCIRTGDIENVGKTARHGTFFEMLGNFSFGDYFKTEAIHWSWEFLTEVVGLDA DRLYPSVYLEDDEAFDIWNKEIGIPADRIFRFGKEDNFWEHGAGPCGPCSEIYYDRGEKY GCGKPGCTVGCDCDRYMEVWNNVFTQFENDGNGNYTTLKQKNIDTGMGLERLAVVVQDVD SIFDVDTICALRNLVCEISGKEYEKNYNDDVSIRLITDHIRSATFMISDGIMPTNEGRGY VLRRLIRRAARHGRLLGIEGTFLAKLSEEVINGSKAGYPELEEKKEFIFKVLTNEENQFN 
KTIDQGLRILGEMEDEMKAAGEKTLSGENAFKLYDTYGFPMDLTKEILEEKGYDIDEAGF QKCMEEQRNKARSAREVTNYMGADATVYDDIDVNVTTEFVGYDHLTFDSKVTVLTTETEI VNSLMEGQKGTVFTEQTPFYATMGGQVGDTGVIETANGKFVVEDTIKLRGGKFGHVGHME SGMISTGETVSLKVDEAARRDTEKNHSATHLLQKALKTVLGNHVEQKGSLVTPDRLRFDF AHFQAMTADEIAQVEALVNKEIQAGLEVRTDVMDVEEAKKSGAMALFGEKYDQKVRVVSM GDFSKELCGGTHVANTGNIMLFKIVSESGIAAGVRRIEALTGNGVLEYYKKQEELLHEAA KALKANPAEIVEKIGHLQGEVKALSSENESLKSKLAQGALGDVMDKVVEVKGVKLLAAKV DGVDMNGLRDLGDQLKGKLGEGVVLLAAVNGEKVNLLAMATDAAQKAGAHAGNLIKAVAA IVGGGGGGRPNMAQAGGKNPAKAQEAVDAAAGILEGQIK >gi|226332957|gb|ACII01000062.1| GENE 40 45039 - 46268 867 409 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 10 402 16 410 418 338 43 4e-92 MTVWDELKARGLIAQVTDEEEIKEMVNNGKATFYIGFDPTADSLHVGHFMALCLMKRLQM AGNKPIALLGGGTGMIGDPSGRSDMRQMMTVETIQHNIDCFKKQMERFIDFSDGKALMVN NADWLMNLNYVEVLRDVGPHFSVNRMLSHECYKQRMERGLTFLEFNYMIMQSYDFYMLYQ KYGCTMQFGGDDQWANMLGGTELIRRKLGKDAYAMTITLLLNSEGKKMGKTQSGAVWLDP EKTSPFDFYQYWRNVDDADVIKCMRLLTFLPLEEIDEMAKWEGSQLNKAKEILAYELTNL VHGEEEAKKAQEGARALFAGGADTAHMPTTELADEDFSEEGTIDLISMLVKAGMVPTRSE GRRAIEQGGVSIDGEKITDVKYTVAKDALTGEGVVLKKGKKKFNKVLAK >gi|226332957|gb|ACII01000062.1| GENE 41 46706 - 48079 1525 457 aa, chain - ## HITS:1 COG:CAC1079_2 KEGG:ns NR:ns ## COG: CAC1079_2 COG5263 # Protein_GI_number: 15894364 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Clostridium acetobutylicum # 28 265 2254 2477 2566 104 38.0 3e-22 MKKHGKLLLKYLLLFVMTVFVMGCFSVSAATKTGFVTQKGKTYYINKDGSKQKGWLELKG KKYYFDKKTGVQVKGWVKDSSGQAIRYFTSGAGYMVTGFITDSNGNTRHFDETTGLMTRG WLTDTDEYKYYFYSGSGVMAKGWVENKKEQKRYFSQANGRMCTGWVKSSAGNYRYFKPSN GIMYTGLEKIDSDYYYFSKSTGVRYQKGFGTVGSKKYYFNPSDGKAKTGWLELDGKKYYF DTSGVMLANTIASIDGTTYRFDSDGAATKTSGNDYTVEGKYVKVFDAKNNKYYYMEEEFL EHPGIADGKVSDLDLLAAVCDAEAGDQGVVGMEAVALCVLNCTIDQYKEFPSQIRYVVYQ GKPTQYAVVTDGALLKRLKGQFEDRTNAYAAAKAAMEVFSNYVNHGTKRTLPGFKTKDFN YKFFMTPAAFKAQNLNFSKLEYEQYKGHVFFVDWISG 
>gi|226332957|gb|ACII01000062.1| GENE 42 48347 - 48667 265 106 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSDLAATNCGCEDRGGCGCGCANNSGCGCDNGCGCNNGCGCNSGWGLSGGNNSCCNILWI IILLCLCGNNNCGCGNGCGGCGNGDNCWIIIVLLLLCGNNGCGCGC >gi|226332957|gb|ACII01000062.1| GENE 43 48742 - 48927 195 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578844|ref|ZP_04856115.1| ## NR: gi|253578844|ref|ZP_04856115.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 61 1 61 61 89 100.0 6e-17 MGSESEEKKEMLIIDGNAFYEIDLECVKKKKQCAQTVHPEKKTDRINSRNMKYKDNIRQH E >gi|226332957|gb|ACII01000062.1| GENE 44 49093 - 49473 249 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578845|ref|ZP_04856116.1| ## NR: gi|253578845|ref|ZP_04856116.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 6 126 1 121 121 214 100.0 2e-54 MEADIMNEEQISQTLLDQMVSSDRGQMIKAAIPYLPPKGQQIFSVYEKAVEFINTVSVFS KRSSGSDLCAMSMPDQNPVDIVNDIRSFCYGPSRDKLNQMVNMMAMVQMLQLMNQPADGE KEDSHE >gi|226332957|gb|ACII01000062.1| GENE 45 49466 - 49738 339 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578846|ref|ZP_04856117.1| ## NR: gi|253578846|ref|ZP_04856117.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 90 1 90 90 141 100.0 2e-32 MNDDWRNNPKLAGMDRSKLDMLQNLASQGSSKGANEMLPFLMSAAAQGKKGGLKFNADEI SAIIEVLKMGKSPAEAQKLDKVVNLMKMMR >gi|226332957|gb|ACII01000062.1| GENE 46 49894 - 50418 290 174 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 9 165 736 898 904 116 39 4e-25 MASEKEASEESALEETALEETVVGLLKKKKMTLSLAESCTGGAVAADIVNVPGASEVFMC GYVTYTNRAKRKCLGVKKSTLKKEGAVSAKCARQMAKGGAKAAKTDVCLSVTGLAGPGGG TEETPVGTVFMGCYCAGKTTVREFHFEGDRKSIRDQAVVQALTFICDRLHKCDR >gi|226332957|gb|ACII01000062.1| GENE 47 50435 - 50974 342 179 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 5 175 484 669 904 136 40 1e-110 MNTPNKLTVARMILVPFLVLFMLTDLGGEANRYIALAIFVVASVTDWFDGKLARKYNLVT NFGKFMDPLADKLLVCSAMICFIELEKLPAWFVIIIIGREFIISGFRLIAAENGVVIAAN YWGKFKTVSQMIMIILLLIDLGGVFDILEQIFIWLSLALTVISLITYIWQNRSVLSMQD >gi|226332957|gb|ACII01000062.1| GENE 48 50958 - 52295 1081 445 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229230948|ref|ZP_04355465.1| SSU ribosomal protein S12P methylthiotransferase [Desulfotomaculum acetoxidans DSM 771] # 1 439 19 460 462 421 47 1e-117 MKLLFVSLGCDKNLVDSEEMLGLLTGNGFEIVDDETEAEAIVVNTCCFINDAKEESVNTI LEMAEYKKTGSCKVLVVTGCMAQRYKNEIIEEVPEVDAVLGTTSYGDILKAIREAMEGKH FQEFKDIDYLPEKLGKRVLTTGGHFGYLKIAEGCDKHCTYCIIPKLRGKFRSVPMERLVT QAKEMAEEGVKELILVAQETTVYGTDIYGKKSLHILLKELCKIKGIRWIRVLYCYPEEIY DELIQTMKEEKKICHYLDLPIQHASDRILKRMGRRTTQAELVEIVNKLRREIPDIVLRTT LISGFPGETQEDHEELMSFVDEMEFDRLGVFTYSPEEDTPAATMPDQVAEEVKEARRDEI MELQQEISYDKGTDRIGQELLVMIEGKVADESAYIGRTYGDAPKVDGYIFVQTGELLMTG DFAKVRVTGALEYDLIGELADEYTE >gi|226332957|gb|ACII01000062.1| GENE 49 52383 - 52742 558 119 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1674 NR:ns ## KEGG: EUBREC_1674 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Purine metabolism [PATH:ere00230]; Pyrimidine metabolism 
[PATH:ere00240]; Metabolic pathways [PATH:ere01100]; RNA polymerase [PATH:ere03020] # 1 86 1 85 86 76 50.0 3e-13 MIHPSYTELIEAINTNSEDDDTTMSLNSRYSLVLAASKRARQIIAGSKPMVEGAAGKKPL SVAIDELYKGKVKILAPEEEDEEGTEQTAEAQTEESAQASEITETAETAEETTAETTEE >gi|226332957|gb|ACII01000062.1| GENE 50 52785 - 53408 834 207 aa, chain - ## HITS:1 COG:L149828 KEGG:ns NR:ns ## COG: L149828 COG0194 # Protein_GI_number: 15673881 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Lactococcus lactis # 3 193 4 194 205 201 54.0 7e-52 MNKGILVVVSGFSGAGKGTVMKRLMEKYDGYALSVSATTRKPRPGEEDGREYFFRTRDEF EKLIEEDALLEYAQYVENYYGTPRSYVEEQLQAGRNVILEIEIQGAMKIKEKIPEALLVF VTPPTVEELERRLTGRGTETAQVIADRLARAGEEAEGMGQYDYILVNDTVEECVDHLHQI IVSEHSRVSRNAEFIADIQKQTKAFQK >gi|226332957|gb|ACII01000062.1| GENE 51 53401 - 53673 287 90 aa, chain - ## HITS:1 COG:BS_yloBa KEGG:ns NR:ns ## COG: BS_yloBa COG2052 # Protein_GI_number: 18677778 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 3 87 4 88 89 88 48.0 2e-18 MARLINIGFGNVVNSQKIVAVVSPDAAPVKRMVQSIKGTRSLIDATQGRKTKAVIVTSDD YLVLSALQPETIARRFAEIYENEDKGETHE >gi|226332957|gb|ACII01000062.1| GENE 52 53685 - 54566 793 293 aa, chain - ## HITS:1 COG:CAC1716 KEGG:ns NR:ns ## COG: CAC1716 COG1561 # Protein_GI_number: 15894993 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Clostridium acetobutylicum # 1 293 1 292 292 231 46.0 1e-60 MIKSMTGFGRAEVVTKERKITIELKSVNHRYLDLSIKMPRKLNFLEGAVRNLMKTYIQRG KVDVYITYEDYTLDNGALKYNRELASEYITCLKQMQQDFDLDYDIKVSTLSRYPEVLVME EQSVDEEALWESLEPPLREACEKFVQTRILEGRNLEKDLIGKLDSLEEKVLRVEARSPEV VNAYRTKLEAKVSELLEDTQIDDNRIAAEVILFSDKICNDEETVRLHSHIRNMKKMLTTE KEGIGRKLDFMAQEMNREANTILSKSSDLEISNIAIDLKTEIEKVREQIQNVE >gi|226332957|gb|ACII01000062.1| GENE 53 55173 - 56918 1258 581 aa, chain + ## HITS:1 COG:BH2516 KEGG:ns NR:ns ## COG: BH2516 COG1293 # Protein_GI_number: 15615079 # Func_class: K Transcription # Function: Predicted 
RNA-binding protein homologous to eukaryotic snRNP # Organism: Bacillus halodurans # 1 578 1 563 570 356 36.0 8e-98 MAFDGITIAAMVKELHLNLDGGRFNKIAQPEADELLITGKGANGQCRLLLSASASLPLIY FTSKNKPSPMTAPNFCMLLRKHIGSARVSDIRQPGMERVVMFELEHLNELGDPCKKVLIM ELMGKHSNIIFCDDKGMILDSIKHVSSHMSSVREVLPGREYFIPKTQDKLDPLTVSEEEF YDVVCRKPCNVSRAVYSSLTGISPVVAEEICFRASIDGSDAAQSLDEAARVHLYHTFRRL MDQVVEGDFSPNIVYRGDEPVEYGVFAFQQYGPEYHSVEFDSVSQMLETYYATKNTLTRI HQKSSDLRRIVQTALERNRKKLSLQEKQMKDTAKKEKYKVYGELINTYGYGLEDGCKSFK ALNYYTNEEITIPLDPTMTPAENSKKYFDKYGKLKRTEEALTEQIADTRSEIEHLESVSN ALDIALAESDLAQIKEELMEYGYIKKHYDRRKGQKAQSKSKPFHYVTEDGYNIYVGKNNF QNDELTFKFATGNDWWFHAKKMAGSHVVVKSKDGELPDHIFEIAGQLAAYYSKGRTAPKV EIDYIQKKQVKKPAGAKPGFVVYYTNYSLMAEPSLKGVREV >gi|226332957|gb|ACII01000062.1| GENE 54 57063 - 58619 1687 518 aa, chain + ## HITS:1 COG:BS_yfmM KEGG:ns NR:ns ## COG: BS_yfmM COG0488 # Protein_GI_number: 16077809 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 518 1 518 518 833 78.0 0 MSILNVEHLTHGFGDRAIFNDVSFRLLKGEHIGLVGANGEGKSTFFNIVTGKLMPDEGKI EWAKNVRVGYLDQHSVLSQGMSIRDVLKSAFSYLFEMEERMNGICDSLGTASPEEMDTLM EELGTIQDTLTMHDFYVIDAKVEEVARALGLLDIGLERDVTDLSGGQRTKVLLGKLLLEK PDILLLDEPTNYLDEEHIEWLKRYLQDYENAFILISHDIPFLNSVINLVYHMENQELNRY VGDYDHFQEVYEVKKAQLEAAYRRQQQEISELKDFVARNKARVSTRNMAMSRQKKLDKMD VIELAAEKPKPEFHFKYGRTPGKYIFETKDLVIGYNEPLSRPVTLSMERGNKIALVGANG IGKTTLLKSILGLIPSLKGSCELGENLQIGYFEQEVKGDNKTTCIDEIWAEFPSYTQYEV RSALAKCGLTTKHIESQVRVLSGGEQAKVRLCKLVNKETNILLLDEPTNHLDVDAKEALK QALIEYKGSILLICHEPEFYRDVVSQVWDCSKWTTKVL >gi|226332957|gb|ACII01000062.1| GENE 55 58647 - 60011 1196 454 aa, chain + ## HITS:1 COG:VNG0727C KEGG:ns NR:ns ## COG: VNG0727C COG0534 # Protein_GI_number: 15789902 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Halobacterium sp. 
NRC-1 # 5 449 15 471 494 144 28.0 5e-34 MSKAKTVDLTSGPILKTLAELALPIMISSMLGTAYSIMDMAWIGLLGAKAVAGVGVGGMY VWLSQGLASLSRMGGQVNTAQACGRKDYDQARSYAAGSLQLTALSGILFAAVCVIFIQPL LGFFNLTDAETYTAARSYTLITCGLILFSYLNLTLTGLSTAQGDSRTPLLANFIGLAGNM ILDPLLILGIGPFPRLEVAGAAIATVSSQILVFVVMVLRIRHSRLEPNVLQHLNLLSVFP KEYYKHIFRIGFPTAIQGSIYCFISMVLTRMVSGYGAAAIATQRVGGQIESVSWNTADGF ASALNAFIAQNYGARKKDRIRKGYSLSFRVLTIWGLFVTAAFVFLPEPIARLFFHEKEAL DTAVNYLVIIGFSEVFMSIELMTVGALSGLGRTRLSSTISVILTGSRIPLALILTHAGMG LNGVWWALTISSIVKGIVFTLTFRHISRRLPEKV >gi|226332957|gb|ACII01000062.1| GENE 56 60091 - 62493 3174 800 aa, chain - ## HITS:1 COG:BS_leuS KEGG:ns NR:ns ## COG: BS_leuS COG0495 # Protein_GI_number: 16080084 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Bacillus subtilis # 1 799 1 803 804 841 50.0 0 MKYLFKKIEPKWQAKWEEAKVFEAKDNSDKPKFYGLVEFPYPSGAGMHVGHIKAYSSLEV ISRKRRMEGYNVLFPIGFDAFGLPTENYAVKTGTHPRIITDENIKRFSNQLKKVGFSFDW NRVIDTTDEDYYKWTQWIFLKMFEKGLVFRDRTLVNYCPHCKVVLSNEDSQGGKCDICHS DVVQKSKDVWYLRITQYADKLLEGLKDVDYPDNVKQQQIHWIGKSKGAFVNFDVDGIDEK LEIYTTRPDTLFGVTFMVIAPEHPIIDKYADKITNMDEITAYRNECAKKTEFERTQLVKD KTGVRIQGLEGINPVNGKKIPIYIADYVMMGYGTGAIMAVPAHDQRDYDFAKKFGIDIIE VIKGGDISKEAYTGDGEMVNSDFLNGYAKKKDSIERMLEELEKKGIGKAGVQYKMKDWAF NRQRYWGEPIPLIHCPNCGVVAVPENELPLRLPDVKDFEPGEDGQSPLAKIDSFVNCTCP KCGAAAKRETDTMPQWAGSSWYFLRYTDPHNANALADMDKLKYWMPVDWYNGGMEHVTRH MIYSRFWHHFLYDLGVVNTPEPYAKRSIQGLILGPDGDKMSKSKGNVVDPLDIVEEYGAD TLRTYVLFMGDYGAATPWSENSVKGCKKFLDRVAGLTDILSEESVSDELEMKLHRTIKKV SSDIENLKFNTAIASLMTLINEITAEGHLSKDDLTIFIKLLSPFAPHVCEEIWEFIGGEG LLAVSEWPKYDEGKTIAKTVEIGVQVNGKVRGTIVIPNGCEKEKAFEIAKADERIASFLE GKNLIKEIYVPNKIVNFVAK >gi|226332957|gb|ACII01000062.1| GENE 57 62761 - 64851 2578 696 aa, chain - ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 687 3 686 690 599 47.0 1e-171 
MDVFRTDRIRNVVLLGHGGAGKTTLAEAMAYLAGMTNRLGRVEDGNTVSDYDKEEIKRHF SISTTLIPVPWDKVKINVLDTPGYFDFVGEVEEAVSAADAAIIVVSGKAGVQVGTQKAWN LCEKYKLPRMIFVTDMDVDDVSYRKVVEQLTELYGKKIAPLHFPIREDGKFVGYVNVVKQ AGRKYIDKGKKEECPIPEYLDEYLEKYHDILMESVAETSEEFMDRYFEGDTFSVTEVSAA LAMNVQEASIVPVCMGSPINLRGVSNLLDDICGYFPSPDKRSCNGIAQKTNEIFEANYDF TKVKSAYVWKTIADPFLGKYSLIKVCSGVIKSDDTLLNVEKGEEERLNKLYVLEGSKPIE VPELHAGDIGAIAKLNSVQTGDSLATKANPILYPKALLSTPYTCKRFKAVKKGDEDKISQ ALAKMMSEDKTLRVVNDSANRQSLIYGIGDQHLDIVMNKLKERYKVDVELSKPKVPFRET IRKNADVEGKHKKQSGGHGQYGHVKMRFEPSGDLETPYEFEQVVVGGAVPKNYFPAVEKG LQECVLKGPLAAYPVVGVKATLYDGSYHPVDSSEMAFKMATILAFKKGFMDASPVLLEPI VSMKVTVPDKFTGDVMGDLNKRRGRVLGMTPDHQGNTIIEADVPQLEIYGYSTDLRSMTG GSGEFSYEFARYEQAPGDIQAKEIEARASKVDKSEE Prediction of potential genes in microbial genomes Time: Sat May 28 19:43:26 2011 Seq name: gi|226332956|gb|ACII01000063.1| Ruminococcus sp. 5_1_39B_FAA cont1.63, whole genome shotgun sequence Length of sequence - 21027 bp Number of predicted genes - 24, with homology - 22 Number of transcription units - 12, operones - 6 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 78 - 137 7.6 1 1 Op 1 6/0.000 + CDS 210 - 1301 1163 ## COG0287 Prephenate dehydrogenase 2 1 Op 2 . + CDS 1410 - 2684 1236 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase + Term 2701 - 2764 13.1 + Prom 3035 - 3094 6.7 3 2 Tu 1 . + CDS 3139 - 4128 1063 ## COG1052 Lactate dehydrogenase and related dehydrogenases + Term 4147 - 4202 6.9 - Term 4132 - 4191 5.9 4 3 Op 1 . - CDS 4192 - 4569 389 ## gi|253578862|ref|ZP_04856133.1| predicted protein 5 3 Op 2 . - CDS 4630 - 5181 428 ## COG0655 Multimeric flavodoxin WrbA 6 3 Op 3 . - CDS 5184 - 6002 915 ## COG0648 Endonuclease IV - Prom 6041 - 6100 10.9 + Prom 6064 - 6123 10.0 7 4 Tu 1 . + CDS 6306 - 7673 974 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) + Term 7795 - 7836 9.5 - Term 7783 - 7824 3.2 8 5 Tu 1 . 
- CDS 7846 - 9621 938 ## COG0515 Serine/threonine protein kinase
- Term 9967 - 10012 2.4
9 6 Tu 1 . - CDS 10235 - 10435 156 ## COG3655 Predicted transcriptional regulator
- Prom 10478 - 10537 6.5
10 7 Tu 1 . - CDS 10549 - 10686 169 ##
- Prom 10719 - 10778 7.0
+ Prom 10771 - 10830 8.8
11 8 Op 1 . + CDS 10897 - 11823 608 ## CKL_3875 hypothetical protein
12 8 Op 2 . + CDS 11871 - 12311 211 ## HM1_0612 hypothetical protein
+ Term 12313 - 12347 3.6
- Term 12296 - 12340 7.1
13 9 Tu 1 . - CDS 12363 - 12623 242 ## Cbei_1678 XRE family transcriptional regulator
- Prom 12731 - 12790 7.4
14 10 Op 1 . - CDS 12851 - 14173 711 ## BDP_0599 hypothetical protein
- Prom 14197 - 14256 3.1
- Term 14225 - 14256 1.7
15 10 Op 2 . - CDS 14260 - 14574 457 ## gi|253578874|ref|ZP_04856145.1| conserved hypothetical protein
- Prom 14611 - 14670 1.9
- Term 14652 - 14693 -0.6
16 11 Op 1 . - CDS 14735 - 15085 297 ## gi|253578875|ref|ZP_04856146.1| conserved hypothetical protein
17 11 Op 2 . - CDS 15090 - 15395 184 ## gi|253578876|ref|ZP_04856147.1| hypothetical protein RSAG_01108
18 11 Op 3 . - CDS 15417 - 15881 405 ## EUBREC_2082 hypothetical protein
19 11 Op 4 . - CDS 15909 - 16079 57 ## gi|291522397|emb|CBK80690.1| hypothetical protein
20 11 Op 5 . - CDS 16076 - 16267 261 ## gi|166031046|ref|ZP_02233875.1| hypothetical protein DORFOR_00727
21 11 Op 6 . - CDS 16267 - 16611 419 ##
22 11 Op 7 . - CDS 16553 - 17830 510 ## BLD_1139 reverse transcriptase
- Prom 17979 - 18038 4.7
23 12 Op 1 . - CDS 18269 - 20551 1332 ## BLD_1140 hypothetical protein
24 12 Op 2 .
- CDS 20552 - 21019 408 ## gi|253578881|ref|ZP_04856152.1| predicted protein Predicted protein(s) >gi|226332956|gb|ACII01000063.1| GENE 1 210 - 1301 1163 363 aa, chain + ## HITS:1 COG:BH1666 KEGG:ns NR:ns ## COG: BH1666 COG0287 # Protein_GI_number: 15614229 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydrogenase # Organism: Bacillus halodurans # 2 361 4 364 366 236 37.0 4e-62 MKTIGFIGLGLIGGSIAKAIRKYHEDYRLLAYDQDRETLAAAVSSNMIDAVCEEHDERFR SCDYIFLCAPVEFNVKYLEYFKDNIGPDTILTDVGSVKGIIHKEVERLGMESRFIGGHPM AGSEKTGFEASSDRLIENAYYIITPGGEVALERLTDFTEMISSLGAIPMVLTSEEHDFIT AGVSHLPHIIASALVNLVNMLDSEQEYMKTIAAGGFRDITRIASSSPIMWEQICVENNQN ISNVLDDYIRLLVQIKCFVDNKDSRSLYQMFASSRDYRDSIDVVDNGLLKKAYVLYLDIA DEAGGIATIATILAMEKISIKNIGIIHNREFEEGVLKIEFYDGISMEKGAALLKKRNYII YER >gi|226332956|gb|ACII01000063.1| GENE 2 1410 - 2684 1236 424 aa, chain + ## HITS:1 COG:BS_aroE KEGG:ns NR:ns ## COG: BS_aroE COG0128 # Protein_GI_number: 16079317 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Bacillus subtilis # 1 424 1 426 428 413 52.0 1e-115 MEIKKQTNLRGELTVPGDKSISHRAVMFGSLAQGTTKITHFLEGADCLSTISCFRKMGID IERNASEILVHGKGLHGLSAPSETLDVGNSGTTTRLISGILAGQSFISELNGDASIQSRP MKRIMTPLLSMGADIVSLRGNGCAPLRITGKPLHAAHYQSPVASAQVKSCVLLAGMYADG ITSVTEPVLSRNHTEIMLNYFGANVTSQGTTASIEPEPVLNGREIKVPGDISSAAYFIAA GLLTPGSEILLKNVGINPTRDGMLRVCKAMGADITLLNASTEGEPTADLLIRTSSLHGTT VEGEIIPTLIDEIPMIAVMAAFAEGTTVIRDAQELKVKESDRIAVVTEGLKRMGADIQPT DDGMIIHGGKPLHGAEINSYLDHRIAMSFAVAGTICDGTLTIKDGDCVKISYPEFYEDLY SLGK >gi|226332956|gb|ACII01000063.1| GENE 3 3139 - 4128 1063 329 aa, chain + ## HITS:1 COG:AGl1952 KEGG:ns NR:ns ## COG: AGl1952 COG1052 # Protein_GI_number: 15891092 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 319 1 319 337 345 50.0 7e-95 
MKILFYGTKNYDEEFFEKLLPSYPGITIKFTEANIHEETASLAKGYEAICAFVNADLSTP VIEELNAQGVKLILMRCAGYNNVDLETAHKFGMKVLRVPGYSPEAVAEHAMALALTANRH THKAYIKCRENNFSLSGLMGLNFYQKTAGIIGTGKIGQAMAKICKGFGMRVIAYDLFPNK SLDYLEYVSLDELLATSDLISLHCPLTEETKHIINEETIAKMKDGVILVNTSRGGLIKTE DLISGIRDHKFFAVGLDVYEEETDFVFEDMSERILQSSITQRLLSFPNVVMTSHQGFFTK EALTNIAETTLENAKAFMDGNELKNEVLS >gi|226332956|gb|ACII01000063.1| GENE 4 4192 - 4569 389 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578862|ref|ZP_04856133.1| ## NR: gi|253578862|ref|ZP_04856133.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 113 12 124 136 101 100.0 2e-20 MKKIKYFMITGMLLASIGLAGCGKKSEDDITNKEIYQDDLEPLTEEELNEMDADTSDEDM KQIDEPEEETADITDTTDATETTDKTENTDKAETSDSSAKKSADKTETSDSADTTKSETK TKTEK >gi|226332956|gb|ACII01000063.1| GENE 5 4630 - 5181 428 183 aa, chain - ## HITS:1 COG:MA0444 KEGG:ns NR:ns ## COG: MA0444 COG0655 # Protein_GI_number: 20089335 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 179 1 179 201 68 28.0 5e-12 MKTLIFNGSPRKNGETAYMIRTLQENLGGDFKVVNAYRADIRPCIDCRWCFDHAGCAVKD EWQEVLSYIEECDHIIMASPVYFEEVTGMLLAVMSRLQTYFSARYIRKEEPVPKKKTGAV LLTAGSIGPREKAESTAEMLLRQMSCSSLGTVYVNRTDKVPVRDRADIIQEIKDLAGRLR TAE >gi|226332956|gb|ACII01000063.1| GENE 6 5184 - 6002 915 272 aa, chain - ## HITS:1 COG:lin1487 KEGG:ns NR:ns ## COG: lin1487 COG0648 # Protein_GI_number: 16800555 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Listeria innocua # 1 266 1 282 297 200 41.0 3e-51 MLYIGNHTSSSKGYLAMGKQTLANGGNTFAFFTRNPRGGRAKDINLQDVQEFLQLAQENH FGKIVAHAPYTMNCAGAKENLRDFARETMADDLKRLELTPGNYYNFHPGNHVGQGAETGI ARIAEILNEVLTEEQSTTVLLETMSGKGSEIGRNFDELRQIIDRVERKDKMGVCLDTCHV WDGGYDIVNNLDGVFTEFDRLKAIHLNDSMNGLGSHKDRHAKIGEGEIGLGALVRVIRHP ATKGIPFILETPNDDEGWTREIALLRKEYEEK >gi|226332956|gb|ACII01000063.1| GENE 7 6306 - 7673 974 455 aa, chain + ## HITS:1 COG:lin0516 KEGG:ns NR:ns ## COG: lin0516 COG2843 # Protein_GI_number: 16799591 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Listeria innocua # 106 402 161 459 475 162 34.0 1e-39 MNSHPKSKHPPVKNSSARRKDIYKKSKYYQEKRQKLLIAAGIGGFILVFVLILAGIRGCS NYMQAKQTTAQNAVSMNASGDSNQTAATDSQETDSSDTAISAPVSLTLSVVGDCTLGTDE TFDYDTSLNAYYENYGADYFLQNVRSIFSADDLTIANFEGTLTDSDEREDKTFAFKAPAS YASILSGSSVEAVNTANNHSHDYGDQGFDDTLAALDDAGIVHFGYDETAVMDIKGIKVGM VGIYELYDHLDREQQLKDNIAKVQADGAQLIVVIFHWGNETETVPDSNQTTLGRMAIDLG ADLVCGHHPHVLQGIETYKGKNIVYSLGNFCFGGNSSPSDMDTMIYQQTFTIDADGVKND NVTNIIPCSISSAAYDGYNNYQPTPAEGDEATRILEKINERSSWISTAEGTTFTAEYNNR NTSTSSSESDISSENSSDTDSSEDDISDNNLIDMN >gi|226332956|gb|ACII01000063.1| GENE 8 7846 - 9621 938 591 aa, chain - ## HITS:1 COG:PA0074_1 KEGG:ns NR:ns ## COG: PA0074_1 COG0515 # Protein_GI_number: 15595272 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Pseudomonas aeruginosa # 7 216 65 262 411 92 33.0 2e-18 MSELKGHPNIVSYEDHSVIPHRGEVGWDILLKMELLIPFQEWQTNHMLDETEVLKLGRDI SAALAFARNRGLVHRDVKPQNIFVDEFGNFKLGDFGISRTIEKTMSGLSRKGTESYMAPE VYNGNAYGESVDVYSLGLVLYKCLNNNRLPFFPPISKPIRYSDREEALNKRMAGYRIPEL ENVTDEFMEILYKACEFNPQKRIKNAEELYESLDLLEKKRRQESGTKNYYGKRDAEKTEL IENGKMYHTGRNSGSRTYTSEKKQGKNRKTAGVIGAFAVVCMLFLVIAAGYKILPYINNS EEKGSYRIISDGYEMLTQTYDGYMRMQRNEKWGYLDEMGREVIPAEYDFIKFFGENGLAP AQKDGRWYYIDKENNIVENPVFDEYEYMGILSNGFIPAKQWDGKWGYLDSDFNKVSTFEY DQALPFLNGIAAVKKGEMWALINNNMEVTTDYIYDNIVVDGREMCSYNNVVFVYVGDKKY LVNGNGKKICEDAFEDAAPFLSVGPVAVEMNGKWGYISSDGKKVIDYQYERGRSFTQIGY AAAKQNGKYGYIDEKNHWVIQPEFIQAKACNDHGIAAVESEEGIWKLIQIE >gi|226332956|gb|ACII01000063.1| GENE 9 10235 - 10435 156 66 aa, chain - ## HITS:1 COG:SP1030 KEGG:ns NR:ns ## COG: SP1030 COG3655 # Protein_GI_number: 15900901 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 2 66 3 67 72 62 38.0 2e-10 
MISYNPLWHTLIDKGMNKGDLMRETGISFGTMASMGKNEPVNLKQIDRICKVLHCKIEDV IEYKED >gi|226332956|gb|ACII01000063.1| GENE 10 10549 - 10686 169 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MENDEMTTLEFKSIMEMVVMIIESSKDKEEALEKIKNLSILKEKN >gi|226332956|gb|ACII01000063.1| GENE 11 10897 - 11823 608 308 aa, chain + ## HITS:1 COG:no KEGG:CKL_3875 NR:ns ## KEGG: CKL_3875 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri # Pathway: not_defined # 5 306 3 302 347 264 45.0 3e-69 MSELLKKQNYGVEVEFTGISRKMAADAVAEIIGTTASRPDHTCYQTRTIQDSQGRKWKVM RDSSIHPIRKVGAESIDEYRVEFVTPILKYEDLDTLQAIIRKFREIGGVPHSSCGIHIHV DGANHTATSLRRLVNFMYSRQEIIYDALAVGDRKYRWCQPVCKNLLDTMKKDKNITTDSV EKIWYSPANDGYCGGINHQHYNSTRYHALNLHSFFQKGTVEFRLFNSTLHAGKIKAYVQF CLALSAWSIESDDKVVFRSMNGYTAKKKVTLMYNILTNRLGLYGDEFKTCRLHMMKQLRE NAKAEQVA >gi|226332956|gb|ACII01000063.1| GENE 12 11871 - 12311 211 146 aa, chain + ## HITS:1 COG:no KEGG:HM1_0612 NR:ns ## KEGG: HM1_0612 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 2 143 6 146 164 139 48.0 4e-32 MLYVAYGSNLNVQQMSYRCPRATVAFTGYLINWKLLYRGSRTGSYATVKRQKGSRVPVAV WNIDNKNEKALDLYEGYPRFYRKKNVFVQLKNGTRKKAMIYLLPDSATVGKPSNRYVETV LQGYKDMGFDTDYLYDSLEYNLKEMN >gi|226332956|gb|ACII01000063.1| GENE 13 12363 - 12623 242 86 aa, chain - ## HITS:1 COG:no KEGG:Cbei_1678 NR:ns ## KEGG: Cbei_1678 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.beijerinckii # Pathway: not_defined # 1 71 4 74 75 78 57.0 6e-14 MIKILLSKKLGELRLTQADLARATGIRPNTINELYHELTERVNLEHLDLICEALNCELDE LIIRVPNKETSVVHTRQGTLKPNKKM >gi|226332956|gb|ACII01000063.1| GENE 14 12851 - 14173 711 440 aa, chain - ## HITS:1 COG:no KEGG:BDP_0599 NR:ns ## KEGG: BDP_0599 # Name: not_defined # Def: hypothetical protein # Organism: B.dentium # Pathway: not_defined # 212 352 10 144 259 94 38.0 8e-18 MESGDQVYGKQDYSCFAGVGANCSNEKAITIGAGQWYAGEAKELLYRIQRANPKLFKDMD NAGMEKDLLMKSWDTYAVTAESAKGKCIVDIISTDLGKKCQDQYMEDQIQAYIPIIEKAY 
GTMEDSAMMECINILHQGGFDALKRILSKTPEPYTADKIYVTLCRDPADPTPNQVGDYTD RQKAVINMIHTYADNTEKEGIAMTKTEKAIRQMETWAKDDSHGYDQDYRWGEKGDYDCSS AVIQAWQNAGVPVKSGGATYTGDMKNVFLKNGFVEVTSKVNVATGSGLLRGDVLLNEAHH VAMYCGNGKEVEASINEKGTAHGGKPGDQTGKEFLIRSYRNYPWNCVLRYRGNIFSASDT EKKQNVVAYVARFTKDCKCYSVAGKTQAKMFPVIKKNAVVDVMKYTETVNGKKWYFIRIP HPTEGFVREFVPAGYFKKLI >gi|226332956|gb|ACII01000063.1| GENE 15 14260 - 14574 457 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578874|ref|ZP_04856145.1| ## NR: gi|253578874|ref|ZP_04856145.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 104 1 104 104 155 100.0 7e-37 MTLEYFLLLLMIVSIFTGLVTEGIKKLLEESKKTYKANFLAGGVAVVLSVLVGSGYIILM DAQINSKMAVYLIALILLSWLSAMVGYDKVIQSLGQIKLPNKNE >gi|226332956|gb|ACII01000063.1| GENE 16 14735 - 15085 297 116 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578875|ref|ZP_04856146.1| ## NR: gi|253578875|ref|ZP_04856146.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 116 1 116 116 224 100.0 1e-57 MKFKEAFEEMKSGIPVKLPSWAGYWWWDEESQTILMYTKDGDCLDIRETQNVEYTLQNIL SDEWVYADSRNCPILGGEATFSFGEAIKYLKRGMKVAKKRMEWKEAVHSACKRNLL >gi|226332956|gb|ACII01000063.1| GENE 17 15090 - 15395 184 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578876|ref|ZP_04856147.1| ## NR: gi|253578876|ref|ZP_04856147.1| hypothetical protein RSAG_01108 [Ruminococcus sp. 
5_1_39B_FAA] # 1 101 1 101 101 191 100.0 2e-47 MKKRLKKIVTTVKKVGTLNLVLMFVGAFFIWFNWQMILLYRQCDSMPETYACAVVAATIG ECGICGWIRTNKDKQQDRKWEKEDRKKQEQNDANMAENEED >gi|226332956|gb|ACII01000063.1| GENE 18 15417 - 15881 405 154 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2082 NR:ns ## KEGG: EUBREC_2082 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 29 154 1 153 153 132 45.0 3e-30 MEVKLCVAKEARRKSNIEVLEIKYGGIEMTLKEILEAGGGILFVVLTLVQVAPIKVNPWT ALGRSIGRVLNKEVMDKIEEGNAKNARYRIIRFNDEVKHDVKHTEEHFDQIIEDIDTYEN YCSDHPHFPNGKAVHSISNIRKIYDKCSDEHSFL >gi|226332956|gb|ACII01000063.1| GENE 19 15909 - 16079 57 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291522397|emb|CBK80690.1| ## NR: gi|291522397|emb|CBK80690.1| hypothetical protein [Coprococcus catus GD/7] # 1 56 1 56 74 67 53.0 2e-10 MTKLQIISKQWSLIYDLLLFNKGESERTLEDIEQDMDMLEFHCRKYAGADDEELMM >gi|226332956|gb|ACII01000063.1| GENE 20 16076 - 16267 261 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|166031046|ref|ZP_02233875.1| ## NR: gi|166031046|ref|ZP_02233875.1| hypothetical protein DORFOR_00727 [Dorea formicigenerans ATCC 27755] # 1 62 1 62 65 80 82.0 5e-14 MMKLLNNLIILLQNDGGKEMIAMLWAQQIMLGKKTYAEVPRLLKAKVKEILEDSGMGELA KEE >gi|226332956|gb|ACII01000063.1| GENE 21 16267 - 16611 419 114 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQKAKYMERQPEVRWQPVNNGMVDVTLCLNEQKVTIEQGQMEDSAKQMMYEYDYHQFRES ADKINEETVRASPAKYMSYVPEVEKSLEEKLEELQASNEMLTSCVLEMSELVYQ >gi|226332956|gb|ACII01000063.1| GENE 22 16553 - 17830 510 425 aa, chain - ## HITS:1 COG:no KEGG:BLD_1139 NR:ns ## KEGG: BLD_1139 # Name: not_defined # Def: reverse transcriptase # Organism: B.longum_DJO10A # Pathway: not_defined # 70 405 33 346 414 167 35.0 7e-40 MKKCCKNVNILADDFIEDSIYEALDEKWKRSDVAKYLHGRTSSMSLQAMKRLLRDTDERD LMVSGLVHTVAESLHYEIQNRCLKVEPIQYSWRQDGVNGKIREIGVESVKQLILDEIASE GLDELWRRKLGYHQYASIKGKGQLGGKKAIEHQIRKKYSMSRYAWKGDVKKCYPSVDTRK LKRMLEHDVKNEVLLYLVYFLIGTYKQGLNIGSGLSQFLCNYYLAKAYVYVLGLHKVRKH RDGATESKRLVYFCIMYMDDILLIGAREADVKRAARALEKYLLKEYGLTIKPDADLFPID 
YRIKTGKKYDSYREKDKAERRGKPIDMMGYVIYREHTEIRSKIFLRARKAYSVAWYCMKN KMEIPLQTAYKCTSYYGWFKHTDSKYAKGKYNIDAVCTAAKRRISKHAKSEIYGTSARSA LAACQ >gi|226332956|gb|ACII01000063.1| GENE 23 18269 - 20551 1332 760 aa, chain - ## HITS:1 COG:no KEGG:BLD_1140 NR:ns ## KEGG: BLD_1140 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_DJO10A # Pathway: not_defined # 262 748 42 519 526 152 28.0 8e-35 MANRSTGSGQWTDMGNVTPNPRGSYSDAETYKYLDMVSYGGGSYLCLQDDTIGVRPSPGE STDRWFCSSVPGEATPDFKNLVTETKEAARTAKEKASEAETSAKASEISAQAASNSAGAA AASARDAENAKDIVAGYKNAAEKAASSAATSEKNVNDKIAGLDNTFSEKTTSAIETINKS VDAKAEEIKNEITATKKSMVDASQKAINDTIDARKTEINNTGASEIKNVQAESATQTQGI KSVAAEQLAAINAAGGTLESAIERYYAMRRTREIYTVEELDPDVTQACTVNRLDALSGLT CTPSTNTTAGEDQIGTLEAFRPIEVNWILDDDGNQKITAIEGMPGYKTTGKVNRGIMNMG LYYKKERNAEDNGWLHHWSMLPRKEEGYVPMKECVRPDNTVQGWMLHPKGAAVDIDGVPY VTNGKPVRNKPSYANFAYARKQGPAYCFETDVDAAWVLALTMIKYGTKDLQAYMRGCTAY SSQYNVAVAEENTKRVILTKDQANYFVVGSSVSIGNPGSNTNFDRGNNYMHNIVDSAKIT AIQKVDDTYSALVLDVSAPFTTATTYKVSAMHWETGSTDSVQGYDGSPVSNTDGKNICKI NGIEIMPGGLSVSGNSVHIIETDSDGNTSCTYYRCDDARLLTTNTDTIISSYAKVGSFPA TDNAWKFIKEQMVDFGKGVMYPVTYGGGDKAYWADGWHTGSTPSAGQKSARELLRRGDLS NGGLAGPSCVGGNGGLANTWWDILATLSPNAVRGEWQAAA >gi|226332956|gb|ACII01000063.1| GENE 24 20552 - 21019 408 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578881|ref|ZP_04856152.1| ## NR: gi|253578881|ref|ZP_04856152.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 155 1 155 155 301 100.0 9e-81 MAGDFMVTNKVAIIVVNAGYTSGDTVAEAKNYFEQILRYFDATNMNVQKYGKLAERFAVG LAEDPESLMDNAKYYAHQAEQAVMGIPGQVEDAKNDIDAYVKEKEADLKGEDGNVCFVEF RIEPPCLYMRNNPDETDIEFRLNGSKLEYKWRERG Prediction of potential genes in microbial genomes Time: Sat May 28 19:45:02 2011 Seq name: gi|226332955|gb|ACII01000064.1| Ruminococcus sp. 5_1_39B_FAA cont1.64, whole genome shotgun sequence Length of sequence - 64145 bp Number of predicted genes - 80, with homology - 78 Number of transcription units - 19, operones - 13 average op.length - 5.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 1 - 247 179 ##
2 1 Op 2 . - CDS 318 - 1463 555 ## CKR_1702 hypothetical protein
- Prom 1484 - 1543 1.8
3 2 Tu 1 . - CDS 1628 - 1957 391 ## gi|253578883|ref|ZP_04856154.1| predicted protein
- Term 2258 - 2287 -0.3
4 3 Op 1 . - CDS 2368 - 2724 103 ## gi|253578885|ref|ZP_04856156.1| hypothetical protein RSAG_01117
5 3 Op 2 . - CDS 2774 - 3304 135 ## gi|253578886|ref|ZP_04856157.1| hypothetical protein RSAG_01118
6 3 Op 3 . - CDS 3316 - 3918 465 ## gi|253578887|ref|ZP_04856158.1| predicted protein
7 3 Op 4 . - CDS 3933 - 4304 309 ## EUBREC_2088 hypothetical protein
8 3 Op 5 . - CDS 4304 - 5062 230 ## EUBREC_2089 hypothetical protein
9 3 Op 6 . - CDS 5065 - 6195 597 ## COG3299 Uncharacterized homolog of phage Mu protein gp47
10 3 Op 7 . - CDS 6200 - 6409 184 ## gi|253578891|ref|ZP_04856162.1| predicted protein
- Prom 6504 - 6563 4.6
- Term 6507 - 6549 6.1
11 4 Op 1 . - CDS 6675 - 7025 204 ## EUBREC_2092 hypothetical protein
12 4 Op 2 . - CDS 7034 - 8284 971 ## EUBREC_2093 hypothetical protein
13 4 Op 3 . - CDS 8297 - 8998 181 ## COG1652 Uncharacterized protein containing LysM domain
14 4 Op 4 . - CDS 9013 - 13761 3374 ## COG5283 Phage-related tail protein
- Term 13818 - 13854 1.3
15 5 Tu 1 . - CDS 13925 - 14479 609 ## EUBREC_2097 hypothetical protein
- Term 14489 - 14525 3.1
16 6 Op 1 . - CDS 14541 - 14996 506 ## EUBREC_2098 hypothetical protein
17 6 Op 2 . - CDS 15014 - 16423 1253 ## EUBREC_2099 hypothetical protein
18 6 Op 3 . - CDS 16427 - 16654 123 ## gi|253578901|ref|ZP_04856172.1| predicted protein
19 6 Op 4 . - CDS 16666 - 17535 474 ## EUBREC_2101 hypothetical protein
20 6 Op 5 . - CDS 17525 - 17794 122 ## EUBREC_2102 hypothetical protein
21 6 Op 6 . - CDS 17796 - 18155 282 ## EUBREC_2103 hypothetical protein
22 6 Op 7 . - CDS 18152 - 18514 167 ## EUBREC_2104 hypothetical protein
23 6 Op 8 . - CDS 18511 - 18945 366 ## EUBREC_2105 hypothetical protein
24 6 Op 9 . - CDS 18945 - 19379 466 ## gi|253578907|ref|ZP_04856178.1| predicted protein
25 6 Op 10 . - CDS 19390 - 20415 1093 ## EUBREC_2107 hypothetical protein
26 6 Op 11 . - CDS 20431 - 20859 604 ## EUBREC_2108 hypothetical protein
27 6 Op 12 . - CDS 20877 - 22091 1136 ## EUBREC_2109 hypothetical protein
- Term 22113 - 22141 1.0
28 7 Op 1 . - CDS 22142 - 22618 445 ## EUBREC_2111 hypothetical protein
29 7 Op 2 . - CDS 22605 - 23543 474 ## EUBREC_2112 hypothetical protein
30 7 Op 3 . - CDS 23548 - 24915 815 ## EUBREC_2113 hypothetical protein
31 7 Op 4 . - CDS 24942 - 26411 838 ## Amet_2579 hypothetical protein
32 7 Op 5 . - CDS 26401 - 27201 638 ## COG5484 Uncharacterized conserved protein
- Prom 27226 - 27285 3.2
- Term 27234 - 27269 3.3
33 8 Op 1 8/0.000 - CDS 27301 - 27822 392 ## COG1475 Predicted transcriptional regulators
34 8 Op 2 8/0.000 - CDS 27815 - 28954 560 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase
35 8 Op 3 . - CDS 28945 - 29598 610 ## COG1475 Predicted transcriptional regulators
36 8 Op 4 . - CDS 29648 - 30424 490 ## gi|253578919|ref|ZP_04856190.1| conserved hypothetical protein
37 8 Op 5 . - CDS 30414 - 31793 832 ## COG0591 Na+/proline symporter
- Prom 31817 - 31876 8.0
- Term 31861 - 31900 5.1
38 9 Op 1 . - CDS 31992 - 32567 347 ## LMHCC_2572 hypothetical protein
39 9 Op 2 . - CDS 32576 - 32806 229 ## gi|253578922|ref|ZP_04856193.1| predicted protein
40 9 Op 3 . - CDS 32808 - 33020 145 ## gi|253578923|ref|ZP_04856194.1| predicted protein
41 9 Op 4 . - CDS 32995 - 33198 127 ## gi|253578924|ref|ZP_04856195.1| predicted protein
42 9 Op 5 . - CDS 33143 - 33496 323 ## gi|253578925|ref|ZP_04856196.1| predicted protein
43 9 Op 6 . - CDS 33572 - 33886 318 ## EUBREC_2124 hypothetical protein
44 9 Op 7 . - CDS 33900 - 34451 130 ## EUBREC_2048 hypothetical protein
45 9 Op 8 . - CDS 34454 - 34675 206 ## gi|253578928|ref|ZP_04856199.1| predicted protein
- Prom 34783 - 34842 4.1
46 10 Tu 1 . - CDS 35027 - 35509 378 ## BDI_0866 putative prophage Lp2 protein 26
- Prom 35652 - 35711 2.5
47 11 Tu 1 . - CDS 35771 - 36067 328 ## gi|253578932|ref|ZP_04856203.1| predicted protein
48 12 Op 1 . - CDS 36440 - 36820 373 ## gi|253578934|ref|ZP_04856205.1| predicted protein
49 12 Op 2 . - CDS 36919 - 37110 171 ## gi|253578935|ref|ZP_04856206.1| predicted protein
50 12 Op 3 . - CDS 37155 - 37481 187 ## gi|253578936|ref|ZP_04856207.1| conserved hypothetical protein
51 12 Op 4 . - CDS 37468 - 37713 353 ## gi|253578937|ref|ZP_04856208.1| predicted protein
52 12 Op 5 . - CDS 37754 - 37978 137 ## gi|253578938|ref|ZP_04856209.1| predicted protein
53 12 Op 6 . - CDS 37978 - 38172 287 ## gi|253578939|ref|ZP_04856210.1| conserved hypothetical protein
54 12 Op 7 . - CDS 38172 - 40334 678 ## CLK_A0325 hypothetical protein
55 12 Op 8 . - CDS 40346 - 41107 747 ## gi|253578941|ref|ZP_04856212.1| predicted protein
56 12 Op 9 . - CDS 41092 - 41817 329 ## SJA_C1-18490 putative type I restriction-modification system methyltransferase subunit
57 12 Op 10 . - CDS 41909 - 42871 742 ## CLD_2033 GP49-like protein
58 12 Op 11 . - CDS 42882 - 46709 2323 ## COG0419 ATPase involved in DNA repair
59 12 Op 12 . - CDS 46706 - 47845 771 ## gi|253578945|ref|ZP_04856216.1| predicted protein
60 13 Op 1 . - CDS 47998 - 48333 295 ## gi|253578946|ref|ZP_04856217.1| predicted protein
61 13 Op 2 . - CDS 48333 - 48497 140 ## gi|253578947|ref|ZP_04856218.1| predicted protein
62 13 Op 3 . - CDS 48515 - 48661 234 ##
63 13 Op 4 . - CDS 48676 - 48873 227 ## gi|253578949|ref|ZP_04856220.1| conserved hypothetical protein
64 13 Op 5 . - CDS 48888 - 49364 356 ## EUBELI_01565 hypothetical protein
65 13 Op 6 . - CDS 49419 - 49601 98 ## gi|253578951|ref|ZP_04856222.1| predicted protein
- Prom 49642 - 49701 10.8
66 14 Tu 1 . + CDS 49713 - 50225 54 ## COG1396 Predicted transcriptional regulators
67 15 Op 1 . + CDS 50344 - 50589 107 ## gi|253578953|ref|ZP_04856224.1| predicted protein
68 15 Op 2 . + CDS 50632 - 51390 452 ## gi|253578954|ref|ZP_04856225.1| predicted protein
69 15 Op 3 . + CDS 51405 - 51959 342 ## gi|253578955|ref|ZP_04856226.1| predicted protein
+ Term 52109 - 52167 1.6
+ Prom 52086 - 52145 5.3
70 16 Tu 1 . + CDS 52279 - 54024 449 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs
+ Term 54044 - 54075 2.4
- Term 54032 - 54063 3.2
71 17 Op 1 . - CDS 54106 - 55131 1176 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase)
72 17 Op 2 30/0.000 - CDS 55128 - 55886 877 ## COG0336 tRNA-(guanine-N1)-methyltransferase
73 17 Op 3 12/0.000 - CDS 55887 - 56390 187 ## PROTEIN SUPPORTED gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19
- Prom 56423 - 56482 2.0
- Term 56483 - 56524 7.8
74 18 Op 1 19/0.000 - CDS 56535 - 56765 337 ## COG1837 Predicted RNA-binding protein (contains KH domain)
75 18 Op 2 23/0.000 - CDS 56829 - 57074 326 ## PROTEIN SUPPORTED gi|238923878|ref|YP_002937394.1| 30S ribosomal protein S16
- Prom 57113 - 57172 2.7
76 18 Op 3 8/0.000 - CDS 57188 - 58543 1767 ## COG0541 Signal recognition particle GTPase
77 18 Op 4 . - CDS 58547 - 58870 423 ## COG2739 Uncharacterized protein conserved in bacteria
- Prom 58936 - 58995 5.6
- Term 58968 - 59014 9.2
78 19 Op 1 . - CDS 59056 - 60267 1615 ## COG1171 Threonine dehydratase
- Term 60282 - 60339 -0.8
79 19 Op 2 10/0.000 - CDS 60348 - 61298 595 ## PROTEIN SUPPORTED gi|157804145|ref|YP_001492694.1| 50S ribosomal protein L32
80 19 Op 3 .
- CDS 61314 - 64145 2857 ## COG1196 Chromosome segregation ATPases Predicted protein(s) >gi|226332955|gb|ACII01000064.1| GENE 1 1 - 247 179 82 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIKQEVIFNVKNLKISKTENIFATEGIKNVFTAVFQFHSTDWDGLAKTAVFENAEGTKEP KLLEEDRCDIPDSFFKTSGVCY >gi|226332955|gb|ACII01000064.1| GENE 2 318 - 1463 555 381 aa, chain - ## HITS:1 COG:no KEGG:CKR_1702 NR:ns ## KEGG: CKR_1702 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 364 80 440 447 314 43.0 4e-84 MSFETAKKMVDLLLSGNKGVGDYINPQKSPGLIIDFIGGEPLLEIELIDQICSYTINRMI ELNHPWLTRTMFSICSNGVCYFEPEVQRVLQKWNQRLSFSVTVDGNKELHDSCRVFPDGR PSYDLAISAAKDWVNKGGYMGSKVTIAPANVMHVYDAITHMIDLGYNEINANCVYEEGWQ MIHATVFYDQLKKLADYILEHNLDMENDYYISLFEENFFHPKQPDDLENWCGGNGVMLAV DPDGIIYPCLRYMESSLAGQQEPYSIGDVDTGICQTECDRCRVECLKKIDRRTQSTDECF NCPVAEGCSWCTAYNYQVFGTPDARATYICDMHKARALGNIYFWNHYYEKNNIDKHMENH VPEEWALNIISKPEWDMLCSL >gi|226332955|gb|ACII01000064.1| GENE 3 1628 - 1957 391 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578883|ref|ZP_04856154.1| ## NR: gi|253578883|ref|ZP_04856154.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 109 1 109 109 199 100.0 4e-50 MRTVIIKVDSKEAEYIERLDYERGFTKDVLQRIIESHMEDPDVINSPAFKAYQKQGAELD AQFSMAVAELEKKYIPEILKHHKTKWNLEYKTGELKVDILCNCEIEGIK >gi|226332955|gb|ACII01000064.1| GENE 4 2368 - 2724 103 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578885|ref|ZP_04856156.1| ## NR: gi|253578885|ref|ZP_04856156.1| hypothetical protein RSAG_01117 [Ruminococcus sp. 5_1_39B_FAA] # 1 118 1 118 118 136 100.0 5e-31 MQRAVSPQRLATAADAGLPAREIAVRYVAAPVLVLVIKHAQSSAITIVRTNVLDVNGHAQ MIARQDAKRIAFRHAQQIVRTPVQTVQADAETVAFRHAQMIAQADARAVAIRHAQQIA >gi|226332955|gb|ACII01000064.1| GENE 5 2774 - 3304 135 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578886|ref|ZP_04856157.1| ## NR: gi|253578886|ref|ZP_04856157.1| hypothetical protein RSAG_01118 [Ruminococcus sp. 
5_1_39B_FAA] # 1 176 1 176 176 270 100.0 2e-71 MISAERLVELRAKVKKEMARRSCVEHGSSASMNKFAANYDYNAVPVTGGDITDEHIQKVI DPLLNVADFLQDNSLQQSHSGADVIVDQAEKFVDTLAKIDKQASDSGCRGLCTGLCVGSC TSGCQGCTGCTGGCDTTCAKSCSDGCSTSCGGCSDGCFSGCTHTCGSGCTTGAMTT >gi|226332955|gb|ACII01000064.1| GENE 6 3316 - 3918 465 200 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578887|ref|ZP_04856158.1| ## NR: gi|253578887|ref|ZP_04856158.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 200 1 200 200 339 100.0 6e-92 MANVVIPENPEFNEALRIIETEDLVHANVVNPMFRTLLLNTIYLERRVAKMTERIDTLAI DNTYGGPELSADANIVDASAQFSVIRKTSSTASVQTLFQKAIDGLRKGLYSLLIRVKVSS NSNNGRLIELNVTSGGAILETRTITANMFERAGVYQTFGLNVELNDTVTITARLLKNSAN ITVSVDYVMLQPAQTAITSL >gi|226332955|gb|ACII01000064.1| GENE 7 3933 - 4304 309 123 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2088 NR:ns ## KEGG: EUBREC_2088 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 5 122 11 131 131 109 50.0 3e-23 MEGKVTVVGRTKILRARAGEITLPKIVGFAFGSGGSNGSTVLSPGETLKNEFLRKAVDGH TLKTNENKCEYYCTLNGSEANGKSISEIGLYDSEGDIIMIANFLPKGKDSNVSMRFEIDD VLQ >gi|226332955|gb|ACII01000064.1| GENE 8 4304 - 5062 230 252 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2089 NR:ns ## KEGG: EUBREC_2089 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 126 1 122 423 111 38.0 3e-23 MHIDNVDLEHFPTNEVAQRLLTYVTRGWYDKSYVGKWIYEVIGLELETASRRIGEAQKQA FPETAAWGIYFHELTYGIPIDRTKDIDDRRKAVVNRRDRTSRSSITPYRLENIIQTVFGL SASVSEQVEKYIFNVDLLIGADYPIYSVDILLGYIRKIKPSHLSMQARYVIEAAICSERE RVLFPALDIGMQHTWMEGYSVPLIEVKCEITEKLPVGMTGNVMIYKNLNQWNGEYKWDGT IKFDTEVTTEEL >gi|226332955|gb|ACII01000064.1| GENE 9 5065 - 6195 597 376 aa, chain - ## HITS:1 COG:BS_yqbT KEGG:ns NR:ns ## COG: BS_yqbT COG3299 # Protein_GI_number: 16079651 # Func_class: S Function unknown # Function: Uncharacterized homolog of phage Mu protein gp47 # Organism: Bacillus subtilis # 17 364 8 336 348 105 28.0 2e-22 MTEEFVTPEFIDNSDPDTIQSRMMNNLPVDISDMPADFPYDFTMPTAIEISRLIQYNLTR 
TLMLMFPMWAWGQWLDLHGVSAKVTRKQASRASGHVTVVGTAGTIIEEGTVFCTEGTTDV ESVEFATTEEATIPEQGTVDIPVASVLTGASYNVTRNTVTLQKQPNKNVTSVTNENPIRG GTDEEDDDTYRERILEKLRSAEVSFVGCDADYVRWAKEVSGVGSAVVEAEWKGPGTVKVV VADPDGSAVGEDTLKAVEDYIVSPKDRMKRLAPIGASVTISTVKDMTISYSAVLELESNY SIDNVKEAFLTALKTYYREAKDSEEIRYTVASALLSNTAGVIDFSDFRINENTNNISVAA DYYPITTATELNFTEG >gi|226332955|gb|ACII01000064.1| GENE 10 6200 - 6409 184 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578891|ref|ZP_04856162.1| ## NR: gi|253578891|ref|ZP_04856162.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 69 22 90 90 112 100.0 1e-23 MEDAMKEEDDTAVELAIERTIEEALMVNPRTESVEDFEFSWEPSVVYVKFTVYAIHWEKF DLEVTLKRR >gi|226332955|gb|ACII01000064.1| GENE 11 6675 - 7025 204 116 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2092 NR:ns ## KEGG: EUBREC_2092 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 114 4 117 118 100 45.0 1e-20 MAYDSNDGVARLAAVLDARMRDHADKPLCLDFAEIQADGSLLSNTFPIPIPKSDYRVCRQ LTLGKTGDAFCDVRADEHSGKAYLPESMRQLQAGDRVLIAWVQDTAVVIDIITRPV >gi|226332955|gb|ACII01000064.1| GENE 12 7034 - 8284 971 416 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2093 NR:ns ## KEGG: EUBREC_2093 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 416 1 413 413 340 45.0 6e-92 MIDPLKYSYYLVLVTEKKKKYDITNFVEDLGWEELENELAARLSCTVKNDKTTKGRISSL SKPGCYLYLYYRYKTGTAQEAMRGRIVEWNPSAKSSSQPLKLKAYDNLYDLQESEDCVYY SSGARTKQVIQDYFKKWGIPIGKYTGPDVVHGVIKEDKKKLGTMVKDILDEAKKKGGGYS AIRSVKGKAQILAIGSNKNIYHFAETENLISVSHKISTSGMVTRVKILGEADDDKRRPVE ATVDGQTKYGIRQKILTRGKDDSLDEAKKEAKEVLDDDGKPKEEIKVVTIDIPIIRKGDI IHLKMSTGSGYYWVKAITHDCDKMEMTMTLKKTKLKSSSSKKDNKKKDGDYSIGDTVNFH GGYHYVSSDATSGYKVSAGKATITHSNPGSAHPWCLENVNWAETHVCGWVDEGSFD >gi|226332955|gb|ACII01000064.1| GENE 13 8297 - 8998 181 233 aa, chain - ## HITS:1 COG:lin1283 KEGG:ns NR:ns ## COG: lin1283 COG1652 # Protein_GI_number: 16800351 # Func_class: S Function unknown # Function: Uncharacterized protein containing LysM domain # Organism: Listeria innocua # 16 233 
19 227 227 90 30.0 2e-18 MEIYLKEAANKQSCLRFPSLPDKEITVKGNSKYQKYDLIKKGTFAFPAGPDIRSYEWDGY LWGRARKKMSTIHTKWLDPKSVIKKLENWRDKGTVLNLIISAGGGINVDVTINSFEYKKF GGKGDYFYSISFYRYRPLKIQTTKDLGIDKKKKKTTARTNLKKSSTDKKKQTYTIKTGDC LWNIAKKFYGSGADWKKIYDANKTAIEKAAKKYGHKDSNQGDWIFPGTILTIP >gi|226332955|gb|ACII01000064.1| GENE 14 9013 - 13761 3374 1582 aa, chain - ## HITS:1 COG:ECs2641 KEGG:ns NR:ns ## COG: ECs2641 COG5283 # Protein_GI_number: 15831895 # Func_class: S Function unknown # Function: Phage-related tail protein # Organism: Escherichia coli O157:H7 # 217 522 204 511 696 150 30.0 2e-35 MAETIRIEIPVNVVDNTGSGTSSVTRNLTAMERAFERADRAAQRFQRRSGVAAEIEIGAD DNATSVLSAVENATEQIDGETAQVEVAADDSATQTVNAASDAVENFDGTSGDAEIGSDDS ATPVVSAASDAVENFDGMSGDAEIGASDEATPVIRAAQDAAESWGGSVFNATIGVIDAAT APISALASAAKNPFVQGASLIGASFGVAESVNSFQDFEAMMSQVKAISGATGQAFDDLTA KAQEMGATTKFTATESAEAFNYMAMAGWKPQQMIDGISGIMSLAAASGEDLGTTSDIVTD AITAFGLTAGDAGHFADVLAQASANANTDVSMLGESFKYVAPVAGAMKYSIEDTSLALGL MASANVKGSMSGTALKTSIANMVKPTNDMAEAMDKYGISITDGEGNLKSLKGVIDNVRGS LGGLSRDEQTAVASTIFGKEAMAGMLAIVNASEEDYNKLSNAIYNANDAAEGMADTMLDN LKGSFTLMQSAIEGTENAFGKRLSPYLRGIAGGITDMMPEITDGINAVMDVVDDKIAGVK RKITDMTGSDEWKNADLFGKIDIAWDSIIAKPFGNWVSGDGAQLISTGLGTLFSSAAAIL PGGEKAGLTSWLSAGILAKGAATVAQKGKSIVETLSPIGDAIGSITEAAGNANDVMDFAG NLSSMIPVGAKVGLAAAGITAAIIGIKLAIDKYNETQLENSLEDHFGKIKLSADEVKDAA AGILNQKYLTNVELALNEVQNADNLRAEAQKALESNDVLEFKSRVGITLTADEQQEYTDN INTFVESKISELESRTFAAHIHVQTYLGGTEDGQTLAQNIKEWARADNLELSDLSSQLSQ KVSEALKDGIIDVNEEEAISALQEKMNNITARWKEAEAQAQWDWIKQEYGHMSAADLESG SFTDLMDEMRSQRETAMESIKADTTQWYSELEAMKDYGRITPEQYESYKEQTGWYVRGQQ GSELAKSLELGSNTLNDTYGEKITGNIQTLTETAQNALKSAETSLQSGAYGTIASTFDNM FTSMDNGKGFLGIGANADQRALNELYQSMAPDVSQMGSLIDQYREAGQAVPQSLMDGYKE AIEVGAAAGDVDAAWQNYANQILESGSEEMKSVLTDPNNPMYESVREQLPDELRTAIDRA TAETTDNEITLEGLKASVDGDVDIDKDAWVSALNEKLGDLATTEEVTADSIKIKVEQGDC LWEIGNALGIDWQTIAEQNGIESPYVIHPDQELTISMDTITAEMDGDKAQAAIEQAMSAL DAEGAEMSVTAEGVKVDLANVEVDSDVAAAQIEAALGMESGTLAANGIEVQAGATVTIPQ ELVQVDTSGIQSATAEQTETEPVETDTTANVNITDATTDASGAKEQAQSEVESTFSESMP 
ADGHTDVTLDQTNNAAEVYSEVAGEVQSTFSNPIPASCTVNVTLDWHITNPSAGITTSGS GSSVKASIAGNAEGSIVTGPLLSWVGEDGPEAIIPLGSKRRDRGMDLWLQAGRALGVKEY ADGGMIGDVPLSGGSSDSSSGSSGNNGDKGQIVVNMNPVFNINGEGGNDTVNSIKEKLKE LINEMSGELASRLLESYANMPT >gi|226332955|gb|ACII01000064.1| GENE 15 13925 - 14479 609 184 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2097 NR:ns ## KEGG: EUBREC_2097 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 11 178 17 186 189 122 42.0 5e-27 MARKIDNVIEESNVTEVDMTEAEVEEALKEDMRANEMDYLNGILEAADDVDEETKEIKII RSGKLYFAFSVHALSDDDMYEIRKKYTKYAKNKRTGMKVTDGMDNAKFRSSLIYNATVAE DQEKLWNNKNIQEALRKKGKKIINALDVIEAALLPGEKEKVLATLDELCGYNTEEDKVNT AKNL >gi|226332955|gb|ACII01000064.1| GENE 16 14541 - 14996 506 151 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2098 NR:ns ## KEGG: EUBREC_2098 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 151 4 152 154 128 44.0 8e-29 MLNTSASTDARHSRSGKDAMLYDEDGNALAQVNSFQTKAAFNNNKYNPIGQNMELEVNNT IGVTITISEIVVLDGYLFNQVINAVQKGESPVMTLDGAIEGRNGSQERVTYRECIFSGDQ DIQNVSTGDVLSRSYNLHCNGRPEQRSALTI >gi|226332955|gb|ACII01000064.1| GENE 17 15014 - 16423 1253 469 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2099 NR:ns ## KEGG: EUBREC_2099 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 466 1 465 467 465 54.0 1e-129 MAGYFQKNEAGTKIRPGAYFNVDKVGEDDSFGAVDGVVAVVFKANFGPVNTVTTLDRGDD YESIFGNGLTTDAIREAWYGGAKKILACRLGGSGGVAASATLTAETGSVKITAKHVGEMP FTVTVRNRLTDAERKECIIYTGTTEFEKVYFKAGEDEAANLVAAFANSKNFTATVESSGK GVVTNVNQAAFTGGKNPTIANANYSAAFTEIEKYYFNTICVDTEDTAIHALLQAFLDRIY EAGQFGVAVIAEKDNKDLEERMKAAEGYNAKNIVYVLNPKVSINGGSLDGYQTAALIAAL IAATPASQSVTHTAISRYTEVGELLTNTQITKAEQRGCLVLSTSQDDEVWIDSAINTLIT PADNEDAGWKKIRRVKTRYELLYRMNAQADALVGKVDNDVNGRATIVGKLQKIINDMIQE GKLVSGTVAESTTYTADTDNCYFDIDVVDKDSAEHIYNFFRFQFSTISA >gi|226332955|gb|ACII01000064.1| GENE 18 16427 - 16654 123 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578901|ref|ZP_04856172.1| ## NR: gi|253578901|ref|ZP_04856172.1| predicted 
protein [Ruminococcus sp. 5_1_39B_FAA] # 1 75 6 80 80 86 100.0 5e-16 MAETKKVSVQAEEKPDQTAKQETEYAVSELIAASDQLFSCPRECTVVALKQAGKENMSVS EAQTLIEKFMKKEVK >gi|226332955|gb|ACII01000064.1| GENE 19 16666 - 17535 474 289 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2101 NR:ns ## KEGG: EUBREC_2101 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 280 1 285 293 253 45.0 8e-66 MNLSELIFRRLSADEDLTKMLATYAGVPAIFDSEFPSDQQEGWEGATQYPRICYRIDMQV NQERSSSGTLYIAIYTDKTSTVIDEIENTVRLRLQDVLMKPDGEAPFCVAWARTESYAIE GKEVWYKEVAFDILEYPDQLSTDPDPVLAVAAYIKQLFPETAVLGIDSISDFIETSKTPV FYCRLANIQKTTGHCMNTIAWFIGKVAVHLIYPGAGTRLKTLASINQRIAIDEEIIMLDD SPMIISQMELNNKADYLREGQLTITGKYGCLRGNEKKHNLSGIGMDFTD >gi|226332955|gb|ACII01000064.1| GENE 20 17525 - 17794 122 89 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2102 NR:ns ## KEGG: EUBREC_2102 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 9 77 10 80 100 72 50.0 5e-12 MFINRIERAEFNLDEIRRGTLIYAKHRSWKEGKSGIVYHASAERITVLYPNEKTNTQNHF FIPVSEDGEWEIRYSNDGLLTIKEGTDES >gi|226332955|gb|ACII01000064.1| GENE 21 17796 - 18155 282 119 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2103 NR:ns ## KEGG: EUBREC_2103 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 4 116 10 149 152 84 42.0 2e-15 MTPAEAAEAVKVQVQTDKERIEQQVIARYPRASNALRNAALSVLANPSPSAPGSPPGVRS GNLRRNWNMSGGAVCITSGMGYAGYLEHGTRKMAARPYVEKIKQTALPNITAIFAEIGG >gi|226332955|gb|ACII01000064.1| GENE 22 18152 - 18514 167 120 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2104 NR:ns ## KEGG: EUBREC_2104 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 7 120 6 118 118 65 35.0 8e-10 MISPFGLMYLRPGNLWTDFVVRRKSIRNILGHPVSDFEAKGEISGILAEASSNESDRMKH RWDQEQHSLTHTLVIRDFADVKQGDYLTTAGRTFLVLLSEDPGNLGATGLIYLEERNDLK >gi|226332955|gb|ACII01000064.1| GENE 23 18511 - 18945 366 144 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2105 NR:ns ## KEGG: EUBREC_2105 # Name: not_defined # Def: hypothetical 
protein # Organism: E.rectale # Pathway: not_defined # 3 137 4 131 136 120 50.0 2e-26 MAGTYTYEPAMITSYGKDRMRFELGDVMVDGKERTCALSDEEYIVLCDDVQSAKDWKRAK LKCLESIFRRFSFEPDTTVGPTSFKFGDRAKLWQEEYEKLKKDLKLASVSPSAILMNAGD MSKQPTPYFYNGMMSHEESEGDDI >gi|226332955|gb|ACII01000064.1| GENE 24 18945 - 19379 466 144 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578907|ref|ZP_04856178.1| ## NR: gi|253578907|ref|ZP_04856178.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 144 1 144 144 235 100.0 5e-61 MKLVANKPCNLNGKKYFIGEEVPVEEVVDYASLVKMGLLSVIHDAVPEDNLEECVAMVGE VSFSIPIVKGHETIDLDVTEPQMQDAVKTMQMSADAAVAHIRGDIEDDTTLIIINALDSR ATVKKAAESKAKNLIEQEESKGDA >gi|226332955|gb|ACII01000064.1| GENE 25 19390 - 20415 1093 341 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2107 NR:ns ## KEGG: EUBREC_2107 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 336 1 336 341 479 66.0 1e-134 MRNTTAGIQAEIAKGVFRPHTALTNMALAYYQNASNYFAKALFPTCPVGLSSDNYYIFSR EDLLRDNWQRKPAYGKVDPTVVGESTDNYVCKVDQMIMGIDQIRQTDLSRRQGPSIIQPK QQRTRTIAEQANIHQDRLFAASYFKEGAWKNELEGVDNTTPSTNQFIKFSNANSDPIAFI DKEKTDMNQQTGRMPNRLGLGINVFNALKVHPGILERVKYGGSTANPASVTENVLAQLFG VEKIVVLKSIMNSASMGADEEMQYIGDPNAFLLAYATNAPSIDEPSAGYIFTWDMLGNGQ MLPILNYLGENGTHTEYIEGLMATDMKKTSDDLARFYKAAV >gi|226332955|gb|ACII01000064.1| GENE 26 20431 - 20859 604 142 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2108 NR:ns ## KEGG: EUBREC_2108 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 4 139 5 139 143 137 66.0 1e-31 MGTNFNGTMINQSVTIAEKAGADIADVRNLILKYDEDGNVVIAANGTAPLLGLSIIEGGY NDISGAESGKVKKGDDLEIQIKDIGYAIASAEIKKGQEVTATTGGKAAVAKAGEYVIGVA LNSVSAGGYSRIQIAKYQKAKA >gi|226332955|gb|ACII01000064.1| GENE 27 20877 - 22091 1136 404 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2109 NR:ns ## KEGG: EUBREC_2109 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 404 1 418 418 240 42.0 7e-62 MATKLKGLNVGKVDFVDCGANQRADIKIIKNKDGEEAQNPEIGFFKRFLNWVTGELSKPD 
SEIAKSATTFNEQINAVSMDAIRDEIWSTCYALQNSLNSILCDAEMDSSAKQAAMETSTE QFATAMKGYIPNWASGTATNIRKNLDIPDETDLQMVMKAHKNLTDIIEKSNEDNEKGELE DMLKINKSKMTAEERAAYEELIKKYAVETEEQTEEPVGKSAPKVEDPDIVDDSEVTKTQK SVTPPPAAPTTETSADTGDDIYKGLHPAVRARLEALEKRAAEAEERELLDVAKKYEIVGE KPEELVKTLKSLKDAGGTAYNDMISVLDRSVDMVEKSGVFGEIGKSFSGNPVASIKKSAA ESKIDTIAKGYMEKDSTLTYNAALAKAWEDHPELLDEYEVEAGY >gi|226332955|gb|ACII01000064.1| GENE 28 22142 - 22618 445 158 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2111 NR:ns ## KEGG: EUBREC_2111 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 156 1 158 159 194 68.0 8e-49 MKSFNEIMKIRDEPETKDVPVEKKKFQIKKSDDEKMQAFGWASVALTENGEVLEDWQHDI IEPEELEQAAYKFVNLYREGGEMHERGGVAYLIESVVFTEEKMMAMGIPEGVLPVGWWIG FQVTDADVWEKVKNGTYSMFSIEGEAQRIEVKDEETDQ >gi|226332955|gb|ACII01000064.1| GENE 29 22605 - 23543 474 312 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2112 NR:ns ## KEGG: EUBREC_2112 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 13 306 15 310 318 204 39.0 4e-51 MLKMRARSRMIKKSVESQKVLEALDNYLESNLDEPMKWLVRFWKDQAAVMLYKDLREIVI GEADPQSLFDQWFSDYSVFLSSKMTASWESAYFAAWNSTAEFVALEEKISSEIYVRDWII NQTSNLITNVCSDQVNAVRYLIAEAQSLGMGSDETARYIRPTVGLTERQAAANLRHYNSV KTQLRADHPRMKEESIERKARTAAAKYAERQQRYRAETIARTEIAQAYNAGADAFIREAM RHDLMPEMKKEWSTALDERVCKECQALEGVQISMDDSFKTQSGRRNVTVLLPPLHPRCKC AVKYVEATYEIV >gi|226332955|gb|ACII01000064.1| GENE 30 23548 - 24915 815 455 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2113 NR:ns ## KEGG: EUBREC_2113 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 455 4 463 492 644 68.0 0 MKEYGRIGQKRWEGVFNEEFLPELSGIRGIKTYREMLDNDDTIGAIMFAIKMLIRQVKWH VEPGGDSAKDREAAEFVESCMDDMQNTWTDTISEILSFLAYGWSFHEIVYKRRMGKTKNR KTSSKYSDGLIGWQKIPPRAQDTLYRWEYDDKDNLIGMTQQPPPDYGLLTIPISKAMLFR TESIKDNPEGRSILRNAYRSWYFKRRIQEIEAIGIERDLAGLPVLHAPDGVDIWDDKDPE LVSINAALTSMVKNIRRNEYEGLVLPAGYEAELLSTGGTRQFDTNAIINRYDAKIAQTVM ADFIMLGHEQTGSFALSEDKTELFAVALGAFLDVICETFNNQGIPSLIDMNGTHFDAITD 
YPQLAHGDVDKRDITKLSTFLKDMVGVGILIPDEDLEDYVREVANLPERTLSDDPRNKDE QREAQRRSPEKEGKTSEVEPEENQEIEEAKKRLGR >gi|226332955|gb|ACII01000064.1| GENE 31 24942 - 26411 838 489 aa, chain - ## HITS:1 COG:no KEGG:Amet_2579 NR:ns ## KEGG: Amet_2579 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 4 457 2 447 469 442 50.0 1e-122 MMDDTAFSEFLDESIPLWRDDPVMFFREVLNFEPDEWQAQAARDLAANPKVSIKSGQGVG KTGLEAAVFLWFVTCFPHPRIVATAPTKQQLHDVLWSEISKWMSKSELLSILLKWTKTYV YMVGEEKRWFGVARTATKPENMQGFHEDNMLFIVDEASGVADPIMEAILGTLSGANNKLL LCGNPTKTSGTFYDSHTRDRALYKCHTVSSMDSTRTNKENIDSLVRKYGWDSNVVRVRVR GEFPNQEDDVFIPLSLIEQCSSKLLELDDADGMQFVSLGVDVARFGDDETIIYRNYHGHC KIVRNRRGQNLMATVGDIVQEFKKIYREHPTYESKVYVQIDDTGLGGGVTDRLKEVRKEQ KLYKMQVIPINAAEKIETDTAAGKDAAERYNNLTTAMWASMRDLLDNKQIVIEDDEQTIG QLSSRKYTMASNGKLEIEPKKEMKKRGLDSPDRADALALALYLGKIKKHTGSAPSVGAMK KLSKDNYWG >gi|226332955|gb|ACII01000064.1| GENE 32 26401 - 27201 638 266 aa, chain - ## HITS:1 COG:lin1733 KEGG:ns NR:ns ## COG: lin1733 COG5484 # Protein_GI_number: 16800801 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 246 1 268 294 68 26.0 9e-12 MARARSPNSIEAEEMYKNGMKLVDIAKKLDVPASTVRRWKSTQNWDGDAKKKKNERSQKK KTSARYKGGQLGNKNAVGNKGGPLKPGDKIAEKHGAYSSVYWDVLDESEKDMIEDIPMDE EMLLIEQIQLFAVRERRIMAAINKYRNMNGEVSLFGFARTEDKRAFKSDEDKQLYEERIE EKVASGDRLPGNTYNMMTNMENKDNMIARLEKELSTVQSKKTKAIEALAKLRLEKQKIAG ESKGNEVVRAWAEAVVKARREEKHDG >gi|226332955|gb|ACII01000064.1| GENE 33 27301 - 27822 392 173 aa, chain - ## HITS:1 COG:STM0604 KEGG:ns NR:ns ## COG: STM0604 COG1475 # Protein_GI_number: 16763981 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Salmonella typhimurium LT2 # 8 145 42 183 205 102 40.0 3e-22 MDSKLTAPLSTLRWVDRNLLKPNDYNPNKVSKENLKLLIQSILTNGWTLPIVVRPDMTII DGFHRWTVAGMEPLLSKLDGKVPIVIVEHKEHSEDIYGTVTHNRARGTHLLEPMKKIVKE LMDEGKTVEEIGKQLGMRPEEIFRLSDFSKEDFLKMMTKGVTGYSKAEFITKI >gi|226332955|gb|ACII01000064.1| GENE 34 27815 - 28954 560 379 aa, 
chain - ## HITS:1 COG:ECs1386 KEGG:ns NR:ns ## COG: ECs1386 COG3969 # Protein_GI_number: 15830640 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Escherichia coli O157:H7 # 11 326 15 332 410 97 25.0 4e-20 MAVKRCESNIDVVKAAEIRIKNVFGNGLPVFFSFSGGKDSLCLAQLMVNLANRGEINMKQ LTVQFIDEEAIFPCMEEMTKKWRRIFMMMGAKFEWYCVEVKHYNCFNELSNDETFICWDS TKQDVWVRQPPSFAIRSHKLLRPRIDAYQDFLPRTTVSGITMVGIRTAESVQRLQNIASM TKAGNRMTSKKQVFPIYDWTDNDVWLFLLRNYVDIPEIYLFLWQSGSSKRQMRVSQFFSV DTARSLVKMNEYYPDLMERVIRREPNAYLAALYWDSEMFGRSSRKRKESEQGQEQKDYKQ ELINLFDHMEIFDTPHKRHVAERYRNFFIAVSAIATPEDCKHIYEGLISGDPKMRTFRAL YQRIYGRYINNAKKERKHG >gi|226332955|gb|ACII01000064.1| GENE 35 28945 - 29598 610 217 aa, chain - ## HITS:1 COG:YPMT1.28c KEGG:ns NR:ns ## COG: YPMT1.28c COG1475 # Protein_GI_number: 16082812 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Yersinia pestis # 15 170 19 174 222 61 26.0 1e-09 MKVTIKKLSVLKHPEKNVRIHSEQQIRELKRSLEKFGQTRALVIDENNIILIGNGLYEAM VSLGYQEATVYVKAGLSENDKKKLMIADNKTYALGIDNLETLNEFLEELQGDLDIPGYDE EILQQMVADADEVTEKLSEYGTLDDSEIQKIKEANEKREQKAAVDTQSADNGESSPEKPN PQNEQPAEEQNATETEPEITETRRFVVCPKCGERIWL >gi|226332955|gb|ACII01000064.1| GENE 36 29648 - 30424 490 258 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578919|ref|ZP_04856190.1| ## NR: gi|253578919|ref|ZP_04856190.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 258 1 258 258 510 100.0 1e-143 MLGRKQSVRNNEDWKNALDHIEETVSKKELDSLVKKTVKDIKEKCKGKKAAYAWSAGKDS LVLGEICEKAGIDQSVLVRCNLEYPAFIAWIEQNKPSNLEIINTGQDMEWLKKHQDMLFP DKSNKAAQWFHIVQHRGQARYYKEHQLDILLLGRRKADGNYVGKDNIYTNSAGITRYSPL AEWRHEDVLAYIHYYDVKLPPIYDWEKGYLCGTHPWPARQYMETEQQGWKEVYDIDKTIV ENAAQHFDGAREFLKAIK >gi|226332955|gb|ACII01000064.1| GENE 37 30414 - 31793 832 459 aa, chain - ## HITS:1 COG:YHL016c KEGG:ns NR:ns ## COG: YHL016c COG0591 # Protein_GI_number: 6321771 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Saccharomyces cerevisiae # 9 422 23 473 735 94 25.0 3e-19 MNGIIMLFVYAAIMILATVTMTKKEKNVVNFCVGSRSENWILSALSIAATWIWAPALFVS TEKAYSTGCVGLFWFLVPNALCLVIFIPFAKRIRKEMPEGMTLSGYMKEKYKSDGVKRVY LFQLIGLSVLSTGVQLLAGSQILSAVTGISFKTMTILLACIAISYSLFSGIKASMLTDAI QMVFMLIACSLFVIFGVRNTGIQGILQGLSGISGDCTTLFSGRGIEIFLSFGLPTTIGLL SGPFGDQSFWQRAFAVKKEKLGRAFLLGAVLFAVVPLSMGILGFMGAGAGYQAQNLGIIN FELIRHFFPSWAVLPFLFMIVSGLLSTVDSNLCAVSSLTTDIAGGKDIRKTRAAMAVLLI AGILIANIPGITVTHLFLFYGTLRASTLLPTVMTLKGVRLNAKGIIAGVVAALAVGLPVF AYGSVLNSGPYKTLGSLLTVLLSGIIVLAASGKERRYAR >gi|226332955|gb|ACII01000064.1| GENE 38 31992 - 32567 347 191 aa, chain - ## HITS:1 COG:no KEGG:LMHCC_2572 NR:ns ## KEGG: LMHCC_2572 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_HCC23 # Pathway: not_defined # 5 188 3 174 180 78 28.0 1e-13 MDTEKQFVVMSKKDVEEMIQQAAAAGAQVASDTMLVAQRRAEKERIDRRLHNTELLLRNY RTLKASCENAVYESRDSKREEVTEILEDIMEMKDDKVIVESIKASAKRTALMVQHIDKML DVYRIYCSKLSEKDKRRYKVIKLLYISKQSMNITEISKKFSVSKVTIYEDLKIAKERLSS LFFGIDGLRFF >gi|226332955|gb|ACII01000064.1| GENE 39 32576 - 32806 229 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578922|ref|ZP_04856193.1| ## NR: gi|253578922|ref|ZP_04856193.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 76 1 76 76 131 100.0 1e-29 MDKSLYNASGCKDRTAHDAICAADRTRTLVYRASRSKKDEDAELFVKMVKRIARGFGFKL CDRIKFEDPETGKKYV >gi|226332955|gb|ACII01000064.1| GENE 40 32808 - 33020 145 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578923|ref|ZP_04856194.1| ## NR: gi|253578923|ref|ZP_04856194.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 70 1 70 70 127 100.0 2e-28 MEEKRICPECGKEYSSRPALSRKDNKTMICPKCGMMEALDTVRDFYAPGMTDQHWKQYKE EYMLKYIKEN >gi|226332955|gb|ACII01000064.1| GENE 41 32995 - 33198 127 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578924|ref|ZP_04856195.1| ## NR: gi|253578924|ref|ZP_04856195.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 12 67 1 56 56 100 98.0 4e-20 MYSKCQKCGRKLTDPESIERGYGPECWGSILPHYSIEQEGPEESIPGQMTIEDFLGGLKN GGEKDMS >gi|226332955|gb|ACII01000064.1| GENE 42 33143 - 33496 323 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578925|ref|ZP_04856196.1| ## NR: gi|253578925|ref|ZP_04856196.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 117 1 117 117 198 100.0 8e-50 MKEVIFYLCGIFSCMIIWFLWAIIAYKKAKEAPLKEYARIHIDIEKAIRENEEQIVIAKK YQLIEDQVIDQMILQWKIEYLQSQREWLFTLLGGKMEDSYVQQMSEMWEKIDGSGKH >gi|226332955|gb|ACII01000064.1| GENE 43 33572 - 33886 318 104 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2124 NR:ns ## KEGG: EUBREC_2124 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 8 104 7 103 103 104 59.0 1e-21 MKKFKEKRMASYVLRSNELIQEGKTKEAAELFGKGLGYYSSNIIHAITPYATADAGLIST VLRHLASEIEEKNPGAKEMREWAETHVEKPELKEIKRIKKPNCK >gi|226332955|gb|ACII01000064.1| GENE 44 33900 - 34451 130 183 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2048 NR:ns ## KEGG: EUBREC_2048 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 25 182 28 153 161 85 36.0 6e-16 MTRAETTKFLGQLLISTRFNGAGKHWASEVSIDPWGRDAKRVDYMQFSPANQCSISGIEK GIFTCYEVKSCKEDVYSGNGLNFLGEKNYIVTTMECYKDILPDFRSGKLAKHMRDQFPES SNYFGVMVAIPYWAEITDEFENPTPIDGNTERSWKLAAILSCRQGPRKRSMSELLFCMLR SGH >gi|226332955|gb|ACII01000064.1| 
GENE 45 34454 - 34675 206 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578928|ref|ZP_04856199.1| ## NR: gi|253578928|ref|ZP_04856199.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 73 18 90 90 138 100.0 1e-31 MAEVDTRYFADLDKELDLRGLTMDGEPDSGFAWTEECGSISGERVHAWMPMEDIRIAELP EGVENRNFESMEV >gi|226332955|gb|ACII01000064.1| GENE 46 35027 - 35509 378 160 aa, chain - ## HITS:1 COG:no KEGG:BDI_0866 NR:ns ## KEGG: BDI_0866 # Name: not_defined # Def: putative prophage Lp2 protein 26 # Organism: P.distasonis # Pathway: not_defined # 1 158 1 146 149 62 32.0 7e-09 MREILFKGKKKDNGEWIEGYLLDGGMPGGKRIFIGKLVIGKWTVMADEFDEVDPDTICEY TGLTDKNGKKIWENDILMRHGNSEDLVKAVFGEFGVRNIETGSIVDKVVGWHYEIIPTDT ISRCEPLCYSMPLTKDYIDRCEMEVVGSIFDNPELLQGSL >gi|226332955|gb|ACII01000064.1| GENE 47 35771 - 36067 328 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578932|ref|ZP_04856203.1| ## NR: gi|253578932|ref|ZP_04856203.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 98 15 112 112 178 100.0 9e-44 MKLIDADKLILHLNDYALQESPSDVESAGDRKVSRAVYKAITDCIRAVDEQPTAFDLDKV MEQLEDRSALARPVGWSKAYEIIMLKDAIEIVKGGGVE >gi|226332955|gb|ACII01000064.1| GENE 48 36440 - 36820 373 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578934|ref|ZP_04856205.1| ## NR: gi|253578934|ref|ZP_04856205.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 126 1 126 126 215 100.0 6e-55 MSHIKNRLKQYRDKYSKYSKHDGLYVEDVLAMIEQLQEDLERDERPKLTKNEKSFLEALD PSWSYMLRNGRGQLYLARKEESMYGSTFKYLYLEGITNAKFDFVEAEDNSWLIDDLRKLE VKDEAN >gi|226332955|gb|ACII01000064.1| GENE 49 36919 - 37110 171 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578935|ref|ZP_04856206.1| ## NR: gi|253578935|ref|ZP_04856206.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 63 1 63 63 93 100.0 4e-18 MEEKNIKITINVECSEKSSVKKEQIAGYLLRAIAGVAANNKCLITNYVCEINEKNDDKLQ EAL >gi|226332955|gb|ACII01000064.1| GENE 50 37155 - 37481 187 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578936|ref|ZP_04856207.1| ## NR: gi|253578936|ref|ZP_04856207.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 108 3 110 110 214 100.0 1e-54 MDTYNRSIIGLKSRSNGEYFERMIMAASRFYEDRGIAVIDKTPEAFKVIKPYNRDRGQFI CCFTQQAQPDFKGALMDSTMVLFDAKHTDKGQISRNVVTEEQEECFGK >gi|226332955|gb|ACII01000064.1| GENE 51 37468 - 37713 353 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578937|ref|ZP_04856208.1| ## NR: gi|253578937|ref|ZP_04856208.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 81 10 90 90 135 100.0 1e-30 MDENKIHEKAVKMRKKTDEQLVHYVEDRVEKARSEGFNEGKALAKNTAKEFIVLLQQNKI PGIGAVTINKLVKVAGEHGYL >gi|226332955|gb|ACII01000064.1| GENE 52 37754 - 37978 137 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578938|ref|ZP_04856209.1| ## NR: gi|253578938|ref|ZP_04856209.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 74 1 74 74 148 100.0 1e-34 MKVMKPIFRTKQYIKYGFVKMEHEYYCCPKCRNILNAGPNYQPEFCDRCGQALDFSNTEW KEDRQIGFVEPEAV >gi|226332955|gb|ACII01000064.1| GENE 53 37978 - 38172 287 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578939|ref|ZP_04856210.1| ## NR: gi|253578939|ref|ZP_04856210.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 64 4 67 67 94 100.0 2e-18 MAKQIIRSIRKGSVQWNEEDRLQMVSMLAKAGYAVQIVRKEIPSSETRKTTQYEYVIEYG EKVE >gi|226332955|gb|ACII01000064.1| GENE 54 38172 - 40334 678 720 aa, chain - ## HITS:1 COG:no KEGG:CLK_A0325 NR:ns ## KEGG: CLK_A0325 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 141 706 125 638 647 191 28.0 8e-47 MEKRKLAQIPREEATDEMVRFAERAAGTHIVTTRDIEKDLLMMTFYPIDKLKKGEKSAQL RTFFSKNDYISQDLTAERVKWLTAAFDRMECIHLYEYHWDRDKGNRYTPNMFFWTDADID RMRGFFKEWSTEKDVKDWTAVTRFQDMVKQRRLDEKHAKETNPIDTVMGTVKEIPEDFKK WVSEKAMSFSRYLVYSTRSKNEALVHCTHCNGETLVDRTKIRLRNNEKGICPLCGSQVTI KARGRMPAHIWDKRIVSFIEPREEGFLWRYFSVHREVKQDGKTNDSLLEIVRTFYKFAPN GTPCTSSYEYREYKQTGIVRWCTDEGYRERSYCALYPGNLPEAWKDTPMKYSALEILSEN RPSEQIHYANAIYRYQEFPQLEWFIKMGLYKLAAHLINKIHDGAFEYNSRNGIRGLRKNG KTIFEILGLTKENTRILQSIDGNIDELRLLQEAQSSGYNLKAEELERFYKLFGCNTTLIR KENRHSTIHKICRYIEREGSDYRVGERGGCWRYSYMQYKERPDIREERLQNCAKDWLDYL AWCKELKYDLTNMFFYFPKNFKKVHDRTAAEYQAVQDKKAAEKKRREEERIKREAEVMKK LLEEMLKENAGIDNAFLIKGKGLILRVPRDAQEIKNEGAALHHCVGTYVDRVAKGQTHIF FVRRVEEPDTPYFTMEYNNGRVIQCRGSHNCGMPASVKAFVAAFEKLMKEREEKTERKCG >gi|226332955|gb|ACII01000064.1| GENE 55 40346 - 41107 747 253 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578941|ref|ZP_04856212.1| ## NR: gi|253578941|ref|ZP_04856212.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 253 1 253 253 295 100.0 2e-78 MSEQLKQEVDTAEIDRLETETDVDSKTIGQDETELLESKLEDESGNEVKAEDTVYLGKAS LAEILTGMADPTEEEIRAAEIENAKPVKQKAKEKLEAEKKKATQKNFAEPIIAYLLKRCE EDQGLAEDVMQEGKTWNKCFNYIVEQARKQSNGRSTAVEDQVVYEWAEDYYHKYEKPETA KKEKDKKPATTKKTEAPAKKVTEIKKDIQETKNDSKVSEKPEKKDAASKQRKTEKTSTKS SGLNGQISLFDLL >gi|226332955|gb|ACII01000064.1| GENE 56 41092 - 41817 329 241 aa, chain - ## HITS:1 COG:no KEGG:SJA_C1-18490 NR:ns ## KEGG: SJA_C1-18490 # Name: not_defined # Def: putative type I restriction-modification system methyltransferase subunit # Organism: S.japonicum # Pathway: not_defined # 1 216 32 232 267 141 40.0 2e-32 MAAMACTLANAVDKSLSRHTAREKEYAECIKRLGGVEKPAKCFAIVVEALERNPDQDFLG RLYMSLELGNHWKGQFFTPYDVCRCMAELTIHDNMQKLQNKEWVSVNDPACGAGATLIAA ANTFRRKGFNYQTQVLFVANDIDRVTAQMCFIQLSLLGCPGYVAVANTLSNPVAGNTLMP EERPEQEFWYTPFYFRPEWNMRRQLQILRHRQRRPTLFGRAPEQITFHFDFEKGEYKCQN S >gi|226332955|gb|ACII01000064.1| GENE 57 41909 - 42871 742 320 aa, chain - ## HITS:1 COG:no KEGG:CLD_2033 NR:ns ## KEGG: CLD_2033 # Name: not_defined # Def: GP49-like protein # Organism: C.botulinum_B1 # Pathway: not_defined # 1 170 10 177 288 159 54.0 1e-37 MGKRYYWLKLPDDFFRQKPIKKLRRIAGGDTYTIIYLKMLLVSLKNEGKLFFDGVEENFT EEIALELDEEEENVKVTVQFLMAQGLLQLIDESEYELTECSRMVGSESASAERMRRLRDK KTSQCDIGVTQQLHLSDVEKEKEKEIDKDKEKEKEIDKDKEIENKYICPEVNSGQPQPKA EIEPVAESRTKVEIEPSCSKAELKVETEPAQADVFIKLPLINGDDYLVTKEYVKELKELY PAVDVEQALRSMRGWLGSHPKNKKTPRGIKRFITGWISRDQDEASRVPDKPKPVFQNRFN NFHQRDYDFAEYERQLLKRE >gi|226332955|gb|ACII01000064.1| GENE 58 42882 - 46709 2323 1275 aa, chain - ## HITS:1 COG:slr1048 KEGG:ns NR:ns ## COG: slr1048 COG0419 # Protein_GI_number: 16329624 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Synechocystis # 412 1273 3 1002 1006 117 22.0 1e-25 MKILHTADWHIGQFKGPVVDGVNLRSQDTVNCLNYMIKVAEEEKPDIVCVSGDVFHQEQI GPVRYSDEMIVATDTITKLAGVAKAVIVMRGTPNHDGGGQFRVLSKMFANTGNVHIVTSP TVLRTPYADIACIPGFDKQEFRSRFPGLSADEENEAWTSYISSMVMGLRAECHNTSILMA 
HYTVPGCNMESGQTSFFTNFEPVIPREALEAAGYEAVLLGHIHRPQILNGLHNVFYSGAI NAMNFNDEGQERGFWIHEFSDTGKLTKGHNCITPYRRFYTITWDTEEVEAYIREGVMYLH RLGFPEDVTDKIVRVRYSCTSEQKKQLNIPALQKDLYELGAFYVADIEAENAIDVTNRGL LSEESDPTLNLKKYLEEKCFKNPDKIVELAEPIIAEAMKQSTTAEIHGVFRPISIAVRNY RNYKEERFDFADISFCTINGVNGAGKSSLFMDAIVDCLFEETREGDNKAWIRGTEDARSG SIEFVFDIGDKRFRVVRTRTKSGKPTLNLSQYEENEWRNISKERIADTQAEIEKLLGMDS MTFRSCALIMQDQYGLFLQAKKDERMAILAKLLGLGIYGVMELDSKKKLSEQRKELASKK EAVRIKTDFIKSKGDPESELQKAEEDIQQLNKDIEDLSDTQGQLLNKHAQIAKAEQECRK ASKELDDCHKRRSSISDEISSKTQILENCNVALESANEVRKKAAEYKQLSEQIIELEKDA LNHDNAKRNLAGYNADIQNCQNIINDAKRRNNDIANLIEQLKAELPDNLEEKLTELAQVR TQCEELQEKRYLASVAEQELQQIRATYSQRISEAENRRKYRLDRISEIRQQEEFMKNSGC PDTDGASCRFLAKAIDDVKSLPEEADHLEKCEEEIAALRTKRDEEISKKQDEICIIGYDA EKLDLLTVKASTLVKYENLKKDAEKKKLEIARLETEKDTNSKTIGQCEESLLELNIKAQK ATDIVDALSDSVIKHDDAVCKRNSVAHFAEQEKELPVYEERKQHIDKRLTELYQERSKED ANELVLYNNLREAEIELKELRKDIEGSEALEEVERRLKSAKETLEKAQIQKGVLTQRVED VEAMRSEIALLNKGIAVAAEKADCYEALKQAFSQDGVPHQIIRNIIPHITDTANNILGSM TGGTMGVEFVMERTVKGKDGDRATLDVLINEYGKTTLPYASKSGGEKVKASLAIILALSE IKATSAGIQLGMLFIDEPPFLDDDGTQAYVDALETIRQRYPDVKIMAITHDDAMKARFNQ SVTVIKTEDGSKVIY >gi|226332955|gb|ACII01000064.1| GENE 59 46706 - 47845 771 379 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578945|ref|ZP_04856216.1| ## NR: gi|253578945|ref|ZP_04856216.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 379 1 379 379 746 100.0 0 MQEISGSLSDVIRAYSDHNLLVPAATDVQLNPFYKYHVEEVAVDLSENSGDIYKVNRIPT GKKDGKGKEIWQDTYSLSKPLLNKLAMAAGIQFNPQQTYGRKIDATTYRAQAQGAMRKAD GTYRSEVDQREICLEDEEDRFRRESTDKAVRGITDKKAAEEAAKIFAGKWIDDTDKWGNH IKAYVIAEKDRERYIERYVKVNMALLRKSWAEKALTGAKLRVIRALLGIKSCYTIDELKK NFAIPTVIFSPDYSDPQVRQAMLMQGMNSVNNMFGMPQIEVKNVDFATDSNIIDEGDLDN PAFTSELPDEDMGEIQQEAFAQPEQEEPNEPDPQPEEDRTADFQCSRCGTIINEKVYEYS INKFGEPLCIKCQRGGGRR >gi|226332955|gb|ACII01000064.1| GENE 60 47998 - 48333 295 111 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578946|ref|ZP_04856217.1| ## NR: gi|253578946|ref|ZP_04856217.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 111 1 111 111 182 100.0 5e-45 MDVRKNVEHQYITEDSKTRIYLSVVREFNEKEYEEYSKKKKRAKIKKRIRFLKRSMVYIV PTAVSLIFFGYLSDMLCAIRGSAELGSEWIAIPLMWVWVYALTRFAVGDAY >gi|226332955|gb|ACII01000064.1| GENE 61 48333 - 48497 140 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578947|ref|ZP_04856218.1| ## NR: gi|253578947|ref|ZP_04856218.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 54 1 54 54 98 100.0 1e-19 MFTTEDMKKYHTTAERILNALDNSSVPISWHEMDRCALQSVIAKELILIDKEAR >gi|226332955|gb|ACII01000064.1| GENE 62 48515 - 48661 234 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDLKEKLKEILKKNYGITSDAELLEELNNMESVDLGIFVTQIDTEKTA >gi|226332955|gb|ACII01000064.1| GENE 63 48676 - 48873 227 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578949|ref|ZP_04856220.1| ## NR: gi|253578949|ref|ZP_04856220.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 65 7 71 71 116 100.0 4e-25 MQKKLSPWCKKAKIAMIQNDISVNDLAEELGCSRCYLSSTLNGKKTSIEIRRRISDYLNI SDSDN >gi|226332955|gb|ACII01000064.1| GENE 64 48888 - 49364 356 158 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01565 NR:ns ## KEGG: EUBELI_01565 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 158 18 174 176 122 41.0 3e-27 MSKFATKAAGNMFCQARYEAAKFNERLSSREGAAEELGVDRTRLARIELGSVTPYPEEVL LMADIYRAPELKGNYCREMCPLGKGMPKIENQDIDRIALRALCSFRKINEAKELLLDITA DGVITEDEKPDLEKIINTLNEVNEITQNLKNWIEKSLK >gi|226332955|gb|ACII01000064.1| GENE 65 49419 - 49601 98 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578951|ref|ZP_04856222.1| ## NR: gi|253578951|ref|ZP_04856222.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 60 1 60 60 102 100.0 1e-20 MNKMKEYRQNKNMSFLDLSLKTGISERYLRFIEKGDRTPSLKTANIIASALNATVDKIFL >gi|226332955|gb|ACII01000064.1| GENE 66 49713 - 50225 54 170 aa, chain + ## HITS:1 COG:L80045 KEGG:ns NR:ns ## COG: L80045 COG1396 # Protein_GI_number: 15674195 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 1 68 1 71 102 63 42.0 1e-10 MGIESRIKDLRIENNYTQRELAAKIGLTPKMISFYEKGERVPPLDIIVKLVQIFNVSSDY LLGLSDKRYPDEDLGWRSPHIENRFGKILSDYRRTNDISISDFSKKIGVSKDLLSQIEFG IYTPSLELLRKISELTGYSIDYLTGAEILTRVNKNIESSGQILTSFFCGK >gi|226332955|gb|ACII01000064.1| GENE 67 50344 - 50589 107 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578953|ref|ZP_04856224.1| ## NR: gi|253578953|ref|ZP_04856224.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 81 14 94 94 153 100.0 4e-36 MPTLSELLRISYGFEVSVDFLLGKTDFPNINLTTDEVELLLNYRDCIQPYKANIRDRAEK LSIESINISPNTEGQPLKKAK >gi|226332955|gb|ACII01000064.1| GENE 68 50632 - 51390 452 252 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578954|ref|ZP_04856225.1| ## NR: gi|253578954|ref|ZP_04856225.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 252 1 252 252 438 100.0 1e-121 MKKRLFIVLLAATISITGSQASFVLASSDSEPSAEQTSDEKSTFTFRDIQWWDTKTDAEK QLVAEGAEIQTAAFEDNILRMSGIDYANSTGAKDRVEGGGTVVRYSGLKVAGYTPSETEA CYIYTLNDDGSINKDKDSAQFYFGWYTFESYDYADGEGVYNDLLQKLQSLYGDGAINSDD DYFTTSTWTDADNNQIRLLLGGKNKDYKYVTLGYIAAGADEKLDEMQAAVDAENIASEAA DREKNKKDVSGL >gi|226332955|gb|ACII01000064.1| GENE 69 51405 - 51959 342 184 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578955|ref|ZP_04856226.1| ## NR: gi|253578955|ref|ZP_04856226.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 173 1 173 184 314 100.0 2e-84 MTIGKRIKELRTEADLLQSELGKAVAVSSQVISNIERGYTKPSTELVNRCAKYFGVPADY LLGRTTEKYSTTEQKEAPALSAKIKDRMDQLQLNPSDLITKSEIPEDSFEDIMTGTVIPG IDVAGRLSKALDTSIDYLVGNSEYSCAIASEDEQDIILRYRQLSKKGKRIFLGMMEKMEE EKTE >gi|226332955|gb|ACII01000064.1| GENE 70 52279 - 54024 449 581 aa, chain + ## HITS:1 COG:CAC1956 KEGG:ns NR:ns ## COG: CAC1956 COG1961 # Protein_GI_number: 15895229 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Clostridium acetobutylicum # 19 507 2 530 531 164 26.0 5e-40 MARKRTNLIGNTPSVRETKVAIYIRVSTIHQVDKDSIPMQKKDLIAYCQLILGTDNYEIF EDAGYSGKNTDRPAFQNMMGRIRKGEFTHVLVWKIDRVSRNLLDFAEMYEELRSLRVTFV SKNEQFDTSTAIGEAMLKIILVFAELERNMTSERVTATMISRANSGQWNGGRIPFGYSYD PKEKVFSIREDEASICRELKDLYLLNRSLVYVSRALNEKGYKTRAGVSWSPHSVWIIASS PFYAGIYRYNRYKGVESRTINPEEEWVMIQNHHPAIFTLEEHQAMRSIMKSNQRNMDNLP GRVHLSTKTHIFQGIMYCDKCGSKMVSTPGRLHVDGYRTSNYGCPLRRNTKKCNNPTVND IIIGEFVINYILNMLNAKKTFSTINTPDELNAALLSGSVFSEVSSVEENGLNSFFNLLSR YGSDRSYIFSVKSPRKKKAAVDPELSKLRKEKEKQERALQRLQDLYLYSETSMSEKDFII RKSEISSHLDNINRQLGLMTQDQASFLSDEEFIKQASHLLIQKELKNKKYIYFKKLVSTV DPDILKAYMETILDSIYTADGKITAITFKNGLTHRFIYKDK >gi|226332955|gb|ACII01000064.1| GENE 71 54106 - 55131 1176 341 aa, chain - ## HITS:1 COG:CAC2283 KEGG:ns NR:ns ## COG: CAC2283 COG0809 # Protein_GI_number: 15895551 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Clostridium acetobutylicum # 1 340 1 340 341 449 60.0 1e-126 MKTSDFYYDLPQELIAQDPLEDRSSSRLMHLSLKDGSIEHRHFTDVLDYMEEGDCLVIND TKVIPARLYGHKEETGALIEILLLKRRENDIWECLVKPGKKARPGAKITFGNGILKGKII DVVDEGNRLIQFHYEGIFEEILDQLGEMPLPPYITHKLKDKNRYQTVYAKNEGSAAAPTA GLHFTKELLEKVKEKGVNIAHVTLHVGLGTFRPVKVDDVESHHMHSEFYIVEEDQAKLIN DTKKAGKRVIAVGTTSCRTLESATGEDGILKAGSGWTEIFIYPGYHFKMIDALITNFHLP ESTLVMLVSALAGKENIMHAYEVAVQEKYRFFSFGDAMILI >gi|226332955|gb|ACII01000064.1| GENE 72 55128 - 55886 877 252 aa, chain - 
## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 243 1 243 246 275 52.0 6e-74 MNFHVLTLFPEMIENGMNTSITGRAITKGLLTLEAVNIRDYAFNKHQKVDDYTYGGGAGM LMQAEPVYLAYEAIANRTAKKPRVVYLTPQGQVFNQAMAREMAQEEDLVFLCGHYEGIDE RVLEEIVTDYVSIGDYVLTGGELPAMVMMDSISRMVPGVLNNQESGETESFAGNLLEYPQ YSRPEEWHGKKVPEVLMSGHHANIEKWRREQSIYRTAKRRPDLLKKADLTNKEWNYVRQL RKQWKEEEEQIK >gi|226332955|gb|ACII01000064.1| GENE 73 55887 - 56390 187 167 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 [alpha proteobacterium BAL199] # 1 160 1 159 179 76 30 3e-13 MEDLLKVGVITTTHGIRGEVKVYPTTDADRFLDLEYVLLDTGREKRKLEIENVKYFKNLV ILKFRGIDNINDIEMYKKRELWIPREEAQELEEDEYYIADLIGMDVVLEDGSKFGTLKDV METGANDVYVVEDAKGEEILLPAIRECILDVDVEKNVMTIHLMKGLI >gi|226332955|gb|ACII01000064.1| GENE 74 56535 - 56765 337 76 aa, chain - ## HITS:1 COG:CAC1756 KEGG:ns NR:ns ## COG: CAC1756 COG1837 # Protein_GI_number: 15895033 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein (contains KH domain) # Organism: Clostridium acetobutylicum # 1 75 1 75 75 70 66.0 7e-13 MKELVEVIAKSLVENPDEVVVTETEKEDAIIVELKVGPADMGKVIGRQGRIAKAIRTVVK AASSKSSKKVIVDILQ >gi|226332955|gb|ACII01000064.1| GENE 75 56829 - 57074 326 81 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238923878|ref|YP_002937394.1| 30S ribosomal protein S16 [Eubacterium rectale ATCC 33656] # 1 81 1 81 81 130 79 2e-29 MAVKIRLRRMGQKKAPFYRIVVADSRCKRDGRCIEEIGTYDPNLEPSAVTVDEEAAKKWL AAGAQPTDVVAKILKDAGIEK >gi|226332955|gb|ACII01000064.1| GENE 76 57188 - 58543 1767 451 aa, chain - ## HITS:1 COG:BH2484 KEGG:ns NR:ns ## COG: BH2484 COG0541 # Protein_GI_number: 15615047 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Bacillus halodurans # 1 433 1 433 451 443 55.0 1e-124 
MAFESLTDKLQNVFKKLRSKGRLTEEDVKLALKEVKMALLEADVNFKVVKQFTKSVQEQA IGQDVMSGLNPGQMVIKIVNDELVKLMGSETTDLPLRPGNEITVFMMAGLQGAGKTTTVA KLAGKLKSKGKKPLLVACDVYRPAAITQLQVNGEKQGVEVFSMGDKQKPVDIAKAAIEHA KANQQNVVLIDTAGRLHVDEDMMQELADIKANIQVDATILIVDAMTGQDAVNVASTFADK VGIDGVILTKMDGDTRGGAALSIKAVTGKPILYVGMGEKLSDLEQFYPERMASRILGMGD VMSLIEKAEASIDKEQAQDMQKKLKKMDFDFNDYLTSMEQMNKMGGIGSILNMLPGMGGK MKDIEGMIDEKAMDRTKSIILSMTPQERSNPSILNISRKNRIARGAGVDVSEVNRLVKQF EQSKKMMKQMPGLMGGKAGKRGGGFKLPFGL >gi|226332955|gb|ACII01000064.1| GENE 77 58547 - 58870 423 107 aa, chain - ## HITS:1 COG:BS_ylxM KEGG:ns NR:ns ## COG: BS_ylxM COG2739 # Protein_GI_number: 16078660 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 107 3 109 110 61 39.0 4e-10 MEKFVEQGYLYDFYGELLTERQQQVYESVVLEDYSLSEVAEDLGISRQGVHDMIKRCNKT LEEYEQKLHLVEKFLNIRAQVKQIRVLAREYHSDEIVNISNEILEEL >gi|226332955|gb|ACII01000064.1| GENE 78 59056 - 60267 1615 403 aa, chain - ## HITS:1 COG:FN1411 KEGG:ns NR:ns ## COG: FN1411 COG1171 # Protein_GI_number: 19704743 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Fusobacterium nucleatum # 1 393 1 393 404 398 57.0 1e-110 MLTLESFEQASEIVKQVTQETKLIKSSYFSELTGNKVYLKPENMQRTGAYKVRGAYYKIS TLSDEERNKGLITASAGNHAQGVAYAAHKYGVKAVIVMPTTTPLIKVERTKSYGAEVILH GDVYDEACAHALELAEKEGYTFIHPFDDPAVATGQGTIAMEIVQELPLVDYILVPIGGGG LATGVSTLAKLLNPHIKVIGVEPAGAACMKASLKKGEVVTLPHVNTIADGTAVQTPGKKI FPYIQKNLDDIITIEDDELIVAFLDMVENHKMIVENSGLLTVAALRHLDVKGKKIVSILS GGNMDVITMSSVVQHGLIQRDRIFTVSVLIPDKPGELVRVASVIARAQGNVIKLDHNQFV STNRNAAVELKITLEAFGTDHKNEIVKALEDAGCRPKVIRPSL >gi|226332955|gb|ACII01000064.1| GENE 79 60348 - 61298 595 316 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157804145|ref|YP_001492694.1| 50S ribosomal protein L32 [Rickettsia canadensis str. 
McKiel] # 9 305 5 300 303 233 42 2e-60 MAEEKKGFFKRLVSGLAKTRDNIVAGFDSIFSGFSSIDEDFYEELEEILIMGDIGINATT SIIENLKKEVSERHIKEPMECKQLLINEIKDQMRVDSTEYEFENRRSVVLVIGVNGVGKT TSVGKLAGKLKDQGKKVILAAADTFRAAAGEQLTEWANRAGVEIIGGQAGADPASVIYDA VAAAKARNADVLLCDTAGRLHNKKNLMEELRKIYRILEREYPDAYLETLVVLDGTTGQNA LAQARQFAEVANVSGIILTKLDGTAKGGIAVAIQSELDIPVKYIGVGESIDDLQKFDADA FVNALFDVDHKENPDA >gi|226332955|gb|ACII01000064.1| GENE 80 61314 - 64145 2857 943 aa, chain - ## HITS:1 COG:BS_smc KEGG:ns NR:ns ## COG: BS_smc COG1196 # Protein_GI_number: 16078657 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Bacillus subtilis # 30 940 266 1182 1186 377 31.0 1e-104 VEEKNIIAQNQLKDTQSSYDRTKVEYERLELELEDLNSKMDALRISGQEQAIRKQQLEGQ INVLNEQILAGAQNEEHYKGRIQTIEAELSVRTDSKKKLEEEKTDIYAQLKAVRKKLSEE EEKLRTAQENMEACTLEVENGKNEIIEILNSRANVKGKAQRFDAMMEQAEIRKAEISQRI LRLKSEEEEQQTILTTGRKQYDEITSQIENANEECEHLNLSVMKIQEKLKEQNTKLEAEQ TAYHREASRLDSLRNIAERYDGYGNSIRRVMEQKERVPGIQGVVADLIQVNKDYEIAIET ALGGSIQNIVTDNEQTAKTMIEFLKKNRYGRATFLPMSSISPRGEFTPKEALKEPGVVGI ASELVSVASQYQQITKFLLGRVLVVDNIDHAIAIGKKYRHSLRMVTTEGESLNPGGSMTG GAFRNNSNLLGRRREIEELETKVNQLKQNLTEMQNAVEENKNQRNRLRDAIAGFQEKLRQ KYIEQNTARMNIKQQEKKAEEIRSGYAQINRDQAEIKRQVMEIRQDHDRIARELENSKQD EKELESFIEAKQSELDEWKEEEKKITRELEEIRLQSSALEQKEKFDQENLNRLKAEITAF MTEKEDIYQSLAHSSEEMEKKQEMITQLKKESEESVLHEEQAQMQLKNLQKEKESRTSQH KDFFEKRDYLSGQIGLLDKECYRLQGQMDKLEENREERIAYMWSEYEITPNNAVSYRKEE LTDLSQMKRQAAQIKDDIRKLGPVNVNAIEDYKELLERHTFLSGQYEDLVTAEKTLEQII QELDEGMRKQFSEKFGEIQKEFDKAFKELFGGGKGTLELDEEADILEAGIKIISQPPGKK LQNMMQLSGGEKALTAIALLFAIQNLKPSPFCLLDEIEAALDDSNVGRFAGYLQKLTKNT QFIIITHRRGTMNAADRLYGITMQEKGVSTLVSVDLVENQLTQ Prediction of potential genes in microbial genomes Time: Sat May 28 19:49:49 2011 Seq name: gi|226332954|gb|ACII01000065.1| Ruminococcus sp. 
5_1_39B_FAA cont1.65, whole genome shotgun sequence Length of sequence - 5973 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 3 - 729 625 ## COG1196 Chromosome segregation ATPases 2 1 Op 2 7/0.000 - CDS 741 - 1424 877 ## COG0571 dsRNA-specific ribonuclease - Prom 1490 - 1549 3.0 3 1 Op 3 3/0.000 - CDS 1551 - 1778 392 ## COG0236 Acyl carrier protein 4 1 Op 4 . - CDS 1816 - 2853 1223 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme 5 2 Tu 1 . - CDS 3029 - 4846 1754 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 4927 - 4986 4.1 - Term 4967 - 5018 12.6 6 3 Tu 1 . - CDS 5057 - 5869 1023 ## gi|253578973|ref|ZP_04856244.1| predicted protein - Prom 5900 - 5959 6.0 Predicted protein(s) >gi|226332954|gb|ACII01000065.1| GENE 1 3 - 729 625 242 aa, chain - ## HITS:1 COG:CAC1751 KEGG:ns NR:ns ## COG: CAC1751 COG1196 # Protein_GI_number: 15895028 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Clostridium acetobutylicum # 1 231 1 231 1191 268 61.0 6e-72 MYLKNIEVQGFKSFAQKINFEFHNGITGIVGPNGSGKSNVGDAVRWVLGEQSARSLRGGN MQDVIFSGTETRKPLGYASVAITLDNSDHKLPVDFNEVTVTRRLYRSGESEYKINGSACR LKDINEMFYDTGIGKEGYSIIGQGQIDRILSGKPEERRELFDEAAGIVKFKRRKNTTIKK LEEERQNLVRVTDILSELTRQLEPLEKQSETARVYLSKRETLKELDVNMFLMEYAAAAKE LK >gi|226332954|gb|ACII01000065.1| GENE 2 741 - 1424 877 227 aa, chain - ## HITS:1 COG:lin1919 KEGG:ns NR:ns ## COG: lin1919 COG0571 # Protein_GI_number: 16800985 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Listeria innocua # 6 225 5 226 229 191 47.0 9e-49 MNRKTEELEEIIGYHFKNKHYLTQALTHSSYANEKKLGKLGSNERLEFLGDAVLELISSD YLYARFTQIPEGELTKKRASLVCEPSLAYCAREFGLPQFLLLGKGEDMTGGRNRDSIVSD ATEALLGAIYLDGGFANAKEFVLNFILNDIEHKQLFYDSKTILQEIVQENGTQPVEYILT KEEGPDHNKNFTVEARVNGKVMGQGSGHTKKAAEQAAAYQAIRVLRK 
>gi|226332954|gb|ACII01000065.1| GENE 3 1551 - 1778 392 75 aa, chain - ## HITS:1 COG:BMEI1475 KEGG:ns NR:ns ## COG: BMEI1475 COG0236 # Protein_GI_number: 17987758 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Brucella melitensis # 4 71 6 73 78 57 50.0 7e-09 MELEKIKAIIAEVLNIDADSITEDTTFVDDLGADSLDIFQIIMGIEEEYDIELDNESVEQ IQTVGDAVEAIRTIK >gi|226332954|gb|ACII01000065.1| GENE 4 1816 - 2853 1223 345 aa, chain - ## HITS:1 COG:CAC1746 KEGG:ns NR:ns ## COG: CAC1746 COG0416 # Protein_GI_number: 15895023 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Clostridium acetobutylicum # 7 335 3 329 331 297 46.0 2e-80 MAEQVRVVVDAMGGDNAPAEPVRAAVEAVTERQDIKVILTGKQEVIDKELAKYSGYPKNQ IQVVNASQVIETAEPPVFAIRKKKDSSIVVGLNMVKKQEADAFVSSGSTGAVLVGGQVLV GRSKGVERPPLAPLIPTTKGVSLLIDCGANVDARPSHLVQFAKMGSIYMENVVGIKNPKV AIVNNGAEEEKGNALVKETFPLLKECKSINFIGSIEARDIPAGYADVVVCEAFVGNVILK LYEGVGSALVQKIKEGMMTSTRSKIGALLVKPALKETLKSFDATEYGGAPLLGLKGLVVK THGSAKAIEIKHGIFQCVQFKQEKINEKIAERILEDQEVLKSGAE >gi|226332954|gb|ACII01000065.1| GENE 5 3029 - 4846 1754 605 aa, chain - ## HITS:1 COG:CAC3012 KEGG:ns NR:ns ## COG: CAC3012 COG0488 # Protein_GI_number: 15896264 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 599 1 630 632 527 45.0 1e-149 MNILNIEHVSKVFGEKVVLDDVSYGVHQGDKIGIIGINGTGKSTILKIIGGLEEPDEGQV ITQNGLRITYLPQMPEFPQGASVLDYVAEGKWQKDWSTESEARNVLNKLGITDHEEKIDH LSGGQKKRVALARTLVNPCDVLLLDEPTNHLDNEMVTWLEDFLRSFKGVVIMVTHDRYFL DRVTNKILEISHGGLYAYEANYSKFLELKAEREEMELASERKRQSVLRMELEWAKRGCRA RSTKQRARLERLEALKNGKAPVRDANVELDSVETRMGKKTIELHHISKSFGEKKILDDFN YIVLRNQRLGIIGPNGCGKSTLIKIIDGMIQPDAGEVEIGETIRIGYFAQEVPDMDTNQR VIDYIRDVAEYIPTRDGKISATMMLERFLFDSAMQYAPIAKLSGGEKRRLYLLKVLMEAP NVLLLDEPSNDLDIPTLTILEDYLDSFAGIVIAVSHDRYFLDNIVDRIFAFEGNGHLTQY 
EGGYTDYTETLARKGGAVSEGQSTAVGAEKKKSAQADWKQNRPQKLKFTYKEQREFETID DDIAALEELLEKLDKDMEANATNSVKLREIMEQKEKTQADLDEKMDRWVYLNDLAERIEA QKSEK >gi|226332954|gb|ACII01000065.1| GENE 6 5057 - 5869 1023 270 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578973|ref|ZP_04856244.1| ## NR: gi|253578973|ref|ZP_04856244.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 270 1 270 270 433 100.0 1e-120 MKNWKKIMAGLCAAAMVSSMVVPVWAAEETTEEAADDGDVADVSEAMLGLWKDSAGDIYG FYKDNSFFGQWKDEEQDVLGVYALSSDGEYTALVMQFANDDGTYDDENMVTYLVQANEEE NLLELYDPDSLDLTATLEPYEADGDESDYNQTYQDMGDILTECYSGETEAGETFIYAANE DGTFCSVLVIDQDDNYVSFVGEGTFDEENGTVTITDEVSEMALTFGVAVNDDDTLTLDMG DLGSATVEEATLAVAVQGLKYAVENGTEMN Prediction of potential genes in microbial genomes Time: Sat May 28 19:50:04 2011 Seq name: gi|226332953|gb|ACII01000066.1| Ruminococcus sp. 5_1_39B_FAA cont1.66, whole genome shotgun sequence Length of sequence - 13182 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 7 - 66 4.1 1 1 Tu 1 . + CDS 86 - 808 499 ## COG3279 Response regulator of the LytR/AlgR family + Term 835 - 875 -0.1 - Term 586 - 624 1.6 2 2 Op 1 . - CDS 853 - 2757 1198 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 3 2 Op 2 . - CDS 2810 - 5035 1193 ## gi|253578976|ref|ZP_04856247.1| predicted protein - Prom 5099 - 5158 4.7 4 3 Tu 1 . - CDS 5188 - 6540 1380 ## COG0534 Na+-driven multidrug efflux pump - Prom 6700 - 6759 4.6 - Term 6709 - 6749 2.1 5 4 Op 1 . - CDS 6771 - 6953 291 ## PROTEIN SUPPORTED gi|227872300|ref|ZP_03990657.1| possible ribosomal protein L32 6 4 Op 2 1/0.500 - CDS 6958 - 7485 193 ## PROTEIN SUPPORTED gi|168184665|ref|ZP_02619329.1| conserved hypothetical protein - Prom 7505 - 7564 8.6 - Term 7639 - 7679 6.3 7 5 Op 1 21/0.000 - CDS 7760 - 9037 1793 ## COG0282 Acetate kinase - Prom 9072 - 9131 3.9 - Term 9137 - 9179 -0.9 8 5 Op 2 . 
- CDS 9182 - 10192 1324 ## COG0280 Phosphotransacetylase - Prom 10261 - 10320 6.6 + Prom 10263 - 10322 8.0 9 6 Op 1 . + CDS 10403 - 11770 633 ## COG1323 Predicted nucleotidyltransferase 10 6 Op 2 . + CDS 11851 - 12639 788 ## Blon_1431 hypothetical protein 11 7 Tu 1 . - CDS 12649 - 13062 475 ## COG1226 Kef-type K+ transport systems, predicted NAD-binding component - Prom 13106 - 13165 6.2 Predicted protein(s) >gi|226332953|gb|ACII01000066.1| GENE 1 86 - 808 499 240 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 233 2 230 234 89 24.0 6e-18 MNIAILDDSKEDLRQVKSVISSYYESQNTSADIHLYSSAEAFFTKYIPGFYDLIIMDIYM GSMTGMDAARKLREARDTAALIFISSSDSFAVESYDVQASYYLLKPFDPEKLCRILSTIQ FRQSQNSRYIELISDRTPVKIPVRSILYVDTYRNAVQVHTTDAGIIRSYITFQKFEELLE GMKCFLSCYRGCIVNMDHIEKSTDEGFVMDNQELVTVRKRGSNAIKKAYLEYLFSSDSDK >gi|226332953|gb|ACII01000066.1| GENE 2 853 - 2757 1198 634 aa, chain - ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 398 630 212 452 452 79 28.0 2e-14 MKKNWKYSLLLIVAVFALTMGIFFLFYRFDNKYTARGDQAIQGILYVPDDDSVHYLAREW EYYPNVLLTPQEIKEQKEDYYSRYVSIGEYGGMDLGDKNKSPFGSGTYRMTLVLPEKEKQ YAIGLTEIFSAYKLYINGELMGQMGNPDPDNYQEQIQNRVFTFEGKGTAEIVIAVTDKHS VSSGIQYVPVLASPFKVNIIRGLSLMIAVVFMAFTFFVLIFSVYMYMRTKKVEFGLFALL CICVLGYGSYPILHSFVAVKVQPCYGMEALFYYLMFAGVMLVQQKILGGEERIPEILAGV AAAAGVMVFVAEMLCSRAQSATGLYLISKLTEILKWAAAGYLLFRSVKEIKQKYGNVLLV GIVVYAASLAADRIWRLYEPIIGGWFMEIGAAVLVASVGLTLWMELANAYRFQLTYGEYS RQMEQKLQIQKQNYEELTEKMDEISRMRHDMRHHLRTIMSYTQQGKYQEMMEYLQEYASV ITEEEKLICYCRNMAVDAVIHFYAGELRKKGIPFECDMMLPQNIGISDTDLCKIYGNLLE NAVDAVKDLLPENKPYVKMLTKVKNRKLLIEISNPYSNGIQRREKVFYSTKHEGFGIGTA 
SVAEVVQRMGGYVVFNTEEGIFKVNIFLPVKTVQ >gi|226332953|gb|ACII01000066.1| GENE 3 2810 - 5035 1193 741 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578976|ref|ZP_04856247.1| ## NR: gi|253578976|ref|ZP_04856247.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 741 1 741 741 1392 100.0 0 MRKQKITVLILMITLCLTGIWGMKKNTTVWGAVIRVNDVIVPKDIPAVAVEINKSSELIA EKLSQVFYTVYTDTDTLHIPVFWDVSSVNMKLAGVYTMKGVLKLPPEYAFDESESLQVQT TVSVQYPDKPDINIYYRLTAAGIYIFPWLKQENFDSMEAYLKKESGQWINLTEEGFALCE EEGLYVSNQSMVVGNTYSLLVTYDNGRKQTNTLRFQYQKDGSLKIYGYQHSLLGNTQKPG KIICSYDTGDEKYLSRCAAYAVKTGGSLKQIEKELKENVRLRVSTAETFENTAENPELIL ESSWDLSRVNREKQGVYKVTGSFVIPEGYEVSDSLTLPEAYAYLSVQKKGEPQINTYSMP MVDLLEFPMLLDGFSAEERRNIQVYLSENGGNYRKIDDDLVEVTSAGIQIYYREILKKGQ NYQICAVYEKGSTGIYSFSYNDAFIVNEYWHERNFSDREEKDLPDIVQKAPASSEVVQEE KKENLNTGSAGYHYGDSYGTDNNTDGQNSSSNKASDAKSAEISDTENFVTELSTDTITAV SGKRMLLMIQQNGSARFEKQGISVTFSPDTVNGWKINAEDEIQIGIEKTSETAFSLRIFV RGEKVTEIPGTVVEFQASVFGGIQLPETVKAEDVQGNQYAVNYQEKQNVLRMEINQTGDY FLTDGESENIGDQTVEDSVEAISDGENIGFTEEPMPEEETEQKTEQKTENSAKQFLMIIL TLPAVCILVAAIIFAIRKKRR >gi|226332953|gb|ACII01000066.1| GENE 4 5188 - 6540 1380 450 aa, chain - ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 442 1 441 445 320 42.0 5e-87 MNEKEKVNQITEGVIWKQLLLFFFPIVFGTFFQQIYNTADTIVVGRFVGKQALAAVGGSA SQIANLIVGFFVGLSSGAAVVISQFYGAKDKKNLSKALHTAFAFSIAAGIVLTVVGIFLT RPALLLMKTPADVVEDSAVYLHIYFGGMVFNLVYNMGAAILRAVGDSKRPLYVLIITCVL NIILDLLFVVAFDMGVTGVAVATVTSQVISALIVTVMLLKTREIYVLKINRIRFDRRMLF SVLRIGIPAGLESVMYNISNIVIQVFVNNLGTDTVAAWGTLGKIDALFWMVINAFAISIT TFVGQNFGAGKYHRMRKSVSVCMIMSMVSSAVLIILMYSFAPWIYRLFTTDSAVIVHGVH MSRFLLPSYFIYVIIGILSGALRGTGKVLVPMLLTCGGVCSLRILWLFTAGQMYPGINTI MLSYPVSWSITAVLFIVYYFMKFPGKKKEN >gi|226332953|gb|ACII01000066.1| GENE 5 6771 - 6953 291 60 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227872300|ref|ZP_03990657.1| possible ribosomal protein L32 [Oribacterium sinus F0268] # 
1 60 1 60 60 116 88 7e-26 MSICPKNKSSKARRDKRRANWKMSAPNLVKCSKCGELMMPHRVCKACGSYNKKEIIKVED >gi|226332953|gb|ACII01000066.1| GENE 6 6958 - 7485 193 175 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168184665|ref|ZP_02619329.1| conserved hypothetical protein [Clostridium botulinum Bf] # 46 170 46 164 166 79 32 2e-14 MQIHLSDISSSEGMRIQKTAEFGMDTITFQSGSFPVLAKEPIELTITNTGDRNLEIRGTG KITVGIPCDRCLEEVSTEIPLEIERKLDMKLTDEDRVNDLDESSYLTGMDLDVDQLVYLE VLMSWPLKVLCREDCKGICSQCGKNLNDGPCGCVEEPKDPRMAAISDIFSKFKEV >gi|226332953|gb|ACII01000066.1| GENE 7 7760 - 9037 1793 425 aa, chain - ## HITS:1 COG:TM0274 KEGG:ns NR:ns ## COG: TM0274 COG0282 # Protein_GI_number: 15643044 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Thermotoga maritima # 30 425 1 398 403 484 58.0 1e-136 MKGKADRNIRRLVSGLWTVFEICCRQNMKMNVLVINCGSSSLKYQLINSDSEAVLAKGLC ERIGIDGRLVYQKTGCDKEITEAAMPTHKEAIQMVLDALTNDKTGAIGSLKEVNAVGHRV VHGGEKFAKSVVITDEVISAVEECNDLAPLHNPANLIGIRVCSELMPGVPQVAVFDTAFH QTMPAKAYLYGLPIEYYKNYKVRRYGFHGTSHSFVSKRAVEFLGLDKDNSKVIVCHLGNG SSISAVVNGECVDTTMGLTPLEGVVMGTRSGNIDPAIMEFIAKKENLDIAGMMNVLNKKS GLLGLSGGLSSDFRDLNDAAQSGNEDAANAIDVLCYGIAKFVGGYVAAMNGVDAIVFTAG IGENAIPVREKVVSYLGYLGVTLDKEANGVRGEEIVISTPDSKVKVAVIPTNEELAICRE TVALV >gi|226332953|gb|ACII01000066.1| GENE 8 9182 - 10192 1324 336 aa, chain - ## HITS:1 COG:MA3607 KEGG:ns NR:ns ## COG: MA3607 COG0280 # Protein_GI_number: 20092407 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Methanosarcina acetivorans str.C2A # 3 334 4 332 333 366 59.0 1e-101 MGFIDTLKERAKANVKTIVLPETEDKRTLAATEKILKEGIAKVILVGNEEAVKKSAAEDG YDISGAQIVDPATSEKTQGYIDKLVELRQKKGMTPDQAKEILLNQYLYYGVMMVKMGDAD GMVSGACHSTADTLRPCLQILKTKPGTKLVSAFFLMVVPNCEYGANGAFVFADSGLNQNP TSEELAAIAASSAESFELLVQEKPIVAMLSHSTKGSAKHADVDKVVEATKIAKEQNPELA LDGEFQLDAAIVPSVGASKAPGSEVAGKANVLVFPDLDAGNIGYKLVQRLAKAEAYGPVT QGIAKPVNDLSRGCSADDIVGVVAITAVQCQADDNK >gi|226332953|gb|ACII01000066.1| GENE 9 10403 - 11770 633 455 aa, chain + ## HITS:1 COG:CAC1741 KEGG:ns 
NR:ns ## COG: CAC1741 COG1323 # Protein_GI_number: 15895018 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Clostridium acetobutylicum # 1 407 1 364 402 232 36.0 1e-60 MKTAGIIAEYNPFHKGHEYQIRYTKEKLKADYVIVAMSGDYVQRGTPALISKHARAEMAL RCGADLVLEMPVSVSTASAEAFAMGGVSLLDGLGVVDMLCFGSESGEISALKELAEILVE EPEEYKKLLKSFLSEGLTFPAARSQALTEYFKNPRNFSGDDFDGVLTPLLNEVTQILNTP NNILGIEYCKALLRLNSQIRPVTIRREGMGYHETTVPEGDSASSSPDLQSSTDFFASATA IRSLIQNPGDGHSEASSDINNPVRNPDTKTANILSSQIPPDAFYVFKKALDSGEFLTENS LDSILSYCLMKENVESLSSYMDVSEDLARRIINQQNLLLSFSQSVSVLKTRELTQTRIQR ALLHIILNIHTAPTQIPFARVLGFRRESSELLSQIKQHSQIPLITKLADAQNLLDSEGNQ ILSETVFSSNLYEKLLCLKTGRKFCHESQKQLIIL >gi|226332953|gb|ACII01000066.1| GENE 10 11851 - 12639 788 262 aa, chain + ## HITS:1 COG:no KEGG:Blon_1431 NR:ns ## KEGG: Blon_1431 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 9 257 53 303 358 262 54.0 8e-69 MKANSGLKKNRTKIQMVLTALVILSVVRQFFLGNYHNMFLGILTLILFMVPQFLDRKLNV TIPPGLETVILIFIFSAEILGEINAFYVKIPIWDTILHTTNGFLMAAIGFALIDLFNRSD RFSIKMSPYFVAFFAFCFSMTVGVMWEFFEFSMDWFFGMDMQKDWIVPAINSVKLNPTGA NVPIHVDVQSLAVNGETWNLGGYLDIGIVDTMKDLIVNFIGAVVFSIIGILYLRNRGKGK LAASLIPQVRSKQEEELSSKNQ >gi|226332953|gb|ACII01000066.1| GENE 11 12649 - 13062 475 137 aa, chain - ## HITS:1 COG:lin2165 KEGG:ns NR:ns ## COG: lin2165 COG1226 # Protein_GI_number: 16801231 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, predicted NAD-binding component # Organism: Listeria innocua # 9 126 105 223 247 61 25.0 4e-10 MKRIKVLGSVLKRTRADKIIIGFIVFIFAIAAVIQLVEPDINRYGDALWYCYAVISTAGF GDVVAVTFIGKMCSVLLTIYSLFVVAIATGVVVNFYTQMVELQRKETLAMFMDQLERLPE LSREELENISRKVRQFR Prediction of potential genes in microbial genomes Time: Sat May 28 19:50:35 2011 Seq name: gi|226332952|gb|ACII01000067.1| Ruminococcus sp. 
5_1_39B_FAA cont1.67, whole genome shotgun sequence Length of sequence - 13772 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 4, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 37 - 519 522 ## gi|253578986|ref|ZP_04856257.1| predicted protein 2 1 Op 2 . + CDS 538 - 1812 1097 ## Elen_2005 chloride channel core - Term 1842 - 1876 -1.0 3 2 Op 1 13/0.000 - CDS 1932 - 2726 231 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 2 Op 2 9/0.000 - CDS 2730 - 3656 912 ## COG4120 ABC-type uncharacterized transport system, permease component - Prom 3722 - 3781 4.3 - Term 3689 - 3725 4.9 5 2 Op 3 . - CDS 3832 - 4854 1392 ## COG2984 ABC-type uncharacterized transport system, periplasmic component - Prom 5040 - 5099 7.2 + Prom 5829 - 5888 7.3 6 3 Tu 1 . + CDS 5965 - 7056 708 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase - Term 7309 - 7354 8.0 7 4 Op 1 9/0.000 - CDS 7362 - 8726 1967 ## COG2848 Uncharacterized conserved protein 8 4 Op 2 . - CDS 8816 - 9088 457 ## COG3830 ACT domain-containing protein 9 4 Op 3 . - CDS 9105 - 9899 798 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 10 4 Op 4 4/0.000 - CDS 9886 - 10488 467 ## COG0237 Dephospho-CoA kinase 11 4 Op 5 . - CDS 10488 - 13241 2903 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 12 4 Op 6 . - CDS 13259 - 13606 239 ## gi|253578997|ref|ZP_04856268.1| conserved hypothetical protein - Prom 13657 - 13716 7.4 Predicted protein(s) >gi|226332952|gb|ACII01000067.1| GENE 1 37 - 519 522 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578986|ref|ZP_04856257.1| ## NR: gi|253578986|ref|ZP_04856257.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 12 160 1 149 149 209 100.0 6e-53 MLLGVYTLVRPMRTFLAIGWILGILLFVNGIELVILSLSKEKKEIGACILGVLEGLAGII LLFSGIQRFITDVMAAYMVGAIILIYGIFQIAAGTKVYKTSKGKGILSIVCGVLSVIVSI ISFTHPILTMISAGYMIAFSVLMQGINMIVLGINFGKTEN >gi|226332952|gb|ACII01000067.1| GENE 2 538 - 1812 1097 424 aa, chain + ## HITS:1 COG:no KEGG:Elen_2005 NR:ns ## KEGG: Elen_2005 # Name: not_defined # Def: chloride channel core # Organism: E.lenta # Pathway: not_defined # 8 393 2 389 417 253 41.0 1e-65 MNNTKSPKNVSLKNQLELWLFCALIGAVAGALVWILLKIMAVGTEFLWKWFPGKTSVPYY TILICVAGAAIIGIFRKIFGDYPEDLETVMEKVRVEKRYEYKNMLVMMVAALLPLLIGSS VGPEAGLTGIIVGLCYWAGDNLKFAKQNTRNYSQIGAAVSMSVLFHAPLFGIFEVEENSE EDLAALTKGSKLFIYGIALAAGTSIYAGLSALFGAGLSGFPSFDMVEIQRKDYLLMILYI LCGLILAYFYQATHKLTGNISNRFPAVVKEIFAGLCLGITGSFLPVLMFSGEDQMGTLMK TYTSYLPLALIAIAFFKLLLTNLCIQFGLKGGHFFPVIFAGVCMGYGMAMLTCGPDGGHV VFGAAIVTASLLGGIMKKPLAVTMLLFLCFPVKMFIWIFIAAAVGSKLITLKPEQEPSAN YSKQ >gi|226332952|gb|ACII01000067.1| GENE 3 1932 - 2726 231 264 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 225 1 218 245 93 31 7e-19 MLKIEHISKTFNPGTVNEKQAIRDLSLNLEKGDFATIIGSNGAGKSTLFNAICGDFFTDS GVIMLDGQDITFMPQHVRAKTIGRLYQDPMRGTAPGMTIEENLALAAGKGGWLSHVSRQE KERFHEELKLLDIGLEKRMTQPVGLLSGGQRQALTLLMATMNPPKLLLLDEHTAALDPGT AEKVLNLTKRIVEEHQLTCLMITHNMQSALDLGNRILMMDSGDIVLDIHEEEKKGLTVEG LLEKFKTGAGKMLDNDRILLSGEK >gi|226332952|gb|ACII01000067.1| GENE 4 2730 - 3656 912 308 aa, chain - ## HITS:1 COG:Cgl2197 KEGG:ns NR:ns ## COG: Cgl2197 COG4120 # Protein_GI_number: 19553447 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Corynebacterium glutamicum # 13 296 4 284 296 138 32.0 1e-32 MTIATILSLGETALKLGLICSLTVLALFLSYSMLNVCDLSTDGCFTLGAAVGAAVAVSGH PFLSIFAAMAAGICSGFVTAILQTKLGVDSLLAGIIVNTALYSVNIAVMGGSSLINMNRT ETVFSMMKETLKSTPLKGRGDIVVAFIAVAIVIVFLVFFLRTRLGLAIRATGNNADMVKS SSINPVFTTIVGLCVANSFTALSGCLLSQSQKSVDINIGQGMVTIALASLLIGGTILGRG 
GIFVRAVGMVLGSFIFRLVYTVALRFNMPAFMLKLVSSVIVVLAISGPYLKKQWPQIKRR MTHRKGVQ >gi|226332952|gb|ACII01000067.1| GENE 5 3832 - 4854 1392 340 aa, chain - ## HITS:1 COG:VC1101 KEGG:ns NR:ns ## COG: VC1101 COG2984 # Protein_GI_number: 15641114 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Vibrio cholerae # 4 312 5 309 321 147 31.0 3e-35 MKKRVLAIAATAMATMMMASPVMAQDFKIGICNYVDDASLNQIVDNIQSRLEEIGKEKDV NFDVSYDNCNADANVMEQIISDFQADNVDLMVGIATPVAMRMQSATEGTDTPVVFSAVSD PVGSGLVEDLDAPGANITGTSDYLDTASIMKLIQAVNPDVKKIGLLYDIGQDSSTTAIQE AKDYLDKEGIEYVERTGTTTDEVQLAADALVADGVDAVFTPTDNTIMTAELSIYEKFIDA GIPQYTGADSFALNGAFVGYGVDYANLGQKTADMIADILMNDADPATTSVETFDNGTATV NTETCEGLGLDLETVKKDFEPLCTQVNEIVTAESFDDVEK >gi|226332952|gb|ACII01000067.1| GENE 6 5965 - 7056 708 363 aa, chain + ## HITS:1 COG:FN0973 KEGG:ns NR:ns ## COG: FN0973 COG0079 # Protein_GI_number: 19704308 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Fusobacterium nucleatum # 19 363 9 353 357 237 36.0 2e-62 MSTAKMVFHGSDIEKICEVYQLKSEEIVKFGANVNPLGLSEHVKEQLAGRLDILSSYPDR DYTSLRSTISEYCNIPAEFILPGNGSSELIALLIQERNPKHTLILGPTYSEYSRELSFSG STQEYYHLREEDNFVLDVDDFCRTLDGKYDFLILCNPNNPTSSAISINDLRRIVSFCNER NIFVMIDETYVEFAPDINEITAVSLTREFTNLMILRGVSKFYAAPGMRLGYGITGNLEFL KKMKEKQVPWSLNSLGALAGELMLKDKNYIRQTRDLILSERTRLLKALENIPTYKTYPAY ANFLLLKIQKPGLTSRDVFDACIRQGLMIRDCSSFECLDGEYIRFCIMHPEDNTRLLNIL ESL >gi|226332952|gb|ACII01000067.1| GENE 7 7362 - 8726 1967 454 aa, chain - ## HITS:1 COG:lin0538 KEGG:ns NR:ns ## COG: lin0538 COG2848 # Protein_GI_number: 16799613 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 6 454 5 451 451 580 71.0 1e-165 MINIFEVNETNKMIEQENLDVRTITMGISLLDCIDSDLAVLNEKIYKKITTCAKDLVSTG EAISGEYGIPIVNKRISVTPIALIGGAACKTKEDFVTIAETLDKAAKVVGVNFIGGYSAL VSKGMTPAERLLIESIPQAMAKTERVCSSVNVGSTRTGINMDAVKLMGEIVLETAKATKD 
QDSLGCAKLVVFCNAPDDNPFMAGAFHGVTEADTIINVGVSGPGVVKTALEQVRGKDFET LCEMIKRTAFKVTRVGQLVAQEASRRLGVKFGIVDLSLAPTPAIGDSVAEILEEIGLEHA GAPGTTAALALLNDQVKKGGVMASTAVGGLSGAFIPVSEDQGMIDAVNAGALTLEKLEAM TCVCSVGLDMIAIPGDTKASTIAGIIADESAIGMVNQKTTAVRLIPVIGKGVGEVVEFGG LLGYAPIMPVNQFSCDAFVQRGGRIPAPIHSFKN >gi|226332952|gb|ACII01000067.1| GENE 8 8816 - 9088 457 90 aa, chain - ## HITS:1 COG:MA3235 KEGG:ns NR:ns ## COG: MA3235 COG3830 # Protein_GI_number: 20092051 # Func_class: T Signal transduction mechanisms # Function: ACT domain-containing protein # Organism: Methanosarcina acetivorans str.C2A # 5 90 7 92 92 84 50.0 4e-17 MKKTIITVVGNDTVGIIAKVCTYLADNNVNILDISQTIVQGYFNMMMVTDASKCEKDNGV LAKELEALGDKIGVVIRCQHEDIFNVMHRI >gi|226332952|gb|ACII01000067.1| GENE 9 9105 - 9899 798 264 aa, chain - ## HITS:1 COG:CAC3538 KEGG:ns NR:ns ## COG: CAC3538 COG1235 # Protein_GI_number: 15896774 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Clostridium acetobutylicum # 1 262 1 259 261 229 45.0 6e-60 MRLCSIASGSSGNCIYVGSDNTHLLVDVGISGKRVEQGLNSLELTGKDIDGILITHEHSD HIKGLGVIARKHQIPVYATEGTVEALSHMNLGKMPEGIFREIHEDEPFEINDLTVNPFTI PHDAAQPVGYRVECGEHSVGIATDLGKYNEYIVGNLQNLDALLLEANHDIRMLQVGKYPY YLKQRILGDRGHLSNENAGRLLCRLLHDNMKGIFLGHLSRENNYEELAYETVCSEVTLGD NPYKSRDFRIQVAQRDCISEVITV >gi|226332952|gb|ACII01000067.1| GENE 10 9886 - 10488 467 200 aa, chain - ## HITS:1 COG:BH3150 KEGG:ns NR:ns ## COG: BH3150 COG0237 # Protein_GI_number: 15615712 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Bacillus halodurans # 3 200 1 197 201 109 29.0 3e-24 MMITIGITGGVGAGKSTVLDFLEEKYQAYVMKADEIGHLVMEPGQSCYEPVIALFGRQII KNDKTIDRRQVSDVVFSHPELLEKLNQIIHPAVKQYIREQLAVKKQQEQKICVVEAALLL EDHYQEFCDTIWYIHTDEEIRIRRLMENRGYTREKSVSIIASQAPETFFRENADYVVVNN GDFAQTRRQIEEGIRKYETL >gi|226332952|gb|ACII01000067.1| GENE 11 10488 - 13241 2903 917 aa, chain - ## HITS:1 COG:BH3153_2 KEGG:ns NR:ns ## COG: BH3153_2 COG0749 # Protein_GI_number: 
15615715 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Bacillus halodurans # 430 917 83 569 569 467 51.0 1e-131 MSEKILLIDGHSILNRAFYGLPDLTNSEGKHTGAVYGFLNILFRTIEEEKPQYLTVAFDL KAPTFRHKMFEAYKGTRKPMPEELREQVPLIKEMLTAMGVNVVTKEGYEADDILGTLARK SEAAGMDATILSGDRDLLQLATDKVMIRLPKTVRGKTTIEDYHAQQVIEKYQVTPPQIIE LKALMGDSADNIPGIPGVGEKTATKIIVEYGSIENAHEHLEELKPNRARESMREHYDMAQ MSKALATICTDSPIEFSYEKAKLGNLYTKEAFLLCRQLEFKNLLNRFDSAAVQEDTLEQE FFTCADLAGCEALFAKAEAGKTAGVSLVTENGRVFGAGLALNEEEIYYIPVEGMITEGYL CGKLEGLLHKVSESNTENIMKSNTDDVKKDPENEISDVNTDGTLKYDKKCVCALDVKALL KHIKSDDPMAVFDAGVAAYLLNPLKSSYTYDDMAKEYLNGRILPAREELLGKKTVEKAWE ESAEGLAAWACYMAYTALACRKPMCEALRDTGMWNVYTQIELPLIFTLDSMEKWGISVKS EELKSYGEKLSVRIGELEKQIWQQAGEEFNINSPKQMGVILFEKLGLKGGKKTKTGYSTA ADILEKLAPEYPIVKDILEYRQLAKLKSTYADGLVGEIAEDGRIHSTFNQTITATGRISS TEPNLQNIPVRMELGRLIRKVFVPEEGYVFLDADYSQIELRVLAHMSGDEKLIQAYKEAQ DIHRHTASEVFHVPFDEVTDLQRRNAKAVNFGIVYGISSFGLSQDLSITRKEAAEYIERY FETYPKIKGFLDGLVEEGKEKGYVTTMFGRRRPIPELKSGNFMQRSFGERVAMNSPIQGT AADIIKIAMNRVYRRLKDENLKSRLVLQVHDELLIETCKEEIPQVSAILEEEMKGAAKLS VELEVDMHQGNNWYEAK >gi|226332952|gb|ACII01000067.1| GENE 12 13259 - 13606 239 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253578997|ref|ZP_04856268.1| ## NR: gi|253578997|ref|ZP_04856268.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 115 1 115 115 202 100.0 7e-51 MDCQTAEGMVSRYIKHDLPLNELEEFLDHVQNCSSCYDELETYFIVHEVTHQLDDDSSDS VLDFKKLLEQDIRKSRRYIRKKKVSWLMFGVSICLLIATIAAILIFVMMETNYIL Prediction of potential genes in microbial genomes Time: Sat May 28 19:50:56 2011 Seq name: gi|226332951|gb|ACII01000068.1| Ruminococcus sp. 5_1_39B_FAA cont1.68, whole genome shotgun sequence Length of sequence - 3503 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1928 1253 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 - Prom 2009 - 2068 11.2 - TRNA 2100 - 2182 62.2 # Leu CAA 0 0 - Term 2046 - 2088 3.2 2 2 Tu 1 . - CDS 2247 - 3311 890 ## COG3291 FOG: PKD repeat - Prom 3441 - 3500 10.2 Predicted protein(s) >gi|226332951|gb|ACII01000068.1| GENE 1 3 - 1928 1253 642 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. McKiel] # 11 637 6 624 636 487 44 1e-138 MNLDNKKPGRKPLYVYYIIAAVIIILLNVLLLPAIEGRKVQTTDYSTFLTGVEKDYVKTV EIQDDYIYYVAESDGQEGYFRTVKMNDPDLVDRLHDANVKFGAAAPQQTSIWVSLIAGYV LPIAIFILLGRWLSKKMMSSMGGGPGGAMSFGKSNAKVYVKSSTGIKFSDVAGEDEAKDL LTEIVDYLHNPQKYREIGASMPKGALLVGPPGTGKTLLAKAVAGEAEVPFFSISGSEFVE MFVGMGAAKVRDLFKQANEKAPCIVFIDEIDTIGKKRDGAGFTGGNDEREQTLNQLLTEM DGFDGSKGVVILAATNRPDSLDPALLRPGRFDRRIPVELPDLKGREEILKVHAKKIKIAD SVRFDEIAKAAAGASGAELANIVNEAALRAVRDGRKFATQADFEESIEVVIAGYQKKNRV LSNKEKLIVAYHEIGHALVAAKQTESAPVHKITIIPRTSGALGYTMQVDDGDHYLMTKEE LANKIATFTGGRAAEELIFHSITTGASNDIEQATKLARAMISRYGMSEDFDMVAMENVTN QYLGGDSSLSCSFETQTLLDKKVVELVRMEHQKALKILQDNIGKLHELAKYLYEHETITG EEFMKILNAPVQVPTAVAESESNTKSENNAESENSTEADADT >gi|226332951|gb|ACII01000068.1| GENE 2 2247 - 3311 890 354 aa, chain - ## HITS:1 COG:MA4285_2 KEGG:ns NR:ns ## COG: MA4285_2 COG3291 # Protein_GI_number: 20093074 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 190 311 694 811 1325 66 34.0 9e-11 MKKIAAMVLAGVLAAVPVSTTGAAEFQDGADTQNFILAAGEETSTDTDESKGISANDSVV EGLDKPLVFYPNTYYKFNVTGAGTQNTNPVEGDARWTPLYWSLSLKPQESDIHRKWEIGS AKGIYTKVERAYNIYIFFQREEYTGGIWEKNDIVQPVRYQFNAAPLTEQDGSYKYLIGGI GYKILNEREVSVTGLAAEYNVIQIPATVVINDKVYKVTTIDKNAFSGNKEITDVIFGNNV ITIGKYAFSQCPNLRNIRFGSRVKRIGSNAFAQCTKLRNFILPASVRHIDARAFYQCPAV KVIRINSTALNYVGKKAFAVNKTVTIRLPEKLFARYQKLIKASSVYSKTRFVKY Prediction of potential genes in microbial genomes Time: Sat May 28 19:51:03 2011 Seq name: gi|226332950|gb|ACII01000069.1| Ruminococcus sp. 
5_1_39B_FAA cont1.69, whole genome shotgun sequence Length of sequence - 22729 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 9, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1494 841 ## COG0584 Glycerophosphoryl diester phosphodiesterase 2 1 Op 2 . - CDS 1499 - 1714 106 ## gi|253579001|ref|ZP_04856272.1| predicted protein - Prom 1829 - 1888 3.8 - Term 1769 - 1825 6.5 3 2 Op 1 . - CDS 1898 - 3178 1616 ## COG0019 Diaminopimelate decarboxylase 4 2 Op 2 . - CDS 3196 - 4074 498 ## Cphy_2901 hypothetical protein 5 3 Tu 1 . - CDS 4189 - 5130 934 ## COG0598 Mg2+ and Co2+ transporters - Prom 5348 - 5407 7.4 + Prom 5307 - 5366 9.0 6 4 Tu 1 . + CDS 5546 - 6640 1087 ## COG0082 Chorismate synthase + Term 6769 - 6803 -0.4 7 5 Tu 1 . - CDS 6752 - 7819 1095 ## COG2706 3-carboxymuconate cyclase - Prom 7936 - 7995 5.8 - Term 8028 - 8080 -0.8 8 6 Op 1 7/0.000 - CDS 8114 - 9391 1595 ## COG0001 Glutamate-1-semialdehyde aminotransferase 9 6 Op 2 2/0.000 - CDS 9391 - 10374 1196 ## COG0113 Delta-aminolevulinic acid dehydratase 10 6 Op 3 6/0.000 - CDS 10435 - 11940 1607 ## COG0007 Uroporphyrinogen-III methylase 11 6 Op 4 1/0.000 - CDS 11989 - 12894 1038 ## COG0181 Porphobilinogen deaminase 12 6 Op 5 4/0.000 - CDS 12887 - 13384 376 ## COG1648 Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) 13 6 Op 6 . - CDS 13374 - 14603 1329 ## COG0373 Glutamyl-tRNA reductase - Prom 14722 - 14781 5.6 - Term 14912 - 14962 6.7 14 7 Tu 1 . - CDS 14977 - 16677 1589 ## EUBREC_0955 hypothetical protein - Prom 16847 - 16906 4.6 + Prom 16691 - 16750 6.3 15 8 Tu 1 . + CDS 16885 - 17700 584 ## gi|253579015|ref|ZP_04856286.1| conserved hypothetical protein + Term 17893 - 17927 3.2 - Term 17695 - 17755 14.0 16 9 Op 1 . 
- CDS 17765 - 17944 105 ## gi|253579016|ref|ZP_04856287.1| predicted protein 17 9 Op 2 16/0.000 - CDS 17990 - 18694 801 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 18 9 Op 3 . - CDS 18687 - 20252 1021 ## COG2205 Osmosensitive K+ channel histidine kinase 19 9 Op 4 17/0.000 - CDS 20325 - 20972 644 ## COG0569 K+ transport systems, NAD-binding component 20 9 Op 5 . - CDS 20992 - 22329 683 ## COG0168 Trk-type K+ transport systems, membrane components Predicted protein(s) >gi|226332950|gb|ACII01000069.1| GENE 1 3 - 1494 841 497 aa, chain - ## HITS:1 COG:lin0625_2 KEGG:ns NR:ns ## COG: lin0625_2 COG0584 # Protein_GI_number: 16799700 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Listeria innocua # 262 478 1 212 243 75 26.0 3e-13 MGDKKKEYRDKRTGMQTVRSVIFAIVLIITSFTSIISVADNTYAADEYRVHKSEVVVKTG ATYTVKILNHDTKVSPKKFKWTSSNSKCVKVINGRIYGLKPGQATITAQMSGLKVNCEVF VYNKTETVLFKKYKKQVKVTAGKTIILEPQKYGKRLTYTSSDKTVATVSKKGKVTAKKAG NVKITSVSYGTDRYVSEIEVIVLPAASETPEITPTLTPDEPAPSVTPEPTVTVTPTPKIT PAPEDEEKFRKPLDGVTHYILHRGEQTEAPENSVPAFEMAGRNGAEFVETDVRETADGVL VVSHDDSLLRMCGEDRLISEMTYEEIKQYPIINGRNASQYPDNLIPTLEQYIACCNKYSV TPVIEIKSIRTEEAMNLFMQLLTESQKEPVLICFRIETLGKLREMGFTGKMQWIRTVRMN ASMIQQCKKYDLDISAEYKNISMNDINNAHQNGIRISVWLCRNEDMVDIFRKMGADYITY EKWNTDEVKMRYYSHST >gi|226332950|gb|ACII01000069.1| GENE 2 1499 - 1714 106 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579001|ref|ZP_04856272.1| ## NR: gi|253579001|ref|ZP_04856272.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 71 26 96 96 145 100.0 5e-34 MIKYEYETGMCKQLHYNGLWSVQYEGVPGHFKKVKMVCPCIRDECDQNCEVFKNIPEIKA ADQEWHMRDER >gi|226332950|gb|ACII01000069.1| GENE 3 1898 - 3178 1616 426 aa, chain - ## HITS:1 COG:SP1978 KEGG:ns NR:ns ## COG: SP1978 COG0019 # Protein_GI_number: 15901801 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Streptococcus pneumoniae TIGR4 # 3 419 2 416 416 561 63.0 1e-160 MKKVPFVTLDKLQEITAQFPTPFHLYDEKGIRENAKAVKEAFAWNKGFKEFFAVKATPNP FLIQILQEYGCGCDCSSMTELMMSKAIGCKEGDIMFSSNDTPEEEFKYANEIGGIINLDD ITHIESVERAVGYIPKVISCRYNPGGVFKISNDIMDNPGDAKYGMTTEQIFEAFKILKEK GAEEFGIHAFLASNTVTNEYYPMLAKEMFEMAVKLQKETGAHVKFINLSGGVGIAYKPDQ TPNDIREIGEGVRKVYEEVLVPAGMGDVAIYTEMGRFMMGPYGCLVTKAIHEKHTHKEYI GVDACAVNLMRPAMYGAYHHITVMGKEDQPCDHMYDVTGSLCENNDKFAIDRYLPKVDMG DLLVIHDTGAHGFAMGYNYNGKLKSSEILLHEDGSFEMIRRAETPKDYFATFDCFPIYEK LLEDME >gi|226332950|gb|ACII01000069.1| GENE 4 3196 - 4074 498 292 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2901 NR:ns ## KEGG: Cphy_2901 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 290 1 287 297 251 44.0 3e-65 MKFIDKLERKFGNRGIENLTIYIIVSYVLGYALMYINPGALSMLSLNVSEILHGQIWRLV TWIIYPPSTSSALWFVIAILFFYYPISASLERTWGSFRFTVYILSGMIFTVISAFILYFI TGGVLDAYLNGSQFSTYYISLSIFLAYALTYPDMKVLLYFVIPIKMKWMAIVYAALVVYD IVRYFMGGAWFMALPIIASLLNFIIFFLGTRNLNRYNPKEIHRRNQFKRAMGESKTVPFP GGSKSEEVTKHKCAVCGRTEKDDPNLEFRFCSKCNGNYEYCQDHLYTHIHKK >gi|226332950|gb|ACII01000069.1| GENE 5 4189 - 5130 934 313 aa, chain - ## HITS:1 COG:SP0185 KEGG:ns NR:ns ## COG: SP0185 COG0598 # Protein_GI_number: 15900122 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Streptococcus pneumoniae TIGR4 # 1 313 1 314 314 291 54.0 1e-78 MVRIFKTIDGAIHEIQEPQEGCWIALTNPTATEIFEISEQFGIEVDDLRAPLDEEERSRI EVEDNYTLILADVPAIEERNEKDWYVTIPLGIIVTQKMIFTVCLEDTQVLTRFMEGRVRN FFTYMKTRFILQILYRNASMYLRYLRIIDKKSEQIEEKLHLSTRNQELMELLELEKSLVY 
FTTSLRSNEMVLEKLLKVESIKKYPEDTELLEDVIIENKQAIEMANIYSGILSSMMGTFA SVISNNLNIVMKVLAVITIVMSIPTIVFSAYGMNLSAAGMPFSRTPWGFLIVIILSVVAS IIAAIFLSRKKFF >gi|226332950|gb|ACII01000069.1| GENE 6 5546 - 6640 1087 364 aa, chain + ## HITS:1 COG:MA0550 KEGG:ns NR:ns ## COG: MA0550 COG0082 # Protein_GI_number: 20089439 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Methanosarcina acetivorans str.C2A # 1 348 1 353 365 341 51.0 2e-93 MSQSSFGNKFKVTTWGESHGKALGAVIDGCPAGLPLCEEDIQKFLDRRKPGQSRYTTARK EGDLVEILSGVFEGKTTGTPISLMVRNTDQRSRDYGNIAYSYRPGHADYTFDQKYGFRDY RGGGRSSGRETIGRVAAGAIASKILEELGISICTYTRSIGPVEIASFNKEEIHQNAFYMP DAQAAVKAGEYLEECMKNQDSAGGVIECRITGTPAGLGEPVFDKLSALLAHALMSIGAVK AVEIGDGIAVTSSNGSTDNDGFTVKDGEIIKTSNHAGGIMGGISDGSEIILRAHIKPTPS ISQPQQTVTKDKEPLSLEIHGRHDPVIVPRAVVVVESMAAITLADALFVNMSSQMDKVRD FYRK >gi|226332950|gb|ACII01000069.1| GENE 7 6752 - 7819 1095 355 aa, chain - ## HITS:1 COG:BS_ykgB KEGG:ns NR:ns ## COG: BS_ykgB COG2706 # Protein_GI_number: 16078366 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Bacillus subtilis # 5 348 3 344 349 171 32.0 2e-42 MSEKKYVAYVGSYTHGSSRQGIHVYDVDVENGTLTERSSVEVSNASHMAVSKNGKYLYSI EDEGVAVFKRDKNGDLSRINSVNIDGMRGCFLSTDVDGKYLYVAGYHDGKVTVVHTHKDG RLGSLMDGVFHTGLGSVAERNFRPHVNCVRPTPDNKYLCAVDNGIDQVKIYRVNKLRNKL ELVDILRCPRESGPRIIRFSDDGKFAYILFELSNEIRAYKYDGSGKVPAFELIQTIETSA KKDVHDTHNAASGLSLANDGKHLFCTTAGEDTVSMFERDEETGMLTRKFTLPISGEYPKD LVLLPDDKHLVLANHASNTLTTFTVDYEKNIIVMNGKPIKITEPNCIRIWPVPEE >gi|226332950|gb|ACII01000069.1| GENE 8 8114 - 9391 1595 425 aa, chain - ## HITS:1 COG:PM0462 KEGG:ns NR:ns ## COG: PM0462 COG0001 # Protein_GI_number: 15602327 # Func_class: H Coenzyme transport and metabolism # Function: Glutamate-1-semialdehyde aminotransferase # Organism: Pasteurella multocida # 1 425 1 425 427 477 52.0 1e-134 MGLSEELFDRAVKVIPGGVNSPVRAYGAIGIAPRFIDRADGCHIYDVDGKEYVDYIDSWG PMILGHNFPEVKESVLKACEKGLSFGCATAIEVEMAEFICDHIPHVDMVRMVNSGTEAVM 
SAVRVARGFTGKNKVIKFAGCYHGHSDAMLVSAGSGVMTSGVPDSAGVPKGCTEDTMTAV YNDLDSVRALMEQADGQTAAVIVEAVGANMGVVPPKKGFLEGLRKLCDEYGALLIFDEVI TGFRLAFGGAAEYFGVTPDLVTYGKIIGAGMPVGAYGGRREIMELVSPVGKVYQAGTLSG NPIAMAAGLTQLKYLYEHQGIYKDLEEKGKRLYGGMEKILAEKNLPYHINYVSSLGSLFF TEQEVVDYTSAKSSDTKAFSEYFKGMLAQGIHMAPSQFEAMFLSVAHTDEIIDQTLEAVR NYFTK >gi|226332950|gb|ACII01000069.1| GENE 9 9391 - 10374 1196 327 aa, chain - ## HITS:1 COG:FN0460 KEGG:ns NR:ns ## COG: FN0460 COG0113 # Protein_GI_number: 19703795 # Func_class: H Coenzyme transport and metabolism # Function: Delta-aminolevulinic acid dehydratase # Organism: Fusobacterium nucleatum # 6 319 4 318 322 394 60.0 1e-109 MDLIQRPRRLRGSENLRKMVRETRMDKSSLIYPLFVKEGTGIEEEIPSMEGQFRYSVDRL PFELERLQNAGVNSIMLFGIPDHKDEVGSGAYDPNGIVQKALREAKKQFPDMYYITDVCM CEYTSHGHCGVLCGHDVNNDATLELLAKTAVSHVEAGADMVAPSDMMDGRVRAIREALDA NGHYGAPIMSYAVKYASAFYGPFRDAAGSAPSFGDRKSYQMDFHNRREGMKEALTDVEEG ADIIMVKPAMSYLDMVSEVSKAVNVPVATYSVSGEYAMVKAAAKMGWIDEERIMCEMAVS AYRAGAQIYLTYYAKELAKCMDEGRIG >gi|226332950|gb|ACII01000069.1| GENE 10 10435 - 11940 1607 501 aa, chain - ## HITS:1 COG:lin1164_1 KEGG:ns NR:ns ## COG: lin1164_1 COG0007 # Protein_GI_number: 16800233 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III methylase # Organism: Listeria innocua # 4 251 2 248 252 241 50.0 3e-63 MAAGKVWLVGAGPGDIGLFTLKGAAVLQQADVVVYDSLVGEGVLAKIPEHARLINVGKRA NHHTMVQEDINKVLLEEAEKGNKVVRLKGGDPFLFGRGGEELELLSENGIPYEIIPGVTS PISVPAYNGIPVTHRDFCSSLHIITGHKRAGQEYDIDFKALTRTKGTLVFLMGIAALEDI CKGLLAGGMDPDMPAAVLQKGTTAGQKRVVATVGTLKEEVDRQGIETPAIIVVGKVCSLA DKFAWYEKLPLAGWKVLVTRPRQHISKTADLLRQKGAEVLELPSICTVPVEDNGRLYEAF EKLDTYQWIIFTSPAGVEIFFDEMDRKEMDVRSLGQAKIAVIGEGTKKKLKEHHLLADFV PDVYDGDTLGAELAKELQGNEKILIPRAEAGNKKLTELLEQTGAQVDDIATYTTRYEKSR LIDEKKELETGSVDCVVFTSASTVKGFVEGTKGLDYTKVKAACIGKQTKAAADAYGMQTR MAKKATIESLIELVEEMKQEN >gi|226332950|gb|ACII01000069.1| GENE 11 11989 - 12894 1038 301 aa, chain - ## HITS:1 COG:YPO3849 KEGG:ns NR:ns ## COG: YPO3849 COG0181 # Protein_GI_number: 16123984 # Func_class: H 
Coenzyme transport and metabolism # Function: Porphobilinogen deaminase # Organism: Yersinia pestis # 6 296 8 296 313 212 42.0 6e-55 MTEVIIGSRESKLAVLQSEMVKSYIEQKNREENAGSEITVNILTMKTTGDIILDRTLDKV GGKGLFVKELDRALLDGKSNLSVHSLKDMPMEVPKELPLLAFSKREDPRDVLVLPEGVAE LDPDKPLGCSSLRRTLQLKKLYPEMEVKSIRGNLQTRLRKLDEGEYSGLILAAAGLKRLG LENRISRYFTPDEMIPSAGQGILAVQGRKGEDYGYLDGYCDRDAWLAGTAERAYVKYLDG GCSSPVAAFAEVDGDEIFIRGLYYSEATGKWLTGQIRGAAEDGEKLGIALAKQLKSDCEG R >gi|226332950|gb|ACII01000069.1| GENE 12 12887 - 13384 376 165 aa, chain - ## HITS:1 COG:aq_1237 KEGG:ns NR:ns ## COG: aq_1237 COG1648 # Protein_GI_number: 15606466 # Func_class: H Coenzyme transport and metabolism # Function: Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) # Organism: Aquifex aeolicus # 6 162 4 160 187 121 38.0 6e-28 MKNKPYFPMFIDLSDKNIVVVGGGNIATRRVKTLLSFTRNIRVIAPKVTMEMMELSKAGY VELITRPVKRTDFAMAYMVIAATNDRKLNDEIHRICRQEGVYVNVASDREQCDFYFPGIY MEGGLVVGITASGLDHKKARRIREEIQNALERVAQEDDNEEENDD >gi|226332950|gb|ACII01000069.1| GENE 13 13374 - 14603 1329 409 aa, chain - ## HITS:1 COG:VC2180 KEGG:ns NR:ns ## COG: VC2180 COG0373 # Protein_GI_number: 15642179 # Func_class: H Coenzyme transport and metabolism # Function: Glutamyl-tRNA reductase # Organism: Vibrio cholerae # 1 384 1 387 419 167 30.0 4e-41 MSISMIGIDHNMAPVDIRAKFAFTKKNAGEAMEKIKNQNGIYGCVILSTCNRLEVWASVD DEEEVCLYDCLCRIKGITEDSYRKYFVERKDQEAVEHLFYLTSGLKSQIIGEDQILTQVK DALNLARENFAADGVLEVLFRMAATAGKRIKTEVPFSHGNPSVIHQAIGFLAEKGYSLRD KICMVIGNGEMGKVAAQALQEAGADVTVTIRQYRSGVVNIPLGCKRINYGDRMEYMPQCD LVVSATASPNFTLKEELFQGIRTKDDLILIDLAVPRDIEPSVGKIQGVTLYDMDSFRIEE VPAELQENLEAAGAIVREQMDEFFQWLDGRDLIPRIQDIKDEAVNDLNLRIAKILKKTPM EEDDRQNLVHAVDTAAGKVVNKLIFGLRDSLNQEIFLECVAGLEKIYEE >gi|226332950|gb|ACII01000069.1| GENE 14 14977 - 16677 1589 566 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0955 NR:ns ## KEGG: EUBREC_0955 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 6 561 5 562 564 592 52.0 1e-167 
MAGLKKLPIGIENFEEMRREDFYYVDKSHVIEQLLTQWGKVNLFTRPRRFGKSLNMSMLQ SFFEIGKDKTLFDGLRISDNQELCEKYQGKFPVVSVSLKGINGATYEEARRFLIKTINEE ARRLSVLSDSTELDETDHELLTQLKKKEMTNDSLVYSIRELTELLEKHYDRKVIVLIDEY DVPLAKANENGYYDEMVLLIRNLFENALKTNSSLKFAVLTGCLRIAKESIFTGLNNFKDY SITDKSFDETFGFTDAEVRELLRYYGQEKYYETVKEWYDGYRFGNVDVYCPWDVINFCSD HLADPGLEPKNYWANTSGNSVISHFIDSVGKPQKLTRMELEQLVNGGIVQKEINSELTYK ELYSSIDNLWSTLFMTGYLTQRGEPSGNRYNLVIPNREIRNIITNHILKMFKENVKDDGK TVSDLCDALLNKNPEKVELIFTEYMKKTISIRDTFARKPTKENFYHGLLLGILGFKENWS VMSNRESGDGFGDILIRIEDEDVGIVIEVKYADDGNLQGECEKALQQIIDIRYTEALEQE GIHTIIKYGIACYRKKCKVLMRIDKQ >gi|226332950|gb|ACII01000069.1| GENE 15 16885 - 17700 584 271 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579015|ref|ZP_04856286.1| ## NR: gi|253579015|ref|ZP_04856286.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 26 258 1 233 246 436 100.0 1e-120 MKILKKAGGILFVIIGIFFFVSALKMIFVDNPKTKAALKDAVYVDAADTIDPENDGKTVI VCGTFELTEPAHDDELELDFDSIRISSSKQTMKLTKSSSKKKEAMTDDEKKYGVLEWNSS FSSMPVSGQGKIGNYALSQDFIDDIMLTKTWEDYDKAALSSAGYTYVPDNTYTQKHFIEP SNQTTRSHKEYDVRYYYSAADFETGQTVTAIGIQDGQTLKSAPGITENLMKNKLDRDEAL KQGGTPGVGSQIFSVVFSLLLILGGFLLIIL >gi|226332950|gb|ACII01000069.1| GENE 16 17765 - 17944 105 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579016|ref|ZP_04856287.1| ## NR: gi|253579016|ref|ZP_04856287.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 59 1 59 59 117 100.0 2e-25 MRIRVCRSKNAGVAGYAEAFRPVHSDLQTSEWYSYFENDVLDTEYVTEDIAQPSDLMIQ >gi|226332950|gb|ACII01000069.1| GENE 17 17990 - 18694 801 234 aa, chain - ## HITS:1 COG:pli0051 KEGG:ns NR:ns ## COG: pli0051 COG0745 # Protein_GI_number: 18450333 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 1 232 1 231 240 275 60.0 4e-74 MNKTLILVVEDDRPIRNLIVTTLKTHDYKYLAAENGSSAILEASSHNPDIVLLDLGLPDM EGVEVIKKIRTWSNMPIIVISARSEDTDKIEALDAGADDYITKPFSVEELLARIRVTQRR LAVMQSGEQLESSVFENGGLKIDYTAGCTYLKGEELHLTPIEYKLLCLLSRNVGKVLTHT YITQQIWGRCSDNDVASLRVFMATLRKKLEPEKNGVQYIQTHIGIGYRMLRIEK >gi|226332950|gb|ACII01000069.1| GENE 18 18687 - 20252 1021 521 aa, chain - ## HITS:1 COG:pli0050 KEGG:ns NR:ns ## COG: pli0050 COG2205 # Protein_GI_number: 18450332 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 21 521 384 888 888 436 44.0 1e-122 MSLKNVNHQKRSFTWSMEKPSAKDYFLTVFIFAVCTLIGLLFQKLNFTDTNIVTIYILGI LLTSIVTDGYLCSVAGSFLSVFLFCFFLTEPRMSFKTYAVGYPVTFFIMLISSVLTGALA AKLKTHAKLSTQLAFRTQILFDTDRLLQKAKGETEILDVTCTQLLRLLNRNITAYVVEDG TLSEGKLFSGEKEDTEDFLIPEEQQIARWVCENRQHAGASTHHFPQAKCLYLAIRSGDNV YGVIGIPLQKETLDSFEYSILLSVINECALAMENAKNALEKEKNAVMAKNEQLRADLLRA ISHDLRTPLCSISGNADMLLGNSDRLDEATKHQIYSDIYDDSEWLIGVVENLLSITRLND GRLKFKFTDQLLDEVIAESLRHISRKHDDYKIVTDCEELVLARMDVRLIMQVLVNLVDNA IKYTPPGSVIFIRGTKTDGKAQISVEDNGPGIPEEMKTHIFEMFYTGKTTVADSQRSLGL GLALCHSIIEAHEGTLVLTDHNPHGCNFTFTLPLSEVTLNE >gi|226332950|gb|ACII01000069.1| GENE 19 20325 - 20972 644 215 aa, chain - ## HITS:1 COG:SA0939 KEGG:ns NR:ns ## COG: SA0939 COG0569 # Protein_GI_number: 15926674 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Staphylococcus aureus N315 # 2 211 3 212 220 148 38.0 6e-36 
MKNVLIIGLGRFGRHIAMQLNQLGHEIMAVDWKEERVDKVLPFVTNAQIGDSTNAEFLQS LGIGNYDICFVTIGGSFQNSLETTSLLKELGAKLVISRAERDVQEKFLLRNGADKVVYPE KQVAKWASIRYTDDHILDYMEVDASHAIFEVEVPEEWTGKTVGGLDIRKRYNINILAVKN EGEFNIAISPDTYLEENSKLLVLGEYRALQKCFRI >gi|226332950|gb|ACII01000069.1| GENE 20 20992 - 22329 683 445 aa, chain - ## HITS:1 COG:BS_yubG KEGG:ns NR:ns ## COG: BS_yubG COG0168 # Protein_GI_number: 16080162 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus subtilis # 3 445 6 445 445 254 37.0 2e-67 MPEKMCKKKHLTSFQLIILGFAGVILLGSILLMLPVSSLEKMPTPFHEALFTATSAVCVT GLVVKDSGSYWSVFGQTVILALIQIGGFGVVTVAAAVSLLSGKKISLMQRSTMQDAISAP KVGGIVRLTRFILKGTLLIEAAGAMLLLPVFVSDYGLKGIWMAAFHSVSAFCNAGFDILG TADNAFPSLTGYSGNILINVVIMLLIITGGIGFLTWDDIYRNKMNFKRYRMQSKIILMTT VCLIVFPAFFFFACDLKNLPAGERLLAAMFQSVTTRTAGFNTIDISEMSEASKAVMILLM LIGGSPGSTAGGMKTTTFTVLILNAIATFRSQENAGAFGRRLEYHVIKNAATIAMLYFTL FFCGGVAISVYEGLPLLDCLYEAASAVGTVGLTLGVTPGLHVFSQVVLIILMYLGRVGGL TLIYAVLSGRNKGNAKLPLEKITVG Prediction of potential genes in microbial genomes Time: Sat May 28 19:51:40 2011 Seq name: gi|226332949|gb|ACII01000070.1| Ruminococcus sp. 5_1_39B_FAA cont1.70, whole genome shotgun sequence Length of sequence - 17022 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 10, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 307 100 ## gi|253579021|ref|ZP_04856292.1| conserved hypothetical protein - Prom 464 - 523 3.0 2 2 Op 1 . - CDS 542 - 3040 1375 ## Ccel_0558 protein of unknown function DUF214 3 2 Op 2 . - CDS 3033 - 3734 172 ## PROTEIN SUPPORTED gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 - Prom 3791 - 3850 5.4 - Term 3803 - 3855 1.3 4 3 Tu 1 . - CDS 3961 - 5571 644 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 5610 - 5669 3.7 5 4 Op 1 . - CDS 5683 - 5871 145 ## EUBREC_3579 hypothetical protein 6 4 Op 2 . 
- CDS 5868 - 6725 543 ## COG1484 DNA replication protein 7 4 Op 3 . - CDS 6722 - 7471 481 ## CDR20291_1762 phage protein - Prom 7515 - 7574 2.2 + Prom 7402 - 7461 3.1 8 5 Tu 1 . + CDS 7515 - 7697 76 ## gi|160894293|ref|ZP_02075070.1| hypothetical protein CLOL250_01846 + Term 7779 - 7827 2.0 9 6 Tu 1 . - CDS 7841 - 9457 685 ## EUBREC_3583 hypothetical protein - Prom 9533 - 9592 3.3 + Prom 9326 - 9385 3.4 10 7 Tu 1 . + CDS 9444 - 9863 158 ## gi|294641691|ref|ZP_06719603.1| hypothetical protein CUS_1608 + Term 9915 - 9941 -1.0 - Term 10307 - 10347 4.2 11 8 Op 1 . - CDS 10384 - 10788 177 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 12 8 Op 2 . - CDS 10819 - 11031 250 ## gi|253579032|ref|ZP_04856303.1| conserved hypothetical protein 13 8 Op 3 . - CDS 11018 - 11461 62 ## CDR20291_1773 hypothetical protein 14 8 Op 4 3/0.000 - CDS 11516 - 13189 332 ## COG4219 Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component 15 8 Op 5 . - CDS 13200 - 13559 346 ## COG3682 Predicted transcriptional regulator - Prom 13679 - 13738 3.1 16 9 Op 1 40/0.000 - CDS 13962 - 14891 513 ## COG0642 Signal transduction histidine kinase 17 9 Op 2 . - CDS 14879 - 15547 369 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 15748 - 15807 8.6 + Prom 16015 - 16074 8.0 18 10 Tu 1 . + CDS 16262 - 16804 447 ## COG4636 Uncharacterized protein conserved in cyanobacteria + Term 16864 - 16914 6.1 Predicted protein(s) >gi|226332949|gb|ACII01000070.1| GENE 1 1 - 307 100 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579021|ref|ZP_04856292.1| ## NR: gi|253579021|ref|ZP_04856292.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 102 1 102 102 169 100.0 4e-41 MKEMIKKYKGTLICSVLVMLAGILVGFTMAQSIWINVFFVVTDCILVTIIFYDNRNRQQS SKVIGMVIWIIPVTTLIYNGMARLISMDADSENLFMAVTYFG >gi|226332949|gb|ACII01000070.1| GENE 2 542 - 3040 1375 832 aa, chain - ## HITS:1 COG:no KEGG:Ccel_0558 NR:ns ## KEGG: Ccel_0558 # Name: not_defined # Def: protein of unknown function DUF214 # Organism: C.cellulolyticum # Pathway: not_defined # 1 832 2 830 831 315 27.0 4e-84 MLKNPNQKVIKRMARNALKVNRRKTITLFLAVLLSSFLVFTIFTVGDSYFRLQKIQNIRM SGAEFDAIMYGVTDEQRQMCENNPNIALTGTVGVCGWVEKTNQDSTPDVGLIWADDGYWT QMMEPVREKLEGRYPIALDEIMVTKSALKECGYEDLDVGDTLAMSYGTHEGIFTGTFRIC GIWDGYGPKKQFYVSKEFYDQSGWKLSQAASGRYFMDFKQKIMTKTEQNAFIESMNLGKQ QNLFFMEDLGASVQILAGLIGLIAVTCLCAYLLIYNIMYLSVAGKVRYYGLLQTVGMTEK QIKRMMKEQMLLIGSAGTVLGCLSGGMVSFFLIPVVVKSLGIKSGYVGADMVRFHPAVLL ATILLVGVTIFLASQKPTKMAADISPIEALGYRPTHKTVKERKAGNGKVIARLSMEQFTK DKKRTAVVLLSLAASLSVYLCIVTMLDSQAARTIVSNYMDTDMVIKNDTACKEKSEDRRD ILDDSFVKSIKENAGVSEVHSMIFAEITVPWEPDFAEMWMKEFYAKWMSIPYSKEKHEYQ NHPENFGSSLIGIDEQEFDYLNNSLTHPIKKEDFLSGKVCIVYRNGLDLSDADIIGKNVT CALYEDQQTTKSFTIEGVTDENYYTALLGYPPTIIASDQVVKTFADHPITLKTSIKYHKE YDRDTEQEILSLLEESDNAKDFSWESKIEDADEIEKAQGNMPQLGIGIVLILAFIGIMNY MNTFVVNVQSRMTELSVMESIGMTPKQLLGMLVREGVLYAGGAWLVTLTVGMGVTYLLYE SMNYRGIAFSVPLLPLLFAVGISFLVCTMVPVLTWIILEKNGTVVERIKGVE >gi|226332949|gb|ACII01000070.1| GENE 3 3033 - 3734 172 233 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 [Haemophilus influenzae PittAA] # 25 216 17 205 205 70 29 6e-12 MREYIVETKNLKKYYQMGANTVKALDGVDFRVKDQEFVAIIGKSGSGKSTLLHMLGGLDV PTEGEVLVEGKCLLGLKKEQLAIFRRRKIGFIFQNYNLVPDLNVYENVILPVELDGRKVD VEYVDGILELLGLAEKREALPGTLSGGQQQRAAIARALAAKPAIILADEPTGNLDSVTSH DVLGLLKMAARQFSQTLILITHDRDIAQLADRIVHIEDGKIVGDTRKGSDSYA >gi|226332949|gb|ACII01000070.1| GENE 4 3961 - 5571 644 536 aa, chain - ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA 
invertase Pin homologs # Organism: Listeria innocua # 6 301 2 301 301 324 56.0 3e-88 MLRQTNQQPITALYPRLSHEDELQGESNSISNQKRILETYAKQNGFSNLRWYTDDGYSGA NFQRPGFQAMLADIEAGKVGTVIVKDMSRLGRNYLQVGMYTEMIFPQKGVRFIAINDGVD SAQGDNDFAPLRNIFNEWLVRDTSKKIKAVKRSKGMSGKPITSKPVYGYLMDEDENFIID EEASPVVKQIYNLCLAGNGPTKIARMLTEQQIPTPGTLEYRRTGSTRRYHPGYECKWATN TVVHILENREYTGCLVNFKTEKLSYKVKHSVENPEEKQAIFENHHEPIIDTQTWERVQEL RKQRKRPNRYDEVGLFSGILFCADCGSVMYQQRYQTDKRKQDCYICGNYKKRTHDCTAHF IRTDLLTAGVLSNLRKVTSYAAKHEARFIKLLIEQNEDGGKRRNAAKKKELEAAEKRIAE LSAIFKRLYEDSVTGRISDERFTELSADYEAEQRELKEKAAAIQAELSKAQEATVNAEKF MNVVRRHTSFEELTPTLLREFVEKIVVHECSYDENKTRRQDIEIYYSFVGKVDLPE >gi|226332949|gb|ACII01000070.1| GENE 5 5683 - 5871 145 62 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3579 NR:ns ## KEGG: EUBREC_3579 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 62 1 62 62 75 98.0 4e-13 MTETRQTSTTKTDRRPDCVTEIRMGNSVLTVSGFFKQGATDTAADKMMKVLEAEAATQKT AI >gi|226332949|gb|ACII01000070.1| GENE 6 5868 - 6725 543 285 aa, chain - ## HITS:1 COG:CAC1933 KEGG:ns NR:ns ## COG: CAC1933 COG1484 # Protein_GI_number: 15895206 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 59 278 51 281 282 112 30.0 7e-25 MKNEIEAMITDITTTTAEAEDYTGEDGLLYCGKCHTPKEAYFPKETAQWLGHDRHPAECD CQRAAREKREAAESRQKHLETVEDLKRRGFTDPAMRNWTFEHDNGRNPQTETARFYVESW ETMQAENIGYLFWGGVGTGKSYLAACIANALMEKEVAVRMTNFATILNDLAANFEGRNEY ISRLCSYPLLILDDFGMERGTEYGLEQVYSVIDSRYRSGKPLIATTNLTLEELQHPQDTP HARIYDRLTSMCAPVRFTGSNFRKETAQEKLERLKQLMKQRKESL >gi|226332949|gb|ACII01000070.1| GENE 7 6722 - 7471 481 249 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1762 NR:ns ## KEGG: CDR20291_1762 # Name: not_defined # Def: phage protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 249 1 244 244 284 69.0 2e-75 MADNRKYYYLKLKESYFDDDAIVLLESMPDGILYSNILLKLYLKSLKNGGKLQLDENIPY TAQMIATLTRQQVGTVERALGIFQQLGLVEQLHGGLLYMTDIELMIGQSSTEAERKRAAR 
LANKALPPPRTNGGHLSDIRPPEIEIKKEIDIEIEKEREGETGHPTPAAYGRYNNVILTD TELSGLKTELPDKWEYYIDRLSCHIASTGKQYHSHAATIYKWAQEDTAKGKAAPKQGIPD YSCKEGESL >gi|226332949|gb|ACII01000070.1| GENE 8 7515 - 7697 76 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160894293|ref|ZP_02075070.1| ## NR: gi|160894293|ref|ZP_02075070.1| hypothetical protein CLOL250_01846 [Clostridium sp. L2-50] # 2 48 11 57 69 65 70.0 8e-10 MAEDNHGSETLLAIVGAAPAKVVIGNSVSVVEEKAPFKPPFVAHAWHWYAPFCRFRGYHG >gi|226332949|gb|ACII01000070.1| GENE 9 7841 - 9457 685 538 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3583 NR:ns ## KEGG: EUBREC_3583 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 538 3 543 543 758 75.0 0 MATFKHISSKNADYGAAEAYLTFEHDEFTMKPTLDEAGRLILREDYRIATLNCGEEDFAV ACMRANLRYGKNQKREDVKSHHYIISFDPRDAADNGLTVDRAQALGEEFCAEHFPGHQAI VCTHPDGHNHSGNIHVHIVINSLRIEEVPLLPYMDRPADTREGCKHRCTNAAMEYFKAEV MEMCHRENLYQIDLLHGSKNRITEREYWAQKKGQAKLDKESATLAAEGQPAKQTKFETDK AKLRQTIRNAMSEATTFDEFSALLLRQGVIVKESRGRLSYLTPDRTKPITARKLGDDFDR AAVLALLEQNTHRAAEKTAPIPQYHTAETNRTERGKTQKIAPTGNIQRMVDRAAKRAEGK GIGYDRWAAVHNLKQMAATVAAMEQYGFTPDELDAALVSANADLHSSTAKLKPIETAIRE KKDLQKQVLAYAKTRDVRDGLKKQKTDKARKAYREKHESDFIIADAAVRYFRQKGITKLP TYKSLQAEIEQLTAEKNALYNEYRANKERVRELQTMKSNLSQMHHGEPSRQKKHEQER >gi|226332949|gb|ACII01000070.1| GENE 10 9444 - 9863 158 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|294641691|ref|ZP_06719603.1| ## NR: gi|294641691|ref|ZP_06719603.1| hypothetical protein CUS_1608 [Ruminococcus albus 8] # 1 78 1 78 80 75 51.0 9e-13 MNVANSVTHFLQDFKLQGGKVGNGGSHFNRQRIVGRSVFVEKAGDLIEVAADPAVFGGQP ADSGQQLIIDRGHGNDGANGRPCNGLPDKFGLAHVIVFQPLGKVGVFFLGHAGFDHMTAV GRVVFFLQALTSFLARQGA >gi|226332949|gb|ACII01000070.1| GENE 11 10384 - 10788 177 134 aa, chain - ## HITS:1 COG:BS_ycbL KEGG:ns NR:ns ## COG: BS_ycbL COG0745 # Protein_GI_number: 16077324 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # 
Organism: Bacillus subtilis # 17 131 110 224 226 81 40.0 5e-16 MLIFNDTEEKAVEKAIAALADMIPLEAMQPPHSPVLTFPGLEIKLHQRRVLKNGTDIPLT RLEYGALCCLATSPGRVFTKAQIFEAVWSLESESCQSNVANVICNLRKKIEPDSGKPTYI KTVLGVGYKFASGE >gi|226332949|gb|ACII01000070.1| GENE 12 10819 - 11031 250 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579032|ref|ZP_04856303.1| ## NR: gi|253579032|ref|ZP_04856303.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 70 1 70 70 130 100.0 2e-29 MRNPKLVPYETIVRATSGEPEAIDEVLRHYSKRIRLASLENGQVNKDTEDNIKRRLIAAL FQFRFDGQPT >gi|226332949|gb|ACII01000070.1| GENE 13 11018 - 11461 62 147 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1773 NR:ns ## KEGG: CDR20291_1773 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 140 1 137 145 74 33.0 1e-12 MPCTEAYKEHIEYTFNAFCRVVIHYAAINAWRDRDKRRQREISFEYLTEEKFYPFCTSDT YFIDPYKQYPVTICGQRVILTNGNLAEALSSLPEKKREIIYLYFFGRYTQQEIGELYGRC RSTAWHHIHSALQMLHEEMEVLFHEES >gi|226332949|gb|ACII01000070.1| GENE 14 11516 - 13189 332 557 aa, chain - ## HITS:1 COG:CAC3437 KEGG:ns NR:ns ## COG: CAC3437 COG4219 # Protein_GI_number: 15896678 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component # Organism: Clostridium acetobutylicum # 1 320 9 338 541 113 24.0 1e-24 MNLLQMSFSGAAFITAVVIIRAVAINKLPKKTFLVLWELVILRLLIPFSIPSMFSVYTLV THSISSTTLPEAGTDYNIPTMQGLFVTTQGAEQPPENILSSVSMWFIVWCAGIILIALFF VISYLRCLIEFRTALPVHNHYVEKWLAERPLKRPILVKQSDRISAPLTYGIFRPVILVPK KMDWKNEKQLQYVLSHEYVHIYRYDTVTKLIATLALCIHWFNPFVWVMYILFNRDIELAC DESVIRQFGEKSKSAYSLMLINMEATKSGLLPFCNSFSKNAIEERITAIMKTKKTTIFSL VLACLIVVGVATAFATSAQANNNPIFEEQEYKPIHASENASDYDTAAELPGGNIDYGIYE EYGLIYDKNSKCYTYKGNVVRFFNDPVGGASFTNFFTGTVDIEAERDTNDKLIGIRECTK EVYDRHTEKYENSGLKSMPAGTSTENGDMIRNHESYSMESGDKTENMNTLQDYESYGIVY HPKDGYWYYNDQIIGIFIDTKKPIIYFNNSGSIYLSIAENSENGVMEIKEISEAEAQKLL RDNNSGNSASFTVETLK >gi|226332949|gb|ACII01000070.1| GENE 15 13200 - 13559 
346 119 aa, chain - ## HITS:1 COG:CC1640 KEGG:ns NR:ns ## COG: CC1640 COG3682 # Protein_GI_number: 16125886 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Caulobacter vibrioides # 8 119 28 141 144 68 31.0 3e-12 MDIKLFDSELKVMCVLWNEGDTTAKHISDVLKKEIGWNMNTTYTLIKRCIKKGAIERSEP NFMCHALIPKEEVQEAETKELVDKIYDGSVDKLFAALLGRKKLSAEQIQKLKQIVEDLE >gi|226332949|gb|ACII01000070.1| GENE 16 13962 - 14891 513 309 aa, chain - ## HITS:1 COG:CAC0451 KEGG:ns NR:ns ## COG: CAC0451 COG0642 # Protein_GI_number: 15893742 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 85 304 189 410 416 146 34.0 4e-35 MDWLMYIIMIILFAAVLVLLWKNYRIKKEAKLFAEKVEDALDAIVTGKEWNMEEELEDSL WGRTGTQLAKAGNVFRKKEEDGFREKERVKGLISDISHQTRTPIANIKLYLELLEDEEFS QNGQEFLGKIKGQMEKIEFLMQNMIKMSRLETGILQIHKEDQNLYETIRHAVADVVPEAA LKGINLYVNCEENMMIRHDSKWTEEVIYNILDNALKYTESGGKIRIQAERQELFFKISIS DTGKGIAPERQAEIFTRFYREPEVHDKPGVGIGLYLARTIMELQKGYIEVQSEIGKGACF RLYFPVNEP >gi|226332949|gb|ACII01000070.1| GENE 17 14879 - 15547 369 222 aa, chain - ## HITS:1 COG:CAC0450 KEGG:ns NR:ns ## COG: CAC0450 COG0745 # Protein_GI_number: 15893741 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 222 1 224 227 159 38.0 3e-39 MQYILVVEDDYDLNQAICYSLKKSGYGVYGVTFMEKGKQTFIENQIDLILLDVNLPDGEG FSFCQWVKKQREVPVIYLTARDMEEDALAGYESGAEDYVTKPFSMKILLRKIDVILKRTA SANHRIFTDENLYIDLDNARVMVKGQESSVTPTEYRLLCQFLMNRGQLLTYDLLLERLWD SGGQFVDRHALAVNINRLRGKIEDENHRYISNVYGMGYQWIG >gi|226332949|gb|ACII01000070.1| GENE 18 16262 - 16804 447 180 aa, chain + ## HITS:1 COG:aq_1194 KEGG:ns NR:ns ## COG: aq_1194 COG4636 # Protein_GI_number: 15606437 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in cyanobacteria # Organism: Aquifex aeolicus # 9 151 8 149 180 92 33.0 3e-19 
MPLPREQVYTTDDIYALPDGERAELIDGQIYMMGTPSRIHQKLVGQLSRIIGNYIESNHG SCEIYPAPFAVFIKKDDKNYVEPDISVICDKSKLSDRGCEGAPDFIIEIVSPSSRRMDYY KKCTLYAESGVREYWIVDPEKQRTMIYRYEDDAAPMIVPFEQDLAVGIYNDFMINVSKLL Prediction of potential genes in microbial genomes Time: Sat May 28 19:52:31 2011 Seq name: gi|226332948|gb|ACII01000071.1| Ruminococcus sp. 5_1_39B_FAA cont1.71, whole genome shotgun sequence Length of sequence - 21843 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 10, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 307 110 ## gi|253579039|ref|ZP_04856310.1| conserved hypothetical protein 2 1 Op 2 . - CDS 291 - 575 122 ## COG0640 Predicted transcriptional regulators - Prom 622 - 681 8.0 - Term 823 - 860 1.2 3 2 Op 1 . - CDS 863 - 1468 514 ## COG0535 Predicted Fe-S oxidoreductases 4 2 Op 2 . - CDS 1553 - 2089 699 ## EUBREC_0322 hypothetical protein - Prom 2177 - 2236 2.4 5 3 Tu 1 . - CDS 2317 - 3186 975 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase - Prom 3352 - 3411 9.6 + Prom 3176 - 3235 8.3 6 4 Tu 1 . + CDS 3350 - 4675 927 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs - Term 4519 - 4553 6.7 7 5 Tu 1 . - CDS 4672 - 5316 799 ## COG2082 Precorrin isomerase - Prom 5365 - 5424 4.4 - Term 5370 - 5407 -0.1 8 6 Op 1 2/0.250 - CDS 5449 - 6957 1742 ## COG1492 Cobyric acid synthase 9 6 Op 2 9/0.000 - CDS 6990 - 8033 963 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 10 6 Op 3 3/0.000 - CDS 8101 - 9078 810 ## COG1270 Cobalamin biosynthesis protein CobD/CbiB 11 6 Op 4 . - CDS 9179 - 10573 1194 ## COG1797 Cobyrinic acid a,c-diamide synthase - Prom 10594 - 10653 2.7 12 7 Op 1 . 
- CDS 10696 - 12747 1743 ## COG2242 Precorrin-6B methylase 2 13 7 Op 2 6/0.000 - CDS 12740 - 13471 932 ## COG1010 Precorrin-3B methylase 14 7 Op 3 12/0.000 - CDS 13464 - 14585 837 ## COG2073 Cobalamin biosynthesis protein CbiG 15 7 Op 4 . - CDS 14659 - 15435 940 ## COG2875 Precorrin-4 methylase 16 7 Op 5 . - CDS 15461 - 16579 1162 ## COG1903 Cobalamin biosynthesis protein CbiD - Term 16687 - 16730 7.6 17 8 Tu 1 . - CDS 16941 - 18284 1238 ## COG0534 Na+-driven multidrug efflux pump - Prom 18310 - 18369 7.3 18 9 Tu 1 . + CDS 18653 - 19552 316 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family + Term 19591 - 19632 6.5 - Term 19574 - 19625 13.7 19 10 Op 1 . - CDS 19646 - 20410 948 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor 20 10 Op 2 . - CDS 20448 - 21656 1292 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase - Prom 21773 - 21832 10.8 Predicted protein(s) >gi|226332948|gb|ACII01000071.1| GENE 1 1 - 307 110 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579039|ref|ZP_04856310.1| ## NR: gi|253579039|ref|ZP_04856310.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 102 1 102 102 165 100.0 8e-40 MKEMIKKYKGTLICSVLVMLAGILVGFTMAQSIWINVFFVVTDCILVTIIFYDNRNRQQS SKVIGMVIWMIPVTALIYNGMARLISMDTNSENLFMAVIYFG >gi|226332948|gb|ACII01000071.1| GENE 2 291 - 575 122 94 aa, chain - ## HITS:1 COG:BS_yvbA KEGG:ns NR:ns ## COG: BS_yvbA COG0640 # Protein_GI_number: 16080432 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 3 89 1 89 90 95 50.0 2e-20 MGLQDTLKALADPIRREILNLLKGGRLSAGEICEHFSVTGASISRHLAILKDADLIRSKR EGKFIYYELNTSVLEDVMLWITDLKGVSDYEGND >gi|226332948|gb|ACII01000071.1| GENE 3 863 - 1468 514 201 aa, chain - ## HITS:1 COG:aq_2060_2 KEGG:ns NR:ns ## COG: aq_2060_2 COG0535 # Protein_GI_number: 15607030 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Aquifex aeolicus # 7 199 2 190 194 112 34.0 4e-25 MTTDIIYRYKNQVYFNITNKCPCRCTFCIRNTEDAIGEASNLWFEHEPALEEIYKAIDEF DFSDCNEVVFCGYGEPTMALENLIAVSKYIRERYPFRIRLNTNGLSDLIHKRSTAEEICQ AVDSISISLNMPDAASYNEVVRPAYGEKSFDAMLKFARDCKQYLDDVRFTVVDVIGEEKV EQSKILAAQNGIPLRVRKYSP >gi|226332948|gb|ACII01000071.1| GENE 4 1553 - 2089 699 178 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0322 NR:ns ## KEGG: EUBREC_0322 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 16 173 5 171 181 176 64.0 3e-43 MSELTAKENIKTADTVKRITATAVMAALTTLMTAYIFHIPVGVNGGYVHLGDTMIYLAAA FLPLPYACAAGAIGGGLADLLTAPVWAPATIIIKMLICLPFSSKGTKLVTKRNVVALFLA FAISATGYYIAEGIMFGFTASFFTSVSGSIVQSGGSAIMFVIIGTALDKIGFKTNIAK >gi|226332948|gb|ACII01000071.1| GENE 5 2317 - 3186 975 289 aa, chain - ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 1 286 5 278 290 176 35.0 6e-44 MKKIAVIHDLSGLGKCSLTAAIPVISAMGVQACPLPTAILSNQTGYGSYFWDDYTDRMEL IMDEWKKTGFTPDGVYTGFLGDARQVDLILKFVDMFCTEGTHILVDPVMGDRGETYKTYS 
ETLCEKMRTLVREATVITPNLTEALLLLYGESGMKERMKALSDMEETQLLKKIEKSGRDL TLRFGLESVVITGVEVSGSDGRARMGNLVLEKEKADWCFSVKEGGSYSGTGDLLASVLSA GMVRGIPMKACVEKAVEFLGGAIHDAVEEGTDRNDGVCFEKYLGILTQI >gi|226332948|gb|ACII01000071.1| GENE 6 3350 - 4675 927 441 aa, chain + ## HITS:1 COG:BS_ydeL KEGG:ns NR:ns ## COG: BS_ydeL COG1167 # Protein_GI_number: 16077591 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Bacillus subtilis # 14 402 15 422 463 234 32.0 3e-61 MELILDKNKKKEHLYLEVFHYYRELIMEKKLPPGSRMPSLRKCSQELKLSRTTIENAYLQ LAAEGYIISKAQSGYYVTDIADRQHTSVRKTEKQTQTPCRFDFASSGVDRESFRFDLWRR YIKSALRQDERLLSYGEPQGEADLRDTLADYVRERRNVLCSGEDIVIGAGIQSLLHILCP LLEQRQTVSFPNPSFVQGSTVFHDYGFQVDYRNKDCDIIYVSPAHMTKWGDIMPVSRRLE LVNYSAAHGSLVIEDDFENEFVYLQKPTPSLFGLAGGQNVVYLGTFSRLLLPSIRISFMV LPPELSALYRKKADQYNQTASKAEQIALCQFIRDGHLAAQTRKLKRLYSVKLKALRSAVR EVFGKDAQIQTGAAGTSLALTLSTTLTWQELQKKAQTNGLRLQLLREAPGKITLILSCSS MPVADFIPACKLLKDISQIQN >gi|226332948|gb|ACII01000071.1| GENE 7 4672 - 5316 799 214 aa, chain - ## HITS:1 COG:FN0970 KEGG:ns NR:ns ## COG: FN0970 COG2082 # Protein_GI_number: 19704305 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin isomerase # Organism: Fusobacterium nucleatum # 10 214 10 217 219 176 46.0 3e-44 MKVELENVKPMDIEKRSFEIITEELGDKKLVPGTELIVKRCIHTSADFDYAENLCFSENA VEKAIAAIKDGACIVTDTHMAESGINKRVLSRYGGEVFCFISDEDVAATAKANGTTRATA SMDKAAAMGKKLIFAIGNAPTALVRLYELIEDGKLDPELIIGVPVGFVNVVQSKELIMQA DVPYIVARGRKGGSNVAACICNALLYMIDNNRGY >gi|226332948|gb|ACII01000071.1| GENE 8 5449 - 6957 1742 502 aa, chain - ## HITS:1 COG:STM2019 KEGG:ns NR:ns ## COG: STM2019 COG1492 # Protein_GI_number: 16765349 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Salmonella typhimurium LT2 # 1 500 1 501 506 484 51.0 1e-136 MAKAIMVQGTMSNAGKSLLAAGLCRIFKQDGYRVAPFKSQNMALNSFITEEGLEMGRAQV 
MQAEAAGIKPSVLMNPILLKPTNDVGSQVIVNGEVLGTMSARDYFKYKKKLVPDIMKAYD KLASENDIIVIEGAGSPAEINLKTEDIVNMGMAKMAKAPVLLVGDIDRGGVFAQLIGTVE LLEEDEREMVKGLIINKFRGDKTILDPGVQMLEERSHIPVVGVAPYLDIQVEDEDSLTER FDRKQEVDLIDIAVIRVPRISNFTDFNPLESIPGVSLRYVQHVSELKNPDMIILPGTKNT MEDLLWMRANGLEAAVLKEAAKGKIIFGICGGYQMLGETLSDPHHVEAGGTIKGMGLLPM DTVFAEKKTRTRVSGRFLELEGELQALSGTELEGYEIHMGETVLKGEAGHSVSIEDQVSG ECKEDGAYCKNVCGTYVHGVFDREDVAEAVVRVLGEKKGIDVSQMTGIDFAAFKETQYDI LAAELRKHLDMKKIYEILEQGI >gi|226332948|gb|ACII01000071.1| GENE 9 6990 - 8033 963 347 aa, chain - ## HITS:1 COG:STM0644 KEGG:ns NR:ns ## COG: STM0644 COG0079 # Protein_GI_number: 16764021 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Salmonella typhimurium LT2 # 6 347 8 358 364 250 37.0 3e-66 MTKHIHGGDVYKYDHCLDFSANCNPLGTPKSVKQAIIDSVEDLSDYPRVGCGPLKEAIAD YEHTKKEYLICGNGAADLIFSLSRALNPKKALLPAPTFAEYEQALVSVGCEISRYYLKEE NDFCIQKDYPDVLKREKPDIIFLCNPNNPTGITIPQDLLEEILETCAMQGIFMVVDECFL DFVKDPEKHTLKEKLAKYPGLFILKAFTKRYAIPGVRLGYGFCSDGEVLDRMEAVTQPWN VSTMAQQAGMAALKEAEYVEAGRQIIFRESAWMKEKMRQLGLTVYPSEANYIFFYGSEDL FERCVAKGILIRDCSNYPGLKKGYYRVAVKLHEQNEKLIEVLEEVLS >gi|226332948|gb|ACII01000071.1| GENE 10 8101 - 9078 810 325 aa, chain - ## HITS:1 COG:FN0975 KEGG:ns NR:ns ## COG: FN0975 COG1270 # Protein_GI_number: 19704310 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobD/CbiB # Organism: Fusobacterium nucleatum # 1 314 1 316 325 249 40.0 4e-66 MLYYSMPALAAGFILDLMIGDPRWLYHPVCLIGNLIAFLEKILRKIFPKTDKGELAAGTV EVILVCLLSGGIPFLMLHILYGISVWAGFALETFWCYQLLATKSLKTESMKVYDRLKNGT LDEARYAVSMIVGRDTQSLTEEGVTKAAVETVAENASDGVIAPMLYMAIGGVWLMFLYKG INTMDSMLGYKNDKYLYFGRCAAKLDDVANYIPARLSGWLMVAASAFVKMDVKNAAKIYR RDRRNHASPNSAQTEAAMAGALEVQLAGNAYYFGKLYEKPTIGDEIRSVEVEDIRRSNSL LYATAILGAVIFLLIGSAVRYCFFL >gi|226332948|gb|ACII01000071.1| GENE 11 9179 - 10573 1194 464 aa, chain - ## HITS:1 COG:TVN0409 KEGG:ns NR:ns ## COG: TVN0409 COG1797 # Protein_GI_number: 
13541240 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Thermoplasma volcanium # 1 464 1 452 454 285 35.0 1e-76 MDIPRILLAAGASGSGKTLITCGLLQALVNRKMKVASFKCGPDYIDPMFHSRVIGTKSRN LDTFFTDAETTRYLLGKNAADCDIAVMEGVMGYYDGVGGISTKASAYDLADTTDTPVILV VNSRGMSISLAAYIKGFMEYKEKSHIKGVIFNQMSPMLYPRMKKLVEEQLEVEVLGYVPK VEDCVIESRHLGLVLPEEISDLKERLQKLAGILEDTLEIDRILALAKNAEDLQVPESLIQ KDGTYGYCLPQKLRIGVAKDEAFCFFYEDNFRLLQEMGAELVDFSPIHDEHLPADLDGIL LYGGYPELNGEALERNASMKEEIAQAVKQGMPCMAECGGFMYLHEQMEDMGGVFRKTCGV IPGKCFRTPRLTRFGYITLTAGKPVFGRSAEEIGEIPAHEFHYFDSENCGSDFHAAKPLS KRGWDCMHSSSNLLAGYPHIYYYGNPQIPRAFLMKCLEYHNSKE >gi|226332948|gb|ACII01000071.1| GENE 12 10696 - 12747 1743 683 aa, chain - ## HITS:1 COG:FN0964 KEGG:ns NR:ns ## COG: FN0964 COG2242 # Protein_GI_number: 19704299 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6B methylase 2 # Organism: Fusobacterium nucleatum # 497 678 3 184 189 163 46.0 9e-40 MYKVLVFAGTTEGYEICRFLADHKIETKGFVATEYGSKSLTENEFLTVQTGRLDAADMEE VFLQEKPEMVLDATHPYAAEVTVNIRTACENTQTAYYRVLREAGEHEDRAVYVDSVQAAA DYLDQTQGNVLLTTGSKELAGFTGMKDYQNRLYARVLSLPNVMKACAELGFEGKHLIGMQ GPFSRELNAAMLRQYDCRYLVTKDTGKAGGFQDKIDAALECDAVPVIIGRPLKEEGMSVR ECKRFLTEHFSLAHRPHITLLGIGMGSQKLLTVQGKNSLDQADLLIGARRMVDSVKRPGQ DVFVEYRSQEIRDYIDAHPEYDNIVIVLSGDVGFYSGARKLLEVLCQDSADLRVQRKNGS EKSEEERDSSAQNNTEIEIQCGISSVVYFMSQIGLSWDDAKIVSAHGRGCNLISHICYAE KVFSILGTSDGVAVLAEKLVKYGMGDVLLYVGENLSYENEKIFAKPASELTEYNGDPLSV ICAVNENAGTRLETHGIRDEEFIRGKAPMTKEEVRTVSLSKLRLTAGSVCYDVGAGTGSL SIEMALRAHQGQVWAIEKKEEAVELIHRNKVKFAADNLEIVEGLAPEALKDLPAPTHAFI GGSSGNLKEIVKLLVEKNPQVRIVINCITLETVSEALETAKEFGFEENEIVQLSAARSKA IGRYHMMMGENPIYIITLQNPGK >gi|226332948|gb|ACII01000071.1| GENE 13 12740 - 13471 932 243 aa, chain - ## HITS:1 COG:lin1162 KEGG:ns NR:ns ## COG: lin1162 COG1010 # Protein_GI_number: 16800231 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-3B methylase # Organism: Listeria innocua # 4 240 2 239 241 249 52.0 3e-66 
MSKIYVIGIGPGAYDQMTGKAIRAMNESDAIIGYTVYVDLVKEYFPGKEFMTTPMKKEVD RCVLAFEEAKKGKTVSMICSGDAGVYGMAGLMYEVGVNYPETELEIIPGVTAATGGAAVL GAPLIHDFCLISLSDLLTPWEKIEARLLAAAQADFVVCLYNPSSKKRHDYLQKACDLMMK YKSPDTICGTVSNIAREGEEAHVMTLKELRDTQVDMFTTVFIGNSQTKEINNKMVTPRGY RNV >gi|226332948|gb|ACII01000071.1| GENE 14 13464 - 14585 837 373 aa, chain - ## HITS:1 COG:CAC1370 KEGG:ns NR:ns ## COG: CAC1370 COG2073 # Protein_GI_number: 15894649 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiG # Organism: Clostridium acetobutylicum # 1 364 1 320 326 178 35.0 2e-44 MKVALICFSLTGQQTGERLYRGLEAAGLTAELDKKSKYLPDSIQISTSAWAGEKFSDSDA LIFIGATGIAVRSIAPYVASKKSDPAVLVVDECGKFVISLLSGHLGGANELALKTAEILE AIPVVTTATDLHHRFAVDVFAKKNNCNIFNMKAAKEVSAALLAGKKVGFYSEFPTDGELP EGLIRCDEYGNPVSSMDDRSEEETQKSSDFNGTGTDCTNIDCGVAVTVHTSCNPFISTTQ VVPKCLTLGMGCRKDKDARGIAEAAQKVLDRSGFHKEAFEQIASIDLKKEEKGILSLSQD WQIPFVTYTEEELKQVPGEFTPSPFVKKITGVDNVCERSAVLASGNGRLLQRKTGENGVT TAVAAREWRIHFE >gi|226332948|gb|ACII01000071.1| GENE 15 14659 - 15435 940 258 aa, chain - ## HITS:1 COG:lin1160 KEGG:ns NR:ns ## COG: lin1160 COG2875 # Protein_GI_number: 16800229 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-4 methylase # Organism: Listeria innocua # 2 248 4 249 249 268 55.0 8e-72 MVHFVGAGPGAPDLITVRGKQYLEEADVVIYAGSLVNPKLLEYTKDICTIYNSAKMTLEE VIHVIEKAEAEGKTTVRLHTGDPCIYGAIREQMDILDEKNITYDYCPGVSAFCGAASALN LEYTLPDISQSVIITRMEGRTPVPSKESIQSFAAHQATMVVFLSTGMLEELSRRLIEGGY TADTPAAIVYKATWPEQKTFVCTVGTLAKTAAENNITKTALMIIGDTVAAAHYDRSKLYD PEFTTEFREAVKSKTHHS >gi|226332948|gb|ACII01000071.1| GENE 16 15461 - 16579 1162 372 aa, chain - ## HITS:1 COG:FN0967 KEGG:ns NR:ns ## COG: FN0967 COG1903 # Protein_GI_number: 19704302 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiD # Organism: Fusobacterium nucleatum # 1 359 6 368 375 223 38.0 6e-58 MRFGYTTGSCAAAACKGAAEILLGGVMQKAVTLMTPKGILLTLELKDIRIEGNQVTCAIQ KDAGDDPDTTNGILVYATVQKTKEPGIILDGGVGVGRVTKAGLSQKIGEAAINPVPKAMI 
LREATEIAEKYDYEGGLKIIISVPEGVEIAKKTFNPRLGIVGGISILGTSGIVEPMSEAA LVQSINVEMKQHFSQGEEYLLVTPGNYGADYLREHMDLPYEKNIKCSNYVGETLDMAIDM GVKGILFIAHIGKFVKVAAGIMNTHSHSADARMEVLASNAIRAGASLECAKEILNASTTD EAIDILEKYQILQKTMKEILDRIQFYLNHRSYEQILLGAVIFNNTYGYLGQTADAEKLIT LINAQNDETQLQ >gi|226332948|gb|ACII01000071.1| GENE 17 16941 - 18284 1238 447 aa, chain - ## HITS:1 COG:CAC1611 KEGG:ns NR:ns ## COG: CAC1611 COG0534 # Protein_GI_number: 15894889 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 16 339 11 347 350 172 31.0 2e-42 MKEQQHIFTNKMLRTLLIPVIFEQVLNSLMGTVDTMMVSNVGSAAISAVSLVDSINVLVI QAFSALAAGGAILCSQYIGQKNHEMANKSARQVLFIITAISVAVTALCLLFRVPLLKLIF GKVEADVMTASQVYFFYTALSFPFIALYDAGASIFRSQGNTRGPMTVSVISNVINIGGNA ALIWGFNMGVAGAAIATLASRVFCAVVVLWQLRMDRQPIVVRNYRQIRPDGKMIRRVLAL GIPSGIENSMFQLGKLAIQSSVSTLGTVAIAAQAMTNILENLNGIAAIGVGIGLMTVVGQ CLGAGRKDEAVYYIKKLSVLAEIIVVASCLLVFALTIPVTKLGGMEPESARMCFHMISWI TVVKPIVWTLAFIPAYGLRAAGDVKFSMITSCITMWTFRFCLCVYLIRFRGFGPMAVWLG MFTDWTVRGIVFSLRFHSRKWLKHKVV >gi|226332948|gb|ACII01000071.1| GENE 18 18653 - 19552 316 299 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 14 289 14 284 287 126 33 1e-28 MQRILKYIISENTTPVTILDFLKKEGFSRHILSSMKNTPNTIILNGERGFGRSVLKPQDH LIVTVPETESGENIIRTKMDLNIIYEDEDILVINKPADMPVHPSAGNYENTLANGIAWYF AEKGEDFVYRCINRLDRDTTGALILAKNPLSAAILSVQMKKRQILRTYLALVDGLLPDSG TINAPIARMEGSVITREVNFGTGESAITHYERLAAGKEYSLAELHLETGRTHQIRVHMKY IGHPLPGDYLYNPDYRRINRQPLHSYQLEFTHPTTGKVMLFTAPLPIDFISAFYSNPLK >gi|226332948|gb|ACII01000071.1| GENE 19 19646 - 20410 948 254 aa, chain - ## HITS:1 COG:BH0086 KEGG:ns NR:ns ## COG: BH0086 COG1521 # Protein_GI_number: 15612649 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Bacillus halodurans # 1 253 1 253 254 209 43.0 3e-54 MILAIDVGNTNIVVGCIDDRKTYFIERLSTNRTKTELEYAVDLKNVLDIYHIKKTEIEGC 
IISSVVPQITNIVKLAAEKILKKNAIVLGPGVKTGLNIMMDNPGQLGADQVADAVAGIAG YPVPLILIDMGTATTASVVNSKKQYVGGMILPGVGVSLDALTARASQLSGISIDAPRHVI GKNTIECMKSGVLYSNAAALDGIIDRIEEELGEKATVVATGGLAKKIVPHCKREIILDEE LLLRGLLIIYEKNK >gi|226332948|gb|ACII01000071.1| GENE 20 20448 - 21656 1292 402 aa, chain - ## HITS:1 COG:BH2510 KEGG:ns NR:ns ## COG: BH2510 COG0452 # Protein_GI_number: 15615073 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus halodurans # 3 390 2 390 404 372 52.0 1e-103 MTLEGKTVLLGVTGSIAAYKIAYLASALKKRHADVHVLMTENATNFINPITFETLTGNKC LVDTFDRNFQFQVEHVSIAKKADVVMIAPASANVIGKLAHGIADDMLTTTIMACKCKKFI SPAMNTNMFENPVVQDNLKILEHYGYEVIAPACGYLACGDTGAGKMPEPETLLAYIEREA ACEKDLKGKKILVTAGPTQESVDPVRYLTNHSSGKMGYAIAKAAMLRGADVTLVSGRTSI EPPMFVNLIPVVTARDMYEAVTSVSNEQDIIIKAAAVADYRPARISEEKVKKSDGQMSIE LERTDDILKFLGEHKQHGQFLCGFSMETQNVIDNSRAKLAKKNLDMVAANNVKVEGAGFQ GDTNVLTLITQDEEISLPLMSKEDAAFRILDKILSLMQDSVC Prediction of potential genes in microbial genomes Time: Sat May 28 19:52:43 2011 Seq name: gi|226332947|gb|ACII01000072.1| Ruminococcus sp. 5_1_39B_FAA cont1.72, whole genome shotgun sequence Length of sequence - 6380 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 6, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 25 - 84 6.2 1 1 Tu 1 . + CDS 207 - 770 393 ## COG0500 SAM-dependent methyltransferases + Term 783 - 841 5.7 - Term 776 - 823 9.7 2 2 Tu 1 . - CDS 892 - 1710 904 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family - Prom 1739 - 1798 6.2 + Prom 1790 - 1849 7.3 3 3 Tu 1 . + CDS 1948 - 2427 533 ## COG1854 LuxS protein involved in autoinducer AI2 synthesis + Term 2504 - 2554 6.1 - Term 2604 - 2642 1.2 4 4 Op 1 . - CDS 2646 - 3161 711 ## gi|253579063|ref|ZP_04856334.1| predicted protein - Prom 3221 - 3280 7.2 5 4 Op 2 . 
- CDS 3361 - 4356 870 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family - Prom 4402 - 4461 9.8 + Prom 4180 - 4239 4.2 6 5 Tu 1 . + CDS 4439 - 5650 934 ## COG1301 Na+/H+-dicarboxylate symporters + Term 5663 - 5710 7.8 - Term 5642 - 5702 10.1 7 6 Tu 1 . - CDS 5708 - 6322 816 ## EUBREC_3158 hypothetical protein Predicted protein(s) >gi|226332947|gb|ACII01000072.1| GENE 1 207 - 770 393 187 aa, chain + ## HITS:1 COG:NMB0747 KEGG:ns NR:ns ## COG: NMB0747 COG0500 # Protein_GI_number: 15676645 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Neisseria meningitidis MC58 # 13 179 13 177 188 148 46.0 7e-36 MKNYQITQWCAAFIRQQVQEGDFCIDATMGNGNDTLLLSQLCGESGKVLAFDIQEQALTA TQKRLNAGHVPENYRLLLESHANMAEYATPDSVSCIVFNFGYLPGGDHSLATRGKTSIQA LTQALTLLKKGGMISLCIYSGGDSGFEERDQILDWLKNLDPHQYLVIKSEYYNRPNNPPI PVLIIKN >gi|226332947|gb|ACII01000072.1| GENE 2 892 - 1710 904 272 aa, chain - ## HITS:1 COG:CAC2769 KEGG:ns NR:ns ## COG: CAC2769 COG0652 # Protein_GI_number: 15896024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Clostridium acetobutylicum # 115 244 8 137 174 150 57.0 3e-36 MKRKLLAALCVTAMSAGLLAGCGSSDKTEESTKTEETAKEETSSQEETETKAASSDSDNV ASEEEAEKAGFSDGSDEKSDSADTDSENKDSSDSSTTSVLLDTSKELTGIHHAEIEVKDY GTIDVELDADTAPITVTNFVKLAQEGFYDGLTFHRIMDGFMIQGGDPNGDGTGGSEENIK GEFSNNGVDNDISHTRGTISMARASDPDSASSQFFIVQADSTFLDGDYAGFGHVTEGMDI VDKICEDAKPTDDNGTIPSDQQPVIEKITITD >gi|226332947|gb|ACII01000072.1| GENE 3 1948 - 2427 533 159 aa, chain + ## HITS:1 COG:CAC2942 KEGG:ns NR:ns ## COG: CAC2942 COG1854 # Protein_GI_number: 15896195 # Func_class: T Signal transduction mechanisms # Function: LuxS protein involved in autoinducer AI2 synthesis # Organism: Clostridium acetobutylicum # 1 159 1 158 158 234 69.0 3e-62 
MEKITSFTIDHIKLQPGIYVSRKDPVGGSMITTFDIRMTSPNEEPVMNTAELHTIEHLAA TFLRNHKEFGSKMIYWGPMGCRTGNYLLLNGDYESKDIVPLMIETFEFIRDFEGEVPGAS AKDCGNYLDMNLPMAKYLAGKFLDEVLYDIKPDRLIYPQ >gi|226332947|gb|ACII01000072.1| GENE 4 2646 - 3161 711 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579063|ref|ZP_04856334.1| ## NR: gi|253579063|ref|ZP_04856334.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 171 1 171 171 206 100.0 6e-52 MKKSLWLVLVLGLAFAAQGCGFDPTIDSSVMIVKVTPTPEVTPTPEASEATPTPEATPTP AAEQTASGVKVEVKSGTYYASSELNLRSDASSDADLISSVAAGTQLNSTGVCENGWIRVD YNGQTCYASGDFVTTTAPAADAGETAADTSADAGETAADTSADTSYDESAE >gi|226332947|gb|ACII01000072.1| GENE 5 3361 - 4356 870 331 aa, chain - ## HITS:1 COG:ECs3748 KEGG:ns NR:ns ## COG: ECs3748 COG1975 # Protein_GI_number: 15833002 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Escherichia coli O157:H7 # 67 229 91 252 541 98 30.0 2e-20 MYRKIYQILEETGKVKTAMVLNGNHKGEKCLIKENHCLSSDENTADFWKKYEADLAECQE TKVVHMGDVDFFVEIYVKNPQLIIFGGGHVSQPVAKIGKMLGFHVTVMDDREDFVTSERF PDADRLIKGSYDKLSDKIPAYENAYYVIVTRGHLGDSACARQILRRPYTYLGMIGSKNKV KLTREKLLGEGFSEEQLNSIHAPIGLPIGGHMPAEIAVSIAAEIVQEKNRYDVSYIDGAV EDAVRKKENGIMITIISKSGSSPRGTGSKMFIDKDGNSYGSIGGGNVEFQALKYAPEAQH GEIRKYNLSNQGGANLGMICGGEVEVLYELL >gi|226332947|gb|ACII01000072.1| GENE 6 4439 - 5650 934 403 aa, chain + ## HITS:1 COG:FN1148 KEGG:ns NR:ns ## COG: FN1148 COG1301 # Protein_GI_number: 19704483 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Fusobacterium nucleatum # 9 390 9 382 390 376 58.0 1e-104 MKKNFFLHSLPFRLLVGVAFGICIGLLLNIIDESAITTAVLNVVVTSKYILGQLISFSVP LIIIGFIAPSITRLGNNASRMLGVAVSIAYASSLGAALFSALSGYLLIPHLSFSSTASHL KALPDVVFELSIPQIMPVMSALVLAIMLGLASAWTKASLISSFLEEFQKIILAIVSRVII PILPYFIGLTFCSLAYEGTITRQLPVFLQVIVIVLIGHFIWIALLYILAGIYSRENPLKV VKHYGPAYLTAIGTMSSAATLPVALQCAHRAKPLRKDMVDFGIPLFANIHLCGSVLTEVF 
FCMAISKILYGAVPAPGAMVLFCVLLGIFAIGAPGVPGGTVMASLGIITGILKFGDSATA LMLTIFALQDSFGTACNVTGDGALTLILTGYARYHNIREQKLQ >gi|226332947|gb|ACII01000072.1| GENE 7 5708 - 6322 816 204 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3158 NR:ns ## KEGG: EUBREC_3158 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 5 204 4 205 205 196 52.0 5e-49 MKHEVNVMGKQCPIPVVMTKKVIDNAAVGDEIEILIDNETAVNNLSRLANKTGCTFVSEK LGDKKYQVKMAVQTEQTGGTLEEEEFVCEAPHKKVTVAVISSNVMGNGDDELGKILIKGF IYALSQMETHPDTILFYNGGAKLTTEDSESLEDLKKMEEEGVEILTCGTCLKHYGLMEKL MVGKVTDMYTIAERMTGADKVIRP Prediction of potential genes in microbial genomes Time: Sat May 28 19:52:59 2011 Seq name: gi|226332946|gb|ACII01000073.1| Ruminococcus sp. 5_1_39B_FAA cont1.73, whole genome shotgun sequence Length of sequence - 13180 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 10, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 100 - 142 2.4 1 1 Tu 1 . - CDS 206 - 1579 1454 ## COG0044 Dihydroorotase and related cyclic amidohydrolases - Prom 1761 - 1820 10.0 + Prom 1720 - 1779 15.2 2 2 Tu 1 . + CDS 1946 - 2578 679 ## COG2964 Uncharacterized protein conserved in bacteria + Term 2603 - 2648 7.0 + Prom 2793 - 2852 9.8 3 3 Op 1 9/0.000 + CDS 2908 - 4176 553 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 4 3 Op 2 . + CDS 4246 - 4962 546 ## COG3279 Response regulator of the LytR/AlgR family 5 4 Tu 1 . - CDS 5055 - 5240 164 ## gi|253579071|ref|ZP_04856342.1| predicted protein - Prom 5345 - 5404 9.8 + Prom 5330 - 5389 10.3 6 5 Tu 1 . + CDS 5603 - 6130 506 ## gi|253579072|ref|ZP_04856343.1| conserved hypothetical protein + Term 6368 - 6403 4.1 - Term 6349 - 6394 6.0 7 6 Op 1 15/0.000 - CDS 6412 - 7431 1214 ## COG0059 Ketol-acid reductoisomerase - Prom 7453 - 7512 4.9 - Term 7452 - 7490 0.7 8 6 Op 2 . 
- CDS 7560 - 8057 450 ## COG0440 Acetolactate synthase, small (regulatory) subunit - Prom 8117 - 8176 8.4 - Term 8146 - 8202 -0.1 9 7 Op 1 . - CDS 8218 - 9270 865 ## COG2365 Protein tyrosine/serine phosphatase 10 7 Op 2 . - CDS 9191 - 9376 79 ## gi|253579076|ref|ZP_04856347.1| predicted protein - Prom 9462 - 9521 4.0 + Prom 9381 - 9440 7.9 11 8 Tu 1 . + CDS 9515 - 10459 1024 ## COG0679 Predicted permeases - Term 10430 - 10467 3.1 12 9 Op 1 . - CDS 10477 - 11286 707 ## COG0561 Predicted hydrolases of the HAD superfamily 13 9 Op 2 . - CDS 11340 - 12608 1173 ## COG1362 Aspartyl aminopeptidase - Prom 12658 - 12717 6.8 - Term 12683 - 12728 2.1 14 10 Tu 1 . - CDS 12741 - 13067 501 ## COG0662 Mannose-6-phosphate isomerase - Prom 13115 - 13174 5.4 Predicted protein(s) >gi|226332946|gb|ACII01000073.1| GENE 1 206 - 1579 1454 457 aa, chain - ## HITS:1 COG:SMc01821 KEGG:ns NR:ns ## COG: SMc01821 COG0044 # Protein_GI_number: 15966207 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Sinorhizobium meliloti # 1 452 1 452 484 361 41.0 1e-99 MKTLLKNGTVVSGDMSVREDVLIDGEKIVKVGRNLEAEDAQVVDVNGKLLFPGFIDGHTH FDLEVAGTVTADDFETGTRAAIAGGTTLVIDYASQDKGGHTLREGLEKWHEKADGKCSCD YSFHMSIVEWNEETQREIQDMINEGITSFKLYMTYPAMIVDDGDLYKIIKKLNEYGCFAG VHCENAGAIDALIAEAKKEGRLGPENHPLVRPDIMEAEAVHRLLVIADAANAPVMVVHLT NRKAFEEVMRARMNGQKVYAETCPQYLLLDDSVYSKPDFEGAKYVCAPPIRKKADQDCLW KALANDQIQTVATDQCSFTMEQKALGKDDFTKIPGGLPGVQTRGTLLYTYGVRTGRITQE QMCRLLSENAAKLYGVYPNKGVIREGSDADIVVFDPEKENVISAKTHLYHTDNNPYEGFK LHGDIDSVYLRGAHVVQDGKVILEKTGKYIKRGKNQM >gi|226332946|gb|ACII01000073.1| GENE 2 1946 - 2578 679 210 aa, chain + ## HITS:1 COG:YPO0626 KEGG:ns NR:ns ## COG: YPO0626 COG2964 # Protein_GI_number: 16120952 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Yersinia pestis # 14 184 5 175 196 65 26.0 9e-11 MGRLTVLKQIAADLASQFGPDCEVVIHDLKTSEPEHSIVYIVNGHVTNRDIGDGPSNAVF 
DAIRNQEKGATPEDHTGYLMKTADGKILKCSTSYIHDDDGTLHYVFGINYDITKLTMIES ALHSLVTPENKEEKPKEITHNVNDLLDHLIEESVALVGKPVPLMNKEDKVTAIQFLNDAG AFLITKSGDKVANYFGISKYTLYSYIDVNK >gi|226332946|gb|ACII01000073.1| GENE 3 2908 - 4176 553 422 aa, chain + ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 221 396 235 405 433 66 26.0 1e-10 MSEQILSAVHGVTTMLFGIYCSAFFLGIKPIRKNILTMFLLFLGQGLLYVIDLALFGETL ANMSYPLIVHFPLVLFLSVHYKYPLISSAVSVFSAYLCCQISNWTGLFALAITGLQWCYY SVRILTTTLTFVLLYRYVFRSTKTIFTKNARELSIIGFLPFVYYVFDYASTKFSTLLYSG NKAVVEFMGFAFCIAYLVFLIIYFQEYENKQEITQYSNLREMQLQSMQNEIEQVKISSQR LAILRHDMRHHLSIILTQLQNGHPDKAQEYIHEINSAYDDTIIAAYSGNEMLNSVLSIYH SRFTDRGLSLICNVSTGKELPCSDLSLCTILSNALENSMHALEQLESPSKWARLTLSQKK NHILFQLENPVEKIPAFVDGVPVSTRNGHGIGVRSIIYYVEQLHGQCHFSIVDHCFVLRI II >gi|226332946|gb|ACII01000073.1| GENE 4 4246 - 4962 546 238 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 232 2 230 234 110 27.0 2e-24 MNIAICDDEILFTRELSSLLTHWAEKNDFSLTLYPYSNGDDLLTALRTIPVDLIFLDIIM PLLNGIDTAREIRSMGLTVPVIFLTSSREFALDSYDVKAFHYLLKPVNTLKLFSVMDDFF KTYHVPAETFVAHTADGFCSITLNDVDYLEAQNKQVLVCLSNGTTLKIRELFVKCEGVFT PEKGFFKCHRSYIVNLSHIKQFTRTMVTTGISSVPISRNNYSAFKDAYFSYMFDSNSG >gi|226332946|gb|ACII01000073.1| GENE 5 5055 - 5240 164 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579071|ref|ZP_04856342.1| ## NR: gi|253579071|ref|ZP_04856342.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 61 1 61 61 100 100.0 3e-20 MEIEEEFISGFCRTCNGGQTVCCEYTMEGDKRTLTFMDCAHDRCVNYAACEIYKQAHEME R >gi|226332946|gb|ACII01000073.1| GENE 6 5603 - 6130 506 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579072|ref|ZP_04856343.1| ## NR: gi|253579072|ref|ZP_04856343.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 175 1 175 175 318 100.0 7e-86 MNRELYDEAIRSNILSRKLIEQLMESMNYSSISFINWTVEVLKIIKTRLERGDKITDEVS GITYDIKSFRNFVSTNFSSYITSQVFDAPDKAEKVYFSLEATEDGHAYNMVMANSSKDKT YKWISSLSERFSLVEMIATGIVYLKDNRTDTYQPFISGNGKYCRYDVEKGQIIEL >gi|226332946|gb|ACII01000073.1| GENE 7 6412 - 7431 1214 339 aa, chain - ## HITS:1 COG:Cj0632 KEGG:ns NR:ns ## COG: Cj0632 COG0059 # Protein_GI_number: 15791992 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Campylobacter jejuni # 1 338 1 336 340 429 62.0 1e-120 MANKIFYQEDCNLSLLDGKKIAIIGYGSQGHAHALNLKDSGCDVIIGLYEGSKSWKRAEE QGFKVYTAAEAAKQADIIMILINDELQAKLYKESIEPNLEEGNMLMFAHGFNIHFGCIKP PKNVDVTMIAPKAPGHTVRSEYQAGKGTPCLIAVEQDYTGKAQDLALAYGLGIGGARAGL LETTFRVETETDLFGEQAVLCGGVCALMQAGFETLVEAGYDPRNAYFECIHEMKLIVDLI YQSGFAGMRYSISNTAEYGDYITGPKIITEDTKKAMKKILSDIQDGTFAKDFLLDMSDAG AQVHFKAMRKKASEHQSEKVGEEIRKLYSWNGEDKLINN >gi|226332946|gb|ACII01000073.1| GENE 8 7560 - 8057 450 165 aa, chain - ## HITS:1 COG:MTH1443 KEGG:ns NR:ns ## COG: MTH1443 COG0440 # Protein_GI_number: 15679440 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanothermobacter thermautotrophicus # 5 160 10 166 168 137 45.0 1e-32 MQKRVLSLLVDNTAGVTTRVSGLFSRRGYNIDSITGGVTADERFSRITVVCSGDELILEQ ITNQLAKLVDVRDIKILEPDNSVCHELMMIKVAAKPEQRQGLISIADVFHAKVADVSVDS MILEMTGNHNKLEAFLELMGDYEILELARTGMTGLSRGSQDVTYF >gi|226332946|gb|ACII01000073.1| GENE 9 8218 - 9270 865 350 aa, chain - ## HITS:1 COG:lin1914 KEGG:ns NR:ns ## COG: lin1914 COG2365 # Protein_GI_number: 16800980 # Func_class: T Signal transduction mechanisms # 
Function: Protein tyrosine/serine phosphatase # Organism: Listeria innocua # 67 349 26 297 298 158 31.0 1e-38 MKYGKIRIEDGFLVFTRHMMINNLPCKDIVWAYMRKEGADEGDDRQLSVNYLVIVTRRKK RYKFDMTEKEIHECIRILKILNPDMATGFPKGGRISLHSLPNTRDLGAIVTADDRHILPR RLLRSGELYHISESDKNRLREEYNLKTVIDLRSAEERKCKPDTIIAEVEYYHVPVVDEDV QVISNREQFVKMLAGLPDDMEEYMIRQYRNLCMDQLVLKQYAKFIDILFRQEKGAVLWHC GTGKDRTGIGTAFLLSLLGVEEDVIYEDYLRTNRYMEPKLVYMQRLVQTWPEADEKMTEK LPIIYNVKEEYLAAVFETVKKTYGSMEKFFQTVFYLKPKMIEEFRNKYLI >gi|226332946|gb|ACII01000073.1| GENE 10 9191 - 9376 79 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579076|ref|ZP_04856347.1| ## NR: gi|253579076|ref|ZP_04856347.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 61 13 73 73 108 100.0 1e-22 MVAVQYALQIVGMIFQTRSKSRVNSNKIADSGVVDHEVWKDKDRRWISGFYQTYDDKQSS M >gi|226332946|gb|ACII01000073.1| GENE 11 9515 - 10459 1024 314 aa, chain + ## HITS:1 COG:FN0623 KEGG:ns NR:ns ## COG: FN0623 COG0679 # Protein_GI_number: 19703958 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 314 4 318 318 141 32.0 1e-33 MENLIFCLNATIPIFLTLMLGLFFRKIGLFDDTFVSRLNKFVFNAALPALLFLDIAKSDF YSVWNTKFVLFCFFVTLISIGISTGLSYLLKPRNIQGEFIQASYRSSAAILGIAIIQNLY GNAGMAPLMIIGSVPLYNVMAVITLSLFAPGGGKLDTAKLIRTLKGIATNPIILGILTGM VWSLLHLPMPTILNTAVDNLGKTATPLGLMAMGGALKFGKAFARPKPLIACSFLKLVGYE AVFLPLAVLLGFSHDMLVAIIIMLGSATTVSCYIMAKNMDYDGDFTSGVVMTTTLLSSFT MTIWLYICKSLGLI >gi|226332946|gb|ACII01000073.1| GENE 12 10477 - 11286 707 269 aa, chain - ## HITS:1 COG:CAC2244 KEGG:ns NR:ns ## COG: CAC2244 COG0561 # Protein_GI_number: 15895512 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 1 269 1 266 266 126 32.0 5e-29 MNRKAVFFDADGTLCDMEKGVPQSTKEALKKLRENGHDAWLCTGRSRAFVSRYLEELPFT GMISACGATIEKDGERLFNKEMPPEVAKKSVEILRRYGLVPVMEGADFMYYDKDEYNTDV NWYTDLITESLGSKWRPIRGNEDCMRINKISAKMIKGCDAEAACRELSEYYDIIRHESGS GIAGTTIELVPKGFNKAVGISAVCRLFDIPWEDTIVFGDSNNDLAMFEYAAVKVAMGNGS 
EKIKALADHITQDMFHYGIRNGLEYLKLI >gi|226332946|gb|ACII01000073.1| GENE 13 11340 - 12608 1173 422 aa, chain - ## HITS:1 COG:CAC0607 KEGG:ns NR:ns ## COG: CAC0607 COG1362 # Protein_GI_number: 15893896 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 6 421 8 431 433 391 47.0 1e-108 MYRKTAKKLLKFIEKSPTAFQAVTEMTKRLDKEGFEELKEEDHWKLKKGGNYYVTRNHSA IIAFSIPQKPVWKFHIMASHSDSPSLKIKENPEIEVENAYIKLNVERYGGMILSPWFDRP LSVAGRLIVRQDGKIREKMVAVDRDLLVIPNLAIHMNREVNDGYKYNVQKDMLPLFSDKE GKGRFMETVAEAAEVKTEDILGHDLFLYDRTPGTLWGVNEEFVSAPRLDDLQCAFSSMEG FLQGNREESISVHCVLDNEEVGSSTRQGAASAFLKDTLMRINMGLGRTQEEYYMALADSF MISADNAHALHPNYTDKTDPVNRPVLNEGIVIKYNANQKYCTDGVSAAIFKDICDRAKVP YQTFVNRSDMAGGSTLGNISNTQVPVKTVDIGLAQLAMHSVYETTGAKDTESLVKAATVF FA >gi|226332946|gb|ACII01000073.1| GENE 14 12741 - 13067 501 108 aa, chain - ## HITS:1 COG:TM1287 KEGG:ns NR:ns ## COG: TM1287 COG0662 # Protein_GI_number: 15644042 # Func_class: G Carbohydrate transport and metabolism # Function: Mannose-6-phosphate isomerase # Organism: Thermotoga maritima # 5 106 18 119 121 100 48.0 9e-22 MKIRREEHMAGGNGHVIIKEILDAEQLNGKCGLYAQVTLEPGCSLGYHEHHGESETYYIL QGQGEYNDNGTYRPVKAGDITFTPDNHGHALANTGNTDLVFMALIIKD Prediction of potential genes in microbial genomes Time: Sat May 28 19:53:21 2011 Seq name: gi|226332945|gb|ACII01000074.1| Ruminococcus sp. 5_1_39B_FAA cont1.74, whole genome shotgun sequence Length of sequence - 17977 bp Number of predicted genes - 23, with homology - 22 Number of transcription units - 8, operones - 5 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 22 - 873 515 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 2 1 Op 2 1/0.000 - CDS 931 - 1740 770 ## COG0294 Dihydropteroate synthase and related enzymes 3 1 Op 3 . - CDS 1780 - 2256 278 ## COG1418 Predicted HD superfamily hydrolase 4 1 Op 4 . 
- CDS 2274 - 2831 692 ## COG0302 GTP cyclohydrolase I - Prom 2884 - 2943 4.9 - Term 2904 - 2941 -0.8 5 2 Op 1 . - CDS 3129 - 4112 483 ## EUBREC_0521 hypothetical protein 6 2 Op 2 . - CDS 4118 - 4576 309 ## EUBREC_0520 hypothetical protein - Prom 4700 - 4759 3.7 7 3 Op 1 . - CDS 4814 - 5227 201 ## Cphy_2149 hypothetical protein 8 3 Op 2 . - CDS 5242 - 6771 1044 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 9 3 Op 3 . - CDS 6808 - 7254 448 ## COG1671 Uncharacterized protein conserved in bacteria 10 3 Op 4 . - CDS 7274 - 7642 205 ## EUBREC_3407 hypothetical protein 11 3 Op 5 . - CDS 7670 - 8080 292 ## COG0242 N-formylmethionyl-tRNA deformylase - Prom 8147 - 8206 5.3 12 4 Tu 1 . - CDS 8283 - 9539 533 ## COG0675 Transposase and inactivated derivatives 13 5 Tu 1 . - CDS 10013 - 10162 165 ## - Prom 10194 - 10253 3.3 - Term 10203 - 10238 -0.9 14 6 Op 1 . - CDS 10455 - 10862 395 ## COG0394 Protein-tyrosine-phosphatase 15 6 Op 2 . - CDS 10855 - 11274 89 ## TDE2638 hypothetical protein 16 6 Op 3 . - CDS 11290 - 11634 208 ## gi|253579096|ref|ZP_04856367.1| predicted protein 17 6 Op 4 . - CDS 11544 - 11948 162 ## PROTEIN SUPPORTED gi|42631237|ref|ZP_00156775.1| COG0494: NTP pyrophosphohydrolases including oxidative damage repair enzymes 18 6 Op 5 . - CDS 11970 - 12260 154 ## EUBREC_0170 DNA/RNA helicase 19 6 Op 6 . - CDS 12299 - 15460 1843 ## COG1061 DNA or RNA helicases of superfamily II - Prom 15591 - 15650 7.3 - Term 15777 - 15825 8.4 20 7 Op 1 . - CDS 15925 - 16287 228 ## COG3304 Predicted membrane protein 21 7 Op 2 6/0.000 - CDS 16357 - 16713 205 ## PROTEIN SUPPORTED gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 22 7 Op 3 . - CDS 16756 - 17463 460 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 17536 - 17595 6.1 23 8 Tu 1 . 
- CDS 17699 - 17977 186 ## Acfer_1417 protein of unknown function DUF567 Predicted protein(s) >gi|226332945|gb|ACII01000074.1| GENE 1 22 - 873 515 283 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 6 281 1 275 278 202 40 1e-51 MDETRLDKIEIKELEVFANHGVYPEENVLGQKFVISATLFTRTRLAGLTDELSASINYGE VSHMITDFTRKHTYKLLESLAENLAEMLLCSLSGLEKITLKIEKPWAPVGLPLKTVSVEI TRKWHTAYIAFGSNMGDKKMYIDNGIRGLAETKGCRIEAISDYLITEPYGVTDQDEFLNG VLKMRTLLTPGELLVRLHQLEQAANRERIIHWGPRTLDLDILFYDQEIIDMPDLHIPHID LHNRDFVLVPMNQIAPYLRHPVLNQTISQLLDSLLNKSENTAK >gi|226332945|gb|ACII01000074.1| GENE 2 931 - 1740 770 269 aa, chain - ## HITS:1 COG:CAC2926 KEGG:ns NR:ns ## COG: CAC2926 COG0294 # Protein_GI_number: 15896179 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Clostridium acetobutylicum # 1 268 1 268 268 328 61.0 8e-90 MKIGNRFFDTKNHTYIMGILNVTPDSFSDGGKWNHMDEALKHTEAMIADGADIIDIGGES TRPGHTPVSADEEALRVLPVIEAVKKHFDIPVSVDTFKSSVAESSIQAGADLVNDIWGLK YDPKMAAVIAKYDVACCLMHNKSNTEYQNFLIDMLAETQECVNLARQAGIKDEKIMLDPG IGFGKTFEMNLEAMNHLELFQNLGFPVLLGTSRKSMIGLALDLPVDQRVEGTLATSVIGV MKGCSFVRVHDVKENKRVIQMTEAILGRR >gi|226332945|gb|ACII01000074.1| GENE 3 1780 - 2256 278 158 aa, chain - ## HITS:1 COG:CAC2925 KEGG:ns NR:ns ## COG: CAC2925 COG1418 # Protein_GI_number: 15896178 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 2 158 5 161 161 145 49.0 3e-35 MDRVNRIWRHPVYQEHYKKIQELESERIFCRHTPEHFLDVARLMYIYALEEHLELPKELI YAAALLHDIGRAQQYQYNIPHDIAGVEIAREILTDLHFTEQEKELILSSIGHHRKGDSCS TLAALLYKADKQSRNCFLCSAASECYWSDDKKNMKIEY >gi|226332945|gb|ACII01000074.1| GENE 4 2274 - 2831 692 185 aa, chain - ## HITS:1 COG:SP0291 KEGG:ns NR:ns ## COG: SP0291 COG0302 # Protein_GI_number: 15900225 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Streptococcus pneumoniae TIGR4 # 2 
176 1 175 184 225 58.0 5e-59 MVDKVKIEQAVRLLLEGIGEDITREGLIDTPDRIARMCEEIYGGLGHEADQHLLKQFPVE NNEIVLEKDITFYSMCEHHLMPFYGKAHLAYIPNGKVTGLSKLARTVEVYSRRPQIQERL TVQIADALERTLDPKGIMVMLEAEHTCMTMRGIKKPGSKTITTVTRGAFTEDKELQKMFL SMVKG >gi|226332945|gb|ACII01000074.1| GENE 5 3129 - 4112 483 327 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0521 NR:ns ## KEGG: EUBREC_0521 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 325 1 322 325 297 44.0 4e-79 MKGSHRIIVESRKVKYDFVIHRNITIITGDSGSGKTVLIDLIHDYGRYGADSGVFLSCDC PCKVIDSEDWERKIEETTGSIIFIDEGNRFLVSKKFAQLVQGSDNYFVLATREKLPALPY SVSEIYGFRKSGKFHDAKQKYNEIYHLYGEISEEKNINPKLVITEDSNSGFEFFKEMSRQ KGVNCFSAGGKSNIIRQLEQRPNEEGTILVIVDGAAFGSEMKDISECIKTQGNIVLYAPE SFEWLLLSTKEIPEVKVETILQNPEEYIDSKEYISWERYFTDLLIESTSKNFIWAYSKKR LTKAYFAPRIVNAVKTIMKLVDWEKSF >gi|226332945|gb|ACII01000074.1| GENE 6 4118 - 4576 309 152 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0520 NR:ns ## KEGG: EUBREC_0520 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 146 1 145 145 169 56.0 3e-41 MLNIIYGDVKNSIYNTNVYFKNSYEPEWLDAELAKKMIKDIDDSEVLSGECINSPVLGQI PPERLSGGVKTLLLILNEPDRIFNASTCGDNCAKWILEIGRIKDVTINLRHMMSFGKDTK FEIKVQNGGETVHSMKELVPIANKYLNEMKQE >gi|226332945|gb|ACII01000074.1| GENE 7 4814 - 5227 201 137 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2149 NR:ns ## KEGG: Cphy_2149 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 4 133 3 132 142 132 53.0 3e-30 MKNNSYDDIINLPHPVSKNHPQMPFRDRAAQFAPFAALTGHDAAIKETARLTDERLELSE EVIAQLNEKINIIRNNIGIEQNVSITYFIPDAKKAGGSYVMCSGIVKKVDEYEHTIIMTD QTVIPIERISDIKCEEW >gi|226332945|gb|ACII01000074.1| GENE 8 5242 - 6771 1044 509 aa, chain - ## HITS:1 COG:SA1196 KEGG:ns NR:ns ## COG: SA1196 COG0389 # Protein_GI_number: 15926944 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Staphylococcus aureus N315 # 4 508 8 419 420 199 30.0 9e-51 
MTEKERTYIAIDLKSFYASVECKERNRDPLTTNLVVADKSRTEKTICLAVSPALKCYGIP GRARLFEVVQKVKEANSARRWKVPNRTFIGSSDDSTELNINPALEIDYIVAPPRMALYLE YSTRIYSIYLKYIAPEDIFPYSIDEVFMDVTDYLHTYNMTARELAMTMIQDVLKTTGITA TAGIGTNMYLCKIAMDIVAKHIKADKDGVRIAELDEMSYRRKLWSHRPLTDFWRVGKGYA KKLEEYGLYTMGDIARCSIGKENELYNEDLLYKLFGVNAELLIDHAWGYEPCTMKMVKAY KPETNSVCSGQVLHCPYDFEKAKLVVKEMTDQMVLDLVDKKLVTDQIVLTVGYDIENLNN TDRKKKYHGEVTIDRYGRRIPKHAHGTTNLKRQTSSTKMITDAVIELYDGIVDRNLLIRR INITANRLVDENSVKKEKVYEQLDLFTDYEAQRKKQEEEEAALDREKRMQEAMLSIKKKF GKNAMLKGMNLQEGATARDRNEQIGGHKA >gi|226332945|gb|ACII01000074.1| GENE 9 6808 - 7254 448 148 aa, chain - ## HITS:1 COG:CAC2825 KEGG:ns NR:ns ## COG: CAC2825 COG1671 # Protein_GI_number: 15896080 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 2 140 5 143 148 215 72.0 2e-56 MVDADACPVVSIVERVAKEHNLPVTLLCDTNHVLSSDYSEVIVVGAGADAVDYKLISICH KGDIIVSQDYGVAAMVLGKGAYAIHQSGKWYTNENIDQMLMERHLNKKARRAFGKNHLKG PRKRISEDDEHFRESFEKMIHMAMDKEK >gi|226332945|gb|ACII01000074.1| GENE 10 7274 - 7642 205 122 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3407 NR:ns ## KEGG: EUBREC_3407 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 119 3 120 127 160 69.0 9e-39 MLKNVLIVVDDIEKSIEFYRDLFGLQVILKNEGNVILSEGLVLQDADIWGKTLNETSTPF NNMMELYFEDFDIDGLLAKYESGKYFVRYATELTELAGGQKLVRLYDPSGNLIEVRTPFR CN >gi|226332945|gb|ACII01000074.1| GENE 11 7670 - 8080 292 136 aa, chain - ## HITS:1 COG:SP1549 KEGG:ns NR:ns ## COG: SP1549 COG0242 # Protein_GI_number: 15901392 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Streptococcus pneumoniae TIGR4 # 1 136 1 136 136 174 58.0 4e-44 MVKQIVRDIFFLGQPSKPATKADIQIGKDLQDTLQANREWCVGMAANMIGVRKNIIIVNM GFIDVVMFNPVIVSKHDMYETEEGCLSLDGVRKTTRYQEIEVEYYDFNWKKQRQKLSGWT AQICQHEIDHLSGKII >gi|226332945|gb|ACII01000074.1| GENE 12 8283 - 9539 533 418 aa, chain - ## HITS:1 COG:TVN1222 KEGG:ns NR:ns ## COG: TVN1222 COG0675 
# Protein_GI_number: 13542053 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Thermoplasma volcanium # 17 392 15 384 416 116 26.0 1e-25 MLLSKKTSIKVSWEYANLIGHMCYAASKLWNVCNYERQHYKETGMEQYPDWYYQKKAHKE ELWYKQLPSQTAQEVCRLLDKAWKSFYALKRSGGIEAPRPPRFKQESIPITYMQMGIVHE RDTDRVRLSLPKTLKKYMEETYQIHENFLYLENKIFRGMDQIKQLRIYPPEKGSCKMIVV YEVPDQEELPQNGHELSIDLGLHNLMTCYDSGNGKTFILGRKYLALERYFHKEIARVQAQ WYGQQSGKGVKHPVTSKHIRKLYKRKQDSVTDYLHKVTRYLAEYCREQGITCVVTGDIRN IRREKDLGHRTNQKFHSLPYNRIYIMLEYKLKRYGIRFIKQEESYTSQCSPLSPEVGKRY AEPSNRKERGSYRDGNRVYNADAVGAYNILRKYHSVSGIKRELSITGLKTPEIIKVAV >gi|226332945|gb|ACII01000074.1| GENE 13 10013 - 10162 165 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKGYNTSNGYMGYVEGKYILFASEQEYFEFMHYRGYIYVILELVLKDLG >gi|226332945|gb|ACII01000074.1| GENE 14 10455 - 10862 395 135 aa, chain - ## HITS:1 COG:CAP0105 KEGG:ns NR:ns ## COG: CAP0105 COG0394 # Protein_GI_number: 15004808 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Clostridium acetobutylicum # 3 131 2 129 136 155 60.0 2e-38 MNKKKVAFICVHNSCRSQIAEALGKHLASDIFESYSAGTETKSQINQDAVRIMKEIYGID MEANGQYSKLIADIPDVDIAISMGCNVGCPFIGRAFDDNWGLDDPTGKSDDDFKTVIQRI EENIIELKKRLSENK >gi|226332945|gb|ACII01000074.1| GENE 15 10855 - 11274 89 139 aa, chain - ## HITS:1 COG:no KEGG:TDE2638 NR:ns ## KEGG: TDE2638 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 29 137 98 206 208 172 70.0 4e-42 MLLIIVVFGKLFLQCRKLNIRLIPQSLNRGKAVPGGVCGFWGACGVGISAGVFISIISGA TPLKNESWGLANKMTFKALDAIGSIGGPRCCKRDSYMAIISAIDYVAENFNIQMEKPVIK CIHSGKNNQCIKERCPFHE >gi|226332945|gb|ACII01000074.1| GENE 16 11290 - 11634 208 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579096|ref|ZP_04856367.1| ## NR: gi|253579096|ref|ZP_04856367.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 114 1 114 114 225 100.0 6e-58 MAYEGSTEQRCLVTGRYHFDRENSKEHGELISNIIFRGIAMIYTVKRIDEDLDFGCEERP KGMPVMAIVTLTNSSGEEIRIKAEDAMLYECNINEGDKVCFDVNNKLEKVALPN >gi|226332945|gb|ACII01000074.1| GENE 17 11544 - 11948 162 134 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631237|ref|ZP_00156775.1| COG0494: NTP pyrophosphohydrolases including oxidative damage repair enzymes [Haemophilus influenzae R2866] # 2 134 4 134 136 67 31 9e-11 MKTIKVVAAVICDNMKEKNKIFATARGYGELKGGWEFPGGKIEAGETPQEALKREIMEEL DTEIKVGDLIDTIEYGYPTFHLSMDCFWAEVTAGHLELKEAEAAKWLTKDQLNSVAWLPA DITLIEKIRRNMEN >gi|226332945|gb|ACII01000074.1| GENE 18 11970 - 12260 154 96 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0170 NR:ns ## KEGG: EUBREC_0170 # Name: not_defined # Def: DNA/RNA helicase # Organism: E.rectale # Pathway: not_defined # 16 56 1 41 771 69 80.0 3e-11 MTNSGTGKIYASALAMRELGFKRVLFLVHRVTLAKQAKKSFEKVFDKKVTMGLVGARNAR KEDNRGRMQPITKVTMKMHHPVREDLLRYLQSNITA >gi|226332945|gb|ACII01000074.1| GENE 19 12299 - 15460 1843 1053 aa, chain - ## HITS:1 COG:CAC2824_2 KEGG:ns NR:ns ## COG: CAC2824_2 COG1061 # Protein_GI_number: 15896079 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Clostridium acetobutylicum # 328 772 3 434 616 281 41.0 4e-75 MLKNGIYEQIVNTRINKELQQLDREKYDIELEKLDGEDARRILTIYISYVINQALQYLRD SFPSSKENESLLAQIRLCNEIVQEIAEHTNEPEFEDNIILEKGEVLTSLYEKMNSARSIN TIKAVHPETSIVENALFTGSKNEPSMLSELKKEILSSDSIDLLVSFIKWSAIRPLLVELT AFTKREGVRLRVIATTYTQATDYKAIVALAELPNTEVKINYETNHARMHAKSYLFKRDTG FSTAYIGSSNLSNPALTGGLEWNVKVTEKESFDIVKKFSVSFESYWNDASFETFDPENPD CHEKLKKELTRAAFDRKSKRSLQVSIRPYAYQQEILDNLAAEREVYGHYRNLVSAATGVG KTIIAAFDYKAFKEKNPRARLLFVAHRKEILEQSIQKFQEVLNDFNFGELYVDGYKPTDI EHLFISIQSFNSAKLSQWTSKDYYDFIIVDEFHHAAADSYQELLAYYEPKILLGLTATPD RMDGKDILKYFDGRIASKLLLGEAIDRNLLSPFQYLGVTDDTDYRNFKWTRGKYDVSELE KVYTADTRRCALILNSVKKYVTDISSVKGLGFCVSVAHANYMAKYFNDKGIPSIALSSKS ADDIRNDAKTDLLSGKINFIFVVDLYNEGVDIPAIDTVLFLRPTESATVFLQQLGRGLRL 
SPGKDCLTVLDFIGQANKKYNFAMKFEAIVGKGRKSIRKQVEDGFSNLPRGCYIELEKYA KEYILENLKQTDNNKRALIEMVRTFEEDTGLPLNLENFLLEYNMSLYDFYQNTGARSLFR LKKWAGIIEDDRDVDDKIYSMMTGFFHVNSARLLDYWIRYIDGNKVANNEEERIMRNMLY YTFYKKNPAKCGFQSIDEGIESVLKETFVRDEVLQILKYNRKHISFVAGRNEYSYLCPLD LHCRYNTNQIMAAFGLFTETESPEFREGVKRFEDKKTDIFFINLNKSEKDFSPSTMYEDY AINDKLFHWQSQSQDRQASAKIQRYINHKNTGDIISLFVREFKRLGSYTAPYTFLGNADY VKHEGERPVSFVWKLHNSIPADMLPKANKSIAL >gi|226332945|gb|ACII01000074.1| GENE 20 15925 - 16287 228 120 aa, chain - ## HITS:1 COG:VCA1051 KEGG:ns NR:ns ## COG: VCA1051 COG3304 # Protein_GI_number: 15601802 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Vibrio cholerae # 1 119 2 130 143 84 40.0 7e-17 MRTIGNILWFIFGGLLGGLAWVFAGCIWCITIIGIPVGLQCFKFATLAFWPFGKEIVYGN GMFSFLVNLIWIVFFGWEMALGNLIVGCIWCITIVGIPFGKQFFKMARLSFMPFGASVIG >gi|226332945|gb|ACII01000074.1| GENE 21 16357 - 16713 205 118 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 [Streptococcus pneumoniae SP3-BS71] # 1 117 1 123 126 83 39 9e-16 MKFKMIHENYNVSDLDRSIKFYNEALDLHEVRRKTTDDFIIVYLSNDVSDFELELTWLKD HPQPYNLGECEFHLAFKAEDYEAAHKKHEEMGCICFENEAMGIYFIVDPDGYWLEILP >gi|226332945|gb|ACII01000074.1| GENE 22 16756 - 17463 460 235 aa, chain - ## HITS:1 COG:SSO3004 KEGG:ns NR:ns ## COG: SSO3004 COG1028 # Protein_GI_number: 15899712 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Sulfolobus solfataricus # 6 225 3 249 299 154 39.0 2e-37 MEFKNKVVIVTGGAQGIGKTIAEEFQKQGALICSIDKQLGVYFTGDLAQQDTLEKFAQKV ITEYGHVDYLINNALPLMKGIDNCSYEEFNYALKVGVTAPFYLAKLFQPYFSEGAAIVNI SSSRDRMSQPQIESYTAAKGGISALTHALAVSLAGRVRVNSISPGWIDTDFTEYTGADAV QQPVHRVGNPMDIANMVLFLCSEKAGFITGENICIDGGMTKQMIYHGDCGWRFDI >gi|226332945|gb|ACII01000074.1| GENE 23 17699 - 17977 186 92 aa, chain - ## HITS:1 COG:no KEGG:Acfer_1417 
NR:ns ## KEGG: Acfer_1417 # Name: not_defined # Def: protein of unknown function DUF567 # Organism: A.fermentans # Pathway: not_defined # 1 91 65 155 158 116 51.0 2e-25 VYKNGSYIGCLSKEFSFLTPHYNIDYNGWHIDGTIMEWDYSILDRSGYSIARVSKELFHM TDTYVIDVQDPGNALGALMFVLAIDAEKCSRN Prediction of potential genes in microbial genomes Time: Sat May 28 19:53:59 2011 Seq name: gi|226332944|gb|ACII01000075.1| Ruminococcus sp. 5_1_39B_FAA cont1.75, whole genome shotgun sequence Length of sequence - 38185 bp Number of predicted genes - 38, with homology - 35 Number of transcription units - 18, operones - 13 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 106 - 465 143 ## gi|253579105|ref|ZP_04856376.1| conserved hypothetical protein - Prom 492 - 551 8.2 + Prom 475 - 534 7.3 2 2 Op 1 . + CDS 606 - 1802 804 ## COG1373 Predicted ATPase (AAA+ superfamily) + Prom 1806 - 1865 3.3 3 2 Op 2 . + CDS 1976 - 2149 187 ## gi|253579106|ref|ZP_04856377.1| conserved hypothetical protein + Term 2192 - 2256 14.5 - Term 2183 - 2242 7.1 4 3 Op 1 . - CDS 2287 - 2469 72 ## 5 3 Op 2 . - CDS 2538 - 3752 1449 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 3864 - 3923 10.1 6 4 Op 1 21/0.000 - CDS 4003 - 5442 1612 ## COG0064 Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) 7 4 Op 2 31/0.000 - CDS 5442 - 6911 1697 ## COG0154 Asp-tRNAAsn/Glu-tRNAGln amidotransferase A subunit and related amidases 8 4 Op 3 1/0.000 - CDS 6923 - 7216 441 ## COG0721 Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit 9 4 Op 4 . - CDS 7279 - 8610 1573 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases - Term 8912 - 8956 -0.1 10 5 Op 1 . - CDS 9002 - 9556 494 ## Cphy_0629 response regulator receiver/ANTAR domain-containing protein - Prom 9703 - 9762 1.7 11 5 Op 2 . - CDS 9805 - 11127 1125 ## COG0174 Glutamine synthetase - Prom 11294 - 11353 10.3 - Term 11233 - 11284 6.3 12 6 Op 1 . 
- CDS 11420 - 11782 270 ## EUBREC_1122 hypothetical protein - Prom 11850 - 11909 2.1 13 6 Op 2 . - CDS 11921 - 12022 73 ## - Prom 12198 - 12257 8.3 - Term 12086 - 12132 2.1 14 7 Op 1 . - CDS 12337 - 13509 176 ## gi|295107843|emb|CBL21796.1| hypothetical protein 15 7 Op 2 . - CDS 13513 - 13899 144 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 16 7 Op 3 . - CDS 13944 - 14207 94 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 17 7 Op 4 . - CDS 14204 - 15049 416 ## COG0789 Predicted transcriptional regulators - Prom 15069 - 15128 6.7 - Term 15082 - 15115 -0.2 18 8 Op 1 . - CDS 15143 - 15322 130 ## SSUBM407_0953 hypothetical protein 19 8 Op 2 . - CDS 15355 - 15633 138 ## EUBELI_01803 integrase/recombinase XerC 20 9 Tu 1 . + CDS 15761 - 16096 195 ## Balac_0969 hypothetical protein + Term 16131 - 16170 -1.0 - Term 16278 - 16339 17.4 21 10 Op 1 21/0.000 - CDS 16358 - 17842 1807 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 17901 - 17960 10.0 22 10 Op 2 . - CDS 17964 - 22517 5542 ## COG0069 Glutamate synthase domain 2 23 10 Op 3 . - CDS 22534 - 24159 1964 ## COG0504 CTP synthase (UTP-ammonia lyase) - Prom 24313 - 24372 6.3 - Term 24441 - 24501 11.4 24 11 Op 1 . - CDS 24533 - 24760 264 ## EUBREC_1515 hypothetical protein - Prom 24916 - 24975 4.1 25 11 Op 2 . - CDS 25030 - 26451 1503 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase - Prom 26541 - 26600 8.9 + Prom 26610 - 26669 6.8 26 12 Op 1 . + CDS 26868 - 28967 2164 ## COG3968 Uncharacterized protein related to glutamine synthetase + Prom 29180 - 29239 5.5 27 12 Op 2 . + CDS 29266 - 30978 1860 ## COG0004 Ammonia permease + Term 31008 - 31058 12.2 28 13 Op 1 . - CDS 31317 - 31475 185 ## gi|253579129|ref|ZP_04856400.1| predicted protein 29 13 Op 2 . - CDS 31444 - 31686 119 ## gi|253579130|ref|ZP_04856401.1| predicted protein 30 13 Op 3 . - CDS 31729 - 32052 207 ## 31 13 Op 4 . 
- CDS 32089 - 32472 273 ## gi|253579131|ref|ZP_04856402.1| predicted protein - Prom 32492 - 32551 4.6 32 14 Op 1 . - CDS 32709 - 33254 236 ## EUBREC_2446 hypothetical protein 33 14 Op 2 . - CDS 33254 - 33403 138 ## EUBREC_2447 hypothetical protein - Prom 33581 - 33640 9.8 + Prom 33572 - 33631 9.6 34 15 Op 1 . + CDS 33786 - 34028 148 ## EUBREC_2448 hypothetical protein + Term 34071 - 34106 -0.5 + Prom 34030 - 34089 8.0 35 15 Op 2 . + CDS 34120 - 34827 293 ## COG3279 Response regulator of the LytR/AlgR family + Term 34923 - 34983 2.4 + Prom 35395 - 35454 2.6 36 16 Tu 1 . + CDS 35662 - 36132 -26 ## EUBREC_2450 hypothetical protein + Term 36145 - 36182 2.2 + Prom 36536 - 36595 5.2 37 17 Tu 1 . + CDS 36662 - 37135 395 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family - Term 36965 - 37007 4.0 38 18 Tu 1 . - CDS 37044 - 37697 700 ## COG2860 Predicted membrane protein - Prom 37826 - 37885 3.8 Predicted protein(s) >gi|226332944|gb|ACII01000075.1| GENE 1 106 - 465 143 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579105|ref|ZP_04856376.1| ## NR: gi|253579105|ref|ZP_04856376.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 119 51 169 169 226 100.0 4e-58 MYRQPDSLLNFRKKVKKFAFDHCKTCSEHYEELAQEQNIALANFRRKKLSQTGIRGSNGN IKYLTFDEASYLAKVSRSMINKWADKGKFTVIKVGSRVRIRRDEFEDWMEQRDLERSMQ >gi|226332944|gb|ACII01000075.1| GENE 2 606 - 1802 804 398 aa, chain + ## HITS:1 COG:HI1038 KEGG:ns NR:ns ## COG: HI1038 COG1373 # Protein_GI_number: 16272972 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Haemophilus influenzae # 4 397 6 400 400 263 38.0 6e-70 MKTITRAKYLDRIIELNGTPDIKIITGIRRSGKSKLMQAYIAYLKNNFENINIIFIDFMD LAYEEIKEYHALHSYVEEHYQEGKTNYLFVDEVQMCPKFELAINSLYSKGKYDIYVTGSN AFLLSADLATLFTGRYIEIHVFPFSFQEYCQYYDDISDKDKLFDEYAIKGGLAGSYAYRT EKDRTNYIKEVYETIVTRDLVQKYNLPDTLVLQRLSEFLMDNISNLTSPNKISQLLTANE TPTNHVTVGKYIKYLCNAFVFYDIKRYDIRGKKYLESSEKFYLCDSGIRYAILGSRNMDY GRVYENIVCIELLRRGYDVYVGKLYQKEIDFVAQRGSEKFYIQVSDNISGQETFERERSP LLQIRDAYPKMIIARTKHPQYSYEGIQIHDITDWLLQE >gi|226332944|gb|ACII01000075.1| GENE 3 1976 - 2149 187 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579106|ref|ZP_04856377.1| ## NR: gi|253579106|ref|ZP_04856377.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 57 1 57 57 81 100.0 1e-14 MKPDIEELRRKYMDNPPEGMTSMDIRRMSEDELLDMDYFLNEDELDDYFGEEGFYIF >gi|226332944|gb|ACII01000075.1| GENE 4 2287 - 2469 72 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEQLFGVELRSEADDFIHLTMKMKNGRTKILENQGFRTCRFYFIKIYMEFDYYLSINVVF >gi|226332944|gb|ACII01000075.1| GENE 5 2538 - 3752 1449 404 aa, chain - ## HITS:1 COG:MTH52 KEGG:ns NR:ns ## COG: MTH52 COG0436 # Protein_GI_number: 15678081 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanothermobacter thermautotrophicus # 1 404 1 408 410 520 61.0 1e-147 MVTVNHNYLKLPGSYLFSTIGKKVKAYKEANPQANVISLGIGDVTQPLAPAIIEALHKSV DEMGDAATFHGYAPDLGYEFLRSAIAKNDYKDRGCDIEADEIFVSDGAKSDSGNIQEIFG LDNKIAVCDPVYPVYVDTNVMAGRTGEYNKERGNFDNVIYMPCTASNGFLPEFPEEVPDL IYLCFPNNPTGGAITKPQLQEWVDYANKNGSVIIYDAAYEAYISEEDVPHSIYECEGARS CAIELRSFSKNAGFTGVRLGFTVVPKDLVRDGVDLHSLWARRHGTKFNGAPYIVQRAGEA VYSPEGKAQLKEQVGYYMSNAKAIYEGLASAGYSVSGGVNAPYIWLKTPDKMTSWEFFDY LLEKANIVGTPGSGFGAHGEGFFRLTAFGTQENTLEAIERIKNL >gi|226332944|gb|ACII01000075.1| GENE 6 4003 - 5442 1612 479 aa, chain - ## HITS:1 COG:CAC2669 KEGG:ns NR:ns ## COG: CAC2669 COG0064 # Protein_GI_number: 15895927 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) # Organism: Clostridium acetobutylicum # 3 478 2 476 476 479 49.0 1e-135 MKQYETVIGLEVHVELATKTKIFCGCSTEFGGAPNTHTCPVCTGMPGSLPVLNKKVVEFA LKAGLAANCQIHQYCKFDRKNYFYPDNPQNYQISQLYLPICHDGWVEIETSAGKKKIRIH EMHMEEDAGKLIHDEWEDCSLVDYNRSGVPLIEIVSEPDMRSSEEVIAYLEKLRCMMQYL GVSDCKLQEGSMRADVNLSVREVGAEEFGTRTEMKNLNSFKAIARAIEGERERQIELIEE GKAVTQETRRWDDNKEYSYAMRSKEDAKDYRYFPDPDLPPIHISDAWIEKIKAEQPELRE VKQARYQEEYGLPAYDAGILTESRHLAGLFEETAAIYGNAKKTANWFMGEVLRLTKDKAM DAEQVSFSPKHLADLLVMVEKSEVSPQNAKKVFEKVFEEDIDPVVYIEEHGLKIVEDTGL LEETINRILDANPGPLSELLGGKDKVMGFFVGQIMKEMKGKANPASVRETLMKEVEKRK >gi|226332944|gb|ACII01000075.1| GENE 7 5442 - 6911 1697 489 aa, chain - ## HITS:1 COG:CAC2977 KEGG:ns 
NR:ns ## COG: CAC2977 COG0154 # Protein_GI_number: 15896230 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase A subunit and related amidases # Organism: Clostridium acetobutylicum # 1 477 1 478 478 510 54.0 1e-144 MELEKLTALQLGEKIKQRQVSVLDGVKTVFEQIEKQDSEVHAYLDTYKEEAYKRAEEVQK GIEDGTYTSPLAGVPIAIKDNICINGKKTTCASKILENFVPQYNAEVIDRLEKAGLVIIG KTNMDEFAMGSTTETSAYGITRNPWNLEHVPGGSSGGSCAAVAAGETYLALGSDTGGSIR QPSSYCGVTGIKPTYGTVSRYGLVAYASSLDQIGPVGKDVSDCAALLEIIAGHDTKDSTS MKREDLQFSKELTGDIKGMKFGVPEEYLAEGLDPEVKASFMGVLDTLKELGAEVEFFSIK TMEYMIPAYYIIASAEASSNLERFDGVKYGFRAAEYEGLHDMYKKTRTAGFGEEVKRRIM LGSFVLSSGYYDAYYLKALRTKALIKKEFDQAFGKYDVLLAPASPFTAPKIGESLKDPLA MYLGDIYTVAVNLCGLPGITVPCGKDSKGLPIGIQMIGDCFMEKKILRAAHAYETSRGSF AVPGEGGTR >gi|226332944|gb|ACII01000075.1| GENE 8 6923 - 7216 441 97 aa, chain - ## HITS:1 COG:lin1868 KEGG:ns NR:ns ## COG: lin1868 COG0721 # Protein_GI_number: 16800934 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit # Organism: Listeria innocua # 5 93 4 92 97 64 43.0 4e-11 MAKIIDDETMENVCILAKLSLSEEEKEKAKAEMQKMLDYVEKLDELDTSEVEPMSHIFQD ENVFREDVVTNGDNKEAMLANAPKAKEGQYQVPKTIG >gi|226332944|gb|ACII01000075.1| GENE 9 7279 - 8610 1573 443 aa, chain - ## HITS:1 COG:CAC2979 KEGG:ns NR:ns ## COG: CAC2979 COG0017 # Protein_GI_number: 15896232 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 26 443 16 430 430 371 46.0 1e-102 MEFMDGTWKKEINTWEELVGNNLLNQEAVIEGAVHSIRNMGDVAFIILRRREGLFQTVYE NEMANVSIHELKEAMTIRVKGILHEEERAPHGRELRIRHIDILSTPAEPLPMAIDKWKLN TSLEAKLNYRPISLRNIQERSKFKIQEALTKAFRDYLYGQGFTEIHTPKIGARGAEGGAN LFKFSYFHKPAVLAQSPQFYKQMMVGVFDRVFETGPVFRAEKHNTKRHLNEYTSLDFEMG YIDSFEEIMAMETGFLQYAMNLLKTEYAKEVQILKLDIPDVSKIPAVRFDVAKELVSQKY NRKIRNPFDLEPEEEALIGQYFKEEYGSDFVFVTHYPSKKRPFYAMDDPEDARFTLSFDL 
LFKGLEITTGGQRIHDYNMLVQKIEDRGMTQEGMEQYLDTFKHGMPPHGGLGIGLERLTM QLIGEENVRETCLFPRDMNRLEP >gi|226332944|gb|ACII01000075.1| GENE 10 9002 - 9556 494 184 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0629 NR:ns ## KEGG: Cphy_0629 # Name: not_defined # Def: response regulator receiver/ANTAR domain-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 4 180 3 180 180 149 50.0 4e-35 MSSSIVIALPKIEDAKKIRTILNRHGLTVASVCSSVSNALANISELDSGVLICGYKLTDA YYKDALDDLPQYFEMLLLASQRIIDEAPNSVSTVQMPMKASALIDTVNDMLYHLERRIKK EKKKPKPRSEKEQNYISNAKRLLMEKNQMTEEEAYRHIQKCSMDSGTNMVETAQMLLMLM YDEI >gi|226332944|gb|ACII01000075.1| GENE 11 9805 - 11127 1125 440 aa, chain - ## HITS:1 COG:BH2360 KEGG:ns NR:ns ## COG: BH2360 COG0174 # Protein_GI_number: 15614923 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Bacillus halodurans # 1 437 1 446 449 459 51.0 1e-129 MGNYTREEILQMAEEEDVEFIRLQFTDMFGTLKNIAITARELPRALDNRCIVDGSFIAGI AGESEPDMYLRPDLDTFAILPWRPQQGKVARLLCDIYCPDGTPHERSPRYILKKTAREAK KEGYTCLVDPECEFFLFHTDDNGVPTTVTHEKAGYLDVSPVDLGENARRDIVLNLEDMGI EVESSHHETAPAQHEVDFKYGEVRTIADRIMSFKMTVRTIAKRHGLHATFMPKPRAEVNG SGMHIHFSLFKDGRNVFVNPGNPQELSEEAYYFVGGLLAHSKEMALITNPIVNSYKRLVP GYEAPTELTWTKNNQNSLVRIPGSRGMETRIELRSPDAAANPYLVFAVCLAAGLDGINKK IYPTKSSSRELSETDQKTMKIENLPGNLNEAIDYFEQSDWIKEVLGTEFCKEYAAAKKKE WLRYTREISAWEIEEYLYRI >gi|226332944|gb|ACII01000075.1| GENE 12 11420 - 11782 270 120 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1122 NR:ns ## KEGG: EUBREC_1122 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 120 2 121 122 222 87.0 3e-57 MKTLNDYMSMSYRMEVVEDKTEGGFVVSYPDLPGCITCGETVESAISNALDAKKAWLEAA LEEGVEIHEPDSLEDYSGQFKLRIPRSLHRLLAEHSQREGISMNQYCVYLLSRNDAVFSK >gi|226332944|gb|ACII01000075.1| GENE 13 11921 - 12022 73 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKWDKLITRICNLSKDLRFDELRKVLKVMDMK >gi|226332944|gb|ACII01000075.1| GENE 14 12337 - 13509 176 390 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295107843|emb|CBL21796.1| ## NR: 
gi|295107843|emb|CBL21796.1| hypothetical protein [Ruminococcus obeum A2-162] # 1 390 1 390 394 582 94.0 1e-164 MYQLLLKQRFKFAVRMIKRNIGAYLILIAIVPCILILQSMIQTTHGMDIAVFIPYGILCF LFMNLFGTIPRMRISPETYIWKIEKTYIWRVKKIGKSTVLTLCVALILFWAFPMEGIVLR QFIVVMICNPIVNMHTVLSTQYKYRDITNICMLAVVMICFFNRSVFVASVLLIIYGLYFI LYPYFSYQAIIPFFQQYNQMLYGFLNREYDTVLQTQDEVFFQNNQREMKGLMKKYYGNRY RFLMVYEVVSVIRRKKQIMYQYFAIVLITMLLSQIEGKYIEYVLLFSILILGQNVVSYNV QNELRLVKKGFPLRYSLKEKLQCKSVIGMFLLLLPMLCYWIAYKFELFSILLWIWMPIQS IIIYTAKNRLQRLLYILPFYILSLLSMRVW >gi|226332944|gb|ACII01000075.1| GENE 15 13513 - 13899 144 128 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 109 112 219 245 60 31 1e-14 KKQEAFDKIERILDGLGLKKYENYSPSELSKGTMQRLNNACAMIQEEKITIFDEPFSGLD PVQMKLLENVMVEFHKKEKGIYIISSHDIECLQNVCDKCLLLHNGQIREINTEEVDREKI AKILSEGE >gi|226332944|gb|ACII01000075.1| GENE 16 13944 - 14207 94 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 71 1 76 245 40 31 1e-14 MIQISNLTKKYGETAVFSNLNYTFENTCIYGLVGRNGAGKTTLLNILSNYDKNYTGVISI DGEKMNKVDYLDMPVAYAMDQPSFLMS >gi|226332944|gb|ACII01000075.1| GENE 17 14204 - 15049 416 281 aa, chain - ## HITS:1 COG:PM1941 KEGG:ns NR:ns ## COG: PM1941 COG0789 # Protein_GI_number: 15603806 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Pasteurella multocida # 2 120 6 130 132 71 33.0 2e-12 MKIKNVEERTGLSRSNVRFYEKEKLIEPSRNESNGYRDYSENDVENIKKIAYLRTLGISI EDIRSIISGKVTLQETLERQNEILNSQITDLNKAKRMCEKMLGEESISYEKLQVEQYVTE LQDYWKDNQTVFKLDSVSFLYIWGSMLTWITITVLCLIIGTLSYSKLPTEIPVQWSKGVA TSLVNKNWIFICPVICVIIRYLLKPFIYAKLQMNNYYGEIITEYLTNYMCFIVLSVEVFS ILFTFGVVKSVVVLLFVNTAVFMGLLVVGLTKMDLRGKEIL >gi|226332944|gb|ACII01000075.1| GENE 18 15143 - 15322 130 59 aa, chain - ## HITS:1 COG:no KEGG:SSUBM407_0953 NR:ns ## KEGG: SSUBM407_0953 # Name: not_defined # Def: hypothetical protein # Organism: S.suis_BM407 # Pathway: not_defined # 1 58 66 
123 124 61 51.0 9e-09 MAGIEEQAQKRFERIVEQMAESEGITEQLKATDQVAWVGEMNNIWSRAREVVNAELIYN >gi|226332944|gb|ACII01000075.1| GENE 19 15355 - 15633 138 92 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01803 NR:ns ## KEGG: EUBELI_01803 # Name: not_defined # Def: integrase/recombinase XerC # Organism: E.eligens # Pathway: not_defined # 1 88 197 284 400 124 71.0 7e-28 MLYWCGIREGELLALTAADFNFEKETVRINKFYQRLHGEDVITTPKTKKSNRMIKMPKFL CEEMQEYLQMLYGLKKKDRIFTVTKNIVEVYI >gi|226332944|gb|ACII01000075.1| GENE 20 15761 - 16096 195 111 aa, chain + ## HITS:1 COG:no KEGG:Balac_0969 NR:ns ## KEGG: Balac_0969 # Name: not_defined # Def: hypothetical protein # Organism: B.animalis_lactis_Bl-04 # Pathway: not_defined # 2 90 34 123 133 99 55.0 3e-20 MHTLFTLEDLYGLTVSESDEDICLKVDTSKDKDAHELWKMLYAWKEQADKLSSEEISKDK YDEGRYHYPEFDTTQTWAKVPSQKLSNALVGIFQDRLKPDKYGFICCTTTY >gi|226332944|gb|ACII01000075.1| GENE 21 16358 - 17842 1807 494 aa, chain - ## HITS:1 COG:sll1027 KEGG:ns NR:ns ## COG: sll1027 COG0493 # Protein_GI_number: 16329369 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Synechocystis # 1 494 1 493 494 552 55.0 1e-157 MGKPTGFLEYKRETAKVLPPKVRIANFKEFRTPLNKEKQKLQGARCMACGVPFCQSGMMI GGMASGCPLHNLVPETNDLVYTGNWKQAFLRLSKTHSFPEFTSRVCPALCEAACTCNLNG EPVSTKENERAIIETAYENGWVTPQIPEVRTGKSIAVIGSGPSGLAAAQQLNRRGHSVTV FERKDRVGGLLRYGIPNMKLDKAVVDRRIHLMEEEGVKFVTNVNVGKDIKAEELLKQFDR VILACGASNPRDIKVPGRDAKGIYFAVDFLGQVTKALLDSDFAKVPYELAKGKNVLVIGG GDTGNDCVGTSIRLGAKSVIQLEMMPKPPVERTPSNPWPQWPRVLKTDYGQEEAIAVFGH DPRVYQTTVTEFIKDKNGNVCQAKLTKLKSEKDPKTGRMMMVPVEGTEEVIPVDVVLIAA GFLGSEKYVTDAFKVAINGRTNVDTKPDQYQTSVDKVFTAGDMHRGQSLVVWAIREGREA AKAVDESLMGYSYL >gi|226332944|gb|ACII01000075.1| GENE 22 17964 - 22517 5542 1517 aa, chain - ## HITS:1 COG:sll1502_2 KEGG:ns NR:ns ## COG: sll1502_2 COG0069 # Protein_GI_number: 16329610 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # 
Organism: Synechocystis # 390 1186 1 801 801 946 58.0 0 MKEVREQQQQQGMYRPEFEHDNCGIGAVVNIKGKKTHDTVANALKIVEHLEHRAGKDAEG KTGDGVGILLQISHKFFKKACKKEGFEIGDEREYGIAQFFFPQHEIKRAQAKKMFEIILE KEGLELLGWRQVPVFPEVLGHKARECMPCIMQAFIKKPEDVEKGLPFDRMLYIARREFEQ SNDNTYVVSMSSRTIVYKGMFLVGQLRTFFEDLQSPDYESAIAMVHSRFSTNTNPSWERA HPNRFMVHNGEINTIRGNADKMLAREENMESPYLEGQLHKVLPVVNRNGSDSAMLDNTLE FLVMSGMELPLAVMITIPEPWANNDTISQAKRDFYQYYATMMEPWDGPASILFSDGDVMG AVLDRNGLRPSRYYITSDGYMILSSEVGVLPIPEEKIVLKERLHPGKMLLVDTVKGKVID DNELKEGYAKMQPYGEWLDSHLVQLKDIKIPNERPEEYTPEQRARLQKAFGYTYEQYRTS IRNMALNGAESIGAMGVDTPLAVFSKQNRPLFDYFKQLFAQVTNPPIDAIREEIVTSTSV YVGKDGNLLEQKPENCQVLKINNPILTNTDMMKIKSFKHEGFKTAVVSTLYYKSTKLERA IDRLFVEVDKAFREGANILVLSDRGVDENHMPIPSLLAVSAVHQHLVKTKKSTSLAIILE SGEPREVHHFATLLGYGASAINPYLALETIHELIDSHMLEKDYYAAVDDYNHAVISGIVK IASKMGISTIQSYQGSQIFEAIGISSDVIEKYFTGTVSRVGGITLEDIEENVEKLHSEAF DPLGLPTDLTLDSVGAHKMRSQGEEHRYNPQTIHLLQQSTWTGSYDLFKQYTDLVDKENH GNLRGLLDFKFAETPVPLEEVESVDDIVKRFKTGAMSYGSISQEAHETLAIAMNHLHGKS NTGEGGESDERLDSAGSSDDRCSAIKQVASGRFGVTSRYLVSAREIQIKMAQGAKPGEGG HLPAKKVYPWIAKTRHSTPGVSLISPPPHHDIYSIEDLAQLIYDLKNANKYADISVKLVS EAGVGTVAAGVAKAGAQTILISGYDGGTGAAPRSSIHNAGLPWELGLAETHQTLLKNGLR NRVRIETDGKLMSGRDVAIAALMGAEEFGFATAPLVTMGCVMMRVCNLDTCPVGVATQNP ELRKRFKGKPEYVENFMRFIAQELREYMSKLGFRTVSEMVGRTDLLVQTDNVPEPHQGKV DLSAILNNPFAGKDQKVTFDPKAVYNFELEKTMDEKVLVKKCVNAINKGQKTELSVNLTN IDRTFGTILGAEITRKNKDGLADDTITVHCNGAGGQSFGAFIPKGLTLELTGDSNDYFGK GLSGGKLILKVPEKAAYKAEENIIVGNVALYGATSGTAFINGVAGERFAVRNSGASAVVE GVGEHGCEYMTGGRVVVLGKTGKNFAAGMSGGIAYVLDVDNVLYKNLNKAMISIEKVENK YDKKELRDLIEAHVEATGSKLGAKVLADFDYYLPHFKKLIPNEYKKMITLSAKLEEKGLN SQQAQMEAFYESIGAKH >gi|226332944|gb|ACII01000075.1| GENE 23 22534 - 24159 1964 541 aa, chain - ## HITS:1 COG:CAC2892 KEGG:ns NR:ns ## COG: CAC2892 COG0504 # Protein_GI_number: 15896145 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Clostridium acetobutylicum # 1 533 1 533 535 699 61.0 0 MAVKYVFVTGGVVSGLGKGITAASLGRLLKARGYQVTMQKFDPYINIDPGTMNPIQHGEV 
FVTDDGTETDLDLGHYERFIDESLDKNSNVTTGKVYWSVLQKERHGDFGGGTVQVIPHIT NEIKSRFYRAKSPEENKIAIIEVGGTVGDIESQPFLEAIRQFQLNVGHDNAILIHVTLIP YLKASEELKTKPTQASVKELQGMGIQPDVLVCRSEHELTEDIKEKIALFCNVPVSHVLQN LDVEYLYEAPLAMEREKLADVVLSSLRLENRKPDLSDWEEMVESLRNPNKTVKIAIVGKY TQLHDAYLSVVEALKHGGISCRAKVELDWIDSEELTEKNLDQQLHNVDGILVPGGFGNRG TEGMILAAQYARVHKIPYLGICLGMQMAIVEFARHVLGYEDANSIELNPETKHPVIALMP DQEDIEDIGGTLRLGSYPCILAEGSRSYELFGKKEIHERHRHRYEVNNAFRKELQEKGMN IVGTSPDNHIVEMIEIAGHPFYVGTQAHPEFKSRPNHAHPLFRGFIQAAVDKKISEEQQQ K >gi|226332944|gb|ACII01000075.1| GENE 24 24533 - 24760 264 75 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1515 NR:ns ## KEGG: EUBREC_1515 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 75 5 77 77 69 47.0 4e-11 MSQSKIKLNATEEVQEFVNAASRCDFDIDVYYNRFLIDAKSILGVLSLDLTRELTVDCHG ESREFDRTLKKFAVA >gi|226332944|gb|ACII01000075.1| GENE 25 25030 - 26451 1503 473 aa, chain - ## HITS:1 COG:MJ0204 KEGG:ns NR:ns ## COG: MJ0204 COG0034 # Protein_GI_number: 15668376 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Methanococcus jannaschii # 8 459 2 462 471 449 49.0 1e-126 MAGIKEECGVFGIYDLDGGNVVPSIYYGLTSLQHRGQESCGLAVSDTKGERGNVKFHKEL GLVSEVLRQDVVRKYEGDIGIGHVRYSTTGASVAENAQPLVLSYVKGTLALAHNGNLVNT PELKWELIQNGAIFHTTTDSEVIAFHIARERVHSKTVEEAVLKTAKKIKGAYGLVVMSPR KLIAVRDPYGLKPLCLGKRGNAYVIASESCALTSVSAEFIRDIEPGEILTITKDGLKSNK ELASAAAKRAHCVFEYIYFARLDSTIDGVKVYDARIRGGKSLAKSYPVEADLVTGVPESG LPAAKGYSEASGIPFAFAFYKNSYIGRTFIKPTQEERESSVHLKLSVLESVVKDKRIVLV DDSIVRGTTIANLIHMLKEAGAKEVHVRISSPPFLHPCYFGTDVPSNDQLIAASHSTEEI CKMIGADSLGYMQTDYLEGMAGGLPLCKACFDGNYPMEIPDYTNAEFADIVKC >gi|226332944|gb|ACII01000075.1| GENE 26 26868 - 28967 2164 699 aa, chain + ## HITS:1 COG:CAC2658 KEGG:ns NR:ns ## COG: CAC2658 COG3968 # Protein_GI_number: 15895916 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Clostridium acetobutylicum # 3 699 2 696 696 
801 56.0 0 MAQNIPELYGSLVFNDKVMRSKLPKDMYKALKKTIESGTHLELDVANSVAVAMKEWATEN GATHYTHWFQPMTNVTAEKHDSFISPTGDGQVIMDFSGKELVKGEPDASSFPSGGLRATF EARGYTAWDPTSPAFIKDGTLYIPTAFCSYSGEALDKKTPLLRSMQTLDKEATKLLHIIG NKDVKHVNTTVGPEQEYFLVDKELYKQRKDLVFCGRTLIGAPAPKGQEMEDHYFGALKPR VAAYMHDLDVELWKLGIPAKTKHNEVAPAQHELAPVFDTTNVAVDHNQLTMEVMKKVADK HGLVCLLHEKPFEGINGSGKHNNWSMITDTGVNILDPGKTPAENTQFLIFLTAVIKAVDE YADVLRISVASAGNDHRLGANEAPPAVVSVFLGDELTEVLKSIENDEYFAGSRAVQMDIG AKVLPHFVKDNTDRNRTSPFAFTGNKFEFRMLGSEASVANPNIILNTAVAECVHQFAEQL KDVPEDKMEDAIHELIKKTIIDHKRVIFNGNGYTDEWIEEATKRGLFNLKSTPDALPQWI ADKNIELFTKYHIFTKEEIESRYEIWLESYSKILNIESNTMVEMVQKDFLPSVFAYIDKV AATAVAKKSVVSDVSTASEGKLIKELSQLADEISTGLETLKADTAKALATEDPLANAKAY QTVVLSDMDELRKSVDAAETLIPDALLPYPTYDKLLFSV >gi|226332944|gb|ACII01000075.1| GENE 27 29266 - 30978 1860 570 aa, chain + ## HITS:1 COG:TM0402 KEGG:ns NR:ns ## COG: TM0402 COG0004 # Protein_GI_number: 15643168 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Thermotoga maritima # 5 413 30 428 435 382 54.0 1e-106 MEFSAVNTIWVLVGAALVFFMQAGFAMVETGFTRAKNAGNIIMKNLMDFCIGTPVFWLVG FGLMFGAGNGFIGKIGGIATEAHYGSGMLPDGVPFWAFLIFQTVFCATSATIVSGAMAER TKFLSYCIYSLMISLVVYPVSGHWIWGGGFISQMGFHDFAGSCAVHMVGGVAALIGAIIL GPRIGKYTKGGKSKAIPGHNLTVGALGVFILWFCWFGFNGASTVSMEGDAIVSAGKIFVT TNLAAAVATVTVMIITWVRYKKPDVSMSLNGSLAGLVAITAGCDTVSPTSAAIIGIASGF IVVFGIEFIDKVLKIDDPVGAVGVHGLNGAFGTLAVGLFSDGSGTDWKGLLTGGGFHGFG VQFTGMIITIAWVVVTMTIIFQVIKHTIGLRVSAEEEIAGLDSKEHGLASAYDGFAMAGV PTPMGTSIKAPVVTARPSAPAESVPELPKEGVHKLTKVVIITRQNKLDEFMQAMNEIGVT GITITNVLGCGVQKGAPSYYRGVEMEMNLLPKVKIEIVVSLVPVQKVIETAKAVLHTGQI GDGKVFVYDIENVVKIRTGEEGYLALQDTK >gi|226332944|gb|ACII01000075.1| GENE 28 31317 - 31475 185 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579129|ref|ZP_04856400.1| ## NR: gi|253579129|ref|ZP_04856400.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 52 21 72 72 79 100.0 5e-14 MALAQATKSHEPTYTINDAMDVLLSRLDEAIDDMESDRVQIVEEAWAEIDAI >gi|226332944|gb|ACII01000075.1| GENE 29 31444 - 31686 119 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579130|ref|ZP_04856401.1| ## NR: gi|253579130|ref|ZP_04856401.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 80 1 80 80 155 100.0 1e-36 MGCRTEKFGAAALLFTFLIETTNGQILKNVPKTVFCVWVIKTNLKYIDISGGSWYDKNRN KSAISRKGDGYGIGTGNKIP >gi|226332944|gb|ACII01000075.1| GENE 30 31729 - 32052 207 107 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKYMRKNSFVVCLVISAILLTSSFAMPVLAAENTGIETSSVTRSVPGYYKCTGSDVNIRS GPGTQYTIVGTLNYGTKVYVYSISNGWAKISLGGIYRYVSADYLREL >gi|226332944|gb|ACII01000075.1| GENE 31 32089 - 32472 273 127 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579131|ref|ZP_04856402.1| ## NR: gi|253579131|ref|ZP_04856402.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 127 1 127 127 217 100.0 2e-55 MTHKKEVVFSLLFVIMVLCTGCSDKDNGDTSKNNRTSGESFVQEETIESTDNTDNADGSS APTPELKTPRTKKELEDALGTVNESGKWTPPAGSHIDPKTGNILNKDGVVVGTTQEPYSK ARPGSQG >gi|226332944|gb|ACII01000075.1| GENE 32 32709 - 33254 236 181 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2446 NR:ns ## KEGG: EUBREC_2446 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Two-component system [PATH:ere02020] # 1 162 1 162 194 256 87.0 2e-67 MEKCTDTVVNWMIRCNAINEADKELYKYALYSFFLLVSPLILAGVIGFGLGSVKHGVALI FPFVVLRKFSGGYHAKNLHTCILGSGFLLFLCIMLSMHVQCDLKLAIATVIASVSLITFS PIDSENRRLDTNEKRTYKKTTVFFVILFGLLDIALFLLGKYICWDITNSRLTSTLYRKEN V >gi|226332944|gb|ACII01000075.1| GENE 33 33254 - 33403 138 49 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2447 NR:ns ## KEGG: EUBREC_2447 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 49 1 49 49 69 93.0 4e-11 MKKEVTKSVAKGMKAALDVVLRTEANTASCLIMYQPKAPKELMNYRRNK >gi|226332944|gb|ACII01000075.1| GENE 34 33786 - 34028 148 80 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2448 NR:ns ## KEGG: EUBREC_2448 # Name: not_defined # Def: hypothetical protein # 
Organism: E.rectale # Pathway: not_defined # 1 80 30 108 108 142 96.0 5e-33 MSSIENGRQKPSLDIFIQICNLLNVTPDYLLLGSMHAYNIPQDIIDKLRLCNQSDIELAG DFIELLVKRNNKATHNEPKL >gi|226332944|gb|ACII01000075.1| GENE 35 34120 - 34827 293 235 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 232 2 231 234 101 27.0 1e-21 MRILICDDDTLIIEQLKKYIESYFESNHLKCPELVSFTSGEALLTDKGEKDIVFLDIEMP GMNGIYVGNELKKANRNVIVFVVTSYSEYLDEAMRFHVFRYLSKPLDKQRFFQNLKDALA YYNSITLKIPVETKQAVYSFPASQIIAIEAQGRKVIVHTARQDFDSIHNMNYWLEKLPQN SFFQTHRSFIVNFEHIVDFDHSTVHLSNPQITAYLTRRKYSSFRNAYLLYLESMR >gi|226332944|gb|ACII01000075.1| GENE 36 35662 - 36132 -26 156 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2450 NR:ns ## KEGG: EUBREC_2450 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 156 277 432 432 275 82.0 3e-73 MESSDLKETTKLCDHEMLNSILCRYQRHCNDKHIIFHTDIRSGTVQHIYPRDLTSLFCNL LENAVESAENIPDSFIELTVQKKENSPFLIIILINSCQNTPVYNPEGLPISRKSDNGRHG FGIKSIKKVVKQYHGNLQMYYDNDSGTFHTILTLKQ >gi|226332944|gb|ACII01000075.1| GENE 37 36662 - 37135 395 157 aa, chain + ## HITS:1 COG:MA2295 KEGG:ns NR:ns ## COG: MA2295 COG1853 # Protein_GI_number: 20091133 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Methanosarcina acetivorans str.C2A # 2 146 15 156 188 71 33.0 6e-13 MPVLMVATYNEDGSINVMNATWGTMQERDTVALNLTETHKTVQNIKARGAFTVSIADAAH VTEADYFGVESGNREPDKFARSSLTATKSETVDAPIINNFPPRTKRQSDKLSLTVLNITY LILYGILPIHVVTAWADSSGNVLPDIGSAPLQLMILP >gi|226332944|gb|ACII01000075.1| GENE 38 37044 - 37697 700 217 aa, chain - ## HITS:1 COG:L169795 KEGG:ns NR:ns ## COG: L169795 COG2860 # Protein_GI_number: 15673892 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus 
lactis # 1 213 1 217 219 132 40.0 4e-31 MDYQSIVTFVMEMAGSVAFAASGAMVGVERNMDIFGVSVLGVATAVGGGMIRDIVLGIIP PAVFTNPVYALVSVLASCIVFFIFYFKRELLQGHRRETYDKIMLAMDSVGLGIFTVVGVN TGIRQGYMDNVFLLVFLGTITGVGGGLLRDMMASVPPYIFVKHIYACASIIGAVVCVYMN RFVGNVQAMVVSSIVVVLIRYLAAHYRWNLPRLSPHE Prediction of potential genes in microbial genomes Time: Sat May 28 19:55:15 2011 Seq name: gi|226332943|gb|ACII01000076.1| Ruminococcus sp. 5_1_39B_FAA cont1.76, whole genome shotgun sequence Length of sequence - 1075 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 554 245 ## Teth514_2350 IS605 family transposase OrfB 2 1 Op 2 . - CDS 573 - 929 136 ## COG1943 Transposase and inactivated derivatives - Prom 983 - 1042 5.2 Predicted protein(s) >gi|226332943|gb|ACII01000076.1| GENE 1 2 - 554 245 184 aa, chain - ## HITS:1 COG:no KEGG:Teth514_2350 NR:ns ## KEGG: Teth514_2350 # Name: not_defined # Def: IS605 family transposase OrfB # Organism: Thermoanaerobacter_X514 # Pathway: not_defined # 12 169 15 158 397 79 32.0 4e-14 MLLSKKTSIKVSREYANLIGHMCYAASKLWNVCNYERQHYKETGIEQHPDWYYQKKAHKE DLWYKQLPSQTAQEVCKLLDKAWKSFYALKRSGGIETPRPPRFKQESIPITYMQMGIVHE RDTGRVRLSLPKALKKYMEETYQIHENFLYLENKIFRGMDQIKQLRIYPPEKGNCKMIVV YEVS >gi|226332943|gb|ACII01000076.1| GENE 2 573 - 929 136 118 aa, chain - ## HITS:1 COG:TM0777 KEGG:ns NR:ns ## COG: TM0777 COG1943 # Protein_GI_number: 15643540 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Thermotoga maritima # 2 112 18 124 129 96 47.0 1e-20 MVWSVKYRRKILSAEVEAYLQKLVQEIAEDKGFTVHLFECGEGDHVHCFVSAPPKLSITA IMKYLKGISGRKLFERFPEIRNQLWKGELWNHSYYVEMVGSVSEENIRRYIEHQSKAY Prediction of potential genes in microbial genomes Time: Sat May 28 19:55:25 2011 Seq name: gi|226332942|gb|ACII01000077.1| Ruminococcus sp. 
5_1_39B_FAA cont1.77, whole genome shotgun sequence Length of sequence - 25123 bp Number of predicted genes - 25, with homology - 25 Number of transcription units - 9, operones - 4 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 12 - 623 260 ## COG0500 SAM-dependent methyltransferases - Term 633 - 670 1.3 2 1 Op 2 . - CDS 677 - 898 281 ## gi|253579144|ref|ZP_04856414.1| predicted protein - Prom 931 - 990 4.4 3 2 Op 1 42/0.000 - CDS 1426 - 1827 457 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) 4 2 Op 2 42/0.000 - CDS 1843 - 3237 1628 ## COG0055 F0F1-type ATP synthase, beta subunit 5 2 Op 3 42/0.000 - CDS 3256 - 4113 929 ## COG0224 F0F1-type ATP synthase, gamma subunit 6 2 Op 4 41/0.000 - CDS 4134 - 5663 1935 ## COG0056 F0F1-type ATP synthase, alpha subunit - Prom 5709 - 5768 4.4 7 2 Op 5 38/0.000 - CDS 5788 - 6327 639 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 8 2 Op 6 37/0.000 - CDS 6324 - 6827 730 ## COG0711 F0F1-type ATP synthase, subunit b 9 2 Op 7 40/0.000 - CDS 6946 - 7224 523 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 10 2 Op 8 . - CDS 7355 - 8113 893 ## COG0356 F0F1-type ATP synthase, subunit a 11 2 Op 9 . - CDS 8170 - 8586 354 ## gi|253579155|ref|ZP_04856425.1| conserved hypothetical protein 12 2 Op 10 . - CDS 8614 - 8883 228 ## gi|253579156|ref|ZP_04856426.1| conserved hypothetical protein - Prom 8972 - 9031 9.9 - Term 9175 - 9223 9.3 13 3 Tu 1 . - CDS 9366 - 10475 1430 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 10671 - 10730 6.5 - Term 10592 - 10652 3.0 14 4 Tu 1 . - CDS 10754 - 11455 229 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family - Prom 11477 - 11536 3.6 + Prom 12911 - 12970 5.5 15 5 Tu 1 . 
+ CDS 13015 - 13773 482 ## COG2188 Transcriptional regulators + Term 13851 - 13901 7.4 - Term 13842 - 13884 3.2 16 6 Op 1 25/0.000 - CDS 13941 - 15560 1174 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 17 6 Op 2 . - CDS 15596 - 15853 365 ## COG1925 Phosphotransferase system, HPr-related proteins - Prom 15962 - 16021 3.8 18 7 Op 1 13/0.000 - CDS 16103 - 16930 601 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID 19 7 Op 2 13/0.000 - CDS 16930 - 17673 772 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 20 7 Op 3 . - CDS 17689 - 18153 427 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 21 7 Op 4 . - CDS 18150 - 18572 452 ## gi|253579165|ref|ZP_04856435.1| conserved hypothetical protein 22 7 Op 5 . - CDS 18556 - 19242 534 ## COG0637 Predicted phosphatase/phosphohexomutase 23 7 Op 6 . - CDS 19267 - 20244 527 ## COG2222 Predicted phosphosugar isomerases - Prom 20295 - 20354 7.5 + Prom 20525 - 20584 10.4 24 8 Tu 1 . + CDS 20617 - 22332 981 ## Swol_1745 transposase - Term 22595 - 22658 12.2 25 9 Tu 1 . 
- CDS 22664 - 24922 2693 ## COG1048 Aconitase A - Prom 25063 - 25122 6.5 Predicted protein(s) >gi|226332942|gb|ACII01000077.1| GENE 1 12 - 623 260 203 aa, chain - ## HITS:1 COG:CAC0567 KEGG:ns NR:ns ## COG: CAC0567 COG0500 # Protein_GI_number: 15893857 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 2 203 6 209 209 133 36.0 3e-31 MRFFENTRKPQGFGGKLMTKMMNSGHAKLSQWGFSNISAKSDAEVLDVGCGGGANIAVWL DRCRNGHVTGMDYSEVSVAESQKLNALAIKQGKCNVVQGDVSAIPFSDATFDYVSAFETV YFWPGLVKCFSEVNRVLKSEGIFLICNESDGTNAADEKWTKMINGMKIYTSKQLVAALKE AGFTEIKTYSNAKNHWISIVATK >gi|226332942|gb|ACII01000077.1| GENE 2 677 - 898 281 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579144|ref|ZP_04856414.1| ## NR: gi|253579144|ref|ZP_04856414.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 73 3 75 75 115 100.0 7e-25 MSRNIKGGFLTLSGVVGIVGMIIAAMQNPATAWVTPPGRMIVSILENGLLIPTVLFLVLF IYGLYILLTEKND >gi|226332942|gb|ACII01000077.1| GENE 3 1426 - 1827 457 133 aa, chain - ## HITS:1 COG:CAC2864 KEGG:ns NR:ns ## COG: CAC2864 COG0355 # Protein_GI_number: 15896118 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Clostridium acetobutylicum # 6 133 7 132 133 77 35.0 7e-15 MSTFPLRIGTPDGLLFEGNVERVICRTVSGDLAILARHCNYCTALGMGEASVVMEDGSRR TAACIGGMLSVMNGKCRVLATTWEWKEDIDEQRAQEAKKRAEEMIAKGGLSDHDYKIVEA KLQRALVRLSVKS >gi|226332942|gb|ACII01000077.1| GENE 4 1843 - 3237 1628 464 aa, chain - ## HITS:1 COG:CAC2865 KEGG:ns NR:ns ## COG: CAC2865 COG0055 # Protein_GI_number: 15896119 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Clostridium acetobutylicum # 4 462 3 461 466 670 73.0 0 MSEKHIGEVVQVIGPVLDIRFAHDELPPLLNAIEIDNDGAKLVAEVAQQVGDDVVRCIAM NSTDGLVRGAKALDTGGPISVPVGEECLGRVFNLLGDPVDNLPAPEAKERWSIHRQAPSY 
EEQMPATEILETGIKVVDLICPYAKGGKIGLFGGAGVGKTVLIMELIHNVATAHGGLSVF TGVGERTREGNDLYNEMKESGVIDKTALVYGQMNEPPGARMRVGLSGLTMAEYFRDEKNQ DVLLFIDNIFRFTQAGSEVSALLGRMPSAVGYQPTLATEMGNLQERITSTRKGSITSVQA VYVPADDLTDPAPATTFSHLDATTVLSREISSQGIYPAVDPLDSTSRILSPEIVGTEHYE IARSVQRVLQRYKELQDIIAIMGMDELSEEDKRTVSRARKVQRFLSQSFSVAEQFTGMPG KYVPLKETLRGFRMILNGECDDIPESYFLFVGTIDEVFEKAKNQ >gi|226332942|gb|ACII01000077.1| GENE 5 3256 - 4113 929 285 aa, chain - ## HITS:1 COG:BH3755 KEGG:ns NR:ns ## COG: BH3755 COG0224 # Protein_GI_number: 15616317 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Bacillus halodurans # 3 285 2 284 285 182 37.0 6e-46 MASNMKAVKLRIKSVQSTMQITKAMELVASSKLRKAKERAEVCRPYFTELHETLKEIARE NTDFSSVYAKESSNNKTCYVVIAGDRGLAGGYNTNLFKCLEASAEGKDYMVLPVGKKAVE YFRQREIEALTEQFAEAGSISVADCFEMANMLCTEFRKGTFGHIELCYTVFNSMLSQQPE VFSMLPMTDIREESGGKIETKNLILYEPDGETVFDAIVPEYLAGLVYGGICESTASELAA RRMAMDAATSNAEEMIDQLNLYYNRARQASITQEITEIVAGAEGL >gi|226332942|gb|ACII01000077.1| GENE 6 4134 - 5663 1935 509 aa, chain - ## HITS:1 COG:BH3756 KEGG:ns NR:ns ## COG: BH3756 COG0056 # Protein_GI_number: 15616318 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Bacillus halodurans # 1 506 1 502 502 643 64.0 0 MQLKPEEISKVIRSQIKYYENAIQQNETGTILMVGDGIARASGLVNCMSGELLEFEDGSF GMAQNLEENSVSIVIFGSDENIGEGQTVKRTGKVVSVPVGDAMVGRVVNALGQPIDGAGP IDTKEFRPIESKAPGICERRSVYQPLQTGIKAIDSMIPIGRGQRELIIGDRQTGKTTIAT DTIINQKGKDVICIYVAIGQKRSTVATLVENLTRNGAMDYTIVVAATASESSPLQYIAPY SGCAMGEYFMNKGKDVLIIYDDLSKHAVAYRALSLLIRRPPGREAYPGDVFYLHSRLLER AAKLDDEHGGGSMTALPIIETQAGDVSAYIPTNVISITDGQIFLETELFHSGIMPAVNPG ISVSRVGGDAQIKAMKKVAGTLKLIYSQYRELQSFAQFGSDLDADTKARLEQGARIVEVL KQNQNAPVPVEKQVAILYAVTKGVLESVKVEDVNVYESGLYTYLDTDAAGVEVMQEIRST GKLEQETEQKLRSVLDAYTENFLNTRPEK >gi|226332942|gb|ACII01000077.1| GENE 7 5788 - 6327 639 179 aa, chain - ## HITS:1 COG:BH3757 KEGG:ns NR:ns ## COG: BH3757 COG0712 # Protein_GI_number: 15616319 # Func_class: C Energy production 
and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Bacillus halodurans # 5 174 7 177 183 70 26.0 1e-12 MTKTARVYGGSLYDLAAEEKLDGQIMEQMIAIRQIFRENPGYLKLLGEPAIPEDERLKMI EEAFGGQAERYLVNFLKLLCERKILREFAGCCEEFIRRYNSAHGIAEAVVTSAVKLSDTQ MEALKAKLEKLSGKKVYLVQKTDASVLGGLRVELEGVQLDGTVQSRISGISKKLNELVV >gi|226332942|gb|ACII01000077.1| GENE 8 6324 - 6827 730 167 aa, chain - ## HITS:1 COG:SP1512 KEGG:ns NR:ns ## COG: SP1512 COG0711 # Protein_GI_number: 15901359 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Streptococcus pneumoniae TIGR4 # 7 166 5 164 164 75 30.0 4e-14 MQVQELVGIVPWNFIATICNLFIQVYLIKRFLFKPINEMLEKRKAKADAQIQDAVKAKEE AQAMKAEYEKNMQEAKNRANDIVMTAQKTAAIQSEEMLKEASSQVTAMKEKAEKDIAQEK RKAVNEIKGEIGGMAVEIAGKVIEREISEEDHAKLIDEFIENVGEAS >gi|226332942|gb|ACII01000077.1| GENE 9 6946 - 7224 523 92 aa, chain - ## HITS:1 COG:FN0363 KEGG:ns NR:ns ## COG: FN0363 COG0636 # Protein_GI_number: 19703705 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Fusobacterium nucleatum # 5 84 4 83 89 75 56.0 2e-14 MDFQLLAKGIALAGCAIGAGCALIAGIGPGIGEGNAVAKALEAIGRQPECKGDVTSTMLL GCAIAETTGIYGFVTGLLLIFVAPGMFMNFLS >gi|226332942|gb|ACII01000077.1| GENE 10 7355 - 8113 893 252 aa, chain - ## HITS:1 COG:CAC2871 KEGG:ns NR:ns ## COG: CAC2871 COG0356 # Protein_GI_number: 15896125 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Clostridium acetobutylicum # 26 245 20 214 221 108 33.0 7e-24 MELDISGARIFFTLPIDVPVLGPLRISETMVVSWIVMILITGLCIWLTHDLKEENISKRQ AVAELIVEKANSFVIGNMGEKFRYLIPFVAALFATSVVSNLISLIGLRSPTADLSTEAAW AVVVFIMITAQKIKTSGFGGYLKGFTTPIAVMTPFNILSELATPVSMACRHFGNILSGVV INGLIYGALAVASSALLGLIPGALGDVLSKIPILDVGVPAITSVYFDWFSGVMQAFIFCM LTVMYIANAAEE >gi|226332942|gb|ACII01000077.1| GENE 11 8170 - 8586 354 138 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|253579155|ref|ZP_04856425.1| ## NR: gi|253579155|ref|ZP_04856425.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 138 1 138 138 228 100.0 7e-59 MSIMDNVQPAVKKETGKVAAITGIGLILMWIVFGVLHAFMPQKVPFDYTVFLGGAVGGCV AVLNFFFMGLAVQKAAAATDEDTARMRIKASYSQRMLIQMLWVILAIVAPCFHFVAGIVP LLFPGTGIKIAGIFHTNK >gi|226332942|gb|ACII01000077.1| GENE 12 8614 - 8883 228 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579156|ref|ZP_04856426.1| ## NR: gi|253579156|ref|ZP_04856426.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 89 1 89 89 125 100.0 8e-28 MKQWSEIIRNVTMLSQLGLSLITPVLICLAVCWLIVSKTGTGGWVYIPGFFFGLGGSGTV AYKFYLSINRQQKKENKKKKNKVSFNRHL >gi|226332942|gb|ACII01000077.1| GENE 13 9366 - 10475 1430 369 aa, chain - ## HITS:1 COG:mll3407 KEGG:ns NR:ns ## COG: mll3407 COG1879 # Protein_GI_number: 13472950 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 110 369 59 325 325 92 28.0 1e-18 MKRAAAAAISLMTGLAMLTGCSRAEVEETIQPLVADAIEESDSEAEAVKEEDESPLIPEI DTDIKIHAGSRIAVVSKCVKGEYWKMVKKGMEDAVKEINKAYGYKKDDQITMTFEGPDNE EDVETQINTIDAVIAENPDVLCISASDMDSCEAQLEAAKENGIPVIAFDSNVAEKKLVKA YRGTDNVQVGKMAAYQLGSALGKMGKVAVFSAQQKTKSVQDRVEGFMNNIGKYGDIEVVE TIYSDQVDDMEASMKEVLEKYPTLDGVFCTNADITEMYLNLEKSDEHDAPALVGVDATTK QQEAIRNGEEIGCVSQQPYAMGYQTIWAAVETTAPKKSVVIEKNVLSNPAWIDADCLDDP DYSGYLYTE >gi|226332942|gb|ACII01000077.1| GENE 14 10754 - 11455 229 233 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 10 219 83 274 287 92 33 2e-18 MWIKDIRDYILYEDKDILVCHKPAGLAVQNARVGSMDMESLLKNYIAQKVPGKMPYLGII HRLDQPVEGVLVFALNPKAAADLSRQMTAGKIKKTYLAVTEGTVKVKSAKLVDWLKKDGR TNSSAVVEGGTSGAKKAILSYEVLETWKNKEDAQDCGERNLIRIDLDTGRHHQIRVQMAH AGMPLVGDRKYNPGQNSQEPLALCSAKLGFQHPVTKKQLEFQVQPAGMAFKRH >gi|226332942|gb|ACII01000077.1| GENE 15 13015 - 13773 482 252 aa, chain + ## HITS:1 COG:BS_yurK KEGG:ns NR:ns ## COG: BS_yurK 
COG2188 # Protein_GI_number: 16080309 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 1 239 1 239 242 153 35.0 3e-37 MMIMDSSKPLYLQVKADIKNRILSKQYMPGDKLPTENELSDQYNVSKITIRKAIQNLSDE GYVNKVQGKGTFINFKKDKLYLNKTSGFKESLSSLGHASRHDIIQASFLNADEDIAEKLM IPMGTKVVYIERLVWQDNEPIAIDKIYIEDARFPDFITTLSKDRSFYQVMDECYHIRPNH SVLEIDGKAAQSHSADTLKCNVGDPLFSIHKISYDQDGKPIHYSLTTVRCDRVTYVVSTN DSTVMDEKIKSN >gi|226332942|gb|ACII01000077.1| GENE 16 13941 - 15560 1174 539 aa, chain - ## HITS:1 COG:CAC3087 KEGG:ns NR:ns ## COG: CAC3087 COG1080 # Protein_GI_number: 15896338 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Clostridium acetobutylicum # 6 535 5 536 539 444 44.0 1e-124 MEKYIGKSVYKGTAIGPINVLKKSDGIVKRQHVEDVAAELQRLEDAKKQAQAQLGALYEK ALQEVGEVNAQIFEVHQMMLEDEDYQDAIHSMIETEELNAEYAVAVTGDNFAEIFANMDD EYMQARSADVKDISNRLIRNLSGEEELDWAHMEPSIIVADDLTPSETVQMDKRKILAFVT VHGSTNSHTAILARMMNIPALIGVPVELDSLHSGTMGIVDGKDAVFCVDPDEATIAAAHE MQARAAEQKRLLANYKGRPSVTKSGRKVNVYANIGSVSDVAYVQENDAEGIGLFRSEFLY LGKTALPDETEQFNTYRQVLQTMGGKKVIIRTLDIGADKNVDYLGLGKEDNPAMGYRAIR ICLKQPDVFKTQLRALLRAAKYGNLAIMYPMIISVDEVLRIREIVAEVAAELKREQIPYA IPEQGIMIETPAAVMICEELAELVDFFSIGTNDLTQYTLAIDRQNEKLDEFYNPHHEAVL RMIQMTIDGAHKAGKWAGICGELGADTTLTERFVEMGIDELSVAPSMVLSVRSKICEMA >gi|226332942|gb|ACII01000077.1| GENE 17 15596 - 15853 365 85 aa, chain - ## HITS:1 COG:MG041 KEGG:ns NR:ns ## COG: MG041 COG1925 # Protein_GI_number: 12044891 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Mycoplasma genitalium # 1 84 1 85 88 62 38.0 1e-10 MKEFTYVIKEELGLHARPAGLLVKEAKKFQSATTLAAKGKTAAAGKLIAIMSMGVKQGDE VTVQVEGPDEDAAFEALEKFFQENL >gi|226332942|gb|ACII01000077.1| GENE 18 16103 - 16930 601 275 aa, chain - ## HITS:1 COG:STM0574 KEGG:ns NR:ns ## COG: STM0574 COG3716 # Protein_GI_number: 16763951 # Func_class: G Carbohydrate transport and metabolism # 
Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Salmonella typhimurium LT2 # 9 265 21 274 284 278 51.0 8e-75 MENQKALKKVSDKDLRKVFLHSLAIMCSWNYERQMHMGFMYGMAPVLDKLYADDEERKKE AYQRHMEFFNCTPQLTPFIMGLAASMEEQNANSEEGEFQTESISMIKTSLMGPFAGIGDS FFQGTIRIITFGIGLSFAQQGSILGPILAVLLFAIPSLLFAYNATFFGYRSGNKYLAKLY QEGLMDRVMHFASIVGLAVVGGMVASMVSITTPLTFSTGGTNLVIQDMLDSIIPKMLPFV FTLGIYNLVQKKVNTNVLLIGIVLFGMVMGALGIL >gi|226332942|gb|ACII01000077.1| GENE 19 16930 - 17673 772 247 aa, chain - ## HITS:1 COG:STM0575 KEGG:ns NR:ns ## COG: STM0575 COG3715 # Protein_GI_number: 16763952 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Salmonella typhimurium LT2 # 1 240 1 240 247 177 47.0 1e-44 MLIKAILLGLVGIVGVIDSRLIGRQNIGRPLILSTLAGLVLGDVRQGVILGASLELISMG FVAIGAAGPPNMQFGSIIATAFAILSGASTEAALTIAVPVAVIGEFLSVIMRMVIAQFSH VADKAIENGQFKKAIHIHLWWSFGFNALVYFIPIFLTVYFGTDLVTNLVAAIPQAITNGL TVAGNLLSALGFAMLLSSMLSKKMFPYFLLGFLIVAYSGLSLIGVTMFAAILAFILDKVL YGKGANA >gi|226332942|gb|ACII01000077.1| GENE 20 17689 - 18153 427 154 aa, chain - ## HITS:1 COG:STM0576 KEGG:ns NR:ns ## COG: STM0576 COG3444 # Protein_GI_number: 16763953 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Salmonella typhimurium LT2 # 1 148 1 150 156 117 39.0 7e-27 MIKLVRVDHRLLHGQVIFSWTKQVGTNYIIVADDKVPNDPISMMALSIAKPVDCELSIIP FSQLRKLVDENASKNIMIIVKGPAEALQLAKELPEVTEINYGGVAKKAGSKEYGKAVYLN EAELKATQELISMGKKIYIQMVPSSPIEHASFEN >gi|226332942|gb|ACII01000077.1| GENE 21 18150 - 18572 452 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579165|ref|ZP_04856435.1| ## NR: gi|253579165|ref|ZP_04856435.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 140 5 144 144 268 100.0 8e-71 MKDAFDVLMITHAGLCKGYADVLKLILDIDEDDFGTLSFENGESLDDFSEAIRATIEERY KDRNLVVLLDLPGGSPANTALPFMSSSRKLIAGVNLPLVLEIMAMKKSGTTWEELDLDDT CKSAAQSIVLYNNFLNGGNV >gi|226332942|gb|ACII01000077.1| GENE 22 18556 - 19242 534 228 aa, chain - ## HITS:1 COG:alr0288 KEGG:ns NR:ns ## COG: alr0288 COG0637 # Protein_GI_number: 17227784 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Nostoc sp. PCC 7120 # 3 216 5 218 222 107 32.0 2e-23 MSLKLVIFDVDGLLLDTERVWQDVWYDTARDFGIDDWKREDFLNVVGRSGQPVYDYMEEL FRGRCSTEEFMKVARQHGVERLERELTAKPGAVELLNCIKKAGLPCAVATATSRSLTEER LTRLQLIQYFDYICCGDEVVERKPSPEAYLKVLAKMNTAPEDALVFEDSKVGVQAAWNAR IPVIMVPDLMPPTEVQKRQAIKIIASLSEAVPIIEELSKDGTKHERCI >gi|226332942|gb|ACII01000077.1| GENE 23 19267 - 20244 527 325 aa, chain - ## HITS:1 COG:BS_yurP KEGG:ns NR:ns ## COG: BS_yurP COG2222 # Protein_GI_number: 16080314 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Bacillus subtilis # 20 325 22 328 328 285 45.0 7e-77 MRDLHASQIKVAVEAVKARKEVNEVYFVACGGSQAVLMAGQYMFDKESSIPSHVYTANEF VYDTPKRLNENSVVISCSHSGNTPETVEATKLARSKGALTICLSNLEGSPLWEAAEYPVH YDWGKEVSDSDKNKGILYGLLFSLLSVLAPDAKWDVCLKELEKLTDLSAQAKAQYDAQAK AWAKRNKREKTIYTIGSGINYGEVYSTAMCWFMEMQWINSGCIHSGEYFHGPFEVTDYDV PFMLVKSIGHTRHLDERVENFAKKFTEDLLVLDQKDLDLSTVADEAKPYVAAILTGVVIR HFVEAIAFERGHSLDVRRYMWQMQY >gi|226332942|gb|ACII01000077.1| GENE 24 20617 - 22332 981 571 aa, chain + ## HITS:1 COG:no KEGG:Swol_1745 NR:ns ## KEGG: Swol_1745 # Name: not_defined # Def: transposase # Organism: S.wolfei # Pathway: not_defined # 5 547 3 545 570 186 28.0 3e-45 MAYFLKKTNNKKGTYLQIYSSFYDPERRHTAHKAYKPIGYVHELQAKGIDDPISFYKEEV IRLNQEANAAKAAKKAKQISDDSPEKLIGYFPMKNINDKLSVKKYIDLMQTATDFRFNVF GMISALVYARLVQPCSKSKTYDEVIPKLFESYDFSLNQLYDGLEYIGCEYEKIIEIYNHQ IQQMYQFDTSHTYFDCTNFYFEIDREDDFRKKGPSKENRKEPIVGLGLLLDANQIPIGMK LFPGNQSEKPVIRNIIDDLKKRNSVSGRTIQIADKGLNCAENIFHALKNGDGYIFSKSVK 
MLPETEKTWVLLPNNYRDVKNAAGETLYRIKECVDEFEYKFTESETGSLKKFRITEKRIV TFNPKLAKKQIYEINKEVEKARLLKASQAKKSEYGDSAKYVIFTAADKKGNDTDGKVKVT MNEELIQKSLELAGYNMLVTSEISMKDKDVYNAYHNLWRIEESFRIMKSQLDARPVYLQK QDTIVGHFLICYLAVLLTRLLQFKILKNGYSSEELFEFVRDFRVAKISDRKYINLSHGTS FIRELSELCSLPLTSYFLNEGDIKKMLSHRF >gi|226332942|gb|ACII01000077.1| GENE 25 22664 - 24922 2693 752 aa, chain - ## HITS:1 COG:ybhJ KEGG:ns NR:ns ## COG: ybhJ COG1048 # Protein_GI_number: 16128739 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli K12 # 1 751 10 757 761 940 60.0 0 MKLYDTGVYLLNGQKIVPENQADFPVSKEEAAKSTIAYSILKAHNTSGNMEKLQIKFDKL TSHDITFVGIIQTARASGLEKFPVPYVLTNCHNSLCAVGGTINEDDHMFGLTCAKKYGGV YVPPHQAVIHQFAREMLAAGGKMILGSDSHTRYGALGTMAMGEGGPELVKQLLSQTYDIN MPGVVAIYLTGSPRPGVGPQDVALAIIGKVFANGYVKNKVMEFVGPGVANLSADFRIGVD VMTTETTCLSSIWQTDEKIQEFYEIHGRSEDYKELKPGETAYYDGCVEIDLSEIRPMIAM PFHPSNTYTIDELKANLDDILADVEKKAQISLDGKVPYTLRDKVIDGKLYVEQGIIAGCA GGGFENICAAADILKGKSIGADAFTLSVYPASTPIYMELAKNGVLAGLLETGAVVKTAFC GPCFGAGDTPANNAFSIRHTTRNFPNREGSKIQNGQISSVALMDARSIAATAANKGFLTS AEDYTGGYTGQKYFFDKTIYDNRVFDSKGIADPGVEIQFGPNIKDWPEMAALPENLIVKV VSEIHDPVTTTDELIPSGETSSYRSNPLGLAEFTLSRKDPAYVGRAKEVQKAQKAIQEGK CPVEALPEMKPVMDVIKKDYPNVSKENLGVGSTIFAVKPGDGSAREQAASCQKVLGGWAN IANEYATKRYRSNLINWGMLPFLIKEGELPFANGDYLFFPQIRKAVEEKDDVIQGYGVGA EGLKEFEVALGELTDDEREIILKGCLINYNRK Prediction of potential genes in microbial genomes Time: Sat May 28 19:56:13 2011 Seq name: gi|226332941|gb|ACII01000078.1| Ruminococcus sp. 5_1_39B_FAA cont1.78, whole genome shotgun sequence Length of sequence - 58283 bp Number of predicted genes - 50, with homology - 49 Number of transcription units - 24, operones - 12 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 10 - 69 7.5 1 1 Tu 1 . + CDS 128 - 400 359 ## Cphy_0044 hypothetical protein + Term 425 - 482 7.3 - Term 411 - 472 1.4 2 2 Tu 1 . 
- CDS 500 - 1096 561 ## gi|291547307|emb|CBL20415.1| hypothetical protein - Prom 1156 - 1215 5.8 3 3 Op 1 6/0.000 - CDS 1235 - 2605 1572 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase - Prom 2626 - 2685 1.7 - Term 2641 - 2687 1.8 4 3 Op 2 . - CDS 2691 - 3743 1149 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 5 3 Op 3 . - CDS 3756 - 5135 1244 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 5192 - 5251 5.4 - Term 5156 - 5195 7.2 6 4 Tu 1 . - CDS 5253 - 6539 1368 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs - Prom 6628 - 6687 9.1 + Prom 6814 - 6873 5.1 7 5 Tu 1 . + CDS 7051 - 7197 197 ## gi|253579176|ref|ZP_04856446.1| predicted protein + Term 7285 - 7328 0.0 8 6 Op 1 . - CDS 7182 - 8549 955 ## COG5263 FOG: Glucan-binding domain (YG repeat) 9 6 Op 2 . - CDS 8573 - 9508 636 ## Amet_1408 hypothetical protein - Prom 9632 - 9691 7.4 - Term 9758 - 9808 -0.8 10 7 Tu 1 . - CDS 9819 - 10004 125 ## - Prom 10037 - 10096 3.7 11 8 Tu 1 . - CDS 10122 - 10985 856 ## COG1284 Uncharacterized conserved protein 12 9 Tu 1 . 
- CDS 11113 - 11970 934 ## COG1092 Predicted SAM-dependent methyltransferases - Prom 12107 - 12166 8.7 - Term 12587 - 12635 8.4 13 10 Op 1 11/0.000 - CDS 12649 - 13671 1191 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 14 10 Op 2 21/0.000 - CDS 13671 - 14726 1351 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 15 10 Op 3 16/0.000 - CDS 14726 - 16246 1750 ## COG1129 ABC-type sugar transport system, ATPase component - Prom 16366 - 16425 2.1 - Term 16326 - 16385 17.0 16 11 Op 1 1/0.000 - CDS 16436 - 17494 1329 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 17651 - 17710 8.2 17 11 Op 2 7/0.000 - CDS 17898 - 19349 1351 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 18 11 Op 3 . - CDS 19353 - 20999 1537 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 21090 - 21149 4.4 - Term 21158 - 21207 10.1 19 12 Op 1 . - CDS 21262 - 22611 1550 ## COG0535 Predicted Fe-S oxidoreductases 20 12 Op 2 . - CDS 22683 - 23264 689 ## Cthe_1479 hypothetical protein - Prom 23334 - 23393 9.2 - Term 23440 - 23480 0.6 21 13 Op 1 3/0.000 - CDS 23520 - 24077 445 ## COG1309 Transcriptional regulator - Prom 24145 - 24204 4.2 22 13 Op 2 . - CDS 24275 - 27040 2769 ## COG1472 Beta-glucosidase-related glycosidases - Prom 27120 - 27179 5.2 - Term 27153 - 27196 -0.8 23 14 Op 1 7/0.000 - CDS 27203 - 28093 680 ## COG0395 ABC-type sugar transport system, permease component 24 14 Op 2 2/0.000 - CDS 28104 - 29009 822 ## COG4209 ABC-type polysaccharide transport system, permease component - Term 29029 - 29066 7.0 25 14 Op 3 . 
- CDS 29073 - 30551 537 ## PROTEIN SUPPORTED gi|15900035|ref|NP_344639.1| ABC transporter, substrate-binding protein - Prom 30772 - 30831 7.0 + Prom 30711 - 30770 9.8 26 15 Op 1 7/0.000 + CDS 30807 - 32540 799 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 27 15 Op 2 . + CDS 32577 - 34160 727 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 34185 - 34227 7.2 - Term 34171 - 34217 1.3 28 16 Op 1 2/0.000 - CDS 34229 - 35869 1297 ## COG0642 Signal transduction histidine kinase - Prom 35952 - 36011 3.4 - Term 35994 - 36041 3.2 29 16 Op 2 32/0.000 - CDS 36057 - 36728 744 ## COG0704 Phosphate uptake regulator 30 16 Op 3 41/0.000 - CDS 36731 - 37489 334 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 31 16 Op 4 38/0.000 - CDS 37495 - 38400 1009 ## COG0581 ABC-type phosphate transport system, permease component 32 16 Op 5 39/0.000 - CDS 38390 - 39247 955 ## COG0573 ABC-type phosphate transport system, permease component - Prom 39271 - 39330 5.7 - Term 39321 - 39371 4.0 33 16 Op 6 . - CDS 39394 - 40257 1404 ## COG0226 ABC-type phosphate transport system, periplasmic component + Prom 40360 - 40419 7.1 34 17 Tu 1 . + CDS 40573 - 40965 530 ## Acfer_1645 C_GCAxxG_C_C family protein + Term 41010 - 41048 7.0 - Term 40994 - 41037 5.0 35 18 Tu 1 . - CDS 41103 - 42074 396 ## gi|253579203|ref|ZP_04856473.1| conserved hypothetical protein - Prom 42118 - 42177 7.1 + Prom 42148 - 42207 5.4 36 19 Tu 1 . 
+ CDS 42277 - 42630 520 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family + Prom 42647 - 42706 3.2 37 20 Op 1 40/0.000 + CDS 42836 - 43513 485 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 38 20 Op 2 4/0.000 + CDS 43506 - 44405 314 ## COG0642 Signal transduction histidine kinase + Term 44436 - 44474 -0.8 + Prom 44410 - 44469 2.4 39 20 Op 3 36/0.000 + CDS 44541 - 45212 344 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 40 20 Op 4 . + CDS 45228 - 47873 1433 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 47921 - 47970 5.5 - Term 47918 - 47948 3.0 41 21 Op 1 9/0.000 - CDS 47953 - 48813 734 ## COG0829 Urease accessory protein UreH 42 21 Op 2 17/0.000 - CDS 48826 - 49434 676 ## COG0378 Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase - Prom 49516 - 49575 5.6 43 21 Op 3 16/0.000 - CDS 49587 - 50279 519 ## COG0830 Urease accessory protein UreF 44 21 Op 4 . - CDS 50298 - 50774 579 ## COG2371 Urease accessory protein UreE 45 21 Op 5 . - CDS 50795 - 51301 595 ## Hac_1534 urease accessory protein - Prom 51441 - 51500 4.1 - Term 51467 - 51511 3.1 46 22 Op 1 17/0.000 - CDS 51560 - 53281 2257 ## COG0804 Urea amidohydrolase (urease) alpha subunit 47 22 Op 2 13/0.000 - CDS 53294 - 53653 442 ## COG0832 Urea amidohydrolase (urease) beta subunit 48 22 Op 3 . - CDS 53668 - 53970 493 ## COG0831 Urea amidohydrolase (urease) gamma subunit - Prom 54173 - 54232 9.6 + Prom 54281 - 54340 5.2 49 23 Tu 1 . + CDS 54362 - 55546 1239 ## COG2230 Cyclopropane fatty acid synthase and related methyltransferases + Term 55549 - 55600 10.2 50 24 Tu 1 . 
+ CDS 56114 - 58144 2029 ## Ccel_1549 putative outer membrane protein + Term 58197 - 58257 16.1 Predicted protein(s) >gi|226332941|gb|ACII01000078.1| GENE 1 128 - 400 359 90 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0044 NR:ns ## KEGG: Cphy_0044 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 88 1 89 89 108 67.0 5e-23 MFRLWGKLWKDNHLLKDTVICDDSEDTRTHKIFRGLESICYEMDLGNPIWLESTVRHFKK HGKARFYQDNFIEQIDFDYLEIQVIEEDDE >gi|226332941|gb|ACII01000078.1| GENE 2 500 - 1096 561 198 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291547307|emb|CBL20415.1| ## NR: gi|291547307|emb|CBL20415.1| hypothetical protein [Ruminococcus sp. SR1/5] # 1 183 1 182 199 224 62.0 2e-57 MLEKVMELHFLLYAMMLMGGLGALGMLTTHLTYRRMIRKNTEVKSNLKEKWTNLWKTRDR LLNRMNRFVWYPSLFCIAFLGLAFFLYSKDVRWDGMPLAYLYVAAIIPTALLLLRQALDF TYREELLMNSFSDYVEHARTWVEEVPVPEKADEAVKEEIVEHIADSIRQTAAAGSHFSGM LTPEEDEIMREIIREFMK >gi|226332941|gb|ACII01000078.1| GENE 3 1235 - 2605 1572 456 aa, chain - ## HITS:1 COG:CAC2128 KEGG:ns NR:ns ## COG: CAC2128 COG0770 # Protein_GI_number: 15895397 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Clostridium acetobutylicum # 4 456 1 448 452 279 37.0 1e-74 MKNLTLENIARVCGGTYYGPADKLQEEVMSVITDSRKAEEGCLFVPIVGERVDAHKFIPQ VMEAGALATLSERVLDNADFPYIAVESSLQAVKDIAEFYLKQLQIPVVGITGSVGKTSTK EVIASVLNQKYHTLKTQGNFNNELGLPLTVFRLRDGDEIAVLEMGISDFGEMTRLAQIAR PDTCVITNIGTCHLENLGDRNGVLKAKTEIFKFLRPEGHIVLNGDDDKLITVKEYEKIKP VFFGMSNECQVYGDKIVSRGLKGMTCTIHLGDTAFTVDIPMPGRHMVYNALAAAAVGNIY GLTTDQIKAGIESLEPISGRFRMIETDKFLIVDDCYNANPMSMKASLDVLQDGAGRKIAV LGDMGELGENEVQLHEEVGEHAGKCDIDVLICTGKLCRNMAERAQRTNPDLKVIYEPDRE SLLEHLEGYVQDGDTILVKASHFMKFEEVVTKLENM >gi|226332941|gb|ACII01000078.1| GENE 4 2691 - 3743 1149 350 aa, chain - ## HITS:1 COG:PA4201 KEGG:ns NR:ns ## COG: PA4201 COG1181 # Protein_GI_number: 15599396 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp 
enzymes # Organism: Pseudomonas aeruginosa # 1 342 6 337 346 233 38.0 3e-61 MKIVVLAGGTSTERTVSITSGTGICKALRQKGHQAILVDIFCGIENVDRENPFPSEYDVD AASEYMSAFNDRIEQMKKERRSFFGPNVLKLCEAADIVFLGLHGANGEDGKVQATFDLMG IKYTGTGYLSSALAMDKGITKQMFLMNNVPTPRGVSMVKEEMTTDLKALGMEFPVVVKTC CGGSSVGVYIVNDQAEYEQALKDAYSYENEVVVEEYIKGREFSVAVVDGKAYPIIEIAPL QGFYDYKNKYQAGSTIETCPAEISPELTEKMQHYAEAGAKALFMEGYCRLDFMMKENGDM YCLEANTLPGMTPTSLIPQEAKVLGIDYPTLCEKLIEVSMKKYPTQDFTL >gi|226332941|gb|ACII01000078.1| GENE 5 3756 - 5135 1244 459 aa, chain - ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 2 458 11 465 466 268 32.0 2e-71 MSSRKKIMATLIIILLLLTAGVYGYGVYYFTEHFLPGSMVNGFNCSYMTVSQSEELLTQK VGAYVLTIGTRNNGQESITAKQAGLSYDSDGSVGKLIRNQDRFTWFLAFNQHRNYEVSSS IKYDEQKTEAAISGLKCMQQENMTEPSDAHIEEKDDKFVIVPEQEGTALKPEKTSQDIID ALVTGRTPVDLEADGCYKEPKVYQSDETLTKNCELINKLTDVVITYDFDDRTETVDRDMI KNWLTTDENGLYTLDKKQIEAYISELAAKYDTVGTERTFNTYDGREITVSGGNYGWQIDQ KAELKELTELIKNGETQVREPVYSHEGLVRKTNDIGYTYIEIDLTAQRMVFYKDGTPTAD AQIVSGNPFVPNCATPVGCYTTGEMKSGCTVNGEDYPSAVNYWIPFDGNLGISDAPWRMD FGGQLYEFEGTHGSICAPSDQVQIIFSNVEKNTPIVIYE >gi|226332941|gb|ACII01000078.1| GENE 6 5253 - 6539 1368 428 aa, chain - ## HITS:1 COG:MT3825 KEGG:ns NR:ns ## COG: MT3825 COG1167 # Protein_GI_number: 15843343 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Mycobacterium tuberculosis CDC1551 # 4 425 3 423 435 384 46.0 1e-106 MKKYSDMSKEELLTLKAQLDKEYADIKAKGLALDMSRGKPSADQLDLSMGLMNVLSSDSD LKCETGVDCRNYGVIDGIPEAKRLLGEMSEVDPDHIIIYGNSSLNVMFDSIARSMTQGVM GNTPWCKLDKVKFLCPVPGYDRHFKITEFFGIEMINVPMTPQGPDMDMVEKLVSEDASIK GIWCVPKYSNPQGYTYSAETVERFAHLKPAAPDFRIYWDNAYSIHHLYDDNQDFLVEILE ACAKAGNPDLVYKFTSTSKVSFPGSGIAAVAASKANLEDFRSYMQIQTIGHDKLNQLRHV 
KFFKNIDGLHEHMRKHAAILRPKFEMVLNTLDRELGGLGIGEWTKPHGGYFISFDSMEGC AKAIVAKAKEAGVVLTGAGATYPYGKDPKDSNIRIAPSFPTPEDLGKAAEVFVLCVKLTS VEKLLENA >gi|226332941|gb|ACII01000078.1| GENE 7 7051 - 7197 197 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579176|ref|ZP_04856446.1| ## NR: gi|253579176|ref|ZP_04856446.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 48 1 48 48 68 100.0 1e-10 MSMFIHAIIVILLAVLAMGIGLLILVCWALAACASRRDSYLEKDYRVK >gi|226332941|gb|ACII01000078.1| GENE 8 7182 - 8549 955 455 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 61 201 523 663 744 94 37.0 5e-19 MGISDIMERQMKDMKRKLSFRKVAVFLILALFLTAGISMRAVTVNAATYVKQDRTSVSIT SKKTGWQKINGDYYFYNSKGRLICGSFKYKGYYYYSIANGKRFTGWMKRSGNKYYYNRKN GAMFRNRWATGDKYTYYFNKSGVAIARQWLTQDGKKYYFLSNSTMAKGWQKIGKYYYYFS KKTGVMAKNTWIGKYYVNSKGQRVKSSDTKPTNTKPTVTQSGNTYEYKSSTLNIKLKRKS VHGISYWVAHIKTSNAKQLKSALSNGTYGGSRQTTSDAVSSNGGIIGVNGSAFDYGTGKP SPLGMCIKNGIIYGDYMTSYSVMAVKKDGTIYTPAQGLMGKNLLAAGVKDTYNFGPVLIK DGEAQLPWTETEKYYPRTAVGMVKPNDYVLLVTDTGSYNGLNHWDMVNIFKSYGCTYAYN LDGGGSATLYFNGKVMNKLIGNTQRPCADFLYFTR >gi|226332941|gb|ACII01000078.1| GENE 9 8573 - 9508 636 311 aa, chain - ## HITS:1 COG:no KEGG:Amet_1408 NR:ns ## KEGG: Amet_1408 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 4 311 6 313 315 120 28.0 7e-26 MIVYHYCSLESLNSILKNRSLRLTNILKSNDSMEISWICRYYDAEFKRAYENASDLFRSK ISSERLMGYVKLFTDEFFNENHADFRYYVTCFSYQNDLLSQWRGYADDGRGAAIGFDLDV LKEVVTVSPEISKPSIVSLHKISYSEAEQREVVHQIVQELVDEIEKILQKEEQCRESIEE KQDYEIEVLDKVMNCFEKKFLKLFQESVYMKNPFFREESEIRLCEFSPKQFLMGREVELS LGARLYNYSYYVKESQLISYVDFDFSDCLDRLIKELVIGPKCLMSERDMEYYLTTLGLSN CRVKKSQGTYR >gi|226332941|gb|ACII01000078.1| GENE 10 9819 - 10004 125 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGNATESTLPHGNTSGYDRKACRTCACSTCYEQEFCDRCSSCQNQSRKLDRCNAYEGAYN Y 
>gi|226332941|gb|ACII01000078.1| GENE 11 10122 - 10985 856 287 aa, chain - ## HITS:1 COG:BS_yitT KEGG:ns NR:ns ## COG: BS_yitT COG1284 # Protein_GI_number: 16078176 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 9 281 5 277 280 144 32.0 3e-34 MKELNLKHEAKRYGCAVLAATIMALNIKTFVRAGGLFPGGFTGLTLLIQNIFQTFMGIAV PYTLINVLLNSIPVFIGLKFIGKKFTISSVCVIVLSGLLTDIIPSQPITYDTLLISIFGG LINGFCISLCLIGNTSTGGTDFIAIYFSEKNGRDIWNYILCGNAVILTVAGLLFGWDRAL YSIIFQFTSTQVIQMLHQRYKKHTLFIITKEPYQIYEEIFKLTNHTATRFEGTGCYTDEK TSMIYSVVSTEEAKYLVKKVHEIDPKAFVNIIKTDYINGRFYQKTDY >gi|226332941|gb|ACII01000078.1| GENE 12 11113 - 11970 934 285 aa, chain - ## HITS:1 COG:mlr3209 KEGG:ns NR:ns ## COG: mlr3209 COG1092 # Protein_GI_number: 13472800 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferases # Organism: Mesorhizobium loti # 9 285 58 338 338 237 43.0 2e-62 MWIADGWKDYEVIDTSCGEKLERWGDYLLVRPDPQVIWDTPKVNKGWKHMNGHYHRSKKG GGEWEFFDLPEQWDIHYKGLTFNLKPFSFKHTGLFPEQAVNWDWFGNKIRNAGRPVKVLN LFAYTGGATLAAAAAGASVTHVDASKGMVTWAKENAVASGLGDAPIRWLVDDCVKFVERE IRRGNHYDAIIMDPPSYGRGPKGEIWKIEDCIHDLIKLCTKILSDEPLFFLINSYTTGLA PAVLTYMLSTELKKFGGHVDSQEIGLPVSSNGLVLPCGASGRWEA >gi|226332941|gb|ACII01000078.1| GENE 13 12649 - 13671 1191 340 aa, chain - ## HITS:1 COG:YPO3905 KEGG:ns NR:ns ## COG: YPO3905 COG1172 # Protein_GI_number: 16124037 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 17 334 8 313 330 202 42.0 1e-51 MKANVQKKKITGNGFLLLITIALFVVMYVAGMIVFADKGFAKPQMFLNLFVSNAGLLVIA CGLTIVMITGGIDISVGSVTALVCMVLADLMENKGVNAYAAVLVALLIGVAFGIVQGFLV AYLDIQPFIVTLAGMFFGRGLTAIISTDMISIKNELFLKWANYRFYMPFGSTSKKGKFIP AYIPPTVVIALIVVILIAVLLKYRKFGRKLYAIGGSRQSALMMGLNVKKTIFNAYVLDGF LAGLGGFLFCLNSCAGFVEQAKGLEMDAISSAVIGGTLLSGGVGTPIGTLFGVLIKGTIS SLITTQGTLSSWWTRIILSALLCFFIVIQSVVASMKKKAK >gi|226332941|gb|ACII01000078.1| GENE 14 13671 - 
14726 1351 351 aa, chain - ## HITS:1 COG:YPO3906 KEGG:ns NR:ns ## COG: YPO3906 COG1172 # Protein_GI_number: 16124038 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 21 300 29 298 339 202 47.0 7e-52 MNKSKLKKITSARLFLPIVCLIAVLLLNVIRTPDFFNVSIRNGVLYGYIVDVINRASELV ILAIGMTLVTAASGGQDISVGAIMAVAAAVCCQILSGGQVSVNDYQNPVILAVIAALLAS ALCGAFNGFLVSKLNIQPMVATLILYTAGRGIAQLITNGQITYIRVDSFKMAGGYIGKCP IPTPIFFAIITVVIVYLILKKTALGLYIESVGINGKAARLVGLNSTMIKFLTYVICGVLA GIAGIVASSRIYSADANNIGLNLEMDAILAVALGGNFLGGGKFSLIGSVIGAYTIQALTT TLYAMNVKADQLPVYKAIVVVIIVTLQSDVFKKYIANLKAKRSVAAEGGQK >gi|226332941|gb|ACII01000078.1| GENE 15 14726 - 16246 1750 506 aa, chain - ## HITS:1 COG:YPO3907 KEGG:ns NR:ns ## COG: YPO3907 COG1129 # Protein_GI_number: 16124039 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Yersinia pestis # 5 501 3 494 496 496 51.0 1e-140 MEQNVVLEMRDISKNFTGVRALSHVDFTLRKGEIHALMGENGAGKSTLIKVLTGVHEFES GTIHMAGNSNAIINHSPQEAQANGISTVYQEVNLCPNLTVAENLFIGREPRKMGMIDWKQ MNERSGKLLESLDIHVPPTQMLDECSIAIQQMIAIARAVDMKCQVLILDEPTSSLDDDEV EKLFVLMRRLRDEGVGIIFVTHFLEQVYAVCDRITVLRNGELVGEYETKDLPRVMLVAKM MGKDFDDLADIKGDHKDKRITDAEPVIEAKGLSHKGTIKPFDITINKGEVIGLTGLLGSG RSELVRAIYGADKADTGTLKVKGKEVKINTPLDAMKLGMAYLPEDRKAEGIIADLSVREN IIIALQAKRGMFHPLSKKEMEDAADKYIKLLQIKTASRETPIKSLSGGNQQKVILGRWLL TNPDYLILDEPTRGIDIGTKTEIQKLVLDLADQGMAVTFISSEVEEMLRTCSRMAVLRDG QKVGELEEDELSQSGVMKAIAGGDDK >gi|226332941|gb|ACII01000078.1| GENE 16 16436 - 17494 1329 352 aa, chain - ## HITS:1 COG:SMb21587 KEGG:ns NR:ns ## COG: SMb21587 COG1879 # Protein_GI_number: 16264775 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 4 325 2 311 320 205 43.0 1e-52 MKKKIVSALLTATMVCGMSVVPAVGVAAADDTITVGFSQVGAESDWRTANTESMKSTFSE 
ENGYELIFDDAQQKQENQLTAIRNFIQQEVDYILLAPVTETGWDTVLQEAKDADIPVIIV DRMVDVSDDSLYTTWIGTDSLLEGRKAAEWLNAYTTAKGIDAKDVNIVDIQGTIGSTAQI GRSKGLEEGVDNYGWNLLAQQSGEFTQAKGQEVMESMLKQYDNINVVYCENDNEAFGAID AIEAAGKTVGSDIANGEIMVMSFDTTNAGLTDTLAGKIACDVECNPLHGPRAEELIKALE AGEEVEKLNYVDEEIFATDDTVDKVKAVNSLDEEGEYDVTPITQEIIDGRAY >gi|226332941|gb|ACII01000078.1| GENE 17 17898 - 19349 1351 483 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 205 471 313 589 602 175 35.0 2e-43 MKQKQEKLETKMKWMLLSTMIPMTILIIVLLVFFWHYTNEYNQLSNNLAVSSEYNANYRN SNNDMSFKDEVDMDIYYVTIGRKGKDGLPTEQVDKAISTVESLKKTTSKKESLRSLKYLT NYLENLKKRMNQLLEIKNYQERQEFVANNTEILTTLFEKEMQNYIYQEATELVQIESQLA GNVRLTIITMCAAILVATAILLRRSFRLTYSITKPISEILHNIKKVGKGEYKTINAVSAD SVEIQELDAGTRKMAGRIEGLLENVRKEQEVQHMTELQLIQAQVNPHFLYNTLDTIVWLV EGGMQQDAVDMITSLSVFFRTSLSKGKDIIPLSEEKRHTLSYLEIQQSRYRDIMEFEINI PPELDEVMVPKLTLQPLAENALYHGIKNKRGKGKILIEGFNLGDDMMLRVTDNGQGMTPE RLYEVQEAIRTGERAGFGLAAVSERIVLYYGPGYGLKISSTEGEGTVMEVYMAKKINRDN NSR >gi|226332941|gb|ACII01000078.1| GENE 18 19353 - 20999 1537 548 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 542 1 520 525 168 26.0 2e-41 MLKTFLVEDEVVIREMIKKMIPWEQYGFELAGEASDGEMALPLILKSKPDLLITDIKMPF MDGLTLCKLVKKELPDIKIVILSGYDDFNYAKQAINIGVEDYLLKPITKNAFIERLEEIH NRYEHEKTQKEYYEKFKLEMQEYERNASRDFFESLVRADFDLEEIYRRADRLNLDIVAEA YNILIFTPDASDSSYNSSEGYSDWEAEVHKKIENYFLSHPVAMLFRHQVFSYAILVKGQR DTIKKNTCECVETIQKIMEETRANVDWFVAVGEEADRLSRIKQSYHTAARTYAFRYLYDG HILYYNMLEQVKENSADTSKTEAVQLKNVNINALNPEILQKFLSSGLEDEVDSFVHDYFH AIGREPMKSLVFRNYVVLNVRFSVLSFLKKIGYDDTELSREETDDVVKKTSQSTEASVAY 
AEEVLKRAIAIRDENAGSQNRSVLKQATDFIDGHYMDEEISLNRVAHAANVSANHFSALF SQNMGQTFIEYLTSLRMDKAKELLRCTSKRSSEIAGEVGYKDAHYFSYLFKKTQGITPSE YRKTRGEV >gi|226332941|gb|ACII01000078.1| GENE 19 21262 - 22611 1550 449 aa, chain - ## HITS:1 COG:CAC2791 KEGG:ns NR:ns ## COG: CAC2791 COG0535 # Protein_GI_number: 15896046 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 1 442 1 444 461 295 35.0 2e-79 MKLMDKAKQAALAAAVKTGLGYLEKDPEVNIPKLMELVDKFVPDGWYESQRNAIRNAIQN KDSNWYKLILRIYELDPGVREAFFTNFIINASLKGSALQEETAEENNCNVPWAILLDPTS ACNLHCTGCWAAEYGHKLNLDFDTICSIVEQGRKMGTYMYIYTGGEPLVRKKDLMRICEK YPDCEFLSFTNGTLIDEEFCQEMLRVKNFVPAISLEGSEEANDGRRGEGVYQKVMHAMEL LKAHKLPFGVSTCYTSANVDSVSSEEFFDHIIDCGALFVWFFHYMPTGNDAVVELMPNPQ QREKMYHKIREYRSTKAIFGMDFQNDAQYVGGCIAGGRRYLHINANGDVDPCVFIHYSNA NIYENTLLEALKSPIFMAYHDGQPFNENMLRPCPMLENPQKLRKMVEESGAKSTDLQSPE SVEHLCAKCDAYAEHWAPKAEELFPVKNK >gi|226332941|gb|ACII01000078.1| GENE 20 22683 - 23264 689 193 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1479 NR:ns ## KEGG: Cthe_1479 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 7 193 9 195 202 141 43.0 1e-32 MDALKQLKLAKNGYIIMSVLFMVLGACLIIWPDCSMAVFCTAVGIMLIVYGLIKILGYFS RDIYCLAFQFDLAFGVLLAAVGIIIIVRRNVVVNLIFGIFGLLILADALFKIQMSIDAKK FGLNLWWRILLVAILTGVLGFLLLIRPFEAAEIMMILVGVSVLFEGILNLCVAIYTVKII KNQKQDIIDMDEY >gi|226332941|gb|ACII01000078.1| GENE 21 23520 - 24077 445 185 aa, chain - ## HITS:1 COG:lin0482 KEGG:ns NR:ns ## COG: lin0482 COG1309 # Protein_GI_number: 16799557 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 2 157 4 158 186 58 26.0 8e-09 MSGFTKEIIAKTFTELLDEKPMSKITVKDIVERCGVNRNTFYYHFKDIPDVVEFILKKKW DEILEHPQDRASILECMEEMADLVRNNRKVMLNVYRSVKKDTFLFYMNEISNYIIMEYFR KNADQFDLDEGEIRILIQYYKCLFMGFLMEWLDNNLKSDFGEEMRQASRLFEKHPELHAV LNRSE >gi|226332941|gb|ACII01000078.1| GENE 22 24275 - 27040 2769 921 aa, chain - ## HITS:1 COG:TM0025 KEGG:ns NR:ns ## COG: TM0025 COG1472 # 
Protein_GI_number: 15642800 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 5 395 318 711 721 231 39.0 5e-60 MNDYKLDLEKYATLARQAAAEGCVLLENEKQALPLREGESVAVFGRMAFHYYKSGLGSGG LVNTRYVVGILDALKECKEIQLDEKLLGIYANWIKENPYDEGQGWGRVPWSQKEMEVTEE MLDCARSNDVSLVIIGRTAGEDQDNNTNLGSYCLTETEEDLICRVCEVSKCTVVVLNVGN IIDMSWVEKYHPQAVLYAWQGGQEGGNGVADVLTGKVCACGKLTDTIAERIEYYPSTENF GDPYKNYYKEDIYVGYRYFETFAKDKVLYPFGYGLSYTNFETKAEIFKNTEDELTVAATV TNIGDVRGKEIVQVYVKAPQGKLGNPARKLIGFAKTRELAPGEKEELVIIIPKYDMASYD DSGVTGHKSCYVLEEGTYEIFAGSDVRSAKSAGIYEEELRVIEQLQEAYAPIEKFRRMKA VLRADGTYQAVTEEVPVRTADPHKRREERMPKTLEYTGDKGYKLADVLDKKVSMDEFVAQ ISEADLIAMFRGEGMCSPKVTAGTAAAFGGVTESLKALGIPVGCCADGPSGIRMDCGTKA FSLPNGTLLGCTFNTELVGELYEMTGRELRLNKIDSLLGPGMNIHRNPLNGRNFEYISED PLLTGRICAAQVKAMAKSEIGSTIKHFCGNNQEVGRSTSDSVMSERCLREIYLKGFEIAV KEGGARSVMTTYGSVNGLWTAGSYDLCTTILRKEWGFQGIVMTDWWAKSNYEGHQAEVTA KAPMVAAQNDIYMVVSDAKSNPENDDVEEMLHAGKITVGELQRNAANILGFLLKSPSVLL LTDRICKEELEAMNTKEEDDVDAGSLVSIESDSVTQKIVIDGALLHPAKGKADVIAVTNE FMGDFTMKFTLKSDLGELAQLPVSVFLDNIHKMTVSVQGTNGKWVEENRILNMGFGHNHY IKFYYGADNLEIKEIVLIPNR >gi|226332941|gb|ACII01000078.1| GENE 23 27203 - 28093 680 296 aa, chain - ## HITS:1 COG:BH0795 KEGG:ns NR:ns ## COG: BH0795 COG0395 # Protein_GI_number: 15613358 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 15 295 40 321 322 228 43.0 1e-59 MVRSKSAKRFERLAHTVMILVTICIVLPFILLFMSSITSESALVRDGYSFFPKEFSIHAY KFIWDNAANVFRAYGITILVTVIGTSINVAMSALLAYPLSLKNLPGKRILNFYIFFTMLF NGGLVPTYLMYTGIFHINNTLLAYIVPGLLMSAMNVMLIRTFFATSIPDALFEAAQIDGA SQFQIFFKIVLPLGKPILIAMGLLSGLGYWNDWTNGLYYIRDTKLYGIQNLLNKMISDLT ALTQNASGASTVAIADIPSASVRMAIAFVAMLPILCLYPFLQRYFTKGIALGAVKG >gi|226332941|gb|ACII01000078.1| GENE 24 28104 - 29009 822 301 aa, chain - ## HITS:1 COG:lin2117 KEGG:ns NR:ns ## COG: lin2117 COG4209 # Protein_GI_number: 16801183 # Func_class: G Carbohydrate transport and 
metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Listeria innocua # 14 301 16 309 309 295 51.0 6e-80 MKKAKFKRWVPLYLMMAPGLIYLFINNYMPMAGLVVAFKNYNVVDGIFGSPWAGLSNFTY LFNDAWTITRNTLLYNIVFIIINLILGIAFAIFICDIRSKACKTIYQSAILLPFLMSIVI VSYITFAFFSGDNGMLNKTILPFFGKEAINWYSESKYWPVILVIVNTWKGVGYGCLIYIS SISGIDPSFYEAAELDGASKWKQIRYITLPSIMPSVITLTLLNIGRIFYSDFGLFYQVTQ NSGQLYDTTNVIDTYVYRALLQSGNIGMASAAGFYQSIVGFACVLLANVVVRKLSPENAM F >gi|226332941|gb|ACII01000078.1| GENE 25 29073 - 30551 537 492 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900035|ref|NP_344639.1| ABC transporter, substrate-binding protein [Streptococcus pneumoniae TIGR4] # 3 492 5 491 491 211 28 8e-54 MNKKVIAAGMAMVLASMGLTACGDSGSKEAKDSSDETYTVTMAYIGDKEEDTDRIEKKIN EIMKKDINMELDIEPISWGAYAETMKLILSGGEKMDIVPILVEQVNSMVNAKQVIDMSEY IDKYGNNIKELLGDTAKAANIGNYVYGVTTGREWFCQSSVIMRKDILDECGIDVSSITDY KDLTDVYATVKEKYPDMVMMASNNSATPDTKYEMNDTLTDGFGVLMDHGQDTTVVDYYET DEYKEFVETMYDWQQKGYLSKDAATTTESAENQVKAGAAFSYLAPNKPGYDTRAALLCGT EMEIAPISEPWAGTAQISYLTYGISSSSADKDKTMQCLDYLYGNADILNLLNWGEEGVDY EVVDAENNIINYPDGKDDSNTYHLAEGWQLFDQFKMHIWEGDSPDIWDETKALNESAIKS KAFGFTYDSTSVANELAALSNVKAKYAAALGSGTVDPEETLPKFIEELKKAGIEKVISTK QEQLDKWLEENK >gi|226332941|gb|ACII01000078.1| GENE 26 30807 - 32540 799 577 aa, chain + ## HITS:1 COG:FN0190 KEGG:ns NR:ns ## COG: FN0190 COG2972 # Protein_GI_number: 19703535 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Fusobacterium nucleatum # 200 575 199 550 552 116 27.0 1e-25 MYSRHRVAGYSIDHVIKFLFFLLVPVVILDIMISCFVIFSMRSQSIQSLQDTTSLYASQI DTVHISINRYLIRLLTENDDADTLLETQDKLEFISSAERVHNNIDIFLESFEKGYQIFLY NSENERMYRPTEIFQTLTQNQLVLLNKHLIQKITAPKTSFYTDAWEVLMFGKNVYFYKSF HTGSHYACCFIPADNIIQPLKNIIKEKDGFISLTSQNGTVLTNKELLKQHHIEFSRKSNS DSYETYNKGNNLIISGALIMGAFYPQIILSKFRAYEKIILLQFILVLAVLLVAVILFTSI FYMKKHVLTPIKLFLENLAHLDDSAETISLKDTNLLELEQANSQFKNLMRQIKKLKIDIY EKELEKQKTMMNYLQLQIRPHFFLNCLNTIYSMAQTQLYEEIMKMSMITSNYFRYIFQNT 
QDLVPVKNELEHIKNYMEIQKMRYGDTFSYEIHAEEGTENIRIPPILIQTFIENAIKHSN TFDDPIIIRTDIAWMTAGGNSHENIQIQVMDTGCGFSEDILQSLNSAIPLEPQNGHRIGI TNAIQRLNLLYGQGEAVITFSNLPSGGACVTILLPAI >gi|226332941|gb|ACII01000078.1| GENE 27 32577 - 34160 727 527 aa, chain + ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 4 527 5 525 526 126 24.0 1e-28 MNALLVDDDRFVVAALEKKINWEQLTITEVLTAFNIRQAQKIIEKNSIDICVCDIEMPGG SGLDLLSWVRESGKEIQFIFLTSYADFDYAKKAIELSSLDYQLKPIDFDTLFHILEKAVS KVRKNAALTQTKADSQKWKDNYRYIVDLFWKELFATTLFREPSLLETELRKKDLSYTADD LFIPVLFRLYPLSGQIMPMESSMVDFSFQNITAETFQKSCILYESIVILNTFEYVVILSG LQLEQIRQPFTENLLTLFKNLQSFLHCEIACGIAPEVSLTELPETLTRLREMRESNLNSV NKPLFLQDFVPSTTSYIPPSLEVINTFLEQKQADAALQNIENYLSKYIQASGITKNFLLH LRLDIEQIVFSYLHKNGIEAHTLFSSEERDELITKSLDAVPYMFNYLRYLIQRAVEYNSF INEKDSVVDIVLNYIHQHYSEDISRTMLADMVYLNPDYLARLFKKQTQTSIINYITTYRL EKAKELLLNPDIPVGTVALKVGYGNYSYFSKLFKDVVGCTPNEYRKK >gi|226332941|gb|ACII01000078.1| GENE 28 34229 - 35869 1297 546 aa, chain - ## HITS:1 COG:CAC1701 KEGG:ns NR:ns ## COG: CAC1701 COG0642 # Protein_GI_number: 15894978 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 53 538 67 561 566 236 33.0 1e-61 MFVCALVLTVGLAAVMGILYSNFDGQMRKELSKEAAYLAYGVEQQGTDYLKNVKDKSSRI TYINKDGTVLFDNKADADEMQNHKNRTEFQKAEKYGAGECSRYSDTLSEKTIYYALRLKD GTVLRVSGTQDSVLALVENLLLPLCGLLFLMLILSGIMASVISKRIVKPVNELDLEHPEE NKIYEELSPLLGKIHKQNRQIQKQLELAKQQQEEFSLITENMQEGLIVIDRYTMILSANS SAWNLFRVDKVCQGESVYCLDRAEDFRHAIEQVLSGEHAELILKLNGSDIQLIANPVVRG QKTEGAVILLVNVTEKLERENLRREFSANVSHELKTPLTSISGFAEIMQGGLVKCEDIPK FAGRIYKESQRLLQLVEDVIQISQLDEEKTSYTWELVDVYQVCKNAFESLKEKAQSMNVH LYICGDSMKMEAVRTLLEEAVYNVCDNAIKYNRNDGSVSIFLTQTAHEIQIVVKDTSVGI PKEDQDRVFERFYRVDKSHSKEIGGTGLGLSIVKHAVSTLNGSIVLRSEEGSGTEITMKF LKVHKE >gi|226332941|gb|ACII01000078.1| 
GENE 29 36057 - 36728 744 223 aa, chain - ## HITS:1 COG:aq_906 KEGG:ns NR:ns ## COG: aq_906 COG0704 # Protein_GI_number: 15606237 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Aquifex aeolicus # 7 217 5 215 221 109 33.0 3e-24 MSKYFERQLEELHVQLITLGSYCEKAISFSAKAIQKVQNEEIARQVFETDRDIDAKEREI ENLCLSLLLHQHPVARDLREISAALKMVSDMERIGDQAADIADLSLYVAKNTTAIPETIT QMAEDTVRMVTESVDAFVKSDLELCRNVIDRDDVVDDAFNEIKEKLADMIYGGNLDAKTG LDLLMTAKYFERIGDHAVNIAEWVEYSITGQHRNNEHQDYLNN >gi|226332941|gb|ACII01000078.1| GENE 30 36731 - 37489 334 252 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 247 2 239 245 133 35 3e-30 MSGNEKITVENMNLHYGKFHALRNINMAIPEKEITAFIGPSGCGKSTLLKSLNRMNDLVE GCVITGKVQLDGEDIYGDMDVNLLRKRVGMVFQKPNPFPMSIYDNIAYGPRTHGIRSKSK LDDIVEKSLRDAAIWDEVKDRLKKSALGMSGGQQQRLCIARALAVQPEVLLMDEPTSALD PISTSKIEELAMELKKDYTIVMVTHNMQQATRISDKTAFFLLGEVVEFDDTDKLFSMPSD KRTEDYITGRFG >gi|226332941|gb|ACII01000078.1| GENE 31 37495 - 38400 1009 301 aa, chain - ## HITS:1 COG:SP2086 KEGG:ns NR:ns ## COG: SP2086 COG0581 # Protein_GI_number: 15901902 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 37 301 5 269 271 265 60.0 1e-70 MQIDTAAAIDLSKEKKQNESSFSDQMKAYIKHPGSGVLALLTLLGAVLTFALLFFLIGYI LVKGVPYLSTDLFSLTYNSENLSLLPSLINTFILTVVSLVIAAPLGIFAAIYLVEYAKKG SKLVNVIRITAETLSGIPSIVYGLFGMLFFVTALHWGLSLLSGAFTLVIMILPLIMRTAE EALKSVPDSYREASFGLGAGKLRTIFTIVLPSAVPGILAGVILAIGRVIGETAALIYTAG TVAEVPKNLMGSGRTLALHMYVLSGEGLHMNQASATAVVILAFVLVINFLSGAVAKRIAK G >gi|226332941|gb|ACII01000078.1| GENE 32 38390 - 39247 955 285 aa, chain - ## HITS:1 COG:SP2085 KEGG:ns NR:ns ## COG: SP2085 COG0573 # Protein_GI_number: 15901901 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 4 283 6 
286 287 290 59.0 2e-78 MNHFKEKAMKCVFLIAACTSVLAVFLICAFLFANGVPAIGKIGPLKFLLGTKWKPSNDIF GILPMIVASIYVTAGAILLGVPIALFTSVFMARYCPKKIYRPLKSGIELMAGVPSIVYGF FGLILIVPLIRQIFGGTGTSMLAACVLLGMMILPTIIGVTESAIRSVPESYYEGSLALGA TKERSIFFVMLPAAKSGILAAVVLGIGRAIGETMAVVMVAGNQPRMPQGILKGVRTMTAN IVTEMGYATGLHREALIATAVVLFIFILIINLSLSLLNRRAEHAN >gi|226332941|gb|ACII01000078.1| GENE 33 39394 - 40257 1404 287 aa, chain - ## HITS:1 COG:SP2084 KEGG:ns NR:ns ## COG: SP2084 COG0226 # Protein_GI_number: 15901900 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 4 285 2 290 291 186 42.0 5e-47 MKKKIATILTAAVIGATAFTTIVSAASGDITVVSREDGSGTRGAFVELFGIEEEKDGEKV DMTTDEASVTNSTSVMMTTVAGDENAIGYISLGSLDDTVKAVKIDGVEATVDNVSNDSYK IARPFNILTSDKESDAAKDFVNYIMSSDGQKIVEDNGYIKEAADAKAYEAADGVSGKVVV AGSSSVTPVMEKLAEGYEAVNKDVTVEVQQSDSTTGVNMAAEGTADIGMASRDLKDEEKD LGLTATVIARDGIAVIVNKDNDVDELTSDQVKAVYTGETTTWEDLAK >gi|226332941|gb|ACII01000078.1| GENE 34 40573 - 40965 530 130 aa, chain + ## HITS:1 COG:no KEGG:Acfer_1645 NR:ns ## KEGG: Acfer_1645 # Name: not_defined # Def: C_GCAxxG_C_C family protein # Organism: A.fermentans # Pathway: not_defined # 5 126 7 131 132 87 45.0 1e-16 MKNLRCEKAVEKKHNGYNCAQAVACSFCKEASMDEDTLKKITQGFGAGLGTMAGTCGAIS GAAVVAGLINQDKAGTSQTVRSVMNQFKQQNGTVICKDLKGVETGKVIRSCDDCVRDAVK FLEDALKSEN >gi|226332941|gb|ACII01000078.1| GENE 35 41103 - 42074 396 323 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579203|ref|ZP_04856473.1| ## NR: gi|253579203|ref|ZP_04856473.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 7 323 1 317 317 563 100.0 1e-159 MRDETKMDVYTSGVTGAEYDAGMKQLMSNKEIIIPILQMTVPEFKTCSQEEILQCLDISS ITKDDFVSDIPNVEKDLRLTKEDSELSSLVEKLVRFDIRFKIINPKLSTEKIRVNLHIDM EAQKSYRPSNPSYPILKRAVYYVARDLSSQLSTITQTTDYSKLEKCYSIWICAEDVPKKL QNTLTEYSFSKKDIIGVADEPEEDYDLLTVIIIRLGKETEEKGIFDYLKGLFTGDIKRIQ RYSHIEWSEPFQEEASKMTGFGDMIYERGIQQGMQQGIQQGRHEGMILGALMSGKTPEEV SKMLNLPLEEIKKVQEQQMTVNK >gi|226332941|gb|ACII01000078.1| GENE 36 42277 - 42630 520 117 aa, chain + ## HITS:1 COG:lin2520 KEGG:ns NR:ns ## COG: lin2520 COG1393 # Protein_GI_number: 16801582 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Listeria innocua # 6 111 7 112 117 114 52.0 4e-26 MKVLVYRKCSTCMKALKWLDEHNINYEERAIKEENPTYEELKKWYAKSGLPLKKFFNTGG MIYKEMGLKDKLKDMSEDEQLKLLATDGMLVKRPLVIGDDFVLTGFREKEWSEAMHM >gi|226332941|gb|ACII01000078.1| GENE 37 42836 - 43513 485 225 aa, chain + ## HITS:1 COG:CAC0450 KEGG:ns NR:ns ## COG: CAC0450 COG0745 # Protein_GI_number: 15893741 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 220 1 222 227 153 39.0 3e-37 MEQLLIIEDDIGLNQGLSKALKADDRQIISCHDLKAAREQLLCGSVSLILLDINLPDGSG LELLREVKENMPGVPVILLTANDTDLDIVDGLERGADDYITKPFSLSVLRARVNTQLRKQ ASNHKNVSIHIDLFHFDFEAMTFYVGDSKVELSKTEQKLLRLLVENRGRTMTRGDLVDRI WTDGAEYVDENALSVTIKRLRDKLGAQKYIKTVYGIGYSWVIKDE >gi|226332941|gb|ACII01000078.1| GENE 38 43506 - 44405 314 299 aa, chain + ## HITS:1 COG:CAC0451 KEGG:ns NR:ns ## COG: CAC0451 COG0642 # Protein_GI_number: 15893742 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 8 295 115 408 416 130 28.0 3e-30 MNKTGIVIILLCFLAAAAVVLWERRKARKTMEEIERMLDAAMTGSFSETNFDESQLSALE TKFAHYLSAAEASYRNVAQEKDKIKTLIADISHQTKTPIANLLLYSELLMEETLPASTKA 
NVEALYKQSEKLRFLIDSLVKLSRLENGIISLSPQQAALQPLLESVVEQYAAKASEKGLS LQLQDTNAFAVFDFKWTAEALANIVDNAIKYTEHGTIRISAVSYELFERIDISDTGTGIS ESEQAKIFARFYRSNSVQKQEGVGIGLYLARQIISGEGGYIKVASVPGKGSTFSIFLPK >gi|226332941|gb|ACII01000078.1| GENE 39 44541 - 45212 344 223 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 222 1 220 245 137 36 2e-31 MEVLQAKNLRKIYGSGNNAVHALDGVDLSVKKGEFVAIVGTSGSGKSTLLHMLGGLDRPT SGTVMVDGQDIFSLREEALTIFRRRKIGFVFQTYNLVPVLNVYENIVLPIELDGGKVNKD FVQQIVQTLGLDDRLDALPNQLSGGQQQRVAIARALAAAPAIILADEPTGNLDSKTSQDV LSLLKVTSQKFAQTIVMITHNEEIAQMADRIIRIEDGRIVSQN >gi|226332941|gb|ACII01000078.1| GENE 40 45228 - 47873 1433 881 aa, chain + ## HITS:1 COG:CAC0527 KEGG:ns NR:ns ## COG: CAC0527 COG0577 # Protein_GI_number: 15893817 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 1 440 1 448 863 100 23.0 1e-20 MNVKNRKCIRKLSLKSLYANRRRNLIAIFAIALTTLLFTSMFTIVLSLNASYETYQFRQV GGYAHGTFKDVSPEQAERIAAHPKVKATGARKVIGITAEGVFAKIPAEISYMDANCTKWS YATPTTGRMPESGKEVAMDTAALQLLGVTPELGAEVTVSYSITDKDQTAFTVTDTFTLVG YWDYDELMPVHYINISRDYADDIEAQAVKTGLQPFRTDLNVMLTSGTNIQGQMEQVDTDL GYTWDSYTDPNSVRIGVNWGYTSSQLESQLDPELMIAIAAFLLLVIFTGYLIIYNIFQIS VAGDIRFYGLLKTIGTTPRQLKRIIRQQALLLCLIGIPAGLLLGYGIGAVLMPVVLHSIQ LNTGITTISTSPVIFLGSMLFALLTVLLSCSKPGKMAAKVSPVEATKYTDVMQTKKKRRS IRGAKLHQMAFANLGRNKKKTVLVVLSLALSVTLFNALCAFVGGFSMEKYVSAMTCADFI VSTPDYFRYNPADEFITPEQIGEIAANTKASLSGTGYAVRKPAYLWMTEDALRQDYARYE SAEQLDTHMSRMEHRGNMVMGDTRIEALDNSLFDKLQVLDGDISPMLEPDNNAIAIAVSL DDYGNLLNPEYYPKVGDTITATYADDVKYIDSRTGELRTEDTPEEYRQKKLYGARDVEYT VCALVELPNSMSYRYSGVGYDAVLSVDTAQRDSGGAAIPMLYLFDTADEADEAEAEQYLS KLTAGEFSPLMYESKATARSEFAQFRQMFLLIGGILCAIIGLVGLLNFFNAMMTGILSRR REFAVLQAVGMTNRQLKTMLIYEGLFYAMSSVAVAFILSLAVGPLAGKMLGSMFWFFEYQ FTILPVLLTIPVFLLLGWLIPCMMYDNAAKCSVVEQLRDAQ >gi|226332941|gb|ACII01000078.1| GENE 41 47953 - 48813 734 286 aa, chain - ## HITS:1 COG:jhp0062 KEGG:ns NR:ns ## COG: 
jhp0062 COG0829 # Protein_GI_number: 15611133 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Urease accessory protein UreH # Organism: Helicobacter pylori J99 # 19 280 19 256 265 156 33.0 5e-38 MDNKFGKISRISACAALKDGRTILEDLSFTAPYKIMMPFEKENGGIQIMPLCASAGIMAG DSQEFSYHVKEGADLEVLSQSFEKIHKMDEGSAARTIEVQVDKNATLYYYPQPVIPFAQS AFDSKMTIHLEDETSRLFLLEIISCGRNAHDERFQYRRFSSKVLLYRGDKLIYRDNTSYE PDKMPMEGIGMYEGYTHMANLFLSKICSRDGESCSQESGSVKSADSAINLELQEKIWQIL DEDSEIDGGVTRLTTGDLALRIFGYRAQKLQQVAEKIKKLYEKEQI >gi|226332941|gb|ACII01000078.1| GENE 42 48826 - 49434 676 202 aa, chain - ## HITS:1 COG:HP0068 KEGG:ns NR:ns ## COG: HP0068 COG0378 # Protein_GI_number: 15644698 # Func_class: O Posttranslational modification, protein turnover, chaperones; K Transcription # Function: Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase # Organism: Helicobacter pylori 26695 # 4 200 2 198 199 323 82.0 1e-88 MSYVKIGVAGPVGSGKTALIEALSRKMAKDYSIGVITNDIYTKEDAQFLAKNSVLPVERI IGVETGGCPHTAIREDASMNLEAVDEMMERFPDIELLFIESGGDNLSATFSPELVDATIF VIDVAEGDKIPRKGGPGITRSDLLVINKIDLAPYVGASLEVMERDSKKMRGDRPFQFTNI RGDENVDKVVEWIKKNVLLEGM >gi|226332941|gb|ACII01000078.1| GENE 43 49587 - 50279 519 230 aa, chain - ## HITS:1 COG:HP0069 KEGG:ns NR:ns ## COG: HP0069 COG0830 # Protein_GI_number: 15644699 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Urease accessory protein UreF # Organism: Helicobacter pylori 26695 # 3 230 30 254 254 196 45.0 3e-50 MSEKQFYLLQVNDALFPIGGYSHSQGLETYIQRGIVHNVDTAREYITHKIKWNLAYTELL AARLAYEAAEKKDLQELLYLEELLEASRIPMEQREAARKMGSRFAKTIEKLGLSISETGI FREYLDARKGKAVNHCCIYGVFCAEMQIPLEEALTHYLYAQTSAIVTNCVKTIPLSQTSG QQLLSGCYGEFDEILKDVMNRSEEDLCLSAPGFDIRGIQHEKLYSRLYMS >gi|226332941|gb|ACII01000078.1| GENE 44 50298 - 50774 579 158 aa, chain - ## HITS:1 COG:HP0070 KEGG:ns NR:ns ## COG: HP0070 COG2371 # Protein_GI_number: 15644700 # Func_class: O Posttranslational modification, protein turnover, chaperones 
# Function: Urease accessory protein UreE # Organism: Helicobacter pylori 26695 # 1 153 1 150 170 97 36.0 1e-20 MLCEQVLGKLHDFDITGKTIEYVDIEWHEAFKKIHKKITDKGTEVGIRMDDSILARGLYQ DDVIYADNEKLVVVNTPPCEVIRVSLTPGHEKMSAKVCYEIGNRHAPLFWGENDTFITIY NEPMLVMLQKIHGVQAEKEVLKLDFDKRISASIHNHHH >gi|226332941|gb|ACII01000078.1| GENE 45 50795 - 51301 595 168 aa, chain - ## HITS:1 COG:no KEGG:Hac_1534 NR:ns ## KEGG: Hac_1534 # Name: ureI # Def: urease accessory protein # Organism: H.acinonychis # Pathway: not_defined # 1 168 1 194 195 130 42.0 1e-29 MLGVCLLFVGIVLINNGMCSLYNVDGKSTAIMNIFTGGLSLFINFVNLVQGNYYAAGTGL LFCFTYLFVALSKFLKASPIPFAWFSTFVAINAVIFGTIEGFTGSTALGITPDLRWAAIW YLWAILWGTAFVEDICGKKLGKFVPYLQVFEGIVTAWVPGVMMLLQIW >gi|226332941|gb|ACII01000078.1| GENE 46 51560 - 53281 2257 573 aa, chain - ## HITS:1 COG:HP0072 KEGG:ns NR:ns ## COG: HP0072 COG0804 # Protein_GI_number: 15644702 # Func_class: E Amino acid transport and metabolism # Function: Urea amidohydrolase (urease) alpha subunit # Organism: Helicobacter pylori 26695 # 4 573 3 569 569 815 73.0 0 MSTKISGSKYAAMYGPTTGDKVRLADTSLVIEVEKDYTTYGDEVKFGGGKTIRDGMGQSV KTCSKDGDLDLVITNALIVDSTGIVKADIGIKDGKIAGIGKAGNPDIMDGVTPGMTVGAS TEALAGEGMIVTAGGIDTHIHFISPQQIDCALYSGVTTMIGGGTGPADGTNATTCTPGPW NLKMMLKAAEEYPMNLGFLGKGNCSDEAPLIEQVKAGAMGLKIHEDWGATPAVINHCLNV ADEYDVQVAIHTDTLNEGGCVEDTLAAIGGRTIHTYHTEGAGGGHAPDIIRAAAAPNVLP SSTNPTMPYTVNTLDEHLDMLMVCHHLDKRIPEDVAFADSRIRPETIAAEDVLHDMGIFS MMSSDSQAMGRVGEVITRTWQTASKMKDERGALPEDAGHDNDNFRVKRYISKYTINPAIT HGISQYVGSVEEGKFADLVLWNPVFFGAKPDIIIKGGMIIASKMGDANASIPTTQPVLYQ PMFAAHGKAKNEACLTFVSQAAMDENVKEKYGLEKTVVPVKGCRNISKKDMVFNDRTPEL TVDPETYKVTVDGEEITSKPAEKLPLTQLYSLF >gi|226332941|gb|ACII01000078.1| GENE 47 53294 - 53653 442 119 aa, chain - ## HITS:1 COG:HP0073_2 KEGG:ns NR:ns ## COG: HP0073_2 COG0832 # Protein_GI_number: 15644703 # Func_class: E Amino acid transport and metabolism # Function: Urea amidohydrolase (urease) beta subunit # Organism: Helicobacter pylori 26695 # 4 118 7 121 136 142 58.0 1e-34 
MKIGEIMAADREITLNEGKKTVTITVANKGDRPVQVGSHFHFFEVNKCLSFDREKAYGYH LDIPSGTSVRFEPGEEKEVQLTEMGGRQRVFGLNDLTRAQATDDTKAASMETAKLKGFL >gi|226332941|gb|ACII01000078.1| GENE 48 53668 - 53970 493 100 aa, chain - ## HITS:1 COG:alr3667 KEGG:ns NR:ns ## COG: alr3667 COG0831 # Protein_GI_number: 17231159 # Func_class: E Amino acid transport and metabolism # Function: Urea amidohydrolase (urease) gamma subunit # Organism: Nostoc sp. PCC 7120 # 1 100 1 100 100 129 66.0 2e-30 MRLNPKEQEKLMLHMAGNLAKERKERGLKLNYVEALAYISSELLELARDGKTVVELMQLG TKILTKDDVMDGVADMIGEVQVEATFPDGTKLVTVHNPIQ >gi|226332941|gb|ACII01000078.1| GENE 49 54362 - 55546 1239 394 aa, chain + ## HITS:1 COG:Cj1183c KEGG:ns NR:ns ## COG: Cj1183c COG2230 # Protein_GI_number: 15792507 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cyclopropane fatty acid synthase and related methyltransferases # Organism: Campylobacter jejuni # 11 394 3 384 387 347 48.0 2e-95 MFETLGEKAEEKAMVQFLERFTDYPFLVKFKNSEYPIGEGEPTFTVNFKETIPLAELLKS TSLALGEAYMRGDLDIEGNLYEALDHFLGQMSKFSTNESALKKIMFSSTSKKNQEKEVTS HYDIGNDFYKLWLDETMSYSCGYFIHDDDSLYQAQVNKVDYILKKLHLEEGMSLLDIGCG WGFLLIEAAKKYKVHGTGITLSHEQYTEFQKRIKDQGLEDYLTVELMDYRDLPKHNYQFD RVVSVGMLEHVGRDNYQLFLDCVEKVLKPGGLFLLHFISALKEHPGDPWIKKYIFPGGTV PSLREILNHMAEDNFHTLDIENLRLHYNKTLLHWEKNFRENIEKEKTMFDESFLRMWELY LSACAATFHNGIIDLHQILMTKGINNDLPMTRWY >gi|226332941|gb|ACII01000078.1| GENE 50 56114 - 58144 2029 676 aa, chain + ## HITS:1 COG:no KEGG:Ccel_1549 NR:ns ## KEGG: Ccel_1549 # Name: not_defined # Def: putative outer membrane protein # Organism: C.cellulolyticum # Pathway: not_defined # 15 675 17 640 644 488 42.0 1e-136 MIIYVDASVIQTGNGTKENPFKTIQEAAAKALPGDEVIVAPGLYREAVNPIHAGTADKRI TYRSAIKGQAHITGSEAVKDWENVEGTVWKAVIPNGIFGDYNPFTTLVSGDWFIATFIAH TGDVYLNEKSMYEVTTLDKVKNPQKSTISWDPDFSVYTWYAEQDEANNQTIIYANFHEKD PNKENVEISVRRNCFYPESEGIGYITLSGFRISQAATQWAPPTAYQEGMVGPHWSKGWII EDCEIYESKCSGISLGKYLQPENDNKWLKWKYKDGTQTERDCICQASYEGWDKEHIGSHI VRRCEIHDCGQTGIVGHLGGVFSVIEDNHIHHINNKQNLAGAEIGGIKMHAAIDVIFRRN 
HIHNCTRGLWLDWQAQGTRVTGNLFHDNALPNDFEAGDDAVTSVGEDIFVEVSHGPTLID HNILLSDRALKIATQGVALVHNLICGGFVSVGIGTDNGAPDIPSPRYTPYHTKHGTQVAG FMTILHGDDRFYNNIFIQKPIRPCMQDLADLMGNNGNMWDDCNVITGTFKFNGYPTFDEW NRQFEGYCGMGSETTGNCYYDHLPVWASGNLYFNGARAWEKETDAVTDTEHSVDISVEEK EDGWYLKTNLYDIIKEETDGIISTETLGMAFEPEQKYENPDGSPIIFNQDFFGNHRDVKT VAGPFTDKKASEQKLF Prediction of potential genes in microbial genomes Time: Sat May 28 19:57:09 2011 Seq name: gi|226332940|gb|ACII01000079.1| Ruminococcus sp. 5_1_39B_FAA cont1.79, whole genome shotgun sequence Length of sequence - 3969 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 61 - 1428 1151 ## COG0534 Na+-driven multidrug efflux pump - Term 1475 - 1516 -1.0 2 1 Op 2 . - CDS 1518 - 3167 1606 ## COG1621 Beta-fructosidases (levanase/invertase) - Prom 3244 - 3303 5.2 - Term 3323 - 3360 2.1 3 2 Tu 1 . - CDS 3370 - 3828 399 ## COG5577 Spore coat protein - Prom 3854 - 3913 4.3 Predicted protein(s) >gi|226332940|gb|ACII01000079.1| GENE 1 61 - 1428 1151 455 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 426 1 426 447 248 36.0 2e-65 MTTSMIKGNPLKLMLQFAFPLLLGNLLQQTYNIIDAAIVGQILGAKALASVGASSSIQFL VLGFCMGSCTGFGVPVAKYFGAEKIEKMRDYIFNGAVLCAGIAVILTALCSVLCPQILHI LSVPEDIYDNAYSYLLIIFLGIPFTILYNYLSSILRSVGDSRTPFIFLALSAVLNIFLDL FCIVVLKLGCAGAAIATISAQAISGILCLIFIIRKMKLLWLKKENRTIKGDAVKELLAMG MPTGLQFSITAIGSMVMQSANNGLGSTYVSAFTAAMKIKQFTMCPFDAIATSASVFCSQN LGAGQSDRIKKGLRCGITVGVGYGIAAGILLIFAGRTLSMLFVGKSAVAVLDASAKYLRC MGFFYWSLGILNVARMVTQGLGYSGRAVFSGVTEMIARTVVCLGFVGTFGFTAICFADQT AWITATCYIFPTCLWCIRKSTKMLASIAVRVNNAS >gi|226332940|gb|ACII01000079.1| GENE 2 1518 - 3167 1606 549 aa, chain - ## HITS:1 COG:TM1414 KEGG:ns NR:ns ## COG: TM1414 COG1621 # Protein_GI_number: 15644166 # Func_class: G 
Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Thermotoga maritima # 106 548 2 431 432 210 31.0 7e-54 MYTGNYECLELFVKPEAGKKGSVSLKKVKENKTEIKELQHIYGNWIKFPTDKDTEYEITA QDCTITFAYLSECENILKNGICVLNTTDNFKAMNKEEFFKFIGTPYREQYHFSPVVNWNN DPNGLCWFKGYYHLFYQLNPFGQEWNNMYWGHAASKDLMHWTHLPVALEPQEEILDNLAI KGGAFSGSALPVGDEVYFYLTRHIGPQDDGWDTVQYQTMTKSSDMIHFEPEKEIIREKPE GTNYDFRDPKAIKIGEKYYIVLGACVDEKGTFLLYESEDAENWKYRCPLITEETKIRTIE CPDFFPLDDKYVAMGAWMSHYDEYGRFQQCRYYVGDWNGDAMDVHTQQWVDFGSNCYAAQ SFQHEDRRILIGWISDFYGEHIATEPGAYGSMTLPRELHVKNEHVYTKPVEEVYTLLGDT VYEGTGREIKVGSIADNRYYASVSFEETGDLNILLGQDGDKSISLTAEGGKVFFKMAGVK SDKVQFVSSVEKCRNAEIFVDGRTIEVYLNDGEDVGTRLFYNSNRQGIFCLNSEKDAQVK ICEMKSIWK >gi|226332940|gb|ACII01000079.1| GENE 3 3370 - 3828 399 152 aa, chain - ## HITS:1 COG:CAC2615 KEGG:ns NR:ns ## COG: CAC2615 COG5577 # Protein_GI_number: 15895873 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Spore coat protein # Organism: Clostridium acetobutylicum # 3 151 5 149 153 103 38.0 9e-23 MILKEKERSVIEDLQTQEKSCVEKYGKYAEQARDPELKNLFRNIQQEEQKHYESLSMILS GSVPECNCNDRSGRDYQPKAVYAAMSSPEDKQHDAFLATDCIGTEKLVSGTYNDDVFAFG DSGVRKLLADIQIEEQNHAEMLYKYKTVNGML Prediction of potential genes in microbial genomes Time: Sat May 28 19:57:14 2011 Seq name: gi|226332939|gb|ACII01000080.1| Ruminococcus sp. 5_1_39B_FAA cont1.80, whole genome shotgun sequence Length of sequence - 15935 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 7, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 5 - 64 5.8 1 1 Tu 1 . + CDS 98 - 1465 1130 ## COG0534 Na+-driven multidrug efflux pump + Term 1543 - 1584 0.0 2 2 Tu 1 . - CDS 1584 - 2012 491 ## gi|253579222|ref|ZP_04856492.1| conserved hypothetical protein - Prom 2055 - 2114 3.2 3 3 Op 1 . - CDS 2117 - 2428 547 ## COG0393 Uncharacterized conserved protein 4 3 Op 2 . 
- CDS 2409 - 3050 674 ## CLB_0971 hypothetical protein - Prom 3137 - 3196 2.6 5 4 Op 1 . - CDS 3204 - 4517 1388 ## COG1757 Na+/H+ antiporter 6 4 Op 2 . - CDS 4608 - 5462 1002 ## COG1284 Uncharacterized conserved protein - Prom 5494 - 5553 5.0 - Term 5531 - 5568 2.0 7 5 Op 1 . - CDS 5651 - 6304 525 ## Shel_11370 hypothetical protein 8 5 Op 2 . - CDS 6306 - 6815 298 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 6993 - 7052 8.7 9 6 Op 1 . - CDS 7098 - 11537 4573 ## CPR_0357 hypothetical protein 10 6 Op 2 . - CDS 11530 - 12270 768 ## CPR_0356 hypothetical protein 11 6 Op 3 . - CDS 12276 - 13874 1551 ## CPR_0355 hypothetical protein - Prom 13929 - 13988 2.2 12 7 Op 1 . - CDS 14025 - 14645 303 ## HRM2_34640 hypothetical protein 13 7 Op 2 . - CDS 14722 - 15651 920 ## COG0530 Ca2+/Na+ antiporter - Prom 15866 - 15925 9.7 Predicted protein(s) >gi|226332939|gb|ACII01000080.1| GENE 1 98 - 1465 1130 455 aa, chain + ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 7 438 9 436 457 223 34.0 8e-58 MKRIDFENGTVTNNILSAALPMLVAQILNLLYNIVDRIYIARIHDIGTTALGAVGLCFPI IMIITAFSNLFGSGGAPIFSINRGKGDSRTADMIMNTAFTMLCGSAAVLMLIGFLFARPL LTLFGASDDALVYAYPYLMIYLLGTLPSMIATGMNPFINAQGYAIIGMLSVTVGAVANII LDPIFIFVLDMGIKGAAIATVISQILSALLVFYFLHGKSELKVRWIHINEISECTRHARD IISLGSAGFIMQLTNSLVSICCNNVLSVTGGDIYISVMTIISSVRQMVETPIHAINEGTS PVLSYNYGARRPDRVKKAGVVLIIMVLIYTGIMWSVILIAPEFLIGIFSSDKLLLKDAVP ALKLYFAAFIFMDLQYIGQTVFKSLNKKKQAIFFSLLRKVFIVVPLTYFLPYGLHMGTDG VFMAEPVSNVIGGSLCFITMLVTILPELNKMGNSR >gi|226332939|gb|ACII01000080.1| GENE 2 1584 - 2012 491 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579222|ref|ZP_04856492.1| ## NR: gi|253579222|ref|ZP_04856492.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 142 1 142 142 268 100.0 7e-71 MSDNKNVNQDKGLQGNEKIEQAIAALQQEATQEMLAHTLTVIRRRMRENGQFILSVEPPT GDNQLRIGTVKTGDGKVWWAAFTSFEEELKGGGSVQSTFLTDIDQLFHSALQVNEIEGII LNPWNCTIMLDKNLINIILGNV >gi|226332939|gb|ACII01000080.1| GENE 3 2117 - 2428 547 103 aa, chain - ## HITS:1 COG:TM0763 KEGG:ns NR:ns ## COG: TM0763 COG0393 # Protein_GI_number: 15643526 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 1 101 1 102 106 99 57.0 2e-21 MKLLSIEYIPGVEFEALGIVKGTVVQTKNVGKDFMAGMKTLVGGEITGYTEMLNEARQIA TKRMVDEAKEMNADAVIGVKYGSSQVMSGAAEVIAYGTAVKYK >gi|226332939|gb|ACII01000080.1| GENE 4 2409 - 3050 674 213 aa, chain - ## HITS:1 COG:no KEGG:CLB_0971 NR:ns ## KEGG: CLB_0971 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 1 125 1 125 424 77 32.0 5e-13 MKIKQVEELVGITRKNIRFYEEQGLLNVERAENGYREYHRADIARLQEIKLFRKMDISIE EMRALFEKRKSLQVCLEQHLGELERRREGLVKMQEMCQRLIAEHQSLDTLNAENCLEEIE QMEKEGARFMDIKKTDIRKKRRTGAIIGAVVMILLMGFTIGLMLWANTQDPIPTGLLIFL IAIPVVIIGGILAALAGRMKEIEGGEEDEASKY >gi|226332939|gb|ACII01000080.1| GENE 5 3204 - 4517 1388 437 aa, chain - ## HITS:1 COG:FN1420 KEGG:ns NR:ns ## COG: FN1420 COG1757 # Protein_GI_number: 19704752 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 8 428 9 425 445 268 41.0 2e-71 MKKGKGIALLPIGVFLVLYLGLGILFEYVMEIPMGFYNVPIVVAFLAAIMVACLQNRALD FDKKLEIMAQGVGDKNIITMLLIFLAAGSFVGVVGRSSAESVAYCMLSLIPARFSVSVLF IVACFVSVAMGTSVGTITLLTPIAAAVSTASGFDLAFCVASVMGGAMFGDNLSFISDTTI AACNGQGCEMKDKFRENFWIALPAAVATLILILILSFQTEIQGRVIQPYHLTQVIPYVLV LIGGIVGINVFVVLLTGIVSGAFIMLIGGHTAPVEILKNMGSGVSGMFETCMVAILVAAM CALIREYGGFDALLGWIHKIFRGKKGGQLGMGLLVGTMDIATANNTVAIVMANPIAKEMA EEYGITPRKTASILDTFSCVFQGVIPYGAQMLVAISAVNELGGEISAFQIMPKLFYPMLL LFSSLITIMRGSDRTGA >gi|226332939|gb|ACII01000080.1| GENE 6 4608 - 5462 1002 284 aa, chain - ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # 
Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 11 275 5 269 283 159 33.0 4e-39 MKKQLNYADIIKEALILTGAVAIIAAAVYFFLVPSHTSVSSISGLGIVLSNFVPLSLSAI TMILNVVLLIIGFITCGREFGVKTVYTSIVLPLFLGLFEKVFPDFGSMTNSQELDVLCYI LVVSVGLSILFNRNASSGGLDIVAKIMNKYLHMELGKAMSLSGMCVALSAALVYDKKTVV LSILGTYFNGMVLDHFIFDHNIKRRVCVITQKEEELRKFIIEDLHSGATIYEATGAYNMK KRNEIITIVDKTEYQKLMAYINHEDPKAFVTVYNVSDMRYQPKL >gi|226332939|gb|ACII01000080.1| GENE 7 5651 - 6304 525 217 aa, chain - ## HITS:1 COG:no KEGG:Shel_11370 NR:ns ## KEGG: Shel_11370 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 4 211 126 335 371 91 28.0 2e-17 MEKINCNVIQDILPLYIDDVVSDDTKELVEEHLQNCEICQRVYHETKTDLENDMKVSVQT KESSNEANDLKNFRKFLKKKKTKTILLSIAATIVCFVAVFTFMNKHVIYINYKDAGITII EDNKDEVTYRMNIKGNYRWKTSLDRETGVMTIHFEQSLWEKYVSCIFYPYDNLHEILKKD EIKEVYVDKDGTTTTIWEAPEEEQKAYLSEEHTEPLG >gi|226332939|gb|ACII01000080.1| GENE 8 6306 - 6815 298 169 aa, chain - ## HITS:1 COG:BS_ylaC KEGG:ns NR:ns ## COG: BS_ylaC COG1595 # Protein_GI_number: 16078537 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 3 169 4 169 173 75 29.0 6e-14 MEKSEIETIYRKYFHDVFLYIRALSENESLAEEITQETFLKAMNNIEKFDGRKDIRAWLF VIAKNTYFRYCRRNKIYVGEEFSEIMQDSMQDTSPAVLDQIVQDETVRNLKRYAELIPEP YREVFHLRIYGELSFEQIGKCYQKSAGWARVTCHRARQMIRAQMSQEEK >gi|226332939|gb|ACII01000080.1| GENE 9 7098 - 11537 4573 1479 aa, chain - ## HITS:1 COG:no KEGG:CPR_0357 NR:ns ## KEGG: CPR_0357 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 1 1478 1 1463 1463 698 33.0 0 MSKINAVRLINVNYNNNAYRISDETLHFNGKSTLISLQNGGGKSVLVQMLTAPFVHPKYR NTKDRLFESYFTTNKPSFILVEWALDQGAGYVLTGLMVRKSQDMEEDRKENLDIIGIVSE YQSPCIQDIHHLPVVEKGKKEMILKNFNSCRQLFETYKKDRDMKFFYYDLTNYAQSRQYF NKLMEYQINYKEWETIIKKINLKESGLSDLFADCKDEKGLTEKWFLEAIESKLNKDKNRI 
KEFQSILEKYAGQYKDNRSKIKRRDTIRQFKEEANGIQTQAEEYQNAETEEGKQQNKIAW FIHDVNGLLSQTEKAHDHAKEVLEGLDQKLSRVEYEELSSQVYDYEEEKNLHIGNRDMID MERENLQVQAEQTEKKLHILQCARQQDSVSEEKGELDLLREKIAVARQQGADLEPERKAL GLTLKNIFEERIQDNRQQQENLSENIQKKGKEAQDEANRVEELEEKIRDCFGKKGKLESR IESYTRLENQYNAKYQEELVRNILGTYEAGTLEIRRQMYEQELEKTVRERTENQKKWEDD KELIRRQERSLEDKKEALIHKKTETENQEHTYQLYQNELEIRKNILKYLDMEERHLLDTD RILENSARKLRELEELRRNLEKEEDSLEKEYLSLTSGKVLELPEELEKELDDLGIHTVYG MEWLKKNGYSQKKNQMLVRKNPFLPYALILTRQEIEKLGRNDRNVCTSFPVPIVEREKIE EFQEKYTDKIVNFPGISFYILFNENLLDEEKLQAMIWEKKKELEKLNQAVDQRKKEYAEY FQRQETLKNQSVTKEKWEEIQELLKTLEEEKKSLEKDIRDAAQELDSLKTAHEKLQNLLI KSAAEIDRQKQRLEDFSQLEKDYAFYENNRNELEKCKKDETRFRENQKLAKDRQEKLLEE KMTLEHNLNMLEREKDKLDEKYTRYASYEAGSRESDEVRKQQSQQFGFASMSTEQMEARY EAITSVLSQELKNLEEQERRSAARYQREKEDLDYLQKKYHLKPEQWSGVIYDRKEESHQE GVLEDFRRKIQTKDMQWNDADKKAAVAQSKISELTKRIHSVCGMEKPLPKDEIQGQDFQA RKNQLLYEKEEEKKQEEFLSGKLRSYEENLTALSEYNELIPVAAEDHEPVSENLSAEELR NCKGILIRDYNQKMRDTGQKKEELVRTLNKIVRMESFQDDFYRKPLEQMLELSDDAVRVL TQLKTTVQSYDSLMEKLEVDISVVEREKERITELLEDYVREIHSNLGKIDHNSTITIRKR NIKMLKIQLPDWEENAGLYRLRLEDFIDKITMEGVELFEKNENAQEFFGSGITTRNLYDQ VVGIGNVQIHLYKIEAQREYPITWKEVSRNSGGEGFLSAFVILSSLLYYMRRDDTDIFAD KNEGKVLIMDNPFAQTNASHLLIPLMDMAKKSNTQLICLTGLGGESIYNRFDNIYVLNLI AASLRGGTQYLKAEHKRGKEPDELVTARIEVGDQMELLF >gi|226332939|gb|ACII01000080.1| GENE 10 11530 - 12270 768 246 aa, chain - ## HITS:1 COG:no KEGG:CPR_0356 NR:ns ## KEGG: CPR_0356 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 1 246 1 235 235 253 58.0 5e-66 MAYEAEEIRTSQEIFYYLLEHHELREEDDQLLYKAYTEEESIRNLVKSQGEIAGSNVERY GNVIYLIPKEENNFLGFSKQQLKGTLCKSGATDKDYYLSQFVILTLLVEFFDGQGSSSKS REYMRVGELMNCLSARLKEGAAEETTTEEEEDGKNPKDTAGISFKILYETYEALKSDDRG SRMKTTKEGFIYNILMFLQKQGLIEYVVQDEMIKTTRKLDNFMDWNLLNQNNYQRVLRVL GVAKDE >gi|226332939|gb|ACII01000080.1| GENE 11 12276 - 13874 1551 532 aa, chain - ## HITS:1 COG:no KEGG:CPR_0355 NR:ns ## KEGG: CPR_0355 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: 
not_defined # 1 532 13 538 549 392 44.0 1e-107 MKNVGLYGVLIQNSIQKTSWKQFGFLKFDEQMNLIFAVMLYIMEQSLREENCTMDDIGAY IDTINTRYLGKEISYDDCRKLGDFVVNVILSNEGRAMYFDGYDFEENDYHVMHISYVANR IVYLDQEVRRTSYYLTDDGYNLILSTLEIENNMKLTIHEMIFQMHLEKQSYDKAVDEIKN VFNLMRIQLQKIQEAMGKIRRNALNYSVKDYEEILLENLDTISDTKEKFQKYRELVRSRV KKLEEENINVRRLGEKEEENLENLRIIESYLNRTIDEHQKILSSHFDLKALYTRELEALS QMSLIRRFSMRNDFYDKVLENPSALGNLDYFLRPLFNREPDKVYNLNKALLYQKPSVRNE DEDTEEMLDFDEDAYLKEQEEKRRQKLKRYESSLGFLLEQASEKGEISLSDIWKNIQENE KNAEEMQQQLIPNVEIFKEIMVELIRNKEINMEVLKKERSEFIQDQTADFQLNEMLLQLS EERFSDKKIGKIEVYRIEDGSTVTFDNVFSENGVKKSIRCSNILIRIMRNEE >gi|226332939|gb|ACII01000080.1| GENE 12 14025 - 14645 303 206 aa, chain - ## HITS:1 COG:no KEGG:HRM2_34640 NR:ns ## KEGG: HRM2_34640 # Name: not_defined # Def: hypothetical protein # Organism: D.autotrophicum # Pathway: not_defined # 69 188 68 186 210 76 35.0 8e-13 MDNYEKQVYTGRELFLKYNQDKLIEKYGLKHDEEYLYLKYIETEYRINRRNGAIEYATGE EWTDCREYTVVMTIYDFLCCSGQEILPPLTGQWQPVGRFVTAGSSPSTDPFVEKYARAFS GKVEEVKQACICLGGKQTKRLAGADLTFEMPVLPEFSVLFQFWDGDEEFPPKILLLWDKV SLSYLHFETTYYLQGDLLKAILQIIG >gi|226332939|gb|ACII01000080.1| GENE 13 14722 - 15651 920 309 aa, chain - ## HITS:1 COG:BH0465 KEGG:ns NR:ns ## COG: BH0465 COG0530 # Protein_GI_number: 15613028 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Bacillus halodurans # 16 306 14 314 318 188 40.0 1e-47 MFLQVIILLAGFLFLVKGADWFVEGAASIAKKLGIPQLIIGLTIVAMGTSMPEAAVSITA AINKNAGITIGNVVGSNILNIFIILGITAVITNVAIQKSTLLYEIPFMTVITIILLIFGI TGSEVTFIEGVIFWILFLIYLGYLFVMAKKGNDQEEAEAKDNPVWKCMLLMVIGGILVVK GSDFAVSGATEIARYFGMSERFIGLTIVALGTSLPELVTSVTAARRGNTGIAIGNIVGSN IFNILFVIGTTALICTVPFESKFIIDTVIAVLCGVILWIGTFRHKELRKPCGVVMLLCYV AYFLYLCLV Prediction of potential genes in microbial genomes Time: Sat May 28 19:58:00 2011 Seq name: gi|226332938|gb|ACII01000081.1| Ruminococcus sp. 
5_1_39B_FAA cont1.81, whole genome shotgun sequence Length of sequence - 5020 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 5, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 33 - 1013 877 ## CPR_0354 hypothetical protein 2 1 Op 2 . - CDS 1088 - 1450 353 ## COG0239 Integral membrane protein possibly involved in chromosome condensation - Term 1461 - 1490 0.5 3 2 Op 1 . - CDS 1538 - 1999 363 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Prom 2020 - 2079 3.1 4 2 Op 2 . - CDS 2084 - 2749 702 ## COG3859 Predicted membrane protein - Prom 2810 - 2869 7.6 + Prom 3197 - 3256 6.9 5 3 Tu 1 . + CDS 3363 - 3461 116 ## - Term 3439 - 3496 4.5 6 4 Tu 1 . - CDS 3588 - 4529 568 ## EUBELI_20017 hypothetical protein - Prom 4670 - 4729 4.7 - Term 4709 - 4767 -0.8 7 5 Tu 1 . - CDS 4817 - 4993 71 ## gi|226322750|ref|ZP_03798268.1| hypothetical protein COPCOM_00522 Predicted protein(s) >gi|226332938|gb|ACII01000081.1| GENE 1 33 - 1013 877 326 aa, chain - ## HITS:1 COG:no KEGG:CPR_0354 NR:ns ## KEGG: CPR_0354 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 2 326 9 333 333 255 41.0 2e-66 MKRISLDDLLHTRQDLDYQEQYEYIMKLLEKGQIKPVKASKSNGKKPALYREYWMVEEQK DYSNYIEEIKYTFSTMISVDYYLAHPDTYEKDRIWVLMLNEYLKKHADALLTAESLNERS FEIWHREKFLDREQGKKILKRCGLNVEALNVYRTTEPLSYYTHTRNTPQNILILENKDPF FSMRNYLLNGHTEIFGAEIGTLIYGAGKGIIRSFQDFDLCAEPYMKHPKNTIYYFGDMDY EGIGIYENLAEKFRSRWKIIPFVPAYQAMLGKAEQIIELPETKEHQNRNISTQFFSCFDE IMVKKMEAVLDKDRYIPQEILNTADF >gi|226332938|gb|ACII01000081.1| GENE 2 1088 - 1450 353 120 aa, chain - ## HITS:1 COG:jhp1146 KEGG:ns NR:ns ## COG: jhp1146 COG0239 # Protein_GI_number: 15612211 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Helicobacter pylori J99 # 4 111 6 116 130 63 38.0 1e-10 
MNCLAVGLGGFAGAVLRYLIGLIPTGETMIFPVKTFCINVIGCIVIGAITVLAAKLSVPP RMILFLKVGVCGGFTTFSTFALESSDLIRDGHMGIALCYVLLSVLVGVLAIFAVEYLTVR >gi|226332938|gb|ACII01000081.1| GENE 3 1538 - 1999 363 153 aa, chain - ## HITS:1 COG:FN1791_1 KEGG:ns NR:ns ## COG: FN1791_1 COG0494 # Protein_GI_number: 19705096 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 3 152 2 150 158 159 53.0 2e-39 MTITTLCYIENNGKYLMLHRIKKHNDINEGKWIGVGGHAEGQESPEECLLREVKEETGLT LTSYKLRGLVTFISDKCEPELMCVFTANEYIGELTECNEGELYWIDKAVVPTLPTWEGDR VFLDLLLSGDERFFSLKLQYEGEKLVDKQVNLY >gi|226332938|gb|ACII01000081.1| GENE 4 2084 - 2749 702 221 aa, chain - ## HITS:1 COG:CAC2928 KEGG:ns NR:ns ## COG: CAC2928 COG3859 # Protein_GI_number: 15896181 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 41 192 38 183 210 90 41.0 2e-18 MFGFMVTADGGLTTAGYVITVIAGIILFFLAIYFAGKNSDKKKLTTRQLVFCAVAIALAF ITSYLKIFKLPWGGSVTLCSMLFIVLIANWYGVGTGIMAGFAYGILQFIQEPYILSFFQV CCDYILAFAALGLAGLFAKQSHGLLKGYIIAVIARGAFHSLGGYLYWMSYMPDNFPKSLT AVYPILYNYSYLLAEGIITVIIISVPAVSKALIQVKRAALQ >gi|226332938|gb|ACII01000081.1| GENE 5 3363 - 3461 116 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSIVDFIAVVSFGLTCFGLGYTFGKDNNKPQK >gi|226332938|gb|ACII01000081.1| GENE 6 3588 - 4529 568 313 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20017 NR:ns ## KEGG: EUBELI_20017 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 115 7 115 177 127 57.0 8e-28 MDTKKIGAFLKQCRKEKNLTQEQLAEKFGVSARTVSRWETGINMPDLSILVQLAEYYDVE MRELLDGERSQTMNKEMKETLDKVAVYEEWVKQKALKAGNLAFASMFVISVLAIIIQMLL TVDIRLVLGETATALVGGILYASIMVYNGIWDKCLPKSATIWRDFLTSVICAGIFTVIYG VCLFRMGATETQITRLALGFLIGITIVAFIILRLLAFINRKRNQNLSNVQEKKEISLSQV EWTKIYNAQNIVETEQLVEMLKQNGIAAFSQEAGANVAMHGAPGFGIYGVDIFVKTDDAE KAVQLIKEINNQE >gi|226332938|gb|ACII01000081.1| GENE 7 4817 - 4993 71 58 aa, chain - 
## HITS:1 COG:no KEGG:no NR:gi|226322750|ref|ZP_03798268.1| ## NR: gi|226322750|ref|ZP_03798268.1| hypothetical protein COPCOM_00522 [Coprococcus comes ATCC 27758] # 3 44 5 46 106 70 90.0 4e-11 MFYIVFNFAMAVIMFLFGIWFYRSKGQASNFLSGDNMKSAEERKNMMKMLCVRHMGRE Prediction of potential genes in microbial genomes Time: Sat May 28 19:58:28 2011 Seq name: gi|226332937|gb|ACII01000082.1| Ruminococcus sp. 5_1_39B_FAA cont1.82, whole genome shotgun sequence Length of sequence - 39303 bp Number of predicted genes - 31, with homology - 31 Number of transcription units - 17, operones - 9 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 193 179 ## gi|253579241|ref|ZP_04856511.1| conserved hypothetical protein 2 1 Op 2 . - CDS 258 - 608 166 ## Cthe_1487 hypothetical protein 3 1 Op 3 . - CDS 664 - 858 222 ## gi|253579243|ref|ZP_04856513.1| conserved hypothetical protein - Prom 1077 - 1136 5.0 - Term 1104 - 1149 4.2 4 2 Tu 1 . - CDS 1152 - 1499 140 ## CDR20291_1765 hypothetical protein - Prom 1606 - 1665 4.8 - Term 1517 - 1553 5.3 5 3 Op 1 . - CDS 1783 - 3897 619 ## Noca_3089 hypothetical protein 6 3 Op 2 . - CDS 3906 - 5696 339 ## Gmet_3138 type IIs restriction endonuclease 7 3 Op 3 . - CDS 5663 - 7600 857 ## COG0338 Site-specific DNA methylase 8 3 Op 4 . - CDS 7600 - 7818 242 ## COG3655 Predicted transcriptional regulator - Prom 7857 - 7916 7.8 9 4 Tu 1 . - CDS 7960 - 8181 128 ## EUBELI_01771 hypothetical protein - Prom 8417 - 8476 6.6 - Term 8754 - 8808 10.4 10 5 Op 1 . - CDS 8820 - 9194 333 ## CDR20291_1777 hypothetical protein 11 5 Op 2 . - CDS 9199 - 10614 1067 ## EUBREC_0392 hypothetical protein - Prom 10654 - 10713 4.4 - Term 10729 - 10789 10.2 12 6 Op 1 . - CDS 10851 - 11078 232 ## CLL_A0036 hypothetical protein 13 6 Op 2 . - CDS 11226 - 11402 106 ## gi|253579253|ref|ZP_04856523.1| conserved hypothetical protein - Prom 11450 - 11509 2.3 14 7 Tu 1 . 
- CDS 11533 - 12618 1079 ## EUBREC_0390 hypothetical protein - Prom 12753 - 12812 3.7 15 8 Op 1 . - CDS 13000 - 14091 950 ## COG0582 Integrase 16 8 Op 2 . - CDS 14066 - 14257 256 ## EUBREC_0388 hypothetical protein - Prom 14297 - 14356 6.8 + Prom 14303 - 14362 8.4 17 9 Tu 1 . + CDS 14470 - 15003 523 ## HS_0422 transcriptional regulator + Term 15230 - 15276 1.4 - Term 15428 - 15465 7.1 18 10 Tu 1 . - CDS 15523 - 16716 1436 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 - Prom 16819 - 16878 6.2 - Term 16914 - 16953 -0.1 19 11 Op 1 51/0.000 - CDS 17052 - 19172 2402 ## COG0480 Translation elongation factors (GTPases) 20 11 Op 2 56/0.000 - CDS 19190 - 19681 754 ## PROTEIN SUPPORTED gi|238922793|ref|YP_002936306.1| ribosomal protein S7 - Prom 19714 - 19773 1.9 21 11 Op 3 . - CDS 19885 - 20301 658 ## PROTEIN SUPPORTED gi|238922792|ref|YP_002936305.1| 30S ribosomal protein S12 - Prom 20406 - 20465 6.9 + Prom 20957 - 21016 7.9 22 12 Tu 1 . + CDS 21085 - 22098 663 ## COG0860 N-acetylmuramoyl-L-alanine amidase + Prom 22132 - 22191 3.4 23 13 Tu 1 . + CDS 22212 - 23312 881 ## COG2367 Beta-lactamase class A - Term 23397 - 23445 12.4 24 14 Op 1 . - CDS 23565 - 24794 992 ## gi|253579265|ref|ZP_04856535.1| conserved hypothetical protein 25 14 Op 2 17/0.000 - CDS 24808 - 25626 650 ## COG0631 Serine/threonine protein phosphatase 26 14 Op 3 . - CDS 25607 - 27514 1546 ## COG0515 Serine/threonine protein kinase - Prom 27653 - 27712 5.4 - Term 27719 - 27774 4.5 27 15 Op 1 58/0.000 - CDS 27824 - 31579 4288 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit 28 15 Op 2 . - CDS 31597 - 35457 834 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 - Prom 35548 - 35607 5.1 - Term 35512 - 35576 8.1 29 16 Tu 1 . 
- CDS 35667 - 37757 1516 ## COG5434 Endopolygalacturonase - Prom 37784 - 37843 5.0 - Term 37773 - 37816 8.8 30 17 Op 1 47/0.000 - CDS 37845 - 38216 506 ## PROTEIN SUPPORTED gi|240143815|ref|ZP_04742416.1| 50S ribosomal protein L7/L12 31 17 Op 2 . - CDS 38327 - 38830 573 ## PROTEIN SUPPORTED gi|240143816|ref|ZP_04742417.1| 50S ribosomal protein L10 - Prom 38850 - 38909 4.7 Predicted protein(s) >gi|226332937|gb|ACII01000082.1| GENE 1 2 - 193 179 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579241|ref|ZP_04856511.1| ## NR: gi|253579241|ref|ZP_04856511.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 63 1 63 63 107 100.0 2e-22 MWNELGISGGTAIVILIALYFVIKWAVKNGIKEAYSAITGKKTDEDIQNEKELKELGLDS EDQ >gi|226332937|gb|ACII01000082.1| GENE 2 258 - 608 166 116 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1487 NR:ns ## KEGG: Cthe_1487 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 2 114 7 121 132 70 35.0 2e-11 MFIFLSACSLLVPLSMIILGYTWKDKPPKDRQGSSGYRTTMSRMNDETWRYAHRCWGWIN FVLGIILVILSIFILILTKDDTNFEMISVYLVFLQLGIMVLTIIPTEFLLHKHFTK >gi|226332937|gb|ACII01000082.1| GENE 3 664 - 858 222 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579243|ref|ZP_04856513.1| ## NR: gi|253579243|ref|ZP_04856513.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 64 51 114 114 114 100.0 2e-24 MFYGVVSSRNCMIGSGIIVMGVTVVLLPFTTSETTALFGNKKARFIGRILGMILIAVGIW VGFI >gi|226332937|gb|ACII01000082.1| GENE 4 1152 - 1499 140 115 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1765 NR:ns ## KEGG: CDR20291_1765 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 11 115 21 125 125 80 40.0 1e-14 MKNNSILKYGYLMKSLLLNEEEPIYGKYGMLRKQFLKEHRSAKYQYLLLTGKLTEHLNQI DQEARKQVEILMEQMVKKQGATEELKAQDQMKWVRLMINIKSSAEEIVVKNTIYM >gi|226332937|gb|ACII01000082.1| GENE 5 1783 - 3897 619 704 aa, chain - ## HITS:1 COG:no KEGG:Noca_3089 NR:ns ## KEGG: Noca_3089 # Name: not_defined # Def: hypothetical protein # Organism: Nocardioides_JS614 # Pathway: not_defined # 1 673 4 669 706 340 29.0 1e-91 MEEQYKKTFDQMVDSLKMYQRYELKDKNRKDILDKVYVDPIENDGILNLCLKDNTTVLIG RRGTGKSTIFMRMQNELRKQNDIMTCYIDVKSIFDVAKRNYITINYLKTSNLEEIEQYSI QRQFILDFVDELINEICKNYDTIWEKFKQKLHISKSQKAIDKLKNIRKRIQDNNHLSNIE MQTLQEVMHSNVDNMSYCDKRSMKMDISLSKKKNILNAGGNRENTLSEGEENGKKYNRVF ARIFEITSIVTEIKSILQELSMRRLFLILDDYSEIEQTSLVMFCDLIVNTLHNNSDNFVK LKISAYPGRVELGELDRQKVDIRYLDYFQLYAGDKRNEMESMAVAYTERLMDTRLKIYTG KDFDYYFDTTKTSKEEYCKYLFNMTMNVVRHIGLILDYAQELSIIQGERITLNILNEASK RFYKERLVQFFEESKTAKMTYNERIESLELNKLLNQIIDKEKTIKTNIRTNQYTAVIFQK ERNNPYTSHFYIAQELEPYLGSLELNFFISKYNEMSNKSGKKVSIYALNYGLCMDENLRW GKPKGNEYRTYFIESPFNFTPVIKNFLSENKKIYCENCMHEFSEEEYNLMKKYGGTCLKC GCKNSIQEKRVLSDEERSEIEEIEKKDNLLEREQYQLLKLLQYSRKDKTATELAQELDVS WQKIGWIAKKIEEDYCYLKKEPRKGKMYYVLSDLGREYLDSITP >gi|226332937|gb|ACII01000082.1| GENE 6 3906 - 5696 339 596 aa, chain - ## HITS:1 COG:no KEGG:Gmet_3138 NR:ns ## KEGG: Gmet_3138 # Name: not_defined # Def: type IIs restriction endonuclease # Organism: G.metallireducens # Pathway: not_defined # 39 590 4 563 572 399 42.0 1e-109 MAKAKTTIKKITQNAWECTKIFKTGEIEKRIVKDPIDIWSIGNTGVRNPWRIPGGYKVYV ESNQVGKIRTAQEQKLFKEKLLLAGEIGGDPKKDADASITRKYRLMFNKYGFAYPEVLKK DGFSQKEIGKIDDITPAGWAFYHANTIQAQQECFLRGLVVQMEPLGERETYSPLRWVLKI 
MFDLFERTGDYKINYIEFAVCVQTSSPKYELETVVNKILEIRGRRKKYINKKKFDRDLIH NAWKHYFKEEKNFHEYADMNLRYLSASGILKRSGRGITVMPEYKSLAYELTKNVISDATL KERYKLLCQGAPLPTDNTEIAKKVLEDLINELEMYNIKYEIPDITLDNSKNINIVRSNLK QNIDHYKEEKYAQNQRECWYEIYEYMKLLISNNGKSKELGEDYIINVPKAEAAAYLEWIL WRAFLAIDHIVNKPYDARGFNVDQDYLPIGTAPGGKPDMIFEFDDYVIVVEVTLSTNSRQ EAMEGEPVRRHVADLVQKYKKPVYGLFVANKIDSNTAETFRMGVWYTTQDERLELHIVPL TLTQFSEYFKFIFTENFAEPQKIVDLMCNCENYRKICEGPEWKKCINEVVQIMTKR >gi|226332937|gb|ACII01000082.1| GENE 7 5663 - 7600 857 645 aa, chain - ## HITS:1 COG:all0061 KEGG:ns NR:ns ## COG: all0061 COG0338 # Protein_GI_number: 17227557 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Nostoc sp. PCC 7120 # 9 266 7 266 277 213 43.0 8e-55 MEQVLKKEKQIQAKPIMKWAGGKTQMLGDIMPKIPQKYGKYIEPFIGGGALFFALSPDKS IIADSNPELINMYRQVADNVEAVISYLKKYKNTKEDFYEVRSLDWLKLKKEEAAARTIYL NKTCFNGLYRVNKKGQFNVPFGKYKAPNFCDEEALFAASDVLKKATITCGDYLSVLKEYA EPGDFIFLDPPYLPISEYSDFKRYTKEQFYEEDHVELAKEVKRLQELGCHVILTNSNHPL VHELYADYKIEVIQTKRYISCNGSKRKGEDIIVDILPKQKTMLKIVPKPLPEQVMKYPAT RYMGSKSKLLPQIWAVASQFNFDSVVDLFSGSGIVGYMFKAQGKTVISNDYMAMSATFTK AMVENNNVVLPLDEAKKLLIEKRESDHFVEEIFRGLYYKDEENRLIDVLRTNIAAIRDQY KKAIAMTALIRACTKKRPRGIFTYTGDRYNDGRKDLQKSLEQQFLEAVESINNAIFDNGC ENKSKHGDAMEVKIKHPDLVYIDPPYYSPLSDNEYVRRYHFVEGLARDWKGVEIQENTVT KKFKSYPTPFSTRKGAADAFDKLFKKFSDSILIVSYSSNSQPTQDEMVALMSKYKEHVEV VPIDYTYSFGNQKHAKTNRNKVQEYLFVGFNGEDVWQKQKLQLKK >gi|226332937|gb|ACII01000082.1| GENE 8 7600 - 7818 242 72 aa, chain - ## HITS:1 COG:SPy0544 KEGG:ns NR:ns ## COG: SPy0544 COG3655 # Protein_GI_number: 15674643 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 1 66 1 66 69 71 57.0 4e-13 MAVSYKKLWKLLIDRDMKKKDLCEAAGISHASVAKLGKNENVTTDVLVKICTALKCDISD IMEIIEIEREKA >gi|226332937|gb|ACII01000082.1| GENE 9 7960 - 8181 128 73 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01771 NR:ns ## KEGG: EUBELI_01771 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: 
not_defined # 1 66 117 182 182 95 62.0 8e-19 MIGDIRKKGYVLPLGMNSMQKFVDTGFKFKEIVIKEQHNCRSTDYWEGKERKFLMLAHEY IFILEKADDHNPI >gi|226332937|gb|ACII01000082.1| GENE 10 8820 - 9194 333 124 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1777 NR:ns ## KEGG: CDR20291_1777 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 9 123 10 124 125 99 49.0 5e-20 MKKQIYDEKNGLSYTLHGDYYLPDLEINEEEPTYGKYGIMRKQFLKEHRSARYQYLVLTG KLTEQLNQVDKEAREKVEMLMEQMVEQWGVTEELKMHEQMEWVRRKDNIQAIAEEIALKE IIYL >gi|226332937|gb|ACII01000082.1| GENE 11 9199 - 10614 1067 471 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0392 NR:ns ## KEGG: EUBREC_0392 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 9 436 1 443 449 234 37.0 7e-60 MIKRTISGMIGVGSLAHNRRDFVAENVNPDRVQLNICYKNENLKEVYKELFDDAVERYNV GKRKDRQIANYYEKIRQGKQEKLFHEVIFQIGNREDMAVGTTEGDLAVRVLDEYMKDFQK RNPSLRVFSCYLHQDEATPHLHIDFVPYVTNWKGKGMDTRVSLKQALKSLGFQGGNKHDT ELNQWINHEKEVLAEIAKQHGIEWEQKGTHEEHLDVYNFKKKERKKEVQELEQEKEYLTA ENEGLTSQIADARADIKLLEEEKIQFQKDKETAEKRAGKAEMELKKLEDRREFLQPVMDN VSKEIKEYGMIKTFLPEATTLERAVTYRDKKIKPLFIEMKNKIGAMAVQVKELTRERDSW KSKFQKKKQEHEKTKKELAEVQKDYQKLSGEKEILQELADRYNRLLRMLGKDMVERLVQD DIRIQAELEAKKQKEQMPKKISDRIPWAKERSEEHNAQIKKNKAKYRGMEL >gi|226332937|gb|ACII01000082.1| GENE 12 10851 - 11078 232 75 aa, chain - ## HITS:1 COG:no KEGG:CLL_A0036 NR:ns ## KEGG: CLL_A0036 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 2 65 4 67 76 63 49.0 3e-09 MKQGRLGYNSSNDRYGLLSSDLWIDTGFHCGEGLEVLVDDKWIRTRMEMNSAREWYLVGT PYCGDLEYVQARIPE >gi|226332937|gb|ACII01000082.1| GENE 13 11226 - 11402 106 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579253|ref|ZP_04856523.1| ## NR: gi|253579253|ref|ZP_04856523.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 58 22 79 79 85 100.0 1e-15 MDNYDGEEKIRLGLEQKLDAMVNREMYSKFKTAPTEEEREKFRQEYLDRKGIQESFRW >gi|226332937|gb|ACII01000082.1| GENE 14 11533 - 12618 1079 361 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0390 NR:ns ## KEGG: EUBREC_0390 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 15 359 6 322 322 399 58.0 1e-110 MNMNEKNKVENCRIEPKLKLINMDSVEVEQIEWLLYPFIPYGKVTIIQGDPGEGKTTMVL QIIAKLTRGEPIFPVTDTTMKIKEKRSDEVDSGNDGLDGEDNMQEQSSMSPVHVIYQTAE DGLGDTIKPRLLAAGADCSKVMVIDDSDQPLTMADVRLEEAIVQTKAKMVVLDPIQGFLG ANVDMHRANEIRPLMKRIAVLAEKYHCAVILIGHMNKNSNGKSSYRGLGSIDFQAAARSV LIVGRVKEEPEVRVVCHTKSSLAPEGMSIAFRLDKNNGFEWIGEYDISADELLNGDGRGQ KSQKAKEFLLEILANGGMAQKKIAEEAEGRGIKGKTLWNAKRELEIDSVKRGKQWYWMLP E >gi|226332937|gb|ACII01000082.1| GENE 15 13000 - 14091 950 363 aa, chain - ## HITS:1 COG:BS_ydcL KEGG:ns NR:ns ## COG: BS_ydcL COG0582 # Protein_GI_number: 16077547 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus subtilis # 27 361 15 366 368 159 33.0 1e-38 MDALREQYEGGFKMSAYKDKTQGTWYVSFRYIDWTGKKTQKLKRGFKTKKEALNYEKEFI RKTAADMKMEMNSFIQVYFEDKKNELKENSIRNKQHMMNKHIVPYFGTRKMNEITPAEII QWQNAIQEKGYSKTYERMIQNQLNALFNHAQKIYNLKENPCKKVKKMGKSDANKLEFWTK AEYDRFIAGIEPGSEDYLIFEILFWTGIREGELLALSLSDFDMSGNLLHINKTYNRIKKR DVIDTPKTENSVRTIDIPNFLKEEVQEYAKKHYGFPEDQRLFPIVARTLQKRLKKYEALT GVKPIRVHDIRHSHVAYLIYQGVEPLIIKERLGHKDIQMTLNTYGHLYPSQQKKVAEMLD NKR >gi|226332937|gb|ACII01000082.1| GENE 16 14066 - 14257 256 63 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0388 NR:ns ## KEGG: EUBREC_0388 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 60 3 61 66 66 45.0 2e-10 MRTNYMMTVDDVMEELGVKRSKAYSVLKQLNDELAKEGYVAVRGKIPRPYWETKFYGCSQ RAV >gi|226332937|gb|ACII01000082.1| GENE 17 14470 - 15003 523 177 aa, chain + ## HITS:1 COG:no KEGG:HS_0422 NR:ns ## KEGG: HS_0422 # Name: not_defined # Def: transcriptional regulator # Organism: H.somnus # Pathway: not_defined # 3 129 4 130 146 73 29.0 4e-12 
MVGKKIRAFREFRGYSQIQLAELSGINVGTIRKYELGIRNPKPDQLEKIATALGLNVSVF LDFNIETVGDVLSLLFSIDDSVNLSLAETSDQKISLTFDNPTMQDFFRKWCQFKNVYEKE KAEILAIEDADKRQEELDKLNATQEEWKLRAMGTTIGCHTIVKKGTEGNSIKTYDLT >gi|226332937|gb|ACII01000082.1| GENE 18 15523 - 16716 1436 397 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 397 1 407 407 557 67 1e-158 MAKAKFERTKPHCNIGTIGHVDHGKTTLTAAITKVLAERVAGNVVENFEDIDKAPEERER GITISTAHVEYQTEKRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGVMAQTKEH VLLARQVGVPYIVVFMNKCDMVDDEELLELVEMEIRELLSEYDFPGDDIPVIKGSALKAL EDPNGEWGDKIMELMDAVDSYIPDPQRDTDKPFVMPVEDVFSITGRGTVATGRVEAGVLH VSDEVEIVGIKEETRKVVVTGIEMFRKLLDEAQAGDNIGALLRGVQRNEIERGQVLAKPG TLTCHTKFTAQVYVLTKDEGGRHTPFFNNYRPQFYFRTTDVTGVCNLPEGTEMCMPGDNI EMTIELIHPIAMSQGLTFAIREGGRTVGSGRVATIIE >gi|226332937|gb|ACII01000082.1| GENE 19 17052 - 19172 2402 706 aa, chain - ## HITS:1 COG:CAC3138 KEGG:ns NR:ns ## COG: CAC3138 COG0480 # Protein_GI_number: 15896387 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 5 705 2 686 687 917 64.0 0 MAKEGREYPLERTRNIGIMAHIDAGKTTLTERILYYTGVNYKIGDTHEGTATMDWMEQEQ ERGITITSAATTCHWTLEENCKPKPGALEHRINIIDTPGHVDFTVEVERSLRVLDGAVGV FCAKGGVEPQSENVWRQADTYNVPRMAFINKMDILGANFYGAVEQIKTRLGKNAICLQLP IGKEDDFKGIIDLFEMKAYIYNDEKGDDISIVDIPEDMKEDAELYHTELIEKICELDDDL MMEYLEGEEPTVERLKATLRKATCECTAVPVCCGSAYRNKGVQKLLDAILEYMPAPTDIP PIEGTDLDGNPVVRHSSDEEPFAALAFKIMTDPFVGKLAYFRVYSGTMNSGSYVLNATKD KKERVGRILQMHANKRMELDKVYSGDIAAAIGFKFTTTGDTICDEQHPVILESMEFPEPV IELAIEPKTKAGQGKLGEALAKLAEEDPTFRAHTDQETGQTIIAGMGELHLDIIVDRLLR EFKVEANVGAPQVAYKETITKAVDVDSKYAKQSGGRGQYGHCKVKFEPMDANGEELYKFE STVVGGAIPKEYIPAVGEGIEEAMKAGILGGFPVVGVYANVYDGSYHEVDSSEMAFHIAG SLAFKDAMQKAAPVLLEPIMKVEVTTPEDYMGDVIGDINSRRGRIEGMEDIGGGKMIRGY VPLSEMFGYATDLRSRTQGRGNYSMFFEKYEQVPKSVQEKILAKKD >gi|226332937|gb|ACII01000082.1| GENE 20 19190 - 19681 754 163 aa, chain - ## PROTEIN SUPPORTED ## NR: 
gi|238922793|ref|YP_002936306.1| ribosomal protein S7 [Eubacterium rectale ATCC 33656] # 1 163 1 163 163 295 88 4e-79 MIKEGSNVPRKGHTQKRDVLADPMYNNKVVTKLINNIMLDGKKGVAQKIVYGAFARIEEK AGKPALEVFEEAMNNIMPLLEVKARRIGGATYQVPIEVRADRRQALALRWLTMFSRKRGE KTMEERLANELLDAMNNTGASVKRKEDMHKMAEANKAFAHYRF >gi|226332937|gb|ACII01000082.1| GENE 21 19885 - 20301 658 138 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238922792|ref|YP_002936305.1| 30S ribosomal protein S12 [Eubacterium rectale ATCC 33656] # 1 136 16 151 154 258 94 5e-68 MPTFNQLVRKGRQTSVKKSTAPALQKTFNSLRKKAVEQSAPQKRGVCTAVKTATPKKPNS ALRKIARVRLSNGIEVTSYIPGEGHNLQEHSVVLIRGGRVKDLPGTRYHIIRGTLDTAGV ANRKQARSKYGAKRPKKK >gi|226332937|gb|ACII01000082.1| GENE 22 21085 - 22098 663 337 aa, chain + ## HITS:1 COG:BS_yqiI KEGG:ns NR:ns ## COG: BS_yqiI COG0860 # Protein_GI_number: 16079475 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus subtilis # 170 335 48 204 206 67 30.0 6e-11 MKKPSMISILLFLTGLLCIIALIFFNRTTLHQILTTEKEVDTISEQITAKHDEVKQAVDK YTTAIDTVQSNINRMKQKQQTATQKKTKEKAEIPEEASSKPSVFSQNSTSASSDSSASVF SSDAETDGHIIGIDPGHQSESVDMSAPEPNGPGSSEMKAKCTSGTQGTYSGVPEYQLNLE VSLQLKDELEQRGYQVVMTRTDNETAISNMERAQYAASQGAEIYVRIHANGDDSHTASGA LTMSPSQNNPYIPQLFEQSDRLSRCIIDSYCAATGFQNLGIQYTDTMTGINWSTVPVTIL EMGFMTSQNDDLKMNDAEFQKTMVQGIANGIDSYFAS >gi|226332937|gb|ACII01000082.1| GENE 23 22212 - 23312 881 366 aa, chain + ## HITS:1 COG:FN1584 KEGG:ns NR:ns ## COG: FN1584 COG2367 # Protein_GI_number: 19704905 # Func_class: V Defense mechanisms # Function: Beta-lactamase class A # Organism: Fusobacterium nucleatum # 127 336 9 234 264 69 30.0 1e-11 MNKKPNEKPIFLYIVIGILTVAVIALAAVSFLSLNHLVTLQQKVQTLSKTVEEISTDANT LISQADQLDELREQESSVKAATGETSPAQSESAEPSSESEQQNGTLSPSSNNTFTDNTDS SMDNLLKQVQSLLPADNGTWSVYVCNLPKDSEGMINDTPMQAASLIKLYIMGAVYENYDT LSQSHNGDEIDSNISAMITVSDNDAANTLVNWLGNGDNSAGMAKVNGFCQEHGFTSTQMN RLLLAGKENGDNYTSVKDCGTFLKQIYQTVNGTLPASTLPNADAMYYHLKMQQRKNKIPA QLPEGVGTANKTGELDTVENDAAIIYDTAKGIDLVVCFMSQNLTDTGAAQSTIAADARAI 
YGYYNE >gi|226332937|gb|ACII01000082.1| GENE 24 23565 - 24794 992 409 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579265|ref|ZP_04856535.1| ## NR: gi|253579265|ref|ZP_04856535.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 409 1 409 409 786 100.0 0 MKNRMQDLDFEQNVAFDKVQEYEFTRRAAQRFRQVVSLDSFEDEDADVIFHYLYKEMELV SFGDHLKRYIYERAELEEPFSEIPQEVYKEIVVDSFKETYTPKSMNPTSTKLSALVNNWL NQASVKRETVFLLGFGLKMTTEDVSDFLTRVLKEQDFDFYNPDEVIYWYCYSTQQGYHKA EELKKKYEILAPVEVENTQVLYGSNLCLDTEEKLIDYLARLKSKRVDPISEKSQAFQEFT KLLYHAKQIIAGLYQHDEEEKGGDKVWTAERITPSDVEKVICSGIPINKMGNLKKMSASI LAKHFSQKRFSRQRITNILSHKLPVERFDLITLEFFIVSQEMEDDDPFNRYKHFLDEIQD ILLRCGMGEIYIVNPYECFLLMCLLTDCPLAVFSEIWEKSYEEGEAEEA >gi|226332937|gb|ACII01000082.1| GENE 25 24808 - 25626 650 272 aa, chain - ## HITS:1 COG:ML0020_1 KEGG:ns NR:ns ## COG: ML0020_1 COG0631 # Protein_GI_number: 15826883 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Mycobacterium leprae # 15 252 14 235 237 91 30.0 2e-18 MAMYQIECAYTCHTGNIRANNEDNFWCFGESLPVNNEGTKGICSKIISGNRVPAMAVFDG MGGESCGEIAAFLASEEFGKFYNANKRMLRDMPEDFIDDVCEKMNQAVCRYGTEHHIWSM GSTMAMLLFTPESMFACNLGDSRIYFMDGGKLQQISTDHVFGGTAVGKAPLTQYLGLPEE LQRLEPSVTEIEHKEGYRYLLCSDGVTDMLSDSEIEAILSQDMEIPEIVNDLLKNALQKG GRDNVTIVLCEVQKLDTVRRMKEWMKNLKDKK >gi|226332937|gb|ACII01000082.1| GENE 26 25607 - 27514 1546 635 aa, chain - ## HITS:1 COG:Rv0014c_1 KEGG:ns NR:ns ## COG: Rv0014c_1 COG0515 # Protein_GI_number: 15607156 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Mycobacterium tuberculosis H37Rv # 130 304 107 279 428 101 33.0 6e-21 MDKELPVSAWPEWKIIEKIGEGSFGKVYKAQRTERGKSFYSAIKIINIPGSQSELNSVRS ETGDEQSTRQYFQNLVEECIQEISTMEYFRGNSYIVSVEDFKVMEYLDVIGWEISIRMEY LTSFMDYCAEKQLTEKEVIKLGMDLSKALEYCRKLKIIHRDIKPENIFVSRFGDFKLGDF GIARELERTMSGFSKKGTYSYMAPEMYKGEKYDSRVDIYSLGIVLYRLMNHNRLPFMSLE 
KQFITYRDKENALNKRVAGEQMSAPVDAGTQFARIIMKACAYDPAQRYQTPEELYSALDD LKNGRSGRIRQLQNGVQKAGTDKTESYSAIAQNGRAAQNVTSDAENVRKSTVQNTSVQNT RGNKKTSDFQNAAKRKRAVNNVPASSFTDKETMSVVAEQKIQPQETPSETVAIRSARAVE ARKRRRRKKQLLRRALLAIVAGTAILMMAAVYYRVSKDHAQEGQEDNPIAILDDEKYSST KSGSKDFSKALDTIKEQATTITDNLNSYDWVGTEGTVLRYLRRVKTTQQADGNLDIESEC MKALVYPAESGDGVYEEYFYWGEKLFFAYIWYDETSEYYYYNDGELIRWIDSNGKCHDNE KDNEEFVKRGEKYWNNSLKALKGEGTKTDNGNVSD >gi|226332937|gb|ACII01000082.1| GENE 27 27824 - 31579 4288 1251 aa, chain - ## HITS:1 COG:CAC3142 KEGG:ns NR:ns ## COG: CAC3142 COG0086 # Protein_GI_number: 15896391 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Clostridium acetobutylicum # 16 1202 6 1181 1182 1637 68.0 0 MAETTNANETYQPMTFDAIKIGLASPEKIREWSRGEVLKPETINYRTLKPEKDGLFCERI FGPSKDWECHCGKYKKIRYKGVVCDRCGVEVTKSSVRRERMGHIELAAPVSHIWYFKGIP SRMGLILDISPRTLEKVLYFANYIVLDPANSGLQYKQVLTEREYQDAREAYGYDFRVGMG AESIKELLEAIDLEKDSVELKKELKDATGQKRARIIKRLEVVESFRESGNRPEWMIMTVI PVIPPDLRPMVQLDGGRFATSDLNDLYRRIINRNNRLKRLLELGAPDIIVRNEKRMLQEA VDALIDNGRRGRPVTGPGNRALKSLSDMLKGKSGRFRQNLLGKRVDYSGRSVIVVGPELK IYQCGLPKEMAIELFKPFVMKELVANGTSHNIKNAKKMVEKLEPAVWDVLEDVIKEHPVM LNRAPTLHRLGIQAFEPILVEGKAIKLHPLVCTAFNADFDGDQMAVHLPLSQEAQAECRF LLLSPNNLLKPSDGGPVAVPSQDMVLGIYYLTQERPGNKGEGKFFKSVNEAILAYENKVI TLQSRIHVRCSKTMPDGSVLTGTVESTLGRFLFNEILPQDLGFVDRSVPGNELLLEVDFL VAKKQLKQILEKVINTHGATKTAEVLDAIKATGYKYSTRAAMTVAIADMTVPPQKPEMIK QAQDTVDLITKNYKRGLITEEERYKEVVDTWKKTDDELTHALLSGLDKYNNIFMMADSGA RGSDKQIKQLAGMRGLMADTTGHTIELPIKSNFREGLDVLEYFMSAHGARKGMSDTALRT ADSGYLTRRMVDVSQDLIVRETDCCENRDEISGMYVESFVDGKEEIEGLQERITGRFSCE TIKNKDGEVIVKANHMITPKRAARIMKEGVDNQTGGSIEKVKIRTILTCKCKVGICAKCY GANLATGEPVQVGESVGIIAAQSIGEPGTQLTMRTFHSGGVAGGDITQGLPRVEELFEAR KPKGLAIITEIPGVAVINDTKKKREVIVTDQETGESKTYLIPYGSRIKITDGQILEAGDE LTEGSVNPHDILRIKGVRAVQDYMIREVQRVYRLQGVEISDKHIEVIVRQMLKKIRIEDN GDTEFLPGTLVDFLEYEDVNEQMEKEGKQPADGKQVMLGITKASLATNSFLSAASFQETT KVLTEAAIKGKVDPLIGLKENVIIGKLIPAGTGMKRYRDVRLNSTMPEPEVKVEDIAEEL 
TEDEEIKDDEMVTMDEELPEDEEFDVEESDEEEAAEDEADDQASDDAEDNE >gi|226332937|gb|ACII01000082.1| GENE 28 31597 - 35457 834 1286 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 847 1211 998 1391 1392 325 46 2e-88 MEKNRIRSVKTGKSMRMSYQRQEEVLEMPNLIEVQRDSYKWFLDEGLREVFDDISPIADY SGKLSLEFIDFTFDEKDAKYTIEQCKERDATYAAPLKVRVRLINKETEEINEHEIFMGDL PIMTVTGTFVINGAERVIVSQLVRSPGIYYAIAHDKLGKRLFSSTVIPNRGAWLEYETDS NDVFYVRVDRTRKVPITVLIRALGVGSNAEIIDLFGEEPKILASFTKDTSTNYQEGLLEL YKKIRPGEPLAVESAESLINAMFFDPRRYDLAKVGRYKFNKKLHLNRRIVGHRLAEDVVD ASTGEILAEEGTVVTKEMANTIQNSAVPYVWILGEEDRRIKVLSNLMVDIRHYLPEIEDP KSIGVTEAVYYPVLENILEENDTLEDRIAAIHRDIHDLIPKHITKEDIFASINYNMHLEY GLGNADDIDHLGNRRIRAVGELLQNQYRIGLSRLERVVRERMTTQDTENISPQSLINIKP VTAAVKEFFGSSQLSQFMDQNNPLGELTHKRRLSALGPGGLSRDRAGFEVRDVHYSHYGR MCPVETPEGPNIGLINSLASYARINQYGFIEAPYRKIDKSDPKNPRVTDEVVYMTADEED NYHVAQANEPLDEEGYFIHKNVSGRFREETQEYERHMFDYMDVSPRMVFSVATALIPFLQ NDDANRALMGSNMQRQAVPLLTTEAPVVGTGMEVKTAVDSGVCVVAKKSGTVLRSTSTDI SIKNDDGTKDDYHLTKYLRSNQSNCYNQKPIVFQGEHVEAGQVIADGPSTANGELALGKN PLIGFMTWEGYNYEDAVLLSERLVMDDVYTSVHIEEYECEARDTKLGPEEITRDVPGVGD DALKDLDERGIIRIGAEVRAGDILVGKVTPKGETELTAEERLLRAIFGEKAREVRDTSLK VPHGEYGIIVDAKVFTRENGDEMSPGVNQSVRIYIAQKRKISVGDKMAGRHGNKGVVSRV LPVEDMPFLPNGRPLDIVLNPLGVPSRMNIGQVLEIHLSLAAKALGFNVATPIFQGANEH DIQDTLELANDYVNTEDFEEFREKYKDILAPDVMQYLDENKAHRALWKGVPISRDGKVRL RDGRTGEYFDSPVTIGHMHYLKLHHLVDDKIHARSTGPYSLVTQQPLGGKAQFGGQRFGE MEVWALEAYGASYTLQEILTVKSDDVVGRVKTYEAIIKGDNIPEPGIPESFKVLLKELQS LGLDVKVLKDDNTEVHLLESVDYGDTDLRSVIEGDSRRHHDREEDYGKHGYTKQEFDGEE LVDIDEDEDEDDFIELDEALEDNDEE >gi|226332937|gb|ACII01000082.1| GENE 29 35667 - 37757 1516 696 aa, chain - ## HITS:1 COG:SP1833 KEGG:ns NR:ns ## COG: SP1833 COG5434 # Protein_GI_number: 15901662 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Streptococcus pneumoniae TIGR4 # 130 347 176 417 708 62 27.0 3e-09 MMKRKTAVIFGIIIAAQLCLTPYAGVKVQAAELEEGQMFEDSSEDSETFGDVSIPDAQSA 
EKTEENEFGDDDGFSDGDVETFSAEDADQFRENMQELVLNVQDGEDITVKLNTLLAQARD KATDEKQCKVIVPPGNYTLTGTLHMYSNIYLYAEGATITKTSPRKEILLRLGDTKKSAGG YEGYRNITIDGGTWDSNYECVEDKGGPGGFVGFRIGHATNVTVKNVTFLNNLKSHFLELA GVKNAEITGCTFRGYWKEFTGGGQECIQLDACMPRIFPGYLPYDGSVCENIVIKDNTFED VFAGIGSHSMMFDKPYKNITISNNRFNNLKKRAIWCLNYQDTVVTGNTMTNVGGGVYVRS VYTRNAHTVSGQEVSPEGNQYAENILIADNQITVLEPTVIDGKQWNGYGIWITGEVSLGS AGEIIPSEDDDPENTANPTTNGIPAGRYIIRGVTARNNTISGNCDGIKFSLAEDCSCRGN TVRLSDSHTYNNMGISVAKGSNIRVSYNNLSGGNGYGIYAVGTEALQKICRIYGNTVSGF MKDGILASGLAAGSRIEHNRVSTSAANGILVKCIDKSVVDGNYSFKNKARGILLQRCESA MVADNFVSENAINGIELNIRSNHSSVQGNVCGSNKKSGLRIAGSKGISADGNSFRSNRGY AAEFIRSHISSYKGNRFEGNGYSNRIHVRNSKIPKK >gi|226332937|gb|ACII01000082.1| GENE 30 37845 - 38216 506 123 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240143815|ref|ZP_04742416.1| 50S ribosomal protein L7/L12 [Roseburia intestinalis L1-82] # 1 123 1 125 125 199 87 2e-50 MAKLTTEEFIAAIKELSVLELNDLVKACEEEFGVSAAAGVVVAAAGAAEAAEEKDEFDVE LAEVGPNKVKVIKVVREVTGLGLKEAKEMVDGAPKVVKEGASKAEAEDIKTKLEAEGAKV NLK >gi|226332937|gb|ACII01000082.1| GENE 31 38327 - 38830 573 167 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240143816|ref|ZP_04742417.1| 50S ribosomal protein L10 [Roseburia intestinalis L1-82] # 1 166 1 166 187 225 70 4e-58 MAKVELKQPIVDEIINNVKDAESVVLVNYSGLTVEQDTALRKELREAGVVYKVYKNTMMH RAFEGTQCEELIQHLHGTNAIAISATDATAPARILDKFAKKVPALELVAGIVEGNYNDQA GIQALAGIPSREELLGKLLGSIQSPITNFARVLNQIAEKNGSEAAAE Prediction of potential genes in microbial genomes Time: Sat May 28 19:59:39 2011 Seq name: gi|226332936|gb|ACII01000083.1| Ruminococcus sp. 5_1_39B_FAA cont1.83, whole genome shotgun sequence Length of sequence - 5795 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 109 - 738 605 ## COG0035 Uracil phosphoribosyltransferase 2 1 Op 2 . - CDS 786 - 968 257 ## Cphy_1044 hypothetical protein - Prom 1157 - 1216 9.9 + Prom 1111 - 1170 4.2 3 2 Tu 1 . 
+ CDS 1213 - 2259 1047 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 4 3 Tu 1 . - CDS 2709 - 4163 1629 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain - Prom 4308 - 4367 6.9 5 4 Op 1 . - CDS 4451 - 5296 494 ## gi|253579277|ref|ZP_04856547.1| predicted protein 6 4 Op 2 . - CDS 5315 - 5569 272 ## gi|253579278|ref|ZP_04856548.1| conserved hypothetical protein - Prom 5602 - 5661 8.6 Predicted protein(s) >gi|226332936|gb|ACII01000083.1| GENE 1 109 - 738 605 209 aa, chain - ## HITS:1 COG:SA1914 KEGG:ns NR:ns ## COG: SA1914 COG0035 # Protein_GI_number: 15927686 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Staphylococcus aureus N315 # 1 208 1 208 209 273 59.0 2e-73 MENVYIMNHPLIQHKISMLRNKNTGTNEFRKLVEEIGILMGYEALQDLPVENVEVETPIE TCMTPMISGKKLAVIPVLRAGLGMVNSILTLVPSAKVGHVGLYRDPVTHEPHEYYCKLPE AIDERISVIVDPMLATGGSAEAAVDFVKKQGGKNIKFMCIIAAPEGLRRLHEAHPDVQIY CGHLDRELNDHAYICPGLGDAGDRIFGTI >gi|226332936|gb|ACII01000083.1| GENE 2 786 - 968 257 60 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1044 NR:ns ## KEGG: Cphy_1044 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 60 1 60 60 65 60.0 8e-10 MSDAVNCEYCVNYSYDDDYECYTCEINLDEDEMVKFITGNFNQCPYFKMGDEYRIVRKQM >gi|226332936|gb|ACII01000083.1| GENE 3 1213 - 2259 1047 348 aa, chain + ## HITS:1 COG:CAC2761 KEGG:ns NR:ns ## COG: CAC2761 COG1477 # Protein_GI_number: 15896017 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Clostridium acetobutylicum # 41 348 19 320 327 216 36.0 5e-56 MLAASLILCPAVFSGCSAKTENIKNTDAGSQDPISATAIKLNTAVTVTIYDSQDRELLTE CMNLCDKYEKIFSRTADDSELYQLNHRELTLVKGTEDTYQVSASLAELVSKGLDYSVLSE GAFDIAIEPLTSLWDFTAENPKVPKDSLIQAALPKCNYHNISVDTDKNEITLKTDDTAIE LGAIAKGYIADRLKDYLVSQDVKSAIINLGGNVLCIGEKTDNSAFKIGIQKPFADRSETI AVMDIRDKSVVSSGIYERCFKQNGTLYHHLLNPKTGYPYDNGLIAVTIISDQSVDGDALS 
TTCFALGLEDGMKLAESLDDVQAFFVTSDYEIQYTRDFQKKIKVTETE >gi|226332936|gb|ACII01000083.1| GENE 4 2709 - 4163 1629 484 aa, chain - ## HITS:1 COG:CAC3230 KEGG:ns NR:ns ## COG: CAC3230 COG4624 # Protein_GI_number: 15896476 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 131 429 79 379 450 200 39.0 6e-51 MAYEGQSPEMLEELPYKIVPGEIATYRDSIFLERAVVGERLRVAMGMSLRKVTEHAPISK GVEASVIEEKYYEPPLINVIKFACNSCPEKRVMITEGCQGCLEHPCVEVCPKKAVHMEGG RSHIDEDACIKCGKCLEACPYNAIIKQERPCSKACGMNAIGSDEYGRAEIDQDKCVSCGQ CLVSCPFSAIVDKGQIFQTVMALKSETPVYAIVAPAIAGQFPGMENNKIRGAFQALGFTD VREVAVGADLCTVEEAKDFLEEVPEKLPFMATSCCPSWSMMAKKLFPEQAKCISMALTPM VLTARLIKQKEPNCKIVFVGPCAAKKLEASRKSIRSYVDFVLTFEEVAGMFDAKGVDWKD IPEGEPLFRASADGRGFAVSGGVAEAVVHVVKRIDPDREVKVMNAEGLQNCKKMLQMAKI GKYNGYLLEGMACPGGCVAGAGTIQTVKKAAAALEKMKKEALFTDAYDSKYRARLESLEK FDVE >gi|226332936|gb|ACII01000083.1| GENE 5 4451 - 5296 494 281 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579277|ref|ZP_04856547.1| ## NR: gi|253579277|ref|ZP_04856547.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 281 1 281 281 500 100.0 1e-140 MIFTKKYRTGQIVLTACIVAVCMFSNVHGTEAAGPPAPKTLTVTPTEKSKSTAKSASTSK SKSTPKSKSALTQKPEATPIPASTPTPEQEAETDKQNPADQGTLSKPDHPDTISADKLVF IGDSRTEGLRDAVNDDSIWSCLSSMGYDWMVSTGVPQVEDQIEDNTAVIILMGVNDLYHV NDYVSYINSKAAEWGNRGAQTYFVSVGPVQNDPYCSNAEIESFNAAMQANLSGVTYIDVY SHLVSEGFSTVDGTHYPDSVSVDIYNYILDHLEEQRSGIWG >gi|226332936|gb|ACII01000083.1| GENE 6 5315 - 5569 272 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579278|ref|ZP_04856548.1| ## NR: gi|253579278|ref|ZP_04856548.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 84 1 84 84 162 100.0 6e-39 MRLEKVQEALKTKKIHYEYTEENGCGSIDFMFRGLRFHIWEYEDRVWGAETNVFEAGRSQ DIEGDYEEIISGEILSWPDMLPGM Prediction of potential genes in microbial genomes Time: Sat May 28 20:00:28 2011 Seq name: gi|226332935|gb|ACII01000084.1| Ruminococcus sp. 
5_1_39B_FAA cont1.84, whole genome shotgun sequence Length of sequence - 104286 bp Number of predicted genes - 97, with homology - 96 Number of transcription units - 48, operones - 28 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 236 - 290 8.5 1 1 Op 1 18/0.000 - CDS 303 - 1943 2121 ## COG2895 GTPases - Sulfate adenylate transferase subunit 1 2 1 Op 2 . - CDS 1945 - 2844 903 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes - Prom 3017 - 3076 2.6 - Term 3001 - 3046 -0.8 3 2 Op 1 6/0.000 - CDS 3094 - 3408 380 ## COG1146 Ferredoxin 4 2 Op 2 . - CDS 3405 - 5150 2127 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 5 3 Op 1 17/0.000 - CDS 5262 - 6320 1071 ## COG1118 ABC-type sulfate/molybdate transport systems, ATPase component 6 3 Op 2 17/0.000 - CDS 6322 - 7146 883 ## COG4208 ABC-type sulfate transport system, permease component 7 3 Op 3 9/0.000 - CDS 7146 - 7982 751 ## COG0555 ABC-type sulfate transport system, permease component 8 3 Op 4 . - CDS 8053 - 9051 1175 ## COG1613 ABC-type sulfate transport system, periplasmic component - Prom 9120 - 9179 9.2 - Term 9194 - 9233 7.1 9 4 Tu 1 . - CDS 9253 - 9741 556 ## EUBREC_2990 hypothetical protein - Prom 9858 - 9917 8.3 - Term 9957 - 9995 1.0 10 5 Tu 1 . - CDS 10087 - 10803 937 ## COG4816 Ethanolamine utilization protein - Prom 10919 - 10978 4.9 11 6 Tu 1 . - CDS 11172 - 12014 702 ## TDE1891 hypothetical protein - Prom 12067 - 12126 6.2 - Term 12196 - 12240 1.3 12 7 Op 1 21/0.000 - CDS 12363 - 13322 1086 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 13 7 Op 2 21/0.000 - CDS 13335 - 14831 1926 ## COG1129 ABC-type sugar transport system, ATPase component 14 7 Op 3 . - CDS 14843 - 15781 1227 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components - Term 15879 - 15924 12.1 15 8 Tu 1 . 
- CDS 15940 - 17082 1369 ## gi|253579293|ref|ZP_04856563.1| conserved hypothetical protein - Prom 17227 - 17286 5.8 - Term 17321 - 17373 10.5 16 9 Op 1 1/0.231 - CDS 17422 - 19065 1515 ## COG0784 FOG: CheY-like receiver - Prom 19105 - 19164 3.2 - Term 19155 - 19215 0.5 17 9 Op 2 . - CDS 19238 - 21043 1517 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 21226 - 21285 10.2 - Term 21293 - 21346 8.1 18 10 Tu 1 . - CDS 21455 - 22693 1489 ## COG0153 Galactokinase - Prom 22863 - 22922 7.8 + Prom 22874 - 22933 6.5 19 11 Tu 1 . + CDS 23097 - 23993 579 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 24136 - 24172 -0.8 20 12 Op 1 . - CDS 24274 - 26331 1916 ## COG1874 Beta-galactosidase 21 12 Op 2 7/0.000 - CDS 26354 - 28141 1147 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 22 12 Op 3 . - CDS 28143 - 28973 565 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Term 28989 - 29033 4.7 23 13 Op 1 38/0.000 - CDS 29049 - 29885 734 ## COG0395 ABC-type sugar transport system, permease component 24 13 Op 2 35/0.000 - CDS 29899 - 30777 745 ## COG1175 ABC-type sugar transport systems, permease components - Term 30789 - 30829 3.2 25 13 Op 3 . - CDS 30851 - 32161 1294 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 32209 - 32268 5.7 - Term 32340 - 32394 8.6 26 14 Op 1 . - CDS 32408 - 34486 2096 ## COG1874 Beta-galactosidase - Prom 34538 - 34597 3.5 27 14 Op 2 . - CDS 34599 - 35864 807 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 35959 - 36018 6.2 - Term 36056 - 36099 8.1 28 15 Op 1 . 
- CDS 36115 - 38112 1691 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 29 15 Op 2 38/0.000 - CDS 38137 - 38973 628 ## COG0395 ABC-type sugar transport system, permease component 30 15 Op 3 35/0.000 - CDS 38994 - 39890 698 ## COG1175 ABC-type sugar transport systems, permease components - Prom 39923 - 39982 4.5 - Term 39911 - 39967 14.2 31 16 Op 1 2/0.000 - CDS 39993 - 41210 1219 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 41251 - 41310 9.2 32 16 Op 2 7/0.000 - CDS 41312 - 42034 750 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 33 16 Op 3 . - CDS 42049 - 43842 1158 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 43907 - 43966 9.2 - Term 44075 - 44127 10.4 34 17 Op 1 8/0.000 - CDS 44176 - 44817 610 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase 35 17 Op 2 . - CDS 44831 - 45862 725 ## COG0524 Sugar kinases, ribokinase family 36 17 Op 3 3/0.000 - CDS 45932 - 47677 1661 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase - Prom 47769 - 47828 8.9 37 17 Op 4 3/0.000 - CDS 47988 - 48722 698 ## COG2186 Transcriptional regulators 38 17 Op 5 1/0.231 - CDS 48733 - 49665 735 ## COG0583 Transcriptional regulator - Prom 49730 - 49789 6.0 39 18 Op 1 . - CDS 49854 - 50975 1417 ## COG1929 Glycerate kinase 40 18 Op 2 . - CDS 51004 - 51387 267 ## gi|253579318|ref|ZP_04856588.1| conserved hypothetical protein - Prom 51548 - 51607 7.4 + Prom 51439 - 51498 6.8 41 19 Op 1 . + CDS 51629 - 51826 116 ## CDR20291_0020 AraC-family transcriptional regulator + Term 51829 - 51872 1.1 + Prom 51831 - 51890 3.9 42 19 Op 2 . + CDS 51992 - 53215 1542 ## COG1301 Na+/H+-dicarboxylate symporters + Term 53259 - 53308 7.2 - Term 53246 - 53295 7.2 43 20 Tu 1 . 
- CDS 53382 - 54773 1683 ## COG1621 Beta-fructosidases (levanase/invertase) - Prom 54798 - 54857 5.4 44 21 Op 1 38/0.000 - CDS 54892 - 55731 837 ## COG0395 ABC-type sugar transport system, permease component 45 21 Op 2 35/0.000 - CDS 55728 - 56612 781 ## COG1175 ABC-type sugar transport systems, permease components - Term 56656 - 56686 1.3 46 21 Op 3 2/0.000 - CDS 56712 - 58007 1407 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 58050 - 58109 7.7 - Term 58095 - 58146 14.3 47 22 Op 1 7/0.000 - CDS 58235 - 59848 1319 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 48 22 Op 2 . - CDS 59848 - 61680 1254 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 61790 - 61849 4.5 - Term 61740 - 61785 7.1 49 23 Tu 1 . - CDS 61852 - 62286 522 ## EUBREC_1411 lactoylglutathione lyase related lyase - Prom 62314 - 62373 7.0 + Prom 62697 - 62756 7.0 50 24 Tu 1 . + CDS 62905 - 63084 89 ## + Prom 63189 - 63248 5.0 51 25 Op 1 11/0.000 + CDS 63321 - 63875 546 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 52 25 Op 2 . + CDS 63876 - 67208 239 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 + Term 67242 - 67303 16.3 - Term 67649 - 67685 0.8 53 26 Tu 1 . - CDS 67694 - 68701 676 ## COG0523 Putative GTPases (G3E family) - Prom 68789 - 68848 5.7 - Term 68819 - 68866 6.3 54 27 Op 1 35/0.000 - CDS 68881 - 70764 2016 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 55 27 Op 2 4/0.000 - CDS 70757 - 72490 209 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 56 27 Op 3 . - CDS 72527 - 73036 442 ## COG1846 Transcriptional regulators - Prom 73082 - 73141 9.1 + Prom 73368 - 73427 3.3 57 28 Tu 1 . + CDS 73454 - 74056 801 ## COG0560 Phosphoserine phosphatase + Term 74102 - 74147 3.4 - Term 74090 - 74135 7.2 58 29 Tu 1 . 
- CDS 74185 - 74688 660 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Prom 74729 - 74788 6.8 + Prom 75026 - 75085 2.9 59 30 Tu 1 . + CDS 75109 - 76692 2239 ## COG0166 Glucose-6-phosphate isomerase + Term 76769 - 76809 3.2 - Term 76757 - 76797 1.6 60 31 Tu 1 . - CDS 76825 - 78519 1459 ## COG2200 FOG: EAL domain - Prom 78692 - 78751 6.0 - Term 78858 - 78922 10.7 61 32 Op 1 2/0.000 - CDS 78942 - 79358 416 ## COG1310 Predicted metal-dependent protease of the PAD1/JAB1 superfamily 62 32 Op 2 . - CDS 79410 - 80246 1034 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 63 32 Op 3 . - CDS 80247 - 80453 357 ## EUBREC_0898 hypothetical protein 64 32 Op 4 . - CDS 80482 - 80799 481 ## COG0526 Thiol-disulfide isomerase and thioredoxins 65 32 Op 5 . - CDS 80799 - 81050 318 ## EUBREC_0896 sulfite reductase (ferredoxin) 66 32 Op 6 . - CDS 81063 - 81932 1141 ## COG2221 Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits - Prom 81966 - 82025 3.4 67 33 Tu 1 . - CDS 82045 - 82944 484 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 - Prom 82994 - 83053 10.8 68 34 Tu 1 . - CDS 83104 - 83373 413 ## EUBREC_0173 hypothetical protein - Prom 83405 - 83464 4.1 69 35 Op 1 . - CDS 83494 - 83805 332 ## COG2200 FOG: EAL domain 70 35 Op 2 . - CDS 83805 - 84137 393 ## Elen_2229 diguanylate cyclase/phosphodiesterase - Prom 84162 - 84221 3.0 71 36 Op 1 . - CDS 84229 - 84549 185 ## Lebu_0233 putative transcriptional regulator 72 36 Op 2 . - CDS 84582 - 85355 550 ## COG1342 Predicted DNA-binding proteins - Prom 85535 - 85594 8.6 + Prom 85335 - 85394 4.4 73 37 Tu 1 . + CDS 85473 - 86132 343 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 86160 - 86217 9.1 - Term 86146 - 86207 11.3 74 38 Op 1 . - CDS 86251 - 87942 1254 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 75 38 Op 2 . 
- CDS 87965 - 88270 326 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 88299 - 88358 5.6 - Term 88365 - 88407 8.5 76 39 Op 1 . - CDS 88450 - 88623 110 ## gi|253579353|ref|ZP_04856623.1| conserved hypothetical protein - Prom 88658 - 88717 2.5 77 39 Op 2 . - CDS 88749 - 89645 431 ## LSL_1338 DNA-binding protein - Prom 89672 - 89731 4.1 78 40 Tu 1 . - CDS 90204 - 90383 189 ## gi|253579355|ref|ZP_04856625.1| predicted protein + Prom 90398 - 90457 8.8 79 41 Op 1 . + CDS 90603 - 91010 135 ## Apre_0590 hypothetical protein 80 41 Op 2 . + CDS 91007 - 91213 161 ## COG1476 Predicted transcriptional regulators 81 41 Op 3 . + CDS 91223 - 91525 190 ## gi|197302208|ref|ZP_03167267.1| hypothetical protein RUMLAC_00935 82 41 Op 4 . + CDS 91545 - 92342 776 ## COG4377 Predicted membrane protein + Term 92389 - 92418 1.4 - Term 92804 - 92839 4.2 83 42 Op 1 . - CDS 92887 - 93096 125 ## EUBREC_1224 hypothetical protein 84 42 Op 2 . - CDS 93172 - 93696 539 ## COG0655 Multimeric flavodoxin WrbA - Prom 93903 - 93962 7.2 + Prom 94051 - 94110 6.9 85 43 Tu 1 . + CDS 94237 - 95082 356 ## COG0789 Predicted transcriptional regulators + Term 95313 - 95349 -0.7 + Prom 95456 - 95515 8.7 86 44 Tu 1 . + CDS 95610 - 96248 658 ## COG4887 Uncharacterized metal-binding protein conserved in archaea 87 45 Op 1 . - CDS 96357 - 97289 428 ## COG0582 Integrase 88 45 Op 2 . - CDS 97286 - 97474 70 ## gi|291520870|emb|CBK79163.1| hypothetical protein 89 45 Op 3 . - CDS 97480 - 98091 71 ## wcw_1894 hypothetical protein - Prom 98117 - 98176 6.8 - Term 98578 - 98620 -0.2 90 46 Op 1 . - CDS 98761 - 99057 186 ## gi|225389716|ref|ZP_03759440.1| hypothetical protein CLOSTASPAR_03464 91 46 Op 2 . - CDS 99062 - 99826 654 ## COG4200 Uncharacterized protein conserved in bacteria 92 46 Op 3 . - CDS 99843 - 100568 623 ## HMPREF0424_0157 hypothetical protein 93 46 Op 4 . 
- CDS 100565 - 101329 268 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 - Prom 101441 - 101500 5.5 + Prom 101457 - 101516 6.3 94 47 Op 1 40/0.000 + CDS 101552 - 102187 448 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 95 47 Op 2 . + CDS 102192 - 103556 322 ## COG0642 Signal transduction histidine kinase + Term 103584 - 103632 8.5 + Prom 103602 - 103661 4.0 96 48 Op 1 . + CDS 103851 - 104117 247 ## gi|217388356|ref|YP_002333385.1| hypothetical protein pMG2200_24 97 48 Op 2 . + CDS 104121 - 104279 84 ## gi|253579375|ref|ZP_04856645.1| predicted protein Predicted protein(s) >gi|226332935|gb|ACII01000084.1| GENE 1 303 - 1943 2121 546 aa, chain - ## HITS:1 COG:XF1501_1 KEGG:ns NR:ns ## COG: XF1501_1 COG2895 # Protein_GI_number: 15838102 # Func_class: P Inorganic ion transport and metabolism # Function: GTPases - Sulfate adenylate transferase subunit 1 # Organism: Xylella fastidiosa 9a5c # 1 405 50 457 457 405 51.0 1e-113 MKGLLKFITCGSVDDGKSTLIGHILYDAKLLYADQEKALELDSKVGSRGGAIDYSLLLDG LMAEREQGITIDVAYRYFTTDNRSFIVADTPGHEEYTRNMAVGASFADLAVILVDASQGV LVQTRRHARICALMGIRYFVFAVNKMDLIEYDEKRFRDIENQIVALIEELKLANVTIIPV SATEGDNVTTKSDNMPWYKGEALLSHLETVDIKEDTEKGFYMPVQRVCRPNHEFRGFQGQ IENGVIRAGDLVTTLPSKEEAHVKSILVGDKEVQEAVQGQPVTIQLDREVDVSRGCVLTI DSGAVLTDSVEADILWMDDNALTDGKNFFVKIGTKMIPGLVTKINYSVDVNTGEKKSAYT LKKNEIASCTLEFSEKIVVDEFDRHRTLGELILIDRVTNMTSACGVVRKTFVSQDRSQIG KVDEQVRAGLKGQTPVVVEFPIGKEGITLDFAEQVEKGLTVLGKHTYLYHPAASENYAET VRHLKAAGLIVLLVLDENTAKDETLKNLDGFYANWQIDGITVKDAIDFVKKKSAFTVQSV HDGNYI >gi|226332935|gb|ACII01000084.1| GENE 2 1945 - 2844 903 299 aa, chain - ## HITS:1 COG:mlr7575 KEGG:ns NR:ns ## COG: mlr7575 COG0175 # Protein_GI_number: 13476292 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Mesorhizobium loti # 4 299 5 301 301 
454 72.0 1e-127 MSNLSHLDALEAEAIYIMREVAAECEKPVMLYSIGKDSSVMLHLAMKAFYPEKPPFPFLH IDTTWKFHEMIEFRDRIAKKNGIDMLVYTNEEGVKAGINPFDNGSAYTDIMKTQALKQAL TKYGFTAAFGGGRRDEEKSRAKERIFSFRNSAQAWDPKNQRPEMWKLYNTKIQKGESMRV FPISNWTEKDIWQYIQRENIEIVPLYFAKERPVIYRDGNIIMVDDDRLKLRPGEKIENKK VRFRTLGCYPLTGGIESEADTLDEIIDETLSAVSSERTSRVIDHEAAGSMERRKREGYF >gi|226332935|gb|ACII01000084.1| GENE 3 3094 - 3408 380 104 aa, chain - ## HITS:1 COG:CAC0105 KEGG:ns NR:ns ## COG: CAC0105 COG1146 # Protein_GI_number: 15893401 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 104 1 104 104 109 50.0 1e-24 MSIRIQKSKCAGCGRCIEACPGNLIKKDKENKAFIRQVRDCWGCTSCIKECRHDAIRFFL GADVGGRGASMVVSEKPDISTWTVEKPDGTKITIQVNKKDANKY >gi|226332935|gb|ACII01000084.1| GENE 4 3405 - 5150 2127 581 aa, chain - ## HITS:1 COG:CAC0104 KEGG:ns NR:ns ## COG: CAC0104 COG1053 # Protein_GI_number: 15893400 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Clostridium acetobutylicum # 6 566 8 553 559 638 58.0 0 MRTERIRTDVLIIGGGTAGCFAAYVIGQKSGRKVLIAEKANIKRSGCLAAGVNALNAYIT EGRVPQDYVDYAKKDAHGIVREDLLLTMSERLNHVTKVMEDLGLVILKDENGKYVARGNR NIKINGENMKPILAKAAAEQENVTVLNHVNITDYKTENKKVTGAVGFSVNEDIFYDIDAK AVLCATGGAAGLYRPNNPGFSRHKMWYPPFNTGAGYAMGLLSGAEMTTFEMRFIALRCKD TIAPTGTIAQGVGARQVNAHGDIYETKYGLTTSQRVYGTVMENREGNGPCYLRTEGISKE QEQDLYKAYLNMAPSQTLKWMEAGKGPSEENVEIEGTEPYIVGGHTASGYWVDNNRATTI EGLYAAGDVAGGCPQKYVTGALAEGEIAAEAILEYLNGKDDDRIAAENKEEALERGGEQE LSQTAIAKKAEYESHLNASRSLMNAEQLEEAMQKIMDQYAGGIGTDYRYSGMSLAKADEK IAALLPVVDTLAATDTYELMQIYELRERLIVCQAVIAHLAARKETRWHSFGENTDYPQQS EEWMKYVNSRMENGEIKILYRDLITKGNTGLNTAEKGAGER >gi|226332935|gb|ACII01000084.1| GENE 5 5262 - 6320 1071 352 aa, chain - ## HITS:1 COG:all0126 KEGG:ns NR:ns ## COG: all0126 COG1118 # Protein_GI_number: 17227622 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate/molybdate transport systems, ATPase component # Organism: Nostoc sp. 
PCC 7120 # 1 268 1 269 338 286 51.0 4e-77 MYVELKNINKTYGSYQASRNVNFGIEKGKLIGLLGPSGSGKTTILRMIAGLETPDSGEVI IDGKVVNDVPASQRGIGFVFQNYALFRYMTVYDNVAFGLKVQKADKKKIDARVRELIKLV GLEGLEKRYPSQLSGGQRQRVAFARALAPNPQVLLLDEPFAAIDAKIRQELRSWLKEMIG KLGITSIFVTHDQDEAIEVADEIIITNAGRIEQKGTPIGVYRNPETAFTASFFGQPSVLK NADDFHTFAQAEGADKIIVRPEFVKISRLDEVEKFKTSVSRGVVERVSFRGDNLELQVRV NNSVVTARRSLEKADIREGETVNVFLYRIFALRGDSVQLIENEAVRDDFVII >gi|226332935|gb|ACII01000084.1| GENE 6 6322 - 7146 883 274 aa, chain - ## HITS:1 COG:RSc1346 KEGG:ns NR:ns ## COG: RSc1346 COG4208 # Protein_GI_number: 17546065 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate transport system, permease component # Organism: Ralstonia solanacearum # 16 273 31 288 326 269 55.0 5e-72 MDNNEKTSDKITKWLLVGISVLFLAVMLLLPLITVITEALRSGWKVYAAAVTDEYTIKAL LLTVKATLFAVVFNTVFGVFAAWALTKFQFRGKKLLTTLIDLPVTISPVIAGLIFLLIFG RQSPVYFFLMKLGMKVVFSVPGIIIATVFVTFPFISRELIPVLEAEGSDEEEAAALMGAG GWTIFWKITFPHIKWALLYGVVLCTARAMGEFGAVSVLSGHLRGKTNTLPLHVEILFNEF QYVPAFAVSSLLVIMAVIILVVRSLIEYKGKKEA >gi|226332935|gb|ACII01000084.1| GENE 7 7146 - 7982 751 278 aa, chain - ## HITS:1 COG:AGc1503 KEGG:ns NR:ns ## COG: AGc1503 COG0555 # Protein_GI_number: 15888162 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type sulfate transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 266 12 276 286 263 54.0 3e-70 MKKRNKPVIPGFGLSMGITISILSLVVLIPLASLVIYTAKLSFKDIIETITRARVLASFY TSFVCAFAAAVINGIMGTILAWVLVKYDFPGKRLMDGMIELPFALPTAVAGIALTSLTSD SGIVGSFFAGFGIKIAYTRIGITVAMIFVGIPFVVRSVQPVLEKMDNSYSEAAGVLGAKK TTIFWKVIFPELKPAIFAGTGLAFGRCLGEYGSVVFIAGNKPYYTEITPLVIMSELQEFD YSSATAIALVMLAVSFFILFLNSMIQQRNARILRGGNA >gi|226332935|gb|ACII01000084.1| GENE 8 8053 - 9051 1175 332 aa, chain - ## HITS:1 COG:STM4063 KEGG:ns NR:ns ## COG: STM4063 COG1613 # Protein_GI_number: 16767329 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate transport system, periplasmic component 
# Organism: Salmonella typhimurium LT2 # 8 332 4 329 329 366 55.0 1e-101 MIRKLFAFGLAVVLALMPVQAFGKTVSVTNVSYDPTREMFEQYNKIFEDHWKEKTGEDVE VIQSHGGSGKQALEVANGLDADVVTLALEYDIESIENAGLIKAGWKDKFDNDSSPYTSTI VFLVKKGNPRDIKDWDDLTKKGVGVVTPNPKTSGGARWNYMAAWAYADKKYGGDEAQMKE FIKKLYRNVVVLDSGARGATTSFVENGQGDVLVAWENEAYLSMRDYPDEYEIVTPSVSVL AQPSVAVVDEVVDYRDTRDVATEYLNYLYSDEAQEIAAENYFRPTNEKILKKHSDTFDLN INLANISDYGGWDKVQEKHFTDGGIFDQIYER >gi|226332935|gb|ACII01000084.1| GENE 9 9253 - 9741 556 162 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2990 NR:ns ## KEGG: EUBREC_2990 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 153 1 153 154 146 57.0 3e-34 MPKNKFQEVIFTIIMVFVMVYAMICYNIVLNTGTMTNETFLLAFHELTFMGPIAFILDFF IIGGLAKKIAFGIVDMRRDNPFHLVLAISAVSVAFMCPCMSFAATVLIKHAPASQIIPTW LQTTAMNFPMAFFWQIFYAGPFVRFIFGKLFPEKEKAAASAE >gi|226332935|gb|ACII01000084.1| GENE 10 10087 - 10803 937 238 aa, chain - ## HITS:1 COG:lin1116 KEGG:ns NR:ns ## COG: lin1116 COG4816 # Protein_GI_number: 16800185 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 15 231 44 260 267 230 55.0 2e-60 MIDVHKISTNCTRNEFVGTAVLDTIGLVISGIDDTLLETMNVGMKYRCLGLFSSRTGAAG QITAIDDAVKATNTEVLSIELPRDTKGWGGHGNYIVLGGTDVSDVRHAISMALELTNKYA GELYISESGHLEFAYSASAGQALNKAFGVPVGEAFGFMAGSPAAIGLVMADTAMKSAPLS IAKYMTPNKGTSHSNEVILAVSGDASAAKTAVLNARQTGLELLIGMGSYPEIPGIPYL >gi|226332935|gb|ACII01000084.1| GENE 11 11172 - 12014 702 280 aa, chain - ## HITS:1 COG:no KEGG:TDE1891 NR:ns ## KEGG: TDE1891 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 4 260 3 254 254 125 31.0 2e-27 METRKFKDLTIRDAFMFAAVMSDPEICRRVLELALGIPISEVHIQTEKTMAYHSEYHGVR LDVYAADADRTRFNVEMQVTLQKFLPKRSRYYHDQIDMDALLAGDSYENLPDTYVIFICD FDPFGDGLYRYSTGMVCEETGKSVSDGVKTVYLNAHGRNRDGIPEELLQFLDYVKNTGRT EGISTTDPFVRHLQDSIDKIKQNRGMEERYMLLEEMMRNEKREGNTEGKQEFLLTALESK FSVPSELKEKIMSETDIEKLNLWFQLSLKVSSLEEFEQNM >gi|226332935|gb|ACII01000084.1| GENE 12 12363 - 13322 
1086 319 aa, chain - ## HITS:1 COG:BS_rbsC KEGG:ns NR:ns ## COG: BS_rbsC COG1172 # Protein_GI_number: 16080648 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus subtilis # 1 313 1 319 322 113 30.0 6e-25 MSTTKKSTEQVPFLERPIVRTILPIAGFVIICVLFAFLTDGRLFQPKNISLLLSQSYMLL ISSIGVFMVMTMGGLDFSQGSMLGVASIVVCYLSHYNMVLAALGGVVTGGLIGLINGYFN VKRKITSFIVTICTMYLFRGVCAYATTNSPVYAVSDISKYNTLPFMLTFTVIIFVVAYLV FTYTGLGSRLKAIGAGETAARFAGIRVETTKILVFVAAGCITGLAAFVNAIKVGSVTSTA GNQLETQIMIALVLGGMPVNGGAKVRFYNIILGVMTYKVLSSGLVMLMMPTQLQQLILGI IFLIEVAIFSDRKTGMIVK >gi|226332935|gb|ACII01000084.1| GENE 13 13335 - 14831 1926 498 aa, chain - ## HITS:1 COG:L84240 KEGG:ns NR:ns ## COG: L84240 COG1129 # Protein_GI_number: 15673619 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Lactococcus lactis # 6 480 3 472 492 318 39.0 1e-86 MGEILLRAENIDKHFGITHALDNVSFTFNKGEIHALIGENGSGKSTLTSCLTGIYQKDSG KFILEGKEITATNQVEANHQGVAIIVQEIGTLSGLTVAENIFLGNEDQFTKCGIKNTAAM NKKAQEYLDSYGFNYIDATKVIDDYNFEDRKLVEIVKSTYFNPKVLVVDETTTALGQKGR EELFKVMHKVRDTGNCVIFISHDLEEVIEQSDNISVLRDGVKIGSITKEEATPDRLKALM VGREIGDSYYRTDYGEKVSDEVVLSAKNVTVKGQIENLNLDLHKGEILGIGGLSECGMHE VGKALFGASYFRQGSVTLGDGTPINSIPDAIKHSIAYASKDRDNESLVINDTIGDNICLP SLEDLKTHGFLRAKTMNEFANKFAKQMSTKMTGVDQFVSALSGGNKQKVVLARWVGKDSD LIILDSPTRGIDVKVKADIYAMMNDMRKRGKSIIMISEEIMELLGMADRILIMKDGKING EFLRSPDLKDTDLIDNMI >gi|226332935|gb|ACII01000084.1| GENE 14 14843 - 15781 1227 312 aa, chain - ## HITS:1 COG:SMb20506 KEGG:ns NR:ns ## COG: SMb20506 COG1172 # Protein_GI_number: 16264236 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Sinorhizobium meliloti # 18 301 15 305 317 82 28.0 1e-15 MKKLKQSQYYGIGLLVLMLVVFWAVFKVLAPTTFGSPEKLATYMKSALIYAVGGCGLYFI 
CVMGPFDMSVGANIVLSSIIACNASEKFGYAGLIIAPLICGTIIGLINGVVYIKLHISSL IVTCALSLIYEAFSVYATNGKNVILSTDYRAFGDYPVNLILALIAYLLCAFILKYTKIGT YTYAIGSNEVVAKNMGVNVSKYKIVAFTLAGFFFGLQAILTISFGTSMTSASNLSSMSRN FTPLMGTFFGLAFKKYGHPVIAIVIGEMIIQLMFNGFVALGAPTTIQNVVTGVALLAIVA LTTKPVKGLVVK >gi|226332935|gb|ACII01000084.1| GENE 15 15940 - 17082 1369 380 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579293|ref|ZP_04856563.1| ## NR: gi|253579293|ref|ZP_04856563.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 380 1 380 380 705 100.0 0 MKKRMRKAVAIAATAAMVGSLPAGAVTAQAADDDITIGVSIWSSTDVLGSQSKKIIDKAA DALGVNVMYVDQGHVSEQVTASVEQLCAAGCNGIVICNSADSEMTSAIQTCDANQVYLTQ FYRIISEENSPEVYQTACNSQYYLGAVHEDEVTNGYTLVNLLLENGDRNIGLEAWTVGDA TFQQRWKGYKQAVDEWNEANPDDQATITEPVYANTSSEEGAAAALSLYNSNPDMDALIVA GGGGDPLVGSIGALANEGLTGKIDVVSTDFLDDLGEQLESGGMFAESGGHFCDPLYAFLI TYNAIKGNYVKDEGSFGYEIQFPYLYVSSSEDYENYKKYFVDDDPYTDEELVELAGYSFD DLNKAATSLSIEDVISRHSN >gi|226332935|gb|ACII01000084.1| GENE 16 17422 - 19065 1515 547 aa, chain - ## HITS:1 COG:BH3446 KEGG:ns NR:ns ## COG: BH3446 COG0784 # Protein_GI_number: 15616008 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Bacillus halodurans # 6 171 4 169 200 99 33.0 2e-20 MKNMKYKVLAADDEFWSRENIRNLIPWEEYSLEFLEPACDGEEVMERIAEEKPDIILTDI NMPFLSGLELLQRLQNEYPEIITIAISGYDDFDKVKGVFVSGGLDYLLKPVGKEEMVKVL TKALGLLEERENTKKQDETSKLQKHKLSSFLEDSEYSALLSGKLYGQSASQTHVSSTNTF SEVATLMVKFYNITEIAEQFEHDNLQMSWNIKSRLRELTGSKGNAIIFNNSNKMSEFLIV KTADAKEFRALAENILKEFPLEEYGPVSVVLHEQTSSLDDIGTVYRELISTLITRPFNRE HSILFCPEEKTGELPRNSETQMVGKSAPAHMETELYHLLTTGQKAEAEKLIFKTCNFRQC DDGNWTYLDVKQYAGRITGILYRYVQEKCPELTAQAEEAMDNIDYYMKCLNAHSLLVSLK ILLDSLWESGEDKSSDAGSIMQQVEQIHKYIERSYHENITLTALAEQYHMDASYLSRTFS QKYGETIIAFLTRIRMEKAAELMKNQDKKLETISFLVGYDDYNYFSRVFRKKVGCSPREY RNKFTQL >gi|226332935|gb|ACII01000084.1| GENE 17 19238 - 21043 1517 601 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # 
Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 303 588 291 571 577 149 30.0 2e-35 MKKEWTMRKALLVLCIVSVVFALLLQTFLFHQSLRRQIRAESISDHELSLNKMQTDLSNF IHTIRGEMLTIYSEQDLIAAMRDTAEKDSSMKDYYWRSWYFARKRFTKEDQLLAMYLYDT KDHLISSYRYNSVTYPKDIYKVEYDDNSDRVYDYVHGDRTDLMISGYHNSVAKKDIVRFV LKLHNYDADRNALGYLVCDINSAAFTSIMAKYVDVEQVCLWLQPLNDRVIAMTGEASESQ RRIQKQLAKVVQSHYKSGELEQEYDGNYLIQVSQENYNLEAFVLVSQSLLTATQKTLIRT LLIIMGAMILAIMVIVLFVSQWMTKPVEEMSSTITRIKDGETQLRVQPVGWSQELTTLGT EFNEMLDRMQVMAQEELQHKMLVERTEYKMLQAQINPHFLYNTLDTMSGIANAQNCPMVS GMCHSLSAIFRYSLNMTDELSTVQNEMAHVRNYLYVMDMRNGSTIAYDYQIDSDTLADQM PRICIQPVVENALTHGLRNVRRKDKKLLIRSEHVNENLVITVQDNGAGMDAESMNRLLEQ NDMKRVESGISIGILNVSARLKRLFGEKYGLHIESTAGEGTTVTITVPAVSTENSGDMEN V >gi|226332935|gb|ACII01000084.1| GENE 18 21455 - 22693 1489 412 aa, chain - ## HITS:1 COG:STM0774 KEGG:ns NR:ns ## COG: STM0774 COG0153 # Protein_GI_number: 16764138 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Salmonella typhimurium LT2 # 30 379 11 347 382 116 28.0 7e-26 MNIPNKNELTRIYGEAEKSAARFQAVADHFAEIYHHDIAEFFTAPGRTEIIGNHTDHNGG RVIAGSIDMDTIGAAYPNNSSVIRITSEGYDKEVVVDINDLASVPKAQGTVSLVAGMVEA IQKFGFKVAGFDAYVTTNVIRAAGVSSSASFEMLVCSIINYFFNDGAMTYINYAKAGQYA ENVYWLKASGLMDQLACAVGGPILLDFSDRENPKYEKVNFSFHDYDHHLVIVNTGKGHAD LSEEYSEIPMEMKEAAKAAGAELLCETTLEKVLANMDKIDNDRAILRAIHFFKENERVER AAKAVEEKDGETVLKLLSESGKSSWELLQNCYPIKAYTEQKISVALALTDLFLEKLGKGI CRIHGGGFAGVIMCVVPEEETENYVSYISEFAGKENVYPMNIRAVGAVHIEK >gi|226332935|gb|ACII01000084.1| GENE 19 23097 - 23993 579 298 aa, chain + ## HITS:1 COG:BS_ydeC KEGG:ns NR:ns ## COG: BS_ydeC COG2207 # Protein_GI_number: 16077582 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 1 295 1 291 291 150 27.0 3e-36 MAEIITDGSQKELKKHGSDEFPLLVSYEKLSGYKSGSFLWHWHPEIEITLVLDGQILYKV NQCTYHMKAGDILLGNANVLHAGFMEDMKDGKYVSITFLPKIIYGSYGSILRRKYVEPFL HNFSLPAVYIDYSQPWHEVFSEYVREIITLYEEKKEFYELDITGTLQKLWRCMLQNLPEN 
LPYTEHSKLERNRMREIMDYLEYNYMNKIQMKDIAEEIHLCESECSRLFKRYMNVSLFTF LQEYRVERSLEFLLSSDDSIMEVAQKSGFTDSNYYAKIFSRVKGCSPQKYRKQNTKKE >gi|226332935|gb|ACII01000084.1| GENE 20 24274 - 26331 1916 685 aa, chain - ## HITS:1 COG:TM1195 KEGG:ns NR:ns ## COG: TM1195 COG1874 # Protein_GI_number: 15643951 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Thermotoga maritima # 14 682 2 645 649 303 32.0 1e-81 MNAKQNYKWNELTMGTCYYPEHWDKKLWADDLDRMKKTGISVIRIAEFAWSKVEPEEGVF TYDFFDEFLDLCSEKQMKVIFGTPTATPPAWLTEKYPEVLNARKDGVLLRHGGRRHYNYN SPVYQRLCARVVEREAAHYAKHPAIVGWQIDNEINCEVDEFYSEADSVAFREFLRKQYGT LEKLNEAWGTTFWNQTYTAWDQIYVLRPVLTRGINPHQHLDYIRFISESARHFCKMQAEI IKKYVKPGDYITTNGKFWNLDNHKMEKESLDVYCYDSYPDFAFGLDRDPLHSDDLNDRKW SHNLAEIRSICPHFGIMEQQSGGNGWTTRMEAPSPRPGQLTLWAMQSVAHGADYISFFRW RTCTFGTEIYWHGILDYDNRENRKYREVQDFYKKLKAIDGVCNSEYKAAFAIVKDYDNEW DTSVDVWHNRVAGASEQGIFKAAQLTHTPYDIVYLQEDSELTDLTKYPVLFYAHPVLISE ERVKLLKEYIEQGGTLVIGCRSGYKDMNGKCVMLPQPGLLAELTGTDVRDFTFTSPAEDE VFAEFSDKKIPMPVFNDIITPLEGTKTLAIYGNSYYEGNPALTCHAVGKGQVLHLGATFN KENTEALFDYLKIKEPFKEYIEAPETAEVIMREKDGEKYIFVLNYQAEKIEYTLQKPMIA LYDKTEATGRCTLPAFGTAVYKVIE >gi|226332935|gb|ACII01000084.1| GENE 21 26354 - 28141 1147 595 aa, chain - ## HITS:1 COG:BH2110 KEGG:ns NR:ns ## COG: BH2110 COG2972 # Protein_GI_number: 15614673 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 216 582 207 577 585 164 28.0 5e-40 MARQNKLKKTGGLKTNGGLMNKIFKVYMLLMFISCMVLLIVFGIRFSSVYNSQAKSHMND VTVATETGIEERISQIDQLSVSILINNSVQKNLKSINHQKVMEPEKTSFQIEKTAISRDI RGSIFNIPGIVSARIYSSDGVEIVIGTSGKKMKADDITREKIYAKNGGALWADDEENGMV GLYRAILSVDDFKPIGYMVIECKNSYFSEKLRSVPSTYKNRFYLLNNEMDIIVSSEENMQ GISFPLKTRDFKRVKIVRDPSTEKNSYFTYQYMNNGWLLVSTINVGQLWKNIGIALLSVL LTFGIVLLVSLVIMRHAARVMVKPTKKLVDSMTAYQEGNFDSRFEVENQDEINQIGMVYN QMADKVQNLIEKNYTLEIANREAEIEFLKMQINPHFLYNCLDTISWLGFSNGNSEITDLA VALGKFLRASIKREDYYTVKQEMEVVDNYLFIQKYRFGDKIEIRHNIPDEVLNFYIPSFI 
IQPVIENSIVHGLEEQIEKGILWINIQLCNNKYLQFNLTDNGKGMDEEQLKQVIQNYSDK NKKSSIGLSNVYRRLNLLYGEACEFHVTSVAGEGTEVTFRIPVMQVPGNHTLNET >gi|226332935|gb|ACII01000084.1| GENE 22 28143 - 28973 565 276 aa, chain - ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 3 254 6 253 257 145 38.0 1e-34 MNLLIADDEAVIRRGLLSLDWKSIGITDVYSVANGVEAKELLLSTSIDLVIFDIRMPGFS GLELAQMVKERSMDVAVVLLSGFSEFEYARSAMRYGVYEYLLKPVSPNELMETMHNVMHR LEQKRFEQKLLSQKDEFGEKHDAVSQVNSLFVQSSGAIKSILTDIAQNYEQNISLADLAE KYHFSESYISRKIKKETGYTFVDILNGIRLMCAASLLKTGEKISDVCEKTGFNDQHYFSQ LFKKTFGCTPSNYRSESGETFSGLYEILNSKAGNGK >gi|226332935|gb|ACII01000084.1| GENE 23 29049 - 29885 734 278 aa, chain - ## HITS:1 COG:BH3682 KEGG:ns NR:ns ## COG: BH3682 COG0395 # Protein_GI_number: 15616244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 3 278 17 293 293 182 40.0 5e-46 MKRSASENIWLVIKYVLLIGFTILCVYPLVWLFLASFKTNAELYTNTWGLPEQWSMTNYV NAVVKGGVFRYFGNSVIIAVSVVLVTVILATMASYAIARMHWKLANLTHSIFLLGMMIPI YALVIPLFSIFKGMGLLDTHLAVIIPQIAVGFPLAIFIICGFMRSIPTELEEAAIIDGCT VFQCFFKIILPIAKSSVVTVAVVQFINVWNDLLLPRIFLTDSSKMTLPVGLTNFQAMYST DYVGMIAAVIITIIPSIVVYILLHKQIMEGMVAGAVKG >gi|226332935|gb|ACII01000084.1| GENE 24 29899 - 30777 745 292 aa, chain - ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 7 288 5 288 292 181 38.0 2e-45 MNKVMGNKKTIALFVLPAFIIYAIFALAPIFYNVYLSLFDTDLMSKMNFVGIKNYLSLFS DKTFKHAFGNNILMVVGSLVAHMPLAMFFGNAIFKKIKGASFFQTVFFLPCVICGVAVGL TWTFVYNGNYGLLNGFLKAIGLGGLQQMWLANKETAMLCIIIVIMWQYVGYHMVIQLAAM RNIDSSFYEAAEIDGATGWQQFKYITFPLIKPILKIDAVLIITGSLKYYDLVAVMTSGGP 
NHATEVMSTYMYYQSFNILNYGYASAIGVVLMLLCILSVGISNRVFKTDETV >gi|226332935|gb|ACII01000084.1| GENE 25 30851 - 32161 1294 436 aa, chain - ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 1 386 1 387 438 187 31.0 3e-47 MKKFTVCLLIGAMAASMLTGCGSNGGSSDSSDSKADSSSETMTIMMNGSDSDAFMEGYRK IVDGFNESNEYGVKVEIQNITNADYKTKLTTMMASDSEPDIIFTWPLGYLENFVNGDKVV SIQKYLDEDEDWKNSFNGGILDPLTYDGEVYAIPTQQSTAIMYYNKAIFDKYGLEVPTTY DEYVQICDTLKENGVTPVALASTADDAWLVSQYIQQLSDGIKGEDLFNSLKDGTGKWNDE GMVEAGKLFQEEVNKGYFEDGFTGVSREEAQLQFANGQTAMYFNGCWEISNLDKKENAAE AENISCFLMPAVNEEYAGVEVGSVDTSYAITKNCKNVDAAVALLKYMTNEESTSLLLYDY GRTPSTNFDIDNSKLTPLCADFINLLGDVKVQTPWFDRVDTDLGNEFNNTTIGIANGDDV QEAFDTLQSYAESKSK >gi|226332935|gb|ACII01000084.1| GENE 26 32408 - 34486 2096 692 aa, chain - ## HITS:1 COG:TM1195 KEGG:ns NR:ns ## COG: TM1195 COG1874 # Protein_GI_number: 15643951 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Thermotoga maritima # 5 514 1 515 649 193 28.0 1e-48 MKELLFGAAYYDEYMPYDRLQQDVAMMKKAGINTVRIAESTWSTCEPQEGVFDFSHVERV MDAMEEAGINVIIGTPTYAIPTWMVKSHPDVMAETVNGRGIYGARQIMDITHPVYRFYAE RVIRKLMECTAYRKCVIGFQVDNETKYYGTAGKNVQEKFVKYLRRKFNNDLDAMNHEFGL DYWSNRINAWEDFPDVRGTINGSLGAEFEKFQRTLVDEFLSWQADIVNEYRREDQFITHN FDFEWRGYSYGVQPDVNHYHAAKALTIAGTDIYHPTQDDLTGAEIAFGGDMTRSLKRDNY LVLETEAQGYPGWTPYKGQLRLQGYSHLASGSNSVMYWHWHSIHNSFETYWKGLLSHDMQ ENAPYREACIMGKEFSEIGSHLVNLKKKNDVAILVSNEALTALKWFGIEATAAGNNGIGY NDVVRWIYDALYQMNIECDFVWPESDNLEQYKAIFVPALYAAPDELLERLKQYVADGGTL VATFKTAFANENIKVSHEMQPHILSNCFGINYQQFTFPKNVGLTGSIIRESGADEADKKN ETKENIETEENTDVPATAKVFMELLMPQEAEVLASYDHYNWKEYAAITKNHYEKGTAIYI GCMTDDNTLKAVITEALRSVEVELPEYRWPVIVRKGTNDLGKCVRYILNYSAEEQKVSYY GKNGTELLSGESVQDGESITVSPWNIKIVEEA >gi|226332935|gb|ACII01000084.1| GENE 27 34599 - 35864 807 421 aa, chain - ## HITS:1 COG:yijO KEGG:ns NR:ns ## COG: yijO COG2207 # 
Protein_GI_number: 16131792 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli K12 # 302 414 159 271 283 66 27.0 9e-11 MNLEYICEQMVRILHGNITCISKSGAIEACYGDMAVQYNPLFTNSEFLSAVHNREKKDYP DLFYEKDTILYGVIPLESGEIQNETNNESKSGSRLDEKKHFAKIVVGPVSTEKHTKDSEH YLMQHHHISNETGFRLSFCELKVFGSGILMLYHMITGKELTINDLWQKNGIRETDIIEVK GQISSVIFEHQEQDLPHNPYDQELRELDSIRHGDVEMLNRSLAETYRGEVGQLAKNQVRQ AKNIAICVIALASRAAISGGMIPEEAFSMVDGYIMKIEDMNNAVKIDSMMRQAEYEFAEC VAEIHKNQQKNELVERTKNYIYQNLHSDIVIGEIGQKIGVNTSYLSDLFHKVEGTTIQQY IRKEKIRLAENMLRYSDYEVKEIASYLSFCSQSYFGNIFRQQTGMTPARYRKKYGKWKEQ K >gi|226332935|gb|ACII01000084.1| GENE 28 36115 - 38112 1691 665 aa, chain - ## HITS:1 COG:STM0041 KEGG:ns NR:ns ## COG: STM0041 COG1501 # Protein_GI_number: 16763431 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Salmonella typhimurium LT2 # 20 665 20 674 679 879 63.0 0 MDVFRTENNCLKFHYDAEELWIQPWGANSFRIRATKLAEMPKENWALEMPVEDIVPKIQI EEKRASIINGKIKAVISSYGKLTIYNSKDEILLDEYLRNRLDVFADYCSALDIEAREFKP ILGGDYHLSMRFVSNPEEKIYGMGQYQQPYLNLKGTDLELAQRNSQASVPFAVSSLGYGF LWNNPGVGRVMFGKNITTWEAYSTKVLDYWITAGDSPAEIEEAYASVTGTVPMMPDYAMG FWQCKLRYQTQDELLNVAREYKKRGLPISVIVIDYFHWPLQGDWKFDEKYWPDPDAMIQE LKEMGIELMVSIWPTVDYRSENFQEMKEKGYLIRTDRGFRIVMDFQGNTVHYDATNPAAR EYVWQKAKKNYYDKGIKVFWLDEAEPEYSVYDFDNYRYHMGPNVQVGNIYPALYAKTFFD GMKAEGQENIINLLRCAWAGSQKYGALVWSGDIHSSFSSLRNQLAAGLNMGLAGIPWWTT DIGGFHGGDPKDPKFQELLVRWFEYGTFCPVMRLHGYREPLKEPMGKEGGAACVSGADNE VWSFGEEAYEICKKYLALRESMKPYITELMEAAHEKGTPVMRPLFYDFPKDSKCWEIEDQ YMFGPDVLVAPVTYADMRKRTIYLPEGSQWTNFDTEEIYEGGQVIEVDAPLSQIPVFTRN GRKLK >gi|226332935|gb|ACII01000084.1| GENE 29 38137 - 38973 628 278 aa, chain - ## HITS:1 COG:SMb21593 KEGG:ns NR:ns ## COG: SMb21593 COG0395 # Protein_GI_number: 16264781 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 6 277 1 275 275 178 37.0 
1e-44 MRKIHLRDKQKNILLSILAVIIACIFLFPLYWIIVNSFKIDSEIFSSVPTLWPKKFTITA YKDLIGNLSVTLKNSVIIALGSMILSLVLSVPAAYGLARYKVKGMKLFVLVFLVTQMLPA SLVLTPLFLIFSKLGLLNSYLAPILSTATISIPFVVLMLRPGFLSMPKELEDAAKIDGCT SLGVFFRICIPISKPTVITAACFSFVFGWNDLVYSMTFNTKDTMRPMTSAIYTYMNQYGT KWNSIMAYGVLLILPCCIIFLTMQKHIVEGMTSGSVKG >gi|226332935|gb|ACII01000084.1| GENE 30 38994 - 39890 698 298 aa, chain - ## HITS:1 COG:AGl1012 KEGG:ns NR:ns ## COG: AGl1012 COG1175 # Protein_GI_number: 15890623 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 23 291 25 293 300 174 32.0 2e-43 MNSKSEKRERRIGYFFILPAVIYMLIFIGYPIVYNWIISLQDVTATTLASAARDFVGLDN FKAIFSDVTFRKSVLHTFVYTIGCLVIQFSLGFLLAMFFVKKFTLAKPIRGFIVISWMLP VTVTALVFKFMFGESSGIINTILMNLHLIKSPIGWLLKGNTAMVVLIIANSWVGIPFNML LLTTGLNNIPGDVYEAASIDGATSVQKFFKITIPLLKPTIMSVLILGFVLTFKVFDLVYV MTGGGPVDATEVLSTYSYKLSFQTFHFGEGAAAANVLFICLFIVALIYLKTISKDETI >gi|226332935|gb|ACII01000084.1| GENE 31 39993 - 41210 1219 405 aa, chain - ## HITS:1 COG:AGl1009 KEGG:ns NR:ns ## COG: AGl1009 COG1653 # Protein_GI_number: 15890622 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 11 404 10 409 410 194 32.0 3e-49 MKKRIAVVAIAALMAGSLTNVGAVSAAEPTEITFWHYMSEDKEGKFVNEAIEEFNNSQDE VHVTAQYLPREELMKQYTIGVVSGELPDCGMVDNPDHASYASMGVFEDITDLYNSWDEAD FMEGSINSCYYDDKLYGLPWGNNCLGLFYNKSMLEEAGVEVPTTWSELEAACEKLTTDTC KGLAISAIGNEEGTFQYMPWLLSAGGSVDNLTSDESKESMTYLKDLMDKGYISKECINWT QADAEKQFASGQAAMMINGPWQFSGLSDDAPDLEYGVAKVPKADDGDYASVLGGENVAIC KGANVEASWKFLTWITSKEESAKICEAIGRYSPRADVDVQEMYKDDPLNATFAEILPTAE SRGPSPVWPEISEAIYSAEQEVLSGQKDVDTAMSDAQAKIDALDK >gi|226332935|gb|ACII01000084.1| GENE 32 41312 - 42034 750 240 aa, chain - ## HITS:1 COG:SP0661 KEGG:ns NR:ns ## COG: SP0661 COG4753 # Protein_GI_number: 15900562 # Func_class: T Signal transduction mechanisms # Function: Response 
regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 4 234 5 239 245 152 38.0 4e-37 MKKIIIVEDEFRIRNGLSTLINKLDMGCKVIGEAENGFEGLKMISDMEPEVVITDIKMPK MDGLSMIRQAKEMGASCDFVILSGYAEFEYAQQGIQLGVMDYLLKPASVSDVKELLNKLN EEKEITQGLDRENCSDIVREMINVIEESYGMKLQLDAIAEKFHMTPEYLGNLFAKETGIT FSNYLRQVRMEKAKELLTGTDMKIYEVACAVGYPDQKYFSKVFKEYAGVSAKQFALKTVK >gi|226332935|gb|ACII01000084.1| GENE 33 42049 - 43842 1158 597 aa, chain - ## HITS:1 COG:BH1122 KEGG:ns NR:ns ## COG: BH1122 COG2972 # Protein_GI_number: 15613685 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 255 589 254 569 586 144 29.0 6e-34 MKNYLERFVGNPHSFQRKIITIYLVLTIIPMLLIALIITGVYYQRILDSAYNILNENAQQ HEIIVQERMENYENVMYELVADSEFINLAKMYNISDSVDELKIKKILSSGINTYDQIRAA VFLSDSGKYVSYSRWYGSQYDSIWSESKKRTEIYDEVNKNQALTFIATVNIGIEEVRDDQ AILMGFPVRDLRTQEQSGVLIIALDDNCLLFNSESESSVKGVETVIIDDYNKIIAGTKDK FINISLDSYLGGIYKNVSTLEIRKYPIQSTQWSIVHIIDRAIYREDIYHTLWAVIIIVVI VVVIVCALVWGIFGRYIGNIQKISQGLRNYEGKQTEELKVDVNKEDELYTIVRQFNKMTV RINHLVETLEKKNIEIQEAAISQKHAEIKALEAQINPHFLFNTLDSINWRAIENDEEEIS DMLATLGSLLRYSVSNIESMVCLEAEISWLKKYVFLQRDRFQNSFDCVYDVTEDAMGFPV YKMLLQPIIENTILHAFENVKEGGMINIEACVRKDGKLEIHIRDNGCGMDTLTLERIRKE IGENGALNSESIGISNVIHRLRIYYQEEADIMVKSKLGSGTEFILVIPRKDPGIIVV >gi|226332935|gb|ACII01000084.1| GENE 34 44176 - 44817 610 213 aa, chain - ## HITS:1 COG:VC0285 KEGG:ns NR:ns ## COG: VC0285 COG0800 # Protein_GI_number: 15640313 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Vibrio cholerae # 3 201 2 199 201 225 53.0 4e-59 MKTIEEQFQKLGVVPVVVLEDKKDAIPLAKALSEGGLPCAEVTFRTDAAAESIRLISETY PDMLVGAGTVLTTEQVDLAVKSGAKFIVSPGFDPEIVDYCLKKNIPVFPGCISPSEVAQA VKRGLKVVKFFPAEQAGGIAMIKAMAAPYQNLKFMPTGGINTGNLKDYLSCDKILCCGGS WMVKGDMIRNGEFDQIQVMVKEAKKLADEIRFN >gi|226332935|gb|ACII01000084.1| GENE 35 44831 - 45862 725 343 aa, chain - ## 
HITS:1 COG:TM0067 KEGG:ns NR:ns ## COG: TM0067 COG0524 # Protein_GI_number: 15642842 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 2 321 7 316 339 209 37.0 5e-54 MGEIMLRLTPPDCEKIRTADCFEARYGGSEANVALSLANLGIDSSYFTVVPDNSLGKSAI RMMKSNDVDCTPVIFSTSEETPTHRLGTYYLETGYGIRPSKVIYDRKHSAFDEYDFTKVD LEKLLDGYTWLHLSGITPALGERCRKMIMDCLKIAKKNGITVSFDGNFRSTLWSWEEARE FCTACLPYVDVLMGIEPYHLWKDENNYSAGDVKDDIPFQPDLEQQEVVFNAFVKRYPNLK CIARHVRYSPSCSENSLKAYLWYEGKTYESRKLTFNILDRVGGGDAFVSGVIYALMQKFT PEDTVNFGVASSAIKHTLHGDGNITDDVSLIRHVMQMNFDIKR >gi|226332935|gb|ACII01000084.1| GENE 36 45932 - 47677 1661 581 aa, chain - ## HITS:1 COG:CAC3604 KEGG:ns NR:ns ## COG: CAC3604 COG0129 # Protein_GI_number: 15896838 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Clostridium acetobutylicum # 3 580 2 572 572 736 60.0 0 MNLKSQEMRKLAPELDPLRIGTGWKKEDLGKPQIMVESTYGDSHPGSGHLNILVEEVRKG VAEAGGFGARYFCTDICDGESQGTDGINYSLASREMIANMIEIHANATPFDAGVYLSSCD KGMPGNLIGLARVNIPAVVVPGGTMNAGPEMLTLEQLGMYSAKYERGEIDEAKLDWAKCN ACPSCGACSFIGTASTMQIMAEALGLALPGSALMPAASPDLLAYAREAGRQAVKLAQTEN MRPSDIVTMKSFENAILVHAAISGSTNCLLHIPAIAHEFGMEITGDTFDRLHRNARYLLD VRPAGRWPAECFYYAGGVPAIMEEIKEHLHLDVMTVTGKTLGENLDDLKKNGFYKKCDKW LQEFNQRYGIKISKEDIIRPYDKAIGTDGSIAVLRGNLAPEGAVIKHTACPKEMFKSVLR ARPFDSEEECLDAVLKHKVQKGDAVFIRYEGPKGSGMPEMFYTSEAISSDKELGKSIALI TDGRFSGASTGPVIGHCSPEAVDGGPIALVEEGDLIEIDVMERKLNIIGIAGERKTAEEI DEILKERRKNWRPREPKYRKGVLRLFSQHAASPMKGAYLEY >gi|226332935|gb|ACII01000084.1| GENE 37 47988 - 48722 698 244 aa, chain - ## HITS:1 COG:SMb21350 KEGG:ns NR:ns ## COG: SMb21350 COG2186 # Protein_GI_number: 16264674 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 9 223 8 229 249 88 27.0 9e-18 MSQKKVQIKKTLGQQTEDQLMKYILDNQIAIGEKIPNEFELAGIFDVGRSTIREAVKGLV 
SRGILEVRRGDGTYVISTVYMENDVLGFGQIKDRYQLALDLFDVRLMIEPEIVTWACRKA TKEQIAKLRELCNEVEMLYKQGHNHIQKDIEFHSYLAKLSGNMVVERLIPVINTSVVIFA NITYRSLMQETLETHRAIVNSIEHKDPVGAKCAMNMHLTYNRQVIMELLEKKKKVRKNHE PDDI >gi|226332935|gb|ACII01000084.1| GENE 38 48733 - 49665 735 310 aa, chain - ## HITS:1 COG:SP0676 KEGG:ns NR:ns ## COG: SP0676 COG0583 # Protein_GI_number: 15900577 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 1 299 21 314 322 266 44.0 5e-71 MTLTQMNYIITISETGSLNKAAEALYISQPSLTNAVKELEKELGIIIFNRSGRGVTLTND GTEFLMYARQIYGQYESVVEKYSEGGSYKKKFGVSTQHYSFAVKAFVDMVQKFDVSEYEF AIRETKTADVISDVSAMKSEVGVLYLSDFNRKALLKLLHSANLEFHHLIDCQAYVYLWKN HPLANEKSISYSQLAKYPCLSFEQGDKSSFYLSEEILSTNEYSRTVKASDRATMLNLMVG LNGYTLCSGIICEELNGSDYLAIPFEGDEQNQNSDMEIGYITRKNSILSKVGNLYVSSLK KYLEQNTSFS >gi|226332935|gb|ACII01000084.1| GENE 39 49854 - 50975 1417 373 aa, chain - ## HITS:1 COG:CAC2834 KEGG:ns NR:ns ## COG: CAC2834 COG1929 # Protein_GI_number: 15896089 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Clostridium acetobutylicum # 3 373 6 376 380 289 44.0 6e-78 MKVVVAVDSFKGSMTSMEAGMAVKEGILAAKKDAEVIVKPLADGGEGTTDALIEGLKGER VDVTVTGPYHEPVQAYYGYLRESNTAVMEMATAAGITLSDKKEPMTATTYGVGELILHAI GRGIRNFIIGIGGSATNDGGVGMLRAFGYKFLDEKGEDVGEGGQALARIASVEISDKKEL LSQCNFRIACDVTNPLCGSQGATYIYGPQKGVTPDILPVLDAGMKNYAEVTAKVIGKDNQ NAAGAGAAGGLGFAFLNYLDGELTPGIQLILDAVHIEDEMKDADVVVTGEGRLDHQTAMG KAPVGVAKLGRKYGAKTIAFAGSVTKEAVACNEVGIDAFFPIVRGISTLEEAMDPDTAKA NMAAAAEQVFRLL >gi|226332935|gb|ACII01000084.1| GENE 40 51004 - 51387 267 127 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579318|ref|ZP_04856588.1| ## NR: gi|253579318|ref|ZP_04856588.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 127 1 127 127 236 100.0 5e-61 MKPTILLINFQDQQKLRKLKMALLPFKLRIKTIEPQDYSQPIGYLAGVKDVFQVQIPSAL IPQEQMEKEMLILAGITGNLFDQVLYTLRKAGTPVDYKAVLTEHNQNWNCMQLYKELEKE HNMYHPS >gi|226332935|gb|ACII01000084.1| GENE 41 51629 - 51826 116 65 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_0020 NR:ns ## KEGG: CDR20291_0020 # Name: not_defined # Def: AraC-family transcriptional regulator # Organism: C.difficile_R20291 # Pathway: not_defined # 1 61 219 279 281 70 59.0 2e-11 MGITFLEYQNEYRLSFIYRDLITTRDPVHVILERHGFTNYKLFRRMFLEHFGNTPTQIRK QREIL >gi|226332935|gb|ACII01000084.1| GENE 42 51992 - 53215 1542 407 aa, chain + ## HITS:1 COG:BH3820 KEGG:ns NR:ns ## COG: BH3820 COG1301 # Protein_GI_number: 15616382 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Bacillus halodurans # 12 404 3 400 413 206 32.0 5e-53 MKKFYSWYKKHITAAIFTGLILGIITGLFLANRFEPVLAITSLIGSIYMNALNMMIFPLV FCSIIMGISSIGNARTTGKITAGAMIFFLCTTALASLAGLIIPRLIHLGEGVKFEMATSD IEATEMTSILDTIKNLIPSNPVAAFTNGNMLQVLVFAVIVGFTLIAIGEKGKALLNVIDS CNEVCLKVISTVMYFTPIGVFCTIVPVVEANGTETIISLATQLIILYVAFFGFALVIYGG SVKFIGKESPVKFFKAMLPAALNAFGTCSSSATIPISKQRMEDEMGVSNKITSIAIPLGA TVNMDAVSILMSFMIMFFANACGINVSISMMIIILLANVLLSVGTPGIPGGAIASFAALA TMAGLPAGVMGVYISINTLCDMGATCVNVIGDMAGCVVLKKKIKLED >gi|226332935|gb|ACII01000084.1| GENE 43 53382 - 54773 1683 463 aa, chain - ## HITS:1 COG:TM1414 KEGG:ns NR:ns ## COG: TM1414 COG1621 # Protein_GI_number: 15644166 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Thermotoga maritima # 5 268 6 259 432 75 29.0 2e-13 MQKLYYQFPGTWFGDCMPFGHGDKFYLYHQRDTRKPGPFGEPFGWDLATTSDFVHYEDKG VAIPRGTDEEQDQFIFAGSVFEAEGQYHIFYTGYNRDYPALGKPSQVLMHAYSDDLVTWH KTQDALTFTPQEGYDPDDWRDPWVIRDEENDQYLLILGARLQGPKTRQTGRTVKFTSKDL KNWKFEGDFWAPDLYTMHEMPDLFKIGDWWYHIVTEYSDRSKMVYRMSKSLEGPWIAPKD DAFDGRAYYAGRTFELNGQRILFGWVATKDQDDDDNNFIWAGTFMAHEVYQREDGTLGVR IPETVWNAFDKEEKTEDFVIDTPTKSTEKVVAKDTGDIFKFEADVEFTEGTRTFGVRFYE 
DEDKAESYQFVFNVTEDRYVFEKKPNWPWPANQNIGLERPLELVPGKKYNIKMIVDDTIA TIYVDGVALNARAYKRPGESLSLFASEGSLKVTNCKVTTGLKK >gi|226332935|gb|ACII01000084.1| GENE 44 54892 - 55731 837 279 aa, chain - ## HITS:1 COG:BH3682 KEGG:ns NR:ns ## COG: BH3682 COG0395 # Protein_GI_number: 15616244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 2 279 17 293 293 160 34.0 3e-39 MKKKKPTIPSIIKWVIALILVVMQVYPFFYVFTSSFKSLDDFRQLPAYALPSKWVLTNFI NVFTKSHMLTYFKNSIIVLIGVLIPLLLFALMAGFALSKIKFKGRKFVLNYFLLGLMLPM QVALIPLFTIFNKMGLINTYPAIILPQIAFSLSYSIQLFYSFSKFFPEEMLEAAIIDGCS PIGCFFKMVVPMSLNSIITVATMQAVFCWNEYINAYTFTRSTDMKTITLGLNDFVGSMGL TDWGGTFAAITVTVLPVFIFYFFSSKKMLAGMTAGAVKG >gi|226332935|gb|ACII01000084.1| GENE 45 55728 - 56612 781 294 aa, chain - ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 8 294 6 292 292 156 33.0 5e-38 MQKAFGKKSVIFLFIFPAFLIYTAFVIVSIVWAGYYSFFDWSGVGEKVFVGFKNYIELLT QDDVFRSTVWHTLIYTVINVAIQVFGGLLFAILLSRIKKGRVALQTLYYIPVVISSVAIC QIFTKLLSVTPTGIVNQVLSFIDPSLKLMEWISNPQISLYVTAFVEAYKYLGLYMVIFYA ALIGVPDELGEAALIDGASTWQEYLYVRIPYIKPVIIANCLLVLNGSLRSFEFSYLLTHG GPGNASELMTTYMYKQAFSSMKYGYGSSVAIMIVIICMVVGMLFRKFTGGGDDE >gi|226332935|gb|ACII01000084.1| GENE 46 56712 - 58007 1407 431 aa, chain - ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 4 431 1 416 422 73 23.0 5e-13 MRKMKKVLSLAMTAAMTASLLTGMGAVTASAKKSDDGKRTKITALLKGTESTEQYKTFDY LLKNFCDEKGLDYEIELVNDMQDYFTKLQMYINSDTLPDIFGCPNGTLSKACKDIDALVN VGDELERNGYDEKLNGAIRDFLTDADDGNMYLFPQGLYCEYFMYRKDIFEKAGIKDAPTT WEEFEEDCQKIADQGEIPVIVGGSDAWQLMRYLSFSPWRVTGPEFIEGYQAGTDSFSDNE 
SAKYAVNLLSDLGTKGYFEPGFASVDFTSACNLFFGGSGAIFYTGSGQISLAEEMYDNGE LGFFPVPDTEGMDNMSTNVPVHAGFAEGFNKATYDDTMQEFFDYMCENFTDACYNQAQVF SPFNEELPEGLPQLYYDTQPMFEDAETAWTSWDDKLDSDVMTKIVDEQQQLAQGIIKPDE FIKTCDSLVKK >gi|226332935|gb|ACII01000084.1| GENE 47 58235 - 59848 1319 537 aa, chain - ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 4 162 3 158 368 103 34.0 1e-21 MELQVMILDDEYIILDGLCSFPWSDYGYRISATARNGLEGLEKLEQAKPDLILTDVKMPG MDGLDFAEKAHEIYPNAVIVILTGYDSFAYAQKAISIGVEEYLLKPVDYDELKAMAARIA GEIHARKEKQQEIRDLKKYFNRSVPQLRSKFAGNLLYGRIQGKGVVKEQAESLNLTIEKY IVCVGRKVVGENKIHTGDKWIEEFACINIFEEIFNDFGIHVLSDYNTATAEYNFILLFEK EEENAVCMEKALQACSKIQVEVERYLKAHMNFGLSDVEVDEYQANAQYRKAQTACRQCVY LGTNIILRYEDLQYKKQTDFVITSGEKTHFMMTLFQDSFEKAEDELHQIFRNAPEDVSPV KFAAMDLLLGCMKFPYICAVDSEIHNKNWNLSVLQDGIKRICQCENTEEVLNCLLNLFGT LIKQNTEGTDERNRKLVQSVLSYIEKNYSGDLSMDDLTEKFHVSRTYISRLLKKYAGKSF LEYLTDVRFQQVEKLIADNKYKQYEIAEMVGYKDFGYYIKVFKKRYGITPNEFRKYI >gi|226332935|gb|ACII01000084.1| GENE 48 59848 - 61680 1254 610 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 43 584 42 592 602 179 24.0 2e-44 MKNLKKWYINLSIQRKILYCTLGVALVVLLAASVSQYMSASSIVTEQTRKQSAGVVNELS VNLDHYFDMVRNSFEYIANNSTVQEELESDEPYKSDGTELYSYYSRSGQIRRLLLQGYTS IYMNDIQLYGYNGANHLLANNHEINENTAQTSCELAEQAKGRCIYYNASEEGLMYMAKQI KDSLTMKPVGILRASIKLSYLKKMTITARDSLAAHIFLLDNDKNVLIESAENDATISDRS WIEKISGNTGEFLFTADGQGYDCVYQRSSDTGLTVVGMIPMSFLQKTARGLQKTTIMLIL ASLMLCIFLANILAKGIAGPIKRTSKAMQQFAEGDFSVRLPEGRRDEIGAMNSVFNQTIE KIEQLIKQVVEMETVNKDIEFQALQAQINPHFLYNVLDTINWMARKKGEDNICRMVTSIS SLMRASISNKRSMVYIREEIKYVQDYLYIQETRYGDKFTSYIEVDERLNELEIPKMTIQT 
LVENAVVHGVENATWDCFLYVSGEITDGMAVFTVKDDGVGMSQEQLEKLMGTEEEPDHEA ERTHTHLGVYAVRKRLDYVYQNKARMSITSEPGKGTQVILEIPMNGNVGVTYRKYDETVN ENNKKRGDER >gi|226332935|gb|ACII01000084.1| GENE 49 61852 - 62286 522 144 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1411 NR:ns ## KEGG: EUBREC_1411 # Name: not_defined # Def: lactoylglutathione lyase related lyase # Organism: E.rectale # Pathway: not_defined # 15 144 1 130 130 130 53.0 1e-29 MSKSYKDEQKEVKIMDLKNYSTGVQHIGIPTNDIDKTVEFYHKLGFETAFETVNEEANEK VVFLKLGTLVVETYENHAAKMEHGAIDHVALDVRDIEEIFQYINEAGLNSTQDTIHFLPF WENGVKFFTIEGPNKEKVEFSQYL >gi|226332935|gb|ACII01000084.1| GENE 50 62905 - 63084 89 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNTAITDIAVFTVSFPGRPPLQEPRISVRCRIVLVSKVKQGGYGISYVRPCVGRTFFML >gi|226332935|gb|ACII01000084.1| GENE 51 63321 - 63875 546 184 aa, chain + ## HITS:1 COG:BH3392 KEGG:ns NR:ns ## COG: BH3392 COG3090 # Protein_GI_number: 15615954 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Bacillus halodurans # 5 140 9 144 175 64 28.0 9e-11 MKIYKKIMNALAAVEKIVLVISTLLILVLTVGNVFSRKVIHRSWSFTEELVVAVFVLITL LAAALACREGELVSLTLVTDRLPKKLKKPSVILITVLSIIFSVILFKYGMDKVITQLQNG KRTFVLNWPEWIFWSFVPIGAGCMILHFIEFCLDFCCKTDSAKNTDNSTKSNMNTITTDK KEAK >gi|226332935|gb|ACII01000084.1| GENE 52 63876 - 67208 239 1110 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 825 1082 73 323 346 96 28 4e-19 MISAILFISFFIFLILGVPIGICLGLSSVCAILYSGTSLTIVATNMYSGISKFLLLAIPF FVLSGNIMAKAGISKRLIKFVDTCVGHKKGGIAIVCVIVACFFGAISGSGPATVAALGAV LIPAMVEQGGFSAPFSTALMATSSSIAIVIPPSIAFVVYASITGTSIADMFMAGIVPGLL MGVALVIVVMLEAKKHNIKPSREKASGKERWDAFKDAFWGFLMPVIILGGIYGGIFTPTE AAAVSVVYGLFVGMVIYREVSIRDMFDILVDSAKTTGGIMLIVASASLFSFVCTKFGIAD AASNLLGSIAHNQFTFLLIVNIIFLIAGCFIDANSAMYIFIPIMLPVCKALGYDIVAFGV MATVNLAIGQVTPPVGVNLFVAISIKIKKGLEVTLQEISRAVVPMIAACVAVLLIVTYIP ITSTFLPKALAKEGSYTGDQSSASSDTASKDAGDGNNSFDTIADYSDLDWPEMTWNFACS 
TTETSTWADGGRKFGELMEKATGGKVKVNIYAADQLTNGNQSEGIQALMNGDPVQISMHS NLIYSAFDPRFNVVSLPFVYDSYDDADAKFDGEAGAKLKEILSEYGLHCMGIAENGFREI TNSKHEIKSVDDMKNLKVRVAGSNLLMECYKRWGADATNMNWSETYTALQQNTVEGQENP LPAIDAASVQEVQPYCSMWDAIYDCLFFCINEDIYNSLTPQQQEVVDEAGQKAVEYERYI NRSGDDEIKERWASQNGVTITEKEDMDIDSFKEAVDGIDDWFVNELKSQGYDDAQDLVDL FTKDSFNTVEDYSNLDWPETTWNFACSTTETSTWADGGRKFGELMEKATGGKVKVNIYAA DQLTNGNQSEGIQALMNGDPVQISMHSNLIYSAFDPRFNVVSLPFVYDSYDDADAKFDGE AGEKLKEILGEYGLHCMGIAENGFREITNSKHEIKSVDDMKNLKVRVAGSNLLMECYKRW GADATNMNWSETYTALQQNTVEGEENPLPAIDAASVQEVQPYCSMWDAIYDCLFFCINQD IYDGLTPQQQVVVDECGQKAVEYERYINRSSDNEIKERWESKNGVTFTEKADMDIDSFKK AVDGVDDWFVNELKSQGYEDGQDLVDLFTK >gi|226332935|gb|ACII01000084.1| GENE 53 67694 - 68701 676 335 aa, chain - ## HITS:1 COG:ECs3065 KEGG:ns NR:ns ## COG: ECs3065 COG0523 # Protein_GI_number: 15832319 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Escherichia coli O157:H7 # 1 167 1 164 328 70 30.0 3e-12 MVKIDLITGFLGSGKTTFIRKYAQYLMDAGNNIGILENDYGAVNVDMMLLQDLMGENCEL EMISGGCDKDCHRRRFKTKLIAMGMCGYDRVIVEPSGIFDVDEFFDILHEEPLNRWYQIG NVIAIVDSKLERDLSEEADFILASEVADAGCIVMSKSQDASPEEIQGTIKHVNQALEKVH CSRRFCCEMDDVDTVNIIHKNWDEMSKEDFDRIASCGYVMASYRKPEFEAEDAFTSLYFM NVKMTEKELREAAEKILSDSECGRVFRMKGFMRVDSDSEDGSGLSAWAECDRRQWIELNA TKNEITIRPLHVGQEVLIVIGEELHEEKIKSYLKI >gi|226332935|gb|ACII01000084.1| GENE 54 68881 - 70764 2016 627 aa, chain - ## HITS:1 COG:CAC3415 KEGG:ns NR:ns ## COG: CAC3415 COG1132 # Protein_GI_number: 15896656 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 1 627 2 627 627 752 60.0 0 MNKKKKLDQNTIQTAKRLLKYMTGTHKIQFIIVFICIFISSAASIAVSLSLKFLLDDFII PLIGQTDPNFAELYKALAVLGCIFALGVIATFIYTRMMVYIGQGVLKSVRDDMFEHMQTL PIRYFDQNTNGSVMSLYTNDTDTLRQMISQAIPQALMALFTIVVTFISMLLLSPLLTILA VVIIFIMLKVTSKIGSNSGKYFIRQQVSLADVTGFVEERMNGQRVVKVFNHEDKSKEEFD KLNEALFESSANANKYGNMMGPVIGNIGNLQFVLTAVLGGLLSVTGVGGITLGVMASYLQ 
FTKSFTQPFMQVAQQFNAIVMALAGAERIFRLIDEKPEEDEGYVELVNARKDANGNITEC KERTGMWAWKHPHSADGSVTYTELTGDVRFEDVTFGYNPDKVILKDISLFAKPGQKLAFV GSTGAGKTTITNLINRFYDIQEGKIRYDGINITKIKKDDLRRSLGIVLQDTHLFTGTIKE NIRYGKLDATDEEVYEAARLAHADQFIKMLPKGYDTMLSGDGEELSQGQRQLLSIARAAV ANPPVLILDEATSSIDTRTESIVQKGMDNLMKGRTVFVIAHRLSTIRNSDAIIVLDHGKI IERGDHEDLIKLKGTYYQLYTGKLELS >gi|226332935|gb|ACII01000084.1| GENE 55 70757 - 72490 209 577 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 334 558 132 354 398 85 28 1e-15 MNKKLLRSVREYKKQSVLAPILVILEVLMEVLIPLEMAKIIDVGIANGDMSYIIQRGVIL VAMAMLSLFFGVQAGNMAAVAAAGYAKNLRHDIFYKVQDFSFKNIDHFSTSGLVTRLTTD ITNVQQAYMMSIRLLARAPFMIILSWIMTLTINKPIAVLFLIVVPVLGGTLIFIAKKAHP HFIKVFDEYDNLNNSVQENVNASRVVKAFVREDYEIEKFHGISRYVYNLFTKAEKIVAWN SPVMMFVMYTVIIIIVAIGGRGIVLGGMETGELTSIIVYALQILMSLNMVTFVFVMIMIA EASTDRIVEVLDEVPEMQDKADAVKNVADGSIDFNHVDFSYAGEGGNLSLKDVNLHIKSG QTVGIIGGTGSAKSTLVQLIPRLYDVTKGNVQVGGVDVRNYNLEVLRDQVSMVLQKNVLF SGSIYDNIRWGDEHASEEEVRRVCKLAQADGFVSEFPDGYDTMIVQGGNNVSGGQKQRLC IARALLKKPKILILDDSTSAVDTRTDALIRKAFREEIPNTTKIIIAQRVSSIEDADQIIV LDDGKIAGVGTSEELLKTNDIYREVYESQVKGGGDDE >gi|226332935|gb|ACII01000084.1| GENE 56 72527 - 73036 442 169 aa, chain - ## HITS:1 COG:CAC3413 KEGG:ns NR:ns ## COG: CAC3413 COG1846 # Protein_GI_number: 15896654 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 20 127 3 113 143 70 41.0 1e-12 MCTDFRRCGEIDEEMKQKLNEMYVGRLIHILSHTMKRHNPAEVMENDDLTTMQKHVLKFI LLETMHRDLYQKDIEEEFQIRKPTVTGILKLMEKNGYIYRESAKQDARLKQIIPTEKAEK IRPAILKSIEEGEAKMLRGIPKEDVELCKQVLWQMYENRKKSCDSRNDF >gi|226332935|gb|ACII01000084.1| GENE 57 73454 - 74056 801 200 aa, chain + ## HITS:1 COG:PA1757 KEGG:ns NR:ns ## COG: PA1757 COG0560 # Protein_GI_number: 15596954 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Pseudomonas aeruginosa # 1 191 1 191 205 202 51.0 4e-52 
MYITCLDVEGVLVPEIWIAFAEASGIPELKKTTRDEPDYDKLMNWRLGILKEHGLGLKEI QDVITKIDPLPGAKEFLDELRSFSQVILISDTFTQFAAPLMEKLGRPTLFCNTLEVADNG EITGFKMRVEQSKLTTVKALQSIGFDTIASGDSYNDLGMIQASKAGFLFRSTDKIKADYP EIPAYEEYDELLAAIRNAMK >gi|226332935|gb|ACII01000084.1| GENE 58 74185 - 74688 660 167 aa, chain - ## HITS:1 COG:lin0387 KEGG:ns NR:ns ## COG: lin0387 COG0494 # Protein_GI_number: 16799464 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Listeria innocua # 1 167 1 161 169 84 35.0 1e-16 MEFWDIYDENKKPTGRTMKRNDWCLKDGEYHLTVLGVVGRPDGTFLITKRVMTKAWAPGW WEVSGGAAQAGEESCEAVLREVKEETGLDVRNAEGGYLFTYKRENPGEGDNYFVDVYRFV MDIDESDLHLQTEETDGYMFATVDEIKAFAAEGKFLHYDSIKKAFEM >gi|226332935|gb|ACII01000084.1| GENE 59 75109 - 76692 2239 527 aa, chain + ## HITS:1 COG:TP0475 KEGG:ns NR:ns ## COG: TP0475 COG0166 # Protein_GI_number: 15639466 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Treponema pallidum # 4 527 3 529 535 624 59.0 1e-178 MATWKNLDTLASYSKLAELKGHVNIADAMAGEKGAERVKKYSTPMAAGLTYNYAAKQVDE TVLDALAKLADEAELIDKFQELYNGAVINTGEKRMVLHHLARTQLGEDVVVDGVNKREFY VAQQKKAADFANKVHAGEITNENGEKFTTVVQIGIGGSDLGPRALYIALENWAKANNTSK MEAKFISNVDPDDAAAVLASVDLAHALFIVVSKSGTTLETLTNEAFVKNALVKAGLNPSK HMLAVTSETSPLAKSEDYLTAIFMDDYIGGRYSSVSGVGGAILSLAFGPEVFAQILDGAA AEDKLATNKNILENPDMLDALIGVYERNVQGYPATAVLPYSQALSRFPAHLQQCDMESNG KSVNRFGEPVDYVTGPVIFGEPGTNGQHSFYQLLHQGTDIVPLQFIGFKKSQLGVDVDIE NSTSQQKLCANVAAQIVAFACGKKDENLNKNFKGGRPSSIITGEELTPASLGALLAHFEN KIMFQGFVWNLNSFDQEGVQLGKVLAKRVLAHDTDGALKVYSDLLNI >gi|226332935|gb|ACII01000084.1| GENE 60 76825 - 78519 1459 564 aa, chain - ## HITS:1 COG:alr1230_2 KEGG:ns NR:ns ## COG: alr1230_2 COG2200 # Protein_GI_number: 17228725 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Nostoc sp. 
PCC 7120 # 312 557 4 254 266 181 38.0 3e-45 MRKDSDNAKLLEPGELPSKRKVLVVEDNELNRDILSSFLEEKFDVFLAENGEEGLELLSE HYRELSVVLLDICMPVCDGFEFLRRRNTDKFLSTIPVIVMTGSNSKDVELQCLDLGAVDF IPKPYNFKLVMGRINSVIKLRESVMALTAVEHDELTGVYTRQAFFYHAKSFLKTRPGERF HLVIADIRDFKLINSSYGEKVGDEILCYLAGAYTKMLKTGLVSRYGSDQFVCMTCDDCDL SLETVTKIEEEIAEHAPVPNLMVKYGVYQDIDKSLPVSVICDRGFMAIRSIRDNYECNIA YYTEKMKQKQMQDRLLENRFESAIKNKEFVAYFQPKYDVKTERITGAESLVRWINPDGSM VMPGDFIPLYEKDGLIVKLDEYIFRYVCEFQRELMKKGQELIPISVNLSRTSIHHSDIVE RYMKIVEENGIPFSCVPVELTETATLNNVMIRDFTEKLVNAGFALHMDDFGSGYSSLITL SELPFNTLKIDKSLVDCIHQQKGRMVVQQVIILAHGLGMKVVAEGVETADQLELLKEMEC DNIQGFYYSRPLPKAEFVKKSEEN >gi|226332935|gb|ACII01000084.1| GENE 61 78942 - 79358 416 138 aa, chain - ## HITS:1 COG:VNG1818a KEGG:ns NR:ns ## COG: VNG1818a COG1310 # Protein_GI_number: 16554503 # Func_class: R General function prediction only # Function: Predicted metal-dependent protease of the PAD1/JAB1 superfamily # Organism: Halobacterium sp. NRC-1 # 6 131 9 135 136 65 29.0 4e-11 MRDHLYDEIVNYAKEHLPEEACGLLAGVENGEGREIRKVYFLENKDHAEDHFTLDPRDQI NAIRDMRANGLKPLGNWHSHPSSPSRPSVEDIRLAFDSKASYLILSLMADYPVLNSFHIE GGEWTKEDLRIYSEEYYF >gi|226332935|gb|ACII01000084.1| GENE 62 79410 - 80246 1034 278 aa, chain - ## HITS:1 COG:aq_1329 KEGG:ns NR:ns ## COG: aq_1329 COG0476 # Protein_GI_number: 15606532 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Aquifex aeolicus # 4 266 5 268 271 309 57.0 4e-84 MAMTDEQIERYSRHIILKEVGAKGQKKLLKGKVLIIGAGGLGAPAAMYLAAAGVGTIGIV DADEVDLSNLQRQIIHSTADIGKAKVKSAKETMNAMNPDVEVKTYRMFVDASNIRELIRE YDFIIDGTDNFPAKFLINDACVLEKKPFSHAGIIRFKGQLMTYVPGEGPCYRCVFKNPPP KDAVPTCKQAGVIGAMGGVIGSLQAMEAIKYLIGVGDLLTGYLLTFDALTMEFHKVKLPK DTHDCAVCGDHPTILEPIDYEQEVCEETAGRFATRDEQ >gi|226332935|gb|ACII01000084.1| GENE 63 80247 - 80453 357 68 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0898 NR:ns ## KEGG: EUBREC_0898 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: 
not_defined # 1 68 1 68 68 104 86.0 1e-21 MFITVAGEKKEVKDGLTLPELIAQENVEMPEYVTVSINEEFVASEDKESTVLKEGDNVEF LYFMGGGC >gi|226332935|gb|ACII01000084.1| GENE 64 80482 - 80799 481 105 aa, chain - ## HITS:1 COG:mll5082 KEGG:ns NR:ns ## COG: mll5082 COG0526 # Protein_GI_number: 13474238 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Mesorhizobium loti # 3 105 5 107 107 86 40.0 1e-17 MAVRVSKTDFEEKVLKEKLPVLVDFYSDSCVACKKLAPVLGNAEDDYEDKIKVYKVNTNF DVELAEQYEVQANPTLILFKDGQAKDRKTGALKQAELNSWIEGLL >gi|226332935|gb|ACII01000084.1| GENE 65 80799 - 81050 318 83 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0896 NR:ns ## KEGG: EUBREC_0896 # Name: not_defined # Def: sulfite reductase (ferredoxin) # Organism: E.rectale # Pathway: not_defined # 1 83 1 81 81 120 84.0 1e-26 MAELDFEVTDEVDITDKVCPLTFVKAKVAIEELEDGEILAVRMNDGEPVQNVPRSMKEEG HKVLKLTDNEDGTYTMYVRKVED >gi|226332935|gb|ACII01000084.1| GENE 66 81063 - 81932 1141 289 aa, chain - ## HITS:1 COG:MA3439 KEGG:ns NR:ns ## COG: MA3439 COG2221 # Protein_GI_number: 20092251 # Func_class: C Energy production and conversion # Function: Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits # Organism: Methanosarcina acetivorans str.C2A # 3 282 7 286 288 222 40.0 8e-58 MAQKVDYAALKKGGYMRQKQKGYGSLRLAVVGGNLTAENIKTVAEVAEKYGRGYVHMTSR QGIEIPFIKVEELAEVKEALAKGGVGTGVCGPRVRTVTACQGSEVCPSGCIDTYTLAKEL DERYFGRELPHKFKFGVTGCQNNCLKAEENDVGIKGGMNIEYKEDDCISCGVCVKACRQE ALKMVDGKIELDAQKCNHCGRCVKSCPVDAWKGTPGYIVSFGGTFGNNIYKGEELLPLIP DKETLFRVTDAAINFFEKNANPSERFRKTLQRVGEEDFRSQLKDAYEGQ >gi|226332935|gb|ACII01000084.1| GENE 67 82045 - 82944 484 299 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 3 298 2 303 306 191 38 2e-47 MKYDVGIIGSGPAGLSAAIYAKRANLSAVVIEKEYEGTGQIAESGQVDNYPGFPGISGYD LGENFREHAVKLGVSFMEQEVTEIKKEASSDFELVFADGSQVEAKTVIYAAGATPRRANI 
PGEQEYAGKGVSYCAICDGSFYRGKSVAVLGGGDTALDDAVYLADVAEKVYVIHRRKEFR GAAVTVAKLREKENVIFVLEHQVKEIIGEQKVTGVVLEDAAVIDVNGVFVAYGAVPQTDL LKKFAVLDDSGYVRAGETGETALEGLYVAGDARTKKLRQVVTAVSDGANAATAVAEYLK >gi|226332935|gb|ACII01000084.1| GENE 68 83104 - 83373 413 89 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0173 NR:ns ## KEGG: EUBREC_0173 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 89 18 106 106 114 62.0 1e-24 MYYVPDGTKECGYYECKKCGNRFLSMQTMKRIPCPDCEAEIDYEIGPDESLEDVLDTAEL IQKIEGEEEVEKMDTLLSLALTGGDYSWI >gi|226332935|gb|ACII01000084.1| GENE 69 83494 - 83805 332 103 aa, chain - ## HITS:1 COG:slr1593 KEGG:ns NR:ns ## COG: slr1593 COG2200 # Protein_GI_number: 16329581 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 1 96 192 287 303 91 43.0 4e-19 MEDFGSGYSSLNMLKEAPVDEIKLDMRFLSAADPYGRAEEILHMIITMGNHMKLSIIAEG VETEQQKVMLQGFGCNKAQGYFYARPMREAEYTELLRKEKAEN >gi|226332935|gb|ACII01000084.1| GENE 70 83805 - 84137 393 110 aa, chain - ## HITS:1 COG:no KEGG:Elen_2229 NR:ns ## KEGG: Elen_2229 # Name: not_defined # Def: diguanylate cyclase/phosphodiesterase # Organism: E.lenta # Pathway: not_defined # 34 106 37 110 981 69 43.0 4e-11 MSGKKRNAMLFVLLFTLLLGVIVPVQAQSDGRTKTIRVAYREDADFINKSSSGVYKGYGV EYLNKISQYTGWRYEYINESWENQLADLKSGKVDLICNAQKTEAREDSVF >gi|226332935|gb|ACII01000084.1| GENE 71 84229 - 84549 185 106 aa, chain - ## HITS:1 COG:no KEGG:Lebu_0233 NR:ns ## KEGG: Lebu_0233 # Name: not_defined # Def: putative transcriptional regulator # Organism: L.buccalis # Pathway: not_defined # 3 99 194 295 487 69 36.0 3e-11 MCQGNGEVTAAHPFCVILQKDRLYFKGTDRGTDRAVFLDKREFTGPIYTQIKEAVDFVLR NIRLGATIDGLVRKEKYELPPEAIREILFLVCIRLSVVEMGLKSNT >gi|226332935|gb|ACII01000084.1| GENE 72 84582 - 85355 550 257 aa, chain - ## HITS:1 COG:CAC3166 KEGG:ns NR:ns ## COG: CAC3166 COG1342 # Protein_GI_number: 15896414 # Func_class: R General function prediction only # Function: Predicted DNA-binding proteins # Organism: Clostridium 
acetobutylicum # 1 94 1 95 143 77 45.0 3e-14 MPRPVKCRKVCHFPNVLEFLPADNAEKKVPIVLTVDEYETIRLLDKKGYSQEQCAVSMQI ARTTVQRIYEIARKKIADALIDGHPLRIEGGDFRICDGQSSNCSFGGCYKQEIYKKYAAE KGEGIMRIAVTYENGQIFQHFGHTETFKIYDVEEGKVVHSEVVDTNGSGHGALAGVLNAL NADVLICGGIGGGAQTALAAAGIKLFGGVSGDADKAVEAFINKTLDYNPDVKCSHHEHSH GEGHTCGEHGCGSHSCH >gi|226332935|gb|ACII01000084.1| GENE 73 85473 - 86132 343 219 aa, chain + ## HITS:1 COG:FN0217 KEGG:ns NR:ns ## COG: FN0217 COG0664 # Protein_GI_number: 19703562 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 2 209 9 212 217 84 30.0 2e-16 MNFENYFPLWNDLNIAQKKIISDNLITRDVKKGTIIHNGNLDCTGLLLVKSGQLRTYILS DEGREITLYRLFDMDMCLLSASCIIRSIQFEVTIEAEKDTDLWIIPAEIYKDIMKESAPV ANYTNELMATRFSDVMWLIEQIMWKSLDKRVASFLLEETSIEGTNELKITHETIANHLGS HREVITRMLRYFQGEGLVKLSRGKITILDSKRLETLQRS >gi|226332935|gb|ACII01000084.1| GENE 74 86251 - 87942 1254 563 aa, chain - ## HITS:1 COG:FN1903_1 KEGG:ns NR:ns ## COG: FN1903_1 COG0446 # Protein_GI_number: 19705208 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 2 469 3 469 469 419 44.0 1e-117 MKVIIVGGVAGGATAAARIRRLDEHAEITVFERSGYISYANCGLPYYIGDVITDPEELTL QTPESFFKRFHINMKIHHEVISIHPDRKTVSVKNLENGEIFEENYDKLILSPGAKPTQPR LPGVSIDKLFTLRTVEDTFRIKEYINKNHPKSAILAGGGFIGLELAENLKELGMDVTIVQ RPKQLMNPFDPDMASMIHNEMRKHGIKLVLGYTVEGFREKDNGVEILLKDNPSLQADMVV LAIGVTPDTALAKEAGLELGIKGSIVVNDRMETSVPDIYAAGDAVQVKHYVTGDDALISL AGPANKQGRIVADNICGGDSHYLGSQGSSVIKVFDMTAATTGINETNAKKSGLEVDTVIL SPMSHAGYYPGGKVMTMKVVFEKETYRLLGAQIIGYEGVDKRIDVLATAIHAGLKATQLK DLDLAYAPPYSSAKDPVNMAGFMIDNIAKGTLKQWHLEDMDKISKDKNVVLLDVRTVGEF NRGHMKGFNNIPVDELRERISEIEKGKPVYLICQSGLRSYIASRILEGNGYETYNFSGGF RFYDTVVNDRALIERAYACGMDY >gi|226332935|gb|ACII01000084.1| GENE 75 87965 - 88270 326 101 aa, chain - ## HITS:1 COG:BH3098 KEGG:ns NR:ns ## COG: BH3098 COG0526 # 
Protein_GI_number: 15615660 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus halodurans # 1 101 1 100 104 107 49.0 7e-24 MSAININKNNFQNEIMDSEKTVLLDFWAPWCAPCRMVVPIIEEIASERPDIKVGKINVDE QPELASKFGIMSIPTLVVMKNGKIVTKVSGARPKKAILEML >gi|226332935|gb|ACII01000084.1| GENE 76 88450 - 88623 110 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579353|ref|ZP_04856623.1| ## NR: gi|253579353|ref|ZP_04856623.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 57 24 80 80 99 98.0 6e-20 MLSEDHQLSAVKLAEKIGVVSRNIENNIKKLKEYGILIGHGSPKNGYWEIIDKDLQE >gi|226332935|gb|ACII01000084.1| GENE 77 88749 - 89645 431 298 aa, chain - ## HITS:1 COG:no KEGG:LSL_1338 NR:ns ## KEGG: LSL_1338 # Name: not_defined # Def: DNA-binding protein # Organism: L.salivarius # Pathway: not_defined # 1 115 1 116 203 88 41.0 2e-16 MDLIKIGKYIAGKRKSLGMTQKQLAEKLGMSDKSVSKWERGVCLPDVSVYKELCSILGIS LNEFLAGEDIAQENMIQKSETNIIEVIRDNIDKQKCLKVMKCILLVISICAVSVIGFTIY RLKKPQNYISPVAKDSIEMQTAELLAGPDGAFVYKFITTDEYKKLRLHIYRYESGKLSDQ DKVEMGFEDIGSPKSGEIVMVSDFDNYVIKLIISGGGSRLSTEIPILENVENREYYGRTA TEIKNVVDIRYDKQQPLIAFVYDNDEMSVPTLDDFINSQTDFLSKNDYVYYVAFEFCK >gi|226332935|gb|ACII01000084.1| GENE 78 90204 - 90383 189 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579355|ref|ZP_04856625.1| ## NR: gi|253579355|ref|ZP_04856625.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 59 1 59 59 92 100.0 9e-18 MPRKPSLDGKDSSLRIRMSPEQKEKLVSYAERHYQTMSNVIFQALDILYAREEQQNNKE >gi|226332935|gb|ACII01000084.1| GENE 79 90603 - 91010 135 135 aa, chain + ## HITS:1 COG:no KEGG:Apre_0590 NR:ns ## KEGG: Apre_0590 # Name: not_defined # Def: hypothetical protein # Organism: A.prevotii # Pathway: not_defined # 27 125 1 99 105 64 35.0 1e-09 MNSIDKKTWYVYDIKYAKKTWYGGVFMKKDEILNASRKEHRNKDLAEMEVVYQAGSHASR VGVLVCCLLSLLSSVLAHTMIYSPWVIYFSIIATQWLVRFIKMKRKSDLVLTVLFFVLSI LAFVGFVSHLLEVRI >gi|226332935|gb|ACII01000084.1| GENE 80 91007 - 91213 161 68 aa, chain + ## HITS:1 COG:SPy1934 KEGG:ns NR:ns ## COG: SPy1934 COG1476 # Protein_GI_number: 15675737 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 7 67 1 62 68 60 45.0 9e-10 MKEQLQLKNHLKEVRTEANLSQAQLAEMVGISRNTISSIETGQFNPTAKLALILCIALDK KFEELFYF >gi|226332935|gb|ACII01000084.1| GENE 81 91223 - 91525 190 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|197302208|ref|ZP_03167267.1| ## NR: gi|197302208|ref|ZP_03167267.1| hypothetical protein RUMLAC_00935 [Ruminococcus lactaris ATCC 29176] # 1 100 1 100 100 150 98.0 2e-35 MENIIMLILGVFISVVGIVNIKGNISTIHSYNRRKVKEEDIPKYGKTVGTGTLIIGISLV LGFIVSFWSEIIIDYIILPAVIVGLGFILYGQFKYNKGIF >gi|226332935|gb|ACII01000084.1| GENE 82 91545 - 92342 776 265 aa, chain + ## HITS:1 COG:BS_yhfC KEGG:ns NR:ns ## COG: BS_yhfC COG4377 # Protein_GI_number: 16078082 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 6 204 2 200 258 115 40.0 7e-26 MELNTISGLAIAGVICSVVLSMGVPIALFIAGKVKLKARISSFFIGAGTYLLFAMLLEQL LHVLVIQFCGLNAQSRPWLYYVYAALAAAVFEETGRLIAMKFWMKKWLDFPNALMYGIGH GGVEAILIGGLSGISNLVSMLMINSGAMQNTLAALPAESANQTVSQLSALWTTPAPLFFV SGIERISAIILHIGLSLLIYRAVKAGKCRTAAFTAVLAYGIHFIVDFFAVAGPALLPIYV IEIGVFVMAAGTLVMALKMGRMEEL >gi|226332935|gb|ACII01000084.1| GENE 83 92887 - 93096 125 69 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1224 NR:ns ## KEGG: EUBREC_1224 # Name: not_defined # Def: 
hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 69 26 94 94 94 62.0 1e-18 MKAHEYWSVGALITMLGTFYSGYKGSKSSHKYFAASSLLCMIMAIYTGHRIISKNKKTEK ESNSIENEE >gi|226332935|gb|ACII01000084.1| GENE 84 93172 - 93696 539 174 aa, chain - ## HITS:1 COG:CAC3334 KEGG:ns NR:ns ## COG: CAC3334 COG0655 # Protein_GI_number: 15896577 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 159 1 164 178 82 34.0 3e-16 MKIVIINGSARKGNTLTAINAFIKGASEKNEIEIIEPDKLNIAPCKGCGVCQCSKGCVDK DDTNPTIDKIVAADMILFATPVYWWGMSAQLKLIIDKCYCRGLQLKNKKVGTIVVGGSPV DSIQYELIDKQFDCMAKYLSWDMLFKKSYYATARDELEKNKDSMNELEGIGKNL >gi|226332935|gb|ACII01000084.1| GENE 85 94237 - 95082 356 281 aa, chain + ## HITS:1 COG:PM1941 KEGG:ns NR:ns ## COG: PM1941 COG0789 # Protein_GI_number: 15603806 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Pasteurella multocida # 3 120 7 130 132 64 29.0 2e-10 MTIKDVEERTGLSRSNIRFYEKEKLIEPSRNESNGYRDYSENDVENIKKIAYLRTLGISI EDIRSIISEKVTLQEMLEKQKEVLKNQITDLNKAKLMCEKMPDEESISYEKLQVEQYVTD LHDYWKDNRTVFKLDSVSFLYIWGSMLTWTMITALCLIIGALSYSKLPTEIPVQWSKGVA TSLVNKNWIFICPVICIIIRYLLKPFIYAKLQMNNYYGEIITEYLTNYMCFIVLSVEIFS ILFTFGVVKSVVVLLFVDTAIFIGLLVVGLVKMDLRGKEVL >gi|226332935|gb|ACII01000084.1| GENE 86 95610 - 96248 658 212 aa, chain + ## HITS:1 COG:MJ0455 KEGG:ns NR:ns ## COG: MJ0455 COG4887 # Protein_GI_number: 15668631 # Func_class: R General function prediction only # Function: Uncharacterized metal-binding protein conserved in archaea # Organism: Methanococcus jannaschii # 34 205 19 187 193 139 48.0 4e-33 MKKEDMSCIDCAVKNCNKMDKTYPDFCLTTHMDEEVLNEAMECYNEDENRKVTIAAAEVE YENYCKHTRVEEIMDFAKKINAKKIGIATCVGLLKESRILADILRRHGFEVYGVGCKAGT QKKTSVGIPECCEGVGVNMCNPILQAKLLNKAKTDLNVVVGLCVGHDSLFYKYSEALTTT AVTKDRVLGHNPVAALYTADSYYSKLKKSEEE >gi|226332935|gb|ACII01000084.1| GENE 87 96357 - 97289 428 310 aa, chain - ## HITS:1 COG:SA1835 KEGG:ns NR:ns ## COG: SA1835 COG0582 # Protein_GI_number: 15927603 # 
Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Staphylococcus aureus N315 # 2 299 87 373 390 96 28.0 8e-20 MREKTYHLYLNEEERSRVIQSLIELKNNLAAQGRYTDAVDDTVLHQVLQIAVDDDFIRNN PSDNVLRELKKAHCFQSEKRRALTKPEQELFLNFLKTHPVYEHWYPVFAVMIGTGLRVGE VTGLRWCDIDMESGMIDVNYTLVYYDHRTEGSKSGCYFNVNTTKTPASMRQVPMLGFVKE AFEHEKQKQEDLGLHCEVTIDGYTDFIFINRFGQAQHQATLNKAIRRIIRDCNDEQFLHS DEPDVLLPHFSCHSLRHTFTTRMCEAGVNIKVIQDALGHSDISTTLNIYADVTKEMKAEE FKGLDSYFKV >gi|226332935|gb|ACII01000084.1| GENE 88 97286 - 97474 70 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291520870|emb|CBK79163.1| ## NR: gi|291520870|emb|CBK79163.1| hypothetical protein [Coprococcus catus GD/7] # 1 62 400 461 461 130 100.0 2e-29 MKAYMCKQRGIRLIKLPMKGTELDYANNLKKAFQSVHIFISSDTEEDVEIIKNTFERWRD SQ >gi|226332935|gb|ACII01000084.1| GENE 89 97480 - 98091 71 203 aa, chain - ## HITS:1 COG:no KEGG:wcw_1894 NR:ns ## KEGG: wcw_1894 # Name: not_defined # Def: hypothetical protein # Organism: W.chondrophila # Pathway: not_defined # 13 193 640 819 877 115 38.0 7e-25 MIEVIFVKGTCGHEWQTSVKARSNGEKCPICSGARVIAGINDLAILEPLLVKQWSKKNKI KPTEVSIGSHKKVIWRCKKGHEWEAAVKSRTINKTGCPYCSHNKVLAGFNDFATLLPDIA AEWSDRNYPLLPTQVTVFANRKAWWKCKDCGREWNTLISTRSGGSKCPYCSGYIFSKGFN GLQTTHPEIQKSWLLSQEVPMRI >gi|226332935|gb|ACII01000084.1| GENE 90 98761 - 99057 186 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225389716|ref|ZP_03759440.1| ## NR: gi|225389716|ref|ZP_03759440.1| hypothetical protein CLOSTASPAR_03464 [Clostridium asparagiforme DSM 15981] # 1 98 1 98 98 121 93.0 2e-26 MEKKKFQSVSIAALIVSIIPLAALAPSLLHLSLSDGVRTAWAGANIVFVLLGLILSVVCV RSRGSRSVINIASTAISAFWALLMLGIVALAMFLTFVQ >gi|226332935|gb|ACII01000084.1| GENE 91 99062 - 99826 654 254 aa, chain - ## HITS:1 COG:BH0447 KEGG:ns NR:ns ## COG: BH0447 COG4200 # Protein_GI_number: 15613010 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 3 252 2 245 247 62 26.0 1e-09 
MSFLDLLKIEFMKVKRSKIVPLIFIAPLLVVVSGVAYLSNYFTPEYTNAWAAMFIQSALV YAYYLLPFSMIVVCVMIAGRETGNNGILKMLALPVSRCALSIAKFCVLTFYLFMEMVVFL VVFVIAGLIATQTMGVTETLPILYLLKWCLGLFLTMLPCIAAMWAITVLFEKPLLSVGLN LLLVIPGVLVANTSLWIAYPYCYSGYLVSCSLYDFTAETSDAAFSVFPFLPCAIVVFGLF FALAVTQFGKKEMR >gi|226332935|gb|ACII01000084.1| GENE 92 99843 - 100568 623 241 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0424_0157 NR:ns ## KEGG: HMPREF0424_0157 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis # Pathway: not_defined # 1 241 1 241 241 297 71.0 2e-79 MKTLAIELRKEKRTGVIPVLLAVGVLGAAYAFVNFIMRKDTLLNLPLAPMDVLLTQLYGM IMVLNLFGIVVATCMIYNMEFKGSAVKKMYMLPVSVPAMYLCKFLILTVMFLVAIVLQNL ALAQIGMTDLPDGAFEMGTLVRFAGYSFLTSMPILSFMLLISSRFENMWVPLGIGVAGFL SGMALANSGLTLLLVHPFVIILKPAVAMSAQPDSTVALVALVETLLFLAVGLWLAKYRRY E >gi|226332935|gb|ACII01000084.1| GENE 93 100565 - 101329 268 254 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 10 232 14 240 309 107 32 2e-22 MDYIITTEQLTKKYKNFTSVNHVSLHIRKGSIYGFLGPNGAGKSTTMKMLLGLTDPTKGS FTIDGKQFPQDRIAILKEIGSFIEAPSFYANLTGRENLDIIRRILGLPKADVEDALELVG LTEFGDRLAKQYSLGMKQRLGFAGALLGRPPILILDEPTNGLDPSGIHEIRNLVKSLPNL YDCTVLISSHMLSEIELIADDIGILNHGRLLFEGSMNDLRQYALQSGFAADNLEDVFLSM VEKDNMDRKQRAKL >gi|226332935|gb|ACII01000084.1| GENE 94 101552 - 102187 448 211 aa, chain + ## HITS:1 COG:CAC0289 KEGG:ns NR:ns ## COG: CAC0289 COG0745 # Protein_GI_number: 15893581 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 207 21 233 235 177 41.0 1e-44 MVSDILKDAGFETVLTAMTVKEAILTAKEETPDLIVLDVMLPDGDGFSLMQHFRTFTNVP IIFLTAKDEAADKLSGLGLGADDYISKPFMPQELLLRIYAVLRRTYKEDSPLLVLDGCTI DFSRAEVHKGSEIISLTAKEHTLLETLARNAGKIVTVDALCEALWGDNPFGYENSLNAHV RRIREKIETDPSKPVSLITIKGLGYKLIARK >gi|226332935|gb|ACII01000084.1| GENE 95 102192 - 103556 322 454 aa, chain + ## HITS:1 COG:BS_yvrG KEGG:ns 
NR:ns ## COG: BS_yvrG COG0642 # Protein_GI_number: 16080374 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 196 450 303 570 573 119 28.0 2e-26 MKSFGSYISKYLVSFVAFILILLFLNAVVFGLTFQKIVTEDYGDSSPQAMLEMTATAATP EQLSDEAVQMLRQNHIWAIYLNTDGQCYWSVDLPDNVPKNYTIQDVALFSKGYIEDYPVF IWNTDDGLLVLGYPTDSYTKLTSNYYSIAALQRLPIFVLGMLGLDLLCLFSAYYFSKRRI IHNTEPIVSAVETLADGKPVSLHISGELSEIASSVNKASSILNRQNEARANWISGVSHDI RTPLSMIMGYAGRIAESKAASKSIREQAEIVRKQSVKIKELVQDLNLVSQLEYEMQPLHK EMVRLSKLLRSYVADLLNTGISDSYNIGIKITPDAENAVLECDARLISRAVNNLVQNSIK HNPQGCEVCLSLTASQNHLILAVTDNGTGLSAEKLQELEEKPHYMEITDERLDLRHGLGL LIVQQIAIAHNGNFKLTNVFPKGCEATLIFPYIG >gi|226332935|gb|ACII01000084.1| GENE 96 103851 - 104117 247 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|217388356|ref|YP_002333385.1| ## NR: gi|217388356|ref|YP_002333385.1| hypothetical protein pMG2200_24 [Enterococcus faecalis] # 1 88 227 314 314 152 88.0 7e-36 MVEISPHFDALASTKDNDRLFAMLPYKSLCFTSIKDRHGVYAMINKDERRDVSIRKPRPS IRKQLADSKQATSPKKAAARTKKNELEV >gi|226332935|gb|ACII01000084.1| GENE 97 104121 - 104279 84 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579375|ref|ZP_04856645.1| ## NR: gi|253579375|ref|ZP_04856645.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 52 1 52 52 95 100.0 9e-19 MMHFTVEEENLICMYHNADRRRTIGKITAAMPGIDGDRIKHSHTPPTMREKP Prediction of potential genes in microbial genomes Time: Sat May 28 20:02:10 2011 Seq name: gi|226332934|gb|ACII01000085.1| Ruminococcus sp. 5_1_39B_FAA cont1.85, whole genome shotgun sequence Length of sequence - 8578 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 4, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 81 - 140 3.6 1 1 Op 1 . + CDS 165 - 869 233 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 2 1 Op 2 . + CDS 870 - 1616 404 ## Blon_1000 bacteriocin ABC transporter, permease protein subunit, putative 3 1 Op 3 . 
+ CDS 1619 - 2362 557 ## Blon_1001 hypothetical protein 4 1 Op 4 . + CDS 2355 - 2732 329 ## GWCH70_0240 SpaI 5 1 Op 5 40/0.000 + CDS 2741 - 3433 388 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 6 1 Op 6 . + CDS 3424 - 4788 588 ## COG0642 Signal transduction histidine kinase + Term 4805 - 4837 -0.9 + Prom 4818 - 4877 3.8 7 2 Tu 1 . + CDS 5048 - 5152 82 ## + Term 5178 - 5220 12.4 8 3 Op 1 . - CDS 5586 - 5804 254 ## gi|253579382|ref|ZP_04856652.1| conserved hypothetical protein 9 3 Op 2 . - CDS 5797 - 6024 307 ## EUBELI_01778 hypothetical protein - Prom 6119 - 6178 5.9 - Term 6289 - 6356 4.4 10 4 Tu 1 . - CDS 6413 - 7990 2079 ## COG0519 GMP synthase, PP-ATPase domain/subunit - Prom 8048 - 8107 8.5 Predicted protein(s) >gi|226332934|gb|ACII01000085.1| GENE 1 165 - 869 233 234 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 204 1 212 305 94 32 3e-19 MDMILKTTDLCKNFKGQMAVNNVSLNIRRNSVYGLLGPNGAGKSTILKMFTGILRPTSGS IEFDGHPWKRNDLEHIGALIEMPPLYENLTAYENLKVRTTLLGLDDARINEVLQIVQLTN TGKKRAGQFSLGMKQRLGIAIALLNSPQLLILDEPTNGLDPLGIEELRELIRSFPCKGIT VILSSHILSEVQQIADHVGIIAGGVLGYEGELRAGEDLEQLFMDVIRRNGKEGY >gi|226332934|gb|ACII01000085.1| GENE 2 870 - 1616 404 248 aa, chain + ## HITS:1 COG:no KEGG:Blon_1000 NR:ns ## KEGG: Blon_1000 # Name: not_defined # Def: bacteriocin ABC transporter, permease protein subunit, putative # Organism: B.longum_infantis_ATCC15697 # Pathway: ABC transporters [PATH:bln02010] # 5 247 4 246 247 213 49.0 5e-54 MEMAYIQAENLKHKRTFTKTLIVLAPFVTALMNFFAPLWFQLNSYNWWYILLYPGFLTLT CALIEQRDNGKLKYRAVASLPVSQNKVWKAKIGVAGIYSCVGNFIFLALNLLGGFAILVI NEIPLTIGIWQAAAGTACIVIASLWEVPLCLWLSKKVGIFVTVILNAGLGSVLGIFTATT SLWMICPYSWVPHLMISVLGILPNGEPVADQSTAMAFWMIILVLVISLAWFAALSFLTAR WFEKKEVG >gi|226332934|gb|ACII01000085.1| GENE 3 1619 - 2362 557 247 aa, chain + ## HITS:1 COG:no KEGG:Blon_1001 NR:ns ## KEGG: Blon_1001 # Name: not_defined # Def: 
hypothetical protein # Organism: B.longum_infantis_ATCC15697 # Pathway: ABC transporters [PATH:bln02010] # 1 247 1 247 247 216 45.0 7e-55 MVSVVRCWKAEYQKCKHSILLYMHSMIPIICAAIFAGYYHISRWELATKISAYLEVLAVA FPFLIGIIVGLVVQIENQAGHYQLLLGTIPSRMATYIGKLGFLMICAFGATFLALGTFAA LYRDAPASLYLKAGILLLITMLPIYLIHLFVGMSFGKGASMGLGIAGSLIAALMITGLGD ATWKHIPWAWGVRAMDYTVLAWDSPQLYAQVKTDFFSGMIISVCCTVCLLIASLVWFHGW EGGKNSE >gi|226332934|gb|ACII01000085.1| GENE 4 2355 - 2732 329 125 aa, chain + ## HITS:1 COG:no KEGG:GWCH70_0240 NR:ns ## KEGG: GWCH70_0240 # Name: not_defined # Def: SpaI # Organism: Geobacillus_WCH70 # Pathway: not_defined # 6 121 8 146 147 68 35.0 6e-11 MNKKCLSAVVIVLVSLICLSACGALRDTADKNKALNESLPYYELNAANYDEISYNGLTYT ITDECLEMPELQEEIGQVSKRFKNVAGEDFSFGYVYSIVDVDISNAVAVNINNEYRKADI KNNDE >gi|226332934|gb|ACII01000085.1| GENE 5 2741 - 3433 388 230 aa, chain + ## HITS:1 COG:SPy1081 KEGG:ns NR:ns ## COG: SPy1081 COG0745 # Protein_GI_number: 15675069 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Streptococcus pyogenes M1 GAS # 14 230 4 227 228 165 40.0 5e-41 MWYNGAGGDFLTHLLVVDDEVSILELIKNSLGKDGYLITVCQNADDVDIKKLHFYDLILL DVMMPGTDGFEFCKQIRNMVDCPILFLTAKTLEEDILFGLGIGADDYITKPFRIQELRAR VAAHLRREKREHHSTLSFEPDIRFDLSAKVLYVSEQPVPLTKSEYSICEYLAKNRGQVFT KEQIYEAVFGFDGIGDNSTISTHIKNIRAKLEHFKISPVSTVWGIGYKWE >gi|226332934|gb|ACII01000085.1| GENE 6 3424 - 4788 588 454 aa, chain + ## HITS:1 COG:BS_phoR_3 KEGG:ns NR:ns ## COG: BS_phoR_3 COG0642 # Protein_GI_number: 16079962 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 206 435 30 255 279 111 30.0 3e-24 MGVKKKPTLKILFRQFAISLIVMLVAAIIVPFGLEGLAVNAGLATRANLSELQVKEIIPT LTIAPDITKVMIPQGCGYLILDKNFNELYSNMDDDEKEIALLYAKGEYIEYATGRQFALV VRENEFCVLRYYIGSQFTVSWLPEYFPSPDTLAFILMAVNSLLVIIILTARFAKNLRTQL TPLFEATAEVSKQNLDFEVGHAKIKEFEDVLASFSDMKDNLKISLERQWKTEQTQKEQIA 
ALTHDLKTPLTVIQGNADLLTETNLDDEQRLYAGYVVESSGQMQSYIQTLIDISQAAVGY QLHIESIDLPAFMQHLFGYMESLCWTKEIRLQMNTVSLPQMLKFDRVLIERAIMNVISNG LDYSPQGGTLYVDVQSNNGFVEISVTDEGTGFSKEALCHAQERFYMGDQSRNSKLHFGMG LYITNSIMEQHNGQLILENSKETGGAKVTMKLPC >gi|226332934|gb|ACII01000085.1| GENE 7 5048 - 5152 82 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLFYYHYNLIHFYRLKKALPRHGNIPAMSMNYTM >gi|226332934|gb|ACII01000085.1| GENE 8 5586 - 5804 254 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579382|ref|ZP_04856652.1| ## NR: gi|253579382|ref|ZP_04856652.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 72 1 72 72 120 100.0 2e-26 MRKRKMYLSVEDIAELFGLSVSFAYRVVERMNADLESKNYYVILGRVPTRYVEDKIYGLE HVEQYLKEDKAI >gi|226332934|gb|ACII01000085.1| GENE 9 5797 - 6024 307 75 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01778 NR:ns ## KEGG: EUBELI_01778 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 64 1 64 65 63 43.0 2e-09 MQNKLFLKAADICELLEVKQTSAYEIIGNLNKELEEQGYLTLRGKVPTKYFVKRFYGVED TCEIPQEEGKEWFHA >gi|226332934|gb|ACII01000085.1| GENE 10 6413 - 7990 2079 525 aa, chain - ## HITS:1 COG:CAC2700_2 KEGG:ns NR:ns ## COG: CAC2700_2 COG0519 # Protein_GI_number: 15895957 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Clostridium acetobutylicum # 196 525 1 316 316 425 66.0 1e-118 MKRETVVVLDFGGQYNQLVARRVRECNVYCEIYSYKTDLEKIKAMNPKGIILTGGPNSCY EADSPTCNKELFELGIPVLGLCYGAQLMMHVLGGKVEKADVSEYGKTEVLVDKTDSKIFK DVSDKTICWMSHTDYISQVAPGFEIAAHTADCPVATAQNEEKKLYAIQFHPEVLHTVEGK KMLSNFVLGVCGCAGDWKMDAFVEHTIREIREKVGDGKVLLALSGGVDSSVAAGLLSRAI GKQLTCVFVDHGLLRKDEGDEVEGVFGPNGQFDLNFIRVNAQQRYYDKLAGVTEPEAKRK IIGEEFIRIFEEEAKKIGAVDFLAQGTIYPDVVESGLGGESAVIKSHHNVGGLPDFVDFK EIIEPLRDLFKDEVRKAGLELGIPERLVFRQPFPGPGLGIRIIGEVTAEKVRIVQDADFI YREEVDNAAAEYKKEHGEDPSWMPNQYFAALTNMRSVGVMGDFRTYDYAVALRAVKTIDF MTAESAEIPYAVLNKVMNRIINEVKGVNRVFYDLTSKPPGTIEFE Prediction of potential genes in microbial genomes Time: Sat May 28 20:02:45 2011 Seq name: 
gi|226332933|gb|ACII01000086.1| Ruminococcus sp. 5_1_39B_FAA cont1.86, whole genome shotgun sequence Length of sequence - 51699 bp Number of predicted genes - 46, with homology - 44 Number of transcription units - 24, operones - 9 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 15 - 2408 2371 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases - Prom 2456 - 2515 4.9 2 2 Tu 1 . - CDS 2629 - 4029 1004 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes - Prom 4205 - 4264 5.8 - TRNA 4557 - 4632 85.8 # Lys CTT 0 0 - Term 4976 - 5021 4.1 3 3 Tu 1 . - CDS 5058 - 5723 693 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase - Prom 5942 - 6001 6.2 + Prom 5867 - 5926 9.2 4 4 Op 1 . + CDS 6030 - 6341 251 ## gi|253579388|ref|ZP_04856658.1| conserved hypothetical protein 5 4 Op 2 . + CDS 6390 - 7433 1040 ## COG1527 Archaeal/vacuolar-type H+-ATPase subunit C 6 4 Op 3 . + CDS 7501 - 9441 1799 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I + Prom 9491 - 9550 1.9 7 5 Op 1 . + CDS 9586 - 10017 759 ## FMG_1169 V-type sodium ATP synthase subunit K 8 5 Op 2 . + CDS 10019 - 10327 259 ## COG1436 Archaeal/vacuolar-type H+-ATPase subunit F + Prom 10411 - 10470 4.0 9 6 Op 1 . + CDS 10490 - 11077 702 ## Cphy_3443 hypothetical protein 10 6 Op 2 16/0.000 + CDS 11070 - 12842 1714 ## COG1155 Archaeal/vacuolar-type H+-ATPase subunit A 11 6 Op 3 16/0.000 + CDS 12905 - 14287 1806 ## COG1156 Archaeal/vacuolar-type H+-ATPase subunit B 12 6 Op 4 . + CDS 14306 - 14923 823 ## COG1394 Archaeal/vacuolar-type H+-ATPase subunit D + Term 14937 - 14985 14.1 13 7 Tu 1 . - CDS 15612 - 16325 886 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 16393 - 16452 7.1 + Prom 16310 - 16369 7.0 14 8 Tu 1 . + CDS 16601 - 18376 1068 ## COG4805 Uncharacterized protein conserved in bacteria - Term 18434 - 18482 6.0 15 9 Tu 1 .
- CDS 18507 - 21143 2947 ## COG0474 Cation transport ATPase - Prom 21187 - 21246 4.5 - Term 21206 - 21238 1.5 16 10 Op 1 5/0.000 - CDS 21248 - 21856 672 ## COG2310 Uncharacterized proteins involved in stress response, homologs of TerZ and putative cAMP-binding protein CABP1 17 10 Op 2 5/0.000 - CDS 21884 - 22465 782 ## COG2310 Uncharacterized proteins involved in stress response, homologs of TerZ and putative cAMP-binding protein CABP1 - Term 22511 - 22546 3.7 18 10 Op 3 1/0.000 - CDS 22563 - 23144 737 ## COG2310 Uncharacterized proteins involved in stress response, homologs of TerZ and putative cAMP-binding protein CABP1 19 10 Op 4 . - CDS 23174 - 24328 1456 ## COG3853 Uncharacterized protein involved in tellurite resistance 20 10 Op 5 . - CDS 24335 - 26608 1403 ## CA_C1410 hypothetical protein - Prom 26725 - 26784 6.7 + Prom 26689 - 26748 6.8 21 11 Tu 1 . + CDS 26930 - 27892 479 ## COG2301 Citrate lyase beta subunit 22 12 Op 1 . - CDS 27905 - 28957 960 ## BCB4264_A5099 hypothetical protein 23 12 Op 2 . - CDS 28959 - 30065 671 ## CDR20291_1539 hypothetical protein 24 12 Op 3 . - CDS 30077 - 30514 500 ## COG1846 Transcriptional regulators + Prom 30560 - 30619 7.2 25 13 Tu 1 . + CDS 30857 - 32260 1391 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) + Term 32287 - 32324 5.5 - Term 32274 - 32312 1.1 26 14 Tu 1 . - CDS 32361 - 33134 952 ## COG0084 Mg-dependent DNase - Prom 33156 - 33215 3.6 - Term 33174 - 33209 -0.7 27 15 Tu 1 . - CDS 33284 - 33712 501 ## COG2258 Uncharacterized protein conserved in bacteria 28 16 Op 1 11/0.000 - CDS 33879 - 34367 224 ## PROTEIN SUPPORTED gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 - Prom 34421 - 34480 4.7 29 16 Op 2 . - CDS 34493 - 34969 637 ## COG0315 Molybdenum cofactor biosynthesis enzyme - Prom 35132 - 35191 5.0 - Term 35076 - 35114 3.0 30 17 Tu 1 . - CDS 35232 - 37190 2230 ## COG0143 Methionyl-tRNA synthetase - Prom 37399 - 37458 6.0 31 18 Tu 1 . 
+ CDS 37131 - 37292 103 ## + Term 37315 - 37349 -1.0 + Prom 38002 - 38061 6.3 32 19 Op 1 . + CDS 38111 - 38293 102 ## 33 19 Op 2 . + CDS 38202 - 38615 374 ## COG0735 Fe2+/Zn2+ uptake regulation proteins + Prom 38634 - 38693 7.2 34 20 Tu 1 . + CDS 38875 - 39420 798 ## COG1592 Rubrerythrin + Term 39457 - 39491 5.2 - Term 39437 - 39485 4.0 35 21 Tu 1 . - CDS 39513 - 39857 283 ## gi|253579419|ref|ZP_04856689.1| predicted protein - Prom 39898 - 39957 9.1 - Term 40112 - 40156 5.1 36 22 Tu 1 . - CDS 40161 - 41156 910 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 41191 - 41250 5.6 - Term 41572 - 41608 6.2 37 23 Op 1 55/0.000 - CDS 41627 - 42319 993 ## PROTEIN SUPPORTED gi|238916246|ref|YP_002929763.1| large subunit ribosomal protein L1 38 23 Op 2 45/0.000 - CDS 42381 - 42806 631 ## PROTEIN SUPPORTED gi|238922786|ref|YP_002936299.1| ribosomal protein L11 39 23 Op 3 . - CDS 42877 - 43392 623 ## COG0250 Transcription antiterminator 40 23 Op 4 . - CDS 43408 - 43617 151 ## gi|253579424|ref|ZP_04856694.1| conserved hypothetical protein 41 23 Op 5 . - CDS 43670 - 43897 246 ## PROTEIN SUPPORTED gi|160881814|ref|YP_001560782.1| ribosomal protein L33 - Prom 43924 - 43983 4.4 42 24 Op 1 5/0.000 - CDS 44082 - 44819 205 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 44839 - 44898 6.6 - Term 44840 - 44875 -0.7 43 24 Op 2 . - CDS 44969 - 46681 1697 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 44 24 Op 3 . - CDS 46723 - 47883 837 ## gi|253579428|ref|ZP_04856698.1| predicted protein 45 24 Op 4 . - CDS 47908 - 50334 1761 ## COG3475 LPS biosynthesis protein 46 24 Op 5 . 
- CDS 50331 - 51638 1148 ## COG0615 Cytidylyltransferase Predicted protein(s) >gi|226332933|gb|ACII01000086.1| GENE 1 15 - 2408 2371 797 aa, chain - ## HITS:1 COG:SP0312 KEGG:ns NR:ns ## COG: SP0312 COG1501 # Protein_GI_number: 15900245 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Streptococcus pneumoniae TIGR4 # 76 681 2 592 679 636 49.0 0 MDLKKFVLEGNPVCRKEAVIVGDHFRITMLTTALIRFEYSEDGGFEDRATQMVCNRDFPV PEFRVSDGGEELHIYTKDLEIHYDRQKFSPSGLMIRVAGGKASERVWHYGDEPKDLLGTA RTLDEADGEIPLSHGIMSRNGFSVLDDSHTMAMGEDGMVEPRQGNRADFYFFGYGHRYVE CLQDFYRLCGKTPLLPRYTFGNWWSRYHKYTETEYKELVERFEKEEVPFSVAVVDMDWHL VEDVPPVYGSGWTGYTWNKKFFPNPPEFMDWLHKHGYKITLNVHPADGVRAYEEAYPRVA EKMGIDPASKEPVLFDMTDPKFIETYFEELHHPMEEEGVDFWWLDWQQGTVTKVPGLDPL WMLNHYHYLDSKWKGKRALTFSRYAGPGSHRYPVGFSGDTFITWESLKFQPYFTANASNI GFGWWSHDIGGHMFGYRDDELTARWVQLGVFSPMNRLHSTDNPFNGKEPWKYNQIVETVM KNFLKLRHKLVPYLYTMNRRASRAGLPLVQPMYYLEPEREETYEVPNNYYFGTEMMVSPI TDKLDPVTGLAGAKTWIPQGIWYDFFNGRAYKGGRKVDLWRDIYEMPVLVREGSIIPLKD MEGYDNSIENPEKLEVLVYPGESGEFVLWEDGGDTPEDLDENWVSTRMTKTADENGTVFI VEAAQGNTAVIPQKRSWKIRFCNIQDKPQEVTVNGQVYKDAEFAEDKKLHGTIVILKDVP ADAQVKVTFAADAAVYQRDYAEEVYEILEKAQITYAQKTEVYKVVKELGTEAVPVLVSMN LNPSLLGVLMEILTMGI >gi|226332933|gb|ACII01000086.1| GENE 2 2629 - 4029 1004 466 aa, chain - ## HITS:1 COG:NMA1646 KEGG:ns NR:ns ## COG: NMA1646 COG1502 # Protein_GI_number: 15794540 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Neisseria meningitidis Z2491 # 38 419 58 471 525 110 24.0 6e-24 MKKILKRTGKVFCLFLLIVVLVNVLSPLFCRKPDEKYVESLRETEFTSETEGTERICCID DNEEALLWRLRMIGTAKESIVLSTFDLRADDNGTKILAALNCAAARGVKIQLLIDGIYQQ LFLAGSSDFQALASYENVEVGVYNPVTPVNLFKVNYRMHDKYLIVDEKMYLLGGRNSNDI FLGNQTKGINEDRDILVYDTSEGQGESLNQLEDYFHKIWKESCVSIKKGKQSSRYTDAYR HMEEIYISLLKRYNDIETYSAWEKDTIEANKITLINNGIEAGRKTPQVLQTIQYLTENAD HVIIQTPYVICNGYMYDVLQGISDHAKLQIVLNAVEKGSNPWGCTDYLNQKKKILETGAD
VYELMNDYPVHTKAVLINDRLSVVGSYNLDMRSTYLDTELMLVIDSEKLNQQIHETESDY MEKSKEVLANGQETEGAKYQGKALNRKKKLYYGVLRIIIRPLRQLL >gi|226332933|gb|ACII01000086.1| GENE 3 5058 - 5723 693 221 aa, chain - ## HITS:1 COG:BS_ytmQ KEGG:ns NR:ns ## COG: BS_ytmQ COG0220 # Protein_GI_number: 16080042 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Bacillus subtilis # 1 210 1 204 213 199 46.0 4e-51 MRLRNIPGAQDAILESPYVVQEPQIKKGHWGEVFVKKQPLHIEVGMGKGRFLMDLARLHP DINYIGIEMYDSVLLRALQKREELEENGEVYSNLFFMRVDARLLPEIFEKGEVDKIYLNF SDPWPKARHAKRRLTSREFLARYDQILVQDGKVEFKTDNKELFEFSLEEVEEAGWNLEAS TFDLHHNEEMVQGNVMTEYEEKFSSMGNPICKMVISREYHR >gi|226332933|gb|ACII01000086.1| GENE 4 6030 - 6341 251 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579388|ref|ZP_04856658.1| ## NR: gi|253579388|ref|ZP_04856658.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 103 1 103 103 110 100.0 2e-23 MEQILNKLSEIELTAQRIMEDCDRQKQQLSEEAEQKCKNYDEQLETRTAEQIRRIRQQLE EEKDSRLAQLRADTDATFSSLDAHYEQQHSQLSRELFEKILAM >gi|226332933|gb|ACII01000086.1| GENE 5 6390 - 7433 1040 347 aa, chain + ## HITS:1 COG:MTH957 KEGG:ns NR:ns ## COG: MTH957 COG1527 # Protein_GI_number: 15678975 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit C # Organism: Methanothermobacter thermautotrophicus # 11 326 53 365 385 59 22.0 7e-09 MGGVLSYSGLSTKIRAMQSRLTTMDQFEEILQLSDVTQVAAYLKRMPEYSSRWDALDENT LHRGQIEKLLKKSIFQNFSRIYHFANPEQRKFLDLYSKRYEIRVLKEVMTNIFDHRDTDP VDVSPYREFFRLHSNIDVDRITTCSTMEELISCLKGNEFYIPLSKIQEHETALLFDYGMA LDLYYFTQIWNIRKKLFKGKDLEEITCTYGEKFDMLNLQFIQRSKRYYNMDPASIYALLI PVNYKLKKEEITALVEAPSYEEGRRIFQKTWYGNKYEQLTAANLEEFYNHIHRSILEKES HRNPYSVAVIYSYLYNKEHEVNRLTIAIECVRYGVQHDEAMRYICNS >gi|226332933|gb|ACII01000086.1| GENE 6 7501 - 9441 1799 646 aa, chain + ## HITS:1 COG:PH1981 KEGG:ns NR:ns ## COG: PH1981 COG1269 # Protein_GI_number: 14591717 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type 
H+-ATPase subunit I # Organism: Pyrococcus horikoshii # 88 642 109 646 659 134 26.0 4e-31 MIEKMKFLSITGPKADIDRVVNTYLSKYEIHLENALSELTAVESLSPFMEINPYKEALST ANSFYDELADPEKISAKPMDTEQALNTIRRIQSDYRELSNKRADLESQLAALDESLRVIR PFRNINYNISSILKFQFIHFRFGRIEKEYYQKFEKYIYENLDTIFIKCDEDDQYVWGLYF VPEHEAQKVKAVYSSMHFERIYMPDTYEGTAKEAFEQLTETRNRAAADLASADQKKADFL LRNQKDIVASRNAIAQLSGNFDVRRLAACTHGDKDVFYILCGWMTEKDAHSFQNDIKDDS RIFCIIDGEEDEHLKPGIHQQPPTKLKNPKLFKPFEMYIKMYGLPAYNEMDPTWFVALTY SFIFGAMFGDVGQGLLLFIGGFLLYKFKHITLAGIISCAGVFSTIFGFMFGSIFGFEDVI PALWLRPMNNMMSVPFIGKLNTVFIVAIGFGMCLILLCMVFNILNAWKARDVEHIWFDTN SVAGLVFYGSAVVSIALILNGKTLPGGIVLFIMFGIPLILIFLKEPLTALIEKKSEVMPK EKGMFVVQGLFEMFEVLLSYFSNTLSFVRIGAFAVSHAAMMEVVLMLAGAESGNLNWIVV VLGNLFVCGMEGLIVGIQVLRLEYYEMFSRFYKGTGRKFEPFRSVK >gi|226332933|gb|ACII01000086.1| GENE 7 9586 - 10017 759 143 aa, chain + ## HITS:1 COG:no KEGG:FMG_1169 NR:ns ## KEGG: FMG_1169 # Name: not_defined # Def: V-type sodium ATP synthase subunit K # Organism: F.magna # Pathway: Oxidative phosphorylation [PATH:fma00190]; Metabolic pathways [PATH:fma01100] # 7 143 4 139 139 76 43.0 3e-13 MTTIAQIILAIALVLSIILPFGYYLIGEKNRGRYKTALGVNAFFFFGTMLIAIAVMFTGA SSVQAAAGSEGTLATGMGYIAAALVTGLSCIGGGIAVASAASAALGAISEDQSILGKSLI FVGLAEGVALYGLIISFMILGQL >gi|226332933|gb|ACII01000086.1| GENE 8 10019 - 10327 259 102 aa, chain + ## HITS:1 COG:MTH956 KEGG:ns NR:ns ## COG: MTH956 COG1436 # Protein_GI_number: 15678974 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit F # Organism: Methanothermobacter thermautotrophicus # 11 102 12 104 106 57 39.0 8e-09 MQMFLISDNIDTYTGMRLAGVEGVVVHTHDELKDALQKAISNKEIGIILLTEKFGREFPE IIDDVKLHHKTPLIIEIPDRHGTGRKPDFITSYVNEAIGLKL >gi|226332933|gb|ACII01000086.1| GENE 9 10490 - 11077 702 195 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3443 NR:ns ## KEGG: Cphy_3443 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 195 1 195 195 110 37.0 3e-23 
MTIDEKLSHFYDITVEEAHEKAAAILDEHKAALEKMTEEHKTLSEENAQTQIKAETASAR REINKALSAEQLTIKRDWTRKQNELKEKIFSEVKGLLESFTKTPEYENYLTAKIKEALDF AGNDEITIYLSPEDRSLAEKLQHTTGTAIQLAKDSFLGGIRATIPQKNILIDHSFAGNFE AAYKEFKFDGGPEHE >gi|226332933|gb|ACII01000086.1| GENE 10 11070 - 12842 1714 590 aa, chain + ## HITS:1 COG:MK1017 KEGG:ns NR:ns ## COG: MK1017 COG1155 # Protein_GI_number: 20094453 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit A # Organism: Methanopyrus kandleri AV19 # 1 588 1 590 592 636 53.0 0 MSNTGIIYGVNGPVIYLKGDSGFKISEMVYVGPEHLVGEIIGLKKGMTTVQVFEETTGLK PGDTVTGTGDAISVLLGPGIIHNIFDGIQRPLEEIAKASGKYISRGVSVDSLDTEKKWNT HIIVKEGDVVGPGSVIAETQETDSILHKSMVPPNLTEATVIHAASDGAYTILEPIVTIQF ADGTTKDLALAQKWPIRIPRPTHKRFPASVPLVTGQRILDTLFPIAKGGTAAVPGGFGTG KTMTQHQIAKWSDADIIIYIGCGERGNEMTQVLEDFSKLIDPKSGNLMMDRTTLIANTSN MPVAAREASIYTGVTLAEYYRDMGYDVAIMADSTSRWAEALRELSGRLEEMPAEEGFPAY LASKLSAFYERAGMMQNLNGTEGSVSIIGAVSPQGGDFSEPVTQNTKRFVRCFWGLDKSL AYARHFPAIHWLTSYSEYLEDLTPWYREHVSPKFVADRNQLMAILNQESSLMEIVKLIGS DVLPDDQKLTLEIARVIRLGFLQQNAFHAEDTCVPMEKQFKMMETILYLYEKCRALINRG MPVSVLKEGNNIFEKIISIKYDVANNQLDKFDQYKQDIDTFYDNIMEKNG >gi|226332933|gb|ACII01000086.1| GENE 11 12905 - 14287 1806 460 aa, chain + ## HITS:1 COG:PH1974 KEGG:ns NR:ns ## COG: PH1974 COG1156 # Protein_GI_number: 14591711 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit B # Organism: Pyrococcus horikoshii # 1 455 4 457 465 558 57.0 1e-159 MAIEYLGLSEINGPLVVLEGVKNASYEEIVEFHMDDGTRKIGRIIEIYEDKAVIQVFEGT DGMSLGNTHTRLTGRPMEIGLSPEILGRTFNGIGQPIDGLGDITPDVKLNINGLPLNPVA REYPRNYINTGISAIDGLTTLIRGQKLPIFSGNGLPHDKLAAQIVQQASLGEDSDEDFAI VFAAMGVKYDVAEFFRRTFEESGAADHVVMFLNLANDPVVERLLTPKIALTAAEYLAFEK GMHILVILTDITSFCEAMREVSSSKGEIPSRKGYPGYLYSELATLYERAGIVKGKPGSVT QIPILTMPNDDITHPIPDLTGYITEGQIVLDRQLHGQAIYPPINVLPSLSRLMKDGIGEG YTRADHQDVANQLFSCYAKVGDARALASVIGEDELSPLDKRYLVFGKAFESEFVGQSETE NRSITETLDKGWELLGLLPKEELDRIDTKILDKYYHETVR >gi|226332933|gb|ACII01000086.1| GENE 12 14306 - 14923 823 
205 aa, chain + ## HITS:1 COG:PH1972 KEGG:ns NR:ns ## COG: PH1972 COG1394 # Protein_GI_number: 14591709 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit D # Organism: Pyrococcus horikoshii # 7 200 9 205 214 110 36.0 2e-24 MDPNTFPTKGNLILAKNSLKLSRQGYELMDKKRNILIREMMELIDQAKDIQTQIDVTFRT AYTALQKANMEIGIAFVQQIACTVPVENSIRIKTRSVMGTEIPLVEYDKTTNTPTYAYYS TKMSLDEAKAAFEKVKELSIRLSMVENAAIRLAANIKKTQKRANALKNITIPKYEALTKD IQNALEEKEREEFTRLKVIKRMKQK >gi|226332933|gb|ACII01000086.1| GENE 13 15612 - 16325 886 237 aa, chain - ## HITS:1 COG:NMA0440 KEGG:ns NR:ns ## COG: NMA0440 COG0791 # Protein_GI_number: 15793445 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Neisseria meningitidis Z2491 # 114 227 141 258 280 90 38.0 3e-18 MNNKIKTILMAAIMSSVFLTQGVCVSAAQFDDGASAFSDGTGSVSALASTLAKQTEEQTQ EFVAKEEEAAQIREERRVEKAEKASEKAEQEKTAALESIQQTVEDTKKKIAEEAEAKRLA KRQEVVNFALQFEGNPYVYGGTSLTNGADCSGFVMSVFANFGYSLPRVAAAQCEASTKKD ISQLEPGDLVFYGSGYIDHVALYIGDGKIIHASNAATGIKISDYDYEKPAAVGTFME >gi|226332933|gb|ACII01000086.1| GENE 14 16601 - 18376 1068 591 aa, chain + ## HITS:1 COG:XF0221 KEGG:ns NR:ns ## COG: XF0221 COG4805 # Protein_GI_number: 15836826 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 134 590 142 601 613 121 27.0 3e-27 MDMRIYKNIPKRILSLCLVFAIGISMGVLSDHFVSENSRFQTFTEKLFRSEVCSNTLTLH YTLAHPEKKGIRKPEATLGTALSDPAKTTSLCQEYEKELKSFAYSRLSEENRLTCDMLLL YFHTRASLGKNSALDEPLGPGLGVQAQLPILLAEYTFRTKEDISDYLKLLSTVRPYFQSI IKLEKQKSQSGLFMSDTTLDRILKQCHSFVANPDSNYMDDIFAQKLKAFSNPAFNSEDQK KLCTYHHKLILTEVIPAYQELADSLESLRGTGKSSRGLAFFEGGREYYLYLLQSQTGTYV PVGQIEKRLSAQLLSDYREISSLLKQNSSLIDRLNQCSGELTLTPTQMLEKLPELMKKDF PELKDATYELRTVHPSMKKFLSPAFYLTPPVDTRTPNVIYINDSGHTSSLELFGTLAHEG FPGHLYQTVSFAENNPSDIRYLVTSSGYVEGWATYVESYGYEYAASLMNDPDSAKNAVRL AWLNRSMNLCIYSLIDIGIHYRGWDAARTAVFLKAFGINNASTAAEIYQYIVETPGNYLK 
YYWGYLNFLDLKTVCQKRLGDDFDLKEFHRRILDIGPVQFPVLEKYMKQWN >gi|226332933|gb|ACII01000086.1| GENE 15 18507 - 21143 2947 878 aa, chain - ## HITS:1 COG:SPy0623 KEGG:ns NR:ns ## COG: SPy0623 COG0474 # Protein_GI_number: 15674699 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pyogenes M1 GAS # 7 860 23 867 893 306 29.0 1e-82 MNEINKEERMQGLSGQQVSESKAKYGTNELAKKETESLWSMFIGAFDDIWIKVLCAALAL KVILAVLGIFVPALSGGNDVVEIISIVIAIALATGFSTLSEYRNTSRSEALQEEYSKTYA KVVRDGKLVNILTSEIVKGDTILIQAGDKVPADGLLFEGKIKVSQAALNGESRDENKAAV TNMDEAESTDYSSHGKVFMGSVVTSGEGYMIATVIGDSSELGKINKALTDTSEEEERKDT SSLKLEGVAAGIGKLGVSAAAIAGILHVVLTLIRADQAITVVSVLLVIAEAVMLMASIVI MAVPEGLPMMNSLVQSMNTESMYKKNILVSHKAAFSDSAYMNVLFSDKTGTITQGNLSLV EFITGNGEIVDSIQSREFIEAITLNNLAKISSEGKAIGSNNMDRALLSYAINHGYNDSDN DPEKVEEISGFDSEKKCATVKLKNGLVYWKGATENIIDKVTHYMLPDGEEREFTKADKDK VEEQMHAQAKRTMKLLSVAKISDGKTVLMAVLCLRDNVRTDAVETVQILNDAGIQVVMVT GDAEETAVAIAKEAGILADEKKDVVLTHEEMEKLSDEELKKVLPNLRVVSRAKPLDKKRL VSLSQQIDNVCGMTGDGVNDAPALKQADIGFAMGDGTAVAQEAGDVVILNNSLTSIKDCV LNSRTMAKSVGKFLIFQLTVNISTLLMNIIAPILGWTEPFSIVQILWINLIMDTLAAMAF GGEPILDRYMKDQPAKRTDNILTPYIKSAIGVSAIFITLGSILILENIGGITSWVIPAGC ADPELYEKTFMFAFFIYAIIFNSLNTRSEKFNLFEHIGENKNFVLVMGAIFVLQTIIIEI GGKVFNTTLLEPKALLVSMVLAVFIIPMDLIRKAIMSR >gi|226332933|gb|ACII01000086.1| GENE 16 21248 - 21856 672 202 aa, chain - ## HITS:1 COG:BS_yceC KEGG:ns NR:ns ## COG: BS_yceC COG2310 # Protein_GI_number: 16077358 # Func_class: T Signal transduction mechanisms # Function: Uncharacterized proteins involved in stress response, homologs of TerZ and putative cAMP-binding protein CABP1 # Organism: Bacillus subtilis # 1 202 1 199 199 239 59.0 3e-63 MSVSLQKGQKVNLSKEHAGLAKVIVGLGWDEAKPSGGGGLFGALFGSSSHQAIDCDASAI MLKNGKFTDKTDLVYFGNLKHKSGTVNHMGDNLTGAGEGDDEQIVIDLSRVPAEYDKIVI VVNIYQAVKRKQHFGMIQNAFIRLVDARNNKEMCKYNLTENYSGMTAMIFGEIYRYNGEW KFNAVGNGTTDPGLGELCQRFV >gi|226332933|gb|ACII01000086.1| GENE 17 21884 - 22465 782 193 aa, chain - ## HITS:1 COG:CAC1412 KEGG:ns NR:ns ## 
COG: CAC1412 COG2310 # Protein_GI_number: 15894691 # Func_class: T Signal transduction mechanisms # Function: Uncharacterized proteins involved in stress response, homologs of TerZ and putative cAMP-binding protein CABP1 # Organism: Clostridium acetobutylicum # 1 191 1 190 191 248 64.0 4e-66 MPVSLKKGQKVSLTKGNPGLKNVVVGIGWDINAFDTGGDFDLDAAAFCLTDSGRVSDSND FVFYGNLVHPSGAVQHMGDNLTGAGDGDDEQIKIDLSKIPANITKIAFTVTIYDAEARRQ NFGQVSNAFVRIFNEVTGEEILRYDLGEDFSIETAVVFGELYKNGDEWKFNAIGSGYQGG LAALCNSYGVDVE >gi|226332933|gb|ACII01000086.1| GENE 18 22563 - 23144 737 193 aa, chain - ## HITS:1 COG:BS_yceE KEGG:ns NR:ns ## COG: BS_yceE COG2310 # Protein_GI_number: 16077360 # Func_class: T Signal transduction mechanisms # Function: Uncharacterized proteins involved in stress response, homologs of TerZ and putative cAMP-binding protein CABP1 # Organism: Bacillus subtilis # 1 192 1 192 192 270 66.0 1e-72 MPINLSKGQKVDLTKKNPGLKKIMVGLGWDVNAFDSGSDFDLDAAAFMLGANGKCPTEKE FIFYGNLKHVSESVIHMGDNLTGEGEGDDEQIMIDLSKVPANIERIAFTVTIYDAEARRQ NFGQVSNSYIRLVDESNEVELIHYDLGEDFSIETAVVVGELYRHNGEWKFNAIGSGFQGG LAALCGHYGIEVA >gi|226332933|gb|ACII01000086.1| GENE 19 23174 - 24328 1456 384 aa, chain - ## HITS:1 COG:CAC1411 KEGG:ns NR:ns ## COG: CAC1411 COG3853 # Protein_GI_number: 15894690 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in tellurite resistance # Organism: Clostridium acetobutylicum # 27 376 12 363 371 286 47.0 5e-77 MGLDFGKSSMGNPAQAGNGLQEKNEIEVVEKYDIVADRQQMNAELVNSKEVDDIVSTIEV YNMDTIVSFGAKAAEEISKASDVVLNSMNMSQLNETSEMLKSLAKIMDQFDINEIKDNPG LFGKLFGNFRKQLDKILAKYHTMGEEVDKIYVQLKGYESEIKQSNKKLDTMFKTNVDYYH ELVKYILAGEQACKEIEAYIAQRQQDMENTGDQSIQFELTSLNQALMMLEQRTQDLRTAE NVAMQSIPMIKTMEFSNYNLVRKINSAFIVTLPVFKQALAQAILLKRQKIQAESIAELDK KTNEMLLKNAQNTVDVSKMTAKMASGSSIQIETLEKTWATITNGIEETRRIQEDARKKRQ EDQVRLEAIKQDFNKQYNVPPVRK >gi|226332933|gb|ACII01000086.1| GENE 20 24335 - 26608 1403 757 aa, chain - ## HITS:1 COG:no KEGG:CA_C1410 NR:ns ## KEGG: CA_C1410 # Name: not_defined # Def: 
hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 125 737 169 809 826 189 27.0 4e-46 MLEHKKIQNLSDYFVELNSRREKGVYFYRINGYSEEVGEFIKKYYDTARRTGVVIEGKIP NPDEGNLAYYNEIMGMDFQMSMDFIHVSLRKWLPRMNEFQRQNVAASIYDSLDSLRKAGK TENMLRNAYIKFMCWLYYKFERIVNQLGENHIPKILYEGQISNYELMLISILSNAGCDVV LLQYAGDQGYLKTDPGSVLSDSLQMEGLQPFPQGYCVKKVRDEIQNELNNERLYGIRPSL TNCTNAWIKGNGLDDIRESILLRGNDSRFFYNCFCRINGAEDKLTYANELFRLQQELRNS KRNTVIVSKEIPRPTPQEISEIKRSNYTSGDQMLLGLACNIQYGANPELQRILHKTFVDV MLAESQKEGENLNRLTNRAVYLLCWMRRYLPKLFINWKSPEIGCFIYLGGCRNENEALFM SFLGRLPLDVLILCPDLNIKCCLEDKLLYEVNYPESLAITEYPEESSQVKIGTAAYHAER ELDTLMYQDSGIYRNQQYVKANVINLQTMYEEIKLLWDQELKYRPNFSTVNGVVNIPVIF SKISGVKDGSVSQYWESIAALITEDTVVVKSAPYIEPTASGPMKAFAVEFYKNGKLLRNK IKNHPRYPYHLLREEMQEFILDKLQLLIERKLIKGIGENGTEYTVIAQILDLPKEILRLI QKFDFTKKNPKLIYINTGETMISLQDSILVTFLNLAGFDILFFVPTGYQSVENYFAEKLM EEHQIGEYKYDMQVPDLNNISTNSTRPSWRRKLFRRG >gi|226332933|gb|ACII01000086.1| GENE 21 26930 - 27892 479 320 aa, chain + ## HITS:1 COG:DR2219 KEGG:ns NR:ns ## COG: DR2219 COG2301 # Protein_GI_number: 15807211 # Func_class: G Carbohydrate transport and metabolism # Function: Citrate lyase beta subunit # Organism: Deinococcus radiodurans # 8 319 12 300 310 122 29.0 8e-28 MKNNVLYYSVGPLLYCPANRISITDSLINERFGNRFSLALCLEDTINDDHVEEAEQILIS SLSQIFIQHEQKPFYLPKIFIRVRNPQQIQRLTKALGQSIKIVTGFIVPKFSPDNAQSYI EQMILVNELVAKKLYMMPIYESPSIIDLRNRIDILYLLKDSLARIEDLILNIRVGGNDLC HMFGFRRHANESIHSIRPVSDIFSDIITVYGMDYVISGPVWEYYAGDSWKEGMIQEIRED RLCGFIGKTVIHPSQIPVVNRVYQISRNDYLDARAILNWNADSASLVAGSKTRERMNEYK THLNWAKKTVYLSEVFGITE >gi|226332933|gb|ACII01000086.1| GENE 22 27905 - 28957 960 350 aa, chain - ## HITS:1 COG:no KEGG:BCB4264_A5099 NR:ns ## KEGG: BCB4264_A5099 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_B4264 # Pathway: not_defined # 3 347 9 365 370 360 50.0 5e-98 MRSSYSEEDVILLLKDITGLVEPQPAKVREKLIQSGKHYSEMLPVEYVPTDQYMQVYHNA LKHYAKPVANAVGMLADKIIENKGKKIVLVSLARAGIPIGILVKRYIKFKYGINVPHYSI SIIRGRGIDDNAMKYLLEKYRPQQILFVDGWIGKGAILNELKKDISAYEGVSADIAVVAD 
PANVTELCGTHEDILIPSSCLNSTVSGLISRTFLRDDIIGRDDFHGAVYYGELTDSDLSY DFIETIEKEFNMENTPDLEKRVESQGIDEVINIGKNFGIDDINLIKPGIGETTRVLLRRI PWKVLIDERYKGNPQLEHIVRLAEEKHTPVEYYPLTHYKCCGIIKKLADA >gi|226332933|gb|ACII01000086.1| GENE 23 28959 - 30065 671 368 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1539 NR:ns ## KEGG: CDR20291_1539 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 7 333 24 374 412 130 32.0 9e-29 MEYTEKDLVKVAKRENNTKRNYLVVDPLQGKHIPVVPSKALDLFAALADTFREKYKDEKL LLVGFAETATAIGAQAAITVGADYIQTTREVIPGVNYLFFSEEHSHATEQKLVKDDIDRA AAETDRIIFIEDEVTTGKTIRNIISILDREYDGKFKYSVASLLNGMSEENLERYKRQGIF LYYLVKTDHSTYGDRAETFKGDGFYYKCPDKATEYTTIYVKNRMDARRLIDSGKYEEACE NLWREIREKTGNMADNISGKRILVIGTEEFMFPALYIGRKMEKEGAEVRCHSTTRSPIAV SLEKEYPLHSRYELKSLYDPDRRTFIYDIGKYDKVLIVTDSPEIKESQETLINAVRMQNK DITVVRWC >gi|226332933|gb|ACII01000086.1| GENE 24 30077 - 30514 500 145 aa, chain - ## HITS:1 COG:CAC3579 KEGG:ns NR:ns ## COG: CAC3579 COG1846 # Protein_GI_number: 15896813 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 7 142 8 143 154 129 54.0 2e-30 MNHYEAINDVLVNLFNEILDLEERALITGEYKNISVNDMHIINAVGIREQKNMSTVAREL NVTVGTLTIAVNNLVKKGYIQRMRSQEDRRVVLISLTEQGKKAYYHHKDFHEKMVLAVLK GLNVEETEALTKALTKLQTFFRSYQ >gi|226332933|gb|ACII01000086.1| GENE 25 30857 - 32260 1391 467 aa, chain + ## HITS:1 COG:SPy0818 KEGG:ns NR:ns ## COG: SPy0818 COG2843 # Protein_GI_number: 15674859 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Streptococcus pyogenes M1 GAS # 23 322 45 333 430 137 32.0 4e-32 MKKNNMPLNLRERINSRISLLCSVIVLVLLVLVMHQMDYTLIRKPQKEAAEAAALKEQQD KIKAETPVISTASVIAVGDNLYHSKLYESGENDSGIWNYDHIYTHVLDQIQAADVAMIDQ ETVFAPSHDAVSTYPSFATPQEVGDAIIKAGFDVVESATNHADDYGYDYLKSTLDFWSTN YPDIPVLGIHATQEDADTVKVKEVNGIKIAFLDYTYGTNNSGAGEGYEYMIDIFDKDKIT TMIQKAKEISDCIIFVAHWGTEDETMPNEYEKQWAAFLMQQGVDVIIGGHPHVLQPYGQL 
SDDQGHNTTIFYSLGNFVSTQQELPELLEGMASFTIQKSTLNGKSTIQILSPEVKPMVMH YNHDNGEYGPYMLDDYTEELASSHSVRNVIGDEFTLDNLKAKFKEIMSMNVKPSTNTNLL NVKFDWEGNMIDKTTGNVVEDTESIHSWEYQADTSDGTEDSSDDSGY >gi|226332933|gb|ACII01000086.1| GENE 26 32361 - 33134 952 257 aa, chain - ## HITS:1 COG:BS_yabD KEGG:ns NR:ns ## COG: BS_yabD COG0084 # Protein_GI_number: 16077107 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Bacillus subtilis # 1 257 1 254 255 227 46.0 1e-59 MIIDTHAHYDDEAFDTDREALLMSMYDGGIEKIVNVCASVGGFQDTVDLMEKYPFIYGAV GIHPDDADKMTQETLDEIRRLSHMDKMVAIGEIGLDYYWHKEEEEHQIQKKMFRAQLDIA REEKLPFMIHSREAAEDTLNIVREYMQGGMYGGIIHCFSYSREIAAEYLKMGLYLGIGGV VTFKNAKKLKETVQYAPLSQLVLETDCPYMSPEPNRGKRNSSLNLPYVAQAIAELKGITA EEVIDVTRQNAEKLLGI >gi|226332933|gb|ACII01000086.1| GENE 27 33284 - 33712 501 142 aa, chain - ## HITS:1 COG:CAC1991 KEGG:ns NR:ns ## COG: CAC1991 COG2258 # Protein_GI_number: 15895261 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 142 2 143 145 126 45.0 1e-29 MGKVIAVCTSERKGIQKTSVPEIKVIEDWGIEGDAHAGKWHRQVSLLSFDKIEDFRARGA EVEDGAFGENLVVQGIDFATLPIGTKFQCNDVVLELTQIGKECHSGCAIFKKMGECIMPK QGVFTKVLHGGVIHPGDELVIL >gi|226332933|gb|ACII01000086.1| GENE 28 33879 - 34367 224 162 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 [Burkholderia pseudomallei 305] # 4 147 2 151 194 90 37 1e-17 MKRVAIVTSSDTGYRGEREDLSGPAIREIVEKNGYEVVSEDVLPDDRKMLSERMAEIADS GTAELILTTGGTGFSPRDITPEATEDIIDRRVPGIPEAMRAYSMTITKRAMLSRSTAGIR KKTLIINLPGSPKAVKESLEYIIDALGHGIEIMTGEAGNCAR >gi|226332933|gb|ACII01000086.1| GENE 29 34493 - 34969 637 158 aa, chain - ## HITS:1 COG:RSc0560 KEGG:ns NR:ns ## COG: RSc0560 COG0315 # Protein_GI_number: 17545279 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Ralstonia solanacearum # 5 156 4 155 158 175 57.0 4e-44 MSQGFNHFDENGNAVMVDVSGKNITYRTAVATGEIHVGEAIMEAVKEGSVKKGDVLGVAR 
VAGIMGVKRTSELIPMCHPLPIQKCSVDYELDETNGIIRAFCTVKTEGKTGVEMEALTGV QVTLLTIYDMCKAIDKHMVMSNIHLVEKTGGKSGDFHF >gi|226332933|gb|ACII01000086.1| GENE 30 35232 - 37190 2230 652 aa, chain - ## HITS:1 COG:CAC2991_1 KEGG:ns NR:ns ## COG: CAC2991_1 COG0143 # Protein_GI_number: 15896243 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 2 529 3 524 536 660 60.0 0 MEKQKYYITTAIAYTSGKPHIGNTYEVVLADAIARYKRQQGYDVFFQTGTDEHGQKIELK AEEAGITPKEFVDNVSGEIKRIWDLMNTSYDKFIRTTDEDHEKQVKKIFKKMYAKGDIYK GHYEGMYCTPCESFFTESQLVDGKCPDCGRPCVPAKEEAYFFKMSKYADKLIDYINTHPD FIQPESRKNEMMNNFLLPGLQDLCVSRTSFKWGIPVDFDPKHVVYVWLDALTNYITGIGY DCDGESTEQFNKLWPADLHLIGKDIIRFHTIYWPIFLMSLDLPLPKQVFGHPWLLQGDGK MSKSKGNVIYADELVDFFGVDAVRYFVLHEMPFENDGVITWELMVERLNSELANTLGNLV NRTISMSNKYFGGVVENKGVTEPVDDDLKNFILSVPAKVNEKMDKLRVADAMTEVFTIFK RCNKYIDETMPWALAKDEEKKDRLATVLYNLVEGICIGAVLLKSFMPETTERILAQLNAQ DRELEDLKTFGLYPSGNKVTEKPEILFARLDLKEVLAKVAELHPPKAEEPAKEEKEDVID IEAKPEITFEDFGKLQFQVGKIIKCEEVKKSKKLLCSQVQIGSQVRQIVSGIKAHYSAEE MVGKRVMVVTNLKPAKLAGVLSEGMILCAEDADGNLSLMVPEKEMPAGAEIC >gi|226332933|gb|ACII01000086.1| GENE 31 37131 - 37292 103 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGFAGCVGDGGGDVVFLFFHGLLSPFLNVDAPENIIVFCAVSGARNSKCLKIS >gi|226332933|gb|ACII01000086.1| GENE 32 38111 - 38293 102 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIIVIEQIITATILNSYDKSFASKQGREEYHGDTEIQPSEGIDQRVSKNQNRSSYCRCGI >gi|226332933|gb|ACII01000086.1| GENE 33 38202 - 38615 374 137 aa, chain + ## HITS:1 COG:aq_1207 KEGG:ns NR:ns ## COG: aq_1207 COG0735 # Protein_GI_number: 15606445 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Aquifex aeolicus # 4 125 20 138 144 85 33.0 3e-17 MATLKYSRQRESIKEFLRTRTDHPTADVVYENMKLIYPNISLGTVYRNLSLLADLGEIKK LSSFAGADHFDGRTERHCHFMCIRCERIIDLESEGIHHIMDLAGENFKGKITDYSARFFG LCEDCLKETEGNSDSSK >gi|226332933|gb|ACII01000086.1| GENE 34 38875 - 39420 798 181 aa, chain + ## HITS:1 COG:CAC3598 
KEGG:ns NR:ns ## COG: CAC3598 COG1592 # Protein_GI_number: 15896832 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 179 1 180 181 215 65.0 5e-56 MTKWVCSVCGYVYEGEKAPEACPVCKAPADKFVKQDGEMSWAAEHVVGVAQGSSEDIMAD LRANFQGECSEVGMYLAMARVAHREGYPEIGLYYEKAAYEEAEHAAKFAELLGEVVTDST KKNLQMRVEAENGATAGKTDLAKRAKAANLDAIHDTVHEMARDEARHGKAFAGLLKRYFG E >gi|226332933|gb|ACII01000086.1| GENE 35 39513 - 39857 283 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579419|ref|ZP_04856689.1| ## NR: gi|253579419|ref|ZP_04856689.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 114 4 117 117 176 99.0 4e-43 MTDRQILEAVLTEVRGFGTRMDKFESRMDGFESRMDRMEERQKRFEEETRNNFADIKLHL ENITDRNISILAENHLNLINKMDVSATWMNRIMINEVKMNALTDQVARLQAKSS >gi|226332933|gb|ACII01000086.1| GENE 36 40161 - 41156 910 331 aa, chain - ## HITS:1 COG:CT551 KEGG:ns NR:ns ## COG: CT551 COG1686 # Protein_GI_number: 15605280 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Chlamydia trachomatis # 71 314 37 293 343 130 32.0 3e-30 MAGRRRKKSGGRSYAWRYRGRIVACVFVEVIIICALVIMIGWNKGVKEWFEQFEQPVLKE VDISGINSPNAILMQARGGKILGEINGEAQIYPASMTKIMTVILGIENFDDLDEKITLTN EMFSGLYEQDATQAGFQPGEEVRVIDLLYGAMLPSGAECCIALADTISGSEADFAELMNK KARKLGMENTHFCDSTGLHNPDHYSTVKDIAVLMKYCIKNDTFREIVETSRHSTGVTNIH PDGITYYSTMFKNLSDPTVTGGKILGGKTGYTSEAGHCLVSFAEIEGREYIFVSAGASGA DGNTIPHIQDAVTVYNRVGAAVEAKMNDNNK >gi|226332933|gb|ACII01000086.1| GENE 37 41627 - 42319 993 230 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916246|ref|YP_002929763.1| large subunit ribosomal protein L1 [Eubacterium eligens ATCC 27750] # 1 228 1 228 231 387 83 1e-107 MKRGKKYVEAAKAVDRATLYDTAEAISLVKKAAVAKFDETIEVHIRTGCDGRHADQQIRG AVVLPHGTGKKVRVLVFAKDAKAEEAKAAGAEFVGAEDLIPKIQNEGWLDFDVVVATPDM MGVVGRLGRVLGPKGLMPNPKAGTVTMDVTKAVHDIKAGKIEYRLDKTNIIHVPVGKASF TEEQLSDNFQTLIDAIVKARPSTLKGQYLKSVVIAPTMGPGVKINPMKLA >gi|226332933|gb|ACII01000086.1| GENE 38 42381 - 42806 631 141 aa, chain - ## PROTEIN 
SUPPORTED ## NR: gi|238922786|ref|YP_002936299.1| ribosomal protein L11 [Eubacterium rectale ATCC 33656] # 1 141 1 141 141 247 88 9e-65 MAKKVTGYIKLQIPAGKATPAPPVGPALGQHGVNIVQFTKEFNARTADQGDLIIPVVITV YADRSFSFITKTPPAAVLLKKAAKIKSGSGVPNKTKVAKVTKAQVQEIAELKMKDLNAAS LEAAMSMIAGTARSMGIEVVD >gi|226332933|gb|ACII01000086.1| GENE 39 42877 - 43392 623 171 aa, chain - ## HITS:1 COG:CAC3149 KEGG:ns NR:ns ## COG: CAC3149 COG0250 # Protein_GI_number: 15896397 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Clostridium acetobutylicum # 1 171 1 172 173 184 56.0 6e-47 MSEAKWYVVHTYSGYENKVKVDIEKTIENRHLEDQILEVRVPLQEVAELKNGALKQVQKK MFPGYVLLNMVMNDDTWYVVRNTRGVTGFVGPGSKPVPLTEEEMLPLGIKKEEIQVDFAE GDTVVVTGGAWKDTVGVITALNVQKQTATINVELFGRETPVEISFAEIKQM >gi|226332933|gb|ACII01000086.1| GENE 40 43408 - 43617 151 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579424|ref|ZP_04856694.1| ## NR: gi|253579424|ref|ZP_04856694.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 69 1 69 69 117 100.0 2e-25 MQGNEKSAGKSQKKSWFQGLQSEFRKIVWTDRNELIKQTIVVVCVSIVLCVLISVMDSFI LEGINLLMK >gi|226332933|gb|ACII01000086.1| GENE 41 43670 - 43897 246 75 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881814|ref|YP_001560782.1| ribosomal protein L33 [Clostridium phytofermentans ISDg] # 27 75 1 49 49 99 85 4e-20 MPEKMTDTSSGMRKMTMQMKTLEVEVVRVKITLACTECKQRNYNMTKEKKNHPERMETQK YCKFCRKHTLHKETK >gi|226332933|gb|ACII01000086.1| GENE 42 44082 - 44819 205 245 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 3 240 4 238 242 83 26 2e-15 MSKVTVITGGTSGIGRGIVEKILANSAEDDLIFATYAHNAYKANEFWDSLKPEDQEKLII LKADMSSYDDMMNFVGEVKEKAGHVDWLISNAGISTYDKFQDYTFEEWNKIVNTNLSVPV FMVKEFMPVMTEGGRVLFMGSYAGQQAYSSSLVYGVTKAAIHFLTKSLVKEFEPKGITVN AIAPGFIQTPWHENRTPESYERINRKIALHRFGEIKDVADMAYSILTNNYMNGSIVDIHG GYDYF >gi|226332933|gb|ACII01000086.1| GENE 43 44969 - 46681 1697 570 aa, chain - ## HITS:1 COG:BS_ilvB KEGG:ns NR:ns ## COG: BS_ilvB 
COG0028 # Protein_GI_number: 16079883 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Bacillus subtilis # 1 511 28 524 574 285 35.0 3e-76 MQKKGIRHFFGYQGTMIAHLVDSIERNPETENHSGYNEQGAAFAACGYAQAKEECACAYA TSGPGAINLLSGVADAYYDSLPVIFLTGQLNTYEYSGIKGLRQQGFQETDIVAMAKPITK YAVQIRNPEDIVEELNKAYHIATTGRRGPVLIDLPMNIQRSEVENPVYDMTFEDKHTDAA AAQQAADTILEALEQAKRPVIMLGHGVDSSFSQQKLIRFAQKRQIPIITSVLAKSVLGYD HPLNFGCIGGAYGHRYANMIANAKSDLLLCFGISLCTRQIGTKVHEFARGAKIIRIDIDP YNLQRDIHENGINEVKLQAETGAVIDCLAKAAAPDTEGISDWLNVCTEIKENLQAVDDAT PERYPNRMIADLSDMLEDTSAIAVDVGQHMVWSYQSFKNHEGQKLLFSGGHGAMGYGLPA AIGAYYATGKPTACICGDGALQMNIQELEWVKRENLPVKIMVMNNEALGMIRHLQRDYFD CLFAGTTDGCGFASCNFAEVAKAYGIPALRMLCDAVREEAPEFLEGEGPKLLEVMLEHGT FAYPKTCLGEPIHNQQPYAPKEVYDRLMEL >gi|226332933|gb|ACII01000086.1| GENE 44 46723 - 47883 837 386 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579428|ref|ZP_04856698.1| ## NR: gi|253579428|ref|ZP_04856698.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 386 1 386 386 763 100.0 0 MNKKQKVILSLLQEIDEICRRNKIEYYLSPRLTLCAVEGHPFPQNPMFGVVLMKTADMER FRLAVDEDPREKRALESMKSHKWFSGFYLRYTNTDTLCLNLDNTRDYAFPGIGVNIFPLR TPAASVKAERRLSRDENAWTELCHINYADRNFRSRVNRTIMRLQCMITGRQGQAAHLYDR LVRSCQQPGANKYILKRRKQTTVFPAEIFAQSKRVTLEGAELQVPAKTAEYLTISYGKNY KDAKEPRYVTPIALVVSARVSYTQFWKESGNFEKYCKERMKNARKLARSRRHKDYFNECW DYVEFCGERMNLSVSYEKQKDYIKNLYKNEDYMTLERVFRPYFKMMQKSLQKNELFAEDE EIFDIYVDVLEKTGKTVQRSKIGTLI >gi|226332933|gb|ACII01000086.1| GENE 45 47908 - 50334 1761 808 aa, chain - ## HITS:1 COG:L15884 KEGG:ns NR:ns ## COG: L15884 COG3475 # Protein_GI_number: 15672196 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Lactococcus lactis # 9 253 14 265 278 96 27.0 2e-19 MTEKQKYLLKLFREVDEICREHNLRYVLAGGSLIGALRHEGFVPWDDDVDLYMPRPDWEK FIEICKTELPPDREIQCSEVDRNYTNSFPRYASTNTCAIHKSQIIGKDCGGEIIDILTLD PVPADDKEYEKYRTHMMIYSDLINISVGYSDRWEIPASMYLKYLLSYIFLGKKRTLAKLE KIMFSYKEEECDRYAMRWGGCPFLFDKDMMFPVKDGIFEGKKAMIPNKCSDYLIWHYGDE WSYMPPHDKREGHVAVCVDDLPYQELREEYMPKINKERLRWDSVFRKFYNMCIAKKSHKV RQDGLAMKARAVALDLQRAIDESGLKISELVESRSFRKLSALFGSYYKNQLSADFIGRED YTNIYAFYHPTLVEIPDDVFYAAMLTLFYTERVSKAYRMMQVRQQLDHLSPEMEGLKEDI EFFRKAADHYEFHRIKEAEQIVNELLKKYPGHPGFMKFKCRFLMEDAGENRIEAERFLDK ALKMFPEDGYFLKYKADILWMDGEMQKAAELYLQVKNKTTNGIVWMEMDRFFRGYKSEIL KSCEELIANHNKKEALALMELWSRLIPEDDDIQGALYLAKTVCARTQSEIEKEIGEIRAV IGTQMITPVSVEKNPGKSRKQIKSDRTSDETSADDVNKKVSDPSAETEEVKASADIQVKV SEEHKMYRKALTRAWKRLGYSDELAELRTQIICTGEESELEWLAEQVRNRQFRREEKACA YKLVGDVRMKQGQTREAFANYRKALEYEMPSYVRTELYRIFINDLNDGSRQAKSFGKKTD ITVVLDKWLDKYESIEDIKKIVQTVTVK >gi|226332933|gb|ACII01000086.1| GENE 46 50331 - 51638 1148 435 aa, chain - ## HITS:1 COG:aq_1368 KEGG:ns NR:ns ## COG: aq_1368 COG0615 # Protein_GI_number: 15606564 # Func_class: M Cell wall/membrane/envelope biogenesis; I Lipid transport and metabolism # Function: Cytidylyltransferase # Organism: Aquifex aeolicus # 2 139 6 141 168 134 53.0 3e-31 MKRVITYGTFDLFHKGHYNIIKRAKALGDYLIVGVTSESFDIERGKLNVRDSLIKRIENV 
RRTGLADEIIIEEYQGQKVNDIIKYDIDVLVVGSDWRGKFDYLKNYCDVVYLERTKNISS TKLRSEGVIFNMGIVTDDIRDNDFVEESKYVSGVHVERVFSEDHETAQRFCDKYELGSCW SSYDEFLADVDIVYIKTSLNRRAEYIERALKKGKYVISDSPMTLSSEKLRYLFQVARENN VVLIERTTLVYLRAFNQLVWLVHSNLVGDLVSVKCAISQDDFEGGRTFNETVCTAICAVL KLLGKKCQDINTNAVRNQEGRFVYDMISMKYAGALATIEIGTTVDIENELVIIGSNGRVT IPNDWWNTGYFEANISGKEFLKRYSFNFEGNGFRYLLQELMIMIRDKRTECTRLFYDESV RIIEILETINRKDNS Prediction of potential genes in microbial genomes Time: Sat May 28 20:03:58 2011 Seq name: gi|226332932|gb|ACII01000087.1| Ruminococcus sp. 5_1_39B_FAA cont1.87, whole genome shotgun sequence Length of sequence - 33542 bp Number of predicted genes - 29, with homology - 29 Number of transcription units - 16, operones - 8 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 742 588 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 2 1 Op 2 . - CDS 745 - 2145 1218 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis - Prom 2211 - 2270 6.1 + Prom 2501 - 2560 8.8 3 2 Op 1 7/0.000 + CDS 2588 - 3727 1247 ## COG0448 ADP-glucose pyrophosphorylase 4 2 Op 2 . + CDS 3729 - 4844 990 ## COG0448 ADP-glucose pyrophosphorylase + Term 4868 - 4916 12.8 + Prom 4922 - 4981 8.2 5 3 Tu 1 . + CDS 5111 - 5707 252 ## PROTEIN SUPPORTED gi|226309687|ref|YP_002769581.1| probable 50S ribosomal protein L25 + Term 5731 - 5768 8.0 6 4 Tu 1 . - CDS 5828 - 7378 1180 ## COG2508 Regulator of polyketide synthase expression - Prom 7407 - 7466 6.0 7 5 Tu 1 . - CDS 7486 - 8121 565 ## COG5012 Predicted cobalamin binding protein - Prom 8214 - 8273 5.9 - Term 8444 - 8482 -1.0 8 6 Tu 1 . - CDS 8533 - 9069 374 ## Dhaf_4609 hypothetical protein - Prom 9089 - 9148 3.8 9 7 Tu 1 . - CDS 9222 - 10955 1456 ## COG3894 Uncharacterized metal-binding protein - Term 11253 - 11295 7.2 10 8 Op 1 . - CDS 11340 - 11975 948 ## COG5012 Predicted cobalamin binding protein 11 8 Op 2 . - CDS 12052 - 12870 1007 ## COG1410 Methionine synthase I, cobalamin-binding domain 12 8 Op 3 . 
- CDS 12913 - 13947 1270 ## DSY4700 hypothetical protein - Prom 14159 - 14218 7.0 + Prom 14167 - 14226 11.9 13 9 Op 1 . + CDS 14375 - 15301 782 ## COG2367 Beta-lactamase class A 14 9 Op 2 . + CDS 15348 - 16316 835 ## gi|253579445|ref|ZP_04856715.1| conserved hypothetical protein 15 9 Op 3 . + CDS 16393 - 16839 594 ## EUBREC_1210 hypothetical protein + Term 16853 - 16916 11.6 16 10 Tu 1 . - CDS 17005 - 18135 1273 ## COG2055 Malate/L-lactate dehydrogenases - Prom 18196 - 18255 8.2 17 11 Tu 1 . + CDS 18530 - 19378 631 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding + Term 19428 - 19488 11.2 - Term 19421 - 19469 9.1 18 12 Tu 1 . - CDS 19528 - 20676 1160 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain - Prom 20696 - 20755 3.6 - Term 20743 - 20787 12.3 19 13 Op 1 4/0.000 - CDS 20805 - 21272 595 ## COG0071 Molecular chaperone (small heat shock protein) - Prom 21398 - 21457 5.4 20 13 Op 2 1/0.000 - CDS 21460 - 21927 657 ## COG0071 Molecular chaperone (small heat shock protein) - Prom 22074 - 22133 8.1 - Term 22142 - 22191 5.9 21 13 Op 3 . - CDS 22271 - 25417 2627 ## COG0642 Signal transduction histidine kinase - Prom 25452 - 25511 6.2 - Term 25615 - 25668 12.1 22 14 Op 1 38/0.000 - CDS 25744 - 26571 788 ## COG0395 ABC-type sugar transport system, permease component 23 14 Op 2 35/0.000 - CDS 26564 - 27463 845 ## COG1175 ABC-type sugar transport systems, permease components - Prom 27496 - 27555 2.4 - Term 27528 - 27561 3.7 24 14 Op 3 . - CDS 27661 - 28950 1657 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 29143 - 29202 7.7 - Term 29152 - 29212 7.2 25 15 Op 1 . - CDS 29278 - 29787 583 ## Cphy_3288 hypothetical protein 26 15 Op 2 . - CDS 29795 - 30733 670 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 30767 - 30826 8.3 - Term 30823 - 30875 13.3 27 16 Op 1 . - CDS 30976 - 31887 681 ## COG0348 Polyferredoxin 28 16 Op 2 . 
- CDS 31888 - 32052 214 ## gi|253579459|ref|ZP_04856729.1| predicted protein 29 16 Op 3 . - CDS 32053 - 33255 1457 ## FMG_0299 hypothetical protein - Prom 33445 - 33504 8.6 Predicted protein(s) >gi|226332932|gb|ACII01000087.1| GENE 1 1 - 742 588 247 aa, chain - ## HITS:1 COG:BH3661 KEGG:ns NR:ns ## COG: BH3661 COG0463 # Protein_GI_number: 15616223 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus halodurans # 1 246 6 254 257 155 35.0 7e-38 MNKPLVSVIMPVYNGEKYIRKAVESVYEQGVSLELLVIDDGSTDHTEEVLSAYEGREDFR YIKNEQNMGAAGSRNRGVGLAQGTYIAFLDADDWWESGKLKEQLKRLEETGYVLCSTGRE LMKADGSSTGRTIPVKEKITYRELLKHNSINCSSVILRRDVAREFPMEHDDSHEDYITWL KILRKYGCAAGINKPYLKYRLSEGGKSRNKLKSAAMTYNVYRYAGYGRIRSCIFFCSYAV HGIWKYC >gi|226332932|gb|ACII01000087.1| GENE 2 745 - 2145 1218 466 aa, chain - ## HITS:1 COG:CAC2330 KEGG:ns NR:ns ## COG: CAC2330 COG2148 # Protein_GI_number: 15895597 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Clostridium acetobutylicum # 164 444 163 443 461 226 45.0 5e-59 MSKREDYKRFIVFCLASLVVLAQAAVFAWVWYSVYRGQIDEPFWRKGNWVLIAIYGLMFA LFAKLYGGLKVGYLKRIDVFYSLTLALLCTNVVEYLEITLINRWFLSVWPMIEMTGIQLV LIIIWIFGSRYIYSGLYRARRLLVIYGDRDPGDDLIHKMNSRKDKYDISGKVHISVGEEK IHQMMQDYDGVIIWDLHSTERNRYLKYCFAHSVRCYVSPKISDIILMGSERIHLFDTPLL VSRNMGLAVDQRAAKRVMDILISGIGIIITSPIMLIIAIAVKAYDRGPVFYFQDRLTLGG KPFKICKFRSMCVDSEKNGARLASKHDSRITPVGHVLRNLHLDELPQLFNVFKGDMSLVG PRPERESIMLEYEKELPEFYYRLKVKAGLTGYAQVYGKYNTTPYDKLKLDLFYIENYSFL LDIKLIFMTVKIFFQKEVSEGVDDRQVNALKYSGKNAGVSENKTEE >gi|226332932|gb|ACII01000087.1| GENE 3 2588 - 3727 1247 379 aa, chain + ## HITS:1 COG:CAC2237 KEGG:ns NR:ns ## COG: CAC2237 COG0448 # Protein_GI_number: 15895505 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Clostridium acetobutylicum # 1 376 1 375 380 438 56.0 1e-122 
MKQNSMLAMILAGGRGSRLHDLTNKVAKPAVSYGGKYRIVDFPLSNCANSGVDVVGVLTQ YESIQLNSYVAAGGRWGLDAKNSGVYVLPPREKADENLNVYRGTADAISQNIDFIDKFDP EYVLVLSGDHIYKMNYDKMLAAHKEAKADATIAVIGVPMKEASRFGIMNTDESGRIVEFE EKPEHPKSNLASMGIYIFTWKLLRKMLMADIKNPDSSHDFGKDIIPTMLNDNRTLYAYKF EGYWKDVGTIDSLWEANMDLLSSKNELDLGDPSWKIYTEDVTALPQYISAEADVKDAYIT QGCVVQGEVKHSVLFTGVKIGAGARIIDSVLMPGVVVEEGAVVQRALVADGVRIGKGAVV GTADSEHIELVAKRVKGAE >gi|226332932|gb|ACII01000087.1| GENE 4 3729 - 4844 990 371 aa, chain + ## HITS:1 COG:BH1086 KEGG:ns NR:ns ## COG: BH1086 COG0448 # Protein_GI_number: 15613649 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Bacillus halodurans # 23 371 23 368 368 195 31.0 1e-49 MYKAFGIVSSSGRNIYVDGMQDYRPIGAFSFLGRYRVIDFPISNMTNSDIDRIQVYINNK PRSVVEHLGTGRHYNINSKSGKLQLLFSEHNNDNDIYNTDISCYLDNMESLSRMNHPYVV IAPSYMVYSIDFDQFLHTHVESGADVTLLYHAVDNAKEAYLTCNVLGLNRQKGVESIEPN HGNKKNQYVYMDTCVMKTELFISLIKEAKNLSSMYTLADILNDKCETLDIRGVAHKGFFA SITDFPSYYAANMALLNYKSAQELFHDNWPIYTRTNDSCPTQYFNTADVKNSVISNGCQI EGTVENSVIGRGCVIKKGAIVRNSVVLAEATVGEGVHVENEVVDKWAKLVHKKEIVSPTE QPGYIRRNDTL >gi|226332932|gb|ACII01000087.1| GENE 5 5111 - 5707 252 198 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|226309687|ref|YP_002769581.1| probable 50S ribosomal protein L25 [Brevibacillus brevis NBRC 100599] # 1 196 1 195 202 101 30 5e-21 MNTLKAEKRTMDTKAKRLRREGYVTGNVFGREMEGSLPVKIEKSAVEKLLKTNNKGSQIM LDVDGQSYDVLIKEIQFNPLKGQVDEIDFQALVSNEKVHSVAEIVLVNHDKLEAGVLQQH LEEISFRALPSALVDKIMVDVGDMKVGDVIKVKDLAIAQDKDVDLKTDLEAAVVSVAAVH VDPALEAEEEAAAEEEAK >gi|226332932|gb|ACII01000087.1| GENE 6 5828 - 7378 1180 516 aa, chain - ## HITS:1 COG:CAP0121 KEGG:ns NR:ns ## COG: CAP0121 COG2508 # Protein_GI_number: 15004824 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 324 510 350 536 543 78 24.0 4e-14 MQLSSVYLYEKIKEKYEITERGTLSGSDGYLRPFLCYEKKETFRHGHVYVVQRYDKEWES 
AVLTAENILWVFCGREKIDAASEEFQQIPYIHIALDSLKEIAEFMNDVQEIFDAADEWER KIHDLMLEHAGMDRLLQVTSEFLQNPMTVTGLDFTFVAEAGSEYLPPRARLYTDDGLNME YVNALLQNETYRDMADTHEYVMFPAYISGCRSMNRNLFVDGKATHRLVLTECRSEITLRV ICVLDILVEKLEYLLAHEAEEEDPDRDMEQIFVRILSDRTADYMQVSRELSELGWSGNHE YMCLILQITYLNQQNLSTKAICRYIKKKFEDSVSFLYQDEIVAFFDLTRLGKSQEEVAGK LVYFIRDTYLKAGYSRVMTGHMNLRRQYVQAKTALDVGSRKKPYLWIHYFGQVALTYILE QATRRLPGTMICHEGLLELKKHDEKNQTQYMETLRVYLEQHLSATQAARELFIHRSTFLY RLDRIKEILQSELDDPEEIFYLELSFRLLEQEQEKE >gi|226332932|gb|ACII01000087.1| GENE 7 7486 - 8121 565 211 aa, chain - ## HITS:1 COG:mlr1231 KEGG:ns NR:ns ## COG: mlr1231 COG5012 # Protein_GI_number: 13471298 # Func_class: R General function prediction only # Function: Predicted cobalamin binding protein # Organism: Mesorhizobium loti # 3 210 22 228 238 130 33.0 1e-30 MTVLEQLRKTIEDGHPGETEKLVREALKQHIPAGRIVEEAMTPAMRTVGENYKSNGADII KILAAARSVRKGFELLEEQDSQFGRRNIGTVILGTVEGDLHDVGKNLVAIMFRSAGFKVI DLGVDISEKQFLRAVKQNPDVSIVCISSLLSTSIPEMEQVVKSLKRSSNKHKFKIMVGGG AVTEQLAKNMGADAYTENCIEAVEVAKTFIV >gi|226332932|gb|ACII01000087.1| GENE 8 8533 - 9069 374 178 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_4609 NR:ns ## KEGG: Dhaf_4609 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 14 177 16 182 182 114 38.0 2e-24 MDIKKIEQMGMEQGFSHVVLLDCDTIELKPEVRQMCAADTCHKYDKCWSCPPGCGSLEEC EAKVRQYKYGIIVQTVGELEDVFDGEGMMETEARHKEYFVEFEKKLREIYPDMLAIGAGC CTKCKVCTYPDAPCRFPKQAFSSMEAYGMLVTQVCQANDLQYYYGDCTIAYTSCYLLE >gi|226332932|gb|ACII01000087.1| GENE 9 9222 - 10955 1456 577 aa, chain - ## HITS:1 COG:AF0010 KEGG:ns NR:ns ## COG: AF0010 COG3894 # Protein_GI_number: 11497631 # Func_class: R General function prediction only # Function: Uncharacterized metal-binding protein # Organism: Archaeoglobus fulgidus # 56 568 78 592 597 298 34.0 3e-80 MAWVKFMREGIEIEVNAGMSVLEAEIRAGLRPDAPCGGLGKCGKCLVKINGEVVKACQIR IGEGETCVVETLDRAGNEKILTDGFNREVAFEPGLRMAQVELEKAKTGEKRSDWKRLLDT LAETDGEVEPGQMEVDLKLAGELYGMRRDSEEWYVIYSRRRILEMRKEAGRRCLAAFDIG 
TTTIAGYLLDGADGRTLAVESRMNPQAQYGADVIMRANYALEHGTEALSMCVRKAVNEML GSLAEDAGIRREDVFQVCVVGNTCMHHLFLGISPASLVHAPYTPAVSERLVLNAGDYGLA VQERAELIMLSDIAGYVGADTCGCLLAIRQDQQEEISLMIDIGTNGEMVLGNRERMVTCS TAAGPAFEGAKIECGMRGAAGAVDHVKYEDGKWNYTTVGNKPAVGLCGSGLIDLVAGLLD AGMLDENGVLRSGQEKQGIFILVPPERGGNERGVYLTQKDIGEVQLAKAAIAAGIQMLME RLGITEDDICSVYIAGAFGNYMDPVSAGKIGLLPATLVKKVKPVGNAAGEGAKIALVNEK EMLEMDELVRKIEFVELAASADFQDHFIDELGFETGE >gi|226332932|gb|ACII01000087.1| GENE 10 11340 - 11975 948 211 aa, chain - ## HITS:1 COG:mlr1231 KEGG:ns NR:ns ## COG: mlr1231 COG5012 # Protein_GI_number: 13471298 # Func_class: R General function prediction only # Function: Predicted cobalamin binding protein # Organism: Mesorhizobium loti # 27 207 46 225 238 154 45.0 1e-37 MSKINEVAELVEKGKAKLVGPAVQEAIDEGDDPVAILNDGMISAMSVVGEKFKNGEIFVP EMLVAARAMKKGVEVLKPHLASGSTGALGKVIIATVSGDLHDIGKNLVAMMIESAGFEVI DLGVDVPAAKIIECYKENPDVKIIALSALLTTTMPALKETVEALNAADFRGNIKVMVGGA PITEAFAEEIGADGYSDDAASAASLAKKLAA >gi|226332932|gb|ACII01000087.1| GENE 11 12052 - 12870 1007 272 aa, chain - ## HITS:1 COG:mlr1243 KEGG:ns NR:ns ## COG: mlr1243 COG1410 # Protein_GI_number: 13471306 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Mesorhizobium loti # 3 263 23 284 324 119 32.0 9e-27 MIIIGEKINGSIPIVAEAIAKRDSEFIKERAKLQAESGATYIDCCASVPEAEEVETLKWM IDCIQEVTDLPISIDSPSPDVLAQAYKFCKKPGIFNSVSGEGHKIDTIFPIMAEPGNEGW QVIALLSDNTGIPKCAADRLNVFDKIMAKAKEYGIAPDRIHIDPLVEMLCTSEDGIATNI ETITSVREQYPTIHITAAISNISFNLPVRKLINYGFLVLAMNAGLDSGIMDPTNKDMLGL VYATEALLGLDDYCMEYIGAYREGLIGTTKKK >gi|226332932|gb|ACII01000087.1| GENE 12 12913 - 13947 1270 344 aa, chain - ## HITS:1 COG:no KEGG:DSY4700 NR:ns ## KEGG: DSY4700 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 328 1 325 325 356 51.0 7e-97 MLTPKQNMLEVIKGGNPDRFVNQYEAVQLLFHPFMFANPLLQPGQENVVNAWGVTNTFPK GVPGSFPVHTPDKIVVKDIEDWKDYVHAPSLKFTQDQWDMVKAQYDAVDGEQAFKAAFVA 
PGLFEQTHHLCEISNALVYYITNPDEMHDLIKYLTEWELELAEGICSNLHPDALFHHDDW GGLDSTFMSPAMFDEFLLEPYKEIYGYYHSHGVELVIHHSDSYAATLVPSMIEMGIDVWQ GCMETNNLPELIRKYGGKISFMGGIENRAVDFEGWTDENCDAVVRRVCEECGNKYFIPCI AQGGPGSVYPGVYKSLCDSIDKLNEEKFGIKASESTRLPWQIMF >gi|226332932|gb|ACII01000087.1| GENE 13 14375 - 15301 782 308 aa, chain + ## HITS:1 COG:FN1584 KEGG:ns NR:ns ## COG: FN1584 COG2367 # Protein_GI_number: 19704905 # Func_class: V Defense mechanisms # Function: Beta-lactamase class A # Organism: Fusobacterium nucleatum # 70 305 2 257 264 87 30.0 4e-17 MFSKKLLIPLLSLSLLVTPAYSSHAAENAEAAETSVEGQGMLSPSSSSESSESVFNGSGA GTSSEPVTDEKLETILKQVQSQLPAENGTWAVFISDLVNGTEGSLNDQKMQAASLIKLYI MGAVYENYDQITGQYGRDSVDSNLYSMITVSDNDAANTLTTYLGGGDSAAGMQAVNSFCQ AHGYDQTHMGRMLLASNENDDNYTSVGDCGHLLQEIYKQDTSGYTHATDMFNLLKAQTRC NKIPAQLPEGIKTANKTGELDNVENDAGIIYDSKNDVVIVFMSQNLSSAGSAQNTIATLS RTIYDYYN >gi|226332932|gb|ACII01000087.1| GENE 14 15348 - 16316 835 322 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579445|ref|ZP_04856715.1| ## NR: gi|253579445|ref|ZP_04856715.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 50 322 1 273 273 395 100.0 1e-108 MKKRKNIAIPMILAATIAAVSVPCSSQIVMASDTYQVGVSKGYLALRSAMAYDSSNEIGE LYSGDTVEVTEYTTSDYWYVYSPKLNRSGYVNNDYLYFLSSQPTSSSGSYTVSVAKGYLA LRSAKAYDSSNEIGQLYSGDTVTVSDSSDPQYWYVYSPKLNLSGYVNKDYLYYSGDTAAS AQSSSGDSRTVSIAKGYLALRSAKAYDSSNEIGQLYSGDTVQLIDTSDSQYWYVYSQKLG KNGYVNKDYLIGGTTTYATRTVSVATGYLALRSAKAYDSSNEIGQLYSGDTVQLVDTTDA QYWYVYSQKLCKYGYVNKDYLY >gi|226332932|gb|ACII01000087.1| GENE 15 16393 - 16839 594 148 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1210 NR:ns ## KEGG: EUBREC_1210 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 11 141 2 130 202 98 41.0 8e-20 MSPKRQVNHNIRKHFNTIIYNISRYAEIALSMVILLVIALAGFRLIMEVADTSVMSMDTE FFSTFLSQALSLVVGVEFVKMLCQHSAQTVVEVLMFATARQMVVEHLGPAETLLGVLSIA VLFAIRKYLMTDNDDMNAHNCPKNTEEN >gi|226332932|gb|ACII01000087.1| GENE 16 17005 - 18135 1273 376 aa, chain - ## HITS:1 COG:CAC0566 KEGG:ns NR:ns ## COG: CAC0566 COG2055 # Protein_GI_number: 15893856 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Clostridium acetobutylicum # 1 363 8 369 369 386 50.0 1e-107 MGYVKWSYDTLNTFCHDVFRKFGFNEEETNIIKDVLLTADLYGIESHGMQRMVRYDKGIE KGTIHPDAKPEVVFETPVSAVIDGHDGMGQLISHFAMEKAIEKAKKTGVGFVSVRNSNHF GIAGYYAEMASKQGLLGMACTNSEAIMVPTFGRKAMLGSNPIAVAMPAEPYPFLFDCSTT VVTRGKLEMYNKMEKPLPQGWALGANGQESTDAPDVLANIVAKKGGGIMPLGGNKEVNGS HKGYGYGMLCEIFSSIFSMGVTSDKCCTFKDKTGICHGFAAIDPAIFGNADDIRAHFSEY LETLRQSPKAEGAEQIYTHGEKEVFAEKERRENGIPVNDNTMVELANLCNYLKVDFGSYF KGYELPKDSKFFSGNY >gi|226332932|gb|ACII01000087.1| GENE 17 18530 - 19378 631 282 aa, chain + ## HITS:1 COG:CAC3396 KEGG:ns NR:ns ## COG: CAC3396 COG2816 # Protein_GI_number: 15896637 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Clostridium acetobutylicum # 47 277 39 268 271 154 34.0 2e-37 MIQDIAPHTYHNEYKPSAPDKNSFILAYEKGSILLPHQEREADIYFPRFQDLEEKVSDLY 
SKYIYLFSIDDQRFYLIPELDTTLLPDYEFQDIRNLRTARPQHLAFAGVTGHQLFQWYSK RRFCGCCGKPMLHSQKERMMECPSCGNQEYPVLCPAVIVGITNGDKIILSKYEGRRFKRY ALIAGFAEIGETIEETVHREVMEEVGLKVKNLRYYKSQPWSFSGTLLFGFFCDVDGDDTL TIDHEELSMAQWVERDKIPDQGNNISLTKEMMMLFRDGKEPR >gi|226332932|gb|ACII01000087.1| GENE 18 19528 - 20676 1160 382 aa, chain - ## HITS:1 COG:VC0856 KEGG:ns NR:ns ## COG: VC0856 COG0484 # Protein_GI_number: 15640872 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Vibrio cholerae # 2 374 3 360 381 167 35.0 3e-41 MKRDYYEVLGVNKNADAATIKKAYRKLAKKYHPDSNEGNASAAEHFKEVNEAYDVLSDEK KRKLYDQFGHAAFEEGAGNYGNAQGSPFGSGFGGAQGNPFGGGFHGSYSDGNGYHEYHFE NGEDMDDILKNIFGGGFKKSKSSSGFGSSGFGGSGFHGRGVGGFGSNGTDGFGGGFGTGG SDFHSQGFGGSYSSKGEDLHAEVTVSFDEAAFGGKKVIRLQSSNGGVQNYEVNIPAGIES GKSIRLKGKGYPGVGGGEAGDLLLKVNVQDKPGYKREGRDVYTTVNIPFTTAVFGGEAKV HTIYGDVLCNIKPGTQSGTKIRLRGKGIVAMNNPSVHGDEYATVQIEVPTNLTPEARRKL KEFEQECNGSRRSRGFGSGSAA >gi|226332932|gb|ACII01000087.1| GENE 19 20805 - 21272 595 155 aa, chain - ## HITS:1 COG:RSc0200 KEGG:ns NR:ns ## COG: RSc0200 COG0071 # Protein_GI_number: 17544919 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Ralstonia solanacearum # 45 143 37 133 140 68 32.0 4e-12 MLMPSIFGENLFDDFFGDFPFYYDDRAMKDAEKKLYGHKANHVMKTDIKEMNNGYELIVD LPGFKKDEVHAALENGYLTISAEKGLDKDEKEKETGRYIRRERYAGACSRSFYVGKEVHQ DDIKAEFKHGILTLFVPKKEAKPAVEQKHSISIEG >gi|226332932|gb|ACII01000087.1| GENE 20 21460 - 21927 657 155 aa, chain - ## HITS:1 COG:RSc0200 KEGG:ns NR:ns ## COG: RSc0200 COG0071 # Protein_GI_number: 17544919 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Ralstonia solanacearum # 45 151 37 139 140 67 33.0 9e-12 MLMPSIFGENLFDDFFGDFPFYYDDRAMKDTEKKLYGRRASHIMKTDIKETDKGYELVVD 
LPGFTKDEVKADLENGYLTISAEKGLDKDEQEKETGRYIRKERYAGACSRSFYVGKEVEK EDIKAEFKHGILTLFVPKKEAKPAVEQKKTIAIEG >gi|226332932|gb|ACII01000087.1| GENE 21 22271 - 25417 2627 1048 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 621 1032 239 653 676 161 29.0 7e-39 MSEVKNKKKKSSIIQVSIGVLAVILAILIIIMMGIVSDIQGTARIVNYTGLVRGETQRLI KLELSTQQENEMIHDIRTFIDGLRNGNDELNLVRLNDVDFQNKMRELDDKFSDLYKKIYL VRFKGARNTDIIPESEEFFVICDEATGLAEKYSQKKATSLSLLEKYITADIVVLMLLIGY EFIKAIQYAAMNRLLQRKVYLDDATGLPNKNKCEELLSEEEPDADTGVCSFDLNNLQRIN DSRGHEAGDAYIRRFAICLRASMPAEQFVGRAGGDEFLAVTHGLDREQMTQCLEKVRRDM REESKVYPDTPLSYAAGFALAGDFPGSTMRELFNCADKNMYINKNHVKREEAAAEKHQGY QLLKLLNQHGSNFSDCLYCDARMDTYRAIRSSENFFLASDGAYSGAVEQIIEEQVEKSSQ ADIREGLQISELQKKMHTKKDILEYEYNIGKQGAYNRFTLIPVDWDEDKKLHHFLLAFET IRKTSEGQTGAKEQLQLYYEQLKQSILENDSYVDALLELSDVIYTVNLTKDALERRIVLN GKEQKSRELFMDYPLPCSYRDYCWEYEKKITQETIAGYCMTDNCEKLRKRFENGETNMSV EYCAREDDGSIRWVQKTVLMTRMVVFDTEILAEVPMIYAIILLQDTTQRHERDEQEQARL QAAFNEMRAESRAKTNFLSRMSHDIRTPLNGIIGLLKIDETHFEDKELIQENHKKMKIAA DYLLSLINDVLQMSKIEEGHIVLTHEYICLKDLVYEIESIITHKAADEDIQWIYEKNKEN IPYPYVYGSSLHLRQIFLNIYGNCIKYNRLGGKITTVMEAADVHDGICTYRWTISDTGIG MSPEFLSHIFDPFSQEKTDARSVYQGTGLGMAITKGLIEQMNGSIEVTSQVGVGSVFVIT IPFEIAREQKKDEEIAEKYDIRGLHLLAAEDNELNAEIIEMLLTDDGAKVTVAKNGRQAV EHFENNPPGTFDAILMDVMMPVMDGIAATKAIRAMDRADAKTIPIIAMTANAFEEDAKRC LAAGMTAHLAKPFQIEDVEKTIVECCGK >gi|226332932|gb|ACII01000087.1| GENE 22 25744 - 26571 788 275 aa, chain - ## HITS:1 COG:BH2724 KEGG:ns NR:ns ## COG: BH2724 COG0395 # Protein_GI_number: 15615287 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 17 275 7 271 272 225 48.0 6e-59 MTKKNKSAIQRRLIPTYIFLIIVSFISVFPLYWMISAATNTSTDVSRGRIIPGSHFMENF RNLTSQQPLWRALGNSFFYAILTTVICLLICSIAGYGFEVYHDKWKDRVFSILLLAMMVP 
QVATMVPLFKMFSKAGLLNTAVGFILPIISTPFMIMMFRQNARSFPVDIIEAARIDGLSE VRIFFQMFIPTMKSTYAAAAVITFMNAWNSYLWPKVIMTDNKAMTMPMLIANLITGYVTD YGMLMCGVLFCSIPTMIVFFVLQKQFAEGITGAVK >gi|226332932|gb|ACII01000087.1| GENE 23 26564 - 27463 845 299 aa, chain - ## HITS:1 COG:BH2725 KEGG:ns NR:ns ## COG: BH2725 COG1175 # Protein_GI_number: 15615288 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 14 295 7 286 290 268 48.0 7e-72 MNSKKKGMSLLGKQRAAGWTFLAPATIMIAIMSFYPMIRAFIISLQTGAGANMRFADPIF SNYKRILADKVFQQSIANTFLYLIIQVPIMLILAILLAQLLNNKDLKFKGFFRTCVFLPC ATSLVSYSLIFRSMFATDGLINSILIKLGILSTGYNFLGNSASAKIVIILALIWRWTGYN MVFYLAGLQNIEYSVYEAAKIDGANGWKTFWKITVPLLKPTIIMTFIMSINGTLQLFDES VNLTKGGPANSTITMSHYIYNTCFINVPNFGYAAAMSFIIFIMVAVLAFINLKVGDTRD >gi|226332932|gb|ACII01000087.1| GENE 24 27661 - 28950 1657 429 aa, chain - ## HITS:1 COG:BH2726 KEGG:ns NR:ns ## COG: BH2726 COG1653 # Protein_GI_number: 15615289 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 15 420 17 417 425 269 37.0 9e-72 MRKKLLAVLMTGAMVASFAGSATAVFAKSDDDNKLTVWAWDQNFNIKSMQIAADQYAKDH EGFSVDIVETSSDDCQTKLTTAANAGDYSTLPDIVLMQDNSYQKYLKSYPDAFTDLKDIN INWDDFGKLKQSYSMVDDTHYGVPFDNGAVIACYRTDILDEAGYTIDDLTDITWSKFMEI GKDVHEKTGKYLLTSEATGGDTLMMMMQSCGANFVNEDGEAYIVGNEVAEKCINLYVDLV KNDVVKLVNNWDEYIATITGGEAAGIVNGNWITATLMSTEDQKGLWQITTMPKVDDVDTA TNYANNGGSSWYVTSNCQNVELAEDFLASTFGSSTEFYDTILPEAGAISCYLPAGESKVY NEPNEFFGGQPIFSTIVEYSSHIPEFTKTPYHYESRECVNTAVVNIVNGTPVEDALQEAQ DTLAFKMTE >gi|226332932|gb|ACII01000087.1| GENE 25 29278 - 29787 583 169 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3288 NR:ns ## KEGG: Cphy_3288 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 166 1 166 169 75 28.0 6e-13 MNYKIVEQVSRKKKSMSGVLRVVMIIFAVLFVVMGITISQAFMLPGFLLAGLYFVFDIFS QKDYEYMMENGQFEIDVIYGKKYRKTAHILELSNLETVAPHWHESVARYKKNGGTEKLKK 
YDYTSYDEDTPYYTMIIREDGHKIKLLLDLNEELLHAMKTQYPEKVYFA >gi|226332932|gb|ACII01000087.1| GENE 26 29795 - 30733 670 312 aa, chain - ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 158 293 160 298 313 83 30.0 5e-16 MAKKRKKPKMEFRYYQMPAGSPILALLGQKWVQNYGNDVDYLHFHNYLEIGFCYEGDGFM AFGEAKMRFSGREFTVIPPNYAHTTNSDLGTVSKWEYLFIDVEGFMKKFLNNPVKAEKMI QRIYSKALFLKENEFPSIAAKILKIMNIMREGEEFYIEESSGVLAALLVEIARLNRASPE DRVEEETGKLTNMITRVLDFVSYHYMEDIRIEDLAKICHISETHFRRVFTSHMKVSPLEY INSVRIHTACEFLQKTDIPVADIAHKCGFTTNSTFNRNFRQIMGVTPVEWRKRPENYEQQ LLDFYIHSEEGW >gi|226332932|gb|ACII01000087.1| GENE 27 30976 - 31887 681 303 aa, chain - ## HITS:1 COG:MJ0750 KEGG:ns NR:ns ## COG: MJ0750 COG0348 # Protein_GI_number: 15668931 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Methanococcus jannaschii # 99 303 52 238 238 85 28.0 1e-16 MSNQKNSDNKSNKKQNTKNMRINEWPRHGIQALWAFITNSHVTGFVTGKIYTGKLKNACV PGLNCYSCPGAVGACPIGSLQAVIGSWNFKMAYYVVGFLIFIGAMVGRLICGFLCPFGLI QDLLNKIPFPKKIRTFKGDKLLRKLKYVIFAVFVILLPLFLVDIMGQGAPYFCKLICPAG TLEGGLPLVLLNKSMRSALGWLYIWKNVILVITIILSILIYRPFCKYICPLGAFYSVFNP VSLYKYRVDKDKCIKCGKCAKACQMIVDPVENSNSPECIRCGRCKKACPTDAIQCGIRRK PQS >gi|226332932|gb|ACII01000087.1| GENE 28 31888 - 32052 214 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579459|ref|ZP_04856729.1| ## NR: gi|253579459|ref|ZP_04856729.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 54 1 54 54 95 100.0 8e-19 MNKLKTSKIWNSKWPALILMLVGVCFMGYGIFRGELAVVFTKAINICMECVGIG >gi|226332932|gb|ACII01000087.1| GENE 29 32053 - 33255 1457 400 aa, chain - ## HITS:1 COG:no KEGG:FMG_0299 NR:ns ## KEGG: FMG_0299 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 72 396 56 381 384 277 46.0 4e-73 MKNRRMRLMAIFAAGVLSVTSMGGIVTAHAEETLSVPAEEAGAGAVQMDGKDAQAAEDGS VAASEGLIAGSFKPSEIAQPAQDTYEYPFLGMKLGISKDLKEQMEKKNISMFTDERWNDN ADAVTYATASWCTLTDEQKDAEVDKMGTGYDDWLKSLSRVGVIGMYDEESQKDLDTITGC TEHKELGTSSDGKYKYYLSTNKDADADLLKDVEGIEVTLTEMTPFQMLSAFDQPQDTSDS TEAAEGTNVGKFETTGVDGKTYTQDIFSKYDLTMVNIFTTWCSPCVNEIPDLEKLYQEMK DKGVGVVGVTLDTVGSDGKQDEEAVKKAQVLQEKTKASYPFLIPDSGMMNGRLNGISAFP ETFFVDKNGNIVGETYSGSHSLDEWKEIVEKELENVTEGK Prediction of potential genes in microbial genomes Time: Sat May 28 20:04:35 2011 Seq name: gi|226332931|gb|ACII01000088.1| Ruminococcus sp. 5_1_39B_FAA cont1.88, whole genome shotgun sequence Length of sequence - 5481 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 219 - 278 7.1 1 1 Op 1 40/0.000 + CDS 323 - 991 662 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 994 - 1053 2.5 2 1 Op 2 . + CDS 1073 - 2233 887 ## COG0642 Signal transduction histidine kinase - Term 2180 - 2217 5.4 3 2 Op 1 . - CDS 2259 - 3314 1039 ## COG2008 Threonine aldolase 4 2 Op 2 . - CDS 3419 - 3823 285 ## COG3543 Uncharacterized conserved protein - Prom 3857 - 3916 5.3 5 3 Tu 1 . 
- CDS 3993 - 4868 792 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 4934 - 4993 5.6 Predicted protein(s) >gi|226332931|gb|ACII01000088.1| GENE 1 323 - 991 662 222 aa, chain + ## HITS:1 COG:mlr7684 KEGG:ns NR:ns ## COG: mlr7684 COG0745 # Protein_GI_number: 13476381 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Mesorhizobium loti # 1 222 1 218 221 181 43.0 1e-45 MHILIIEDEEQLCRSMAEGLRMDGYETDTCFDGEEGLELCMTENYDLILLDLNLPGIDGL EILRQFRTFNTNTPVLILSARVQIQDKVEGLDLGANDYLTKPFHFEELEARIRSLTRRKF IQEDVCLRCSRITFDTRTREASVDGTPLALTRKESALLEYFLLHRNRIISPEEMIEHLWD GSVNSFSNSIRVHISSLRKKLRTALGYDPIQNKIGQGYILEE >gi|226332931|gb|ACII01000088.1| GENE 2 1073 - 2233 887 386 aa, chain + ## HITS:1 COG:lin1415 KEGG:ns NR:ns ## COG: lin1415 COG0642 # Protein_GI_number: 16800483 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 36 382 130 476 483 147 26.0 4e-35 MKFYSEKLSPRRLSLQWRLTLLISGLVITACALMYFFISRSAVTGLEGISDYVVLVTPDN SNPISINVDPKILIPDLENQIQNTKNNFLFQSMIATGIIILLSSICTWFVTRRALTPLRR FSDKISQVQAQNLSEPLEVPLSEDEISRLTRSFNDMLARLDNAFSAQKQFVASAAHELRT PLAVMQTNLEVFYKKPEHTPQEYDRLFTMLQEQIGRLSHLAEILLDMTGLQTVERSDTIS LAALTEEVFCDLDPVADKHQIRLIQTEGDCTVTGSYILLYRAVYNLVENAIKYNRPSGSV TVNIHSAESAVLEVTDTGIGISPENQEKIFDPFYRVDKSRSRAMGGAGLGLALVSEIARQ HNGQVKVTQSSEKGSTIALMLPVTLL >gi|226332931|gb|ACII01000088.1| GENE 3 2259 - 3314 1039 351 aa, chain - ## HITS:1 COG:SA1154 KEGG:ns NR:ns ## COG: SA1154 COG2008 # Protein_GI_number: 15926897 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Staphylococcus aureus N315 # 10 347 4 341 341 392 54.0 1e-109 MENTESRLYFASDYMEGAHPAIMKKLMETNLEKTVGYGQDPYTEDAKEKIRKACNAPEAE VFLLVGGTQTNATVIDALLKSYQGVVAADTGHIATHESGAIEFGGHKVLTVPQKDGKISA QQVEKLVKDFYDDANYEHMVMPGMVYISQPTEYGTLYSREELAALSKVCRENHLPLYVDG 
ARLAYALASPENDVTLTDLAEFSDAFYIGGTKCGALFGEAVVIPQKGRIPHFFTIIKQHG ALLAKGRIAGIQFGELFTDGLYLRIGKPAMEAAEQIKAALKKYGYQLSLDTPTNQIFCIV SNDVMKEIAQDVEFGFWEKYDETHSVIRFATSWATTMEDTQKLIQILEKNK >gi|226332931|gb|ACII01000088.1| GENE 4 3419 - 3823 285 134 aa, chain - ## HITS:1 COG:MTH906 KEGG:ns NR:ns ## COG: MTH906 COG3543 # Protein_GI_number: 15678926 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanothermobacter thermautotrophicus # 6 124 16 136 140 66 31.0 1e-11 MSFIKNIPLRPHHGMCLAYFKGEGYSNGFSAHMQEMLDIFQKGAKIQLHADTDEICSACP NNEKGCCSSFSLVEAYDNAVLELCGLENGQIMEFDDFTDVVQKKILASGKRKEICGNCQW NSICESQKSRWEKC >gi|226332931|gb|ACII01000088.1| GENE 5 3993 - 4868 792 291 aa, chain - ## HITS:1 COG:SA0684 KEGG:ns NR:ns ## COG: SA0684 COG0697 # Protein_GI_number: 15926406 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Staphylococcus aureus N315 # 7 288 5 284 288 189 44.0 8e-48 MKQENYIKGMIMIILSAFFFACMNVSVRLAGDLPSVEKSFFRNLVAAVFAAIILRKNRTV PKVDKKYWGPLILRCVCGTLGILCNFYAIDHLLVADASILNKLSPFFAIIFSFLLLKEKI RPAQAACVALAFIGCLFVVKPGFQNAALVPALIGVCGGLGAGIAYTMVRVLGTHGVKGPV IVFYFSFFSCLSVVPWMLSHFTPMSMKQLVTLLMAGLFAAGGQFTITAAYTYAPAGKISI FDYSQIIFATMLGFILFGEIPDKYSFTGYVLIILASLGTFLYNMRVAKAGK Prediction of potential genes in microbial genomes Time: Sat May 28 20:04:37 2011 Seq name: gi|226332930|gb|ACII01000089.1| Ruminococcus sp. 5_1_39B_FAA cont1.89, whole genome shotgun sequence Length of sequence - 7142 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 370 - 957 589 ## COG4732 Predicted membrane protein 2 1 Op 2 . 
+ CDS 887 - 1663 724 ## COG2145 Hydroxyethylthiazole kinase, sugar kinase family - Term 1657 - 1716 11.5 3 2 Op 1 1/0.000 - CDS 1722 - 3122 1374 ## COG0534 Na+-driven multidrug efflux pump - Prom 3232 - 3291 5.0 - Term 3264 - 3317 5.4 4 2 Op 2 . - CDS 3378 - 4712 1182 ## COG0038 Chloride channel protein EriC - Prom 4755 - 4814 4.4 5 3 Tu 1 . - CDS 4849 - 6849 1717 ## Ccel_1549 putative outer membrane protein - Prom 6988 - 7047 6.8 Predicted protein(s) >gi|226332930|gb|ACII01000089.1| GENE 1 370 - 957 589 195 aa, chain + ## HITS:1 COG:SP0723 KEGG:ns NR:ns ## COG: SP0723 COG4732 # Protein_GI_number: 15900620 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 1 165 1 165 174 130 44.0 2e-30 MNTDRSKTLRMVMLAMMVAIGVVISPILRIEGMCPTAHLINIVCSVLLGPWYSLLCATLI GIIRMMFMGIPPLALTGAVFGAFLSGVFYRASHGKIICAVIGEIFGTGIIGSLVSYPVMA FLMGRSGLNAFFYTPMFLAATCMGGTIAYFFLKALSHAGMLAKFQQSLGAKVYDRKSDKS QTIDQSSTAADSLHH >gi|226332930|gb|ACII01000089.1| GENE 2 887 - 1663 724 258 aa, chain + ## HITS:1 COG:CAC3096 KEGG:ns NR:ns ## COG: CAC3096 COG2145 # Protein_GI_number: 15896347 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxyethylthiazole kinase, sugar kinase family # Organism: Clostridium acetobutylicum # 13 257 10 259 273 155 36.0 5e-38 MTVNQTNPRQLTRAVQPLIHCITNPISIHDCANIILAAGGRPIMAEHPAEVAEITSHAQS LALNLGNITDARMKSMPESLKIAASLHIPVMLDLVGTACSNLRYEFAQKLMNIHMPELLK GNMSELLAMSGQTAHAIGIDAGVQDVLTDANRSHLKELFQEKASQWNTTLLITGKEDMIV SASKCEFITNGTPAMSQITGTGCMLGMICATYLAVTDPFTAALSAAREFGTAGERAEKNS SGPGSFQTELFDQFYNLL >gi|226332930|gb|ACII01000089.1| GENE 3 1722 - 3122 1374 466 aa, chain - ## HITS:1 COG:FN0944 KEGG:ns NR:ns ## COG: FN0944 COG0534 # Protein_GI_number: 19704279 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 17 460 9 452 455 432 52.0 1e-121 MEKDNQSTQEINSQAQNPLGIAPVGGLIAKFAIPAIISMLVSAMYNIVDQIFIGQGVGML 
GNAATNVAFPVTTVATALALLLGIGGASNYNLEMGAGQEKKASGIAGTALSSLAISGLIL AVIVLMFLKPLLTLFGATADVMPYAVDYTGITAFGLPFYILSVGGNHVVRADRSPTYSMV CIMIGAIINTILDPLFIFGFGWGIKGAAWATVIGQVASGVLVIVYFCKFRKMYLSGEMLK PKLSYLKAIISLGLASCINQIAMAIVQIVLNNILRYYGANSVYGSDIPIACVGVISKVNQ VFMAICIGITQGSQPIWGFNYGAKKYDRVRQAYRYSVTACTVIATIFFFCFQIFPHQIVG IFGTGSELYFQFAERYLKIFMFMTFANGIQPVSSGFFTSIGMAKLGIVMSLTRQVIFLLP LIIIFPLFMGIDGVMYAGPIADAAAFALAIVFARRELGKMRSAEIV >gi|226332930|gb|ACII01000089.1| GENE 4 3378 - 4712 1182 444 aa, chain - ## HITS:1 COG:BH0663 KEGG:ns NR:ns ## COG: BH0663 COG0038 # Protein_GI_number: 15613226 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Bacillus halodurans # 43 440 9 412 424 244 34.0 2e-64 MEKEEKRADQENDTEEQSTEAKSLLKKKLQYYSSEIQRDVGNLVKWLMIAVLVGCITGAA STLFSFVLKSVTNCRKENEWMFYLLPVMGLIIVYLYEKFGKDDGGTNQVLSTVRSQDDVP ILSAPLIFISTALTHLAGGSAGREGAAIQLGGSIANQLGRWIHLDEEDRHVIVMCGMSAA FSALFGTPMAAAVFALEVVSVGVMYYTALMPCMIASLVASGFAAGMGVTPETFHVVDIPK LTIETGLKMGAIAVGCAVISIVFCMVLNGVAGAYGRWFKNPYVRVVVGSCLVIGITLLLG TSDYMGAGAELIEKAVEEGQARPLDFFWKLALTALTMRAGFRGGEIVPSFCIGATFGCVM GNWLGLSPSICAACGMTAVFCGVTNCPITSILIAFEMFGFKGVSFYLIAVSISYAASGYY GLYKDQTIVYSKYKAKYVNKHTRF >gi|226332930|gb|ACII01000089.1| GENE 5 4849 - 6849 1717 666 aa, chain - ## HITS:1 COG:no KEGG:Ccel_1549 NR:ns ## KEGG: Ccel_1549 # Name: not_defined # Def: putative outer membrane protein # Organism: C.cellulolyticum # Pathway: not_defined # 2 649 4 629 644 477 42.0 1e-133 MKIYVNVNAGHDGNGTEQMPFRHINDAAKIAQPGDEVWVAPGVYREYVDPVHAGREDARI TYRSVEPLGAVITGAERIQSWVPYKENVWVCRVANSLFGNYNPYTTMVYGDWYFAKADKH TGCVYLNNRALYEAGSVEECIKAEVYECSWVPEESTYKWYTEQDQEKDETVIYANFHGAD PNEENVEINVRRECFMPSKTGVGYITVSGFVVTKAATTWAPPAAYQDGMIGPHWSKGWII EDCEISNSKCAGISLGKYYDPENDHYFTNKYVKSPTQMERDAVCRGQYHGWLKEKVGSHI IRRNNIHHCEQGGIIGRMGGVFSIIEDNHIHHINNMMELGGAEIAGIKMHAAIDVIMRRN HIHHCTMGIWCDWEAQGTRLSQNLLHDNQRPAFAKQLKGGMMCQDIFVEVGHGPTLIDNN ILLSDASLRFATQGVAMVHNLICGALTCVGEGTSWRYTPYHMPHRTEVMGFMTILHGDDR 
FYNNIFVQKWPSEDFITMHDSDDGFDSENRKVGTWMFDEYPTYDEWISQFDFTKPADMKK LESVHFDHLPVWSEGNVYLNGAKAWKHEKNGFVSSENVKVELTEKDGKYFLDTNIYEILE AFSGRMINTEVLGKAFEPEEFFENPDGTPITFDTDYFGGHRGAKVIPGPFAEKEDVGKNV NICTAF Prediction of potential genes in microbial genomes Time: Sat May 28 20:04:49 2011 Seq name: gi|226332929|gb|ACII01000090.1| Ruminococcus sp. 5_1_39B_FAA cont1.90, whole genome shotgun sequence Length of sequence - 13378 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 7, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 289 - 1155 498 ## COG0583 Transcriptional regulator - Prom 1227 - 1286 8.9 + Prom 1201 - 1260 4.9 2 2 Tu 1 . + CDS 1292 - 2683 1176 ## COG0733 Na+-dependent transporters of the SNF family 3 3 Tu 1 . - CDS 2802 - 3467 606 ## COG0637 Predicted phosphatase/phosphohexomutase - Prom 3515 - 3574 1.7 4 4 Op 1 . - CDS 3582 - 4958 1169 ## COG1066 Predicted ATP-dependent serine protease 5 4 Op 2 . - CDS 4952 - 5365 555 ## EUBREC_2305 hypothetical protein - Prom 5465 - 5524 3.9 6 5 Op 1 . - CDS 5595 - 8057 2219 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 7 5 Op 2 . - CDS 8066 - 8443 416 ## COG1725 Predicted transcriptional regulators - Prom 8467 - 8526 3.6 - Term 8531 - 8579 10.4 8 6 Tu 1 . - CDS 8623 - 9393 1093 ## gi|253579480|ref|ZP_04856749.1| conserved hypothetical protein - Prom 9513 - 9572 8.3 - Term 9578 - 9637 4.7 9 7 Op 1 . - CDS 9697 - 10971 1663 ## COG0151 Phosphoribosylamine-glycine ligase 10 7 Op 2 21/0.000 - CDS 10997 - 11638 741 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN 11 7 Op 3 . - CDS 11632 - 12657 817 ## PROTEIN SUPPORTED gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase 12 7 Op 4 . 
- CDS 12707 - 13207 757 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase - Prom 13285 - 13344 8.4 Predicted protein(s) >gi|226332929|gb|ACII01000090.1| GENE 1 289 - 1155 498 288 aa, chain - ## HITS:1 COG:BH2117 KEGG:ns NR:ns ## COG: BH2117 COG0583 # Protein_GI_number: 15614680 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 7 267 5 266 290 162 33.0 1e-39 MKEIRDRYEIFLKVCETGSFSKAAEALNYTQSGISQMMAGFEEELGVQLFARAKKGVTLT DNGRMLMPYIQEMTNQKDKLRQAAFNINHKIEGKLRIGSFSSVTAMWMPEVIHYFNRNYP QVEVQIFDGNYDEIRDWIIHGQVDCGFLSSIVADNLKFYPLCEDPLCAVMQKNHPLAEKK SVRLEELLEYPFIIETPGCDNDILHLLERCGKEPKTSYSFRDDTLIMAFVKNGLGVTISQ KLVMKAFANDVESRPLEPGRSRIIGLAFVKTMNSVVTKILLEYLKGMM >gi|226332929|gb|ACII01000090.1| GENE 2 1292 - 2683 1176 463 aa, chain + ## HITS:1 COG:VC1669 KEGG:ns NR:ns ## COG: VC1669 COG0733 # Protein_GI_number: 15641673 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Vibrio cholerae # 7 456 12 447 448 224 33.0 3e-58 MERKSNHFSGQLGFVLAAAGSAVGVGNLWRFPYLAAKDGGGLFLIIYFILVLTFGFTLLT SDIAIGRRTQKSAIGAYAEMKPKWKFLGILTFLVPVLIMTYYAVIGGWITKYAVVYLTGQ AKAAAADDYFTSFITSSTSPVIFALIFMGVTAFIVYNGVQDGIEKVSKWMMPVLLVLVVI ISIYSLTLKHTDSSGQVHTGIQGFLYYLTPNLEGLTVQRFLQILLDAMSQLFFSLSVSMG IMITYGSYVKPDVNLNKAVNQIEIFDTGVAFLAGAMIIPAVFVFSGTEGMGAGPSLMFVS LPKVFAAMGKAGIFVGILFFVTAIFATLSSCISVLESIVANCMEIFHTGRKKTVSVLSAV YLAASAIIALGYSIFYVEVELPNGSTGQLLDIMDYISNSVMMPFIALLSTILIGWVMTPD YVIDEMERNGSRFRRKKLYRIMIRYVAPVMMFILFLQSTGVLS >gi|226332929|gb|ACII01000090.1| GENE 3 2802 - 3467 606 221 aa, chain - ## HITS:1 COG:alr0288 KEGG:ns NR:ns ## COG: alr0288 COG0637 # Protein_GI_number: 17227784 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Nostoc sp. 
PCC 7120 # 1 217 1 217 222 139 35.0 4e-33 MASGIDTVIFDMDGVIFDSEILVLQAWKEVAERHGIAGVEAACHECLGTNSVVSKGVFLK HYGEDFPYEEYKAEMAEVFFSHASGGKLAKKPGVEELLKYLKMRGFKIGLASSTREVLVR SEISDGGLLGYFDQIVGGDMVERSKPEPDIFLEACRRLGTRPENCYVIEDSHNGIRAAYA AGMHPIMVPDLMEVTEEMKSLAEEILGSLCAVQEFLQGSGI >gi|226332929|gb|ACII01000090.1| GENE 4 3582 - 4958 1169 458 aa, chain - ## HITS:1 COG:BH0104 KEGG:ns NR:ns ## COG: BH0104 COG1066 # Protein_GI_number: 15612667 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Bacillus halodurans # 2 452 1 451 457 496 55.0 1e-140 MLKNKKTVYFCQECGYESAKWMGQCPGCKSWNTFVEETVSTKKPSSSGVMKSSEKRQEPV ILKDISLSEDERQTTQIGELDRVLGGGIVPGSLVLVGGDPGIGKSTLLLQVCRNLAEKQV SVLYISGEESLRQIKLRANRIGQFTDKMQLLCETNLEVIREVIERRKPDVVVIDSIQTMF HEDVSSAPGSVSQVRESTNILMQIAKGMGVSIFIVGHVTKEGNVAGPRVLEHMVDTVLYF EGDRHASYRILRAVKNRFGSTNEIGVFEMCNTGLEEVKNPSEYMLNGRPEDASGSVVACS MEGTRPILVEIQALVCQSNFGIPRRTAVGTDFNRVNLLMAVLEKKVGLRLAASDAYVNIA GGMKMTEPAIDLGICLAIVSSAKDIVIPDNVMVFGEVGLSGEIRAVNMAGQRVQEAKKLG FGTVVLPEVCRSSVGKVEGINLVYVSQIRDAIGYIMKK >gi|226332929|gb|ACII01000090.1| GENE 5 4952 - 5365 555 137 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2305 NR:ns ## KEGG: EUBREC_2305 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 136 1 136 137 158 63.0 6e-38 MAVVKELIRTEENGAISFGDYELAQKSKLSDYQHQGDMYKVKTFKEITKLERNGMFVYES VPGTAVFNLTQSEAQMDFHVEGPEDAQITVEMEPDTEYEVFIEQASTGKMKTNLGGKLSF SVELGNAARVEVKIVKC >gi|226332929|gb|ACII01000090.1| GENE 6 5595 - 8057 2219 820 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 4 820 5 814 815 859 53 0.0 MERQFTRQAENALKLATTIAKSCGHGYIGTEHLLAGLLKEPEGTAGKVLQEFNVEEEKLR ELISSLVTPVQGNTAAAETKDPQYSPRARRIIEQAKEDAESMECLCGTEHLLLSLLKETD CVATRLLFTMGVNIQKLFVAILTAMGFDNDTIAEEFQYARNAGNKKVTSTPTLDQYSRDL TQLAAEGKLDPVIGRDKEITRLIQILSRRTKNNPCLTGEPGVGKTAIVEGLAQRIVMGMV 
PDTVKDKRLVVLDLSGMVAGSKYRGEFEERIKNVIQEVKEHQGILLFIDEIHTIIGAGGA EGALDASNILKPSLSRGEIQLIGATTLEEYRKYIEKDAALERRFQPVTVEEPSSEETVEI LKGLRPYYEKHHGVKIEDEALEAAVKMSQRYINDRFLPDKAIDIIDEAASKVQLAGYQSA PEMEQYELELNELREEKEQAVKTADIERAKKAQVRQNEVEAEMEKIRIKTERRNKRKKLV VDEAAVAETISDWTKIPLQKLTEGETKRLARLEKELHKRVIGQNEAVKAVAQAVKRGRVG LKDPNRPIGSFLFLGPTGVGKTELSKALAEAVFGSEQAMIRVDMSEYMEKHSVSKLIGSP PGYVGYEEGGQLSEKVRRNPYSVLLFDEIEKAHPDVFNILLQVLDDGHITDAQGRKVDFK QTIIIMTSNAGAQAIMEPKRLGFMSDNDEKKDYERMKGGVMEEVRRIFKPEFLNRIDDIM VFHVLNKEDIRKIVTLLLKTLEKRCAEQMEIHLTVTNAVKDFIAEKGSDNKYGARPLRRA IQSKIEDALANEILEGRVKRGDSVQVKLHKNEIAFEVKKQ >gi|226332929|gb|ACII01000090.1| GENE 7 8066 - 8443 416 125 aa, chain - ## HITS:1 COG:CAC0599 KEGG:ns NR:ns ## COG: CAC0599 COG1725 # Protein_GI_number: 15893888 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 122 1 124 125 107 44.0 5e-24 MMIEIDFSSDEAIYIQLTNQIIMGIATSRLQEGDTLPSVRQLADTVGINMHTVNKAYSLL RQEGFVTIDRRRGAVISIDVNKRKALEELKQNLMVALAKGCCKSVSREEVHQLIDEIFDE YDENR >gi|226332929|gb|ACII01000090.1| GENE 8 8623 - 9393 1093 256 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579480|ref|ZP_04856749.1| ## NR: gi|253579480|ref|ZP_04856749.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 256 1 256 256 463 100.0 1e-129 MGISKEEAIKELQTREEIFVAYSQATKLPYVKCDEETFNDQAWIFSTEEGIKEFGKKMLE EKVLLMGMKFTKKDYPRLYGTFYAIGVNTVVWVDGEDKIEIDLPDIAKQADMSKIEPAKR PLLNPTLELSGIYFMQELRRPVEKDQHGNLRELEEELIVNLKKSEYLVAMNVDPEDPKKI NIPYLKNKKEEILQPVFTDVMELEKFTKGQKLRIAKVPFAKLPELLIDKAMAYAVNPLGF NLVLNREQLNKIIGLK >gi|226332929|gb|ACII01000090.1| GENE 9 9697 - 10971 1663 424 aa, chain - ## HITS:1 COG:VC0275 KEGG:ns NR:ns ## COG: VC0275 COG0151 # Protein_GI_number: 15640304 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Vibrio cholerae # 1 418 1 419 429 399 48.0 1e-111 MKVLIVGSGGREHAIAWSVSKSPKVDKIYCAPGNAGIAELAECVDIGAMEFEKLADFAQE KEIDLTIIGMDDPLVGGVVDVFEARGLKVFGPRKNAAILEGSKAFSKDLMKKYNIPTAAY ENFDDPEKALEYLRTEARFPIVLKADGLALGKGVLICKDLKEAEDGVKEIMEDKKFGNAG NTMVIEEFMTGREVSVLSFVDGKTIRTMTSAQDHKRAQDGDQGLNTGGMGTFSPSPFYTE EVDEFCRKYVYQPTVDAMAAEGRPFKGIIFFGLMLTPDGPKVLEYNARFGDPEAQVVLPR MKNDIIEVMEACVNGTLDQIDLQFEDNAAVCVVLASDGYPVKYEKGIPMYGFENFKGKDG YYCFHAGTKLKDGQIVTNGGRVLGITATGKDLKEARKNAYEATEWITFANKYMRHDIGKA IDEA >gi|226332929|gb|ACII01000090.1| GENE 10 10997 - 11638 741 213 aa, chain - ## HITS:1 COG:CAC1394 KEGG:ns NR:ns ## COG: CAC1394 COG0299 # Protein_GI_number: 15894673 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Clostridium acetobutylicum # 1 203 1 198 204 196 51.0 4e-50 MLKVGVLVSGGGTNLQAILDAIDCGKITNAEVSLVISNNPKAYALERAKNHNIEAVCISP KQYESREEFHKTLLEKLKESGVELIVLAGFLVAIPPMIVEAYPNKIINIHPSLIPSFCGV GYYGLHVHEKALARGVRVTGATVHFVDTGTDTGPIILQKAVKIKSDDTPEVLQRRVMEKA EWKILPKAINLIANGKVKVVDGRVEIEEYDTEE >gi|226332929|gb|ACII01000090.1| GENE 11 11632 - 12657 817 341 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase [Acinetobacter baumannii SDF] # 1 338 12 344 356 319 48 8e-87 MDYKTAGVDIEAGYKSVELMKEHVKRTMREEVLGGLGGFSGAFSLAKIKEMEEPVLLSGT 
DGCGTKVKLAMVMDKHDTIGIDAVAMCVNDIACAGGEPLFFLDYIACGKNYPEKIAAIVG GVAEGCVQSDAALIGGETAEHPGLMPEDEYDLAGFAVGVCDKKEMITGEELKAGDVLIGM ASTGVHSNGFSLVRKVFDMTKESLDTYYEELGTTLGEALLAPTRIYVKALKSIKNAGVRV HACSHITGGGFYENIPRMLKEGTRAVVEKNSYPVLPIFKLLAEKGNIDEQMMYNTYNMGL GMILAVDAADVDKTMEAIKAAGDTPYVVGRIEDGEKGVTLC >gi|226332929|gb|ACII01000090.1| GENE 12 12707 - 13207 757 166 aa, chain - ## HITS:1 COG:TM0446 KEGG:ns NR:ns ## COG: TM0446 COG0041 # Protein_GI_number: 15643212 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Thermotoga maritima # 1 166 1 166 171 199 61.0 2e-51 MAKVGIVMGSDSDMPIMSKAADILEKLGIDYEIKIISAHREPDVFFEYAKTAEEKGFKVI IAGAGMAAHLPGMCAAIFPMPVIGIPMHTTSLGGRDSLYSIVQMPSGIPVATVAINGGAN AGLLAAKILATSDEELLAKLKAYSQELKEQVEAKDARLQEVGYKNY Prediction of potential genes in microbial genomes Time: Sat May 28 20:05:09 2011 Seq name: gi|226332928|gb|ACII01000091.1| Ruminococcus sp. 5_1_39B_FAA cont1.91, whole genome shotgun sequence Length of sequence - 21517 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 10, operones - 3 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 64 - 2097 2193 ## EUBELI_00547 hypothetical protein 2 1 Op 2 . - CDS 2136 - 2456 374 ## gi|253579486|ref|ZP_04856755.1| conserved hypothetical protein 3 1 Op 3 . - CDS 2440 - 2913 320 ## COG4767 Glycopeptide antibiotics resistance protein 4 1 Op 4 . - CDS 2939 - 3091 150 ## gi|291546925|emb|CBL20033.1| hypothetical protein 5 1 Op 5 1/0.000 - CDS 3170 - 3850 948 ## COG0461 Orotate phosphoribosyltransferase 6 1 Op 6 13/0.000 - CDS 3877 - 4779 1244 ## COG0167 Dihydroorotate dehydrogenase 7 1 Op 7 1/0.000 - CDS 4779 - 5567 805 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 8 1 Op 8 . - CDS 5578 - 6504 1158 ## COG0284 Orotidine-5'-phosphate decarboxylase 9 1 Op 9 . 
- CDS 6538 - 7824 1367 ## COG0044 Dihydroorotase and related cyclic amidohydrolases - Prom 8058 - 8117 9.6 + Prom 8017 - 8076 17.4 10 2 Tu 1 . + CDS 8100 - 8300 342 ## COG1278 Cold shock proteins + Term 8371 - 8398 1.2 - Term 8348 - 8398 2.6 11 3 Op 1 . - CDS 8433 - 8906 550 ## gi|253579495|ref|ZP_04856764.1| predicted protein 12 3 Op 2 . - CDS 8981 - 9820 923 ## COG1307 Uncharacterized protein conserved in bacteria - Prom 9859 - 9918 5.7 + Prom 9892 - 9951 9.4 13 4 Tu 1 . + CDS 10075 - 10575 295 ## Cphy_3370 hypothetical protein - Term 10489 - 10532 2.1 14 5 Tu 1 . - CDS 10589 - 10918 426 ## Dhaf_3021 sporulation transcriptional activator Spo0A - Prom 11106 - 11165 5.5 - Term 11157 - 11198 7.4 15 6 Tu 1 . - CDS 11272 - 13344 2155 ## COG0326 Molecular chaperone, HSP90 family - Prom 13527 - 13586 5.0 - Term 13525 - 13563 6.3 16 7 Tu 1 . - CDS 13613 - 16321 2991 ## Dhaf_2434 cell wall/surface repeat protein - Prom 16470 - 16529 7.5 - Term 16519 - 16582 13.3 17 8 Op 1 . - CDS 16645 - 16902 386 ## EUBREC_1570 hypothetical protein 18 8 Op 2 . - CDS 16904 - 17707 819 ## COG0561 Predicted hydrolases of the HAD superfamily 19 8 Op 3 . - CDS 17700 - 18365 860 ## COG0546 Predicted phosphatases 20 8 Op 4 . - CDS 18372 - 19040 619 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation - Prom 19164 - 19223 5.9 + Prom 19028 - 19087 7.6 21 9 Tu 1 . + CDS 19185 - 19580 509 ## gi|253579505|ref|ZP_04856774.1| predicted protein - Term 19600 - 19654 17.3 22 10 Tu 1 . 
- CDS 19690 - 21357 2074 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Prom 21405 - 21464 9.4 Predicted protein(s) >gi|226332928|gb|ACII01000091.1| GENE 1 64 - 2097 2193 677 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00547 NR:ns ## KEGG: EUBELI_00547 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 5 675 6 686 690 699 53.0 0 MDEWVMERYELAKERIAQIPDEKMAEEPYCDFFTKEAMFLQQVIHVMDHGIAGKNLEELQ EQNHELYQDILPENYENSYGNPAYAQKMLGEYGKVFTFLYTELHGTIAYAFEKKAWDLTV VLELFLEIYSAFTQEEIPAEKQVKEILVSYVNDYCQDMVENRIREAIDPENNFVVEMIMD SDLSDLSYLYQSGEYVSENELKTAEFMNSLSQREIDEMARTYTEGYRMGFITGRKDITKK KTVNIRYHLGFERMVKAAVLQFREMGLQTVIYRHALHAVNRRNQFRNGFTGGIANPQFDY DHRQDSALFLNPDFVKRKLRAMQTSYDEYADLADVHGGPAVIETFGEKPFSPVSKPESWA FTEAQQKLQLELDNESGQITNRYIKGEERSFTIIAYPIPEIGEDFPEIFREIVKINTLDY KKYQKIQQTIIDTLDTCEWVEIKGKGENETDLLIHLHTLTDAKTQTNFENCVADVNIPLG EVFTSPVLAGTGGMLHVSKVYLNGLQFCDLKLVFDCGQVIDYSCSNFETEEENRKYIEDN ILFHHPKLAMGEFAIGTNTTAYVAAQKYNIADKLPILIAEKMGPHFAVGDTCYSWSEDTP VYNPDGKEIIARDNEISEMRKEDVSLAYYGCHTDITIPYDELGSIRTIDDEGDMISIIED GRFVLPGTEALNEPFGE >gi|226332928|gb|ACII01000091.1| GENE 2 2136 - 2456 374 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579486|ref|ZP_04856755.1| ## NR: gi|253579486|ref|ZP_04856755.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 106 1 106 106 166 100.0 3e-40 MAKRYRYAFAKKREAQQGKLSAVLAAVSLVLFVTAVLLAFFLQGQYGYIVGGISLCAMLL SVYGFAMGLKSFSEENRTHKAGMIGSIANGIIMIGWLGIFLMGISG >gi|226332928|gb|ACII01000091.1| GENE 3 2440 - 2913 320 157 aa, chain - ## HITS:1 COG:CAC0829 KEGG:ns NR:ns ## COG: CAC0829 COG4767 # Protein_GI_number: 15894116 # Func_class: V Defense mechanisms # Function: Glycopeptide antibiotics resistance protein # Organism: Clostridium acetobutylicum # 16 139 18 143 308 79 37.0 2e-15 MKKETKHIIRTLGTILFILYVLALIYFLFFSEEYGRAAMEERQYRYNLIPFVEIRRFWVY RKQLGLMAVVTNLFGNVIGFLPFGFILPVILDKMRSGWLIVLAGFGLSVTVEVIQLITKV GCFDVDDMILNTAGAALGYLLFFICDHLRRKIYGKKI >gi|226332928|gb|ACII01000091.1| GENE 4 2939 - 3091 150 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291546925|emb|CBL20033.1| ## NR: gi|291546925|emb|CBL20033.1| hypothetical protein [Ruminococcus sp. SR1/5] # 1 50 35 84 84 62 78.0 7e-09 MNEKIYKTMSSTGACSLTVGIIVLATGIVTGIMMIVNGARLLKKKSEILI >gi|226332928|gb|ACII01000091.1| GENE 5 3170 - 3850 948 226 aa, chain - ## HITS:1 COG:CAC0027 KEGG:ns NR:ns ## COG: CAC0027 COG0461 # Protein_GI_number: 15893325 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 1 226 1 224 224 265 60.0 5e-71 MEQYKQEFIEFMVDSEVLKFGEFTLKSGRKSPFFMNAGGYVTGSQLKKLGEYYAHAIHDK YGDDFDVLFGPAYKGIPLSVVTAIAFSELYGKEIRYCSDRKEEKDHGADKGSFLGSKLKD GDRVVMIEDVTTSGKSMEETVPKVKGAADVEIVGLMVSLDRMEVGKGGVKCALDEVHDLY GFETNAIVTMREVIEHLYNKEYKGKVIIDDTLKAAIDAYYEQYGAK >gi|226332928|gb|ACII01000091.1| GENE 6 3877 - 4779 1244 300 aa, chain - ## HITS:1 COG:BS_pyrD KEGG:ns NR:ns ## COG: BS_pyrD COG0167 # Protein_GI_number: 16078618 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Bacillus subtilis # 3 297 2 297 311 298 50.0 7e-81 MDMSVNIAGVEWKNPVTVASGTFGSGAEFADYVDINKLGAVTTKGVANVPWAGNPTPRVA EVYGGMMNAIGLQNPGIDLFCERDIPFLKQYDTKIIVNVCGHAPEEYLAVVERLADQPID MMEINISCPNVNAGFLAFGQDAHHVEELTAQIKKIAKQPVIMKLTPNVTDITEIAKGAEA 
GGADAVSLINTLTGMKIDINRKTFALANKTGGVSGPIVKPIAVRMVYQVAQAVNIPIIGM GGISCAEDAIEFILAGASAVSVGTANFHNPAVTLEVIDGIEAYMKKNGFNSVKEMVGIVK >gi|226332928|gb|ACII01000091.1| GENE 7 4779 - 5567 805 262 aa, chain - ## HITS:1 COG:BH2535 KEGG:ns NR:ns ## COG: BH2535 COG0543 # Protein_GI_number: 15615098 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Bacillus halodurans # 4 260 2 257 259 208 43.0 1e-53 MSEKTREICTVVSQESIGAGIYSMWIQTDRIAADAKPGQFVSLYTNDKSKILPRPISLCE IDKENGRLHLVYRVTGQGTGTDEFSQMKAGDTIPVLGPLGNGFPVEKAEGKKVFLMGGGI GVPPILELAKQMKCEKKQIIAGYRDCHTFLREEFEAAGTLYIATEDGSVGTKGNVMDAIR ENALEADVIYACGPTPMLRAIKKYAEENGIECYISLEERMACGIGACLACVCKSREKDAH SNVNNKRICKDGPVFLSTEVEI >gi|226332928|gb|ACII01000091.1| GENE 8 5578 - 6504 1158 308 aa, chain - ## HITS:1 COG:CAC2652 KEGG:ns NR:ns ## COG: CAC2652 COG0284 # Protein_GI_number: 15895910 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Clostridium acetobutylicum # 1 305 2 286 286 199 37.0 4e-51 MINKLIENIKKTNAPIVVGLDPMLNYIPQHIQKKAFSELGETLEGAAEAIWQYNKGIVDA ISDLIPAVKPQIAMYEQFGIPGLMAYKNTIDYCKEKGLVVIGDIKRGDIGSTSAAYAVGH LGKVQVGSNRIAGFDEDFATINPYMGSDSVNPFIDVCKEENKGLFVLVKTSNPSSGEFQD RIIDGRPLYEWVGEKVAQWGESHMGNEYSYVGAVVGATYPEMGKTLRKIMPKTFILVPGY GAQGGKGADLVHFFNEDGLGAIVNSSRGIIAAYKQEKYASFGELNYADASRQAVKDMIED ISTALNNR >gi|226332928|gb|ACII01000091.1| GENE 9 6538 - 7824 1367 428 aa, chain - ## HITS:1 COG:BH2538 KEGG:ns NR:ns ## COG: BH2538 COG0044 # Protein_GI_number: 15615101 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Bacillus halodurans # 1 426 1 422 428 345 44.0 9e-95 MGILIQGGRIVDAATDTDKKGDIYLEDGVITEIGEKLKIKDKSDKVIDAKGCLVMPGLID LHVHFRDPGQTQKEDIETGSRAAARGGVTTVVAMPNTTPVIDSPDRVNYVHNKAKQLAGI HVLQAGAITQGEKGQELSDIEGMVKAGIPALSEDGKSVMNTRLCKEAMEVAEKFNVPIFA 
HCEDIDLRGDGCMNEDENARRLGLPGICNAVEDVIAARDILLARETGARLHLCHCSTEGV AKMMEIVKEEGLDNITAEVCPHHFILTSDDIKCDDPNYKMNPPLRTKKDVDALIRGLKDG SFKVISTDHAPHAVPEKTGSIRNAAFGIVGIETSFALSYTALVETGILTISQLIEKMSWN PAQILGSDRGTLQKGHPADIVIADIDHEYKIDKNEFASKGRNTPFDGWKVKGKILYTICD GKVVYQNA >gi|226332928|gb|ACII01000091.1| GENE 10 8100 - 8300 342 66 aa, chain + ## HITS:1 COG:BH3610 KEGG:ns NR:ns ## COG: BH3610 COG1278 # Protein_GI_number: 15616172 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Bacillus halodurans # 3 66 2 65 65 91 75.0 3e-19 MNKGTVKWFNAEKGYGFITGEDGADVFVHFSAINGEGFKSLEEGQAVTYDLTEGARGMQA ANVTKL >gi|226332928|gb|ACII01000091.1| GENE 11 8433 - 8906 550 157 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579495|ref|ZP_04856764.1| ## NR: gi|253579495|ref|ZP_04856764.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 157 7 163 163 162 100.0 5e-39 MSQKKVDAYKARKGLHNKTDRKEKVLFGLEMFAWAFICVVIVAWIGYSAYVKVTGAKENV VQNTVMDTTALDNYISNLSTDASDGTDSTDADTAEEEDTTASTDTDSEDADAADTTADDS ETETADKSEEADDTNVADDKAAETADSTTDADSADAK >gi|226332928|gb|ACII01000091.1| GENE 12 8981 - 9820 923 279 aa, chain - ## HITS:1 COG:SP0742 KEGG:ns NR:ns ## COG: SP0742 COG1307 # Protein_GI_number: 15900637 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 1 278 1 277 281 163 35.0 4e-40 MRYKIIIDSCGELLDEWKKDECFESIPLTLMVGAEQIIDDETFDQAEFINKVAACPECPK SACPSPERYMRAYDCEAEHIYAVTLSSELSGSYNSALLGRDLIMEDHPDKKIHVFNSRSA SIGESLIGMKIQECEEAGMSFEEVVSTVEHYIEGQHTFFILENLDTLRKNGRLSKVKALV ASALKIKPVMGSTDDGNICQLDQARGMNRALIKLVEQVIEKTPDSAEKVLAISHCNCPAR AQVLKEAFEERMKLAKIVVLDTAGVSSMYANDGGVIVAV >gi|226332928|gb|ACII01000091.1| GENE 13 10075 - 10575 295 166 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3370 NR:ns ## KEGG: Cphy_3370 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 165 1 164 164 115 37.0 5e-25 MLALKITDVKDFMNKLLIGEVFDRFSLVEASVTTFNTFTINGKLHYDFFDTDTKAAFEEN 
STEYSLWHDVKPYCFSIIRGRRTPLNFRIVFELSHDQTQSLLKNEQHIADIPVQSFYLNI RYKNQSLLCTTGVSYTSFSPDKRLEHLWDDSMTVFLSSHHIPCEKL >gi|226332928|gb|ACII01000091.1| GENE 14 10589 - 10918 426 109 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_3021 NR:ns ## KEGG: Dhaf_3021 # Name: not_defined # Def: sporulation transcriptional activator Spo0A # Organism: D.hafniense_DCB-2 # Pathway: Two-component system [PATH:dhd02020] # 2 108 152 262 264 75 37.0 6e-13 MSAVYGVIRKLGATSKYKGYYYVVDAVEMAQKIYERPVKVTKDIYPVIARKYKSTPSNVE HNIRTLVNLCWMNHKDTLEEMAGCTMADKPTNSEFIDILVYYLRYSDKD >gi|226332928|gb|ACII01000091.1| GENE 15 11272 - 13344 2155 690 aa, chain - ## HITS:1 COG:alr2323 KEGG:ns NR:ns ## COG: alr2323 COG0326 # Protein_GI_number: 17229815 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Nostoc sp. PCC 7120 # 278 690 216 654 658 280 38.0 5e-75 MAVKHGNLSINSENIFPIIKKWLYSDHDIFVRELVSNGCDAITKLKKLEVMGEYTFPEGY KPEIQVIVNPDEKTLKFIDNGIGMTADEVEKYITQIAFSGATQFLEKYKDKTTDDQMIGH FGLGFYSAFMVADEVHIDTLSYTEGAQPVHWSCDGGTEYDIQEGNKNTIGTEITLFLNED CLEFANEYRMREILEKYCSFMPVHIYLSKANAPQEYETIDESELRDDDVVVEHIHEEAKT EEKENDKGEKEVVEISPAKDKVKINKRPVSISDINPLWMKHPNECTDDEYKEFYRKVFMD YKEPLFWIHLNMDYPFNLKGILYFPKINTEYDSIEGTIKLYNNQVFIADNIKEVIPEFLL LLKGVIDCPDLPLNVSRSALQNDGFVQKISEYISKKVADKLTGMCKTDRESYEKYWDDIS PFIKYGCIKDAKFAEKMGDYVLFKNLDGKYLTLKDCVEENKKETEETEAQTEEKKEDAAE SENKDNAEAKEPEKTVIFYVTDEVQQSQYINMFKEAGKDAVILKHNIDSAFISSLEQKHQ EVQFKRIDADLTEEMKGEGTADEETVKALTELFRKSLNKDKLEVHVENLKNENVSAMMTL SEESRRMQDMMKMYNMYGMDPNMFGGQETLVLNANHPLVKYLAENQESDKAPLICEQLYD LAMMSHKQLSPDEMTRFVQRSNEILLMIAK >gi|226332928|gb|ACII01000091.1| GENE 16 13613 - 16321 2991 902 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2434 NR:ns ## KEGG: Dhaf_2434 # Name: not_defined # Def: cell wall/surface repeat protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 525 871 1290 1619 2207 127 32.0 2e-27 MERKRFLAVVMTVMMVISMIPSMVFAAAPSGELGGKLKIKGLAAVGVVLSADYAKVTPEE 
VTDEDFTFSWSRQTGEKELTQVGTEKTYTITQDDLGCRLVLDLTPVDGSGLTGTLTAKTL EVAATEEEAKAVEQEAEDAQQTETADGTENAQLEETADGTEDAQLEETADNAEGSQDSEN QDVDRTEDVQQNAGNEQEDESQNTKGQPEETTEGTDDSQMKIYTEDQLQVDENGNVQTDG SEKDGQAAGEDDKDKKTYEATVTVEDSEDQICDFGTVKSDLEDVEAQYVQITNTGNESLN FQEISPEHFMAQDIIEPLSAGESVSVWIQPREGLEPGEYDDMITYRTEEGIEVSFEAKIA VEGEDDSSDDEDLKPSDEPSAEPTETPDDSEAPSAGDNADASLEAQSLEVNTGSLNFSDI EENYTQANETLSVTVINTGDGIITLKIPKSDYFEVMNEDGSEAVSGIQIAKGDSLMFMVQ PKTGLTKGDYSDTLIFESEEDSEVAAQVTAEVSVKEAEQEQITAVQADPESFSYDDLKEG YDTPEATTITLTNTGNTTVSLMQPYAEYFDIGELSASVLEPGDSAAFTAVPVTGLKVGNY LDSIQIAQTSSEGQEDVLTTIKASATVSEVKKIYKLSVTPEELNFGKAKEGYSEAPEAQK VTVTNEGNTNVTLNAPSGKNFKIGKLSATELAPGESCTFKIRPKAGLKAGSYTESVVIDN EQQISAEVKVQFTVKAARKAKIADPADNKITGISSDGYTTQSKITFTAVGAGMDNESPGK GDVRYVPYNWKVINTNSWSSAPYTAAFGITKAGTYTLTVTFDRQKYNGSEWENTGEQDTK QVNFSITQAQTVTATPTPQPNGATAKSAVKTGDTTNITPFVIILAIAAGCIVGVVVYKRR KK >gi|226332928|gb|ACII01000091.1| GENE 17 16645 - 16902 386 85 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1570 NR:ns ## KEGG: EUBREC_1570 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 85 9 92 92 79 57.0 5e-14 MFTEECINTFLENQEQLFPQAVAESYEAAEAFLEDCMAQVVDSIEEVREYLEESGADVEG MSDEELEDASEVFALPDGKYLIVEG >gi|226332928|gb|ACII01000091.1| GENE 18 16904 - 17707 819 267 aa, chain - ## HITS:1 COG:CAC0522 KEGG:ns NR:ns ## COG: CAC0522 COG0561 # Protein_GI_number: 15893812 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 1 247 1 245 265 130 30.0 4e-30 MIKLVASDLDGTLLLHKAQSLPEEIFSLIRQLEELGIMFVAASGRQYPNMTKLFAPVASE ISYISENGALAVDHGEVLYQDSFDRKLAGEIISAILEKKDAEFTCSAKDYHYLMPKTKRF HDHMLYEVKNECRFVNSMEEMTAPIMKLAVFEPAGLTEESVKYWMDRFGKECVVVTSGNE WIDFIPFGTNKAKGIREYQKRYHISPEECIAFGDEYNDIEMLKAVKYGFAMEHSKEGVRA ATSYMTKQVEPVLEKLIRAKGKIEEVI >gi|226332928|gb|ACII01000091.1| GENE 19 17700 - 18365 860 221 aa, chain - ## HITS:1 COG:TP0554 KEGG:ns NR:ns ## COG: TP0554 COG0546 # Protein_GI_number: 15639543 # Func_class: R 
General function prediction only # Function: Predicted phosphatases # Organism: Treponema pallidum # 1 213 4 218 222 159 38.0 3e-39 MIKACIFDLDGTLADTLDSMAYVTNIIMEKFGLKTLPVDNFRYYSGEGANMLIRRALKDA GDPELAHYDEGQKLYREMFEADPMYKVVPYKGMPETLKKLKEHGMKLAVCSNKPHPAAVK VIAQLFDGEFDMVVGQSEAIRRKPAPDGPLMVAEKFGVKPEECMYVGDTSTDMKTGKAAG MYTVGALWGFRDRKELNENGADLVAEKPTDLVKICEEHGND >gi|226332928|gb|ACII01000091.1| GENE 20 18372 - 19040 619 222 aa, chain - ## HITS:1 COG:BH3033 KEGG:ns NR:ns ## COG: BH3033 COG0424 # Protein_GI_number: 15615595 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus halodurans # 26 221 1 189 190 157 44.0 2e-38 MERVNESSCIPKRYKVYYDRKEEVIMRKIILASASPRRKELLERAGVDFEVLPASGDENR ISDNPGEAVKQLASDKAASVIRTMKDSADGTIVIGSDTVVVFENVILGKPHDTEDAVNTL KKLQASTHQVYTGVSVWEKKEKVWTEHTFYESTDVTFYPVSDEEIREYVATGEPMDKAGS YGIQGLFGIYVKGINGDYNNVVGLPVARLFYEMKKSGINLRG >gi|226332928|gb|ACII01000091.1| GENE 21 19185 - 19580 509 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579505|ref|ZP_04856774.1| ## NR: gi|253579505|ref|ZP_04856774.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 131 4 134 134 92 100.0 1e-17 MAKINKYWGYVAIGALTAAAAGAVAALIAKKGSCKEDFDFEDDFDDFDLTDEKETAKEDQ EPAEDFQSWEHTGEQSATDDTENEEAADTDDSEKSSDKAAEVEIKDAEEEEDADEPENFE ETMDAASESEE >gi|226332928|gb|ACII01000091.1| GENE 22 19690 - 21357 2074 555 aa, chain - ## HITS:1 COG:VC0997 KEGG:ns NR:ns ## COG: VC0997 COG0008 # Protein_GI_number: 15641012 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Vibrio cholerae # 2 550 3 550 556 642 57.0 0 MENETVSKNFIENIIDKDLAEGVYDTVHTRFPPEPNGYLHIGHAKSILLNYGLAQEYNGK FNMRFDDTNPTKEKSEFVESIKADIKWLGADWEDRLFFASDYFGQMYEAAVKLIQKGKAY VCDLTADQIREYRGTLTEPGKESPYRNRSVEENLQLFEEMKEGKYADGEKVLRAKIDMAS PNMNMRDPVIYRVAHMTHHRTGDTWCIYPMYDFAHPIEDAIEGITHSICTLEFEDHRPLY DWVVKELEYPQPPKQIEFAKLYLTNVVTGKRYIKKLVEEGIVDGWDDPRLVSIAALRRRG FTPESIKMFVELCGVSKANSSVDYAMLEYCIREDLKLKRPRLMAVLDPIKLVIDNYPEGE VEYLEAPNNMENEKLGTRKIPFGRELYIEREDFMVDPPKKYKRMFPGTEVRLMNAYFVTC TGYEADEDGTVRVVHCTYDPATKGGNAPDGRKVKGTIHWVEASQAGKAEVRLYENIVDEE KGVYNKEDGSLNVNPNSLTKVTAYVEPALMEAKGYDSFQFVRTGFFCADIHDSKEGAPVF NRIVSLKSSFKLPKA Prediction of potential genes in microbial genomes Time: Sat May 28 20:06:09 2011 Seq name: gi|226332927|gb|ACII01000092.1| Ruminococcus sp. 5_1_39B_FAA cont1.92, whole genome shotgun sequence Length of sequence - 32236 bp Number of predicted genes - 25, with homology - 24 Number of transcription units - 13, operones - 3 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 50 - 109 8.2 1 1 Tu 1 . + CDS 133 - 642 363 ## COG4636 Uncharacterized protein conserved in cyanobacteria + Term 790 - 841 11.6 - Term 782 - 826 6.0 2 2 Tu 1 . 
- CDS 873 - 962 59 ## - Prom 1005 - 1064 6.9 - TRNA 1272 - 1343 79.4 # Thr GGT 0 0 - Term 1174 - 1213 5.4 3 3 Op 1 1/0.000 - CDS 1458 - 2057 673 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 2112 - 2171 2.2 4 3 Op 2 7/0.000 - CDS 2213 - 2950 699 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 5 3 Op 3 8/0.000 - CDS 2947 - 3405 242 ## PROTEIN SUPPORTED gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 6 3 Op 4 2/0.000 - CDS 3411 - 4820 1662 ## COG0215 Cysteinyl-tRNA synthetase 7 3 Op 5 . - CDS 4848 - 5330 695 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 8 3 Op 6 . - CDS 5333 - 6607 1261 ## COG0371 Glycerol dehydrogenase and related enzymes - Prom 6726 - 6785 6.3 9 4 Tu 1 . - CDS 6788 - 8281 1342 ## COG2367 Beta-lactamase class A - Prom 8324 - 8383 10.1 10 5 Tu 1 . - CDS 8407 - 10764 1750 ## COG1199 Rad3-related DNA helicases - Prom 10813 - 10872 7.3 - Term 10993 - 11038 8.0 11 6 Tu 1 . - CDS 11121 - 12440 1606 ## COG0527 Aspartokinases - Prom 12521 - 12580 5.1 12 7 Tu 1 . - CDS 12586 - 13296 510 ## COG1720 Uncharacterized conserved protein - Prom 13342 - 13401 1.9 - Term 13404 - 13449 11.4 13 8 Tu 1 . - CDS 13453 - 14841 1005 ## PROTEIN SUPPORTED gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 - Prom 15074 - 15133 6.3 14 9 Tu 1 . + CDS 15222 - 17174 2213 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases + Term 17272 - 17342 7.6 - Term 17276 - 17314 2.2 15 10 Op 1 2/0.000 - CDS 17342 - 18694 1187 ## COG1641 Uncharacterized conserved protein 16 10 Op 2 . - CDS 18697 - 19458 833 ## COG1691 NCAIR mutase (PurE)-related proteins - Prom 19506 - 19565 2.0 17 11 Tu 1 . - CDS 19567 - 20751 1188 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Prom 20951 - 21010 7.5 - Term 21035 - 21084 5.9 18 12 Tu 1 . 
- CDS 21085 - 22485 1701 ## COG1070 Sugar (pentulose and hexulose) kinases - Prom 22604 - 22663 4.5 - Term 22678 - 22718 11.2 19 13 Op 1 16/0.000 - CDS 22787 - 23872 1490 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 23916 - 23975 4.3 20 13 Op 2 21/0.000 - CDS 24110 - 25090 1202 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 21 13 Op 3 21/0.000 - CDS 25102 - 26613 189 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 22 13 Op 4 16/0.000 - CDS 26634 - 27611 1035 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components - Prom 27652 - 27711 3.7 - Term 27696 - 27747 2.8 23 13 Op 5 1/0.000 - CDS 27796 - 28773 1065 ## COG1879 ABC-type sugar transport system, periplasmic component 24 13 Op 6 7/0.000 - CDS 28760 - 30565 1665 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 25 13 Op 7 . - CDS 30562 - 32181 1426 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain Predicted protein(s) >gi|226332927|gb|ACII01000092.1| GENE 1 133 - 642 363 169 aa, chain + ## HITS:1 COG:all3650 KEGG:ns NR:ns ## COG: all3650 COG4636 # Protein_GI_number: 17231142 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in cyanobacteria # Organism: Nostoc sp. 
PCC 7120 # 52 164 76 187 188 65 36.0 6e-11 MSNALPIEDLSKTYLEHSMVINNFVIKIGSQIKDSLCRVFGDGVQYEWRENDDKVIIPDV SIICNLRDRKNISFTGIPRFVMEVLSNATEEYDRHEKMNIYCKVGVSEYWIVDWRKKSVE IYLFDFKEDGTGYPYLYKTVTAQNKEDLHLVMFPNLKITFDELFDIGEY >gi|226332927|gb|ACII01000092.1| GENE 2 873 - 962 59 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MENLAVSFLVTFMASIAAYYVCKWLDGDK >gi|226332927|gb|ACII01000092.1| GENE 3 1458 - 2057 673 199 aa, chain - ## HITS:1 COG:BH0115 KEGG:ns NR:ns ## COG: BH0115 COG1595 # Protein_GI_number: 15612678 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 4 195 14 211 217 166 44.0 3e-41 MKQYDGIKDEELISRFKNGESEILDYLMEKYKNMVRKKARTMFLIGGENDDLIQEGMIGL FKAVRDYQPDRDAAFQTFASICVDRQIYNAIQSSNRQKHQPLNSYISLSEQDGENEEHLG DNWGENPESIIIDQENVQDLEQEITATLSPMENQVLEYYLAGNGYGEIAQIMGKTPKSID NALQRIRIKIREQLEQYQK >gi|226332927|gb|ACII01000092.1| GENE 4 2213 - 2950 699 245 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 3 241 7 246 255 273 54 8e-73 MSEQIEGRNAVLEAFRSGKCVDKLFILDGCQDGPVRTIAREARKKDTIINYVAKERLDQL SETGAHQGVIAQVAAYEYASVEDILAKAREKGEDPFIFILDNIEDPHNLGAIIRTANLAG AHGVIIPKRRAVGLTSTVAKTSAGALNYTPVAKVTNLGHTIDELKEQGMWFVCADMGGET MYNLNLTGPIGVVIGNEGEGVSRLIREKCDFVASIPMKGDIDSLNASVAAGVLGYEIVRQ RLQKK >gi|226332927|gb|ACII01000092.1| GENE 5 2947 - 3405 242 152 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 [Bacillus selenitireducens MLS10] # 36 147 16 129 141 97 44 7e-20 MSGTNGLNEENLSSLKGMLHQLFHLEDKDLRTYSPLTLAYIGDGVYELVIRTILVKKGNC PVNQLHRKASSLVKAGTQSSMMEVIEPMLTEEEHSVYRRGRNAHSPTMAKHATMADYRRA TGFEALMGYLYLKDDFSRIIELVRAGIGEDKI >gi|226332927|gb|ACII01000092.1| GENE 6 3411 - 4820 1662 469 aa, chain - ## HITS:1 COG:BH0111 KEGG:ns NR:ns ## COG: BH0111 COG0215 # Protein_GI_number: 15612674 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # 
Organism: Bacillus halodurans # 1 469 3 465 466 508 56.0 1e-143 MKLFNTLTRRKEEFVPLEEGKVRMYVCGPTVYNLIHIGNARPMIIFDTVRRYMEYKGYEV NYVSNFTDVDDKIIKKAIEEGVSAQEISERYIAECKKDMDGMNVKPATTNPQATQEINGM ISMIQTLVDKGYAYPAADGTVYFRVKKFKEYGKLSHKNLDDLQSGFRSLQVSGEDQKEDP LDFVLWKPKKEGEPSWPSPWCDGRPGWHIECSVMAKKYLGDEIDIHAGGEDLIFPHHENE IAQSECCNDKIFAKYWMHNAFLNIDNRKMSKSLGNFRTVREISEQYDLQVLRFFMLNAHY RSPLNFSADLMEAAKNALERITDAAANLKDRKAAAQTETATDAEKELLAQAQEFVKKFEE AMDDDFNTADALAAIFELVKFANTNVSEASSAEFAGTLLDTMVKLCDVLGLKAVKTEEIL DKEIEDLIAERQEARKAKNFARADEIRDELLAKGIILKDTREGVKWKRA >gi|226332927|gb|ACII01000092.1| GENE 7 4848 - 5330 695 160 aa, chain - ## HITS:1 COG:CAC0434 KEGG:ns NR:ns ## COG: CAC0434 COG0245 # Protein_GI_number: 15893725 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Clostridium acetobutylicum # 1 154 1 154 155 202 63.0 3e-52 MRVGMGYDVHRLTEDRDLILGGVKIDWEKGLLGHSDADVLIHAVMDALLGAAALGDIGKH FPDTDPAYKGISSILLLEHVTKLLREHHYEIGNIDATIIAQKPKMAPHIPQMRANMAKAM GINESQLNIKATTEEKLGFTGREEGIASQAICLLNERKES >gi|226332927|gb|ACII01000092.1| GENE 8 5333 - 6607 1261 424 aa, chain - ## HITS:1 COG:BH1862 KEGG:ns NR:ns ## COG: BH1862 COG0371 # Protein_GI_number: 15614425 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Bacillus halodurans # 5 413 8 389 399 192 32.0 9e-49 MRVDADDFARPCSCGREHQIAVKEILIEAGAVEKLEEEMSEGMLREYISPLVICDTNTYA ATEELMEDIYDRCQVLVLDAEGLQADRHAIKIVENNMEEDIDLILAVGAGTIHDISRYIA HNYKVPFISVPTAASGDGFVTTVAAMTLDGVKKTVPSVAPICVYADTDIFSKAPQRLTAA GISDLMAKYICLADWKIANLVTGEYFCRETVKLEEKALKTVKSSIQDITEGEEDECEQLM YALILSGLAMQMIGNSRPASCAEHQVTHLWDMEVINGPLDALHGEKVSVAALLVLEEYKR IAAAITQGRCHAKPYENEDEELLKETFGKKGLLEEIRKENEPELLETISPQHLEKCLNGI EEIIDELPSEQTMFHLLEKAGCAKTVYDIGLDEAAVLPSLRLAPYTRRRLSLLRISKMLD IRGE >gi|226332927|gb|ACII01000092.1| GENE 9 6788 - 8281 1342 497 aa, chain - ## HITS:1 COG:FN1584 KEGG:ns NR:ns ## COG: FN1584 COG2367 # Protein_GI_number: 19704905 # Func_class: V Defense 
mechanisms # Function: Beta-lactamase class A # Organism: Fusobacterium nucleatum # 248 495 8 259 264 84 28.0 3e-16 MRKNRQTAKYHIKSYIIAGSLAGAISMALALPVPVMAAQQSLASVTGQAQIQPMGDGSGK YMMKSDGFYCLDVNGAGSTQAEIHYFQDYEIDGTVFDGYYYHDADGKFKACSPHMEHLKG VAVFGDKTDEEADTQNAQEAEKFDGYYFVNNLGRLSAAPQVRYIDNLAIDGITLNGYYYF DENGRLVTEPGIHSLEMDCCEMNFDGSYYFGGTNGALLQESTVTDDGFIVDDTGKIVNMD DLGMDNLKPQLEKMLSGYQGTWSVYVKDLNEEKEILINDTSLYSASLIKAFVMAKTYEDM EQVKADEAKKLNTADTKTVDVKLNDLLWNMITVSDNESCNELVKLQTDSLDFKKGAEDIN KYLEKEGYTETSVQHTLHPAASAQESLGGRNMTSVKDCGTLLEKIYKGECVSKEVSEEML NLLSNQENTWKIPQGLPDGIKSANKTGETDQDQHDIAIVYGEKTTYILCVMSENCPESTA VTNIQNISKIVYNYLNL >gi|226332927|gb|ACII01000092.1| GENE 10 8407 - 10764 1750 785 aa, chain - ## HITS:1 COG:CAC1672 KEGG:ns NR:ns ## COG: CAC1672 COG1199 # Protein_GI_number: 15894949 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Clostridium acetobutylicum # 6 778 8 788 791 633 42.0 0 MDKPVVRISVRNLVEFILRSGDLDNRGGSSDREAMQKGSRLHRKIQGRMGSHYRAEVSLK YKTEYEDVSIQVEGRADGIFTEDGQCWIDEIKGVYADVSQLEKPVKVHSAQAMCYAWIYA QEQKPEKIGVQMTYGNLDTEELKFFREEYTLEELGLWYQNLLDRYHKWIAYQFAWKKERN ASMSDLEFPFEYREGQRKIVSGVYHTISTERQIFVQAPTGVGKTMSTIFPAVRAVGAGLG ENIFYLTAKTITRTVAEEAFYILKEHGLKFKVITITAKEKLCFCDKTECNPENCLWARGH LDRVNDAVFELWTTQDSYDRDTLLEYAKKWQVCPFEMCLDLAVWVDAVICDYNYVFDPNV YLKRFFGEGTSGEYIFLIDEAHNLVDRGREMYSAHVDRADVLEAKRLAGDYSKGLVRALE KVNRQLRTLEKECTEYEILPNPGAVSLGMLQVMGEMDKLLEELHGKELPEQLLEFYFCVR DFLNIDELLDENYVVYTEMGEGGKVILRLFCVNPAANIHRCLEKGKSAVFFSATLLPMDY YRALLSTRKDDYGIYVTSPFRQENRCILTGRDVSSRYTRRGYEEYHRIASYIARTVWTRK GNYMVFFPSYKFMEDVLEVYENEFSAEWVRCISQTSGMNEREKEEFLEEFSASEGTLVGF CVMGGIFSEGIDLMGEKLIGAIIIGTGLPQIGTEREILRQYYDKKGVNGFDYAYRYPGMN KVLQSAGRVIRTQEDTGIILLLDERFAGKDYRNLFPAEWSDRGNCTLNTVEEQLGSFWNR IREKN >gi|226332927|gb|ACII01000092.1| GENE 11 11121 - 12440 1606 439 aa, chain - ## HITS:1 COG:CAC0278 KEGG:ns NR:ns ## COG: CAC0278 COG0527 # Protein_GI_number: 15893570 # Func_class: E Amino acid transport and metabolism # Function: 
Aspartokinases # Organism: Clostridium acetobutylicum # 4 434 5 436 437 477 54.0 1e-134 MKKIVKFGGSSLANAEQFQKVGEIIRSDESRRYVVPSAPGKRFDGDTKVTDLLYKCYNTA VEGEDFIPILQEIKGRYYEIIRGLNLDLSLEDEFAQIEADFKAQAGTDYAASRGEFLNGK VMAAYLGYEFVDAADVIRFDKNGNLDAEKTDRLLSKKLAKCEHAVIPGFYGAGEDGKVKT FSRGGSDVTGSLVAKAIKADLYENWTDVSGFLVTDPRIVKNPEVIESITYRELRELSYMG ATVLHEDAIFPVRKEGIPINIKNTNRPEDKGTFIVESTCKKPKFTITGIAGKKGFCSINI EKSMMNSEVGFGRKVLQVFEDQGISFEHVPSGIDTMTVYVHQDEFEEKEQQVIAGIHRAV QPDFVEMESDLALIAVVGRGMKSTRGTAGRIFSALAHANVNVKMIDQGSSELNIIIGVEN RDFETAVKAIYDIFVITQI >gi|226332927|gb|ACII01000092.1| GENE 12 12586 - 13296 510 236 aa, chain - ## HITS:1 COG:NMA0245 KEGG:ns NR:ns ## COG: NMA0245 COG1720 # Protein_GI_number: 15793263 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis Z2491 # 1 211 5 212 226 163 41.0 3e-40 MVPIAHIENDFPTKFGIPRQSGRVGALKARIVFESEYRNVDACRGLEEFSHIWLIWEFSE AKRTKWSPTVRPPRLGGNVRKGVFATRSPFRPNSIGLSCVKLEKVALDEPDSPVLYVEGA DLMNGTPIFDIKPYIPYADCHPEATGSFTEYSKDHHLNVEFPQELLERIPGESREALIAV LADDPRPAYQNDPERSYGMPFGEKDIHFRVDGDILRVYNVTEFVKKQQESSADCGK >gi|226332927|gb|ACII01000092.1| GENE 13 13453 - 14841 1005 462 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 [Haemophilus influenzae PittAA] # 4 462 2 445 456 391 45 1e-108 MLQTIESINNVVNNFIWGVPAMICIIGVGLYLSLRTGFVQIRKFPYALKTTLGRIFKKKE ASDGSMTPFQAVCTALAGTVGTGNIAGVAGAIAIGGPGAVFWMWISALLGMCTKFTEVTL AVHFRERNRHGDYVGGPMYYIKNGLGKNWRFLAVLYSAFGVLTVFGTGNATQVNTITTAI DTALINFNVISESSTGRLNLILGIVITLLVGMVLLGGIKRIGSVSEKLVPFMALFYIVLA LGVVVLNIGRVPAVFHDIVYGAFNPSAVTGGVIGSFFLSMKKGVSRGIFSNEAGLGTGSI AHACADTSKPVKQGMFGIFEVFADTIVICTLTALVILVSGVPVNYGAAAGAELTISGFTA TYGGWVSIFTAVAMCCFAFSTIIGWGLYGARCIEFLFSAKVIRPFMIAYSLVAILGATVD LGLLWSIAETFNGLMAIPNLIGVFLLSGIAITLTKEYFAQKK >gi|226332927|gb|ACII01000092.1| GENE 14 15222 - 17174 2213 650 aa, chain + ## HITS:1 COG:lin0222 KEGG:ns NR:ns ## COG: lin0222 COG1501 # Protein_GI_number: 16799299 # Func_class: G Carbohydrate transport and 
metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Listeria innocua # 50 648 145 707 763 358 30.0 2e-98 MIQKYTYGTPFETDAIVTSIPASEGTPAYGTISTENGFSFTYDMDDNDVVYGLGESNRGI NKRGYVYESNCSDQPNHTEDKISLYGAHNFLVISGKQTFGLFVDYPGKLKFDIGYTLSSQ LKITCEDADLYLYVIEGETPYDIVKQFRKIIGRSYIPPKFAFGFGQSRWGYTTADDFRKA ADGYRENNIPIDMVYMDIDYMEDYKDFTVNQENFPDFEAYVNEMKEKGIHLVPIIDAGVK VEAGYDVYEEGVEKNYFCKREDGSDFVSAVWPGWTHFPDVLNADARAWFGQKYERLISKG IDGFWNDMNEPAMFCTPEGVAELKEYIKDNFMDKEEAPGFTLGDKVNALANNPEDYKRFY HNVNGQKIRHDKVHNLFGYNMTRAAGEAFEKIAPGKRFLMFSRSSYVGMHRYGGIWMGDN KSWWSHILLNLKMLPSLNMCGFLYTGADLGGFGADTTRDLVLRWLALGVFTPLMRNHSAL GTREQEAYQFENIEDFRHVIGVRYRLVPYLYSEYMKAALNDDMMFRPLAFDYTDDAFATQ VEDQLLLGNEIMIAPVYTQNAKGRYVYLPEEMMFVKFLPDGTISQEILEKGHHYVEVALN EVPLFIRKGKAIPVADVAQTVKDIDPATIRMIGYEGAEYDRYDDDGVSTL >gi|226332927|gb|ACII01000092.1| GENE 15 17342 - 18694 1187 450 aa, chain - ## HITS:1 COG:CAC0774 KEGG:ns NR:ns ## COG: CAC0774 COG1641 # Protein_GI_number: 15894061 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 3 442 2 414 420 254 37.0 2e-67 MGKTLYLECYSGISGDMTVAALLDLGGDRTVLDKVLRSLPISGFETKISRVVKSGIDACD FDVVLDKDHENYDHDMEYLHGHHHEGHESNHAHGTGTAQDHHHHEHRGIKEITYIIEHSA MTENAKKIALRIFEILAEAESKAHNVPVDQVHFHEVGAVDSIVDIVSVAVCLDDLDVTEV IVPVLCEGRGTVRCQHGILPIPVPAVANIVSANHLHLKMTEVEGELVTPTGAAIVAAVKT KDKLPETFEIQKIGIGAGKRQYECPGILRAMIISQSAEIDEEKAQTEEFKNAEIGNNPKA ENQETKDTIIKMETNIDDCSGEVLGFVMERLMKAGARDVHYVPVFMKKNRPAWVLNVICK EEDIETLQNIIFEETTTIGIRYSIMERTILPRETRTLPTPWGEVLAKVCTLNGKEQLYPE YESVAQLSREKEIPFAEIYRYIVLANKDKE >gi|226332927|gb|ACII01000092.1| GENE 16 18697 - 19458 833 253 aa, chain - ## HITS:1 COG:CAC0776 KEGG:ns NR:ns ## COG: CAC0776 COG1691 # Protein_GI_number: 15894063 # Func_class: R General function prediction only # Function: NCAIR mutase (PurE)-related proteins # Organism: Clostridium acetobutylicum # 6 244 9 248 248 235 48.0 6e-62 MTLQEILESVKSGSVSVEEAERILKKESYEEMGYAKLDTSRKARTGFAEVIYCSNKADEH 
LLNIFERLYREDGEVFGTRASQHQYELVKEKFPETEYDPISRILKVEKKDKKRIGKIAVC SAGTADIPVAEEAAQTAEYFGTNVERIYDVGVSGMHRLFSRLETIQSANCVIAVAGMEGA LASVMGGLVSRPVIAVPTSVGYGASMHGLSALLTMINSCANGIAVVNIDNGYGAGYLATQ INRLAAGTDEKGN >gi|226332927|gb|ACII01000092.1| GENE 17 19567 - 20751 1188 394 aa, chain - ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 111 368 18 269 277 213 41.0 6e-55 MNTISVEELEKIDSSKITIVDVRPADQYSRGSFPGAVNIPLDEFEERMESVDREKMVYVL CHTGDRSRDCVEKLSDAGYEAVNIEGGYRAYLRLSLSRFMENDAKDQKELKTKEIEHSII KTFRKTVWRPFTKALNEYQLIQEGDKIAVCISGGKDSMLMAKLLQELKRHGKIHFELVFL VMNPGYNADNWKIIQDNAELLGIPLTVFESDIFDTVAEIENNPCYLCARMRRGYLYSHAK ELGCNKIALGHHFDDVIETILMGMLYSGKVETMMPKLHSQNFEGMELIRPMYLIKESAIK AWRDTNGLHFIQCACRFTENCVSCGGGRGSKRDEMKELVAQFRNTSSVIETNIFNSVRDI NLRTVMGYHKDGEYYNFLDDYDQRGNKGVDKDKE >gi|226332927|gb|ACII01000092.1| GENE 18 21085 - 22485 1701 466 aa, chain - ## HITS:1 COG:BH1551 KEGG:ns NR:ns ## COG: BH1551 COG1070 # Protein_GI_number: 15614114 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus halodurans # 4 465 3 461 467 488 51.0 1e-138 MAAYYLAVDIGASSGRHIIGHMENGKMVLEEIYRFENGMVKKDGELCWEFDRLFKEVVNG LKKCKEIGKIPVSMGVDTWGVDFVLLDKNDNVLGNTVGYRDHRTEGMDKEVYKAISLKDL YSRTGIQKADYNTIYQLMAVKKKHPEYLEQAETLLHVPDYFHFLLTGQKTCEYTEATTGQ LVSPITKDWDYELIDMLGYPRKMFQRLIMPGTGIGHLSDKIREEVGFDLEVVAPATHDTG SAVLAVPANDDDFIYISSGTWSLMGIERKEADCSEKSCEMNFTNEGGYAGRFRYLKNIMG LWMIQSVRHEVNDAYSFAEICAMAEEAKDFPSRVDANDECFLSPDNMTEEVKDYCRRTGQ KVPETLGEIATVIYTSLAECYAKTAKELEEMTGRTYSRIHIVGGGSNAGYLNELTAKATK KEIHAGPGEATAIGNITAQMLKAEEFKTIEEARTIIHESFGVKVYK >gi|226332927|gb|ACII01000092.1| GENE 19 22787 - 23872 1490 361 aa, chain - ## HITS:1 COG:AGc5109 KEGG:ns NR:ns ## COG: AGc5109 COG1879 # Protein_GI_number: 15890064 # Func_class: G 
Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 30 336 66 332 357 92 29.0 2e-18 MCAAMVAGMSASSVMAADMPEQFKDLKANEAYDFPMMVKSFQSTYWDAAQEGMKKAADEL GVTYKAQGPNSESDIADQVNMINTAIAAKPAGLGLAACDTSSVLDALQECADKGIPVVTF DTGIADAPEGSVVCEVCTDNTQAGSVAAENMYNSIKDVIANADGQVIIGEVNQDATGQNI QQRGGGFIDKMIELIQADGKTVAVKGNEFYVNAAKGADAKEADADVVIQVAVPAQTTVEL CSTEAQAILSQENCIAIFGSNQTAAEGVLAANANLNVLGSDAGAGDVIGVGFDAGSIIKA AVQDGTFIGAVTQSPLMMGYYAIYALTAAANGQELEDVPTDGYWYDSTNMDSEEIAPNLY D >gi|226332927|gb|ACII01000092.1| GENE 20 24110 - 25090 1202 326 aa, chain - ## HITS:1 COG:YPO0859 KEGG:ns NR:ns ## COG: YPO0859 COG1172 # Protein_GI_number: 16121167 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 52 316 67 331 336 145 34.0 1e-34 MTKKKKSILSDQRFIVLLVIIVLFAIFSFKSREFRQYTTILSMLDFSYYDLLMAIGVTFP LITGGVDLSIGTGMVCYALIAGSLVRNNNLPVVLAMLLCIVLGIIVGAANGVLIGIMNLP PFLATLCTCMITRGAGSLCSATPWPGLTQEGGWFHSIFKITVGTGRSASRYPIGFLWMII LVLVMEYVLNHTKFGRYTIAIGSNKEAAALSGINVKFYHVMVYVVCGLFTGLAAIAYAAV TPTVQPGTGAGLEMDAIGGVFVGGVAATGGYGSVIGTLAGIFVIMLLKTGLPYIGLQANW QQIITGAVLIIAVLIDIMKEKKAAAK >gi|226332927|gb|ACII01000092.1| GENE 21 25102 - 26613 189 503 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 260 488 12 229 312 77 25 1e-13 MGEVILTMKDIDKSFPGVHALDHVNFEVKRGEVHALMGENGAGKSTLMKVLTGIYQKDSG SITYKGKETEFHNTREAQDAGVVIVHQELNMVGDLTVAQNIFIGREPKKGFSIDDKKMIE DSKKLFQELNIEINPKEKMNNLTVGKQQMCEIAKAISHKAEVIIFDEPSAALTEKEIADL FEIIRDLRKKGLGIVYISHRMDEIKTITDRVTVMRDGGYVGTLITADSTKEDIINMMVGR VIYEDPKEHSMVAPDAPVVLKVENLNAGKMVQNVSFELRKGEILGFSGLMGAGRTETARA LFGADPKQSGKISIRGKDGQLREVTINSPQDAVKYGIGYLSEDRRRYGCVVQKSVTENTT LATMEEFTSGIFINKSKEKEVSEKYVKELATKTPNCEQLVVNLSGGNQQKVVIAKWLTRD SEILIFDEPTRGIDVGAKNEIYKLMNRLAAEGKSIIMISSEMTEVLRMSDRIIVMCEGKI 
TGNIDISEATQEHIMNHATRNIN >gi|226332927|gb|ACII01000092.1| GENE 22 26634 - 27611 1035 325 aa, chain - ## HITS:1 COG:mlr3338 KEGG:ns NR:ns ## COG: mlr3338 COG1172 # Protein_GI_number: 13472896 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Mesorhizobium loti # 37 322 39 319 322 141 32.0 2e-33 MEKLKQNPIVKKLGLNRILLACILVLMFVVFKVVLGSKFPVGDSIKSTLNYVYFLGFLSL GVTFVIATGGIDFSIGPVMFCCALISGYCMTSYHVPCAAAMVICILIGFAFGVFNGWMVS YMSVPPFIISMASMNIAKGIASVFTKTQSVSWPLGSDPVNGWFRNLISYKGFPVGLVIFL AAAVICGIILYNTKPGRYILCLGSNSEAVRLSGVNTKKWRMLAYVICGVLVGIGAIFFVG AYTTVQPGYGDQYNNEAIAGCVMGGTSMVGGLASIGGTVIGVFIISLLQQGIMAFGLGKG QQMIITGLIVIVAVYVDVSARRRKN >gi|226332927|gb|ACII01000092.1| GENE 23 27796 - 28773 1065 325 aa, chain - ## HITS:1 COG:mlr3334 KEGG:ns NR:ns ## COG: mlr3334 COG1879 # Protein_GI_number: 13472894 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 12 323 12 328 331 128 26.0 2e-29 MKSHKKEIITVAILMAAAVVIFAGILKPEATQTKKCSLIYIPKIRDNTNDFWTSVISGCK MAAEEYESDLEILAPDKEENIEEQNKLLKKAIEQKPDAILFSPSSMDSSDELLKEAKEKG IRITYIDSYTKEKLQDLTVATDNVNAGRMLGEYARKLIDKDSKIAIVSHVKGVSTAVERE QGFREGLGDYADNIVDIVYCNSLYEKSYELAQELMRKYPDLELIAGMNEYSAVGVGRAVS DAGAKDKIAVVGVDCSQEAINLMEMGVYKGIIVQKAFRMGYIGVEETIHMLNGDAVEKNI DSGCELVTPENMYNSDIERLIFPFS >gi|226332927|gb|ACII01000092.1| GENE 24 28760 - 30565 1665 601 aa, chain - ## HITS:1 COG:SP0662 KEGG:ns NR:ns ## COG: SP0662 COG2972 # Protein_GI_number: 15900563 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 18 594 12 561 563 247 31.0 4e-65 MKISSKFKSIQSSIFCAVSILVLSAVLVVTVVSLRYTNSSIYENSVMYTQTIIKQLNQNI DSYISYMDNIASVIAQSGDAYKYLYSEKGYGATKDENYSEYRQRLVEQFKTILKGRADIR NIGIVREDKNSPSLFDNGLSVRNTYVDLNTQPWYADAVGKYDRYNLTSSHVQNVIKGERP 
WVITLSRGIRNYTGTEAEDGVVFLDLNYSAISELCAQSSMGDKGYVFILDQNGNIVYHPQ QQQLYNELQTENISLVMNAKSDIVTVGKGDDEKIYALSHSDITGWTIVGCMNMSELLRNS RQTRSIYVLVAVGLIIVALLISSLIARNITLPIQKLRDSMKSVQKGNFDIEDIEVISDNE IGSLTRSFNVMTHRIRELMEQNVKEQEQKRKIELKALQSQINPHFLYNTLDSIIWMAEGK KNEEVVIMTASLARLLRQSISNEDELVTVGQEIEYVRSYLTIQKMRYKDKLEFEIKADPS ITQVPIIRLVLQPLVENAIYHGLKYKDSKGLLTVHGYMKGENAVIDITDDGVGMDEETLK HIYDKHKVNYRSNGVGVYNVQQRLVLYYGKDYGIIYHSEKGKGTTASVVIPGIQEESHEK S >gi|226332927|gb|ACII01000092.1| GENE 25 30562 - 32181 1426 539 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 533 1 519 525 172 24.0 2e-42 MYKILLVDDEILVRDAIRENIDWKSLDCELVGDCENGRQAVEFVQSHKVDVVLTDICMPY MDGMELSEFLHDNYPDILIVIFSGFGEFEYAKKAIRYNVSEYMLKPVTAMELTKVLRNMK EKLDSRKKEQKKMESLSQTSKDYHKNVDVIRSKALETLVNCTRDVQVSLLELKKLGISFD CSSYRVAVFDMDTYSEMYQVDMHKQQESALMSFVLFNIGNEIVTRENAGVAYQEGSNRVC IIFTGCRSREFGDKIHSICQEIQQKVKEVIGIETSIGIGSWVRSPQELVYSYKLAEKAIG YRYLLGGSLLLDMEEKKTDNSINLIKSLETLTEEIKVGNRQKVTEILEQIEHEIKGALVE KSYACIYLQQVIRAIGNTCQSLSDDPEKIIAQREKLLKEVSQAKTFDKAVTLVKEYAEGV FESLQDLNSSSGQRQGMMAMDYIRKNYMDPDLSLNSICSYLNISTSYFSTIFKEMTGETF VESLTRIRMEKAKELLENTTLKNYEIAEKVGFSDPHYFGISFKKMTGKTPTEYAREKRR Prediction of potential genes in microbial genomes Time: Sat May 28 20:06:21 2011 Seq name: gi|226332926|gb|ACII01000093.1| Ruminococcus sp. 5_1_39B_FAA cont1.93, whole genome shotgun sequence Length of sequence - 19195 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 6, operones - 3 average op.length - 5.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 121 - 173 15.2 1 1 Tu 1 . - CDS 204 - 2000 2164 ## COG2407 L-fucose isomerase and related proteins - Prom 2036 - 2095 2.0 + Prom 2405 - 2464 8.4 2 2 Tu 1 . 
+ CDS 2489 - 3337 470 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 3359 - 3398 -0.2 - Term 3342 - 3389 10.2 3 3 Op 1 . - CDS 3457 - 4539 937 ## COG2096 Uncharacterized conserved protein 4 3 Op 2 . - CDS 4608 - 5759 1115 ## COG1454 Alcohol dehydrogenase, class IV 5 3 Op 3 . - CDS 5784 - 6548 657 ## COG1349 Transcriptional regulators of sugar metabolism 6 3 Op 4 2/0.000 - CDS 6616 - 7164 802 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 7 3 Op 5 . - CDS 7181 - 8509 1241 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC 8 3 Op 6 . - CDS 8514 - 8780 355 ## COG4576 Carbon dioxide concentrating mechanism/carboxysome shell protein - Prom 8800 - 8859 2.1 - Term 8800 - 8837 7.1 9 4 Op 1 2/0.000 - CDS 8913 - 9557 742 ## COG4869 Propanediol utilization protein - Prom 9582 - 9641 5.3 - Term 9584 - 9629 9.3 10 4 Op 2 . - CDS 9643 - 10119 555 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 11 4 Op 3 . - CDS 10119 - 10421 379 ## Cphy_1181 microcompartments protein 12 4 Op 4 . - CDS 10458 - 10733 470 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 13 4 Op 5 . - CDS 10806 - 11120 411 ## Cphy_1180 microcompartments protein 14 4 Op 6 1/0.000 - CDS 11216 - 12439 1504 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 15 4 Op 7 1/0.000 - CDS 12467 - 13855 874 ## PROTEIN SUPPORTED gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P 16 4 Op 8 . - CDS 13885 - 14748 1028 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases - Prom 14794 - 14853 8.2 17 5 Op 1 11/0.000 - CDS 14938 - 15729 851 ## COG1180 Pyruvate-formate lyase-activating enzyme 18 5 Op 2 . - CDS 15803 - 18328 2348 ## COG1882 Pyruvate-formate lyase - Prom 18422 - 18481 6.7 - Term 18567 - 18618 16.3 19 6 Tu 1 . 
- CDS 18712 - 19095 385 ## gi|253579551|ref|ZP_04856820.1| predicted protein - Prom 19124 - 19183 5.0 Predicted protein(s) >gi|226332926|gb|ACII01000093.1| GENE 1 204 - 2000 2164 598 aa, chain - ## HITS:1 COG:SP2158 KEGG:ns NR:ns ## COG: SP2158 COG2407 # Protein_GI_number: 15901968 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 10 598 4 588 588 780 63.0 0 MAKSRLIGSYPVIGIRPTIDGRRGALDVRGSLEDQTMNMAKSAAKLFEENLRYSNGEPVK VVIADTTIGRVGESAACADKFRKEGVDITLTVTPCWCYGAETMDMDPQTIKAVWGFNGTE RPGAVYLASVLATHAQKGLPAFGIYGHDVQEADDTSIPEDVKEKLLRFGRAAVAAASMRG KSYLQIGSVTMGIGGSIIDSDFIESYLGMRVESVDEVEIIRRMTEGIYDHAEFDKALKWA KETCKIGWDKNPEELQFSAEKKEEQFEFVVKMAVIIKELMNGCDNLDPKFSEEAIGHNAL AAGFQGQRQWTDFYPNGDFAEAMLNTSFDWNGAREPYILATENDVLNGLGMLFMKLLTNR AQIFADVRTYWSPEAVKKATGYDLEGVAKEAGGFLHLINSGAACLDANGEAKEADGTAVM KQWWDITEEDQKAIMENTEWCMADNGYFRGGGYSSRYETRAQMPATMIRLNLVKGLGPVL QIAEGWTVALPEEVSDKLWKRTDYTWPCTWFAPRCDGKEGPFKTAYDVMNNWGANHGAIS YGHIGADLITMCSMLRIPVSMHNVPEKDIYRPAAWNAFGMDKEGADFRACKNYGPLYK >gi|226332926|gb|ACII01000093.1| GENE 2 2489 - 3337 470 282 aa, chain + ## HITS:1 COG:BH1906 KEGG:ns NR:ns ## COG: BH1906 COG2207 # Protein_GI_number: 15614469 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 1 281 1 287 299 104 26.0 2e-22 MHDLDHNEDRPRGTYEFPFEFHHIDHTHPRYVMSYHWHVEYEIIRILEGSLTVTLDEKSF TAVKNDVIFVHSGILHSGIPHDCVYQCIVFDMNTFLKNNPVCGEYIQKILHQETMLFPHF SDEHPQILNCISVLFDAMYEKDTGYELTVFGQFYHFFGLIFSNHYFIENPTRTRRDYKRI LQLKQVLEFIEKNYANPITLQELSASVSMSPKYFCRFFSEMTHQTPVDYLNRQRIEEACL QLAATDDSITEIAYRNGFNDLSYFIRTFKKYKGMTPGKYKRR >gi|226332926|gb|ACII01000093.1| GENE 3 3457 - 4539 937 360 aa, chain - ## HITS:1 COG:lin1128_1 KEGG:ns NR:ns ## COG: lin1128_1 COG2096 # Protein_GI_number: 16800197 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 161 1 165 190 84 33.0 2e-16 
MSIYTKRGDRGITDMAHVGNISKSDDRIRLMGEADELNSHIELVKSMLRQSEILQFLERI QKNLDLIAAGVSNPYDRDCKVSEKETAVLEAETDKLEALCEKPVPEKFTGKSRLAAEIDI ARTVARRTERCLAQVSVKFGADTESKRFLNRLSDYLYILARYIEYYSDLSVELSDDHAGA LAKTSDNNTDALPEPSDSNTDVLKKTADINEAVIQEVLKRMGIQSRITLDTAKKLIERLE QEAVRRGQRAVIAVCNPEGNPVAVHVMDGAFLVSFDVAMKKAYTAVAVKMSTMELSKIAQ PGGTFYGVDKLDGGKIVIFGGGIPLKSGNTIIGGLGISGGTGEEDHSLAEYGQSVLNEIL >gi|226332926|gb|ACII01000093.1| GENE 4 4608 - 5759 1115 383 aa, chain - ## HITS:1 COG:lin1135 KEGG:ns NR:ns ## COG: lin1135 COG1454 # Protein_GI_number: 16800204 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Listeria innocua # 1 380 1 377 379 346 46.0 4e-95 MNTFEMKTAIHFGDNALDRLKEIPYKRVLIITDPFVVQSKMINLITAPLNSAGIEYDIFH DVVPDAPVDKIAEGVKKFLEYQPEAVVAVGGGSAIDSSKAIREFALKINNYGKVGLIAIP TTSGTGSEVTSFAVVNDTAAHVKYPLISESLTADEAILDAELVRSVPPAITADTGMDVFT HALESYVSTAHNEFSSALAEKAIEICGVFLLRAYLDGSDMHARKKMHVASCLAGLSFNTA GLGITHSMAHQLGAMFHIPHGRANAMLLPHIVEFNSDINKHSKSQKEYLPAVKRYANVAH ILGLSNYNKVMTVRSLVNWIQFMQKEMNIPLTIQELGTIAPEEYFAAIDKMADAALADAC TVNNPRVPTKEDIIKIYTKLWSF >gi|226332926|gb|ACII01000093.1| GENE 5 5784 - 6548 657 254 aa, chain - ## HITS:1 COG:BS_yulB KEGG:ns NR:ns ## COG: BS_yulB COG1349 # Protein_GI_number: 16080173 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Bacillus subtilis # 1 232 1 231 258 158 42.0 8e-39 MLALERRNLILEKLQAEKRVVVSELSQLYDVSEETIRRDLDKLEKEGLAIKSYGGAVINE DVSIDLPFNVRKNQNVTGKQKMAELAASLVKDGDHIFLDASTTAVFVAKALKEKERLTVI TNSMEILLELADVSGWNIISTGGVMKEGYLAFLGSKTDESIRSYYVDKVIFSCKALDLEW GIMESQEAFGSTKRAMIGSGRKRILVVDSSKFDQTAFSVAGSMKEVDIIVTDKEPTERWK KHFEKFNVKCLYPV >gi|226332926|gb|ACII01000093.1| GENE 6 6616 - 7164 802 182 aa, chain - ## HITS:1 COG:STM2054 KEGG:ns NR:ns ## COG: STM2054 COG4577 # Protein_GI_number: 16765384 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide 
concentrating mechanism/carboxysome shell protein # Organism: Salmonella typhimurium LT2 # 1 177 1 177 184 128 37.0 7e-30 MSKAIGMIEFKTTSTGVTAADAMVKTSEVEIVEAQTVCPGKYIAIITGDLSAVKAAVDTA VTTYEDKCIDSFVLGNPHESIFPAIYGTTQVEDISALGILETYDAASIIEAADQAAKTAI VDLIELRIAKGMCGKSYMMITGEVSAVEASIDRAKELVAAKGMYLDSSVIAHPDRRMIDS IL >gi|226332926|gb|ACII01000093.1| GENE 7 7181 - 8509 1241 442 aa, chain - ## HITS:1 COG:STM2053 KEGG:ns NR:ns ## COG: STM2053 COG4656 # Protein_GI_number: 16765383 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Salmonella typhimurium LT2 # 23 439 34 448 451 308 41.0 2e-83 MKIKELQDIIQQNGVVGAGGAGFPTYMKLTDKANTILMNCAECEPLLKLHRQLLEKHAYE IMKTFHMIQETVGASEAIIGIKKSYVQTIHALNQHIEEFPGMRLHLLDEVYPMGDEVVLI YEATGRVVRPGGLPIEQGVAVFNVETVYNVYRAVEEKQPVTDKYVSVVAEVSDPVTVRVP LGCTLEEVVAQAGSTTVKDPVYFVGGPMMGRIGNGSDPVTKTTNAILVLPKDHLIVAKKQ RTSSIDLKRAASICCQCNTCTDLCPRHNLGHPIDPAKFMRAASNNDFRDLNPYIDASFCS SCGVCEMYSCPQSLAPRSLLADMKGGLRKAGIRPPQGVQPKPVQESREYRKVPEERLMAR LGLTRYDKDAPLKEELVQVKKVRILLSQHIGAPAQAVVKAGDEVTRGQMIAQPAQGLSVG IHASVSGKVTEVTDRYIIIAVK >gi|226332926|gb|ACII01000093.1| GENE 8 8514 - 8780 355 88 aa, chain - ## HITS:1 COG:lin1127 KEGG:ns NR:ns ## COG: lin1127 COG4576 # Protein_GI_number: 16800196 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Listeria innocua # 1 82 1 83 87 65 45.0 3e-11 MIVGKVVGSVVSTRKSEKLIGSKFMIIEPVHHMKGDLSQLVAIDMIGAGVGEYVLVAQGS AARIGCGVETAPVDAAIVGIIDDGAGLE >gi|226332926|gb|ACII01000093.1| GENE 9 8913 - 9557 742 214 aa, chain - ## HITS:1 COG:TM0375 KEGG:ns NR:ns ## COG: TM0375 COG4869 # Protein_GI_number: 15643143 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Propanediol utilization protein # Organism: Thermotoga maritima # 14 214 10 209 210 193 54.0 2e-49 
MESNNIELITRMVIQAINQNEKKGDGFMVPIGVSARHIHLTQEHVEALFGPGYQLTKKKE LMGGQFASNETVTIVGLKLRAIENVRILGPVRKASQVEVSATDAIKLGMNVPVRESGDVA GSAPIAIVGPKGAVYLKEGCIVAMRHIHMSPKDAQAAGVKDGDIVSVKADNERGTIFNQV KIRVDDSFTLEMHIDTDEANAAKIATGTTVTIIK >gi|226332926|gb|ACII01000093.1| GENE 10 9643 - 10119 555 158 aa, chain - ## HITS:1 COG:STM2038 KEGG:ns NR:ns ## COG: STM2038 COG4577 # Protein_GI_number: 16765368 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Salmonella typhimurium LT2 # 67 155 1 89 94 106 78.0 1e-23 MAEEKKPEADKNAPKKAAAPRRTKAKTSSKPEVKAQTADTAAETANKTESQKEQIKETVS HKEEKVMTQEALGMVETRGLTAAIEAADQMCKAANVALVGTEKIGSGLVTVMVRGDVGAV KSAVESGSAAASRLGELVATHVIPRPHTDVEKILPVLK >gi|226332926|gb|ACII01000093.1| GENE 11 10119 - 10421 379 100 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1181 NR:ns ## KEGG: Cphy_1181 # Name: not_defined # Def: microcompartments protein # Organism: C.phytofermentans # Pathway: not_defined # 1 94 1 94 100 123 72.0 2e-27 MSKSYGFIEITGVVAAIDALDIMCKTADVELASWERKLGGRLVTIIVEGDVSAVTEAVNA AATGAIKKPVSYAVIARPHEEIVRMVELSASRWKNNQGDE >gi|226332926|gb|ACII01000093.1| GENE 12 10458 - 10733 470 91 aa, chain - ## HITS:1 COG:STM2038 KEGG:ns NR:ns ## COG: STM2038 COG4577 # Protein_GI_number: 16765368 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Salmonella typhimurium LT2 # 1 89 1 89 94 101 79.0 3e-22 MAQEALGMVETRGLTAAIEAADAMTKAAEVALVGTEKIGSGLVTVMVRGDVGAVKAAVES GSAAASRLGELVATHVIPRPHTDVEKILPSI >gi|226332926|gb|ACII01000093.1| GENE 13 10806 - 11120 411 104 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1180 NR:ns ## KEGG: Cphy_1180 # Name: not_defined # Def: microcompartments protein # Organism: C.phytofermentans # Pathway: not_defined # 1 104 1 104 104 108 60.0 7e-23 
MAEAVGILEVFGLATAFVAGDAGCKAANVRLEVFDKNKPANADSLPVPLLVCIKFRGSVT DVTAAVEAGMEVANRMTGVVQHYVIPNPEEGTEKMLKISALDKD >gi|226332926|gb|ACII01000093.1| GENE 14 11216 - 12439 1504 407 aa, chain - ## HITS:1 COG:TM0436 KEGG:ns NR:ns ## COG: TM0436 COG1063 # Protein_GI_number: 15643202 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Thermotoga maritima # 56 402 21 362 368 122 28.0 2e-27 MDMNNVNIEEIVKQVLSGMTGNAPAAATVSAPAATTGIPKTARVAVLTEKEHFELKEYPI PPIGDDDILVKVEGCGVCGTDAHEFKRDPFGLIPVALGHEGTGEIVAMGKNVKVDTAGKP VKVGDKVVTCMIFKDDPEITMFDLNKKNVGGADVYGLLPDDDVHLNGWFSDYIFLRGGNF GTTFFNVSDLDLDSRILIEPCAVLVHAVERAKTTGILRFNSRVVVQGCGPIGLICIAVLR TMGVEHICAVDGNEKRLEFAKRMGADTSVNFMNFKGIEALTEAVKEAQGGHLADFAFQCT GNPKAHANIYKFIRNGGGLCELGFFINGGDATINPHFDLCSKEINLVGSWVYTLRDYVTT FDFLKRANAIGLPMSELITDKFPLEQINEALQTNLAMTGLKIAVVNK >gi|226332926|gb|ACII01000093.1| GENE 15 12467 - 13855 874 462 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P [Lactobacillus reuteri DSM 20016] # 1 462 1 474 477 341 41 3e-93 MPINENMVQEIVQEVMAKMQIADAPTGKHGIFKEMNDAIEAAKKSQLIVKKMSMDQREKI ITCIRKKIKENAEVMARMGVEETGMGNVGDKILKHHLVADKTPGTEVITTTAWSGDRGLT LIEMGPFGVIGAITPCTNPSETILCNTMGMLAGGNTVVFNPHPAAIKTSIYAINLLNEAS LESGGPDNIAVTVEKPTLETSNVMMKHKDIPLIAATGGPGVVTAVLSSGKRGIGAGAGNP PALVDETADIRKAATDIVNGCTFDNNLPCIAEKEIVAVSSIVDELMHYLVTENDCYLASK EEQDKLTEVVLAGGKLNRKCVGRDARTLLSMIGVNAPANIRCIVFEGPKEHPLITTELMM PILGVVRARDFDDAVEQAVWLEHGNRHSAHIHSKNIDNITKYAKAIDTAILVKNAPSYAA LGFGGEGYCTFTIASRTGEGLTCASTFTKRRRCVMADSLCIR >gi|226332926|gb|ACII01000093.1| GENE 16 13885 - 14748 1028 287 aa, chain - ## HITS:1 COG:SMb20666 KEGG:ns NR:ns ## COG: SMb20666 COG0235 # Protein_GI_number: 16265121 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Sinorhizobium meliloti # 2 208 1 207 225 103 28.0 5e-22 
MVNEYEIKKQICDIGRRIYSRNMVAANDGNISVKLNDNEFLCTPTGVSKGFMTPEYICKV DREGNVIQANPGFKPSSEIKMHMRVYEKRPDVGSVVHAHPIYATSFAIAGIPLTQPIMPE AVISLGCVPIAEYGTPSTMEIPDNLEKYLPYFDAVLLENHGALTWSTDLNAAYMKMESVE FYAQLLYQSKVLGGPKEFDEKNIEKLYEIRRKFGMAGKHPANLCLNKDGKNCHNCGGACQ DGEFKKFPGYQYDFVGGDSKPAENNAASDAQLVAEITKQVMAQLGMK >gi|226332926|gb|ACII01000093.1| GENE 17 14938 - 15729 851 263 aa, chain - ## HITS:1 COG:SPy2055 KEGG:ns NR:ns ## COG: SPy2055 COG1180 # Protein_GI_number: 15675825 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Streptococcus pyogenes M1 GAS # 7 244 4 241 257 211 46.0 8e-55 MDYLDTKGRVFDVQKYSIHDGPGIRTIVFLKGCVLRCKWCCNPESQEYKIQTMKVQGEDK VIGRDVTVREMIEEVEKDRVYYYRSGGGMTLSGGECLCQPEFAGALLRAAKERGISTAIE SMACAKWETIETILPYLDTYLMDIKHTNPAKHKEFTGRSNELMMENARKVALSGKTRLVI RVPVIPTFNDTVEEIQGIARFADTLPGVDKIHLLPYHRLGQDKYEGLGRPYLMGNVEPPS KEHMETLKKAVHAVCGLDCQIGG >gi|226332926|gb|ACII01000093.1| GENE 18 15803 - 18328 2348 841 aa, chain - ## HITS:1 COG:pflD KEGG:ns NR:ns ## COG: pflD COG1882 # Protein_GI_number: 16131789 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 14 837 4 761 765 578 39.0 1e-164 MEKFQTSDIPKSPRIQKLVDALYEHMPVIESARAKLITESYKETEGEPIITRRAKAFAHI LHNIPIIIRDNELIVGSSTIAPRGCQTFPEFSYEWLEAELDTVATRTADPFEIAEETKAE LKEADKYWKGKTTSELATSYMAPEAIKAIEHNIFTPGNYFYNGVGHVTVKYWEVLETGFE GIMEKAQKELDGCSVGDGNYARKSHFLEAVILSCKAVIDYAGRYAKLAQEMAAQTSDPVR KQELFVIAENCSRVPAKGAQNFYEACQSFWFVQQLLQMESSGHSISPGRFDQYMYPYYKK DMEAGTITREFAQELMDCIWVKLNDLNKCRDAASAEGFAGYSLFQNLIAGGQNKEGEDVT NDLSVMCIQASMHVHLPAPSLSVRVWNGSPHEFLIKAAELTRTGIGLPAYYNDEVIIPAL QNRGLSLEDAREYNIIGCVEPQKAGKTEGWHDAAFFNMCRPLELVFSNGMDKGEMVGIPT GDVTQMKTFDEFFDAYKKQMEYCISLLVNADNAIDVAHAERCPLPFLSCMIDDCLKEGKS VQEGGAVYNFTGPQGFGIANMADGLFAIRKLVYEDKKVSMKELKEALAWNYDKGLDAQSA GDMTEMIMKAMQKAGRNVDASTAEGLLKTFMGMKPGEQKTQRFKEIHDMIDEVPKFGNDI PEVDYFAREVAYTYSKPLQKYNNPRGGKFQAGLYPVSANVPLGGQTGATPDGRYAHTPVA 
DGVSPSAGKDVKGPTAAATSVSRLDHFIVSNGTLFNQKFHPSALSGREGLEKFVALIRGY FDQKGMHMQFNVVDRQTLLDAQEHPEKYKHLVVRVAGYSALFTTLSRSLQDDIIRRTEQG F >gi|226332926|gb|ACII01000093.1| GENE 19 18712 - 19095 385 127 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579551|ref|ZP_04856820.1| ## NR: gi|253579551|ref|ZP_04856820.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 127 1 127 127 242 100.0 5e-63 MKITEELLNEMKIKDENFSDGLIKPDGDYVRIPRGHLHGMMELLPWTENEIWKMIPDDDS PLFWLIEKTGCVLTDYNNSIGMKMTPAQQTVFDMMRKHGVLTDDYYDLTKQREKVREARE QKENRKQ Prediction of potential genes in microbial genomes Time: Sat May 28 20:06:35 2011 Seq name: gi|226332925|gb|ACII01000094.1| Ruminococcus sp. 5_1_39B_FAA cont1.94, whole genome shotgun sequence Length of sequence - 8384 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 141 - 177 5.2 1 1 Op 1 . - CDS 248 - 1219 1018 ## SYNW1564 hypothetical protein 2 1 Op 2 . - CDS 1239 - 3515 1475 ## COG4870 Cysteine protease - Prom 3543 - 3602 8.4 3 2 Tu 1 . - CDS 3926 - 4222 257 ## COG2158 Uncharacterized protein containing a Zn-finger-like domain - Prom 4327 - 4386 9.6 4 3 Op 1 4/0.000 - CDS 4886 - 5401 569 ## COG4917 Ethanolamine utilization protein 5 3 Op 2 . - CDS 5490 - 5918 400 ## COG4810 Ethanolamine utilization protein 6 3 Op 3 . - CDS 5930 - 6295 408 ## COG4810 Ethanolamine utilization protein - Prom 6336 - 6395 2.5 - Term 6612 - 6663 9.2 7 4 Tu 1 . 
- CDS 6699 - 8144 1511 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 8209 - 8268 6.1 Predicted protein(s) >gi|226332925|gb|ACII01000094.1| GENE 1 248 - 1219 1018 323 aa, chain - ## HITS:1 COG:no KEGG:SYNW1564 NR:ns ## KEGG: SYNW1564 # Name: not_defined # Def: hypothetical protein # Organism: Synechococcus_WH8102 # Pathway: not_defined # 212 280 520 588 1154 76 52.0 1e-12 MKRRFGGTVIAMVLCFILTVVSVYADDGKGQNPDYVTDYYMIVQSTQGGVDIYDEADTQS VKLNDSKIPNGTAIHVLGEKNGVDNKKWAYTQYHGMNGYVPMNDLDPASREEAANEEYRT FGGKDVDFEVKVHGNDGNVSVYNGPGEKFDQVSGTEGIADGTTVHIFQYVQGEDGTNWGK TDTDGVTQGWLNLDRDTDYVNENVSADAPEATGTSGNVPVAAVTPTPEATPTPKATPTPE VTPTEKVTPTPEATPTEEATPTPEETEKPASTEDKESDDKQSQKTSGKNVKADSGMKNPV IWISGIGIIVIIILLIYFLKKKK >gi|226332925|gb|ACII01000094.1| GENE 2 1239 - 3515 1475 758 aa, chain - ## HITS:1 COG:MA3430_1 KEGG:ns NR:ns ## COG: MA3430_1 COG4870 # Protein_GI_number: 20092242 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cysteine protease # Organism: Methanosarcina acetivorans str.C2A # 139 507 104 449 516 137 29.0 8e-32 MKKRNTAIRALSVFLAYAMVCTSVPAAGQEMFGSGVNRETEENTSDLKEFQSSQADEFGT DTGSDAELFGSDDAKQEFQDGENSEEGTDGIRYIKGRPLTEEERKEELEPFKNLKPLDPG IEVESDLTSVYAAYGNREAAFPSSYDARKEGLVTPVKNQNPFGTCWAFGMAAIMETSLLA QNKGTYDLSEEHLSYFFSNRQNDPLGNTPDDKNYVLGNYHVIGGNDHLAAIYLSTWSGMT TEADVPFPTDSLHQNDLTVQIPESKAYNSAAYLKNASVSKYSEERMKEMLLNDHAVSIML YMKESYANPDTAAYCYPVGKSNSTVINHIVTVVGWDDTYSKDNFLPVSNVTSDGAWIIKN SWGEKKGDGGYYYLSYQDPNISKLVSAEAVAALDQKYRNNYFYDGSSALSVIPIQAGQSV AAVYETTAGKGKAEVLGEVNLVTNSDNACYKIQIYTDLTDPYDPESGTAAYAAPYEFEQP IAGVQTISVPEVVLKQGSRYSVVITNSGIDKISFGVEAKSSYGNWFTCTAGIETGQTFYK SASETARWTDGKTKNWTARIKAHTRTLNQSWVPDTPVFQVKAYNSGYNLISWKKVSGATG YYVYRKPAAGGKWSQIADVGTSELKYKDSKVTANASYRYTVKAYYEASGKRYSGKYKTGD VIKAAPAVQKVTSVKSEKNGIRIRWKPQKKCDGYYIYRKKKGGSYQLIKKISNGNSSSYL DKKAQKGVSYYYAVKAYVKEPYGNTYSKYKSSSAVKRK >gi|226332925|gb|ACII01000094.1| GENE 3 3926 - 4222 257 98 aa, chain - ## HITS:1 COG:CAC2444 KEGG:ns NR:ns ## COG: CAC2444 COG2158 # 
Protein_GI_number: 15895709 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a Zn-finger-like domain # Organism: Clostridium acetobutylicum # 11 97 5 88 88 81 50.0 3e-16 MEKPYWEGKEYSYFSHKKCEFFPCHKNADPDDFNCLFCYCPLYALGEKCGGNFRYTEKGI KDCTNCMLPHKRKNYGYVTGKYQELADMMKRAKESISK >gi|226332925|gb|ACII01000094.1| GENE 4 4886 - 5401 569 171 aa, chain - ## HITS:1 COG:STM2056 KEGG:ns NR:ns ## COG: STM2056 COG4917 # Protein_GI_number: 16765386 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Salmonella typhimurium LT2 # 1 144 1 145 150 114 40.0 1e-25 MKKIFLMGRSEAGKTSLTQALKGEKLHYVKTQYTNTDDDTIDSPGEYAESKHFSVGLACF SFEADVVAMVQSADEPFSLFGYGSNAFILRPLIGIITKIDSPYANVPMVRQWMLNAGCER IFLVNNVTGEGIDELRAFLNEDVPKLSLEEAKFRQSLGLNEWDPLPEGVEY >gi|226332925|gb|ACII01000094.1| GENE 5 5490 - 5918 400 142 aa, chain - ## HITS:1 COG:lin1108 KEGG:ns NR:ns ## COG: lin1108 COG4810 # Protein_GI_number: 16800177 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 25 142 7 116 116 96 44.0 2e-20 MAGMTTRELLEKLYNQDYEKLKGKKLRMTRVRVPGKEVCLAHVINPSQSCIYQNLGLHIG VHEGEDHTGEAIGMIRFTPWEAVVVAADIAVKSANVEIGFMDRFSGSLILTGGLTEVQTA VEEVVKFFREVLGFKTCAVHKS >gi|226332925|gb|ACII01000094.1| GENE 6 5930 - 6295 408 121 aa, chain - ## HITS:1 COG:lin1108 KEGG:ns NR:ns ## COG: lin1108 COG4810 # Protein_GI_number: 16800177 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 10 120 9 115 116 102 46.0 2e-22 MANDHVEHIRIVQETVAGKEITFAHVMGGPAPVIYQKLGLNPQVDYRSSAIGIMNMTPPE SAVIASDIAVKSGNVYLGFADRFTGTLIITGEISDVMSAMTEVVDYFRDTLGYVVCKITK R >gi|226332925|gb|ACII01000094.1| GENE 7 6699 - 8144 1511 481 aa, chain - ## HITS:1 COG:CAC0308_2 KEGG:ns NR:ns ## COG: CAC0308_2 COG0791 # Protein_GI_number: 15893600 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases 
(invasion-associated proteins) # Organism: Clostridium acetobutylicum # 362 478 48 167 167 97 45.0 8e-20 MKKRIVCLTLALLMSATQVVSVSASREDELREEQAITSQQLDATYSRMDELAYAKSQLEN EISELDSNLVSVMVSIDTLKGDIDNKEVDIIKTKQDLAKAQKARDKQYESMKLRIQALYE QGGDAAWFQMMLNSEDLSELLTRAENTQQMYEQDRKNLDKYVNTINEVNNLKTQYESDKA ELEEMKASYEQQSYDLQCQIDQKKSESADYENEIAYAQQQATEYANLLAEQTAEIQRLEA ERIAAEEEARRQAEEEAARQAAEEEAAAKAAEQEQDSESEDEENIEYDEDGNEIENTDDA DNESDESESDDGVEYDENGDPVDNAGASDDVEYDEYGNVIDSDNTVSPDDYESEQSSDSD SSSSSGSGSGSSVVDFATQFVGNPYVWGGTSLTGGADCSGFTQSVYANFGVSLPRTSYEQ QYAGTEVSYADAQPGDLICYGGHVAIYMGNGRIVHASNSVDGIKISDNAAYRTIVSVRRL V Prediction of potential genes in microbial genomes Time: Sat May 28 20:06:44 2011 Seq name: gi|226332924|gb|ACII01000095.1| Ruminococcus sp. 5_1_39B_FAA cont1.95, whole genome shotgun sequence Length of sequence - 7156 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 53 - 113 6.2 1 1 Tu 1 . - CDS 126 - 1598 1434 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 1737 - 1796 5.9 - Term 1820 - 1860 7.2 2 2 Tu 1 . - CDS 1950 - 2936 1219 ## Cphy_2659 diaminopimelate dehydrogenase (EC:1.4.1.16) - Prom 2956 - 3015 5.1 3 3 Op 1 . - CDS 3129 - 3776 730 ## COG2214 DnaJ-class molecular chaperone 4 3 Op 2 . - CDS 3769 - 4596 546 ## Cphy_0207 hypothetical protein 5 3 Op 3 . - CDS 4668 - 5078 421 ## Cphy_3541 Zn-finger containing protein 6 3 Op 4 . - CDS 5115 - 5624 718 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family - Prom 5657 - 5716 2.9 7 4 Tu 1 . 
- CDS 5873 - 7075 818 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative Predicted protein(s) >gi|226332924|gb|ACII01000095.1| GENE 1 126 - 1598 1434 490 aa, chain - ## HITS:1 COG:CAC1079_2 KEGG:ns NR:ns ## COG: CAC1079_2 COG5263 # Protein_GI_number: 15894364 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Clostridium acetobutylicum # 38 255 2057 2283 2566 105 37.0 2e-22 MQNVNRKFKIWGALLFLLFFAVGVSMYFTAANVQAATKTGFVTINGESYYINEDGSKQKG WLELEGKKYYFNTKTGVQVKGWVTDSKGRKRYFSKQAGIMMTGWVTDSKDQKRYFNPSTG FMQTKWLTLKGKRYYFYSNSGVAACKTFLTDSKKNTRYFTSACYMLTGWTKNSSNEYRYF ETEDGIMAKGFQTLDGKKYYFNTGSGKMAVGWTTIDGNKYYFDKETGVMATGDVTIDGQK YHFNSNGILSNTTSPTGSRTIKNYLAGALQPVGQALYVWGGGWNDSTRKGTSQTMTDFYN SQSSSYDYNNYRDLSTANRAKGFDCSGFVGWSAYQVMQSKSGVGSGYTVVSGEIGSYYKS MGWGSILTQANLASDDWTVYPGDVGYDSGHTWIILGQCADKSAVIVHSTPNAGVQIAGTP TPSGGYSSQAIALAQKYMSRYPGYTKYDYHTSSGNYIRRGNYLRWNRSTLSDPDGYMNMT ADQILADLFS >gi|226332924|gb|ACII01000095.1| GENE 2 1950 - 2936 1219 328 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2659 NR:ns ## KEGG: Cphy_2659 # Name: not_defined # Def: diaminopimelate dehydrogenase (EC:1.4.1.16) # Organism: C.phytofermentans # Pathway: Lysine biosynthesis [PATH:cpy00300] # 1 328 1 328 328 550 78.0 1e-155 MSIRIGILGYGNLGRGVECAIKHNPDLELVAVFTRRAPETVKILTETAAVYSVNDAEKMK DKIDVLIICGGSATDLPKQTPEYAKMFNVIDSFDTHARIPEHFDSVDAAAKESGHIGIIS VGWDPGMFSLNRLYANAILTNGKDYTFWGKGVSQGHSDAVRRIKGVKDAKQYTIPVEAAL EAVRNGENPDLTTRQKHTRECFVVAEEGADLAQIENDIKTMPNYFSDYDTTVHFISEEEL KRDHSGIPHGGFVIRSGKTGWNDENNHVIEYSLKLDSNPEFTSSVLVAYARAAYRMNKEG QSGAKTVFDVAPAYLCAADGAELRKHLL >gi|226332924|gb|ACII01000095.1| GENE 3 3129 - 3776 730 215 aa, chain - ## HITS:1 COG:CAC0648 KEGG:ns NR:ns ## COG: CAC0648 COG2214 # Protein_GI_number: 15893936 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone # Organism: Clostridium acetobutylicum # 1 207 1 183 195 83 31.0 3e-16 
MIDPYSILGISRDASDEEVKKAYRKMSRKYHPDANIDNPNKEQAEEKFKQVQQAYEQIMK EREQGIDYGNYGSNNYGGFGGFSGQADSGYQDKESMRRQAAANYIQSGHYREAVNVLQSL SQRNGQWYYLSSMANMGLGNNVNALNDIKQAIRLEPDNVQYQMVLQQMEGGENWYQEMQN PFGGMPTGGDDYCMKLCLANMACSLCCPGSGIFCC >gi|226332924|gb|ACII01000095.1| GENE 4 3769 - 4596 546 275 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0207 NR:ns ## KEGG: Cphy_0207 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 260 7 266 285 275 51.0 1e-72 MNKPEIKFKDFDLYRSFYCGLCRELKSKYGISGQISLTYDMTFVVILLSALYESPTQKGS TRCIIHPVCKQPVRRNTVTEYAADMNVLLTYYKCRDDWEDEKKVTALGYSKVLQGKVKKL DQKYPDKSRRIQKLLSELSEMEKSGEKDIDKMAGCFGKIMEEIFAWKQDVWEDTLRRMGF FLGKFIYLLDAYDDVEEDIKNKNYNPFSEQYIIEGFDEQVRRILIMMMAQTCREFEKLPI IKYTDILRNILYSGVWCRFEVIHKKRKEAGEKDND >gi|226332924|gb|ACII01000095.1| GENE 5 4668 - 5078 421 136 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3541 NR:ns ## KEGG: Cphy_3541 # Name: not_defined # Def: Zn-finger containing protein # Organism: C.phytofermentans # Pathway: not_defined # 2 135 4 132 133 105 40.0 3e-22 MKDKLNRFMQGRYGVDNFARFTLGVALFVIVVGSFMRQNAAGGVLDTVGFILIIYTYFRI LSRNISARYAENQKYLGYTQKIRSWFTREKNMMEQRKTHHIYTCPGCGQKIRIPRGKGQK VEIECPKCHEKFIKRR >gi|226332924|gb|ACII01000095.1| GENE 6 5115 - 5624 718 169 aa, chain - ## HITS:1 COG:TM0564 KEGG:ns NR:ns ## COG: TM0564 COG1853 # Protein_GI_number: 15643330 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Thermotoga maritima # 20 168 14 159 159 83 35.0 2e-16 MSFREVKAEELTMNPFTKIGKEWLLITAGNEEKCNTMTASWGAMGVMWGKNAVTVYIRPQ RYTKEFVDREDTFTISVLGEKYRKALNYCGKVSGKNADNKIKEAGLTPYFTDGTAGIEEA DMIMVCKKMYHDEIKPECFDAGENDGKWYPQKDYHTMYIAEILKVLVRE >gi|226332924|gb|ACII01000095.1| GENE 7 5873 - 7075 818 400 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 3 390 1 386 396 319 44 3e-87 
MGMAVVTLKKGEGRLLKSGGMWIFDNEIDTVMGNFENGDIVLVHDFDGYPMGRGFINTNS KITVRLMTRDENVDINEELLEKRVRDAWEYRKKVVDTGCCRLIFAEADFLPGIVVDKFSD VLVVQSLALGIDRFKETIVELLRKVLAEDGITIRGVYERSDVKVRKQEGMEMVKGFIGPE FPTLVQIEENGVKYEVDVKDGQKTGFFLDQKYNRLAIQKLCKGAKVLDCFTHTGSFALNA GIAGAESVTGVDASQLAVDQATANAALNGLSDSVKFICEDVFELLPELEEKGEKFDVVIL DPPAFTKSRNSIKNAVKGYREINLRAMKLVKDGGFLATCSCSHFMDYELFTKTIGQAAKN VHKRLRQVEYRTQAPDHPILWAADESYYLKFYIFQVCNDR Prediction of potential genes in microbial genomes Time: Sat May 28 20:06:57 2011 Seq name: gi|226332923|gb|ACII01000096.1| Ruminococcus sp. 5_1_39B_FAA cont1.96, whole genome shotgun sequence Length of sequence - 6914 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 115 - 169 11.3 1 1 Op 1 . - CDS 178 - 828 405 ## Cbei_2125 GCN5-related N-acetyltransferase 2 1 Op 2 . - CDS 782 - 2377 1848 ## COG4086 Predicted secreted protein - Prom 2400 - 2459 3.3 3 2 Tu 1 . - CDS 2490 - 3044 610 ## COG1971 Predicted membrane protein - Term 3758 - 3792 3.2 4 3 Op 1 . - CDS 3884 - 5563 1179 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 5 3 Op 2 . 
- CDS 5556 - 6866 627 ## COG1061 DNA or RNA helicases of superfamily II Predicted protein(s) >gi|226332923|gb|ACII01000096.1| GENE 1 178 - 828 405 216 aa, chain - ## HITS:1 COG:no KEGG:Cbei_2125 NR:ns ## KEGG: Cbei_2125 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.beijerinckii # Pathway: not_defined # 21 200 3 183 184 70 28.0 5e-11 MQIPQRLQAARKMQNNQSIMQISENKDGSAKVLFRIRKAEMTDVDEIMAVMHEAKNDKEH PDWFVSDDEEYVRTHIEEQGFVIVAQTADGSVAGFFIIKYPENREDNLGTYLDFDEEQLS HVAVMDSAVVCCAYRGNGLQGHMLEEAERLLDTDQYYYLMCTIHPDNQFSRHNMEIHGYE VKRTALCYGGLPRCILLKDLTESGKSPAQVVKSSKQ >gi|226332923|gb|ACII01000096.1| GENE 2 782 - 2377 1848 531 aa, chain - ## HITS:1 COG:CAC2758 KEGG:ns NR:ns ## COG: CAC2758 COG4086 # Protein_GI_number: 15896014 # Func_class: S Function unknown # Function: Predicted secreted protein # Organism: Clostridium acetobutylicum # 17 253 19 256 288 161 40.0 3e-39 MKKVKTAAALLCSACLVLSGTAVPTMADSVKVVTLGADLTQDQKNTMMKYFNVDSNQVQI LTITNQDERDHLSAYVPLEQIGTRTVSCAYVKPTQSGGIKVRTANLNWVTCNMIATSLST SGVKNCEVVAACPFEVSGTGALTGIQMAYETATGEQLDSTKKELATEEMVVTGNLADEVG KNDATTVMNNSKIQVIKDNVQNVDDIYNIVVNVAQQNNVNLDSDQINKIVELLKQIAQQE YNYDDVKATLEQVEQNTSGDNDELGDIDDEEDDTVNAGDPADGDDILNNVDNSALGGDIV ESSTENPSLEEDSGLIEDDGDDQSDGDLSETEEPQEPDGDETTDDSTLGNTDEGDTDSDE GTDDTENDTTSDELDTSALTEDQKTMFDKAENFCKGEYEGDTTALTTAMEDETATASVTL DAENGATLAKNVEKAYLKILTEGTASYQADGTEIYMSTELNMVDKSMKEIFGLSADAQAS PELGELSDDDRQTLYNETMKFFEKLYGESTETYDTTDADSTETAGGEENAE >gi|226332923|gb|ACII01000096.1| GENE 3 2490 - 3044 610 184 aa, chain - ## HITS:1 COG:Cj0167c KEGG:ns NR:ns ## COG: Cj0167c COG1971 # Protein_GI_number: 15791554 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Campylobacter jejuni # 1 181 1 184 187 145 51.0 4e-35 MNIFELFILAIGLSMDAFAVSVCKGLSLGRINAKHMCIAGAWFGGFQALMPLVGYFGGRF FADKVTRYSHWVAFVLLVFIGAGMIKESKEEEHVNADMDIKSMFILAVATSIDALAVGVT FAFLKVEIVSAVSFIGVITFVCSAAGVKIGSLFGMKYKSKAELCGGIILILIGTKILLEG LGMI >gi|226332923|gb|ACII01000096.1| GENE 4 3884 - 5563 1179 559 aa, chain - ## 
HITS:1 COG:SP1040 KEGG:ns NR:ns ## COG: SP1040 COG1961 # Protein_GI_number: 15900911 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Streptococcus pneumoniae TIGR4 # 1 557 1 557 559 525 50.0 1e-148 MSKKIKCDIYTRVSTTMQVDGYSLDAQKEKLKRYAEFQNMEIVNEYSDEGKSGKSVEGRP EFQRMLDNIENGTDEVQFVLVFKLSRFGHNAADVLNSLQRMQDFGVNLICVEDGIDSSKD SGKLMISVLSAVAEIERENILVQTMEGRKQKAREGKWNGGFAPYGYELVNGELQIAEDEA EIIRLIYDKFIHTNMGISAIAAWLNQHGYKKKKRQNNTLDAFASSFIKGVLDNPVYCGKL AYGRRKNEKVSGTRNEYRIVKQENYMLHDGIHEGIISETDWELAHQKREKTGVKYEKTHS LDHEHILSGILRCPLCGSGMYGNVNRKKKKDGTLYKDYFYYACKHRRLVDGHKCGYRKQW SEEKINNAVKEIIRKLVKNPKFEEAILNKIGSRIDTEEIEKEIERLEKQHRQLTGAKGRL GQQMDSLDIMDKFYEKKYQDMETRLYRLYDEIEGVENSIEEVKNRLLNIRQQKISEENVY QFLLYFDKLYDKFTDLEKKEFLNSFVEQVDIYEQEQPDGRFLKHIKFRFPVYFGDRETQE LCWDKESTAESCALLSKLE >gi|226332923|gb|ACII01000096.1| GENE 5 5556 - 6866 627 436 aa, chain - ## HITS:1 COG:SP0575 KEGG:ns NR:ns ## COG: SP0575 COG1061 # Protein_GI_number: 15900485 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Streptococcus pneumoniae TIGR4 # 1 422 143 544 548 172 27.0 1e-42 MTGIIDVAMVGSMYSRGKFNERINYYGMVIMDECHHAASNTSMELLQKINAKYVYGVSAT PKRGDSLDRIIYMLLGPLRHRFTALERAKEQGIGHYFVPRYTRVVDTVESKDNINKAYNL ISTSKVRNEMIVDDVITCITRKQTPVILTRFKEHAKFLYDALKGKADHVFLLYGDNSDKE NTEIRGRLKQVPGSESLVLVATGQKIGEGFDFPRLDVLMLAAPVSFEGRLEQYVGRLNRD YVGKEAVYVYDYIDSHVRYFDKMYAKRLRTYRKTGFSIWTQELQPKQIINAIFDSVNYTE KFEQDIVESEKMVVISSPDIRQDKIDRFLLLITKRQEVGVKVTVITTDPEDITYGKSDVC YELIRAMQLVGINVITRTEVEECFAIIDDEIVWHGGMNLLGKADVWDNLMRIRNSQVATE LLEIALGCSEERRKSE Prediction of potential genes in microbial genomes Time: Sat May 28 20:07:01 2011 Seq name: gi|226332922|gb|ACII01000097.1| Ruminococcus sp. 
5_1_39B_FAA cont1.97, whole genome shotgun sequence Length of sequence - 2368 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1598 476 ## COG4951 Uncharacterized protein conserved in bacteria - Prom 1686 - 1745 6.0 + Prom 1679 - 1738 9.9 2 2 Tu 1 . + CDS 1768 - 1959 238 ## gi|253580383|ref|ZP_04857649.1| conserved hypothetical protein - Term 1964 - 2015 8.1 3 3 Tu 1 . - CDS 2023 - 2211 111 ## gi|253579574|ref|ZP_04856843.1| conserved hypothetical protein - Prom 2275 - 2334 2.8 Predicted protein(s) >gi|226332922|gb|ACII01000097.1| GENE 1 2 - 1598 476 532 aa, chain - ## HITS:1 COG:ECs1305_1 KEGG:ns NR:ns ## COG: ECs1305_1 COG4951 # Protein_GI_number: 15830559 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 68 423 49 388 392 186 32.0 1e-46 MNIEAYDADSLRKMVRLLEYENKILKDKLKKAGISYEEVNPFEEKIESAEEYDLDQGSRI VNPPYITEKMAMRFFSMFWGREDVYARRGKNGGYFPQCANRWNDRLCPKQRKEKVLCDEC ENTKWISLDVKKIIDHLLGTKEDGSDVIGVYPLLPNGTCRFIVFDFDNHEKGAEVTDFAN TDNEWHKEVDALRKMCELNGIRPLVERSRSGKGAHVWIFFKKAIPAATARNFGFLLLDKG STFINLKSFHYYDRMYPSQDVASSIGNLIALPLQGQALKNGNSAFVDENWNAYPDQWDAL FNKTRKLGIEDVEQCMAKWQGELAEVRGMLTNIEKNVRPKPWKKKCEFCKSDVVGKLHMV LGNGVYVDTLNLMPRIQNQIRSLAAFDNPEFYKNKRLGYSNYYNFSAVYLGKDIDGYIQI PRGLRENIIQECEKAGISVDVSDQRETGQPIRVSFKGDLRMQQELAAEKLLSHSDGVLSA ATAFGKTVVCSYLIAERKVNTLILLQSKDLLNQWVDELNHFLEIREEPPEYE >gi|226332922|gb|ACII01000097.1| GENE 2 1768 - 1959 238 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580383|ref|ZP_04857649.1| ## NR: gi|253580383|ref|ZP_04857649.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 63 25 87 87 111 95.0 1e-23 MATSSITHNFVISNPNSVKLFIAAIDEADRDRTPKQTLPGRQLTNPQEILSLMSKRKKQT LKY >gi|226332922|gb|ACII01000097.1| GENE 3 2023 - 2211 111 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579574|ref|ZP_04856843.1| ## NR: gi|253579574|ref|ZP_04856843.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 62 53 114 114 117 98.0 2e-25 MFAVVIAFRYLLPGKDFLEFKRKLIKEIDRVNREVEHISEVELLNKMGFSENWKSITKYH LK Prediction of potential genes in microbial genomes Time: Sat May 28 20:07:15 2011 Seq name: gi|226332921|gb|ACII01000098.1| Ruminococcus sp. 5_1_39B_FAA cont1.98, whole genome shotgun sequence Length of sequence - 18698 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 13, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 363 - 1109 477 ## DSY0090 hypothetical protein 2 1 Op 2 . - CDS 1114 - 1746 422 ## COG4832 Uncharacterized conserved protein 3 1 Op 3 . - CDS 1743 - 2660 353 ## COG2378 Predicted transcriptional regulator - Prom 2713 - 2772 6.8 - Term 2741 - 2795 3.5 4 2 Tu 1 . - CDS 2830 - 3000 79 ## gi|253579578|ref|ZP_04856847.1| conserved hypothetical protein - Prom 3065 - 3124 5.3 + Prom 3067 - 3126 6.3 5 3 Tu 1 . + CDS 3222 - 4007 419 ## EUBREC_3583 hypothetical protein + Prom 4027 - 4086 6.9 6 4 Tu 1 . + CDS 4143 - 4940 486 ## EUBREC_1021 hypothetical protein + Term 4947 - 4993 5.3 + Prom 4942 - 5001 6.4 7 5 Tu 1 . + CDS 5044 - 5262 128 ## EUBREC_1224 hypothetical protein + Term 5295 - 5334 2.1 - Term 5617 - 5655 8.3 8 6 Op 1 . - CDS 5659 - 6531 633 ## COG1131 ABC-type multidrug transport system, ATPase component 9 6 Op 2 . - CDS 6535 - 7020 142 ## gi|253579583|ref|ZP_04856852.1| conserved hypothetical protein - Prom 7049 - 7108 5.2 10 7 Tu 1 . 
- CDS 7188 - 8258 228 ## Cphy_1407 hypothetical protein - Prom 8285 - 8344 4.7 - Term 8317 - 8361 4.1 11 8 Op 1 16/0.000 - CDS 8375 - 9175 507 ## COG2205 Osmosensitive K+ channel histidine kinase - Prom 9214 - 9273 9.0 12 8 Op 2 . - CDS 9542 - 10216 284 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 10237 - 10296 4.5 13 9 Op 1 . + CDS 10437 - 10673 325 ## gi|154506202|ref|ZP_02042940.1| hypothetical protein RUMGNA_03744 14 9 Op 2 . + CDS 10718 - 11218 312 ## gi|291549668|emb|CBL25930.1| plasmid mobilization system relaxase + Term 11284 - 11324 4.1 - Term 11272 - 11310 3.7 15 10 Op 1 . - CDS 11360 - 11740 251 ## Lebu_2175 hypothetical protein 16 10 Op 2 . - CDS 11802 - 12218 408 ## CKR_3111 hypothetical protein - Prom 12409 - 12468 4.7 17 11 Tu 1 . - CDS 12492 - 13346 514 ## AZL_005550 hypothetical protein - Prom 13394 - 13453 4.9 - Term 13691 - 13731 7.0 18 12 Op 1 . - CDS 13735 - 14085 336 ## CDR20291_1777 hypothetical protein 19 12 Op 2 . - CDS 14137 - 15774 1751 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 20 12 Op 3 . - CDS 15780 - 16727 997 ## COG3481 Predicted HD-superfamily hydrolase - Prom 16751 - 16810 5.7 - Term 16758 - 16811 3.7 21 13 Tu 1 . 
- CDS 16812 - 18125 883 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 18205 - 18264 8.4 - TRNA 18390 - 18474 52.6 # Leu AAG 0 0 Predicted protein(s) >gi|226332921|gb|ACII01000098.1| GENE 1 363 - 1109 477 248 aa, chain - ## HITS:1 COG:no KEGG:DSY0090 NR:ns ## KEGG: DSY0090 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 246 1 247 271 268 53.0 2e-70 MEYIRVTKENLEKEHICCAISNNKDIQVSSKKAWLADRFDEGLVFLKSVERGKCFIEYIP AECAWNPIEAPGYMYINCLWVSGSFKGHGYSSDLLSECIEDSKEKGKKGLCILAAARKKP FLADSKFLKYKGFKACDEADNGIQLWYLPFEEKTEPPVFKECAKHHHINESGYVLYYTNQ CPFNAKYVPILEETAQKNGIPLKAVKIENRKDAQNVPTPITTYALFCDGEYVTNEQMNDK KFLKLVGR >gi|226332921|gb|ACII01000098.1| GENE 2 1114 - 1746 422 210 aa, chain - ## HITS:1 COG:lin2189 KEGG:ns NR:ns ## COG: lin2189 COG4832 # Protein_GI_number: 16801254 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 4 207 7 207 208 190 48.0 2e-48 MTFDYKKEYKEFYMPKGTPSIVRVPKMNYIAVRGSGNPNDADGEYKQSIGLLYGIAFTIK MSKKEDHQIDGYFDYVVPPLEGFWWQEGVSGIDYARKEEFKWISVIRLPDFVSREDFDWA VRKATEKKKQDFSKVEFFSYDEGLCVQCMHIGSYDDEPATVDEMHRFMEEQGYALDITDQ RMHHEIYLSDARRVAPEKRKTVVRHPIRRA >gi|226332921|gb|ACII01000098.1| GENE 3 1743 - 2660 353 305 aa, chain - ## HITS:1 COG:lin0464 KEGG:ns NR:ns ## COG: lin0464 COG2378 # Protein_GI_number: 16799540 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Listeria innocua # 1 300 1 308 310 172 33.0 1e-42 MKTDRLIGILSILLQEEKTTAPELAEKFEGSRRTINRDIEDLCKAGIPIRTAQGTGGGIS IMDGYRMDRTILTSKDMQMILAGLRSLDSVSGNRYYGQLMEKIQTGSSEFISERDSMLID LSSWYKGSLVPKIEVIQNAIENRHTIQFKYYAPSGDGNRRIEPYYLVFRWSSWYVWGWCL EREDYRLFKLNRMDCVTESEQFFMCRNVPMPNLSNEKIFPGGIKVKVLFAPDVKWRLVEE FGPHCFTRTDDGRLLFSADYTDMENLVTWLMTFGAKAEVLEPKEARDIIRRNAEETLKSY GGLGK >gi|226332921|gb|ACII01000098.1| GENE 4 2830 - 3000 79 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579578|ref|ZP_04856847.1| ## NR: 
gi|253579578|ref|ZP_04856847.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 56 1 56 56 97 100.0 2e-19 MGIKSLLGFGQARDKPVRNYSNGEYSFNFGQSTSGKSVNEIRNIDKGLSDGLECIG >gi|226332921|gb|ACII01000098.1| GENE 5 3222 - 4007 419 261 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3583 NR:ns ## KEGG: EUBREC_3583 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 8 261 191 428 543 94 31.0 3e-18 MGGVIHPLHQVDLLTPAERKITEKEYWAQRRGQEKLDELNQKMKEDGITPKETRYQTEKQ FLRDAIDDAASTAQSPEEFSKILDEKYHIIFKISRNRYSYLHPGRKKYITERNLGTRYTE DFLLKAFEENTKSHREQKEEILEQQAPNTSTDLPTVPFSDTSAISTPFIFIKSDLRLVID LQTCIKAQQSGAYAQKVKLTNLKQMAQTVAYIQEHGYDSLDDFHVELNQASDQTSAPRKS LKDTEQQLKDVNEQIEKVKDL >gi|226332921|gb|ACII01000098.1| GENE 6 4143 - 4940 486 265 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1021 NR:ns ## KEGG: EUBREC_1021 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 26 209 29 214 311 143 41.0 6e-33 MNNTEETLDFEQKHKEDLQRLRGFRLLDDDFMSKVFEDKACAEFLLQIILQRHDLKVQSV QGQYDIKNLQGRSIRLDILAVDSNNRIYNIEIQRSDRGADAKRARYNSSLIDANITEAGD KYDALTETYVIFITENDVLKAGLPIYHVDRIIQETGEPFGDEAHIIYINSQIKDETELGK LMHDFSCTNPKDMYYKILADRVRYFKEDEEGVLTMCREMENMRKAERIEIAKRMLASGKL TYEDIAAFTDLTLEEIEALAVQKTA >gi|226332921|gb|ACII01000098.1| GENE 7 5044 - 5262 128 72 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1224 NR:ns ## KEGG: EUBREC_1224 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 69 26 91 94 85 62.0 7e-16 MKAHKYWSLGALAAMFGIFYAGYKNMKSAHKYFACSSLFCMIIAIYSGHEMISGKSKKRK TLFPMENSKPNL >gi|226332921|gb|ACII01000098.1| GENE 8 5659 - 6531 633 290 aa, chain - ## HITS:1 COG:all2672 KEGG:ns NR:ns ## COG: all2672 COG1131 # Protein_GI_number: 17230164 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Nostoc sp. 
PCC 7120 # 5 271 4 266 293 187 37.0 2e-47 MSIRINDLTVRFKNGVVAVNKASLEIPKGIYGLLGENGAGKTTLMRVLTTVLKQTEGMVS LDGILYNEGNYEKIQKKIGYLPQEIDLYPNLTVKECLVYMGGLSGVARNDLEQRITYYLE KTSLTEHQNKKMRQLSGGMKRRVGLIQALLNNPDFLIIDEPTTGLDPEERIRIRNLLVDF SKDRTVLFSTHVVEDLAATCTQLAIMKKGSFLYSGSVSGLLENAKGCVWNCTVQNAEEAR LLEANYSISSKQYVENGIHMKVLSKGKPNENCVLDNDITLEDAYIYLTNS >gi|226332921|gb|ACII01000098.1| GENE 9 6535 - 7020 142 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579583|ref|ZP_04856852.1| ## NR: gi|253579583|ref|ZP_04856852.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 161 58 218 218 288 100.0 1e-76 MEVKSKRADVFHLYDQKKQLKVISQRVGVQILYLLILSCVGYVLFFWQKPGSVNEGISGI QIFLLYFIAMFGTIWLWSICSVILCTLLRNMWAGIGCLFGIVIGLISKAGSSFFGNLGLF SFSFCEPTQLMSESWIYGTLVSFIAGLFLFAVLPMTLKKRG >gi|226332921|gb|ACII01000098.1| GENE 10 7188 - 8258 228 356 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1407 NR:ns ## KEGG: Cphy_1407 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 164 1 164 164 79 30.0 2e-13 MKTIIKRSILDYLKNPVLWIGLIIIVASMYQCLSSYLQIHYIKQNEQITQNDVALEDADV MDGYIPTSDDKERRREWEDTIKETLMDTSKNGFGFSRQEADHVMKEIQNMDVKTASEFLE SQYGYYNAIYAYEDLEIHKGTAEEINHYIERKLSEHSFSWYFAKKFTDFAGLHMAFFATV LLSFLFIQDTRKSTYELLHTKPVTAVQYICGKVISGFISMLGVLVILNVIFFMLCLKTSL ESGFPVTPIDFCVNSLIYIIPNLLMICCVYTITAVIFKNPLPAAPILFLHIIYSNMLTMK NDIYYMRPFSIMVRFPGRFFETHVAKMSNINQIILVISSVILVCISVTIWKRRRVH >gi|226332921|gb|ACII01000098.1| GENE 11 8375 - 9175 507 266 aa, chain - ## HITS:1 COG:STM0703 KEGG:ns NR:ns ## COG: STM0703 COG2205 # Protein_GI_number: 16764073 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Salmonella typhimurium LT2 # 42 257 655 873 894 88 30.0 1e-17 MIADNELDFHVSYENKDEMGTLCKEFEMMRSDLADNNRKMWRMIDDEKALRNAIAHDIRS PLSILRGYQEMLLEFVSAESIKTEDVIDILQTGMYQIDRIEHFTENMRKMSHLEQRELQC SEIELSELAKKIEAEAAMLSKKESKLCKVERVQEQNIVKVDEELVMEVTDNLLENAVRYA QKSIALQIKKKDGFLIISVEDDGIGFVDTEEKVTEPFYHKNPQDDLKHFGLGMYISRIFC 
EKHGGNLKIYNARQGGAHVEALFKAE >gi|226332921|gb|ACII01000098.1| GENE 12 9542 - 10216 284 224 aa, chain - ## HITS:1 COG:CAC3517 KEGG:ns NR:ns ## COG: CAC3517 COG0745 # Protein_GI_number: 15896754 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 224 1 228 230 151 40.0 1e-36 MKYKILVVDDDKELVKMLCSYFNMKQYETITATDGMEALNKIKMKPDIILLDINMPRMDG IEVCRLIRSKVLCPILFLTARVDEDDKINGLLSGGDDYITKPFSLRELEARIVTNIKREE RHQQKTEYRFMDEMLIDYSEKIVAIAGHRMEFTKIEYQIIEFLSMHPGQVFDKERIYEQV CGYDAEGDSRTITELVRRIRKKIADYSEKEYIETVWGIGYRWKK >gi|226332921|gb|ACII01000098.1| GENE 13 10437 - 10673 325 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154506202|ref|ZP_02042940.1| ## NR: gi|154506202|ref|ZP_02042940.1| hypothetical protein RUMGNA_03744 [Ruminococcus gnavus ATCC 29149] # 1 78 61 138 365 157 96.0 2e-37 MQELRNTKIIAVDHGYGNMKTANTVTPTGIKAYETEPIFTGNILEYNGIYYRIGKGHKEF IPDKAMDEEYYLLTLMAM >gi|226332921|gb|ACII01000098.1| GENE 14 10718 - 11218 312 166 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291549668|emb|CBL25930.1| ## NR: gi|291549668|emb|CBL25930.1| plasmid mobilization system relaxase [Ruminococcus torques L2-14] # 1 166 404 569 569 288 97.0 1e-76 MNQQLTTVTEEIEELKSRKEQLIFQAECSTDKDMTNLSKKYDQMNNNLDILDSQDISLKK QLEKDAAAFREEKFRPEPEQYTELLDTRIQIRPDFRDKLIEQLKGTFGKYYDYHRRDIAA NEVDYLNAEDPDVFSHRAWELEYQRKQEMRRNQPARTKKKSYDMEL >gi|226332921|gb|ACII01000098.1| GENE 15 11360 - 11740 251 126 aa, chain - ## HITS:1 COG:no KEGG:Lebu_2175 NR:ns ## KEGG: Lebu_2175 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 126 1 124 155 80 38.0 1e-14 MKKKVLAIMLIAMSIMLISACGKKEKLYEIPDLSQYKTDYVGDSSNVINIVSGQEYPAGY SYDSIEIQSETEPYGLTVFLKDEPSAAKLEDELQVNADMTFDLIGNLGTIDYKTADSKEI IASYER >gi|226332921|gb|ACII01000098.1| GENE 16 11802 - 12218 408 138 aa, chain - ## HITS:1 COG:no KEGG:CKR_3111 NR:ns ## KEGG: CKR_3111 # Name: not_defined # 
Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 3 138 26 162 162 109 54.0 3e-23 MDKVNGKLTVFFEEPFWVGIFERIEDGKLSVAKVTFGAEPKDYEVQEYIQKCYFSLKFSP VVETVVKDIKRNPKRMQREAKKQMLEIGIGTKSQQALKLQQEQNKQERKEKRRKKKEAEE QRMFELKQRKKREKHKGH >gi|226332921|gb|ACII01000098.1| GENE 17 12492 - 13346 514 284 aa, chain - ## HITS:1 COG:no KEGG:AZL_005550 NR:ns ## KEGG: AZL_005550 # Name: not_defined # Def: hypothetical protein # Organism: Azospirillum_B510 # Pathway: not_defined # 77 278 77 282 289 88 29.0 3e-16 MKKYYAERHGLLTKQLQIDFDELLQYFGQVYKYFCDKEYFEVATRGVWRQIPYTQDSEQI LPPSLLPSPEVYFATCLQDKEVWPIWQYLEEYDEQTLFSVIEILYDHIGVYNYEIDQFEN EAQKEEFAEQINNILRAYKEGYYLEPTNGFIMQIPNGALREQLEYDGSDLPDSVYEQLAT ATEMYYRFDANLEQKKKAINILADILESEREEVKDTLNAEYEVPKNEHDKLIFGIVNGYN IRHNRADQKNDYSKEIWYDWMMQYYTSVIIAFYKLKNKYTDIDF >gi|226332921|gb|ACII01000098.1| GENE 18 13735 - 14085 336 116 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1777 NR:ns ## KEGG: CDR20291_1777 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 13 115 22 124 125 89 45.0 5e-17 MGGTYTLGADGIYYPNLVSTDEEPHYGKYGMMRKTYLKEYRPAMYSLYMLEDRLTEHLNA VDDEAQERMDILVRQMMEKEGITEEMKACDQMEWVRAVNNIRNVAEEIVLKELVYI >gi|226332921|gb|ACII01000098.1| GENE 19 14137 - 15774 1751 545 aa, chain - ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 5 450 10 454 458 456 52.0 1e-128 MEFRKNDLVTLEIEDCGIDGEGIGKADGFTVFVKDAVIGDTVTAKIIKAKKNYGYGRLME VLKPSPYRVEPKCEFARQCGGCQLQALSYDQQLVFKTNKVKGHLERIGGFTDIPMEPIIG MDELFHYRNKAQFPVGRNKEGKIVTGFYAGRTHNIIENRDCALGVAENKEVLDRVIAHME KYGIEPYNEATGKGLVRHVLIRYGYFTKEVMVCLILNGNKIPKEEQLVKSLCEIPGMTSI TINVNKKHSNVILGEEIRLLWGQEYITDRIGDISYQISPLSFYQVNPMQTQKLYAKALEY ADLHGQETVWDLYCGIGTISLFLAQKAKFVRGVEIVPAAIENAKENAKLNGLENTEFFVG 
KAEEVLPREYKKNGVYADVIVVDPPRKGCDETLLETMVEMNPDRIVYVSCDSATLARDLK YLCERGYELRKVCPVDQFGMTVHVETVVLLSQQKPDDTIEIDLDLDELDATSAELKATYQ EIKDYVLKESGLKVSSLYISQVKRKCGIEVGENYNLPKSENARVPQCPKEKEDAIKAALK YYAMI >gi|226332921|gb|ACII01000098.1| GENE 20 15780 - 16727 997 315 aa, chain - ## HITS:1 COG:SA1660 KEGG:ns NR:ns ## COG: SA1660 COG3481 # Protein_GI_number: 15927416 # Func_class: R General function prediction only # Function: Predicted HD-superfamily hydrolase # Organism: Staphylococcus aureus N315 # 1 289 1 286 313 157 32.0 2e-38 MRFLNELHEGDRINGIYLCKQKQSAVTKNGKPYENIILQDKTGIMDGKIWDPNSLGIDDF DALDYIDVVGDVTSFAGAMQLNIKRVRKAAEDEYDPADYLPVSENSTDDMYSQLRAFIDS VENTYLSALLKKLFVEDEAFVKAFEGHSAAKTVHHGFIGGLMEHTLGVTRLCDYMSKAYP VINRDLLITASLLHDIGKTKELSAFPLNDYTDEGQLLGHIYMGAQMINDLAHQIPEFPEV LKNELIHCILSHHGELEYGSPKKPALVEAVALNLADNTDARMETITEIFAANKSKKEWLG YNKLFESNLRRTDEV >gi|226332921|gb|ACII01000098.1| GENE 21 16812 - 18125 883 437 aa, chain - ## HITS:1 COG:XF0285 KEGG:ns NR:ns ## COG: XF0285 COG0265 # Protein_GI_number: 15836890 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Xylella fastidiosa 9a5c # 116 418 64 371 481 84 26.0 6e-16 MTLGSYPEWSAAGESGMKKNKMDTQEYQFIKETVKKQPKENGSLLRRLLMIAGCGVLFGG CAAVTFASVFPVMADREENTQQKIELTGSDVAETISEEQATERESVASVSEKEEKSLLAM RQEMYREVLKVSQKSRKSLVNVRGISKDEDLLNNSYFQQEDTEGLVFLETETQFYILTYE EELENLQELQVTFADGSTVQGEICRGDADSGFAVATVRKNLLNDSTREGIVVSDLTDTKK LGQSDIVIAIGSPAGDSDAVVYGMITSVSEKLSVADTEYNVLATDVQGNEDGSGVLLDSD GNVAGMILKKNENDGDNIHAVSISQILPLVEHLANKETIRYTGIYGTEITQAQCRKLGID QGLYVERTQEESPAMKAGIQCGDIISKIDGTPTESMQSYYTYLQTKKQGENLTITVLRKN SDGEYAEKDYKITVGER Prediction of potential genes in microbial genomes Time: Sat May 28 20:08:31 2011 Seq name: gi|226332920|gb|ACII01000099.1| Ruminococcus sp. 
5_1_39B_FAA cont1.99, whole genome shotgun sequence Length of sequence - 85969 bp Number of predicted genes - 70, with homology - 67 Number of transcription units - 34, operones - 17 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 12 - 71 7.7 1 1 Op 1 . + CDS 114 - 2495 2281 ## COG1193 Mismatch repair ATPase (MutS family) 2 1 Op 2 40/0.000 + CDS 2507 - 3208 898 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 3 1 Op 3 . + CDS 3205 - 4617 1045 ## COG0642 Signal transduction histidine kinase + Term 4705 - 4759 8.2 - Term 4693 - 4745 11.2 4 2 Op 1 . - CDS 4803 - 6527 1937 ## COG1001 Adenine deaminase - Prom 6665 - 6724 9.6 - Term 6745 - 6780 2.7 5 2 Op 2 . - CDS 6807 - 8783 2108 ## COG4716 Myosin-crossreactive antigen - Prom 9005 - 9064 11.0 - Term 9029 - 9091 15.7 6 3 Op 1 . - CDS 9108 - 9677 656 ## COG1309 Transcriptional regulator 7 3 Op 2 . - CDS 9705 - 10205 674 ## COG0219 Predicted rRNA methylase (SpoU class) 8 3 Op 3 . - CDS 10276 - 10872 662 ## COG0309 Hydrogenase maturation factor 9 3 Op 4 . - CDS 10895 - 10999 194 ## 10 3 Op 5 . - CDS 11014 - 12690 1537 ## COG1032 Fe-S oxidoreductase 11 3 Op 6 11/0.000 - CDS 12697 - 12999 554 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 13048 - 13107 5.7 12 3 Op 7 . - CDS 13111 - 14019 303 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 - Prom 14110 - 14169 6.4 - Term 14197 - 14236 -0.5 13 4 Tu 1 . - CDS 14281 - 15519 1599 ## COG0112 Glycine/serine hydroxymethyltransferase - Prom 15554 - 15613 8.4 14 5 Op 1 . - CDS 15699 - 16769 1192 ## Athe_2369 hypothetical protein - Prom 16789 - 16848 4.5 15 5 Op 2 . - CDS 16855 - 18975 2223 ## COG1472 Beta-glucosidase-related glycosidases - Prom 18995 - 19054 2.2 16 6 Op 1 7/0.000 - CDS 19056 - 20615 1104 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 17 6 Op 2 . 
- CDS 20608 - 22407 1415 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 22456 - 22515 4.0 18 7 Op 1 38/0.000 - CDS 22529 - 23377 960 ## COG0395 ABC-type sugar transport system, permease component 19 7 Op 2 35/0.000 - CDS 23381 - 24283 1025 ## COG1175 ABC-type sugar transport systems, permease components - Prom 24363 - 24422 5.3 - Term 24482 - 24526 8.1 20 7 Op 3 . - CDS 24579 - 25946 1643 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 26188 - 26247 8.9 + Prom 26113 - 26172 13.1 21 8 Tu 1 . + CDS 26302 - 27486 1103 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family + Term 27621 - 27687 30.0 + TRNA 27603 - 27675 83.5 # Phe GAA 0 0 22 9 Tu 1 . + CDS 27956 - 29239 1411 ## COG2873 O-acetylhomoserine sulfhydrylase - Term 29238 - 29293 2.2 23 10 Tu 1 . - CDS 29343 - 30482 1156 ## EUBREC_3573 peptidase, M23 family - Prom 30572 - 30631 2.1 24 11 Tu 1 . + CDS 31039 - 31272 294 ## gi|253579621|ref|ZP_04856890.1| predicted protein + Term 31277 - 31332 11.3 + TRNA 31443 - 31515 65.6 # Gly GCC 0 0 25 12 Tu 1 . - CDS 31922 - 33106 1353 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) - Prom 33131 - 33190 5.7 - Term 33191 - 33227 -0.4 26 13 Tu 1 . - CDS 33236 - 33949 926 ## EUBREC_1566 hypothetical protein - Prom 34120 - 34179 8.5 - Term 33974 - 34026 3.2 27 14 Tu 1 . - CDS 34193 - 34789 397 ## COG5632 N-acetylmuramoyl-L-alanine amidase 28 15 Tu 1 . - CDS 35144 - 36601 1803 ## COG0516 IMP dehydrogenase/GMP reductase - Prom 36632 - 36691 9.4 - Term 36641 - 36671 1.1 29 16 Tu 1 . - CDS 36711 - 37214 644 ## Cphy_3288 hypothetical protein - Prom 37258 - 37317 5.6 - Term 37391 - 37445 5.1 30 17 Op 1 41/0.000 - CDS 37528 - 39147 1573 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 - Prom 39208 - 39267 5.7 31 17 Op 2 . 
- CDS 39279 - 39563 511 ## COG0234 Co-chaperonin GroES (HSP10) - Prom 39777 - 39836 6.7 - Term 39905 - 39948 5.8 32 18 Tu 1 . - CDS 39979 - 40764 963 ## COG4465 Pleiotropic transcriptional repressor - Prom 40888 - 40947 10.1 - Term 41050 - 41111 4.2 33 19 Op 1 13/0.000 - CDS 41136 - 43205 2143 ## COG0550 Topoisomerase IA - Prom 43234 - 43293 4.4 34 19 Op 2 2/0.000 - CDS 43337 - 44254 708 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake 35 19 Op 3 . - CDS 44275 - 45807 1136 ## COG0606 Predicted ATPase with chaperone activity - Prom 45936 - 45995 7.5 + Prom 45889 - 45948 6.8 36 20 Tu 1 . + CDS 46075 - 46290 314 ## CHY_1697 putative prophage LambdaCh01, repressor protein 37 21 Op 1 . - CDS 46368 - 46970 520 ## gi|253579634|ref|ZP_04856903.1| conserved hypothetical protein 38 21 Op 2 . - CDS 47005 - 48420 1503 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase - Prom 48455 - 48514 3.7 39 22 Op 1 . - CDS 48629 - 50590 2032 ## COG3855 Uncharacterized protein conserved in bacteria - Prom 50615 - 50674 4.3 40 22 Op 2 . - CDS 50682 - 55151 4371 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) 41 22 Op 3 . - CDS 55198 - 56250 1285 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis 42 22 Op 4 . - CDS 56302 - 57441 1223 ## COG3853 Uncharacterized protein involved in tellurite resistance 43 22 Op 5 . - CDS 57476 - 58774 1269 ## Cbei_0753 hypothetical protein - Prom 58795 - 58854 9.4 + Prom 58712 - 58771 7.6 44 23 Tu 1 . + CDS 59001 - 60026 1273 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase + Term 60049 - 60090 -0.7 - Term 60092 - 60141 7.3 45 24 Op 1 16/0.000 - CDS 60197 - 60688 676 ## COG0262 Dihydrofolate reductase - Prom 60713 - 60772 3.1 46 24 Op 2 . - CDS 60793 - 61641 945 ## COG0207 Thymidylate synthase 47 24 Op 3 . 
- CDS 61693 - 62658 653 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase 48 25 Tu 1 . + CDS 62911 - 63075 101 ## gi|253579646|ref|ZP_04856915.1| predicted protein + Term 63118 - 63166 3.4 - Term 63457 - 63514 19.1 49 26 Op 1 . - CDS 63532 - 63702 69 ## - Prom 63757 - 63816 8.1 50 26 Op 2 . - CDS 63868 - 65103 409 ## CD1890 hypothetical protein 51 26 Op 3 . - CDS 65112 - 65987 389 ## COG1131 ABC-type multidrug transport system, ATPase component 52 26 Op 4 . - CDS 65959 - 66729 72 ## CD1888 hypothetical protein 53 26 Op 5 . - CDS 66719 - 67258 193 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 67324 - 67383 9.0 54 27 Tu 1 . - CDS 67432 - 67629 182 ## gi|253579651|ref|ZP_04856920.1| predicted protein - Prom 67823 - 67882 7.6 + Prom 67819 - 67878 7.1 55 28 Op 1 3/0.000 + CDS 68002 - 69672 2045 ## COG0366 Glycosidases + Term 69684 - 69728 9.0 + Prom 69684 - 69743 7.0 56 28 Op 2 . + CDS 69880 - 70935 851 ## COG1609 Transcriptional regulators 57 28 Op 3 . + CDS 70916 - 71113 59 ## - Term 71033 - 71087 16.2 58 29 Op 1 1/0.000 - CDS 71100 - 72764 1979 ## COG0366 Glycosidases - Prom 72868 - 72927 8.9 59 29 Op 2 2/0.000 - CDS 72960 - 73793 735 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 73920 - 73979 7.3 - Term 73816 - 73846 -1.0 60 29 Op 3 . - CDS 73983 - 75104 807 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 75184 - 75243 4.0 61 30 Op 1 . - CDS 75249 - 76358 1133 ## EUBREC_3517 hypothetical protein 62 30 Op 2 38/0.000 - CDS 76394 - 77251 871 ## COG0395 ABC-type sugar transport system, permease component 63 30 Op 3 35/0.000 - CDS 77265 - 78143 703 ## COG1175 ABC-type sugar transport systems, permease components - Prom 78205 - 78264 4.6 64 30 Op 4 . 
- CDS 78267 - 79502 1440 ## COG1653 ABC-type sugar transport system, periplasmic component + Prom 79608 - 79667 9.9 65 31 Op 1 7/0.000 + CDS 79776 - 81620 1085 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 66 31 Op 2 . + CDS 81607 - 83199 1352 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 83202 - 83261 5.8 67 32 Tu 1 . + CDS 83380 - 84186 730 ## COG4822 Cobalamin biosynthesis protein CbiK, Co2+ chelatase 68 33 Op 1 . - CDS 84373 - 84882 424 ## COG5652 Predicted integral membrane protein 69 33 Op 2 . - CDS 84879 - 85361 522 ## gi|253579665|ref|ZP_04856934.1| predicted protein - Prom 85386 - 85445 2.6 70 34 Tu 1 . - CDS 85591 - 85899 130 ## cce_0110 hypothetical protein Predicted protein(s) >gi|226332920|gb|ACII01000099.1| GENE 1 114 - 2495 2281 793 aa, chain + ## HITS:1 COG:BH3106 KEGG:ns NR:ns ## COG: BH3106 COG1193 # Protein_GI_number: 15615668 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Bacillus halodurans # 3 792 2 785 785 600 45.0 1e-171 MNQKAYKALEYYKIINMLTDKASSSMGKEICRKLEPSTDIDEIRHMQTQTRDALTRLFQK GNISFGSVKDVRGSLKRLEIGSSLGIGELLAICSLLENTNRVKAYSRSERGDSLPDSLDG MFEALEPLTPLTTEIRRCILSEDEISDDASSNLRQIRRNMKITGDRIHTQLSSLVNGSAR NYLQDSVITMRNGRYCIPVKAEYKGQVPGMVHDQSSTGSTLFIEPMAIVKLNNDIRELEL EEQKEIEVILSTLSQQTAEQTDSIRADLNIMVQLDVIFARASLAMDMNATEPIFNDEGRI RLKQARHPLIDKKKAVPIDIRLGDDFDLLVITGPNTGGKTVSLKTVGLLTLMGQSGLHIP TLDRSELALFEEVYADIGDEQSIEQSLSTFSSHMTNVVSFLKKANRHSLVLFDELGAGTD PTEGAALAIAILSHLHEQGIRTMATTHYSELKVYALSTSGVENACCEFDVETLRPTYRLL IGVPGKSNAFAISSKLGLPDYIIDKAKEQISEQDESFEDVLSSLESSRITIENERREIEQ YKQEIASLKSEMESKQEKLNEQRDRIIRQANEEAHAVLREAKEYADQTMKMFHKFQKDHV DLSAVEKERQNLRKHMNKAEKGMTQKTAAKKPKKELTAKDISLGDAVKVLSMNLKGTVSS RPDNKGFLFVQMGIIRSKVHISDLELIDEAEITTPTMQRTGAGKIRMSKAAHVSTEINLL GKTVDEAIAELDKYLDDAYIAHLKSVRIVHGKGTGALRKGVHNYLKRQKHVESFRLGEFG EGDAGVTIVEFKK >gi|226332920|gb|ACII01000099.1| GENE 2 2507 - 3208 898 233 aa, chain 
+ ## HITS:1 COG:CAC3220 KEGG:ns NR:ns ## COG: CAC3220 COG0745 # Protein_GI_number: 15896467 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 3 231 4 228 228 264 56.0 1e-70 MVSKQKILIVDDDNNIAELIALYLTKECFETKIVNDGEEALREFASFRPDLIILDLMLPG IDGYQVCREIRHTSDVPIIMLSAKGETFDKVLGLELGADDYMIKPFDSKELVARVRAVLR RFQVKQPSPSSSEKCVTYPDLTVNLTNYAVTYMGRQVDMPPKELELLYFLAASPNQVFTR EQLLDHIWGYEYIGDTRTVDVHIKRLREKIKDNPHWSIATVWGIGYKFEVKNP >gi|226332920|gb|ACII01000099.1| GENE 3 3205 - 4617 1045 470 aa, chain + ## HITS:1 COG:CAC3219 KEGG:ns NR:ns ## COG: CAC3219 COG0642 # Protein_GI_number: 15896466 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 195 470 201 475 475 195 36.0 1e-49 MKQHHLSTKFLRFYILIGILGFFLITLGGSYMVEKHLEHSLSAALYTEAHNIASNEAVKS NISSSTVDTLQEHLCAISDFQDAVLWIINSNGEIIVSTQKNIDVRDPIPLEEFDASKWGS NYYQIGKFYGFFKTDHLSVIAPITSDMETKGYVAIHYSMTNLYQSRSSILFIMQVIFLLC YAATSLLLWAYSHYIRKPLARIMKGASEYAGGNLAYKIDVTSDDEMGYLAKTLNYMSDEL NKNGEYQRKFIANVSHDFRSPLTSIKGYVNAILDGTIPYEMQERYLKIIAFESERLEKLT RSLLTLNELDMKKRMMHIQRFDINDTIKNTAATFEGICTSRQIRLELLLSGHELYVRADM EQIQQVLYNLLDNAIKFSNDNSSVQIETTVKSGKVFVSVKDYGTGIPKESLGKIWDRFYK IDASRGKDRKGTGLGLSIVKEIINAHNQNIDVISTEGVGTEFIFTLEKTK >gi|226332920|gb|ACII01000099.1| GENE 4 4803 - 6527 1937 574 aa, chain - ## HITS:1 COG:CAC0887 KEGG:ns NR:ns ## COG: CAC0887 COG1001 # Protein_GI_number: 15894174 # Func_class: F Nucleotide transport and metabolism # Function: Adenine deaminase # Organism: Clostridium acetobutylicum # 11 563 9 564 570 451 43.0 1e-126 MELEMYRRYSQMARGIEKAELVFKNGRVFSSGTGEFIDGDVAVADGIVIGVGTYEGETEI DLEGKVICPGFIDSHLHLESTLVTPGELVRQAAQCGTTTFIVDPHESANVSGTDGIDYIL DQTEDVPANVYVMMPSCVPATHVDDNGCILTAGKMKGYLEHPRILGLGEVMDAPSVINGS VAMHEKLQLFQDRVKDGHAPFLASGDLAAYVLGGIDTDHECVDYEYAMAEARNGMQVLIR 
EGSAARNLDAIVKGIVEHHTDTSGFCFCTDDKHIEEIRKEGHINYNVKRAVQLGLPVEKA LQMATIQPARCYGLYRLGMIAPGRQADFVILDNVADLNVVDVYHCGKKIIKDEKAELKPC PPYLKNTVHVSGFSEERLKLKHPGTKARVIQMLEKQIVTRDVLEEVPWIESDGEKYFAPD GEYQKIAVIERHKNTGKMGVGIIKGYGIRGGAIASSVSHDSHNIIVVGDNDRDMAVAVKE MMRTQGGYTLVCNGEIYGTLPLPVMGLMSDAGYESVNEALAKMIPKAHEMGVKEGFDPFI TLSFMALPVIPEIRITPRGIYLVNEDRMLRTPFS >gi|226332920|gb|ACII01000099.1| GENE 5 6807 - 8783 2108 658 aa, chain - ## HITS:1 COG:SA0102 KEGG:ns NR:ns ## COG: SA0102 COG4716 # Protein_GI_number: 15925810 # Func_class: S Function unknown # Function: Myosin-crossreactive antigen # Organism: Staphylococcus aureus N315 # 63 653 1 591 591 714 56.0 0 MSKNKGLGLISALAVGAGVAAVAAGKKYKTVETKKKEDYDALHNTSEYRNTERGKYEKNS KGIYYTNGNYEAFARPRKPEGVDDKHAYIVGSGLASLAAACFLVRDGQMPGKNIHILEAM DIAGGACDGIFDPSRGYVMRGGREMENHFECLWDLFRSIPSIETPGVSVLDEYYWLNKED PNYSLCRATKERGKDAHTDGKFNLSQKGCMEIMKLFMTKDEDLYDKTIEDVFDDEVFDST FWLYWRTMFAFENWHSALEMKLYFQRFIHHISGLPDFSALKFTKYNQYESLILPMQKYLE EAGVEFQFNTEVTNVLFEFEGNKKIAKAIECKVNGVEKGIVLTENDLVFVTNGSCTEGTI YGDQNHAPNGDAEVRTSGCWSLWKNIAKQDPSFGHPEKFCSDIAKTNWESATVTTLDDKI IPYITDICKRDPRTGKVVTGGIVSCQDSSWLLSWTINRQGQFKDQDKDKVCVWVYGLFTD VPGDYIKKPMKECTGKEITEEWLYHLGVPVDQIEDMAENSAVCVPTMMPYITAFFMPRTK GDRPDVIPDGCVNFAFLGQFADTPRDTVFTTEYSVRTAMEAVYGLLGVDRGVPEVWGSVY DIRELLDSSVKLMDGKSPLEIELPGPLDMLKKPLLKAVKGTVIEKVLRDHDVIKDYMM >gi|226332920|gb|ACII01000099.1| GENE 6 9108 - 9677 656 189 aa, chain - ## HITS:1 COG:lin0482 KEGG:ns NR:ns ## COG: lin0482 COG1309 # Protein_GI_number: 16799557 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 5 174 7 175 186 98 33.0 7e-21 MPNTTKAALEESLKRLLLKKPLDKITITDITTDCGISRMAFYYHFKDIYDLVEWSCVEDG TKALQGKKTSESWTEGLTQIFEAVLENKPFIMNVYRNVDRERIENYLFKLTYDLIVGVVE EKSKGLNITEEDKKFIADFYKYGFVGIILEWIREGMKENIEDLVRMMDLTLRDTVTTSIH NFQKNNMEK >gi|226332920|gb|ACII01000099.1| GENE 7 9705 - 10205 674 166 aa, chain - ## HITS:1 COG:FN0809 KEGG:ns NR:ns ## COG: FN0809 COG0219 # Protein_GI_number: 19704144 # Func_class: J Translation, ribosomal 
structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Fusobacterium nucleatum # 4 152 1 149 150 199 64.0 2e-51 MAKLNIVLHEPEIPANTGNIGRTCVATGTRLHLIEPLGFSLSEKALKRAGMDYWKDLDVT TYLDFEDFLEKNPGAKIYYATTKAPQTYTDVHYEEDCYIMFGKESAGIPEDILVNNQETC VRIPMIGDIRSLNLSNSVAIVLYEALRQNNFDHMNLEGHLRNYDWK >gi|226332920|gb|ACII01000099.1| GENE 8 10276 - 10872 662 198 aa, chain - ## HITS:1 COG:PAB0403 KEGG:ns NR:ns ## COG: PAB0403 COG0309 # Protein_GI_number: 14520803 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Pyrococcus abyssi # 20 196 150 326 326 91 33.0 7e-19 MRLGQAAIEALRAEITGCLKPGDELVVACPVALKGTSVIAKNKKDKLAERFSAGFIQNCV SLRDDYGAGSIVWKIAQEADASALYAMGEGGFLSALWKMAEASEVGLEADFRKVPIRQET IEVCEIFDLNPYKLQADGAVLIGIRGGEALVQRLRNEGFMAEIIGQTNSGNDRLLYSGGS ARYLERPAEDELQQLLKE >gi|226332920|gb|ACII01000099.1| GENE 9 10895 - 10999 194 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGMFNPKNRKVITAIIAIILVLAMVVPTVLSLLV >gi|226332920|gb|ACII01000099.1| GENE 10 11014 - 12690 1537 558 aa, chain - ## HITS:1 COG:CAC1021 KEGG:ns NR:ns ## COG: CAC1021 COG1032 # Protein_GI_number: 15894308 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 2 523 3 526 548 374 41.0 1e-103 MKILLAACNAKYIHSNLAVYDLQAYASDYADHIVLKEYTINQQKDDIMRDIYLEHPDVVC VSCYIWNLSFVKELMADLIKILPGADFWAGGPEVSYDAEKFLTENSEFKGVMVGEGEETF KELAGYYVEKNPQNLKDMTGICYRDGDQIIHNGWRQIMDLSSIPFIYKDLSEFKNRIIYY ESSRGCPFSCSYCLSSIDKKLRFRDTETVKKELQFFIDNKVPQVKFVDRTFNCKHDHAMA IWKYINEHDNGVTNFHFEISADLLREEELQEMSTMRPGLIQLEIGVQSTNPDTIKAIHRT MDFEKLKGIVDRIHSFGNIHQHLDLIAGLPYEDYDSFRNSFNDVYALKPQQLQLGFLKVL KGSHMMEMCREYGIVYKTQEPYEVLSTKWLDYDHVLKLKTVENMVEVYYNSGQFQNTLEY LENFFPDAFSIYERLGSFYMEKGYGDVSHTRMRRYEILLEFLEDMPEISVDQVKDQMVYD LYLRENLKSRPGFARDQKPFERQIWDFRKREKVAKNAHVEVFADGKVLLFDYADRDPLTN NAHVTDVTKDVFENLNRD >gi|226332920|gb|ACII01000099.1| GENE 11 12697 - 12999 554 100 aa, chain - ## HITS:1 COG:MT4033 KEGG:ns NR:ns 
## COG: MT4033 COG0526 # Protein_GI_number: 15843547 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Mycobacterium tuberculosis CDC1551 # 4 94 11 104 116 107 58.0 6e-24 MAKEITTANFETEVLKSEKPVLIDFWATWCGPCMRQGPVVEELAEEGYAVGKVDVDQNMA LAQQFRVVSIPTLILFKDGAEVKRFVGLTSKEELKSALEG >gi|226332920|gb|ACII01000099.1| GENE 12 13111 - 14019 303 302 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 5 301 5 306 306 121 31 1e-26 MEKVIIIGAGPAGISAALYAVRGNLDPLVINNGIGALEKAEKIENYYGLEHPLSGQELYD TGIAQAKALGVRILDAQVLGVGGFDTFVVKTTEGEFETQSLILATGSKRAAPKIPGVKEF EGKGVSYCAICDAFFYRGKDVAVLGNGDFAMHEAKELSNTASTVTIYTNGEEPEFTSEEN ISVNTMKIQSVEGDDLVSGIRLAPDVQAEEIKAGAGQQANDFCPAEGVFIAMGTAGSTEI ARQMGAELTEKGNIKVNENMETTIPGLYAAGDCTGGLLQVAKAVYEGAKAGINAGKYVRS LK >gi|226332920|gb|ACII01000099.1| GENE 13 14281 - 15519 1599 412 aa, chain - ## HITS:1 COG:BS_glyA KEGG:ns NR:ns ## COG: BS_glyA COG0112 # Protein_GI_number: 16080743 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Bacillus subtilis # 11 410 8 409 415 525 65.0 1e-149 MYSIDDVAKTDKDIADLIEAELARQNSHIELIASENWVSKAVMAAMGSPLTNKYAEGYPG KRFYGGCSCVDEVEALAIERAKELFGCEYANVQPHSGAQANMAVFFAMLQPGDTVMGMNL DHGGHLTHGSPANMSGTYFKPVYYGVNDDGVIDYEEVRRIAIENKPKLIVAGASAYARVI DFKKFREIADEVGAYLMVDMAHIAGLVAGGQHPSPIPYADVVTTTTHKTLRGPRGGLILS SAENAKKFNFNKAVFPGIQGGPLMHVIAAKAVCFKEALQPEFKDYAKMIVENAQALCKGL QKRGIDIVSGGTDNHLMLVDLRSLGVTGKQMENLLDEVNITCNKNAIPNDPQSPFVTSGV RLGTAAVTSRGMKPEDMDKIAEAIAMTLKEEGSQEKAKAIVKELTDKYPLVG >gi|226332920|gb|ACII01000099.1| GENE 14 15699 - 16769 1192 356 aa, chain - ## HITS:1 COG:no KEGG:Athe_2369 NR:ns ## KEGG: Athe_2369 # Name: not_defined # Def: hypothetical protein # Organism: A.thermophilum # Pathway: not_defined # 70 353 181 468 480 67 22.0 1e-09 
MKNKIKKILLLGMTAMFTAGAAGTAVISCPVWADEAEQNSETAEEPKAEDAAVEEEIADQ TDDKTENTDLKTVEHPRMSAYSIRRFSIVKDGEEVFQIKQEPADYKMDFDYWEITNPYDE TATVNTENMYEMFGVLAAFDLSNGVDAANTDTGLDNTKTYFTVDFVNTVNDDTAKETQDA DATATILIGNTDENGDYYACVKGYEEAVYLLSKESVNSLLELKPFNLILKIPALVNIDTL DSVDISIGKKTYTMKLDGSDYKFGKKTVKKEKFTELYQALQSIMLDSEVEETKDAADKEE VLTVTFHRNTEEAPEVTLKYFAYDDTYDSLEINGTERFLVKAEDVDALVKQIKKAF >gi|226332920|gb|ACII01000099.1| GENE 15 16855 - 18975 2223 706 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 40 696 92 754 778 348 34.0 2e-95 MKDHRKERAEARKKAEKLVSQMTLLEKASQLKYDAAPVKRLGVPAYNYWNEALHGVARAG VATMFPQAIAMAAVFDDEEMKKVGDIIATEGRAKYNAYSAKEDRDIYKGLTFWSPNVNIF RDPRWGRGHETYGEDPYLTSRLGVKFVEGIQGDGPVMKAAACAKHYAVHSGPESLRHEFD AQASMKDMWETYLPAFEALVTEADVEAVMGAYNRTNGEPCCAHKYLMEDVLRGKWKFEGH YTSDCWAIRDFHEHHMVTSTPRQSAAMALNAGCDLNCGNTYLHMMGAYQDGLVTEEKITE SAVRLLTTRYLLGLFDGSEYDKIPYSVVECKEHIDEALKMARKSCVLLKNDGVLPIDKTK VNTIGVIGPNADSRAALIGNYHGTSSEYITVLEGIREEAGDDVRILYSQGCDLYKDKVEN LAWDQDRISEAVITAENSDVVILCVGLNETLEGEEGDTGNSDASGDKVDLHLPKVQEELI EKVTAVGKPTIVVLMAGSAIDLNYAQDNCNGILLAWYPGARGGRAIADLLFGKESPSGKL PITFYKDLEGMPEFTDYSMKNRTYRYMEKEALYPFGYGLTYSDTCVTEAEVVGEVSAESD IVLKATVKNNGTVDTDEVVQVYIKDLDSPLAVRNYSLCGFKRVSLKAGEEKSVEFTISNK AMNIVDEDGNRYIAGKHFRLFAGVSQPDTRSAELTGHKPAEIEIAL >gi|226332920|gb|ACII01000099.1| GENE 16 19056 - 20615 1104 519 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 516 1 523 525 149 25.0 2e-35 MYRVLIVDDEKMIRMGMKNAIDWKKLGVDDVFTAASGNEALKILKEEGPEIMVTDIQMTE MTGLELIKAARESVPELRVIVLTGFDNFEYARESLRLQVQDFFLKPIDEDDLFNAIEKQI KELKEQENKEQNQARVWRSQGSVVQMRLEQCMRNLVHTKADKEMQLYILQKDFQFDIKQK 
MSLILLEQGMYTEQQNDGKFRAMTVKNVCMSMVDSQNRGITFVDDDGTIAIVCFEKDDDD SVLEQIEELSDVLKDEFEYKPKITVGSAVQGFENLAISYNDARYLLEHEKKNIQDIIQTM GAQNKKKIFWEIYAELRNIMISNIGTPDYVLKAFNTFIKATESYNLSPSAVRRCCFEIAS SLIFSYMEVSCEVEEGKLDALSKSLSSAGKEEACEITRMFIEQLIENDEEDVHYTISNAR HYIDEHLAEDISVSSIAESLYITPNYFSRLFKRITGEGCNEYIVRKRIEKAKSLLETTSI KTGKIAMMVGYRDTNYFSLAFKKHTGKSPTKYREEMQNT >gi|226332920|gb|ACII01000099.1| GENE 17 20608 - 22407 1415 599 aa, chain - ## HITS:1 COG:SP0662 KEGG:ns NR:ns ## COG: SP0662 COG2972 # Protein_GI_number: 15900563 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 256 581 238 557 563 166 31.0 9e-41 MNKWKTKFSHWKFDKKILLLVTVSILIVTLTVAGVSLTFSIASMKEQSVELLQMQNNTVA ESFKGSMDSYKEIVLGTIMDDSVQNYCRQVQKNELKTADIDAVYSKLENLNNMYESLNFA AIVSSDYKSYYYRGKGSLSVTQFEEVYPQAYERSKYAQKGTLKVSYNNDYYNDYYKGNRY TLSLYYPLYDTRKVSDARGLLCMNFDDPAVQRMLAVGDSSAETRVVDTGGMILLSNDKEE TGRYVNYIDEMEKGIQIFTKNGNMYVCQKIKNWNYYVVSSISFYKLMESGFHIMVVVILV LIIVLIIVSIIIRALVRKMYQPLNKVVSNMDYVATGSLRTRINVDNMGEDFTKLAIGFNS MMDEIEVLMEQVKLEQHQLEQIRFNALQSQIQPHFLYNTLECIHWQAIAEGNKEISTMVK ALAKYYRICLSKGHDVITIREEVEHIRNYMIIQNMRYDNIIGSEIVVESQVEESLIPKLT LQPLVENSIYHGMKVKEGKTGTLYLTAYKNEDEVIIKVSDTGTGMSQEQIDEMNEQLSRY EDSFGYGVRNVNKRIELLFGEGYGLNYQKNDSGGVTVVIHLPYQTEIKKNTAGGEWTCV >gi|226332920|gb|ACII01000099.1| GENE 18 22529 - 23377 960 282 aa, chain - ## HITS:1 COG:BH3682 KEGG:ns NR:ns ## COG: BH3682 COG0395 # Protein_GI_number: 15616244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 9 282 19 293 293 175 37.0 8e-44 MDSYKKSRRKSIIAWIIAFVLAIGAAIVSFTPFIFMVLNSFKEKFEMLTKGVFQLPDQLN WSNYTEVLTGGFANYFKNSVIVLAISLILLLFISACASYPLARFKFKMAQPIYAIIVACM SIPVHITLIPVFKMSKSTGLYDTIWSLVGPYVAFAVPISVFILTSFMKEIPREIEESAEI DGCGKIQMFFSMILPLSKPGMATLAIYNGVNMWNEFSFVNTLTQSAQNRTLPLAIWEFQG QYSMNTPMIMAVLTLTLLPMVIMFIIFQDKLVKGMTAGAVKG >gi|226332920|gb|ACII01000099.1| GENE 
19 23381 - 24283 1025 300 aa, chain - ## HITS:1 COG:BH2225 KEGG:ns NR:ns ## COG: BH2225 COG1175 # Protein_GI_number: 15614788 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 14 298 17 301 304 157 31.0 2e-38 MNEKGISLVQLRQRKTRQTAALFLAPVIILLVVFIAYPIIDTFITSGYQWNGISADKKMV GLANWAKLIKDTKFWIAFKNNVIIMILSIIIQIPLGLAMATFLDFGGKKLTIFKVIWFIP MLMSSVAIGFLFTYALATNGGIVSTISGWFGGGNIDLLGNPKLALLTVILIIAWQFTPFY MVYFMAGYTNIPYDVFEAARIDGATRGQYFWKIALPLLIPSIKSAAILSMVGSLKYFDLI YVMTGGGPGTATELMATYMYKESFKNFNMGYGSAIAGGMFILITMVSMITMKLINGKQED >gi|226332920|gb|ACII01000099.1| GENE 20 24579 - 25946 1643 455 aa, chain - ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 44 425 54 434 438 190 32.0 5e-48 MKKRMYKIMALAMTAAMVVPMAAQTTAFAADDKEITYWNIAVESPDKDIVTAAVDKFNKE TKSGYTVNQVAIQNDTYKEKLVIAMSSGECPDMYSSWSGGPMYEYIDSGFAQPIDDLLDQ SDVKDKLLDAAIEQGSYNGHVYAIPYLNVSLAGIFYNKDMFKQYGLEEPKTLADLENICA TLKENGITPFALANASKWTGSMYFMSLAARYGGLEPFQNAVAGTGKFTDDCFIKAGEKIQ EWVNNGYFPDGVNSLSEDDGQAKQLMYQETAGMLLCGSWYTGTFQTDSEEFYQKIGWFPF PAIDDSDADPTIQIGTVGDQFISFNCEGDKLAAAFECATDHLSDEVADITYSNNKIVPVK DAGDHIKDPVVKEIFDAAQEASSIQLWYDQYLPTSVATAHLDGLQEVFGLTKTPQEAQEE MQKAMDEYLSTKSDSGAADDTAEEATDDAAADDAE >gi|226332920|gb|ACII01000099.1| GENE 21 26302 - 27486 1103 394 aa, chain + ## HITS:1 COG:FN1415 KEGG:ns NR:ns ## COG: FN1415 COG1979 # Protein_GI_number: 19704747 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Fusobacterium nucleatum # 1 394 1 385 385 319 42.0 5e-87 MRNFEYYTPTQVLFGKDTHLQAGSLLKKYGAKKVLIHYGGQSAVRSGLIDEITSNLKEEG LEYITLGGVIPNPLLSKVREGIELCQKEHVDFILAVGGGSVIDSSKAIGYGVANPNNDVW DFCLKKAVPTGCLPIGVILTIAASGSEMSSSSVITNEETKEKRGCAKTDYCRPKFAILNP 
RLTYTLPQYQTESGCVDILMHTMERYFVNIETMEITDSISEALMQTVIYNARILMKEPDN YSARAEIMWAGSLSHNGLTGCGTGGGDWACHQLEHELGGVYNVTHGAGLAAIWGSWARYV YEVNPERFAQFATNVFDIPCGTDYKETALAGIEAMENFFRSVEMPTSLHELGLDLTDQQI HDLAFKCSFEDTRTIGVFKQLNMRDMEKIYLMAR >gi|226332920|gb|ACII01000099.1| GENE 22 27956 - 29239 1411 427 aa, chain + ## HITS:1 COG:PM0738 KEGG:ns NR:ns ## COG: PM0738 COG2873 # Protein_GI_number: 15602603 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pasteurella multocida # 8 427 5 421 422 505 57.0 1e-143 MSDFKVETKCLHSGYTPSKGEPCALPIYQSTTYKYDTTDEMGQLFDLKAEGYFYTRLQNP TNDAVAAKIADLEGGVAAILTSSGQAANFYAVFNICEAGDHVVAASTIYGGTFNLLAVTF KKLGIDCTFVDTDASPEEIAAAFRPNTKVLFAETIANPALVILDIEKFAKVAHEHEVPLI VDNTFATPVNCRPFEWGADIVTHSTTKYMDGHAVQVGGAIVDSGNFDWDAYGHKYHGLTE PDESYHGVIYTKQFGKKAYITKATSQLMRDLGSIPSPTNCFLLNLGLETLPLRVERHCYN AQKIAEFLNAHEKVSHVNYAGLPDDKYYALAQKYMNEGRTCGVISFELTGGREAAVRFMD SLKLATIATHVAASKTMILHPASHTHRQMNDEQLVEAGVSPGMIRLSVGIENVEDIIADI EQALKNA >gi|226332920|gb|ACII01000099.1| GENE 23 29343 - 30482 1156 379 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3573 NR:ns ## KEGG: EUBREC_3573 # Name: not_defined # Def: peptidase, M23 family # Organism: E.rectale # Pathway: not_defined # 277 379 504 612 933 68 35.0 5e-10 MIGQTERDYKGVEAQFMKKKILVLLVCSMLTLPTTVSADILDYEGTVNAYKEDNKIPQTF EIDDGSGEKVTAVAKSDKVSVSAKATYIRSMPGKNGKKLVRVYLGTGVERVAVCDNGWSK VTYEKKSKKGTEKISGYVPTKHINDSDQVAQAKGTFTALKDSDILDYPGKKDGQVVGEVI QEDEVKRLATVNGIWSQIYYKKENGKRGIGYIPTSVLENTAAEEKTEESKVAKVDGEEAG TIHKSEGKGVFAEAVDGVTASKGAAVSGVQVGTPIAVSSDATLKPLGTFKITHYCPCSIC CGPWANGVTSTGVTATTNRTIAVDPTQIPYGSKVVINGQVYVAEDCGGAIKHNCIDIYVA THQEGEDKGVYYTDVYLLE >gi|226332920|gb|ACII01000099.1| GENE 24 31039 - 31272 294 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579621|ref|ZP_04856890.1| ## NR: gi|253579621|ref|ZP_04856890.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 77 1 77 77 132 100.0 6e-30 MTAVKERIIGAVSIMSDKDANIFWHIIQKHFKLPDTFADIEKVEPDETDLIMLKEIENNP DCHEFISQEELMKELNM >gi|226332920|gb|ACII01000099.1| GENE 25 31922 - 33106 1353 394 aa, chain - ## HITS:1 COG:CAC2445 KEGG:ns NR:ns ## COG: CAC2445 COG0138 # Protein_GI_number: 15895710 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Clostridium acetobutylicum # 3 394 5 391 391 503 60.0 1e-142 MKEFELKYGCNPNQKPAKIYMADGSELPVKILSGRPGYINFLDAFNGWQLVSNLKKATGL PAATSFKHVSPAGAAVGLPLTETLAKIYWVNDMDWKNFSPLACAYARARGADRMSSFGDF ISLSDVCDKDTALLIKREVSDGVIAPGYTEEALEILKQKKKGNYNVIQIDENFVPAPLEH KEVFGVTFEQGRQELEIDDAMLSNIVTENKELTEEAKRDMKISLIILKYTQSNSVCFVKD GQAIGVGAGQQSRVHCTRLAGQKADNWFLRQCPKVLNLPFKDTISRAERDNAIDVYIGDE YMDVLADGMWEKTFTEKPEVFTKEEKRAWLDQMTDVTLGSDAFFPFSDNIERAHKSGVKY IAQPGGSVRDDAVIETCNKYGMVMSFTGIRLFHH >gi|226332920|gb|ACII01000099.1| GENE 26 33236 - 33949 926 237 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1566 NR:ns ## KEGG: EUBREC_1566 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 237 50 286 286 405 82.0 1e-112 MEMLSLEKELKENSYPGRGIVIGRSADGKKAVTAYFIMGRSENSRNRVFVEDGEGIRTQA FDPSKLTDPSLIIYAPVRVLGNKTIVTNGDQTDTIYEGMDHQLTFEQSLRSREFEPDGPN YTPRISGVMHVENGKFNYAMSILKSNNGNPDACNRYTFAYENAIAGEGHFIHTYKCDGNP LPSFEGEPKLVAIPDDMDEFAELLWNSLNEDNKVSLFVRYIDIETGKYESKIINKNK >gi|226332920|gb|ACII01000099.1| GENE 27 34193 - 34789 397 198 aa, chain - ## HITS:1 COG:lin2374 KEGG:ns NR:ns ## COG: lin2374 COG5632 # Protein_GI_number: 16801437 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Listeria innocua # 35 188 9 176 316 103 33.0 3e-22 MISDAEQSGTDNADSSGGPEQWQKEGAPYIDVELLTPNEYSRPQIPIESVQYIAIHYTAN PGATAIANRNYFENLATTHDTKVSSHFVVGLDGEVVQCIPTSEMSYATNSRNVDTLSIEC CHPDETGKFNEATYDSAVKLSAWLCVRFGLTSENVIRHYDVTGKNCPKYYVENPDAWIQM KSDIAAQIDVDYGLQDVQ >gi|226332920|gb|ACII01000099.1| GENE 28 35144 - 
36601 1803 485 aa, chain - ## HITS:1 COG:CAC2701_3 KEGG:ns NR:ns ## COG: CAC2701_3 COG0516 # Protein_GI_number: 15895958 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Clostridium acetobutylicum # 205 483 1 279 280 400 72.0 1e-111 MGTIIGEGITFDDVLLVPAYSKVIPNQVDVTTYLTKKVKLNIPMMSAGMDTVTEHRMAIA MARQGGIGIIHKNMSIEAQAEEVDKVKRSENGVITDPFYLSPEHTLKDADELMAKFRISG VPITEGRKLVGIITNRDLKFETDFSKKIKECMTSEGLITAKEGITLEDAKKILAKSRKEK LPIVDDDFNLKGLITIKDIEKQIKYPLAAKDEQGRLLCGAAVGITANVLARVDALVKASV DVIVIDSAHGHSENILKAVREIKAAYPELQVIAGNVATGAATKALIEAGVDAVKVGIGPG SICTTRVVAGIGVPQITAVMDCYEAAKEYGIPIIADGGIKYSGDVTKAIAAGANVCMMGS MFAGCDESPGTFELYQGRKYKVYRGMGSIAAMENGSKDRYFQENAKKLVPEGVEGRVAYK GHVEDTVFQLMGGLRSGMGYCGAETIEKLKETGRFIKISAASLKESHPHDIHITKEAPNY SVDDK >gi|226332920|gb|ACII01000099.1| GENE 29 36711 - 37214 644 167 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3288 NR:ns ## KEGG: Cphy_3288 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 164 1 166 169 114 37.0 1e-24 MGDLYSELLVKKDKTAKDSLLKYGLIVLTVLAVFAGLIITPLALTIAVALGIACYFVIPK TDVEYEYLFINGDFDIDMIMSKTKRKKVKSFKLSEADLAAPLDSHRMDYYNGNQNMKVLD FSSGNPEHKRFGVITRLDGNLCKIILEPDEALAQAMKNSAPSKVFLD >gi|226332920|gb|ACII01000099.1| GENE 30 37528 - 39147 1573 539 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 539 3 543 547 610 59 1e-174 MAKEIKFGAEARAALEAGVNKLADTVRVTLGPKGRNVVLDKPYGAPLITNDGVTIAKEIE LEDGFENMGAQLIKEVASKTNDVAGDGTTTATVLAQAMVHEGMRNLAAGANPIILRKGMK KATDVAVEAIKNMSQTISGKKQIANVASISASDETVGQLVADAMEKVSKDGVITVEESKT MHTELDLVEGMQFDRGYVSAYMSTDMEKMEANLEDPYILITDKKISNIQEILPLLEQIVK VGAKLLIIAEDVEGEALTTLIVNKLRGTFQVVAVKAPGYGDRRKEMLQDIAILTGGQVIS EEVGLELKDATMEQLGRAKSVKVAKENTVIVDGMGDKDAIANRVAQIRGQIEETKSEFDK EKLQERLAKLAGGVAVIRVGASTETEMKEAKLRLEDALAATRAAVEEGIIAGGGSAYIHA SKEVAKLAAALEGDEKTGANIILKALEAPLFRISANAGLEGSVIINKVRESEPGIGFDAL NERYVDMVSEGILDPAKVTRSALQNATSVASTLLTTESAVAIIKEDTPAPAANPGMGMM 
>gi|226332920|gb|ACII01000099.1| GENE 31 39279 - 39563 511 94 aa, chain - ## HITS:1 COG:BS_groES KEGG:ns NR:ns ## COG: BS_groES COG0234 # Protein_GI_number: 16077669 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Bacillus subtilis # 3 93 16 107 108 95 57.0 3e-20 MTLVPLGDRVVLKQVEAEETTKSGIVLPGQAQEKPQQAEVVAVGPGGVVDGKEVKMEVAV GDKVIYSKYSGTEVKMDGTEYIIVKQNDILAIVK >gi|226332920|gb|ACII01000099.1| GENE 32 39979 - 40764 963 261 aa, chain - ## HITS:1 COG:CAC1786 KEGG:ns NR:ns ## COG: CAC1786 COG4465 # Protein_GI_number: 15895062 # Func_class: K Transcription # Function: Pleiotropic transcriptional repressor # Organism: Clostridium acetobutylicum # 5 254 4 258 258 203 48.0 3e-52 MSVQLLDRTRKINKLLHNSSSSKVVFNDICQVLMETLSSNILVLSKKGKVLGVSLCPGVE EITELIEDKVGGYVDSLLNERFLGVLSTKENVNLQTLGFEHVSSSYQGIINPIDIAGERL GTVFMYRYEKPYDIEDIIVSEYGTTVVGLEMMRAVHEENAEEDRKQQVVKSAFNTLSFSE LEAIIHIFDELDGDEGILVASKIADRVGITRSVIVNALRKFESAGVIESRSSGMKGTYIK VLNEVIFDELEEIKAQRNKKS >gi|226332920|gb|ACII01000099.1| GENE 33 41136 - 43205 2143 689 aa, chain - ## HITS:1 COG:BH2467_1 KEGG:ns NR:ns ## COG: BH2467_1 COG0550 # Protein_GI_number: 15615030 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Bacillus halodurans # 1 549 1 550 550 606 56.0 1e-173 MAKYLVIVESPAKVKTIKKFLGANYDVEASNGHVRDFPKSQFGIDVEHDFEPKYITIRGK GELLAKLRKAAKKADKIYLATDPDREGEAISWHLMQALKEDPKKMHRITFNEITKTAVKS SIKQARDLDMDLVDAQQARRMLDRMVGYTISPLLWAKVKRGLSAGRVQSVALRIICDRED EINAFIPEEYWSLEGDFQVKGEKKPLQAKFYGTDKKMDIHSKEEMDKLLASLKDKEYEIN EVKKGERIKNAPLPFTTSTLQQEAAKTLNFSTQKTMRLAQQLYEGIDIKGNGTVGVISYL RTDSTRISEEADAAAREYITAQYGEDYVSKSEKAVKKGQKIQDAHEAIRPTDISRTPAVL KESLSRDQFRLYQLIWRRFAASRMAPAKYETTSVKIGADQYIFTVSASKIIFDGFMSVYT DEEDQKKGNVLNQSLERGMKLSLKELKPEQHFTQPPAHYTEASLVKTMEELGIGRPSTYA PTITTIISRRYVAKEQKNLYVTELGEVVNNIMKQAFPSIVDVNFTATMEGLLDCVAAGTV KWKTVVSNFYPDLKKDVDAAEEELEKVDIQDEVTDVICENCGRHMVIKYGPHGRFLACPG 
FPECKNTKPYFEKIGVACPKCGKEIVLKKTKKGRKYYGCEDNPDCDFMSWQKPSAKKCPK CGSYMVEKGNKLVCSQETCGYVEAKDEKE >gi|226332920|gb|ACII01000099.1| GENE 34 43337 - 44254 708 305 aa, chain - ## HITS:1 COG:HI0985 KEGG:ns NR:ns ## COG: HI0985 COG0758 # Protein_GI_number: 16272923 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Haemophilus influenzae # 27 230 83 286 373 181 45.0 2e-45 MERTDMKNNHTEEYSTGQIRCIEKSDPEYPQIMKQYSSMPAKLYVKGRLPDPKRRTVAVI GARMCSPYGRMQAFRYAKALSVAGVQIISGMALGIDSEGHKGALEGKMPTFAVLGSGVDV CYPKSNRKLYERILWENGGIISECPLGSGPVSWHFPARNRIISALSDAVLVVEAKENSGS LITAGFALEQGKMVYAIPGAVTDELSRGCHKLIYDGAGIAYCPEIMLEELGISMEKVTQN GEKNNLGLARDLNMVYSCLDLRPKNPDYIVRKTGFSPAQVSNCLVELTLRGLIRESGRHY YVKDS >gi|226332920|gb|ACII01000099.1| GENE 35 44275 - 45807 1136 510 aa, chain - ## HITS:1 COG:alr4088 KEGG:ns NR:ns ## COG: alr4088 COG0606 # Protein_GI_number: 17231580 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Nostoc sp. 
PCC 7120 # 1 505 1 507 509 452 44.0 1e-126 MFASVLSAAIFGVEVCPVQVEADVSNGLPSFIMVGFPSAQVKEAQERVRTALKNNGYQFP PKRITVNFAPADMKKEGAGFDVPVAAAVLAAFEMISPQVVSRVMMAGEIGLDGEIHGISG ILPIVLCARSLGSRFCVVPYENLKEGRLIRDVPVAGVKNLRELVECLKNPEPYLKREIQE EIPSIINTDMGMDFSDIEGQEGAKRAAEIAVSGFHNLLLIGPPGTGKTMLARRLPTIMPG LGFEEKLELTRIYSIAGLLSREHPLIDERPFRSPHHTSTPQAIAGGGRNPRPGEITLAHK GVLFLDEMPEFSRASLELLRQPMEDKVIQIARASGTYNFPADFMLCAAMNPCPCGYYPDL NRCTCTAGEITHYMGKISRPLLDRIDISTEVPPVSFSQLHCGRKGENSAAIRKRVEKVQK IQEERYKDEKINFNGQLKSSLIDKFCPLTDSASRLLARAFEKIAFSARSYHRILKVARTI ADMEGEEMIAGHHIGEALSYRAFDKDSVIK >gi|226332920|gb|ACII01000099.1| GENE 36 46075 - 46290 314 71 aa, chain + ## HITS:1 COG:no KEGG:CHY_1697 NR:ns ## KEGG: CHY_1697 # Name: not_defined # Def: putative prophage LambdaCh01, repressor protein # Organism: C.hydrogenoformans # Pathway: not_defined # 4 71 5 70 254 63 41.0 2e-09 MKFQRIQDLRTDADMSQKQLSEILHISQRSYSHYETGSRNIPVEMLIRLANYYDISVDYL VGRTDKKEMNK >gi|226332920|gb|ACII01000099.1| GENE 37 46368 - 46970 520 200 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579634|ref|ZP_04856903.1| ## NR: gi|253579634|ref|ZP_04856903.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 41 200 41 200 200 235 100.0 1e-60 MSITNNGKQKRNKKKKRLIKLLIIVIILFILLMLADTLLLQHRKNVDREQRKAEAIAALE RVPTVTPTPAATPTPTPTPIPTVTPTPVLERAYVFNPEDYLGTWRSENGRVKIKIKKLSQ KSVTFTYWQTNKKKTASCKAKVKKSVAGNATRFSFTDSLGNVAKGYLTFDNGRLYVNIKT KTKAEGAKVHPSVDTVMIKE >gi|226332920|gb|ACII01000099.1| GENE 38 47005 - 48420 1503 471 aa, chain - ## HITS:1 COG:BH0630 KEGG:ns NR:ns ## COG: BH0630 COG0034 # Protein_GI_number: 15613193 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Bacillus halodurans # 81 470 96 454 473 187 32.0 3e-47 MGGIFGVASKSSCILDLFFGIDYHSHLGTRRGGMAVYDKKRGFDRVIHNIENAPFRTKFD GDIGEMEGNLGIGCISDNEPQPLLVRSHLGNFAITTVGKINNMDELIANCFKNGCTHFLE MSGGSVNPTEMVAALINQKDNMIEGIRYAQEVIEGSMTMLILTPKGILAARDRLGRTPLI VGKKEDACCVSFESFAYLNLGYEDLKELGPGEIVAVTPDGVETLQKPGEEMRICSFLWVY YGYPTSAYEGVNVEEMRYHCGDMLARRDRGEVNPDIVAGVPDSGIAHAIGYANQSGVPFA RPFIKYTPTWPRSFMPTQQSQRNLIARMKLIPVHRLIKDKSLLMIDDSIVRGTQLRETTE FLYRNGAKEVHIRPACPPLLYGCKYLNFSRSKSEMDLITRRVIAKREGENVSDKVLADYA DPNSTNYKEMLEEIRKELNFTSLKFHRLDDLKASIGISPCKLCTYCWDGKE >gi|226332920|gb|ACII01000099.1| GENE 39 48629 - 50590 2032 653 aa, chain - ## HITS:1 COG:CAC1572 KEGG:ns NR:ns ## COG: CAC1572 COG3855 # Protein_GI_number: 15894850 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 5 651 13 661 665 757 57.0 0 MDMKKDELRYLQRLAEIYPTIGKASTEIINLQSILNLPKGTEHFMSDLHGEYQAFSHVLR NGSGAVRKKIDDVFGHTLSNNDKRSLATLIYYPKEKMDLVKDTEEDMENWYKITLYRLIE ICKTTASKYTRSKVRKALPADYAYVIEELITEKAEVLDKEAYYDSIVNTIIEIGSAENFI IALAELIQRLVVDHLHILGDIYDRGPAPHFIMDRLMQYHSLDIQWGNHDVVWMGAAAGQK ACIATVVRNSIRYGNLDILEDGYGINMLPLATFVMEAYKDDPCEIFAMKGAFNYNVLEEE LGKKMHKAIAVIQFKLEGKLVRRHKEFQMEDRALLHRINPEKGTITLPDGKEYPLRDNNF PTIDWKHPYELTAGEKEVMDKLSSAFRNCEKLQNHIRLLLDKGGLYTVYNGNLLFHGSIP LNEDGTFREVQIYGKSYKGKELYDVLETYVRRAFYSVGKEEQKKGRDIMWYIWAAPNSPL FGKSKMSTFERYFIEDPSTHEEKKSAYYRLWENEEVVDNMLREFGLDPEKGHIINGHVPV 
HQSEGESPVKCDGKVIVIDGGFSKPYQKVTGIAGYTLIYNSYGLNLTAHEPFTSKADAVA RETDIVSNRVAVSYMSRRQLVGDTDTGHALKERIQELIQLLDAYRTGIIKEKK >gi|226332920|gb|ACII01000099.1| GENE 40 50682 - 55151 4371 1489 aa, chain - ## HITS:1 COG:CAC3442 KEGG:ns NR:ns ## COG: CAC3442 COG2176 # Protein_GI_number: 15896683 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Clostridium acetobutylicum # 170 1486 163 1450 1452 1380 52.0 0 MEKDFFDVFPSLKVKKELEELLDMVFVTRVSCNPSRTHIWVYIKSERWIHKKHIFELEEQ IERQIFAGLNVTVTVIEKFRLSGQYNPQNFLETYRSSMELELRSYNMLEYNMFRQAQISF PGENDLHMILPDSVIAREKSGILIEYLQKVFCERCGMDLKVELEFRETQESKYRKNAAVQ IAQEVENVIRHAKLNSKNEEPAQSEEAEKKTDKGQQEKKDKKAAFADRKDNRKGDFRGGF RRDSNPDVIYGRDFEGEPVALETITGEMGEVIVRGQVMEVEAREIRNEKTILIFPITDFT DSIVVKMFLRNEQVPEVTEHVKKGAFLKFRGVTTVDRFDSELTIASIAGIKKIANFTTAR VDTSPQKRVELHCHTKMSDMDGVTDAKSLVKRAYEWGHPAIAITDHGVVQAFPEANHCFD AWGGCVPKDSDFKVLYGMEGYLVDDLKGMVTNGKGQKLNGRFVVFDIETTGFSSLTCQII EIGAVLVENGEITDRFSTFVNPKVPIPFRIEQLTSINDSMVMDAPTIEEVLPKFLEFSKD AVMVAHNADFDMGFIMKNCDRLGIAHDFTYVDTVGMARFLLPALNRFKLDTVAKAVGVSL ENHHRAVDDAACTAEIFVKFVKMLEERDIRDVDMLNEQGAVSVNTIRKLPTYHVIIFARN ETGRINLYKLVSQSHLKYYHRRPRVPKSVLEQYRDGLLVGSACEAGELYQAILRNAPDTE IARLVNFYDYLEIQPVGNNRFMIADDKHDMISSEEDLKEINRKIVKLGEQFKKPVVATCD VHFMDPQDEIYRRIIMAGNGFSDADEQAPLYLRTTEEMLEEFAYLGSEKAEEVVITNTNK IADMIEKISPIHPDKFPPVIENSDQDLKNICFTKAHEMYGDPLPEIVESRLDRELNSIIS NGYAVMYIIAQKLVWKSNEDGYLVGSRGSVGSSLAATMSGITEVNPLPPHYLCPNPDCKY SDFDSPEVKQYAGMAGCDMPDKICPKCGTRMKKEGFDIPFETFLGFKGDKEPDIDLNFSG EYQANAHRYTEVIFGKGQTFKAGTIGTLAEKTAFGYVKNYFEERGVHKRYCEINRIVQGC TGVRRTTGQHPGGIIVLPVGEEIEKFTPVQHPANDVNSDIITTHFDYHSIDGNLLKLDIL GHDDPTMIRMLEDLTGISARDVPLDQRDVMSLFASTKALKIEPEDIGGCKLGCLGIPEFG TDFAMQMLIDTKPKYFSDLIRIAGLSHGTDVWLGNAQVLIQEGKATISTAICTRDDIMIY LIGKGVESGLAFTIMESVRKGKGLRDEWIETMKEHGVPDWYIWSCKLIKYMFPKAHAAAY VMMAYRIAWYKVFQPLAYYAAYFSIRATAFSYEMMCMGKERLEYYMAEIRKKGDSASKKE LDTLKDMRIVQEMYARGFEFVPIDLYTAKAQRFQIVDGKLMPSLATIDGLGDKAADAVVD AAKQGKFLSKDDFRDRTKVSKTVIDLMDDLKLFGDIPQSNQMSLFDFTG 
>gi|226332920|gb|ACII01000099.1| GENE 41 55198 - 56250 1285 350 aa, chain - ## HITS:1 COG:CAC1797 KEGG:ns NR:ns ## COG: CAC1797 COG0821 # Protein_GI_number: 15895073 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Clostridium acetobutylicum # 1 348 1 348 349 396 58.0 1e-110 MYRDHTKVIQIGDRKIGGGNPILIQSMTNTKTEDVEQTVAQILALEQAGCDIIRCAVPTM EAAKALKEIKKQIHIPLVADIHFDYRLAIAAMENGADKIRINPGNIGSTERIKAVVDVAK ERGIPIRVGVNSGSLEKELVEKYHGVTAEGLVESALDKVHIIEDLGYDNLVISIKSSDVL MCAKAHELIAEKTNYPLHVGITEAGTLYSGNIKSAIGLGIILNQGIGDTIRVSLTGAPLE EIKSAKRILKTLGLRKGGIEVVSCPTCGRTQIDLIGLANKVETMVQDIPLDIKVAVMGCV VNGPGEAKEADIGIAGGVGVGLLIKHGEIIKKVPEDQLLDTLREELLNWK >gi|226332920|gb|ACII01000099.1| GENE 42 56302 - 57441 1223 379 aa, chain - ## HITS:1 COG:lin2081 KEGG:ns NR:ns ## COG: lin2081 COG3853 # Protein_GI_number: 16801147 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in tellurite resistance # Organism: Listeria innocua # 41 379 56 399 399 217 41.0 3e-56 MGNEFKDFEMETPSLTLEPDLGELEKKEEVIPQKQLQKEEVPVLTPEEQKMVNDFAAKID IENTNQILQYGAGTQKKMADFSDTALENVKTQDLGEIGELISNVVGELKDFDVQEEGKFF GFFRKQTSKIENLKNKYDKAQANVEKITDSLQQHQVRLMKDSAMLDKMYEQNLNYFKELT MYILAGKKKLEETRNGKLAEMKNKAALSGLPEDAQAARDLDEKCSRFEKKLHDLELTRTI AMQTAPQIRLIQNNDTVMVEKIQTTIVNTIPLWKSQMVLALGIAHSAEAAQAQRQVTDIT NELLRKNAETLHMATVETAKESERGIVDLETLQKTNADLIQTLDDVMRIQMEGRQKRQAA EMEMHRMEEELKRKLLEIR >gi|226332920|gb|ACII01000099.1| GENE 43 57476 - 58774 1269 432 aa, chain - ## HITS:1 COG:no KEGG:Cbei_0753 NR:ns ## KEGG: Cbei_0753 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 7 428 5 399 399 205 31.0 3e-51 MSENFDELGELGDKIKKIIDTAVSTKDYRQMTEDIKQTVGQTVNSAVNSAVDSGSEAIKN GLNNVFGTVNSQSGDSGVYRNKTKEFEEKRRREREQEQERQRIEKEKAKEKMLTLYERNT GGRMKGMMIAVSGGILASGMGLGTLVLSIFGAVGHMSSLVTGGTCFMAVGALTGVGLLAG GIKKLGKLERFQKYIDTLGNHTYCNFEQLSAAVNKPVKFVKKDIKKMIDDRWFRQGHIDE 
QETCLITSNETYLQYTQTQKALEQKKQEEEKHQAEQERNRKNTPLEVQEVLNKGNEFLDK IHRSNDAIPGEEISAKISRMELIVEKIFERAQKHPEIIPDLKKLMNYYLPMTVKLLDAYE EMDQQPVQGENIQASKKEIEDTLDTLNQAFEKLLDSVFQDTAWDVSSDISVLHTLLAQEG LTDDDFAKMKKN >gi|226332920|gb|ACII01000099.1| GENE 44 59001 - 60026 1273 341 aa, chain + ## HITS:1 COG:CAC1479 KEGG:ns NR:ns ## COG: CAC1479 COG0115 # Protein_GI_number: 15894758 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Clostridium acetobutylicum # 1 341 1 341 341 583 78.0 1e-166 MEKKNIDWSSLGFGYVQTDYRYVSKFKDGAWDAGELTTDPNVTLNECAGVFQYAQTVFEG MKAYTTEDGHIVTFRPDLNAERMANSAKRLEMPVFPEDRFVDAIVQTVKANAAYVPPYGS GATLYVRPYMMGTNPVIGVKPADEYMFRVFTTPVGPYFKGGAKPITIRVSDFDRAAPHGT GHIKAGLNYAMSLHAIVDAHKNGFDENMYLDAATRTKVEETGGANFIFVTKDGKVVTPKS NSILPSITRRSLIYVAKEYLGLEAEEREIYFDEVKDFAECGLCGTAAVISPVGKIVDHGK EICFPSGMEKMGPVIQKLYDTLTGIQMGRIQAPEGWLKVIE >gi|226332920|gb|ACII01000099.1| GENE 45 60197 - 60688 676 163 aa, chain - ## HITS:1 COG:BS_dfrA KEGG:ns NR:ns ## COG: BS_dfrA COG0262 # Protein_GI_number: 16079240 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Bacillus subtilis # 1 162 2 160 168 115 40.0 4e-26 MNVIVAADKNWGIGKNNQLLVSIPADMKMFREETSGKVVVMGRKTLESFPNGLPLKNRIN IVITSNKEYEVKGAIIVHSIEEALEEIKKYPAEDVYCIGGDSIYAQMLPYCDVAHVTKID FAYEADSHFPNLDEDDEWEITGESDEQTYFDLEYQFVKYERKK >gi|226332920|gb|ACII01000099.1| GENE 46 60793 - 61641 945 282 aa, chain - ## HITS:1 COG:BS_thyA KEGG:ns NR:ns ## COG: BS_thyA COG0207 # Protein_GI_number: 16078831 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Bacillus subtilis # 1 270 1 268 279 203 42.0 2e-52 MSYADDVFIKMCQDILENGTSTEGEKVRPVWEDGTPAYTIKKFGVVNRYDLRKEFPAMTL RRTALKSAMDEILWIWQKKSNNIHDLHSHIWDSWADESGSIGKAYGYQLGVKHQYKEGMM DQVDRVIYDLKNNPFSRRIMTNIYVHQDLHEMNLYPCAYSMTFNVTQTPGNDKLTLNAIL NQRSQDILAANNWNVCQYALLLMMVAQVCDMEAGELLHVVADCHIYDRHIPLIKELIQRK 
PYPAPVVKLNPEVKDFYQFTTDDLIVENYETWPQIKNIPIAV >gi|226332920|gb|ACII01000099.1| GENE 47 61693 - 62658 653 321 aa, chain - ## HITS:1 COG:BH1953 KEGG:ns NR:ns ## COG: BH1953 COG1597 # Protein_GI_number: 15614516 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Bacillus halodurans # 6 295 4 276 295 125 28.0 1e-28 MRRIQIIVNNGAGTGRAHRVWNETQCLLRGYKIKYEAHMTRYEGHAAKLAEQISRVKGEE PVYLLVVGGDGTINEVLNGITDFDKVRFGVIPTGSGNDFGRNLKLPKTPKESLREICACI RKDQRGEALYRIDLGQVSWEGCEKPRIFGISSGLGLDALVCKKALHSRLKQVLNRFHLGK LTYLALTVQSLFTMETANAKVVTEHGGYILPKMIFAAAMNLPAEGGGVPMAPHASVQDGL LSLGSASGIAKWQTFFLLPFLVAAKQEHINGFIIRNEKGFRLILDKPMVLHADGEYCADV TRVEFRCLEKKLWLLHEQKGD >gi|226332920|gb|ACII01000099.1| GENE 48 62911 - 63075 101 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579646|ref|ZP_04856915.1| ## NR: gi|253579646|ref|ZP_04856915.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 54 1 54 54 67 100.0 3e-10 MKILFFQIRLKPFLSVFDFIDVFGEAEKKFALFNIIIFTFPKRDVSPIFSIFNL >gi|226332920|gb|ACII01000099.1| GENE 49 63532 - 63702 69 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIVISILTRACMCRNGIAFSSALGDSDNCSNCGCNCLPWNHGSNASTASNADRASA >gi|226332920|gb|ACII01000099.1| GENE 50 63868 - 65103 409 411 aa, chain - ## HITS:1 COG:no KEGG:CD1890 NR:ns ## KEGG: CD1890 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 410 1 413 413 188 29.0 3e-46 MSEMIKYEWKKIWKSRLAQLVIIGCGLFLVFCVWASIMQISATNEKGENFSGMSAVEVMK NTQKKIELTQENVNDIVGRYLKYISDPGTNSESETYHYLSEEVYRTFYLPNRELLSLVTN VYREPGSGSSIKEVLEENVGKDFRKAQIKRDIAYINLQKEQGRLTSGEADYWKEKIGNLQ EYQYGYHKGWSMILDTLTWPVLIMMLICIGIAPIFAGEYQSKCDSLILCMKYGRSKLILA KIISGWLYATGVYWGITLIYSSIYMIFLGTQGADLPIQLKYPAMSVGYNLTMGEAVGITL LLGYFFTLGIMGITLFMSALLKNTYAVIIVAFLLIIIPTFLSLDTGGYVWSHVLSLLPPK IADFSFQSYTAYSIGNIVLSWPVMAILVNAIVAVLCSVLGYIIFRKHQVNK >gi|226332920|gb|ACII01000099.1| GENE 51 65112 - 65987 389 291 aa, chain - ## HITS:1 
COG:all2672 KEGG:ns NR:ns ## COG: all2672 COG1131 # Protein_GI_number: 17230164 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Nostoc sp. PCC 7120 # 3 289 2 287 293 196 37.0 4e-50 MRLVMEHLSKQFKNRIAVESIDTELREGVYGFLGANGAGKTTLMQMICGIIEPTAGEVKI NGENNIQMGERFRDMLGYLPQEFGYTPGFTANDFMLYIASIKGLPPRFAKRKTKELLKLV NLEKDAGRKIRTFSGGMKRRLGIAQALLNDPKILILDEPTAGLDPKERAYFRNIIAEMSK DKIIIISTHIVSDIEYISDQVLIMKKGKFILQGPADELLKEVKNEVWSCRVPNNQWSNFE RTHCIANSHVMKEVIEARVISTSSPIEGAIIVEPTLEDLYLKLFSDELQEG >gi|226332920|gb|ACII01000099.1| GENE 52 65959 - 66729 72 256 aa, chain - ## HITS:1 COG:no KEGG:CD1888 NR:ns ## KEGG: CD1888 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 238 1 240 253 114 30.0 3e-24 MSDKALKKKLQQLHIPQYDEYKLEKVIVEAKKIDLFSGNQRMTNTEFFLNQLGFISKKVW SLKAVFSIFILYLILAKDVYSNNWVWTLIAITGPILCLFNAKEICNVFQPGMFEIQMATK HSFSKVLMIRLIVFGIFDLLFLICSTIILSLVKKTVIWQIILYGIVPYEIMCFGCMYILN KCHEENILLYSTTWGICLSSSIIILKISGVDLFATCYFAIWLVFGLIAVSGIGVEIKQLL KKAGGNLNEISNGTFI >gi|226332920|gb|ACII01000099.1| GENE 53 66719 - 67258 193 179 aa, chain - ## HITS:1 COG:BH0263 KEGG:ns NR:ns ## COG: BH0263 COG1595 # Protein_GI_number: 15612826 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 20 172 16 184 187 63 25.0 2e-10 MHFRRKFVFENKKIQNLFLNEEKVEDLIRRYYSDIYKYCFFHLGNREIAEDITQEVFLKF LKSLDSYREYGKLKNYLYVVAKNTIRDFQRKKYEITEEPEVEASCDGALDNIPLQIDIWR EIQMLDPLDKELIILRYYQNFKVKDIAQIVNIPPSTVGYKIKKIEKILKNRLGDIGYER >gi|226332920|gb|ACII01000099.1| GENE 54 67432 - 67629 182 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579651|ref|ZP_04856920.1| ## NR: gi|253579651|ref|ZP_04856920.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 65 22 86 86 122 100.0 5e-27 MEKNDKPNNLRGFCTITLEDVFCLKNIAHSINAAFKEDWERAVMDEYQKKLNYTRDRKQQ QRTGR >gi|226332920|gb|ACII01000099.1| GENE 55 68002 - 69672 2045 556 aa, chain + ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 1 553 1 555 561 662 58.0 0 MEKRWWKESVVYQIYPRSFCDSNGDGIGDLNGITGKLDYLKELGIDVIWLSPVYKSPNDD NGYDISDYQAIMDEFGTMEDFDRMLATAHEKGIKIMMDLVVNHTSDEHKWFIESRKSTDN PYRDYYIWRPAKEDGSLPNNWGSCFSGPAWEYDKTTDMYFLHLFSKKQPDLNWDNPAVRQ DVFDMMNWWLKKGVDGFRMDVISLISKEPGLPDKEPGINGYATFNVSANGPHVHEYLQEM RQKALNNADTITVGECSGVTLEEAKKYARSDEKELNMVFQFEHMDVDSDEKAGKWTTRKM DLRNLKKILTRWQKGLQDIAWNSLYWENHDQPRSVSRFGNDSDEYREISAKMLATCIHMM QGTPYVYQGEELGMTNCPFNTLDNFRDLESINAFHELTEQGKMTEEDMMAAIGYKGRDNA RTPMQWDDSAYAGFSTANPWIMVNPNYTKINAKDQINREDSVFKYYQKLIKLRHESELIV YGTYDLILDDDKDIYAYIRTLGDEKLIVYCNFSENTREVELPEEFTNGKVLISNYIDAKV NHKITLRPYEAIVIQK >gi|226332920|gb|ACII01000099.1| GENE 56 69880 - 70935 851 351 aa, chain + ## HITS:1 COG:VCA0132 KEGG:ns NR:ns ## COG: VCA0132 COG1609 # Protein_GI_number: 15600903 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 1 337 1 333 334 151 29.0 2e-36 MITIKEMAEMLGISTTTVSNVINGKTSEVSQKTAEKVQKLLDEYDYVPNMNAKNLAQNHS RLIGIVLKRRKDKYENIFTDPFHGELLGALESAIREQGYYMMIYISEDIEEIVRNIVGWN TEGLILIGMLHDDYLKIRSKYKKPAVLIDSYTPKNIARYVNIGLDDEEGGYLMTKYLLDC GHRKIAFLADNMEGVDYIRYTGHQRALQEYGLDIDLDNLIVIRPSKYERQGSMEEIYAVA HKFTAFMCCSDYYAVTLMKYLKAKGIRFPEDLSITGFDDNLYAQLACPSLTTIHQDIFSR GTIAAEYLFKMIDGWNPKTTNLSLPVRLVVRDSVKILNPPEDDSSDDGSAL >gi|226332920|gb|ACII01000099.1| GENE 57 70916 - 71113 59 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTVQLYKALRDICRGISRSDEPGAICIIYEDCYEKIAIIHRYKKNENPGTYHTWISVLFS SATSQ >gi|226332920|gb|ACII01000099.1| GENE 58 71100 - 72764 1979 554 aa, chain - ## HITS:1 COG:BS_yugT KEGG:ns NR:ns ## COG: BS_yugT COG0366 # Protein_GI_number: 16080181 # Func_class: G Carbohydrate 
transport and metabolism # Function: Glycosidases # Organism: Bacillus subtilis # 1 522 1 517 554 417 43.0 1e-116 MVKKWWNEKVAYQIYPKSFYDTNGDGIGDLRGVIEKLDYLKDLGVDIIWLSPCYRSPLAD QGYDISDYYDIDPRFGTMADMDELIAEAKKRDMYILMDLVVNHCSDEHEWFKKACEDPDG KYGNFFYLRDKKEGELPTNWRSYFGGPVWEDLPGTNKQYLHVFHKKQPDLNWENPEVREE VYKNINWWLDKGLGGFRIDAIINIKKALPMHDYEPDREDGLCSIRKMLKEAKGIGDFLGE MRDRTFKKYDAFSVGEVFDEKDEEIPDFIGDNGYFSSMFDFEETIWGASDKGWYDCKQIT PDAYKKCCFTTQRKIGDIGFVSNIIENHDEPRGVSRYIPEGDCCDASKKMLGGLNFMLRG LPFIYQGQELGMENVKFESIDQVDDISSLDEYKVALEAGCTPEEALKAVSRFSRDNARTP MQWTDGENAGFTTGKPWLKVNANYTKINAESQMNDPESVRSFYKKLIALRKNPEYKETVV YGELEPVWEDVHNLMAYYRKGDKTLLVVGNYQKEPQTIELAGECRKVLINNYDNVAMDGN KIELQGYQFLVIEM >gi|226332920|gb|ACII01000099.1| GENE 59 72960 - 73793 735 277 aa, chain - ## HITS:1 COG:BH1906 KEGG:ns NR:ns ## COG: BH1906 COG2207 # Protein_GI_number: 15614469 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 5 273 36 290 299 106 27.0 5e-23 MVKHHWHEEVELLYFSGGEFCLEVNMERFNISEECFYFINPGELHSISVEKNGGCVENAV VFHMDMLNSAYALDQIQTNLIQPVQNGSIQFPRCLRKNEPAFEPVRKSYEEIRNVFGGDS FRENYYDGEAVTDDITRQLFIKSALLRILAVLSANNLFEVTEKNHDHRIETIKTTLTYIQ ENYKEKIYIRDLAGLIGMNEQYFCRFFKKVIGRSPMEYVNEYRIKRAIHYLKESDLTVTE ICLECGYNNLGNFLREFRKYTSTTPLQYRRHSLKETQ >gi|226332920|gb|ACII01000099.1| GENE 60 73983 - 75104 807 373 aa, chain - ## HITS:1 COG:BH3845 KEGG:ns NR:ns ## COG: BH3845 COG1653 # Protein_GI_number: 15616407 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 69 304 129 371 436 60 24.0 7e-09 MHIDADNKRFQDFIEETEKKLDMEITVEKCPSNADNRQAKISTILASGDKNIDLISVNDE MISEFKHKGYLEPLEKNVMTEEVRDSFPQKYLESICEENGHIYSVPFQMDIMMFWVNQEL LKQAGLDEIKNTGDFDILQKSLKNPDQYAYGDAWENTHVYNSLSQFVNFWGGDYFDWTDP KTKDAARYMKSMLDEKNTSPAQFVDQYEQMEEKFIRGKYGCVFMYTSALGTFLDADVYGK NKIHMAPLPVLHKKKATNIATWQYVLNNASENKDAAVRFLQYAAGYEGSTEYAKIMKSYP ARVDVIENEDIDLEGIDMIRKYLREYTLNARPLCKNSMGAVSDMGKLFQGYVLGQCEEDS 
FFEQAQNCINTYY >gi|226332920|gb|ACII01000099.1| GENE 61 75249 - 76358 1133 369 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3517 NR:ns ## KEGG: EUBREC_3517 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 369 1 370 370 676 84.0 0 MYQSFKGGLGGIALAKHGRSRAINAENPHGEKGKGGMAASHLGPSRKGSPCLNDIEPGAT VTLAEMEGPGEINHIWITVDNKTTDADCFVLRDLVIRMYWDDEETPSVESPLGDFFCCGF GRECIVNSVPMAVVPSRGFNCYFPMPFKKKAKITLENQHANKIPAFFYQVDYCLYDELPD DITYFHAQWRRERLTEKQKDYTILDGVKGKGHYVGTYIALTTLERYWWGEGEMKFYIDGD DEYPTICGTGTEDYFGGSWSFAKQVDGKTVEQNYNTPYLGYPYYSAHDELIHNFYHNDDC PPMRGFYRWHIQDPICFDEDLRVTIQQIGVGYRGLFERQDDVASVAYWYQTHPHAPFAPL MSKEDRWPR >gi|226332920|gb|ACII01000099.1| GENE 62 76394 - 77251 871 285 aa, chain - ## HITS:1 COG:DR1436 KEGG:ns NR:ns ## COG: DR1436 COG0395 # Protein_GI_number: 15806453 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Deinococcus radiodurans # 20 285 17 283 283 144 36.0 3e-34 MKKDSRKYNIYHGVMAGGRWVLFAIVSFIILFPVYWIFISSITPSGELFKTPIDYLPDHP TLESYKFLIENVGLLSKVGNTVLIVGLTLVISTVLCAMAAYGFSRFTTKGINIAFAFIIA TMLIPEVVTARPLYEFMQKVKLYDTYPGLILLYISTVIPFTVLILRNFVGEIPISLEEAA AIDGATFSQRLFTIVLPLMKPAIATVCIINFITCLNNFFTPLFYSNGIQVLSVSIVQLPL RDNMYGVPWDLVSAMGWIILLPIIIFVAVFEKQIMDGIMAGGVKA >gi|226332920|gb|ACII01000099.1| GENE 63 77265 - 78143 703 292 aa, chain - ## HITS:1 COG:BH1245 KEGG:ns NR:ns ## COG: BH1245 COG1175 # Protein_GI_number: 15613808 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 49 287 194 433 445 162 36.0 9e-40 MTISKKSMKWIPWLLVLPVVIIRGFTTLYPILMTVRNSFFDIKILSGINEFCGIQNYLKI FSDPKVLTSIRFTVIFVVVSMIFHVILGVGLALILNMKFKGRRFLRTIVLIPWAMPAVVV GMAAKWAFNNDYGLINDFIRLFVHGYQNSWLINTGSARAAVIAVDLWKDLPFFAILVLSG LQFISGDIYEAAKVDGANGIQCFFRITLPLILKNVLTLSIPFTLWRLTTFDIVYSMTSGG PGEDTALIAYRITTEAFTNLNVGYASALAMLLFVVMAVFSWLNIRVMNKIDN >gi|226332920|gb|ACII01000099.1| GENE 64 78267 - 79502 1440 
411 aa, chain - ## HITS:1 COG:BH0905 KEGG:ns NR:ns ## COG: BH0905 COG1653 # Protein_GI_number: 15613468 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 53 408 69 440 445 63 26.0 7e-10 MRKKVLSAVLAATMVAGMMGNVVSVSAEEKYDLTLYSVNTTDPDFDDWLANVEEATGLNI NVIAAPTDTDTRQQKITTILSTGDSSVDIVEINDEMSAAFKNSGWLEGLNDTVMTDDIID QFAQGYIQDMITDKDGNIVGVPGYTGYLAFWVNQEIMDEVGIESIDTKEDFMKYMEAVSK DGRFGYGGSWEKTYASNEIAQFVNMFGGDYFDWTKEENKEAIQFLHDLVANKETPVDQIA DKYDQMNPKINDGKYGCWFMWGLGTDYAKADMLGADKIHMAMVPDFSGNGERAIFTDSWN YVLNSASKNKDAAIKFLQYMADEGGMEASYKAFDRYPARKDVAEKVVPDTDPAKEMYSQY AEECNVQGRPLVAQTMEFISDMGTIFQSCMKDEITVDEFCKKAQEVVDTYQ >gi|226332920|gb|ACII01000099.1| GENE 65 79776 - 81620 1085 614 aa, chain + ## HITS:1 COG:BH2110 KEGG:ns NR:ns ## COG: BH2110 COG2972 # Protein_GI_number: 15614673 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 24 600 16 576 585 131 22.0 5e-30 MKLNLKKINYHKWIPSRFSSFQAKLLIAFLVATLLPLTLVALVSYHVSYNLARDRITNSA LMSDEQLIFQLNSRLNQTENVADTIQYQMYAFEHSSPDQVGTMQALTSMRSNISLYKSTF DFFHIYVFLKPEQTGADEGIYFFSTDKLFNYGISESELHSIGSSSLWLLKKNTTLPRVIS ASKRSANTLLCFRALQDKSSNVLEYAYCIALDCDEFSRYLQTSASDTAISSYILTPQGQI AAHSDSSRNQTWISDSIKQMFLENTNSFFKDNDIYYNCRTLDNGWLHITEIPENYIKSNT QVLIKTLLITVIISLPLTIFIVILFSRKLTSRIAALSYAMESFHLGHDPEHLSIITLPHP ADASNYDEIDNLGITFEDMQHTIAQNLKSIVSLSINEERLKYQLLQSQINPHFLYNILGT IRTCQALGKLDIADQMLTNLTAFYRLTLRKSKELIPIKDELEIARLYLEMEKLCHKDNLN WEINAEDGIENFTICKFTLQPFLENSILHGLSADTPEVFISIQVLYGDDTVVISIEDNGA GMSPETLEQLRYAIDNKVINYEKHFGISNVSARISNPLYGNGSVRIDSHPGNGTYVTIEF QQMEDTDEENNDCR >gi|226332920|gb|ACII01000099.1| GENE 66 81607 - 83199 1352 530 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding 
domain # Organism: Bacillus halodurans # 1 524 5 521 525 133 24.0 9e-31 MIVDDNYLSAEGIEKNIDWEVLNAEIVHICYNGTSAIDAMKKEPVDLIISDIEMPDLDGI SMSRQALDINPMVKIILISAYDKFEYARRALLLGALDYIEKPLDYAYLIQKVKNAFALME REQKNLQLLKQSRPLLIEKFFRDITHRSRQEASYRLKPYLNYLNLKLDYDFYNVVILELE NAQSLKDSYGIEKFQMELLNLRDLITEQTSELNYIYPLQDIDGYLCVIGQSGCQSGNFRQ LTYKIISALVDNCNSQGLVLNAGIGAIVQDLWDLNRSYESAVHALDYRFFFPHQNVFDGR EALGHDLSATDFSDSREEELIRLLCKKDVSAIDSWFQDFSRWLTENFRTKSFTFIQIYSL LGRILKFFYELNLDTHDLEAKILYVYNHLEEFHTTEELCGWLNELCRLLCRKLDSSLNDY HQKLCESVTSYIDENYASNTLCLNDIASEANVSPAYLSALFKKKQGVSISDYITTQRINA ACRYLTATNMTLKEISMKCGYANQYYFSTSFKKKMNVTPSAYRDTQTSKS >gi|226332920|gb|ACII01000099.1| GENE 67 83380 - 84186 730 268 aa, chain + ## HITS:1 COG:lin1165 KEGG:ns NR:ns ## COG: lin1165 COG4822 # Protein_GI_number: 16800234 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiK, Co2+ chelatase # Organism: Listeria innocua # 6 264 2 258 261 196 40.0 5e-50 MGVTEKTAILVISFGTSYEETRKKTIEQIESDLHHAFPEYPLYCAWTSPRIRAKLRKRDG IHIMDIDEAMTQLKADGIRNVVVQPTYVITGFESDAMKEKVLAHKKDFDSVIICDSLMVT KQDKEEVCQAMAQEYHPDSDEILLFMGHGTEHVANELYPEMDELFKHFGYSNMHMGTVEG DFSIESFLDKLKNLHPAHVHLAPFMIVAGDHATNDMSGEDDDSWKSILEKEGYSVKCTLK GLGEIQAVRDIFIRHTRAGLDRLSEIQA >gi|226332920|gb|ACII01000099.1| GENE 68 84373 - 84882 424 169 aa, chain - ## HITS:1 COG:CAC3074 KEGG:ns NR:ns ## COG: CAC3074 COG5652 # Protein_GI_number: 15896325 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Clostridium acetobutylicum # 14 164 8 161 161 74 31.0 7e-14 MKILITLLKPLSFLPALLMMYIIFSFSAQPGEVSGNLSYKISYEIIETKSELLSENLSSE EIAYKADGIHYYVRKAAHMTEYFLLAIAISFPLYVYGVRGIWLMLLAGIVCVGFAGLDEY HQSFVDARTPAIKDVGIDSIGAFIGILLVQAFCWSVLNNPNKKKKKRSH >gi|226332920|gb|ACII01000099.1| GENE 69 84879 - 85361 522 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579665|ref|ZP_04856934.1| ## NR: gi|253579665|ref|ZP_04856934.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 12 148 23 159 171 174 100.0 2e-42 MEKNENEELLEQSPAEIQQEDTQKIAKSGKEEPDTEEQKVSNGRFILWSLAGVYLLYTSY SLCKGYVTGEEGTSMGFMLAGIAFAAIGAGLLFFGIKNMLSEEKIKKAKAAKNAAMEAAA GGELKKEAASGQNRSMSIAERANMVKNLEDEEEDGENEKE >gi|226332920|gb|ACII01000099.1| GENE 70 85591 - 85899 130 102 aa, chain - ## HITS:1 COG:no KEGG:cce_0110 NR:ns ## KEGG: cce_0110 # Name: not_defined # Def: hypothetical protein # Organism: Cyanothece_ATCC51142 # Pathway: not_defined # 11 94 401 482 486 67 46.0 2e-10 MRVSRICAWNTSRLAYDGSGAVTRDWEDYSLCTFQTGKRYNCDLSASYNIGARYFIRELL KPLPVTERSLLEAKVPPVKRRTSCVYADLRKLHSEMEFLKAA Prediction of potential genes in microbial genomes Time: Sat May 28 20:09:59 2011 Seq name: gi|226332919|gb|ACII01000100.1| Ruminococcus sp. 5_1_39B_FAA cont1.100, whole genome shotgun sequence Length of sequence - 1567 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 124 58 ## gi|253579862|ref|ZP_04857130.1| transposase - Prom 181 - 240 5.4 - Term 317 - 352 4.2 2 2 Tu 1 . - CDS 587 - 1510 998 ## COG2267 Lysophospholipase Predicted protein(s) >gi|226332919|gb|ACII01000100.1| GENE 1 1 - 124 58 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579862|ref|ZP_04857130.1| ## NR: gi|253579862|ref|ZP_04857130.1| transposase [Ruminococcus sp. 
5_1_39B_FAA] # 1 41 9 49 163 81 97.0 1e-14 MQIISSYGVELRKQNIPVRQTLEIYRSAVSYLIGIYVQVWE >gi|226332919|gb|ACII01000100.1| GENE 2 587 - 1510 998 307 aa, chain - ## HITS:1 COG:lin1226 KEGG:ns NR:ns ## COG: lin1226 COG2267 # Protein_GI_number: 16800295 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Listeria innocua # 12 305 11 302 306 146 33.0 4e-35 MKYESSFKSEADGLEISVMALIPDKKPYRAVVQLVHGMSEHKERYIPFMQYLAKLGYVVV IHDHRGHGKSVKHQDDLGFTYGGGAQAMLQDIRTVNRKIHAYYPELPLILMGHSMGSLAV RAFVAEHDSCVDMLIVCGSPSYNTAMPLGVAIAKTEKAVFGPRHRSKLIETMSFGAGAMK FRKDKRCTAWICSDPDVAKEYEESELCGFTFTDDAYLALFELMKRAYDVEHFSCTNPDMP VLFVSGAEDPCLINVRHFAKTVRAMRRAGYKDVKGKLYPGMRHEILNEIGREQVYHDIAV YMRKKGF Prediction of potential genes in microbial genomes Time: Sat May 28 20:10:04 2011 Seq name: gi|226332918|gb|ACII01000101.1| Ruminococcus sp. 5_1_39B_FAA cont1.101, whole genome shotgun sequence Length of sequence - 4012 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 42 - 101 7.7 1 1 Tu 1 . + CDS 297 - 1358 836 ## CD2029 putative lipoprotein + Term 1418 - 1473 3.1 + Prom 1366 - 1425 5.3 2 2 Tu 1 . + CDS 1555 - 1764 231 ## COG2155 Uncharacterized conserved protein - TRNA 1934 - 2006 65.6 # Gly GCC 0 0 - TRNA 2010 - 2080 77.1 # Gly TCC 0 0 - TRNA 2098 - 2170 65.6 # Gly GCC 0 0 - TRNA 2174 - 2244 77.1 # Gly TCC 0 0 - Term 2344 - 2381 0.3 3 3 Tu 1 . 
- CDS 2582 - 3508 1094 ## COG1045 Serine acetyltransferase - Prom 3739 - 3798 80.3 + TRNA 3711 - 3798 64.9 # Ser CGA 0 0 + TRNA 3909 - 3979 51.0 # Trp CCA 0 0 Predicted protein(s) >gi|226332918|gb|ACII01000101.1| GENE 1 297 - 1358 836 353 aa, chain + ## HITS:1 COG:no KEGG:CD2029 NR:ns ## KEGG: CD2029 # Name: not_defined # Def: putative lipoprotein # Organism: C.difficile # Pathway: not_defined # 36 214 24 205 205 77 25.0 1e-12 MTKGNKKGHAGLIIFILILILAIGGGTGFYFYQRQQPRKAVEHFLDSMKKMDFDTMESLI QSSDLTALDNADIRDAAYTDFFSEINKKMTYKITKNRFDIQNGTASVTAHITYIDGTNIY KATITEFLRQIVSNAYAGNKLTEEETQAKLASILSEKAKKVEKDVFSETDITYPVIKTDS GWKIVSLDDETVKIMSANFKSVEEEINNSLNNMDNEDSSGSSSNAPEASADDTLNLTTEK FTIKYTKHTITKDFAGNPCIMVYYDYTNNSSSASSAMVDVSLKAYQHGESCEAAIPENND DAIDHFTAEIQPGQTVNVCQAFTLTDESDVTVQAQEAFSFDEDATARQILKVK >gi|226332918|gb|ACII01000101.1| GENE 2 1555 - 1764 231 69 aa, chain + ## HITS:1 COG:CAC0976 KEGG:ns NR:ns ## COG: CAC0976 COG2155 # Protein_GI_number: 15894263 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 4 62 2 60 69 65 59.0 2e-11 MGSKCLRYTALTISIIGAVNWGLIGFFNLNLVALLFGSMSWISRIIYGLVGICGLCLLTF YGDSDTESE >gi|226332918|gb|ACII01000101.1| GENE 3 2582 - 3508 1094 308 aa, chain - ## HITS:1 COG:BH0110 KEGG:ns NR:ns ## COG: BH0110 COG1045 # Protein_GI_number: 15612673 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Bacillus halodurans # 116 292 4 173 229 159 46.0 5e-39 MTDKKKTILEAVEKLTDTYRKEELFLGKDRERLPNKKEIINFIKDMRSIIFPGYFSVDSS ASVFPEHYVAYRLNDLYDCLQEQIEIAFLYQGEEEQKAKEHAEQITERFFANVPEIQRML LTDLQAGFDGDPAAKSKEEIIFSYPGFYAIYVYRLAHVLYLENVPFIPRIMSEYAHGYTG IDINPGATIGEYFFIDHGTGVVIGETTEIGKNVKLYQGVTLGALSTRQGQLLANVKRHPT IRDNVTIYSNSSVLGGETVIGENTIIGGNTFITASIPANTKVSAKSPELVIKKPRSSVEA TNVWDWEN Prediction of potential genes in microbial genomes Time: Sat May 28 20:10:10 2011 Seq name: gi|226332917|gb|ACII01000102.1| Ruminococcus sp. 
5_1_39B_FAA cont1.102, whole genome shotgun sequence Length of sequence - 2724 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 34 - 67 -0.9 1 1 Op 1 . - CDS 146 - 484 390 ## COG3943 Virulence protein 2 1 Op 2 . - CDS 518 - 724 111 ## gi|253579672|ref|ZP_04856941.1| predicted protein - Prom 744 - 803 1.6 3 2 Op 1 1/0.000 - CDS 821 - 1381 529 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 4 2 Op 2 . - CDS 1439 - 1939 491 ## COG0778 Nitroreductase - Prom 1985 - 2044 7.2 5 3 Tu 1 . - CDS 2434 - 2601 99 ## gi|253579675|ref|ZP_04856944.1| predicted protein - Prom 2649 - 2708 3.7 Predicted protein(s) >gi|226332917|gb|ACII01000102.1| GENE 1 146 - 484 390 112 aa, chain - ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 4 112 13 126 345 68 31.0 3e-12 MKNELILFETADKEIKLNVSVKDSTVWLNRNQLAELFERDVKTIGKHINNALKEELDKAV VAKFATTALDGKIYQVEHYNLDMIISVGYRVKSNRGVEFRKWANKVLKQYII >gi|226332917|gb|ACII01000102.1| GENE 2 518 - 724 111 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579672|ref|ZP_04856941.1| ## NR: gi|253579672|ref|ZP_04856941.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 68 5 72 72 108 100.0 1e-22 MKLFMNKNEKYEEPEVHMFFHKQTKEVEQVEKFVRSLSITVPVQYEDEIYKLYPLVWIPE LLWIWMLP >gi|226332917|gb|ACII01000102.1| GENE 3 821 - 1381 529 186 aa, chain - ## HITS:1 COG:CAC0446 KEGG:ns NR:ns ## COG: CAC0446 COG0494 # Protein_GI_number: 15893737 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Clostridium acetobutylicum # 2 155 4 151 206 89 35.0 3e-18 MELFDVIDSKGNPAGQIVSREKAHAEGIPHRTAHIWIIRKKEGRVQILLQKRSQNKDSFP GKFDTSSAGHIQAGDEPLESALRELKEELGISATPEQLHFAGTFPISFAKEFHGKMFRDE EIAFVYIYQEPVNTAELVLQTEEVEEVQWFDLEEVYEQCGKRREIFCVPEGGLGVVRSYL KSIEKN >gi|226332917|gb|ACII01000102.1| GENE 4 1439 - 1939 491 166 aa, chain - ## HITS:1 COG:TM0383 KEGG:ns NR:ns ## COG: TM0383 COG0778 # Protein_GI_number: 15643149 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Thermotoga maritima # 1 163 1 163 166 137 44.0 8e-33 MNSIFHRISVRKYEDKPVEKEKIMEILKAGMQAPSACNQQPWEFYVVTDKEKIQKLSKVT PYTGCAAGAPAVIVPVYHTEGLVAPSFAQIDMSIAQENIWLETDTLGLGGVWFGIAPVEE RMEEVHRLLGLPENVKVFSMFALGYPAETRPQQDRFDPERIHFIEK >gi|226332917|gb|ACII01000102.1| GENE 5 2434 - 2601 99 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579675|ref|ZP_04856944.1| ## NR: gi|253579675|ref|ZP_04856944.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 55 1 55 55 99 100.0 6e-20 MTKDVFIKEVRDAEAMLYHISKSILKNDSDCGDAVQETILNEKNGSNAPFMVIIS Prediction of potential genes in microbial genomes Time: Sat May 28 20:10:39 2011 Seq name: gi|226332916|gb|ACII01000103.1| Ruminococcus sp. 5_1_39B_FAA cont1.103, whole genome shotgun sequence Length of sequence - 71924 bp Number of predicted genes - 66, with homology - 60 Number of transcription units - 31, operones - 19 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 5 - 322 246 ## gi|253578255|ref|ZP_04855527.1| predicted protein 2 1 Op 2 . 
+ CDS 348 - 539 89 ## gi|253579677|ref|ZP_04856946.1| predicted protein + Term 745 - 781 -0.2 3 2 Tu 1 . - CDS 554 - 667 102 ## - Prom 702 - 761 4.7 - Term 730 - 770 6.1 4 3 Op 1 8/0.000 - CDS 849 - 2276 1610 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase - Term 2292 - 2330 4.4 5 3 Op 2 7/0.000 - CDS 2347 - 4233 2503 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 4262 - 4321 3.8 - Term 4282 - 4310 -0.9 6 3 Op 3 . - CDS 4357 - 5208 901 ## COG3711 Transcriptional antiterminator - Prom 5246 - 5305 9.3 7 4 Tu 1 . - CDS 5759 - 5962 113 ## gi|253579681|ref|ZP_04856950.1| predicted protein - Prom 5983 - 6042 5.6 + Prom 6575 - 6634 9.6 8 5 Op 1 . + CDS 6709 - 7923 933 ## COG3328 Transposase and inactivated derivatives + Prom 7951 - 8010 4.2 9 5 Op 2 . + CDS 8074 - 8274 80 ## - Term 8253 - 8299 5.7 10 6 Tu 1 . - CDS 8358 - 10466 2229 ## COG0855 Polyphosphate kinase - Prom 10502 - 10561 7.5 - Term 10768 - 10813 5.4 11 7 Op 1 . - CDS 10822 - 11781 1044 ## COG0053 Predicted Co/Zn/Cd cation transporters 12 7 Op 2 . - CDS 11858 - 12052 100 ## gi|253579685|ref|ZP_04856954.1| predicted protein - Prom 12212 - 12271 5.8 13 8 Tu 1 . - CDS 12273 - 13682 1472 ## COG0534 Na+-driven multidrug efflux pump - Term 14133 - 14178 6.2 14 9 Op 1 38/0.000 - CDS 14201 - 15061 999 ## COG0395 ABC-type sugar transport system, permease component 15 9 Op 2 35/0.000 - CDS 15061 - 16113 1172 ## COG1175 ABC-type sugar transport systems, permease components - Term 16219 - 16257 5.0 16 9 Op 3 . - CDS 16271 - 17551 1522 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 17614 - 17673 6.8 - Term 17867 - 17920 16.2 17 10 Op 1 . - CDS 17947 - 18561 860 ## CLH_1589 hypothetical protein 18 10 Op 2 . - CDS 18470 - 18697 86 ## - Prom 18853 - 18912 3.0 - Term 18878 - 18934 13.5 19 11 Tu 1 . - CDS 18949 - 20121 1642 ## COG0281 Malic enzyme - Prom 20158 - 20217 5.1 - Term 20209 - 20273 7.0 20 12 Op 1 . 
- CDS 20321 - 22087 1743 ## gi|253579694|ref|ZP_04856963.1| conserved hypothetical protein 21 12 Op 2 . - CDS 22148 - 22744 553 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 22929 - 22988 8.7 - Term 22997 - 23026 -0.4 22 13 Tu 1 . - CDS 23064 - 23570 565 ## CA_C3602 HD superfamily hydrolase - Prom 23629 - 23688 4.5 - Term 23641 - 23676 5.1 23 14 Tu 1 . - CDS 23691 - 24848 1592 ## COG0426 Uncharacterized flavoproteins - Prom 24919 - 24978 7.8 24 15 Op 1 . - CDS 25027 - 25962 660 ## Mhun_1221 hypothetical protein 25 15 Op 2 . - CDS 25977 - 26162 88 ## - Prom 26210 - 26269 3.1 - Term 26217 - 26262 7.3 26 16 Op 1 . - CDS 26375 - 26809 376 ## gi|253579700|ref|ZP_04856969.1| predicted protein 27 16 Op 2 . - CDS 26800 - 27153 204 ## gi|253579701|ref|ZP_04856970.1| predicted protein - Prom 27226 - 27285 9.4 28 17 Op 1 21/0.000 - CDS 27311 - 28177 915 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs - Prom 28200 - 28259 1.7 29 17 Op 2 . - CDS 28265 - 29308 1071 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs - Prom 29537 - 29596 8.4 30 18 Op 1 . - CDS 29719 - 30072 197 ## gi|253579704|ref|ZP_04856973.1| predicted protein 31 18 Op 2 . - CDS 30158 - 30259 76 ## - Prom 30396 - 30455 5.3 - Term 30387 - 30438 9.4 32 19 Op 1 8/0.000 - CDS 30462 - 32225 203 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 33 19 Op 2 . - CDS 32272 - 34014 197 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 34 19 Op 3 . - CDS 34011 - 34358 164 ## VFMJ11_A1052 hypothetical protein 35 19 Op 4 . - CDS 34360 - 34764 279 ## COG2832 Uncharacterized protein conserved in bacteria - Prom 34865 - 34924 5.9 - Term 34985 - 35020 -0.7 36 20 Op 1 . - CDS 35028 - 35852 582 ## gi|253579709|ref|ZP_04856978.1| predicted protein 37 20 Op 2 . 
- CDS 35896 - 36801 535 ## COG1321 Mn-dependent transcriptional regulator - Prom 36824 - 36883 3.2 38 21 Tu 1 . - CDS 36885 - 37073 73 ## 39 22 Tu 1 . - CDS 37364 - 37690 77 ## COG3344 Retron-type reverse transcriptase - Prom 37776 - 37835 4.0 40 23 Op 1 . - CDS 37866 - 38015 81 ## gi|253579712|ref|ZP_04856981.1| predicted protein 41 23 Op 2 8/0.000 - CDS 38046 - 39155 1008 ## COG0524 Sugar kinases, ribokinase family 42 23 Op 3 . - CDS 39155 - 39793 410 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase 43 23 Op 4 1/0.000 - CDS 39821 - 41263 828 ## COG1012 NAD-dependent aldehyde dehydrogenases 44 23 Op 5 1/0.000 - CDS 41288 - 42280 524 ## COG1879 ABC-type sugar transport system, periplasmic component 45 23 Op 6 2/0.000 - CDS 42305 - 43399 886 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 46 23 Op 7 . - CDS 43448 - 44464 798 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 47 23 Op 8 3/0.000 - CDS 44513 - 45475 412 ## COG0524 Sugar kinases, ribokinase family - Prom 45563 - 45622 6.0 48 23 Op 9 . - CDS 45648 - 46445 205 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 46472 - 46531 11.1 49 24 Op 1 . 
- CDS 46565 - 47626 781 ## SCO3481 hypothetical protein 50 24 Op 2 2/0.000 - CDS 47604 - 50699 898 ## COG3250 Beta-galactosidase/beta-glucuronidase 51 24 Op 3 38/0.000 - CDS 50707 - 51540 446 ## COG0395 ABC-type sugar transport system, permease component 52 24 Op 4 35/0.000 - CDS 51540 - 52427 564 ## COG1175 ABC-type sugar transport systems, permease components - Term 52437 - 52481 7.4 53 24 Op 5 2/0.000 - CDS 52493 - 53815 1440 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 53839 - 53898 10.3 54 25 Op 1 2/0.000 - CDS 53924 - 55222 525 ## COG1653 ABC-type sugar transport system, periplasmic component 55 25 Op 2 7/0.000 - CDS 55212 - 56666 505 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 56 25 Op 3 . - CDS 56659 - 58308 694 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 58392 - 58451 5.4 - Term 58344 - 58390 -0.9 57 26 Tu 1 . - CDS 58490 - 58756 218 ## COG4977 Transcriptional regulator containing an amidase domain and an AraC-type DNA-binding HTH domain - Prom 58777 - 58836 1.9 + Prom 59230 - 59289 6.7 58 27 Tu 1 . + CDS 59333 - 61753 993 ## COG3250 Beta-galactosidase/beta-glucuronidase - Term 61750 - 61795 2.6 59 28 Op 1 7/0.000 - CDS 61843 - 63357 305 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 60 28 Op 2 . - CDS 63332 - 65086 467 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 65181 - 65240 7.8 - Term 65208 - 65273 15.1 61 29 Op 1 11/0.000 - CDS 65275 - 66237 527 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 62 29 Op 2 . - CDS 66237 - 67205 622 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 63 29 Op 3 . 
- CDS 67219 - 67416 198 ## gi|253579735|ref|ZP_04857004.1| D-ribose transport system ATP-binding protein - Term 67467 - 67504 2.1 64 30 Op 1 16/0.000 - CDS 67635 - 69137 170 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein - Term 69150 - 69185 6.1 65 30 Op 2 . - CDS 69221 - 70222 874 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 70264 - 70323 7.8 + Prom 70234 - 70293 11.4 66 31 Tu 1 . + CDS 70539 - 71195 413 ## CKR_3341 hypothetical protein + Term 71279 - 71330 2.1 Predicted protein(s) >gi|226332916|gb|ACII01000103.1| GENE 1 5 - 322 246 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253578255|ref|ZP_04855527.1| ## NR: gi|253578255|ref|ZP_04855527.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 105 1 105 143 178 92.0 8e-44 MAGYGNHRIGEVTNLNGNKIVITESIVSYSLGINAINFTYEYVNGKFVPTSKYGSYKEIY SADGSSRYFTVNSNLPAYARLGATAVNTTLKTGSLTKIIKCALIN >gi|226332916|gb|ACII01000103.1| GENE 2 348 - 539 89 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579677|ref|ZP_04856946.1| ## NR: gi|253579677|ref|ZP_04856946.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 63 1 63 63 112 100.0 6e-24 MEKSIGLKRWKIHRFLIAKDNLWKLDMQDDIIEKKVSYSHNIRNKLACTFQFKTLKTGSV IDN >gi|226332916|gb|ACII01000103.1| GENE 3 554 - 667 102 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPDEKVLQVAILIWGCVFCAIAALCMRAFQVLCKQKN >gi|226332916|gb|ACII01000103.1| GENE 4 849 - 2276 1610 475 aa, chain - ## HITS:1 COG:CAC1405 KEGG:ns NR:ns ## COG: CAC1405 COG2723 # Protein_GI_number: 15894684 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 6 475 2 472 473 608 61.0 1e-174 MKNYNRMPENFLWGGATAANQYEGGWDEDGRGPSIADILTGGSVDKERRLTPPAPLPEEF YPNHQATDFYHHWKEDIALFAEMGFKVYRFSISWSRVFPNGDEETPNEEGLKFYDNVIDE LRKYNIEPLITISHYENPLHLSLEYGGWKNRKMIDFYLRFAKVLFERYKGKVKYWLTFNE INMLTQDFGAVFCAGMLDPKDVCEQSRYQAMHHQLVASALAVSMAHEIDPKFMLGCMLAY HNGYPYTCHPDDILYAQQFGQIHNSIAGDVHVRGYYPGFAARYFEEHGIQLEVLQEDKEI LKKGTVDFFTISYYSSSCVSVTKDGEKTAGNGSDNLKNPYLKASDWGWQIDATGLRYVLN QIYDRYQIPIMIVENGLGAVDQLTEDGKIHDDYRIEYMRRHIEQMKEAIHDGVDLIGYTC WGCTDLVSASTGEFKKRYGLIYVNKNDDGTGDFSRIRKDSFYWYKKVIESCGEEL >gi|226332916|gb|ACII01000103.1| GENE 5 2347 - 4233 2503 628 aa, chain - ## HITS:1 COG:SPy0572_2 KEGG:ns NR:ns ## COG: SPy0572_2 COG1263 # Protein_GI_number: 15674662 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Streptococcus pyogenes M1 GAS # 92 449 6 364 364 262 44.0 1e-69 MKKDYAKTADTLIAALGGKDNITRLFHCMTRLRFYVKDRSKINEKEILKLSEISGVNWHE DQFQVIAGNEVNAVYKALEDKGVPTDDAPAANSDSSKSVVSKVIDAITGCMTPMIPALTA AGMIKVVLSLLTTFHLVSETSSTYQVISFIGDVTFYFMPFLIAANAAKVFRVNQSLALFI AGVYLSPTFVTMVAGDAAITLFGLPITKATYSYSVIPVILMVWITHYIEILVDKITPKMV KLILNPTLVILISAPIALIVVGPIGTIIGNGLAVAINFLSVKLGFIIVGILAATFPFIVM TGMHHALTPIGLNAIATGGTDTLIFVSQVCSNLAQSGASLAVAVRSKDSNMKQLASAAGV SALMGITEPALYGVTLKLKRPVVAASIAAGIGGIVGGLLQVSLYIAQNCIMAIPAFIGEK GLSNLIYGIIMIVVSFVAAFVLTLIFGFEDVKAETEDEVQNTDTEKQPAQQSVPLVDKIE 
LCAPVAGTVKALSDVPDKTFADKVLGDGAAIVPSEGKVYAPADGTVANIMDSKHGIMFVT ESGAEILIHIGLDTVNLNGKYFKSHVSDGDKVKKGKLLVEFDMDAIKKEGYDLITPVVVT NISDYIKAVCMEKEDAAVNAGDKFLTIV >gi|226332916|gb|ACII01000103.1| GENE 6 4357 - 5208 901 283 aa, chain - ## HITS:1 COG:BS_licT KEGG:ns NR:ns ## COG: BS_licT COG3711 # Protein_GI_number: 16080959 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 275 1 275 277 197 43.0 2e-50 MVIHKILNNNLILSRDGQGHEVIVKGCGIAFQKKKGQQVEESRIEKIFTAENAQISKEIQ GYLTAIPEKYLDFVEAFVNDVKEKEGMKLNDSIYFSLSDHIMGAISRLENGIRLQNMLLL DIKQLYHKEYEIGLELLKKVNEMFEVDLSADEAGFFALHFVNAEDGGKTDSYQISIIVHQ ILDIVEKYYDNMEFQENNLYYQRFLTHLKYFAQRFLHKELHYDENQELFKIIKEQYREAY GCVKMIYLMMEEEYKYAMTEDEMLYLTIHIQKITEDHKRLKNL >gi|226332916|gb|ACII01000103.1| GENE 7 5759 - 5962 113 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579681|ref|ZP_04856950.1| ## NR: gi|253579681|ref|ZP_04856950.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 67 1 67 67 114 100.0 2e-24 MKTVAGQQLLKDKIAVTLPIKMTPDEVKKLSADTSKGEYNDAENMWYFYDATYEITNNAK FTMPKAA >gi|226332916|gb|ACII01000103.1| GENE 8 6709 - 7923 933 404 aa, chain + ## HITS:1 COG:YPO0011 KEGG:ns NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 2 404 8 402 402 348 41.0 2e-95 MAVAKEQIRQIISENNINSVADIYTLLKDSFKDILQELMEAELDATLGYEKNHKGDLQTD NKRNGHSTKNLKSQYGEFQIDVPRDRNGEFEPKLIPKYQRDISGIEEKVISLYARGMSTR DIHDQLQDLYGIELSAEMVSKITDKILPQVKEWQSRPLNPVYPFVFMDCIHYKVREDGRI LSRAAYVVLGVTVEGYKDILSITAGANETSKFWLGMLNDLKNRGVKDVLFFCVDGLPGFK EAIQAVYPQAEIQRCVIHMLRNSFKYVNYNDLKKFSSDFKEVYNAPNETAALTELENMKE KWGKKYPYAISNWENNWEDVSSFFQFSNDIRRIMYTTYIIEGLNRQYRKVTKTKSVFPSD PALEKMLYLASENVVKKWTQRYRNWDQVLNQLIVLYGERLTAYL >gi|226332916|gb|ACII01000103.1| GENE 9 8074 - 8274 80 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHYLSCKKIENDVYNFPTWQYCKYKDIFDSRIIINSTAKFYISISYQKDLLSIDTRIFTP SQKDTI >gi|226332916|gb|ACII01000103.1| 
GENE 10 8358 - 10466 2229 702 aa, chain - ## HITS:1 COG:BH1392 KEGG:ns NR:ns ## COG: BH1392 COG0855 # Protein_GI_number: 15613955 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Bacillus halodurans # 1 694 1 703 705 449 38.0 1e-126 MKKQQKKTKEITVPVYMMNRELSWLKFNERVLNEAGNPRVPLAERLTFAAIYQSNLDEFY RVRVGTLMDQMEASEVIRENKTNMTSEEQVTAIIQATRDLDQKKAAIYEQLMGELEPYGV RLINFNRLSAEEGKLLETYFDQEIAPYLSANIVSKQQPFPFLKNKNIYAVASLMSKGGKT KTAIIPCSNTVFRRLIDIPTRKGTFMLSEELILHFLPKMFKSYEIREKALLRITRNADID TETIYDEDLDYRDAMENLIKQRRRMNPVRMELSRELNKKMISSLCKELHVDKEHVFLTRV PLDMSFVFGLQGYLRNTAQDKLFYQRRTPRMTPELDDKSALIPQIMKKDVLLSYPFENIR SFINLLNEAAKDDSVVSIKMTLYRLADKSQIVDALVEAAENGKEVVVLVELRARFDEESN IEYSRKLEEAGCRVIYGLNGYKVHSKLCLISRKTDKGVSYITQIGTGNYNEKTSALYTDL SLITANQEIGKEAAEVFAALLKGEVVEKSNLLLVAPKCLQNRVLDMIQEEIDQVKQGKEG YIGIKINSLTDKVIINKLVEASQAGVKIEMVVRGICCLIPEIKGYTENIKVVSIVGRYLE HSRIYRFGTKEREKIYIASADFMTRNTVRRVEVAAPVLNEKLKERLDWMFETMMNDDEKG KRLTETGNYADRSLNDVKLNSQEIFYAMAYSNAEKTQKKRED >gi|226332916|gb|ACII01000103.1| GENE 11 10822 - 11781 1044 319 aa, chain - ## HITS:1 COG:MA3366 KEGG:ns NR:ns ## COG: MA3366 COG0053 # Protein_GI_number: 20092180 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Methanosarcina acetivorans str.C2A # 34 316 10 295 341 199 38.0 7e-51 MEQTKKLNMTAMSDKTNSNTTAKKEEPAWEHKVAMKVSGVSIAVNLLLSLFKLLAGIVAH SGAMISDAIHSASDVGSTFVVIVGVNLSSKKSDKEHQYGHERMECVSSIILSGLLLATGI GIGMSGIENIIKSTSGASIAVPGTLALIAAVVSIVVKEWMFWYTRSAAKKINSGALMADA WHHRSDAMSSVGAFIGILGARLGFPILDPLASVAICVLIVKASVDIFRDAIDKMVDHSCD EATEESMREVIMGVKGVKGIDLLQTRLFGSKMYVDIEISADGEIPLNEAHDVAENVHHSI EKNFKNVKHCMVHVNPVNE >gi|226332916|gb|ACII01000103.1| GENE 12 11858 - 12052 100 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579685|ref|ZP_04856954.1| ## NR: gi|253579685|ref|ZP_04856954.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 8 64 1 57 57 106 98.0 6e-22 MFRRCVFLFSEKYVLERFSFRGIFLGGFSETGAVPKYKKTPVRPHFDKSQKGNMFSDCGA MYVT >gi|226332916|gb|ACII01000103.1| GENE 13 12273 - 13682 1472 469 aa, chain - ## HITS:1 COG:TP0901 KEGG:ns NR:ns ## COG: TP0901 COG0534 # Protein_GI_number: 15639886 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Treponema pallidum # 16 469 18 468 470 190 27.0 4e-48 MERTKERSSFRKKFIGDRKFYMMVLAVAVPIMIQNGITNFVSLLDNIMIGRIGTEQMSGA SIVNQLIFVYNLCIFGGVSGAGIFTAQYFGQKDHEGVRQTFRYKFWMAVILTIGTILLFL TVGENLISMYLQGEGTAQQIADTLNYGKQYLDIMLLGLPPFMMVQIYSSTLRECGETVLP MKAGVVAICVNLLFNYLLIYGVFFFPRLGVRGAAIATVLSRYVEAAIVIGWTHRHTEKNA FAKGLYSTLKVPANLTKKILVKGTPLLFNETLWASAMAMLTQCYSIRGLNVVASLNISNT INNVFNIVFIALGDSVAIIVGQLLGAGDMKKARDTDNKMIAFSVMCCTCVAMVMFVMAPL FPMLYNTNDEARELAKYLIMITAFFMPQNAFLHATYFTLRSGGKTIITFLFDSVFIWCVS VPIAFVLSRYTGIHVLVIYACVQLADLIKCVIGFVLVKKGVWLQNIVSS >gi|226332916|gb|ACII01000103.1| GENE 14 14201 - 15061 999 286 aa, chain - ## HITS:1 COG:Cgl2406 KEGG:ns NR:ns ## COG: Cgl2406 COG0395 # Protein_GI_number: 19553656 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Corynebacterium glutamicum # 6 286 31 303 304 179 38.0 6e-45 MVTKKTPKVVLTAVAIVIAAYILSPFYLVLINTFKNANGIVANPVSFAGASFAQFKTNIS NVVNNSNFSFWSAFGTSAIITVVSLVLLALFGGMAAWVICRNKTKWSTAIYMTFIASMTI PFQVVMLPLVSTFRDVGKFLGINMLQSVPGIVFAYCGFGGAMTVFILTGFIKGIPYELEE AAAIDGCSPEGTFFRIILPLLKPVIVTVTILNGMWIWNDYLLPSLMLGQNGKVKTLPVAV QAFVGSYVKQWDLILTAALLAMIPMIIVFLLAQKQIMNGMVEGAIK >gi|226332916|gb|ACII01000103.1| GENE 15 15061 - 16113 1172 350 aa, chain - ## HITS:1 COG:Cgl2407 KEGG:ns NR:ns ## COG: Cgl2407 COG1175 # Protein_GI_number: 19553657 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Corynebacterium glutamicum # 70 349 11 280 281 180 40.0 3e-45 MNTNSASKKAISMVCLIAGIILLAASLIQDISGSGAGWRGPALIIGAIGILLGMYLFPTL 
KHHRSIVNFIFVFPLLFTFVVTVIIPLCLGVFYSFTDWNGIKMTGFVGFANYKAMFNEPS FLWSLIITILFVVLNMVLVNVVAFMLALLCTSKVKGLSFFRASYFLPNLIGGIVLGYVWQ FVFNNVLIKMTNNISLLSKTNTAMMAIIVVYIWQYAGYIMLIYVTGLTQVPKDVLEASQI DGANAVTTLFKIKIPMIAATITICTFLTLTSAFKQFDVNMALTNGTGSVASFFGNYLTNG TQMLALNIYNTAISKNNYALGQAKAILFFIILAAVSIIQVRISNKKEVEM >gi|226332916|gb|ACII01000103.1| GENE 16 16271 - 17551 1522 426 aa, chain - ## HITS:1 COG:Cgl2408 KEGG:ns NR:ns ## COG: Cgl2408 COG1653 # Protein_GI_number: 19553658 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Corynebacterium glutamicum # 7 414 8 428 443 111 26.0 2e-24 MKKKVFATLLAVSMIASMVPAAVSVQAADGDSIRLVNGKIEVDAQLKKLAEMYEKETGTK VEIESMGGGIDIQGTLKGYYQSDNMPDIFVNGGATDFANWTDLLVDMSDQEWASDTDSAY VDESQGTIGFPYTTEAIGLAYNKDILDKAGIDPSTLTGPDAIKEAFETIDSKKDELGLTA VVGYAAEPVNLYWSTGNHLFGNYLDSGLDRDDTTYIDMLNDGGKVDEDRLTDFANFVGLL NQYSDPALLVSGTYDQQILNFSSGKYAFVTQGSWIGATMTGDDADAYKEAGNFEVGMIPY AFEDGIDTILTNSPSWWSVYKDGNVEAAEAFLQWLTTDEAQEVLVKEAGFVSPFKSCTIV GDDPFAQTITDYVSAGKTSAWHWLEQKEGLAQNYTGQVFADYASGSLDEAGFVKTLEQVI QSAYAN >gi|226332916|gb|ACII01000103.1| GENE 17 17947 - 18561 860 204 aa, chain - ## HITS:1 COG:no KEGG:CLH_1589 NR:ns ## KEGG: CLH_1589 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 1 193 1 193 196 281 75.0 9e-75 MNTQNSTITKLAETALMAALCYVSFTFLQIKIPVPGGDATSIHIGNAFCVLAALLIGGVY GGLSGAIGMGIADLMDPVYITGAPKTFVLKFCIGLITGLVAHKIAKINESTDKKYVFKWS VIASVAGLAFNVVADPIVGYFYKQYILGQPQELAQALAKMSAVATLFNAVVSVILVAVIY NAVRPVLIKSHLLTVTGKKKEQAA >gi|226332916|gb|ACII01000103.1| GENE 18 18470 - 18697 86 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTHFAYDNFNILMRQYIKNLIVRIKQAIWLIDMGLLSDHMRGDSYYEYSEQHNYKTCRDS SYGCIMLCIFYILAD >gi|226332916|gb|ACII01000103.1| GENE 19 18949 - 20121 1642 390 aa, chain - ## HITS:1 COG:SA1524 KEGG:ns NR:ns ## COG: SA1524 COG0281 # Protein_GI_number: 15927279 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: 
Staphylococcus aureus N315 # 1 390 1 391 409 461 59.0 1e-129 MDYAKESLKMHYDLKGKIEVVSRAKVDSKDALSLAYTPGVAQPCLEIQKDVNKSYDLTRR WNTVAVVTDGTAVLGLGDIGPEAGMPVMEGKCVLFKEFGGVDAIPLCVRSKDVDEIVKTV SLLAGSFGGINLEDISAPRCFEIEKKLKECCDIPIFHDDQHGTAVITLAGVINALKLVGK KLDEVKIVTSGAGAAGIAIIKLLMSMGLKNVIMTDRKGAIYEGREGLNPIKEEMAKITNF NKEKGTLAEVIKGADIFIGVSAPGTLTQDMVRSMAKDPIIFACANPVPEIFPDEAKAAGA AVVSTGRSDYPNQVNNVLCFPGLFRGVLDARASDVNDEMKVAAAYAIAGLVSDEELTAEY ILPAAFDPRVKDAVAKAVAEAARKSGVARI >gi|226332916|gb|ACII01000103.1| GENE 20 20321 - 22087 1743 588 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579694|ref|ZP_04856963.1| ## NR: gi|253579694|ref|ZP_04856963.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 47 588 1 542 542 942 100.0 0 MAKKLDFDKEKNNRNDNAIEEIMQADFPLPKQAEDAKNEAFARIREMAAASENVENAENI VQRLPEKSTEKSTEKSTGSSGKKSTGTVKSHKKFKTVYKTALGLTAAAAVFSAVCITNPA FAENIPMVGNVFKQLGNSLGFYGDYSKYAKQLTASAEDTLSADADGSQEGSSNSQNAQSE DQNTTENNNADKTKDNESYSKTVDGTTVTLSEVYCNEMALYLSMTIHTEDKFPDTFITSD GKPNIMLSENSTVKYDYMDGKSNLFNAYLDGKMLDDNTYAGVLRIPVEDMTVDEAGWTKF YEVRNAFFKEKGIDVDSEDFSFDKLAQALGMDEYSDEKLPQVGGPAISDYVKDIKVPDRF TMELDLKDIVGTLPDDQDTTPDIPQDLWDEYNQKMAEHGISTDDADYESLTEEQKNLEHQ FFTEMWNEYYERYPEANEGNNRYNSWTLKGDWKFNVDVEKNTSDTVEKDVNVVDENGDGV LSITKTPFEITMKMQDPETKYVAVMLDANGDILPDGGVANGNAGTYAIQDRDISTVYIYL CDYYEYMDELKGYYWSDDYEEKAKTKTFKQLLDERAVADTEVHFDTDK >gi|226332916|gb|ACII01000103.1| GENE 21 22148 - 22744 553 198 aa, chain - ## HITS:1 COG:mll8140 KEGG:ns NR:ns ## COG: mll8140 COG1595 # Protein_GI_number: 13476734 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 2 160 1 173 208 80 28.0 3e-15 MIQLVKRSISGDADAFLELMEKNSLAMYKVARGILDNDEDAADAIQDTILTCFEKIHTLK NPEYFKTWMIRILINECNKIHRHYKNFSRAEELPEVPGQDMSIEEFEFKEMLGMLDESYR IILVLYYVEGFRIADIASILNMNENTVKTRLVRARMQLKQEYTSAEDNIQKKRLQQKKKK NREFTGIAEGLQNIRMKI >gi|226332916|gb|ACII01000103.1| GENE 22 23064 - 23570 565 168 aa, chain - ## HITS:1 COG:no KEGG:CA_C3602 NR:ns ## KEGG: CA_C3602 # 
Name: not_defined # Def: HD superfamily hydrolase # Organism: C.acetobutylicum # Pathway: not_defined # 3 161 5 163 163 129 43.0 3e-29 MTVAEVVKKMVEYSKGDLHDISHFMKVYAYAKTIAEGENLSPEQQKLVEVTAVVHDIACP LCRVKYGNANGKHQEEESAALLEEFFADSDLPKEFVDRVSYIVSHHHTITGIDGIDYQIM IEADYLVNADESNFSGNNVRNMLEKVFKTETGKFLLQSMYQKRLNAEE >gi|226332916|gb|ACII01000103.1| GENE 23 23691 - 24848 1592 385 aa, chain - ## HITS:1 COG:FN1423 KEGG:ns NR:ns ## COG: FN1423 COG0426 # Protein_GI_number: 19704755 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 2 383 6 399 405 327 41.0 2e-89 MKIGNDIYYVGVNDHQIDLFEGQYVVPNGMSYNSYVIMDEKIAVMDTVDANFKDEWLENV AKVLDGAKPDYLVVQHMEPDHAANIENFMKAYPDTTVVANTKTFTMMGNFFRNLNLDGKK LVVANGDSLTLGKHVLTFVFAPMVHWPEVMVTYDSTDKVLFSADGFGKFGALDVEEDWDC EACRYYIGIVGKYGAQVQKLLKAAATLDIQTICPLHGPILTENLGHYLEKYDIWSSYKVE SEGVVIAYTSVYGNTKKAVELLAQKLEEKGCPKVTVFDLARDDMAKAVEDAFRYGKLVLA TITYNGDIFPFMRTYIENLTERSYQNRTIGLIENGSWAPAAARIMQGMFEKSKNITWLEN NVKITSSLSEANEAEIEAMAEELCK >gi|226332916|gb|ACII01000103.1| GENE 24 25027 - 25962 660 311 aa, chain - ## HITS:1 COG:no KEGG:Mhun_1221 NR:ns ## KEGG: Mhun_1221 # Name: not_defined # Def: hypothetical protein # Organism: M.hungatei # Pathway: not_defined # 65 192 77 213 344 65 29.0 2e-09 MNNTIFDDVFRTMIEKMPYLAVPLINEVFHTSYPENVPIVQLRNEHQQENGEIITDSCLK IAGKLYHIECQSVDDTTMAIRMIEYDFSIAIENAQKQGRKYRMDFPKSCVLYLRSGKNTP DFLEIEMVLSDEKIVHYWVPTMKLETYTRNSIFEKNLLMLLPFYIMRYEKDIHEMSENPE MFQSLLNDYEEIRINLERELSGADKTALYMNLNKLIIKIADYICRNEKTVRKGIGEIMGG KVLELESERLERLQKEAEAEAKAIGEAIGEERLSTLLNRLIMDGRSAEIQSVVTNTDIRK QLYKEYGILSE >gi|226332916|gb|ACII01000103.1| GENE 25 25977 - 26162 88 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCRKLLIDGTACVCYTEFIYLSSRPVGHLFLFMKVCTKDKDYPIYDICANWVFKHCYRML E >gi|226332916|gb|ACII01000103.1| GENE 26 26375 - 26809 376 144 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579700|ref|ZP_04856969.1| ## NR: gi|253579700|ref|ZP_04856969.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 144 1 144 144 276 100.0 4e-73 MALNSYFDFAENDFRYFKASYDAGIVANMMGAMAQGICEKYMKHLISEYYKPDDAMQQKD FENILRTHSLNRLMKFLKANMGAEFSKNTQTHMRMIDGFYFSTRYPGDDSIEIDGDDVET CNDAIELCRKEVLELERELKKCEV >gi|226332916|gb|ACII01000103.1| GENE 27 26800 - 27153 204 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579701|ref|ZP_04856970.1| ## NR: gi|253579701|ref|ZP_04856970.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 117 1 117 117 214 100.0 1e-54 MLTCIRSKRQRDAAEYALRTIQSSCIRPYIKTIYLYGSCARGEEKYSSDVDLFLELSESF RSRPELKKYLYLLKSEVSSEELDDAETDLKIVVGDDWKRNKMLYYTNVRKEGIQIWP >gi|226332916|gb|ACII01000103.1| GENE 28 27311 - 28177 915 288 aa, chain - ## HITS:1 COG:BH3154 KEGG:ns NR:ns ## COG: BH3154 COG0330 # Protein_GI_number: 15615716 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Bacillus halodurans # 8 287 30 307 310 179 35.0 7e-45 MKGKKIGILIGVSAVVIAVGASVTVTQQNEYKLIRQFGKVDRVISSSGISFKIPFIESTQ SLPKETLLYDLAASDVITKDKKTMISDSYVLWKISDPLKFAQTLNSSVESGESRINTAVY NATKNAISSMSQDQVITSRDGELSDMVMEAIGTNMDQYGIELLKFETKQLDLPDDNKEAV YERMISERDNIAATYKAEGNSEAKVIRNKTDKEVAIQISDAKKQAEILEAEGEQEYMKIL AQAYGEEDRSEFYSFVRSLDALKTSMKGEDKTVILSADSPIAQIFEGK >gi|226332916|gb|ACII01000103.1| GENE 29 28265 - 29308 1071 347 aa, chain - ## HITS:1 COG:BH3155 KEGG:ns NR:ns ## COG: BH3155 COG0330 # Protein_GI_number: 15615717 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Bacillus halodurans # 29 315 13 301 319 192 40.0 6e-49 MVKRRRKVNERVNPFKKLQKPGKHVKRIVIGAAGLVIIAGLAGDATYQIQEQEQAVLTTF GVPKAVAETGLHFKLPFIQKVQKVNTTIQGFPIGYSMGDNSVVENEGIMITSDYNFIDVD FFVEYRILEPVKYLYNSEEPEDILKNISQSCIRTVIASYDVDEVLTTGKGEIQSKIKEMI LKQMEEQDLGIQLVNITIQDSEPPTQEVMKAFKTVETAKQGKETALNNANKYRNEKLPEA EAEADQIIQDAEAQKQVRINEAEAEVARFNAMYEEYVKNPEITKKRMFYEAMEDVLPGMK IVIDNGDGVQKVLPLDSFTGNSSENPAETLNTDPAQDTQDLNDSENE 
>gi|226332916|gb|ACII01000103.1| GENE 30 29719 - 30072 197 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579704|ref|ZP_04856973.1| ## NR: gi|253579704|ref|ZP_04856973.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 117 1 117 117 216 100.0 4e-55 MLTCIRSKRQRNAAEYALRAIQSSCIKPYIKTIYLYGSCARGEEKYSSDVDLFLELSESF RSRPELKKYLYLLKSEVSSEELDDAETDLKIVVGDDWKRNKMLYYTNVRKEGIQIWP >gi|226332916|gb|ACII01000103.1| GENE 31 30158 - 30259 76 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNNTIFDDVFRTMIEKMPYLAVPLIKEYGILPE >gi|226332916|gb|ACII01000103.1| GENE 32 30462 - 32225 203 587 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 378 586 8 219 223 82 30 5e-15 MGSLIGLVKPLLHIMIAAIILGTAGYLCAIFLTILAGQVIVHGILTGEMGAAIATETTDV ALSAQTVGITVKNLWLINTPVKTIIIIMIIIAVLRGILHYAEQYCNHFIAFKLLAIIRHK VFAALRKLCPAKLEGRDKGNLISIITTDIELLEVFYAHTISPIAIATLTSLIMIVFIGRY HWMAGLLALAAYLIVGVAIPVWNGRRGSQMGMEFRTNFGELNSFVLDSLRGLDETIQYGQ GESRKEQMSQRSGNLAKMQRSLSKQEGSQRSFTNMVILLASFGMLALTIWLYGRGELGFE GILTCTIAMMGSFGPVVALSSLSNNLNQTLASGERVLSLLEEKPMVEEIPVDVKDDSVNI SENLLGGKSDIFSGAETKNVTFAYENEVILDNYSLKLEPEKITGIHGASGSGKSTLLKLL MRFWDVDKGCVSVDGEDVRKIPTRHLRDMESYVTQETHLFHDSIANNIAVGNPGASREEI MEAAKKASIHDFIMKLPKGYDTEVGELGDTLSGGEKQRIGIARAFLHDSPLILMDEPTSN LDSLNEGIILKSLREASEKKTVVLVSHRKSTMNIADTVFEMKDGRIS >gi|226332916|gb|ACII01000103.1| GENE 33 32272 - 34014 197 580 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 341 564 14 235 311 80 27 2e-14 MIKTRLVGLLSHAKKYIVYTILWQWMALLSQVLAVFSIADLLERVIYQSVTAAVVERTII TMILVVVVRFICERMGTRSSYLACVDVKRILREKIYEKMLKLGASYKEQVSSSEVVQVST EGVEQLETYFGKYLPQLFYSLIAPLTLFVILCRVSMKASVVLLICVPLIPISIVVVQKIA KKLLNKYWSIYTGLGDSFLENLQGLTTLKIYQADQQKADEMDVESQNFRRITMKVLTMQL NSTSVMDIVAYGGAVVGMAVTLSEFLKGNISISGTICIVLLASEFFLPLRLLGSFFHIAM NGMAASDKIFKILDLPEPQTGDKNLPDDSLDISFKNVHFAYEENREILKGIDLYLPAGNF 
VSLVGESGCGKSTIAGIISAKNRGFTGEITIGGVSLSEVNESDLMKHVVLVRHNSYLFKG TIEENLRMAKPDATEEEMNTVLKKVNLLGFLQTQDGLQTKLLEKASNLSGGQCQRLVIAR ALLYDAAVYIFDEAASNIDVESEELIMDVIHELSRTKTVLLISHRLANVVESDQIYFLKN GEIQEHGKHEELMQKDGEYHHLYESQMALENYGRQVNRDV >gi|226332916|gb|ACII01000103.1| GENE 34 34011 - 34358 164 115 aa, chain - ## HITS:1 COG:no KEGG:VFMJ11_A1052 NR:ns ## KEGG: VFMJ11_A1052 # Name: not_defined # Def: hypothetical protein # Organism: V.fischeri_MJ11 # Pathway: not_defined # 7 108 10 106 129 103 54.0 3e-21 MNKVEKKRQKEQYIVEEMIRLYCRKNHLEEERQGRQMCPVCQELSDYAKLRSHKCPFMEQ KTFCANCKVHCYKPEKREQIRQVMRFSGPRMLLHHPILAIWHLICSKREAKVKNL >gi|226332916|gb|ACII01000103.1| GENE 35 34360 - 34764 279 134 aa, chain - ## HITS:1 COG:PM0679 KEGG:ns NR:ns ## COG: PM0679 COG2832 # Protein_GI_number: 15602544 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 5 123 1 118 120 88 46.0 3e-18 MKHSLKIFWMILGFLCLGFGTVGIVLPILPTVPFYMATLFCFAKSSERLHSWFLGTNLYK KHLDSFVQKRAMTMGTKLRIMGTVTVIMGIGFLCMKNVPVGRICLIVVWVCHVLYFFLRV KTVNAENKTLEGAK >gi|226332916|gb|ACII01000103.1| GENE 36 35028 - 35852 582 274 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579709|ref|ZP_04856978.1| ## NR: gi|253579709|ref|ZP_04856978.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 274 37 310 310 549 100.0 1e-155 MLQIQYLLELEKTGKKRGSVAMIADTCGVSHGPVSRFFKECIGKGYLTEKYEFTTEGAHA LLTYKRLLRDVKTYLKAMNIPEKEIPEKMKQLIENVDYDLLRTMVRNARQTTERTVEGKN EQEQKNPQGEKDSNAYFIEKILEKGKFQVGIAIHQISRNSETRLSMAHQGFEHLACIRNN TRGSWLELTICEIHGRSRIDGTEMSGHLSSLKYEKQGQLYESVIKNGKLRIPLSACHFFR GRMGNVKGTIHITVACNVGNAHMPESTAVLTFWM >gi|226332916|gb|ACII01000103.1| GENE 37 35896 - 36801 535 301 aa, chain - ## HITS:1 COG:MA3468 KEGG:ns NR:ns ## COG: MA3468 COG1321 # Protein_GI_number: 20092281 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Methanosarcina acetivorans str.C2A # 32 187 4 142 160 62 26.0 9e-10 MNELIQKKLTKNDLKENIIKETDAQLPSSRREIEEDMLEQICLIANAGAEITVEELCSRM DMSRREVNFYLKQLKVHGYIIREKVTSTGHQKDGSGSGISLTEFGKITGEEFQYRHEILT QFLQFVGVSDEKAQEDACRMEHVVSEETVQQICNFVDYGDTFERILKHSDLRSWYQTGKY MFLMGIYQIEKTCPRRFAREFDCFSENIWLHITEDKSSFELIKKEDTSAQNLWYKDPEQG WIEAEEKMGNPWIPSSVFQFSTRCQDPIIEGTALIAFTRKGEKPVDWNSRELDVHIWKSE G >gi|226332916|gb|ACII01000103.1| GENE 38 36885 - 37073 73 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIAEKDKQYMIERIEISICFIFAYFGKNKLVDTNYNCLLEFHLMKNIEKKIKIMILKAVM KR >gi|226332916|gb|ACII01000103.1| GENE 39 37364 - 37690 77 108 aa, chain - ## HITS:1 COG:BH0224 KEGG:ns NR:ns ## COG: BH0224 COG3344 # Protein_GI_number: 15612787 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Bacillus halodurans # 2 107 292 398 418 64 30.0 4e-11 MKKVRKLTSRKWGVSNSYKAQKIAEVVRGWINYFKIGSILTATRRLDTIIRYRFRMCIWK HWKNPKTRYRNLIKLGVSKKNAKSAAGFHGYARVCRTETICYAMSNDV >gi|226332916|gb|ACII01000103.1| GENE 40 37866 - 38015 81 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579712|ref|ZP_04856981.1| ## NR: gi|253579712|ref|ZP_04856981.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 49 1 49 49 77 100.0 3e-13 MYYYDYKNVVIVGNQKRIVEDYGKGGFDPLGDKYSHKSKHCTYYDEKAM >gi|226332916|gb|ACII01000103.1| GENE 41 38046 - 39155 1008 369 aa, chain - ## HITS:1 COG:CC1496 KEGG:ns NR:ns ## COG: CC1496 COG0524 # Protein_GI_number: 16125743 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Caulobacter vibrioides # 7 369 6 368 368 453 60.0 1e-127 MEETMNLKLKNRSECKYDAVSLGEVMLRLDPGEGRIRTARNFQVWEGGGEYNVIRGLRKC FGLNTGVITAFADNEVGLLMEDLIMQGGVDTSLIRWMKTDGIGRVCRNGLNYTERGFGIR GAVGCSDRANTAISKATPKDFDFDYIFGELGVRWLHTGGIYAALSEQSCETVIAAIKTAK KYGTIVSYDLNYRPSMWSAIGGLEKAQEVNKEIARYVDVMIGNEEDFTACLGFEIEGNDE NLKELNLDGYRKMINEAARAYPNFKVVATTLRTVCTATVNDWSAICWADGEIYKAKDYKG LEILDRVGGGDSFASGLIYGLMTSDDPETAVNYGAAHGALAMTTPGDTTMARKKEVEAIM GGAGARVQR >gi|226332916|gb|ACII01000103.1| GENE 42 39155 - 39793 410 212 aa, chain - ## HITS:1 COG:VC0285 KEGG:ns NR:ns ## COG: VC0285 COG0800 # Protein_GI_number: 15640313 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Vibrio cholerae # 13 201 12 199 201 211 55.0 7e-55 MENIMVQIEKTGVIPVVVINDVEDAEPLAQALCEGGLPCAEVTFRTAAAEESIRKMTDIY PDMLIGAGTVLTTEQVDRAVAAGAKFIVSPGFDPEVVDYCILKQIPVFPGCITPSEVAQA VKRGLKVVKFFPAVQFGGVSTIQALTAPYVGLKFMPTGGVNAKNLADYLQCKSIIACGGS WMVKSDLIKAGEFEKIKDMTKEAVSLVNEIRA >gi|226332916|gb|ACII01000103.1| GENE 43 39821 - 41263 828 480 aa, chain - ## HITS:1 COG:NMB1968 KEGG:ns NR:ns ## COG: NMB1968 COG1012 # Protein_GI_number: 15677798 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Neisseria meningitidis MC58 # 1 477 1 476 480 353 40.0 5e-97 MLQLNNMIDGKLIPSESGERIEVYNPFNQELVGTVPKSTKKDVDDALCSSKRAQKEWAKI PAQRRGEYLLEIAELINKHKEELAVLLTSEHGKTIVQARAEVEGARGFLCYAAESARRIE GEILTSELECEQTWIQRVPYGVTVGIVAWNFPLALTTRKFGNALVCGNSMIIKPPSETPL TVMRLAEIISQESSLPSGVLNFITGSGRVVGDALVRNDTTRLVTLTGSTGAGIEVFRTAA EHCVEVHLELGGKAPFIVMNDADIDKAVKAAVISRFSNCGQICTCNERMYIHEEVYDQFV 
EKLIAETKKIIVGDPMDENTFMGPKVNKAEIEKISQMVDLSLQQGGKILLDMTPEKKPTE NGNWLYPCIVEVEDNKNELIQNEVFGPVLPVMKVKDFDQALEFANDCEYGLSAYLFTNNA KYIMRAVNELEFGEVYTNRENGELINAFHNGFKLSGTGGEDGKYGLEGYLQKKTVYMNYK >gi|226332916|gb|ACII01000103.1| GENE 44 41288 - 42280 524 330 aa, chain - ## HITS:1 COG:BS_rbsB KEGG:ns NR:ns ## COG: BS_rbsB COG1879 # Protein_GI_number: 16080649 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 37 307 35 295 305 97 30.0 3e-20 MKKRKRFYSGLLIVLVILFSSGCQKNHTKEEKEIPKKKEKAIGLSVDQGFASRNYETDII EEEAEKLGYEVYELIAGGNAEAQNSQIERLLEKNISALLVCAVDRNKIESALLQAKAKGV PVITFDRLIPNSKSVDAYVGPDSVADGMACGEAIIRQVRESKEPIQVLELVGALNDQNGI DRSKGFHQAMMQDEKIQVIQVPTDWNLDAAFVDMERIFRSYPDIKAVFCGTDSFFPNVES VLDNIDSSGRWRERIYVTGVNGSKEGYNAVLSGTADGVMVMDLDEIGKAAVTVADKLIKN ENVEKNIIIEGRYCTTDTIKESQEHIWGTK >gi|226332916|gb|ACII01000103.1| GENE 45 42305 - 43399 886 364 aa, chain - ## HITS:1 COG:SMb21107 KEGG:ns NR:ns ## COG: SMb21107 COG4948 # Protein_GI_number: 16264434 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Sinorhizobium meliloti # 33 363 31 361 370 170 30.0 3e-42 MDNKIKTIAIDYYKVPLEGNLVDALHGKHDNFELITATVTLENGRIGCGYTYTGGFGGAA IAKMLEQDLTPQLIGVPMNSPEKMNDYMNQHIHYVARGGIASFAISALDIAFWDIELKSH GIALKDLRGKGADKVRTYYGGIDLMYSEKELLENIEKQLASGHTAVKIKLGRENEDEDIQ RIKAVRKLIGDEALFMVDANMVWTEEQAIRMAKRMEEYNIGWLEEPTNPDDYEGYARIGQ ATSIPIAMGENLHTIYEHELAMKLARIKRPIGDCSNVCGITGFLKVADLAEKYSVEVHSH GMQELHANVLGAVENRGMVEFHSFPIYKYTVDPLVVKDGYLETCKAPGTGVVFDLEKLNL YKVN >gi|226332916|gb|ACII01000103.1| GENE 46 43448 - 44464 798 338 aa, chain - ## HITS:1 COG:BS_yjmD KEGG:ns NR:ns ## COG: BS_yjmD COG1063 # Protein_GI_number: 16078298 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: 
Bacillus subtilis # 1 338 1 339 339 256 38.0 3e-68 MRVAKLTEVRKIEVVEGEKPEISSEYDVQVKVKAVGICGTDLHIFNEGRADVILPRVMGH ELSGEVTAVGEKVDRVKVGDRVALDPVFACGECPTCRKGYPNVCENVKCYGVQMDGGYQD YIVVHEKHLYPFDHSISFEQAALAEPFSIAANILERTSAAKEDNVIIIGAGTIGLSIVQV AKGIGCRIMVADVVDSKLEKARALGADKVVNSKKESIKDMIEEFAPGGLDVVIDAVGITS LFQQSIEYAAPRGRIACIGFDAKPAEIPPVLITKKELSIVGSRMNCYQFPKVMKWLEEGK IDAEKMISRKYSIDDIQQAFEETIANGQDVVKTLIVFE >gi|226332916|gb|ACII01000103.1| GENE 47 44513 - 45475 412 320 aa, chain - ## HITS:1 COG:TM0067 KEGG:ns NR:ns ## COG: TM0067 COG0524 # Protein_GI_number: 15642842 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 2 293 18 316 339 144 30.0 2e-34 MNEKLVRGETFEKQVGGAELNVLSGASLLGLRAGIISKIAENDIGEFARNRVRFVGVSDD YLMNDIEKDARLGIYYYENGAYPRKPRIVYDRSHTSMSKISIEDFPDTMYSLTRCFHTTG ITLALGEQVRTTAYEMIKKFKKNGALISFDVNFRSNLWSGEEARRTIEEILPLVDIFFCS EETARLTFKKSGNLKDMMKEFADEYNIGVIASTQRIVHSPKIHTFGSVIYEKCSDTFYEE PPYSKIEVVDRIGSGDAYISGVLYGILSENGSCAKALSFGNAAGAIKNTIPGDMQASLRG EVEAVIQSHRLKGNDCEMVR >gi|226332916|gb|ACII01000103.1| GENE 48 45648 - 46445 205 265 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 8 255 1 238 242 83 27 3e-15 MDKNSFSLKGKIALVTGASYGIGFAIAKGFAAAGATIVFNDIKEELVEKGLASYKECGIE AHGYVCDVTKEDQVNDFVAKVEKEVGIIDILVNNAGIIKRIPMCDMTADEFRQVIDVDLN APFIVSKAVIPSMIKKGHGKIINICSMMSELGRETVSAYAAAKGGLKMLTRNICSEYGEY NIQCNGIGPGYIATPQTAPLREKQPDGSRHPFDQFIIAKTPAARWGDPEDLAGPAVFLAS DASNFVNGHVLYVDGGILAYIGKQP >gi|226332916|gb|ACII01000103.1| GENE 49 46565 - 47626 781 353 aa, chain - ## HITS:1 COG:no KEGG:SCO3481 NR:ns ## KEGG: SCO3481 # Name: SCE65.17c # Def: hypothetical protein # Organism: S.coelicolor # Pathway: not_defined # 4 347 7 348 370 419 59.0 1e-115 MKLSSASKRAIERNYYQSCEWFCDFKESGIEGIGYEKGIHRRDPSSVIKVGDDYFVWYSR SVGPHKGFHTGDEEAKVFPWDYCDIWYAVSKDGYRWEEKGPAVVRGERGAYDDRSVFTPE ILEYEGKYYLVYQVVQHPYVNRSFESIAIAVADSPHGPFTKSKEPILTPTKDGIWEGEED 
NRFAVKKKGSFDSHKVHDPILFAFRGKFYLYYKGEPMGEELYMGGRETKWGVAIADNILG PYHRSEYNPVTNSGHETCLWQYNGGIAAFLRTDGVEKNTIQFAEDGINFEIKSVIKQGPE ACGPYRHLESDSNPLKGMEWGLCHDVSQDYGFIKRFDIDEWQKKVYTNREMYE >gi|226332916|gb|ACII01000103.1| GENE 50 47604 - 50699 898 1031 aa, chain - ## HITS:1 COG:ECs3958 KEGG:ns NR:ns ## COG: ECs3958 COG3250 # Protein_GI_number: 15833212 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Escherichia coli O157:H7 # 8 1008 16 1005 1042 563 33.0 1e-160 MVFKDRYFENPQLNHVNTMEPRCYYIPLQEVELDVEKQKESSAQCFQLNGEWDFLYFSSI YDIDHKPESIHETVHEWERIPVPSVWQNYGYDKRQYINVNYPIPFDPPFVPSENPCGVYH RTFEYRSDSDYPKSYLNFEGVDSCFYLYINGKFAGYSQVSHSTSEFDITDYLSDGINHIT VFVLKWCDGTYFEDQDKFRMSGIFRDVYILKRPVNHLRDYFIHTVLVDRYDKAFVTVDLE GGGRLDVQYSLQEQSGKKILEGEAQDGKISFMILEPVLWNAENPYLYTLVMKYNGETFIE TVGVRDIRIENKVLLINGVDVKLRGVNRHDSNPSTGCAVTIDQMKQDLTLMKAHNVNAIR TSHYPGRPELYYLCDKYGFYIMDEADVEIHGVDGLYSPVWEEDYNKHAFSAEISDNELYS ASVTDRVKRCVIRDKNRCSVVIWSMGNESGYGCVFESALEWTKKFDSSRLTHYEGALHAP HDRKNDYSNIDLYSRMYPSIDEINEYFRKDFDKPFIMCEYSHAMGNGPGDLEAYYKVIQK YEGHCGGFVWEWCDHAVELGKTADGKVQYGYGGDSGEELHDGNFCVDGLVYPDRTPHVGL KEFKNVNRPVRVLGYDRKTGTVEFENILDFTNIKDVLDIKYEVTCGKEVIERGAIIDREL LDIHPHEKRSIPIMVHISDKSKCYIKFDYVQRVKNLTTDEGHSFGFDQISLNDAAIDENI FIREPEEKYNAGTAIEIEENSRRVVLAGENFVYAYDKRKGCFESICVEGTEILKKTMEYN IFRAPIDNDRIIKEQWYAAGYDKSVIRVYDTFVKKEDEGISIRSSAVISAPHVQKIIEFQ VSWKVSLSGKINMKLEATRNVEMPYLPRFGLRLFLDERMDQVEYFGYGPYENYSDKHQSS YLGLFRTDAETMHENYIRPQENGSRCGCRYVRLFGSGVHWSIISKKDFSISVSKYTQEEL CAKRHNFELKESGNTVLCLDYKQSGVGSGSCGPQLSEEHRFDERKFCFECCFMRPELTDT EEEISETKQCK >gi|226332916|gb|ACII01000103.1| GENE 51 50707 - 51540 446 277 aa, chain - ## HITS:1 COG:BH3682 KEGG:ns NR:ns ## COG: BH3682 COG0395 # Protein_GI_number: 15616244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 9 277 24 293 293 148 33.0 1e-35 MKQLKNKSISVLLNIATMLISITCIFPLIWLAYNSVKDKFEFMKDTISLPADPQWGNFWE 
AIQLGNLIPATFNTVFNSVINVILVCLGSIIVGYFLERYEFRGKKIISAIFLVGMVIPLY SLLVPMFLQYKVLGMLNTRFVLILPYFAMQISLGILLCKSFVHGIPKEIEEAAVIDGCGM RQLLKNMIFPLCKPMISTVGILTLLASWNEFAFATVLSSGSQYRTLSVAVQSYSSGREME YTLFLAALLMVSLPIILIYCIFSKQIINGMTAGAVKG >gi|226332916|gb|ACII01000103.1| GENE 52 51540 - 52427 564 295 aa, chain - ## HITS:1 COG:mlr7001 KEGG:ns NR:ns ## COG: mlr7001 COG1175 # Protein_GI_number: 13475831 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mesorhizobium loti # 15 282 34 302 317 151 29.0 2e-36 MNNAGLKVKTKTIYLYVLPFFLLFIIAFVIPLLMAFGISFTNWRGGNKIDFIALDNYIKL VKDETFWMSFLNNLKFMALMLVFQIGLAFVFAIVVQNKRVKFQGFHRRVIFLPSVLSSVV VAMIWQIVYNKDIGLISAIMEKIGMEDIVPLWLDDPKIVIYSLAIVLIWQFVGQYVIIMM AGFQNVDTSLMEAAKIDGASYSQTVRYVTLPLMKPTLSVCVTLCVSGCMKLFDTIYAMTG GGPGRSSTVTALYAYDVAFKTKQLSYASAVSIGMIILSVILIGGVSRIFREKEEQ >gi|226332916|gb|ACII01000103.1| GENE 53 52493 - 53815 1440 440 aa, chain - ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 1 432 1 433 438 109 26.0 1e-23 MKRKMCLVLAGLMVAGLAGCGADQKGSSADGNEEMEIVYAHFQGEGDAQYKSIQNAITAF EKDNPGITIKEEYYPSDAYLLQAETWSAAGELPDICMVNGSMASDYADIGTILDLTSYAE EYGITDKIDESYFKEMSADGAIYGIPWEDAHYAFILYNEGIFKEVGVTEFPKTLDELIDV SKKIADAGYIPMAMGDKALWPADSLAFSAFVNNFVGNDWFDSILACDGKAAFTDKEFVDA LDQYQRLAKEGVFNDNLSSIDNDQRGALYQNREAAMISAGNWECNSCVEIAPEVAEETQV ALWPVPAENAKAHDSVVSSSAWGLALGSNIDEEKIPYAMDFIANYICSEDFGKVLAEEQG MFTPWKVEYDTSKLNIITQREQEVSNADGVTRCLNWDSSLPASVKDVYQRGLQEVLMQIK EPADLADEMQTAYEEYIDMK >gi|226332916|gb|ACII01000103.1| GENE 54 53924 - 55222 525 432 aa, chain - ## HITS:1 COG:TP0737 KEGG:ns NR:ns ## COG: TP0737 COG1653 # Protein_GI_number: 15639724 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Treponema pallidum # 17 383 12 
381 436 136 25.0 8e-32 MKRKAIILPVLLFLLGIGLISYSLWNLLQAHVGIQEKREEKEITMIVQESRYYTGLKNMI KKLYVEEGIRIKVRVLPDNQAESVIQMMINSGDYPDMLDANIPHIYNLINPCRYLADFTE ESWVSKLTNPQQVTYSDGRIYGFPFLKNSGVGGIIYNKDIFEKYGIAIPESEEDFYNVCE RLKSLGISPVLLCADNWIPQIWMNYGFPLSFGSDKNCKKITEEILSGKKKLGDYEESYKV IDTYLGLFEKGYVNNDYMYIDYNDMIDKLNNQEGAMVFGYSTMIPVMNQHHSNLNMGMFV PPFAYNQSKNIVYLDFSVGFVSFRESENLDVVKEVLDKWSQKEYLSMWIFENGGDPAFED VPDNVNKDIEYLYKSYVETGKTVGEFMLYMDPIYSLNQDKLWLYYREAPKNHWTADELLE KIEIDIEEKLEK >gi|226332916|gb|ACII01000103.1| GENE 55 55212 - 56666 505 484 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 479 1 523 525 112 24.0 2e-24 MISVLVVEDEIPILRSICRQVESANSSFHVTRTAVNGKEAIEALQEEKFDLMFVDMQMPV LDGMAVLKFVKEKQLSVYTIVLSGYQEFQYAREALQCGVLDYLLKPLKKNELTRVLQKVE RMIWIDRSKEETAYINNSDDREFEAALLTVGPYIMLREDYAFTEGQNRIDEQLNSFLDKY ISDELRWIIPTKYENERIIVFRKLGGRTKNIVEKLFRDYQSQCDLVTIIMAGQEQTHKTI FETNKRLHQYMIGAMGVEESRLMDEEEAKSVMENGVHYLKTVRKEVLKVRVSEHVMELIM KILNPVAERKVIFEILKLAFFRFSEEINSNYTYNQMEDEIADIIQSSYNREELYDKLEKM IIRMFCVEDYSGNKQKLALDIKEYLDHNYKSVTGNKELADIFGFVPAYLRDVFRSKYEKS PMEYLQELRLECARKMLAQELKLPVKEIARAVGFNDALYLSKVFRRKFGITPSEYRNQVK NNET >gi|226332916|gb|ACII01000103.1| GENE 56 56659 - 58308 694 549 aa, chain - ## HITS:1 COG:SP0662 KEGG:ns NR:ns ## COG: SP0662 COG2972 # Protein_GI_number: 15900563 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 302 543 332 555 563 108 27.0 3e-23 MKNILEQSETTIEASINSLDTIALQVSLNPDVIDIMKQAENDSYDKFFENNIIEAERVNE ILWSFILDQENISGISIFDKKGSFVYAGQETDTEQFRYSRDEDFFESLENEFDKPRVYNM FENETNRSVVTGYKAVTIIRQIKSDDITPYCIGYVEVGIDVQQLENKLQQDFSGSVLVLY DAETGRVMGSTIPQLIENTNTNYEDFLKDQNLDNYFWVQGSTFHNQVGIIVLHSRDELNR 
FVFFTVIFTLGIYITILSIYFLIQRKLSLFLAEPLVQLCESITRRMKYKNNCEIIDNREF NEIEELQETFNQMFLKLDDSMKREIAAKTERLKVQLYALQAQIHPHFIHNTIAIIQAYAL EEDYNTIVEICENFSDLIRYGVEVTEDKSLVRDELEWAVKYLWIFHLKYGDNLIFTVEKR NAVENIHIPHFIIQPLVENSIKHGLKATEFPWKLQIICNTDHKTWQIEIRDNGVGFSDEM RMELMQYKQKLMGEGIKNDAWREESKIGGMGMKNIITRLYLSYGCDMIFDIERQLGHGAV ITVGGRCDD >gi|226332916|gb|ACII01000103.1| GENE 57 58490 - 58756 218 88 aa, chain - ## HITS:1 COG:SMa1476 KEGG:ns NR:ns ## COG: SMa1476 COG4977 # Protein_GI_number: 16263257 # Func_class: K Transcription # Function: Transcriptional regulator containing an amidase domain and an AraC-type DNA-binding HTH domain # Organism: Sinorhizobium meliloti # 6 87 232 313 330 69 40.0 1e-12 MLFQKRIADIAEELNYNPDYLRQKFKKIAGVSLKAYIMQQRIQNACKLLEQENYSCTEIA QLCGFSSSSQFSRIFKEKMGYTPKSYKL >gi|226332916|gb|ACII01000103.1| GENE 58 59333 - 61753 993 806 aa, chain + ## HITS:1 COG:TM1624 KEGG:ns NR:ns ## COG: TM1624 COG3250 # Protein_GI_number: 15644372 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 70 670 25 633 785 198 25.0 3e-50 MEKNKDFFALEAGSGLFEASDSSIGRITQDLSGKTWKMEKMRVGQGVSEGIHLLKSELSG NNFSWNAAQVPGDVYTDLFLAGELDDPFWGRNMGKAKWVQEYEWWYNRAFNVDKNLIGKD IELIFEGVDFSCEVWLNQKYLGRHEGMYSSFSFQVTDLLDYSQPHVPVNLLTVKLDPPPK NQQNFAGMKHNFSGDYLTGLIPFGIWQPVKLVATNRLRLENYRVEYDLHKNAATAVFYIE LSNLSDSVLEAELNVSMTGRDQQITHSEPLSVSPGTSVCTFSFEIEDPQLWWPYELGEPF LYDLNILIADGKSILDSLSDKTGIRTITMQMNPGFTEEESRIPWTFVINGKPMFLRSACW GGQPSFFYGRNSTPKYRMFLEKAKEANINNLRIFGWHPAETDEFYTICDQLGITVWTNFS FATQEFKTDQPYIEKVTKEIQSTVIKRRNHPSNIMWMGGEEVYFTEAHVESGNKQLMEYI GEVTHQLTNTPYADASPLSSREAIRMGYATKESMHANSHYYAAGAIFMEDYYPNLDYAII PELAAASAPNIDSLKKFIPSDELWPMGPSWGYHAADIDVLKNLNYEVFGYTCTGTLEEFV EATQIAQGTVAQFALEHFRRQKPHVSGVSLCHFITNWPIIKWDIIDYYGQTKKSFDYVKR SYQPLLPSLEIQKRRWMPNELFRGRLYIINDYYKNYPSLTYKCIFRDSDQNELYSNTFTA SVTENSSTAYEFLEFKLPSDISNCFYIQLYLSDGDNVLSENNYRLLVGDQKAAQEAAYGM YKIMHQKSREYGKGYYRYFPDMFSDL >gi|226332916|gb|ACII01000103.1| GENE 59 61843 - 63357 305 504 aa, 
chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 503 1 522 525 144 23.0 3e-34 MIKLLVADDEKTIREGIRKGILWKEIDCSVVYTAKNGLEVLHFLEENTVDIIISDVKMPR MSGLDLIKSLREKGSSVPVIFVSGYEDFKYVQEAMKYEASAFILKPINTDELQKEVQKIC KKYNIGGENIPLRHFIEYHFKGAEKKGDLLNYEFLEETLNRKYFCVINLRCNYAEIKSQL FLLNFREKIEGIMRKYLPSDQFAEIEISSHGLLYCVMNSEKNRLEYTIHQLKENLRKELS EFSYSNLRIRNGGIYRGTQHILDSYIESFDNTYLKNDREGVLERKILTDVFVELFESEDT IIENLISRKKDEVNQILDRQKEIVLKRRIEGADVRVYLEHLFFTYHKQENALNSKEDQKC NSYLQEDFMELSIEEMFQKAKKRVNEICVSRRTENLSSVENNNQKIKNYILQNYENPQLS LSSISEYMALNASYLSTAFSRCEGITISNYILNIRIQKAEKLLLNSSMKISEISHCVGFI NYTYFSSAFRRVKGMSPSEFRNQE >gi|226332916|gb|ACII01000103.1| GENE 60 63332 - 65086 467 584 aa, chain - ## HITS:1 COG:FN0190 KEGG:ns NR:ns ## COG: FN0190 COG2972 # Protein_GI_number: 19703535 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Fusobacterium nucleatum # 42 567 29 551 552 173 27.0 1e-42 MMLSDSFFLHLLKAFLVVSVFPLVIAGIISCVSFYNVSFEYAVSYSNKKSELVALQINML QDVNEKITNSLHNSSMMQLEMRSGLYGHTHTMEDLFKINEALSYIKNSSVSGIEGIYCIK KNGTVYQSNTTAVYKYFNTDVQWYQDVQEKQQGIWLSPYKHSLVTPALEGNYVAYLCPYL DYITGEMNGVVLIEMDCDEIWEILRKDSEKDSTQYSILNQENHVIYTTDSDQRKATEEML DENKFSYVEELNNGWKIVSNISKTEIFLSVVKSLIILFTILLLVCFSIIIIVSIKKANEI SMPIKKLTESTKLVQRGQYDVQMDIPNTSGEIVELYQNFNKMIVALQKYTSKIIEEQKKL RLANYKALQAQINPHFLYNTLDTIAWDIRLNDNKRALSILMAFTQFFRTSLRKGEDMVTL GQEFEHVTSYLQIQEHRYGEIMEYRTKYNPELKDNMLPKMILQPLVENSIYHGIKKKDEL GYVIIYTKCYNEYYEIVVYDNGGGMSKTRLQQINESLIKRSPLKTESSGYGVYNIDERIK ILFGFDYGLTFYSEENRYTKVVVRLPYSTQEEGIAKSDKVTGCR >gi|226332916|gb|ACII01000103.1| GENE 61 65275 - 66237 527 320 aa, chain - ## HITS:1 COG:YPO0859 KEGG:ns NR:ns ## COG: YPO0859 COG1172 # Protein_GI_number: 16121167 # Func_class: G Carbohydrate transport and metabolism # 
Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 47 314 58 333 336 174 39.0 3e-43 MAKEKNIKDTVLLFVKNNSTFMIFAVIFIFASLRYEEFFTYNFIKNLIVQNTMIGLLAFG MSFVIIAGDIDLSVGSVMALCGVIAAKLSSQNILIVFAVTFGVGIFWGFLNGFMVAKMDI VPFIATLATMVGVRGIVHIITGSKSVSTAGCTEVFKAIGHGELIPYVPNTILIYLVVFAI CMFVQRNTRYGRHNYAVGGNLSASKMMGINGTRIRILNYVLSGVMAAIAALIMTSRLGSG QYSAGDGWEMDAIASTVIGGTLLTGGVGDVKGTFFGVLILGIIKQIFNLQGNLNTWWQNI ATGLILLAVVVAQSIAKNRK >gi|226332916|gb|ACII01000103.1| GENE 62 66237 - 67205 622 322 aa, chain - ## HITS:1 COG:YPO3906 KEGG:ns NR:ns ## COG: YPO3906 COG1172 # Protein_GI_number: 16124038 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 26 300 31 314 339 160 35.0 4e-39 MGKKKVLNKNLYNFIVDYSALIVALLLIIINVIFTPNFFNITTISNIVIQMTSTLLISLG MTWVIASGGIDISVGSVMALSSMVSVKLLDYGLLTAMAGGMLVGIASGALIGFLVARFDI QPMIVSMTLMIGLRGVAQILNDSKILRFDNDAYAQIGRAKIFGNLPIQILIMLFFIVIIW FISSKTIFGIKIEAVGDNREASRLSGINTLMVLVSIYAVTGLLSSCAGIIETARLYASDA NTLGKAIEMDAISAVAIGGTSMIGGKPNIVGTIFGVIIIQVLTTMVNMNNISYQYSLVLK AIVIVVALYGQRFLQSRKRGVA >gi|226332916|gb|ACII01000103.1| GENE 63 67219 - 67416 198 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579735|ref|ZP_04857004.1| ## NR: gi|253579735|ref|ZP_04857004.1| D-ribose transport system ATP-binding protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 65 67 131 131 126 100.0 4e-28 MHKHVNSEEFYYIEKGNAVMIADGNEIHMPPHSVFMINPGSEHSFLNCETEDVVALVMEV NLDNK >gi|226332916|gb|ACII01000103.1| GENE 64 67635 - 69137 170 500 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 279 499 25 238 311 70 25 3e-11 MSEKKVLEMKGITKIYPGVRALDGVDFIVEEGSVHALMGENGAGKSTLIKILTGVIGKDS GSIVLSGKEVNFHSVYEANNNGISAVYQELDLIPELSIGENIFMGREMMKKGSIDWKNTY KEADKILKSMGILLDVTAKLSSLGTAMQQMVSIARAISIKSQIVVLDEPTSSLDTSEVEV LFKVIEKLKRDKIAVIFITHRLDEVFATCDTVTILKDGKLVHRCKIEETNKLDMVSKMIG RNASDVMGQTKVCLSDRSPKEIFLEAKGMKKFPKVIEQSIKIRKGEVLGLAGLLGAGRTE LARLLFGADTCKEGTIEIEGKPVKINTPRDAISNRIAFCSEDRKVEGIIPNMSISNNLTL ACLKSISKWGVISKRKQKALVQKYIEMLKIKIGNEKDPIISLSGGNQQKVLIARWLCANP SLLILDEPTRGIDVGAKQEILDKVCELSEQGMSVLVISSILEELVQTCDRIQVIKDGKTR GEIQYEGISEASIMQTIAKE >gi|226332916|gb|ACII01000103.1| GENE 65 69221 - 70222 874 333 aa, chain - ## HITS:1 COG:AGl3185 KEGG:ns NR:ns ## COG: AGl3185 COG1879 # Protein_GI_number: 15891710 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 50 304 25 284 319 158 37.0 2e-38 MKTKNVKCFFAAIAATTLVVGMVAGCGNTVKADEVSKEESSGSKEIGPDLTVGFAQSGNP NPWMVALTESMQQSADEYGVNYIYTDANDDMATHVANIEDMLAKDLDILVIAPMEDTGLE AVLDEAAEKGVPVILSARTTQGEYVTTVYSDQAWEGERCAELIGEKIPDAKVVELRGIEG TSSVAGREKGFRDVMAEQYPDMEIVVEQTANFSRQEAMDAMANILQAKGPDAIDAVYCHN DEMALGAVQAIKDAGLTPGEDIQVVGIDGQKEAWELVKSGEMLGTVQCSPKHGPTVFEVI QKILDGETVQKETIVPDQVITKENVDESESLVF >gi|226332916|gb|ACII01000103.1| GENE 66 70539 - 71195 413 218 aa, chain + ## HITS:1 COG:no KEGG:CKR_3341 NR:ns ## KEGG: CKR_3341 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 4 216 44 226 231 119 37.0 1e-25 MEVSFKPAGCLRFGRINKEVYMEDYRYKALQSLTRARYYAVTNLTREKQRFINVLFKKYS TMTQKKVFSDTFSTTALAVYEEFEFAEALEKMDLQELTAFIIEKGKNRFPDPDAVAKTIQ 
KAALAKYAGLAWQQHQSGNFEAQTTRMIHSSNRFLKYYLCEAAFSLVRCDKEYRDFYNLK YKEVNRFQHKRALALTARKFVRLVFALFKDKRLYRSAE Prediction of potential genes in microbial genomes Time: Sat May 28 20:12:51 2011 Seq name: gi|226332915|gb|ACII01000104.1| Ruminococcus sp. 5_1_39B_FAA cont1.104, whole genome shotgun sequence Length of sequence - 22867 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 11, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 155 - 244 63.6 # Ser GGA 0 0 - TRNA 406 - 491 65.0 # Ser TGA 0 0 1 1 Op 1 . - CDS 666 - 1358 388 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 2 1 Op 2 . - CDS 1360 - 2379 821 ## PROTEIN SUPPORTED gi|229879751|ref|ZP_04499249.1| (SSU ribosomal protein S18P)-alanine acetyltransferase 3 1 Op 3 . - CDS 2453 - 3361 868 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III 4 1 Op 4 20/0.000 - CDS 3447 - 3884 297 ## PROTEIN SUPPORTED gi|226224682|ref|YP_002758789.1| ribosomal protein alanine acetyltransferase 5 1 Op 5 12/0.000 - CDS 3881 - 4609 204 ## PROTEIN SUPPORTED gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase 6 1 Op 6 . - CDS 4619 - 5053 654 ## COG0802 Predicted ATPase or kinase - Prom 5084 - 5143 7.3 + Prom 5046 - 5105 6.2 7 2 Tu 1 . + CDS 5147 - 5665 722 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family + Term 5679 - 5726 14.1 8 3 Tu 1 . - CDS 5720 - 6376 705 ## COG1739 Uncharacterized conserved protein - Prom 6464 - 6523 10.8 + Prom 6499 - 6558 9.8 9 4 Tu 1 . + CDS 6578 - 9169 2506 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) + Term 9216 - 9275 15.7 - Term 9211 - 9254 10.0 10 5 Op 1 . - CDS 9289 - 10233 578 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 11 5 Op 2 . - CDS 10264 - 11199 1174 ## COG1493 Serine kinase of the HPr protein, regulates carbohydrate metabolism - Prom 11320 - 11379 7.8 12 6 Tu 1 . 
- CDS 11466 - 13328 1914 ## COG0322 Nuclease subunit of the excinuclease complex - Prom 13375 - 13434 3.8 - Term 13402 - 13445 9.3 13 7 Tu 1 . - CDS 13454 - 15424 1263 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 - Prom 15446 - 15505 6.0 14 8 Tu 1 . + CDS 15745 - 17226 1740 ## COG4868 Uncharacterized protein conserved in bacteria + Term 17302 - 17355 9.1 - Term 17289 - 17342 9.1 15 9 Op 1 . - CDS 17348 - 18091 786 ## COG0584 Glycerophosphoryl diester phosphodiesterase - Prom 18117 - 18176 4.6 16 9 Op 2 . - CDS 18217 - 19026 766 ## COG0388 Predicted amidohydrolase - Prom 19046 - 19105 5.9 - Term 19082 - 19134 10.1 17 10 Tu 1 . - CDS 19190 - 20206 1279 ## COG1087 UDP-glucose 4-epimerase - Prom 20378 - 20437 5.7 - Term 20601 - 20650 11.4 18 11 Op 1 . - CDS 20706 - 21806 1018 ## EUBREC_0139 hypothetical protein 19 11 Op 2 . - CDS 21811 - 22737 1194 ## EUBREC_0139 hypothetical protein - Prom 22794 - 22853 3.0 Predicted protein(s) >gi|226332915|gb|ACII01000104.1| GENE 1 666 - 1358 388 230 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 7 229 6 224 234 154 42 7e-37 MSERNTAIVLAAGQGKRMHSKVQKQFLEIQGYPVLYYSLRCFQESPLIQDIILVTGEESI SYCKEEIVQKYGFTKVSAVIPGGKERYDSVYAGLCECRDCEYVLIHDGARPFVTEEILKR GLQKVKETGACVIGMPSKDTVKLSDEEGYVKETPNRKCVWTIQTPQIFSYSLIREAHDSI RQKDMSKITDDAMVVEQETGAKVALAEGSYQNIKITTPEDLDIAEAFLKH >gi|226332915|gb|ACII01000104.1| GENE 2 1360 - 2379 821 339 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229879751|ref|ZP_04499249.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Slackia heliotrinireducens DSM 20476] # 5 338 439 779 781 320 50 4e-87 MKDTLILAIESSCDETAASVVKNGRTILSNVISSQIALHTLYGGVVPEIASRKHIEKINQ VIEQALADADVTLDDLDAIGVTYGPGLVGALLVGVAEAKAIAYAKKLPLVGVHHIEGHVS ANYIEHPDLEPPFLCLIVSGGHTHLVIVKDYGEFEILGRTRDDAAGEAFDKVARAIGLGY PGGPKVDKLSKEGNPNAIEFPKAKIGDCPYDFSFSGVKSAVLNYINHAQMTGEEINRADL AASFQKAVVDVLVEHTMLAAKDYGMTKIAIAGGVASNGTLRAAMEEACKKNNYSFYRPSP 
IFCTDNAAMIGVAAYYEYIKGTRHGWDLNAVPNLKLGER >gi|226332915|gb|ACII01000104.1| GENE 3 2453 - 3361 868 302 aa, chain - ## HITS:1 COG:CAC1584 KEGG:ns NR:ns ## COG: CAC1584 COG1234 # Protein_GI_number: 15894862 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 1 301 5 311 313 341 50.0 8e-94 MLDVCLVGTGGMMPLPRRWLTALMTRYNGSSLLIDCGEGTQVAIKEKGWSFKPIDVICFT HYHGDHISGLPGLLLTMGNADRTEPLTLVGPKGLERVVNALRVIAPELPFEIKFIEITQP EQVIELNGYRITAFKVNHNVLCYGYTLEILRQGKFSAERAKEQDIPLKYWNPLQKGQTIE ADGITYTPEMVLGPARKGIRLTYTTDTRPTESILRNAKESDLFICEGMYGEDDKADKARG YKHMTFREAAVLARDARVKEMWLTHYSPSLVRPDEFMDKVREIFPNAYPGKDGKSLELNF EE >gi|226332915|gb|ACII01000104.1| GENE 4 3447 - 3884 297 145 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|226224682|ref|YP_002758789.1| ribosomal protein alanine acetyltransferase [Listeria monocytogenes Clip81459] # 1 140 7 147 151 119 43 2e-26 MILREMLVDDLDQVMEIEQDLFHVPWTKEGFFTFLTRDDAMFLVVEEKEKILGYCGLLMV LDEGDITNVAVRRDRQKEGIGAFLMQSLIRLAAEREVTTIHLEVRVGNETAIRLYERMGF TGDGIRKAYYSDPVEDALLMTRHPE >gi|226332915|gb|ACII01000104.1| GENE 5 3881 - 4609 204 242 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase [Lactobacillus jensenii 269-3] # 42 227 1 182 380 83 31 1e-15 MKILALDSSGIVASVAVVEDDTLLAEYTVNYKKTHSQTLLPMLDEIVKMTELELESVDAI AVAAGPGSFTGLRIGSATAKGLGLALKKPLVAVPTVDALAYNLYDAQGLICPIMDARRKQ VYTGIYRFEEHQLMTLKEQWAAPIEELLEELNQRGEMVTFLGDGVPVFRELIAEKLQVPY SFAPAHVNKQRAAAVAALGSIYYKEGRTETAMEHIPEYLRVSQAERERAEREKEQKPAGT KI >gi|226332915|gb|ACII01000104.1| GENE 6 4619 - 5053 654 144 aa, chain - ## HITS:1 COG:BS_ydiB KEGG:ns NR:ns ## COG: BS_ydiB COG0802 # Protein_GI_number: 16077658 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Bacillus subtilis # 5 137 8 135 158 141 54.0 4e-34 MIIETKTPQETFEVGKKIGENAKPGQIYTLTGDLGVGKTVFTQGVAAGLGITEPICSPTF 
TIIQEYESGRLPLYHFDVYRIGDIEEMEEIGYDDYFFGQGICLIEWADLIEEILPEKLIK VTIEKDLEKGFDYRRITVIGPENS >gi|226332915|gb|ACII01000104.1| GENE 7 5147 - 5665 722 172 aa, chain + ## HITS:1 COG:CAC2769 KEGG:ns NR:ns ## COG: CAC2769 COG0652 # Protein_GI_number: 15896024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Clostridium acetobutylicum # 1 170 1 170 174 239 67.0 2e-63 MANPIVTITMDNGDVMKAELYPEIAPNTVNNFISLVKKGFYDGLIFHRVINGFMIQGGCP DGTGMGGPGYSIKGEFTQNRFKNDLKHTAGVLSMARAMHPNSAGSQFFIMHKDAPHLDGA YAAFGKITEGMDVVNRIAEEDTDYSDRPLDEQKIKSMTVETFGVDYPEPEKC >gi|226332915|gb|ACII01000104.1| GENE 8 5720 - 6376 705 218 aa, chain - ## HITS:1 COG:BS_yvyE KEGG:ns NR:ns ## COG: BS_yvyE COG1739 # Protein_GI_number: 16080604 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 206 1 205 217 162 44.0 4e-40 MLEKYKTVYEGGEGEIVEKKSRFIATVRPVQTEEEALAFIEEMKKKYWDARHNCYVYSVG KNREYTRCSDDGEPSGTAGRPMLDVILGEDIYNVAAVVTRYFGGILLGTGGLVRAYSRSL QEGLAASTVIEKTYGISMEVVTDYTGIGKIQYIAGEQKLPILDSEYTDRVVLHLLVPADQ IAFVEKAITEGTNGRAKMKKEKDLYYSVIDGEVKVFTD >gi|226332915|gb|ACII01000104.1| GENE 9 6578 - 9169 2506 863 aa, chain + ## HITS:1 COG:BH1702 KEGG:ns NR:ns ## COG: BH1702 COG0744 # Protein_GI_number: 15614265 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Bacillus halodurans # 24 782 22 696 886 292 29.0 2e-78 MNYGKKSTAKKRTALISRSSMMGKRARVSFIRVLFVSLIALCIAVTCLGVGSFRGVIDTA PDVDDIDIMPLGYATFLYDDAGNQIRKLAAPDSNRLPVTLDQIPVDLQHAVVAIEDERFY EHNGIDVKGILRAGMKALTTGDFSEGASTITQQLLKNNVFTNWTSESTQLERFTRKIQEQ YLAVQVEKKTDKDTILENYLNTINLGAGSYGVQAAARQYFDKDVWDLNLSECVTLAGITQ NPTKFNPIINPDSNRKRRKEVLQHMLDQNYITQDQYDEALADDVYSRIQAAQEKNSSTEN TVYTYFEDELTDQIINDLMNIKGYTKKQATNLLYSGGLKVYTTQDSKIQNILDEEYADPS NYPDTVQYELDYALTVTDPDGNQVNYSKEMLQLYFQNEDPDFDLLFDSPEDGQTYVDKYK 
ASILANGSKVLAERVNFAPQPQSSMSVIDQHTGYVKALIGGRGEKTASLTLNRATDTTRQ PGSTFKIVSTYAPALNEKGMTLATTFEDEPYEYPDGSPVNNATRSYNGTTTIRTAIQNSI NVVAVKCLEKVTPDLGLKYLDNFGFTTLAHGTEADKDANGNVWSDANLATALGGITRGVT NVELCASYAAIANGGNYIKPIYYTKILDHNGNVLIENTAAERSVIKESTAFLLTSAMEDV VKQGTGTACQLDNMPVAGKTGTTEAYNDLWFVGYTPYYTCAVWSGYDNNEKLPDYARNFH KALWKKVMTRIHEGLPSKEFEKPASVEKLSVCEETGLLPRAGCPVITEYFDVGTMPTEYC DQHFYDSDDYDYNYDTDSSDQTDNTTDTDNSGNSDNGDTDNSGDSNNTDDNGNSGDDGTD NTGGSDDNGDGNEDDSSYQVDYY >gi|226332915|gb|ACII01000104.1| GENE 10 9289 - 10233 578 314 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 1 312 1 319 319 227 39 6e-59 MGKYCFGIDVGGTSVKCGLFQTDGVLVEKWEIPTRKENSGEAILPDIAKTILDKIAERKL DKEEIDGVGIGVPGPVNERGEVPCAVNLFWGFKEVTKELTELTGLPSKAGNDANVAALGE AWKGAAAGAKNVILVTLGTGVGGGIIVDGKIVGGAHGAGGEIGHAAVNHEEKEACNCGNC GCLEQYASATGIVRVAQRTLAATDEQTVLRKFTKLSAKNVLDAFKEGDKVACDVMAQVGE MLGGTLAMFACVTDPEAIVIGGGVSKAGQPLIDCIQKYYEKYAFTACKKTPIILATLGND AGIYGSARMVIKEV >gi|226332915|gb|ACII01000104.1| GENE 11 10264 - 11199 1174 311 aa, chain - ## HITS:1 COG:SA0715 KEGG:ns NR:ns ## COG: SA0715 COG1493 # Protein_GI_number: 15926437 # Func_class: T Signal transduction mechanisms # Function: Serine kinase of the HPr protein, regulates carbohydrate metabolism # Organism: Staphylococcus aureus N315 # 8 306 6 303 310 254 43.0 2e-67 MKGVQLTKLVQELGLHNLTPEIDLSEIVIKTAEINRPALQLTGYLEHFANERVQIIGYVE YTYLMQLSDEERKFKYERFISSKIPCVIFSTMTRPSQDMIDLAVKYNVPTFVTERTTSSF MAEIIRWLGVQLAPCISIHGVLVDVYGEGVLITGESGIGKSEAALELIKRGHRLVSDDVV ELRKVSDVTLVGSAPDITRHFIELRGIGIIDVKTLFGVESVKDTQSVDLVIKLEEWDRDK EYDRLGLHEEYTEYLGNKIVCHSLPIRPGRNLAIIVESAAVNHRQKKMGYNAAEELYKRV QANLAKKREEK >gi|226332915|gb|ACII01000104.1| GENE 12 11466 - 13328 1914 620 aa, chain - ## HITS:1 COG:CAC0508 KEGG:ns NR:ns ## COG: CAC0508 COG0322 # Protein_GI_number: 15893799 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Clostridium acetobutylicum # 1 614 1 619 623 593 48.0 
1e-169 MFQIEEELKKLPGKPGVYIMHGEKDEIIYVGKAVSLKNRVRQYFQSSRNKGAKIEQMVTH ITRFEYIITDSELEALVLECNLIKEHRPKYNTMLKDDKSYPFIKVTVNEEYPRVLFARRM KKDKAKYFGPYTSAGAVKDVIELVRKLYKVRSCNRVLPRDCGKDRPCLYYHMKQCSAPCQ GYVSSEEYKKNIAELLKFLNGDFKDTIDMLTDKMMAASEEMRFEDAMEYRDLIRSIQKIG ERQKITGYGEEDKDIIAVAMDESLDLREQDAVVQVFFVRGGKLIGREHFYLRVARGDTKA QVLSSFMKQFYAGTPFIPREIMLQKEIEDAKIIEEWLTDRRKQRVYIRVPKKGTKEKLVE LAEENAKMVLDKDRERIKREEGRTIGAVHEVEEWLGLSGIRRMEAYDISNISGFESVGSM VVYEKGKPKRSDYRKFKIKWVQGPNDYASMEEVLTRRFTHEGKDEFDSFSIMPDLILMDG GRGQVNIALKVLEILGIEIPVCGMVKDDHHRTRGLYYNNLEIPIDTDSEGFRLITRIQDE AHRFAIEFHRSLRSKEQVHSLLDDIPGIGETRRKALMRKFKSVENIRDASLKELAETESM NAGSAQKVYEFFHGASVPTS >gi|226332915|gb|ACII01000104.1| GENE 13 13454 - 15424 1263 656 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. McKiel] # 26 652 2 619 636 491 42 1e-138 MDNHMDNNPNGNRPDNNKGPGRQDPNKNHQSILAFLVCLLVTLVCFAMFTNMLKDNSSEI TYDKFIEMVDDGDVKEVTLQSNTLTITPKHQSSGFSEEVYYANQMESVDKLTERLEGTGI KFEYKKPDAAGEIISMLVSVLLPTILLFVLLTVFMRRMNKGGGMMGVGKSRAKAYIQKDT GITFRDVAGQDEAKESLQEVVDFLHNPGKYTTIGAKLPKGALLVGPPGTGKTLLAKAVAG EAHVPFFSLSGSEFVEMFVGVGASRVRDLFEEAKKNAPCIIFIDEIDAIGKSRDSHYGGG NDEREQTLNQLLAEMDGFDTSKGLLILAATNRPEVLDPALLRPGRFDRRVIVDRPDLKGR IEILKVHARNVYLDETVDFENIALATSGAVGSDLANMINEAAILAVKSGRSAVSQKDLLE AVEVVLVGKEKKDRILSAQERRIVSYHEVGHALVSALQKDAEPVQKITIVPRTMGALGYV MQVPEEEKYLNTKKELEAMLVGYLGGRAAEELVFDTVTTGAANDIEQATKVARAMITQYG MSDKFGLMGLATQENQYLSGRTVLNCGDDTATEVDHEVMVLLHNSYEEAKRLIGSHREAL DKIAAYLIRRETITGKEFMKIFHAAERGIEIPENLDDLVIPEEKQNTVTLDKPEIQ >gi|226332915|gb|ACII01000104.1| GENE 14 15745 - 17226 1740 493 aa, chain + ## HITS:1 COG:Cgl2942 KEGG:ns NR:ns ## COG: Cgl2942 COG4868 # Protein_GI_number: 19554192 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Corynebacterium glutamicum # 1 493 1 494 495 603 59.0 1e-172 MKIGFDNDKYLKMQSEHIRERINQFGNKLYLEFGGKLFDDYHASRVLPGFEPDSKLRMLM QLSDQAEIVIVISAADIDKNKVRGDLGITYDEDVLRLMEVFTERGLYVGSVCITQYSGQE 
SADAFKKRLEKLGIKVYVLYLIPGYPNNTSFIVSDEGYGKNDYIETTRPLVVITAPGPGS GKMATCLSQLYHEYKRGVKAGYAKFETFPIWNIPLKHPVNLAYEAATADLNDVNMIDPFH LEAYGETTINYNRDVEVFPVLQAMFEKIMGECPYKSPTDMGVNMAGNCIVDDEVCKEAAR QEIIRRYYKSMEALVRGTGREEEVYKIELLLKQAHVTIEERKVVPAALKREEETGAPAAA MELPDGRIVTGKTSELLGASSALLLNALKELAGIEHDKHVISPEALKPIQALKTEYLGSK NPRLHSDETLIALSISAADNKDAKLALQQIPKLKGCQVHTSVLLSQVDILEFRKLGVELT CEPKSEHAKKLQK >gi|226332915|gb|ACII01000104.1| GENE 15 17348 - 18091 786 247 aa, chain - ## HITS:1 COG:BS_yqiK KEGG:ns NR:ns ## COG: BS_yqiK COG0584 # Protein_GI_number: 16079474 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Bacillus subtilis # 9 240 2 235 239 167 39.0 2e-41 MTETTKRTTKVWAHRGASGYRPENTLEAFELAIRQGADGIEMDVHTSADGELIVMHDENV DRVTDGTGLIKDMTLAQLKELKVSTPAEPSGIYHIPTLAEVLELMRTTDMMINIELKNSI CFYPGMEEKILKLVKEMNMEDQLIYSSFNHYSLLQLKQLNDHVQTGILFSDGWVNPAMYA KNLGINAVHPAVYHLKYPQFIEEVKRAGLKMHVWTANKPEHIQLVKDAGAEAVITNYPDR AIEIIEK >gi|226332915|gb|ACII01000104.1| GENE 16 18217 - 19026 766 269 aa, chain - ## HITS:1 COG:MTH1811 KEGG:ns NR:ns ## COG: MTH1811 COG0388 # Protein_GI_number: 15679799 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Methanothermobacter thermautotrophicus # 1 267 1 266 272 272 47.0 6e-73 MKVAAIQMPTVKDKIQNIRTAGTYIEKIKAENPDFVILPEMFCCPYQTENFPVYAEKEGG PSWQAMSDYARKYHIYLIAGSMPEADDVGKVYNTSYIFDRDGKQIGKHRKAHLFDINVKN GQHFKESDTLTSGDHATVFDTEFGKMGVMICYDIRFPEFARTMVLDGARMIFVPAAFNMT TGPAHWELTFRARALDNQIYMLGCAPARDTQAGYISWGHSIVTDPWGKVMKQLDEKEGIL IEEIDLDREDQIREQLPLLKHRKSEMYHL >gi|226332915|gb|ACII01000104.1| GENE 17 19190 - 20206 1279 338 aa, chain - ## HITS:1 COG:BS_galE KEGG:ns NR:ns ## COG: BS_galE COG1087 # Protein_GI_number: 16080937 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Bacillus subtilis # 1 337 1 336 339 477 67.0 1e-134 MAILVTGGAGYIGSHTVVELQNAGYDVVVLDNLSNASEKSLERVSKITGKPVKFYKADIL 
DRDALNEVFDKEDIDSCIHFAGLKAVGESVAKPWEYYENNIAGTLTLVDVMRKHNVKNII FSSSATVYGDPAIIPITEECPKGQCTNPYGWTKSMLEQILTDIQKADPEWNVVLLRYFNP IGAHKSGTIGENPNGIPNNLMPYITQVAVGKLKELGVFGNDYDTPDGTGVRDYIHVVDLA KGHVKALKKIEDNSGLSIYNLGTGKGYSVLDIVKNFEAATGVKIPYVIKPRRPGDIATCY SDASKAERELGWKAENGIKEMCADSWRWQSNNPNGYEE >gi|226332915|gb|ACII01000104.1| GENE 18 20706 - 21806 1018 366 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0139 NR:ns ## KEGG: EUBREC_0139 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 14 363 314 667 668 457 61.0 1e-127 MEINREIKNITSAFVLEDNLTECVPYGSGHINDTYRLTYGTGKHYILQKMNRSIFTKPVE LMENVSGVTAWLKKKIQENGGDVERETLNLVMTKDGLPYYVDEDGEYWRVYLFIEGATCY DMVKDEEDFYQSAVAFGHFQRLLADYPAETLHETIVNFHNTVDRLDKFKTAVEKNVCHRA ADVEKEIQFVLDRTELAHVLCDMQDQGKLPLRVTHNDTKLNNIMIDDATGKAICVIDLDT VMPGLSVNDFGDSIRFGASTGAEDEKDLTKVSCDLHLYEVYVKGFIEGCGGALTETELDM LPMGAILMTFECGMRFLTDYLEGDHYFKIHREGHNLDRCRTQFKLVKDMEEKLSRMKEIV NKYKQV >gi|226332915|gb|ACII01000104.1| GENE 19 21811 - 22737 1194 308 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0139 NR:ns ## KEGG: EUBREC_0139 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 305 1 304 668 390 60.0 1e-107 MKKPVLVVMAAGMGSRYGGLKQIDPVDKEGHIIMDFSIYDAVQAGFQKVVFIIKKENEAD FRSAIGDRLSNQLEVSYVFQDLHNIPEGYEVPEGRVKPWGTGHAVLSCINEIDGPFVVIN ADDYYGSHAFKMAYDFLTENEDTEDTYRYMMVGYKLENTLTENGHVARGVCVTDEEGHLL KINERTHIEKHDGGTAYTEDDGKTWTMLPEGSTVSMNMWGFSASILKELKDRFPKFLDEN LKVNPLKCEYFLPFVVDELLGEKRATVKVLKSMDKWYGVTYKEDKPVVVAAIQNLKDGGL YPQRLWEE Prediction of potential genes in microbial genomes Time: Sat May 28 20:13:06 2011 Seq name: gi|226332914|gb|ACII01000105.1| Ruminococcus sp. 
5_1_39B_FAA cont1.105, whole genome shotgun sequence Length of sequence - 15701 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 9, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 38/0.000 - CDS 40 - 918 996 ## COG0395 ABC-type sugar transport system, permease component 2 1 Op 2 35/0.000 - CDS 932 - 1819 814 ## COG1175 ABC-type sugar transport systems, permease components - Prom 1855 - 1914 3.4 - Term 1936 - 1981 8.1 3 1 Op 3 . - CDS 1998 - 3248 1880 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 3352 - 3411 5.0 + Prom 3316 - 3375 5.8 4 2 Tu 1 . + CDS 3512 - 4411 704 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Term 4370 - 4406 7.1 5 3 Tu 1 . - CDS 4452 - 5720 1452 ## Caci_0625 extracellular solute-binding protein family 1 - Prom 5762 - 5821 6.7 6 4 Op 1 40/0.000 - CDS 5914 - 7257 1139 ## COG0642 Signal transduction histidine kinase 7 4 Op 2 . - CDS 7291 - 7959 969 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 8014 - 8073 4.9 - Term 8196 - 8234 1.0 8 5 Tu 1 . - CDS 8258 - 10027 1973 ## COG1283 Na+/phosphate symporter - Prom 10191 - 10250 6.2 9 6 Op 1 . - CDS 10300 - 10800 719 ## COG0782 Transcription elongation factor 10 6 Op 2 . - CDS 10819 - 13521 2370 ## COG0480 Translation elongation factors (GTPases) - Prom 13573 - 13632 5.9 11 7 Tu 1 . - CDS 13752 - 14693 768 ## EUBREC_1097 hypothetical protein - Prom 14863 - 14922 9.1 - Term 14974 - 15028 5.1 12 8 Tu 1 . - CDS 15073 - 15411 305 ## gi|253579770|ref|ZP_04857038.1| conserved hypothetical protein 13 9 Tu 1 . 
- CDS 15548 - 15700 77 ## FMG_P0027 transposase Predicted protein(s) >gi|226332914|gb|ACII01000105.1| GENE 1 40 - 918 996 292 aa, chain - ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 23 292 38 300 300 136 30.0 4e-32 MAKEKEARRPFNLKKEVKLLPGYIILLLWVIFTIALLGWVFGASFSTTAEIFQGKALKFE SGLHFENYAKAWNGAGVAGFFGNSLIYSVISCTILVLVCAPAAYVLSRFDFIANKFIQTS FVSAMGVPAIMVVLPLFSIISGMGILNNVAAGRTVLIILYVGINVPYTTIFLLTFFSNIS RTYEEAAAIDGCPPMRTFWKIMFPMAQSGIVTVTIFNFINIWNEYFLSLIFASSDKLKPV APGLYGMINSMKYTGDWAGMFAAVIIVFLPTFILYIFLSEKIIGGVTGGIKG >gi|226332914|gb|ACII01000105.1| GENE 2 932 - 1819 814 295 aa, chain - ## HITS:1 COG:mlr7001 KEGG:ns NR:ns ## COG: mlr7001 COG1175 # Protein_GI_number: 13475831 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mesorhizobium loti # 3 280 33 301 317 99 27.0 1e-20 MIVVFLTPAVLLFLGIFLYPIIRTVIMSFFKIDGVTDSMDLWKFVGFGNYTKLMSTSLFK TSFFNLFRIWLIGGLVVMSLSLLFAVILTSGIRGKKFFRAIIYMPNIVSAVALATMWLQY VYSPRYGLLKDLFTALHLDNLAKIQFLDNDHKFWALLFAYCFGMVGYHMLIWLSGIERIS PEYYEAATIDGATKPAQFRYMTLPLLKGVFKTNITMWSVSSVGFFVWSQLFSTVTADTQT ITPYVYMYMQIFGGGNTVTERNAGLGAAIGVLLSVCVVIVFTICNKVIKDDDLEF >gi|226332914|gb|ACII01000105.1| GENE 3 1998 - 3248 1880 416 aa, chain - ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 3 416 1 415 422 78 23.0 3e-14 MGMKKKMLSATLCAVMVGSLAAPAFTAQAADDTLVYWSMWEATEPQGQVIQEAVDAYTEQ TGVKVDLQFKGRTGNREALQPALDGGTQIDIFDEDIDRVNGMYGKYLLDLEDLAKENDYE ATANAGLMAACRDAGGGTLKTIPYQPNVFAFFYNKDLFEQAGITEEPKTWDEFKDVCQKL KDAGITPMTMDDAYATCVIGYHLARLVGEDKVKEIVTEGEWDDPAVLQMAQDIEELASNG 
YYSEMVGSNVWPAGQNTELALGTVAMYLNGSWLPNEVKEMAGPDFHWGCFAYPALTDGAN GIETNNFGAQVFGINKDTKLAKEAFDLITFLTKGEYDQKLADETVGIPADTTNTEWPAMV ECAKPVIEQSTNRFTWACGVESNVDMTPVIKENFIKLMAGSITADEFVSTMQDAAK >gi|226332914|gb|ACII01000105.1| GENE 4 3512 - 4411 704 299 aa, chain + ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 196 297 423 524 530 69 33.0 9e-12 MAFESMYLEEDIHIDRIYTIHYFEYKSDFHFAGERHNFWEFQCVDKGKAEIETDNGIYHL TSGQLIFHRPNEFHNLIAIGNTAPNIVVVSFECDSPCMKFFEKKLLKLSDSERNLMGMLI AEARRCIKTPLDDPYTEKMEKKEDFLFGSQQLIRIYLEQMLIYMIRRNSSPPMTVPVSQF VNLKNNSAVYKRVIAYMEDHIRENITLCDLCHDNMIGRSQLQKIFQEQHQCGAMDFFSRL KIAYARQLIRENQMNFTQISEFLGYSSIHYFSRQFKKISGMTPTEYITSIKALSERERQ >gi|226332914|gb|ACII01000105.1| GENE 5 4452 - 5720 1452 422 aa, chain - ## HITS:1 COG:no KEGG:Caci_0625 NR:ns ## KEGG: Caci_0625 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: C.acidiphila # Pathway: not_defined # 38 355 46 364 437 104 28.0 9e-21 MEKWRSIAAGMSCLSMLFATGAVTVSASDEITEIPEQSIVYWSMWEEDEPQADVIREAAE AYEEATGISVEIQWKGRGIRSLIEPALDAGEQIDLFDDDYQRMAQEHRDYLAELKGMADT VDYEKHIMPVLLEQVKNWGNGELLAMPYQPYITGVWYNKDLWEEAGLTEQDIPDTWEKLI RVCRKIKNSDSGLSAMTCDEEYVNLLYGYQLARYLGQEKVQQLIRNCTWSQIPQAKEAAD DIRILFFAGYMSQSAPAQHPEGQDEVGDGEAVMVLQGSWVPNEVTEATESDDSWGFFPWP AVKAGTDGTEGVMVGAQGFGVTKDSQMKQEAFDFAYSICTGETDMKMTDAVNSIPADTDN TQWPEVLADAVPYMKEMSKPYMWAAGLEADPDYKEQIQSELLKLTRLEETSDEFIENLSS MK >gi|226332914|gb|ACII01000105.1| GENE 6 5914 - 7257 1139 447 aa, chain - ## HITS:1 COG:BH3156 KEGG:ns NR:ns ## COG: BH3156 COG0642 # Protein_GI_number: 15615718 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 163 447 300 583 589 171 36.0 3e-42 MKHKIFSHTSLLIILSVILTFLAAGTVMYNRYDIYMKQGVRDEAAYIKTGLEEDGDVFLS 
DRVGDATSSRITLLGKDGQVLFDSIENPEEMENHSNRPEFIEAEKQGSGEMVRYSDTLSK QTFYYAVKLKDDQVLRVARTTDSLLVTMLTSFLLLGGLVCVILVIELFLVQKQTRKLIEP INRIDLEHPLEHVCYEELRPLLFRLDQQNRQIQKQLEDLKNAESARKEFTANVSHELKTP LMSISGYAELMMNGMVPPDKMQDFSGRIFHESERLSNLVADIIQLSRLDEKNGETMFEQV DIGELGEDVINNLQNRAAKKKINLEYTGEPAQMQGVRHVLYEMFYNITDNAIRYTPDGGD VKVFVGKLNGKPYFRVEDNGIGIPESEQQRIFERFYRVDKSHSRETGGTGLGLSIVKHGA VLHHAKILLDSEPGKGTKMEILFDQTK >gi|226332914|gb|ACII01000105.1| GENE 7 7291 - 7959 969 222 aa, chain - ## HITS:1 COG:CAC1700 KEGG:ns NR:ns ## COG: CAC1700 COG0745 # Protein_GI_number: 15894977 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 219 6 226 232 205 50.0 6e-53 MIYCVEDERNIRELLIYTLETTGFKARGFGNGNELMKALKEEIPELILLDIMLPGDDGYT ILEQLKSMPSVKDVPVIMVTAKEAEFDKVKGLEGGADDYITKPFGMMEFVARVKAVLRRS ARQNEDKELHYDELYLNVGRHEVRYREEKVDLTRKEFELLQYLLENKGLVMTRNQILCHV WGYDFDGETRTVDVHVRTLRQKLGEAGNLIETVRGVGYRIGA >gi|226332914|gb|ACII01000105.1| GENE 8 8258 - 10027 1973 589 aa, chain - ## HITS:1 COG:BH1407 KEGG:ns NR:ns ## COG: BH1407 COG1283 # Protein_GI_number: 15613970 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Bacillus halodurans # 6 554 9 542 543 310 36.0 5e-84 MDLFSILTLIGGLALFLYGMNAMGDGLAKVSGGKLEKILENLTSNPIKAVLLGAGVTAVI QSSSATTVMVVGFVNSGIMKLSQAVGVIMGANIGTTITSWILSLTGIQSDNFIIQMFKPT SFSPVLAIIGVIFILFINDSKKKDIGTIFIGFAILMYGMDMMSSAVKPLAEVPEFTNLLL KFSNPLLGVIAGALLTAVIQSSSASVGILQALCLTGAVPFSAAIPIIMGQNIGTCITAIL SAIGAKKNAKRAAAVHLYFNLIGTVIFMTVFYLINAVVGFSFFHQAATPAGIAVIHSVFN VTATIILLPFAKGLEKLACLTIRDKKEDVVVSAEDREFMILEPRFLEKPAFAVEQSRNAA RKMAEESHNALFTALSLVDKYSEEGVERVENMESKVDRYEDELGTYLVKLSHKDISEADS HSLSIMLHCIGDFERISDHAVNIMESAQELYEKGLKFSENAKKDLEVLGQAVEDIVNTAY EVFDKQDMKLAEKIEPLEEVIDELSKEVKRRHVQRLRNGECTIEMGFILSDITTCLERVA DHCSNIGVCVTQVNEDLYDTHSHLNIVKSHPDETFYHELEDARIKYQLS >gi|226332914|gb|ACII01000105.1| GENE 9 10300 - 
10800 719 166 aa, chain - ## HITS:1 COG:BS_greA KEGG:ns NR:ns ## COG: BS_greA COG0782 # Protein_GI_number: 16079786 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Bacillus subtilis # 5 139 9 143 157 81 39.0 7e-16 MKEKLTQKDVEKIQQEIDHRKLVVRKEAIEAVKEARAQGDLSENFEYYAAKKHKNQNESR IRYLENMLKTATIVSDDSKDDEVGMDDIVEVYFEEDDEIEKYKLVTSIRGNSMENRISIE SPIGMALKGHKVGDRVEVKVNDNYSYFLEIRSIDKTGSEDEEIRGF >gi|226332914|gb|ACII01000105.1| GENE 10 10819 - 13521 2370 900 aa, chain - ## HITS:1 COG:CAC0854 KEGG:ns NR:ns ## COG: CAC0854 COG0480 # Protein_GI_number: 15894141 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 6 637 5 640 644 544 43.0 1e-154 MEKLVVGILAHVDAGKTTLSEGILYLTGKIRKLGRVDHKDAYLDTYNLERERGITIFSKQ AEFELGNRGITLLDTPGHVDFSAEMERTLQVLDYAILVINGADGVQGHTMTLWRLLARYQ IPTFLFINKMDQDGTDKEKLLAELKKRLSDNCVDFTWEKSDLQSQFLEDVSVCDEELLEK YLETEEISTSDIRKVIKERKLFPCFFGSALKMTGVEEFLHGLEKYCETPEYPSEFGAKVF KIARDDQGNRLSYMKITGGTLKVKELLTDTEKADQIRIYSGAKFELAKEAPAGTICAVTG LSQTHPGQGFGIERESEMPVLEPVLNYRILLPEDCDVHQMLKKLKELEEEEPELHIVWNE QLGEIHAMLMGEVQIEILKHLIWERFHVAVEFGTGNIVYKETIAEPVEGVGHFEPLRHYA EVHLLLEPGEPGSGLQFFTACSEDVLDRNWQRLILTHLEEREHPGVLTGSPITDMQITLI TGRAHLKHTEGGDFRQATYRAVRQGLKKAKSVLLEPYYEFRLEIPGDMIGRAMTDIQKMN GTFQQPEADEDDMMVLKGSAPVSMMRDYQTQVTSYTKGRGRLFCSLKGYAPCQNQDEIVE EIGYDSERDLDNPTGSVFCAHGAGFVVPWYEVEDYMHLEGIDESELGNTIPDSEESIAGN RNSNNQTDSGYRPPRNAGVGSYEDEEELKAIFERTFGPVKRYKEPQFKRTFSAKSDSGSY YRNSSSAKKKEKEYLLVDGYNIIYAWEDLKELADANLHAAQTKLMDILSNYQGFKKCTLI LVFDAYKIEGHAEEVITYHNIHVVYTKEAETADQYIEKTVHKIGRENQVTVATSDGLEQI IIMGQGAHRMSARGLRDEIKATENQIRQQWHEKRQSSKNYLIDNISDEMAQYMKEKRLGK >gi|226332914|gb|ACII01000105.1| GENE 11 13752 - 14693 768 313 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1097 NR:ns ## KEGG: EUBREC_1097 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 18 283 65 337 370 168 37.0 2e-40 
MAKRKNKKIIVKRNRLKGNTATANRKYKDTVFRMLFSDRKNLLSLYNAINETSYTDAAQL EIVTLENAIYMGMKNDLAFIINTNLFLYEHQSTYNPNMPLRDLFYISSEYQKMVDWKSLY TSTRLRIPTPNFIVFYNGTEKKEDRWVDYLSESYENMSGEPNLELKVITLNINVGHNKKL MEECRTLREYAQYVDKVRRYSNEMELNTAVERAVSESIQEGILKEFLQKNRAEVVAMSIF EYNEEEEKRKLRKAEYEAGKNDGIKIGREEEKRKIANSLFKEGDSIEKVARILCESEETI RGWVNEKMDIKVE >gi|226332914|gb|ACII01000105.1| GENE 12 15073 - 15411 305 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579770|ref|ZP_04857038.1| ## NR: gi|253579770|ref|ZP_04857038.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 112 1 112 112 216 100.0 4e-55 MSPCIHSNNVCARVGCLAAFQNRTDFFQDYPEDTCLVAMMTCNECKGVNPIEPIEDKGIL EKIDRLVSEKISTIHVGVCRLPDGKHECPRMTQICNMIEERGIKVVRGTHKE >gi|226332914|gb|ACII01000105.1| GENE 13 15548 - 15700 77 50 aa, chain - ## HITS:1 COG:no KEGG:FMG_P0027 NR:ns ## KEGG: FMG_P0027 # Name: not_defined # Def: transposase # Organism: F.magna # Pathway: not_defined # 1 48 329 375 378 65 68.0 7e-10 KICSVCGHKKKELALSERTYLCECGNRIDRDVNAAVNILEEGKRIYKKCA Prediction of potential genes in microbial genomes Time: Sat May 28 20:13:42 2011 Seq name: gi|226332913|gb|ACII01000106.1| Ruminococcus sp. 5_1_39B_FAA cont1.106, whole genome shotgun sequence Length of sequence - 59130 bp Number of predicted genes - 56, with homology - 55 Number of transcription units - 24, operones - 16 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 21 - 69 15.2 1 1 Op 1 35/0.000 - CDS 82 - 837 237 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 2 1 Op 2 33/0.000 - CDS 837 - 1922 1035 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 3 1 Op 3 . - CDS 1937 - 3061 1454 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component - Prom 3127 - 3186 2.7 - Term 3231 - 3287 12.4 4 2 Op 1 . - CDS 3533 - 4183 610 ## EUBELI_01749 hypothetical protein 5 2 Op 2 8/0.000 - CDS 4180 - 5025 705 ## COG1131 ABC-type multidrug transport system, ATPase component 6 2 Op 3 . 
- CDS 5029 - 5373 369 ## COG1725 Predicted transcriptional regulators - Prom 5417 - 5476 7.3 - Term 5414 - 5470 8.6 7 3 Op 1 7/0.000 - CDS 5575 - 7398 1263 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 8 3 Op 2 . - CDS 7401 - 8954 944 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 8980 - 9039 5.5 - Term 8989 - 9028 5.9 9 4 Op 1 . - CDS 9041 - 11914 2317 ## gi|253579779|ref|ZP_04857047.1| predicted protein 10 4 Op 2 38/0.000 - CDS 11932 - 12768 733 ## COG0395 ABC-type sugar transport system, permease component 11 4 Op 3 35/0.000 - CDS 12771 - 13658 617 ## COG1175 ABC-type sugar transport systems, permease components - Prom 13693 - 13752 7.3 - Term 13712 - 13740 1.4 12 4 Op 4 . - CDS 13756 - 15126 1319 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 15263 - 15322 7.3 - Term 15291 - 15326 2.0 13 5 Op 1 . - CDS 15505 - 16371 813 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily 14 5 Op 2 11/0.000 - CDS 16380 - 17189 1050 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase 15 5 Op 3 . - CDS 17212 - 17850 741 ## COG0352 Thiamine monophosphate synthase - Prom 17877 - 17936 3.3 16 6 Op 1 . - CDS 17973 - 18098 93 ## 17 6 Op 2 . - CDS 18106 - 19491 1553 ## COG0733 Na+-dependent transporters of the SNF family - Prom 19614 - 19673 7.0 + Prom 20371 - 20430 3.9 18 7 Op 1 5/0.000 + CDS 20604 - 21008 335 ## COG3547 Transposase and inactivated derivatives + Prom 21046 - 21105 4.3 19 7 Op 2 . + CDS 21287 - 21841 304 ## COG3547 Transposase and inactivated derivatives + Term 21928 - 21972 -0.9 - Term 21987 - 22032 -0.4 20 8 Op 1 . - CDS 22060 - 22344 296 ## EUBREC_0760 hypothetical protein 21 8 Op 2 . - CDS 22348 - 22557 133 ## gi|253579798|ref|ZP_04857066.1| conserved hypothetical protein - Prom 22664 - 22723 2.6 - Term 22800 - 22846 15.9 22 9 Op 1 . - CDS 22913 - 23374 360 ## CDR20291_1753 hypothetical protein 23 9 Op 2 . 
- CDS 23380 - 24195 563 ## CKL_3850 transporter protein 24 9 Op 3 . - CDS 24208 - 24921 376 ## CKR_3399 hypothetical protein 25 9 Op 4 3/0.000 - CDS 24995 - 25921 280 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein - Prom 25941 - 26000 4.3 - Term 26016 - 26052 5.4 26 10 Op 1 40/0.000 - CDS 26056 - 26973 646 ## COG0642 Signal transduction histidine kinase 27 10 Op 2 . - CDS 26979 - 27671 477 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 28 10 Op 3 . - CDS 27664 - 27852 94 ## EUBREC_0767 hypothetical protein - Prom 27886 - 27945 2.6 29 11 Tu 1 . - CDS 28244 - 28477 262 ## CDR20291_1756 rna polymerase, sigma-24 subunit, ecf subfamily - Prom 28625 - 28684 4.4 - Term 28804 - 28850 8.9 30 12 Op 1 36/0.000 - CDS 28928 - 30736 742 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 31 12 Op 2 4/0.000 - CDS 30733 - 31416 341 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 32 12 Op 3 40/0.000 - CDS 31529 - 32365 248 ## COG0642 Signal transduction histidine kinase - Prom 32489 - 32548 2.3 33 12 Op 4 . - CDS 32575 - 33261 345 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 34 12 Op 5 . - CDS 33254 - 33442 108 ## EUBREC_0767 hypothetical protein - Prom 33484 - 33543 3.7 + Prom 34296 - 34355 4.1 35 13 Tu 1 . 
+ CDS 34405 - 34602 68 ## HMPREF0424_0734 hypothetical protein + Term 34606 - 34661 8.7 - Term 34864 - 34901 7.1 36 14 Op 1 59/0.000 - CDS 34937 - 35329 591 ## PROTEIN SUPPORTED gi|238925338|ref|YP_002938855.1| 30S ribosomal protein S9 37 14 Op 2 7/0.000 - CDS 35347 - 35775 641 ## PROTEIN SUPPORTED gi|240144421|ref|ZP_04743022.1| 50S ribosomal protein L13 - Prom 35798 - 35857 6.5 38 14 Op 3 8/0.000 - CDS 36101 - 36841 868 ## COG0101 Pseudouridylate synthase - Prom 36866 - 36925 6.3 - Term 36879 - 36926 9.1 39 15 Op 1 34/0.000 - CDS 36960 - 37763 840 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 40 15 Op 2 15/0.000 - CDS 37760 - 38836 1194 ## COG1122 ABC-type cobalt transport system, ATPase component 41 15 Op 3 . - CDS 38827 - 39672 584 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P - Prom 39699 - 39758 3.1 42 16 Op 1 4/0.000 - CDS 39760 - 40365 580 ## COG0218 Predicted GTPase - Term 40381 - 40416 2.7 43 16 Op 2 18/0.000 - CDS 40423 - 42735 2723 ## COG0466 ATP-dependent Lon protease, bacterial type 44 16 Op 3 24/0.000 - CDS 42756 - 44072 1593 ## COG1219 ATP-dependent protease Clp, ATPase subunit - Term 44095 - 44138 11.9 45 17 Op 1 29/0.000 - CDS 44148 - 44729 606 ## COG0740 Protease subunit of ATP-dependent Clp proteases 46 17 Op 2 . - CDS 44825 - 46111 1696 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) - Prom 46305 - 46364 6.0 47 18 Tu 1 . - CDS 46386 - 46946 518 ## COG0693 Putative intracellular protease/amidase - Prom 46999 - 47058 8.1 48 19 Op 1 . - CDS 47097 - 48533 977 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 49 19 Op 2 . - CDS 48594 - 50282 1585 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit 50 19 Op 3 . - CDS 50254 - 53172 2179 ## HRM2_02090 inner membrane transport protein (PmrA-like protein) 51 19 Op 4 . 
- CDS 53217 - 54176 733 ## COG4866 Uncharacterized conserved protein - Prom 54318 - 54377 6.5 + Prom 54473 - 54532 6.8 52 20 Tu 1 . + CDS 54570 - 55025 541 ## COG2954 Uncharacterized protein conserved in bacteria + Term 55040 - 55087 -0.6 53 21 Tu 1 . - CDS 55207 - 56001 991 ## COG0345 Pyrroline-5-carboxylate reductase - Prom 56073 - 56132 2.5 - Term 56079 - 56122 -1.0 54 22 Tu 1 . - CDS 56134 - 56580 437 ## COG1490 D-Tyr-tRNAtyr deacylase - Prom 56617 - 56676 4.6 55 23 Tu 1 . - CDS 56700 - 58058 1292 ## COG0534 Na+-driven multidrug efflux pump - Prom 58114 - 58173 6.6 + Prom 58052 - 58111 7.3 56 24 Tu 1 . + CDS 58228 - 59128 325 ## COG2207 AraC-type DNA-binding domain-containing proteins Predicted protein(s) >gi|226332913|gb|ACII01000106.1| GENE 1 82 - 837 237 251 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 237 2 245 245 95 28 5e-19 MKITTEDIQVGFHARQILKGISIESKDKELVGIIGPNGSGKSTLLKCIYRILKPDAGAVY LDGEELHSMSVKSSARKMAVVAQHNYYNFDFTVREVVLMGRAPHKKALERDNAKDYRIVE EALKTVQMDAFADRTFSTLSGGEQQRVILARALAQQTPALILDEPTNHLDITHQIMLMEL VKKLNVTVISAIHDLNIAAAYCDKIYVLKDGVLEGYGTPQEVLTPELIKRIYQVDSEVVN DSRGKMHILFL >gi|226332913|gb|ACII01000106.1| GENE 2 837 - 1922 1035 361 aa, chain - ## HITS:1 COG:FN0884 KEGG:ns NR:ns ## COG: FN0884 COG0609 # Protein_GI_number: 19704219 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Fusobacterium nucleatum # 37 359 26 344 345 220 39.0 3e-57 MKEQVSQTKKGSVLRTSMYVAVLIGLAAFLIFSILAAITFGNADLSLKDVYSVIAYKLFH IKSLSAYAEGAVHDVVWLIRLPRVLLALAVGMALSVCGVVMQAIVQNPLADPYVLGISSG ASLGATLAIMLGVGGFLGGNSVGVPAFIGAMVTSFAVIAIANMGGKATSAKLILAGMAVS AVCSAFSNFVIYITNDKNAATEVMKWTMGSLAGASWSRVGVMLPVTLICVIIFWTQYRNL NLMLLGDDVSITLGTDLHRLRTFYLIVASVMIGFAVYCAGVIGFVGLVIPHVIRILFGTD HRRLLPLSALLGASFLIWCDVACRVILKNSEMPIGVLVSIIGAPCFIYLLVRKSYGFGGS K >gi|226332913|gb|ACII01000106.1| GENE 3 1937 - 3061 1454 374 aa, chain - ## HITS:1 
COG:FN0885 KEGG:ns NR:ns ## COG: FN0885 COG0614 # Protein_GI_number: 19704220 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Fusobacterium nucleatum # 97 368 18 281 286 91 26.0 2e-18 MKNRYLSTRIMAIMMAAAMAVSAAPAAFAAERADSETIKEDNQPAADADSDTQEDADAQD SADADKTADAEGTKTEYPLTITTYDYDGNEIETTYEKAPEKVLAVYQGSIETMLALGLED RLVATAGLDNEVPDELKDAFSKTNYLDEFTPSLETVTMLEPDMILSWSSLFSDKNLGNVT DWIDKGCNTYYNTNTRPDGDRTLENEFTDILNLGKIFDVQDKAQAIVDDAKAVIDKTLTA TADVEEKPSVMVLEPLGEDITNYGAKSLGGDMVTQLGATLANPDASTAGKEDIIAANPDV IFVVYMPYAGDDPETVKESQLAVIKDDEALQSLDAVKNGRVYPIMLSEMYASATRTQDGI ETFAKGLYPDVNLD >gi|226332913|gb|ACII01000106.1| GENE 4 3533 - 4183 610 216 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01749 NR:ns ## KEGG: EUBELI_01749 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 216 1 213 213 73 29.0 4e-12 MKGLFVKDIELMKQQKQFFILVVVMEVILNLAGSGSVSFATGYFTIVTAIFAITTISYDE FDNGLAFLMTLPVTRKQYVAEKYLLGAGLTAVAWGIATITGVICKGVAELQGCLSETIIG SLIDIPLALLMLAVSLPLVIHFGAEKGRYIAMVMWAIIFAVVYILIKTMGLSADAVDAWL NGLNRGMVLAVVVLFTVIVYMGSFWIGVRLMEKKEF >gi|226332913|gb|ACII01000106.1| GENE 5 4180 - 5025 705 281 aa, chain - ## HITS:1 COG:BH0652 KEGG:ns NR:ns ## COG: BH0652 COG1131 # Protein_GI_number: 15613215 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Bacillus halodurans # 1 278 4 282 288 163 32.0 4e-40 MLKISHLQKHYRGFSLDCSMEVQPGMITGLIGRNGSGKSTTFKALLGLIHPDGGEIEVFG KKAEELKPEDKQKLGVVFADSGFSMYLTAAGVANIMKSIYPDFDREKFLQQCRRFNLPTD KKIKEFSTGMKAKFKVLAALSHKAELLILDEPTVGLDVIARDEVLNMLREYMEENESSSI LISSHISSDLESLCDDFYMIHAGKIILHEDTDVLLSDYAVLKVSEEEYEKLDQQYLIKIR KEAYGYRCLTNQKQFYMENYPDVVIENGKIDDLVVMMEEQV >gi|226332913|gb|ACII01000106.1| GENE 6 5029 - 5373 369 114 aa, chain - ## HITS:1 COG:BH0651 KEGG:ns NR:ns ## COG: BH0651 COG1725 # Protein_GI_number: 15613214 # Func_class: K Transcription # Function: Predicted transcriptional regulators # 
Organism: Bacillus halodurans # 3 98 12 107 123 85 45.0 2e-17 MVPIYEQIIDQIKSAIIRGELQPDTVLPSVRSLSKELKISALTVKKAYDNLEEEGFTVTV HGKGTYVAATNKNLMREEQLKEVEHDLEQAIMKGRRCGLNDEEIRNLFEMIMED >gi|226332913|gb|ACII01000106.1| GENE 7 5575 - 7398 1263 607 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 283 599 262 592 602 157 30.0 8e-38 MEKLKYTKFKTLLISSFICISIIPVVLLGLFIFHLSKQELIRQSEKQMWQNAENVSDILD EKLDYIEEFSLKINVDTRIYKIFQNLDTSDSMQLESASQEISKILLDYLPWNNTVYSTHI VTPYYQFGEKEKNYYPNHSFMGSKIQKAADEANGKLVWIPAYNYMDMFSIEDMPRDFLEY EHVFTAVRKLQLSRVESGHIEHLENKTDQMYLVVNFTEDNLNNMLKKYTKANSQILYYIM SEDGSVVGPYGENESKLFRNVTAADMGITENKGTVKYYGANNQEYIVTYSKSMVTGWYTL AMIPVNVFSKNIATDLTRAILILVTIEIILSVTTAVLISRKIGKKVYKPLHMIEKTGEGN FTARIVYDNRDEFAFFYKKLNEMNQNLQMLVHEKYEMMIQKRDTEIMSLNIQMNPHFLYN SLNIINWVCLKGEQENASKMLLDLSRMLQYTSQNGDVLVPLWEELDWLKRYIGIMQKRYQ DQFEVSMDIPEYLRSLEVPKLFLQPFVENAIVHGFKNYQDSGKIWISAEEEENDILFYVE DNGCGICEEDLKEVLNKERKSIGVRNTNKRIRMIYGEKYGVTISSQLEEGTVVTIRLPLK VNSKKQS >gi|226332913|gb|ACII01000106.1| GENE 8 7401 - 8954 944 517 aa, chain - ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 516 1 517 530 181 25.0 4e-45 MYNILVVDDERIHREGLIMLLEKICSDDMLWEASNGQEAMEIMNNISCEIVISDIRMAEM DGLQLQKKVKTDYPETAFVIVSGYADFSYAREALQFHASDYLLKPVDAEELKRAIKKIKK SKNQDQVQKQKVDQMERRLKESAYGYMEWLLNQYIHHTRRKVWEPLSEMFPLRQEGYLML IRINKQHLILNEEMKREIGYFIKKHLIDTSSITFQVNGTENLLAVLALSGSRIEQEQITQ IEEFIQNILVLEKNRIYFAVSSLYPDMESCGNKAYEEAMAAMKYSFYEEERYLQAEKYLS DEKQEDSVGSEAILNSIKEGNLEKSEAIFEKITEVLSEHKSVQPDIFKRKIVFFFFRILR YMEPYLKAEHLSELNALDNRLIESQWYSELKKRSAELIKNIIAFREIADSQETDPLDESR 
RYIEEHYQDEISLEQIAALYHFNASYFSTLFKQRFGISFSEYLSDIRMEKAENLLKTSSD KVRKIALEVGYKDPNYFNRAFKKRHNMTPEEFRKRVG >gi|226332913|gb|ACII01000106.1| GENE 9 9041 - 11914 2317 957 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579779|ref|ZP_04857047.1| ## NR: gi|253579779|ref|ZP_04857047.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 957 1 957 957 1993 100.0 0 MHLRGNVIYNTWENFWKKAYKSSYFRAQYDETSERTDWSLSERQLFTQSNGRTLLGQDTM GRLHFLCTPHDKIYAVPSGGTDLGGQFKGYIGQYYQNDTALLLGKMKYDLELDNGLMISG ANKQSWTTYYDDFLPVTATEHPELETRIFSFAPILEKEAVPASSIHPVPGPAGAFYCIEV KNTSGCKIKGKLRLSFDQKFVNQFEHYGKYFDDYTVTPYKSEWDQKLLVLWHPEACAAIQ LLDSVCEGQGDNPRIYVPFELKANESRTFTTVIALTPKREEIYSALGTLYQHTPLEWLNV TDDFWKERFGKITTDIRENQEAGEKYRDMQVRFILDNFNCLSFREDGSLLTNWQGAPSHS LSRLWGIDITPDVVSVMYAVPEVGKSAMFYLAERNRPRYSLYSDHSMFYYISLLVIAGKY LELTGDEKFFRDHPELVAAIDEIYDGMMKHKHKEKALISSRYASDLIVFRKYDYGANVQC YYALKSYRRILKLLERDATDVDRFMEQMKADMKELMEGSGPFGRQITGGNNLGENEERFY IQDDLNYYGGEDTATVMAPLYGLYEFDYEPYVNLHRYARSMYITNYDPEFQTMRELHFGM NPSATGCTLKLGGSLTRQEMLDNLNLLYERLDETGSLFWWPRATNKKRCLTRCSQGQGAW IQQSIEQWFGLRMDGTQKTLVIKPQGLLSGYKLQKTHLGAYVFDIEYREEKGKTVINIKN YNEEAIKVIFEVRKYGAGAQGECRKVEELVLAGALVEKEFVTETMAVQETDIAGAECSTM TEDRILFSPYGIIMPKLYNSDCGIFLFRYLISQSTDTEWKNAEVKIEVPDGWKVQYKQFY HWDYQPVFSDNKAVCMVGEVPQNTHKVAGFYISLPDELAGGEKSVMLSEHPFPQDTQHKM KQVTLYVEGQKTQAAGVISASLETDGKQRKVSEIPVLILSKEDYADALDKMYHGTKK >gi|226332913|gb|ACII01000106.1| GENE 10 11932 - 12768 733 278 aa, chain - ## HITS:1 COG:BH1119 KEGG:ns NR:ns ## COG: BH1119 COG0395 # Protein_GI_number: 15613682 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 4 277 3 281 281 221 44.0 9e-58 MKNTKKIISKILIYFALILGVLFCLVPLYWMIRSSLMNTVEVFMMPPRWIPSKFMWENYQ EVFDTLPFGKYFLNSFIVTGGCVVGTMLTSSICAYGLARIKWRGRNVVFACIISSMMLPV AVTLIPTFLMWRTIGITDSFIPLIVPAWFGGGAFYIFLLRQFYLGIPKDFDEAAYLDGAS HIQIFTKIILPITKPALAVVGMFAFLNSWNDFLSPLVYLNSEKKYTVALGLQLFTGSYRG EWNLMMAAACLVLAPVVVVFAIGQKYLVEGVTMSGVKG >gi|226332913|gb|ACII01000106.1| GENE 11 
12771 - 13658 617 295 aa, chain - ## HITS:1 COG:BH1118 KEGG:ns NR:ns ## COG: BH1118 COG1175 # Protein_GI_number: 15613681 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 18 295 38 314 318 211 41.0 2e-54 MTNREKRENRWGILFSLPCILGFLIFALLPMAVSLGLSFTNFNFTTGGQFIGISNYKNLF SGSDPYFYKSLLVTVIYVALSVPTAIIFSFFIAMLLNTNVKGKGFFRTIFYLPTVVPIVA MAAIWMWIFNPDMGLANNILKAMHLPVSTWLSGESSVIPTLVFINLWTTGSTMVIFLAGM QDVPRQLLEAVEIDGGGFWAKLLHVTIPMMTPTIFYNVVMGIINGFQIFTQSYVMTQGGP NNSSLFYVYYLYREAFEFGRMGNACAVAWVLFLIIMVLTAVIFKFSNRWVYYGGE >gi|226332913|gb|ACII01000106.1| GENE 12 13756 - 15126 1319 456 aa, chain - ## HITS:1 COG:BH1864 KEGG:ns NR:ns ## COG: BH1864 COG1653 # Protein_GI_number: 15614427 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 6 448 13 438 461 109 26.0 1e-23 MKRRGMALGLAALMTCTALFTTTAKAENKGSDEKETIKFTYWGSGDEKKAIEESVDKFME ANPNIEVDLMHIPSEDFLTKLNAMIAAGEAPDVSYSASWKCQMGEDGLIYNFYDLLDDMG LSKDDYLSTCWWNWSPTESAGPVQANVTTNLMYNVDCFEEAGVDLPPTKVEEAWSWDEFV EMAQKLTLDTEGRNAADPDFDANNIKQYGVMFGTDWNVYMPFILSAGGGYLNESMDGLGL NDEKSAQILQNFADLINVYHVSPTAVQKNAMPSAATALAAKQTAMYIDGSWNHLDLMNSG CNWGVGVLPIDENYTTFFDGGSLIIFKSTKHLEASEKLYLWLTNPESSERITELYRTLWM PVQTEYYTDPDKIDFWASEELPARPEGFQDAIVKATYDHAVVATEINVKNFNEINTLVGS ALEQVWSGKKTAEEAMTEVKSQTDSLVEGTYSGERS >gi|226332913|gb|ACII01000106.1| GENE 13 15505 - 16371 813 288 aa, chain - ## HITS:1 COG:FN0508 KEGG:ns NR:ns ## COG: FN0508 COG4667 # Protein_GI_number: 19703843 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Fusobacterium nucleatum # 9 286 3 279 281 233 43.0 3e-61 MEDNKKKGLGMVLEGGGMRGLYTAGVLDELMEQGIHADSTVGVSAGAIFGCNYKSRQIGR TLRYNTRFCKDKRYMGLKSWITTGDLYSKDFAYGEVPWKLDVFDTETFARSPMKFTVVCT DIETGKPCYQECRMGDRLDVEWMRASASLPLAARPVKLNGRMYLDGGISDPIPVNWMLSQ 
GYEKNVVVCTRHPGYRKEHNKLMPLLRLKFREYPELVKLLDERHIQYNRMLDKIAYLEKE GRIHVIRPDRDISAKPVERDPAHLKEIYEVGRRVMKADMEKLKAYLAE >gi|226332913|gb|ACII01000106.1| GENE 14 16380 - 17189 1050 269 aa, chain - ## HITS:1 COG:CAC3095 KEGG:ns NR:ns ## COG: CAC3095 COG0351 # Protein_GI_number: 15896346 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Clostridium acetobutylicum # 1 261 4 264 265 312 59.0 4e-85 MRTALTIAGSDSSGGAGIQADIKTMITNGVYAMSAITALTAQNTTGVQAILNVTPEFLGQ ELDSVFQDIYPDAVKIGMVSDKDLIHVIAEKLKQYKAENIVVDPVMVATSGAKLISDDAV EILKQELFPLADVLTPNIPEAEVLVEMSVTNAEEMIEAAGKISETYHCAVLCKGGHSIND ANDLLYYDGKYCWFEGKRIDNPNTHGTGCTLSSAIASNLAKGYDLQTAVKRSKQYISGAL AAMLDLGKGSGPMDHGFAIRNEYVKESEK >gi|226332913|gb|ACII01000106.1| GENE 15 17212 - 17850 741 212 aa, chain - ## HITS:1 COG:CAC0495 KEGG:ns NR:ns ## COG: CAC0495 COG0352 # Protein_GI_number: 15893786 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Clostridium acetobutylicum # 9 209 7 207 211 173 47.0 2e-43 MKFDKEKMRLYAITDRHWLNGRSLKEVVKESLDGGVTFLQLRDKNSDDETFLQEAAELQE LCRDYKVPLIINDNVEIALKMNADGVHVGQSDMEAGAVREKLGPDKILGVSARTVEQALL AQERGADYLGVGAVFATGSKADAAELPHETLKAICEAVSIPVVAIGGITAENISQLKGTG ICGVAVISAIYAQNNIKEAAEELKEAVDKIVL >gi|226332913|gb|ACII01000106.1| GENE 16 17973 - 18098 93 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTLFTKNAAWINCKTREEATAQFTRKNGWQANMAGNLPMNV >gi|226332913|gb|ACII01000106.1| GENE 17 18106 - 19491 1553 461 aa, chain - ## HITS:1 COG:BH1128 KEGG:ns NR:ns ## COG: BH1128 COG0733 # Protein_GI_number: 15613691 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus halodurans # 13 451 9 447 453 249 37.0 7e-66 MKKENNSAHNHNRGSFSGRIGYVLAVAGSAVGLGNIWRFPYLAAKYGGGIFLLIYILLTA SFGYVLIMSETALGRMTRKSPVGAFEHFGKTKSFKIGGWLNAVIPMLIVPYYSTIGGWVI KYLAEYFKGNVQAVAEDGYFGNFISDSWQVELWFLVFAALVFIIILGGVQNGVERMSKIM 
MPVLVVLAVVVTIYSVTRPGAVEGVKYFLIPNVKNFSWMTVVAAMGQMFYSLSIAMGILY TYGSYVRKDMDIERSTTQVEVFDTGIAILAGLMIIPAVFSFSGGNPETLQAGPSLMFITL PKVFASMGFGTATGIVFFILVLLAALTSAVSLMETSVSTFMDELHWSRKKCCGLMVVIMI VLGTASSMGYGLLDFIRIFGMNFLDFFDFITNSVMMPLAALATCILILRVVGIDGMVKEI EQSSPFRRKKLYRVFIKYFAPICLIIILLSSIANVLGIISM >gi|226332913|gb|ACII01000106.1| GENE 18 20604 - 21008 335 134 aa, chain + ## HITS:1 COG:MA2729 KEGG:ns NR:ns ## COG: MA2729 COG3547 # Protein_GI_number: 20091553 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 1 129 1 117 414 80 36.0 8e-16 MKDLLEISCGLDVHKEKIVACILTGPLGKPTRSEIREFSTLIPDMIALRNWIVSKNCHHV AMESTGIYWMPIYEILEDAFSGDITLLVVNARHMKNVPGKKTDMRDSEWISTLLRAGLLN GSFIPEKKFGNSAI >gi|226332913|gb|ACII01000106.1| GENE 19 21287 - 21841 304 184 aa, chain + ## HITS:1 COG:MT2233 KEGG:ns NR:ns ## COG: MT2233 COG3547 # Protein_GI_number: 15841667 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mycobacterium tuberculosis CDC1551 # 1 180 28 215 221 111 37.0 7e-25 MTHYDSLKKHLAEIETSLEEDMAPFALQVEQLNSIYGISTTASCAIIAEIGIDMKPFKTA EHICSWAGLCPGNNESAGKRKSTSVTKGNPYIKSMLCEIAWVIAGKRNTYLSAWYWRIKQ KKGAKKAIVALARKLLVIIYTMLKQGTLFDESCFETRRKHCEQKQLSRYIRELEKHGYHV EAQS >gi|226332913|gb|ACII01000106.1| GENE 20 22060 - 22344 296 94 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0760 NR:ns ## KEGG: EUBREC_0760 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 92 88 178 184 77 46.0 1e-13 MYRYKAQYSLDCENGIENAVLLKPQTPEMLLEEKQFQEQVYAAVMKLPEKQAKRIYARYY LGMAVNEIAEVEGVDPSRVRDSIRRGLKQLVKYF >gi|226332913|gb|ACII01000106.1| GENE 21 22348 - 22557 133 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579798|ref|ZP_04857066.1| ## NR: gi|253579798|ref|ZP_04857066.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 69 1 69 105 113 86.0 3e-24 MIRLSAGVGPRLHSIHPSIIRESKWRYIMKKINLREFYPDVYTTDFFVNVTEEVMETIRA AERAEAAYE >gi|226332913|gb|ACII01000106.1| GENE 22 22913 - 23374 360 153 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1753 NR:ns ## KEGG: CDR20291_1753 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 153 1 153 153 232 82.0 3e-60 MKKIVLSLCLILLGASVLSGCTKDEFLDQYNNIVQSAGSIALTGNSSLQGTKEKGIDDYT GSYTANYEDFSDTEYLFGGTSIKREAGKDLSIDCALEVTEGTAKVFWISGADEAVTLFET TGTYSDTITLPEGGNYIGIECENFTGNIELNIE >gi|226332913|gb|ACII01000106.1| GENE 23 23380 - 24195 563 271 aa, chain - ## HITS:1 COG:no KEGG:CKL_3850 NR:ns ## KEGG: CKL_3850 # Name: not_defined # Def: transporter protein # Organism: C.kluyveri # Pathway: not_defined # 3 262 2 262 274 145 34.0 1e-33 MWNLLKSELLKLRRCQILLVGLVALALCPLVQYGSQLIVEAEYRNPNYDFSALFENVVWG NTQIFLPISLVMIGGWLIDRESTHDTLKNIMTIPVSMPKMLGAKLLLVGMLAVLLGIYSV GITLITGLTVGLSGLTEEVFFHGGTQIVLAALTTYLVCMPLILIFGQIRGAYLGGSILAF FLGYSMLFFKSGILASIYPFSAALILVGFDMSEYAGTTTSSTPLLAVIGVGIMVLWAVLL LLMSSNKKEMKSRKQANAKGKGKCAVRRKGR >gi|226332913|gb|ACII01000106.1| GENE 24 24208 - 24921 376 237 aa, chain - ## HITS:1 COG:no KEGG:CKR_3399 NR:ns ## KEGG: CKR_3399 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 20 237 49 264 264 129 38.0 7e-29 MPLAYAFFLADVNTDVDAVNGVMSSLFQLSAYLLLMPLLVILASNLLFEELDNDTLKNLV TVPVNRTKLVLSKMLVLLLFAVGFMAVGGLVNLAVLLFQGWEPVGFWTLFGVGIEEGLIM WVGTLPCILLVVLLNKNYIVSVVITFFYTIANYILSMNDMFLTQPFGLNIGTLFPGPLAF RWTFQFYDQSQTSAELADLLERVSPYFLNGVQVFGVIIVEAIVFLALIAFVYRRQEI >gi|226332913|gb|ACII01000106.1| GENE 25 24995 - 25921 280 308 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 10 301 8 301 311 112 30 5e-24 MDTNYIIETKNLTKQYGSQKSVADLNIHVKRGRIYGLLGRNGAGKTTTMKMLLGLTKPTS GEVKIWGKSLQGNEKKLLPRIGSLIESPGFYPNLTGTENLRIFATLRGVPNNHAIKDALD LVGLPYKDKKLFSQYSLGMKQRLAIALAVMHDPELLILDEPINGLDPIGIAEVRSFIREL 
CDARGKTILISSHILSEISLLADDIGIIDHGALLEEESLAELEQKSSKHIRFTLSDTAQA ARILERNFHENHFSIQDDHNLRLHNLDLPVGKIVTAFVENGLEVSEAHTCEESLEDYFKR VTGGEGIA >gi|226332913|gb|ACII01000106.1| GENE 26 26056 - 26973 646 305 aa, chain - ## HITS:1 COG:BH0819 KEGG:ns NR:ns ## COG: BH0819 COG0642 # Protein_GI_number: 15613382 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 28 299 28 302 309 130 30.0 5e-30 MEIIVFLSFVIAVVAVLTSIVLVRRVKKQIAEMTDVLVDVKNGNGNRRILSATNELTAPL AYEINEIVVAYEGRLSTVRQTEETNRQLMTSLSHDVRTPLTTLLGYLDATHKGLVTGKDR DDYIETARRKAHDLKEYIDVLFDWFKLNSNEFALEIQSVEVAELTRNILIDWIPIFEDKQ VDYDIDIPEQPVRVRLDMDSYMRIINNLIQNVIAHSHADKIKISLSKKENSMELLLADNG VGIEKEDLKHIFERLYKCDKGRSDKGSGLGLSIVHQLVEKMGGSITVESLPGKGTEFMLL FPLES >gi|226332913|gb|ACII01000106.1| GENE 27 26979 - 27671 477 230 aa, chain - ## HITS:1 COG:BH1808 KEGG:ns NR:ns ## COG: BH1808 COG0745 # Protein_GI_number: 15614371 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 4 227 6 230 231 174 40.0 2e-43 MNKILIIDDDRELCALIKRSVQAENIEADFCNTGKEGLQKLREQEYQLVVLDVMMPGMDG FETLEEIRKENSLPILMFTSKNDSISKVRGLRAGADDYLTKPFDMDELIARIASLIRRYT RFNQQAGAIQKLNFDGLQIDLENRSVTTSNGTFELPPKEFDLLLYCAKHQGKILTKQQIY EEVWGEEYFYDDSNIMAIISRLRKKLEVNPSSPKYIQTVKGIGYRFNKEV >gi|226332913|gb|ACII01000106.1| GENE 28 27664 - 27852 94 62 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0767 NR:ns ## KEGG: EUBREC_0767 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 62 1 62 62 107 85.0 2e-22 MDYMTLKEAAEKWGVTPRRVNYYCAAGRIPGAVKMAGVWLIPKNAEKPIDGRTKQGKGLH HE >gi|226332913|gb|ACII01000106.1| GENE 29 28244 - 28477 262 77 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1756 NR:ns ## KEGG: CDR20291_1756 # Name: not_defined # Def: rna polymerase, sigma-24 subunit, ecf subfamily # Organism: C.difficile_R20291 # Pathway: not_defined # 1 
64 13 75 165 64 50.0 1e-09 MKKINFRELYPDVYTTDFFVDVTEEVMETIRAAERAETAYERKMYRYKAQYSLDCENGIE NAVLLKPQTPEMLLEEK >gi|226332913|gb|ACII01000106.1| GENE 30 28928 - 30736 742 602 aa, chain - ## HITS:1 COG:CAC0454 KEGG:ns NR:ns ## COG: CAC0454 COG0577 # Protein_GI_number: 15893745 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 71 595 250 824 832 114 21.0 6e-25 MKSYLSLIPISAKVHRRQNGMTLLCIVFAVFMVTAVFSMAEMGFRMEQARLVGKHGSFSI GDLLGSSMGQTLLSVAVVLFLLILIAGVLMISSSMNSSVAQRTRFFGMMRCIGMSKQQTI LFVRLEALNWCKTAIPIGLVLGVVTTWGLCVVLRFAVKEEFSDIPLFGISIFGIACGIFV GLITVLIAANAPAKHAAKVSPITAVSGNAGHEKTMRHSPYVGFGKIESLLGVSHAISGKK NLFLMTGSFALSIILFLSFSIMVDFVDYLIPQSAATSDIDIASADGNTIPWELLTTIREM DGVKEVYGRRSVFDVSAKLNDDTNFSGTVDLISYDDFDLQCLKKDSALKRGSDLSKVFGD SNFVLATSDQDSTWKIGDTVQIGDETLTIAGLLKNDPFSENGLTNGKLTLITSDETFVRL TGEEGYSLVLIQTTGDVTDENVQAIQNSVDQTYSFRDKRDERTTRTYMAFVFCVYAFLAI IALVTVMNIVNSLSMSVSARMKQYGAMRAVGMDERQMTKMIACEAFTYAVLGCVVGCAIG LPLSKALYDFLIAGHFPSAVWQFPIISLGVILLFVSIAAIAAVYAPAKRIRNMSITATIN EL >gi|226332913|gb|ACII01000106.1| GENE 31 30733 - 31416 341 227 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 223 1 221 245 135 34 4e-31 MNLLEVKNISKTYGNGETAVKALKDVSFSVPKGEYVAIVGESGSGKSTLLNMIGALDTPT SGKVLIDGKDIFAMNDRKLTVFRRRNIGFIFQAFNLIPELTVEQNIIFPLLLDYQKPDKR YLEELLTVLNLKDRRNHLPSQLSGGQQQRVAIGRALITRPSLILADEPTGNLDSQNSSEV IALLKETSKKYEQTIIMITHNRSIAQTADRVLQVSDGTLTDFGRCRE >gi|226332913|gb|ACII01000106.1| GENE 32 31529 - 32365 248 278 aa, chain - ## HITS:1 COG:CAC0451 KEGG:ns NR:ns ## COG: CAC0451 COG0642 # Protein_GI_number: 15893742 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 12 273 149 411 416 163 35.0 3e-40 MEQAILQINAYLDGDRNARIECDNEGELYRLFHSINSLAAVLNAHADNELREKEFLKNTI SDISHQLKTPLAALNIYNGLLQDEDIEVSSVKEFAGLSEQELDRIETLVQSLLKITRLDA 
GAIVLEKNAENVADMMRDIELHFAYRARQEKKEIILSGSDHLSLFCDRDWLNEAIDNIVK NAFDHTESGATIHVAWKELPSGVQIVITDNGCGIHPEDIHHIFKRFYRSRFSKDKQGIGL GLPLAKAIVEAHNGTIEVDSELGIGTTFTMNFLIPTKL >gi|226332913|gb|ACII01000106.1| GENE 33 32575 - 33261 345 228 aa, chain - ## HITS:1 COG:CAC0524 KEGG:ns NR:ns ## COG: CAC0524 COG0745 # Protein_GI_number: 15893814 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 223 1 220 228 195 42.0 6e-50 MSNILLLEDDLSLINGLSFAFKKQGFELNIARTLKEAEELWTDGKYDLLVLDVSLPDGSG FEFCKRVRLTSKVPIIFLTASDEETSIIMGLDIGGDDYITKPFKLGVLVSRINALLRRAN SFQAADTELQSNGIKVLLLQGQVFKNGELLDLTAAEYKLLCLFMRNPSMVLTKGQILDKL WDCDGNYIDSSTLTVYMRRLRMKIEDNPSEPQMLLTVRGMGYKWNIIG >gi|226332913|gb|ACII01000106.1| GENE 34 33254 - 33442 108 62 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0767 NR:ns ## KEGG: EUBREC_0767 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 62 1 62 62 103 87.0 2e-21 MDYMTLKEAAEKWGVTPRRVNYYCAAGRISGAVKMATIWLIPKDAEKPIDGRTKQGKVKK YE >gi|226332913|gb|ACII01000106.1| GENE 35 34405 - 34602 68 65 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0424_0734 NR:ns ## KEGG: HMPREF0424_0734 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis # Pathway: Homologous recombination [PATH:gva03440] # 3 65 259 321 322 73 52.0 2e-12 MSKQILAFCTTPKSKKELAVFCGFKDLRNFTLKHINPLLESGQLEMTIPDKPKSRNQKYI TVRSE >gi|226332913|gb|ACII01000106.1| GENE 36 34937 - 35329 591 130 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238925338|ref|YP_002938855.1| 30S ribosomal protein S9 [Eubacterium rectale ATCC 33656] # 1 130 1 130 130 232 86 4e-60 MASAKFYGTGRRKKSIARVYLVPGTGKITINKRDIDEYFGLDTLKVIVRQPLAATETEGK FDVLVNVHGGGYTGQAGAIRHGVARALLQADNDYRPVLKAAGFLTRDPRMKERKKYGLKA ARRAPQFSKR >gi|226332913|gb|ACII01000106.1| GENE 37 35347 - 35775 641 142 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240144421|ref|ZP_04743022.1| 50S 
ribosomal protein L13 [Roseburia intestinalis L1-82] # 1 142 1 142 142 251 84 7e-66 MQTYMANPDKIERKWYVVDADGCTLGRLASGVASVLRGKNKPQFTPHVDTGDYVIIVNAD KIKVTGKKLEQKIYYNHSDYVGGMRETTLKEMLAKKPERVIELAVKGMLPKGPLGRSMYT KLFVYAGPEHKHEAQKPEALTF >gi|226332913|gb|ACII01000106.1| GENE 38 36101 - 36841 868 246 aa, chain - ## HITS:1 COG:CAC3099 KEGG:ns NR:ns ## COG: CAC3099 COG0101 # Protein_GI_number: 15896350 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Clostridium acetobutylicum # 1 239 1 239 244 226 50.0 2e-59 MKRIGLVVAYDGTNYCGWQTQPNGITVQGVLNDTLSELLGEKIETIGASRTDAGVHAMGN VAVFDTNTRIPGEKISYALNQRLPEDIRIQLSEEVEPDFHPRYCDSEKTYEYRILNRKFP VPTERLYTYFYHYKLDVDKMKAATSYLIGQHDFASFCGAKAQVKTTIRTVTGIDVWRDGD IVTIRVTGTGFLYNMVRIISGTLIEVGNGQYPPERVKTILEACNREMAGPTAPAQGLTLM GIEFFD >gi|226332913|gb|ACII01000106.1| GENE 39 36960 - 37763 840 267 aa, chain - ## HITS:1 COG:CAC3100 KEGG:ns NR:ns ## COG: CAC3100 COG0619 # Protein_GI_number: 15896351 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Clostridium acetobutylicum # 1 248 1 250 267 275 55.0 4e-74 MIRDITIGQYYPADSVLHKMDPRAKLVGTLVFIISVFVFHTFPGYAVATIFLAAMIILSK VPVKFMFKGLKAIVMLLMITVVFNIFLTPGKVLWQWGILHVTEEGLKLAGRMAIRLTYLV IGSSLMTLTTTPNQLTDGLERLLRPLNKLHVPIHEIAMMMSIALRFIPILMEETDKIMKA QIARGADFETGNIIQKAKNMIPLLVPLFISAFRRANDLAMAMEARCYHGGDNRTQMKPLR YESRDYISYVVMWAYLGIAIVFRIVGI >gi|226332913|gb|ACII01000106.1| GENE 40 37760 - 38836 1194 358 aa, chain - ## HITS:1 COG:CAC3101 KEGG:ns NR:ns ## COG: CAC3101 COG1122 # Protein_GI_number: 15896352 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, ATPase component # Organism: Clostridium acetobutylicum # 1 280 1 281 286 289 54.0 5e-78 MSIELKNVTYTYSPGTAYEIHALKDINLSIPDGQFIGIIGHTGSGKSTLIQHFNALIRPT SGTITYNGEDIWGEKYDRRALRSEVGLVFQYPEHQLFENDVLSDVCFGPMNQGLSREEAE 
VEAKKALQHVGFKEKNFSKSPFELSGGQKKRVAIAGVLAMNPKILILDEPTAGLDPKGRD DILDQIAELHKVRGITIILVSHSMEDIAKYAERLIVVNDGEIAYDDAPKAVFAHYRELEA MGLAAPQITYIMHALKEKGLDVDVTATTVDEAKENILAVLRKCNLGKLSGNRQSVSDSQS KKENAKAETLISEMEADLTKDGTKESHSDVPDEDLSIQKETMDKKSAEQEENEGGQEV >gi|226332913|gb|ACII01000106.1| GENE 41 38827 - 39672 584 281 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 28 280 146 397 398 229 47 3e-59 MGIIKAFKLGFDYLKYDEDGNVQDTQRAVNDVNLDIEAGQFVAVLGHNGSGKSTLAKHLN ALLLPTEGTLWVDGIDTSKEPELWKVRQKAGMVFQNPDNQIIGTVVEEDVGFGPENMGVP TEKIWERVDESLKKTGMTSYRYHSPNKLSGGQKQRVAIAGVMAMRPKCIILDEPTAMLDP NGRKEVLEAVSDLNRREGVTVILITHYMEEVVHADKVYVMDNGEVVMQGTPREIFSQVET LKEYRLDVPQVTLLAHELHKAGVDVSEGILTTEELVNALCQ >gi|226332913|gb|ACII01000106.1| GENE 42 39760 - 40365 580 201 aa, chain - ## HITS:1 COG:lin1593 KEGG:ns NR:ns ## COG: lin1593 COG0218 # Protein_GI_number: 16800661 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Listeria innocua # 1 191 1 191 194 197 49.0 2e-50 MVIKNVSLDIVCGITSKLPDTNRPEVAFAGKSNVGKSSLINGLMNRKSLARTSAQPGKTQ TINFYNINEAMYLVDLPGYGYAKVSQSEKEKWGKMIERYLHTSKNLKAVFLLIDIRHDPS ANDKMMYDWILNNGYEPIIIATKLDKLKRSQVQKNIKAIKEGLKLSKDGIIIPFSAETKQ GRDEIWALIDELTGLAATEEA >gi|226332913|gb|ACII01000106.1| GENE 43 40423 - 42735 2723 770 aa, chain - ## HITS:1 COG:BS_lonA KEGG:ns NR:ns ## COG: BS_lonA COG0466 # Protein_GI_number: 16079872 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Bacillus subtilis # 8 770 9 770 774 724 49.0 0 MTNQNIVLPAIALRGTTILPGMIVHFDVSRERSVKAIEAAMLHDQKIFLVTQIDPEVESP DLAGVYHVGTIAYIKQVVKLPQNLLRVLVEGTGRATLVKFEQEFPFIRSEITPVDEEEMQ MPEPVMEAMHRSLKELFHRYCMENGKVSKELVAQILNIDNVEELVEQIAVNIPLSYQNKQ KILEALTLEERYEVLGAILGNEIEIMQIGRDLQKKVKARIDKNQREYILREQLKLIREEL GEDNTADDAEEFKKKLQELQAGDEVKEKISKEIERFKNTNSNVSENAVLRGYIETMLALP 
WEKKSTDSDDLKEAWKVLQEGHYGLKDVKERVMEFLSVRKLTHKGKSPILCLVGPPGTGK TSIARSIAEAMHKKYVRICLGGVRDEAEIRGHRKTYVGAMPGRITAALQQAGVSNPLMLL DEIDKTSSDYKGDTSAALLEVLDPEQNSRFMDHYIEVPQDLSEVLFIATANDVQGIPRPL LDRMELIEIAGYTENEKEHIAKEHLIPKQMEENGIEKGKLTIQSAALKKIINNYTKEAGV RNLERTIGQICRKTARLIMEEDKKKVTVTSKNLSDFLGKEHFNYLMANKKDEIGISRGLA WTQVGGDTLQIEVNVMPGKGELMLTGQLGDVMKESAQAGITYIRSIASDYKVEPEFFQEN DIHVHIPEGAVPKDGPSAGITMATAILSAIIKKPVRADLAMTGEITLRGRVLPIGGLKEK LLAAKYAKIKEVLVPAENKPDIQELDKEITDGLTITFVSSMKEVLNKALV >gi|226332913|gb|ACII01000106.1| GENE 44 42756 - 44072 1593 438 aa, chain - ## HITS:1 COG:CAC2639 KEGG:ns NR:ns ## COG: CAC2639 COG1219 # Protein_GI_number: 15895897 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent protease Clp, ATPase subunit # Organism: Clostridium acetobutylicum # 1 413 1 409 432 528 64.0 1e-149 MAEKMREGKVRCSFCQKTEDQVRKLIAGPDGKVFICDECIGICSEIMEEELNPYDDEVLN SDINLLKPEEIHAVLDDYVIGQDAAKKALSVAVYNHYKRILASKNSDVELQKSNILMLGP TGSGKTLLAQTLARLLNVPFAIADATTLTEAGYVGEDVENILLKIIQAADYDIERAQYGI IYIDEIDKITRKSENPSITRDVSGEGVQQALLKILEGTVASVPPQGGRKHPHQEFLQIDT TNILFICGGAFDGIEKIIESRQDTKSIGFGAEVSVKEDRNVGEILKDVMPEDFIKFGLIP EFIGRVPVVVTLDALDENALISILKEPKNSLTKQYHRLFELDGVELDFEDDALELVAKKS LERKTGARGLRAIMEGSLMDLMYKIPSDDTIRKCTITKDVVDGTGEPEIVRGETPAQAKT AGSRRTSRTHKKDKPETA >gi|226332913|gb|ACII01000106.1| GENE 45 44148 - 44729 606 193 aa, chain - ## HITS:1 COG:CAC2640 KEGG:ns NR:ns ## COG: CAC2640 COG0740 # Protein_GI_number: 15895898 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Clostridium acetobutylicum # 1 190 1 190 193 290 74.0 8e-79 MSLVPYVIEQTSRGERSYDIYSRLLKDRIIFLGEEVNDVSAGLIVSQLLFLEAEDPGKDI QLYINSPGGSVTAGMAIYDTMQYIKCDVSTICLGMAASMGAFLLAGGAKGKRFALPHSTI MIHQPSGGAQGQATEIQIVADHIAQTKRTLNELLAANTGQPIEVVERDTDRDNYMTAEEA KAYGLIDGVVMHK >gi|226332913|gb|ACII01000106.1| GENE 46 44825 - 46111 1696 428 aa, 
chain - ## HITS:1 COG:lin1306 KEGG:ns NR:ns ## COG: lin1306 COG0544 # Protein_GI_number: 16800374 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Listeria innocua # 1 427 1 426 427 311 44.0 1e-84 MSLQVEKMEKNMAKLTIEVAAEDLEKAMQNAYQKAKGRISIPGFRKGKAPRKMIEQMYGK GVFLEDAVNALIPEHYSKALAECELEIVSQPTIDITQAEPGKAFIFTAEVAVKPEVTLGD YKGVEVPKTEITVTDEDVEAELKKEQEKNSRTISVEDRAAQLNDIVTIDFEGSVDGVPFD GGQATEYPLTLGSNTFIPGFEEQLVGAKVGDDVDVKVTFPEEYQAKELAGKEAIFKCAVK KIEAKELPELDDDFAKDVSEFDTLAEYKEHVKTNLEDKKADEAKRAKEDAAVDKAIENAQ MDIPEAMLMTQCRQMLDDFSRRMQSQGLSMDQYFQFTGMTADKMMEDMKPQALKRIQTRL VLEKVAEVENIQPTEEEVNEEISKMAEAYKMEADKLKELLGERELEQMKKDMAVQKAVTV IADAAKEV >gi|226332913|gb|ACII01000106.1| GENE 47 46386 - 46946 518 186 aa, chain - ## HITS:1 COG:SP0804 KEGG:ns NR:ns ## COG: SP0804 COG0693 # Protein_GI_number: 15900697 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Streptococcus pneumoniae TIGR4 # 4 175 3 169 184 129 44.0 3e-30 MSKKAYIFLADGFEDIEGLTVVDLLRRAGIDIKTISIKETTRIQTSHGITMLTDAIFAET DFADADMLVLPGGMPGTKYLAGYKPLIDLLTDFNNKGKKIAAICAAPSVFSGLGFLKGRK ATSYPSFMEVLSKDGAVTSEDSVVVDGNITTSRGLGTAVDFALSLISQLENEEKAKEIAE SVVYTR >gi|226332913|gb|ACII01000106.1| GENE 48 47097 - 48533 977 478 aa, chain - ## HITS:1 COG:ECs1921 KEGG:ns NR:ns ## COG: ECs1921 COG1473 # Protein_GI_number: 15831175 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Escherichia coli O157:H7 # 6 475 3 477 481 397 44.0 1e-110 MDHTDEYIKEILEKIDSKKGYYAEISNKIWEFAEPRFQEYRSSELLQHVLKQAGFSIKAD LAGEKTAFIAEYGSGKPVIAFLGEFDALPGLSQKADAAERIPANTGSCGHGCGHQLIGTG TLASVIALKDFVKEHNLQGTIRYYGCPAEENAGGKAFLVRDGYFNDCDLALCWHPEQGRR ACYGSTKANFRVFFTFHGTPAHASMCPELGRSALDAVELMDVGVNYMREHMIDEARIHYA ITDTGGDAPNVVQSRAQVLYAIRAPKITQVKELYNRVCNIARGAALMTETTVEIRQVAAY SNLISSKILADHMNAYLEKLGPITYTEEEYAYAQKFQQSLSDQDKNQLPTIARDYFGPKA 
PEAMKTPIFEPIACQNLYSDLKYSTDVGDVSWIVPTVHLNIPTFAAGTALHSWQAVAQGK SSIAHKGMLYAAKIMALTALDFLQNPELANQARKELTETLNGETYPDPLPKDLKPEIW >gi|226332913|gb|ACII01000106.1| GENE 49 48594 - 50282 1585 562 aa, chain - ## HITS:1 COG:FN1091 KEGG:ns NR:ns ## COG: FN1091 COG2208 # Protein_GI_number: 19704426 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Fusobacterium nucleatum # 264 551 162 447 447 139 35.0 1e-32 MEKRRKLNVELIKFLIGSMLVVGVLMLLGSSLVNNRQYRKLYNDKALEIAKTVSDQVNGD FIEELCKEIDTEEFEQIQKEAVAADDEQPIIDWLKAKGMYQNYERINEYLHSIQADMNIE YLYIQMIQDHSSVYLFDPSSGYLTLGYKEELSERFDKLKGNERLEPTVSRTEFGWLSSAG EPVLSSDGEKCAVAFVDIDMTEIVRNTIRFTVLMVCLCILIILAAGMDISRKIKKRISRP IELLTEATHKFGNGEEGYDENNIVDLDIHTRDEIEELYHATQSMQKSIINYMDNLTRVTA EKERIGAELNVATQIQASMLPCIFPAFPDRDEMDIYATMTPAKEVGGDFYDFFMVDDRHM AIVMADVSGKGVPAALFMVIGKTLIKDHTQPGRDLGEVFTEVNNILCESNENGMFITAFE GVLDLVTGEFRYVNAGHEMPFVYRRETNTYEAYKIRAGFVLAGIEDIVYKEQKLQLNIGD KIFQYTDGVTEATDKDRQLYGMDRLDHVLNQQCLSSNPEETLKLVKADIDAFVGGNDQFD DITMLCLEYTKKMENQRLLNNC >gi|226332913|gb|ACII01000106.1| GENE 50 50254 - 53172 2179 972 aa, chain - ## HITS:1 COG:no KEGG:HRM2_02090 NR:ns ## KEGG: HRM2_02090 # Name: not_defined # Def: inner membrane transport protein (PmrA-like protein) # Organism: D.autotrophicum # Pathway: not_defined # 478 950 331 798 817 100 20.0 3e-19 MLLLTVCSVFVVYAGREWMFTNPFKPYTFSSVSYASGDGDGCTYVIDDSNRKILKISADG RLLWRACASDKSFLSAERVVADGDGNVYLHDVRIEQGVQIASEGIVKLSSKGKYISTVAS VEAEKGSVRRNIVGMVPTEHGVIYMQKEKEGILVSNTEQGSSKVFSVADAQDRILCCAYD RDSDSLFYVTYDGKIYKYTDSGQDELLYDSDTVDGSIPQEISYSDGVLYSADIGLRDIIR IPCDMENTGSTDRLTVEESLKEREIAYHVSAPGTLVSSTNYSVILWDGEDYEQFWDVPLS GKLQVWNCLLWAACAVIVAAVLFFAVTLLKILVKKFSFYAKITMAVIGIIVGVAALFIGT LFPQFQSLLVDETYTREKFAASAVTNRLPADAFQRLEKPSDFMNEDYRQVRQVVRDVFFS DSDSSQDLYCVLYKVKDGTVTLVYTLEDICVSYPYDWEYEGTDLQEVMEQGATKTYATNS SSGSFVFIHSPIRDKSGDIIGIIEVGTDMNSLTEKSREIQVSLIINLIAIMVVFFMLTFE VIYFIKGRQELKRRKQEENNSRLPVEIFRFIVFLVFFFTNLTCAILPIYAMKISEKMSVQ 
GLSPAMLAAVPISAEVLSGAIFSALGGKVIHKLGAKRSVFVSSVLLTAGLGLRVVPNIWL LTLSALLLGAGWGVLLLLVNLMIVELPDEEKNRAYAYYSVSSLSGANCAVVFGGFLLQWM SYTALFAVTAVLSVLLFLIANKYMSKYTSDNEEENCETEDTHMNIVQFIFRPRIISFFLL MMIPLLICGYFLNYMFPIVGSEWGLSETYIGYTYLLNGIFVLILGTPLTEFFSNRGWKHF GLAAAAFIYAAAFLEVAMLQNIPSLLIALALIGVADSFGIPLLTSYFTDLKDVERFGYDR GLGVYSLFENGAQSLGSFVFGYVLVLGVGRGLIFVLILVSVLSAAFLISTTFAAHRDKKE VKEHGKKTKTEC >gi|226332913|gb|ACII01000106.1| GENE 51 53217 - 54176 733 319 aa, chain - ## HITS:1 COG:MA1791 KEGG:ns NR:ns ## COG: MA1791 COG4866 # Protein_GI_number: 20090642 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 24 309 8 294 313 61 20.0 2e-09 MKSDTKKYYKIPQKMLEIFLARGYQELRVEDQSLFDPYYDSMNDHWSANTSFLNLIGWRD SYPTYFKIAEGLLINITYLNTEGYPVAVPFVGNYTEEKIKKVMQILKEDFAGFDAQLIIM DVVPWMLPYYEASGIQFEIEDNRDYMDYTFTPEQFLAGMDMQDDRYRYNYFKRRFAYETE EITPFHREEIREFMEREWCGEKTCEECHCGCLLRVIDNLVPVFDKIRINGILVRVEGKMA GLCIVSCRNGLGVYQYKNAVNRIKGINEYLLRESFERYLQSADTINYTEDMGVENLRYYK EHMAPAHTLLSKLTLTERG >gi|226332913|gb|ACII01000106.1| GENE 52 54570 - 55025 541 151 aa, chain + ## HITS:1 COG:SMc03154 KEGG:ns NR:ns ## COG: SMc03154 COG2954 # Protein_GI_number: 15966638 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 2 151 4 152 157 68 35.0 3e-12 MEIERKFLISKENLPADLNSYPHHRLEQGYLSTAPVVRIRKEDDNYYLTYKSKGLMTREE YNLPLTKESYEHMRPKADGILISKTRYLIPEKDGLTIELDVFDAPYEGLYLAEVEFSSEE QALSYNPPVWFGEDVTNSGKYHNSRLSQGNL >gi|226332913|gb|ACII01000106.1| GENE 53 55207 - 56001 991 264 aa, chain - ## HITS:1 COG:lin0414 KEGG:ns NR:ns ## COG: lin0414 COG0345 # Protein_GI_number: 16799491 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Listeria innocua # 2 264 3 266 266 251 53.0 1e-66 MKIGFIGCGNMASAMISGMLKKGLYKKDEIIVSNLTEEGSKRSREKLGVVTTLDNHEVVK NTKLVFLAVKPQFYEEVLNEVKDELTPEHTVVGIAPGKTLAWLEEKCGQPLKVVRMMPNT 
PAQVGEGMTGVCANEKVSAEELAQICEITDSFGRTEVVPERLMDAVSAVSGCSPAYVFMF IEAMADAAVAQGMPRKQAYQFAAQALLGSAKMVLETGMHPGELKDMVCSPAGSTIEGVRI LEQNGFRSAVFEALNGAAEKGKKM >gi|226332913|gb|ACII01000106.1| GENE 54 56134 - 56580 437 148 aa, chain - ## HITS:1 COG:FN0349 KEGG:ns NR:ns ## COG: FN0349 COG1490 # Protein_GI_number: 19703692 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Fusobacterium nucleatum # 1 147 4 150 154 149 50.0 1e-36 MRFVIQRVMHSKVTINGEVRGQIGKGFMVLIGVGEEDTVEIADKMIHKMVNMRIFEDENG KTNLGLKEVGGSLLLISQFTLYADCKRGNRPSFIRAGAPDMAENLYKYIIARCKEHIDIV EQGEFGADMKVELVNDGPFTVILDSNEL >gi|226332913|gb|ACII01000106.1| GENE 55 56700 - 58058 1292 452 aa, chain - ## HITS:1 COG:L170983 KEGG:ns NR:ns ## COG: L170983 COG0534 # Protein_GI_number: 15672149 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Lactococcus lactis # 8 442 3 437 446 256 35.0 5e-68 MARSKEMDLTTGNPFRSLLKFAIPVILGNLFQLFYTLADSVIVGKTLGADSLAAVGSTSI IIYFVFCFINGFTGGFGICLGQKCGAKDEKGMRKSVAVSALLSIAFTVVLTLICCLLSHQ ILRWMQIPADISDEAYDYMFVVLLGTGATVFYNMISNMLRALGDSKTPLYFLVFSSVLNI FFDILFIVPFHMGVAGAAWATILSQFLSAVLSLVVGLKNFPILHLHREDFTDLKDTAVLH LKTGFPMGFQMSVMCIGQLAMQAVVNSLGTAAVAGYTAASKADQLSVLVNNAMMTAISNY VAQNFGAGRWDRIRQGVRACLIQTEAFNLIMCIGILLLRHPIVRMFLSDSTKEIYHYSDM YLTIVAPFYFVLGLLAVYRTSIQSMQNGKAPFAACMIELVMRIAATVGLSGIIGYTSVCI ASPMAWIGACALLIPVYYKMMKGNCAALLRTE >gi|226332913|gb|ACII01000106.1| GENE 56 58228 - 59128 325 300 aa, chain + ## HITS:1 COG:BS_ydeC KEGG:ns NR:ns ## COG: BS_ydeC COG2207 # Protein_GI_number: 16077582 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 3 296 1 289 291 128 25.0 2e-29 MAMRKILVDETNRELCEHGTVGFPMTVNHDDLWAFEGKSVPIHWHNDLEISLPREGEAIY QVYRKSYTVRPGDVLLLNRNVPHSCHSPNNSHARYSTFLTRPDFICGEYGSDVERRCFRP FLQNSAVPCILLTSESSCTRTVIQKLNETEALFDQKTFCYELKIKGLLCEIFGMILCEHQ NDLAKFIPANQLELERLEQMLDYLNTHFGSVISLQELADQVHLSREVCCRLFKKMTGKTI 
TGYLEEYRVNQSLPLVQSCQYSMIQIADMTGFSNASRFARAFRRQFGCNPGEYNSLKLQK Prediction of potential genes in microbial genomes Time: Sat May 28 20:15:05 2011 Seq name: gi|226332912|gb|ACII01000107.1| Ruminococcus sp. 5_1_39B_FAA cont1.107, whole genome shotgun sequence Length of sequence - 41501 bp Number of predicted genes - 37, with homology - 36 Number of transcription units - 15, operones - 10 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 124 - 176 7.2 1 1 Tu 1 . - CDS 238 - 864 728 ## EUBELI_01180 hypothetical protein - Prom 939 - 998 4.2 2 2 Op 1 . - CDS 1014 - 2474 970 ## gi|253579828|ref|ZP_04857096.1| conserved hypothetical protein 3 2 Op 2 40/0.000 - CDS 2511 - 3794 898 ## COG0642 Signal transduction histidine kinase 4 2 Op 3 . - CDS 3800 - 4459 695 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 4533 - 4592 4.3 - Term 4478 - 4522 -0.4 5 3 Op 1 . - CDS 4594 - 4869 368 ## Athe_1650 hypothetical protein 6 3 Op 2 . - CDS 4887 - 5147 254 ## EUBREC_3089 hypothetical protein - Prom 5167 - 5226 8.1 7 4 Op 1 . - CDS 5237 - 6430 959 ## gi|253579833|ref|ZP_04857101.1| predicted protein 8 4 Op 2 . - CDS 6414 - 6980 332 ## Cphy_0946 ECF subfamily RNA polymerase sigma-24 factor - Prom 7022 - 7081 7.1 + Prom 6993 - 7052 9.1 9 5 Tu 1 . + CDS 7275 - 8102 824 ## CLB_2021 hypothetical protein + Term 8114 - 8165 9.0 - Term 8100 - 8152 3.2 10 6 Op 1 2/0.000 - CDS 8171 - 9508 775 ## COG0772 Bacterial cell division membrane protein 11 6 Op 2 . - CDS 9512 - 9841 197 ## PROTEIN SUPPORTED gi|18309686|ref|NP_561620.1| 30S ribosomal protein - Prom 9884 - 9943 5.2 12 7 Op 1 . - CDS 10018 - 10305 264 ## gi|253579838|ref|ZP_04857106.1| predicted protein 13 7 Op 2 . - CDS 10309 - 10989 347 ## COG3822 ABC-type sugar transport system, auxiliary component 14 7 Op 3 . - CDS 10970 - 12073 623 ## BL0059 hypothetical protein 15 7 Op 4 . 
- CDS 12098 - 14569 962 ## COG3250 Beta-galactosidase/beta-glucuronidase 16 7 Op 5 . - CDS 14526 - 16772 1429 ## Csac_2723 hypothetical protein 17 7 Op 6 14/0.000 - CDS 16817 - 18283 478 ## PROTEIN SUPPORTED gi|15900035|ref|NP_344639.1| ABC transporter, substrate-binding protein 18 7 Op 7 7/0.000 - CDS 18310 - 19227 682 ## COG0395 ABC-type sugar transport system, permease component 19 7 Op 8 2/0.000 - CDS 19245 - 20216 754 ## COG4209 ABC-type polysaccharide transport system, permease component - Prom 20242 - 20301 5.2 20 7 Op 9 2/0.000 - CDS 20312 - 21781 575 ## PROTEIN SUPPORTED gi|15900035|ref|NP_344639.1| ABC transporter, substrate-binding protein - Prom 21879 - 21938 5.8 - Term 21902 - 21942 6.3 21 8 Op 1 7/0.000 - CDS 21953 - 23470 1039 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 22 8 Op 2 1/0.000 - CDS 23448 - 25037 1048 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 25219 - 25278 8.1 - Term 25239 - 25290 6.2 23 9 Op 1 5/0.000 - CDS 25310 - 28429 2381 ## COG0383 Alpha-mannosidase - Prom 28463 - 28522 7.3 24 9 Op 2 . - CDS 28535 - 29830 938 ## COG3538 Uncharacterized conserved protein - Term 29838 - 29873 5.0 25 9 Op 3 . - CDS 29881 - 32700 2210 ## COG4724 Endo-beta-N-acetylglucosaminidase D - Prom 32729 - 32788 5.0 26 10 Op 1 . - CDS 32939 - 33778 708 ## EUBREC_2984 hypothetical protein 27 10 Op 2 . - CDS 33838 - 34068 355 ## gi|253579853|ref|ZP_04857121.1| predicted protein - Prom 34095 - 34154 5.8 + Prom 34162 - 34221 4.3 28 11 Tu 1 . + CDS 34251 - 34424 219 ## EUBREC_0206 hypothetical protein - Term 34458 - 34504 6.1 29 12 Op 1 . - CDS 34531 - 35352 636 ## EUBELI_20603 hypothetical protein 30 12 Op 2 . - CDS 35359 - 35982 341 ## EUBELI_20604 hypothetical protein - Prom 36043 - 36102 8.4 - Term 36056 - 36102 13.1 31 13 Op 1 . 
- CDS 36139 - 36693 352 ## GYMC10_4521 hypothetical protein 32 13 Op 2 1/0.000 - CDS 36733 - 38091 1704 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Prom 38148 - 38207 3.2 33 13 Op 3 . - CDS 38216 - 38866 498 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 34 13 Op 4 . - CDS 38866 - 39423 598 ## lp_2824 integral membrane protein 35 13 Op 5 . - CDS 39420 - 40409 849 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase 36 14 Tu 1 . + CDS 40719 - 40844 57 ## + Term 40863 - 40893 0.4 + Prom 40919 - 40978 5.7 37 15 Tu 1 . + CDS 41035 - 41500 306 ## AFE_1328 transposase, IS605 OrfB family protein, putative Predicted protein(s) >gi|226332912|gb|ACII01000107.1| GENE 1 238 - 864 728 208 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01180 NR:ns ## KEGG: EUBELI_01180 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 207 1 207 208 349 82.0 4e-95 MDANMKKRNELLAKTVIKGLESRNMSGYYAQDKEAALKQALELIPNGSTIAMGGCMSAHE IGLVKALQDGEYNYIDRAKLEPREGLMAAYDADFFLSSANAVTDDGVLVNIDGNSNRVSC IAQGPKKVIFIVGINKVCPDLDSAMKRARNVAATANAQRFDIKTPCKETGKCFNCKSPDT ICCQFLITRYSRHKDRIHVILVNDTLGM >gi|226332912|gb|ACII01000107.1| GENE 2 1014 - 2474 970 486 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579828|ref|ZP_04857096.1| ## NR: gi|253579828|ref|ZP_04857096.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 4 486 1 483 483 908 99.0 0 MNTLERQYGTLFSYEEGTFFEKLRKDEKEMKKVKIFLGAFLCMNFLTMSGCDKVVYSAVV EEKDKTDSKTEEHKEKETVQKTVAEQVQAPEKYQTIIQSDLRTAERTDQKNPMKFTLTAN APVEVPDVDAICLKNVKKVAISEEEPQKVLDTFVKGQLLKQNDDGLVETYEVNGLTYQYT GTFQNLEVLSLWEKYSAFDDVKEENLSQDEKEQRAERFQNYIKAGNQNMTEEDAENYVKD IVSGEWCLFDSASKELTEENSTLEKDTFFFERMVNGVPVNYIRNSYLPYDELSRPWIDED DNFHESQCKGWDNESLTMVFCSGTLQSFDHSDPIEVSDASDEAVFLLPFDEIKDIFEKTI TMQIMTEQENRLLAVDGTSFFARYPSIDAQTVEMTITKVQLGYMRVREGNSNTEGSLIPV WDFYGTWNSQEPVAYTDSGDEMVIDSVTMDRIGVPLLTIDARDGSVVQRVQSGSSTTFSG GIGECE >gi|226332912|gb|ACII01000107.1| GENE 3 2511 - 3794 898 427 aa, chain - ## HITS:1 COG:CAC1507 KEGG:ns NR:ns ## COG: CAC1507 COG0642 # Protein_GI_number: 15894785 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 192 418 246 472 473 165 42.0 2e-40 MKLWKRTVALMLITLLCSLIPVGVLSLYVTGKRSLNNAAETYGRQLANGKNLLEQFWDNS KYEQMSETGKKSYLEFQFLRCCGEKMLLINSESKEAVVNRTDYEIVSMDNLGLNNKTDSY EYKIQKLNHKYLLLQHALLTNPEGYEILSVRDVTSMFTELRQTAVWFLGIYLAVFCIAGL FIYLMMKRTVQQMEKLQEVAGKQEMLMGALAHEMKTPLTSIIGYSDTLRHVKLNDEQKDC ALEHISREGKRLEKLSGKMLQMLGLHQNDSIKMESHAIGELLKQVAQLEAEQAAQKQIQL QTEYEDFSLKMDEELMESLIVNLTDNALHATEPGGYIILKAYKESGKKILEVADNGRGIP QEEMGKITEAFYMVDKSRSRQEGGAGLGLALCVKIAEVHGARLDIVSQIGAGTKVRVIFD KKDKELT >gi|226332912|gb|ACII01000107.1| GENE 4 3800 - 4459 695 219 aa, chain - ## HITS:1 COG:CAC1506 KEGG:ns NR:ns ## COG: CAC1506 COG0745 # Protein_GI_number: 15894784 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 5 219 4 217 217 176 45.0 3e-44 MKNKILVIEDDRAISELLCMNLTIAGYETVAAYDGIEAQRLLLWNEDADLAVVDIMLPGR DGFALMEDFRAKNLPVIYLTAKDDVASKVKGLKLGAEDYMVKPFEMLELLVRIEKILERT GRAKRILTIKNIQADLQSHQILKDGIPVNLKPLEYDLFILLLQNKNIALGREELLKRVWG EEYLGESRTVDVHIGQLRKKLELYDEIKTIPKLGYRLED >gi|226332912|gb|ACII01000107.1| GENE 5 4594 - 4869 368 91 
aa, chain - ## HITS:1 COG:no KEGG:Athe_1650 NR:ns ## KEGG: Athe_1650 # Name: not_defined # Def: hypothetical protein # Organism: A.thermophilum # Pathway: not_defined # 4 81 7 84 114 92 58.0 3e-18 MLYDIVQVIPYEDYTVYVYFEDGKIVCYDAKPLLDKKVFAPLKNIDFFMNACTILNDTLA WDVSGDRDCSKCLDIDPDMLYSLENVEEKIA >gi|226332912|gb|ACII01000107.1| GENE 6 4887 - 5147 254 86 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3089 NR:ns ## KEGG: EUBREC_3089 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 86 1 86 86 142 75.0 3e-33 MPEISLFYGIRITMYYEDHNPPHFHAEYNGNKAVIDIDKARVIKGALPSRQLKLILAWCV IHQDELMQNWELSKDGLPLNRINPLV >gi|226332912|gb|ACII01000107.1| GENE 7 5237 - 6430 959 397 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579833|ref|ZP_04857101.1| ## NR: gi|253579833|ref|ZP_04857101.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 397 1 397 397 733 100.0 0 MEKGFKTLELIGDLDDKYIYEASRPWKKQHRFFHIQRSWKTVAAGVILLIMVGGTLRYQE EVKAALQRVTSWIGQALGIENGDTDFYTNVVGKSISRDGITITLNEIVMDSKNLWIAYSD SEDLNNKNHIGKEQKDLIAQNTDSDTQDTIEDKILFTEVRINGQEIQQGLGQRTVNNNDL SENSITVEDFTIGEEINTEGTVNIEFKVWPIAMADAETWENIQKIQADTEPYAFKLTTSK EELEKNTVDINLDQNIKFDSNILNLTEFRWNPFESTIYGTYHGTVYIDSDYYLIGTDDQG NKICYQETGRNGEETVFQQTIDPEFTGYKEISPEAKSITLQLYEVKNDTAHQVFEEKTDK DSSDDVYEEFAGYVYDDGENPTDAGVPIGDRFTLQIR >gi|226332912|gb|ACII01000107.1| GENE 8 6414 - 6980 332 188 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0946 NR:ns ## KEGG: Cphy_0946 # Name: not_defined # Def: ECF subfamily RNA polymerase sigma-24 factor # Organism: C.phytofermentans # Pathway: not_defined # 1 176 1 175 186 156 46.0 4e-37 MEDAVIINLYFDRSEKAIQATEEKYSRYCFSIAWNILYDKEDSDECVNDTWLAAWSSIPP RKPAILSAFLGKITRNLAIDCFRRKKAAKRSTEHILELCKELEEIEDVTAYSLNDEIQRK EILEILEKFLESLKKGDRDIFIRRYWYMDSISEISARHGISESRIKSSLYRSRKKLWERV KHLYGKRI >gi|226332912|gb|ACII01000107.1| GENE 9 7275 - 8102 824 275 aa, chain + ## HITS:1 COG:no KEGG:CLB_2021 NR:ns ## KEGG: CLB_2021 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined 
# 1 244 1 244 271 169 39.0 1e-40 MANAKHEDAVMKMGFDYFRNTILKTLGIDYQYEEIGPTELVELTIQSLYMDFTFLTTGGF YIHTEFQTTDKKEADLRRFHAYDAVYSNKTGKKVITYVIYSGGITNVKSELDCGLYTYRV QPIYLKDKNADEVFRKLKQKQDNGEAFTEDDYAALSLTPLMSGKMSRKDMFKEAIRLAKP NIELSAEKTTAMLYTLADKFLDRAELDEIKEVIRMTRLGQMLMDEGMEKGIELNQTDSIK KLMKNMNLTIDQAMNALEVPEDKREKYRKTITPDN >gi|226332912|gb|ACII01000107.1| GENE 10 8171 - 9508 775 445 aa, chain - ## HITS:1 COG:lin0441 KEGG:ns NR:ns ## COG: lin0441 COG0772 # Protein_GI_number: 16799518 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Listeria innocua # 2 436 7 416 416 118 25.0 2e-26 MDEYLKTLLEQIRCKKARPYVKQEFQDHIEDQIEANMHAGMDREQAEREAVRDMGDPVET GISLDSVHRPQIAWKLLGIIILISIAGVLIHAGIAGKISENAAAGSDRYVFHVVIGLAVM MILYLLDYTVLARFSKIIAVVLLSACLVTLLGGYQLNGARYFIVLPGGRGISMQTLMMFY VPIYGAILYKYHGWGYKGLMRAIIWMIAPVILVYAMPAFMTACMMLASMLVMLTIAIQKN WFTVRKKRAMCGIWAGFLAMPVAAFLIRYLSSSLTEYQIARLQAVFSGGGEEDYFTEMLH SFWQQNKWIGKSGSDVMGNLPAFNADYILTYLSSVYGTIAAILLCCVLAVLIFAVFNTAM RQKNQLGMMMGCGCGIVFLINFFINILENLGIFPQSVTFLPFLSAGGSCIIVSYGLMGIV LSTYRYKNIYPRHLNNCVLRSRYFQ >gi|226332912|gb|ACII01000107.1| GENE 11 9512 - 9841 197 109 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|18309686|ref|NP_561620.1| 30S ribosomal protein [Clostridium perfringens str. 13] # 1 95 1 97 110 80 39 1e-14 MAIEKSLVSGSMTMLILKLLSEKDMYGYEMIDTLRQKSQNVFELKAGTLYPLLHSLEDKG FLTVYEQEAAGKTRKYYSLTRQGRGFLEKKVEEWKEYSAAVTSVLVMEV >gi|226332912|gb|ACII01000107.1| GENE 12 10018 - 10305 264 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579838|ref|ZP_04857106.1| ## NR: gi|253579838|ref|ZP_04857106.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 95 1 95 95 174 100.0 2e-42 MVVTEKEFVEENDLGYARFLAEKVCTLANVFEMGSYNECVPMLNIVTAEKNVKGTFQLAK QLLEGFRNEEDFDYMREYEPWKKLLAEKDSNFGVF >gi|226332912|gb|ACII01000107.1| GENE 13 10309 - 10989 347 226 aa, chain - ## HITS:1 COG:ECs5070 KEGG:ns NR:ns ## COG: ECs5070 COG3822 # Protein_GI_number: 15834324 # Func_class: R General function prediction only # Function: ABC-type sugar transport system, auxiliary component # Organism: Escherichia coli O157:H7 # 1 222 1 223 227 221 45.0 1e-57 MKRSKINSEIRHMEELIQEHGFEIPPFCKWTPEEWNNKGSEYDEIRDNMLGWDITDYGLG KFDEIGFSLITIRNGNLKMDKYTKTYAEKLLVVKEGQMAPMHFHWNKMEDIINRGGGNVL ITVYNSTEDGKFADTDVTVNCDGHEYTVPAGTQIKLTPGESITIYPYMYHDFHVEEGSGD VLLGEVSMCNDDENDNRFYEPIGRFPEIEEDEKPYRLLCTEYPKAK >gi|226332912|gb|ACII01000107.1| GENE 14 10970 - 12073 623 367 aa, chain - ## HITS:1 COG:no KEGG:BL0059 NR:ns ## KEGG: BL0059 # Name: not_defined # Def: hypothetical protein # Organism: B.longum # Pathway: not_defined # 3 337 16 354 369 395 50.0 1e-108 MVRKELMSYVGSAQQLMSVRPVVYKEGRAEGLQAYEVKNDRLSFSVMIDKCLDIAEVNWK GYNISFLSKPGLTGRNHYDTHGAEAQRSIMGGMLFTCGLENICAPCLVDGKEYPMHGRMR TTPAEHVGADTYWTDGDEPQYIINISGEMREAELFGENMILRRSIETTFGVPEIVIKDCI TNESFREETMMLLYHFNVGYPFLNEDCEIILPTKEVIPRDDVAEKQIGLWSKMEKPADNE PECVFIHELSADKEGNTFAAVINERLGIGIKIQFNQKYLPYFMQWKSVASGDYVIGLEPA NSSVLGKTYHLEKGDLHMLNPFETENIELRLSFLEGKEIKKVKEEALQLVTQYKQDKRRH KNEKIKN >gi|226332912|gb|ACII01000107.1| GENE 15 12098 - 14569 962 823 aa, chain - ## HITS:1 COG:AGc2809 KEGG:ns NR:ns ## COG: AGc2809 COG3250 # Protein_GI_number: 15888848 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 36 644 37 646 832 437 40.0 1e-122 MTQQEMIMEKIKLNENWRMRCLNNGRISSEWQNAVVPGSVYTDLLRNHQIQDPYWKDNED SVCALMEEEYEYECTFMNYGTDGYTDIYLEFEGLDTVADIYLNDGYVGHAENMHRIWKYD VKSILRKGENQLRIIFRSPLKYIAEAYKKYGNIGNDDTFEGFMHLRKAHYMFGWDWGAHL PDAGIFRPVWLCKETGGRIESVYICQNHEEGRCTLEFKGEYVLTQPGNYRTLVTLKSPCG 
EIMETELDADGDGRIVIDHPLLWWVNGLGEQPLYEVEAVLMLEEKTVDTWKRRIGLRSMT VRREKDEWGESFAHEINGVAFFAMGGDYIPEEHLLGRRSEEKTRQLLLDCKRANFNVIRV WGGGFYPDEWFYDICDELGLAVWQDFMFACSVYELTEAFETNIRQEFIDNIKRLRHHPSL ALWCGNNEMEMFVDERCWVTKHTEVRDYLLMYERIIPEILKDHDHQTFYWPASPSSGGSF DEPNDPARGDVHYWKVWHGNRPFPEYRKFFFRYLSEFGFQSFPCKKTIDTFTDDPADWNI FSYVMEKHQRNYGANGKIMNYMQQMYRYPGDFETVLYASQLLQADAIRYGVEHFRRNRGR CMGAVYWQINDCWPVISWSSIDYCGRWKALHYYAKRFFAPVMISCEEESWMTAEANMNRQ HFVFPKSIRLHVANETMETRKILVKWQIRNAKAQILREEENIVVVSPMSGQWLEKVKLPE INVFCEYVSYEAWEAGEKISEGTTIFSYPKYFRYEDPKLSFSINNDEITVTANAYAKSVE IQNENQDLVLSDNYFDLNSNSRTVKILRGNSTGIRVRSVYQIR >gi|226332912|gb|ACII01000107.1| GENE 16 14526 - 16772 1429 748 aa, chain - ## HITS:1 COG:no KEGG:Csac_2723 NR:ns ## KEGG: Csac_2723 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticus # Pathway: not_defined # 6 699 5 668 700 595 43.0 1e-168 MQLISKTDNRPYMNVIFGNFYSPAFDNEEFVDKTMGLIRDLGFNSVMFDTKAWEDFKERF ETGALSQYVKMQEYMGGSAHRHGLAYNFLLLYLNGDNLYPHIRFSPPIFGEETVYLDGRP GRWYKYWSKKAQISMKEHVDRIMKQYGQGCERCLTGEIDQGKDRQDMNNGAKEVIPVCSM WDPVVAPSFDKEGQKRYLDYLRELYKGDIKSLNHNYETELEKFDDLTPKEYWYSVRFGSD SFYTEEDVKKRSPKFWVWRDNALWKIHELTLYFEKIGPMLKENNPELLLCPDMSQWGYFL NIYGRTQQDIDNEFSDLWDTAMRGIDIYALAPYVDSCHFITVPVTPDGYPDSYVVSCQHS MMRVMNQGKPFIGGIYWGRYIYNDLYALLSPSEIIGSMTACGIDGYTCYGMNGLDDGGVM NRMDTHFLDSLRMANEWFSQVICLRKGEKKKEIAILFPSEMAHLEPYEVGNNKIRRLDLL GWYKLCCDLGYQVDVISNHEIEKGTLAEYKVLIVPSNDCYFAVDHAAMEKEIRSWVCKGG VLLHGPRDLLAENCFGIQGEECEKKPYHYGKTIIAQGEAFCRYQDGKEIASYVDHGGCCV AKYEKLELAQKMNTKYKSGMGAVYSFGIQIGASYAAKNIPHVPYEQGNKEMYPIIQSGTT LVKDILYHYMTPVSGICERGIETGVFENGMVVVNHRSIPYVLPEKYKIEKYQYSFHAVQN GNGILAGHEAVWVSNDTTGDDYGENKIK >gi|226332912|gb|ACII01000107.1| GENE 17 16817 - 18283 478 488 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900035|ref|NP_344639.1| ABC transporter, substrate-binding protein [Streptococcus pneumoniae TIGR4] # 1 487 1 491 491 188 28 4e-47 MKTRKIIALGLAAVMTAGMLTGCGGSESKENASNDKGEEVVTLKWVTIGNGMPSNYDSWI GKVNDYIGEKIGVNLEMEVIPWGDWDNRRNITISTNEPYDIIFGTGNNYIADIKLGAYAD 
ITDLIDENMKEFNELMPEKYWEAVQVGGKIYGVPSYKDSSISNYAIWDKELVEEYDIDYK NLTDLENLTPVFEMLKKEKNDYPVYIKNDGVYSIFDVYDQLGGGTQILGVRYDDQEGRVC FTLEEPDIYAALETFYDWNQKGIINPDAATLTEGRVYNMWRIAQGWESAGVTSWGPQMGK DVVVAKVGDTILSNETVRGSLNMISANSKYPEKCLQLLNLVNTDTTLRDMFYYGEEGVNF EYVDVDGTQKVHKNNEDWSMAGYTQGTFFTVTPIETDTNDQWGEIRALNEQAVSSVMLGF TFDTTEVNDQLANCREIWLRYRSEVMTGVKDPAEVVPQIKEELMQAGWQEVMDAAQKQVD EFLAAKQK >gi|226332912|gb|ACII01000107.1| GENE 18 18310 - 19227 682 305 aa, chain - ## HITS:1 COG:lin2116 KEGG:ns NR:ns ## COG: lin2116 COG0395 # Protein_GI_number: 16801182 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 19 305 38 323 323 272 46.0 7e-73 MSKKVKEQSSLNQIKTSTNVLFNIIFLLLGLMCIFPLLFVFSISITSEEALRSGTYQIIP QQLSNAAYMFLWNERGTILRAFCMSVLVTVVGTVITIILTTSMGYVASRRNFKLKKLYTW IIFIPMVFNGGMLASYVVVNNILNLRNTIWALILPLACSSFSVTICRTFFRITVPDALIE AAKIDGAGQFRIWSNIVLPISKPVMATIGMFAAFGYWNDWFQASLYIQDKNLQTLQSLLN NIQKNIEYIANNPYGGLSLQQYKMGMPTESVRMAIAIIIIVPIACTYPFFQKYFISGLTI GGVKE >gi|226332912|gb|ACII01000107.1| GENE 19 19245 - 20216 754 323 aa, chain - ## HITS:1 COG:lin2117 KEGG:ns NR:ns ## COG: lin2117 COG4209 # Protein_GI_number: 16801183 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Listeria innocua # 27 319 16 305 309 298 49.0 1e-80 METKKSPGLMAKVRRKRWTKDDTELTLLSLPTLLWYLIFSYLPMFGVVIAFKEYKLAPGG HAFLYNLFASDWAGFKNFKYFFTSNSFFMLLRNTILYNIAFIIISATVAVGGALMLSNMR NKRGSKVYQTMMFLPYFMSWVVVSYFVYALLTPEKGYLNGIITALGGNPVMWYQEKKYWP FILVILNTWKGMGYGMVMYLASITGIDPSLYEAAVMDGATKKQQMRHITLPGIKPVFIMM LILDCGKIFNSDFGLFYQVTGGIPQSLYTTVSTFDTYVYNAIQSTAPIGQTAAASFFQAI CSCGMILLANWVVSKLDNENRII >gi|226332912|gb|ACII01000107.1| GENE 20 20312 - 21781 575 489 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900035|ref|NP_344639.1| ABC transporter, substrate-binding protein [Streptococcus pneumoniae TIGR4] # 4 483 14 484 491 226 30 2e-58 
MTVIMVLSALIVGAWSVINDKRSEESEKVEDVIELTWYQLGEKQKDTDIVLEKVNDYLAG KIGVKLDIVNVEGADYTKKMHVVINTGSKWDLCFSSSWANDYLQNADKGAFLALDDLVQN TEMYEKIDSRFWEAARVQGKIYGVPSEKELGNMQMWVFTKEYVDKYEIPYEELHTLEDLE PWLKVIRENEPDVIPLYITKDYTAPTYMDKIEDPIGIEYGDETLTVKNVFETERMKNTLK TMRKYYKAGYINKDAATASDDKTQKRFVTKGDGQPYAELIWGKELGYEVVVSPIMKTSIT NISARGALTAVNAQSEHPDKAVELLNLINTDVYLRNLLNYGIEGIHWEKVEVSDEEAKKV EGKPYIYDTKVKLIKDKFRNYSVHYWVQGGLFNTYVLENEPIDKWATFKEFNDASEEASS FGFDFNLEPVSSQVAGFRYILDEFGKALYTGSVDPDEYLPKLLEKLDDAGVQKVIDEMQR QVDVWKEGR >gi|226332912|gb|ACII01000107.1| GENE 21 21953 - 23470 1039 505 aa, chain - ## HITS:1 COG:BH0793 KEGG:ns NR:ns ## COG: BH0793 COG4753 # Protein_GI_number: 15613356 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 505 1 507 508 192 28.0 9e-49 MRKVMIVEDEELILQGIKNIVHWNELELQLFHMAHSGQEALEMWKKEPVDIIITDIEMPE MSGLELLKKIRSEEELVRFIILTGYDDFEYAREAIHLDVENYILKPINEEELEKQLKEAA EKLEELEKKKIRYIDEKTEWLQFLSGKISETGYEYYCERFQLYTGSGHFGAAVMKWSTDS VKENKITDMILSLKNTEKQLRVVHLPPDCLLLLLDGFGKSQEVKEYFQMIQNEIESQFDI VTYISIGPVFDNYRKLPECYRVAMKLQKYLLVDGFGSCVDEKHIQDRKSEDIVIDRTLLR KLILKKENEGALGYIEDLFINNVKTGVSLESLYQMTVQVAMLLQEIKKEYKLDNYETLHN LPDLLETIYRADDIFGIKALFISEINEIIRYLHEEDSQYTPVVRQIMDEVQKNYKENMNL KTLAYKYHMNASYLGQIFQKEVGCSFTQYLSSKKIEIAKELILNTNMKISDIAKQVGYPD TSYFYRKFKLHYGVAPVSLREMKKY >gi|226332912|gb|ACII01000107.1| GENE 22 23448 - 25037 1048 529 aa, chain - ## HITS:1 COG:BH0792 KEGG:ns NR:ns ## COG: BH0792 COG2972 # Protein_GI_number: 15613355 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 22 502 77 558 587 207 29.0 3e-53 MHEEAVEYVNDCRDISTYLHEELYKSNMEMNDLLHYLTDAPEKYQEYRLDTYSQYKLLDY NGIEDFSATAFEAYPSLKRLAFVSYSKGDLTSFNGSKSIYRREDGDVALERIKSGNLAES GEFSFLKEIRDPLNMQNVGAMIVTFHAKRFGNILTAYNLPELIIYNESGTLIYASCSDCK VEEIEKTEGAAELEEKLNAYIQSTNLENYHIISYLKKAKAAEIPFSIMIMIAGVAVAVFL 
ISEFLVNYYLKRVSARLNRILDGMTQVMGGDLTVRLAAEKNGDELDVIACHFNEMCEKLD LHIQKSYLAEIDQKNAEMSALQSQINPHFLYNTLEAIRMKAICNGDSEVGKMLYSMAVTF RSQLKEADIITLAQELHYCKKYLELFEYRYPSQFKSSVDCPLEYMQIPIIKFVLQPIIEN YFIHGIRMRDKDNFICIRVEKTADDYEIIIEDNGRGMSEKDIREKNRQLTENVMEKNKSI GISNVNRRVKAVYGNTYGVSLEQVETGGLRVILRFKPEEDNENEKSNDC >gi|226332912|gb|ACII01000107.1| GENE 23 25310 - 28429 2381 1039 aa, chain - ## HITS:1 COG:lin2123 KEGG:ns NR:ns ## COG: lin2123 COG0383 # Protein_GI_number: 16801189 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Listeria innocua # 1 1039 1 1031 1032 1029 49.0 0 MFLTDRKLERRIAELKEHRYRDIINLNEFTVQEDTQGVVNPVLPEVFTGWDTLRVGDTWK GRDLFLWMHREIQVPAEWKGKKIVGVFDFGNTGAGNNAGFESMLYLNGKMYQGVDANHKE VFFDDTFCGTNMDVTFRLWSGLEGGGVPTPQEHRIARADLAWLDEEVDDFYYLASMVWET IKELDEFNPVQHDLRKALDTACHCIDWAYPGSDTFYESVHEADHLLNESIDSMDKKSSVN IRCVGHTHIDVAWLWRLKHTREKCSRSFTTVLRLMEQYPEYVFLQTQPQLYEYMKEEFPD IYKKIKERVAEGRWEADGAMWVEADCNLTSGESLTRQILLGSKFIKDEFGKDVEYLWLPD VFGYSWALPQILKKAGINTFMTTKISWNQYNRMPHDTFKWKGIDGSEVLTHFITTPEPWN EPGSWFYTYNGKLIPKTVKGTWDAYSEKQMNKELLIAYGFGDGGGGVNRDMLENRRRIDK IPGLPNLKTSTAAEYFRDLQETVKNTDQYVHTWDGELYLEYHRGTYTSQAYNKRMNRKME LLYRNTEWLTVMNALKMGGLTYAKQEELTKGWKHILTDQFHDIIPGSSIHEVYEDSRKDY EYIQSVAQSVQEDALSHIIEDKKDCYTVFNASGWKRSEIVTVPNERAGIYRDEQGNELDA QQAGTVTYVEVKDIPAMGTGTIIFEEKKAEETEAEFTICGKNIETPYYSMSMNGYGQITR LYDKTFARDVLPKGERANVFQMFEDKPLDNDAWDIDIFYQEKMREITDMTKFEVIECGPL QMKIHMEWNYMNSFIAQDMVLYRNTRRIDFRTTVDYHEQHQLLKVAFPVDIRSTFGTYDV QYGNVRRPNHWNTSWDQARFESVAHRWADLSERNYGVSLLNDCKYGHDIKDNVIRMSLLK AATHPDHLQDQGMHEFTYALLPHGGDFVEGRVVQEAYALNDPMLVVHGISDLGYESFVRF DNDQIELDAVKKSEDGDYIVIRFHEFAGSAQNVTVYPGFHFKSWVECDLRERPVGSVSQE KEIHLSMHAYEIKTVLVQL >gi|226332912|gb|ACII01000107.1| GENE 24 28535 - 29830 938 431 aa, chain - ## HITS:1 COG:lin2121 KEGG:ns NR:ns ## COG: lin2121 COG3538 # Protein_GI_number: 16801187 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 5 430 3 428 434 461 51.0 1e-129 
MKQIEKIPESIRNILDKVKTFFPENPQIYEIFKNCYTNTLNTAVHSMENDTVYVVTGDIP AMWLRDSAASLRPYLIPAKEDKNIADILVGVVHRQFLYICIDPYANAFNQKPNGNCWEKD ETEMNDWLWERKYEVDSLCYPIQLAYLIWKNTGRTDQFDHTFVNGAWKILNVFRTEQYHE EKSAYRFIRRNTYYTDTLSRDGKGTHVKSGIGMTWSGFRPSDDACTYGYLVPSNMFAVVV LGYLEEIARLIVESSDLAKAAAELRKEIYDGIENFAITCKEGYGEVYAYEVDGYGQFNLM DDANVPSLLSMDYLGYEGKNPQVAENTRNLILSEANPFYYRGAKAEGIGSPHTPVEYIWH IALCMQGLTASTREEKKKMIDLAASTDGGKMLMHEGFCVKDDTQYTREWFAWANAMFSEL VMDYCGYKIER >gi|226332912|gb|ACII01000107.1| GENE 25 29881 - 32700 2210 939 aa, chain - ## HITS:1 COG:L122924 KEGG:ns NR:ns ## COG: L122924 COG4724 # Protein_GI_number: 15673474 # Func_class: G Carbohydrate transport and metabolism # Function: Endo-beta-N-acetylglucosaminidase D # Organism: Lactococcus lactis # 3 558 5 545 546 494 44.0 1e-139 MRRKEFLRRASAISVAALLTMNLCACQKEEAVISEEEKASNYQVTEENENKELVMNRQPE SSYWFPEELLEWNPEEDQDLIYNISKIPLAKRVDHEYLTPVNETQNKDTRVMAISIMNSS TSGNAPHGLNSANCNVFTYWQYVDELVYWGGSSGEGLIVPPSPDVTDLGHKNGVPVIGTV FFPQGVAGGKMEWLDTFLSQEKDGSFPLADKLIEVAEIYGFDGWFINQETEGTEEQPLTK DYADQMQAFIKYFKEKAPELRVIYYDSMTADGEMDWQNALTDQNEMFLQDEDGNAVADEM FLNFWWTEEELADQKLLEQSAEKAETMGIDPYQVYAGIDIQANGYNTPIRWDLFESGENS THTSLGIYCPSWAYASASTLDEFHQKENTIWVNSKADPSEKMEYSKAEQWRGVSAFAIEK SVLTSVPFVTNFNTGSGYSFFKEGEQISKLDWNNRSIGDVLPTYRWMIDDGEGNALTAAF DVGNAWYGGNSLKLYGNMEEGRSSIIHLYSADLPVEETTYFSTTVMANTETELNAILLFD DGSQETIKGDHKVGDQWTAVNFDISKYSGKNIRAISYELKPTEQSTFYQFNFGNITIADS AEEKIAEITNLTVDDAEFDEDGMYAGVRLSWESDSETSNYEIYRINQDNSKSLLGVSNTE CFYINTLPRTDETNKSEFKVVPVNRFLKEGEGVSVSMEWPDNSLPKAGLKASQTLIGPGS TVKFSAACSQNTEEITWSLPGSSSESAEGDSVSVTYDKEGVYDVTVMAKNSSGEDTKTVE GLVFVTSELEKDGELLLLSQDKVTEATAYVNENEAPSFAVDGDVTKKWCATGMPPHELII DLGEEVAVSQVAIAHAEAGGESPDMNTKAYTISVSTDGTEYTSIVSVTKNTLADTLDTLA PVNARYVKLSVVKPTQGSDTAARIYEVQIYGLEKTLAAD >gi|226332912|gb|ACII01000107.1| GENE 26 32939 - 33778 708 279 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2984 NR:ns ## KEGG: EUBREC_2984 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 279 20 303 303 456 75.0 1e-127 
MFLPLFREHREYFYDWCEIGSIYGAPADCIWGGGRAGFGENDPEEVLELMQEYGISDRLT FSNSLLRQEHLSDRKCNALCRLLERYRKPQNGVIIYSELLLEYLKEQYPGLYFVSSTTKV LTDFTQFEEEIRRKDFRYVVPDFRLNKAFDKLNTLSQAEKDKVEFLCNECCWFGCRDRKA CYETVSRKNLGENCPEHHCKAPEGERGYLFSKAMENPGFIGINDIRDIYLPMGFSNFKIE GRGLGSALILEFLLYYMTRPEYQIHVREKIYLDSMLDLF >gi|226332912|gb|ACII01000107.1| GENE 27 33838 - 34068 355 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579853|ref|ZP_04857121.1| ## NR: gi|253579853|ref|ZP_04857121.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 76 3 78 78 79 100.0 7e-14 MGGIAMSKVVKLSSVMEREEKLKEIYEGLEEIKDQLTALIDEYEEEESHTGKAEELIEAL DALEDASETIEDVLED >gi|226332912|gb|ACII01000107.1| GENE 28 34251 - 34424 219 57 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0206 NR:ns ## KEGG: EUBREC_0206 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 6 56 6 56 56 62 66.0 4e-09 MLVCDYIVESIDGDYAHLRRTDLPEEELKLVARALLPFDITEGCRLHYEMMQYSIID >gi|226332912|gb|ACII01000107.1| GENE 29 34531 - 35352 636 273 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20603 NR:ns ## KEGG: EUBELI_20603 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 17 272 17 271 271 188 39.0 2e-46 MEIRNFDAYSPLHKSGHTSKGDQLKWKIDDRWYKADYMGYEGLSQVLVSDLLQKSTCPFP FVEYDAAVIEYNGKIFQGCSSLDFLKANQVLIPLEKLYRRYTGESLAVKLSDFADVSERI QYLVTQVEEITKIDNFGAYLTAMLEIDAFFMNEDRHTNNIAVIYNEKTQKYELSPFFDQG LCMFADINQDYPIDQSLEYCLEKIEAKPFSSDFDIQLDAAEELYGVQIGFQFTFKDIQIY LQKNSENYPREVIRRVEQLLRGQMRKYGYLMKK >gi|226332912|gb|ACII01000107.1| GENE 30 35359 - 35982 341 207 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20604 NR:ns ## KEGG: EUBELI_20604 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 99 204 73 178 178 123 54.0 5e-27 MNFKDQIQQIFGTTDIHELKQISRDADNYRCLNADMNNSIISEKKKNTGRKNSFTEEQLA HILALQDRGEKITDIARQYHVSRQTIYSQIKRAYNFSDDPDVKMRMNFMNHDDLCTTIDI DFRHEKIKIKNYTDQIIFRAFGVVTDPDWADFEYFLEERCFPRTRDHRKDILREMGLPFY DPLLIIEKTQGRMSDDHQWIMILKKEG >gi|226332912|gb|ACII01000107.1| GENE 31 36139 - 
36693 352 184 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_4521 NR:ns ## KEGG: GYMC10_4521 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 2 181 29 203 204 87 29.0 2e-16 MKAAVFFPGIGYHCDKPLLYYSRKLAQECGYEETIALSYTYDGGNIRGNEEKMQQAFESL YEQAEKSLSAIDFDKYDEILFVAKSVGTIIASAYAEKHSIRCRQILYTPLKYTYNFAHRE AIAFIGTSDPWSIVSEIQALSKKQQVPMYIYENANHSLETTDTLENLKILQDVMGKTKGF LERK >gi|226332912|gb|ACII01000107.1| GENE 32 36733 - 38091 1704 452 aa, chain - ## HITS:1 COG:FN0278 KEGG:ns NR:ns ## COG: FN0278 COG0624 # Protein_GI_number: 19703623 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Fusobacterium nucleatum # 1 449 1 450 452 297 36.0 4e-80 MDIKEQIHGLTEEMLTNLGRLVAIDSQLGTPAEGKPFGEGPAKALEEGLKIAQELGFKTV NLDNYCGYAEMGEGDEIVGIAGHLDIVPVGGDWSYDPFKLTREGDYVYGRGTTDDKGPVL EALYAMKLLRDSGVKLNKRVRLIMGCNEETGSKCMEHYNEVAEELSCGFTPDASFPCIHG EKGLLQMMAYSKNTKIISADGGFVFNAVCDSSTIVVPAEEGLKERLEAVLAETKLQEYKV TEENGQISIYAKGVPAHASTPTLGVNAIGVTFECLEKAGFKDDFVEFYNTHIGTSCDGEG IGLKFADEYGELTLCNGMIKTENDVISCTIDIRVPVTLKSDEVRRMCESRLDDENGRIEI LGIGDALFFPRESPLVNALYKAYTDVTGDTENKPMVIGGGTYAKSLKNIIAFGPEKPGID YRIHSADEFILVSGMEEAVLVYMEAIKNLLAI >gi|226332912|gb|ACII01000107.1| GENE 33 38216 - 38866 498 216 aa, chain - ## HITS:1 COG:SPy0440 KEGG:ns NR:ns ## COG: SPy0440 COG1028 # Protein_GI_number: 15674564 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Streptococcus pyogenes M1 GAS # 3 205 5 227 232 83 30.0 2e-16 MKVLITGTASGIGKGCAEFFLKENHEVYGFDKNISSIQHPKYTHFCLDIRDKSSYPEIGQ VDILINNAGVQNGDDIDVNLKGTISVTEHYGIHPNIYSIIMIGSASGHTGSEFPEYAASK GGVLAYTKNVAMRIAPYQATCNSLDFGGVMTELNRPVMEDEKLWDQIMELTPLKRWMSVE EAAEWIYFMAVKNRFCTGQNILIDGLEAGNCNFIWP 
>gi|226332912|gb|ACII01000107.1| GENE 34 38866 - 39423 598 185 aa, chain - ## HITS:1 COG:no KEGG:lp_2824 NR:ns ## KEGG: lp_2824 # Name: not_defined # Def: integral membrane protein # Organism: L.plantarum # Pathway: not_defined # 4 182 2 181 183 144 47.0 2e-33 MKTEKISVQKICMIAFAICINFVGGQIALFLKLPIYLDSIGTVFVAAVLGPFYGMLPNLL SGLLMGMTVDIYSLYYAPVGILLGFVTGLVYRKFQPKKWQIFPAALVITLPSTIISSCIT AFLFGGITSSGSTVLVQLLAKTPLGMVGSCFVVQFITDYIDRVLCIAVAAVLITALRKSM KENFA >gi|226332912|gb|ACII01000107.1| GENE 35 39420 - 40409 849 329 aa, chain - ## HITS:1 COG:mll3190 KEGG:ns NR:ns ## COG: mll3190 COG1957 # Protein_GI_number: 13472783 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Mesorhizobium loti # 2 309 3 310 313 197 36.0 2e-50 MKKRKVIIDCDPGIDDSLAIMLALKSPEIEVIGITIVCGNSPVEMGFGNAKKILKQMNRL DVPVYIGESTPLRRDYVNALDTHGEDGLGESFLPEVIGYQQQISAVDFLADVLKKEKVSI IELAPMTNLARLIQKDKEAFSCIEEIVSMGGSFKSHGNCSPVAEYNYWCDPDGASLVYET LHQNGQKIHMVGLDVTRKIVLTPDLLEYMCRLNKETGEFVKKITKFYFDFHWEWEHIIGC VINDPLAVAYFINPELCKGFDSYVQIETEGISLGQSVVDSMNFYRKASNARVLTEVDVYG FFQMFLSRILDQEPEKLDILQDLIRGDLK >gi|226332912|gb|ACII01000107.1| GENE 36 40719 - 40844 57 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNKKWIALAAAVVLAVTSLPVTAFAEKKDSTEIRCLQKSL >gi|226332912|gb|ACII01000107.1| GENE 37 41035 - 41500 306 155 aa, chain + ## HITS:1 COG:no KEGG:AFE_1328 NR:ns ## KEGG: AFE_1328 # Name: not_defined # Def: transposase, IS605 OrfB family protein, putative # Organism: A.ferrooxidans_ATCC23270 # Pathway: not_defined # 18 155 2 140 439 75 36.0 8e-13 MQIISSYGVELRKQNIPIRQTLEIYRSAVSYLIGIYVQVWEGLAEIPDAKRRFNAAEHLV HTTKKNHACFDFDIRFPKMPSYLRRSAIRHALGTVSSYKTRLDLWEKTDRKSGKPKLVYE NHAMPVFYRDVMYREGAEGKDEAYLKLYDGHDWKW Prediction of potential genes in microbial genomes Time: Sat May 28 20:16:43 2011 Seq name: gi|226332911|gb|ACII01000108.1| Ruminococcus sp. 
5_1_39B_FAA cont1.108, whole genome shotgun sequence Length of sequence - 22524 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 12, operones - 6 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 44 - 1057 363 ## COG1192 ATPases involved in chromosome partitioning + Term 1064 - 1102 6.4 - Term 1051 - 1090 6.2 2 2 Tu 1 . - CDS 1091 - 2029 404 ## COG4823 Abortive infection bacteriophage resistance protein - Prom 2129 - 2188 6.4 - Term 2129 - 2168 1.1 3 3 Op 1 . - CDS 2288 - 2749 486 ## CLH_1981 hypothetical protein 4 3 Op 2 . - CDS 2730 - 2951 212 ## COG1476 Predicted transcriptional regulators - Prom 3004 - 3063 7.1 + Prom 2961 - 3020 9.5 5 4 Tu 1 . + CDS 3113 - 4129 1149 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components + Term 4307 - 4367 9.1 + Prom 4424 - 4483 5.8 6 5 Op 1 4/0.000 + CDS 4548 - 5660 1440 ## COG3839 ABC-type sugar transport systems, ATPase components + Term 5675 - 5719 10.4 + Prom 5664 - 5723 5.3 7 5 Op 2 . + CDS 5805 - 6893 871 ## COG2508 Regulator of polyketide synthase expression + Term 7019 - 7072 4.3 - Term 7006 - 7060 12.1 8 6 Op 1 6/0.000 - CDS 7083 - 8141 1011 ## COG1118 ABC-type sulfate/molybdate transport systems, ATPase component 9 6 Op 2 23/0.000 - CDS 8141 - 8836 752 ## COG4149 ABC-type molybdate transport system, permease component 10 6 Op 3 . - CDS 8839 - 9738 1240 ## COG0725 ABC-type molybdate transport system, periplasmic component - Prom 9785 - 9844 2.0 11 7 Op 1 . + CDS 10118 - 10330 428 ## COG3585 Molybdopterin-binding protein 12 7 Op 2 . + CDS 10323 - 11261 699 ## COG1910 Periplasmic molybdate-binding protein/domain 13 8 Tu 1 . - CDS 11351 - 12397 1210 ## COG1968 Uncharacterized bacitracin resistance protein - Prom 12435 - 12494 2.4 - Term 12452 - 12516 8.9 14 9 Op 1 24/0.000 - CDS 12540 - 15755 4213 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) 15 9 Op 2 . 
- CDS 15755 - 16831 1094 ## COG0505 Carbamoylphosphate synthase small subunit - Prom 16886 - 16945 9.3 + TRNA 17243 - 17326 61.1 # Leu CAG 0 0 - Term 17334 - 17365 3.1 16 10 Tu 1 . - CDS 17379 - 18506 957 ## Hore_04160 beta-1,3-glucanase (EC:3.2.1.39) - Prom 18598 - 18657 3.1 - Term 18579 - 18629 4.1 17 11 Tu 1 . - CDS 18662 - 20167 1507 ## CHU_1854 hypothetical protein - Prom 20245 - 20304 7.9 18 12 Op 1 . - CDS 20306 - 20893 660 ## gi|253579879|ref|ZP_04857147.1| conserved hypothetical protein 19 12 Op 2 . - CDS 20949 - 22367 253 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 - Prom 22455 - 22514 9.5 Predicted protein(s) >gi|226332911|gb|ACII01000108.1| GENE 1 44 - 1057 363 337 aa, chain + ## HITS:1 COG:MJECL24 KEGG:ns NR:ns ## COG: MJECL24 COG1192 # Protein_GI_number: 10954513 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Methanococcus jannaschii # 4 190 3 166 259 77 31.0 3e-14 MAKIIALFNNKGGVSKTTTTFHLGWKLAELGYKTLMIDTDPQCNLTGLCLNADKENKLTQ FYEANNYNIKSALSPVLNNEMRPLEATTCYEFEHNENLFLLPGHIEFSEYDATYNIAENM TGSLVVLQNVPGALRQLITMTSEKYHLDFVLLDMSPSISATNANILMGSDFFIIPCAPDY FCYMAIESLIKVFPKWCSTYDNLRKAEVFKNAIYKMNDTVPKFLGTIQQRYRPRNGSPVK AFSEWIDDINKIVAEKLVPILDENGMLIQKRTNYNLINIADFNSLIAQSQMNNTPVFELT QEQVEKTGSVWENMKRNRDDFSVTFETLAKTIIALTN >gi|226332911|gb|ACII01000108.1| GENE 2 1091 - 2029 404 312 aa, chain - ## HITS:1 COG:lin2373 KEGG:ns NR:ns ## COG: lin2373 COG4823 # Protein_GI_number: 16801436 # Func_class: V Defense mechanisms # Function: Abortive infection bacteriophage resistance protein # Organism: Listeria innocua # 3 306 6 287 298 87 26.0 3e-17 MSKPFITYTAQVEKLKNEKNLVITDDDFAVESLQNISYYALIGGYKHPFIDIHTRKYINE ACFEDIVALYEFDEELRGIFFKYLCRVERKMRSSISYHFCKKHGERQEEYLNSNNYGNIP KNKNGITKLIKMLDMMANKNKDHEYLVYQRNKYHNIPLWVIMNTLTFGQISKMFEFLPQN MQGAICQDFGNVKKNEMIKYLKVLTLYRNVCAHNERLFSYHTYIDIPDTLLHKKLGISKN GSKYIYGKNDLFSVVITFRYLLPKTDFLLFKKQLLHIFDRYEKRNSNLKLNDLFEYMGFP 
INWKEITKFRKI >gi|226332911|gb|ACII01000108.1| GENE 3 2288 - 2749 486 153 aa, chain - ## HITS:1 COG:no KEGG:CLH_1981 NR:ns ## KEGG: CLH_1981 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 1 151 1 148 155 166 58.0 3e-40 MKKTKSNLDELQELKLLKIEHNGCWLAFWGLLAVILTQIAIGNDSKQDLSGEWIVFMCLA LYLTVGCIRNGIWDRKLKPNFKNNIMASSIAAVVMGILWFIISYRNYHKLVGSIATGVIM FFSIEILCFLALTLTSKIYKKRLKKLEDDSEDE >gi|226332911|gb|ACII01000108.1| GENE 4 2730 - 2951 212 73 aa, chain - ## HITS:1 COG:SPy1934 KEGG:ns NR:ns ## COG: SPy1934 COG1476 # Protein_GI_number: 15675737 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 3 67 2 66 68 87 64.0 6e-18 MPKNIAIKVARAQKDMTQKALAEAAGISRQTMNAIEKGEYNPTIKLCRRICRILDKSLDD LFWEDDEDEENEK >gi|226332911|gb|ACII01000108.1| GENE 5 3113 - 4129 1149 338 aa, chain + ## HITS:1 COG:CAC0620 KEGG:ns NR:ns ## COG: CAC0620 COG0715 # Protein_GI_number: 15893908 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Clostridium acetobutylicum # 36 337 31 333 338 257 43.0 2e-68 MKNKKWISLAAAVILAVTALPMTAFAAEKDGGEEKLTKVTLNEVAHSIFYAPQYVAIEEG YFSEEGLDLTLITGFGADKVLTALISGEADIGFMGAEASIYAYQEGATDPVVNFAQLTQR AGNFLVAREEMPDFKWEDLKGRKVLGGRKGGMPEMVFEYILKKNGLDPEKDLSIDQSIDF GATAAAFTGDDSADFTVEFEPSATALEKQGAGYVVASLGVDSGYVPYTSYSARTSYMEKN PDIMQKFTDALQKGMDFVQSHTPEEIAEIIEPQFPETDLDTITAIVKRYYDQDTWKENLV FGQDGFELLQDILEDAGELKERTPYAELVNTEFAQNAS >gi|226332911|gb|ACII01000108.1| GENE 6 4548 - 5660 1440 370 aa, chain + ## HITS:1 COG:CAC3237 KEGG:ns NR:ns ## COG: CAC3237 COG3839 # Protein_GI_number: 15896483 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Clostridium acetobutylicum # 1 368 1 368 369 481 65.0 1e-135 MASLSLQHINKTYPNGFQAVKDFNLEIEDKEFIIFVGPSGCGKSTTLRMIAGLEEISGGT 
LKIGDKVMNDVEPKDRDIAMVFQNYALYPHMTVYDNMAFGLKLRKVPKDQIDKAVREAAR ILDLEKLLDRKPKALSGGQRQRVAMGRAIVRNPKVFLMDEPLSNLDAKLRVQMRIEISKI HQRLGATIIYVTHDQTEAMTLGTRIVVMKDGVVQQVDTPQNLYQKPGNLFVAGFMGSPQM NFLDAVISEKGGNLIAKVGEHELVIPAAKAKALKDGGYVGKTVVLGIRPEDIHDSQMFIE ASPSAPMTSVVKVYELLGAEVFLYFDVNGTQVTARVDPRTTAKTGDPIKFAFDMEKSHFF DKETELTICN >gi|226332911|gb|ACII01000108.1| GENE 7 5805 - 6893 871 362 aa, chain + ## HITS:1 COG:CAC3236 KEGG:ns NR:ns ## COG: CAC3236 COG2508 # Protein_GI_number: 15896482 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 169 350 134 307 312 87 30.0 4e-17 MKIQQILQKCLTDWKNISQIDFCLLDSDNHIFLSTCDKKLPAESKLEEFRQSSALCVSNT SCCLYKIMENHSISYILIVWGKAESTATIGELAVCQVQSLLAAYAEKSDKNTFMQNLLLG SYSDVDAFNCAKKLHIATSVQRAVFLVETKQTKDENALATIRNIFSARTRDFITAIDDTG IIIIRELQSTETYEDLESIAYMLVDMLNTEAMTSAWVAYSNLAEDISRLPDAYKEAHTAL EVGKIFYADKNVFGYNQLGIGRLIYQLPVSICEMFIDEIFKEETLDSIDEETLITIRTFF ENNLNLSETSRQLYVHRNTLVYRFEKLQRKFGLDIRAFEDALTFKLAMMVVDYIKYSKNH PS >gi|226332911|gb|ACII01000108.1| GENE 8 7083 - 8141 1011 352 aa, chain - ## HITS:1 COG:sll0739_2 KEGG:ns NR:ns ## COG: sll0739_2 COG1118 # Protein_GI_number: 16331977 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate/molybdate transport systems, ATPase component # Organism: Synechocystis # 2 287 37 322 395 280 46.0 3e-75 MAVSVDIEKKLHGFTLKVKLESDGSPMGILGASGSGKSMTLRCIAGIQTPDSGRIVVNDK VLFDSEKKINLKPQERKVGYLFQNYALFPTMTVEKNIACGYRGDKKHLKAKVADYIERYQ LNGLEKRYPGQLSGGQQQRVALARMMIGEPEVILLDEPFSALDGYLKDIMQRDMQNFLNE YTGDMILVTHSRDEAYKFCGHLTILDSGQALTTGETKKLFERPGILQAARLTGCKNFSTV QKMGKHSIYAVDWDLMLQTKDVVPDDVTHVGIRGHWMKGASEGGENHMEVEVMEYIETTF EHQYLLKNKKGGDCQPVWWMCPKGNFEEDPHAKVPKYIHFPEEHLMLLKEAK >gi|226332911|gb|ACII01000108.1| GENE 9 8141 - 8836 752 231 aa, chain - ## HITS:1 COG:alr2433_1 KEGG:ns NR:ns ## COG: alr2433_1 COG4149 # Protein_GI_number: 17229925 # Func_class: P Inorganic ion 
transport and metabolism # Function: ABC-type molybdate transport system, permease component # Organism: Nostoc sp. PCC 7120 # 8 229 3 223 223 179 47.0 4e-45 MSEFLAEINWSPLWISLKTGFTATVIAFFLGIFFARLVMKMKPFSRGILDGILTMPLVLP PTVAGFILLLLFSLRRPFGAFLLDNFDIKIVQTWKGCVIAASVIAFPLMYRNARAAFEQV DVNLIAAGKTLGMSDRRIFWTVVMPTAGPGIASGTVLAFARAIGEYGATSMLAGNILGKT RTVSVAIASETAAGNYGMAGFWVVVILIISFVIVAAINIVSGKGMKMGRWM >gi|226332911|gb|ACII01000108.1| GENE 10 8839 - 9738 1240 299 aa, chain - ## HITS:1 COG:sll0738 KEGG:ns NR:ns ## COG: sll0738 COG0725 # Protein_GI_number: 16331978 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, periplasmic component # Organism: Synechocystis # 39 261 48 266 270 157 40.0 3e-38 MRKKLIAAMMAGVLSAGMFSTGVFAAGTDLKGEVNTFIAASLSNAMEEIQKDFNETYPDV EILYNADSSGTLQTQIEEGARCDIFFSAADKQMDALVDEDLAKKDTVEDILENKVVLIKP KDGETKVTGFENITDAANIALAGDSVPVGQYSREIFDNLGITDEVNKMEINEGKNVSEVL AAVSEGSNEIGIVYATDAASVADKVDVIAEAPADALKTPVLYPVGLIEDKEASEDDTAAT EAFLEYIKSDDAMKVFEKYGFTAYKADDASETDDKDAEATEEADDTETNAETETTEEAK >gi|226332911|gb|ACII01000108.1| GENE 11 10118 - 10330 428 70 aa, chain + ## HITS:1 COG:HI1370 KEGG:ns NR:ns ## COG: HI1370 COG3585 # Protein_GI_number: 16273280 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin-binding protein # Organism: Haemophilus influenzae # 1 69 1 69 69 78 72.0 3e-15 MKLSARNQLKGKVVSINNGAVNSIVSIDIGGGNIITATISCAAVEELNLKVGSDAYAVIK ATNVMVGIDE >gi|226332911|gb|ACII01000108.1| GENE 12 10323 - 11261 699 312 aa, chain + ## HITS:1 COG:CAC0252 KEGG:ns NR:ns ## COG: CAC0252 COG1910 # Protein_GI_number: 15893544 # Func_class: P Inorganic ion transport and metabolism # Function: Periplasmic molybdate-binding protein/domain # Organism: Clostridium acetobutylicum # 2 310 3 316 319 185 33.0 1e-46 MNKLYTAQEVADRLKIKKTTVYELIKRGELESSKIGKQLRVSEEQLTQYLNRSSSGNSGS GQDIPYTSVNFEPESSLLKRDYLLHSSGIILGGQTSAALELLLGKMSAHPDGLPVLQSHM NDYNGLYSLYFEKAHIAATSLDIDDIRHLVPGIPLILLSLYKYSVGFYVQAGNPEKISSV 
EDLTDPKVVFMNREKGSARRVYLDRLLKERGISSEKISGYRNEAVSDMSSASAVFEHRAN VAFGEEMIARYFPGLDFVPVTDMQMYLAIPAESVRKPGFSALVEIVQSEDFKTEIHHLTG YDTSYTGEMICI >gi|226332911|gb|ACII01000108.1| GENE 13 11351 - 12397 1210 348 aa, chain - ## HITS:1 COG:CAC0501 KEGG:ns NR:ns ## COG: CAC0501 COG1968 # Protein_GI_number: 15893792 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Clostridium acetobutylicum # 47 340 1 272 274 297 58.0 2e-80 MSIYSESDRYMWVFCITYYLAETCLIRLKIVVNKNNNKKENRKGKIMDFIELLKVIFLGI VEGITEWLPISSTGHMILVDEFLKLNVTEDFKNLFFVVIQLGAILAVVVLYWNKLWPFYI RPISKKQQAVLNRHGAVSRGILTFVEKFCDKEKWVLWFKIIVACIPTIVIALPFNDVIEE KFNNYVVVAIALIVYGILFIVIENYNKRRRPTCTNLENLSFKTALIIGLFQVLSVVPGTS RSGSTIIGGILAGTSRTVAAEFTFFLGIPVMFGASLLKILKFGFSFTGTEVMILIVGMVV AFVVSIIAIKFLMGYIKKHDFKAFGWYRIALGILVLGYFIGKTLLGCK >gi|226332911|gb|ACII01000108.1| GENE 14 12540 - 15755 4213 1071 aa, chain - ## HITS:1 COG:BH2536 KEGG:ns NR:ns ## COG: BH2536 COG0458 # Protein_GI_number: 15615099 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Bacillus halodurans # 1 1062 1 1058 1062 1200 58.0 0 MPRNKDIKKVLVIGSGPIIIGQAAEFDYAGTQACRSLKEEGIEVVLLNSNPATIMTDKDI ADRVYIEPLTVEVVEQLILKEKPDSVLPTLGGQAGLNLAMELDEKGFLKEHNIRLIGTTA QTIKKAEDRQEFKDTMEKIGEPIAASKVVTTVEDGLAFTNIIGYPVVLRPAYTLGGSGGG IAHNEYELREILENGLRLSRVGEVLVERCIAGWKEIEYEVMRDSAGNCITVCNMENIDPV GVHTGDSIVVAPSQTLGDKEYQMLRTSALNIITELGITGGCNVQYALKPDSFEYCVIEVN PRVSRSSALASKATGYPIAKVAAKIALGYTLDEIPNAITGKTYASFEPMLDYCVVKIPRL PFDKFITAKRTLTTQMKATGEVMSICHNFEGALMKAIRSLEQHVDSLMSYDFTQLTDEEL LAELEIVDDRRIWKIAEAIRRGMPQSMLHDITKIDIWFIDKLAILVGMENALKTRKLTKE LLLEAKRMEFPDYIIARLTGKTEEEIKALREEYQIKAAYKMVDTCAAEFAAATPYYYSVY GDEGTENEAVATPDKKKILVLGSGPIRIGQGIEFDFCSVHCTWAFAKEGYETIIINNNPE TVSTDFDIADKLYFEPLTPEDVENVVNIEKPDGAVVQFGGQTAIKLTEALTKMGVKILGT SAENVDAAEDRELFDEILEKCEIPRPKGQTVFTAEEAKKAANELGYPVLVRPSYVLGGQG 
MRIAVSDEDVEEYIGVINQIAQEHPILVDKYLMGKEIEVDAVCDGEDILIPGIMEHIERA GIHSGDSISVYPAKTISEKAKETIAEYTRRLARALHVIGMINIQFIVLGDDVYVIEVNPR SSRTVPYISKVTGIPIVPLATKVILGYKLKDMGYEPGLQPEAKHYAIKMPVFSFEKIRGA DISLGPEMKSTGECLGIAETFNEALYKAFIGAGIRLPKHKNMIITVKDEDKQDIIPIARR FEALGYRIYATLGTAKVLKENGIKVIRTNKLEQPAPNLMDLILGHKIDVVIDTPPQGVEH QKDGFVIRRNAIETGVNVLTSLDTAEALATSLENTDLNNLSLIDIATIARR >gi|226332911|gb|ACII01000108.1| GENE 15 15755 - 16831 1094 358 aa, chain - ## HITS:1 COG:lin1950 KEGG:ns NR:ns ## COG: lin1950 COG0505 # Protein_GI_number: 16801016 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Listeria innocua # 2 354 3 353 363 353 50.0 3e-97 MKAFLILEDGHVFKGTSIGSTRDVISEIVFNTSMTGYLEVMTDPSYAGQAVCMTYPLIGN YGICYDDQESSKPWVDGFIVRELSRVPSNFRSVDTIQHFLTKHDIPGIAGIDTRALTKIL REKGTMNGMITVNENYDLDTIIPQLKAYTTGKVVEKVTCREKEVLKGDGPKVALLDLGAK RNIARSLNERGCEVTIYPALTPAEEILEANPDGIMLSNGPGDPKECTSIIKEIKKLYDSE VPIFAICLGHQLMALATGGDTHKMKYGHRGGNHPVKDLATGRVYISSQNHGYVVDAEGLE EKNIAKPAFVNVNDGTNEGMAYEGKNIFTVQFHPEACPGPQDSSYLFDRFMDMMGGNK >gi|226332911|gb|ACII01000108.1| GENE 16 17379 - 18506 957 375 aa, chain - ## HITS:1 COG:no KEGG:Hore_04160 NR:ns ## KEGG: Hore_04160 # Name: not_defined # Def: beta-1,3-glucanase (EC:3.2.1.39) # Organism: H.orenii # Pathway: not_defined # 14 214 771 968 1067 70 30.0 9e-11 MKYIRKSFSLFWLIAVMLFGTVSASAASAKPETPVLSGTAAGNRVTLNWNKVKKASGYQI FLYYKAYGKYKCVGRIKNRNITSFTLTGSEDKLYTYKIRSYLKQGNKTLYSPLSKALEIK TAPGKPVITRIRVREESGTLIKWKKIKTAEGYQIFRSESEDRGYKRINIVSGNTTFSYAD TGVVSGKTYYYRIRAYVRNQGNVVYSELSDPAEAVMRKTIMIGDSRTDMMKDVVENDNIT WICEVGMGYKWLRDTALKTLQEQMKGNEDIFVWLGVNDVYNISNYISLLNEEIPKWKAQG ADVYIVAVGQVTKDPYVTNEEIEDFNARMKKEVAGAKYADLYSYLKKQGYKTTDGTHYDN ETTWKIYRYLMSFVS >gi|226332911|gb|ACII01000108.1| GENE 17 18662 - 20167 1507 501 aa, chain - ## HITS:1 COG:no KEGG:CHU_1854 NR:ns ## KEGG: CHU_1854 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 70 269 18 231 1059 
79 29.0 5e-13 MKNRKLNQLLVAGMISAAMMTSGISVQAADFSDDTSTVEEQSVGSAEEEAPEVEDGSEFQ SDAAEASSAVAVSSANFPDAKFRQYVLDNIDTDKDKMLSAAEIKAAKTIDVSGLGISNLK GIESFTYATDLFAANNKLTSVNITKNTRVAYLNLSNNSLAGTLNLSKCTNLRVVKYGSNK LTKVVMPSKKYLKNLDFVDASSNKFTTQANAGLNIGDTDYVKSLSEVNASNNAITSFNCA GLQGILDLRNNKITNLRLENSKEGSQVVSLYLDGNSLSKTPSIDFTPEWIAVPQQFSCDA GVSSKVKMLKATASITSATWDQIVVNVGSSTDDASYKLEKKTGNGAYETVKTWDNGDLAD AEFGEDYTDNVISTGTAYTYRVTATVQVKDANKNLRSWSNSAEVKATATGTKPAISVKST KKGVATVSWKAVAGADGYDVYCGSSKKSQKGTVVKGTTKLTANKTKLTSGKTYYFRARAY KMVGSAKVYTGYSAVKSVKVK >gi|226332911|gb|ACII01000108.1| GENE 18 20306 - 20893 660 195 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579879|ref|ZP_04857147.1| ## NR: gi|253579879|ref|ZP_04857147.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 195 1 195 195 295 100.0 1e-78 MGTDFFDGLVKTFSKTTKELGKTTRELGARAEQTIEAQKIRSKITGEERIIEKIKVDIGD IVYRRHSQGDGIDSELSALCQEIDQHYLKIREYKDNAANLKGEKICPSCEREVDINVSFC PYCGTPCPTPEPAKNVENDTVSSDEDEDGSNEPEETPAEELQPEEPETEKTAETEPAQEQ PVEETPVEKETMEEK >gi|226332911|gb|ACII01000108.1| GENE 19 20949 - 22367 253 472 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 224 450 267 448 466 102 32 3e-21 MSENYNDDEIIDSSEDTEKKRNKSEKFCFLCRRPESKAGPMIELPNNIHVCTDCMQKSFN SMNQQFNEGKFNYSDLLNMPNVSMIDLGSFQNPVQQPKKEKKKKKQEKPVLDLKNIPAPH KIKETLDQYVIGQEKAKKVMSVAVYNHYKRVATDTMDEIEIEKSNMLMIGPTGCGKTYLV KTLAKLLDVPLAIADATSLTEAGYIGDDIESVVSKLLAAADNDVERAEHGIIFIDEIDKI AKKKNTNQRDVSGEAVQQGMLKLLEGSEVEVPVGANSKNAMVPLVTVNTRNILFICGGAF PDLENIIKERLNKQASIGFYADLKDKYDNDPHLLQKVTVEDIRSFGMIPEFIGRLPIIFT LDGLNEDMLVKILQEPKNAILKQYQKLLALDEVKLEFEDGALHAIAAKALERDTGARALR AILEEYMLDIMYEIPKDDSIGEVVITREYIEGNGGPKILLRGQEPILLEGSH Prediction of potential genes in microbial genomes Time: Sat May 28 20:17:21 2011 Seq name: gi|226332910|gb|ACII01000109.1| Ruminococcus sp. 
5_1_39B_FAA cont1.109, whole genome shotgun sequence
Length of sequence - 50498 bp
Number of predicted genes - 37, with homology - 37
Number of transcription units - 16, operones - 9 average op.length - 3.3
N Tu/Op Conserved S Start End Score pairs(N/Pv)
- Term 37 - 71 5.2
1 1 Tu 1 . - CDS 79 - 1494 1666 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain
- Prom 1610 - 1669 6.5
2 2 Tu 1 . - CDS 1691 - 2635 979 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
- Prom 2659 - 2718 2.5
- Term 2647 - 2688 7.1
3 3 Op 1 . - CDS 2724 - 4145 1820 ## COG1316 Transcriptional regulator
4 3 Op 2 . - CDS 4163 - 5566 1438 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis
- Prom 5589 - 5648 7.0
5 4 Op 1 8/0.000 - CDS 5650 - 6606 625 ## COG1216 Predicted glycosyltransferases
6 4 Op 2 . - CDS 6588 - 8495 1477 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
- Prom 8521 - 8580 9.0
+ Prom 8527 - 8586 9.7
7 5 Tu 1 . + CDS 8674 - 9954 1321 ## COG1686 D-alanyl-D-alanine carboxypeptidase
+ Term 10012 - 10067 7.2
- Term 10000 - 10054 6.2
8 6 Op 1 . - CDS 10061 - 11521 888 ## gi|253579888|ref|ZP_04857156.1| conserved hypothetical protein
9 6 Op 2 11/0.000 - CDS 11540 - 13984 2134 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
10 6 Op 3 . - CDS 13986 - 16493 2005 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
11 6 Op 4 . - CDS 16496 - 18688 1875 ## Cbei_4749 methyltransferase type 12
12 6 Op 5 26/0.000 - CDS 18744 - 20024 1334 ## COG1134 ABC-type polysaccharide/polyol phosphate transport system, ATPase component
- Prom 20067 - 20126 1.7
13 6 Op 6 . - CDS 20136 - 20927 743 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component
14 6 Op 7 . - CDS 20964 - 21518 746 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes
15 6 Op 8 . - CDS 21595 - 22896 1374 ## COG1686 D-alanyl-D-alanine carboxypeptidase
16 6 Op 9 .
- CDS 22982 - 24112 989 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis
17 6 Op 10 . - CDS 24105 - 25403 1296 ## COG0439 Biotin carboxylase
18 6 Op 11 . - CDS 25400 - 26362 648 ## DP0041 hypothetical protein
19 6 Op 12 . - CDS 26362 - 27297 767 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
- Prom 27332 - 27391 5.5
+ Prom 27291 - 27350 8.7
20 7 Op 1 . + CDS 27490 - 27840 164 ## CLL_A3239 hypothetical protein
21 7 Op 2 . + CDS 27857 - 28171 221 ## Shel_14320 membrane protein
- Term 28179 - 28240 2.1
22 8 Tu 1 1/0.000 - CDS 28263 - 29720 1308 ## COG5263 FOG: Glucan-binding domain (YG repeat)
- Prom 29768 - 29827 8.1
- Term 29810 - 29864 11.0
23 9 Op 1 . - CDS 29920 - 32505 2271 ## COG5263 FOG: Glucan-binding domain (YG repeat)
24 9 Op 2 . - CDS 32576 - 32839 316 ## COG1925 Phosphotransferase system, HPr-related proteins
25 9 Op 3 7/0.000 - CDS 32892 - 34214 1458 ## COG4856 Uncharacterized protein conserved in bacteria
26 9 Op 4 . - CDS 34189 - 35079 857 ## COG1624 Uncharacterized conserved protein
- Prom 35103 - 35162 6.0
- Term 35172 - 35210 1.0
27 10 Op 1 . - CDS 35250 - 36347 1372 ## EUBREC_2782 hypothetical protein
28 10 Op 2 . - CDS 36344 - 36778 473 ## Cphy_3553 hypothetical protein
- Prom 36859 - 36918 4.6
29 11 Tu 1 . - CDS 36922 - 37626 433 ## COG1040 Predicted amidophosphoribosyltransferases
- Prom 37681 - 37740 2.0
30 12 Op 1 . - CDS 37749 - 40013 2213 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member
- Term 40050 - 40086 5.7
31 12 Op 2 . - CDS 40092 - 41081 1137 ## COG1077 Actin-like ATPase involved in cell morphogenesis
- Prom 41237 - 41296 5.1
32 13 Tu 1 . - CDS 41335 - 44175 2757 ## COG0178 Excinuclease ATPase subunit
- Prom 44196 - 44255 3.9
33 14 Op 1 . - CDS 44336 - 45478 882 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases
34 14 Op 2 .
- CDS 45508 - 45855 230 ## gi|253579917|ref|ZP_04857185.1| predicted protein 35 15 Tu 1 . - CDS 45966 - 47774 2434 ## COG0481 Membrane GTPase LepA - Prom 47829 - 47888 6.5 - Term 48017 - 48051 3.2 36 16 Op 1 . - CDS 48134 - 49438 669 ## EUBELI_01207 stage II sporulation protein P 37 16 Op 2 . - CDS 49517 - 50431 953 ## Cphy_2317 germination protease (EC:3.4.24.78) Predicted protein(s) >gi|226332910|gb|ACII01000109.1| GENE 1 79 - 1494 1666 471 aa, chain - ## HITS:1 COG:TM0571 KEGG:ns NR:ns ## COG: TM0571 COG0265 # Protein_GI_number: 15643337 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Thermotoga maritima # 120 449 29 338 459 167 35.0 5e-41 MNDERNLGEKTDDSGQQPRYEHYQFHEEQGTVLKPSGPSGHRRNQNSFQKKAGTTIALAV IFGLVAAVVFQAANFAADRFLNTGKSSVQIKTTDSVDLQETASDDSTADKVLSDSENGTV AAVAQASMPSVVAITTVSVQEIPSFFGYSSRQYKSASTGSGIIVGDNDDELLIATNNHVV DGATTLSVCFIGDDVANAETETVNAGDNGDLNVEDAVSAKIKGTDADNDLAVVAVKKSDI PEDTLNQIKIAQIGSSDDLAVGQQVVAIGNALGYGQSVTSGWISALNRTISTDDGTNSTG LIQTDAAINPGNSGGALLNMKGELIGINSAKYADSAVEGMGYAIPISKAKPILEELMNRE TREKVDSSKKGYLGVSLASLTTEAIEMYNMPTGAFVRNVENDSPAQEAGICKGDIIVKFD GQKVSDGDDLLDKLQYYKSGEKIEAVIARATNGEYEENTIELTLGTRPDNE >gi|226332910|gb|ACII01000109.1| GENE 2 1691 - 2635 979 314 aa, chain - ## HITS:1 COG:STM2085 KEGG:ns NR:ns ## COG: STM2085 COG0463 # Protein_GI_number: 16765415 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 11 312 1 303 314 215 39.0 1e-55 MTHKEKYNDTVTVDVIIPAYRPGREFGELLHRLEEQEYRPRRILVMNTGEQYWNREWEKC PILEVHHLEQKDFDHGGTRRRAAELSNADIMIFMTQDALPADRKVIGNLVCAVSENPGAG AAYARQLPKADCRFLERYTRSFNYPEQSSVKSLDDIDKYGIKTYFCSNVCAAYDKGIYLK TGGFTERAIFNEDMICAGTMIQKGYSVVYAADARVYHSHNYSGRQQFHRNFDLGVSQAEH PEIFEGVPSEGEGIRLVKRSLGYLIRTGHFWLIPQLIWQSGMKYAGYFLGKRYRKLPRKV VLACTMSPYYWNRK >gi|226332910|gb|ACII01000109.1| GENE 3 2724 - 4145 1820 
473 aa, chain - ## HITS:1 COG:CAC3046 KEGG:ns NR:ns ## COG: CAC3046 COG1316 # Protein_GI_number: 15896297 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 24 338 30 336 341 134 31.0 3e-31 MAKPGKKKKVLFVLEIIVLLLFIGGLYVYGQISSRLDKIEQPELKKEKIVVNQEAPKMTG YKTYVLFGIDTRGEGSQYSAQNSDTMIIVSVNNDTKEVRMVSVYRDTLLNVGDDTYTKAN AAYALGGPEQAITMLNTNLDLDISDYATADFSALVEVVDDLGGLDIPLSYAEIEHMNNYC VETSKLTGKDYTPIEKPEPKPEDEEAIVGTYHLNGVQATSYCRIRYTASLDMGRTERQRR VLGMLFDKAKIAGLSSIFKIMDDVFPMVTTSLSKQDILGLIPTLIGYKFTDSTGFPQKFK FSNIKGSIIVPTDLENNVVELHKFLYDDQDYTPSSEVVARSNKILEIVGGEGKLDDAAKT TTQDTDTTNANDTFVWSGDNTSGSTDNSGDYDSDYDNGGYDGGYDNGGDDTGSDTGGGDY DGGDDTGGGDYDGGDDTGGDETGGDDTGGDETGGDDGFSDDSAADSGASSEEQ >gi|226332910|gb|ACII01000109.1| GENE 4 4163 - 5566 1438 467 aa, chain - ## HITS:1 COG:VC0934 KEGG:ns NR:ns ## COG: VC0934 COG2148 # Protein_GI_number: 15640950 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Vibrio cholerae # 108 467 105 465 465 247 38.0 3e-65 MLKDNEKNFSRLHMIIDAIVLVLSYFLAWLIRFVGPMAATAVRTRSFQQYMLMLVFIVPV YLLLYQAFTLYTPMRMQGRRLVLANIVKANSLGLLILMFTFYMIDESDFSRSTYIMFYVI NIVLQWCARMLIFALLRNMREKGLNQKQMICVGYSRAAEEYIDRVLANPQWGYVIRGILD DNVPAGTEYKGIKVLGRIANLNIILPENRLDEIAITLGLSEYYRLEEIVALCEKSGVHTK FIPDYNKIIPTKPYTEDILGLPVINIRYVPLNNTFNALVKRAMDIAGSIVGIIVTSPLML LMCAIIKLTSPGPLIYKQERVGLHNQTFRMYKFRSMEVQPESEEKKAWTVKNDPRVTPIG KFMRHTSIDELPQLFNILKGNMSLVGPRPERPFFVEKFREEIPRYMVKHQVRPGLTGWAQ VNGYRGDTSIRKRIEYDLYYIENWSIGLDIKIIFLTFFKGFINKNAY >gi|226332910|gb|ACII01000109.1| GENE 5 5650 - 6606 625 318 aa, chain - ## HITS:1 COG:MTH172 KEGG:ns NR:ns ## COG: MTH172 COG1216 # Protein_GI_number: 15678200 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Methanothermobacter thermautotrophicus # 1 269 1 281 332 170 35.0 3e-42 MQEVSVIIPNFNGMAYLDGVLAGLECQTVSNFDVILVDNGSNDGSCAFVAARYPWVHLIE 
LPENFGFCKAVNEGIRASRSPYVLLLNNDIEVTENFIEEMLSAIKRHPKAFSCAARMIQF HDRDRLDDAGNYYCALGWAYARGKGKDIHTYEKEEKIFASCAGAAIYRKKIFDELGYFDE EHFAYLEDMDVGYRARIYGYENWYAPDAMVYHVGSGTSGSRYNQFKIRYSSRNNIYLIYK NMPVLQIIINLPFLVAGFGIKILFFTLKGFGREYIAGIKNGFQISKKNHKIPFKFQNLSN YFRIQVELWVNIFRRLCG >gi|226332910|gb|ACII01000109.1| GENE 6 6588 - 8495 1477 635 aa, chain - ## HITS:1 COG:alr4487_2 KEGG:ns NR:ns ## COG: alr4487_2 COG0463 # Protein_GI_number: 17231979 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. PCC 7120 # 81 555 1 477 519 230 30.0 5e-60 MSKKPSAGGKIQWTKMFRKLSPYTIRKGFRYMKHYGPKEFWIRLHERFEPEEVPYGPWYR AYIPTEETLETQRKQKFDYSPLISIAVPAYQTPVEFLRQMIESLIVQTYSNWELCIVNAS PDNEEMQKVLAEYSAGDSRVRFCNLKENLGIAENTNRAFAMAKGEFVGLLDHDDLLAPNA LYEIVKILQDHPQADALYTDEDKVTTELDEHFQPHLKSDFNLDLLRSNNYICHFFVVRKS IVEKTGGFRKEFDGAQDYDFIFRCTENAGEVLHVPEILYHWRTHKASTADNPASKMYAFE AGKRAIEAHLERTGTKGEVSHTQDLGFYRVKYPVQGKPLVSVIIPNKDEKETLQTCLEML EKNTGYQNFEIIIVENNSTTDEIFRYYKELSGNRKIHLLRWGKEFNYSAINNFAAAHAKG EYLLFLNNDVKSINPDWLEEMLGVCQRPEVGGVGAKLIYPDNTIQHAGCVIGMGGIAGHM FVDMPADRTGYLHKASLLQDMSAVTAACLLMKKEVFEQAGGFTEELAVAFNDVDLCLKVR KNGYLIVYDPYAKLYHMESKTRGAEDSKEKVRRFQTEIEYMRCHWIDILKNGDPCYNKNL SLTKWNYSLKPIPGMETEAGQKKEKTGRKSCRKYQ >gi|226332910|gb|ACII01000109.1| GENE 7 8674 - 9954 1321 426 aa, chain + ## HITS:1 COG:BH1573 KEGG:ns NR:ns ## COG: BH1573 COG1686 # Protein_GI_number: 15614136 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 60 314 29 272 382 163 39.0 5e-40 MMIKKKRITAALMALGLGAVTMFSQFPVSAAEETAQNTDAAAQTADPSVVVTNGIDGWPQ ASDISSAAAIVMETSTNTVLYSKNADQPLYPASAVKIMTCLVALENSSLDEQVTMTATGV SGVTDGGANISSQLDEVFTMEQCLYAIMVASANDIALQVAEHVGGSVDAFVQTMNTRAQE LGCTNTVFTNPTGLPDENQHITAHDMALIMEAAMANDTFRTIAATTSYTLPATNVSGGER VLTNNFTMINSTSDGYYKPCIGGKEGYTEASGSTLVCEASKNNMKLVCIVLNGASGVTDD EAIALLNYGFDNFAPLTIADDDFNRLSGGTVIAPNGATEDNLTTEDTSSDGQITRQYYFG 
GTPVGTAILEDAEQQTNDAAVTGQKNMEAAQAYSASHTTAPYYIIGAIGAAFLLFFLFLM IKVIKS >gi|226332910|gb|ACII01000109.1| GENE 8 10061 - 11521 888 486 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579888|ref|ZP_04857156.1| ## NR: gi|253579888|ref|ZP_04857156.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 7 486 1 480 480 911 100.0 0 MNVKEDMLKKKKEINEKTEIFIFVFLAFILLTTWAMTQPFNSGPDEQMRYYVADYIYKHH GALPGGDDPAVRNKVWGISYAYYPVVSYMVSALFMRISRLFADPGYSMFKIARMADVLFV TGAVYFVVKASGKLFPKEKYSREVRWLFAALAGFMPQAIFMGTYVNTDSLALLAAAMILY AWVSYLREDWTWKNCILLAVGMAVCALSYYNAYGWILCSFFFFCFTVLLCREEAFSQRVR FLFSRGAVIAAVTLVLCGWWFIRNAVLYNGDFLGRKSCAECAEKYAQNDYRPSLYPTPAK LGWNWKDIILYQDPGWYHNWILTVCVSFIGTFGQMEIYMPYTVSKLYMLFFAVGIISVFF VKETFDLRKKMYVAQRKAVGNDRWKIKTKVISREWNKEGIFHLMMVFLIMIPVFLFLYYV YYSDNQPQGRYLMPALYPLMYFVTLGWNNILTKTVKNEKVRSLIYRVLTVLLVISPFACW AFLILP >gi|226332910|gb|ACII01000109.1| GENE 9 11540 - 13984 2134 814 aa, chain - ## HITS:1 COG:alr4487_2 KEGG:ns NR:ns ## COG: alr4487_2 COG0463 # Protein_GI_number: 17231979 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. 
PCC 7120 # 280 673 1 392 519 187 31.0 5e-47 MPKEIFEVVRERFHLSDRTQYIVQGHWPKDAVMEAYLDQHKLKVSVKENENVSALERFKD PEKMQGIQITATVQIPENLEGYHKLVIYEKFSDKKHVWFSITAGELDKKRDKPQVYFEEE SAEQGTVRIRGWAIAPKPVTVRIFDADKKPVAAEIQRTDRVDVNQLFEEAQDPGKTGFFS EITNVSGKCLYVVFYAGEKKTVHVVPLRKADILAKKLDKYVEKGIRYWKSQGAAALAEKV VTKVKNVRQGPPSYQKWIRHHLPDRNELEKQKKTSFGYRPKISFVVPLYKTPEKYLRRLT ESFQEQTYSNWELCFSDGSGAQSPLTELLKELTAKDNRIKYVSHEEPLQISENTNSAIEI ATGDFIAFADHDDELTPNALFECVKAINEKPQTLVIYTDEDKMSMDGHKFFQPHFKPDYN PDLLCTVNYICHLFVVSRKVIEKVGGLRSEFDGAQDYDFVLRCVEAVKDEEICHIPKILY HWRCHEDSTAENPESKLYAFEAGRRAVQAHYERTGIHAEVFKGEYLGLYRTKFIRDHDPL ISIIIPNKDHIDDLKRCMESIEQKSTYKNYEYIIVENNSTDSATFEYYKKLEAENPKVRM VYWDGVFNYSAINNYGASFAKGEYLLLLNNDTEIINPDCLEELLGYCMRKDVGAVGARLY YEDDTIQHAGVVIGFGGIAGHCFVQQKRGTTGYCHRIICAQDYSAVTAACMMVKKSAFDA VGGLSEELAVAFNDIDFCMKLRKAGYLIVYNPYAELYHYESKSRGLEDTPEKVARFNKEI ATFEKKWPEILKNGDPYYNPNLTLKSQDFSLKRI >gi|226332910|gb|ACII01000109.1| GENE 10 13986 - 16493 2005 835 aa, chain - ## HITS:1 COG:alr4487_2 KEGG:ns NR:ns ## COG: alr4487_2 COG0463 # Protein_GI_number: 17231979 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. 
PCC 7120 # 302 689 1 392 519 149 25.0 2e-35 MGNILWSVDKKIYSDKEDHTLAITGWAITRNQSECDFILYGSGKELFVPEPSRCERADVA KDLKETKDIKEVGNVGFTVKIPEIIKLAEEHEKLQLALRAGDEKEIIWEATAAEVKDFCE ESLIEYHIDEEQITQESILTVRGWVVNQLEPDEIFVQGTDGKVLECTITRQRRPDVEEAK GISEEEKRNLGFSITVNLENTNDQNICICFRGKDVQKIYTVNVKKIKRENTGLYQQMKLL SLKNRQKNQEYIKKNGIGRFIRYVRNSQLKDGDQDYEDWLKDHVAFRKELKRQRNAVFSY SPLISIVMVVTDTDEQRLKSVIDAYTEQTYGNWQLCLADACEGEETGEFLRKKYKKEIRL SYKKVTENNGISGNLNASLKLAMGEYVLFAGQEIIPEPDALFQMVKAITEKKADMIYTDE DEISADGKHYSEPEFKPDFNLFRLRENNYIGQFWAIRKEILEQAGKFDPEYDGAQDYDML LRCSEQAENIVHVPKILCHSMKAENLITEEQEKKNWEAGRKALEEHYRRAEVSATAELAD KKGWYRSHLTISGEPMISVIIPSKDHINDLELCISSIEEKTTWKNYEIIIVENNSVEKET FVYYETLKNRYPNVRILTWKKEFNYSAINNFAVREARGEYLLFLNNDVEIITENWLEEML QLCQQKDVGMVGAKLYYPDDTIQHAGVVVGLGGVAAHVLCKLPRDAEGYMGRLRCVQEIS AVTAACMMVKTSVFKAVGGFDEELKVAFNDIDLCMKVRKYGVKIVFTPYAELYHYESKSR GMEDTPEKQLRFSREVNCFRRKWERELLKGDPYYNPNLTLNNTDCSLRKQEKNGD >gi|226332910|gb|ACII01000109.1| GENE 11 16496 - 18688 1875 730 aa, chain - ## HITS:1 COG:no KEGG:Cbei_4749 NR:ns ## KEGG: Cbei_4749 # Name: not_defined # Def: methyltransferase type 12 # Organism: C.beijerinckii # Pathway: not_defined # 1 305 1 305 308 363 56.0 2e-98 MEEKIGKVILDTTCYPGKDLYSDGAIEDEMLAISRDFAPEEFNRVISERKSWPILYHFSH IRENILSWIPFTGEEKVLEIGSGCGAVTGALCERAKEVTCIELSMKRSKINAYRHQDQDN LKILVGNFQEIEKNLTEKYDYITLIGVFEYGESYIRSENPYVDFLRIISKHLKPDGKIIL AIENRLGLKYWAGCTEDHFGTLFEGIQGYPKTKGVKTFSRKEFNGILEEAGNLKADWYYP YPDYKFPMTIHSDRHLPASGELHMRDYNFDRLRLDLFQESQVYNTLLSNDLYPQFANSFL LVIGKEQPQTAPVYVKFSNERDQKLSIYTEISEAADGQLTVKKVPSQKKAAAHVRNLGTI CEELTGMYKEEEIEVNRCRIKGDCAQLEYLTGITLEDKLDHLLEEGRTEELEKLFFSYIQ KVKNIHEKKPFEKTPEFVRVFGNVNLRSDLKCTEISNIDFVPANIILSENKVSVIDYEWT FAFPVPSQFLVYRMIFYYLELNDKRGILKERDFYEKAGILPEDIEVYVEMEHNFQQYILG EHTAMRNMYAQISPGRVEVEDYYREKKQESLEMLQIFWDNGKSFNEADSVRYLFRNGKIQ TEFELPENTTMLRLDPGEMSKGLKIVKLTWEDESQVKFHTDGCEVSSGEFYFGGDDPQII VDSVPENRKSIKIEMEILDRKTTEKKFWKVYAEQKRAMEQMSQELAQKKALVDQVEGSKA WKVYRAIKRV >gi|226332910|gb|ACII01000109.1| GENE 12 18744 - 20024 1334 426 aa, chain - ## HITS:1 
COG:PA1386 KEGG:ns NR:ns ## COG: PA1386 COG1134 # Protein_GI_number: 15596583 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate transport system, ATPase component # Organism: Pseudomonas aeruginosa # 3 378 4 367 422 266 38.0 7e-71 MENKKVIQVKDLTKMYKLYDKPSDRLKEALGLTRKKLYKEHYALHDVNFDIYEGECVGII GTNGSGKSTILKIITGVLTPTAGEVKVDGRISALLELGAGFNMEYSGLENVYLNGTMIGF SKEEIDARLNDILEFADIGDFIHQPVKTYSSGMFVRLAFAVAINIDPEILVVDEALSVGD VFFQAKCYHKFEEFKKQGKTILFVSHDLGSVSKYCDRVILLNKGVKMDEGSPKQMVDLYK QLLVGQNPVKQNESDGTEQIVAEDSEGLGDFQVNPNMLEYGSRIAEITDFRVIDDKGRCS NTVEKGSCFKIRMKVRFNEEIQEPIMAYTFKNIQGTEITGTNTMYENAKIEHSGKGDICT VTFTQNMNLQGGEYLLSFGCTGYKNGDFTVFHRLYDACNITVISSKNTVGFYDMDSKVEV RCEEQV >gi|226332910|gb|ACII01000109.1| GENE 13 20136 - 20927 743 263 aa, chain - ## HITS:1 COG:BS_tagG KEGG:ns NR:ns ## COG: BS_tagG COG1682 # Protein_GI_number: 16080624 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Bacillus subtilis # 9 263 17 275 275 155 37.0 1e-37 MDIYKNRRLVAKLAKNDFKTRYAGSYLGIVWAFIQPVITILVYWFVFSVGFRSGTGDLGV PFVLYLVAGIVPWFFFQDALIGGTNSLLEYNYLVKKVVFNISVLPVVKIISAMFVHAFFV LFTIILYAAYGKFPDFYYLQIIYYSVCVFILVLGLSYATSAIVIFFRDLTQIINIVLQVG VWLTPIMWIVEASPLMGHPVIMKILKLNPMYYIVSGYRDTFLMKTWFWEHAGWTVYFWIF TILCFLFGSWVFKRLRIHFADVL >gi|226332910|gb|ACII01000109.1| GENE 14 20964 - 21518 746 184 aa, chain - ## HITS:1 COG:CAC2331 KEGG:ns NR:ns ## COG: CAC2331 COG1898 # Protein_GI_number: 15895598 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Clostridium acetobutylicum # 1 181 1 180 185 245 67.0 3e-65 MGKIKVTECGGIKGLKVVEPTVFGDARGYFMETYNYNDFKEAGIDVEFVQDNQSSSHKGV LRGLHFQINYPQDKLVRVVNGEVFDVAVDLREGSETYGKWFGVVLSAENKKQFFIPKGFA HGFLVLSEHAEFVYKCSDFYHPNDEGGLIWNDPDIGVEWPIPEGMELSFSDKDTKWGSFK EYRK 
>gi|226332910|gb|ACII01000109.1| GENE 15 21595 - 22896 1374 433 aa, chain - ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 43 346 17 320 425 184 36.0 3e-46 MKKWKYLLKVVMAVGVIAMTAVQICTAAEGTAQAAVSEVTPVSISTNEISGWPAGPEITS ETGVLMDADSGTLLYSKGGDEIRYPASITKIMTLLLAVENCSLKEDVVFTETGTRDISWD SGNIGMQVGEVMSMRACLYALVIRSANEVAAQIAEHVGGTEQHFVDMMNERAAQIGCTNT HFVNASGLPDPDHYSTAHDMALIMREGLKNKKFRRIIGATDYTIKPTNMNSEPRVLHTHH PMLAPESSYHYDGCIGGKTGYTSEAGNTLVTAAGKNGTTYITVTMKAADLAVASTDSTAL FNYGYQNFTKAQVNGGEVSVPNGVTVDNLTVQENSQNGNTVDDYYYNDYLLGSVEVPEAT PTPEPAVDALSDTSGGNTDQADQNEKADTVGEEKTSAGMPELRKILLIIGAAMLLLIIIL SIALAKKEKRYYG >gi|226332910|gb|ACII01000109.1| GENE 16 22982 - 24112 989 376 aa, chain - ## HITS:1 COG:ECs4724 KEGG:ns NR:ns ## COG: ECs4724 COG0399 # Protein_GI_number: 15833978 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Escherichia coli O157:H7 # 1 373 1 374 376 513 64.0 1e-145 MIDFNRPAFTGREFDYIRDAVQRGMLCGDGEYTKRCSQWMMDKFHVNHVMLTTSCTHALE MAAHLCDIKPGDEVIMPSYTFVSTADAFVLKGAKIVFVDIRPDTMNIDEKLIEAAVTEKT KVIVPVHYAGVACEMDTIMEIAKKYNLKVVEDAAQGVDACYKGKALGTIGDFGCYSFHET KNYTMGEGGAILFNRDEYLEKAEILREKGTDRSKFFRGQIDKYRWIDYGSSYLPSELNAA YLYAQLEARDQIFAKRMEIYNYYHKNLAHLAQEGKIEQPYVPEECSHNAHMYYIKVRDIQ VRTRLIAYLREKGICSVFHYVPLHSAPAGQKFGRFSGEDVYTTKESERLLRLPMFYNLDM EDVKYITDTIASFDGF >gi|226332910|gb|ACII01000109.1| GENE 17 24105 - 25403 1296 432 aa, chain - ## HITS:1 COG:TP0695 KEGG:ns NR:ns ## COG: TP0695 COG0439 # Protein_GI_number: 15639682 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Treponema pallidum # 5 270 4 273 597 62 21.0 1e-09 MKHKVLILGTLGEFTELVKKAKEKGYETVVCDGYADGIARTYADKAYTIPVTDVDAIALM 
CREEGVDGIITSFSDLLLECMVKIADKAGLPCYLKPEQLSWYRDKSACRDVLDKLGLPAP GFRKISVELLKQGSEEEIQQSIANLQYPLISKPLDKYGSRGIFIIHHSDEVRKKALQTAE YTDCQEILVEEYNDGYEFNMMTWVMDGKVNVISIADREKTEMEEGMLPLSTRNVYPSRFL AEVEKSATDILQNYIRYTGQTEGALSMQFFWKPGRGIQVCEIAGRFFGYEHELTDMVYGF QTEELLLDYLYEKDRIKEMFDRHDIYHPVKYGAVLYFQGRQLQIADQTAACELAKEKCVV KPWIFYKEGEHVIEYGPNPYLALYYIETENRRQLEMETEKFFSEMSIRDPEGREVAYRNK IPEYEKEKKTDD >gi|226332910|gb|ACII01000109.1| GENE 18 25400 - 26362 648 320 aa, chain - ## HITS:1 COG:no KEGG:DP0041 NR:ns ## KEGG: DP0041 # Name: not_defined # Def: hypothetical protein # Organism: D.psychrophila # Pathway: not_defined # 4 313 7 319 324 172 33.0 2e-41 MKEIGGYFQLEEMPGEEYYPDLYRVNLGRTALLWLLKSRRCRKILLPYFLCGSVVHTCQE NQIETEFYHLNEKLEVLYPKEQLPEGEYLYLVNYYGQLSDSRISEYKKIYGNIIVDHTHA FFQKPLKGIDTLYSCRKFWGVSDGAYLSTDTSLTENKTVDYSAERMKHILGRYEHNAGTY YKDMLENAAKYDGMELRQMSKLTQNLLKAVDYDRAKKKREENYRILGELLPSESIFNQTV PEGPFAYPYFHADGMKLRRYLAEKKIFVPTYWKNIIENSETKSLEYTWAANILPLPCDQR YSVEDMKYMASVVRECEERI >gi|226332910|gb|ACII01000109.1| GENE 19 26362 - 27297 767 311 aa, chain - ## HITS:1 COG:STM2298 KEGG:ns NR:ns ## COG: STM2298 COG0463 # Protein_GI_number: 16765625 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 5 306 11 312 327 169 34.0 6e-42 MSLYSVVVPVYNSEHTLGELYTRLEKVFRETLKEDFELILVDDGSKDRSYEIMTELREKD HRVRIIQMARNFGQHPALLCGFAHVKGEFVVTMDDDLQHQPEELPKMVRTMQERPDVDVI IASYEGRKHGPIRKLGTKFSVWATSKMLGKDPDLQITSYRLIRRFLVDAMVKTNTYLPQI GNLLIRSSNRIINVPVQHADRAYGKSGYSFKRLLKDLIYDITTHTAFPLLLVRNIGIVSF LISVVLSVCYLVRYFTLGISVQGWTSLMLVMLAFFGLILLSIGIMGIYLMNILNEAKKMP HYVIRREDTDE >gi|226332910|gb|ACII01000109.1| GENE 20 27490 - 27840 164 116 aa, chain + ## HITS:1 COG:no KEGG:CLL_A3239 NR:ns ## KEGG: CLL_A3239 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 8 101 9 98 113 65 44.0 8e-10 MSKLKDYIQLHLNILLFSLTSVFSKAASVQYNKHGLSSPLLYLFLFLMVANCGIYAITWQ 
QVIKKFSLSTAYANKSVYLLWSQIWAVLIFHENLSIQNIIGILVVLFGVWTVQRYE >gi|226332910|gb|ACII01000109.1| GENE 21 27857 - 28171 221 104 aa, chain + ## HITS:1 COG:no KEGG:Shel_14320 NR:ns ## KEGG: Shel_14320 # Name: not_defined # Def: membrane protein # Organism: S.heliotrinireducens # Pathway: not_defined # 1 104 12 115 117 71 37.0 8e-12 MLSGTFFSAVSQILLKQSANIKYENRIREYLNFRVILSYTIYMLILLLNTWCYTKVDMRY GPVIDTAAYVFVLLLSHFILKEKITKGKIMGNLIIIAGILIYTL >gi|226332910|gb|ACII01000109.1| GENE 22 28263 - 29720 1308 485 aa, chain - ## HITS:1 COG:CAC1079_2 KEGG:ns NR:ns ## COG: CAC1079_2 COG5263 # Protein_GI_number: 15894364 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Clostridium acetobutylicum # 316 445 200 319 2566 68 36.0 3e-11 MKRDFLKIRKRLIVGCLAAAIAVAQPVSSVFANPHYDRRDTVAEEEFIYSARTSGTESSR KKVNSKAWKKINGVCYNGSGEIIPGAITRGMDVSEWQGNIDWKQVKRSDIDFAFVRISYG LTHEDYTYDENMTNAELAGVPTGTYVYSTALSTTTALKEAQLAISKMQGYKVSYPVVYDL EYAKASKLSAKTVSEMALTFCNEVRRAGYYPMVYCNTNWYDNYIDWSLLSGVDVWIARYG DTIQAPDKERYNYTIWQSTDGNRESGLNSTSGLVAGIPAGNDVDMDFGYVDYTKKITPRW KSLHSYVPAMKPDTGSNDGSQEQTGLHQENGKYYYVNENGERVSDQWVTVNGKTYYISSD GYALMGMKKVDGKCYWFHTKSGYMFKNRRVTRSTGDIYYFGSDGVRCENGMYKVREKSGE HTYYFRKNGKAYKGWLTLNGKKYYFYKGSSALSGTRAENITLTSSNRIVSVFDGNGVCTR QYKKR >gi|226332910|gb|ACII01000109.1| GENE 23 29920 - 32505 2271 861 aa, chain - ## HITS:1 COG:CAC1079_2 KEGG:ns NR:ns ## COG: CAC1079_2 COG5263 # Protein_GI_number: 15894364 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Clostridium acetobutylicum # 310 691 155 536 2566 71 26.0 7e-12 MKRRLVAVTLSLIMTMATVSEAGAAAFTSPDTAPAVATQTDAEKVNVTEETSETEADAFS DGDDGTAVPSQEQTDDAFTAGTDTAAALPDETPSTADATDAVGAAVVKAEDWVSTESGFK LRKPAKNQDTANTDDQKAATAALPDEEMPSETDGFSGDQTESSETAVAETDTTAVTAEAT AMETPAATDTVSVTNEEFYTAADGIVKISTEYKGETHTGYYLFDEEGILVTGQAEVKEKS SADEEPTSDAAQSAGEDAAGEETEAQANQSYFTTADEAVVYTGCEGEAITPYTSTVGQQE KSVWKWTGTYFQYYDENGNLETIAQLEAKAKAAGTYTGYFKINEDYYCLDPEGKPQTGEI 
TLTVNGESNLYYFDPASSDIPGKMFHNGWLRSDTTKGERWLYFKKGNVPADIGKYYKRGV VATAIPEKGTGAYLLDANGYVLKSVMKKAQNGAYYCTDSNGQIYRNKLVKYGNFRYYFGS NGKRATWTKRWAKAGDHYYYFGSTPGRVVEKHGWQKLVSTSGKFLGWLYFDSKGNHYTDK WTSAGYYFKPSGKLASGLTEIDGKKYIFESSTSAEHKGKVYKSTMVRYKKKWYIASSKGS LYKSGWRKYSGNYYYLKDCVVQTNQFMKKNGVNGYLDANGKYTTGWVIVSNAKNLVRYID PSGNGFARNKSMRVNGILYYFDSNGYRITDLTNRYRGPYSVQVDRVNGVMTVYADSARTI PVKTIRVSVGLAGTPTPTGDFTLSRSLRWQPLMGPSWGQYGTHVDRAGQGGIFVHSVACG QANSYNLPAVEYNKLGSPASHGCIRTCVADAKWVYENCNGAPISIIDGKYKADDAMKGPL GKKALTPLRGAANFDPTDPAV >gi|226332910|gb|ACII01000109.1| GENE 24 32576 - 32839 316 87 aa, chain - ## HITS:1 COG:STM3779 KEGG:ns NR:ns ## COG: STM3779 COG1925 # Protein_GI_number: 16767063 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Salmonella typhimurium LT2 # 1 87 1 87 89 68 43.0 2e-12 MVSKKVTVKNPTGLHLRPAGILCNEAMKYQSQITFVYDGGMANAKSVLSVLGACVKCGNE IELTCEGVDEQEALDHLVTAIDSGLGE >gi|226332910|gb|ACII01000109.1| GENE 25 32892 - 34214 1458 440 aa, chain - ## HITS:1 COG:CAC3078 KEGG:ns NR:ns ## COG: CAC3078 COG4856 # Protein_GI_number: 15896329 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 3 427 2 400 403 67 22.0 6e-11 MKKRKITDNIPLKIMSVAIAVVLWLIVVNIDNPTGTNYYTLNDVELINKEYVESSDTIGK MCMPEQNQDSIKVAITATKKIRDKIKVTDISAVADLQQAVSLDTNPVMVPITVTCSVPGV SPSDIKVTPQNLSVNLDEKETQEFVVNVSRGDTKPGKDYEVGSLTASPEKVRITGPKTLI NKIDKVNASIELDGNTEDFTQDVNLTIIDKNQEVLTDSEMNSLRIENNAKVTVTAKLWKI RQGVQISADYVGTPAEGYQVGAVRTVPDTISVAGSAEGLESLADNNNVITIPEDSIDISG ESEDVEKKISLTNLLPDNVKLTSDSSEDVWVTVSILPAGSREFEFPTKNIEVKNKPKDLQ VTFETAQIEVRIKSDNKDLDDLNNDKDIKASIDLDGKKEGSYEVPVEIVLPDGYETVEDV TTEVVISSGTAVDDSKENKE >gi|226332910|gb|ACII01000109.1| GENE 26 34189 - 35079 857 296 aa, chain - ## HITS:1 COG:BS_ybbP KEGG:ns NR:ns ## COG: BS_ybbP COG1624 # Protein_GI_number: 16077243 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 26 
286 16 270 273 210 45.0 2e-54 MQDFSGNLIETLSRLAFPKIGIIDIIQIALIAFFVYQFMVWIKYTHAYTLLKGILVVLLF ILIAYIFRMNTILWIFSNLASTLIVGVIVIFQPELRKVLEQLGQKRIMASLIPFDAGKEV KERFTDKTISELVKACFDMGEVKTGALIVIEQNELLTDYIRTGINLDAILTSQLLINIFE HNTPLHDGAVIVRENRIVAATCYLPLSDNMELSKQLGTRHRAGVGISEQTDSVTIIVSEE TGQVSVAQGGKLTRGVNSAKLREILVRAQNKQVVDNSKLRHLLKGRVKHEEAKNNR >gi|226332910|gb|ACII01000109.1| GENE 27 35250 - 36347 1372 365 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2782 NR:ns ## KEGG: EUBREC_2782 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 323 1 329 345 177 32.0 4e-43 MTALLELKQRIKNLYSQYEIYILPVLRFVLAMVYFIWINTNMGYMKQIDNIFIVLILALI CSILPSGVMIFVGFALMVAHGYALGIEVAGFMLVLILFMAILFLRFSSDNNLVLVFTPLS FGFSVPTLLPIGSGLLCNAFSALPAGCGVIIYYFIRFIRVQHKLLENPDVAIADKLKLLT DGIVQNWGMWITVIAFIAVILLVNLIRTRSFDYAWRIAIIAGGVVYVLMIIAGGFYFRLD IDVVTLIIYTVISVVIGLLLEFFVFGGDYTRTERLEYEDDEYYYYVKAVPKACVTTSERS IKKINGSSAKDERPAQDNVVSYANPIFHGDEKAVTTDEAAAPVERKKDIDSVDFEKKLEE SLKNL >gi|226332910|gb|ACII01000109.1| GENE 28 36344 - 36778 473 144 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3553 NR:ns ## KEGG: Cphy_3553 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 130 1 130 143 86 41.0 3e-16 MLNEEKIKIMNKLAMYEQGEGKKYLPVSRYYRSDYIGLAMIKNFFLVTIGYCLILAGIAA YFAEYLIDNVHKMNLVSLGVEVILGYVAVLVLFSVLTYIQYTVKYHKAKKSVKNYYEELT QLSKIYGREDKKSSARGVTGGYKK >gi|226332910|gb|ACII01000109.1| GENE 29 36922 - 37626 433 234 aa, chain - ## HITS:1 COG:PA0489 KEGG:ns NR:ns ## COG: PA0489 COG1040 # Protein_GI_number: 15595686 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Pseudomonas aeruginosa # 14 224 15 229 241 107 31.0 2e-23 MKRFLNMVADIFYPRCCPVCQKILADQRRMICPECEKELRPIGHPRCYKCGKPIETGEYC RDCQKHRHMYEQGRGIFVYDGIMRRSVTRYKYYGCREYGDFYARAMYRYAQKELREWKPD LIVPVPVHRSKERQRGFNQAAYLAEKLGHYTGISTDVNIVQKVLKTKSQKKLNALQRRKN LEKAFCVTGDVRGKNILVIDDVYTTGSTIDAMAGCLKRKGAGNVYFLTVCIGRR >gi|226332910|gb|ACII01000109.1| GENE 30 37749 
- 40013 2213 754 aa, chain - ## HITS:1 COG:CAC2854 KEGG:ns NR:ns ## COG: CAC2854 COG0507 # Protein_GI_number: 15896108 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Clostridium acetobutylicum # 9 741 8 732 739 578 44.0 1e-164 MSETVTGYIDHVIFRNEDNGYTVMVLKGTKKEEELTCVGSFPAITQGASIEATGVYIHHP VYGKQFQISSFTEKMPEDTYGIERYLGSGAIRGIGAALAARIVRKFGDDTLRIVEEEPER LAEVKGISEKKAREIAAQVSEKAEMRKVMIFLQKYGISLNLGAKIYQKYKESVYTILQEN PYRLAEDISGVGFKIADEIAARVGIHADSDYRIRSGMLYTLLQASGEGHTYLPREQLFTR CARLLGVDESYMEKHLMDMVIDRKLVLKEKSGETIVYPAQYYYLELNTARMLNELNIVCP EDKELVRHRIELIEKETGTVLDEMQKKAITEAADHGLFILTGGPGTGKTTTINAIIRFFE GEGAEIRLAAPTGRAAKRMTETTGYEAQTIHRLLELNGMPEEERDGHSAKFERNAQNPLE ADVIIIDEMSMVDIHLMHSLLLAVVAGTRLILVGDENQLPSVGPGNVLRDIIRSRCFHVV ELTKIFRQASESDIVVNAHKINKGEQVQINNKSRDFFFLKRYDADIIIRVVIALIQEKLP RYVDAKPFEIQVLTPMRKGLLGVERLNQILQRYLNPPEDGKSERAVGDRLFRTGDKVMQI RNNYQMEWEIRGRYGVVIEKGVGVFNGDTGILREINEFAETAEVEFEDGRFAMYSFKQLE ELELAYAITIHKSQGSEYPAVILPLLSGPQMLLNRNLLYTAVTRARKCVTVVGKEETFAE MIRNEKQQKRYSALDERIRELSETTGDNNTDGEE >gi|226332910|gb|ACII01000109.1| GENE 31 40092 - 41081 1137 329 aa, chain - ## HITS:1 COG:CAC2858 KEGG:ns NR:ns ## COG: CAC2858 COG1077 # Protein_GI_number: 15896112 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Clostridium acetobutylicum # 4 328 7 331 340 335 53.0 5e-92 MPTSDIGIDLGTRNSLAYSTGKGLVLNEPSIVVYDKNTEKIRAIGEEARLMEGRITSDME IIRPIRQGVIVDYTVTEKMLKYFISRAIGRRAFRKPRISICVPSGITEIEKKAVEEATYQ AGARDVYMVEEPIAAAIGAGVDVTKPFGNLVVDIGAGTSDVAVISLAGVVVSASVKVAGD TFNQAILNYVRKNHGLFIGEDMAEKIKIQIGTAIEESNPRTMEVKGRNVITGLPKTVTLT SEEIRVALKDATSQIVETVHGILEKTPPELAADIVDRGIVLTGGGALLHGMDTLIEQRTG VSTLTVQDAMSVVAVGTGKYAEVMARFDG >gi|226332910|gb|ACII01000109.1| GENE 32 41335 - 44175 2757 946 aa, chain - ## HITS:1 COG:CAC0503 KEGG:ns NR:ns ## COG: CAC0503 COG0178 # Protein_GI_number: 15893794 # Func_class: 
L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Clostridium acetobutylicum # 8 945 2 939 939 1236 64.0 0 MAADTTKKHFIRIRGANVNNLKNLSVDIPRDKFVVFTGLSGSGKSSLAFDTIYAEGQRRY MESLSSYARQFLGQMEKPDVESIEGLPPAISIDQKSTNRNPRSTVGTVTEIYDYFRLLYA RVGIPHCPKCGREIKKQTVDQMVDSIMSFPERTKIQLLAPVVRGRKGTHAKLLEQAKRSG YVRVQIDGNLYELSEEISLDKNIKHNIEIVVDRLIVKPGIEKRLSDSIETVLNLAEGLLM VDTMDGNIHNFSQSFSCPDCGISVDEIEPRSFSFNNPFGACPDCLGLGYKMEFDIDLMIP DRSLSILDGAIVVTGWQSCTNEGSFSRAILDALAREYDFSLATPFEEYPEKIQDILINGT NGHSVKVYYKGQRGEGVYDVAFPGLIRNVEQRYRETGSDAMKQEYESFMRITPCKTCKGQ RLKKESLAVTVADKNIYEVTNLSIEKLKAFLADMQLSEQQQLIGRQILKEIRARVSFLSD VGLDYLSLGRATGTLSGGEAQRIRLATQIGSGLVGVAYILDEPSIGLHQRDNDRLLGSLM KLRDLGNSLIVVEHDEDTMRAADCIVDIGPGAGEHGGQLVAMGTAEDLMKNEDSITGAYL SGKLKIPVPLERRKPTGFLTVKGAAENNLKNIDVKIPLGIMTCITGVSGSGKSSLINEIL YKRLARDLNRARVIPGKHKDILGTDQLDKVINIDQSPIGRTPRSNPATYTGVFDQIRDLF AATADAKAKGYKKGRFSFNVKGGRCEACSGDGIIKIEMHFLPDVYVPCEVCKGKRYNRET LEVKYKDKNIYDVLNMTVEEALTFFENVPSIKRKIQTLYDVGLSYIRLGQPSTELSGGEA QRIKLATELSKRSTGKTIYILDEPTTGLHFADVHKLVEILKRLSEGGNTVVVIEHNLDVI KTADYIIDIGPEGGDKGGTVVAQGTPEEVAQSPVSYTGKYVKKYLK >gi|226332910|gb|ACII01000109.1| GENE 33 44336 - 45478 882 380 aa, chain - ## HITS:1 COG:CAC1279 KEGG:ns NR:ns ## COG: CAC1279 COG0635 # Protein_GI_number: 15894561 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 13 379 6 373 374 317 45.0 2e-86 MERRNITDSPMEIYIHIPFCIKKCDYCDFLSGPSGPKEQADYVDALLEEINAAEEGKGRS VSSVFIGGGTPSVLDERFIGEILNHIRRKFQIADHAEITIEVNPGTADRNKLQAYRTYGI NRLSIGLQSPDDRELKILGRIHNYEQFLETYRSAREAGFDNINIDLMSAIPDQTYEGWIH NLRTVAGLDPEHISAYSLIIEEGTPFASRTLNLPDEDAEYNMYEATAQILREYGFEQYEI SNYAKKGRECRHNVGYWIRQDYLGFGLGASSLYGKERFANTQDMKKYLENSRTPEKIREK EPPLTREDEMAEFMFLGLRMTRGISKAEFERQFGSEIDAIYGDVLRKYKSMGLLLEENGR IFLSREGIHVSNSVMADFLP >gi|226332910|gb|ACII01000109.1| GENE 34 45508 - 45855 230 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579917|ref|ZP_04857185.1| ## 
NR: gi|253579917|ref|ZP_04857185.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 115 1 115 115 136 100.0 5e-31 MIKRKTVILCMLLLGISVSGCGKTREVDEASSKVIQISVAPEETSPTPAPDQVVSAAVTT NGNLTMVNTYLAEQDAAGISSAESSDSQSDADSTDNSVDSADNNEQDSSTSDTEE >gi|226332910|gb|ACII01000109.1| GENE 35 45966 - 47774 2434 602 aa, chain - ## HITS:1 COG:CAC1278 KEGG:ns NR:ns ## COG: CAC1278 COG0481 # Protein_GI_number: 15894560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Clostridium acetobutylicum # 4 600 6 602 602 851 69.0 0 MIDQSKIRNFCIIAHIDHGKSTLADRIIEMTGTLTEREMQSQVLDNMELERERGITIKSQ AVRIVYKAKDGEEYIFNLIDTPGHVDFNYEVSRSLAACDGAILVVDAAQGIEAQTLANVY LALDHDLDVLPVINKIDLPSAEPDRVVNEIEDVIGLEAHDAPRISAKTGLNVEEVLEQIV TKIPAPHGNVDAPLKALIFDSIYDAYKGVIVFCRVMDGRVKRGTQIHMMATGFTTEVVEV GYFGAGQFIPCEELTAGMVGYITASIKNLGDTRVGDTVTDKERPCAEALPGYKKVNPMVY CGLYPADGAKYGDLRDALEKLQLNDASLFYEPETSVALGFGFRCGFLGLLHLEIIQERLE REYNLDLVTTAPGVIYKVYKTNGEVINLTNPSNLPDPSEIEYMEEPMVNAEIMVTTEFIG AIMDLCQERRGQYLGMDYMEETRALLKYKLPLNEIIYDFFDALKSRSRGYASLDYELCGY ERSELVKLDILVNKEEVDALSFIVHADTAYERGRKMCEKLKDEIPRQLFEIPIQAAVGSK IIARETVRAMRKDVLAKCYGGDISRKKKLLEKQKEGKKRMRQVGNVEIPQKAFMSVLKLD DK >gi|226332910|gb|ACII01000109.1| GENE 36 48134 - 49438 669 434 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01207 NR:ns ## KEGG: EUBELI_01207 # Name: not_defined # Def: stage II sporulation protein P # Organism: E.eligens # Pathway: not_defined # 149 434 75 362 368 250 45.0 6e-65 MTNFQKAFCILLSMCFFATLAVLPGTFFTGENVYRHLPLYRFLEKRAHTSSCEDLETYKK IAGENGKYLGERAQSEQLLAEKNESEELQTPESTPVPKETAIPSKTVKSAKISESAKISE SEQTSKKSDKTPKPKKITASPENHETDKSQKPQESAAQNDQNIQNSTETIAAAVPHPIID LSPEKLADYNYLLGQFYIVDSNTEADAVQINAGDFLKQDLKIIKETDTPQILIYHSHSQE TFADSREGEESDTIVGVGDYLTQLLTENYGYQVVHLKEQFDMAGGELDRSAAYDYARDYL EPYLQENPDIQVIIDLHRDGIPEDRHLVTEINGKPTAQILFYNGLSYTTSKGSLDYLPNP YIQQNLAFSFQLEYQAAQYYPQFYRGIYLAGYRYNLHLRPRSLLLEAGAQTNTVQEVKNA MEPFADILDKVLQG >gi|226332910|gb|ACII01000109.1| GENE 37 49517 - 50431 953 304 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2317 
NR:ns ## KEGG: Cphy_2317 # Name: not_defined # Def: germination protease (EC:3.4.24.78) # Organism: C.phytofermentans # Pathway: not_defined # 7 299 8 303 307 301 52.0 2e-80 MLDNYGIRTDLALEATERFTEENVEVRGVEVHEDYNEEKDIRTTVVKITTENGARTMGRP QGSYITIEAPGLSVHDEDYHREISLEIARHLQNVINLEREQSILVVGLGNSAITADSLGP HVVENLHITRHMIREYGLQSLGKEKMHRISGIIPGVMAQTGMETSEIIQGIVAETKPDIV IAIDALAARSTRRLNRTIQITDTGINPGSGVGNHRVGLTEENLQVKVIGIGVPTVVDAAT IVHDSMAHLLEALEEAEQKEFLEEMISPHLHTMFVTPKDVDETVKYLSFTISEGLNMAFE EIGG Prediction of potential genes in microbial genomes Time: Sat May 28 20:18:35 2011 Seq name: gi|226332909|gb|ACII01000110.1| Ruminococcus sp. 5_1_39B_FAA cont1.110, whole genome shotgun sequence Length of sequence - 61775 bp Number of predicted genes - 54, with homology - 53 Number of transcription units - 30, operones - 14 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 12 - 71 12.2 1 1 Tu 1 . + CDS 153 - 416 273 ## PROTEIN SUPPORTED gi|160880450|ref|YP_001559418.1| ribosomal protein S20 + Term 451 - 503 10.1 - Term 446 - 484 4.3 2 2 Tu 1 . - CDS 539 - 1480 962 ## COG1466 DNA polymerase III, delta subunit - Prom 1637 - 1696 3.5 3 3 Op 1 . - CDS 1699 - 2616 957 ## COG0379 Quinolinate synthase 4 3 Op 2 1/0.000 - CDS 2636 - 3184 566 ## COG1827 Predicted small molecule binding protein (contains 3H domain) 5 3 Op 3 13/0.000 - CDS 3259 - 4113 510 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 6 3 Op 4 . - CDS 4103 - 5392 1223 ## COG0029 Aspartate oxidase - Prom 5429 - 5488 9.2 7 4 Op 1 . - CDS 5570 - 7867 1582 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) 8 4 Op 2 . - CDS 7896 - 8837 731 ## EUBREC_1615 hypothetical protein 9 4 Op 3 . - CDS 8849 - 10342 1388 ## COG0642 Signal transduction histidine kinase - Prom 10385 - 10444 5.5 10 5 Op 1 1/0.000 - CDS 11036 - 11797 438 ## COG5632 N-acetylmuramoyl-L-alanine amidase 11 5 Op 2 . 
- CDS 11833 - 12267 449 ## COG4824 Phage-related holin (Lysis protein) - Prom 12296 - 12355 3.0 - Term 12285 - 12344 11.1 12 6 Op 1 . - CDS 12377 - 12625 181 ## gi|253579932|ref|ZP_04857200.1| predicted protein - Prom 12716 - 12775 6.6 13 6 Op 2 . - CDS 12827 - 13108 109 ## - Prom 13154 - 13213 5.3 14 7 Tu 1 . - CDS 13290 - 13559 228 ## gi|253579933|ref|ZP_04857201.1| predicted protein 15 8 Tu 1 . - CDS 14024 - 14644 517 ## COG0746 Molybdopterin-guanine dinucleotide biosynthesis protein A - Prom 14880 - 14939 5.3 + Prom 14664 - 14723 11.2 16 9 Tu 1 . + CDS 14862 - 15398 249 ## BcerKBAB4_5408 hypothetical protein 17 10 Tu 1 . - CDS 15381 - 16355 1061 ## COG2896 Molybdenum cofactor biosynthesis enzyme - Prom 16573 - 16632 3.9 + Prom 16464 - 16523 8.0 18 11 Tu 1 . + CDS 16566 - 17384 793 ## COG1526 Uncharacterized protein required for formate dehydrogenase activity 19 12 Op 1 4/0.000 - CDS 17359 - 17814 400 ## COG1763 Molybdopterin-guanine dinucleotide biosynthesis protein 20 12 Op 2 . - CDS 17888 - 19150 1167 ## COG0303 Molybdopterin biosynthesis enzyme - Prom 19229 - 19288 3.6 21 13 Tu 1 . - CDS 19391 - 19825 340 ## Cphy_1481 molybdenum cofactor synthesis domain-containing protein - Prom 20020 - 20079 4.1 - Term 19891 - 19926 2.4 22 14 Op 1 . - CDS 20095 - 21297 1254 ## Cthe_0427 serine phosphatase 23 14 Op 2 . - CDS 21284 - 23008 1533 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain - Prom 23031 - 23090 2.5 - Term 23115 - 23173 14.0 24 15 Op 1 2/0.000 - CDS 23219 - 25927 2834 ## COG3383 Uncharacterized anaerobic dehydrogenase 25 15 Op 2 23/0.000 - CDS 25941 - 27818 2294 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 26 15 Op 3 . - CDS 27815 - 28297 617 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit - Prom 28342 - 28401 6.3 27 16 Tu 1 . 
+ CDS 28771 - 29889 738 ## mru_0223 hypothetical protein + Term 29918 - 29972 15.2 - Term 29905 - 29959 15.2 28 17 Op 1 1/0.000 - CDS 29979 - 30671 966 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 29 17 Op 2 . - CDS 30682 - 31563 756 ## COG1555 DNA uptake protein and related DNA-binding proteins - Prom 31598 - 31657 2.6 30 17 Op 3 . - CDS 31659 - 32093 543 ## COG3238 Uncharacterized protein conserved in bacteria - Prom 32114 - 32173 3.7 - Term 32146 - 32183 6.2 31 18 Tu 1 . - CDS 32208 - 35765 2232 ## gi|253579950|ref|ZP_04857218.1| predicted protein - Prom 35810 - 35869 4.7 - Term 35825 - 35866 -0.8 32 19 Op 1 . - CDS 35880 - 37961 2223 ## ROP_69790 hypothetical protein 33 19 Op 2 . - CDS 37988 - 38362 159 ## gi|253579952|ref|ZP_04857220.1| conserved hypothetical protein - Prom 38401 - 38460 7.9 + Prom 38451 - 38510 8.5 34 20 Op 1 9/0.000 + CDS 38626 - 39342 340 ## COG3279 Response regulator of the LytR/AlgR family 35 20 Op 2 . + CDS 39344 - 40663 548 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Term 40803 - 40846 1.0 - Term 40638 - 40688 15.4 36 21 Op 1 13/0.000 - CDS 40762 - 41952 1579 ## COG4992 Ornithine/acetylornithine aminotransferase 37 21 Op 2 10/0.000 - CDS 41939 - 42835 1305 ## COG0548 Acetylglutamate kinase - Prom 42887 - 42946 1.7 38 21 Op 3 11/0.000 - CDS 42951 - 44174 1448 ## COG1364 N-acetylglutamate synthase (N-acetylornithine aminotransferase) - Prom 44403 - 44462 2.8 - Term 44386 - 44441 -1.0 39 21 Op 4 . - CDS 44479 - 45519 1201 ## COG0002 Acetylglutamate semialdehyde dehydrogenase - Prom 45714 - 45773 7.1 + Prom 45724 - 45783 5.6 40 22 Tu 1 . + CDS 45981 - 47207 1582 ## COG0137 Argininosuccinate synthase + Term 47225 - 47269 8.2 - Term 47283 - 47341 9.0 41 23 Tu 1 . - CDS 47353 - 47727 492 ## gi|253579960|ref|ZP_04857228.1| conserved hypothetical protein - Prom 47757 - 47816 7.5 42 24 Tu 1 . 
- CDS 47891 - 49039 1109 ## COG0628 Predicted permease - Prom 49108 - 49167 4.0 + Prom 49493 - 49552 13.2 43 25 Op 1 2/0.000 + CDS 49718 - 50995 1205 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases + Term 51025 - 51075 8.5 + Prom 51040 - 51099 1.8 44 25 Op 2 . + CDS 51158 - 52534 1529 ## COG2252 Permeases + Term 52556 - 52602 3.8 + Prom 52582 - 52641 9.7 45 26 Tu 1 . + CDS 52834 - 53775 574 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 53782 - 53817 1.0 - Term 54026 - 54082 10.2 46 27 Op 1 . - CDS 54092 - 54472 472 ## COG1321 Mn-dependent transcriptional regulator - Prom 54492 - 54551 4.4 47 27 Op 2 . - CDS 54559 - 55263 858 ## COG0813 Purine-nucleoside phosphorylase 48 27 Op 3 . - CDS 55302 - 56066 968 ## gi|253579967|ref|ZP_04857235.1| conserved hypothetical protein - Prom 56092 - 56151 6.2 - TRNA 56165 - 56245 65.1 # Leu TAG 0 0 - TRNA 56343 - 56415 84.5 # Lys TTT 0 0 - TRNA 56426 - 56497 68.9 # Gln TTG 0 0 - TRNA 56507 - 56580 72.1 # His GTG 0 0 - TRNA 56586 - 56659 82.6 # Arg TCT 0 0 - TRNA 56681 - 56751 75.8 # Gly TCC 0 0 - TRNA 56787 - 56861 81.8 # Pro TGG 0 0 - Term 57180 - 57224 6.0 49 28 Op 1 7/0.000 - CDS 57303 - 57782 567 ## COG0622 Predicted phosphoesterase 50 28 Op 2 . - CDS 57779 - 58363 412 ## PROTEIN SUPPORTED gi|15803493|ref|NP_289526.1| putative deoxyribonucleotide triphosphate pyrophosphatase 51 28 Op 3 . - CDS 58414 - 59565 1327 ## COG0116 Predicted N6-adenine-specific DNA methylase 52 28 Op 4 . - CDS 59599 - 60141 608 ## COG0622 Predicted phosphoesterase - Prom 60186 - 60245 3.4 - Term 60295 - 60343 2.4 53 29 Tu 1 . - CDS 60403 - 60714 253 ## EUBELI_00398 hypothetical protein - Prom 60734 - 60793 4.2 54 30 Tu 1 . 
- CDS 60839 - 61036 179 ## gi|253579973|ref|ZP_04857241.1| predicted protein - Prom 61219 - 61278 7.7 Predicted protein(s) >gi|226332909|gb|ACII01000110.1| GENE 1 153 - 416 273 87 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160880450|ref|YP_001559418.1| ribosomal protein S20 [Clostridium phytofermentans ISDg] # 1 86 1 86 87 109 70 3e-23 MANIKSAKKRILVNRTKAARNKAIKSAVKTSMKKVEAAVANKDKAAAEVALKDAVSTISK ATSKGVYHKNNCARKVARLAKAVNSID >gi|226332909|gb|ACII01000110.1| GENE 2 539 - 1480 962 313 aa, chain - ## HITS:1 COG:BH1337 KEGG:ns NR:ns ## COG: BH1337 COG1466 # Protein_GI_number: 15613900 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Bacillus halodurans # 1 311 18 338 342 118 24.0 1e-26 MYLLYGEEAYLKQQYKQNLVKALNPDGDTMNFNHYEGKGIDVKQLIDLCETMPFFAERRV VLLEDTGFFKNKCEELADYMKELPDYLYLVFAETEVDKRNRMYKAVKACGSIAEFIRQDE KTLMRWAAGILGKAGKKITQRDMELLLTKTGTDMGNLRMEMEKLISYTEGRDVVTAEDIE EICTTQTTNRIFDMVRAVTEKNQKRALELYYDLLTLKEPPMRILFLLAKQYRQLLLAKQF AAAGLAQTEIASKLGVPGFVVRNITTCARAYTISELEQAVKDFVDAEESVKTGRLEDKLS VELLIIKYSSKVK >gi|226332909|gb|ACII01000110.1| GENE 3 1699 - 2616 957 305 aa, chain - ## HITS:1 COG:CAC1025 KEGG:ns NR:ns ## COG: CAC1025 COG0379 # Protein_GI_number: 15894312 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Clostridium acetobutylicum # 6 304 5 302 303 258 42.0 1e-68 MTIAEIQQEILRMKKEQDICILAHAYQGQEILEVADYTGDSYGLSVQASKRDCSGVIMCG VRFMAETCKILSPKKRVWLANPVAGCPMAEQLDLEGLRELKAQYPDYAVVAYINTTSELK TECDVCVTSSSAVEICKRLEQDKILFIPDPNLGHYVAEKLPEKTFAFYKGGCPRHIVVSA KDVEKARAAHPDALFLVHPECRQEVVEQADYVGSTTGIMEFAKKSEHKEFIIGTENSIVE HLQFECPDKKFYPVAVQLTCMNMKVTTLMDIYNCLKGQGGEEIFLPENIMEGAGRCIHRM VELGG >gi|226332909|gb|ACII01000110.1| GENE 4 2636 - 3184 566 182 aa, chain - ## HITS:1 COG:lin2129 KEGG:ns NR:ns ## COG: lin2129 COG1827 # Protein_GI_number: 16801195 # Func_class: R General function prediction only # Function: Predicted small molecule binding protein (contains 3H domain) # Organism: 
Listeria innocua # 6 168 9 172 173 124 39.0 8e-29 MNTVQRRTEILKLLQQEEKPVAARAMASQFGVSRQVIVQDMAVIRASTPGILSTTRGYVL QQDKDIACTREFKVRHGQEQAAEELNLIVDCGGRVKNISISHRVYGRVTAEMDIRSRQDV NEFVQAINSSHSSVLSSATSGYHYHLIEASSQERLDLIGEQLKKAGFLAPLQPWEKEKHH NI >gi|226332909|gb|ACII01000110.1| GENE 5 3259 - 4113 510 284 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 1 279 1 283 286 201 38 1e-50 VMNSITMKMQADKLIRMALQEDITSEDVSTNAVMRSAVKGTVDLIAKEDGIIAGLDVYAR VFQILDEKTEISFNFKDGEAVKKGNLLGTVTGDIRVLLSGERVALNYLQRMSGIATYTKQ VSKLLEGSKVTLLDTRKTTPNCRVFEKYAVRIGGGCNHRYNLSDGVLLKDNHIGAAGSVA KAVAMAKEYAPFVRKIEIEVETMEQVKEAVEAGADIIMLDNMTPEMMKEAVELIAGRAQT ECSGNITKENIAKILETGVDFVSSGALTHSAPILDISMKNLHAI >gi|226332909|gb|ACII01000110.1| GENE 6 4103 - 5392 1223 429 aa, chain - ## HITS:1 COG:FN0009 KEGG:ns NR:ns ## COG: FN0009 COG0029 # Protein_GI_number: 19703361 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Fusobacterium nucleatum # 6 385 7 382 435 392 49.0 1e-109 MKEHYDVIIVGTGAAGLYCALNLPEKMKILLITKQEADQSDSFLAQGGICMLRGEDDYEN YFEDTMRAGHYENNKRAVDLMIRSSNDIIRDLLRYHVDFARDGLGNLAFTREGAHSQPRI LFHEDITGKEITQTLLTAAKIKENIEICEYMTMVDLISKDNTACGIIAMDRNDQAFPVYS QYVVLACGGIGGLFKNSTNFSHIAGDGVGIAMKHHVEMENLDYIQIHPTTLYSKKPGRRF LVSESVRGEGALLYDKNGQRFTNELQPRDLLSQKIFAQMEKDGTEFVWEDMRPLGEKTIL EHFPHIYEQCVEEGFDPRKEPIPVVPAQHYFMGGIKVNLGSKTSMKGLYACGETSCNGVH GRNRLASNSLLESLVFARKAADDMIFGQTPEYVRADAIDMNMYESREELLNACHETVLKE IERMKKSHE >gi|226332909|gb|ACII01000110.1| GENE 7 5570 - 7867 1582 765 aa, chain - ## HITS:1 COG:lin1517_2 KEGG:ns NR:ns ## COG: lin1517_2 COG2333 # Protein_GI_number: 16800585 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # Organism: Listeria innocua # 503 756 1 252 259 128 32.0 4e-29 MRRRPVCILCMLLVVFLCVTDWLGFSLIRGNPLPQSVQTWIRKHPESTICGEVVRCRENE DFQSVYLRNTYLIYNSEKVSIDNIKVYLKQKKNHSGNSDVDKLLAGSLVLVSGKLEEVQS 
PTNPGEFDSKAYYGCQRIYYVMKKGKIKKQSQSHSVYGQFLIDMQQKFAGILEKTCGMEA GAFEAIVLGDKTNLDPELKMRYQMAGIIHILAISGLHISLLGMGLYNLLKKIGLGIWPAG LLALVIMLQYGMMTGGTVSTMRAVCMFLLSVGAKIAGRIYDMPTGMAAAAILILMENPAY LLDGGFLLSFGSVIGIGCVWPMVQEGMDVLNRKKRSKVNEKGKIRNKLLMSFLASGVVQL TTLPIVLWFYGEVSVMGIFLNLLVLPTVGIVLGSGTAGALLGLVTVRGAFLAVVPGRIIL RGYEFLTVLLGRLSFCTWIGGKPEVWQIVGYYLVLATAVWIYRAGVMKSENGKIFAWKIR AVYAGMVCFAILLISYRPHEDFRIACLDVGQGDGIVVEIENRWNILIDGGSTNKNELGKY QLLPYLKSRGISRLDGIYVSHTDEDHISGVRELLEFVEKDLTSLRIENLILPKWSDIQEN KNYRELTELAESAGVRVLTVKAGDEIRYGTVRLKVLWPESTASGKEVNEDAMVLEMISKD FKGLFTGDIGMVTEEKLIQNGCLEDVDFLKTAHHGSRYSTGAEFLEIVRPELAVVSCSAT NTYGHPSPDTLERLKKSGSRVLITRDCGAVTIVNGKSVSAFNRIK >gi|226332909|gb|ACII01000110.1| GENE 8 7896 - 8837 731 313 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1615 NR:ns ## KEGG: EUBREC_1615 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 286 7 288 320 143 32.0 8e-33 MKKKLSIIITALILSILAAGCSVEIHEDSGKQDAKYYLYDISSDETQLQKESYSPEETTA DYMLKDMMQRMNSQQTDSQNRVSLLPEGVQMNYSVEEDVLVVNFNSQYSSMSRARELLVR AGVVKTFLQVPGINSVRFTIENEDLTDSRGQAVGNMTADTFAEFTGTEPDAYCSNTFTLY FTDKSGQKLVKEQRTVRYKRSIPKERIVLEQLMKGPLEKGHYPTIPENTEVLNVTIADRI CYVAFDGVFSSYALDVSEKIPVYSVVNSLLDALDADKVQITVGGKDRLDTFGKKMELYHF YERNDKLVVKEKE >gi|226332909|gb|ACII01000110.1| GENE 9 8849 - 10342 1388 497 aa, chain - ## HITS:1 COG:BH2263 KEGG:ns NR:ns ## COG: BH2263 COG0642 # Protein_GI_number: 15614826 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 222 488 190 456 473 152 33.0 1e-36 MTVLRTDRKHASNCMTEENGVEKVKKRKNFFRSLRFRILIILIILGIVPGVIVTYTMLHN YQNRAVAMLTETVGDQCDILCNLIIRENYLNDTDSEVVNSKLELFSNVYNGRILLADRDF KIVSDTFHTEEGKTLLSSLAVACLKGEETSNYDARSKVLELAVPVQSPDVQQLQGVMLVS VSAVDIAETLSEMEQRGVMLIGSIVVLSVFLAWLLSTILVKPLARVTKAIEDLTDGMLDE EISVPDYTETELITDAFNKMVNRMKILDESRQEFVSNVSHELKTPLTSMKVLADSLVGQQ GVPEELYQEFMSDITAEIDRENKIITDLLSLVKMDKKAADVNITHMDINQLLEDILKRLR PIADRRNIDLILDCFRPVDADVDEVKFTLAVSNLVENGIKYNVDDGWVRVSLDADHKYFY 
ITVADSGMGIPEDSIERIFERFYRVDKSHSKEIGGTGLGLAITKSSIAMHHGTIKVFSKE GEGTTFSVRIPLSYIPS >gi|226332909|gb|ACII01000110.1| GENE 10 11036 - 11797 438 253 aa, chain - ## HITS:1 COG:BS_yqeE_1 KEGG:ns NR:ns ## COG: BS_yqeE_1 COG5632 # Protein_GI_number: 16079624 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus subtilis # 19 168 23 167 182 102 39.0 7e-22 MNINQQYISERNSYSGQIPRYIVIHNTDNYSTGADARAHAMAQYHGNFDGYSAHVYVDDK SAYQAMPYSRGAWHVGVNYGGRLFGTVNNRNSVGIEMCVQAGFDYDKAFANTAEVCRQLM SELNIPADRVVQHYDVCAKNCPSAIRAKNDWNRFKKLIQGKEKENSPSGGKEITLTEELR VIFPELSRGCTGTAVKMLQVFLQVQADGIFGTETENALRTFQKNTNQLADGICGRNSWMA IAVHMRENTYAGY >gi|226332909|gb|ACII01000110.1| GENE 11 11833 - 12267 449 144 aa, chain - ## HITS:1 COG:CAC1842 KEGG:ns NR:ns ## COG: CAC1842 COG4824 # Protein_GI_number: 15895117 # Func_class: R General function prediction only # Function: Phage-related holin (Lysis protein) # Organism: Clostridium acetobutylicum # 20 133 17 123 125 60 31.0 7e-10 MKYRIITLIGLAGSTLAVCFGGWDKFLQTLLLFMAIDWFTGGILLPAVFGKSPKSENGAL ESRAGWKGLCRKGMTLLYVLIAARLDALMGTEYLRDAVCIGFIANEGLSIIENAGLMGLP LPESICQAIDVLKRHSQNTIEKQT >gi|226332909|gb|ACII01000110.1| GENE 12 12377 - 12625 181 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579932|ref|ZP_04857200.1| ## NR: gi|253579932|ref|ZP_04857200.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 82 38 119 119 149 98.0 6e-35 MRECTITGELYISVICGKKDNKDFQISDTQNTIFTKKEFVNKNGEQISYNGYLGKEKPKN YVIHLDGEEISIDWNESDIMIV >gi|226332909|gb|ACII01000110.1| GENE 13 12827 - 13108 109 93 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLINLIFYAMVTGTNYHRTHPGESKFQAGITYNNVRIFIYSIVFCLGLAIMKKMKKLAS KWIITWAILCTVFLIVMSFCEIENAYVSFSTAD >gi|226332909|gb|ACII01000110.1| GENE 14 13290 - 13559 228 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579933|ref|ZP_04857201.1| ## NR: gi|253579933|ref|ZP_04857201.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 89 117 205 205 179 100.0 4e-44 MELIELPEDYKKCSSGGEAVVRKNNGSYTVGFWYTQGFLDSGFSLFAYSNDDSRTDVIQM VKEYGGIEYEINLSEKEKGWYYVTAKVGE >gi|226332909|gb|ACII01000110.1| GENE 15 14024 - 14644 517 206 aa, chain - ## HITS:1 COG:Cj1350 KEGG:ns NR:ns ## COG: Cj1350 COG0746 # Protein_GI_number: 15792673 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin-guanine dinucleotide biosynthesis protein A # Organism: Campylobacter jejuni # 1 186 1 185 191 73 29.0 2e-13 MRKKEISMIILAGGASSRMGRDKSDLTIDGKTFLEMQIEKGEKLGISDILLSGYHGENKY KYPIIPDRFPGKGPLGGLEACFRKAKNPYCLVLGVDVPLVPAEELAALIRQSLHSDAKAV ILSHGGHEEPLMGVYCTDLADAMLEEITERKGAVFAFLRKNGYECYESQAAAWYFSNIND SETYKEIARNHFRFNWKTVMRVDRNV >gi|226332909|gb|ACII01000110.1| GENE 16 14862 - 15398 249 178 aa, chain + ## HITS:1 COG:no KEGG:BcerKBAB4_5408 NR:ns ## KEGG: BcerKBAB4_5408 # Name: not_defined # Def: hypothetical protein # Organism: B.weihenstephanensis # Pathway: not_defined # 3 170 9 171 190 67 30.0 3e-10 MNWYIVDKKYINYLTQFDSHVGYVEYGERLKLHVGTLLTIGNFHYYVPISSAKPKHQKMS NSLDFHKLQDESTRYLYAVLNINNMIPVPDNCLTQLKYNQIDCFRSFKSDKEKTDYIYLL QKEKFLIDNIQNTLQNKAMKLYQKCIAKPDSSLAARCCNFKMLEEKCLSYLHISPANL >gi|226332909|gb|ACII01000110.1| GENE 17 15381 - 16355 1061 324 aa, chain - ## HITS:1 COG:lin1039 KEGG:ns NR:ns ## COG: lin1039 COG2896 # Protein_GI_number: 16800108 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Listeria innocua # 3 324 6 333 333 224 36.0 3e-58 MIDKYGREIDYLRISLTDRCNLRCIYCMPEEGVKSLSHAEILTYDEILRICRCAADLGIR KIKLTGGEPLVRKGCASLAKQIKAIPGIEKVTLTTNGILLAEQLDALLDAGIDAINISLD TLDPELFKKVARRDGLEKVFEGIKAALAHPGLSLKINSVPVIKEKENFIGIAGLAQKYPI HVRFIEMMPIGFGKQFPFQDEESIKAVLEEAYGPMHAVNERYGNGPCHYYEIAGFKGKIG FISAMTHKFCSQCNRVRLTSEGFMKGCLQYQKGTDLRGLMRGGCTDSELKKAIYQVIWNK PVSHNFYQTKTEQDEIRGMSQIGG >gi|226332909|gb|ACII01000110.1| GENE 18 16566 - 17384 793 272 aa, chain + ## HITS:1 COG:lin2729 KEGG:ns NR:ns ## COG: lin2729 COG1526 # Protein_GI_number: 16801790 # Func_class: C 
Energy production and conversion # Function: Uncharacterized protein required for formate dehydrogenase activity # Organism: Listeria innocua # 25 267 15 256 261 121 28.0 1e-27 MKINEQYHNLNLLENVTYEFLGRDGETHEETEPILVEHMMDVYVNERLTMKLVCIPQHLT ELVLGRLFTEGIISSAEDVEQIYICEFGKRARVILSKNKSGQSHEENEDYVAPTPTCCTG NRVLNDYFITDQPMTPVTPFPWKKQWIFDLADCFANGTPLHSQTWATHSCFLACDGELLF QCEDIGRHNALDKAIGYALRHNIDLKKCVVYSSGRIPTDMAIKAIRAGIPVLASKASPSA EAVAMAKEYHLTLICAARRDRMKLFTGNNPTE >gi|226332909|gb|ACII01000110.1| GENE 19 17359 - 17814 400 151 aa, chain - ## HITS:1 COG:AGc3121 KEGG:ns NR:ns ## COG: AGc3121 COG1763 # Protein_GI_number: 15889005 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin-guanine dinucleotide biosynthesis protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 145 23 162 171 71 29.0 7e-13 MICRLLEIFKDKGLKVAVLKHDGHDFVPDVPGTDTYCQLQSGAYGTAVFSAGKYMLVKQQ PQISEKELAEFFPEADLILLEGFKYSTYPKIEIIRKGNSAESVCNPEKLMAIATNLDAEE RDALSVLENVPFFELDNAECIAEFILSDYFR >gi|226332909|gb|ACII01000110.1| GENE 20 17888 - 19150 1167 420 aa, chain - ## HITS:1 COG:BH3021 KEGG:ns NR:ns ## COG: BH3021 COG0303 # Protein_GI_number: 15615583 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Bacillus halodurans # 4 420 8 411 423 243 36.0 4e-64 MEGISVEQAVEQILQHTPVINETEETELNKAGGRVLAQDMVAGFDNPPFDRSPVDGYACK AEDLAGATKEHPVRLKVMEEIDAGQYSEREIQQGQAVRIMTGAAIPNGCNCCIYQEDTDY GEDTVEVYSEQKRWSNYCFSGEDFKKGTTLLKKGTHIGYVEAAILAGMGVAKVPVYRQPR IVLLTTGDEVVELGLPLPEGKIYNSNMTMLSARLRELGIESFHMEAVKDDPTVMSEKLKE AAKAADMIITTGGVSVGKKDIMHESLRLMGAERIFWRVKMKPGMPTLFSAYKKTSDITAT DHCASEREIPIISLSGNPFGVAVSIELLIRPALEKMMQDPSIGLKEVNGIMADNFEKGIK GRRFIRAYLENGKFHLPNGLHSNGVLSSMAGCNCLIDTKTMENQESRSLNAGDKVGAILL >gi|226332909|gb|ACII01000110.1| GENE 21 19391 - 19825 340 144 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1481 NR:ns ## KEGG: Cphy_1481 # Name: not_defined # Def: molybdenum cofactor synthesis domain-containing protein # Organism: C.phytofermentans # Pathway: 
not_defined # 1 101 1 100 310 73 38.0 2e-12 MAKILCVCISKERKTCSQNIHECEADLQGLVGDVHYGMGGRKQVSLLPYEKVKEYFEQSE EEFRYGRFGENLLVEGINWNSIAEGNRFRCGDVLLEAVRIGAGGPASDAYKGEKVCTPME PWFVFCQILEAGRLREGEIILQQR >gi|226332909|gb|ACII01000110.1| GENE 22 20095 - 21297 1254 400 aa, chain - ## HITS:1 COG:no KEGG:Cthe_0427 NR:ns ## KEGG: Cthe_0427 # Name: not_defined # Def: serine phosphatase # Organism: C.thermocellum # Pathway: not_defined # 16 398 4 389 389 328 42.0 3e-88 MKRNKHHGLNVEFDGLCMDFGHVSLNKKKEFLCGDCYKIEENDEDHVLVLSDGLGSGVKA NILSTLTATMLSTMIINQVELDEAVRAVAKTLPVCSVRNLAYATFTVLNFQGKQVSLYQF DNPDAILIRDGKLFDYPVETSMIEEKEIHKSCFELKDEDMLIIMSDGVTNAGMGKTTNGG WGRDDVMAFCRAKYHKGMSAQEMAGYLAEASLDLNLNETDDDITAIVLRMRHKQVVNLMI GPPSKEEHDERYLKSFFDSEGYHVICGGTTAQCAARYLDKELISLSETACGDVPAYYKLE GTELVTEGFLTIEHLLEYCEEWMEDRLSFNRLKRKKDAAALLGVMLFEQATDINFYFGGA INKSNVELGITEKRKEQVAEELVEHLQNAGKNVNIAFWAI >gi|226332909|gb|ACII01000110.1| GENE 23 21284 - 23008 1533 574 aa, chain - ## HITS:1 COG:TM1421 KEGG:ns NR:ns ## COG: TM1421 COG4624 # Protein_GI_number: 15644172 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 11 300 8 282 301 136 30.0 1e-31 MQRFQSILQFKKVSCKNCYKCVRNCPVKAIRVHDHQARIIESQCIYCEKCILVCPQDAKE EQNMIPAICNAMENKTQVIASLHPAYLARFGVTGINGIREAMKKLGFADAADAAEGASLM ISQYRALFAEQKEPGIMISSACPVIVQLVKKHYPHLLGNLVQSASMMQFHASYLKKKYPK AEIVYVSPCISAMSELRDSGNEVDYVITLEELVQWMKKEGISITEKEPEQIAYRSREIAI ADGLTDLLGAVKGIRRLSVSGMEQCREVLEELHPEDFENCFLEMYACSGGCVAGPSFQMK KGHYLADVLAVKNAAFGKDFHREQGDYELPGFELRKNFGYCSAQEQEEVSEEEIRDALAE MGKFSPKDELNCGACGYNTCREKAIAIIQNKAEVAMCIPYMRAKQESYSNKVFNAMPGLL VTVDYNLKIIQMNEAASKLFNMPKKRRLIGKPVSEIMDDYSLASILAFERNLMQDEIYLE DQKCYLDRVMTNDKENKMILCIMKDITKERKHKDQIHNAQIEAARMADKLVEEQLKIVQQ IAGLLGETAADTKVAVEKLKNTILLESEEENEKK >gi|226332909|gb|ACII01000110.1| GENE 24 23219 - 25927 2834 902 aa, chain - ## HITS:1 COG:MTH1552 KEGG:ns NR:ns ## COG: MTH1552 COG3383 # Protein_GI_number: 15679548 # Func_class: R General function 
prediction only # Function: Uncharacterized anaerobic dehydrogenase # Organism: Methanothermobacter thermautotrophicus # 18 899 2 861 865 698 42.0 0 MIHLTIDGIPVEAEKGTTILQAARQIGVEIPTLCYLEDVLPDGSCRLCVVEVTNNGRTKF DTACTLRCSEGDEVQTMSEKVVAYRKDTLDLLLSDHRVHCFSCEANGDCKLQDYCFEYGV TETSYPGEMKDMPIDDTNKFFTYDPSLCILCHRCVNTCKEIVGRGAIDTMNRGFQSVIGA HYKHKWNEGICESCGNCVQACPTGALTMKRRKKYRPYQIDKKVLTTCPHCATGCQYYLLV KDGKIVDTEAVNGPSNKGLLCVKGRSGCFDFVQSPERIKYPLIKNKETGEFERATWDEAL DLVASKFMEIKKQYGSDSLAGFACSRSPNEDIYMVQKMVRCCFGTNNTDNCARVCHSASV AGLAMTLGSGAMTNPIEDITKNADLIMLVGSNPEEAHPVVGMQIRQAIKRGCKLIVVDPR DIGLAKKADIHLKLKPGTNVAFANGIMNVILSEGLQDDKFIAERTEGFEELKEIVKDYTP EKVAEICHIDADDLRKAAIMYAKADRAPIIYCLGVTEHSTGTEGVMSMSNMAMMVGKLGR EGCGVNPLRGQNNVQGACDMGAQPNVYPGYQKVTDPAVREKFEKAWGVKLDPNIGTHATD VFPKAITGEIKGLYIYGEDPVVTDPDTTHIIKALKSLDFFVLQELFMTETAQYADVILPG VSYAEKEGTFTNTERRVQRVRKAVTVPGEMRLDTDIIIDLMNRMGYPQPHLTSAQIMDEI ASLTPSFAGISHERLDSEEVHGQGLQWPCTSKDHPGTPIMHVGKFSRGLGWFYPAKYVPS AELPDEEYPIILMTGRILYHYTTRAMTGKTPELMEIEGKSFIEMNIVDADKLGIKNGDKV RVSSRRGSIESTARVGTKTSPGESWMPFHFPDGNANWLTNAALDKYARIPEYKVCAIKIE KA >gi|226332909|gb|ACII01000110.1| GENE 25 25941 - 27818 2294 625 aa, chain - ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 26 554 8 525 527 565 53.0 1e-161 MIQNKEVLEQIREEAVKELNSYDCRILVCSGTGCVATGSQKIYEKFMEIAKDAPGVTIEF GPHDKDAHVGVKKTGCQGVCELGPLVRIQKGDDVIQYTKVQIEDCQEIFEKSVQGNETIE RLLYQKGGKVSRGPEDIPFIAKQTRIVLKNCGKFDAESLNEYIASGGFQALEKAMFEMDP DLVIDQVDKSGIRGRGGGGFPAGKKWIQVARQADPVHYVVCNGDEGDPGAFMDGSVMEGD PYRLIEGMMLAAYAVRAQHGYIYVRAEYPQSVARLKHAIAVLEEVGLLGDDILGTGFSFH MHINRGAGAFVCGEGSALTASIEGNRGMPRVKPPRTVEQGLWAKPTVLNNVETYANIPEI ILKGADWYRSIGTEGSPGTKTFSLTGSIENTGLIEVPMGTTLREIIFDIGGGLKSGAAFK AVQIGGPSGGCLTEEHLDVPLDFDSVKKYGAIVGSGGLVVMDEHTCMVEVARFFMGFTQR ESCGKCGPCRIGTKRMLEILERIVDGKGEVEDLDKLEHLGNFIKDRSLCGLGKSAPLPVL 
STLRNFRDEYVEHIVDKKCAAHVCKAMRSYVIDPEKCKGCTKCARNCPVGAITGNKKEPH SIDTSKCIKCGTCLENCVFGAISVN >gi|226332909|gb|ACII01000110.1| GENE 26 27815 - 28297 617 160 aa, chain - ## HITS:1 COG:TM1424 KEGG:ns NR:ns ## COG: TM1424 COG1905 # Protein_GI_number: 15644175 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 7 159 8 159 164 143 45.0 1e-34 MLDQSFYQKADEIIAFYGKKPASLIPIMQDIQGVYRYLPEELLTYVAEQIGVTEAKAFSV ATFYENFSFDAKGKYVIKVCDGTACHVRKSIPVLEELYKKLGLSKTKKTTDDMMFTVETV SCLGACGLAPTMMVNEEVYPRMTPEKADELIDKLRGGEQG >gi|226332909|gb|ACII01000110.1| GENE 27 28771 - 29889 738 372 aa, chain + ## HITS:1 COG:no KEGG:mru_0223 NR:ns ## KEGG: mru_0223 # Name: not_defined # Def: hypothetical protein # Organism: M.ruminantium # Pathway: not_defined # 4 249 7 246 363 77 28.0 1e-12 MFSLKEKEVNTGRQRELDLVKGFLLIMIVFIHSFQTIGGVAAAESNVHKILFALFMPTGA CLYLFTMGFGSAFTRHSQPKDMVKNGIKLLFYQGLSNLCYAAVMTISFNIRNSITVEAAG SRELYDANLYSMLTFVNIFFIAGMCYLVLAVYRKLNVSLRGYVISAVIVGVISPFTKLLV SDDPALNWILDMTFGGKGETSFCFFPYLSYVFLGYVFGKVLRRIPEDEKGNFYKESGIIC GITAAVWFICCIVLHPGIEGFFNYMIEQYRIPGLAKVLGSFCSIIFVFAAAFRIMPMMEK WKFGYNKLCYYSKQISKMYAVHIGVYWTLAGFAAFYEFGVKECLILSVAALIVTDLLVHG YIIITDKIKNRK >gi|226332909|gb|ACII01000110.1| GENE 28 29979 - 30671 966 230 aa, chain - ## HITS:1 COG:BH4027 KEGG:ns NR:ns ## COG: BH4027 COG0745 # Protein_GI_number: 15616589 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 2 228 3 229 236 228 51.0 5e-60 MSRKVLVVDDEKLIVKGIRFSLEQDGMEVDCAYDGEEAVEKAKDKKYDIILLDLMLPKMD GLEVCQQIREFSNVPIVMLTAKGEDMDKILGLEYGADDYITKPFNILEVKARIKAIMRRA GSDHEEKEKAKSIQVGDLRMDCEGRRVFIAGKEINLTAKEFDVLELLVFNPNKVYSRENL LNIVWGYEYPGDVRTVDVHIRRLREKIETNPSEPKYVHTKWGVGYYFQAQ >gi|226332909|gb|ACII01000110.1| GENE 29 30682 - 31563 756 293 aa, chain - ## HITS:1 COG:BS_comEA KEGG:ns NR:ns ## COG: BS_comEA COG1555 # 
Protein_GI_number: 16079613 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Bacillus subtilis # 122 292 53 204 205 92 38.0 7e-19 MIKIKNRCCYTVTLILCGTLFLTGLTGCKSREAQFLLEGLQEAKAEVDAESSEEKTSGQK SKKDTDEKKADTEDRQNDGGNSAEFRKKQAESDGSDAGNGTGSDSGKHTSDVDIDNGSEA VSDKEMQQAMIYVDVCGAVANPGVFQLAAGSRVFQAIEAAGGYLPEAALTCVNRAGVLTD GQQLYILTQEEMERQGLDPAEMSGASDGQMNGSAGTGQNTGMNAQVQQDNRININTADEA QLTTLTGIGATRAQAIIAYREENGPFAAIEDIMNVQGIKEGTFAKIKDEIVVG >gi|226332909|gb|ACII01000110.1| GENE 30 31659 - 32093 543 144 aa, chain - ## HITS:1 COG:CAC3547 KEGG:ns NR:ns ## COG: CAC3547 COG3238 # Protein_GI_number: 15896783 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 14 141 14 140 143 79 37.0 3e-15 MWGFIVALISGALMSIQGVFNTEVTKQTSLWVSTGWVQLSAFAVCVLAWLFTGRESVAAL WQVDNKYTLLGGVVGAFITITVIQSMGALGPAKAAMLIVISQLAVAYVIELAGLFGVDKE PFEWRKILGLLIAIFGIIIFKWQK >gi|226332909|gb|ACII01000110.1| GENE 31 32208 - 35765 2232 1185 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579950|ref|ZP_04857218.1| ## NR: gi|253579950|ref|ZP_04857218.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 1185 1 1185 1185 2076 100.0 0 MKKKLLAGILTLALCSANILPQTIFAEEFTSGNPEEVSEEETPEVFTDENQESAGETEEE LSVFSEGTSEFEDNNNETNKPNEEPQIVDVDLTSNSITGNVYTIASGDTAYRFTSNGTET SNRITISDTITAAKSICIYLKNVNIKTSEGPALLIPEKLNAKVHIYLEGENVLKTSNRNS AALQKENTFNLIIDNAPNSAGSLTATAIDDNNCAGGAGIGGAANRIGKNITIKGGAITVK SIYGAGIGGGYMGNGEDITIEGGFINAESIYGAGIGGGSNGNGGYPSQGVCYASNIIISG GKIITTSSAGAGIGGGSYSYGSNINISDGLITASSDFGAGIGNGGASKNGSNINIFGGLI TASSKFEAGIGGKYKNNEGSSTIISNGLITASSQSNKDISGDNITISGGFVKAAKLGCQP KNSSSQEVYRCIIENPEKANVTIKTEIGNLIWEGESVNHSSLDPDDTNLYVWLPKPEDAS ENSYYKIILKSETSSGFRTRNYSFDTDTNTFKARQVVNDFVFKSPAYTTYNGEPKEASLE FKFKPTPENNREISLVYYKGNYNDINDSTQPLNTLPVNAGTYTVKAEIAASESYFAHKKL ESPKWTFTIEKAPVAPGADFNETTIFVPWSCKKISDITRPFSSDWKWRTDVNLDQKLQVG KTITATAIYKGADKGNYEKETITYTIIRNECEHKNTVGRYYSSPSCTSNGYSGDTYCNDC ERTIYYGSTIPAYGHDYDAGVIITEPTAETDGIVIYTCKRCKHQDTKNIGKLGDGEPYIE GSFQKKGWDAVNNLIRTSKEKDTISITLNGARTLPASVLSGIKGKDISLNLDMENGFIWK INGASITSETPADTDLFVTNTAEYIPEALYRLVSTNQNDFGFHLGRNGDFDFPAVLSVKA DASCAGLMANLFWYDVENKVLQCIQTVTVSGAFERSIPYADFTLSKGQDYFIAFGTESLN GRVIHTDGSITDENGAYLRPANAKISSHSIDRNKLTVKLSKGCEGAQGYDFVISKKSDML KTGKFSKTVSSTGKPQASFKYLSKGTWYVAARSWVLDAQGNKVYGSWTKVKKIKITVVTP QQPKIRDITVKGNTVTVTYTKCKNATGYEILLGTKYKTSAGEKYPVKKYVKRTEGKNTVT VTFTNVKKGTWYVTVRSWNKTSKDKSRVYSPYSTMKKFTVCFFRN >gi|226332909|gb|ACII01000110.1| GENE 32 35880 - 37961 2223 693 aa, chain - ## HITS:1 COG:no KEGG:ROP_69790 NR:ns ## KEGG: ROP_69790 # Name: not_defined # Def: hypothetical protein # Organism: R.opacus # Pathway: not_defined # 156 397 134 338 478 70 26.0 3e-10 MKKRALILAIIAFIATTTSEPSINMVRAEETQVEETQVEETQTEENAGQTSELPIKTLTP KVIDENPYMAASDSNIHHDCYNTDSTDEVLPVDIYSEINVSYEKVNPNASPAVFFDSYGH SVVPLLGGLAIRDINADETQTLGYFSPKQHDNGSYLIQSSYSFVDESNRIVCPTNDNRVL MLKATDEEGNVLPEFEKVLDIDIKAAAEAALGKTLDQNLLSVVFDYEGNLWFATGGFRIY PDRKQQGTFGYVSRAAIDKILNGEDVDLSDAVFVYELEPGEGAENGIAASKEGAVILTNL KCYLLQADNGVKKVWETSYKSVGAKESKEGDETTGGGLAWGGGCSPSLTKDLVMFTDNQD PVNLIAVDMKTGEQVASMPVIDELPEGTQVSVENSAIVYDDGEGTVSTIVCNWFGAGSAK 
LGEADNDSSIQSYENIYDVGWLRQGNKMIAPGIERVDTVKTEDGYEMKSIWCRSDLSDTS MMKLSTATGYIYGYVQDMETGMWQYIMLDFETGETAFTMDISDKPGYNNMAIGMYAGNSG NALYCPTGYLELLRLQDRFVYLPEMPYRKVDLDQAMRNVLSQEKFAADGGQGDVEGWLNT ITVENVHPNTTVAIRMKGISGETGSLKLYAYGTDGTLKEVPEEKWHIQTEDGETPDTLTE DVLYEVHMTVEDGGDFDLSETEKEIKISAVLGA >gi|226332909|gb|ACII01000110.1| GENE 33 37988 - 38362 159 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579952|ref|ZP_04857220.1| ## NR: gi|253579952|ref|ZP_04857220.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 124 1 124 124 254 100.0 1e-66 MYCIAILTDQEQEGQNCAEYIRNYCTEKKVFPLIEIYQNQEQFFGRIRKTVPAVVFLALP GVSGLNAAEHLRSLYPKCGIIWCSDLDFSLHAFRMRIEYFFMKPVEEQKIKEGLSIWFER RNVI >gi|226332909|gb|ACII01000110.1| GENE 34 38626 - 39342 340 238 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 233 2 230 234 97 26.0 2e-20 MRIAIVDDISEERTLLRNRLESQFSRRNVHVDIFEYENGETFLTAAKECPFTVVFLDIYM NGSNGIDTAKELRRSDTDCLLIFTTTSTDHALEGFQVRALHYLVKPYSENDISALTDEIL SRIPDSGKYIDVKVNGSNIQIPFRKIIYAEHFSHMIHIHTAGERELITRQSFDSFITSLK MDPRFYQCNRGVVINLEHAVDFDGSGFRLDNGSNVPVSRKLLKNARQTFMEFLFQRRS >gi|226332909|gb|ACII01000110.1| GENE 35 39344 - 40663 548 439 aa, chain + ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 167 439 181 452 452 73 24.0 8e-13 MTDILRPVLELSVIIPGILLAYLPVKNYLRQTPLKLTAWLLPLLLGICILGGAVCCALQI PTRWFLFPLLPVIMLIYHKTLKISVWKSVSIFLAVFAVFTCVKSLSRAVNALMTTDLHIT ENELWLHTGAGIFYNVICLLFVLAAWYPACHCVQTMVTDENFAQTWYVFWILPVIFIGVN LFMIPKHRSTLYTGRILQCYIVFSLVLLIILLLFYVLFLMMAVSLNRNARLQQENHFLSL QQERYENLCMAIEEARQARHDIRHHFVRLSTLAEQGDIEKIKEYLSAATGKISDYNLHFC 
ENQAVDSVFGYYSALAERENIPFHAVVSLPAELSLDEINLCLVFSNLLENAIQASVKTVP ARRKINVEVYPHHNHLLLIHVENTFDGKIQQKNNIFQSSKHSGNGIGIESVRHITDKNGG ACDFTYKDGIFSAKIMLRI >gi|226332909|gb|ACII01000110.1| GENE 36 40762 - 41952 1579 396 aa, chain - ## HITS:1 COG:Cj0227 KEGG:ns NR:ns ## COG: Cj0227 COG4992 # Protein_GI_number: 15791599 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Campylobacter jejuni # 6 396 2 393 395 435 53.0 1e-122 MNMNEQMKESEESILHTYNRFPVVFEKGQGCYLYDSEGKEYLDFAAGIAVNSLGYHYPGY DEALKDQIDKLMHISNLYYNEPIIKAGAKLVKASKMSKAFFTNSGTEAIEGALKAAKKYA YVRDGHADHEIIAMNHSFHGRSIGALSVTGTAHYREPFEPLMGGVKFADFNDFESVKAQI TDKTCAIITEVVQGEGGIYPAKKEFLEGLRQICDEKDIMLIFDEIQCGMGRTGHYFAWQA YGVQPDIMTSAKALGCGVPVGAFVLNEKAAKASLEPGDHGTTYGGNPFVCAAVSKVFDIY ENDKIIEHVQEMTPYLEQKLDEIVAKHECAATRRGMGFMQGIVIQGRPVGEVVKAALAKG LLVISAGSDVLRIVPPLVITKEHIDKMAAILDECME >gi|226332909|gb|ACII01000110.1| GENE 37 41939 - 42835 1305 298 aa, chain - ## HITS:1 COG:Cj0226 KEGG:ns NR:ns ## COG: Cj0226 COG0548 # Protein_GI_number: 15791598 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Campylobacter jejuni # 1 285 1 280 281 326 60.0 3e-89 MVNQKYLDKAEVLIEALPYIQRFNRKIVVVKYGGSAMLDDELKKNVIKDVVLLKLVGFKP IIVHGGGKEISRWVGKVGMEPKFINGLRVTDKDTMEIAEMVLAKVNKELVAMVESLGVNA VGISGKDGGLLKCRKKQTEGGDIGFVGEVTKVEPKILEDLLEKDFLPIIFPVGYDDEFAT YNINADDAACAIAEAVHAEKLAFLSDIEGVYKDKDDPSSLISELHVDEAQKLIDDGYVGG GMIPKLKNCIDAIEEGVNRVHILDGRIPHSLLLEIFTNKGIGTAILREDGEKYYNEHE >gi|226332909|gb|ACII01000110.1| GENE 38 42951 - 44174 1448 407 aa, chain - ## HITS:1 COG:TM1783 KEGG:ns NR:ns ## COG: TM1783 COG1364 # Protein_GI_number: 15644527 # Func_class: E Amino acid transport and metabolism # Function: N-acetylglutamate synthase (N-acetylornithine aminotransferase) # Organism: Thermotoga maritima # 12 407 5 397 397 386 53.0 1e-107 MKIIDGGVTAAKGFKAAAAAAEIKYKGRTDMAMVYSEVPCVAAGTFTTNVVKAAPVKWDQ DIVYNHPSAQVVICNSGIANACTGAEGYGYCKETADAAAEILGVAADSVLVASTGVIGMQ 
VPIDRIKNGVKMMAPKLDGSLEAGTEAAKAIMTTDTKKKEVAVQIEIGGKTVTVGGMCKG SGMIHPNMCTMLGFVTTDAKISKKMLQEALSEDVKDTYNMVSVDGDTSTNDTVLLLANGL AENPEITEKGEDYETFKAALNYINTTLAKKIAGDGEGATALFEVKIIGAESKEQAVTLSK SVVTSSLTKAAIYGHDANWGRILCAMGYSGAKFDPEKVDLFFESKAGKIQIIENGVAVDY SEEEATKILSEEAVTAIADVKMGDATATAWGCDLTYDYIKINADYRS >gi|226332909|gb|ACII01000110.1| GENE 39 44479 - 45519 1201 346 aa, chain - ## HITS:1 COG:Cj0224 KEGG:ns NR:ns ## COG: Cj0224 COG0002 # Protein_GI_number: 15791596 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Campylobacter jejuni # 2 339 3 336 342 406 57.0 1e-113 MIKAGIIGATGYAGNEIVRLLLGHKDVEVAWYGSRSYIDQKYADVYQNFFKLVDAKCMDD NMEALADEVDVIFTATPQGLCASLINEGILSKAKVIDLSADFRIKDVKKYEKWYGIEHKA PQFIDEAVYGLCEINREEIKKARLIANPGCYPTCSTLSIYPLIKEGLIDPSTIIIDAKSG TSGAGRGAKVANLFCEVNENIKAYGVASHRHTPEIEDQLGYACGKEVLINFTPHLIPMNR GILVTAYASLTKDVSYEEVKAVYDKYYENETFVRVLDKDVCPQTKCVEGSNYVDVNFKID PRTHRVIMMGAMDNLVKGAAGQAVQNMNLMFGFKEAEGLLQVPMCP >gi|226332909|gb|ACII01000110.1| GENE 40 45981 - 47207 1582 408 aa, chain + ## HITS:1 COG:CAC0973 KEGG:ns NR:ns ## COG: CAC0973 COG0137 # Protein_GI_number: 15894260 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Clostridium acetobutylicum # 1 398 1 394 400 434 53.0 1e-121 MKEKVVLAYSGGLDTTALIPWLKETFDYDVVCCCVNCGQGNELDGLDERAKLSGASKLYI EDIVDEFCDDFIVPCVQAGAVYEHKYLLGTSMARPAIAKKLVEIARKEGAVAICHGATGK GNDQIRFELGIKALAPDIKIIAPWRMTDKWTMQSREDEIAFCKAHGIDLPFDASHSYSRD RNLWHISHEGLELEDPSQAPNYDNMLVLGVTPEKAPDKETEVTMTFEQGVPKTLNGKEMK VSEIITELNKLGGENGIGIVDIVENRVVGMKSRGVYETPGGTILMAAHEQLEELTLDRET METKKKLGSQFAQVVYEGKWYTPLREAIQAFVESTQKYVTGEVKFKLYKGNIIKAGTTSP YSLYNESLASFTTGDLYDHHDADGFITLFGLPLKVRAMMLKEVEQNKK >gi|226332909|gb|ACII01000110.1| GENE 41 47353 - 47727 492 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579960|ref|ZP_04857228.1| ## NR: gi|253579960|ref|ZP_04857228.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 124 7 130 130 218 100.0 1e-55 MDILKGKRYDIMNVLWEEGRPLSAFEINDIAPELKMPTVRRCLELLLKENLIQVAGTSMN GKVYARNYTPLLSREDYLKGNAKSRKINPVEMMNALLESDGITVEELEKLQELIDKKRAE LESR >gi|226332909|gb|ACII01000110.1| GENE 42 47891 - 49039 1109 382 aa, chain - ## HITS:1 COG:CAC0730 KEGG:ns NR:ns ## COG: CAC0730 COG0628 # Protein_GI_number: 15894017 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Clostridium acetobutylicum # 1 380 1 380 383 218 38.0 2e-56 MQLDKENIKKIRWLIAFAILLYLGVQNLHIVISTVRVLLGFLFPFIIGFGIAFILNIPMK FIEHHLFGKALKQEKKTAQKLARPVSLVLSICFVICIIVIVMLVVVPELGATFVNIAKKI EENIPVFQKWIDNVFGNNPEVVKWAQSLDIEPGKIIDSVLGVLKNGVNNIVSSTVSITMG LLTTAMNVSIGFVFACYVLLQKEKLLQQIKKAMYAMFPEKPVRYLAHVWNLANRIFSNFI TGQCIEAVILGSMFFVSMTILHFPYAMLVGVLISFTALIPLFGGIIGCWVAFFLILMISP VKAVLFLGLFLILQQIEGNLIYPHVVGGSVGLPSIWVLVAVTLGGSLMGIAGMLIFIPTV SVIYTLFREWVYARLEKKQLVL >gi|226332909|gb|ACII01000110.1| GENE 43 49718 - 50995 1205 425 aa, chain + ## HITS:1 COG:CAC0282 KEGG:ns NR:ns ## COG: CAC0282 COG0402 # Protein_GI_number: 15893574 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Clostridium acetobutylicum # 7 425 9 424 428 410 49.0 1e-114 MNERTFVLKGTICYSNSLTELSITENGYLVCEDGRCAGVFDELPEKFAGISCTDFGDELI IPGLTDLHLHAPQYTFRASGMDLELLDWLNTYTFPQEARYEDTEFAKEAYSIFTEDMKKS PNTRACIFGTLHVPATEILMEQLDKTGIKAMVGKVNMDRNGSPQLQEESAQASADATVQW IKDTLDKFENVKPILTPRFTPSCSDELMEKLSKIQKRYHLPMQSHLSENFGEIAWVKELC PNTHFYGEAYSQFGLFGGDCPTIMAHCVHSSDEEIALMKKQGVYIAHCPQSNTNLSSGIS PARRYLDEGLHIGLGSDIAGGTSVSILRAMADAIQVSKLYWRLVDLSMKPLTVEEAFYMG TEGGGSFFGKVGSFKEGYEFDAVVLNDSTIPTPLKLSPKDRLERLIYLSDDRNITAKYVA GRKIL >gi|226332909|gb|ACII01000110.1| GENE 44 51158 - 52534 1529 458 aa, chain + ## HITS:1 COG:MJ0326 KEGG:ns NR:ns ## COG: MJ0326 COG2252 # Protein_GI_number: 15668500 # Func_class: R General function prediction only # Function: Permeases # Organism: Methanococcus jannaschii # 3 458 6 434 
436 353 49.0 3e-97 MNLDKLFHLKENHTDVKTEVMAGITTFMTMAYILAVNPNILEASGMDRGAVFTATALSSF IATCLMAALSNYPFVLAPGMGLNAYFAYTVVLGMGYSWQQALAAVFVEGIIFILLSLTNV REAIFNAIPMNLKHAVSVGIGLFIAFIGLQNAKIVVNNDSTLVSVFSFKSSVSGGTFNTE GITVLLALIGLLITAILLVKSVKGNILWGILITWGLGIICQLTGLYKPDPAAGWFSLLPD FSNGISIPSMAPTFMKMDFSIVFTLDFVVIMFAFLFVDMFDTLGTLIGVASKADMLDKEG KLPNIKGALLSDAVGTTVGAMCGTSTVTTFVESASGVAEGGRTGLTSLIAAILFGLSLLL SPIFLAIPSFATAPALIIVGFLMLTSVTKIDFNDLTEAIPAFIAIIAMPFLYSISEGISM GVISYVIINVVTGKAKEKKISVLMYILAVLFILKYIFI >gi|226332909|gb|ACII01000110.1| GENE 45 52834 - 53775 574 313 aa, chain + ## HITS:1 COG:CAC0076 KEGG:ns NR:ns ## COG: CAC0076 COG0697 # Protein_GI_number: 15893372 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Clostridium acetobutylicum # 8 312 6 292 303 235 43.0 9e-62 MQKQQIPLKNSLLLLLAAIIWGIAFVAQSVGMKYVGGFTFNAVRSLIGAVVLIPLIFILK KRNSPSDSASKASGRSDTSSNTVSNMQEKKALIIGGIACGICLCLASNFQQFGIKYTTVG KAGFITACYIVIVPVIGLFLGKKCTKFIWAAVAMALIGLYLLCITDGFSIGKGDLLVLVC AFLFSLHILVIDYFSPKVDGVKLSCIQFLTCGVLSGIPALLLEHPELSSILAAWQPILYA GVMSCGVAYTLQIIGQKNMNPTVASLILSLESCISVLAGWIILRQQLSTKEILGCVIMFA AIILAQLPQKQTE >gi|226332909|gb|ACII01000110.1| GENE 46 54092 - 54472 472 126 aa, chain - ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 5 121 3 120 122 100 49.0 8e-22 MQIRKSAEDYLEAILVLSKQGGGVRSIDIATMLGVSKPSVSHAMKLLREDGYIAMDRYGT VTLLEKGNDIAVHIYERHTVLSKMLEHLGVSPEVAKEDACKLEHDLSDESFTRIKEHFHK VIGEVE >gi|226332909|gb|ACII01000110.1| GENE 47 54559 - 55263 858 234 aa, chain - ## HITS:1 COG:SA1940 KEGG:ns NR:ns ## COG: SA1940 COG0813 # Protein_GI_number: 15927712 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Staphylococcus aureus N315 # 1 234 1 
234 236 297 61.0 1e-80 MSIPTPHNSAKKGDIAKKVLMPGDPLRAKYIAETYLEKPVCFNTVRNMFGYTGTYKGEQI SVMGSGMGMPSMGIYSYELYNFYDVEKIIRIGSAGALQDDVSVMDVVIAMSACTDSNYAS QYQLPGAFAPTASYNLVARAVEVAKEQGTPVRVGSVISSDVFYGDNPESSKAWRKMGVLC VEMECAALFMNAARAGKEALGILTISDHVFRDEAISAEARQTSFNRMMEIALGI >gi|226332909|gb|ACII01000110.1| GENE 48 55302 - 56066 968 254 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579967|ref|ZP_04857235.1| ## NR: gi|253579967|ref|ZP_04857235.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 254 1 254 254 317 100.0 4e-85 MDIQKFQQKLTEVCELGEKNGKVLKPEQIKECFGELDLDKSQLIKILQYLKLKGISIEGA EEISAASQTGPEEVSEEKEEKVPLTAEEEAYLKEYLEGLEEQEQGERSAEELFELLSKGD ALAQAELSQKYLHAAAEMAVEMNCEEIFIADLIQEANISLLMALGEEEPEEKDEKWLLGR IRCGIRHAIEEQTQRKFEDDYLVAKVEKLEAAVRELTEDEEDESSKFSVEELAIILDMDE EEIRDVLRLTGDDK >gi|226332909|gb|ACII01000110.1| GENE 49 57303 - 57782 567 159 aa, chain - ## HITS:1 COG:BS_ysnB KEGG:ns NR:ns ## COG: BS_ysnB COG0622 # Protein_GI_number: 16079887 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Bacillus subtilis # 1 147 3 146 171 95 40.0 5e-20 MKVLIVSDTHGLDENLEETVLREAPFDYLIHCGDVEGREIFIEALAECPCTIVAGNNDFF TDLPYEEEVTLEGHKILVTHGHHYFVSRDYDKLVENAQAKGCKIAMYGHTHMPVIENEDG ILVINPGSLTYPRQRGRRPSYAVMQIEEGKDPQVEIRYL >gi|226332909|gb|ACII01000110.1| GENE 50 57779 - 58363 412 194 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15803493|ref|NP_289526.1| putative deoxyribonucleotide triphosphate pyrophosphatase [Escherichia coli O157:H7 EDL933] # 1 194 1 195 197 163 44 2e-39 MKKIIFATGNQDKMREIREILADMDVEVVSMKEAGIHADIVEDGSTFEENAVIKAKTICE LTGEITLADDSGLEIDYLNKEPGIYSARYMGEDTSYHIKNASLIERLNGVPDEKRTARFV CAVAAAFPDGSVKTVRGTMEGRIGYEEKGENGFGYDPIFYLPEYGCTSAELSGEEKNKIS HRGKALRAIKDELK >gi|226332909|gb|ACII01000110.1| GENE 51 58414 - 59565 1327 383 aa, chain - ## HITS:1 COG:BH1771 KEGG:ns NR:ns ## COG: BH1771 COG0116 # Protein_GI_number: 15614334 # Func_class: L Replication, recombination and repair # Function: Predicted N6-adenine-specific DNA methylase # 
Organism: Bacillus halodurans # 1 379 1 378 385 397 52.0 1e-110 MEKFELIAPCHFGMEAVLKREILDLGYEITKVEDGKVTFEADAQGIADANIFLRSTERIL LKVAEVKAETFDELFEKTKALPWENYIPKDGKFWVAKANSVKSKLFSPSDIQSIMKKAIV ERLKGVYGVSWFEEDGASFPIRVAFMKDVATIGIDTSGVSLHKRGYRKMTVKAPITETLA SALIMLTPWNKDRILVDPFCGSGTFPIEAAMMAADIAPGMNRSFLAEEWKHLVPRKCWYD ANEEAQDRINLNIETDIQGFDIDPEALKAARANAKMAGVDKLIHFQQRAVKDLRHPKPYG FIITNPPYGERLEEKENLTQLYREIGESYERLDKWSMYLITSYEKAENDIGRKADKNRKI YNGMLKTYYYQFLGPRPPKKNRS >gi|226332909|gb|ACII01000110.1| GENE 52 59599 - 60141 608 180 aa, chain - ## HITS:1 COG:CAC2749 KEGG:ns NR:ns ## COG: CAC2749 COG0622 # Protein_GI_number: 15896006 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Clostridium acetobutylicum # 1 173 1 173 180 188 49.0 5e-48 MKYMFASDVHGSAYYCRKMLDVYKEEKAERLVLLGDLLYHGPRNDLPREYAPKQVISMLN DMKKEIYAVRGNCEAEVDQMVLQFPVMADYCILNLDGRTFYATHGHIYNENNLPPIQEGD ILIHGHTHVLKAEQKEGYVLLNPGSVSIPKEGNPPTYAIFEDGVFTIKDFEGNIVKSIEL >gi|226332909|gb|ACII01000110.1| GENE 53 60403 - 60714 253 103 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00398 NR:ns ## KEGG: EUBELI_00398 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 7 96 2 107 112 73 35.0 2e-12 MFPVETISAKMLDYYVDRRDTLIIDLREKESYMHSHVKGAVNMPYGEIDGYTAFPKGKIL VLYCDRGGASLLLARQLAKLGYCTRSVIGGFGAYRGRNLVISP >gi|226332909|gb|ACII01000110.1| GENE 54 60839 - 61036 179 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579973|ref|ZP_04857241.1| ## NR: gi|253579973|ref|ZP_04857241.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 65 1 65 65 73 100.0 5e-12 MAKNYESESKNSRNYTGTSDKNSYGNEMNHAGNVKNSVKSKNETDCRSKAKNKTSQSYSS EEDRY Prediction of potential genes in microbial genomes Time: Sat May 28 20:20:30 2011 Seq name: gi|226332908|gb|ACII01000111.1| Ruminococcus sp. 
5_1_39B_FAA cont1.111, whole genome shotgun sequence Length of sequence - 508 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 17 - 247 112 ## gi|291520905|emb|CBK79198.1| hypothetical protein + Term 295 - 338 5.1 2 2 Tu 1 . - CDS 350 - 508 92 ## gi|153811529|ref|ZP_01964197.1| hypothetical protein RUMOBE_01921 Predicted protein(s) >gi|226332908|gb|ACII01000111.1| GENE 1 17 - 247 112 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291520905|emb|CBK79198.1| ## NR: gi|291520905|emb|CBK79198.1| hypothetical protein [Coprococcus catus GD/7] # 1 76 1 76 76 119 100.0 4e-26 MLRKLNCFLNIVIGSFIGVFIGFGIYKFWHFKTYPNLYAMQSAPWYTELLLDGALVAVLV VVCIILKLIIWRKLKP >gi|226332908|gb|ACII01000111.1| GENE 2 350 - 508 92 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153811529|ref|ZP_01964197.1| ## NR: gi|153811529|ref|ZP_01964197.1| hypothetical protein RUMOBE_01921 [Ruminococcus obeum ATCC 29174] # 1 52 66 117 117 104 100.0 2e-21 EEISLSTGDWLKIAPAAKRQFFASDISGITYICIQVKENSLEHFTAEDAVIG Prediction of potential genes in microbial genomes Time: Sat May 28 20:21:00 2011 Seq name: gi|226332907|gb|ACII01000112.1| Ruminococcus sp. 5_1_39B_FAA cont1.112, whole genome shotgun sequence Length of sequence - 69886 bp Number of predicted genes - 58, with homology - 58 Number of transcription units - 36, operones - 13 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 3.6 1 1 Op 1 . + CDS 143 - 475 245 ## COG1733 Predicted transcriptional regulators 2 1 Op 2 . + CDS 562 - 747 145 ## gi|253579977|ref|ZP_04857244.1| conserved hypothetical protein + Prom 774 - 833 2.4 3 2 Tu 1 . + CDS 860 - 1270 138 ## gi|253579978|ref|ZP_04857245.1| conserved hypothetical protein + Term 1319 - 1365 7.2 + Prom 1509 - 1568 7.4 4 3 Op 1 . + CDS 1660 - 2847 1186 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 5 3 Op 2 . 
+ CDS 2844 - 4178 886 ## COG1070 Sugar (pentulose and hexulose) kinases + Term 4257 - 4302 5.2 + Prom 4474 - 4533 5.2 6 4 Tu 1 . + CDS 4613 - 6406 1900 ## COG1283 Na+/phosphate symporter + Term 6447 - 6493 6.8 + Prom 6471 - 6530 3.1 7 5 Tu 1 . + CDS 6616 - 6870 364 ## gi|253579982|ref|ZP_04857249.1| predicted protein 8 6 Tu 1 . + CDS 7004 - 7945 496 ## COG3177 Uncharacterized conserved protein 9 7 Tu 1 . - CDS 8108 - 9241 798 ## COG1940 Transcriptional regulator/sugar kinase - Prom 9281 - 9340 8.5 + Prom 9371 - 9430 9.5 10 8 Op 1 . + CDS 9523 - 11139 2090 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases 11 8 Op 2 . + CDS 11187 - 11324 169 ## gi|291547208|emb|CBL20316.1| Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily. + Term 11348 - 11392 5.0 + Prom 11370 - 11429 8.6 12 9 Tu 1 . + CDS 11460 - 11849 452 ## gi|253579986|ref|ZP_04857253.1| predicted protein + Prom 11865 - 11924 2.6 13 10 Tu 1 . + CDS 11956 - 12471 297 ## COG0716 Flavodoxins + Prom 12486 - 12545 5.1 14 11 Tu 1 . + CDS 12573 - 13637 977 ## EUBELI_01556 hypothetical protein + Prom 13669 - 13728 2.1 15 12 Op 1 . + CDS 13755 - 15314 1436 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes 16 12 Op 2 . + CDS 15394 - 16482 1452 ## COG3347 Uncharacterized conserved protein + Term 16485 - 16531 12.1 - Term 16470 - 16521 14.1 17 13 Tu 1 . - CDS 16572 - 19229 2925 ## COG1012 NAD-dependent aldehyde dehydrogenases - Prom 19283 - 19342 8.6 - Term 19664 - 19720 8.0 18 14 Tu 1 . - CDS 19738 - 21570 2305 ## COG1217 Predicted membrane GTPase involved in stress response - Prom 21768 - 21827 8.1 + Prom 21761 - 21820 12.2 19 15 Tu 1 . + CDS 21996 - 22622 536 ## COG0629 Single-stranded DNA-binding protein + Term 22623 - 22661 1.0 + Prom 22892 - 22951 6.2 20 16 Op 1 . 
+ CDS 23110 - 23601 591 ## COG2109 ATP:corrinoid adenosyltransferase + Term 23618 - 23653 -0.4 21 16 Op 2 6/0.000 + CDS 23681 - 24574 1232 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 22 16 Op 3 . + CDS 24640 - 25401 1025 ## COG0289 Dihydrodipicolinate reductase + Term 25474 - 25518 9.1 - Term 25462 - 25506 9.1 23 17 Tu 1 . - CDS 25549 - 26871 1108 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 27005 - 27064 5.0 + Prom 26930 - 26989 4.1 24 18 Tu 1 . + CDS 27063 - 27575 514 ## gi|253579998|ref|ZP_04857265.1| predicted protein + Term 27700 - 27739 0.8 + Prom 27640 - 27699 4.7 25 19 Op 1 . + CDS 27900 - 28667 865 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase + Prom 28673 - 28732 3.6 26 19 Op 2 . + CDS 28825 - 29397 474 ## PROTEIN SUPPORTED gi|28210085|ref|NP_781029.1| SSU ribosomal protein S30P + Term 29461 - 29513 10.1 + Prom 29523 - 29582 6.2 27 20 Tu 1 . + CDS 29773 - 31245 1405 ## COG1640 4-alpha-glucanotransferase + Term 31312 - 31358 2.9 + Prom 31268 - 31327 8.0 28 21 Op 1 . + CDS 31479 - 34055 3302 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) 29 21 Op 2 . + CDS 34454 - 35920 1583 ## COG0733 Na+-dependent transporters of the SNF family + Term 35960 - 36019 17.0 - Term 35948 - 36007 3.2 30 22 Tu 1 . - CDS 36050 - 36910 469 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 37021 - 37080 5.3 + Prom 37362 - 37421 1.6 31 23 Tu 1 . + CDS 37586 - 38434 1102 ## COG0253 Diaminopimelate epimerase + Term 38472 - 38544 24.2 + Prom 38547 - 38606 7.7 32 24 Tu 1 . + CDS 38793 - 39782 1140 ## COG1186 Protein chain release factor B + Term 39795 - 39829 3.2 + Prom 39836 - 39895 6.2 33 25 Op 1 . + CDS 39939 - 40394 415 ## COG4492 ACT domain-containing protein 34 25 Op 2 . + CDS 40401 - 41615 1695 ## COG0460 Homoserine dehydrogenase + Term 41723 - 41756 0.7 + Prom 41622 - 41681 6.5 35 26 Op 1 . + CDS 41779 - 42987 1485 ## COG0527 Aspartokinases 36 26 Op 2 . 
+ CDS 43002 - 44024 1000 ## COG5523 Predicted integral membrane protein + Term 44072 - 44121 8.6 + Prom 44037 - 44096 6.0 37 27 Tu 1 . + CDS 44174 - 45454 1269 ## COG2234 Predicted aminopeptidases + Prom 45585 - 45644 9.4 38 28 Tu 1 . + CDS 45778 - 47937 207 ## PROTEIN SUPPORTED gi|88810653|ref|ZP_01125910.1| 30S ribosomal protein S1 + Term 47949 - 48005 9.6 - Term 47937 - 47993 10.4 39 29 Tu 1 . - CDS 48025 - 48918 683 ## COG0583 Transcriptional regulator - Prom 48992 - 49051 6.2 + Prom 48898 - 48957 5.8 40 30 Op 1 . + CDS 49082 - 49315 418 ## gi|253580015|ref|ZP_04857282.1| predicted protein 41 30 Op 2 . + CDS 49338 - 49637 211 ## COG3326 Predicted membrane protein + Term 49697 - 49735 4.0 + Prom 49718 - 49777 5.3 42 31 Tu 1 . + CDS 49945 - 50805 875 ## COG0668 Small-conductance mechanosensitive channel + Term 50814 - 50864 10.1 + Prom 50960 - 51019 8.3 43 32 Op 1 38/0.000 + CDS 51197 - 52807 2054 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Prom 52832 - 52891 2.5 44 32 Op 2 49/0.000 + CDS 53021 - 53941 1101 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 45 32 Op 3 44/0.000 + CDS 53955 - 54854 1100 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 46 32 Op 4 44/0.000 + CDS 54867 - 55856 575 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 47 32 Op 5 . + CDS 55860 - 56834 855 ## COG4608 ABC-type oligopeptide transport system, ATPase component 48 32 Op 6 . + CDS 56915 - 57697 852 ## COG1349 Transcriptional regulators of sugar metabolism 49 32 Op 7 . + CDS 57796 - 58542 749 ## COG3142 Uncharacterized protein involved in copper resistance 50 32 Op 8 . + CDS 58542 - 61043 2274 ## HMPREF0868_1076 transglutaminase-like protein + Prom 61086 - 61145 1.6 51 32 Op 9 . + CDS 61172 - 62482 1422 ## COG3669 Alpha-L-fucosidase + Term 62493 - 62537 9.4 - Term 62476 - 62529 11.3 52 33 Tu 1 . 
- CDS 62563 - 63081 484 ## Shel_24810 hypothetical protein - Prom 63107 - 63166 4.3 + Prom 63126 - 63185 7.2 53 34 Op 1 . + CDS 63327 - 64022 906 ## COG2243 Precorrin-2 methylase 54 34 Op 2 45/0.000 + CDS 64019 - 64861 676 ## COG1131 ABC-type multidrug transport system, ATPase component 55 34 Op 3 . + CDS 64935 - 65666 572 ## COG0842 ABC-type multidrug transport system, permease component + Term 65692 - 65736 9.6 + Prom 65842 - 65901 5.4 56 35 Op 1 . + CDS 65985 - 67052 933 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain 57 35 Op 2 . + CDS 67055 - 68785 1490 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain 58 36 Tu 1 . - CDS 68789 - 69685 746 ## COG0583 Transcriptional regulator - Prom 69716 - 69775 3.4 Predicted protein(s) >gi|226332907|gb|ACII01000112.1| GENE 1 143 - 475 245 110 aa, chain + ## HITS:1 COG:BH0737 KEGG:ns NR:ns ## COG: BH0737 COG1733 # Protein_GI_number: 15613300 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 10 109 14 113 118 96 46.0 9e-21 MRAKEELPECPVATAVSLIGGKWKLLILRNLKERPWRFNELQRSIDGISQKVLTESLRQM MSDGLAYRHDYHEQPPRVEYGLTELGTKMLPIVNSLADFGNYYKSIIEQN >gi|226332907|gb|ACII01000112.1| GENE 2 562 - 747 145 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579977|ref|ZP_04857244.1| ## NR: gi|253579977|ref|ZP_04857244.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 61 1 61 61 117 100.0 2e-25 MAKKEERFEVIFRDGSSLKDEGIRQILVDKETGVNYLCWKSGYGAGITPLLDSEGTVIIT K >gi|226332907|gb|ACII01000112.1| GENE 3 860 - 1270 138 136 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579978|ref|ZP_04857245.1| ## NR: gi|253579978|ref|ZP_04857245.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 136 1 136 136 216 100.0 3e-55 MKKKIILSIAFIISLLPMFLNQYGGLKGVQEITGLINLLNPIGMVSVILFAVGVWFPFKE QVVGKSLGALGTIGIVVSEIYKFFTWHVMNITGEVSIHKSIRFAFPEFYIGLIISILMVV TYFVIDKKVSATSVSN >gi|226332907|gb|ACII01000112.1| GENE 4 1660 - 2847 1186 395 aa, chain + ## HITS:1 COG:CAC2832 KEGG:ns NR:ns ## COG: CAC2832 COG0436 # Protein_GI_number: 15896087 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 1 393 1 392 393 419 54.0 1e-117 MIAEKMKPYVKNNSAIRTMFEEGNRLRAIYGADKVYDFSLGNPSVPAPECVKEAIIDLVN EVEPTVLHGYMSNAGFEDVRQTIAESLNHRFGTKFAAKNLIMTVGAASGLNVIFKTILNP EEEVIVFAPYFLEYGAYVRNFDGKLVEISPDTTTFQPNLEEFEQKITAKTRAVIVNTPHN PTGVVYSEETIKKLAAILEKKQQEFGSVIYLISDEPYRELAYDGVEVPYLTKYYDNTIVG YSYSKSLSLPGERIGYLVIPDEADGSEELITAAAIANRTIGCVNAPSLMQKVIAKCVDAE VDVAAYDKNRLALYNGLKECGFECIKPQGAFYLFVKSPVADEKAFCEAGKKYNILMVPGS SFACPGYVRLAYCVSYDTIINSLPEFKKLAEEFGL >gi|226332907|gb|ACII01000112.1| GENE 5 2844 - 4178 886 444 aa, chain + ## HITS:1 COG:HI1113 KEGG:ns NR:ns ## COG: HI1113 COG1070 # Protein_GI_number: 16273038 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Haemophilus influenzae # 5 444 22 465 511 83 24.0 7e-16 MKTAGIDIGTTTISGVVLEKGENGQAKILEAKTVENGCFVETGNDWERIQYAKEIVEKAV NLLDYFLEKYPDVERIGLTGQMHGIVYVDKEGNCVSPLYTWQDARGSICDGDQIPLTEEI RERCQIHAASGYGLVTHIYNIRHNLVPDSALSFCTIMDYFGMYLTGRKKPLIHVSNAAGF GFFDSHKMCFEKEKLDEMGVDVNWLPDVCTGIEKLGTYRGRTVTTAIGDNQASFLGAAGD EENILLVNMGTGGQISVLSSQYFSGDGIEARPFLNRKYLLAGASLCGGKAYALLEQFFRK IVKEATGQEQPLYKVMEKMARTGRDQNSERTESRKIKVETTFDGTRVNPEKSGSITQMWS ENFTPEDFCYGVLKGMSKELYQMYMTIQKGTGIKIRHMIGSGNGLRKNPVLCEIIEDMFK AELVLAECEEEAATGAAMSSSMYN >gi|226332907|gb|ACII01000112.1| GENE 6 4613 - 6406 1900 597 aa, chain + ## HITS:1 COG:BH1407 KEGG:ns NR:ns ## COG: BH1407 COG1283 # Protein_GI_number: 15613970 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Bacillus 
halodurans # 1 544 8 539 543 253 29.0 1e-66 MILSLLSGVALFLFGMSLMGDGLKMVAGNKLEAFLYRMTNTPLKGVALGTGVTSVIQSSS ATTVMVIGFVNSGMMKLKQAIGIIMGANIGTSITGWILCLSYIDGKNGIAKILSTATISA VVAIIGIILRMACKRSVHKNIGNIMLGFAILMTGMQTMSGAVSPLKDSPTFTNMLTMFSN PLIGILVGIAFTAVLQSASATVGVLQALSVTGILTFSSAFPIVLGIGVGAACPVLISAIG ANKNGKRTALVYLINDLFGLILWSVIFYTVNAVVHFGFMDMIMSPVAIALLNTVFRVATV CVLFPFIPKIEQLVCWLVKDSAEELEDEADFDLLEERLLNYPALAIGQCHRAMSGMARKL RKNVNRAMNLLNEYQQDKFDKVQRKENLIDKYESRLGEYLMKLTKHEMNSAQTRQASLYL HTINDFERIGDHASYIAYMSSEMHDNHTNFSQEAWDELNVVMEAVREEINLTCRAFLNDD KEMAQRVAPLGMIITSLCNELKMHHVERLSNGNCGLEEGTVYTDILNSFNRIAAHCASAM VALLKSGDENPDMHIHDSKIYPSDSVEYYTYFKEYRQKYEIVKNEEHMRSMEPEEVE >gi|226332907|gb|ACII01000112.1| GENE 7 6616 - 6870 364 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579982|ref|ZP_04857249.1| ## NR: gi|253579982|ref|ZP_04857249.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 84 1 84 84 152 100.0 8e-36 MLFDSRTVYLGASRDKYEQYCKALDSEKIKYKTKRVNHEEKMTAPGRGTARSMGGNFGTD KTLYEIMVGEKDYDNAMGILTALK >gi|226332907|gb|ACII01000112.1| GENE 8 7004 - 7945 496 313 aa, chain + ## HITS:1 COG:FN0971 KEGG:ns NR:ns ## COG: FN0971 COG3177 # Protein_GI_number: 19704306 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 4 256 22 268 330 228 47.0 1e-59 MSSISEKIGRITAISSLETKPHLRKNNRIKSIHSSLKIEANSLSLGQVRDVINGRLVLGE QKEIQEVKNAYAAYESLSEINPYSIKDLKKFHGIMTKYLVEECGEFRHGEEGVFNGDECI FMAPPAQFVPQLMDELFEWMKKSRNSVHPLIMSSVFHYEFVFIHPFADGNGRMARLWHTA ILSKWKPVFEYIPIESRIEKFQDEYYDAIARCHVSGESTIFIEFMLSQIDRILEDISLQM NEENEQFSETVRKMLEIMEYDTPYTRKTLMEKLGLKSRDGFRRNYLQPALEMNLIQMTLP DKPNSRNQRYMKV >gi|226332907|gb|ACII01000112.1| GENE 9 8108 - 9241 798 377 aa, chain - ## HITS:1 COG:CAC3673 KEGG:ns NR:ns ## COG: CAC3673 COG1940 # Protein_GI_number: 15896905 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Clostridium acetobutylicum # 15 325 9 329 385 116 29.0 6e-26 
MSEKGLTNISLKKINKSKVYQYIYRKKTTSKLQIVQELQMGLSTVSQNLNLLEQEGLIEK NGFFESTGGRKANALQIVSDFKISIGIGILKGMFHITAVDLYGNAFYTDTIPLPYSNTPD YYKQVTDAVKEFISSRQYDNDKVLGISIATQGITSPDHTTVLYGDIMNNAGMKLDDFSRY LPYPCYLEHDSKSAAFLELWNHPELDSAVIILLNRNLGGGIITNHQIHQGISMHSGTLEH MCINPDGPLCYCGSRGCLETYCSANALEQAAGMPAKEFFPLLREKKSPQIIQIWKDFLNH LAFAMKNLNLVIDAPIILSGYLAPYFTEEDIKYLAEHLHTAAPFILDKTQILVGTNGQYT PAIGAALHYVEKFIQSV >gi|226332907|gb|ACII01000112.1| GENE 10 9523 - 11139 2090 538 aa, chain + ## HITS:1 COG:TM0068 KEGG:ns NR:ns ## COG: TM0068 COG0246 # Protein_GI_number: 15642843 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Thermotoga maritima # 1 538 1 535 539 533 50.0 1e-151 MKLSINGIKEQEAWKKAGITLPSYDVEAVSENAKENPVWVHFGIGNIFRVFIGGIADGLL EEGVLDRGITCVETFDYDVVDKIYDPYDNLGLSVILHGDGTREYKVVGAFAEAVKAQSSN AEQWKRLKEIFTSKSLQMVSFTITEKGYALQKADGTWFPFVEADIKNGPDKAVGAMAVLV AMLYERYKAGEYPIALVSMDNCSQNGAKLRESVLTMTEEWKKAGFVDEGFVNYVSDEKVV AFPWTMIDKITPRPSEQIAADLENLGVENMQPVITGKKTYIAPFVNAEKPQYLVIEDSFP NGRPALEKGFGVYMADRNTVNLSERMKVTVCLNPVHSATGPLGVVLGYDLFAHMLNTNED MMKMARMIAYDEGLPVVADPGILSPQAFVDELFNDRFPNEYLGDTNLRLAVDVSQMVGIR FGETIKAYVQKFGDASRLTAIPLGIAGWLRYMLGVDDEGNKFELAPDPMNEELQEQFKDI VVGKPETFKDQLKPILSNERLFFTDLYKDGVGEKVEDMFREMIAGPGAVRATIHKYVG >gi|226332907|gb|ACII01000112.1| GENE 11 11187 - 11324 169 45 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291547208|emb|CBL20316.1| ## NR: gi|291547208|emb|CBL20316.1| Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily. [Ruminococcus sp. SR1/5] # 12 45 97 130 130 69 97.0 7e-11 MNLKEQFNGIPTNDIVHFLPFWENGVKFFTIEGPNKEKVEFSQYL >gi|226332907|gb|ACII01000112.1| GENE 12 11460 - 11849 452 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579986|ref|ZP_04857253.1| ## NR: gi|253579986|ref|ZP_04857253.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 129 7 135 135 186 100.0 4e-46 MSKRTERMSRSAAERINRKDTVIRQNNAELEELRSQYENLDICEEDKQVIDDYIACKDAR ADRMAEKLYEKGRKDVKRCIRIRKLIRRCMILSAIVTVVLVQYNKNEKLRESLEELLKMF RDGVSDEEL >gi|226332907|gb|ACII01000112.1| GENE 13 11956 - 12471 297 171 aa, chain + ## HITS:1 COG:FN0772 KEGG:ns NR:ns ## COG: FN0772 COG0716 # Protein_GI_number: 19704107 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Fusobacterium nucleatum # 8 163 5 161 169 72 30.0 6e-13 MGIQRYSIIYSSKTGNTKKLAEKIREVLPEENCDYFGTEGTKALSSDILYIGFWTDIGNA DPVTLELLKSLKNKKIFLFGTAGFGGSEAYFQQVLGKVKESVDESNTVMGEFMCQGKMPQ SVRDRYVKMKEKPDHAPNLDELIQNFDRALSHPDNEDLDRLEEIIRRDLEK >gi|226332907|gb|ACII01000112.1| GENE 14 12573 - 13637 977 354 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01556 NR:ns ## KEGG: EUBELI_01556 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 340 1 340 353 479 76.0 1e-134 MTFYQELQLSSTGSKQLIRSTTDPKEKRRHILIYNFKVYLVMAFCVAVVSMYSKLTGSSN SVVGVTVLLAVLVLRQADFGIRTTHGLLSIAGIFGILMAGPRLANIVPPLAAFAVNAVCI LLLMILGCHNVIMYNHSTFVLGYLLLLGYDVTGKEYTFRVIGLLVGMVICMIVFYKNQRN RAYRRTFLDLFREFDLKSARSRWYVKLTLIVSSAMLFMNLLGLPRAMWAGIACMSVCLPF TEDCIPRSVSRGMFNVVGCLLFIVLYLVLPKSMYPYIGMIGGIGVGYSAGYPWQTAFNTF GALSIAAGIFGMPAAIALRIGANVLGAAYTVICNKVTDKVAEYIGTNKCAENLS >gi|226332907|gb|ACII01000112.1| GENE 15 13755 - 15314 1436 519 aa, chain + ## HITS:1 COG:CAC3316 KEGG:ns NR:ns ## COG: CAC3316 COG1502 # Protein_GI_number: 15896559 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Clostridium acetobutylicum # 18 519 11 510 510 387 38.0 1e-107 MKQDTLEGRAKTKNGVKRLCFSAVCILLEAAFIIAMITKLNQYAEIINLMTRLLAGVLVL KLYASDQTSSMKMPWVILILVFPILGVGLYLLIGLNGGTRKMRERYDQIDRELLPLLPND SECRETLGRKIPKAGNISDYIQKNASYPVYQNTDVIYYDEAVKGLEAQLADLAKAEKFIF MEYHAIEDAQAWHKIQRVLEDRVKAGVEVRVFYDDMGSIGFINTDFIKKMENVGIHCRVF NPFTPGLNVFLNNRDHRKITVIDGKVGFTGGYNLANEYFNFTHPYGQWKDTGIRLEGEAV 
RSLTVTFLEMWNAVSDKDKNDSDFTEFLVQTDYQAKQTGFIQPYADSPMDHEQVGEEVYI SMVNKAEKYCWFMTPYLIITDEMTHALCLAAKRGVDVRIITPGIPDKKMIYNITRSFYHG LVKHGVRIYEWTPGFCHAKMSVADDCMATCGTINLDYRSLYHHFENGCFMADCQAVLDIR NDLAATMDECREVTEQYSTGRSAYLRLGQLFMRLFAGLL >gi|226332907|gb|ACII01000112.1| GENE 16 15394 - 16482 1452 362 aa, chain + ## HITS:1 COG:BH1550_1 KEGG:ns NR:ns ## COG: BH1550_1 COG3347 # Protein_GI_number: 15614113 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 3 359 18 415 425 116 25.0 9e-26 MDLSTLVKMSNTYGSNPAYVLAGGGNTSVKDDTTLYVKGSGTQLATIKSEEFVEMSRARL NEIMKTDYPEDDVKRESLYLADVMAAVTDADKTKRPSVEALLHNLFAYTYVLHVHPTLVN GLTCGKGAKELSEQLLGKDVLWIDICKPGYTLARICYEKMNAYKEEYGKDVQVLLLQNHG IFVAANTVEEIGALFDGVISKLEKQVKRTADVSDAVTSEKEQAAEKLSRLLGHAVEVVPA AEADHFVKDKAAAAPLLKPFTPDHIVYCGPYPLFVEDIDQAKAALDAFMAENDKEPRLIL VQGVGAFIMEDDKGKAAKAQLLVKDAIKLAVYAESFGGPLHMTDDITYFITHWEAEAYRS KK >gi|226332907|gb|ACII01000112.1| GENE 17 16572 - 19229 2925 885 aa, chain - ## HITS:1 COG:CAP0035_1 KEGG:ns NR:ns ## COG: CAP0035_1 COG1012 # Protein_GI_number: 15004739 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Clostridium acetobutylicum # 12 456 3 448 448 595 66.0 1e-169 MAKKENTIPTIIDTPEALTAKIAAMKEAQKIFATFTQEQVDKIFKAAATAADKARIPLAK DAVEETGMGIVEDKVIKNHYAAEYVYNAYKNTRTCGVIEEDTAYGIKKIAEPIGLIAAVI PTTNPTSTAIFKTLIALKTRNAIIISPHPRAKKCTIDAAKVVLEAAVEAGAPEGIIGWID VPSLDLTNQVMREADIILATGGPGMVKAAYSSGKPALGVGPGNTPVIIDDSADIRLAVNS IIHSKTFDNGMICASEQSVTVLENIYKKVKEEFLYRGCYFLKPDELEKVRKTILINGALN AKIVGQKAATIAEMAGVAVPPETKILIGEVESVDISEEFAHEKLSPVLAMYKAKTFDEAI AKAEQLVADGGYGHTSALYIDTREREKMEKHAAAMKTCRILINTPSSQGGIGDLYNFKLV PSLTLGCGSWGGNSVSENVGVKHLINIKTVAERRENMLWIRTPEKVYFKKGCLPVALDEL GTVMHKKRCFIVTDSFLYKNGYTKKIEDKLDQMGIVHTCFYDVEPDPSLASARAGAAAMR MFEPDCIIAMGGGSAMDAGKIMWVLYENPDANFDDMAMDFMDIRKRIFTFPHMGKKAYFI AVPTSSGTGSEVTPFAIITDKTTGIKWPLADYELMPDMAIVDTDNMMSAPKGLTCASGID VMTHALEAYASMMASDYTDGLALNAIKLVFNYLPRAYKDGSDVEARQHMANASCMAGMAF 
ANAFLGVNHSLAHKLGAFHHLPHGIANALVLLHVLRYNAAEVPAKMGTFPQYQYPHTLAR YAEIGRSVGLTGKNDSEVFEKLLQKLQELMDTIEIKHTIKEYGVDEKYFLETLDEMTEQA FNDQCTGANPRYPLMSELKEIYLKAYYGEGNIPESGKAKEKKEEK >gi|226332907|gb|ACII01000112.1| GENE 18 19738 - 21570 2305 610 aa, chain - ## HITS:1 COG:CAC1684 KEGG:ns NR:ns ## COG: CAC1684 COG1217 # Protein_GI_number: 15894961 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Clostridium acetobutylicum # 2 604 3 602 605 719 61.0 0 MRTKREDVRNVAIIAHVDHGKTTLVDQLLKQSGVFRENQEVQERVMDSNDIERERGITIL SKNTAIHYKDTKINIIDTPGHADFGGEVERVLKMVDGVILLVDAFEGAMPQTKFVLKKAL ELDLHVIVCINKIDRPEARPDEVVDEVLELLMDLGASDEQLDCPFLYASAKAGHAVIDLN DTPKDMAPLFDAILKYIPAPEGDPDADTQVLISTIDYNEYVGRIGVGKVENGKIAVNQEL TLLNHHDLDKRKKVKISKLYEFDGLNKVEVKEATIGSIVAISGIADIHIGDTLCGGENPE AIPFQKISEPTIAMNFIVNDSPLAGQEGKYITSRHLRDRLYRELNTDVSLRVEDTETTEA FKVSGRGELHLSVLIENMRREGYEFAVSKPEVLYKTDERGKKLEPMEIAYVDVPEEFSGT VIQKLSERKGELQGMSTASDGTVRLEFHIPSRGLIGFRGEFLTSTKGTGIINTMFDGYAP YKGDFQYRKQGSLIAFEAGEAVTYGLFAAQERGTLFIGPGAKVYSGMVIGQNGKAEDIEL NVCKTKHLTNTRSSSADDALKLTPPRVLSLEQAIDFIDQDELLEVTPESLRIRKRLLDSR ERKRAAFRKA >gi|226332907|gb|ACII01000112.1| GENE 19 21996 - 22622 536 208 aa, chain + ## HITS:1 COG:CAC2382 KEGG:ns NR:ns ## COG: CAC2382 COG0629 # Protein_GI_number: 15895648 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Clostridium acetobutylicum # 3 205 2 202 229 222 52.0 5e-58 MADKIIENNQVSMMGKIAAGFTFSHQVFGEGFYMTELLVKRLSDSEDRIPVMVSERLVDV TQDYIGEYVEIHGQFRSYNRHEEKHNRLVLSVFSRELKFVEEEDETVPVNQIFLDGYICK PPVYRKTPLGREIADVLLAVNRPYGKSDYIPCICWGRNARYASAFEVGGHVLIWGRIQSR EYMKRIGENETEKRVAYEVSVSKLEYME >gi|226332907|gb|ACII01000112.1| GENE 20 23110 - 23601 591 163 aa, chain + ## HITS:1 COG:TM1465 KEGG:ns NR:ns ## COG: TM1465 COG2109 # Protein_GI_number: 15644214 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Thermotoga maritima # 3 158 2 151 170 79 
33.0 2e-15 MEKEFTEVYCGGGKGKTTLAIGQCLRASAQGKSVIIIQFLKGRERRELDFLEELDNLDIK IFRFEKHDEGYEELNEQEKAEEKTNILNALNFARKVIATQECDFLLLDEILALPDYGITT AEAIGDILKMKDESMHILLTGRTLPDSLRKYADSITTLTTEIV >gi|226332907|gb|ACII01000112.1| GENE 21 23681 - 24574 1232 297 aa, chain + ## HITS:1 COG:CAC2378 KEGG:ns NR:ns ## COG: CAC2378 COG0329 # Protein_GI_number: 15895644 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Clostridium acetobutylicum # 1 294 1 291 293 299 51.0 5e-81 MAIFKGAGVAIVTPMYEDGKVNYDKLEELLEYQIANSTDAIIICGTTGESSTMTHGEHLK TIKFAIDKVNKRVPVIAGTGSNCTETAIMMSKEAASYGADALLIVTPYYNKATQKGLIAH YTAIAEAVPETPIIMYNVPSRTGCNIQPATAVALAKNVKNIVGIKAASGDLSQIATMMSM ADGCIELYSGNDDQVLPILSLGGLGVISVLSNVAPKETHDMVMKFLEGDTKGAAEIQLKA IPLIHALFSEVNPIPVKAALNLMGKEVGPLRMPLSEMEDANKAKLAEELKNFGIKLA >gi|226332907|gb|ACII01000112.1| GENE 22 24640 - 25401 1025 253 aa, chain + ## HITS:1 COG:CAC2379 KEGG:ns NR:ns ## COG: CAC2379 COG0289 # Protein_GI_number: 15895645 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Clostridium acetobutylicum # 1 250 1 248 250 227 45.0 2e-59 MVKIIMHGCNGHMGQVISGLVEKDPDAEIVAGIDVADQGKNSYPVYTNMEECQVEADALI DFSSAKATDALLAYCEKRKLPVVLCTTGLSEEQLAKVEETSKNVAVLKSANMSLGINTLM KLLQDAAKVLATAGFDMEIVEKHHRLKLDAPSGTALALADSINEAMDNQYHYIYDRSQRR EMRDDKEIGISAVRGGTIVGEHEVIFAGPDEVIEFKHTAYSKAVFGKGAVEAAKFLAGKP AGRYDMSDVISAK >gi|226332907|gb|ACII01000112.1| GENE 23 25549 - 26871 1108 440 aa, chain - ## HITS:1 COG:CAC2057 KEGG:ns NR:ns ## COG: CAC2057 COG1686 # Protein_GI_number: 15895327 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 197 436 28 260 351 154 41.0 3e-37 MRWIDCSILRNIYNSYLQEVIMKRKRNHSLRSSVFIFLAALFLLFTCSVSTIYASTLQKP DIAASGKFVKDGDYWIYRYDDKTIAKNVFLKIDKKTYYFNKLGHRWCSWHTIKGKNYYFG TRFQGYLIKNSLIKYKGNYYYVGKDGTMVTGWYTDKSGKKYYFGKDGKAVTGKHKIKGTY 
YYFNQNGTVTHTGLNYSLSSDCALLMNADTGQIIYGKNENVAHANASTTKIMTCILALEN CKLNEKVKFSPYAASIEPSKLYANAGEIFYLKDLLYSLMLPSHNDTAVAIAEHVSGSTAK FVKLMNKKAAEIGCTNTHFATPNGLDAGYNHYTTASDLAKIAQYAIKKPMFRKLVSTGYY SFSNLNTGRSYSIGTTNALLGNVPGVQGMKTGYTNKAGYCFVGLSYSARGNTYISVVLGA DSSSARWQDSRTLLTYAYNH >gi|226332907|gb|ACII01000112.1| GENE 24 27063 - 27575 514 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579998|ref|ZP_04857265.1| ## NR: gi|253579998|ref|ZP_04857265.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 170 1 170 170 172 100.0 8e-42 MDRQNAFREQEMNMGIKVKKILTGTALIMSLCMFTACGNANKAADEADTNEENMTDNNAA NDDNMENNAGENNTNGNSTSGNNAADKNVNNGGTVTDNGTAAEDEAAENETKQNSNATDN TADNTRIDENNNGDGTVSGELGNSVKDLGEGVGNAVEDVGDAVGNAAEGR >gi|226332907|gb|ACII01000112.1| GENE 25 27900 - 28667 865 255 aa, chain + ## HITS:1 COG:SPAC3F10.09 KEGG:ns NR:ns ## COG: SPAC3F10.09 COG0106 # Protein_GI_number: 19114853 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Schizosaccharomyces pombe # 3 253 9 251 264 233 45.0 2e-61 MKFRPCIDIHNGKVKQIVGGSLTDVQDQASENFVSEQDASFYAELYKKAGIKGGHVILLN GHDSPYYESTKEQAILALHTYPGGLQIGGGVNPENAGEYLSAGASHVIVTSYVFKDGRIS WENLNKMKETVGKEKLVLDLSCRRKDGKYYIVTDRWQKFTDVTMTLDIMKELGSYCDEFL VHAVDVEGKARGVETELASLLGEYKGNPVTYAGGVGSMKDIEDLRKYGKDRLDVTVGSAL DLFGGNISFSELIKL >gi|226332907|gb|ACII01000112.1| GENE 26 28825 - 29397 474 190 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|28210085|ref|NP_781029.1| SSU ribosomal protein S30P [Clostridium tetani E88] # 16 190 1 176 176 187 54 2e-46 MCVIIDSSNSKGDGSMKFIISGKNITVTEGLKTAVQDKLSKLERYFTPDTEVVVTLSVEK ERQKIEVTIPVKGNIIRSEQVSNDMYVSIDLVEEVIERQLRKYKNKIVDKQQAAANFQKA YLDKDYDEDEEVKIIRTKKFGIKPMYPEDACVQMELLGHNFFVFYNAETEQVNVVYKRKG NTYGLIEPEF >gi|226332907|gb|ACII01000112.1| GENE 27 29773 - 31245 1405 490 aa, chain + ## HITS:1 COG:alr3871 KEGG:ns NR:ns ## COG: alr3871 COG1640 # Protein_GI_number: 17231363 # Func_class: G Carbohydrate transport and metabolism # Function: 
4-alpha-glucanotransferase # Organism: Nostoc sp. PCC 7120 # 1 489 4 493 502 353 39.0 3e-97 MRKAGVLMPVSALPSRIGAGELGESAFQFIELLKENHVKIWQILPLNPVGYGNSPYQPYS SCAGDELYISLDALAEEGLLKEHPKEFQERATRVDYEAVRQYREPFLRTAFDVFTEKKGQ EETAYKEFASQEWVYEYGVFRALKKANNGECWNDWPEEYRTWPENRQKLPAEVETEAQYQ MFLQYIFYTQWMKVKKAANDAGIQIMGDVPFYVGQDSVDVWGGKDNFLLDTDGRPIFIAG VPPDYFSATGQRWGNPIYDWEHMKEQDYRFWVDRIGYSNKLFDIIRIDHFRAFDTYWKIP ADCPTAIDGEWIEAPGYEVIDTLQKEIPGLDLVAEDLGLLRPEVLMLKDHYHLKGMKILI FSIETGGKYARDTFHDVENMIFYTGTHDNDTIMQWYGNMSAAARRKIRRMLKKAGASQGS VKDRFLQYTMQNQAEYAIIPLADILGLGKEGHINTPGTIGSPNWEWHLPDFIQAKKELQK FGRLIVDTKR >gi|226332907|gb|ACII01000112.1| GENE 28 31479 - 34055 3302 858 aa, chain + ## HITS:1 COG:CAC2846 KEGG:ns NR:ns ## COG: CAC2846 COG0653 # Protein_GI_number: 15896101 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Clostridium acetobutylicum # 1 855 1 838 839 948 59.0 0 MNMFSKVFGTRSQREVKRIMPLVEKIESLRPDMQKLSDEELRGKTREFKKRLEEGETLDD LLPEAFAVVREAGKRVLGMEHFRVQLIGGIILHQGRIAEMRTGEGKTLVATLPSYLNALE GKGVHVVTVNDYLAKRDAEEMGKIHEFLGLTVGVVLNDMKQDERRAAYNCDVTYVTNNEL GFDYLRDNMVIYKEQLVQRDLHYCIIDEVDSILIDEARTPLIISGQSGKSTKLYEACDIL AQQLERGEASHEMTKMAAIMGEEVIETGDFVVNEKDKIVNLTEQGVHKVEKFFHIENLAD PENLEIQHNITLALRAHNLMHKDQDYVVKDDEILIVDEFTGRIMPGRRYSDGLHQAIEAK EHVKIKRESKTLATITFQNFFNKYDKKGGMTGTAVTEEKEFRDIYAMDVVEIPTNRPVIR VDHEDAVYMTKKEKFNAVVNAVVEAHAKQQPVLVGTITIETSELLSRMLKRQGIKHNVLN AKFHELEAEIVSQAGQAGAVTIATNMAGRGTDIKLDDVARNAGGLMIIGTERHESRRIDN QLRGRSGRQGDPGESRFYISLEDDLMRLFGSERLMKIFTSLGVAENEQIEHKMLSNAIEK AQEKIEFNNFGIRKNLLDYDQVNNEQREIIYKERRQVLDGENMREAIYKMIQDTVDTYVD MCFSDDVDSEEWDLNEFNGVLTPIIPIRPLTAESVKGKKRDEIRHELKEEAVKLYEEKEA EFPEPEQLRELERVVLLKCIDSKWMDHIDDMEILRQGIGLAAYGQRDPVVEYKMSAFDMF NEMITSIQEDTLRMLYHVHVEQKIEREQVAKVTGTNKDDSAGPKKPVQRKEIKVYPNDPC PCGSGKKYKQCCGRKNKV >gi|226332907|gb|ACII01000112.1| GENE 29 34454 - 35920 1583 488 aa, chain + ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # 
Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 1 452 7 454 459 452 52.0 1e-127 MKRESFGSRLGFILVSAGCAIGIGNVWKFPYICGAYGGAAFILIYLCFLVILGIPVLVAE FAVGRGSHTSVAKCFDKLAPAGSCWKPLKFIGIIGCYMLMMFYTTVGGWMIYYCVKSLRG DFVGATPEAVTQSFSDMLGNAPTMMLWMVIICLIGFAVCFMGLQNGIEKISKVMMSALLI IMVILAVHSVMMEGAGEGISFYLIPDFQKIKEAGVGNVVFAALSQSFFTLSIGIGAMLIF GSYLDKSRSLTGEAVSITVLDTFVALTAGFIIIPACFSYGIQPDAGPSLIFITLPNIFAK MSGGALWGSLFFLFLTFAAVSTIIAVFENLIVFNMELFGWNRKKSVVVCAVLVVILSIPC VLGFNVLSGFQPFGDGSNIMDLEDFIVSNNLLPLGSLGYLLFCTRKNGWGWDNFIAEVNA GTGMKFPKCLKNYVAYGIPAIIIVIYLKGYYDKFAGLGTAALAGWMIVALVLLGFVFYCA TAKKKVKQ >gi|226332907|gb|ACII01000112.1| GENE 30 36050 - 36910 469 286 aa, chain - ## HITS:1 COG:CAC2818 KEGG:ns NR:ns ## COG: CAC2818 COG2207 # Protein_GI_number: 15896073 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 12 276 4 274 279 276 52.0 4e-74 MLLSEQFYQTEKTGYLNEQFRLFHLKDQTRKEFSYHYHDFHKVVIFISGKAAYHIEGKAY QLKPWDILLVNRHAIHRPEIDPSVPYERFILWIQNDIPWQELLKCFQKANDRSYNLVRLN SALQEKMKDILFELENSAKSDEYGREILTQSLFLQFMVYLNRIFLEKQYIFDKKSYTFDS QIASILQYINHNLKEDLSVETLSARYYVSKYHLMRKFKQETGYTLHNYIVNKRLLMARTL ISNGMPVTKAAQESGFAEYSTFSRAYRKQFKTNPSEELPHYSNPLK >gi|226332907|gb|ACII01000112.1| GENE 31 37586 - 38434 1102 282 aa, chain + ## HITS:1 COG:alr4841 KEGG:ns NR:ns ## COG: alr4841 COG0253 # Protein_GI_number: 17232333 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Nostoc sp. 
PCC 7120 # 9 282 5 278 285 227 44.0 2e-59 MANTIIMSKYQGLGNDYLILDPKKNQVQLQGKKIALLCQRGFGLGADGVLYGPVEIDGKL GVRIFNADGSESAISGNGVRIFAKYLMDNGYITEKKFDIETMSGTIHVECLNSRATEFKV NIGKPSFISSDIPVSGEVREVIKENFVFHGKEYKATCLTVGNPHCIIFTDNVSAEAVKNL GPYVENADEFPERMNLQICRQVDKGNLEIEIWERGSGYTKASGTGSCAAAAAAYKLGLVE SRINVNQPGGMIQIDMEEDETIYMTGTVGYIADISVAESFFS >gi|226332907|gb|ACII01000112.1| GENE 32 38793 - 39782 1140 329 aa, chain + ## HITS:1 COG:BS_prfB KEGG:ns NR:ns ## COG: BS_prfB COG1186 # Protein_GI_number: 16080582 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Bacillus subtilis # 1 323 40 362 366 363 54.0 1e-100 MEAPDFWDDAEVSQKKMKELKDMKDDMETYQNLVTQMEDMETMIEMGYEENDPALIPEIQ EMLDEFQKDFDNIRIKTLLSGEYDSENAIIKLNAGAGGTEACDWCGMLYRMYSRWADKKG FTLEVLDYLDGDEAGIKSVTFQVNGTNAYGYLKSEKGVHRLVRISPFNAAGKRQTSFVSC DVMPDIKKDLDVEINDDEIRIDTYRSSGAGGQHINKTSSAIRITHYPTGIVVQCQNERSQ HMNKDKAMQMLKAKLYLLKQQEAEEKLSGIRGEVTDIGWGNQIRSYVMQPYTMVKDHRTN EETGNVDAVLDGGIDIFINAYLKWIALKK >gi|226332907|gb|ACII01000112.1| GENE 33 39939 - 40394 415 151 aa, chain + ## HITS:1 COG:BH1214 KEGG:ns NR:ns ## COG: BH1214 COG4492 # Protein_GI_number: 15613777 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Bacillus halodurans # 8 149 3 144 147 119 42.0 2e-27 MQKPVTEKKKYFVVRERAVPEVLLKVVEAKKLLDSGKVDTVQDAAERTGISRSSFYKYKD DIFPFHEETRGKTITFIIQMDDEPGLLSVVLQTIARFHGNILTIHQSIPINGIATLTLSV DILPGEGDAEAMVETIEQNEGIHYLKILGRE >gi|226332907|gb|ACII01000112.1| GENE 34 40401 - 41615 1695 404 aa, chain + ## HITS:1 COG:MTH1232 KEGG:ns NR:ns ## COG: MTH1232 COG0460 # Protein_GI_number: 15679243 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Methanothermobacter thermautotrophicus # 2 394 1 409 423 287 39.0 2e-77 MINVAILGYGTVGSGVFEVLNENKESISRRAGQELHVKYVLDLRDFPGQPVEKVLVHDYE QIVSDPEVDIVVEVMGGVEPAYTFVKKALLAGKNVCTSNKALVAKHGRELMDIAKEKSIN FLFEASCGGGIPIIRVINSSLTGDEIDEVTGILNGTTNYMLYKMSTEGCEFDTVLKEAQQ 
KGYAEADPTADVEGYDACRKIAILSSLAYERYFDFEDIYTEGITKITPEDMEYAAKLGRT IKLLGTSRRLADGTCYAMVAPFMLGQNSPLYSVNDVFNAVFVHGNMLGDAMFYGSGAGKL PTASAVVGDIVDAAKHLHVNIVTNWNSTPAVLKPLDEVTGKFFVRIKKEAAEEAKKVFGD VEIISLGQLPQECAFITDEMTQGAFEEKLAQIGDQVLAKIRVKD >gi|226332907|gb|ACII01000112.1| GENE 35 41779 - 42987 1485 402 aa, chain + ## HITS:1 COG:PA0904 KEGG:ns NR:ns ## COG: PA0904 COG0527 # Protein_GI_number: 15596101 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Pseudomonas aeruginosa # 2 400 3 404 412 357 49.0 3e-98 MLIVKKFGGTSVANKERIFNVASRCIEEYRKGNDVVVVLSAMGKYTDELITMARDVNEKP PKREMDMLFTIGEQMSVALMAMAMDKLGVPAVSLNAFQVSMHTTSSHGNARLKRIDTERI RRELDSKKIVIVTGFQGVDKYDDYTTLGRGGSDTTAVALAAALHADACEIYTDVDGVYTA DPRKVPKARKLKEITYDEMLDLATLGAGVLHNRSVEMAKKYGVPLVVRSSLNNSEGTVVK EEVTVERMLISGVALDTDAVRIAVIGLKDEPGVAFKLFDTLAKKNINVDVILQSIGRAGR KDISFTVDGNDLDDAVAVLDANQARLGFAELHSERNIAKLSIVGAGMMSNPGVAAKMFES LYNEGVNINMISTSEIRVTVLINEKDGVRAMNAVHDAFGLAD >gi|226332907|gb|ACII01000112.1| GENE 36 43002 - 44024 1000 340 aa, chain + ## HITS:1 COG:CAC2648 KEGG:ns NR:ns ## COG: CAC2648 COG5523 # Protein_GI_number: 15895906 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Clostridium acetobutylicum # 3 224 10 223 272 63 26.0 8e-10 MTSRRELKNYAKQAMSGKYGILILAFVAVQALALISSMISSALFPGEETIDIVLSYAFTF ILTLLINVVAAGMNYMYLNIARGKAYSLNDLFYFYRHHPDRVIVAGFFMAVLNLLTMLPY TIYSNANLPGEDASVEVLITWLYTGVVLMIVGMIVYQILVIPLEMTYYILADKPELKSTE AMKESLEMMHGNFGRYLMLKISFIPLMFLSVFTFYIALLWIFPYMAMTEVMFYRDLTGEL KVQKEEEERVARDYVNPMFDSYSQSEQPQEDETHPQQFWRVPEESVSYENEDEQNITDST EIQNVQTDMDDKKEQNDSSPKEEEEDNEPKPWDEYFDSLK >gi|226332907|gb|ACII01000112.1| GENE 37 44174 - 45454 1269 426 aa, chain + ## HITS:1 COG:BS_ywaD KEGG:ns NR:ns ## COG: BS_ywaD COG2234 # Protein_GI_number: 16080898 # Func_class: R General function prediction only # Function: Predicted aminopeptidases # Organism: Bacillus subtilis # 22 310 62 326 455 89 30.0 1e-17 MNTNEISGKRQLEFLADFDYIREAGTAGEEKAAERIQKTLDSFGVESHLEEFSFDTFQIK 
KAKLKVTEPYTKEYTVTGYGRCGNTAEDGLEAPFAYAENGDDISLAYVSGKIVMVNDPVR KDMYRKLVKAGAVGFISIAGSPLDEGVDLVPRAYALPKNLPGEEKKEAGREAVNYDNRIP GVSIHYKDAIELVTKGASQVCLSVEQEIVTHTSRNIVARIEGTDKAEEILTLTAHYDSVP EGPGAYDNMSGAAIIMELCRYFHAHRPRRTMEFIWFGAEEKGLLGSQNYIKIHENELSAH RFNMNVDLAGQLVGGTVAGVTGDASVCNMITYMAHEIGIGMSTKNQIWGSDSNTFAWKGI PAMTLNRDGFGMHTRHDTIALLSDWSLERSAVLLGYIADRLGNIESFPFERKVPEEFIAQ LNEYFG >gi|226332907|gb|ACII01000112.1| GENE 38 45778 - 47937 207 719 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|88810653|ref|ZP_01125910.1| 30S ribosomal protein S1 [Nitrococcus mobilis Nb-231] # 640 719 186 264 563 84 51 2e-15 MDIIQVITQELKVEKWQVEAAVKLIDEGNTIPFISRYRKEATGSLNDEQLRALFERLTYL RNLEDKKNQVLKSIEDQGKLTEELKKQILDAQTLVVVEDLYRPYRPKRRTRATIAKEKGL EPLADIILLQMTDKPVEEEARAYVSEEKGVKNVAEALNGAKDIIAERISDEADYRIYIRN LTTKNGSISSTAKNAETQSVYEMYYEFEEPVRKLAGHRILALNRGEKEKFITVKVNAPEE DILRYLNKRVIKKDNPNTTPILKAVVEDSYKRLIGPAIEREIRSDLTDKAEDGAIHVFGK NLEQLLMQPPIAGKVVLGWDPAFRTGCKLAVVDATGKVLDTTVVYPTAPTTEKKIRAAKD TVEGMIEKYGVSLISVGNGTACRESEQVIVDMLKEIPEKKVQYVITNEAGASVYSASKLA TEEFPNFDVGQRSAASIARRVQDPLAELVKIDPKSIGVGQYQHDMNQKKLDEALSGVVED SVNKVGVDLNTASASLLEYISGISKAIAKNIVAYREENGQFTDRKELLKVAKLGPKAFEQ CAGFMRISGGKNPLDATSVHPESYEAASALLSKLGYKPNDVVAGNLLGLSLQVKDYKKMA AELGIGEITLRDIVKELEKPARDPRDDMPKPILRSDVLEMKDLKEGMVLKGTVRNVIDFG AFVDIGVHQDGLVHISEMTERFIKHPLEAVSVGDIVDVRVIGVDMKKKRISLSMKGLNK >gi|226332907|gb|ACII01000112.1| GENE 39 48025 - 48918 683 297 aa, chain - ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 1 295 1 294 296 165 31.0 8e-41 MILDYYRIFYYVAQYKSFSKAADVMGNNQPNITRCMNILENELGCKLFIRSNRGVQLTIE GERLFEHVSIAIEQLVSGENELLKDKGLESGLVNIGASEIALRLFLLNELEVFHHRYPHV KLRISNHSTPQAVQALENGLVDFAVVTTPVALKKPLQRIPLYSFREILIGGEEYAETASN KHCLQDLQDIPLIGIAPGSSTRELYTQYFMRHNLPFSPDMEVATTDQIFPMVQHNLGIGF FPEELAAEHIAQGKIVQIPIEEPLPERKICLIRNVKRPQSIAARSLEEHLTSHRETE >gi|226332907|gb|ACII01000112.1| GENE 40 49082 - 49315 418 77 aa, chain + 
## HITS:1 COG:no KEGG:no NR:gi|253580015|ref|ZP_04857282.1| ## NR: gi|253580015|ref|ZP_04857282.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 77 31 107 107 147 100.0 2e-34 MTTANKTIKINDVEAAKKLVCAAVRCPFDIDIVSKGKIFIDAKSILGVLSLDVEEPLELK YDGYDEDFESVIAGLAC >gi|226332907|gb|ACII01000112.1| GENE 41 49338 - 49637 211 99 aa, chain + ## HITS:1 COG:BH3136 KEGG:ns NR:ns ## COG: BH3136 COG3326 # Protein_GI_number: 15615698 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 8 85 7 84 94 74 42.0 4e-14 MQMTSTNLLYIYVIIINVVTFFIYGLDKSRAKAGQWRIPEAQLIFLAVIGGSVGALAGMK VFHHKTRKPKFKTGVPAILIIQLIIYFLFSGNEGIILRL >gi|226332907|gb|ACII01000112.1| GENE 42 49945 - 50805 875 286 aa, chain + ## HITS:1 COG:VC0480 KEGG:ns NR:ns ## COG: VC0480 COG0668 # Protein_GI_number: 15640507 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 26 284 30 285 287 186 37.0 4e-47 MEQEIQEAKSLYYSYKQYFSDYLPALTKLGLKVVMALLVFVGGRKVIQWFVSFIKKSMER ASVDKGVIQFTGSLLRIVLYILLVFSIATHFGVKESSIAALLGTAGVTVGLALQGGLANI AGGIMLLIFKPFQVGDYIIIAQQNGCEGTVYKIEICYTTLISIDNKHIVIPNGTLSGSII TNVTARDLRKLEIKVGISYDSDIKKAKAILEEILHNDEDTKDDQGMVVFVDELGESAVVM GLRVWVATEKYWPARWRLNELIKEAFDANGIEIPYNQLQVHVKKQI >gi|226332907|gb|ACII01000112.1| GENE 43 51197 - 52807 2054 536 aa, chain + ## HITS:1 COG:RSc1380 KEGG:ns NR:ns ## COG: RSc1380 COG0747 # Protein_GI_number: 17546099 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Ralstonia solanacearum # 47 536 32 517 517 364 40.0 1e-100 MKMYKKVMAAALAGVMTVSMSATAFAADDDGYILKATNEDSKSSDNDLVIAIEGSVSSMD PANIPDTNAISATRGVYETLVSFDENQELTGKLAESWEVSDDSLTYTFKLRQGVKFQDGT DFNAAAVKANYDRVVNKDNNLRQRRTFIVTNEDGSETPRVDSIETPDDYTLVIKLTQAWA PFINRLTQFCIISPAALEEYGNDIMNHPCGTGPYVCTEWEEGDHTSFKRNDDYWGDKPGV DTVTIKEVPEAGARTAMLQTGEADFIYPTPSDQIEAIKGTDDVNILTTDSNIMRYVTLNM 
DLEQLSDVKVRQAMNYAIDKDAYIQLMYSGYGKPATSVVPSIIGGYEEQEAYTYDVDKAK ELMKEAGYEDGFKLTLWGDNTTQEIKGMTFISQQLAQIGIEVDVQPMEPATVSDKIYVDK EDAEINMWYVNWSASDFTMDGSLRSLLYSTMCPPTSANTAYFNDADFDKDMDEGLATANE EEQAKYYSDAQKIAWEACPWLFLGNDQIIYSTKSYLSGVYVSPDGAFNFANATLAQ >gi|226332907|gb|ACII01000112.1| GENE 44 53021 - 53941 1101 306 aa, chain + ## HITS:1 COG:STM0850 KEGG:ns NR:ns ## COG: STM0850 COG0601 # Protein_GI_number: 16764212 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Salmonella typhimurium LT2 # 1 306 1 306 306 333 54.0 3e-91 MGRYLGKRLISTIPLLLVISFVVFMFIHMIPGDPARLVAGQDATKEDVAVVREQLGLDEP LLVQYGKYMKGLFTGDLGSSIKNGKTVAETIAPRLKPTIMLTFSSMIWAAIIGIAIGIIA AVFHGRILDYIGMIIAIAGISVPSFWLGLELIQLFSVSLGWLPTSGLETWKSYILPSLTM GAGIMAVLARFSRSSMLETMREDYVRTARAKGLSESLVVMRHAFKNSLIEIVTVAGLQIG GLLSGSVMAETVFSIPGLGRLLVDSIQMRDYKVVQALLLFFATEYILINLIVDVLYGVIN PRVRYE >gi|226332907|gb|ACII01000112.1| GENE 45 53955 - 54854 1100 299 aa, chain + ## HITS:1 COG:RSc1382 KEGG:ns NR:ns ## COG: RSc1382 COG1173 # Protein_GI_number: 17546101 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Ralstonia solanacearum # 1 299 1 297 299 298 51.0 6e-81 MADKNTTSNINLAAAEELPKAQSKFKEFVKKFMKRKTAVISLAFIVFIILMAIIGPYVVP YDPQAPDYTAMMQGPSAAHIWGTNEYGQDVFSRLMVGTRLSLTCALTATIIGTAIGVVLG LIAGFYGGIIDSLIMRCCDVLFAFPDILLAIAVVAIIGQGMVNVMIAVAVFTVPSFARII RSATISVKQAPYVEVARSLGCSNTRILFVHIFPGTIQSMIVNFTMRVGTAILAASSLSFL GFGANVTEPDWGSMLSTGRNYLNTAPHMVLFPGILIFLTVLAFNLLGDGLRDTLDPKMN >gi|226332907|gb|ACII01000112.1| GENE 46 54867 - 55856 575 329 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 324 8 329 329 226 39 3e-58 MAEKLLEVKNLRTEFKRDKTWVTAVNNVSFSIQPGEILGLVGESGCGKSVTSLSIMKLLA 
RNSKISNGEILLNGKDLLKEDKKGMRKIRGREVAMIFQEPMNSLNPCMRIEKQLTEAILL HNNFSKEEAHNRAFEVLRSVGIPEPDMTLKSYPHQLSGGMCQRVMIAMAMSCEPELLIAD EPTTALDVTIQAQILELMEDIRAKKNTGILMITHDLGVVAEMCSRVVVMYAGRIVEEAPV QELFADPKHPYTQGLIGSVPKLGSGVESLPSIPGSVPDLSVMPKGCKFAPRCKYAMDICH QQEPELADINEAGTRKCRCHLLNKTGKEA >gi|226332907|gb|ACII01000112.1| GENE 47 55860 - 56834 855 324 aa, chain + ## HITS:1 COG:BH3645 KEGG:ns NR:ns ## COG: BH3645 COG4608 # Protein_GI_number: 15616207 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Bacillus halodurans # 1 321 1 322 322 412 61.0 1e-115 MSEPLLSVKNMKKTFQANGGMFSKKKFVHAVNDVSFDIYPGETFSLVGESGCGKSTTGKL IDHLITPDSGEIWFEGKEISKLSEKEMRPLRSDIQMVFQDPYGSLNPRMKVQDLIGEPLL IHTNMSAGERLKKVHELLGTVGLSPSHGERYPHEFSGGQRQRIGIARALTVQPKLIIADE PVSALDVSIQAQVLNLLQQLQKDFNLTYLFISHDLSVVEMISDRIGVMYLGTIVETAPKK ELYANPRHPYTRALLSAVPIPDPEVKKERITLKGDLPSPSNPPSGCLFHTRCPHCTEKCK TQVPTPVEIAPGHIVKCHYPELTE >gi|226332907|gb|ACII01000112.1| GENE 48 56915 - 57697 852 260 aa, chain + ## HITS:1 COG:CAC0113 KEGG:ns NR:ns ## COG: CAC0113 COG1349 # Protein_GI_number: 15893409 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Clostridium acetobutylicum # 8 259 1 252 253 229 49.0 3e-60 MDEERGAMFTEERQSAIEKCLRENGKVKVKELSEMFQVTEDCIRKDLKILENAGKLKRTY GGAILSQDYPLKRDVVDRRQFNLDKKRTIAAKAFKLIKNNETIFLDISTTNIELAKLLAT SNMRVVVVSNMIDILQILATNPSITAIGTGGTMYQTVNGFMGAATIEVIKQYSFDRAFIG SCGVDMTDCAITTLGVEDGLTKKAAVHSSRHKYVVMERDKFYFNDSYKYAYFDDISGIVT DEFPDDTIVSTLERAGVKLF >gi|226332907|gb|ACII01000112.1| GENE 49 57796 - 58542 749 248 aa, chain + ## HITS:1 COG:PM0526 KEGG:ns NR:ns ## COG: PM0526 COG3142 # Protein_GI_number: 15602391 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Pasteurella multocida # 6 247 3 243 244 179 38.0 5e-45 MKEYILEACVDSVESAMAAVEGGADRLELCGNLIIGGTTPGPWLFDEIRKRSDIRIHALI 
RPRFGDFCYTDAEFSMIKHAVEDFRKMGAEGVVFGVLKPDGTLNMEQMKELMEAAGDMSV TLHRAFDVCVDPIETMEQAISLGINTILTSGQRNVCLQGADLLKELVEKSQGRITIQAGS GVGAEVIRQLYPLTGVRAYHMTGKRVIDSEMQFRKDGVNMGLPMLSEYEIWRTDVDNIRA AKKVLEEL >gi|226332907|gb|ACII01000112.1| GENE 50 58542 - 61043 2274 833 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0868_1076 NR:ns ## KEGG: HMPREF0868_1076 # Name: not_defined # Def: transglutaminase-like protein # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 7 833 3 907 912 580 38.0 1e-163 MAEPRVFLKENRGRIEENYLEQAKNLPRVFAPVDEKLQKCTEEVALACKYLYAFMPYSDI GNYPFEVFLDYAENGVKLWKENPQVADLPEEIFLNYVLFHRVNEEEIAQCRTYFRTEIGS RIQGMNFREAALEVNYWCAEEATYHCTDDRTLSAISVYRRGNGRCGEESVFTVNALRSVG IPARQVYAPKWSHCDDNHAWVEIWCDGKWYFLGACEPEEILNKGWFTNASSRAMMIHSRV FDTKIPEGEVIGTDGMVTMLNELKRYAVTKEITVTVKDTQGLPAEGAEVSFEVLNYSEYA PIAEKKTDSKGTARLTTGLGSLHISARMCSDGEWFYAETVMNTEKEDNCEICLVPQDKRN DGESEKWTAADIFAPHDAPVNTDMPTPEQKAKGNKRLAAANAHREQKVRNWSNPECERFL EKKVNRIEEAIAASYREDLLRVLTEKDRTDCISDVLEEHLELAIPYHGMMKKDTFVSYVL NPRVDDEVLQKYRREIKKHFSRDEKQELRDDPSRIWNLIEKAIVSRPEKERSSVITTPAG CIRTCTGSFLSKKILFVAIARTFGVAARLNPHDRSMEYMENGRFVPVLARTEKNCTLILK AGETVQWKYFQNWSIAKLENGRYTSLKLGAENFEDQILNLPLESGNYRILTSNRLPNGNM FANEYHFEIQPGETKEIELVLREADLEDMLENISMPEFMLKTEAGTEVKASDLTADGKHI LMFLEEEKEPTEHILNEMMEQEGAFAGYAEQIIFVVRSKEALETPTLSKALAKLKNIQIY YDDFSEIINTLGRRMYVDPDKLPLIIVTNRSLNGIYATSGYNVGTGDMLLRLM >gi|226332907|gb|ACII01000112.1| GENE 51 61172 - 62482 1422 436 aa, chain + ## HITS:1 COG:TM0306 KEGG:ns NR:ns ## COG: TM0306 COG3669 # Protein_GI_number: 15643075 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Thermotoga maritima # 14 365 22 385 449 142 29.0 1e-33 MYQFNRKEYEERMKWYQDARFGMFIHWGLYSIPARGEWVRSTEEMPEEEYLPFMKEFDAR DYNPKQWAKEAKAAGMKYVVLTAKHHDGFCLFDSKYTDYKVTNTPAGRDLIKEYVEALRE EGLKVGLYFTLLDWHHPDYPHYGDRIHPMRNHPECSNENRDFSRYIEYMYNQVEEVCTNY GKIDIMWFDFSYDDMRGEKWGATRLIDMVRRLQPGIIIDNRLEVSGEGRGSLYECDPTPY HGDFISPEQIIPPEGICDKEGNPMVWEACFTMNNNWGYCANDHYYKPASMLIKKLVECVS 
KGGNMILNVGPDAKGKFPKESSAILSEIGRWMDKNHDSIYGCAPAADIPKPDYGRITRNG NKYYVHIYENTLGPLPLIGFDKNKIVKVRALDDGHEVPISTLWVHSDYPDIAFVDLGPDP VLPDPVDYVLEVEMAE >gi|226332907|gb|ACII01000112.1| GENE 52 62563 - 63081 484 172 aa, chain - ## HITS:1 COG:no KEGG:Shel_24810 NR:ns ## KEGG: Shel_24810 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 29 171 239 380 391 160 56.0 3e-38 MRKIFSVLATLICCLSVWTSVPAAQAFTLDKEDGEYSIQVELEGGSGKASVTSPTLITVK NGEVTADIQWSSSNYDYMIVDGEKYLPVNEEGTNSEFLIPVTIMDESMPVIADTTAMGTP HEINYTLTFYSDSIGSKSQMPQEAAKRVVAVALVIIVGGGILNYFVNKRNRC >gi|226332907|gb|ACII01000112.1| GENE 53 63327 - 64022 906 231 aa, chain + ## HITS:1 COG:MTH1348 KEGG:ns NR:ns ## COG: MTH1348 COG2243 # Protein_GI_number: 15679347 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-2 methylase # Organism: Methanothermobacter thermautotrophicus # 1 203 1 208 232 142 40.0 5e-34 MRGILYGVGVGPGDPELMTLKAVRLIKENDIIAVPGAEVRETVAYKIAVQAVPELADKEL LPIYMPMTHDKAELELNHEKGAKALEAALDTGKNVVFLTLGDPTIYSTFSYVQKRVEADG YETRLVSGITSFCATAARLNIPLTEWNQPLHVQPAVHRLDSELDQPGTYVLMKSGKKMNQ VKEILAKSGRNIRMVENCGMPDEHIYNSVEEIPDDAGYYSLIIAKEQDGGL >gi|226332907|gb|ACII01000112.1| GENE 54 64019 - 64861 676 280 aa, chain + ## HITS:1 COG:MA0845 KEGG:ns NR:ns ## COG: MA0845 COG1131 # Protein_GI_number: 20089729 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Methanosarcina acetivorans str.C2A # 2 235 10 243 336 192 43.0 6e-49 MIETRDLTKTFDNFTAVDSLNLKIEKGEFFGLLGPNGAGKTTTISLLSTLLLPTKGEILI DGQQLTRQRPDLKRKISVITQEYSMRQDMNMDEIMEYQGRLYFMPRKQIKERTEELLEFC DLLKFRKRTVRKLSGGMKRKLMVCRALLTDPEILLLDEPTAGMDALSRRQMWNLLRKLNE KNLTILLTTHYMEEAQSLCNRVALMDHGKLEEVSTPSALIESLGAYAVDEMTADGTQNHY FHTRQEAIRYLEELTGQASLRETTLEDVFVERAGKHLISK >gi|226332907|gb|ACII01000112.1| GENE 55 64935 - 65666 572 243 aa, chain + ## HITS:1 COG:all4219 KEGG:ns NR:ns ## COG: all4219 COG0842 # Protein_GI_number: 17231711 # Func_class: V Defense mechanisms # Function: 
ABC-type multidrug transport system, permease component # Organism: Nostoc sp. PCC 7120 # 4 210 26 241 275 104 29.0 1e-22 MGIITILWEKWVEFRRDFYKITLAAMIAPLMYLIVFGMGIQTTSHGQPYLNFLIPGVVSL TTMNGSFNAIAQNLNVQRLYEKAFDQVMISPTPLWQFIAGQIIGGSLRGLYAGGIILLLT APIETGLIFNGWSFLVMFLNGAVFSAIGVVVSFLAKNHADVPRFSNYIIMPMAYLCNTFF STEKLPGFVGKFVSALPLSQTSHLMRSISSGEAVNYTGIGILLLYLLVFTVAASWFIYKK KNL >gi|226332907|gb|ACII01000112.1| GENE 56 65985 - 67052 933 355 aa, chain + ## HITS:1 COG:alr3920 KEGG:ns NR:ns ## COG: alr3920 COG3437 # Protein_GI_number: 17231412 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and an HD-GYP domain # Organism: Nostoc sp. PCC 7120 # 5 142 95 230 414 79 32.0 1e-14 MSKLKILIVDDSQLNREILSCMLEDKYDICEAENGKQAVEILERCHKDFKLVLLDLIMPV MDGYQVLAVMKEKQWLDMLPVICISSETSEDSIRRVYGLGASDYFTRPFDAMTVFHRVES TIALHDKMTGDLQDAMKMLSSIFHRILKINLSTDTYTVIRGKDSTVSMLPSLSESLKMMA DYKYIYEEDQEEYLKFCSIDNLRRELAKGRERIYLNYRTLIKQEYRWVCMEVIRSAEYKP DDQIAMMYIRDINDEYLKQLDKIIRHAKDAFASVTVNISDGSCIMANSRIASLELKDVNE PLDSYIERISQSIPQREVRTQYRKVFCVDNLRECLTQGKDMIQWNVLCMPQKESI >gi|226332907|gb|ACII01000112.1| GENE 57 67055 - 68785 1490 576 aa, chain + ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 160 567 345 754 776 189 28.0 1e-47 MIRTTVEMVRNVSYDRLEALIYFSDITENYFSEQIPHMLYRKNFDRIEIIDGQRDCVRLD HPGICAMEMVLAEEISYSGYVTEVMKKFVPDDEKSVYEKCVRLDTIRKELQEKDRYSFSV HQFNKKGEKCLKNYTFFYLNKFFDIIVATVEDITEKFEQDILTGGYNRQGFIRHVERILK ESEDRTGYAVVFFDIKNFKAVNELFGTEIGDMMLRKVYEDVRKSELKPLVSAREDADHFI CLVERKNLNLDMLTGMCQKKLTRGGKTLHLSVKCGIFMLEKKKMSVNGMIDRAKIAQRYI TDEFVQSYKIYDSSMKNTYIDKATLAGELEEGIAKGQFKVYFQPVVDAKTGKIASAESLI RWFHPKKGFISPGVFIPALEESGHISELDLYVLDSVNEFQKKRYQSDKITVPVSTNLSWM DFYDETIMNWLEKKCADASMPSGLCRMEITETSYAAIEADRNATLEALREAGILLLLDDF 
GSGYSSFGMLQNYNFDIMKIDMSFVRQIETNTKTRSIIRFIIEMAHTMDIKIIAEGAETK EQVEFLRDNGCDYIQGYYFYKPMPEEEFVKLLDAGR >gi|226332907|gb|ACII01000112.1| GENE 58 68789 - 69685 746 298 aa, chain - ## HITS:1 COG:PA5437 KEGG:ns NR:ns ## COG: PA5437 COG0583 # Protein_GI_number: 15600630 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 1 295 3 292 311 122 28.0 9e-28 MEFKQLEAFVAVVDYGSFSEAARRLYLTQPTISAHIRSLEDELHMKLIIRTTKKTTITAK GYQLYDSAVRMLEIRNNLLENFTGAHKHMIDLAASTIPSSYLLPELLAAFGKTHPDVYFH SIQSDSSESISRVLDGTVDLALVGQNTRDESCVFIPFCHDELVIATPVTDHYLALKDRET PAIFHDFLKDPIIFREKGSGTKKEMDLFLERTGITTGNLNVIARMNDLESIKKSIVNGLG ISILSSRSVNDLQRTKQILVFPLEEAAHKRSFYIVYSKNRILKPHVKQFVQFVRDYYM Prediction of potential genes in microbial genomes Time: Sat May 28 20:22:06 2011 Seq name: gi|226332906|gb|ACII01000113.1| Ruminococcus sp. 5_1_39B_FAA cont1.113, whole genome shotgun sequence Length of sequence - 20922 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 12, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 5 - 1168 967 ## COG0520 Selenocysteine lyase 2 1 Op 2 . - CDS 1247 - 1651 65 ## COG1943 Transposase and inactivated derivatives - Prom 1683 - 1742 6.1 + Prom 1647 - 1706 5.7 3 2 Tu 1 . + CDS 1736 - 2875 727 ## PROTEIN SUPPORTED gi|163764777|ref|ZP_02171831.1| ribosomal protein L22 + Term 3114 - 3163 6.4 + Prom 3159 - 3218 10.3 4 3 Op 1 . + CDS 3305 - 4387 1612 ## COG2391 Predicted transporter component 5 3 Op 2 . + CDS 4478 - 4684 396 ## Taci_0251 SirA family protein 6 3 Op 3 . + CDS 4681 - 4923 179 ## Vpar_0344 hypothetical protein + Term 4955 - 5014 16.1 - Term 4942 - 5000 4.5 7 4 Tu 1 . - CDS 5214 - 6059 628 ## COG1737 Transcriptional regulators - Prom 6166 - 6225 6.9 + Prom 6052 - 6111 12.5 8 5 Op 1 . + CDS 6313 - 7737 906 ## COG1680 Beta-lactamase class C and other penicillin binding proteins 9 5 Op 2 . 
+ CDS 7750 - 9036 1087 ## COG2610 H+/gluconate symporter and related permeases
+ Term 9259 - 9301 2.2
+ Prom 9151 - 9210 3.7
10 6 Tu 1 . + CDS 9429 - 10418 547 ## llmg_0703 putative transposase
+ Prom 10632 - 10691 5.2
11 7 Tu 1 . + CDS 10873 - 11448 415 ## gi|253580043|ref|ZP_04857310.1| predicted protein
+ Term 11459 - 11493 1.4
- Term 11522 - 11558 2.2
12 8 Tu 1 . - CDS 11578 - 12027 464 ## EUBREC_0193 hypothetical protein
+ Prom 12264 - 12323 5.9
13 9 Op 1 . + CDS 12396 - 12587 266 ## gi|253580045|ref|ZP_04857312.1| predicted protein
14 9 Op 2 . + CDS 12607 - 13065 301 ## gi|253580046|ref|ZP_04857313.1| predicted protein
15 9 Op 3 . + CDS 13102 - 13686 506 ## gi|253580047|ref|ZP_04857314.1| predicted protein
+ Term 13812 - 13847 2.4
+ Prom 14134 - 14193 8.2
16 10 Tu 1 . + CDS 14245 - 14880 771 ## COG3601 Predicted membrane protein
+ Term 14931 - 14982 13.3
+ Prom 14919 - 14978 9.2
17 11 Op 1 . + CDS 15082 - 15375 366 ## EUBREC_2415 putative nucleotidyltransferase
18 11 Op 2 . + CDS 15362 - 15751 479 ## EUBREC_2414 nucleotidyltransferase
19 11 Op 3 . + CDS 15792 - 16406 621 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
20 11 Op 4 . + CDS 16479 - 18899 2685 ## COG1472 Beta-glucosidase-related glycosidases
21 11 Op 5 . + CDS 18936 - 19121 251 ## gi|253580053|ref|ZP_04857320.1| predicted protein
+ Term 19246 - 19294 -0.0
22 12 Tu 1 .
- CDS 19290 - 20630 1169 ## COG0534 Na+-driven multidrug efflux pump - Prom 20857 - 20916 6.2 Predicted protein(s) >gi|226332906|gb|ACII01000113.1| GENE 1 5 - 1168 967 387 aa, chain - ## HITS:1 COG:CAC2354 KEGG:ns NR:ns ## COG: CAC2354 COG0520 # Protein_GI_number: 15895621 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Clostridium acetobutylicum # 4 383 2 371 379 291 43.0 2e-78 MNKIYLDQASTSFPKAPSVAQAVYDYLAGSAVNVNRGGYRAAYSVEEQIFETREQLLKLF HFTSGKGKNVIFTENITASLNVLLKGLLKPGDHVLVTAMEHNAVMRPLVQLEKHGISFDR IPCASDGSLLLDKATALLRPETKLVVCLHASNVCGTVMPAQEIGDFCKEHGLLFILDTAQ TAGTINIDMEKCHIDALAFTGHKGLRGPQGTGGFLIRNKLAAQIEPLISGGTGSASHSEE VPAFLPDRFEPGTPNIPGILGLHAALCDLEKQSMEEIRQHELMLTQYFIDGILNLDPEEK FLRIIGKKNIENRCAVVSVQTLHMDMAQAAFELDSTYGIMTRVGLHCAPNAHKTLGTYPE GTLRFSFGPENTKQELDIALDALSALL >gi|226332906|gb|ACII01000113.1| GENE 2 1247 - 1651 65 134 aa, chain - ## HITS:1 COG:HP0437 KEGG:ns NR:ns ## COG: HP0437 COG1943 # Protein_GI_number: 15645065 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Helicobacter pylori 26695 # 5 134 11 142 142 113 44.0 1e-25 MDNRYYRHNRRKYSLKVHIVLVTKYLKQLLQGSIADDVKQKILDIANTRGYEIIAMETDK DHIHFLLSYDATDRVCDIVKIVKQETTYYLWQKYNSVLSKQYWKKKIFWSDGYFACSIGE VSSATIQKYIESQG >gi|226332906|gb|ACII01000113.1| GENE 3 1736 - 2875 727 379 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764777|ref|ZP_02171831.1| ribosomal protein L22 [Bacillus selenitireducens MLS10] # 150 369 1 220 225 284 60 3e-76 MEQITITAKVQIVATDTDKVLLNKTMSVYCDACNYVSDYVFRTHDLKQFSLNKILYSTLR EKFSLKSQMAQSVFKTVIARYKTILENQNEWIKPSFKKPQYDLVWNRDYSLTQNCFSVNT LNGRVKLPYFAEGMSKYFNHSIYKFGTAKLVNKHGKYYLHIPVTYEVEESNISDICNVVG IDRGINFVVATYDSKHKSGFVSGKAIKQKRANYSRLRKELQMRHTPSSRRRLKAIGQREN RWMQDINHQVSKALATGNPKHTLFVLEDLTGIRNVTERVKTKNRYVSVSWSFYDLEQKLI YKAKQNQSSVIKVDPRYTSQCCPACGHTEKSNRNKKIHLFTCKNCGYTSNDDRIGAMNLY RMGINYLADSQVPNTVVTE >gi|226332906|gb|ACII01000113.1| GENE 4 3305 - 4387 1612 360 aa, chain + ## HITS:1 COG:MK0262 
KEGG:ns NR:ns ## COG: MK0262 COG2391 # Protein_GI_number: 20093702 # Func_class: R General function prediction only # Function: Predicted transporter component # Organism: Methanopyrus kandleri AV19 # 188 320 2 146 157 59 29.0 8e-09 MKNEKGTIVLAGGVIGLIAAILVFFGNPANMGFCIACFLRDTTGALGLHSAAAVQYIRPE IIGLVLGACIISLVKKEFRPRGGSAPVTRFTLGAFVMIGCLMFLGCPFRMILRLAGGDGN ALFGLVGFVAGILTGTVFLKKGYTLKRSYKMPKLEGGIYPAFQIVMLLLLVAAPAFIHFT EPEGGPGAKHAAIIIALAAGIIVGILAQRTRLCMVGGIRDAVLFKEYKLLFGFAAILVTA LVMNLILGAVTGTSYFNPGFAGQPIAHTDGLWNALGMYLAGFGCILLGGCPLRQLILAGE GNTDSAVTVLGLMAGAAFAHNFGLASSGEGPTANGKIAVIIGIVVVAVIAAVNSMRKEEA >gi|226332906|gb|ACII01000113.1| GENE 5 4478 - 4684 396 68 aa, chain + ## HITS:1 COG:no KEGG:Taci_0251 NR:ns ## KEGG: Taci_0251 # Name: not_defined # Def: SirA family protein # Organism: T.acidaminovorans # Pathway: not_defined # 1 68 1 68 202 71 51.0 9e-12 MKKIDARGLSCPEPVIRAKNAMESGDKEYEILVDNVVAKENVSRFATHQGYQVQATEQGD DILLKLTR >gi|226332906|gb|ACII01000113.1| GENE 6 4681 - 4923 179 80 aa, chain + ## HITS:1 COG:no KEGG:Vpar_0344 NR:ns ## KEGG: Vpar_0344 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 2 80 1 82 82 71 47.0 1e-11 MITYIATFYSHYGAIQFRRNCQAMNLSAEVMPVPRDLSSSCGTCVRFHTEADFPEKTEEV EQIVRVEPQGYVGIYHADEE >gi|226332906|gb|ACII01000113.1| GENE 7 5214 - 6059 628 281 aa, chain - ## HITS:1 COG:CAC1850 KEGG:ns NR:ns ## COG: CAC1850 COG1737 # Protein_GI_number: 15895125 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 7 279 8 284 293 117 29.0 2e-26 MEVTMSLDQLIYEKIPSLSAGQRKVAEYILNNKDEFSYATLAKLSKDISVSETTVIRFAY SLGFDSFSAMQQKLREEILSVPQRNVEGQIQNQTFYQKVFSRETQALQDWISHINEELLD KTVEALLNADHILVTGARSSYHAANWFGNRLNLLLGNTHIIQEFYDPRFDLLNHITDKTV VISIAFARYTKWTYRYADSAKKMGATLVSITDSVSSPFFNISHYTILAAPVKDSMGFSSP ICMYCLFDAIISKIHDERKDMIVERLKQYEDTYRDFDMYYE >gi|226332906|gb|ACII01000113.1| GENE 8 6313 - 7737 906 474 aa, chain + ## HITS:1 COG:lin1811 KEGG:ns NR:ns ## COG: lin1811 COG1680 # 
Protein_GI_number: 16800879 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Listeria innocua # 60 307 56 321 323 108 27.0 2e-23 MIKNKPEAYGISSDNLLCMVQELETFVEDIHSISVVCDNDVIFSKCIQPYTEEPMQMLHS FSKSMNSIAVGIAIDEGKLHLDDLVIDYFKEYLPEEYDKRIDQLTVRNLLTMAANSCRLS TCFRGVTDSWITHYFTYKLPHDPGTVFQYDTGASYMLSSLVTKTMHKNVLALMKERVLKP MGITDIEWLESPEGNTVGGWGLYLKTPDIAKIAILLANMGKWNGKTLIPEEYLKEATRKQ IDTPEEKYPVCGYGYQYWITADHSFGVYGAFGNVIVVNPEKKLAVAITAGASDTNGNPNR LISKIVNEKLFIPTERGTLETDVDGEKKLKKYLDDLCLPYPEGAKNSCLEGKWFDKTLSF EENDREIQSIKINRKTENVLLINFYLKDKQVSVEAGFENWITQEAVFDDPHHFLNSFSYA FKDDKTLVIKQYWLNMSGYDTYTLHFQDGSVNGMIITSVKLGGAVPVELSGNIQ >gi|226332906|gb|ACII01000113.1| GENE 9 7750 - 9036 1087 428 aa, chain + ## HITS:1 COG:BH3897 KEGG:ns NR:ns ## COG: BH3897 COG2610 # Protein_GI_number: 15616459 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Bacillus halodurans # 5 424 1 422 427 264 37.0 2e-70 MDMSLLGIIVALVVLIIICYRKFNPVVGTLICVAILAIFSGLSVLDTITDTYFTGFSDFL KNNFLLFATGTVFASIMEGSGAAAAFAKMIYSKVGGRGAIYGCMLAVLILGYIGVNGWAL MFIAYPIFLCVFKQENLPRWLIPGVIYTSLAYNSSMFPGSPSILNVLPTQYLGTDTMAAS GLGIATGVFSSILCIIYLEYEFRKAKKNNDGFVITPDIAEKMKAFEELETVKPWRSVVPM VLLFVLLNVFKVNVNIAIILASFCCVILYWNTTPKKLNLIDDGVKRASMVIMNTAAVVGF GYVVKSTVGFQNLIDMLSNLGGNPLISFASATTLIAGATGSGSGGIGIAMEVFAQKYMDL GVNPEVLHRIAAIACNGLDTLPHNSMVITCLAACGMTHKESYKPIFITSVCITLIGLAFA VFLGIAFY >gi|226332906|gb|ACII01000113.1| GENE 10 9429 - 10418 547 329 aa, chain + ## HITS:1 COG:no KEGG:llmg_0703 NR:ns ## KEGG: llmg_0703 # Name: tnp1675 # Def: putative transposase # Organism: L.lactis_MG1363 # Pathway: not_defined # 1 324 11 328 439 175 34.0 2e-42 MNYSDSIKAILLAAINDLSKTPEKYAVKPGVDFIRNRKLGFKDYMLMFLTMEADCIREEL YRFFGRTIDAPSKAAFYRQRKKIREDAFRNLLLAFNRKLPKKLYNGKYEFWACDGSSCDI FLNPEDKDTYFEPNGKSTRGFNQIHINAMFSLFDKRFTDILVQPARKRNEYSAFCSMVDS ADIPEHYKVIFFGDRGYTSYNNFAHVIEKGQYFLIRCNDKRASGMMGYPVDTLPAFDEDI 
SLILTRSKAVSKYSRPELFSSYRYIYQNAPMDYLNDQRTEYDLALRLLRIQLDDGSYENI ATNLPEEEFKAEDFKALYHLRWGIMPISA >gi|226332906|gb|ACII01000113.1| GENE 11 10873 - 11448 415 191 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580043|ref|ZP_04857310.1| ## NR: gi|253580043|ref|ZP_04857310.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 191 1 191 191 368 100.0 1e-100 MQEVWARVILYNFSSSELNANENYTEFANKIALLAPGDIPTHGNSYDVEVTAPDGGVNMR CGAGVEYDKVLPDMIPNGTVLTVTQEAVASNGNSWGYTNYNGTYGWIALTQVTRYQEPTE GAPIPHTRYVINCNESITLRTNPDVNAAEICQIPLGTAVATFGDAGNGFISVYYQGSSGY CLASYLSAPVD >gi|226332906|gb|ACII01000113.1| GENE 12 11578 - 12027 464 149 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0193 NR:ns ## KEGG: EUBREC_0193 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 9 148 1 174 295 76 29.0 3e-13 MGLIAISSLMLLGGMYLMIAFAWWLLQIIANWRIFTKAGEAGWKSIIPIYGDYISYKIAW QPAYFWLTLVLGIVSSCLQGTLETNDSLMISMIIVLIKIILAIISIMYSVKLARAFGRGT GFAIGLIFLPPIFMLILGFGDDRYYGADK >gi|226332906|gb|ACII01000113.1| GENE 13 12396 - 12587 266 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580045|ref|ZP_04857312.1| ## NR: gi|253580045|ref|ZP_04857312.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 63 4 66 66 77 100.0 2e-13 MAAILMIAYIAAAYWATNKIWWSKKVYVYSNATYFYMTRFIICMVAGFVTIPIALLQTLV GKN >gi|226332906|gb|ACII01000113.1| GENE 14 12607 - 13065 301 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580046|ref|ZP_04857313.1| ## NR: gi|253580046|ref|ZP_04857313.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 152 1 152 152 288 100.0 5e-77 MEALETQAQATQEKENAVTKQNMKYTMSSSRGIYLSWLTGRIYSTILADHEKLTIDIKPV KKNMIPVIYYEDITAIFMNYKIPGYYIFFICLAVISCFSNPGMIVFVFLFIWAGSNRKIT ICLRSGNKAIIYARRKKIATAFVEDIKERAKI >gi|226332906|gb|ACII01000113.1| GENE 15 13102 - 13686 506 194 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580047|ref|ZP_04857314.1| ## NR: gi|253580047|ref|ZP_04857314.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 194 1 194 194 369 100.0 1e-101 MAAKTIMKRIAIAALSAVILQGVTPLDSLNVYASENTGASYTALDLYTYIQDQCSTEYTL TEKAETFLEEHDDFFPASPELVIPDEMIDAELDFRHINKNPSRYGDKLMRISDAYVIQVQ EQEMEEGHYLTWLNLIDGEEQQYSVYYNGELDDVFEDDTVEVTGLPLGTSSFENTEGGDT LVVVLAGCRVNNID >gi|226332906|gb|ACII01000113.1| GENE 16 14245 - 14880 771 211 aa, chain + ## HITS:1 COG:CAC2841 KEGG:ns NR:ns ## COG: CAC2841 COG3601 # Protein_GI_number: 15896096 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 12 208 1 192 209 134 39.0 1e-31 MSEQVLRNGNAMARSKTRTVTQIAMLGAIAGILMNLEFPLPFLAPTFYQLDFSEIPVLVG SFAMGPIAGVLIELVKILVHLVTKGSMTAGVGDVANFIFGCTFAVPAGLIYRYKSVKSRK HAVIGMVVGTVLTAVVACFINAFVLLPAYGKAFGMPVEAFIEMGSAVHSSVNNLLTFAAM IIFPFNIFKYALTSLIVFLIYKRIRVVLKGD >gi|226332906|gb|ACII01000113.1| GENE 17 15082 - 15375 366 97 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2415 NR:ns ## KEGG: EUBREC_2415 # Name: not_defined # Def: putative nucleotidyltransferase # Organism: E.rectale # Pathway: not_defined # 4 97 21 114 114 106 62.0 2e-22 MNDTGIRPEVIEEIRNLAQKYDIEKVILFGSRARGDFRRTSDIDIAVTGGDFARFALDVD EETSTLLEYDIVNLDRDMQDELRESIEKEGRILYEKI >gi|226332906|gb|ACII01000113.1| GENE 18 15362 - 15751 479 129 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2414 NR:ns ## KEGG: EUBREC_2414 # Name: not_defined # Def: nucleotidyltransferase # Organism: E.rectale # Pathway: not_defined # 1 129 1 129 129 172 66.0 3e-42 MRKYENFCNALSNMKDIYNYEEPYDNVVITGLVGLYEICFEQSWKMMKEILEIHGYAEGA TGSPKIILKTAYKAGMIRDEEQWLRALQARNNVTHSYNQKIALGIIADAKEEFYQMFCEL KTEIEENWL >gi|226332906|gb|ACII01000113.1| GENE 19 15792 - 16406 621 204 aa, chain + ## HITS:1 COG:CAC3601 KEGG:ns NR:ns ## COG: CAC3601 COG0494 # Protein_GI_number: 15896835 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Clostridium acetobutylicum # 3 177 5 179 202 103 33.0 3e-22 
MGQIHNIEKLTDCKFLNLYHLNATSIHNTPVSYFVASRAKSINELKIKTGKNTPDGVIIY SLYGEKRDRVVLVRQYRYAIGGYIYEFPAGLVESNEEFHEGAVREMYEETGLKFTPLKVD PAFEKPYFTTVGMTDESCATVYGYAEGEISKDAQEDTEEIEVVLADREEVKRILREENVA IMCAYQLMHFLQDEEPFAFLKEEL >gi|226332906|gb|ACII01000113.1| GENE 20 16479 - 18899 2685 806 aa, chain + ## HITS:1 COG:TM0025 KEGG:ns NR:ns ## COG: TM0025 COG1472 # Protein_GI_number: 15642800 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 526 803 4 305 721 258 46.0 5e-68 MAKWQRSFFQPVLPLGEDGRRVTGSKEHIALSRMAAGEGMVLLKNEKNTLPIRRGTKVAF FGKGTVDYVKGGGGSGDVTVEYIRNLYEGMKIKEDEGKVEVFDKLAKYYEKDIQKQYADG AVPGMTVEPELPDELLNEAREYTDTAVITICRFSGEGWDRKCEAAQDGYVLDGEEKRNSE LSAKIFENGDFCLTNAENAMVEKVKKAFPHVIVVMNVGGIVDTKWFRDCDEIQSVLMAWQ GGMEGGLATADILCGDVNPSGKLSDTYAKDLEDYPSTANFHESAFYVDYTEDIYVGYRYF ETIPGAAERVNYPFGFGLSYTDFDWKMTGASEENGVITVLTEVTNTGKTAGKEVIQLYYG APQGKLGKPAKVLGAFKKTSLLQPGERQILTLKIPVDQMASYDDLGKVCRSAYVLEAGEY AIYIGTNVRDAAKIDFTYVVKEDTVTEQLSRKAAPYHLQKRMLADGSYEELPQREYVEEE GLPRQDKYAIGLPCPDTRGQKGIDFLDFLDSKGVRFSDVADGKMTLDEFMDILTLDDCIN LLGGQPNTGCANTFGMGNLPEYGVPNVMTADGPAGLRILPKCGVNTTAWPCATLLASTWD EELVEKVGKAGAEEVKENNISIWLTPACNIHRSPLCGRNFEYYSEDPYLAGKTGAAMVRG IQSQHIGASVKHFAANNKETNRKDSDSRVSERALREIYLKQFEIIVKEAHPYTIMSSYNL INGIHASENKELLTGILRDEWGFDGMVTTDWWTFGEHYRETKAGNDIKMAAGYPERIKEA YEKGFITEGEIRLSARRILNMILKID >gi|226332906|gb|ACII01000113.1| GENE 21 18936 - 19121 251 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580053|ref|ZP_04857320.1| ## NR: gi|253580053|ref|ZP_04857320.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 61 1 61 61 96 100.0 5e-19 MMDERKRSLRKEIVRLALPIALQQFMTALVGACDAIMLGKLSQDAMSAVSLFLWDASAHL S >gi|226332906|gb|ACII01000113.1| GENE 22 19290 - 20630 1169 446 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 5 433 2 428 447 243 35.0 8e-64 MKTKIQDMTSGSPGRLIILFAIPLMLGNICQQLYTMVDTMVVGQVAGVEALAALGAVDFL MWVVTGISTGLTQGFSIQLSQYYGAKDFENLRKSLAHSYRLTAFIAAGVLILSQSFASLV LTGLHTPSNIIGMSLLYLRIIFCGIPATAAYNMFASALRAMGNSKTPLTAMIIASVLNVS LDILFVAGFGWGVAGAAIATVIAQSFSAVYCFLILRRIDIVHLTRADFMPASGMNARLMK LGIPVVFQNIIIGVGGLVVQYVINGYGFLFVAGFTATNKLYGLLEMAAISYGYAIVTYVG QNLGARKIDRIRKGVRSSMLLSLLTSLIISAAMFLFGKNILSLFISGEPQQTQQVLAIAF KYLSIMAAMLWVLYFLYVYRSALQGLGDTLMPMVSGMAEFVMRISAALILPHFIGQDGIF FAEIAAWSGATVILCISYYVRMHKYH Prediction of potential genes in microbial genomes Time: Sat May 28 20:23:48 2011 Seq name: gi|226332905|gb|ACII01000114.1| Ruminococcus sp. 5_1_39B_FAA cont1.114, whole genome shotgun sequence Length of sequence - 180068 bp Number of predicted genes - 170, with homology - 167 Number of transcription units - 81, operones - 46 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 79 - 138 5.6 1 1 Tu 1 . + CDS 171 - 1283 1141 ## COG1940 Transcriptional regulator/sugar kinase + Term 1347 - 1407 14.3 + Prom 1448 - 1507 8.2 2 2 Op 1 . + CDS 1625 - 2653 1218 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 3 2 Op 2 . + CDS 2767 - 4131 965 ## COG2211 Na+/melibiose symporter and related transporters + Term 4165 - 4219 12.9 + Prom 4204 - 4263 6.8 4 3 Tu 1 . + CDS 4321 - 5622 1259 ## COG0534 Na+-driven multidrug efflux pump 5 4 Tu 1 . - CDS 5624 - 6139 510 ## COG0350 Methylated DNA-protein cysteine methyltransferase - Prom 6211 - 6270 10.8 + Prom 6175 - 6234 5.9 6 5 Tu 1 . + CDS 6380 - 7198 777 ## COG0566 rRNA methylases + Term 7315 - 7361 -0.8 7 6 Tu 1 . 
- CDS 7260 - 7580 254 ## COG0784 FOG: CheY-like receiver - Prom 7642 - 7701 6.7 + Prom 7675 - 7734 4.6 8 7 Tu 1 . + CDS 7767 - 8138 221 ## gi|253580063|ref|ZP_04857330.1| conserved hypothetical protein + Prom 8157 - 8216 5.5 9 8 Op 1 . + CDS 8263 - 8700 244 ## COG1661 Predicted DNA-binding protein with PD1-like DNA-binding motif 10 8 Op 2 . + CDS 8815 - 9306 607 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 9374 - 9424 5.2 + Prom 9395 - 9454 5.5 11 9 Tu 1 . + CDS 9588 - 10235 456 ## CTC00617 hypothetical protein + Term 10245 - 10280 3.4 - Term 10233 - 10269 2.2 12 10 Tu 1 . - CDS 10307 - 11476 1131 ## COG0673 Predicted dehydrogenases and related proteins - Prom 11574 - 11633 5.9 13 11 Op 1 35/0.000 + CDS 11835 - 13103 1562 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 13137 - 13180 4.3 14 11 Op 2 38/0.000 + CDS 13192 - 14100 809 ## COG1175 ABC-type sugar transport systems, permease components 15 11 Op 3 . + CDS 14120 - 14959 752 ## COG0395 ABC-type sugar transport system, permease component 16 11 Op 4 7/0.000 + CDS 14995 - 16815 1479 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 17 11 Op 5 . + CDS 16870 - 18354 1098 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 18356 - 18415 5.4 18 12 Op 1 1/0.133 + CDS 18451 - 19035 198 ## PROTEIN SUPPORTED gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 19 12 Op 2 . + CDS 19019 - 19417 223 ## COG4357 Uncharacterized conserved protein + Prom 19425 - 19484 7.6 20 13 Op 1 . + CDS 19553 - 20335 741 ## COG0428 Predicted divalent heavy-metal cations transporter + Prom 20375 - 20434 5.7 21 13 Op 2 . + CDS 20481 - 20918 333 ## COG1661 Predicted DNA-binding protein with PD1-like DNA-binding motif + Term 20961 - 21023 13.0 + Prom 21059 - 21118 10.1 22 14 Op 1 . + CDS 21218 - 21757 343 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 23 14 Op 2 . 
+ CDS 21754 - 22548 539 ## CTC00617 hypothetical protein + Term 22557 - 22596 4.4 24 15 Op 1 . - CDS 22567 - 22881 257 ## gi|253580079|ref|ZP_04857346.1| predicted protein - Prom 23030 - 23089 7.2 25 15 Op 2 . - CDS 23109 - 23303 141 ## gi|253580080|ref|ZP_04857347.1| predicted protein - Prom 23339 - 23398 10.0 + Prom 23631 - 23690 8.7 26 16 Tu 1 . + CDS 23716 - 25410 1923 ## COG1966 Carbon starvation protein, predicted membrane protein + Term 25464 - 25504 8.3 - Term 25443 - 25500 6.6 27 17 Tu 1 . - CDS 25504 - 26415 960 ## COG2221 Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits - Prom 26471 - 26530 7.6 + Prom 26452 - 26511 8.5 28 18 Tu 1 . + CDS 26558 - 27517 666 ## COG0583 Transcriptional regulator + Term 27723 - 27760 8.2 - Term 27711 - 27747 5.5 29 19 Tu 1 . - CDS 27802 - 27885 61 ## + Prom 28273 - 28332 5.6 30 20 Tu 1 . + CDS 28390 - 30111 1172 ## COG1404 Subtilisin-like serine proteases - Term 30534 - 30572 2.2 31 21 Op 1 . - CDS 30681 - 31307 255 ## gi|253580087|ref|ZP_04857354.1| conserved hypothetical protein 32 21 Op 2 8/0.000 - CDS 31304 - 32161 442 ## COG1131 ABC-type multidrug transport system, ATPase component 33 21 Op 3 . - CDS 32173 - 32544 199 ## COG1725 Predicted transcriptional regulators - Prom 32643 - 32702 4.5 + Prom 32600 - 32659 4.4 34 22 Tu 1 . + CDS 32749 - 32910 128 ## gi|253580090|ref|ZP_04857357.1| predicted protein 35 23 Tu 1 . + CDS 33041 - 33286 298 ## gi|253580091|ref|ZP_04857358.1| conserved hypothetical protein + Prom 33330 - 33389 3.7 36 24 Op 1 2/0.067 + CDS 33416 - 35089 946 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 37 24 Op 2 2/0.067 + CDS 35090 - 36745 975 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 38 24 Op 3 . + CDS 36738 - 38291 994 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Prom 38299 - 38358 11.3 39 25 Op 1 . + CDS 38445 - 39737 454 ## COG1106 Predicted ATPases 40 25 Op 2 . 
+ CDS 39715 - 40377 426 ## Cthe_2133 hypothetical protein + Term 40528 - 40570 2.2 + Prom 40538 - 40597 3.2 41 26 Op 1 . + CDS 40623 - 41321 597 ## COG1489 DNA-binding protein, stimulates sugar fermentation 42 26 Op 2 . + CDS 41364 - 41522 135 ## gi|253580098|ref|ZP_04857365.1| predicted protein + Prom 41559 - 41618 5.8 43 27 Op 1 . + CDS 41693 - 42526 956 ## gi|253580099|ref|ZP_04857366.1| conserved hypothetical protein 44 27 Op 2 . + CDS 42594 - 43712 594 ## Shel_09730 hypothetical protein + Term 43724 - 43761 6.4 + Prom 43770 - 43829 4.8 45 28 Op 1 . + CDS 44029 - 45597 589 ## FMG_1388 hypothetical protein 46 28 Op 2 . + CDS 45552 - 46937 440 ## FMG_1389 putative two-component sensor histidine kinase 47 28 Op 3 . + CDS 46885 - 47511 465 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 48 28 Op 4 . + CDS 47600 - 48499 865 ## gi|253580104|ref|ZP_04857371.1| conserved hypothetical protein 49 28 Op 5 . + CDS 48558 - 49082 423 ## gi|253580105|ref|ZP_04857372.1| predicted protein + Term 49159 - 49190 3.4 - Term 49131 - 49189 8.8 50 29 Tu 1 . - CDS 49196 - 49801 454 ## gi|253580106|ref|ZP_04857373.1| predicted protein + Prom 50187 - 50246 3.7 51 30 Op 1 . + CDS 50274 - 52238 2144 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC 52 30 Op 2 . + CDS 52240 - 52860 732 ## CDR20291_0648 hypothetical protein 53 30 Op 3 . + CDS 52886 - 53764 995 ## COG0685 5,10-methylenetetrahydrofolate reductase + Term 53824 - 53890 12.6 - Term 54095 - 54160 18.4 54 31 Tu 1 . - CDS 54200 - 56161 2364 ## COG0441 Threonyl-tRNA synthetase - Prom 56245 - 56304 6.0 + Prom 56604 - 56663 6.5 55 32 Op 1 . + CDS 56701 - 57756 912 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 56 32 Op 2 . 
+ CDS 57772 - 58272 465 ## COG2426 Predicted membrane protein + Term 58311 - 58362 17.1 + Prom 58607 - 58666 8.0 57 33 Op 1 9/0.000 + CDS 58711 - 60108 1507 ## COG1158 Transcription termination factor + Prom 60169 - 60228 8.2 58 33 Op 2 . + CDS 60297 - 60500 346 ## PROTEIN SUPPORTED gi|240147058|ref|ZP_04745659.1| large subunit ribosomal protein L31 + Term 60522 - 60563 6.5 59 34 Op 1 1/0.133 + CDS 60575 - 61543 854 ## COG3872 Predicted metal-dependent enzyme 60 34 Op 2 32/0.000 + CDS 61531 - 62505 401 ## PROTEIN SUPPORTED gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase 61 34 Op 3 . + CDS 62498 - 63574 1297 ## COG0216 Protein chain release factor A 62 34 Op 4 . + CDS 63655 - 64572 1147 ## Cphy_0247 hypothetical protein 63 35 Op 1 . + CDS 64704 - 66050 1532 ## COG1109 Phosphomannomutase 64 35 Op 2 . + CDS 66071 - 67120 1270 ## EUBREC_2423 hypothetical protein 65 35 Op 3 . + CDS 67145 - 68380 633 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 + Term 68405 - 68462 4.0 + Prom 68437 - 68496 4.9 66 36 Op 1 . + CDS 68596 - 70974 2562 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase 67 36 Op 2 . + CDS 71009 - 71437 490 ## COG4747 ACT domain-containing protein + Term 71453 - 71509 9.2 + Prom 71500 - 71559 7.5 68 37 Op 1 2/0.067 + CDS 71603 - 72037 563 ## COG1959 Predicted transcriptional regulator + Prom 72050 - 72109 10.5 69 37 Op 2 . + CDS 72129 - 73058 849 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 + Term 73061 - 73102 8.3 + Prom 73069 - 73128 14.0 70 38 Op 1 . + CDS 73242 - 74324 1026 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain + Prom 74403 - 74462 5.8 71 38 Op 2 . + CDS 74594 - 75907 1542 ## COG2873 O-acetylhomoserine sulfhydrylase + Term 75957 - 76008 6.4 + Prom 76007 - 76066 7.3 72 39 Op 1 . + CDS 76133 - 76987 805 ## EUBELI_00615 hypothetical protein 73 39 Op 2 . 
+ CDS 77042 - 77464 447 ## COG0735 Fe2+/Zn2+ uptake regulation proteins + Prom 77535 - 77594 5.1 74 40 Tu 1 . + CDS 77648 - 78616 620 ## gi|253580129|ref|ZP_04857396.1| conserved hypothetical protein + Term 78629 - 78667 6.6 + Prom 78664 - 78723 3.1 75 41 Op 1 . + CDS 78749 - 79474 848 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) 76 41 Op 2 . + CDS 79464 - 79781 424 ## Sterm_0649 branched-chain amino acid transport + Term 79796 - 79842 6.1 77 42 Tu 1 . - CDS 79909 - 80262 430 ## COG0253 Diaminopimelate epimerase - Prom 80305 - 80364 1.8 78 43 Tu 1 . - CDS 80377 - 80760 606 ## COG0253 Diaminopimelate epimerase - Prom 80918 - 80977 8.2 + Prom 80937 - 80996 6.8 79 44 Op 1 31/0.000 + CDS 81096 - 81884 1243 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 81906 - 81945 4.1 80 44 Op 2 34/0.000 + CDS 81967 - 82662 950 ## COG0765 ABC-type amino acid transport system, permease component 81 44 Op 3 . + CDS 82652 - 83392 587 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 82 44 Op 4 . + CDS 83394 - 84344 717 ## COG0385 Predicted Na+-dependent transporter + Term 84380 - 84428 13.1 - Term 84365 - 84419 15.2 83 45 Op 1 . - CDS 84423 - 84848 409 ## COG0517 FOG: CBS domain - Prom 84890 - 84949 1.9 84 45 Op 2 . - CDS 84957 - 85964 1138 ## COG2502 Asparagine synthetase A - Prom 86056 - 86115 8.2 + Prom 86015 - 86074 6.0 85 46 Op 1 . + CDS 86253 - 87242 631 ## gi|253580138|ref|ZP_04857405.1| predicted protein + Prom 87258 - 87317 2.7 86 46 Op 2 . + CDS 87348 - 89498 2369 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase + Term 89552 - 89606 7.5 - Term 89603 - 89658 13.0 87 47 Tu 1 . - CDS 89662 - 90594 1194 ## COG0549 Carbamate kinase - Prom 90622 - 90681 5.8 88 48 Tu 1 . + CDS 91286 - 93097 1195 ## gi|253580141|ref|ZP_04857408.1| conserved hypothetical protein + Term 93125 - 93164 1.1 + Prom 93216 - 93275 2.1 89 49 Op 1 . 
+ CDS 93394 - 94851 1647 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 90 49 Op 2 4/0.067 + CDS 94878 - 97097 1564 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase + Prom 97168 - 97227 2.6 91 50 Op 1 26/0.000 + CDS 97267 - 97743 848 ## COG0242 N-formylmethionyl-tRNA deformylase 92 50 Op 2 1/0.133 + CDS 97751 - 98698 1076 ## COG0223 Methionyl-tRNA formyltransferase 93 50 Op 3 2/0.067 + CDS 98703 - 99398 751 ## COG2738 Predicted Zn-dependent protease 94 50 Op 4 4/0.067 + CDS 99391 - 100743 1097 ## COG0144 tRNA and rRNA cytosine-C5-methylases 95 50 Op 5 5/0.000 + CDS 100747 - 101787 1111 ## COG0820 Predicted Fe-S-cluster redox enzyme 96 50 Op 6 17/0.000 + CDS 101789 - 102535 750 ## COG0631 Serine/threonine protein phosphatase 97 50 Op 7 7/0.000 + CDS 102529 - 104712 2239 ## COG0515 Serine/threonine protein kinase 98 50 Op 8 10/0.000 + CDS 104716 - 105594 844 ## COG1162 Predicted GTPases 99 50 Op 9 6/0.000 + CDS 105597 - 106268 840 ## COG0036 Pentose-5-phosphate-3-epimerase 100 50 Op 10 . + CDS 106277 - 107008 662 ## COG1564 Thiamine pyrophosphokinase + Prom 107047 - 107106 3.5 101 51 Op 1 . + CDS 107129 - 108226 1494 ## COG0012 Predicted GTPase, probable translation factor 102 51 Op 2 . + CDS 108247 - 109269 515 ## COG0682 Prolipoprotein diacylglyceryltransferase + Term 109293 - 109332 5.2 + Prom 109390 - 109449 8.2 103 52 Op 1 29/0.000 + CDS 109563 - 109994 392 ## COG2001 Uncharacterized protein conserved in bacteria 104 52 Op 2 . + CDS 110045 - 110980 695 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 105 52 Op 3 . 
+ CDS 110996 - 111574 481 ## EUBREC_2253 hypothetical protein 106 52 Op 4 3/0.067 + CDS 111640 - 113823 2139 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 + Term 113839 - 113897 -0.3 + Prom 113842 - 113901 3.8 107 53 Tu 1 4/0.067 + CDS 113959 - 115728 1521 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 + Prom 115730 - 115789 2.4 108 54 Op 1 28/0.000 + CDS 115837 - 116793 1076 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 109 54 Op 2 25/0.000 + CDS 116848 - 118203 1814 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 110 54 Op 3 . + CDS 118290 - 119387 1045 ## COG0772 Bacterial cell division membrane protein 111 54 Op 4 . + CDS 119468 - 120751 793 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase 112 54 Op 5 . + CDS 120792 - 121910 1182 ## EUBELI_00732 hypothetical protein + Prom 121995 - 122054 7.9 113 55 Tu 1 . + CDS 122116 - 123306 1428 ## COG0206 Cell division GTPase + Term 123349 - 123391 0.0 - Term 123337 - 123379 6.2 114 56 Tu 1 . - CDS 123453 - 123782 243 ## gi|253580167|ref|ZP_04857434.1| conserved hypothetical protein - Prom 123889 - 123948 6.4 115 57 Op 1 . + CDS 124236 - 124730 402 ## PROTEIN SUPPORTED gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 116 57 Op 2 . + CDS 124766 - 124963 242 ## PROTEIN SUPPORTED gi|228582464|ref|YP_002853165.1| ribosomal protein L35 117 57 Op 3 . 
+ CDS 125011 - 125367 540 ## PROTEIN SUPPORTED gi|240146873|ref|ZP_04745474.1| ribosomal protein L20 + Term 125419 - 125467 12.4 + Prom 125533 - 125592 8.4 118 58 Op 1 20/0.000 + CDS 125803 - 126957 1710 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component + Term 127026 - 127089 6.4 + Prom 127125 - 127184 4.7 119 58 Op 2 24/0.000 + CDS 127206 - 128093 1232 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 120 58 Op 3 19/0.000 + CDS 128104 - 129159 1430 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 121 58 Op 4 18/0.000 + CDS 129163 - 129921 255 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 122 58 Op 5 . + CDS 129934 - 130644 296 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 123 58 Op 6 . + CDS 130659 - 131282 743 ## Cphy_3531 hypothetical protein + Term 131289 - 131327 7.1 - Term 131277 - 131314 2.3 124 59 Tu 1 . - CDS 131324 - 132643 737 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase - Prom 132784 - 132843 11.1 125 60 Op 1 . - CDS 132845 - 133357 300 ## EUBREC_1481 hypothetical protein 126 60 Op 2 . - CDS 133394 - 134320 612 ## COG1242 Predicted Fe-S oxidoreductase 127 60 Op 3 . - CDS 134317 - 135024 620 ## COG5578 Predicted integral membrane protein - Prom 135053 - 135112 4.8 128 61 Op 1 . - CDS 135149 - 135556 449 ## COG0726 Predicted xylanase/chitin deacetylase 129 61 Op 2 . - CDS 135582 - 135902 186 ## gi|253580180|ref|ZP_04857447.1| conserved hypothetical protein - Prom 135999 - 136058 8.2 + Prom 135999 - 136058 5.3 130 62 Op 1 1/0.133 + CDS 136160 - 137113 855 ## COG1879 ABC-type sugar transport system, periplasmic component 131 62 Op 2 7/0.000 + CDS 137110 - 138591 1415 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 132 62 Op 3 . 
+ CDS 138618 - 140213 1450 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 133 62 Op 4 1/0.133 + CDS 140213 - 141310 1222 ## COG4213 ABC-type xylose transport system, periplasmic component + Term 141384 - 141429 6.4 + Prom 141378 - 141437 4.9 134 63 Op 1 11/0.000 + CDS 141528 - 142595 1401 ## COG4213 ABC-type xylose transport system, periplasmic component + Term 142609 - 142647 5.9 + Prom 142660 - 142719 3.6 135 63 Op 2 11/0.000 + CDS 142739 - 144277 192 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 136 63 Op 3 . + CDS 144283 - 145455 1463 ## COG4214 ABC-type xylose transport system, permease component + Term 145491 - 145528 8.2 137 64 Tu 1 . - CDS 145336 - 145620 123 ## - Prom 145737 - 145796 4.9 + Prom 145845 - 145904 4.7 138 65 Tu 1 . + CDS 145935 - 146657 857 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family + Prom 146760 - 146819 8.5 139 66 Tu 1 . + CDS 146895 - 147200 377 ## COG4496 Uncharacterized protein conserved in bacteria + Term 147237 - 147271 6.0 + Prom 147365 - 147424 7.5 140 67 Tu 1 . + CDS 147598 - 150780 2775 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 150788 - 150855 4.4 + Prom 150826 - 150885 5.8 141 68 Op 1 . + CDS 151012 - 152007 1200 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 142 68 Op 2 . + CDS 152060 - 152971 735 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 153072 - 153125 7.4 + Prom 152979 - 153038 4.0 143 69 Tu 1 . + CDS 153146 - 155368 1965 ## COG3345 Alpha-galactosidase + Term 155443 - 155485 -0.8 144 70 Tu 1 . 
- CDS 155452 - 156357 859 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 156489 - 156548 6.0 + Prom 156501 - 156560 6.8 145 71 Op 1 35/0.000 + CDS 156601 - 157866 1466 ## COG1653 ABC-type sugar transport system, periplasmic component + Prom 157916 - 157975 2.3 146 71 Op 2 38/0.000 + CDS 158032 - 158925 898 ## COG1175 ABC-type sugar transport systems, permease components 147 71 Op 3 . + CDS 158938 - 159762 707 ## COG0395 ABC-type sugar transport system, permease component + Term 159872 - 159917 1.7 + Prom 159797 - 159856 4.8 148 72 Op 1 . + CDS 159986 - 160651 592 ## Cphy_1550 hypothetical protein + Prom 160655 - 160714 4.8 149 72 Op 2 . + CDS 160739 - 162940 2476 ## COG3345 Alpha-galactosidase + Term 162965 - 163024 6.7 - Term 162958 - 163004 7.1 150 73 Tu 1 . - CDS 163016 - 163633 622 ## EUBREC_1394 hypothetical protein - Prom 163669 - 163728 6.5 + Prom 163713 - 163772 10.1 151 74 Tu 1 . + CDS 163928 - 166162 2160 ## COG0210 Superfamily I DNA and RNA helicases + Term 166184 - 166214 0.2 - Term 166162 - 166201 -0.9 152 75 Tu 1 . - CDS 166296 - 166979 692 ## COG1011 Predicted hydrolase (HAD superfamily) - Prom 167135 - 167194 9.1 + Prom 167124 - 167183 7.5 153 76 Op 1 . + CDS 167268 - 167489 389 ## + Term 167516 - 167551 4.1 154 76 Op 2 . 
+ CDS 167556 - 168920 1161 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 155 76 Op 3 1/0.133 + CDS 168963 - 169205 106 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 156 76 Op 4 40/0.000 + CDS 169223 - 169912 509 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 169925 - 169993 3.1 + Prom 169938 - 169997 4.2 157 76 Op 5 4/0.067 + CDS 170032 - 171153 638 ## COG0642 Signal transduction histidine kinase + Term 171163 - 171210 14.5 + Prom 171185 - 171244 5.3 158 77 Op 1 36/0.000 + CDS 171267 - 171941 329 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 159 77 Op 2 . + CDS 172004 - 174382 1479 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 174450 - 174493 1.5 + Prom 174416 - 174475 8.8 160 78 Op 1 . + CDS 174609 - 174992 283 ## Mbar_A1092 hypothetical protein 161 78 Op 2 . + CDS 175022 - 175234 271 ## EUBREC_0277 hypothetical protein + Term 175307 - 175351 4.1 - Term 175294 - 175339 8.1 162 79 Op 1 . - CDS 175359 - 175589 170 ## gi|291520905|emb|CBK79198.1| hypothetical protein 163 79 Op 2 9/0.000 - CDS 175625 - 175906 327 ## COG3041 Uncharacterized protein conserved in bacteria 164 79 Op 3 . - CDS 175903 - 176175 374 ## COG3077 DNA-damage-inducible protein J - Prom 176195 - 176254 4.2 165 80 Op 1 . - CDS 176287 - 176610 211 ## gi|253580213|ref|ZP_04857480.1| conserved hypothetical protein 166 80 Op 2 . - CDS 176700 - 177137 326 ## COG1959 Predicted transcriptional regulator - Prom 177221 - 177280 11.9 + Prom 177196 - 177255 8.4 167 81 Op 1 . + CDS 177345 - 178187 707 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 168 81 Op 2 . + CDS 178236 - 179213 539 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 169 81 Op 3 . 
+ CDS 179206 - 179496 93 ## COG1943 Transposase and inactivated derivatives 170 81 Op 4 . + CDS 179521 - 180067 238 ## Teth514_2350 IS605 family transposase OrfB Predicted protein(s) >gi|226332905|gb|ACII01000114.1| GENE 1 171 - 1283 1141 370 aa, chain + ## HITS:1 COG:CAC3673 KEGG:ns NR:ns ## COG: CAC3673 COG1940 # Protein_GI_number: 15896905 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Clostridium acetobutylicum # 9 359 8 370 385 173 32.0 5e-43 MSGLEKKFATKSKIVNYIINRESTSKVEISKELNLSMPTVLSNVKDLLEKGIIIETGEYE STGGRKAKSIGINPSYRYAMGIVITANHLGMVLVNLKYEIVKSERIRLKFVSDMSYCSQV ADYATKFLDHMNDAEQKEKLLGVGISIPGIINQEQKLVIKSHALKLENYSLSFLEQAFSV PVYFSNDANAAMMAEDMNIYQNAIYLSLNQTLGGAFCIGGNLFSGENQKAGEFGHMILVP QGRKCYCGKSGCADAYCAAGALVGESKDSVEQFMQLLQNNDEKAEKKWEEYLDYLAVLIS NLRMAYDMDIILGGEMGGYLSDYMIPLGEKVMKYNGFDHDLRYLKNCSYKKEASAVGAAK HFLQDFIGKI >gi|226332905|gb|ACII01000114.1| GENE 2 1625 - 2653 1218 342 aa, chain + ## HITS:1 COG:YJR159w KEGG:ns NR:ns ## COG: YJR159w COG1063 # Protein_GI_number: 6322619 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Saccharomyces cerevisiae # 5 337 10 353 357 201 35.0 1e-51 MLQQVMTNPGEIIFREVPVPEIKDNQVLVKIMNIGICGSDIHVYHGKHPFTKYPVTQGHE VSGEVVKTGDAVTEFHVGQKVTIEPQVYCGHCYPCRHGKYNLCEELKVMGFQTTGTASEY FAVDASKVTPIPEDMSYEEGAMIEPLAVAVHGVKQMGDVTGMNIVVIGAGPIGNLVAQSA KGMGAAKVMITDVSDLRLEKAKECGIDVCVNTRNKNFGEAMVEAFGPDKADVIYDCAGNN ITMGQAIKYARKGSVIVLVAVFAGMAEIDLAVANDHELDIKSTMMYRHDDYVDGIRLVNE GKVHLRPLISKTFAFKDYLKAYQYIDDNRETTMKVIINVQEK >gi|226332905|gb|ACII01000114.1| GENE 3 2767 - 4131 965 454 aa, chain + ## HITS:1 COG:BS_ynaJ KEGG:ns NR:ns ## COG: BS_ynaJ COG2211 # Protein_GI_number: 16078820 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 5 454 3 452 463 222 34.0 
1e-57 MESKNENVKVPLISKIAYGFGDVGCNFSWMFVSNFLMIFYTDVFGISMAAVSALMLFSRF WDAINDPIVGGLTDKTKTKWGRYRPWLLIAAPITAVLLIMTFWAHPDWSDRSKVIYMVIT YCLLVLGYTCVNIPYGTLCGAMTQDIDERAKINTSRSVSAMVAIGIINIITVPLIGKLGS QSAKTGYLLVAIIYGCIFAACHFFCFAKTKEQVIMPEKEKISIKVQLRAVMQNRPYILAL IGQVLFGFTLYGRNADVLYYFTYVEGNASYYTTYSMCIIIPSIIGAACFQPVFRKLNNKG RTASIFALLTGISMLCMFFFNVKETPAAFYTLAGITQFFFSGFNTAIYAIIPDCVEYGEW KTGLRNDGFQYAFVSLGNKIGMAIGTALLAALLGKYGYVANQVQNPAVLSIMRHSFTTIP GVLWIVTAIVLFFYRLNKKRYNEIVEDLKKNRVK >gi|226332905|gb|ACII01000114.1| GENE 4 4321 - 5622 1259 433 aa, chain + ## HITS:1 COG:FN1469 KEGG:ns NR:ns ## COG: FN1469 COG0534 # Protein_GI_number: 19704801 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 8 428 13 436 440 162 29.0 1e-39 MLKRFTRYVTQSVAGMIGISVYVLADTFFISVYSGADGLAVLNLILPVYGLIYAIGAMIG IGSATRYAISRAKGKNTEHYFVQSVTWSILAAVPFMLIGIFIPDKALALLGADAGLIGLG RNYVRIILIATPFFMSNYTFTAFARNDGAPSIAMIGSISGSIFNIIFDYIFMFPVGLGFS GAGLATAICPIVTMSVCTIHYRSSRNHVGFHWKKPSFRHLISCCQLGVSAFVGELSSGVI AIVFNFLILGIAGNMGVAAYGVVANLSIVAFAIFNGLAQGAQPLISESYGKGQPTQVRKL LKWSLLVCLAVEALTQLIIWTSTDMLISIFNSENNVQLLNYAHTGLRLYFLGFIVAGINI VLVAYFSAVDEPKIAIVGSFLRGIVAIVICAVILAKLFGLNGIWISLLAAETVTFLTILF LAYKDRRKRMAVV >gi|226332905|gb|ACII01000114.1| GENE 5 5624 - 6139 510 171 aa, chain - ## HITS:1 COG:SA2335 KEGG:ns NR:ns ## COG: SA2335 COG0350 # Protein_GI_number: 15928126 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Staphylococcus aureus N315 # 1 165 1 158 173 150 46.0 1e-36 MIYRTYYLSPLGRILLAADDIGLIGVWFEGQKYFGEFPGHMNYIFEEKENHILKDALRWL DIYFSGQKPDFLPKLHLIGTDFQREVWDILLEIPYGQTVTYGEIARKIADKRGLKTMSAQ AVGGAVGHNRVSVVVPCHRVIGSNGSLTGYAGGIERKIKLLDIENGSCLKP >gi|226332905|gb|ACII01000114.1| GENE 6 6380 - 7198 777 272 aa, chain + ## HITS:1 COG:Cgl0802 KEGG:ns NR:ns ## COG: Cgl0802 COG0566 # Protein_GI_number: 19552052 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # 
Organism: Corynebacterium glutamicum # 5 268 7 270 276 155 34.0 9e-38 MFNEIKDFAAPELDVYARTSEVQLLRYYEPEPGIFIAESPKVIERALNAGYQPISFLVEH KDLEGEAQEILKQYPDVPVYTAEYELLVKLTGFALTRGMLCAMRRNSLPSVEEICRNASR IAVLENVVNPTNIGAIFRSAAALHMDAVLLTGGCSDPLYRRAARVSMGTVFQIPWTYFDK KTVWPQDGMQILQNLGFKTAAMALRDDSVGIDDKALRSEEKLAVILGTEGDGLASQTIAD CDYTVKIPMSHGVDSLNVAAASAVAFWELGHR >gi|226332905|gb|ACII01000114.1| GENE 7 7260 - 7580 254 106 aa, chain - ## HITS:1 COG:VC1349_4 KEGG:ns NR:ns ## COG: VC1349_4 COG0784 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Vibrio cholerae # 1 94 161 249 250 84 46.0 3e-17 MSDEGANLTVVENGLQAVRMFQEKPEGYFDAILMDIIMPVIDGLTATNSIRSLNHPDAKK IPIIAMTANAFKKDKEKCLAAGMNAHLPKPIEIENVKKVLCEQIKP >gi|226332905|gb|ACII01000114.1| GENE 8 7767 - 8138 221 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580063|ref|ZP_04857330.1| ## NR: gi|253580063|ref|ZP_04857330.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 123 1 123 123 248 100.0 8e-65 MKTAETPVGIFTINKVKIPSAYTCAAEQKIEYISENHMQIITMDQAVLFGNQLLSPRICQ SCMNPDKITIYPLEIEYIGEKVLFTDHYSVKEWKKSDPLPEIHEWYPHIKKAGCNPCRNC GRC >gi|226332905|gb|ACII01000114.1| GENE 9 8263 - 8700 244 145 aa, chain + ## HITS:1 COG:PAB0835 KEGG:ns NR:ns ## COG: PAB0835 COG1661 # Protein_GI_number: 14521466 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with PD1-like DNA-binding motif # Organism: Pyrococcus abyssi # 7 143 5 135 139 67 29.0 1e-11 MDYRKFGECYYIRMDCNDEIISTILEICKKENIRSATFSGIGGCKDAEIQTFIPETGSFE EQRISGMLELVSLNGNVVTDENNTCYHHTHAVFSYKDGERHCMAAGHMKSITVLYTAETE LHPVTGGTIRRKYDPETGTGFWDFN >gi|226332905|gb|ACII01000114.1| GENE 10 8815 - 9306 607 163 aa, chain + ## HITS:1 COG:all4397 KEGG:ns NR:ns ## COG: all4397 COG0454 # Protein_GI_number: 17231889 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Nostoc sp. 
PCC 7120 # 5 151 20 168 182 82 28.0 4e-16 MGIVIREYQTEDVTAAITIWNQVVEDGVAFPQEENLTEETGDAFFKEQTYTGIAVNTDNN EIVGLYILHPNNVGRCGHICNASYTVRRDFRGEHIGEKLVLDCLAQAKEKGFRVMQFNAV VANNTHALHLYERIGFTRLGVIPQGFRMPDGHYEDIIPHYYVL >gi|226332905|gb|ACII01000114.1| GENE 11 9588 - 10235 456 215 aa, chain + ## HITS:1 COG:no KEGG:CTC00617 NR:ns ## KEGG: CTC00617 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 21 214 64 256 256 139 41.0 1e-31 MRLKKKILPLLIAATLTIGVTAVAATGKISMWTGSSASRADYTSLPTLEQVTKDIGYRTV LIDTFENGYCFKKGNIIKNSFKDDNANVIEKFKSVSFDYQKNGDVVSFEQQKFNSKLIPS GDIIATVNGTNLYYVHYINKVVSDDYELTEQDKKDQASGKLVFSYDDSASQIDVSQVQSV NWNKDDIQYDLLQIDGKLSAGELADMAKEVINNRR >gi|226332905|gb|ACII01000114.1| GENE 12 10307 - 11476 1131 389 aa, chain - ## HITS:1 COG:CC1225 KEGG:ns NR:ns ## COG: CC1225 COG0673 # Protein_GI_number: 16125475 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Caulobacter vibrioides # 2 165 32 198 366 71 27.0 3e-12 MKKLNVALVGLSFGLEFVAIYCKHPDIDKVYIVDKNEKLLNIAKERYSIPDERCFTDLQD VLDIPEIDAVHLVTPPATHAPFSVRVLNAGKHCGCTIPMGMSIQELNDIIAARKASGKNY MFMETTIFQREFLYIQELYKKGELGRLQYMTCAHYQDMEGWPEYWEGFPPLMHPTHAVAP CLMLAGHLPDKVYARGSGKIRKELADKYGCPFAFESAFISLQDSDVTIEMERFLYGVARS YSECFRVYGENESFEWQQLADEDPVLFTRTGELEKVDILDEDSEKSSRGSEIVEKRIQIP DYAHLLPKEIASFTTNTVYNNENTHLSFTQGGGHGGSHPHLVHEFIRSILEERKPAIDDI MGAYWTGTGICAHESAMKGGEVITIPRFE >gi|226332905|gb|ACII01000114.1| GENE 13 11835 - 13103 1562 422 aa, chain + ## HITS:1 COG:AGl1009 KEGG:ns NR:ns ## COG: AGl1009 COG1653 # Protein_GI_number: 15890622 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 9 413 7 405 410 96 25.0 1e-19 MKKRALAVMLAGAMVLGAQTGVWAASDTSDQKEGELTLLMVNDWIDETGAKGEELTKVIK QYEEENPGVEITLQGASQQDIKESFQTAALAGGGADIVMMDNSGHAIDLAAMGLLYPLED ITTADELTAQYQEGPLNSGKFEGKYYSVPWYMDCTGLYYNKERLEELGIDVPTTWEELSD 
AVDKAKEAGYGGIITYQSAYAFYSFFYQNECPVIDTSGDIPQVVIDNKEGKEAWNYICDL ISKGGLVESFKEATTWDKVYESFANGEATFLLGGDWCSSGVENINPDLDYGIAPMVKGKT EATVLGGWTWNINTNCKNPELAYDLIQYLNSEKGDSLLGVEGKISARKDFDYDKALEGND KLKVFTEEFPYTEARPAVINEKSIDELITNAILEVDYGQSSAEDALTSLTQKLNENIASN YQ >gi|226332905|gb|ACII01000114.1| GENE 14 13192 - 14100 809 302 aa, chain + ## HITS:1 COG:BH1245 KEGG:ns NR:ns ## COG: BH1245 COG1175 # Protein_GI_number: 15613808 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 59 298 194 433 445 177 35.0 2e-44 MSKICRKEGLELRKNLPGYLLMAPAVIAILALSVYPLFRGIYLSFLNYNLVRPNDPAFNT FAGLQNYIDIFKDKVFIQSIGNTVKWTVINLVVQLVAAMLLALALNQKLKGRSVYRTLIL VPWAVPHAIAAMTFTFLYNANVGIINILAVKLGMITESVSWLGNVGSAFWCVVLVAIWKG IPFQMIFILAALQGISGDVYESAEIDGASRWQCFWKITLPIIKEPLAISTILNLIGIVSC FNTIWLMTKGGPLYSTEIIYTYAYRRAFIDHNFGTAAAASVVLFVFMAVFSGVYLKMVSE KE >gi|226332905|gb|ACII01000114.1| GENE 15 14120 - 14959 752 279 aa, chain + ## HITS:1 COG:PM1760 KEGG:ns NR:ns ## COG: PM1760 COG0395 # Protein_GI_number: 15603625 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Pasteurella multocida # 4 278 2 277 277 201 39.0 1e-51 MLAKQRNEAIRHVIAYVFTIVMVFLAVTPFIYVVLTALKTKEQIYDPSQIIPTHITFDNF RHVLFQSNFVRYFMNSIFITLVTTLICMILSVMAAYGLTRYKIAGAGKIKMAVLMTRMFP GILLCIPYYIIMKQLNLIDSYTGLILMYCSFTLPFAIWNTCAFFISIPWELEEAARIDGC SRLTSFFHVIIHVAKPGLFVTALFCFMTSWDEYMYASIFINTTSKKTIQVGMQDFIGQYS VDWGLLMSAVVISLIPILIFFALVQKNLVQGLSAGAVKG >gi|226332905|gb|ACII01000114.1| GENE 16 14995 - 16815 1479 606 aa, chain + ## HITS:1 COG:BH0792 KEGG:ns NR:ns ## COG: BH0792 COG2972 # Protein_GI_number: 15613355 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 226 602 216 577 587 122 25.0 3e-27 MGFKRKNPGNQSIRRQLTFYMGFFVVLPLCLALMLLNFYLQKVTTENKINNETNLLSQIR 
DNADQMIEVTNYATSMLMTNKNTLKNLRTLEQDGDSYEIYQAKRELSNDISNVESSVLNA VNGKVAILTKTGYVIGSYALSRTETDYEKEQWYQEVLKNGRKTTYSTGIGEIFQEMTIYD NVQKYLYMGREILDYSGKNLGVMLIRLSETNIWGKLAASKVTEEGGAIYILDRNNNILMG YNEKYQKQLKELGEQESVKEISEKEIITGNLEDDFYYIAGELENASNKLVYLVPREIFLK ENRKILQHILEMVLLVIGFTVCTMLYFSRRLARPLVEVAQTLEKAPNGMAVLEEPRGSFK EMSKFVSCYNQAGKKIEELLEKVERESRLKEKAHYEMLMSQISPHFIFNTVNSIRIMAIK EGQDRAGGNENTEKALEALGDILHAVYSNKNGMTTVGQETALLKAYVDIMQMRFGSSFQY YNVIPTELFYYEIPAFTMQPIVENAILHGVKGVPAGQIIVSAIEYENDFVISVFNNGNSA DKKKIEDLLKGEKNQRAVTGIGLYNVNSRLKLLYGESYGLIYNEKVRNGFEIWVRLPKKI TESEER >gi|226332905|gb|ACII01000114.1| GENE 17 16870 - 18354 1098 494 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 494 1 523 525 107 22.0 6e-23 MKKVLIVEDEILARLGLRQLVDWKKLDLELLPDATDGEEALEMIRNSRPDIILLDLNIPR VSGLEILEFLKKEEMQTRVIVVSCQEEFDVVKKAMKLGAYDYLRKLNLSSDELESILEKC LGETEERNKVHMQGIREIRYEEIMRDSRDIFAGTCNYQTLICILAKDTEELSGVMEMIHK WAETELREGLQIQKGNQYGYFLLKEKPERSVYMELKKRAERKTGRELYMGIFEGCMEKTA DLVRAAAMAEQIQLFSYYDKEEKLVFFQKKIETEGHSPRGMHGYLDSLKEKIRSFDREKV EQELYGIFGLIRQEPYVSINVLRRNFMDILGIYSLVAQSLDGALEEIELDGDNCHYQKIM MMESLREIEKWFLKFNDIFMEKFWIAYKCSRSEILQKVVKYIEAHITEPIHLSDAAAETG VSSAYLSTMFKKEIGYNFIEYVNLRKIELARQMLQDGKMVYEVSELLGFENSTYFSRVFK RYTDVSPDTYRKQM >gi|226332905|gb|ACII01000114.1| GENE 18 18451 - 19035 198 194 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 [Bacillus selenitireducens MLS10] # 17 186 5 175 190 80 29 4e-14 MSQSYAKSPAAERKFTTREMVLVGMFAAVLAVISQISLPMPTGVPITIQVFGVALVGAVL GSRLGTTATLVYVLLGAIGLPIFANFSGGISSIVGVTGGYIWAWPIMTWLCGIRPKTENK TKNLAISIVFALIGLLIVETIGGLQWHFVGGSMSIPAIAVYSLTAFIPKDIVITVLAVLI AIPIRKGINNAGNH >gi|226332905|gb|ACII01000114.1| GENE 19 19019 - 19417 223 132 aa, chain + ## HITS:1 COG:SPy0205 KEGG:ns NR:ns ## COG: SPy0205 COG4357 # 
Protein_GI_number: 15674404 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pyogenes M1 GAS # 30 123 6 99 103 103 56.0 6e-23 MQETTDKSNRLKNSLPEDNLVKQENIKIHGIGLDKAGRCTHYHTQLDIAALLCAKCRKYY ACYSCHDELEDHSFVATTPEEVYPVLCGNCGRKLTLQEYKKGSCPYCHAGFNPKCSLHEN IYFCCTNTENTK >gi|226332905|gb|ACII01000114.1| GENE 20 19553 - 20335 741 260 aa, chain + ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 16 260 25 269 269 218 52.0 1e-56 MDINVFYGILIPFLGTSAGAACVFFMKKNLNEQIQRALTGFAAGVMVAASIWSLLIPAIE QSSGLGKLSFVPAAVGFWIGVLFLLLLDHMIPHLHQNSNKAEGPKSKLQRTTMLVLAVTL HNIPEGMAVGVVYAGYLTGHAQITIMGAMALSIGIAIQNFPEGAIISMPLRSEGMGKTKA FAGGVLSGIVEPVGAVLTILAAGLIVPALPYLLSFAAGAMLYVVVEELIPEMSAGEHSNI GTLFFAVGFSLMMILDVALG >gi|226332905|gb|ACII01000114.1| GENE 21 20481 - 20918 333 145 aa, chain + ## HITS:1 COG:PAB0835 KEGG:ns NR:ns ## COG: PAB0835 COG1661 # Protein_GI_number: 14521466 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with PD1-like DNA-binding motif # Organism: Pyrococcus abyssi # 7 143 5 135 139 67 29.0 1e-11 MDYRKFGECYYIRMDRDDEIISTILEICKKENIRSATFSGIGGCKDAEIQTFIPETGSFE EQRISGMLELVSLNGNVVTGENNICYHHTHAVFSYKDGERHCMAAGHMKSITVLYTAEIE LRPVTGGTIHRKYDPETGTGFWDFN >gi|226332905|gb|ACII01000114.1| GENE 22 21218 - 21757 343 179 aa, chain + ## HITS:1 COG:CAC1766 KEGG:ns NR:ns ## COG: CAC1766 COG1595 # Protein_GI_number: 15895043 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Clostridium acetobutylicum # 4 177 5 184 185 78 28.0 5e-15 MTDEAELLQRLRNREKNSIDEAIQIYTPYLSTVLYHMAGNSLPKEDIEEIVADVFIVLWK NAGRIDLQKGTLRSYLAAVARNFALKRINRKTDYTFLEDIELSDGKNFIEENFHNNYVWE TVMSLGEPDSEIFVRRYKFDEKIKDISKAMGLNISTVKTRLSRGKRKLRKMLSNAEERL >gi|226332905|gb|ACII01000114.1| GENE 23 
21754 - 22548 539 264 aa, chain + ## HITS:1 COG:no KEGG:CTC00617 NR:ns ## KEGG: CTC00617 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 21 263 8 256 256 151 38.0 2e-35 MRRKNLLKELGIQKQADTSCNALDIDTENIRQRVYAKLDFADTERKQTTMRSKKKILPLL IAATLTIGATAVAATGKISMWTGSSASRADYTSLPTAEQVTKDIGYRPVLIDTFENGYCF KDGNIVKNSFKDDNANVIEKFKSVSFDYQKNGDVVSFEQQKFNSKLTPAGDIIATINGTN LYYVHYINKVVSDDYELTEQDKKDQASGKVVFSYDDGASQIEVSQVQSVNWNKDGIQYDL LQIDGKLSAGELADMAREVINNRR >gi|226332905|gb|ACII01000114.1| GENE 24 22567 - 22881 257 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580079|ref|ZP_04857346.1| ## NR: gi|253580079|ref|ZP_04857346.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 104 23 126 126 209 100.0 4e-53 MYQTLWNSINKQVKNSHAYAAFLAVSLMGWTTIGLLAGFLICMIGGMSFVTKVTVVLCSG GYTGLIFGFFGGILYLYRSEPRHLAAATPSQVFPQQLQKDGEPD >gi|226332905|gb|ACII01000114.1| GENE 25 23109 - 23303 141 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580080|ref|ZP_04857347.1| ## NR: gi|253580080|ref|ZP_04857347.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 64 1 64 64 113 100.0 3e-24 MYDYQADIRNPLRTHTVILYRYLLPIAILVNNAFKTVQQSAKDKEDICRRADARQRYPTQ DFTL >gi|226332905|gb|ACII01000114.1| GENE 26 23716 - 25410 1923 564 aa, chain + ## HITS:1 COG:PAE1423 KEGG:ns NR:ns ## COG: PAE1423 COG1966 # Protein_GI_number: 18312627 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Pyrobaculum aerophilum # 1 555 1 555 618 268 35.0 2e-71 MSGIVMMIIAIVVLGGAYLLYGRYLQNKWGIDPKAKTPAYEMEDGVDYVPADTNVVFGHQ FASIAGAGPINGPIQAAIFGWLPVMLWILIGGVFFGAVQDFASMYASVKNKGRTIGYIIE AYIGKLGKKLFLLFCWLFCILVVAAFADVVAGTFNGFVADNAGTVTRVAANGAVATTSML FIIEAVGLGFFLKYSKFNKWINTAVAILLLVLAIALGLKFPVYVSLGTWHIIIFAYILVA SVAPVWALLQPRDYLNSYLLIFMIVGAVIGVFAANPSCNLKAFTSFNVDGQYMFPILFVT IACGAVSGFHSLVSSGTASKQIKNEKNMLPVSFGAMLMESMLAIIALIAVASFADGEAAA QGLTTQPQIFAGAIANFLSVIGLPHSLVFTLINLAVSAFALTSLDSVARVGRLSFQEFFL DSDTDEENMSPFLKVVTNKYFATIITLVLAYLLTKVGYAEIWPLFGSANQLLSVLALVAC AVFLKKTKRQGCMLWIPMVFMMAVTFTALGMTISKLTKALFTTGLDLGNTLQLIFAVLLL ILGVLVAIQGVKKLFEKNDEKQTA >gi|226332905|gb|ACII01000114.1| GENE 27 25504 - 26415 960 303 aa, chain - ## HITS:1 COG:MA3439 KEGG:ns NR:ns ## COG: MA3439 COG2221 # Protein_GI_number: 20092251 # Func_class: C Energy production and conversion # Function: Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits # Organism: Methanosarcina acetivorans str.C2A # 6 302 1 287 288 183 35.0 3e-46 MNEEKITTSANKKLSPEEIKRVKGLGCLQDKRYDDIFNIRVITGNGHITTDEHRAIADAA DKFGNGQITMTTRLSMEIQGVPYDNIEKTIAFLGEHGLMTGGTGAKVRPVVSCKGTTCQY GLIDTFALSKKIHERFYVGYHDVVLPHKFKIAVGGCPNNCVKPNLNDMGIIGQRIPKPDS EKCRGCKKCQIEKSCPVHVPKLVDGKLYIDPEECIHCGRCKGKCPFGAVPEYQNGYKIYI GGRWGKRVSHGQALTRIFTEEEEVMAVIEKAILLFKNEGIVGERFADTVKRLGFEYVNEK LIG >gi|226332905|gb|ACII01000114.1| GENE 28 26558 - 27517 666 319 aa, chain + ## HITS:1 COG:aq_638 KEGG:ns NR:ns ## COG: aq_638 COG0583 # Protein_GI_number: 15606065 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Aquifex aeolicus # 43 311 6 281 303 92 
27.0 1e-18 MKLKNRNLYKKAFRAEFFLIAQKKLSDINRVEEFKENIMLDHRTETFMAVCSVMNYREAA ELLHITQPAVTQHIQFLEKEYGCRLFLYENRKLIKTPAAQMLEDYLRSVQQRENFLREKI KNNGLRELRIGATKTIGDYVITDRIHDFLNQPDTALTLIVDNTKHLLHLLEQNTLDYAII EGFFDKNRFGSQLYRREPFVGICPKDHPFAGREVSVEEILKETLIHRENGSGTLAILEEK LLEHNESLERFHRHICISSFKMIIDLIKSGYGISFVYEVLAKSDPDLGIFTLKGEPIVRE FNVIYLKHADVREKMEWFL >gi|226332905|gb|ACII01000114.1| GENE 29 27802 - 27885 61 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDTILSFMLSILAGVITYYICKWLDRK >gi|226332905|gb|ACII01000114.1| GENE 30 28390 - 30111 1172 573 aa, chain + ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 65 565 25 510 1118 345 38.0 2e-94 MPDQKIDNLLNLAMDATPQERRKSGNLNIGYDPATRLWDVIIKYSGPESGLAGNGIQVVP LLGSYAVVTLPESEIDEYSHRVQVEFMEKPKRLYFELFQAKGASCIRTVQTGRNGLTGKG ILTGVVDSGVDYFHPDFRNADGSSRILRLWDQSIQGNPPQGYVTGTEYTKEQIDEALALG ENQGRRLVPSSDYSGHGTSVLGIAAGNGRASDGVNQGVAYESDLLVVKMGIPRENSFPRT TELIQGIDYLVRQALTMGRPMAINLSFGNNYGSHKGDSLLETYIDMVSSTGRLAICTGTG NNGNQPLHEGGTLKQGQTRQIELSVSSREPTLNVQLWKSYEDEMSIYIENPSGNRIGPLD EKLGPQRYRLGNTDLLIYYGKPGPYHLTQEIYIDFLPGKTYVDSGDWKIILSGKKVRGGE YYLWLPGGNTLNRGTGFYEPRAYGTLTIPATARRVIAVGAYDSLVDSYADFSGRGSRMLP YLKPDLVAPGVNIVAPVPGGGYRTVTGTSFATPFVSGSAALLMQWGIVNGNDPYLYGEKV KAYLRKGARPLPGYEEYPNEEVGWGEDVIIRLH >gi|226332905|gb|ACII01000114.1| GENE 31 30681 - 31307 255 208 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580087|ref|ZP_04857354.1| ## NR: gi|253580087|ref|ZP_04857354.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 208 1 208 208 307 100.0 3e-82 MKGLVFKDLLLMKKMNKKVIFVMYFFVIAISFFGENEVYSIMSSAFFSLFIGMHLMMTMT YDGLTSWKQYELTLPMSKYQIIFSKYLTSLLLVPISIMGTVIIYIIRYVVYHNFTLSQFG FSIAIAIALPVLWCSICLAIAQWFGYMSVQYVRMICTLLVIFFVSKISKDMKYVTQNLVK NPMLITIFALGIVVASYFVSVIGYSRKK >gi|226332905|gb|ACII01000114.1| GENE 32 31304 - 32161 442 285 aa, chain - ## HITS:1 COG:BH0652 KEGG:ns NR:ns ## COG: BH0652 COG1131 # Protein_GI_number: 15613215 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Bacillus halodurans # 3 284 2 285 288 214 40.0 1e-55 MEQNSIVVKNVTKKFDDFMLDHISFTVPTGRIVGFIGENGAGKSTTINLILDQLKLDAGE IRILGKQNHSYLHKENIGVVFDECKFHSVLNAKDIAQILSGSYKTWDMNLFEEYMKRLDV PLNKSIGQLSKGMKMKLSIICSLSHRPQILILDEATTGLDPVVRDEILDIFLEFIQDEEH SILFSTHITSDIQKVADYVILIHNGKIIFEEKKDDLIYNYGIIRCKKSEFNTVSPDDYVC CRETNLSVECLIHDKVAAKKRYKNLIIDNASIEDIMLFYIKGGVK >gi|226332905|gb|ACII01000114.1| GENE 33 32173 - 32544 199 123 aa, chain - ## HITS:1 COG:SA1748 KEGG:ns NR:ns ## COG: SA1748 COG1725 # Protein_GI_number: 15927508 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Staphylococcus aureus N315 # 1 123 1 123 126 105 45.0 3e-23 MEIIISNSSDKPIYEQIAMQIKSLIMNGTLSAGEALPSMRALAKDLHISVITVQRAYEDL TRDGFIETVSGKGSFVASPNKEFIQEEQLRIAEELLEKVAIIGRTHGISYEQMANILKLF FEE >gi|226332905|gb|ACII01000114.1| GENE 34 32749 - 32910 128 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580090|ref|ZP_04857357.1| ## NR: gi|253580090|ref|ZP_04857357.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 53 1 53 53 92 100.0 6e-18 MSDDELELLAPWNEIVKAEIGRRANESNQSYVNCQGAPAAEKQSGIFSIISKN >gi|226332905|gb|ACII01000114.1| GENE 35 33041 - 33286 298 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580091|ref|ZP_04857358.1| ## NR: gi|253580091|ref|ZP_04857358.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 81 1 81 81 135 100.0 6e-31 MVNQSVIDSSVNLEQLKDIEDLIPNIDKAISEKVETFLDKSGDQPYAHMNEGYVVVVEMT GEMDATDAISDYLRKRTELMY >gi|226332905|gb|ACII01000114.1| GENE 36 33416 - 35089 946 557 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 6 315 1 300 301 184 36.0 4e-46 MEQMQMNMPDIYDAALYLRLSKDDMEEGSAKSESNSIVNQRELLRSFVKSQPDIQIFDIY VDDGYSGGNFDRPEFKRMTTDIEAGKVNCVIVKDLSRFGREYIEAGRWIEKTYPALNVRF ISVTDQFDSKTADFSEKSFVVPIKNFVNESYCRDISGKVRSHQKIKREKGEFIGAFAPYG YCKDPENKNCLVIDSYAADIVRKIFSWKIDGFSLGAIAEKLNVRYVQSPKEYKKANGENY NSGFHSSDTPKWSAVQIKRILTNEVYIGNMVQGKQERISYKVKQRLDKPESEWVKVENTH PAIIRQSDFDVVQKLLQYDGRASKTLDSANFFSGFVFCGDCKTPMIRRVNQYKGKKKAFY ICQTKNKGGDCTRHSIPEEVLKRIVLKEIQAYTALFIDYQMIMEELCEMQVSYDQVIGYD TQISKLQEEYNRYYSLKASLGDDLKEGLISKAEFDDFRESYGRKCEELEQMIENQKKLVK QMFEGGVSATVQLEDWKKSLEIKELDRTLLALTVDKIYIYENKQIKIHIRYQDMIEKMKV IRRFYAEHRTECRKEVG >gi|226332905|gb|ACII01000114.1| GENE 37 35090 - 36745 975 551 aa, chain + ## HITS:1 COG:CAC1228 KEGG:ns NR:ns ## COG: CAC1228 COG1961 # Protein_GI_number: 15894511 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Clostridium acetobutylicum # 9 500 7 475 544 122 27.0 2e-27 MARTAKRYKKNTEKKISGIPVCMAAIYARLSVDNDEKKSESIETQVTLIKEFIQKHNENP DKEYEIAVYDIYSDLGKTGTNFDRPGFERMMNDVRAGKINCILVKDFSRFGRNYIETGNY LEKILPFMKVRFISVCDNYDSFAPGAKNQELSMNIKNLVNDAYAKDISAKERAAKRIAQK NGEYVGSTAPYGYCVEKINGICKLIVEPEAAKIVRRIFEEYASGDGIQSIIDRLFEDRVH RISDYNQYHHVYCQDGENLHQWGNSSIRAVLNRNNYYGDLVQRKYESRFQRGEKWCDILD QSQWIITPNAHEPIISRELFDKAQVRLKVAQQKATKTTVGWEEDERAFYNVLYCGDCKRK MCTRRYRGNVYYFCNAAWYRDERKCSHKSISEEKLQKIVRSELTRQFQLSDLRKKDMSAI SSAVFLTKIKEIQAEIRKLDADMERRSEKLAQAFMQYKEGELSKEAYIEMKDDRNNWKVF CEERKKSLEQTIRKLEKQQKKEARFLRSLLELDGTTRINAELAESLIESMYLYGDNRLEI NFGFKGAVEYE >gi|226332905|gb|ACII01000114.1| GENE 
38 36738 - 38291 994 517 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 10 306 12 301 301 160 33.0 5e-39 MSNQKLIIGYYRLSMEDDSEGESNSIINQRKLVKDYISNIPELASMPFQEFYDDGYSGSS MERPAIKQVLELARENKVQCIVVKDFSRFARNYIEMGTYLEQIFPFLGVRFISISDRYDS KDYKGKSSDIEVQFKGLIADFYVKDQSVKVKAAVNTRRGKGEYCCGSAPYGYRINPENKK ELVIVEDEAEVIRRVFELTNQRYSKMEICRLFNEEGVLTPLQSMSRRQKSDSKKAASRGL QWTSDMIRKIVDDKTYIGCMVYGKTKIPDPGTGKEVPVPRNQWKVMENHHEPIVSKEVFE KAQSLQIRYTKKSKFDRETTLLGGYVKCGNCRRSLTSSSPIHGHILYSCAYSKGKEDTGC FAGKADNKMLEHIVLAEIKAYLRQNISQEQMQQSMRKQHEDSIEAYKTESADCEKCQEQI KIQNRQNYEKYHEGQMNQNQFMEAKKQLEEERERLQKRVQELDELINDEKEILMKKNVPV EQMLKYLGYENLTREMLEEYVQGIYVYDDGRVGVEYK >gi|226332905|gb|ACII01000114.1| GENE 39 38445 - 39737 454 430 aa, chain + ## HITS:1 COG:FN1198 KEGG:ns NR:ns ## COG: FN1198 COG1106 # Protein_GI_number: 19704533 # Func_class: R General function prediction only # Function: Predicted ATPases # Organism: Fusobacterium nucleatum # 24 420 40 420 420 117 31.0 4e-26 MLSSKQKTHNDYVIDKSVNGNKLRVLPMTVIYGANACGKSNIVLAMDILKKMVMKGTLNC KELESYKSMLSFIRDTSWYDPVSLEITFSTQNNIYRYGIKFTDIDVYKIEEEVLYVNDDL FFSRDDENQIYVDVKKLVKKGYINKDDADFSERLIHKLNQTLDKQKLVVAGAISNLFDKK YFEDFNLWFEKFNVIMNANDMNFRQKDLKTIFNKKPDKDIRRNIFESASVKEVMNIAEFG NQKIGFMAETDNDELSMCSMYQVPLRKDEQPQKKYAISMIVDSELMESRGTIHLIRLLQP FIDVLDNGGVIVLDEMDASLHFEIVVSLIRIFNNKDINKNNAQLIFNTHNPIYLDGELLR HDQIVMVEKRRNDMVSEIYSLADYKLRPEERILKNYLNGKYGALPHMDLEIAFKHILERE ANNLESSKEQ >gi|226332905|gb|ACII01000114.1| GENE 40 39715 - 40377 426 220 aa, chain + ## HITS:1 COG:no KEGG:Cthe_2133 NR:ns ## KEGG: Cthe_2133 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 11 213 3 201 203 84 30.0 3e-15 MNRPKNNKIRRPQFLCIVGCEGKNQERIYFDKVAELVNCVEERTHDLVFDYAEPYGGNPK CVVERTIQKSIGKENKVSVFDYDGKKDKYEEAIDLAIENKIELGYTNYCFDLWLILHKED 
YFDIVQNQDAYADKLRQVFGLAADANIKKEKRVTEIVNQIGLSDIKNAIQRGKKISEDNQ GKEANKTPKENRYYDNPDTQMHVLLQFLFAKVGINIDALG >gi|226332905|gb|ACII01000114.1| GENE 41 40623 - 41321 597 232 aa, chain + ## HITS:1 COG:CAC0144 KEGG:ns NR:ns ## COG: CAC0144 COG1489 # Protein_GI_number: 15893439 # Func_class: R General function prediction only # Function: DNA-binding protein, stimulates sugar fermentation # Organism: Clostridium acetobutylicum # 1 227 1 230 230 194 44.0 1e-49 MKYEHITEGRFIERPNRFIAHVEINGQVETVHVKNTGRCRELLVPGTQVFLEKSSNPARK TAYDLICVNKKGRGLINMDSQIPNKAALEWIKAGHLFPEKVQVTPEKTYGNSRFDLYVCS EKRKAFIEVKGVTLESDNIARFPDAPTERGVKHLKELIHCMQEGYEAYLLLVIQMKGVDR FEPNWETHREFGETLQEAERAGVHILAYDCLVEPDRMEIQDPVPVCLASDWK >gi|226332905|gb|ACII01000114.1| GENE 42 41364 - 41522 135 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580098|ref|ZP_04857365.1| ## NR: gi|253580098|ref|ZP_04857365.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 52 1 52 52 81 100.0 2e-14 MTANAFEEDRKKAIKAGMNAHIAKPISVDIILENLERMRQNRKYFNEPAEKS >gi|226332905|gb|ACII01000114.1| GENE 43 41693 - 42526 956 277 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580099|ref|ZP_04857366.1| ## NR: gi|253580099|ref|ZP_04857366.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 9 277 1 269 269 458 100.0 1e-127 MSKKMRLWMAGTVGVLMAGFMSVSVYALEEKDVIGTWYVNELSMGEGASFHPGAMGMEVT VDIKEDGKMETTSSVYGEEPEVDESEWKIENDKILMTHNDQTGECEYKDGKITLTADGTT MILSQEKEEYEPYVPGKPVENPTMKDFEGEWSCTLMEAFGMQMPVNADFTGFEMSMSVED GKAEIILIESGEETKVELQGDVEGNALVLKAVTEDEAGTMFFSLDNMKFNLLDDGKLCLI AEDDAEESDDGEEADDSETGEVDFSVKTYFEKIIVTE >gi|226332905|gb|ACII01000114.1| GENE 44 42594 - 43712 594 372 aa, chain + ## HITS:1 COG:no KEGG:Shel_09730 NR:ns ## KEGG: Shel_09730 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 1 188 1 194 314 99 32.0 1e-19 MGLFDKKICDICGEKIGLLGNRKLDDGNLCKDCAKKLSPWFEERRHSTVEDIKRQLEYRE KNKKAVMDFCITRQINTRNYNVFIDDNKGNFTVARKLDVNENPDIVPLSAIVQCRVDVDQ QQHEETYTKDGENVSYQPPVYKYEFDYTMRIKVRTQWFDDMDFRLNTFSISSDNRRELME VEQTAYQIIAALTPNAAGMQSGMSGMNMTGGMQSGMPGMNMNGGMQSGMPGMNMNGGMQS GMPGMNMNAGMQPGMSGMNMNAGMQPGMTGMNMNGGMQSGMPGMNMNVGMQPGMPGMNMN GGMQSGMPGMNMNGGMQPGMPGMNMNGGMQQTGMSETNMTGGMQQNNSSWKCQCGAENTG KFCEYCGQPRPF >gi|226332905|gb|ACII01000114.1| GENE 45 44029 - 45597 589 522 aa, chain + ## HITS:1 COG:no KEGG:FMG_1388 NR:ns ## KEGG: FMG_1388 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 40 428 86 470 567 75 23.0 4e-12 MRNLADQCRSCIYLGMYCAWVIYLGKHVVHKKTRRCLTAIGCLMVFWFFVRTVKFHIFHD PLGEHICWYLYYIPMILIPVLGLAAAMFLGEKEEEKTVRKVIILLTVAAILIVSVFTNDL HQLVFRFSKQPPFSDKDYSYGIVFMVIQGWILICLTGMEIILIRKSRIPGKKQFWLPVIP GILLLGWNIGNILRLPFIKIIAGDMTAVCCLLMAAIFQGCILCGLIQTNNRYFELFQTSG GLDAEITDHSFQRYYHSGDFPELSSELRKIIIDRSSVQEQGIRINHIPIRGGHLFWSEDI SVLLDQYQDIREQQEELTARNRLLQKAYQKEAERRKTEEQNRLLNMIQNQTVGQLELLSQ FMNELERTESREQYDRILGKIVVVGTYLKRRKNLVLTQYASDGNLLTMEDLRQSLAESCD SLKLCRIRAAYYVENADVQLNAEDILKCYDTFEWLVERLVDIMQSVFYRVSQIDDALRIS VHIVSEVDLRGFMSERPELKVQQEDENEWFIRCIVFRKRGGR >gi|226332905|gb|ACII01000114.1| GENE 46 45552 - 46937 440 461 aa, chain + ## HITS:1 COG:no KEGG:FMG_1389 NR:ns ## KEGG: FMG_1389 # Name: not_defined # Def: putative two-component sensor histidine kinase # Organism: F.magna # Pathway: 
not_defined # 129 437 111 422 429 107 23.0 9e-22 MVYQMYRLQKERRKMISFVECSMAMQDFLMILLEICMILELILLLVRITCFSGKKKIVTA VAVFAVSFVFMGVLMNDHRYCLGESSYQPMLQGYPVSVLMTVVIGLLLYLCWAIRSEKRR YYHSLSYWSVNEAVNDVPCGVCFSDPLGRIVLCNTKMQELSRIMTGSYLQDYDALRKAMS GEPESEGLCRLSKDSNVFYFPDGSVWMFQEYSLQEPDCAGYLQTVAVNVSEIYYNGEKIR ANNEKLEILNHKLEEMYEKIGDKIREQETLVMKMQVHDNFGRSLLSIRRILERKEDPDKM DKQLSVLKHLVYILTDSAVESMEEVYKDTIRHAEELGISVQISGNFPMHPSYRLLTDREI RECVTNCARHAHGSTVWVKIDKNADEYTVQITNDGEVPDKNAEEGGGLSALRKAAESGKC RMQVSFSPEFCLILKMPLTERMGFDGTDTDSRRSEDHAEVF >gi|226332905|gb|ACII01000114.1| GENE 47 46885 - 47511 465 208 aa, chain + ## HITS:1 COG:CAC1455 KEGG:ns NR:ns ## COG: CAC1455 COG2197 # Protein_GI_number: 15894734 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Clostridium acetobutylicum # 1 203 1 218 225 69 24.0 4e-12 MVRILIVEDQKIMQKYFEYIIMQEPEFRHVQTVSDAREAVKICDYSAIDLMIMDVQTFHN HDGLSAGKVIREKYPYTKVLIVTSLIDPKVLERAKSGCADSLWYKDHGEEEIRSVIHRTL NGEHVFPDMTPKVELNWITSGDISPRQLEMLRLYIRGMSYSEIAREMKCSTSGVRWNFQE MIAKAGYSCKEDLIAAALESKLIVTTLK >gi|226332905|gb|ACII01000114.1| GENE 48 47600 - 48499 865 299 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580104|ref|ZP_04857371.1| ## NR: gi|253580104|ref|ZP_04857371.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 299 1 299 299 592 100.0 1e-167 MKKYSIAVCVCAVLFSVGIHGGGMREVFARAPGADENLAILQDIIRENECQAGMAFLGYT YEGAEAAEILEYTRQGEVAAAYPFLQDGTVVDAGGYEIYAIVPGDDCRVSVYPAQPTEEG EYLDDLNAPYYTGKENEIIILRCNVSEIHSNVMVSVQKQNMSIQYHPMVSLKDGHLAAET HCYDFSIYYNWDEYDPEYNPVYESEYDPECNPEHDSALNIKIAREQLSETDEVSYYLGLG MSLWYTGTDEYIEGRNCPVFVLGTDHEDHFTKEHYYAVGDNVVYYYDPSGDAWLLLGAG >gi|226332905|gb|ACII01000114.1| GENE 49 48558 - 49082 423 174 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580105|ref|ZP_04857372.1| ## NR: gi|253580105|ref|ZP_04857372.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 174 1 174 174 338 100.0 1e-91 MNDFLTEKNKKAGVLGKLKWVLCGFCMLLSLGAIGASEKYIGEGRWGMAATEILLCLLFL YPTFREVQKALRKKKAREIACWFESYAQNTVSFEKLEQELGKGAVKKLEKFIARGYIRNI QIDREGNYIMITAPNRRVNEKIYITVTCTSCGAKNQVIKGRLSNCEYCGQRLNS >gi|226332905|gb|ACII01000114.1| GENE 50 49196 - 49801 454 201 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580106|ref|ZP_04857373.1| ## NR: gi|253580106|ref|ZP_04857373.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 201 1 201 201 334 100.0 1e-90 MKKKIQQILLTSLLFVFSILIISIPTQAAKPAATTVKSKTSYSNSKESMTVRGLDAKKTV VWKYVTKKYTATELPRTKCIVRKDKVYVFESSKIVVLRKKDGKQLWTAKKVSPAGHLCKF DKNDNLYVTGYYDNYVYKISTKGKILWKTNTSKANNYWPYKINISGNQMTILYEMNNNDD SSEKTHKIVFNIKNGKILKYS >gi|226332905|gb|ACII01000114.1| GENE 51 50274 - 52238 2144 654 aa, chain + ## HITS:1 COG:TM0244 KEGG:ns NR:ns ## COG: TM0244 COG4656 # Protein_GI_number: 15643016 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Thermotoga maritima # 227 654 20 447 451 379 45.0 1e-104 MENYTKYKLKSSDELASVLDGKDNLFVIACNKCFKEFETVDEPDCDEFLKFAADQGKNVT GSAKFDFLCNKMHTERKLQDLIPEGTENVVVISCGLGIQTVADLAGKPVVAASNTLNYRG HHGMALTKKSCDACAQCYLNITGGVCPIVDCSKSLVNGQCGGAKNGKCEVDPNKDCAWEK IYQRLAKQGRLEEFLNQPVQVRDFSKVNFKVINDYVKSIREERLDGYYGGVHPSERKEFS EHIALKKFPDPKTVVISMSQHLGAPANPIVQVGDTVKVGQKIGEAAGFISAPVHSSVSGT VVAVEPRMHGTRGSEVMAVVIESDGKNTLHESVQPHGDLDNLTPDEIIDIIREAGIVGMG GAGFPTCVKLKPAKPVDTILLNGCECEPLLTADHRVLLEYADDIIFGLKAVLKTTGAEKG IIVIEDNKPDAIELMQKKVADIGNMEVFVARTKYPQGAEKTLIKRVMGRIVPSGGLPADV GVVVDNISTVKAISDAIQTGMPLVERVATVTGEKIKNPGNFVIKIGTSVRELIDYCGGFT DDDVLVKMGGPMMGFPLNTLDVPMMKGSNGIIAVEPDETKEQPCIKCGRCVDVCPMELSP LYFVKYAKDENWQGMKDMNVMDCVECRCCQYICSSKIPIINSIKAGKNAVRGMK >gi|226332905|gb|ACII01000114.1| GENE 52 52240 - 52860 732 206 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_0648 NR:ns ## KEGG: CDR20291_0648 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 190 1 194 224 127 40.0 2e-28 
MLITQLKAKEAITSLTAGKKVFIINCHGCKEVRFPEKEAARLQKELAAEGNVTGIMTTDY ICNPENMELRLKKHMSKIQEADLVLVFSCGVGVQTVSEYLEDKMVCAACDTYPVPGFQGV TPLEYKCDQCGECYLNLTGGICPITACSKSLVNGQCGGSKNGKCEVDSEMECGWERIYRR LKEIGRLDALKCPTQIHNFATDDELK >gi|226332905|gb|ACII01000114.1| GENE 53 52886 - 53764 995 292 aa, chain + ## HITS:1 COG:MA3514 KEGG:ns NR:ns ## COG: MA3514 COG0685 # Protein_GI_number: 20092322 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Methanosarcina acetivorans str.C2A # 4 292 6 289 292 239 46.0 5e-63 MKMKELFDRGEFVVSAEVGPPKGIHVDELVEEAKTYLKDVHAVNVTDNQSSVMRLGSLAM CKVLKDAGMNPIFQLACRDRNRIALESDLLSAAMLGIDNILCLTGDHTKMGDHPQAKPVF DLDSVSLLHTVKLLESGVDLGGNELVGEPPKFSKGAVVSPCSDSVDAQLAKMERKVAAGA EYFQTQAVFEPEKFIKFMEKAKQFGKPVQVGIIIPKSAGMAKFMNNNVAGIHVPDEMLEE LKADKEKTKAGITGVEIAARIIKECKPYCQGVHIMALGWESKIPDLLKLAEI >gi|226332905|gb|ACII01000114.1| GENE 54 54200 - 56161 2364 653 aa, chain - ## HITS:1 COG:SA1506 KEGG:ns NR:ns ## COG: SA1506 COG0441 # Protein_GI_number: 15927261 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Staphylococcus aureus N315 # 3 644 6 645 645 571 44.0 1e-162 MIVTLKDGSKKEYAEAKSVIDIAYDISEGLARAACAGEVNGEVVDLRTVIDSDCELNILT AKDEKGLAVLRHTASHVLAQAVQTLFPDAKVAIGPSIDTGFYYDFDCPPFSRDDLDAIEK EMKKIIKKGAKIERFTKSREDAIAYFQEKNEPYKVELIEDLPEGSEISFYSQGDWTDLCA GPHLMSVKGVKAFKLLSSSSAYWRGSEKNAMLTRIYGTAYASKDELKEHLEQMEEAKKRD HNKLGREMKIFTTVDVIGQGLPLIMPNGVIMMQELQRWIEDEETKRGYIRTKTPLMAKSD LYKISGHWDHYKEGMFVLGDEETDKEVFALRPMTCPFQYYVYKAEQHSYRDLPLRYGETS TLFRNEDSGEMHGLTRVRQFTISEGHLIVRPDQMVKEFKDCIALAQYCLQVLGVEEDVTY HLSKWDPNNREKYIGDAEVWNQTEAHIRQMLEELNIPFTEDVGEAAFYGPKVDINAKNVY GKEDTMITIQWDALLAEQFDMYYIDENGEKQRPYIIHRTSMGCYERTLAWLIEKYAGMFP TWLCPEQVRVIPISEKFHNYAAKVEAQLKENGIRCSVDQRSEKMGYKIREARLARVPYML IVGAKEEEEGKVSVRSRYLGDEGMKDLGDFLAAIKEEIKNKTIRKIEVQEENK >gi|226332905|gb|ACII01000114.1| GENE 55 56701 - 57756 912 351 aa, chain + ## HITS:1 COG:L0065 KEGG:ns NR:ns ## COG: L0065 COG0079 # 
Protein_GI_number: 15673188 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Lactococcus lactis # 4 351 3 350 360 360 51.0 2e-99 MSRWEENIRKVIPYTPGEQPNQPDMIKLNTNENPYPPAPGVEKALREMDTDTMRLYPDPT AGELVHAIAKNYRLKDEQVFVGVGSDDVLAMSFLTFFNSQKPVLFPDITYSFYDVWADLF RIPYERPALDENFHIRKEDYFRENGGIVFPNPNAPTGVEMPLEEVEDIIRHNPDVIVIVD EAYVDFGGQSALPLIEKYDNLLVVQTFSKSRSMAGMRIGFACGNEKLIRFLNDVKYSFNS YTMDRTAIAAGVAAVEDKAYFDETCNKIIETREWTKKELKALGFSFQDSMSNFIFATHKT CPAKELFEALRGQHIYVRYFQKDRIDNFLRITVGTKEEMQKFIDFLKDYLK >gi|226332905|gb|ACII01000114.1| GENE 56 57772 - 58272 465 166 aa, chain + ## HITS:1 COG:BH1234 KEGG:ns NR:ns ## COG: BH1234 COG2426 # Protein_GI_number: 15613797 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 28 161 1 134 134 69 37.0 3e-12 MEALVHWFSQNLSQYISPEGAVFIISMIPLLELRGGLLAASLLKISAAKAIPLCIVGNII PIPFILLFIRQIFKVLKKTKLFRGLVMKLEDRAMRKSDQVKKYEFWGLMLFVGIPLPGTG AWTGSLIASLLEIDIKKSSLAIFCGIIMATVIMYIVSYGLVGNLVH >gi|226332905|gb|ACII01000114.1| GENE 57 58711 - 60108 1507 465 aa, chain + ## HITS:1 COG:CAC2889 KEGG:ns NR:ns ## COG: CAC2889 COG1158 # Protein_GI_number: 15896143 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Clostridium acetobutylicum # 6 464 10 483 483 448 54.0 1e-125 MREKYESLSLATLKDLAKARGLKGVSALRKPELIERMLHEDELEGETKKQETVREKPQQS EVEGNGERIPQYNKTPERTQYHAPAEAVQLDSGERADGILEVLPDGYGFIRCENYLPGEN DIYVSPSQIRRFNLKTGDIIKGNIRIKTQGEKFSALLYVTSINGFHPSEGQRRYNFEDMT PIFPNERLIMERPGGTVAMRIVDLISPIGKGQRGMIVSPPKAGKTTLLKDVAKSILRNNP DMHLIILLIDERPEEVTDIREAICGDNVEVIYSTFDELPEHHKRVSEMVIERARRLVEHK KDVTILLDSITRLARAYNLIIPPSGRTLSGGLDPAALHMPKRFFGAARNMREGGSLTILA TALVDTGSKMDDVIYEEFKGTGNMELVLDRKLQEKRVFPAIDIQKSGTRREDLLLSKEEQ EAVYIMRKALNGMKSEDAVEQILNMFTRTKNNAEFVQTVKKQKFI >gi|226332905|gb|ACII01000114.1| GENE 58 60297 - 60500 346 67 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240147058|ref|ZP_04745659.1| large subunit 
ribosomal protein L31 [Roseburia intestinalis L1-82] # 1 67 1 67 67 137 89 3e-31 MREGIHPNYYQATVTCNCGNTFVTGSTKQDIHVEICSKCHPFYTGQQKAAQARGRIDKFN KKYGLNK >gi|226332905|gb|ACII01000114.1| GENE 59 60575 - 61543 854 322 aa, chain + ## HITS:1 COG:CAC2886 KEGG:ns NR:ns ## COG: CAC2886 COG3872 # Protein_GI_number: 15896140 # Func_class: R General function prediction only # Function: Predicted metal-dependent enzyme # Organism: Clostridium acetobutylicum # 2 319 3 312 317 210 39.0 3e-54 MKSSNIGGQAVMEGIMMRHKDKYSIAVRRPDNEIELKVEDYKCVFGNAKFLKYPLIRGVV SFVDSLVVGTKCLMYSAEIAGDEEDEEDKQKNAALSEEELAAKKAKEDKQFKWLLYVTVA ASIVVSVAAFMLLPYALASLCRRVGASEFAVTIVEAFVKLALFMGYMLLISRMKDIQRTF MYHGAEHKCINCVEHGLPLTVDNVLASSRLHKRCGTSFLFLVMLVSIFLHFIFVLVPFYW VRLFGRLLMVPVVAGISFEIIQWAGRSDSKLADFFSKPGLAMQKLTTKEPTADMAEVAIR AVEAVFDWKAYLKEEFGVEAEQ >gi|226332905|gb|ACII01000114.1| GENE 60 61531 - 62505 401 324 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase [Acidobacterium capsulatum ATCC 51196] # 1 318 4 292 294 159 33 1e-37 SRTMNGETLTGLLKKGQMILAQAGIKEAGLDAWLLLEYVTGKSRAYYFAYGEESVTESVA ERYLELISRRAGHIPLQHLTHQAFFMGHEFYVDKNVLVPRQDTETLVESALECMKAVKNP YILDMCTGSGCILISILKERADAHGTGVDLSDEALKVAVRNARTLEVAEHAEFVQSNLFS EMQNIVYGTEYMKRTAVKDTVKMTECENSNRNYSRAYDMIISNPPYIPTAEIEDLMDEVK LHDPRMALDGMEDGLYFYRAITKQAQDHLVPGGWLLYEIGCSQGEDVAALLRKYKFEDIE IRQDLAGLDRVVLGRKKLQEDKYV >gi|226332905|gb|ACII01000114.1| GENE 61 62498 - 63574 1297 358 aa, chain + ## HITS:1 COG:CAC2884 KEGG:ns NR:ns ## COG: CAC2884 COG0216 # Protein_GI_number: 15896138 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Clostridium acetobutylicum # 1 358 1 358 359 411 58.0 1e-115 MFDKLDDLLIRFEEVLNELGEPGVTDNQEHFQKLMKEQSDLQPIVDAYREYKKNKETIQD SLSMLEEEKDEDMREMLKEELSEAKKNVEELEHELKILLLPKDPNDNKNVIVEIRAGAGG DEAALFAAEIYRMYVKYAESRRWKTEMMSLNENGIGGFKEVTFMITGAGAYSRLKYESGV HRVQRVPETESGGRIHTSTCTVAIMPEAEEIDFHLDMNDCKFDVFRASGNGGQCVNTTDS 
AVRLTHIPTGIVISCQDEKSQLKNKDKALKVLRSRLYEMELAKQHDAEAEARRSQIGTGD RSEKIRTYNFPQGRVTDHRIKLTLHRLENVLNGDLDEIIDSLIAADQAAKLSNLQDAE >gi|226332905|gb|ACII01000114.1| GENE 62 63655 - 64572 1147 305 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0247 NR:ns ## KEGG: Cphy_0247 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 304 1 305 306 356 58.0 7e-97 MQKTALVIMAAGIGSRFGEGIKQLAPVGPCGEIIMDYSIHDALEAGFNKVVFIIRKDLEE EFRRVIGERIEKITEVEYVFQELDDLPEGFTKPADRTKPWGTGQAVLAAKKVLDEPFIVI NADDYYGKEAYVKVHDYLVQEQQKDGKLHICMAGFRLGNTLSDNGSVTRGICHIENGQLT GVTETHDIFKTATGAESRNADGSVQELDVKDLVSMNMWGLTPDFMEVLEKGFAEFLSGLA PEDTKKEYLLPELVDHLIKNESAEVDVLETKDTWFGVTYQEDKETVMRAFKNLTEAGIYP QGLYQ >gi|226332905|gb|ACII01000114.1| GENE 63 64704 - 66050 1532 448 aa, chain + ## HITS:1 COG:BS_ybbT KEGG:ns NR:ns ## COG: BS_ybbT COG1109 # Protein_GI_number: 16077245 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Bacillus subtilis # 1 443 1 444 448 452 53.0 1e-127 MGKYFGTDGFRGEANVKLTVDHAFKVGRYVGWYYGRDHKAKIVIGKDTRRSSYMFEYALV AGLTASGADVYLLHVTTTPSVSYAVRTEDFDCGIMISASHNPFYDNGIKLLNGNGQKIEA EVEARIEAYLDGLIEDLPLATKEDIGRTVDFASGRNRYIGHLISIPSRDFKGIKVGLDCA NGSSSAIAKSVFEALQAKTYVINNQPDGTNINTNCGSTHIEVLQKYVVDNGLDIGFAYDG DADRCIAVDHKGNVVDGDKIMYVCGKYLKEQGRLKDDTVVTTVMSNLGLYKSLEREGMKY EQTAVGDKYVAENMMENGYSLGGEQSGHIIFSRYAATGDGILTSLMVMEACVEKKATLCD LAREMKVYPQLLRNVRVADKKTARENPKVVAAVEEVAKKLGSDGRILVRESGTEPLIRVM VEAGTDELCLENVDHVVKVMESEGLLID >gi|226332905|gb|ACII01000114.1| GENE 64 66071 - 67120 1270 349 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2423 NR:ns ## KEGG: EUBREC_2423 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 348 1 320 320 117 28.0 5e-25 MYNRKKRLFLTAVCLSLGLLTGCNVGDTKNYKQAAQDLEQGNYEAALEEYETAISEGVKP AQSYRGAGVAKLKLGNYEEAITYFNDALKCDKVGKALKKDILSYRAVAYLKVKDYEAALE DCQTLAENYKMDADLYFLTGETALAMDSYEEASANFEQAYGEDATYDRAIQIYGAYLNRD MEADGTRYLEAALSGTAKNAEDHCDRGRVYYYMDDYENAESELKQAIDGDNTEALVLLGM 
VYMDKGDSANAKAMFQQYVSQAENGAKGFNGLALCDIEDGDYDSALSDIESGIHEAGAED MQSLLFNEIVVYEKKLDFQTALQKAQEYLELYPEDKTVKKELAFLKTRV >gi|226332905|gb|ACII01000114.1| GENE 65 67145 - 68380 633 411 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. AzwK-3b] # 39 401 43 413 425 248 40 1e-64 MEEFGKLEERVVLVGVQTDDHDNVEESLDELKELASTAGAVTVGRIIQNRESVHPGTYIG KGKIEEVRALVYAMDATGIICDDELSPAQLNNLERELDCKVMDRTLLILDIFAARAITSE GKIQVELAQLRYRSARLVGLRESLSRLGGGIGTRGPGEKKLETDRRLIRTRISALKQELS QVEKHRELIRSSRARGNMKTAAIVGYTNAGKSTLLNTLTGSEVLSEDKLFATLDPTTRLL NLKDGQQILLTDTVGFIHKLPHHLVEAFKSTLEEAKYADYIIHVVDSSNQQAEMQMHVVY ETLKELGVMGKKIITLFNKQDAPGACVLRDFKSDYTLKVSAKTGEGLADLNDLLEKLLAE EQIYVERLFPYQEAGKIQLIREYGQLISEEYTEEGIAVKARVPKEIYARVV >gi|226332905|gb|ACII01000114.1| GENE 66 68596 - 70974 2562 792 aa, chain + ## HITS:1 COG:RSp0963 KEGG:ns NR:ns ## COG: RSp0963 COG1328 # Protein_GI_number: 17549184 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Ralstonia solanacearum # 3 653 5 649 683 701 50.0 0 MFQVVKRDGEVDEFKIGKITAAIHKAFDAKEKNYSEEMIDLLGLRVTSDFQKKITDNKIT VEEIQDSVENVLIQAGYADVAKAYILYRKQREKVRNMKSTILDYKEIVNSYVKVEDWRVK ENSTVTYSVGGLILSNSGAVTANYWLSEIYDNEIADAHRNADIHIHDLSMLTGYCAGWSL KQLIQEGLGGIEGKITSSPAKHLSVLCNQMVNFLGIMQNEWAGAQAFSSFDTYLAPFVKA DNLSYPEVKKCIESFIYGVNTPSRWGTQAPFSNITLDWTVPDDLAELPAIVGGKNMDFKY KDCKKEMDMINKAFIETMIEGDANGRGFQYPIPTYSITNEFDWSDTENNRLLFEMTSKYG TPYFSNYINSDMKPSDIRSMCCRLRLDLRELRKKSGGFFGSGESTGSVGVVTINMPRIAY LSKTPEEFYNRLDRLMDISARSLHIKREVIGKLLDEGLYPYTKRYLGSFSNHFSTIGLVG MNEVGLNARWLGKDMSDERTQRFTKEVLLHMRERLSDYQEKYEGELFNLEATPAESTAYR LAKHDRKRWPDIKTAGKEGDTPYYTNSSHLPVEYTTDIFDALDIQDDLQTLYTSGTVFHA FIGEKLPDWKAAAALVRKIAENYKLPYYTISPTYSVCKEHGYISGEHFTCPKCGKKAEVY SRITGYYRPVQNWNDGKAQEYKNRTLYDVMHSDIRKIQPVHASVVTVTKDDVKIEPVASH KYLFTTSTCPNCRVAKKILEGREFEIIDAEKNPEMVKEFGIMQAPTLVITDGEHMTKYVN TSNIKKYVDEGL >gi|226332905|gb|ACII01000114.1| GENE 67 71009 - 71437 490 142 aa, chain + ## HITS:1 COG:MTH162 
KEGG:ns NR:ns ## COG: MTH162 COG4747 # Protein_GI_number: 15678190 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Methanothermobacter thermautotrophicus # 2 138 3 139 143 96 35.0 2e-20 MIKQNVVFVENKAGSLKRVTGTLADNGINIYGFACFDAPEFAIFRMICNDPDKAEIVLNR SGYMNRITQAIVVDMKDQVGGLDELLKVASDSNVSLDYIYTSFHRKDLRPVVILQTEDGA VTECILKNNGFNVFTSAEELEK >gi|226332905|gb|ACII01000114.1| GENE 68 71603 - 72037 563 144 aa, chain + ## HITS:1 COG:CAC1675 KEGG:ns NR:ns ## COG: CAC1675 COG1959 # Protein_GI_number: 15894952 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 1 140 1 138 139 120 42.0 8e-28 MQISTKGRYALRLMLDLAVHNTGELVKIKDISARENISEKYLEQIISSLKKAGYVKSLRG AQGGYMLAREPETYTVGTILRLTEGSMKPVACLEDEPNQCSRAGECVTLRLWKMLDEAIE GVLDKVTLQDLKDWYEEMGNDYVI >gi|226332905|gb|ACII01000114.1| GENE 69 72129 - 73058 849 309 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 308 3 304 308 331 56 1e-89 MAKIYKNAAELVGNTPLLEVGNLEKELGLEARILVKLEYFNPAGSAKDRIALSMIEDAEE RGVLKPGAVIIEPTSGNTGIGLASLAAIKGYRVILTMPETMSVERRNILKAYGAEIVLTD GTKGMNGAIAKANELAKEYENSFIPGQFDNPANPAIHKRTTGPEIWRDTDGQVDVFVAGV GTGGTITGVGEYLKSQNPDVKVVAVEPATSPVLSQGKSGPHKIQGIGAGFVPKALNTEVY DEVFPVENEDAFTVGKLIAKHEGILVGISSGAALYAAIQLAKRPENKGKTIVALLPDSGD RYYSTPLFV >gi|226332905|gb|ACII01000114.1| GENE 70 73242 - 74324 1026 360 aa, chain + ## HITS:1 COG:CAC2233 KEGG:ns NR:ns ## COG: CAC2233 COG0482 # Protein_GI_number: 15895501 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Clostridium acetobutylicum # 2 356 3 355 355 355 50.0 6e-98 MRKAIIAMSGGVDSSVAALLTKETGDECIGATMKLFHNEDIGVKREKTCCSLDDVEDARN VCYRMGIRYYVFNFSERFKEDVMDRFVDAYEHGSTPNPCIDCNRYLKFDKMFQRMRELEY DYIVTGHYARVEYDEEKNRYLLKKAVDDTKDQSYVLYMLTQEQLAHISLPLGGLRKTEVR 
EIAEKHGFVNARKHDSQDICFVPDGDYAKFIEQYTGRKSIPGDFVDTEGNILGKHKGIIH YTLGQRRGLGIPAASRLYVCDISPKTNQVVLGNNEDLFHSELTATKVNLISCESLKEPMR LKAKIRYRHPEQEAVAWQTEDGVLHVRFDKPQRAITRGQAVVLYDGDIVVGGGVIENCIK >gi|226332905|gb|ACII01000114.1| GENE 71 74594 - 75907 1542 437 aa, chain + ## HITS:1 COG:L75975 KEGG:ns NR:ns ## COG: L75975 COG2873 # Protein_GI_number: 15672055 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Lactococcus lactis # 9 435 3 426 426 499 59.0 1e-141 MSKKIERKDRHFKFETLQLHVGQESPDPVTDARAVPIYQTSSYVFRNCDHAAARFGLADA GNIYGRLTNPTEDVFEKRIAALEGGSAALAVASGAAAITYTIENLAQQGDHIVAAKNIYG GTTNLLEHTLPAYGITTTFVDVFDLEEVENAIQDNTKAVYIETLGNPNSDVVDIEAIAKI AHAHKIPLVVDNTFATPYLVRPIEYGADIVVHSATKFIGGHGTTIGGVIVESGKFDWEAS GKFASLTEPNSSYHGVSFTKACGPAAFVTKIRAILLRDTGATISPVSSFIFLQGLETLSL RVERHVENALKVVQYLNNHPQVEKVHHPSVSTDPVQQELYKKYFPNGGGSIFTFEIKGDA QKAKDFIDNLELFSLLANVADVKSLVIHPATTTHSQCTEEELLDQGIKPNTIRLSIGTEN IDDIIQDLDEAFKAVAE >gi|226332905|gb|ACII01000114.1| GENE 72 76133 - 76987 805 284 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00615 NR:ns ## KEGG: EUBELI_00615 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 278 1 278 285 270 50.0 4e-71 MSEIILDSVIDSIKLLPFLFLTYLFMEWLEHKTGAAARKRIRTAGKFGPVWGGLLGVIPQ CGFSAAASSLFAGRVITVGTLIAVYLSTSDEMFPIMISNAVPVVTIVKILTCKAVIGILS GLVLEYVYTHILKKQEPDVDIHEICEEERCHCEHGVISSAAFHTLKVFVYIFLISLVLNI IIGLVGEETLAGLFTGTPIAGELIAALVGLIPNCASSVVITQLYLDHIIGAGAMMAGLLV NAGVGLLILFRLNRDRVQNLKIVGVLYGLGVFWGIIIEFAGIVF >gi|226332905|gb|ACII01000114.1| GENE 73 77042 - 77464 447 140 aa, chain + ## HITS:1 COG:RSp0247 KEGG:ns NR:ns ## COG: RSp0247 COG0735 # Protein_GI_number: 17548468 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Ralstonia solanacearum # 7 135 14 139 151 63 32.0 1e-10 MAGSSYATASRRKILEYLKNSNDHTVTAADVDEYLKKHDSEVNITTIYRYLDKLAKDGTV IKYVAEKGCQAAYQYVEPGRGCEQHLHLKCVKCGKIIHLECHFMEEISHHIEESHGFTLQ CKNSILYGVCKECKGSGEDC 
>gi|226332905|gb|ACII01000114.1| GENE 74 77648 - 78616 620 322 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580129|ref|ZP_04857396.1| ## NR: gi|253580129|ref|ZP_04857396.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 322 13 334 334 603 100.0 1e-171 MLLFFLACVLICAAPPSAAQAAIMQTAKNSTVQKKKTSTKTVTIETSGGITKKKVKSGGS VILPVEVNKRGYTFLGWSTVPGQTCNPMYQAYQKIQVTKNIHLYPVKYKWNQEPDIYAGG LADSVEKYDKIIFVGDSRTAMLRSTLKQQCSSDSLKKVGFVCKTGEGLDWMKKYGEKELL NEISGMDDNAKPVAVIFNLGVNDLIHKNRESISYDSVASDYASYMNGLSRKLTARNCELF YMSVNPCNTAMKSTRKESEIRGFNNRLRQRLNGNFTWINSYSYLMRCGYTTRCEFRGYTD DGVHYSMRTYKRIYAYAIKQIR >gi|226332905|gb|ACII01000114.1| GENE 75 78749 - 79474 848 241 aa, chain + ## HITS:1 COG:BH2910 KEGG:ns NR:ns ## COG: BH2910 COG1296 # Protein_GI_number: 15615473 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine resistance) # Organism: Bacillus halodurans # 2 220 11 223 237 115 32.0 7e-26 MRSENFRKGVKDGIPIGLGYFAVSFTFGMMAVADGLSIWQAVLISLTNVTSAGQFAGLEI MVMSGSYWEMALTQLIINLRYCLMSFSLAQKFRRDESLVHRYIAAFGVTDEIFGISASQP GKVSAFYNYGAMCVAIPGWVLGTLAGAISGNLLPDFMMSALSVAIYGMFLAIIIPPAKQN KAVLAVVVAAMLISTLFKVIPFLSEVSSGFVIIITTLIVAGAAAYFCPIEDEKEEEGVHE S >gi|226332905|gb|ACII01000114.1| GENE 76 79464 - 79781 424 105 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0649 NR:ns ## KEGG: Sterm_0649 # Name: not_defined # Def: branched-chain amino acid transport # Organism: S.termitidis # Pathway: not_defined # 1 100 1 100 105 78 58.0 7e-14 MSHNIYIYILVMAAVTYLIRMLPLALSRKEITSPFIRSFLYYVPYACLAAMTFPAILFAT DSVISAAVGFIVALIAAYKEKSLLTVALFACAAVFIVERILEFAV >gi|226332905|gb|ACII01000114.1| GENE 77 79909 - 80262 430 117 aa, chain - ## HITS:1 COG:slr1665 KEGG:ns NR:ns ## COG: slr1665 COG0253 # Protein_GI_number: 16332245 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Synechocystis # 1 113 169 279 279 134 54.0 5e-32 MDDIDHLDIEKIGPAFENHVAFPDRVNTEFVEVIDEHTVKMRVWERGSGETLACGTGACA 
VAVASVLNGHVDGDSPVTVKLLGGDLQIFWNRQENLVYMTGPAATVFDGEIDVSFLK >gi|226332905|gb|ACII01000114.1| GENE 78 80377 - 80760 606 127 aa, chain - ## HITS:1 COG:BS_yutL KEGG:ns NR:ns ## COG: BS_yutL COG0253 # Protein_GI_number: 16080270 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Bacillus subtilis # 2 126 5 131 284 131 52.0 3e-31 MKFTKMHGIGNDYVYVNCFEETVENPSAVARYVSDRHFGIGSDGLILIKPSKIADCEMDM YNLDGSQGAMCGNGIRCVAKYAYDYGIVNKEHISVATKSGIKYLDLTVENGKVSQVKVNM GSPILTA >gi|226332905|gb|ACII01000114.1| GENE 79 81096 - 81884 1243 262 aa, chain + ## HITS:1 COG:SPy1315_1 KEGG:ns NR:ns ## COG: SPy1315_1 COG0834 # Protein_GI_number: 15675263 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Streptococcus pyogenes M1 GAS # 27 261 120 362 546 103 31.0 3e-22 MKDQKRFAALALSAVMVFSMAGSVSAAQKVESKDDLAGATIGVQLGTTGDLDASDYEKDG STVERYSKGSEAVQALKAGQIDCVIIDSQPAQKFAENNDDLKILDEPFEEEEYAICLKKG NDELLDKINGALKDLKEDGTIDSIMDNYIGEDAGKTPYESPEDVDRSNGTLVMATNAEFE PYEYHEGDDIVGIDADIAQAICDKLGYELKIEDMEFDSILPAVQSGKADFGAAGMTVTED RKSSVDFTDTYADASQVIIVKK >gi|226332905|gb|ACII01000114.1| GENE 80 81967 - 82662 950 231 aa, chain + ## HITS:1 COG:FN0802 KEGG:ns NR:ns ## COG: FN0802 COG0765 # Protein_GI_number: 19704137 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 14 230 11 236 236 177 48.0 2e-44 MILASVKEDFYLNFIKDDRYLWLLDGLKTTLIITVFAVIVGLIIGFLVAIIRSAHDKSGS FKILNAICRVYLTVIRGTPTMIQLLIMNFVIFGAVSINKIIVGGLAFGINSGAYVAEIVR SGIMSIDQGQTEAGRSLGLNFSQTMRLIIIPQAFKNVLPALVNEFIVLIKETSIIGYIGM MDLTKGAMLIQSRTYNAFWPLMAAAAIYLVIVGILTWGMNKLERRLRTSER >gi|226332905|gb|ACII01000114.1| GENE 81 82652 - 83392 587 246 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 245 1 242 
245 230 47 3e-59 MSDKNVLIQVKDLKKAFGTNQVLDGITTDICQGEVVAVIGPSGSGKSTFLRSLNLLEVPT GGQILFEGTDITDPKVDINRHRQKIGMVFQQFNLFPHKTVKQNIMLAPVELKLMTKDEAS KKADELLARVGLPDKANAYPDMLSGGQKQRIAIARSLAMNPDVMLFDEPTSALDPEMVGE VLELMKELAQSGMTMVVVTHEMGFAREVATRVLFIDDGKIQEENTPKEFFANPKNPRLKD FLSKVL >gi|226332905|gb|ACII01000114.1| GENE 82 83394 - 84344 717 316 aa, chain + ## HITS:1 COG:BS_ybaS KEGG:ns NR:ns ## COG: BS_ybaS COG0385 # Protein_GI_number: 16077227 # Func_class: R General function prediction only # Function: Predicted Na+-dependent transporter # Organism: Bacillus subtilis # 41 306 6 271 283 204 39.0 1e-52 MLRKFNSFIEKWMALVTPTCLLLGVCFPDIAKCGLPYVPAVFAFMTFAGALKSRFKDVAN VFRHPGSLLIIMLLVHVVIPTAACGAGHLFFGNNMELITGMVLEFAVPTAVVSLMWVTIY DGNSPLSLSLVVLDTVLAPFLIPATLKILLGSAVTIDSARMMRELIFMVALPAVVAMVLN QITDGKVMETWPGKLAPFSKLALIFVVTSNSSKVAPYIRDMNMQRVKVALAILVLAASGY ALGWFVAYIFRKDRSTAVSMMYGTGMRNISAGAVIAAAYFPGEVMFPVMIGTLFQQVLAA LFGKLLAQKKEKQEFC >gi|226332905|gb|ACII01000114.1| GENE 83 84423 - 84848 409 141 aa, chain - ## HITS:1 COG:CAC3674 KEGG:ns NR:ns ## COG: CAC3674 COG0517 # Protein_GI_number: 15896906 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Clostridium acetobutylicum # 1 137 1 137 140 142 49.0 2e-34 MNILFFLTPKSDVAYIFESETLRQTLEKMEHRKFSCIPLLSLDGKYKGSISEGDLLWGMK TLNVPGLKEAESISIMAIPRRATYKAVHADSDMEDLLDKAINQNYVPVVDDQGYFIGIIT RKEIIKYCYKELKKYIKASDS >gi|226332905|gb|ACII01000114.1| GENE 84 84957 - 85964 1138 335 aa, chain - ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Fusobacterium nucleatum # 9 335 2 327 327 426 63.0 1e-119 MEQIKIPDGYQSSLNLHDTQIAIKTVKDFFQQTLAQKLNLLRVSAPVFVNPSSGLNDNLN GVERPVSFDIKGQPENAEIVHSLAKWKRYALQKYGFAHGEGLYTDMIAIRRDEDLDNIHS VYVDQWDWEKIISKEERNMDTLVSTVRAIYSVLRKTEKYMAVQYDYIEEILPREIAFVST QELVDMYPDLTPKEREYKIVKEKGAVFLMQVGKTLTNGERHDGRAPDYDDWELNGDILVY 
YPVLDIALELSSMGIRVDEDALDRQLTIAGCDDRRELPFQKAIFNKELPYTIGGGIGQSR ICMFFLRKAHIGEVHASLWPEEVMKEAAAKGVQLL >gi|226332905|gb|ACII01000114.1| GENE 85 86253 - 87242 631 329 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580138|ref|ZP_04857405.1| ## NR: gi|253580138|ref|ZP_04857405.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 329 1 329 329 527 100.0 1e-148 MGKRCQVCDGPIVNGRCKYCGMPYRNDMELYHLNEDRSEHYRHASAKVKKAMAENEIPLS DRNKTVPKAAKTSSARTQTIRTQNTKVQTSRIQTSGTKKTSGQSYSAQTARTYSTGKKTA KSEKKKSGKAGKIFWIIVLILMVIIGQMAENWDTIGYRIEEFIYDEFDIDLNSIFDDSSS DEVNESNKNNESNENNESNENNWTDDWSETISAVKNGQDEQRGNPDWYMFTGENYIVRTD DTASEIGDIDASDQGYDFTIDPGEYRIVSGWEKVALEIRNSSGKDETVKFDEADHEEKIE LHAGDEVSVVSLDGQDNYLAMYQIQQYDE >gi|226332905|gb|ACII01000114.1| GENE 86 87348 - 89498 2369 716 aa, chain + ## HITS:1 COG:HI0075 KEGG:ns NR:ns ## COG: HI0075 COG1328 # Protein_GI_number: 16272049 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Haemophilus influenzae # 9 713 6 703 707 306 31.0 8e-83 MGAKGMIYVIKKDGTREEFNSQKIVAAVNKSAARILYTFSDKEKEFICQFAEEHAEALGK NEIEIQEMHNIVEGALERVNPAVAKSYRDYRNYKLDFIHMMDDVYTKSQAIRYIGDKSNA NTDSALVATKRSLIFNELNKELYRKFFMNRNELQACKDGYIYIHDQSARLDTMNCCLFDV ASVLSGGFEMGNVWYNEPKTLDTAFDVMGDIILSTAAQQYGGFTVPEVDKILAPYAQKSY DKYIQEFMKYSDESWTGREERAIEYALDKVRRDFDQGWQGIEYKLNTVGSSRGDYPFVTV TLGLGIKQFEKMASISLLEVHKGGQGKAGRKKPVLFPKIVFLYDENIHGKGKISEDVFEA GVDCSAKTMYPDWLSMSGKGYISSMYKQYGKVISPMGCRAFLSPWYERGGMYPADDKDVP VFVGRFNIGAVSLHLPMILAKARAESRDFYEVLDFYLEMIRNLHIRTYDYLGQMRASTNP LAYCEGGFYGGHLKPNEKIGKILKPMTASFGITALNELQELYNGKSIAEDGQFALDVLKY INRKVNEFKEKDGYLYAIYGTPAESLCGLQVEQFRKMYGIVKGVSDRPYVSNSFHCHVTE DVTPIEKQDLEGRFWELCNGGKIQYVRYPIDYNREAIRTLIRRAMSLGYYEGVNLSLAYC DDCGHEELEMDVCPVCGSKNLTKIDRMNGYLSYSRVHGDTRLNEAKMAEIAERKSM >gi|226332905|gb|ACII01000114.1| GENE 87 89662 - 90594 1194 310 aa, chain - ## HITS:1 COG:yqeA KEGG:ns NR:ns ## COG: yqeA COG0549 # Protein_GI_number: 16130776 # Func_class: E Amino acid transport and metabolism # Function: 
Carbamate kinase # Organism: Escherichia coli K12 # 1 309 1 308 310 286 51.0 4e-77 MSEKITVVALGHRALGTTLPEQKIATKKAAKAVADLVEAGNRVVISHSNGPQVGMIHTAM NEFGKTHPDYTFAPMSVCSAMSQGYIGYDLQTAIRAELISRGIYKPVATILTQVVIDPYD DAFAEPEKIIGRVLTEEEAETEESKGNFVTKLADGTYKRILAAPKPQKIVELETIRTLVD AGQVVIAAGGGGIPVMEQGIDLHGASAIIEKDLSSGLLAQELNADTLLILTSVEKVSLNL GQDNEEYLNTISVDDAKKYMADNQFAPGSMLPKFEAGISFIEKGDNRRTIITDIAHAKDG YFEKTGTIIK >gi|226332905|gb|ACII01000114.1| GENE 88 91286 - 93097 1195 603 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580141|ref|ZP_04857408.1| ## NR: gi|253580141|ref|ZP_04857408.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 184 603 1 420 420 728 99.0 0 MKKRIWNQLKKAVMITAAVGILGAGAVSAAEFEAAELYVPAVGLKQSAQWIDEEKFQAEL TLEVSGLKELYKSQQENTVSENGQLQMENSTWDGEVREAEAAENTTDEVDSENTECESES AVIDIEDIESEEYRKAEENAEGSDTETGQTQQSEQENQNEIREDKTRYFLTVYISEYFQI NAEVLKNDLQAESVQIQNKKGETTQITKLTCETALSDAETDIFTMKVPVSLREEYRISSV KTDYPLCQDAPLCKDWEGTGAYFWMKSGDETRTVAETPSALLSVQEAKTGITARLQQETK EVRAGQNLTYILSVENTGELPLENIEISSVFSQDNIKAEWKQEEGFIANGMQGIISGLKA GEIRNLQMTVQLTEEQAGELIHTVTVKAKYPGKEESIGCQKSVETEVIALKAAFEVEKTA DRTQAYPGDTITYQICIRNTGERTLHSVLSTERFQNAGIQANFVRKEGVTLNSTGTQALI PQIAPGEAFALYATVTVPQYMASQELINEVIVTSDETGTQAMTSKANVELKETDNIVTVT PQPASLASQNYGYDSKSGSAYTTASKPRTGDETETALYIVLGIFAIMSGVSAFCYRKTKK QQK >gi|226332905|gb|ACII01000114.1| GENE 89 93394 - 94851 1647 485 aa, chain + ## HITS:1 COG:CAC2129 KEGG:ns NR:ns ## COG: CAC2129 COG0769 # Protein_GI_number: 15895398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Clostridium acetobutylicum # 1 482 1 478 482 375 43.0 1e-103 MKLISLLEKLEYTCLQGSTDQEVKNVVYDSRKVEEGSLFICIRGAVVDGHKFVPDVVAKG AKVLIVEEAVEAPEDVTVILVKDTRYAMAFISAAYFGYPAEKLKTIGITGTKGKTTTTYM VKSILENAGYKVGLIGTIEAIIGDKVIPAKNTTPESYVIQEYFHEMAEAGCDCVVMEVSS QGLMLHRTQGFVFDFGIFTNIEPDHIGPNEHKDFDDYLRCKSLLLKQCKVGIVNRDDEHF EKIIEGHTCSLETYGFSPEADLRAEDAKLVGGKGYLGISYHLKGLLDFHVEIDIPGKFSI YNSLTAIAICRHFKVSEENILKALKVAKVKGRIEMVKVSDDFTLMIDYAHNAMALESLLT 
TLKEYHPHRLVCLFGCGGNRSKLRRYEMGEVSGKLADLTIITSDNPRDEEPQAIIDDIKI GMAKTDGKYVEIPDRKEAIAYAIHHGEPGDIIVLAGKGHEDYQEIKGKKYPMDERVLIAD ILAGK >gi|226332905|gb|ACII01000114.1| GENE 90 94878 - 97097 1564 739 aa, chain + ## HITS:1 COG:CAC1721 KEGG:ns NR:ns ## COG: CAC1721 COG1198 # Protein_GI_number: 15894998 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Clostridium acetobutylicum # 3 705 4 695 733 555 41.0 1e-157 MIYADIIIDISSDKLDRSFQYRVPERLEKTIKAGMVAAIPFGNAGRVCKGYVIGLSDKPK IAPEKIKEISEICSGEETTESRLIALAAWMRENYGSTMIQALRTVLPIQEKIKAKEKKYI CLDISEEAGTQLLKELEQTRFKARTRLLRELLEKKRLDYTHAAKELGTTSAVVKKFQEQG IIRIEYDEIMRTSLNTGDISGERRMPLTDEQEAAVKQIQREWKKPSPKPVLIEGVTGSGK TQVYIKLIEQTLSQGKEAIVLIPEIALTYQTVRRFYARFGDKVSVINSRQSQGERYDQFK RAKKGEVQVMIGPRSALFTPFASLGLIIIDEEHEPSYKSESTPRYHARETAIRRAVLEHA NVVMGSATPSVEAYSKAMNGEYSLVRLTTRYGSRPLPRVSIVDLREELKAGNRSVLSREL RQKMKGRFEKKEQVMLFLNRRGYAGFVSCRSCGHVMKCPHCDVSLSEHNNERLLCHYCGY ETGKPRICPVCGSPYIGGFKAGTQQIEKVVREDFPEVRTLRMDFDTTRTKGSYEKILAAF AAHEADILIGTQMIVKGHDFPDVTLMGIVAADLSLNAEDYRCAERTFQLLCQAVGRGGRG EKPGEAVIQTYHPDHYSIQAAAVQDYQAFYEEEMSYRMLLDYPPAAHMLAVLGSGPEDEK LVQAMYYLKLYIERIYRENDLHIIGPAYASVGKVKDIYRQVIYLKHKDHKTLVHIKDQLE KYIEINSGFRKIYIQFDFS >gi|226332905|gb|ACII01000114.1| GENE 91 97267 - 97743 848 158 aa, chain + ## HITS:1 COG:CAC1722 KEGG:ns NR:ns ## COG: CAC1722 COG0242 # Protein_GI_number: 15894999 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Clostridium acetobutylicum # 1 147 1 148 150 169 62.0 3e-42 MALRKIRLQGDEVLAKVSRPVEKMTPRIHDLIGDMLETMYDAMGVGLAAPQVGMLKRIVV IDIGEGPIVLINPEILETSGEQTGDEGCLSVPGMAGQVTRPNYVKVKALDEDMNEVEYEG EGLLARAFCHEIDHLDGHMYTELVEGELHQVSYDEDEE >gi|226332905|gb|ACII01000114.1| GENE 92 97751 - 98698 1076 315 aa, chain + ## HITS:1 COG:BS_fmt KEGG:ns NR:ns ## COG: BS_fmt COG0223 # Protein_GI_number: 16078636 # Func_class: J Translation, ribosomal structure and biogenesis # 
Function: Methionyl-tRNA formyltransferase # Organism: Bacillus subtilis # 2 315 3 315 317 306 53.0 4e-83 MKIVFMGTPDFAVPSLHALTEAGYAVAAVVTQPDKPKGRGKTLLPTPVKEEAVMHDIPVY QPEKVRNNPEFLEILKEINPEIIVVAAYGQIIPKEILELPKFGCINIHASLLPKYRGAAP IQQAVIDGEKVSGVTIQQMGEGLDTGDMISKIVIPISPTETGGSLFGKLAQAGADLLIKT LPSIEQGTAEFEKQPEESPTPYAAMITKQMGLMDFRKSAEELERLVRGLNPWPSAYTFLN GKTLKVWKCKVSTEKTDAVPGTIFLADKEGIHTACGEGVLILTEVQLEGKKRMETEAFLR GYHIENGIVFTDHKE >gi|226332905|gb|ACII01000114.1| GENE 93 98703 - 99398 751 231 aa, chain + ## HITS:1 COG:CAC1724 KEGG:ns NR:ns ## COG: CAC1724 COG2738 # Protein_GI_number: 15895001 # Func_class: R General function prediction only # Function: Predicted Zn-dependent protease # Organism: Clostridium acetobutylicum # 5 229 3 227 227 214 52.0 8e-56 MYGYYYFDPTYLLLVIGMVLSLLASAKVKSTFSVYSRVRSASGLTGADAARRILRMSGIT DVTVVPISGSLTDHYDPGSKKLALSQDVYDKTSVAAIGVAAHECGHAIQHATNYVPLNLR SAIVPVANIGSTLSWPLFLAGLIFSIRPLLTVGIILFTFAVLFQLVTLPVEFNASSRALK MLGSSGMLGADEVKGARKVLTAAALTYVAALASSILQLLRLIILAGGRDRD >gi|226332905|gb|ACII01000114.1| GENE 94 99391 - 100743 1097 450 aa, chain + ## HITS:1 COG:SA1060 KEGG:ns NR:ns ## COG: SA1060 COG0144 # Protein_GI_number: 15926800 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Staphylococcus aureus N315 # 6 450 30 459 461 267 36.0 3e-71 MTNGINTRELILQILLEIEEEGKHSHIAIRNALSKYQFLPRQERAFITRVCEGTLEYRIL IDYIIDSYSKVSVDKMKPVIREILRSAVYQIRFMDSVPDSAVCNEAVKLAQRKGFYSLKP FVNGVLRTIAREWKNLKLPSREENPVRYLSVRYSMPETLVNRWLEDYGEEKTEKILTDFL TEKPITVRCRTHKYPQKEIYESLVDQGVEVKPAPYLPYAYEISNYNHILALDAFIQGKIQ VQDVSSMLVAEIADPQKGDYVIDMCAAPGGKSLHVADKMGDYGTVDARDISQYKVDMIEE NIHRTDCINVQAHVMDATVFDVDSELKADVVLADVPCSGYGVIGKKPEIKYRVTAQKQEE IVILQRTILDNAAEYVKPGGVLVFSTCTIAKEENEENMLWFMNNHPFKLESIDPYLPEEL HSETTALGYLQLLPGVHKTDGFFIAKFRRK >gi|226332905|gb|ACII01000114.1| GENE 95 100747 - 101787 1111 346 aa, chain + ## HITS:1 COG:BS_yloN KEGG:ns NR:ns ## COG: BS_yloN COG0820 # Protein_GI_number: 16078638 # Func_class: R General 
function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Bacillus subtilis # 4 335 20 354 363 360 52.0 2e-99 MTDIKSMTIDELKKLMTTLGDKPFRAKQIYSWLHEHLVTSYDEMTNLSKSLREKLKEYPV TALKMVKVQTSRIDGTQKYLFRLSDGNVIESVLMRYKHGNSVCISSQVGCRMGCRFCAST IGGLTRCLLPSEMLDQIYRIQALTGERVSNVVVMGTGEPLDNYENLLRFIHILTEDGGLH ISQRNLTVSTCGLVPKIYDLAKEKLQMTLALSLHAPNDVKRRELMPIANKYSMDEVLEAC RYYFKETGRRITFEYSLVAGVNDSDEDARELSGRIRDMNCHVNLIPVNPIKERSFVRSTR QAVENFKIKLEKCGINVTIRREMGSDIDGACGQLRKSYMEKTESEK >gi|226332905|gb|ACII01000114.1| GENE 96 101789 - 102535 750 248 aa, chain + ## HITS:1 COG:BS_yloO KEGG:ns NR:ns ## COG: BS_yloO COG0631 # Protein_GI_number: 16078639 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Bacillus subtilis # 7 239 7 242 254 172 44.0 4e-43 MRIYSATDVGQKRKMNQDYVFATADPVGNLPNLFVVADGMGGHNAGDYASSHAVTSMVEE IRQDADFNPVKVIRHAIECVNTEILTQAQQDEKLRGMGTTIVAATIVGHYAYVANVGDSR LYVIGEKIQQITRDHSLVQEMVRMGELDPEQARKHPKKNIITRALGAEKTVDIDFFDLKL EPGDVVLMCSDGLSNMVEDSQLREIISDTSTDLDEKGRILIREANRNGGKDNIAIVLVEP FADEVKAC >gi|226332905|gb|ACII01000114.1| GENE 97 102529 - 104712 2239 727 aa, chain + ## HITS:1 COG:BS_yloP_1 KEGG:ns NR:ns ## COG: BS_yloP_1 COG0515 # Protein_GI_number: 16078640 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Bacillus subtilis # 5 299 4 303 331 275 48.0 4e-73 MLKTGTIIAERYEILGKIGTGGMADVYKAKDHKLNRFVAVKVLKPEFREDTTFIRKFKSE AQAAAVLTHPNIVNVFDVGDDNGVYYIVMELIEGITLKEYISKKGKLSVKEATSIAIQVS MGLEAAHSHGIVHRDVKPQNIIISMDGKVKVTDFGIARAASSNTISSNVMGSVHYSSPEQ VRGGYSDEKSDIYSLGITMYEMVTGKVPFDGDTTVAIAIKHLQEEIVPPSVYAPELPHSL EQIILKCTQKSVDRRYQNMEDVIADLKHSLIDPQGDFVTLTSVDNEAKTVVISDKELGEI KHMPKQIAKSEPEALEEEINETDYDDEPEVKKHRKKSDKPEKKKKRGGHGLTIVMLLMGV VILIAIILVAGKASGLIGSNNDTDKKTEASDTSESDDDGMVTVPNLVGKTEDEAKNITKD MKLGIQPMGEEASNQAKGTISSQDIPKGSKVEQYTTIKYYISKGAQQITIPDVDGQTGVD 
AQQTLEDMGLTVEIQKEYSELNDDGTPVTDPGYAVSTTPTAGNSVSAGDSVTLLVSRGVD YGDSVEVPSVVGMTKNDAVTTFGKFLNVEVKEEKSTEVAEGEVISQEPEAGNWEDPDNVN VVITVSTGDQEPSAQSDSADTAASSDAPAQTADNSAAAAAGEVWKCTQTLNTPSGYSGGP VRLELIQNVNGTPTASVVLEDQVIQFPYDLDITGAPGISEGTLYLSEQISGTYQELGNYS ITFAKAE >gi|226332905|gb|ACII01000114.1| GENE 98 104716 - 105594 844 292 aa, chain + ## HITS:1 COG:lin1933 KEGG:ns NR:ns ## COG: lin1933 COG1162 # Protein_GI_number: 16800999 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Listeria innocua # 1 290 2 288 291 253 42.0 4e-67 MQGKIIKGIAGFYYIYAENDEIYECKAKGIFRKDKQKPLVGDNVEIEVLDEQEKEGSVTA ILPRKNSLIRPAVANVDQAFVIFAMENPKPNFMLLDRFLIMMEKENVPAVICFNKKDLAK QEELELLYETYKSCGYDVIFSSTFNGEGLDEIREILKGKTTVVAGPSGVGKSSITNALQE NVQMETGEISKKLKRGKHTTRHSQVIPVGHDTYLMDTPGFSSLYLTDIEEQELKAYFPEF RRYEEQCRFQGCRHIHEPDCGVKAALAEHEISQLRYEDYLGLYNELKEKRRF >gi|226332905|gb|ACII01000114.1| GENE 99 105597 - 106268 840 223 aa, chain + ## HITS:1 COG:CAC1730 KEGG:ns NR:ns ## COG: CAC1730 COG0036 # Protein_GI_number: 15895007 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Clostridium acetobutylicum # 1 209 1 210 216 226 51.0 3e-59 MYQLCPSILSADFNRLGEQIKTLEKEGIEWLHIDVMDGDFVPSISFGMPVIKSIRKESKL FFDVHLMVTEPERYIRDFVECGADSITVHAEACEDLERTIEQIREAGVKVGVSIKPATPV NDISHMLSDVDMVLIMTVQPGFGGQKYMEECTEKIREVRELIEAEDLDVDVQVDGGINDE TMETVLTAGANLLVAGSYVFNDDLAKNVKHARERMDEIIRRID >gi|226332905|gb|ACII01000114.1| GENE 100 106277 - 107008 662 243 aa, chain + ## HITS:1 COG:BS_yloS KEGG:ns NR:ns ## COG: BS_yloS COG1564 # Protein_GI_number: 16078643 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Bacillus subtilis # 29 243 29 214 214 100 28.0 3e-21 MIDTIIVSGGNIHSDFALDFLKKNEACLIAADKGLEFFLEHQLLPDAVIGDFDSLSEDGK KFLEIQEEKSMDSEIPYGGMTEWKLQKGFGNEKKEIKVIRLRPEKDDSDTQSAMNYAIQN GAKRITILGATGNRVDHLMANFGILVLAKNQGVEVILADQYNYMKLVSDGEIIKKSEQFG KYISFFPLGGDVTGLTLEGFKYSLDHYRLTTADSGLTVSNEIVSEKAKVTYQSGTLLMIM SRD 
>gi|226332905|gb|ACII01000114.1| GENE 101 107129 - 108226 1494 365 aa, chain + ## HITS:1 COG:CAC2134 KEGG:ns NR:ns ## COG: CAC2134 COG0012 # Protein_GI_number: 15895403 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Clostridium acetobutylicum # 1 365 1 365 365 451 64.0 1e-126 MKLGIVGLPNVGKSTLFNSLTKAGAESANYPFCTIDPNVGVVTVPDERLKLLGDFYHSKK VTPAVIEFVDIAGLVKGASKGEGLGNQFLANIREVDAIVHVVRCFEDSNVIHVDGSIDPI RDIETINLELIFSDLEILERRIAKVTKTARMDKEAAKELDFLQKVKTHLEDGKMAITMEM ENEDEEAWMSTYNLLTWKPVIYAANVAEGDLADDGESNSHVQAVRKYAAEQNSEIFVICA EIEEEISELDDDEKKMFLEDLGLTESGLEKLVKASYHLLGLMSFLTSGEDETRAWTIKIG TKAPQAAGKIHTDFERGFIKAEVVNYQDLLDCGSYAGAREKGLVRMEGKEYVVQDGDVIL FRFNV >gi|226332905|gb|ACII01000114.1| GENE 102 108247 - 109269 515 340 aa, chain + ## HITS:1 COG:BS_lgt KEGG:ns NR:ns ## COG: BS_lgt COG0682 # Protein_GI_number: 16080552 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Bacillus subtilis # 16 284 10 265 269 174 36.0 2e-43 MSIRFPGPGLVFDYVPRSFQIFGMEFTIYGVLIAVGALLGMGLVTLEAKRNGEDQNRYLD MILISLMVSVAGSRLFYVAFSWGFYKENLNAILDFRNGGYALYGGLLFGFLAAAVFCRIT KMSFWQSADIACPGILLGQAIGRWGNFFNRESFGEYTDLPWAMQIPVSAVRSGEVSGAMR DNLLTINGISYIQVQPIFLYESLWCLLLFLLLMAMRRKKKYEGELFMIYLAGYGLGRFFF EWFRTDKLYIPGTKVGISLVISAILFAVFMPVVIIRRVMAQKRDTIRRQRKKRFLDESVK EKQMFPETSEMKPEPEIRPEMDEQVKPDVGTSPTNSKEEQ >gi|226332905|gb|ACII01000114.1| GENE 103 109563 - 109994 392 143 aa, chain + ## HITS:1 COG:CAC2133 KEGG:ns NR:ns ## COG: CAC2133 COG2001 # Protein_GI_number: 15895402 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 141 1 142 142 184 58.0 7e-47 MFMGEYNHTIDAKGRLIIPSKFRELLGEEFVLTRGLDGCLYIYPMDEWESFEMKLRSLPL TNKNARTFSRFFVAGATTCELDRQGRILVPQTLREFAGLEKDVVLTGNLNRIEVWSKEKW NEICDYDDMDSIAESMQDMGITI >gi|226332905|gb|ACII01000114.1| GENE 104 110045 - 110980 695 311 aa, chain + ## HITS:1 COG:CAC2132 
KEGG:ns NR:ns ## COG: CAC2132 COG0275 # Protein_GI_number: 15895401 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Clostridium acetobutylicum # 1 310 1 310 312 363 59.0 1e-100 MEFKHKSVLLEETIEGLDIKPDGIYVDGTLGGAGHAGEVCRKLSAKGRFIGIDQDQDAII AASERLAPYKQATIIRSNYCYMVQELAARGIYKVDGILLDLGVSSYQLDNEERGFTYRVD APLDMRMDQRQTQTAGDIVNGYEEKELYRIIRDYGEDKFAKNIAKHIVAARQVKPITTTG ELTEIIRESIPMKMQVKSGHPAKRTFQAIRIELNRELDVLRDSLDGMIDILDDGGRLCII TFHSLEDRIVKTIFRKNENPCTCPSDFPVCVCGKKSKGKVITRKPILPGETEMEENPRSK SAKLRIFERVL >gi|226332905|gb|ACII01000114.1| GENE 105 110996 - 111574 481 192 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2253 NR:ns ## KEGG: EUBREC_2253 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 24 190 7 158 158 104 41.0 2e-21 MARTTRSETARRSRTSARTNRGMYVDGNTVRRLAEVPERKRQPGRQNPGRQQTNRNRRVQ ASPAQVPRQKHQLSKEAQKNRQKATAMNWGFVAFLAVVCVAILFCSVKYLRYKSEITAKM STVASLEEELADLKEDNDAYYSQVTSNVDLNKIKQTALGRLGMKYPSDDQTVTYSTSGNS YVRQYQDIPDSK >gi|226332905|gb|ACII01000114.1| GENE 106 111640 - 113823 2139 727 aa, chain + ## HITS:1 COG:CAC2130 KEGG:ns NR:ns ## COG: CAC2130 COG0768 # Protein_GI_number: 15895399 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 1 600 12 593 729 270 31.0 6e-72 MKEKLLILFGIVLLLLVILVLRITYINATSGSKYKKQVLSQAQQKYESQVLPAKRGDIYD KNGNILATSNKVYNVILDCKTVNSDEDYVEPTIRALNEVLGIDEETVRSLLADSRTSQSQ YQVLTKQLSMDKKKEFEKYKAEDEDSGLTKAQAKERANIKGVWFEEDYLRSYPFNETACD TIGFALDRDVADIGLEGYYNSTLTGVDGRQYGYINDDSDVEQTIIAAVNGKSIQTSIDIG AQQIVEKYVNGFKEAMGAKNIGVIVENPSTGEIIAMDGGDRYDLNNPRDLSAVYTEDEIK AMNDVETVEALNGMWSNFCVTDAYEPGSVVKPIVMASALEKGAIQETDTFVCDGSQTFGD TMIKCAVYPSAHGTETLGEVIANSCNDAMMQIGAKMGATQFIKAQSLFNFGTRTGIDLPN EGAGIIHTKDSMGETELACSAFGQGFTCTMIQEINAMSSVINGGYYYQPHLVTKVLDSNG GTIKTISPILLKQTISSRISSDIRSYMALSVQQGTSRHSKVQGYSSGGKTGTAEKYPRGN 
GKYLVSFIGFAPVDDPAVVIYVVVDEPNVEDQANSTYPQYIAQGILSELLPYLNVVPDES EDGTVPETELWEGFKGHLKSTSISDSELDSDGNLVDADGNLIDWDGNRIDENGYLLDANG DHILDEEGNYKMSTNLVSASGSTDADSSGDAVSNPDAPAPLEDDEAETTEDNDEETDGIT NEEAGLE >gi|226332905|gb|ACII01000114.1| GENE 107 113959 - 115728 1521 589 aa, chain + ## HITS:1 COG:BH2572 KEGG:ns NR:ns ## COG: BH2572 COG0768 # Protein_GI_number: 15615135 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Bacillus halodurans # 7 580 29 569 644 341 36.0 2e-93 MTGLIGRLVYLMCFRSDYYYKKAKELHEREREIKAARGEILDAKGKVLASNRTVCTISVI HSQIKEPEKVIALLSEKLKIDEAEVRKRVEKISSIEIVKTNVEKSTGDEIRECSLAGVKV DEDYKRYYPCGSLASKVIGFTGGDNQGIIGLEVKYEEVLRGQPGKILTTTDARGVEIDKL GETREKPIEGKSLMISLDVNIQEFAQQSALKVMEEKQAERVSILLMNPQNGEIYACVNVP EFDLNDPFTLTDDLNMQLNESGITIPSEDHNEQSELAVQTASGKTNTERKQELLNQMWRN PCLNDTYEPGSTFKIITMAAGLEAGVVSPGDRFYCPGYKVVEDRRIHCAKRTGHGSQNFV EGAQNSCNPVFIEVGLRLGVERYYKYFRQFGLLDKTGTDLPGEAGTIMHQKKNMGEVELA TVSFGQSFQITPVQLAATVSSLINGGRRVTPHFGVAVLNPDRTEGEKLIFPVKEGIVSEE TSKEIREILETVVSQGSGKNAKIEGYAIGGKTATSQTLPRSANRYISSFLGFAPAEDPQI LGLCIIHDPKGIYYGGTVAAPVIRGIFENILPYLGIEKSDVSDFSAEGN >gi|226332905|gb|ACII01000114.1| GENE 108 115837 - 116793 1076 318 aa, chain + ## HITS:1 COG:CAC2127 KEGG:ns NR:ns ## COG: CAC2127 COG0472 # Protein_GI_number: 15895396 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Clostridium acetobutylicum # 5 318 4 316 317 238 47.0 1e-62 MEFQVVIPVLISFAISVVLGPVIIPFLRRLKMGQTERTEGVQSHLKKAGTPTMGGVIFLI ATAITALFYVGDYPKIIPVLFLTLGFGIIGFLDDYLKVVLKRSDGLLPWQKFLLQVVLTA IFVFYIVKYTDISLTMRIPFWSGHFLNLGWLAVPVLFFAVIGTVNGVNFTDGLDGLASSV TLIVAVFFTVVSIGMKSGIEPITGAVVGGLMGFLLFNVYPAKVFMGDTGSLALGGFVAGT AYVMQMPLFILLVGLIYLIEVLSVIIQVTYFKATHGKRIFKMAPIHHHFELCGWSETRVV AVFSVITAVMCMVALLAL >gi|226332905|gb|ACII01000114.1| GENE 109 116848 - 118203 1814 451 aa, chain + ## HITS:1 COG:SA1026 KEGG:ns NR:ns ## COG: 
SA1026 COG0771 # Protein_GI_number: 15926766 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Staphylococcus aureus N315 # 3 449 7 443 449 280 36.0 3e-75 MELQGKKVLVFGSGKSGIGASDLLAKVGAFPVIYDGNAETDKDAVVHKTDGTYPVTVYAG ELPKEVQDSLDLVVLSPGVPTDLPLVKSFYEQGLPVWGEVELAYRVGDGEVLAITGTNGK TTTTALLGKIMQDARESVFVVGNIGTPYTSKALEMKPNSITVAEISSFQLETIDEFAPKV SAILNITEDHLNRHHTMEEYIRVKELITENQGTEDVCVLNYEDEVLREFGKHLTPRVVYF SSGRKLDEGIYLDGNKIILKDGEKEIEVVKTEDLKLLGKHNFENVMAAVAMAYYDGVSLD SIRKSICEFTAVAHRIEYVTEKKGVVYYNDSKGTNPDAAIKGIQAMNRPTLLIGGGYDKQ SGYDEWIEAFDGKVRYLVLIGQTKEKIKEAAEKHGFHDIILCEDLKEAVKVCEEKAQPGD AVLLSPACASWGQFDNYEQRGDMFKEYVRNL >gi|226332905|gb|ACII01000114.1| GENE 110 118290 - 119387 1045 365 aa, chain + ## HITS:1 COG:BS_spoVE KEGG:ns NR:ns ## COG: BS_spoVE COG0772 # Protein_GI_number: 16078585 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus subtilis # 30 363 26 365 366 188 37.0 1e-47 MARKRATDLSKTDMTMLVLIVLLVIFGLAVLFSSSEYNGRVRFGDSACYFKKQLFATALG MGVMYMVSSIDYHFFLRLGPVAYLISMFLSGAVLFVGQEINGSKRWLNLGPLSFQPSEFA KVAVILFLAWQIERTKKATMGFGFMCRTILTLLPIIGLVGSNNLSTAIIILGIGGILIFV SNPGYLEFIGLGSAGAGFIAVFLAAESYRLERLAIWRNPEKYEKGFQTIQGLYAIGSGGI FGRGFGNSLQKLGFVPEAQNDMIFSIICEEMGAAGAIFLIFLFAMLLWRLGVAAMHAKDL AGALICCGIMGHLALQVILNIAVVTNTIPNTGITLPFISYGGTSAVFLLGEMGLAMNVGK CDRIN >gi|226332905|gb|ACII01000114.1| GENE 111 119468 - 120751 793 427 aa, chain + ## HITS:1 COG:DR1123 KEGG:ns NR:ns ## COG: DR1123 COG0766 # Protein_GI_number: 15806143 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Deinococcus radiodurans # 4 416 6 418 426 291 40.0 2e-78 MDEIHVTGKASLSGEIFIQGSKNAALPMIAASLMHRGLSILRGCPRISDVFCMEEILKSL GAVTWWDDHDLYLDCSCADGTEIPAVYTGRMRSSIILAGAVLARNRKCRIGYPGGCTIGK RPIDFHLMALRALGADVRENASGLDGECVRFEGQDIVFPKSSVGAAQQAILAAVLAKGVT 
HLYNCAKEPEVFWLCRYLKTMGALIEGEETGEITIRGVDELKGGDYRIPPDRIVAGTYLC AAAAARSKIILHDVPVEELQSFMEVYRKIGGQYQVNGGTLEVNGKNAVRPAVLVETEIYP GFPTDLQAPLMAVLTGAQGQSTIRENIFDRRFGSAFQMKKMGADISVSGNQAIIRGGKTL KGTQVQAGDLRGGAALLIAALMAEGETCVTGAGFISRGYEHICEDLRTLGCRIWQPGTED LRIYERK >gi|226332905|gb|ACII01000114.1| GENE 112 120792 - 121910 1182 372 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00732 NR:ns ## KEGG: EUBELI_00732 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 16 216 39 236 251 113 33.0 1e-23 MLAGMIGAGVMVAVIVFFSYYKVDTVEVRGTSHYTDEEVKNMVLRGPMASNSVLAPLLYS TTNTEDIAYVDAFKVTQLNRNTICISVKEKKTVGCIRYLDSYIYFDRNGIFVEGSQNRDE TVPYFDGIQVNSIVMDEKLDIKGDTVLNTAVALSTIFQKNDMIPDHIQFDSSYSISLIYG DITVQLGKDADLEEKMNRVIAILPKIQGKKGILHMESVATESNTFEEELEQEEVTAENWN GGYDEDGNYTGDGEYDEDGNYVGAKPKTALEEAIENWNGGYDEEGDYTGAGEYDESGNYV GPKPTQESLDAASDESGDGSGDESSADDQSSETSEDGESDSSYDDGNSDDSDGYGDYTED NSDYQGDDSEGY >gi|226332905|gb|ACII01000114.1| GENE 113 122116 - 123306 1428 396 aa, chain + ## HITS:1 COG:CAC1693 KEGG:ns NR:ns ## COG: CAC1693 COG0206 # Protein_GI_number: 15894970 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Clostridium acetobutylicum # 8 362 7 364 373 326 54.0 4e-89 MLEIMSNEAESSAKIIVIGVGGAGNNAVNRMVEEAIGGVEFVGVNTDKQALTLCKAPTVL QIGEKITKGLGAGAQPEVGQKAAEESIEEVKQLMEGADMVFVTCGMGGGTGTGAAPVIAA AAKEMGILTVGVVTKPFRFEARTRMNNALAGIENLKKAVDTLIVIPNDKLLEVVDRRTTM PEALKKADEVLQQAVQGITDLINLPALINLDFADVQTVMTDKGIAHIGIGEARGDDKAME AVQQAVSSPLLETTIKGATHVIINISGDISLMDANDAASYVQELTGEEANIIFGAMYDDT VADYCRITVIATGLNDDNLQQTPFGKRSTASSFGARRPSSTATTGTATAGTAAGSGMNMP SFSLPTMNATPFGASKTPSSTVQKKDIQIPDFLRNR >gi|226332905|gb|ACII01000114.1| GENE 114 123453 - 123782 243 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580167|ref|ZP_04857434.1| ## NR: gi|253580167|ref|ZP_04857434.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 109 1 109 109 145 100.0 6e-34 MTILKCSAMTCVYNKEQLCSKGDIDVTGENATSANETSCGSFRERTGSSMKDSYTDDCGC DKIQIDCKAHNCTYNDNCKCTASSIQVDGSNAHASSDTRCDTFQCHCGA >gi|226332905|gb|ACII01000114.1| GENE 115 124236 - 124730 402 164 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 [Haemophilus parasuis 29755] # 9 164 2 158 159 159 50 8e-38 MINEQIRDKEVRLIGSDGEQLGIMSAKEAMFKAQEAGLDLVKIAPQAKPPVCKIIDYGKY RYELARKEKEAKKKQKTIDVKEVRLSPNIDTNDLNTKVNAARKFLSKGDRVKVTLRFRGR EMAHMASSRHVLDDFAKQLADVATVEKAPKVEGRSMTMFLTEKR >gi|226332905|gb|ACII01000114.1| GENE 116 124766 - 124963 242 65 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|228582464|ref|YP_002853165.1| ribosomal protein L35 [Clostridium sp. 7_2_43FAA] # 1 65 1 65 65 97 75 3e-19 MPKIKTCRAAAKRFKKTGTGKLVRNKAYKSHILTKKSQKRKRNLRKSTVVDATNVKSMKK ALPYL >gi|226332905|gb|ACII01000114.1| GENE 117 125011 - 125367 540 118 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240146873|ref|ZP_04745474.1| ribosomal protein L20 [Roseburia intestinalis L1-82] # 1 118 1 118 118 212 88 8e-54 MARIKGGLNAKKKHNRTLKLAKGYRGARSKQYRVAKQSVMRALTSSYAGRKQKKRQFRQL WIARINAAARMNGLSYSKLMHGLKVANIDINRKMLAELAVNDAEGFAALAEIAKKAIA >gi|226332905|gb|ACII01000114.1| GENE 118 125803 - 126957 1710 384 aa, chain + ## HITS:1 COG:FN1432 KEGG:ns NR:ns ## COG: FN1432 COG0683 # Protein_GI_number: 19704764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Fusobacterium nucleatum # 29 380 31 375 383 235 39.0 1e-61 MKNLKKIISVTAAAAMVMSTVAPVSVFADDATFKIGGIGPVTGAAAIYGQAVKNATELAI NEVNEDGGINGYQVEFKFEDDENDAEKTLNAYNALKDWGMNILVGTVTSTPCVAVANETA ADNMFQLTPSGSATECVANPNVFRVCFSDPNQGTAAAQYIGEHKLATKVAIIYDSSDVYS SGITQNFTAEAANQGIEIVSSEAFTADNNKDFSVQLQKAKDGGAELVFLPIYYTEASLIL AQADSMGFETKFFGCDGMDGILGVENFDTSLAEGLMLLTPFAADSTDEKTQNFVKAYKDA YKDTPNQFAADAYDAVYAVKAAAEKEDVKADMDVADICAALEKGMTEITLEGVTGDEITW TEDGEPNKAPKAVEIKDGAYAAME >gi|226332905|gb|ACII01000114.1| GENE 119 127206 - 128093 
1232 295 aa, chain + ## HITS:1 COG:FN1431 KEGG:ns NR:ns ## COG: FN1431 COG0559 # Protein_GI_number: 19704763 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Fusobacterium nucleatum # 4 295 16 308 308 242 49.0 5e-64 MSSFISYLISGISLGSVYAIIALGYTMVYGIAKMLNFAHGDVIMVGSYVVYVAVSGMGLN PILAILLSIIICTLMGVVIEGVAYRPLRNAASPLAVLITAIGVSYLLQNVALLIWGADTK SFSNVISLPSLKLAGGSIVITGVTIVTIIGGILIMAGLMLFISKTKTGQAMLAVSEDKGA AQLMGINVNKTISITFAIGSGLAAIAGVLLCSAYPSLTPYTGAMPGIKAFVAAVFGGIGS IPGAFIGGVVLGILEIFGKAYISSQMADAIVFGVLIVVLVVKPTGILGKNIQEKV >gi|226332905|gb|ACII01000114.1| GENE 120 128104 - 129159 1430 351 aa, chain + ## HITS:1 COG:FN1430 KEGG:ns NR:ns ## COG: FN1430 COG4177 # Protein_GI_number: 19704762 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 47 349 2 276 285 189 44.0 1e-47 MEKKKLNKTTKSNLITYGMVLAAYLIVQGMIAGNLLSNLMQGLLVPLCVYIILALSLNLT VGILGELSLGHAGFMCIGAFTSAAFTKATMDTFTNAGLRFFVALIIGAVCAGIFGFLIGI PVLRLKGDYLAIVTLAFGEIIKNIINVLYVGIDSNGIHFSMKDQMSLGMEPGGKMIISGA MGITGTPRQSTFTIGVILILVTLFVVLNLINSRDGRAIMAIRDNRIAAESIGIDITKYKL KAFAISAALAGIGGVLYAHNLATLTALPKNFGYNMSIMFLVFVVLGGIGNMRGSIIAAVI LNLLPELLRGLSDYRMLFYAIVLIVMMLINWAPKAIEMRERLFSKFKKKEA >gi|226332905|gb|ACII01000114.1| GENE 121 129163 - 129921 255 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 14 244 33 254 329 102 27 9e-21 MALLEVNHLNISFGGLHAVDDFHVSIEKGQLYGLIGPNGAGKTTIFNLLTGVYKPDEGII NLDGQDITGKSTIDINKAGIARTFQNIRLFHQMSVLDNVKAGLHNHDKYNLFTSIVHTPK YFKVEKTMNEEAMKLLKVFDLDKEADILASNLPYGKQRKLEIARALATKPKLLLLDEPAA GMNPNETEELMDTIRFVRDNFDMTILLIEHDMKLVSGICEELTVLNFGHVLRQGKTSEVL HDPEVIKAYLGE >gi|226332905|gb|ACII01000114.1| GENE 122 129934 - 130644 296 236 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 
[Vibrio campbellii AND4] # 3 223 1 226 245 118 29 2e-25 MALLEVQDIQVYYGMIQALKGVSFSVNEGEVIALIGANGAGKTTTLHTVTGLLRAKSGHI IYDGQDITKVPPHKIVTMGMAHVPEGRRVFANMTVLQNLKMGAFTRSDKNEIDATIEKVY KRFPRLKERQNQTAGTLSGGEQQMLAMGRALMSQPKIILMDEPSMGLSPIFVNEIFDIIK EVSESGTTVLLVEQNAKKALSIADRAYVLETGRITLEGKASDLLHDESVQKAYLGE >gi|226332905|gb|ACII01000114.1| GENE 123 130659 - 131282 743 207 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3531 NR:ns ## KEGG: Cphy_3531 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 203 5 206 209 201 50.0 1e-50 MENFVITIARQYGSGGRTVGKMLAEKLGVSFYDKQIIQMASDESGIDVKLFGQVEEGSSV KASLFNKTGLYKGDLIAPDQKGFVSDENIFNYQAKIATDLAEKESCVIVGRCVNYVFKDR PNTLRVFVHAPWEFRVEKSREKISGSDEEIEKFLRRDDKRKQDYYRKFAGGDWSDATNYD LCLNAGKLGFEKCVDAILAQMNVMGIK >gi|226332905|gb|ACII01000114.1| GENE 124 131324 - 132643 737 439 aa, chain - ## HITS:1 COG:BS_yrvN KEGG:ns NR:ns ## COG: BS_yrvN COG2256 # Protein_GI_number: 16079807 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Bacillus subtilis # 17 435 3 417 421 398 47.0 1e-110 MDLFEYAKSKTLDQESPLASRLRPTTLDEVVGQQHIIGKNTLLYRAIKADKLTSVIFYGP PGTGKTTLAKVIANTTSAEFTQINATVAGKKDMEAVVKEAQQNLGMYGKKTILFIDEIHR FNKGQQDYLLPFVEDGTIILIGATTENPYFEVNAALISRSIIFELKSLEISEVKELILRA VNDKTKGMGNYKARIDEDALDFLADMAGGDARNALNAIELGILTTPRSEDGLIHITLDVA SQCIQKRVVRYDKNGDNHYDIISAFIKSMRGSDPDAAVYYLAKMLYAGEDVKFIARRIMI LASEDIGNADPQALCVAVAAAQAVERVGMPESQIILSQAVTYMACAPKSNSAVNAIFAAM DSVKRTKTTVPPHLQDAHYGGHEKLGHGIGYEYAHDYPNHYVHQQYLPTEIQGEHFYELS EMGYEKTLKEYQNKIKSQS >gi|226332905|gb|ACII01000114.1| GENE 125 132845 - 133357 300 170 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1481 NR:ns ## KEGG: EUBREC_1481 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 170 1 170 171 125 36.0 8e-28 MATPFTVYKLIVLYMLQNTENTLTNSQISEFILDREYTNYFHLQQAISELVEAELITMDT RSNVSHYRITEDGIKTLSFFQKDLSPEIKQEVREYLKSTGFKAQERIVTPADYYITKQGT 
YSVRCQLIEKGNSLIDLNIAAPNLEAAQSICKKWSTHYQEIYGKIMEELL >gi|226332905|gb|ACII01000114.1| GENE 126 133394 - 134320 612 308 aa, chain - ## HITS:1 COG:CAC3238 KEGG:ns NR:ns ## COG: CAC3238 COG1242 # Protein_GI_number: 15896484 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 3 307 15 319 324 336 50.0 4e-92 MTKWGEKPYHSLDYMLRERFGEKVYKVTLNGGMSCPNRDGKIGTRGCIFCSAGGSGDFAA DAALSITDQIESQISILSQKRPIHKYIAYFQAYTNTYAPVEYLEKIFTEAISHPKIVALS IGTRPDCLSPEIVTLLSRLNKHKPVWIELGLQTIHESTARYIRRGYPLCVFDDAVKRLRK ENIEVIVHTILGLPGENTADILETMEYLNHMDIQGIKLQLLHVLRGTDLAADYEKGLFQT YERDEYISLLINCLEHLRPDMVIHRITGDGPKDLLIAPLWASRKREVLNMLHHRMKEEQS YQGRLFSH >gi|226332905|gb|ACII01000114.1| GENE 127 134317 - 135024 620 235 aa, chain - ## HITS:1 COG:BS_yesL KEGG:ns NR:ns ## COG: BS_yesL COG5578 # Protein_GI_number: 16077761 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Bacillus subtilis # 9 193 12 196 209 60 25.0 3e-09 MDRLFNMDNKFFTVMGRVADLIMLNVVFLICCLPIVTIGASLTALHYVTLKMARNEESYI IRSFFKSFKQNFKQATVINLIMLAVAAILYMDLRIVGNIDGTMSQVLYIVFFAFGILYMM VFLYIYPVLAKFYNSIKNTFRNAFLMAIRHLPYTVLMAVITLLPAGVFFIKSFRIQSMAI MLLCMFGFALEAFINGHFLVKIFDNYIPADADTQDTDLQNGNSAETSSAENDHTV >gi|226332905|gb|ACII01000114.1| GENE 128 135149 - 135556 449 135 aa, chain - ## HITS:1 COG:SP1479 KEGG:ns NR:ns ## COG: SP1479 COG0726 # Protein_GI_number: 15901329 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Streptococcus pneumoniae TIGR4 # 1 135 316 450 463 110 38.0 1e-24 MRKDGHLIGNHTWDHVQLDKIPAEKARLEIEKTNNRIYEASGIYPSYVRPPFGAWIKDME LSVTMLPVFWDVDTLDWKSKNIDSILSIAQKQVHDGSIILMHDGYQTSVDAALKIADLFT EKGYVFVTADQLLLT >gi|226332905|gb|ACII01000114.1| GENE 129 135582 - 135902 186 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580180|ref|ZP_04857447.1| ## NR: gi|253580180|ref|ZP_04857447.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 13 106 1 94 94 182 100.0 9e-45 MKRTLIFFEILCMIFFLLFSSVYRRTQFLTSLAGPGMLSLVKQASAKATASENFSTDTTA GLPVEEPPRIALTFDDGPNARYTPMLLDGIKKKKHQGIFLSDRREH >gi|226332905|gb|ACII01000114.1| GENE 130 136160 - 137113 855 317 aa, chain + ## HITS:1 COG:BH3840 KEGG:ns NR:ns ## COG: BH3840 COG1879 # Protein_GI_number: 15616402 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 30 154 50 174 332 74 30.0 3e-13 MIGESYCEGLSGGSSLEEKNEVMCKYKYDMIVDSPNSSFWQAVYDCAQEWAQKNDAILEL RGSGRESDYSKLDYMNMSIASNSDGIILQYSGEQGLEEKINEAVQKGIPVVTVMGDAVHS KRQSFVGVSDYQLGSVYGEKVAEYVTEDTESILILLKKNIDDMNQSQIYTQISNAAQAKA GSLQNIQITGKNLRLTGTFETEEAVTDIFQQRDKVPDILVCMDEETTECARQAVLDFNLA GKVTIIGYYSSEDILTAVEKGVISVTCDVDTEQLGRYSIEALTAYQKDGRTNSYYNVDIN FLDKDVVREMRRKGVSG >gi|226332905|gb|ACII01000114.1| GENE 131 137110 - 138591 1415 493 aa, chain + ## HITS:1 COG:BH3841 KEGG:ns NR:ns ## COG: BH3841 COG2972 # Protein_GI_number: 15616403 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 13 492 6 477 481 176 27.0 1e-43 MKQLRWKDFSLVSKIVIEVGMIAVLLFAMNMLFYVRINNSMQKMDNVYASNAELTELSQV FEKVQDNMYKYLKVKNSQTLLDYYQNEAKYRNELEKLNEDNINDPVKLLERNIRKMSETY LDCTAETVAAKRGRNVEQYKRKYDDATKLYRYIQSSIDELNNLMFQENSSTYAVLRAVMR YLEISNSVIMIIIVAGGMLLLIQATRNMFVPLSNMAETAQLVGQGNFHVKMHDTDAQDEL GTVTRAFNTMVENLDLYMARTKASMEKEQQMMERELLMENHLKEAQLKYLQSQINPHFLF NSLNAGAQLAMMEDAEQTGIFVEKMADFFRYNVKKGQEDATLGEELEAVDNYIYILNVRF AGDIHFSKEVDESLENVRMPSMILQPVVENAVNHGIRDIEWEGKIHLTVTGDADYIRISV NDNGKGMTQEQIEGVLSGNREHRNEEGDSTGIGMNNVISRLELYYEESGLMEINSEGEGK GTEAVIYIPVHEN >gi|226332905|gb|ACII01000114.1| GENE 132 138618 - 140213 1450 531 aa, chain + ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type 
DNA-binding domain # Organism: Bacillus halodurans # 1 530 1 517 530 198 26.0 2e-50 MYRILLADDEGIMLESLKKIIETEYGNECEIHCAKSGRVVVEMAQAYPPDICFMDIQMPG ISGIQAIREIQKFNSSAVFVIITAYDKFNYAKEAVNLGVQEFLTKPVNKKVILETCAKMM TKVDQIRQKRSDDLLIREKLETVVPMIESGYINNILLQDDFVTYQDNYTQLLDIRQKYGY MMVIEFGDSMENGVLSNAVGASVKANKFYSTFREIAQGFFECLVGPIMGNRIVLLVPYEN AKEDYEDRVAIVTRARNMVHKFENRIDSMFRCGIGRVKELGSVKESFKEAVVALRESISH VVHIEDVPAAQKYDGEYPRDLEIRYQKRILEKDAAGAMNCAEGFFEWMHSQQTVTREDIE IKILEMVMNAERRAFFAGTLKYNVNSRRSYIRELQSCTDIENLKKWFLDKTREICTKLEN SKEKEAGSIIDRAKEYINENFRRDISLDDVSREVDISPYYFSKLFKQETGKNFIEYLTEI RLKNARELLQDSRLSIKEICAQSGYSDPNYFSRIFKKYEGVTPSEFRERLG >gi|226332905|gb|ACII01000114.1| GENE 133 140213 - 141310 1222 365 aa, chain + ## HITS:1 COG:YPO4037 KEGG:ns NR:ns ## COG: YPO4037 COG4213 # Protein_GI_number: 16124157 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, periplasmic component # Organism: Yersinia pestis # 53 359 25 331 331 237 42.0 2e-62 MKLYKTKKNCGKLLLGILAFCTGMILLTGCGSSSEQKQNKNEETAEENKEEDKVQIGLTV DSFVIERWIRDRDVFVATARELGAEVNVQDAGADVKEQISQIEYFINKQVDVIVVIARDC GALSDAIQKAQSAGIPVISYDRMVNNANTDLYISFDNRKVGEIMAQALVNALPQGGDVFM IQGSSSDNNVQMVKQGFDDMLADTDLHVVYEANCDGWTAELAAGYVEEALEKYPHVKGIM CGNDDIASQVVQVLAENQLAGNVVVVGQDGDLAACQRIVEGTQYMTAFKPIEDLARKAAK YAVEMGSGKGVAELEDVTETVNDGTYEIPSCILEPTAVTKENIDKVIIEGGFHRRDEVYL NADYS >gi|226332905|gb|ACII01000114.1| GENE 134 141528 - 142595 1401 355 aa, chain + ## HITS:1 COG:BH3442 KEGG:ns NR:ns ## COG: BH3442 COG4213 # Protein_GI_number: 15616004 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, periplasmic component # Organism: Bacillus halodurans # 1 355 1 358 359 345 48.0 7e-95 MRKKVVSAILCASMVAGMVVIPGVAVKADDKKLIGVTMPTKDLQRWNQDGENMKKELEAA GYEVDLQYASNDVSTQVSQLENQVANGCDLLVVASIDGSSLGEPLKQAKEAGIPVISYDR LLMNSDAVTYYATFDNYKVGQKQGEYLVDALDLDNQDGPFNIELFTGDPGDNNCNFFFGG AMDVLQKYIDEGKLVVKSGQTEFEQVATANWDSAKAQDRMDTIIAGNYSDGSNLDAVLCS 
NDSTALGVENALAASYTGEYPIITGQDCDTPNVKNLVAGKQAMSVFKDTRALATAVVGMV DSIMKGEEPEVNDTKSYDNGTGVIPTYLCDPVVVTVDNYKEMLIDSGYYTEDQIK >gi|226332905|gb|ACII01000114.1| GENE 135 142739 - 144277 192 512 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 257 495 3 245 563 78 28 2e-13 MAKILLEMKNITKTFPGVKALDNVNLQVEEGEIHALVGENGAGKSTLMNVLSGIYPYGTY DGNIIYNGEICKFNTIKDSEEKGIVIIHQELALIPYMTIGENMYLGNERGSSLSINWNET YGEADKYLKIVGLEESSRTLIKDIGVGKQQLVEIAKALAKNAKLLILDEPTASLNEDDSQ ALLELLLKFKKQGMTSIIISHKLNEISYVADKITVIRDGSTIETLDKKKDDFSEQRIIQG MVGREMTDRFPKRPGVKIGDVSMEVKNWNVYHPLYSERKVVDNVSFKVHKGEVVGISGLM GAGRTELAMSIFGKSYGTKISGQVFINGKEVKLNTVQEAIDNKLAYVTEDRKGNGLILSK SIKMNTSLANMKGISNGKVIDADKEYAIAEEYRKKLKTKCPSVEQNVGNLSGGNQQKVLL AKWMFAEPDILILDEPTRGIDVGAKYEIYCIINDLVAAGKSVIMISSELPEVLGMSDRIY IMNEGRFVGEVGKEEATSELIMSKIVKSGKGA >gi|226332905|gb|ACII01000114.1| GENE 136 144283 - 145455 1463 390 aa, chain + ## HITS:1 COG:AGc4262 KEGG:ns NR:ns ## COG: AGc4262 COG4214 # Protein_GI_number: 15889623 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 11 385 24 397 398 300 48.0 4e-81 MEKKISFSEILKKYTMVLVLVLVVIMFAVNTKGVMLLPQNVNNLVAQNAYVFILATGMLF CILTGGNIDLSVGSVVCFVAAVGGKMMVLNSMNPYLTMLIMLLTGIAIGAWQGFWIAYVR IPPFIVTLAGMLAFRGLSNVVLQGQTLAPMPDSYLALFNNYIPDPFGKEGFNLICFVVGI IVCIVYVLLVLKNRADRVKKGYSVDSFGGVAVKMILICAVVLVFMFRLAQYKGIPNSLIW VAVIIAIYTYIASKTTTGRYFYAVGGNEKATKLSGINTNKVYFLAYLNMGLLAAIAGMVT MARLNSANPQAGTNFEMDAIGACFIGGASAYGGTGTVPGVIIGALLMGVLNLGMSIMGID QNLQKVVKGLVLLAAVIFDVVSKRKSFIVK >gi|226332905|gb|ACII01000114.1| GENE 137 145336 - 145620 123 94 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPKPHIKHTIFLRVPCQVKQIFAIKAGDLLDSVKKQGLHPASPAKNFHLRMNIQFLFYDK RFSLTNNIKDNRCQQNQTFNNLLQVLVNTHDTHT >gi|226332905|gb|ACII01000114.1| GENE 138 145935 - 146657 857 240 aa, chain + ## HITS:1 COG:CAC0509 KEGG:ns NR:ns ## COG: CAC0509 COG1387 # 
Protein_GI_number: 15893800 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Clostridium acetobutylicum # 1 239 1 235 244 202 41.0 5e-52 MRYNSVLDLHTHTVASGHAYCSLREMAKAASEKGLEVLGITEHAPSMPGTCHKFYFNNLK VVPREMYGIQLLLGSEVNILDAQGTVDLGQRTLSSMDVVIASLHTPCMKPASKLENTEAY LNVMKNPYVNIIGHPDDGRYEIDYEALVQGAKEYGKVLELNNHSMDPECTRENAVENDTV MLNLCKKYQVPVVMDSDAHFDLLIGEFDLARDLLEKLDFPEELVLNRSADAIREYVNRKF >gi|226332905|gb|ACII01000114.1| GENE 139 146895 - 147200 377 101 aa, chain + ## HITS:1 COG:lin1875 KEGG:ns NR:ns ## COG: lin1875 COG4496 # Protein_GI_number: 16800941 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 3 97 4 98 98 115 60.0 2e-26 MSKKIRTEAVDQLFEAILCLKDKEECYTFFEDVCTINELLSLSQRFEVAKMLMDKRTYLD ISEKTGASTATISRVNRSLNYGNDGYEMVFSRMAEKETKGE >gi|226332905|gb|ACII01000114.1| GENE 140 147598 - 150780 2775 1060 aa, chain + ## HITS:1 COG:BH2723 KEGG:ns NR:ns ## COG: BH2723 COG3250 # Protein_GI_number: 15615286 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Bacillus halodurans # 5 1055 6 1009 1014 958 47.0 0 MKAFDYSLVKDPQYFCDNRMEAHSDHVYYRDGNEMQTAVQDGTETSFRYSLNGLWKFHYA RNYKSAVQGFEKKDYCCYDWDDIHVPAHIQMEGYDAPQYANVQYPWEGHEEIYPGQIPER FNPVASYVKYFTVPEHMKGKRLFISFQGAESGLALWLNGAFVGYSEDSFTPSEFELTDYL KDGQNKLAAQVFKWTSSSWCEDQDFFRFSGIYRDVYLYTVPDTHAYDLQIRAIPEENLDV ADLEIKVKTWGKGSIAFRLEKDGECVLEEKKSLTEAGNEEADAEAFYIQKTNSSKTVQNS FVWKINNPKLWSAEDPQLYDLTIELYDEAGNIQEVIPQKVGFRRFEMKNGIMTLNGKRIV FKGVNRHEFSSVSGRHVSEEELRKDLRIMKQNNINAIRTCHYPDTSLIYQLCDEYGIYMI DETNLESHGSWDVAEFTKDYTHVVPHNKPEWLDMMLDRANSMYQRDKNHPAILIWSCGNE SFGGKDIYEMSQLFRKNDPTRLVHYEGLFHDRSYNDTSDMESQMYPSVEAIKEFLAKDDS KPFICCEYTHAMGNSCGAMHKYTDLTDTEPKYQGGFIWDYIDQSIYKKDRYGKEFQAYGG DFGERPTDYNFSGNGIAYGGDREPSPKMQEVKFNYQNITAEVTADTVKVLNKNLFVNTNT FDCKVILAKNGKVICTEALETAVEPLSTKEYKLPFGKAEAVGEYTVTVSFHLKEEKVWAS 
AGHEIAFGQYVYQVKEDVPDGKSDSAETAITAEKDVCVSDAFVKKPQIIRSTHNIGVRGE HFEVMFSVLNGGLVSYKYAGKEMIEAIPKPNFWRAPTDNDCGNLMQMRYAQWKVASMYLS HKEYRKGAYGPSNLPQAEETDHSVKVTFTYLMPTTPASECQLTYEVFGDGKVKTTLTYDP VKELGDMPEFGVIFKFNADYDNVQWYGLGEAETYADRKKGAKLGIYQNKIVDNIARYMVP QECGAKEEVRWAKVTDRKGRGMYFEMDRESMMFSALPYTPHEMENAMHPYELPQIHYTVV RVAKGQMGIGGDDSWGAYTHQEYLLNADGKMEFSFSFKGI >gi|226332905|gb|ACII01000114.1| GENE 141 151012 - 152007 1200 331 aa, chain + ## HITS:1 COG:STM2406 KEGG:ns NR:ns ## COG: STM2406 COG0667 # Protein_GI_number: 16765732 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Salmonella typhimurium LT2 # 1 325 2 325 332 416 61.0 1e-116 MYQANEKRYETMKYHRCGASGLMLPELSLGLWHNFGDTGVYENMKQMCFTAFDNGITHFD LANNYGPQPGSAEENFGKILREEFSAYRDELIVSTKAGYDMWPGPYGNWGSRKYLIASLD QSLKRLGLDYVDIFYHHRMDPETPLEETMEALAQIVRSGKALYVGISNYDGETMKRAADI LEELHCPFIINQNSYNIFNRTIEKNGLKQAAAEKRKGIISFSPLAQGMLTDRYLNGIPKD SRVNTDGRFLHADQFSEERLNSIRSLNAIAAERGESLAQMALKWILRDGIVTSVLIGASR PQQILENLKVLDSASFSEEELKKIDECSLAL >gi|226332905|gb|ACII01000114.1| GENE 142 152060 - 152971 735 303 aa, chain + ## HITS:1 COG:SP1899 KEGG:ns NR:ns ## COG: SP1899 COG2207 # Protein_GI_number: 15901726 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Streptococcus pneumoniae TIGR4 # 16 281 13 274 286 136 30.0 5e-32 MEDSYVLQLQEPKFREFYLSYSGYAQCGPLHCYGPASRPHYLIHFVVKGKGMYQVSGQKH YLTAGQGFLIEPGTQTFYQADKDDPWSYLWIGVGGTNAGKYIRDIGLNSEQLIFRSESGE ELKKIVLDILKHTESTTSNLYYLQGRLYDFFSVLTKDVVIDTYLGNSREKKYGRENGYGE YIKEAMAFIKNEYACGIGVQELAEHLNVNRSYLYTIFMNELHISPKDFLQKYRITRAREL LVLTELSVESIGKSCGYGNAERFARAFKHETGLTPTQFRRQDRSKNRRNLEVSEDGLEKL INS >gi|226332905|gb|ACII01000114.1| GENE 143 153146 - 155368 1965 740 aa, chain + ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 1 737 1 740 748 
894 57.0 0 MSIIFHESSKTFHLYNNNISYIMTVLSNGHLGNLYFGKRIHDREDFSYLLEMKQRPMTAC VYEGNRKFSLEHLKLEYPVYGSSDYRYPAVEILQENGSRISDFTYVSYTIAAGKPKLQGL PATYTEKDEEAQTLCVKLKDEVTGMVLELLYTIFTQRGIITRSARFTNEGTSSVHLLNAM SLSLDLPDKDYVWMQFSGAWSRERHVKERRLEQGIQSVGSIRGNSSHEHNPFIVLRRPSA TENAGEVMGFSLIYSGNHRMQAEVDTHDTTRITVGINPQNFDWKLDCGESFQTPEAVVVF SDKGLNGMSQTFHKLYQKRLARGYWRDRPRPILNNNWEATYFDFTEDRLVEIAAKAKECG VELFVLDDGWFGARSNDYAGLGDWVANRERLPQGIKGIADRIEEMGMKFGLWFEPEMVNK DSDLYRAHPDWILQTPQRHSCHGRNQYVLDFSRKEVVDCIYEMMYKILSEAKVSYIKWDM NRSITECYSAALPADRQGEVFHRYILGVYDLYERLNTAFPQILFESCASGGGRFDPGILY YAPQGWTSDDTDAAERVKIQYGTSMCYPVSSMGSHVSVVPNHQLNRKTPLHTRANVAYFG TFGYELDLNKLSDEEISEVKQQITFMKEYRELIQFGTFYRLKSPFEGNETVWMTVSEDKK TALVFWYRERNVVNADFTRVRLQGLDPDLIYRNEYNETENYGDELMNLGLLTTDCTAGEP TSEDEPCTDYESRIYVLTAK >gi|226332905|gb|ACII01000114.1| GENE 144 155452 - 156357 859 301 aa, chain - ## HITS:1 COG:BH2229 KEGG:ns NR:ns ## COG: BH2229 COG2207 # Protein_GI_number: 15614792 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 46 283 40 275 287 133 31.0 4e-31 MPNVKYTDSATFRCLENLKEGSLDISLIHTGKEHCPPGHICSMPRDEFIIHFVLDGTGFY SAGGQTWSLTPGQMFVIYPNEPVTYGADETNPWTYAWIGFRGIRAHSMVKECGFSKNQLV LPAPDQKIILKHIDYMLNHKQLSKANDLRRQAYLILLLAELAGFHEKQSAQNKKNAKYAY STSVYVELAIEYIKDMYQKGIGISDIADNIGISRAYLNSSFQKELGMSAQTFLIDYKMHK AASLLVSTSLSVKEIANNVGYEDQLVFSKAFKKKFGMSPKNYKTHKEMMDKFNEKQVDDP V >gi|226332905|gb|ACII01000114.1| GENE 145 156601 - 157866 1466 421 aa, chain + ## HITS:1 COG:BH1864 KEGG:ns NR:ns ## COG: BH1864 COG1653 # Protein_GI_number: 15614427 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 81 350 93 367 461 127 31.0 4e-29 MRKNMKKLTVATVIGAMTVTALAPVGAQAKDKTLTVAIWDNGQKAGLQEIMDEFTKETGI KTELQVVEWSSYWTLLEAGASGGDMPDVFWMHSNEAVRYMSNDILLDLTDKIADSDKLEM DKFPEDIKEMYQWDGKTYAVPKDIDTIAMWYNKDMFDEAGIDYPDGSWTWDEFYDIAEKL TKEDGSQYGFAANPSNEQDTWMNIVYSMGGRVLTDDNKSGFDDPNTIKAMEFVDKCVKNV 
MPPASTMSETGTDVLFGSGKVAMISQGSWMVSAFKDNDYIKEHCDVARLPKDAETGESVS LYNGLGWAASANTDMPDEAFQLIEWFGSKDMQKKQADLGVTMAAYEGMSDGWVNSVDCFN LQPYMDAMDNIVFRPHTNATLAWWNPMCEELKKPWNDEESMDDACKNIVTIMNDAIAEES Y >gi|226332905|gb|ACII01000114.1| GENE 146 158032 - 158925 898 297 aa, chain + ## HITS:1 COG:BH1865 KEGG:ns NR:ns ## COG: BH1865 COG1175 # Protein_GI_number: 15614428 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 5 296 17 309 309 192 39.0 7e-49 MANPKKGTSQARNEFIWGWAFILPTMLGLIILNIYPIFKTIYESFFKTGDFGKGNIFIGF DNYVKLFHDAEVWQALLNTFKYAIVEVPFSIAIALVLAVLLNRKMKGRAVYRTIFFLPMV AAPAAIAMVWRWLFNSEFGLLNHIFHTKINWISDPKIAVYAIAVIGVWSIIGYNMVLFLS GLQEIPRDYYEAASIDGATGIKQFFHITIPLLSPTIFFVMVTRVIGAMQVFDLIFMVMDR NSPALYKTQSLVYLFYQNSFVQNNKGYGSTIVVLLLVVIMIMTVFQNIAQKKWVHYN >gi|226332905|gb|ACII01000114.1| GENE 147 158938 - 159762 707 274 aa, chain + ## HITS:1 COG:MT2099 KEGG:ns NR:ns ## COG: MT2099 COG0395 # Protein_GI_number: 15841527 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mycobacterium tuberculosis CDC1551 # 16 271 22 277 280 173 37.0 4e-43 MSARDKANRVLIHVILILVSITMLVPFLWMILTAFKTMTEATSVNPFVILPSTWRWDSFR EVSANMDFLHLYINTLLLILGRVVCAVLTATLAGYAFGRLNFRGKNLMFSLVLFQMMVPG QIFIIPQYLMVSKMGMLNTIFALVFPGIVTAFGTFLLKQAYMGLPKDLEEAARLDGCNIG QTFLFIMAPLTRSSMVALGIFTAVFAYKDLMWPLICNKDVMPLSAALAKMQGQYTNNYPQ LMAASLLACIPMIVLYLIFQKQFIEGIATSGGKL >gi|226332905|gb|ACII01000114.1| GENE 148 159986 - 160651 592 221 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1550 NR:ns ## KEGG: Cphy_1550 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 10 199 10 197 203 78 28.0 2e-13 MKPGSLFDRIFGFLGQLIALNLLWIVCSLPIITAGASTTALFYCTLKLHKDGDIRVLHDF FKSFKQNFRQSTLIWILMVAAGIFIYMEKEALATMPGSMSQIFNYVIFAVYIPLVAVALY VFPTVAAFENKTMTLITNAFYFAVKHIGYALAVAVITILPMTMTLVDAKLFPVYLLIWLM FGFSLTAYADSWFMWKLFKPYFKEEEEGHHYVDTEPDQYAF 
>gi|226332905|gb|ACII01000114.1| GENE 149 160739 - 162940 2476 733 aa, chain + ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 4 732 3 744 748 623 43.0 1e-178 MEWIKYHEAEKVFDLRTEHSTYQMQVREYDTLVHLYYGSPVGDALITDRIVCVDRGFSGN PYEAEKDRTFSLDTLPQEYTAYGNGDYRINGLETEQADGSDTANLKFESYEITKGKYSLK GLPAMFAKEDEAETLEIVLTDRVSGLKAHLLYGVFPHLDVITRVVRLENTGTAPVTVKKA MSMEMDYEYRELDAVHFYGRHNMERQMERTHLGHGNWSVGSIRGTSSHHHNPFVILCDRN TEETYGNCYGYALAYSGNFLFETEVDQVGHTRVAMGIHPYHFSWTLEQGESFETPEVIMA YSAEGFGKLSRIYHDAYRSNLIRSKYTEQPRPILVNNWEATYFDFDADKIYHIAEEAKNI GLDMFVLDDGWFGKRDNDWCALGDWEVNEEKIKGGLPALAEKIHGLGLKFGLWVEPEMIS EDSDLYREHPDWALKIPGRAMNRSRHQLNLDITRKEVREHIMNQIFKVLDACKADYVKWD MNRSVDNVFSAALPKERQGEVYHRYVLAVYEMMESLVQRYPDLLFESCSGGGGRFEAGML YYSPQIWCSDNTDAIDRLKIQYGTSFGYPISAMGAHVSVCPNHQSGRTTPFETRGIVASA GTFGYELDLMKMSEEEKKTAREQILKYKEMEHLVQSGDYYRLVNPFENNNHVLWQFVSKD KKETVVCGVRLHSEANPYIYLFYPQGLDADMKYEDTATGKVYTGAALMKAGLPLPLTTGD YQPIRFTFKAVEK >gi|226332905|gb|ACII01000114.1| GENE 150 163016 - 163633 622 205 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1394 NR:ns ## KEGG: EUBREC_1394 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 203 1 203 217 221 63.0 1e-56 MTIDDKDMINSILESVSRIDYIKPEDIPNIDLYMDQVTTFMEEQLVSSKRYADDKILTKT MINNYAKNKLLPPPEKKRYSKEHVLMLIFIYYFKNILSINDIQTLLTPITQKYFKSMTEK DMTYIYNEVFSMEKEQIESLKKDLLRKYKTAQDTFGDADEEDQANLKRFSFICLLSFDVY VKKMMIEHIIDGMNPESEHNSKKKK >gi|226332905|gb|ACII01000114.1| GENE 151 163928 - 166162 2160 744 aa, chain + ## HITS:1 COG:SA1721 KEGG:ns NR:ns ## COG: SA1721 COG0210 # Protein_GI_number: 15927479 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Staphylococcus aureus N315 # 1 743 1 727 730 669 49.0 0 MSDIYSTLNPQQQEAVYHTEGPLLILAGAGSGKTRVLTHRIAYLIDKKGINPWNILAITF TNKAASEMRERVDKIVGFGSESVWVSTFHSTCVRILRRYIDRLGYDTRFAIYDTEDQKTL 
MKEVCRKLNIDTKKTKERSLLAQISHAKDELITPDEMELNAGGDFVKKKVAEVYREYQAA LRRNNALDFDDLIVKTVELFQNCGDVLENYQERFRYIMVDEYQDTNTAQFKFISLLASKY ENLCVVGDDDQSIYKFRGANIGNILGFEHVFPDAKVIRLEQNYRSTKNILNAANAVIANN TSRKSKTLWTENSEGERIHFRQFMNGYEEAEYVIGEISRAHRENGAKYKDCAVLYRTNAQ SRLFEEKCLLANIPYKIVGGVNFYARKEIKDLLCYLKTIDNSRDDLAVRRIINVPKRGIG ATTLGRIQDYADKMSVSFYDALRVAEEVPSIGRSLSKIDGFVTFIQSLKSKAESYTVREL LEEVIELTGYVAELQAEDTDESKARIENIDELISKTESYQEVMEEQGQTATLSGFLEEIA LIADIDSVDPDQDYVLLMTLHSAKGLEFPRVFLAGMEDGMFPSYMSIISDDRSDLEEERR LCYVGITRAMEDLTLTSARQRMLRGEVQYNKVSRFVREIPRELVDLGQEAQEKKKKIEEM IQADRNLAKMKMAFETKTFKPREFKVTKAAALDYEVGDTVRHIKFGVGIVKDIVDGGRDY EVTVDFDKVGIKKMFAGFAKLKKI >gi|226332905|gb|ACII01000114.1| GENE 152 166296 - 166979 692 227 aa, chain - ## HITS:1 COG:BS_yfnB KEGG:ns NR:ns ## COG: BS_yfnB COG1011 # Protein_GI_number: 16077800 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Bacillus subtilis # 4 225 6 228 235 171 41.0 9e-43 MVETLFFDLDNTLLDFNRAERAAIAKTLKSFHIAPEPSVLKRYSELNLAQWKLLEQGKIT RDQVKLRRFRLLFAELNVDVPAKEAAHTYETLLAQGHYFIDGAEELLETLYGQYRMYLVT NGTLSVQKGRLKSSGISRYFEDIFISEELGYNKPGREYFNYCFSRIPDFHKETAVIIGDS LTSDIQGGINAGIRTIWFNPNHEKTSEIIPDHEFDSLMKLPELLSHI >gi|226332905|gb|ACII01000114.1| GENE 153 167268 - 167489 389 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNIKIEELLSALNRKEEEKEKNTVLWVLAIIGAVAAVAGIAFAVYHFFAPDYLEDFEEDF DDDFDDYFEDEEE >gi|226332905|gb|ACII01000114.1| GENE 154 167556 - 168920 1161 454 aa, chain + ## HITS:1 COG:CAC1435 KEGG:ns NR:ns ## COG: CAC1435 COG2265 # Protein_GI_number: 15894714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Clostridium acetobutylicum # 1 452 4 453 456 427 48.0 1e-119 MKKGEIYEGVIEKVDFPNKGIVITDGQKVTVKNGMPGQKVRFMINKKRSGRAEGRLLEVL EKSPLETREPVCSIFPECGGCMYQTMSYENQLKMKECQVRELLDGALNGTDYQWEGIHGS PIEFRYRNKMEFSFGDAYKDGPLTLGLHKKGSTYDVLTADDCQLVHEDMTKILTCVHEYF 
LKRNVSYYKKMQHTGYLRHLLLRRGVTTGEILVHVITTSQEEHDLESLKEELLALPLEGK IVGIMHIINDSLSDVVQSDETRILYGQDFFYETLLGLRFKISTFSFFQPNSLAAEVLYSI VREYIGDTKDKVVFDLYSGTGTIAQLAASVADEVIGVEIVEEAVEAAKQNAALNNLSNCK FIVGDVLKVLDDLTEKPDVIILDPPRDGIHPKALPKILSYGVDHIVYISCKATSLARDLP AFLAAGYKLEKACCVDQFCQTVHVETVVLLSQLK >gi|226332905|gb|ACII01000114.1| GENE 155 168963 - 169205 106 80 aa, chain + ## HITS:1 COG:SP1029 KEGG:ns NR:ns ## COG: SP1029 COG2265 # Protein_GI_number: 15900900 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 1 78 466 542 543 79 52.0 1e-15 MDITSAETKATYDEIKKYVAEHNAGMKVSNLYISQVKRKCGIEVGKNYNLPKNEDSRQPQ CPEDKESAIVEAVKHCKMNK >gi|226332905|gb|ACII01000114.1| GENE 156 169223 - 169912 509 229 aa, chain + ## HITS:1 COG:CAC0450 KEGG:ns NR:ns ## COG: CAC0450 COG0745 # Protein_GI_number: 15893741 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 224 1 223 227 175 40.0 5e-44 MPGILVVEDDENLNRGITFSLKKSGYEVFSAESVKKAKRIASDNNVDVTICDVNLPDGNG LEFVRWMRCNYNTYIICLTALDQEIDQVMGYEAGADDYITKPFSLSVLLLKIEAHFRRRQ EKTEAGKMISGDIVFIAGEMKVLIKSREISLTKTELKMLTFFLQNPKQILSKTQILENVF DLEGDFVDENTIAVNIRRLREKIEDNPAAPVYIKNIRGLGYIWNQEVRQ >gi|226332905|gb|ACII01000114.1| GENE 157 170032 - 171153 638 373 aa, chain + ## HITS:1 COG:CAC0451 KEGG:ns NR:ns ## COG: CAC0451 COG0642 # Protein_GI_number: 15893742 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 17 371 64 412 416 116 28.0 8e-26 MNLMTEIAAKEEFSGLDAVSELLKDKDIETNEQGRRLLEQYGYWGNKGNAFYSQFRHQVM VTGAVSTVICVLLLTFLLYWKKKEDACHQKILDQLEEILIRFRENKFDDLLKTENHAELE KLNDQLEAIGHHIQLIKEEARAEKENTKEMVSDISHQLKTPVAALDTCFSVLMQNDLSAT EQEEFRIRCRSALDGLETLLQSLLEISKMETGLIQINKKKLPLMDTVISAVNRTYPKADE 
KEIEFVFDYEKELETCTIMQDKRWLGEAVINVLDNAVKYSPCGSKIFIRLQKRNDLVRME IEDQGIGIPQNEYHKIFQRFYRGSSKEVMEKSGTGIGLFLSREIIEKHAGTITVTSGKKK KGSTFVIQLPYVG >gi|226332905|gb|ACII01000114.1| GENE 158 171267 - 171941 329 224 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 220 1 219 223 131 35 2e-29 MSVILETKQLCKFYGAGENQVKAVNQVDIQIEQGEFVAIVGKSGSGKSTLLHMLGGLDTP TKGSVTLAGKDLYRMKEDALAVFRRRKIGFVFQAFNLVSSVNVWENIVLPLGLDGRKVDE AYVNDIIATLGIENRIYNLPNQLSGGQQQRVAIARALVNRPEIIFADEPTGNLDSKTSDE VIALLKMTAKKYGQTIVMITHDDEIAQVADRILVIEDGQVVDFR >gi|226332905|gb|ACII01000114.1| GENE 159 172004 - 174382 1479 792 aa, chain + ## HITS:1 COG:CAC2731 KEGG:ns NR:ns ## COG: CAC2731 COG0577 # Protein_GI_number: 15895988 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 91 504 116 523 875 74 25.0 9e-13 MMVICLTTMLLVIISTVGNGVIHLQKSQAAGSYGSNYGLFVSADGSQLKEVNRRAEIDAT GTMCTEGIIKGNEKGGFVCMDETARKMLPYNKEYELKEGKYPEKMQEIAAGRAFFRAMGY GDVKIGDTVTLDYRAGMQSECKPEEFVVSGILYDRDEYTIEASYVAFGSQEFYDEHVAEN DRQYNIYFTLNDSANVSMNNIDSVIKQIAAACGIEEKNVIVNDLYLQWVLQPSYETIAVC GVLILAIVLFSVVVIYNIFQVGIANKIQEYGKIKALGATKKQMKQLIFREGMFLTISSIP VGLLLGFLIAKCGFNWLVEQGNLVSTGTGSMGVQNRQMPLFSLPVMLLCIFVSFFTVALA LRKPMKIVSRISPIEATRYLENAETQKKGKRNGRKNVTVFSMAMANVTGNPKRTIGTILT MGFSCVLFVIISNYVGNIDTEHEARLSVNHGQFELQLDYSAEYDERYPENNLDTILTDDP LNDSLIEEIKSIPGVTDVMTREIVSVNLNGTRFPADIVSKKDFDFMRQEGDIGSMDYDQA VKNGDIFFGWSTWMEQDGYAPGESIAFDFENGSGTYTYQGKIAGSFVSADTYLVIPEGVY RSMNPRGTAYGYLWVDCDKKDVASVEQSLNTLISNTSHIKMDTYHAQLQSAEFASSMMKL SCYLFMAIVGLIGFMNMANTMIMNITTKKQEYGVLQAVGMTNKQLNLCLQLQGMMFTVGT ICVALIIGLPLGYALFSYAKHNGIFGMNIYHVPIVPIFIMIFLVGLLQIVLSCVLSSNLK KETLVERIRYQG >gi|226332905|gb|ACII01000114.1| GENE 160 174609 - 174992 283 127 aa, chain + ## HITS:1 COG:no KEGG:Mbar_A1092 NR:ns ## KEGG: Mbar_A1092 # Name: not_defined # Def: hypothetical protein # Organism: M.barkeri # 
Pathway: not_defined # 2 127 52 183 192 134 52.0 8e-31 MIEQSIDDGEAYRIVLDGKKVGGVVIKVEGEKGDLDLLFVAPSVHSKGIGYAAWCEVEKL HPEVKVWETVTPYFEQRNIHFYVNRCGFHIVEFFNKYHQDPNDPEMNAEKQDEQFPDGMF RFEKVIR >gi|226332905|gb|ACII01000114.1| GENE 161 175022 - 175234 271 70 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0277 NR:ns ## KEGG: EUBREC_0277 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 70 4 73 73 107 95.0 1e-22 MWWSIGGGIIIFALGMFIFLKPGLIWKLTEEWKSYRADEPSEFYLKATKIGGALFALFGI IMIMLPFILE >gi|226332905|gb|ACII01000114.1| GENE 162 175359 - 175589 170 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291520905|emb|CBK79198.1| ## NR: gi|291520905|emb|CBK79198.1| hypothetical protein [Coprococcus catus GD/7] # 1 76 1 76 76 112 90.0 8e-24 MLRKLNCFLNIVIGSSIGVFIGFGIYKFWHFKTYPNLYAIQSAPWYTELLLDGALVAVVV VVCIILKLIIRKNSKP >gi|226332905|gb|ACII01000114.1| GENE 163 175625 - 175906 327 93 aa, chain - ## HITS:1 COG:SP0276 KEGG:ns NR:ns ## COG: SP0276 COG3041 # Protein_GI_number: 15900210 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 5 91 4 92 92 76 46.0 1e-14 MKYDIQFTNQFKKDLKLAKKQNKNLDKLFEVIDILANGGTLDAKYRDHDLTGNYKGTREC HIEPDWLLIYEICGDVLVLMLYRLGSHSELFKK >gi|226332905|gb|ACII01000114.1| GENE 164 175903 - 176175 374 90 aa, chain - ## HITS:1 COG:SP0275 KEGG:ns NR:ns ## COG: SP0275 COG3077 # Protein_GI_number: 15900209 # Func_class: L Replication, recombination and repair # Function: DNA-damage-inducible protein J # Organism: Streptococcus pneumoniae TIGR4 # 1 53 1 53 87 58 45.0 2e-09 MATTNLNIRIDKAIKDQAEEIFNELGLNMTTAVNMFLRTAIREHGIPFELKLEVPNDTTA AAIEEGRKMMKDPSAPRYSSMDALKAALDV >gi|226332905|gb|ACII01000114.1| GENE 165 176287 - 176610 211 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580213|ref|ZP_04857480.1| ## NR: gi|253580213|ref|ZP_04857480.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 107 7 113 113 192 100.0 4e-48 MDNYNYHKGMNVIIQELKDLLKTKSIGTDSDQALLLDFQETLGTIYLMTANLSQAKTHFK RAFKIYEKTWADEPEMIEAKYQEIQELYPQVGFFLGQQISSFLTKQA >gi|226332905|gb|ACII01000114.1| GENE 166 176700 - 177137 326 145 aa, chain - ## HITS:1 COG:lin0458 KEGG:ns NR:ns ## COG: lin0458 COG1959 # Protein_GI_number: 16799534 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Listeria innocua # 1 132 1 133 155 108 42.0 3e-24 MKYSTKLSDTIHILIFIALGDEEQLSSTKIAESIKTNPAYVRQLMATLKNAGIVVNTQGH ANAALAKPADKINMYDIYRAVEGDKPLLHLDTDTNPDCGIGINIQFAIGDFYHEIQNMVD EKMKSITLQDIIDRYYFKIRKVKSL >gi|226332905|gb|ACII01000114.1| GENE 167 177345 - 178187 707 280 aa, chain + ## HITS:1 COG:SA0315 KEGG:ns NR:ns ## COG: SA0315 COG0846 # Protein_GI_number: 15926028 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Staphylococcus aureus N315 # 11 261 28 279 315 136 30.0 5e-32 MNVYQEISQIIKEADGILIGASNGLSIAEGYNIFADDAWFQENMGDFREKYGLRCVLHGF SVPMKVEERWAFVSRLVKAKAMQDEPSEIMKNIYALVKDKEYFVVTSNAEDHFVPAGFES DRVFEMEGKLTQMRCKNRCHDEVYPNQKAVLAMTEEEVNGRVPKELLPKCPKCGGDMEVN WGEMSSFTETKNWKEKAARYQEFIQNLHGKKLVILEFGIGWRNQMIKAPLMQLAAIEPQA RYITFNKGEIYIPEEIKEKSIGVDGNLTVALKEIRKGRID >gi|226332905|gb|ACII01000114.1| GENE 168 178236 - 179213 539 325 aa, chain + ## HITS:1 COG:SA0315 KEGG:ns NR:ns ## COG: SA0315 COG0846 # Protein_GI_number: 15926028 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Staphylococcus aureus N315 # 50 312 28 289 315 146 30.0 6e-35 MSCYENLVMKTQMNYSRHYSTYASGGTAVVLSKQKPLPYEEQIQEFVRRVQEAECIIVGG ASGLSAAGGGDFYYSDTPSFREHFGKFADKYGFKGAFSGMMHRFSTRNEHWGYVATFLNT TQNAPIREPYLDLDRILQGKDFHILTTNQDTQFVKIYPEEKVSEIQGEHRFFQCSQCCQD ETWDAVQPVADMIAAMGEGTMVPDELIPRCPHCGAEMFPWVRGYGNFLQGKKYEEEYEKI SKYIQKNKDRKILLIELGVGRMTPMFIQEPFWELTNSLKDAYYISVNSEYQFLPEFIEDK GIAILGDIGTVLKDLRKAKEESAFV >gi|226332905|gb|ACII01000114.1| GENE 169 179206 - 179496 93 96 aa, chain + ## HITS:1 COG:TM0048 KEGG:ns NR:ns 
## COG: TM0048 COG1943 # Protein_GI_number: 15642823 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Thermotoga maritima # 22 90 38 103 108 83 56.0 9e-17 MYRILVLDIPLNGETFECGEGDHVHCFVSAPPKLSITAIVKYLKGISGRKLFERFPEIRN QLWKGELWNHSYYVETVGSVSEENIRRYIEHQSKAY >gi|226332905|gb|ACII01000114.1| GENE 170 179521 - 180067 238 182 aa, chain + ## HITS:1 COG:no KEGG:Teth514_2350 NR:ns ## KEGG: Teth514_2350 # Name: not_defined # Def: IS605 family transposase OrfB # Organism: Thermoanaerobacter_X514 # Pathway: not_defined # 10 167 15 158 397 79 32.0 5e-14 MSKKTSIKVSREYANLIGHMCYAASKFWNVCNYERQHYKETGIEQYPDWYYQKKTHKEDL WYKQLPSQTAQEVCKLLGKAWKSFYDLKRSGGIETPRPPRFKQESIPITYMQMGIVHERD MGRVRLSLPKALKKYMEETYQIHENFLYLENKIFRGMDQIKQLRIYPPEKGNCKIIVVYE VP Prediction of potential genes in microbial genomes Time: Sat May 28 20:28:16 2011 Seq name: gi|226332904|gb|ACII01000115.1| Ruminococcus sp. 5_1_39B_FAA cont1.115, whole genome shotgun sequence Length of sequence - 99008 bp Number of predicted genes - 83, with homology - 83 Number of transcription units - 46, operones - 21 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 37 - 75 -0.8 1 1 Tu 1 . - CDS 201 - 1922 1075 ## Mbar_A1995 cell surface protein - Prom 1942 - 2001 7.5 - Term 2116 - 2168 5.1 2 2 Tu 1 . - CDS 2244 - 2525 122 ## gi|253580222|ref|ZP_04857488.1| conserved hypothetical protein - Prom 2661 - 2720 4.4 - Term 2719 - 2759 -1.0 3 3 Tu 1 . - CDS 2863 - 4530 1113 ## MHO_2260 hypothetical protein - Prom 4563 - 4622 2.8 4 4 Tu 1 . - CDS 4710 - 5027 257 ## gi|253580224|ref|ZP_04857490.1| predicted protein - Prom 5262 - 5321 6.7 + Prom 5208 - 5267 6.8 5 5 Tu 1 . + CDS 5347 - 7059 1430 ## gi|253580225|ref|ZP_04857491.1| predicted protein + Term 7086 - 7147 5.4 - Term 7076 - 7129 6.4 6 6 Op 1 . - CDS 7364 - 8314 251 ## COG3291 FOG: PKD repeat - Prom 8420 - 8479 7.7 7 6 Op 2 . 
- CDS 8516 - 8953 480 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) - Prom 9187 - 9246 11.3 - Term 9200 - 9253 7.3 8 7 Tu 1 . - CDS 9281 - 10249 1179 ## COG0673 Predicted dehydrogenases and related proteins - Prom 10300 - 10359 6.6 9 8 Tu 1 . - CDS 10372 - 10569 193 ## gi|253580229|ref|ZP_04857495.1| conserved hypothetical protein - Prom 10615 - 10674 2.5 10 9 Tu 1 . - CDS 10710 - 10895 308 ## COG1983 Putative stress-responsive transcriptional regulator - Prom 10949 - 11008 2.4 11 10 Op 1 . - CDS 11028 - 11999 1266 ## Cphy_1272 hypothetical protein 12 10 Op 2 . - CDS 11996 - 12700 547 ## Cphy_1271 hypothetical protein 13 10 Op 3 . - CDS 12700 - 13020 374 ## COG1695 Predicted transcriptional regulators - Prom 13196 - 13255 5.6 + Prom 13676 - 13735 4.0 14 11 Tu 1 . + CDS 13785 - 15188 1193 ## COG0144 tRNA and rRNA cytosine-C5-methylases - Term 15149 - 15215 18.4 15 12 Op 1 8/0.000 - CDS 15243 - 19052 3255 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 16 12 Op 2 . - CDS 19054 - 22428 2998 ## COG3857 ATP-dependent nuclease, subunit B - Prom 22494 - 22553 5.5 + Prom 22426 - 22485 3.0 17 13 Tu 1 . + CDS 22532 - 22969 314 ## EUBREC_1180 hypothetical protein - Term 22940 - 22983 8.4 18 14 Op 1 . - CDS 23031 - 24947 1921 ## COG0171 NAD synthase 19 14 Op 2 1/0.100 - CDS 25006 - 25884 1206 ## COG1281 Disulfide bond chaperones of the HSP33 family - Prom 25915 - 25974 4.4 - Term 25940 - 25993 0.9 20 15 Tu 1 . - CDS 26002 - 26814 922 ## COG0500 SAM-dependent methyltransferases - Prom 26876 - 26935 5.8 - Term 26931 - 26970 3.3 21 16 Tu 1 . - CDS 26986 - 28107 1116 ## gi|253580242|ref|ZP_04857508.1| conserved hypothetical protein - Prom 28278 - 28337 9.7 - Term 28230 - 28283 7.4 22 17 Tu 1 . 
- CDS 28455 - 30512 2204 ## COG3973 Superfamily I DNA and RNA helicases - Prom 30634 - 30693 7.9 - Term 30667 - 30700 -0.6 23 18 Op 1 35/0.000 - CDS 30807 - 32645 2129 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 24 18 Op 2 4/0.000 - CDS 32642 - 34804 229 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P - Prom 34881 - 34940 2.5 25 18 Op 3 3/0.000 - CDS 34944 - 35411 359 ## COG1846 Transcriptional regulators - Prom 35527 - 35586 9.2 26 18 Op 4 . - CDS 35642 - 36526 649 ## COG0583 Transcriptional regulator - Prom 36626 - 36685 7.6 + Prom 36611 - 36670 7.3 27 19 Tu 1 . + CDS 36700 - 38034 1186 ## COG0733 Na+-dependent transporters of the SNF family + Term 38117 - 38166 11.1 - Term 38104 - 38153 11.1 28 20 Op 1 . - CDS 38236 - 39237 1310 ## EUBREC_2568 hypothetical protein 29 20 Op 2 . - CDS 39253 - 40692 1275 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases - Prom 40795 - 40854 8.0 - Term 41053 - 41111 13.0 30 21 Op 1 . - CDS 41119 - 41292 245 ## gi|253580251|ref|ZP_04857517.1| heat-shock protein 101 31 21 Op 2 . - CDS 41293 - 43884 1929 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 - Prom 43922 - 43981 2.1 - Term 43900 - 43933 2.4 32 22 Op 1 . - CDS 43983 - 44279 223 ## gi|253580253|ref|ZP_04857519.1| conserved hypothetical protein 33 22 Op 2 . - CDS 44348 - 44590 326 ## HMPREF0424_0697 hypothetical protein 34 22 Op 3 . - CDS 44587 - 45057 481 ## HMPREF0424_0696 hypothetical protein 35 22 Op 4 . - CDS 45054 - 45290 280 ## HMPREF0424_0695 hypothetical protein - Prom 45339 - 45398 5.0 36 23 Op 1 . - CDS 45472 - 45666 177 ## gi|253580257|ref|ZP_04857523.1| predicted protein 37 23 Op 2 . - CDS 45663 - 46496 511 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 38 23 Op 3 . - CDS 46574 - 47032 265 ## CLH_1981 hypothetical protein 39 23 Op 4 . 
- CDS 47022 - 47276 182 ## COG1476 Predicted transcriptional regulators - Prom 47343 - 47402 4.7 + Prom 47227 - 47286 6.6 40 24 Op 1 . + CDS 47480 - 48571 685 ## COG1609 Transcriptional regulators 41 24 Op 2 . + CDS 48571 - 49902 866 ## COG0534 Na+-driven multidrug efflux pump 42 25 Op 1 . - CDS 50256 - 50579 250 ## Shel_09520 acyl-CoA thioester hydrolase family protein 43 25 Op 2 . - CDS 50591 - 51397 511 ## COG3315 O-Methyltransferase involved in polyketide biosynthesis 44 25 Op 3 5/0.000 - CDS 51394 - 52176 310 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 45 25 Op 4 . - CDS 52189 - 52830 398 ## COG0500 SAM-dependent methyltransferases - Prom 52999 - 53058 6.2 - Term 52944 - 52986 4.5 46 26 Tu 1 . - CDS 53128 - 54876 1504 ## COG0297 Glycogen synthase - Prom 54993 - 55052 8.2 - Term 55134 - 55172 -1.0 47 27 Op 1 . - CDS 55187 - 55774 583 ## EUBELI_20196 cytidylate kinase 48 27 Op 2 . - CDS 55787 - 57154 1351 ## COG0534 Na+-driven multidrug efflux pump - Prom 57252 - 57311 4.5 - Term 57261 - 57318 4.5 49 28 Op 1 . - CDS 57377 - 57892 454 ## EUBREC_2567 hypothetical protein 50 28 Op 2 . - CDS 57921 - 58142 259 ## gi|253580272|ref|ZP_04857538.1| conserved hypothetical protein - Prom 58222 - 58281 9.3 51 29 Tu 1 . - CDS 58379 - 58591 154 ## gi|253580273|ref|ZP_04857539.1| predicted protein - Prom 58680 - 58739 5.9 - Term 58678 - 58723 6.1 52 30 Tu 1 . - CDS 58757 - 60109 1274 ## COG5492 Bacterial surface proteins containing Ig-like domains - Prom 60199 - 60258 2.5 - Term 60167 - 60206 4.1 53 31 Op 1 . - CDS 60313 - 61059 432 ## Elen_0852 hypothetical protein 54 31 Op 2 . - CDS 61044 - 61961 486 ## Elen_0853 hypothetical protein + Prom 62007 - 62066 10.5 55 32 Tu 1 . 
+ CDS 62291 - 64807 1296 ## COG2909 ATP-dependent transcriptional regulator + Term 64825 - 64877 0.4 - Term 64813 - 64865 3.1 56 33 Op 1 4/0.000 - CDS 64887 - 65159 208 ## COG0524 Sugar kinases, ribokinase family 57 33 Op 2 1/0.100 - CDS 65173 - 66657 1300 ## COG1621 Beta-fructosidases (levanase/invertase) 58 33 Op 3 14/0.000 - CDS 66689 - 68368 1537 ## COG1653 ABC-type sugar transport system, periplasmic component 59 33 Op 4 7/0.000 - CDS 68388 - 69281 730 ## COG0395 ABC-type sugar transport system, permease component 60 33 Op 5 . - CDS 69294 - 70232 795 ## COG4209 ABC-type polysaccharide transport system, permease component - Prom 70423 - 70482 10.4 + Prom 70357 - 70416 10.7 61 34 Tu 1 . + CDS 70451 - 71476 848 ## COG1609 Transcriptional regulators + Term 71499 - 71542 8.5 - Term 72737 - 72772 -0.2 62 35 Op 1 . - CDS 72773 - 74197 828 ## COG3669 Alpha-L-fucosidase 63 35 Op 2 38/0.000 - CDS 74216 - 75043 681 ## COG0395 ABC-type sugar transport system, permease component 64 35 Op 3 35/0.000 - CDS 75055 - 75936 710 ## COG1175 ABC-type sugar transport systems, permease components - Prom 75962 - 76021 3.2 - Term 76010 - 76053 2.3 65 35 Op 4 . - CDS 76095 - 77441 1718 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 77561 - 77620 6.2 + Prom 77550 - 77609 5.8 66 36 Op 1 7/0.000 + CDS 77754 - 79538 1044 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 67 36 Op 2 . + CDS 79575 - 80753 348 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 80776 - 80820 7.1 - Term 80764 - 80807 5.6 68 37 Tu 1 . - CDS 80832 - 83087 1280 ## BH0842 hypothetical protein - Prom 83279 - 83338 9.8 - Term 83162 - 83196 3.2 69 38 Tu 1 . - CDS 83364 - 83768 295 ## COG2200 FOG: EAL domain 70 39 Tu 1 . - CDS 84207 - 85028 439 ## mru_2036 4Fe-4S binding domain-containing protein - Prom 85241 - 85300 3.8 + Prom 85204 - 85263 6.7 71 40 Tu 1 . 
+ CDS 85318 - 85902 573 ## EUBELI_01147 cytidylate kinase + Term 85925 - 85972 4.0 - Term 85911 - 85960 8.2 72 41 Op 1 . - CDS 86063 - 86539 302 ## EUBREC_2983 hypothetical protein 73 41 Op 2 8/0.000 - CDS 86583 - 87713 810 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 74 41 Op 3 . - CDS 87700 - 87978 202 ## COG1396 Predicted transcriptional regulators - Prom 88127 - 88186 10.1 + Prom 88130 - 88189 6.6 75 42 Tu 1 . + CDS 88219 - 88797 311 ## COG1309 Transcriptional regulator + Term 89000 - 89043 -0.9 76 43 Tu 1 . - CDS 88807 - 90351 1717 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) 77 44 Op 1 . - CDS 90455 - 90685 397 ## gi|253580301|ref|ZP_04857567.1| conserved hypothetical protein 78 44 Op 2 . - CDS 90712 - 92100 1631 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins - Prom 92175 - 92234 5.8 - Term 92288 - 92351 18.3 79 45 Op 1 . - CDS 92400 - 93386 497 ## COG1073 Hydrolases of the alpha/beta superfamily 80 45 Op 2 . - CDS 93512 - 96892 2125 ## COG2200 FOG: EAL domain - Prom 97045 - 97104 5.4 - Term 97070 - 97129 5.4 81 46 Op 1 . - CDS 97155 - 97349 138 ## gi|253580305|ref|ZP_04857571.1| predicted protein 82 46 Op 2 . - CDS 97428 - 98285 726 ## COG2510 Predicted membrane protein 83 46 Op 3 . 
- CDS 98287 - 98913 527 ## PROTEIN SUPPORTED gi|241889736|ref|ZP_04777034.1| putative 30S ribosomal protein S12 Predicted protein(s) >gi|226332904|gb|ACII01000115.1| GENE 1 201 - 1922 1075 573 aa, chain - ## HITS:1 COG:no KEGG:Mbar_A1995 NR:ns ## KEGG: Mbar_A1995 # Name: not_defined # Def: cell surface protein # Organism: M.barkeri # Pathway: not_defined # 123 347 1085 1349 1882 67 27.0 2e-09 MKNKKILALLLACTIMASGVPVYAADFSDGTAESVQDAESSLFGMEDSELTEDTMPAFAD MEENADAAQITDGNSEEEDLTGDKTEDKGTEGLEYEYVPEIDGYRVKKGVNEKEIRIPEK YEGKEVLEIGEGAFAGCDQIERIRILDHHKNFKIKKDAFENCISLRKVSFFGGIKVESGA FRNCPKLYDFALINYYEGTEVSIADDAFDADSKVLVSADGGLPWKGSPEPFFNENGEDGW HEKEQGMDYWDYVKNNGPSQYEGTRVADFDNSVSKVRIRDNVKGIGRKAFYGSDRLEEVR LGKENYFIESKAFASCINLKKLFIPSATTVIEENAFENCPSLVIYTVKGSYAEKFAKKHQ IPVRYSAENIQDEKIKLQVSRIKRKDDQSFTDEIRLKWNQPEMADGYLIQKKEAGGKYTT CFNIKNSDICTTTISLLTISSYYGKDISFRIRAYSTGVDGKKSYTPYSYAEVSFWPKQPE ITSLTKKTDRKLTVRWKKTAHTDGYEVWRSENGKPFTCVKRITNGKTQNFTDTGLKKGVT YTYAIKSYRINSLKKKIYGDIFTQTKLKTIKYS >gi|226332904|gb|ACII01000115.1| GENE 2 2244 - 2525 122 93 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580222|ref|ZP_04857488.1| ## NR: gi|253580222|ref|ZP_04857488.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 93 55 147 147 176 100.0 5e-43 MINLINITYCGMKLLSYQDEKFSDYRDKSVQDFRFALSECIRQQVFFATFVKNIETRIKS SSVINALKRAIVKQEHYFIKLFVCPPLNELILN >gi|226332904|gb|ACII01000115.1| GENE 3 2863 - 4530 1113 555 aa, chain - ## HITS:1 COG:no KEGG:MHO_2260 NR:ns ## KEGG: MHO_2260 # Name: not_defined # Def: hypothetical protein # Organism: M.hominis # Pathway: not_defined # 122 344 975 1181 1719 67 27.0 1e-09 MNRKEFLAVVISAALGLSGAPVYASESAAFSDGEVQEYGAEEEKQEFQTDSQDPETDTFQ DEVESDADTDGFVDEATGVSSGEADANGFTYQYLKESDTYRLTKGTDTENVIIPYEYNGK VVSEVGERAFYGCINIKTVKAEEGADRRKNRIRIDKSAFENCENLRKVSFDQGAKLESRA FYNCPKLWEYTGVSYYDSFDTEIEQDSFDADTKVIFRAGDNGIPETVKAFADKNRVFMEI TDNDNYVLTDKDGTSYYDDWSEGDSRTFCVDCDDTMASVRVWSAVEVIGRKAFYGCSNVK KVLIERKTSTIESKAFAKCKNMSIIMPSGITAISDDAFDGASGITIYADKGSYAEKYAKK HNLTCKTIPAPTAVPVPKLKVSYDAKNGNATLNWTPVEYTFQYYIYRYDTATKKYKCVSK VDQNTTSYKPESSAGRTVKYKVRVRTLAGIYTDQYSKKSNTVTVQGRPGNVSDVSKKKKG KNLTFKWTKAKGAQGYILYRYDENARKYRKIKTIKNGNVTSYTDKTGKLNKNENYYVRAY CTTKDGTRLYGWYWA >gi|226332904|gb|ACII01000115.1| GENE 4 4710 - 5027 257 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580224|ref|ZP_04857490.1| ## NR: gi|253580224|ref|ZP_04857490.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 105 9 113 113 193 99.0 3e-48 MDIVDKKNLPAGNYEEQDSFIVGDRIYQGLADAVEPVTGITNGAEKTAQSFLVNGSLELN EDENNEGLDTAVQIMVQVDAYQTLNRPVIDTRWTRGITNYAMIIS >gi|226332904|gb|ACII01000115.1| GENE 5 5347 - 7059 1430 570 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580225|ref|ZP_04857491.1| ## NR: gi|253580225|ref|ZP_04857491.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 570 1 570 570 1056 100.0 0 MPGTFNGTTLTISATPVDHALIPDEGATNVAGIAIVGSDMYCVKTTGLKTTLLKTTDYMA TKATAVKKDLNWFALGLTKIGNYLYMLAEDHKGYGGSTTIVKLDTKGNQLGTYSINEICP KGALGITAADGTNEKFIIMHYADKSNAGTLTFSVVDWKAAKAASTFTVTNSGFGYTTQLQ DIYYHTSFGLFILTNNTFSTLQNRILVVDYHLTDDKGKKYTPCAVINVNKTGEYKQYNLE SICMKANHLVLASNVITSDGTAEDKFSVLEGITYSGKTYTFACGFRAGARVPTKVDPNYE TTNLGCVSFNGNNTPYFIKQSQDKVAVLGYCKNYMNTSAGVSHFKTFTDGLLGHANGMSC FNGKFFVVAGNNKVVAISSNKEDEEITYTVTPNDNDKSFALKAINYFYEANTALLLSVVD GTLKLYKCAFENEKNVKATYLCTLINAGQPTSQDIFYHNKLGLFVGTSNPHNVSGVTTKI TTKNTLLHYNLKKLDDSKLLYPDFGFVTDIPAKDAQGRTYNSFELESVALDQNNKLIAVC NANVKDATHTSGLASTDGFFQYNTLEFITA >gi|226332904|gb|ACII01000115.1| GENE 6 7364 - 8314 251 316 aa, chain - ## HITS:1 COG:MA4292 KEGG:ns NR:ns ## COG: MA4292 COG3291 # Protein_GI_number: 20093081 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 46 176 1412 1539 1995 74 32.0 2e-13 MKSKKVKKILLIALTCAAVSTSVSAEAAMKSQITVESKNKYEQLKISESRVYGEYPTGDY KKITLLPSVSKVEKFCFEDNLNIEEVEWIASVDTVPVFAFSTCPKLKRVILSDNVKKIGQ SAFIYCGELTSVKLPQNLQSIDFFAFADCRKLKTLYIPETVTEIGAEAFINCDSLTVHGK KNSYAYYYCKMNGIPFVSEGTASKPETNRPYIKSVDSDIVNKQIYVTIDLSGKVKNADGY QYQIYDGTKVLANKNSANTTCILKKVPTMGFARVRSYTVQNGKKSYSRWSNEMRMPPVKL NKDNIKLIKITGKRKQ >gi|226332904|gb|ACII01000115.1| GENE 7 8516 - 8953 480 145 aa, chain - ## HITS:1 COG:yjiL KEGG:ns NR:ns ## COG: yjiL COG1924 # Protein_GI_number: 16132155 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Escherichia coli K12 # 2 126 5 153 257 111 43.0 5e-25 MYYVGIDIGSTASEIASRLLDEGIDVMSEGVRVAATGYGRVAVDYADFVITEITCHGRGG RELAGNECAIIDVGGQDTKVILVDQGMIQDFLMNDKCSAGTGKFLEIMANRLGVTIAELF DLAEQMKSETEKWKKYKLIVEIKKT >gi|226332904|gb|ACII01000115.1| GENE 8 9281 - 10249 1179 322 aa, chain - ## HITS:1 COG:CAC1480 KEGG:ns NR:ns ## COG: CAC1480 COG0673 # Protein_GI_number: 15894759 # Func_class: R General function prediction only # Function: 
Predicted dehydrogenases and related proteins # Organism: Clostridium acetobutylicum # 1 314 4 320 320 213 36.0 6e-55 MKLGILGAGGIASTMAKTVAEMKGVEVYAVAARDLERARVFAQKYEVKKAYGSYEEMLAD DEVELVYIATPHSHHYLHAKMCLEAGKHVLCEKAFTVNAEQAQKLFDLAKEKKLLITEAI WTRYMPSRKMINDIIESGVIGEVTAVTANLSYTVSHVERIRKPELAGGALLDVGVYPINF ASMVLGDKVKDVKATAIFQNGVDILDSIAMVFEGDCMATLQCGAREISDRMGSIFGTRGY MQVQNINNPEKITVFDTEHKEVASYVVPEQISGYEYEVESCMKAIQEGKLECPEMPHAET IRIMKIMDDIRKSWNYEIPCIE >gi|226332904|gb|ACII01000115.1| GENE 9 10372 - 10569 193 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580229|ref|ZP_04857495.1| ## NR: gi|253580229|ref|ZP_04857495.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 56 1 56 65 89 100.0 1e-16 MNKEKEIKAYLDGELLPEVRMKYEIAEEMGLLDRVLSDGWKSLSAKETGRIGGLMTKRKK EKLKK >gi|226332904|gb|ACII01000115.1| GENE 10 10710 - 10895 308 61 aa, chain - ## HITS:1 COG:lin2628 KEGG:ns NR:ns ## COG: lin2628 COG1983 # Protein_GI_number: 16801690 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Listeria innocua # 4 60 2 58 66 72 61.0 2e-13 MGDKRLYKSSENSMLCGVCGGIAEYFDIDPTLVRLAWVILTCFGGAGIWAYIIAAIIIPK R >gi|226332904|gb|ACII01000115.1| GENE 11 11028 - 11999 1266 323 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1272 NR:ns ## KEGG: Cphy_1272 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 171 310 129 285 288 76 28.0 2e-12 MKKFTKGMLIAAGIFGAVGIGLTAAGGVMGASMSELTGVKSLKRVLLVADGDYDYDDSDD YDDDDDYDDSGDYDDSDDYDSDDYDDSDDCDDSEDYARVVDENEEDGTVYQLKYQPTKLD IELKYDELILEEGDSFCVRVYDDSGKNVTVKESSDTLKVRSTKKLSKTSKVHISYPEDVK LQELEIEMGAGTVYLNRDIETEKLSVEMGAGEFESKNPVTAREADLEIGTGSMTFADLSA RKTDGECGLGELDLTLTGTQEDYNYDLECGVGNLDVGSDSYSGLGREKTISNKGADRKLD LECGMGNISVDFSGKEHRDLQIS >gi|226332904|gb|ACII01000115.1| GENE 12 11996 - 12700 547 234 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1271 NR:ns ## KEGG: Cphy_1271 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans 
# Pathway: not_defined # 1 82 1 81 301 75 57.0 1e-12 MDRAQFMQELEKLLADISETERQDALDFYNSYFDDAGAENEASVLRELGSPEKVAAIIKA DLKGSAGGYEYGEYTEHGYEDARTKERGQMPERYEEESGTGKRFFRKGNQAVLILAVILL VFISPFVKGAVGGILTFAVGILLLPFWLIVGLGIGAMALLVGGIAAVVAGAGLLAVMTGT GILTIGIGCLMIALAILMILGLISIAVRIVPKWFRKITDFFNRLLYRKRKEAVK >gi|226332904|gb|ACII01000115.1| GENE 13 12700 - 13020 374 106 aa, chain - ## HITS:1 COG:SP0100 KEGG:ns NR:ns ## COG: SP0100 COG1695 # Protein_GI_number: 15900043 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 1 105 1 103 108 88 40.0 3e-18 MVFNTGAALLDAVVLAVVSKEKEGTYGYKITQDVRGVLDVSESTLYPVLRRLQKDDCLEV YDMAYAGRNRRYYKLTDRGAAQLEFYKAEWKIYASKISGMFEGGIG >gi|226332904|gb|ACII01000115.1| GENE 14 13785 - 15188 1193 467 aa, chain + ## HITS:1 COG:SP1402_1 KEGG:ns NR:ns ## COG: SP1402_1 COG0144 # Protein_GI_number: 15901256 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pneumoniae TIGR4 # 11 301 6 277 280 204 39.0 4e-52 MNENTTHLHPAFLTRMKEMLGNEYPAFLSSFSAPRTYGLRINTSKITCEDFENLSPFPTR QIPWIKNGYFYDEDIRPSRCPYYQAGLYYLQEPSAMTPASRIPIEPGDRVLDLCAAPGGK ATAAGAALQGQGLLVANDISTSRARALLRNLELFGIPNVFVANETPAKLTKAFPEFFDKI ILDAPCSGEGMFRKEEALAKDWTPEKSMELSEIQKELILQAADMLRPDGLMLYSTCTFAP AEDEQTVSWLLENRPDMELAEMEDYEGFSHGVPEWGNGNPELIKCIRIFPHKMNGEGHFL ALFHKKGQAIKESSRPHTKPDKNAFPLIEEFLNEIGLKTLCGQPFDWERVEIRGDKAYYL PPVAHNFRGITFLRNGLYLGDLKKNRFEPSQPLALAIRKDESEAVISLSASDERITRYLK GETLNIEPEEAAHKKGWHLLCADGYPIGFGKLVNQILKNKYPAGWRV >gi|226332904|gb|ACII01000115.1| GENE 15 15243 - 19052 3255 1269 aa, chain - ## HITS:1 COG:CAC2262 KEGG:ns NR:ns ## COG: CAC2262 COG1074 # Protein_GI_number: 15895530 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Clostridium acetobutylicum # 5 1261 8 1243 1252 721 35.0 0 
MGVQWTKEQQEVIRLRDRNILVSAAAGSGKTAVLVERILSKITDKEHPVDIDRLLIMTFT RAAAGEMKERISAAIEKALCEDPDNEHLQRQTTLLHTAQITTIDGFCAYIIRNYFHLIGL DPGYRTADEGELKLLRGDVVKALLEEYYAKKDEKFQKFVECFATGKSDENLGNLIQKLYE MAMSNPFPQEWLSGCMDDYRIDSLEELRETEWMRMLWDAVKDELQEANLLVQEARRICSE PDGPYLYDEALSSDLLLIRSLQELAEKRDYNGTAEILMKPSFARLSTKKAADVDEQKKQR VKDLRDEEKGILKELGQRYFQSAEKELLEVIRYIRGPVEMLVELTVGFKERFGEAKREKN ILDFTDMEHFALQILMTKEGEEIHMSQAARELSAKYDEVLVDEYQDSNLVQELLTTAVSG WINQKKNIFMVGDVKQSIYRFRLARPELFMEKYKSYSTEEAQEQRIDLHKNFRSREQVLE SVNFIFRQIMGEDLGGITYDKDAALYPGASFPEGESEEFVKTEVLLVERDGEELSDVQDY EDAGASGNRREMENQTGQELEALAIAQRIKEIVGKEQIVDKEKKEYRSVEYGDIVILLRT AYGWAETFREVLASQGIPVYCTSRTGYFSATEIVTVLNYLKVCDNPLQDIPLMGVLRSPI VGCTSQELAELRIQYPDGLLYESVSAYAGENEIPEKELDPDKLKSELLNSNLRTDEKNSL NIKLKGFLSLLEKVRNMATYTPVHELILYVLKETGYGDYARALPGGEQRFANLTMLVEKA MDYEKTSYRGLFNFVRYIEQLQAYQVDYGEVNLTGAGNTAVEIMTIHKSKGLEFPVVFVA GMGKQFNFQDMNAGLLLHPELGIGADAIIPEKRVIASSLNKQVIRRQLLKESLGEELRVL YVAMTRAKEKLILTGTVGKLEKQMMSLSRFLDEEEELLPLGTRMKAKNYWAFVLPALVRH RAMSELLWEYGILMKKQPGIYDDISEFVIKKITVRQMTEKAVLIQAGNQMQEEYLKNWDE NKVYDEVVKEEIEKRFSFVYPYKYLEDIPVKVSVSDLKKRSWHDESELEENISVSAEEQD EEQEAPVPAFMAEKQEEYKGAARGTAYHRVMECLDYAEADTEEQLRAQLKRLLESQKMTE QEAECIRIRDIRRFVESGLGQRMKKAAMKKHLYREQPFVIQRNASMLDDGWKNETVLVQG IIDAYFMEEEEIVLVDYKTDRVRRGQEQKLINLYHVQLEDYARALERMTGKRVKEKIIYS FTLQKEILL >gi|226332904|gb|ACII01000115.1| GENE 16 19054 - 22428 2998 1124 aa, chain - ## HITS:1 COG:CAC2263 KEGG:ns NR:ns ## COG: CAC2263 COG3857 # Protein_GI_number: 15895531 # Func_class: L Replication, recombination and repair # Function: ATP-dependent nuclease, subunit B # Organism: Clostridium acetobutylicum # 1 1122 1 1142 1153 633 31.0 0 MSLQFIIGSSGAGKSYFAYERVIRESMEHPERNYYIIVPEQFTMQTQKTLVEMHPGKGIL NIDILSFERLAYRVFEETGGDNRKVLEDTGKSMVLQKMVQQHRKELAYLGSQMNKPGYLD EVKSLVSEFMQYDIREENLAEMKEKAQNQPLLEMKLKDVGILYQSFREFLKGHYMTGEEV MDVLLKQLPFSEKLKGAEFLFDGFTGFTPIQVNVLRELLVIADRISVTVTMDEREDAFSP GKPYQLFFMSKQMIRALAGLTRDLEDPVYLKPSGQSRFAQAPALQFLEKNIFRYRKGVYA EEQQEIKIFTAPSPLEEMREAARRMSELVRTCGYRYGEIAVITGNLEEYARLAAQVFEEA 
DIPYFIDEKHSVMMNPFVEYLRAAMEMAVQGFPYESVFRYLRCGMSEVTREQADKLENYV LALGIRGYKKWSEKWVRVYRGMGAEKIQELNEIREIFAEEVRELAQGFGSGKKTVEEYCR ILYEFIQKSDVWQKLKRQERKFKESGDKAMEKEYNQIYGIVMDLLDKMVEILGEETVNRQ EFRQLLESGLSQAKVALIPPSIDQVMVGDMERSRLKEIKALFFVGVNEGNIPKSTQTGGI LSELDRDFFQEQGVELAPGPKELMNMQRFYLYLNMTKPGEKLILSYSDTNAKGEGISPAY LIGSICSLYLKLEIEGGAGVRPHKNSINNYCYPENPEAGIDLFLEKLVQETEKEHEDIRE QADETDAMFGELYSWYLRNPEYRSRVQKLVQSAFAGKPEDIISQSVAKALYGEVSPYSAT RLERFAACAFAHFLQYGMKLTERVEYEFKPMDMGNVMHEALESFAEEVRKRGMKWTELTE QERNEIADRCLDNIVADYGNTVLKSSARNEYMIERTRRILRRTVWALQKQLEQGEFQPEG FEVTFGGGRIDRVDIMEDQNKVYVKVIDYKTGNTSFDLVYLYHGLQLQLMIYLDGALRVE QKKYPDKEIIPAGVFYYNIKDPMIQEKIDADVEAVSAGLMKELKMNGLVQADPELVYRMD SSLGSIPVAFNKDGSFRKNSSVADRTQFAVLGRYVRTKIEKIRSSILEGDAEVSPYELGK KNACTYCPYMTVCGFDRRLSGYEFRRLKNFSDEELWKAFDREAE >gi|226332904|gb|ACII01000115.1| GENE 17 22532 - 22969 314 145 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1180 NR:ns ## KEGG: EUBREC_1180 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 145 1 145 175 101 42.0 7e-21 MSAKTKIIVLHMKEVIYTGIFLLLAVILAIVLFFMFGPGQKKSASADAKSLYKPGIYTSS IDLNENTFDLEVTVDSDRVTSIRLVNLSESTAAMFPLMEPALESLASQIYTSQSLENIQY SEDQKYTSMILLNAIETALKKAENN >gi|226332904|gb|ACII01000115.1| GENE 18 23031 - 24947 1921 638 aa, chain - ## HITS:1 COG:CAC1782_2 KEGG:ns NR:ns ## COG: CAC1782_2 COG0171 # Protein_GI_number: 15895058 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Clostridium acetobutylicum # 322 633 1 313 313 411 62.0 1e-114 MKNGYIKTAAATPYITVADCNANGSEIIRLIHEMEKEHVKVMTFPELCITGYTCQDLFLQ RRLLDSAWETLLKITKETADVDALVFVGVPFRNHGKLYNVAAVLNRGEIIGLVPKTYLPN YGEFYEQRHFASGLGCLEYVDIEGKRVPFGTDILFICEEEPELVAAAEICEDLWVTLPPS VLHAQAGANLIVNLSASNEMVGKDSYRRDLVSGQSARLVCGYVYANAGEGESTQDLVFGG QNMIAENGVILAEGKRFHNGIVCSEIDVQRLNDERRRLTTYQPADDSDHIKVCFHLNVEE TKLTRKYSQYPFVPSRKEERDMRCDEILNIQAMGLKKRMDHIHCHKATVGLSGGLDSTLA LLVIARAFDLSDADRKDIHCITMPCFGTTDRTYQNACKLSQCLGATLSEINIKEAVNVHF RDIAHDPSVHDVTYENSQARERTQILMDSANQDGSILVGTGDLSELALGWATYNGDHMSM 
YGVNASVPKTLVRHLVRYYADTCKDEKLTEVLLDILDTPVSPELLPPKDGKIAQKTEDLV GPYELHDFYLYYMLRAGFEPEKIYRLACETFEGMYDKETIFKWLKTFYWRFFAQQFKRSC LPDGPKVGSVAVSPRGDLRMPSDASAGVWLEQLENMDI >gi|226332904|gb|ACII01000115.1| GENE 19 25006 - 25884 1206 292 aa, chain - ## HITS:1 COG:CAC2370 KEGG:ns NR:ns ## COG: CAC2370 COG1281 # Protein_GI_number: 15895637 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond chaperones of the HSP33 family # Organism: Clostridium acetobutylicum # 1 290 1 292 297 278 47.0 8e-75 MNDYIIRAIAANDQIRAFAAVTTETVETARQDHNTSPVATAALGRLLTAGAMMGTMMKGD KDILTLQIKAGGPLEGITVTADSKGRVKGYVGNPDVCIPANSKGKLDVAGAVGVGFMNVI KDMGLKEPYVGQVALQTSEIAEDLTYYFATSEQVPSAVGLGVLMNKDNTVRQAGGFIVQL MPFAEESTIAKLEENVQKITSVTNLLEEGHTPESLLEKVLEGFDMEINEKVPTEFYCNCS RERVEKALISIGRKELNEMIQEGKSIEMNCHFCNKNYEFTVEELKEILRKCK >gi|226332904|gb|ACII01000115.1| GENE 20 26002 - 26814 922 270 aa, chain - ## HITS:1 COG:BS_yqeM KEGG:ns NR:ns ## COG: BS_yqeM COG0500 # Protein_GI_number: 16079615 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus subtilis # 4 247 3 241 247 136 33.0 4e-32 MESYGRFAGVYDVFMDNVNYREWADYIIETLAQDGIRDGLVLELGCGTGTVTEMLADAGY DMIGIDNSEEMLAEAMEKRAESGHDILYLLQDMQDFELYGTVRAVISVCDSMNYLTDEED LEYLFALVNNYLDPGGLFIFDMNTIHKYRDVIGDTTIAEDREDGSFIWENSYDRENALNV YELALFLPREDGLYEKCEEEHVQKAYSIEAIKAMIVKAGMELVAVCDAYTHNPGDENCER LTFVAREHGKSAQNGYHGKIPEKPETHRAD >gi|226332904|gb|ACII01000115.1| GENE 21 26986 - 28107 1116 373 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580242|ref|ZP_04857508.1| ## NR: gi|253580242|ref|ZP_04857508.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 373 10 382 382 484 99.0 1e-135 MALTTVLSVSMTACNFGGGSDEDSSVVQVTPTPEPTKAAKATPTPAPANAQNTTYTSKNK AVSIKLPDATWANKSDSDDMLSFESPKQGKLLILHGSGEEDMSVAIIPSSQDTASALVKA DNLVEGTDFEIQDYKSEEVSGVNVYSYTVHYLTDKSDYKYVVNKYFTDNTTEFYSVAGSV KDEKALAGIKTSVDSFTISGDSVLKAAATGKTSGTTAGTTADGTSGTAGTAAAGTDGTAS GSSSDGSSSAGTDGSYSDTSSDGSSDYSSSDYTSDGSSDGSYDYDDGSSNGYYADGTPVG TDDPDYDTDQTRTIYRNSDGYPLVIYPNGDGTWCDDDGNTYDFANGEDVYDENGVDYYYH GEPAYVRYMPKNQ >gi|226332904|gb|ACII01000115.1| GENE 22 28455 - 30512 2204 685 aa, chain - ## HITS:1 COG:BS_yvgS KEGG:ns NR:ns ## COG: BS_yvgS COG3973 # Protein_GI_number: 16080398 # Func_class: R General function prediction only # Function: Superfamily I DNA and RNA helicases # Organism: Bacillus subtilis # 15 681 22 753 774 284 30.0 3e-76 MTDKKNGREYLGYVLEKLKERITEISLSLVEGQKEIEGMHEYYWENYTEMDQYGYENYDN QQALFRQISANEEQLILKQRFRRMADSPFFGRVDFIYEGEDEPEIFYIGIGNFAEKAGHI PLVYDWRAPVSGLFYDYDKGPASYEAPMGEIHGEVASKWQYKIRNGKMIYEFESDVKIDD EILKAELGSNGEVQLKNIIRTIQKEQNAIIRNTKDRIMVIQGAAGSGKTSVALHRIAYLL YHDRQNLKSSNILILSPNGVFSDYISHILPELGEENIKEMSFDLFAYKQLRDTVSDCEDR YDEIERRIRFPQKASLAEEKQSMEFINLMERYLVELEDRLMNFKDVEYKGFVKKESEIIE LFYFKFQDFPLLSRMDAVADYFIDEVETLRDRDLADDEKDLIREKFMKLYVTGDLYVIYS QFLKENGYKGLPRVSYEKRKLKYEDVYPVLYLKYRLQSQQGRSNIKHLVVDEMQDYSRLQ YEILQRIFSCRMTILGDRAQTMDDKQQDVLKFLPKIFGRDIHKIIMNKSYRNTIEIASYA NQLAGIEDMELFERHGAPVEEKIFADMSHAAEEIAETLKLGEEEYETAAVVLRTEKEAEH MWLLLKEILGEKGFDIKERLSYLDRNSTSFKKGLTVTTFYLAKGLEFDQVFAVFPQKDRS PLVRQARYIAATRALHELHMYEVEK >gi|226332904|gb|ACII01000115.1| GENE 23 30807 - 32645 2129 612 aa, chain - ## HITS:1 COG:CAC3281 KEGG:ns NR:ns ## COG: CAC3281 COG1132 # Protein_GI_number: 15896526 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 67 606 157 697 706 624 59.0 1e-178 MSEQRRRGPMGGRGRMMSGEKAKDFKGSMAKLFRYMGRYKFRFILMFIFAVAGTVFSIVG PKILGKATTELFNGLVAKVNGTGEIDFGKIGMILLWTLGLYVLSACFSFVQGFVMSGISN DVTYNLRKDISKKINRLPLNYYESRTNGEILSRITNDVDTLQMSLNQSLTQLITSVTTLI 
GVFIMMLSINVWMTLAALLILPVSMFIIQTVMKHSQKYFQDQQSYLGKVNGQIEENFGGH NVVRVFNKENDVVEEFEKDNQKLYESAWKSQFFSGMMMPIMQFVGNLGYVMVALLGGVFV IKKSIEVGDIQSFFQYIRNFTQPIQQIAQVTNLLQSSAAASERVFEFLEEPEESQNEKNP VDVNTLTGDVQFEHVKFGYNPDKIIINDFSADVKDGQKIAIVGPTGAGKTTMVKLLMRFY DLNGGSIKVDGYDIKDFNRSSLREMFGMVLQDTWLFSGTIMENIRYGRLDATDEEVIAAA KAAHVHNFIMQQPGGYDMVLDEETSNISQGQKQLLTIARAILANNKILILDEATSSVDTR TEVRIQKAMDNLMKGRTSFVIAHRLSTIKDADLILVMKDGDIIEQGNHEELLSKKGFYAD LYNSQFENKATA >gi|226332904|gb|ACII01000115.1| GENE 24 32642 - 34804 229 720 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 479 702 132 354 398 92 27 6e-18 MIKLMKYLKKSAGYIVLIIALLFLQAYCDLSLPDYTSKIINVGIQQKGIEDSVPDKLRDS TLQNLQIFMDEDQKKEVSQNYTQEDDTWVLNDDISKETRENLNEDFSKAMMMVSAFSEDS EQGQAMVAQMGLPEGTDPLTALAQMPEEAVQQIMSQVDEKLKDMPESIVTQAGVSFVASE YEALGKDVDAIQMHYILMSGIRMLAMALVIMLAAISVTFISARVAGRLGHDLRNSIYRKV MSFSSREYHKFSTASLITRSTNDVQQVQQVMAMMFRIVLYAPILGIGGVIKVLNTDSSMT WILGVAVGLILIVIFVLFQVAMPKFTILQTLIDRLNLVSREILTGIPVIRAFSREKHEEE RFEKANLDLTKTNLFVNRCMTFMMPIMMLIMNGVSVLIIYSGSHAVDNGTMQVGNVMAFI QYAMQIIMSFLMITAMSIMLPRANVAALRIDEVLKTEVSIQDPETPVHPSESVKGEIEFD HVSFAYPEAGENVLTDISFKAKKGQTIAVIGSTGSGKSTLINLIPRYYDVTKGSVKVDGV DVRDMTQKELRDKLGYVPQKGVLFSGTIDSNIRYGKPEITDAQVKEAAEVAQATEFIDTK PEKYESPVSQGGTNVSGGQKQRLSIARAIAKEPEIFIFDDSFSALDFKTDSQLRKALKEY TKDATTVIVAQRISTILGADQIIVLDDGHMAGIGTHKELMANCEVYQQIARSQLSEEELA >gi|226332904|gb|ACII01000115.1| GENE 25 34944 - 35411 359 155 aa, chain - ## HITS:1 COG:MA0180 KEGG:ns NR:ns ## COG: MA0180 COG1846 # Protein_GI_number: 20089078 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 34 133 38 137 156 67 30.0 7e-12 MWCPEEEKKEKREEKSMPALFMEINRQYGMRCMQRIREIGIQQGQMPIIMIVYRHNGCSQ KEIAEWMCVTPPTVNVSIQRLEKADIVCRKRDDKDQRIMRVYLTENGRKIVEELQQESKA VEKVMFSNFSEAELCLLRRFFGQILDNISEIPVNR >gi|226332904|gb|ACII01000115.1| GENE 26 35642 - 36526 649 294 aa, chain - ## HITS:1 COG:CAC0023 KEGG:ns NR:ns ## 
COG: CAC0023 COG0583 # Protein_GI_number: 15893321 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 292 1 293 299 142 31.0 6e-34 MDTNVLKTFIAVCEYSGFSAAAKELGYTQSTVSSQIKQLEKELDVRLFDRYYHKINLTEK GVLVLQQARNILKAQAKMLDSLNSAESIEGEIRLSMSSSVCSRYFKNDFLRFHHQYPEIK VEITENGTEQMFDKLRKNETDLVFTLDRHIYDSDFIICAEQEEQVHFIAAADNPVADCSW KLSEISQNEFVLTEQAMSYRKILNETLASQSLEIRPVLEIGNPLQICELVKNSSLLSFLP DFISEKYVKDGQIKRLDVAGCPVTVWTQLLLHKNKWRSPAINVFIEFYKEVMQR >gi|226332904|gb|ACII01000115.1| GENE 27 36700 - 38034 1186 444 aa, chain + ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 1 437 7 451 459 372 46.0 1e-103 MKREQLGSRLGFIMLSAGCAIGCGNVWKFPWMCGQNGGGSFMLIYLLCLVILGIPALVLE FSIGRAAQTSPLFMYRKLEKPGQKWGIFGWFCLLGNIALMAFYTVVCGWIIYYFVQFLRG KNGSLGFSAMISSPSVNVFFLLVTVVIAFFILSFNLQGGLERVTKYMMSALLVLMLALAV HSLFLKGSGEGMTFYLKPDFSKIDGSVIVGAMNQAFFTLSTGMGGMAIFGSYIGKDHSLM GEAIHVITLDTLVAFLAGVIIFPACFTFNLEVNAGPSLLFDTMAAVFNNMSGGRIWGSLF FLFMVFAAMSTVLGVCENILAMIRELTGWSRPKGSVVCGTGVFLLALTTALGFSVFHFQP FAEGTTWLDFWDFIVSNNILPLGSLILALFCCNKFGWGWDNFIKESNTGKGLKVQSWMKP LFRFVLPIIIAFIYVYGMSTFNWR >gi|226332904|gb|ACII01000115.1| GENE 28 38236 - 39237 1310 333 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2568 NR:ns ## KEGG: EUBREC_2568 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 328 1 320 331 292 46.0 1e-77 MQDNQEKCKELAIRIYNLDEEVYKCVGSSLGRTFTGQKETTLRHLAALMYTKPDGHLLRS ELALISNMARTIRDKSERRRLMIEYDSILKEIESLPDTFGQTDIVDKERISLNNLIARRQ NEISENDHLIICISRTQGSGGNDIGFELADQLQINYYDAEIFDQVMKRLQADKNAVGDAG SFEGFDKYRKKKHTDLKTWFREFNRYHGLPKQDAVFFNMSDLICELAKSEDCVIMGRCAD AILKNNHIPHISLFISAPFQVRVQHVMDIRNMNLKQAVRFLKQMDKQHKKYYEFYTGEKW GKPENYDLCINSANYGLKDTIELIRRMLDQKVH >gi|226332904|gb|ACII01000115.1| GENE 29 39253 - 40692 1275 479 aa, chain - ## HITS:1 COG:mlr4884 
KEGG:ns NR:ns ## COG: mlr4884 COG1055 # Protein_GI_number: 13474086 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # Organism: Mesorhizobium loti # 9 478 3 468 469 338 44.0 1e-92 MKKLVTVLGAFFAVLLACPLSVFAAEGSKAPSTVPVWLCIPFAGLLLCIAVMPLVKAEWW ESHQPLVVALWIILMVVPFAFVYGAGKTAETVLDCIVNDYLTFIILLFGLFCVSGNITME GDFAGSPRVNVGLLALGTLLSSCIGTTGASMLMVRPVIKMNSWRKRKGHIMIFFIFMVSN MGGCLTPIGDPPLLMGFMRGVPFFWSLHLFPVLIFNMVILLFVFYQIDKRSYRKDIANGR KPDIRKPGTQFRLDGLHNIIFLVMIVGAVILSGVLPGMSAFQDAAGNVRGIHLFGEVTLT FPALIEIVIILLAAFLSFKTTDKQIRIRNHFTWGAIQEVAVLFIGIFITMQPALMLLKAV GPNLGVTEPAEMFWATGALSSFLDNTPTYLVFLTTAGTLAFTNGITTTLGTVPTKILSAI SCGAVFMGANTYIGNAPNFMVKAISDENGVKMPSFFGYMLWSVAFLVPVFIIDMFVFFL >gi|226332904|gb|ACII01000115.1| GENE 30 41119 - 41292 245 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580251|ref|ZP_04857517.1| ## NR: gi|253580251|ref|ZP_04857517.1| heat-shock protein 101 [Ruminococcus sp. 5_1_39B_FAA] # 1 57 1 57 57 107 100.0 3e-22 MIPKDPMILLSYVNTQLRDYYDSLEALCTCRGLKKDELVAKMHSIDYEYDEATNQFI >gi|226332904|gb|ACII01000115.1| GENE 31 41293 - 43884 1929 863 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 861 1 813 815 747 46 0.0 MNISKFTQKSVQAVQDLEKVAYQFGNQEIEEEHLLYALLEQEDSLILKLIEKMEIDKDYF RNSLNQALDAKVKVSGGELRFGQYLNKALVSAEDEAKAMGDEYVSVEHLFLALLRYPSPS MKKLFQEFGITKERFLQALSTVRGNQRVVSDNPEATYDTLNKYGEDLVEKARNQKLDPVI GRDEEIRNIIRILSRKTKNNPVLIGEPGVGKTAAIEGLAQRIVAEDVPEGLKDKKIFALD MGALVAGAKYRGEFEERLKAVLEEVKKSEGNIILFIDELHLIVGAGKTDGAMDASNMLKP MLARGELHCIGATTLDEYRQYIEKDAALERRFQPVMVDEPTVEDTISILRGLKERYEVFH GVKITDSALVAAATLSHRYITDRFLPDKAIDLVDEACALIKTELDSMPSELDEQRRKIMQ LEIEESALKKETDNLSKERLETLQKELAELRDTFNTQKAQWDNEKHSVEKLQKLREQIED VNKQIQKAKQNYDLEKAAQLQYGELPKLQQQLEIEEKSVKESDRSLVHEAVTDDEIARII SRWTGIPVTRLTEGERAKLLTLEDQLHKRVVGQDEGVKRVTDAILRSKAGIKDPTKPIGS FLFLGPTGVGKTELAKTLAENLFDDEQNMVRIDMSEYMEKYSVSRLIGAPPGYVGYEEGG 
QLTEAVRRKPYSVVLFDEIEKAHPDVFNVLLQVLDDGRITDSQGRTVDFKNTILIMTSNI GSPYLLDGIDENGEIKPEAQSQVMDDLRGHFRPEFLNRLDEIIMFKPLTKSNIGKIVDLM VGELDKRLADQELSLELTDAAKDQVIENGYDPVYGARPLKRYLQKYVETLAARKILSGDV HAGDTLVLDVQNGEFIVTVKDGN >gi|226332904|gb|ACII01000115.1| GENE 32 43983 - 44279 223 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580253|ref|ZP_04857519.1| ## NR: gi|253580253|ref|ZP_04857519.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 98 3 100 100 192 100.0 7e-48 MEEKLPFAGDIVRTTSVRRLNKLYFAYMNLSPEVQQAYKELILYCGGVEDLLQKSEIWIK NSGHNSNTDQPNEVNRLIEKFISEVDRRGVEKKLALTY >gi|226332904|gb|ACII01000115.1| GENE 33 44348 - 44590 326 80 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0424_0697 NR:ns ## KEGG: HMPREF0424_0697 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis # Pathway: not_defined # 1 65 1 65 75 84 61.0 1e-15 MTANPILLQKKYSRVIECFAKQQGLSLDAALDFFYHSQVYQLIRDGVSDMHCMSDAYLAE ELEQEYAGKVPVNVVVKDMI >gi|226332904|gb|ACII01000115.1| GENE 34 44587 - 45057 481 156 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0424_0696 NR:ns ## KEGG: HMPREF0424_0696 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis # Pathway: not_defined # 1 147 1 147 164 189 59.0 3e-47 MILYHGSFLEIAKPDLAHSRPNVDFGRGFYATPLYEQAAKWCGKFKRRGKDGIISRYEYD ESRESELKMLKFDSYSEEWLDFILNCRSGKDSTDYDLVVGGVANDKVFNTVELFFDGLID KTEAINRLRYEKPNLQISFRTEKALSLLHFEGSETL >gi|226332904|gb|ACII01000115.1| GENE 35 45054 - 45290 280 78 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0424_0695 NR:ns ## KEGG: HMPREF0424_0695 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis # Pathway: not_defined # 4 78 6 80 80 82 50.0 5e-15 MTDEKIKNSSELEFAVFCIENVAAKLGVNAERVYQAFTEQSNILNGYIVPEYEMLHTQSR EYIVDDLLDVMKERGVEV >gi|226332904|gb|ACII01000115.1| GENE 36 45472 - 45666 177 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580257|ref|ZP_04857523.1| ## NR: gi|253580257|ref|ZP_04857523.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 64 1 64 64 92 100.0 5e-18 MKLNIQWKKVLYGIALIVISIILAALHFIVAGKGIGYFTSSIIAVISVSTWDSSVSFPNL SDSW >gi|226332904|gb|ACII01000115.1| GENE 37 45663 - 46496 511 277 aa, chain - ## HITS:1 COG:CAP0133 KEGG:ns NR:ns ## COG: CAP0133 COG0596 # Protein_GI_number: 15004836 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Clostridium acetobutylicum # 6 270 7 264 264 124 31.0 1e-28 MTENSYKTPCGCIHYFVNIIDKQRITLVFLPGLTADHRLFEKQTEYFENKQNVFVWDAPS HALSRPFTNNYSLSDMAEWLYEILAKEEIYNPIIIGQSMGGYLAQMYMELYPDKITGFIS IDSAPLQKSYMTAMEIWLLERVEPLYRIYPWKALLRAGSRGCSETDYGQNLMRKMMMAYN SEHKEYAKLAGFGYRMLAKAIKADLPYHISCPALLICGEKDKAGSAKSYNKKWHQREGLS LKWIKNAGHNSNTDQPDEVNRLIEKFISEIDRKGVPG >gi|226332904|gb|ACII01000115.1| GENE 38 46574 - 47032 265 152 aa, chain - ## HITS:1 COG:no KEGG:CLH_1981 NR:ns ## KEGG: CLH_1981 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 3 151 1 146 155 152 55.0 2e-36 MKIRKMKSNLDERQELKLLKIEHNGCWIAFWGLLIVMAIQMIVGNDSIKNLAGEWAVFMS LAFYLWVACIRNGIWDRRLKPNFKTNVIVSSIAAVLTGIIWFSVSYRNYHKLIGSIATGI IMFVQVEILCLLALMISSKIYKRRVQKLEEDE >gi|226332904|gb|ACII01000115.1| GENE 39 47022 - 47276 182 84 aa, chain - ## HITS:1 COG:SPy1934 KEGG:ns NR:ns ## COG: SPy1934 COG1476 # Protein_GI_number: 15675737 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 17 81 2 66 68 93 67.0 1e-19 MEKKYYFINGGEHKVAKNIAIKVARAEKDMTQKVLAEAVGVSRQTINAIEKGEYNPTIKL CRKICRVLDKSLDDLFWEDEEDEN >gi|226332904|gb|ACII01000115.1| GENE 40 47480 - 48571 685 363 aa, chain + ## HITS:1 COG:BS_degA KEGG:ns NR:ns ## COG: BS_degA COG1609 # Protein_GI_number: 16078147 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 32 363 2 335 337 118 29.0 2e-26 MGYALNLCKFHIYNVPYNHNIAEEGMEMTQNRIRIVDVADALGLSTATVSNVIHGKTEKI 
SDETVKRVQQELERSGYIPNMAGILLARNNSRIIGVVVNDHEKYEGRVLEDGFVMSSLNA LSHEVNEKGYFLMIKTTSDIREIPVFASMWNMDGLILIGFCEADYESLRNQMRISFVVYD GYFEKCSKVVNLVINHYDGGYQAGKYLKELGHKKALCLADNFICMDKERIEGFRKAFEHG ETYRWEIPKTEKERMRFYEDNYMQLLKSNVTAVFAVSDFYALEFMKFLQGKNIRIPEDIQ IIGFDDNMASRESNPSLTTIHQEANLRAKAAIECLEAMRDGAEYKTEIVLPVDLIQREST RKL >gi|226332904|gb|ACII01000115.1| GENE 41 48571 - 49902 866 443 aa, chain + ## HITS:1 COG:CAC0847 KEGG:ns NR:ns ## COG: CAC0847 COG0534 # Protein_GI_number: 15894134 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 3 442 11 450 459 204 30.0 2e-52 MSKFMKSLCKIAIPVTLQSMLQASFSIVDQIMIGQLGETNISAVGLCGNFSLIFSVVIGA VSTVAGILIAQFIGAQETEEAWCSFDVSLICGIMISALFLLAAGVFSSQILGLYTKDMSI INTGAVYFRIVAFSYIPMAVSNILSSWLRCKEYATIPFLASFGAVIVNTGLNYLLIFGKF GFSCMGIKGAAIATLISQLFNLIFIVIGFVLCIRKDGDKPVLSLHFKKITIRDYLIMIMP ILISEFLWSLGQNVESAVYGHLGTSSLAAYTLTGPIQGLMVGALSGLSAAAGVIVGKRLG RKEYDEAYTESKKIMYAGLFGAVTVSALLILLAGVYTGLYRVDDSVKDLGKTLLIVFALY APVKVENMILGGGIIRSGGNTKIIMIIDIVGTWCIGIPLCLLAAYVFNWGIVGVYTLLTT EELFRLAVSLIVFKRRKWIISLC >gi|226332904|gb|ACII01000115.1| GENE 42 50256 - 50579 250 107 aa, chain - ## HITS:1 COG:no KEGG:Shel_09520 NR:ns ## KEGG: Shel_09520 # Name: not_defined # Def: acyl-CoA thioester hydrolase family protein # Organism: S.heliotrinireducens # Pathway: not_defined # 1 107 1 107 306 143 59.0 2e-33 MKKRHFDVETDGFYGSYWKCKTGSDCAMIAMIGDDPEDYLAHTSVKWLHKLGVNVMTMSP GKKDYGHHNYPLERIEKAINWLKAHGDQKIGIVGASTTGTLALTAAS >gi|226332904|gb|ACII01000115.1| GENE 43 50591 - 51397 511 268 aa, chain - ## HITS:1 COG:MA3472 KEGG:ns NR:ns ## COG: MA3472 COG3315 # Protein_GI_number: 20092284 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: O-Methyltransferase involved in polyketide biosynthesis # Organism: Methanosarcina acetivorans str.C2A # 6 218 13 226 274 157 38.0 2e-38 MKEKVNVTGVPETMVQTLYARAKETKKKNAKINDEIAVELVKKLDYDFSKADKDNAMTYG VIARTIVLDRMVKQYLEKNANTVVINIACGLDTRCYRMKGKYLHWYNIDLPETMKIRRQF 
LPETGPIYQIVKSAMDNSYIDDIDYHGENVLAIIEGLTMYLCEKDIRKMFSIIEKSFKKV TVMVETMSPFVVKHVKEKSIEGSNAKFTWGVKNGTELQRIVPGFSVQQEVSLVEGMKKLI PIYHVIGKIQIVRNISNKIIVLEKRGRY >gi|226332904|gb|ACII01000115.1| GENE 44 51394 - 52176 310 260 aa, chain - ## HITS:1 COG:CAC1470 KEGG:ns NR:ns ## COG: CAC1470 COG0596 # Protein_GI_number: 15894749 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Clostridium acetobutylicum # 1 255 1 254 255 232 47.0 4e-61 MKFYESGDNRKPVIFLFPGTCCLYSSFDHVLDGLHSYFYTVAVSYDGFDPNEKTEFYSME DECEKIEQEIRTKYGGRIKAAYGCSLGGSFVSLLIQRKRIHINHGIIGSSDMDEAGSFMA KLQSSIIVPIMYKMVHTGKLPKFMQKKINEADESRKKLIDGFVNMFGIDKGGSPWITKQS VYNQFYSDLVTKVQHGIDVPGTTIHVFYATKMGEKYEKRYCTYFKNPDIRRHNMQHEELF CCHSAEWVKEVKKAVEGDRQ >gi|226332904|gb|ACII01000115.1| GENE 45 52189 - 52830 398 213 aa, chain - ## HITS:1 COG:CAC3419 KEGG:ns NR:ns ## COG: CAC3419 COG0500 # Protein_GI_number: 15896660 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 3 151 5 150 207 98 36.0 1e-20 MKNKEYKKLSIAEFTKAAGRYESNHAGIYEMCKKDYPDILEELEKEPFRDLLDAGCGPAP MISLLAEKYPDRHYTGLDLAPAMIEQAKKKNISNATFVVGDCENFPFENDSFDAIICSMS FHHYPDPQAFFDSVKRCLRPNGRLILRDVTSDNKVLVWLMNTLEMPLANICGHGDVRVPT RNVVMKCCRKVGLKVEKFEIRKGMRMHCVVRKA >gi|226332904|gb|ACII01000115.1| GENE 46 53128 - 54876 1504 582 aa, chain - ## HITS:1 COG:CAC2239 KEGG:ns NR:ns ## COG: CAC2239 COG0297 # Protein_GI_number: 15895507 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Clostridium acetobutylicum # 102 578 3 476 477 387 42.0 1e-107 MRKKTVQTSSTKSNAKSNTKSNAKSNTKSNTKSNARNSVKSSVKSSVKSSAKNSPQKAVK NVAPTVENVSVKQAVIIQEPSVEQNEQNIPTRQPDLGPRRSVAFIGSECYPFVKTGGLGD VMSALPKSLAKLNIDVKVIIPRYKCIPQKFQEKMEYRGSFDMNLCSDGKQYYVGIMEYQE DGVVYDFIDNDEFFSWGNPYTNLIDDIPKFCYFAKAALAALNYLNWTPDVVHCHDWQAAL 
VPLYLRTCFQDTDVGRAISVLTIHNLKFQGIYDRKKIQYWSGLPDYVFNKDCMIQNWLDA NMLKGGIAYSNKVTTVSNTYAWEIQTEEYGEGLAEHLRYHNNKILGIVNGIDTDIWNPAT DKLLAADYDEKSAIKNKKINKKALQESLGLDVDEHKMVIGLISRLTNQKGLDLVNDVIPG IMDEHTQVVVLGTGDSQYENTFRYYENKYKGNFCAYIAYNENVAHNIYAGCDALLVPSRF EPCGLTQLIAMRYGAVPIVRETGGLKDTVQPYNMFENIGNGFTFDRYESGLLYDAINRAK TLYFENRKSWDDMVIRDMNKDVSWEKSAKQYKDMYVGLTPRD >gi|226332904|gb|ACII01000115.1| GENE 47 55187 - 55774 583 195 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20196 NR:ns ## KEGG: EUBELI_20196 # Name: not_defined # Def: cytidylate kinase # Organism: E.eligens # Pathway: not_defined # 1 195 1 195 196 338 85.0 7e-92 MTKRIITISREFGSGGRFIGEEVAKKLGIAYYDKNIIGQIAEKSGLSPEYIQENAELSPK KGLFAYAFSGRDITGKSVEDMVYEAQRNIILELAEKEPCVIIGRNADYILKDRDDVLNVF IHGDMPEKIKRITGLYNVKEKEAVKMMADTDKRRRTNYNFYTDQNWGKASNYTLCLNSSQ LGYDRCEMIIMECVK >gi|226332904|gb|ACII01000115.1| GENE 48 55787 - 57154 1351 455 aa, chain - ## HITS:1 COG:MA2050 KEGG:ns NR:ns ## COG: MA2050 COG0534 # Protein_GI_number: 20090897 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 12 448 13 444 468 173 31.0 5e-43 MAESNKMKDMPVNKLMIQMGIPMILSMALQAVYNIVDSAFVGNMRVGSEAALNALTLVFP VQMLMVAAGIGTGVGTNALLARTLGQGNREKAAKVAGNSLFLGVIIYVVCLLFGIFGVKA YISSQTVDAEVLEMGVSYLRICCVISFGIIFFSLFEKLLQATGRSLYSTIGQVVGAVVNI ILDPIMIYGIGPCPEMGVKGAAYATVIGQVASAVLLLIFHMKLNREFGHGPKYMKPNAGV IKEIYAIGLPAIIAQALMSIMVYVMNLILKFNPSAQTAYGLFYKVQQFVLFLAFGLRDAI TPIIAFAYGMRSKKRIQDGIKYGIIYTIALMILGIAITEIFPGAFAMLFNAGQSREYFIA AMRVISVSFLFAGINVAYQGIYQALDGGLESLVISLLRQLIIILPLAGIFSLFVRNGQMG VSLIWWAFPITEIISCFVGYVFLKKIRKNRVDVLN >gi|226332904|gb|ACII01000115.1| GENE 49 57377 - 57892 454 171 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2567 NR:ns ## KEGG: EUBREC_2567 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 165 2 164 164 135 43.0 8e-31 MKYEIGDFVSKPVTGICKIENILYLNPQDEKNDKLYYLMKPVEDEKEKIYVPVSNSDSRL RLCLTKEEAWNLIKRIPDIPTAWTNNEKMREQNYKEAVRANNPEALVAIIKMIYQRKQKR 
LAQGKKCTATDARYFQIAENLLYMELGVALEKPKQEICKTIIDYIDQNKID >gi|226332904|gb|ACII01000115.1| GENE 50 57921 - 58142 259 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580272|ref|ZP_04857538.1| ## NR: gi|253580272|ref|ZP_04857538.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 73 1 73 73 134 100.0 2e-30 MHEYEIFIEDINPCGGEQYSKKTLIEAETASPEAYVKENGRFPILESTRNESGDVVIVTG DNQGSFVRYTFTE >gi|226332904|gb|ACII01000115.1| GENE 51 58379 - 58591 154 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580273|ref|ZP_04857539.1| ## NR: gi|253580273|ref|ZP_04857539.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 70 1 70 70 127 100.0 2e-28 MTENTVLPLINTAFEPGNARKAAESFADRDFREIAEHIGVSLNTVKHYLTDIFNKLHVKK RDELKNFVLK >gi|226332904|gb|ACII01000115.1| GENE 52 58757 - 60109 1274 450 aa, chain - ## HITS:1 COG:CAC2367 KEGG:ns NR:ns ## COG: CAC2367 COG5492 # Protein_GI_number: 15895634 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 80 244 157 318 752 112 43.0 1e-24 MNIRIKIPDKMINNQPYTIREYKIFRLHKDSATNQAIVDILDSVFNSSTNELTFKSDKFS IYVLTYKDTYYSPSYPVTGIKVSPDTLTLTKKGETAQLTAEVTPSYADNKRVTWQSSDEK VATVDENGKVTAVGNGTATITATSVSGSYTATVSVTVKIPVEIQKLTIEAEKETLTKIGE STELKVKIEPENADLQKLIWKSDNEKVATTDENGKVTAVGNGTAEITVTTEDGKITASIM ITVKVPDEPTINKSTGFRRLRARSVKQTKTSVTLQWNIIKDADGYFIYGNRCNTGTKSYK YRKLAIITGGDISTWTQKDLKKGTYYKYVVKAYRMVNGKKVVTDISISVHAVTGGGKYGN AKAVSVTQIGNKKNVSKITLKMGKTAQIKAKEVKKDKKIARHRKLCYESSNTKVATVTPD GLIRATGKGTCTIWVYAQNGVYKALKITVK >gi|226332904|gb|ACII01000115.1| GENE 53 60313 - 61059 432 248 aa, chain - ## HITS:1 COG:no KEGG:Elen_0852 NR:ns ## KEGG: Elen_0852 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 2 247 1 247 249 237 47.0 3e-61 MVKGLDTFWKYFADYEEQYVLIGGAACDILFESNEVNFRATRDLDMVLIVEALTPEFGEK FWKFIVDGKYRNKATNGSNPQFYRFDKPEEDKFPKMIELFCRSDFELKSAEGITPIHIDD EVSSLSAILLNDDYYKALLNGKVIRNGLSVLRPEYIILFKAKAYLDLKSRKDLGEKVDSS 
DIKKHKKDILRIASELMLEKVEGLPIAVGNDIHSFIDLLEQEPFDQNSLKRYGLKNEDIM ELLKKVFG >gi|226332904|gb|ACII01000115.1| GENE 54 61044 - 61961 486 305 aa, chain - ## HITS:1 COG:no KEGG:Elen_0853 NR:ns ## KEGG: Elen_0853 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 11 297 23 317 329 130 31.0 8e-29 MTENKEVYKKLPLAYRGRYDIFTVETNGVLWMAIHPKDDIGLVVLRRDRAVVEKITGLNC AVFLDRTTFYIKEKMIEEGIPFVIDRKQVFLPFIGYLLSKENERELAPVHLISFLTQKML LIAIYERWNEVKVSDAAKRLEVSTKSASRCFDELEYLNIAVLGMKGKSRVIDIPDEREQL WQQIKNVLRNPVIRRFILREDIKFEKKAGLSALCEYSLLSDNVYPTYAVTKKELKDSGVK AEKQVSELEEIGCVVLELGYFIDFLGKGFQDPLSVVLSLTGDEQEEERVDISINEMLEEY VWSKD >gi|226332904|gb|ACII01000115.1| GENE 55 62291 - 64807 1296 838 aa, chain + ## HITS:1 COG:PA1760 KEGG:ns NR:ns ## COG: PA1760 COG2909 # Protein_GI_number: 15596957 # Func_class: K Transcription # Function: ATP-dependent transcriptional regulator # Organism: Pseudomonas aeruginosa # 32 393 50 423 907 76 22.0 2e-13 MSKKKYNLNTIYISERLQENLKPISQSAFTAVTAPMGYGKTTAINWYLDKQSKNGNSCVI RISIYSDNLSVFWQSVQKAFAFAGLDFLDNYVCPSDVASAGMLADELCYALSGQTSYYIF IDDFHLLGEPHIADFFCMLANRLPENIHLIVSGRNAFLSGKEILRLGKKLYQIHTRDLCL NRRELSVYTHRCGASLNEDQLNTLLYSSEGWFSAVYLNLCTFFESGHFPDHTSNIYDMFS SAMIAPLSDAQQEFLTVMGLADEFTVEMALFITGNPEAGQILSLMTRQNAFITPLPDGVS FRFHHMMKECTQRAFAMLSHEKQTDFRNRYGQWYESRGQFLQALAAYNKALNYDAALAVI QKDAGILLASLSPEKVLAFLDVCPTEILKNRPLALLVLMRRMFTWHQIPKMLELKQLLTD TIAEDNTLSEDERKNLSGECDLIMSFLMYNDITGMSVLHRQASSKMTRPAISIRKTGSWT FGSPSVLMMFHRRSGTLDAELTAMNECMPHYYRITQGHGQGAELLMNAEAAFMQGNFSDA QILLEQTYSTIASNGQHNISLCCDFLAARLSLFQESVTFVKNPEVKRKELLSLHNMMWLN IFDSTYAYYYALIRMPEKIPALFKDHMLSTVSFLSPCRPMMEMIENQVFLSQKMYAKVIG RSETLLPVCEKMHYELVSLHVQIQTAAAYAMLGKHHDARQLLLKALGHAMPDGFLIPFVE NYTYIKDVLSSINSIASELFTDRILSLGSVYEQHCLRLSSRNSRPEILNMLNSREAEIAA LITDRLSNREIAERLFLSEGTVKQYVNQIYSKLMINGDTRTKRKQLAELISSINKGLT >gi|226332904|gb|ACII01000115.1| GENE 56 64887 - 65159 208 90 aa, chain - ## HITS:1 COG:Cgl0158 KEGG:ns NR:ns ## COG: Cgl0158 COG0524 # Protein_GI_number: 19551408 # Func_class: G Carbohydrate 
transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Corynebacterium glutamicum # 2 84 54 137 318 57 32.0 6e-09 MGHNTAFIGKVGNDFFGDQLRAAIKEAGIDDIGLCTDEKIHTTLAMVHTYPDGDRDFSFY RTPGADMMLNKTEIPEDILKETEMQISKKL >gi|226332904|gb|ACII01000115.1| GENE 57 65173 - 66657 1300 494 aa, chain - ## HITS:1 COG:SP1795 KEGG:ns NR:ns ## COG: SP1795 COG1621 # Protein_GI_number: 15901624 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Streptococcus pneumoniae TIGR4 # 5 465 8 425 439 297 38.0 3e-80 MSEMLEKARKYEFIQGQQIKEEERPAFHITPYVGWMNDPNGFSYYKGEYHLFYQYNPYST HWDSMHWGHVVSKDLLHWNYVPTALAPDEDYDKFGCFSGSAIELEDGRQLLMYTSVNQEK LEDGTVRDIQTQAVAVGDGKDYEKYDKNPVLTAKDLPKGASKVDFRDPKIWKGNDGNFYC VIGSRPADGSGQILLYRSKNGFEWEFVSILAKNQNRYGKMWECPDFFELDGKYVLLTSPQ DMLPEGLEFHNGNGTLCIIGELDPETHTLKEQFCQGVDYGIDYYAMQTLLAPDGRRIMIA WMQNWDTLAYRCNDSGWFAQMSLPRELSVKNGRLYQVPIRELNAMRANKVEYNNVVIKDT RLTLDQIEGRTVDLELVIRPADKDNLYKKFELCFAENEKYHSTLCFRPDESVLKIDRKFS GSERALVHQSSCLVNGDSNELKLRVILDKFSVEVFINDGEQVMSAVILTKQEAKGISFFA DGAAKLDIVKYDLL >gi|226332904|gb|ACII01000115.1| GENE 58 66689 - 68368 1537 559 aa, chain - ## HITS:1 COG:SP1796 KEGG:ns NR:ns ## COG: SP1796 COG1653 # Protein_GI_number: 15901625 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 3 552 4 534 538 498 47.0 1e-140 MKKNRVRSLLALGLSLTMMVSIMGCGKQQRLDAEINPDTPVSDVQFPLKETEELSFITSA PATSTQDPNKRVIFQRMEEQTNVHIDWTCFVSDQFSDKKNLALAQFGNLPDGLFNAGMSD YDLLRYAKQGIIIPLENLIDKYMPNLQAVFEKYPEYRTMCTAPDGHIYSFPWIEQLGSGK EAIQAIGDIPYINKKWLDYLGLEIPTTTDELEQVLIQFRDHADDLKQEFSIEGDVIPMSF IINNGDQDPSILINGFGDGYGDTGDHFAVTDEGKVIYTTVQEGYKEGIKWLHKLVTENLI DPEAFTQEWSTYVAKGKNHRYGLCFTWDIANIDNNTDYVMLPALTGPDGMRNITRQNNSE TSGFDRGRCVLTTSCRNTALAAAWIDQMYAPLQSPQNNWGTYGEKDSFNIFELSVNKDGE KMLKHMDLGDQSPVEVREAQSVNGPLAILNEYYGVYVTQPEDAKWRLDNMHETYLQDMNS KYVYPNVFMSIDDTNKVSQYDTDIKKYAEQKKADWILNGGIDEEWDSYLKKMEKYGLSDY 
LSIKQKYFDQYQDSLSSEK >gi|226332904|gb|ACII01000115.1| GENE 59 68388 - 69281 730 297 aa, chain - ## HITS:1 COG:SP1797 KEGG:ns NR:ns ## COG: SP1797 COG0395 # Protein_GI_number: 15901626 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 9 297 17 305 305 329 56.0 5e-90 MEKKKRKYSRTDRLILGIGFTILGLFVLSIAIPIIYVVLASFMDPTVLNNQGLSSDIKDW TLDAYRRVLENEMIWRGFLNSFLYSFAFTVISVFVTLLAAFPLSKKEFVGRKFFNLIFLI TMFFGGGMIPTFILINQLHMVNTVWAVLIPGAFNVWNMILARTYYQSIPAELREASAIDG ANEIQHFFKIMMPVCKPIIAVLALWSFVGMWNSYFDAMIYLNDANLQPLQLVLRSILVQN TPQPGMIADIQSTAEMAKVAEQLKYATIVVSSLPLLVMYPFFQKYFDKGIMVGSVKG >gi|226332904|gb|ACII01000115.1| GENE 60 69294 - 70232 795 312 aa, chain - ## HITS:1 COG:SP1798 KEGG:ns NR:ns ## COG: SP1798 COG4209 # Protein_GI_number: 15901627 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type polysaccharide transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 25 312 15 305 305 317 56.0 1e-86 MKTVQNQAKPVSKQTWRNKVTYVKKNWQLYLFFLMPGLLLTIIFKYLPMGGLLIAFEDYN VIKGVLGSPWVGLEYFRRFLSSPDFMNYLMNTLKLSIYGLLWGFPVPIILALLMNRIQKT GIKKKVQLLIYMPNFISVIVLCGMVRMLLSPVGPLNRLLGISTNWMTMPSAFRTIYIASG IWQTAGWASIMYTAALSNASKELEEAAVMDGANLLQQIWYVELPAIKNIIVIQFILQAGN IMSIGFEKAYALQTDMNLPASEILSTYVYRIGLLNGDYGYSTAVGLFNSVVNVILLIFVN WVVKKLNDGEGL >gi|226332904|gb|ACII01000115.1| GENE 61 70451 - 71476 848 341 aa, chain + ## HITS:1 COG:BH3727 KEGG:ns NR:ns ## COG: BH3727 COG1609 # Protein_GI_number: 15616289 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 7 340 3 322 331 125 27.0 1e-28 MSKKQVTFADIAAYTHFSKTTISRYFNHPDSLTLENQEKISQALDTLGYKKNKLAKVLAN GKSEFIGIIIPNLYLHYYSEMLTQLLSSYSDYHYKFLIFSSEHGAEEEQQYIEELLSYQI EGLIVLSHTLSSKQLSSYQIPIVAIEREAEYISSVTTDNYMGALQATTLLIRDKCDILIH INVNVEKAVPAYDRIRAFKETCEEYQVPYDIDLSVSGNSYQEILNEIRRIFTRIEEKYPN QKKGIFLANDTYANMFLNLIFQKYRELPSFYEIIGFDNSPIASEAILPITTVGQQIDIIA 
QTAMELLVQQMEEQKKRSPKPLEKPVHKQIAPVLIRRNTTS >gi|226332904|gb|ACII01000115.1| GENE 62 72773 - 74197 828 474 aa, chain - ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 11 470 10 448 559 293 35.0 7e-79 MPTDQELIKIVPSSRQLAYQATEFYAFFHFGMNTYTNREWGDGTETPQIFNPTEFVADQW VSAAQNAGMKGVILTCKHHDGFCLWPTRYTSHSVVSSPWKNGRGDVVWEVSEACRRYGMK FGIYLSPWDRNKPCYGSGKEYDDYYLAQLTELLTGYGDIFSVWLDGACGEGPNGKKQIYD WKRYYECVRKYQPDACICVCGPDIRWCGNEAGDVRKSEWSVVPARTALAESVQERSQQTD DKEFRMRRITSDMEDLGSRRALEGETNLIWYPAEVNTSIRPGWFYHPEEDDQVKSLEELI YIYIGAVGGNATFLLNIPPMPNGLLHENDVKRLEEFGSWKKKSFTHNLMSTAHIFSENED PTHPVSDLTDDTSETWFQPESSELPVEITICLDGSYNLGYLVLKEAVCYSQRVEKFEIFV KEGEIWNSIYTGTVIGYKKIISVKGQKAQKVKIVLHDFRVLPLLSFVGIYPESL >gi|226332904|gb|ACII01000115.1| GENE 63 74216 - 75043 681 275 aa, chain - ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 11 275 39 300 300 155 30.0 9e-38 MKTVRTWCIRILMILFTVIFLYPLFWNLMSAFKTNAEYLTDPYALPTALNLDNFVAAWQK ANIAAYFGNSIFVTVFSTVLLLLLVIPISYVLARYRFAGSKLISAIYMACIFLQATYIMI PLFLELQAVNGLNNLPVLCLVYAVMQFPFCIFTLQGFMSAVPRDYEESARIDGATNIQLL SRVVVPLAKPGIATITMLSAMGFWNEYPLALVLLTEDAKKTLPIGMANLFEVQKYATDWG ALFAGLVIIMIPTIIIYLIGQRYLLQGIGAGGLKG >gi|226332904|gb|ACII01000115.1| GENE 64 75055 - 75936 710 293 aa, chain - ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 1 287 1 284 292 157 32.0 2e-38 MVKKPRLFIALCTLPALILTIVFMIVPMVNAIYLSFTSSTALSVGFNSKFVFLDNYKYMF QDKDFLQALFNTLKLMLVIPAVTIFLSLVFAFILTQGQLKEKSFYRTVFFFPSIVSMTVV 
GIVWSFVFNPTRGILNHMLDLFHFSTWKHAWLGEGKTALWCIGAALVWQAIGYYMVMHVA SIDGISQEVYEAASIDGATGVQKFFRITIPLLRRSIGTTYILSLSGTINLSFTLSNVMTG GGPNCASSVLLQYMYTQGMRNANFGYAMAIAMFTLILAIVLAMISKKVSSEKE >gi|226332904|gb|ACII01000115.1| GENE 65 76095 - 77441 1718 448 aa, chain - ## HITS:1 COG:mlr4771 KEGG:ns NR:ns ## COG: mlr4771 COG1653 # Protein_GI_number: 13473999 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 25 434 20 408 419 75 25.0 1e-13 MKKRFAASAMILAMSAATVLSPLTAFAEAEEQELNIAIFQGGYGDAYWNQMVELFEESHE GVKVNMTISPTIGDIIRPQIVAGNAPDFICLNDGGEDGVILSLIKDHALLNLDDVFDGEN YAGTGALRDDITDGILSSTKCAPYGDDEIYLAPFNSGPQGLIYNKTFFDENNLEVPKTWD EFFALGDKVKDIDGRALFTYQGIYPGYMEEMLWPAIANECGEEALTKIANYEEGSFNNEG VLKALSHIKEIADKGYLLEGTVGMNHTESQTEMMLGKAAFITNGTWMENEMQDAPREDGF EFAMAPIPTENADDVHYVFDSCEQFSIPAAAKNPELAKEFLRFLYSDESVSAFAEASGAL YATKSAREVAKDKLSTAIYNMYGIYEEANASSLIMSFAAVPADCKVNPKDEIFNPITSVM NDEMTVEEWAQNVEDAFAQVRADMEAAN >gi|226332904|gb|ACII01000115.1| GENE 66 77754 - 79538 1044 594 aa, chain + ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 4 581 2 571 577 165 24.0 3e-40 MEKRQRYQGFRKFSIRTRLIIALFLISIVPLTGISFYSFHIFSEALREKLSTSISQTLSM INLNMVSEIEKYQYLCGSICISQEIKDGLLKKGMTDTEKNQAINEIQHMIRSKIIYPAQA KNITVYDTDGNIFYDLGYDGLYSDDVSRILSRLEEENQDVWAYAHTYRNRDILILGRRIY EQYSQSRVIGYTLISIDEKIFSKTVLEPVGLADSSNIMYLNMDGTILSSWDRSIQLGQKT EKKLLKNIQARLPNRTDFFSIYKDGEEQLVTYIFNKNLNQLFVYTMPYHYINSEVNTMFW KILAVAFFLVLLCIGIVAMVYPGITSPIRSMLDFCRELSKGNLSVRIQDDHKDELSDLSG SMNHMADTIESLMKQQKSQEKKKRELELQMLQYQLNPHFLFNTLNSLRFVAAMHKDQIVS DGIQALSSLLQNTLTNKNEYITIQEELENLENYFAILRIRYAGSFEYSFDVEEEELLSCL VPKLILQPLAENSVMHGSSDDGSVMEILITCWEENNHLIIELCDNGKGFEVTPAALEPHT NRKKIGITNVNDRIQLNFGKEYGLKVNSHPGEGTTCTLTLPLLYVQELSKDYNN >gi|226332904|gb|ACII01000115.1| GENE 67 79575 - 
80753 348 392 aa, chain + ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 1 389 1 361 368 105 26.0 1e-22 MYSILIIDDEPIVKIALRSILPWEEHGFFICGTASNGLEAVPLIEKHHPDIIITDLKMPE MDGLELIRFLKEKEYPGEILVLSNYEDFDSVRSALLLGAADYLLKIKIQPDTLLACLNKT TKKMQNTADRKDSILKTDITEPITDHLLSFFQGEESLDSFIERYGAEKFGFMKHSCAVCY VTFEKFLSNDAFSISANLLRDMILDAVQGVLQPYILVLSAHNALVVFSQKELMCSQVKVE QLIKKLYNRFTMYQSFAPDMPYKENLKNYEEARTTYQDFLHSEGHYKNDVAKTLTYIENN YMHRLTLSSISANVNLSTSYLCRVFKSEVGTSITSYLNNLRIRKAATLIKENTFSLKEIS VMVGIDDQLYFSRLFKKCMGISPSEYGKRFHL >gi|226332904|gb|ACII01000115.1| GENE 68 80832 - 83087 1280 751 aa, chain - ## HITS:1 COG:no KEGG:BH0842 NR:ns ## KEGG: BH0842 # Name: not_defined # Def: hypothetical protein # Organism: B.halodurans # Pathway: not_defined # 8 721 5 742 795 543 40.0 1e-152 MDRLALIFDKPAEAWNEALPLGNGTMGAMSYGRFQNERIELNLDSLWSGNGRNKENPNKN VDWDLFRKHIFAGDYQGAENYCKENVLGDWTESYLPAGTLSINVKEPIQNGNSFYRRELC LTNATEKIEFCQDDLIYQREFFVSMSEPVMAIHYHTSPNCNLEMSITLESEIKHKSAFFA ENGIILEGQAPIYVAPPYYSCEVPVVYEEGQGIRFAIGLYVQTNGGNVYQQADKLFINTP NDVYIYVSGVTDFKQKELFFSKRNCMMENIQHIQYEKQKKAHMDVYANYFDRMHLDINYT PDNELALKMFHYARYLMICSSVPGSQCTNLQGIWNHHMRAPWSSNYTVNINTEMNYWMAE KANLSDCHMPLLELIERTSKKGEKTAQDVYHLAGWVSHHNLDIWGHSSPVGQFGQDENPC TYSMWPMSSGWLCCHLWEHYCYTLDEAFLKKKAFPIIQGAVEFYLGYLVPYKGYYVTAPS TSPENTFLAPDMTTHGVTFASTMDISILRELFGLYLKACEILGVEDFTNAVKNVLQKLPP YKIGKEGQLQEWFYDYPEADINHRHISHLFGLYPGNQIHKENEPLIEACRTSLERRGDKG TGWCMAWKACLWAKLGDGNHALTLLKNQLRLTREEACSLVGGGIYPNMLCAHPPFQIDGN FGFAAAVLEMLVQYEEQKIVFLPALPDEWKDGMAEGVKAPGNITLNFKWKEKRVTEINLK SPIDAKLVILYNGMEEEIVLNAGSSYQKTLI >gi|226332904|gb|ACII01000115.1| GENE 69 83364 - 83768 295 134 aa, chain - ## HITS:1 COG:CAC2702_1 KEGG:ns NR:ns ## COG: CAC2702_1 COG2200 # Protein_GI_number: 15895959 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # 
Organism: Clostridium acetobutylicum # 3 128 137 262 269 107 43.0 5e-24 MQQIMKHYANQGYEIALDDVGAGHLGLNRVVNTTPDYLKADIELVRDIHKYKKKEIMMKL LLDYCNATGAILIAEGIETVQELECLHSLGVHYGQGFFLGKPERAFGNISAESEEILERL NGRKNILFLDISGK >gi|226332904|gb|ACII01000115.1| GENE 70 84207 - 85028 439 273 aa, chain - ## HITS:1 COG:no KEGG:mru_2036 NR:ns ## KEGG: mru_2036 # Name: not_defined # Def: 4Fe-4S binding domain-containing protein # Organism: M.ruminantium # Pathway: not_defined # 14 273 12 278 280 307 58.0 2e-82 MRRKRLAPFGMWIVFEIVAVTLWLTKDNLFYLLNFSYIGTSITVGLLLFQFNYKHARRIV QLLVGLYMLIYLGLICRENMQMEGFWYYLFTGVFEAATIHYAVAKIFGPLLFGRGWCGYA CWTAMVLDFLPYKKPQAARKKIGFIRYITFAFSFLFVVMLFLNHVGNMEKIMFIAFIVGN ILYYIVGIILAFRLKDNRAFCKYICPVTIFLKPMSYFSVVRVKCDKEKCISCGKCKKVCL MNVDMIDNSRKRKNGTECILCMECVRTCPRKAL >gi|226332904|gb|ACII01000115.1| GENE 71 85318 - 85902 573 194 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01147 NR:ns ## KEGG: EUBELI_01147 # Name: not_defined # Def: cytidylate kinase # Organism: E.eligens # Pathway: Pyrimidine metabolism [PATH:eel00240]; Metabolic pathways [PATH:eel01100] # 2 191 6 210 212 131 38.0 2e-29 MIITIARQCGSGGHEIGKELASRLGLELYDRKKLEEEAKKLGKLDENKEFFQEKAVNSLL YSIAVSYGESNPMEKSFELIRELTQQKSAVIIGRCSGAIYANDPEATTVFLHADKDCRIK RVMERDGINEKKAAKLIKEVDEKRASFHKFCTGKEWEDASQYQLSIDTGKLDISDAADMI INYINAKNIGQKHL >gi|226332904|gb|ACII01000115.1| GENE 72 86063 - 86539 302 158 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2983 NR:ns ## KEGG: EUBREC_2983 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 156 1 156 157 258 77.0 7e-68 MTIRKAEEKDILRIIELLGQVLQIHADIRPDIFIPGTTKYTVSQLTELLGKEEKPIYVAV NEEDVCVGYAFCQLQEQPFSTNMVPFKSLFIDDLCVDQHARGQHIGESLFEYVKNEAKRM GCYEVTLNVWVGNTSAEKFYEKMGMKTKERQMEYILEH >gi|226332904|gb|ACII01000115.1| GENE 73 86583 - 87713 810 376 aa, chain - ## HITS:1 COG:CC2770 KEGG:ns NR:ns ## COG: CC2770 COG3550 # Protein_GI_number: 16127002 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis 
enzymes # Organism: Caulobacter vibrioides # 27 339 26 374 435 158 30.0 2e-38 MREGNALQVFYDEKLVGTLAMTADHKAAFQYSEEWIENGFPISPFSLPLKKQVFVPTKDY FDGLFGVFADSLPDNWGRLLLDRLLRAHKQNPDKLTVLDRLAIVGKSGMGALTYYPEKKI DEQYGDVDLDELAEQCRKILNTEYSDRLDELYRLGGTSGGARPKIMTTVNGEEWIIKFPA HVDGENAGKMEYDYSCCAKKCGIIMSETRLFSSENCEGYFGIKRFDRISDKNGTKRIHML TAAALLELNFEHPSLDYHSLMKLTKILTRDNEKDMREMFRRMCFNVFAHNRDDHSKNFTY LYDDEKDRWSLSPAYDLTYSNTYYGEHTTTVDGNGRNPGRKELLTVGIAAGMKKEVCIES MDMVEKCVKNMLGRYL >gi|226332904|gb|ACII01000115.1| GENE 74 87700 - 87978 202 92 aa, chain - ## HITS:1 COG:FN1997 KEGG:ns NR:ns ## COG: FN1997 COG1396 # Protein_GI_number: 19705293 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 7 90 21 104 106 75 48.0 3e-14 MEQFVWETAEELDQKLAQRVRNIRKRRSISQEKLSSMSGVSYGSIKRFETTGQISLISLT KIAMALDIADELRNIFTHVPYKDIQEVINEGR >gi|226332904|gb|ACII01000115.1| GENE 75 88219 - 88797 311 192 aa, chain + ## HITS:1 COG:BH0719 KEGG:ns NR:ns ## COG: BH0719 COG1309 # Protein_GI_number: 15613282 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 20 179 19 176 188 65 27.0 5e-11 MQNIIPDERYSVADEAILGAFFKLLKEKKPDRITVSDITKTAGIARSTFYNHYQDIPSLI SAVEDKTIHDVFSMMENFQPKNDRDICSSYFLTLCRYTMENPFLSGLLSTPHGNDLFEKM LTMLHHYVTETTSNSRPDRHTKEEVSYVITCAIGSTIGVLHKWSRDNFNLAPEVIADILS EVFLSGMLPLLS >gi|226332904|gb|ACII01000115.1| GENE 76 88807 - 90351 1717 514 aa, chain - ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 30 504 18 497 600 216 30.0 9e-56 MLIRNILEESVRKFDEVKAVKWLNKKEIMERSYGELMENVVSTRKGLLAEGFEGKHIALI GTSSVEWMESYLGIITGCTTAVPLDAALPCEDLIDLLNRSDSVALFLSPKLRPYLDAFLE NCPKLQKVWMLQEEVEDAPAKVYGIGELRNAGKSASADSVCPDAEDIATIIFTSGTTGKS KGVMLTQNNLASNVEAVKITAEPGTAVLSVLPIHHAFCLVMDWLKGFSLGATLCINDSLL HMVRNMSIFKPDIMLMVPMMIETIYKRLAAADPSIPKAVLAEKVFGGKLRIIFTGGAHLD 
PYYIDRFAEYGVEVLEGYGMSECSPVISNNTLENNKKGSIGKPLENAEIRFENGEILVKG SSVMKGYYQMPDETAETLKDGWLHTGDKGYMDEDGYLFINGRVKNLIILSNGENVSPEEI ENKLALNPLVGEVIVTGEDNGLTARIYPEQAVVEAKALDAEAIQAQLQAFLDEYNRNQPT YRRITGLVVRKNPFIRNTTKKIRRQDVLIDEPLE >gi|226332904|gb|ACII01000115.1| GENE 77 90455 - 90685 397 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580301|ref|ZP_04857567.1| ## NR: gi|253580301|ref|ZP_04857567.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 76 1 76 76 106 100.0 6e-22 MNQEMEFKNIVAQYSKVAPEEMNNEMRFREDLGFSSLDFMSFLGELEDTFDLELDESEVL KITTLGEALNLLEELQ >gi|226332904|gb|ACII01000115.1| GENE 78 90712 - 92100 1631 462 aa, chain - ## HITS:1 COG:BS_srfAA_1 KEGG:ns NR:ns ## COG: BS_srfAA_1 COG1020 # Protein_GI_number: 16077417 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Bacillus subtilis # 4 458 5 435 1100 75 20.0 2e-13 MTNYYPLTAAQKMHHNWIMDYGTQQVSGVSVVASVQAELDFGLLKKCIQMETERSGCTRI RFTKPDKDGNVQQYLVKQDPRDIGFKDLSGMGSLAKADELMQQWAYETFDGDDIPMCEFT MLKLPEGYNGFFVHMDHRLIDSCGLVVMIGDLFQLYTYYKYGTAYPQELADFETVLKKDL AKAGNEKRFAKDKKFWDDQLDALGEPLYSDIQGPSVLEEARKRHGNPKLRSSDIEMKDLF VAVKDYYLEPGPTKNLIDFCMNHQLSMTNLLLLGIRTYLSKVNNGQEDITIQNFISRRST HDEWTSGGSRTIMFPCRTVIAPETDFLSAAYEIQNMQNRIYMHSNYDPAFIMDEMRKRYN TPEHTGYESCYLTYQPMTVKVENEMLGTIRQHAKWFANGAATKKMYLTVSHTEDGGMNFS YHYQTAHLEEHDMELLYYYMMRILFKGIAEPDMSIGEIMELV >gi|226332904|gb|ACII01000115.1| GENE 79 92400 - 93386 497 328 aa, chain - ## HITS:1 COG:L123536 KEGG:ns NR:ns ## COG: L123536 COG1073 # Protein_GI_number: 15672103 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Lactococcus lactis # 2 322 6 310 311 166 32.0 5e-41 MKKKFFRLSAAGLLFFSGVMEIGWRFFNMVVCCKRGGRKKERKKWFELSHIRDNHPRNGY AKEYDESRAWCETQKMQDCYIQSVDGLKLHGLYLPAEHAKRFVILSHGYRGSRFGSLSFM AKYLHEHQCNLLFMEQRCCGESEGKYITFGAKEKWDVQRWAIYVSERNKEKLPIYLYGQS MGAAAVLMASGYRLPSEVKGLIADCGFQSMERQMRDMADNWFHLHYIPLLLKEMECLCHF 
VAGFRMKDADTTEAMKRNTRPVLFFHGEKDTYVYPNNSFQNYMLCKAPKELVIVQGARHL CSAYADPELYQRTVMEFFEKYDGSNSEK >gi|226332904|gb|ACII01000115.1| GENE 80 93512 - 96892 2125 1126 aa, chain - ## HITS:1 COG:PA4601_3 KEGG:ns NR:ns ## COG: PA4601_3 COG2200 # Protein_GI_number: 15599797 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Pseudomonas aeruginosa # 888 1117 2 232 247 163 37.0 2e-39 MKIEKKLWITLFAFGIFAICFSGWIFRQSTAREKMFWDNTIRCETKELKEGEIKSENIAL PEDFDVPKSILFKTTHTIAEVWLDGEKIYEYGNEADAPGFMKTPGSCWHIVDIPGDSSGK NLEVRIIPVYEGYYGNEVSLVYGTRGDCILKILTDSLGILVISCGILFAAVICILLYCGA ARRKRSDKTEAKIEIFLNLGFFSLLVAVWTLVQCGFLQFLIPDGRTLYFVEYFSLFLFPV PFNFLLYDICKSRYHKGALIFSILYLTNMAVDVLLQCTGIIDMSRLLSVIHVIMVANVVY TVVIILYEAGKKENDVAQKFRYPMCVVMGFGMAEMIFYYLRRFEQISILLSMGTMLFIIM LIWIQVSQYYDQYIQKQKVIYLQKIANMDMLTEAMNRNAYEDMVKYLDEGEIKLSTTGVV VFDLDDLKVINDNFGHEKGDEALKLCYQCISQAFQNVKNCFRIGGDEFAYVYHSDEKDMI PERLKTLELLLKKTAKTHKLDYPLSISAGYAYYQPDIDFDFKDIVRRSDTMLYRQKRRKK IARSTDLDHLFSRMEKHSAEEITDEVILQEKKYQSMSVDELCSVIDLLSPTTDNYPYVVD FRTDLYYIANQALDRFCIPKNGFHNVISNHKEFVYGPDYEKMKEELDDLLKTDRCTHSME YRWLDLKKMPVWIQCKGYLVRDDNMKPLYMIGCINEIGERQKADNVSGLLGEKGFREYMD QQDTPLEKGYLLRIGIDHFKEINDNFGQEYGDFVLRKTADCISGCLSEGQKVYKLVADEF LILDVSSDQVRDADKLYDKVRVATDRFIESNEFKVMYTVSGGIVSFAALEGNQYSEALKL TDFALNEAKKLGRNRCYIFDVETYRKFLRKREITQELREAVLNGCQGFTAFYQPVFAEDK KVPYGAEALMRFTSEKLGMISPAEFIPILEETGLIIPVGRWMMREAMGKCSEIRKVLPDF RVSINISQVQASKSDVIKDISAEMKRAGLPLEALIVELTESDLLEQNINEKHFLTELRRM GISLALDDFGTGYSNFHYLSELKPEIIKIDRSFTVKAVADEQEYYLLNQFCTMIHNLDLR ICIEGVENEQEWAMIRKLYPEFTQGYFWGKPCEYEEFMRKFTESRL >gi|226332904|gb|ACII01000115.1| GENE 81 97155 - 97349 138 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580305|ref|ZP_04857571.1| ## NR: gi|253580305|ref|ZP_04857571.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 64 1 64 64 123 100.0 4e-27 MVFQMPMESFDPRCTLGDGIGESLRNMGISCTETRRKVEELLERCGLEKEYADRYPKIVY LSCV >gi|226332904|gb|ACII01000115.1| GENE 82 97428 - 98285 726 285 aa, chain - ## HITS:1 COG:alr2616 KEGG:ns NR:ns ## COG: alr2616 COG2510 # Protein_GI_number: 17230108 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Nostoc sp. PCC 7120 # 151 284 9 142 143 103 47.0 3e-22 MWILFAFGSALFAGLTAILAKCGIRNTDSNVATALRTGVVLVFSWLMVFMVGAQSEIRDI SAKVLIFLILSGLSTGISWLCYFKALQIGDINKVTPIDKSSTVITMLLAFIFLREEITWL KFVSMILIGIGTYLMIQKKETKEKAEDKKWLLYAVGSAVFASLTSILGKIGIQDVNSNLG TAIRTAVVLVMAWIVVFVTGKQNTVKNIDRKSWLFLILSGFATGGSWLCYYRALQTGPAS VIVPIDKLSILVTVAFSYIVFHEKLSLKSGTGLLLIVVGTLALLI >gi|226332904|gb|ACII01000115.1| GENE 83 98287 - 98913 527 208 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|241889736|ref|ZP_04777034.1| putative 30S ribosomal protein S12 [Gemella haemolysans ATCC 10379] # 14 206 26 227 230 207 52 2e-52 MTIYEYGNPDADTVLIQLTGDHELSVLKNEVEEIRKRTSTDFRLIAAKVDDWNYELSPWK APAVFGNEDFGDGAVRTLEQILTLCTDKSRTYYIGGYSLAGLFSLWAAYQTDVFSGIAAA SPSVWFPGFIEYMKEHEIKSETVYLSLGDREEKTRNSVMSQVGTCIRMGYGWLIEHGINC NLEWNQGNHFREPDIRIAKAFAWVMEGK Prediction of potential genes in microbial genomes Time: Sat May 28 20:31:00 2011 Seq name: gi|226332903|gb|ACII01000116.1| Ruminococcus sp. 5_1_39B_FAA cont1.116, whole genome shotgun sequence Length of sequence - 7019 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 513 - 1790 1172 ## COG3681 Uncharacterized conserved protein 2 1 Op 2 . - CDS 1865 - 3208 1475 ## COG0534 Na+-driven multidrug efflux pump 3 1 Op 3 . - CDS 3198 - 3650 468 ## EUBELI_20398 hypothetical protein - Prom 3688 - 3747 4.7 + Prom 3714 - 3773 4.8 4 2 Op 1 . + CDS 3845 - 4777 767 ## COG0679 Predicted permeases 5 2 Op 2 . 
+ CDS 4816 - 5358 585 ## COG1611 Predicted Rossmann fold nucleotide-binding protein - Term 5609 - 5666 15.1 6 3 Op 1 34/0.000 - CDS 5684 - 6355 247 ## PROTEIN SUPPORTED gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 7 3 Op 2 . - CDS 6388 - 7017 355 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters Predicted protein(s) >gi|226332903|gb|ACII01000116.1| GENE 1 513 - 1790 1172 425 aa, chain - ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 # Protein_GI_number: 19704482 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 21 421 3 409 411 307 42.0 3e-83 MLDERIYQNYIGILKEEMVPAMGCTEPIALAYGAARAREVLGKEPEHITAKCSGNIIKNV RCVIIPNSGGLTGIEAGVVLGAVAGDPLLNMEVLSKVNESGRKRCCELLDKKICKVELLD SPVVLHFIIEMQAGDDTVSLEIKYDHINVTKIVKNQEVLLDSDHKSGETPAAADRTLLNL EDIKTFADTVELDDVKEIIENQIRSNMAIAHEGMTGKYGLGIGRIIRETYSNDMLTKMRS LTAAASEARMGGCDMPVVINSGSGNQGIACSVPLIVYAREMELPDYVLYRALVFSNLLTV YQKQYIGKLSAFCGAVSASCAAGAGITYMGGGELSVIKKTIENTLANIPGIICDGAKISC AAKIAASLDAAFLAHHLAMNGQSYAPYTGILKEEAGETISCVGQIGKEGMKETDKEILRI MLERV >gi|226332903|gb|ACII01000116.1| GENE 2 1865 - 3208 1475 447 aa, chain - ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 380 1 383 452 164 30.0 3e-40 MTNDRSDFTQGNILKKLVSFMMPILGALILQAAYGAVDLLVVGRFGSTAGLSAVSTGSQV LNLVTFVVVQFAMGITVLIARYLGERRPEKIGAVIGGGAIVFTMISAVLFVIMVAFAHPI SVLMQAPESAVSLTSSYVRICGGGIFFIVAYNLLSAIFRGLGDSKSPLLFVLVACIVNVI GDLVLVAGLHMDAAGAAIATVAAQALSVVFALILLVKMKLPFTITKKDFRLNVQCKRFLQ IGLPLALQECLTQISFLALCAFVNRLGLEASSGYGVACKIVNFAMLVPSSLMQSMASFVS QNIGAGKKKRAKQSMFTGIGVGLAVGCVVFILVMFKGDMLAGIFSTDAEVISNAFDYLKG FAPETIVTAILFSMIGYFNGNNKTVWVMIQGLIQTLLVRLPLAYFMSIQPDANLTKIGLA APVATVTGIILNVGFYIYLNRTENTPG >gi|226332903|gb|ACII01000116.1| GENE 3 3198 - 3650 468 150 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20398 NR:ns 
## KEGG: EUBELI_20398 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 145 1 145 149 179 63.0 3e-44 MTTEKIKRMLDACYQAKRIRELLPPLPQGVTPSYIQYLDNIHALEKQGIQVKISDISDVM NLPRPGVTRTVKEMETKGYLRKISSPDDGRVTYISITEEGRKLSQKYNEYYFGELVPYLS EISEEEADCMIRTIEKFYQIMCERRNHYDK >gi|226332903|gb|ACII01000116.1| GENE 4 3845 - 4777 767 310 aa, chain + ## HITS:1 COG:CAC0366 KEGG:ns NR:ns ## COG: CAC0366 COG0679 # Protein_GI_number: 15893657 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 5 307 1 300 301 96 26.0 6e-20 MNISILLMEQIIQLFLMIFMGYLIVKVRLVKDEDSKVLSKIILYLIIPCVIINAFQVDYT MDTVKGLLLALAASIMTQVLLLIIISIAGKLLHLNEVEIASVYYSNSGNLIVPIVTFILG QDWVLYGCVFMSVQLIFLWTHCKKIISRESSYDWKKIVLNINMISIFIGVVLFFAKIHLP EIINNTLGSVSSMIGPASMIVTGMLFAEMNLKQIFADKRVYFVSFLRLIAVPLLALVMIK ISHLAMFSADGNKIMLIVFLAIITPSASTITQMCQVYGNDSRYASAINVMTTLFSIITMP LMVMLFEAVI >gi|226332903|gb|ACII01000116.1| GENE 5 4816 - 5358 585 180 aa, chain + ## HITS:1 COG:FN0535 KEGG:ns NR:ns ## COG: FN0535 COG1611 # Protein_GI_number: 19703870 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Fusobacterium nucleatum # 2 171 5 174 192 150 45.0 9e-37 MNITVYLGALEGNDPALGDAVRELGTWIGESRNSLVYGGSKSGLMGQIAESVLNAGGKVT GVEPQFFIDSELQYDEITELIVTKNMAERKAKMIELGDAFIAFPGGTGTLEEITEVMSMV SLKHLDAPCILYNLNGYYDSLKQLLYHMIKMGLSSENRQQGIYFADDMEDIKALLATVTK >gi|226332903|gb|ACII01000116.1| GENE 6 5684 - 6355 247 223 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 [Haemophilus influenzae PittAA] # 1 184 1 179 205 99 32 5e-21 MIRLENVCFAYEKEIALRYVDLHINRGDSIVIQGPNGCGKSTLIKLLNGIIFPSEGKYFY QGHEINEKALKNSQFAKWFHQQMGYVFQNADAQLFCGSVEEEIAFGPVQMGLSEAEIKKR TEDCLHLFGLEKLRDRPPYHLSGGEKRKVSLACILSLNPEVLILDEPLAGLDEKTQDMLV EFLQSFHNAGKTLITITHNRQLAETIGTRFACMNEEHELKMLS >gi|226332903|gb|ACII01000116.1| GENE 7 6388 - 7017 355 209 aa, chain - ## HITS:1 COG:CAC0772 
KEGG:ns NR:ns ## COG: CAC0772 COG0619 # Protein_GI_number: 15894059 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Clostridium acetobutylicum # 2 179 59 236 269 110 35.0 2e-24 TLLYIVLTACSGNYLFTLIMCAAVTVRLAFFSAKAIRQILRGTAGAVLFSILILLPSVFM GTPQTLMNITSRVYVSVTLVGILSSGTSWNKLTGSMRTFRLPSIFIFTLDITLKYISVLG EICAAILTSVRLRSVGKNPQKAKALSGVLGISFLKSGEMAEEMHAAMCCRGFTGEYKKKQ KYTLCAADIFSTFIMAGCIVLFWYLNRKI Prediction of potential genes in microbial genomes Time: Sat May 28 20:31:05 2011 Seq name: gi|226332902|gb|ACII01000117.1| Ruminococcus sp. 5_1_39B_FAA cont1.117, whole genome shotgun sequence Length of sequence - 2991 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 2 - 37 -0.9 1 1 Tu 1 . - CDS 193 - 1197 870 ## COG0310 ABC-type Co2+ transport system, permease component - Prom 1253 - 1312 4.8 2 2 Op 1 . - CDS 1381 - 2142 237 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 3 2 Op 2 . 
- CDS 2139 - 2933 412 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 Predicted protein(s) >gi|226332902|gb|ACII01000117.1| GENE 1 193 - 1197 870 334 aa, chain - ## HITS:1 COG:CAC0771 KEGG:ns NR:ns ## COG: CAC0771 COG0310 # Protein_GI_number: 15894058 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, permease component # Organism: Clostridium acetobutylicum # 1 327 1 322 322 317 53.0 2e-86 MHIPENYLSLSTCAVMTAAMLPVWGYSIHKIKTEIPKSKMPLLGIGAAFSFLGMMFNVPL PGGTTGHAVGGTLIAILTGSPAAGCISVSIALLIQALLFGDGGILAFGANCFNMAFILPY LGFFLYRLLLKKTGMRKLSAAIGSYIGINAAAFCAAVEFGIQPLLFKNAEGQALYCPYPL SVSIPAMMIGHITLFGLAEIILTVVVLTFVEKVTPKALDAVPDKTAFKPLYILMAVLIVL TPLGLLATGTAWGEWGADEIAGLVSGGSQLGYTPSGMTNGFELSTLFPDYSMSGLPEWTG YILSAIVGVALLVIIFKIISSMMKEKVDFKKKAA >gi|226332902|gb|ACII01000117.1| GENE 2 1381 - 2142 237 253 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 242 1 230 311 95 28 4e-20 MSILLEAEHLTKTFTQRGKNPLKAVKDVSFTLKKGETLGIVGESGSGKSTIAKLLTRLTD ITEGTLKFEGKDITRLKQSQLKEVYGDIQMVFQNPVGSFDPRRTLGDGIGESLRNRGMKK ADVEKRVAELLEQCGLEKEYAKRYPHEVSGGQCQRAAIARALAVEPKVLICDEATSALDV TIQKQIMELLEDLKEKKGLSFIFICHNLALVQMFCDRVLVLYEGKVVESGIPDDIINEPK EEYTRRLVDAVLS >gi|226332902|gb|ACII01000117.1| GENE 3 2139 - 2933 412 264 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 20 264 27 270 563 163 36 4e-83 MDRKILTYDSVEISFGGKAVVHDVSFSLEHGEILGLVGESGSGKSTLIKAAMGLLESDAL VTRGDIRYMDQNILDIPEKEKRKIRGAGIGMIFQDAGASLCPIRTIGEQIYESMCAHTKI TRSEAKTRAMDLFDKLNFKDSQRVWDSYPFELSGGMNQRAGIAIAMLMNPPILFADEPTS ALDVSVQRQVVREMLKLRELFGTAIVIVTHDIGVVRAMADTVLVLKNGKTIEYGTADRVL DHPQDPYTRKLLEAVPKLQRKVRV Prediction of potential genes in microbial genomes Time: Sat May 28 20:31:12 2011 Seq name: gi|226332901|gb|ACII01000118.1| Ruminococcus sp. 
5_1_39B_FAA cont1.118, whole genome shotgun sequence Length of sequence - 28290 bp Number of predicted genes - 25, with homology - 23 Number of transcription units - 15, operones - 6 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 51 - 1589 1914 ## COG0747 ABC-type dipeptide transport system, periplasmic component 2 1 Op 2 49/0.000 - CDS 1624 - 2490 892 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 3 1 Op 3 . - CDS 2487 - 3431 821 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Prom 3481 - 3540 7.0 + Prom 3578 - 3637 9.7 4 2 Tu 1 . + CDS 3682 - 3771 76 ## + Term 3813 - 3859 13.1 - Term 3584 - 3631 0.6 5 3 Op 1 . - CDS 3700 - 3858 71 ## - Prom 3883 - 3942 7.3 6 3 Op 2 3/0.000 - CDS 3946 - 4854 823 ## COG0583 Transcriptional regulator 7 3 Op 3 . - CDS 4890 - 6638 1180 ## COG2199 FOG: GGDEF domain - Prom 6789 - 6848 5.8 - Term 7208 - 7256 5.4 8 4 Tu 1 . - CDS 7274 - 8668 1770 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) - Prom 8725 - 8784 6.2 + Prom 9168 - 9227 7.4 9 5 Op 1 9/0.000 + CDS 9426 - 10094 714 ## COG1760 L-serine deaminase 10 5 Op 2 . + CDS 10114 - 10983 1060 ## COG1760 L-serine deaminase - Term 10994 - 11054 1.5 11 6 Tu 1 . - CDS 11063 - 11698 772 ## COG2364 Predicted membrane protein - Prom 11723 - 11782 6.7 + Prom 11707 - 11766 9.0 12 7 Tu 1 . + CDS 11957 - 13048 1118 ## COG3641 Predicted membrane protein, putative toxin regulator - Term 13068 - 13103 1.1 13 8 Tu 1 . - CDS 13216 - 13668 357 ## gi|253580328|ref|ZP_04857594.1| predicted protein - Prom 13812 - 13871 7.8 + Prom 13660 - 13719 8.0 14 9 Tu 1 . + CDS 13846 - 15270 646 ## gi|253580329|ref|ZP_04857595.1| predicted protein + Term 15361 - 15412 18.0 - Term 15348 - 15400 15.3 15 10 Op 1 . - CDS 15448 - 17268 857 ## gi|253580330|ref|ZP_04857596.1| predicted protein 16 10 Op 2 . 
- CDS 17290 - 17760 393 ## gi|253580331|ref|ZP_04857597.1| predicted protein - Prom 17900 - 17959 7.3 + Prom 17882 - 17941 7.1 17 11 Op 1 . + CDS 17968 - 18738 570 ## gi|253580332|ref|ZP_04857598.1| predicted protein 18 11 Op 2 . + CDS 18819 - 20357 452 ## COG3291 FOG: PKD repeat 19 11 Op 3 . + CDS 20398 - 21189 367 ## Cphy_2618 Ig domain-containing protein + Term 21201 - 21246 0.8 - Term 21245 - 21306 16.6 20 12 Tu 1 . - CDS 21362 - 22018 687 ## COG3341 Predicted double-stranded RNA/RNA-DNA hybrid binding protein - Prom 22142 - 22201 6.2 - Term 22146 - 22189 -0.6 21 13 Op 1 1/0.000 - CDS 22288 - 23472 1396 ## COG1454 Alcohol dehydrogenase, class IV 22 13 Op 2 . - CDS 23584 - 25104 1676 ## COG0531 Amino acid transporters 23 13 Op 3 . - CDS 25108 - 25728 783 ## EUBREC_0098 cytidylate kinase - Prom 25764 - 25823 3.9 - Term 25883 - 25922 6.5 24 14 Tu 1 . - CDS 25956 - 27173 1513 ## COG4992 Ornithine/acetylornithine aminotransferase - Prom 27276 - 27335 5.3 25 15 Tu 1 . - CDS 27684 - 28133 416 ## COG1396 Predicted transcriptional regulators - Prom 28223 - 28282 2.2 Predicted protein(s) >gi|226332901|gb|ACII01000118.1| GENE 1 51 - 1589 1914 512 aa, chain - ## HITS:1 COG:FN1523 KEGG:ns NR:ns ## COG: FN1523 COG0747 # Protein_GI_number: 19704855 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 32 512 44 525 526 283 36.0 7e-76 MRTKKFLAMAIALGLAVGMTAMPTLAAEKKTLVFGDTTFNAENEEADINPQNTYAGWACI RYGVGETLFRYSDTMEIEPWIAKEYENIDELTWKITLNDGVVFSNGRACDGEAVKESLEA LVANHERAAGDLCIDTIEADGNTVTIKTTEPKPALINYLSDPYGCIIDMQAGISDDGIVI GTGPYVATELVTDDHVNLVKNENYWNGDVNVDEITVRTISDGDTLAMALQAGEINAAYGM AYASYPLFENDDFTFTSTATSRAFYAWMNFESPVTSDPAVRKAIAMGIDKDSFVSVLLNG YGYPAVGVFPDTFSFGGQNLTTESYDPDGAKKVLEEAGWVDSDGDGIREKDGQKLEIRWL TYPSRQELPLLAESAQATLKEIGMDVQINCTSDNNTIRKDPALWDVYASAMVTAPTGDPE YFFTSCTLDESASNNGHYHSDKLEDLEKQMKQEFDADKRSDLAIQMQQTILDDNAYVFCS YLRMSMIMQSNVTGLTAHPCDYYELTADLDIN 
>gi|226332901|gb|ACII01000118.1| GENE 2 1624 - 2490 892 288 aa, chain - ## HITS:1 COG:FN1522 KEGG:ns NR:ns ## COG: FN1522 COG1173 # Protein_GI_number: 19704854 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 11 284 1 272 276 251 44.0 1e-66 MSEKQNKNVTVRVQKIKKNHIKQKLIFFLILAIGLVILAIFSEHLCPYDPYAQDLASALQ PPSLQHPMGTDTYGRDMLSRVISGARASIFSTFILVAVIAVLGTAVGVCCGYFGGAADAV MMRISDVCLAFPGLVFAMAIAAILGGGIQNAVISLAVVSWPKYSRIARGQTLALKEEPYI HAAILAGDSSGQIMLRHILPNMLGPVLVTAMLDIGTMMMELAGLSFLGLGAQIPMAEWGS MMSSGRSMIQTYPWVVLSPGIAIFISVAIFNLLGDTVRDYLDPKNSRS >gi|226332901|gb|ACII01000118.1| GENE 3 2487 - 3431 821 314 aa, chain - ## HITS:1 COG:FN1113 KEGG:ns NR:ns ## COG: FN1113 COG0601 # Protein_GI_number: 19704448 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 310 1 309 312 259 43.0 7e-69 MRKYAFRRLLQLIPILFAITFLSYGMMRIAGSDVVTQKMENTGQILSEDKLNAAREQLGL DKPFLTQYFVWLGKLLHGDMGNSYVSSLPVFDTFISKLPATLLLTVTSILLTIIISIPLG ILSAVKQNTITDYLIRLCSFIGNSLPNFFVSLLLMYFLAIRLRIFPVISKDVSLKSVAMP AITLAIAMSAKYLRQVRATVLDELSKDYVAGAKARGVKFSVTLWKSIMKASLVTIITLLM LSVGNLLGGTAIVESIFMWDGVGKMAVDAISMRDYPIIQAYVMWMAIIYVVVNLLTDLSY RFLDPRIRLGGVKQ >gi|226332901|gb|ACII01000118.1| GENE 4 3682 - 3771 76 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEAIFTSFLVSVIAGVVSYYICKWLDSGH >gi|226332901|gb|ACII01000118.1| GENE 5 3700 - 3858 71 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRNESPQSCRSEGFLARYGELFRLDCCCLMSAVQPFAYVVANYACYDRDKK >gi|226332901|gb|ACII01000118.1| GENE 6 3946 - 4854 823 302 aa, chain - ## HITS:1 COG:SPy0898 KEGG:ns NR:ns ## COG: SPy0898 COG0583 # Protein_GI_number: 15674920 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 
1 295 1 294 301 89 22.0 1e-17 MDLKQLQYFVVCAQTGSFSDAAKNLYSTQPSVSKVIKGLEDTLGMQLFERLPRGIRLTVQ GHKMYEYASKIINEVNVLENMASSGMTKWIRISLNPSSWFANQFVDFYNETYEKKYHFQV TTAGVQSVMERVRDYMDDIGFVYILSQQKENFLYELSKNKMYFEPIYETDVMLYPGGLTE LYNSGKERVELEDLKDVRFIQNYQDEFFDIGAVKEDAFQWKDIDISVLTNSDYIMEKMLK NSKVANISGSYLSENKEGTTPGIPLDIGDSKVIFGFILHKGEEMDESVAELVEFLKSRLP KH >gi|226332901|gb|ACII01000118.1| GENE 7 4890 - 6638 1180 582 aa, chain - ## HITS:1 COG:RSc0588_4 KEGG:ns NR:ns ## COG: RSc0588_4 COG2199 # Protein_GI_number: 17545307 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Ralstonia solanacearum # 420 581 5 169 182 117 38.0 7e-26 MSEHRCMLNKMNKIEIYVANEDTDINPSVQEAIEYVLKQNDPVGTVAGYYDEKLTIWSVS NYFLQLLGWDDLDEFMKASDGSMLSVVCNEQKHIFSPEGLHDLQGSHTLYLTDSKGLSIS VRIVKADARDNKGRPIWVLSVRSDHFARNQEAREAAFHRAFTDMNLCEYYVDLQENTFES MKVKGSLQEIFNKSRTWDELIQMFLNNYVCPESKEAVAQIYNREYIMKELRKITGELSQE CKAVFDGELRIIRNVVMEGDTDENGEVRHAMIFLRDVTDSKNTEKECRAMLKQNIAMDQL IQGVTRIVERFAVCDLDSGIYEYYEMNNESYYNPTGDYRELLQRMSGEYVVLTEKINIQM DDLLSPEHLRKVIMSEDDLYTFEYSTLDRSSYKVMSVIPVEWKGSILSKVMLIAQDIGQK HELEKLANTDALTGLYNERYLSERLKRNGKLHKKFAMFYLDLDRFKPVNDTYGHDMGDRL LKAVSRRLCKCIRKTDYAFRIGGDEFSLIIEEGNINDEFCEMMVRRIKSVIDRPFDIEGR LLSVDTSCGYAIYPEHSQKIDEIRIMADHRMYEDKTQNRKKG >gi|226332901|gb|ACII01000118.1| GENE 8 7274 - 8668 1770 464 aa, chain - ## HITS:1 COG:BH0511 KEGG:ns NR:ns ## COG: BH0511 COG2239 # Protein_GI_number: 15613074 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Bacillus halodurans # 19 462 19 452 452 226 30.0 6e-59 MEKRVQDYSKEILKIIRSNTSPAVMGERLQDYHENDLADVMPKLTVQERCKLYRILDIDM LSDIFEYTDEENAAEYLNEMDVKKAAAILSRMETDALADVLNKVEKTKKKILIDLLEPEV RRDVEMIASFDEDEIGSRMTTNYIVITDKLTVKQAMSSLIEQAAKNDNISTIFVVTEQQK FYGAINLKELIIARRDDLLENLVVTSYPYVYAEENIDECIEELKDYSEDSIPVLDNDNQL LGVITSASIIDLVDDEMGDDYAKLAGLTAEEDLKEPLKESMKKRLPWLIILLGLGMVVSS VVGIFEKVVTALPIIMCFQSLILDMAGNVGTQSLAVTIRVLMDESLTGKQKLELVWKEMR 
IGLCNGGLLGILSFALIGLYIYLFKGKTLLFSYAVSGCIGVALLLAMLISSAVGTCIPLF FKKINIDPAVASGPLITTVNDLVAVITYYGLSWLFLLKMLNLAG >gi|226332901|gb|ACII01000118.1| GENE 9 9426 - 10094 714 222 aa, chain + ## HITS:1 COG:lin1927 KEGG:ns NR:ns ## COG: lin1927 COG1760 # Protein_GI_number: 16800993 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Listeria innocua # 1 222 1 220 220 193 44.0 3e-49 MSFISVFDVMGPNMIGPSSSHTAGAARISYLAQKMIEGPLKRADFILYGSFAKTYHGHGT DRALLGGIMGFSTDDMRIRNSFDIAHEKGLKFSFTPNEQETDIHPNTVDICMENEKGQKM TVRGESLGGGKVRIVKINHVQVDFTGEYSAVIVIHQDTPGVVAYITKCLSDRNINIAFMR LFREGKGDIAYTIVESDGKLPENIVPAIRENPNIHEVMIVQM >gi|226332901|gb|ACII01000118.1| GENE 10 10114 - 10983 1060 289 aa, chain + ## HITS:1 COG:CAC0674 KEGG:ns NR:ns ## COG: CAC0674 COG1760 # Protein_GI_number: 15893962 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Clostridium acetobutylicum # 1 278 1 278 290 224 47.0 1e-58 MDFKNANELLALCEEKNLPISEIMRIREIELGETASEIVNKKMIRVLGIMKDAAFSPIRQ PVKSMGGLIGGEARKLSLHAQEKKGLCGNLLQKAITYAMATLETNASMGLIVASPTAGSA GIVPGLMLAMQEHYQFSDEEIIRALFNASAIGYLAMRNATVAGAVGGCQAEVGVASAMAA SAAVELMGGSPKECTFAASTVLMNMLGLVCDPVGGLVEYPCQNRNAAGVANAIIAAEMAL AGIPQLIPLDEMIQTMFTVGKKLPAELRETAMGGCATTPSACKACHICS >gi|226332901|gb|ACII01000118.1| GENE 11 11063 - 11698 772 211 aa, chain - ## HITS:1 COG:HI0522 KEGG:ns NR:ns ## COG: HI0522 COG2364 # Protein_GI_number: 16272466 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Haemophilus influenzae # 10 197 23 204 218 59 26.0 5e-09 MSEWIRKINWKYVVVMLIGNIILGLGIAIFKLSGLGNDPFSGMVMALAECVGIEYARFLI LLNLGFFVIEIIWGRKLIGLGTIINALFLGYFVTFFYNLITSVIDAPDQMAMQVVTVFIG VIITSLGISMYQLPKQGVAPYDSISLIMTEKWPKIPYFWCRVSNDAISALVCWLAGGIVG LGTLVSAFGFGPFVQFFDTHFTSKVLAKLEK >gi|226332901|gb|ACII01000118.1| GENE 12 11957 - 13048 1118 363 aa, chain + ## HITS:1 COG:BH3254 KEGG:ns NR:ns ## COG: BH3254 COG3641 # Protein_GI_number: 15615816 # Func_class: R General function prediction only # Function: 
Predicted membrane protein, putative toxin regulator # Organism: Bacillus halodurans # 7 362 4 333 336 221 43.0 1e-57 MEKLKAFLRRKDIEISIKRYGIDALGAMAQGLFCSLLIGTIINTLGTQFHISFLTTAVAT VNDTQYTVGSLASAMSGPAMAVAIGYALHCPPLVLFSLITVGFASNALGGAGGPLAVLFV AIFASEIGKAVSKETKIDILVTPLVTIGVGVALSAWWAPALGSAAMKVGNIIMWATNLQP FLMGILVSVFVGIALTLPISSAAICAALGLTGLAGGAAVAGCCAQMVGFAVMSFRENKWG GLVSQGIGTSMLQMGNIVKNPRIWIAPILTSAITGPLATCLFKLQMNGTPVSSGMGTCGF VGQIGVYTGWMNDIADGLKTSVTGWDWAGLLMISFILPAILCPLINMFIRKLGWVKDGDM TLS >gi|226332901|gb|ACII01000118.1| GENE 13 13216 - 13668 357 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580328|ref|ZP_04857594.1| ## NR: gi|253580328|ref|ZP_04857594.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 13 150 13 150 150 229 100.0 6e-59 MFLVAVLVLLFLMYRFYTFRAEMVFIMTLLTPAMLLIGAYHLTGEYREAGIFLLIVGIFL LAVVMFMYRITARGVEKEFRSAYYKNLFIYGALRFLQTFCRMTIVFSFVAAMLGGVSLDY KERLDIAGRKIYVDDNLRDAEGNQYTEVER >gi|226332901|gb|ACII01000118.1| GENE 14 13846 - 15270 646 474 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580329|ref|ZP_04857595.1| ## NR: gi|253580329|ref|ZP_04857595.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 474 47 520 520 949 100.0 0 MSDFSQRLEKEIQNTGKTLCYLAEASGLQLDYISKMNKGKRIPKEEDRLVRLLDAMECTA GTKSDLLLLYRQEKMGKSQWQCMEELIHIIQQDVIEFPTVSASVSISDTPNTISILDNEF AIYSWIRKIMISTSDSEIYLWTSEISSEHLHYLIQIIHEAPYQRLEHLFPLLQNQNADNT LDNLKKISILKPAMFCESYEPFYYYITPSVSEEKKSGLLPNWLITDNIALGINFQTSKGI IIRDTQQLQLLKHEFIREKVLARKMLFRSDLDSYVKDVNTMMSEGYVTHNYYIEYSPCLL HLIPANILEQQIILDTVEKEPLLVSLSQRTRHMENENMVHIFCISGLRQFMEEGRIAGYP DMLYKPFDPSLRLWLLKSYYQYMLNTPHSCICVKENFIHLPRHISIVSSSNVNNGIAFWN NTSRGLQYYVLKESGFSQKLYEFCQFLDKGNMAWSEQETLGIIRNMILEYGGTL >gi|226332901|gb|ACII01000118.1| GENE 15 15448 - 17268 857 606 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580330|ref|ZP_04857596.1| ## NR: gi|253580330|ref|ZP_04857596.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 606 1 606 606 1169 100.0 0 MEFPIYTPWENLDAQTIQKVINSGINYRTYCDEKPWDVSFEIDFLPKEELIIVVLIEHGR RKGGVSVQKTSERTEISPYSKEEQAEMLKAYNEKIDRYETWNALFTGPTTGLFGNNVYES YMDYMYSWEHWSDRIFLSDYYEEQLYSVQSDTYVTVKSTQEGYDCIYELAQVKVKEHILE GICWKSFPKLIFSRGDKSDAPYKDVYNRLEILTGAEYLIVPALIEIKEKYEITAVSEKLD CPQMRFRGMTMEETVKQAVFENIFGEFLQGSRFCRENTNKKCLIAEEQKEIFASEFLESQ KYPELLMNAVKKADFHCMAFKVLGILNSPYMQTYDKIEMLRSCMAISNSAEKSREWKNAY QNLIWVMIKKLEMDDSQGNNLEVVSELQKFVEYQQGDFEKEQKGKSREETAEIFLMNELI RMLACISIVTVKKIQGQECAELAEKSWFMAESIEYMWNAEILNQQVRENVQKTFIRICVG CGSMYLQNVQEENVNPEQIIETAEKVLLLLNNHRDLILQEEMPVYQIPIYQMLQVANIRL NHDENALYYARKGVELCDGINADNNAQLNGMIRDSKRMFQEYINQNAQKVPEKAKKKGFF SRLFSR >gi|226332901|gb|ACII01000118.1| GENE 16 17290 - 17760 393 156 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580331|ref|ZP_04857597.1| ## NR: gi|253580331|ref|ZP_04857597.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 156 25 180 180 288 100.0 1e-76 MSAKLVLTVKDSSRMAKDYSMSMMDKMFHPLKHNKILNSKAKIGILIDDEEEGTLYEASK TPYEINLQPGFHRITMIDPAKENKKGNVNLIGSIIGVVEGNGIADTMATSRFAKSMAGAM FKTGDGYLDLNLKDGDIIKVQCQAGIGGKVSIKILK >gi|226332901|gb|ACII01000118.1| GENE 17 17968 - 18738 570 256 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580332|ref|ZP_04857598.1| ## NR: gi|253580332|ref|ZP_04857598.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 256 16 271 271 425 100.0 1e-117 MKKKKWFLVITSILLMLLACTQIQAAAIKAPANIKVTASKKASVSIKWSKVSNASGYEIW RANSAKGNYSKIKTIKTKNTTSYANKKLSAGHYYYKVRAYKTVNGKTIYSNFSRYSGTTV KVLNLMKNLPPLSPSYVGKYSTIINKIGGMHKKSNSGYPSFYAAGNKMIIGVNYNAKYSK NQKYVYICNRGNYGVGIGGMQLGMTLSKATAILNKNGLRSFNNPTVFWWGNAASITLTIK NNIVTGFTYACAPTCD >gi|226332901|gb|ACII01000118.1| GENE 18 18819 - 20357 452 512 aa, chain + ## HITS:1 COG:MA4292 KEGG:ns NR:ns ## COG: MA4292 COG3291 # Protein_GI_number: 20093081 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 36 273 1329 1568 1995 140 34.0 5e-33 MKNKNTGFKHLVLLFAFILCLFSCVSAKAANYYYEDGYKYTLSLGKATIISYVGSDTDLT VPSILNGKPVVKIESSAFANNKNLCSVILPDTITSMGISVFAQCENLKSIHYPEGLDRIY YRTFAGCHNLTKVNISSNIKIIEDSAFAGCTSLTYLDLPKVKEIGEYSFANCTSLSEVIF HKGFTYVGSASFKNCTNLKDVTLPLGTQTISYSAFEGCKALRHIDFPSTLVFLKSASFKD TGLVHIIVPPSVTHEEQSVFLNCKNLEEAEYHGKVIEGFTFANCTNLKKLIIGPEVTYIG SSVFDGTTQLETLQIPATTNLHSRVFIGAEALHTIYGVKNSPVEIAANNAGLNFVPVKDH SIKLNKKQLTLYTVKGKNSAVLKSVISGSTSTPTWNSSNTKVVQVNATGKVTAKGAGTAY VTVTLGNKSASCKITVKKPVLNLKNKAITLKKGSSYTISCSAVPTGKITYKSSNTKCVTV NTKGKITARQKGTATVSIKCNGITQKLKVTVK >gi|226332901|gb|ACII01000118.1| GENE 19 20398 - 21189 367 263 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2618 NR:ns ## KEGG: Cphy_2618 # Name: not_defined # Def: Ig domain-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 5 116 3 117 399 72 37.0 2e-11 MKKLLKKHFSFIIAVLMVISLIIIPRTAQAASVKLNKTKLTMNVGGVYHLKVSGTNKKVT WSSTDSKVASVSSGKVKAKKTGTATITAKIGSKKLKCQIKIKDQRALYEKVLLQSGGKCF YLMDIDRNGTPDLIVSSNRGVIVDYSVYTIKNGKVIYAGQCSGKGMNYQILQYNTHYNGI HISWWTNGVGGSGSALWTLSGSKLVSTSRAYCSVAGFQTGKDEQHMKNVSQASFNSYCDK HFRNCKQYTMWSNTSENRIKHLK >gi|226332901|gb|ACII01000118.1| GENE 20 21362 - 22018 687 218 aa, chain - ## HITS:1 COG:CAC2551 KEGG:ns NR:ns ## COG: CAC2551 COG3341 # Protein_GI_number: 15895813 # Func_class: R General function prediction only # Function: Predicted double-stranded RNA/RNA-DNA hybrid binding protein # Organism: 
Clostridium acetobutylicum # 1 211 1 239 240 102 29.0 7e-22 MAKKKIYAVRKGHKTGLFATWAECQKAVSGYSGAEFRGFTEKEEALAFLNMETTGNVSGD KAKEAAGIVEVPENMVIAYVDGSFEKSIGRYAFGCVILTPDGQEIRKSGSGSDSAGVAIR NVAGEMLGSMTAVNWAIENKYPAVEIRYDYEGVEKWVTGVWRAKTPLTSKYAAHMQEAGK KVKISFCKIAAHTGNHYNEEADQLAKSALLKTDEEVRI >gi|226332901|gb|ACII01000118.1| GENE 21 22288 - 23472 1396 394 aa, chain - ## HITS:1 COG:MA2630 KEGG:ns NR:ns ## COG: MA2630 COG1454 # Protein_GI_number: 20091453 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Methanosarcina acetivorans str.C2A # 4 394 21 400 400 233 36.0 4e-61 MNPFEFFIPQNITVGAGTLAKLPECAKKLGGSHAMLISGPTLRKMGVVDKAADYLKDAGM AVDIFTDVEANPSVTTVEKATEAYKDSGADFIVALGGGSPMDVAKAVGVTAKYGGSITEY EGAHKVPGKIVPLIAIPTTAGTGSEVTAFSVITDHSRDYKLTVFSYELLPAYAILDPELL TSAPASVAAACGIDAFIHAEEAYVSTAASPFSDAMAEKAMELIGGNIRRFVARRTDLEAA EAMLTGSLFAGIAFSFARLGNVHAMSHPVSAFFNVAHGVANAVLLPVIAEYNALADHGRY LKIYNYISPIPAYEDEFEPLMLMDAIRELNEDIGIPEDLITAIRQAKKGEEISDEEIESK IDAMADDAMKSGNIAVNPRSSRKQDIVKLYYKAL >gi|226332901|gb|ACII01000118.1| GENE 22 23584 - 25104 1676 506 aa, chain - ## HITS:1 COG:BH0994 KEGG:ns NR:ns ## COG: BH0994 COG0531 # Protein_GI_number: 15613557 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Bacillus halodurans # 71 465 1 385 395 321 45.0 2e-87 MEQKTSEFDKVLGAWDILVIAFGAMIGWGWVVSSGNWIESGGVLGAAIAFAIGGIMIFFV GLTYAELTAAMPQCGGEHVFSYRAMGSTGSFICTWAIILGYVSVACFEACAFPTIITYLW PGFLKGYLYTVAGFDVYASWLIVAIVIAFLIMIININGAKTAAVLQTVLTCIIGGAGIIL IVASVVTGSVDNLQGQMFVGNGTGEAMKSIIKVAVITPFFFIGFDVIPQAAEEINVPLKK IGKMMILSVVLAVVFYALVILAVGYILNPAEIIESQQGTGLVTADAMAKAFGTKIMAKVI IVGGMCGIITSWNSFLLGGSRAMYSMAESYMIPPFFAKLHPKHKTPVNSLYLIGALTMLA PFAGRTMMVWICDAGNFGCCLAYCMVSVSFLILRKKEPDMARPYKVGHYKFVGFMAVVMS GLMLLVYCIPGSGGSLVFQEWLMVGSWSLLGVVFYAICKRKYKEDFGKLIELISDEDAAS LMPEADDEELDVVIDAAIDRVLSSMA >gi|226332901|gb|ACII01000118.1| GENE 23 25108 - 25728 783 206 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0098 NR:ns ## KEGG: EUBREC_0098 # Name: 
not_defined # Def: cytidylate kinase # Organism: E.rectale # Pathway: not_defined # 1 199 1 197 211 288 76.0 1e-76 MKHLVITIGCEYGAKGNAIGKKIAQDLGIKFYDRDLVDEIIKEVGIPTDIMDRVEEGGTI AGKGAQGDVRGSFSKYADLTDRAIHVQKQIIRKLAQKEACVIIGRSADYILKDTEGIQLV RAFIYAPNDVRIHNIMESHNLSEKDAKLLLDEKDKRYHKRHLALTGSNRGDRHNRDILIN SDFLGIQGTAEYLEDLIAKKYGKKEA >gi|226332901|gb|ACII01000118.1| GENE 24 25956 - 27173 1513 405 aa, chain - ## HITS:1 COG:MJ0721 KEGG:ns NR:ns ## COG: MJ0721 COG4992 # Protein_GI_number: 15668902 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Methanococcus jannaschii # 15 405 8 398 398 330 43.0 4e-90 MKLKDTGLTFQDIKDKVNKYMIETYERFDFLAETAEGMYIYDENKTPYLDFYAGIAVNSA GNCNPKVVAAAKDQLDDIMHTFNYPYTIPQALLAEKVCETIGMDKIFYQNSGTEANEAMI KMARKYGIEKYGPNRYHIVTAKMGFHGRTFGAMSATGQPGNGCQVGFGPMTYGFSYAPYN DLQAFKDACTENTIAIMVEPVQGEGGVHPATMEFMQGLRKFCDENDMLLLIDEVQTGWCR TGAVMSYMNYGIKPDIVSMAKALGGGMPIGAICATAEVAKAFTAGSHGTTFGGHPVSCAA ALAEVNELLDRDLAGNAKKVGDYFAAKLEKLPHVKEVRHQGLLVGVEFDDTIGGVDVKHG CLDRHLLITAIGAHIIRMIPPLIVTEEECDKAVAIIEDTIKSLEK >gi|226332901|gb|ACII01000118.1| GENE 25 27684 - 28133 416 149 aa, chain - ## HITS:1 COG:BH2909 KEGG:ns NR:ns ## COG: BH2909 COG1396 # Protein_GI_number: 15615472 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 1 148 38 184 189 77 27.0 1e-14 MLAQIERGTANPSLGVLGKITSGLRIEFQELIQPPRVDCCLVTPAELVPTKEVAGQYRVW TCLPYEDTRLVEVYRIEVEPGGVYISGSHGEKTREYLAVMEGVLTLECHGEKYEIHRDQV FKFETDQEHVYRNTGKEKACCTCFFLDYH Prediction of potential genes in microbial genomes Time: Sat May 28 20:32:30 2011 Seq name: gi|226332900|gb|ACII01000119.1| Ruminococcus sp. 5_1_39B_FAA cont1.119, whole genome shotgun sequence Length of sequence - 20708 bp Number of predicted genes - 19, with homology - 18 Number of transcription units - 14, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 431 - 820 76 ## 2 2 Op 1 32/0.000 + CDS 946 - 2664 2156 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 3 2 Op 2 . + CDS 2683 - 3192 671 ## COG0440 Acetolactate synthase, small (regulatory) subunit - Term 3044 - 3085 3.8 4 3 Tu 1 . - CDS 3261 - 4517 895 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair - Prom 4559 - 4618 4.3 + Prom 5135 - 5194 8.8 5 4 Tu 1 . + CDS 5280 - 5993 498 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family + Term 5996 - 6032 5.0 + Prom 6020 - 6079 8.1 6 5 Tu 1 . + CDS 6195 - 6554 433 ## COG0440 Acetolactate synthase, small (regulatory) subunit + Term 6575 - 6618 7.2 + Prom 6606 - 6665 5.7 7 6 Tu 1 . + CDS 6903 - 7541 624 ## EUBELI_20076 hypothetical protein + Term 7573 - 7611 5.4 + Prom 7557 - 7616 4.9 8 7 Tu 1 . + CDS 7666 - 8631 759 ## COG3191 L-aminopeptidase/D-esterase + Term 8642 - 8691 10.7 - Term 8635 - 8674 4.2 9 8 Op 1 . - CDS 8703 - 8855 112 ## gi|253580342|ref|ZP_04857608.1| conserved hypothetical protein 10 8 Op 2 . - CDS 8792 - 9028 64 ## gi|253580348|ref|ZP_04857614.1| predicted protein 11 8 Op 3 . - CDS 9044 - 10228 1245 ## Cphy_1651 major facilitator transporter - Prom 10400 - 10459 5.0 - Term 10476 - 10527 5.6 12 9 Op 1 . - CDS 10542 - 11306 736 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 13 9 Op 2 . - CDS 11300 - 11497 91 ## gi|253580352|ref|ZP_04857618.1| predicted protein - Prom 11556 - 11615 6.1 14 10 Op 1 . - CDS 11633 - 12319 699 ## COG2364 Predicted membrane protein - Prom 12419 - 12478 5.1 15 10 Op 2 . - CDS 12488 - 13906 1781 ## COG0469 Pyruvate kinase - Prom 14008 - 14067 8.1 - Term 14002 - 14054 2.0 16 11 Tu 1 . - CDS 14160 - 14690 508 ## EUBREC_0237 hypothetical protein - Term 14942 - 14992 12.9 17 12 Tu 1 . 
- CDS 15019 - 16524 1558 ## COG1640 4-alpha-glucanotransferase + Prom 17137 - 17196 3.9 18 13 Tu 1 . + CDS 17223 - 19598 2385 ## COG3957 Phosphoketolase + Term 19639 - 19687 10.2 - Term 19627 - 19675 -0.7 19 14 Tu 1 . - CDS 19685 - 20587 747 ## COG0583 Transcriptional regulator - Prom 20633 - 20692 6.4 Predicted protein(s) >gi|226332900|gb|ACII01000119.1| GENE 1 431 - 820 76 129 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPSRLQRQFHVTGTQYISTDKVQSQTLNTGGTAETFSPWANKPRAFFTIFIYQGQKVVRA CFNLSHQTRRPTLGRISAAPFCPTLYWPTHSSEIYLFSGAFAGVVVTLSAWKSALFRKIS LSEPHLRRV >gi|226332900|gb|ACII01000119.1| GENE 2 946 - 2664 2156 572 aa, chain + ## HITS:1 COG:MA3792 KEGG:ns NR:ns ## COG: MA3792 COG0028 # Protein_GI_number: 20092588 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Methanosarcina acetivorans str.C2A # 2 569 7 559 564 567 50.0 1e-161 MKQISGNKLLVKALKEEGVEYLFGYPGACTIDISDELYKQDDVKIILPRHEQALVHEADA YARTTGKVGVCLVTSGPGATNLVTGLATANYDSVPLVCFTGQVARHLIGNDAFQEVDIVG ITRSITKYGVTVRRREDLGRIIKEAFYIARTGRPGPVLIDLPKDVMAELGSAEYPKNVNI RGYKPNTDVHIGQLKRAIKLLNKAKRPLFLAGGGVVISRAHEIFREVVEKTKVPVVTTVM GKGSIPTDHPLYIGNLGMHGAYAANMAVSNCDLLFSIGTRFNDRITGKLHEFAPHAQIVH IDIDTASISRNIQVDIPIVSDAKEAITKMNEYVQECSTGKWLGQISQWKEEHPLKMRPND VLSPMDILKEINEQFENSIIVTDVGQHQMLVSQYAEITEGKQMIMSGGLGTMGYGLPGGI GAKIGNPDRPVIVISGDGGVQMNIQELATAVLEELPVILCIFNNEYLGMVRQWQKLFYGK RYAMTNLRAGALSRRTEGMEYPQYTPDFIKLAESYRAKGIRVTKKEEIAAAFKEAKKNTK APTVIEFIIDPEEMVYPMIKPGGTLEDMIMDC >gi|226332900|gb|ACII01000119.1| GENE 3 2683 - 3192 671 169 aa, chain + ## HITS:1 COG:PM0869 KEGG:ns NR:ns ## COG: PM0869 COG0440 # Protein_GI_number: 15602734 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Pasteurella multocida # 7 161 2 155 163 106 36.0 3e-23 
MDKGMKKRWISLYVENQVGVLSKISGLFSGKSYNLDSLTVGTTEDPTVSRMTIATVSDDE TFEQIKKQLNRMIEVIKVIDFTDVFVRMKEILYIKVFNCSREDKAELFQIAETFKAKVID YGKDSLLLEFVQTATKNDAVVKLIKEEFHRIEVVRGGSVGIESISMMER >gi|226332900|gb|ACII01000119.1| GENE 4 3261 - 4517 895 418 aa, chain - ## HITS:1 COG:CAC0285 KEGG:ns NR:ns ## COG: CAC0285 COG0389 # Protein_GI_number: 15893577 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Clostridium acetobutylicum # 6 401 3 396 396 344 44.0 2e-94 MKSTERLIFHIDVNSAFLSWESARRVSQGLPDLREIPSCVGGDPKKRTGIVVAKSIPAKK YGIQTGEPVAMALRKCPNLVIVPSDFELYDKCSRAFKAICASFAPVMESFSIDEVFLDMT GTSLIYPDPVAAAHELKDKIHRELGFTVNVGISTNKLLAKMASDFQKPDRVHTLFPDEIP EKMWPLPIRNLLFLGKASEQKLLNQGIRTIGDLARERASNVQYWLGEKTGHQLWQYARGI DDSPVKAVPDEAKGFSVETTFNDDITSIEQVFPILLEQCDVLATRMRRKDKKCNCISVTF RTLDFRNRSHQTKLENATDLTDEIYINATRLFKEFWKGQPLRLIGVALTGLTDGTYEQMS LFEDTENKERHRKLDAAMDEIRMKFGNDKITRASIMNSNARIGRKAKAQMKNEKTEEP >gi|226332900|gb|ACII01000119.1| GENE 5 5280 - 5993 498 237 aa, chain + ## HITS:1 COG:CAC0509 KEGG:ns NR:ns ## COG: CAC0509 COG1387 # Protein_GI_number: 15893800 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Clostridium acetobutylicum # 2 237 3 235 244 157 35.0 2e-38 MYLSDIHTHSIASGHGTTCTISDMAKAASKKGLKLLGISDHGPGTLASGTSSYFRSLPFY PRKRFGIDILYGVELNILDSAGHTDLSDELLNNLDYAIISMHRQNYRSGSVSQNTEAYIN AMKHPAVKILGHCDDTHFPVDYETLARAALRENVIFEINEASLAPGGYRGDTKANAARIL YLCQKYQLPVILSSDSHGKEHVGDFTYAEEFVHQLMFPETLIFNNQLPRLKVFLQTR >gi|226332900|gb|ACII01000119.1| GENE 6 6195 - 6554 433 119 aa, chain + ## HITS:1 COG:MTH1443 KEGG:ns NR:ns ## COG: MTH1443 COG0440 # Protein_GI_number: 15679440 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanothermobacter thermautotrophicus # 1 111 52 161 168 63 33.0 1e-10 
MTIATVSDDKTFEQIKKQLNRMIEVIKVIDFTDVFVRMKEILYIKVFNCSREDKAELFQI AETFKAKVIDYGKDSLLLEFVQTAIKNDAVVKLIKEEFHRIEVVRGGSVGIESISMMER >gi|226332900|gb|ACII01000119.1| GENE 7 6903 - 7541 624 212 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20076 NR:ns ## KEGG: EUBELI_20076 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 210 1 208 212 206 63.0 3e-52 MNTTTTTTTTAATKRSSTLYMVELAMMIAIILLMSFTPLGYLRTPGLSITLLTIPVAVGA IILGPKGGAVCGLAFGATSFYMAVTGSSAFAAALFNINPFGTFIVCIVARVLEGWITGLI FAALYKSPAKKFSYYVASLACPLLNTFFFMGFLCLFFYNTDYIQGIVSSLGVSNPVVFVA AFVGVQGLIEAGFCFVVGSIVSKALSVALKRA >gi|226332900|gb|ACII01000119.1| GENE 8 7666 - 8631 759 321 aa, chain + ## HITS:1 COG:BMEII0898 KEGG:ns NR:ns ## COG: BMEII0898 COG3191 # Protein_GI_number: 17989243 # Func_class: E Amino acid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: L-aminopeptidase/D-esterase # Organism: Brucella melitensis # 8 307 8 311 335 221 44.0 1e-57 MSLLKEISITDLSNLRIGHASDYNAQTGVTVLYFPLGAKVGCDISGGGPASRETPLTSPV TADNPINAIVLSGGSAYGLAAADGVMNYLESQNIGYNTGAALVPLVCQSCIYDLSYGDPK IRPTAKMGEEACRQAFAGTDTRTGNIGAGTGATVGKLYGMKQSMKSGLGIAAVSVGNFQM AAIVVVNALGDIFSPQNGQKIAGLKTPDRSGFLDSVHELYRFMTPHDQFTGNTTIGAVIT NGAFSKAELNKIASMTRCAYARCINPVATMADGDSIYAASIGDVSVDINMAGTLAAEVMA QAIQNAIHTSRIQDEEFLKYV >gi|226332900|gb|ACII01000119.1| GENE 9 8703 - 8855 112 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580342|ref|ZP_04857608.1| ## NR: gi|253580342|ref|ZP_04857608.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 50 183 232 418 95 86.0 1e-18 MWPLPVRNLLFLGKASEQKLLNQGIRTIGDLAREREANVQLWLGEKPEHR >gi|226332900|gb|ACII01000119.1| GENE 10 8792 - 9028 64 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580348|ref|ZP_04857614.1| ## NR: gi|253580348|ref|ZP_04857614.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 78 1 78 78 152 100.0 7e-36 MERLIFHIDVNSAFLSWESARRVSQGLPDLREIPSCVGGDPKKRTEYTRCSRTKYRRKCG HFLSVTCYFWEKPPSRSF >gi|226332900|gb|ACII01000119.1| GENE 11 9044 - 10228 1245 394 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1651 NR:ns ## KEGG: Cphy_1651 # Name: not_defined # Def: major facilitator transporter # Organism: C.phytofermentans # Pathway: not_defined # 2 392 12 418 422 375 50.0 1e-102 MKTNYNRTVQACFIGYVVQAIVNNFVPLLFLTFQGSYGISLQKITLLVTFNFGIQLLVDL ASIGFVDRIGYRASMILAHAMAAAGLILLTVLPECLGDPFVGLLIAVMIYAIGGGLLEVL VSPVVEACPTDNKEKAMSLLHSFYCWGHVGVVLLSTLFFRICGIANWKYMALVWALIPIA NGIFFTRVPIAPLLDEGEKGLTMGELFKKKIFWVLMLMMLCAGASEQAVSQWASTLAEKS FGINKTIGDLAGPMAFAILMGSSRAFYGKFGDKINLERFMQYSAVLCMVSYLMIAFIPVP ALGILGCAVCGLSVGIMWPGTFSMAAASVKRGGTALFALLALAGDVGCSSGPTYVGMISG ALNDNLKLGIFAALIFPVLMIMGIKMIGSLQKYD >gi|226332900|gb|ACII01000119.1| GENE 12 10542 - 11306 736 254 aa, chain - ## HITS:1 COG:CAC0908 KEGG:ns NR:ns ## COG: CAC0908 COG1179 # Protein_GI_number: 15894195 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Clostridium acetobutylicum # 5 248 6 248 251 286 57.0 2e-77 MLNQFSRTELLLGKEAMNKLENSRVAVFGIGGVGGYVCEALVRSGVGTFDLIDDDKVCLT NLNRQIIATRKTVGQYKTEVMRDRILDINPKADVRIHNCFYLPENAADFDFSEYDYVVDA VDTVTAKIELIMRAKEAGTPVISSMGAGNKLDASAFRVADIYKTKVCPLAKVMRRELKKR GVKKLKVVYSEEQPIRPIEDMAISCRSHCICPPGAAHKCTERRDIPGSVAFVPSVAGLII AGEVIKDLTAEYRR >gi|226332900|gb|ACII01000119.1| GENE 13 11300 - 11497 91 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580352|ref|ZP_04857618.1| ## NR: gi|253580352|ref|ZP_04857618.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 65 1 65 65 86 100.0 5e-16 MQIMKRLLIKNKRKLWLWFLLLVFLFLACRAAYITGSKAEQTRNNEVEERQADSRSIPNE MEILC >gi|226332900|gb|ACII01000119.1| GENE 14 11633 - 12319 699 228 aa, chain - ## HITS:1 COG:CAC0198 KEGG:ns NR:ns ## COG: CAC0198 COG2364 # Protein_GI_number: 15893491 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 10 218 4 209 227 70 24.0 2e-12 MKTNEEEDSRHIFRGELALVIAVMINSLAVILMLYSGSGISAISSVPYAFEKVFPVITLG TWTYIFQALLILSLMILRKKFVPAYLCSFLVGFAFSELLDVNEAWINILPKTIPLRVLYF VISYLLICIGIALSNRCGLPIIPTDLFPRELSEIIKVKYSKIKVSFDVICLMITACMTGL ILGHINGLGIGTVMAAFTMGKVIGLIGDEMDKHFRFVNFTQHCSLHVQ >gi|226332900|gb|ACII01000119.1| GENE 15 12488 - 13906 1781 472 aa, chain - ## HITS:1 COG:BH3163_1 KEGG:ns NR:ns ## COG: BH3163_1 COG0469 # Protein_GI_number: 15615725 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Bacillus halodurans # 1 468 1 468 473 475 52.0 1e-134 MRKTKIICTMGPSTDKGDVMELLVREGMDVARFNFSHGPHDEQRGRIQKLRELRKKYDRP VAALLDTKGPEIRLGCFKDGKVSLEEGQIFTFTNKDIEGTNERVSITYKELYKDVQPGGH ILVDDGLVDLEVQDIQGRDIVCKVINAGVIGDKKGVNVPGANLKMPFISKKDHDDLLFGI QEGFDFVAASFTRTANDIREVRKILKENGGEEIQIIAKIENQQGVDNIDEIIEAADGIMI ARGDMGVEIPPEYVPVIQQKIIQKVYTAGKPVITATQMLDSMISHPRPTRAEATDVANAI FQGTSATMLSGETAAGKYPVQALQMMSRIAEHMEQNIDYNTIFKKTDRNENPDITNAIAH ATCLTAIDLKASAILSVTKSGSTAHMMSKHRPGCLIGACTSSERVLRQLNLSWGVYPMLI KEEYSSEILCLRAIEAAQKHKLVKVGDTIVFAGGVPLGIPGRTNLIRVCTVE >gi|226332900|gb|ACII01000119.1| GENE 16 14160 - 14690 508 176 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0237 NR:ns ## KEGG: EUBREC_0237 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 176 38 224 224 190 55.0 2e-47 MNYNRLENSLIDVIKEEQAKLGYMKEKISLYYPLSSLNHFFGNKTDAAGMIEILKDFPSY IEEKFGEVAVSHRGDRFCFTVSEKASEYVHNNMKENEFIKALIELVGRHGCTIEEIKELF HSYSDKIITEPMDNEEFDVMIRFDDGSDPYYYCFKDEGCHIIYHRFLPEDYADFGF >gi|226332900|gb|ACII01000119.1| GENE 17 15019 - 16524 1558 501 aa, chain - ## HITS:1 COG:SP2107 KEGG:ns NR:ns 
## COG: SP2107 COG1640 # Protein_GI_number: 15901922 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Streptococcus pneumoniae TIGR4 # 1 499 1 497 505 480 48.0 1e-135 MGKRAAGILMPISSLPSDYGIGCFSKSAYEFVDWLKEAGQTYWQILPLGPTSYGDSPYQS FSTFAGNPYFIDLDTLVEEGVLDKKDCEKVNWGKKPDEVDYEKIYKGRYPLLRKAYENSD ISKNPDYQKFVAENSWWLSDYALFMAVKDRFEGVEWTKWAEDIRLRWNNAMDYYREELYF DIEFQQYMQFKFYEQWMKLKSYANSKGIQIIGDIPIYVAMDSADAWAHPELFQLDQDNVP LAVAGCPPDGFSATGQLWGNPLYRWDYHRNTGYQWWISRMSYCFRLYDVVRIDHFRGFDE YFSIPYGDKDARGGHWEKGPGIDLFRKIEQALGWKQVIAEDLGYMTDSVRHLVYESGFPG MKVLEFAFDSRDSGCASDYLPHNYPENCVAYTGTHDNETIVGWWKSITAAERKLARDYLC DHATPEEELYKCFISLIMRSAARVCVIPMQDYMGLDNRFRMNKPSTVGTNWKWRIKKRDL TKKLQKEIYDVTLRYGRKNWD >gi|226332900|gb|ACII01000119.1| GENE 18 17223 - 19598 2385 791 aa, chain + ## HITS:1 COG:CAC1343 KEGG:ns NR:ns ## COG: CAC1343 COG3957 # Protein_GI_number: 15894622 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoketolase # Organism: Clostridium acetobutylicum # 13 790 14 795 796 1211 71.0 0 MEDNKLCLEKQPLNQEMLDKIDAYWRAANYLSAGQLYLLDNPLLREPLTMDQIKKKIVGH WGTVPGQNFVYAHCNRVIKRYDLDMILLSGPGHGGNFMIANTYLEGSYSEVYPNISQDEE GMKKMFKQFSFPCGVPSHCAPETPGSINEGGELGYSIAHAFGAVFDNPDLIATTVVGDGE AETGPLATSWQSNKFLNPITDGAVLPILHLNGYKISNPTIFSRLSHEELECFFKGCGWKP YFVEGDDPMTMHKKMAETMDTVIEEIKAIQKNARENNNPERPVWPMIILRTPKGWTGPKV VDGLQIEGSFRAHQVPITMEKPEHLELLKEWLESYRPQELFDKNGRLIPELAELAPKGNA RLGANPHANGGLLLKDLRLPDFRDYGIEVQPGKTKAQDMIELGGYIRDIFKLNEENKNFR IFGPDESMSNRLYKVFEYQKRDWNAEMLDTDDCLARDGRIMDSMLSEHMCEGWLEGYLLT GRHGFFASYEAFIRIVDSMAAQHAKWLKVCNQLSWRQPIASLNFILTSNVWQQDHNGFTH QDPGFLDHIANKKADVVRMYLPPDANCLLSCFDHCIKSKNYVNAIVASKHPSCQWLTMEQ AVKHCTQGIGIWEWASNDCGEEPDVVMACCGDTPTLEIMAAVTILRDEMPELKIRVVNVV DLFKLESDHKHPHGLSDAEYDAIFTKDKPVIFAFHGYPTLIHELTYERHNHNISVHGYQE EGTITTPFDMRVQNQIDRFNLVKDAIMHLPQLGNKGSFLIQKMNDKLVEHKQYIAEYGQD LEEIRNWEWHK >gi|226332900|gb|ACII01000119.1| GENE 19 19685 - 20587 747 300 aa, chain - ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # 
Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 4 298 2 295 296 160 28.0 3e-39 MNANFEYYKVFYYVSKYENLTKAASVLKTSQPAVTRTIHNLENILGCRLFTRSKSGMKLT PEGKIFYQYVAAGCAQFFKAEDNLNNLISLENGTIYISATETALHCYLFEAVRDFNIQYP NVHFKILNNSTTDSVNALKEGKVDLAVVSASLQLAEPLKMKVVRKYREILIGGSRFAELK GRKTSLEELSAYPWISLTSEAISRRFLNEYFSKNGLQFSPDVELATTDMILPAVRYNMGI GFLPGEFAGDDLESGKVFEIDVNETLPERSIILIHDTEYPQSIASKAFQKFLAERKEQNR Prediction of potential genes in microbial genomes Time: Sat May 28 20:33:03 2011 Seq name: gi|226332899|gb|ACII01000120.1| Ruminococcus sp. 5_1_39B_FAA cont1.120, whole genome shotgun sequence Length of sequence - 2332 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 32 - 91 6.5 1 1 Tu 1 . + CDS 128 - 1786 1467 ## COG0366 Glycosidases + Term 1815 - 1864 5.0 - Term 1802 - 1851 10.2 2 2 Tu 1 . - CDS 1876 - 2331 344 ## EUBREC_0955 hypothetical protein Predicted protein(s) >gi|226332899|gb|ACII01000120.1| GENE 1 128 - 1786 1467 552 aa, chain + ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 7 549 5 556 561 484 45.0 1e-136 MKHKKDWYKDLVIYQIYPRSFKDGNNDGIGDLKGMIEKLDYLKELGINAVWMSPVYASPN VDNGYDISDYFAIMEEFGTMEDWDAFRDGAHERGIAIIMDLVLNHSSDKHKWFRESKKSR NNPYSDYYIWRDPAPDGGAPNCWMSVFGGSAWEYVPERGQYYLHFFAKEQPDLNWDNPET KEKIFDIIRFWNEKGVDGYRIDAISYLDKGLDGRADMNEPIGTAACVNLEGTHRYIREMV AETMTPDNLMSVGEVNINNEQDAINYSSAASKEFNMAIPFVSPIVEIQTWSPEKMKRDLK KDYEILKKDGWWARFLSNHDKPRQVSLYGNDREFWSESAKMLACYLHTLPGTPFVFQGEE FGMTNVAFPSIDDYNDIDTRNYYKTMIEQGASPEEALAESRSISRDNARTPMQWNDSPNG GFSEHTPWLGVNPNYKLINAESQMNDDCSVFKFYQNLIALRKKHPVMAHGDFKLCCPEDG PVIAYTRSYQDETWLAVHNFSASMQKFHYGAEDIELPCNTPVIISNYGENEVEYGIGTLV LKPYETVVFQLH >gi|226332899|gb|ACII01000120.1| GENE 2 1876 - 2331 344 151 
aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0955 NR:ns ## KEGG: EUBREC_0955 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 148 415 561 564 165 53.0 5e-40 KTVSENGEALEKFCNALKDGNSEVVEQSLNNYLKRTISIRDTLVKKQMKENFYHGILLGI LGFQQNWSVSSNKESGDGYSDILIETEDQETGIIIEIKYAETGNLEAVSEEALKQIEDRR YEEQLLEEGVEHILKYGIAFYKKKCKVMAVK Prediction of potential genes in microbial genomes Time: Sat May 28 20:33:13 2011 Seq name: gi|226332898|gb|ACII01000121.1| Ruminococcus sp. 5_1_39B_FAA cont1.121, whole genome shotgun sequence Length of sequence - 31343 bp Number of predicted genes - 25, with homology - 22 Number of transcription units - 16, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 200 - 1060 598 ## EUBREC_2865 hypothetical protein - Prom 1155 - 1214 6.8 + Prom 1004 - 1063 6.9 2 2 Tu 1 . + CDS 1182 - 1694 -92 ## + Prom 2781 - 2840 3.2 3 3 Tu 1 . + CDS 2860 - 3060 90 ## + Prom 3173 - 3232 4.8 4 4 Tu 1 . + CDS 3280 - 3570 122 ## + Term 3707 - 3748 -0.9 - Term 3750 - 3794 8.2 5 5 Op 1 13/0.000 - CDS 3803 - 4093 194 ## COG1343 Uncharacterized protein predicted to be involved in DNA repair 6 5 Op 2 2/0.000 - CDS 4159 - 4347 125 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair - Prom 4423 - 4482 3.4 - Term 4423 - 4464 3.0 7 5 Op 3 . - CDS 4559 - 6940 1675 ## COG1203 Predicted helicases - Prom 6966 - 7025 2.1 - Term 7168 - 7200 2.3 8 6 Tu 1 . - CDS 7207 - 7944 650 ## CTC01466 hypothetical protein - Prom 8016 - 8075 4.9 9 7 Op 1 . - CDS 8097 - 8885 648 ## CTC01467 hypothetical protein 10 7 Op 2 . - CDS 8878 - 10632 1210 ## CTC01468 hypothetical protein - Prom 10660 - 10719 3.4 11 7 Op 3 . - CDS 10721 - 11398 348 ## CTC01469 hypothetical protein - Prom 11418 - 11477 4.5 - Term 11715 - 11776 2.1 12 8 Tu 1 . - CDS 11790 - 13580 2257 ## COG0426 Uncharacterized flavoproteins - Prom 13704 - 13763 6.5 - Term 14072 - 14128 4.1 13 9 Op 1 . 
- CDS 14158 - 15327 911 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 14 9 Op 2 . - CDS 15348 - 17237 1436 ## COG0514 Superfamily II DNA helicase 15 9 Op 3 . - CDS 17263 - 18150 914 ## COG4905 Predicted membrane protein 16 9 Op 4 . - CDS 18218 - 21166 2491 ## COG0642 Signal transduction histidine kinase 17 9 Op 5 . - CDS 21172 - 23022 1528 ## EUBELI_01635 multiple sugar transport system substrate-binding protein 18 9 Op 6 . - CDS 23022 - 25070 1594 ## COG0642 Signal transduction histidine kinase 19 10 Tu 1 . - CDS 25186 - 26001 833 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Term 26317 - 26381 14.1 20 11 Tu 1 . - CDS 26437 - 27780 851 ## COG0534 Na+-driven multidrug efflux pump - Prom 27862 - 27921 6.9 + Prom 27880 - 27939 6.5 21 12 Tu 1 . + CDS 27998 - 28456 320 ## COG1522 Transcriptional regulators + Prom 28759 - 28818 5.8 22 13 Tu 1 . + CDS 28945 - 29232 163 ## gi|253580381|ref|ZP_04857647.1| predicted protein + Term 29367 - 29395 -1.0 + Prom 29268 - 29327 4.1 23 14 Tu 1 . + CDS 29561 - 30559 286 ## COG0534 Na+-driven multidrug efflux pump + Term 30639 - 30673 3.2 + Prom 30657 - 30716 3.3 24 15 Tu 1 . + CDS 30743 - 30934 171 ## gi|253580383|ref|ZP_04857649.1| conserved hypothetical protein + Term 30952 - 31003 9.1 - Term 30939 - 30990 9.1 25 16 Tu 1 . 
- CDS 30998 - 31186 76 ## gi|253580384|ref|ZP_04857650.1| conserved hypothetical protein - Prom 31250 - 31309 2.8 Predicted protein(s) >gi|226332898|gb|ACII01000121.1| GENE 1 200 - 1060 598 286 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2865 NR:ns ## KEGG: EUBREC_2865 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 22 196 10 189 576 189 49.0 1e-46 MYNMYIEPRQKGGSCIEFSQKKLPIGIENFAKLQAEDFYYVDKTMLIKELLDNWAEMNLF TRPRRFGKTLNMSMLKAFFELNGDKNCFKGTRISREVYLCEQYMGKFPVIFISLKSISAQ TYEKACDMAIQIVNGEARRFQYLLDSERLTEYDKDAFKSLLLPDMKESVLCSSLRILSEL LEKHHGQKVILLIDEYFGFTDSDVKNILEYYELSEHYDEIKMWYDGYRFGNVEVYCPWDV INYCDELRINNMAQPQNYWSNTSSNEAVKRFIQETDKERSERKLRN >gi|226332898|gb|ACII01000121.1| GENE 2 1182 - 1694 -92 170 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFQYISCYSLSGSIGELSVVNRLFQYISCYSLSVTGNTYGAKDVRFNTSHVTLYRKTIQY PQYPFCRFQYISCYSLSWSITDNKRLKRVSIHLMLLFIKTGGINDGKIITVSIHLMLLFI EALLKISGLHIAVSIHLMLLFIFCCWFLWLHIKFVSIHLMLLFIKLHKTL >gi|226332898|gb|ACII01000121.1| GENE 3 2860 - 3060 90 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVYSLFQYISCYSLSYVNKKDKSIIQMFQYISCYSLSNYTENIMEDKASFNTSHVTLYLR ELGIAF >gi|226332898|gb|ACII01000121.1| GENE 4 3280 - 3570 122 96 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLLFISRQYQEAQYLLDVSIHLMLLFIQFLASYDELEISFNTSHVTLYLERASDFVDTMT LFQYISCYSLSKEGIDNMKDVRSFNTSHVTLYPLVL >gi|226332898|gb|ACII01000121.1| GENE 5 3803 - 4093 194 96 aa, chain - ## HITS:1 COG:TM1796 KEGG:ns NR:ns ## COG: TM1796 COG1343 # Protein_GI_number: 15644540 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Thermotoga maritima # 9 91 2 84 87 63 42.0 7e-11 MKKQLNYNYAFLFYDVGEKRVQKVFKICKKYLSHFQKSVFRGDMTPSKFIQLRNELNRVI DKEEDFVCIIKLMNDNVFGEEVLGNGKKDNGEDLIL >gi|226332898|gb|ACII01000121.1| GENE 6 4159 - 4347 125 62 aa, chain - ## HITS:1 COG:aq_369 KEGG:ns NR:ns ## COG: aq_369 COG1518 # Protein_GI_number: 15605875 # Func_class: L Replication, recombination and repair # 
Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Aquifex aeolicus # 1 57 255 311 316 60 50.0 8e-10 MNEEGRKIFIEAFENRLESVFLHPKLKRKVSYRTALKLDCYKLIKNILEGKEFVPFSLKE GM >gi|226332898|gb|ACII01000121.1| GENE 7 4559 - 6940 1675 793 aa, chain - ## HITS:1 COG:BH0336 KEGG:ns NR:ns ## COG: BH0336 COG1203 # Protein_GI_number: 15612899 # Func_class: R General function prediction only # Function: Predicted helicases # Organism: Bacillus halodurans # 214 607 211 597 800 103 26.0 1e-21 MSREGIHFFETMRDNVITFHDIGKINPRFQEKNMGNYELKGKARPDNSVGSKHSILSSVL YLDYFLDRVSQIENLEERKSLRDLAYIHSFLIAKHHGALTDFQTYLDSFASLNDTEGTVL GWHAQQWLKKWKEENENQKVRKKVYLKKDKQEDMFERISGDDASRQVYLFGYTRLLFSLL VAVDYYATSDYMNGIKLKAFGQLDDISNIIQAYEKTGTQQSIREYEKTKYPQRRERLVKK KKADVINDLRTEMFLDAENTLKEHINDHIFYLEEPTGAGKSNTALNLSFQIIKKVKNINK IFYIYPFNTLVDQNIENLGKIFGNQKDIMNQIAVVNSLVPMKEEKDEYEDKTKKFYQKVL LDRQFLNYPIVLSTHVTWFKTLFGHEKEDVFAFHQLCNSVIVLDEIQSYKNALWSEIITF LKGYAKLLNMKIIIMSATLPNLEALTDDKEDAVNLIPQKESYFKHPVFAERVIPDYSLLK QKMTLEILCEHVQKQVLRKKKILIEFISKKSAEKFYGMLTDTEIDCETLFMSGDSSIWER QKIIEKLSKLKSVILVATQVIEAGVDIDMDIGYKDCSKLDSEEQFMGRINRSCKGEGIVY FFNLDSARMVYKDGDIRVDTEFTVMKTDMQEILRTKNFSDYYGEILEIERNNKKSENENN IVHFFKKEVGQLAYPKVSDKMHLIDEQREMMNVFLSRTLTLSATEKICGDEIWQEYESLL HDQGMDYSEQRVRLHEIKCKMKMFVYQIEKGNTFAWDEYEAIGDIYFIKDGDSYFENGVL NREKFGTGGEMFF >gi|226332898|gb|ACII01000121.1| GENE 8 7207 - 7944 650 245 aa, chain - ## HITS:1 COG:no KEGG:CTC01466 NR:ns ## KEGG: CTC01466 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 1 245 1 256 256 230 48.0 3e-59 MRALKFTLSGKNAFFKKPEVNAYFYFTYGQIHRVALLGILGAIVGYKGYGCTGTYPEFYE KLKDLKVSVVPRNSQGYIQKKVQMFNNTVGYASQELGGNLIVREQWLENPVWDIYILLDS READKIAEMILDKKCVYIPYMGKNDHLADICAAKVVELDVVTCENVVLSCLYEKKDGKLG IADEEEEEYEDEEEVPEFKYEEKLPVGIDSHLNLYQYETFCFTNMKVTLKGKKIFKTEDR MLVFY >gi|226332898|gb|ACII01000121.1| GENE 9 8097 - 8885 648 262 aa, chain - ## HITS:1 COG:no KEGG:CTC01467 NR:ns ## KEGG: CTC01467 # Name: not_defined # Def: hypothetical protein # 
Organism: C.tetani # Pathway: not_defined # 1 261 3 268 319 331 68.0 2e-89 MNKRVYGVLGISSIMANWNADFSGYPKTTSDGLVFGSDKALKYPMKKMWDNEGKKVLYIK SMKFGGKDASLVSLVPRSLGERYEYIFGEKVEKNPQKTLKNLFSAVDVKNFGATFAEAKN NISITGAVQFGQGFNKYADTCSEEQNILSPFRDSSKDDKKDEEAKNSTLGTKITSNEAHY FYPFVINPLAYKEFKELGVTDGYTEEDYQNFKRTALVSATAFATNAKEGCENEFALFVET ESDLYLPNLTEYISFDKGEEKT >gi|226332898|gb|ACII01000121.1| GENE 10 8878 - 10632 1210 584 aa, chain - ## HITS:1 COG:no KEGG:CTC01468 NR:ns ## KEGG: CTC01468 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 1 579 1 579 579 400 41.0 1e-110 MLKDCLEVFKRQMQQVKEKGRAEDALILDSYIPADGCYISVNSDGIIACQMDLKFNKKAK QMEGISQRYYGKMCFFDYHSRLVSMDKPVDPKKVIHSNNYLSFWVKQESLGNGKLNQEAI DRYFDVLKNPEKKYAKSKDRQMYDYIAAQIDEINVEKLEWCRDWVKKHIFSLEDMGISLS GKNYLKIFFEDTEERYIQEEQRYLITKIFNKNDYNQEIDGKIWGLPNDNLGMNQKKPYMA HKTRKTELPYMITAEDAVLQRKFFDYLYNQASAGKVNIYIDPEKGEISAFSKEEKMQKDF TGYYLYVQKGKEVQIMHQDVIVDYRYHLRKRFQYINVFDRKTEEELYRDYGTVGQMESLL NEILFSKWLVTNYFTPVDELQISGETARNLIWSRDAIFAWLYKNETQNIAGILSKVSVNM IKDSIKNGYISKAVKQFNLKCSLEEYFSGGEQMSTDYESIRKELRRKIQSKETEQITSDE EYYYAVGQLVNYFISLSKAKDKKHSLANPFFNIKNNQALKDKLRQYFIKYNYQLNCSGTW FNNLYAMICNYTKAEKTDQNSMIAGYINNNILYQKKNMEEQVNE >gi|226332898|gb|ACII01000121.1| GENE 11 10721 - 11398 348 225 aa, chain - ## HITS:1 COG:no KEGG:CTC01469 NR:ns ## KEGG: CTC01469 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 1 225 6 230 230 212 44.0 8e-54 MRVKLYLLQDIPFKELQNAVANFVDSALCQNESLISFHETNCYKFYSVGTLWPLERGMTY KKDQIYTLTVRTVEPILARYFSEVLKNHYTQKMKGLTVENRIIPHKMISEIYTLSPVIMK SDEGYWRSYMSLDEYEERLFSNLVKKYNSFTGEQIEEDFPLYTNITFLNRHPISCKYKRI SLLGDKLSLQIADDEKSQKLAYFILGVGLCELNSRGAGFCNFRWF >gi|226332898|gb|ACII01000121.1| GENE 12 11790 - 13580 2257 596 aa, chain - ## HITS:1 COG:FN1423 KEGG:ns NR:ns ## COG: FN1423 COG0426 # Protein_GI_number: 19704755 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 2 382 6 399 405 330 40.0 6e-90 
MKITDTIKYVGVNDHQVDLFEGQYAVPNGMAYNSYVILDDKVAVMDTVDANFKHEWLDNL EQALGGRKPDYLIVQHMEPDHAANVANFLEVYPGTTVVANAKTFVMIKNFFGLDLEGQKL EVENGSTLSLGNHNLTFVFAPMVHWPEVMVTYDSTDKVLFSADGFGKFGALDVEENWDDE ARRYFIGIVGKYGAQVQRLLKVAATLDIQIICPLHGPVLTENLGHYISLYDTWSSYTPEE EGIVVVYTSVYGHTKEAVNQFVEKLKSKGCPKVVVYDLARDDMSQALSDAFRYSKLVLAT TTYNAGIYPFMNDFITRLAEHNFQNRTVGLIENGSWAPLAAKVMKNMLSECKKINWLDTT VKIMSAVNQENRDQMEAMASELCKEYIAKNDELANKNDMTALFRIGYGLYVVTSNDGKKD NGLIVNTVTQLTDSPFRVAVNINKTNYSHHVIKQTGVMNVNCLSVEAPFSVFEQFGFQSG RSVDKFAGQKVNRSDNGLIFLDKYINAFMSLKVEQYVDLGTHGMFICSVTEARVVSDQET MSYTYYQKNVKPKPETEGKKGFVCKVCGYIYEGDELPEDYICPLCKHGAVDFEPIG >gi|226332898|gb|ACII01000121.1| GENE 13 14158 - 15327 911 389 aa, chain - ## HITS:1 COG:all2102 KEGG:ns NR:ns ## COG: all2102 COG1473 # Protein_GI_number: 17229594 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Nostoc sp. PCC 7120 # 5 372 60 421 441 183 32.0 5e-46 MNIEKYTEELITLRHYFHQHPELALQEVETSAYIRKYLEKLGYEIVPIEPTGLIAELPSL RNREKLVVLRAEMDALPIQEQTNLPYASVNQGCMHACGHDMILAAALILAKIVAEEQRMQ ESFPVRLRFLFEPAEEIGEGAKRMLQAGALENPKADAFLMFHYAADMTFGMAVHQGQASS MINSMQIHVHGKSSHWCEADKGIDAIYAAAQVISAIHDLNESFGKQHPDAGKYIVGTGTI HGGEYTNIIADHVVLNGNIRAVHEETYMALEHELERNLQEIEKQTGTQIRMEFPKDPVYA FANDEELTETAKAVGAEVFGDKFVLEGEDELFLSGDNAYRYFRETRGLFTVFLAGIPGEN HPLHHPKFQLDERILPYSVEALYKIITTL >gi|226332898|gb|ACII01000121.1| GENE 14 15348 - 17237 1436 629 aa, chain - ## HITS:1 COG:CAC2687 KEGG:ns NR:ns ## COG: CAC2687 COG0514 # Protein_GI_number: 15895945 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Clostridium acetobutylicum # 1 613 1 591 714 552 46.0 1e-157 MNRQAVQTLKTYFGYDTFREGQESVVESILEHRDVLAIMPTGAGKSICYQVPALMLSGIT IVISPLISLMQDQVKALNEAGIHAAFINSSLSESQISKALYLAAGGRYKIIYVAPERLEN YEFLEFARQVEISMVTVDEAHCISQWGQDFRPSYVKIVDFVKNLPGRPIVSAFTATATEE VKNDILCTLKLEDPKVVITGFDRKNLYYSVENIRRKDDFVMDYIDRHPTESGIIYCSTRK NVDNLFELLFQKGVAVTRYHAGLNNETRKKNQDDFIYDRTPVIIATNAFGMGIDKSNVRY 
VIHYNMPQSMENYYQEAGRAGRDGENSQCVLLFSAQDVIIDRMLLDNKDFSDVDEEDEFL IRQRDIRRLQIMEGYCKTTGCLRNYILEYFGEKTFGPCDNCGNCHREYHETDMTREAKWV VNCVAETRGRYGLTIVLGTLLGAKRARLRELGADKYKSYGALNDHSEAELRALISQMTEM GYLYQTQERYSVLKLGDISPLRDENTHVIMRTYEEKEPDKKKKPQKSVRKRSTDALTAAG YDLFEALRKLRLEIAKEEAMPPYIVFSDKTLIDMCIKCPSNEEEMLEVSGVGENKLKKYG QRFLEEIQKFCLERPNAVLSMSEDENGNP >gi|226332898|gb|ACII01000121.1| GENE 15 17263 - 18150 914 295 aa, chain - ## HITS:1 COG:lin2818 KEGG:ns NR:ns ## COG: lin2818 COG4905 # Protein_GI_number: 16801879 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 12 260 8 261 270 127 30.0 3e-29 MICGMTYFQICLYFLIYSFGGWVVEVIFHAVALGKVINRGFLNGPVCPVYGFGVLSVFAL LNTIQGSGRQMSDGMIFVFGIVLATAVELVAGWLLDVCFHARWWDYSDKPFNFHGYICLE FSLIWGLAIVMVVKVFQKYVEAHALHTPATWEWIVIAVLYAVYLTDFIVTVAVIQGLNKK LTRLDKVRSDLRIVSDKLSDTLATTTIGTAQKVGEGKVQATLAAAELRDATAAQREKSIE MLRIKRAELQAQFEELSSSITNHTVVGQGRIIKAFPKMQHRDYSELIQELKKKLK >gi|226332898|gb|ACII01000121.1| GENE 16 18218 - 21166 2491 982 aa, chain - ## HITS:1 COG:PA3462_2 KEGG:ns NR:ns ## COG: PA3462_2 COG0642 # Protein_GI_number: 15598658 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 470 747 1 276 385 191 41.0 6e-48 MDVKNHNTNRKKRVVRKRWNAAVLIALFVGILLTVFQYLGFVSKTVYEESVSHLTEIFHQ SDNMLSELTNKNLTYLHIWGEYLQNTSDESEIREYIEKAQEDAGFLDFYFLSADGNYKMV TGETGYLGLQENIEDEIRQGNDVITNATVPGKSQLLVFATPRPHGSYQGFEYDAIAIAYE NSDIVNVLNISAFNGNAQSYVVHPDGRVVVDHSSEAWGEVYNFFGILREHSTLSEKEILK LSEEFKEGHTDAMLINLDGEDYYLVYEKSKPQDWIFLGLVQADIVNASMNVLQRSTVLLV SAVAVCIAGLFIGIILRKNRVNLKRKDTEILYRDELFQKLSMNVDDVFLMLDAKTYQADY VSPNVEKLLGITVEQIRKDICVLGKLHPGDVEDPEKKYLEEIQVHEQQEWDLEYVHQKTG EHRWFHNVAMGSEVNGKKKYILVLSDRTSDRKMNQALSEAVRAAETANKAKSTFLSNMSH DIRTPMNAIIGFTTLAVSNIDDKERVRDYLGKILSSSNHLLSLINDILDMSRIESGKIHL EETEVSLSEVLHDLKTIISGQIHAKQLELYMDVMDVTNEDVYCDKTRLNQVLLNLLSNAI KFTPAGGTVSVRLREYPGTQRGCELYEIRVKDNGIGMSQKFVQKIFSPFERERTSTVSRT 
QGTGLGMAITKNIVDMMGGTIEVQTEQGKGTEFIIRLPLRIQPENHRIEKIAELEGLKAL VVDDDFNTCDSVTKMLVKVGMRSEWTLSGKEAVLRARQSVEMGDAFHAYIIDWRLPDMNG IEVTRQIRSLGDDTPIIILTAYDWSDIEVEARAAGVTAFCAKPMFMSDIRETLMAAIGQK QAGAEDNILPAADLDFRGRRILLVEDNELNSEIAVALLSEYGFRVDTAEDGAEAVEKVKN STPGDYDLVLMDVQMPVMNGYEATERIRSLDDPSLAGITILAMTANAFDEDKKKALACGM NGFLSKPIVIEELISTLQNSLG >gi|226332898|gb|ACII01000121.1| GENE 17 21172 - 23022 1528 616 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01635 NR:ns ## KEGG: EUBELI_01635 # Name: not_defined # Def: multiple sugar transport system substrate-binding protein # Organism: E.eligens # Pathway: not_defined # 33 341 7 320 326 155 31.0 6e-36 MKKKKWNKILAVLLAMVTAVSLLSGCGGKSAEKEDAETITVYLWSTKLYEKYAPYIQEQL PDINVEFVVGNNDLDFYRFLNENGGLPDIITCCRFSLHDASPLKDSLMDLSTTNAAGAVY DTYLNSFRNQDGSVNWLPVCADAHGFVVNKDLFEKYDIPLPTDYESFVSACQAFDEVGIR GFTADYYYDYTCMETLQGLSASELSSGDGRKWRTIYSDPDNTKREGLDSTVWPEAFERME QFIQDTGLSQDDLDMNYDDVVEMYKSGKLAMYFGSSAGVKMFQDQGINTTFLPFFQQNGE KWLMTTPYFQIALNRDLTKDETRRQKAMKVLNTMLSEDAQNRIIYDGQDLLSYSQDVDFR LTEYLKDVKPVIEENHMYIRIASNDFFSISRDVVSKMISGEYNAEQAYQSFNSQLLEEKS TSEDIVLDSKKTYSNRFHTSGGNEAYSVMANTLRGIYGTDVLIATGNSFTGNVLKAGYTE KMARNMIMPNELSAYSSEMSGAELKETVRNFVEGYRGGFIPFNRGSLPVFSGISVEIKET DDGYTLSKVTKNGKQVQDKDTFTVTCLAAPQHMEAYPAEENIVFDGGDISVDDTWTTYVS DGNAILAEPEDYITLR >gi|226332898|gb|ACII01000121.1| GENE 18 23022 - 25070 1594 682 aa, chain - ## HITS:1 COG:PA4982_1 KEGG:ns NR:ns ## COG: PA4982_1 COG0642 # Protein_GI_number: 15600175 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 282 566 487 763 763 160 35.0 1e-38 MVKTVNYVKVQCSTYTHYNEASESKSLLRAIESARQMSTNIDMETENGGRLSQEFLKENL QTLWVDGILVLDAEGKTVCKYSMDEALTNEITDYLQKDIIMDFTGYEERSYSERIDREDG SRIDIAACARKDGPGMVAIYYYTSSEFVRNYTLTIQNLLKGYSTQKDGTIIVADKGTIVA SNDESLLGQDTAGNQVVQAMKEHTDSQHIFHLKNEGTGCYGIMLKQRDYYIYAYLPDTEV FRNLPLNVTAVVFLYLLIFGIFCFWGYRADLAHRKQEQEKDEKYKAELLRTAKKAEAANE AKTEFLQRMSHDIRTPINGICGMINVADYYADNMEKQTECRAKIKEASHLLLELINEVLD 
MSKLESDEVVLEEIPFNLNSISEEILGVIEHMAAEQNIRIIWEEKEVTHWNLIGSPVHVK RILMNILSNAVKYNKENGYVYIGCREIPSKQTAMTTLEFVCRDTGIGMAEAFQKRIFEPF AQEHAGSRTKFAGTGLGMPITKKLVEKMGGTISFESKEGTGTTFVIQIPFRIDTDMKDRT EAEEKTETSIHGLHVLLTEDNELNMEIAEFVLQNEGAVVTKAWNGQKAVDIFRKNRPGEF DVILMDIMMPVMNGYEAAKMIRSLDREDAKVIPIIAMTANAFTEDKMRAKEAGMDEHIAK PVDGKLLVKAINELVKHNQEKL >gi|226332898|gb|ACII01000121.1| GENE 19 25186 - 26001 833 271 aa, chain - ## HITS:1 COG:SPy1274 KEGG:ns NR:ns ## COG: SPy1274 COG0834 # Protein_GI_number: 15675229 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Streptococcus pyogenes M1 GAS # 13 166 8 168 278 72 31.0 9e-13 MKGKRVIAGILLVGILATALAGCKNADVTKKEREKPVITLGSDSYPPYNYLNEDGIPTGI DVELATEAFRRMGYQVDVVQINWEKKKELVESGEIDCIMGCFSMEGRLDDYRWAGPYIAS RQVVAVNESSDIYKLSDLEGKNLAVQSTTKPEGIFLNRTDERIPKLGNLISLGHRELIYT FLGKGYVDAVAAHEESIVQYMKDYDASFRILEDPLMVVGIGVAFAKEDDRGICEQMDQTL EEMRQDGTSLKIIKNYLDDPQKYLEVDDLGY >gi|226332898|gb|ACII01000121.1| GENE 20 26437 - 27780 851 447 aa, chain - ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 14 424 15 426 452 228 34.0 3e-59 MSKDEYLITDAPLKALTVFAMPMILGSFFQQVYNMADSIIVGQFVGSSALAAVGACAALT NVFICVALGAGVGAGVLVSRYFGAQNYSKMKTIVSTSLISFLILSILLGVFGFVFSHFMM SLLQTPADILDEAVLYLRVYFVGFPFLFMYNILSTMFNSIGESKIPLGLLIFSSVLNIVM DLWMVAGLGLGVFGAALATLIAQGISAVFSLLIFFNRMRRYKSRFNWFDMQELHSMLRIA VPSVLQQSTVSIGMMIVQAVVNPFGTQALAGYAATMRVENVFSLMFVSIGNAVSPYVSQN LGAKKIERIKKGYHAALVLDLCFAVLAFIVIETLHTPISSLFLGKDGTALAYQVSGDYMK WLGYFFIFMGIKMATDGVLRGLGIMRPFLIANMVNLAIRLSVALIFAPRYGIAFVWLAVP AGWLANFLISYGALRKSWPVDEGVSFR >gi|226332898|gb|ACII01000121.1| GENE 21 27998 - 28456 320 152 aa, chain + ## HITS:1 COG:BS_ywrC KEGG:ns NR:ns ## COG: BS_ywrC COG1522 # Protein_GI_number: 16080664 # Func_class: K Transcription 
# Function: Transcriptional regulators # Organism: Bacillus subtilis # 6 152 12 158 158 123 41.0 1e-28 MYLDGLDKLDQKIIRLLIENARISYSDIGEETGISRVAVKARIQALEKRGVIEEYTTIIN PQKISGAVSCYFEIETKPEYLAQVTDILYKNDTVTQIYRVTGRDRLHVHAVASSGDEMEY FLHNVIDTLPGIISCSCNMILSRIKDIKGLRL >gi|226332898|gb|ACII01000121.1| GENE 22 28945 - 29232 163 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580381|ref|ZP_04857647.1| ## NR: gi|253580381|ref|ZP_04857647.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 95 7 101 101 176 100.0 5e-43 MEKNFLSLIKTRRSVRAYKSEPVLSKALDVVLEAGTYAPTGGRHQSPTIIAITDSKYRRA LASDYPAQEAGQPAARKQNYIVRTICGEEGKHADQ >gi|226332898|gb|ACII01000121.1| GENE 23 29561 - 30559 286 332 aa, chain + ## HITS:1 COG:MA0334 KEGG:ns NR:ns ## COG: MA0334 COG0534 # Protein_GI_number: 20089232 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 3 323 122 439 466 152 31.0 1e-36 MKPYFGAFTNNEEIYQLSLAYMSLCSFMQIPNMVHIAIQKMIQATGNMVAPMWFQIAGVV VNFVFDPLLIFGIGVFPAMGIRGAAVATVAGYLLSMILAFALLLGKKQKVRIKIKEFHIQ KWMIARIFALGLPSFIMNALSSFMVIFVNLFLVAYSDTAIAFFGAYFKVQQLVVMTVNGL IQGCLPIMRFNYGAGNRDRLHSAFRYGTALVSGMMILGTLTVNFFPAQLLELFTASEAMR SFGISAMRIMAASYLFCGLSTMISTYFQATEKVGSSIAIQLCRQLLFLVPALWCLDKLFQ LNGIWLAFPVAETATMLIALVIMAWHRHKNIL >gi|226332898|gb|ACII01000121.1| GENE 24 30743 - 30934 171 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580383|ref|ZP_04857649.1| ## NR: gi|253580383|ref|ZP_04857649.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 63 25 87 87 116 100.0 4e-25 MATSSITHNFVISNPNSVKRFIAAIDEADRDRTPKQTLPGRQLTNPQEILALMSKRKKQT LEY >gi|226332898|gb|ACII01000121.1| GENE 25 30998 - 31186 76 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580384|ref|ZP_04857650.1| ## NR: gi|253580384|ref|ZP_04857650.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 62 53 114 114 117 98.0 2e-25 MFAVVIAFKHLLSGKDFLEFKRKLIKEIDRVNREVEHISETELLNKMGFPENWKNITRYH LK Prediction of potential genes in microbial genomes Time: Sat May 28 20:34:13 2011 Seq name: gi|226332897|gb|ACII01000122.1| Ruminococcus sp. 5_1_39B_FAA cont1.122, whole genome shotgun sequence Length of sequence - 2046 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 396 - 455 7.0 1 1 Tu 1 . + CDS 492 - 1430 283 ## Cphy_1803 transposase Predicted protein(s) >gi|226332897|gb|ACII01000122.1| GENE 1 492 - 1430 283 312 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1803 NR:ns ## KEGG: Cphy_1803 # Name: not_defined # Def: transposase # Organism: C.phytofermentans # Pathway: not_defined # 5 311 7 307 434 202 40.0 1e-50 MNNKGNQKHLTFEQRVDIEKGLTENKSFAEIGRTIGKDPSTISKEVRLHAHTKEHPDAGY TNPPCIHRKTCKIVCLCDEQCGIVCKLCRKPDMQCINICPGYETAECEKLKKPPYVCNGC VKKTNCLMPRKFYSSKYAHDEYRSVLVDCRVGINQTPESIQAMNDLLVPLIKEKHQSIGH IYATHAEELGCSRRTLYSYISNCVFEVRNSDLRRTVRYKKRRKPTQASAKDRFYRQGHNY EDFQNYIKEHPDINVVEMDCVEGKKGESKAILTFTFRNCNLMLMFLLEYQDQECVLEVFV WLETVLGQEAFK Prediction of potential genes in microbial genomes Time: Sat May 28 20:34:20 2011 Seq name: gi|226332896|gb|ACII01000123.1| Ruminococcus sp. 5_1_39B_FAA cont1.123, whole genome shotgun sequence Length of sequence - 11085 bp Number of predicted genes - 14, with homology - 12 Number of transcription units - 9, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 210 79 ## + Term 291 - 346 0.3 + Prom 277 - 336 2.0 2 2 Tu 1 . + CDS 357 - 1568 773 ## COG3547 Transposase and inactivated derivatives + Term 1591 - 1629 -0.6 - Term 1747 - 1818 14.5 3 3 Tu 1 . - CDS 1828 - 2262 448 ## EUBREC_0368 hypothetical protein - Prom 2450 - 2509 2.5 4 4 Tu 1 . 
+ CDS 2653 - 2823 216 ## gi|253580389|ref|ZP_04857655.1| predicted protein + Term 3027 - 3059 5.0 - Term 2888 - 2917 0.5 5 5 Op 1 . - CDS 3035 - 3226 199 ## gi|253580390|ref|ZP_04857656.1| predicted protein 6 5 Op 2 . - CDS 3190 - 3444 175 ## gi|253580391|ref|ZP_04857657.1| predicted protein - Prom 3508 - 3567 2.3 7 6 Op 1 . - CDS 3595 - 4098 324 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 8 6 Op 2 . - CDS 4088 - 5026 961 ## gi|253580393|ref|ZP_04857659.1| predicted protein - Prom 5151 - 5210 5.6 + Prom 5202 - 5261 10.6 9 7 Op 1 9/0.000 + CDS 5421 - 6635 517 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 10 7 Op 2 . + CDS 6651 - 7364 447 ## COG3279 Response regulator of the LytR/AlgR family + Term 7385 - 7443 14.1 11 8 Tu 1 . - CDS 7713 - 9050 752 ## COG3666 Transposase and inactivated derivatives - Term 9388 - 9424 4.2 12 9 Op 1 . - CDS 9553 - 10005 318 ## COG3476 Tryptophan-rich sensory protein (mitochondrial benzodiazepine receptor homolog) - Prom 10031 - 10090 4.5 13 9 Op 2 . - CDS 10093 - 10203 123 ## 14 9 Op 3 . 
- CDS 10235 - 10801 482 ## COG0671 Membrane-associated phospholipid phosphatase - Prom 10898 - 10957 7.2 Predicted protein(s) >gi|226332896|gb|ACII01000123.1| GENE 1 1 - 210 79 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no KLNENRFVIFIYYECRYAQFALSSTTIYLWLISLTASKELSILLLKLLKCMEQSVYRLIH LGLRVRYRD >gi|226332896|gb|ACII01000123.1| GENE 2 357 - 1568 773 403 aa, chain + ## HITS:1 COG:FN1676 KEGG:ns NR:ns ## COG: FN1676 COG3547 # Protein_GI_number: 19704997 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 3 399 1 384 391 107 23.0 3e-23 MNMNAVGIDVSKGKSMVAILRPYGEIISKPFEVKHTVSGIRSLIEQIQSIDGESRIVMEH TGRYYEPLVRELSKADLFVSAINPKLIKDFGDNSLRKVKSDKADAVKIARYTLDSWTELR QYSLMDEIRNQLKTMNRQFDFYMKHKTAMKNNLISILDQTYPGANTYFDSPAREDGSQKW VDFSATYWHVDCVRKLSLNAFIDHYQKWCKRRKYNFSRSKAVEIYEASKELVSILPKDDL TKLIIKQAVEQLNTASQTVEQLRTMMNEAASKLPEYPVVMAMKGVGKSLGPQLMAEIGDV SRFTHKGAITAFAGVDPGVNESGSYEQKSVPASKRGSSTLRKTLFQVMDVLIKTKPQDDP VYLFMDKKRAQGKPYYVYMTAGANKFLRIYYGRVKEYLSSLPE >gi|226332896|gb|ACII01000123.1| GENE 3 1828 - 2262 448 144 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0368 NR:ns ## KEGG: EUBREC_0368 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 8 144 17 152 152 91 35.0 8e-18 MDNSTKKLLEECSVGCKMGIESMEQVQHHVTDAKIAATIEKSCSKHKELEEEISKILLRA GQPEKEPGVIVSTFSWMTTGVKMMTGEDENKQAVKIIMNGCNMGTQSITEAMHQCKDASS ESISIAKKLIGMEENLRDDMQKYL >gi|226332896|gb|ACII01000123.1| GENE 4 2653 - 2823 216 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580389|ref|ZP_04857655.1| ## NR: gi|253580389|ref|ZP_04857655.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 56 1 56 56 74 100.0 2e-12 MSEIKQNITNELSEEELENIAGGFGHPTMDDPRYKEQMRINTMLKRGMKMDEIQKR >gi|226332896|gb|ACII01000123.1| GENE 5 3035 - 3226 199 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580390|ref|ZP_04857656.1| ## NR: gi|253580390|ref|ZP_04857656.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 63 1 63 63 112 100.0 7e-24 MYLVGDQQERYIIEEISRSLGKFLGFGLGIIYIMVLVPGTVVLAGTLTNRGSDKVTREKQ GQP >gi|226332896|gb|ACII01000123.1| GENE 6 3190 - 3444 175 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580391|ref|ZP_04857657.1| ## NR: gi|253580391|ref|ZP_04857657.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 84 1 84 84 150 100.0 3e-35 MVSVVERSSSGSFDEVIGTVGTDDFGLVLGIALGIVAVALLVNFIVTILIWNPLEVGCKK LFIQCEYGTAETRCIWSVINRRDI >gi|226332896|gb|ACII01000123.1| GENE 7 3595 - 4098 324 167 aa, chain - ## HITS:1 COG:CAC1509 KEGG:ns NR:ns ## COG: CAC1509 COG1595 # Protein_GI_number: 15894787 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Clostridium acetobutylicum # 6 166 19 180 184 77 32.0 8e-15 MKYEEFECIYQEYCLKILTFIHKRVPDLYEAEELTGDVFLSFYRNMDSYDEEKGSIATWL YAITANRLKNYYRDKKTHYSLEILKQQTIPREKMPEEIVAKIMREETLRKSLEQLSDRER EILLGRFYYQKSSTELGRQMNLSPGNVRMIQKRALEKLRMIMEKQGE >gi|226332896|gb|ACII01000123.1| GENE 8 4088 - 5026 961 312 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580393|ref|ZP_04857659.1| ## NR: gi|253580393|ref|ZP_04857659.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 312 1 312 312 510 100.0 1e-143 MKKRMIIVTALIMGTLGAGCVSVQASDTLPELTKDNALEQMMKAGETDSLVSNHKSYEAT CEYILQNTKVYDYEDSEVMYVDMYGTTQMLYKEGELKYGKEGDEYLSCLYIDEADGNADD LISFYTDETIEEEVTDMKEDGDMIIIQTKVPEELTQSALNSQGDGESYTEGDWISMEYVV ASDDYRPLKHTQILNHKDGTSEEASKMEFSYGVDRPKDVADLCEKYEEAEKAEGDVRNVT VTAAPGTEQEKDYAITVPADNHVWIMLHDDNSYYIMYTDKECKELYEPEEDVKEDTHLYA EIESYEEAEDEV >gi|226332896|gb|ACII01000123.1| GENE 9 5421 - 6635 517 404 aa, chain + ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 172 385 215 438 452 77 25.0 5e-14 MALSIFIVMVIPFSLIAQFNIGGDYKELIWNVIFYIALLLFGIFYCFSIQAKAAEKLFVF FVVMSYGFFVTSTVTFLHRTFRFPSDYFMYPPFALALTLIINLVLAKPFLMLMERIRTMI NADLESRIWKILCSLPALFILIASIAQFSSIINLSNNIVVHVMFVLFAVFAFMVYAVFFS VMGYIRSKQEEQRISERILENYRNQAENNEHILEIHHEIRHHMNALSSYLKQEDYAGARQ YTQKFTEEAEQLPFVTYTANALVNSILSEFAERASRYKAIVEYSIIISRRLNMEDIDLCR MLTNILENAVEGCQNVSEGQRIIRLNLHSKGNFLFIKCENSCNEDNLRITNGKYKSSKKN SDRHGYGLKIINGIAEKYNGILSVQVHDGFFAVTTTLCLDENNE >gi|226332896|gb|ACII01000123.1| GENE 10 6651 - 7364 447 237 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 232 1 229 234 104 29.0 1e-22 MYHIVICDDEPIFLTQISEMISDILASMGESCEIRKYTSIAELKNTLQNTPKNCDILILD IMLGENNGISFAECLRDTRNPIPVIFISSSKEFVFDAYSAEPVGYILKPVSRQKLAEALT RAIRHLIPKSIIIDTPSRTVSLHIRDITYIEIINKELQIHLQDGTVTKIYKSLSAVREIL PKDIFVQCHRCYIVSLHAVRSIRRFEITLNNHEIIPVSKYSYRDVQEQLQQYAAKWF >gi|226332896|gb|ACII01000123.1| GENE 11 7713 - 9050 752 445 aa, chain - ## HITS:1 COG:FN0028 KEGG:ns NR:ns ## COG: FN0028 COG3666 # Protein_GI_number: 19703380 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated 
derivatives # Organism: Fusobacterium nucleatum # 2 445 91 491 491 253 38.0 4e-67 MFLLEGSPAPDHATFARFRSIHFAPCSKRILAEMSNALFDMGEISGETIFIDGTKIEASA NKYTFVWKKAVTKNQAKLLQKLADFVAECELLYDIKIVYGNTIKIKHVKKLRKKLYALKE TENIVFVHGIGKRKTPLQKSIETLESYLNRLKKYNQQIHICAERNSYSKTDHDAAFMRMK EDAMGNGQLKPAYNLQHGVDSEYITWLTIGPQPTDTTTLIPFLKDAQEHLKFKYKNITAD AGYESEENYLFLEANGQLFYIKPANYEISKTRKYKNDIGKIENMEYDVKKDIYTCRNGKK LGVDHIRHSKSKTGYVSEKAIYKCEDCNGCPYKSECIKGNNCKTLLEERTKTLQVAKTFL KHRKEDLDRILSEEGILFRTNRSIQAEGSFGDLKQDMQFRRYLSKGNANVLAESTLLAMA RNINKLHNKIQNGRTGTHLFSIKSA >gi|226332896|gb|ACII01000123.1| GENE 12 9553 - 10005 318 150 aa, chain - ## HITS:1 COG:CAC0262 KEGG:ns NR:ns ## COG: CAC0262 COG3476 # Protein_GI_number: 15893554 # Func_class: T Signal transduction mechanisms # Function: Tryptophan-rich sensory protein (mitochondrial benzodiazepine receptor homolog) # Organism: Clostridium acetobutylicum # 4 146 14 161 170 103 44.0 1e-22 MRQKPLLISLLISLGTGVIAGFLTFKSMEQYQEMYRPPLSPPGWVFPVVWLILYALMGIA AYRIYMKDPKAEVLKLYLIQLAVNFLWPILFFNFGWQLFAVVWLLLLWYLVLVMIKEFAK IDEGAAKLMIPYLIWLTFASYLNIAIALHR >gi|226332896|gb|ACII01000123.1| GENE 13 10093 - 10203 123 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGKGAAIKTALTYIKEMNLVQRMSIFTRKTNKADYR >gi|226332896|gb|ACII01000123.1| GENE 14 10235 - 10801 482 188 aa, chain - ## HITS:1 COG:CAC2438 KEGG:ns NR:ns ## COG: CAC2438 COG0671 # Protein_GI_number: 15895703 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Clostridium acetobutylicum # 38 178 33 172 180 100 40.0 1e-21 MVGEFFMMTLQQIDMSILLWIQEHLRADALTPFWKVITFLGNGGWFWLVLAAGLLVCKKT CLTGIAALLSITVGFLLTNVLLKNIVARPRPFDAYTEIISLITKPTDFSFPSGHTCASFA CALILFRMLTKKFGIPAVILAGMVAFSRLYLGVHYPGDVLGGFLVAVFASTLVYHLMSAY QKKMQEQS Prediction of potential genes in microbial genomes Time: Sat May 28 20:35:00 2011 Seq name: gi|226332895|gb|ACII01000124.1| Ruminococcus sp. 
5_1_39B_FAA cont1.124, whole genome shotgun sequence Length of sequence - 9415 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 28 - 84 -0.7 1 1 Op 1 . - CDS 161 - 859 396 ## Cphy_0053 hypothetical protein 2 1 Op 2 . - CDS 870 - 2117 1189 ## EUBELI_01971 hypothetical protein - Prom 2137 - 2196 1.8 3 1 Op 3 . - CDS 2198 - 3553 882 ## Cphy_0051 peptidoglycan-binding LysM - Prom 3722 - 3781 4.7 + Prom 3610 - 3669 3.7 4 2 Tu 1 . + CDS 3831 - 4385 426 ## Cphy_0050 sporulation protein YyaC 5 3 Op 1 12/0.000 - CDS 4358 - 5296 1323 ## COG3958 Transketolase, C-terminal subunit 6 3 Op 2 . - CDS 5284 - 6123 1115 ## COG3959 Transketolase, N-terminal subunit 7 3 Op 3 . - CDS 6139 - 6792 1093 ## COG0176 Transaldolase - Prom 6842 - 6901 6.3 - Term 7104 - 7156 1.0 8 4 Op 1 . - CDS 7221 - 8318 855 ## COG1609 Transcriptional regulators 9 4 Op 2 23/0.000 - CDS 8362 - 8961 625 ## COG0353 Recombinational DNA repair protein (RecF pathway) 10 4 Op 3 . 
- CDS 8961 - 9317 184 ## PROTEIN SUPPORTED gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 Predicted protein(s) >gi|226332895|gb|ACII01000124.1| GENE 1 161 - 859 396 232 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0053 NR:ns ## KEGG: Cphy_0053 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 8 230 8 228 233 119 29.0 9e-26 MALTWAGVITILFLAAACVRGYRRGLIKELVSLVCVFLSMAIVWFINPYVNEFIRENTSI YEKVQESCREFVGEEYSTWTGSGESQTEFINEMNLPELLRNGLVQNNNSDSYQYLAVTTF SDYIAQYLARMAVNGISFLISLLMSTIMVRSITWMLNLVTRLPVLHGMNKVAGALLGAVK FLIVIWIIFLALTIVCNTKVGEAALQIIKKDCILSFIYDRDILIRIFMSIFY >gi|226332895|gb|ACII01000124.1| GENE 2 870 - 2117 1189 415 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01971 NR:ns ## KEGG: EUBELI_01971 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 22 378 26 384 422 213 33.0 2e-53 MANTFKKIIKKHKKKKLQAERNRKAKARERAEEEYADDYERRLSRHKRSVVKKTVITVVA IAAAVTAVGFYIEKRSYHTYKVVQTSEQEDIVSTNYVEMDGNILRYSPDGVSLVSDKMST LWSETYQMQNPVADVNGTRAVIADKDGTTLEIYDKSGKTGSVTTSYSIVKAKVSKSGLVA AILDGGDDTWIDFYGTDGSLIAENQTKIDDPGYPLDIAVSEDGVIMMVTYQFVDGSDTTS YVAFYNFGDVGQNEDDRIVSGYKYEGVVVPQIQYLDNNRSVALKDNGFTIYHGSQIPKEV KTVKVDKEIVSTFYDNDMIGLVFKNDSKDKQYTMEVYTTDGKLKFKENFNIPYTTIKLSG GNILMYNSSQMCVMNSRGVQKYLGSVDGTIKDFFKIGMNRYLLVLDSGVEIIKLS >gi|226332895|gb|ACII01000124.1| GENE 3 2198 - 3553 882 451 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0051 NR:ns ## KEGG: Cphy_0051 # Name: not_defined # Def: peptidoglycan-binding LysM # Organism: C.phytofermentans # Pathway: not_defined # 22 451 26 436 436 138 27.0 5e-31 MIEVIYKDESQEGQNGEEPFGLPRNIRQIGLAAEDYRIYMEDYVYTFLVRLARTEDSLGE AKTRVAVLTGNLKWRSQTAYLFIKGAIIAEEMEAAPDHIDFSENQWKQIQEAQKEYFEDQ EIVGWFFSQPQLLLKVSEVMSKVHMKHFGGEKVLMLMEPQEREDAFFRYENNEMVRLGGY YLYYEKNPGMQTYMIDKNEELQPEPQEKYEDQAVKDFRKIITDKKETRKEPAAPSVFSYG LTACLAIAVLTVGVNFYRSYQNVKQNEKESATVSSVIVEEITPSPVVGTSNNEAVRQHKQ YRTDTIQNDAGKNKKNNEENNQSTSEKENSEKKNNTSEKKISAEQTDTEQKSDKTEQLPD 
VKTEKADKTEQIYQQEADERKAQKRVREAVQKENSEAAGKAHESYVIQPGDTLFQISMDR YGSIEAISQICKLNGMSADEIIYPGQVIVLP >gi|226332895|gb|ACII01000124.1| GENE 4 3831 - 4385 426 184 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0050 NR:ns ## KEGG: Cphy_0050 # Name: not_defined # Def: sporulation protein YyaC # Organism: C.phytofermentans # Pathway: not_defined # 2 174 11 180 197 157 47.0 2e-37 MVFYVDSQKSSSAEEIAFLLNKCILQYPGRWSELVFLCIGSDRVTGDCLGPYIGHQLLEH LNTDTHGVYVYGTLKSPVHALNLSRISRQIKILHPEGLVIAVDASLGQKKHLGYVTIADG ALYPGAAVQKELPPVGDIHITGIVNIAGVLEQLTLQTTRLSTVISLADTITQGIVNYTNS LICL >gi|226332895|gb|ACII01000124.1| GENE 5 4358 - 5296 1323 312 aa, chain - ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 4 310 1 306 309 348 58.0 1e-95 MSEVKKIATRASYGAALVELGKKHENLVVLDADLAAATQTGVFKKEFPERHIDCGIAECN MMGIAAGLATTGKVPFASTFAMFAAGRAFEQVRNSIAYPKINVKIGATHGGISVGEDGAT HQCCEDFALMRVIPGMVVACPSDDIEAKAMVEAAYEHVGPVYMRFGRLAVPVINDRPDYK FELGKGIVLREGKDLTIIANGLCVAPALEAAEKLAADGVDAKVINIHTIKPLDEELVVAA AKETGKVVTVEEHSVIGGLGGAVCECLSEKAPVPVKRIGVNDVFGESGPATALLEKYGLD AEGIYKQIKEFV >gi|226332895|gb|ACII01000124.1| GENE 6 5284 - 6123 1115 279 aa, chain - ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 6 269 7 269 270 318 58.0 8e-87 MEKLELQKIANEVRKDIVTALHSAKAGHPGGSLSAADVFTYLYFEEMNIDPKDPKKADRD RFVLSKGHTAPGYYSALAERGFFPKEDLKTLRHLGSYLQGHPDMKHIPGVDMSSGSLGQG ISAAVGMALSAKLSNESYRVYTLLGDGEIQEGQVWEAAMFAGFRKLDNLVVIVDNNGLQI DGKVDEVCSPYPIDKKFEAFNFHVINVADGNDMDQLRAAFDEAKATKGMPTAIIMKTVKG KGVSFMENQVGWHGKAPNDEQYAQAMEELEKAGEALCQK >gi|226332895|gb|ACII01000124.1| GENE 7 6139 - 6792 1093 217 aa, chain - ## HITS:1 COG:lin2886 KEGG:ns NR:ns ## COG: lin2886 COG0176 # Protein_GI_number: 16801946 # Func_class: G 
Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Listeria innocua # 1 213 1 211 214 292 76.0 4e-79 MKFFIDTAKVEDIKAANDMGVICGVTTNPSLIAKEGRDFKEVIKEITSIVDGPISGEVKA TTTDAEGMIKEGREIAAIHPNMVVKIPMTVEGLKAVKVLHAEGIKTNVTLIFTANQALLA ARAGATYVSPFLGRLDDISTRGVDLIREIAEIFEVAGIETEIIAASVRNPIHVTDCALAG ADIATVPYNVIVQMTKHPLTDAGIEKFQKDYKAVFGE >gi|226332895|gb|ACII01000124.1| GENE 8 7221 - 8318 855 365 aa, chain - ## HITS:1 COG:RSc1014 KEGG:ns NR:ns ## COG: RSc1014 COG1609 # Protein_GI_number: 17545733 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 6 356 2 337 347 145 31.0 2e-34 MSKKKATSSDVAKRAGVSQATVSMVLNKKYNVSFSRETVEKVEQAARELGYHLPGRKNRK ESRKEKLIVVFCPTLTSPYYVLLLQGIEAVANKQGYGVFICNTQRDPRLEEKYLRMMGTI EPLGIIYCCNPNPDFQQQVEELAQTIPLVIISNKEKTTTIDAINQDNTVVGRMMARHLLD LGHRDVAFITPPLTRRQWQRTKRVNGFVKEFEKEGLKDHVIIKAADEAIDMQIPRMDSEY SMGYELTMELLDEGREFTAIAGQNDMMAIGALDALHEARIHVPKDVSVIGCDNIFYSGIR KISLTTIDHFVALKGRDACDIILRKIDEQDRFLTDSAPVSLYNIEYTPKLIARRTTGYVR TKKNN >gi|226332895|gb|ACII01000124.1| GENE 9 8362 - 8961 625 199 aa, chain - ## HITS:1 COG:CAC0127 KEGG:ns NR:ns ## COG: CAC0127 COG0353 # Protein_GI_number: 15893423 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Clostridium acetobutylicum # 1 199 1 198 198 239 56.0 3e-63 MEYYSSHINKLIEQLSHLPGIGAKSAQRLAFHIMNMPKDQVEQLTSSITGARENVQYCKC CCTLTDREICPICSNDKRDHSVIMVVENTRDLAAYEKTGKFDGVYHVLHGAISPMLGIGP DDIKLKELMQRLAKDEVKEVIIATNSSLEGETTAMYISKLIKPTGIKVSRIASGVPVGGD LEYIDEVTLLRALEGRVEL >gi|226332895|gb|ACII01000124.1| GENE 10 8961 - 9317 184 118 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 [Roseobacter sp. 
AzwK-3b] # 7 110 2 105 114 75 36 1e-13 MAKRGGFPGMGMPGNMNNLMKQAQKMQRQMEENQKALEEKEFTATAGGGAVEVTISGKRE VTKVKLQEEVVDPDDIEMLEDLIVAATNEALRKVEEESTAVMSKLTGGLGGLGGGLPF Prediction of potential genes in microbial genomes Time: Sat May 28 20:35:26 2011 Seq name: gi|226332894|gb|ACII01000125.1| Ruminococcus sp. 5_1_39B_FAA cont1.125, whole genome shotgun sequence Length of sequence - 31434 bp Number of predicted genes - 24, with homology - 23 Number of transcription units - 14, operones - 7 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 1620 1489 ## COG2812 DNA polymerase III, gamma/tau subunits - Prom 1724 - 1783 2.8 2 2 Tu 1 . - CDS 1790 - 2872 1495 ## COG0205 6-phosphofructokinase - Prom 2977 - 3036 5.1 - Term 3441 - 3500 8.9 3 3 Op 1 . - CDS 3582 - 4949 1304 ## COG0534 Na+-driven multidrug efflux pump - Prom 4969 - 5028 4.3 4 3 Op 2 . - CDS 5064 - 5582 496 ## COG0590 Cytosine/adenosine deaminases - Prom 5753 - 5812 5.1 - TRNA 5669 - 5745 79.2 # Pro CGG 0 0 + Prom 5759 - 5818 10.1 5 4 Tu 1 . + CDS 5878 - 6921 611 ## EUBELI_02025 hypothetical protein - Term 6807 - 6847 -0.9 6 5 Op 1 . - CDS 6892 - 7587 608 ## COG1180 Pyruvate-formate lyase-activating enzyme - Prom 7630 - 7689 5.5 - Term 7702 - 7735 1.4 7 5 Op 2 . - CDS 7759 - 9081 1561 ## Cphy_3607 peptidase M23B - Prom 9108 - 9167 9.3 - Term 9195 - 9239 11.3 8 6 Op 1 . - CDS 9275 - 9562 370 ## COG2088 Uncharacterized protein, involved in the regulation of septum location 9 6 Op 2 7/0.000 - CDS 9612 - 10730 1218 ## COG0448 ADP-glucose pyrophosphorylase 10 6 Op 3 . - CDS 10730 - 12004 1328 ## COG0448 ADP-glucose pyrophosphorylase - Prom 12085 - 12144 11.4 + Prom 12112 - 12171 12.4 11 7 Tu 1 . + CDS 12398 - 13777 1511 ## COG0773 UDP-N-acetylmuramate-alanine ligase + Term 13812 - 13861 13.6 12 8 Op 1 3/0.000 + CDS 14541 - 15623 729 ## COG3935 Putative primosome component and related proteins 13 8 Op 2 . + CDS 15692 - 16675 750 ## COG1484 DNA replication protein 14 8 Op 3 . 
+ CDS 16679 - 17866 1428 ## COG0462 Phosphoribosylpyrophosphate synthetase + Term 18095 - 18129 -0.9 15 9 Op 1 . - CDS 18029 - 19399 1073 ## Cphy_0103 hypothetical protein 16 9 Op 2 . - CDS 19365 - 20417 860 ## COG0420 DNA repair exonuclease 17 9 Op 3 . - CDS 20502 - 21893 1838 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases - Prom 21972 - 22031 3.8 - Term 21926 - 21974 9.1 18 10 Op 1 . - CDS 22079 - 24496 1814 ## Dhaf_0489 cell wall/surface repeat protein - Prom 24610 - 24669 4.7 - Term 24497 - 24529 -0.6 19 10 Op 2 . - CDS 24678 - 26168 1492 ## COG4468 Galactose-1-phosphate uridyltransferase - Prom 26251 - 26310 7.7 20 11 Op 1 . - CDS 26382 - 28835 2531 ## EUBREC_0675 hypothetical protein 21 11 Op 2 . - CDS 28832 - 29995 999 ## COG0500 SAM-dependent methyltransferases - Prom 30021 - 30080 1.9 22 12 Tu 1 . - CDS 30083 - 30451 244 ## gi|253580432|ref|ZP_04857697.1| conserved hypothetical protein + Prom 30236 - 30295 3.7 23 13 Tu 1 . + CDS 30415 - 30633 92 ## 24 14 Tu 1 . - CDS 30561 - 31259 961 ## COG0775 Nucleoside phosphorylase - Prom 31349 - 31408 8.3 Predicted protein(s) >gi|226332894|gb|ACII01000125.1| GENE 1 1 - 1620 1489 539 aa, chain - ## HITS:1 COG:CAC0125 KEGG:ns NR:ns ## COG: CAC0125 COG2812 # Protein_GI_number: 15893421 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Clostridium acetobutylicum # 1 487 1 498 542 382 45.0 1e-105 MSYTALYRKFRPDNFADVKGQDHIVTTLTNQIKHNRIGHAYLFCGTRGTGKTTVAKILAK AVNCEHPVNGSPCNECAMCKAIQAGTAMNVIEIDAASNNGVDNIREIREEVSYRPTEGKY KVYIIDEVHMLSTGAFNALLKTLEEPPSYVMFILATTEAHKIPITILSRCQRYDFHRITI DTIAARLDELLKVEGVEAEEKAVRYVAKAGDGSMRDALSLLDQCIAFYLGQELTYDKVLE VLGAVDTEVFSKLLRKVIRGDVTGSIHILEELIVGGRELSQFVGDFTWYMRNLLLVKTSE NPEEAIDVSSDNMKLLKEESTMLDVETLMRYIRIFSDLSNQIRYATQKRVLVEIALIKLC RPAMETNLDSVLDRLRVLEQRMDERPVQQVIVQQGSGKMPAETGAVQEPAGNKAPAKAAP EDLQKIVAGWRVITGQTTGMFKQMLQKSVPKYNSETGEPVLYVEFQDFLGQSYVDNPEAK KELQDIITAQTGKTVEIQMLVADKHQHTNLANITVDQAIKNNIHMDVVIEEDPDEEKGE 
>gi|226332894|gb|ACII01000125.1| GENE 2 1790 - 2872 1495 360 aa, chain - ## HITS:1 COG:Cgl1221 KEGG:ns NR:ns ## COG: Cgl1221 COG0205 # Protein_GI_number: 19552471 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Corynebacterium glutamicum # 4 355 5 346 346 261 41.0 1e-69 MIKRIGMLTSGGDCQALNATMRGVVKGLSNNLDELEVYGFDDGYKGLIYGKYRMLTSKDF SGILTQGGTILGTSRQPFKLMRVPDENGLDKVEAMKQTYYKLCLDCLVILGGNGTQKTAN LLREEGLNIIHLPKTIDNDIYGTDMTFGFQSAVNIATNAIDCIHTTASSHGRVFIVEIMG HKVGSLTLHAGVAGGADIILIPEIPYDIKKVTAAIQKRAKAGKRFTILAVAEGAISKEDA ELPKKKYKEKLEARAKKYPSVSYEIADQIYKEIGSEVRVTVPGHTQRGGEPCPYDRVLST RIGAGAAQAIMDGEYGIMIGVVNGKIKRVPLEECAGKLKMVSPKDQLVVAAKQIGISFGD >gi|226332894|gb|ACII01000125.1| GENE 3 3582 - 4949 1304 455 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 8 429 4 425 447 278 38.0 2e-74 MKKDTTNDMTVGSPVSLIIKFMIPMCLGNLFQQFYNIADSIVAGKFIGVNALAAIGSTGS LMFFVTGWLNGLSSGFAIIVAQMFGAKKFDRMRHYVAMSIYLMAAFSIVMTIGFSLANKP ILHLMNSPEEVFGDVTAYMGIIYAGLIITGAYNALAAFLRALGDSKSPLYFLIISAVINV ILDVVFIVAFGMGVEGCGYATVIAQGISAVCCLIYIVKRFPILHLERKDFEICWDSFGRL LKLGIPMGLQFSITAIGTIIVQSAVNIYGPVHMAGFSAAGKIQNIFATVFTAFGATIATY VGQNRGAGKMDRVKQGVRYTQMMVLGWSVFVMFLMFFFGKYLTYLFVDPSEQDVVNVSVT YFRTVFWAYPFLGSIFLYRNTLQGMGYGLVPMLGGVFELVARTGIVVLIAGHTSFAGVCM ADPTAWIAALIPLIPYYFHVMKKYKNKSQVQAVED >gi|226332894|gb|ACII01000125.1| GENE 4 5064 - 5582 496 172 aa, chain - ## HITS:1 COG:BH0033 KEGG:ns NR:ns ## COG: BH0033 COG0590 # Protein_GI_number: 15612596 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Bacillus halodurans # 3 154 4 155 159 178 59.0 4e-45 MTDQERFMKEAIRQAKKAEALEEVPIGCVIVHEGKIIARGYNRRNTDKNTLSHAELNAIR KASKKLGDWRLEGCTMYVTLEPCQMCSGALVQSRIDEVVIGCMNAKAGCAGSVMNLLQVD 
GFNHQVKIIQGVLEEECSSMLSEFFRKLREKKKQEKAALKAAQENPEGEPEQ >gi|226332894|gb|ACII01000125.1| GENE 5 5878 - 6921 611 347 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_02025 NR:ns ## KEGG: EUBELI_02025 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 4 340 3 374 374 94 24.0 5e-18 MAGYRRFIAYVYDYESGKKGSNCGFIKVEVKDQQCSVEIHLHCPGLPENVKCNIYGFTRK DGLINGILLDTCETEKETVECLIITDATDMNDSGVAMGKLGGMIITSDTGGFFGTEWDDQ PIRPENFHEIKAMPDADIPESAKITSQKPEVPEDMVLQEMPEELEDISEKPEDTSLTKVP DDSLSIEPSANIPAQGTSDNKPGTESSYISDNTVEDSNSSENARNPVPERNAENRKTSAE FSPFSDGELISAWKIHLDDLKHFPRHYCALRNNRFLQYGHYNFGYLLLGQRNNGQYILGV PGVYNQQERFMANMFGFPYFKESSYIEIPKMRGGYWYRLIDAPDSHR >gi|226332894|gb|ACII01000125.1| GENE 6 6892 - 7587 608 231 aa, chain - ## HITS:1 COG:MJ1227 KEGG:ns NR:ns ## COG: MJ1227 COG1180 # Protein_GI_number: 15669412 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Methanococcus jannaschii # 3 231 5 235 240 102 31.0 5e-22 MRICGFNKTTLLDYPGKVASTIFLGGCNFRCPFCQNGILVVAPGEQPDYSQEELLTFLKK RKGILDGVCISGGEPTLSDGLEEFLGKIKELGYAVKLDTNGSRPKIVKHLAEAGLIDKVA MDIKACPDNYGNLTGIEKPDMDSIFETADFLLHGNLDYEFRTTVVRELHTQKDFEEIAGW LAGAKEYYLQAYKDSDGVLRPGYGSYTFEELQNFQKILQKTILSVGIRGID >gi|226332894|gb|ACII01000125.1| GENE 7 7759 - 9081 1561 440 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3607 NR:ns ## KEGG: Cphy_3607 # Name: not_defined # Def: peptidase M23B # Organism: C.phytofermentans # Pathway: not_defined # 26 284 73 337 469 91 28.0 5e-17 MKKKVFSLLLAGMLTFSMGTTAYATEDAIADMQAQKQQAEAGLAQTQANIDSLQSKKQEL ENYLSDLNTQYEDLTNAISELSIQAGEKENELKQLHTELKKAKKALNKQYDDMKLRIQYM YENGGTSALETLLSSKDLSEFLNNAESVAKISQYDRTMLEKCENLQNTIKDQETTAEEEK AAIDQLLEERAAKQQEVQNLAASTSDNISSYVSQISASQEEAAALTAEINNADNSIAQLV QQAEEEKAAREAAQAQAEAEAAAAQEQDAEESSDDYEEDTDSYEEDSSDSQDSSETEGSE DTSEDSGDSDDEDTYEENDADFYEGDSDSNESEDTSDSQASSDSSSGQGKYLGNFTLTAY CNCAQCCGTAGNLTASGTVPTAGRTVAMAGVPFGTQLLINGNVYTVEDLGTPYGHVDIYC DSHSEALSFGLQSAEVYQLN 
>gi|226332894|gb|ACII01000125.1| GENE 8 9275 - 9562 370 95 aa, chain - ## HITS:1 COG:CAC3223 KEGG:ns NR:ns ## COG: CAC3223 COG2088 # Protein_GI_number: 15896470 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein, involved in the regulation of septum location # Organism: Clostridium acetobutylicum # 1 83 1 83 95 114 69.0 6e-26 MNITDVRVRKVAKEGKMKAVVSITIDDEFVVHDIKVIEGEKGLFIAMPSRKATDGEYRDI AHPINSATRDRIQTIILDKFQEVMDAEPEEAAAAE >gi|226332894|gb|ACII01000125.1| GENE 9 9612 - 10730 1218 372 aa, chain - ## HITS:1 COG:TM0239 KEGG:ns NR:ns ## COG: TM0239 COG0448 # Protein_GI_number: 15643011 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Thermotoga maritima # 1 362 1 363 370 307 40.0 2e-83 MRAIGIILAGGNNNRMRELSNKRAIAAMPIAGSYRSIDFALSNMASSHIQRVAVLTQYNA RSLNEHLSSSKWWDFGRKQGGLFVFTPTITKENSLWYQGTADAIYQNLEFLKSSHEPYVV IASGDCVYKMDYNKVLEYHIAKRADVTVVCTTCDNPSEIERFGVLRMNEDCRIEEFEEKP MVSSYNTISTGIYVIRRRQLIELIERAALEGRHDFVKDILIRYKNLKRIYGYKIDTYWSN ISTAEAYYKTNMDFLKPEIRNYFFKQEPTIKTKIDDLPPAKYNPGAQVKNSLVASGCIIN GTVENSVLFKDVFVGNNCVIKNSVILNNVYLGDNTHIENCIVESRDTIRANSYYCGDGEV KIVVEKNDRYIL >gi|226332894|gb|ACII01000125.1| GENE 10 10730 - 12004 1328 424 aa, chain - ## HITS:1 COG:TM0240 KEGG:ns NR:ns ## COG: TM0240 COG0448 # Protein_GI_number: 15643012 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Thermotoga maritima # 7 421 5 422 423 459 53.0 1e-129 MIKKEMIAMLLAGGQGSRLGVLTEKVAKPAVAFGGKYRIIDFPLSNCINSGIDTVGVLTQ YQPLRLNTHIGIGIPWDLDRNEGGVTVLPPYEKSTSSEWYTGTANAIFQNMDYMEQYNPD YVLILSGDHIYKMDYEVMLDFHKANKADVTIACMPVPIEEASRFGIMVTDDIGRITEFEE KPEHPSSNLASMGIYIFSWPALKEALMSLKDQNSCDFGKHVLPYCKEKGERLFAYEYNGY WKDVGTLGSYWEANMELIDIIPEFNLYEEFWKIYTKGDIIPPQYISADAVTDRCLIGEGA EIYGEIHNSVIGPNVVIGKGSVIRDSIIMRNSTIGEGVQMDKAIIAEDVTIGNNVVLGCG EEAPNVLKPAVYAFGIATVGERSVIPDNVRIGKNTAISGITVPEDYPDGELAGGQVITAK DGDE >gi|226332894|gb|ACII01000125.1| GENE 11 12398 - 13777 1511 459 aa, chain 
+ ## HITS:1 COG:CAC3225 KEGG:ns NR:ns ## COG: CAC3225 COG0773 # Protein_GI_number: 15896472 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Clostridium acetobutylicum # 12 459 13 458 458 461 52.0 1e-129 MYQIDFHKPLHIHFIGIGGISMSGLAEILLGEDFVISGSDSKSSPLTQALEKKGATIYYG QRATNITDDVDVVVYTAAIHPDNPEFACAKEKGLPMLTRAELLGQIMRNYDTPVAISGTH GKTTTTSMVSHILLAGDCDPTISVGGILPAIGGNIRVGNSETFVTEACEYTNSFLSFFPK ISIILNMDADHLDFFKDIDDIRHSFRRFAELLPADGTLIINADTPKYEDIIRDLPCNVIT YGLEHDADYQAADITYDKYGHASFSVLRNGVKVGSYYLKVPGIHNVSNALAAIALGHLLG LSEEVIIKGLGSFTGTDRRFQYKGEVAGVTIVDDYAHHPTEIEATLHAAHNYPHKKLWCV FQPHTYTRTKALLPEFAKALSLADHVVVADIYAARETDTLGISSEDLQKRIQELGTPCEY FPTFDEIENYLLSNCQEGDLLITMGAGDVVNIGEHLLGK >gi|226332894|gb|ACII01000125.1| GENE 12 14541 - 15623 729 360 aa, chain + ## HITS:1 COG:CAC3587 KEGG:ns NR:ns ## COG: CAC3587 COG3935 # Protein_GI_number: 15896821 # Func_class: L Replication, recombination and repair # Function: Putative primosome component and related proteins # Organism: Clostridium acetobutylicum # 9 358 8 321 328 151 29.0 2e-36 MAMITLQNSRNAEVTVLTNNFIDNYMPGANGEFVKVYIYLLRLLSDTSVPFSLEQMADHF FCTERDIIRALKYWEKEKLLTLTYRNNEDIADIILNVPPVKSAASDTPVSTAPVTKTETR TSSAPAQPVKQTTNSATALSPDRVKELKQNDEIVQLLYIAEQYLGKTLTPTEMKKILFFY DELKFSPDLIEYLIEYSVSRGHKSMRYIETVALAWADEGITTVTMAKEANSRYAKEYFTI FKSMGISGRNPVDTEISLMNTWLNDYGFTMDIIQEACSRTVLSTGQPSFQYADKILSGWK DKNVRTLADVRLLDAQHQRQKMEKNTQRKAASKPAASNRFNNFHQRQYDFNEYEKKLLNQ >gi|226332894|gb|ACII01000125.1| GENE 13 15692 - 16675 750 327 aa, chain + ## HITS:1 COG:CAC3588 KEGG:ns NR:ns ## COG: CAC3588 COG1484 # Protein_GI_number: 15896822 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 9 321 9 322 329 206 38.0 5e-53 MPLQNYQYDTIMREYSKTQSQNRRILEERTQEIYKKIPRIHEIDEEVATLSAKKARALLS GESSGLEDLKAAISLLSQERNALLVCNSYPEDYLELPYKCPVCQDTGYVGSQKCTCFKKA EIELLYTQSNLKEILKKENFDHFSFDYYSDTMKNEATGLTERETARRAYDIARGFVRNFD 
SSFENLFLYGDTGVGKTFLSHCIAHDLLESAHCVMYFSAFDLFELLADSKFSRDKTEGQE FVFDSDLLIIDDLGTELTNSFVSSQLFLCINERIMRRKSTIISTNLKLENFSDTYSERTF SRIASNYRMVKLEGKDIRIQKIFLGGK >gi|226332894|gb|ACII01000125.1| GENE 14 16679 - 17866 1428 395 aa, chain + ## HITS:1 COG:CAC0819 KEGG:ns NR:ns ## COG: CAC0819 COG0462 # Protein_GI_number: 15894106 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Clostridium acetobutylicum # 15 376 11 357 371 351 45.0 1e-96 MAQATTKDLTTLPVGRLGLIPLISCKDLGEKVNEWLIQWRKERSHEELDSFAFEGYQRES YLIPVQTARFGSGEAKCTIMESVRGDDIYLMVDVCNYSLTYSIGPYENLMSPDDHFQDLK RVIGAIGGKARRVNVIMPYLYESRQNIRSGRESLDCATALQELVSMGVENIITFDAHDAR VQNATPLHGFETIQPSYQFIKALLNHEKGLHIDNDHFMIISPDEGSMNRAIYLANILGVD MGMYYKRLDYSKRINGRHPIAAYEFLGPNLKGKDMILIDDMISSGDTVLKISSLLKERGA GRIYICSTFGLFTNGLEKFDEAHKAGVFDKLLTTNLIYQSPELLAKDYYISCDMSKYIAL IIDTLNHDQSVSYLLNPVDRINRCVSNYMAQYDEK >gi|226332894|gb|ACII01000125.1| GENE 15 18029 - 19399 1073 456 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0103 NR:ns ## KEGG: Cphy_0103 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 261 450 512 695 711 105 34.0 4e-21 MVFRLYWKQAGDRQNMIIRQLNIKNSGEIHDKTLEFSPGINVLYGENESERAIVHTYIKN MFLGDISEAIYDNAVFAAKLKSPTEQDLARELQKFMTGYQGTADSSMDLGRAMQMLKMSR KGYLVQADRKQKETEQERQKLLTRMDDIISDLNDLNGKLDQINEKEESLRMHPGDENGAV ILDERMAQAKAKRNSFAMGMLISAAAGIFGLILTAIIADSVSVSLIVAVIAAALVGFCGK QQLKYAREFQKRIRMKKRWVSRQEKLRCSKENLRQDYDEKETELCNLKEEYREYEENSFL LTSEEREVQALNMAMETIERMSGNIHLQVGRKLQIRTSQILSEITDGEYQDVQMDAASHM TVTTGNGAEALECLSRGTLELIYFAMRMAAGELLCQEESLPVILDDIFGMYEEEDLEAVL EWMYKEKKQVIISTCSKREMELLDREGIPYGKQIIS >gi|226332894|gb|ACII01000125.1| GENE 16 19365 - 20417 860 350 aa, chain - ## HITS:1 COG:BS_yhaO KEGG:ns NR:ns ## COG: BS_yhaO COG0420 # Protein_GI_number: 16078055 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Bacillus subtilis # 1 282 5 304 408 94 29.0 3e-19 
MRFIHLADVHLGAVPDRGCPWSSRREEEIWETFRRVIAGIRENPVDLLFIAGDLFHRQPL LRELKEVNNLFSTIPDTRVYLMAGNHDYIKADSFCRDFQWEKNVTFFKNEQLTCVKDEKL DIYVYGLSYEHQEIENPLYDSIHPAEGEGVHILLAHGGDAKHIPMNIKSIAASGFDYIAL GHIHKPQILIRDNAAYAGALEPVDRNDLGDHGYIEGHLENGRLKTNFVPFACRSYEQILL MLREDSTQASLEDMLKADLANKGRMNIYRIVIQGTRAPELLLLPERLKSFGYVTEVLDES KPSYDLEALQKKYSGTLIGDYISYFLEKDRNAVEEKALYYGLQALLETSR >gi|226332894|gb|ACII01000125.1| GENE 17 20502 - 21893 1838 463 aa, chain - ## HITS:1 COG:CAC3260 KEGG:ns NR:ns ## COG: CAC3260 COG0017 # Protein_GI_number: 15896505 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 1 463 1 463 463 667 67.0 0 MKLTTVKEIYKEREKYLDKEVTVGGWVRSVRDSKTFGFIVLHDGSFFETLQVVYHDTMEN FAEISKLNVGAAIIVKGTLVATPQAKQPFEIQAAEVTVEGNSAPDYPLQKKRHSLEYLRT ITHLRPRTNTFQAVFRVRSLCAYAIHRFFQEQGFVYVHTPLITGSDCEGAGEMFQVTTLD MNNVPTDDKGAVDYSQDFFGKETNLTVSGQLNCETFAQAFRNVYTFGPTFRAENSNTTRH AAEFWMIEPECAFADLNDNMDLAEAMLKYVIRYVLENAPEEMNFFNSFVDKGLLDRLNHV INSEFGHVTYTEAVELLEKNNDKFDYKVFWGCDLQTEHERYLTEEIFKKPVFVTDYPKEI KAFYMKMNEDNKTVAAMDCLVPGIGEIIGGSQREDDIEKLEKRMDELGLKKEDYDFYLDL RKYGSTRHSGFGLGFERCVMYLTGMGNIRDVIPFPRTVKNCDL >gi|226332894|gb|ACII01000125.1| GENE 18 22079 - 24496 1814 805 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_0489 NR:ns ## KEGG: Dhaf_0489 # Name: not_defined # Def: cell wall/surface repeat protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 214 587 514 881 1452 96 27.0 5e-18 MEVNMYRKRHKRGKRLASVFLTVALITATVPVQAAEFGTPDSFEEIFSSGEEDSCITDTE NTLPAASDSEKVMASELSPTPTVTSTPEGTPALTETPTPSVTPMPSSTPTPSVTPSPSPT PTINPEKEVMLRFIDGEGNECEQLRTVMSWGESLILPNVPDTGAPDMWKLEKNEKLGDAI TLKGGDILTLKKGESWNLFLEKGILNFYMPKKCTVSLYNNSGTSVFSNGILQAYETKNVI LPDMPSSKYINYGWTDTKGSSAVKYELNSEFTVTGDTDFYIVRRTALQVNFKTNTGASNS KFTRLNQKVGKGLTVTMPQVPVKTGYQSLGWSKSKKASKADYKAGQNVTISKTLTLYAVY KKLPYTVTFNNNNGTSTSKIYTSLTMYASKNQKVTLPDVPKVKGYTNLGWTTEKGETEPE YSAGDTVKITKATQFYAVRRKSNYYTVSYYLGNGSTNAAYQKLTQTVEEGTVVTFAKVPA 
RTGYVNQGWSSKKNSEKATAKAKCTVNKNITLYAVQKKAVQLTFHRCDGSTWQKTTLAKG SSYSLPGVRDAEGYTFMGWSTKPMQSVSPQYEAEEKITVNGNVDLYAVVFNRTTETDLTE DQLPQVDIYKYKQVIFVGDSRTEFMENVLTGMGESATKNVKFVCSAGKGLDWFTTTGWAQ LYSIVQHDSNSILSKKTAVIFNFGVNDLSKSADYAEYYNWIAPQLKSKGCELYFMSVNPV NRLMLPNAGRADRSEAAVRSFNQYMKANLSSAYTYIDMYSYLKSTGYSFASDHYGTGTVD DGLHYTTRTYKRIFAKCMDSLRVPA >gi|226332894|gb|ACII01000125.1| GENE 19 24678 - 26168 1492 496 aa, chain - ## HITS:1 COG:CAC2961 KEGG:ns NR:ns ## COG: CAC2961 COG4468 # Protein_GI_number: 15896214 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose-1-phosphate uridyltransferase # Organism: Clostridium acetobutylicum # 1 496 1 497 497 565 55.0 1e-161 MLYENIKKLVQYGVETGLTPACEKNYTINLLLDVFKEDEYVEPEEEYRDIDLEEVLNALL DEAVKRNLIEDSVVYRDLFDTRLMNCLMPRPAQVQNEFWSRYEKDPQEATDYFYKLSQDS DYIRRYRVKKDQKWTVDSEYGKIDITINLSKPEKDPKAIAAAKLVKSSSYPKCLLCPENE GYAGRVNHPARENHRIIPITVNDSPWGFQYSPYVYYNEHCIVFNSQHVPMKIEKNTFIKL FDFVKLFPHYFLGSNADLPIVGGSILSHDHFQGGHYTFAMAKAPIEKHVTIPGYEDVEAG IVKWPLSVLRIRHKDEKRLIELATHVLEAWRGYTDESAFIFAETDGEPHNTITPIARRSG DMFELDLTLRNNITTDEHPLGVYHPHAEYHHIKKENIGLIEVMGLAVLPARLKEELELLA EYILEGKDISSNEKIEKHAAWVAEFLPKYDNLDKDNIQDVLRKEVGNVFVHVLEDAGVYK CTAEGREAFMRFINKL >gi|226332894|gb|ACII01000125.1| GENE 20 26382 - 28835 2531 817 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0675 NR:ns ## KEGG: EUBREC_0675 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 812 1 840 842 393 30.0 1e-107 MKRILKKAGILLLVFLLGTAGTALLLNSESTDNRSDFNDAVFPEVMVDMNDTLINRMYGY AQPMQADFSRDSVTPLDTSKKLTFKVNPYDSEVKSFSYEIRTSDGSKVLENKKIKNLVKE DQYLSVDVEIGSDLRMNQEYSMQIALELDEGTAYYYTRVVSRSQVHASDYAAFVKYFYEA CLDKESADALGSYLEPQTTGAATNYSGININSSLSEISWGNLAPQLCQEGIPVIKEINET TASVVLEYQLTSQNEDEETELYDVKEFYRMKYQDTRIYLLDFQRSANQVFDGTLPVYEDD GIILGVRDKNVEYMMNDAATVIAFVQEGDLWSYSPGNEKVNQVFSFRKLKDGDFRDSRTQ HDIKIVRVTDEGDIDFVLYGYMNRGSHEGYEGIAVYHYNRDKNVAEERAFIPVSESFEFL KKDLEKLSYVNEKNELFLILAKNLYRINIEENSSEILEKGIKNANFVSSDNNDHAAWLVS EGDEKGNIKEIDFDTCKTRLIAPQKGQRLRTVGFMNEDLIYGMLNKEDILTDEEGHKSVG 
IRILRIEDFEGNVKKEYRKDGLYITDISVGNTLIEFELSAKSGETSYVAQKKDTIMNNKK AAANTVKIELISASRTGVRVKLVFNTTKQTDSPLTMYAKVSSSDRKDIVLDTQIPQEIAY YVYGQGGLDGIYIDPAKAVLRADTLGGVVLNRTQQYVWERGNKKTKMQIDTEEIPEIVLQ GTYDIKTLKKSLKKTGTVIDLSGCSLDSVLYEISAQRPVIAKTGADTSVVIVGYDEYNTW LYDPVKKETYPYGMNDSTDLFQKAGNVFITYIETVNY >gi|226332894|gb|ACII01000125.1| GENE 21 28832 - 29995 999 387 aa, chain - ## HITS:1 COG:FN0778 KEGG:ns NR:ns ## COG: FN0778 COG0500 # Protein_GI_number: 19704113 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 4 376 7 396 412 262 40.0 9e-70 MKNLREYLEEQINENLIQAVLSAGRNKDGISKIKIRPIRLKGQICYQASATEGQKVLHKN YGRTELIEYVEKELAENFRQFQAQGAVTDGVVLVSKKGKMTIKQKHHEQKEKVQIQAHNR VKQYILKEGVPVPFLIDLGVMNEQGKIIHARYDKFRQINRFLEFIEDILPRLSRDREITI LDFGCGKSYLTFAMYYYLRELKGYDVNIIGLDLKTDVIEKCNSLALRYGYEKLHFYHGDI ADYEGVSCVDMVVTLHACDTATDYALAKAVEWGAEVILSVPCCQHEVNKQIKNEMLEPVL RYGILKERMSALITDAVRADLLESKGYDTQILEFIDMEHTPKNLLIRAVRTGKRSDQGKV EKMLAALNIHPTLDRLLNEKEQEGSVR >gi|226332894|gb|ACII01000125.1| GENE 22 30083 - 30451 244 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580432|ref|ZP_04857697.1| ## NR: gi|253580432|ref|ZP_04857697.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 109 1 109 122 187 100.0 3e-46 MYLKEYFPFYTTYANPLLYEGERMQDKEFNLMKSYYPGTVQHIQEKIEEECDLMDYEGSR LYDEYPDKYMIYHLGCKIRESMELEISTQAIREDFLDELIQVMLCQEISRRRCRRYRCRR YF >gi|226332894|gb|ACII01000125.1| GENE 23 30415 - 30633 92 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MWYKTENILSDTSYDPAFFSIVYFSICKVLQTVPKYSSRAALFHLLLADYNSWIILTVNL TLCSIAAISNSA >gi|226332894|gb|ACII01000125.1| GENE 24 30561 - 31259 961 232 aa, chain - ## HITS:1 COG:CAC2117 KEGG:ns NR:ns ## COG: CAC2117 COG0775 # Protein_GI_number: 15895386 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Clostridium acetobutylicum # 4 232 3 230 230 202 46.0 6e-52 MKCIGIIGAMEQEVAKIKEKMQDVTITSRARMDFYEGTLEGKKVVVVRSGIGKVNAGMCT QILADVFGVEAVINTGIAGSLNNDVNIGDIVLSTDVLHHDMDAIGFGYKKGQIPQMDEFS FPADEKLRKLAAKVCKEVNPEISVFEGRICSGDQFISDKSVKDAIISEFGGFAVEMEGAA IGQAAYLNHIPFLVVRAISDKADGSAHMDYAEFEMAAIEHSVKLTVRMIQEL Prediction of potential genes in microbial genomes Time: Sat May 28 20:36:23 2011 Seq name: gi|226332893|gb|ACII01000126.1| Ruminococcus sp. 5_1_39B_FAA cont1.126, whole genome shotgun sequence Length of sequence - 30560 bp Number of predicted genes - 33, with homology - 33 Number of transcription units - 14, operones - 4 average op.length - 5.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 65 - 310 360 ## EUBELI_00054 hypothetical protein 2 2 Tu 1 . - CDS 460 - 2076 1479 ## COG1236 Predicted exonuclease of the beta-lactamase fold involved in RNA processing - Prom 2197 - 2256 8.8 - Term 2239 - 2291 10.1 3 3 Tu 1 . - CDS 2502 - 2942 448 ## gi|253580436|ref|ZP_04857701.1| predicted protein - Prom 3081 - 3140 5.3 + Prom 2934 - 2993 4.4 4 4 Tu 1 . + CDS 3111 - 4181 1269 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 5 5 Op 1 . - CDS 4182 - 4715 292 ## gi|253580438|ref|ZP_04857703.1| predicted protein 6 5 Op 2 . 
- CDS 4742 - 5155 505 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases - Prom 5283 - 5342 5.2 + Prom 5127 - 5186 5.3 7 6 Tu 1 . + CDS 5295 - 5978 339 ## COG0642 Signal transduction histidine kinase - Term 5868 - 5896 -0.0 8 7 Op 1 . - CDS 5986 - 6576 598 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) - Prom 6600 - 6659 2.0 9 7 Op 2 . - CDS 6661 - 8154 1203 ## COG1716 FOG: FHA domain 10 7 Op 3 . - CDS 8161 - 8622 331 ## gi|253580443|ref|ZP_04857708.1| conserved hypothetical protein 11 7 Op 4 . - CDS 8619 - 9074 239 ## gi|253580444|ref|ZP_04857709.1| conserved hypothetical protein 12 7 Op 5 . - CDS 9065 - 9904 471 ## Cphy_0037 hypothetical protein 13 7 Op 6 . - CDS 9929 - 11395 969 ## EUBREC_0046 hypothetical protein 14 7 Op 7 . - CDS 11426 - 11626 250 ## gi|253580447|ref|ZP_04857712.1| predicted protein 15 7 Op 8 . - CDS 11687 - 12721 944 ## Cphy_0034 hypothetical protein 16 7 Op 9 8/0.000 - CDS 12730 - 13503 576 ## COG4965 Flp pilus assembly protein TadB 17 7 Op 10 . - CDS 13500 - 14690 1123 ## COG4962 Flp pilus assembly protein, ATPase CpaF 18 7 Op 11 . - CDS 14726 - 15751 579 ## COG1192 ATPases involved in chromosome partitioning 19 7 Op 12 . - CDS 15748 - 16308 292 ## gi|253580452|ref|ZP_04857717.1| conserved hypothetical protein - Prom 16333 - 16392 7.7 20 8 Op 1 . - CDS 16609 - 18963 2533 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 21 8 Op 2 . - CDS 18929 - 19354 99 ## gi|253580454|ref|ZP_04857719.1| predicted protein + Prom 19120 - 19179 6.8 22 9 Tu 1 . + CDS 19232 - 20209 716 ## COG3049 Penicillin V acylase and related amidases + Prom 20242 - 20301 9.7 23 10 Tu 1 . 
+ CDS 20332 - 21843 659 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen
- Term 21880 - 21930 4.2
24 11 Op 1 42/0.000 - CDS 21955 - 22368 540 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit)
25 11 Op 2 42/0.000 - CDS 22381 - 23778 1739 ## COG0055 F0F1-type ATP synthase, beta subunit
26 11 Op 3 42/0.000 - CDS 23784 - 24674 763 ## COG0224 F0F1-type ATP synthase, gamma subunit
27 11 Op 4 . - CDS 24683 - 26191 1959 ## COG0056 F0F1-type ATP synthase, alpha subunit
28 11 Op 5 . - CDS 26199 - 26702 517 ## EUBELI_01483 F-type H+-transporting ATPase delta chain
29 11 Op 6 . - CDS 26683 - 27183 634 ## COG0711 F0F1-type ATP synthase, subunit b
30 11 Op 7 . - CDS 27224 - 27442 469 ## EUBREC_2902 hypothetical protein
- Prom 27473 - 27532 3.3
- Term 27465 - 27509 -0.4
31 12 Tu 1 . - CDS 27705 - 28367 579 ## COG0356 F0F1-type ATP synthase, subunit a
- Prom 28530 - 28589 8.0
+ Prom 28600 - 28659 8.4
32 13 Tu 1 . + CDS 28799 - 29497 203 ## COG2755 Lysophospholipase L1 and related esterases
33 14 Tu 1 .
+ CDS 29598 - 30374 540 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 30378 - 30435 0.4 Predicted protein(s) >gi|226332893|gb|ACII01000126.1| GENE 1 65 - 310 360 81 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00054 NR:ns ## KEGG: EUBELI_00054 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 77 1 76 82 98 63.0 8e-20 MTYKTKGVCASHIEFEVDDNKKVHNVRFIGGCSGNTQGVASLVEGMDAEEVISRLKGIKC GFKNTSCPDQLSTALEEVLNK >gi|226332893|gb|ACII01000126.1| GENE 2 460 - 2076 1479 538 aa, chain - ## HITS:1 COG:PA3614 KEGG:ns NR:ns ## COG: PA3614 COG1236 # Protein_GI_number: 15598810 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted exonuclease of the beta-lactamase fold involved in RNA processing # Organism: Pseudomonas aeruginosa # 3 459 4 462 467 359 40.0 1e-98 MKITFIGATHEVTGSCYYLEAAGHKFLVDCGMEQGPDYYENAEIPVALGEIEFVLLTHAH IDHSGNLPAIYAKGFRGPVYATDATSHLCDIMLRDSAHIQMFEAEWRNRKGRRQGKPEFV PAYTMEDAMGVIRNFVGCPYNKMITPAEGISARFIDAGHLLGSASIELTIREEDTEKKIV FSGDIGNTCQPLIKDPEYLHHADYIVMESTYGDRSHGEKPDYVKLLSEIIQETFDRGGNL VIPSFAVGRTQEMLYFIRQIKADGLVYGHDGFKVYVDSPLANEATTIFSEHQYDCFDEEA MELIKKGINPISFPGLKISVTSDDSKSINYDEEPKVIISASGMCDAGRIKHHLKHNLWNE KNTILFVGYQAVGTLGRALIEGAADVKLFGEPVHVAAHICQMPGISGHADVNGLLDWAKA FEEKPQKVFVTHGDDTVTEIFAKRLTDELGYDTMAPFSGTVYDLADNVCLYEAKGVKIQK ASASPKATRAARAFEKLLALGYRLITVIRKNEGTPNKDLERFSRDVQSLCDKWDRTDM >gi|226332893|gb|ACII01000126.1| GENE 3 2502 - 2942 448 146 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580436|ref|ZP_04857701.1| ## NR: gi|253580436|ref|ZP_04857701.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 146 1 146 146 232 100.0 4e-60 MKNKTDIQTKNQQIDKILQEQYDTETKILESALLNAAGKTSWDEFPEESEEKIRAGYDRL ITRLRETGDFNEAVRAEKTGSTGAAKEKLEILEYDWNRTHRILKSAAAAVAVVMLTALVG MTAIADCRYPDKNTQTYQKTSVFEKK >gi|226332893|gb|ACII01000126.1| GENE 4 3111 - 4181 1269 356 aa, chain + ## HITS:1 COG:CAC2231 KEGG:ns NR:ns ## COG: CAC2231 COG0707 # Protein_GI_number: 15895499 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Clostridium acetobutylicum # 4 350 6 352 359 411 57.0 1e-114 MKHIVLTGGGTAGHVTPNIALIPRLKELGYEISYIGSYDGIEKKLIEEMNIPYYGISSGK LRRYFDLKNFTDPFRVLKGFGEAKKLLKQLKPDVVFSKGGFVTVPVVIAAGRRKIPTIIH ESDMTPGLANKICIPSATKVCCNFPETVKSLPADKAVLTGTPIRQELLNGSKEAAREFCG FTDDKPVLMVIGGSLGAASVNENIRKILPELLKEFQVIHLCGKGKMDESLKDTKGYVQYE YIKQELADLFALSDIVISRAGANAICELNALKKPNLLIPLSANASRGDQILNARSFERQG FSMVLEEEEITESTLLNAIRELYQNRESYVHAMSESSHMNSIEKITGLIEDCVNKQ >gi|226332893|gb|ACII01000126.1| GENE 5 4182 - 4715 292 177 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580438|ref|ZP_04857703.1| ## NR: gi|253580438|ref|ZP_04857703.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 177 1 177 177 309 100.0 4e-83 MENIDFVSIYRKYRSRAFNIGKFILGDEDEAEDVCQDVFETLFHMGRSVDFSDEKKLENL IVRISYHKASDHRKKAFRKYEYANSDAILEMTDMRVRKGNRVDEIVLNLEAAGYLQSIFE KLREKNRTNYEIYVSVTLYDIPTRLVARHYHITENNVNNRVMRTRRWLAREYKKITR >gi|226332893|gb|ACII01000126.1| GENE 6 4742 - 5155 505 137 aa, chain - ## HITS:1 COG:BH1189 KEGG:ns NR:ns ## COG: BH1189 COG0537 # Protein_GI_number: 15613752 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Bacillus halodurans # 3 136 6 142 142 135 45.0 3e-32 MENCIFCKIANGEIPAATLYEDENFRVILDLGPASKGHALILPKSHAANIYELSDEMAAK AMILAKKMATAMTAALKCDGFNIVQNNGECAGQTVFHFHMHLIPRYKGDQVGITWHPGEL NDADKEEILLKVKEQLS >gi|226332893|gb|ACII01000126.1| GENE 7 5295 - 5978 339 227 aa, chain + ## HITS:1 COG:atoS_3 KEGG:ns NR:ns ## COG: atoS_3 COG0642 # Protein_GI_number: 16130156 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 16 227 62 273 278 115 30.0 7e-26 MQIEITKQSEKDFRFLLSRLSHEIRNPVTLINSELQLMASSHPELCSYRQWDSLMDNLEY VKELLKELSDFSHAQTVTLVPVNPAEFLTTVLSCQKNTLDYLGIRLEIHVEHTSSPVFLD RIKMRQVIFNLIKNSCEAIDQPGGKIIVNLTPADSGICICIKDNGCGMTHKQTENIFHPF ITYKPEGTGLGLAVTAEIISAHHGQIRVSSTPGQGSAFYIYLPASPE >gi|226332893|gb|ACII01000126.1| GENE 8 5986 - 6576 598 196 aa, chain - ## HITS:1 COG:RSp1525 KEGG:ns NR:ns ## COG: RSp1525 COG0705 # Protein_GI_number: 17549744 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Ralstonia solanacearum # 9 188 207 390 569 117 42.0 1e-26 MEELKKAPVTVLLILANILVFTAVEFTGGSEDTMHMLQCGAAYTPAIMQGEYYRIFTSMF LHFGPQHLGNNMLVLFVLGGRLERTVGKLKYLLIYLLGGMGGNLLCLFLELDSADFAVSA GASGAVFAVMGAMIYAVIRGRGHIEDLSARQVVIMAAFSLYFGFTSEGVDNAAHVGGLIC GFLLAVLLYHPSQRAG >gi|226332893|gb|ACII01000126.1| GENE 9 6661 - 8154 1203 497 aa, chain - ## 
HITS:1 COG:CAC0036 KEGG:ns NR:ns ## COG: CAC0036 COG1716 # Protein_GI_number: 15893334 # Func_class: T Signal transduction mechanisms # Function: FOG: FHA domain # Organism: Clostridium acetobutylicum # 353 495 340 465 468 72 31.0 2e-12 MRTEYKRDMNHNYLIVYGENEINTDSYQVRMLAGNVIPSLLKCRIQGMDGRFLIYFDITS KQAVNVLYEEKKMGVEDLRLIFGGFVKVMEDAAEYLINPGQFIMSPEYIYTDIEKRQIYF CMMPGYEKDIKEQFQLLTEYILPKIDHQDQDAVILGYGVYKRAMEDSFHLEHIKEELYKT QSSDVNGEKKEKINAKKTEQEMEFAEEETFPEDNENRNEFVREGEDSKEPGRLNPVGVIV PAAVLICGLAAAILKGYLPHVEMETILGVIVIAIAGIMLGLRIIKTKKVLHLPEQTYASG ETVRHIEKIRNMPSWNKTEKKTSGKRSGWKEAEGKVSEQKWEDRLYENLSQTVERNQQTN QSQPVNGQKMSDCGQKTSKSSRIHMDYGETVVLSAGTVSGPASLVSKEPGELATIYLNED LTVIGKLETACDAVISLPTVSRIHAKIRKKEENYFLSDMNSRNGTSVNGRLLRPDEEYQL EPEDEVDFAQARYIFLE >gi|226332893|gb|ACII01000126.1| GENE 10 8161 - 8622 331 153 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580443|ref|ZP_04857708.1| ## NR: gi|253580443|ref|ZP_04857708.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 153 1 153 153 308 100.0 9e-83 MKDNFCITNEYSIIRKGSFTIETACVMPLILLVLMGLIYLSFFVHNRAWLTAAAYESAVS GSMEGIKKNGEIYDTARMRSEELGSIGFFGAENLDTQTNVGKEVQVTYDLDTISSYGNLS WHLRTEGKSVVINPVKHIRRLRAAQAVLREMGE >gi|226332893|gb|ACII01000126.1| GENE 11 8619 - 9074 239 151 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580444|ref|ZP_04857709.1| ## NR: gi|253580444|ref|ZP_04857709.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 151 1 151 151 256 100.0 4e-67 MWLNFAGEMYMEMNGVKEICILGFLGINSWIDIRKKQVSLLLIIIFAVCGTVWTIYSRRN VPEILMCVGTGFLFVLISILTEGTMGMGDGWLLMALGTVLYPEEFFSTLFIGMICSAVWS GIMMMGFCRKGSTEIPFVPFLLAGYLGGFLI >gi|226332893|gb|ACII01000126.1| GENE 12 9065 - 9904 471 279 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0037 NR:ns ## KEGG: Cphy_0037 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 149 278 165 298 304 87 33.0 4e-16 MLLQKEKQSIEEIGKKDKKIYERRKVSLSNIYDLLFPQKSRKYFRTNAKEHIGEVSSVKV SFCTSFRGKIKKGSLAVETALVLPLFFLGMVTLISFMDIYKLQTEHLTALCTKAKQAGMY AYLPGGDSVDDITLPDIYTYKPFGGLIPLPDVVVYNHVKVHAWTGTEFPDNGGEQGETEP MVYVTASGSVYHKNPGCSYLNVSLKQIPGSSAKSAYNKYGEHYSACEICSRNQNPAGVVY VTEQGNRYHNLESCSGLKRSVRLVKASSVSEMGACSRCG >gi|226332893|gb|ACII01000126.1| GENE 13 9929 - 11395 969 488 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0046 NR:ns ## KEGG: EUBREC_0046 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 8 467 2 487 524 195 29.0 4e-48 MKYCGKERGSITLFLALILSLLLSLVCTSIESVRMASARTQILNSMDIGLYSVFGQYDRK LLEEYDLFALDGSMGGGQLNLAKICDNLESYMKPVLKQNSQKLELHQSGLTGYRLLTDEC GEVFYQQIVQYMQETLGSQGVQLLLNKMSDRERKTEEADLKAKQAETGGSIDRYDSEMDQ ASQKSRQAAQEAENRQQQEGQISTQPQADNPISIIKRIMKMGILELVLPPGREISTRTVS KDTMVSGRQLQQGMEMPDGVTADSSYTSGVLFQQYLMNHLGNYTDPSKESLAYQMEYVFG GRDNDIDNLKSVASKLLFIREGVNFACLMADNVKRTEAQALAAAIASGFLVPPAAVVIES ALLLCWAFAESVLDVRELFAGGKIPLVKTSADWQISLSNLSSLMEGLDSMRKNNEHGLSY EDYLQVLILPVSKEKKVMRAMDMIEDAIRKKGRANFYMDSCVVALEAFAEVKANRKKEFQ VIHQYCYE >gi|226332893|gb|ACII01000126.1| GENE 14 11426 - 11626 250 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580447|ref|ZP_04857712.1| ## NR: gi|253580447|ref|ZP_04857712.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 66 14 79 79 90 100.0 2e-17 MGQLIRQIKKKNGLIRRFAEEEDAVGVVEIILILVVLIGLVIIFKEQLTELVQSILSKIA KQSNSI >gi|226332893|gb|ACII01000126.1| GENE 15 11687 - 12721 944 344 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0034 NR:ns ## KEGG: Cphy_0034 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 141 339 265 482 485 149 39.0 2e-34 MWVHVVIFVMILCALTGWKKGWIPSSGIGGKDGIRLFAVVAIAGNLLGMILTLQNGGQVY PQGYRLEKEDTGSYEKEFMVSVDGEEPVSMYIQVPEKELEEQEEEPKQNKKLTRKEQAQK SIAEAVSRYNEQRSDPDYYYLPDKLNGKKVKWESPADTSGMLLTGLFLIAAMAILVMKGR EEQVQLQKRYEELLMDYPGLIMKFTLLVQAGMTVRKAFQKISLDYGRKRKRNPRPAYEEI RIVCYEMESGVSESEAYRRFGERCGQAKYKTFATLLIQNLQKGSRQMADMLERESTEAWE ERKRKARVLGEAAATKLLVPMIMMLIVVMAIVMIPAFMSFYGET >gi|226332893|gb|ACII01000126.1| GENE 16 12730 - 13503 576 257 aa, chain - ## HITS:1 COG:CC2941 KEGG:ns NR:ns ## COG: CC2941 COG4965 # Protein_GI_number: 16127171 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein TadB # Organism: Caulobacter vibrioides # 62 248 136 310 325 64 22.0 2e-10 MKGKEKEQLYQKRNYDHYRFSMKELLKYMGQALALCIMADYLFYKSKWVLLLMLPVPVFY LKWQKNRMIRERRKNLNYQFKDALTSLSVAVQAGYSVESAVKTCVRDLERLYGKGTDIVE EFRYIESQQHISVPLEELFLDLGERSQIEDIENFASVFYTAKRTGGDMNRVIQKVSRMLG DKIDVKKEIEATLAAKKSEQMIMSLMPVGIILYLQMTSPGFLSVLYGNPFGIAAMSICLV IYAAAYWLGRRIVDIEV >gi|226332893|gb|ACII01000126.1| GENE 17 13500 - 14690 1123 396 aa, chain - ## HITS:1 COG:PM0849 KEGG:ns NR:ns ## COG: PM0849 COG4962 # Protein_GI_number: 15602714 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Pasteurella multocida # 67 375 74 382 425 303 50.0 6e-82 MNKDVFRKLRQMLMEELDLSRELSDKEILDVIDELILNQIKDVRLALKEKVQLRQELFYS VRKLDVLQELVDDETVTEIMVNGPDTIFVERAGKLMKWHKSFTSAEKLEDVIQQIVGKCN RVINESMPIVDARLENGSRVNAVIYPVALNGPILTIRRFPEHPITMEKLIALGSITQECA EFLEKLVKARYSMVIGGGTGSGKTTFLAAMSEYIPRDERLITIEDNAELRIRGIDNLVRL 
EAKMANMEGAVSVTIRDLIKSALRMRPDRIIVGEVRGGEAMDMLQALNTGHEGSLSTAHA NSARDMLSRLETMVLMGVDLPLEAIRRQIASGVDILVHLGRMRDKSRKLLEVTEVCGFEN GEIRIRPLYQWQEGKGLVKTASLLHVEKLERAGIEL >gi|226332893|gb|ACII01000126.1| GENE 18 14726 - 15751 579 341 aa, chain - ## HITS:1 COG:CAC0037 KEGG:ns NR:ns ## COG: CAC0037 COG1192 # Protein_GI_number: 15893335 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Clostridium acetobutylicum # 15 282 15 287 361 97 27.0 2e-20 MKRNIFAVCDLEVDYALNFMDYMNRKKNIPFEIQAFTSVENLIAYGNQTRIELLLISGRA MCREVRDLDIGKIIILSEGVHPPELDQYPSVYKYQSSSDVLREVMAYYGAEKKTVADQIA VLKKTTEIIGIFSPLGRCLKTSFALTLGQILAKERAVLYLNMEEYSGFEELMGKGFAHNL SDLLYYVRQDNQNLLYKMNSMIQTINNLEYVPPVQMPADIRTTAWQDWERLFQMLIIDSS YEVIVLDIGCGIDENFQLLDMCKKIYMPVLSDAVSQCKIAQFENLVRIWDYPQILEKTEK INPPFHMATCLSPAYVEQLMWSELGDYVRNLLRKESQKKEN >gi|226332893|gb|ACII01000126.1| GENE 19 15748 - 16308 292 186 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580452|ref|ZP_04857717.1| ## NR: gi|253580452|ref|ZP_04857717.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 51 186 1 136 136 234 100.0 1e-60 MNSIETVAKNIRLGVMQYAFFHLFPVAVAGTAMIMDIRTAKVDNGWIIFSMSVGLFVCIW QKNITGIGFFIMGSVMPLFLMILFAFGMIGAGDIKLFCALGGIMGCESIIKCIFISFLTG AGISAALLIFNHNFCERILYFIEYVKCTVGTGKISSYRRKSIAAPENFHFTIPVFISVLL YAGGIY >gi|226332893|gb|ACII01000126.1| GENE 20 16609 - 18963 2533 784 aa, chain - ## HITS:1 COG:CAC1085 KEGG:ns NR:ns ## COG: CAC1085 COG1501 # Protein_GI_number: 15894370 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Clostridium acetobutylicum # 1 761 1 747 769 853 54.0 0 MKFTEGYWEKNERANASYAMQAFTAEAIPGGMRIVSPFRQITDRGGALDVGTITTEFISV RKDIISVRSYHYEGYVKGEPRYELNEDPQETEVEITEDEAVMKAGRMTVRVDLKDFKITY EADGKVLTNIGFRNLGYMQYDRATLTKFPEPNYMAAQYQPYMVTELSLAPGECVYGLGER FTAFVKNGQVVDIWNEDGGTASQISYKNIPFYVTSKHYGVFVDHSDHVSFEVASEKVENV GFSVKGEEIRYHIIYGDDIKGVIENYTDLTGKPALPPAWSFGLWLSTSFTTNYDEETTNS FIQGMADRDIPLSVFHFDCFWMKEFHWCDFEWDSRIFPDVPGMLKRYKDKGLKICVWINP YIAQGTNFFKEGLKNGYLVQRADGRGIKQIDNWQPGMGLVDFTNPDAVKWYQNKLKTLLD MGVDCFKTDFGERIPVDVKYYDGSDPVSMHNYYTYLYNKAVFDLLKEVKGEGEAILFARS ATAGGQQFPVHWGGDCSATYASMAESLRGGLSFTLSGFSFWSHDMGGFEMTAAPDVYKRW LQFGLLSTHSRLHGSKSYRVPWLFDEEAVDVCRKFTKLKLRLLPYLYSMAVKSHKTGIPS MRAMIMEFNDDPATKYLDMQYMLGDSILVAPIFNKEGHAEYYLPAGKWTHLLSGEVKEGG RWYEEDYDFSSLPVFVRENTLLPIGAVDTTVDYELEKDVQIQVYEVNETASCEVVTRKGE TAFTVKAVRQGNKLTLEASAENGGMTYLLRNIHEIANVTGGSIVKDTENGIIIVPEGCRQ EIEL >gi|226332893|gb|ACII01000126.1| GENE 21 18929 - 19354 99 141 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580454|ref|ZP_04857719.1| ## NR: gi|253580454|ref|ZP_04857719.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 67 141 1 75 75 144 98.0 2e-33 MTEMEWVVSWCDGDFITVGKFIIKGSAEIEIFCFISCCCAHIFLPPGAKIDILVERVYFE EYYHNLLFLCNIQILFFAQIKLGYIAGKKKNHDVYWYQIIIQQSVIPRGTERVSILENKQ IIRLKARRIRNEVHRRLLGEK >gi|226332893|gb|ACII01000126.1| GENE 22 19232 - 20209 716 325 aa, chain + ## HITS:1 COG:BS_yxeI KEGG:ns NR:ns ## COG: BS_yxeI COG3049 # Protein_GI_number: 16081005 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Bacillus subtilis # 1 306 1 309 328 188 34.0 2e-47 MCTAATYKTKDFYFGRTLDYEFSYGDEITITPRNYPFHFRHSDKAPDSHYAIIGMAHVAG DYPLYYDAINEKGLGMAGLNFVGNAVYAEPVSDKENVAQFEFIPWILSQCATLKEARQRL STMNLTNTQFSENFPCAQLHWILADASGCLVIESMQDGLHIYENPVGVLTNNPPFPQQMF QLNNYQSLSPRQPENTFAPGLELQSYSRGMGALGLPGDLSSASRFAKVAFTKMNSRSGDS ELESVSQFFHILGSVDQQRGCCEVAKGKYEITLYTSCCNTTKGIYYYTTYENHQINAVDM HRENLDTTVLIRYPVLAKQNICFQN >gi|226332893|gb|ACII01000126.1| GENE 23 20332 - 21843 659 503 aa, chain + ## HITS:1 COG:FN0191 KEGG:ns NR:ns ## COG: FN0191 COG2865 # Protein_GI_number: 19703536 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Fusobacterium nucleatum # 20 503 19 476 477 87 25.0 6e-17 MVHFFDITKFNLYKEDNRREVKKANGGLPSSLWETYSAFANCYGGVIILGVAENKDGTWR TTGLKSTDRDKLLKHFWDTINNRKKVNVNLLSDQDVEIYEKDEDTIIVIYVPMANREQKP VYINDDIFGGTFRRNHEGDYHCTKLQVKAMLRDQTDNTMDMDVLDDVPISDLNYETIQGY RNRHRALKPAHPFGRLNDSEYLRSIGAAAISNIDKCLHPTAAGMLMFGDEYNIVRHFPEY FLDYREILDPTIRWTDRLQSSSGEWSGNICDFYFRVYNKLVKDIKVPFKTIDGNRIDDTP VHEALREALANCLINADFYGVRGIVVRKEADRIVFENPGYSRTGKQQMKKGGISDPRNKV LMKMFNLINIGERAGSGVPNIFNTWEDQGWVEPVIEEQFDPDRTLLILSFDKKQAIKTSD KKQAIKTSDKKQANKSCHKKQNQKTLENIEKIREYLSKNNISKTSDIAEYIGLSLPRTRA ILKEIPDVSPIGNNSNRKWTLQK >gi|226332893|gb|ACII01000126.1| GENE 24 21955 - 22368 540 137 aa, chain - ## HITS:1 COG:CAC2864 KEGG:ns NR:ns ## COG: CAC2864 COG0355 # Protein_GI_number: 15896118 # Func_class: C Energy production and conversion # 
Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Clostridium acetobutylicum # 1 130 1 129 133 79 34.0 1e-15 MADTFGLEIYASNKLAFAGRAKTLTIPAVDGEQAFLAQHENIVAAIIPGEMRFEEADGTK HVLAVSSGFVEMINNRVKLFCLTAESPEEIDIRRAQEAKERAEEQMRQKQSIQEYHMNQL ALSRAMARLRVTSHKEI >gi|226332893|gb|ACII01000126.1| GENE 25 22381 - 23778 1739 465 aa, chain - ## HITS:1 COG:lin2673 KEGG:ns NR:ns ## COG: lin2673 COG0055 # Protein_GI_number: 16801734 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Listeria innocua # 1 465 1 472 473 638 70.0 0 MNKGKIVQVMGPVVDVVFEDGNLPEIKDALEVENNGKRCVMEVSHHLGNNMVRCIMLSAS EGLQRDREVIATGSGIKVPVGDKTLGRLFNVLGDTVDDGPSLEGEQKWVIHRDPPDFEHQ KPAVEILETGIKVIDLLAPYAKGGKIGLFGGAGVGKTVLIQELIQNIATEHGGYSIFTGV GERSREGNDLWSEMKESGVLEKTALVFGQMNEPPGSRMRVAETGLTMAEYFRDEEHRDVL LFIDNIFRFVQAGSEVSALLGRMPSAVGYQPTLATEMGELQERIASTKDGSVTSVQAVYV PADDLTDPAPATTFAHLDATTVLSRKIVEQGIYPAVDPLESTSRILEPDVVGEEHYEVAR KVQEILQKYKELQDIIAILGMEELSEEDKSTVFRARKIQKFLSQPFHVAENFTGIKGKYV PLKETIRGFKAIVDGEMDQYPENAFFNVGTIEDVIEKAKTEMGQE >gi|226332893|gb|ACII01000126.1| GENE 26 23784 - 24674 763 296 aa, chain - ## HITS:1 COG:SA1906 KEGG:ns NR:ns ## COG: SA1906 COG0224 # Protein_GI_number: 15927678 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Staphylococcus aureus N315 # 1 292 1 286 288 166 35.0 5e-41 MGSTKEIQTRMKSIQDTMKITNAMYMISSSKMQKAKRILSDTEPYYYNMQAAISRILRHM PDTEHPFFSIRHKIAEEDRKIGCIVVTGDKGMAGAYNHNIQKMTEQFMAEHEHCKLYVLG MVGRQYFTKKHMDIAENFPYTVQKPTMHRARLISEEIVRAFLDRELDEVRIFYTEMQSAV AMEPVNMQLLPLKKAEFLPNQAMLADMPQEEIVLSPSADVLLNTMIPNYLTGLIYGCLVE AYASENNARMMAMQSSTDSAKKMLRELSIEYNRARQAAITQEITEIVSGAKAQKRK >gi|226332893|gb|ACII01000126.1| GENE 27 24683 - 26191 1959 502 aa, chain - ## HITS:1 COG:CAC2867 KEGG:ns NR:ns ## COG: CAC2867 COG0056 # Protein_GI_number: 15896121 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: 
Clostridium acetobutylicum # 4 497 3 496 505 584 57.0 1e-166 MAAMNPDEIISILKEEIKNYDEISKDQEVGTVISVGDGIATVHGIDHAMYGEVVTFENGL KGMVQDIRANSIGCILFGKDTGIKEGTKVARTQKQAGVPVGDAFIGRIVNALGAPIDGKG EIKADGYRPVENPAPGIIDRKSVTVPLETGILSIDSMFPIGRGQRELIIGDRQTGKTSVA LDTILNQKGKDVICIYVAIGQKASTVAKIVSTLEKHGAMDYTTVFSSTASDCAPLQYIVP YAGTALAEYFMYKGKDVLMVYDDLSKHAVAYRALSLLLERSPGREAYPGDVFYLHSRLLE RSSRLNEKVGGGSITALPIIETQAGDLSAYIPTNVISITDGQIFLESDLFNSGMRPAVNV GLSVSRVGGAAQAKAMKKAAGSVRIDLAQYREMEIFTQFSSDLDEATTQQLKYGSGLMEL LKQPLSSPLSLHEQVITLCAATHKAMMHVETKKMKKYQRDMLDYFDSAYPEIGREIEEKR QLPDELAEKIVKVAEEFADKSR >gi|226332893|gb|ACII01000126.1| GENE 28 26199 - 26702 517 167 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01483 NR:ns ## KEGG: EUBELI_01483 # Name: not_defined # Def: F-type H+-transporting ATPase delta chain # Organism: E.eligens # Pathway: Oxidative phosphorylation [PATH:eel00190]; Metabolic pathways [PATH:eel01100] # 1 167 1 173 173 91 34.0 1e-17 MTETSINYAKALYELSVPEEAVLETEKIFRGTPQLKGALENPLVSLKEKEHVIDRVFPQE MKNFLKVTCKYQKISSIYDILEAYGNYSRKQKGILKAVLTYVTKPEEAQKEKMEDFLRRE FGAKEVILTLREDKSLIGGFILSAGDKEFDWSLRGRYNNLRQKLTRR >gi|226332893|gb|ACII01000126.1| GENE 29 26683 - 27183 634 166 aa, chain - ## HITS:1 COG:SA1909 KEGG:ns NR:ns ## COG: SA1909 COG0711 # Protein_GI_number: 15927681 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Staphylococcus aureus N315 # 8 159 21 172 173 80 31.0 1e-15 MIEININLVFTIINLIVLYLLMKKFLFGPILNVMEQRKNMIDQQFASAKDTEEQAYELKG KYEDALKSAKDESMRIVNQAKDEAKVQAERIVKDANTQAGAMLDKAKADIRTEQENAMKA MESRVAEIALDAASKIMGEKNSSQQDLSLYDQFIKEAGDSNDGNKH >gi|226332893|gb|ACII01000126.1| GENE 30 27224 - 27442 469 72 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2902 NR:ns ## KEGG: EUBREC_2902 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Oxidative phosphorylation [PATH:ere00190]; Metabolic pathways [PATH:ere01100] # 1 71 5 75 76 74 87.0 9e-13 MLVAIGAGIAVLTGIGAGIGIGMATSKAVEAIARQPEAESKISKSLLLGCALAEATAIYG FVIGLLIVIMLG 
>gi|226332893|gb|ACII01000126.1| GENE 31 27705 - 28367 579 220 aa, chain - ## HITS:1 COG:CAC2871 KEGG:ns NR:ns ## COG: CAC2871 COG0356 # Protein_GI_number: 15896125 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Clostridium acetobutylicum # 3 220 2 220 221 136 39.0 3e-32 MEELQCKTVFTIPVFGGIPVAESVAVTWVIMAVLLILSLVLVRNLSVENPGKKQLLLETG VSFLHNFFKDILGEEGKMYIPYLMTVVIYIGIANIFGVFGFVPPTKDLNCTIGLAITSIF LIEYAGFHKKGLKGFLKSFAEPAPIMLPINILEVVIRPTSLCMRLFGNVLGSYVVMKLLE FICPAVLPIPFSLYFDFFDGFIQAYVFVFLTSLFIKEAIE >gi|226332893|gb|ACII01000126.1| GENE 32 28799 - 29497 203 232 aa, chain + ## HITS:1 COG:lin0495 KEGG:ns NR:ns ## COG: lin0495 COG2755 # Protein_GI_number: 16799570 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Listeria innocua # 2 218 3 189 197 80 27.0 2e-15 MQIICLGDSITDCNHLFEDFPLGNGYVQILSEMFRNQTPSFSISANTVRRSSSAVQLTDK STGAIHFRNCGIDGFTVTRVLENIRQHRISLHHSPVVTLLIGINDIGLIMNTDRMDSQKE QMIREFATHYNELLDLLTADARQVILMEPFIFPHPEEYETWIPYVHTMSDIIRQLSVRFR LPFLPLHNYFNKEATQSGFDAITTDGIHLTLYGHKLLAEKLFPLLQSIDNNP >gi|226332893|gb|ACII01000126.1| GENE 33 29598 - 30374 540 258 aa, chain + ## HITS:1 COG:mll3460 KEGG:ns NR:ns ## COG: mll3460 COG0454 # Protein_GI_number: 13472990 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Mesorhizobium loti # 1 248 30 276 279 91 26.0 2e-18 METTIKQIEDMSLNAWPSHKMELYDGWILRFSYFYTHRTNSVEQFGNSTLTWREKIPYCE SVYKRLGTPAVFKISPLVSPDFDYVLENRGYAIQHTTNVMAMSMNAAHLDTPYPDVTFYN NIPSEWIESLFNLKNTINPIHRKVVPSMYQAILKETICACIRIDGQIIATGLGILDRDYI GIYAIHVKEEYRKHGYARQICTGLLKEGMKKGAQNAYLQVVEGNDNARALYRSLGFQQLY TYWFRVQPDENGNFPPEK Prediction of potential genes in microbial genomes Time: Sat May 28 20:37:38 2011 Seq name: gi|226332892|gb|ACII01000127.1| Ruminococcus sp. 
5_1_39B_FAA cont1.127, whole genome shotgun sequence
Length of sequence - 26148 bp
Number of predicted genes - 27, with homology - 27
Number of transcription units - 12, operones - 7 average op.length - 3.1
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 6 - 566 681 ## Amet_4324 hypothetical protein
- Prom 768 - 827 11.4
+ Prom 1037 - 1096 8.2
2 2 Tu 1 . + CDS 1295 - 2473 859 ## gi|253580470|ref|ZP_04857735.1| conserved hypothetical protein
+ Term 2526 - 2570 -0.9
- Term 2518 - 2553 2.4
3 3 Op 1 . - CDS 2657 - 3535 978 ## COG1228 Imidazolonepropionase and related amidohydrolases
4 3 Op 2 . - CDS 3674 - 4825 1345 ## COG2872 Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain
- Prom 4975 - 5034 7.4
+ Prom 5219 - 5278 6.2
5 4 Op 1 . + CDS 5299 - 5805 397 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
6 4 Op 2 . + CDS 5806 - 6300 325 ## gi|253580474|ref|ZP_04857739.1| predicted protein
7 4 Op 3 . + CDS 6305 - 7069 506 ## CLB_2242 hypothetical protein
+ Term 7077 - 7132 9.3
- Term 7065 - 7120 13.1
8 5 Op 1 . - CDS 7133 - 8101 942 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase
- Prom 8147 - 8206 4.4
- Term 8261 - 8298 -0.9
9 5 Op 2 . - CDS 8373 - 8597 241 ## gi|253580478|ref|ZP_04857743.1| conserved hypothetical protein
- Prom 8651 - 8710 7.0
- Term 8707 - 8765 12.8
10 6 Op 1 23/0.000 - CDS 8842 - 9534 962 ## COG1346 Putative effector of murein hydrolase
11 6 Op 2 . - CDS 9531 - 9899 473 ## COG1380 Putative effector of murein hydrolase LrgA
12 6 Op 3 . - CDS 9919 - 10428 515 ## gi|253580481|ref|ZP_04857746.1| predicted protein
- Prom 10485 - 10544 9.6
- Term 10474 - 10530 7.9
13 7 Op 1 . - CDS 10703 - 11134 246 ## Exig_3023 hypothetical protein
14 7 Op 2 . - CDS 11200 - 11658 512 ## COG0622 Predicted phosphoesterase
15 7 Op 3 . - CDS 11719 - 12954 1326 ## COG1379 Uncharacterized conserved protein
16 7 Op 4 .
- CDS 13030 - 13617 549 ## COG1451 Predicted metal-dependent hydrolase 17 7 Op 5 . - CDS 13592 - 14395 1113 ## EUBREC_3191 hypothetical protein 18 7 Op 6 . - CDS 14435 - 15418 793 ## EUBELI_20061 hypothetical protein - Prom 15534 - 15593 7.7 + Prom 15615 - 15674 7.3 19 8 Op 1 . + CDS 15746 - 16132 444 ## PTH_0034 hypothetical protein 20 8 Op 2 . + CDS 16129 - 16425 348 ## PTH_0033 hypothetical protein + Term 16449 - 16498 8.4 + Prom 16528 - 16587 7.3 21 9 Tu 1 . + CDS 16641 - 18230 780 ## EUBREC_0128 hypothetical protein + Prom 18490 - 18549 6.5 22 10 Op 1 . + CDS 18689 - 19519 362 ## COG3279 Response regulator of the LytR/AlgR family 23 10 Op 2 . + CDS 19520 - 20452 190 ## gi|253580492|ref|ZP_04857757.1| hypothetical protein RSAG_02718 24 10 Op 3 . + CDS 20436 - 21266 410 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 25 10 Op 4 . + CDS 21285 - 22172 647 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 22247 - 22306 5.7 26 11 Tu 1 . + CDS 22343 - 24610 2636 ## COG0058 Glucan phosphorylase + Term 24653 - 24707 14.5 - Term 24644 - 24695 3.3 27 12 Tu 1 . - CDS 24793 - 26004 897 ## gi|253580496|ref|ZP_04857761.1| predicted protein - Prom 26030 - 26089 6.9 Predicted protein(s) >gi|226332892|gb|ACII01000127.1| GENE 1 6 - 566 681 186 aa, chain - ## HITS:1 COG:no KEGG:Amet_4324 NR:ns ## KEGG: Amet_4324 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 8 163 19 176 187 64 33.0 2e-09 MQKKQYWNVKTLVFMALLIAMHLVLTRVLVIDLGAYRISVGSVCTILAGLWLGPVAGGVC GLCADIIGCFMKGYAVNPFITVAAILWGVLPALAKPLFANRKKTGKTVGICVSIVVTAVL SSLVLTTAGLVIMLGYNFYAIMPGRLVQFAIMIPIYCVLTCLLYFSPLTAMVVGNTSPAV LKNKTV >gi|226332892|gb|ACII01000127.1| GENE 2 1295 - 2473 859 392 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580470|ref|ZP_04857735.1| ## NR: gi|253580470|ref|ZP_04857735.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 392 1 392 392 669 100.0 0 MKKLLAACVCAGLSFSFAGCSPETILEQFTSTGDTSVRELTSSSDTTTETETKDRIYMDE LSGTLQDFSGSQLILNTDSASYVFNVSNATLECKGGMITGDEISIIYEGQLSGTDTGSVH VLKVVDEFHKKNKLKKRTAHGQVISLTSNTITIKSKKGKTATYPITGTKQYYQNGIKAGD WVYLTFKGKFPDTSNDSSVSLNASHLKVLSISDLKDLEIPDPTPTPDPKLTQTPEQIENK EKQLLATVQGVNLNILRILPAGSDTSFNLDMSAIPAYFKGGIAPGSHVNVTYIGDLETTS LEEVRIIAVTGEDPDTINDKHISFTVTGTIIGSTANTITVQTDDGAIDTFRTENAQDLTA GGMEYGDYVCVTFHPSKSKSSNIYTAIKVQDA >gi|226332892|gb|ACII01000127.1| GENE 3 2657 - 3535 978 292 aa, chain - ## HITS:1 COG:MTH1534 KEGG:ns NR:ns ## COG: MTH1534 COG1228 # Protein_GI_number: 15679530 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Methanothermobacter thermautotrophicus # 2 251 74 340 424 73 29.0 5e-13 MFGECHAHIIMDGVNYRHAIDMHKNGLDDKIIREHLKAYQERGIAFVRDGGDALGVSARA KELAPEYGIDYRTPIFAIHKEGHYGSIVGKGFATMVEFHKRVLEAKQAGADFIKIMTTGI LDFNEHGAITGTPLDGSEVKEMVHIAHEEGMAVMSHTNGDYGVQAAVAAGVDSLEHGNYM NEESLAMLAESDTVWVPTLVTVRNLLGDGRYDDETLKPIIESAEENIRKAFRLGIKTAPG SDAGAYRVLHGKGIRDEVQSFAEILGDQDAAYRWLAEGEAEIKKKFTVTVHW >gi|226332892|gb|ACII01000127.1| GENE 4 3674 - 4825 1345 383 aa, chain - ## HITS:1 COG:CAC0906 KEGG:ns NR:ns ## COG: CAC0906 COG2872 # Protein_GI_number: 15894193 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain # Organism: Clostridium acetobutylicum # 4 375 2 375 387 170 30.0 4e-42 MTETRRLYYEDVYKKEFTATVVECREQKKGYAVILDESAFYPEGGGQPSDVGTLGDAKVT EVHEKDGELLHYTDKALEVGAKVEGKIDWVRRFDLMQQHSGEHMVSGIIHEKYGYDNVGF HMGSDVITVDLNGMLDDSQLAEIEREVNERVWEDREVVITYPDAEELKTIDYRSKKELTG QVRIVTFPGVDVCACCGTHVTHTGEIGMVKLLSVVKFHDGIRMEMICGKRVLDYLNMVNE QNHQISMKLSAKMDRTADAVQRLQDENFRMKGQVARMEEEMFRAEAKKWEGAGSVLIFKE GLEADSVRKLADAVMNTCEGCCAVFSRNEDGSYKYAMGEIDGDLRQYTKEMNAALNGRGG GKPFFVQGSVQATEDEIRNFFEK >gi|226332892|gb|ACII01000127.1| GENE 5 5299 - 5805 397 168 aa, chain + ## HITS:1 COG:lin0443 KEGG:ns 
NR:ns ## COG: lin0443 COG1595 # Protein_GI_number: 16799520 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Listeria innocua # 5 166 18 180 182 87 30.0 8e-18 MTTEKQYLEYLIKQYQNLIYSICLKSVGNPFDAEDLTQEVFLSAYKNLSRFDGNYEKAWL SKIAVRKCLDYLKAAGRRSIPTEDTYFSQIPDRQSSPEDEYMKTASNTHVEALCMQLKSP YKEVAYAHFCKELSVPEIAQQTGKNPKTIQTQLYRARAMLKKILERGD >gi|226332892|gb|ACII01000127.1| GENE 6 5806 - 6300 325 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580474|ref|ZP_04857739.1| ## NR: gi|253580474|ref|ZP_04857739.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 164 1 164 164 311 100.0 6e-84 MHINKKDLIEFDNGNMNTQEVIEFLSHLDHCDFCLDQMMEQENNSTDTAPSYLMEQILDK AGSGEVQLARAANKASRKIQLLHYSLQTAAGVMVALLLLFSIPNLDFTSLYQDTSSQTDR ILPEHENGRLYQFTRELGQNIGDGTSSFIKYVSDFSSKLMNGGK >gi|226332892|gb|ACII01000127.1| GENE 7 6305 - 7069 506 254 aa, chain + ## HITS:1 COG:no KEGG:CLB_2242 NR:ns ## KEGG: CLB_2242 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 1 148 1 148 174 104 35.0 3e-21 MQKQKRGFWLFIFSLIPGAGEMYMGFKKQGISIMFLFWGVFAIGACTGMDWLVFLIPIIW FYSFFNVHNLKSLSEEEFYSIEDSYVLHMDELAGDISSLLKHHRKITAILLIFLGASILW NTLVDFLYMILPGYLADMIGRFTYQLPQLVIAAAIIFAGIYILTRKKDALDEEQPDSFTE EEHYWTPYRPYQQNVPNMETDAVKNTTDTVNPDCCTDIDVTDTANNKSPVSLKKDSAAEV SQEDASSQTTNSES >gi|226332892|gb|ACII01000127.1| GENE 8 7133 - 8101 942 322 aa, chain - ## HITS:1 COG:SP1900_2 KEGG:ns NR:ns ## COG: SP1900_2 COG0340 # Protein_GI_number: 15901727 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Streptococcus pneumoniae TIGR4 # 80 321 16 249 252 123 34.0 4e-28 MTVKSSLLEMLEKNKGEVLSGESIAGELGCTRAAVWKAVKSLREEGYHIEAGPNKGYMLA KDTNRLSQEGIRLFLDDPKVKIDIYDELESTNQTAKKEAMMGEAGHGAFVIARSQTAGRG RRGREFYSPADTGLYMSVILKPQGTIHDSLLITTAAAVAVYRAVAQLCGIQLDIKWVNDL FYKGKKVCGILTEAVTDFESGNIEFVVVGMGLNLYLDQENLPQKLRSIAGALYETKEDAE 
QTDRNKLTAMIVNELRKETADLKLSPDYVTHNMIPGHQITITDGNSSRQAFALEICRDGR LKVREEDGQETILSFGEVSVSM >gi|226332892|gb|ACII01000127.1| GENE 9 8373 - 8597 241 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580478|ref|ZP_04857743.1| ## NR: gi|253580478|ref|ZP_04857743.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 74 6 79 79 112 100.0 5e-24 MANAAGTAQKMRRRKENHKFSRMMAVILGLIVTVEVMIAGGMIFNIDFHSADIIAVIMGF IVLYSGVVAVLTDN >gi|226332892|gb|ACII01000127.1| GENE 10 8842 - 9534 962 230 aa, chain - ## HITS:1 COG:MA3262 KEGG:ns NR:ns ## COG: MA3262 COG1346 # Protein_GI_number: 20092078 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Methanosarcina acetivorans str.C2A # 7 220 11 224 238 172 46.0 5e-43 MTQFMSNSLFFGAAISLISYEAGLLLKRKFKMAIFNPLLIAIIAVIAVLCLLHIDYDTYN QSGQYISYLLTPATVCLAVPLYQQMELLKKNLKAVIIGIVSGVLASLVSVLILAKLFSLS HEQYVTLLPKSITTAIGMGVSEELGGIVTITVAVIIITGVLGNMIAETVIKLAHIEEPIA KGLALGTSAHAIGTAKAMELGEIEGAMSSLAIAVAGLLTVVGASVFAQFM >gi|226332892|gb|ACII01000127.1| GENE 11 9531 - 9899 473 122 aa, chain - ## HITS:1 COG:MA3263 KEGG:ns NR:ns ## COG: MA3263 COG1380 # Protein_GI_number: 20092079 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Methanosarcina acetivorans str.C2A # 4 120 2 118 165 72 34.0 2e-13 MKYVRQFWIILLISAMGEALHVLIPLPVPASVYGLVIMLIALGTHIIRLEQVKEAAEFLI EIMPVMFIPAGVGLLTAWGILKPVCVPIILITVITTVVVMIVTGRVTQAVIRMDRKKGQK RS >gi|226332892|gb|ACII01000127.1| GENE 12 9919 - 10428 515 169 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580481|ref|ZP_04857746.1| ## NR: gi|253580481|ref|ZP_04857746.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 169 1 169 169 322 100.0 7e-87 MNRDWLDQYKGVEYLKNKKERDWKFIIGMIVLLAIFAVMGGAFWNMVGKQAQMVREEEQK EEANAISAIYIETGEFLKTGVFVDLNNGTIFSADIPAEGIYNKKGKLISDDVLENGDKVK IYGDGIMLESFPGQYPGVTKIQRTGRASLEETQEYEDQVTGMMKPVAVQ >gi|226332892|gb|ACII01000127.1| GENE 13 10703 - 11134 246 143 aa, chain - ## HITS:1 COG:no KEGG:Exig_3023 NR:ns ## KEGG: Exig_3023 # Name: not_defined # Def: hypothetical protein # Organism: E.sibiricum # Pathway: not_defined # 4 143 3 143 147 82 31.0 4e-15 MHKWNEITDENSLKEFMERVSFFHDSCIKEMHYLSGAYVNENLDMYPVNDRRILRVIIQR QYEEDSMIEMEFQGLKYLKLFPADERYSCEILDSNIILKEDCIIWSDCEDKTELEDGDTG TLVCASKLRWRSIFGYMGEKNYW >gi|226332892|gb|ACII01000127.1| GENE 14 11200 - 11658 512 152 aa, chain - ## HITS:1 COG:PA0351 KEGG:ns NR:ns ## COG: PA0351 COG0622 # Protein_GI_number: 15595548 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Pseudomonas aeruginosa # 3 148 9 157 157 132 44.0 2e-31 MKKIGIISDTHGLLRPEILEILKGCDCIIHAGDVNKPEILDTLRMMGSIYVVRGNNDKDW AEGMAKTLHFTIEGVKFFMTHNKKDVDWDLKDTQVVIFGHTHKYFEKMIDNRLWLNPGSC GPRRFDQEITMAVMTVDNGTYQWEKVAMIPEK >gi|226332892|gb|ACII01000127.1| GENE 15 11719 - 12954 1326 411 aa, chain - ## HITS:1 COG:MA0641 KEGG:ns NR:ns ## COG: MA0641 COG1379 # Protein_GI_number: 20089528 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 407 1 384 467 207 35.0 2e-53 MKMIADLHIHSRFSMATSKEGTPENLDFWARKKGISLIGTGDFTHPVWREELKERLVSEG NGLYRLRDAYVKEESRKFPGEGTRFVVSGEISSIYKKNGKTRKVHNVILLPSLEAADAMA QRLEKIGNIHSDGRPILGLDSHDLLEMMLDVCPEGILIPAHIWTPHFSVLGAKSGFDSVE ECFEELAPYIHALETGLSSDPAMNWRISKLDRYQLVSNSDAHSPSKLGREANLLDIDCSY EGLYRAIQTGEGLEGTVEFFPEEGKYHFDGHRKCGVSLSPVEAERLGGICPVCGKKLTMG VDHRVEQLADRAEGFVKKDGKKYESLVPLPEVISTCMGYSTASKKVQGCFEQMIQTLGTE FDILRNVPSEDIKSCAGERIAEGIENVRTGNVKRIPGYDGEYGKIELFEEN >gi|226332892|gb|ACII01000127.1| GENE 16 13030 - 13617 549 195 aa, chain - ## HITS:1 COG:VNG0110C KEGG:ns NR:ns ## COG: VNG0110C COG1451 # Protein_GI_number: 
15789432 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Halobacterium sp. NRC-1 # 81 191 117 230 241 105 46.0 5e-23 MREFETKNNQEKFTCGEFEYQVIRSARKTMTLEVRRDGNVIVRAPLRTGLPRIKRFVNQK QEWVLGCLERTKEYREQKPLSADLSESKRNVYIRKAKETITKRVSYFARLMGVSYRNITI REQKTRWGSCSSEKNLNFNWKLILAPPEVLDYVVVHELCHLKEMNHSKAFWDEVGKVMPE YETYKLWLKENGWKL >gi|226332892|gb|ACII01000127.1| GENE 17 13592 - 14395 1113 267 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3191 NR:ns ## KEGG: EUBREC_3191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 10 234 4 231 261 131 34.0 2e-29 MDKLLDLTRDFDAAKAAFEAYLDEYDRADDKIHLKIVHTYGVVDAAEDIARRMGLGEEDV QLAKVIGLLHDIGRFEQIKRFDSFEPGTMEHAAYGAQILFGPEKMIRRFVKDDRFDSLIC TAIEKHSDFKLEGIADERTLLHAKLIRDADKLDNCRVKLEEPMETLLGVDEKGVGEGVIA PKVWESCMAKESVLSSDRVSKVDYWISYIAQYYDINFPETYEIMREHDYVKRIMDRVPYA LPKTQEKMDILAAEMEEYMDERIRNKK >gi|226332892|gb|ACII01000127.1| GENE 18 14435 - 15418 793 327 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20061 NR:ns ## KEGG: EUBELI_20061 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 38 324 35 312 316 203 40.0 6e-51 MEDIMEDNFEKLNLLLEQEQCEFVDIPDGFTGQTESGELRLIYLMNDAVESFLVLKNARM TGNYVRDYEGEFEGSVEKADWDLCEAEYILVIHQGQNVFTVFFEDILLETQLYNYGELGH FWVKGYENLRVMEYQIAILRDKYEYLGEKYCTEYEGKLAMLRDFPPLNYLFYPAVPEKYI VPVDNPWEVTAEALAVMQELATEAGDEKLGKMLRRYEKNPDISNAKKIAGMLCRSSHLPV ITLLGEKIREAASVYPDRDFGRKQNKYLHELMEKAERRKEELEAENVQTLIYREEPFIYD CDSISFQVYLMIVRKGVWKQKIMVEKI >gi|226332892|gb|ACII01000127.1| GENE 19 15746 - 16132 444 128 aa, chain + ## HITS:1 COG:no KEGG:PTH_0034 NR:ns ## KEGG: PTH_0034 # Name: not_defined # Def: hypothetical protein # Organism: P.thermopropionicum # Pathway: not_defined # 14 127 1 113 113 123 59.0 2e-27 MATATIPLKGSASMESKKVSISSKRQITIPQKFFTLLGFNTEAECIMRGNELVLRPVKEN TSGEFAEQVLADLIRQGYSGEELLEKFKQTQRKIRPAVEAMLAEADRVAESKSGGYSLED VFGTEDEK >gi|226332892|gb|ACII01000127.1| GENE 20 16129 - 16425 348 98 aa, chain + ## HITS:1 
COG:no KEGG:PTH_0033 NR:ns ## KEGG: PTH_0033 # Name: not_defined # Def: hypothetical protein # Organism: P.thermopropionicum # Pathway: not_defined # 1 97 1 97 97 108 58.0 5e-23 MTQVQILPPAAKFLKKLKDKKLKSLYKEAIEMICEDYSIGEEKTGDLAGMYGYDIYYNKT NYELAYRVRQLDDLIIIVIMAGTRENFYEELKRYIRTI >gi|226332892|gb|ACII01000127.1| GENE 21 16641 - 18230 780 529 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0128 NR:ns ## KEGG: EUBREC_0128 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 529 22 550 550 707 64.0 0 MGRFVNSGNSAFQVALNSEIYVDKTGLLAYTNKVMDTTAKFICNSRPRRFGKSITADMLT AYYCKGCSSEEMFSSLEIGQAPGFTKYLNQYDVIHLDIQWCMEPAGGPENVVSYITEKTI LELRNYYPEILPDSISSLPEALSCINTATGQKFVVIIDEWDVLIRDASTDLVVQEEYINF LRGMFKGSEPTKYIQLAYLTGILPIKKIKTQSALNNFAEFTMTDARIFSRYIGFTEEEVR MLCEKYHRDFSKTKHWYDGYLLEQYQVYNPKAVVELMIWNKFQSYWSDTGTYEAIVPLIN MDFDGLKTAIIEMLSGNSVEIDPTTFQNDMVNFTCRDDILTCLIHLGYLGYDQDTHSAFV PNEEIRQELAKAVKRKKWNEFLTFQQESSELLDATLDMDTDTVARSIEKIHNEYASAIQY NNENSLSSILSIGYLSTMRYYFKPIREFPAGRGFADFVYLPKPEYSRDYPALVVELKWNQ NAQTAIRQIKNRQYPEAVANYTDNILLVGISYDKKQKTHNCVIEKYSEK >gi|226332892|gb|ACII01000127.1| GENE 22 18689 - 19519 362 276 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 36 268 1 230 234 87 27.0 2e-17 MIIGNTVHFIHLAYPTDIVTDIRNYNHFNPKELNPILKLAICEDNPSHYEIIHTILQKYP PGAFEITHFTSGDEFLRAAAENGCPYSIVLTDIDLGSDTVNGISLAEKINHISPDTQIIF ISQYLQYATAVYETEHAYFIHKQQMEKYLPLALRAACQKLQKLHTRYLYFSGNSRNYQVL CSDILYLERNLRQTTIYTRTGTYTTKERLTALTERMKPDFCLCHNSFAVNLHAVRTYSHK GIILSDNTEIPVSRSYYQQFKDAFAFMMLAGHREVQ >gi|226332892|gb|ACII01000127.1| GENE 23 19520 - 20452 190 310 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580492|ref|ZP_04857757.1| ## NR: gi|253580492|ref|ZP_04857757.1| hypothetical protein RSAG_02718 [Ruminococcus sp. 
5_1_39B_FAA] # 1 310 1 310 310 536 100.0 1e-151 MSSHMINLFLCILSSLYLCFGLLYFCCRILPSKHKISLPLFLCLSVFMALLFWIKRESQH NGITIVFQLTTFLVTLFLFQASFMKKLAVYFIFQLLIICPEILCTSVFIALHNLFIPTDT YTPHNLISSCSPAEYFVIELSNILLGLFLLWKISEILRQCIDYLKILTFLQLLLPLIAPV FLNVIISLQKKPEAVLALSIIYWIICIGSYLLFLRAVHSLAQQHREYLQKKMEIELIKKQ INDSVQLSNEYASLRKWNHDIENHIMSVMYLMDMKKYEEAETYTASVLSRLNCRPQEKQP EEDCSHEKEH >gi|226332892|gb|ACII01000127.1| GENE 24 20436 - 21266 410 276 aa, chain + ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 21 263 196 442 452 72 25.0 1e-12 MKKNIKLIPLFLFLFVFQLLYFLSTIYTEKIDHIFLILAVFMILICILLYCVLSDVLKKS NTEMELAFLQKQKQLKQEQDYSLQIRRQDTRDFQTKTVQELQDFQILLEQGKYEQADTAI RNLNQTFQKDRFHPYCQNNLLQAILEGKRLRAEQEHIQVSYEILLPEKISINTTDLSSIF FNLLDNAIEACSSSGNPDPEIRLSANISNGFLTIYMHNTKNPLQSFTHKTTKSEPGSHGY GLSIIEDICQKYNGSYQWIDHNNTFDSIVLLQIKNL >gi|226332892|gb|ACII01000127.1| GENE 25 21285 - 22172 647 295 aa, chain + ## HITS:1 COG:BH1906 KEGG:ns NR:ns ## COG: BH1906 COG2207 # Protein_GI_number: 15614469 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 10 292 3 287 299 115 28.0 8e-26 MPVTPTIEVDDTLKERSSHGSFELPIETYTDNCEIFHSLYNHWHDEMELIYIESGSGLVR LNKEILRVKAGDLIIVNRGVLHGIKTDLKNILYFRSIVFDLNFLSGPAGDLCQERVISLL MDNQAEFTHIISRSDDNYGNICLLFDRIHSCHKEKKPYYFVKLKALFFSFFYEMLMGNYI TPADKDENKSLASIKKILDYINLNYSQPLTAAELTALSNYSEYYFMKLFKQYTGKTLIAY INDLRIEKAKPMLLHSDSSVTEIALEVGFNNTSYFIKKFQQATGMSPHKFRRNLS >gi|226332892|gb|ACII01000127.1| GENE 26 22343 - 24610 2636 755 aa, chain + ## HITS:1 COG:SP2106 KEGG:ns NR:ns ## COG: SP2106 COG0058 # Protein_GI_number: 15901921 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Streptococcus pneumoniae TIGR4 # 1 751 1 751 752 976 65.0 0 
METLEKLIYDLYKKNARQCTDEEIYNALLIYTKNQLFERGYQDGKKKIYYISAEFLIGKL LSNNLINLGIYDDVAKFLKENGKEIAAIEEVEPEPSLGNGGLGRLAACFLDSIATLGLPG EGIGLNYHLGLFKQLFENRLQKETPNPWIEKHSWLTPAGVSYTVPFRGFSLKSSLYDIDV AGYNNKSIHLHLFDIDLADESMVHDGISFNKKDILHNLTLFLYPDDSDDDGRKLRIFQQY FMVSNAAQFILDEATKKGCNLHDLADYAVIQINDTHPSMIIPELIRLLTERGIAFDEAAE IVSKVCAYTNHTILAEALEKWPMDYLLDVVPHLVPIIEKLDEKIKAKYPQENVAIIDKNN LVHMAHMDIHYGFSINGVAALHTEILKTSELKAFYDIYPEKFNNKTNGITFRRWLMHCNH PLTDYITSLIGDEFKTNATALENLLKYKDDEAVLNKLLEIKTEAKKTCKEFILSNTGVEI NENSIYDIQIKRLHEYKRQQLNALYIISKYLEIKGGKKPSQPVTFIFGAKAAPAYVIAKD IIHLLLTLQDLIKDDPEVAPYMKLVMVENYNVSAAEKLFPACDISEQISLASKEASGTGN MKFMLNGAVTLGTMDGANVEISELVGEDNIYIFGESSEKVIEHYEKADYCSRDIYENDKR IKECVDFIVSEKMLALGHKENLERLHHELVSKDWFMTLLDFNDYVEKKDQALADYADRKT WAKKMLVNIAKAGFFSSDRTIEQYNNDIWHLTPAK >gi|226332892|gb|ACII01000127.1| GENE 27 24793 - 26004 897 403 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580496|ref|ZP_04857761.1| ## NR: gi|253580496|ref|ZP_04857761.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 403 1 403 403 809 100.0 0 MKVRMYNVGFGDCFCLRDRKKSLLVDFGTNNSRIEGRPRREIFDVIISDLSTIKSKNLLL THFHMDHLSGLLYMMKKKDISVDFGKIYLPDVFSKKEMSRTLVLLLLADLLKESGLPSRQ VSLFALVDALLENKQNVELLSRGKLFENKYQALWPDTDVIRKETDEVYNRICEDGKFKEV MDVLLEFAEKLRLIVWSMTAEGNKPAETVEKETDLTASELTEQPEVRQKIIRAYVYEREF RRIKALPDFKKLLAWLNENQINLRQFKHKISIVFQNARDGELNLLFTGDAQPEHLKMIAE NYDGKLPLYEHYWCIKVPHHGTQGHYFDFSQYEPENMLISNGIHFANSKKESKELRTSPL YGGLFYIPDTHMYCSNCDCCDSYENGCSCKEADVISPAYYKDI Prediction of potential genes in microbial genomes Time: Sat May 28 20:39:08 2011 Seq name: gi|226332891|gb|ACII01000128.1| Ruminococcus sp. 5_1_39B_FAA cont1.128, whole genome shotgun sequence Length of sequence - 7595 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 36 - 95 7.0 1 1 Tu 1 . + CDS 165 - 812 184 ## PROTEIN SUPPORTED gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 - Term 864 - 904 -0.3 2 2 Op 1 . 
- CDS 952 - 1242 483 ## gi|253580498|ref|ZP_04857763.1| conserved hypothetical protein 3 2 Op 2 . - CDS 1311 - 2117 617 ## COG0122 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase - Prom 2201 - 2260 8.7 + Prom 2144 - 2203 7.8 4 3 Tu 1 . + CDS 2242 - 2520 387 ## gi|253580500|ref|ZP_04857765.1| conserved hypothetical protein + Term 2554 - 2604 3.1 + Prom 2550 - 2609 5.2 5 4 Tu 1 . + CDS 2687 - 3694 565 ## gi|253580501|ref|ZP_04857766.1| predicted protein + Term 3726 - 3763 5.5 + Prom 4124 - 4183 2.7 6 5 Tu 1 . + CDS 4219 - 5529 1627 ## COG0422 Thiamine biosynthesis protein ThiC + Term 5536 - 5596 16.4 - Term 5531 - 5578 11.1 7 6 Op 1 3/0.000 - CDS 5589 - 6164 422 ## COG0352 Thiamine monophosphate synthase 8 6 Op 2 . - CDS 6166 - 7407 1303 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes - Prom 7495 - 7554 3.8 Predicted protein(s) >gi|226332891|gb|ACII01000128.1| GENE 1 165 - 812 184 215 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 [Haemophilus influenzae 7P49H1] # 59 206 64 213 378 75 32 1e-13 MTKQELALEVIERLKKEYPDADCTLDYDNAWKLLVSVRLAAQCTDARVNVVVQDLYAKFP TVEALANADVADIESIVRPCGLGKSKARDISACMKILHEQYHDNVPGDFDALLKLPGVGR KSANLIMGDVFGKPAIVTDTHCIRLSNRIGLVDNMKEPKKVEMALWKIIPPEEGNDLCHR LVNHGRDVCTARTKPYCDRCCLNDICEKNGVDSQE >gi|226332891|gb|ACII01000128.1| GENE 2 952 - 1242 483 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580498|ref|ZP_04857763.1| ## NR: gi|253580498|ref|ZP_04857763.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 96 1 96 96 179 100.0 5e-44 MEIGAMLDELREKALKDRNIWQELMDTRKAEQPLAAFCAKCRELGYQIYEMDMITAGEEF YATMRRSTNGGGENSPKLAGEDDFYELFFASLEEHI >gi|226332891|gb|ACII01000128.1| GENE 3 1311 - 2117 617 268 aa, chain - ## HITS:1 COG:CAC2707 KEGG:ns NR:ns ## COG: CAC2707 COG0122 # Protein_GI_number: 15895964 # Func_class: L Replication, recombination and repair # Function: 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase # Organism: Clostridium acetobutylicum # 6 264 18 284 292 176 39.0 5e-44 MLTIEMDNFDLGQICRSGQCFRMDQIGDDRYRVIAGDKYLELTQERGIVNFFCPEEDFIF FWIRYFDLDCDYSEYINMINPRDKYLTAAGEMGSGIRILQQDLWEMIISFLISQQNNITR IKKCIENISREFGVRKTSSTGAEYYAFPTAEALALATEEQLRECNLGYRAKYVLDTARKV CFGDISLNSLHDMTYKAARKELLGLYGVGEKVADCICLFGLHQLDAFPVDTHIRQALDAH YKRGFPNRRYKGCRGVMQQYIFYYELMK >gi|226332891|gb|ACII01000128.1| GENE 4 2242 - 2520 387 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580500|ref|ZP_04857765.1| ## NR: gi|253580500|ref|ZP_04857765.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 92 1 92 92 164 100.0 1e-39 MKFIYPAVFRKNEEGRYKAFFPDLACCEAEGDTLDDAIDNANEAAYNWIYAEVMEDEMDL PAISDESDLDLLEGDIVRNIQVNIRMYDGWDE >gi|226332891|gb|ACII01000128.1| GENE 5 2687 - 3694 565 335 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580501|ref|ZP_04857766.1| ## NR: gi|253580501|ref|ZP_04857766.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 335 63 397 397 523 100.0 1e-147 MKARSFKKFFVSIALSAVLTLSSASSVFAATAQLAPAEQSVEQAQSDDSDTPAIESSDAQ NACASLTAQPGIRQTAASENSVTIQWNPVTNASKYAVNISPLSSSSYRFLGYIGNTRNKA KINKLKAGTAYVIKITALNSSGIAISSRTVGCTTLYSKVKIKSSYASTGRYTFNMQTVNP SNSITGYKVVYQSSAAHKLITKYFNTRYSFTIPISGNTFYQVKIYPYLVLGNKRYVSSTS TDRYISNVITLQKAGNTNSSMSVKWNRTAGADNYSIYIKYPGSSSFKKVKTTTSNFFTLT GMKKNTKYGIKVIANKKVKNKVWHSDSKAYNMSLV >gi|226332891|gb|ACII01000128.1| GENE 6 4219 - 5529 1627 436 aa, chain + ## HITS:1 COG:CAC3014 KEGG:ns NR:ns ## COG: CAC3014 COG0422 # Protein_GI_number: 15896266 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis protein ThiC # Organism: Clostridium acetobutylicum # 4 436 3 435 436 592 66.0 1e-169 MTQYTTQMDAARQGIITPQMEIVAEKEHFDVEELRELIAKGQVIIPCNKNHKSISPSGVG AKLTTKINVNLGVSRDWKDVDMEYEKVHSAVEMGAEAIMDLSSYGDTRSFRRKLTSECPA MIGTVPIYDAVVYYHKALGKITSEEWIDIVRMHAEDGVDFMTIHCGMNRATAARFKQNKR LMNIVSRGGSIMFAWMEMTGKENPFYEHFDEILDICREYDITLSLGDACRPGCIADATDT AQIEELITLGELTKRAWEKDVQVMIEGPGHMPLNQIAANMEIQKTLCHGAPFYVLGPLVT DVAPGYDHITSAIGGAIAAYSGAAFLCYVTPAEHLRLPNAADVKEGIIAAKIAAHAADIA KGVPGAAEWDYKMSEARKRLDWEEMFKLSMDPEKARRYRAEAKPEKEDTCSMCGNFCAVK NTNRILDGEIVTIFDE >gi|226332891|gb|ACII01000128.1| GENE 7 5589 - 6164 422 191 aa, chain - ## HITS:1 COG:Cj1043c KEGG:ns NR:ns ## COG: Cj1043c COG0352 # Protein_GI_number: 15792370 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Campylobacter jejuni # 7 190 4 185 201 130 37.0 1e-30 MQKNPYENVIAVTNRSLCQRPFAEQIERVCSLHPKAVILREKDLPEEEYSRLAEQILEIC KRYQVPCILHTYVNVAEKLHHPYIHLPIFLLEKYEGKLGGFRQIGSSVHSVEDALKAESL GADYLTAGHIYTTDCKKGLPPRGLEFLENVCKAVKIPVYAIGGIHPGTGQLNEIMEHGSA GGCIMSDMMKI >gi|226332891|gb|ACII01000128.1| GENE 8 6166 - 7407 1303 413 aa, chain - ## HITS:1 COG:FN1753 KEGG:ns NR:ns ## COG: FN1753 COG1060 # Protein_GI_number: 19705074 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related 
uncharacterized enzymes # Organism: Fusobacterium nucleatum # 47 413 8 374 376 501 63.0 1e-141 METTEKNMEHGEHFNESIVNEVILEDMKKNRIDHMKYLPGMEVLEESDVMDQVISAMNAY DYDKYTEADVRRALAHDNRTPEDFQALLSPAALPLLEEIAQAAQKETRKHFGNSVYMFTP IYIANYCENYCIYCGFNCHNKIRRAKLNAEEIDKEMAAIAKTGLQEILILTGESRAKSDV KYIGEACKIARKYFKVIGLEVYPMNSDEYAHLHECGADYVTVFQETYNSDKYETLHLAGH KRIFPYRLNAQERALKGGMRGVGFAALLGLDDFRKDAFATGYHAYLLQRKYPHAEIAFSC PRLRPIINNDRINPMDVHEPQLLQVVCAYRLFMPFASITVSTRECERVRDNLVGIAATKI SAGVSTGIGSHVEDIEDKGDDQFEISDGRSVDEVYKALLDHDLQPVMNDYVYL Prediction of potential genes in microbial genomes Time: Sat May 28 20:39:37 2011 Seq name: gi|226332890|gb|ACII01000129.1| Ruminococcus sp. 5_1_39B_FAA cont1.129, whole genome shotgun sequence Length of sequence - 21666 bp Number of predicted genes - 24, with homology - 23 Number of transcription units - 12, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 34 - 813 1055 ## COG2022 Uncharacterized enzyme of thiazole biosynthesis 2 1 Op 2 . - CDS 865 - 1068 214 ## EUBREC_0225 hypothetical protein - Prom 1244 - 1303 4.8 - Term 1281 - 1323 3.4 3 2 Op 1 . - CDS 1476 - 2054 236 ## Cthe_1184 hypothetical protein 4 2 Op 2 . - CDS 1939 - 3297 1002 ## Cthe_1185 hypothetical protein 5 2 Op 3 . - CDS 3361 - 4299 640 ## COG0657 Esterase/lipase - Prom 4334 - 4393 1.9 6 3 Tu 1 . - CDS 4431 - 5162 593 ## Cphy_3338 phosphoglycerate mutase - Prom 5299 - 5358 8.2 + Prom 5330 - 5389 13.5 7 4 Op 1 . + CDS 5621 - 6499 314 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 8 4 Op 2 . + CDS 6545 - 6739 93 ## + Term 6882 - 6918 0.2 - Term 6756 - 6790 6.0 9 5 Op 1 1/0.167 - CDS 6821 - 7735 750 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 10 5 Op 2 . 
- CDS 7758 - 8732 915 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 11 5 Op 3 14/0.000 - CDS 8734 - 10125 1314 ## COG1653 ABC-type sugar transport system, periplasmic component 12 5 Op 4 38/0.000 - CDS 10166 - 10996 428 ## COG0395 ABC-type sugar transport system, permease component 13 5 Op 5 . - CDS 11039 - 11920 642 ## COG1175 ABC-type sugar transport systems, permease components - Prom 12048 - 12107 13.0 + Prom 12266 - 12325 3.9 14 6 Tu 1 . + CDS 12345 - 13202 456 ## COG1737 Transcriptional regulators + Term 13267 - 13319 7.2 - Term 13254 - 13308 5.4 15 7 Op 1 . - CDS 13345 - 14046 1009 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase - Term 14058 - 14121 1.6 16 7 Op 2 12/0.000 - CDS 14135 - 15238 1340 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase - Prom 15281 - 15340 3.6 17 7 Op 3 . - CDS 15349 - 16092 1015 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase - Prom 16305 - 16364 6.6 + Prom 16223 - 16282 6.9 18 8 Tu 1 . + CDS 16367 - 17194 645 ## COG1737 Transcriptional regulators - Term 17494 - 17555 6.2 19 9 Op 1 1/0.167 - CDS 17583 - 19100 1384 ## COG3534 Alpha-L-arabinofuranosidase - Prom 19127 - 19186 3.2 20 9 Op 2 . - CDS 19238 - 19426 190 ## COG0395 ABC-type sugar transport system, permease component - Prom 19534 - 19593 9.0 + Prom 19602 - 19661 8.5 21 10 Tu 1 . + CDS 19731 - 20654 983 ## COG4189 Predicted transcriptional regulator + Term 20684 - 20732 2.2 - Term 20665 - 20725 11.2 22 11 Op 1 . - CDS 20751 - 21023 287 ## COG3533 Uncharacterized protein conserved in bacteria 23 11 Op 2 . - CDS 20945 - 21217 229 ## gi|253580525|ref|ZP_04857790.1| conserved hypothetical protein - Prom 21251 - 21310 3.5 - Term 21238 - 21299 10.2 24 12 Tu 1 . 
- CDS 21349 - 21666 393 ## EUBREC_2454 alpha-L-arabinofuranosidase Predicted protein(s) >gi|226332890|gb|ACII01000129.1| GENE 1 34 - 813 1055 259 aa, chain - ## HITS:1 COG:FN1754 KEGG:ns NR:ns ## COG: FN1754 COG2022 # Protein_GI_number: 19705075 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of thiazole biosynthesis # Organism: Fusobacterium nucleatum # 6 258 2 254 257 371 75.0 1e-103 MSTNNEDKLILGGHEFTSRFILGSGKFSLDLVKACIEKADAQIITLALRRANEGGLANIL DYIPKNVTLLPNTSGARNAEEAVRIARLSREIGCGDFVKIEVIHDSKYLLPDNYETVKAT EILAKEGFVVMPYMYPDLNAARDLVNAGAACIMPLGSPIGSNKGLCTKEFIQILIDEIEL PIIVDAGIGRPSQACEAMEMGASAVMANTAIATAGDVQIMAEAFKKAIEAGRSAYLAGLG RTLEKGASASSPLTGFLHE >gi|226332890|gb|ACII01000129.1| GENE 2 865 - 1068 214 67 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0225 NR:ns ## KEGG: EUBREC_0225 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 66 1 65 65 64 47.0 9e-10 MLITVNGKTKELQNTLTIGQYLEENSYVPSQIAVELNERILSKAEYSTTVLHENDILEIV SFMGGGR >gi|226332890|gb|ACII01000129.1| GENE 3 1476 - 2054 236 192 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1184 NR:ns ## KEGG: Cthe_1184 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 16 162 46 192 222 89 29.0 6e-17 MNFRDMGKSTGLKKKENQKVIQTFSFLCLAIAVICGMINFMMAGVLNWFWFAGAGCACAW LVVMVAYYKRRNILKNEMWQLLLISVIAILWDRFTGWKGWSVDFVIPFGILAVQFSVPVI AKINRLEREEYLFYLVQAGIAGLIPMILVWTGIVQFAVPSVICAGISFLTLAALFIFCKK DTMREFHKKLRM >gi|226332890|gb|ACII01000129.1| GENE 4 1939 - 3297 1002 452 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1185 NR:ns ## KEGG: Cthe_1185 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 7 408 8 418 426 286 35.0 1e-75 MDTGYSKWRKLDNAALAFPLVTGKNDTRVFRFYCQLKEEVNGEILQAALDQTMEKYPLFQ AVLRKGLFWFYLEHRDIRAVVKPETEPPCSRLYIPDKKSLLFQVSYDKNRINFEVFHALT DGTGAMHFLQELVQDYLILAHPQADLPQIEHAEEITHGDKEEDSFSQYYSSDIPKDKEKK KAAVKLKGEKLVHSDMHVTEVALSVKDIHRKARSCGVSITVLLTAMMLCSIREEIPKNQQ 
KRPVALMIPVNLRNYFPSQSMTNFFGWIEVGYIFSDETTFEDVLLSVKKQFEEELVKEKI AMHMSGYVRIEKNPFVRAVPLEIKKYFLMIGANLGSRSITAVYSNIGIIRLPEEYKEYIQ HFGIFASTNSLQMCSCSYGDEMVLGFTSKIPDDSIQRNFQRMLGEENVSHRELKNEFPGY GEKHRLEEKRKSEGYPDFQLPVSGNCGNLRND >gi|226332890|gb|ACII01000129.1| GENE 5 3361 - 4299 640 312 aa, chain - ## HITS:1 COG:AF1716 KEGG:ns NR:ns ## COG: AF1716 COG0657 # Protein_GI_number: 11499305 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Archaeoglobus fulgidus # 71 282 80 287 311 187 48.0 3e-47 MKAILHALSYGNIELESSRRMADLKQLDAMRIFVKKLDARVYNGEHEVPVRLYFPTEEAM QAGIVEGNTFPVLLFFHGGGWVTESVENYDRVCARMAQATAHIVVSVEYRLAPEHKFPVP LEDCYAAAKALYTNQLILNTDPERITIIGDSAGGNLTAAVCLMARDKGEFTPRRQILIYP ALGNCYTEESPYRSVQENGSDYLLTSVKMEDYLNLYQSSAEDRQNPYFAPILEKDLRNLP ETLILTAEYDPLRDEGEVYGRKLHAAGNHVEVHRIYGAFHGFFALGIKFLHVQESFKYIN RFLNKSYEKILQ >gi|226332890|gb|ACII01000129.1| GENE 6 4431 - 5162 593 243 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3338 NR:ns ## KEGG: Cphy_3338 # Name: not_defined # Def: phosphoglycerate mutase # Organism: C.phytofermentans # Pathway: Glycolysis / Gluconeogenesis [PATH:cpy00010]; Metabolic pathways [PATH:cpy01100]; Biosynthesis of secondary metabolites [PATH:cpy01110] # 1 231 1 230 235 240 49.0 2e-62 MRLLIVRHGDPDYSIDSLTEKGWKEAEYLSERLSKLDVKDFYVSPLGRAKDTASFTLKKM NRTAVECDWLREFDVLIDRPDVTDRQKRLWDWLPQDWTQDERFYQYDHWYENERFQQSDV KRYYDHVTGEFDKLLAEHGYVREGHYYRVEKPNEDTLVFFCHFGLECVLLAHLIGASPMV LWHGFCAAPSSVTTVNTEERREGIASFRISAFGDVSHLYVHDEPPAFAARFCEMYSNTDE RHD >gi|226332890|gb|ACII01000129.1| GENE 7 5621 - 6499 314 292 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 7 292 9 317 323 125 30 2e-28 MKILVFDIGGTAIKYGICTNGHLEETKEYPTEAFRGGTHILNTICRLSEQYLPFDAIGIS TAGQVNPDEGSIIYANSNIPDYTGTQFKRILQKLFHVPVAVENDVNSAALGEAVFGAGKG KNSFLCLTYGTGVGGAIIENKQVYHGSSFSAGEFGAIITHAEEKLSGTDPFDGCYERYAS ATALVKMVSTVDSSLTNGRQIFASLKRPEINEVINKWIDEIVLGLATLIHIFNPSCIVLG GGIMVQPYILERIHTRIPQMVMSSFTHVQISNAKLGNSAGLMGAYYLASQKL 
>gi|226332890|gb|ACII01000129.1| GENE 8 6545 - 6739 93 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAILFNNTFKTVQQGAKDQKCLPQYFTALPNCPDCIILWNCYNRSTLEVSYHPYLTISGT ETIL >gi|226332890|gb|ACII01000129.1| GENE 9 6821 - 7735 750 304 aa, chain - ## HITS:1 COG:SP1676 KEGG:ns NR:ns ## COG: SP1676 COG0329 # Protein_GI_number: 15901511 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Streptococcus pneumoniae TIGR4 # 2 298 3 299 305 410 65.0 1e-114 MNTEKFKGVFPAFYACYDDNGEISAERTQLLAKHLMEKGVKGLYVGGSSGECIYQSVEDR KKILENVMKAVKGKLTIIAHVACNNTKDSCELAAHAESQGVDAIAAIPPIYFHLPEYAIA EYWNDISAAAPNTPFIIYNIPQLSGTILTMSLYKNMLKNPNVLGVKNSSIATQDIQMFKT EAGKDHIVFNGPDEQFISGRAIGADGGIGGTYAVMPELFLKMNEFLEEGKMKEALEIQNK ADAIIYKMCEAHGNLYAVMKEILRINENIDIGNVRKPLPGLIPEDIPIVEEAAKMIRDAI SELQ >gi|226332890|gb|ACII01000129.1| GENE 10 7758 - 8732 915 324 aa, chain - ## HITS:1 COG:TM0327 KEGG:ns NR:ns ## COG: TM0327 COG0111 # Protein_GI_number: 15643095 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Thermotoga maritima # 39 323 36 315 327 190 38.0 4e-48 MKTKVWLNLDPQKVDVSYLYKKFDEIGCDFEAEAIPDNNPELLIKKVKDVDIVIATMEPW NETTLGAVKGKVKFIQKYGTGVDSVDLKAAGKNGIPVANIPGANAPAVAEVAMMHILNLG RRFTNCVEGCREGIWPSTITGNELDGKIVGLAGYGRVAKNLARMISGFSVKLLAYDPFVK EAVPGQDITFVDTLEELFERSDIVSLHMPFMPSTARIINKSLFERMKPHAYLVNTCRGGV IDEADLIEALKTGKLAGAGLDVLTEEPPKADQPLMHMDNVYITSHMGAASLESEYRSQVI IADNIKEFLEGKLPKSVRNKEFLV >gi|226332890|gb|ACII01000129.1| GENE 11 8734 - 10125 1314 463 aa, chain - ## HITS:1 COG:PM1762 KEGG:ns NR:ns ## COG: PM1762 COG1653 # Protein_GI_number: 15603627 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 75 395 62 379 451 90 24.0 7e-18 
MRKKLLSMVMAGTLALSFTACGGSSSGETEKSTDTSKVEETSSGDTQAASADSGNKKTIT VWVEKIFSDDANTKMEERLKEYGKEKNVTVNCEMVAATDFVTKLNAAIEAGQSVPDIISA DTTKVLNYYPNIPCNDVTDLVDQIDEERPYLQASYEGTKIDDKYYYVPFYSSSTLMFVRK DKLEEAGITEMPTTWDEVFKDAEKVSDPDNDFYGLAIGCGENDDDDENTIRQYMWNEGGY LFDEDGNVAADDKVTAVFDKYAELYDDKVIPQDATTWDAGGNNGSYLAGRTAFCFNAPTL YNALVSDEQYKDLLDNTVVLAPPAGSDNSVYMNFNRGFAVMNTCKDTDLASDVISYLLDK DWYDSYMEEIAPVFAPVFEDAKENTTWKDNEVNAQALKYVENASGYYGYPVKTLKGRTVA AKHYFTYPFVKAVNQVATGTADSAGALKSMTSAIEDFQDQVGE >gi|226332890|gb|ACII01000129.1| GENE 12 10166 - 10996 428 276 aa, chain - ## HITS:1 COG:mlr7227 KEGG:ns NR:ns ## COG: mlr7227 COG0395 # Protein_GI_number: 13476021 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 25 275 30 280 280 162 35.0 4e-40 MKAKKRTALLGRVYLIICLIVALFPIYWMINTSFKSDSEIYAKVPTLIPHHPTGAAYSYL INNINFLSSMKNSLIIAVSVSVFSILIAYPVAYTLSRLKFKGRRIFSKSVLFMYLLPTTV LYIPLYMLVSKMHLTNTIWGLIVIYPTFTLPYVAWILIPHIAAVPKELEEAAKVDGCSRI GTMYKIVFPLALPGIISTTIFTFAMCWGEYMYALVNLTSSQVQTFPLVISGLIYGDMPPW NQLMAGGVLAGIPIIVIYMLASSGLVGGATDGGVKG >gi|226332890|gb|ACII01000129.1| GENE 13 11039 - 11920 642 293 aa, chain - ## HITS:1 COG:AGl3272 KEGG:ns NR:ns ## COG: AGl3272 COG1175 # Protein_GI_number: 15891756 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 289 37 334 337 176 34.0 4e-44 MNRNRENGENRFAYLVNAPMIIYLACFLAFPMVWGLYMSFTNKTIGNPASFVGFKNYIRL LGDAEYRRSILNTVFFTAVSILVKTVLGMLMALSLNQKFKGRNIARALLMIPWTLPNIVV VYNWRWIFNSTGGIANYILKSLHITNTDIIWFGSAGLAMTTIIVANVWRGTPFFGVSILA KLQTIPKDYYEAAEIDGAGLWQKFRHITLPEVKDVTILSALMSTIWTINEFETVWLLTGG GPNGTTEVMNVYSYKTAMRSMMLGRGIAVAVLAMPVLMILISILTRRMLPEGE >gi|226332890|gb|ACII01000129.1| GENE 14 12345 - 13202 456 285 aa, chain + ## HITS:1 COG:BH2675 KEGG:ns NR:ns ## COG: BH2675 COG1737 # Protein_GI_number: 15615238 # Func_class: K Transcription # Function: Transcriptional 
regulators # Organism: Bacillus halodurans # 6 283 12 285 287 132 29.0 5e-31 MADSILDSITEQYHSLTKSGKKLADYVFSHTTETQYLSITSLAENCKVSEATITRFCRGL GLSGYNAFKLALAQADRTTDLGDSSNSAEPVSSEDSISTICAKVHAANLLSLNESYALYD EEEMSKAISILSSSRRIYCFGQGGSMIMAMEAWARFSTASPNFVHICDSHMQASAIAMAD SKDAILFFSYSGSTRDMCDTLQIASSRHVPVILITHFPKSAGAEFASVVLQCGYNESPLQ SGSVAAKVGQLFLIDCLFYGYCSQKPDTCSAARSATAHAIANKLL >gi|226332890|gb|ACII01000129.1| GENE 15 13345 - 14046 1009 233 aa, chain - ## HITS:1 COG:SP1330 KEGG:ns NR:ns ## COG: SP1330 COG3010 # Protein_GI_number: 15901184 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate epimerase # Organism: Streptococcus pneumoniae TIGR4 # 2 230 6 233 233 249 56.0 2e-66 MTKKEILDAIHGKMIISCQAVEGEPLYVEEKSIMYLMARAAKQAGTPAIRTSSIRDVIAI KEETGLPVIGLVKIQYPGYEGYITPTMKEVDDLVAAGSDVVALDCTLRKRGDGSTVNEFI AQIKEKYPDIILMADISNYEEGINAWKCGVDIVGTTMSGYTDYTSKKDEPDYGLMERLAK DTDIPVIGEGKIHYPDQAVKALQTGVWAIVVGGAITRPLEIAQRFQKAIENSK >gi|226332890|gb|ACII01000129.1| GENE 16 14135 - 15238 1340 367 aa, chain - ## HITS:1 COG:SA0656 KEGG:ns NR:ns ## COG: SA0656 COG1820 # Protein_GI_number: 15926378 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Staphylococcus aureus N315 # 1 363 4 380 393 253 40.0 6e-67 MIIKNGKVFQEDGSYKVTDLYVENGRIVASADEVTDKTELDASGLKVLPGLVDIHSHGAV RHDFSDADVDGLRTILQYEKSHGITSYCPTSMTLPKEELLKIFQTAKDVEQDETCARIVG INMEGPFLDPAKKGAHVEGYIRKPDIEFFRACNEAAGGMIKLVTLAPNMEGSEKFIRELH NEVVISIGHTAADYGCAAEAMKEGALHVTHLYNAMNPMGHREPGVIGAAADNQDCMVELI GDGIHIHPVTVRNTFRLFGDSRVVLISDSMMATGMENGLYELGGQEVTMKDRKATLADGT IAGSATCLFDCMKCVISMGVPEREAILAATANPARSIGIYDEVGSLAPGKRADIVLTDEE LNIVKVL >gi|226332890|gb|ACII01000129.1| GENE 17 15349 - 16092 1015 247 aa, chain - ## HITS:1 COG:CAC0187 KEGG:ns NR:ns ## COG: CAC0187 COG0363 # Protein_GI_number: 15893480 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: 
Clostridium acetobutylicum # 1 241 1 241 241 250 53.0 2e-66 MIIYVGKDYQDVSRKAANIMSAQIIMKPNAVLGLATGSTPVGLYKQLIEWYNKGDLDFSQ ITSVNLDEYKGLSGDNDQSYRYFMNTNLFDHVNIDKNKTYVPNGLEEDSDKACADYNEII RSVGGIDIQLLGIGGNGHIGFNEPGEAFEKETHCVDLTESTIKANARFFESMDEVPKQAY TMGIKSIMAAKKILLVATGSAKADALYKSLYGPITPNVPASILQLHQDVTVVADEDALSL IKEKGLL >gi|226332890|gb|ACII01000129.1| GENE 18 16367 - 17194 645 275 aa, chain + ## HITS:1 COG:SP1331 KEGG:ns NR:ns ## COG: SP1331 COG1737 # Protein_GI_number: 15901185 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 3 252 2 252 269 208 44.0 1e-53 MSLNKHKIIPLIESYYHTFTPLERTIADFFIHNTEEQDFSSKNISGLLYVSEASLSRFAQ KCGFHGYREFIYEYKQSLAPGPEENIPNFEVSEFNTYQELLNKSNALLDKAQITRITNLL VSKPRVYVYGRGSSGLVAQEMKLRFMRIGLNIEAVTDSHIMKVNSVILDENCLVIGISVS GQTDDIISSLKAAHQHGAYTLLMTARQDKSYQDFCDETLIFASMEHLEYGNIISPQFPIL LVLDVLYAHYLQIDRSKKEALHEYTIQTLQPFLIK >gi|226332890|gb|ACII01000129.1| GENE 19 17583 - 19100 1384 505 aa, chain - ## HITS:1 COG:BH1861 KEGG:ns NR:ns ## COG: BH1861 COG3534 # Protein_GI_number: 15614424 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Bacillus halodurans # 4 504 5 500 500 676 64.0 0 MKQARMIIDRAFKVSEVDKRIYGSFIEHLGRAVYDGIYQPGNPLSDEDGFRTDVLNMVKE LNVPIVRYPGGNFVSNFFWEDSVGPVEERPVRLELAWRSLEQNKIGLNEFSKWLKKADAD MMMAVNLGTRGVADACNLLEYCNHPGGTKYSDLRIKHGVKDPHNVKVWCLGNEMDGDWQI GHKTMHEYGRLSQETAKAMKLIDPTIELVSCGSSNLDMPTFPEWEATTLGYNYDYVDYVS LHQYYGNLNNDTADFLALSDDMDKFIRSVIATCDFVRAKKRGKKDINLSFDEWNVWFHTR EADDEFMEKDPWHIAPPLLEDQYSFEDALLVGLMLITLMKHADRVKMACLAQLVNVIAPI MTEKDGGKAWRQTIFYPFMHASRYGRGMVLQPVIDTPVHDTKEHENVTDLSSVAVWNEAD EELTVFAVNRNIDEDMELVTDLRSMEGYQLIEHIVLENNDMKACNSASGEAVIPITVSRS KVDGGIMTSVLNKASWNVIRLKKQK >gi|226332890|gb|ACII01000129.1| GENE 20 19238 - 19426 190 62 aa, chain - ## HITS:1 COG:BH0903 KEGG:ns NR:ns ## COG: BH0903 COG0395 # Protein_GI_number: 15613466 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, 
permease component # Organism: Bacillus halodurans # 3 57 145 199 279 77 69.0 7e-15 MWGIILPYLVVPMLVFFFRQYLMGIPKDFLDAARVDGCTEYGIFFKIMIPLMKPSFAIRK QL >gi|226332890|gb|ACII01000129.1| GENE 21 19731 - 20654 983 307 aa, chain + ## HITS:1 COG:BH1869 KEGG:ns NR:ns ## COG: BH1869 COG4189 # Protein_GI_number: 15614432 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Bacillus halodurans # 9 307 8 303 313 282 45.0 5e-76 MLYIKSLEQAVPVFKALGSDIRIRLIQTLLEHQEMNMNELASSLGITNGALTSHVKKLEE SGILAILPEHSGHGNQKVCRINVDKILVDIASNNDSPAEDSYSIDIPIGNYFNYSVYPTC GLSTTDNLIGEVDDPRYFAHPSHVDAKILWFGRGFIDYRIPNMLPPGQKIDRLTLSFEIS SEAPGVNSDWPSDISFFLNNTKIGTWTSPGDFGDVHGMFTPDWWFPNWNQYGLLKMLVID RHGTFIDGLKISDVNTQQLTFTSQDDMVFRFQVDEPSQNIGGLTLFGKDFGNYNQDINVK VHYTPNV >gi|226332890|gb|ACII01000129.1| GENE 22 20751 - 21023 287 90 aa, chain - ## HITS:1 COG:ECs4459 KEGG:ns NR:ns ## COG: ECs4459 COG3533 # Protein_GI_number: 15833713 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 29 90 2 63 656 71 56.0 4e-13 MLCGTAGCTEGAKMSGSAGKMGGAEMKGSLKEGILNKVKVTDKFWRGYQELVMDTVIPYQ EKILNDEIPGVEKSHALVNFRIAAGLEVGE >gi|226332890|gb|ACII01000129.1| GENE 23 20945 - 21217 229 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580525|ref|ZP_04857790.1| ## NR: gi|253580525|ref|ZP_04857790.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 90 1 90 90 155 100.0 9e-37 MNGKLRVCRKCLPGQKNEEAFYEDLSRYIQRMDEELKVDQQTYEKRLGICSSCERLMSGM CRLCGCFVELRAVQKVRKCPDLPAKWEAQK >gi|226332890|gb|ACII01000129.1| GENE 24 21349 - 21666 393 105 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2454 NR:ns ## KEGG: EUBREC_2454 # Name: not_defined # Def: alpha-L-arabinofuranosidase # Organism: E.rectale # Pathway: not_defined # 11 104 423 518 520 68 38.0 1e-10 AETIGVEDEWQVPNLTESVSVAEDGAVHITLTNLSLDKDYEIRTILTDYQVNEVKGEIVH GEMHEMNTFETPDQVRVKEFNEVEKTAEGIKFTIPKCSVLHLEVR Prediction of potential genes in microbial genomes Time: Sat May 28 20:40:19 2011 Seq name: gi|226332889|gb|ACII01000130.1| Ruminococcus sp. 5_1_39B_FAA cont1.130, whole genome shotgun sequence Length of sequence - 60469 bp Number of predicted genes - 57, with homology - 57 Number of transcription units - 30, operones - 12 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 186 210 ## gi|253579862|ref|ZP_04857130.1| transposase - Prom 243 - 302 6.3 + Prom 200 - 259 7.7 2 2 Op 1 . + CDS 397 - 765 312 ## Cbei_3562 transposase IS116/IS110/IS902 family protein 3 2 Op 2 . + CDS 811 - 1440 238 ## COG3547 Transposase and inactivated derivatives 4 3 Tu 1 . - CDS 1585 - 1923 481 ## COG3534 Alpha-L-arabinofuranosidase - Prom 1970 - 2029 8.3 - Term 2137 - 2193 11.8 5 4 Tu 1 . - CDS 2196 - 3032 968 ## COG0627 Predicted esterase - Prom 3071 - 3130 7.0 - Term 3081 - 3138 8.2 6 5 Op 1 . - CDS 3214 - 4689 683 ## CKR_1533 hypothetical protein - Prom 4750 - 4809 3.7 7 5 Op 2 . - CDS 4811 - 5185 287 ## gi|253580531|ref|ZP_04857796.1| predicted protein - Prom 5220 - 5279 14.3 + Prom 5222 - 5281 8.4 8 6 Tu 1 . + CDS 5313 - 5630 322 ## EUBELI_01592 hypothetical protein + Term 5709 - 5751 9.1 - Term 5689 - 5744 17.8 9 7 Tu 1 . - CDS 5958 - 7313 1044 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 7532 - 7591 9.3 - Term 7551 - 7592 1.0 10 8 Op 1 . - CDS 7607 - 8689 447 ## DSY0720 hypothetical protein 11 8 Op 2 . 
- CDS 8692 - 9624 587 ## SAOUHSC_00164 hypothetical protein 12 8 Op 3 . - CDS 9626 - 10486 275 ## DSY0719 hypothetical protein - Prom 10511 - 10570 7.5 13 9 Tu 1 . - CDS 10614 - 10790 167 ## gi|253580538|ref|ZP_04857803.1| predicted protein - Prom 11016 - 11075 8.1 - Term 10877 - 10909 -0.9 14 10 Op 1 1/0.000 - CDS 11098 - 12504 993 ## COG5545 Predicted P-loop ATPase and inactivated derivatives 15 10 Op 2 . - CDS 12356 - 13060 256 ## COG0358 DNA primase (bacterial type) 16 10 Op 3 . - CDS 13082 - 14320 719 ## CDR20291_1768 hypothetical protein - Prom 14347 - 14406 2.6 - Term 14354 - 14378 -1.0 17 11 Tu 1 . - CDS 14501 - 14725 146 ## gi|253580542|ref|ZP_04857807.1| DNA binding domain-containing protein + Prom 14962 - 15021 8.0 18 12 Tu 1 . + CDS 15042 - 16085 352 ## gi|253580543|ref|ZP_04857808.1| conserved hypothetical protein + Term 16102 - 16143 0.2 - Term 16003 - 16040 1.3 19 13 Op 1 . - CDS 16116 - 16547 246 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 20 13 Op 2 3/0.000 - CDS 16578 - 18317 942 ## COG4219 Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component - Prom 18401 - 18460 5.2 21 13 Op 3 . - CDS 18613 - 18981 385 ## COG3682 Predicted transcriptional regulator - Prom 19019 - 19078 6.3 - Term 19068 - 19107 6.6 22 14 Op 1 . - CDS 19112 - 19489 269 ## EUBREC_0802 hypothetical protein 23 14 Op 2 . - CDS 19539 - 19733 69 ## EUBREC_0618 stage III sporulation protein D - Prom 19863 - 19922 2.4 - Term 19885 - 19928 8.0 24 15 Tu 1 . - CDS 19942 - 21168 1449 ## COG3633 Na+/serine symporter - Prom 21210 - 21269 9.4 - Term 21284 - 21339 15.2 25 16 Op 1 6/0.000 - CDS 21411 - 23315 1380 ## COG2200 FOG: EAL domain 26 16 Op 2 . - CDS 23305 - 24978 1113 ## COG2199 FOG: GGDEF domain - Prom 25094 - 25153 6.4 - Term 25080 - 25135 0.2 27 17 Tu 1 . 
- CDS 25166 - 25801 642 ## COG2357 Uncharacterized protein conserved in bacteria - Prom 25928 - 25987 6.2 - Term 26073 - 26120 3.1 28 18 Tu 1 . - CDS 26152 - 27297 1530 ## COG1454 Alcohol dehydrogenase, class IV - Prom 27359 - 27418 10.5 - Term 27377 - 27424 8.0 29 19 Op 1 . - CDS 27487 - 28065 494 ## EUBREC_1092 hypothetical protein 30 19 Op 2 . - CDS 28109 - 28513 302 ## COG1895 Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN 31 19 Op 3 . - CDS 28506 - 28841 317 ## Dhaf_4164 DNA polymerase beta domain protein region - Prom 28907 - 28966 8.4 + Prom 28922 - 28981 10.4 32 20 Tu 1 . + CDS 29029 - 29529 527 ## DSY4699 hypothetical protein + Term 29565 - 29622 15.4 - Term 29553 - 29610 15.4 33 21 Tu 1 . - CDS 29710 - 30801 1176 ## COG3608 Predicted deacylase - Prom 30908 - 30967 5.4 - Term 30912 - 30956 9.1 34 22 Op 1 5/0.000 - CDS 30988 - 31275 298 ## COG0640 Predicted transcriptional regulators 35 22 Op 2 . - CDS 31427 - 33292 2210 ## COG2217 Cation transport ATPase - Prom 33343 - 33402 3.2 - Term 33395 - 33433 -0.7 36 23 Tu 1 . - CDS 33445 - 33663 422 ## EUBELI_01434 hypothetical protein - Prom 33730 - 33789 7.8 - Term 33788 - 33841 1.1 37 24 Op 1 . - CDS 33877 - 35232 1214 ## COG0534 Na+-driven multidrug efflux pump 38 24 Op 2 44/0.000 - CDS 35232 - 36203 681 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 39 24 Op 3 44/0.000 - CDS 36196 - 36978 306 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 40 24 Op 4 49/0.000 - CDS 36990 - 37817 1014 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 41 24 Op 5 38/0.000 - CDS 37822 - 38760 898 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 42 24 Op 6 . - CDS 38825 - 40387 1934 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 40491 - 40550 2.1 - Term 40640 - 40681 1.5 43 25 Tu 1 . 
- CDS 40859 - 42067 1100 ## gi|253580567|ref|ZP_04857832.1| predicted protein - Prom 42156 - 42215 4.9 + Prom 42658 - 42717 3.8 44 26 Tu 1 . + CDS 42839 - 43939 121 ## DSY0900 hypothetical protein 45 27 Op 1 . - CDS 44027 - 45388 842 ## GALLO_0719 hypothetical protein 46 27 Op 2 . - CDS 45381 - 46052 700 ## Blon_0633 hypothetical protein 47 27 Op 3 . - CDS 46049 - 46801 488 ## Blon_0632 hypothetical protein 48 27 Op 4 . - CDS 46788 - 48581 1517 ## BLJ_1528 membrane lipoprotein lipid attachment site 49 27 Op 5 4/0.000 - CDS 48568 - 50037 1116 ## COG4267 Predicted membrane protein 50 27 Op 6 1/0.000 - CDS 50018 - 51460 912 ## COG0438 Glycosyltransferase 51 27 Op 7 . - CDS 51466 - 53304 1592 ## COG4878 Uncharacterized protein conserved in bacteria 52 27 Op 8 . - CDS 53292 - 54185 523 ## BLJ_1532 membrane protein 53 27 Op 9 . - CDS 54182 - 56290 1188 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 56310 - 56369 8.2 + Prom 56269 - 56328 5.6 54 28 Op 1 . + CDS 56526 - 58640 1728 ## Sterm_3109 protease-associated PA domain protein 55 28 Op 2 . + CDS 58705 - 59346 474 ## DhcVS_847 acetyltransferase + Term 59474 - 59523 10.4 + Prom 59406 - 59465 6.7 56 29 Tu 1 . + CDS 59695 - 60069 454 ## COG3682 Predicted transcriptional regulator + Term 60090 - 60121 -0.7 - Term 60067 - 60120 9.3 57 30 Tu 1 . - CDS 60152 - 60382 358 ## EUBREC_0484 D-galactose-binding periplasmic protein precursor Predicted protein(s) >gi|226332889|gb|ACII01000130.1| GENE 1 3 - 186 210 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579862|ref|ZP_04857130.1| ## NR: gi|253579862|ref|ZP_04857130.1| transposase [Ruminococcus sp. 
5_1_39B_FAA] # 1 61 9 69 163 119 98.0 7e-26 MQIISSYGVELRKQNIPIRQTLEIYRSAVSYLIGIYVQVWEELAEIPDAKRRFNAAEHLV H >gi|226332889|gb|ACII01000130.1| GENE 2 397 - 765 312 122 aa, chain + ## HITS:1 COG:no KEGG:Cbei_3562 NR:ns ## KEGG: Cbei_3562 # Name: not_defined # Def: transposase IS116/IS110/IS902 family protein # Organism: C.beijerinckii # Pathway: not_defined # 1 122 56 175 398 84 38.0 1e-15 MEHTGRYYEVLAHQLSKANLFVSAINPKLIKDFDNDSLRKVKSDKADAVKIARYTIDKWQ NLKQYSVMDELRNQLKTMNRQFGFYMKHKTAMKNNLIGILDQTYPGVNTYFDSPARSDGS QK >gi|226332889|gb|ACII01000130.1| GENE 3 811 - 1440 238 209 aa, chain + ## HITS:1 COG:FN1676 KEGG:ns NR:ns ## COG: FN1676 COG3547 # Protein_GI_number: 19704997 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 18 204 205 384 391 62 25.0 7e-10 MSKNAFINHYQNWCKRKKYNFSQSKAEEIYGKAKELVPVLPKDDITKLIIKQAVDQLNSV STTVESLRTLMNETASKLPEYPVVMAMKGVGTSLGPQLMAEIGDVSRFTHKGVITAFAGV DPGVNESGSYEQKSVPTSKRGSSVLRKTLFQVMDVLIKTHPQDNPVYQFIDKKRTPGKPY YVYMTAGANKFLRIYYGRVKEYLSSLPES >gi|226332889|gb|ACII01000130.1| GENE 4 1585 - 1923 481 112 aa, chain - ## HITS:1 COG:BH1874 KEGG:ns NR:ns ## COG: BH1874 COG3534 # Protein_GI_number: 15614437 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Bacillus halodurans # 11 110 12 111 498 180 76.0 6e-46 MTKLVINPLKKQSKINKEIYGHFSEHLGRCIYEGIYVGEDSPIPNKNGMRTDVVEALKHI RIPVLRWPGGCFADEYHWKDGIGPKETRKKMINTHWGGVVEDNSFGTHEFFE >gi|226332889|gb|ACII01000130.1| GENE 5 2196 - 3032 968 278 aa, chain - ## HITS:1 COG:lin2527 KEGG:ns NR:ns ## COG: lin2527 COG0627 # Protein_GI_number: 16801589 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Listeria innocua # 1 263 1 252 252 169 35.0 7e-42 MALMEVNFFSKALMRPVTMNVILPADKVFFGEETEEENKPFKTLYLLHGVMGNYTDWVTG TCIKRWAEEKNLAVVMPSGANMFYMDHPNANENYSEFIGKELVKITRRMFPLSHKKEDTF IAGLSMGGYGAIRNGLKYHDTFGYIAGLSSAMILEKMGTADDSSPMFFEKKSFLESVFGD 
LSRIKDCEINPEWIAENMKKDGIPFPHMYLACGLDDPLLPPNRKFRDTMQKLGVDVTYEE GPGAHEWDFWNRYIKKVLDWLPLDKDSKEGMNSGNVGL >gi|226332889|gb|ACII01000130.1| GENE 6 3214 - 4689 683 491 aa, chain - ## HITS:1 COG:no KEGG:CKR_1533 NR:ns ## KEGG: CKR_1533 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 2 318 43 358 359 362 52.0 2e-98 MAPFPELLDVGIMGHCIHGKRGLCMAAGVECYQNGMYSNLQNMSLANFEKIAKQCGGKVY QFALGGCGDPDQHEDFKNILKICREYEITPNFTTSGLGMTKELAQLCKEYCGAVAVSWYG GDYTVNAIELLIKAGVKTNIHYVLHNESIKEAMYRMKEGKFPKGINAVIFLLHKPVGLGT REKVISVNNEEYIQFIKYISEEKLEYKVGFDSCTVPAFANHPGKIDTDSLDTCEGARWSA YITPDMKMLPCSFDNQEQRWAVDLNVYTIQEAWDSAEFDQFREKFRNACPGCEKRNLCMG GCPIRPEIVLCSNKQSFEQDILITKDMNEKRLVVIPHVLFKGKRNVPWKEVEKYLIRYVG KIFEVAETEDFICIDKIFSDEYTGSVYTRKLKGALPKVKANMSQGIPQMIEIATEKRWKE DFENKHKKKAGNGWFRYSTRFALPVMNEKGDILDYNVYQAVLIVRYAADKKLYLYDIQNI KKETRYPSWTE >gi|226332889|gb|ACII01000130.1| GENE 7 4811 - 5185 287 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580531|ref|ZP_04857796.1| ## NR: gi|253580531|ref|ZP_04857796.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 124 1 124 124 211 100.0 2e-53 MKIRMDFVTNSSSSSFILARNERLNEKQKDKIIEYVEKTFLGKRILTPESTEEEIQKILD ENVFGEEERDAVRKALHDGKMIYSDCVCFEDCLYNYESVYEDIWEIMQENSDGDFEEIDG DLSY >gi|226332889|gb|ACII01000130.1| GENE 8 5313 - 5630 322 105 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01592 NR:ns ## KEGG: EUBELI_01592 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 5 103 4 102 107 105 56.0 5e-22 METENYGSLIKNLRQKMGLTQNQVADSLGVTPGYISNVENNRTAMSLRILTYYARLTGCS LDSLVGELDPEYSETAIDRKLYQNIIKLDVETKEKLLKTLEIWSK >gi|226332889|gb|ACII01000130.1| GENE 9 5958 - 7313 1044 451 aa, chain - ## HITS:1 COG:SA0057 KEGG:ns NR:ns ## COG: SA0057 COG1961 # Protein_GI_number: 15925764 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 1 427 67 497 542 104 25.0 3e-22 MMADAEHGLFDMVVVKDISRFARNTVDLLQNVRKLKALGIETQFLTANMTSMGNSEFVLT IFGALAQEESANTSKRVKFGKKMNAEKGRVPNIVYGYDKTIGDYFNLAINEEEAAVIRQI YKWYIEEGYGAAKIANMLNAKGIRTKRNCQWSQNATCRILTNELYTGKIINGKQEVADFL TGQRTEKDETEWMVTERPELRIIEPEMYEKAQEILECRSKTFKLTKERQSNKYLFSTLIK CKECGWSFRRTVRTYKNTYIRWVCSGHNGRGADSCPNAQTVDEEELIEVLSEYFSELLKA KKNVIRHVTGEFQRVYKAKDENLNYEKELNAQLHKLKKMRQKYMDMYTDDLITREELNEK IGGTKQEIERLENELKMVAYHLTKGEQLETILNRTFKEIEDISDVHQMTNAQLKRIIQKI EVDKDGNVDIYLRIFGDLGLDETVLISDNQT >gi|226332889|gb|ACII01000130.1| GENE 10 7607 - 8689 447 360 aa, chain - ## HITS:1 COG:no KEGG:DSY0720 NR:ns ## KEGG: DSY0720 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 131 357 1 226 230 179 40.0 2e-43 MQIGLECFLDEQLSSMIASENRHGDCEIQHKTDCIIYDTEEDHYLEEYLEEIMDAFTVAK HLKVAESDVRADYLKNFLSKWKVFSVTGDDIQQIITAICSERYQDEPELFDKKVTIREFF SADTMEQQCILKTYNWDDFCYNIKHVNRFHSQQVNFDQLENLLKNMVIDIPKGTLKLFRS RICDEDSYTSGYSTRKMGVPPVALTTAGRTNSEGIQCLYLAGDEETTFHEVRARDYDHVS VGEFIQTKDLRIVDLSLFDKIGPFSIPDFDMTWFAINIEIIRKIGDEVAKPMRRFDRALD YVPTQYICDYIKHLGYDGIKFKSTLVDGGTNYAIFNEKKFECTNVKVVQIGNIDYNWSPL 
>gi|226332889|gb|ACII01000130.1| GENE 11 8692 - 9624 587 310 aa, chain - ## HITS:1 COG:no KEGG:SAOUHSC_00164 NR:ns ## KEGG: SAOUHSC_00164 # Name: not_defined # Def: hypothetical protein # Organism: S.aureus_NCTC8325 # Pathway: not_defined # 1 307 1 300 307 219 42.0 1e-55 MYFPYLRGRQFELIALRELLEGKRISEKVIPIIEPVKPSSTLLKTLETFVKNDREIAVVF NPTVGDFAKKLKEMREEDSKVANELYDLLTQNDKVIKSYIMDRGILSEIKSEASKNKYLI INLNRDCLDDFLDAYEDTLPRFTLIPDDRAFRRVIPDSKVLFEDNFNKQSRNIDYIDNQD EFFSDSHLYYQNENYVGFADYSVVGEEFNESGFAPVAVAIHVVYFDKKKELRIHHFVSDS NEGIEDPGGKFGEALEKLVYWCDENNVQNTLGLQGFYDCYSSGKYPGLGTVKKYSIMHHI ELVSNFLGGK >gi|226332889|gb|ACII01000130.1| GENE 12 9626 - 10486 275 286 aa, chain - ## HITS:1 COG:no KEGG:DSY0719 NR:ns ## KEGG: DSY0719 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 8 285 7 284 284 259 50.0 6e-68 MEKNKLNILNKIFSRSTFRNCFDNGYDKTYSQVVRRYVDNGTGKQNSELISQIYNVLREG YRNEYYYKNTLLNKLLLGIHSVNTTTALTEIPIAKSKADFVLINGKAVVYEIKTELDNFD RLENQINDYYKAFDHVAVVTCKENLQVLKKKIEMIGKPVGIYILQKRGTITTIQKPQAYS VELDAEILFKILRKQEYEEILFNKYKHLPDVSEFKYYSECKKMFLEIPLEEAYLSVLKLL KKRSQIIKDEFSKIPYELKFLAYFMNLKSDDYKKITKFLNESYGGV >gi|226332889|gb|ACII01000130.1| GENE 13 10614 - 10790 167 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580538|ref|ZP_04857803.1| ## NR: gi|253580538|ref|ZP_04857803.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 58 1 58 58 94 100.0 3e-18 MKKRKLNYRFHNPNTAEVTADYLLKILIEANASKVEQAIQEAANELPEQMNGTEGHST >gi|226332889|gb|ACII01000130.1| GENE 14 11098 - 12504 993 468 aa, chain - ## HITS:1 COG:L109011 KEGG:ns NR:ns ## COG: L109011 COG5545 # Protein_GI_number: 15672499 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Lactococcus lactis # 34 372 45 383 480 138 27.0 2e-32 MNREFQTLSPQTRQELCEVMDIMETELTVEEIREQLEMTQKGAVKNNRHNCKLILEHDPL LKDVFRHNILTEQTDIVKPVWWERISPAFTDMDLNYIMLYLEETYGLTMDKIVQKSIVHQ ADRNKYHPVRDYLNSLQWDGQERIRYVLHHFLGAPVDELTYESMKMFLLGAIARAFRPGI KFEYMLCLVGGQGVGKSTFFRFMAVKDDWFTDDIGKLDSEKVYCQLRGHWMIEMSEMVAT ARSKSIEETKSFLSRQKETYRDSYAVYALDRPRQCVFGGTSNIKRFLPFDRTGNRRFVPV QTNRAEMEVHILENEKESRQYIDQVWAEAMMLYRNGNFKLAFSKETETQLDKLRQEFMAD DTEAGMIQAWLDEHEDRKVCSLMLFKEALDNPYVKPKKAETDRICEIMNTSIVGWKQGTM TRFKDYGTQRSWVCVNENCKRDAKDLKNENDWHPITEKEARQVELPFQ >gi|226332889|gb|ACII01000130.1| GENE 15 12356 - 13060 256 234 aa, chain - ## HITS:1 COG:RC1330 KEGG:ns NR:ns ## COG: RC1330 COG0358 # Protein_GI_number: 15893253 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Rickettsia conorii # 1 125 15 136 595 72 32.0 9e-13 MNIFEAVKQSVTTRQAAESYGIRVNKNGMAVCPFHRDKNPSMKVDRRFHCFGCQADGDVI DFTAHLYNLKPKEAAEKLARDFSIHYENMGHSPPQRKQIKRQLTQEQRYLQAENRYFRAL ADYLHLLKQWKEECAPKQVEDVWHPLFMEALEKISETEYLLDTLLSGTLEERVAVVVAHG KEVTAIEQRISDFITTDTTGIVRSNGYNGNGVDSRGNQRTTGNDTEGCCEKQSA >gi|226332889|gb|ACII01000130.1| GENE 16 13082 - 14320 719 412 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1768 NR:ns ## KEGG: CDR20291_1768 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 81 412 1 333 333 450 73.0 1e-125 MKATRHNGRSGKNGTYNPRHNDRRFDLENSEHIDAGRVRENIYWDCYRGYTTMQNREEDS QDISFEKIECTFYQEHYGKYIEAQNERNARTRHTERNRSVEDLLKNNKTCPEETIFQIGT VEESVSYEILLAVVQDFTKQFTNRFGSHVHILDWALHLDEGTPHIHERHVFDCENKYGEI 
CPQQEKALEELGIALPKPDKPKGRNNNRKQTFDAICRTMLFDITRKYGLHLEEEPSYGGR DYLEKQDYILFKQKEQMKNQTEILDTLTLKIEEVEALIDEVSDIAYDKAVEVVTDEVRVE THKEDIRLVEQSKAWVLSPERKASQKERTYAVSRLDGVITKITKAMQTAVKTIQTTLIKP EVKKVNTEQIKKKARSSILDRLAKAKITAEQENRERWEHEGRTKLGRNDMEL >gi|226332889|gb|ACII01000130.1| GENE 17 14501 - 14725 146 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580542|ref|ZP_04857807.1| ## NR: gi|253580542|ref|ZP_04857807.1| DNA binding domain-containing protein [Ruminococcus sp. 5_1_39B_FAA] # 1 74 1 74 74 150 100.0 2e-35 MSENKRLNNYDELPLVLDVADIQRIMGISRVTAYELVHTPGFPAFRSGRLIKVSKIAFFE WMAKGNGRVPENSN >gi|226332889|gb|ACII01000130.1| GENE 18 15042 - 16085 352 347 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580543|ref|ZP_04857808.1| ## NR: gi|253580543|ref|ZP_04857808.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 347 1 347 347 681 100.0 0 MAYQFIRQEPCYDHILFVLALDRDKMKEQILIGTEQQIRFRLDGNKDNDVIYDITRPLGR FLIDFEYDKEKNWNIYGLAPLRDALHTNRWKQPALEQTAGDFLAKKYLTGDPVRMYATFR IWNEYLVTREYRDRNTACDRFIEKIQILTQAFQTENVMNFNSDTGKPERFHTGSLYFRNT PAEITRLELWFPDNRRRTECVAAYSSFYPLITYYLNRLNDWGLCFRQCKVCGKYFLAKSQ RYELCSDKCRKAQALQNKREFDERARENNYDLLYKNECQNWRNKINKSQKTANFPIERLE EMKNAFDLFKKEALQRKKAVKAGATSPTEFTDWLYQQSNIILKLSKI >gi|226332889|gb|ACII01000130.1| GENE 19 16116 - 16547 246 143 aa, chain - ## HITS:1 COG:BS_ycbL KEGG:ns NR:ns ## COG: BS_ycbL COG0745 # Protein_GI_number: 16077324 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 47 138 134 225 226 105 50.0 4e-23 MGKILLVSFEGAEEDLVNSVISVLDSGREKFLFQEIQFQSKIWHKGIAIDLKRREVVRDN RKIELTYTEFEILQLLAQNPGRVFSKEQIYDIVWKESYFGDYNIVMSHISHIREKIEDDP GNPVYIQTVWGVGYKFNEKAGSE >gi|226332889|gb|ACII01000130.1| GENE 20 16578 - 18317 942 579 aa, chain - ## HITS:1 COG:CAC3437 KEGG:ns NR:ns ## COG: CAC3437 COG4219 # Protein_GI_number: 15896678 # Func_class: K Transcription; T Signal transduction mechanisms # Function: 
Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component # Organism: Clostridium acetobutylicum # 3 254 107 366 541 98 24.0 3e-20 MVQRTESISILSVIWLAGLLLCFGFFAVSYIKCYREFRFSLPVENDILEAWKEKHPLKRS LSIRQTETIAAPLSYGVIRPVILMPKNTEWKNIYQLRYVLEHEYVHIRRLDMLTKLIMIA AVCIHWFNPLVWVMYILFNRDLELSCDETVVRRFGMDIKSVYATALISMEEKKSGLTPLC NSFSKNAIEERIRAIMKIKKTSKFAVIISAVLVICVTGGFATSASSLEKKTETAQENGET TVALNEVNIREDESLSSSDVEWWTAEEYAKWLDEEKEVLQSMIGEKAYTGGDGWFVWTQE KVDETIALYEDNLQKIKDGMKLSKSSDDAVGITMAYSPENIEYAKQEAETVTENKDSNEN VFSEEQLSEYAKAGITYQKETGFLMYDGKTIGYFRDEFKPGTYTISSKRGGTLRVEVQRE NYGTITDVKAEPLSDDFWSEPAVLVESSGGEAVTADEMKGSVFEEGGSENIAADDMGEYS SEEGKGLNIAVPQEYADYGVSCDAQGNWVYNGKIIADLYDEGRGIFSNSNGTMYIEVTRD KSGKISSFQKVSKNRMQELFTEFNPEAETFDGYTSEAKH >gi|226332889|gb|ACII01000130.1| GENE 21 18613 - 18981 385 122 aa, chain - ## HITS:1 COG:CC1640 KEGG:ns NR:ns ## COG: CC1640 COG3682 # Protein_GI_number: 16125886 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Caulobacter vibrioides # 1 122 18 141 144 69 31.0 2e-12 MTEENIKLFDSELKVMDVLWKDGDVPAKYVADTLNREVGWNKNTTYTLIKRCINKGAIER TEPGFMCHALIAKEQVQDLETDELIDKIYDGSVDKLFAALLGRKKLSMEQIKNLKQIVSE LE >gi|226332889|gb|ACII01000130.1| GENE 22 19112 - 19489 269 125 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0802 NR:ns ## KEGG: EUBREC_0802 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 124 1 124 125 173 69.0 1e-42 MKDLKTRITENGIDYVLVGDYYIPDLKLPEEKRSIGKWGRMHREYLKKQCYGRYSSLLLT GKLWSYLEDLDEQAEERFTCIVDQMKAAEGVTEELKRKDAMLWVQRCNNIRNRAEEIVLN EMVYI >gi|226332889|gb|ACII01000130.1| GENE 23 19539 - 19733 69 64 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0618 NR:ns ## KEGG: EUBREC_0618 # Name: not_defined # Def: stage III sporulation protein D # Organism: E.rectale # Pathway: not_defined # 22 63 6 47 95 63 73.0 2e-09 MLHIREFLCIHCKNSVFGGLSVKDYIEERAVEIANYIIDTKATVRQAAKKFGISKSTVHM EVTI >gi|226332889|gb|ACII01000130.1| GENE 24 19942 - 21168 1449 408 aa, chain - 
## HITS:1 COG:VCA0036 KEGG:ns NR:ns ## COG: VCA0036 COG3633 # Protein_GI_number: 15600807 # Func_class: E Amino acid transport and metabolism # Function: Na+/serine symporter # Organism: Vibrio cholerae # 7 392 9 395 405 396 58.0 1e-110 MKKIWDKWTRIALVKRILVGLILGAILGIAVPQATGIAILGDVFVSALKAIAPLLVFFLV ISSLCNAGKSHGGVIKTVIILYMFSTVLAAVIAVFASMAFPVKMTLANAATDTSAPQGIV EVLNNLLLNVVANPVSSLVNANYVGILMWAVLLGLAFRAADKMTKKVLADVADGISMVVT WIINLAPFGIFGLVFNTVSTNGLDIFTTYGKLLLLLVGCMLFIYFVTNPLLVYWCIRQNP YPLIFQCLKRSALTAFFTRSSAANIPVNMKVCEEMGLDRDTYSVTIPLGATINMDGAAIT ITVMTMATAFTLGIHVDIPTAIILSLLAALSACGASGVAGGSLLLIPMACSLFGISDDIS MQVVAVGFIIGVVQDSVETALNSSSDLLLSASAEFRQWRLEGKEIKFK >gi|226332889|gb|ACII01000130.1| GENE 25 21411 - 23315 1380 634 aa, chain - ## HITS:1 COG:AGl2664_2 KEGG:ns NR:ns ## COG: AGl2664_2 COG2200 # Protein_GI_number: 15891442 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 374 629 6 257 262 169 37.0 1e-41 MNYNINFQVAAVIITALLLYHFLTQKKLHNANVKTFTYILVLSGLYILSDLLGTLTIMNY TAEDEGTVMGILTGIYLLDILIPYILYSCIPDSHENEKKSGALSVICAGITVAMMIAVLG NLGSGGFFCFDSNGVFQKGTGYVFLYLYALVYIVLIMQRMIRSREDYTPEKMSIAGEFLV IEGVCIGVELYTGYIFLSDFGLALGLIFLYLMMNNPGDYIDSTTGVFDKRYFDNWIQEKF TKGIEFHVIAVELFMLKQINQVYGSSTGDLLLVQIARELQNITGSVQVFRTTGNCFLIIT DSLTEYEKKRQEIENYFKEPFETDGEKITFPAIICGIINGEKMEKEDVLLAYIEYLISLV KRADETVVIQSDDRILEGFRYEKEVEHFLKTAVDKDLFEVYYQPVFWTKEDRYITLEALS RLKHPCMGMIPPDVFIGIAERQGLIAQIGLLQFRRVCRFIKENEYMMKQIKNVKFNLSPS ELLKPGHSQLLINIVKEHELLPKYFQFEITETVATEYSESFCKAVEDFTNAGIGLCLDDF GSGYANLNAVLRLPFDVVKMDRSLLTGITCDEQAAVFYHSIVTVMQNMGYTIVAEGAETE EEVSLLREWGVDMVQGYYFSRPLPEAELLRLFML >gi|226332889|gb|ACII01000130.1| GENE 26 23305 - 24978 1113 557 aa, chain - ## HITS:1 COG:CAC1566 KEGG:ns NR:ns ## COG: CAC1566 COG2199 # Protein_GI_number: 15894844 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Clostridium acetobutylicum # 401 556 104 252 254 81 33.0 4e-15 
MKYNQKKKKENIEIFLFIMVMAVFAAVLGVMSLGEKMQPVNQIRKISSGWYYYDNGKRTE VTLPDTIQAEKGEELILYNDGITDLDAGKVVTTRGAQYDLKIWLGDRPIYEYQDTTFVRN TQMKSKLECVGEIPTDMQNEPLKLVYSNPHHGKYVLTSVYIGTGSAVIALHLQNSGIVIG IALCFLVLSVISLLITVYLKYRQMPDARFRDVALFLMICAVWLITDSSAIQTYSSHPDML CTISFYMFMLHSVPMLHFAQKIGGLKKERILDAGIAVFYLNAFIQGLLAYFGVFTFADML FVTHVLLITWVLIVAVLLWKEYRKKPDRSVQIILIAYMILLFSGLLSLSLYWLFEISYYG AIFEFGILVFLVMIIADTVISLVGKVRYRTEMQAYERLMKEDWMTGMQSREPFENLLAEI PKTMNEHKDILLVFMDIDYLRRINNDFGRAAGDEVIVAAARCIENVFGAIGKCYRTGGDE FAVVVFDPEEDRNALSEKLDKEIRRYNRNSRYRMSISRGFSSIRDDRGKLKTTGEWKYEA DTNMYQNKMKERKTHEL >gi|226332889|gb|ACII01000130.1| GENE 27 25166 - 25801 642 211 aa, chain - ## HITS:1 COG:CAC3340 KEGG:ns NR:ns ## COG: CAC3340 COG2357 # Protein_GI_number: 15896583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 191 20 207 217 155 44.0 6e-38 MTDEEYYEFIQPYEDAKQMLLTRLDVLNHNLYGEASARPIHNIQCRIKKKQSIEEKLQRK GKEPTVMNAKDYLQDIAGVRVICYFVDDIYNLTELLKSQSDLIVIKERDYIGNPKPNGYR SYHVIVGVPVYCLDGMEYFPVEIQFRTMSMDFWASMEHRISYKKERTDKEQLKEELREYA NMLVEIESRFEQYNETDHQQEKRTAEVKQSL >gi|226332889|gb|ACII01000130.1| GENE 28 26152 - 27297 1530 381 aa, chain - ## HITS:1 COG:ECs3659 KEGG:ns NR:ns ## COG: ECs3659 COG1454 # Protein_GI_number: 15832913 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 1 377 2 379 383 449 59.0 1e-126 MANRINLNQTSYHGAGAIEEIANEAKAHGFKKAFVCSDPDLVKFHVTSKVTDILEKNGLA YELYSDIKPNPTIENVQHGVQAFKNSGADYLIAIGGGSSMDTSKAIGIIIANPEFEDVRS LEGTAPTKKPCVPIIAVPTTAGTAAEVTINYVITDVERKRKFVCVDPHDMPIIAVIDPEM MSSMPKGLTASTGMDALTHAIEGYTTKAAWEMTDMFHLKAIELISKSLRGAVENTKEGRE GMALGQYIAGMGFSNVGLGIAHSMAHTLGAVYDTPHGVACAMMLPIVMEYNQECTGEKYR EIARAMGVKGVDEMSQDEYRKAAIEAVKKLSVDVGIPTKLEAIKEEDLQFLAESAHADAC APGNPKDASVEDLKDLFRKIM >gi|226332889|gb|ACII01000130.1| GENE 29 27487 - 28065 494 192 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1092 NR:ns ## KEGG: EUBREC_1092 # Name: not_defined # Def: 
hypothetical protein # Organism: E.rectale # Pathway: not_defined # 13 191 5 183 183 202 54.0 7e-51 MKLTAENSKGEKMKYRLARSMKECMKTMSVDNITVKQITENCGVTRQTFYRNFMDKFDLI NWYFDKLLAKSFEHMGMGKTVYDALVKKFTYIQEEHVFFAAAFKYDSQNSLRQHDFELIL DFYENLIREKTGRIPDETIHCILEMYCHSSIYMTVKWVLGELECAPEGLAKILVDGMPGK LSELFEKLEILS >gi|226332889|gb|ACII01000130.1| GENE 30 28109 - 28513 302 134 aa, chain - ## HITS:1 COG:TM1000 KEGG:ns NR:ns ## COG: TM1000 COG1895 # Protein_GI_number: 15643760 # Func_class: S Function unknown # Function: Uncharacterized conserved protein related to C-terminal domain of eukaryotic chaperone, SACSIN # Organism: Thermotoga maritima # 3 129 2 128 132 59 27.0 2e-09 MNDKKEDLCIYRIRNAVETLGVAVLCLESQHYKDSINRSYYAAFYAIKAVLALEEIDFKR HKDAVAYFNKTYIAKEIFPREMGKRLGRLKQERENSDYDDFYVASLKDASDQYKSAKLII EKIEEYLSEKGIQY >gi|226332889|gb|ACII01000130.1| GENE 31 28506 - 28841 317 111 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_4164 NR:ns ## KEGG: Dhaf_4164 # Name: not_defined # Def: DNA polymerase beta domain protein region # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 111 1 111 113 97 41.0 1e-19 MPENIRNIIYRFSQELRRILGDKLTKIIVYGSYARGDFRENSDIDIMILVKMSDEEIRLV KNDIYDLAFEFEINTGIEFSPIIKNEDQYEYWIDTLPFYRNVRDEGVVIDE >gi|226332889|gb|ACII01000130.1| GENE 32 29029 - 29529 527 166 aa, chain + ## HITS:1 COG:no KEGG:DSY4699 NR:ns ## KEGG: DSY4699 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 9 133 14 138 182 68 35.0 5e-11 MDLIKEALSMGFADAAIMDTKDLVFVPEYRQFCEDNLCGNYNLVPACPPACGTVEEMQAK ALKYEKALVLQTVLKDPIMDPVLFKQAKHAQNILTEQLARQMQEAGKEDILIMSAGPYKN CSCMSAYSVDAQKMADAVGMVCWADDSDVRFFTQILFHEDYSKPHK >gi|226332889|gb|ACII01000130.1| GENE 33 29710 - 30801 1176 363 aa, chain - ## HITS:1 COG:SMb20435 KEGG:ns NR:ns ## COG: SMb20435 COG3608 # Protein_GI_number: 16264169 # Func_class: R General function prediction only # Function: Predicted deacylase # Organism: Sinorhizobium meliloti # 30 265 38 265 331 93 30.0 8e-19 
MRTLRVADLEAKPGEKVSGFVHIIGAEFGIPVTLICGEKEGETVLISGGVHNAEYVGIQA AMQLADELDPKKIAGNIIVIRLMNRTGFEHRTMSLTYEDGKNLNRVFPGNPNGTLSDRIA YTVVTEFFPKADYYVDLHCGDGFEGLVSYVYCTGAAAPEVAAKSREMAEIAHVDYLVTSM YGTGGAYNYAGSMGIPSILLERGHSSRWCEDLVAEDVHDVKNILRHLRILRGKSHIHGKP PIEVSPVIYEDAPVSGCWYPAKQPGETFKEGEVLGRICDYFGRELFVYRAKMGGIILYQT ISLCIMKDTPMVSYGTWDEDTQGKIEVGCEVCGNEKHKHGHHHHKSEEKHHKRHEHHKHH EDK >gi|226332889|gb|ACII01000130.1| GENE 34 30988 - 31275 298 95 aa, chain - ## HITS:1 COG:CAC2242 KEGG:ns NR:ns ## COG: CAC2242 COG0640 # Protein_GI_number: 15895510 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 94 28 121 122 117 62.0 5e-27 MPEETELEDLADLFKIFGDSTRIRILFVLFETEVCVCDLAKALNMTQSAISHQLRILKQN KLVKNRREGKSIFYSLADDHVRTIINQGREHIEEN >gi|226332889|gb|ACII01000130.1| GENE 35 31427 - 33292 2210 621 aa, chain - ## HITS:1 COG:BS_yvgW KEGG:ns NR:ns ## COG: BS_yvgW COG2217 # Protein_GI_number: 16080402 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Bacillus subtilis # 29 620 113 701 702 664 58.0 0 MNKKQKKMLIRIIVAAVLIVLFSKLPIDGYVRFGLFMIPYLVIGYDILQKAFKGIRNKQV FDENFLMAVATVGAILLGDYSEGVAVMLFYQIGELFQSYAVGKSRRNISELMDIRPDYAN IEVDGKLEQVDPDEVEIGTVIVVQPGEKVPIDGVIIDGVSILNTSALTGESLPRDAKAGD EVISGCINMTGVLKIRTIREFGESTVSKILELVENSSSRKSKSENFISKFAKYYTPVVCY GALALAFIPPIVLLIMGKPAMWGDWIYRALTFLVISCPCALVISIPLSFFAGIGGASNQG VLVKGSNYLETLAQTSYVVFDKTGTMTQGVFEVSGVHHNEISDEKLLEYAALAECSSSHP ISKSLQKAYGKPIDRNRVTDIEEISGNGVTAKVDGISVAAGNAKLMKRLGISYQECHHVG TVIHMAVDGKYEGHILISDILKPHAKEAIAELKKAGIKKTVMLTGDSKRVADQVAKELGI GEVYSELLPADKVSKVEELLHQKSEKEKLAFVGDGINDAPVLSRADIGIAMGALGSDAAI EAADIVLMDDDPLKISKAIKIARKCIHIVYENIYFAIGIKILCLILGALGIANMWMAIFA DVGVMIIAVLNAIRTLFVKNL >gi|226332889|gb|ACII01000130.1| GENE 36 33445 - 33663 422 72 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01434 NR:ns ## KEGG: EUBELI_01434 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 72 1 72 72 105 93.0 3e-22 
MKKTYKIDVDCANCANKMEEAAKNTAGVKDATVNFMMLKMIVEFEDGQDPKAVMKDVLAN CKKVEDDCEIYL >gi|226332889|gb|ACII01000130.1| GENE 37 33877 - 35232 1214 451 aa, chain - ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 441 5 441 445 293 39.0 5e-79 MKKDSLISENIGKALLLFVLPLIAGSLIQQLYTTVDAVIVGQFTGKVGLAAIDSVHTLFK FPINFMNGLATGATIMISRYFGAKDKEGLHCAVRTALLLAGILGIICAAAGAVFSPWLLD IMSVPEEVRAGSLIYCRIYFGGIWAMILYNMMAGVLRAFGDSKKPLYVLVVCSVVNIVVD LILVGLMHTGVGGAAAATVLSQIVSVVLTAYFLVKSDFLEEKIGIRTMKICKEHMGMMVK KGFPLALQSMLFPIANSIVQASVNTMGTDSIAAWGICDKLDLLIWLIADSMVPVLTTYTA QNLGAGKTKRVKKGALIGTGLSVIAVGIISLVLYFGSGLIGPLFIDQKDVPTLIPLVVRY MQMMAPFFVFYAVAEALSGACCGLGETLVPMITTLLTICLFRVVCILLVLPSFKTMECIV WIYIGSWVVSGVAFAVLYLAKVRKLYHKKIK >gi|226332889|gb|ACII01000130.1| GENE 38 35232 - 36203 681 323 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 28 320 37 329 329 266 46 2e-70 MSEKREVILEAKHVTRRFAASHGRTLLANNDINLKMYKGETLGLVGESGCGKSTFMRFLV SLDTPSEGEILYRGTDITKLKGEELRNNRQNIQMVFQDPTLSFNPKMNIRDIVCEPLMNF KKIKKSEKDAVCRKLLEMVELPGDFADRYPHNMSGGQRQRVAIARALALEPEIVVLDEAT SALDVSVQKTVIELITKLQREKNITFGFICHDIALVQLVAHQVAVMYLGNLVEVLPGKDL DMKAMHPYTRALMGAVFDINMDFSKPIESIESEAPSPLDLPQGCPFQGRCEHCMDICKKE NPKLVEVEDGHSVACHLFQKGSK >gi|226332889|gb|ACII01000130.1| GENE 39 36196 - 36978 306 260 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 1 252 278 524 563 122 29 5e-27 MLKLDNVTISYKNVPTVQNFNLDMKEGQIVSLVGESGSGKTTVIRAVLGLLAGGGKVTEG SITFEGEDLLSYTPEQWRKLRGSDISMIFQDSGAMMNPTRKVGKVFTEYILTHENISKKE AWSKGIAMLERMRLPSPDNIMESYPFQLSGGMRQRVGIAMGMTYQPKLLLADEPTSALDV TTQAQIVRQMMKLRDDYHTGIIVVTHNLGVAAYMSDYIVVMKNGRMEDQGTREYILKESK NEYTRKLLEAVPSRGGERYV >gi|226332889|gb|ACII01000130.1| GENE 40 36990 - 37817 1014 275 aa, chain - ## HITS:1 COG:FN1522 KEGG:ns NR:ns ## COG: FN1522 COG1173 # Protein_GI_number: 19704854 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 275 1 274 276 279 52.0 5e-75 MKKKNRFLANKTFVVFSILAVCIILTAVFAPVVTRGVDPLKGSLVDALLPPCKEHIFGTD KMGRDIFSRVIYGARASLSATFGVVALIFLVGTVTGVLAGYFGGVIDAVIMRIADMMLAF PGLVLALAVAGIMGASIKNAIIAIVVVSWTKYARLARSLVMKIRDRDYVSAAIVTGSKTP YMLFRYMLPNALPTLIITAATDIGSMMLELAAMSFLGFGAKPPAPEWGYMLNEGRACMQS APWLMIFPGLAIFVVVVVFNMLGDSIRDILDPRNE >gi|226332889|gb|ACII01000130.1| GENE 41 37822 - 38760 898 312 aa, chain - ## HITS:1 COG:FN1521 KEGG:ns NR:ns ## COG: FN1521 COG0601 # Protein_GI_number: 19704853 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 308 1 308 312 296 54.0 3e-80 MSKKQIIKRLLQIVIVLIGISFITFALTFMSPGDPVRNMYTATGVMPTEEMVQETREELG LNDPFLVQYTRWLKNCLKGDFGKSYSLNKPVTELLAARLWPTLKLTLMAMILMLVISVPL GMLSAIYKDKWIDYLVRGITFLGCAMPNFWVGLLLMLAFCVYINAFPVISSAGDFKSLFL PALTLAIAMSSKYTRQVRTAVLDELSQDYVIGAQARGVKKSKIIWGNVFPNSLLPLITMF GLSIGSLLGGTSVVEVIFSYPGLGNLAVSAITSSDYNLIQGYVLWLALIYMVINLIVDAS YVYIDPRMRLKE >gi|226332889|gb|ACII01000130.1| GENE 42 38825 - 40387 1934 520 aa, chain - ## HITS:1 COG:FN1523 KEGG:ns NR:ns ## COG: FN1523 COG0747 # Protein_GI_number: 19704855 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport 
system, periplasmic component # Organism: Fusobacterium nucleatum # 56 520 64 526 526 239 33.0 7e-63 MKKTKRELLVLAAATTVALSMGVTVQAKGDEKVLNFGCQMYTDGCVNSANDENGGWNAMR YGIAEALFKFDDNMEVIPWLAESYEVNDEHTEWTIKLKDGIKFSDGCDLTPTKVKEYFDH MKEVGPSGSAKPEKYLEFEAEVTVDDDANTINIKTSKPYANLVGQLCHPTMGITDVEHIE NYDNGIIGTGPYKIEEFNGVGVGYTLAANEYYREDVPYDKVNLLFMGDNSAKTMALQSGQ VDLVENITNVADIQDFQDNDDYTVDIASGVRCGFAWMNFDGVLGNKTLRQAILMAIDNDT ICNSRTIGGLYTPGFSVLPSTLNYDYDKLENPYTYDPEKAKQILDDAGIVDTDGDGIREL DGQNINLRYISYENRLLNDFADAQAQYLAEIGIGVVPEYGSSDDQWSKLAALDYDLNNNN WITVGTGDPLAYMANWATDTSYCAYSNPDYDKLYEELKGEMDPEKRKDLIFQMQQILVDD AAVLIDGYYNSSMIYSKKVGSAHIHTADYYWLSTEITPAE >gi|226332889|gb|ACII01000130.1| GENE 43 40859 - 42067 1100 402 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580567|ref|ZP_04857832.1| ## NR: gi|253580567|ref|ZP_04857832.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 402 1 402 402 741 100.0 0 MKKIISFILTMIMLLTCTSISVMASDVQSDTADQDEKMTIGVSVYNLNDAEVREFRNYFE NYVGMAFEVEFLYSSSISTAEEEISFINELHERNVKGIISFLSSDLEEVLPVCEEYGMYY VRGSGTISDEMYEKVKDNPYFLGTIGTSVETEKDAAANMAEYFASEDQNNQNHYIIACGG SAIGNEMHRLRTIGILNKLQEAYGLSYEKSVEELVNTQDTEEIETGSDIKITLVPGYPKD HLDEKLAAPLNTGEYNVILSVVSASDFIKVIDAYEEENSTDVLLGSIDCFTEDTYEMFNK KGYNGKERIDYLVGKYGAIVAPSFVAMKNALEGFAEDYREDGSAFRLQQSFWTADSVDEF NKQYVLSIGMYDNTYSVEDMMGVLKSYTPETNFEEFQKFTEK >gi|226332889|gb|ACII01000130.1| GENE 44 42839 - 43939 121 366 aa, chain + ## HITS:1 COG:no KEGG:DSY0900 NR:ns ## KEGG: DSY0900 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 363 30 391 396 315 43.0 2e-84 MINHLISILISIFISGYHGKTTDFAKNSSCHRTTIAHFLNSGKWDDSLLSDTLKRSVIEI IYSEAARTGKPVFCIVDDTIASKTKPSSQALYPIEDAYFHQSHLKGKQDYGHQAVAVMLS CNGIVLNYAFVMYNKSISKIDIVQNIAKELPVPPVMSYFLCDCWYVSEKIINTFVQKGFH TIGALKTNRLLYPSGMKKKLSELAAERSTTHKGFDLVTVKKRNYYVYRYEGNLSGIENAV VLLSYPEKAFGNPKALRAFISTNVSLSTQEILSCYVCRWPIEIFFRQCKNHLALDTYQIR SSKGIQRYWLLMSLTHYICVTGTGGYRSFQDGYHRICSAVHQEKYQYLFLCAKASDDFEA FMKLAG >gi|226332889|gb|ACII01000130.1| GENE 45 44027 - 45388 842 453 aa, chain - 
## HITS:1 COG:no KEGG:GALLO_0719 NR:ns ## KEGG: GALLO_0719 # Name: not_defined # Def: hypothetical protein # Organism: S.gallolyticus # Pathway: not_defined # 28 452 27 452 454 426 48.0 1e-117 MAKSQKRYLVLLVFGLLVIIAAGVWMVFGRKTQIYEKTEEIFGNPLMGYAPCAWEETIGE DISLLYMDITWAELEPEEGKYDWEKIERENQTDRWREEGKHLVLRFVCDIPGEEKHMDIP QWLYDKTDHAGTWYDMEYGKGYAPDYNNEQMIQYHKRAVNALGEHFGRDGLVSYIELGSL GHWGEWHVNISAGIQSLPEESVRERYMIPWTEAFPDSMILMRRPFSEGEHYGIGLYNDMT GQPESTEEWFRWINEGGEFDQTHEKKGLSAMKDFWKKFPSGGEFTSSITMKELLDTNLDQ TINLIKKAHTTFLGPKVADKTYENGYRKVLGSMGYRLWISQASLIQMPGYVSLNLKWKND GVAPFYEDWPVWVLVEDEDGNSLEKEAVDISLKSLLPEETLQTKTRMQVKKMISLAGKKY KISIGIEDPMTGKLAVRFSMKGTYNEGKNYLFQ >gi|226332889|gb|ACII01000130.1| GENE 46 45381 - 46052 700 223 aa, chain - ## HITS:1 COG:no KEGG:Blon_0633 NR:ns ## KEGG: Blon_0633 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 1 223 1 221 221 294 70.0 2e-78 MSVQDMIKKSVLESGMFSQYDIPKILAALAVALILGSVIYLVYQKFYVGVIFSRSFAVTL VGMTVLTCMVTLAISSNIVISLGMVGALSIVRFRTAIKDPMDLLYLFWAITSGITAGAGM YALTLLTAVIIILMITLFYHKQQNGRIYIAVIHYQGTQAGDEIIRCFGKRKYFVKSKTMR NEKTEMAVEIYCRQNDMEFMEKIRDIGGVEDVTLIQYNGEYHG >gi|226332889|gb|ACII01000130.1| GENE 47 46049 - 46801 488 250 aa, chain - ## HITS:1 COG:no KEGG:Blon_0632 NR:ns ## KEGG: Blon_0632 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 8 243 32 267 274 343 68.0 4e-93 MKKADRKKERFRNEWKYLISTSEKEVLNLRMKPLMKLDPHAETGGYLIRSLYFDDYWNSA YEEKESGVLMRKKYRIRIYNYSAESIKLERKKKFGSYIYKEDAPLTKDEVQKILSGEYEF LLKSPYNLCREFYIECMSNMMRPRTIVDYEREPWIMDEGTVRITFDTDVRAAVGSYDIFD STLPALSVLEPGKLIMEVKFTEMLPQILRNLLPPQAAEFTAVSKYVLCYEKTRYMNGFEY WYDTQGRMIR >gi|226332889|gb|ACII01000130.1| GENE 48 46788 - 48581 1517 597 aa, chain - ## HITS:1 COG:no KEGG:BLJ_1528 NR:ns ## KEGG: BLJ_1528 # Name: not_defined # Def: membrane lipoprotein lipid attachment site # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 4 595 16 603 603 689 59.0 0 
MTGIKKINLISAVLVSLSLCGGCTLEKAGNSSQDQTVQEDNKTEQVKAEKAEKEEINEIH LRDKDSLYENDDDTSVVTMYLTVSKGNSSENTYHTWKEINSYSVYDYEDMGVERYQVAGL LQVGDENGPTEGEVGYGERVPNATVQIRGQTSSQNAQKNYKIELKKNKGTWRGQRTINLN KHMTEGMRFRNKLAYDLIRGIPQMVGLRTQFVHLYVKDNTEEPGGKFEDYGIYTQVEQLN KTALKSHGLDSNGQLYKINSFEFYRYEDIIKKEDDAGYDKTAFEKMLEIKGDSDHTKLID MLTDLNDYSIGIEDVLKEHFDEENIVYWMAFQILMGNVDTQNRNVYLYSPLNSDIWYFIA WDNDGCLMRPEYELRNFSDQNSWEKGISNYWGNILFQRCLKSRSFREKLNDAILDLREYL NKDMLTSKIESYKNVLKPYLYLMPDAEYEPLTSEQYDQIAASLPDEIEKNYQLYLESLET PMPFYIGVPTIDGDKLKFNWDVSYDLDAEDITYSVEVAGDYLFRDVIYQNTTLTVPEAEM DLPKAGQYFVRVRATNTSGKTQDAFDYYVTDEGKYYGMRCFYITEDSTVEEDDYEEG >gi|226332889|gb|ACII01000130.1| GENE 49 48568 - 50037 1116 489 aa, chain - ## HITS:1 COG:CAC0735 KEGG:ns NR:ns ## COG: CAC0735 COG4267 # Protein_GI_number: 15894022 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 472 1 471 478 216 29.0 8e-56 MAGIGFELKKLFSRRGLFASFRAYGYAGIICTGPMLLGIVLLLGVMFLCDRTGASKQSRE LLVCMITYTLLASLTVTSFLSMVVTRFIADMLYEEKNEAVLSSFWGSTGLMLIAGGILYG IFLIFSGVGLIDKFLCFELFGELIVTWNAMSYLTAIKDYRGIMLSFLAAIAVTFLSGALL LFLGISHVEALMAAVCIGYGIMLLWDVVLLYEYFPQSDISAFLFLRWADEFLPLAFTGLC INIGLFAHLVIMWAGPLGKQVKGLFYGAPSHDVPALIAFLTILITTVNFVVSVEVNFYPK YRNYYSLFNDGGTIKDIMQAGTEMLDVLNRELKYTALKQLLTTALAISMGESLLKYLPLG FNDLMYGYFRTLCVGYGIYAVANTMLLLLLYFTDYRGALAASVIFAVGTSVFTVISLFCP QVYYGFGFLAGCVLFYFIVMIRLERYTRRLPYYILSIQPVVAEDKSGVFTRIGCFMDEKL ERRTSVDRN >gi|226332889|gb|ACII01000130.1| GENE 50 50018 - 51460 912 480 aa, chain - ## HITS:1 COG:CAC0734 KEGG:ns NR:ns ## COG: CAC0734 COG0438 # Protein_GI_number: 15894021 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Clostridium acetobutylicum # 1 472 1 468 471 404 41.0 1e-112 MRICLVLEGCYPYVHGGVSTWMHQYITVMKEHEFVLWVIGAHACDRGKFVYELPDNVVEV HEVFLDDALKLKEHGNQNGQLHRINHFSEEETRSLRELMECSHPDWEVLFHLYHDRKMNP MSFLKSEQFLNILTESCLEKYPYIAFADAFHTMRSMLLPVLYLLGSEVPQADTYHAISTG YGGLLACLGGYVYRRPVLLTEHGIYTREREEEIIRAKWVIPSFKKQWISFFYMLSEAIYK 
RAFRVTSLFTNAMLTQIQIGCDAEKCRVIENGIDYDRLSQIPLKEEDGWIDIGAVVRLAP IKDIKTMIYAFYELSSRMENVRLHILGGVDDEEYADECYALVKQLDIKNLVFTGRVDIVQ YMRKLDFTILTSISEGQPLSVLESFAARRPCVTTDVGCCRELLNGKEGDDFGCAGYCVPP MYRDGLAFAMEKMCESRRRRIRMGKSGQARTKAYYRHERMISEYRKLYREVEEAYGRDWI >gi|226332889|gb|ACII01000130.1| GENE 51 51466 - 53304 1592 612 aa, chain - ## HITS:1 COG:CAC0733 KEGG:ns NR:ns ## COG: CAC0733 COG4878 # Protein_GI_number: 15894020 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 69 612 60 600 600 258 31.0 2e-68 MAGMIKYLQSFRFKGLTVILGVFLAISVVLWTERSGIQYQAEIKKKAYLDRDTVVTETQA VKNIESSCLVLMDSSQADSVVAWEQFERIFMDMKTGVDLADIRQSEIPDFSGYETIVVLM SDLNPLKDVVIKIGNWVEKGGRVLFALTLQKDTYVSLIEQKLGITDSDYGNVLVDKIYID DDFMIGGGRSFQIPDAYDSAWEVSVGETAKVYAWTDDEKKVPLIWENSYGKGKFVVDNFG LCEKATRGFFAASYSLLTDVMVYPVLNGSVFYLDDFPSPVPSGDGTYIKRDYGLSIKEFY TNIWWPDMLDMAEEHGVKYTGVIIDNYEDDVSGDVVEQEDVQRFQYFGNMLLHQGGELGY HGYNHQPLSLSNVDYANILPYKTWESYDAMKKAMTELIRFGKEMFPGTELSVYVPPSNVL SDEGREMIVKEFPEIRTIASNYFVGDMAYTQEFEAAEDGIVEQPRIISGAVIDDYMELAA VSELNMHFVNTHFMHPDDLLDEDRGARLGWKKLKKRLDEYMDWLYTSAPCLRNLTASELS GAIQRYGALVIDKDVSDQELNLKLDNFYDEAYIMIRMNEGTPGNIEGGELTHITGNLYLL RAKEKSVKIEIR >gi|226332889|gb|ACII01000130.1| GENE 52 53292 - 54185 523 297 aa, chain - ## HITS:1 COG:no KEGG:BLJ_1532 NR:ns ## KEGG: BLJ_1532 # Name: not_defined # Def: membrane protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 295 1 295 304 202 32.0 9e-51 MIQNITGVVFIIFHIICCVLVWTGINSHLLKVKRYLILPVIFVPVWGMICVLLLHFQIFF HQDKKREIGIEKMKINEEIYRSIIAPKEESDRNIVPLEEVLLIDEAAMRRDLIMNVLNDD PENYIDMLKQARMNDDVEVVHYAITGMVELSKEYESRLQKIEYRYAKEPENQQLISEYCD FLQEYLSQGLLEGQMELVQRNQYIKLLKKKLKFKEDLHTYVCLAENQMQTKEYEQVLKSL ERMDKKWHRNEEYWILRIRYYVELKQGKELKETLEQIQQEHIYLSAKGKEALAFWQE >gi|226332889|gb|ACII01000130.1| GENE 53 54182 - 56290 1188 702 aa, chain - ## HITS:1 COG:CAC0731 KEGG:ns NR:ns ## COG: CAC0731 COG0451 # Protein_GI_number: 15894018 # Func_class: M Cell wall/membrane/envelope biogenesis; G 
Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Clostridium acetobutylicum # 1 692 1 722 725 169 22.0 1e-41 MEVLLAGNTGYVTETFIEESFPECDVVVLGNEMLKSNRKKNILSRPFIKDEKELKEIFET YEFERIVYFSNYLTMHGRMTGELEQLRQILQLCKKNKEIRMLYLTGQESIYNITSGKTLL VSAAENVVMEYGNMYQINTKILRIPHLYSAVYTQDFFYKLFTEAEESGKIVFEESPEQNI YFVCMDDLAELLYKVYDNWGKERCLNVPDCFRQSFSDLEKEIRKTIPGKLDIRYQNSGQI YKVQPDDQIIRHEYGWFPKISVFEDIPGMYQEYKKLSDSDFGHFANIRNWISKNTLLIHI LELISGFILFEFLNKYTGTYAQFKMIDLRLVFIVFMGSLYGINYGISAAALETCSLIAAY RQENVNIYTLFYEASNWIPFIFYFAAGAVCGYIRLKNKEDIEFVTKENRLIKEKFFFMQE MYQETLQDKRQYKKQILGSRDSFGKIFDITRKLDVIQPQKLFMETIRVMEDVLENHTIVL YSLDSWKRFGRMEVSSPEIRRKMPNSIRIEEYQNAVETLEKGEVWRNRELLAGYPAYIAG IKRNGELVMLVFIQDAASSQMTLYYLNLIKILCGLVEVSLLRALEYQEAVRDREYLEGTH ILKTEYFMERLKLFHSIKEQKIGSYALLQIENPEMTLEETERILKNKIRENDILGISEDG QLYLILSQVDDAMLPIVIERLEKNGLHCRVIDTRNESRVAEE >gi|226332889|gb|ACII01000130.1| GENE 54 56526 - 58640 1728 704 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3109 NR:ns ## KEGG: Sterm_3109 # Name: not_defined # Def: protease-associated PA domain protein # Organism: S.termitidis # Pathway: not_defined # 5 697 27 697 698 532 40.0 1e-149 MNIEDSCISCIDTDWSYQLARRMEKEKTNPVLGYRTAGSAAETATGDMLYREMKAIGLTD VTRDTFSLDGWEFEKAVLKFTDDDGQEHTFQMGGYQTNFETDGFEDYELVYLGKGTAADY EGINVKGKLVMVEINQRDEWWISFPVYQAYLRGAAALIAVQANGYGEIAESALNAQDIAG PDFAPAFSLSQADARILKRSLRNKISACEDNTTDSEDTHNTLEQDAALLRRSSLKVSLDA RSRVIPDTTAGNITGKIQGTETDSMILLSAHYDSYFDGFQDDNAAVAMMLGIARSLVKGG YKPAHTLVFCAMAAEEWGIIDSKYDWSTGAYNQVFRVHPDWQGKVIADLNFELPAHAHNR QDAIRCTYEYADFIRQFTDSLTVPKEAYPDGITVLSPIETWSDDFSMAIAGIPSTVNDFS AGPFMQNYYHSQYDNQDVYQEAVYQFHHECYLKLIMAIDRLVLPPMNFSNTMNAVAESIH ENALSFADPEQNVLLRNLQTITASAENIYDKICQINAEYATLSDSDARIAFRHKWEYAGL ELLKTFRKAQDYLVRLNWQDEVIFPQEAASKNIRQLEKASRALSEGNITDALKALYRVDN NMYAFLFDEEVYYHFTEYILHQPKDRLMWGAGRIMHHENLYGIVQKLKQRMEEETPELDS EIRRLEAALRRQKAYYKDDIRYLNTSVNKLSAMLDNIYNNLLSF >gi|226332889|gb|ACII01000130.1| GENE 55 58705 - 59346 474 213 aa, chain + ## HITS:1 COG:no KEGG:DhcVS_847 NR:ns ## KEGG: 
DhcVS_847 # Name: not_defined # Def: acetyltransferase # Organism: Dehalococcoides_VS # Pathway: not_defined # 1 210 2 203 221 65 25.0 1e-09 MESDNLYRVQEEDLPRLQKLLTVCFAQDPLYHTLIPDDDTRERLLPELFSCDLTEFFQTC EIYSDSSELNSILVVSDESEPYNFLHYCLTEARAALTTDGWLIKEDPSLKTFWNFIQGKD YLNSSWTDQLHQTERLHVIYLAVDPGHQHHGLAELLMDEVINYAQKHKMLISLETHNPNN VPIYEHFGFKTYGIVEKPHFGLKQYCMIREATV >gi|226332889|gb|ACII01000130.1| GENE 56 59695 - 60069 454 124 aa, chain + ## HITS:1 COG:CAC3438 KEGG:ns NR:ns ## COG: CAC3438 COG3682 # Protein_GI_number: 15896679 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 6 117 7 115 124 60 34.0 5e-10 MVRNAMSATEFYILQYLWTRETPATFSEIMVHFNEEEKKAWKKQTVNTFLSRLAQKGFLN IDKSGKRAIYIPSVTSKKFYENYAQEIVKDSYNGAIKDFIAAFTAGNKLSAAEKAELLSY IQTL >gi|226332889|gb|ACII01000130.1| GENE 57 60152 - 60382 358 76 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0484 NR:ns ## KEGG: EUBREC_0484 # Name: not_defined # Def: D-galactose-binding periplasmic protein precursor # Organism: E.rectale # Pathway: ABC transporters [PATH:ere02010]; Bacterial chemotaxis [PATH:ere02030] # 6 74 264 332 338 67 49.0 1e-10 MYKTDIDRTDEGIEAVKNEEMLGTVYNDKEGQAREMLNLAFVIATDGDKSDIPLIDGKYV RTPYHKLTLENIEESE Prediction of potential genes in microbial genomes Time: Sat May 28 20:42:39 2011 Seq name: gi|226332888|gb|ACII01000131.1| Ruminococcus sp. 5_1_39B_FAA cont1.131, whole genome shotgun sequence Length of sequence - 36899 bp Number of predicted genes - 30, with homology - 30 Number of transcription units - 16, operones - 8 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 176 - 235 8.2 1 1 Tu 1 . + CDS 271 - 2085 1392 ## gi|253580583|ref|ZP_04857847.1| predicted protein + Term 2111 - 2153 11.1 - Term 2099 - 2141 11.1 2 2 Op 1 . - CDS 2166 - 3911 1341 ## gi|253580584|ref|ZP_04857848.1| predicted protein - Prom 4000 - 4059 4.0 3 2 Op 2 . 
- CDS 4061 - 4966 725 ## BDP_0983 TraG-related protein (EC:3.5.1.28) - Prom 5023 - 5082 10.8 - Term 4990 - 5020 -0.9 4 3 Tu 1 . - CDS 5152 - 5487 293 ## gi|253580586|ref|ZP_04857850.1| predicted protein - Prom 5649 - 5708 8.5 + Prom 5444 - 5503 2.3 5 4 Op 1 . + CDS 5604 - 6191 398 ## gi|253580587|ref|ZP_04857851.1| conserved hypothetical protein 6 4 Op 2 9/0.000 + CDS 6188 - 6898 625 ## COG3279 Response regulator of the LytR/AlgR family + Prom 7431 - 7490 6.9 7 4 Op 3 . + CDS 7619 - 8365 435 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Prom 8394 - 8453 2.3 8 5 Tu 1 . + CDS 8505 - 11135 1066 ## COG4403 Lantibiotic modifying enzyme 9 6 Tu 1 . - CDS 11156 - 11686 222 ## gi|253580591|ref|ZP_04857855.1| predicted protein - Prom 11757 - 11816 7.9 + Prom 11768 - 11827 4.8 10 7 Tu 1 . + CDS 11905 - 12135 238 ## CLB_0787 putative bacteriocin + Prom 12253 - 12312 8.1 11 8 Tu 1 . + CDS 12341 - 14455 1550 ## gi|253580593|ref|ZP_04857857.1| predicted protein + Term 14478 - 14528 0.5 12 9 Tu 1 . - CDS 14465 - 14668 225 ## gi|253580594|ref|ZP_04857858.1| predicted protein - Prom 14793 - 14852 6.1 + Prom 15106 - 15165 9.3 13 10 Op 1 36/0.000 + CDS 15194 - 15910 220 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 14 10 Op 2 . + CDS 15920 - 19696 3867 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 19714 - 19754 -0.7 + Prom 19721 - 19780 4.5 15 11 Op 1 . + CDS 19803 - 21008 563 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 16 11 Op 2 . + CDS 21034 - 22728 1450 ## EUBREC_0259 hypothetical protein + Term 22807 - 22856 0.3 + Prom 23217 - 23276 2.8 17 12 Op 1 . + CDS 23301 - 25955 3556 ## COG0525 Valyl-tRNA synthetase 18 12 Op 2 . + CDS 25968 - 28004 1796 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily + Prom 28018 - 28077 5.8 19 13 Op 1 . 
+ CDS 28115 - 29413 1441 ## COG0285 Folylpolyglutamate synthase 20 13 Op 2 . + CDS 29410 - 29562 219 ## gi|253580602|ref|ZP_04857866.1| conserved hypothetical protein 21 13 Op 3 . + CDS 29573 - 30910 1168 ## EUBREC_2575 hypothetical protein + Term 30977 - 31013 4.1 22 14 Op 1 8/0.000 + CDS 31118 - 31432 273 ## COG1366 Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) 23 14 Op 2 6/0.000 + CDS 31455 - 31880 438 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 24 14 Op 3 . + CDS 31883 - 32596 527 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit + Term 32690 - 32754 13.8 - Term 32678 - 32739 6.3 25 15 Op 1 . - CDS 32768 - 33265 621 ## COG4708 Predicted membrane protein 26 15 Op 2 . - CDS 33334 - 33948 524 ## EUBELI_01606 stage V sporulation protein AA 27 15 Op 3 . - CDS 33955 - 34311 518 ## EUBREC_2431 sporulation protein 28 15 Op 4 . - CDS 34342 - 35424 658 ## Cphy_0480 stage V sporulation protein AD 29 15 Op 5 . - CDS 35494 - 35958 366 ## Cphy_0479 stage V sporulation protein AC - Prom 35983 - 36042 2.6 30 16 Tu 1 . - CDS 36076 - 36495 267 ## EUBELI_01605 hypothetical protein - Prom 36586 - 36645 7.9 Predicted protein(s) >gi|226332888|gb|ACII01000131.1| GENE 1 271 - 2085 1392 604 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580583|ref|ZP_04857847.1| ## NR: gi|253580583|ref|ZP_04857847.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 604 6 609 609 1132 100.0 0 MKNKKMLAMLLALTVTAAGTPVYAADFSDGNLESVQEVAEPEKILTGDTALEEDSNAQEN QESDESGVTSAEETDEFTEEDVASAFTDDTENEVAEAQSSDTESGSEEVTGTEGLEYEYD AETDTYHLVKGVNVESVIIPEQYNGKIVSEIGEKAFAGCDRIKNVRVNRDWDQETMIIRQ SAFEGCIHLRKVMLNRVIKVEKNAFRNCPQLSECVYISIDNLFIDGGEFIERDAFDPDNK VCVYLGEESNNFKGNKQYFVAGLFDEDYVFKESGVTYVCYDPNMTGDMEGISREHEGTRI LDVQDELETVFVKDTNVTGVGRKAFYGSEKIKVVKLGSQNRYIETKAFFDCENLERVYIP DTTNDIADDAFDGCNSLVIYTPKGSYAEKFARKHNIPVVVSATSQDILNDVKPNLRVTKK EKDGVTLKWEISQPYLAEGYEVERKNAQGEYVHCSTIKSSEVNTKDIELWNYSKFYGKNI TFRVKAYFVDEDGTIIYSQPSNTVKTTFCPVQPKITSLTKKAGKKLTVKWKKSDYAIGYQ VWRSVDGKKGVCIKTIKGGDVTSFTDTNLKEGTTYIYWIKSYRMDYQNKKIYSRSNDYSK SIVY >gi|226332888|gb|ACII01000131.1| GENE 2 2166 - 3911 1341 581 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580584|ref|ZP_04857848.1| ## NR: gi|253580584|ref|ZP_04857848.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 581 1 581 581 1093 100.0 0 MSTNAKTRYNISKVFGPAVPDKGSKDSDPSADNMAGIAINGKTAYCAKRSQTSSNIFVIE NIASASFDALNGKTPVSIPYAVHGMTYHNNYLYMTCYKNTIIKMPVSSIGKIAAEYTVTE DGNGYIPQSITYYGTVNNEDLFIVGVEKMQKNGAFFYMIGSLSNGKFVEKKRFYVNNSAG YEQPQDIFYHSKYGLFVVTNKKTDGAFTIYNMLLCAKIPDELSSVENGNTYYPTSEFKFN GNRNYSKLVVKSIYITDDGTLYLSTNATSAPSVLEKYSKDAIFRATNPVFKKDGILEMTL SCNGDTEIPNAVGTYNNSTYSCINPGAFTLDGNTGYCFKTHTTKDTVLRDKISILFKSSD VSSQNFSEAFSPRKLFTELGHCNGMVFVNGFLYVAAYDRYTTPERKQITKINPSTGKIVE TYECDSMFGGIGYYGNNKFIVLNYEQLAGEPAFSQTLHFYIGSFKNGSFDSEKEFTVNNP FYLSKVTVLQDIHYDINHGLFFITHRTDVEISKIYRINPDKITNASVKEPLIPDEVFIPN SSISEIESVCIASSGIMYIVQNKSTSKPDLVSKVSGLTFSN >gi|226332888|gb|ACII01000131.1| GENE 3 4061 - 4966 725 301 aa, chain - ## HITS:1 COG:no KEGG:BDP_0983 NR:ns ## KEGG: BDP_0983 # Name: not_defined # Def: TraG-related protein (EC:3.5.1.28) # Organism: B.dentium # Pathway: not_defined # 8 148 66 198 495 87 43.0 8e-16 MTKIEKAVTWALAIANDNTHGYDQQYRWGPDYDCSSLIISAWQQAGVPVKTKGAAYTGNM KSAFLACGFTDVTSRVNLRTGNGMKRGDVIINTLHHTVMYIGNGRIVAARINENGKVSGG KTGDQTGMEIMVQDYYFYQYGWDCVLRYMKEDATTDISSPSAPAASSESENVRKGQTYAN 
RFTSAQIPVTGQRDTATVKAGIKVFQTALNKDYGAGLVVDGIFGLNTDKALGKHYVTSGE CQYMVTAAEILLLLHGYNPNGVEIPGIFGSGLLTAVKTFQKAQGLTVSGTCDRNTFLKLI R >gi|226332888|gb|ACII01000131.1| GENE 4 5152 - 5487 293 111 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580586|ref|ZP_04857850.1| ## NR: gi|253580586|ref|ZP_04857850.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 111 1 111 111 195 100.0 6e-49 MKKNTIGKRIAEARINLNITQEQLEELSGFSVSTISRFETGRTQPSIENLIKLSKVLNVG IDYFLYDLIPHNETIQSPTVKDAVTVLSQMNERQAQYMLDIIKIYQASHQD >gi|226332888|gb|ACII01000131.1| GENE 5 5604 - 6191 398 195 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580587|ref|ZP_04857851.1| ## NR: gi|253580587|ref|ZP_04857851.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 195 1 195 195 365 100.0 1e-100 MTDADRTDKAEERENMYIMMFTLQSLQNYGLSVKKTVERSLDYLWEWYQESGEERYLLLA KQHMQAYVNMGFALNEQNQTIRDIISILDQTIADFYPKDSLPGKRVKLTKAQIRSMIGRW RPSRENPMTIGELVEDIIKKVKEHQTGRYIYRYCRSDCTSGEDAEIYELVVGREESYFYD VRKFRFYTFMEETKK >gi|226332888|gb|ACII01000131.1| GENE 6 6188 - 6898 625 236 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 234 1 231 234 82 26.0 9e-16 MIRIAILDDEKADLEKEEQITRQYFYNRQAESEIATYQSVEWFLLGLTEEQFDIYILDAE MPVKNGIEVAKEIRKLYPEPVIIYVTNHLNYAVEAYEVNTYCYIAKDTMEKGLNAAYETL LPILLAKEERYYIVKKRSELEKIAYSDIFYVKKEGKCAVIVHRNGETSVRDSLSAVEKAL GSREFIVADRGYLANIRHVMKMKSRELYMRNGNIITVGRERFKHVRKAILDYWKEI >gi|226332888|gb|ACII01000131.1| GENE 7 7619 - 8365 435 248 aa, chain + ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 21 245 202 431 433 68 23.0 8e-12 MNWGYIENRELIISFGVISGIALILLFVLARFWAKSIYTEKNILEVRNNVIASQYEELNE 
SYKKYRCLIHDERHMMDYIEECIGTGNFSEAQKIIKKSRNKFSERYYWTGITVLDNVISI EKRKMDNMNIEFHIETDVTDIVMEDIDIIILLENLFDNAIEAVEKCANKKEIKFSIKNIN SMLVLKLWNSSCKKPKVKKERFVTDKHDSKGHGWGLESVKYIVKKYDGIIEFEYTEMFFQ TVIMIEGE >gi|226332888|gb|ACII01000131.1| GENE 8 8505 - 11135 1066 876 aa, chain + ## HITS:1 COG:BH0455 KEGG:ns NR:ns ## COG: BH0455 COG4403 # Protein_GI_number: 15613018 # Func_class: V Defense mechanisms # Function: Lantibiotic modifying enzyme # Organism: Bacillus halodurans # 2 871 147 1030 1059 288 26.0 3e-77 MLENICIRILILDISEKKSNHELYGFTVHEEYNFYIKHFWKNESYKCDLAIRYAEGSRLQ KLKEKYFVILQNEINEKINIDKDMISKKFHENMNEFQMLNKDQGDIHKNGRSVSKIEFDN GSILYYKPHSLDKDIKYQQLYKYLCEKAGISCREVRCLSRQTYGWEENIENKSCNTKEEI ERYYFRLGIHLFLGYALGATDLHGENIVAYGEHPVIIDMETYPGYITKNSGSSTEEKAEI KIREKIMTSVLNTGILPVLTWGTGNNRVLMSAVNMHGKIRTPFKMPVVKDDKTSNIHIEY EQVEFEIKECIVRLNGEVVNSEEYTGEIIRGFRMAYTEILQNQKLRNMLKTFFQGKSRVI LRHTQQYYMYLFASFHPDYMKDRKQREELLQVLHKKGETQLQKELRDYEIQSLLELDIPY FEIDGNSRSIFDGNGKEYQGYLPCTPYESWIEHMKQLSCQDMEQQCDYIRLSMGLLNHGY IGEKNPRWADENTCIHQIAEWICRTAVIDGADIGWAGLHFWDNGYWSLKPCGMYLYDGIA GIVLFLAKYLDRYQDSSCRQDVEKIYKLAIEKLEKYTDLRCEQNEVPEPLATGLYDGESS IVYVYLILYEITGQEKWIKNAQKHFEIVAKLLPKDENMDYLSGNAGAIVAAMKLYQLTGE IEYCTAAVETEKDLWKKGQRMEVGYGWKLKNLKYPLSGLSHGNSGFLMAYVELYKMTYDQ EYLKKIKLLLSYEDILYSEDLKNWIDLRDPDGRKTMRGWCHGAPGILLSRMSIMDILPDD KQIKKDILRAVSTLFHNEREKKICLCHGMTGNLLIMKKYLEKNPDDDLRQKYERQFSNLK EQLQDMRSLMVTETLNPAFMNGITGVGMAMMLLNMD >gi|226332888|gb|ACII01000131.1| GENE 9 11156 - 11686 222 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580591|ref|ZP_04857855.1| ## NR: gi|253580591|ref|ZP_04857855.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 176 1 176 176 332 100.0 5e-90 MKSLLQLIHDYHSGVKNAGEEILKRMEPLIKSYASRIHCMERDDATQELYLTLLNTLPYL NGKNFSEGECVRYMQTAIENRYRALCRYHLSEPEREDFDTCSLTLQADNPFDETLYDITT YIESFPVQSMEYKILSLFFYQYKTDSEIASILHVSRQYINRQKKKLLKTYFFRHPD >gi|226332888|gb|ACII01000131.1| GENE 10 11905 - 12135 238 76 aa, chain + ## HITS:1 COG:no KEGG:CLB_0787 NR:ns ## KEGG: CLB_0787 # Name: not_defined # Def: putative bacteriocin # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 1 69 1 69 77 65 55.0 7e-10 MESTVMETAISQGIWAVLAVFLLIYIVKSNEQRDAKQEEREKNYQTVIENLTEKFQILNQ VQNDLKEIKDNLLNKN >gi|226332888|gb|ACII01000131.1| GENE 11 12341 - 14455 1550 704 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580593|ref|ZP_04857857.1| ## NR: gi|253580593|ref|ZP_04857857.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 704 6 709 709 1332 100.0 0 MKEYKVITALFLSAVMLGTAPLTVWGAESFSTGESFSDGTENIPDIVTDSALSGESSERD YELSIETGTDVELTESDFYDPEFRVEKYVKVRNSGKKETRVRVDASGLKNFEVDDQMAKW SEDYEEQDSKRLLKPGETLDFFVLPKQEYGTYDETFDFLADDGSRYPVHITMNREKNPTE KKSLEITEKSSESFPIMHWGYKTLPEARIYVLKNFTDMDMELSFDYSKDCSVSLCGNTSL APGESTELSLCPQMGLPVGEHDLGVTVTAKTATGETITKKIWDSFRVDSRTFQGLVDTIE PVTGITNGVEKTADALHLPSVLKVYGAEVDGEKTTFSADVKWDVENCAYNPKNTASQSFT VNGQLELREGENNEELDTSVKIKVQVDAYKGLNRPVIDSTWTRVITNYAELSLRDTSNEA DGYFFVAAESQKELEKENYIAQVKVAKSELNHYPKLKYIPEGTHYLYCRAYKEESGHTEK YGEWSEGFLLKVNIKTPNAPVVQKLQVKKNDVKIILGSNTEELDGYDVVAARSKDGDEPS DYIKVQCKYSGNSKELILKGVPKGNWYIGVHGYKYLDGSNKKVLSKWAEVQKVTVKTSLV TGKPAVKSAKVSRQGTKRKVTVKFTVPKACDGTDWVLAKKVSQSEDGSYTYVSGYAYTKK NQTKTTVIFKNVKPGVYYLTGRAYVKGYAKNYTKWSEVKKVVVK >gi|226332888|gb|ACII01000131.1| GENE 12 14465 - 14668 225 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580594|ref|ZP_04857858.1| ## NR: gi|253580594|ref|ZP_04857858.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 67 38 104 104 115 100.0 6e-25 MPETLTFSVVNWKESDAECTFTVVNSGFGYTTQLQNKYYHTSFGLFILTNNTFFFDIYMQ KKNKFLV >gi|226332888|gb|ACII01000131.1| GENE 13 15194 - 15910 220 238 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 4 216 1 208 311 89 31 3e-17 MNEMTDAYVKLDKVSKIYGTKEVKIVAVDEISFEIAKGEFVVIVGPSGAGKTTVLNILGG MDQATSGEVLVDGRNIARYNSRQLTGYRRNDIGFVFQFYNLVPNLTALENVELALQICKN PLDAKEVLEEVGLKDRLTNFPAQLSGGEQQRVSIARALAKNPKLLLCDEPTGALDYQTGK AILKLLQDMCRERGMTVIVITHNSALTPMADRVIKIKNGKVSRMTMNDHPTPIEEIEW >gi|226332888|gb|ACII01000131.1| GENE 14 15920 - 19696 3867 1258 aa, chain + ## HITS:1 COG:lin1187 KEGG:ns NR:ns ## COG: lin1187 COG0577 # Protein_GI_number: 16800256 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Listeria innocua # 342 1258 260 1136 1136 545 37.0 1e-154 MKAMHKDFWMEIRKSKARFISIFMIVALGVAFFSGIQASSPDMRFSGDAYYDETNLMDIK VMGTLGLTEDDVAAIKQVDGVENAEGAYGTDVMCGEGEKQKVLHVEAVDQTMNRISVTEG KAPEKSGEIFLDCIFAESNGYKVGDQITLKEGGDSELLKKTDYTVVGLGESPLYISYNRG NSTLGSGEVNGFAYVLPEDFDQEVYTQIYVQAHGAQDLISYTDAYDSLIERVQEQVEGIE AERCQVRYDEIVEEANDKLADARQELEDGKKEANEKLADARQELEDGKKKLKDGKKEYKD GKKKLADAKKELEDGKARLADAKKELEDGRSQLASAKEQLASGRAQIASAKEQLNAGWAE VSENQAKLDDGKAQLEDGKNQLSAGEKQIADAKTQLTQSQQELDNGKAQIQSGREQIAAT RQDLNAKKESCNQGLAQIEQQEAGLAEGEAQLEGARSQLAALQAQYEQAQASGTYSEEDL AALAAQVSAYQEQVDSQAAQLEASRNQIAAARSELESGLSQVESGLAQLDAKEAELNQQE AAFPDAQAKIDAGWKEVKAQEKKLEPARKEIQEKEAQLESAQEQIDAAKAKLNSSQAQLE EKEAELASGEAQIRENEGKLASGEKEIADNEQKLRDGEKEISENEQKLKDSRKDIKKAEK DLEEGKKEYEDGKKDAEKEIADGEKKIQDAQDEIDDISMPEWMVTDRNDLPEYSDYGDNA DRIRSIGQVFPVIFFLVAALVSLTTMTRMVEEQRTQIGTMKALGYGKYAIASKYLLYAFL ATVGGSILGILIGEKILPLVIINGYGIMYKGMMNNIQIRYEFKFAMIAAGAATVCTVGAT IFSCYRALAETPASLMRPPAPKEGKRILLERIPLFWKHLNFTWKSTLRNLFRYKKRFFMT IFGISGSMALMLVGFGLRDSIMDIARRQYQELQHYTGTIIDDEDATDKERQELDEFLKND SQIERYTHVQFTKMSVPRNKSNISVYVYVPENLETFNKDVTLQDRKTKEKYKLTEAGTVI 
SEKTATLTGLKVGDTMTITKDGKNYETKIAAVTENYMGHYIYMTGNVYEQTFGEKPNYSA TVFTVKEEYKESESEVGNEILKHPAALSISYTSSLEAQLHRMLGALDTVIIVLIISAGML AFVVLYNLNNVNITERQRELATLKVLGFYDNEVSAYVYRENVILTLIGVLAGAVFGIFLH RYIIRTVEVDAVMFGRNINPVSFLYCGLLTIGFSMIVNLFMHQKLKKIDMVESLKSVE >gi|226332888|gb|ACII01000131.1| GENE 15 19803 - 21008 563 401 aa, chain + ## HITS:1 COG:SSO2553 KEGG:ns NR:ns ## COG: SSO2553 COG2159 # Protein_GI_number: 15899287 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Sulfolobus solfataricus # 71 400 30 336 339 75 22.0 2e-13 MGTANQIEQELMSYISDHKIRNTHSHHLPEQAMVDFDLDKLINNTYLQWQQVTPGTTAES RTAYLEKTRYKSYFVWLQKAIGELYGISEPITAQNWNQISDQIRKAHQNPDFYMDVLKNK CKYQKVIVDTYWNPGSDNGRPELFTPAFRLDLFFLGYKKGLRNHDGVSLEENFGELPDNL QDYVEWVRKWIIQKKSEGCVALKIAMAYERSLHFEKVTREQAERVFRLKESDITQEDIRC FQDYLFWKICEIAAEVSLPLQCHTGMGQVIDTNILQLNNVIKNNPETKFVLLHCGFPWVD DLFSIVDGYPNLYPDLTWLPILSYTASKRVMHQLIEMSQIDKICWGCDTWTVEESYGSLL AFRFSLCSVLREKIEDGYLSVNNAKDIIDKILFDNAGKIYV >gi|226332888|gb|ACII01000131.1| GENE 16 21034 - 22728 1450 564 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0259 NR:ns ## KEGG: EUBREC_0259 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 560 1 560 565 604 56.0 1e-171 MNSTISIGMQDFGKLIRSGVFYVDKTSFIKEWWENQDDVTLVTRPRRFGKTLTLSMVDYF FSNQHVDSGKLFEKLDIWKEEKYQDLQGTFPVIFVSFAGIKAENYATARDGIIQVLLDLY AKYYFLLQSDVLNEQEREYFAYVRPDMSDAVAAMALHRLAICMYRYYGKKAIILLDEYDT PLQEAYVYGYWEELTTFIRSLFNCTFKTNPYLERGLMTGITRVSKESIFSDLNNLEVVTT TSEKYASSFGFTEQEVFQTLEQMGKADKKDEVKQWYDGFTFGRHTDIYNPWSIIKFLDTG EFNTYWAATSENSLVSKLIREGSPQLKMDFEDLLKGKTVMFKMDEQIVFEQLQRKKGAIW SLLLASGYLKVEKKIPDRRGGCYLYGLKITNYEVLLMFEDMVESWFPEESSSYENFKQAL LLGDLDYMNQYMNQVALQTFSSFDVGRKPSEHLEPERFYHGFVLGLIVDLADKYKITSNK ESGLGRYDVMMEPLVENLDGIIMEFKVHNPAKEKNLEQTAENALRQIREKKYDTELEMAG ISSDRIRHYGFAFEGKKVLIGSDS >gi|226332888|gb|ACII01000131.1| GENE 17 23301 - 25955 3556 884 aa, chain + ## HITS:1 COG:CAC2399 KEGG:ns NR:ns ## COG: CAC2399 COG0525 # Protein_GI_number: 
15895665 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 4 883 6 881 881 1087 59.0 0 MSKELAKTYDPKGIEERLYTKWMDNGYFHAKVNPDKKPFTIVMPPPNVTGQLHMGHALDE TMQDILIRFKRMQGYEALWQPGTDHAAIATEVKVIDKLKKEGIDKHDIGREEFLKHAWAW KEEYGGKIINQLKKLGASADWERERFTMDEGCSKAVQEVFIKLYEKGYIYKGSRIINWCP VCQTSISDAEVEHEDQDGFFWHINYPIVGEEGRFVEIATTRPETLLGDTAVAVNPEDERY KDLIGKMLKLPLTDREIPVIADEYVDKEFGTGCVKITPAHDPNDFEVGRRHNLEEINIMN DDATINELGGKYAGMDRYEARKAMVADLEELGLLVKVVPHNHSVGTHDRCGTTVEPMIKP QWFVKMDEMAKAAIKALDDGDLKFVPSRFDKTYLHWLENIRDWCISRQLWWGHRIPAYYC DECGEVVVAREMPEKCPKCGCTHLHQDEDTLDTWFSSALWPFSTLGWPEKTPELEYFYPT DVLVTGYDIIFFWVIRMVFSALEQTDQVPFHHVLIHGLVRDSQGRKMSKSLGNGIDPLEV IDKYGADALRLTLMTGNAPGNDMRFYWEKVEASRNFANKIWNASRFIMMNLEGKTVTEPE NLNDLCAEDKWILSHLNTVIREATENMEKYELGIAVQKVYDFLWDELCDWYIEMAKVRLW KAEEDPKAANDALWTLRTALTEGLKLLHPYMPFITEEIYCTLLPEEESIMISEWPVYREE RNFPDAEKAIEGFKEVVRGIRNTRTEMNVPNNRKTSLHIVAKDAETAAMYENSKKSFVNL AFAKEILVQTDKNGISEDAVSVVVSNAVVYMPLEDLIDKDKEIERLTKETERLTKEIARC EGMLNNPNFVNKAPATKVDAEKDKLAKYKEMMDKVKGQLEQLKK >gi|226332888|gb|ACII01000131.1| GENE 18 25968 - 28004 1796 678 aa, chain + ## HITS:1 COG:SMc00195 KEGG:ns NR:ns ## COG: SMc00195 COG1368 # Protein_GI_number: 15965601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Sinorhizobium meliloti # 11 557 33 623 639 157 25.0 7e-38 MKLYKICNKISLLLHVLASAAGYFVMEAICRHSFIEAWNYMTQRPLVFAYNAAFIFTSSL IVYLFHRRVFWRILVTLFWLILAIINGVLLLNRVTPFTGPDLHLITDAMKIANKYLPVAG VVAVCILFGILVILLLMLLIKGPKYQKKIKYRYNIPLILLAVALFAGSTQLALEKRVLSN YFGNIAFAYEDYGYPYCLATTIFNTGISCPRDYSEKEIKRIEKTEKNLPETQEEKRPNIL FLQLESFFDPTLVNYLNISEDPIPTFRKLMKEYSSGYYKVPSVGAGTANTEFESITGMSM HYFGPGEYPYKSILRETTCESAPYVLKNLGYTTHAVHNNEANFYGRRSIFPNLGFDTFTS EEYMARENEKNPNGWVKDEVLTDEILKCLDSTEGPDYVYTISVQGHGAYPDEQILEDPEI TVSGAPTEEENNKWEYYVNEIHEMDNFVKELTDKLADYPEDVVLVMYGDHLPTMGLTVED 
LKNKYLFQTEYVMWDNFGLKKKNENLAAYQMAAEVMDRVGIHEGTVFHYHQARRNTRNYQ VDLETLQYDLLYGKRYSYGESGESPYLRTRMRMGIYDVTLDSIQCISEADHTYYIKGTEF TPSSEIKLNGEWYDTVYINPTTLMITGTELNDFDRLAVIQRSNSSTRKPLSKSYDRSCYA LYSNNKWKLTESAGTNEN >gi|226332888|gb|ACII01000131.1| GENE 19 28115 - 29413 1441 432 aa, chain + ## HITS:1 COG:CAC2398 KEGG:ns NR:ns ## COG: CAC2398 COG0285 # Protein_GI_number: 15895664 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Clostridium acetobutylicum # 1 429 1 431 431 257 36.0 3e-68 MNYEEAVAYIDETPKFTKKNSLDHTKECLRRLGNPQDKFKVIHVAGTNGKGSTCAFLTSV LREAGYSCGLFTSPHLVEINERFQINEVNIDNDTFLKAFEKVKVLADELVAEGSYHPTYF EMLFLMGMVIFAEAGVDYVTLETGLGGRMDATTAVENPVACVITSISLDHMQYLGDTVAK IAGEKAGIIVPGVPVIYDGNDPDAAEVIRARAEELGSPAYEVKRSDTEVLRNTSAGIDFA FRNEYYKDMVFSIPFIAKYQVMNSALALKTIEVLQNVISVSMDQIHVGIAGTRWQGRMET VLPGVIVDGAHNEDGVEKFVETAEHFQKECPLTLLFSAVDDKDYTDMIHTICSRITFRQV VVTSVGGYRQVPAERFAKLFTEAGASSVEVVEDVEKAFGRALEVKGDDGMLFCVGSLYLV GEIKDVIRRNKK >gi|226332888|gb|ACII01000131.1| GENE 20 29410 - 29562 219 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580602|ref|ZP_04857866.1| ## NR: gi|253580602|ref|ZP_04857866.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 50 1 50 50 75 100.0 7e-13 MIDYEEELKKFEPCLDVADAEGAIYERELTDILDILQEVLREAKSGRRVK >gi|226332888|gb|ACII01000131.1| GENE 21 29573 - 30910 1168 445 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2575 NR:ns ## KEGG: EUBREC_2575 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 16 366 14 363 456 239 39.0 2e-61 MNCMNCGAFLVDKDLDYCPKCGCNVLIQKKVDYLSRQYYNRGLEKASVRDLSGAIDCLKQ SLIYNKHNIQARNLLGLVYFETGEVVAALSEWVISKNLQPSRNLASEYINKLQANSNKLE AINETIRKYNDALNLCREGHEDMAAIRLKKILIQNPKLIKGYHLLALVQMKAGEYNKARK TLRKAARIDKTNTTTLRFLREIDEQTGVSTRLERQNKRNRKAVDSSENDMAIQIPQYKEK GRIPLFFTLVAGFCAGLLAFYLLAVPAIRQGIYREANQQIMKYSDAVSSQGAELTKAQSQ AQESGDTVEAASKQIEEEKKKSSSYEALIEAYSALQQQNTDEAALKIQNVYADLLPADLK GIYNTICNTTGTTGIEGTTDAKVTDGTGSADSAEGDVSSDSAEDGSYDDGSYDSGEYDDG SYDNSDGDSADDGSYDTGDYDDSGY >gi|226332888|gb|ACII01000131.1| GENE 22 31118 - 31432 273 104 aa, chain + ## HITS:1 COG:CAC2308 KEGG:ns NR:ns ## COG: CAC2308 COG1366 # Protein_GI_number: 15895575 # Func_class: T Signal transduction mechanisms # Function: Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) # Organism: Clostridium acetobutylicum # 1 104 1 104 111 68 34.0 3e-12 MENRFEIQGNSLIIHLPKEVDHMVTDEIRRDSDDIIRKKYIRTITFDFSGTAFMDSSGIG LIMGRYRMMGMRGDCIQATGVNSYIEKLLHLSGVYKFVEICGEV >gi|226332888|gb|ACII01000131.1| GENE 23 31455 - 31880 438 141 aa, chain + ## HITS:1 COG:CAC2307 KEGG:ns NR:ns ## COG: CAC2307 COG2172 # Protein_GI_number: 15895574 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Clostridium acetobutylicum # 5 137 4 136 143 135 51.0 2e-32 MENTNEMKIIFDSRPENEGLARVAAAAFCTQLNPTLEEVADLKTAVSEAVTNCIIHAYEG QVQKIEIFCRREGQKLWVDVIDKGVGIRDVAKAMEPLFTTKPEKDRSGMGFTFMEAFMDE VTVESQVGYGTVVHMKKTIGR >gi|226332888|gb|ACII01000131.1| GENE 24 31883 - 32596 527 237 aa, chain + ## HITS:1 COG:BS_sigF KEGG:ns NR:ns ## COG: BS_sigF COG1191 # Protein_GI_number: 16079402 # Func_class: K Transcription # Function: DNA-directed RNA 
polymerase specialized sigma subunit # Organism: Bacillus subtilis # 7 237 22 250 255 206 47.0 3e-53 MDHTLALIMKSQQGDKEARDTVFKENAGLVYSMAKRFAGRSVEMEDIVQIGSIGLLKAID RFDISYDVKFSTYAVPMIIGEIRRYLRDDGMLKVSRNLKENCARIYSAREALEKELGREP ILEEVAKATELSVDEVVMSMESGAEVESLHKIIYQGDGNDISLMDRLQEKENGQDAALNR IFLDEILKKLDARERQLIGMRYFKDMTQTEIAAEMGISQVQVSRMEKRILKELKKQV >gi|226332888|gb|ACII01000131.1| GENE 25 32768 - 33265 621 165 aa, chain - ## HITS:1 COG:CAC2413 KEGG:ns NR:ns ## COG: CAC2413 COG4708 # Protein_GI_number: 15895679 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 2 155 3 157 166 92 40.0 3e-19 MKNKGTQFLTEAAVIGAIYVVLTLLFAPLSYGEIQIRFSEALTILPFFTPAAIPGLFVGC IIANLFGGAIPVDIIFGSIATLIGAVFTYRLRSCNRFLAPIPPIVSNAVIVPFVLHFGYG VNLPIPLMMLTVGIGEVVSCGVVGLILQTALLKYKNVIFRSQRTV >gi|226332888|gb|ACII01000131.1| GENE 26 33334 - 33948 524 204 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01606 NR:ns ## KEGG: EUBELI_01606 # Name: not_defined # Def: stage V sporulation protein AA # Organism: E.eligens # Pathway: not_defined # 1 195 19 212 218 124 27.0 2e-27 MSDTLYLQLDQNIQINHPHIYLQDIAKLSCSNSKILNRLRVMPVINLDPNKPGRYVMSVM DLISEIKKKKPDLEINNIGEADFIITFKNKPGSGLVWQWCKIIFVGLAAFFGAGFSIMTF NNDVDVGGLFSQIYTQVTGQTSGHFTVLEITYSIGIGLGVLFFFNHFGHMKITDDPTPMQ IQMRLYEENVNKTLIKDIDRTSEK >gi|226332888|gb|ACII01000131.1| GENE 27 33955 - 34311 518 118 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2431 NR:ns ## KEGG: EUBREC_2431 # Name: not_defined # Def: sporulation protein # Organism: E.rectale # Pathway: not_defined # 1 117 22 138 139 166 73.0 2e-40 MDYFNAFWTGGLICALVQILLDRTKLMPGRIMVLLVCTGSVLSAVGIYQPFAEFAGAGAS VPLLGFGNILWKGMKESIDKNGLIGLFMGGFTACAVGVSAALIFSYLASLIFKPKMKG >gi|226332888|gb|ACII01000131.1| GENE 28 34342 - 35424 658 360 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0480 NR:ns ## KEGG: Cphy_0480 # Name: not_defined # Def: stage V sporulation protein AD # Organism: C.phytofermentans # Pathway: not_defined # 8 352 13 353 355 417 58.0 1e-115 
MNRQMTCGTSSISFQNPVFVQSCASVVSRKEGEGPLGAYFDTICEDPMFGTDTWEAAEST LQKQAALLAIKKNGLTCSDIQLLFAGDLLAQTSASSFGTADLKIPFYGLFGACSTMGESL SLGSMCIDGGYGKYILCATSSHFASAEKEFRFPLGYGNQRPLSATWTVTGAGACILGYEP PPRPDNKTLNTLRNRNICAVITGITTGRLIDFGFRDSLNMGACMAPAACDTIARNLQDFH RKPSDYDAIFTGDLGMVGQTILFDLLTEKGFDISSVHQDCGMLIYDPQTQDTHSGGSGCG CAASVLAGYILPGIIQKKWKRILFVPTGALLSKVSFNEGDPIPGIAHAVVIEQYRVPDMM >gi|226332888|gb|ACII01000131.1| GENE 29 35494 - 35958 366 154 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0479 NR:ns ## KEGG: Cphy_0479 # Name: not_defined # Def: stage V sporulation protein AC # Organism: C.phytofermentans # Pathway: not_defined # 8 152 20 164 167 189 62.0 2e-47 MSDNNLSQKKQKQYEEYVKTVTPVHSLPLNMCKAFFTGGVICVVGQVILNYADNLGLDKD TAGSWCSLILILFSVILTGCNLYSKIGRFGGAGSLVPITGFANSVASSAIEYKAEGQVFG IGCKIFTIAGPVILYGILSSWILGVIYYLFTMIP >gi|226332888|gb|ACII01000131.1| GENE 30 36076 - 36495 267 139 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01605 NR:ns ## KEGG: EUBELI_01605 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 11 133 9 131 135 80 37.0 1e-14 MLEQIFLGFTGLCSGFIIAGGLTGLMIGLSIIPRYAGITHTADHILLYEDITFWGTELGN LFFLFHWNIRFGTPFLILYGLFSGIFLGGWIMALAEMADIFPIFARRSRFQRGLSFTILC IAAGKTVGALIYYYNGWLP Prediction of potential genes in microbial genomes Time: Sat May 28 20:44:42 2011 Seq name: gi|226332887|gb|ACII01000132.1| Ruminococcus sp. 5_1_39B_FAA cont1.132, whole genome shotgun sequence Length of sequence - 3288 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 2.5 1 1 Op 1 1/0.000 + CDS 84 - 1004 1007 ## COG0812 UDP-N-acetylmuramate dehydrogenase 2 1 Op 2 . + CDS 1101 - 1988 863 ## COG1660 Predicted P-loop-containing kinase 3 1 Op 3 2/0.000 + CDS 2014 - 2958 665 ## COG1481 Uncharacterized protein conserved in bacteria 4 1 Op 4 . 
+ CDS 2955 - 3209 465 ## COG1925 Phosphotransferase system, HPr-related proteins + Term 3232 - 3277 8.0 Predicted protein(s) >gi|226332887|gb|ACII01000132.1| GENE 1 84 - 1004 1007 306 aa, chain + ## HITS:1 COG:CAC0510 KEGG:ns NR:ns ## COG: CAC0510 COG0812 # Protein_GI_number: 15893801 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Clostridium acetobutylicum # 6 305 5 303 305 281 46.0 1e-75 MKSVNQNIIEKFNDLLGEEKVKVDEPMKRHTTFRIGGPADYFLLPSSEEELSGILKICKN EELPYFILGNGSNLLVSDEGYRGVIIQLYRNYGDITVKGNEIHATAGALLSQIAAAAKNA SLTGFEFAGGIPGTLGGAVVMNAGAYGGEMKDVLKEVTVMTAAGEILVLPAEKLEMGYRT SLVKTKGYLVLSAVIVLEQGNQEAIKARMKELTEQRISKQPLEFPSAGSTFKRPEGYFAG KLIMDAGLRGYQTGGAQVSEKHCGFVINKDNATAADVCRLLRDVQDKVKEQFGVTLEPEV KFLGKF >gi|226332887|gb|ACII01000132.1| GENE 2 1101 - 1988 863 295 aa, chain + ## HITS:1 COG:CAC0511 KEGG:ns NR:ns ## COG: CAC0511 COG1660 # Protein_GI_number: 15893802 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Clostridium acetobutylicum # 1 287 1 286 294 302 52.0 4e-82 MRFVIVTGVSGAGKTSALKMLEDAKYFCVDNLPIPLLEKFASLMPEIHGEDVQNVALGID ARSGRSLDELEIVLDRMKKAGYDFEILFLDAQDSVLVKRYKETRRSHPLAMGGRVDDGIR MEREKMRFLKERADYIIDTSNLLTRELKQEIDRIFVDNQDFCNMMISVLSFGFKYGIPAD ADLVFDVRFLPNPYYVDELRPLTGLDDRVFNYVMDCDIARTFADKLEDMINFLIPNYVKE GKTNLVIAIGCTGGKHRSVTLARELYSRLSGNTKYGFRLEHRDAQKDRLVRKQEG >gi|226332887|gb|ACII01000132.1| GENE 3 2014 - 2958 665 314 aa, chain + ## HITS:1 COG:CAC0513 KEGG:ns NR:ns ## COG: CAC0513 COG1481 # Protein_GI_number: 15893804 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 2 308 7 315 317 256 44.0 3e-68 MVKEELSRQIGLARHCKMAELAAILCSCGKMECFSGDTKLKIQTENEAVARKCFTLLQKT FNIETKIFVRENSHLKRVKVYTIEITDPEEIQVIFQALRLVTNSIDQGTLVLSDMLVVQQ NCCKRAFIRGAFLASGSISDPEKGYHFEIVCSDAKRAEQLQTIIRSFSVDAKIVQRKKSH VVYVKEGAQIVDMLAVMEANVALMDLENIRILKEMRNSVNRKVNCETANINKTVNAAVKQ 
MEDIKLVRQKIGFEQLNEGLAQVAELRMQYPEATLKELGMMLSPQVGKSGVNHRLRKLSA MADELREKQGGELL >gi|226332887|gb|ACII01000132.1| GENE 4 2955 - 3209 465 84 aa, chain + ## HITS:1 COG:BS_crh KEGG:ns NR:ns ## COG: BS_crh COG1925 # Protein_GI_number: 16080527 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Bacillus subtilis # 1 84 1 84 85 71 44.0 5e-13 MIKKPITINLSTGLEARPVAQLVQVASQFNSEIYVEIGKKRVNAKSIMGMMTLGLDAGEE ITLSANGEDEEAAMAGIEQYLSNQ Prediction of potential genes in microbial genomes Time: Sat May 28 20:46:21 2011 Seq name: gi|226332886|gb|ACII01000133.1| Ruminococcus sp. 5_1_39B_FAA cont1.133, whole genome shotgun sequence Length of sequence - 26432 bp Number of predicted genes - 42, with homology - 42 Number of transcription units - 9, operones - 4 average op.length - 9.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 30 - 89 6.3 1 1 Tu 1 . + CDS 116 - 1261 1161 ## gi|253580617|ref|ZP_04857881.1| conserved hypothetical protein + Term 1294 - 1348 15.1 + Prom 1536 - 1595 5.2 2 2 Op 1 40/0.000 + CDS 1822 - 2139 478 ## PROTEIN SUPPORTED gi|240145879|ref|ZP_04744480.1| 30S ribosomal protein S10 + Prom 2144 - 2203 1.8 3 2 Op 2 58/0.000 + CDS 2236 - 2868 898 ## PROTEIN SUPPORTED gi|160881785|ref|YP_001560753.1| ribosomal protein L3 4 2 Op 3 61/0.000 + CDS 2890 - 3510 775 ## PROTEIN SUPPORTED gi|240145877|ref|ZP_04744478.1| ribosomal protein L4/L1e 5 2 Op 4 61/0.000 + CDS 3510 - 3809 424 ## PROTEIN SUPPORTED gi|240145876|ref|ZP_04744477.1| 50S ribosomal protein L23 6 2 Op 5 60/0.000 + CDS 4004 - 4849 1281 ## PROTEIN SUPPORTED gi|240145875|ref|ZP_04744476.1| 50S ribosomal protein L2 7 2 Op 6 59/0.000 + CDS 4865 - 5146 465 ## PROTEIN SUPPORTED gi|240145874|ref|ZP_04744475.1| 30S ribosomal protein S19 8 2 Op 7 61/0.000 + CDS 5235 - 5624 561 ## PROTEIN SUPPORTED gi|240145873|ref|ZP_04744474.1| 50S ribosomal protein L22 9 2 Op 8 50/0.000 + CDS 5636 - 6292 1008 ## PROTEIN SUPPORTED gi|240145872|ref|ZP_04744473.1| 
SSU ribosomal protein S3P 10 2 Op 9 . + CDS 6292 - 6729 651 ## PROTEIN SUPPORTED gi|160881778|ref|YP_001560746.1| 50S ribosomal protein L16 11 2 Op 10 . + CDS 6710 - 6928 299 ## PROTEIN SUPPORTED gi|239623355|ref|ZP_04666386.1| 30S ribosomal protein S17 12 2 Op 11 50/0.000 + CDS 6944 - 7198 340 ## PROTEIN SUPPORTED gi|158319552|ref|YP_001512059.1| ribosomal protein S17 13 2 Op 12 57/0.000 + CDS 7212 - 7580 565 ## PROTEIN SUPPORTED gi|238916275|ref|YP_002929792.1| large subunit ribosomal protein L14 14 2 Op 13 48/0.000 + CDS 7592 - 7903 384 ## PROTEIN SUPPORTED gi|238922841|ref|YP_002936354.1| 50S ribosomal protein L24 15 2 Op 14 50/0.000 + CDS 7925 - 8464 833 ## PROTEIN SUPPORTED gi|240145866|ref|ZP_04744467.1| 50S ribosomal protein L5 16 2 Op 15 50/0.000 + CDS 8480 - 8665 308 ## PROTEIN SUPPORTED gi|240145865|ref|ZP_04744466.1| small subunit ribosomal protein S14 + Prom 8716 - 8775 6.1 17 2 Op 16 55/0.000 + CDS 8809 - 9210 609 ## PROTEIN SUPPORTED gi|240145864|ref|ZP_04744465.1| ribosomal protein S8 18 2 Op 17 46/0.000 + CDS 9299 - 9838 775 ## PROTEIN SUPPORTED gi|240145863|ref|ZP_04744464.1| 50S ribosomal protein L6 19 2 Op 18 56/0.000 + CDS 9856 - 10224 463 ## PROTEIN SUPPORTED gi|240145862|ref|ZP_04744463.1| 50S ribosomal protein L18 20 2 Op 19 50/0.000 + CDS 10243 - 10752 690 ## PROTEIN SUPPORTED gi|160881768|ref|YP_001560736.1| ribosomal protein S5 21 2 Op 20 48/0.000 + CDS 10768 - 10950 208 ## PROTEIN SUPPORTED gi|160881767|ref|YP_001560735.1| ribosomal protein L30 + Prom 11015 - 11074 2.9 22 2 Op 21 53/0.000 + CDS 11094 - 11534 571 ## PROTEIN SUPPORTED gi|160881766|ref|YP_001560734.1| ribosomal protein L15 23 2 Op 22 28/0.000 + CDS 11534 - 12850 1241 ## COG0201 Preprotein translocase subunit SecY + Term 12857 - 12897 8.1 24 2 Op 23 12/0.000 + CDS 12920 - 13564 905 ## COG0563 Adenylate kinase and related kinases 25 2 Op 24 9/0.000 + CDS 13597 - 14352 713 ## COG0024 Methionine aminopeptidase 26 2 Op 25 . 
+ CDS 14378 - 14596 254 ## PROTEIN SUPPORTED gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 + Term 14629 - 14672 6.9 + Prom 14942 - 15001 8.0 27 3 Tu 1 . + CDS 15092 - 15205 189 ## PROTEIN SUPPORTED gi|160881761|ref|YP_001560729.1| ribosomal protein L36 + Term 15262 - 15301 -0.9 + Prom 15238 - 15297 1.6 28 4 Op 1 48/0.000 + CDS 15527 - 15895 574 ## PROTEIN SUPPORTED gi|238922854|ref|YP_002936367.1| ribosomal protein S13p/S18e + Prom 15965 - 16024 3.5 29 4 Op 2 36/0.000 + CDS 16058 - 16453 617 ## PROTEIN SUPPORTED gi|240145851|ref|ZP_04744452.1| small subunit ribosomal protein S11 30 4 Op 3 26/0.000 + CDS 16485 - 17078 851 ## PROTEIN SUPPORTED gi|240145850|ref|ZP_04744451.1| 30S ribosomal protein S4 + Term 17129 - 17181 6.1 + Prom 17100 - 17159 3.4 31 4 Op 4 50/0.000 + CDS 17225 - 18184 1225 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 32 4 Op 5 . + CDS 18323 - 18859 740 ## PROTEIN SUPPORTED gi|240145848|ref|ZP_04744449.1| LSU ribosomal protein L17P + Term 18898 - 18958 14.9 33 5 Tu 1 . - CDS 18851 - 19045 78 ## gi|253580650|ref|ZP_04857914.1| predicted protein - Prom 19287 - 19346 5.2 + Prom 18971 - 19030 8.2 34 6 Tu 1 . + CDS 19194 - 20030 621 ## GYMC10_5801 hypothetical protein + Prom 20282 - 20341 3.4 35 7 Op 1 10/0.000 + CDS 20361 - 21695 1467 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 36 7 Op 2 . + CDS 21695 - 22417 764 ## COG3442 Predicted glutamine amidotransferase 37 7 Op 3 . + CDS 22420 - 22902 327 ## Elen_2252 transcriptional regulator protein-like protein 38 7 Op 4 . + CDS 22820 - 23062 221 ## gi|253580655|ref|ZP_04857919.1| predicted protein 39 7 Op 5 . + CDS 23088 - 24602 1696 ## COG4213 ABC-type xylose transport system, periplasmic component + Term 24616 - 24661 3.8 - Term 24604 - 24649 3.8 40 8 Tu 1 . - CDS 24650 - 25222 726 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins - Prom 25434 - 25493 8.0 + Prom 25400 - 25459 11.2 41 9 Op 1 . 
+ CDS 25528 - 25800 165 ## gi|253580658|ref|ZP_04857922.1| conserved hypothetical protein 42 9 Op 2 . + CDS 25931 - 26278 414 ## COG1937 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|226332886|gb|ACII01000133.1| GENE 1 116 - 1261 1161 381 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580617|ref|ZP_04857881.1| ## NR: gi|253580617|ref|ZP_04857881.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 381 1 381 381 651 100.0 0 MSKFLKFIVHFIVICTIVCVVGLAVPPFLGITTEIMDDSGKETNLPMGSVTYAIPVKIKE AAVGDSILYQTDSKAYRYMITEMDKKNHIFKVIDSSDKDAEPVAVEVKDYIPKVVITVGY AGYLLLATESIEGLIILGLAVLFLVILYIVAELLKKEPQDDYDETDTEPGYVKSKKELKR EEKAREKRYKEEERQIKKEERARRKGKAPERKKIKTGGFVDEIYEDELEPVKPAQPETIQ TATSEAHELLRKEIAAATADVPVKQEESVQTKPDAGDIHMEPVSGKTEVIVPQVKEQLKQ MMNASDRKPVEAEETEGNAQEEIRRMVIPTWSAAQIAETARSQGDAPDIVRDDVTKVTLF DYSDIIADEKPVYPQKQTAEE >gi|226332886|gb|ACII01000133.1| GENE 2 1822 - 2139 478 105 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145879|ref|ZP_04744480.1| 30S ribosomal protein S10 [Roseburia intestinalis L1-82] # 1 105 1 105 105 188 90 3e-47 MANQVMRITLKAYDHQLVDASAAKIIETVKKNGAMVSGPVPLPTKKEVVTILRAVHKYKD SREQFEQRTHKRLIDILTPTQKTVDALSRLEMPAGVNIDIKMKTK >gi|226332886|gb|ACII01000133.1| GENE 3 2236 - 2868 898 210 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881785|ref|YP_001560753.1| ribosomal protein L3 [Clostridium phytofermentans ISDg] # 1 209 1 209 211 350 79 5e-96 MKKAILATKVGMTQIFNEDGQLIPVTVLQAGPCVVTQVKTEENDGYEAVQVGFGDIRESL VNKPEKGHFDKAGVAVKRFVKEFRFDNAAEYTVGQEIKADIFADGDHIDATAVSKGKGFQ GAIKRHGQSRGPMAHGSKYHRHAGSNGACSDPSKVFKGKHMPGHMGNVQVTVQNLEIVRV DTENNLLLVKGAVPGPKKSLVTIKETVKSL >gi|226332886|gb|ACII01000133.1| GENE 4 2890 - 3510 775 206 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145877|ref|ZP_04744478.1| ribosomal protein L4/L1e [Roseburia intestinalis L1-82] # 1 206 1 206 206 303 72 1e-81 MANVTVYNMEGNEVGTMELNDAVFGVEVNEHLVHLAVVRQLANNRQGTQKAKTRSEVSGG GRKPWRQKGTGHARQGSIRAPQWTGGGVVFAPVPRDYEVKMNKKERRAALKSALTSKVQD 
NKLVVVDALTLADVKTKEMQKVLTNLKAEKALVITATDDKNVILSARNITDVQTATPSTI NVYDVMNHNTVIVTKDAVASIEEVYA >gi|226332886|gb|ACII01000133.1| GENE 5 3510 - 3809 424 99 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145876|ref|ZP_04744477.1| 50S ribosomal protein L23 [Roseburia intestinalis L1-82] # 1 99 1 99 99 167 85 5e-41 MADIKYYDVIKKPVITEKSMNAMAEKKYTFLVHPEANKSQIKEAVEKMFEGTKVKSVNTM NMDGKKKRRGMTVGTTAKTKKAIVALTEDSKDIEIFEGL >gi|226332886|gb|ACII01000133.1| GENE 6 4004 - 4849 1281 281 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145875|ref|ZP_04744476.1| 50S ribosomal protein L2 [Roseburia intestinalis L1-82] # 1 279 1 280 281 498 85 1e-140 MGIKSYNPYTPSRRHMTGSDFSEITKSTPEKSLTVSLKKNAGRNNQGKITVRHRGGGSRR KYRIIDFKRRKDGIPATVVSIEYDPNRTANIALISYVDGEKAYILAPEGLKVGQKVMNGA DAEVRVGNCLPLELIPVGTMVHNIELHPGKGGQMVRSAGNGAQLMAKEGKYATLRLPSGE MRMVPIVCRASVGVVGNGDHNLINIGKAGRKRNMGIRPTVRGSVMNPNDHPHGGGEGKTG IGRPGPCTPWGKPALGLKTRKKNKPSNKLIVRRRDGKALSK >gi|226332886|gb|ACII01000133.1| GENE 7 4865 - 5146 465 93 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145874|ref|ZP_04744475.1| 30S ribosomal protein S19 [Roseburia intestinalis L1-82] # 1 93 1 93 93 183 94 9e-46 MSRSLKKGPFADASLLKKVDALNAANDKSVIKTWSRRSTIFPSFVGHTIAVHDGRKHVPV YVTEDMVGHKLGEFVATRTYRGHGKDEKKSGVR >gi|226332886|gb|ACII01000133.1| GENE 8 5235 - 5624 561 129 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145873|ref|ZP_04744474.1| 50S ribosomal protein L22 [Roseburia intestinalis L1-82] # 1 129 1 128 128 220 85 6e-57 MAKGHRSQIKRARNESNRETRPSAKLSYARISVQKACYVLDVIRGKDVQTALGILTYNPR YASSVIKKLLESAIANAENNNGMNADNLYVAACYADKGPTMKRIQPRAQGRAYRIEKRTS HITIVLDER >gi|226332886|gb|ACII01000133.1| GENE 9 5636 - 6292 1008 218 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145872|ref|ZP_04744473.1| SSU ribosomal protein S3P [Roseburia intestinalis L1-82] # 1 215 1 215 218 392 88 1e-108 MGQKVNPHGLRVGVIKDWDSRWYADADFADYLVEDYNIRTFLKKKLYSAGVSKIEIERAS DRVKIIIYTAKPGIVIGKGGAEIEKVKAELKKFTDKKLIVDIKEVKRPDKDAQLVAENIA LQLENRISFRRAMKSTMQRTMKAGAKGIKTSVSGRLGGADMARTEFYSEGTIPLQTLRAD 
IDYGFAEADTTYGKVGVKAWVYNGEVLPTKGTKEGSDK >gi|226332886|gb|ACII01000133.1| GENE 10 6292 - 6729 651 145 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881778|ref|YP_001560746.1| 50S ribosomal protein L16 [Clostridium phytofermentans ISDg] # 1 145 1 145 145 255 85 2e-67 MLMPKRVKRRKQFRGSMRGKALRGNKINYGEFGLVATEPCWIRSNQIEAARVAMTRYIKR GGQVWIKIFPDKPVTAKPAETRMGSGKGALEYWVAVVKPGRVMFEIAGVSEEIAREALRL AMHKLPCKCKIVSRADLEGGDNSEN >gi|226332886|gb|ACII01000133.1| GENE 11 6710 - 6928 299 72 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239623355|ref|ZP_04666386.1| 30S ribosomal protein S17 [Clostridiales bacterium 1_7_47_FAA] # 1 70 1 70 70 119 88 2e-26 MITVKINKFVEDLKAKSAAELNEELVAAKKELFNLRFQNATNQLENTSRIKEVRKNIARI QTVITEQANASK >gi|226332886|gb|ACII01000133.1| GENE 12 6944 - 7198 340 84 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|158319552|ref|YP_001512059.1| ribosomal protein S17 [Alkaliphilus oremlandii OhILAs] # 1 84 1 84 84 135 76 3e-31 MERNLRKTRVGKVVSNKMDKTIVVAIEDHVKHPLYKKIVKRTYKLKAHDENNECNIGDTV KVMETRPLSKDKRWRLVEIVEKVK >gi|226332886|gb|ACII01000133.1| GENE 13 7212 - 7580 565 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238916275|ref|YP_002929792.1| large subunit ribosomal protein L14 [Eubacterium eligens ATCC 27750] # 1 122 1 122 122 222 91 2e-57 MIQQETRLKVADNTGAKEILCIRVMGGSTRRYASIGDTIVATVKDATPGGVVKKGDVVKA VVVRTKKGARRKDGSYIRFDENAAVIIKDDLTPRGTRIFGPVARELREKKFMKIVSLAPE VL >gi|226332886|gb|ACII01000133.1| GENE 14 7592 - 7903 384 103 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238922841|ref|YP_002936354.1| 50S ribosomal protein L24 [Eubacterium rectale ATCC 33656] # 1 103 1 103 103 152 72 2e-36 MSAMKIKKGDTVKVIAGKDNGKEGKVLAVNAKDNTVVVENINKVTKHSKPSAANQQGGII TKEAPLHISNVMLVVDGQATRVGFQMDGDKKVRVAKKTGKVID >gi|226332886|gb|ACII01000133.1| GENE 15 7925 - 8464 833 179 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145866|ref|ZP_04744467.1| 50S ribosomal protein L5 [Roseburia intestinalis L1-82] # 1 179 1 179 179 325 89 2e-88 MSRLKEMYKNEIMDAMTKKFGYKNVMEVPKLDKIVINMGVGEAKENAKLLDAAIADMELI 
TGQKAIATKAKKSVANFKIREGMPIGCKVTLRGEKMYEFADRLINLALPRVRDFRGVNPN AFDGRGNYALGIKEQLIFPEVEYDKVDKVRGMDIIFVTTAKTDEEARELLTLFNMPFAK >gi|226332886|gb|ACII01000133.1| GENE 16 8480 - 8665 308 61 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145865|ref|ZP_04744466.1| small subunit ribosomal protein S14 [Roseburia intestinalis L1-82] # 1 61 1 61 61 123 90 1e-27 MAKTAMKIKQQRKQKFSTREYTRCNICGRPHSVLRKYGICRICFRELAYKGQIPGVKKAS W >gi|226332886|gb|ACII01000133.1| GENE 17 8809 - 9210 609 133 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145864|ref|ZP_04744465.1| ribosomal protein S8 [Roseburia intestinalis L1-82] # 1 133 1 133 133 239 88 2e-62 MTMSDPIADMLTRIRNANTAKHDTVDVPASKMKTAIANILVDEGYIAKYDLVEDGVVKTL HITLKYGEDKNEKIITGLKRISKPGLRIYAGKDQLPKVLGGLGIAILSTNKGVITDKEAR KLQVGGEVLAFVW >gi|226332886|gb|ACII01000133.1| GENE 18 9299 - 9838 775 179 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145863|ref|ZP_04744464.1| 50S ribosomal protein L6 [Roseburia intestinalis L1-82] # 1 179 1 179 179 303 81 1e-81 MSRIGRLPVAIPAGVEVTVAEGNVVTVKGPKGTLERALPTEMEIKVEDGHVVVSRPNDLK KMKSLHGLTRSLIHNMVVGVSEGYTKELEVNGVGYKAAKQGKKLTLSLGYSHPVEMEDPD GIETKVDGNKIIVSGISKEKVGQFAAEIRDKRRPEPYKGKGIKYVDEVIRRKVGKTGKK >gi|226332886|gb|ACII01000133.1| GENE 19 9856 - 10224 463 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145862|ref|ZP_04744463.1| 50S ribosomal protein L18 [Roseburia intestinalis L1-82] # 1 122 1 122 122 182 75 1e-45 MVNKASRAKIRENKHRRLRHHLNGTATTPRLAVFRSNKHIYAQIIDDTVGKTLVSASTLQ KEVRAELENTDDVAAAAHLGTVIGKKAVEAGIESVVFDRGGYIYHGKVKALADAAREAGL KF >gi|226332886|gb|ACII01000133.1| GENE 20 10243 - 10752 690 169 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881768|ref|YP_001560736.1| ribosomal protein S5 [Clostridium phytofermentans ISDg] # 1 167 1 167 168 270 80 7e-72 MKRNLIDASQLELEDKVVSIKRVTKVVKGGRNFRFTALVVVGDGNGHVGAGLGKAAEIPE AIRKGKEDAIKKLVTVARDENNSITHDFVGKYGSAEMLLKRAPEGTGVIAGGPARAVIEL AGIKNIRTKCMGSRNKQNVVLATIEGLRQLKTPEEVARLRGKSVDEILA >gi|226332886|gb|ACII01000133.1| GENE 21 10768 - 10950 208 60 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|160881767|ref|YP_001560735.1| ribosomal protein L30 [Clostridium phytofermentans ISDg] # 1 59 1 59 60 84 67 6e-16 MANTLKVTLVKSPIGAVPKHKKTVAAMGLTKMHKTVEMPDNAATRGMIQQVQHLVKVEEA >gi|226332886|gb|ACII01000133.1| GENE 22 11094 - 11534 571 146 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881766|ref|YP_001560734.1| ribosomal protein L15 [Clostridium phytofermentans ISDg] # 1 146 1 146 146 224 77 4e-58 MELSNLRPAEGSKHSDNFRRGRGHGSGNGKTAGKGHKGQLARSGHKKPGFEGGQMPLYRR LPKRGFKNRNTKEIIAINVDVLNRFEDGAEVTAESLLASGAISKIADGVKILGNGELTKK LNVKVNAVSETAKSKIEAAGGTVEVI >gi|226332886|gb|ACII01000133.1| GENE 23 11534 - 12850 1241 438 aa, chain + ## HITS:1 COG:Cgl0541 KEGG:ns NR:ns ## COG: Cgl0541 COG0201 # Protein_GI_number: 19551791 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecY # Organism: Corynebacterium glutamicum # 9 436 8 439 440 328 42.0 1e-89 MFETLKNVFRVKEMRRKLLYLIWMIFIIRIGCQIPVPGVDSDFFKQWFSSNAGDAFNFFD AFTGGSFERMSIFALNITPYITSSIIIQLLTIAIPALEEMQRDGEEGRKKMTAITRYVTV GLALFESIAMAIGFGRQGMIPNLDFFKGVVVVACLTAGSAMLMWLGERITEKGVGNGISI VLTINIISRVPSDLTLLYENFIKGKTIAKGTLAGLIIAAVILLVVVLVLILNGAERRIPV QYSKKMVGRKLMGGQSTNIPLKVNTAGVIPVIFASSIMSFPAVIAQLTGKGNGTGIGSEI IRGLSSNNWCNPKQLQYTWGLVLYIVLCVFFAYFYTSITFNPLEVADNIKKQGGFIPGIR PGKPTSDYLTNILNYIIFIGAVGLIIVCVIPFIFNGVFGANVSFGGTSIIIIVGVILETV KQIESQLLVRNYKGFLNN >gi|226332886|gb|ACII01000133.1| GENE 24 12920 - 13564 905 214 aa, chain + ## HITS:1 COG:CAC3112 KEGG:ns NR:ns ## COG: CAC3112 COG0563 # Protein_GI_number: 15896362 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Clostridium acetobutylicum # 1 214 1 214 215 264 60.0 8e-71 MKIIMLGAPGAGKGTQAKKIAAKYQIPHISTGDIFRANIKNGTELGKKAKTYMDQGLLVP DELTCDLVVDRIQQPDAANGYVLDGFPRTIPQAECLTEALNKLGSKVDYAIDVDVPDSNI VNRMSGRRACLKCGATYHVVHAAPKVEGVCDTCGEKLVLRDDDQPETVQKRLNVYHEQTQ PLIDYYTKEGILKSVDGTKDLEEVFADIVAILGE >gi|226332886|gb|ACII01000133.1| GENE 25 13597 - 14352 713 251 
aa, chain + ## HITS:1 COG:BH0156 KEGG:ns NR:ns ## COG: BH0156 COG0024 # Protein_GI_number: 15612719 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus halodurans # 6 248 5 246 248 247 49.0 1e-65 MSVSIKTEHEIELMREAGHLLEKVHDGLIPYIKPGVSTKEIDRIGEQMIRDLGCIPNFLN YGGFPASFCISLNDEVVHGIPSEEKIIQEGDLVKIDAGLIYKGYHSDAARTYAVGEVSPQ ARKLMDVTRECFFEGLKAARAGNHLNDISKAIGAHAAKYHYGIVRDLVGHGIGTHLHEDP QIPNFPQKRRGVRLMPGMTLAVEPMINLGRADVAWLDDEWTVVTMDGSLSAHYENTILIT DGDPEILTLTK >gi|226332886|gb|ACII01000133.1| GENE 26 14378 - 14596 254 72 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15610598|ref|NP_217979.1| translation initiation factor IF-1 [Mycobacterium tuberculosis H37Rv] # 1 72 1 73 73 102 68 3e-21 MSKADVIEVEGTVLEKLPNAMFKVELENKHVVLAHISGKLRMNFIRILPGDKVTIELSPY DLSKGRIIWRDK >gi|226332886|gb|ACII01000133.1| GENE 27 15092 - 15205 189 37 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881761|ref|YP_001560729.1| ribosomal protein L36 [Clostridium phytofermentans ISDg] # 1 37 1 37 37 77 94 9e-14 MKVRSSVKPICEKCKIIKRKGSIRVICENPKHKQRQG >gi|226332886|gb|ACII01000133.1| GENE 28 15527 - 15895 574 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238922854|ref|YP_002936367.1| ribosomal protein S13p/S18e [Eubacterium rectale ATCC 33656] # 1 122 1 122 122 225 92 2e-58 MARIAGVDLPREKRIEIGLTYIYGIGRPSADKILAKAEVNPDTRVRDLTDDEVKRLSAVI DETMTVEGDLRREIALNIKRLQEIGCYRGIRHRKGLPVRGQKTKTNARTRKGPKRTVANK KK >gi|226332886|gb|ACII01000133.1| GENE 29 16058 - 16453 617 131 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145851|ref|ZP_04744452.1| small subunit ribosomal protein S11 [Roseburia intestinalis L1-82] # 1 131 1 130 130 242 93 2e-63 MAKVTKKATAKRRVKKNVEHGQAHIQSSFNNTIVTLTDNEGNALSWASAGGLGFRGSRKS TPYAAQMAAETATKAALIHGLKTVDVMVKGPGSGREAAIRALQACGLEVTSIRDVTPVPH NGCRPPKRRRV >gi|226332886|gb|ACII01000133.1| GENE 30 16485 - 17078 851 197 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145850|ref|ZP_04744451.1| 30S ribosomal protein S4 [Roseburia intestinalis L1-82] # 1 
197 1 197 197 332 83 2e-90 MAVNRVPVLKRCRSLGLDPIYLGIDKKSTRQLKRANRKMSEYGLQLREKQKAKFIYGVLE KPFHNYYDKADRMPGQTGANLMILLESRLDNVVFRMGFARTRKEARQIVDHKHVLVNGKC VNIPSYLVKAGDQIEIREKSKGSERYKGILEVTGGRLVPEWIDVDQENLKGTVKELPNRE AIDVPVNEMLIVELYSK >gi|226332886|gb|ACII01000133.1| GENE 31 17225 - 18184 1225 319 aa, chain + ## HITS:1 COG:BS_rpoA KEGG:ns NR:ns ## COG: BS_rpoA COG0202 # Protein_GI_number: 16077211 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Bacillus subtilis # 1 315 1 311 314 382 64.0 1e-106 MFDFNKPKIEITELSEDKKFGRFVVEPLERGYGTTLGNSLRRIMLSSLPGAAVSQVKIDG VLHEFSSIPGVKEDVSDIIMNLKSLAIKNHSTDNEPKTAYIECEGKGVVTAADIQADQDI EIMNPDQVIATLNGGKDCRLAMELTITKGRGYISADKGKSDDMPIGVLAVDAIYTPVDRV NMTVENTRVGQVTDYDKLTLDVYTNGTLDPDEAVSLAAKVLSEHLSLFIDLSENAKTAEV MIEKEDNEKEKVLEMSIDELELSVRSYNCLKRAGINTVEELCNKTSDDMMKVRNLGRKSL EEVLAKLKELGLQLQPAEE >gi|226332886|gb|ACII01000133.1| GENE 32 18323 - 18859 740 178 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145848|ref|ZP_04744449.1| LSU ribosomal protein L17P [Roseburia intestinalis L1-82] # 1 178 1 178 178 289 81 1e-77 MAKYRKLSRTSDQRKALLRSQVTSLLYHGKIVTTEAKAKEIRKIAEGLVAMAVREKDNFE TVTVTAKVARKDAEGKRVKEVVDGKKKTVYDEVQKEIKKDAPSRLHARRQMMKVFYPVKE VPAKGAGRKKNTKDVDMVAKMFDEIAPKYADRNGGYTRIVKIGPRKGDAAMEVLIELV >gi|226332886|gb|ACII01000133.1| GENE 33 18851 - 19045 78 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580650|ref|ZP_04857914.1| ## NR: gi|253580650|ref|ZP_04857914.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 64 1 64 64 99 100.0 7e-20 MVFIYSTILRYLAQIIKGNKRLCYKEVISYSITKKLHIYGMNTIHMQSYHRLDKYKIENL AGLY >gi|226332886|gb|ACII01000133.1| GENE 34 19194 - 20030 621 278 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_5801 NR:ns ## KEGG: GYMC10_5801 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 35 274 35 276 293 129 31.0 1e-28 MFLLLVELLAVTLFLWGNDAYSISATTPEFKWENYKTIAHALGGIGDKTYLNSKESFLAG YQMGCRLFEVDLVKTSDNVWVCRHSWYQSLGQWEGDEKKVLSSEEFLSRPIYGKYTPITF EDLLVLLSDYPDAFVMLDSKQYSLRNYQKTVEDYADYIELAEAAGVPDVMRQVIPEIYNQ AMFAGTALLYDFPGYIYSLWQEYSIKELTEIAAFCREKNIQAATVYYKYWSEDVQEIFDK KGIRLYIYTVNDLKEAQYYMQEGAAGVCSDYLQDTMFN >gi|226332886|gb|ACII01000133.1| GENE 35 20361 - 21695 1467 444 aa, chain + ## HITS:1 COG:SA1708 KEGG:ns NR:ns ## COG: SA1708 COG0769 # Protein_GI_number: 15927466 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Staphylococcus aureus N315 # 3 439 1 430 437 268 34.0 2e-71 MKIRVMIAVCAAKFVGYVCKKMGRQGVTWAGKVAIKICPDILEQLSSQVRKAIFATCGTN GKTTTNNMLCAALEAEGQKVICNHTGSNMLNGVVAAFVLASKWNGKIDADYACIEADEAS TRHIFPRIKPDYMLLTNLFRDQLDRYGEIDITMNILEEMMRKVPKMQIIVNGDDALSAYL AMDSGNPYVTYGISKPVIKSAANEIREGRFCKRCGEKLEYRFYHYSQLGDYYCPKCGFAR PKPDFDAEDVKVGDQLSFCVEGKHIVANYKGFYNVYNILAAYAGVRTAGFAGEHFGDMLR RFNPENGRMEQFRIKGTGVTLNLAKNPAGFNQNISAVMQDQAPKDIIIAINDNAQDGTDI SWLWDVDFDLLGNDSVKSITVSGIRCQDMRLRLKYVDIPSVLEGDVEKAIRDRVEDGVGN LYVLVNYTALFSTRNILKRLEGEK >gi|226332886|gb|ACII01000133.1| GENE 36 21695 - 22417 764 240 aa, chain + ## HITS:1 COG:CAC0961 KEGG:ns NR:ns ## COG: CAC0961 COG3442 # Protein_GI_number: 15894248 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferase # Organism: Clostridium acetobutylicum # 1 236 1 238 243 231 51.0 9e-61 MKITIGHLYPDLLNLYGDRGNIQCLMKRCQWRGIEAETIPFELDDKIDFSKLDIVLLGGG SDREQMIVCEKLQKIQPDFKAYVEDNGVVIAICGGYQLLGKYYKTDQGNMKGLDLVDLYT EQGEGRLIQNIVLQSELFDMPIVGFENHGGRTCINNNKPLGKVLYGAGNDGKSGYEGVVY 
KNVIGTYLHGPLLPKNPQLADWLIQHALERKYGKETELTPLDDSQEKEANDYIYHRFVRG >gi|226332886|gb|ACII01000133.1| GENE 37 22420 - 22902 327 160 aa, chain + ## HITS:1 COG:no KEGG:Elen_2252 NR:ns ## KEGG: Elen_2252 # Name: not_defined # Def: transcriptional regulator protein-like protein # Organism: E.lenta # Pathway: not_defined # 7 67 6 68 327 63 53.0 2e-09 MAKSCNQKGKILYLQKMLSETTSQKPVTMQEILAKLEEQGIRAERKSIYDDMETLRDFGM DIHYRRGREGGYYEEKSASDKTEADNTIIHQPDGETGEKSENVQEQIGKESEAPAKISAP TPSQTGENLKEIRLVCQNSTKKKNPAYLWFGDFLQNKGGG >gi|226332886|gb|ACII01000133.1| GENE 38 22820 - 23062 221 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580655|ref|ZP_04857919.1| ## NR: gi|253580655|ref|ZP_04857919.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 80 1 80 80 148 100.0 1e-34 MYVRILQRKKIQRTFGSEIFCKIKGEDNFIVTVEVVVDKAFFGWLTSMGRNVHILKPKKA AVAYRDYLKNIAKDYKGIDK >gi|226332886|gb|ACII01000133.1| GENE 39 23088 - 24602 1696 504 aa, chain + ## HITS:1 COG:AGc4267 KEGG:ns NR:ns ## COG: AGc4267 COG4213 # Protein_GI_number: 15889625 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 128 439 45 353 354 194 36.0 5e-49 MKRRATAILLGIMVSSMFLSACGKNEAKEAANESAQVEEEGVGEVTEEGEKVEAENKNAS DADDSSKAGDSAKSDEDNKETSVKEEEDGDSSGKDSDDESEEDAEVTEASAGKIGVLLSD DDEDAKIDSEEMTSQIEDGGYEADVKNAGGDPALQISQIQEFIDEQVSALIIDPVDSYGL TDILKTAKEQEIPVISYDSLIRDTADINYYATYDTRAIGKDIAKEIIKKMDLDKAREDKK SYTIEFLMGSPDDNAALFLCNGIQEGLQEYLDDGTLVCKSGNTSFDDTGIMRWSETSAKT KLDSIISEFYAEEKAPDIICTAYDGFAYAAEEILSDSGLEPGSDEWPMITGYGSEAQAVK DIAAGKMSFTMFMDRKELAKGGAQMAIDYLTGEKVDVKDYSQYDNGVKIVGTFTCGAQMI DKDNYQILVDNGTYTEDEIAPDPTPAPEATPVPKVTLKTASEEDSKEVTPTPETEDKSEG ETRENLIYDSEDNSKVEKTSDQKK >gi|226332886|gb|ACII01000133.1| GENE 40 24650 - 25222 726 190 aa, chain - ## HITS:1 COG:SA0373 KEGG:ns NR:ns ## COG: SA0373 COG0503 # Protein_GI_number: 15926089 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine 
phosphoribosyltransferases and related PRPP-binding proteins # Organism: Staphylococcus aureus N315 # 1 188 1 188 192 176 51.0 2e-44 MKLLKDRILKDGVVKPGNILKVDSFLNHQMDITLINEIGKEFKRRFSDCPITKILTIEAS GIGIACIAAQYFDVPVVFAKKAQSVNLDGAMYTTKVESFTHKKVYDVILSKKFLGPEDHV LLIDDFLANGCALLGLIDIVKKSGATLEGAGIVIEKGFQHGGQEIRDMGIRLESLAIVDS MTDDSLTFRD >gi|226332886|gb|ACII01000133.1| GENE 41 25528 - 25800 165 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580658|ref|ZP_04857922.1| ## NR: gi|253580658|ref|ZP_04857922.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 5 90 1 86 86 134 98.0 2e-30 MKRIVNIRLTDWRWNMKEYELIWEIFNKCPRNQMRDVFVEELELEDPEEYIRNKFKGKEV SYEKTVLEDGTIIFDITTSQIKQRCSFTEI >gi|226332886|gb|ACII01000133.1| GENE 42 25931 - 26278 414 115 aa, chain + ## HITS:1 COG:Cj0510c KEGG:ns NR:ns ## COG: Cj0510c COG1937 # Protein_GI_number: 15791872 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Campylobacter jejuni # 24 115 5 96 96 92 55.0 3e-19 MAEEQKTHTHVLEDGTVVEHSHGEHGHHHSHAHTKAVLNRMSRAIGHMESIKRMIEDGRD CAEVLIQLSAVKSAINNTGKIILQDHIEHCIVDAVEHGDKEAIKELEVAIDRFVK Prediction of potential genes in microbial genomes Time: Sat May 28 20:47:31 2011 Seq name: gi|226332885|gb|ACII01000134.1| Ruminococcus sp. 5_1_39B_FAA cont1.134, whole genome shotgun sequence Length of sequence - 95320 bp Number of predicted genes - 82, with homology - 82 Number of transcription units - 48, operones - 17 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 50 - 880 1152 ## COG0489 ATPases involved in chromosome partitioning 2 2 Tu 1 . - CDS 1150 - 2028 778 ## COG0583 Transcriptional regulator - Prom 2161 - 2220 11.3 + Prom 2145 - 2204 7.4 3 3 Tu 1 . + CDS 2329 - 6093 4762 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain + Term 6161 - 6201 1.6 + Prom 6251 - 6310 7.5 4 4 Tu 1 . 
+ CDS 6392 - 6499 67 ## gi|253580664|ref|ZP_04857928.1| predicted protein + Term 6565 - 6608 3.8 - Term 6552 - 6596 4.0 5 5 Tu 1 . - CDS 6660 - 7367 313 ## gi|253580665|ref|ZP_04857929.1| conserved hypothetical protein - Prom 7393 - 7452 6.6 + Prom 7686 - 7745 6.4 6 6 Op 1 . + CDS 7834 - 8010 150 ## gi|253580666|ref|ZP_04857930.1| predicted protein 7 6 Op 2 . + CDS 8004 - 8276 176 ## gi|253580667|ref|ZP_04857931.1| predicted protein + Prom 8904 - 8963 5.8 8 7 Op 1 32/0.000 + CDS 9036 - 10091 1086 ## COG1135 ABC-type metal ion transport system, ATPase component 9 7 Op 2 22/0.000 + CDS 10078 - 10737 901 ## COG2011 ABC-type metal ion transport system, permease component + Prom 10774 - 10833 6.0 10 7 Op 3 . + CDS 10872 - 11798 1394 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen + Term 11824 - 11875 10.2 + Prom 11809 - 11868 4.2 11 8 Tu 1 . + CDS 12084 - 14099 1598 ## COG0515 Serine/threonine protein kinase + Term 14118 - 14185 10.1 + Prom 14106 - 14165 2.9 12 9 Tu 1 . + CDS 14292 - 14699 473 ## Shel_21690 hypothetical protein + Prom 14759 - 14818 8.9 13 10 Tu 1 . + CDS 14922 - 17978 2713 ## COG3291 FOG: PKD repeat + Term 18070 - 18131 13.6 + TRNA 18378 - 18450 81.0 # Val TAC 0 0 + 5S_RRNA 18384 - 18435 92.0 # AF302131 [D:490..741] # 5S ribosomal RNA # Streptococcus agalactiae # Bacteria; Firmicutes; Lactobacillales; Streptococcaceae; Streptococcus. + TRNA 18466 - 18539 79.4 # Met CAT 0 0 + TRNA 18571 - 18643 81.0 # Val TAC 0 0 + 5S_RRNA 18577 - 18628 92.0 # AF302131 [D:490..741] # 5S ribosomal RNA # Streptococcus agalactiae # Bacteria; Firmicutes; Lactobacillales; Streptococcaceae; Streptococcus. + TRNA 18660 - 18736 80.0 # Met CAT 0 0 14 11 Tu 1 . + CDS 19048 - 19452 478 ## gi|253580675|ref|ZP_04857939.1| predicted protein + Term 19573 - 19611 4.5 + Prom 19467 - 19526 8.2 15 12 Tu 1 . 
+ CDS 19685 - 21145 362 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Term 21209 - 21250 3.1 + Prom 21682 - 21741 3.3 16 13 Op 1 35/0.000 + CDS 21772 - 23259 1531 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 17 13 Op 2 13/0.000 + CDS 23256 - 23834 555 ## COG0512 Anthranilate/para-aminobenzoate synthases component II 18 13 Op 3 21/0.000 + CDS 23827 - 24837 1344 ## COG0547 Anthranilate phosphoribosyltransferase + Prom 24839 - 24898 1.9 19 13 Op 4 9/0.000 + CDS 24936 - 25736 787 ## COG0134 Indole-3-glycerol phosphate synthase 20 13 Op 5 23/0.000 + CDS 25767 - 26405 576 ## COG0135 Phosphoribosylanthranilate isomerase 21 13 Op 6 37/0.000 + CDS 26468 - 27652 1571 ## COG0133 Tryptophan synthase beta chain 22 13 Op 7 . + CDS 27645 - 28424 409 ## PROTEIN SUPPORTED gi|148997862|ref|ZP_01825426.1| ribosomal protein L11 methyltransferase - Term 28504 - 28576 10.1 23 14 Tu 1 . - CDS 28777 - 29571 515 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 29616 - 29675 5.3 + Prom 30073 - 30132 7.2 24 15 Op 1 . + CDS 30193 - 31143 908 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase 25 15 Op 2 . + CDS 31145 - 33265 2356 ## COG3968 Uncharacterized protein related to glutamine synthetase + Prom 33291 - 33350 4.2 26 16 Op 1 4/0.000 + CDS 33382 - 34302 983 ## COG0714 MoxR-like ATPases 27 16 Op 2 . + CDS 34292 - 36085 1482 ## COG4548 Nitric oxide reductase activation protein + Term 36120 - 36167 0.1 + Prom 36171 - 36230 8.8 28 17 Op 1 . + CDS 36264 - 38162 2555 ## COG1297 Predicted membrane protein 29 17 Op 2 . + CDS 38187 - 38552 283 ## EUBELI_20051 hypothetical protein + Term 38561 - 38614 11.1 + Prom 38631 - 38690 4.5 30 18 Tu 1 . + CDS 38711 - 39583 948 ## COG1284 Uncharacterized conserved protein + Prom 39585 - 39644 10.5 31 19 Tu 1 . 
+ CDS 39730 - 40818 1281 ## COG2508 Regulator of polyketide synthase expression + Term 40847 - 40891 7.4 + Prom 41026 - 41085 5.1 32 20 Op 1 28/0.000 + CDS 41117 - 41800 319 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 33 20 Op 2 2/0.000 + CDS 41790 - 42695 938 ## COG2177 Cell division protein + Prom 42697 - 42756 1.8 34 20 Op 3 . + CDS 42776 - 43966 1411 ## COG0793 Periplasmic protease + Prom 43982 - 44041 7.3 35 21 Op 1 . + CDS 44144 - 45013 967 ## COG1307 Uncharacterized protein conserved in bacteria + Term 45035 - 45098 -0.9 + Prom 45015 - 45074 5.5 36 21 Op 2 . + CDS 45110 - 47047 1764 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases + Term 47174 - 47210 -0.3 + Prom 47155 - 47214 4.1 37 22 Op 1 . + CDS 47294 - 47767 386 ## COG3236 Uncharacterized protein conserved in bacteria 38 22 Op 2 . + CDS 47787 - 49910 1999 ## COG0659 Sulfate permease and related transporters (MFS superfamily) + Term 49971 - 50019 1.3 39 23 Tu 1 . - CDS 49931 - 50833 523 ## COG0583 Transcriptional regulator - Prom 51018 - 51077 12.2 + Prom 50969 - 51028 10.4 40 24 Tu 1 . + CDS 51228 - 52493 1400 ## COG0471 Di- and tricarboxylate transporters + Prom 52540 - 52599 2.5 41 25 Op 1 11/0.000 + CDS 52639 - 53544 1054 ## COG1951 Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain 42 25 Op 2 1/0.286 + CDS 53560 - 54192 955 ## COG1838 Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain + Term 54289 - 54341 5.1 + Prom 54654 - 54713 6.5 43 26 Tu 1 . + CDS 54820 - 57603 2247 ## COG0642 Signal transduction histidine kinase + Term 57618 - 57659 3.4 + Prom 57619 - 57678 6.0 44 27 Tu 1 . 
+ CDS 57741 - 59729 1720 ## COG0556 Helicase subunit of the DNA excision repair complex + Prom 60246 - 60305 7.6 45 28 Op 1 21/0.000 + CDS 60377 - 61417 1186 ## COG1420 Transcriptional regulator of heat shock gene 46 28 Op 2 29/0.000 + CDS 61435 - 62079 801 ## COG0576 Molecular chaperone GrpE (heat shock protein) 47 28 Op 3 31/0.000 + CDS 62140 - 64017 2621 ## COG0443 Molecular chaperone + Term 64062 - 64117 15.5 + Prom 64057 - 64116 4.8 48 29 Tu 1 . + CDS 64137 - 65324 1270 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain + Term 65343 - 65378 4.0 + Prom 65337 - 65396 5.8 49 30 Tu 1 . + CDS 65444 - 66169 452 ## COG0671 Membrane-associated phospholipid phosphatase + Prom 66194 - 66253 4.6 50 31 Op 1 9/0.000 + CDS 66287 - 67240 963 ## PROTEIN SUPPORTED gi|240145923|ref|ZP_04744524.1| ribosomal protein L11 methyltransferase 51 31 Op 2 . + CDS 67245 - 67991 796 ## COG1385 Uncharacterized protein conserved in bacteria 52 31 Op 3 7/0.000 + CDS 68013 - 69167 1122 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 53 31 Op 4 . + CDS 69208 - 70389 1277 ## COG0301 Thiamine biosynthesis ATP pyrophosphatase - Term 70454 - 70488 4.5 54 32 Tu 1 . - CDS 70505 - 70741 366 ## Cphy_2300 phosphotransferase system, phosphocarrier protein HPr + Prom 70840 - 70899 4.4 55 33 Tu 1 . + CDS 71059 - 72390 462 ## PROTEIN SUPPORTED gi|227372256|ref|ZP_03855738.1| SSU ribosomal protein S12P methylthiotransferase + Term 72465 - 72499 -0.7 + Prom 72448 - 72507 7.0 56 34 Op 1 6/0.000 + CDS 72604 - 72876 344 ## COG4472 Uncharacterized protein conserved in bacteria 57 34 Op 2 . + CDS 72878 - 73300 531 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 58 34 Op 3 . + CDS 73313 - 73570 432 ## gi|253580720|ref|ZP_04857984.1| conserved hypothetical protein 59 34 Op 4 . 
+ CDS 73587 - 75482 2038 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily 60 34 Op 5 . + CDS 75484 - 75828 300 ## EUBELI_00777 hypothetical protein + Term 75848 - 75895 -0.9 61 34 Op 6 . + CDS 75918 - 76322 525 ## EUBREC_1892 hypothetical protein 62 34 Op 7 4/0.000 + CDS 76319 - 76963 686 ## COG4122 Predicted O-methyltransferase 63 34 Op 8 . + CDS 76969 - 78192 1422 ## COG0826 Collagenase and related proteases + Term 78201 - 78241 7.2 - Term 78189 - 78229 6.4 64 35 Tu 1 . - CDS 78231 - 78839 392 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit - Prom 78942 - 79001 8.1 + Prom 78896 - 78955 6.8 65 36 Op 1 . + CDS 79018 - 79173 65 ## gi|253580727|ref|ZP_04857991.1| predicted protein 66 36 Op 2 . + CDS 79193 - 79849 619 ## COG3294 Uncharacterized conserved protein + Term 79888 - 79925 1.1 + Prom 79938 - 79997 7.4 67 37 Tu 1 . + CDS 80098 - 81078 874 ## COG0042 tRNA-dihydrouridine synthase + Prom 81080 - 81139 4.8 68 38 Op 1 3/0.000 + CDS 81180 - 82280 586 ## COG5438 Predicted multitransmembrane protein 69 38 Op 2 . + CDS 82277 - 83050 582 ## COG5438 Predicted multitransmembrane protein + Term 83181 - 83224 4.5 70 39 Tu 1 . + CDS 83533 - 83802 218 ## gi|253580733|ref|ZP_04857997.1| conserved hypothetical protein + Term 83968 - 84013 1.7 + Prom 84046 - 84105 7.2 71 40 Tu 1 . + CDS 84183 - 85505 649 ## COG1373 Predicted ATPase (AAA+ superfamily) + Term 85751 - 85802 -0.1 72 41 Tu 1 . - CDS 85946 - 86155 76 ## EUBELI_00178 MerR family transcriptional regulator, mercuric resistance operon regulatory protein - Prom 86231 - 86290 5.6 + Prom 86216 - 86275 5.0 73 42 Tu 1 . + CDS 86317 - 87570 852 ## COG1373 Predicted ATPase (AAA+ superfamily) + Term 87647 - 87689 0.3 74 43 Tu 1 . + CDS 87707 - 88072 209 ## EUBREC_3110 hypothetical protein + Term 88149 - 88193 1.3 + Prom 88279 - 88338 6.5 75 44 Tu 1 . + CDS 88455 - 88991 473 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 89008 - 89067 1.8 76 45 Op 1 . 
+ CDS 89116 - 89286 154 ## EUBREC_3364 hypothetical protein 77 45 Op 2 . + CDS 89356 - 90741 914 ## COG0534 Na+-driven multidrug efflux pump + Prom 90858 - 90917 6.6 78 46 Op 1 . + CDS 90954 - 91361 251 ## COG1943 Transposase and inactivated derivatives 79 46 Op 2 . + CDS 91358 - 91543 77 ## Acfer_1166 transposase, IS605 OrfB family 80 46 Op 3 . + CDS 91516 - 92460 706 ## COG0675 Transposase and inactivated derivatives + Term 92646 - 92676 -0.4 + Prom 92748 - 92807 5.6 81 47 Tu 1 . + CDS 92843 - 93763 489 ## COG0388 Predicted amidohydrolase + Term 93810 - 93837 0.1 + Prom 93810 - 93869 7.7 82 48 Tu 1 . + CDS 93983 - 94675 300 ## gi|253580746|ref|ZP_04858010.1| predicted protein + Term 94879 - 94916 -0.4 Predicted protein(s) >gi|226332885|gb|ACII01000134.1| GENE 1 50 - 880 1152 276 aa, chain + ## HITS:1 COG:CAC2982 KEGG:ns NR:ns ## COG: CAC2982 COG0489 # Protein_GI_number: 15896234 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Clostridium acetobutylicum # 36 257 28 256 277 235 50.0 7e-62 MAKECNNTSCDKASCEGCSSSKKQQSFQAEMNAQSNVKHVIGVVSGKGGVGKSFVTGSLA NMMAAQGYKVGILDADITGPSIPKMYGLKGAAMANDEGIYPMITKNGIKVMSINLLLPTE DTPVIWRGPVLANMVKQFWTDVIWGDVDYLFVDMPPGTGDVPLTAFQSLPIEGIVIVTSP QDLVKMIVKKAFNMAEMMKIPVLGIVENYSYVKCPDCGKEIKVFGESHIDEIAAELKVPV LGKMPIDMDYATKADGGFFAAIDNQYITDALAVMPK >gi|226332885|gb|ACII01000134.1| GENE 2 1150 - 2028 778 292 aa, chain - ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 3 287 1 287 296 151 32.0 1e-36 MDINLELYKVFYYVATTLSFSEASRQLFISQSAVSQSIKTLEKKLNHPLFIRSTKKVLLT PEGELLLQHVKPALQLLDEGESLLSGDNLLKGQLRIAASDTICRYFLIDYLQKFHQTYPD VRIKVTNSTSIGCAELLEKGQADLIVCNCPNSRLGSHFQTRVLKEFHDVFVANTDYFPVH TVQTELQELLNYPILMLSPKSTTSEYLREAFTAHNLKLLPEVELNSNDLLLDLARIGLGI ACVPDYMLKENDQLTPLMLKEPLPGRQLILAQHDSLTVSQAAERFIEMFTSI 
>gi|226332885|gb|ACII01000134.1| GENE 3 2329 - 6093 4762 1254 aa, chain + ## HITS:1 COG:CAC1655_1 KEGG:ns NR:ns ## COG: CAC1655_1 COG0046 # Protein_GI_number: 15894932 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Clostridium acetobutylicum # 2 959 3 959 985 1076 55.0 0 MSNVRRVYVEKKPAFAVQAKELKHEVSSYLGIKSVTAVRVLIRYDVENISDEVFDKACKT VFAEPPVDDLYLENFEAAEGSRIFSVEFLPGQFDQRADSAVQCVQFLDENAQPIIRSATT YVIEGTITDEEFDAIKHHCINPVDSRETGLQKPETLVTVFPEPEDVKIFDGFKEMPEAEL KELYASLNLAMTFKDFQHIQHYFKEEEKRDPSMTEIRVLDTYWSDHCRHTTFSTELTDVK FDEGDYKAPIVDTYNKYLADREELYKGRKDKFVCLMDLALMAMKKLKAEGKLADQEESDE INACSIVVPVDVDGKEEEWLINFKNETHNHPTEIEPFGGAATCLGGAIRDPLSGRTYVYQ AMRVTGAADPTVSVKETLKGKLPQKKLVRSAAHGYSSYGNQIGLATGYVKEVYHPNYVAK RMEIGAVMGAAPRRAVIRENSDPGDIIILLGGRTGRDGIGGATGSSKVHTEASIEVCGAE VQKGNAPTERKIQRMFRREEVSHIIKKCNDFGAGGVSVAIGELAAGLRVDLDKVPKKYAG LDGTEIAISESQERMAVVVDPKDVDTFLGYANEENLEAIPVAVVTEDPRLVLVWRGKEIV NISRAFLDTNGAHQETTVEVEIPNKEGNLFEERPDVADVKAKWMETLADLNVCSQKGLVE MFDGSIGAGSVFMPYGGKYQLTETQSMVAKVPVQNGKTDTVTMMSYGFDPYLSSWSPYHG AAYAVTESVARIVATGGDYKKIRFTFQEYFRRMTEDPKRWSQPFAALLGAYAAQMGFGLP SIGGKDSMSGTFNDIDVPPTLVSFAVDVAKIQDVITPELKKAGNKLVWLRAPKDQYDLPD YAGIMDQYEKLHKDMQDGKVVSAYALDRHGIAAAVAKMAFGNGMGVKIEHNLDPRDFFAP GFGDIILEVPDGKVGELSITYTVIGEVTADGNFSYGNTVITVDEAENAWKGTLEKVFKTV SSENDKEAADRDEKLYRADSIYVCKNKVAKPRVFIPVFPGTNCEYDSTRAFERAGAEVDV KVFKNLTAEDIHDSVELFEKAIDQAQIIMFPGGFSAGDEPDGSAKFFATAFQNAKIKEAV MKLINERDGLALGICNGFQALIKLGLVPYGEICGQKADSPTLTFNTIGRHISKMVYTKVV SNKSPWLQKAQLGGVYCNPASHGEGRFVANEEWLQKLFANGQVATQYVTPDGQLSVDEEW NVNGSYMNIEGITSPDGRILGKMAHSERRGDGVAINIYGQQDIKIFESGVEYFK >gi|226332885|gb|ACII01000134.1| GENE 4 6392 - 6499 67 35 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580664|ref|ZP_04857928.1| ## NR: gi|253580664|ref|ZP_04857928.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 35 40 74 74 63 100.0 6e-09 MDLKKPLTFDEQLDKLAAHGMIIRDREKAKDILNL >gi|226332885|gb|ACII01000134.1| GENE 5 6660 - 7367 313 235 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580665|ref|ZP_04857929.1| ## NR: gi|253580665|ref|ZP_04857929.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 235 1 235 235 432 100.0 1e-120 MNNVKIICIAGSIVLGSILLSGQAFGKAPSDLITLSAPKEPEGQHVSYDIDGSGSYYDSE SDTTYYFVKGIVTSYSNHTKTFSLTSDVGETYELPADTDINITDYLNTEAHIWYSEDSQK NLKLLVFSPQIFWEHINELESDEQIPDGATEIDGTLSKYISFWLTEDKIDVTLLFVEDND GNTHIYAEDNTCNNCNSSVFGKTIGDYIKLYEDMHLNSQVYDDFNVAYLISEASS >gi|226332885|gb|ACII01000134.1| GENE 6 7834 - 8010 150 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580666|ref|ZP_04857930.1| ## NR: gi|253580666|ref|ZP_04857930.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 58 1 58 58 70 100.0 5e-11 MSDERKRENEEEKLKDFFHNVIGIKSEAVLEVLAENSEICKLKAKTLLMKENEKVKLW >gi|226332885|gb|ACII01000134.1| GENE 7 8004 - 8276 176 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580667|ref|ZP_04857931.1| ## NR: gi|253580667|ref|ZP_04857931.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 5 90 1 86 86 159 100.0 5e-38 MVGIMNLDKQITSFLTVELITDCEIVRISADVLRKLAMENIEIALVCNRMQSVASMREYE YRKMILTCNPVQRYEYFLETYPGLEKYDGA >gi|226332885|gb|ACII01000134.1| GENE 8 9036 - 10091 1086 351 aa, chain + ## HITS:1 COG:VC0907 KEGG:ns NR:ns ## COG: VC0907 COG1135 # Protein_GI_number: 15640923 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Vibrio cholerae # 4 349 2 343 344 303 48.0 3e-82 MSGIVIEKVRKSFDTKDGVVEALKDVNLNIDSGDIYGIIGMSGAGKSTLVRCMNFLEIPT EGQVLIDGKALGDLTEKELRKQREEIGMIFQHFNLLMQKSVIDNVCFPLYIKGKKKAEAR KRAAELLEIVGLGDRQNAYPAQLSGGQKQRVAIARALASDPKILLCDEATSALDPQTTSS ILELLKKINRQFGITIVIITHQMSVVREICTHVAIMYEGEVVEKGLVADIFANPQSEVAK ELIRKDTGSDIDADSRLDAGREKGQAEIKSGEKIRIVFSENSAFEPVIANLILTFGEPVN ILKADTKNVGGVAKGEMILEFMEGSTRTEMMKQYLKEKGLAIGEVTEYVGS >gi|226332885|gb|ACII01000134.1| GENE 9 10078 - 10737 901 219 aa, chain + ## HITS:1 COG:FN0659 KEGG:ns NR:ns ## COG: FN0659 COG2011 # Protein_GI_number: 19703994 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Fusobacterium nucleatum # 2 218 14 230 233 166 45.0 4e-41 MWDHETIMMIVQGVGETLYMTVLSTVLGYLFGLPMGVLLAVSDKEGLKPNAVLYKILDAI ANIVRSIPFLILLILLIPFTRAVVGKSYGSTATVVPLVVAAIPFIARMVESSIKEVDAGV IEAARAMGASNIRIIVKVLLVEARTSLITGATIAIGTILGYSAMSGAVGGGGLGDIAIRY GYYRYQSDIMIVTVILLVVLVQIFQSVGMMIANKIDRRK >gi|226332885|gb|ACII01000134.1| GENE 10 10872 - 11798 1394 308 aa, chain + ## HITS:1 COG:PA5505 KEGG:ns NR:ns ## COG: PA5505 COG1464 # Protein_GI_number: 15600698 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Pseudomonas aeruginosa # 1 268 1 258 260 224 48.0 2e-58 MKKLVAAVLTGVLVAGTLSTGVYAKDDDKTITVAASATPHAEILEEAKPLLEKEGYDLEV TVFDDYVRPNEVVESGEFDANYFQHIPYLDQFNEEKGTHLVNAGGIHYEPFGIYPGTKDS LDDLEDGDSIAVPNDTTNEARALLLLQDNGIITLKDGAGLNATVKDIAENPHNVKIVELE 
AAQVARVTGETAYVVLNGNYALEAGFSVGKDALAYEKSDSEAAKTYVNVIAVKEGNENSD KIKALVDVLKSDEIKDFINEKYDGAVIPFDDSADTEEADAEETNSKDEAKDDAAEENAET TEAAEDAE >gi|226332885|gb|ACII01000134.1| GENE 11 12084 - 14099 1598 671 aa, chain + ## HITS:1 COG:CAC0404_1 KEGG:ns NR:ns ## COG: CAC0404_1 COG0515 # Protein_GI_number: 15893695 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Clostridium acetobutylicum # 58 268 74 293 306 125 34.0 2e-28 MKEKQIWKDYLPVEMKEHWTVYECLKENEDSATFLVKETVAGILCVLKWGRNMQAELLRN EMEILEKLADRKLSGIPKAYRIFEENREVYLVREYIEGTPLAQMVLQKGEIPETEICRIS RKICRTAEQFQNPDNLMIHRDIKPENIVITPGGEAVFIDFGTMRSYKKDGSRDTFVVGTR GTAAPEQYGYTQTDQRTDVYAIGQTMLYMVSESYEMDQLSECNISRKMKKVIEKACSFEP DKRYTDASELSRAIEKCQKNSRKILWKKTGVAVGLIAVGYILAFLFPGMTATKNERIAAT DQNAAEVLQTTAETQDEETEQTAEKTQNPEQVQNVEQSQNNPVVFKEALIEKAVRKELNL SETDTITASMLENVRKLRIVGKEILEDEDSFWGEGHHVDGKGSSFGSVRGDIKDLSDMAQ MTNLEELALCNQEIEDISGLKDLPLKKLYLSKNMITDFSVLPNLIDLDILCIMENPAENL SPVGECTGISRLNIQGMNLEDIDFLKNLKLDYLDMSNAEVKSNIFEPLAEMGKLDTLCMC DINESAAETLSQLSNLKALFMWGDTTELENLKPLKGMKELENLAFTTQISSLEGIEQFPS LNFLSVSFSLVKDLTPITGAKSLQVIDISNAYIENFEALFEHTGLKEVRCSEEQKQEILK LNSSPDFEIYT >gi|226332885|gb|ACII01000134.1| GENE 12 14292 - 14699 473 135 aa, chain + ## HITS:1 COG:no KEGG:Shel_21690 NR:ns ## KEGG: Shel_21690 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 4 134 11 143 144 70 33.0 3e-11 MKETDEIMGMIKEEGAAYFARERGNKVEVGLYLEHLIEERGMKKKDIVRKLNLEESYARK LFGGQRIPTRKILLQCAFLLSLNLSDTQRLLDIGQKPRLYPRVRYDAAIIYGLEEKMTLE QMNSFLDEIEEETLL >gi|226332885|gb|ACII01000134.1| GENE 13 14922 - 17978 2713 1018 aa, chain + ## HITS:1 COG:MA4292 KEGG:ns NR:ns ## COG: MA4292 COG3291 # Protein_GI_number: 20093081 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 842 994 1330 1452 1995 66 33.0 2e-10 
MKKKQIAVLLAALLSVSPAVEGVAVMGADFSSGEEAVEAMQEQTEIEDSSAENVATETEE AENAGITEGDSDSTGTEDIWTSGEEENEFGTGEIPEVDAFSAEEDDVTDAATAEQGKGSF GVAYITEEQYIGAFQNEQDPGVELKTVENTTFSEALSSVEGENTGYCYVQVQDLDNPADF VVPEGLTVLVEGARIRSITPKGNITFLGVRDQDVSVEIKEGSGTVTFCHLDTDAVIRGTG SNDTVMFYDNAFLGGISGVENIHFGKECRNLNIRGESEFYNLYNDTERPQEDLPVWFQIE GYSAKKVPVFHKEFDWGSGEFDDGNGGHRTAEYGIGIEYIESFDVDGEWKVLDIGANKAG AKFDISDMDALEMIGRVGVSIPENWYLLDLDGKTWMPSENTAVEIHQFQACGGMTAQETF ETYGKGETPENACVGLAIVPSIAQAERYMSAYQKKNSDTEGYYLMHIGSGSEGSETITVP DGVKALKVEGQNNYDESANTDHFVPVNISSVNVPAGKKISLCRTLVKSNTLTFTGAGTAE IVDSRLDTNVKADKLEIIDAAVKSLECKELVAALGGRLIISEYLKFDKAFLEPEGMVIYG QPGAYLNMGEIKVTESDSNDLVQIFIGENGKKRAQIYFGGNIDSGKVTYEGKEYPRRIQL RKFDEKAARSVGALCAKDCFDESYRYDWYDDETDNWWNYQCQYNNEEICIATIGSTVDTE KIKEELLVILSLKDDKGEQRYCQVAPRCLYLDDYKNLTDGIDGNKRLAFYLDWWSSENEE KAVEIPAYFEFTGETLDSIASAQISDIKTQLYAGKAITPSVTVTLNGKKLKKGTDYTVSY ANNTKAGSTASVIITGKGKYVGTATKDFEIAAIPAKGKVYTAGNLKYKMTKSAYKNGTVS VYAPAKKTLTSISIPATVKINGYTFQITAIGNKAFAGCTKLKSVKIGAKVTAIGKEAFSG CKALTSITISSNVLKAVGSSAFKGISAKAVIKVPAAKLSAYQKLLKGKGQKSSVKITK >gi|226332885|gb|ACII01000134.1| GENE 14 19048 - 19452 478 134 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580675|ref|ZP_04857939.1| ## NR: gi|253580675|ref|ZP_04857939.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 134 10 143 143 215 100.0 7e-55 MLDKNDMKMLREMMEATVKETVEATVKETVEATVQKAIEPLQQDICDLKEEVNGVKLHLE NVTDRNISILAENHGYLVKKFNEAVEAAHNNHVNTVQTSYLIEHVANLENRMQAVRKKYD IVMCRNSKSGCEAY >gi|226332885|gb|ACII01000134.1| GENE 15 19685 - 21145 362 486 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 13 474 6 431 447 144 28 2e-33 MEKQEKNTATPYDLDGRPPLKLAIPLGLQHVLAMFVGNLTPLLIITAACGIEAGGELQVA LLQNAMLIAGIVTLVQVFTVGPVGGKLPIVMGTSSGFIGVCQSVAGVMGNGVVAYGAIMA ASFIGGLFETVLGSFLKPLRKFFPSVVTGTVVLSIGLSLIAVGISSFGGGSSAKDYGSLE NLFVGFVVLVVIVVLKHGTKGFTSFASILIGIIVGYVLVSIMAAVLPTTFTYVDDAGATV EATKSWVLNWDKVASASWFAVPKIMPVKWVFDARAIVPICIMFIVTAVETVGDISGITEG GLGREATDKELSGGVMCDGLGSSLAAVFGVLPNTSFSQNVGLVAMNKVVNRFSIATGGIF LIACGLFPKLAALISIMPQSVLGGAAVMMFSSIVVSGIQLITKWPVTPRNVTIVSVALGL GYGLGANSAVLAHMPQAITLIFGGSGIVPAALFAIILNIVLPPDEKSEVEIAEEILDMKK KEAQKK >gi|226332885|gb|ACII01000134.1| GENE 16 21772 - 23259 1531 495 aa, chain + ## HITS:1 COG:aq_582 KEGG:ns NR:ns ## COG: aq_582 COG0147 # Protein_GI_number: 15606032 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Aquifex aeolicus # 4 495 6 493 494 388 44.0 1e-107 MYPTLETIRRLAQTKEYRRIPLCRELYADRYTPVEVMRILRKASRHCYLLESASQTEVWG RYSFLGYEPSMEITCTDGTLKIRTTDAEGKTEETVRQVTHPGDTLREIIRKYKTPVMEKM PPFTGGLVGYFSYDYIKYSEPKLKLEDREQQDFRDMDLMLFDQVIVFDHFRQKVLLITGV MTEDLEASYEKAVKKLKEMAQLIRNGEHMEFEPLRLQSEIQPKFPKEKYAEMVEKAKHYI REGDIFQVVLSNPMRAKATGSLFDTYRVLRASNPSPYMFYFSSDDIELAGASPETLAKLE NGTLSTFPLAGTRPRGKTREEDKELEEGLLKDEKELAEHNMLVDLGRNDIGKISKIGTVK VEKYLCVERFSHVMHLGSTVTGIIRDDKDAVDAVDAILPAGTLSGAPKFRACQIIEELEQ SKRGIYGGAIGYLDFAGNLDTCIAIRLVYKKNGEICIRSGAGIVADSVPEKEFQECCNKA RAVVQAIEQAQEGLE >gi|226332885|gb|ACII01000134.1| GENE 17 23256 - 23834 555 192 aa, chain + ## HITS:1 COG:CAC3162 KEGG:ns NR:ns ## COG: CAC3162 COG0512 # Protein_GI_number: 15896410 # Func_class: E Amino acid transport and 
metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Clostridium acetobutylicum # 1 186 1 185 195 228 57.0 7e-60 MILLIDNYDSFSYNLYQLIGAVEPDIKVVRNDECTLEEIAEMKPEAIILSPGPGRPEDAG ICIPVIKEFAGKIPILGVCLGHQSICEAFGGTVSYAKELMHGKKKLMYKTGESQLFKGLP DTFAAARYHSLAALKDTLPAELKVTAESEDGEVMAVEHTKYPVFGVQFHPESVMTPDGRV MIENFMEVVRND >gi|226332885|gb|ACII01000134.1| GENE 18 23827 - 24837 1344 336 aa, chain + ## HITS:1 COG:MJ0234 KEGG:ns NR:ns ## COG: MJ0234 COG0547 # Protein_GI_number: 15668409 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Methanococcus jannaschii # 1 332 1 331 336 317 54.0 2e-86 MIKEAIIKLSKKQDLAYAEAEAVMDEIMSGQATPVQMSAYLTALALKGETIDEIAASAAG MRAHCIKLLHNLDVLEIVGTGGDGSNSFNISTTSSLVIAAGGVPVAKHGNRAASSKSGAA DVLETLGVKITLTPERSAEILKKINICFLFAQNYHIAMKYVAPIRKELGIRTVFNILGPL SNPAGANMELMGVYDQSLVEPLAQVMANLGVNRGMVVYGQDSLDEISMCAPTSVCEIRDG KFTSYEITPEQFGYERCEKGALTGGTPAENAEITKAILKGEEKGPKRQAVCLNAGAALYI AGKAASIEEGVKLAESLIDSGAALKKLEEFVEETNK >gi|226332885|gb|ACII01000134.1| GENE 19 24936 - 25736 787 266 aa, chain + ## HITS:1 COG:BMEI0843 KEGG:ns NR:ns ## COG: BMEI0843 COG0134 # Protein_GI_number: 17987126 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Brucella melitensis # 3 260 39 296 303 196 45.0 5e-50 MKNILEEIAARTRERIAKEKTCISVSELENRIQEVNKNARQKITFLQALQKDGMSYICEV KKASPSKGLIAPDFPYLAIAKEYEQAGASAISCLTEPFYFQGADQYLREISAAVQIPVLR KDFTVDEYMIYQAKSLGASAVLIICAILDDGELRAYRQLAKELGLDALVEAHDEYEVDRA LNLGAEIVGVNNRDLRTFQVDMNNSIRLRKMAPDNVVFVSESGIRTPEDIRILYENKVDG VLIGETLMRSPDKKAALESLNGSPLR >gi|226332885|gb|ACII01000134.1| GENE 20 25767 - 26405 576 212 aa, chain + ## HITS:1 COG:CAC3159 KEGG:ns NR:ns ## COG: CAC3159 COG0135 # Protein_GI_number: 15896407 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylanthranilate isomerase # Organism: Clostridium acetobutylicum # 10 205 3 201 205 145 42.0 
5e-35 MTREQQQRTKIKICGLKRPEDITYVNEAKPDYCGFIIEFPKSSRNVTGALVRELTAKLNP DIIPVGVFVNVAPERVEELLLDGTIHIAQLHGQEDEAYIRRIQKNTGHQVIKVFSVKTAQ DIENALKSPADYILLDQGGGGTGQTFDWSLIPEIDRPFFLAGGLGADNLETAVRTIHPYA VDLSSSVETDGMKDRDKILKAVQLVHIKSERI >gi|226332885|gb|ACII01000134.1| GENE 21 26468 - 27652 1571 394 aa, chain + ## HITS:1 COG:CAC3158 KEGG:ns NR:ns ## COG: CAC3158 COG0133 # Protein_GI_number: 15896406 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Clostridium acetobutylicum # 4 391 3 390 394 514 63.0 1e-145 MSKGRFGIHGGQYIPETLMNAVIELEEAYNHYKDDLEFNAELTELLNEYAGRPSRLYFAS HMTEDLGGAKVYLKREDLNHTGAHKINNVLGQVLLAKKMGKTRVIAETGAGQHGVATATA AALMGMECEIFMGKEDTERQALNVYRMRLLGAKVNAVTSGTATLKDAVSETMREWTNRIS DTHYVLGSVMGPHPFPTIVRDFQAVISKEIKEQILEKEGRLPDAVLACVGGGSNAIGTFY NFINDKDVRLIGCEAAGRGVNTAETAATIATGKLGIFHGMKSYFCQDEYGQIAPVYSISA GLDYPGIGPEHAHLYDTGRAEYVPVTDEEAVEAFEYLSRTEGIIPAIESAHAVAYAKKIV PQMRKDQIVVITISGRGDKDCAAIARYRGEDIHE >gi|226332885|gb|ACII01000134.1| GENE 22 27645 - 28424 409 259 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148997862|ref|ZP_01825426.1| ribosomal protein L11 methyltransferase [Streptococcus pneumoniae SP11-BS70] # 1 253 1 258 258 162 37 8e-39 MNKRITEAFEKGKAFIPFITCGDPSLEVTEQLVYAMEEAGADLIELGIPFSDPTAEGPVI QEANVRALSGGVTTDKVFDMVVKIRKNSSVPMVFMTYANVVFSYGTERFIKTAAEIGMDG LILPDVPFEEKEEFDSVCKKYGIDLISLIAPTSHERVAQIAKDAEGFVYCVSSLGVTGTR TNITTDIGAMVKLVKAVKDISCAVGFGISTPEQAKKMAAQSDGAIVGSAIVKLCAEHGAD CVPYVKEYVKSMKDAVMEA >gi|226332885|gb|ACII01000134.1| GENE 23 28777 - 29571 515 264 aa, chain - ## HITS:1 COG:Cgl0894 KEGG:ns NR:ns ## COG: Cgl0894 COG0561 # Protein_GI_number: 19552144 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Corynebacterium glutamicum # 3 264 17 275 277 131 29.0 1e-30 MIKLIASDIDGTLVKDGTLAIDREYMEVIGRLIDKGIVFVACSGRQYRSERKLFTPVADR LLYISDGGTVVRTPKEILKVDTLPDEIWKGMCSMVRENMPSCDYFISTPERCFAEDGGSQ MFHWLRDSYGYDIHEVPEMLKLEGQQVNKFAVYHPNACEEMCAPLFTPTWKDKTVMAAAG 
KEWMDCTAPGSGKGPSVAFLQEYLGISPDETCTFGDNLNDIEMLQNAGLSYAVSNSRPEV RAAAKNTCPPYWENGVLQILRSFL >gi|226332885|gb|ACII01000134.1| GENE 24 30193 - 31143 908 316 aa, chain + ## HITS:1 COG:lin1865 KEGG:ns NR:ns ## COG: lin1865 COG1597 # Protein_GI_number: 16800931 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Listeria innocua # 1 293 1 293 310 154 31.0 3e-37 MNKKMLFVFNPKAGKGKIKTNLLDIVDIFNKGGYEVIIYSTQKPKDAYEKAKEYESKVDL IVCSGGDGTLDEVVTGVMEKKSSIPIGYIPAGSTNDFANSLFMPKSMTDAASMIMEEKLY HCDIGRFNNQSFTYIAAFGLFTDVAYQTDQDLKNILGHVAYLLEGVKRLFDIKSYHMRIE SEELTVEDDFIFGMITNSRSVGGFKNLTGKNVDMNDGLFEVTMITRPKNPLELQEIMTAM LTAEDNTDLIHSFKSARVTITSEEPVPWTLDGEYGGSHTQIEIENCHEALNLYLKSTKNE ETRSIGISPLINKKEK >gi|226332885|gb|ACII01000134.1| GENE 25 31145 - 33265 2356 706 aa, chain + ## HITS:1 COG:CAC2658 KEGG:ns NR:ns ## COG: CAC2658 COG3968 # Protein_GI_number: 15895916 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Clostridium acetobutylicum # 9 706 6 696 696 848 60.0 0 MGEYINVVETFGCDVFNDSVMQARLPKKIYKELKKTIEEGKELSMEIADVVAHEMKEWAI EKGATHYSHWFQPLTGVTAEKHDAFITAPMDNGKVLMSFSGKELIKGEPDASSFPSGGLR ATFEARGYTAWDCTSPAFVRHDAAGGTLCIPTAFCSYTGEALDQKTPLLRSMEAINTQAL RLLRLFGNTTSKKVTPSVGAEQEYFLIDKEKWLQRKDLIYTGRTLFGAMPPKGQEMDDHY FGSIRQRISAYMKEVNEECWKLGIPAKTQHNEVAPAQHELAPIYAPVNIAADQNQMMMRI LKKVASRHGMRCVLHEKPFAGVNGSGKHNNWSLTTDDGINLLDPGKTPHENIQFLLVLTC ILKAVDEHADLLRESAADVGNDQRLGGHEAPPAVISVFLGEQLEDVLEQLVSTGTATHSK KGSKLETGVKTLPDFMKDATDRNRTSPFAFTGNKFEFRMVGSRDSISACNVVLNTITAEV FKEACDRLEKAPDFELAVHDLIKEYATDHQKIVFNGNGYAPEWEKEAKRRGLPILPSMVD AIPALTTEKAVKLFESFDVFSRAELESRAEIKYEIYSKAINIEAKTMIDMATKQIIPAVI RYTTVLGESINTVKTAGEGLIDVSVQLDLLKKASKLLKAANSAMSKLSKLVEKTPEYAEG QDRAVFCREKVVPAMEALREPVDELEMIVDKEMWPMPSYGDLVFEV >gi|226332885|gb|ACII01000134.1| GENE 26 33382 - 34302 983 306 aa, chain + ## HITS:1 COG:BS_yojN KEGG:ns NR:ns ## COG: 
BS_yojN COG0714 # Protein_GI_number: 16078999 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Bacillus subtilis # 14 292 19 301 304 119 30.0 5e-27 MEELKFLEEQQVAPELIKRVEEFRSQYEVVPETAGRIVKPSIPFYGKNILEMAIAGLLKG ENLLLTGPKATGKNILAENLAYIFNRPAYNVSFHVNTNSGDLIGTDTFVDNEVKLRKGTI YQCAEYGGFGILDEINMAKNDAVSVLHATLDYRRSIDVPGYDKIDLHPAARFIGTMNYGY AGTKELNEALVSRFLVIDMPAQTEESLEFIFRRMFPNIKENARTQFIGVFLDLQLKATNS EISTKPLDLRGLLAAMKIVETGLSPRKAVTMGVVNKTFDIFEKEIVEDIVKTRIPADWTR DDVYER >gi|226332885|gb|ACII01000134.1| GENE 27 34292 - 36085 1482 597 aa, chain + ## HITS:1 COG:SMa1269 KEGG:ns NR:ns ## COG: SMa1269 COG4548 # Protein_GI_number: 16263145 # Func_class: P Inorganic ion transport and metabolism # Function: Nitric oxide reductase activation protein # Organism: Sinorhizobium meliloti # 398 584 436 618 631 66 30.0 1e-10 MKDEKIVEIDDYRLELENRIRNLLWTISGDYTLNMKVDVSLYLRSKAIALYDGIKQGALA KYYDRNLLGLYLVKKVFCQAGEGELTAVAQLCVEEAIGDKICQERPGVREMQKEAFEDIL DQEFERMPAYGDILGRLKIAILRDRLQNGNHYVEERLREIRDMVYKARTAADTIELIRII DSLYNTIIDPNFEKDHGTLEQVMAVTLEELTEYDWEDFLTEELYEEALESYIEQMTNKIT DMEDTAVTEEMEKQRQTKHKITILTEEELEKAYTYVELNFGKTYLTPAEEKRMNYLMCRE LHRDCSLYFTEGILKNPVKRNYQYEYAVRLKNKNIWLYHDKHRIVKQNIASLTDLLRKTL VLKSETQEVLSDRGTIIPSRLWRVGRSSEANLFKRELKSDASDFVVDVLIDASGSQMSRQ GDVALQAYIISEALSNVNLPHRVMSFCTFWDYTILHRFREYDDPQSANENIFNYVTSSNN RDGLAIKTAGYGLLQRSEEKKILIVLSDGKPYDVIVNRPHAKNPEPYMGKYAINDTATEV RHLRNLGVSVLGVFAGEEKDLSTEKKIFGKDFAYIRDISNFSRIVGRYLIKQLDSDE >gi|226332885|gb|ACII01000134.1| GENE 28 36264 - 38162 2555 632 aa, chain + ## HITS:1 COG:VNG6268C KEGG:ns NR:ns ## COG: VNG6268C COG1297 # Protein_GI_number: 16120189 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Halobacterium sp. 
NRC-1 # 9 595 28 614 655 370 41.0 1e-102 MNENKEFKPYIPADKITPEFTVTSIIMGVILAVIFGAANAYLGLRVGMTVSASIPAAVIS MGVIRKIMKKNSILESNMVQTIGSAGESLAAGAIFTMPALFLWAEEGLCEMPGIVEITLI ALCGGVLGVLFMVPLRNALIVKEHATLLYPEGTACADVLLAGEEGGANAATVFSGMGIAA IFKFVVDGLKVIPADVAVAFKGFKGEIGMECYPALLGVGYIVGPRIASFMFVGSLIGWMV LIPVICLFGADTVMYPGTETISALYAAGGASKIWSTYVKYIGAGAIATGGIISLIKSLPL IIATFRDSMKSMKGGKNTSTERTAQDLPMNVILIGIVAMVAIIWLVPAIPVNPIGALIIV IFGFFFATVSSRMVGLVGSSNNPVSGMAIATLLIATMVIKASGNTGIDGMKAAIAIGSVI CIVAAIAGDTSQDLKTGYLLGATPKKQQIGELIGVVAAGLAIGGVLYLLNTAWGYGSAEV PAPQATLMKMIVEGIMGGKLPWTLVFIGVFLAIGLEILRIPVMPFAIGLYLPIYLNAAIM IGGVVRMFVDGRKNVDDKKKEEQATDGTLYCAGMIAGEGLVGILLAILAVVNVSSVFDLS GSINLGNAGGVILMLLMILSLLKFSIWNKKKA >gi|226332885|gb|ACII01000134.1| GENE 29 38187 - 38552 283 121 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20051 NR:ns ## KEGG: EUBELI_20051 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 121 1 117 121 100 41.0 2e-20 MSKKKKELQINYLDLIPERSAQLRWHTDIRGRMVLEIENKGAFNTIAQKLFGKPRYTKIH LDDNGTFIWPLIDGRKTVEDIAALVKDEFGEKAEPLYPRIVKYFQIMESYHFVNFINKPE K >gi|226332885|gb|ACII01000134.1| GENE 30 38711 - 39583 948 290 aa, chain + ## HITS:1 COG:BH1678 KEGG:ns NR:ns ## COG: BH1678 COG1284 # Protein_GI_number: 15614241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 15 286 11 279 290 163 36.0 3e-40 MKLVNGKRSQLLEYLMIIAGTGLMALAINSVFDASGMVTGGFSGIAIIIKAGSQGIIDGG IPLWVTNMGLNIPLFLIGWKINGFSFVKKALIGEIALSTWLAIEPVWNLAGNDLLLAALY GGVIQGVGIGLVFLGGGTTGGTDMMAAIIQKYLQHYSIAQIMQVIDAGVVLVGMYVFGIH KALYAIIAVYLVTKVSDGLIEGLKFSKAAYIITAKPKEIADMIMKDLDRGVTGISARGMY SGQDKLMLFCVVNKKEIVMLKEKVDEIDPEAFVIVTDAREVHGEGFIEKK >gi|226332885|gb|ACII01000134.1| GENE 31 39730 - 40818 1281 362 aa, chain + ## HITS:1 COG:CAC3236 KEGG:ns NR:ns ## COG: CAC3236 COG2508 # Protein_GI_number: 15896482 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase 
expression # Organism: Clostridium acetobutylicum # 169 350 134 307 312 88 34.0 2e-17 MISNQVLQNTLEGLKEISRTEFCVLDTEGKVLASTFADFSIATQDVQAFVESQADSQLVK GFQYFKVCDDYQLEYILVAHGDDEDTYMVGKLAAFQIQNLIVAYKERFDKDSFIKNLLLD NLLLVDIYNRAKKLHIEADVRRVVMILEMPQEKDHSSMESVKSLFGGKSKDFITAVDEKS IIVVKELEDGENYPEMDQLAHTILDTLGMGNDGKTHMAYGTIVNELKEVSRSYKEARMAL DVGKIFFGDKEVIAYSSLGIGRLIYQLPIPLCKMFIKEIFGGKSPDDFDEETLVTIDKFF ENSLNVSETSRQLYIHRNTLVYRLDKLQKSTGLDLRVFEDAITFKIALMVVRYMKYMETL EY >gi|226332885|gb|ACII01000134.1| GENE 32 41117 - 41800 319 227 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 216 1 218 245 127 32 2e-28 MIDLIDVSKSYGPGLPAVNHANLHIDKGEFVFVVGNSGSGKTTLIKLLMKELDSTSGQII VSGERLEGMKHRRVAKYRRKLGVVFQDFRLLKDRNVYENVAFAQRVVNRPTRVIRKRVPE VLQEVGLAVKYKSFPDELSGGEQQRVALARALVNRPDILLADEPTGNLDPKTSEEIMKLL EQINDRGTTVVVVTHNKDIVNSMRKRVVTMHEGSIISDEKEGEYHEA >gi|226332885|gb|ACII01000134.1| GENE 33 41790 - 42695 938 301 aa, chain + ## HITS:1 COG:BH3601 KEGG:ns NR:ns ## COG: BH3601 COG2177 # Protein_GI_number: 15616163 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Bacillus halodurans # 1 301 1 298 298 138 30.0 2e-32 MRPSTIWYTLKQGVKNIKRNWMFSLASIITMAACIFLFGIFFSIVNNVNNIAHKVEQEVP ITVFFEEGTTNKQIKAIGKKIKARPEVEKIEYQSADEAWEEYKEQYFQGSEAADGFKDDN PLANSANYRVYMNDITKQSELVSYIQELKHVRKVNQSEEAAKTLGNINKLVSYISIAIIV LLLIISIFLISNTVSVGIAVRKEEIGIMKYIGATDAFVRAPFLLEGMVLGVVGAAVPLIV LYFCYNGAISYILTRFNVLTGVVDFIPVWQIYQTLLPISLGLGIGIGLIGSFFTTRKHLR V >gi|226332885|gb|ACII01000134.1| GENE 34 42776 - 43966 1411 396 aa, chain + ## HITS:1 COG:CAC0499 KEGG:ns NR:ns ## COG: CAC0499 COG0793 # Protein_GI_number: 15893790 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Clostridium acetobutylicum # 44 395 53 401 403 227 37.0 3e-59 MKNKEFWKGAVAGALVVTCIGTGVFAGTGLAAKGTALGDRKCTSKLAYLEELIDRYYLGD KDEEKLQEGLYAGLLYGLDDPYSRYYTEEEYDSENSSNEGSYVGIGILMEKNKEGGVKIV 
ECYEGGPGEKAGLEEGDIISAIDGEDITEDEVSDVADIVRNSDKDSVVLTVHREGAEDAM EITVPVTDVELPSVFHEMLGSKIGYIRITQFTGVTGEQYQEAFDDLQKQGMEKMIVDLRD NPGGLLDSVCDVLRKILPEGLIVYTEDKDGNREEEKCDGKNELRIPLAVLVNESSASASE IFAGAVQDYGIGTIVGTTTYGKGVVQSIRQLSDGSAIKLTVANYYTPKGNNINKTGIKPD IEVSLDTSLLNKNKDEITHDEDNQLQEALKAVEKEK >gi|226332885|gb|ACII01000134.1| GENE 35 44144 - 45013 967 289 aa, chain + ## HITS:1 COG:BS_yitS KEGG:ns NR:ns ## COG: BS_yitS COG1307 # Protein_GI_number: 16078175 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 6 285 5 281 283 198 37.0 8e-51 MSEYVITTDNNSDLPEEYLKDHGVGCMYLSYSMDGKNYTHGNFLPEHEFYEAMRNGSMPT TAQVNPENAKALLEPYLKEGKDILHIAFSSALSGTYNSSRIAAEELMEDYPDRKIIVVDS LGASLGQGLLVYLAQEKKEQGEDMETVAKWAEENRLHMVHLFTVNDLNHLYRGGRISRTT AVVGSMLNIKPVLHVDNDGKLTAIGKVRGRKKSLQELVKLMDEKIGSYHDTCHTIFISHG DCEEDANYVAEKVKEKYQINTIIMNQVGATIGAHSGPGTMALFFVGDER >gi|226332885|gb|ACII01000134.1| GENE 36 45110 - 47047 1764 645 aa, chain + ## HITS:1 COG:AF0876 KEGG:ns NR:ns ## COG: AF0876 COG0737 # Protein_GI_number: 11498482 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Archaeoglobus fulgidus # 26 579 15 542 584 212 31.0 2e-54 MKLLKKAFAFTFIFMLSVSGLTGYQVNASEEPKHLDILFTHDLHSHLNSFQTIVDGTQQE TGGFARLKTLINEHEKENTDTLILDGGDFSMGTLIQTVYDTEAAELRMLGYLGCDVTTLG NHEFDYGSDGLADMLNAAVSSGENLPRMVVCNVDWDAMKKAGLSEGQKQIYEEFQTYGVK DYTVIQKGDVKIAVLGVFGKDSLDCAPTCELLFKDPSEAARETVEEIKKNEDVDMIACVS HSGTWEDEKVSEDEILAKNVPDIDLIVSGHTHTQLAEPILQGDTCIVSCGEYGKNLGTIS MTQKENGRWETDTYELVPVTDEIKADEATQKKIDELSDTVDTNYLSNFGYTREEILAEND IEFNSLSEMETKHEELNLGDIISDAYVYAVENTGDSDGEKVDVAIVPAGTVRDTYTKGNI TVEQVYNSFSLGTGKDGLAGYPLISAYLTGKELKTVAEVDASISDFMTIARLYCSGMNFT YNPHRMILNKVTDCYLTGKGGEREEIQDDKLYHVVTDLYTGRMLGSVLDKSYGLISIVPK DKNGNSIENLEDYAVMDGKKELKAWAAIAEYMQSFEDTDKDGIANVPEYYDTTHERKVAD NSKNIINLVKHPNKFAVAIVGICIVVVGVILLLIFGVVKIIRRRR >gi|226332885|gb|ACII01000134.1| GENE 37 47294 - 47767 386 157 aa, chain + ## 
HITS:1 COG:PA4580 KEGG:ns NR:ns ## COG: PA4580 COG3236 # Protein_GI_number: 15599776 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 16 157 41 182 184 126 45.0 2e-29 MNVICFHNPDEENGYLSNWYLSHFTVKGIDFSSVEQFMMYQKACHFHDEKIANDILHTDD VAEIKKLGRKVHNYDENVWNGVRQVIVYEGLKAKFSQNADLKRQLVDTGDAVLAECAVRD QIWGIGLSMGDPNRFERSKWKGTNLLGYALMLVREQL >gi|226332885|gb|ACII01000134.1| GENE 38 47787 - 49910 1999 707 aa, chain + ## HITS:1 COG:lin0529 KEGG:ns NR:ns ## COG: lin0529 COG0659 # Protein_GI_number: 16799604 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS superfamily) # Organism: Listeria innocua # 8 544 8 545 553 388 37.0 1e-107 MHLELAKTLRNYDRKNLSKDILTGLIIMAVSIPISMGYAQIAGLPAVYGLYGSVFPIILF GLFSTSPQFIFGVDAAPAALVGAALVEMGIQSGSKQAMAVVPVLTFFVAVWLFVFAFLKA GKLVNYISAPVMGGFITGICTTIILMQVPKLMGGTAGTGELSELAEHIWATLGQINLPSL ILGLVSLVILLVMKKILPKFPMVVVLMAAGVLLSYFVPISNWGIRTLDAVEPGLPVWSIP DFSAIGLKEAVIYSLSIAVVIMAETLLGENSFAQKNGYRVNANQELVAFSMGNLAAALTG CCPINGSVSRTAISEQYQGKTQLTGIVAGISMILVLLGATGLIAYLPVPMLTAIVISSLL GATEFDLAVRLRKISKKECLIYWGAFWGVLTLGTINGVLIGIILSFAEMIIRTAKPARCF LGVQPGHRHFEDLTESLHNCEIEGVVIYRFSSSLFFANIDILQQDIENAIGSDTKAVILD ASGIGSIDVTAADRLAILYRSLKSRGIRFYITEHIAELNEQMRALGLGFMIEEGRVRRTI HVALKDMGIHYPYPLKGNVDNTERSPFRKRVDNSVQEFVWAFGSDSEAEIEKQIRLQIEQ LKENGDMQELLHGQWNHMDELDADAWIEHLEEHMKEIVKISGQDEQELTVRFEKHRKKLR EQMMKEHPELVAMFEERRRLLDEQLERKWPDVYALVLKQREKDSRLK >gi|226332885|gb|ACII01000134.1| GENE 39 49931 - 50833 523 300 aa, chain - ## HITS:1 COG:BS_ywbI KEGG:ns NR:ns ## COG: BS_ywbI COG0583 # Protein_GI_number: 16080882 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 1 290 1 291 301 164 33.0 3e-40 MTLNQLRYFCTASRCHSITKAAEELYVTQPTVSVAIRDLEIEFGISLFYRKGNRLILTEE GEELYNKATYILQYCTELQADYSSMARVKPPLRIGIPPMLSTVFFPELLIAFHDQYPEIA VVLEEYGSVRACNLVQDDTLDLALVNMEQYNIDKFNKALLANDQVVFCVSDDHRLADKEV 
VTTKEMSKESLIFFNADSVQNQLLKTRFETDGYVPDIIMRSSQIYTILQLVKTCKYGSFL YSSMTDRFASSGIKGIPLSPPIRIKIGMIWKKGKYISDNMQKFLNFTKMYYREHPLIQKL >gi|226332885|gb|ACII01000134.1| GENE 40 51228 - 52493 1400 421 aa, chain + ## HITS:1 COG:STM3356 KEGG:ns NR:ns ## COG: STM3356 COG0471 # Protein_GI_number: 16766651 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Salmonella typhimurium LT2 # 1 421 1 422 422 442 60.0 1e-124 MSQQTITLLFLVFTIISFILEKIPLGLTASICAIGLTLTGVVDVSTTFSQYVNSNVILCV GMFVVGQALFETGMANKIGGLVTRFARTERTLIIAIMVIVGVMSGFLSNTGTAAVLIPVV CGIANESGYSRSRLLMPLVFAAALGGNLSIIGAPGNLMGVNALQEMGLSTSFFMYAPVGV PMLLVGIIYFVFIGYRFLPDGKGAVGATVEKQKDFSDVPKWKQTVSLIVLILVILAMIFE KQIGVSIEISACIGAIILVLVGVLSEKEALASIDLKVVFLFGGSLTLAKALDTTGAGELI ADKIVGLLGADPNPIVLLLVIFIVTCALTNFMSNTATTALMIPIAVSLANNLGADPRAVV IATVIAGSCAYATPIGMPANTMVVGLGGYKFKDYVKSGLPLILVSFVICMILLPVLFPFY P >gi|226332885|gb|ACII01000134.1| GENE 41 52639 - 53544 1054 301 aa, chain + ## HITS:1 COG:STM3355 KEGG:ns NR:ns ## COG: STM3355 COG1951 # Protein_GI_number: 16766650 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain # Organism: Salmonella typhimurium LT2 # 1 299 1 299 299 452 72.0 1e-127 MTKENGVKVLTDMVADFVAHIGKKLPDDVVTKLEELGSQESAALPKVLYETMTKNQGLAV ELNRPSCQDTGVLQFWLKCGTNFPYINELEALLKEAVVQATFAAPLRHNSVETFDEYNTK KNVGKGTPTVWWDIVPNSDKCEIYAYMAGGGCTLPGKAMVLMPGAGYEGVTDFVLDQMTT YGLNACPPLLVGVGVGTSVETAALNSKKALMRPIGSHNDNANAAKMEKLLEDGINAIGLG PQGMGGKYSVMGVNIENTARHPSTIGVAVNVGCWSHRRGHLTVNSDLTVTCDTHSTWKFN A >gi|226332885|gb|ACII01000134.1| GENE 42 53560 - 54192 955 210 aa, chain + ## HITS:1 COG:STM3354 KEGG:ns NR:ns ## COG: STM3354 COG1838 # Protein_GI_number: 16766649 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain # Organism: Salmonella typhimurium LT2 # 8 210 3 205 205 281 67.0 6e-76 
MIEIRGDKKVLITPISEEDLADIKIGDIVWLDGELMTCRDVAHRRLVEYGRELPYDIKDK AIFHAGPIVRKIEGTEDDYEMVSVGPTTSMRMEKFEYEFTKLTGVRVIVGKGGMGPNTER ACKEFGAIHCVFPAGCAVVAATEVEKIVEHHWDELGMPETLWCNKVKEFGPLIVSIDAQG RNLFEENKVIFNERKEAAKQAIYPEVKFIK >gi|226332885|gb|ACII01000134.1| GENE 43 54820 - 57603 2247 927 aa, chain + ## HITS:1 COG:CC0723_1 KEGG:ns NR:ns ## COG: CC0723_1 COG0642 # Protein_GI_number: 16124976 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Caulobacter vibrioides # 535 774 215 454 473 165 37.0 4e-40 MKKFKLNVQRHICILLCFILYITVLPLPVSAEEAKSKTVRVGWYEGTYNTTGADGQRRGY SYEYQQAVAAHTGWKYEYVEGSWAELMSMLKNGQIDLLGGISYTEERSTSMLFSELPMGE DKYYLYVDTSNTDISISDLTTLNGKRIGMLPNALPAEMFHEWEKSHGVNTQQVDITGADD VRQKLKNHEIDGFVLNESPQWERDDISPAILIGSSYNYFAVSKKRPDLKEELDQVMQKIE RENPFYTDDLYKRYLSANSLETLTDEEQNWLEQHGAVRIGYLKNDVGISLVDTESEKPVG IINDYISLVSGYLGEQAIEFQLTGFESQEKELQALKDNRIDMIFHMNQNPYEAEQNDIVL SNTVFEINVAVLTGVKKFDENKENTVAVSRNNLLGKWYISFNYPFWKIKEYDSSAEADKA VQSGEADCFVVKAGQSLKTLEDSKMRSIFLTKSGASCFAVTRENTTLMNILNKTIQTLPA SRLSSQFYVYENAPGKVTLAEYIKDNLRVVSIWFVSVVLIIVWIIVYLLIQARKAKIQAE KANAAKSEFLFNMSHDIRTPMNALLGYSELIKRELTDPKLLDYQEKMEQSGNLLLSIINN VLDMARIESGKVELDEDYVKIRDIYQGIYKIFQAEAEKKCIHLKMEYDVKHEHVICDETK NKEIFLNLISNAVKYTASGGRVTIRITELDCDRKDYVRIRTQVIDTGIGMSEEFLPSLFE AFARERNTTDGKIAGTGLGMPIIKKYIDMMYGTIEVESKQGEGSKFTVTLEYRIADKSYY EQDTEKSSDVDETDRISGKHILLAEDNDLNAEIAQFILEDMGLMVDRVEDGVQCVSRIEQ KPAGTYDLILMDIQMPNMDGYRATEMIRGLSDKDKATIPIIAMTANAFEEDRKKALAKGM NGHIAKPVDAEKVEKTICSVLGSQSEQ >gi|226332885|gb|ACII01000134.1| GENE 44 57741 - 59729 1720 662 aa, chain + ## HITS:1 COG:BH3595 KEGG:ns NR:ns ## COG: BH3595 COG0556 # Protein_GI_number: 15616157 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Bacillus halodurans # 4 654 5 657 660 853 65.0 0 MDHFELHSEYKPTGDQPQAIERLVRGFKEGNQFETLLGVTGSGKTFTMANVIAQLNKPTL ILAHNKTLAAQLYGEMKEFFPENAVEYFVSYYDYYQPEAYVPSSDTYIAKDSSINDEIDK 
LRLSATAAMSERKDVVVISSVSCIYGLGSPQEFFDMMISLRPGMTKDRDEVIRELIDIQY TRNEMDFHRSTFRVRGDTLEIFPANSSDTAVRVEFFGDEIDRISEIDVLTGEIKCQRSHI SIFPASHYVVPAEQIQRAAVAIEEELKERVEYFKSEDKLLEAQRISERTNFDIEMMKETG FCSGIENYSRHLSGLKPGQPPYTLLDYFGDDFLLIIDESHITVPQVGGMYAGDQSRKQTL VDYGFRLPSAKDNRPLSFEEFESKLDQVMFVSATPGQYEYDHELLRAEQIIRPTGLLDPK VEVRPIEGQIDDLIGEVNKEVSKKHKILITTLTKRMAEDLTEYMKEVGIRVRYLHSDIDT LERARIIRDMRLDVFDVLVGINLLREGLDIPEITLVAILDADKEGFLRSETSLIQTIGRA ARNSEGHVIMYADNMTDSMHKAITETNRRRTIQEAYNKEHGITPTTIKKAVRDLIAVSKA VAETEVRLQKDPESMTRKELTKLIAQVEKQMRAAAADLNFEQAAELRDKMIDLKKNLDIL DK >gi|226332885|gb|ACII01000134.1| GENE 45 60377 - 61417 1186 346 aa, chain + ## HITS:1 COG:CAC1280 KEGG:ns NR:ns ## COG: CAC1280 COG1420 # Protein_GI_number: 15894562 # Func_class: K Transcription # Function: Transcriptional regulator of heat shock gene # Organism: Clostridium acetobutylicum # 4 344 2 339 343 207 37.0 2e-53 MDQELDDRKKKILKAIIKTYMETGEPVGSRTISKFADLNVSSATIRNEMSDLTDMGYIVQ PHTSAGRIPSDKGYRLYVDELMKEKESEVQEIKDLMIEKTDKLDKMLKQVAKVLASNTNY ATMISVPQYSGSKIKFIQLSRVTPLQLVAVVVSDNNVIRNQIINLDEEMDDQTILKLNLL LNTNLNGVPIQDINLGMIARLKEQAGVHGSVVGTVLDAVADTIQVDEDDMQIYTSGATNI FKYPELADTSKASELISAFEEKQDLVDLVKERMSDTENTGIQVYIGDEMPVQTMKDCSVV TATYELGKGMHGTIGIIGPKRMDYANVVDSLKALKVQLDELIKKES >gi|226332885|gb|ACII01000134.1| GENE 46 61435 - 62079 801 214 aa, chain + ## HITS:1 COG:CAC1281 KEGG:ns NR:ns ## COG: CAC1281 COG0576 # Protein_GI_number: 15894563 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Clostridium acetobutylicum # 23 214 10 200 200 105 41.0 5e-23 MSEETKNTAQEEETTDTPVTEETVENEPEVVENGEAETEEIPVEDGDEEASKDDKKDSKS KTSFFGKKKKEKDKFEQQIEELTDRLKRNMAEFDNFRKRTEKEKSSMYIIGAKDIVEKML PVVDNFERGLAQAPEGDSFADGMKMIYKQLITTLDELGVKPIEAVGKEFDPNFHNAVMHV EDEEAGENIVVEEFQKGYTYKDFVVRHSMVKVAN >gi|226332885|gb|ACII01000134.1| GENE 47 62140 - 64017 2621 625 aa, chain + ## HITS:1 COG:CAC1282 KEGG:ns NR:ns ## COG: CAC1282 COG0443 # Protein_GI_number: 15894564 # 
Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Clostridium acetobutylicum # 1 623 1 610 615 726 67.0 0 MGKIIGIDLGTTNSCVAVMEGGQPTVIANTEGARTTPSVVAFTKTGERLVGEPAKRQAVT NAERTISSIKREMGTDYRVTIDDKKYSPQEISAMILQKLKKDAEGYLGEKVTEAVITVPA YFNDAQRQATKDAGKIAGLDVKRIINEPTAAALAYGLDNEKEQKIMVYDLGGGTFDVSVI EIGDGVIEVLSTAGNNRLGGDDFDQKVTDYMLAEFKKAEGVDLSNDKMALQRLKEAAEKA KKELSSATTTNINLPFITATAEGPKHFDMNLTRAKFDELTHDLVEMTAEPVRRALSDAGI TASELGQVLLVGGSSRIPAVQDKVRQLTGKEPSKNLNPDECVALGASIQGGKLAGDAGAG DILLLDVTPLSLSIETMGGIATRLIERNTTIPTRKSQIFSTAADNQTAVDINVVQGERQF AKDNKSLGQFRLDGIPPARRGVPQIEVTFDIDANGIVNVSAKDLGTGKEQHITITAGSNM SDSDIDKAVKEAAEFEAQDKKRKEAIDTRNNADSMVFQVENALKEAGDKIDANDKAAVEA DLNALKDLLSQTANAEEMTDSQVADIKAAQEKMMESAQKLFAKMYEQTQQAGGQAGPDMN MGGGAAGSTDAGNNDDVVDADYKEV >gi|226332885|gb|ACII01000134.1| GENE 48 64137 - 65324 1270 395 aa, chain + ## HITS:1 COG:YPO0469 KEGG:ns NR:ns ## COG: YPO0469 COG0484 # Protein_GI_number: 16120798 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Yersinia pestis # 4 395 3 376 379 337 48.0 2e-92 MADKRDYYEVLGVDKSADDATLKKAYRKLAKKYHPDVNPGDKEAEAKFKEATEAYTILSD PEKRKQYDQFGHAAFENGGGGAGGGFGGFDFNGADMGDIFGDIFGDLFGGGSRSRRANNG PMKGANLRARVNITFEEAVFGCEKELEIVLKDECTTCHGTGAKPGTSPVTCPKCHGEGQV VYTQQSMFGMVRNVQTCPDCHGSGKIIKDKCTSCRGTGYTSSRKKIQVSVPAGIDNGQSI RIREKGEPGINGGPRGDLLVEVNVARHPIFQRQDMNIFSTAPLTYAQAALGGTVRINTVD GEVEYEVKPGTQTDTRIRLKGKGVPSLRNKNVRGDHYVTFVVQVPTNLNEEAKEALRKFD EACGNRPKSKDSDDFEKPEKKKKSFMDKLKETFED >gi|226332885|gb|ACII01000134.1| GENE 49 65444 - 66169 452 241 aa, chain + ## HITS:1 COG:CAC1489 KEGG:ns NR:ns ## COG: CAC1489 COG0671 # Protein_GI_number: 15894768 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Clostridium acetobutylicum # 18 213 10 202 219 153 40.0 3e-37 MERYLTKTEKIAAFREKYLPVWVILPLISIFVLNCLIYWGSGVLTADRYHFDFTMSLDRA 
VPLIPQFIWIYILAFPFWAVNYILAAQRGKDGFYRFVATDLTVHITCFIIFLVLPTTNVR PEIAGTTWSQQMLKFVYLMDGGNSPSNLFPSIHCYVSWLSWRGVSKSENIPKWYQHFSLI FAILIIISTQVLKQHYLVDAIAGVALVELAWRFYSKGNRHEIFRHFFEKISEACSRWVNK N >gi|226332885|gb|ACII01000134.1| GENE 50 66287 - 67240 963 317 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145923|ref|ZP_04744524.1| ribosomal protein L11 methyltransferase [Roseburia intestinalis L1-82] # 1 316 1 326 327 375 55 1e-103 MKWNRFTVKTKTEAEDIVISTLAEVGIEGVEIQDKQPLTEEDKAQMFVDIMPEGPADDGV AYLNFYLEEDADKEAILKDVREALDDLKNFMDIGEATIEESQTEDKDWINNWKQYFHQFY VDDILIVPSWEEVKAEDKDKMILHIDPGTAFGTGMHETTQLVIRQLKKYVTPDTEMLDVG TGSGILGIVALKLGAKHVLGTDLDPCAVPAVAENKEANQIVDETFDMVIGNIIDDKAIQD QAGYEKYDIVTANILADVLIPLTPVIVNQMKKGAYYITSGILDVKEEVVVEAVKAAGLTV VEVTHQGEWVSVTARKD >gi|226332885|gb|ACII01000134.1| GENE 51 67245 - 67991 796 248 aa, chain + ## HITS:1 COG:CAC1285 KEGG:ns NR:ns ## COG: CAC1285 COG1385 # Protein_GI_number: 15894567 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 247 1 245 250 177 44.0 1e-44 MQRFFVEPYQIEKEAHRIHINGTDVNHIKNVLRMKCGEDVWISDGGDKEYHCQIEELGED EVLLHILYAQEPEYELPNKIYLFQGLPKADKMELIIQKAVELGAFSIIPVETKRCVVKLD VKKAAKKVVRWQQIAESAAKQSKRMLIPEIHEVMTYKQALEFAKQLDVKLIPYELAKGMK ETREILSEIKPGQSVGIFIGPEGGFEEEEVAKALEAGAHAITLGRRILRTETAGLAILSV LMFQLENE >gi|226332885|gb|ACII01000134.1| GENE 52 68013 - 69167 1122 384 aa, chain + ## HITS:1 COG:CAC2972 KEGG:ns NR:ns ## COG: CAC2972 COG1104 # Protein_GI_number: 15896225 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Clostridium acetobutylicum # 1 380 1 376 379 314 44.0 2e-85 MEAYFDNSATTRCFEEVKDIVVKTMMEDFGNPSAMHQKGVDAEGYVKESARTLAQILKVT EKEILFTSGGTESNNLALIGGALANKRSGNHIITTAVEHAAVSQPVAFLQEQGFEVTILP VDGHGVVKLDALEKALRPDTILVSTMMVNNETGAVMPVEKIGAMIQEKCPKALYHVDAIQ AFGKYRIYPKKWNIHLLSVSSHKIHGPKGVGFLYINSKAKVQPLILGGGQQNGMRSGTDN 
VPGIAGLGVAAKMMYQNFDEKAEHLYQLKERMAEGLSKMNDVVINGMPVREGAPHILSVS FLGIRSEVLLHTLEDRNIYVSAGSACSSHKRKPSATLSAMGMSNAQIENTVRISFSEENT FEEVDYCLEVLNEVLPMLKRYARR >gi|226332885|gb|ACII01000134.1| GENE 53 69208 - 70389 1277 393 aa, chain + ## HITS:1 COG:CAC2971 KEGG:ns NR:ns ## COG: CAC2971 COG0301 # Protein_GI_number: 15896224 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis ATP pyrophosphatase # Organism: Clostridium acetobutylicum # 6 392 5 381 384 339 46.0 6e-93 MIFHAFLIKYAEIAIKGKNRYIFEDALVKQMNIALSKVEGEFEVKKEQGRIYVFCPEQYD YDETVEALQHVFGIVGICPVVIYEDQGFEQMAKDVVSYMKNCHPNYDGSFKVYTRRAKKS YPITSMEVSAELGGRILDEFPDASVDVHEPDLTLSVEIRDKIYVYSQTIKGAGGMPIGTN GKAMLLLSGGIDSPVAGYMIAKRGVKIDAVYFHAPPYTSERAKQKVVDLAKQVAKYSGPI RLHVINFTDIQLYIYDQCPHDELTIIMRRYMMRIAEHFAKESRCLGLITGESIGQVASQT LQSLAATNEVCTLPVYRPVIGFDKQEIVERSWEIGTYETSIQPFEDCCTIFVAKHPVTKP NLNVIHRSETKLEEKIDELMKIALETVEIIEID >gi|226332885|gb|ACII01000134.1| GENE 54 70505 - 70741 366 78 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2300 NR:ns ## KEGG: Cphy_2300 # Name: not_defined # Def: phosphotransferase system, phosphocarrier protein HPr # Organism: C.phytofermentans # Pathway: not_defined # 1 78 1 77 77 104 76.0 1e-21 MKTVKISLNSIDKVKAFVNEISKFDCDFDLVSGRYVIDAKSIMGIFSLDLSKPIDLNIHA DGAALDTVMAIVGKYVTE >gi|226332885|gb|ACII01000134.1| GENE 55 71059 - 72390 462 443 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227372256|ref|ZP_03855738.1| SSU ribosomal protein S12P methylthiotransferase [Veillonella parvula DSM 2008] # 1 391 1 392 449 182 31 6e-45 MGKKVALHNLGCKVNAYEVEAMQQLLEEAGYEIVPFEPGADVYLINTCTVTNIADRKSRQ MLHKAKKMNPDAIVIAAGCYVQTDEGKLDKDEAVDLILGNNQKGNIVQVLEEYEQQHTKQ KHVLKINQTKEYEELAIDHTAEHVRAYIKVQDGCNQFCTYCIIPYARGRVRSRKIAHVMD EVHALAAKGYKEVVLTGIHLSSYGVDFPAEEKETLLSLIRAVHEVEGIERIRLGSLEPGI ITEEFVQSIAALPKMCPHFHLSLQSGCDTTLERMNRRYRSAEYAEKCGLLRKYLGNPALT TDVIVGFPMETEEDFQDSYDFVESIHFYETHIFKYSRRHGTKAAAMSGQLTEAQKAVRSD KMLALNEQHAKEYEKSMLGGELKVLLEEEVEINGEQWYIGHSMEYIKTAVKKQDNYSVND IITVKAEEFLEEHTLTGEVLGVE >gi|226332885|gb|ACII01000134.1| GENE 
56 72604 - 72876 344 90 aa, chain + ## HITS:1 COG:lin1538 KEGG:ns NR:ns ## COG: lin1538 COG4472 # Protein_GI_number: 16800606 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 19 85 18 84 90 85 62.0 2e-17 MAENNLGNTQYFRTKPDQELQVKEVLDLVYTAMDEKGYNPVNQIVGYIMSGDPTYITSHK GARSMIMKVERDELVEELLNEYIKNKSWER >gi|226332885|gb|ACII01000134.1| GENE 57 72878 - 73300 531 140 aa, chain + ## HITS:1 COG:lin1537 KEGG:ns NR:ns ## COG: lin1537 COG0816 # Protein_GI_number: 16800605 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) # Organism: Listeria innocua # 1 137 1 136 138 156 62.0 1e-38 MRVLGLDYGSKTVGVAISDPLGITAQGVETIWRKQENHLRQTLARIEELVSEYHVEKIVL GYPKNMNNTIGERAQKSLEFKEMLERRTGLEVIMWDERLTTVEANRTLMEASVRRENRKQ YLDQLAAVFILQGYLDSLSC >gi|226332885|gb|ACII01000134.1| GENE 58 73313 - 73570 432 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580720|ref|ZP_04857984.1| ## NR: gi|253580720|ref|ZP_04857984.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 85 1 85 85 105 100.0 7e-22 MEKIKFQLEDGTEAEFYVEEQTRIGGVAYLLVSDSQDDEATAYIFKDVSEDDSQEACYEM VEDEDEMQAVFKVFEQMLDDVDLEM >gi|226332885|gb|ACII01000134.1| GENE 59 73587 - 75482 2038 631 aa, chain + ## HITS:1 COG:CAC1683 KEGG:ns NR:ns ## COG: CAC1683 COG0595 # Protein_GI_number: 15894960 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Clostridium acetobutylicum # 80 630 2 554 555 677 59.0 0 MSNENNNQFNNEARENHDGQFGHGQKRPYNRPKNYKYKPQHRNSYGRKPMQDRKAPADKE KNGIQVPFKAAIKNSSRRPKKPTSKLKIIPLGGLEQIGMNITAFEYEDSIVVVDCGLAFP EDDMLGIDLVIPDVTYLKENISKVKGFVITHGHEDHIGALPYVLKDVNVPVYSTKLTIAL IANKLKEHNMTNTTKLKEVRHGQVINLGDFSIEFIKTNHSIQDASALAIYSPAGIVVHTG DFKVDYTPVFGDAIDLQRFAEIGRKGVLALMCDSTNAERTGFTMSERSVGHVFDNLFNEY KTNRIIIATFASNVDRVQQIINTAYRFGRKVAVEGRSMVNVITTAAELGYLRIPDQTLIE IDQVKNYPDEQVVLITTGSQGESMAALSRMAANIHKKIVIKPNDTIIFSSNPIPGNEKAV SKVINELSMKGAKVIFQDVHVSGHACQEEIKLIYSLVKPKYAIPVHGEYRHLTAQKNVVE ELGIPKENIFVLASGNILELDSQSAQVTGTVHTGAILVDGLGVGDVGNIVLRDRQHLAED GIMIVVVTLERHSNVILAGPDIVSRGFVYVRESEDLMDGAREVVENALDSCLERNITDWG KIKTVVKDALSEFLWKRTKRSPMILPIIMEA >gi|226332885|gb|ACII01000134.1| GENE 60 75484 - 75828 300 114 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00777 NR:ns ## KEGG: EUBELI_00777 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 6 111 4 109 113 63 33.0 2e-09 MEPNTIEESIKGLLEAIKESPEYLEFQKQSDILKKKPELKARVDTFRADNYKVQNECDSD NLSEVVEQMGKESAELRRHPEVNAYLDAELALCKMMQRICIKLGEGIDIDVPGM >gi|226332885|gb|ACII01000134.1| GENE 61 75918 - 76322 525 134 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1892 NR:ns ## KEGG: EUBREC_1892 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 19 129 10 124 136 75 33.0 5e-13 MSDTRERAGAAGVVVAGTFFKIVLYICVIVLLFWVGKASYQFGHDVFNQQAMSPGEGQEV TVVIKEDTSLYKIAKTLQKKGLVKSATVFYVQERLSTYHGKLQAGTYLLSTAYTPNRIMG ILAGDDEQEGVSDS >gi|226332885|gb|ACII01000134.1| GENE 62 76319 - 76963 686 214 aa, chain + ## HITS:1 
COG:SPy1391 KEGG:ns NR:ns ## COG: SPy1391 COG4122 # Protein_GI_number: 15675315 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Streptococcus pyogenes M1 GAS # 2 214 17 228 235 170 44.0 1e-42 MIVDERMRTYINSLDMGNTPFLEELEQYAIRERVPIIRREMQSFIKMFLAVNRPKRILEV GTAIGFSTLLMCEYGPEDLEIVTIENYEKRIPIAKENFRRAGREAQITLLEGDAGQILKE LEDSFDMIFMDAAKGQYINWLPDVMRLMKNGGVLISDNVLQEGDIIESHYLVERRNRTIY KRMREYLYELKHNPSLITSIIPLGDGVTVSVKQE >gi|226332885|gb|ACII01000134.1| GENE 63 76969 - 78192 1422 407 aa, chain + ## HITS:1 COG:CAC1687 KEGG:ns NR:ns ## COG: CAC1687 COG0826 # Protein_GI_number: 15894964 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Clostridium acetobutylicum # 1 386 1 384 406 447 55.0 1e-125 MRKIELLVPASSLEVLKIAVIFGADAVYIGGEAFGLRAKAKNFSHEDMKEGIAFAHEHGV KVYVTVNILAHNYDLPGVREYLTELKELKPDALIIADPGIYMYAKEICPEIERHISTQAN NTNYETYRFWYNLGAKRVVTARELSLAEIKEIREHIPEDMEIETFIHGAMCISYSGRCLL SNYLAGRDANQGACTHPCRWKYSIVEEKRPGEYMPVFENERGTYIFNSKDLCMIEHMDDI INSGIDSLKIEGRMKTALYVATVARTYRKAIDDYMESPEKYQANMPWYQEQISNCTYRQF TTGFFYGKPDENTQIYDNNTYQKEYTYLGFAEAVDERGYAQITQRNKFSVGETIEIMKPD GQNVAVTVKGIYDEEGNSMESAPHAQQKLFIDLGTEIQVYDLLRRAE >gi|226332885|gb|ACII01000134.1| GENE 64 78231 - 78839 392 202 aa, chain - ## HITS:1 COG:BH1285 KEGG:ns NR:ns ## COG: BH1285 COG1191 # Protein_GI_number: 15613848 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus halodurans # 1 194 31 223 235 201 57.0 8e-52 MSISQEKYYLQKYQEGDSQAKNILIEHNLRLVAHVVKKYQGTGEDTDDLISIGTIGLIKA VTTFNPQKASRLSTYAARCIENELLMYFRAKKKHSKEVSLYEPIGTDKEGNEINLLDVIE SAPVDIVEDCYIRENTDYLLRSLKKVLSDKEYQVICCRYGLFGMEEETQREIARKLCISR SYVSRIEKTALQKLRGLFPQTQ >gi|226332885|gb|ACII01000134.1| GENE 65 79018 - 79173 65 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580727|ref|ZP_04857991.1| ## NR: gi|253580727|ref|ZP_04857991.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 51 1 51 51 85 100.0 7e-16 MMNKDMIFDADGQAAQTELKDQRKGAVRYMIKTYVIDTNVLIQAPYAIKRL >gi|226332885|gb|ACII01000134.1| GENE 66 79193 - 79849 619 218 aa, chain + ## HITS:1 COG:MJ1020 KEGG:ns NR:ns ## COG: MJ1020 COG3294 # Protein_GI_number: 15669209 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanococcus jannaschii # 1 169 19 209 259 67 27.0 2e-11 MKYKEIANNKEINAYLQKGNANLGQLGFTDHSKAHCVQVAHQAGKILKRLGYSEHEIELA KIAGYMHDIGNAINRTHHAEYGALLANDLLKETDMSLEDRITVIAAIGNHDESTGSPEDV VSAALIIADKTDVRRSRVRQKEQSAYDIHDRVNYAVTDAKLKIAEDKSVIALNLQIDEKI CSMYDYFEIFLERMMLCRKAAEILGTTFKLTVNGRKVL >gi|226332885|gb|ACII01000134.1| GENE 67 80098 - 81078 874 326 aa, chain + ## HITS:1 COG:CAC3454 KEGG:ns NR:ns ## COG: CAC3454 COG0042 # Protein_GI_number: 15896694 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Clostridium acetobutylicum # 1 311 1 311 311 342 53.0 5e-94 MKIYLAPLEGITGYTYRRALYQCFGGFDKYFIPFILPNQKGHLSTREKKDIMPENNEGMY AVPQILTKNAEDFIQTAETLQEYGYNEVNLNLGCPSKTVVTKGRGAGFLDRPDELDKFLD EIFRKCDMKISIKTRLGMDDPEEFEDLLTIYNKYPLEELIVHARVQKDYYKNTPRLETFG EAVERAKSPVCYNGDIVTAKDCTRLQDMFPTLDCIMTGRGTLKNPALAREIRGGAPASKE EIRRFHDMMYNEYCEDLSGDRNILFRMKELWSYLAPMFTNNKKYAKKIKKAEKCVVYENA VRELFGCEQLIGEVEDGDMEKNTGLL >gi|226332885|gb|ACII01000134.1| GENE 68 81180 - 82280 586 366 aa, chain + ## HITS:1 COG:SA0428 KEGG:ns NR:ns ## COG: SA0428 COG5438 # Protein_GI_number: 15926147 # Func_class: S Function unknown # Function: Predicted multitransmembrane protein # Organism: Staphylococcus aureus N315 # 16 362 22 370 370 166 33.0 1e-40 MKSHVFQMKYFLAGIYIICVIFTCFNARFYHTPLAKISSVTEKETMSRKGTRDAEEYYYT QKITARILNGTHKGEEIAISNEYTSSQVQNQKYHKGDTVLLSGSRENPGKSIRSIKRDGY LALLAGGLFFLLICITGKQGFYTIISLILNTIIFAFGFQAFMKGENILNICNVIAFLFSI TTLICLNGIHRKTWASVISTICVLFLIMALFEFSIQFFGDLDYSNLEYLGSMSNSADIFW TDIMLTGLGAIMDVAVTISAATGEIVRKNPDVSLRKLIHSGREIGYDIMGTMINVLLFVL ASGMIPMFILKMNNEIRFITIIRYHIPYDICRFLIESIGIVLAIPVSVFISSIIMKLPSL KRSDKK 
>gi|226332885|gb|ACII01000134.1| GENE 69 82277 - 83050 582 257 aa, chain + ## HITS:1 COG:SA0427 KEGG:ns NR:ns ## COG: SA0427 COG5438 # Protein_GI_number: 15926146 # Func_class: S Function unknown # Function: Predicted multitransmembrane protein # Organism: Staphylococcus aureus N315 # 17 239 19 238 260 105 34.0 8e-23 MIIFLALVLIILMILIGGERGAVSLMTLAGNIVILFISILLMSIGFPALLLILAAGAGVC YNSLFYQNGNNPKTRAAFLATLLVMLLLFIPIYLIIWRSGSYGLNELQLSEDDFMYYYST DIHINMLHVAVFVTVFSTLGAVIDTALSVTSSVYEVWTHKNSLTAKELTSSGYQVGKEII GTTVNTLLFAYLGGSILLFAYVQTQQYNLEIILNSKFLFQDVAIMLSGAIACLFTVPVSI RCVIRQIHHAATETDHS >gi|226332885|gb|ACII01000134.1| GENE 70 83533 - 83802 218 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580733|ref|ZP_04857997.1| ## NR: gi|253580733|ref|ZP_04857997.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 89 7 95 95 178 100.0 1e-43 MLLLTKKLNVMASFYPMYDFAVKVGGDCLMLIWQHFGKKEVYYIISAGLGEDIIERSLGD MDGFVEHLEEYRIAGKNYRTHISLWDTLE >gi|226332885|gb|ACII01000134.1| GENE 71 84183 - 85505 649 440 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 2 438 23 456 470 194 31.0 2e-49 MLKRKVTHYIKKWVDSKNKKCLVIQGARQTGKTYIVERFAEENYEEIVEINFKQIPSAMD IFSGDLTVDNMVMAMRFRFPEKKIIPGKTMIFLDEIQECQEAITSLKFWALDNRFDVIAS GSLLGIDYKRASSYPVGYVDYLKMYGIDFEEFLWGMGISEDMIMNLCGYLGSKAAIPEAI HSQMMKYFRQYVAIGGMPEAVQKYIDTKDFREVDRIQRSLLQGYQYDIAHYATAEEKVKA EKCYLSLAKQLLDKENHKFQYKEVEHGGRAQKYFSSIEWLLRADMLYLCKLVTDIRFDLD DYARDDFFRAYTTDLSLLMAMKDFSLKQHIIENTLAGNSKGGIFECAVADALYKKGYQLY FYKNETTKKEIDVIIQKDGKVVPIEVKSGNTRANSLKSIMKNNKDISYGYKFIDGNVGTS EEGIITLPLYMIAFFDKVSE >gi|226332885|gb|ACII01000134.1| GENE 72 85946 - 86155 76 69 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00178 NR:ns ## KEGG: EUBELI_00178 # Name: not_defined # Def: MerR family transcriptional regulator, mercuric resistance operon regulatory protein # Organism: E.eligens # Pathway: 
not_defined # 10 68 25 83 145 68 52.0 5e-11 MILPLQAIVGFFPNPERKGNIHYFSDNELEVLRIIECLKKSGLGIKDIKQFFIWVSEGNS SYEKKKRII >gi|226332885|gb|ACII01000134.1| GENE 73 86317 - 87570 852 417 aa, chain + ## HITS:1 COG:FN1715 KEGG:ns NR:ns ## COG: FN1715 COG1373 # Protein_GI_number: 19705036 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 416 1 429 430 394 51.0 1e-109 MEIKRDAYLQQLIERKDNGMIKVITGIRRCGKSFLLFTIFKRYLLENGIGSDHIIEIALD GIENEELRDPKMCFKYIKDAVKDDGKYYLLLDEVQFMPRFEEVLNSLLRMSNIDVYVTGS NSRFLSSDIVTEFRGRGDEIRIYPLSFAEFYSVYDGDYDDAWDDYMIYGGLPQVVGFQSE RQKADYLKNIFANVYLKDVIERNKIQNVDEIGILVDVLASAIGAPTNPSKIANTFASQRQ MTYTNKTISNHIDYLADAFLISKASRYDIKGRKYIGANLKYYFTDLGLRNARLNFRQQEP THIMENIVYNELLIRGYNVDVGVVQIFDKDKEGKRIRKQLEVDFVVNQGNQRYYIQVAYD MTSEEKQTQEFNSLRNIPDSFKKMVIVNGSKKPWRNDEGFVIMGMKYFLLNADSLEF >gi|226332885|gb|ACII01000134.1| GENE 74 87707 - 88072 209 121 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3110 NR:ns ## KEGG: EUBREC_3110 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 114 1 114 116 139 58.0 2e-32 MKTCCLVWEIPVIEGEHYNPIINVIYMQKAKKFLNQLNRYFQDENMDWKCVLDRSACTYN EIFFSENQAVIFAPEAKTRQWLYKKEIQAVSIPKYYLDFAEYNEGKLDKIAIFFMNLKNN V >gi|226332885|gb|ACII01000134.1| GENE 75 88455 - 88991 473 178 aa, chain + ## HITS:1 COG:L182026 KEGG:ns NR:ns ## COG: L182026 COG0454 # Protein_GI_number: 15672571 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Lactococcus lactis # 37 176 34 176 177 73 32.0 1e-13 MKNLRLKKVSIKDLDQIRALYWRLLDSSPKYGQILQWKKNIYPNDDDWNSYIVKGEMYLI LKDTDVIGAVAVTNAQSKEYRKIHWKVKADDQDAAVVHLLMILPEYQGDGAATVALDEII KLAADKKKKAVRLDAIGTNVPAQKLYEKYGFVNCGTAQEYYESTGETEFVFYEYVLVE >gi|226332885|gb|ACII01000134.1| GENE 76 89116 - 89286 154 56 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3364 NR:ns ## KEGG: EUBREC_3364 # Name: not_defined # Def: hypothetical protein # 
Organism: E.rectale # Pathway: not_defined # 1 56 304 359 359 105 92.0 5e-22 MGLLICKDMDKVETQYALESTSQPLGISGYELSKLIPEEFRGSMPTIEEIEVELNE >gi|226332885|gb|ACII01000134.1| GENE 77 89356 - 90741 914 461 aa, chain + ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 14 443 10 439 445 192 28.0 2e-48 MCENEKKKYHSIEMTTGSLWKNILLFSLPLMGSQVLEVLFNLSDVAVVGRFADYMALGAV GSTTLLVTLFTGILIGMGAGVNVRVAHELGAGNKKGTEETIHTSLLLCAITGLLVCAICL LFSGRMLSMMNTKPELIDQAILYMKIYSLGMPAMAIYNFGNGVLSAIGDTKRPLIYLSIA GVVNVLLNLFFVIVCHMAAAGVATASAISLYISASLVMIHLLRRTDDCKVSLRKICLHPK ACKAVLMLGIPTGLQNGIFAIANLFVQTGVNSFDAVMVSGNAAAANADSLIYNVMFSFYT ACASFIGQNWGAGNKKRMLKSYRVSLTYAFIFGAILGGLLLVFGPQFLSLFATEPTVIDA GMQRIRIMAFSYAFSCFMDGSIAASRGIGKSIPPTIIVILGSCVFRIVWIYTVFAHFQTI SSLYLLYIFSWGITGLAETLFFVVSFNIIYSKNSPSSIEGH >gi|226332885|gb|ACII01000134.1| GENE 78 90954 - 91361 251 135 aa, chain + ## HITS:1 COG:alr7149 KEGG:ns NR:ns ## COG: alr7149 COG1943 # Protein_GI_number: 17233165 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 6 134 4 134 134 91 38.0 4e-19 MEKIDHNAHSVYLMYYHLIMVVKYRRKVINDPISERAKEIWEYIAPRYGIVLEEWNHDID HVHVMFRAQPKTELSKFINAYKSASSRLLKKEYPEIREKLWKEAFWSQSFCLLTAGGAPV EVIRQYIETQGEKKH >gi|226332885|gb|ACII01000134.1| GENE 79 91358 - 91543 77 61 aa, chain + ## HITS:1 COG:no KEGG:Acfer_1166 NR:ns ## KEGG: Acfer_1166 # Name: not_defined # Def: transposase, IS605 OrfB family # Organism: A.fermentans # Pathway: not_defined # 1 48 1 48 369 65 60.0 4e-10 MNIAYRFRIYPTEEQKILLGKTFGCCRFLYNQMLNDKIREYKKTKKLLKIHQPCIKRSIH F >gi|226332885|gb|ACII01000134.1| GENE 80 91516 - 92460 706 314 aa, chain + ## HITS:1 COG:all7245 KEGG:ns NR:ns ## COG: all7245 COG0675 # Protein_GI_number: 17233261 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. PCC 7120 # 1 313 59 372 407 194 38.0 3e-49 MYKKEYSFLKEVDSLALANVQLHLEKAYKNFFRDPKVGFPRFKSKHHSKNSYTTNVVNGN ILVEDKRIRLPKLKWISMKKHREPAENCCLKSVTVSMEPSGKYFASLLYEGYSCENQAAD EDYSNAKILGIDYAMQGMAVFSEEIELEKAGFFRRNEKRLAREQRKLSRCVKESRNYVRQ KKKVARCHEKIRNQRKDYLHKLSRRITDQYDIVAVEDIDMKAMGQCLHFGKSVQDNGYGM FRNMLDYKLAWKGKKLVKIDRFFPSSKKCSKCGKIKKELGLSERVYRCTCGNEMDRDRNA AINIREEARRMLAV >gi|226332885|gb|ACII01000134.1| GENE 81 92843 - 93763 489 306 aa, chain + ## HITS:1 COG:sll0784 KEGG:ns NR:ns ## COG: sll0784 COG0388 # Protein_GI_number: 16331918 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Synechocystis # 1 306 6 308 346 160 33.0 2e-39 MKDIKTTCKIALVQAEPVMFSKSASLKKALQYICEAASQKPDLIVFPELFIPGYPVGMNF GFSMGKRSDEGRKDWKRYYDASVVAGDSEFQQLAEAAKKAEAYISLGFSERDAVSGTLYN SNVIFEPNGSYKVHRKLKPTGSERVVWGDANKDYFPVTETPWGPIGSMICWESYMPLARV ALYQKGITIYISPNTNDNPEWQATIQHIAIEGKCFFVNADMIIRRSSYPSDLCERNIVSE LPEFVCRGGSCIIDPYGHYLTKPVWDEETIIYAELDMSLPAACKMEHDAIGHYARPDILE LKVNEK >gi|226332885|gb|ACII01000134.1| GENE 82 93983 - 94675 300 230 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580746|ref|ZP_04858010.1| ## NR: gi|253580746|ref|ZP_04858010.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 230 1 230 230 367 100.0 1e-100 MSDKDKKTLALPARIIIIFLICTADILVAAYFIYTWKQNGTVVVVRNASDNIFGIFRQSL VVSMVAIIVILIFFMILRKKFISEMYLKISGKKQKTTVLILTAILGAVTVFCLITKADKI TILYNLLYYTVFIAMEEELLVRGICVYLLKDENNYIRYLVPNILFAAMHLFSYTNWGKIT LVYVFSFITSQMLGLILTGCIFQYLKEKSGTIWIPVLVHAILDYMVVLGY Prediction of potential genes in microbial genomes Time: Sat May 28 20:48:49 2011 Seq name: gi|226332884|gb|ACII01000135.1| Ruminococcus sp. 5_1_39B_FAA cont1.135, whole genome shotgun sequence Length of sequence - 7873 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 148 - 207 6.5 1 1 Op 1 40/0.000 + CDS 255 - 929 364 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 2 1 Op 2 4/0.000 + CDS 934 - 1839 316 ## COG0642 Signal transduction histidine kinase 3 1 Op 3 36/0.000 + CDS 1913 - 2623 358 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 4 1 Op 4 . + CDS 2617 - 5277 1633 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 5373 - 5413 5.9 5 2 Op 1 . - CDS 5406 - 6368 431 ## EUBREC_3500 hypothetical protein 6 2 Op 2 . 
- CDS 6088 - 7809 542 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member Predicted protein(s) >gi|226332884|gb|ACII01000135.1| GENE 1 255 - 929 364 224 aa, chain + ## HITS:1 COG:CAC0450 KEGG:ns NR:ns ## COG: CAC0450 COG0745 # Protein_GI_number: 15893741 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 5 222 3 222 227 159 39.0 3e-39 MEKYKILIVEDDVDLREGLSFSFSGDGYDVTETGTKKDGLQEIANGGYDIVFLDCNLPDG TGFELCKEVRSYSNIPIIMLTARDTEMDEIKALELGVNDYLSKPFSLGVLKARMKRILQE KSESTKIVTGSLSIDQSTCKVYKRGSEISLSKLEYRLLVYLIENKNHILSKEQILEKIWD SDGKYVDNNTVSVNISRLRTKIEDDVSKPKWIQTVHGIGYIWKE >gi|226332884|gb|ACII01000135.1| GENE 2 934 - 1839 316 301 aa, chain + ## HITS:1 COG:CAC0451 KEGG:ns NR:ns ## COG: CAC0451 COG0642 # Protein_GI_number: 15893742 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 45 294 158 408 416 141 29.0 1e-33 MWITILLAVLLCLSIGYSIYQKRKTYRLIDRLLDSVLNREMIAASDVEEGEYSALVSKIK QIQEVLDNHVNSAEQEKEQVKSLVSNMSHQLKTPLANISLYAEILSKEEITSERKTLFSE KMQRQIDKLSWIVESLSKMVKLEQNIDGFEGKAIGIKQTILDAVDTLYEKLEKKEINFTL EPFEDRLLYHNRKWTAEVFVNLLENAVKYTDRGGTISICVKSYDLYTEIQIADNGRGIRQ EEMTDIFKRFYRSPEVENMEGSGIGLYLSNLILEKEKGYMTVDSEYGNGSCFSVFLQNCK N >gi|226332884|gb|ACII01000135.1| GENE 3 1913 - 2623 358 236 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 3 202 4 200 223 142 39 8e-34 MSILELRNVEKIYGEKDNQVKALRNINLKVEEGEFVAIVGTSGSGKSTCLNLIGGLDNPT SGQIFIKNKEIGSLSRKELTIFRRRNIGFVFQNYSLMPVLNVYDNVALPVTFDCGKHINR KYIEELLRELGIWEKRKKYPNELSGGQQQRVAIARALANKPSILLCDEPTGNLDSATTIE VMGLLKTSCRKYNQTILMVTHNEAIAQTCDRVIHIEDGQIVTGKRRFPAKGGDVVC >gi|226332884|gb|ACII01000135.1| GENE 4 2617 - 5277 1633 886 aa, chain + ## HITS:1 COG:CAC0454 
KEGG:ns NR:ns ## COG: CAC0454 COG0577 # Protein_GI_number: 15893745 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 4 885 7 829 832 85 19.0 3e-16 MLKLAWKYMRYYKSQTFAIFASILLTASLLSGISSLIYSSQKSDLANSKTIYGDWHYYIE TDKELFNSVESGEKGKGYTLEQCGKMEIRDVVAEEFMICFIHADETYRQMAHRDLLEGTF PEKENEIAADGFVLSNLGFSGNLGDSLRIGEKEYIVTGVLKSEWAASSSEMEVFVSDSFK GRGSQTFLYLSFNEDEKLYKQLDAFLQEHKLSSESVAENDEVIQYLGGEAPSSISEIVKF GLFEEEGNFTYIVLKLQSDYNLAYNGMIFLLCLFSLFVVYSVFSISVSKRTSEYGILQTL GISEAQIGGTLLLELWTLFFIGYPLGCLLGNGVLSLMYQKFSGVFGGKGISGAGNGLSVA DHTLAEGENTVEFFVSKEAVVFGFVFLLISLALVAFLVVRSMRKDSLKTVMSGDTSFVKR RKIYAMRNVNMANVVVRKFMFANKRKVIGILLSLSIGGCIFLCTTYMVENLKVHAELSLI SDDGLGSEYRISLKSDSLKDTIPEAVADKIKNMPETENVYATKYTMGELQISKNEFLSEE EWSDYFKYQNQEPYYIQRYDGICKQQEDGNYRIKYNVYGYDEAMIEQLRDFILEGEIQPE TIKNNNQVIVTANMDGQGNYFFYGKKPGDTITLRVPKTENYTDELLKFQSGEENYIEKEF EIAAIVNRPLAQEKDFLNTEVWKNAQSVIMTGEQMEENFGITDYSFINASPADSADAKIV SNQLLQVIRDVPKAVLQDYTTAIATQKGYLGQQQIFFSSIAVILLIISLFHIVNSMSHTI LTRRREYGIIRAIGITDGGFYKMILQTGILYGLLADVFIYLIYNRVLRRVMNYYLAHVLQ FLHYTTNIPNLVLNGIMVLNVVIAVVAVMFPAWKMGKENIISEIRR >gi|226332884|gb|ACII01000135.1| GENE 5 5406 - 6368 431 320 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3500 NR:ns ## KEGG: EUBREC_3500 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 87 312 1 226 236 249 57.0 7e-65 MNESRESKNAEQLEELLLLTEEIQEELEQKDQLIEELKTQLSESLTLNEKLNKENKAGNI QALKNDLKLADRALQIEKEKLRSANVMIGECQDKIRCLTQQRDYARTHQKIVEIPVEKPV LYEKCEACDRTDYQNAKAKYETQKERLAGQYKAKTVMFQTTLFLLAWYSLTTTLFQAVQS DVLLSDCKSFFHDAASFIQTFIGWTIDAGQSVAQISTKIPNAFIAGIIYWLLLILVVGIY VAGTGILAILIEIKIIELYKKNCWDVITLLMILTSAVVIIYFGESIKKVMPINLLFLLLF IQGIYLGIRCYLNTLYWTNR >gi|226332884|gb|ACII01000135.1| GENE 6 6088 - 7809 542 573 aa, chain - ## HITS:1 COG:AGpT237 KEGG:ns NR:ns ## COG: AGpT237 COG0507 # Protein_GI_number: 16119945 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse 
(exonuclease V), alpha subunit - helicase superfamily I member # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 307 18 260 1117 108 28.0 3e-23 MAIFHYTVKIVGRSKGKSVISASAYLNGDVMKNEETGRISYYTSKKEVVYTRLMMCENAP PEWQIVPEENIKRFQKSVRYKRSEDKEAALEKFKITFQKQRLWNEVLKIEKNADAQLGRS FEFALPKEWNRQEQIQYTTDYIQKTFVDRGMCADWSIHDKGDGNPHVHLLLTMRPFNPDH SWGKKEVKDWDFLRDKNGNIVIDESHPNWWQDKKNPDRHGIRIPVLDENGIQKMGARNRL QWKRVLTDATGWNNPKNCELWRSEWAKVCNEHLPLHNQVDHRSYEKQGKLQIPTIHEGAD ARKIEQKFLAGQEIKGSWKVAENQIIKQQNTLLQKILDTFGKVSGALSLWKERLNDIRRK PGNYTLNGIHDWANRRTADLNGRNDSGNAEPGHPTLSYAGTESEIAKIKQRVIRAAQHFA KYRGTAFQDGRTENEDRTFGKRKSAMAEIGTDSEQRKQFIAETEHRIAELEQQIRKGRDI DERIQRIKERRTVGRTSALDRGDTRRTGTERPAYRGTEDAAQRISDLEREIKQREQSREY SSIKERLEAGRQSIADREREAAKRKRHDRGMSR Prediction of potential genes in microbial genomes Time: Sat May 28 20:48:54 2011 Seq name: gi|226332883|gb|ACII01000136.1| Ruminococcus sp. 5_1_39B_FAA cont1.136, whole genome shotgun sequence Length of sequence - 1436 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 7 - 321 355 ## EUBREC_3497 hypothetical protein 2 1 Op 2 . - CDS 462 - 692 328 ## EUBREC_3496 hypothetical protein - Prom 731 - 790 9.8 + Prom 781 - 840 10.6 3 2 Tu 1 . 
+ CDS 864 - 1434 516 ## EUBREC_3495 hypothetical protein Predicted protein(s) >gi|226332883|gb|ACII01000136.1| GENE 1 7 - 321 355 104 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3497 NR:ns ## KEGG: EUBREC_3497 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 104 33 136 136 123 81.0 2e-27 MAKSTKSYEARMLEMEKKEQESLEKAKRYAAQKKELLKRKKAEESKKRTHRLCQIGGAVE SVLGAPIEEEDIPKLIGFLKRQEANGKFFSKAMQKEVNTDMEEV >gi|226332883|gb|ACII01000136.1| GENE 2 462 - 692 328 76 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3496 NR:ns ## KEGG: EUBREC_3496 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 75 1 75 76 118 92.0 8e-26 MTQRKIALSIEEAADYTGIGRNTLRQLVEWKKLPVLKVGRKVLIKTDILEMFMEANEGRD LRDRGNVKAVTRTAAN >gi|226332883|gb|ACII01000136.1| GENE 3 864 - 1434 516 190 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3495 NR:ns ## KEGG: EUBREC_3495 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 190 1 190 209 312 88.0 4e-84 MKDKELRKLIGSRIKQRRLELNLTQPYVAEKMGVTASTILRYENGSIDNTKKMVLEGLSE ALHVSVEWLKGETDEYETDITDKRELQIRDAMGDILEQLPLALTKEEDAFSKDLLLLMLK QYGLFLDSFQFACKNFKGNAGQTDIAKTIGFESNEEYNEIMFLREITHTINAFNEMADIV RLYSKKPKTA Prediction of potential genes in microbial genomes Time: Sat May 28 20:49:02 2011 Seq name: gi|226332882|gb|ACII01000137.1| Ruminococcus sp. 5_1_39B_FAA cont1.137, whole genome shotgun sequence Length of sequence - 2386 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1039 354 ## COG0582 Integrase + Term 1185 - 1228 4.3 + Prom 1086 - 1145 8.2 2 2 Op 1 . + CDS 1284 - 1751 254 ## gi|253580757|ref|ZP_04858021.1| conserved hypothetical protein 3 2 Op 2 . 
+ CDS 1786 - 2386 357 ## COG1073 Hydrolases of the alpha/beta superfamily Predicted protein(s) >gi|226332882|gb|ACII01000137.1| GENE 1 2 - 1039 354 345 aa, chain + ## HITS:1 COG:SPy0937 KEGG:ns NR:ns ## COG: SPy0937 COG0582 # Protein_GI_number: 15674956 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pyogenes M1 GAS # 8 341 92 379 379 62 21.0 2e-09 VENYLGTIRNIKKHPLAERKLKNVTSEHLQSFFDLLSFGGVHPDGKERKGYSKDYIHSFS AVMQQSFRFAVFPKQYITFNPMQYIKLRYQTDEVDLFSDEDMDGNVQPISREDYERLLAY LQKKNPAAILPIQIAYYAGLRIGEACGLAWQDVNLEEQCLTIRRSIRYDGSKRKYIIGPT KRKKVRVVDFGDTLVEIFRNARKEQLKNRMQYEELYHTNYYKEVKEKNRVYYEYYCLDRT QEVPADYKEISFVCLRPDGCLELPTTLGTVCRKVAKALEGFEGFHFHQLRHTYTSNLLAN GAAPKDVQELLGHSDVSTTMNVYAHSTRDAKRKSVRLLDKVVGND >gi|226332882|gb|ACII01000137.1| GENE 2 1284 - 1751 254 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580757|ref|ZP_04858021.1| ## NR: gi|253580757|ref|ZP_04858021.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 155 1 155 155 311 100.0 6e-84 MPQMNKGGKFIFGKSLIRTDGTLRIPPQAMEEYHIADEGKVYLFTGSKITGGFCVTRKGL LHPSKLGHILDDTPPLLDYSAGSGEFIKYKGRSYCWVEISQEGQILLTKKMMDFLKVRPG MELLSIRSSDIAFTMGAKGPLLEKAENYDGEIPLF >gi|226332882|gb|ACII01000137.1| GENE 3 1786 - 2386 357 200 aa, chain + ## HITS:1 COG:FN0852 KEGG:ns NR:ns ## COG: FN0852 COG1073 # Protein_GI_number: 19704187 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Fusobacterium nucleatum # 1 195 1 195 199 244 60.0 6e-65 MDKAVIYIHGKGGNAEEAIHYKPLFSNCDVIGLDYTAQFPWEAKEEFPLLFNSIYRNYKT VEVIANSIGAYFAINALSNQQIEKAYFISPVVDMERLIADMMIWANVTEDELKEKKEIQT TFGETLSWDYLCYARENPIIWEIPTHILYGEKDNFTAYGTIFEFVQRTNSTLSIMKNGEH WFHTDEQMKFLDEWIIKSSK Prediction of potential genes in microbial genomes Time: Sat May 28 20:49:10 2011 Seq name: gi|226332881|gb|ACII01000138.1| Ruminococcus sp. 
5_1_39B_FAA cont1.138, whole genome shotgun sequence Length of sequence - 1424 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 129 - 188 4.6 1 1 Tu 1 . + CDS 287 - 460 97 ## gi|253580760|ref|ZP_04858023.1| conserved hypothetical protein + Term 645 - 689 5.1 2 2 Tu 1 . - CDS 739 - 1422 625 ## gi|253580761|ref|ZP_04858024.1| predicted protein Predicted protein(s) >gi|226332881|gb|ACII01000138.1| GENE 1 287 - 460 97 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580760|ref|ZP_04858023.1| ## NR: gi|253580760|ref|ZP_04858023.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 57 23 79 79 112 100.0 5e-24 MILTAIWHILTDLKPYTPEGFLDSRPVNKEKVLTTSQALNLLKQRGYFIKDDPLSVS >gi|226332881|gb|ACII01000138.1| GENE 2 739 - 1422 625 227 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580761|ref|ZP_04858024.1| ## NR: gi|253580761|ref|ZP_04858024.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 42 227 1 186 186 332 99.0 9e-90 EQVATDIFKVSITGGIELSYIISQNCPHIAPPYPEGATIKDVKVQGSTIKAVAYDCANSE GFDAVLTKKSWGDKPENYVKVAKNQDAKTITFTNVKNGTYYLGIHAYNRDYYDGKVSGKR FGAWSYPVKVVVKNGIPTERVKIKTVKTGKGAVAVTVGVPKGFARADMELRSTGGTTVYK RNNRTTYTRITGVKPGTYTLRVRPWAKTNGRKAYGDWVSWGRRIRVK Prediction of potential genes in microbial genomes Time: Sat May 28 20:49:32 2011 Seq name: gi|226332880|gb|ACII01000139.1| Ruminococcus sp. 5_1_39B_FAA cont1.139, whole genome shotgun sequence Length of sequence - 25082 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 14, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 265 - 312 8.3 1 1 Op 1 . - CDS 347 - 1357 1298 ## EUBELI_01242 hypothetical protein 2 1 Op 2 . - CDS 1376 - 2068 897 ## COG0822 NifU homolog involved in Fe-S cluster formation - Prom 2237 - 2296 5.0 + Prom 2201 - 2260 10.1 3 2 Tu 1 . 
+ CDS 2321 - 3061 611 ## EUBREC_1055 hypothetical protein + Term 3113 - 3169 4.2 - Term 3094 - 3165 8.2 4 3 Op 1 . - CDS 3169 - 3840 830 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 5 3 Op 2 . - CDS 3903 - 6083 1912 ## COG0550 Topoisomerase IA 6 3 Op 3 . - CDS 6085 - 6840 791 ## COG2116 Formate/nitrite family of transporters - Prom 6867 - 6926 6.0 - Term 6879 - 6918 -0.9 7 4 Tu 1 . - CDS 7090 - 8139 1398 ## COG0709 Selenophosphate synthase - Prom 8289 - 8348 5.4 - Term 8216 - 8264 2.1 8 5 Op 1 . - CDS 8374 - 8610 238 ## Cphy_1490 hypothetical protein - Term 8622 - 8655 3.6 9 5 Op 2 . - CDS 8675 - 9760 1560 ## COG0136 Aspartate-semialdehyde dehydrogenase - Prom 9825 - 9884 2.9 10 6 Tu 1 . - CDS 9898 - 11277 1651 ## COG0165 Argininosuccinate lyase - Prom 11323 - 11382 4.2 11 7 Op 1 . - CDS 11643 - 12269 586 ## COG1636 Uncharacterized protein conserved in bacteria 12 7 Op 2 14/0.000 - CDS 12342 - 13256 661 ## COG0688 Phosphatidylserine decarboxylase - Term 13268 - 13320 6.4 13 7 Op 3 . - CDS 13340 - 13960 639 ## COG1183 Phosphatidylserine synthase 14 7 Op 4 . - CDS 13972 - 15003 721 ## COG0392 Predicted integral membrane protein - Prom 15179 - 15238 6.3 + Prom 14962 - 15021 7.5 15 8 Op 1 24/0.000 + CDS 15267 - 16034 203 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 16 8 Op 2 . + CDS 16027 - 16818 579 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 17 9 Tu 1 . - CDS 17064 - 18299 1350 ## COG2873 O-acetylhomoserine sulfhydrylase - Prom 18409 - 18468 11.9 18 10 Tu 1 . + CDS 18759 - 20459 1597 ## gi|253580779|ref|ZP_04858042.1| conserved hypothetical protein + Term 20654 - 20689 6.7 - Term 20396 - 20447 1.4 19 11 Tu 1 . - CDS 20483 - 21820 1317 ## COG0534 Na+-driven multidrug efflux pump - Prom 21878 - 21937 4.1 - Term 21923 - 21955 2.0 20 12 Tu 1 . 
- CDS 21971 - 22711 829 ## gi|253580781|ref|ZP_04858044.1| predicted protein - Prom 22807 - 22866 3.5 21 13 Tu 1 . - CDS 22943 - 23965 405 ## Fisuc_2661 hypothetical protein - Term 24052 - 24096 -0.9 22 14 Tu 1 . - CDS 24113 - 24868 585 ## STH615 hypothetical protein - Prom 24925 - 24984 6.2 Predicted protein(s) >gi|226332880|gb|ACII01000139.1| GENE 1 347 - 1357 1298 336 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01242 NR:ns ## KEGG: EUBELI_01242 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 336 1 336 336 558 90.0 1e-157 MALFESYERRIDKINSVLNSYGIASIEEAEKITKDAGLNVYDQIKGIQPICFENACWAYT VGAAIAIKKGCKKAADAAAAIGEGLQAFCIPGSVADTRKVGLGHGNLGKMLLEEETECFC FLAGHESFAAAEGAIGIAEKANKVRQKPLRVILNGLGKDAAQIISRINGFTYVETEYDPY KNEVKEVFRKAYSEGLRAKVNCYGANSVPEGVAIMWKEDVDVSITGNSTNPTRFQHPVAG TYKKERLEAGKKYFSVASGGGTGRTLHPDNMAAGPASYGMTDTLGRMHSDAQFAGSSSVP AHVEMMGLIGAGNNPMVGMTVAVAVSIEEAAKAGKF >gi|226332880|gb|ACII01000139.1| GENE 2 1376 - 2068 897 230 aa, chain - ## HITS:1 COG:CAC2565 KEGG:ns NR:ns ## COG: CAC2565 COG0822 # Protein_GI_number: 15895825 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Clostridium acetobutylicum # 1 230 1 230 230 370 81.0 1e-103 MIYSREVEEMCPVAQGVHHGAAPIPEEAKWVQAKEVKDISGLTHGVGWCAPQQGACKLTL NVKEGIIQEALVETIGCSGMTHSAAMASEILPGLTVMEALNTDLVCDAINTAMRELFLQI VYGRTQSAFSEEGLPVGAGLEDLGKGLRSQVGTMYGTLKKGPRYLEMAEGYVTGIALDAD DEIIGYQFVSLGKMTDFIKKGDDPNTAWEKAKGQYGRVADAVKIIDPRKE >gi|226332880|gb|ACII01000139.1| GENE 3 2321 - 3061 611 246 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1055 NR:ns ## KEGG: EUBREC_1055 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 17 242 10 253 260 120 36.0 4e-26 MTTKLPASFLPAKFQVLSGSALKLIAITLMLIDHTGLMILYNYPATTATLFSFGGVDYSW YRIFRDIGRAAFPIFCFLLVEGFLHTHNRKKYGLNLFLFACISEIPWNFMFTNTWRYEKQ NVFFTLFLGYLAFCALEYFWDNQKMQLVCVLALLTVSVFLKADYGWRGFVFLLIMYWFRN 
DKTAQAIIGSCWLYYEWKACFAFLSINMYNGERGFVKGRFLKYVFYWFYPVHITILVILR KLLFHM >gi|226332880|gb|ACII01000139.1| GENE 4 3169 - 3840 830 223 aa, chain - ## HITS:1 COG:CT629 KEGG:ns NR:ns ## COG: CT629 COG0110 # Protein_GI_number: 15605360 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Chlamydia trachomatis # 46 202 48 204 205 114 38.0 1e-25 MLKDFTIANLLDLNETIAKDLFEGKTYPWEVLPEIGDFILKLGQTLSEEEYDHPSEDVWI AKSAKVAPTACINGPVIIGKEAEVRHCAFIRGKAIIGEGAVVGNSTELKNAVLFNKVQVP HYNYVGDAVLGYKSHMGAGSICSNVKSDKKLVVVKDGDEKIETGLKKFGAMLGDNVEVGC GSVLNPGTVIGRCCNIYPLSSVRGCVPANHIYKSKTEIAEKRS >gi|226332880|gb|ACII01000139.1| GENE 5 3903 - 6083 1912 726 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 668 1 588 709 347 35.0 6e-95 MSKALYIAEKPSVAQEFAKALKINGQRRDGYLESQDSVVTWCVGHLVTMSYPEKYDIKYK RWSLDTLPFLPREFKYEVIPGVQKQFEIVKGLLNREDVDTIYVCTDSGREGEYIYRLVAQ MAGVHGKKEKRVWIDSQTEEEIMRGIREAKDLSAYDNLSASAYLRAKEDYLMGINFSRVL TLRYGNSVSNYLNTKYQAISVGRVMTCVLGMVVRREREIRAFVKTPFYRVLSCIALEGEN FEGEWRAVEGSRYFQTPYLYKENGFKEKAYAEKLIQELSVSQPLQCTVEKIERKKENRNP PLLFNLAELQNVCSKLFKISPDETLKIVQELYEKKLVTYPRTDARVLSTAAAKEIYKNIS GLRNYEHTAEIAQHIIEQGNYKNLAKTRYVNDKQITDHYAIVPTGQGLNTLRSVSLTAQR VYETIVRRFVCIFYPPAVYQKISLVTKIQNESFFSSFKVLLDEGYLKIATNSFAKRKAAD AMSSVNRAGAADSEGSEEEDPDTGKNGGNKADDSAEDMACDTRLLAALQNLKKGDILSVD SLSIKEGETSPPKRYNSGSMILAMENAGQLIEDEELRAQIKSCGIGTSATRAEILKKLCN IKYLALNKKTQVITPTLLGEMIFDVVNCSIRQLLNPELTASWEKGLNYVAEGSITEQEYM DKLEHFVRLRTRQVEDSNIQPYLRQFFDAAAVNYKDSSEKNSAKTTGRSTSAAGRSRTCR KPSASK >gi|226332880|gb|ACII01000139.1| GENE 6 6085 - 6840 791 251 aa, chain - ## HITS:1 COG:CAC1512 KEGG:ns NR:ns ## COG: CAC1512 COG2116 # Protein_GI_number: 15894790 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Clostridium acetobutylicum # 1 
251 1 253 256 148 37.0 1e-35 MNYEDVQKVSNAAKAKSNLVNTNFFKYFIRAVMAGFFIDVAMIYSNVVGNVFSKTMPEWG KFVGALVFSIAVLLISFVGGELFTGNNMVMAFGAYDKQVSWKEAGKVWGVSYLGNFVGCA ILALLFVGAGASGTADYFAGFIGNKLSIPLGQMFFRAVLCNFFVCLGVLCGMKLKSDAGR FLMIVMCISGFVVSGFEHCIANMGIFVTAYCLVPGLSLGAMVKSMVVVTLGNMVGGALLL AWPLRKMSADK >gi|226332880|gb|ACII01000139.1| GENE 7 7090 - 8139 1398 349 aa, chain - ## HITS:1 COG:selD KEGG:ns NR:ns ## COG: selD COG0709 # Protein_GI_number: 16129718 # Func_class: E Amino acid transport and metabolism # Function: Selenophosphate synthase # Organism: Escherichia coli K12 # 5 326 1 325 347 249 44.0 6e-66 MGNEMKKNEVRLTQLASAAGCGAKIGPKVLAQVVGKLPKFTDPMLLVGPETSDDAAVYKI NDELAMIHTVDFFPPVVDDPYMYGQIAAANALSDVYAMGGVPKLALNVVAFPNCLGPEVL EQILRGGADKVMEAGAVLAGGHTINDKEPKYGLCVTGYVHPDKMWKNYGAEEGDILILTK PLGCGILNTAIKAEMASQEEIERVQKIMAKLNKYAAEIASKYTIHSCTDVTGFSLAGHSL EMAKGSRKTLVIQSEKLPIIEGVEEYAQMGLIPEGAYRNRDFAGDEVRSEIKELWMEDLV FDPQTSGGLLLAVPAEEADALAEELAGMDIFGGVIGYVTELQDKAVVFE >gi|226332880|gb|ACII01000139.1| GENE 8 8374 - 8610 238 78 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1490 NR:ns ## KEGG: Cphy_1490 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 76 1 76 78 69 44.0 3e-11 MKEKVLTTVVTFETTTQAIAMERECKKNKFTGRLIPVPREISASCGLAWKCGEQNEEETA SYMKEQDLEYEKIYRILA >gi|226332880|gb|ACII01000139.1| GENE 9 8675 - 9760 1560 361 aa, chain - ## HITS:1 COG:CAC0022 KEGG:ns NR:ns ## COG: CAC0022 COG0136 # Protein_GI_number: 15893320 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Clostridium acetobutylicum # 4 361 3 360 360 460 60.0 1e-129 MSEKLRVGILGATGMVGQRFISLLENHPWFEVVTVAASPRSAGKTYEEAVGGRWKMDTPM PEGVKNLVVMNVNEVEKVAATVDFVFSAVDMSKEEIKKIEEEYAKTETPVVSNNSAHRWT PDVPMVVPEINPEHFDVIESQKKRLGTTRGFIAVKPNCSIQSYAPVLTAWKEFEPYEVVA TTYQAISGAGKTFKDWPEMVGNIIPYIGGEEEKSEKEPLRIWGKVEDGVIKPATEPVITC QCIRVPVLNGHTAAVFVKFRKNPTKEQLIKALVEFKGLPQELELPSAPKQFIQYLEEDNR PQVTEDVNFEHGMGVSVGRLREDTVYDWKFVGLSHNTVRGAAGGAVLCAETLKAKGYIQA K 
>gi|226332880|gb|ACII01000139.1| GENE 10 9898 - 11277 1651 459 aa, chain - ## HITS:1 COG:BH3186 KEGG:ns NR:ns ## COG: BH3186 COG0165 # Protein_GI_number: 15615748 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate lyase # Organism: Bacillus halodurans # 1 456 1 456 458 531 57.0 1e-151 MAQLWGGRFTKETDKLVYNFNASISFDQKFYEQDIRGSKAHVAMLARQGILTAEEKDQIE AGLDGILADVRSGKLEITSEYEDIHSFVEANLIDRIGDAGKKLHTGRSRNDQVALDMKLY VRDEIDETDELVKKLLEALQKIMEENVHTYMPGFTHLQKAQPVTLAHHVGAYFEMFVRDR SRLADIRKRMNTCPLGAGALAGTTYPLDREYTAQLLGFDGPTRNSMDSVSDRDYVIELLS AFSTIMMHMSRFCEEIIMWNSNEYRFVEIDDAYSTGSSIMPQKKNPDIAELVRGKTGRVY GALTSILTTMKGIPLAYNKDMQEDKELTFDAIDTVKGCLALFTGMISTMQFNKQNMEASA KNGFTNATDAADYLVNHGVPFRDAHGIVGRLVLTCIDKGISLDELPLEEYKAISPVFEND IYEAISMKTCVEKRMTIGAPGPDVMEKVIAENKKYLEEN >gi|226332880|gb|ACII01000139.1| GENE 11 11643 - 12269 586 208 aa, chain - ## HITS:1 COG:CAC1577 KEGG:ns NR:ns ## COG: CAC1577 COG1636 # Protein_GI_number: 15894855 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 208 1 208 208 238 62.0 6e-63 MNKRNYQKELEKLIDQAQKDNKIPSLFLHSCCAPCSSYVLEYLSKYFNITVFYYNPNIYP EEEYRKRVHEITRLVNSMEFEHPVKLIEGHYDPQEFFQMAKGLEDVPEGGERCFKCYRLR MEEAAKLAEEGGYDYFTTTLSISPLKNAAKINEIGQELAEIYHVQHLPSDFKKKNGYKRS IELSHEYDLYRQNYCGCVYSRREAENRK >gi|226332880|gb|ACII01000139.1| GENE 12 12342 - 13256 661 304 aa, chain - ## HITS:1 COG:CAC0799 KEGG:ns NR:ns ## COG: CAC0799 COG0688 # Protein_GI_number: 15894086 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Clostridium acetobutylicum # 10 286 13 288 291 214 42.0 2e-55 MKYIDRKGNITIEENEQDKFLRHLYNDRGGRLCLKLLIRPFVSKAAGVLLNTRLSARFVP DFVKNNKIDLSIYEKQNFSSWNDFFIRRIRKEERPIDMRENILISPCDGKLSVHRISSDS RFSIKDTEYTAGQLLKNKAIAERYTGGYALIFRLTVDDYHHYCYVADGRKSANVTLPGVF HTVNPAANDVYPIYKENAREYTLLKTKQFGTILMMEVGAMMVGKITNLHKNPATVKKGQE KGNFEFGGSTIILLIQPGKVRIAYDLIENTEEGYETIVKMGERIGECRKLKNTKNHYEGT IENT >gi|226332880|gb|ACII01000139.1| 
GENE 13 13340 - 13960 639 206 aa, chain - ## HITS:1 COG:CAC0798 KEGG:ns NR:ns ## COG: CAC0798 COG1183 # Protein_GI_number: 15894085 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Clostridium acetobutylicum # 1 203 1 202 205 129 38.0 3e-30 MFLGIYDYTVILTYISLGISVFGITRALEGDFKVAIFCLALSGLCDMFDGKIARTKKNRT DDEKNFGIQIDSLCDVVCFGIFPAMICYCLGVNTSAGIGALIFYSVASVIRLAYFNVSEA KRQNETSENRQYYQGLPITSMAIILPFLYLMRRYCGLYFLIVIHIAVIIVGLLFILDIKV KKPQNPVRILLVAVVALALAKMFRLI >gi|226332880|gb|ACII01000139.1| GENE 14 13972 - 15003 721 343 aa, chain - ## HITS:1 COG:CAC3016 KEGG:ns NR:ns ## COG: CAC3016 COG0392 # Protein_GI_number: 15896268 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Clostridium acetobutylicum # 8 331 5 330 337 91 23.0 2e-18 MRNKKKAIFNTIFLILIFSITVYMVFKGEDLGEIIHTVRQADPVYLLVSVVCVVLFILGE SVIIFYMMRTLGAKVKMGHCALYSFVGFFFSCITPSASGGQPMQIYFMKKDKLPIPVTTL VLMIVTITYKAVLVVIGIVIWIFGRGFLEGYLGQYMWVFYLGVGLNIFCVTFMMILVFAP GLAKWIMVKGLKLIEHFRFLKPKTTRLEKLEASMDQYHETAAFWASHKLVILKVLLITMV QRILLFTVTYWVYRSLGFHGYSIITLTILQSVISVSVDMLPLPGGMGISESLYLVMFAPV FGEALLPAMLLSRGISYYAQMLISAVMTCVAYFIIGRKEQEEK >gi|226332880|gb|ACII01000139.1| GENE 15 15267 - 16034 203 255 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 209 1 214 223 82 29 2e-15 MKPLLDITDVSLSYHSLSGETPALSHISFQLMPGEFLAIVGPSGCGKSTLLNLICGLLRP EQGQILLDGTPVTSGDCRIGYMLQKDHLLEWRTIYKNVLLGLEIRRELTAEKLAYVNQLL SDYGLDKFRSAHPSELSGGMRQRAALIRTLALKPELLLLDEPFSALDSQTRLSVSDDIGK ILRQEKKTAILVTHDISEAISMADRVIILSPRPAIIRKIVPVCFDLENRTPMASRNAPEF KSYFNLIWKELNHDV >gi|226332880|gb|ACII01000139.1| GENE 16 16027 - 16818 579 263 aa, chain + ## HITS:1 COG:CAC0618 KEGG:ns NR:ns ## COG: CAC0618 COG0600 # Protein_GI_number: 15893906 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: 
Clostridium acetobutylicum # 3 263 2 262 264 197 42.0 2e-50 MSEISYHQQQYIRHQKREKKLVLFLRIFILLFFLVLWEISARTGLIDSFIFSSPVMIWHS FCSMCRDGSIFPHLSVTITETLVSFLFVVVLGAGMAVLLWTCPRLAKITEPYLVVLNSLP KSALAPLLIVWLGANERTIIVCGMSVAIFGSILNLYTGFGEADPEKLKLIETLGGGKKEK LMKIILPSSVPLLLSVMKVNIGLCLVGVIIGEFIGARKGLGYLIIYSSQTFKLTWVLMSI IILCIIAIILYGLLGLIEKRARR >gi|226332880|gb|ACII01000139.1| GENE 17 17064 - 18299 1350 411 aa, chain - ## HITS:1 COG:CAC0102 KEGG:ns NR:ns ## COG: CAC0102 COG2873 # Protein_GI_number: 15893398 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Clostridium acetobutylicum # 1 408 1 409 409 373 44.0 1e-103 MEFQTKLLHGKAVDNYANGSTVPPVSLANAFAYESSEQLEKVFQNRAPGFAYTRIANPTV DAFERRVNELEGGIGGVACSSGMSAVTLSLLNILQAGDEVIAGSALFGGTLDLLHDLEAF GIKVHFIPRVEKALIEPFLTDKTRAVFGEIVGNPALNVMDVRETADFLHGKGVPLIVDST TSTPYLLNPIQYGADVVVHSTSKYINGSGDAISGIIIDSGNFSWSPERYPGMKEYKKYGK FAYLVKLRNGIWRNMGGCLAPMNAYLNIIGMETLGLRMERVCNNAFRLAQALEKLEGVSV NYPLLESSPYHELAEKQLSGKGGAILTIRAGSKERAYKLINHLKYVKIATNIGDVRTLVI HPASTIYIHSTPEAMEEAGVYDDTIRISVGIEDIKDLIKDFTEAVESLGEE >gi|226332880|gb|ACII01000139.1| GENE 18 18759 - 20459 1597 566 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580779|ref|ZP_04858042.1| ## NR: gi|253580779|ref|ZP_04858042.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 550 1 550 566 883 100.0 0 MKTRKKWLRKTTAICTAALLGTAAIPSTAFAADSYASIEKDAWAKKVTEMASSYATSIEE SQSLMSGMQSDMILKFEDSGRSLLGFVAPFDVSWLDNVTLSNDISFTEGKEGILMKVLLN DNKICTLEYYLDPDSQDIYMRIPELSDKYFKTNLEEAADQQAANIENDLEELTPDDSDAD IPTDNFASAYSDSLSLTVSMMSDLSAAAPEASVVETLLDKYGSMLFDNVTEGESSQETLT AGDISQDCTVYEGQISAEDAVKTATAILEEAKSDSDIENILDTWTEKLSSNEDLHESFTK AVEDGLDFLKDADTGDSDDSHLNTRIWVDETGRIAGRKIEFQEGDKITPVLNWQMTRDGS DFGYLLSIETDDSGTLSLSGSGQIDGGKLNGTYKISQDDTAAAVIEVKDYDTESAKEGYL NGNYTITFPADSSEDTDSSLSMLENFALVLDLNSAKDSGSVALSVESAGSTLGSFTVTSG AGESVEIPDLTALGDVYDVTNEDDMSAYAATLDLTTLMDNLSNAGVPDEVITYVLSGGSN SDAEDVTDENASEAESDAASAEAGAA >gi|226332880|gb|ACII01000139.1| GENE 19 20483 - 21820 1317 445 aa, chain - ## HITS:1 COG:MA1121 KEGG:ns NR:ns ## COG: MA1121 COG0534 # Protein_GI_number: 20089987 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 5 433 14 443 475 170 28.0 5e-42 MNNTFMKKKPILPLLASMAMPMVLSMLVNSLYNIVDSFFVAKISEQAMTALSLVFPVQNF INAVAIGFGVGINAQISFQLGAKNIKNANIAATHGMLFNIIHGIIFTVICIPFMPVFLSM FTKAQEVISSGVQYSTIAFAFSTVIMISLSFEKIFQAVGRMKLTMISLMIGCISNIILDP LLIFGVGIFPEMGIRGAALATGLGQVFTVIVYLIAYKKCSIPVEISRKYLELNRNLDGKL YMVGIPAILNIALPSVLISFLNQILSAFSGSYVVVLGIYYKLQTFLYLPASGIVQGMRPL IGYNYGAGEIKRVKKLFGISVLLNGMIMLAGTIICFTASETLMGMFTENPETVRLGTAAL QIISIGFIPSAISVTASGALEGLSKGAQSLVISLLRYIIVIIPAAYLLCYFAGADAVWNA FWICEFITAVVVGIGIKYYWNRLLQ >gi|226332880|gb|ACII01000139.1| GENE 20 21971 - 22711 829 246 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580781|ref|ZP_04858044.1| ## NR: gi|253580781|ref|ZP_04858044.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 246 1 246 246 436 100.0 1e-121 MADKNQDKNQSKNRAEQAEEHDESVLNPEIKNESDLSKKERRLIEKEKLKGMGPKKKLEY IWMYYKPAIFGVIAVIALIFGIKDYYEQSKIKTVLSMTVVNSMANDTETPEQKIKETLGY KDDPYSKVEIGVNLTTDSEMAEFDYNAQMAYVAQIQAGSIDIMVMPEKLYQTLKKNEPFA DLKELMGEEAFEKFGMQTDTTHISITDSELEQELGVIYDPVCIAVPYSAPNQENAVKWIK SLDSRK >gi|226332880|gb|ACII01000139.1| GENE 21 22943 - 23965 405 340 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_2661 NR:ns ## KEGG: Fisuc_2661 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 3 333 8 326 333 116 29.0 1e-24 MRNKSLDAVKAIAACLVVCIHVSFPGQAGQLVKVLARCAVPFFFMVSGYFCYYQNCNASK RILSKILHIMKLFAVSVVFYFIWECFMKAWNGERVWTWIKGLVSTEHLKEFFVYNSTSPV RAHLWFLPALIYCYLLALLIEKWRMRRAAYCMAPVLLAILLWRAEFCVFFDRFYHTMEYR NFLFTGMSFFLTGQIIHEYQDKIVCKRLEQWMQWGLKAGMIFGVALSMMEYAFRGAGEIY TGNCVAVICLFLWLILYGREINFPSVLVETGRKYAFLIYLLHPAVSDLLKKCSEGLGVSN CQIYFWLRPVLVYMLTVVTVSGISAVSAYARQNILQNNHV >gi|226332880|gb|ACII01000139.1| GENE 22 24113 - 24868 585 251 aa, chain - ## HITS:1 COG:no KEGG:STH615 NR:ns ## KEGG: STH615 # Name: not_defined # Def: hypothetical protein # Organism: S.thermophilum # Pathway: not_defined # 15 245 127 362 367 119 35.0 1e-25 MINLFDFTYCGDYNRQIQNLARMVPEKWSFSDTDDNGILKGYLEHTFKRLYEEQKVWEKK NYAIFNTGLFNYYYQPIYAYFIPNLVPDRQPWFLDGFYTEYYLLKEGITCLPEKACYVEN PSDLVFDTKLPVIPQYEHIFGDEENAARLPKEVRDSSMKMQLFDGALKQTKRMLEADYRT AIPQYYNHSIQLLLPICLRHPGKPDLALACMKTSDGSKYLGRTCLTLRMAYHNARLLARV DRSWLMTSVSA Prediction of potential genes in microbial genomes Time: Sat May 28 20:50:25 2011 Seq name: gi|226332879|gb|ACII01000140.1| Ruminococcus sp. 
5_1_39B_FAA cont1.140, whole genome shotgun sequence Length of sequence - 14431 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 4, operones - 3 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 20 - 79 6.5 1 1 Op 1 1/0.000 + CDS 165 - 1346 944 ## COG0500 SAM-dependent methyltransferases + Term 1381 - 1409 2.3 + Prom 1354 - 1413 6.3 2 1 Op 2 16/0.000 + CDS 1570 - 2466 1150 ## COG1209 dTDP-glucose pyrophosphorylase + Prom 2519 - 2578 3.0 3 1 Op 3 11/0.000 + CDS 2643 - 3671 1045 ## COG1088 dTDP-D-glucose 4,6-dehydratase 4 1 Op 4 . + CDS 3675 - 4616 1110 ## COG1091 dTDP-4-dehydrorhamnose reductase + Term 4618 - 4681 15.2 - Term 4615 - 4661 3.2 5 2 Op 1 8/0.000 - CDS 4677 - 6644 2039 ## COG4666 TRAP-type uncharacterized transport system, fused permease components 6 2 Op 2 . - CDS 6650 - 7651 1176 ## COG2358 TRAP-type uncharacterized transport system, periplasmic component - Prom 7700 - 7759 10.6 + Prom 7697 - 7756 8.6 7 3 Tu 1 . + CDS 7788 - 8429 634 ## Amuc_1957 membrane protein-like protein + Term 8462 - 8506 7.0 - Term 8450 - 8494 8.6 8 4 Op 1 1/0.000 - CDS 8536 - 9306 1045 ## COG1540 Uncharacterized proteins, homologs of lactam utilization protein B 9 4 Op 2 27/0.000 - CDS 9393 - 10742 1427 ## COG0439 Biotin carboxylase 10 4 Op 3 1/0.000 - CDS 10761 - 11198 611 ## COG0511 Biotin carboxyl carrier protein 11 4 Op 4 21/0.000 - CDS 11241 - 12248 1140 ## COG1984 Allophanate hydrolase subunit 2 12 4 Op 5 1/0.000 - CDS 12250 - 12975 779 ## COG2049 Allophanate hydrolase subunit 1 - Prom 13005 - 13064 2.1 - Term 13041 - 13088 10.7 13 4 Op 6 . 
- CDS 13129 - 14331 1750 ## COG1914 Mn2+ and Fe2+ transporters of the NRAMP family - Prom 14358 - 14417 6.4 Predicted protein(s) >gi|226332879|gb|ACII01000140.1| GENE 1 165 - 1346 944 393 aa, chain + ## HITS:1 COG:VNG0503C KEGG:ns NR:ns ## COG: VNG0503C COG0500 # Protein_GI_number: 15789731 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Halobacterium sp. NRC-1 # 141 355 13 223 262 97 30.0 5e-20 MKKDGYYSSGEFARMAHVTLRTIRYYDKQNILKPSYVTESGARFYTDEDFARLSQILLLK YLGFSLDDIREMTIDDSDYHFMENSLNIQLKLVRDRIEQMQLVEKAIQDTTDAIRSHHAI DWNQMLDLIHLTGMEKSMKNQYQNASNISSRINLHSLYSQNKQGWFPWVYEQCQIHPGMR ILELGCGDGVLWTQNISSLPGKVSVTLSDLSSGMLRDARRAIGRKDSRFSFEAFDCARIP HEDRSFDLVIANHVLFYCEDISAVCSEIQRVLTPGGKLICSTYGKKHMQEVSRLVRDFDD RIVLSADKLYERFGRENGADILAPYFSRIFWESYKDSLLVPDAEPLISYILSCHGNQNQY LLDKYKDFRGFVSRKTKDGFFITKDAGIFLCEK >gi|226332879|gb|ACII01000140.1| GENE 2 1570 - 2466 1150 298 aa, chain + ## HITS:1 COG:STM2095 KEGG:ns NR:ns ## COG: STM2095 COG1209 # Protein_GI_number: 16765425 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Salmonella typhimurium LT2 # 2 291 5 291 292 416 68.0 1e-116 MKGIILAGGSGTRLYPLTMVTSKQLLPIYDKPMIYYPMSVLMNAGIRDILIISTPQDTPR FEELLGDGHQFGVHLTYAVQPSPDGLAQAFIIGEEFIGNDTVAMVLGDNIFAGHGLKKRL KAAVKNAESGEGATVFGYYVDDPERFGIVEFDHEGKAISIEEKPEHPKSNYCVTGLYFYD NRVVEYAKNLKPSARGELEITDLNRIYLEKGELNVELLGQGFTWLDTGTHESLVEATNFV KTMESHQHRKIGCLEEIAYLNGWITKEDVLKVYEILKKNQYGQYLKDVLDGKYLDVLH >gi|226332879|gb|ACII01000140.1| GENE 3 2643 - 3671 1045 342 aa, chain + ## HITS:1 COG:MTH1789 KEGG:ns NR:ns ## COG: MTH1789 COG1088 # Protein_GI_number: 15679777 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Methanothermobacter thermautotrophicus # 3 339 4 333 336 456 64.0 1e-128 MNIIVTGGAGFIGSNFIFHMLKKYPDYRIICLDCLTYAGNLSTLAPVMDNPNFRFVKESI 
TDREAVYKLFEEEHPDMVVNFAAESHVDRSIENPEVFLTTNIIGTAVLMDACRKYGIKRY HQVSTDEVYGDLPLDRPDLFFTEETPIHTSSPYSSSKASADLLVLAYHRTYGLPVTISRC SNNYGPYHFPEKLIPLMIANALNDKPLPVYGTGENVRDWLYVEDHCRAIDLIIHKGRVGE VYNVGGHNEMTNIDIVKIICKELGKPESLITYVADRKGHDMRYAIDPTKIHNELGWLPET KFADGIKKTIKWYLDNKEWWETIISGEYQNYYEKMYANRGEA >gi|226332879|gb|ACII01000140.1| GENE 4 3675 - 4616 1110 313 aa, chain + ## HITS:1 COG:SPy0784 KEGG:ns NR:ns ## COG: SPy0784 COG1091 # Protein_GI_number: 15674829 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Streptococcus pyogenes M1 GAS # 3 310 2 280 284 197 40.0 2e-50 MRIFVTGVGGQLGHDVMNELAKRGYEGVGSDIAPSYSGIADGTAVTTMPYVQMDITDKAS VEKVIKEANVDAVIHCAAWTAVDAAEDEENQPKVRLVNVTGTQNIADVCKELDIKMLYLS TDYVFDGQGTTPWEPDCKDYKPLNVYGQTKLDGELAVSGTVNKFFIVRIAWVFGKNGKNF IKTMINLGKTHDTLSVVNDQIGTPTYTYDLARLLIDMIETEKYGYYHATNEGGYISWYDF TKEIFRQAAAMGHPEYLPENMTVNSVTTAEYGASKAARPFNSRLDKSKLTANGFTPLPTW QDALGRYLKELDF >gi|226332879|gb|ACII01000140.1| GENE 5 4677 - 6644 2039 655 aa, chain - ## HITS:1 COG:BH2938 KEGG:ns NR:ns ## COG: BH2938 COG4666 # Protein_GI_number: 15615500 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, fused permease components # Organism: Bacillus halodurans # 19 642 6 638 654 422 42.0 1e-117 MKENKIKKANTSPEGQAGSENMSTADVDAVMKKYDRESNIRVWTGKYKLAVRGLLVAFSL WCIYVTLFATFLEEIRLTSFLGLIILIGFLTYPAKKGDIRENYMPIGDIIFMIAGTGAFL YFTFSANQIINQGTRFAPYQIVIGIIGIAALIELTRRCVGLPILFVAGFFLIYALAVGLT NPAFLGRVKFLVRNLFYTKEGIFSTPVNVCSKYIVVFIIFGAFLERTGISNFFIQLANCI AGKYAGGPAKVAVISSALCGMVSGSSVGNTVTTGSVTIPMMKDTGYKPEFAGAVEAASST GGQIMPPIMGAAAFLMADFVGVPYSNIIVRAILPALLYFTGIFISVHLEAKKLGLSGIPK EKLPVFKLLIRKIYLLLPLVMLVIWVSGNYMTMQKAASYAIVLSVIVSLFDKENRISVTK CIDALEAGGRGVISVAVACGVAGIISGSITMTGLANDLINGIISVANGKLIIALLLTMLC CIVLGMGVPTTANYCIMAATCAPILVRMGVPTLAAHFFVFYFGIVADITPPVALAAYAGS AIAKANPMKTAFTASKLAIAVFIVPYVFCFNPAMLLIDTTPLKVVQIFITSLIGVFGLSS SLEGFLSVKMSVPVRVLMAAGGLMLIDPSLMTDVVGILLIVGCCVWQTAQKKKTA 
>gi|226332879|gb|ACII01000140.1| GENE 6 6650 - 7651 1176 333 aa, chain - ## HITS:1 COG:AF0635 KEGG:ns NR:ns ## COG: AF0635 COG2358 # Protein_GI_number: 11498243 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, periplasmic component # Organism: Archaeoglobus fulgidus # 40 326 41 328 330 225 42.0 8e-59 MRNKELFRGRRKAGRIVGLVLTLALAVSLLSGCGQSKERLTFGTGGTAGTYYSYGGVLAQ YITNNTNVKVTAVSTGGSKANLQSVQDGDFQLGFVQSDVMAYAWDGVRAFEQDGATHDFR TLGGLYAETVQLFTMDPDIKTVEDLKGKSVSIGAANSGVYFNALDVLNAGGITLDDIHPQ YLSFEDSKESLKDGKIDAAFIVAGAPTTAITELATTNGVYMINIDGAIRDSILSDCPYYT SYQIPAGTYPGQDEPVETITVKATIVVSESLDEDTVYDMTAAIFDHADEISAENAKGAEL SLENATSGMTIPFHKGAARYYAEHGITVDTKGE >gi|226332879|gb|ACII01000140.1| GENE 7 7788 - 8429 634 213 aa, chain + ## HITS:1 COG:no KEGG:Amuc_1957 NR:ns ## KEGG: Amuc_1957 # Name: not_defined # Def: membrane protein-like protein # Organism: A.muciniphila # Pathway: not_defined # 13 199 247 434 473 120 35.0 2e-26 MEVSIMSERKYPVITISREFGAGGHSIARVVSERLGIPYYDRDFAKQAAARSGYSQEDID REGENLSRSARLLDNILNNSAGYFSSHDAIYQAQKELILQFAAEQDCIIIGRCSNLILRN AGIPSLDIFLHADVNLRTEHIQKLGLNGKEDPRKYLTKMDNLRETYFKTYTKHELGTYHD YDLCLDTGTLGYDNCIEIITSLAKQQGLNLKHQ >gi|226332879|gb|ACII01000140.1| GENE 8 8536 - 9306 1045 256 aa, chain - ## HITS:1 COG:Cj1541 KEGG:ns NR:ns ## COG: Cj1541 COG1540 # Protein_GI_number: 15792849 # Func_class: R General function prediction only # Function: Uncharacterized proteins, homologs of lactam utilization protein B # Organism: Campylobacter jejuni # 1 255 1 255 255 313 60.0 1e-85 MYTVDLNSDLGESFGNYKIGNDDKVIPLISSANVACGYHASDPVVMGNTLAMAREAGIRI GAHPGFPDLMGFGRRNLSVSPAEAKAYVLYQLGALDAFCRVTGVKMQHVKPHGALYNMAA EDYALSTAICEAIKEFDSSLIVLALSGGQLAKAAQDMGLRTAMEVFADRAYEEDGTLVNR RKEGAVITDENEAIVRVIRMVKEKKITAITGKDIDIQADSVCVHGDGAKALAFVEKIREA FAKEGIQIRSLDEFVK >gi|226332879|gb|ACII01000140.1| GENE 9 9393 - 10742 1427 449 aa, chain - ## HITS:1 COG:CAC3570 KEGG:ns NR:ns ## COG: CAC3570 COG0439 # Protein_GI_number: 15896804 # Func_class: I Lipid transport 
and metabolism # Function: Biotin carboxylase # Organism: Clostridium acetobutylicum # 1 446 1 446 447 527 59.0 1e-149 MFNKILIANRGEIAVRIIRACRNLGIRSVAVYSKEDAKSLHVQLADQRICIGEGPARNSY LNMDRIIAAAENMGADAIHPGFGFLSENSEFVRKCKENGITFIGPDADVIDKMGNKSHAR KTMMDAGVPVVPGTKEPVYDAETGKKLAEEIGYPVMIKASSGGGGKGMRVAANEDEFEFQ FNMAQRESANAFGDDTMYIERFVENPRHVEIQIMADSHGNVVALGERDCSVQRNHQKLIE ESPSPAIDEETRRKMNEYAVLAAKTVGYTNAGTIEFIVDPKGNFYFMEMNTRIQVEHGVT EMVTGTDLIIEQIRVAMGEELSFKQEDIRIKGHAIECRINAEIPEKNFMPSPGVVQHMHL PAGNGVRVDTALYTGYKIPSEYDSMIAKVIVHAPDREAALQKMRSALDEMVVMGVETNLD FQYQIMKNQVFCDGKADTGFIENVLKIKG >gi|226332879|gb|ACII01000140.1| GENE 10 10761 - 11198 611 145 aa, chain - ## HITS:1 COG:CAC3572 KEGG:ns NR:ns ## COG: CAC3572 COG0511 # Protein_GI_number: 15896806 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Clostridium acetobutylicum # 1 142 1 156 159 103 43.0 7e-23 MDLEKIEGLVKIIEDSSLNEFTYKDKDVKITMSKLDHPPVVAAGVPVAASAPVNTIVEKA EEEEEESLFITSPIVGTFYSASAPDVPAFVKVGDQVKAGQTVCILEAMKLMNEIQSDYDC EIEAVLVSNEQKVEYGQPLFRVKRL >gi|226332879|gb|ACII01000140.1| GENE 11 11241 - 12248 1140 335 aa, chain - ## HITS:1 COG:FN0436 KEGG:ns NR:ns ## COG: FN0436 COG1984 # Protein_GI_number: 19703774 # Func_class: E Amino acid transport and metabolism # Function: Allophanate hydrolase subunit 2 # Organism: Fusobacterium nucleatum # 3 310 4 310 336 287 48.0 3e-77 MGIRILKGGMMTTVQDLGRYGYQSQGFSVAGVMDVRSFKIANLLLDNPENEAVLEITLIG PTLEFTSATIIAITGGDFQPTINGEPAPMYTALYMNKGDVLKFGSARTGSRGYIAFSSYL DIPVVMGSRCTNLKSGLGGFKGRKLEADDYIGFRIKRRYLPFFLSRKLDMDEFDQTEATL RVVMGPQDGMFSKQGIKTFLGSEYTVTNEFDRMGCRLEGPFIAPKKTSDIISDGIAFGAI QVPSHGKPIILLADRQTTGGYGKIATVASVDIPKLVQRKTDDKIHFKAITVQEAQALYVE EMKELDGLRKIIHQPCKEVLDCRLVAKRLRKLFEE >gi|226332879|gb|ACII01000140.1| GENE 12 12250 - 12975 779 241 aa, chain - ## HITS:1 COG:FN0437 KEGG:ns NR:ns ## COG: FN0437 COG2049 # Protein_GI_number: 19703775 # Func_class: E Amino acid transport and metabolism # Function: Allophanate hydrolase subunit 1 # Organism: 
Fusobacterium nucleatum # 4 239 18 254 262 241 48.0 6e-64 MPDIRILTEGDSSVLVEFGKEISPEINRKITATVQLMKEQHIEGVVDMIPAFCSLLVNYD PRVISYDDLKKRLEILLKMEVTAGEGCRKVYEIPVCYGGEYGPDIENIAEHAGLSVEEVI KIHSSRDYLIYMLGFLPGFCYLGGLDERIHTPRLANPRIKINAGSVGIGGSQTGIYPLDS PGGWQLMGMTPVKTYDPDREIPILVEAGDYIRFVPIDEDEYKRIRELVERGEYQCTVHEE A >gi|226332879|gb|ACII01000140.1| GENE 13 13129 - 14331 1750 400 aa, chain - ## HITS:1 COG:FN0438 KEGG:ns NR:ns ## COG: FN0438 COG1914 # Protein_GI_number: 19703776 # Func_class: P Inorganic ion transport and metabolism # Function: Mn2+ and Fe2+ transporters of the NRAMP family # Organism: Fusobacterium nucleatum # 3 392 2 387 395 366 59.0 1e-101 MSEKKKSGGALLGAAFLMATSAIGPGFLTQTATFTGQYQESFGFVILVSVILAAIAQLNI WRVLCVTGLRGQDVSNKVLPGLGYLVSILIVFGGLVFNIGNVGGGALGFNTLLGIPTKVG YILAGLLAILVFVLKNAKSAMDTITKVLGAIMIIVIFVVIIVVKPPVGSALKNTFVPEAG ATNLIPAILTLLGGTVGGYITFSGAHRLIDAGITGEKNLKEINKSSVMGMIIATIVRIFL FLAVLGVVVKGVTLDAANPAADAFKQGAGEIGYRFAGLVLLCAAITSIIGAAYTSVSFLK TFSKSIEENENKVIIGFIIISTAIMFILGNPAVLLVLAGAVNGLILPITLAICLIAAHKK SIMGENYHHPVVLTILGVIVVILTAYLGVTTFVSKIGTLL Prediction of potential genes in microbial genomes Time: Sat May 28 20:50:34 2011 Seq name: gi|226332878|gb|ACII01000141.1| Ruminococcus sp. 5_1_39B_FAA cont1.141, whole genome shotgun sequence Length of sequence - 20770 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 6, operones - 3 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 23 - 80 13.5 1 1 Op 1 1/0.000 - CDS 103 - 1485 1587 ## COG0423 Glycyl-tRNA synthetase (class II) - Prom 1509 - 1568 3.9 - Term 1574 - 1612 4.1 2 1 Op 2 16/0.000 - CDS 1638 - 2390 775 ## COG1381 Recombinational DNA repair protein (RecF pathway) 3 1 Op 3 . - CDS 2419 - 3327 1110 ## COG1159 GTPase 4 1 Op 4 . - CDS 3344 - 6268 3133 ## COG1026 Predicted Zn-dependent peptidases, insulinase-like 5 1 Op 5 . - CDS 6301 - 6885 550 ## gi|253580803|ref|ZP_04858066.1| conserved hypothetical protein 6 1 Op 6 . 
- CDS 6957 - 8072 1291 ## COG0562 UDP-galactopyranose mutase - Prom 8148 - 8207 3.4 - Term 8162 - 8200 5.2 7 2 Tu 1 . - CDS 8245 - 9780 1684 ## COG0591 Na+/proline symporter - Prom 9861 - 9920 6.0 8 3 Op 1 . - CDS 10296 - 10850 546 ## EUBREC_1818 hypothetical protein 9 3 Op 2 . - CDS 10828 - 12723 1458 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases 10 3 Op 3 . - CDS 12787 - 13542 587 ## COG0708 Exonuclease III - Prom 13614 - 13673 6.6 + Prom 13651 - 13710 6.5 11 4 Tu 1 . + CDS 13761 - 14138 541 ## EUBREC_1820 hypothetical protein + Term 14152 - 14199 9.6 + Prom 14246 - 14305 7.5 12 5 Tu 1 . + CDS 14402 - 18625 1794 ## COG4870 Cysteine protease + Term 18635 - 18690 0.6 - Term 18618 - 18680 23.0 13 6 Op 1 . - CDS 18696 - 19880 1459 ## COG3007 Uncharacterized paraquat-inducible protein B - Prom 20023 - 20082 8.1 - Term 20043 - 20083 2.7 14 6 Op 2 . - CDS 20172 - 20684 684 ## COG0756 dUTPase Predicted protein(s) >gi|226332878|gb|ACII01000141.1| GENE 1 103 - 1485 1587 460 aa, chain - ## HITS:1 COG:CAC3195 KEGG:ns NR:ns ## COG: CAC3195 COG0423 # Protein_GI_number: 15896443 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 2 451 4 450 462 698 71.0 0 MEKTMDKIIALAKARGFVYPGSEIYGGLANTWDYGNLGVELKNNVKRAWWQKFIQESPYN VGVDCAILMNPQTWVASGHLGSFSDPLMDCKECHERFRADKIIEDFSQENGIELESSVDG WSQEEMMNFIKDHNVPCPTCGKHNFTDIRQFNLMFKTFQGVTEDAKNTVYLRPETAQGIF VNFKNVQRTSRKKIPFGIGQIGKSFRNEITPGNFTFRTREFEQMELEFFCEPDTDLEWFA YWKNFCLNWLHSLGLKDEEVRYRDHDAEELSFYSKATTDVEFLFPFGWGELWGIADRTDY DLTQHQNVSGQDLTYFDDEKKQKYIPYVIEPSLGADRVVLAFLCSAYDEEVLDAEKNDVR TVLHFHPALAPVKIGVLPLSKKLNEGAEKIYQQLSKKYNCEYDDRGNIGKRYRRQDEIGT PFCVTYDFESENDGAVTIRDRDTMEQVRVKIDELDAYFCG >gi|226332878|gb|ACII01000141.1| GENE 2 1638 - 2390 775 250 aa, chain - ## HITS:1 COG:BS_yqxN KEGG:ns NR:ns ## COG: BS_yqxN COG1381 # Protein_GI_number: 16079582 # Func_class: L Replication, recombination and repair # Function: 
Recombinational DNA repair protein (RecF pathway) # Organism: Bacillus subtilis # 4 249 2 243 255 87 28.0 2e-17 MSDLITVQGVVLSAMPVGEYDKRIVLLTRERGKISAFAKGARRPNSQFLAAANPFVFGTF ALYEGRSSYNLNQVSISHHFVELAGEQPGIYYGYYFLELADYFGQEGIDEKESMNLLYVT VKALLNPNIDNRLVRCIFELRMMAAQGLCPSLFHCVCCERQPVEGEELFFSQQNHGILDK ACLGHVNDAKRISAPALYAMQYIVTASLGKLYTFTVTEEVLHELERHIHTYIAANTEKRF KSLEILEIMS >gi|226332878|gb|ACII01000141.1| GENE 3 2419 - 3327 1110 302 aa, chain - ## HITS:1 COG:BH1367 KEGG:ns NR:ns ## COG: BH1367 COG1159 # Protein_GI_number: 15613930 # Func_class: R General function prediction only # Function: GTPase # Organism: Bacillus halodurans # 3 301 6 303 304 326 55.0 4e-89 MNENFKSGFVAIIGRPNVGKSTLMNHLIGQKIAITSKKPQTTRNRIQTVYTCDEGQIVFL DTPGIHKAKNKLGEYMVQVAERTLKEVDAIMWLVEPSTFIGAGERHIAEQLQGIGVPVIL IINKIDTVSKEEILPAIDTYRKVCDFAEIIPCSALRGQNTQDIIGCILKYLPYGPMFYDE DTVTDQPQRQIVAEIIREKALHALDAEIPHGIAVAIDRMKTRPGKNPIVDIDATIICERE SHKGIIIGKQGAMLKKIGSNARYEIERMLESKVNLKLWVKVKKDWRDSDFLIKNFGYDKK EI >gi|226332878|gb|ACII01000141.1| GENE 4 3344 - 6268 3133 974 aa, chain - ## HITS:1 COG:CAC3006 KEGG:ns NR:ns ## COG: CAC3006 COG1026 # Protein_GI_number: 15896258 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases, insulinase-like # Organism: Clostridium acetobutylicum # 8 965 12 968 976 844 46.0 0 MNKANLTAYEVVTEENLTDIHSTGWLLRHKKTGARVMLIENDDENKVFNIAFRTPPKDST GVAHILEHSVLCGSREFPLKDPFVELVKGSLNTFLNAMTYPDKTCYPVASCNDKDFQNLM HVYLDAVFYPNIYKREEIFRQEGWNYHLEQKEGPLKYNGVVYNEMKGAFSSPDEVLEREI MNHLFPDTTYGCESGGDPKNIPDLTYENFLNFHRTYYHPSNSYIYLYGNMDMEEKLAFLD EHYLSHFDYLDVDSVIQEQKAFGACRDVTLEYPVAENEGEEDNTYLSYNMVVGNAADSQM AMAFEVLDYALLSAPGAPLKQALLDVKAGKDVYGSYDDGILQPYFTVIAKGSNPDRKEEF VSVIRQVLGDIVKNGIDRKAVEAGINYFEFRYREADFSSYPKGLMYSLDILGDWLYEKGN PFAQVQQLTVFEKLKKAVNEGYFEELIRKYLLENPHGCIMTLVPKKGLAAQREKELEEKL EAYRSSLSEEQLDAMVEKTKALEAYQEAGEDPKALECIPMLKRSDIKREAAKIINEELTV DDSLFLYHDVCTNGIGYVDLMFKTDSIAPEQIPYLGLLKSVLGYVDTENYTYGELFNEIN ANTGGINCGVEVFDRADSTEEFQAMFSVRGKALYTKMDFLFKMIGEILNSSKLEDTKRLY 
EIVASVKSRAQVNLTGAGHSTAVLRAAAYSSPMAAFQDEMAGIGYYQFIEKLEKDFEQRK EETVEELCKLMKKILRPENFMISYTGERESLETVQKLAGAVKAGLGTEPVEKSEEKLTCT KKNEGFKTSGQVQYVAQTGNFKKKGLEYTGALEILKVILSYDYLWMNLRVKGGAYGCMSG FKRNGESYLVSYRDPHLKRTLDVYKGIPDYIRNFQADEREMTKYIIGTISGKDVPRTPQM KGSVSKTAYFCGVTEEMLQKERDQILNASAEDIHALAEIIEAVLAADQICVVGSESKVAE ASDVLMEVKPLINC >gi|226332878|gb|ACII01000141.1| GENE 5 6301 - 6885 550 194 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580803|ref|ZP_04858066.1| ## NR: gi|253580803|ref|ZP_04858066.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 194 1 194 194 358 100.0 1e-97 MTDTPIIALYYREHDPMKRKMFLDQSIAAGEDEEANAVRKELWELRYGEPSEAGSGTRAD GYLALWMAMEYSKDTAGKLFGLKRARKEIEKNLARLKFREMQEKSELHRELLYRECCHMV KTYMELCEKDKTYNTTLCGIVPISEKSAKSKLQKDVYTTAIVFPEEINMQEELEMITKAA REAYEAHFPGEGGL >gi|226332878|gb|ACII01000141.1| GENE 6 6957 - 8072 1291 371 aa, chain - ## HITS:1 COG:Cj1439c KEGG:ns NR:ns ## COG: Cj1439c COG0562 # Protein_GI_number: 15792757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-galactopyranose mutase # Organism: Campylobacter jejuni # 4 366 2 363 368 481 64.0 1e-135 MKKYDYLIVGAGLFGAVFAHEATRRGKTCLVIDKRGHIGGNIYTEEVEGIQVHRYGAHIF HTSRKQVWDYINQFAEFNHFVNSPIAVYKDELYNLPFNMNTFHQLWGVRTPAEAKAKIQE QIARMHITHPKNLEEQALALVGQDVYEKLVEGYTRKQWGRECKDLPAFIIKRLPLRYTYD NNYFKDPYQGIPKGGYTRIVKKLLEGVQVCLNTDFFANREELTAQADKVLFTGMIDEFYD YCYGELEYRSLKFETEVLDMGNYQGNAVVNYTDYEVPYTRIIEHKHFEFGTQPKTVITRE YPAAWEKGKEPYYPVNDPKNSELFDKYERRALEEKNVIFGGRLGMYRYMDMDQVIEEALS LAETELTYEEP >gi|226332878|gb|ACII01000141.1| GENE 7 8245 - 9780 1684 511 aa, chain - ## HITS:1 COG:MA0003 KEGG:ns NR:ns ## COG: MA0003 COG0591 # Protein_GI_number: 20088902 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Methanosarcina acetivorans str.C2A # 1 508 1 489 514 426 51.0 1e-119 MSGTTMVILSAFVAYLLLMIVIGVVYMKKTSSSEDYFLGGRGLNAWVAALSAQASDMSGW LLMGLPGAIYSLGTGQIWIAVGLFIGTVLNWVCISHRLRKYTIAANNSLTIPAFLENRFQ 
DKKRILLLLSSIVIVIFFLVYTASALAAGGKLFNTVFGIDYHIALAIGAAVILCYTFMGG FMAVCVTDFVQGTLMLIGLLIVPLVAYLTLSGSLSDLLTQSGAPGGAAAFLNPFENGERP YTFVEIFSQLAWGLGYCGMPHILTRFMAVKSEKELKKSSAIAIVWDILSLTAACFIGIIG RAYLLPTVLGENGASSSESVFIEMINKLFSSHLGLPFIGGIFLCGILAAIMSTADSQLLV TASAASEDLYHQFIKKDADSKEILAVARLTVIVVSVLAFVIAWNPNSSIMGLVSNAWAGL GAAFGPTVVMSLFWRRTNLAGAVAGIVSGGLTVIVWDYIPFAAGQTLGSYTGLYSLAVGF AVSLVMIIIFSLATKAPSKEITDVFDKVTGK >gi|226332878|gb|ACII01000141.1| GENE 8 10296 - 10850 546 184 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1818 NR:ns ## KEGG: EUBREC_1818 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 6 176 6 181 200 185 57.0 7e-46 MHPWLHFKTITKHKLLVMHYCFRAGMYKQGLLHDLSKYAPVEFLVGCKYYQGDRSPNNAE REDTGISKSWLHHKGRNKHHFEYWVDYAPGDEHIINGVPMPRKYIAEMVMDRISASRNYL GDKYDQHQPLDYYLKGKEKLWFIHPKTKRDLEGLLRILNDHGEEVLISYIKNVYLKKDKA LERV >gi|226332878|gb|ACII01000141.1| GENE 9 10828 - 12723 1458 631 aa, chain - ## HITS:1 COG:slr1857 KEGG:ns NR:ns ## COG: slr1857 COG1523 # Protein_GI_number: 16330244 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Synechocystis # 12 618 26 706 707 376 32.0 1e-104 MAESRKMKTEKGLALVPGANPLADGCNFAVEVPEDSRASLILYKKRSAKPYVEIPFTEEN RTGNVYAMYIPDFNLKEYEYNFLINGKVYTDPYAYRILGRERFGAEVGTNPHKVRGGFLK KEVFDWENDKNPAIPYHEMILYKLHVRGYTKANRTITGTKGTFQALEEMIPYWKELGINT IELMPAYEFMESGTCKNSESEKMVSEKHTQGRVNFWGYMYGYYFAPKRSYCATDDPEKEF KTFIKKLHQAGIACIMEMYFPRECNPVTALRALQFWKLYYRVDGFHVLGEGVSAKLLMHD GVLSDTRLMFHDFDESQIRKKKKPEDKCIAQYNPGFLQDMRRFLKSDEDMVSVAAYHIRR NPNIYAVINYMACQDGFTMNDMVTYNYRHNEANQENNHDGSSYNYSWNCGVEGPSRRLQI RQMRERQIRNAFLMVLLSQGVPMIYGGDEFGNSQNGNNNAYCQDNQVGWIDWKALKKNES LFQFVKNAIAFRKEHPILHVPGEMYGVDYQTRGLPDVSLHGERAWYMNSENTSRLLGIMY CGAYAHRADGSEDASIYVAYNFHWEDRIFALPNLAGYRKWKKVIDTSAVKENGFLEQEQE TYSKKLKVTPRTIVVLMAVEEEKKDASVAAL >gi|226332878|gb|ACII01000141.1| GENE 10 12787 - 13542 587 251 aa, chain - ## HITS:1 COG:CAC0222 KEGG:ns NR:ns ## COG: CAC0222 COG0708 # 
Protein_GI_number: 15893514 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Clostridium acetobutylicum # 1 249 1 249 250 390 74.0 1e-108 MKLISWNVNGIRACIGKGFEESFAALDADIFCLQETKCQQGQVKLELPGYYQYWNYANRR GYSGTAVFTKKEPLSVVYGIGIEKHDKEGRVITLEYEKFYLVTVYTPNSQSELRRLEYRM HWEEDFLAYLLKLQESKPVICCGDFNVAHQEIDLKNPKTNRKNAGFTDEERACFGKVLES GFIDTFRYFYPDVEGRYSWWSYRFKAREKNAGWRIDYFITSPQLKDKLKGAEIHSEIMGS DHCPVELQITL >gi|226332878|gb|ACII01000141.1| GENE 11 13761 - 14138 541 125 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1820 NR:ns ## KEGG: EUBREC_1820 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 43 116 23 97 103 76 57.0 2e-13 MSDEFTAGGCNPSDCASCGGGCSSAGGCDPVENHKTITLTMEDDTEVECAILTVFPVDAK EYIALLPLDENGQNESGEVYLYAFARTESGDPMLSNIESDEEYAKAAVAFDTVLQNAKDS MEKID >gi|226332878|gb|ACII01000141.1| GENE 12 14402 - 18625 1794 1407 aa, chain + ## HITS:1 COG:MA1513_1 KEGG:ns NR:ns ## COG: MA1513_1 COG4870 # Protein_GI_number: 20090372 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cysteine protease # Organism: Methanosarcina acetivorans str.C2A # 936 1305 159 508 511 243 42.0 3e-63 MKKKKFPVFILCSLLTLTNIQTPWAFSDEAPDDPVFSDESVDDPVFPVELPDDPAFSDEI TPDEATYAKDTADISASGNSIPITADSGFADPVFQKWISENIDTDRNGLLSDEEISMCTE ISIPSMSVDSLEGIEYFYNLKTLDCSDNELLFLDVSANTVLKSLNCSHNNLLSLDLSSCK KLKDLDISFNNCGSNSSISLPKSFSVEKLNISSVVLDDFDFSRFSRLKELDCSNCNLRAL KPASMPSLEKLICSGNLLDTLNLSRLSGLKYLDCSSNSIQTLKLPDTRSLNTLYCQNNLL TKLNLSGFLSLKKLNCSDNKLLEPDLRGCFPEELICNDNVIRTPMEFYDLSQFASPEDIT IMENGMIDTSYRLTAINKLNPVKIRRRFTSAGTEQYGTQIIYIGYLLENDFRSPALFQQM QQNYPLTDDHMLSREYLKEVSSLSFDEDFPLFSSDLAYFTGLKSLRCTDLPQITELDLSR NPELTELICIGTGIRSLNLTSNKSLESVQLCNNSLSSLNITGLTSLSSLDCRFNSLVSVD TTGCTALKDNLFFPQHTCTVNADQAGKISFDQLGNFHQQLDTPAAILAGDELLYTPGDDF FTLLSPAVTMTEASFSIMDTNTWQILGTCRMTINIESGTPVTPEKPSLKKIKSTHEVTIT WKKAKNVSGYRILRREKNKSEASWKAIATVSSSRTSYTDSTGLIKKSYLYTVQSYTFRNG KKIYSKYNTKGLTGTAKLKRPAVKKPQKDVVINLMWDEVPGADGYRIYRRRKGISKWTKL 
GDIEFSGLYYTDETAVPKTVYDYAVRPYRKLGSKISLGDYSSTGYVCQTAIPKVSLTSVT EQDGGVTVKWKHQTYVTGYFIYRADDQNSKYCKISDTPNGPDSEEYYAYYEPVPFGSVNP LYKVRAYVLIGKKYYYGPYSRALALNPYSDNLNAHDFSEFHMGYQLSNSSSSTGFDRNYN DGGSFSMAAAYLTRGSGPVYEWQAPYENISSASYDSFTPALRVNEIMFIPQRKNALDNQA IKQAIMNYGAVSASYLSVDEYYSNDQISFCLPDDYDAGRAPAGSAPVISTQHAIAIVGWD DDYAKENFPVRPAGNGAFLCKNSWGKDFGDNGYFYISYYDGFLGMQEFSMAFGKISSDNT CNRIYQYDPLGATTAFGYNDELYCANVFPENGQKLARKEQLGSVSFYTYDNDYNYEVYIV SSFKNQNSFQKLPSPAARGNCRYAGYHTVDFSRPVTLKAGTRFAVVIKLWSSTGAKTYFE APLNGTSSRATASDGESYISHHGDIWTDFNTYLPNTNVCIKAFTNGKITENVPAAGSDGK DISCESYSAEELKERGFLLNPAYEDGSSDMVSSVSTGNASSASAPVLPAAYDLRQSGRVS PVKNQGEQGLCWTFSTYASLESFLLFS >gi|226332878|gb|ACII01000141.1| GENE 13 18696 - 19880 1459 394 aa, chain - ## HITS:1 COG:YPO4104 KEGG:ns NR:ns ## COG: YPO4104 COG3007 # Protein_GI_number: 16124212 # Func_class: S Function unknown # Function: Uncharacterized paraquat-inducible protein B # Organism: Yersinia pestis # 1 393 1 397 399 379 47.0 1e-105 MIIEPKVREYICTTAHPQGCAESVRNQADYACKQGMVNGTKKALIIGCSTGYGLASRICA LENCGADTLGIMFERQANGRRTATPGWYNTAEFHRLASGKGVYAKTINGDAFSKEIKEKA IELIKQDLGKVDLVVYSLAAPRRTDAEGKTWSSCLKTTDEPFTEKSLDLRNNEITEKTVE PATEEEVLSTVKVMGGEDWADWIDALKAADVLTENAVTVAYSYIGPELTYPIYYHGTIGT AKQHLQKTMSEINQAHPDVRAVISVNKGLVTQASAAIPVVPLYFAILYKVMKRAGNHENC IQQIARLFTQKLYTPEGFRTDENGFIRMDDYELLPEIQEEVKKCWKAVTTDTVNQYCDID GYWEDFYHMFGFHYSDIDYTQDVDANVEINGIVM >gi|226332878|gb|ACII01000141.1| GENE 14 20172 - 20684 684 170 aa, chain - ## HITS:1 COG:SP0021 KEGG:ns NR:ns ## COG: SP0021 COG0756 # Protein_GI_number: 15899969 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Streptococcus pneumoniae TIGR4 # 39 167 18 145 147 96 42.0 2e-20 MKRIAKFHKVSKERFTADWIDTFAQSQEEAEKVYEAIRLPKRATAGSAGYDFFAPAEFTL KPGETVKIPTGIRVEMQPEWVLKCYPRSGLGFKYRLQLNNTVGIIDSDYFYSDNEGHIFS KITNDSNENKTLTIPADTGFMQGIFVEYGITVDDDATEIRNGGFGSTTAK Prediction of potential genes in microbial genomes Time: Sat May 28 20:50:54 2011 Seq name: gi|226332877|gb|ACII01000142.1| Ruminococcus sp. 
5_1_39B_FAA cont1.142, whole genome shotgun sequence Length of sequence - 16732 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 3, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 19 - 65 6.5 1 1 Op 1 . - CDS 77 - 565 453 ## COG0602 Organic radical activating enzymes 2 1 Op 2 26/0.000 - CDS 572 - 2665 2561 ## COG1185 Polyribonucleotide nucleotidyltransferase (polynucleotide phosphorylase) - Prom 2855 - 2914 8.2 - Term 3017 - 3061 7.1 3 2 Op 1 9/0.000 - CDS 3078 - 3344 354 ## PROTEIN SUPPORTED gi|240146067|ref|ZP_04744668.1| ribosomal protein S15 - Prom 3365 - 3424 4.5 4 2 Op 2 12/0.000 - CDS 3486 - 4382 329 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 5 2 Op 3 1/0.000 - CDS 4391 - 5275 976 ## COG0130 Pseudouridine synthase 6 2 Op 4 4/0.000 - CDS 5275 - 6237 1081 ## COG0618 Exopolyphosphatase-related proteins 7 2 Op 5 32/0.000 - CDS 6252 - 6647 590 ## COG0858 Ribosome-binding factor A - Prom 6718 - 6777 2.4 - Term 6739 - 6799 3.1 8 2 Op 6 10/0.000 - CDS 6816 - 9641 3293 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 9 2 Op 7 8/0.000 - CDS 9645 - 9944 249 ## PROTEIN SUPPORTED gi|126698907|ref|YP_001087804.1| putative ribosomal protein 10 2 Op 8 22/0.000 - CDS 9934 - 10209 205 ## PROTEIN SUPPORTED gi|206900953|ref|YP_002250931.1| ribosomal protein L7Ae family protein 11 2 Op 9 32/0.000 - CDS 10193 - 11413 753 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 12 2 Op 10 . - CDS 11438 - 11902 630 ## COG0779 Uncharacterized protein conserved in bacteria - Prom 11975 - 12034 6.8 - Term 11922 - 11953 4.1 13 3 Op 1 24/0.000 - CDS 12054 - 14300 2557 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 14 3 Op 2 . 
- CDS 14373 - 16376 2111 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit - Prom 16514 - 16573 6.0 Predicted protein(s) >gi|226332877|gb|ACII01000142.1| GENE 1 77 - 565 453 162 aa, chain - ## HITS:1 COG:FN0312 KEGG:ns NR:ns ## COG: FN0312 COG0602 # Protein_GI_number: 19703657 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Fusobacterium nucleatum # 1 159 1 165 168 146 43.0 1e-35 MRYHNITKDDMLNGDGLRVVLWVAGCNHCCPECQNPVTWDPNGGLPFGEAERKEIFAELD KDYVSGITFSGGDPLHPANITEVTAFAKEIRKRYPGKTIWLYTGFLWEEICKEEIVRYLD VCVDGEFEVDKKELSLKWKGSSNQRVIDVPKTLHEGKVVLHK >gi|226332877|gb|ACII01000142.1| GENE 2 572 - 2665 2561 697 aa, chain - ## HITS:1 COG:CAC1808 KEGG:ns NR:ns ## COG: CAC1808 COG1185 # Protein_GI_number: 15895084 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Polyribonucleotide nucleotidyltransferase (polynucleotide phosphorylase) # Organism: Clostridium acetobutylicum # 8 693 8 691 703 703 52.0 0 MYKSYSMELAGRTLTVDINRVAKQANGAALMHYGDTTVLSTATASKEPREGIDFFPLSVE YEEKMYAVGKIPGGFNKREGKASEHAILTSRVIDRPMRPLFPKDYRNDVTLVNMVMSVDP ECNPEIPAMLGSSIATCISDIPFDGPCATTQVGLINGEYIINPTMAQKDVSDLQLTVAST REKVIMIEAGAKEVPEDKMIEAIYKAHEVNQEIIKFIDKIVEECGKPKHSYESCAVPEEL FAAIKEVVPPAEMEVAVFSDDKQTREENIRQVTEKLKEAFADKEEWLAVLGEAVYQYQKK TVRKMILKDHKRPDGRAITQIRPLAAETDIIPRVHGSAMFTRGQTQICTITTLAPLAEAQ KLDGLDEFETSKRYMHHYNFPSYSVGETKPSRGPGRREIGHGALAERALVPVLPSEEEFP YAIRTVSETFESNGSTSQASICASTMSLMAAGVPIKKPVAGISCGLVTGDTDDDYIVLTD IQGLEDFFGDMDFKVAGTHDGITAIQMDIKIHGLTRPIVEEAIRRTKEAREYILTEVMEK CIAAPRTTVGEYAPKIIQIQIDPQKIGDVVGQRGKTINTIIERTGVKIDITDEGAVSICG VDQKSMDEAANMVKIIATDFEAGQIFTGKVVSIKEFGAFVEFAPGKEGMVHISKICKERI NRVEDVLTLGDKVKVICLGKDKMGRISFSMKDVPEEA >gi|226332877|gb|ACII01000142.1| GENE 3 3078 - 3344 354 88 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240146067|ref|ZP_04744668.1| ribosomal protein S15 [Roseburia intestinalis L1-82] # 1 88 1 88 88 140 78 4e-33 
MISKEKKAAIMKEYARTEGDTGSPEVQVAVLTARIQELTEHLQSNHKDHHSRRGLLKMVG QRRGLLAYLKKTDIERYRALIEKLGLRK >gi|226332877|gb|ACII01000142.1| GENE 4 3486 - 4382 329 298 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 17 295 19 312 317 131 30 4e-30 MEYLRIENDFPRISHSAVTLGKFDGIHRGHQKLVEKIIEQKQEGAQAVVLALGTASRTIL TKEERCRILEEMGVDILLECPLTEKIRHMKAENFIKEILIGDLQVSYVAVGEDFRFGYER KGTPAMLKEFGKKYGFHTEVLPKKMDGRRKISSTFVREELNRGNMEKFRFLMGTDFSVEG IVEHGRGMGHKYLLPTTNLIPPVEKLMPPNGVYITVSHFRDRSYQGITNVGHKPTVGGEK FIGVETYLFDCNEDLYGEYCKVDFKKFLRPEQKFSSLEALKAQLERDAVKGQEYFETL >gi|226332877|gb|ACII01000142.1| GENE 5 4391 - 5275 976 294 aa, chain - ## HITS:1 COG:CAC1805 KEGG:ns NR:ns ## COG: CAC1805 COG0130 # Protein_GI_number: 15895081 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Clostridium acetobutylicum # 1 284 1 280 289 231 43.0 1e-60 MDGVIVIRKEKGFTSHDVVAKLRGILHMKKIGHTGTLDPDAEGVLPVALGKATRLVDMIT DKEKTYEAVMRLGVVTDTQDMSGTVLSQTTELSVTEEELCTVVSSFVGDYMQVPPMYSAL KVNGKKLYELAREGKTVERKPRPVHFYEIEILDISFPLVRFRVTCSKGTYIRTLCHDIGE KLGCGAAMESLLRTKVGRFTLDDAITLAQTEEAVQEGTIESKILGIEEILAEYPRVCCTK EGDRLLANGNPLVQALVDAQEKNGWIRMCSSEGSFAGVYQWDEKRNRYFPVKMF >gi|226332877|gb|ACII01000142.1| GENE 6 5275 - 6237 1081 320 aa, chain - ## HITS:1 COG:CAC1804 KEGG:ns NR:ns ## COG: CAC1804 COG0618 # Protein_GI_number: 15895080 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Clostridium acetobutylicum # 14 311 16 317 321 128 30.0 2e-29 MENIAEILEGVKTMGIGGHIRPDGDCVGSCMALYLYVKTYYPEIQADVYLDNPKPVFGHI DCMDEIKTELDGDKEYDLFVTCDVSARDRLAIAGPYFDTAKRTACIDHHISNSGFAQVNH IRGEVSSACEVLYGLFDPEKVVRTIAVPIYTGMVHDTGVFQYSSTSPETMRIAGELMKTG FNFSKIIDESFYQKTYVQNQVMGRVLAESILLLDGKCIIGYLKKRDMEFYGVDGKDLDGI VSQLRLTAGVEVAMFIYEVETQSFKVSLRSNGNVDVSKIAVYFGGGGHMRAAGCDLQGSV YDVINNVTEQICKQFQEQEV >gi|226332877|gb|ACII01000142.1| GENE 7 6252 - 6647 590 131 aa, chain - ## HITS:1 COG:BH2411 
KEGG:ns NR:ns ## COG: BH2411 COG0858 # Protein_GI_number: 15614974 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Bacillus halodurans # 4 116 2 113 116 92 44.0 1e-19 MRKNSIKNTRINGEVQKELSTIIRNEIKDPRIHPMTSVMAVEVAPDLKTCRAYISVLGEK EAKEATIKGLNSAEGYIRRQLARNLNLRNTPEIRFILDESIEYGVNMSKLIDDVTKKDSN SHRQDEEQDEN >gi|226332877|gb|ACII01000142.1| GENE 8 6816 - 9641 3293 941 aa, chain - ## HITS:1 COG:BH2413 KEGG:ns NR:ns ## COG: BH2413 COG0532 # Protein_GI_number: 15614976 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Bacillus halodurans # 241 941 47 729 730 637 53.0 0 MSKLRVHELARELGRQNREVIEFLKSKGVDVRSHMSMVDEPAVSEVKNRFRKDNNNRGKE HIAKLETPKTEEKVVTAKSEGDKTETPKKKKNIIRVFHAQNASDGGKTRKKPVKAEGERT GSPRNAEGRNSDSSRTDGDRPRRNNNDRPGTGNRRPDGQNRQGRPQGEGRSQGGRFGQGR NQGEGNRQGRPQGEGRSQGDNRQGGRFGQGRSQGDGQGRPGGRFQGDGNRQGGRFGQGRP QGDGNRQGGRPQGNGQGRPDGNRSEGRDGRFGGSQGRQGQRQNTRKNDDMAFAPELTKTS KDSKRERDRENKNKKKDFDKTQSGGRRPNQGGFNKNSRIPKALQKPAPQPKQEEKKPEVK EITLPEKMTIRELAEAMKMQPSVIVKKLFMQGMMVTVNHEIDFEKAQEIALEYDIIAEPE EKVDVIEELLKEDEEDEKDMVSRPPVVCVMGHVDHGKTSLLDAIRKTNVTRGEAGGITQH IGASVVEVGGQKITFLDTPGHEAFTAMRMRGANSTDIAVLVVAADDGVMPQTVEAISHAK AAGVEIIVAINKIDKPSANIERVKQELSEYELIPEDWGGSTIFVPVSAHTGEGIDTLLEM ILLTAEVCELKANPNRSARGLVIEAQLDKGKGPVATILVQKGTLHVGDFIAAGACNGKVR AMMDDKGRRIKEAGPSTPVEILGLGDVPNAGEILLAFDSDKEAKNFAGAFVSENKNRLLE ETKGKLSLDNLFDQIQASDLKELPLIVKADVQGSVEAVKQSLTKLSNEEVVVKVIHGGVG AINESDVSLAATSNAIIIGFNVRPDATAKQLAEQEGVDLRLYRVIYQAIEDVEAAMKGML DPIFEEKVIGHAEVRQLFKASGIGTIAGSYILDGIFQRNCKVRISREGEQIFEGELASLK RFKDDVKEVKAGYECGLVFDGFNDVKEEDKVEAYIMVEVPR >gi|226332877|gb|ACII01000142.1| GENE 9 9645 - 9944 249 99 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126698907|ref|YP_001087804.1| putative ribosomal protein [Clostridium difficile 630] # 1 97 1 99 103 100 51 7e-21 MKNNRVLSLIGLATKAGKTVSGEFSTEKSVKTGKGLLVIVAEDASENTKKKFRNMCSFYE VPIFFFSDKESLGRAMGKEYRACLAVQDENFAKAIMKEV 
>gi|226332877|gb|ACII01000142.1| GENE 10 9934 - 10209 205 91 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|206900953|ref|YP_002250931.1| ribosomal protein L7Ae family protein [Dictyoglomus thermophilum H-6-12] # 1 87 1 87 98 83 44 8e-16 MKTKNKIPMRQCTGCREMKSKKEMLRVLKTTEDEIVLDTTGKKNGRGAYLCLSRECLDKA IKNHGLERSLKTAVPDEVYERLKKELEDIEK >gi|226332877|gb|ACII01000142.1| GENE 11 10193 - 11413 753 406 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 4 400 9 433 537 294 39 2e-79 MSRELREALDILEKEKSISKDTLLEAIEQSLIQACKNHFGKADNVHVTIDPETCDFSVYA DRSIVEYVEDPAMEISLADALKITSRAEIGGMIQVPIQSKEFGRIATQNAKNVILQKIRE EERKVLYDEYYGKEKEVVTGIVQRVMGKNVSINLGKADAVLSENEQVKGETFQPTERIKV YILEVKDTPKGPRILVSRTHPGLVKRLFESEVAEVKDGTVEIKSIAREAGSRTKIAVWSN DPDVDAVGACVGMNGARVNAVVEELRGEKIDIINWDENPAILIENALSPAKVIAVMADPD EKTALVVVPDYQLSLAIGKEGQNARLAARLTGFKIDIKSETQAKEAGDFYDYDDDETAED AAEASAEETSAEDSLTEETEAEEVSEEVPVEEEPQAQEAGENEDEE >gi|226332877|gb|ACII01000142.1| GENE 12 11438 - 11902 630 154 aa, chain - ## HITS:1 COG:lin1358 KEGG:ns NR:ns ## COG: lin1358 COG0779 # Protein_GI_number: 16800426 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 9 154 6 155 155 118 42.0 5e-27 MSRREEYEKRAEELLAPIVELNGFELVDVEYVKEAGNWYLRGYIDKPGGITVNDCETVSR AFSDKLDENDFIEDSYIMEISSPGLDRPLKKDKDFERNMGKLVEIRTYRPIEKQKEFCGI LTAYDSNSVTIDEDGTERVFDKKDTALIRLAIEF >gi|226332877|gb|ACII01000142.1| GENE 13 12054 - 14300 2557 748 aa, chain - ## HITS:1 COG:CAC0007 KEGG:ns NR:ns ## COG: CAC0007 COG0188 # Protein_GI_number: 15893305 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Clostridium acetobutylicum # 8 706 7 694 830 537 42.0 1e-152 METKQENVIPTDYAEVMQKSYIDYAMSVIISRALPDVRDGLKPVQRRTLYDMYELGIRYD RPYRKCARIVGDTMGKYHPHGDSSIYEALVVMAQDFKKGKTLVDGHGNFGSIEGDGAAAM 
RYTEARLEKLTQDVFLEDLDKNVVDFMPNFDETEKEPVVLPVRIPNLLVNGADGIAVGMA TSIPPHNLGEVVDAVKAYMKNNDISVKGLMRYLKGPDFPTGGLVVNKDDLLRIYETGTGK LRVRGKVETVKLKGGRQQLVITEIPYTMIGANIGKFLNDIASLVENKKATDIVDISNQSS KEGIRIVLDLKRDADVENLTNMLYKKTKLEDTFGVNMLAVADGRPETLSLKQVIEHHVDF VFDVTTRKYTTLLNKEQEKKEIQEGLIKACDVIDLIIEILRGSKNREQVKKCLVDGVTEG IRFKGKSSEKAAQKLHFSEKQAAAILDMRLYKLIGLEIEALQADYAETMKNIAIYEDILN NYDSMAEVIMKELDQIKKEYGTKRRTVIENAEEVVYEEKKMEEMEVTFLMDRFGYMRTID KSAYERNKEAANAENKYVFNCMNTDKICIFTDNGKMHSIKVADIPLVRFRDKGTPADNLS NYDSAQERMLYVAPLASVKENTLLFVTAASMCKLVSGAEFDVAKRTIVSTKLAEEDSLIF VGSADEMEQVVFQSEGGYFLRFQKQDISAMKKTSIGVRGMKLAEGDKLDHAYLLESRQEY TITYHDKPYVLNRIKLSKRDSKGTKPRI >gi|226332877|gb|ACII01000142.1| GENE 14 14373 - 16376 2111 667 aa, chain - ## HITS:1 COG:BS_gyrB KEGG:ns NR:ns ## COG: BS_gyrB COG0187 # Protein_GI_number: 16077074 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Bacillus subtilis # 29 661 4 631 638 627 52.0 1e-179 MFAMQKIIPGFTVSNPVDRKKGNVTDMAKKNTYDAGSISVLEGLEAVRKRPGMYIGSVSR KGLNHLIYEIVDNAVDEHLAGYCSLIHVILEKDGSCTVTDNGRGIPVDMHEKGVSAERLV FTTLHAGGKFDNSAYKTSGGLHGVGSSVVNALSTYLDIKISRDGYVHHDHYERGIPTIEL EEGLLPKLGKTRQTGTSINFLPDPEIFEKTRFSATEVKSRLHETAYLNPELTIRFEDKRG AEVEDITYHEPDGIIGFIKDLNHSAEVLHDPVYLKGEADGIQVEVAFQFTNEFRENVLGF CNNIYNAEGGTHLTGFKTTFTTVMNTYAREIGVLKDKDPNFTGADIRNGMTAVVSIKHPE PRFEGQTKTKLDNQDASRATGKVVGEQMVLYFDRNLETLKTILSCAEKAAKIRKTEERAK TNLLTKQKYSFDSNGKLANCEKKDPSQCEIFIVEGDSAGGSAKTARNRNYQAILPIRGKI LNVEKATIDKVLANAEIKTMINAFGCGFSEGYGNDFDITKLRYDKIVIMADADVDGAHIS TLLLTLFYRFMPDLITEGHVYIAMPPLYKVMPKKGQEEYLYDDKALERYRRAHKPGSFTL QRYKGLGEMDAEQLWETTLNPETRMMKRVEIEDARLASSVTEILMGSDVPPRKKFIYEHA QDAELDI Prediction of potential genes in microbial genomes Time: Sat May 28 20:50:56 2011 Seq name: gi|226332876|gb|ACII01000143.1| Ruminococcus sp. 
5_1_39B_FAA cont1.143, whole genome shotgun sequence Length of sequence - 6355 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 3285 1728 ## PRU_0857 putative surface protein 2 1 Op 2 . - CDS 3305 - 3484 71 ## gi|253580828|ref|ZP_04858091.1| predicted protein - Prom 3612 - 3671 1.9 - Term 3962 - 4002 6.0 3 2 Tu 1 . - CDS 4074 - 4940 94 ## COG4823 Abortive infection bacteriophage resistance protein - Prom 5052 - 5111 7.6 - Term 5174 - 5209 -1.0 4 3 Op 1 . - CDS 5275 - 5385 56 ## 5 3 Op 2 . - CDS 5461 - 5784 201 ## EUBELI_20456 hypothetical protein - Prom 5858 - 5917 6.1 - Term 5882 - 5936 14.7 6 4 Tu 1 . - CDS 6050 - 6355 282 ## gi|253579950|ref|ZP_04857218.1| predicted protein Predicted protein(s) >gi|226332876|gb|ACII01000143.1| GENE 1 3 - 3285 1728 1094 aa, chain - ## HITS:1 COG:no KEGG:PRU_0857 NR:ns ## KEGG: PRU_0857 # Name: not_defined # Def: putative surface protein # Organism: P.ruminicola # Pathway: not_defined # 75 353 159 431 493 67 32.0 2e-09 MKKKLLAGILALALCSTNMPPQTIFAGEFTSGNPDVVSEEDTPEIFTNEEQEAAGETNED LFVFSSEEAPEFNDTPDEAMAATENAQNGVIDLTEDANVTDGVYTINIAEDYKFTCKKSP ETSNRIVVDGTNTSEQDNINIYLDNVNIKTSAGSALQINNNVKATVTIYLTGINNLTTTN QSSAGLQKDNEAQLIITNASDTTTGILKASSDGSGYGAGIGSGNYGSCKNITINSGFVDA KSKFGAGIGSGHQGSCDNITIKSGSVNAKSMNGAGIGGGHHGSCNNITINSGSVNAKSMH GAGIGSGNYGSCNNITIKSGSVTASSTSGAGIGGGLYGSCNNITISGGSVNAQVGCTPHQ NLDSSGSYDPKSPEVYLCVIPNTKTKSVAIDNKVWEPSNHKAVESTNTNLYAWLTGEDHL ITVEKETKSYIFNSNSGTFSEGKRDVTSDIFEFNGSEFTYDGNTHSPNITTKNNIKGVGN FTVKYFKEDNLQAEINEPKDVGTYIVKITAVEEGDFYNAYSGYLTNDNWKFAINLTPTIT TYKDEYDGNPHPVISIEESTIPPNSIIEYSVDNGQTWYILNSNDNIPTVSTVREAENTKI FIRISNFNSNSNDDTWTSQEYQAIIKRATSTPNFPSIDTISVPWSCKKVKDIDPNSLPQN WNWQANDANNDLQVGNNSATAIYNGSDKGNYEKETFTYTIIRKECEHKNTVGCYYSSPSC TSNGYSGDTYCNDCERTIYYGSTISAYGHDYDNGVITTEPTAETDGIITYTCKRCKHQDT KNLGKLGDGEPYIEGSFQKKGWDAVNGLIKTSKEKDTISITLNGAKVLPATVLSEIKGKD 
ISLNLDMENGFIWKINGTSITAETPADIDLSVTNTEEYIPAALYSLISTNQNDFGFHLGR NGAFDFPAVLSVKADASCAGLMANLFWYDAENGVLQCIQTVTVGGAFERSIPYADFALSK GQDYFIAFGTESLNGRVIHTDGSITDENGAYLRPANTKISSHSIDRNKLTVKLSKGCAGA QGYDFVISKKSNMLQTGKFSQTVSSTGKPQASFRYLAKGTWYVAARSWVLDAQGNKVYGS WTKVKKIKITVVTP >gi|226332876|gb|ACII01000143.1| GENE 2 3305 - 3484 71 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580828|ref|ZP_04858091.1| ## NR: gi|253580828|ref|ZP_04858091.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 59 16 74 74 114 100.0 1e-24 MQSRPINATGMYGTILLHEGAGGYVLVAVISFTVAVVITALCIKFRKREQKSQDAEDHV >gi|226332876|gb|ACII01000143.1| GENE 3 4074 - 4940 94 288 aa, chain - ## HITS:1 COG:lin2373 KEGG:ns NR:ns ## COG: lin2373 COG4823 # Protein_GI_number: 16801436 # Func_class: V Defense mechanisms # Function: Abortive infection bacteriophage resistance protein # Organism: Listeria innocua # 21 286 35 288 298 100 27.0 3e-21 MQSYVDAGMEITSYEDVEKVLKTIGFYRLRGYSFHLYDNTTKKYVAGTKFKDIIKLYQFD QELSALIFSMISKIEVALRVRLVESLLVHGEPLVLQDSSIFKEKKRYWQNMSTVASEIAR SNDVFIKHNFDNHDGEVPVWAAVEVLSFGTLSKIIKNLKTGTGSSYSILALNYQYKSKKG NLVKPSQKMLASWIQSVSVLRNMCAHNSRIYNRTIHTTPEILDVDKVVPSPAHNGLYQIL LAMKYLRSSDEEWVTFVAAFDKLIQNNIDVVSLTAMNLPADWKAHLSV >gi|226332876|gb|ACII01000143.1| GENE 4 5275 - 5385 56 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNGTRNPSLSIVKKLAQGLGMQLKLEFVLMPTKNKM >gi|226332876|gb|ACII01000143.1| GENE 5 5461 - 5784 201 107 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20456 NR:ns ## KEGG: EUBELI_20456 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 103 371 472 477 110 54.0 2e-23 MIDELDSGIYEYLLGECLEVMQDKAKGQLIFTSHNLRPLEILENDSLLYTTVNPENCYIK SSYIKNTQNTRLSYLRTIKLGGQKEKLYNETNIYEMELAMRRARRQY >gi|226332876|gb|ACII01000143.1| GENE 6 6050 - 6355 282 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579950|ref|ZP_04857218.1| ## NR: gi|253579950|ref|ZP_04857218.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 97 1082 1178 1185 164 90.0 2e-39 EPKIKNITVKGNTVTVTYTKCKNATGYEILLGNKYKTSAGEKYPVKKYLKRTEGKNTVTV TFTNVKKGTWYVTIRAWNQTSKDKSRAYSPYSTMKKFKTKK Prediction of potential genes in microbial genomes Time: Sat May 28 20:51:24 2011 Seq name: gi|226332875|gb|ACII01000144.1| Ruminococcus sp. 5_1_39B_FAA cont1.144, whole genome shotgun sequence Length of sequence - 3547 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 3042 1506 ## gi|253580832|ref|ZP_04858095.1| predicted protein 2 1 Op 2 . - CDS 3064 - 3384 251 ## gi|253580833|ref|ZP_04858096.1| predicted protein - Prom 3450 - 3509 3.7 Predicted protein(s) >gi|226332875|gb|ACII01000144.1| GENE 1 3 - 3042 1506 1013 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580832|ref|ZP_04858095.1| ## NR: gi|253580832|ref|ZP_04858095.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 61 1013 1 953 954 1722 99.0 0 MRKKLLAGMLALALCSTNMPLQTIFAEEFTSGNPDVVSEEETPEIFTNEEQEAVGETDEE LSVFSSEEVPEFNDAPDEAMAAAENEQAGEIDLADNDKVMNGVYTISSAGDYKFTCSRET GNRIVVDGKKISAKDSINIYLNKVNINTSAGPALRINLNVEATVTIHLTGTNSLITKNNY YAGLQKDNKARLIIKTNNSDATAGILNARSIDGDSAGIGGGFYGSVSCSNIIIDSCSVIA SSKYGAGIGGSKQRAGSVSDIIINSSSVTASSTDGAGIGGGGYGASSVSGITINSSSVTA SSTNGAGIGGGGYGADSVSDITINSSSVTASSTNGAGIGSAGGTCSNIGISGGSVKAYSD RMPGINCTPHNGNSTNVYCCIIKNEYFLPVTIDSESWKPSYHIFPDSTKDGNLYVWLTEK ENNDAYDVTVGTEKRQYSFDQAKNQFVRIQTTPTADQFDYTQPNFTYTKDTHVDISKYIK WKDDVTGHGKITKVTYFKKGDKTPLADSPTDAGTYTFKIDVNEGDYYNSVDSISAPEWEF VISKAQAPSSKPTDTDPTIYVSWLCKKVEDVKGLFNDEWKWSDSDISKKLPVGEEVSATA VYNGTDADNYVNTSVVFKITRKACTHPHTAERYYSSPSCTSSGYSGDTYCTDCNETLSYG YTISAYGHDYDNGVITTEPTTETDGIITYTCKRCKHQDTKNLGKLGDGEPYIEGSFQKKS WDTVNDLIKTSKEKDTISIIMNGARTLPASVLSGIKGKDISLNLDMENGFIWKINGTSIT AETPADIDLSVTNTAEHIPAALYSLISTNQNDFGFHLGRSGAFDFPAVLSVKADVSCAGL MANLFWYDVENGVLQCIQSVTVGGAFERSIPYAEFILSKGQDYFIAFGTESLNGRVIHTD GSITDENGAYLRPADAKISSHSIDRNKLTVKLSKGCAGAQGYDFVISKKSNMLQTGKFSK 
KVSSTGKPQASFRYLAKGTWYVAARSWVLDEQGNKVYGSWTKIKKIKITVVTP >gi|226332875|gb|ACII01000144.1| GENE 2 3064 - 3384 251 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580833|ref|ZP_04858096.1| ## NR: gi|253580833|ref|ZP_04858096.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 106 4 109 109 202 100.0 8e-51 MYDTEKRIELVKKRMHEYHRRQERRTVCRLSVLCTLLFLSLVGAMGIMQNKPINVTGMYG TILLHEDAGGYVLVAVISFTVAVVITALCIKFRKKGQKSQDAENHV Prediction of potential genes in microbial genomes Time: Sat May 28 20:52:07 2011 Seq name: gi|226332874|gb|ACII01000145.1| Ruminococcus sp. 5_1_39B_FAA cont1.145, whole genome shotgun sequence Length of sequence - 2215 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 333 61 ## gi|253577920|ref|ZP_04855192.1| conserved hypothetical protein 2 1 Op 2 . - CDS 330 - 956 196 ## gi|253580835|ref|ZP_04858098.1| conserved hypothetical protein 3 1 Op 3 . - CDS 940 - 1308 287 ## gi|253580836|ref|ZP_04858099.1| conserved hypothetical protein 4 1 Op 4 . - CDS 1295 - 1843 343 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - TRNA 2000 - 2074 62.8 # Glu CTC 0 0 Predicted protein(s) >gi|226332874|gb|ACII01000145.1| GENE 1 3 - 333 61 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253577920|ref|ZP_04855192.1| ## NR: gi|253577920|ref|ZP_04855192.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 110 16 125 125 177 90.0 3e-43 MTYIENVFLCMASPLLIAALCMGKRQRKFFLFCFAGMGVCLLSAYINTFFAALYKADTFA ATTEIAPVVEEVMKLLPLLFYLLIFEPKAEQIKNAAVITALSFATFENVC >gi|226332874|gb|ACII01000145.1| GENE 2 330 - 956 196 208 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580835|ref|ZP_04858098.1| ## NR: gi|253580835|ref|ZP_04858098.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 15 208 1 194 194 342 99.0 1e-92 MELISNLFQFAVTLLGFCMSGIRYLKGRKQTYFLLTCFYGCFALGSLYWTLYLLLFSETP QVFYVSEFGWIASVIFLYLLQYTLSSAEERDFSTRKSLIAPLIGVPLCVFYCTFGDVLSN LLWCGMMIVVSYHSIRGLAYAQIQTGTACKMRYFHIGVLCYVAVEYVLWISGCLWPGYSI SAPYCWLDLLLTGCLFALLPATGKAVQV >gi|226332874|gb|ACII01000145.1| GENE 3 940 - 1308 287 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580836|ref|ZP_04858099.1| ## NR: gi|253580836|ref|ZP_04858099.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 122 1 122 122 204 100.0 1e-51 MLKDEERIAEVKRRIAKKEQQQRLRRRRIISAVCIAACFAVIVGVSFAMPGIVGQIEPGT SSGFETAATILGGSTALGYIVIGLLAFVLGACVTILCFRIHQLNKEEQTEKQKEDNGDGA DQ >gi|226332874|gb|ACII01000145.1| GENE 4 1295 - 1843 343 182 aa, chain - ## HITS:1 COG:Cgl1096 KEGG:ns NR:ns ## COG: Cgl1096 COG1595 # Protein_GI_number: 19552346 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Corynebacterium glutamicum # 42 181 63 205 213 68 28.0 4e-12 MVTDEELYGQYLCGDETGLELLIKKYGDPLTLYIDGYLHDVHEAEDLMMETFSWLFMKKP RIRNGCFKAYLYKAARHMALRHKSRRRIIFSLDNLTREAEAQTLVEEVVRTKERNQILHL CMDELNPDYREALYLTYFEGMSYQQAAEVMGKSVKQITNMVYRGKERLRGLLKREGITNA ER Prediction of potential genes in microbial genomes Time: Sat May 28 20:52:30 2011 Seq name: gi|226332873|gb|ACII01000146.1| Ruminococcus sp. 5_1_39B_FAA cont1.146, whole genome shotgun sequence Length of sequence - 9828 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 35 - 85 3.1 1 1 Op 1 11/0.000 - CDS 117 - 2285 2498 ## COG0855 Polyphosphate kinase 2 1 Op 2 . - CDS 2300 - 3838 1650 ## COG0248 Exopolyphosphatase 3 1 Op 3 . - CDS 3875 - 5182 1465 ## COG2607 Predicted ATPase (AAA+ superfamily) - Prom 5207 - 5266 6.6 - Term 5271 - 5306 0.1 4 2 Op 1 . - CDS 5368 - 6498 1246 ## COG0077 Prephenate dehydratase 5 2 Op 2 . 
- CDS 6520 - 7803 1075 ## COG0285 Folylpolyglutamate synthase - Prom 7887 - 7946 5.6 + Prom 7825 - 7884 3.9 6 3 Op 1 . + CDS 7969 - 8355 217 ## Cphy_0824 hypothetical protein 7 3 Op 2 19/0.000 + CDS 8434 - 9357 1003 ## COG0540 Aspartate carbamoyltransferase, catalytic chain 8 3 Op 3 . + CDS 9351 - 9788 641 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit Predicted protein(s) >gi|226332873|gb|ACII01000146.1| GENE 1 117 - 2285 2498 722 aa, chain - ## HITS:1 COG:BH1392 KEGG:ns NR:ns ## COG: BH1392 COG0855 # Protein_GI_number: 15613955 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Bacillus halodurans # 11 690 19 692 705 677 52.0 0 MDFEKFEKKEYFVNRELSWLKFDERVLSEARDKNLPLFDRLKFLSITASNLDEFFMVRVA SLKDQVHAGYHKTDIAGMTAKEQLKEISVRTHELVHVQYNTLNRSLIPALEKAGMHLVAA HENLTEAQSVFVDRYFEDNVYPVLTPMAMDSSRPFPLIRNKTLNIGALISKKEKSDKLSK KDKTGELLFATVQVPSVLPRVVQIPSKKDGDTTVILLEEIIERNIDKLFLSYDVICAHPY RIMRNADLTIDEDEAEDLLVEIQRQLKKRQWGEVIRLEVEDKMDERLLKILKTEFDIKEA DIFEINGPLDLTMLMKVYGADGFDAYKTPRYQPAPVPEFQNEKDIFQVIREGDVFLHHPY MSFDPVVNFVRQAAKDPDVLAIKQTLYRVSGNSPIIAALAQAAENGKQVSVLVELKARFD EENNIVWAKKLEKAGCHVIYGLVGLKTHSKITLVVRREETGIRRYVHLATGNYNDSTAKL YTDCGIFTCDERFGEDATAVFNMLSGYSEPKSWNKLIVAPIWMKDRFLSLIEREAENAKK GLPALIRAKMNSLCDPKIIAGLYYASSCGVQVELLVRGICCLKVGVPGISENIHVRSIVG EFLEHSRIFYFENGGNPEIYMGSADWMPRNLDRRVEIVFPVEDEKIKKELEHVLDLEFKD NVKAHILQPDGTYVKPDKRGKAQINSQMEFCIEATEKAALHKHEEKKSRVFIPAEPAEDI EE >gi|226332873|gb|ACII01000146.1| GENE 2 2300 - 3838 1650 512 aa, chain - ## HITS:1 COG:all3552 KEGG:ns NR:ns ## COG: all3552 COG0248 # Protein_GI_number: 17231044 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Nostoc sp. 
PCC 7120 # 7 505 24 538 550 95 22.0 2e-19 MAVMTFAAIDIGSYEVSMKIFEMSKRIGFRELNDVRYSLEIGKGVYSDGKIDSEMLNVLC EVLNDFKRLMQDFGVEEYRACGTSAFRELVNPLLIIEQIYQRTGMKIEILSSAEQHFLGY KSIAAIEKGFKKMIQKGTAILDVGGGSLQVSLFDKDALVTTQGLKMGSLRIRQRLQELEK TTIHYDKLVEEFIRNDLMSFQRLYLKDKEIRNVILMGDFITDMIFQEEMEDRIITREEFM KRYEDTVGKSVDLLAQEMEIDPEYASLVVPTMVVCRNFIDIFNAESLWAPGVSLLDGIAY DFAEKKKFLKSVHNFENDILVTSKNIAKRYSSSKSHIQGTMNLCLNIFDSMKKVHGMGAR ERLLLQIAALLHDCGKYISMEKVSECSYQIIMSTDIIGLSFMERQMIACAVRFNNSAFVY YDELYSMGIAIDRESYIKVAKMTAILRLANAMDRSHYQKVKGIKAVLKDRELQMIVDSAQ DISLELGLLKDKEAFFEEVFGIRLVLRRKGRA >gi|226332873|gb|ACII01000146.1| GENE 3 3875 - 5182 1465 435 aa, chain - ## HITS:1 COG:CAC3262 KEGG:ns NR:ns ## COG: CAC3262 COG2607 # Protein_GI_number: 15896507 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Clostridium acetobutylicum # 4 429 14 422 426 261 36.0 3e-69 MARLNECILYRNFEHGEILDKMAELMNAWEQKAPDLKEKEGLFFECANGLVETAGAYGFS GNLWHCYLTFLLVNNENAFSTACEIRGAVNGSINELALNDFGVFKELYDFDLTVLDEAFG ISCCKVLGDYTNTGSNSKMFNSRIRDRICDLSKTLAAAESTEEFMKDMVQFYKDFGVGKL GLHKAFRVGHDENDNVEIQPITRIAHVKLEDLVGYEIPKQKLIENTEAFVRGRKANNCLL FGDAGTGKSSSIKGILNRYYDEGLRIIEVYKHQFQDLNDVIAQIKNRNYKFIIYMDDLSF EEFEIEYKYLKAVIEGGLEKKPDNILIYATSNRRHLVREKSSDKLEIMDDDDLHSSDTVQ EKLSLVYRFGVRIYYGAPSKKEFQTIVKALAERNGITMPEDELLLEANKWELSHGGLTGR TAQQFIDHLLGVEKE >gi|226332873|gb|ACII01000146.1| GENE 4 5368 - 6498 1246 376 aa, chain - ## HITS:1 COG:DR1147 KEGG:ns NR:ns ## COG: DR1147 COG0077 # Protein_GI_number: 15806167 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Deinococcus radiodurans # 112 376 23 285 293 187 40.0 2e-47 MIDLQECRNEIDKIDSDIIRLFEQRMKVCEDVAEYKIRTGKKVLDPERERQKLEVLRGKA HGEFNQLGAQELFQQIMAISRKRQYQLLTEHGIEDDEKLEMVDALPLKDVRVVFQGVEGA YSYAAMREYFQDDIESFHVKTWRDAMEAVVEGRADYAVLPIENTTAGIVADIYDLLTEYE LSIVGEQIIRPEHVLLGLPDAELEDIRQVCSHPQALSQCGKYLESHPDWKKKEMENTAGS AKKIKEDNDKTQAAIASRQAGELYGLKILAENICYNGQNATRFVIVSKKPIYVKDAHKIS 
IFFELHHESGTLYNMLSHIIYNGLNMTKIESRPITGKNWQYRFFVDFEGNLKDSAVKNAL RGIEAEADRMRILGNY >gi|226332873|gb|ACII01000146.1| GENE 5 6520 - 7803 1075 427 aa, chain - ## HITS:1 COG:CAC2398 KEGG:ns NR:ns ## COG: CAC2398 COG0285 # Protein_GI_number: 15895664 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Clostridium acetobutylicum # 1 425 1 430 431 273 39.0 5e-73 MNYQESREYIDKITREIPSVLGLEHMRELMKRLGNPQDDLKYVHVAGTNGKGSVIAFLYS ALSGARYRIGRYVSPTLYSYRGRMEVSGSRISREDFAAYITQVSDVIAAMTKDGYPHPTA FEIETAVAFLFFKKENCDLVLLEVGMGGNLDATNIIKNTLLAVLVSISMDHMSFLGNTLA QIAEKKAGIIKDGCRVVTARQKPEAMQVIERISREHGTKCTIADASEAEVLKESCLGQTI RYRGEEYEIPLAGVYQKENAVVALNALKVLDELGFPTAAEQKKEGLRTVNWNGRFTVICK KPLFVVDGAHNPAAADMMAASIEHYFKGKRIIYIMGVFADKDYRSVIQKTAHFADRILTI QTPDNIRALPAGELAKTVSEYNPNVQAMDTIKDAVEEAFSLAGEQDVIIAFGSLSFIGEM TDIVENL >gi|226332873|gb|ACII01000146.1| GENE 6 7969 - 8355 217 128 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0824 NR:ns ## KEGG: Cphy_0824 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 5 128 24 147 147 133 51.0 2e-30 MKKAFILCGLTGWCMEILWTGLHSILSGELTMTGKTSLLMFPIYGCGAIIRPLSGKLSAV PLFVRGCIYTVGFFFVEFISGALLRCFHMCPWDYSNTPLNYRGLIRPDYAPLWFGAGLFF EKILGKLS >gi|226332873|gb|ACII01000146.1| GENE 7 8434 - 9357 1003 307 aa, chain + ## HITS:1 COG:CAC2654 KEGG:ns NR:ns ## COG: CAC2654 COG0540 # Protein_GI_number: 15895912 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Clostridium acetobutylicum # 2 303 5 306 307 431 69.0 1e-121 MRHLMSPLDLSVDELNTLLDLASDIEKHPDKYAHVCDDKRLATLFYEPSTRTRLSFEAAM MNLGGKVLGFSSADSSSASKGESVADTIRVVSCYADICAMRHPKEGAPLVASMASSIPVI NAGDGGHQHPTQTLTDLLTIRSLKGRLDNLTIGLCGDLKFGRTVHSLIEALVRYTNVKFI LISPEELRIPSYIREDVLKKNNIEFQEVERLEDALPDLDILYMTRVQKERFFNEEDYVRM KDFYILDKQKMELAKKDMYILHPLPRVNEISTEVDADPRAAYFKQAQYGVYVRMALILTL LEVKLPC >gi|226332873|gb|ACII01000146.1| GENE 8 9351 - 9788 641 145 aa, chain + ## HITS:1 COG:CAC2653 KEGG:ns 
NR:ns ## COG: CAC2653 COG1781 # Protein_GI_number: 15895911 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: Clostridium acetobutylicum # 1 143 1 142 146 132 51.0 2e-31 MLNVGALREGYVLDHIKAGKAMTIYHDLKLDKLDCTVAIIKNARSNKMGKKDIIKVECPV ENLDLDILGFIDHNITVNVIKDEKIVAKKELKLPNQVVNVIKCKNPRCITSIEQGLDQIF VLADKEKEVYRCKYCEEKYTGKRNK Prediction of potential genes in microbial genomes Time: Sat May 28 20:52:37 2011 Seq name: gi|226332872|gb|ACII01000147.1| Ruminococcus sp. 5_1_39B_FAA cont1.147, whole genome shotgun sequence Length of sequence - 14358 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 648 - 682 3.2 1 1 Tu 1 . - CDS 759 - 2426 652 ## PROTEIN SUPPORTED gi|39938628|ref|NP_950394.1| ribosomal protein L13 2 2 Op 1 9/0.000 - CDS 2545 - 3327 806 ## COG0327 Uncharacterized conserved protein 3 2 Op 2 5/0.000 - CDS 3314 - 4021 594 ## COG2384 Predicted SAM-dependent methyltransferase 4 2 Op 3 31/0.000 - CDS 4032 - 5159 1646 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) 5 2 Op 4 3/0.000 - CDS 5199 - 6995 1631 ## COG0358 DNA primase (bacterial type) 6 2 Op 5 . - CDS 7019 - 7858 779 ## COG0232 dGTP triphosphohydrolase + TRNA 8224 - 8308 52.6 # Leu AAG 0 0 + Prom 8235 - 8294 80.4 7 3 Tu 1 . + CDS 8461 - 9714 729 ## COG3409 Putative peptidoglycan-binding domain-containing protein 8 4 Tu 1 . - CDS 9979 - 10179 284 ## COG0163 3-polyprenyl-4-hydroxybenzoate decarboxylase - Prom 10293 - 10352 4.7 - Term 10380 - 10432 10.1 9 5 Tu 1 . - CDS 10483 - 12189 1214 ## EUBREC_0955 hypothetical protein - Prom 12411 - 12470 9.3 - Term 12444 - 12506 13.2 10 6 Tu 1 . - CDS 12527 - 13219 660 ## gi|253580855|ref|ZP_04858118.1| predicted protein - Prom 13302 - 13361 7.8 - Term 13411 - 13460 9.3 11 7 Tu 1 . 
- CDS 13470 - 14201 892 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump - Prom 14259 - 14318 5.4 Predicted protein(s) >gi|226332872|gb|ACII01000147.1| GENE 1 759 - 2426 652 555 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|39938628|ref|NP_950394.1| ribosomal protein L13 [Onion yellows phytoplasma OY-M] # 43 553 36 544 546 255 31 1e-67 MGQKMYNVTIQGITKQYPEKIPYSEIVKEYEGSTDAPVMLVIVNGRLRELHKHLKADCEI EFVTSKDPIGHKTYCRSASLILLKAIYDVAGQENIDSVMIHYSVGSGYYFTMKGPMALDQ EFIDKVKAQMHRIVDENLPIVKRSVSTSEAVALFHKHHMYDKEKLFNYRRSSAVNLYSIG SFEDYFYGFMANHTGYIKTFDLFLYEGGFVLQLPTQNEPDRIPEFKPREKIFRVQKESQE WGDKLDIATVGDLNEKVTRGGIQDILLIQEAMQEAKISEIASEIAAAGNKKFVMIAGPSS SGKTTFSHRLSIQLAAHGMKPHPIAVDNYFIDRHLTPVDEFGEKNFECLEAIDVEQFNKD MLELLEGKRVEMPVFNFKTGTREYKGDFLQLDKDDILVIEGIHGLNDRLSYALPKESKFK IYLSALTQLNIDEHNRIPTTDGRLIRRIVRDARTRGTSAKETIARWPSVRRGEEQNIFPF QEDADVMFNSALIYELACLKVYAEPLLFGIAKDEPEYTEAKRLLKFFEYFVPVPSEAVPN NSILREFIGGSCFNV >gi|226332872|gb|ACII01000147.1| GENE 2 2545 - 3327 806 260 aa, chain - ## HITS:1 COG:FN1316 KEGG:ns NR:ns ## COG: FN1316 COG0327 # Protein_GI_number: 19704651 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 239 1 239 258 159 37.0 5e-39 MKAYELTSWLEKKYPSDAAEDWDNVGLLAGDDTNEISHVFLALDLTEETLAEAIEDGADM IITHHPMIFSGIKKINNHSFTGRKILTLIQKGIVYYAMHTNYDVLGMADLSADYTKMHDT TVLSVTEEREGEVQGFGRVGKLPREMTLREYAQLVKESLKLNDVKVYGNLDSMVKCAAVC TGSGKSMIKDVIKAGADVYVTGDIDHHTGIDTVAQGLALIDAGHYGTEYIFMDAMKKELT QAFPELKISCAEVKSPYTIL >gi|226332872|gb|ACII01000147.1| GENE 3 3314 - 4021 594 235 aa, chain - ## HITS:1 COG:CAC1302 KEGG:ns NR:ns ## COG: CAC1302 COG2384 # Protein_GI_number: 15894584 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Clostridium acetobutylicum # 1 218 1 215 229 128 33.0 8e-30 MQLSLRLSAIAGLVTRGNRLVDVGCDHGYLPVSLYLDGKIPGAIAMDVRKGPLSRAQEHI SQYGLDAYIETRLSDGLEALKPGEGDTLVIAGMGGPLMERILTDGAEVRESFREMILQPQ 
SDIPHFRRFVRKIGWQITEEEMVLEDGKFYPMMKVIHGEKMHISEDTPYTLDEWFGGMLL ERKHPVLREYLERELRIRNEILDRLKNAPNAEKRAGEIEEEKQAVIAALEEYEGI >gi|226332872|gb|ACII01000147.1| GENE 4 4032 - 5159 1646 375 aa, chain - ## HITS:1 COG:BH1376 KEGG:ns NR:ns ## COG: BH1376 COG0568 # Protein_GI_number: 15613939 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Bacillus halodurans # 2 375 13 372 372 444 70.0 1e-124 MEMNMGKFSEKLVELLELAKKKKNVLEYQEINDFFKDQPLGVEQMEKVFDFLEASGVDVL RITDSNADDMILDDDDADIDKLDEEEIELDKIDLSVPEGVSIEDPVRMYLKEIGKVPLLS AEEEIELAKRMENGDEAAKKRLAEANLRLVVSIAKRYVGRGMLFLDLIQEGNLGLIKAVE KFDYRKGYKFSTYATWWIRQAITRAIADQARTIRIPVHMVETINKLIRVSRQLLQELGRE PTPEEIAKEMDMSVERVREILKISQEPVSLETPIGEEEDSHLGDFIQDDNVPVPADAAAF TLLKEQLVEVLSTLTDREQKVLRLRFGLDDGRARTLEEVGKEFNVTRERIRQIEAKALRK LRHPSRSRKLKDYLD >gi|226332872|gb|ACII01000147.1| GENE 5 5199 - 6995 1631 598 aa, chain - ## HITS:1 COG:CAC1299 KEGG:ns NR:ns ## COG: CAC1299 COG0358 # Protein_GI_number: 15894581 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Clostridium acetobutylicum # 4 575 7 572 596 353 37.0 6e-97 MRYSDDIIEEVRMKNDIVDVVSQYVKLNKRGSTYFGLCPFHNEKTPSFSVTPAKQMYYCF GCGAGGNVFNFVMEYENYTFGEALKHLADRAGVQLPKIEYSGEAKKKAERRAALLEINKL AAGYFYYQLRRENGRQAHEYLTGRGLSEETIRKFGLGYSDKYSDDLYKYLKSKNYSDELL RDSGLFNVDERRGMYDKFWNRVIFPIMDVNNRVIGFGGRVMGDGKPKYLNSPETTIFDKS RNLYGLNVARTTRKNYLILCEGYMDVISMHQAGFTNAVASLGTALTSGHASLLKRYTQEV LLLYDSDDAGVRAALRAIPILREAGVSSRVVNLKPHKDPDEFIKALGAEEFEKRLEQAMD SFMFRVHMAEREFPMEEPQGQNRFFERCAQMLLELSDELERNLYIEAIVKDYRSSGISVE NLKKRVGALAMKGTPAEQRIQPKPVGAGQQKKKESAAEKAQKLMLTWLVTYPGIFDTVEK YIQPSDFVVPLYRQVAEMLYQQHRDGDVNPARLMNAFIDSEEQREVSSLFNATIHLETPE EQNRAFSDAVIRIKDESLKERNRTWDPTDIQGLQELVKAKKELEELGRKRQQLHISFE >gi|226332872|gb|ACII01000147.1| GENE 6 7019 - 7858 779 279 aa, chain - ## HITS:1 COG:AGc3147 KEGG:ns NR:ns ## COG: AGc3147 COG0232 # Protein_GI_number: 15889019 # Func_class: F Nucleotide transport and metabolism # Function: dGTP 
triphosphohydrolase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 271 78 419 426 177 33.0 2e-44 MKDKTQVFVAPQGDHYRTRLTHTLEVSQIARTIAKALRLNEDLVEAIALGHDLGHTPFGH AGERALDAVNPDGFAHYKQSVRVAQILEKNGEGLNLTWEVRDGILNHRTSGNPSTLEGQV VRLSDKIAYIHHDMDDAQRAGIISEDDIPVTLRMLLGYTTRERLNTFVHDVIENSLEQDT IKMSAEIYEAMMDLRALMFQNVYENPAAHKEEEKVVKMLTELYEYYVEHPEAMSTEYREL IVRGEKKEQAVCDYLSGMTDQYSIRKFREIYIPKAWEVY >gi|226332872|gb|ACII01000147.1| GENE 7 8461 - 9714 729 417 aa, chain + ## HITS:1 COG:CAC3244 KEGG:ns NR:ns ## COG: CAC3244 COG3409 # Protein_GI_number: 15896489 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative peptidoglycan-binding domain-containing protein # Organism: Clostridium acetobutylicum # 11 417 4 408 437 372 50.0 1e-102 MQMQQENIDTGQYQVTVSDRLNNRPIENARVRISYTGAPDSTIEEVATDSSGRTPVIELK TPPLEYSMEPVEQQPYSEYTIQIEAEGFEPKEIAGSQVLADTLSRQPTTLNVMESSETFQ RIVIPPHTLFYEYPPKIEEAEIKPINENGEIVLSKVVVPEYIVVHDGPVNDSAAGNYYVR YKDYIKNVASSEIYATWPDDTIRANILAIMSFTLNRVYTEWYRNKGYDFTITSSTAYDHK WIYGRNIFASIDRIVDELFENYLSRPNVRQPILTQYCDGKQVQCRNRGWMTQWGSKALGD QGYSAIEILRAFYGNDMYINVAEAISGIPSSWPGYDLDIGASGNKVRQIQEQLNTIAEAY PAVPVVTADGIYGPETQNSVRIFQSIFGLDQTGIVDYPTWYKIQEIYVAVSRIAELR >gi|226332872|gb|ACII01000147.1| GENE 8 9979 - 10179 284 66 aa, chain - ## HITS:1 COG:NMA0507 KEGG:ns NR:ns ## COG: NMA0507 COG0163 # Protein_GI_number: 15793506 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase # Organism: Neisseria meningitidis Z2491 # 1 65 1 63 189 62 52.0 2e-10 MRKRLIIGMSGASGAPLTIELLKQLQQYKEIIETHLIVTKGAEMTLEQETDYTLEQLYAL ADEVHD >gi|226332872|gb|ACII01000147.1| GENE 9 10483 - 12189 1214 568 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0955 NR:ns ## KEGG: EUBREC_0955 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 6 565 5 564 564 669 59.0 0 MRNRKKLPIGIDSFEKIIRHNFYYVDKTEMITELLHNWGEVNLFTRPRRFGKSLNMNMLQ SFLEIGCDKSLFNGLKVSREKELCEEYMGKFPVISLTLKNVEGLNFESARKALKNTLGME 
AWRLSALAESSRLTEEEKNSYKALTVVDDHGDFKMSDATMEKALLILTVLLEKHYGKKAV LLIDEYDVPLDKAFQYGYYDEMVSLIRNMFGNVLKTNSSLFFAVLTGCLRIAKESIFTGL NNFNVFSITSVQFDEFFGFTDDEVAEMLKYFGLSDYHETIREWYDGYQFGKKAVYCPWDV ISYCRNLCADPDAIPEDFWSNTSSNSIVSRFIDKANKQTRDEIENLISGETVIKEIKQEL TYNELDKSIENLWSILFTTGYLTQRERIDSRKLRLAIPNREIKELFELQIREWFQEKSSE DIKKLDKLCMAFPDGDAETIEDLFNDYLWNTISIRDTAVKGRKENFYHGVLLGLLSHMEN WAVWSNIESGEGYCDILLEVPENRVGVVIEMKYAQEDRMEAACTEALEQIEQRQYAARLK SDGMKNIVNYGIACYRKHCKVKIGKENS >gi|226332872|gb|ACII01000147.1| GENE 10 12527 - 13219 660 230 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580855|ref|ZP_04858118.1| ## NR: gi|253580855|ref|ZP_04858118.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 230 1 230 230 407 100.0 1e-112 MSMKNRFAGKCRSYLVFFLLAAVAVLLFFPRVASAASKPTAKQLKVTYRGFEYDNDTAKL YLKLSLKNTSSYTITKVKMGYEIPIMEDGTITQTFSVTINPGKTVNKTVYIGKMTQQPYK APKVKCLSFWYKSATPKLNQLKVSYKGYEYNPNTGELYITARMQNTSSYTITKVTMYFEI PLDETATPTKTYNVNIPAGKTKNYRFKIGMMTDAPDGKVLVKCKKFWYKK >gi|226332872|gb|ACII01000147.1| GENE 11 13470 - 14201 892 243 aa, chain - ## HITS:1 COG:CAC1482 KEGG:ns NR:ns ## COG: CAC1482 COG1811 # Protein_GI_number: 15894761 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Clostridium acetobutylicum # 1 242 1 242 242 275 67.0 7e-74 MPIGIIINALSIVIGGILGAFVGHKLSPKFKEDITLVFGVCSMGMGISTIGLMENMPAVI FSVVIGTGIGLAIHLGERINAGAGVMQRVIGKFIKNSNSELSEDEFMNTLVTIIVLFCAS GTGIYGSIVSGMSGDHSVLISKSILDLFTAAIFACSLGMVVCMIAIPQVIIFLILFFAAT AIFPLTTPGMINDFKACGGFLMLATGFRMVKLKNFPTADMIPAMVLIMPMSCFWVNYILP LVS Prediction of potential genes in microbial genomes Time: Sat May 28 20:53:01 2011 Seq name: gi|226332871|gb|ACII01000148.1| Ruminococcus sp. 
5_1_39B_FAA cont1.148, whole genome shotgun sequence Length of sequence - 24173 bp Number of predicted genes - 25, with homology - 25 Number of transcription units - 10, operones - 6 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 7 - 74 16.1 1 1 Op 1 17/0.000 - CDS 149 - 1444 1359 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 2 1 Op 2 15/0.000 - CDS 1446 - 2588 1331 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase - Prom 2612 - 2671 5.8 - Term 2801 - 2843 -0.7 3 1 Op 3 32/0.000 - CDS 2863 - 3657 1016 ## COG0575 CDP-diglyceride synthetase 4 1 Op 4 19/0.000 - CDS 3661 - 4368 862 ## COG0020 Undecaprenyl pyrophosphate synthase - Prom 4411 - 4470 3.9 - Term 4454 - 4490 4.5 5 1 Op 5 33/0.000 - CDS 4504 - 5055 855 ## COG0233 Ribosome recycling factor 6 1 Op 6 . - CDS 5125 - 5820 996 ## COG0528 Uridylate kinase 7 1 Op 7 . - CDS 5901 - 7031 1133 ## gi|253580863|ref|ZP_04858126.1| conserved hypothetical protein - Prom 7161 - 7220 5.3 - Term 7332 - 7378 9.1 8 2 Tu 1 . - CDS 7410 - 8000 548 ## Cphy_3706 hypothetical protein - Prom 8055 - 8114 9.0 9 3 Op 1 10/0.000 - CDS 8154 - 8624 562 ## COG0691 tmRNA-binding protein 10 3 Op 2 . - CDS 8648 - 10930 1680 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 - Prom 10973 - 11032 1.5 - Term 10980 - 11015 3.1 11 4 Op 1 . - CDS 11035 - 11271 345 ## Cphy_2867 preprotein translocase, SecG subunit - Prom 11295 - 11354 5.2 12 4 Op 2 . - CDS 11359 - 12168 905 ## COG0148 Enolase - Prom 12189 - 12248 2.8 13 5 Op 1 2/0.000 - CDS 12266 - 13873 902 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 14 5 Op 2 2/0.000 - CDS 13852 - 15582 705 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 15 5 Op 3 . - CDS 15582 - 17219 878 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 17293 - 17352 8.5 - Term 17241 - 17294 3.0 16 6 Op 1 . 
- CDS 17358 - 17624 311 ## gi|253580873|ref|ZP_04858136.1| predicted protein 17 6 Op 2 . - CDS 17681 - 18046 283 ## COG2337 Growth inhibitor - Prom 18076 - 18135 3.0 - Term 18100 - 18151 8.3 18 7 Tu 1 . - CDS 18197 - 18379 81 ## gi|253580875|ref|ZP_04858138.1| conserved hypothetical protein - Prom 18537 - 18596 4.1 19 8 Tu 1 . - CDS 18622 - 19095 468 ## EUBREC_3532 hypothetical protein 20 9 Tu 1 . - CDS 19872 - 20054 205 ## Shel_25590 predicted transcriptional regulator - Prom 20250 - 20309 4.4 + Prom 20094 - 20153 4.8 21 10 Op 1 . + CDS 20184 - 20858 654 ## PROTEIN SUPPORTED gi|154175142|ref|YP_001407467.1| 30S ribosomal protein S12 22 10 Op 2 1/0.000 + CDS 20879 - 22354 675 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins 23 10 Op 3 . + CDS 22354 - 22824 166 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 24 10 Op 4 1/0.000 + CDS 22910 - 23296 154 ## COG1533 DNA repair photolyase + Prom 23506 - 23565 4.1 25 10 Op 5 . + CDS 23795 - 24172 250 ## PROTEIN SUPPORTED gi|157165407|ref|YP_001467497.1| 30S ribosomal protein S8 Predicted protein(s) >gi|226332871|gb|ACII01000148.1| GENE 1 149 - 1444 1359 431 aa, chain - ## HITS:1 COG:BS_yluC KEGG:ns NR:ns ## COG: BS_yluC COG0750 # Protein_GI_number: 16078719 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Bacillus subtilis # 3 430 6 420 422 199 33.0 1e-50 MKIIIAIVIFSAIILFHELGHFLFAKLNKIVVTEFSLGMGPRLYSFEKGDTRYSLKLLPI GGSCAMLGEDTDIENEPGTFNSASVWGRISVVAAGPVFNFIMAFVLSVIIVGAVGYEPSR VLSVKEGSAAEAAGLKEGDIITGYQGYHIDLGKDLYVYSYLNQLKEGDTINLTVKRDGKK MDISYKSDTNVRYLLGCNFNGDDTSAMTVESVMDGMPLQEAGIQQGDVITSINGVKITNA ADYQKYIQENPLTEKSVKITYSRDGQEYDITVTPKEYRTAESGFTYNMYSEKAKGLNVVK YGAVEVKYMVRTTILSLKELVSGKLGMKDLSGPVGIVDAIGTTYEESKSEGTMILWMNML NLAVLLSANLGVMNLLPFPALDGGRLVFLVIEAIRRKPINRQVEGGIHFAGLMLLMALMV FVMYNDIVKLI >gi|226332871|gb|ACII01000148.1| GENE 2 1446 - 2588 1331 380 aa, chain - ## HITS:1 COG:lin1354 KEGG:ns 
NR:ns ## COG: lin1354 COG0743 # Protein_GI_number: 16800422 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Listeria innocua # 1 378 1 378 380 385 50.0 1e-106 MKKVAVLGSTGSIGTQTLDVVRANDDLEVVGLAAGSNVEMLEKQIREFHPRLVAVWKEEA ARDLAVRVQDLDVKIVSQMGGLIELARMEESDILVTAIVGMIGIRPTMEAILAGKDIALA NKETLVTAGHLIMPLAKQFGVQILPVDSEHSAIFQALHGEKREQIHKLLITASGGPFRGK KTADLEKVTLEDTLKHPNWVMGQKITVDSATLVNKGLEVMEARWLFDVDLDHIQVVVQPQ SIIHSMVEFEDGAVMAQLGTPDMRLPIQYALCYPDRRFLKGDRLDFHMLKQITFEEPDRK TFKGLPMAVEAARAGGSMPTVFNAANELAVRKFLQKKISFLDIYEIIGQSMSRHTAVKDP DLDQILEIEKETYRWIESRW >gi|226332871|gb|ACII01000148.1| GENE 3 2863 - 3657 1016 264 aa, chain - ## HITS:1 COG:CAC1792 KEGG:ns NR:ns ## COG: CAC1792 COG0575 # Protein_GI_number: 15895068 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Clostridium acetobutylicum # 5 237 6 240 245 116 32.0 6e-26 MFKTRLLSGILLVIIALATIISGDYVLFFTLLAVSLIGMRELYRAMKVQDEKINLLAAAG YLGAVLYYLAVLLDFERYGVLAIIFGLVLLMFVYVFTYPTYEAGQVMPALFGIVYVAVML SFIYLTRELPGGKFHVWLIFLCSWGCDTCAYCVGMLIGKHKMAPVLSPKKSVEGAVGGVA GAALLGVIYAAATQGPMLEYAVICAIGALISMVGDLAASAIKRNQGIKDYGKLIPGHGGI LDRFDSVIFTAPVIYFLSLVMIEI >gi|226332871|gb|ACII01000148.1| GENE 4 3661 - 4368 862 235 aa, chain - ## HITS:1 COG:FN1326 KEGG:ns NR:ns ## COG: FN1326 COG0020 # Protein_GI_number: 19704661 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Fusobacterium nucleatum # 2 233 4 228 230 242 53.0 4e-64 MNVPNHIAIILDGNGRWAKAKGMPRSYGHVKGCANLETVCDYMKELGVKYVTVYAFSTEN WKRSKDEVDGLMKLFRSYLKKCIKLADKNKMRVRVIGEVSAFDQDIQESIARLEQYSQKY DEIYFQIALNYGSRDEIIRGIRKLAQDAADGKVKPEEIDEHVFDNYLDTAGIPDPDLMIR TSGELRLSNFLLWQMAYTEFYFTDVAWPDFNKAELIKAIEKYNQRDRRYGGVKEE >gi|226332871|gb|ACII01000148.1| GENE 5 4504 - 5055 855 183 aa, chain - ## HITS:1 COG:CAC1790 KEGG:ns NR:ns ## COG: CAC1790 COG0233 # Protein_GI_number: 15895066 # Func_class: J Translation, ribosomal structure and biogenesis # 
Function: Ribosome recycling factor # Organism: Clostridium acetobutylicum # 5 183 6 185 185 182 60.0 3e-46 MDERIQKYEEKMKKTLASLESELVTIRAGRANPHILDKLAVDYYGAPTPLQQVANITVPE ARMIQIQPWESSLIKGIEKAILTSDLGLNPSNDGKVIRLVFPELTEERRKELVKDVKKKG EAAKVAVRNIRRDANDAFKKLAKQDVSEDEIKELEEKIQKSTDKYIKEVDAAVDAKSKEI MTV >gi|226332871|gb|ACII01000148.1| GENE 6 5125 - 5820 996 231 aa, chain - ## HITS:1 COG:CAC1789 KEGG:ns NR:ns ## COG: CAC1789 COG0528 # Protein_GI_number: 15895065 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Clostridium acetobutylicum # 2 230 7 235 236 198 44.0 7e-51 MKRVLLKLSGEALAGEKKTGFDEATVIEVAKQIRTIAEEGLEIGIVIGGGNFWRGRTSET IDRNKADQIGMLATVMNCIYVSDICRYLGLKTEIFTPFVCGAFTSLYSKDAVEASFAQGK IVFFAGGTGHPYFSTDTATVLRAVEIEAEAILLAKAVDGIYDSDPKTNPNAVKYDEISIE EVVAKKLAAMDLTASIMCMEQKMPMLVFALDEKDSIINAVHGKFVGTKVTV >gi|226332871|gb|ACII01000148.1| GENE 7 5901 - 7031 1133 376 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580863|ref|ZP_04858126.1| ## NR: gi|253580863|ref|ZP_04858126.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 376 1 376 376 489 100.0 1e-136 MKKRYLILAGLLVMTVAAAGCGKKKTTETAPVEVTATPTPEVTKAVDMVDMQQTADEDIK NVMGEKTSTASKIVFVNNTGDDIQSLYIRTHVDEDSEDYDADEDGGWGDDLINGMFTLTD KDKALYYMQTANTQTSGTQTSGTQTSDTQTSGTQTSDTQTSDTQTTSNKSTASYDIRIAY TDEDTNECFFRDIPLGTISQITLCMDGTDDDAIPYAKYLTGTSTKEVSTLDAVKERLGIT DDSESESDSTDDSDKNSADDSDSTDSNNSSDQNNNSGNGTGDSSDDPGNGGNSDDPGTND DPGSNDDPGNGGDMISTAEQYIGQSLDALEGACGSPQGSSYEDDPETGKTGFHYYSNFTV STTVDENGNEIVAGIW >gi|226332871|gb|ACII01000148.1| GENE 8 7410 - 8000 548 196 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3706 NR:ns ## KEGG: Cphy_3706 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 194 3 199 204 144 36.0 1e-33 MKTITISRQYGSGGRYIAALLSEKMGVPCYDSKFLPKEAERHGISQEIINEFKGKTSLLY AIGVMMSEESQDKKRLTIPEKMFHAQKETVKRLAQEGPCIFVGRCADQILKDDNQLLRVY IYASDMEDRIKRIKKNKHISQEEALDRIAYKDRQRRDYYNFYTGHEWGKMENYDICLNTS VLSEEECVELLMKLAE >gi|226332871|gb|ACII01000148.1| GENE 9 8154 - 8624 562 156 aa, chain - ## HITS:1 COG:BS_yvaI KEGG:ns NR:ns ## COG: BS_yvaI COG0691 # Protein_GI_number: 16080413 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Bacillus subtilis # 8 151 7 150 156 162 57.0 2e-40 MAKKKGMKLIANNKKAFHDYFIEDTYEAGIALAGTEVKSLRMGKCSIKESFIRVENGEVY IYGMHISPYEKGNIFNKDPLRIRKLLLHRYEINKIEAKLKEKGLTLVPLKVYFKDSLVKV EIGMARGKKLYDKRQDIAKKDQRREAEREFKVKNLY >gi|226332871|gb|ACII01000148.1| GENE 10 8648 - 10930 1680 760 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 13 750 7 702 730 651 46 0.0 MSKKDMDRRKKFILELMGDPIYQPMRLREISSLLRLSKEEKRELYDVLDELCEEGKVSVD RKGRYEKVKGKWKKKKDDRYYDDRREEYGSEYGRKKKDKNKKDKNKKEQPEGIQAEGTFI GHPKGFGFVEIEGQDEDIFIPESDTGTAMHQDKVRIIIRNDKKEGKRQEGVVVKVLERGM PEIVGTYQLNRDFGFVISDNPKFSKDIFIPRKEAAGIKNGDKVIVVITDYGSGNKNPEGK IKENLGNIRTPGTDILAIVKSFGIPSEFPEKVMKQAQRVPDHVLDADRDGRLDLRHLQTV TIDGEDAKDLDDAISLTKEGDIYHLGVHIADVSNYVQYNSALDREALKRGTSVYLADRVV 
PMLPERLSNGICSLNQGEDRLALSCLMDINEKGKVVSHQIAETVINVNERMCYTDVKNIL EDTDEEAKKRYEALIPMFFMMKELSGILRNSRHHRGSIDFDFPESKIILNAAGKAIDVKP YEANVATKIIEDFMLMANETVAQEYCTEEIPFVYRTHDNPDPEKVESLLTLLHNQGVKIQ KAKEEITPKEIQQIIESIEGLPNEAMISRLVLRSMKQAKYTTECSGHFGLAAKYYCHFTS PIRRYPDLQIHRIIKDNLRGRLMREGRTEHYAEILDEVARQSSVCERRADEAERESDKLK KAEYMSYHLGEEFEGIISGVTGWGLYVELPNTVEGLVHVNTLRDDYYIFDQETYELRGEM TKKVYKLGDKVCVRVADADKMLKTVDFELVADIWNDEEEN >gi|226332871|gb|ACII01000148.1| GENE 11 11035 - 11271 345 78 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2867 NR:ns ## KEGG: Cphy_2867 # Name: not_defined # Def: preprotein translocase, SecG subunit # Organism: C.phytofermentans # Pathway: Protein export [PATH:cpy03060]; Bacterial secretion system [PATH:cpy03070] # 1 76 1 77 81 77 61.0 2e-13 MQIIKIILSVIFIIDCIALTVVVLMQEGKQQGLGAIAGAAETYWGKNKGRSMEGGLVKAT TIMGILFFVISVALNMLG >gi|226332871|gb|ACII01000148.1| GENE 12 11359 - 12168 905 269 aa, chain - ## HITS:1 COG:CAC0713 KEGG:ns NR:ns ## COG: CAC0713 COG0148 # Protein_GI_number: 15894001 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Clostridium acetobutylicum # 1 258 171 420 431 317 62.0 1e-86 MPTGAENMEQAIRMCAEVYQFLRIILKQKGLSTAVGDEGGFAPDLSDSESVLEVILEAVK KAGYEPGKDISIAIDAAASELYDEERGVYVFPGEGKMKGEEVLRDSGEMIEYYEKLAEKF PIVSIEDGLEEDDWEGWKQMTKRLGDKIQLVGDDLFVTNIKRLACGIKLGAANAILIKLN QIGTLSEALDAVEMAQKAGYRTVISHRSGESEDSFIADLAVATGAGQIKTGAPCRSDRNA KYNQLLRIHEALGELAVYENPFKENEKNC >gi|226332871|gb|ACII01000148.1| GENE 13 12266 - 13873 902 535 aa, chain - ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 10 304 12 299 301 184 37.0 4e-46 MSDGKCIVKYLRLSLEDEDMLDESNSITNQRIVIGQYIASKNEFKNTEVLEFKDDGYSGT NFDRPGFQSMMELVRDGKVSTIIVKDLSRFGRNHIEVDTYLEQIFPFMNVRFIAINDNVD SMKYESGMPGIDVGFRNIINEHHSIDTSVKVKRTLIQRQKAGKYMGARAPYGYLKPDEDV 
TSLVINPETAPVVKMIFQKYLDGMNITQLARYLNEQKIMSPGQYKREVLKTGVKKTTEKY IWYPVTVRLILMTETYTGTTIGGKWKVASVGSNKHLKTKEEDWIVVEGTHEAIVSKEVFD AVQEKLELNSRKRSKTHNNNYPLKGLVKCGGCGQNLQHVTRCNPHFKCPRKFNVANQDCV TDNLYDDEFNEMIFRAIKLFAKISDDAEPVLELQKAELKSKVNGAAKKIRDAKDSISRYK HQKTELYMRYAMEEISEEEFTRKNDKLDKQIEKETLAIAQMETEQSEAAERLFELPPDGR QCLTDLIEGNKQLTREIAVTFIRGIKVYNDKRIEIEWNFADELVKYVEQVQKICS >gi|226332871|gb|ACII01000148.1| GENE 14 13852 - 15582 705 576 aa, chain - ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 20 322 5 300 301 134 32.0 4e-31 MARTRNRQMKQAETFVQASPQKVWKAGIYTRISVDNNGEKRESLETQKLIALSYAESHPD IEVVKFYKDDGISGTKFDRDDFVRMLGDIKTKEINTVIVKDLSRFGRDLEEVSKYLEKIF PFMQVRFISVNDNYDSISPECDNQMLGIMISNLANDMYAKDASVKMASAMKIKMESGEYC GGDAPYGYKRARDEKGKSTTVPDPLTAPYVVEIFEKLAAGQSYLKISREFNTMLLASPRV YARTGKLFLDRIDETDMHWQSSVIKQIAENRHYLGNTYSHKTRTSLLTKEKNVMLDKEEW IEHVNTHKAIVSAELFEKVQNVIKLKQEKALPKKDLSNMQTHGKQDNKYVGLIYCGDCGA NMVRRYYYSEKNGVLYYNYYFICGNYAKISKEKYNCNRWKEEVIDELVYRALIMQLKTVC ELKTQLKRFNDEYFETFQKYLNREQSKIVQLNKRNEARRFELYEQYVSGEIDTDAYNRMT ERITMVEKDLVTRQKEIEKSRKITEKLCKKNFSWLAEFSKGKNLEFLTKDVVRSYIKKIS LYEDKRIEIEFKFQDEIQALSEILEEGVIRCQMVSA >gi|226332871|gb|ACII01000148.1| GENE 15 15582 - 17219 878 545 aa, chain - ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 7 305 8 300 301 166 34.0 2e-40 MKKNIWIAAKYLRLSIEDGDKAESESIVNQSILIDNYMKSTSDITIVETFKDDGFSGTDF NRPGFQAMLKAIENKEINCIIVKDLSRFGREHIDVDRYIQKVFPQLGVRFIAINDNYDSE TANITDTHLVLPVKSFVNDTYCRQNSQKVRSHLSAKRNIGEYVGNYVSYGYKKCDTDKSQ IEIDPVAAKHVRDIFTWKMEGMSNQLIADKLNELGVLAPADYKRATGVNFKSSFQTHLTS RWSAVAIIRILKNPIYYGVLQQGKSQRINYKVKVQRALPKEEWVIFENHHEGIVTKEEYE 
TVQMLLAKDTRIAPGENRLYLFGGLLSCGDCGSNLIRRTNSYKGEKTVFYICSSYNKKKD QCSRHSIREDVLIQLVMDSLKMYSKMTDAIRSAVEYLKENSLDTQTLIQHDDLILELRNK VNKYYKLLHSLSGNLASGVISKDDYTLLRERYQTEIKSLESNIEKQEEYMEDLLENKLLC EEWVNTFLEKPSLGDLDRDMLLQYVEKINVFEGKKIEIVYRFQDELVTAARLAEQIGKTK QEVAV >gi|226332871|gb|ACII01000148.1| GENE 16 17358 - 17624 311 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580873|ref|ZP_04858136.1| ## NR: gi|253580873|ref|ZP_04858136.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 88 3 90 90 152 100.0 4e-36 MTAVERLEALKTVDIKEIDRDSLPKYKDLISELDSRKNNKMEVLLNHSDNPYVYEDMGYV VKSVFNAAASLSYIDCTKQLVAKKAGLA >gi|226332871|gb|ACII01000148.1| GENE 17 17681 - 18046 283 121 aa, chain - ## HITS:1 COG:CAC0494 KEGG:ns NR:ns ## COG: CAC0494 COG2337 # Protein_GI_number: 15893785 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Clostridium acetobutylicum # 1 117 3 116 122 77 38.0 4e-15 MTIRRGDILWADLGMFPTTSVQGGVRPVIVVSNNKANTYSSVITVVPLTSRIYKKRYLPT HVFISKYDMTGIRKGSLALAEQVMSISTKCIIEKCGRVNKWSLDRVLKAVQIQMGMEEKT T >gi|226332871|gb|ACII01000148.1| GENE 18 18197 - 18379 81 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580875|ref|ZP_04858138.1| ## NR: gi|253580875|ref|ZP_04858138.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 60 1 60 60 119 100.0 6e-26 MTYMRCDTLADPEKSVTGKPMNCWSLETRVSGLWSGVCIRAGYYSLPAYAGVTIRGIEDK >gi|226332871|gb|ACII01000148.1| GENE 19 18622 - 19095 468 157 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3532 NR:ns ## KEGG: EUBREC_3532 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 157 35 191 191 282 97.0 3e-75 MYIEHPYFYEGKYYANIDGEMIEVTKEVAYAMNNFYRSSKAKKVEIKNELGEVVDKMLRE VPYSGQSIDGEVFMIEDFPDLNCDVEHCVLTKMEQQDIHKVINQLNSEERMIIYGIFFEN KTQTQMAEIMGISRQMLSYKLKSTLNKMREMYMNKFF >gi|226332871|gb|ACII01000148.1| GENE 20 19872 - 20054 205 60 aa, chain - ## HITS:1 COG:no KEGG:Shel_25590 NR:ns ## KEGG: Shel_25590 # Name: not_defined # Def: predicted transcriptional regulator # Organism: S.heliotrinireducens # Pathway: not_defined # 1 42 1 42 129 67 71.0 1e-10 MDHIMTYEEYLRDVKKGIVTDLGNSPVTPLLLMLQGKWKTQITMTIWFVLLSAGIKTSAF >gi|226332871|gb|ACII01000148.1| GENE 21 20184 - 20858 654 224 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154175142|ref|YP_001407467.1| 30S ribosomal protein S12 [Campylobacter curvus 525.92] # 5 224 12 231 231 256 55 1e-67 MTSLTIEQKRVDIYPSAKPGSPVIYLNTFSNAVNSVYKNLMALGCPDFCLVAVSNLKWDH DMTPWYMGPISKHDTPCTGGADDYLKLLLDEIMPEAEALLPGAPAWRGLAGYSLAGLFAL YALYQTDVFARAASMSGSFWFEGIMEYVCSHEMKRKPDCLYFSLGDKESSTDNPILNVVQ QNTQEIETLCHGKGIETTFVLNSGSHYQACNKRTAAGIHWILNR >gi|226332871|gb|ACII01000148.1| GENE 22 20879 - 22354 675 491 aa, chain + ## HITS:1 COG:all2648_2 KEGG:ns NR:ns ## COG: all2648_2 COG1020 # Protein_GI_number: 17230140 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Nostoc sp. 
PCC 7120 # 4 483 461 949 1525 241 32.0 3e-63 MDTIYSLFQKVVKEHENAPAIIENDRAMTFGELSNMVDMITCSFPQEVHSIGIVMRHRTE MIASILAVLKCGGQYVPAEPDFPTGRIHDMMTEAQVDFVLTEHAFAPKLSSFPIRYTDCE ICGVETPSWKRNAIEDPERPAYVLYTSGTTGRPKGVCVTNRNVCHYVRAFANEFHPGPGD VMLQYSVCSFDIFVEEVFTSLLNGAALAIPADEDKADIHALMAFVERHHVTMISGFPYLL AEMNHLSVIPSSLRLLISGGDVLRGVYVDHLLNKAEVYNTYGPSETTVCASYYRCNSGTV LEDGTYPIGHPVLGAQIRILDQSGNEVAKGQTGEICIYGGGVSLGYIGDHAEENSAFERQ PDGSTMYRSGDLGYILPNGNIAFLHRKDDQIMIYGKRVELAEVESRLYRCKDVQQAIVRA FTDEDGLSYMTAYVVPSDNKLKVSEVKKELSENLTSFMIPEFFVKMKQIPLNVNGKPDVS KLPVVMKAGAL >gi|226332871|gb|ACII01000148.1| GENE 23 22354 - 22824 166 156 aa, chain + ## HITS:1 COG:CAC2491 KEGG:ns NR:ns ## COG: CAC2491 COG0454 # Protein_GI_number: 15895756 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 6 153 4 147 155 82 28.0 3e-16 MEIQIRQATIQDLNVLMQWRMEVLHEVFSIPSERSVTELESENRRYYQTELPQGGHIACF AYVGEEIVGCGGICLYHEMPSPDNLNGKCAYLMNIYTRPQFRGHGIGNRIVRWLVEQARQ RHISKIYLETSDKGRPLYQTIGFADMKDMMKLSREK >gi|226332871|gb|ACII01000148.1| GENE 24 22910 - 23296 154 128 aa, chain + ## HITS:1 COG:MJ0683 KEGG:ns NR:ns ## COG: MJ0683 COG1533 # Protein_GI_number: 15668864 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Methanococcus jannaschii # 5 121 26 140 259 86 40.0 1e-17 MSDYYSVNPYVGCGHGCKYCYASFMKRFTNHPEPWGEFIDVKFWPEIKHPERYAGKELFL CYVTDPYQPLEETACRTRAILEQMQGSGCSLSIATKSDLVLRDLDLIKTFPNARVSWSIK HTGRRFSR >gi|226332871|gb|ACII01000148.1| GENE 25 23795 - 24172 250 125 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165407|ref|YP_001467497.1| 30S ribosomal protein S8 [Campylobacter concisus 13826] # 1 114 1 114 117 100 41 7e-21 MKRAELESFIMETYNAETDYPWLKYPNYEVFRHCNNRKWFALIMDVPKNKLGLQGEEMLK VVDFKCDPILIGTLREEPGFFPAYHMSKDSWITVALDGSVSDDKTKMLLDMSYEATIPKT CKKRY Prediction of potential genes in microbial genomes Time: Sat May 28 20:53:48 2011 Seq name: 
gi|226332870|gb|ACII01000149.1| Ruminococcus sp. 5_1_39B_FAA cont1.149, whole genome shotgun sequence Length of sequence - 35533 bp Number of predicted genes - 30, with homology - 30 Number of transcription units - 18, operones - 8 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 91 - 507 198 ## COG1254 Acylphosphatases 2 2 Tu 1 . - CDS 615 - 1133 658 ## COG0778 Nitroreductase - Prom 1321 - 1380 6.3 + Prom 1268 - 1327 5.5 3 3 Op 1 . + CDS 1352 - 2548 1071 ## EUBELI_20336 hypothetical protein 4 3 Op 2 . + CDS 2538 - 3338 682 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 3435 - 3483 10.0 - Term 3423 - 3471 10.0 5 4 Tu 1 . - CDS 3527 - 4759 1724 ## EUBREC_0459 hypothetical protein - Prom 4828 - 4887 4.8 6 5 Op 1 7/0.000 - CDS 4891 - 8208 3204 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) 7 5 Op 2 . - CDS 8308 - 8889 706 ## COG0193 Peptidyl-tRNA hydrolase - Prom 8953 - 9012 3.3 - Term 8978 - 9028 2.1 8 6 Op 1 . - CDS 9053 - 10006 1165 ## EUBREC_1831 hypothetical protein 9 6 Op 2 . - CDS 10160 - 11308 1308 ## COG0523 Putative GTPases (G3E family) - Prom 11455 - 11514 9.4 - Term 11589 - 11625 6.5 10 7 Op 1 . - CDS 11645 - 11878 378 ## gi|253580894|ref|ZP_04858156.1| conserved hypothetical protein - Prom 11930 - 11989 9.8 11 7 Op 2 . - CDS 12092 - 13978 2150 ## COG0296 1,4-alpha-glucan branching enzyme - Prom 14200 - 14259 8.6 - Term 14283 - 14339 8.3 12 8 Tu 1 . - CDS 14391 - 15716 1422 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 15778 - 15837 1.9 13 9 Op 1 . - CDS 15860 - 17140 1600 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) 14 9 Op 2 . - CDS 17159 - 18613 1473 ## Cphy_0334 hypothetical protein 15 9 Op 3 . - CDS 18650 - 20443 1791 ## Cphy_0334 hypothetical protein - Prom 20628 - 20687 8.9 16 10 Tu 1 . + CDS 20793 - 21248 654 ## Cphy_3788 hypothetical protein + Term 21273 - 21326 4.1 - Term 21261 - 21313 5.5 17 11 Op 1 . 
- CDS 21372 - 21818 592 ## COG4506 Uncharacterized protein conserved in bacteria 18 11 Op 2 . - CDS 21900 - 22721 883 ## COG0796 Glutamate racemase 19 11 Op 3 . - CDS 22764 - 23036 279 ## gi|253580903|ref|ZP_04858165.1| predicted protein - Prom 23102 - 23161 6.6 20 12 Op 1 . - CDS 23170 - 24855 1206 ## EUBREC_1002 spore germination protein-like protein YndE 21 12 Op 2 . - CDS 24852 - 26354 1394 ## EUBREC_1001 spore germination protein B1 - Prom 26434 - 26493 2.6 - Term 26454 - 26486 3.1 22 13 Op 1 . - CDS 26522 - 27241 1074 ## COG1802 Transcriptional regulators 23 13 Op 2 1/0.000 - CDS 27231 - 28109 889 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase - Prom 28194 - 28253 5.8 24 13 Op 3 . - CDS 28269 - 29819 1359 ## COG1388 FOG: LysM repeat 25 13 Op 4 . - CDS 29842 - 30234 284 ## EUBELI_00153 hypothetical protein - Prom 30403 - 30462 5.3 + Prom 30406 - 30465 8.4 26 14 Tu 1 . + CDS 30518 - 31264 517 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase + Term 31299 - 31341 2.4 - Term 31376 - 31418 -0.7 27 15 Tu 1 . - CDS 31492 - 32493 634 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 32722 - 32781 3.2 28 16 Tu 1 . + CDS 32477 - 32779 273 ## DSY0184 hypothetical protein - Term 33245 - 33289 10.1 29 17 Tu 1 . - CDS 33333 - 34343 744 ## EUBELI_01420 hypothetical protein - Prom 34363 - 34422 5.6 + Prom 34403 - 34462 9.0 30 18 Tu 1 . + CDS 34637 - 35531 906 ## gi|253580932|ref|ZP_04858194.1| predicted protein Predicted protein(s) >gi|226332870|gb|ACII01000149.1| GENE 1 91 - 507 198 138 aa, chain - ## HITS:1 COG:alr0877 KEGG:ns NR:ns ## COG: alr0877 COG1254 # Protein_GI_number: 17228372 # Func_class: C Energy production and conversion # Function: Acylphosphatases # Organism: Nostoc sp. 
PCC 7120 # 46 108 4 66 101 63 52.0 7e-11 MSIFDFKRKKYRTFLQDSETGEEIAEEYEKGRGVWKKHDVQDGKGSEMREDLIRKHYWFS GRVQGVGFRYRACYIASSLGVTGWVRNNWDDRVEMEAQGSRELLAQMVEMLGRQRFIEIE GIEERVIPVEVESGFYSR >gi|226332870|gb|ACII01000149.1| GENE 2 615 - 1133 658 172 aa, chain - ## HITS:1 COG:FN1223 KEGG:ns NR:ns ## COG: FN1223 COG0778 # Protein_GI_number: 19704558 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Fusobacterium nucleatum # 1 172 1 174 175 172 47.0 2e-43 MNETIKNLLERRSVRGYKEDLVPEEVLNEILEAGEYAPSGMGQQGTLMVVTQNPELVAKL SKMNADVMGTESDPFYGAPTVVVVFADSNMPTCVENGSLVMGNLMNAAHALGVDSCWIHR AREVFASEEGKALKAEWGVPESYVGIGYCVLGYRSGEYPKAKARKDGFVIRV >gi|226332870|gb|ACII01000149.1| GENE 3 1352 - 2548 1071 398 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20336 NR:ns ## KEGG: EUBELI_20336 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 387 5 387 393 370 49.0 1e-101 MNKKEISEIKKQFTPENCSLTRICGCYVDGEKNKKTELKEAFLSLSEEEMFKYFEIFRKS LSGTIGKNLMNMEFPLEQETEGGTQHFLMKLRESKLTDDALVESFYDKIIENYDYPENYY IILIHGVYDIPGKSSDGAEMFDASDEIYEHIMCCICPVKLSKAGLCYNAETNSIQDRIRD WIVELPDVAFLFPQFNDRSTDIHGVLYYSKNANELRDIFIDELLGCTAPLSAGGQRDSFN ALVEDTLGDDCRYDTVLSIHEKLNDLIESQKDEPEPVVLTKSEVKRLFEECGVEDEKLQS FDEQYEITAGEKSSLVASNITNTKRFEIKTPDVVVHVDPERADLVETLVIDGRKCLVIPM EGEVELNGIRVHTGNENPSDDEFYDNTDTETEETSDGE >gi|226332870|gb|ACII01000149.1| GENE 4 2538 - 3338 682 266 aa, chain + ## HITS:1 COG:CAC2244 KEGG:ns NR:ns ## COG: CAC2244 COG0561 # Protein_GI_number: 15895512 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 4 266 2 266 266 200 39.0 2e-51 MENNRKLLFFDIDKTLLTPYPWTVPDSAKQALKEARYNGHLLFINSGRTYTMIPEIIKEM GFDGYVCGCGSQIYMNEELLFSSSIPNPLCRKTTELLRECRIPAFFERPDKILYDSSSCA VPETVQKLKAEVIVEDLALYPPETYNHFTFDKFLAFPSADSDMETFRKFTDEHFLCFIHE DKAWELTQKNLSKATGIRFLADRLGVPVKDTYAFGDSTNDLLMLQFAGNSIAMGNSDPLI LPHCTYQTTNIEDNGIANALKHFHLI >gi|226332870|gb|ACII01000149.1| GENE 5 3527 - 4759 1724 410 
aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0459 NR:ns ## KEGG: EUBREC_0459 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 290 4 314 317 95 29.0 3e-18 MVKRTAVVTLAGVMSVGMLSGCGSKTLDGTKTVATVDGTDIPLGVVSLYAREQQQQTTTM YLNYMGSADNIWDQTAGDDSDETYGDQAVTSSLESVEKMYILKEKAADYNVELTDDDEAA IADAASQFMAANSEETIKELAVTEDQVKTLLELQTIQKKMYDPVVAEGKITVSDDEANQT TFTYVSISTSGDDITDEEKKTKKEQAQEILDKMKEDPTADMSEIAKGVDDSYSAVQGNFT TKESEDEDEDSGSEAYPDEVLKVLRGLKDGEVADNIIETDTGYYVVRLDKINDEDATASK RESLQNSKESTYFTDTTAKWLDEADVKAVKKVIKTLKITDKHTFMAPTPTPVPETPTPEV TEEATPTPETEEVTETPAAEDTETTETPAAEEEAAETTETPEATPTEAAK >gi|226332870|gb|ACII01000149.1| GENE 6 4891 - 8208 3204 1105 aa, chain - ## HITS:1 COG:CAC3216 KEGG:ns NR:ns ## COG: CAC3216 COG1197 # Protein_GI_number: 15896463 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Clostridium acetobutylicum # 1 1103 3 1169 1171 885 43.0 0 MKAFLTPLQGLAEFEQIKEKSKTNKGILQVSGCMESQKSHLMYGLSGIAPYRLILAEDER RAREIYEDYRFYDRKVYSYPAKDLLFFQADIHGNLLIRQRMKVIKALLEEKELTVVTSID GCMDFLESLEKIKEQLIHYESDSTVDIEQLKNQLVALGYERVGQVEMPGQFSVRGGIVDI YCLTEENPWRIELWGDEIDSIRSFDPESQRSLENLEELTIYPAVEHIGDKDMVSFLDYFP EERTIIFLDEPNRLTEKGGAVEEEYRQSRMHREEKGSRNLPENWLCSFEQLQKELNKRNC ISVCALEPKQAGWKVREKFYLEVKSISAYNNSFELLVKDLHQYKKQGYRIALLSGSRTRA ERLAKDLQEEGLAAFYGQDYDREICPGEIMVVYGHAKKGFEYPLIKFAVMTESDIFGQEQ KKKKKKNYSGSRIQDFAELSVGDFVVHEKHGLGIYRGIEKVEVDRIVKDYIKIEYRGGSN LYIPATQLDCLQKYSGADASKAPKLNKLGTQEWNKTKSKVRGAVKNIAKELVELYAVRQE KEGYVCGPDTVWQREFEEMFPYEETEDQLSAIEDAKRDMESTRIMDRLICGDVGYGKTEV ALRAAFKEVQESRQVAYLAPTTILAQQIYNTFVQRMKEFPVRVELLCRFRTPAQQKKAIE DLKKGQVDVIIGTHRILSKDVQFKNLGLLIVDEEQRFGVTHKEKIKQLKKDVDVLTLTAT PIPRTLHMSLIGIRDMSVLEEPPMDRMPIQTYVMEYDEETVREAINRELRRGGQVYYVYN RVTDIADVALRIAKLVPDARVDFAHGQMSERELENVMYSFVNGDIDVLVSTTIIETGLDI SNVNTMIIHDSDRYGLSQLYQLRGRIGRSNRTAYAFLMYRKNVMLKETAEKRLAAIREYT DLGSGFKIAMRDLELRGAGNLLGAQQHGHMNAVGYDLYCKMLNEAVKEAKGIHTMEDFET 
SVDLNVDAYIPDSYISNEFQKLDIYKRIAGIETQQDYDDMLEELLDRFGEPGKAVLNLLA IAKLKAIAHQGYVTEIKQTGKIVRFTLYEKARLNTEGFPALMQKYRRGLQFKNEQEPKFI LEPQGNLILALTEFAEELKSMAENM >gi|226332870|gb|ACII01000149.1| GENE 7 8308 - 8889 706 193 aa, chain - ## HITS:1 COG:BS_spoVC KEGG:ns NR:ns ## COG: BS_spoVC COG0193 # Protein_GI_number: 16077121 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Bacillus subtilis # 1 189 1 188 188 186 47.0 2e-47 MYLVVGLGNPGKQYEATRHNMGFDTVDRLVEDYNVPQGGVKFNAMYGKTMIGGEKVILMK PLSFMNLSGGPVREMANYFKIDPESELIVIYDDIDLEPGQLRIRKQGSAGGHNGIKDIIR QLGTEKFLRIKVGVGAKPKGWDLADHVLGRFSTEDRKLVDEAIAKAAKAVDIIISQGTDA AMNEYNRKVPVQK >gi|226332870|gb|ACII01000149.1| GENE 8 9053 - 10006 1165 317 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1831 NR:ns ## KEGG: EUBREC_1831 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 7 317 5 314 314 258 40.0 1e-67 MMDDIRVPIYLVTGFLESGKTTFLDFTLQQEYFAIDGKTLLILCEEGEEEYDMDKLKLTN TVVEVIEDEEDLTPQRLAAMDIIHQPERVVIEYNGMWLVSKFEQMELPEGWGIEQEITCV DATTYQVYMANMKSLFMDMIRNTDMVVFNRCKEGDPLPAYRRGIKVANQSAEVIFENEEG EMDDIFQDEMPFDINAPIIEIPPEDYGIWFVDAMDNPDKYVDKIVRFKGRVMKPRGMGSK FFVPGRVAMTCCADDTTFLGYVCRSDYAPHIKEGSWVEVTAKVAFENRTEYQGEGIVLYA SDVKECEPLKEEMVYFN >gi|226332870|gb|ACII01000149.1| GENE 9 10160 - 11308 1308 382 aa, chain - ## HITS:1 COG:FN0779 KEGG:ns NR:ns ## COG: FN0779 COG0523 # Protein_GI_number: 19704114 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Fusobacterium nucleatum # 4 182 2 177 294 133 36.0 7e-31 MATKIDIISGFLGAGKTTLIKKLLKEAYADEQVVLIENEFGEIGIDGGFLKEAGIQIREM NSGCICCSLVGDFGTSLKEVVDKYHPDRILIEPSGVGKLSDVIKAVQGVQGDVDIVLNSY TTVVDAKKCKMYMRNFGEFFDNQVEYAGAIIMSRTDIIDEKKAQQAMELLRGINAKAAII TTPIEKLDGRKILEVMEKPVSLEEELMSEEEVCPECGHVHEHGEHHHHDHDHDHECGCGH DHEEHEHHHHDHDHECGCGHDHEEHEHHHHDHDHECGCGHDHHDHHHHHADEVFTSWGRE TIKKYTREGLEKMLEALSASEDYGIILRAKGMLPAEDGTWIYFDMVPEETEIREGAPEYT GRLCVIGSKLKEDKLAELFGLA 
>gi|226332870|gb|ACII01000149.1| GENE 10 11645 - 11878 378 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580894|ref|ZP_04858156.1| ## NR: gi|253580894|ref|ZP_04858156.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 77 3 79 79 139 100.0 7e-32 MKVSNIKDIDKFFAVVDSCEGRVELVTGEGDRLNLKSKLSQYVSLANIFSGGEIPELEIV AYEKDDIDKLMSFMING >gi|226332870|gb|ACII01000149.1| GENE 11 12092 - 13978 2150 628 aa, chain - ## HITS:1 COG:all0713 KEGG:ns NR:ns ## COG: all0713 COG0296 # Protein_GI_number: 17228208 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Nostoc sp. PCC 7120 # 4 626 106 751 764 630 48.0 1e-180 MGVSMKFTDLDQYLFGQGTNYEVYKKLGAHPTTYRRKKGVYFAVWAPNAQSVSVIGEFND WEEEANPMEKVGPIGVYEVFVPGAKIGQLYKFFIVGAHGEKLYKADPYANEAELRPGTAS RITDITDYKWKDATWIKNREKFDEQVEPMAIYEVHPGSWKKHPQSEENEKGFYNYREFAH ALAAYVKEMGYTHVELMGIAEHPFDGSWGYQVTGYYAPTSRYGTPQDFKYMVDYLHQNKI GVILDWVPAHFPKDAHGLANFDGTAVYEHADPRQGEHPDWGTKIYNYGRPEVKNFLIANA LYWVEECHVDGLRVDAVASMLYLDYGKKDGEWIANKYGGNQNLEAIEFFKHLNTVVLGRN HGTVMIAEESTAWPMVTGDAEKDGLGFSLKWNMGWMNDFLEYMKLDPYFRKFNHNKMTFS MSYAYSERYVLVLSHDEVVHLKCSMLEKMPGELPDKFKNLKAAYSFMNCHPGKKLLFMGQ EFGQLREWSEERELDWFLLNEEPHKELQNYVHDLLTIYKKYPALYAGDNNPEGFEWINAD DGDRSIFSFVRKSPTKRNNILYIVNFTPVDRPDYRVGVPKKKQYKLIMDENGLLEQPKTF KAVKQKCDNREFSFAYPLPAYGVAIFTY >gi|226332870|gb|ACII01000149.1| GENE 12 14391 - 15716 1422 441 aa, chain - ## HITS:1 COG:SP1479 KEGG:ns NR:ns ## COG: SP1479 COG0726 # Protein_GI_number: 15901329 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Streptococcus pneumoniae TIGR4 # 145 336 260 451 463 145 40.0 2e-34 MRRKKRQMMIRHYTKIGLCVIGMILAVVFVVRGIIIPIAHHVAGGGSSDQTVQAQADTEE VQANSDAAVRQPLKGKSDTDKIAAMTAGWHEDANGKWYQNTDGTYFSNGFQDIDGVTYSF DENGYIQTGWVEKGVKDYYFNEDGSYDPSKVRPMLALTFDDGPGEYTDELLDCLEQNNAH ATFFMLGQNVSSYPDAPKRMLELGCEIGSHSWDHTQLTTIDLDAVAKQFSDTDDALIQAC GQAASVARAPYGDGNSDIYNTVNKPFFMWSLDTEDWKLLDADADYSAVMNGDLTDGTIIL 
MHDIHEPSVKAALRLIPDLIAQGYKLVTVSEMAEAKNVTLQNACYVDFWPSTLSNGDVPG YQGGSDAAASADGTDGTSDASADSSDGSTDSSDGSEDYSDGSSDDGSYDDGSSDDESYDD GSSDDSEDYSDDSVDYGDGTE >gi|226332870|gb|ACII01000149.1| GENE 13 15860 - 17140 1600 426 aa, chain - ## HITS:1 COG:SP0400 KEGG:ns NR:ns ## COG: SP0400 COG0544 # Protein_GI_number: 15900319 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Streptococcus pneumoniae TIGR4 # 39 365 94 426 427 120 28.0 7e-27 MKKKIYLAMLAVCFALTASACGDGGAVITDGTKTEGAADGKKAETGTTRLVSVENVEKYI TIGEYKGLTLDNTVDAITDDDVQAQIDEDLKDKAEPVSDAAKEGDLVTVNYTGTKDGQTF DGGTANNYDFVIGDGQMFEEFENGVIGMKKGDTKEIKIDFPSDYADETLAGKEVIYKVTV QNVRREGELTDEWVAKNTDYTTVDDYRESIRSELEKNAKESAQEVLKNTAWSTVLENSEV KEYPQEDVDKAVSEFKKSMEVYAKQADMTLEEFTDSQGISQDDFDEQCQQYAEGKVKQNL IVQGIMDAEGLSLDDKESLQLQDKLVEQMGVSSIAELVGTYGQDYVDESVGLLRVEEFII KNASVSEKVANGDVLADDADAAAENAEQDSDQNVSDEDTDDSGQDNSDVDENLEEELGTE DVDQSE >gi|226332870|gb|ACII01000149.1| GENE 14 17159 - 18613 1473 484 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0334 NR:ns ## KEGG: Cphy_0334 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 81 478 73 460 463 342 45.0 2e-92 MRKRIACVTAAILSISLGVSGCGLIPFEQDIPFLSSRAENVETEKKTEQQTIDEQDADTE KSTEDTQQDDGLNEGEQTDNEDTEEADEKQENPMEQAALMAVQYDYDGAIELLKSQPDYE SNTDMQSAVSDYENTKATCTEYPLEQITHVFFHTLIKDTARAFDGDSDSNGYNQYMTTID EFNKIIQSMYDKGYVMVSPHDMAVINEDGTMSRGSIMLPPGKIPFVLSQDDVSYYHYMDG DGFASKLVVDSNGEVKNEYIEDDGSVSTGDYDMVPLIDTFVKEHPDFSYHGRKGILAMTG YDGVLGYRTDIAYKTGKKLQDDQKKFLEEHPDFNYKQEVKNAKKVAKAMKAEGWEFASHT WGHKDVAATSLDDLKRDDKKWKKYVAPILGETDMIIFAFGADIGSWEGYSADNEKYEFYK SQGYRYFCNVDSSQYFVQITDDYFRQGRRNLDGYRMYYNPEMLSDLFDVSEVWDSSRPTP VPGM >gi|226332870|gb|ACII01000149.1| GENE 15 18650 - 20443 1791 597 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0334 NR:ns ## KEGG: Cphy_0334 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 162 590 37 460 463 322 40.0 
3e-86 MSELKPDSEKKEERLSDKEKVNTEKTNAEELNAEELNAEELNAEELNAEEPNAEELNAEE LNAEKPNVEKLNEEKPNTEEIDEEPPAASEDEDFFFDDNKAYEARRAARLERRNKLRRRK KRLRIAGCIVVVAAAAIGIGIGFYGEELQNFVSEQQVKIAQMVKSADRKTEEEKTAQQTN AEAEAENAVTPEVQDTDGLSEEDKTLYRQAKYAAKQYDYDKAISMLQNSETYQTSEKFQH AVKVYQKKKESCVSWPLDQVTHVFYHTLIKDTGKAFDGDYKSGDYDQVMTTIDEFNQITQ SMYDKGYVMVSIYDMATADENGNMNAGEILLPPGKVPFVLSQDDVCYYHYMDGDGFATKL IVDEEGKIRNEYVEDDGSISVGDYDMVPLIDRFVEEHPDFSYRGAKGIVALTGYNGILGY RTDSSYETRPDDLDADKVKWLDEHPDFNLNTERENAARVAQAMKDEGWLFASHTWGHQNV SQISLERLQADTQKFKENVDPLIGGTDIIIFAFGADLTSVEDYSGEKFEYLKSQGYNYYC NVDSSQYFVQIRSNYFRQGRRNLDGYRMYYNPELLSDLFDAQSVFDSSRPVPVPTMG >gi|226332870|gb|ACII01000149.1| GENE 16 20793 - 21248 654 151 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3788 NR:ns ## KEGG: Cphy_3788 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 141 1 140 151 137 55.0 1e-31 MSTVTIVLLVILVILIAALIGLYFFGKKAQKKQEEQQAQIETTKQTVSMLVIDKKRMPLK ESGLPQMVIEQTPKLMRRSKLPIVKAKIGPKIAIMIADEKVFDLIPLKKEIKAEVSGLYI VGVKGIRGSLQTPAPKKKGFFAKFKKDKTAR >gi|226332870|gb|ACII01000149.1| GENE 17 21372 - 21818 592 148 aa, chain - ## HITS:1 COG:CAC2894 KEGG:ns NR:ns ## COG: CAC2894 COG4506 # Protein_GI_number: 15896147 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 124 1 120 137 77 33.0 1e-14 MDKEVLIHVRGLQTMDADGEQEPLEIVVPGQYYFRNGSHYLRYEEVMEDFAEPTVNYIKI SRKGMEVRKKGVVNVHMVFEQGKKNMTYYTTPYGTIEMGIAATNLNLQESDGGLDMKVDY ALDMNQEHVADCYLAIKAQPKDCKNFTI >gi|226332870|gb|ACII01000149.1| GENE 18 21900 - 22721 883 273 aa, chain - ## HITS:1 COG:NMA2026 KEGG:ns NR:ns ## COG: NMA2026 COG0796 # Protein_GI_number: 15794906 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Neisseria meningitidis Z2491 # 3 270 4 269 270 236 47.0 3e-62 MKIDREAPIGVFDSGVGGLTVAREIMRNLPSEKIVYFGDTARVPYGSKSKETIIRYSRQI IRFLQQQQVKAIVVACNTASAFALDAVRDEFDIPIIGVIEPGAKVAAAQTRNKRVGIIGT 
VGTVGSGIHAEYLKHLDPEITVFGKACPLFVPLVEEGWLHDPVTDEVAARYLKELQDKQV DTLILGCTHYPLLRSTIRKIMGDGVCLVNPAYETALELGRLLEEKGLAGEGTEKNEFPYR FYVSDLADEFKAFANSILPYDVEMTKKIDIEKY >gi|226332870|gb|ACII01000149.1| GENE 19 22764 - 23036 279 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580903|ref|ZP_04858165.1| ## NR: gi|253580903|ref|ZP_04858165.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 14 90 15 91 91 147 100.0 3e-34 MNKKSKNKKLRRKLSFVLFVAAVGVLMMRAMPGMSGKIAKIAERRQEQTAVTAWWGTLYP KFCFSQFPEENKGQKDDIKISFWLAQALDW >gi|226332870|gb|ACII01000149.1| GENE 20 23170 - 24855 1206 561 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1002 NR:ns ## KEGG: EUBREC_1002 # Name: not_defined # Def: spore germination protein-like protein YndE # Organism: E.rectale # Pathway: not_defined # 3 504 11 543 578 75 21.0 5e-12 MRFAENNRISHRQLYRQMILALLAPFMLCVFGKGGMNGISAVTGMIFALILLGFYVIWLI RLTPSFEDPVKSAGAFAGRLIGIFFLIYVLMAGGYLLALLRRLVPVKLITGVSGRWIAFW AILVCSVGTCKGVQRRGRMAEVSGGLLLGGIMIMMILCVPQAKTEYLMGEIWWEELTVRN VSQSFYGTLCAFSPVALLPFLLGNVEKYGSAGKTVAGGILTLGGILIGMEILLPAVLGYD RVAAESYPVLPLLAGADLPGNVLARFDILWMGFLLYSLLFAIGSLLYYGNHIIGKSHPGT GRFWLPALVFLISLLEEEGKGILDYFGWCLAYIFVPGILICQFYMFIRGKGHRRKQRKRA VGVVTGILSVSLFMSGCGAAVEPEKRMYPMALGVDASEEGICLTYGMPDLSESTGQGKEE EDGGSRVLQISGADFTRIEKMYDQSQEKLLDMGHLQVLVMGRTLVEDGRWRMVLDYLKQE IFVGEDLYVFEAEDAGEILNWHGEDNSSAGEYITGLIRNRMSGGNITAVTLRELFYEKYK EDKILRLPIVKIRNGSLEVEV >gi|226332870|gb|ACII01000149.1| GENE 21 24852 - 26354 1394 500 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1001 NR:ns ## KEGG: EUBREC_1001 # Name: not_defined # Def: spore germination protein B1 # Organism: E.rectale # Pathway: not_defined # 9 494 23 510 515 474 47.0 1e-132 MKKLEESRKVSANLRENEKYLRSRLENCSDILIRPMRLGDKHKVDCLMVYIEVAVSNMML DDSALGKMINHFWEISPEDIQEFVRHNSLGIADVKKLENLDESIDAMLAGNAVFFIDGYD KAMKISSKGYPSTGVMEAESEKVLRGSREGFSDSVKSNSALVRKRLRDTRLKVEEYKIGV RSHTLTQVLYMDDLVHEGLLEEVKERLEEFQIDGILDSGMLEQLTEDVWYSPFPQYQTTE RPDRAVQEILKGKVVILCDNSPEALILPGNFSSFMESSEDWYHRFEMASFLRILRYLAVI MATVLPGLYLAVIRFHTQILPSALILSFAEAREGVPFSSVVELIFLELAFELIREAGVRV 
PGSLGNAIGIVGGLVIGQAAVEANLVSPIVVMIVALTALGSMTVPNEEFAAAFRLVKYGF LILGGYLGIYGIVLGVYLVIGHLAGLISFGIPYLVPFIKKEQKGSRGEGILRVPLRKRVL RPLYAREEQKIRLKRKESGS >gi|226332870|gb|ACII01000149.1| GENE 22 26522 - 27241 1074 239 aa, chain - ## HITS:1 COG:mll6782 KEGG:ns NR:ns ## COG: mll6782 COG1802 # Protein_GI_number: 13475658 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 9 210 8 209 224 100 31.0 2e-21 MTSDFSVNMNEYLPLRDVVFNTLRQAILKGELKPGERLMEIALAERLGVSRTPIREAMRK LEQEGLVVMIPRRGAQVANITEKDLNDVLEVRIALENVAIEKACARMTEEEMRRLWLAAK EFEHTIAEGNLVKLAEADVAFHEVIYQASDNKRLIQVLNNMREQIYRYRVEYLKEGETRD VLVKEHEELTKAIRERDVERAKQLSFQHIENQRMAIMRSIEAEDAERERAEKEKSRGHR >gi|226332870|gb|ACII01000149.1| GENE 23 27231 - 28109 889 292 aa, chain - ## HITS:1 COG:BS_yabH KEGG:ns NR:ns ## COG: BS_yabH COG1947 # Protein_GI_number: 16077114 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Bacillus subtilis # 4 269 6 270 289 243 48.0 3e-64 MRLRALAKINLGLDILRKREDGYHEVRMIMQTIQMYDVLEMKRVRKPGISLSVNYSYIPN DERNLVYKAAKLLMDEFQVKGGVDIHLEKFIPVAAGMAGGSSDAAAALVGINRLFKLGLS QKELMDRAVNIGADVPYCVMRGTALAEGIGEKLTRITQVPDCFVLIGKPGINVSTKAAYE SLQLDKISSHPDIDGMIGDIERGDLLAMTQKMGNVFEPGIIEKYPVIGEIKALMESHGAL KAMMSGSGPTVFGIFDDREKMEAAAEVLRESRLAKTVFATEVTKSGGNTNDK >gi|226332870|gb|ACII01000149.1| GENE 24 28269 - 29819 1359 516 aa, chain - ## HITS:1 COG:CAC2903 KEGG:ns NR:ns ## COG: CAC2903 COG1388 # Protein_GI_number: 15896156 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: FOG: LysM repeat # Organism: Clostridium acetobutylicum # 2 512 4 514 520 120 21.0 7e-27 MELKKESIQMLRIKNKAAAQATFDEDYNVPDVKPDIGRLVQSKADVSMEEVRLSEGRALL KGTLNADLLYVGEKEGRIYSLSAKLPLDEMINLEGIEGGDKLCLKWEIEDLSVHMIHSRK LNIKAIVTFYAVVDELAVVELPVSAEDQEVSVKTEKIRLMSLRVHKKDTLRIKDDITLAS NRPNVENLLWYMAEPRNLDLRPGENKLRVKGELAVFLLYTGYEEENPPQWLEYTMPFSNE MECSGCMEDLIPHIEVSLLHQGIEVKPDPDGEERILQVDVVLELNMKMYREEEHELLLDV 
YSPHKECVLHRKKEMLESLLVRNFSRCRLTDRIEVKESQGKVLQLCHSSGKVKVDKTRIT DKGIVAEGIVALKILYIIGNDEMPFYSMDAMLPFTHLIEAEEIGKECTYLLQADLEQLST AMADGDEIEVKAAVGLNVLVFRQWEEQLIESVEEQPLDRKKLESMPGITVYIVKAGDTLW DIAKKFYTTVDEISGVNSLTEKEVKPGQSLILVRQG >gi|226332870|gb|ACII01000149.1| GENE 25 29842 - 30234 284 130 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00153 NR:ns ## KEGG: EUBELI_00153 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 124 14 141 143 88 37.0 9e-17 MLLLALLPAGCQFIRIEEGERTPLEYTIVKQEEIPEEISELMEQKKKKVFQMTYQVGDIR YLMKGYGEQLTGGYSIQVEEVSESENAVFCKTRLIGSEKADAGSEPSYPCIVLKIRETEK PIEFLGFIIK >gi|226332870|gb|ACII01000149.1| GENE 26 30518 - 31264 517 248 aa, chain + ## HITS:1 COG:CAC0965 KEGG:ns NR:ns ## COG: CAC0965 COG0204 # Protein_GI_number: 15894252 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Clostridium acetobutylicum # 2 239 1 235 241 145 36.0 7e-35 MIRFILVCIVVIGYLILSIPILLVEWIIGRFSPEKKDISSLRIIQAVFRFILKITGAKIT VIGEENVPKDTPVLYIGNHRSYFDILLTYSRCPIRTGYIAKKEMEKIPLLSTWMRYLHCL FLDRKDIKQGLKTILTAVDKVKSGISICIFPEGTRNRNKDELDMLPFHEGSFKIATKANC PIIPIAISNSANIFEAHFPKISPAKVVVEYGKPIYPDELSKEDKRHVGEYTQNVIREMLI KNKPLTEN >gi|226332870|gb|ACII01000149.1| GENE 27 31492 - 32493 634 333 aa, chain - ## HITS:1 COG:CAC1956 KEGG:ns NR:ns ## COG: CAC1956 COG1961 # Protein_GI_number: 15895229 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Clostridium acetobutylicum # 1 298 119 429 531 97 26.0 3e-20 MLNICIVFAQLECETIQKRVSDAYYSRSQKGFRMGGKPPYGYRLEEILMEGIHTKKLVEE PGEAAIVREIFDMYEQPDTSYGDITRYYAEKGVQFYGKELIRSMLAQLLRNPVYVRADMD VYRFFRSHGTNIVSSPEQFDGIHGCYLYQGRDAQTDKLQNLKGHMLVVAPHEGLVSSEQW LNCRIKLMRNKTIQANRKAVNTWLAGKVKCGNCGYALMSIKIQSGKQYLRCTKRLNNKAC PGCGKVYTEDVENYVYKEMVRKLREGQSPAAYTKLKENPQVKQIYREIEEMEKVASNISL WGEIDFEEKRFIVDKMIRSLKVFPGSVQIQWKF >gi|226332870|gb|ACII01000149.1| GENE 28 32477 - 32779 273 100 aa, chain + ## HITS:1 
COG:no KEGG:DSY0184 NR:ns ## KEGG: DSY0184 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 18 100 137 219 433 92 50.0 7e-18 MQIFNIASIYKGVVNSDSFSVHDFDILTLYPLDFEEFLWANAEFALTREIRNHFSDFSPM GKKLHEKALSQFRLYLIIGGMPAAIMEYKKEKKLLLVPDT >gi|226332870|gb|ACII01000149.1| GENE 29 33333 - 34343 744 336 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01420 NR:ns ## KEGG: EUBELI_01420 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 279 1 279 321 244 46.0 3e-63 MLDTAEKNKIKPDTILKNFWRDNRHFADLFNAVFFNGEQVLKPQDLTEADTDVSSMLKFN GHAETVQKILDVVKKTAYGVDFVLWGLENQAKIHYAMPLRHMVGDAFSYMKEYDEIAAVN KKNGDFHSADEFLSGFKKTDRLHPVISLCVYYGESEWDGPFSLKDMLEIPEKIEPLVSDY RMNLVQVRSSGELCFSDPDVNMVFDVSRLIYARDYTKINEVYRDHNIPADLGLVIGAITE SQKLIDHALESEQKGGQINMCNALEELRQEGVEEGRQEGRQEGRQEGRWEGILEGIRATV RTCRNFNISETDTVRNIMNEFSLSQEEAVNYVKKYW >gi|226332870|gb|ACII01000149.1| GENE 30 34637 - 35531 906 298 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580932|ref|ZP_04858194.1| ## NR: gi|253580932|ref|ZP_04858194.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 113 298 1084 1269 1358 256 77.0 1e-66 MTKIKNTKKGMAKKTLSMSLVVAMLATSNVPVWAAEFSDDFDAAVATEAPVTEFSDNTIE TPAIEEETTEAPMAQAVVESDDLSVDVSVSESAEFTGVVSVSGTINASDLDKWGIDKQTI ANYDVTIKDGVATVINGTIKVPISEYTVTKNSDDTYTVAAKSTSKNYTGSKTVQAEGKAE NKKPDAPMISSVKVVGNKATVILSGASEGAAGYDYVISTDRDCITNKDYDSVNKNQVQTS TTFKYVQQGTYYAYCHAWKRDKNGKKVFSDWSNAYPFVVSAITPDAPVITNVKVSGST Prediction of potential genes in microbial genomes Time: Sat May 28 20:55:05 2011 Seq name: gi|226332869|gb|ACII01000150.1| Ruminococcus sp. 5_1_39B_FAA cont1.150, whole genome shotgun sequence Length of sequence - 5399 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 13 - 72 2.8 1 1 Op 1 . + CDS 93 - 1061 559 ## COG0860 N-acetylmuramoyl-L-alanine amidase 2 1 Op 2 . 
+ CDS 1134 - 1997 1088 ## COG1047 FKBP-type peptidyl-prolyl cis-trans isomerases 2 - Term 1995 - 2051 21.0 3 2 Op 1 . - CDS 2101 - 4512 1345 ## Ccel_2642 Ig domain protein group 2 domain protein 4 2 Op 2 . - CDS 4536 - 5249 641 ## gi|253580917|ref|ZP_04858179.1| predicted protein - Prom 5326 - 5385 3.9 Predicted protein(s) >gi|226332869|gb|ACII01000150.1| GENE 1 93 - 1061 559 322 aa, chain + ## HITS:1 COG:BH3665_2 KEGG:ns NR:ns ## COG: BH3665_2 COG0860 # Protein_GI_number: 15616227 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 118 315 6 174 180 81 28.0 2e-15 MKKRKSCFIWLLCILLLVTVLPAVDFTDVQAASVSSTFTGWKTFGGKKYYYKNGKKLTDL HKIGKYYYCFAADGTMFTGWHRIHNRFRYFGKQTGRMRINQTVNGRKINSKGVWTPVVVL DPGHSSVVAGGYEPLGPGSSQLKEKDTSGTQGVATGVEEYKLNLSIGLQLRTLLQKRGFK VVMTRTNSKVALSCIDRAKVANKAKADAYIRIHANGSDNSSISGALTICTTRNSPYISSM YRKNKALSEAVLNAYVSATGCRKEYVWETDSMTGNNWSKVPTTIIEMGYMSNPSEDRRMQ QSSYQKKMVRGIANGIENYLIK >gi|226332869|gb|ACII01000150.1| GENE 2 1134 - 1997 1088 287 aa, chain + ## HITS:1 COG:MA3137 KEGG:ns NR:ns ## COG: MA3137 COG1047 # Protein_GI_number: 20091955 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 2 # Organism: Methanosarcina acetivorans str.C2A # 132 287 22 181 181 84 36.0 2e-16 MKIAVTYDNGEIFQHFGKTESFKVYEVEDNKIVSSEVIGSNGTGHGALAGLLAEQGVNVL ICGGIGGGAQTALTEAGIELCAGAQGNTDQAVESYLKGELVSSGANCDHHHHEEGHSCGS HEEGHSCGDSCGGGCGGCGGSQPQLTGRNVGKTCRTHYRGTFNDGTQFDSSYDRGEPLEF ICGAGQMIRGFDAAVADMEVGQIIDVHLMPEEAYGMADPNAIFTMEIAQLPGSEDLAVGQ QVYLSNQFGQPFPVKVTAKDEKNITFDANHEMAGKELNFKIELVEVK >gi|226332869|gb|ACII01000150.1| GENE 3 2101 - 4512 1345 803 aa, chain - ## HITS:1 COG:no KEGG:Ccel_2642 NR:ns ## KEGG: Ccel_2642 # Name: not_defined # Def: Ig domain protein group 2 domain protein # Organism: C.cellulolyticum # Pathway: not_defined # 132 619 35 551 889 181 26.0 1e-43 
MKKKLLVLLLTSSMILMNFAPAYGAGDFTDSDNVTAVENPGSSDVDAIPDMGNAVNDEMS FSPEEFDNNGEFNDTEDEFTSEQTDDDFFSDEKEMPSVQEGDTLVANAGQGITAGTSTYS SKSSFGRRKALSQLQGMGINSGSYSWNWANPEYTSYYTDEAGNLHIVAWKDQTLYDAVCN SDLNVTNVTTVKLPLPLWGGFYAAPDGNFYVAAGQKNLNEDNSITAVRILKYSRAWKLLG ATDIGGGYTNMFEGIYIPFDAASLRMTQIGSTLIVHTGREMYGMEGIHHQSDITFVINTQ DMTLINSDMPYCSHSFNQFVVNDGSHVYFLDHGDAYYRGLILSSFSAYSGGYIAQDHAVN IFPFMGATGDNYTGCEVTGFSLAGNNLITVGKSVPHGFAVNGQTGYENLNKNIFMIITDK NSMASRFIWLTQYSPSGAEITLTEPKLIPVGNNQYAVLFSEETSEQSILHYLLMDASENV ILSKLYKNVTIQTDSQPILWGRNIVWVSGNYDNGNYDSSRTYLYEIPVVTTPLNGIALNQ TNLTIDEGNTQKLTPSFTPSNSDDVKDVVWTSSNPGVASVSEDGTIQGNGYGQAVITASA GDFQTQCQVTVKVSENNTPLTKPVLKLSQKSADQIHLTWKKVPGAKGYQIYCKTDSQSSY KRIKTLKTGAVSFDAAVVPGVTYSFKVRAYGTNASGKNKYSKFSAVKSRKAAVPAPSKVS CKMSNGGTEVSWKKVAGASGYVIYRNGSAAKTVKSSVSTWKDTKAYDSQTGMYWVYNYYV RAFKTVNGKRIYSKPTKTINLYS >gi|226332869|gb|ACII01000150.1| GENE 4 4536 - 5249 641 237 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580917|ref|ZP_04858179.1| ## NR: gi|253580917|ref|ZP_04858179.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 237 1 237 237 369 100.0 1e-101 MKKTSKKLLSFLLAFGMIIAMFAVTSATGWAADEHVPASATLVAYPKPATESLASLSSKV SGLKSSNKAVVTVKLSKSTYGTSQTYYTILAVPKKAGTATVSFKCQGKTYKIKVTVKKYV NPVKSVKIGATTVPGSRFKSSSETSLSYAKFAGKKVKTTVTLAKGWKLDKLYIYSGNNPV NGSMKPAIEYLQKGWMRSESVANGSKIPVAGGKGFRIMFTAVNSKTGIKERITIELR Prediction of potential genes in microbial genomes Time: Sat May 28 20:55:25 2011 Seq name: gi|226332868|gb|ACII01000151.1| Ruminococcus sp. 5_1_39B_FAA cont1.151, whole genome shotgun sequence Length of sequence - 14509 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 7, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 3930 3561 ## gi|253580932|ref|ZP_04858194.1| predicted protein - Prom 4119 - 4178 6.0 + Prom 4269 - 4328 6.5 2 2 Op 1 . + CDS 4391 - 4573 157 ## gi|253580918|ref|ZP_04858180.1| predicted protein 3 2 Op 2 . 
+ CDS 4649 - 4828 75 ## gi|253580930|ref|ZP_04858192.1| predicted protein 4 2 Op 3 . + CDS 4829 - 5386 327 ## COG4185 Uncharacterized protein conserved in bacteria + Term 5400 - 5462 2.5 + Prom 5502 - 5561 6.5 5 3 Tu 1 . + CDS 5596 - 6915 796 ## CDR20291_2878 hypothetical protein - Term 7612 - 7646 -0.4 6 4 Tu 1 . - CDS 7892 - 8614 188 ## gi|253580922|ref|ZP_04858184.1| predicted protein - Prom 8717 - 8776 6.6 7 5 Tu 1 . - CDS 8787 - 12722 3622 ## PRU_2359 CUB domain-containing protein - Prom 12813 - 12872 5.4 + Prom 12811 - 12870 7.2 8 6 Tu 1 . + CDS 13018 - 14040 759 ## EUBELI_01420 hypothetical protein + Term 14061 - 14101 1.7 + Prom 14179 - 14238 7.7 9 7 Tu 1 . + CDS 14272 - 14472 254 ## TDE0496 DNA-binding protein, putative Predicted protein(s) >gi|226332868|gb|ACII01000151.1| GENE 1 3 - 3930 3561 1309 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580932|ref|ZP_04858194.1| ## NR: gi|253580932|ref|ZP_04858194.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 1309 21 1269 1358 840 48.0 0 MTKIKNTKKGMAKKTLSMSLVVAMLATSNVPVWAAEFSDGSEASVATEAPAAETFSDDVA EAPVVEDTTEADTTDAATVTEAGDLILDDVTITKSATFGANEKVAVSGIIKYKDGSDVKE LKDYYFGWRVKGTTNAIFTDRVAKDNTTDKMSFIPDFADASSWTASQARHIDWSEYAGKT LELYVYNNENPDDINIAPTVIGETTINKCAIEGTLNLNKTSNDLEYNGKDFYYTDNTADG SDNAVKLVNSGAALTVKKGDTVPGVLGSTWTAATVLKYFDVAATKTAKNANEKLTVTATA KADSPYTGTISADVLTVQKRDFDANEFELSVADGLSYQYTGDIITLPTDKVTFKEKDDVL SGADLSAAVKKAVTTDKATGMKTVEVTLDADKLPNFNLTDSDKTQITTKANVEITKRDLS ADSTSITLRYGRVPKGTTVGQFANANLVFKDKSGTELKLTNNSDYNLIIKDPDNQTVTNT QSFDRVGTYTVTVFANEQSNPTCKNGQILEVRVASNVIATASFTNTYAPYYTGTELKPTK EELGKLVISNVNLAVGSETLKDDEWEITGYSNNVEASKYDKNGDITTYGYVEIKVKGDSS YAGQTYKVPFEIKPLIVSADSVTVPKTVTYNKGNGPASDYKVQLVVTAKDRTGKIVKGLS ADDYSIKYEYEDGYSDVPKNNVGQNELHDFIKATITIKNPNFAGDKGANTVVEMPQTTSA DKWTEIVEKAVTNDMIKINPSSYVYTGGNITPTYTVLDGAIALYDKADYGDKGEYEQVSI TNNKEVGTGTVTVKGVTGKYSGTASANFTITPANTSDVKVTFYKDDECQYTGRQVRPRTF KATLNGNDVTNQFEIVSYGENVSGKGTVVLKPVDGNKNFTGSNITVDFNIIKEYVNADLN VFNSNGVNVTVANSVSNVKGYKVTDEKAINYSGDSFDFDGTAKTFASEVLKNISKTDANG 
STTAVSKAKESDFEIKYVDNISGKNTGMKDSSNNSYNIAYVYVVAKDGTGYTGTKSFTTA DGTTIKGVVDYVAFAIKNVKFVKQNVYVQNATYAGGLPVKPSVLVQINGNTLVEGKDYTL TLKANGHNDPSGVNFVNVTDGKVYKVEVTGINGYTGSSETDLAWGIDKKDIKDCDVKVTN GVVTVMNGYIPVPTTEYTSKNNGDGTYTVTANSTSKNYTGSKSVKADGKAADEKPDAPMI SSVKVVGNKATAILSGDSEGAAGYDYVISTDRDCITNKDYDSVNKNQVQTSTTFKYVQQG TYYAYCHAWKRDANGKKVFSDWSNAYPFVVSAITPDAPVITNVTVSGST >gi|226332868|gb|ACII01000151.1| GENE 2 4391 - 4573 157 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580918|ref|ZP_04858180.1| ## NR: gi|253580918|ref|ZP_04858180.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 60 11 70 70 113 100.0 4e-24 MRTLKGRETKDKEAKNIFSYKPIMSNILKYTVAEYRDCSLEEIMNCIEGDTIRTGQSGAP >gi|226332868|gb|ACII01000151.1| GENE 3 4649 - 4828 75 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580930|ref|ZP_04858192.1| ## NR: gi|253580930|ref|ZP_04858192.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 59 1 59 59 86 98.0 6e-16 MCDVKKYEKIYEEIKCLEPEDTLQLTLEADSEEQREFYQMVGDFLLQKKQKQVIARNLF >gi|226332868|gb|ACII01000151.1| GENE 4 4829 - 5386 327 185 aa, chain + ## HITS:1 COG:CAC1491 KEGG:ns NR:ns ## COG: CAC1491 COG4185 # Protein_GI_number: 15894770 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 179 1 181 187 142 40.0 3e-34 MKKYIVLGGVNGAGKSTLYQILDNLKKMPRVNTDEIVKELGDWRNTSDVLKAGKIAVQLI DKYFSEGISFNQESTLCGKSIIKNFKRAKQLGYTIELHYVGVDSVETAKRRVAERVQNGG HGIPESDIEKRYFQSFSNLQYLLKECDLAVFYDNSNSFHRFAIFRRGKIVRLSQDIPQWF DYIIY >gi|226332868|gb|ACII01000151.1| GENE 5 5596 - 6915 796 439 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2878 NR:ns ## KEGG: CDR20291_2878 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 253 407 19 167 343 90 37.0 1e-16 MRKRSERLKKLSGLLLLCCMLLCMALCTGVNAEAAQTVREDTTFMNIKQWRAGNASCIEF DALVNTLGSKYVDVYRSEPSVKNGGELKRLDRFQVIGAVWNVIDDDRYACYGNAQKVICY SERPSARGDNLTFVDTTTKAGHEYSYQLRYSDYVFDPESENFGDDVRIISNTITVKPVLQ 
TPELYKCYTTDNKIVKLSWSYTNQADGYRIYRYDNGKWSYLKAVRKGSKLTAADKTAKTG KTYQYRILAYKNVNGKNIYSDKSAARKITLKSPTVKGDYSYGSVYGPYLDTAHLAQVRSV VQSFKLNYIRKGMSDYDKVLTAFNYLRSNCRYAYRGWQYNYANTAWGALVYGEAQCSGYA RGMKALCDAIGVDCRYVHANAKASNPSHQWNQVKVGGKWYILDAQSNGFLLGTDTWIKQA GMSWDTKGLPTCSKTDYKR >gi|226332868|gb|ACII01000151.1| GENE 6 7892 - 8614 188 240 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580922|ref|ZP_04858184.1| ## NR: gi|253580922|ref|ZP_04858184.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 240 1 240 240 459 100.0 1e-128 MLEEQLKELALIVKNDDLRKYAGNAFWNLIQAVISKDPVSGTLAGKDVKQIVFHMPTVIF WNKIKRFLQGTYKDYEEQVKMASQFNNDNEKYAEFVKKQIYLIDKLEDDMKVDCFAILTR CLLLQEIEISLYFKLAQIINQCTPFELEYIRKIGINEKQKNSAMVSSLYQYGLLEQDSDE TEVYYIFSGFGKALKGNCLNYGDDTKCEVFKTYNDVSPLSISEPALMGDIKQLFIEEVDS >gi|226332868|gb|ACII01000151.1| GENE 7 8787 - 12722 3622 1311 aa, chain - ## HITS:1 COG:no KEGG:PRU_2359 NR:ns ## KEGG: PRU_2359 # Name: not_defined # Def: CUB domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 541 892 574 903 1307 76 26.0 9e-12 MTKIKNTKKGMAKKTLSMSLVVAMLATSNVPVWAAEFSDGSEPSVTTETPVAETFSDDAA EAPVVEDNTADVNAANEAEVSGNQYSVTYTPISFTATGATSTPVEKNTMTWADGTLSTTV SVTDNKILTDAKVYATWKVDNVAVSALAGTTAATFANDVATITPTAYTVSSATANKTVAL YVYAVKDNQVVWSYTSDAITVNPKNTADVVKATAADVEYNGKIQQATPTLNPVSGNLPAE LDNIADYTIGYDENSDFVNVTDKPITITLTPKSSAYKGSITTTYKINPKNLSTSNAIADE MVATLKNTSYKFDGGNTNTDVLKVKKEDITLVDKATKKIDLSDYLKVDKNGYVNVTRDGK IELATPETGNKNYTISGAVGSDNTKIQSTNVPEVAARDLSSVTVTIDPVQKTNNKATLAA DQVHFYDKDSKKELALFNYVDIDIPNNAVEAGKYTVTITPKATTSSSKLTGKTTAELSIF ATDINTAVFKIGTNAKETTVPGSAGKKDTYNFNSITKYYTGEPVTFTTDEIGVPTVGANS TPLIKDSDYEITYSDNTNAGTAKMFLIGKGSYANSIKNYTFTISQTPVTDVKASEYVEKI NGAKPEDYKTAMGVVVKAQVPGTNKILTLTEGKDYTVTYAWGADGKSVDATVAVKANSNF DDPTDTSILKVNSKISNATLKSEYIKLKETSFTYNGQAVKPDFDVVIGGHIVNPNQYTAK YTNNVNAGTATLTVSANDNSDYQGTASITYTIQSADASKLKGVIGTQEYKGYTLEIAPDK IDLTLDGKKIDVESNFILTYGENVKVGEGTVTLTPKNKNFTGTKTLTFQISGEMLDGTNA TFAYVDKDGFAVASPTFAYDGTAHTCAKTSLTYTGKDLKEGTDYEIKYVDNVYGQKGKDK 
KQYAAVLAVAKGKFGGNLTTSDAASGISVKDGVYTDAAGNKITNVFKIDLIEITQEEITA SCVSVSNGTYAAGLPVKPSVKIVVKGRTLVEGTDYDLNVSANKDVINATEKQTLLVTVEP KNGYKLPNSVTLTYAWGIDKFDLANADVTVNGDKVTVKCGKVEVAADEYTVTKDAAANKV TVTATKGNKNYKGSKTVSAVVTDPTEKPATPMISSVKVTGNKATVILSGDSEGAAGYDYV ISTDRDCITNKDYDSVNKNQVQTSTTFKYVQQGTYYAYCHAWKRDENGKKVFSDWSGAYP FVVSAITPDAPVITNVTVSGSTIKVTYKAAANATGYDVVLGTDSKKENGETRPYHYGNYK KLNLKEGTVTATFKNVPKGTWVVGMHAFNRTSEDGKKVFSPWSNLKKATVK >gi|226332868|gb|ACII01000151.1| GENE 8 13018 - 14040 759 340 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01420 NR:ns ## KEGG: EUBELI_01420 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 279 1 279 321 246 47.0 1e-63 MLDTTEKNKIKPDTILKNFWRDNGHFADLFNAVFFNGEQVLKPQDLTEADTDVSSMLKFN GHAETIQKILDVVKKTAYGVDFVLWGLENQAKIHYAMPLRHMVGDAFSYMKEYDEIAAVN KKNGDFHSSDEFLSGFKKTDRLHPVISLCVYYGESEWDGPSSLKDMLEIPEKIEPLVSDY RMNLVQVRSSGELCFSDPDVNMVFDVSRLIYARDYTKINEVYRDHNIPADLGLVIGAITE SQKLIDHALESEQKGGQINMCNALEELRQEGVEEGRQEGRQEGRQEGRQEGRWEGILEGI RATVRTCRNFNISEADTIRNIMNEFSLSQEEAVNYVKKYW >gi|226332868|gb|ACII01000151.1| GENE 9 14272 - 14472 254 66 aa, chain + ## HITS:1 COG:no KEGG:TDE0496 NR:ns ## KEGG: TDE0496 # Name: not_defined # Def: DNA-binding protein, putative # Organism: T.denticola # Pathway: not_defined # 1 65 1 65 72 92 81.0 4e-18 MSTIGDYIKKERKKAGLTQEDFAIRSGLGLRFVRELEQGKETVRLDKVNQALAMFGKEAV PGKKED Prediction of potential genes in microbial genomes Time: Sat May 28 20:56:46 2011 Seq name: gi|226332867|gb|ACII01000152.1| Ruminococcus sp. 5_1_39B_FAA cont1.152, whole genome shotgun sequence Length of sequence - 4954 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 + CDS 2 - 292 123 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 2 1 Op 2 . + CDS 289 - 1248 464 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 3 1 Op 3 . 
+ CDS 1305 - 1520 276 ## gi|253580928|ref|ZP_04858190.1| predicted protein + Term 1660 - 1705 7.0 - Term 1604 - 1641 6.2 4 2 Tu 1 . - CDS 1733 - 3346 1198 ## EUBREC_2821 hypothetical protein + Prom 3336 - 3395 3.6 5 3 Op 1 . + CDS 3445 - 3642 72 ## 6 3 Op 2 . + CDS 3644 - 3823 66 ## gi|253580930|ref|ZP_04858192.1| predicted protein - Term 4136 - 4178 9.2 7 4 Tu 1 . - CDS 4293 - 4952 467 ## gi|253580931|ref|ZP_04858193.1| predicted protein Predicted protein(s) >gi|226332867|gb|ACII01000152.1| GENE 1 2 - 292 123 96 aa, chain + ## HITS:1 COG:HI0666 KEGG:ns NR:ns ## COG: HI0666 COG3550 # Protein_GI_number: 16272607 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Haemophilus influenzae # 2 95 16 105 106 71 35.0 5e-13 HFSGIIAETEEGYIFTYDQDYLDREDAVAVSLTLPLRQEPYETTILFPFFDGLIPEGWLL GVVSRNWKINQKDRFGLLLSACRDCIGDVCIRREKV >gi|226332867|gb|ACII01000152.1| GENE 2 289 - 1248 464 319 aa, chain + ## HITS:1 COG:SMa0592 KEGG:ns NR:ns ## COG: SMa0592 COG3550 # Protein_GI_number: 16262763 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Sinorhizobium meliloti # 46 283 109 350 390 95 34.0 9e-20 MKCLCCGKPITNSATNVEKEWCWHKKCVKRFFQTDELPILDITKEQLEILATETVNEGLT VPGVQKKLSLHLSTDLNARLTIVDYPTGYILKPQTEEFDNMPEFEDLAMRLAEIMGIQTV PHALIKMNDEYAYITKRIDREISEKETKLYAMEDFCQLSYRLTQDKYKGSYEQCGRIIKK CSATPGLDLSELFLRVVGSFVMGNSDMHLKNFSLKETEPGSRNFQLSKAYDMLPVNVIMP EDKEQLALTINGKKRNIHKKEFRILAESCGIPLNSAERMMKKVCSLKAKLFTQIEESCLS AEQKEQMEELISKRLDILS >gi|226332867|gb|ACII01000152.1| GENE 3 1305 - 1520 276 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580928|ref|ZP_04858190.1| ## NR: gi|253580928|ref|ZP_04858190.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 71 3 73 73 110 100.0 3e-23 MCNALEELRREGVLEGQREGRLEGRLEGIRATIGICKKFSISEEDIIRNIMEEFSLSQEE ASGYVKNISRF >gi|226332867|gb|ACII01000152.1| GENE 4 1733 - 3346 1198 537 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2821 NR:ns ## KEGG: EUBREC_2821 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 537 1 528 531 277 31.0 9e-73 MAMFLNTPLPYKRYKKISETTYFVDKSLILNDIFGCMEEETQYICITRPRRFGKTIMANM LGAFFGRVWDASQIFDHLKIADSPEYHQYLNQYDIIYIDFSRLPENCTSYRQYIDRIITG LKNDLAEAYPEYQTDSIYALWDILSQISEQTDRQFVFIMDEWDAVFHLPFITEKEKAEYI LFLKTLLKDQDYTALAYITGILPISKYSSGSELNMFMEFKMTSMEAFSEYFGFTDEEVDV LYEKYTHLCKKLQVPRNGLQNWYDGYFTASGKRLYNPRSVVMALRFNQLSNYWTSSGPYD EIFYYIRNDIDHIKNDLALMITGEGVDAKIDEFAASAMELKSRDQIYSAMVVYGLLTYSG GKVFIPNRELMFKYEELLQNEESLGYVYQLAKISDQMLKATLSRDTQKMSEILQYAHNTE SPILSYNNEVELSAIVNLVYLSARNKYRIEREDKAGKGFVDFIFYPWNLTDTCIILELKV DHSPEDALLQIREKDYLLRFQGKSGETRKYTGEVLGVGISYDKETKEHFCKVEVLSK >gi|226332867|gb|ACII01000152.1| GENE 5 3445 - 3642 72 65 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDNDIPENQQDRKIAVSAIKHLDQSKKRKNILIAVVSLFKENWLYEKFSLFYLEIRYIIF KYRWD >gi|226332867|gb|ACII01000152.1| GENE 6 3644 - 3823 66 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580930|ref|ZP_04858192.1| ## NR: gi|253580930|ref|ZP_04858192.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 59 1 59 59 87 100.0 2e-16 MCDVKKYEKIYEEIKCLEPEDTLQLTLEADSEEQREFYQMVGDFLLQKRQKQVIARNLF >gi|226332867|gb|ACII01000152.1| GENE 7 4293 - 4952 467 219 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580931|ref|ZP_04858193.1| ## NR: gi|253580931|ref|ZP_04858193.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 6 219 1 214 214 301 99.0 1e-80 PEGYELVLSGDLAINDGYVYAAVRKAVTTKGITVVYQDRKGNVIKTAPMTVDKDATYINT GKLTAPDGYTIGIVGDIKISQNNKVYITVDQNKKSVTVIFKESGKVVSSKKLSVVATAKY VNMSRITAPDRYQIVTTGSKLKISKNNTVTVEVKKLGKTIKVIYRLAKTNKIVVTGEIVV DKKATSIKASEVPVPAGYKMVTTGSCSVKTKKIEIYVKK Prediction of potential genes in microbial genomes Time: Sat May 28 20:57:36 2011 Seq name: gi|226332866|gb|ACII01000153.1| Ruminococcus sp. 5_1_39B_FAA cont1.153, whole genome shotgun sequence Length of sequence - 74725 bp Number of predicted genes - 58, with homology - 54 Number of transcription units - 28, operones - 12 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 178 195 ## - Prom 237 - 296 4.6 - Term 219 - 257 7.0 2 2 Tu 1 . - CDS 299 - 4315 4339 ## gi|253580932|ref|ZP_04858194.1| predicted protein - Prom 4388 - 4447 4.3 - Term 4571 - 4612 6.4 3 3 Tu 1 . - CDS 4623 - 5093 318 ## gi|253580933|ref|ZP_04858195.1| predicted protein - Prom 5180 - 5239 6.3 - Term 5244 - 5295 12.2 4 4 Tu 1 . - CDS 5339 - 7801 3012 ## COG0058 Glucan phosphorylase - Prom 7970 - 8029 5.7 - Term 7905 - 7944 4.1 5 5 Tu 1 . - CDS 8124 - 11279 3808 ## COG0060 Isoleucyl-tRNA synthetase - Prom 11347 - 11406 7.0 - Term 11387 - 11420 3.1 6 6 Tu 1 . - CDS 11505 - 11651 62 ## - Prom 11671 - 11730 7.5 + Prom 11787 - 11846 7.2 7 7 Tu 1 . + CDS 11932 - 13167 1037 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Term 13177 - 13230 7.7 - Term 12875 - 12905 1.1 8 8 Op 1 . - CDS 13138 - 14097 789 ## EUBELI_02047 sporulation inhibitor KapD 9 8 Op 2 . - CDS 14158 - 15282 1309 ## COG2768 Uncharacterized Fe-S center protein - Prom 15310 - 15369 4.2 10 8 Op 3 . - CDS 15372 - 15617 362 ## gi|253580940|ref|ZP_04858202.1| predicted protein - Prom 15665 - 15724 8.4 - Term 15767 - 15813 6.1 11 9 Tu 1 . 
- CDS 15851 - 17533 2437 ## COG1109 Phosphomannomutase - Prom 17572 - 17631 5.6 12 10 Op 1 49/0.000 - CDS 17634 - 18521 806 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 13 10 Op 2 6/0.000 - CDS 18518 - 19522 912 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 14 10 Op 3 44/0.000 - CDS 19522 - 20433 601 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 15 10 Op 4 5/0.000 - CDS 20426 - 21439 755 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component - Prom 21508 - 21567 5.0 - Term 21458 - 21490 5.4 16 10 Op 5 . - CDS 21586 - 23106 2264 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 23162 - 23221 5.5 - Term 23345 - 23395 13.7 17 11 Tu 1 . - CDS 23402 - 25438 2438 ## COG1190 Lysyl-tRNA synthetase (class II) - Prom 25459 - 25518 6.8 - Term 25767 - 25824 2.2 18 12 Op 1 1/0.000 - CDS 25838 - 26320 762 ## COG0782 Transcription elongation factor - Prom 26454 - 26513 8.8 - Term 26462 - 26502 3.3 19 12 Op 2 . - CDS 26546 - 27553 611 ## PROTEIN SUPPORTED gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase 20 12 Op 3 . - CDS 27558 - 27923 602 ## EUBELI_01674 hypothetical protein 21 12 Op 4 . - CDS 27962 - 28765 1073 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 22 12 Op 5 . - CDS 28765 - 29766 1184 ## COG0078 Ornithine carbamoyltransferase 23 12 Op 6 26/0.000 - CDS 29803 - 30744 1236 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 24 12 Op 7 . - CDS 30747 - 31181 493 ## COG1585 Membrane protein implicated in regulation of membrane protease activity 25 12 Op 8 . - CDS 31202 - 31975 263 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 26 12 Op 9 17/0.000 - CDS 31985 - 32638 736 ## COG0569 K+ transport systems, NAD-binding component 27 12 Op 10 . 
- CDS 32713 - 34095 1075 ## COG0168 Trk-type K+ transport systems, membrane components 28 12 Op 11 3/0.000 - CDS 34111 - 35109 1294 ## COG0205 6-phosphofructokinase 29 12 Op 12 . - CDS 35182 - 38649 3280 ## COG0587 DNA polymerase III, alpha subunit 30 12 Op 13 . - CDS 38708 - 39547 314 ## PROTEIN SUPPORTED gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains - Prom 39587 - 39646 8.5 - Term 39638 - 39681 7.2 31 13 Op 1 . - CDS 39782 - 40600 778 ## COG0657 Esterase/lipase 32 13 Op 2 . - CDS 40590 - 41651 314 ## gi|253580962|ref|ZP_04858224.1| predicted protein - Prom 41893 - 41952 8.7 - Term 42136 - 42188 19.0 33 14 Op 1 . - CDS 42237 - 42929 952 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 34 14 Op 2 . - CDS 43081 - 43785 767 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 35 14 Op 3 . - CDS 43840 - 43911 103 ## - Prom 43984 - 44043 8.0 36 15 Tu 1 . - CDS 44119 - 44829 626 ## COG3884 Acyl-ACP thioesterase - Prom 44880 - 44939 8.6 - Term 44886 - 44953 18.2 37 16 Op 1 40/0.000 - CDS 44975 - 47650 2070 ## COG0642 Signal transduction histidine kinase 38 16 Op 2 . - CDS 47735 - 48430 712 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 48507 - 48566 3.6 39 17 Op 1 . + CDS 49248 - 49892 553 ## EUBREC_2283 hypothetical protein 40 17 Op 2 . + CDS 49956 - 50318 424 ## COG1115 Na+/alanine symporter + Term 50345 - 50414 16.1 - Term 50344 - 50392 12.3 41 18 Tu 1 . - CDS 50436 - 51692 900 ## COG2942 N-acyl-D-glucosamine 2-epimerase - Prom 51880 - 51939 8.7 - TRNA 51773 - 51847 66.7 # Gln CTG 0 0 + Prom 52231 - 52290 3.3 42 19 Tu 1 . + CDS 52310 - 53773 1799 ## COG1488 Nicotinic acid phosphoribosyltransferase + Term 53865 - 53907 -0.4 - Term 53794 - 53859 16.5 43 20 Op 1 . 
- CDS 53869 - 55071 976 ## COG1408 Predicted phosphohydrolases - Prom 55145 - 55204 6.4 44 20 Op 2 . - CDS 55370 - 55918 483 ## COG2002 Regulators of stationary/sporulation gene expression - Prom 56013 - 56072 2.4 - Term 56017 - 56053 -0.7 45 21 Tu 1 . - CDS 56075 - 59509 2549 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family - Prom 59564 - 59623 2.5 46 22 Tu 1 . - CDS 59650 - 61674 2103 ## COG1874 Beta-galactosidase - Prom 61872 - 61931 5.7 47 23 Op 1 . - CDS 61981 - 62988 861 ## COG1609 Transcriptional regulators 48 23 Op 2 . - CDS 63066 - 63251 217 ## gi|253580977|ref|ZP_04858239.1| predicted protein 49 23 Op 3 . - CDS 63242 - 64030 798 ## COG1192 ATPases involved in chromosome partitioning - Prom 64193 - 64252 8.0 + Prom 64146 - 64205 7.8 50 24 Tu 1 . + CDS 64280 - 64942 570 ## COG1272 Predicted membrane protein, hemolysin III homolog - Term 64955 - 65011 17.1 51 25 Op 1 . - CDS 65063 - 66034 1065 ## COG5263 FOG: Glucan-binding domain (YG repeat) 52 25 Op 2 . - CDS 66063 - 66305 148 ## COG2827 Predicted endonuclease containing a URI domain - Prom 66361 - 66420 4.3 - Term 66518 - 66570 1.2 53 26 Op 1 1/0.000 - CDS 66591 - 68030 1238 ## COG1621 Beta-fructosidases (levanase/invertase) 54 26 Op 2 7/0.000 - CDS 68027 - 69931 2192 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 70180 - 70239 9.2 - Term 70310 - 70353 5.3 55 26 Op 3 . - CDS 70366 - 71358 919 ## COG1609 Transcriptional regulators - Prom 71399 - 71458 5.3 - Term 71561 - 71595 5.0 56 27 Op 1 . - CDS 71616 - 72509 800 ## COG0739 Membrane proteins related to metalloendopeptidases 57 27 Op 2 . - CDS 72571 - 73404 229 ## COG2385 Sporulation protein and related proteins - Prom 73429 - 73488 10.9 58 28 Tu 1 . 
- CDS 73608 - 73853 174 ## - Prom 73966 - 74025 6.0 - TRNA 74075 - 74147 86.6 # Lys TTT 0 0 - TRNA 74178 - 74250 83.5 # Phe GAA 0 0 - TRNA 74255 - 74328 73.0 # Met CAT 0 0 - TRNA 74350 - 74422 82.3 # Thr TGT 0 0 - TRNA 74440 - 74521 53.8 # Tyr GTA 0 0 - TRNA 74527 - 74599 81.0 # Val TAC 0 0 - 5S_RRNA 74542 - 74593 92.0 # AF302131 [D:490..741] # 5S ribosomal RNA # Streptococcus agalactiae # Bacteria; Firmicutes; Lactobacillales; Streptococcaceae; Streptococcus. - TRNA 74642 - 74715 82.7 # Asp GTC 0 0 Predicted protein(s) >gi|226332866|gb|ACII01000153.1| GENE 1 1 - 178 195 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGLKERRKKIMALMLAACVAGTSVVPVAASDVSVFSDGTDASPEAFDSADVDSANGAAT >gi|226332866|gb|ACII01000153.1| GENE 2 299 - 4315 4339 1338 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580932|ref|ZP_04858194.1| ## NR: gi|253580932|ref|ZP_04858194.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 1338 21 1358 1358 2147 100.0 0 MTKIKNTKKGMAKKTLSMSLVVAMLATSNVPVWAAEFSDGTDASVATEADAFAAEAPVVD DTEDAAPASVMVNEADITLNLSADSKKVVWGGASVISGTVTKTDGSEVEGGWQYRWVDEK GIAYASSNGKVDDAADMSLNTTADMAGKTLTLYVYDVDSNQQTLYDINTGISVAVEKRSL NSVTATPATTTKAYNGFAQTITEADGVALTVLDSEKANVAADAYETSVTNATNNKEEFKL TVTAKEDSAYTGSAELVIGKVTAKAYAKGDILATATKGQSYQYTGKAIKISKDNITLKES ASSNKGADGKLSGADLSAAVTTAQVEYTKVGDYPVVVYVDKTKLKNFTDADEYYTTNETV NVAKRDLSTAGTKISVKTLSGQIPTSIAVKDLANYLAFTGVEGSDLSAIKNSNKEYTLIV KNADGTDATEFKNGSTYTVTVRAKDGGNCTGEQTFTVVATAAALNAVSAKTQYTTPYTGD EIKPSKSDLGDLVIQYYNATGVPKTEDLGTDGYEITGYTNNVNATTVYDGNYNPVANAYV NIKITSGTYKDQTVSIPFLIKPLVVEADYVTVPDDVSINKALTEAGEYKLPVTVVAKDET GKKVEKTLTDSDFTVKYEYENKDIANGLFNNIITEVTVTNTNYILKTTNGKTRTVTASGK TEIVPKKLTDAMVVATPSTYTYTGGKIQPTYAVLDGAIALYKKGEVNDKDAEYEEVSISN AVNVGTGTITVQGYNNFYSGKATGTFTITAANTADVKVSIADQDYTGKQVRPRKFTATLN GNDVSNQFEIVSYGENVEAGVGTVVLKPLDGNKNFTGSNITASFNIVKEKVEATLNVYDS KGFDVTDAYTADVDKDADGNVNGVVHNSQTAFTFDGNEHTFAKALLSNIRKVTDDGVKTT AKESDFEIKYADNITGKKVTAGKENIGYIYAVAKEGTGFAGTKTIVTSDGTIINGVVAYI PFTIKSVEFVAKNVSVKNATYAAGLPVKPEVLIQIGGSTLVEGKDYTLRLIDKDGNTVTP 
IDVTVGDIYGVEIKGINGYTESGVATFDTDDVDSTAYSRANLTWGVDKKNINDCTVTVKD GVTTVLNGYLPVPATEYTSEKNTDGTYTVKANSTSKNYTGSKTVVADGKAEDEKPDAPMI TKVNVTGNKATVVLSGDTDGAAGYDYVISTDRDCITNKDYDSISKNQVSTSTNFKYVQQG TYYAYCHAWKRDENGKKVFSDWSNAYPFVVSAITPDAPIITSVKVSGTTVKVTYKAAANA TGYDVVLGTSSKKENGETRPYHYGDHKILNLKEGTVTATFKNVPKGSWTVGMHAFNRTSE NGRKVFSPWSNLKKITVK >gi|226332866|gb|ACII01000153.1| GENE 3 4623 - 5093 318 156 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580933|ref|ZP_04858195.1| ## NR: gi|253580933|ref|ZP_04858195.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 156 1 156 156 214 100.0 1e-54 MQKLLFLLPFYVIRYEKSREDIEKDPEKLQKLLDEYVTISRRLEESLLENGKEALHRYLV EVIIKIADYIFSDSEKTKKGVDRAMGGEVLELKTDKLINEIRQDVKEEVKEEIKTETKEK IVICMLKRGKSTLEEIAADTQMPIEQVERIKKEQQL >gi|226332866|gb|ACII01000153.1| GENE 4 5339 - 7801 3012 820 aa, chain - ## HITS:1 COG:BH1084 KEGG:ns NR:ns ## COG: BH1084 COG0058 # Protein_GI_number: 15613647 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Bacillus halodurans # 9 817 6 808 815 827 51.0 0 MTNEMFKKEAFKKSVKDNVKFLYRKTIEEATQEQIFQAVSYTVKDVIIDNWLETQKAYDE QDPKTVYYMSMEFLMGRALGNNLINLCAYGEVKEALDELGFDLNCIEDQEPDPALGNGGL GRLAACFLDSLATLNYSAYGCGIRYHYGMFKQKIENGYQIEVPDNWLKNGYPFELRRPEY AKEVHFGGFVRVEYDPEKGGNKFIHEGYQAVKAIPYDMPITGYDNDVVNTLRIWDAEPIV DFELDSFDKGDYKKAVEQENLARNIVEVLYPNDNHYAGKELRLKQQYFFVSASLQAAIAK YKKKHDDIHKLYEKVTFQMNDTHPTVAVAELMRILMDEEGLGWDEAWEVTRKSVAYTNHT IMSEALEKWPIELFSRLLPRVYQIIEEINRRFILEIQAKYPGNYEKIKKMAIIYDGQVKM AHLAIAAGYSVNGVARLHTEILKNQELKDFYEMMPQKFNNKTNGITQRRFLLHANPLLAD WITEHIGPDWITDLPQLKKLAVYADDDKALQEFMNIKFKNKERLAKYILEHNGVEVDPHS IFDVQVKRLHEYKRQLLNILHVIYLYNQIKMHPEMEFYPRTFIFGAKASAGYATAKKIIK LINSVADVVNNDASINGKIKVVFIENYRVSNAEWIFAAADVSEQISTASKEASGTGNMKF MLNGAPTLGTMDGANVEIVEEVGAENAFIFGLSSDEVINYENNGGYDPNVIYNTDEEIRQ VLMQLINGTFSNDTELFRDLYDSLLNTKNTDRADRYFILADFRSYADAQKRVEAAYRDEK GWAKKALLNTACSGKFTSDRTIQEYVDDIWHLDKVIVRKK >gi|226332866|gb|ACII01000153.1| GENE 5 8124 - 11279 3808 1051 aa, chain - ## HITS:1 COG:CAC3038 KEGG:ns NR:ns ## COG: 
CAC3038 COG0060 # Protein_GI_number: 15896289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 1050 1 1033 1035 1212 55.0 0 MYQKVDTNLNFVDREKKVEEFWKENHIFEKSMENRKEGETYTFYDGPPTANGKPHIGHVL TRVIKDMIPRYRTMKGYMVPRKAGWDTHGLPVELEVEKKLGLDGKEQIEEYGMEPFIKQC KESVWKYKGMWEDFSSTVGFWADMEHPYVTYYDDYIESEWWALKEIWNKKLLYKGFKIVP YCPRCGTPLSAQEVSQGYKTVKERSAVVRFKVVGEDAYFLAWTTTPWTLPSNVALCVNPD ETYCKVKAADGYTYYMAEALLDKVLGKLAKEEGEKAYEVLETYKGTDLEYKAYEPLFACA GEAAAKQKKKAHFVTCDNYVTMSDGTGIVHIAPAFGEDDSRIGRNYELPFVQFVDGQGNL TKETPYAGVFVKKADPMVLTDLDKEGKLFDAPKFEHDYPHCWRCDTPLIYYARESWFIKM TAVKDDLIRNNNTINWIPESIGKGRFGDWLENVQDWGISRNRYWGTPLNIWQCECGHMHS IGSRQELFEMSGDERAKTVELHRPYIDEITLKCPECGGEMHRVPEVIDCWFDSGAMPFAQ HHYPFENKELFEQQFPANFISEAVDQTRGWFYSLLAESTLLFNKAPYKNVIVLGHVQDEN GQKMSKSKGNAVDPFDALNKYGADAIRWYFYINSAPWLPNRFHGKAVVEGQRKFMSTLWN TYAFFVLYADIDNFDPTKYELNYDQLPVMDKWLLSRLNTTVQAVDNDLANYKIPEAARAL QEFVDEMSNWYVRRSRERFWAKGMEQDKINAYMTLYHALVTIAKTAAPMIPFMTEDIYQN LVRSVDKDAPESIHLCDFPTVNEAWIDKDLEADMKELLEIVVLGRACRNTANIKNRQPIG TMYVKAEKKMSEFYTDIIADELNVKEVKFADDVESFISYSFKPQLRTVGPKYGKLLGGIR QALTDINGTAAMNELRTNGVLKLDINGNDVELTEEDLLIETAQTEGYVSESDGETSVVLD TNLTPELIEEGFVREIISKIQTMRKEAGFEVMDKIVVYAHGNDKIQDVMKAHEDEIKSEV LADEMVLGETDGYVKEWNINKEAVTMGVKKL >gi|226332866|gb|ACII01000153.1| GENE 6 11505 - 11651 62 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTKTLMERVTAGESRQRTGEGVSPVVSKCSEDHSGVAALKGVQTGKS >gi|226332866|gb|ACII01000153.1| GENE 7 11932 - 13167 1037 411 aa, chain + ## HITS:1 COG:BH2877 KEGG:ns NR:ns ## COG: BH2877 COG1686 # Protein_GI_number: 15615440 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 64 313 42 284 395 174 40.0 2e-43 MFKTKFHKLMKRFLTGMLCAGILTSQPAVTYAAETSATPEAGHSETYSQASDIDSVKGWP TGPNIEGQSAVLMDAVTDTILYSKNAKDKLYPASITKIMTALLACEYLNMDDTITMSQEA AYGIEAGSSSIYAETGEVFTVEQALMALMLESANEMALALAEKTSGSVKKFVELMNQRAA 
QLGCKNTHFNNPNGLPDETHYTTAGDMMKIAKAAWYNPRFRKFVTTQVYEIPPTNKQSET RYLLNHHKMMPGQSYAYDGVLGGKTGYTDAAGSTLVTYAKRGNSILIAVVLNSTNGAFPD TTSLLDYGFDNFEKVDLNIDTDPVPAVFLPCEKHLLKDWNNLCSFYYMRHVYVTVPTGTD VSQLVKKQKLLNNSVGPKRIKSKYYLDGHMVGYGMQYEKEILSDFLLNASF >gi|226332866|gb|ACII01000153.1| GENE 8 13138 - 14097 789 319 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_02047 NR:ns ## KEGG: EUBELI_02047 # Name: not_defined # Def: sporulation inhibitor KapD # Organism: E.eligens # Pathway: not_defined # 1 302 3 304 324 282 42.0 1e-74 MNYIVFDLEWNQNPSGKKTRNDRLPFEIIEIGAVKVNSKKEITDHFHRLIKPQVYKWIHD SIHEVIHVDYKDLMKGVPFEQAVREFIDWCGEDWYFFTWGNQDVMELQRNMKFYGLLDLL PGPVTYYDVQKLYSISYDDGTHRCALEHAIDELKIEKSRGFHRALADAWYTAKVLEKINN IIIINHPSLDVYQNPKKKKDEIHISYPDHDKYVSREFATRERIMKDREVTSTRCPVCHLP AKRKLRWFMNNPKVYYSISNCEEHGLIRGKIRIRKTEDEKYFAVKTLRFTDTEEAEELRE RKEVLRLRRKLKRSVEKEI >gi|226332866|gb|ACII01000153.1| GENE 9 14158 - 15282 1309 374 aa, chain - ## HITS:1 COG:TM0034 KEGG:ns NR:ns ## COG: TM0034 COG2768 # Protein_GI_number: 15642809 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S center protein # Organism: Thermotoga maritima # 4 370 3 349 357 328 48.0 1e-89 MEKSTVYFTDFRCPVGTSQLDKLKKLCVTAGIKDIDMDGKFVAIKMHFGELGNLAFLRPN YAKTVADLCKEQGGLPFLTDCNTLYPGSRKNALEHLECANLNGFNSISTGCQIIIGDGLR GTDEVEVPVVNGEYCQTALIGHAIMDADIFISLSHFKGHEATGFGGALKNIGMGCGSRAG KMKQHASGKPAVNEELCRGCRRCAKECGSDAITYPNKKAVIDYDKCKGCGRCIGACGFDA VYNPNSSANELLDRKMAEYAQAVCHGRPHFHVALVQDISPNCDCHGENDAPILPDIGMFA SFDPVALDQACADACLKATPIENSQLGEHLAKPDWHCHHDHFKDSNPNIEWQATLDQAEK IGLGTRQYELKIVK >gi|226332866|gb|ACII01000153.1| GENE 10 15372 - 15617 362 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580940|ref|ZP_04858202.1| ## NR: gi|253580940|ref|ZP_04858202.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 81 7 87 87 137 100.0 2e-31 MEEYEVISIPVSDGTERDFAIMDTFEFEGQGYLAVSLIEGDEIQEGVYIYRYHNAEDGDV VVEQITLPAEYKRVTKFYEQM >gi|226332866|gb|ACII01000153.1| GENE 11 15851 - 17533 2437 560 aa, chain - ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 2 551 3 562 575 468 42.0 1e-131 MYQEEYKRWLAADLQDADLNPELSKIEGNDEEIKDRFAVALKFGTAGLRGVLGAGTNRMN IYVVRQATQGLANWVKTQGGNQTVAISYDSRLKSDVFAKTAAGVLAANDINVRIYDALMP VPALSFATRYYECNAGIMVTASHNPAKYNGYKAYGPDGCQMTDDAAAIVYEEIQKTDVLT GAKYMSFAEGVEQGKIRFVGDDCKQALYEAIESRQVRPGLCKTAGLKLVYSPLNGSGLVP VTQVLKDIGITDITIVPEQEYPNGYFTTCSYPNPEIFAALELGLNLAKETGADLMLATDP DADRVGIAMKCPDGSYELVTGNEVGVLLLDYICAGRIEKGTMPKNPVAVKSIVSTPLADA VAEHYGVELRSVLTGFKWIGDQIAQLEAADEVDRFIFGFEESYGYLAGPYVRDKDAVIGS MLICEMAAYYRSIGSSLKQRMEEIYAQYGRYLNKVDSFEFPGLSGMDKMSALMQGLRDKP LTEIAGHTIVKVTDYQKPEETGLPAANVLIYKLDNGETVVVRPSGTEPKIKIYYTTLGKN LEEAEAEKEKLAEALKPIMA >gi|226332866|gb|ACII01000153.1| GENE 12 17634 - 18521 806 295 aa, chain - ## HITS:1 COG:BS_appC KEGG:ns NR:ns ## COG: BS_appC COG1173 # Protein_GI_number: 16078205 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 11 291 19 300 303 171 34.0 1e-42 MTNHLFELAGNRKTETVIPKSKKKWYQGKPVVSAAILILIVLGCLCAELIMTKDPAYMDL LNYNKAPDREFLFGTDTMGRDIFSMIWYGGRISILIGGLATVISTFIAVVVGAFSGVAPA WLDELIMRFTEIFLSIPTLLLIILLQAIMGDANILSLSFVIGVTSWTSIAKVVRTEVRQI RNSEYIIAARCMGAGFFRILWKHLVPNFFSSIMFMVVMNVRTAMISEATLSFMGIGLPIE VITWGSMLSLSDKALMTGSWWIILIPGLFLITTVLCLTNIGNACRERTNRKESNF >gi|226332866|gb|ACII01000153.1| GENE 13 18518 - 19522 912 334 aa, chain - ## HITS:1 COG:lin0183 KEGG:ns NR:ns ## COG: lin0183 COG0601 # Protein_GI_number: 16799260 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # 
Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Listeria innocua # 11 322 7 314 316 176 32.0 6e-44 MRQKKLVYYGKKCLIFLISVFILSVAVFYVARLAPGDPLISYYGDRTEKMSPEEREWAMG KLGLNEPISVQYVKWVKNAFHGEFGISFKYKQDVVEVIGGRIGNTLVLGGIGFIIMFAGA LLLGILCAWYENRLADRILCKLGTISSCIPEFWMALVLILVFSVNLKILPSSGAYSTGNA GDIGDRVLHLIMPLTIVVMEHLWYYAYMIRNKILEEVRADYVLLAKAKGLDKKKLMFRHC VRNVMPAYLSIMAIAVPHVLGGTYVVESVFSYPGIGALSYESARYHDYNLLMLLCMMSGI LVIFCNIVSQTINEQIDPRMKDEVITEKSEVVEG >gi|226332866|gb|ACII01000153.1| GENE 14 19522 - 20433 601 303 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 24 280 35 288 329 236 47 3e-61 MSKEILLDVRHLTQQFHLTKKIKVKAVDDVSFQIHKGEIFGLVGESGSGKSTVARCLMNI YQPAQGEIWYKDVNICDKKEFKKHKKMLQTRRQMIFQDSASSLNQKMKVSDIITEPMRIQ HITPPRGSFVKEAAFQMEYVGLENSFLDRYPSELSGGQRQRVAIARSLCMEPEFLVADEP IASLDVSIQAQIVNLFRHLQEEHGFTFLFIAHDLAMVEYLCDRVGVMYHGKLVELAPSEE LYSNPVHPYTKTLLSAIPVPDPIRERKRVLQYYDGEGMENSVWTEVSPGHFAMLPAEGKG CSI >gi|226332866|gb|ACII01000153.1| GENE 15 20426 - 21439 755 337 aa, chain - ## HITS:1 COG:lin2297 KEGG:ns NR:ns ## COG: lin2297 COG0444 # Protein_GI_number: 16801361 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Listeria innocua # 1 330 1 330 358 425 62.0 1e-119 MEHLLEVKNLSVSIDGPAGEVQAVRDVSFSLKRGEVLAIVGESGCGKSVLCKSIMKLLPP SAKIKEGSICVNGQDITCYREREMQKLRGRVFAMIFQDPMTSLNPTIKIGKQIGEAVVIH NKNYTKEQVNKKVLELMELVGISHPKERYHQYPWQFSGGMRQRCVMAIALAADPDILFAD EPTTALDVTIQAQILDLLREIQMKLGTATILVSHDLGVVARVADRVAVMYAGKIVEIGTV EEVYYDPRHPYTWGLMRSLPAFANGKESLYTIPGMPPALIDPPKGDAFACRNEYALNIDY EEMPPMFKITDTHYAATWLLDPRAPQIKSPVGGIVHE >gi|226332866|gb|ACII01000153.1| GENE 16 21586 - 23106 2264 506 aa, chain - ## HITS:1 COG:Cj1584c KEGG:ns NR:ns ## COG: Cj1584c COG0747 # Protein_GI_number: 15792889 # Func_class: E Amino acid transport and metabolism # 
Function: ABC-type dipeptide transport system, periplasmic component # Organism: Campylobacter jejuni # 23 504 21 510 511 310 34.0 5e-84 MAGVLLTVSLVSAGTTVPVLAEEETTLVYGSGDYTRINPAMDEHGEINILIFNGLTAHDG DNQVVPGLAESWDFDDETNTYTFHMAEDAKWQDGEPVTAEDVKFTIEAIMDPENGSENAP NYEDVEEINVIDDHTVAFKLEDKNVAFLDYMTMAVLPKHLLEGEDMQTSDFFRNPVGTGP YKIESWDEGQAITLVKNEDYFKGEPSIDKVVFKIVPDDNAKALQLKSGELDLALLTPKDA AAFADDEAYTCYDMKTSDYRGIMFNFGNEYWQKNRDLIPAVCYGLDRQAIIDAVLLGQGM PAYGPLQRNIYNYEDVEHYDYNPEKAKEILEAAGCEMGDDGFYYRDGEKVGFVISVSAGD QVRIDMAQIAAQELKEIGMDVSVEIPAQTDWAGQMAFLIGWGSPFDADDHTYKVFGTDKG ANYSGYSNADVDKYLTEARQSADPEVRAEAYANFQKALAEDPAYAMICYIDANYVADSNI KGIDPDTIMGHHGVGIFWNVADWTIE >gi|226332866|gb|ACII01000153.1| GENE 17 23402 - 25438 2438 678 aa, chain - ## HITS:1 COG:SA0475 KEGG:ns NR:ns ## COG: SA0475 COG1190 # Protein_GI_number: 15926194 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Staphylococcus aureus N315 # 8 530 3 495 495 532 53.0 1e-150 MAQQKQEQDLNQLLKVRRDKLADLQANGKDPFQITKFNQTHHSMEVKSLYEAHEAELLKD RAEVDVTGLDEEQAKEAVKKDYEERREIMDASPIHVAIAGRMMFKRVMGKASFCNIQDLQ GNIQVYVARDAIGADSYADFKKSDIGDIFGLEGFAFRTRTGEISIHAESMTLLSKSLQIL PEKFHGLTDTDMRYRQRYVDLIMNQDSKNVFIKRSQILKEIRNFLADRDFMEVETPMLVA NAGGAAARPFETHYNALNEDVKLRISLELYLKRLIVGGLERVYEIGRVFRNEGVDTRHNP EFTLMELYQAYTDYEGMMELTESMFRYLAEKVCGSAKISYNGIEIDLSKPFERLTMNDAI KKYAGIDFDEVADDEAAKKLADEHHIEYEERHKKGDIINLFFEEYCEKELIQPTFIMDHP IEISPLTKKKPSDPNKVERFELFINTWEMCNAYSELNDPIDQRERFKAQDALADAGDEEA NHTDEDFLNALEIGMPPTGGIGYGIDRLVMLLTDSQAIRDVLLFPTMKSQGAAKNEANNV AQAKTVSEKPVEKIDFSKVEIEPLFKDEVDFETFSKSDFRAVKVKECVAVPKSKKLLQFT LDDGTGVERTILSGIHAFYEPEELVGKTLIAITNLPPRAMMGIESCGMLLSAIHEEEGEE KLHLLMVDDHIPAGAKLY >gi|226332866|gb|ACII01000153.1| GENE 18 25838 - 26320 762 160 aa, chain - ## HITS:1 COG:CAC3198 KEGG:ns NR:ns ## COG: CAC3198 COG0782 # Protein_GI_number: 15896445 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Clostridium acetobutylicum # 2 157 3 158 158 136 58.0 2e-32 
MEEKKNLLTYAGLKKLEEELHDLKVVKRKEVAEKIKEAREQGDLSENAEYDAAKDEQRDI EARIEEIEKILKNAEVVVEDEVDLDKISVGCKVKVHDFEFEEDIELKIVGSTEANSLEGK ISNESPVGKALIGAHTGDVVEVEMPAGIMKYKVLEIQRNV >gi|226332866|gb|ACII01000153.1| GENE 19 26546 - 27553 611 335 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase [Haemophilus influenzae 3655] # 16 321 25 330 353 239 41 2e-62 MQTRNEPVRAREPFMEMQWKIGNVQIDNPFVLAPMAGVTDLPFRTLCKEQGAGLICMEMI SAKAISFHNKNTIALMEIDPCEHPVSMQLFGSEPDLMAEVAKSIEDRDFDILDINMGCPV PKVVNNGEGSALLKNPNLIEEIVRKVSSAISKPLTVKVRIGFENEPVDIVEIAKRIEDAG AAALAVHGRTRQQYYSGTADWDTIRRVKEAITIPVIGNGDVDSPLKAEALVKQTGCDAVM VGRAVRGNPWLFRELNHYFRTGELLERPSVEEIREMILRHARKQIELKGEFVGIREMRKH VAWYTAGMRHSAGLRRESNTIESYEALEALLSRLA >gi|226332866|gb|ACII01000153.1| GENE 20 27558 - 27923 602 121 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01674 NR:ns ## KEGG: EUBELI_01674 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 106 3 108 110 124 61.0 1e-27 MYQDNITLCGASAYEKKFYFNQDFNALPDHVKKELQIMCVLYTEDVGGILTLEFDENGRL QFKTEALEADARYDEIGSGLKIKQLQQDKKELLESLEMYYKVFFLGDIPEEELKKTSGSE D >gi|226332866|gb|ACII01000153.1| GENE 21 27962 - 28765 1073 267 aa, chain - ## HITS:1 COG:MA0676_2 KEGG:ns NR:ns ## COG: MA0676_2 COG0340 # Protein_GI_number: 20089561 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Methanosarcina acetivorans str.C2A # 10 256 1 246 254 160 35.0 3e-39 MAHLYNEETISQAINTKWAGKTVHFAKETDSTNSWIKRLAKDGAEHGTLAVAEFQSAGRG RFDRRWEAPEGSSIMMTLLLRPEFSPQYASMLTLVMGMAVAQAAEELGFNVSIKWPNDIV ISKKKICGILTEMGTNGVKINYVLIGVGINVNLKEFPEEMQDKATSLILEGGHEYDRNQV IALVMKYFEINYEKFIQTCDFTHLLDDYHRILANLNQPVRVIDGDRSFEGICRGIDEKGE LLVERQNKEVVKVSAGEVSVRGLYSYV >gi|226332866|gb|ACII01000153.1| GENE 22 28765 - 29766 1184 333 aa, chain - ## HITS:1 COG:VC2508 KEGG:ns NR:ns ## COG: VC2508 COG0078 # Protein_GI_number: 15642504 # Func_class: E Amino acid transport and metabolism # Function: Ornithine 
carbamoyltransferase # Organism: Vibrio cholerae # 2 329 4 334 334 417 61.0 1e-116 MNLKGRNFLTLKDFTPEEITYLIDLSAELKAKKKAGELHEYYRGKNIALIFEKTSTRTRC SFEVAAHDLGMGSTYLDPTGSQIGKKESIADTARVLGRMYDGIEYRGFGQEVVEELAKHA GVPVWNGLTNEYHPTQMIADMLTIREHFGDLKGRKLVYMGDARYNMGNSLMIACTKLGMH FVACTTKKYFPNAELVAQCEEYAKASGGSITLTEDVQEGTKDADVIYTDVWVSMGEPDEV WTERIHDLTPYKVTKDVMKNAGEKAIFLHCLPAFHDLKTKIGKEMGERFNLTDMEVTDEV FESEQSKVFDEAENRMHTIKAVMVATLGEPENK >gi|226332866|gb|ACII01000153.1| GENE 23 29803 - 30744 1236 313 aa, chain - ## HITS:1 COG:FN1549 KEGG:ns NR:ns ## COG: FN1549 COG0330 # Protein_GI_number: 19704881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Fusobacterium nucleatum # 3 303 4 293 294 318 60.0 9e-87 MGVVFLIVIIVIAVWVLASCVRIVPQAYAVILERLGAYQATWSTGIHFKVPFIERVARKV NLKEQVVDFPPQPVITKDNVTMQIDTVVFFQITDPKLYTYGVENPIMAIENLSATTLRNI IGDMELDETLTSRETINTKMRASLDEATDPWGIKVNRVELKNIIPPAAIQDAMEKQMKAE RERREAILIAEGQKKSTILVAEGKKQSAILDAEAEKQAAILRAEAQKERMIKEAEGQAEA VLKVQNANAEGIRMIREAGADEAVLTLKSLEAFARAADGKATKIIIPSDIQGIAGLASSL KEIVTDPKAETDK >gi|226332866|gb|ACII01000153.1| GENE 24 30747 - 31181 493 144 aa, chain - ## HITS:1 COG:CAC1051 KEGG:ns NR:ns ## COG: CAC1051 COG1585 # Protein_GI_number: 15894338 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Clostridium acetobutylicum # 5 143 7 144 146 68 33.0 5e-12 MQPLIWLGILALLLVVEAITAGLTTIWFAGGALVAAIACYAGANLTVQILLFLCVSLVLL IFTRPLAMKYFNKETIQTNANSLIGKKAVVIQEIDNLAQTGQVRINDIEWTARSADDEKI GEGTVVTIEEIRGVKLIVKQNKED >gi|226332866|gb|ACII01000153.1| GENE 25 31202 - 31975 263 257 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 45 257 25 247 255 105 30 5e-22 
MITSVNNGQVKNIIQLNQKTKARREQGLFVAEGRKMFGEAPRDWISKVYVSEALSGDAEL MAQVEKLPYEIVTDSVFRQMSDTQTPQGIMTVLKKPSYIMEDILGGENPLVMILEDLQDP GNAGTILRTGEGAGVSGVLLTKTCVDITNPKVIRSTMGSVYRMPFLYVESVVSLAQELKD RNIRTFAAHLHGKNSYDQESYTGGTAFLIGNEGKGLTDEAADSADCLIRIPMCGKVESLN AAMASGILMYEAARQRR >gi|226332866|gb|ACII01000153.1| GENE 26 31985 - 32638 736 217 aa, chain - ## HITS:1 COG:BH2663 KEGG:ns NR:ns ## COG: BH2663 COG0569 # Protein_GI_number: 15615226 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Bacillus halodurans # 4 211 3 210 220 160 39.0 2e-39 MKNKSYAVIGLGQFGMTVALTLAEANCDVLAIDDKDDNVQDIAEKVSYAVKADVRDPGIL ESFGVQNVDVAVIAVAENMEASITATMQVKELGVPFVMAKAMNSLHGRILEKIGADKVIY PEHSMGIRVARNLLSSGFVDMFELSSDFSMAEFKIPREWIGKTLRELKVREKYNINLIGL KHGDKMNMNVAPDEVFPADCTVVAAGANSDLNKVSEN >gi|226332866|gb|ACII01000153.1| GENE 27 32713 - 34095 1075 460 aa, chain - ## HITS:1 COG:BS_yubG KEGG:ns NR:ns ## COG: BS_yubG COG0168 # Protein_GI_number: 16080162 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus subtilis # 9 435 13 423 445 247 38.0 4e-65 MQLYNRLHKLNTAQIITLGFAGVIILGGLLLWLPFCTAPGYHTSFTDAMFTATTSICVTG LVTVVTATHWTLAGKIIILVLIQIGGVGLISLGSIIFISLRKKISLRNRRVIQESYNMDR MGGMVRLVKKVLICVFGAEGIGAVCYAVRFIPQFGLAKGLGYSVFTAVSAFCNAGIDLLG EDSLAQYVADPIVNFTSVGLIIMSGLGFVVWWDIWDKIKRVIRGKLPVGRIFKNLRLHSK IVLMMTLILVVGGTVLIFLFDHGNPESIGTYSPGTKWMASLFQSVTTRTAGFFTVSQERF SNATYMLCLVLMLIGGSPMGTAGGIKTTTVAVLLLSLKSNLQGKRDVEVHHRRIRDSYIR SAIVVTGMVLTVLILMSMLLCAAMPEAPIEDVVYEITSAVATVGLSRGLTPCLNTAGKWI VILTMYLGRIGPLTLGTAVTVRVQKMPADSHLAEEDIMIG >gi|226332866|gb|ACII01000153.1| GENE 28 34111 - 35109 1294 332 aa, chain - ## HITS:1 COG:CAC0517 KEGG:ns NR:ns ## COG: CAC0517 COG0205 # Protein_GI_number: 15893808 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Clostridium acetobutylicum # 6 324 1 318 319 360 59.0 2e-99 
MAENKIKTIGVLTSGGDAPGMNAAIRAVVRRGLSNGLNVKGILKGYNGLLNEEIIDMSAK DVSDTIERGGTILYTARCAEFRTEEGQKRGAEICRKHGIDGLVVIGGDGSFAGAQKLANL GINTIGLPGTIDLDIACTEYTIGFDTAVNTAMEAIDKVRDTSTSHERCSIIEVMGRGAGY IALWCGIANGAEDVLVPEKYDYDEQKLINNIIESRKKGKKHHIIINAEGIGHSEAMAKRI EAATGIETRATILGHMQRGGSPTCKDRVYASMMGALAVDLLIAGKTCRVVGYRHGEFVDF DINEALAMQKGISDYQWEVCQSLSHNYDKNNK >gi|226332866|gb|ACII01000153.1| GENE 29 35182 - 38649 3280 1155 aa, chain - ## HITS:1 COG:CAC0516 KEGG:ns NR:ns ## COG: CAC0516 COG0587 # Protein_GI_number: 15893807 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Clostridium acetobutylicum # 2 1148 10 1167 1167 1181 54.0 0 MEFTHLHVHTEYSLLDGSNKIKEYVARVKELGMDSAAITDHGVMYGVIDFYRAARAEGIN PILGCEVYVAPGSRFDREAGSGEDRYYHLVLLAENNQGYANLMKIVSKGFTEGFYYKPRV DLAVLKEYHEGIIALSACLAGEVARYLQRGMYEDAKAAALRYQDIFGKGNFFLELQDHGI PAQRLVNQELLRMHEETGIDLVATNDVHYTRAEDADPHDILLCLQTNKKLADEDRMRYEG GQYYVKSPEEMAELFPYALEALENTHKIAQRCHVEIEFGVTKLPRFDVPDGLTSWEYLNK LCFEGLEERYQPVTEELKARLNYELSTIKNMGYVDYFLIVWDFIKYARDHDIMVGPGRGS AAGSLVAYTLGITQLDPIRYDLLFERFLNPERVSMPDIDVDFCFERRQEVIEYVRRKYGD DCVVQIVTFGTLAARGVIRDVGRVLDMPYAQVDSIAKMIPQELNITIDKALTMNPELKKA YEEQDEIHYLIDMARRLEGLPRHTSMHAAGVVISQKDVSEYVPLSRASDGSIVTQFTMTT LEELGLLKMDFLGLRTLTVIQNAVKLIQKDAGVTLDMQKINYDDKKVLDSLGTGRSDGVF QLESAGMKNFMKELKPQSLEDVIAGISLYRPGPMDFIPQYIRGKNRPDTIRYDCPQLEPI LKPTYGCIVYQEQVMQIVRNLAGYTLGRSDLVRRAMSKKKASVMEKERQNFVYGNEAEGV PGCIANGIDEATANKIYDEMIDFAKYAFNKSHAAAYAVVSYQTAYLKYYYPVEFMAALMT SVIDFPNKVAEYILVCRQMGIKILPPDVNCGMYGFSVDNGAIRYGLSAIKSVGRPVIESL VREREENGQYRSLKDFMERNSPQMNKRAVENFIKAGALDCLDGNRRQKMLVYQKISDSIS QDKKNSLAGQMSLFDLVSEEDKKEFEIRMPDVEEFGKEELLGYEKEVLGIYLSGHPLENY RGMMEKTISAKTSDFQQDEETNLPKVMDGQKVIIGGMITDKTIKYTKNNKVMAFLTVEDL VGTVEVVVFPRDYEKSQQFLNEEGRVFIQGRVSAEDDRASKLILEKIRPFDNMPREIWIQ FDNKESYTQQSQELLADLRRSPGDSAVVIYLKDVKAIKKLPVGYHAQIQDSWLNYMYEKY GKTNVKVVERGLKNL >gi|226332866|gb|ACII01000153.1| GENE 30 38708 - 39547 314 279 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212640476|ref|YP_002316996.1| 
Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains [Anoxybacillus flavithermus WK1] # 29 279 22 280 285 125 32 7e-28 MIELGKKQKLTVVKSVDFGVYLGEDMQADAKNRVLLPSRQVPEGTKEGDSIEAFIYKDSQ DRLIATTKEPKLQVGQTAVLKVSQVTRIGAFLDWGLEKDLLLPYHEQTLKVREGEDVLVA LYIDKSSRLCATMKVYHYLSTRTPYVVGDMVKGRVYEISDRFGVFVAVDDKYSALIPARE AKGKYRPGKILELRVSEVKEDGKMNVTDRQKAYLQINEDAENVLEVINEFAGVLPFDDKA SPEVIQREFGLSKGAFKRAIGHLMKEGKVEIKDKRIYAK >gi|226332866|gb|ACII01000153.1| GENE 31 39782 - 40600 778 272 aa, chain - ## HITS:1 COG:CAC2917 KEGG:ns NR:ns ## COG: CAC2917 COG0657 # Protein_GI_number: 15896170 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Clostridium acetobutylicum # 33 272 34 269 272 247 48.0 2e-65 MIYKKIEIAVDGYKETADLYTYFLDNSIEMHINRKRPVVVICPGGGYAMTSDREAEPIAM QYLVRGYHAVILRYSVEPARYPLALLQLAKSVAFLRKNAAEFHIDTNKIVLQGFSAGGHL AASLGVFWKKDFIAQTLGVTSDMVKPNGMILSYPVITSGEFAHTGSFECLLGEDYNDLDK RKEQSLEFQVSKDTPTTFLWHTVTDDCVPVENSLLFFNALRKFEIPVEMHLYPVGGHGLS LANEETSNEDGGCVQKECQSWIELACKWMQNI >gi|226332866|gb|ACII01000153.1| GENE 32 40590 - 41651 314 353 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580962|ref|ZP_04858224.1| ## NR: gi|253580962|ref|ZP_04858224.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 10 353 67 410 410 622 100.0 1e-177 MLLLILYYMRTVVLSREGCTVSWLFWKRNYKWEELAVIREDVWYTYGKGGSIRYQGIVFS KYAVNKKKKKYTTRMIVDAFPLSDCFYITFDEHGNVFSYKLEKGKKQVIRKMYVNVREKM KEWGVEAEKGEGIKEEEKQRQYDEMIELRRQQRKKIQSKGADCTNLKEYQQEVIKIRYFL QITMVILITVLLIWTGLSAEEGERIPILILEFLIDAPLLFFSSKVVTLSKEGCKISYLFF KKMYKWEELEIIREDYFPSFRDRRSRGIVFSKRKKNKAGYIYTTREIVGSGHILECFYIV LDADGNIHLYKRLAAKDGKTKLRKVAVNIPQKLAEWGVEMEKMMKYTGGTYDL >gi|226332866|gb|ACII01000153.1| GENE 33 42237 - 42929 952 230 aa, chain - ## HITS:1 COG:ECs5174 KEGG:ns NR:ns ## COG: ECs5174 COG0235 # Protein_GI_number: 15834428 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli O157:H7 # 2 227 1 227 228 300 61.0 1e-81 MLEKLKEEVYKANMDLPKYGLVTFTWGNVSGIDRESGLFVIKPSGVDYDLLTPDDMVVVD LNGNKVEGKYNPSSDTATHVELYKAFPNIGGVVHTHSSWATSWAQAGRAIPCYGTTHADY IYGEVPCARCLEGKEFEEYEKNTGLLIVDLFKDKDYEAVPAVLCKNHGPFAWGKDAHEAV HNAVVLEEVAKMASRCELINPQVKPAPQDLQDKHYFRKHGANAYYGQGNK >gi|226332866|gb|ACII01000153.1| GENE 34 43081 - 43785 767 234 aa, chain - ## HITS:1 COG:rhaD KEGG:ns NR:ns ## COG: rhaD COG0235 # Protein_GI_number: 16131742 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 12 224 55 266 274 225 52.0 5e-59 MIRPRLNATGEWIPIGVEVPGLAGEFFLVTGSGKYFRNIIVDPEVCLAIIELDETGTNYR IRWGLVEGGRPTSELPTHLMNHEVKKKLTNGKHRVIYHAHTTNTIALTFVLPLDDKIFTR ELWESATECPVVFPDGVGVVGWMVPGGREIAIKTAELMKKYDVVIWAHHGMFCSGEDFDL TFGLLHTVEKSAEILVKILSMTPNKLQTITPDNFRSLAKDFKVSLPEEFLYEKQ >gi|226332866|gb|ACII01000153.1| GENE 35 43840 - 43911 103 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKILESKFVQGFIKMANDGYLQG >gi|226332866|gb|ACII01000153.1| GENE 36 44119 - 44829 626 236 aa, chain - ## HITS:1 COG:CAC3591 KEGG:ns NR:ns ## COG: CAC3591 COG3884 # Protein_GI_number: 15896825 # Func_class: I Lipid transport and metabolism # Function: Acyl-ACP thioesterase # 
Organism: Clostridium acetobutylicum # 13 212 16 221 248 98 26.0 1e-20 MNYSFNSRVRYSETGENGKLTLPGVLNYFQDCCTFHAESVGLGGDVLKARDRAWVLSSWQ VIVDEYPAMGTEIRITTAPYDFKGFMGMRNFTIETMDGKKLAWANSNWTHLAISTGIPVR LTPADTDNYILGEKLEMDYAPRKIKLPDDMTSQESFTVQKHHLDTNHHVNNCQYICMAED FLPEDFKVYQMRAEYKMQAKLGDIICPKAKAETGKVIVSLDDTDGKAYAIIEFQQK >gi|226332866|gb|ACII01000153.1| GENE 37 44975 - 47650 2070 891 aa, chain - ## HITS:1 COG:BH1154_2 KEGG:ns NR:ns ## COG: BH1154_2 COG0642 # Protein_GI_number: 15613717 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 645 891 30 274 274 223 44.0 1e-57 MKGKGYRSNVAKAVWIVIVHLAAVAAVICAAFFVMIYQTGIRLDDRGKSYIQSEGFRERL NSRGTDILESLSAKEDINYLTNAGSSAVIDLAEFKEKEINRDSLRELSFKNTSGVAYSVK DLLEWAQDWAGVGERYDDGGSFGDIGQFIQCKTSDGSSHYFNLNDFKKLVTDGLLKVNYD QDIMEEYDDSYETKFAEKTEKQKIDAAIELGYWSDSDSRSLGSITDKEHNTEYPEFYLQE IWCFTEEFKPQGAESLPDAVNSSTEWNGKLEDAYSELAKVLDCIRTVQDDINVSDCAISL TSVYHTSGDYEEGSTNLTYLFADKEKKTIYTNRKAYSSYSQLEQNLEKIFKEKAYAVVYP ELSECVTNIPDADLQVWNHTIDQSFDTKDFVFAVSVDTKFSVADSMADEAENYETYSKLM FPMLAGAIFGSVLWLIGMVWLTVTAGRKPKDEEIHLNGFDRWYTEIAAGAVIGIWLAGTI ISGTLIANSSLGYSHAVVTVIVTCLICGTYTMAWFLIGYLSLVRRIKAGTLWKNSLIRTV LKWIGKCSGKLSDFARAFSRNTAEKIKVLLVGGAFLFLQFLIIGCGFTGAGVFLIILLIV DAAAVIFIIRKADGLDLIMDGLKKISDGELQYKIKTDTLTGKQKVMAEYINNIGSGLDAA VENSLKKERMQTELITNVSHDLKTPLTSIINYVDLMKRENPTDPKIQEYLRILDEKSQRL KVLTEDVVEASKASTGNIKLEMNDIDFVEMVQQVIGEFEEKFQEKNLTMMVHFTDEPSII YADGQRMWRVLENVFGNVVKYAMEGTRVYAEISNRNKKVTFSLKNISAQPLNISADELTE RFIRGDVARNTEGSGLGLSIAKSLTELQGGEFKLYLDGDLFKVMITFAAKN >gi|226332866|gb|ACII01000153.1| GENE 38 47735 - 48430 712 231 aa, chain - ## HITS:1 COG:BH1153 KEGG:ns NR:ns ## COG: BH1153 COG0745 # Protein_GI_number: 15613716 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 3 230 6 232 232 266 60.0 2e-71 
MANILVCDDDKDIVEAIGIYLEQEGYTVIKAYDGVEAINVLRSQPVDLLIIDIMMPKLDG IRTTLKIREENPLPIIILSAKSEDADKILGLNVGADDYVTKPFNPLELVARVKSQLRRYT RLGAAIPVENAHIYETGGLSINDDLKEVRVDGDVVKLTPIEYNILFLLMKNQGRVFSINQ IYENIWNEEAVAADNTVAVHIRHIREKIEINPREPRYLKVVWGLGYKIEKI >gi|226332866|gb|ACII01000153.1| GENE 39 49248 - 49892 553 214 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2283 NR:ns ## KEGG: EUBREC_2283 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Pyrimidine metabolism [PATH:ere00240]; Metabolic pathways [PATH:ere01100] # 3 214 1 212 212 191 43.0 1e-47 MNLENMTKIYARGDSRIQLKVIPGHFVTSQSHITHYLDLTTMKSRTAEAQNIAHELAANY EVSTPVDTIICMDGLEVIGAYLSEELTKAGIFSLNAHQTIYVITPESSNSGQIIFRDNFQ PMIKGKNVLILNGSITTGSTLSKAIESILYYGGTIRGIAAIFSRVDSVASLPVYSIFNTK DIPDYHSYSSTKCPMCQRQQKIDAIVSSYGYSAL >gi|226332866|gb|ACII01000153.1| GENE 40 49956 - 50318 424 120 aa, chain + ## HITS:1 COG:BH4033 KEGG:ns NR:ns ## COG: BH4033 COG1115 # Protein_GI_number: 15616595 # Func_class: E Amino acid transport and metabolism # Function: Na+/alanine symporter # Organism: Bacillus halodurans # 1 82 21 100 460 87 59.0 5e-18 MLILLVGTGIFLTVRTRFLTWRNLGYALKSTLSKEARTKSRGQGDVSPFSALTTSLAATI GTGNIVGVATAMVSGGPGALVWNFSDIANALMAIPNLICMLLLSGEIAKDVKEFQPEIKK >gi|226332866|gb|ACII01000153.1| GENE 41 50436 - 51692 900 418 aa, chain - ## HITS:1 COG:yihS KEGG:ns NR:ns ## COG: yihS COG2942 # Protein_GI_number: 16131720 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Escherichia coli K12 # 1 397 1 396 418 259 36.0 6e-69 MENSGKKYQIDTEENKMFLGELQKNLLNFGKGFLSPGGSAYFLGDDGTPWKDRNRETWIT CRMVHVYSMGIMLGDKESPALVHGAVHGLLEELKDCENGGWYPGITPDNKFLPDKQCYAH AFVLLAASSALLAGEKNAETLLKDALELFDKRFWDEKQGLTYDTWNTEFTVLDDYRGLNA NMHTVEAFLAVADAIKEEKYRIRAGRIIDHVIGWASANDWRIPEHFTKEWKADLDCNKEC PADRFKPYGATPGHGIEWARLIVQWAYSSYKDTPDSAEVYLKAAEKLYNRAVEDSWNVDG NPGIVYTTDWDGKPVVHDRMHWTLAEAINTSAVLWHITHKQKYAKDYEEFMRYLDDKVLD HKNGSWFHQLDENNNVIGTVWPGKSDVYHAFQSTWIPYGTPYISIASDVKQYMEVCHQ >gi|226332866|gb|ACII01000153.1| GENE 
42 52310 - 53773 1799 487 aa, chain + ## HITS:1 COG:CAC1780 KEGG:ns NR:ns ## COG: CAC1780 COG1488 # Protein_GI_number: 15895056 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 5 480 14 485 489 540 56.0 1e-153 MRANNLTLLTDLYELTMMQGYFKNPTNQTVIFDMFYRTNPCGGAFAITAGLEQMIEYIEN LKFTDEDIKYLRSLNIFEEDFLEYLSNFRFTGDIYAIPEGTVVFPREPLVKVVAPIMEAQ LVETAILNIINHQSLIATKAARVCYAAKGDAIMEFGLRRAQGPDAGIFGARAAIIGGCAG TSCVLTGKMFDVPILGTHAHSWIMSFPDEYTAFKTYAKLYPNACTLLVDTYDVLKSGVPN AIRVFEEMREEGIPLTRYGIRIDSGDLAYLSKEAYKMLAAAGFDDAVISASSDLDEYLIE SLKAQDAKINSWGVGTRLITANDNPAFGGVYKLAAVKDADSTEFTPKIKLSENTEKVTNP GNKTVYRLYNKKTGKIRADLICLADEKLDADQNMVLFDPIDTWKKTKVLGGTYEVRELLV PVIREGKRVYESPSVMELREYCQKEQNTLWDESRRFVNPQKVYVDLSQKLWDLKKDLLEE ISEKALD >gi|226332866|gb|ACII01000153.1| GENE 43 53869 - 55071 976 400 aa, chain - ## HITS:1 COG:CAC3027 KEGG:ns NR:ns ## COG: CAC3027 COG1408 # Protein_GI_number: 15896279 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 44 399 55 390 392 189 33.0 1e-47 MAAIYLAPFYLLVCVYILLRSLHWFQVLHTVFQNVWVCRGIGLVYLFVVFSILIAFMAPA SGFRRFMKLLSNYWLGVLMYTLMTLGIADGLRLLLKYPLKNFAFPGRELLFSNMGTAVVG AVCAVIISTVSIYGVLSAGNIHTTKYNISVDKKAGNMKELNVVLVADLHLGYNIGCKQME QMTEKINKQNPDLVVVAGDIFDNEYEALDDPEKLAEILRGIRSKYGVYACYGNHDIQEKI LAGFTFGSKEKKESTPEMDEFLEKAGITLLRDEYVLIDDSFYLYGRPDYERPGRGIGKRK TPQEITEGMDVSLPVLVIDHEPRELQELADTGVDVDLCGHTHDGQVFPGNIVIRFFWENP CGYLKKGNMHNIVTSGVGLFGPNMRVGTKSEICDVSICFN >gi|226332866|gb|ACII01000153.1| GENE 44 55370 - 55918 483 182 aa, chain - ## HITS:1 COG:BS_spoVT KEGG:ns NR:ns ## COG: BS_spoVT COG2002 # Protein_GI_number: 16077124 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Bacillus subtilis # 1 182 1 178 178 174 53.0 1e-43 MKATGIVRRIDDLGRVVIPKEIRKTLRIKEGTPLEIFTDREGEIILKKYSPIGELNVFAK EYAEALAQSSGMVACITDHDQVVAAAGQGSREYVGKPISKALEDAITERSSVFANGNDRS 
RIPVTEEQREPLYSQIMQPIISAGDTIGSVLLLGKNERDVMGESEKMLIRTASGFLGRQM EQ >gi|226332866|gb|ACII01000153.1| GENE 45 56075 - 59509 2549 1144 aa, chain - ## HITS:1 COG:CAC3303 KEGG:ns NR:ns ## COG: CAC3303 COG0553 # Protein_GI_number: 15896547 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Clostridium acetobutylicum # 17 1122 20 1054 1077 514 32.0 1e-145 MISKTQIRYMANSSSYSKGAELYATGKVLDMDVKNMGAFDEIVASVKGSGRNIYEVDVSI DTENDEVDTCYCECRAYAEYGGLCKHCVAVLLQYNDYENDMDSYDYGQSVEKIVKGGLTH TIRKGVQMHTTPELASLLQKQAVAKSLPLIQGSTYGKVRLEPYFNFDGRTFTVEFKIGIN KMYVLKDAFSFDVHIANQDDYKYGKNLQFVHTIESFAEESRPLAKFICKWADNNRQFHRS SSYYGYYMGTLEKVRHLELSGNELAEFLLLMEGKTLQGESIGTRNTTWEITREHLPRKMT ITGAKQGIELKVSKFTCAANTEQYKICFYDKKIYIENVEELLPVKDFLDSLSLISGEKAF IENKDVPAFCQELLPVIQKFFKCRMVEFHPENYGMVKPEFRFYLDAPQENMVTCKATVKY GDREFSLYTTDDIAARDMNRETVVRNVIHKYSNAFNPFEQCAVIADDEEMEYEFLTEGIQ ALQAVGEVFISDALRRIEVRNSPKVTVGVSLSGNLLELSMTAGDISKEELIDILSRYNKK KKFYRLKNGAFVNAADSGLDTVEELRAGLQLTDKQMKQDKIEVQKYRALYLDAQLKENPV VLAVKDKSFKSLVRNMKTIEDNDFEVPESLDKVLREYQKRGFLWIKTLNYNGFGGILADD MGLGKTLQVIAFLLSEFLERRNTVVENIAVKETAMLNMQSEIEIVEQACKAAEAQKADGK KADGQTGKLQRNTLIIAPASLVYNWSSEIQRFAPELTAKMVTGTAAERRQILAEADSEDI LLTSYDLLKRDISEYEGYKFRCEIIDEAQYIKNANTQAAKAVKEVQADFRLALTGTPVEN RLSELWSIFDYLMPGFLYSYKKFREEVEIPAVQNSDEDAMKRLQKMLRPFVLRRLKKEVL TDLPDKLEENMFVQLTGEQQKLYDAHVKRMMLMLDKQSEEEFKTSKITILAELTKLRQIC CDPSLIFADYKADSAKVDMCLNMISNAVESGHKILLFSQFTTMLDHLAKRLEEEKISYYM LTGSTSKEKRAQMVENFNTDDTQVFCISLKAGGTGLNLTAADIVIHFDPWWNLAVQNQAT DRAHRIGQKNVVNVYKLIVKDTIEENILKLQEKKRELADQILEGEGLNGGSFTKEELMEL LSGK >gi|226332866|gb|ACII01000153.1| GENE 46 59650 - 61674 2103 674 aa, chain - ## HITS:1 COG:YPO0852 KEGG:ns NR:ns ## COG: YPO0852 COG1874 # Protein_GI_number: 16121160 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase # Organism: Yersinia pestis # 4 647 10 655 686 689 49.0 0 MDFRTDKLLHGGDYNPEQWLKRPDILEKDIDMLEESGCNVVSLGIFSWSTLEPEEGVFHF 
EWLQEIIDKLYKRGISTILATPSGARPKWMADKYPEVLRVDETRRRALFGFRHNHCYTSP VYREKVHIINKKLAQEVATHPGVILWHISNEYGGECHCPLCQKAFRNWLKEKYQTIENLN DKWCTTFWSHTYNSFDQIESPSKIGETQLHALNLDWKRFVTHQTADFIHHEIAALREGGS TLPTTANLMYYFGGLDYFKIAKEIDVVSWDTYPTWHKEAVIDTAYDNGMCHDLMRSLKGK PFFQMESCPASTNWQSVSKLKKPGMLFAQSMQAIAHGGEGALYFQIRQSRGASEKFHGAV IDHYGGNDTRVFKEVSRVGETLKEIRELAGTTVNSSVAMLYDWDSQWAMEDSQGPRNKGL HYLEAMLKFYRGFRKQGVNVDVIDMTCELDKYKVLALPMVYMFKEGFAKKVRAFVENGGT LITSYWSGIADDTDRCYLEGTPHGLMDVLGIRSTEIDGLYDWEENSFVPVAGNELGLDKT YTCKYLCDLVELRGARTLMTYGSDFYEGYSCLTVNEYGRGKAWYVAADADKEFYGDFLEK VLKDSGVSCGIKEEIPDALEITVRENENEKYYIYQNFGTEAVNIPVPEGQISWIYGNGSD KLKVYGLVVAKVIV >gi|226332866|gb|ACII01000153.1| GENE 47 61981 - 62988 861 335 aa, chain - ## HITS:1 COG:L0143 KEGG:ns NR:ns ## COG: L0143 COG1609 # Protein_GI_number: 15673630 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 10 335 7 332 332 143 29.0 4e-34 METEEKKIYTIEDIARELGVSKTTVSRAISGKGRISQATRDRVRAFIKEHDYRPNVVAKG LAQRKTYNIALLMPKDYVATEFLFFKDCMNGICEMASSYEYDIIISMIDGADVSQIQRLE ANRKVDGIIVSRAVVSSKVQKYLKNCKEPYILIGPSSDPEVPFVDNKNQEAGKELTSIML MKGFRNLALLGGNQSYNVTGSRYQGFLEAHEEMGVSVNENLIFMDTDNQAAVSDAVKRLL EEGADGIVCMDDVICSMCLSSLREKKISVPAEIKIASMYDSKNLEYNNPPITSIRFDTVR LGKMACAKLLNILGEKPLEDTLTLNYQVMLRESTQ >gi|226332866|gb|ACII01000153.1| GENE 48 63066 - 63251 217 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580977|ref|ZP_04858239.1| ## NR: gi|253580977|ref|ZP_04858239.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 61 1 61 61 80 100.0 4e-14 MSVGRESIRRAANAGAGKTVKKPVVSTKEAEKPETISVTEEKKKTAKGSVHINEELPVYL L >gi|226332866|gb|ACII01000153.1| GENE 49 63242 - 64030 798 262 aa, chain - ## HITS:1 COG:BS_soj KEGG:ns NR:ns ## COG: BS_soj COG1192 # Protein_GI_number: 16081149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus subtilis # 1 257 1 249 253 143 34.0 3e-34 MGKTITICFTNNKGGSGKTTTCSNLGAAMAQAGKKVLLIDGDMQLNLSLAFFPENWVLEQ AKGEKNLYCAIGKQEELSGYVVHTPYENLDLIPSSTLMSSIEYELFTKWQREFILRKCIQ KIKDAGNYDYILIDAPPTLGGWVMNILCASDRVIIPVEASPWGMFGLANMFEFLNEVKQI TPELEVAGIAVTKVDTRKSYYKQTMETLYELEDIHVFEQIIRVDSSVEWSQDNSIPVVEY KKSSRSAREYTRLAEEVMELCQ >gi|226332866|gb|ACII01000153.1| GENE 50 64280 - 64942 570 220 aa, chain + ## HITS:1 COG:CAC0882 KEGG:ns NR:ns ## COG: CAC0882 COG1272 # Protein_GI_number: 15894169 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Clostridium acetobutylicum # 4 215 4 213 214 171 46.0 1e-42 MKFKLKDPGSAITHGIALLLAAVGAVPLIIKAARSYDVLHIVALSIFILTMVLLYAASTI YHSVDSTEKVNRRLRKMDHMMIFVMIAGSYTPVCLIVLHNRIGYILCALVWSIAVLGIIL KGCWITCPKWLSSVLYIAMGWLCVLAFVPIFHALPRAGFDWLLAGGIIYTIGGVIYALKV PLFNSRHKNFGSHEIFHIFVMLGSACHFIVMYFFVAPLPV >gi|226332866|gb|ACII01000153.1| GENE 51 65063 - 66034 1065 323 aa, chain - ## HITS:1 COG:CAC1079_2 KEGG:ns NR:ns ## COG: CAC1079_2 COG5263 # Protein_GI_number: 15894364 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Clostridium acetobutylicum # 60 175 2320 2440 2566 80 39.0 3e-15 MKKMTKWLAVCAVCVMGSVPAMAAEVPDVTVTPIPIETPAPDVTPIETPAPTPIPEVKNG WCEYGTKNKKYFKDGQYLTGMRKIHGEVYYFSPKGFMKTGWIKYNNKKYYFDSNGIRYSG VKKISGKYYYFSDKGVLRTKTVKVGNTIYYCTEKGILEAWKKGKTIYYPNGKKMNSTKAY EYETLQRAKDVVSKITKPSMSKSEKFETCFRWVMYQHYYDTRRIFYNQTAWPALYANDYL ILSGKGGDCFSDACSFAYLAKALGYKNVYVCVDTTATDGSGHCWAEIGGRVYDPLFAEAK SYYGYFGVGYGSYGLYPERRVAV 
>gi|226332866|gb|ACII01000153.1| GENE 52 66063 - 66305 148 80 aa, chain - ## HITS:1 COG:VNG2274C KEGG:ns NR:ns ## COG: VNG2274C COG2827 # Protein_GI_number: 15791086 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease containing a URI domain # Organism: Halobacterium sp. NRC-1 # 1 78 1 77 77 79 52.0 1e-15 MNYTYIVKCADGTFYTGWTNCLQKRMKAHNEGKNGAKYTRTKRPVTLVYYEGFSTKEEAM RREYEIKQFTRNKKLELLSF >gi|226332866|gb|ACII01000153.1| GENE 53 66591 - 68030 1238 479 aa, chain - ## HITS:1 COG:BS_sacA KEGG:ns NR:ns ## COG: BS_sacA COG1621 # Protein_GI_number: 16080855 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus subtilis # 27 451 29 455 480 336 42.0 6e-92 MSTLTKVLERNIKVIEKNADLTEGRFRLGHHLMPPVGWLNDPNGLCWYKGRYHVFFQYAP FDVEGGLKFWGHYTSEDLVDWKYEGTALYPDSPYDCHGVYSGSALAESEKLHLFFTGNVK IDGDYDYINEGRETSTLHVESEDGIHFGDKEEIISFEKYPEEFTCHIRDPKVWKENDRYF MVLGGRLKGDKGAVLVYESENLKEWKFKHIITTPEAFGYMWECPDYFELDGKKFLSVSPQ GLKREEFRFQNIYQSGYFQVKEDGSVDERDFREWDMGFDFYAPQTFTDNSGRRLLIGWMG MPDAEEEYTNKTIDEGWQHCLTVPRELRVKDGKIFQYPAKELERLRKEKTILDDEKSIVE VRIEVNEGFDLLIEDIAVTDRSFQISMGGQMLFKYENEIAEIGFSEIAGAGRNKRKAKVS ELNNIRILADTSAVELYLNDGEIVFSTRYYPDCEDLQLKVKGGKFRGNLWNLRKMIFTK >gi|226332866|gb|ACII01000153.1| GENE 54 68027 - 69931 2192 634 aa, chain - ## HITS:1 COG:RSp1285_2 KEGG:ns NR:ns ## COG: RSp1285_2 COG1263 # Protein_GI_number: 17549504 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Ralstonia solanacearum # 91 450 1 355 374 273 43.0 9e-73 MDYRKTAQEIYDHIGKKENIISAAHCATRLRLVISDNSKADKEYVENIEGVKGVFFAQGQ MQIILGTGVVNKVYDEFIRIAGVSESSKEELKKVAASRANPVQRLIKTLGDIFVPIIPAI VASGFLMGIMEALNFMVNNGFLNIDTSGSIYTFAQLFSNTAYTFLPILIAYSGAKVFGAN PYLGAVIGMIMIHPNLQNAWTVATEGVKATQKVWFGLYSIDMVGYQGHVIPVIIAVWVLA QIEKRLHKVVPAMFDLFVTPLVSVFVTGYLTLSIIGPIFVTVENGLLNGIQWLIALPFGI 
GSFIMGAFYAPTVVAGVHHMYTIIDLGQLSKFGVTYWLPLASAANIAQGGATLAVALKTK DQKIKSMAVPSALSACMGITEPAIFGVNLRFGKPFVMGCIGGAFGALFASVTGLGATGTG VTGIFGILLCLNNPVSYILMFVIAFGAAFVLTWLFGYKDTNVSEKTESIEAVGDKSTTEK SNADDSVLYSVSEGTAILLSQVNDATFASEVLGKGIAVIPSKGEVVAPCDAVVETVFDTK HAVGLSTESGMELLIHIGINTVELNGKYFTSHVKNGDHVKKGQLLVSFDMEKVKAAGYDV TTPLIVTNSDDYKDMKILSEDSVTPSDKVLEIVK >gi|226332866|gb|ACII01000153.1| GENE 55 70366 - 71358 919 330 aa, chain - ## HITS:1 COG:BH1855 KEGG:ns NR:ns ## COG: BH1855 COG1609 # Protein_GI_number: 15614418 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 3 330 4 326 326 213 36.0 4e-55 MNINEIAKLAGVSRATVSRYLNNGYVSEEKKKNISRVIEETGYQPSSQAQRLRTKKTKLV GVILPKIDSNTISREVAGISDILTKKGYQLILANTNNSVEEELKYLSLFRDNQVDGVIFI ATILTKQHREMMKGFKVPIVLLGQQLDGYPCIFQDDYKAAVNLTEQMVKTGKKFGYITVT DKDEAVGRQRKSGVEKVLGENQITLESGCVKCGNFTLESGYEKAKELFTEHPDVDTLICA TDTMAVGAANWLKENGYQIPQQVQIAGMGDSFLGKIIEPKLTTVHFFYKTSGMESAKVLV DLIESKETSFVHKEIKMGCKVVLRESHRNI >gi|226332866|gb|ACII01000153.1| GENE 56 71616 - 72509 800 297 aa, chain - ## HITS:1 COG:TM0409 KEGG:ns NR:ns ## COG: TM0409 COG0739 # Protein_GI_number: 15643175 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Thermotoga maritima # 161 297 126 255 271 63 27.0 5e-10 MMNNKVTRIIANSVGLAAVAALGITVYQLGTSPVKEKTPEEKTENVGESTQDTEEQLSED AGTNHVESENQWISDDGEDLDDTKVTFQKQGMYGETAQNTVDNNESDNTADVDKTDNAVN SQQAEESSVNANMKSQTISEDQADEATDVSASALNLATVNFSEDTLMEWPVNGNVLLDYS MDQTTYFPTLDQYKLSPAIAVGAVEGAPVVAAVNGKVYSIEQNAQTGTTLTMELGNGYQA VYGQLTDLTVSEGDTIKKGTTIGYIAQPTKYYSTEGTNLYFAMKKDGEPIDPIEYLP >gi|226332866|gb|ACII01000153.1| GENE 57 72571 - 73404 229 277 aa, chain - ## HITS:1 COG:CAC2861 KEGG:ns NR:ns ## COG: CAC2861 COG2385 # Protein_GI_number: 15896115 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Clostridium acetobutylicum # 36 272 74 343 345 100 29.0 3e-21 
MRKSFFPGTTGIFALFFVPYIVTIIFNGANTTLINKKFNVEMLLPVIVSSQIEDKYELET IKAQTIIARSNFYRTMKEEKNLAITLCQIKKEMEGKSLACVILQNKYEKAVTETEGKVIV WNKELKLVPYHELSAGQTRDGMEVFHNEDDSYLRSVHSLVDKNAKDYLNSVYINKNVLPE RIEIKSRDSAGYVTEILADGKVLEGEAFRKGMGLTSSNFTIQKSGKEIRFLCRGKGHGLG FSQYGGNEMAKESANAKEILQYYFPEMDVLNIKSVNC >gi|226332866|gb|ACII01000153.1| GENE 58 73608 - 73853 174 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKSPLKNLKKYFQLIMTALICFVMGVIEDKKRICIKDTDSKKDRTVLRPDDKKNNRKNKI IYWNEPRIQNRINRAVYFNTG Prediction of potential genes in microbial genomes Time: Sat May 28 20:59:20 2011 Seq name: gi|226332865|gb|ACII01000154.1| Ruminococcus sp. 5_1_39B_FAA cont1.154, whole genome shotgun sequence Length of sequence - 5483 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 65 - 137 74.8 # Asn GTT 0 0 + TRNA 182 - 253 66.5 # Glu TTC 0 0 + TRNA 292 - 362 62.9 # Cys GCA 0 0 - Term 1292 - 1342 9.7 1 1 Tu 1 . - CDS 1463 - 1657 156 ## gi|253580990|ref|ZP_04858251.1| conserved hypothetical protein - Prom 1677 - 1736 5.5 + Prom 2065 - 2124 7.4 2 2 Tu 1 . + CDS 2273 - 2797 266 ## COG1132 ABC-type multidrug transport system, ATPase and permease components + Prom 2803 - 2862 4.3 3 3 Tu 1 . + CDS 2964 - 3098 146 ## + Prom 3211 - 3270 11.0 4 4 Op 1 . + CDS 3308 - 4651 941 ## COG1106 Predicted ATPases 5 4 Op 2 . + CDS 4654 - 5316 368 ## Acfer_0737 hypothetical protein Predicted protein(s) >gi|226332865|gb|ACII01000154.1| GENE 1 1463 - 1657 156 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580990|ref|ZP_04858251.1| ## NR: gi|253580990|ref|ZP_04858251.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 64 3 66 66 123 100.0 4e-27 MDTVKKKLGYTVRSERERLGLSQSSLAERAGVSTRTISDIETCKPVPKMIARLPFVHCVN FFLH >gi|226332865|gb|ACII01000154.1| GENE 2 2273 - 2797 266 174 aa, chain + ## HITS:1 COG:HI0664 KEGG:ns NR:ns ## COG: HI0664 COG1132 # Protein_GI_number: 16272605 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Haemophilus influenzae # 1 170 360 544 552 93 30.0 2e-19 MSTHKGKILGITGENGSGKTTLIKLLRRFDVVYNGTITINDVDINELSQESLVKGISVMP QKATVSVDDIEEIACYVNNNELLNFLNLDEVPKKQVYNYTLKSGGELQKICFLKVFLENK DVLILDEPTASMDMKSEEIVGDLLAKIKRDKIIIIITHRKYLLSMCDEIITLNA >gi|226332865|gb|ACII01000154.1| GENE 3 2964 - 3098 146 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MARPRKMISLEEQIKKQEETVLKVKEKYDSEMMKLKDLYAKRAI >gi|226332865|gb|ACII01000154.1| GENE 4 3308 - 4651 941 447 aa, chain + ## HITS:1 COG:FN1198 KEGG:ns NR:ns ## COG: FN1198 COG1106 # Protein_GI_number: 19704533 # Func_class: R General function prediction only # Function: Predicted ATPases # Organism: Fusobacterium nucleatum # 1 431 1 413 420 112 27.0 1e-24 MLLEFKTKNYKSFVEEASFSMVATPKQKGLDYSLMKTKIKGKEIKGLSSSVIYGPNAAGK TNIIGAMDVLRAIVLRGNIRNSEEKSSPNPAAAALELIPNNNEMESKPVCFEIEFYEEDG EDHKFKIHYELEVDLGIFLEEEHQRKILAEVLEVNGERVFERTQDLKIENLKVIKDYLSD ITEQNAESVNEIAKNSLNQEELFLTNGFKLIFSPKFTKLIVDWFTNKFMVIYRADSMQLI KRFADPKKKAIYVEKTTDEAAKLFGINSNAVGYVVSDDEPDAKLYSVFKNMKNKKATAVA ADIFESYGTIRFINMFPLVIKAMQTGGTLVVDEFDASIHPMALMSIINVFHNDDVNIHHA QLIFNTHNPIFLNSNLFRRDEIKFVERDDDTHDSILYALSDFGTTGDKGVRKHEDYMSKY FISQYGAIKDIDFTPIFEEILCDEKEG >gi|226332865|gb|ACII01000154.1| GENE 5 4654 - 5316 368 220 aa, chain + ## HITS:1 COG:no KEGG:Acfer_0737 NR:ns ## KEGG: Acfer_0737 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 55 219 6 169 172 105 35.0 1e-21 MAKRKPTNKYYFSVEGETEQWYLKWLQDTINNTEKATCKVSIDCPVRKNPLKHAKSLTVT RKIEIYHFFDYESDESLHVKGFQEALDNMKKAEKIGKQIKYKSGYSNFTFDLWIILHMTN CNASFSHRKQYITMINRAFGEKFQNMDEFKEENNFKRCLKKIDLFNVIAAVDRARKIMQR 
NQDNGYTLYQYKGYQYYKENPSLTAWEAIEKILKDCELIV Prediction of potential genes in microbial genomes Time: Sat May 28 20:59:51 2011 Seq name: gi|226332864|gb|ACII01000155.1| Ruminococcus sp. 5_1_39B_FAA cont1.155, whole genome shotgun sequence Length of sequence - 72085 bp Number of predicted genes - 65, with homology - 65 Number of transcription units - 40, operones - 17 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 86 - 145 7.7 1 1 Tu 1 . + CDS 225 - 1220 123 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 2 2 Op 1 . - CDS 1186 - 2427 381 ## gi|253580996|ref|ZP_04858257.1| predicted protein 3 2 Op 2 . - CDS 2349 - 3524 260 ## gi|253580997|ref|ZP_04858258.1| predicted protein - Prom 3560 - 3619 3.1 - Term 3595 - 3634 1.0 4 3 Tu 1 . - CDS 3639 - 4568 275 ## COG4974 Site-specific recombinase XerD - Prom 4588 - 4647 6.4 5 4 Tu 1 . - CDS 4792 - 5157 160 ## gi|253580999|ref|ZP_04858260.1| conserved hypothetical protein - Term 5571 - 5615 -0.5 6 5 Tu 1 . - CDS 5765 - 6217 -72 ## gi|154503657|ref|ZP_02040717.1| hypothetical protein RUMGNA_01481 - Prom 6352 - 6411 9.0 - Term 6493 - 6549 7.1 7 6 Op 1 . - CDS 6561 - 7145 512 ## COG1247 Sortase and related acyltransferases 8 6 Op 2 . - CDS 7179 - 7409 276 ## EUBELI_00960 hypothetical protein - Prom 7480 - 7539 8.4 9 7 Tu 1 . + CDS 7771 - 8508 607 ## COG3279 Response regulator of the LytR/AlgR family + Term 8726 - 8758 -0.7 10 8 Op 1 . - CDS 8776 - 9969 1007 ## CTC00591 hypothetical protein 11 8 Op 2 . - CDS 9982 - 11370 1210 ## COG1696 Predicted membrane protein involved in D-alanine export - Prom 11395 - 11454 2.4 12 9 Op 1 . - CDS 11526 - 12041 441 ## CTC00593 hypothetical protein 13 9 Op 2 . - CDS 11999 - 12793 698 ## CTC01503 phospholipase-subfamily protein - Prom 12878 - 12937 13.6 + Prom 12815 - 12874 7.8 14 10 Op 1 . + CDS 13101 - 13961 806 ## EAT1b_1734 hypothetical protein 15 10 Op 2 . + CDS 13969 - 14805 622 ## BVU_3509 putative arginase 16 10 Op 3 . 
+ CDS 14861 - 15643 510 ## COG0682 Prolipoprotein diacylglyceryltransferase 17 10 Op 4 . + CDS 15704 - 17425 1349 ## COG1404 Subtilisin-like serine proteases + Prom 17507 - 17566 9.3 18 11 Tu 1 . + CDS 17586 - 18977 403 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Term 19006 - 19051 5.0 + Prom 19028 - 19087 8.2 19 12 Tu 1 . + CDS 19123 - 20355 1122 ## COG0205 6-phosphofructokinase + Term 20593 - 20641 1.1 20 13 Tu 1 . - CDS 20652 - 22373 251 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P - Prom 22409 - 22468 5.6 + Prom 22351 - 22410 6.2 21 14 Op 1 . + CDS 22531 - 22839 286 ## gi|253581015|ref|ZP_04858276.1| predicted protein 22 14 Op 2 2/0.000 + CDS 22808 - 24046 844 ## COG3864 Uncharacterized protein conserved in bacteria + Prom 24149 - 24208 1.6 23 14 Op 3 . + CDS 24230 - 25795 1550 ## COG0714 MoxR-like ATPases 24 15 Tu 1 . + CDS 25907 - 26719 797 ## COG4509 Uncharacterized protein conserved in bacteria + Prom 26747 - 26806 3.7 25 16 Op 1 . + CDS 26841 - 27563 775 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 26 16 Op 2 1/1.000 + CDS 27624 - 27815 313 ## COG4481 Uncharacterized protein conserved in bacteria + Prom 27817 - 27876 2.2 27 17 Op 1 24/0.000 + CDS 27926 - 28213 366 ## PROTEIN SUPPORTED gi|240144657|ref|ZP_04743258.1| 30S ribosomal protein S6 28 17 Op 2 21/0.000 + CDS 28293 - 28754 520 ## COG0629 Single-stranded DNA-binding protein 29 17 Op 3 . + CDS 28791 - 29054 342 ## PROTEIN SUPPORTED gi|160940069|ref|ZP_02087414.1| hypothetical protein CLOBOL_04958 + Term 29064 - 29116 12.6 + Prom 29159 - 29218 5.4 30 18 Op 1 9/0.000 + CDS 29286 - 31361 684 ## PROTEIN SUPPORTED gi|85057286|ref|YP_456202.1| exopolyphosphatase-related protein 31 18 Op 2 16/0.000 + CDS 31358 - 31804 527 ## PROTEIN SUPPORTED gi|240144674|ref|ZP_04743275.1| 50S ribosomal protein L9 32 18 Op 3 . 
+ CDS 31825 - 33153 1483 ## COG0305 Replicative DNA helicase + Term 33302 - 33356 14.2 + Prom 33371 - 33430 7.8 33 19 Tu 1 . + CDS 33560 - 34030 432 ## gi|253581028|ref|ZP_04858289.1| conserved hypothetical protein + Term 34050 - 34096 11.1 - Term 34038 - 34084 8.1 34 20 Tu 1 . - CDS 34100 - 34273 319 ## BT_2414 ferredoxin - Prom 34508 - 34567 10.2 + Prom 34467 - 34526 7.3 35 21 Op 1 15/0.000 + CDS 34694 - 35872 1638 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein + Term 35910 - 35956 13.3 + Prom 35957 - 36016 6.4 36 21 Op 2 24/0.000 + CDS 36127 - 37662 1856 ## COG3845 ABC-type uncharacterized transport systems, ATPase components 37 21 Op 3 26/0.000 + CDS 37659 - 38738 1337 ## COG4603 ABC-type uncharacterized transport system, permease component 38 21 Op 4 . + CDS 38751 - 39731 1176 ## COG1079 Uncharacterized ABC-type transport system, permease component + Prom 39752 - 39811 1.6 39 22 Tu 1 . + CDS 39832 - 40032 407 ## EUBREC_0036 hypothetical protein + Term 40102 - 40141 7.7 - Term 40155 - 40195 1.1 40 23 Tu 1 . - CDS 40214 - 40960 821 ## EUBREC_0035 hypothetical protein + Prom 41498 - 41557 2.7 41 24 Tu 1 . + CDS 41661 - 41861 256 ## gi|253581037|ref|ZP_04858298.1| conserved hypothetical protein + Prom 41958 - 42017 10.0 42 25 Tu 1 . + CDS 42173 - 43822 1865 ## COG3858 Predicted glycosyl hydrolase + Term 43879 - 43924 2.3 + Prom 43851 - 43910 3.4 43 26 Tu 1 . + CDS 43987 - 44682 606 ## COG0860 N-acetylmuramoyl-L-alanine amidase + Prom 44702 - 44761 7.5 44 27 Op 1 . + CDS 44808 - 46820 1428 ## Cphy_2889 hypothetical protein + Prom 46862 - 46921 4.9 45 27 Op 2 . + CDS 46961 - 49198 2017 ## COG5427 Uncharacterized membrane protein + Term 49203 - 49252 12.2 + Prom 49541 - 49600 8.2 46 28 Tu 1 . 
+ CDS 49774 - 50823 439 ## gi|253581043|ref|ZP_04858304.1| conserved hypothetical protein + Term 50943 - 50999 12.4 + Prom 51011 - 51070 4.5 47 29 Op 1 2/0.000 + CDS 51140 - 52192 1211 ## COG0009 Putative translation factor (SUA5) + Term 52193 - 52244 -0.6 + Prom 52212 - 52271 4.4 48 29 Op 2 . + CDS 52330 - 52761 606 ## COG0698 Ribose 5-phosphate isomerase RpiB + Term 52898 - 52946 -0.3 + Prom 52806 - 52865 3.3 49 30 Tu 1 . + CDS 52986 - 53477 617 ## COG2131 Deoxycytidylate deaminase + Prom 53849 - 53908 9.4 50 31 Op 1 . + CDS 54062 - 55501 1847 ## COG0442 Prolyl-tRNA synthetase 51 31 Op 2 . + CDS 55593 - 56918 1339 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase + Term 56969 - 57002 2.1 + Prom 56921 - 56980 1.6 52 32 Tu 1 . + CDS 57042 - 58055 739 ## Cphy_3824 CotS family spore coat protein + Prom 58071 - 58130 4.2 53 33 Op 1 1/1.000 + CDS 58156 - 58476 170 ## PROTEIN SUPPORTED gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 + Term 58492 - 58521 2.1 + Prom 58492 - 58551 2.8 54 33 Op 2 . + CDS 58575 - 58814 293 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) + Prom 58859 - 58918 7.2 55 34 Tu 1 . + CDS 59007 - 59252 231 ## GYMC10_0046 sporulation protein YabP + Term 59296 - 59334 -0.9 + Prom 59284 - 59343 4.0 56 35 Op 1 . + CDS 59475 - 59720 118 ## gi|253581053|ref|ZP_04858314.1| predicted protein 57 35 Op 2 . + CDS 59731 - 59988 360 ## gi|253581054|ref|ZP_04858315.1| conserved hypothetical protein + Term 60060 - 60124 3.1 + Prom 60086 - 60145 4.3 58 36 Tu 1 . + CDS 60175 - 61668 1346 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit + Term 61703 - 61745 2.4 59 37 Op 1 10/0.000 + CDS 61871 - 63328 629 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 60 37 Op 2 11/0.000 + CDS 63321 - 63851 683 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 61 37 Op 3 . 
+ CDS 63868 - 65721 1305 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 + Term 65735 - 65777 7.2 + Prom 65994 - 66053 4.8 62 38 Tu 1 . + CDS 66073 - 68208 2103 ## COG0366 Glycosidases + Term 68246 - 68292 5.5 - TRNA 68328 - 68409 53.8 # Tyr GTA 0 0 + Prom 68536 - 68595 3.3 63 39 Tu 1 . + CDS 68678 - 69541 1213 ## COG0191 Fructose/tagatose bisphosphate aldolase + Term 69564 - 69600 4.0 + Prom 69758 - 69817 6.0 64 40 Op 1 1/1.000 + CDS 69863 - 71305 1045 ## COG1376 Uncharacterized protein conserved in bacteria 65 40 Op 2 . + CDS 71368 - 71904 488 ## COG0778 Nitroreductase Predicted protein(s) >gi|226332864|gb|ACII01000155.1| GENE 1 225 - 1220 123 331 aa, chain + ## HITS:1 COG:TM0259 KEGG:ns NR:ns ## COG: TM0259 COG1181 # Protein_GI_number: 15643029 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Thermotoga maritima # 36 325 23 296 303 73 20.0 5e-13 MIKYNIVLIADRIQNKAALDMQSDKLEQIGDDYFFQIYNSLQRISPKVTHYNDLPTFIDN IYKHRNDIVFTIYGGNSSRNRMALVPGICESYNIKYVGADVYNRIICQDKNISKELAKKA GILVPDSVTINSLDNLNTFNLHLLKYPVVIKPLMEGSSIGITKNSLCQNANNIKEKIEKL YKTYNQPIMVEQFIPGKEIVTCLVGSSKNIRLAEMVEVCVEQDNTYFNDKLYTGEIKHQN KVNTYHRIITSEIDGSIMNTIKQVFTSFGKMDYMRIDGKLYNNKFYMIEFTPDASLSAHC SFKDIYEYQNKTYDNLMEDIINSSLENYRIQ >gi|226332864|gb|ACII01000155.1| GENE 2 1186 - 2427 381 413 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580996|ref|ZP_04858257.1| ## NR: gi|253580996|ref|ZP_04858257.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 413 1 413 413 716 100.0 0 MEKYYKEGYHKHDSQVIYKSLHFTILENNINDLNTLLASIRQYIISSADPIDYLKKIERV YYQNATYSQNTIAMDTLLFWLTELYQNLGNYKKAFTFFNYITDKSNNNYLMLKALLLYQV GFQENAVTYCTELIDGGKITERMELFARIIRLEANYTLEEPDKVNEDYMYIYQNSTRFEK YLEYGFFLRNSEFVKTPLNIIEDLKKSIVHFEKFNANKQAVSSRITLGVYYSLLGNYKQA EYQFKIADSKKEEFVGLYDMILTNQAVLMQYQGTLEGIKEKLTLAQKYAYYDFNKLAIAI NLLVYNIRTQKEDEMLIENILSLIENRTFKNKRIICYAYINLYNYYKNKDDKLSSFYFKK IFDLSPLPYYIKEWIYEPVLTPEDPEYYRTRIKWPLNFLDEWSIEFDSSLMSY >gi|226332864|gb|ACII01000155.1| GENE 3 2349 - 3524 260 391 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580997|ref|ZP_04858258.1| ## NR: gi|253580997|ref|ZP_04858258.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 391 1 391 391 693 100.0 0 MIDFINRKDERDLFKKYLEKIQGKNLIIVYGDMGVGKSELVKQILLSYRKYPAIKVEISH KNEFEAGYYLGKIKSYSEHPVNGIKFNFDSNHYLIKSKHFLLLKKMLFSILEEIPYLSAI FKIVNKTFKEYLNIKEALSEPSSKNNSILMEKYLNALYGDTPFILDIENIQTIDTCSWER LYSILWNTDNFCLILEYTANTSKNLELSTIIEKFKNIIDDKNIKLIHLTKLKDEHVIKIK PNLSNTEKRTLISLLKDWDGNIKEIENFLYFNKYNCKIKLAENTKYIIDTLHRDELKWLV YIYLSIEEFSFQEWIKIGLPKTVFKILIDSHLIKVENNFVAIDHDTICSILDDVEYIDFK NDAIIFWKNIIKRDTINMTLRLFINHFTLPY >gi|226332864|gb|ACII01000155.1| GENE 4 3639 - 4568 275 309 aa, chain - ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 4 300 1 296 297 144 31.0 2e-34 MDQLSNFISEYLKTCETIKKLDSKTIKAYRIDLSQFHTFMLPQEDFTEKSTLNSFLSLLH KKYKPKTAKRKIATLKAFFHYLVYEESLSQNPFEKLNVKFREPQVLPRIVPNTIIEIFLR TMYKQKKLAPTTYKYNSILRDIAVIELLFATGVRVSELCSLTLSQIDFTTYKIIINGKGS KERILQIGNEDVKEILSEYYAAFKTDISKTGWFFINRLHKRYSEESVRSMIVKYLKLAAI DMHITPHMFRHSFATLLLESDVDIRYIQRMLGHSSIKTTEIYTNVSTSKQNSILTAKHPR NSMHIDITM >gi|226332864|gb|ACII01000155.1| GENE 5 4792 - 5157 160 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253580999|ref|ZP_04858260.1| ## NR: gi|253580999|ref|ZP_04858260.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 121 1 121 121 229 100.0 4e-59 MSKKNHHPVSAEVENNPLYHDTYKLLRCYRDSTYSLMVAVRQVEIQFQLEYNTSVDEFLD SIYATGADLGDSQIEEWAKSIARSNKMIKLLLSSVDLLRKNHKHGEEYYWILYYAFCLLM S >gi|226332864|gb|ACII01000155.1| GENE 6 5765 - 6217 -72 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154503657|ref|ZP_02040717.1| ## NR: gi|154503657|ref|ZP_02040717.1| hypothetical protein RUMGNA_01481 [Ruminococcus gnavus ATCC 29149] # 9 127 1 119 405 218 88.0 1e-55 MKNHFRFIVNYQLELKQIVDFPRCRIYREFIQTLMKDRSIRTNGGSCPFYFLILCSYANH SSSYRNIEHLTYKVVPGEWICSLKELQIHFRQKFQHQVISILDTLVEQNLLTYSLHEKIR LSNTRSKTDQKIILPFPIIIHLCKVLMPIR >gi|226332864|gb|ACII01000155.1| GENE 7 6561 - 7145 512 194 aa, chain - ## HITS:1 COG:L19745 KEGG:ns NR:ns ## COG: L19745 COG1247 # Protein_GI_number: 15673759 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase and related acyltransferases # Organism: Lactococcus lactis # 7 172 1 159 187 154 46.0 7e-38 MNTNTFITIRTATLSDAQALLNIYSPYVEHTAITFEYDVPSVEEFASRIKNTLQKYPYLV AEKNGRLLGYAYASPFHERPAYDWAVETSIYVDQNIKHQGIGRRLHNALEDALRSQGILN MNACIAYPPEEDEYLDKNSVEFHTHMGYRLVGEFYKCGYKFHRWYNMVWMEKLIGNHLSD QKPPKFPALKYKIP >gi|226332864|gb|ACII01000155.1| GENE 8 7179 - 7409 276 76 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00960 NR:ns ## KEGG: EUBELI_00960 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 76 1 76 76 76 52.0 3e-13 MRQEKFKMERQGLINVGDEVEISEYQSFMNYSYIIEPAVAMSGCYPNAQRIKSKKGIVRN IEHTPRGFIVTAEFDE >gi|226332864|gb|ACII01000155.1| GENE 9 7771 - 8508 607 245 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 232 1 230 234 103 30.0 4e-22 MKTIVVCDDVEIERLLLKEILCQYFEEINEEVSIVEYDSGETLIADVEEGYIAMDLLFLD ICMKKLNGMETARKLRQIQCKVPIIFLTASPDYAIESYEVQASGYLLKSFSEEKLMKLLN RILKTDMKRRVAIKNRRQYRYPCTDDIMYIDSDKHNVTLHLSDGSEIITVDKLGEIEKRI 
NEKRFLRCHQSYLVNMDYIKDVEDDFIMEDGTLVPIRVRGRKEILDTYYDYFVNHFGDSK DTFAE >gi|226332864|gb|ACII01000155.1| GENE 10 8776 - 9969 1007 397 aa, chain - ## HITS:1 COG:no KEGG:CTC00591 NR:ns ## KEGG: CTC00591 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 4 379 3 382 386 303 41.0 1e-80 MKKSLRNYRLFFHRMGLLFLFIPLIVLILSIVLPDRGFSEKENRVLSSLPHLNGSLIASG GFEKQFETYENDQFPLRDLWITLKAGTDRLMGKVESNGVYVGKSGYLMEEFKAPDQTQYD ATVKAMTDFAQKHSDLKQYALIAPNSVNILSDRLPAFAPVQDQNSWLDSLSASLTDAGVT FIDVRSTFKDHKMEDLYYHTDHHWTTQGAYYAYLKAAKDMGIDTSADTYDKAPVTRSFQG TLSAKSGFRSGEKDEIDVFLPTGDQALSSVVNYVDEQKKSASFYDTEKLETRDKYALFFG GNHAQIKISTPTETDNTLLVLKDSYANSFIPFLAQHYRKIIMIDPRYYFGDLEQLMQVEN VQEILYLYNANTFFTDTSLELALTTAEQASSDTSDIR >gi|226332864|gb|ACII01000155.1| GENE 11 9982 - 11370 1210 462 aa, chain - ## HITS:1 COG:CAC1564 KEGG:ns NR:ns ## COG: CAC1564 COG1696 # Protein_GI_number: 15894842 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Clostridium acetobutylicum # 1 462 1 473 473 316 38.0 8e-86 MVFNSIFFIFCFLPVFMLIYYLVPGKLRNLLLFLGSLIFYAWGEPVYVILMLFSSIFNYY MGTELERLYYDDRKQKINLIFSIIINLAVLVFFKYYGFLLNTIGGIIGIHIPHPELSLPI GLSFYTFRNLSYLFDIYLSKVSAQRNFLTFSVYSTMFPYTSAGPIVRYTDIETQLKQRTI NISKFGIGAELFVKGLAKKVLLADNLSVLYSSICGHPQMSVFTSWLGILAYTMQLYFDFS GYSDMAIGLGKMLGFDFNKNFDYPYISTSVSEFWRRWHISLGSWFRDYIYIPLGGNRVST LKHIRNILVVWALTGLWHGASWNFVLWGVYYGLFLLLEKFVLQKYLKKIPSWLCTVYTML VVMIGWVFFSQTDFGALGHYLGTMFGIGASGFIDKTALYYLKTGFILLLISILMCRPAPY QYFKRLVRRRPVAAVIINLILFALSIAYMVYNSYTPFLYAKF >gi|226332864|gb|ACII01000155.1| GENE 12 11526 - 12041 441 171 aa, chain - ## HITS:1 COG:no KEGG:CTC00593 NR:ns ## KEGG: CTC00593 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 14 167 20 173 178 69 25.0 5e-11 MADKYGTDCGVKGMKNYLHIFIKGAFVLFLIAYLAVLYTSDSAKNVPMEQIAQTMEQDSD ITSLNKEARSDLKHYYQTDDRNIDGYFFYKAASPMAVEEICIMKATDNTQANTLLENANA HLSGQKQVFEGYGTDQMALLNNAVVGKKGNYVYYMCGADAQNWRTAFLSMI 
>gi|226332864|gb|ACII01000155.1| GENE 13 11999 - 12793 698 264 aa, chain - ## HITS:1 COG:no KEGG:CTC01503 NR:ns ## KEGG: CTC01503 # Name: not_defined # Def: phospholipase-subfamily protein # Organism: C.tetani # Pathway: not_defined # 89 254 144 310 317 105 30.0 1e-21 MKKQPKNKVLFLIPVGVILAGAVIFLCTRGNENDSVCKENVARLQALENSDISQTEEQLQ ALKQTTGISSGSDASDGSGSTLLSDVEIRQVFAGNVIIGDSITESIVEYGYLDTDVVVSK RGLNVGAADDQISTAIALNPSHVFMAFGSNDLEIYGSASSEFIDAYRTQIKKIQTALPDV PIYINCILPITDEAIAQTPDLGYYPDYNEGLIKLCQEMGCTFIDNSTIVTDSSENLYEPD GEHVIQDYYPKWLTNMAQIAGLKA >gi|226332864|gb|ACII01000155.1| GENE 14 13101 - 13961 806 286 aa, chain + ## HITS:1 COG:no KEGG:EAT1b_1734 NR:ns ## KEGG: EAT1b_1734 # Name: not_defined # Def: hypothetical protein # Organism: Exiguobacterium_AT1b # Pathway: not_defined # 10 174 10 173 181 72 27.0 2e-11 MGKKVSDFLKSEILSSVFYLAFGLCLLLVPDQTVNIICKIVFGLVMIASGIYHIVIYTAE KEKATILDLFTGVIVMVLGIFLFFTPQIVIKILPYLLGALVLVDSIWKIKGSYRLKKAQR GRWKIILIGCLVFIALGVSMLLYSFLSVTRMILFSGIILTADGVADIVFLIMIRLGMKKS EKFCAEKEQEENGKEKPWDDPADAGLDEKVQDKSLNEDETEWKQDTDSQVQEAEMEIETG KTADAADADAILDAEMQTPDEGVESPGREIREMLKNHDEPLEEWKD >gi|226332864|gb|ACII01000155.1| GENE 15 13969 - 14805 622 278 aa, chain + ## HITS:1 COG:no KEGG:BVU_3509 NR:ns ## KEGG: BVU_3509 # Name: not_defined # Def: putative arginase # Organism: B.vulgatus # Pathway: not_defined # 6 265 20 259 269 176 36.0 1e-42 MQEQNPIVLMNFSGIYREEEFWKNRQVSWIELQDVCGTNCYCDEEAIAEINKRTENYPTA GIHFIDSGNYHYMTRLWLTRMDQPFCLLVYDNHTDMQPPAFGGILSCGGWIAAALEELEN LKYVILVGPDEAAYEQVDENLKDRVIFLSREKLQVMNDEERNWFLRETVSEVCNWRKSEG LQEDAEKFLPLYISVDKDVLCTEDAQTTWSQGDMRLTTLVSGVQTVLECAKESSGKIAGV DICGEADSEAVHENEANDYANEKLLEVFEGLFGKSDCE >gi|226332864|gb|ACII01000155.1| GENE 16 14861 - 15643 510 260 aa, chain + ## HITS:1 COG:SA0716 KEGG:ns NR:ns ## COG: SA0716 COG0682 # Protein_GI_number: 15926438 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Staphylococcus aureus N315 # 6 214 13 227 279 122 35.0 7e-28 
MHNELLHIGPLTVYGYGFMIAMGVLAAWFVAEQRARKLKLACEHIFYLVVWCAIGGFTSA KILFWITNWREFLQNPRQIIGSDGFVVYGGIIGGILVGWLYCRIKKLKFLEYFDLMMPSI ALAQGFGRIGCFLAGCCYGKETSGPLAVTFTNSDFAPNNVALIPTQIYSGFLDFAHFLLL LYVAKHKKADGQVAACYLIFYSIGRFVIEFFRGDIERGSVGVLSTSQFISIFTTVAGIIL LLTVVKKQKQEANISLNSKG >gi|226332864|gb|ACII01000155.1| GENE 17 15704 - 17425 1349 573 aa, chain + ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 49 570 569 1095 1118 263 31.0 7e-70 MFRPDDFEVQIQQENPPGIHAEPSIPPAQDEEYGDFIVKHGQNMGGKFLPVNELFGILYI PLNEIGPLEINSYTYASLPKCYTFMDADALNSSGITRLHNHPYLKLQGEGTVIAVIDSGI DYLNPIFRNGDVSRIAYIWDQTIPGNEDEQVPYGKVFTGEEINQALLSENPQEIVPSLDE NGHGTAMAGLAAGNFVPTENFSGAAPKATIIVVKLKKAKSYLRKFYQYPPQAPVFQEDDI MLGISFAVKMAQEMGMPVSVCLGLGTNQSAHVGDSELSRYVDYINEDSQVSVSVAAGNEG AAQHHYTAELDYVKNQDTVELRIADKEEGFSMEFWGDPPDDYGISLQSPAGEKLYVSSSL GAGTQELSFIFVETKVLVNYVKMERMTGKQLIYFRFFHPAAGIWKVNVSKKGISGSRFHM WLPVQGLISPDTYFLESTPYITVTAPGDSTRGITATAYQYLDNSLYFQAGRGFTPNNQVT PDLAAPGVDLLIPLPGGAFGKASGSSLSSAVVAGAAALVQEWAIVRGNIPYASGNTVKFY LQKGAVREEQMEYPNPGWGYGRLDLYRTFEIIN >gi|226332864|gb|ACII01000155.1| GENE 18 17586 - 18977 403 463 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 17 457 15 445 447 159 27 3e-38 MKRGSIFETDGIPRMSEALPLAMQHVVAMIVGCVTPAIIVSGAVPGGLSREDQVILIQSA LVIAALSTLLQLFPIGGKAKFAIGSGLPIIMGVSFAYVPSMQAIAESYGIAAIMGAEIVG GIVAVVMGLLVKKIRVFFPPLITGTVVFTIGLSLYPTAINYMAGGTSSPNYGSWQNWAIA FFTLIVVTALNHFGKGIWKLASILIGIIVGYLVSIPFGMVDFSSIGEAGVCQLPSLMHFG VQFEPSSCVALGILFAINSIQAIGDYSATTIGAMDRTPKDDELQRGIVGYGLSNVVGALL GGLPTATYSQNVGIVTTTKVINRWVLGLAATILGIAGLVPKFSAILTTIPQCVLGGATVS VFASIAMTGMKLVASAEMDYRNSSIVGLAAALGMGVSQATAALASFPTWVTTIFGKSPVV LATIIAVMLNVILPKTRDEQKEEAKKKIKIEQKLEEDHREFGE >gi|226332864|gb|ACII01000155.1| GENE 19 19123 - 20355 1122 410 aa, chain + ## HITS:1 
COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 1 410 11 415 427 231 37.0 3e-60 MSRKNLIVGQSGGPTAVINSSLYGVVSEGMRHPEAIEHVYGMINGIEGFLNGTVLDFAEA LPGEKLDGLKVTPGAYLGSCRYKLPESLEDPVYPRLFKKFEEMNIGWFFYIGGNDSMDTV SKLSRYAAQTGSDIRILGEPKTIDNDLIHTDHTPGFGSAARYVASTVREITIDANVYEKK SVTIIEIMGRHAGWLTAASALARKYTGDNPLLIYLPETAFDQEEFLKAVENSFEKNCNVI VCVSEGIHDDKGTFICEYDNSVGTDTFGHKMLAGCGKYLENLVRNRLGVKARSVELNVSQ RCSASMMSATDQQEAIKAGEFGVQAALNGETGKMISFIRKETSDGTYIMECGLEDVNAIC NEEKTVPSEWITEDGSDVTEAFINYARPLIQGTVQIPCGEDGLPSFVYRK >gi|226332864|gb|ACII01000155.1| GENE 20 20652 - 22373 251 573 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 333 560 125 351 398 101 31 1e-20 MNTIKKFIHYYGPYKTVFFIDLICAAVISLIDLAYPQILRTMTKTLFTQDQSRILHALPV IAGGLFLMYVLQCLCKYYVTCQGHMMGANMERDMRQELFDHYQQLSFSYYSQNNSGQMMS KLVSDLFDISEFAHHGPENLFISLVKIIGAFVFLFFINKKLAFPLIILVIVMFWFSFKQN ARMQATFMENRRKIGDVNASLQDTLSGIRVVQSFANEDIEHNKFKKSNEAFLLSKRDNYR CMGSFMSSNLFFQGMMYLVTLVYGGYLIANGEMQTADLAMYALYIGIFISPIQILVELVE MMQKGLSGFRRFLDVMETESEITDAPDARELTDVKGHVSYEHVSFHYSDDNTPVLSDISI DIPAGKSVALVGPSGGGKTTICSLLPRFYDVTEGRITVDGKDIRSLTLKSLRNNIGTVQQ DVYLFDGTIRDNIAYGKPGASDEEIIAAAKRASIHDFIMELPQQYDTYVGERGTRLSGGQ KQRISIARVFLKNPPILILDEATSALDNESERWIQRSLEELSQNRTTITIAHRLSTIRDA DEIIVITEDGIAERGTHEELLDMNGVYAAYYNM >gi|226332864|gb|ACII01000155.1| GENE 21 22531 - 22839 286 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581015|ref|ZP_04858276.1| ## NR: gi|253581015|ref|ZP_04858276.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 102 17 118 118 166 100.0 3e-40 MQDRKLEQALEIYFQENPDRESDKYQEIQKYLKLRIRSVMELLIENEDTERMEQIEKCGW FSANELENFIRCAQEKAKLRSLVWLLHLKDKKYGYQKKDFSL >gi|226332864|gb|ACII01000155.1| GENE 22 22808 - 24046 844 412 aa, chain + ## HITS:1 COG:DR1169 KEGG:ns NR:ns ## COG: DR1169 COG3864 # Protein_GI_number: 15806188 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 57 381 48 355 379 87 26.0 6e-17 MDTKKKTFHCKTTTEQELTGKKILQNCRNQLYLDFPYLDGAFTSLEYKADSSIDTIGTDG ELIYFNPVFLMKKYLEDPAAVRRGYLHMLLHCLYLHIFMEPDRDSGEWDRECDCFVEQLI DKAVGEGKISLKKRKSDDTQVNTNTFDDHIFWQRTAGQASGGRRTKEKWEHILAYTSQNK TGMQSRAGTQTNDQQEELDEIYRSKYDYRKFLKQFCRLREEVELDTESFDYIFYSFGMEH YGNIPLIEPLEYKEVNRLEELVIAIDTSGSCSGETVQRFLGETYGILSEKENFFSRMKVY IVQCDCFIQKVDVIHSEEEWKEYIRNVKIQGRGGTDFRPVFELIRQEKERKELKNLRALI YFTDGDGIYPGQKPDYETAFVFLKRTDKMKLVPPWAKCLITGTGQTAEQMSV >gi|226332864|gb|ACII01000155.1| GENE 23 24230 - 25795 1550 521 aa, chain + ## HITS:1 COG:DR1171 KEGG:ns NR:ns ## COG: DR1171 COG0714 # Protein_GI_number: 15806190 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Deinococcus radiodurans # 18 238 3 212 340 85 26.0 2e-16 MNIQEAKTEICNTLRAYLAKDENGYYTYPLIRQRPLLLIGPPGIGKTAIMEQAAAECGVG LVAYTITHHTRQSAIGLPEIVKRNYGGKGMMVTDYTMSEIVASVYDCMENTGRKEGILFI DEINCVSETLAPAMLALLQNKTFGSHKIPEGWILVAAGNPPEYNKSVREFDIVTLDRVRK LTIEPDCDIWLKYAGQQKVHQAIISYLSVKKNNFYAVENTVDGKFFVTARGWEDLSRLLQ SYEKLGIGISEELVEEFLQKEETARDFAGFYQLYTKYGEDYEIPSILSGNLSRENLALKE KMAADGAFEERFAVVNLILGALREKAEEYGQADHQLEVLYEMLLHLRTQVRKNAEEEKLG ISLLEEFVQEQENSLQIKVKMELISVREQKYQETAIRRLREYILTAKKEHVRSAENGFKR ISKCFEKEAVCREDMIQKLQGELQNAFSFIKESFGENQEMLLFVTGITADKSMTSFIAGN GCPAYFEHSEMLLYNKEENELRKACLEAVHERLQEESRDRV >gi|226332864|gb|ACII01000155.1| GENE 24 25907 - 26719 797 270 aa, chain + ## HITS:1 COG:lin2285 KEGG:ns NR:ns ## COG: lin2285 COG4509 # Protein_GI_number: 16801349 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # 
Organism: Listeria innocua # 17 263 11 244 246 115 36.0 1e-25 MSEKQPEKKKTAGDVVLTVVLIAAICVFCYAGYNLFHIYTEYKKGTDEYNSITQMAVTER DPDGEAAGPEAGSELKAPMDIDFASLKSVNNDVVGWIYVEAVPDINYPIVHGKDNETYLH RTYEKNYNFAGTIFVDYENKGDFSDCNTIVYGHNMKNGSMFAQLKKFTQDEETYKKSKYF WIFTPEKNYRYEIISAYTTGVNSDTYTLFKGPGEEFEKYLEKIRGYSEIRTDAEGMNIKD KIITLSTCTGNEATRYVVQGKRVDTLDVGQ >gi|226332864|gb|ACII01000155.1| GENE 25 26841 - 27563 775 240 aa, chain + ## HITS:1 COG:CAC0284 KEGG:ns NR:ns ## COG: CAC0284 COG0846 # Protein_GI_number: 15893576 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Clostridium acetobutylicum # 4 238 6 244 245 303 59.0 2e-82 MAGEVERLQELVDKYDNIVFFGGAGVSTESGIPDFRSQDGLYHQKYDYPPETILSHTFFM RKPEEFFKFYRDKMLCDTAKPNAAHLKLAELEQAGKLKAVITQNIDNLHQMAGSKKVLEL HGSVYRNYCMKCHRFYDFAHMKASTGVPRCECGGIIKPDVVLYEEGLDNQTINEAVKAIS EAQVLIIGGTSLAVYPAAGLIDYFRGEHLVVINKSPTPRDRYADLLIQEPIGQVFAQIHC >gi|226332864|gb|ACII01000155.1| GENE 26 27624 - 27815 313 63 aa, chain + ## HITS:1 COG:CAC3725 KEGG:ns NR:ns ## COG: CAC3725 COG4481 # Protein_GI_number: 15896956 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 63 1 63 65 65 47.0 2e-11 MGLPYEVGDIVTLKKVHPCGSRDWEILRVGADFRLKCTGCGHQIMVPRKMVEKNTKNLIK KEK >gi|226332864|gb|ACII01000155.1| GENE 27 27926 - 28213 366 95 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240144657|ref|ZP_04743258.1| 30S ribosomal protein S6 [Roseburia intestinalis L1-82] # 1 94 1 94 96 145 78 6e-34 MNKYELALVVSAKVEDEVRDAVVEKAKGYITRYNGTITEVEEWGKKKLAYEIQKMHEGFY YFIQFEADAQCPAEVERHVRIMDNVLRYLVVRKDA >gi|226332864|gb|ACII01000155.1| GENE 28 28293 - 28754 520 153 aa, chain + ## HITS:1 COG:CAC3723 KEGG:ns NR:ns ## COG: CAC3723 COG0629 # Protein_GI_number: 15896954 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Clostridium acetobutylicum # 1 115 1 114 144 109 47.0 2e-24 
MNKVILMGRLTRDPEVRYSAGENALAIARYTLAVDRRFRRDGEATADFISCVSFGRTAEF AEKYFRQGLKIIVSGRIQTGSYTNRDGQKVYTTEVVVEEQEFAESKAASDNYAASHPQTS VPTPAPSMPAPGAASADGFMNIPDGIDEELPFI >gi|226332864|gb|ACII01000155.1| GENE 29 28791 - 29054 342 87 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160940069|ref|ZP_02087414.1| hypothetical protein CLOBOL_04958 [Clostridium bolteae ATCC BAA-613] # 1 87 1 89 89 136 74 4e-31 MAYNRGERPDSPMKRRGGRRRKKVCVFCGKENNEIDYKDVAKLRKYVSERGKILPRRITG NCAKHQRALTVAIKRARHMSMMPYIQD >gi|226332864|gb|ACII01000155.1| GENE 30 29286 - 31361 684 691 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|85057286|ref|YP_456202.1| exopolyphosphatase-related protein [Aster yellows witches'-broom phytoplasma AYWB] # 189 687 193 695 849 268 32 1e-87 MMKNGKKDMSDKLKLKGHMKAFMRWPLILSALLIVLNIWVYFVSVKAGIIVSGGILIYVG CAVIILRCHRPFIVNELIAFANQYDSLEKRILEELALPYAIMDMNGRMIWSNKVFAELTG KDQFYKKNVSTVFPDVTADKLPVADKKETAEISTRFGEKTYRISMQRVSLGEVVAKSEFL ENSNRNVSLIAMYLYDDTELKSYIKKNEDNKLVVALAYLDNYEEALESVEDVRRSLLIAL IDRKITKYFSNFDGLVKKLEKDKYFLIMRQSSLEALKEQRFHILDEVKTVNIGNEMAITL SIGVGLNASTYIQNYEYSRIAIEMALGRGGDQVVIKNGNNITYYGGKTQQMEKNTRVKAR VKAQALKEFMSTKDRVVVMGHKITDVDALGAAIGIFRAGKTLGKSVSIVVNDPTKSIRPL IAGYVNNPDYEPSMFVDSEQAKDMVDNNTVVVVVDTNRPSYTECEELLHMTKTIVVLDHH RRGSEVIENAVLSYVEPYASSACEMVAEILQYFSDDLRIRNMEADCLYAGIMIDTNNFTT RAGVRTFEAAAFLRRSGADVTRVRKLLRDDLKSYQARAEAVRTAQIYREYYAIARCPSEN LDSPTVIGAQAANELLNIAGVKASFVLTQYNNEVYISARAIDEINVQVMMEKMGGGGHMN IAGAQVKASPDEVERMLKDIIDQEYQEENTK >gi|226332864|gb|ACII01000155.1| GENE 31 31358 - 31804 527 148 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240144674|ref|ZP_04743275.1| 50S ribosomal protein L9 [Roseburia intestinalis L1-82] # 1 147 1 147 148 207 70 1e-52 MKVILLEDVKALGKKGQIVNVSDGYARNMILPKKLGLEATPKNLNDLKLQKANEEKVAKE VYEAAQAFAKDLETKEIILTLKTGEGGRTFGSVSSKEISEAAKKQLNLDIDKKKLQLPEP IRTLGVTQVPVRLHPKVTGTLKVWVKEA >gi|226332864|gb|ACII01000155.1| GENE 32 31825 - 33153 1483 442 aa, chain + ## HITS:1 COG:BS_dnaC KEGG:ns NR:ns ## COG: BS_dnaC COG0305 # Protein_GI_number: 16081096 # Func_class: L Replication, 
recombination and repair # Function: Replicative DNA helicase # Organism: Bacillus subtilis # 1 442 1 443 454 455 53.0 1e-128 MEEALIKRILPHSIEAEQSVIGSMILDRDAILVASEILTSDDFYQKQYGIIFDAMVELCN EGKPVDLVTLQNRLKEKELPPDISSMEYVRDLISAVPTSANVKYYAKIVSEKAVLRRLIK ANEEIANTCYLEKENTEIILEEAEKKLFNILQRRNNEEYVPIQQVVLNAINNIEKASKLK GSVTGIPTGFMDLDYKTSGMHPSDLVLIAARPSMGKTAFVLNIAQYMAFRKDVTVAIFSL EMSKEQLVNRLLAMESHVDSQNMRTGNLKDEDWTKLVEGADIIGKSNLIIDDTPGISIAE MRSKCRKYKLEHNLGIIMIDYLQLMSGSGKSDSRQQEISDISRSLKALARELNVPVIALS QLSRAVEQRPDHRPMLSDLRESGAIEQDADVVMFIYRDDYYHKDTEKKDIAEIIIAKQRN GPIGTVELVWLPRYTQFVNMKK >gi|226332864|gb|ACII01000155.1| GENE 33 33560 - 34030 432 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581028|ref|ZP_04858289.1| ## NR: gi|253581028|ref|ZP_04858289.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 156 1 156 156 230 100.0 2e-59 MEQRYTVTQTAEILGVRASVLRYWEEELELRICRNEQGHRYYTGNDITLLANIKELKKRG LALRAIKELVPRISRTAPGTTASKVKLLEDEQPEREKILEFQKIMERLIAQELHMKKEGE DQCRDLDETIRNRQLARKEAAAALEKRSRRIRRKHL >gi|226332864|gb|ACII01000155.1| GENE 34 34100 - 34273 319 57 aa, chain - ## HITS:1 COG:no KEGG:BT_2414 NR:ns ## KEGG: BT_2414 # Name: not_defined # Def: ferredoxin # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 57 21 76 76 66 71.0 4e-10 MAYVISDECVSCGTCESECPAEAISQGDEHYVIDADACLDCGTCADACPTEAIHPAE >gi|226332864|gb|ACII01000155.1| GENE 35 34694 - 35872 1638 392 aa, chain + ## HITS:1 COG:SMc00242 KEGG:ns NR:ns ## COG: SMc00242 COG1744 # Protein_GI_number: 15965420 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Sinorhizobium meliloti # 3 354 2 336 356 161 33.0 2e-39 MKKALAIGLAAVMAVSMSAPVFAEDEGKGIAKEDLKVGVIYIGDENEGYTAAHMKGIDEM EEKLGLDDSQIIEKTLIGEDEGCYDAAADLADQGCQIIFANSFGHETYILEAAGEYPEVQ FCHATGTQAASSGLSNMHNYFTNIYEARYVSGVVAGLKLNEMIEDGTVKEDACKMGYVGA FPYAEVISGYTAFYLGAKSVCPSVTMEVKYTNSWASFDLEKECADALISDGCVLISQHAD TTGAPTACEAAGVPCVGYNIDMTSVAPNTALTSASMDWGVYYTYAVQCMLDGTAIDTDWC 
KGFAEGADKITALNDKTVAEGTEEKVKEVEDALIDGSLHVFDTSAFTVDGKELDTYKKGD TEYISDGYFHESEYGSAPAFDIAIDGITSITE >gi|226332864|gb|ACII01000155.1| GENE 36 36127 - 37662 1856 511 aa, chain + ## HITS:1 COG:AF0887 KEGG:ns NR:ns ## COG: AF0887 COG3845 # Protein_GI_number: 11498492 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Archaeoglobus fulgidus # 9 501 1 487 495 394 43.0 1e-109 MGENYALELRDISKSFGSVQANDHVDLTLRKSEILAILGENGSGKTTLMNMIYGIYYPDE GHIFVNGKEVTIRSPKDSYELGIGMVHQHFKLVDVLTAAENIVLGLPGKGKLDMKRITED IQKLADKYGFELDLSQKIYEMSVSQKQTVEIIKMLYRGARILILDEPTAVLTPQESDRLF DILRNMRKDGCSIMIITHKLQEVLALSDRVAILRKGKYIDTVETAQANAQSLSEMMVGGR VDLNIDRPEPENVRKRLVVKGLNCKNKEEVKTLDDVDLTVNAGEILGIAGISGSGQKEFL EAIAGLQAIESGSITLLDDNGKGTELAGMDSISINKAGISLAFVPEDRIGMGLVGDMDMT DNMMLRSYRNGKSPFLDRKGPKALALKIKEQLEVMTPSISAPVRQMSGGNVQKVLVGREI AQNPKVLLVAFPTRGVDVNTSHVIYRLLNEQKKKGVAVVCVIEDLDVVLELCDRIAVFCG GKISGIEDGRTATKEGIGILMTKHEKGGTQA >gi|226332864|gb|ACII01000155.1| GENE 37 37659 - 38738 1337 359 aa, chain + ## HITS:1 COG:SMc00243 KEGG:ns NR:ns ## COG: SMc00243 COG4603 # Protein_GI_number: 15965422 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Sinorhizobium meliloti # 11 355 6 349 364 149 33.0 9e-36 MKKEPFLQVSKRRNKARWQERLLIRFIALILALIVCGAVIVALVKMNPVDVYKAIWDGAM GSDRRLWQTIRDTMVLLCISIGLAPAFKMKFWNIGAEGQILIGGACSAAVMIYAGDKMPT GLLLIVMFVASALGGMIWGMIPAVFKANWNTNETLFTLMLNYVAMQVVTYCIVFWENPKG SNTVGIINQATKGGWLPELFGQKYGWNVVIVLILTVGMFIYLKYCKQGYEIAVVGESENT ARYAGIQVKKVIIRTMAISGAICGIAGFVIVSGASHTISTATAGGRGFTAIIVAWLSKFN TFIMVAVSFGIVFMNQGAVQIATQYGLNENASNVLLGIILFFLIGAEFFINYRVKRAKK >gi|226332864|gb|ACII01000155.1| GENE 38 38751 - 39731 1176 326 aa, chain + ## HITS:1 COG:AF0889 KEGG:ns NR:ns ## COG: AF0889 COG1079 # Protein_GI_number: 11498494 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: 
Archaeoglobus fulgidus # 2 323 5 306 310 184 40.0 3e-46 MSVLIVFIQKAIVQGICILYGALGEIMTEKSGNLNLGIPGIMYMGGISGLMGAFLYEKDN PDPNALVGVLISFLCAFACAAIGGLIYSILTITLRVNQNVTGLALTIFGTGFGNFFGGSI SKLAGGVGQISVKVTGSAYTAKIPGLSKIPVIGGILFNYGFLTYLCIIIAVLLSFFLFKT RAGLNLRAIGENPGTADAAGINVIKYKYLSTCIGAGLAGLGGLYFVMEYSGGTWTDNGFG DRGWLAVALVIFALWKPLNAIWGAFLFGALYILYLYIPGLGRSMQEVFKALPYVVTIIVL VFTSFRKKKEHQPPAALGLPYFREER >gi|226332864|gb|ACII01000155.1| GENE 39 39832 - 40032 407 66 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0036 NR:ns ## KEGG: EUBREC_0036 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 64 1 64 69 87 70.0 1e-16 MAKVSKDMLIGQLLQIDANIAPILMRAGMHCLGCPSSQMESLEEAAMVHGLDVDVLVNQI NDFLGE >gi|226332864|gb|ACII01000155.1| GENE 40 40214 - 40960 821 248 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0035 NR:ns ## KEGG: EUBREC_0035 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 248 1 246 246 215 46.0 1e-54 MKIEKINDNQIRCTLTREDLENHQIRISELAYGTEKAKKLFRDMMQQAQIQFGFEADNIP LMIEAIPVSTESIILIITKVEDPEELDTRFSKFAPFRGNDKTDTLQLDGADNIIDIFQKL YEAKMKSTQTKDGKSAQNTGAKENDDTQVPDVNLIRLYEFDTLDDVIAAAHGLNGYFTGT NTLYKDPADELYKLVLHQSSLSPEDFNRVCNILTEYGQGKAFSLSGEAYLTEHGELISDS ALQQLIQL >gi|226332864|gb|ACII01000155.1| GENE 41 41661 - 41861 256 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581037|ref|ZP_04858298.1| ## NR: gi|253581037|ref|ZP_04858298.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 66 1 66 66 122 100.0 8e-27 MRKPLEKELKSYREQGISLYLNGILSNPKTIAKACQIAEGGGYMRDYVEDEKGRIARVDF DFVKEK >gi|226332864|gb|ACII01000155.1| GENE 42 42173 - 43822 1865 549 aa, chain + ## HITS:1 COG:CAC1556 KEGG:ns NR:ns ## COG: CAC1556 COG3858 # Protein_GI_number: 15894834 # Func_class: R General function prediction only # Function: Predicted glycosyl hydrolase # Organism: Clostridium acetobutylicum # 250 548 146 445 446 121 27.0 3e-27 MNKKYKPVIAVAVLVILVAILGIVTHVVMKYIPSSEKMDLNEYYGEMADGEIALVIGTEK MEERGLVDGDRVYLPLDVVNTYLNQRYYWDSANQQILYATPSELTSASASSEAGDKVWVK DDKVYLNLTYVQEFTDLDAYITKDPYRIAIQYKFKNVKTVTVKKNTSIRYRGGIKSAILT SVKKGTKLRLIEEMENWDQVATDDGYIGYIDKKKVGEAEKTKFERSFKREQYSYLTMDSK VNMVWHQVTSTDANAYFADATANMTGVNVISPTWFYLTDTSGNIASIASADYVSQAHEKG LQVWGLIDNFTQEVSTTETLSSTAARQNIISQLIQAAQDVGMDGINVDFESLSEDVGTHF LEFLRELSIECHKNNLVLSVDNPVPEDFTSHYDRAEQGRVVDYVIIMGYDEHYVGSEAGS VASLPWVEQGIQDTLDEVPAERVINAIPFYTRLWRTTGGNVTSEAIGMDQAQQTIADNNV ETYWDKTTSQNYGKYDIDNSTYQIWLEDAQSVAEKVKLVSKYDLAGVSAWKLGFENNVIW QVISDNLNN >gi|226332864|gb|ACII01000155.1| GENE 43 43987 - 44682 606 231 aa, chain + ## HITS:1 COG:BH0239 KEGG:ns NR:ns ## COG: BH0239 COG0860 # Protein_GI_number: 15612802 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 13 229 12 233 238 123 35.0 3e-28 MKKYGLELMLGCLLLVSFLILSKQAAEVSETMSSTENSKIILVDAGHGGADPGMIGVNGL EEKGINLQIAVKLKDSLEKQGFSVIMTREEDKGLYEEDSRNQKAQDMQCRIAMIKKYRPV LCISVHQNSYQDSSVCGPQVFYYEDSVRGKNLAEFIQEELNLGLKVKRPRVAKGNKTYYL LKRSESVLNIVECGFLTNPEEAGLLCKEEYQNKIVEAIVKGIEQYLKQQKI >gi|226332864|gb|ACII01000155.1| GENE 44 44808 - 46820 1428 670 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2889 NR:ns ## KEGG: Cphy_2889 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 670 1 694 698 423 36.0 1e-116 MLGIIYLLLAGMLGCEASKMLTGEGRSVSGINRIWLMLPASFGVGTLLLTWMVYIISWFF SVVGKAENPLLYGNIIGMTGAAVIMILLSVQKYRKQGSYRIWNIDKTQDKRRLKKEILLF 
GLLTVFITYMMFYVFYIKDGILYSGLTVYGDYAPHTAMMRSFSAGNNFPTQYPHYGGADV KYHFMFQFLTGNLEYLGMRIDFAYNIVSTLSLVGFLMLLYQLALRITGKMCCGVLALFLF FFRSGMAFFRFVWEHIQAGNLVETLEENTSFIGYTVNENWGLWNFNVYLNQRHLAFGLLM ATLALYLFMDWLEAGTAHEEKGILWIKNRIFSKEGWKSRNLDQALLMGMFLGLCAFWNGA AVIGGLLILCGFAIFSDGKVDYAVMAGVSIFFSWLQSKIFIVGSAMSPQIYLGFLAEDKT VPGVVKYLFWMSGVFFLGLVLMVWFMRRRERAILVSFLFPVIFAFVLLMTPDINVNHKYI MISYAFLTIFWAWAVCSLWKGRNGQSGIQKTMGKILAIVLVISLSVTGIYDFVVIIKGNG PGRRVTVNMNSELTQWLEENLEKNDLLLTPEYSMNEVTMSGAMLYCGWPYYAWSAGYDTN YRAAQAVTIYTTSDSETLKKTVQQEKITYILFEEGSEFEQQECREETIAAAYEKVYETQD GRIRIYKTTE >gi|226332864|gb|ACII01000155.1| GENE 45 46961 - 49198 2017 745 aa, chain + ## HITS:1 COG:MA4078 KEGG:ns NR:ns ## COG: MA4078 COG5427 # Protein_GI_number: 20092871 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Methanosarcina acetivorans str.C2A # 516 720 465 679 705 99 31.0 3e-20 MKKWAPRVLLAAALAGLSAFLLKGDVWTFWTWWLLAFLMGMVAMPVTGRLFAGFEDKGWM FSKVLAITVTGFLTWFLVTAKILPFTAATCIGVSVVCAVGCGVLYHFQGKNGIDCFPSGK VNLIYGEEILFFIFFLMWTYFAGFRPQAYGTEKFMDYGFMEAMMRSTTLPARDLWYSEGT INYYYGGQYFAVFLTKLTGSKVELTYNLMRTFVAAFAFVLPFSLVRQMSVDRLKESLTGK KRCLPAVAGIIAGLSVSIAGNMHYVVYSKIIPWLQNLQGKEADSYWFPDATRYIGYNPDV PDKTIHEFPCYSFVLGDLHAHVVNVMFVLFLVGLLYAWMRSVRMREAVIMKPGRKQFWKK QLLIPHILLAAVMIGMFRFTNFWDFIIYFVVTGGVVLFTNIVQFDGKVKRILAVTAVQAV EIIVISYVVSLPFTLQFDSMFKGVGIAQNHSMIHQLLILWGLPVVLTVLLIICIIWEKLR GGANRSLYKLMKAISVPDLFAIVMGLCAIGLIVIPELVYVRDIYENGNARANTMFKLTYQ AYMLFGMTMGYGIFRMLVAARKKVFKVVSVIGLVLLCWTFGYFGNSVYAWFGKVWEPSEY KGLNAASFLENDFAEDVAGIKWLKKNVKGAPVVLEANGDSYSGYERVSAMTGLPTVLGWY VHEWLWRDDTADLNEKSADIESIYTSLDEEYVKELLEEYDVSYIFVGSKEREKYGETLNE DVLKSLGEVVFQDEVYSTYILKVNK >gi|226332864|gb|ACII01000155.1| GENE 46 49774 - 50823 439 349 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581043|ref|ZP_04858304.1| ## NR: gi|253581043|ref|ZP_04858304.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 349 18 366 366 697 100.0 0 MMLTGAICCSSCFCQTAFAKTSKVRVYDAEKGTTSFTSFTYSDNKKDESLAPSEGTGAIS SASQKVVIFHQADGSTIIRKADSNGKVTLPAIRNQTGYTFLGWSTKPDQTQNPQYQAGQV IQVRKKTHLYAVMYNWQQEPDIQVNNLAAQLSEYSGIIFVGDSRTYFMQKTLLREYGKDA VAKVSFVCKTGEGLSWFETAGERVMRSEIARLQSDSDKPVAVIFNLGVNDLSSHNSGNGV DYKGEANAYLARMNTLAEELESDCRLFYMSVNPVNTAMKPTRKEAQLRYFNDRLQSRLNK RFQWIDTYKYLMKNGYSTYNEFKGNIDDGVHYSTCTYKRIYKYCMNAIR >gi|226332864|gb|ACII01000155.1| GENE 47 51140 - 52192 1211 350 aa, chain + ## HITS:1 COG:CAC2882 KEGG:ns NR:ns ## COG: CAC2882 COG0009 # Protein_GI_number: 15896136 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Clostridium acetobutylicum # 1 349 1 350 350 316 45.0 5e-86 MKAEVVSMTADNLDMEAIRRGGDILKQGGLVAFPTETVYGLGGDALNPQASMKIYAAKGR PSDNPLIVHIAEFDKLAPIVAEVPEKAKILAEKYWPGPLTMIFPKSDLVPQETTGGLDSV AVRFPSDRIAQELIKAAGGYVAAPSANTSGRPSPTTAQHVEEDLGEAIEMIIDGGQVGIG LESTIVDFTEDVPVVLRPGYISLEMLQETLGDVRMDKGLIAADSKVRPKAPGMKYRHYAP KADLAIVEGPEEAVVKKINELAAEAKAHGEQVGIIATDETKDRYPEGLVVSIGSRKEEET IAHHLYEVLRDFDQSAVRSIYSEAFYTPRMGQAIMNRLLKAAGHKIIQVV >gi|226332864|gb|ACII01000155.1| GENE 48 52330 - 52761 606 143 aa, chain + ## HITS:1 COG:CAC2880 KEGG:ns NR:ns ## COG: CAC2880 COG0698 # Protein_GI_number: 15896134 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Clostridium acetobutylicum # 2 140 3 140 152 167 62.0 6e-42 MIALGCDHGGYELKQEIKKYLDEKGIEYKDYGCDSLDSVDYPVYAKKVAHAILDGECEKG ILICGTGIGISITANKFKGIRAAVCTDCFTAEATRLHNDANILALGGRVVGPGLALKIVD TFLNTPFSNDERHIRRINQIETE >gi|226332864|gb|ACII01000155.1| GENE 49 52986 - 53477 617 163 aa, chain + ## HITS:1 COG:FN1902 KEGG:ns NR:ns ## COG: FN1902 COG2131 # Protein_GI_number: 19705207 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Fusobacterium nucleatum # 4 161 12 168 174 201 59.0 6e-52 MDVNHKRTDYLSWDEYFMGVAMMSGMRSKDPNSQVGACIVSEDNKILSMGYNGFPKGCSD 
DEFPWAREGDSLHTKYFYVTHSELNAILNYRGGSLEGAKLYVSLFPCNECAKAIIQAGIK TIVYDCDKYADTPAVIASKRMLDAAGVRYYKYNRTGRKITIEV >gi|226332864|gb|ACII01000155.1| GENE 50 54062 - 55501 1847 479 aa, chain + ## HITS:1 COG:MYPU_1830 KEGG:ns NR:ns ## COG: MYPU_1830 COG0442 # Protein_GI_number: 15828654 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Mycoplasma pulmonis # 6 479 27 501 501 479 47.0 1e-135 MANNKKLVEAITSMEEDFAQWYTDVVKKAELCGYTSVKGCMAIKPAGYAIWENIQHELDR RFKETGVQNVYMPIFIPESLLQKEKDHVEGFAPEVAWVTHGGLDPLQERLCVRPTSETLF CDFYAKEIQSHRDLPKVYNQWCSVVRWEKTTRPFLRSREFLWQEGHTAHATAEEAEERTI QMLNLYADFCEEVLAIPVIKGQKTDKEKFSGAEATYTIESLMHDGKALQSGTSHNFGDGF ARAFGIQYTDKENKLQYVHQTSWGMTTRMIGAIIMVHGDNSGLVLPPRIAPVQAVIIPIQ QRKEGVLEKADEMFAALKAAGVRVKVDDTDKSPGFKFAEQEMRGIPIRIECGPKDIENGQ AVICRRDTREKYTVTFDELVEKVQEILETIQKDMLERARKHRDSHTYVATNYEEFKDTIV NKPGFVKAMWCGDRACEDKIKEDVQATSRCMPFEQEHLSDVCVCCGKPAKKMVYWGRAY >gi|226332864|gb|ACII01000155.1| GENE 51 55593 - 56918 1339 441 aa, chain + ## HITS:1 COG:FN0243 KEGG:ns NR:ns ## COG: FN0243 COG0617 # Protein_GI_number: 19703588 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Fusobacterium nucleatum # 12 441 16 448 451 236 36.0 9e-62 MILEIPKNAETILHILEKAGYEAYVVGGCVRDSILGRSPDDWDITTSAKPEQVKALFHRT VDTGLQHGTVTVLMEKEGYEVTTYRVDGEYEDGRHPKEVTFTASLKEDLKRRDFTINAMA YNPSSGLVDLFGGLEDIERKIIRCVGDPLERFTEDALRIMRAVRFSAQLGFAIEEETRKA LKVLAPNLKHVSAERIQVELVKLLMSPHPDYLRVAYEAGITAEFLPEFDACMTTSQNTPH HCYTVGEHILHSLCHVRADKVLRITMLLHDIGKPVVRKTDENGRDHFKMHGIAGEKRAAQ ILRRLKFDNDTIRKVTRLVKWHDDRPDGTTKAVRRAVNRIGEDLFPYYLEVQQADMLAQS DYRRAEKQERLDKVKEAYETIINEHQCVSLKTLAVTGKDLIEAGYKPGREIGETLNRLLE VVLADPQKNQKEILLGLLDEK >gi|226332864|gb|ACII01000155.1| GENE 52 57042 - 58055 739 337 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3824 NR:ns ## KEGG: Cphy_3824 # Name: not_defined # Def: CotS family spore coat protein # Organism: C.phytofermentans # Pathway: not_defined # 1 326 1 346 347 192 
33.0 2e-47 MYDYGLGTLAQYELTADRSARTRGALLCYTAQGLLILREFHGSEKKLEKQQELLMRLQEN GINTDYFLRNNQESLVSKDKTEQRFTLQHWYEGKECDTKSREDILKSVRTLARLHILMKM EPVEEYRERSLREEYLRHNQELRKIRKFIRNKGASNVFEKNYLASVEQFLERAQYAVKLL DETDYDDLRERAWREGQVCHGEYNQHNILMLKGDHLGTAVTNFGHWSFDIQVADLYRFMR KILEKYNWNLELAGEMLREYHKIRPISAEEWKNLRVRFTYPEKYWKLANYYYSHKKVWIS EKNVEKLQNLIRQREIWENFAEECFRDYPRYCSLRSQ >gi|226332864|gb|ACII01000155.1| GENE 53 58156 - 58476 170 106 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 [Haemophilus influenzae PittEE] # 14 103 3 92 96 70 32 3e-11 MPPNKKEIIRGGNTMNKTELVAAMAKETNLSKKDVEDVLKSFVDVVSKELKNGGKIQLVG FGTFEVSERAAREGRNPQTGETMKIEASKSPKFKAGKALKDMVNGK >gi|226332864|gb|ACII01000155.1| GENE 54 58575 - 58814 293 79 aa, chain + ## HITS:1 COG:L1001 KEGG:ns NR:ns ## COG: L1001 COG1188 # Protein_GI_number: 15671997 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Lactococcus lactis # 1 79 17 95 105 75 58.0 3e-14 MRLDKYLKVSRLIKRRTVANEACDAGRVLVNDKPAKASVQVKTGDTIEIQFGSKNVKVEV LDVKETVKKEDVESMYKYL >gi|226332864|gb|ACII01000155.1| GENE 55 59007 - 59252 231 81 aa, chain + ## HITS:1 COG:no KEGG:GYMC10_0046 NR:ns ## KEGG: GYMC10_0046 # Name: not_defined # Def: sporulation protein YabP # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 2 81 14 93 93 65 42.0 4e-10 MLENRQNGRITGVKDIKSFDEKEILLFTQAGKLVIKGEQLHVKQLDLEKGEVDLEGKVDS LTYLSKNTDNRDESLFRRMFR >gi|226332864|gb|ACII01000155.1| GENE 56 59475 - 59720 118 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581053|ref|ZP_04858314.1| ## NR: gi|253581053|ref|ZP_04858314.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 81 74 154 154 150 100.0 3e-35 MVLGFLSGAYICHRILGDIFVKCCTKFMEIPVIIIKFLIKWLLFPVKRCKLLWYKAYKCA KKGRLANWVILRKGRKKESRV >gi|226332864|gb|ACII01000155.1| GENE 57 59731 - 59988 360 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581054|ref|ZP_04858315.1| ## NR: gi|253581054|ref|ZP_04858315.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 85 1 85 85 93 100.0 4e-18 MIAIALVVLVLLGGLMLESNDLKERLTGYDAKAATLQQQIEDEQTRTEEIDKLKKYMETD EYAEEVAREKLGLVKDNETVFKKQQ >gi|226332864|gb|ACII01000155.1| GENE 58 60175 - 61668 1346 497 aa, chain + ## HITS:1 COG:BH0078 KEGG:ns NR:ns ## COG: BH0078 COG2208 # Protein_GI_number: 15612641 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Bacillus halodurans # 5 494 310 806 830 192 26.0 1e-48 MQEWMIAMLVISIMLIIRDMAKTILAENLPDTEELPSLQEGHPQKERVEKYAASFQKLAD TFYGMPYRKDYLSSRQVEQIIEDTNAKVCSRCYQREICWGEHSQELVKGVEALVRSMESG NEENVRGIRADWTGICPRSVQYYETVAENFQKERQNLMWDNRMIENRLAVAQQLMEVSHI MENVASDLYDLERVTPQFEEELRKGLRKSHVILRRAWMMNKVKGRKQIFLTMRARSGQCI SMTEIAQLLSKYCEISMVPVNGSRCIVNGEYHTVHFAEDVSYQVLYGTARLTREEEKVSG DNYICRQEDGGRFVMCLSDGMGSGMEACRESETVVELLEQFMESGFSQETAAKMVNSALV LKGEEGMFSTVDICAVDLYTGICNFLKAGAASTFIKRDHWVESITSESLAAGLVQQIDFE TATRKLYHGDYLIMMTDGVLDALPDQKEEETMKEIIMDVHEESPKDFGRGILERVLGYSD YHARDDMTVLVAGVWKK >gi|226332864|gb|ACII01000155.1| GENE 59 61871 - 63328 629 485 aa, chain + ## HITS:1 COG:CAC3204 KEGG:ns NR:ns ## COG: CAC3204 COG0037 # Protein_GI_number: 15896451 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Clostridium acetobutylicum # 1 474 1 453 461 201 30.0 4e-51 MEREILAYIKQNRMIGKNDVVLAGVSGGSDSMAMLRILKELQGKLDFTLRVVHIHHGIRG KEADRDQSFVENICRKWQIPCTVYCYDVPGLSREWKLGEEETGRIVRKEAFQREAAVCGR KDSEIKIALAHNQEDLAETMLHNLCRGTGLRGLCTMRPVDGEIIRPILCLSRDKIAEYLK EKKISHIQDSTNLSDEYTRNRIRHNILPMLERQVNGKAAAHMAETAARISQAEEYLTQQS 
CLVLSEFQKGKGYYFTEKFFMEPQIIQVYALQQAMEQLAGRRKDLAAVHYEKVLELYEMQ TGRRISLPYHMEARRDYEGVRLCRYGEEKSAGMIGEKHKGSDIQNGKTVCGKNISDREWK IRIPGTVTSPLGIFSAEIFLYEGQKIEEKKYTKWMDYDKIEKNPYIRTRRTGDYMVINAQ GNTKKLNRCMIDEKIPSEYRDSIPLIACGKEILWMVGSRMNERYKINPQTRKVLVLNYQG GNENE >gi|226332864|gb|ACII01000155.1| GENE 60 63321 - 63851 683 176 aa, chain + ## HITS:1 COG:CAC3203 KEGG:ns NR:ns ## COG: CAC3203 COG0634 # Protein_GI_number: 15896450 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 1 172 1 173 178 187 56.0 7e-48 MSEKIKVLLSEEEVDSRIKQIAAKVSKDYAGKEIHLICVLKGGVFFTCELAKRITVPVSL DFMSVSSYGDDTKSSGVVKIVKDLDQPLEGKDVLIVEDIIDSGRTLSYLIEILKQRNPNS IRLCTLLDKPERRVRDVRVDYCCFNIPDEFVVGYGLDYAQKYRNLPFIGVVELGED >gi|226332864|gb|ACII01000155.1| GENE 61 63868 - 65721 1305 617 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. McKiel] # 1 614 1 605 636 507 47 1e-143 MNKQPSRIGLYLVLIIALVAGYFYLNNQVVSQSNYTQKQLEQALEDNKVVDATIQQNREV PTGSVIVDTTDSGQKKVYVTDVNQAIELLNKYDIDITTQDVPRDNVFLTTLLPVLLTGAL VIFLIMMMNRQMAGGGGGGNAKMMNFGKSRARMSSPDDNKKVTFDKVAGLQEEKEDLVEV VDFLKSPQKYTKVGARIPKGVLLVGPPGTGKTLLAKAVAGEAGVPFFSISGSDFVEMFVG VGASRVRDLFEEGKRHAPCIIFIDEIDAVARQRGTGMGGGHDEREQTLNQLLVEMDGFGV NEGIIVMAATNRVDILDPAILRPGRFDRKVAVGRPDVKGREEILRVHAKDKPLGEDVDLA QIARTTAGFTGADLENLLNEAAIEAARKGRGFILQSDIKGAFIKVGIGAEKKSKVISEKE KKITAYHESGHAILFHVLPDMDPVYTISIIPTGMGAAGYTMPLPDNDEMFNTKGKMLQDI MTLLGGRIAEEIIFGDITTGASNDIKRATATARSMVMKYGMSDKLGLICYGDDDDEVFIG RDLAHTRSYSEDVAKSIDEEIRRIISECHDQAKKIILGHEDVLHKCASLLLEKEKVHRDE FEALFTTEDPETENNSI >gi|226332864|gb|ACII01000155.1| GENE 62 66073 - 68208 2103 711 aa, chain + ## HITS:1 COG:ECs0453 KEGG:ns NR:ns ## COG: ECs0453 COG0366 # Protein_GI_number: 15829707 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli O157:H7 # 59 648 36 562 605 256 30.0 1e-67 
MINDREDWTEMEYEKADIKKRMQYILNMRSVFNKEALFSDGTEYYRIPAEPKAGDTVTIK FRTQRNNVDSVYLVSQEQRVQMEICGTENGFDYYSAQVTIGEDIFRYYFEIQYGWVTCYY NNQGVCMKHEGRMDFEIYPGFDTPKWAKGAVMYQIYVDRFLNGDPTNDVVTGEYHYIGDK SVQVEQWNKIPAVMGVREFYGGDLQGIMNKLDYLQDLGVEVIYLNPIFVSPSNHKYDCQD YDYVDPHYGRIVEDCNEGILLVDDDDNSHAWKYIKRVTDKKNLEASNELFAKLTAEIHRR GMKIILDGVFNHCGSFNKWMDRERIYENQEGYPKGAYVSADSPYRNFFSFNDPNAWPYNT SYDGWWAHDTLPKLNYEGSRELYDYILRVGQKWVSAPYNVDGWRLDVAADLGHSNDFNHQ FWKDFRKAVKAANPNAIILAEHYGNPEGWLKGDEWDTVMNYDAFMEPLTWFLTGMEKHSD EYREDLLGNSEAFIGAMKTHMRALHMSALQTAMNELSNHDHSRFLTRTNHRVGRISYAGP EAASEGVNPAVMREAVTIQMTWPGAPTVYYGDEAGLCGFTDPDNRRTYPWGREDYQMIDF HRVMIRIHKKYEVLKTGSLGFLWNDYQGLCYARFSHDEQIIVIVNNQEEGREVEIRLGQA GISRLEDTKLKRVVMTSAVGFTEECKEYTAAAGILKITMPAFGAVVLHHKN >gi|226332864|gb|ACII01000155.1| GENE 63 68678 - 69541 1213 287 aa, chain + ## HITS:1 COG:CAC0827 KEGG:ns NR:ns ## COG: CAC0827 COG0191 # Protein_GI_number: 15894114 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Clostridium acetobutylicum # 1 287 1 287 287 434 78.0 1e-121 MLVNATEMLKKAKAGHYAVGQFNINNLEWTKAILLTAQELNSPVILGVSEGAGKYMTGYK TVVGMVNGMMEELNITVPVALHLDHGSYEGCLKCVEAGFSSIMFDGSHYPIEENVAKTKE LVKIVAEHGMSLEAEVGSIGGEEDGVVGMGECADPQECKMIADLGIDFLAAGIGNIHGKY PANWKGLSFETLDAIQKLTGEMPLVLHGGTGIPADMIKKAIDLGVSKINVNTECQLSFAA ATRKYIEEGKDQQGKGYDPRKLLAPGFEAIKATVKEKMELFGSVNKA >gi|226332864|gb|ACII01000155.1| GENE 64 69863 - 71305 1045 480 aa, chain + ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 3 480 7 465 466 203 28.0 6e-52 MEQKKALNRKKLKVILVICIVMIVTAGIGLAVYIHHVDTVTLGRKVSVYRLDVSKLTVEE AQQKISEAFQNKTITFHEDGKDVYNVSMEQLGYSLDQDILKSALETLKQERSGTFKLFAS QRDYKIDYQIIRDEAQEQAALSADHFDEKDRTEPTDAHIRYSKKKQKYVLVKQVPGNQID ENRLLSYVEETLDKDFETELLTSDVKMELNEEVYRQPDIEESGEMKQKVKKLNSLLKKYR STTVSYLFGEETQVLDSDTISSWLQIKNSGISIDKDAAADYISNMANKYNTIYVPRTFHT 
SLGTDVTVSDNEYGYRIDQDAELTQLLEDLKSGENVSREPVYSSSGMKRNGTDDLAGSYI EVSLDSQHLWLYKDGALVTETDIVSGAPTPERETYRGAWPIAYKASPFTLSSEEYGYAET VKYWMPFVYGQGLHDASWQSAFGGNRYKTGHGSHGCINLPEDQAALIYNTIDGGYPIIIY >gi|226332864|gb|ACII01000155.1| GENE 65 71368 - 71904 488 178 aa, chain + ## HITS:1 COG:MA3441 KEGG:ns NR:ns ## COG: MA3441 COG0778 # Protein_GI_number: 20092253 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 1 167 6 163 174 64 30.0 9e-11 MEFQNLIEKRRSVRKYVERNTVTKDEILSMIKAAQEAPSWKNSQTGRYYCIMDEKNVEQF RRECLPEMNAGKCENAVLLVSTFVHNRAGFQKDGTADNEIGNGWGCYDLGLQNENLILKA EELGYGTLIMGIRDADKIREFCSVPETETVVGVIAVGVPGEEPGRPKRKDTEEIVKFL Prediction of potential genes in microbial genomes Time: Sat May 28 21:02:11 2011 Seq name: gi|226332863|gb|ACII01000156.1| Ruminococcus sp. 5_1_39B_FAA cont1.156, whole genome shotgun sequence Length of sequence - 45850 bp Number of predicted genes - 39, with homology - 39 Number of transcription units - 13, operones - 7 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 167 - 254 52.1 # Ser GCT 0 0 - Term 274 - 319 2.0 1 1 Op 1 . - CDS 431 - 1126 639 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 2 1 Op 2 . - CDS 1163 - 1720 619 ## gi|253581064|ref|ZP_04858324.1| conserved hypothetical protein 3 1 Op 3 24/0.000 - CDS 1723 - 4296 3058 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 4 1 Op 4 . - CDS 4316 - 6241 2255 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit - Prom 6350 - 6409 10.0 - Term 6387 - 6426 6.8 5 2 Op 1 1/0.000 - CDS 6439 - 7290 885 ## COG0313 Predicted methyltransferases 6 2 Op 2 1/0.000 - CDS 7331 - 8071 780 ## COG4123 Predicted O-methyltransferase 7 2 Op 3 1/0.000 - CDS 8064 - 8963 951 ## COG1774 Uncharacterized homolog of PSP1 8 2 Op 4 . 
- CDS 8965 - 9954 1035 ## COG2812 DNA polymerase III, gamma/tau subunits 9 2 Op 5 1/0.000 - CDS 10007 - 10573 570 ## COG0194 Guanylate kinase 10 2 Op 6 2/0.000 - CDS 10605 - 12071 886 ## COG1982 Arginine/lysine/ornithine decarboxylases 11 2 Op 7 40/0.000 - CDS 12189 - 12878 855 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 12 2 Op 8 . - CDS 12875 - 14017 1105 ## COG0642 Signal transduction histidine kinase - Prom 14069 - 14128 7.5 - Term 14190 - 14238 0.3 13 3 Tu 1 . - CDS 14255 - 15436 1681 ## COG0192 S-adenosylmethionine synthetase - Prom 15473 - 15532 5.9 - Term 15743 - 15780 -0.1 14 4 Tu 1 . - CDS 15799 - 17091 1544 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase - Prom 17232 - 17291 9.3 15 5 Op 1 . - CDS 17308 - 17820 597 ## COG0703 Shikimate kinase 16 5 Op 2 . - CDS 17872 - 18462 587 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 17 5 Op 3 . - CDS 18497 - 20167 1844 ## COG1227 Inorganic pyrophosphatase/exopolyphosphatase 18 5 Op 4 . - CDS 20181 - 21089 495 ## PROTEIN SUPPORTED gi|148379145|ref|YP_001253686.1| ribosomal protein S1 - Prom 21266 - 21325 7.6 + Prom 21222 - 21281 4.8 19 6 Tu 1 . + CDS 21375 - 22031 715 ## COG2357 Uncharacterized protein conserved in bacteria + Term 22055 - 22090 2.3 20 7 Tu 1 . - CDS 22365 - 22817 501 ## gi|253581081|ref|ZP_04858341.1| conserved hypothetical protein - Prom 23052 - 23111 7.8 - Term 23152 - 23198 8.9 21 8 Op 1 . - CDS 23210 - 24082 1053 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 22 8 Op 2 . 
- CDS 24130 - 26484 2385 ## EUBREC_0026 hypothetical protein 23 8 Op 3 9/0.000 - CDS 26558 - 27643 792 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 24 8 Op 4 6/0.000 - CDS 27648 - 27857 370 ## COG2501 Uncharacterized conserved protein 25 8 Op 5 16/0.000 - CDS 27871 - 28986 996 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) - Prom 29196 - 29255 4.2 - Term 29076 - 29115 0.2 26 8 Op 6 . - CDS 29265 - 30620 1392 ## COG0593 ATPase involved in DNA replication initiation - Prom 30726 - 30785 8.2 + Prom 31071 - 31130 7.8 27 9 Tu 1 . + CDS 31243 - 31377 187 ## PROTEIN SUPPORTED gi|160882064|ref|YP_001561032.1| ribosomal protein L34 + Term 31378 - 31419 5.1 + Prom 31452 - 31511 4.7 28 10 Op 1 . + CDS 31645 - 31812 151 ## gi|253581089|ref|ZP_04858349.1| ribonuclease P 29 10 Op 2 18/0.000 + CDS 31838 - 31993 106 ## COG0759 Uncharacterized conserved protein 30 10 Op 3 16/0.000 + CDS 32075 - 33370 1226 ## COG0706 Preprotein translocase subunit YidC 31 10 Op 4 4/0.000 + CDS 33391 - 34323 1173 ## COG1847 Predicted RNA-binding protein + Term 34363 - 34434 4.3 + Prom 34421 - 34480 7.9 32 11 Op 1 11/0.000 + CDS 34601 - 35974 1415 ## COG0486 Predicted GTPase 33 11 Op 2 24/0.000 + CDS 36045 - 37919 2020 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division + Prom 38005 - 38064 10.0 34 11 Op 3 . + CDS 38099 - 38821 693 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division + Term 38834 - 38900 9.1 + Prom 38878 - 38937 6.6 35 12 Op 1 24/0.000 + CDS 39056 - 40129 999 ## COG1131 ABC-type multidrug transport system, ATPase component 36 12 Op 2 4/0.000 + CDS 40132 - 40995 833 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 37 12 Op 3 . + CDS 41013 - 42497 1619 ## COG3225 ABC-type uncharacterized transport system involved in gliding motility, auxiliary component 38 12 Op 4 . 
+ CDS 42498 - 44297 1783 ## gi|253581099|ref|ZP_04858359.1| conserved hypothetical protein + Term 44315 - 44382 17.1 - Term 44309 - 44362 18.0 39 13 Tu 1 . - CDS 44569 - 45522 678 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) - Prom 45574 - 45633 8.5 Predicted protein(s) >gi|226332863|gb|ACII01000156.1| GENE 1 431 - 1126 639 231 aa, chain - ## HITS:1 COG:CAC0965 KEGG:ns NR:ns ## COG: CAC0965 COG0204 # Protein_GI_number: 15894252 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Clostridium acetobutylicum # 2 201 4 210 241 97 33.0 2e-20 MVFRNLFFVPYGWFKLCWYASHAERYSEEQRYQLLKYVDNRAVKGGNLVIDGHGMENIPK ENGFIFFPNHQGLFDVLAIIQVCPVPFSVVAKKELTNVPFLKQVFACMKAFMIDRDDVKQ SMQVIINVIKEVKAGRNYLIFAEGTRSKDGNHPQEFKGGSFKAATKSKCPIVPVALIDSY KAFDTGSVKKLTVQVHFLEPIYYDEYKDMKTTEIAAEVKKRIEATITQYTS >gi|226332863|gb|ACII01000156.1| GENE 2 1163 - 1720 619 185 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581064|ref|ZP_04858324.1| ## NR: gi|253581064|ref|ZP_04858324.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 185 1 185 185 329 100.0 6e-89 MKKWYRYLAAGVLVLLTLCVAGCGGVSHKSPEKVTEELIQSYTDGKEKKVKECYNQADNT EADLQAEITATLKYFAAHNPKKVSVQDCEILSENDKYSYVYITYNLILEDDQEYPCVGTY MVGKQDKTYYIMAPSQITDDMKTQAATAYVQFMKTDAYKTYTKAYETFIKKNPGYEDKIS EKAGV >gi|226332863|gb|ACII01000156.1| GENE 3 1723 - 4296 3058 857 aa, chain - ## HITS:1 COG:CAC0007 KEGG:ns NR:ns ## COG: CAC0007 COG0188 # Protein_GI_number: 15893305 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Clostridium acetobutylicum # 8 856 6 830 830 908 56.0 0 MEDNIFDKVHEVDLEKTMKDSYIDYAMSVIASRALPDVRDGLKPVQRRVLYSMIELNNGP DKPHRKCARIVGDTMGKYHPHGDSSIYGALVNMAQEWSTRYPLVDGHGNFGSVDGDGAAA MRYTEARLSKISMEMLADINKDTVDFQPNFDETEREPVVLPSRYPNLLVNGTSGIAVGMA TNIPPHNLREVVDAVVKIIDNIIEEQRETTMEEILDIVKGPDFPTGAMILGRRGIEEAYR TGRGKIRVRAVTNIETLPNGKSRIIVTELPYLVNKARLISKIAELVRDKKIDGITDLNDH SSREGMRICIDLRKDANANVVLNLLYKHTQLQDTFGVNMLSLIPNNGSLEPKVLNLKQML EYYLAHQEDVVTRRTKYDLNKARERAHILEGLLKALDNIDEVIRIIRASQNVQIAKQELM DRFELTDVQAQAIVDMRLRALTGLEREKLEAEYADLMEKIRKYEAILADRSLLLRVIREE ILAIAEKYGDDRKTSIGYDVYDISTEDLIPRENTVITMTKLGYIKRMTVDNFRSQNRGGR GIKGMQTLEDDYIEELLMTTTHHYLMFFTNTGRVYRLKAYEIPEAGRTARGTAIINLLQL MPGECITAVIPLRKFEDGHYLMMATKNGLVKKTPIKEYANVRKNGLAAITLRDDDELIEV KLTDDKKDIILVTKDGMCIRFKETDVRSTGRTSMGVRGMNIDDGDEVVAMQLNSQGDCLL IVSANGMGKRTSMGEFTCQNRGGKGVKCYKITEKTGDVIGARAVNEDNEVMLITTEGIII RIACADISILGRITSGVKLINLTEGVTVASVAKVRDKEEKDTNAQQAAVEITDSAETEES DQPTDQETQDENRGEEE >gi|226332863|gb|ACII01000156.1| GENE 4 4316 - 6241 2255 641 aa, chain - ## HITS:1 COG:CAC0006 KEGG:ns NR:ns ## COG: CAC0006 COG0187 # Protein_GI_number: 15893304 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Clostridium acetobutylicum # 1 641 1 637 637 826 63.0 0 MGGENTEYGADQIQILEGLEAVRKRPGMYIGSTSGRGLHHLVYEIVDNAVDEALAGFCTE IQVVINPDNSITVVDNGRGIPVGINHKAGIPAVEVVFTILHAGGKFGGGGYKVSGGLHGV 
GASVVNALSTWLEVRIYHEGKIYQQRYERGKTMYPLKVVGDCQPDKTGTTVVFLPDKEIF EETVYDYDILKQRLREMAFLTKGLKISLKDTREGMEREKVFHYEGGIREFVSYLNRSKEA LYPEIIYCEGQQNGVSVEVAMQHNDSYTENCYGFVNNINTPEGGTHIVGFRNALTKTFND YARKNKLLKDSEPNLSGDDIREGLTAIVSVKIENPQFEGQTKQKLGNSEARGAVDSVVSK QLEIFLEQNPTVAKTTVEKSVLAQRAREAARKARDLTRRKSALDNMSLPGKLADCSDKNP ENCEIYIVEGDSAGGSAKTARNRVTQAILPLRGKILNVEKARLDKIYANAEIKAMITAFG TGIHEDFDISKLRYHKIIIMTDADVDGAHIATLLLTFLYRFMPELIKQGYVYLAQPPLYK VEKNKKVWYAYDDDELNQILTEIGRDGNNKIQRYKGLGEMDADQLWETTMDPEKRILLKV TMDEAATSEIDLTFTTLMGDKVEPRREFIEENARFVKNLDI >gi|226332863|gb|ACII01000156.1| GENE 5 6439 - 7290 885 283 aa, chain - ## HITS:1 COG:CAC0307 KEGG:ns NR:ns ## COG: CAC0307 COG0313 # Protein_GI_number: 15893599 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Clostridium acetobutylicum # 2 278 3 279 282 280 50.0 2e-75 MSGKLYLCATPIGNLEDITLRVLRTLKEVDLIAAEDTRNSIKLLNHFDIKTPMTSYHEYN KIDKAYVLISKMQEGQNIALITDAGTPGISDPGEELAAMCYEAGIEVTSLPGPAACITAL TLSGLPTRRFAFEAFLPADKKERKLILEELKNETRTIILYEAPHRLVRTLEELKETLGNR RMTLCRELTKRHETAFHTTIEELILYYQTEKPLGECVLVIEGRSRQEMEEEQKASWEKIT IEEHMEIYENQGHSRKEAMKMVANDRGMTKRDVYQYLINKTEE >gi|226332863|gb|ACII01000156.1| GENE 6 7331 - 8071 780 246 aa, chain - ## HITS:1 COG:CAC0306 KEGG:ns NR:ns ## COG: CAC0306 COG4123 # Protein_GI_number: 15893598 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Clostridium acetobutylicum # 5 243 3 242 244 215 44.0 5e-56 MTDQLVKDNERIDDLQNGYYVIQDPDKFCFGMDAVLLSGFAKVKKGETALDLGTGTGIIP ILLKTKTNGKHFTGLEIQKECADMAGRSVRYNHLEDDVEIVRGDIKEAADIFGAASFDVV TSNPPYMIGQHGLRNSDMPKAIARHEVLCNLEDVVSQASKVLKERGRFYMVHRPFRLAEI MNVLTKYRLEPKRMQLVYPYIDREPNMVLIEALKGGNSRVTVEPPLIVYKEPGVYTENIL KIYDMI >gi|226332863|gb|ACII01000156.1| GENE 7 8064 - 8963 951 299 aa, chain - ## HITS:1 COG:CAC0301 KEGG:ns NR:ns ## COG: CAC0301 COG1774 # Protein_GI_number: 15893593 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Clostridium acetobutylicum # 1 
292 1 295 303 318 57.0 9e-87 MTKVVGVRFRNVGKIYYFSPKDYEIKTGDHVIVETARGIEYGKVVLAPREVDEEDVVHPL KEVLRVATKEDDEREAQNRVREREAFKICQKKIREHGLEMKLIDAEYTFDNNKVLFYFTA DGRIDFRQLVKDLAAIFKTRIELRQIGVRDETKILGGIGICGRCLCCHTYLSEFAPVSIK MAKEQNLSLNQTKISGVCGRLMCCLKNEQETYEELNKKLPGLGDTVTTPDGLTGTVHSVN VLRQRVKVIVEINDEKEIQEFPVDDLKFRPRKKKVKVSEKELKELGNLEDKKGDSKLND >gi|226332863|gb|ACII01000156.1| GENE 8 8965 - 9954 1035 329 aa, chain - ## HITS:1 COG:BS_dnaX KEGG:ns NR:ns ## COG: BS_dnaX COG2812 # Protein_GI_number: 16077087 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Bacillus subtilis # 4 326 15 357 563 135 30.0 9e-32 MMGFNNIIGHEEIIGHLKNAIESGKISHSYIFTGEPGSGKKLLAGTFASTLQCEAGGTEP CQKCDSCKKAMGKNHPDIIMVSHEKPGTITIDEIRDQVINDIDIRPYYSPYKIYIIADAD LMTPQAQNALLKTIEEPPEYAVILLLTNNIGGLLPTIQSRCVRLDLKVVDDGLVKKYLME HLHVPDYQAEIDASFAQGSIGRAKEAATSQEFAEMTQNALRILKYANTMEVYELSDAIKS LSEDKQNINDYLDIFQFWFRDVLMFKATQEIDNLVFKQEINFIREQAKQRSYENLENILD SIQKTKVRLKANVNFDLAFELLFLTIREK >gi|226332863|gb|ACII01000156.1| GENE 9 10007 - 10573 570 188 aa, chain - ## HITS:1 COG:CAC0298 KEGG:ns NR:ns ## COG: CAC0298 COG0194 # Protein_GI_number: 15893590 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Clostridium acetobutylicum # 2 184 9 191 195 179 49.0 3e-45 MGKSASGKDRIYSLLAAHKELNLKTLILYTTRPIRAGEQDGKNYYFVDDGKLEEFRKNGN LIEERAYHTVYGIWTYFTADDGQVNLADSDYLGIGTLESFKKMRKYYGEDAVCPVYVQVE DGERLSRALNREREQENPRYEEMCRRFIADQSDFSEENILNAGIEKRFQNINLDDCVKEI ANYIKSVQ >gi|226332863|gb|ACII01000156.1| GENE 10 10605 - 12071 886 488 aa, chain - ## HITS:1 COG:CAC2338 KEGG:ns NR:ns ## COG: CAC2338 COG1982 # Protein_GI_number: 15895605 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Clostridium acetobutylicum # 4 472 11 485 487 333 37.0 4e-91 MERLYKKLESYGQSDYYPFHMPGHKRNRASSADDFLFERDITEISGFDNLHHAEGILKEA QEYAAQIYGTKKCFFSVNGSTAALLAAVSASVNKGGKILVARNCHKAVYHALYLRELQPV 
YIYPHEDQRLGINGGISPERVERYLEENTDVQAFLLTSPTYDGVVSDIKTIAEVVHRHKI PLIVDEAHGAHLHYSKYFPVSAADLGADIVIQSFHKTLPSMTQTAVLHICSDMADVEKIK RFMGIYQTSSPSYILMASMDACMDKLRKDGQQMFREFTFNLERARQRLSKCEKIKLIEGS MIEGSGIYDFDRSKLLFSTVGTSVNGHLLHQILRDRYHIEMEMAAEKYVLGIAAVGDTEE GFERLCTAIEEIDAEIQQTDESEESQYHTSHARMTQLMTISQAVDAQQRRYSLKESVGKV SAEFAYLYPPGIPIIVPGEQITGQFVRNVRRYMEQGLEVQGLSDTSAETICVAARNEIGQ EEYSPAKE >gi|226332863|gb|ACII01000156.1| GENE 11 12189 - 12878 855 229 aa, chain - ## HITS:1 COG:CAC0321 KEGG:ns NR:ns ## COG: CAC0321 COG0745 # Protein_GI_number: 15893613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 228 1 229 230 310 69.0 1e-84 MSRILIIEDEEAIADLEKDYLELSGFEVEICNRGDTGLTRALNEEFDLIILDLMLPEVDG FDICRQVRQEKNTPIIMVSAKKDDIDKIRGLGLGADDYMTKPFSPSELVARVKAHMDRYN RLIGSNVRKNDIVEIRGIKIDKTARRVWVNGEEKTFTSKEYDLLTFLAENPNRVFTKEEL FREIWDMESVGDIATVTVHIKKIREKIEFNTAKPQYIETIWGVGYRFKV >gi|226332863|gb|ACII01000156.1| GENE 12 12875 - 14017 1105 380 aa, chain - ## HITS:1 COG:CAC0317 KEGG:ns NR:ns ## COG: CAC0317 COG0642 # Protein_GI_number: 15893609 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 82 372 202 492 498 295 51.0 1e-79 MKLKTRIIVGFGMSILVPLLLFSATLYGLGHTKFAQQTKDSSEVVYDISVGESTHSQTQV RIIAKDLLFTATVILVFTALSIGLWIYRSIATPLVKLKKATQNIKEGNLDFVLEVDGDDE FSELCRDFEEMRKRLKESAEEKVLMDKENKELISNISHDLKTPITAVKGYVEGIMDGVAD TPEKIDRYVKTIYNKTNEMDHLINELTFYSKIDTNRIPYTFSKLNVEDYFSDCAEEVGLE LETRGIELVYANYVEDNVQVIADGEQIRRVIHNIISNAIKYMDKPKGIIQIRIKDVGDFI QVEIEDNGKGIAAKDLPSIFDRFYRTDVSRNSSKGGSGIGLSIVKKILEDHGGKVWATSR LGIGTIMYFVLRKYQEVPMK >gi|226332863|gb|ACII01000156.1| GENE 13 14255 - 15436 1681 393 aa, chain - ## HITS:1 COG:BS_metK KEGG:ns NR:ns ## COG: BS_metK COG0192 # Protein_GI_number: 16080107 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine 
synthetase # Organism: Bacillus subtilis # 5 390 7 395 400 589 73.0 1e-168 MEKILFTSESVTEGHPDKMCDAISDAILDALMEKDPMSRVACETATTTGLVLVMGEITTN AYVDIQKIVRDTIREIGYTRGKYGFDADTCAVITAIDEQSADIALGVDKALEAKMGEDEI DAIGAGDQGIQFGYASNETEEYMPYAINMAHKLARQLTKIRKDGTLKYLRPDGKTQVTVE YDEAGKPSRIDAVVCSTQHDPDVTQEQIHEDIKKYVFDEIIPADMVDENTKYFINPTGRF VIGGPHGDSGLTGRKIIVDTYGGTGRHGGGAFSGKDCTKVDRSAAYAARYVAKNIVAAGL ADKCEIQLSYAIGVAHPTSIHVETFGTGKLSDTKLVEIIRENFDLRPAGIIKMLDLRRPI YRQTAAYGHFGRTDVDLPWEHLDKVDDLKKYLA >gi|226332863|gb|ACII01000156.1| GENE 14 15799 - 17091 1544 430 aa, chain - ## HITS:1 COG:CAC3539 KEGG:ns NR:ns ## COG: CAC3539 COG0766 # Protein_GI_number: 15896775 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Clostridium acetobutylicum # 1 415 1 415 418 460 60.0 1e-129 MEQYVIKGGNPLVGEVEIAGAKNAALAILAAAIMTDETILIENLPDVRDINVLLEAIAGI GAQVDRIDKSTVKINGSTIGDVSVDYEYIKKIRASYYLLGALLGKYKHAEVPLPGGCNIG SRPIDQHLKGFRALGADVDIMHGAIVAKADELHGSHIFLDVVSVGATINIMMAASLAPGR TILENAAREPHVVDVANFLNSMGANIKGAGTDVIRIKGVEKLHRTEYSIIPDQIEAGTFM FAAAATGGDVTVKNVIPKHLEATTAKLEEIGCEVEEFDDAVRVRAPKRLHRTHVKTLPYP GYPTDMQPQIAVTLALAEGTSIVTESIFENRFKYADELSRMGANIKVEGNSAIIDGVKKL TGARVSAPDLRAGAALVIAGLAADGITVVDDIVYIQRGYENFEDKLRSLGAEIERVSNEK EIQKFRLRVG >gi|226332863|gb|ACII01000156.1| GENE 15 17308 - 17820 597 170 aa, chain - ## HITS:1 COG:MA3237 KEGG:ns NR:ns ## COG: MA3237 COG0703 # Protein_GI_number: 20092053 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Methanosarcina acetivorans str.C2A # 3 151 8 157 175 120 43.0 1e-27 MNNVTLIGMPGSGKSTIGVILAKALRYEFLDSDLLIQKQEKRKLSEIIEQDGPEKFKEIE NQVNADIHVTDTVIAPGGSVIYCDEAMEHLKSIGKVIYLKLSLESLSKRLGNLKGRGVLL KAGQSLKDLYEERVPLYEKYADITIDEEGKDLDESLRAILEIIHIKEQGK >gi|226332863|gb|ACII01000156.1| GENE 16 17872 - 18462 587 196 aa, chain - ## HITS:1 COG:FN1468 KEGG:ns NR:ns ## COG: FN1468 COG1853 # Protein_GI_number: 19704800 # Func_class: R General function prediction only # Function: Conserved 
protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 1 184 1 185 197 176 48.0 3e-44 MEKETWKAGNMLYPLPAVMVSVSDGEGNDNIITVAWAGTVCTNPPMVSISVRPSRFSYDM LRKTGEFVINLTTEKLAYATDYCGVRSGRDVDKFKEMKLTKEKADFVKVPMIAESPVSIE CKVRQVLELGSHHMFLADVLAVHADPQYMDEKKKFHLNDAKPLVYSHGEYLGIGKKLGTF GYSVKKKKKKSGKKKQ >gi|226332863|gb|ACII01000156.1| GENE 17 18497 - 20167 1844 556 aa, chain - ## HITS:1 COG:FN1824 KEGG:ns NR:ns ## COG: FN1824 COG1227 # Protein_GI_number: 19705129 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase/exopolyphosphatase # Organism: Fusobacterium nucleatum # 5 549 2 536 538 301 36.0 2e-81 MRNQEKIFVIGHKNPDTDSICSAIAYADIKNRTSQKVKYIPKRAGQINEETEYVLNRFGM QPPGYLSNIGTQIKDMDIRMSPEADKSMSLKNAWDLMMEKSIVSLPIRDREGQLEGLITI GDIAKTYMDTTDSYLLSRAKTQYRRIAETIAGTVVEGNEHGYFTKGKVLVGTANPEMLKA YIEPDDLIIMGDREEDHLQAIAQNVSCIIVGMGIEVSEKVIKLAHEREIVIIMSPYDTFT IARLINQSIPVRYIMKTDNLVTFNTEDFTDDIQNEMIKHRHRAFPVINKKGKCIGTISRR NFLDMHRKKVALVDHNEKDQAVDNIDKAEIVEIIDHHKLGSLETMVPISFRNQPVGCTAT ILYEMYGEQKLEISPSIAGLLCAAIISDTLMFRSPTCTLSDKMAAGALALIAGINIEQFA KEMFKAGSNLKDKSPEEIFYQDYKKFIAEDEINFGVGQISSMDSDELAEIKERLVPFMVS ECGRHGVTRVFFMLTNIIEESTELLYYGEGSEEMARIAFHMEPKDGVFDLKGVVSRKKQL IPALMEAAQAGQNDYN >gi|226332863|gb|ACII01000156.1| GENE 18 20181 - 21089 495 302 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148379145|ref|YP_001253686.1| ribosomal protein S1 [Clostridium botulinum A str. 
ATCC 3502] # 5 264 1 261 381 195 37 5e-49 MAESMKDYETELEASFKKIEEGDILTGTVISVDEKEVVVDLKYYAEGIIPAEDYSREPGF SLKEQVNVGDEVSATVVRKDDGNGNILLSRTEAADVLAWDKLKELKDSKEVIDVVVKGIV NGGVIAYVEGVRGFIPASKLALNYVEDTNEYLNKPIQVQVFDIDKEKGRLILSAKEILRE KAEEERKTKISNVQVGLVTEGVVESLQPYGAFVDLGNGLSGLVHISQICEKRIKKPSEVL TVGDKVKVKVTAVKDGKLSLSIKEATDMMAKEIEEEVIELPDSKEEASTSLGALFANIKL DN >gi|226332863|gb|ACII01000156.1| GENE 19 21375 - 22031 715 218 aa, chain + ## HITS:1 COG:lin0794 KEGG:ns NR:ns ## COG: lin0794 COG2357 # Protein_GI_number: 16799868 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 13 217 8 212 212 202 49.0 3e-52 MNYDSVPLRDFKELNDWETVIFVYQSALKQVETKIDILNGEFQHRHKYNPIEHVKSRIKT PESIVKKLKRHGYESSIENMVRYVNDIAGIRISCSFTSDIYLIADMISKQNDLTILARRD YMKNPKKSGYRSYHMLVTTPVFLSDSIIDTKVEIQIRTVAQDFWATLEHKMHYKFEGDGP DYITKELRECARYVAELDTRMEELNNEIQKYGKLSDKK >gi|226332863|gb|ACII01000156.1| GENE 20 22365 - 22817 501 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581081|ref|ZP_04858341.1| ## NR: gi|253581081|ref|ZP_04858341.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 150 1 150 150 205 100.0 1e-51 MRIKKAGILIIAAAITCTTPAVPAMAADTSESVQEIISDNEIDSLVSDPDKVVDIIMYVK NEAAKQDISDDQIRSLIQTAESTAGVSLSEEEENRIIKIVKQIKDSDIDEEQLRSAVTKA YDKLEEMGIGKEEVKGILHKLADFAKSLFE >gi|226332863|gb|ACII01000156.1| GENE 21 23210 - 24082 1053 290 aa, chain - ## HITS:1 COG:CAC2986 KEGG:ns NR:ns ## COG: CAC2986 COG0030 # Protein_GI_number: 15896238 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Clostridium acetobutylicum # 12 284 6 272 276 279 50.0 5e-75 MPRPYLGDPKYTIEVLQKYNFAFQKRFGQNFLIDTHVLDKIIDSAQITKDDFVLEIGPGI GTMTQYLAEAAREVAAVEIDKTLLPILDDTLKDWDNVTVINNDILKVDIRQLALEKNQGR PIKVVANLPYYITTPIIMGLFENQVPVDSITIMVQKEVADRMQVGPGTKDYGALSLAVQY YAKPKIVANVPPNCFMPRPKVGSAVIRLERYEKPPVEVKNEKLMFRIIRASFNQRRKTLV NGLKNSQEIPFSKEQIEQALGMCGLSLSVRGEALTLAQFAQLANAFTEIS >gi|226332863|gb|ACII01000156.1| GENE 22 24130 - 26484 2385 784 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0026 NR:ns ## KEGG: EUBREC_0026 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 538 1 539 877 187 29.0 1e-45 MDKEEFRIKLEEINKLVQDKDYKGAMNIVDSIDWRRVKNVRTLCVVGEIYAANGRYEDSK EIFLLAYHKASIGKNILYRLIEISLRMDDINEAEEFFEEYKQVASNDSTQYILQYKIARA KNSSLNEQIRILEEYKEQEFTEKWSYELAALYYKAGEKQKCLDLCNEIILWFSEGKYVMK AYDLKMRMGELTGAEKAKFEKQFVPKLLTPEQAKELEKKKTETEVKAQEEPEAEEVEETT ENNEPEVQVSMEGIQEKISKGIRDVFGGKTQEEKEEFPEESMDMVNEAGITREDEIPEEI IKSDASQKEPENVPELEAEPEKPGAEPMAAIKMPELNIPASMKNMELSKPPKVEEVAETK LNMDSFDLEGKELNLEDTILAAASAQGIEIPKEEESKTEEVQEESSETDVDDKEIKAHDI SEELEEPDFLSGDIQDIKVADPKPDEEPKKEIPVQPEIEEAEEEDEVTEEDLRRAEEEFL HGPMGNTDLNPEDDTEGTEEAQPLTEEEELERFIESIQPENGADPRDIVPREKELTDDEK QLFTYFVKVPGMKEQLVDTLYDVQMAAADKTSKTGNIIVMGGKECGKTRLISGLIPAICK ELNLEASKVAYVFADQINGKNIYKIFSKLAGGFLVIENANQLTPETVEMLDKAMEVNTDG LTVIVEDEKIGMRKLIARYPKFAKKFTSMINIPVFTNDELVNFARVYTKENGYAIDQMGM LALYNLIGINQKEDSPMNVGAVKELIDAAIAKSQGGIRKFKRNVSKKRIDRDGYIVLYEK DFTK >gi|226332863|gb|ACII01000156.1| GENE 23 26558 - 27643 792 361 aa, chain - ## 
HITS:1 COG:CAC0004 KEGG:ns NR:ns ## COG: CAC0004 COG1195 # Protein_GI_number: 15893302 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Clostridium acetobutylicum # 1 359 1 362 363 305 44.0 6e-83 MYIESVQLKNFRNYDSLELDLAQGTNIFYGNNAQGKTNILEALYLCGTTKSHKGSRDKDM IQFGKDESHIRMMVKRDELSYRIDMHLKKNKAKGVAINGLPIRKASELFGVVNLVFFSPE DLNIIKNGPGERRRFLDLELCQLDKIYLTDLASYNHIVNQRNKLLKDLSVQPSLKDTLDI WDIQMAEYGRKIIDKRSEFIKELNETVRKIHGNLTGGLEELNVIYEPDCTAEKLESTICA NRERDMRMRLTSAGPHRDDLCVMANGIDIRKYGSQGQQRTAALSLKLSEIYIVKRKIKDT PVLLLDDVLSELDSSRQNYLLDSISDIQTLITCTGLDDFISHQFQINKVFQVVQGTVSQP V >gi|226332863|gb|ACII01000156.1| GENE 24 27648 - 27857 370 69 aa, chain - ## HITS:1 COG:CAC0003 KEGG:ns NR:ns ## COG: CAC0003 COG2501 # Protein_GI_number: 15893301 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 68 1 68 68 60 51.0 9e-10 MEEIKIRDEFIKLGQLLKLADMVQDGVEAKYVITDGLVKVNGEVDDRRGRKVYEGDIVSY DGKEIKVIR >gi|226332863|gb|ACII01000156.1| GENE 25 27871 - 28986 996 371 aa, chain - ## HITS:1 COG:CAC0002 KEGG:ns NR:ns ## COG: CAC0002 COG0592 # Protein_GI_number: 15893300 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Clostridium acetobutylicum # 1 365 1 363 366 244 40.0 2e-64 MKIICQKINLLKSVNISLKAVPSKTTMPILECILIDATTNQIKFTTNDMELGIETIVDGI IEEKGIIALDAKIFYEIIRRLPDNTVTIKTDEKLTATITCEKAKFTIPGKSGEDFAYLPL IEKEESLTISQFTLKEMIRQTIFSIASNETNKLMTGELFEIKNNYLKIVSLDGHRIAIRR MELKKDYADRKVVVPGKTLNEISKILSGEIDDIVNIYFSKNHILFEFDQTIVVSRLIDGE YFRIDQMLSSDYETKIKINKKEFLDCIDRATLLVREGEKKPIIIEITDNSMELRIDSAMG SMNENIDIDKEGKDILIGFNPRFLIDALKVIDDETISIYLVNPKAPCFIRDDEENYTYLI LPVNINQNQAR >gi|226332863|gb|ACII01000156.1| GENE 26 29265 - 30620 1392 451 aa, chain - ## HITS:1 COG:BS_dnaA KEGG:ns NR:ns ## COG: BS_dnaA COG0593 # Protein_GI_number: 16077069 # Func_class: L Replication, recombination and repair # Function: ATPase 
involved in DNA replication initiation # Organism: Bacillus subtilis # 1 447 1 445 446 421 49.0 1e-117 MNKVVEKWDEILQIVKTEHDLSDVSFNTWLKPLTVYEVVDNVVTIIVPSEQVGLNYISKK YKLPLQVTISEVTGMENCDINFILPEDVPKKEEVSPKAQSQDARCEEAHLNPKYTFDTFV VGSNNKFAQAAALAVAESPGDTYNPLFIYGGAGLGKTHLMHSIAHFIIDHDENSKVLYVT SEEFTNELIETIRNGNNSAMTKFREKYRNIDVLLVDDIQFIIGKESTQEEFFHTFNSLHS AKKQIIISSDKPPKDMEILEERFRSRFEWGLIADITLPDYETRMAILHKKEEMDGYDINE EVIKYIANNIKSNIRELEGAINKVMAFAKLEKKEVTLELAEQALKDIISPDEKKVITPDY IISMVAEHFDVTVDDLCGNKRNSKIVTPRQIAMYLCREIISTPLKSIGKCLGNRDHTTIM HGIDKIEKEINNDENLKNTIETLKKKINPQG >gi|226332863|gb|ACII01000156.1| GENE 27 31243 - 31377 187 44 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160882064|ref|YP_001561032.1| ribosomal protein L34 [Clostridium phytofermentans ISDg] # 1 44 1 44 44 76 86 2e-13 MKMTFQPKKRSRAKVHGFRARMSTKGGRKVLAARRLKGRKHLSA >gi|226332863|gb|ACII01000156.1| GENE 28 31645 - 31812 151 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581089|ref|ZP_04858349.1| ## NR: gi|253581089|ref|ZP_04858349.1| ribonuclease P [Ruminococcus sp. 
5_1_39B_FAA] # 1 55 69 123 123 100 98.0 4e-20 MNEAEFENSLDIVVVARPQAKDRTYQEIESALMHLAGKHCIAGKNNDKEVSDQNN >gi|226332863|gb|ACII01000156.1| GENE 29 31838 - 31993 106 51 aa, chain + ## HITS:1 COG:CAC3737 KEGG:ns NR:ns ## COG: CAC3737 COG0759 # Protein_GI_number: 15896968 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 6 50 26 70 71 77 77.0 7e-15 MKRTKCPYIPTCSQYGLEAIEKYGALKGGLLAVWRILRCNPFSHGGYDPVP >gi|226332863|gb|ACII01000156.1| GENE 30 32075 - 33370 1226 431 aa, chain + ## HITS:1 COG:BH1169 KEGG:ns NR:ns ## COG: BH1169 COG0706 # Protein_GI_number: 15613732 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Bacillus halodurans # 21 134 45 155 280 86 41.0 8e-17 MEMILATKSSTPIIGQLATVMGWIMDGIYRLLNIFGIQNIGICIVIFSILIYALMTPLQI KQQKFSKLSAIMQPELQKIQKKYKGKNNDTAAMQKMQEETQAVYQKYGVSPTGSCVQLAI QFPILMALYQVIYKIPAYVGSVRDILASAVTSITGVNGYTDILQQFITDNKMTRVQLIMD GSKATSNSVTDFLYALSPSQWKTLAETSQFAGFTDTLNSTAKEISHVQNFFGLNIADQPL TYIKAAFVGGSALLAIVAILIPILAWATQMINLKLMPQAAQQSGDSQQDAMMNSMKTMNM VMPLMSAVFCFTFPVGLGIYWVASAAVRSVQQVVINKKMDKIQIEDLISENMKKMEKKRE KAGLPPQKITNQAHQSAKNINKIEKGSSNTNVETRAKKVEEAYKDAANAKPGSITAKANL VKAFDERNKKK >gi|226332863|gb|ACII01000156.1| GENE 31 33391 - 34323 1173 310 aa, chain + ## HITS:1 COG:CAC3735 KEGG:ns NR:ns ## COG: CAC3735 COG1847 # Protein_GI_number: 15896966 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein # Organism: Clostridium acetobutylicum # 162 310 62 209 209 126 52.0 5e-29 MEDYIQFSAKTKSEAITKACIELGVSSDQLEIQVISEGSNGFFGIGSKPAVIKVRKIESV SEEEEMKEIVETVKLDSFKEEAPVQEEKKTEAIKPVKKEIKEPKAVSEKPRQPKPVKERA AKEKQPREFREPKEKQVREKTTKPVKPVEILTDPEEIKEVENRAKVFLRDVFASMNLGEV EITSEYNTTDGSLEVDFEGEDMGILIGKRGQTLDSLQYLTSLVVNKGKSNYIRVKLDTED YRKRRKETLENLARGIAYKVRKTRKPVILEPMNPYERRIIHSALQGNKFVETVSEGEEPY RHVVVKLKRN >gi|226332863|gb|ACII01000156.1| GENE 32 34601 - 35974 1415 457 aa, chain + ## HITS:1 
COG:BH4062 KEGG:ns NR:ns ## COG: BH4062 COG0486 # Protein_GI_number: 15616624 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Bacillus halodurans # 4 457 5 458 458 410 48.0 1e-114 METTIAAISTAMSASGIGIIRISGENAMDVISRIYRSKGGKKKIKEVPTHTIHYGYIYDG EELIDEVLVMIMHAPRTFTGEDTVEIDCHGGVYAMQRVLDTVLKNGAEIAEPGEFTKRAF LNGRMDLSQAEAVMDVIQAKNEYALRSSMDQLRGSVQKAIRDIREKLIYHIAYIESALDD PEHISLDGYPQELLEVVDNEQKEVKRLLKTSSDGKMIQEGIQTVILGKPNAGKSSLLNLL IGENRAIVTDIAGTTRDILEEYITLHGISLKIIDTAGIRETKDIVEKIGVDRAREMAQKA DLILYVVDSSVPLDENDEEIMEMLTGKKAIILYNKTDLQPEIQPEILKEKTGHPVIPISA KEEKGITELEEQIKDMFFGGEISFNDEVYITNARHKAALEEADRSLDLVRNSIEMGMPED FFSIDLMNAYENLGKILGESVGEDLVNEIFSKFCMGK >gi|226332863|gb|ACII01000156.1| GENE 33 36045 - 37919 2020 624 aa, chain + ## HITS:1 COG:CAC3733 KEGG:ns NR:ns ## COG: CAC3733 COG0445 # Protein_GI_number: 15896964 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Clostridium acetobutylicum # 7 620 8 619 626 774 61.0 0 MRTLVEDYDIVVVGAGHAGCEAALAGARLGLNTVMFTVSVDSIALMPCNPNVGGSSKGHL VRELDALGGEMGKNIDKTFIQSKMLNCSKGPAVHSLRAQADKQAYSNEMRKTLENTPNLT IKQGEVTKLLVEDGKITGVKTYSGATYNCKAVVLCTGTYLKARCIYGDVSNYTGPNGLQA ANYLTDSLKELGIEMFRFKTGTPARIAGNTIDYSKMEEQFGDERVVPFSFSTNPEDVQIE QKSCWLTYTNEKTHEIIRANLDRSPLYSGMIEGTGPRYCPSIEDKVVRFADKNRHQVFIE PEGLYTNEMYIGGMSSSLPEDVQEEMYHSVPGLEHAKIVKNAYAIEYDCINPRQLYPTLE FKKIKGLFSGGQFNGSSGYEEAAAQGLIAGINAALEVKGQEQLILDRSEAYIGVLIDDLV TKENHEPYRMMTSRAEYRLLLRQDNADLRLRKKGYQAGLISEEDYQKILTKEEQIKTEIS RVEHTNIGANKEVQTLLESYNSTPLKSGTTLAELIRRPELSYEAIKPLDKERPELPWDVQ EQVDINIKYDGYIRRQLKQVEQFKKLEAKKIPTDLDYEKVGSLRIEARQKLEAYRPISIG QASRISGVSPADISVLLVYLSSTK >gi|226332863|gb|ACII01000156.1| GENE 34 38099 - 38821 693 240 aa, chain + ## HITS:1 COG:BH4060 KEGG:ns NR:ns ## COG: BH4060 COG0357 # Protein_GI_number: 15616622 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase 
involved in bacterial cell division # Organism: Bacillus halodurans # 5 240 3 238 238 227 50.0 1e-59 MPYNLTTLENGCRELGIELSQKQKNQFIQFYEFLVEKNKVMNLTGITEFEEVLTKHFLDS LACVKAIDMTKVKTIMDIGTGAGFPGVPLKIAFPHLEACLLDSLKKRVNFLEESFELLGL EGIKAIHGRAEEYAKNKEYREKYDLCVSRAVSNLATLSEYCLPYVKTGGTFISYKSGTVQ EEAEEAEKAINILGGQVKDITYFKLPDSEIDRSLVIINKKKSTPGKYPRKAGTPLKEPLS >gi|226332863|gb|ACII01000156.1| GENE 35 39056 - 40129 999 357 aa, chain + ## HITS:1 COG:sll0489 KEGG:ns NR:ns ## COG: sll0489 COG1131 # Protein_GI_number: 16331772 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Synechocystis # 1 311 1 313 342 262 44.0 9e-70 MIEVNNLVKRYGNHTAVDHLSFKIEKGKIYGFLGPNGAGKSTTMNMITGYIASTEGTVKI DGHDILEEPEAAKKCIGYLPEQPPVYFDMTVLEYMKFVADLKKIPKDKKANMIEEVMDMV KISDMRNRLIKNLSKGYRQRVGLAEAIMGYPEVIILDEPTVGLDPKQIIEIRTLIKELKK KHTVILSSHILSEVSAVCDYVLIISHGKLVASDTPENLGKLAEGSNTLEMLIKGEKTQIR QALEGIEGVNSVTMEKDEKQNLWSVKVSTEEQNDIREEVFYKMSEINSPIYEMKSRKVSL EEIFLELTASDKPISADEVPNTPEDKDTDAGKSNETNVKQTSAEIPEEQNNDKGGEK >gi|226332863|gb|ACII01000156.1| GENE 36 40132 - 40995 833 287 aa, chain + ## HITS:1 COG:PA4038 KEGG:ns NR:ns ## COG: PA4038 COG1277 # Protein_GI_number: 15599233 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Pseudomonas aeruginosa # 4 169 7 174 244 76 33.0 7e-14 MTAIYKRELKSYLTSMVGYLFIFFILVLTGIYFSAYQLSAAYPKFEYTLSAITFAFLIGV PILTMRVLAEERKQKTDQLLLTAPVSVSGIVIGKYLALVTVFAVPMAVMCTYPLIMSRFG TVEFASAYTAVLGFFLLGCANIAIGVFMSALTESQVIAAVLTFVFLFAFYMMNGISSFFS QTSMSTCVTFGLLILAAAIIIYTMIKNILISAAIGVIGEVALIIIYVVKSSIFEGGIQKV LDVFNLSGHFDNFTSNIFDIKGIVYFLSVIAVCLFLTTQSILKRRWN >gi|226332863|gb|ACII01000156.1| GENE 37 41013 - 42497 1619 494 aa, chain + ## HITS:1 COG:slr2105 KEGG:ns NR:ns ## COG: slr2105 COG3225 # Protein_GI_number: 16330592 # Func_class: N Cell motility # Function: ABC-type uncharacterized transport system involved in gliding motility, auxiliary 
component # Organism: Synechocystis # 22 394 64 420 595 89 19.0 1e-17 MKLKEKMKNNKHKKFDKKKLIGTISKKHIKNGSYTMVMSVVFIAVVVVLNMIVNAIPSKY SEIDISSQKLYSIGDDTKAMLKDLDKDVTIYQIAQSGSEDENITNLLKKYEDESKHIKVE QKDPVVNPKFVTEYTSDDLSANSLIVVCGDRNKVIDYNNIYESTIDYQTYSSQTTGFDGE GQITSAIGYVTSEDLPILYTLEGHGEKEMDSTIKEDIEKANMDIQSLNLLTEGSVPDDAD CLFIDSPSTDISEDEKTAIIDYLENGGKAMIFSDYTTEDLPNFDAVLENYGVQREAGIVF EGDNQHYAMQMPYYLVPTINSTDASSETVSGGYYVLVPYAQGIKKMDDVRDTVTINSILT TSDQAYSKTDLNSDTLEKEDGDEAGPFDLGVSITETLDDDKETQIVYYSTSNLMESQVNQ MVSGGNEKLIIESLKWMTDTEDSATVSIPSKSLSVSYLTLTDYDAAFWKICTIGLIPGFF LVVGFAVWMKRRKA >gi|226332863|gb|ACII01000156.1| GENE 38 42498 - 44297 1783 599 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581099|ref|ZP_04858359.1| ## NR: gi|253581099|ref|ZP_04858359.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 61 599 1 539 539 558 100.0 1e-157 MKNKTVKMVLAVVVLGVCCGAYAGVKTYVAHQEQKESEEDSEESTTVFTASTDNIKSLDF MVDDTETTFEKDDDSWVKKDETDFPVNQTTLDSAASAIASVDSDRVLEDVDDLSEYGLDS PSNTIKIVTKSDEEDGDDITTTLYVGDENSSTSQYYVRKDDDEKTVYLIDSSCVEPFTKS LNDYAQMEDFPAISNTDNITKISVDGDNSYELSKDEDTSTWSVKGSEDEEKADSATVSSL VSSFGSMAYSSLADYKCDDKSKYGLDKPYATITVDYQEEVADDDTQDDTTASGETADEDA SSVEESSDDAEQDTESDESVSETTDETSENSDSASDNEDSSEEETKMADKQLTILVGNET DDSSRYVMVNDSNEVYTMSTDTLSALTDKSEEDFWDMTVSYVSLNSLGSLKVNYQGSDYK VNVSRETSTDDDGNDTETVTYKLNGSDLDETTFTTFYNKLINMTAQKRLTDKYEPDGDAE LTATFTEEDGDTQEVAYYSYDTNYYAAVVDNKVYLVNKMNVKELFTAFESVVGTEEDSSD KTDSDEATTDDTKSDSTDMPGESIEESDDSTSETAGITNDSDTSEEENTDSQETDSTEE >gi|226332863|gb|ACII01000156.1| GENE 39 44569 - 45522 678 317 aa, chain - ## HITS:1 COG:VNG6296C KEGG:ns NR:ns ## COG: VNG6296C COG0596 # Protein_GI_number: 16120209 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Halobacterium sp. 
NRC-1 # 44 310 44 305 328 88 26.0 1e-17 MSKNKHKFLTFAALMTGATVAVHFINHTIATAAQLKQMLHISNDNYFEWRFGNIYYTKKG TGSPILLIHDTLPGASGYEWSKIEDELAIDHTVYTVDLLGCGRSDKSSITYTNFVYVQMI SDFIKKIIGQKTDVIASGFSGSFVTMACHNEKELFNKIMLVNPPSLTQLKQMPNRKDRLL KAALEIPIFGTLVYHMIVSRDNINNLFIEKMYYNPFHVDNQMADAYYEAAHKGGYYTRFL YSSLAAKYININICHALKALDNSIYIVEGETEPNGKAVTDDYCASNPAIEVSVLKETKHL PHVEAPEAFLEQVKIFF Prediction of potential genes in microbial genomes Time: Sat May 28 21:03:13 2011 Seq name: gi|226332862|gb|ACII01000157.1| Ruminococcus sp. 5_1_39B_FAA cont1.157, whole genome shotgun sequence Length of sequence - 5056 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 90 - 149 9.1 1 1 Op 1 25/0.000 + CDS 249 - 1016 925 ## COG1192 ATPases involved in chromosome partitioning 2 1 Op 2 . + CDS 1016 - 1990 872 ## COG1475 Predicted transcriptional regulators 3 1 Op 3 . + CDS 2004 - 2522 556 ## Cphy_3933 hypothetical protein + Prom 2635 - 2694 4.4 4 2 Op 1 . + CDS 2715 - 4007 1530 ## COG0172 Seryl-tRNA synthetase 5 2 Op 2 . 
+ CDS 4040 - 4915 522 ## EUBREC_3695 hypothetical protein + Term 4932 - 4972 5.2 Predicted protein(s) >gi|226332862|gb|ACII01000157.1| GENE 1 249 - 1016 925 255 aa, chain + ## HITS:1 COG:BS_soj KEGG:ns NR:ns ## COG: BS_soj COG1192 # Protein_GI_number: 16081149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus subtilis # 1 249 1 249 253 297 59.0 1e-80 MGRIIAVANQKGGVGKSTTAINLSACLAEKGKKVLAIDMDPQGNTTSGFGVDKNGIENTL YELLLGEAEMKDTIVKDVVENLDLIPSNINLSGAEIELVGIDDKEFILKGITDKLRRKYD YIILDCPPSLNMLTINALTAATSVLVPIQCEYYALEGLSQLIHTIDLVKERLNKRLKMEG VVFTMYDARTNLSLQVVENVKENLNQNIYKTIIPRNVRLAEAPSYGQPINIYDPRSAGAE SYRLLAEEVLNREDN >gi|226332862|gb|ACII01000157.1| GENE 2 1016 - 1990 872 324 aa, chain + ## HITS:1 COG:CAC3729 KEGG:ns NR:ns ## COG: CAC3729 COG1475 # Protein_GI_number: 15896960 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 59 321 23 282 283 221 48.0 1e-57 MAGRKNGLGRGLDAFFPDRTSVVKEPARKTTTKTVKTEKKSDVAEKQTNPTVAKKQTADS KTGAMIVKISSVEPNMDQPRKQFDEDALMELSESIKQYGVLHPLLVSDKKDYYEIIAGER RWRAAKLAGLTEIPVIVKEFSEQELVEISLIENIQREDLNPVEEAMAYKRLIDEFHLKQD EIAERVGKSRTAVTNAMRLLKLSEKVQQMLIDEMITAGHARAILSIADKEKQESIAMKVF DEKLSVRETEALVKRMLEPPKTAKKSKFSSAEDAIYESLEEKMKSIMGTRVQIHRKKNDK GKIEIEYYSKDELERIIDLFESIG >gi|226332862|gb|ACII01000157.1| GENE 3 2004 - 2522 556 172 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3933 NR:ns ## KEGG: Cphy_3933 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 4 164 3 164 170 118 38.0 1e-25 MSGIFENLGIDSGIIIIILLILTIFLLVRNISMNMRLSRLERKYKTFMKGSDAQSLEKVF VRKFAQIDKLYDAKEEHDHDLLFIKKNLDKMFNKYGVEKYDAFDDVGGKLSFALALLDKE NTGLILNAVHSRDNCFLYMKEIVKGESYVMLSQEEVEALRKAVNFGLDNIEE >gi|226332862|gb|ACII01000157.1| GENE 4 2715 - 4007 1530 430 aa, chain + ## HITS:1 COG:PH0710 KEGG:ns NR:ns ## COG: PH0710 COG0172 # Protein_GI_number: 14590588 # Func_class: J Translation, ribosomal 
structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Pyrococcus horikoshii # 1 428 6 454 460 315 39.0 8e-86 MLDIKFVRENPEVVKQNIRNKFQDNKLELVDKVLELDKENREIKQEVEALRAERNKISKQ IGALMAQGKKEEAEEVKKQVAASGTRIEELSTREKEVEEELLKDMMVIPNIIDPSVPIGK DDSENVEIEKFGEPVVPDFEVPYHTEIMENFDGIDLDSARRVAGNGFYYLMGDIARLHSA VISYARDFMINRGFTYCVPPFMIRSNVVTGVMSFAEMDAMMYKIEGEDLYLIGTSEHSMI GKFIDQIIPEEELPKTLTSYSPCFRKEKGAHGLEERGVYRIHQFEKQEMIVVCRPEESPM WFDKLWQNTVDLFRSLDIPVRTLECCSGDLADLKVKSLDVEAWSPRQKKYFEVGSCSNLG DAQARRLKIRVNGKDGKKYLAHTLNNTVVAPPRMLIAFLENNLNEDGSVSIPKALQPYMG GMTKIEKKHA >gi|226332862|gb|ACII01000157.1| GENE 5 4040 - 4915 522 291 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3695 NR:ns ## KEGG: EUBREC_3695 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 289 1 289 291 280 48.0 4e-74 MHSYLRAVGFSRITKRSSIRQIITDVVETYDEKTVIENYPDGMFAEFSKNYGCDCGITVC GQYDENNVFYPDYYFPFFRGTGITTQESVVIERHADKESFAGACDDLRIGVTLIFYLQNA AEYLQQREKGNILPGGQPLTLSGLAREGKILFPVEKDKEAVKAERELTKNRNHLIAAARN GDEEAMESLTMEDMDTYSMISQRIVTDDIFSIVDSYFMPYGIECDQYSIMGEILDFMTFK NIITGEEICQLTLDCNDMQFDVCINKNDLLGEPKIGRRFKGIIWLQGQLHY Prediction of potential genes in microbial genomes Time: Sat May 28 21:03:22 2011 Seq name: gi|226332861|gb|ACII01000158.1| Ruminococcus sp. 5_1_39B_FAA cont1.158, whole genome shotgun sequence Length of sequence - 4744 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 2, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 20 - 79 3.5 1 1 Tu 1 . + CDS 118 - 927 909 ## Fisuc_1141 hypothetical protein + TRNA 1418 - 1491 80.5 # Arg CCT 0 0 2 2 Op 1 . + CDS 1914 - 2309 284 ## COG1475 Predicted transcriptional regulators 3 2 Op 2 . + CDS 2306 - 2569 246 ## gi|166033000|ref|ZP_02235829.1| hypothetical protein DORFOR_02721 4 2 Op 3 . + CDS 2569 - 2739 135 ## BLJ_1240 hypothetical protein 5 2 Op 4 . + CDS 2749 - 3045 237 ## gi|253581110|ref|ZP_04858370.1| conserved hypothetical protein 6 2 Op 5 . 
+ CDS 3059 - 3457 455 ## gi|295099024|emb|CBK88113.1| hypothetical protein 7 2 Op 6 . + CDS 3528 - 4274 256 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) Predicted protein(s) >gi|226332861|gb|ACII01000158.1| GENE 1 118 - 927 909 269 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_1141 NR:ns ## KEGG: Fisuc_1141 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 53 260 185 402 404 134 36.0 3e-30 MKNKLWKQWMLSGVLVSAMVVSCGVSVSVQAEETTETEEAAETDSSAESEDDLRILFDQA VEDAMIAEDGEILPVVSLDEGEPYAVYNEEGRVLLYTFHKYPDSYPDGTDVKLEWGNVWT FTGGELEDWYQENKEGVTDWQTRMKELLGLTPDNESNYVTAMWVKPEDVFRPAYISDIGT VEMTDSFSEDVDADYKAWFDANIISSYYDGEYPWTRLGYTYDWADNGQAYGLSEFIVKQD SDVKVAYTVELGEMIQKLEDNTWNPEAEN >gi|226332861|gb|ACII01000158.1| GENE 2 1914 - 2309 284 131 aa, chain + ## HITS:1 COG:Cgl3034 KEGG:ns NR:ns ## COG: Cgl3034 COG1475 # Protein_GI_number: 19554284 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Corynebacterium glutamicum # 3 109 108 212 379 60 32.0 8e-10 MTEPNRQSEENEERIIEIEIERLRPFKEHPFQVKDDKEMFLLQESIEKYGILNPLIVRPV PDGYYEIISGHRRKHAAEKLGYRKVPVIIRVLSEDDSIFSMVDSNLHEVPKSNFKKFTFQ TLIILILVILK >gi|226332861|gb|ACII01000158.1| GENE 3 2306 - 2569 246 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|166033000|ref|ZP_02235829.1| ## NR: gi|166033000|ref|ZP_02235829.1| hypothetical protein DORFOR_02721 [Dorea formicigenerans ATCC 27755] # 1 87 189 275 332 155 96.0 1e-36 MNDNKSNPIISVDEKRFDSDNHSEDYQAYENLVKKTIDYESLEVTHHDDMRQVDEIVNLI VETVMCKNDKILIASNWYPASLVKKNF >gi|226332861|gb|ACII01000158.1| GENE 4 2569 - 2739 135 56 aa, chain + ## HITS:1 COG:no KEGG:BLJ_1240 NR:ns ## KEGG: BLJ_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 2 51 321 370 381 80 70.0 2e-14 MLTYSHIEYVLHCMSGNTTKVKNIKKYLLAALFNAPSTMNGYYQAEVNHDMPGLVI >gi|226332861|gb|ACII01000158.1| GENE 5 2749 - 3045 237 98 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|253581110|ref|ZP_04858370.1| ## NR: gi|253581110|ref|ZP_04858370.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 98 1 98 98 115 100.0 1e-24 MFILKFAGKILLLPVWLILFVIGLAVKMTVQTYAVVRGILGFIFTLLIIATAYCYHDWVQ VAFLFSLSVILYLILFAGVFVDTVLDMTRERIIDFIIS >gi|226332861|gb|ACII01000158.1| GENE 6 3059 - 3457 455 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295099024|emb|CBK88113.1| ## NR: gi|295099024|emb|CBK88113.1| hypothetical protein [Eubacterium cylindroides T2-87] # 1 132 1 132 132 243 100.0 2e-63 MSENNDYIQLPPLKKDTPSDVVAFMWEYMKVPEDSREKVKNLLKDANENGVKLSHQAPTL YDVVPKEEITEFEELMRKTIADIVSEASSIACWVYVQKYVKQKTLDEMLQELPGAGQFII VMDTWFERLMAE >gi|226332861|gb|ACII01000158.1| GENE 7 3528 - 4274 256 248 aa, chain + ## HITS:1 COG:CAC2472 KEGG:ns NR:ns ## COG: CAC2472 COG0596 # Protein_GI_number: 15895737 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Clostridium acetobutylicum # 1 238 1 239 250 186 38.0 3e-47 MNYVEYGKENSDVIILLHGGGLSWWNYKEVAERLQTDYHVVLPILDGHAGCDKQFTTIEN NALDIIEFVNSKLGGSVLMMGGLSLGGQILLETLSQRKDICKYAIVESVLVIPSKFTYSM IKPAFGSCYGLIKYKWFSKLQFKSLRIKSNLFDEYYKDTCAIRKSDMIAFLQENSVYSLK DGIGECEATVQIYVGEKEKQSMKKSAKIIHEKLQDSFIQVLPNMYHGEFSINHADDYVRK LLEIVKRR Prediction of potential genes in microbial genomes Time: Sat May 28 21:03:45 2011 Seq name: gi|226332860|gb|ACII01000159.1| Ruminococcus sp. 5_1_39B_FAA cont1.159, whole genome shotgun sequence Length of sequence - 1121 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 5.2 1 1 Op 1 . + CDS 85 - 384 255 ## COG1943 Transposase and inactivated derivatives 2 1 Op 2 . 
+ CDS 437 - 1121 286 ## cce_0110 hypothetical protein Predicted protein(s) >gi|226332860|gb|ACII01000159.1| GENE 1 85 - 384 255 99 aa, chain + ## HITS:1 COG:DR0667 KEGG:ns NR:ns ## COG: DR0667 COG1943 # Protein_GI_number: 15805694 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Deinococcus radiodurans # 1 95 46 140 140 107 49.0 5e-24 MLQDLAEEYRFQILAMEVMPDHIHMLVDCRPQFYISDMIKIMKGNLARQMFLLHPELKKE LWGGHLWNPSYYAVTVSDRSREQVSAYIEGQKEKEKRKP >gi|226332860|gb|ACII01000159.1| GENE 2 437 - 1121 286 228 aa, chain + ## HITS:1 COG:no KEGG:cce_0110 NR:ns ## KEGG: cce_0110 # Name: not_defined # Def: hypothetical protein # Organism: Cyanothece_ATCC51142 # Pathway: not_defined # 21 227 54 265 486 80 30.0 6e-14 MRTVSSYGVEIRKQNIPARQTMEIYRQAVGYLTEIYAQVWEELRKIPETKKRFNTAEHMV HMTKKNTARFDFDLRFPKMPSYLRRSAIQHALGSVSSYETRLGQWKETGVLSGRPKLTCR NHAMPVFYRDVMYREGAEGKDEAYLKLYDGHDWKWFRVYLKRTDMEYLPRNWKGKKTSAP ALEKRHRRYFLRFSCTEEVTLTKTPVKEQIICSVDLGINTDAVCTIMR Prediction of potential genes in microbial genomes Time: Sat May 28 21:03:49 2011 Seq name: gi|226332859|gb|ACII01000160.1| Ruminococcus sp. 5_1_39B_FAA cont1.160, whole genome shotgun sequence Length of sequence - 1318 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 8 - 316 86 ## cce_0110 hypothetical protein - Term 578 - 626 14.0 2 2 Tu 1 . - CDS 666 - 878 229 ## gi|253581118|ref|ZP_04858377.1| conserved hypothetical protein - Prom 969 - 1028 7.0 3 3 Tu 1 . 
- CDS 1062 - 1256 273 ## gi|253581119|ref|ZP_04858378.1| predicted protein Predicted protein(s) >gi|226332859|gb|ACII01000160.1| GENE 1 8 - 316 86 102 aa, chain + ## HITS:1 COG:no KEGG:cce_0110 NR:ns ## KEGG: cce_0110 # Name: not_defined # Def: hypothetical protein # Organism: Cyanothece_ATCC51142 # Pathway: not_defined # 11 60 401 450 486 68 62.0 5e-11 MRISRICAWNTSRLAYDGSGAVTRDRENHSLCTFQTGKRYNCDLSASYNIGARYFIRELL KPLPATERSLLEAKVPPVKRRTSCVYADLRKLHSEMELLKAA >gi|226332859|gb|ACII01000160.1| GENE 2 666 - 878 229 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581118|ref|ZP_04858377.1| ## NR: gi|253581118|ref|ZP_04858377.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 70 1 70 70 131 100.0 1e-29 MHSLSDSTQGYTYTIKWMFGLPEVMKSMHDMNIHEGSTIRVLRKFHDSLIISSANRKIVL GNEVADRIQV >gi|226332859|gb|ACII01000160.1| GENE 3 1062 - 1256 273 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581119|ref|ZP_04858378.1| ## NR: gi|253581119|ref|ZP_04858378.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 64 1 64 64 110 100.0 2e-23 MPNDFIVRPKCTDKKEDKSITMTIRLERELQEQYDDLSAKSGRSRNELMCMALRYALDNL KFIE Prediction of potential genes in microbial genomes Time: Sat May 28 21:04:11 2011 Seq name: gi|226332858|gb|ACII01000161.1| Ruminococcus sp. 5_1_39B_FAA cont1.161, whole genome shotgun sequence Length of sequence - 44193 bp Number of predicted genes - 44, with homology - 43 Number of transcription units - 15, operones - 10 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 43 - 333 274 ## gi|253581120|ref|ZP_04858379.1| predicted protein + Term 346 - 404 5.3 2 2 Tu 1 . - CDS 443 - 1180 515 ## COG2186 Transcriptional regulators - Prom 1211 - 1270 5.8 - Term 1550 - 1606 -0.8 3 3 Tu 1 . - CDS 1760 - 3502 1036 ## COG3044 Predicted ATPase of the ABC class - Prom 3581 - 3640 4.9 + Prom 3616 - 3675 7.5 4 4 Tu 1 . 
+ CDS 3778 - 5169 1667 ## COG1362 Aspartyl aminopeptidase + Term 5259 - 5303 1.2 + Prom 5265 - 5324 5.9 5 5 Op 1 24/0.000 + CDS 5410 - 6141 268 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 6 5 Op 2 13/0.000 + CDS 6167 - 7999 2020 ## COG0845 Membrane-fusion protein 7 5 Op 3 . + CDS 7992 - 9224 305 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 + TRNA 9416 - 9492 71.1 # Arg ACG 0 0 - Term 9775 - 9824 4.5 8 6 Op 1 . - CDS 9828 - 10688 359 ## CLL_A0826 putative O-methyltransferase 9 6 Op 2 . - CDS 10685 - 11710 725 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family 10 6 Op 3 . - CDS 11780 - 12853 447 ## COG0406 Fructose-2,6-bisphosphatase 11 6 Op 4 . - CDS 12846 - 13775 753 ## COG2068 Uncharacterized MobA-related protein - Prom 13816 - 13875 11.9 + Prom 13824 - 13883 14.8 12 7 Op 1 . + CDS 13986 - 14246 262 ## Sterm_3269 hypothetical protein 13 7 Op 2 . + CDS 14283 - 17045 2787 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 14 7 Op 3 . + CDS 17129 - 19423 1567 ## COG0247 Fe-S oxidoreductase 15 7 Op 4 . + CDS 19423 - 19644 315 ## Sterm_3272 hypothetical protein 16 7 Op 5 . + CDS 19645 - 20178 442 ## Sterm_3273 methyltransferase type 12 17 7 Op 6 . + CDS 20218 - 20613 569 ## Gura_1002 hypothetical protein 18 7 Op 7 . + CDS 20610 - 21935 976 ## COG1964 Predicted Fe-S oxidoreductases 19 7 Op 8 . + CDS 21956 - 23128 581 ## COG1541 Coenzyme F390 synthetase + Term 23138 - 23204 8.1 - Term 23132 - 23184 7.1 20 8 Op 1 . - CDS 23204 - 23941 302 ## Shel_05240 hypothetical protein 21 8 Op 2 . - CDS 23947 - 24978 696 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase - Prom 25071 - 25130 7.1 + Prom 25155 - 25214 9.3 22 9 Op 1 . + CDS 25270 - 25506 249 ## EUBREC_0180 hypothetical protein 23 9 Op 2 . + CDS 25583 - 26101 560 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 24 9 Op 3 . 
+ CDS 26133 - 26831 655 ## COG1768 Predicted phosphohydrolase + Term 26851 - 26892 8.5 25 10 Op 1 . + CDS 26928 - 28013 706 ## Cphy_1220 hypothetical protein 26 10 Op 2 2/0.000 + CDS 28035 - 29402 1176 ## COG3864 Uncharacterized protein conserved in bacteria 27 10 Op 3 . + CDS 29458 - 30951 1467 ## COG0714 MoxR-like ATPases + Prom 31014 - 31073 6.5 28 11 Op 1 . + CDS 31122 - 32825 2082 ## COG1409 Predicted phosphohydrolases 29 11 Op 2 . + CDS 32849 - 33997 970 ## COG5438 Predicted multitransmembrane protein + Term 34015 - 34056 2.1 + Prom 34419 - 34478 10.0 30 12 Op 1 . + CDS 34508 - 35020 419 ## Fisuc_1357 Appr-1-p processing domain protein 31 12 Op 2 . + CDS 35082 - 35282 324 ## gi|253581150|ref|ZP_04858409.1| predicted protein 32 12 Op 3 . + CDS 35272 - 36780 693 ## gi|253581151|ref|ZP_04858410.1| predicted protein + Prom 36812 - 36871 8.4 33 13 Op 1 . + CDS 36901 - 37599 487 ## gi|253581152|ref|ZP_04858411.1| predicted protein 34 13 Op 2 . + CDS 37640 - 37987 244 ## gi|253581153|ref|ZP_04858412.1| predicted protein 35 13 Op 3 . + CDS 37980 - 38204 115 ## gi|253581154|ref|ZP_04858413.1| predicted protein 36 13 Op 4 . + CDS 38191 - 39234 265 ## gi|253581155|ref|ZP_04858414.1| predicted protein 37 13 Op 5 . + CDS 39225 - 40334 413 ## DvMF_2723 radical SAM domain protein 38 13 Op 6 . + CDS 40331 - 41065 371 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) 39 13 Op 7 . + CDS 41062 - 41718 200 ## BDI_2144 hypothetical protein 40 13 Op 8 . + CDS 41709 - 42509 296 ## COG4642 Uncharacterized protein conserved in bacteria 41 14 Tu 1 . + CDS 42969 - 43127 85 ## + Prom 43162 - 43221 4.0 42 15 Op 1 . + CDS 43274 - 43525 187 ## CTC00334 putative phosphohydrolase 43 15 Op 2 . + CDS 43566 - 43796 178 ## gi|253581159|ref|ZP_04858418.1| ser/Thr protein phosphatase 44 15 Op 3 . 
+ CDS 43783 - 44163 291 ## gi|253581160|ref|ZP_04858419.1| conserved hypothetical protein Predicted protein(s) >gi|226332858|gb|ACII01000161.1| GENE 1 43 - 333 274 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581120|ref|ZP_04858379.1| ## NR: gi|253581120|ref|ZP_04858379.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 96 1 96 96 144 100.0 2e-33 MENVIRKLYGMYMDNDNREKNEMSVMHDDLCKEKKEKAYQCYQKLRVQLTDELAEELDEL MNKHMEIYPQELEESFAMGFKTGARLMCEIFSEETE >gi|226332858|gb|ACII01000161.1| GENE 2 443 - 1180 515 245 aa, chain - ## HITS:1 COG:CAC2546 KEGG:ns NR:ns ## COG: CAC2546 COG2186 # Protein_GI_number: 15895808 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 230 1 231 231 132 31.0 7e-31 MFRKIDKTSTFNDVLKQIIENIQNGELQPGDALPAERILAEEFGISRPALREVLKALSLL GITVSVHGGANYIATDLHSCLTEPLSIIFQMYNSKIQDALQLRGALESKAAFLAAQNCTP LDAAELQLIIARLDSTQDEKIRGDLDRDLHLKIAQISANPLILSVLNASAQITENMIVGI RSYLMQKNNDAPEIDSQHTRLVQAITNNDAPLAEKVMNEHMHTIEYLLNEITSDNLTISD NVQQP >gi|226332858|gb|ACII01000161.1| GENE 3 1760 - 3502 1036 580 aa, chain - ## HITS:1 COG:VCA0786 KEGG:ns NR:ns ## COG: VCA0786 COG3044 # Protein_GI_number: 15601541 # Func_class: R General function prediction only # Function: Predicted ATPase of the ABC class # Organism: Vibrio cholerae # 5 578 2 543 549 363 36.0 1e-100 MKNSEELRQQLRSINRKSYPAYKGLKGLYHFGNYILSIDHVQGDPFASPSHVSIQISHRD AGFPVEYYKDTLTGTTLCDYLTRQFEKQVSQYSFRAKGSGKSGLLTVSHCGQEILSRTAC EITEKGITARFFVGFPANGRTINATELEKILFDFLPVCIQKSFFYSSLNAKELQNYIELA EDQEFIRQTLPAKNLCAFIADGSILPRESGISSRPMKASVPFTSPDSLRISINLPHKGKI TGMGIPKGITLIVGGGYHGKSTLLNALELGVYNHIPGDGREYVITDATAVKLRSEDGRFI KDVDISMFINDLPNKKDTRCFSTLDASGSTSQAAGIVESMEAGSHLFLLDEDTSATNFMV RDAFMQQVIQREKEPITPFLERAEDLYKKAEISTILVAGSSGAFFHIADTIIQMDNYVPK DITASVKKLCSQYPLPAVSVTNFQLPHSHRIMSRPAESSKRLIHNSRGNHSDSGATKPER LKTRISGTDGFSLGRQEIDLRYTEQLIDAEQTAALGLLLKYAVEHLADGRRTLPEIVQFL WKNLSLHGLSFFTENQKISCGYATPRIQEIYACLNRYRGL >gi|226332858|gb|ACII01000161.1| GENE 4 3778 - 5169 1667 463 aa, chain + ## 
HITS:1 COG:CAC1091 KEGG:ns NR:ns ## COG: CAC1091 COG1362 # Protein_GI_number: 15894376 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 2 460 8 463 465 576 61.0 1e-164 MERRNAWLSYTEAEEKELEQVAKAYRNFLNVGKTERECVKQIIREARVAGYESLEEKTAK GEKLKAGDKVYTVGMKKIIALFHIGQDDISEGMNILGAHIDSPRLDVKQNPLYEDTDLAY LDTHYYGGVKKYQWVTLPLAIHGVVVKKDGSTAEVNIGEDEDDPIVYITDLLIHLAGKQM QKKAAEVIEGENLDILIGSRPLKDLEDDKKKEAVKQNVLNILKEKYDMEEEDFLSAELEI VPAGKARECGLDRSMIAAYGQDDRVCAYTSLLAMLEMDTPKHTSCCLFTDKEEIGSVGAT GMQSRFFENAVAELLDAMGCYSDLRLRRTLKNSSMLSSDVSAGYDPAYGEAFEKKNAAYL GRGIVLNKFTGARGKSGSNDANAEYVARVRNIFDSHEVAFQTAELGKVDIGGGGTIAYIA ALYGMEVIDSGVAVLSMHAPWEVTSKADIYEAKKAYKAFLLEA >gi|226332858|gb|ACII01000161.1| GENE 5 5410 - 6141 268 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 231 1 223 311 107 31 9e-23 MEEALIRVKDLCKIYNPGENEVRALDHINLEINKGEFVAIIGQSGSGKSTFMNMLGCLDV PTSGEYFLNGTDVSTMEDNELSEVRNREIGFIFQGFNLISNLTAIENVELPLVYRGVDRK TRHKLAVESLTMVGLEKRMDHRPNEMSGGQQQRVAIARAIAAQPPVILADEPTGNLDSAS SKEILQILKSMHEQGKTVILITHDNGIAAQARRVVRIMDGKIESDFINKNYGKEEYIKNQ LDA >gi|226332858|gb|ACII01000161.1| GENE 6 6167 - 7999 2020 610 aa, chain + ## HITS:1 COG:RSp1598 KEGG:ns NR:ns ## COG: RSp1598 COG0845 # Protein_GI_number: 17549817 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Ralstonia solanacearum # 402 600 203 407 421 61 26.0 6e-09 MKKKKSKKKIIIGAAAVLVVAGGVTVAGQRNSTENQIPQVQVVIAEKGDVEEIVDATGTV GSEEEKTYYSPVNAELKTVSFAQGDVVKKGTKLIEFNTEDLEKDNRKAELNLKSTKYDTK DTRNKSDKAEKKQKDAKKNVQELEKKIKDKKAYVSSLKSQISAATAAAQREAAAQASAQA AAQAQAQQQEAQAKAQAEAKKQEEIQSKYQAALYTYKTETLPQYQQQLSDLNAQYNQAQS DYNQADTTYQMAFATWQADPSDENTQALDNAEAARTQAQLAMQQAQQTYNDLKQQTPKMP VLSDFTESSADYSWGISDGSDADDSDGEDSSSYGYDYSGSDGGTVTADTSALESALETAS DELAELQSDLASEKAIAEADSTSLTKEEKEKLKVTDNLSELDAKSVEELVKEGKKGITAE 
FNGIISKADIKQGAAVTQGMELFTIENTDKASVDVTLSKYDYNTVKEGQSVEITLGDNTY QGTVTKMSHIAVQNEKGTPVISATVSIDNPDEDIFLGVDAKVKIHAASAKNVVTLPVEVV NIGKEGSFCYVIEDGLVTKRNITTGISSEDYVEITDGIKEGEEVIADLGDYTEGMEVQAV PEQTGEDADE >gi|226332858|gb|ACII01000161.1| GENE 7 7992 - 9224 305 410 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 6 408 7 411 413 122 25 5e-27 MSNLYEYIKMAVHNIMANKGRSFLTMLGIIIGIASVIAIVSVGEGTKNQMNSEIDDIGGG QIAISVSDEAQTDEEWITADDVEAIRAVDGIEGVNVSDSFSGETTTGKGNFSLMLTGEGP DAKLLNNATMKHGVYFGENEVQEAKNVCVLSDADAKRLFGTDDVVGMTVDVNCYGLTKSL RICGVTTQKENGTFVSYTYEGMPVTINVPYTTMNEFTGVSDYFFSVTMQADKSLNSQDVA DKVVKLMEKRHQCAGKDYFQVQSFQDIMSSMNQMLDMVTAFISFVAGISLLVGGIGVMNI MLVSVTERTREIGIRKSLGAKTSSIMMQFLAEAAILTVIGGVIGIVLGVIGGYVICSIIS SSMGMSITPGISLSTIMAATLFSCAVGVFFGIYPAKKAAKLSPIEALRRN >gi|226332858|gb|ACII01000161.1| GENE 8 9828 - 10688 359 286 aa, chain - ## HITS:1 COG:no KEGG:CLL_A0826 NR:ns ## KEGG: CLL_A0826 # Name: not_defined # Def: putative O-methyltransferase # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 8 285 3 280 280 182 39.0 1e-44 MMAKSVYKTVIFGAGQIGQMTARLLSSPCQLLCFADNDPHKHGSYIGNIPVCSPDTAAAL LPDLVILGVLDEERRNSMIKQMENLGYHGPFRDPSVLRMFDARVAVMRLLSEQIYQLDIP GNVAELGVFRGEFSSLISAAFPDRKIHLFDTFEGFSEKDITIEASGNLSRAKTGDFSSTD IDSVLHVMPDPTRTVIHKGWFPDTFSDVTDETFCFVSLDADLYAPTAAALPLFYERLATG GVLLIHDVYSTQFSGCRKAVGEFCLKNHLFADPVCDLHGSAMIRKL >gi|226332858|gb|ACII01000161.1| GENE 9 10685 - 11710 725 341 aa, chain - ## HITS:1 COG:CC2617 KEGG:ns NR:ns ## COG: CC2617 COG1975 # Protein_GI_number: 16126854 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Caulobacter vibrioides # 20 330 21 320 338 92 29.0 9e-19 MRTLFQTIKQQFLEGNDLVLVSVTASSGSTPRGAGSRMLVGKNGRISGTIGGGAVEYRAE CMALDILEKKESCEHEFRLNHKDVENIGMICGGDVTAFFQYLDHNDPIIMEITETAEKFY EERKDFWLICDLLTTSGVSLYSPACGLIGTANVPSSVLSSLSTKPHRYHSEDYDLFSEQI 
GTSGTVYVFGGGHVSQKLVPILASVDFRCVVLDDRPEFTNPALFPGAVETILCDFNHLDQ YVSITAADYCCVMTRGHSYDTIVQAQLLATPACYIGVIGSRAKKAAVFHRLIEEYNINEQ DLNHIISPIGLEIKAETPAEIAISITGQMIQVRADRKATSK >gi|226332858|gb|ACII01000161.1| GENE 10 11780 - 12853 447 357 aa, chain - ## HITS:1 COG:DR1393 KEGG:ns NR:ns ## COG: DR1393 COG0406 # Protein_GI_number: 15806410 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Deinococcus radiodurans # 4 150 21 173 237 75 37.0 2e-13 MRNLYLVRHGKPQYPDEHSYCVGQTDFSLSMLGHLQAVLLNEELSDKISGVYCSPLLRAV ETAGHMAPELPHIIVSDLSERNLGEWDGLSFDEIRQRWPDIYKARGNNPDHPIPGAETPA ASGFRFSQAVHKILCASEGDIAVVTHTDVISSYLHALHSDMYSRQRFRLPCGSYYHLEVN EKNNISFSDPSYILPHPELNDGLCLRLRNAVSLPRHVQAHSDAVTELACCLCNMLESNGY IFDQKLVRSGALLHDIARLQRHHAKTGGELFLQLGYPEISQIISQHHGLLEATLDEAAIV FLADKLIQETQRVTIEKRFADSMSKCKSPEARKAHEQQLEQARKLQDMIQSLCHITL >gi|226332858|gb|ACII01000161.1| GENE 11 12846 - 13775 753 309 aa, chain - ## HITS:1 COG:ECs3750 KEGG:ns NR:ns ## COG: ECs3750 COG2068 # Protein_GI_number: 15833004 # Func_class: R General function prediction only # Function: Uncharacterized MobA-related protein # Organism: Escherichia coli O157:H7 # 6 185 7 182 192 69 31.0 8e-12 MQTGALIVAAGKSSRMGDFKPMLQLGSISIAQRVINNFRQAGISKVVVVTGYHADVLECH LASNNVVFLRNENYANTHMFDSVRIGLEYLKDKVDTVLFTPVDVPLFTAQTVTQMLSLGR PLVTPVCNGNPGHPILIRSTLIDSILSDDGKSGLKGAVDNCGEPMYYLNVKDPGIIHDAD TPEDYAELLRIHNQSLIRSEIHIQLAREKVFFDEKLYSLLTLIHETGSVRDACERMHISY STSWNLIHTLESQLHEPLIIRSQGGTRGSHSELTPYGEEFLKRYARFSEETRSCSKKIFE ECFGGFLNA >gi|226332858|gb|ACII01000161.1| GENE 12 13986 - 14246 262 86 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3269 NR:ns ## KEGG: Sterm_3269 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 3 78 48 123 125 67 42.0 2e-10 MDKIFEITAKEVTVQVKDERTGVVYSRTLPIDYYENANVLKLSGENLDGSSSSIVFYSAR GIERLKDLTGRGADHDSCGTHKPEDQ >gi|226332858|gb|ACII01000161.1| GENE 13 14283 - 17045 2787 920 aa, chain + ## HITS:1 COG:mll4880 KEGG:ns NR:ns ## COG: mll4880 COG1529 # 
Protein_GI_number: 13474083 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Mesorhizobium loti # 164 917 11 763 774 254 28.0 4e-67 MIKKTLKINGVERILILDKDETLAQVLRDHLLLTGCKIGCGEGHCGACNVIMNGKVTRSC IYKMSRVPDHAEITTIEGVGTLSDMHPIQVAWMAYGCAQCGFCSPGFIISAKVLLDNNPS PTREEVRDWFNKQRNLCRCTGYKPLIDATMAAAAVMRGEMTKEDLVFKQTGDSIVGTNYI RPSAAQKVTGTWDFGADDALKMPSGTLRLALTQAKVSHANILNIDTAEAESMPGVVRVIT AKDIKAAGGTNKINGLVMLPKHNKTDGFERPVLCDEKIFQFGDAIAIVAADTEEHARAAA EAVKVEIEELPAYMNAMDAIAPDAAEIHPGIPNAFFETNCIKGPDFDWDSIPDSQQVEIE SYCSRQPHLHLEPDCGYGYIDEDGMITVHSKSIGIHLHMPMIADGIGVPMENLRIVQNHA GGTFGYKFSPTNEALIGAAVKILERPVSLVFNQFQNITYTGKRSPAFMNIKLAADEHGKL LALWGNNYVDHGPYSEFGDLLTMRLSQFTGAGYGINSIRNKSTTVFTNHAWGSAFRAYGS PQSYMGSEIAIDVLAAKMGIDPFDIREMNCYKESEGSTIPTGYPPEVYCEEELFKTARPL YEAAKKRVAEKNAASDGKIKYGIGVSLGVYGCGLDGVDTSNAFAELNPDGTVTMYAAWED HGQGADMGVLVSSHETLRQAGIRPEQIKLVMNDTKTCPNAGPAGGSRSQVMSGNACRLAA ENLVAAMKKEDGTFRTYDEMVAEGLATKYEGNWVTTFCADHPIDQDTAQGDPFATYMYTI FLPEVAVNTETGKVKVEKFTCVADVGTIMNRLLVEGNFYGGLVQGIGLALTEDFEDLNHD TSLKNCGIPYPNDAPDDIELHFNETYRPSGPYGAAGCGEAPLDAPHPAILNAIYNATGAR ITRVPARPEKVLAALKELEK >gi|226332858|gb|ACII01000161.1| GENE 14 17129 - 19423 1567 764 aa, chain + ## HITS:1 COG:AF0867 KEGG:ns NR:ns ## COG: AF0867 COG0247 # Protein_GI_number: 11498473 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Archaeoglobus fulgidus # 430 622 132 312 328 69 29.0 3e-11 MTYIQERGSTHVYHVNRMSKEEMDHMISLCVHEQPAYCVAACPFKADTKEMLFYAAKGNF KKALAIYEKITPFPMILCNGCTAPCEEKCRLCELGDGISIREVERAIVRYGEPGKRSSVF RIRKKKKAVIFGSGLFPLFLAGELEKKMYPATIYCQEKDYEAYIAAAAPELLESDRKNEV KRLSSMDLSFEFGCSLDLPFIRAKMKEADVVCASEEVAKKLAPEETADAEIMLREQAGIV SGPVRSVMDAAFAAKRAALTVDLLVQNLSPHSNRGSEGAVTTRLYTNMDGMKGSKKIPCS TDGYSKEEAVEEAKRCIQCHCDECMKSCVYLREYKKHPGLLAREIYNNTQIIMGDHQMNK PMNSCSLCGQCTVTCPNGFDMSQVCKSARENMVSTDKMPLAPHEFALMDMLFSNSEAFLC RPQPGYETCRYVFFPGCQAGAIAPDVVTEAYEDLCRRTEGGVALMLGCCGAISEWAGRYE 
MTEKVNEQLKQELAKLGDPMIIAGCPSCMKQLKESLGAKVTGIWEILKEIGLPGQAKGLE IPVAIHDACGARGDTQTQDTIRELLADMGCTVVNTEYSRDLSPCCGYGGLTAYANKEMAD KMTEKCLERSDSPYITYCMACRDRFVREGRESRHILELLYGINAANMPDISEKRYNRLEL KEKLLKNIWNEELMMEKKDYTVAYTEDAISMMDERMILKSDVERVLSDYRENQEAIFDEE TKELVTRSRLGNVTFWVRFVETEEGYLVRRAYSHRMNIMKRVGQ >gi|226332858|gb|ACII01000161.1| GENE 15 19423 - 19644 315 73 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3272 NR:ns ## KEGG: Sterm_3272 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 16 73 17 74 74 77 65.0 2e-13 MAVERTYYTIDPEGKLTCMKCKVPLVKGKAKFMYLENGFPVEMPVCPKCGFVYVPEELAL GKVLAVERALEDK >gi|226332858|gb|ACII01000161.1| GENE 16 19645 - 20178 442 177 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3273 NR:ns ## KEGG: Sterm_3273 # Name: not_defined # Def: methyltransferase type 12 # Organism: S.termitidis # Pathway: not_defined # 5 176 18 209 210 78 28.0 1e-13 MNRHPGGEEQTIHLLKSIELKKGMKALDLGAGEGETVRIMKAFGLNVQGVDLAPRSSEVQ QGDFLTLQYAADSMDLCISQCAFFVSRDQKKAVSECWRVLKKGGFLLLSDLDPGNLLEIV KETGFTILCQEDQTALWREYYLEAIWNDSFCCEDHKLLQKEYKGRKIGYTMVVGRKE >gi|226332858|gb|ACII01000161.1| GENE 17 20218 - 20613 569 131 aa, chain + ## HITS:1 COG:no KEGG:Gura_1002 NR:ns ## KEGG: Gura_1002 # Name: not_defined # Def: hypothetical protein # Organism: G.uraniumreducens # Pathway: not_defined # 2 124 16 137 147 107 43.0 1e-22 MGYHCSQIIMIMTLETIGEENPQLVKAMGGLGGGIGYCGDTCGCLTGSACAIGYFLGNLA PEEKEDAQMKPAVQELYQWFRQKTEEEFGAFYCKDITHLDWGVIMEKCPGLIADTYTKVM EILTEREVLKL >gi|226332858|gb|ACII01000161.1| GENE 18 20610 - 21935 976 441 aa, chain + ## HITS:1 COG:MTH831 KEGG:ns NR:ns ## COG: MTH831 COG1964 # Protein_GI_number: 15678851 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Methanothermobacter thermautotrophicus # 1 429 3 477 497 277 34.0 3e-74 MITIGKTKSVCPECMKVVPAIKGIGEDGIYLVKECNAHGKFQTLIWEGNAADYLSWGREN LSAETPVNPKVKEKSCPDNCGLCEEHERKGCCMILEVTKRCNMHCPVCFASAGEDREKGD IPICEIEKQYDFLMAHGGPFNIQLSGGEPTMRDDLPEIIHMGREKGFTFFQLNTNGIRLA 
REEGYAGKLKKAGLNTVFLQFDGVTDQVYETLRGQAMMELKKKAVLNCSEAELGIALVPV IAPGVNDMQVGDILKFGLDHMPFVRGVHFQPISYFGRCSQKRPQNPITIPKMLRLIEEQT EGLMKIEDFAGGGAENPYCSFHASYLRKGEQELKLLEKKSGKGCCCTTSDDSRQYVENQW SYSTKTYDEGEMTQTDALDEFLIRIHNETFAVSGMIFQDAWNLDLDRLKRCYICEVDPDH GMVPFCAYNLTNLKGTYLYRK >gi|226332858|gb|ACII01000161.1| GENE 19 21956 - 23128 581 390 aa, chain + ## HITS:1 COG:MA1725 KEGG:ns NR:ns ## COG: MA1725 COG1541 # Protein_GI_number: 20090577 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanosarcina acetivorans str.C2A # 21 307 27 326 435 112 35.0 9e-25 MICRQEKITDITRETINKIQLDKLNAVLKREKERQGFYRDLPERLESLDDLKTLPFTTES DLAQKGGRMLLCSQGEIQRIISEQTSGTTGAGKRVFYTEGDCEHTIELFMAGLGEFIYPG SRTMVAMPFSGPSGLGELIADAIRRIGAHPLLTGNNKTYGELKTILEEERPDTYVGMPTA LLSMLRMCGKGSIKRALISGDACPETVMKAIEKILGTPLWPHYGSREMGLGGAICCQAHE GMHMRENHCITEIIDKEGNVLPDGQWGELVITTIGMEAQPLIRYRTGDHTRIIPGKCICG SEVRRLDFVRRIDQSKSMREMDELLFQFPKLVDYCVRSVGGKKEITALFISDNGEELIRN VCTEKNIISLDCRKAEWSDRALYPAKRIIL >gi|226332858|gb|ACII01000161.1| GENE 20 23204 - 23941 302 245 aa, chain - ## HITS:1 COG:no KEGG:Shel_05240 NR:ns ## KEGG: Shel_05240 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 19 244 5 229 230 196 42.0 6e-49 MDNTNNSCCCGETEHFSGCLICGAPVTYSVESSVKTCSICHKEQLTNAVCENGHFVCDAC HSYGTYIPVIIALRSSTEKDPLLLLEEIMDLPSVHMHGPEHHAIVPSVLLTALRNNGERM NYDTALSEICKRARQVPGGTCGYWGVCGAAAGAGIFMSVMTGSGPLHKDAWPFPQKLVSV ILSKLADVGGPRCCKRTSRIAIEEAIRFYRQFSSVKIPLSSVLCKYFEDNKECIREDCPY YPVNK >gi|226332858|gb|ACII01000161.1| GENE 21 23947 - 24978 696 343 aa, chain - ## HITS:1 COG:SP1700 KEGG:ns NR:ns ## COG: SP1700 COG0722 # Protein_GI_number: 15901534 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Streptococcus pneumoniae TIGR4 # 13 339 13 337 343 390 55.0 1e-108 MSMKINHELPLPEVLKSQYPLSQELKEVKKQRDEEIRRIFTGESDKFIVLVGPCSADNED 
TVCEYVRKLKTVADKVSDKLMIIPRVYTNKPRTTGDGYKGMLHQPDPDKAPDLLAGIIAI RKMHMRVMQETGLSSADEMLYPENRSYLDDILSYEAVGARSVENQQHRLTASGMDIPVGM KNPTSGDLSVMLNSVTAAQHPHHFIYRGNDVETSGNDLAHTILRGGVNQYGQTIPNYHYE DLIHLSALYAKKDLKNPAIIIDANHSNSNKQYKEQIRIVSEVLHSRNYNPDLRKLVKGVM IESYLLEGRQDISDHMTPGCSITDPCLGWEDTERLIYDIAEKC >gi|226332858|gb|ACII01000161.1| GENE 22 25270 - 25506 249 78 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0180 NR:ns ## KEGG: EUBREC_0180 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 7 72 4 70 75 85 67.0 4e-16 MEIKSILIKDTTREERKQIVEESFGNISASCDGCMAGLTEMYQEYIDGTKEIRDINMEFN AHYVREDLDEKSKRSCPM >gi|226332858|gb|ACII01000161.1| GENE 23 25583 - 26101 560 172 aa, chain + ## HITS:1 COG:CAP0111 KEGG:ns NR:ns ## COG: CAP0111 COG0454 # Protein_GI_number: 15004814 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 10 169 4 161 162 117 38.0 1e-26 MSTNNMDRFEFRCIRPEETQQAIEIEQICFPPNEACSPKSLTERIKATAETFLVAEDKET GKLAAFLNGVPTDEETFRDEFFTDISLSNPEGKNIMLLGLDVRPEYRMQGLGRELVSRYC QKEAQKGRKKLFLTCLDEKVKMYEKFGFTDLGQANSTWGGEAWHAMSIEIGR >gi|226332858|gb|ACII01000161.1| GENE 24 26133 - 26831 655 232 aa, chain + ## HITS:1 COG:CAC0640 KEGG:ns NR:ns ## COG: CAC0640 COG1768 # Protein_GI_number: 15893928 # Func_class: R General function prediction only # Function: Predicted phosphohydrolase # Organism: Clostridium acetobutylicum # 1 227 1 225 231 195 45.0 5e-50 MSLYAIGDFHLSFTVNKPMDVFDKRWKNHVVKIEKYWKRKVTENDTVVITGDHSWGRDLA ECQADLDFIMALPGRKILLRGNHDMFWDAKKTEDLNEMFWGKLEFLQNNFYTYEDYALVG TKGYCYEGKDSYEHFLKIRERELGRLRCSFEAAKAAGYEKFIMFLHYPPTSIGEMESPFT LMAQEFGAEKVIYSHCHGEKRYDDSFKGYVDGIEYKLVSGDYLNFKPELVLR >gi|226332858|gb|ACII01000161.1| GENE 25 26928 - 28013 706 361 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1220 NR:ns ## KEGG: Cphy_1220 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 3 313 5 292 297 147 27.0 
5e-34 MKIYYREKSGGIEILRCFGIEGRVEIPGMIDGKLVISAAPYAFSSHMDEKEELKNASLWE VSEGLGFGREEHVLAGNDVEEIVFPDTLKEIGRYIFYGCGNLKKLEFSDSLMQIGCGAFT GCHALEKLTVHMRQGKKSGVKEMLGEMWQRIDVNFLYEYEEARLVFPEHYDEAVENTPAR ILYTEYHGSGSNYRQCFYDKELNYQEYDRLFEMAVAMDKLEVLVDMSFGRLEFPYELTGK ARENYREYIRKNLGDIAEYLVKQDDMHRLEVISSQKLWTLEGIDSALDCASKRKETEVSA FLMNERANLVDNTAGSERIDVDKLQNSQEADRTEQGKNEQSQTTEKSLNRRTILRKKRFE L >gi|226332858|gb|ACII01000161.1| GENE 26 28035 - 29402 1176 455 aa, chain + ## HITS:1 COG:DR1169 KEGG:ns NR:ns ## COG: DR1169 COG3864 # Protein_GI_number: 15806188 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 53 437 43 367 379 67 24.0 6e-11 MQPEDKIQEIAERAERLQEIGSSILRAARDELYLGMRFLDVALSSFSYQMDGQVHGFGTD GRVMYFQPQMLGGLYRENRILVNRGYLHMVFHCIFRHFAWSGTEGKKRADDGITIQERMR DLSCDIAVEHMIDGMNYRSIRFSRSLLRRETYRLLEKEGKTLNAQRVYKILSEWNLNEKD LTNLEQEFRTDDHRYWESKKPDQKPNPMLSRKWGEINDGIETDLETFSQEAGERDGDFLE QIKTENRSRYDYREFLRKFAVFHEELAVDDDSFDYNFYTYGLRLYGNMPLIEPLESKEVK KIEEFVIVIDTSMSCSGELVRKFLEETYGVLSESESFFTKINVHIIQCDEKVHTDKKITS QKEMKDYMEHLELYGDGGTDFRPAFEWVDKLLEQHEFHNLKGLIYFTDGFGIYPKKMPSY KTAFVFMQDNYRDVDVPVWAMKLILDEDDFEKPDR >gi|226332858|gb|ACII01000161.1| GENE 27 29458 - 30951 1467 497 aa, chain + ## HITS:1 COG:DR1171 KEGG:ns NR:ns ## COG: DR1171 COG0714 # Protein_GI_number: 15806190 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Deinococcus radiodurans # 18 238 3 212 340 100 27.0 9e-21 MDIKRAKQEIKDSIEAYLAKDEFGEYLIPAIRQRPILLMGAPGIGKTQIMEQIARECKVG LVSYTITHHTRQSAVGLPFIKEKTFGQETFSVTEYTMSEIIASVYEKMEKTGLREGILFI DEINCVSETLAPMMLQFLQGKTFGNQKVPEGWVIVTAGNPPEYNKSVREFDVVTLDRIKR IDVQPDFEVWKEYAYEQGIHPAVISYLELRRKNFYRMENTVDGRIFATARGWEDLSRLIQ VYETLDKEVDREVVYQYIQHPMIAKDFAAYLALYNKYKTDYAVEDLLQGKWTPITLGKIR NASLDEHLSIVGLLNGKLSQLFADCYFMDAYVTKLYGYMTEYRDNLPEMTLESIYKKAEN DFQTAKKSELLTKNEEKVFIRTVNFFENEISEKDTYEQTKIAFTAEADSLETQIEYTSQM LQNVFDFMEAAFGDSQEMVAFITELNANYYSLWFIRENGSDQYYRHNKGLLFDNRQKQIL GQMEELEQGNNFPSVLK 
>gi|226332858|gb|ACII01000161.1| GENE 28 31122 - 32825 2082 567 aa, chain + ## HITS:1 COG:CAC0205 KEGG:ns NR:ns ## COG: CAC0205 COG1409 # Protein_GI_number: 15893498 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 2 527 7 461 652 208 31.0 3e-53 MRRKLIAGLLTALSVTMVCPAAYASTEHYTDSNVVGADSGWSDWDAAWETTAADFTKVSL TPGADDTQLNFAWYSEKGDSDATPVVHFGTDKDNLETFEGTAGDVDQSFTGDKAYEYNHV TVTGLEPNTTYYYTVEKNGEQTEVSEYKTQDTDSVKILYVGDPQIGASKGQTQNGAELTN ESGAANKAAENDGFSWNRTLNTALEQNPDINFVISAGDQVNKTGEAKEEEYASYLSADAL KGLAVATTIGNHDSLNEDYMYHFNNPNNTENGKTQAGGDYYYSYGEGLFVVLNTNNYNVA EHQKTIEEAVKAYPDAKWRIVTIHQDIYGSGLDHSDTDGMILRTQLTPVFDANDIDVVLQ GHDHTYSRSKMLYGDGQTHGKYEFSLNADGTDYDWDHATNVDTQEQIALAPEEGDTDAQA ALDAFHEDNDCYTIEDVDGDTVTDPQGILYMTANSASGSKYYELLSTQQDYVAARSQNWL PSYSVITLTADTFAIDTYQITDDGKAEAIDDTFTIKKTGADAADASADTTDGSSDDTDTA ADTTDSASDTAEAADTSATAAAEASGN >gi|226332858|gb|ACII01000161.1| GENE 29 32849 - 33997 970 382 aa, chain + ## HITS:1 COG:CAC0206 KEGG:ns NR:ns ## COG: CAC0206 COG5438 # Protein_GI_number: 15893499 # Func_class: S Function unknown # Function: Predicted multitransmembrane protein # Organism: Clostridium acetobutylicum # 48 376 41 374 397 206 37.0 5e-53 MSMKNIKFTGLLTRKKAVRYLIYVLFVCAFAVFVIKLNQVEKTELVVRTGQTFEKAKVVK ILQDNLEENGTRVGEQKVRVRMLTGVRKGEELDITSSSGYLFGAACKPGMKVIVMQSVAG DSTVASVYTQDREGVIYIFALIYLLVLCLIGGKQGIKGCLGLVFTFFCVIFVYLPLVYLK YSPFWTAVFVCFITTLVTMYLIGGPTQKTCAATLGTLVGVILAGVSAWCFSKASGISGYN VSDIETLMTLWNTNRIQVGGLLFSGLLISCLGAVMDVAMSISSAIDEIYRQNLSLSRKEL FKAGLRVGRDMMGTDSNTLILAFAGSSVSTLLLDYSYNLPYQQIINSNNIGIAIMQGLAG SFGIVLSVPFTVLICTILFHKK >gi|226332858|gb|ACII01000161.1| GENE 30 34508 - 35020 419 170 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_1357 NR:ns ## KEGG: Fisuc_1357 # Name: not_defined # Def: Appr-1-p processing domain protein # Organism: F.succinogenes # Pathway: not_defined # 48 170 224 346 347 118 57.0 8e-26 MLLDRSFKTKLNAYIVKNYTELCRDDDQYLMWDAGESLSKDVGLTNNESLDLKSLIDEVG 
DTFHDKLFQYIDNSGMTDVEIYKKAGLDRKLFSKIRSNPAYHPGKNIVLALAIALELDIN ETNDLLSRAEYAFSPSNKGDLIIKFFIEHKVYDRMAINFMLDEYGQPILG >gi|226332858|gb|ACII01000161.1| GENE 31 35082 - 35282 324 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581150|ref|ZP_04858409.1| ## NR: gi|253581150|ref|ZP_04858409.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 66 7 72 72 107 100.0 2e-22 MGVLTNQERKAKIGDIVEKVTQMSDEQLGNVHEYTSDEFKEPNHEAVALNAVIQLSRKYG KSEDGK >gi|226332858|gb|ACII01000161.1| GENE 32 35272 - 36780 693 502 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581151|ref|ZP_04858410.1| ## NR: gi|253581151|ref|ZP_04858410.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 502 6 507 507 979 100.0 0 MENKFYALLAPVGVYEIGKRKNLPSWKMDWELMKIALMQGLKIPDDNIRISGEDGVVTSR SFARNIAEISKYVSGEDGFIFYFSGHGDNSGLCFSDAAVSIQSIIEFIKKIKAKSKIVIM DCCYSGDFRMSQSVKMDMEKTVDDFAGHGIAVMASSASDEKSWLGVGGTHSLYTGILTTA MTVNRKIRQGRVSLADINEEVKQLIKIWNMKNPDRIQHPIYRASLGGTIFFQVEEYKTYQ SLQVYLEKNDYIIQSVEPLSTLKEKRLAVFILVKEKSNARQLSVITREAVQEVKYADVYL TQKMEDIHGHKSADAIWFYFGYDESDMVNHRYFATGIWCCNSVLQKKYFRNEKNAEVIEG IWISQDSSYEIVRKLQQTDISEEQFRQQAKQLLSKTVSMAEQFIADLEEVDNRTKTIKEV QREYKPWIRQVYEEYYKMTEIPVPPDELHDWFGVVEDLVGCVLDMALELRKDVEFKGLDR WVFRNCVRRYYEDLERLRGMEE >gi|226332858|gb|ACII01000161.1| GENE 33 36901 - 37599 487 232 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581152|ref|ZP_04858411.1| ## NR: gi|253581152|ref|ZP_04858411.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 232 1 232 232 398 100.0 1e-109 MKRKLVLNLLAMSAILSFPVSVYAIEVNDASVIFATQNALNKAGFNCGTADGVIGANTSN AITQYQTEKGLEITGKINEQLLISLGFSIDDSFGIDVTSFVARYNESAVYFNNISAESGD ALINQIVSENVFSEKGMLDGITTVSFGLNNVGSYVDSCTLQDNDGIYDVKNLYELSSVAY ALNTTYSSPADALTGVQKLFEDYKIENDNLTYSILNAGDKVTFSFAYKKLNK >gi|226332858|gb|ACII01000161.1| GENE 34 37640 - 37987 244 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581153|ref|ZP_04858412.1| ## NR: gi|253581153|ref|ZP_04858412.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 115 1 115 115 186 100.0 3e-46 MRYSKDIYDRDVLVKTAYCFSDRAYVHLDVEGNCYTVTIRSKQLDDHYDYENEFENELLA QQARKIVSVKTKNIRELIAARALSSTIVQLGEELEEEESDYNSADILKDWFEENE >gi|226332858|gb|ACII01000161.1| GENE 35 37980 - 38204 115 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581154|ref|ZP_04858413.1| ## NR: gi|253581154|ref|ZP_04858413.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 74 1 74 74 131 100.0 2e-29 MNNELYLNKEIYDEKYIQIAIQAYQGLAEISYKLNACWWMCSFTATRYDIDLTKKEFENY LIALMNRGLSNEVM >gi|226332858|gb|ACII01000161.1| GENE 36 38191 - 39234 265 347 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581155|ref|ZP_04858414.1| ## NR: gi|253581155|ref|ZP_04858414.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 347 1 347 347 514 100.0 1e-144 MKLCEILLMSIILVLGIVVIVTDFHAGVIKNKVLLVAIVFGVIINSIYITFFAKNFLLVY LLNWFAVSLISVLLYAFRFWAAGDSKLMITLAFLIPARYYENSLLLISGVYYIIFIFLLA YLYIIVESIVLAIKKKEYYRKHIEKRFFKEFIYRYFVSYLYLRTLTLILQNIMGEFYYTN QLFFTFINIFIILYIHEKEVFHKRTIILILFGINALLIIWSILQGSYTVNLDILHNYVIV LLAIFLRYLVSGYNYEEIQTSEVKKGMVLSYATVVGFMPSRVKGLPQTTSEDMSSRITEE EAEAVRRWEKSKYGKNSIIIVRKIPFAIFIVCGTISYIVMRIVMTCL >gi|226332858|gb|ACII01000161.1| GENE 37 39225 - 40334 413 369 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2723 NR:ns ## KEGG: DvMF_2723 # Name: not_defined # Def: radical SAM domain protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 57 364 70 375 384 171 32.0 3e-41 MFIKKIKNWDNEEQVVMFKVGSPTEDCGKAIYELDAEHLIYHDGKNSIVYTYESGNIEVK EHDIITISENGMIIRTHYSYGSEVDIFVTNHCNSNCIMCPLSEYSRKKKSPQYNQWLMKY IQALPREVGFINVTGGEPTLAGEYFIDVMDTLREKFQKSGFQLLTNGRSAADFRFLKNVL HHCPSGMLFAIPLHSCIPEIHDEITQSKGSFVQTDQGIKNLLKLNQRVEVRIVLSKVSIE TAEKTAEYIIENYKGITTVNFVAMEMMGNAVINREKVWIDYDSIFVKIRTAIDQLIKNGF DVKLYNFPLCMIKKGYWHIAAKSISGYKIQYQDDCLLCEAKDICGGFFSSTKKLMNPKVY PINRSELQL >gi|226332858|gb|ACII01000161.1| GENE 38 40331 - 41065 371 244 aa, chain + ## HITS:1 COG:CAC0658 KEGG:ns NR:ns ## COG: CAC0658 COG0641 # Protein_GI_number: 15893946 # Func_class: R General function prediction only # Function: Arylsulfatase 
regulator (Fe-S oxidoreductase) # Organism: Clostridium acetobutylicum # 91 230 131 277 518 89 34.0 6e-18 MKLNYFNFDNRKDLFFLTNDFGYYCYLNKEDFQNLLSEKYDQIVPDKQKELLEKYFIYEE DEAVFAEKTVIPYRDSKYYTLQGTSLHIFVMTNACNMNCVYCQAQDSEQLDKGKMSMETA ERAVDIALQTPVRRMTFEFQGGEPLTNFDVIKHIVQYSQQQCNDKEIEYCIVTNTLLLTE EMITFLRNNGISISTSLDGNEIVHDSNRKTVKGTGTFFKVSENIKRIRESGISIGAIQTT TKKV >gi|226332858|gb|ACII01000161.1| GENE 39 41062 - 41718 200 218 aa, chain + ## HITS:1 COG:no KEGG:BDI_2144 NR:ns ## KEGG: BDI_2144 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 6 218 259 473 474 176 39.0 7e-43 MNYANEIVTEYVNQGLECIFIRPLTPLGYANEHWEEIGYTTEEFISFYRDALLKIIEYNK KGVKIVEGHAVIFLQKIIGHFSGNYMELRSPCGAGTGQIAYYYDGNIYTCDEGRMLAEMG NPTFCLGNVYKDNYNSIMESKVCRITCQASVLEGLPSCCDCVYHPYCGVCPVINLALENN IYERRPNNYRCRIYKGILDTLFELIEDKEVEAIFTSWI >gi|226332858|gb|ACII01000161.1| GENE 40 41709 - 42509 296 266 aa, chain + ## HITS:1 COG:slr1485 KEGG:ns NR:ns ## COG: slr1485 COG4642 # Protein_GI_number: 16329198 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 53 261 46 252 349 91 28.0 1e-18 MDLKKMKNIGIVIVFSAIGLLINTNIVTAQNGNKADAVYEMQKIILSDEADDLNGTDEQK YSDGSDYNGNYLEGKREGQGIYTFDNGDYYDGQWKNDLMSGEGVYYFSDGATLSGVFKKG KLKDGTFCYTDENGEYKVKIKNYKYSKKITATFLNGDIYKGIYKNDSFSGKCKILYNSGD QYEGEVEKNDKSGNGTYTWSNGAVYIGEWENDMMNGHGIYYYSSSSYPRLEGEFVNNQPE GECEYYTGENASIMTMWKDGECVSQD >gi|226332858|gb|ACII01000161.1| GENE 41 42969 - 43127 85 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDGLNNIDLGKLPTIEGVQTPEVNTVPDVSSAVVKTAANIPEVSVAASIKTK >gi|226332858|gb|ACII01000161.1| GENE 42 43274 - 43525 187 83 aa, chain + ## HITS:1 COG:no KEGG:CTC00334 NR:ns ## KEGG: CTC00334 # Name: not_defined # Def: putative phosphohydrolase # Organism: C.tetani # Pathway: not_defined # 1 57 1 57 228 62 50.0 6e-09 MSLYAIGDFHLSFTVNKPMDVFDKRWKNHVVKIEKYLKKKVTERAAVVITRDHSWGVEFG GVSGGSGFHYDVAGQEDFASWKS >gi|226332858|gb|ACII01000161.1| GENE 43 43566 - 43796 178 76 aa, chain + ## 
HITS:1 COG:no KEGG:no NR:gi|253581159|ref|ZP_04858418.1| ## NR: gi|253581159|ref|ZP_04858418.1| ser/Thr protein phosphatase [Ruminococcus sp. 5_1_39B_FAA] # 1 76 40 115 115 150 100.0 2e-35 MFKGKLERLRCSFEVAKAAGYEKFIMFLHYPPTSIGEMDEHLSLVGFLNGKLSQLFTECH LTDSYITKLYEIYGIL >gi|226332858|gb|ACII01000161.1| GENE 44 43783 - 44163 291 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581160|ref|ZP_04858419.1| ## NR: gi|253581160|ref|ZP_04858419.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 126 4 129 129 221 100.0 1e-56 MVYYRDNLANISLKAVCQKAENDLESARKAELLTRNEEKTFLRVNTFLENIWLKMEGITQ AECDIYEKVKEAFSGEADGLEKQTETVSQTLQNVFDFMEAAFGDSQEMVAFITELNNCAS HSRYFQ Prediction of potential genes in microbial genomes Time: Sat May 28 21:06:09 2011 Seq name: gi|226332857|gb|ACII01000162.1| Ruminococcus sp. 5_1_39B_FAA cont1.162, whole genome shotgun sequence Length of sequence - 30988 bp Number of predicted genes - 22, with homology - 21 Number of transcription units - 11, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 3 - 62 2.4 1 1 Op 1 . + CDS 150 - 2111 1478 ## Caci_0892 tetratricopeptide TPR_4 + Prom 2115 - 2174 1.7 2 1 Op 2 . + CDS 2202 - 2639 190 ## PROTEIN SUPPORTED gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 + Term 2703 - 2748 10.2 3 2 Tu 1 . + CDS 2762 - 3610 682 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 3626 - 3685 3.2 4 3 Op 1 . + CDS 3705 - 4322 922 ## Cbei_2388 hypothetical protein 5 3 Op 2 4/0.000 + CDS 4416 - 5897 1675 ## COG2407 L-fucose isomerase and related proteins + Term 5985 - 6021 7.1 + Prom 5916 - 5975 4.8 6 3 Op 3 . + CDS 6042 - 7499 1649 ## COG1070 Sugar (pentulose and hexulose) kinases + Term 7523 - 7566 9.0 - Term 7438 - 7476 1.6 7 4 Tu 1 . - CDS 7611 - 8576 485 ## COG0583 Transcriptional regulator - Prom 8604 - 8663 8.1 + Prom 8579 - 8638 8.7 8 5 Tu 1 . + CDS 8662 - 10086 1256 ## COG1027 Aspartate ammonia-lyase - Term 10084 - 10119 7.4 9 6 Tu 1 . 
- CDS 10195 - 11880 2592 ## COG2759 Formyltetrahydrofolate synthetase - Prom 12006 - 12065 5.9 + Prom 12868 - 12927 6.8 10 7 Op 1 . + CDS 13007 - 14464 1499 ## COG1982 Arginine/lysine/ornithine decarboxylases 11 7 Op 2 1/0.250 + CDS 14467 - 15327 850 ## COG0421 Spermidine synthase 12 7 Op 3 7/0.000 + CDS 15324 - 16604 1357 ## COG1748 Saccharopine dehydrogenase and related proteins 13 7 Op 4 1/0.250 + CDS 16594 - 17715 1025 ## COG0019 Diaminopimelate decarboxylase 14 7 Op 5 . + CDS 17750 - 18853 1012 ## COG2957 Peptidylarginine deiminase and related enzymes 15 7 Op 6 . + CDS 18780 - 18998 98 ## 16 7 Op 7 . + CDS 18955 - 19833 962 ## COG0388 Predicted amidohydrolase + Prom 19903 - 19962 9.2 17 8 Tu 1 . + CDS 20033 - 20494 319 ## Cphy_0363 hypothetical protein + Term 20498 - 20546 13.5 + Prom 20546 - 20605 7.3 18 9 Op 1 . + CDS 20696 - 25393 4315 ## ABC1165 cell surface protein 19 9 Op 2 . + CDS 25390 - 26277 439 ## COG4509 Uncharacterized protein conserved in bacteria + Prom 26347 - 26406 8.6 20 10 Op 1 . + CDS 26469 - 27560 1028 ## COG1609 Transcriptional regulators 21 10 Op 2 . + CDS 27610 - 29109 2119 ## COG2160 L-arabinose isomerase + Prom 29180 - 29239 6.4 22 11 Tu 1 . 
+ CDS 29336 - 30940 2099 ## COG1070 Sugar (pentulose and hexulose) kinases + Term 30945 - 30987 -0.3 Predicted protein(s) >gi|226332857|gb|ACII01000162.1| GENE 1 150 - 2111 1478 653 aa, chain + ## HITS:1 COG:no KEGG:Caci_0892 NR:ns ## KEGG: Caci_0892 # Name: not_defined # Def: tetratricopeptide TPR_4 # Organism: C.acidiphila # Pathway: not_defined # 2 634 1 635 667 324 32.0 7e-87 MVSELRFETKKIRMADLGKESCVPDLLGELTMQNQLEFHLDETDEIYEGYGRVKSAYPYR QRNTYTRELKEMEIHTAVLENRYLKAVFLPEFGGRLWELWDKYTGRNLLYTNDVLQFSNL AVRNAWFSGGVEWNIGIIGHNPFTTEPLYTAQTVNEDGEPVLRMYEYERIRKVTYQMDFW LEKDSSFLNCRMRIVNEGKEVVPMYWWSNMAVPEYENGRVVVPAVQAFTSRGTQVTKVDL PIVENIDISDYTAIKKSVDYFFDIPAGCPKYIANIDETGYGLLQISTDRLRSRKLFSWGN QDASNHWQEYLTDKAGRYLEIQAGLGKTQYGCIPMAPHTAWEWMEQYGSVQISEDVLEKE YRERTVLVTEKILETGLHEKLKEKLETSKEMSRKKAQTVYRGSGYGALTVHGESTKHLEF SMKAVNESQTENSKEEAGLEKWKHFFETGILHCPKPLEAPDEFMIDETNVDFLEAHMEEN AQSWYAYYQLGLGYYRKEDYGKAEKAFEDSLKLRESAWAFHGLSCVKLMQNEKDQAGRYI LQGMAFERKELSYLKEGFRILLLAEKYEELSHFYRKLDKEEQEDSRLKLGYVQALHGLKQ DKKALDLLESKGGLIPEDIREGEDSLGKVWKELYKSVYKKEGKLPHKFNFQAN >gi|226332857|gb|ACII01000162.1| GENE 2 2202 - 2639 190 145 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 [Lactobacillus johnsonii NCC 533] # 4 142 5 144 147 77 37 8e-14 MELEVLRKDMVAAMKAKDKVTKEAVSSLIAAVKKVAIDEGCRDEIKSDLVDRVILKELKT VKEQLDTCPESREDLKAEYQARYDVIAKYAPKQMDAAEVKAYLEEKFADVIATKNKGQIM KAVMADMKGKADGKVINQVVAELCK >gi|226332857|gb|ACII01000162.1| GENE 3 2762 - 3610 682 282 aa, chain + ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 11 272 11 288 292 96 27.0 4e-20 MEFQHELIIPNEGFPFKVFLFEGGNGNYVREKHWHTSVEIFAVMEGRLDFFVNKDEYPLK AGEQIIINSNETHSIHAVEKNKTVVLQIPLKQFENYFTAQRYIRFRGQEELVDKKLASLL RKLYHVYSERKIGYEFRTISIFYEIMYILVKDYRVTETREKDIRHSRRLDALSKITTYMR EHYREELKLSDVAATFGYSDAYLSRMFQKYAKINYKTYLQDIRMAYAYRDLLNTDHTISQ 
IALDNGFCSSRGFSGEFQKRYGVLPSEMRKQINEKGQKNAID >gi|226332857|gb|ACII01000162.1| GENE 4 3705 - 4322 922 205 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2388 NR:ns ## KEGG: Cbei_2388 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 204 1 206 207 274 64.0 1e-72 MKIQNVTDASFRKYGKVLEGYDFSALLKEMKHTPVPDDVVYVPSVEELEALDVAKALQNK GFGGIPIEIGYCNGHNKKLNAVEYHRSSEINVAVTDLVLLIGSQQDITDDFTYDTSKIEA FLVPAGTGIEVYATTLHYAPCNVQDGGFQCVVVLPAGTNTDLTFETAKTGEDSLLTAKNK WLIAHEDAAIEGAVNGLRGENITID >gi|226332857|gb|ACII01000162.1| GENE 5 4416 - 5897 1675 493 aa, chain + ## HITS:1 COG:CAC2610 KEGG:ns NR:ns ## COG: CAC2610 COG2407 # Protein_GI_number: 15895868 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Clostridium acetobutylicum # 1 492 1 489 490 712 67.0 0 MNNIPVVKLGLIAVSRDCFPIQLSEKRRAAIKETYKGELYECPVTVENEKDMEKALEDVN KAECNALVVFLGNFGPETPETLIAERFDGPCMYVAAAEGDGDMINGRGDAYCGMLNCSYN LKMRHLKAYIPEYPVGTADDIAKMIADFVPVARAVIGVSNLKIITFGPRPQDFFACNAPI KGLYELGVEIEENSELDLLVSFKEHENDPRIPEVCADMAKEMGEGKYYADLNVKMAQFEL TLLDWAEAHKGARKYVAFADKCWPAFPSQFGFEPCYVNSRLVSRGIPVSCEVDIYGALSE YIGLCISNDAVTLLDINNSVPQYIYDEDIKGKYDYKLTDTFMGFHCGNTPACKMCDSRAV KYQLIQHRLLEPAGSEPDFTRGTLEGDIAASDITFYRLQCNSEGELVSYVAEGEVLDVPT RSFGGIGIFAIKEMGRFYRHVLIEGNYPHHGAVMFGHYGKAFYEVVKFLGLDVKKIGYNQ PAGVRYPTENPWG >gi|226332857|gb|ACII01000162.1| GENE 6 6042 - 7499 1649 485 aa, chain + ## HITS:1 COG:SMc03164 KEGG:ns NR:ns ## COG: SMc03164 COG1070 # Protein_GI_number: 15966648 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Sinorhizobium meliloti # 3 482 2 480 484 377 43.0 1e-104 MYYIGIDLGTSAVKLLLMEESGKICNIVSREYPLFFPHPGWSEQKPEDWFIQSMEGMKEL TKDIDRTQVAGIGFGGQMHGLVTLDKDDNVIRPAILWNDGRTGEETEYLNTVIGKDKLSQ YTANIAFAGFTAPKILWMQKNEPENFKKVVKIMLPKDYLAYRLSGSFCTDVSDASGMLLL DVKNRCWSKEMLEICGITEEQLPKLYESWEVVGTLKPEIAKELGFSEAVKVVAGAGDNAA AAVGTGTVGDGQCNISLGTSGTVFISSKNFGVDENNALHSFCHADGSYHLMGCMLSAASC 
NKWWAEEILQTKDFAAEQAPIQKLGENHVFFLPYLMGERSPHNNPDARGVFFGMSMDSTR ADMTQAVLEGVAFGLRDSLEVARSLGIKIERTKICGGGAKSPLWKKIIANVMNLKVDVLE NEEGPSMGGAMLAAVGCGAYPDVETIGKKFAKVVDTVEPDPELVAKYEERYQKFRTLYPA MKPLF >gi|226332857|gb|ACII01000162.1| GENE 7 7611 - 8576 485 321 aa, chain - ## HITS:1 COG:FN0603 KEGG:ns NR:ns ## COG: FN0603 COG0583 # Protein_GI_number: 19703938 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 11 318 1 308 314 229 37.0 7e-60 MFKLYIEGVYMFQGMEYIYEVYKEKSFSKAAAALFISQPSLSANVKRVENRIGYPIFDRS TKPLQLTEVGKHYIQAVEKVMDIENNLENYLLDLGNLKTGTLNVGGSNFFSSWILPPLIA DFSQKFPHVQISLVEESTAKLSQFLQAGKLDLVIDNCILDNQIFERYIYQKEQLLLAVPK NYSINTRLQKYQIPIEEIRNGQFRSSHIPSVPLNKFENEPFIILRSDNDTGKRALTICQE NHFSPSVVFRLDQQMTAYNIACLGMGITFIGDLLLSRVPTNSELVFYKLPGQSSKRTVFF YWKKGRYISRAMEEFLKLPFS >gi|226332857|gb|ACII01000162.1| GENE 8 8662 - 10086 1256 474 aa, chain + ## HITS:1 COG:CAC0274 KEGG:ns NR:ns ## COG: CAC0274 COG1027 # Protein_GI_number: 15893566 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Clostridium acetobutylicum # 5 468 2 465 465 600 64.0 1e-171 MKKTDYRVERDSIGVKDIPEDVYYGVQSLRAAENFHITGLNMHPEIINSLAYIKKAAAIT NCEVGLLEKKKAQAIVQACDEIVSGKFHNEFIVDPVQGGAGTSLNMNANEVIANRAIEIL GGKKGDYTIINPNDDVNCGQSTNDVIPTAGKMTSLRLLQNLKKQLLRLYDALNEKATEFD HIIKMGRTQMQDAVPIRLGQEFKAYSVAIMRDIHRMDKAMDEMRTLNMGGTAIGTGINAD EGYLRRIVPNLTEISGMDFIQAFDLIDSTQNLDPFVAVSGAVKACAVTLSKMSNDLRLMS SGPRTGFGEINLPAKQNGSSIMPGKVNPVIPEVVNQVAFNIIGNDVTITMAAEAGQLELN AFEPIIFYCMFQSIDTLGYAVQTLVDNCIVGITANEERCRYLVENSVGIITAISPHLGYQ KAADIAKKAIKTGESVRSLILKEKLMDEDELNRILDPIHMTEPGISGKDYLIKK >gi|226332857|gb|ACII01000162.1| GENE 9 10195 - 11880 2592 561 aa, chain - ## HITS:1 COG:SPy2085 KEGG:ns NR:ns ## COG: SPy2085 COG2759 # Protein_GI_number: 15675843 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Streptococcus pyogenes M1 GAS # 5 561 4 557 557 748 68.0 0 
MGFKSDIEIAQECEMLPITQIAEKAGIDDKYLEQYGKYKAKIDYNLLKESDKKDGKLILV TAINPTPAGEGKTTTTVGLADGMQRLGKSVMVALREPSLGPVFGVKGGAAGGGYAQVVPM EDINLHFTGDFHAIGAANNLLAAMIDNHIFQGNALNIDPRKITWRRCVDMNDRQLRNVVD GLGGKTNGMPREDGYDITVASEIMAVLCLASDIKDLKERLSKIIIGYTYGKVSEQKPVTA GDLHAEGAMTALLKDALKPNLVQTLEHVPAIVHGGPFANIAHGCNSVTATKMAMKLADYA ITEAGFGADLGAEKFLDIKCRMAGLKPSAVVIVATVRALKYNGGVAKADLNNENLEALEK GIPNLLKHVSNIKNVYKLPCVVAINAFPTDTKAELDFVEAKCKELGVNVALSEVWAKGGE GGIKLAEEVIRLVEEPNDFSYAYELEGSIEDKLNQIVQKVYGGKKVVLTANAQKQAKQLE ALGFGNCPICVAKTQYSLTDDQTKLGAPTDFEVTVRNLKISAGAGFIVALTGEIMTMPGL PKVPAAERIDVDETGKITGLF >gi|226332857|gb|ACII01000162.1| GENE 10 13007 - 14464 1499 485 aa, chain + ## HITS:1 COG:SP0916 KEGG:ns NR:ns ## COG: SP0916 COG1982 # Protein_GI_number: 15900796 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Streptococcus pneumoniae TIGR4 # 6 482 6 482 491 712 72.0 0 MDYERQQHAPIYEALERFRKKRVVPFDVPGHKRGRGNPELAQLLGEKCVGLDVNSMKPLD NLCHPVSVIREAEELAADAFGAAHAFLMVGGTTSAVQSMILSVCKAGDKIILPRNVHKSA INALVLCGAIPVYVNPEVNAQLGISLGMEVSQVEKAMDENPDAVAVLVNNPTYYGICSDL RTIVKKAHARGMKVLADEAHGTHLYFGRNLPVSGMAAGADMAAVSMHKSGGSLTQSSLLL LNKSMNAEKVRQIINLTQTTSASYLLLSSLDISRRNLALRGEESFEKVAGMAQYAREEIN EIGGYYAYGMDLINGTSVFDFDVTKLSIYTLGNGLAGIEVYDLLRDEYDIQIEFGDICNI LAYISIGDRLQDIERLVGALADIERLYKKDSTGMLSGEYIAPAVVASPQQAFYAEKESLP MEQTAGRISGEFVMCYPPGIPILAPGEMVTQEIVEYILYARDKGCSMQGMEDPKVEYLQV LKGGI >gi|226332857|gb|ACII01000162.1| GENE 11 14467 - 15327 850 286 aa, chain + ## HITS:1 COG:SP0918 KEGG:ns NR:ns ## COG: SP0918 COG0421 # Protein_GI_number: 15900798 # Func_class: E Amino acid transport and metabolism # Function: Spermidine synthase # Organism: Streptococcus pneumoniae TIGR4 # 1 283 1 283 286 404 65.0 1e-112 MELWYSEFHTGNVKLSVRINRQLFSGESEFQRIDVFESEEFGRFVALDGEIVFSDKDEFI YDEMVTHVPMTVHPNVKNVLIIGGGDGGVARELIHYPQIESIDVVESDKMFVDVCAEMFP DIAQGLKDERVNIYYEDGLRFLRNKKARYDLIINDSTDPLGHTEGLFTKEFYGSCYKALR DDGIMVYQHGSPFYDEDEVECRKMHRKVFRSFPVSRVYQAHIPTCPSGYWLFGFASKKYH 
PLEDFRPERFDNLNIETWYYTTNLHRGAFMLPKYVEDLLEEEENSL >gi|226332857|gb|ACII01000162.1| GENE 12 15324 - 16604 1357 426 aa, chain + ## HITS:1 COG:SP0919 KEGG:ns NR:ns ## COG: SP0919 COG1748 # Protein_GI_number: 15900799 # Func_class: E Amino acid transport and metabolism # Function: Saccharopine dehydrogenase and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 1 418 1 418 419 739 82.0 0 MSRLLVIGCGGVASVAIRKCCQNSDVFTEIMIASRTKEKCDALKKKIESTTKTKIETAKV DADNAAEVAELIRAYKPDAVLNVALPYQDLTIMDACLEAGADYIDTANYEAEDTEDPTWR AIYEKRCKEKGFTAYFDYSWQWAYNEKFKEAGLTALLGTGFDPGVTSVFSAYALKHYFDE IHTIDILDCNGGDHGYPFATNFNPEINLREVSANGSYWEDGHWVETEPMEFKSVYDFPEV GKKDMYLLHHEEIESLAKNIPGVQRIRFFMTFGQSYLTHMKCLENVGMLSTAPVEFNGQE IVPIQFLKALLPDPASLGPRTVGKTNIGCIFTGVKDGKEKSIYIYNVCDHQECYKEVESQ AISYTTGVPAMIGSMMVVTGQWKKPGVFNVEEFDPDPYMEALNKWGLPWKVCEDPELVKV WKANED >gi|226332857|gb|ACII01000162.1| GENE 13 16594 - 17715 1025 373 aa, chain + ## HITS:1 COG:BH3958 KEGG:ns NR:ns ## COG: BH3958 COG0019 # Protein_GI_number: 15616520 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Bacillus halodurans # 1 373 5 379 379 532 64.0 1e-151 MKINQLQTPCYVIDEKKMRENLEILKKVQEDTGCKILLAQKAFSGFALYPMIGKYLAGTT ASGLFEARLGYEEMGKENHVFSPAYRTEDIEELDRICDHIIFNSTAQLKKFKEVCTGASL GLRINPECSTQGDHAIYDPCAPGSRLGVTRENFDGECLKWIDGLHFHTLCEQNADDLEKT LQAVEEKFGEFLSEVSWLNMGGGHHITREDYNVKLLEKCIQHMKETYDLDIYLEPGEAVA LNAGYLVTEVMDIVENEIRTLILDASAACHMPDVLEMPYRPPLKDSGEAGEKAYTYRLSS CTCLAGDVIGDYSFDREIQIGDKLYFMDMAIYSMVKNNTFNGMPLPDIAVMHEDGECEVI RHFGYEDFKSRLS >gi|226332857|gb|ACII01000162.1| GENE 14 17750 - 18853 1012 367 aa, chain + ## HITS:1 COG:SP0921 KEGG:ns NR:ns ## COG: SP0921 COG2957 # Protein_GI_number: 15900801 # Func_class: E Amino acid transport and metabolism # Function: Peptidylarginine deiminase and related enzymes # Organism: Streptococcus pneumoniae TIGR4 # 11 363 10 360 361 459 59.0 1e-129 MEVTEKNRTKYRMPGEFEPHEGCVMIWPERPGSWNYGAREAQKAFVKVAEAIGVSEKVYM 
LVSKAQMENAKNQLGNVSGVTLLECETDDAWARDVGATMVLDEKGAVCGVDWQFNAWGGT FDGLYRNWEKDDRVAAFICRTLGCPCLDARPFVLEGGSIHSDGEGTLIVTEACLLSQGRN PQMSREQIEEQLKYWLGVHKIIWLPCGIYQDETNEHVDNVCAFVRPGEVVLAWTEDETDP QYAMSLADLKVLEQETDAKGRKFKIHKLPIPQKPVCLTQEDVDGFVFEEGEDEREAGERM AASYVNFYISNGGIILPQFGDENDKRAVEILQKCFPERRIYPIDARAIIVGGGNIHCITQ QIPGKNI >gi|226332857|gb|ACII01000162.1| GENE 15 18780 - 18998 98 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPEPSLLAVEIFTVSPSRFPEKISDLDAENFESHVFEGYYLKVMKKLTYVNRSRRNRQDE KCNSGSSSDEMQ >gi|226332857|gb|ACII01000162.1| GENE 16 18955 - 19833 962 292 aa, chain + ## HITS:1 COG:SP0922 KEGG:ns NR:ns ## COG: SP0922 COG0388 # Protein_GI_number: 15900802 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Streptococcus pneumoniae TIGR4 # 1 289 1 289 291 439 68.0 1e-123 MRNVTVAAVQMKCSKSVEKNIAHAEELVRQAAAKGAEIVLLPELFERPYFCQERRYEYYE YAQTAEENPAVRHFSRVAAELGIVIPVSFYEKEVNNTYNSVAVLDADGKNLGIYRKTHIP DDHYYQEKFYFTPGDTGFKVFDTRFGTIGVGICWDQWFPETARCMALQGAELLFYPTAIG SEPILECDSMEHWRRCMQGHAASNLIPVIAANRIGEETVEPCPENGMQKSALNFYGSSFI TDNTGALCAELPGGEEGVLVSTFDLDALKADRLNWGLFRDRRPEMYAKIVGK >gi|226332857|gb|ACII01000162.1| GENE 17 20033 - 20494 319 153 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0363 NR:ns ## KEGG: Cphy_0363 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 150 1 166 169 65 31.0 5e-10 MNRKEFLEILRSQLAGQMQEGKAAAHIRYYEDYIQSQVRGGRSEQEVLQELGDPHLIAKT LIDTDDGSTQEDYGEYSSYGSSYGNETELPHQQEKRWKKVIDLSTWHGKAVVIAAAAVII VLLILIIGVAIPFFIILAIILYFLSWLKKRNDQ >gi|226332857|gb|ACII01000162.1| GENE 18 20696 - 25393 4315 1565 aa, chain + ## HITS:1 COG:no KEGG:ABC1165 NR:ns ## KEGG: ABC1165 # Name: not_defined # Def: cell surface protein # Organism: B.clausii # Pathway: not_defined # 511 1349 1851 2591 2870 87 23.0 4e-15 MKHNFKLKERLGALLLAMLFILQAILGLVPVCVTQAAPLTVETWDSDKVVDYGYRFNMKF QPGITTYESFGCDNLDREAFSDNGKSERDTECVRVGADYKAGSAGMRYNNVGKDGNGNIV DVRLILVGVENAEPRYDLRTAESIVQNKGGATFAWKDNEAYPMVGFSKNSIGVFIYSVGY 
AKVKFQFLKHGTEETLPISGHGTIRDIDAGQGVRIPSDSSLDNAYVLKNNDYLTVDGNSV SSPLGSVEPDDPRGWLNLFYDTDNFTVEFCHQFRLDKWDKSREDAIAKAGSQERWAEITR NKYLDPSGNSYCPNFKGQKYCKAYAYFDFTSYCFGDVEMKKAPEKRVGEANCTWEQAAAA SKEKPFGIRQGQEFQYMIRAEVTPNRLKSFVVQDILEDCLTIEDASKVSIVNDAGQTVTD WFDVAVEGQKVTCRAKAESLQDEAFTDNQTYTFTLKVRQRPESEINISKYLAEDGYSILV PNHASMSYERTNGSGDTMDTETVWVKGVIPPELEVKKNTSQYEWKTGDIIDYEVLVSQTK QDVKAVNVVITDELPSCLQFLEGQYAVETSQGGENCTLTGQGENGWKAECPSLKYGETIT IRFKCQASADSNGQEWENIVTATADNLINPETGEQESRKDMAEVWPNSPQLEIDKTADKY EWQAGEQVAYRIVVNNVTAGTIAKDVTITDIGLPQGLVLAGGAQSMEVLGVQQQVNYPVP DKKTGQAYEARSVDSQLNADENGFSFYCSYVPYSQPVTIIFHCIAQEEANGHESVNAATV KAANTDERSDDAEVYVNSGEFWIEKSADHYEWQVGEQVQYNVVVENKKAGTVARNVTVWD TGMPAGLALSSAEDVSVSGIPQNITQLTAGTKDVLNQLNPEFYNETSEKPVNYEFLQEGS GWRLNISDLPANTPVMISFLCTVAEAANGMESINVANVQAQNAPVSQDDAEVYVNTAVLS IEKSFQNPYLAAGDGRAENEFRVGEQVNYQVTVNNLQKGSIARNLVISDLSLPEGLALDG AEDAVTVSGVPDVIQNPVAGTDDAGNQLNPENYKETVEKPVSCQVTRQGTGWIVTISDLP YQTPVTVNFRCTAQESVNGMEIVNTAQAYADNAQKVKDSSKIWVNSPVLKVEKTTDKPFY KYGDIITYRIALTQEQTGCVARNVTLQDVIDTQGVRLLKDSVILMDEKGNVADADVQIND DNTFLMSTGRTLVRDSRYSICDNDKGGLFEQIMYNPLDCQEQKSMIVEYQAAVIDAALAG QKVHNTAVADSSEKIPATGEAETEVHSPILEIVKESDKKEYASGEKGYYKLTVRQLREDV TDQNIVIEDKLETQGASIVKDSIFVKKNGIELKDAKIEADDIGFVIQTGASLSDMDKIEV CYEMVFKTESTEPEKIVNTAKARGDISPEIAAQQEVYVKTKAEPTATPTPSATPKPTETP KPTETPKPTEVPKPTEQPKATPTPSVPGTCPKMTPVPTKAPLASYNGGNNGGTTSGNGGS SYGNSGGYGSYQGGSMAGSSKTGDVRPFKMMAVLGLIGMGLLTGGVVIYRRTVSGKRNIK GMPRK >gi|226332857|gb|ACII01000162.1| GENE 19 25390 - 26277 439 295 aa, chain + ## HITS:1 COG:BH3294 KEGG:ns NR:ns ## COG: BH3294 COG4509 # Protein_GI_number: 15615856 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 86 281 57 248 254 109 35.0 6e-24 MKKLRTFAVSMIIAAIGLLIYCAGFYIKSYEQNLAARQEYAELSQMAGVKNREGAGLLAD GKKTDEDSSTKNSEKKRRKKQRRSSESKNKNCTDEIRESFGISWENLRKINSQTVAWITV PGADISYPVVQAADDEYYLKHNFRGEEDLFGCIFLEHDIKKNFTDSHSILYGHNIEGNMM FANLNRYEQSEFLKKCPEIEITTPKRKFLYKIFSVEQASSQSPAFEYGYKLSSPAYRRQL 
SILKNNSMYDTGVEPDERERMVTLITCNSRLDKEIRMAVHGICHECYGIEKAEPK >gi|226332857|gb|ACII01000162.1| GENE 20 26469 - 27560 1028 363 aa, chain + ## HITS:1 COG:BS_araR KEGG:ns NR:ns ## COG: BS_araR COG1609 # Protein_GI_number: 16080450 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 7 362 25 382 384 273 41.0 3e-73 MTDNGKPKYFSLMEQLRSDIMSGAIRPGEKLPSENELSQKYSLSRHTVRKALGILEQDGY VEAFHGRGTYCSENMRHIKNSKNIAVVTTYISDYIFPRLIQGMDNVLSESGYSIILKNTA NSRQKEARCLEELLKKDIDGLIIEPSKSEMICKHRNLYQNLDKFEIPYVFIQGIYSEMRD KPHILMDDAQGGYLVTKYLLELGHKKIKGFFKADDMQGLERHKGYVKALQESGIAYDPDD VVWFHTEDRKVKPALMAKEMVQSGQLPDGIVCYNDQIAVQVMEELEKMGVRIPDDISITG YDNSLYAQRGSGITTIAHPQEKLGEMAAELILEKINGVPEEESKVERLIYPELIIRGSCR KIL >gi|226332857|gb|ACII01000162.1| GENE 21 27610 - 29109 2119 499 aa, chain + ## HITS:1 COG:BH1873 KEGG:ns NR:ns ## COG: BH1873 COG2160 # Protein_GI_number: 15614436 # Func_class: G Carbohydrate transport and metabolism # Function: L-arabinose isomerase # Organism: Bacillus halodurans # 1 499 1 494 497 581 55.0 1e-166 MKTGRNYKFWFCTGSQDLYGDECLRKVAEHSRIIVEELNKSGVLPFELVWKPTLITNELI RKTFNEANADEDCAGVITWMHTFSPAKSWILGLKEYRKPLCHLHTQFNEEIPYDTIDMDF MNENQSAHGGREYGHIVTRMGIERKVIVGHWADKDVQERLASWMRTAVGIMESSHIRVCR VADNMRNVAVTEGDKVEAQIKFGWEVDAYPVNEIADYVKDVAKGDVDALVDEYYSKYDIL LEGRDPEEFKRHVAVQAQIEIGFEKFLEEKNYQAIVTHFGDLGALKQLPGLAIQRLMEKG YGFGAEGDWKTAAMVRLMKIMTAGVKDAKGTSMMEDYTYNFVPGKEGILQSHMLEVCPSV ADGKIGIKVCPLSMGDREDPARLVFTSKTGPGIATSLIDLGDRFRLIINDVECKKVEKPM PKLPVGSAFWTPQPNLKTGAEAWILAGGAHHTAFSYDLTAEQMGDWAAAMGIEAVYIDKD TNIRDFKNELRWNELAFRK >gi|226332857|gb|ACII01000162.1| GENE 22 29336 - 30940 2099 534 aa, chain + ## HITS:1 COG:CAC1344 KEGG:ns NR:ns ## COG: CAC1344 COG1070 # Protein_GI_number: 15894623 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 14 533 15 534 534 662 60.0 0 MGAAEAKAAILANKTALGIEFGSTRIKAVLVDDKNQPIASGGHEWENRYENGVWTYSLDD 
IWTGIQDCYQDMARDVKAKYDIELESVGAFGVSAMMHGYMPFNKEGELLVPFRTWRNNIT GEASEKLMELFNYNIPQRWSIAHLYQAILNGEEHVKDIDYITTLEAYVHWKLTGKRVLGI GDAAGMFPIDTTKADYNQEMVDKFDELVAPYGFSWKLRDIMPKALVAGEDAGVLTEEGAK LLDVTGKLKAGIPMCPPEGDAGTGMVATNSVAVRTGNVSAGTSVFAMIVLEKELSRPYKE IDMVTTPSGHLVAMAHSNNCTSDLNAWVNVFKEFAEAMGMEVDMNKLFGTLYNKALEGDP DCGGLLSYCYFSGEHMTGFEEGRPLFVRSPESKFNLANFMRTNLYTCLGAMRVGLNLLFE KENVKVDRLLGHGGLFKTKGVGQQILADAVNAPVSVMATAGEGGAWGIALLASYLVNKEE GETLESFLDNKVFADQESSTLDPKPEGVAGFNAFMDSYMKGLSIERAAVESEIW Prediction of potential genes in microbial genomes Time: Sat May 28 21:06:49 2011 Seq name: gi|226332856|gb|ACII01000163.1| Ruminococcus sp. 5_1_39B_FAA cont1.163, whole genome shotgun sequence Length of sequence - 1711 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 51 - 110 5.9 1 1 Tu 1 . + CDS 138 - 272 85 ## + Term 330 - 394 2.3 - Term 91 - 126 2.4 2 2 Tu 1 . - CDS 244 - 1584 543 ## COG2826 Transposase and inactivated derivatives, IS30 family - Prom 1620 - 1679 6.2 Predicted protein(s) >gi|226332856|gb|ACII01000163.1| GENE 1 138 - 272 85 44 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIKGHLEIVCRTKFHMSVGKTKFYAMKLACACIVIIILLMLVSM >gi|226332856|gb|ACII01000163.1| GENE 2 244 - 1584 543 446 aa, chain - ## HITS:1 COG:L0443 KEGG:ns NR:ns ## COG: L0443 COG2826 # Protein_GI_number: 15672663 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, IS30 family # Organism: Lactococcus lactis # 120 416 41 312 315 63 27.0 6e-10 MNTGRPKGNQKHLDLSARIIIEQHLNNGDSFRSIAIELSKDPSTISKEIRRHSIIRERSA DAFAPIPCANNYDSSKPRTNICNVMHMCGDNECRRKCVLCRKFRCSDVCKFYKPRECEKL NKPPYVCNGCSKRTNCMMDKKIYSSKYAQDSYEALRTTSREGINQTPESIQKLDNLLSPL LKKGQSIAHIYASHADEIACSRRTIYSYIDRGIFQARNIDLRRKVVYKQRKRKTTTSLKD RSFRKDRGYKEFLEYIAANKSVYVVEMDTVEGAKGTSPCFLTMFFRNCSLMLMFLLEEQT QKEVTRIFDHLTELLGIELFQKLFEVILTDNGHEFQDRQSLEYSKNDEVRTRIYYCDPNR 
SDQKGALEKNHEYIRYVLPKGTSFEKMTDKTTLLLLNHINSEKRDSLNGHSPYEVSRLLL DNRLHKALGLAEIPADEVTLIPALIK Prediction of potential genes in microbial genomes Time: Sat May 28 21:06:56 2011 Seq name: gi|226332855|gb|ACII01000164.1| Ruminococcus sp. 5_1_39B_FAA cont1.164, whole genome shotgun sequence Length of sequence - 11529 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 3, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 58 - 450 278 ## COG1708 Predicted nucleotidyltransferases + Term 500 - 539 9.8 - Term 487 - 527 6.2 2 2 Op 1 7/0.000 - CDS 575 - 961 77 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 1086 - 1145 7.6 3 2 Op 2 . - CDS 2136 - 2864 237 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 3086 - 3145 4.0 4 3 Op 1 35/0.000 + CDS 4421 - 5668 1131 ## COG1653 ABC-type sugar transport system, periplasmic component + Prom 5673 - 5732 5.2 5 3 Op 2 38/0.000 + CDS 5783 - 6658 393 ## COG1175 ABC-type sugar transport systems, permease components 6 3 Op 3 . + CDS 6661 - 7497 391 ## COG0395 ABC-type sugar transport system, permease component 7 3 Op 4 . + CDS 7504 - 9114 664 ## gi|253581191|ref|ZP_04858449.1| predicted protein 8 3 Op 5 . + CDS 9111 - 10478 653 ## Hore_04640 ADP-ribosylation/crystallin J1 9 3 Op 6 . 
+ CDS 10516 - 11391 593 ## COG1082 Sugar phosphate isomerases/epimerases + Term 11466 - 11509 3.2 Predicted protein(s) >gi|226332855|gb|ACII01000164.1| GENE 1 58 - 450 278 130 aa, chain + ## HITS:1 COG:MJ0604 KEGG:ns NR:ns ## COG: MJ0604 COG1708 # Protein_GI_number: 15668784 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Methanococcus jannaschii # 9 110 7 100 100 57 35.0 8e-09 MPKVMQDLIEQYIEAVKKIYGSHVRQIILYGSYARGDFRSDLDVDIMILLDLSDLELKAY GQQLSYMTYDFNMDHDLDIKPIAKSEAHFNKWIVNYPFYANIHREGVVLYGAASAVENAV EVQVAIISKH >gi|226332855|gb|ACII01000164.1| GENE 2 575 - 961 77 128 aa, chain - ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 12 117 409 514 530 80 33.0 1e-15 MIIDLSDFRSIKTSTKQDIVNKITDYISNHYMEQLSLSTVAKSVFLSPSYLSSLITGETG KNFTDIVNEIRISKSIELLKNPKMRIADIAYSVGFNEPQYFSSIFKKCTNLTPRDYRDFY LSSVTERK >gi|226332855|gb|ACII01000164.1| GENE 3 2136 - 2864 237 242 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 1 234 359 576 577 134 29.0 2e-31 MSKKLDMLINTVYKVQLAQKEAQLKNLQSQMNPHFLFNTLQLISWKAYEYEAYPVCDMIS SLSYMLQTDLYSNENKVYTLRDEMEYIKQYTLIIRCKYNNKIAIHTSIPEHLLDCIIPKL IIQPFLENSINHGLAPKPTPGVVSISVEQCDQDLLCIIEDDGVGIDNKVLQNIRSLDSPA TSIDNPNKNGHHIALSNIKTRLELLYGKNYGFTITSQLSFGTRVELRIPYQTSIASKENT ND >gi|226332855|gb|ACII01000164.1| GENE 4 4421 - 5668 1131 415 aa, chain + ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 22 
183 43 202 422 64 24.0 4e-10 MNVMANEGEAKTITIAWTNIMESQQEIWEKYIFEPFEEKHPEVTIDFQCLPDLQNTVRVQ VAAGAGPDMFYMDSVDIPDYASTNRILSLENYRKEYNLDDSMYDWAIRSCLYEDEMYALP ASVEATAMTYNKNLLDQLGKDVPTNREEFVDVCNAALEASLIPVSFGYSGANLLLTWPYE HYLTCYAGGEKTAQLLKGEITFDDPDIKGAFELLKADWDAGYINDKKSGAITNDEARILF ANQKAVFNYEGPWLILADGAAKTWDFEWGQCAWPSMKDGTPAGSAITLGEAIGINANSQV SDLCMEMMMDFYSNEELMAQAVAEGFSTPAVPIDSAAYPEDMDENIRKALDAQNENMNLE TVGYAPWGFFPAKTTTFLDDNLDKVFYDKMDLDTFIDKANETIAEDFADGYVFAG >gi|226332855|gb|ACII01000164.1| GENE 5 5783 - 6658 393 291 aa, chain + ## HITS:1 COG:SMa2307 KEGG:ns NR:ns ## COG: SMa2307 COG1175 # Protein_GI_number: 16263695 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Sinorhizobium meliloti # 9 291 12 286 307 137 29.0 3e-32 MSYNTRKKLKGTFFILPAFLIHVIVIVIPALSMIYYAFTKWNGLSEPVFIGLDNFKRMLK DYDFLFAMKNNLIWMAIFLIVPFILGLGMALVFTKIGKVQMIFRTLCFLPYVISATVSGK IFSIFYSPYSGLGSIFEKLGIKALAGFAPLGNEKQALYAAAFVDNWHWWGFVLVLMLSAL HQVDTSLYEVAKTEGANAWQTLIHVTIPQIKPTIISYFVFVIIAAFTTFDYVWIMTQGGP AGSTEVFATRIYKTTFINYDAGYGAAMSLSVCILALSVYFVLKFIQKRGRE >gi|226332855|gb|ACII01000164.1| GENE 6 6661 - 7497 391 278 aa, chain + ## HITS:1 COG:BH3682 KEGG:ns NR:ns ## COG: BH3682 COG0395 # Protein_GI_number: 15616244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 7 278 19 293 293 150 31.0 3e-36 MTMQKRKKMTLYMKVFFAALISLIQIMPIVVVVVNSFRGNDEISKLMLGFPTKFHFENYA VAWTRGGYAYAYASSLIIGFGTAFSVVLLVGLAVYGLYKTDCFFKEFFKSYFVAGLAIPT FAVIVPLFFFFYKINLINTHIGMILIYIGINIPFNFTFMSAFFEGMSKELDEAARIDGAS EMQNLWHIVVPLAKPIMTSVMLIVFVNTWNEFLFSNTFLQKEEMRTVSLRFFNFVGKNGA DYGYIYAAAIISILPIIIIYFLMQDSFVEGMTTGSVKG >gi|226332855|gb|ACII01000164.1| GENE 7 7504 - 9114 664 536 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581191|ref|ZP_04858449.1| ## NR: gi|253581191|ref|ZP_04858449.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 536 1 536 536 1123 100.0 0 MVEFPKYVKINDLDLVMAIREGMNPMLEWYDKKHNNLPYFWNYISGPKYGNSHHKSYSCV HSMGRWLDALVNAEAITGEVVPQEIYDQLAFWAYEIFNNDTGMMANLNVNTFEWEKVCDL HNLREAMYAFVALLKKNPNDQKAKKGAEHLIDMVDRYTDFETGNWKTDLYERECGGKVEC GASSEREVYRFSSTLGRYIGGLVRLYMVYPYDKALDQAIRLTDTALKNVLLDDGEFDRER FAEHLHSTSSMISGIAMLGSLIQNQEILCRVKKFMENGYYQVATDFGWCLENDKRVDNWV GEINNTGDYLEACLCLGKAGYEEYYDRADKMIRCHLLPSQLLDVSFISDEESEDDSISKM ATRMKGAFGFPCPYGHEYEPGSEISFNWDIVGGGVSSLCWAYNHIVTNVNGIISVNLQFD YEDEKICYRTPYSCGEMKILLKEDRVVRCLMPKDVDWNALIKELNRQSILFYIEGQWLYL YGIYQKGMLRLPVKYLKQRKKYSFRNNELLVDYVGNRIVGMSSEGKRLCFFKEVNK >gi|226332855|gb|ACII01000164.1| GENE 8 9111 - 10478 653 455 aa, chain + ## HITS:1 COG:no KEGG:Hore_04640 NR:ns ## KEGG: Hore_04640 # Name: not_defined # Def: ADP-ribosylation/crystallin J1 # Organism: H.orenii # Pathway: not_defined # 7 454 5 443 443 302 37.0 2e-80 MIENTLWMTYFEYLPIELKQAEEEGKNVELYRKEIEEIIQTSGDMSFEEKERKAWEILDR IEKEPIKREFPYQEPNEIEKIEELRNTELRRNFTVDQATIEDRVWGAWLGRCIGCLLGQP IEGWKRKRIEGFLKDTDNYPVERFLSSDVSEEIIHKYNVSNNGENAFCDTVTHWINNIVD MPEDDDTNYTVIGLKTLEIYGKDFTSDQIAWMWLTSLAMGHVSTAERVAYRNIGNLVPTS KSGWWKNPYREWIGAQIRADIFGYVCPGDPKKAADMAWRDARISHAKNGIYGEMFVAALL AAAYAESNVVKLIETGLGEIPATSRLYEVVLGIVSDYCNGVSKEKAINKLHSKYNEDDSH DWCLTITNAAVVAISILYGETDFTNALGIAMECGYDTDCNGATVGSIMGIMIGAKNIPES WKNNVTGILRTGVSGFYQVSIEELTRRTCAIIDKK >gi|226332855|gb|ACII01000164.1| GENE 9 10516 - 11391 593 291 aa, chain + ## HITS:1 COG:AGl260 KEGG:ns NR:ns ## COG: AGl260 COG1082 # Protein_GI_number: 15890243 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 289 1 287 289 307 50.0 1e-83 MKYGIYYAYWEKEWNGDYKYYIDKISKLGFDILEISCGAFSDYYTKDQELIDIGKYAKEK GVTLTAGYGPHFNESLSSSEPNTQKQAISFWKETLRKLKLMDIHIVGGALYGYWPVDYSK PFDKKRDLENSIKNMKIISQYAEEYDIMMGMEVLNRFEGYMLNTCDEALAYVEEVGSSNV GVMLDTFHMNIEEDNIAAAIRKAGDRLYHFHIGEGNRKVPGKGMLPWNEIGQALRDINYQ HAAVMEPFVMQGGTVGHDIKIWRDIIGNCSEVTLDMDAQSALHFVKHVFEV Prediction of potential 
genes in microbial genomes Time: Sat May 28 21:07:25 2011 Seq name: gi|226332854|gb|ACII01000165.1| Ruminococcus sp. 5_1_39B_FAA cont1.165, whole genome shotgun sequence Length of sequence - 16045 bp Number of predicted genes - 24, with homology - 23 Number of transcription units - 12, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 424 - 669 244 ## gi|253581195|ref|ZP_04858452.1| predicted protein 2 2 Op 1 11/0.000 - CDS 596 - 1552 602 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 3 2 Op 2 . - CDS 1568 - 1891 317 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Prom 1978 - 2037 5.9 4 3 Tu 1 . + CDS 2082 - 2741 358 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 2772 - 2829 9.1 - Term 2758 - 2819 11.3 5 4 Op 1 3/0.000 - CDS 2862 - 4553 1440 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 6 4 Op 2 2/0.000 - CDS 4555 - 4875 258 ## COG0607 Rhodanese-related sulfurtransferase 7 4 Op 3 2/0.000 - CDS 4888 - 5193 317 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 5218 - 5277 2.1 8 4 Op 4 4/0.000 - CDS 5280 - 6026 303 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 6063 - 6122 3.1 - Term 6211 - 6249 -1.0 9 4 Op 5 . - CDS 6372 - 7337 510 ## COG2199 FOG: GGDEF domain - Prom 7524 - 7583 6.5 10 5 Tu 1 . - CDS 7597 - 8100 160 ## COG3022 Uncharacterized protein conserved in bacteria - Prom 8142 - 8201 3.3 11 6 Op 1 . - CDS 8358 - 8723 276 ## COG1225 Peroxiredoxin 12 6 Op 2 . - CDS 8720 - 8812 118 ## 13 6 Op 3 . - CDS 8816 - 9025 180 ## EUBREC_1224 hypothetical protein - Prom 9054 - 9113 8.6 14 7 Op 1 . - CDS 9205 - 9633 337 ## EUBREC_3533 hypothetical protein 15 7 Op 2 . 
- CDS 9638 - 9850 225 ## EUBREC_3534 hypothetical protein - Prom 9893 - 9952 4.4 - Term 10048 - 10082 3.0 16 8 Op 1 1/0.000 - CDS 10138 - 10572 509 ## COG2703 Hemerythrin 17 8 Op 2 . - CDS 10649 - 10813 207 ## COG1592 Rubrerythrin - Prom 10892 - 10951 12.7 - Term 11038 - 11078 4.2 18 9 Op 1 . - CDS 11090 - 11461 199 ## EUBREC_3537 hypothetical protein 19 9 Op 2 . - CDS 11471 - 12043 576 ## EUBREC_3538 hypothetical protein - Prom 12106 - 12165 2.1 - Term 12206 - 12252 -0.5 20 10 Op 1 . - CDS 12294 - 12533 175 ## EUBREC_3540 hypothetical protein 21 10 Op 2 . - CDS 12546 - 12758 85 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) 22 10 Op 3 . - CDS 12773 - 13417 509 ## COG1192 ATPases involved in chromosome partitioning - Prom 13510 - 13569 4.8 - Term 13451 - 13483 -0.1 23 11 Tu 1 . - CDS 13653 - 14201 558 ## COG0148 Enolase - Prom 14411 - 14470 5.7 + Prom 14998 - 15057 5.8 24 12 Tu 1 . + CDS 15189 - 16016 542 ## COG0726 Predicted xylanase/chitin deacetylase Predicted protein(s) >gi|226332854|gb|ACII01000165.1| GENE 1 424 - 669 244 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581195|ref|ZP_04858452.1| ## NR: gi|253581195|ref|ZP_04858452.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 81 18 98 98 126 100.0 5e-28 MYAKADFDDEDTRNQVLEYFVDKIYVYNDRLVITWYYSDDKTSIDLDALTEITETSESNT VDRVSTLLQSDPYTFLPYVLP >gi|226332854|gb|ACII01000165.1| GENE 2 596 - 1552 602 318 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 6 306 1 304 306 236 41 7e-62 MNENHVYDMIIVGGGPGGYTSALYAARAGFDTIVLEKLSAGGQMSLTWQIDNYPGFENGI DGFSLAEKMQKQAEKFGAKSEYAEVFNMDLTGKLKRVETSSGIYIGKTVVIATGANPREL GLAKEKELIGHGVAYCASCDGMFYKDKIVVVVGGGNSAVSDAVLLSRIAKKVIIVHRRDT LRATKIYHDQLLKTENIEFRWNNTVNELIYGERLTGVRLKDTVTGEENIIECDGLFVSIG RKPTTDFLDNQIELDNKGYIVAGENTETSIPGVYAVGDVRTKLLRQIVTAVADGAMAVHM AEKYMDLTVAMSKPYLPC >gi|226332854|gb|ACII01000165.1| GENE 3 1568 - 1891 317 107 aa, chain - ## HITS:1 COG:CC3539 KEGG:ns NR:ns ## COG: CC3539 COG0526 # Protein_GI_number: 16127769 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Caulobacter vibrioides # 1 104 1 105 110 83 38.0 8e-17 MTSININKEKFGQLIHKEKPVLVDFWAPWCGYCRRIGAAYEKISEEYSDILIAGKVNIDE EPQIAEAEKIEIIPTLVLYRDGKAIDSIVAPESKIMIENFINEALEK >gi|226332854|gb|ACII01000165.1| GENE 4 2082 - 2741 358 219 aa, chain + ## HITS:1 COG:FN0217 KEGG:ns NR:ns ## COG: FN0217 COG0664 # Protein_GI_number: 19703562 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 2 209 9 212 217 86 30.0 3e-17 MNFENCFPLWNDLNTTQKKIISDNLITQYVKKGTIIHNGNLDCTGLLLVKSGQLRTYILS DEGREITLYRLFDMDMCLLSASCIIRSIQFEVTIEAEKDTDLWIIPAEIYKGIMKESAPV ANYTNELIATRFSDVMWLIEQIMWKSLDKRVASFLLEETSIEGTNELKITHETIANHLGS HREVITRMLRYFQGEGLVKLSRGKITILDPKRLETLQRS >gi|226332854|gb|ACII01000165.1| GENE 5 2862 - 4553 1440 563 aa, chain - ## HITS:1 COG:FN1903_1 KEGG:ns NR:ns ## COG: FN1903_1 COG0446 # Protein_GI_number: 19705208 # Func_class: R General function 
prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 2 469 3 469 469 408 42.0 1e-113 MKVIIVGGVAGGATAAARIRRLNEHAEITVFERSGYISYANCGLPYYIGDVITDPEELTL QTPESFFKRFRINMKIHHEVISIHPERKTVSVKNLENGDIFEEKYDKLILSPGAKPTQPR LPGVGIDKLFTLRTVEDTFRIKEYINKNHPKSAVLAGGGFIGLELAENLRELGMDVTIVQ RPKQLMNPFDPDMASMIHNEMRKHGIKLVLGYTVEGFKEKDNGVEVLLKDNPSLQADMVV LAIGVTPDTVLAKEAGLELGIKESIVVNDRMETSVPDIYAAGDAVQVKHYVTGNDALISL AGPANKHGRIIADNICGGDSRYLGSQGSSVIKVFDMTAATTGINETNAKKSGLEVDTVIL SPMSHAGYYPGGKVMTMKVVFEKETYRLLGAQIIGYEGVDKRIDVLATVIHAGLKATQLK DLDLAYAPPYSSAKDPVNMAGFMIDNIAKGTLKQWHLEDMDKISRDKSVVLLDVRTVGEF NRGHMDGFKNIPVDELRERINEIEKGKPVYLICQSGLRSYIASRILEGNGYETYNFSGGF RFYDAVVNDRALIEKAYACGMDY >gi|226332854|gb|ACII01000165.1| GENE 6 4555 - 4875 258 106 aa, chain - ## HITS:1 COG:BH2813 KEGG:ns NR:ns ## COG: BH2813 COG0607 # Protein_GI_number: 15615376 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Bacillus halodurans # 16 106 36 124 125 68 41.0 3e-12 MGFFDLFKQSNINQGIEEYKRTGDAVLLDVRTPQEYQEGHIPESKNVPLQQLDNIASVAK NKDIPLFVYCYSGSRSRQATGILQRMGYSKVNNIGGIAAYSGKVEK >gi|226332854|gb|ACII01000165.1| GENE 7 4888 - 5193 317 101 aa, chain - ## HITS:1 COG:AGc37 KEGG:ns NR:ns ## COG: AGc37 COG0526 # Protein_GI_number: 15887381 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 97 28 125 133 107 48.0 8e-24 MSAININKNNFQNEVLNSDKPVLLDFWASWCAPCRMVVPIVEEIASERRDIKVGKINVDE EPELANKFSIMSIPTLVVMKNGKIVQQVSGARPKNAILEML >gi|226332854|gb|ACII01000165.1| GENE 8 5280 - 6026 303 248 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 3 244 1 238 242 121 32 4e-27 MRLKDKVAIITGGSRGIGFATADKFLKEGASVVLAASSQESADVAVDKLKEKYPNAIVAG 
ISPNLASMESVRKAFKEATEKYGCVDILVNNAGISENTSFMDYTEETFDKVMDLNVKGVF NTTRVAAECMVARGKGVILSTSSMVSISGQPSGFAYPASKFAVNGLTVSLARELGPKGIR VNAVAPGITETDMMKAVPKEVIEPMIERIPLRRLGQPEDIANAFVFLASDEASYITGVIL SVDGMARS >gi|226332854|gb|ACII01000165.1| GENE 9 6372 - 7337 510 321 aa, chain - ## HITS:1 COG:DR0267 KEGG:ns NR:ns ## COG: DR0267 COG2199 # Protein_GI_number: 15805298 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Deinococcus radiodurans # 156 315 338 501 511 82 33.0 1e-15 MMIQISRLQGTARVINYAGLVRGATQREIKLEITGNQNDELIKYLDDILLGLRYQDGHYD LVKLNDEEYQEKLQIQSDYWDKLKTEIEAVRNKGYENTDIVNMSEIYFTMADETVSAAES YSEKIAVKIRTIELLSALDMLSLVILVIMQTLKAMQMAMQNRLLEQKAFVDAHTGLPNKN ACNELLNKKDIITDSTACIMFDLNNLKTVNDTMGHSAGDQLILNFAKLLCSVIPEKDFVG RYGGDEFIAVIYHTSEAEIKEILKSLSREKERLNSCENQLPIDYACGWALSSDDMACTMQ MLLDDADAYMYKNKQLCKKYN >gi|226332854|gb|ACII01000165.1| GENE 10 7597 - 8100 160 167 aa, chain - ## HITS:1 COG:NMB0895 KEGG:ns NR:ns ## COG: NMB0895 COG3022 # Protein_GI_number: 15676791 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis MC58 # 1 164 91 258 259 122 39.0 4e-28 MAPSVFEDSQFEYVQNHLRIISAFYGVLKPRDGVTPYRLEMQAKVGIGETKNLYEYWGDL LYRSVIDNSRIIINLASKEYSKCIEKYLTRQDRYITITFCELSGDKLVTKGTYAKMARGE MVRFMAENRIENPEDIKKFDRLGYSFRSDLSSDAEYVFERKTEITSC >gi|226332854|gb|ACII01000165.1| GENE 11 8358 - 8723 276 121 aa, chain - ## HITS:1 COG:aq_495 KEGG:ns NR:ns ## COG: aq_495 COG1225 # Protein_GI_number: 15605968 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Aquifex aeolicus # 1 109 35 143 161 129 58.0 1e-30 MILYFYPKDNTPGCTKQACGFSERYPQFTEKGAVILGVSKDSVASHKRFEEKYGLAFTLL ADPERKVIEAYDVWKEKKNYGKVSMGVVRTTYLIDEQGIIIKANDKVKAADDPENMLKEL D >gi|226332854|gb|ACII01000165.1| GENE 12 8720 - 8812 118 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLELGIKAPDFELPDQNGEVHKLSDYAGKK >gi|226332854|gb|ACII01000165.1| GENE 13 8816 - 9025 180 69 aa, chain - ## HITS:1 COG:no 
KEGG:EUBREC_1224 NR:ns ## KEGG: EUBREC_1224 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 69 26 94 94 121 92.0 6e-27 MKAHKYWSVGALITMIGTFYTGYKGLKAAHKYFAFGSLICMIMAVYSGHKMISGNKRTRK QVKNTKAEE >gi|226332854|gb|ACII01000165.1| GENE 14 9205 - 9633 337 142 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3533 NR:ns ## KEGG: EUBREC_3533 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 142 1 142 143 241 87.0 7e-63 MKKKIIAVISGTAILIIAAGSIYGKFESSHKEGEPDVVGTFSVNRDENITVVANREDIED REAFARKLLQMYKDDSFHSTKFSTDRGYATSIDMHIYLWKEDIEDGESVMTAEYRPVEYG KDYDIVNNPDKFQLYIDGKEIK >gi|226332854|gb|ACII01000165.1| GENE 15 9638 - 9850 225 70 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3534 NR:ns ## KEGG: EUBREC_3534 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 70 1 70 70 108 94.0 5e-23 MREKTIYEKIAEKYNTTPEEVRREMQIAIDAGFDNPDPAVQEEWKKMTLKGDRPIPEEVI NYAVKKLKGN >gi|226332854|gb|ACII01000165.1| GENE 16 10138 - 10572 509 144 aa, chain - ## HITS:1 COG:PA1673 KEGG:ns NR:ns ## COG: PA1673 COG2703 # Protein_GI_number: 15596870 # Func_class: P Inorganic ion transport and metabolism # Function: Hemerythrin # Organism: Pseudomonas aeruginosa # 6 128 3 120 153 58 31.0 3e-09 MTYDLNITFDDNLVTGNETIDTQHKELIDRIQNFVTACQNGDSKVKAIKMLDYLDEYTDF HFKEEEKLQEKSGYPEREKHYEKHEEFRKTIQELYEYLQEYEGPTDRFSELVQKNVIDWL FGHIKTYDRSVAKFIFMKQNPDRC >gi|226332854|gb|ACII01000165.1| GENE 17 10649 - 10813 207 54 aa, chain - ## HITS:1 COG:alr1174 KEGG:ns NR:ns ## COG: alr1174 COG1592 # Protein_GI_number: 17228669 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Nostoc sp. 
PCC 7120 # 3 54 182 233 237 73 57.0 7e-14 MIKYECEPCGYIYDPAVGDPDAGIDPETAFEDIPDDWTCPICGLGKDVFVPVED >gi|226332854|gb|ACII01000165.1| GENE 18 11090 - 11461 199 123 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3537 NR:ns ## KEGG: EUBREC_3537 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 121 1 121 123 226 95.0 3e-58 MRYEDSMKGVADQIIKEQQRVSKIKNNPEEFHWHDEYATLNFFRFICKNIGNLRNGEIEK MVTRLKNIDQKAVENHVNGAVWAFDYDKMFCVLMENEICRKVCEKNKYNSWMNLIGQYCV SRT >gi|226332854|gb|ACII01000165.1| GENE 19 11471 - 12043 576 190 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3538 NR:ns ## KEGG: EUBREC_3538 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 190 1 190 190 347 98.0 9e-95 MTDRDYAIKSMKEITFQMASHAQDYLEVTIERHYTDIKELMRSYQKLILENQVVLEELDM ECQEKINEDMAYALSYLSIYNNQLNVPKMHREMNNLMIIYGLSDMIYRGMTLVKFYAPNG VMLSEILHSCFCSHYNKTDVEVQQELGIGRTSFYKMKKQALGYLGFYFYEIVVPQAKDKR FKPSLGVEEE >gi|226332854|gb|ACII01000165.1| GENE 20 12294 - 12533 175 79 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3540 NR:ns ## KEGG: EUBREC_3540 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 79 1 79 79 155 97.0 3e-37 MNKVYVDVVAEFRKDGRLVPIFFTWEDGRKYSIDRILKIERCASRKAGGVGVMYTCMIQG QESHLFYEVDKWFMERKTA >gi|226332854|gb|ACII01000165.1| GENE 21 12546 - 12758 85 70 aa, chain - ## HITS:1 COG:SA1174 KEGG:ns NR:ns ## COG: SA1174 COG1974 # Protein_GI_number: 15926920 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Staphylococcus aureus N315 # 5 70 4 69 207 67 46.0 6e-12 MKKVLTRKQKESYQCILNYTKEHGYPPTVREFGKLIGVRSTSSAFSRIKQLEQNGYIRRI PASPRAIEIL >gi|226332854|gb|ACII01000165.1| GENE 22 12773 - 13417 509 214 aa, chain - ## HITS:1 COG:Rv1708 KEGG:ns NR:ns ## COG: Rv1708 COG1192 # Protein_GI_number: 15608846 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in 
chromosome partitioning # Organism: Mycobacterium tuberculosis H37Rv # 2 180 64 235 318 137 39.0 1e-32 MAKVISIVNQKGGTGKSACTANLAVGLAQKNMKVLIVDADPQSDVSAGFGYRDCDESNET LTALMDTVMKDEDIPSDCYIRHQAEGIDIICSNIGLAGTEVQLVNAMSREYVLKQILYGI KDQYDAIIIDCMPSLGMITINALAASDEVLIPVEASYLPIKGLQQLEKLDRTVDEIRRRF GYFSIQRAAMYQNKVLSHLDAGTHTVHPHSYFHG >gi|226332854|gb|ACII01000165.1| GENE 23 13653 - 14201 558 182 aa, chain - ## HITS:1 COG:TM0877 KEGG:ns NR:ns ## COG: TM0877 COG0148 # Protein_GI_number: 15643639 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Thermotoga maritima # 10 162 6 152 427 185 62.0 5e-47 MLRYLPVRRVHARQVLDSRGNPTVEVEVTVGEGVIGINGYTGRAIVPSGASTGKFEAVEL RDGEKGCYTGLGVRKAVENVNTKLAEAILGENALDQSYIDKKIIETDGTDNKSNVGANAA LGVSLAVARAAAAALRVPLYQYLGGCHTRQMPVPMMNILNGGACVIIMTQGRTPYNTRAL AI >gi|226332854|gb|ACII01000165.1| GENE 24 15189 - 16016 542 275 aa, chain + ## HITS:1 COG:CAC3009 KEGG:ns NR:ns ## COG: CAC3009 COG0726 # Protein_GI_number: 15896261 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Clostridium acetobutylicum # 70 275 85 290 295 209 48.0 6e-54 MHLKYISSTRIFSSRFTKIFLLFCACAFIGSSIGKIVSHNTARQTAASAQSSNWGLSFQE EGKRPVGNATIEELAQYNAFFAEDTEEKKIYLTFDAGFENGNTPAILDALKKHQAPATFF VVGNFISENKDLIKRMETEGHIVGNHTMTHPDMSKISTKESFQKELSDVEKIYREITGKE MTKFYRPPQGIYSTQNLSMARELGYHTFFWSLAYVDWYQDNQPDPQEAIAKLTKRIHPGA IVLLHSTSSTNAQILDELLDKWEAMGYSFHSLNEL Prediction of potential genes in microbial genomes Time: Sat May 28 21:07:50 2011 Seq name: gi|226332853|gb|ACII01000166.1| Ruminococcus sp. 5_1_39B_FAA cont1.166, whole genome shotgun sequence Length of sequence - 7801 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 3 - 62 1.5 1 1 Tu 1 . 
+ CDS 88 - 255 60 ## - Term 77 - 130 16.1 2 2 Op 1 16/0.000 - CDS 197 - 1189 1535 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 1220 - 1279 8.1 - Term 1301 - 1335 3.2 3 2 Op 2 34/0.000 - CDS 1484 - 2239 542 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 4 2 Op 3 17/0.000 - CDS 2261 - 2938 660 ## COG0765 ABC-type amino acid transport system, permease component 5 2 Op 4 . - CDS 2919 - 3578 599 ## COG0765 ABC-type amino acid transport system, permease component - Prom 3685 - 3744 3.9 6 3 Tu 1 . - CDS 3757 - 4377 734 ## COG2755 Lysophospholipase L1 and related esterases - Prom 4447 - 4506 7.2 - Term 4525 - 4567 -0.2 7 4 Tu 1 . - CDS 4623 - 5390 508 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 5440 - 5499 7.5 + Prom 5395 - 5454 5.5 8 5 Op 1 1/0.000 + CDS 5538 - 6746 976 ## COG1472 Beta-glucosidase-related glycosidases 9 5 Op 2 . + CDS 6743 - 7799 873 ## COG1472 Beta-glucosidase-related glycosidases Predicted protein(s) >gi|226332853|gb|ACII01000166.1| GENE 1 88 - 255 60 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRTPQECPNGVRLTLFINQYFREYQNYNFYTAFTLLIQRLLPLSLPVLLLFPHL >gi|226332853|gb|ACII01000166.1| GENE 2 197 - 1189 1535 330 aa, chain - ## HITS:1 COG:Cj0982c KEGG:ns NR:ns ## COG: Cj0982c COG0834 # Protein_GI_number: 15792309 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Campylobacter jejuni # 1 288 1 279 279 265 50.0 1e-70 MKKKILAVLLATGVAATTLLGGSILVSAADDGGNTAQARTLDEIKKDGKIKIGVFSDKNP FGYVDENGDVQGYDIYFGKRLAKDLLGSEDDAEFTYVEAASRVEYLQSAKVDVILANFTV TDDRAEKVDFALPYMKVALGVVSPDDALIKDVKDLEGKKLIVVKGTTAETYFTDNYPDIE LVKFDEYQEAYDALLDGRGDAFSTDNTEVLAWAQQNEGFSVGIESLGDIDTIAPAVQKGN TDLLNAINDEIKSLADEQFFHKDFEETLKPVYGDNINPDDLVVEGGVVDSEEKAEDADAA ATEDSDAASDDTKDAETTEEPEETEAADAE 
>gi|226332853|gb|ACII01000166.1| GENE 3 1484 - 2239 542 251 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 5 247 1 245 245 213 43 4e-55 MGQPILEVEHLKKSYVNNVPILDDVSFSLQKGEVVVIVGPSGCGKSTFLRCINRLEEIDT GVLKLNGVSYEKEKKNISKVREKIGMVFQSYDLFPNMTILNNLLLAPMKVQKRNKEEVKK EAEQLLDRVGLLDKKDNYPRQLSGGQKQRVAIVRAMLMHPEIMLLDEITAALDPEMVREV LQVVLELAKTGMTMLIVTHEMEFARSVADRVIFMDKGNIVEENTPEQFFDNPETDRAKQF LNSFHYETVKK >gi|226332853|gb|ACII01000166.1| GENE 4 2261 - 2938 660 225 aa, chain - ## HITS:1 COG:SP0710 KEGG:ns NR:ns ## COG: SP0710 COG0765 # Protein_GI_number: 15900608 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 1 223 1 223 225 254 59.0 9e-68 MQNMGIEVLFKGINSLRLWQGLWVTIRIALISMVFSIILGFLLGMVMNIPNKIIKFICRI YLEIVRIMPQLVLLFLVYFGAAKHLGMNLSGETAAVIVFTFWGTAEMGDLVRSALISIPK HQYDSGYGLGLNKWQVYLYIILPQTIRRLLPSAVNLLTRMIKTTSLVALIGVVEVLKVGK QIIDASRYTNPTAALWVYGAVFFMYFIICFPFSRLSRVLEKKFAD >gi|226332853|gb|ACII01000166.1| GENE 5 2919 - 3578 599 219 aa, chain - ## HITS:1 COG:SP0711 KEGG:ns NR:ns ## COG: SP0711 COG0765 # Protein_GI_number: 15900609 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 21 219 1 199 206 232 64.0 4e-61 MDFEFIREQTPLYIEAAKLTLHMAVIGVLLAIVVGLVCSMIKYFRIPVLRQIVECYIELS RNTPLLIQLFFLYFGLPKIGIVLSSEQCAITGLTFLGGSYMAEAFRSGLDNVPPIQVESA LSLGMSKSQVFTNIILPQAVSNSIPAFCANAIFLIKETSVFSAVALADLMFVTKDLIGIY YKTDEALFMLVIAYLIILLPISLICSFVERRVRYAEYGD >gi|226332853|gb|ACII01000166.1| GENE 6 3757 - 4377 734 206 aa, chain - ## HITS:1 COG:SMa1993 KEGG:ns NR:ns ## COG: SMa1993 COG2755 # Protein_GI_number: 16263545 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Sinorhizobium meliloti # 2 206 16 227 229 107 34.0 1e-23 
MKQILCFGDSNTYGLIPGTTNQRYGWGTRWTSILDDKVRTKGYRVIEEGLCGRTTVFDDP FRTERRGTEMLPAILESHRPVDTIVLMLGTNDCKSVYSATPEVIGQGIEQLLDQINTVNP DAKILLVSPIYLGERIWEEDFDPEFDKNSIEVSWNLPRVYEKIARRRNISYLPASEFARP GEADQEHLDELGHSRLADAIYEKLAG >gi|226332853|gb|ACII01000166.1| GENE 7 4623 - 5390 508 255 aa, chain - ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 147 248 153 254 257 74 37.0 1e-13 MQEKSKEYNEKEQQVAENRSRNTLFERRENLQKHESYASERRQYDAIRNGQTDRIQSVFQ ITPDGTPGILSRNELRNSKNMFIAGITLFTRAAIEGGVPEETAYALSDGYIQTVEECTSK SSIEKLSQKAALRFAQEVQKSGMRHYSREIEAAVKYIHLHLHVPVTLEETAEAAGISASY LSRLFKKETGMLFVDYIQKERIEAACNMLTYSDYTAAQISEYLCFSTQSYFIKIFRKYTG TTPAKYKKYKVQDWK >gi|226332853|gb|ACII01000166.1| GENE 8 5538 - 6746 976 402 aa, chain + ## HITS:1 COG:TM0025 KEGG:ns NR:ns ## COG: TM0025 COG1472 # Protein_GI_number: 15642800 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 1 337 1 357 721 278 42.0 1e-74 MKRDLKKIISQMTLEEKAGMCSGLDFWHLKSVERLGIPEVMVSDGPHGLRKQDDKGDHLG MNDSIKAVCFPPAALSACSFDRSLMEAMGETIGREAQANDVSVVLGPAVNIKRSPLCGRN FEYYSEDPYLAGEIAAAFINGVQSQHVGTSIKHFAANNQEYHRMSNSSEADERTLREIYF PAFETAVKKAQPYTFMCSYNQINGTFASENKWLLTDVLRNDWGFEGYVMSDWGAVNDRVK GLEAGLDLEMPGSNGTNDALIMEAVKNGTLKEEVLDQAVERILNIIYKYADHRAPQEFTM EKDHEEARRIAEESMVLLKNADQILPLKTSEKVAFVGGFAKNLVSRAAEALILTALKLPM HWKLFLLMHRLFTQKDFLLIKIYTMKNWLLKHFRLQKLLIRL >gi|226332853|gb|ACII01000166.1| GENE 9 6743 - 7799 873 352 aa, chain + ## HITS:1 COG:SPBC1683.04 KEGG:ns NR:ns ## COG: SPBC1683.04 COG1472 # Protein_GI_number: 19111852 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Schizosaccharomyces pombe # 1 263 566 826 832 223 42.0 5e-58 MIFAGLPDSFESEGYDRSHMRLPECQNRLIGEILEVQPNTVIVLHNGSPVEMPWINNVKG 
ILEAYLGGQAGGAAVANILYGIVNPSGKLAETIPVKLADNPSYLNFGGGDKVEYREGVFV GYRYYDTKQMEVAYPFGYGLSYTTFAYSNLQISNTNPTEKDTITVSVDVTNTGSIAGKEI VQLYVKDMTASTTRPEKELKGFEKVQLAPGETKTVTMELDKRSFAWYNTELHDWYAASGK YEILIGASSRDIRLSETIELSSSQVIPMHIHMNTTLGELFNNPDTKEAAKELTSRYLAAM NGGSDTASESAAEAITEEMVAAMTYSMPLRSLCSFGGFSRAEIQEMIDKLQA Prediction of potential genes in microbial genomes Time: Sat May 28 21:07:59 2011 Seq name: gi|226332852|gb|ACII01000167.1| Ruminococcus sp. 5_1_39B_FAA cont1.167, whole genome shotgun sequence Length of sequence - 15414 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 8, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 59 - 1249 970 ## COG0161 Adenosylmethionine-8-amino-7-oxononanoate aminotransferase 2 1 Op 2 4/0.000 - CDS 1246 - 1944 797 ## COG0132 Dethiobiotin synthetase 3 1 Op 3 2/0.000 - CDS 1941 - 2933 609 ## COG0502 Biotin synthase and related enzymes 4 1 Op 4 . - CDS 2930 - 3523 186 ## PROTEIN SUPPORTED gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 - Prom 3548 - 3607 9.9 - Term 4045 - 4093 5.0 5 2 Op 1 12/0.000 - CDS 4167 - 4607 278 ## COG3610 Uncharacterized conserved protein 6 2 Op 2 . - CDS 4604 - 5368 587 ## COG2966 Uncharacterized conserved protein - Term 5706 - 5760 10.0 7 3 Tu 1 . - CDS 5973 - 7208 605 ## COG0582 Integrase - Prom 7253 - 7312 3.1 8 4 Op 1 . - CDS 7356 - 7928 435 ## gi|253581232|ref|ZP_04858489.1| predicted protein 9 4 Op 2 . - CDS 8011 - 8823 407 ## gi|253581233|ref|ZP_04858490.1| predicted protein 10 4 Op 3 . - CDS 8867 - 10165 784 ## gi|253581234|ref|ZP_04858491.1| predicted protein - Prom 10197 - 10256 1.7 11 5 Op 1 . - CDS 10357 - 11154 336 ## smi_1323 hypothetical protein 12 5 Op 2 . - CDS 11141 - 11806 248 ## gi|253581236|ref|ZP_04858493.1| predicted protein 13 5 Op 3 . - CDS 11854 - 12228 349 ## gi|253581237|ref|ZP_04858494.1| predicted protein - Prom 12357 - 12416 7.4 + Prom 12217 - 12276 5.5 14 6 Op 1 . 
+ CDS 12332 - 12511 92 ## + Term 12547 - 12589 -0.3 + Prom 12598 - 12657 10.3 15 6 Op 2 . + CDS 12711 - 13427 230 ## gi|253581238|ref|ZP_04858495.1| predicted protein + Term 13460 - 13526 17.1 - Term 13463 - 13500 -0.7 16 7 Tu 1 . - CDS 13505 - 13741 265 ## gi|253581239|ref|ZP_04858496.1| predicted protein - Prom 13761 - 13820 4.2 + Prom 13763 - 13822 8.2 17 8 Tu 1 . + CDS 13952 - 14830 753 ## gi|253581240|ref|ZP_04858497.1| predicted protein + Term 14846 - 14877 4.1 Predicted protein(s) >gi|226332852|gb|ACII01000167.1| GENE 1 59 - 1249 970 396 aa, chain - ## HITS:1 COG:VC1111 KEGG:ns NR:ns ## COG: VC1111 COG0161 # Protein_GI_number: 15641124 # Func_class: H Coenzyme transport and metabolism # Function: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase # Organism: Vibrio cholerae # 2 389 12 416 428 322 42.0 7e-88 MIWYPYEQMKTMKAPYKILDAEGVHLYTKDQKLIDSISSWWCMIHGYKHPELTEAIKEQA DRFCHVMLGGLTHEPVEKLSAKLQEFLPGDLDYCFFSDSGSVAVEVSLKMALQYNTNQGR KRPLVLALEHAYHGDTFKAMEVGDDEDYHFAFDKKEGVVHIPTEIPALEEAFEQYHDRLN CFIVEPLLQGAGGMRMYDISFLKRARELCDQYDVLLIFDEVATGFGRTGNRFVADLVLPD ILVLGKALTGGYIGHAVTVANHKVFDAFYSDDDRLALMHGPTFMGNPLACAVALKGIEIF EREDYMSKIRHIEQVSRGEMEAFTDPRIKEVRIMGGCVCVEVHDSRNLEGFQQFAYERGV FSRPFLNYMYSMVPYVISDEELIRIFDVMKEWFREK >gi|226332852|gb|ACII01000167.1| GENE 2 1246 - 1944 797 232 aa, chain - ## HITS:1 COG:CAC1361 KEGG:ns NR:ns ## COG: CAC1361 COG0132 # Protein_GI_number: 15894640 # Func_class: H Coenzyme transport and metabolism # Function: Dethiobiotin synthetase # Organism: Clostridium acetobutylicum # 3 215 2 216 240 170 40.0 2e-42 MSKAVFITGTGTDMGKTYLSGLIVKKLAQAGKNPAYYKAAMSGNDRRADGSLIPGDALFV KEMSGISQSLDDMCPYVYENAWSPHLASRVEGNPVDLDVVRRGFLKAANDYEYITMEGSG GILCPLCFDERKIQLEDVIREFELSSILVADAGLGTINSVVLTAEYMKAHSLPIKGIIFN HYHPGNIMEEDNIAMCQYMTGLPVIAKVQDNAKDLDIDADVLAEFYDEVKIK >gi|226332852|gb|ACII01000167.1| GENE 3 1941 - 2933 609 330 aa, chain - ## HITS:1 COG:FN1000 KEGG:ns NR:ns ## COG: FN1000 COG0502 # Protein_GI_number: 19704335 # Func_class: H Coenzyme transport and 
metabolism # Function: Biotin synthase and related enzymes # Organism: Fusobacterium nucleatum # 30 319 69 360 360 273 46.0 4e-73 MNPLILAQEIIDGRRITRKDDLSFFLTCDLDELCEGADRIREAYIGDKVDLCSIINGRSG RCPENCKYCAQSVHHHTSCEIYDFLPEEKILEACKMNEEEGVDRFSIVTAGKALTGKEFD QAIHAYETMYKECNIDLCASMGFLSSEQLHRLHEAGVTSYHHNIETSRRNFPNICTTHTY DMKIETLKKVKAEGMCACSGGIIGMGETWEDRLDMAVSLAELGIDSIPINALMPIPGTPL EHLDQLSEQDILRTIAFFRYINPEANIRLAAGRALLTNDGETAFKSGASATITGNMLTTV ACATIRSDRKMLSDMGRNVTPEYWKEVEES >gi|226332852|gb|ACII01000167.1| GENE 4 2930 - 3523 186 197 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 [Bacillus selenitireducens MLS10] # 3 168 4 171 190 76 30 1e-13 MTKTKSMIYCGLFTALIAAGAFIKIPVPVVPFTLQYLFTMLAGLFLGSRRGMISVVAYML LGLAGLPIFSEGGGIWYIFKPSFGYIIGFCLGTYVTGLIAERLKQKTVFHYLLANLAGLM IVYACGMVYYYVICNYVIDTPIGIWPLILYCFLLAVPGDIALSVLGAVIAKRVRPVVLQY LFDSKSNVRLETEELAK >gi|226332852|gb|ACII01000167.1| GENE 5 4167 - 4607 278 146 aa, chain - ## HITS:1 COG:CAC2266 KEGG:ns NR:ns ## COG: CAC2266 COG3610 # Protein_GI_number: 15895534 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 5 142 1 138 152 74 28.0 6e-14 MTAAIMIQLITGAIGSVGFGILFHIKKKYLPLAAVGGFLSWLVFLLGKEFWGNVFLPTLM AGFVTDVYAEILARICKETSTSFFVTSVIPLIPGSTLYYCMNSIVEGNTVRALEYGRDTF LFAFGIAAGMSIAWSICYFLRTVRKK >gi|226332852|gb|ACII01000167.1| GENE 6 4604 - 5368 587 254 aa, chain - ## HITS:1 COG:BH0081 KEGG:ns NR:ns ## COG: BH0081 COG2966 # Protein_GI_number: 15612644 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 1 249 1 250 251 102 30.0 5e-22 MQTSKDFLHIFLDMGDALLNSGAEIFRVEDTLNRMGYACGAAQMNVFVITSSIVITMEFP GEGARTQTRRIRECGGNDFTKLEQLNDLSRRFCNHPVPAAELRKEFDKININKTKPLWKL LGSLLAACSFAFFYGGSIPDAVAAGLGAVLIWGLQQYLRPVCMNEVTFQFVASFFTGCAI CGLTLLCPFLRMDKIMIGDIMLLIPGLMSTNAIRDVLIGDTLSGIIRLIAALLLAAALAL GFMGAIILFGRFGL >gi|226332852|gb|ACII01000167.1| GENE 7 5973 - 7208 605 411 aa, chain - ## HITS:1 
COG:lin0524 KEGG:ns NR:ns ## COG: lin0524 COG0582 # Protein_GI_number: 16799599 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Listeria innocua # 236 390 142 286 309 68 32.0 2e-11 MSKNICSKDLKLIYDDGKIDIDIEDVQERMKELEKIMDTQTYLSQHPYKIWQGNTGFYYT YLPDIVKGRILLKKKTESDLIKAIVEFYKCHNTVCMLFEEWLEYKKLYRLIKEPSAIRYE TDYRRFIYGTDFGNTEVADVTEEMLEKFIISAIFDNNLTAKAFSGLRTIIIGMFKYAKKR KYTSISITSFFGDLELSPNMFRHVVKNDEDEVFTDFEVRQISNYVKEHPTIRNLGVLLAF HTGLRVGELTVLTPDDIDWKNHILLINKTEIKIKDKTTGKYTYAVSDQGKTENSTRNVVF TDSAEWVLHQIIKINPHGKYLFENPEGKRIRGTTMNKRLHGICRALGIKERSIHKSRKSY ATQLLDAGIGDALIISQLGHSDILTTKKHYYRNSHTLAETRDKLESALALG >gi|226332852|gb|ACII01000167.1| GENE 8 7356 - 7928 435 190 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581232|ref|ZP_04858489.1| ## NR: gi|253581232|ref|ZP_04858489.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 190 1 190 190 369 100.0 1e-101 MSDENDAFNEDDAFNEDDSFNEDDAFNEDDSFNENGELTLTNDEYYQYLLNNCLSYCGDF NKIKHVPKSKKSNDPEIVKVGDLEIVVKNYSPSRIKQIFRDYFGDPDKGGIRLEKIWIYK WGRYPGQYTRYNLVDCATNKIVVHDKTLHDMRIYLSGIAKHGFPLKKEFDQIFMFRVMSW QEHGFPDFPA >gi|226332852|gb|ACII01000167.1| GENE 9 8011 - 8823 407 270 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581233|ref|ZP_04858490.1| ## NR: gi|253581233|ref|ZP_04858490.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 270 1 270 270 447 100.0 1e-124 MGRKKISSEVKETVNEVKETANGVKETDISEESSNVIITESGQEIIPASKENKQLLVDVS PKDTPKMVTGGLISYEDQINRGIELMKKGIQGRLLIGQALVNIKAKEDLTELGYSNIYDF CDHKFRMGRTIADNCMFLYTKLSHINPKTNEPELEEEYKKFNLSQLSELRSVDKEQLWRF TPDQTVSQMRTLKHQIKKENSKKNSTNDDSNKPPKKKTTLLKSVCELPEKGNVTAKIRDQ IEETLEEIRNSKSGKEQNLKIMICICEEKE >gi|226332852|gb|ACII01000167.1| GENE 10 8867 - 10165 784 432 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581234|ref|ZP_04858491.1| ## NR: gi|253581234|ref|ZP_04858491.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 432 4 435 435 857 100.0 0 MAQKNDIQRNKYFLTINSPEKFGYTHEVIYQVASNFKTFQYVAVVDEQGSNFHTHVLLVF KSRVRWSTVQDKFPHAHIEEGKGDINQILQYMRKEGKWLLDEKKQEQKIEGSFESWGDRP VDTKEKVSEFSELYDLVYDEVPTGEIIKFNPKYIRYIDKIAPMRIEIMNEKYRGKRRLDL KVIYVFGLSGTGKTRMILDRHGDENVFRVTDYFHPFDSYNMQQVLCLEEFRDSLTITQCL NLLDIYTVELPARYANKLGIYKTVYMVSNWEIGKQFKSVQQEHPETYHAFRRRFHYLLDF REKNVHAWNSRDFDMGFTARVQTILEFTKRDMVEADWQKLLELMSVNTMQKIIDGEIDES ETFKEIGQIIQKMEKPDTFESAVAKRKEWISKCREKKLDINKDIFHAPVPNNIRNLFPMN TSANGMNNISNM >gi|226332852|gb|ACII01000167.1| GENE 11 10357 - 11154 336 265 aa, chain - ## HITS:1 COG:no KEGG:smi_1323 NR:ns ## KEGG: smi_1323 # Name: not_defined # Def: hypothetical protein # Organism: S.mitis_B6 # Pathway: not_defined # 30 256 225 447 467 67 28.0 6e-10 MENIKTIWGFDEPTLIEDDIPDPILLGLGSHPHALITGSSGSGKSQSLLFIIGRTLQILQ ENKIDIDFRIGDFKYSEDYQFLEEISYPYYFRGFDCYEGIMSYYKDFSENRMNKNANKNQ HHFLILDEYQATINYYTMYDKQHKTKYSSEILSAIAEILMLGRSTQAGVWHIWVVTQVAS ATLFANGTRENFMITLAMGNLSKEQKGMIFPGQTLPDKRFKPGEGMLLADGMEIKEVIFP VISDVADWTLHIGNILMPDGTAFTL >gi|226332852|gb|ACII01000167.1| GENE 12 11141 - 11806 248 221 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581236|ref|ZP_04858493.1| ## NR: gi|253581236|ref|ZP_04858493.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 221 1 221 221 444 100.0 1e-123 MLRDLILAFIIFSILAILILVGLGRNPNQAVRLVVVFWKKIFIDLIDELFGKHVNQILYP FVFGIDAMGNVDLNEIDANFEDLYRVFEQGNIYYDFMKLYPENGTVPKDIICYNFCICVE DKENWEKMKKKIPIRINGIVNRRLKEYGYYNVASSPFWWCGFYSNENPTFLKIYFALTPK GVEEIKQKKLQVAYERQKKNRKSSTNFKTDWKSGESKDGKH >gi|226332852|gb|ACII01000167.1| GENE 13 11854 - 12228 349 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581237|ref|ZP_04858494.1| ## NR: gi|253581237|ref|ZP_04858494.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 124 1 124 124 237 100.0 2e-61 MTILKGREIALSAIFPNSKFFIVGVEPYYKYDSATNKRTDLIEGYRYELIESTKFTHIPL KVPNMAAVITADELDKRIEAGEQVFVTLKGAYLKMYYSPKTKNFEDSVFASSLHLCDDAE IELD >gi|226332852|gb|ACII01000167.1| GENE 14 12332 - 12511 92 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFFAQILFISLSYHDFFSEKNLKKIVRVVEYNVRVKLAIRFNPDNENALSAISRPLKHL >gi|226332852|gb|ACII01000167.1| GENE 15 12711 - 13427 230 238 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581238|ref|ZP_04858495.1| ## NR: gi|253581238|ref|ZP_04858495.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 238 1 238 238 384 100.0 1e-105 MEEKKIGKKRGPKGNWEELLEQYRKLPYSYQRVVTDVLQVMINESNKTFSKKKEIRKLYH TIIRQEMQKKNITYSELYERLNEKEHLQLNPKTYESFRRDMIIDGYLFKHTCDILEIPYT EIKSKGKDGTDKSYIKLKILEDAISKTDTYEMEINVLDEESISKIAKSKNKDINEDELTH IDKKISLSGYYSNFLIAKGNFMASNMEWSFKLLNKYEKEIIEKLIYSLKQLAFDESTK >gi|226332852|gb|ACII01000167.1| GENE 16 13505 - 13741 265 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581239|ref|ZP_04858496.1| ## NR: gi|253581239|ref|ZP_04858496.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 78 1 78 78 137 100.0 2e-31 MKDIKVKIDYTPLLKAIEEQNLSENKIFKQWNIPRKTFYNIRHERGITIDTLARLSVVLD KDIGQLVNISYDIIEKPE >gi|226332852|gb|ACII01000167.1| GENE 17 13952 - 14830 753 292 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581240|ref|ZP_04858497.1| ## NR: gi|253581240|ref|ZP_04858497.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 292 1 292 292 519 100.0 1e-146 MLNKTIKKLIALTLATVTVATGSTAVSASTLTPKKSHSKAEITEYVDYFNDITDRYAFRP DGYVPVSRNGEYHKGQSYNSWQMTLLLQNTPGITSEQKADMKKAASKWKKSYLSERMLYI YQTKWSDKMYDYLDDDRTSASWASKYKAKVKKTYTEIDAYLTKNAKYNQKLAGELRKFNK QRRDIYNKAIDNWCFKMTKSQKAKYKSPICGSGMKPQASFIQYVDVMFSKSKMSSNPYYE TMFGGSCTRHIAVDDYNDTKLYASKTLGLHGGELNPNGANNVWGRHIQWKKW Prediction of potential genes in microbial genomes Time: Sat May 28 21:09:30 2011 Seq name: gi|226332851|gb|ACII01000168.1| Ruminococcus sp. 
5_1_39B_FAA cont1.168, whole genome shotgun sequence Length of sequence - 31272 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 15, operones - 7 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 550 - 581 -1.0 1 1 Tu 1 . - CDS 824 - 1048 390 ## EUBELI_00116 hypothetical protein 2 2 Op 1 . - CDS 1123 - 2322 1307 ## COG0019 Diaminopimelate decarboxylase 3 2 Op 2 . - CDS 2319 - 3839 1537 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins 4 2 Op 3 . - CDS 3912 - 4892 991 ## EUBELI_00118 hypothetical protein 5 2 Op 4 . - CDS 4879 - 6390 1216 ## COG1696 Predicted membrane protein involved in D-alanine export - Prom 6459 - 6518 2.4 - Term 6427 - 6479 8.6 6 3 Tu 1 . - CDS 6531 - 7478 1213 ## gi|253581247|ref|ZP_04858503.1| predicted protein - Prom 7498 - 7557 2.2 - Term 7534 - 7580 -0.0 7 4 Op 1 . - CDS 7594 - 8382 887 ## gi|253581248|ref|ZP_04858504.1| conserved hypothetical protein - Term 8412 - 8462 -0.3 8 4 Op 2 . - CDS 8480 - 9148 752 ## gi|253581249|ref|ZP_04858505.1| predicted protein 9 4 Op 3 . - CDS 9166 - 10152 839 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 10235 - 10294 9.3 - Term 10538 - 10583 6.5 10 5 Op 1 . - CDS 10588 - 11241 591 ## COG0637 Predicted phosphatase/phosphohexomutase 11 5 Op 2 . - CDS 11241 - 11591 511 ## EUBELI_20567 hypothetical protein - Prom 11643 - 11702 6.2 - Term 11811 - 11860 12.0 12 6 Op 1 . - CDS 11950 - 13458 1869 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 13526 - 13585 7.6 13 6 Op 2 . - CDS 13647 - 16136 1858 ## GYMC10_0344 hypothetical protein - Prom 16287 - 16346 7.6 - Term 16306 - 16349 -0.8 14 7 Tu 1 . - CDS 16533 - 17102 581 ## COG0681 Signal peptidase I - Prom 17191 - 17250 5.0 - Term 17616 - 17674 22.2 15 8 Tu 1 . 
- CDS 17684 - 19243 1392 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 19477 - 19536 7.3 - Term 20108 - 20167 13.6 16 9 Op 1 30/0.000 - CDS 20251 - 21498 1361 ## COG0065 3-isopropylmalate dehydratase large subunit 17 9 Op 2 30/0.000 - CDS 21517 - 22008 671 ## COG0066 3-isopropylmalate dehydratase small subunit - Prom 22059 - 22118 3.0 18 9 Op 3 30/0.000 - CDS 22277 - 23527 1408 ## COG0065 3-isopropylmalate dehydratase large subunit 19 9 Op 4 . - CDS 23529 - 24011 567 ## COG0066 3-isopropylmalate dehydratase small subunit - Prom 24094 - 24153 6.8 - Term 24106 - 24149 -0.5 20 10 Tu 1 . - CDS 24171 - 25361 1154 ## COG1015 Phosphopentomutase - Prom 25393 - 25452 1.8 21 11 Op 1 1/0.000 - CDS 25483 - 25926 380 ## COG0295 Cytidine deaminase 22 11 Op 2 . - CDS 25945 - 26775 766 ## COG2820 Uridine phosphorylase - Prom 26888 - 26947 6.7 23 12 Tu 1 . - CDS 27089 - 27553 300 ## EUBREC_0506 hypothetical protein - Prom 27578 - 27637 4.6 - Term 27946 - 27989 1.1 24 13 Tu 1 . - CDS 28099 - 28302 129 ## gi|253581265|ref|ZP_04858521.1| predicted protein - Prom 28411 - 28470 7.0 - Term 28415 - 28462 11.3 25 14 Tu 1 . - CDS 28527 - 29780 991 ## COG1373 Predicted ATPase (AAA+ superfamily) - Prom 29821 - 29880 6.3 26 15 Op 1 . - CDS 30095 - 30427 251 ## Apar_1207 hypothetical protein 27 15 Op 2 . - CDS 30489 - 31013 484 ## COG0655 Multimeric flavodoxin WrbA 28 15 Op 3 . 
- CDS 31038 - 31223 208 ## gi|153811831|ref|ZP_01964499.1| hypothetical protein RUMOBE_02224 Predicted protein(s) >gi|226332851|gb|ACII01000168.1| GENE 1 824 - 1048 390 74 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00116 NR:ns ## KEGG: EUBELI_00116 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: D-Alanine metabolism [PATH:eel00473] # 1 73 18 90 91 86 60.0 2e-16 MEELIELLEEIQPDADFENCDTLIDDGILDSFAILSIVSELQDTYDITITPADIIPENFN SAKALWDMVCRLKG >gi|226332851|gb|ACII01000168.1| GENE 2 1123 - 2322 1307 399 aa, chain - ## HITS:1 COG:RSp0424 KEGG:ns NR:ns ## COG: RSp0424 COG0019 # Protein_GI_number: 17548645 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Ralstonia solanacearum # 8 376 3 397 413 142 28.0 9e-34 MSSYGFIKELIEKEHTPTPAYVFDLDKMKEFVKKVQSCLGESAQLCYAMKANPFLTGPMM DVVPTFEVCSPGEFRICERVGVPMERIVLSGVYKNPEDMEYVLSTYGGKGVYTVESLQHL QILNDTAVRLGMKITVLIRVTSGNQFGVDEADIRKIISDRTDYPGVEIEGLQFYSGTQKK DLSQMKTELEHLDEFIGELKSESGFEAQVLEYGPGFFVPYFKKDKSEDVENILSEFRALL ESLNFKGKVVLEMGRFLAAACGYYVTSIVDMKVNKEQPYVIMDGGINHLNYYGQAMAMKQ PYCTQLDTEGNEKTGGEEESWNLCGALCTVNDVVVKRFPLHKPQLHDILVFERVGAYSVT EGIYLFLSRPLPRIYFWTEGEGLRMVRDGVHTDLLNSEK >gi|226332851|gb|ACII01000168.1| GENE 3 2319 - 3839 1537 506 aa, chain - ## HITS:1 COG:Cj1307 KEGG:ns NR:ns ## COG: Cj1307 COG1020 # Protein_GI_number: 15792630 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Campylobacter jejuni # 1 500 1 501 502 375 38.0 1e-103 MKTSILEYLEESARKYPDKTAFADEHTSCTFKELEQTARRTGTALAKHFTPRNPVPVFME KSVETIGVFMGAVYAGCFYVLMDTKQPASRLQQILGILDSDVIVTSEKYLKDLEKLEFKG KVLMAPDLAAEQENEAVLNEIREQALDVDPLYGIFTSGSTGVPKGVVVGHRSVLDFIDCF TELFDITDKDVMGNQAPWDFDVSVKDIYSGLKTGATVQIIPKQLFSFPMKLIDYLIEREV TTLVWAVSALCIISTLNGFEYKVPDKIKKILFSGEAMPVKHLNIWRKYLPDVMYVNIYGP TEITCNCTYYIVDREFQPGDVLPMGKAFPNEKVFLLDEEDKLITEKNKNGELCVTGSALA 
LGYYKNREQTAKVFIQNPLNDRYLEPMYRTGDLAYYNERGELCFATRKDFQIKHMGHRIE LGEIEAAMDKVPEIVRSCCIFDTVKSKIVAFYEGDIERRPLAKALGQYVPAFMVPNVFRQ VESMPLTKNGKIDRKALTAMYQEEKK >gi|226332851|gb|ACII01000168.1| GENE 4 3912 - 4892 991 326 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00118 NR:ns ## KEGG: EUBELI_00118 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 4 319 6 331 358 132 29.0 1e-29 MHNFKKVLKTFLVLVVCVALLAALDLALYPCTFMRNDVHAVVTQQFDDIIVGTSHGKMDI DPEVMQEVTGRSGHNLCVGGEYGIDAYYLTKLIKEKQNPKRIIYEVDPGYFVSEKEEGNN YLLFYHEFPFSKAKVEYFWNSIAKCNFRTVLFPWYEYSLSYELPKIKDTFTQKVTGDYDV SHLKSDSQEYHESGFIERYPVDVTKLKKSEPKFYEEGKVNEENMEYLKKLITFCKENDIE FVAVTTPIPINTLKDYSDSYNAAWKYFGQFFEEQGVEYLNFNTQYFKAFTHDLKAYTDYD GHMNGDAAKEYSEVLAQVLESAGQRK >gi|226332851|gb|ACII01000168.1| GENE 5 4879 - 6390 1216 503 aa, chain - ## HITS:1 COG:PA3548 KEGG:ns NR:ns ## COG: PA3548 COG1696 # Protein_GI_number: 15598744 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Pseudomonas aeruginosa # 18 410 19 384 520 196 31.0 9e-50 MSLVSNLFLLFVAVSVLVYYIVPHKFQWLVLLCFSYIYYLAGGVRYVGFILFSTLVTWGI ALAVEKAEAGGSHKSARNFLVLGLILNFGMLGVIKYTNFMIENLNALFHMNLRGMEILLP LGISFYTFQSSGYLLDVYWKRCDAERNPVKYALFVSFFPQILQGPIGRYSRLAHQLYEPH KFEGKNITRGFERILWGFFKKMILADWAAVFVDAIFDNPDQYSGLAIFGVLFYTVQLYAD FSGGMDVVIGIASMFGIELDENFKRPFFSISITDFWHRWHITLGTWMKDYVFYPVTLSKW MGKFGKWGKKVFGKKTGRTLPICLANLIVFFVVGVWHGAAWKYIVYGMYNGIIIAFSGLM AEHYRNWKKKFNITGKENWYHVFMIIRTFILVNISWFFDRADTVGQAFHMMKLSVTKFAP SQLLLIPAGKEGTAFTPYALAILAAGCIILFIVSVLQERGMKIRESLAGLSLPITVAIYF CLLASIGFFGSTAVARGFIYAQF >gi|226332851|gb|ACII01000168.1| GENE 6 6531 - 7478 1213 315 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581247|ref|ZP_04858503.1| ## NR: gi|253581247|ref|ZP_04858503.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 253 9 261 323 288 100.0 5e-76 MKKKVVTAFMMAALGSSLVFTQAVMAADTTAKTETSEDADKDATESEDKDVAEDADKADE EKEDLKTIGEKTEEDTEGVFSVKLKNLTGKVITGVTVKNEKDEEYPDNFLKEEDKFEADE ERVLWFDPSAKDKADEENKDSSNDTKETTDTEETKTEDAKTDEKAEGEDKDDAEKEIPKY NIQLTFEDGTTAEIHTFPFGDTEEAELKLEGEVAYLVFDSISQKKEYNTLETEKALAPKP TEAAAQAASETTQSYDNSSDYSNDYDYNNDYDYSNDYDYSNDYDYGNDDSGDYDGSDDGG SDDGGDGCLDDGLLN >gi|226332851|gb|ACII01000168.1| GENE 7 7594 - 8382 887 262 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581248|ref|ZP_04858504.1| ## NR: gi|253581248|ref|ZP_04858504.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 262 1 262 262 446 100.0 1e-124 MNRKKITALLLSSVMALTPAMPAMAEQSTIVNYVDGTTDSTVETPDVTPTPLPEKKEGWE TVDGVKYYYVNGEKITNKVKKIGKYTYCFDKTGKLVTNKPYYKVNAKTYYKIKKNGKATK LSTVETMAAVRLQKCKGNLKKAFNWSVSLQYAGNVKVSKKTPTEYGLYGFKTGSGDCYVM AATFYWMAKVAGYDAHYVKGYFQKSGGKKGAHAWVEIDQKVNGKKKTYVYDPNFQKEYKL NGYKLTYGAKRTLKYVNYKRVN >gi|226332851|gb|ACII01000168.1| GENE 8 8480 - 9148 752 222 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581249|ref|ZP_04858505.1| ## NR: gi|253581249|ref|ZP_04858505.1| predicted protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 222 1 222 222 389 100.0 1e-107 MKKTKRLLAVVLCMLLMLTPLAVVAETVTVQAAEPQTVKVKLDKKTGKRYGYDENNQKVT QQWGVTAKGFRYYFGKNGAAYQADQDMVGKYGILMKKINGKYYGFDVSGHTVKGIRVGSV SMYEIPKLYYFNPKTGAVDKKKTSLYRKYAATSTLAKQNNASKIKKVLGKYKKCTISKGN TCMLDGNGKDVTYTYDYVQLNVVRPTGKGSSAEVVASITVRR >gi|226332851|gb|ACII01000168.1| GENE 9 9166 - 10152 839 328 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 52 193 496 628 693 67 29.0 4e-11 MKRKRFTALKVTLTAMMLTAALTFGAGSAYAADLSQEGGSEPTVTITPSPEVTPELPLPT STPKPTPVPKNGLLKEGKVYRYYVNNKPVRNKWKKINGKYYWFKSNGVAAHDGHYKIKGV FYLFNKNAQRIIPGKKSIVKVNGVKYFVDAKGRPVTGWNEFNGRMYYVHKNGKCATNETI GGIRFNKNGYASNLTQARCKLAARNFIARHSNANASNYEKFRSCFYYIMAYTNFVGYMDP TPQEFKTKDWVYKYSLQMFQNGLTGNCYGIASSVAAIAKELGYEPYVITIPDGHSFVMIN GLYYDNMYGTLFGAATRPAYTIEHKIKF >gi|226332851|gb|ACII01000168.1| GENE 10 10588 - 11241 591 217 aa, chain - ## HITS:1 COG:CAC3231 KEGG:ns NR:ns ## COG: CAC3231 COG0637 # Protein_GI_number: 15896477 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Clostridium acetobutylicum # 1 210 5 214 215 142 38.0 5e-34 MKAAIFDVDGTLLDSMSVWEDIGERYLASQNITAEKNLRAALHTMSLEQGAAWMKEKYQL DKSISQIIEEVLKIVSDFYRFEAPLKPGVKKTLEWFKERNIQMTVATSGNRELTEAALAR NGILDYFEQIYTCTETGAGKDEPLIYLKAAESMQAEPNETLVFEDALHAAETAKKAGFVV IGVYDEENRKNISKMKEVCDCYCDRMDAAIENIRLAI >gi|226332851|gb|ACII01000168.1| GENE 11 11241 - 11591 511 116 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20567 NR:ns ## KEGG: EUBELI_20567 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 114 1 114 114 134 63.0 1e-30 MTVKECYEQMGADYEGVLGRLRSEVLIKKFAKKFLDDGSFRSLKDNLAQKNGEEAFRAAH TLKGVCQNLGFDNLYKASFDITEKLRGRDTEGCEELLAKVEEQYNNTVDAIHMMED >gi|226332851|gb|ACII01000168.1| GENE 12 11950 - 13458 1869 502 aa, chain - ## HITS:1 COG:TM0571 KEGG:ns NR:ns 
## COG: TM0571 COG0265 # Protein_GI_number: 15643337 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Thermotoga maritima # 93 419 30 337 459 172 36.0 1e-42 MMNKDNRNDKIRKIAKKGLTLSLCAVLAGGLAAGSFEGVNKLAGWSGATTVEAASNKDET TLTYAKSEKKDADTSDSKSDTGKDTGSTAKGNLDVSEIASEALPSIVSITTKSVQEVQNY FGMYGMYGYAPQQQEQEVEGSGSGIIVGKNDDELLIATNYHVVEGADTLSVAFTDGNAVE ASVKGFDEERDLAVVSVSLDDVKDDTMDAISIAKIGSSDDLKVGEQVIAIGNALGYGQSV TTGIVSAKNRRMDSDNNTVTDGSDDSSDGVNLIQTDAAINPGNSGGALLNMEGEVVGINS AKLASTEVEGMGYAIAISDVTDILQNLMNETSRDKLDDSEHGVLGIEGSSVSSEAVQMYG IPAGVFVKKVTEGGAADKAGLKANSVITEFNGKTVSSIDQLSEYLSYYEPDEEVELTVQV PHGTSYKEETVKVTLDENTDADDSDDNDKDSKKSKKDSKKSSKDADEDVDEDTDSEDSMD SDDTEESENPFIQYFENQGFFR >gi|226332851|gb|ACII01000168.1| GENE 13 13647 - 16136 1858 829 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_0344 NR:ns ## KEGG: GYMC10_0344 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 387 824 434 900 1277 137 27.0 1e-30 MKKWKKFIAMGLAVSMIVPSCIPQTLWASEFSAGSVESAADVFEDGAEYETGSEAGNKIS DENNSGDEFGTGELKDQEPEQSETEDSVNAGQSAEGYNILLYSDGKLEKTYVIPYADADT AKVPEALRGKTGYIFKEWNTKEDGTGETYKPGDSIKKLLETADMVKQAQETSERTEVGEE IKPEKDVMQEENDVAQTDSASWEEKTDKAEMQISDTENDGTSPDISTDKAIAAGDTTAIT LYAIWEKASEYKITYKLNKGKNNTANPKTYTSEDEIKFKKPTRSGYHFVGWYTDSKYKNQ ISVIEKGSEGSLTLYAKWTKEISPSAKAASLDYVKGTKANTITVSATVSNYVKSSDGYYY LVYVDSNSGKVKKTVGKVKKPEKAKGKITFKLNISGHPEYAQGKFAIGIKKSKSAYSVIS PKSYVSNPEKLSTNTAAYFVPGTKKGIQATDINELTDTKSKTVFFNLYISDLMRKDSGVE TYKYNGKTYHFNGLYGYVYLVQQCNAKGIQVTAQISIDRNASTQSFITGNSPYAETAYYG WNTDNSTTRQTMEAMFAYLGEKFGKNNCYISNWILGNEVNSASGYYYVGNVSFSKFISMY SEAFRCLYNAVKSSRGSSKVFICLDNCWNQKNAFTICYSARSTLESFAAKISDMQKDVNW NLAYHAYNQPLSDSQFWSGANASMFTSDANTTTFITMRNIQTLTDYVKNRFGSNTRIILS EQGFSSTYGGQANQAAAIALAYYKAACNPMIDAFIIRSYKDEAHEVAQGLAMGLKDANGK KKTAYNVFKNMDSSNSLKYTEKVLKSQVGNWKSLVPGYSTGKISSMYRK >gi|226332851|gb|ACII01000168.1| GENE 14 16533 - 17102 581 189 aa, chain 
- ## HITS:1 COG:alr2975 KEGG:ns NR:ns ## COG: alr2975 COG0681 # Protein_GI_number: 17230467 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Nostoc sp. PCC 7120 # 3 188 2 190 190 120 35.0 1e-27 MENEEKNQTEQSDKSRSVWRELWDYAKIIIAVFVIAFLLGHFVYINARVPSGSMEETIMT GDRVFGNRLAYIKDDPERFDIVIFKYPDDPSQLFVKRVIGLPGETVNIVDGKVYINDSEE PLDDSFCPETPEGSFGPYTVPEGCYFMLGDNRNHSMDSRYWQNPFVEEDAIEAEVAVRYW PLNKIGTVK >gi|226332851|gb|ACII01000168.1| GENE 15 17684 - 19243 1392 519 aa, chain - ## HITS:1 COG:L162604 KEGG:ns NR:ns ## COG: L162604 COG0436 # Protein_GI_number: 15672142 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Lactococcus lactis # 114 517 1 404 404 593 67.0 1e-169 MNNIFAERLKQAMEQKNMKQIDLVKKAAEQGVKLGKSHVSQYLSGKTTPRSEILNFLATT LGVETEWLKGTDVSVDTLKKETNEAGIQMENMKFDYNYNNMKENTRETVEEVQVREFKKS SKLNNVLYDVRGPVVEEAARMENAGTQVLKLNIGNPAPFGFRTPDEVIYDMRQQLTECEG YSPAKGLFSARKAIMQYAQLKKLPNVSIEDIYTGNGVSELINLCMSALLDNGDEILIPSP DYPLWTACATLAGGKAVHYICDEQAEWYPDMDDIRKKITSRTKAIVIINPNNPTGALYPR EVLQQIVDIAREHHLIIFSDEIYDRLVMDGAEHISIASMAPDLFCVTFSGLSKSHMIAGF RIGWMILSGNKAVAKDYIEGIKMLSNMRLCSNVPAQSIVQTALGGHQSVESYIVPGGRIY EQREFIYNALTDIPGITAVKPKAAFYMFPKIDTKKFNIVNDEKFALDLLKEKKILLVHGG GFNWHQPDHFRVVYLPRIEVLKEAAGKIRDFLEYYQQEI >gi|226332851|gb|ACII01000168.1| GENE 16 20251 - 21498 1361 415 aa, chain - ## HITS:1 COG:MJ0499 KEGG:ns NR:ns ## COG: MJ0499 COG0065 # Protein_GI_number: 15668676 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Methanococcus jannaschii # 1 413 1 421 424 392 47.0 1e-109 MGETVIEKIIRNNVGKTVKPGDIVTVNVDRVMIHDIFIPFVAEKFEEMGFTKLWDPDKAV LIYDHLVPASQLDDTRHFHAGDAFARKYGMKNVHRSDGICHQLMTEAGYVKPGNIVFGTD SHTTTYGCVGAFSSGIGYTEMASILGTGTMWIKVPETIKVVIEGELPGNVMSKDIILRLI GDLGADGATYRALEFTGSTVKNMTVASRMTMANMAIEAGAKCALFTPDEKTAEYCDIELN EFQKSLTGDEDATYMKTITYRAEDFVPVMACPSQVDKIKNVSELEGTEIDQVFIGSCTNG 
RLEDLRAAAEVLKGKKVADYVKLIVTPASRKIYRQAIAEGIMDTLAEAGAMITHPGCGLC CGRAGGILTDGERVVATNNRNFLGRMGTSKVEIYLASPKTAAACAIAGKIVNPEK >gi|226332851|gb|ACII01000168.1| GENE 17 21517 - 22008 671 163 aa, chain - ## HITS:1 COG:AF0629 KEGG:ns NR:ns ## COG: AF0629 COG0066 # Protein_GI_number: 11498237 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Archaeoglobus fulgidus # 6 159 2 157 161 152 49.0 2e-37 MEKFTGKVWVLDDDIDTDIIIPTEYLALKTIDDMKQYGFSPLRPELAGQIQKGDIIVAGK NFGCGSSREQAPEIIKALGIQCVIAKSFARIFFRNSINNGLLLIEQPDLHDDIKEGDEIT VVMNEHVDYNGKQYPIASLPENLMSIIQAGGLVKAMRKLNGLD >gi|226332851|gb|ACII01000168.1| GENE 18 22277 - 23527 1408 416 aa, chain - ## HITS:1 COG:aq_940 KEGG:ns NR:ns ## COG: aq_940 COG0065 # Protein_GI_number: 15606262 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Aquifex aeolicus # 1 416 1 423 432 398 49.0 1e-110 MGMTIAEKIIAAAAGVDSVKPGDIHTVKLDRLMSNDGTTHLTVDMYNNKLKNPHIADTSK LVFIVDHNVPSDSPKTAASQKKMRDFAKEHNIDFWEGKGVCHQVMIENYVRPGELIFGAD SHTCSYGALGAFGTGVGCTDFLYGMVTGTSWVLVPESVKFNLTGKLPEGVYARDLILTII GEIGANGCNYQAMEFTGEGAKTLSISDRMALCNMAVEAGAKTGIFEADEKAIEYLKEHGR EPKAVYHSDPDAVYAREYTFDLSKVRPVVAKPDFVDNVVPAEEACGIEINEAFLGSCNNG RIEDLRVGAEIIKGKKVAEGVRFLVVPASQTIYRQALKEGLIDTFMEAGAIVMNPNCSVC WGSCQGVIGENEVLISTGTRNFKGRAGHPSSKVYLGSAATVTASAITGKIALASEV >gi|226332851|gb|ACII01000168.1| GENE 19 23529 - 24011 567 160 aa, chain - ## HITS:1 COG:MJ1277 KEGG:ns NR:ns ## COG: MJ1277 COG0066 # Protein_GI_number: 15669463 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Methanococcus jannaschii # 4 158 7 163 168 159 48.0 2e-39 METGTIFKFHNDLDTDQIIASQYLLLPNLDEMKGHAFESLDPDFSKKVKPGDFVVGGENF GCGSSREQAPGVLKALGVQAVVAKSFARIFYRNSINIGLPVIVCKDLPDEVNTGDKMVLH MSEGTVEANGKTYSCTKLPEYMQNILNQGGLIASLNKEEK >gi|226332851|gb|ACII01000168.1| GENE 20 24171 - 25361 1154 396 aa, chain - ## HITS:1 COG:lin2068 KEGG:ns NR:ns ## 
COG: lin2068 COG1015 # Protein_GI_number: 16801134 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Listeria innocua # 4 395 5 393 394 440 55.0 1e-123 MKNYDRIFVIVLDSLGIGAMPDSAKFGDIGVDTFGHILEKMGSLEIPNLRKLGMLNLHPA GKMAGVENPKGRYTHLGEASNGKDTMTGHWEMMGIKTEKPFKTFTETGFPPELIAELEKR CGKRVIGNKSASGTEIIEELGEEEIQTGAMIVYTSADSVLQICGNEETFDLQNLYRCCEI AREITMKDEWRVGRVIARPYIGKKKGEFKRTSNRHDYALKPTGLTTLNVLKDHGFDVIGV GKIHDIFCGEGITETHHSDSSVHGMEQTIEICKRDFRGLCFVNLVDFDALWGHRRNVEGY GKEIEKFDKNLGVLLEEMRENDLLILTADHGNDPTYTGTDHTREYVPFIAYSKRMKNGGA IKDEDTFAVIGASVAENFGVPMPEGTIGHSVLKELM >gi|226332851|gb|ACII01000168.1| GENE 21 25483 - 25926 380 147 aa, chain - ## HITS:1 COG:CAC1544 KEGG:ns NR:ns ## COG: CAC1544 COG0295 # Protein_GI_number: 15894822 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Clostridium acetobutylicum # 10 141 4 130 132 132 51.0 2e-31 MQNLEKSMISKLIETAMEQLKFSYTPYSNFKVGASLLTKNGQIYTGCNIENAAYTPTNCA ERTAIFKAVSEGVREFDTICIVGGKDGILTEYTAPCGVCRQVMMEFCDPETFRIILAIDK EHYDIYTLKDLLPLGFGLRNLINSREI >gi|226332851|gb|ACII01000168.1| GENE 22 25945 - 26775 766 276 aa, chain - ## HITS:1 COG:SPy1869 KEGG:ns NR:ns ## COG: SPy1869 COG2820 # Protein_GI_number: 15675688 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Streptococcus pyogenes M1 GAS # 16 274 1 259 259 379 69.0 1e-105 MATDVSRENEKDGMIMKSYSEDVNRQYHIQVAKGEVGRYVILPGDPKRCVKIAQYFDNPV LIADNREYITYTGTLDGVKVSVTSTGIGGPSASIAMEELYRCGADTFVRIGTCGGMQTEI KSGDIVIATAAVRMEGTSREYAPIEYPAVANLDVTNALVEAAKEKGFIYHTGVVQSKDSF YGQHEPEAMPAGYELINKWEAWKRMGCLASEMESAALFIVAGKLRARMGSCLLVLANQER EKLGLENPVVHDTDMAIRVAVEAIRRMIKEDQGQVL >gi|226332851|gb|ACII01000168.1| GENE 23 27089 - 27553 300 154 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0506 NR:ns ## KEGG: EUBREC_0506 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 21 140 22 141 146 80 35.0 2e-14 
MYFWDKHKAITSYYELLSGEVCDRYELTQMEYDILMFLHNNPQLNTAAEIVKIRKSTKSH VSTSLKKLENRGFVKRIQSEDNKKHIEIVLLDRAALIVEAGLNAQKQFAQDVLSGLTKEE RHMCIKVFDKICNNAEEHLREYKRSMFNNERKQL >gi|226332851|gb|ACII01000168.1| GENE 24 28099 - 28302 129 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581265|ref|ZP_04858521.1| ## NR: gi|253581265|ref|ZP_04858521.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 67 18 84 84 102 98.0 5e-21 MLNEVYTNKATTEYRETKLANLRVIIRKPYCEIDTENVATLQFLDLIKEVVDISEVDGEE LTNRLII >gi|226332851|gb|ACII01000168.1| GENE 25 28527 - 29780 991 417 aa, chain - ## HITS:1 COG:FN1715 KEGG:ns NR:ns ## COG: FN1715 COG1373 # Protein_GI_number: 19705036 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 1 416 1 429 430 395 52.0 1e-110 MEIKRDAYLQQLIERKDNGMIKVITGIRRCGKSFLLFTIFKRYLLENGIDVDHIIEIALD GIENEELRDPKVCFKYIKDAMKDDGKYYLLLDEVQFMSRFEEVLNSLLRMSNIDVYVTGS NSKFLSSDIVTEFRGRGDEIRIYPLSFAEFYSVYDGDYDDAWDDYMIYGGLPQIVSFQSE RQKADYLKNIFANVYLRDVIERNKIQNVDEIGILVDVLASAIGSPTNPSKIANTFASERQ MTYANKTISNHIDHLTDAFLISKASRYDIKGRKYIGANLKYYFTDLGLRNARLNFRQQEP THIMENIVYNELLIRGYNVDVGVVDIFDKDKEGKRVRKQLEVDFVVNQGNQRYYIQVAYD MTSEEKQTQEFNSLRNIPDSFKKIVIVNGSKKPWRNDEGFVIMGMKYFLLNANSLEF >gi|226332851|gb|ACII01000168.1| GENE 26 30095 - 30427 251 110 aa, chain - ## HITS:1 COG:no KEGG:Apar_1207 NR:ns ## KEGG: Apar_1207 # Name: not_defined # Def: hypothetical protein # Organism: A.parvulum # Pathway: not_defined # 10 110 3 103 103 69 44.0 4e-11 MKLADLATGSDWIVWIVFAIFAVLSIILLSGHGSWFISGYNTASNEEKEKYDEKKLCRTM GIGMSIIAILALTMGLFENILPAFFVYIALGIILVDVVVIIILGNTLCRK >gi|226332851|gb|ACII01000168.1| GENE 27 30489 - 31013 484 174 aa, chain - ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 2 118 4 123 179 80 39.0 1e-15 MKIVIINGSARKGNTLAAINAFIKGASEKNKIEIIEPDKLNIAPCKGCGVCQCSKGCVDK 
DDTNSTIDKIAAADMILFATPVYWWGMSAQLKLIIDKCYCRGLQLKNKKVGTIVVGGSPV GSIQYELIDRQFDCMAKYLSWDMLFKKSYYATARDELEKNKDSMNELEWIGKNL >gi|226332851|gb|ACII01000168.1| GENE 28 31038 - 31223 208 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153811831|ref|ZP_01964499.1| ## NR: gi|153811831|ref|ZP_01964499.1| hypothetical protein RUMOBE_02224 [Ruminococcus obeum ATCC 29174] # 1 61 1 61 61 116 100.0 4e-25 MAKKEERFEVTFRDGSQLKDEGVRQILVDKETGVNYLCWKSGYGAAITPLLDSEGKVIVT K Prediction of potential genes in microbial genomes Time: Sat May 28 21:10:36 2011 Seq name: gi|226332850|gb|ACII01000169.1| Ruminococcus sp. 5_1_39B_FAA cont1.169, whole genome shotgun sequence Length of sequence - 2924 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 50 - 541 268 ## gi|253581271|ref|ZP_04858526.1| conserved hypothetical protein + Prom 643 - 702 6.2 2 2 Op 1 . + CDS 733 - 1689 602 ## COG4271 Predicted nucleotide-binding protein containing TIR -like domain 3 2 Op 2 . + CDS 1649 - 2620 218 ## COG0535 Predicted Fe-S oxidoreductases 4 2 Op 3 . + CDS 2648 - 2839 238 ## gi|253581274|ref|ZP_04858529.1| predicted protein Predicted protein(s) >gi|226332850|gb|ACII01000169.1| GENE 1 50 - 541 268 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581271|ref|ZP_04858526.1| ## NR: gi|253581271|ref|ZP_04858526.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 61 163 1 103 103 203 99.0 3e-51 MHHDLDIHPIPEKVLYINDLSIADTAFGYETKKQHQKAITGKIHANGGILADDNVRIYIP LDLNADQILSRLQQIYRVMGNPNDMNEMEFSCEVGKIISQLEIYDQVWVARDLKNAVRIE ERIHSTKGIELTKKIINVLMEDEGCAECFPYDVVDELKAEFGI >gi|226332850|gb|ACII01000169.1| GENE 2 733 - 1689 602 318 aa, chain + ## HITS:1 COG:CC2702_2 KEGG:ns NR:ns ## COG: CC2702_2 COG4271 # Protein_GI_number: 16126935 # Func_class: K Transcription # Function: Predicted nucleotide-binding protein containing TIR -like domain # Organism: Caulobacter vibrioides # 2 116 1 116 151 60 35.0 3e-09 MRIFIASSGERLNMAIEVGSWLESMNVEPVLWNEPEVFPLGVYTWDALIDLSKNVDAAVF IFGEDDKIWFRGEGTMTVRDNVLLEYGLFSGKLSKVNTIFLCEGRPKLASDLEGITWGDV GKKNSAKIKLKNWINSIRTNNQKKVEGFKITDLNDAFQITLKNQKIKKLRIFAISTFKSV QMFRLINDISINHAEILLREYDNKNDEYYDQSMEGAIDNAILNWKHMIENDHKINKLDIF RFNYHPDEGYYIFDDKYLILGNLKYEMEKKEYTFSKNVMLIDNQTETGRTWIKEYIERFE DTKKNYETSVMQYCDTES >gi|226332850|gb|ACII01000169.1| GENE 3 1649 - 2620 218 323 aa, chain + ## HITS:1 COG:PH0827_2 KEGG:ns NR:ns ## COG: PH0827_2 COG0535 # Protein_GI_number: 14590694 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Pyrococcus horikoshii # 8 121 24 140 426 63 31.0 4e-10 MRHQLCNIVTLNLNTTCNLHCKWCYNQEKQHKLLKFELFVKFYDDIIKNNVTSMTLIGGE PTIHPQFVEILRKLEKQEVHLFTNAIRFSEKKFCEDVCKQKNLKDITISIKGFNEKSFKE FSDEISFVEFCSAIENLNEKGTFIRVNYNCTENLSQSMVSDFVYFLRYYGLHEIVLHDIR PYITESGELIKKHSMEPLQLIAQELEKAGIVPYIRLNQPFCRYNPTFLKQFLDSKRVMAT CAVKKQQGIFVDPDLNIILCNELRHIIMGGYQRDFWDYKSMLDVYNKQDTVRLYNKLEGC PMKKCVKCDMWEKCGGSCILHWL >gi|226332850|gb|ACII01000169.1| GENE 4 2648 - 2839 238 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581274|ref|ZP_04858529.1| ## NR: gi|253581274|ref|ZP_04858529.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 63 1 63 63 113 100.0 3e-24 MKRELKGKIEKLVVYETEREYQGTFIDSNFVSGDCGAGGPGRVKEKSKGSKLLYCGDCGP TSH Prediction of potential genes in microbial genomes Time: Sat May 28 21:10:49 2011 Seq name: gi|226332849|gb|ACII01000170.1| Ruminococcus sp. 
5_1_39B_FAA cont1.170, whole genome shotgun sequence Length of sequence - 1188 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 39 - 842 389 ## CLI_1846 hypothetical protein + Term 975 - 1007 1.0 Predicted protein(s) >gi|226332849|gb|ACII01000170.1| GENE 1 39 - 842 389 267 aa, chain + ## HITS:1 COG:no KEGG:CLI_1846 NR:ns ## KEGG: CLI_1846 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_F # Pathway: not_defined # 1 256 1 250 263 81 28.0 2e-14 MNLYKRDYIKSVPTKQEMQLYEGILCTERKAVGDRGFGKINHRRLYPAAVRHYDSLFPNN HIELFDFQNEGNMEQLNEEFCALIHDANTNGRNVLRFINHRPAYHIIAGVFKYYNFGHHD AYVFPEFALGKYIADYLLIGKSSGGYEFVFVELEHPNGRTTLKSGHEGETFRKGTYQIYD WKAEIEAHFSASFVTITKYSNKSSLPKEFSEYDSSRFHYAVVAGLREDYNEATYRDRRNK VTQQNILTLHYDNLYDKACELETAQSF Prediction of potential genes in microbial genomes Time: Sat May 28 21:10:53 2011 Seq name: gi|226332848|gb|ACII01000171.1| Ruminococcus sp. 5_1_39B_FAA cont1.171, whole genome shotgun sequence Length of sequence - 2251 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - LSU_RRNA 58 - 1224 93.0 # CP000885 [D:78327..81226] # 23S ribosomal RNA # Clostridium phytofermentans ISDg # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. - LSU_RRNA 1647 - 2250 91.0 # CP001107 [D:12954..15836] # 23S ribosomal RNA # Eubacterium rectale ATCC 33656 # Bacteria; Firmicutes; Clostridia; Clostridiales; Eubacteriaceae; Eubacterium. Prediction of potential genes in microbial genomes Time: Sat May 28 21:10:58 2011 Seq name: gi|226332847|gb|ACII01000172.1| Ruminococcus sp. 
5_1_39B_FAA cont1.172, whole genome shotgun sequence Length of sequence - 1542 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - SSU_RRNA 40 - 1529 99.0 # EF403532 [D:1..1493] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. Prediction of potential genes in microbial genomes Time: Sat May 28 21:10:59 2011 Seq name: gi|226332846|gb|ACII01000173.1| Ruminococcus sp. 5_1_39B_FAA cont1.173, whole genome shotgun sequence Length of sequence - 2560 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 277 - 1452 506 ## EUBREC_0065 hypothetical protein 2 1 Op 2 . - CDS 1455 - 2297 212 ## COG0582 Integrase - Prom 2445 - 2504 3.5 Predicted protein(s) >gi|226332846|gb|ACII01000173.1| GENE 1 277 - 1452 506 391 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0065 NR:ns ## KEGG: EUBREC_0065 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 377 1 376 383 540 64.0 1e-152 MDKSCTIQDVFERFYPSYEKRSNPPAHHRKTAYHIRNCKTGVFGVNISVCEDCGCISVHN NSCRSRCCPMCQEFPKEKWIDAQKENVLDAPYYHVVFTVPEELNSIIYSNQKLLYDALYH AASATLNELSKDAKYLGADIGYICILHTWGSAMNYHPHIHTIVLGGGLDDENKWKETGGN FFLPYGVISKVFRGKYLDELKSLWNDSKLKFHGTAEKYQNSYRFKELLDKCYEKNWVTYC KETFNGAQSVINYLGKYTHRIAISNQRIKSMKDTTVTYAVKDYKNEGCWKEKTISGEEFI RRFLMHVPPKQFVRIRHYGLLSCRNKRKKITHCRNLLGCKKYISTLKNLNAIEMIKVLYH IDVCKCSSCGGNMVSPRKDKYSSSFCAHMRC >gi|226332846|gb|ACII01000173.1| GENE 2 1455 - 2297 212 280 aa, chain - ## HITS:1 COG:mll9328 KEGG:ns NR:ns ## COG: mll9328 COG0582 # Protein_GI_number: 13488149 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 16 275 21 281 299 159 34.0 8e-39 MYEEIFNQIRSAANKRNLKDSTIHAYCTSVAHFLNRTAKDIDALTTDDVDIFLTEKKLSG ISPETYNHYHSGIRFFYKKILKKNWDDDDIPRMKRDRKLPTVLTKAEISAILDATPNLKH 
KAMIATMYSGGLRVSEVTHLHYDDISRTNKTIHIRDGKSRSDRYTLLADRTLEILTEYWF QCGRPRGILFPSSWTGDYLTKDSVIQFFRESAERAGIQKHVSTHCLRHSFASHLFESGCD IKYIQALLGHRDPKSTEIYLHVSNKTLLGIKSPFDEMGGE Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:04 2011 Seq name: gi|226332845|gb|ACII01000174.1| Ruminococcus sp. 5_1_39B_FAA cont1.174, whole genome shotgun sequence Length of sequence - 1708 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 83 - 137 14.0 1 1 Tu 1 . - CDS 145 - 1503 1470 ## COG3594 Fucose 4-O-acetylase and related acetyltransferases - Prom 1645 - 1704 3.9 Predicted protein(s) >gi|226332845|gb|ACII01000174.1| GENE 1 145 - 1503 1470 452 aa, chain - ## HITS:1 COG:lin0994 KEGG:ns NR:ns ## COG: lin0994 COG3594 # Protein_GI_number: 16800063 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose 4-O-acetylase and related acetyltransferases # Organism: Listeria innocua # 6 331 2 330 343 110 29.0 7e-24 MKIKKERAAVPYSCLIDNIKVLLIFLVVFNHIIAFNLVKVDTVVRYVWYAITIFHMPAFV FISGYLSKKPQNVLKNFKNLLIPYVLGYTLTWYSQIWLGRSVDYEILRPTGSVMWYILAL FIYRLTIEALGKIRFIVPLSILFALWAGTRPEFTTFLSSSRIVVFFPFFVAGYLWKSEYI TAIRKFKGKWILVAISGVLLWAIPNYMIPNEMGIAIFRGNHGYQLCGLTDPQGVILRLLM YLVSFVVVYTMLALVPDIKLPLTYVGRHTMGIYFFHYPIMIIMNGLYILMLPAMNNVWVL LGVSLVFVLVLGSLPVDLLYTGVLNLIAFILIKKDKTVRDEGLEEEYDSEYDEYELLRRK KAIAELAATLDSESEQDGHVEKVNMDEDTPDHTSSDSHEGFSNVAGMELEIEEDNLDDIP DELIKGEEELSLEDLIQELEATTRTMDDNKEP Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:05 2011 Seq name: gi|226332844|gb|ACII01000175.1| Ruminococcus sp. 5_1_39B_FAA cont1.175, whole genome shotgun sequence Length of sequence - 1665 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 156 - 506 268 ## COG0464 ATPases of the AAA+ class 2 1 Op 2 . 
+ CDS 564 - 1538 310 ## gi|253581284|ref|ZP_04858536.1| predicted protein Predicted protein(s) >gi|226332844|gb|ACII01000175.1| GENE 1 156 - 506 268 116 aa, chain + ## HITS:1 COG:BS_spoVK KEGG:ns NR:ns ## COG: BS_spoVK COG0464 # Protein_GI_number: 16078805 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Bacillus subtilis # 1 112 205 317 322 67 37.0 8e-12 MKKFFDSNPGLKSRFNTFIEFEDYSVDELDDILVGMCVNNDYELSEDAKACAHSYLNNKV ETKEDNFANGRMVRNLYEDLVMNHARRVFNINAPSKADLKTILDVDFKFDSDVKTD >gi|226332844|gb|ACII01000175.1| GENE 2 564 - 1538 310 324 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581284|ref|ZP_04858536.1| ## NR: gi|253581284|ref|ZP_04858536.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 324 1 324 324 645 100.0 0 MIVAFMDLLGFSNLLRNNMEVAIDNLNTFNNVIKTRVIDNKCHPIEEYRENYPNDVKFQQ FVEKSSVTSFEQMISFSDSLVLGGTDCNMFIKQLMNFVATVYIEYSEPFQKNFSDINQVT TYKVAEGCGSGSIQYHKAFPILFRGGLSVGNDVTFWDEYCINDSDFKISSRNVMGLTYLN AVKLESVGKGPRLFCDKSLVDAVDDEINKYIKLVDSEKEIYEIVWTIEGCEATGCCSSNK WSNVINRINDKMLPAAINLYKYYQKDKGLEPQYKELLNLVCVGIVKYAKDECNRENEAIH YINQVFQEKHIQVIDGSLLEDFIG Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:17 2011 Seq name: gi|226332843|gb|ACII01000176.1| Ruminococcus sp. 5_1_39B_FAA cont1.176, whole genome shotgun sequence Length of sequence - 1613 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 71 - 110 -0.8 1 1 Tu 1 . 
- CDS 127 - 1335 864 ## COG3547 Transposase and inactivated derivatives - Prom 1525 - 1584 3.0 Predicted protein(s) >gi|226332843|gb|ACII01000176.1| GENE 1 127 - 1335 864 402 aa, chain - ## HITS:1 COG:FN1676 KEGG:ns NR:ns ## COG: FN1676 COG3547 # Protein_GI_number: 19704997 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 397 1 384 391 97 23.0 5e-20 MNAVGIDVSKSKSVVAIMRPFGEIVSTPFEIKHTASDIHSLIELINSVEGESRIVMEHTG RYYEVLAHQLLEANLFVSAINPKLIKDFDNDSLRKVKSDKADAVKIARYALDKWQNLKQY NVMNELRNQLKTMNRQFGFYMKHKTAMKNNLIGILDQTYPGVNTYFDSPARSDGSQKWVD FASTYWHVDCVRKMSLNAFIDHYQNWCKRKKYNFSQSKAEEIYGKAKELVPVLPKDAITK LIIKQAVDQLNSASTTVESLRTLMNETASKLPEYPVVMAMKGVGTSLGPQLMAEIGNVSR FTHKSAITAFAGVDPGVNESGTYEQKSVPTSKRGSADLRKTLFQVMDVLIKTHPQDDPVY QFLDKKRAQGKPYYVYMTAGANKFLRIYYGRVKEYLASLPES Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:18 2011 Seq name: gi|226332842|gb|ACII01000177.1| Ruminococcus sp. 5_1_39B_FAA cont1.177, whole genome shotgun sequence Length of sequence - 1565 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 13 - 62 -0.2 1 1 Tu 1 . 
- CDS 122 - 1438 792 ## COG3385 FOG: Transposase and inactivated derivatives - Prom 1484 - 1543 4.8 Predicted protein(s) >gi|226332842|gb|ACII01000177.1| GENE 1 122 - 1438 792 438 aa, chain - ## HITS:1 COG:yi41 KEGG:ns NR:ns ## COG: yi41 COG3385 # Protein_GI_number: 16132099 # Func_class: L Replication, recombination and repair # Function: FOG: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 34 346 45 353 442 60 19.0 7e-09 MKHSQKRTFMDIEKMSSDEFKAFCRLGNKNHFTRIRKMPLQDLLFTMINRKGLTLALELR NYMKLAHPGVSISKPGYLKQRMKLNPDAFLELYKYHNRNFYADSTFSTYKNHLILAADGS DINIPTTTETLKLYGSASRKNTKPQAQIGLGCIYDVMNRMILESDCNKVKFDEMRLAEKQ MERIPETIGNIPYIIIMDRGYPSTPAFIHMMDKDLKFIVRLKSSDYKKEQSSLTENDQLV KIKLDKSRIRHYEGTPDGERMKELGEISLRMVKILLENGNLEVLATNLSQTEFHTEEIKE LYHMRWGIETAYETLKNRLQLENFTGTKPILLLQDIYSTIYLSNLVEDIILDAERELDQK ETNRKHKMMINQTVSIGILKNDLIYILLETDDQKKNILFQQIYEDISKNLVPIRPDRHYT RTKGRLAGKYSNTHKRAY Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:19 2011 Seq name: gi|226332841|gb|ACII01000178.1| Ruminococcus sp. 5_1_39B_FAA cont1.178, whole genome shotgun sequence Length of sequence - 1498 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 8/0.000 - CDS 2 - 1088 309 ## COG0675 Transposase and inactivated derivatives 2 1 Op 2 . 
- CDS 1051 - 1497 257 ## COG2452 Predicted site-specific integrase-resolvase Predicted protein(s) >gi|226332841|gb|ACII01000178.1| GENE 1 2 - 1088 309 362 aa, chain - ## HITS:1 COG:MA0258 KEGG:ns NR:ns ## COG: MA0258 COG0675 # Protein_GI_number: 20089156 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 1 362 1 340 370 150 31.0 4e-36 MVKAIKVMLVPNNVQKTKLFQYAGASRFAYNWALVREIENYEKGGKFISDAELRKEFTKL RHSDEYAWLLNISNNVTKQAIKDACTAYKNFFRGLQKFPRFKSKKRSMPKFYQDNVKIRF SNTHVKFEGFSSSRKANKQKMNWVRLAEHGRIPTNAKYMNPRISFDRLNWWISVCVEFPD CREILNDDGVGIDLGIKELAVCSDGTKYKNSNKSQKIKKLEKQKRRLQRSISRSYEKNKK GESYCKTNNVIKKEKLLLKRNHRLTNIRKNYLNQTISEIINRKPRFICIEDLNVSGMMKN RHLSKAVQEQGFFWFRKQLEYKCSDKGIQLIVADRFYPSSKLCSCCGNIKKDLKLSDRVY RC >gi|226332841|gb|ACII01000178.1| GENE 2 1051 - 1497 257 148 aa, chain - ## HITS:1 COG:PAB2076 KEGG:ns NR:ns ## COG: PAB2076 COG2452 # Protein_GI_number: 14520623 # Func_class: L Replication, recombination and repair # Function: Predicted site-specific integrase-resolvase # Organism: Pyrococcus abyssi # 1 134 70 208 212 109 47.0 1e-24 VSSHKQKDDLERQIDNVKTYLLAKGQPFEIISDIGPGINYKKKGLQELIRRISQNQVEKV VVLYKDRLLRFGFELIEYIASLYNCEIEIIDNTEKSEQQELVEDLVQIITVFSCKLQGKR ANKAKKLIRELIQEEADGKSHKSNVGTK Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:19 2011 Seq name: gi|226332840|gb|ACII01000179.1| Ruminococcus sp. 5_1_39B_FAA cont1.179, whole genome shotgun sequence Length of sequence - 1195 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 241 - 1173 362 ## Cphy_1803 transposase Predicted protein(s) >gi|226332840|gb|ACII01000179.1| GENE 1 241 - 1173 362 310 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1803 NR:ns ## KEGG: Cphy_1803 # Name: not_defined # Def: transposase # Organism: C.phytofermentans # Pathway: not_defined # 1 310 127 433 434 237 41.0 4e-61 MPRKFYSSKYAHDEYRSVLVDCRVGINQTPESIQSMNDLLVPLIKEKHQSIGHTYATHAE ELGCSRRTLYSYINDCVFDVRNGDLRRSVRYKKRKKPTQTSAKDRSYRQGHNYENFQSYM KDHPDINVVEMDCVEGMKGESCALLTFTFRNCNLMLMFLLEYQNQECVLEVFVWLETVLG QDAFKKLFPVILTDGGSEFSAREEMEEFCDGSKSTTVFYCDPYSFWQKGACEKNHEYIRY IRPKGSSFADLNDEKVRLMMNHINNEKRDSLNGHSPYELSLLLLDNKLHKALGLKAIAPD DVMLSPKLIK Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:24 2011 Seq name: gi|226332839|gb|ACII01000180.1| Ruminococcus sp. 5_1_39B_FAA cont1.180, whole genome shotgun sequence Length of sequence - 1117 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 37 - 67 -0.5 1 1 Tu 1 . - CDS 87 - 1025 600 ## Cthe_0220 transposase, IS4 - Prom 1053 - 1112 3.2 Predicted protein(s) >gi|226332839|gb|ACII01000180.1| GENE 1 87 - 1025 600 312 aa, chain - ## HITS:1 COG:no KEGG:Cthe_0220 NR:ns ## KEGG: Cthe_0220 # Name: not_defined # Def: transposase, IS4 # Organism: C.thermocellum # Pathway: not_defined # 12 312 13 314 315 296 51.0 6e-79 MLKFQDNNTTVIATFEDFILTAYVIIDELYHQFAPPEVTRRRHILNAKLSDSEIITISLC GELAGVDSENAWFSFVKRNYRHLFPQLCSRSRFNRTRRALMQTTELLRQKMISVFPIPVS SYYIVDSFPLAVCKFGRARYCKAFRGHGADYGKCPSKKETYYGYKVHALITLEGYIASFE ITPASTDDREGLRDLADHWSNVTILADKGYVGKNMKQEMQEKNICLFALKRSNSKENWPK SVRQLIFKLRRRVETVFSQLSGQLNAERVLAKSFQGLCTRLVNKVLAYNLCIALNSIFGE TCELGKIKKLIF Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:28 2011 Seq name: gi|226332838|gb|ACII01000181.1| Ruminococcus sp. 
5_1_39B_FAA cont1.181, whole genome shotgun sequence Length of sequence - 1029 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 180 - 1028 495 ## AFE_1328 transposase, IS605 OrfB family protein, putative Predicted protein(s) >gi|226332838|gb|ACII01000181.1| GENE 1 180 - 1028 495 282 aa, chain - ## HITS:1 COG:no KEGG:AFE_1328 NR:ns ## KEGG: AFE_1328 # Name: not_defined # Def: transposase, IS605 OrfB family protein, putative # Organism: A.ferrooxidans_ATCC23270 # Pathway: not_defined # 49 239 190 391 439 123 38.0 8e-27 NHTDMEYLRKNWSGKKASAPTLEKRHHKYFLRFSYTEEVTLTKTPVKEQIICSVDLGINT DAVCTIMRADGTVLGRKFINHPSEKDRMYRTLGRIRRFQREHGSAQSRGRWAYTKRLNIE LGRKIAGAIVKNAEENHADVIVFEYLEMQGKISGKKKQKLHLWRKRDIQKRCEHQAHRKG MRVSRVCAWNTSRLAYDGSGSVSRDPKNHSLCVFQTGKRYNCDLSASYNIGARYFIRELL KPLPVTERSLLEAKVPSVKRRTSCVYADLKELHLQMEILKAA Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:33 2011 Seq name: gi|226332837|gb|ACII01000182.1| Ruminococcus sp. 5_1_39B_FAA cont1.182, whole genome shotgun sequence Length of sequence - 1007 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 129 - 1007 272 ## COG3547 Transposase and inactivated derivatives Predicted protein(s) >gi|226332837|gb|ACII01000182.1| GENE 1 129 - 1007 272 292 aa, chain - ## HITS:1 COG:FN1676 KEGG:ns NR:ns ## COG: FN1676 COG3547 # Protein_GI_number: 19704997 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 19 286 130 384 391 67 23.0 3e-11 DKWQSLEPYSITDELRAQLKVMNRQFHFYVKQKTAMKNNLIGLIDQTYPNANTYFNSPTR SDGSQKWVDFVSTYWHVDCIRKMSLNAFVEHYQNWCKRKKYAFNKSKAEEIYTKTKELVP VFPKNEITKHIIRQAIDQLNSTSVTVETIRTLMNETASKLPEYSIVMQFKGIGPSLGPQL MAEIGDVTRFTHKGALTAFAGVDPGVNESGSYAQKSVPTSKRGSSNLRKTLFQVMDVLIK TMPQDDPVYQFMDRKRTQGKPYYVYMTAGANKFLRIYYGRVKEYLSVLPDPS Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:33 2011 Seq name: gi|226332836|gb|ACII01000183.1| Ruminococcus sp. 5_1_39B_FAA cont1.183, whole genome shotgun sequence Length of sequence - 990 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 848 431 ## COG0675 Transposase and inactivated derivatives - Prom 918 - 977 2.3 Predicted protein(s) >gi|226332836|gb|ACII01000183.1| GENE 1 2 - 848 431 282 aa, chain - ## HITS:1 COG:all7245 KEGG:ns NR:ns ## COG: all7245 COG0675 # Protein_GI_number: 17233261 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. PCC 7120 # 2 282 34 322 407 167 37.0 3e-41 MLADKIRHYQEEKKMLKNTPAGYKKEYPWLKEVDSLALANVQLNLEGAFRKFFREPGVGF PHYKSKKHSRKSYTTNMVNGNICLQDRFLKLPKMQPVKIKLHRMIPEGWKLKSVTVSREP SGKYFASLLFDCENQTAEKRQAEKFLGMDFAMHGMCVFSTGERAGYPMFYRNAEKKLARE QRKLSRCEKGSRNYQKQKKKVALYHEKIKNQRKDFQHKLSHSLAEDYDAVCVEDLNLKGI AGGLHFGKGIQDNGYGQFLSMLGYKLEERGKYLIKVDRYFAS Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:34 2011 Seq name: gi|226332835|gb|ACII01000184.1| Ruminococcus sp. 
5_1_39B_FAA cont1.184, whole genome shotgun sequence Length of sequence - 905 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 429 242 ## Shel_22380 hypothetical protein 2 1 Op 2 . - CDS 459 - 731 276 ## EUBELI_20612 hypothetical protein - Prom 813 - 872 3.7 Predicted protein(s) >gi|226332835|gb|ACII01000184.1| GENE 1 3 - 429 242 142 aa, chain - ## HITS:1 COG:no KEGG:Shel_22380 NR:ns ## KEGG: Shel_22380 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 3 140 72 210 397 72 26.0 7e-12 MEQYYYNHMNKAQQAAYHSILSGVKNLADEFQIPALEGEELYNVFFQMRLDHPEIFWVSS YKYRYYKDSPNLIFIPEYLFDKKKICEHQKAMTARVEKLIRPAQKLSEWEKEKYVHDFIC ENIRYDKLKKSYSHEIIGPLGQ >gi|226332835|gb|ACII01000184.1| GENE 2 459 - 731 276 90 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20612 NR:ns ## KEGG: EUBELI_20612 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 80 5 81 94 80 48.0 1e-14 MASERTEELYRVLLSKGYPKELCAEIAYKNMNTDYTATRMLGYLYRYTNPKIEDLVDEML AILSDRAQIIEKKESEHAQAVISEMYRKGL Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:39 2011 Seq name: gi|226332834|gb|ACII01000185.1| Ruminococcus sp. 5_1_39B_FAA cont1.185, whole genome shotgun sequence Length of sequence - 861 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 389 - 721 280 ## COG1733 Predicted transcriptional regulators - Prom 801 - 860 2.5 Predicted protein(s) >gi|226332834|gb|ACII01000185.1| GENE 1 389 - 721 280 110 aa, chain - ## HITS:1 COG:BH0737 KEGG:ns NR:ns ## COG: BH0737 COG1733 # Protein_GI_number: 15613300 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 10 109 14 113 118 96 45.0 1e-20 MRAKEELPECPVATAVSLIGGKWKLLILRNLKERPWRFNELQRSIDGISQKVLTDSLRQM MSDGLAYRHDYHEQPPKVEYGLTELGTKMLPIVNSLADFGNYYKSIIEQN Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:40 2011 Seq name: gi|226332833|gb|ACII01000186.1| Ruminococcus sp. 5_1_39B_FAA cont1.186, whole genome shotgun sequence Length of sequence - 832 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 17 - 460 575 ## COG4154 Fucose dissimilation pathway protein FucU - Prom 651 - 710 7.6 Predicted protein(s) >gi|226332833|gb|ACII01000186.1| GENE 1 17 - 460 575 147 aa, chain - ## HITS:1 COG:SP2165 KEGG:ns NR:ns ## COG: SP2165 COG4154 # Protein_GI_number: 15901975 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose dissimilation pathway protein FucU # Organism: Streptococcus pneumoniae TIGR4 # 1 145 1 141 147 133 46.0 9e-32 MLKGIPEILSPELLKVLCEMGHSDRLVIADGNFPVESMGKNAITIRCDGHGVPEILDAIL KLFPLDTYVEHPVNLMEVMPGDNVETPIWDTYKEIVAKYDERGDKAVGNIERFAFYEEAK TAYCIISTSEKALYANIMLQKGVVINN Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:40 2011 Seq name: gi|226332832|gb|ACII01000187.1| Ruminococcus sp. 5_1_39B_FAA cont1.187, whole genome shotgun sequence Length of sequence - 799 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 140 - 610 274 ## COG1943 Transposase and inactivated derivatives - Prom 650 - 709 5.0 Predicted protein(s) >gi|226332832|gb|ACII01000187.1| GENE 1 140 - 610 274 156 aa, chain - ## HITS:1 COG:CAC3531 KEGG:ns NR:ns ## COG: CAC3531 COG1943 # Protein_GI_number: 15896768 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 1 156 1 156 157 243 72.0 7e-65 MDNNSLSHTKWNCKYHIVFAPKYRRKVAYGKIKQDIADILSMLCKRKGVKIVEAEICPDH VHMLVEIPPSISVSYFVGYLKGKSTLMIFERHANLKYKYGNRHFWCRGYYVDTVGKNAKK IQEYIANQLQDDLEYDQMTLKEYIDPFTGEPVKRNK Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:41 2011 Seq name: gi|226332831|gb|ACII01000188.1| Ruminococcus sp. 5_1_39B_FAA cont1.188, whole genome shotgun sequence Length of sequence - 756 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 8 - 80 74.8 # Asn GTT 0 0 + TRNA 85 - 156 66.5 # Glu TTC 0 0 + TRNA 205 - 277 82.3 # Thr TGT 0 0 + TRNA 282 - 355 85.6 # Met CAT 0 0 + TRNA 372 - 445 82.7 # Asp GTC 0 0 + TRNA 483 - 555 81.0 # Val TAC 0 0 + 5S_RRNA 489 - 540 92.0 # AF302131 [D:490..741] # 5S ribosomal RNA # Streptococcus agalactiae # Bacteria; Firmicutes; Lactobacillales; Streptococcaceae; Streptococcus. + TRNA 559 - 644 65.8 # Leu TAA 0 0 Predicted protein(s) Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:42 2011 Seq name: gi|226332830|gb|ACII01000189.1| Ruminococcus sp. 5_1_39B_FAA cont1.189, whole genome shotgun sequence Length of sequence - 685 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 90 - 684 222 ## COG4823 Abortive infection bacteriophage resistance protein Predicted protein(s) >gi|226332830|gb|ACII01000189.1| GENE 1 90 - 684 222 198 aa, chain + ## HITS:1 COG:lin2373 KEGG:ns NR:ns ## COG: lin2373 COG4823 # Protein_GI_number: 16801436 # Func_class: V Defense mechanisms # Function: Abortive infection bacteriophage resistance protein # Organism: Listeria innocua # 20 175 20 173 298 76 30.0 4e-14 MQTPKKFSSFSDQVSWISDEKGIKIKDREYAEEMLRQIGYFPLMGGYKHLFRISNTKKYK AGTSFEEIVSLYKFDAELRELFFKYLLQIERQMRSLMSYYFTEMYGAEQKQYLDANNYNN TKRNHATIVKLIATLKRATTTTDYTYINYYRKTYGEIPLWVLANVLTFGNLSKMFRVFPQ SLKSKVSKNFEPLNQHQM Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:42 2011 Seq name: gi|226332829|gb|ACII01000190.1| Ruminococcus sp. 5_1_39B_FAA cont1.190, whole genome shotgun sequence Length of sequence - 659 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 95 - 361 327 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 483 - 542 2.0 Predicted protein(s) >gi|226332829|gb|ACII01000190.1| GENE 1 95 - 361 327 88 aa, chain - ## HITS:1 COG:BS_ytlA KEGG:ns NR:ns ## COG: BS_ytlA COG0715 # Protein_GI_number: 16080111 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 5 84 67 146 229 99 57.0 2e-21 MTLITGFGADKTMTAVISGEADIGFMGAEASIYAYQEGATDPVVNFAQLTQRAGNFLVAR EEMPDFKWEDLKGKKVLGGRKGGVHTSM Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:43 2011 Seq name: gi|226332828|gb|ACII01000191.1| Ruminococcus sp. 
5_1_39B_FAA cont1.191, whole genome shotgun sequence Length of sequence - 634 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 15 - 633 391 ## SMU.1646c hemolysis inducing protein Predicted protein(s) >gi|226332828|gb|ACII01000191.1| GENE 1 15 - 633 391 206 aa, chain + ## HITS:1 COG:no KEGG:SMU.1646c NR:ns ## KEGG: SMU.1646c # Name: not_defined # Def: hemolysis inducing protein # Organism: S.mutans # Pathway: not_defined # 4 86 130 210 212 80 43.0 4e-14 MIIGNYLPKVKQNNTIGIRVVWTLQDEENWSATHRFSGKLWVASGVLCMLCGLFGESIAA LVLYIVSIMAAAIVSILYSYLFYKKKIAAGEKLKIQYNKKTIVIYVIVSVFVVIFTIWTL FWGGIDISFHDNDFTVEAQGWSDYTVDYEQIDSISYKENLFQNGNDRRTNGMGNLKYGMG NFRNDIYGDYIRYTHASCHSYVVMDI Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:47 2011 Seq name: gi|226332827|gb|ACII01000192.1| Ruminococcus sp. 5_1_39B_FAA cont1.192, whole genome shotgun sequence Length of sequence - 626 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 625 404 ## gi|253581318|ref|ZP_04858554.1| predicted protein Predicted protein(s) >gi|226332827|gb|ACII01000192.1| GENE 1 1 - 625 404 208 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581318|ref|ZP_04858554.1| ## NR: gi|253581318|ref|ZP_04858554.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 208 1 208 208 415 100.0 1e-115 KYEATVSSEILDLGQTGVNGGGWERLEIKNTGTESWQIKSISELNDFRLEHLEHPLLYEG RLPMKVGNKVKPGEAVAVDIYVKPNLSQGIHEETFVMETEEGIQAECRIRVLVRGESEED YEVNCVPNSGTRFLTITQEEYDSWDDVEPEPDMLMCPSRLITICNNGQKPVTVDFEDTEH FRAWFYIDEDTGVVMPGENTIIQVFPKA Prediction of potential genes in microbial genomes Time: Sat May 28 21:11:56 2011 Seq name: gi|226332826|gb|ACII01000193.1| Ruminococcus sp. 
5_1_39B_FAA cont1.193, whole genome shotgun sequence Length of sequence - 623 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 622 327 ## gi|253581320|ref|ZP_04858555.1| predicted protein Predicted protein(s) >gi|226332826|gb|ACII01000193.1| GENE 1 1 - 622 327 207 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581320|ref|ZP_04858555.1| ## NR: gi|253581320|ref|ZP_04858555.1| predicted protein [Ruminococcus sp. 5_1_39B_FAA] # 1 207 1 207 207 422 100.0 1e-117 KYEATVSSEILDLGQTGVNGSGWGWERLEIKNTGTEAWHIKSISELNDFTLEPFDYPFQG RFPMEVGNKVKPGEAVAADIYVKKNLSPGIHEETFVMETEEGIQAKCRIRVLIRGASDQD YEVNCRPNSGTRFLTITPEEYDWWDNTDPDPDMFMCPKRSIWIYNNGKNPVTVDFENTEH FWAYLDDYETGVIMPGEGTTVYLRPKK Prediction of potential genes in microbial genomes Time: Sat May 28 21:12:06 2011 Seq name: gi|226332825|gb|ACII01000194.1| Ruminococcus sp. 5_1_39B_FAA cont1.194, whole genome shotgun sequence Length of sequence - 611 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 163 - 609 134 ## AFE_1328 transposase, IS605 OrfB family protein, putative Predicted protein(s) >gi|226332825|gb|ACII01000194.1| GENE 1 163 - 609 134 148 aa, chain - ## HITS:1 COG:no KEGG:AFE_1328 NR:ns ## KEGG: AFE_1328 # Name: not_defined # Def: transposase, IS605 OrfB family protein, putative # Organism: A.ferrooxidans_ATCC23270 # Pathway: not_defined # 1 105 283 391 439 76 41.0 2e-13 ENHADVIVFEYLEMNGKISGSKRQKLQLWRKRDIQKRCEHQAHRKGMRISRICAWNTSRL AYDGSGIVLRDWRNHSLCAFQTGKRYNCDLSASYNIGARYFIRELLKPLPATERSLLEAK VPAVKRRTSCVYADLRELSSEMGLLMAA Prediction of potential genes in microbial genomes Time: Sat May 28 21:12:09 2011 Seq name: gi|226332824|gb|ACII01000195.1| Ruminococcus sp. 
5_1_39B_FAA cont1.195, whole genome shotgun sequence Length of sequence - 611 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 84 - 143 10.1 1 1 Tu 1 . + CDS 198 - 609 154 ## gi|253581324|ref|ZP_04858557.1| conserved hypothetical protein Predicted protein(s) >gi|226332824|gb|ACII01000195.1| GENE 1 198 - 609 154 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253581324|ref|ZP_04858557.1| ## NR: gi|253581324|ref|ZP_04858557.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 137 1 137 138 252 100.0 5e-66 MNRTNICKNIVQSIKDYITDQVKLEPHRVEKHFVRRRKLSLLQVIIYLFFSSKASMFQNL SQIREELGTLSFPDVSKQALSKARQFINPSLFKELYYLSVDLFYSQIPSRKLWQGYHLFA VDGSRIELPNSKSTFDF Prediction of potential genes in microbial genomes Time: Sat May 28 21:12:16 2011 Seq name: gi|226332823|gb|ACII01000196.1| Ruminococcus sp. 5_1_39B_FAA cont1.196, whole genome shotgun sequence Length of sequence - 602 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 539 345 ## AFE_1328 transposase, IS605 OrfB family protein, putative Predicted protein(s) >gi|226332823|gb|ACII01000196.1| GENE 1 2 - 539 345 179 aa, chain - ## HITS:1 COG:no KEGG:AFE_1328 NR:ns ## KEGG: AFE_1328 # Name: not_defined # Def: transposase, IS605 OrfB family protein, putative # Organism: A.ferrooxidans_ATCC23270 # Pathway: not_defined # 18 155 2 140 439 70 35.0 3e-11 MQTVSSYGVEIRKQNIPIRQTLEIYRQAVSYLTEIYEQVWAELKMIPEAKKRFNAAEHLI HTTKKNHARFDFDIRFPKMPSYLRRAAIQHALGSVSSYESRMEQWEAAGELSGKPNFTCE NHAMPVFYRDVMYREGTEGKDEAYLKLYDGHDWRWFRVCLSHTDMEYLRRNWYGKKASA Prediction of potential genes in microbial genomes Time: Sat May 28 21:12:19 2011 Seq name: gi|226332822|gb|ACII01000197.1| Ruminococcus sp. 
5_1_39B_FAA cont1.197, whole genome shotgun sequence Length of sequence - 593 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 333 254 ## Cbei_3562 transposase IS116/IS110/IS902 family protein - Prom 529 - 588 4.4 Predicted protein(s) >gi|226332822|gb|ACII01000197.1| GENE 1 3 - 333 254 110 aa, chain - ## HITS:1 COG:no KEGG:Cbei_3562 NR:ns ## KEGG: Cbei_3562 # Name: not_defined # Def: transposase IS116/IS110/IS902 family protein # Organism: C.beijerinckii # Pathway: not_defined # 1 110 1 110 398 103 50.0 2e-21 MNAVGIDVSKGKSMVTILRPFGEIVSSPFEIKHTSSDIHSLIKLIHSIEGESRVVMEHTG RYYEALAHQLSAADLFVSAVNPKLVKDFNNDSLRKIKSDKADSVKIARYA Prediction of potential genes in microbial genomes Time: Sat May 28 21:12:22 2011 Seq name: gi|226332821|gb|ACII01000198.1| Ruminococcus sp. 5_1_39B_FAA cont1.198, whole genome shotgun sequence Length of sequence - 586 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 181 91 ## gi|153810204|ref|ZP_01962872.1| hypothetical protein RUMOBE_00585 + Prom 183 - 242 2.8 2 1 Op 2 . 
+ CDS 273 - 443 70 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 517 - 551 1.2 Predicted protein(s) >gi|226332821|gb|ACII01000198.1| GENE 1 2 - 181 91 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|153810204|ref|ZP_01962872.1| ## NR: gi|153810204|ref|ZP_01962872.1| hypothetical protein RUMOBE_00585 [Ruminococcus obeum ATCC 29174] # 10 57 1 48 137 91 97.0 1e-17 KINNVEAIKVKKIEIIERITDANGSISENILTSWEHKNQFIPEIIRFGELKIYLQERNC >gi|226332821|gb|ACII01000198.1| GENE 2 273 - 443 70 56 aa, chain + ## HITS:1 COG:BS_yvrH KEGG:ns NR:ns ## COG: BS_yvrH COG0745 # Protein_GI_number: 16080375 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 2 52 313 363 369 63 56.0 8e-11 MVYDLVWNEPYSGDYNVVMRHICNIREKIEDDPGQPLYIQTVRGVGYRFNGNLGSE Prediction of potential genes in microbial genomes Time: Sat May 28 21:12:27 2011 Seq name: gi|226332820|gb|ACII01000199.1| Ruminococcus sp. 5_1_39B_FAA cont1.199, whole genome shotgun sequence Length of sequence - 572 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 81 - 458 234 ## COG1943 Transposase and inactivated derivatives - Prom 495 - 554 6.4 Predicted protein(s) >gi|226332820|gb|ACII01000199.1| GENE 1 81 - 458 234 125 aa, chain - ## HITS:1 COG:DR0667 KEGG:ns NR:ns ## COG: DR0667 COG1943 # Protein_GI_number: 15805694 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Deinococcus radiodurans # 1 121 20 140 140 140 50.0 4e-34 MQYHLVWCTKYRKKVLKNGIDTECKEMLQNLAEEYKFQILAMEVMPDHIHLLVDCKPQFY ISDMIKIMKGNLARQMFLAYPELKKELWGGHLWNPSYCAITVSDRSRDQVLAYIEGQKEK EKKKA Prediction of potential genes in microbial genomes Time: Sat May 28 21:12:27 2011 Seq name: gi|226332819|gb|ACII01000200.1| Ruminococcus sp. 5_1_39B_FAA cont1.200, whole genome shotgun sequence Length of sequence - 520 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 246 - 283 -0.9 1 1 Tu 1 . - CDS 336 - 518 64 ## Predicted protein(s) >gi|226332819|gb|ACII01000200.1| GENE 1 336 - 518 64 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no ITYVGRFIQFLTFSCLICGTYISAAKGWFSAKLTMNATHLITQACHLTVLAMNTWRVSVT