Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:14:29 2011 Seq name: gi|157101665|gb|DS480659.1| Clostridium bolteae ATCC BAA-613 Scfld_02_0 genomic scaffold, whole genome shotgun sequence Length of sequence - 2298 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 34 - 840 141 ## COG3547 Transposase and inactivated derivatives - Term 823 - 891 31.2 2 2 Tu 1 . - CDS 994 - 2286 1485 ## COG1253 Hemolysins and related proteins containing CBS domains Predicted protein(s) >gi|157101665|gb|DS480659.1| GENE 1 34 - 840 141 268 aa, chain + ## HITS:1 COG:FN1676 KEGG:ns NR:ns ## COG: FN1676 COG3547 # Protein_GI_number: 19704997 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 248 79 325 391 215 43.0 7e-56 MCVLNPIQTSFMRKNNVRKTKTDKVDTFVIAKTLMMQDSLRFMALEDLDYIELKELGRFR QKLVKQRTRLKIQLTSYVDQAFPELQYFFKSGLHQNSVYAVLKEAPTPNAIASMHLTHLA HTLEVASHGHFGKDKARELRVLAQKSVGVNDSSLSIQITHTIEQIELLDSQLFSTELEMA NLVTCLHSVIMTIPGIGVVNGGMILGEIGDIHRFSNPKKLLAFAGLDPTVYQSGNFKAHR TRMSKRGSKVATLMPSLNAAHIVVKKQC >gi|157101665|gb|DS480659.1| GENE 2 994 - 2286 1485 430 aa, chain - ## HITS:1 COG:CAC0460 KEGG:ns NR:ns ## COG: CAC0460 COG1253 # Protein_GI_number: 15893751 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Clostridium acetobutylicum # 1 403 1 405 443 362 48.0 1e-100 MESDPEAANILFQITILVILTLINAFFAGSEMAVVSVNKNKIHRLSEQGNKNAALIERLM KDSTVFLSTIQVAITLAGFFSSASAATGIAQVLAVRMAQWNLPYSQTLAGVVVTIILAYF NLVFGELVPKRIALQKAEAFSLFCVRPIYYISRIMNPFIKLLSLSTSGFLKLIGMHNENL ETDVSEEEIKSMLETGSEAGVFNDIEKEMITSIFSFDDKKAKEVMVPRQDMVALDINEPL EEFLDEILESMHSKIPVYEGEIDNIIGVLSTKALTIEARRTSFDKLDVRTLLKPAYFVPE NRRTDALFREMQANKIKLAILIDEYGGVSGMVTLEDLIEEIVGDIHEEYEEEEPELTELE 
PHKVYRVSGGITLFDLKEEMHLHMDSSCDTLSGYLMEQLGYIPSREQLPLTVLHRKLIMR YWKWRRVIDG Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:14:32 2011 Seq name: gi|157101664|gb|DS480660.1| Clostridium bolteae ATCC BAA-613 Scfld_02_1 genomic scaffold, whole genome shotgun sequence Length of sequence - 8120 bp Number of predicted genes - 12, with homology - 10 Number of transcription units - 6, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 22 - 1200 445 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 2 1 Op 2 . + CDS 1216 - 2574 345 ## Dtox_1525 recombinase 3 1 Op 3 . + CDS 2516 - 2713 170 ## Ccel_2729 resolvase 4 2 Tu 1 . + CDS 3080 - 3367 323 ## gi|160935490|ref|ZP_02082866.1| hypothetical protein CLOBOL_00380 + Prom 3398 - 3457 3.1 5 3 Op 1 . + CDS 3533 - 3736 105 ## gi|160935491|ref|ZP_02082867.1| hypothetical protein CLOBOL_00381 + Prom 3743 - 3802 2.4 6 3 Op 2 . + CDS 3826 - 3897 65 ## 7 4 Tu 1 . - CDS 4035 - 5189 792 ## COG3328 Transposase and inactivated derivatives - Prom 5385 - 5444 80.3 + Prom 5368 - 5427 80.4 8 5 Op 1 . + CDS 5469 - 5609 72 ## + Prom 5650 - 5709 2.5 9 5 Op 2 . + CDS 5731 - 6144 215 ## Closa_3034 PfkB domain protein 10 5 Op 3 . + CDS 6177 - 6815 513 ## COG1653 ABC-type sugar transport system, periplasmic component + Prom 6836 - 6895 80.4 11 6 Op 1 3/0.000 + CDS 6960 - 7289 289 ## COG0526 Thiol-disulfide isomerase and thioredoxins 12 6 Op 2 . 
+ CDS 7364 - 8120 547 ## COG0225 Peptide methionine sulfoxide reductase Predicted protein(s) >gi|157101664|gb|DS480660.1| GENE 1 22 - 1200 445 392 aa, chain + ## HITS:1 COG:SPy0655 KEGG:ns NR:ns ## COG: SPy0655 COG1961 # Protein_GI_number: 15674723 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Streptococcus pyogenes M1 GAS # 9 196 122 285 471 71 33.0 3e-12 MTEESEGILTVYGAVAQQESENISQNIHWSAVNRFRHGTFVISRRPYGYDKDEEKELAVK EDEAAWVRRMASMYMNGMSGVRIAEYLNGNRVPAPYGGLWSSEAVLRILFNEKTVGDCLH QKTYSTGTVPYRTEKNCGKKPQYFIRDDHEGILSRNEQMRLQQIKEHRLGRQAKMNPRGC GEAYILSGKVVCGECGRAFIRKKEQRRKGVSIKWRCPGHRSEACQCYTNEIWEADIKNTF VNAFNTLKEYADEILDPVIRGMKKMQETNGLHEALDTLNQKKLELKEQRHILRQLKANEC IDSALYFEESRKIERELSRCRSEEKQLYSRGIHQDTITGLQATLHQLQGYDGKMQAFDGD QFILLVREVSVGKERELGFHLRCGLNLYEPVL >gi|157101664|gb|DS480660.1| GENE 2 1216 - 2574 345 452 aa, chain + ## HITS:1 COG:no KEGG:Dtox_1525 NR:ns ## KEGG: Dtox_1525 # Name: not_defined # Def: recombinase # Organism: D.acetoxidans # Pathway: not_defined # 1 296 2 291 299 117 27.0 1e-24 MNNYITYGYRLVDGYMRMNPEQSEAVHLIFDAYDGGESIKKIAGMLKDKKVPSVHGKPSW TPALIRKILRNKDYLGVEPYPRMIEDEQFKRVQEHLKRSRDLWNQRHPTGSIQRTSMYTA RLICSSCGGVYSIYKQSGKEDRRKVRYWKCRHYDPATKEPCKMPILTEKQLDGLFLQALV RMKTEPALYHEETKQILTGIIKERQRMESELNVCWNRCNKDDERMEQLYFQIAARRYQEI KMTELTRQIEDYLRGLDGSIQATAFDSLIAIQASIFKIIKRVEVQPDRSLKFELINNAIV LQYPEIKTKDKKSKNIRYCVPFGYWLDGQEVPIIHEQYGPIVQMLFQSYAEGISLTTLAK ELVHQSIPNQKGRPAWSHSCIRNILTNSVYLGDKKFAALVTRELFTQVQTRLEETGRKKR ESRTKAADGKEELRAKGGTNRRGHPGNLEPDR >gi|157101664|gb|DS480660.1| GENE 3 2516 - 2713 170 65 aa, chain + ## HITS:1 COG:no KEGG:Ccel_2729 NR:ns ## KEGG: Ccel_2729 # Name: not_defined # Def: resolvase # Organism: C.cellulolyticum # Pathway: not_defined # 1 65 1 65 535 75 61.0 9e-13 MPRAERIVEVIPATWNPTDESSREIRKLRVAAYCRVSTELEQQQSSYDIQIEYYTRHIMQ NPNWI >gi|157101664|gb|DS480660.1| GENE 4 3080 - 3367 323 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935490|ref|ZP_02082866.1| ## NR: 
gi|160935490|ref|ZP_02082866.1| hypothetical protein CLOBOL_00380 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00380 [Clostridium bolteae ATCC BAA-613] # 1 95 13 107 107 176 100.0 7e-43 MFKVKLSFDTDRLEQDILEQLYDKVDEIFESEDLNCVDRALVRVYEDKGREEDYGRFWAA IFALKQTPELAENLKECTWYNGSAKENLLTGFLRN >gi|157101664|gb|DS480660.1| GENE 5 3533 - 3736 105 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935491|ref|ZP_02082867.1| ## NR: gi|160935491|ref|ZP_02082867.1| hypothetical protein CLOBOL_00381 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00381 [Clostridium bolteae ATCC BAA-613] # 1 67 30 96 96 124 100.0 2e-27 MPMSEPDVMTIVYRMTKRFPWINYCVSVCTLADVPEAFDILHVFVREAERLDEAMTSTTQ GIDTLSM >gi|157101664|gb|DS480660.1| GENE 6 3826 - 3897 65 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVSAIHRFDGYFFAFLECVKKSL >gi|157101664|gb|DS480660.1| GENE 7 4035 - 5189 792 384 aa, chain - ## HITS:1 COG:YPO0011 KEGG:ns NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 18 382 34 398 402 343 45.0 2e-94 MNLPQDIQDALKDLLGGTIKEMMESEMDEHLGYRKSERSDCDDYRNGYKTKQVNSSYGSM KVEVPQDRNSTFEPQVVKKRQKDISDIDHKIISMYAKGMTTRQISETLEDIYGFEASEGF ISDVTDKLLPQIEDWQNRPLSDVYPVLYIDAIHYSVRDNGVIRKLAAYVVLGINSDGLKE VLTIEVGENESAKYWLSVLNGLKNRGVKDILLLCADGLTGIKEAIAAAFPKTEYQRCIVH QVRNTLKYVSDKDRKLFAADLKTIYQAPTEEKALEALERGTKKWSEKYPNSMKSWHQNWD AIIPIFKFSTTVRKVIYTTNAIESLNATYRKLNRQRSVFPSDTALLKALYLSTFEATKKW TMPIRNWGQVYGELSLMYEGRLPM >gi|157101664|gb|DS480660.1| GENE 8 5469 - 5609 72 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLEGGEAVPLGISQHEVLVENEVRVRPVKNVKSKGITISLDMSLLT >gi|157101664|gb|DS480660.1| GENE 9 5731 - 6144 215 137 aa, chain + ## HITS:1 COG:no KEGG:Closa_3034 NR:ns ## KEGG: Closa_3034 # Name: not_defined # Def: PfkB domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 137 233 355 371 80 35.0 2e-14 
MVRKAGANNLLDYIDVRGMAEKLADELLEMGGTIVMLKCGHEGMYLRTAEKERWGNMGKA APNSLNGWYDRKIWQKPVKVKRILSRTGAGDIAIAGFLSSFLHEDNAKTALGIAAWAASI CIQSYDTISGLCPLNEL >gi|157101664|gb|DS480660.1| GENE 10 6177 - 6815 513 212 aa, chain + ## HITS:1 COG:SPy0252 KEGG:ns NR:ns ## COG: SPy0252 COG1653 # Protein_GI_number: 15674433 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pyogenes M1 GAS # 5 190 3 176 439 63 28.0 4e-10 MKKGLAKVTAIGLCVSMLISLAGCSGGGGSATTDKAAEKAAEAVTDKNIQSEAEGTQEPI ELTYWYWEDEQGTIENMLKDKWEAYAGDRIKVNFESVPSSSFHDKLITAISTGTGPDVFI CKPMWAPELYGMGGLMNMEDVFEGWEYADEVDDFMLEQMRAGLDKLYLYPRTTIVMYLYC RKSMFEKAGIDYPKTVDEFFDACEKLTVDTDG >gi|157101664|gb|DS480660.1| GENE 11 6960 - 7289 289 109 aa, chain + ## HITS:1 COG:HI1453 KEGG:ns NR:ns ## COG: HI1453 COG0526 # Protein_GI_number: 16273359 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Haemophilus influenzae # 7 102 56 151 156 88 48.0 2e-18 MGLLVSICLGGLEDINTLSAEDNGFKVLTIVAPGSKGEKNDEDFKTWFAGVEHTENITVL LDIDGVYTNRAGVRGFPTSEYIGSDGVLISLAPGHADNETIKSTFEAIN >gi|157101664|gb|DS480660.1| GENE 12 7364 - 8120 547 252 aa, chain + ## HITS:1 COG:NMB0044_2 KEGG:ns NR:ns ## COG: NMB0044_2 COG0225 # Protein_GI_number: 15675984 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptide methionine sulfoxide reductase # Organism: Neisseria meningitidis MC58 # 62 220 4 162 181 214 61.0 2e-55 MAAAILTLAGCSASAAEAGMAAGAKRETESLSMDDSVMDSGKEKKLTAEQEDMMISGEDL HTIYLAGGCFWGIEAYVKKLPGVSSTDVGYANGNTENPSYKEVCYNNTGHAETVKVVYDT SRISTDQLLDGFFKVVDPTSVNRQGNDRGSQYRTGIYYVDEADQAIAKAAVARQEEKYSS PIATEVLPLSNFYMAEDYHQDYLDKNPGGYCHIDLNDADEFIRESGLDMADVASLIKVED YPVPSDAKLKEV Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:15:12 2011 Seq name: gi|157101663|gb|DS480661.1| Clostridium bolteae ATCC BAA-613 Scfld_02_2 
genomic scaffold, whole genome shotgun sequence Length of sequence - 26026 bp Number of predicted genes - 25, with homology - 25 Number of transcription units - 7, operones - 6 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 150 - 1481 506 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 2 1 Op 2 . + CDS 1497 - 2294 196 ## Dtox_1525 recombinase 3 1 Op 3 . + CDS 2370 - 2501 104 ## gi|160935488|ref|ZP_02082865.1| hypothetical protein CLOBOL_00379 + Prom 2512 - 2571 80.3 4 2 Op 1 . + CDS 2618 - 2776 70 ## gi|160942414|ref|ZP_02089721.1| hypothetical protein CLOBOL_07298 5 2 Op 2 . + CDS 2796 - 3329 334 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 6 2 Op 3 . + CDS 3382 - 4251 594 ## Ccel_2729 resolvase 7 2 Op 4 . + CDS 4325 - 4612 161 ## gi|160942417|ref|ZP_02089724.1| hypothetical protein CLOBOL_07301 + Term 4694 - 4734 3.8 + Prom 5313 - 5372 3.8 8 3 Op 1 38/0.000 + CDS 5454 - 7106 1240 ## COG0747 ABC-type dipeptide transport system, periplasmic component 9 3 Op 2 49/0.000 + CDS 7133 - 8074 223 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 10 3 Op 3 44/0.000 + CDS 8084 - 8902 282 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 11 3 Op 4 44/0.000 + CDS 8907 - 9884 390 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 12 3 Op 5 . + CDS 9905 - 10810 261 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 13 3 Op 6 . + CDS 10862 - 11965 783 ## COG0620 Methionine synthase II (cobalamin-independent) + Term 12041 - 12084 2.0 14 4 Tu 1 . 
- CDS 12075 - 12257 142 ## gi|160942424|ref|ZP_02089731.1| hypothetical protein CLOBOL_07308 + Prom 12511 - 12570 7.8 15 5 Op 1 23/0.000 + CDS 12626 - 13114 372 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit 16 5 Op 2 1/0.000 + CDS 13134 - 14978 934 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 17 5 Op 3 . + CDS 15006 - 16712 924 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain + Prom 16727 - 16786 2.9 18 5 Op 4 1/0.000 + CDS 16806 - 17744 809 ## COG0583 Transcriptional regulator + Term 17778 - 17837 7.5 + Prom 17836 - 17895 8.3 19 6 Op 1 18/0.000 + CDS 17964 - 19052 1094 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) 20 6 Op 2 16/0.000 + CDS 19101 - 19481 509 ## COG0509 Glycine cleavage system H protein (lipoate-binding) 21 6 Op 3 12/0.000 + CDS 19485 - 20798 1382 ## COG0403 Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain 22 6 Op 4 . + CDS 20798 - 22234 1410 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain + Term 22280 - 22339 8.0 23 7 Op 1 . + CDS 22353 - 23774 761 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 24 7 Op 2 3/0.000 + CDS 23789 - 25468 1857 ## COG2759 Formyltetrahydrofolate synthetase 25 7 Op 3 . 
+ CDS 25485 - 26025 617 ## COG3404 Methenyl tetrahydrofolate cyclohydrolase Predicted protein(s) >gi|157101663|gb|DS480661.1| GENE 1 150 - 1481 506 443 aa, chain + ## HITS:1 COG:SPy0655 KEGG:ns NR:ns ## COG: SPy0655 COG1961 # Protein_GI_number: 15674723 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Streptococcus pyogenes M1 GAS # 1 247 62 285 471 80 30.0 5e-15 MLEECEKGWIDLILTKSITRFARNTVDTLSTVRHLKEIGVSVYFEKERLNTMTEESEGIL TIYGAVAQQESENISQNVHWSAVNRFKHGTFVISRRPYGYDKDEEKELAVKKDEAAWVRR MASMYMNGMSGVRIAECLNQNQVAAPYGGLWSSEAVLRILFNEKTVGDCLHQKTYSTGTV PYRTEKNRGKKPQYFIRDDHEGILSRDEQMRLQQIKEHRLGRQARMNPRGSGETYILSGK VVCGECGRTFIRKKEQRRKGVSIKWRCPGHRSEECQCYTNEIWEVDIKNTFVNAFNTLVG HADELFGPIIQGVKRIQEANGLHEALGTLNQKKLELKEQRHILRQLKANECIDSALYFEE SRKIERELSRCRSEEKQLHSRGIHQDTITGLQATLHQLQGYDGKMQAFDGDQFILLVREV SVGKERELGFHLRCGLNLYEPVL >gi|157101663|gb|DS480661.1| GENE 2 1497 - 2294 196 265 aa, chain + ## HITS:1 COG:no KEGG:Dtox_1525 NR:ns ## KEGG: Dtox_1525 # Name: not_defined # Def: recombinase # Organism: D.acetoxidans # Pathway: not_defined # 1 186 2 186 299 99 32.0 1e-19 MNNYITYGYCLVDGDMRMDSGQSEAVHLIFDAYDGGESIKKIVGMLKDRKAPTTRGKPSW TPVLIRKILRNKDYLGVEPYPRMIEEEQFERVQKCLKRSRDLWNQRHPSGSIQGTSMYTG RLICSSCGGVYSIYKQSVRGDKRDIRYWKCRHYDPATKEPCKMPILTEKQLDGLFLQALV RMKTEPALYHEETKKILTGIIKERQRMESELNGDWNKCNEDYKRMEQLYFQIAARRYREI KMTELTCQIEDYLRGLDGSIKAEKN >gi|157101663|gb|DS480661.1| GENE 3 2370 - 2501 104 43 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935488|ref|ZP_02082865.1| ## NR: gi|160935488|ref|ZP_02082865.1| hypothetical protein CLOBOL_00379 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00379 [Clostridium bolteae ATCC BAA-613] # 1 43 1 43 43 79 100.0 8e-14 MRQYFNTQKLKQQKIEIEDIRYCVPFGYWLDAQEMPIIHEHTV >gi|157101663|gb|DS480661.1| GENE 4 2618 - 2776 70 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942414|ref|ZP_02089721.1| ## NR: gi|160942414|ref|ZP_02089721.1| hypothetical protein CLOBOL_07298 
[Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07298 [Clostridium bolteae ATCC BAA-613] # 8 52 1 45 45 74 100.0 2e-12 MPLSGAIMWRSFWPSWNPTEESVREIRKVRVAAYCRVSTELEQQQSSYDIQI >gi|157101663|gb|DS480661.1| GENE 5 2796 - 3329 334 177 aa, chain + ## HITS:1 COG:SA0057 KEGG:ns NR:ns ## COG: SA0057 COG1961 # Protein_GI_number: 15925764 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 11 148 48 186 542 62 27.0 3e-10 MQNPNWIFAGVFADDGRSATNTFRRDDFNQLMNQCMKGKVDMVITKSISRFARNTVDCIS WVRKLREKNVAVYFEKENLNTLDDSTEMILTILSSQAQEESRAISTNVKWGYARKFEKGE STGQRSYGFRKAPTGEMCIVEEEAAVIRNMAPVVSGRGQPGTNQAQAGGCRDRDHHR >gi|157101663|gb|DS480661.1| GENE 6 3382 - 4251 594 289 aa, chain + ## HITS:1 COG:no KEGG:Ccel_2729 NR:ns ## KEGG: Ccel_2729 # Name: not_defined # Def: resolvase # Organism: C.cellulolyticum # Pathway: not_defined # 1 286 256 532 535 238 45.0 2e-61 MGDVLLQKTFTADYLTKRRVKNSGQQKQYYVKNHHEAIIPKTVYYKIQEEIARRSSLKKA GTRKGKTAQGVYSSKYALTGIMVCNECGAHYRRTTWAKNGKKVIVWRCINRLEHGTKRCH ESPTLKEEVIQEAIMGKLHSLSIDQEEENFLNGVKEDILRAAKVVGGACTEEEIDKTIEE LRDQLMDYVGMAAREHGGENWYSDRMRKLGLQISELKRRRESIREQEKIRDEYEYLDQEI SRIIGETGGATGAEFDNIFIRQIVREIRVISKNKLQIQLRTGMVLDVNL >gi|157101663|gb|DS480661.1| GENE 7 4325 - 4612 161 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942417|ref|ZP_02089724.1| ## NR: gi|160942417|ref|ZP_02089724.1| hypothetical protein CLOBOL_07301 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07301 [Clostridium bolteae ATCC BAA-613] # 1 95 1 95 95 190 100.0 2e-47 MPKIELSFDIDRLGQDVLKRLYAMTDEIFESEGLKCVDRGVVRVYEDRGRKKDYGRFWAA IFALKESPELSANLKECIWYNGSVKENLITGFLRN >gi|157101663|gb|DS480661.1| GENE 8 5454 - 7106 1240 550 aa, chain + ## HITS:1 COG:FN1504 KEGG:ns NR:ns ## COG: FN1504 COG0747 # Protein_GI_number: 19704836 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium 
nucleatum # 45 547 16 519 522 292 35.0 9e-79 MLIPAVILMATLLSAGCTAGSTQGVTETAVQAESAVEENTDENAETKALQSEEKSTLVLA YPKDLGDMNPHTMSSPMYAQDWVYDGLTALVNGKIVPELAENWEISEDGKTYTFHLRKNV KFSDGSDLNAELVKENIQDVIDNKDGYSFLQCLEEIDTMDTPDDTTLIMNLKNPCNSLLS DMSFNRPLTVAGRAAFPESGNIYKDGVKKPVGTGMWTVKEYVQDQYTIFERNEYYWGEKP SFKYVKVEVIPDMDTVVNALKAGEIDMYIDVNDGLSADAYYELGKLGFGTQMAEGTQVTS LSLNTAGDMLGDVSVRQALEYTTDNTVISEYVYGSLQKPASSYFADSVTLTQTGTDGYPY QPEKAAEILELAGWNLESGSDYRTKDGVTLEVDMLYDTVFKNGKNIGLVLQEQYKKVGVK LNICEEDSQVFRKKWKEGDFDMILYSSWGGSYEPFATLAAMRSDGDKFSTVQKGMDNKAE LNKVMNEALSEIDEDKLKEDFTYIMESFKEQAVYIPLTVSSTLAVYDSSLKGLDLTDGKD VMPVGSVVRE >gi|157101663|gb|DS480661.1| GENE 9 7133 - 8074 223 313 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 68 304 43 310 320 90 25 1e-17 MGKYIQKRLVGMIVLLFLVSIAAFALVRLMPGDAAFAYLNSINAPVTDAALAKVRAELGL DRPVIFQYTAWLRRVVCLDLGVSFMTKKPVSGQLLISFRYTMILTAMSAVWIAVLSLPLG IMAGKRPGSAGDQLIRGITFLGSSMPSFWLGFLLVELFALKLKLLPVQGAETFSHLVLPS ATLACSYIAMYTKMIRGGIVENMGRRYAVYGRARGLKKNTVLWRHVLPNALNPVFTTLGL SLGSMLSGAVIVENVFAWPGMGRLIVSAVAGRDYPVIQGYILVIAVVYTLLNLGMDILCA AMNPQIRFEGEER >gi|157101663|gb|DS480661.1| GENE 10 8084 - 8902 282 272 aa, chain + ## HITS:1 COG:BH0569 KEGG:ns NR:ns ## COG: BH0569 COG1173 # Protein_GI_number: 15613132 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 5 272 9 276 277 190 39.0 3e-48 MFRKLIKDRVCLFCLFMLILFAIAAVLAPCLSPNDPNEIHYTARFAGISSMFPLGTDHMG RCILSRILYGSRLSLGYAVLITVISSVTGTMLGMLSGYTGGLPDRIFMGICDAVRAFPGI VIVLVVVSLLGTGLFELCLGMLATRWVWYARVARNLTKAEREKEAVLASRMAGSSLFSII TTLIFPAIRMQMLSVTTINFGTALTALAGYSFLGFGAAPPSPEWGMMIQDGRSFISTDPS MMFWPGLCILAVVIFTNGLGDRIQEIAGKMRR >gi|157101663|gb|DS480661.1| GENE 11 8907 - 9884 390 325 aa, chain + ## HITS:1 COG:CT690 KEGG:ns NR:ns ## COG: CT690 COG0444 # Protein_GI_number: 15605423 # Func_class: E Amino 
acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Chlamydia trachomatis # 9 320 6 318 321 218 36.0 1e-56 MHLHNQTELLRLDDIRVALKEGKRRFTAVNGVSVSICAGRNLGIIGESGCGKTLLCHCIL GLLDQDEWETEGSILFEGHRIIAGKDGVRPGFCGTAAGFIAQDPAGAFDPRMTLRGHFLE MAGVFSQNRQEILERTAALLIRMGIREPERVLKSYAFQLSGGMLQRVMIALALLGKPRLL IADEPTTALDITTQWEILGLLGELKREQGLTVLMVSHDLRVIQSISDDLCVMYAGYVIEQ GPVSEVLENPLHPYTKGLLASRPVFSKEAVRVMEGYPPGIKEKVPGCPFADRCKDALKLC RQSCPKLLKHGPVNHQVRCFRAGEG >gi|157101663|gb|DS480661.1| GENE 12 9905 - 10810 261 301 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 246 1 243 245 105 29 4e-22 MLEADMISRRYRCAGKGQVYAVNNLSLKIHTGECHGIVGESGCGKSTLMRMLAGMETPSS GQILYKGEPVALQMKRKRNEIQMIFQNSMNAVNQYATAEEIIGEPLRNFTHMNKVERRQQ TAALIKQVGLSAADLLKYPAQFSGGQLQRICIARALAAEPKILLLDEPLSSLDVSVQAQI MKLLEMLGRENGLTQVLVSHDLEAIYYLSDSLSVMYGGWIVEDIEHIDMFSTLCHPYTKR LFKACGAKYSEYIPEKSGYGGTVGGCPYEELCPDCVDVCRSRTPEMKEIQEGHWVRCHKV G >gi|157101663|gb|DS480661.1| GENE 13 10862 - 11965 783 367 aa, chain + ## HITS:1 COG:lin0838 KEGG:ns NR:ns ## COG: lin0838 COG0620 # Protein_GI_number: 16799912 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase II (cobalamin-independent) # Organism: Listeria innocua # 7 362 10 359 367 239 39.0 7e-63 MLNIKYDIVGSFLRPGKIKKARQKYADGNLSLEELRKIEDEAIRELVSTEVSHGLEFVTD GEFRRRWWHLDWLKEFDGFTTKHLDKERNGVVNHIELGYITGKISYSQERVHPEIEAFDF LLCEAEKYPGVTAKKCISGPNMMLVDNYLQMGIKEVPYYGSNINALIEDIALAYQDAIKD LYEHGCRYLQIDDTSWTYMIDEKFLKKVESLGYKKSEVLEWFKRVSTRALEDRPADMVIA THFCKGNFKGNPLFSGFYDSVAPIISQIPYDGFFVEYDDERSGSFEPWSILKGTGATFVA GLISTKNPVLESYDEIKRRYMKAKEVVGANIALSPQCGFASVEEGNCIDEDTQWKKIDLL VRCKDFL >gi|157101663|gb|DS480661.1| GENE 14 12075 - 12257 142 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942424|ref|ZP_02089731.1| ## NR: gi|160942424|ref|ZP_02089731.1| hypothetical protein CLOBOL_07308 [Clostridium 
bolteae ATCC BAA-613] hypothetical protein CLOBOL_07308 [Clostridium bolteae ATCC BAA-613] # 8 60 1 53 53 72 100.0 8e-12 MSPRTVPMLFREQEFIQTLTKIFHHVIALIFPMYQYIQADFFLKPDTFFNLFPYKFLYNH >gi|157101663|gb|DS480661.1| GENE 15 12626 - 13114 372 162 aa, chain + ## HITS:1 COG:TM1424 KEGG:ns NR:ns ## COG: TM1424 COG1905 # Protein_GI_number: 15644175 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 4 162 5 162 164 127 40.0 8e-30 MLGESFYKKADEIIEMHTREERSLIPIIQDIQEEYRYLPPELLTYVAKEIGITEAKAYSV ASFYENFSFEEKGKYIIKICDGTACHVRKSMPILDYLYKTLKLNAKKHTTEDALFTVETV SCLGACGLAPAITVNEKVYPKMSPEKMKKLLEEIKRGEPDAG >gi|157101663|gb|DS480661.1| GENE 16 13134 - 14978 934 614 aa, chain + ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 14 544 5 527 527 572 54.0 1e-163 MLQRQCRDRKDGYTCRILVCAGTGCVATGSLDVYSQLRELCKEDEGIRVELEKDVPHIGI VKSGCQGFCELGPLVRIEPQHCQYVKVQPEDCEEIVEKTVKQGIPVERLFYKKEGISYGA VDEIPYFARQTRIVLEQCGNIDAESVEEYMAAGGFSALKKALFAMTPEAVIAEVEFSGLR GRGGGGFPAGRKWRQVAAHKEEPLKYIVCNGDEGDPGAFMDGSVMEGDPCRLIEGMMLAG YAVGASEGFIYVRAEYPLSVARLRQAIGQMEERGLLGDSILGTGFSFHLHINRGAGAFVC GEGSALTASIEGHRGMPRVKPPRTVDQGLWGKPTVLNNVETYANLPGIIVNGGKWFRTIG TENSTGTKTFSVTGCIENTGLVEVPMGTTLRTIIYDICGGLKEGSEFKAVQIGGPSGGCL TEEHLDEPLDFDSVQKFNVIIGSGGLVVMDKHTCMVEVARFFMNFTQRESCGKCVPCREG TKRMLEILERIVEGKGEPDDIERLEQLADMISNTALCGLGKSAPLPVISTIKAFRKEYLE HINEKKCRAGVCQSMKTYIIDEETCRGCSKCAKGCPAGAIAGELKHVFTIRQQQCIKCGA CAEACPFGAVHIQS >gi|157101663|gb|DS480661.1| GENE 17 15006 - 16712 924 568 aa, chain + ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 209 558 2 357 372 294 45.0 3e-79 
MGFMTIDRIRVEFTDEPNVLSVIRKADIDIPTLCYHSELSIYGACRLCTVENERGKTFAS CSEPPRDGMVIYTNTPRLMKYRKMILELLLAAHCRDCTTCIKSGECHLQELAHRMGVTHV GYENTKTQYPIDMSSPAIVRDPNKCILCGDCVRMCDNVQSVNAIDFAYRGTEALVTPAFN KNIAETDCVNCGQCRAVCPTGAISINTNIEVIWEALADPNTKVVAQVAPAVRVAVGDFFG MARGENVMGKIVNVLHRMGFDEVFDTAYGADLTVIEESKEFLERLVSGKNLPLFTSCCPA WVSFCENRYPEFKQNLSTCRSPQQMFGAVIQEYYRNPQKSGGKKLLSVSIMPCTAKKAEI LQEESKTNGKADVDYVLTTTELVTMIRKSGIRFENIDIESSDMPFGIGSGAGVIFGNTGG VMEAVLRRLCEGHDRTEMDQIRYSGVRGEEGLKEITYDYNGRKIRAAIVSGLANADMLMK RIGNKEAEYDFVEVMACRRGCIMGGGQPVPAGPRTKAARAKGLYDTDINTQIKKSNENPL VLSLYENLLKGREHKLLHRNMAMGDIQG >gi|157101663|gb|DS480661.1| GENE 18 16806 - 17744 809 312 aa, chain + ## HITS:1 COG:SPy0824 KEGG:ns NR:ns ## COG: SPy0824 COG0583 # Protein_GI_number: 15674863 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 1 230 1 225 304 88 26.0 1e-17 MNLTDLEFVIAVERHGSISKAAKELFVAQPNLSKVIRTLEKEFGIVIFERTSKGVVTTSE GQAFIQKAKNIMKDVASLKGEFEDTAVKQEKLRISVPRASYITYAFTKYVNSIPHESQLS ISFVECNSTTAINNILSSKYGLGIIRYEMTYEDYFLMFLQLKGLQHETVMEFDYQILLSA DSPLALYPVIAEKDLDGYIEIVHGDTQLPSGEYLELPPDSGRGSQADRIYVCERGSQFDL LAMVPHTYMWVSPMPKQTLERYGLVQRSSAGRPRKMKDLLVYQMEHRKNRSEKAFIEMLK GTRDEIMTPAKL >gi|157101663|gb|DS480661.1| GENE 19 17964 - 19052 1094 362 aa, chain + ## HITS:1 COG:BH2816 KEGG:ns NR:ns ## COG: BH2816 COG0404 # Protein_GI_number: 15615379 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Bacillus halodurans # 3 361 5 364 365 356 47.0 3e-98 MERKTALYDCHVACGGKMVPFAGYSLPVQYKTGVIKEHMAVRTQAGLFDVSHMGEVLFEG PDALKNINYILTNDFTNMYDGQVRYSVMCYEDGGVVDDLIVYRYNQEKYLVVVNAANREK DVNWMKDHLNGDVVFTDISDELSQLAIQGPNADAILRKLTKDEDIPEKYYSFVPEGIVGG IKCIVSQTGYTGESGYELYVNNEDAPKLWNMLLEAGEAEGLIPCGLGARDTLRLEAAMPL YGHEMDAAIHPLETGLKFAVKMQKDDFIGKKALEEKSVLTRKRVGLRMIGRGIARENEKV YAGDREVGWTTSGTHCPFLGYAIAMAILDLDCTEIGTKVEVEVRGRRIEAEVVALPFYKK AK >gi|157101663|gb|DS480661.1| GENE 20 19101 - 19481 509 126 aa, chain + ## HITS:1 
COG:ML2077 KEGG:ns NR:ns ## COG: ML2077 COG0509 # Protein_GI_number: 15828123 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Mycobacterium leprae # 2 103 3 105 132 108 52.0 3e-24 MNVLENLKYSRSHEWVEEVDETTVRVGLTDFAQHELGDLVFVNLPEVGDTVTVKEAFADV ESVKAVSDVYSPVTGVISAVNEDILDSPEMINEAPYEAWLIEVKEVSEREELVDSAAYEK ICEEEG >gi|157101663|gb|DS480661.1| GENE 21 19485 - 20798 1382 437 aa, chain + ## HITS:1 COG:lin1386 KEGG:ns NR:ns ## COG: lin1386 COG0403 # Protein_GI_number: 16800454 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain # Organism: Listeria innocua # 3 436 5 448 448 369 44.0 1e-102 MGKYVPSTSEEQQEMLKAVGVSSFDELYTVVPSGMKVDRLAIPEGKSELEVSGIVRAMAA KNKVYSSIFRGAGAYNHYIPSIVKSITGKEEFLTAYTPYQAEISQGILQSIFEYQTMICE LTDMDVSNASVYDGASAAAEAMAMCHNRKKKKTVISGAANPMTIQTIQTYCEGNNTEAVV VPAKDGKTDLKALMEAMDQTTSGVFIQQPSYFGQIEDAAAIGEAVHAAGAKFVMGIYPIA GAALKSPRECGADVVTGEGQPLGMSLAFGGPYLGFMACTHDMMRQLPGRIVGETKDVDGR RAYVLTLQAREQHIRREKASSNICSNQALCAMTAAVYMASMGTTGVSQVASLCYSKAHYA AAEISRIPGFELVNQGDFFNEFVTKIPCDADVLLKKLSDHDILGGYPVEGGILWCVTEMN SKAQIDALAECLKEVRA >gi|157101663|gb|DS480661.1| GENE 22 20798 - 22234 1410 478 aa, chain + ## HITS:1 COG:BS_yqhK KEGG:ns NR:ns ## COG: BS_yqhK COG1003 # Protein_GI_number: 16079511 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Bacillus subtilis # 3 474 8 483 488 534 58.0 1e-151 MELIFEKSRVGHGCSLLPACDVPVAEIPAREARKKAPRFPQMAEIDISRHYTELMKRTHG VNDGFYPLGSCTMKYNPKLNEEMAALPGFTGVHPLQPEETVQGCLEVLVKAEELLCEITG MDQMTFQPAAGAHGEYTGLLLIKAYHKARGDMKRTKIIVPDTAHGTNPASCTMAGFTVIS IPSNDEGCVDLDKLREVVGDDTAGLMLTNPNTVGIFDRNILEITKIMHEAGGLNYYDGAN LNAVMGIARPGDMGFDVIHLNLHKTFSTPHGGGGPGSGAVGCKDILAPFLPKYHAVKSEK GYAFENPAQTIGSVKEFYGNFLVVVRALSYILVLGNEGIPEAAENAVLNANYMRVKLQDT FDIAFNNICMHEFVMSMTRLKKLNGVSALDFAKGLLDHGVHPPTMYFPLIVDEALMVEPT 
ETESKETLDNVVDIFKEIYKNSVENPEAMHEAPHSTPVRRLDEVGAARNPVLKYSGWK >gi|157101663|gb|DS480661.1| GENE 23 22353 - 23774 761 473 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 4 467 3 457 458 297 37 4e-80 MGERYELIVLGAGPGGYVAAIKAAQMGHRVAVVESRDVGGTCLNRGCIPTKTLVHAAEVL EEIRSCEKLGIHVGQVSYDFAGMHERKAEVVEQLRGGIEGLFKAYKIDLIRGCGIILDSN RIQVGDQAYETDRILIATGSKPAMPPIPGLTLPGVVTSDEMLEGKGVYCKKLVIIGGGVI GVEFATIYHALGCEVTIIEALDRILPTLDREVSQNLTMILKKRGIKVYTGARVDRVEHTD QSVQTAQEKGLAVYFMSKEKEQCVETERVLVSIGRRANTEGICPPSLNLQMQRGMIPVDD SFETCIKGIYAIGDIVLGGIQLAHVASAQAVNAVCSMFGERAPMNLSCIPSCIYTNPEIA AVGMTADEAKMRGIRTITGKAIMSSNGKTVIEMADRGFVKLVFEEESKVLLGAVLMCSRA TDMITGLSDGVAGKLTMSQLAATVRPHPTFSEAIGEAIEDADGGAIHAAPRRK >gi|157101663|gb|DS480661.1| GENE 24 23789 - 25468 1857 559 aa, chain + ## HITS:1 COG:NMA0617 KEGG:ns NR:ns ## COG: NMA0617 COG2759 # Protein_GI_number: 15793607 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Neisseria meningitidis Z2491 # 1 559 1 558 558 708 68.0 0 MGYKSDIEIAQEAKARPINEIAGALGIPEKYIENYGPYKAKIDYRYYNDALKDRPDAKLV LVTAINPTPAGEGKTTVTIGLADALNRLDKKAMIAMREPSLGPVFGVKGGAAGGGYAQVV PMEDINLHFTGDLHAIGAANNLLAAMLDNHIHQGNALRIDTKRITFRRAVDMNDRQLRNI VDGLGGKPDGTIRQDGFDITVASEVMAVFCLANDIMDLKERLARIIVAYDLDGQPVTAGQ LKAQGAMAALLKDALKPNLVQSLEGTPAFVHGGPFANIAHGCNSVIATKMGCKLADYMVT EAGFGADLGAEKFVDIKCRMSGLRPNAVVLVATIRALKYNGGVPKAELNNENLEALNKGI PNLLKHIENITTVYHLPAVVAINKFPSDTAAELELVENKCRELGVNVALTDVWGRGGAGG EALAKEVIRLSEGRSTMEFVYDTNLPIKEKIEAIAARVYGADGVEFAPKALKEMKNLESL GYGNLPVCMAKTQYSLTDNAKLLGRPRGFKISIRDITASVGAGFLVALAGDIMKMPGLPK VPAAEKIDVDENGSISGLF >gi|157101663|gb|DS480661.1| GENE 25 25485 - 26025 617 180 aa, chain + ## HITS:1 COG:Ta1477 KEGG:ns NR:ns ## COG: Ta1477 COG3404 # Protein_GI_number: 16082442 # Func_class: E Amino acid transport and metabolism # Function: Methenyl tetrahydrofolate cyclohydrolase # Organism: Thermoplasma acidophilum # 7 180 17 190 217 124 39.0 1e-28 
MMLDKKVTEFLDVLSSNEPVPGGGGASATVGAMGVALGIMVGNLTLGKRKYADVEPDIKE LIEKSQSLMDELKHLTDEDARVFEPLSRAYGLPRGTEEEKAAREAVMEHALLEASLVPLQ IVEKAYESLTLLEELGEKGTRIALSDVGVGALFARAALEGASLNIFINTKLMKDRTQAED Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:16:16 2011 Seq name: gi|157101662|gb|DS480662.1| Clostridium bolteae ATCC BAA-613 Scfld_02_3 genomic scaffold, whole genome shotgun sequence Length of sequence - 105936 bp Number of predicted genes - 92, with homology - 92 Number of transcription units - 36, operones - 22 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 104 - 520 362 ## COG0735 Fe2+/Zn2+ uptake regulation proteins - Prom 561 - 620 6.1 - Term 596 - 641 5.2 2 1 Op 2 . - CDS 677 - 1636 1078 ## COG0583 Transcriptional regulator 3 1 Op 3 . - CDS 1633 - 2457 695 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family - Prom 2507 - 2566 8.9 + Prom 2521 - 2580 6.4 4 2 Op 1 . + CDS 2669 - 3445 318 ## PROTEIN SUPPORTED gi|89099121|ref|ZP_01172000.1| 30S ribosomal protein S1 + Prom 2521 - 2580 6.4 5 2 Op 2 . + CDS 2669 - 3445 223 ## PROTEIN SUPPORTED gi|229004370|ref|ZP_04162139.1| 30S ribosomal protein S1 6 2 Op 3 . + CDS 3469 - 4128 719 ## COG2357 Uncharacterized protein conserved in bacteria 7 2 Op 4 . + CDS 4129 - 5931 2005 ## COG0584 Glycerophosphoryl diester phosphodiesterase 8 2 Op 5 . + CDS 5948 - 6403 602 ## COG1490 D-Tyr-tRNAtyr deacylase 9 2 Op 6 . + CDS 6494 - 6958 480 ## Closa_0124 hypothetical protein 10 2 Op 7 2/0.000 + CDS 6955 - 8382 1163 ## COG3864 Uncharacterized protein conserved in bacteria 11 2 Op 8 . + CDS 8427 - 9956 1565 ## COG0714 MoxR-like ATPases + Term 10010 - 10061 12.1 + Prom 10014 - 10073 6.9 12 3 Tu 1 . + CDS 10138 - 11319 425 ## COG3464 Transposase and inactivated derivatives + Term 11522 - 11550 -0.1 - Term 11399 - 11442 11.7 13 4 Tu 1 . 
- CDS 11579 - 12823 1128 ## COG1301 Na+/H+-dicarboxylate symporters - Prom 13035 - 13094 8.0 + Prom 12942 - 13001 10.1 14 5 Op 1 . + CDS 13216 - 14781 1328 ## COG2978 Putative p-aminobenzoyl-glutamate transporter + Prom 14802 - 14861 5.1 15 5 Op 2 . + CDS 14957 - 16306 1219 ## COG0534 Na+-driven multidrug efflux pump + Term 16361 - 16399 -0.8 + Prom 16361 - 16420 6.2 16 6 Op 1 . + CDS 16479 - 17189 678 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 17 6 Op 2 . + CDS 17274 - 18590 1366 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 18 6 Op 3 . + CDS 18565 - 19278 805 ## COG1011 Predicted hydrolase (HAD superfamily) + Prom 19313 - 19372 5.7 19 7 Op 1 31/0.000 + CDS 19435 - 20346 1223 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 20368 - 20408 6.4 20 7 Op 2 34/0.000 + CDS 20449 - 21108 918 ## COG0765 ABC-type amino acid transport system, permease component 21 7 Op 3 . + CDS 21129 - 21893 507 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Prom 22022 - 22081 8.6 22 8 Tu 1 . + CDS 22109 - 22783 529 ## Sterm_2685 AroM family protein + Prom 22899 - 22958 9.1 23 9 Op 1 2/0.000 + CDS 22998 - 24185 992 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase + Term 24196 - 24246 9.1 + Prom 24200 - 24259 9.9 24 9 Op 2 38/0.000 + CDS 24295 - 25947 1874 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 26004 - 26033 1.2 25 9 Op 3 49/0.000 + CDS 26053 - 27006 1006 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 26 9 Op 4 1/0.000 + CDS 27007 - 27906 868 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 27 9 Op 5 . 
+ CDS 27909 - 29093 1007 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 28 9 Op 6 44/0.000 + CDS 29105 - 30124 582 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 29 9 Op 7 1/0.000 + CDS 30135 - 31106 863 ## COG4608 ABC-type oligopeptide transport system, ATPase component 30 9 Op 8 . + CDS 31142 - 32371 1313 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 31 9 Op 9 . + CDS 32392 - 33333 886 ## Sterm_2686 hypothetical protein 32 10 Tu 1 . - CDS 33344 - 35002 1356 ## COG2508 Regulator of polyketide synthase expression - Prom 35029 - 35088 8.0 33 11 Op 1 . + CDS 35135 - 35299 133 ## gi|160936793|ref|ZP_02084159.1| hypothetical protein CLOBOL_01683 34 11 Op 2 . + CDS 35328 - 35963 844 ## gi|160936794|ref|ZP_02084160.1| hypothetical protein CLOBOL_01684 + Term 36048 - 36081 -0.4 + Prom 35972 - 36031 5.2 35 12 Op 1 . + CDS 36150 - 37787 1980 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 37805 - 37843 4.2 36 12 Op 2 . + CDS 37879 - 38364 504 ## COG1854 LuxS protein involved in autoinducer AI2 synthesis 37 12 Op 3 . + CDS 38441 - 39196 729 ## COG0708 Exonuclease III 38 12 Op 4 . + CDS 39209 - 41056 2170 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 41073 - 41129 12.0 + Prom 41141 - 41200 8.2 39 13 Op 1 . + CDS 41292 - 43319 1584 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases 40 13 Op 2 . + CDS 43216 - 44376 1135 ## COG2195 Di- and tripeptidases + Term 44474 - 44530 10.1 + Prom 44417 - 44476 4.0 41 14 Tu 1 . + CDS 44618 - 47374 2679 ## COG0474 Cation transport ATPase + Prom 47404 - 47463 3.0 42 15 Op 1 . + CDS 47497 - 48393 995 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) 43 15 Op 2 . + CDS 48425 - 48874 568 ## COG1803 Methylglyoxal synthase 44 15 Op 3 . 
+ CDS 48944 - 51991 2808 ## COG0296 1,4-alpha-glucan branching enzyme + Term 52008 - 52049 12.2 + Prom 52004 - 52063 3.6 45 16 Op 1 7/0.000 + CDS 52241 - 53704 1448 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 46 16 Op 2 . + CDS 53719 - 54513 831 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 54583 - 54642 6.2 47 17 Op 1 11/0.000 + CDS 54705 - 55835 1358 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component + Term 55851 - 55897 -0.9 48 17 Op 2 11/0.000 + CDS 55908 - 56432 541 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 49 17 Op 3 . + CDS 56432 - 57727 943 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 + Term 57774 - 57808 6.7 + Prom 57753 - 57812 3.0 50 18 Op 1 . + CDS 57834 - 58319 309 ## Closa_0824 hypothetical protein 51 18 Op 2 . + CDS 58387 - 59406 531 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 + Term 59451 - 59511 12.5 + Prom 60368 - 60427 80.3 52 19 Tu 1 . + CDS 60566 - 61015 327 ## COG3464 Transposase and inactivated derivatives - Term 61095 - 61138 11.7 53 20 Tu 1 . - CDS 61188 - 62363 1029 ## Closa_3855 hypothetical protein - Prom 62397 - 62456 6.7 + Prom 62493 - 62552 2.1 54 21 Op 1 1/0.000 + CDS 62574 - 63152 611 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 55 21 Op 2 . + CDS 63149 - 64540 1120 ## COG0726 Predicted xylanase/chitin deacetylase + Term 64575 - 64625 13.6 + Prom 64712 - 64771 6.9 56 22 Op 1 . + CDS 64949 - 65977 482 ## Closa_0425 hypothetical protein 57 22 Op 2 . + CDS 66052 - 66885 847 ## COG0005 Purine nucleoside phosphorylase + Prom 66891 - 66950 2.7 58 22 Op 3 . + CDS 67029 - 67862 1204 ## gi|160936827|ref|ZP_02084192.1| hypothetical protein CLOBOL_01716 + Term 67877 - 67923 10.1 + Prom 67875 - 67934 8.7 59 23 Tu 1 . 
+ CDS 68079 - 70670 1900 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Term 70699 - 70750 9.4 + Prom 70825 - 70884 7.2 60 24 Tu 1 . + CDS 70982 - 71161 129 ## EUBREC_0688 hypothetical protein + Term 71186 - 71235 12.1 61 25 Tu 1 . + CDS 71264 - 72454 1060 ## COG1459 Type II secretory pathway, component PulF 62 26 Op 1 . + CDS 72579 - 73922 1093 ## gi|160936833|ref|ZP_02084198.1| hypothetical protein CLOBOL_01722 63 26 Op 2 . + CDS 73909 - 74559 555 ## gi|160936834|ref|ZP_02084199.1| hypothetical protein CLOBOL_01723 64 26 Op 3 . + CDS 74543 - 74875 306 ## gi|160936835|ref|ZP_02084200.1| hypothetical protein CLOBOL_01724 65 26 Op 4 . + CDS 74932 - 75420 475 ## gi|160936836|ref|ZP_02084201.1| hypothetical protein CLOBOL_01725 66 26 Op 5 . + CDS 75420 - 76211 555 ## gi|160936837|ref|ZP_02084202.1| hypothetical protein CLOBOL_01726 67 27 Tu 1 . - CDS 76240 - 76995 549 ## COG1989 Type II secretory pathway, prepilin signal peptidase PulO and related peptidases - Prom 77029 - 77088 9.2 + Prom 76923 - 76982 4.6 68 28 Op 1 . + CDS 77068 - 78786 1568 ## COG2804 Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB 69 28 Op 2 . + CDS 78826 - 79182 448 ## gi|160936840|ref|ZP_02084205.1| hypothetical protein CLOBOL_01729 70 28 Op 3 . + CDS 79255 - 79677 492 ## gi|160936841|ref|ZP_02084206.1| hypothetical protein CLOBOL_01730 71 28 Op 4 . + CDS 79693 - 81225 1604 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 72 29 Tu 1 . + CDS 81365 - 81862 309 ## Closa_0447 hypothetical protein + Prom 81867 - 81926 6.6 73 30 Op 1 . + CDS 81968 - 82720 777 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 74 30 Op 2 . + CDS 82723 - 83859 1198 ## COG0077 Prephenate dehydratase 75 30 Op 3 . + CDS 83875 - 85143 1313 ## COG2607 Predicted ATPase (AAA+ superfamily) + Prom 85154 - 85213 3.8 76 31 Op 1 11/0.000 + CDS 85271 - 86836 1868 ## COG0248 Exopolyphosphatase 77 31 Op 2 . 
+ CDS 86839 - 89232 2349 ## COG0855 Polyphosphate kinase + Term 89291 - 89328 7.8 - Term 89268 - 89324 10.9 78 32 Op 1 4/0.000 - CDS 89349 - 90692 1532 ## COG2610 H+/gluconate symporter and related permeases 79 32 Op 2 . - CDS 90715 - 91863 1209 ## COG3835 Sugar diacid utilization regulator 80 32 Op 3 . - CDS 91927 - 92730 671 ## COG1237 Metal-dependent hydrolases of the beta-lactamase superfamily II - Prom 92759 - 92818 9.9 + Prom 92756 - 92815 9.4 81 33 Op 1 40/0.000 + CDS 93036 - 93740 966 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 82 33 Op 2 . + CDS 93730 - 94866 1368 ## COG0642 Signal transduction histidine kinase 83 33 Op 3 . + CDS 94901 - 97417 2544 ## Closa_0455 hypothetical protein 84 33 Op 4 . + CDS 97417 - 98379 1127 ## COG1131 ABC-type multidrug transport system, ATPase component 85 33 Op 5 . + CDS 98357 - 99289 914 ## Closa_0457 hypothetical protein 86 33 Op 6 . + CDS 99292 - 100758 1361 ## COG4260 Putative virion core protein (lumpy skin disease virus) 87 33 Op 7 . + CDS 100786 - 101886 956 ## Closa_0460 hypothetical protein 88 33 Op 8 . + CDS 101883 - 102875 876 ## COG1512 Beta-propeller domains of methanol dehydrogenase type + Term 102958 - 102999 -0.2 + Prom 103026 - 103085 4.8 89 34 Tu 1 . + CDS 103158 - 103418 334 ## COG2827 Predicted endonuclease containing a URI domain + Term 103459 - 103515 11.6 - Term 103447 - 103503 11.6 90 35 Tu 1 . - CDS 103531 - 104202 711 ## COG1272 Predicted membrane protein, hemolysin III homolog - Prom 104286 - 104345 4.4 + Prom 104238 - 104297 6.4 91 36 Op 1 . + CDS 104499 - 105284 307 ## COG0564 Pseudouridylate synthases, 23S RNA-specific + Prom 105335 - 105394 8.0 92 36 Op 2 . 
+ CDS 105433 - 105816 578 ## gi|160936868|ref|ZP_02084233.1| hypothetical protein CLOBOL_01757 + Term 105819 - 105866 10.6 Predicted protein(s) >gi|157101662|gb|DS480662.1| GENE 1 104 - 520 362 138 aa, chain - ## HITS:1 COG:SA1678 KEGG:ns NR:ns ## COG: SA1678 COG0735 # Protein_GI_number: 15927435 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Staphylococcus aureus N315 # 3 128 23 148 148 97 33.0 5e-21 MKALKYSRQRESIKACLMNRHDHPTADALYASIREEFPNISLGTVYRNLNLLAETGEIRK LTCGDGAVHFDGDTRQHYHFVCSECNQVYDMELGAMEDLNKEAQEHAPGIIDSHYVLFYG RCNSCLQKKVDKAIDKPA >gi|157101662|gb|DS480662.1| GENE 2 677 - 1636 1078 319 aa, chain - ## HITS:1 COG:CAC3361 KEGG:ns NR:ns ## COG: CAC3361 COG0583 # Protein_GI_number: 15896604 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 2 317 1 296 312 167 34.0 2e-41 MINFLNLEYFLVAAEELNFTRAARKLYISQQSLSNHISNLEKEFDVILFNRTSPLTLTYA GRALKTRARELLDLRDETYKEISDIKDFSTGQLSIGVSHTRGRVILPEILPTYQSQFPGI ELHLAEGNSSQLASDLLHGNIDLMIDLLPFTAENVETVPICNEEILMVVPDETLKKAYPG QVEEIKEKLRAASDIRLLEHCPFVLLKKGNRVRTIADEIFEDAQMSPRIVLETENIETVL ALSGKGMGITFYPRMFMSPDPASHSYRSAVLEQNHLNLYSLNYTRAHGVLGIGYHKGRYM SRAAHEFIRIAKERLDYAP >gi|157101662|gb|DS480662.1| GENE 3 1633 - 2457 695 274 aa, chain - ## HITS:1 COG:CAC0509 KEGG:ns NR:ns ## COG: CAC0509 COG1387 # Protein_GI_number: 15893800 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Clostridium acetobutylicum # 4 234 4 232 244 191 41.0 1e-48 MIDLLDLHTHTTASGHAYNSLYEMVHSASVKGLSLFGCSDHAPAMPGSCHSFHFINFKVL PRTLYGVRLMMGVELNIMDYDGTVDLEQSVLEPLDYAIASLHQPCIHSGTALQNTSAYLG ALKNPLIHIIGHPDDSRFPIDYDTLVAAAGEHHKLLEVNNSSLNPLSFRVGARDNYIKML ELCRHYGTSIIINSDAHCEADAGNHGFAHALLEEVDFPQELIVNTSLDRLCGFLPKAAAI LAQQEDERCGHLYRSGTIQEETVQEGNASGGNES >gi|157101662|gb|DS480662.1| GENE 4 2669 - 3445 318 258 aa, chain + ## PROTEIN SUPPORTED 
## NR: gi|89099121|ref|ZP_01172000.1| 30S ribosomal protein S1 [Bacillus sp. NRRL B-14911] # 23 222 60 262 385 127 35 3e-28 MSEELNKELENEVTEEVKEAAPMETMEDYAAELEASYKTLNQRHIEITEEESGAGAKWAQ FAQMMEDRTVVKVKIAEAVKGGVVTSLEEVRAFIPASQLSTEYVEKLEDWQGKYVEAVII TVDPEKKRLVLSGREVEKEKKEAQKKERMAQFKAGDVVEGTVDSIKPYGAFIKLDDGVDG LLHISQISTQRIKHPSAVLTEGQTVKVKILSAQEGKLSLSMKVLAEQQADKEEHETFDYK ETGSVSTGLGDLLKGLKL >gi|157101662|gb|DS480662.1| GENE 5 2669 - 3445 223 258 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229004370|ref|ZP_04162139.1| 30S ribosomal protein S1 [Bacillus mycoides Rock1-4] # 111 257 4 151 152 90 36 3e-17 MSEELNKELENEVTEEVKEAAPMETMEDYAAELEASYKTLNQRHIEITEEESGAGAKWAQ FAQMMEDRTVVKVKIAEAVKGGVVTSLEEVRAFIPASQLSTEYVEKLEDWQGKYVEAVII TVDPEKKRLVLSGREVEKEKKEAQKKERMAQFKAGDVVEGTVDSIKPYGAFIKLDDGVDG LLHISQISTQRIKHPSAVLTEGQTVKVKILSAQEGKLSLSMKVLAEQQADKEEHETFDYK ETGSVSTGLGDLLKGLKL >gi|157101662|gb|DS480662.1| GENE 6 3469 - 4128 719 219 aa, chain + ## HITS:1 COG:lin0794 KEGG:ns NR:ns ## COG: lin0794 COG2357 # Protein_GI_number: 16799868 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 9 214 7 212 212 221 53.0 9e-58 MTNIPKADDQVDQWRSVMFLYDSALKKVNTKIEILNNEFANRYDYNPIEHIKSRLKSAES IVMKLKKDGYEVTIENMMECLSDIAGIRIICSFTSDIYQIAGMIAAQGDVTVLHVKDYIK NPKPNGYKSYHMVVTVPVYLTDGPVETKVEIQIRSVAMDFWASLEHKIAYKFEGNAPENL LKELKACADMVDILDAKMFSLNEAITAFAREQKERSEES >gi|157101662|gb|DS480662.1| GENE 7 4129 - 5931 2005 600 aa, chain + ## HITS:1 COG:CAP0015 KEGG:ns NR:ns ## COG: CAP0015 COG0584 # Protein_GI_number: 15004719 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Clostridium acetobutylicum # 325 575 9 264 281 158 33.0 2e-38 MNRLIKETDQVIKLNLRQLVIFEVLFRLVAGTFYIRLVNQLLRFSLRMAGYSYLTMSNMG AFLLRPATIVCALLAAVLGMVLMVVEIGGLITAYQAAAYSMRVDSVYILKGALGKAWDAC KGRNWKLLPLALADYVMMNSYLLLRILTKIKPLNFVMYEIMHAPAARLGLVMLTAALVLI GIPTMLVFFACMIEQKQFRDGFRRSMELLKGRSLRAVVLLLVVNFGLAVGLVLMYVVIVV 
VSAVIVTLFVDSYAAMAVLAAVCTKLEVVVLFIGGILAPLVDFGALTVIYYQYGMKSAHA APWDFTMPYELHFKRKWILTITGALAGASLFLIFDMVYNGTSPDWDVLGQTEVTAHRGSS RMAPENTMAAIEAAMEEMADYSEIDVQTTADGIVVVCHDLNLKRVAGVDRRLGTMTYDQV SRLDVGSHFSPRFAGERIPTLEEVLEACKGRMKLNIELKNIGNSSSLPEQVAALVREYGM EDQCVITSVKLGYLERVKEMAPELRTGYILAAAYGTYYDNEYIDFISIRSSFVGRKLVEA AHEKGKAVHVWTVNSKTEIEQMKLLGVDNIITDYPVRAREILYREETTETLMEYLKMMLR >gi|157101662|gb|DS480662.1| GENE 8 5948 - 6403 602 151 aa, chain + ## HITS:1 COG:CAC2273 KEGG:ns NR:ns ## COG: CAC2273 COG1490 # Protein_GI_number: 15895541 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Clostridium acetobutylicum # 1 149 1 149 149 166 55.0 2e-41 MRAVVQRVTQASVTVDGELLGRIGKGFLILLGVADGDTRQMAEKMADKICRLRIFEDENG KTNLSLEDVEGELLVVSQFTLYADCRKGNRPSFIKAGAPQMAESLYKHFMERCRTHVDVV EKGRFGADMKVELLNDGPFTLMLDSQELFGE >gi|157101662|gb|DS480662.1| GENE 9 6494 - 6958 480 154 aa, chain + ## HITS:1 COG:no KEGG:Closa_0124 NR:ns ## KEGG: Closa_0124 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 150 1 132 136 125 44.0 4e-28 MVKMMETVLYYNPGRPETMKHVAMMKSVLVRMGVRIRNIGPEQVLEKVGYLAGMEGYEAA GGSAGADGAGAQGADGERTGAGPGELPVIPEEVMVLKQFSSQRLDMLLAGLRRAGVPRIA LKAVLTEHNSDWTFYHLYQELKEEHETMTAGQQP >gi|157101662|gb|DS480662.1| GENE 10 6955 - 8382 1163 475 aa, chain + ## HITS:1 COG:DR1169 KEGG:ns NR:ns ## COG: DR1169 COG3864 # Protein_GI_number: 15806188 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 1 427 1 367 379 62 23.0 2e-09 MTDPTRKRQLEELDQVTLCARILLQARSELYVNMKFLDVSLSSLGFEADWAREGLGTDGH LIYYGPDYLLGLYKRGRTLVNRGYLHMLFHCLFCHMYTRKDRDKGYWDLSCDIAMEYVID GLYQKCVHVPRSALRREIYLRLEKALSEKGPVLTAERVYRALLEMELPDRRLDMLKAEFF VDSHDLWEQEDSPKIVRSRQNKWNDNREKMQTELETGSKDASEDNRSLLEEVQVENRERY DYSRFLQKFAVLKEEMQVDPDSFDYAFYTYGLSLYGNMPLIEPLESREVYRVEDFAIVID TSMSCSGELVRRFLEETYDVLSETESYFKKVHLHIIQCDDAVQSDVLVTSQEELKAYMDD 
FQIRGSGGTDFRPAFEYVNGLKKSRQAASLKGLIYFTDGKGIYPVQAPSYDTAFVFIENM FSDESVPPWAMKVVLEEEQLMEYRPGSRNRDTEADRNRRNGEMPADQDSRNRHGQ >gi|157101662|gb|DS480662.1| GENE 11 8427 - 9956 1565 509 aa, chain + ## HITS:1 COG:DR1171 KEGG:ns NR:ns ## COG: DR1171 COG0714 # Protein_GI_number: 15806190 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Deinococcus radiodurans # 18 238 3 212 340 97 26.0 8e-20 MNIKRAKEEIKDTIEAYLLKDENGEYVIPSIRQRPVLLIGPPGMGKTQIMEQISRECRVG LISYTITHHTRQSAVGLPFIEKKMYGDREYAVTEYTMSEIVASIYNRMEETGLKEGILFI DEINCVSETLAPTMLQFLQCKTFGSHKIPEGWIIAAAGNPPEYNKSVREFDVVTLDRIKR IDVEPDLAVWKEYAYAENIHPAIISYLSIKGQNFCRIETTVDGKLFATPRGWEDLSQLIL VYESLEKRADRDVISQYIQHPRIAKDFANYLELYYKYQNEYQVDEILQGTIREALCDRVA RAPFDERLSVTGLILSRLTEGFKKLWDKNASMELLMEQLKRYRQEMDGRAKTSANSAAAQ SQSPAMALSCLAQELEKERLHRKKSGLSGRREDRLWLCVRIMLEEYAQQMRKEAVEDREQ AWEWVRRKFMEHSDCYEAMKEDCGSRLEHAFDFMEAAFGTGQEMVIFVTELNTSEACVRF LDEYECERYYQYNKDLLFDEQEQAIRQKL >gi|157101662|gb|DS480662.1| GENE 12 10138 - 11319 425 393 aa, chain + ## HITS:1 COG:BH3752 KEGG:ns NR:ns ## COG: BH3752 COG3464 # Protein_GI_number: 15616314 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 4 389 60 450 463 219 35.0 6e-57 MYPNYTKAFLNLEGVFIKKVVQADSFIKIFIQSQPVEQTCPCCGAKTKRIHDYRLQEVQD IPLLGKQVILLLRKRRYLCPYCRKRFTEPYSFLPSYHRRTRRLAFYIVSLLRQTFSLKQI AELTGVSVQTVCRLLDTICYPPPDQLPQALSIDEFKGNASTGKYQCILVDPKKRRILDIL PDRTQSHLADYWRNIPRKERLKVKFFVCDMWRPYTELAQTFFPNATIIVDKYHFIRQVTW AIENVRKRLQRSMPVSLRKYYKRSRKLILTRYKKLKDENKQACDLMLHYSEDLRLAHRMK EWFYDICQMEAYRQQQREFDDWIANAQSCGIKEFEACAKTYRAWRKEILNAFKYGLTNGP TEGFNNKIKVLKRSSYGIRNFKRFRTRILHCTS >gi|157101662|gb|DS480662.1| GENE 13 11579 - 12823 1128 414 aa, chain - ## HITS:1 COG:RP176 KEGG:ns NR:ns ## COG: RP176 COG1301 # Protein_GI_number: 15604051 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Rickettsia prowazekii # 8 406 1 397 399 157 27.0 3e-38 
MEGKKKGLSLTTQILIATAGGIVFGSLIGPWASNLKFIGDIFIRLIQMSVVLLVMTAVAG AVGSGDGQDVGKMGFHTFKWIIFFTLISAGLGVVLSMLIQPGIGIEIANAEAIANASAET ASLQDTLLGFVSTNIIGSMADSAMVPCIVFALFFGTAMGTYTRQSGNRNMVEWVTGFNTI ITNIIKAVMHVAPIGIFCLLANVAGSTGFKVIIPMIKFLGVLLLGDAIQFLIYGPLTAAL CKVNPAKMPKKFAKMSMMAVTTTSGAICLPTKMEDAVVKFGVSRKVADFTGPITMSMNSC GAALCYVAAIFFMAQSTGIQLTTYQMGMAILLSCLMCMGTIVVPGGSVIVYTFLASSLGL PLESIAILIGIDWFSGMFRTLMNVDVDVMVGMLVSSKLGDLDRDVYNETKVVKY >gi|157101662|gb|DS480662.1| GENE 14 13216 - 14781 1328 521 aa, chain + ## HITS:1 COG:FN0470 KEGG:ns NR:ns ## COG: FN0470 COG2978 # Protein_GI_number: 19703805 # Func_class: H Coenzyme transport and metabolism # Function: Putative p-aminobenzoyl-glutamate transporter # Organism: Fusobacterium nucleatum # 11 512 4 502 512 486 51.0 1e-137 MGGIVLANDGQKKSMVLKALDTVERVGNRLPDPAVIFLILTGIVILVSALCGFLGLSVTY DFLDRTSGEITKMTVEAVSLLAPDSIRYMVTAVVPNFTGFFALGTVFTIMIGVGVADGTG FMSALLRRVASSTPKSLATAVVVFLGIMSNIASSTGYVVLVPLGAVLFTAFGRHPAAGLA AAFAGVSGGWSANLLIGTNDPMFAGMSTQAARMIDPEYTVLPVCNWYFMVASTVLITVIG TLVTDRLVEPRLGPWGPVETGTIGDIGENERRGMGWAGITVLVYAAVMALLTVLPGGLLR DPQTHSLLNSPFMDGIIFFMMLLFLLPGLAYGIGAGTLKNSRDAVALVNKSISGLAGFMV LIFFAAQFTACFNYSNLGTIISVKGADFLQSVGLTGLPLIVCFILLTACINIFIAVDSAK WAIMAPIFVPMFMRLGLSPELTQAAYRIGDSSTNIIAPLMPFFVLTVAFFQKYDRDAGIG SVISTMLPYSGAFLLGWIVMFTVWYLTGLPLGPGSPLFYGL >gi|157101662|gb|DS480662.1| GENE 15 14957 - 16306 1219 449 aa, chain + ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 14 442 92 527 547 194 32.0 3e-49 MEQNGAGVRMFTNQDLKRLILPLIVEQILAVSVGMVDTMMVSNAGEAATSGVSLVDMVNT LFINIFAAVATGGAVVSSQYLGQRRRDRACQSANQLILIIACISLIIMVLCILFRRGVLH LLYGGVAGDVMANALVYLTISALSYPFLAVYNSCAALFRSMGNSKISMQASIIMNIINVI GDSLFIFVFHWGVAGAAAASLISRMTACFILLFRLKNKNLDIFIGGKWNLNFRMVKQILG IGIPNGIENSIFQLGRVLVVGIIAMFGTTQIAANAIANNLDGMGVLPGQAMNLAMITVVG RCVGAGDFDQAGYYAKKMMKITYLVNGLCCIAVILTMPLSLSLYGLSKEALELGAVLVLI 
HDGCAVFLWPSSFCLANVLRAASDVKFPMCVSILSMLLFRIGFSYVLAVGLGMGAVGVWW AMIADWSVRSAFFGWRFASGKWKTFYHAL >gi|157101662|gb|DS480662.1| GENE 16 16479 - 17189 678 236 aa, chain + ## HITS:1 COG:FN1387 KEGG:ns NR:ns ## COG: FN1387 COG2220 # Protein_GI_number: 19704722 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Fusobacterium nucleatum # 3 211 2 213 237 108 28.0 8e-24 MKITYIHHSSFMAELDHAVLLFDYFEGNIPETDREKPLFVFASHRHGDHFSRSIFDLEES RGNVTYVLSDDIWTRQVPEDLLGRTIFLGPGEEVSLKDGAGNPVDIRAFKSTDQGVAFLA SLEGRTIYHAGDLNNWVWEGEPEADNRRMSENFHREMDKLAGRHIDVAFMLIDPRQEKDF YLGMDDFMRMVGADVVFPMHFWEDFQAAARFKALACADGYKDRIQEIHRRGEEFLV >gi|157101662|gb|DS480662.1| GENE 17 17274 - 18590 1366 438 aa, chain + ## HITS:1 COG:CAC0519 KEGG:ns NR:ns ## COG: CAC0519 COG0044 # Protein_GI_number: 15893810 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Clostridium acetobutylicum # 1 434 1 424 424 422 49.0 1e-118 MILIRAGRVLGPAEGLDIEADVVLDGDRIAGIFPADQGGAEGAPGFSTRERQYEQIIEAE GLAAAPGLVDVHVHFRDPGPTYKEDIHTGARAAAKGGFTTVVCMANTNPIVDNEETLRYV LEEGKRTGIHVLSCAAVSKGFAGRELTDMEGLLAAGAAGFTDDGIPLMEGTFVREAMEEA ARLDTVLSFHEEDKHFIRENGINHGRISDELGIYGSPALAEDSLVARDCMIALDTGATVD IQHISSGHSVEMVRLAKKLGAKVWAEVTPHHFTLTEEAVLKHGTLAKMNPPLRTEKDRQM LIEGLKDGTIDMIATDHAPHSREEKDRPLTQAPSGILGLETALALGITNLVKPGHLTLLE LMEKMSYNPARLYKLDCGRMEAGAPADLVLFDADRQWTVEGFESKSSNSPFLGATLTGKV MYTVCKGRIVYGDQGTEV >gi|157101662|gb|DS480662.1| GENE 18 18565 - 19278 805 237 aa, chain + ## HITS:1 COG:BS_yfnB KEGG:ns NR:ns ## COG: BS_yfnB COG1011 # Protein_GI_number: 16077800 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Bacillus subtilis # 6 231 2 227 235 187 44.0 1e-47 MEIRERKFDVILLDVDGTLLDFGMSEKQGMKVVLEQYGFEPTEERLLLYHEINEGFWSAF ERGEVTKEDLVRQRFETFFGRLGRAVDGREAEELYRRQLDASAFLIDGALELCAYLKDRY DLYVVTNGTSSTQYKRLAASGLDGFMKDIFVSEDAGSQKPQKEYFDYCFSRIPDANPRRM 
LLIGDSPASDIKGGMAAGTYTCWYNPGGQTLPEGIRADYEVGSHKELMNLLQAPVLP >gi|157101662|gb|DS480662.1| GENE 19 19435 - 20346 1223 303 aa, chain + ## HITS:1 COG:SP1500 KEGG:ns NR:ns ## COG: SP1500 COG0834 # Protein_GI_number: 15901347 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Streptococcus pneumoniae TIGR4 # 58 300 22 271 278 147 34.0 3e-35 MKKRLFSLALASVMALSLAGCGGSKEAATEAETKAETEQTAADTSAKAEDTTAQEETKAE SEAASEAVSESAGGTLIVGFDQDFPPMGFMGDNGEYTGFDLELAKEVAERLGLEYTAQPI AWDSKDMELEAGNIDCIWNGFTMTGREDDYTWSEPYMANTQVFVVAKDSGIASQADLAGK IVECQVDSSAEAALKEVPDLTATFKQLLTTADYNSAFMDLEQGAVDAIAMDVIVAGYQIQ QRNADFIILEDSLSAEEYGVGFKKGNTELRDKVQATLEEMAADGTLKSVSEKWFGEDVTT IGK >gi|157101662|gb|DS480662.1| GENE 20 20449 - 21108 918 219 aa, chain + ## HITS:1 COG:SP1502 KEGG:ns NR:ns ## COG: SP1502 COG0765 # Protein_GI_number: 15901349 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 8 219 6 213 213 154 44.0 2e-37 MGKMTIQQMLIQLAGGMGTSIQIFLVTLIFSLPLGLLVAFGRMSKNRLLQAVVKVYISIM RGTPLMLQLMVVYFGPYYLFRIKVGSGYRLWATFIGFVINYAAYFAEIYRSGIQSMPVGQ YEAAQILGYSRFQTFFKIILPQVIKRILPSVTNEVITLVKDTSLAFTLSVAEMFSIAKAL AASQTNMIPFVAAGLFYYIFNLVVAVGMEWVEKKMDYYR >gi|157101662|gb|DS480662.1| GENE 21 21129 - 21893 507 254 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 250 1 243 245 199 42 4e-50 MKLLELNHVEKSFDGLGVIKDISLSVEEGEIVSVIGPSGSGKSTLLRCATMLETMDSGEV IYLGKRAAWTEDGRAVYASKKDLKEIQSQYGLVFQNFNLFPHYSVMKNITDAPIHVQKRD KAQVYEEARALLKKMGLEDKENAYPCQLSGGQCQRVAIARALALNPKILFFDEPTSALDP ELTGEVLKVIKSLADLDIAMVIVTHEMAFARDISDRVVFMAGGVIVEEGTPEQVFASDNE RTQSFLGRYGALLN >gi|157101662|gb|DS480662.1| GENE 22 22109 - 22783 529 224 aa, chain + ## HITS:1 COG:no KEGG:Sterm_2685 NR:ns ## KEGG: Sterm_2685 # Name: not_defined # Def: 
AroM family protein # Organism: S.termitidis # Pathway: not_defined # 3 222 4 223 227 199 46.0 5e-50 MFKLGAITVGQSPRTDVTDDIMGIFQGKVEILERGALDGLTVEDIDKLAPDAGEYVLVSR MRDGSQVSFSEQKILPRIQECIEQLEAEGVKMILFFCTGEFDYKFKSRVPLIFPCDLISR LIPVLCGDRQLIVVTPTQRQVQQSQEKWRKYVNDVVTVAASPYGGWEEIEQACRKISALQ GPLVVMDCIGYSFAMKREVAEKTGKTVVLSRTMAARVISELSDI >gi|157101662|gb|DS480662.1| GENE 23 22998 - 24185 992 395 aa, chain + ## HITS:1 COG:FN1063 KEGG:ns NR:ns ## COG: FN1063 COG1473 # Protein_GI_number: 19704398 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Fusobacterium nucleatum # 7 390 4 393 394 269 37.0 9e-72 MDNRYTLTRAGELSDELISIRRKIHQNPEIGFDLPETTALVAEKLKAYGIEFKKVGKAGI SAVLGNAEAGGKTFLLRADMDALPFEELTGLTFASNNGCMHACGHDIHTSALLGAARILK EREGELRGRVKLMFQPCEEDVGGAADMVEAGVLENPSVDGAMALHVVHHSMGSVGYSTGA ACASSDVFTITIHGEGGHGAVPDSCIDPINAAVHIHMALQALNSRETHPDEMLVLTICEF HSGAAANVFSDTAVMRGTIRTRNQKVREYARRRLEEISSTVASAFGASARVEYLYSGVPP MVNDQELLEEAAGYIDRLLGPGTCYELPRMTGSEDFSVLSQLVPSVLFWVGTGSQEEGYP YGVHNPKVTFSEDMIPVMSAIYAEVAICWLNNHPN >gi|157101662|gb|DS480662.1| GENE 24 24295 - 25947 1874 550 aa, chain + ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 518 1 485 511 222 30.0 2e-57 MKKRWMVSLMAVAMAVSMAACGKSSGGGTENAGSSAPAGTTAAEAGASGETQTADAGEPV DGGVLTISLSSSPKNLDPVKYTGTYESQIIGSVCDTLVEYNSSLTEIMPSIAASWEVSDD GKEYTFKLRDDVYFQPGQYQDGRQVTAEDIKYSLERSHELSAMNRLDMLDHCEVISDTEV KCVLENPNAVFLTALTDAGNVIVPKEEVEGWGDDFGTHLVGSGPFALKEFKLDQQATLVR NEKYWAQKPHLDGVVFRPVSDGNQAVNALRTGEVNVATSLTGEAVQVARDDDTVKLLEMP GLHVSYLYFNMVNGPTKDPKVREALIKAVNVEELTGGVYQYGEAVPASLPLPPGSWGYDE AVKAEVPTYDPEAAKALLAEAGYPDGFDLNLYISNTPTRVKMATLFQAYLQQNLNVNVNI NTSEWGTFSEIGASGQADVFGMSWTWYPDPYFFLNKLFYSGEIGSLGNGQGYNNPEVDKC LENALLTTDQNERADLYKKALALIVKDMPGIFYANENVEWGVTPNVQGLVQRADGKVKIC TTDINVWLSK 
>gi|157101662|gb|DS480662.1| GENE 25 26053 - 27006 1006 317 aa, chain + ## HITS:1 COG:BH3643 KEGG:ns NR:ns ## COG: BH3643 COG0601 # Protein_GI_number: 15616205 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 1 313 1 333 333 257 42.0 2e-68 MLKTLINRCLQIIPALFVVVTLTFVLTRMIPGDPARAVAGPQASVQDVEKLRESMGLNEP MLVQYKDYVLGVLQGDFGTSYSYNQPVLKLVAERIPNTLLLAIPSIAVALLIGMVIGVVS AVKQGSLFDYIFMILALIGVSMPIFWMGLMLVLTFSVRLGWLPALGMGTFENGLGDVIRH MVLPCFCLSTIPMATFARITRSSILESISSDSIRAIRARGIRDAIVIWKYALKSALPPIV TVLGLQLASCFAGAILTENVFSWPGMGSLMVGAIDNRDYMLIQGAVLVIALAFVLVNLAV DIVYMLINPRVSYEGGS >gi|157101662|gb|DS480662.1| GENE 26 27007 - 27906 868 299 aa, chain + ## HITS:1 COG:BH0030 KEGG:ns NR:ns ## COG: BH0030 COG1173 # Protein_GI_number: 15612593 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 27 298 30 300 301 239 44.0 7e-63 MKKTDIAASQPSYEEILRKERRANSAWNKLRRNKTAMIGLFIIVIMTAIAVFAPFITGGK PNEIHPIDAYLGFFEKGHLFGTDEAGRDLFTRICYGARVSLLVAVGATVLGGVIGVALGL ISGYAGGVVDAVIMRCMDGVMAFPFILLSIILMTVLGDGIFNVILAIGIAVVPRFSRVVR GQVLIVKKEEYCNAGRVIGISNTRMLLHHILPNTVSEVIVYATLNTASAIISEASLSFLG LGIKLPTASWGSILRSGRDCLNTAPHIAGISGCFILLTVIGFNLLGDGIRDVLDPKMRR >gi|157101662|gb|DS480662.1| GENE 27 27909 - 29093 1007 394 aa, chain + ## HITS:1 COG:SA1935 KEGG:ns NR:ns ## COG: SA1935 COG1473 # Protein_GI_number: 15927707 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Staphylococcus aureus N315 # 6 382 4 382 394 205 31.0 1e-52 MREELKRQVGTCIDRCSGELTELADTIWNNPEYNFKEYKACRSITGLLEKHGFDVERGTG GIETAFHAFCDSGKPGPHIAFLAEYDAVPGMGHACGHNLMAAMSAGAGIAVKSVLPQLKG SISVFGTPAEEGGGGKVIMLENGAFNGVDAAMIIHSANETVVNDISYSRTDIIVDFFGKG 
AHAATWPEEGVSALDPLLLLFQHISQMRLRWNGQGTILGIISEGGEDPIHIPEHCQGKFT VRSFSMKTKKKLLGDFLEACEAAARMTGTTWQWKEDGYTYEDIRNNPVIEDVLAEQFTRL GETVMPRRKELGIGCTDVGNLTHAFPALQSYVQVVPLLRGHTKEFEDACKSPDGHRAALT GAKALAMTAVELLSDDSLMEQVDRAFQDMKKQYE >gi|157101662|gb|DS480662.1| GENE 28 29105 - 30124 582 339 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 4 320 9 324 329 228 40 7e-59 MEGQKLLTVKDLKTYFYTSGGVSKAVDGVSFEVEKGEILGIVGESGSGKSVTSSSIIHLL PAKTGKIVGGSIDFNGTDVLKLSQKELLEFRGKDISMIFQNPMTSLSPVFKVGSQMVEMI CAHQKVTKAEARNMAEEALKRVGIPDARRRMEAYPYELSGGMCQRVIIAMSVCSKPQMII ADEPTTALDVTVQAQVLDLLKELRDKYGTAILLITHNLGVVWDMCDSVIVMYAGKTVEYA PCIDLYNHPLHPYTWGLLDCMPRLSSNSKEELSTIAGTPPDLRLTGVGCNFANRCPYAQE RCRTEVPELTEVSPGHRVACHRQTGGNRLVRNEVTESGR >gi|157101662|gb|DS480662.1| GENE 29 30135 - 31106 863 323 aa, chain + ## HITS:1 COG:BH3645 KEGG:ns NR:ns ## COG: BH3645 COG4608 # Protein_GI_number: 15616207 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Bacillus halodurans # 1 319 6 322 322 336 50.0 3e-92 MKIEHLCKDFDIKAKSFTGKALKLHALSDISLDIYEGETLGIIGESGCGKSTLGRCLVNL HPITSGKLVYDGKELNRLRKKERQALSREIQMVFQDPYSSLDPRHTAADSVEEPMIVHKT PATARERQERTIGLFKEVGLDMQHLNRYPHEFSGGQRQRLNIARAIASNPRVVICDEPVS ALDVSIQAQVINLFKKLQKHFNLTYIFISHDLSVVKYISDRIAIMYLGRLVELCDSDSIY HNPLHPYTQALLSAIPPESPEEEKERIVLTGEVPSPIGEQKGCPLANRCPRCMKRCREEL PKLLPHGDTDHMVACFLYQNERD >gi|157101662|gb|DS480662.1| GENE 30 31142 - 32371 1313 409 aa, chain + ## HITS:1 COG:lin0289 KEGG:ns NR:ns ## COG: lin0289 COG0624 # Protein_GI_number: 16799366 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Listeria innocua # 1 398 1 370 378 147 29.0 5e-35 MVEDRKDELIDLVSRLIRINSENPTGTQREVVDFVEKYLEKAGIAYEETGENPDYPCVVA RMGSDDGYSLIFNGHVDVVPAGDRGLWDFDPFCGTVTDSQILGRGTSDMKAGVAGVLFAM 
ALLKESNVCLKGNIRLHIVSDEESGGEYGSAWLCEHGYAKGANGCLVAEPTSMATIEIGQ KGGMLLTIRAHGKAAHGSLGNYKGENAILKMAKVLPLVEKLTRISGHFSDRQLKPLADSK MIAEDKNEIPGLGSVIDHVTTNIGLIQGGTRHNMVPDCCEAVVDCRLPIGVSQDEIAACV EEIRRESGVDGVDFELNYRSEPNFTDHEDPLVLAVKKNVEAFLGTQVVPAYQWASSDARD YRRQGIPTIQFGPSNTVGIHSYNETVEIEDVVTAAKAYVAAVCDLMGIE >gi|157101662|gb|DS480662.1| GENE 31 32392 - 33333 886 313 aa, chain + ## HITS:1 COG:no KEGG:Sterm_2686 NR:ns ## KEGG: Sterm_2686 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 311 1 310 310 404 66.0 1e-111 MLIKQLLEVYDVLDASNASGETVEAYLRGIRKDADIHVYPLVGPKGKTDMVKVRIHGSRG KTSGGDAPTLGLLGRLGGLGARPERIGFVSDGDGALVALSLAAKLLDMQNKGDILEGDII VSTHVCPDAPTRPHKPVAFMDSPVEMAQVNKEELSDELDAVLSVDTTKGNRIINTRGFAI SPTVKEGYILKVSESVLDIMQTTTGRLPYVYALSVQDITPYGNDLYHLNSILQPATGTKA PVIGVAITTEVPVAGCATGATHITDCEEAARFMLEVAKAYARGECSFYDEDEFTRIRKLY GNMNHLQTLGKDA >gi|157101662|gb|DS480662.1| GENE 32 33344 - 35002 1356 552 aa, chain - ## HITS:1 COG:BS_yunI KEGG:ns NR:ns ## COG: BS_yunI COG2508 # Protein_GI_number: 16080295 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Bacillus subtilis # 12 543 7 524 531 112 24.0 2e-24 MNDPVLTLRWFLNCPEIDNLEFFGDEDCLGNPITGVNVIDNPEGVPWIKKNELVLTTGFI FYKDIALTKKLLKELHDVSCTALCIKIKRFYDTIPQFLIEEAARYRLPLIAVPFYHSYSD IMRVIFRELELRQLSNQAFVINETRKFCQLFTSNSGISSMLSMLSSYTDSIILITDYNDK SVYFHIPDSYANLLQTSSRVVIRPDKSKNAESGKEADSDKEASNTRRFWINKQIIPFAVF PIPSGRYYLCADINAAPLSDDIVSIITNCLPILSTELETIQVNQRRFSSTNHFAPFFSML SDMRSKSSEEIRIVCNTYSFPYDCKRVCIAYEIQQTGSPERLMRTVYDKFSTALENSGQQ YFLSFYNDYVIVFLFFQKSISNLEAVNQAQMTAEYLYQSLDESLRQWVRGGVSRSHSRLV TIGTAFEESLISINMQSHLNPQKAVASYLRQIPYHLLRNLNHHDLNKIYNDTIVRLVDYD QANGTCLVHTLRMYLENRMVIRETSRQLFIHRNTLADHLNLIKRILGTDLEAMDEIFSFY LGLCAYDLLRET >gi|157101662|gb|DS480662.1| GENE 33 35135 - 35299 133 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936793|ref|ZP_02084159.1| ## NR: gi|160936793|ref|ZP_02084159.1| 
hypothetical protein CLOBOL_01683 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01683 [Clostridium bolteae ATCC BAA-613] # 1 54 1 54 54 83 100.0 5e-15 MELSPAARRLVILLCLRHLPGILGMGLSGVLVMIPGKLRESLLKLTGYDDKLHR >gi|157101662|gb|DS480662.1| GENE 34 35328 - 35963 844 211 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936794|ref|ZP_02084160.1| ## NR: gi|160936794|ref|ZP_02084160.1| hypothetical protein CLOBOL_01684 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01684 [Clostridium bolteae ATCC BAA-613] # 1 211 1 211 211 323 100.0 6e-87 MKKHSLMITAAALAAAMVLGTGCAKKPAVETSETVTSQASAEETTKAPETTAKAQETTEQ AAEAKEVYHMLQGTVTKVTADGSVFTLQADDGRDYDISQSDIRDVEVEIEEDVQIAIAYI GEPLGSLEDVTLVVALPEQEEWSILTEKGTTTANAMSSFTIKTDDGQELSFLKDNCPIEE GALAGDSGDKVTVTYVNSQGVNYPVEIKGTK >gi|157101662|gb|DS480662.1| GENE 35 36150 - 37787 1980 545 aa, chain + ## HITS:1 COG:BS_ykpA KEGG:ns NR:ns ## COG: BS_ykpA COG0488 # Protein_GI_number: 16078507 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 531 1 533 540 756 69.0 0 MISTSNISLRVGKKALFEDVNIKFTEGNCYGLIGANGAGKSTFLKILSGQLEPTSGDVII TPGQRLSFLQQDHFKYDQYTVLDTVIMGNKRLYEIMKEKEAIYMKEDFTDEDGIKAAELE GEFANMNGWEAESDAESLLNGLGIEPDLHYAQMGSLLGAQKVKVLLAQALFGNPDILLLD EPTNHLDLDAIAWLEEFLINFENTVIVVSHDRYFLNKVCTQIADIDYGKIQLFAGNYDFW VESSQLIVKQMKEANRKKEEKIKELQEFIQRFSANASKSKQATSRKRALEKIELDDIRPS SRKYPYIDFRPFREIGNEVLTVEHLSKTIDGEKILDDISFILGREDKVALVGPNERAKTV LFKILSGEMEPDEGSYKWGLTTSQAYFPKDNSAEFDNDDTIVEWLTQYSPEKDVTYVRGF LGRMLFAGEDGVKKVKVLSGGEKVRCMLSKMMISGANVLMLDEPTDHLDMESIDALNRGM IKFPGVMIFSSRDHQIVQTTANRIMEIINGKLVDKITTYDEYLESDEMARKRFTFTLTES EEADD >gi|157101662|gb|DS480662.1| GENE 36 37879 - 38364 504 161 aa, chain + ## HITS:1 COG:CAC2942 KEGG:ns NR:ns ## COG: CAC2942 COG1854 # Protein_GI_number: 15896195 # Func_class: T Signal transduction mechanisms # Function: LuxS protein involved in autoinducer AI2 synthesis # Organism: Clostridium acetobutylicum # 1 
161 1 158 158 237 69.0 6e-63 MEKITSFTIDHNKLQPGVYVSRKDQVEGGMVTTFDLRMTSPNEEPVMNTAEMHTIEHLGA TFLRNHGTFGKKCIYFGPMGCRTGFYLLLAGDYQSADILPLMVEMYEFIRDFQGEVPGAS PKDCGNYLDMNLGMARYLAGKYLEVLYHVDQFPESRLVYPE >gi|157101662|gb|DS480662.1| GENE 37 38441 - 39196 729 251 aa, chain + ## HITS:1 COG:CAC0222 KEGG:ns NR:ns ## COG: CAC0222 COG0708 # Protein_GI_number: 15893514 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Clostridium acetobutylicum # 3 251 2 250 250 402 74.0 1e-112 MTKLISWNVNGLRACVGKGFLDFFREAEADVFCIQESKLQEGQIELDLPGYHQYWNYARK KGYSGTAMFTKEEPVSVSYGIGMEEHDTEGRVITAEFPEYYVVTCYTPNSQDGLARLDYR MKWEDDFLAYLKKLEENKPVVFCGDLNVAHKEIDLKNPKTNRKNAGFTDEERGKFTDLLA AGFVDTFRYFYPDLEGTYSWWSYRFSARAKNAGWRIDYFCVSESLKDRLVSASIHNTVMG SDHCPVELVIE >gi|157101662|gb|DS480662.1| GENE 38 39209 - 41056 2170 615 aa, chain + ## HITS:1 COG:CAC3012 KEGG:ns NR:ns ## COG: CAC3012 COG0488 # Protein_GI_number: 15896264 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 612 1 630 632 521 45.0 1e-147 MNLVTIEHLTKSYTERLLFDDTSFSINEGEKIGLIGVNGTGKSTLLKIVAGLEDADSGSV VRGRSLYIRYLSQIPEFAEGDTVLESVMRENAGETHYSSADEMQATAKSMLNKLGIIEHD ALTSTLSGGQRKRVALASVLMSTADLLILDEPTNHLDSGMADWLEEYLKAFRGALLMITH DRYFLDSVVGRIVELDKGKLYSYQANYEGFLALKAERMEMAEASERKRQTILRNEIAWMQ RGARARSTKQKAHIQRFEELRDQKGPEYDRNVELESIASRLGRTTVEVKDLCKAYGDKVL MKDFTYIFLKNDRVGIIGPNGSGKSTLMKMLAGWVEPDSGTIQIGQTVKMGYFSQENEAM DESLKVIDYIKNVAEYVKTKDGSISASQMLERFLFPSGMQYTSIGRLSGGERRRLYLLRI LMEAPNVILLDEPTNDLDIQTLTILEDYLDTFPGIVIAVSHDRYFLDRVVNRIFAFEGQG KVTQYEGGFTDYQIAWSARHPQETGEQKGVRAGDSDENGSGLPINRNNWKQAAGGEKKRK LSFKEQREWETIEADIAALEQSIGDLEQEIGKSATNYTRLNELMEEKSAREAQLEEKMER WMYLNELVEQIEAQK >gi|157101662|gb|DS480662.1| GENE 39 41292 - 43319 1584 675 aa, chain + ## HITS:1 COG:slr1857 KEGG:ns NR:ns ## COG: slr1857 COG1523 # Protein_GI_number: 16330244 # Func_class: G Carbohydrate transport and metabolism # Function: Type II 
secretory pathway, pullulanase PulA and related glycosidases # Organism: Synechocystis # 12 617 20 707 707 414 35.0 1e-115 MMKGFRIDRNDGRIYAMGVTAVSGGVHFSFPSKGETCAVILYRKGAKSPKGRIEFPVNAR TGDVWSMTVLGDFSGLEYVYEVDGQQMPDPYGRKFTGMERWGSLARRDKPVRTPVDTEGG AYDWEDDILPCIPYDQCVIYHIHTRGFTKHASSGLEGDSRGTFKGVAEKIPYLKDLGITT VEMMPPVEFEELIIPERVDGSPFAAEEPDGKLNYWGYTRGYYFAPKCAYSSGPVREPERE FKDMVKALHRAGLELVLELFFDGKEAPSYVLDVVRFWAQEYHVDGVRLVGYAPVKLLGED PYLSRLKLLAPGWDGVEPGQEKHLAEYNDGFMMDMRSFLKGDEDQLNRLVYHIRHNPGQV GVVNYMANTNGFTLMDMVSYDTKHNEANGEENRDGTDYNQSWNCGVEGPSRKKRIMQMRR KQIRNALLMLFLSQGTPLLMMGDEFGRTKKGNNNSYCQDNDISWLNWGLLNSNSSIHEFV KHVIQFRKEHSVFHMEKEPALMDYRSVGLPDVSYHGLKTWCPMFDRFLRQLGILYCGKYG KKPDGREDDYFYVAFNMHWEPHEFALPNLPKDYKWHTAFNTDEDQVNGMYPGGSEPVLEN QKSIMIMARTIVVLMGKEVPAQKKASRGKAAGKGERKEEEADDTVRGNDTVYREPQAGGL QPSSGAGQDTGALQL >gi|157101662|gb|DS480662.1| GENE 40 43216 - 44376 1135 386 aa, chain + ## HITS:1 COG:CC3032 KEGG:ns NR:ns ## COG: CC3032 COG2195 # Protein_GI_number: 16127262 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Caulobacter vibrioides # 22 383 40 407 412 203 35.0 6e-52 MILSEEMIRYIENHRQEAYSLLLELARIPAPSNYEERRAEFCLNWLKSQGARGVYIDQAL NVVYPADVSQDRPLDVFMAHMDVVFSDETELPLRVEDGRIYCPGVGDDTACLVCLLLAGK YIARHVDSPSWREIRGEDAPGLLLVCNAGEEGLGNLKGVRKICEDYGGRIKSFCTFDSSL SSIVDRAVGSKRFRVKVRTRGGHSYNDFGNDNAIAGLASLVGKLYKIQVPEGGKTTYNVG MISGGTSVNTIAQEAEMLYEFRSDKRENLEYMERQFMEVLDRAKAEGTDVSFQVIGVRPC EGQVDPAQKQALTDWAHRVVEEMTGQAPKHHAGSTDCNIPLSLGIPSVCVGSFRGAGAHT REEYVETDSLEEGYKVAFGMIFGRRD >gi|157101662|gb|DS480662.1| GENE 41 44618 - 47374 2679 918 aa, chain + ## HITS:1 COG:FN1022 KEGG:ns NR:ns ## COG: FN1022 COG0474 # Protein_GI_number: 19704357 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 5 914 4 860 862 756 46.0 0 MADWFEQPEAEVLAGLDSGRQGIDSGQAEQRLKEYGENVLEEAGRLQWWQVFLAQFKDLL VIILIGAAAVSMLTGDPESAMVIFAVLLLNAVLGTVQHQKAEKSLDSLKRLSSPEARVLR 
DGRSMTIPSALVVPGDILLLETGDMVAADGRLLEVYGLTVNESSLTGESLGVEKHTGAVK AREGTVALADRTNMVFSGSLVTSGRAVAAVTATGMNTEIGKIAVLMNGTKEQKTPLQVSL DRFSGHLAIAIIAICGLVFLLSLYRREPVLDALMFAVALAVAAIPEALSSIVTIVQAMGT QKMAREQAIIKDLKAVESLGCVSVICSDKTGTLTQNRMDVEQVYVNHTQQQPSWLENEKR RSDVRLLLMGAVLNNNASGQDGDPMETALFNMAREAGLIPEQVRLQSPRKGEIPFDSRRK LMTTVNEIEGHDIVFVKGAVDVLLKKCTRLANPAAGQGTGLAGQGTGAAGQRPGESGQRP EAAIPCRTLSSSDRQRILDQNSRWSARGLRVLAFACRPFNEEGKEHGNSGLAGAKAGNSQ ERSSLSGRLEDNLVFLGLTAMMDPPRPESRQAVASARQAGIRPVMITGDHKVTAMSIAER IGIFTPGNMAVTGAELDQMEEEDLNRSLEHISVYARVSPEHKIRIVKAWQNRGHIVAMTG DGVNDAPALKKADIGVAMGITGTEVSKDAASMILADDNFATIIKAVSNGRNVYRNIKNSI QFLLSGNMAGIFCVLYTSLLSLPLPFKPVHLLFINLITDSLPAIAIGMEPPEEGLLKEPP RNPKEGILTASFIRDILVQGALIAAATMWAYTTGLHEGGAGRACTMAFSTLTLARLFHGF NCRSSHSIIRLGLGSNLYSVMAFQLGVVLLAAVLFVPGLQIMFAAADLSLSQLTTLAAAA FLPTLAIQIHKILRESIY >gi|157101662|gb|DS480662.1| GENE 42 47497 - 48393 995 298 aa, chain + ## HITS:1 COG:Cj0021c KEGG:ns NR:ns ## COG: Cj0021c COG0179 # Protein_GI_number: 15791420 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Campylobacter jejuni # 1 298 1 289 292 292 48.0 4e-79 MRLVTYEIEHKSGLGVISRDGSWVYPLASLDMDYKTMQELIENISESEKQLLEYVSGQDP YKVRGAAPIGEVSLLAPIPHPRQDVICLGINYMAHAEESARFKKEAFDGERPHAIYFSKR VNRATDPGAGIPSHRDIVTDLDYEAELAVIIGREASHVREEDVKDYIFGYTIINDVSART LQTRHKQWYFGKGLDGFLPMGPCIATVDELSYPPKVQVQSRVNGELRQDSNTELLIFDIS HIVSELSQGMTLKPGTIIATGTPAGVGMGFDPPKFLVPGDVVECTIEGIGTIANKVTD >gi|157101662|gb|DS480662.1| GENE 43 48425 - 48874 568 149 aa, chain + ## HITS:1 COG:CAC1604 KEGG:ns NR:ns ## COG: CAC1604 COG1803 # Protein_GI_number: 15894882 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Clostridium acetobutylicum # 1 142 3 144 149 184 58.0 6e-47 MTIRRQKNIALIAHDQKKRELIQWCGEHKDILSRHFLCGTGTTARMITDATNLPVKGYNS GPLGGDQQIGAKIVEGKIDFVIFFSDPLTAQPHDPDVKALLRIAQVYDIPIANNKASADF MITSPLMDEEYEHDVVNFRQNIADRANTL 
>gi|157101662|gb|DS480662.1| GENE 44 48944 - 51991 2808 1015 aa, chain + ## HITS:1 COG:sll0158 KEGG:ns NR:ns ## COG: sll0158 COG0296 # Protein_GI_number: 16331275 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Synechocystis # 14 753 10 762 770 760 48.0 0 MEQKLYDLMDWAGIEELVYSEASNPHAMLGPHLTEDGLLIQALVPTAAEISVKISTTGKK YPMELADEAGFFAALIPRKSKTPYTLEVTYDNGVTEELCDPYSFAPQYREEDLKKFAAGI HYTIYEKMGAHPMTIDGVSGVYFSVWAPCAMRVSVVGDFNLWDGRRHQMRRLGEGDGSVF ELFIPGLYEGEIYKYEIKTRAGEPMLKADPYANYAELRPNNASIVWDIDKYKWNDKSWMD QRAKTDTKDKPMNIYEVHLGSWIRKETVRDETGQDLAGSEFYNYRELAPRLAEYVKDMGY THVELMPVMEHPLDESWGYQVTGYYAPTSRYGAPEDFMYFMDYMHGQGIGVILDWVPAHF PRDAYGMAVFDGTCVYEHMDPRKGSHPHWGTLIYNYGRPGVTNFLIANALFWADKYHADG IRMDAVASMLYLDYGKNDGEWVPNIYGGNENLEAMEFLKHLNSVFKGRKDGTVIIAEEST AWPMITGDPKEGGLGFNYKWNMGWMNDFTNYMRCDPLFRKNNYGELTFSMLYAYSEDFVL VFSHDEVVHGKGSMIGKMPGETLEKKAENLRAAYAFMMGHPGKKLLFMGQEYAQTAEWNE GASLEWGLLEYPVHKNMQDYVKALNRLYREHPALYEMDYDPDGFEWINCSYQNESMIIFL RKSRKSEETLLFICNFDNMEHEKFRVGVPFHGKYKEIFNTDAKDFGGEGRVNGRVKTSRK LEWDEKDDSIEVYIPPMSVSIYTCTQAEESEKPEAKKPGVKKTADKKPAARKAASGKTAA SKTSVSKTASPKRITAVAQETEGAAAAKDVQKPMAAFKAGQKQIITTKEEQKQIGTSREE QKQIGTTKEEKKQIGTTKEEKKQIGTSREVQKQIGTSRKKQEQIGTTKEEQKQIGTSKEE QKQIGTSKEEQKQIATSKEEQKQIGTSRKRQEQIATSKEEQKQIGTSRKRQEQIATSKEE QKQIGTSRKKREQIGTSKEEQKQISTSKKSRKQIGTAKKNQKQITASKTEGSETK >gi|157101662|gb|DS480662.1| GENE 45 52241 - 53704 1448 487 aa, chain + ## HITS:1 COG:BH3841 KEGG:ns NR:ns ## COG: BH3841 COG2972 # Protein_GI_number: 15616403 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 1 477 4 474 481 184 28.0 4e-46 MKKKLGAFSVVVILVMGLSIAFNMLVMNFSLESFNVILNDNSLCYDVQESMEQEADAFEL YVRERSWENAKQYRLACFKTESSLEALPFDYDVIGPERYARTWNIKNGYEGYQTFRDQVL HMDQSDEGFVEQLYRVYGMQGYLQTYARRLMQSTLKEGKRSYQEKVPLLYDLPYMIMACS VLMIITVICLTKLLSNTLIAPMVKLAHSSREIARNDFGGEDLVVENRDEMGELVQAFNKM 
KHATEGYINTLKEKNEMAERLHKEEVERIEMEKRLDAARLELLKSQINPHFLFNTLNMIS CMAKLEEAQTTERMISSLGNLFRYNLKTSEQIVPLERELKVVQDYMYIQQMRFGSRISHD LRVEVDGSETMIPAFTLQPLVENAIIHGIARKEEGGRIFLRIWKRKDKIVISLADTGVGM EEEALKRLMDALKGHRTARVGIGLGNIYQRIKSMYEDGDLRIYSKAGKGTVVQIILPAAR EEDYQKI >gi|157101662|gb|DS480662.1| GENE 46 53719 - 54513 831 264 aa, chain + ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 3 257 6 253 257 115 31.0 9e-26 MRLLIADDEMIERQVLYKTLQKNLGDRCRIFQAANGREALKIYEEEKIQIAILDIEMPGI NGIEAAQQIRQADKDCCIIFLTAFDEFSYAKKAITVRALDYLLKPYDEEELMLVVEEAMH IAGEHRQDEGDGDKNENEIPAPPDTEEPEDGGHARLSKVTSLISQYIHENYMFDISMQDA ARAMNYSEAYFCKLFKQCFDQNFTSYLTQYRIKEAKKMLSQPTVNVKEIGRAVGYGDSNY FAKVFKRITGQSPTEYRLSIFQKG >gi|157101662|gb|DS480662.1| GENE 47 54705 - 55835 1358 376 aa, chain + ## HITS:1 COG:BH0701 KEGG:ns NR:ns ## COG: BH0701 COG1638 # Protein_GI_number: 15613264 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Bacillus halodurans # 50 364 18 327 341 144 32.0 4e-34 MKKRWISTALCAAMVAGSLAGCGLKSPDATTAAPAATEAPTEAAKAEGDAAEESKTEAAD TEAAADTSDMPEVTLVMAEVNPLDTIVGQTDTKFKETVEELSGGKIKIDLQASGVLGSEN DVLDTMLGGGGTIDISRISAFALTSYGAEKSVLLSIPYTFANRDHFWKFADSELAPEFLM EPHDNGLGVRGLYYGEEGFRHFFTVKEVKGLDDLKGMKLRVSNDPIMNGMVEGLGASPTV VSFNELYSALQTGVVDGAEQPIANYKSNAFQEVAPNLILDGHTLGAIQVIITDEAWDKLT EEQQNILMEAGKVASQFNRTISEEAENKVLEELKADGINVVEVPDIKPWQDACKGIIESS TKDQAELYQQILDMNK >gi|157101662|gb|DS480662.1| GENE 48 55908 - 56432 541 174 aa, chain + ## HITS:1 COG:VC1778 KEGG:ns NR:ns ## COG: VC1778 COG3090 # Protein_GI_number: 15641781 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Vibrio cholerae # 29 165 21 154 173 58 27.0 4e-09 
MPAIFQKIDKIRPAYDMTYRVVLFLCKLLLVVDILITTMAVAGRYISFIPDPAWSEEVVL TCMSYMAVLSAALAIRRGAHIRMTAFDRYLPAKVVKSLDIIADLAVLALAVIMITVGWKY ATGIGSKGTYVSMPKVSRFWMYFPVPLAGVAMLVFEVEALYNHVKSFFVKEGME >gi|157101662|gb|DS480662.1| GENE 49 56432 - 57727 943 431 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 6 426 3 426 435 367 42 1e-100 MDANSMAILVLLGSFFLMVFLRFPIAYAVALSSVFCLLFQGHNLTTICQQMVKGISSFSL MAVPFFITMGCLMGSGGISEKLIALANACVGWMRGGMAMVNIVASYFFGGISGSASADTA SLGSILIPMMVDQGYDDDFSTAVTITSSCEGLLVPPSHNMVIYATTAGGISVGSLFLAGY IPGALLAVTLMIGSYIISVKRNYPKGEKFSIINLIKQLSVSFWALAAVLIVVFGVVGGVF TATESAAVAVIYSLIVSVFIYRGLDWKGVWRVLGDSVDTLSIVMILIATSNIFGFCLTQL HVPILAASAITGMTNNPIILALLLNLILLVLGCIMDMAPIILIATPILLPIATSIGLDPI QFGIMVVLNCGIGLLTPPVGAVLFIGSAVSKVPMERVVKATLPFYVCMIITLLLITFIPA VSMWLPSVLAH >gi|157101662|gb|DS480662.1| GENE 50 57834 - 58319 309 161 aa, chain + ## HITS:1 COG:no KEGG:Closa_0824 NR:ns ## KEGG: Closa_0824 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 156 1 156 160 196 53.0 3e-49 MRYKYTYRTTARDLWQLSMYYIYGSLAGLCNIIFTVAAFALGFSRWDQAQGIVRCLIVLG CCLFTVIQPLMIYAKAKKQAAGITQDTQVSIDDNGLYIRVGDDTSQLPWKSVKRISRKPA MIIIFSDTTHGFIFTNRVLGNEKEEFYRYASSKVEGRNVHN >gi|157101662|gb|DS480662.1| GENE 51 58387 - 59406 531 339 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 48 321 59 329 346 209 33 6e-53 MRKNVIYLMGCAAVMILAASCAGRSKDSTAQKTVPEFVLTYAENQAEDYPTTQGAYKFAE LVHEQTGGRVEIQVNAGGVLGDEKTVIEQLQFGGVDFARVSLSPLAEFVPKLNVLQMPYL YTGREHMWKVLEGPIGTDFMNSFGGSSLVPLSWYDAGARNFYSTAGPIESLDDMKGLKIR VQESELMMDTIQALGATPVPMAFGDVYSGLQTGEIDGAENNWPSYESTRHYEVAKYFTLD EHTRVPELQLAAQSTWDKLPEEYRNIIRDCAQKSALYERELWDEREKTSEDRVRRAGCVV TELDAREKERFQEAAAPMYEKYCSEYVDIIDDIMDAGRQ >gi|157101662|gb|DS480662.1| GENE 52 60566 - 61015 327 149 aa, chain + ## HITS:1 COG:BH0270 KEGG:ns NR:ns ## COG: BH0270 COG3464 # Protein_GI_number: 15612833 # Func_class: L 
Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 14 145 290 428 441 78 34.0 4e-15 MYADGCSVMPVSLRRYYKRSWKLILTRYKKLKDENKQACDLMLHYSEDLRLAHRMKEWFY DICQMEAYRQQQREFDDWIANAQSCGIKEFEDCAKTYRAWRKEILNAFKYGLTNGPTEGF NNKIKVLKRSSYGIRNFKRFRTRILHCTS >gi|157101662|gb|DS480662.1| GENE 53 61188 - 62363 1029 391 aa, chain - ## HITS:1 COG:no KEGG:Closa_3855 NR:ns ## KEGG: Closa_3855 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 22 250 6 233 236 137 34.0 9e-31 MDRHSSYSAPRSERASGSYGGNRGRSTKRNTYNKRSWNDNRKLHTLVFYILPFILINLVI FIVATSKPKIEYDVSDTRDYRSVDISIRIRSLLPIRETSVTMESQPLEMEKEKGVYTATL TSNGTLQIYVKGWNGMSARLSDTIQALDDMAPTIQEDYVMENGILTITAEDSQSGINFDA VYATDEEEKTVKPSHINKESGEITFPMDTQTLVVFVADYAGNTIKSSYFSVKDGIDLSGR DKEFPSDDTNGTNGSSSKGSGTDKETSKAKESTKAKETTKAKETTKAKETTKAKETTKAK ETTKAKETTKAKETTKAKETTKAAESTKAAESTAATTAALSPADPQTPSPTQSSQAPSPT QAPQAPSPTQASQAPSPDNGSGDNVTIVPLG >gi|157101662|gb|DS480662.1| GENE 54 62574 - 63152 611 192 aa, chain + ## HITS:1 COG:lin0808 KEGG:ns NR:ns ## COG: lin0808 COG0317 # Protein_GI_number: 16799882 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Listeria innocua # 1 175 1 178 180 96 33.0 2e-20 MIGEAAAFADKAHKGAFRKGTKIPYITHPMETAVIVSAFTDDEEIIAAALLHDVMEDAGV TREELEDTFGSRVADLVVDESEDKSRSWQERKTRTVRHLCTASREIKILALGDKLSNMRC TARDYLVVGDAIWQRFNEKDKRKHARYYWGIAHALKELEDHLYFKEYVMLCRAVFGEDVI SSNKDMMRGECE >gi|157101662|gb|DS480662.1| GENE 55 63149 - 64540 1120 463 aa, chain + ## HITS:1 COG:BS_yjeA_2 KEGG:ns NR:ns ## COG: BS_yjeA_2 COG0726 # Protein_GI_number: 16078275 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Bacillus subtilis # 258 445 23 207 217 169 44.0 8e-42 MIRCGKKSKILAAVLAAGIILSMQGMTAFAGSTGVGAAKPAQTGPGVAAQNDSQPGQAGG QTVQPEGQNGQTEGQAAQAEGQTGQAEGQTAQPVGQEQSTQPVTGQEPLQVNYSAFILQQ 
GWSAVTADNNLCSAPVNSWVTAVKANLIHIPDGTQVGIRYQVNLSGTGWLSWAEDGAETG GAEGVMPLESIRMELTGSGAAAYDLYYKVLQNGAWTDWAANGASAGQEGAGLRVDGIRAS ITAKGAGIPAEPVVPHGIDPSRPMIALTFDDGPKTSVTSRILDSLQANGGRATFFMLGSN VNANAGVIRRMVDQGCEVANHTHDHKYLTKIGAEGIVSQVGSTNQKIQAVCGVSPVLMRP PGGYIDGASLNVLGSMGMPAIMWSIDTRDWQHRNAQRTIDTVLSQVKDGDIILMHDIYST TADAAVVLIPELTARGYQLVTVSELAAYRGGIAPGHKYSQFRP >gi|157101662|gb|DS480662.1| GENE 56 64949 - 65977 482 342 aa, chain + ## HITS:1 COG:no KEGG:Closa_0425 NR:ns ## KEGG: Closa_0425 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 83 260 105 280 361 76 30.0 2e-12 MAYREKRIPPERRLNIRGKMDRAVFAVTAVATAVFWLSGSGAAMAAQEPQEEGYGAELRI TRDVETAEDGGIREPADRYWDGDGKEYRLDSWEIVTIPGQDVSRRLERKMVYTGVEGAEE IPGSISLKEDVSGSQAEGTLFLRKSRIVREEWQDGFTAPVTFHAYGADEYHGGSLVIPGD AVLETCVSMGGQLLEFMGLSPLEYRILSADWYGESYEDEEGRMCRQAMVWGQKLLRDYKA EYEGEVSWIKPETQELKMVYRQISAVSPAALETAQGAEHVPAPEIQGEGPEGSLWYWVRS GFVITVGAGLIGIGVGLLALLAVWYRQNRREHRRRCLPRIKG >gi|157101662|gb|DS480662.1| GENE 57 66052 - 66885 847 277 aa, chain + ## HITS:1 COG:BS_pnp KEGG:ns NR:ns ## COG: BS_pnp COG0005 # Protein_GI_number: 16079406 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Bacillus subtilis # 7 267 3 263 271 270 50.0 3e-72 MTNPVYDKLRNCYDSIKDKIPFEPKVALVLGSGLGDYAEQLEIQGTIDYHDIEGFPVSTV PGHKGRFVFARINEVPAVLMQGRVHYYEGYPMSDVVLPVRLMKLMGAEILFLTNAAGGVN YDYGAGDFMLIKDQISCFVPSPLVGPNLEELGPRFPDMSHIYDEDLRDIIRSTALELGIR IQEGVYVQLTGPAYESPHEVKMCRILGGDAVGMSTACEAVAANHMGMKICGISCISNLAC GMTDVPLSHKEVQEASDKMAPLFKKLVSESIRKLGSI >gi|157101662|gb|DS480662.1| GENE 58 67029 - 67862 1204 277 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936827|ref|ZP_02084192.1| ## NR: gi|160936827|ref|ZP_02084192.1| hypothetical protein CLOBOL_01716 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01716 [Clostridium bolteae ATCC BAA-613] # 1 277 1 277 277 343 100.0 6e-93 MKRHYLAVAVLALALGMTACSSGNKETTAAPTTTASQTETEKAAEAEDDMELEYFYGFVG 
EVADKVVTVKDDEGKEVKFDVSEAKIEGADAIGEGDEVEVAYAGELSADTTKAKSVDIVT SAAAEAAEEAAAEEDDIVTGTVEKADDNTLTLTTDEGTYTFNTKIAQKVSKDGIKSGVQA DVTYYGDLEDETDKPVATKIVTEDAMDSEDAKINTLSGKVAEVGADYVVIDTADPENTLF TFLGKEGMFDKLKVGDTATVIYEGTLTNKTITATGVK >gi|157101662|gb|DS480662.1| GENE 59 68079 - 70670 1900 863 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 853 1 804 815 736 46 0.0 MNINKFTQKSMEAVQNCEKLAYEYGNQQIDEQHLLYSLLTLEDSLILKLITKMGIQGDMF SNEAKQAVERLPKVSGSGQLYISSDLNKVLISGEDEAKAMGDEYVSVEHLFLSLLKHPNK EIKALFKLYNITRETFLQALSTVRGNQRVVSDNPEATYDTLAKYGYDLVERARDQKLDPV IGRDSEIRNVIRILSRKTKNNPVLIGEPGVGKTAVVEGLAQRIVRGDVPDGLKDKKLFAL DMGALVAGAKYRGEFEERLKAVLEEVKKSDGQIILFIDELHTIVGAGKTEGSMDAGNMLK PMLARGELHCIGATTLNEYRQYIEKDAALERRFQPVMVDEPTVEDTISILRGLKERYEVF HGVKITDSALVSAATLSDRYITDRFLPDKAIDLVDEACAMIKTEMDTMPAELDEMSRKIM QMEIEEAALKKETDRLSQDRLADLQKELAELHDEFASKKAQWENEKASVDRLSSLREEIE SVNREIQDAQQKYDLNKAAELQYGRLPELQKELQAEEERIRNEDLSLVRESVSEDEIARI VSKWTGIPVAKLTESERSKTLHLDEVLHKRVVGQDEGVQLVTQSIIRSKAGIKDPTKPIG SFLFLGPTGVGKTELAKALAEALFDDESNIIRIDMSEYMEKHSVSRLIGAPPGYVGYDEG GQLTEAVRRKPYSVVLFDEVEKAHPDVFNVLLQVLDDGRITDSQGRTVDFKNTIIILTSN IGSQYLLEGIDENGNIRPQVENMVMDELRAHFRPEFLNRLDETILFKPLTRMDIARIVDL CVADLNKRLADRELTIELTMGAKEFVTDKGYDPAYGARPLKRYLQKNVETLAARIILGDG VREGSTIVIDVDEDKTRLIAYVE >gi|157101662|gb|DS480662.1| GENE 60 70982 - 71161 129 59 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0688 NR:ns ## KEGG: EUBREC_0688 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 4 59 3 58 58 62 53.0 5e-09 MSTIPKDPVILLSYVNTQLRDNYSSLDEMCQALDADKSRIISVLSGIGYQYEPTQNSFR >gi|157101662|gb|DS480662.1| GENE 61 71264 - 72454 1060 396 aa, chain + ## HITS:1 COG:DR1863 KEGG:ns NR:ns ## COG: DR1863 COG1459 # Protein_GI_number: 15806863 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Deinococcus radiodurans # 1 395 1 404 406 214 31.0 
3e-55 MPRYRYRAKSEDGRVHCGTEKADSGRALFVILKSRGLYCYEYSSLERTPSARPAKLNQRQ LPLLCRQLAAMLTAGVPLSRALEVSAGSARDKTLKVTLAGLRESIHKGRTLSEAMEEMKG VFPNLLVYMTRTGESSGRLDELLHKMAGYYGREEELNGKVRAAMTYPVILFSITVLAAVF MLTTVLPQFASMLQEQELPWITRMMMRLSFSLRSRGVLYIMLILVLMALFKGILTCPFVR LQADRAVLYTPIIGKLLSTVYTSRFASAFAVLYGSGIGILDAMHTVGRVMGNSYVEKGLV QVAESLKGGVMLSQALDELNLFQPVLISMVAAGEESGALDMVLEDAGSFYEKEAARAVNQ MIALLEPAMILILALVVGSVVMAIMMPVFNMYSSML >gi|157101662|gb|DS480662.1| GENE 62 72579 - 73922 1093 447 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936833|ref|ZP_02084198.1| ## NR: gi|160936833|ref|ZP_02084198.1| hypothetical protein CLOBOL_01722 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01722 [Clostridium bolteae ATCC BAA-613] # 1 447 39 485 485 867 99.0 0 MEPVPDVASALTRLAEKYGMEGKRVNLIAGRDVRMMSFAIPKAGKNAMKRMAVNEMSLLD PDFGNHAAALDLRAKTGRTMADVTAYYMEQRSLDAYKKAVEDGKMICGRVLVLPDCMAIM ARELWREERVLLLDVNQASMGFYVLSGGHCLACRMTGLKAGCFLREGAVDMLCEEMAEQV EDLIQDCTAASDVPVPDCMVLMTSCIPDAQAAAEYMSGRFKIPCSVRMPEPVDAVCLAVC VAGSLEGKRKALELEERGYEDGKNSILGRGPFMTRGWALFLLANGLAAAGLSFHAAYVDH AAGKALARTEYAMEGTEYKTRVRKSMEMEERIGEMSADREKTRTEKALLAGKNLLGMNEF RAFTEAMNPEMRIESISFDGEGPYLQMVVSMDRSEDVPAYVERVEHSGIFRQVGHSLWEK SAEDRETERIYATVYGTLGMGGQDEAQ >gi|157101662|gb|DS480662.1| GENE 63 73909 - 74559 555 216 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936834|ref|ZP_02084199.1| ## NR: gi|160936834|ref|ZP_02084199.1| hypothetical protein CLOBOL_01723 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01723 [Clostridium bolteae ATCC BAA-613] # 1 216 1 216 216 396 100.0 1e-109 MKLSSRERFLIGVLFTIIIWAVALEVFIWPCRAGYRKSMELMESGNQEREEMEFYLSHYE ELEQKLKEWERLRNTEDFFYRDIDDIFMDRSLQDMAARAGVSIRRMSIGGAAAAEISYDT QTYDAQSDDPEESGKPAVESGPAGRTVKESVITMEIECPDPKDVMTFADEIYREQKSVLV SFMDVEAVYESGEDGQETYQGMKGIVEVRYYYEEIG >gi|157101662|gb|DS480662.1| GENE 64 74543 - 74875 306 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936835|ref|ZP_02084200.1| ## NR: gi|160936835|ref|ZP_02084200.1| hypothetical protein CLOBOL_01724 [Clostridium bolteae ATCC 
BAA-613] hypothetical protein CLOBOL_01724 [Clostridium bolteae ATCC BAA-613] # 1 110 1 110 110 167 100.0 2e-40 MKKSVNRSSRGGFTLIEIVVSTALLAVVCTGFLMMTAANAGQLSGEQRLEQSNYNLSALA GEGEGEPTGETIAVEFGLEEENGVREIFEQYEITESEEDTGNHMTFYRHR >gi|157101662|gb|DS480662.1| GENE 65 74932 - 75420 475 162 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936836|ref|ZP_02084201.1| ## NR: gi|160936836|ref|ZP_02084201.1| hypothetical protein CLOBOL_01725 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01725 [Clostridium bolteae ATCC BAA-613] # 1 162 16 177 177 289 99.0 4e-77 MIETVAALAIAAVLAVFISGFIHPQMKLYYELDRLSQAKAMCSEAYRGLEEKLRYGYVFF CDYQDTGVISYYVRDREKIPDIKEGRSYYEELPPVEDWPSISAEELEVAELGGMELELDF SGTKNTRAKVSIKVKKDGDVIYEQKTAIGSMYGYTMDWEGGT >gi|157101662|gb|DS480662.1| GENE 66 75420 - 76211 555 263 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936837|ref|ZP_02084202.1| ## NR: gi|160936837|ref|ZP_02084202.1| hypothetical protein CLOBOL_01726 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01726 [Clostridium bolteae ATCC BAA-613] # 1 263 2 264 264 416 100.0 1e-115 MGRRRTVRKNNGHTRGSSLMYVVLLMTLLAVLSSGYMAISRYNMKAALKNRRYMEAQLSA KTIHRSFCEAVSSGESPAMNLIWQNFQEDCDRVREEFDAMMDEEEAEEGEDGELKAESFS SEEDTDTTDAEEELRWERYLYHALGNKKYVVRGKGCSEDSGGTDDGAADAPAADGFTTDS NITDSTTTIDITLTAHPLKSLATVRTRVESSGYSFSMMADIVFDDRDGSVMVISKPYRSS RGDDPEVKVYLNGNGVYRYYEGS >gi|157101662|gb|DS480662.1| GENE 67 76240 - 76995 549 251 aa, chain - ## HITS:1 COG:TM1696 KEGG:ns NR:ns ## COG: TM1696 COG1989 # Protein_GI_number: 15644444 # Func_class: N Cell motility; O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, prepilin signal peptidase PulO and related peptidases # Organism: Thermotoga maritima # 4 230 1 225 240 103 31.0 4e-22 MVFIYFVLYGLLGAVTGSFLNVCILRIPANQNFVTGRSHCPACGHVLAFYDMVPVLSWLF LKGRCRYCRAPVSIQYPAVELLTTSAFLLCLLAKGPGMESALMCMFSSILITAAFIDARH 
MYIPDGIHILILILSCISLAAGSGPAIINRLGGSLLAGGFLALVNLLSRGGVGWGDVKLF AASGLLIGAAPAITALLMGYVAAGLWYAVPLVRGRVGRKTQIPMAPFFAVSLMVCGLWFR QLFLWYLGFFR >gi|157101662|gb|DS480662.1| GENE 68 77068 - 78786 1568 572 aa, chain + ## HITS:1 COG:CPn0816 KEGG:ns NR:ns ## COG: CPn0816 COG2804 # Protein_GI_number: 15618725 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, ATPase PulE/Tfp pilus assembly pathway, ATPase PilB # Organism: Chlamydophila pneumoniae CWL029 # 79 567 13 491 496 359 40.0 8e-99 MPDKIPIENLLIDSGYLTRSQLDRAMRLKQGRPECSIEEILTEFGYVTEAALLTCAAARD HMEVAPFGPLKADRKSAAVIEPSFAVRNRILPMSFEEECLIVAAAYPVSPDVLDEVETLS GMKVRAVLAPEKELKQALKKAYGSAPEHVYGENPYVSMERPGVGVGAAAEASVSELAFME RVESAPVVRMVNTVIEEAFHKNASDIHVEPGKNDLVIRIRINGDLMVHTTLKMEYHRPMI TRLKLMAGMDIAEKRLPQDGKYHYERGEVSTDLRISTLPSIYGEKAVLRLLGNDRNNSLM DIRRLGMEEEQRIVFEHILKAPHGMVLVTGPTGSGKTTTLYAAINRMVKKKINIVTVEDP AEKVMDGITQVQVNSKAGLTFASALRSILRQDPDVIMVGEMRDEETVAIGVRAAITGHLV LSTLHTNDCASTIHRLRNMGVPPYMAAASLSGIIAQRLVKLLCPNCRQPYEPDEREQRIF AERGRHIPDRLWRSAGCPLCGGTGYTGRTAVYEIMDVDEQIKAMILDNEPLSSIREYQEK KGSMPLRDHVLRMAADGETDMEEAEKILYSVQ >gi|157101662|gb|DS480662.1| GENE 69 78826 - 79182 448 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936840|ref|ZP_02084205.1| ## NR: gi|160936840|ref|ZP_02084205.1| hypothetical protein CLOBOL_01729 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01729 [Clostridium bolteae ATCC BAA-613] # 1 118 3 120 120 219 100.0 5e-56 MSIHMKKGKKRTEGFTLVELICVIAILGILVAIAVPSYRGIQDKSAEQVAISNARSNYTL GKAQQDMLDAGVIKESEKEEYHFDKDKNSAYWEGQINGKTYKAVYNGDTGEGRIESGN >gi|157101662|gb|DS480662.1| GENE 70 79255 - 79677 492 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936841|ref|ZP_02084206.1| ## NR: gi|160936841|ref|ZP_02084206.1| hypothetical protein CLOBOL_01730 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01730 [Clostridium bolteae ATCC BAA-613] # 1 140 6 145 145 265 100.0 6e-70 
MKKKGNCAFTLVELIAVVAILSILLCIAVPNMMNYINAAHEATAQTEARICVDAVQRYLN DEKEKGKLTGGMLLRLMGLDLSNSDGPLKDYISGGQKGARIVSVEADLATGRLQVLKYEN RYCTVKLNIDEDGNITEASD >gi|157101662|gb|DS480662.1| GENE 71 79693 - 81225 1604 510 aa, chain + ## HITS:1 COG:BS_ywnE KEGG:ns NR:ns ## COG: BS_ywnE COG1502 # Protein_GI_number: 16080712 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Bacillus subtilis # 9 510 1 482 482 436 44.0 1e-122 MELIWSTVLTVWGWLLDHLLYINLAFSIIIIFFQRRDPRSVWTWLLALYFIPVFGIVFYL LFGQDMKKSRMFRVKEVSDRLNLPVKSQEDIIRSDDMPEEMMDPMSRDFKSLIIYNLETS GSVLTVNNQVQIYTDGQAKFEALRNALRQAHRFIHIQYYIIKNDELFDSIVPILLDKVRE GVEVRILCDGMGGRFMPKKRWDQMRSAGIKVGVFFPPFLGWLNLRVNYRNHRKIVVVDGN TGFVGGFNIGREYISRDPKFGYWRDTHLEIKGEAAISLQIRFALDWNYVTGENLFKEIRY FGEEAEYARRRLNGQIHAVWPDPDHVNEMVCMQIITSGPDTREPHIRNNYLELFHKAKHH IYIQTPYFIPDDAIFSSLKIAALSGVDVRLMIPCKPDHPFVYWATCSYAGELLDYGARVY VYENGFLHSKGIMIDGRVSCYGTANMDIRSFELNFEVNATIYDEEVTGQLEEAFMNDLPH CREITREEYADRNLWMRIKEQSSRLLSPLL >gi|157101662|gb|DS480662.1| GENE 72 81365 - 81862 309 165 aa, chain + ## HITS:1 COG:no KEGG:Closa_0447 NR:ns ## KEGG: Closa_0447 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 16 164 14 156 156 165 57.0 7e-40 MRAQTGQINHENKIYELFLNFCRCGIAGWCLEVMFTSVDSIMAGDWRLIGRTSLIMFPIY GMGALLLPISRFIDSWLTGLPGLEDAGQDRLSRVGRSIRHGLIYMVLIFIAEYITGIWLT SIGICPWDYSLWPDNVGGVIRLKFAPLWFMTGLLFEYLTKPKGGD >gi|157101662|gb|DS480662.1| GENE 73 81968 - 82720 777 250 aa, chain + ## HITS:1 COG:CAC0965 KEGG:ns NR:ns ## COG: CAC0965 COG0204 # Protein_GI_number: 15894252 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Clostridium acetobutylicum # 2 224 1 220 241 134 34.0 2e-31 MIRIILVAMTVILYLIFSVPLLAVLRFRAKTDPVGVQKVSLGKIQGVFRFILKLAGVTYE VQGLENIPSDRAVLYVGNHRSYFDILVGYMTVPGLTGFVAKKEMLKIPLLRTWMQRVNCL FLDRVDIKEGLKTILEGIEKVKSGVSIWIFPEGTRNENQELTQLLPFHEGSLKIAQKSGC 
PVIPVAITGTAEIFEQHLPFVKPSHVCIRYGEPIYIKELPAEKRKFPGAYTRDVIEGMLK EMKEEQEQDK >gi|157101662|gb|DS480662.1| GENE 74 82723 - 83859 1198 378 aa, chain + ## HITS:1 COG:DR1147 KEGG:ns NR:ns ## COG: DR1147 COG0077 # Protein_GI_number: 15806167 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Deinococcus radiodurans # 111 378 20 285 293 174 37.0 3e-43 MENLDLQEIRGQLDEIDTRLVELFEKRMELCGQVAEFKIGTGKAVYDGQREKQKLAAVAA MAHSEFNRKGVHELFSQIMTISRRLQYRLLAEHGKGGDTGFTMVDELKREGARIVYQGVE GAYSHRAALQYFGEDADVYHVPVFEDAMIEVEEGRADYGVLPIENSLAGAVIDNYDNLLK HDIYIVAETKVAVDHALLGLPEASLEDIRRVYSHPQGLMQCSGYLGAHRQWSQISVENTA GAAKKVLEEGEVSQAAVASPTAGALYGLKVLEASINNNKNNTTRFIIVARKPMYRKDAGK VSICFEGLHKSGSLYNMLGNFIYNDVNMLMIESRPIEGRSWEYRFFVDVEGSLGDPAIRN ALLGISEEAVSMRILGNY >gi|157101662|gb|DS480662.1| GENE 75 83875 - 85143 1313 422 aa, chain + ## HITS:1 COG:CAC3262 KEGG:ns NR:ns ## COG: CAC3262 COG2607 # Protein_GI_number: 15896507 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Clostridium acetobutylicum # 1 421 12 425 426 263 36.0 5e-70 MDIKDLVIYRNLEHQELFRDIAHLMGYAQGGQEPDGASCGGRLTELAVKYGFEGNLWHCF LAFCLANNENAYTTTCEITGPAGGSLDELAREDCRIFKALFDYDIRNLEPSGLWALMEDY KPARKDSRVFNRRVRNRIVNLAVQLEESADAGEFQDHVVDFYREFGVGKFGLNKAFRIIE REEGQPVIIDPIINVEHIYFKDIVGYELQKQKLIENTEAFVEGKEANNVLLFGDAGTGKS SSVKAVLNEYYGRGLRIIEVYKHQFKALSDVLEQIKDRNYKFIIYMDDLSFEDYELEYKY LKAILEGGLGKRPGNVLIYATSNRRHLIREKFSDKRELDDELHNNDTVQEKLSLAARFGV TIYYGSPDKRQFETIVKALADRHGLDMPEQELFAEANKWELSHGGLSGRTAAQFITHLKA RQ >gi|157101662|gb|DS480662.1| GENE 76 85271 - 86836 1868 521 aa, chain + ## HITS:1 COG:all3552 KEGG:ns NR:ns ## COG: all3552 COG0248 # Protein_GI_number: 17231044 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Nostoc sp. 
PCC 7120 # 4 512 21 536 550 124 25.0 3e-28 MAVRTFAAIDVGSFELELGIYEMTYKTGIRRIDHVRHVIALGKDTYSTGKISYELVEEMC QVLEEFVQIMKSYKVKDYRAYATSAMREAHNRQIILDQIRVRTGLTVRIISNSEQRFISY KAIAAKAAEFNKIIQKGTAIVDVGFGSAQLSLFDKDSLVTTQNLPLGTLRVRGLLARIPA TVEEHKQHIEEIVDNELFTFRKMYLKDREITNLIGIGENILYIINRLDLNTYGDKVDAAT MNRFYERMCQMTTDQIEERFGVNSEYASLLLPSVVVYKRILELTGAEMFWVPGIRLCDGI AAEYASENKLVKFSHNFENDILAASRNIAKRYKCHTSHNQVLEQYALCIFDNMKKFHGLG QRERLMLQIAVLLHACGKFISIKNSNECSYNIIMSTEIIGLSHLEREIIANVVRYNIRDF DYDKVQLETQVHQEEDIGMSRNAITILIAKLTAILRLANSMDRSHRGKLVDCRMAIRENE LVITTGYPGNMALEAASFEQKADFFEEIFGIRPILRQKRRV >gi|157101662|gb|DS480662.1| GENE 77 86839 - 89232 2349 797 aa, chain + ## HITS:1 COG:BH1392 KEGG:ns NR:ns ## COG: BH1392 COG0855 # Protein_GI_number: 15613955 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Bacillus halodurans # 89 750 16 674 705 702 54.0 0 MAESAGKPQASKSKTEKAKVEKNRAEKTDTSRAASAKSGSSRENAGVKADDTVKANSGAK ISAGAKAQTGAKANAAVRPEAVPVDFAADPTNYVNRELSWLEFNYRVLSESRDKSIPLFE RLKFLSITASNLDEFYMVRVASLKDMVHAKYTKPDIAGLTPQEQLDRISERTHELVEMQY STYNRSLVPTLRQNGLRMITEHEDMTEEEAAYTDEYFRKNVYPVLTPMAFDSSRPFPLIR NKTLNIAALLRKKSGEDEELEFAMVQVPSVISRIVELPREIDEDGKERRVVILLEEIIER NMPSLFLNYDIVASHPFRIMRNADLTIDEEEAVDLLEEIQKQLKKRQWGEAIRLEIEDNV DKRLLKIIRRELSINSQDIFEINGPLDLTFLMKMYGLEGFELFKAPRYVPQPVPALMNED DIFTNIRKGDILLHHPYQSFDPVVNFVRSAARDQSVLAIKQTLYRVSGNSPIIAALAEAA DNGKQVSVLVELKARFDEENNINWAKMLEKAGCHVIYGLVGLKTHSKITLVVRMEEDGIR RYVHLGTGNYNDSTAKLYTDCGILTCNPQIGEDATAVFNMLSGYSEPLAWNKLSVAPLWL RGRFLRMIRREAEHAREGRPAHIMAKMNSLCDKEIITALYEASCAGVKVEMIIRGICCLK AGVPGLSENISVRSIVGNFLEHSRIFYFFNDASPEVYMGSADWMPRNLDRRVEIMFPVED EVLREKVIHILEVELEDNVKAHILQPDGSYEKEDKRGKVLVNSQEQFCREAILMAKESAD QEDPARSRVFKPVMSSN >gi|157101662|gb|DS480662.1| GENE 78 89349 - 90692 1532 447 aa, chain - ## HITS:1 COG:PA1051 KEGG:ns NR:ns ## COG: PA1051 COG2610 # Protein_GI_number: 15596248 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related 
permeases # Organism: Pseudomonas aeruginosa # 17 446 14 452 452 343 50.0 4e-94 MSGLFIFLVIFLAVVCMVLAISKFNQHPFIVLTVIAIAVGLVCGIPTEEVINTVKSGFGN ILASIGIVILAGTIIGTILEKTGAALTMANTILKMVGKKNSVLTMAITGYVTGIPVFCDS GFVILSPISRALASQSNVSLAVMATALSGGLYATHCLVPPTPGPIAMAGTLEADLGLTIL VGLLVSIPATLSALIYAKKVSSKIDIPANPEYTVDELLQKYGKLPGALHSFSPILLPIVL IALKSVGDFPSAPFGDGVVKMFFSFIGNPVIALILGVFLAMTLVPAAEKKNTLSWITEGV TNSAGILAITGAGGAFGAILQKLPIADALSASLLGLGVGVFLPFIIAALLKTAMGASTVA MITTSAMIAPLMPALGFTSPLAKVLVIMAIGAGSMTVSHANDSYFWVVSQFSDMETKDAY KCQTGMTGVMGIVTIIVVFLLSLVFVH >gi|157101662|gb|DS480662.1| GENE 79 90715 - 91863 1209 382 aa, chain - ## HITS:1 COG:YPO3978 KEGG:ns NR:ns ## COG: YPO3978 COG3835 # Protein_GI_number: 16124105 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Sugar diacid utilization regulator # Organism: Yersinia pestis # 8 351 11 350 375 78 23.0 2e-14 MLEFSSMAQDFVEATSSLVGGRTINIMDREGTIIASTEQERIGTFHQGAAEVIATGKPVL IETKDLPRYPGAKEGYNMPIFLKDELIGVVGIFGCEEQMLNVANLLKVYVTQHFAQQAMA QKQNVESEVRNRLLSLLLLGDIGQMETIYQLSALIPVQLAFPVKVIMIRAGKRKNTREQM NHYTQLFQNMMWQGTLDRSRDVFGIQNNDCIIIHSAPRHGKADEDLDKIIAQVVREEGLC IAVSGACPRLEDIPGGMKESNTLISMEGGPVRNLEDSQVKIQYLIYKSLIHGGTKYAEML YRKLTASQDARQAEVLLVTARVYYQENGSVQKASERLHLHKNTLLYRMKRLFQLLDLENE TPFTREFLIRLIFMYHPVDDIT >gi|157101662|gb|DS480662.1| GENE 80 91927 - 92730 671 267 aa, chain - ## HITS:1 COG:MTH1101 KEGG:ns NR:ns ## COG: MTH1101 COG1237 # Protein_GI_number: 15679112 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily II # Organism: Methanothermobacter thermautotrophicus # 11 260 15 254 260 103 31.0 4e-22 MDNLAGEQKAMKAEHGLSFYVETGDMRILFDFGAGRHALENALRLGIRPEQADIAVGSHG HYDHAAGYRDFVAAGLSCPLVTGNGFFEEKYARNGLKATYLGTGFDSLFLEEHGIRHMVC DSVLPIGKDCWVMGGITRSHDFETVPERFVIRDKDGWKQDYFQDEVCLVLEDKGELVVIV GCSHPGILNILDTVSKRFSQPIRAVVGGTHLVEADRERIHMTLRTMKDLGIGLLGFNHCS GELLQSMMAEHPELNTVYLGAGDCLFL >gi|157101662|gb|DS480662.1| GENE 81 93036 - 93740 966 234 aa, chain + ## HITS:1 COG:CAC0564 
KEGG:ns NR:ns ## COG: CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 234 1 231 233 253 58.0 3e-67 MPDINILVVDDEKEIADLIEIYLVSDGYKVFKAENAAKGLEIISKEEVHLVLLDIMMPGM NGLEMCKKIRENNNIPIIMLSAKSTDLDKILGLGTGADDYVVKPFNPLELTARVKSQLRR YTQLNPNSASKEKANNEITIKGMTINKDNHKVIVDDEEIKLTPIEFDILYLLASNPGKVF STDEIFEKVWNEKVYEANNTVMVHIRRLRGKMKEDTRQNKIITTVWGVGYKIEK >gi|157101662|gb|DS480662.1| GENE 82 93730 - 94866 1368 378 aa, chain + ## HITS:1 COG:BH1809 KEGG:ns NR:ns ## COG: BH1809 COG0642 # Protein_GI_number: 15614372 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 16 363 5 345 351 213 35.0 4e-55 MKSDMSRRYHTRVVTNILYSAVISCLVEIFLVTNVSMIARYMEESGRMNRLLQAVLDYHV AVVLVYVVSGLVLFAVTFMILQEPYIRYISNISDAVQSISEGNLNTTIDVIGDDEFSSMA ANLNKMVEDIRELMDKERESERTKNELITNVAHDLRTPLTSIIGYLELLAGNSKIPAQMQ HKYIEIAYGKARRLEKLIEDLFGFTKLNYGRISMHVSQLDVVKLLGQLLEEAYPNFVEKN LSYDLQSNVPVKIITADGNLLARLFDNLIGNAIKYGADGKRVLVKILAQEDVVTVSITNY GYVIPPEELPLIFNKFYRVEQSRSSSTGGTGLGLAIAKEIVDMHGGTINVTSDLDGTVFT VKLQVNFDINKENFGTIS >gi|157101662|gb|DS480662.1| GENE 83 94901 - 97417 2544 838 aa, chain + ## HITS:1 COG:no KEGG:Closa_0455 NR:ns ## KEGG: Closa_0455 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 17 838 25 829 829 765 47.0 0 MRIFPEKIKRALKSTLFGLMLAASVTVQSLGAQALGTDSTAAKEETVESPVAMDVVYGYQ NTAKSGRFLPLKIELSSQSGQVFNGTLCILAMESDFQGYNMNLDYDVYRYEYPVEIEPGG NVNKSVSISLGARVDQMYVRLLDENGDEVVSKRLKLNLNLETAELFIGVLSDSPAELLYF NGVGINYSTLRTKTIEMTASTLPASELGLDQLDVLLITDFDSGKLSGQQIEAVWEWVRKG GVLLIGTGERGEDTLRGFGKELLEQPLPQPDERVINMGVEYAVDRPEGASIPLVCTDVML KGGTEVLGSDELSVLSSVSAGSGLVAVAMYDFVDIEEFCQANISYIDNLFTTLLGEDKIN GLASAMDGSTSSQFWSVQGLINTGNINNLPKVGLYVTLAVAYVALAGPGLYFFWKQRGMR 
QYYQLSVGILSLCCTGMVLLMGMSTRFTGPFFTYATIKDTDRDEISETTFINMRAPYNKP YSVTLNPEYTLYPITGSAYYNMGPLPKFTGEETPSITIHYGEEGTRLRSDNVGAFNSKFF MMERRTENGQQEGFTGDVNSFDGKVTGTLTNNYSQEVDNVAILLYNQMILIGHMEPGETV SLDGMKVIYGITNFGYAMAEQITGASRYKEDKDIRDAAYVQALERTNLLSFYMGSYLSGY HSEARVLGFSNEKEETEFLKSSNYETYGSTLLTSSIDVNYEQDGMIYRSALQKQPNVLSG EYYESNNSMYGLTPVMLEYYLGNDIEVEKLSFHQMSDEVVKSMRYYYTVPFAGNMYFYNY NTGTYDSMDTHVQSYDREDLEPYLSPGNTLTIKYVYDATGDYTWNIMLPILTVTGRSK >gi|157101662|gb|DS480662.1| GENE 84 97417 - 98379 1127 320 aa, chain + ## HITS:1 COG:alr0970 KEGG:ns NR:ns ## COG: alr0970 COG1131 # Protein_GI_number: 17228465 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Nostoc sp. PCC 7120 # 2 302 7 310 316 227 40.0 2e-59 MLEIRGLHKIYGKYHALSGLDMTVETGALYGFVGPNGAGKTTTIKIMTGLLEAEEGRITI NGMNARFDLGKLKSMIGYVPDFFGVYDNLTVSEYMEFFAACHHINGLVARKRYTALLEQV GLEDKLDFFVDGLSRGMKQRLCLARALIHDPAILILDEPTSGLDPRTRFEFREILKELQE SGKTIVISSHVLSELSELCTDIGIIDQGRMVLEGNIDDILSRVNTSNPLVISVFTNIDKA LSILKSHPCVQTISLREQDICVRFAGDAQDEALLLQQLIDSDVLVNGFTREKGSLESVFM QLTEHEEERVVLSYDAKSGL >gi|157101662|gb|DS480662.1| GENE 85 98357 - 99289 914 310 aa, chain + ## HITS:1 COG:no KEGG:Closa_0457 NR:ns ## KEGG: Closa_0457 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 303 1 302 302 263 55.0 9e-69 MMQNPVYSREMKVSSRSIRLPLIIVLFNGILSMVTLLNMYSAVAQVESTAVIQYSSFMDM YEFVTTIEFILLMFIVPAVTAASISGERERQTLELMLTTQMTASQVVIGKLMSALSTLLL LIVSSFPAVAMVFVYGGITWQDILSLLMCYITVAFFAGSLGICFSALFKRSTISTVVTYG VLIAVVAGTYFINKFSLSLSSMNINNSVVAYGFGENVVKPSSGNVIYLLLINPAATFYTI INGQMGGSAPTAKMAGYFGMQSTGFVMENWIIISIVIQLSIAALMIWVAIKAVEPVKRRR RQKKASKKGD >gi|157101662|gb|DS480662.1| GENE 86 99292 - 100758 1361 488 aa, chain + ## HITS:1 COG:BH1805 KEGG:ns NR:ns ## COG: BH1805 COG4260 # Protein_GI_number: 15614368 # Func_class: S Function unknown # Function: Putative virion core protein (lumpy skin disease virus) # Organism: Bacillus halodurans # 1 326 1 328 433 454 68.0 1e-127 
MGIIKAVTTAVGGALADQWLEAVEPDDMGDRTVFVRGVQVRRGKGSNTKGSSDIVSDGSV IHVYPNQFMMLVDGGKIVDYTAEEGYYKVSHSSMPSMFNGQFGEALKESFNRIRFGGVTP GAQKVYYVNLQEIKGIKFGTRNPVNYFDNFYNAELFLRAHGTYSIKVTDPIKFYAEVIPK NADHVEIDSINEQYLSEFLEALQTSINQMSADGTRISYVTSRSRELGQYMAQTLDEEWTR MRGMEIQAVGIASITYDEESQNLINLRNRGAMMSDPSIREGYVQTTIAEGLKNAGSNDSG AMAGFMGMGMGMQMGGGFMGAASNTNMQQMQMNQAPGQMPGNTGMPGMGGHPTGGQPAGN QPMGGQPAGGQPMGSQPMGGQPMGSQPMGGQPAGGQPMASQPVGVQPAGSQWAVPNQMAG TQPGSAGMAGQMPGTQPGSAWTCPGCGTSNTGKFCGECGHPRPDTPWTCACGNINTGKFC SDCGKPRP >gi|157101662|gb|DS480662.1| GENE 87 100786 - 101886 956 366 aa, chain + ## HITS:1 COG:no KEGG:Closa_0460 NR:ns ## KEGG: Closa_0460 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 365 5 356 356 436 54.0 1e-121 MAAITYKCPNCDGGLVFDPSSQKYHCEYCLSDFTQEELECLSQELVRRKAEHQEPGLQGQ DSREPERREPGSAGAQPVLYTCPSCGAQIVTDETTAATFCYYCHNPVVLAGRLEGQFRPD YVIPFQIDRQKAEEIFSQWIGRKKYVPETFYNKKQIESMSGVYFPYWLYSCRVDGELDAQ GTRLRTWDTAGMRYTETTVYDVKRQGTMEVDHVARNGLRKANRQLVDSVLPYRMNEKEDF SMGYLSGFLAENRDMEKEQFVLDVQTEVRQFALQSLQDQAGSYSSMQVRKKEAGIRDEKW SYALLPVWTLTYNDKARGKIYYFALNGQTGKICGKLPVDRKKIIILFFSVFLPLFLLLMT GGYFLG >gi|157101662|gb|DS480662.1| GENE 88 101883 - 102875 876 330 aa, chain + ## HITS:1 COG:BH1807 KEGG:ns NR:ns ## COG: BH1807 COG1512 # Protein_GI_number: 15614370 # Func_class: R General function prediction only # Function: Beta-propeller domains of methanol dehydrogenase type # Organism: Bacillus halodurans # 72 241 33 201 271 83 33.0 5e-16 MTGRRMVRWIRFLAISAAVVCLGVWSGAGSILDSQAQESGAYGKTQAYNKDKNYEKDKNY DKDQNHKEARRLFDEADLITSEEAGKLEELIARCRKKTGMDVAVVTAYNDGSHTASEYAD DFYDQNGLGTGRKASGVLFLIYMDRPGSYGGEGYVSTTGNMIRILTDQRIEQIQDDVAYS LKTRDYAGAAAEFLKDVEYYVDRGIQRGQYNYDTETGEISIYRSIRWYEGVFAFLVSAGV AGSVCMGVKRRYSMEPTGRERANSLLAYRADAKFAFGDAGDNLIRKFVTSAPIPRPTQNH SSGGSGRSSGRSSTHRSSSGRSHGGGGRRF >gi|157101662|gb|DS480662.1| GENE 89 103158 - 103418 334 86 aa, chain + ## HITS:1 COG:VNG2274C KEGG:ns NR:ns ## COG: VNG2274C COG2827 # Protein_GI_number: 15791086 # Func_class: L 
Replication, recombination and repair # Function: Predicted endonuclease containing a URI domain # Organism: Halobacterium sp. NRC-1 # 1 76 1 76 77 82 52.0 2e-16 MNYTYILTCADGTFYCGWTNDLDKRLKVHNQGKGAKYTKPRRPVVLSYYEAFETKQEAMS REYAIKHMSRAEKEHLIKKTGFQRTS >gi|157101662|gb|DS480662.1| GENE 90 103531 - 104202 711 223 aa, chain - ## HITS:1 COG:CAC0882 KEGG:ns NR:ns ## COG: CAC0882 COG1272 # Protein_GI_number: 15894169 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Clostridium acetobutylicum # 10 222 3 213 214 177 46.0 1e-44 MTAKLKRGITKVKDPGSALTHFIAMILAIIAAIPLLSKAGHDSGHMRISALAIFILSMIG LYAASTIYHTLDISPKINKLLRKIDHMMIFILIAGTYTPVCMIVLGDKTGWTMLTLVWGI AIVGILINALWITCPKWFSSLIYIAMGWVCILAITKILSSMPRAGFMWLLAGGIIYTAGG IIYAMKLPFFNSRHRYFGSHEIFHLFVMGGSLCHYVMMYRFVA >gi|157101662|gb|DS480662.1| GENE 91 104499 - 105284 307 261 aa, chain + ## HITS:1 COG:AGc4432 KEGG:ns NR:ns ## COG: AGc4432 COG0564 # Protein_GI_number: 15889714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 39 229 135 323 340 98 31.0 1e-20 MQKPAGLESQVSRGFAPDMVSEIRRYLHNPQDIHNHPQGVSTAQPPYVGVIHRLDKPVEG IMVYALNQKSAASLSAALQAGKIKKSYLAVVCGKPVDKQGTYVDYLRHCKENNTSKIVDK SDENSKKAVLNYRVLEVINNPKNQEQILSLIDIELLTGRHHQIRVQFAGHATPLYGDERY GGGLSTKSTTMTVDRGRKKSSFGDAHRPLALCARRLAFPHPSTGKIMEFSMVPSSGAFAW FPDRARKLISQSECDKCHKEI >gi|157101662|gb|DS480662.1| GENE 92 105433 - 105816 578 127 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936868|ref|ZP_02084233.1| ## NR: gi|160936868|ref|ZP_02084233.1| hypothetical protein CLOBOL_01757 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01757 [Clostridium bolteae ATCC BAA-613] # 1 127 1 127 127 155 100.0 1e-36 MARPKRTAAAAAQTGAAQTRTAEVKKAAEVKSTAEVKTAEDVKKADEPVVKEAVKETVKP EIEKSVVVEFGGRQVAAKDVLAQAEKAYAKTHPGTVIKSMELYISPEQNAAYYVVNGEGS DDFRIDL Prediction of potential genes in microbial genomes Time: 
Thu Jun 30 16:19:47 2011 Seq name: gi|157101661|gb|DS480663.1| Clostridium bolteae ATCC BAA-613 Scfld_02_4 genomic scaffold, whole genome shotgun sequence Length of sequence - 149103 bp Number of predicted genes - 146, with homology - 143 Number of transcription units - 47, operones - 25 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 189 99 ## CD3335 hypothetical protein + Prom 256 - 315 4.8 2 2 Tu 1 . + CDS 431 - 1108 64 ## gi|160939540|ref|ZP_02086889.1| hypothetical protein CLOBOL_04433 + Prom 1559 - 1618 5.7 3 3 Op 1 . + CDS 1856 - 2275 337 ## CD3329 hypothetical protein 4 3 Op 2 . + CDS 2272 - 2511 243 ## CD3328 hypothetical protein + Prom 2981 - 3040 5.0 5 4 Op 1 . + CDS 3082 - 3276 145 ## CD3327 hypothetical protein 6 4 Op 2 . + CDS 3313 - 4497 722 ## COG0582 Integrase - Term 4515 - 4575 9.4 7 5 Op 1 . - CDS 4578 - 5879 1401 ## COG0519 GMP synthase, PP-ATPase domain/subunit - Prom 5914 - 5973 3.1 8 5 Op 2 . - CDS 6039 - 6635 730 ## Closa_0782 hypothetical protein - Prom 6759 - 6818 4.5 - Term 6746 - 6812 13.1 9 6 Tu 1 . - CDS 6862 - 9471 3049 ## COG0474 Cation transport ATPase 10 7 Tu 1 . - CDS 9835 - 11286 1593 ## Spirs_3076 hypothetical protein - Prom 11326 - 11385 3.9 11 8 Op 1 . - CDS 11387 - 11827 313 ## Rmar_2026 Fe-S metabolism associated SufE 12 8 Op 2 . - CDS 11827 - 13113 801 ## COG0520 Selenocysteine lyase 13 8 Op 3 . - CDS 13115 - 13528 456 ## gi|160939555|ref|ZP_02086904.1| hypothetical protein CLOBOL_04448 - Prom 13588 - 13647 6.1 + Prom 13538 - 13597 7.3 14 9 Tu 1 . 
+ CDS 13716 - 14603 616 ## COG2207 AraC-type DNA-binding domain-containing proteins 15 10 Op 1 1/0.000 - CDS 14627 - 15544 1080 ## COG0191 Fructose/tagatose bisphosphate aldolase 16 10 Op 2 1/0.000 - CDS 15564 - 16436 843 ## COG0191 Fructose/tagatose bisphosphate aldolase 17 10 Op 3 7/0.000 - CDS 16471 - 17559 1364 ## COG1299 Phosphotransferase system, fructose-specific IIC component 18 10 Op 4 8/0.000 - CDS 17564 - 17893 437 ## COG1445 Phosphotransferase system fructose-specific component IIB 19 10 Op 5 5/0.000 - CDS 17931 - 18377 586 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 20 10 Op 6 . - CDS 18374 - 20374 1816 ## COG3711 Transcriptional antiterminator - Prom 20616 - 20675 4.7 + Prom 20574 - 20633 6.0 21 11 Tu 1 . + CDS 20794 - 21555 395 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 21556 - 21598 -0.6 - Term 21542 - 21586 1.2 22 12 Op 1 3/0.000 - CDS 21654 - 22571 901 ## COG0191 Fructose/tagatose bisphosphate aldolase 23 12 Op 2 5/0.000 - CDS 22592 - 23581 878 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 24 12 Op 3 38/0.000 - CDS 23632 - 24468 637 ## COG0395 ABC-type sugar transport system, permease component 25 12 Op 4 35/0.000 - CDS 24449 - 25366 517 ## COG1175 ABC-type sugar transport systems, permease components 26 12 Op 5 1/0.000 - CDS 25393 - 26742 1110 ## COG1653 ABC-type sugar transport system, periplasmic component 27 12 Op 6 . - CDS 26775 - 27866 695 ## COG0524 Sugar kinases, ribokinase family - Prom 28029 - 28088 8.9 + Prom 28529 - 28588 5.0 28 13 Op 1 . + CDS 28805 - 28963 80 ## gi|160939573|ref|ZP_02086922.1| hypothetical protein CLOBOL_04466 29 13 Op 2 . + CDS 29007 - 29420 388 ## COG3328 Transposase and inactivated derivatives + Prom 29424 - 29483 2.6 30 14 Tu 1 . + CDS 29542 - 29634 96 ## 31 15 Tu 1 . 
- CDS 29819 - 29995 208 ## gi|160936473|ref|ZP_02083841.1| hypothetical protein CLOBOL_01364 - Prom 30015 - 30074 7.3 - Term 30051 - 30084 -0.7 32 16 Tu 1 . - CDS 30152 - 30355 66 ## gi|160939578|ref|ZP_02086926.1| hypothetical protein CLOBOL_04470 - Prom 30378 - 30437 1.5 33 17 Op 1 . - CDS 30446 - 32038 732 ## gi|160939579|ref|ZP_02086927.1| hypothetical protein CLOBOL_04471 34 17 Op 2 . - CDS 32035 - 33201 347 ## gi|160939580|ref|ZP_02086928.1| hypothetical protein CLOBOL_04472 - Prom 33367 - 33426 6.2 + Prom 33034 - 33093 5.8 35 18 Tu 1 . + CDS 33134 - 33265 64 ## + Prom 33372 - 33431 6.3 36 19 Op 1 . + CDS 33505 - 34572 646 ## gi|160939581|ref|ZP_02086929.1| hypothetical protein CLOBOL_04473 37 19 Op 2 . + CDS 34598 - 35968 320 ## gi|160939582|ref|ZP_02086930.1| hypothetical protein CLOBOL_04474 + Prom 35987 - 36046 7.3 38 19 Op 3 . + CDS 36066 - 37229 838 ## COG3328 Transposase and inactivated derivatives 39 20 Op 1 7/0.000 - CDS 37514 - 39067 911 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 40 20 Op 2 . - CDS 39060 - 40793 1123 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 40841 - 40900 1.8 - Term 40826 - 40866 0.4 41 21 Op 1 . - CDS 40917 - 41894 490 ## BF3495 hypothetical protein 42 21 Op 2 11/0.000 - CDS 41888 - 42865 726 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 43 21 Op 3 21/0.000 - CDS 42841 - 43821 693 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 44 21 Op 4 16/0.000 - CDS 43814 - 45238 949 ## COG1129 ABC-type sugar transport system, ATPase component 45 21 Op 5 . - CDS 45378 - 46439 572 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 46497 - 46556 6.8 - Term 46586 - 46636 10.9 46 22 Tu 1 . 
- CDS 46674 - 47048 397 ## Achl_2086 hypothetical protein - Prom 47080 - 47139 8.1 - Term 47173 - 47215 7.0 47 23 Op 1 37/0.000 - CDS 47251 - 48024 450 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc 48 23 Op 2 23/0.000 - CDS 48017 - 49201 1370 ## COG0133 Tryptophan synthase beta chain 49 23 Op 3 9/0.000 - CDS 49229 - 49849 696 ## COG0135 Phosphoribosylanthranilate isomerase 50 23 Op 4 21/0.000 - CDS 49936 - 50766 751 ## COG0134 Indole-3-glycerol phosphate synthase 51 23 Op 5 13/0.000 - CDS 50763 - 51797 1130 ## COG0547 Anthranilate phosphoribosyltransferase 52 23 Op 6 35/0.000 - CDS 51858 - 52490 638 ## COG0512 Anthranilate/para-aminobenzoate synthases component II 53 23 Op 7 . - CDS 52487 - 54055 1292 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 54 24 Op 1 6/0.000 - CDS 54630 - 55310 761 ## COG0352 Thiamine monophosphate synthase 55 24 Op 2 1/0.000 - CDS 55297 - 56136 840 ## COG2145 Hydroxyethylthiazole kinase, sugar kinase family 56 24 Op 3 . - CDS 56160 - 56696 607 ## COG4732 Predicted membrane protein 57 25 Op 1 . - CDS 57152 - 57979 860 ## COG0561 Predicted hydrolases of the HAD superfamily 58 25 Op 2 2/0.000 - CDS 58032 - 58772 789 ## COG0584 Glycerophosphoryl diester phosphodiesterase 59 25 Op 3 14/0.000 - CDS 58801 - 60246 1643 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 60277 - 60336 5.7 60 25 Op 4 38/0.000 - CDS 60341 - 61252 819 ## COG0395 ABC-type sugar transport system, permease component 61 25 Op 5 10/0.000 - CDS 61252 - 62118 954 ## COG1175 ABC-type sugar transport systems, permease components 62 25 Op 6 . - CDS 62127 - 63266 1085 ## COG3839 ABC-type sugar transport systems, ATPase components - Prom 63480 - 63539 4.2 - Term 63418 - 63472 11.5 63 26 Tu 1 . - CDS 63546 - 63923 431 ## gi|160939610|ref|ZP_02086958.1| hypothetical protein CLOBOL_04502 - Prom 64009 - 64068 4.2 64 27 Tu 1 . 
- CDS 64134 - 64532 431 ## Closa_4143 hypothetical protein - Term 64571 - 64607 8.1 65 28 Tu 1 . - CDS 64636 - 66420 2019 ## COG4716 Myosin-crossreactive antigen - Prom 66453 - 66512 3.2 66 29 Op 1 1/0.000 - CDS 66558 - 67436 916 ## COG0030 Dimethyladenosine transferase (rRNA methylation) - Prom 67547 - 67606 3.6 - Term 67527 - 67565 4.0 67 29 Op 2 . - CDS 67608 - 68372 884 ## COG0084 Mg-dependent DNase 68 29 Op 3 . - CDS 68377 - 69150 690 ## Closa_0158 hypothetical protein - Prom 69195 - 69254 4.4 69 30 Tu 1 . - CDS 69306 - 70655 968 ## COG0438 Glycosyltransferase - Term 70693 - 70733 6.1 70 31 Tu 1 . - CDS 70826 - 72820 2389 ## COG0143 Methionyl-tRNA synthetase - Prom 72922 - 72981 8.0 + Prom 72995 - 73054 9.0 71 32 Tu 1 . + CDS 73095 - 74969 1995 ## COG4805 Uncharacterized protein conserved in bacteria + Term 75022 - 75085 17.7 - Term 75008 - 75073 13.0 72 33 Op 1 28/0.000 - CDS 75106 - 76038 1006 ## COG2177 Cell division protein 73 33 Op 2 . - CDS 76028 - 76714 284 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 74 33 Op 3 10/0.000 - CDS 76760 - 77821 1177 ## COG1377 Flagellar biosynthesis pathway, component FlhB 75 33 Op 4 17/0.000 - CDS 77904 - 78683 995 ## COG1684 Flagellar biosynthesis pathway, component FliR 76 33 Op 5 16/0.000 - CDS 78699 - 78962 437 ## COG1987 Flagellar biosynthesis pathway, component FliQ 77 33 Op 6 . - CDS 78981 - 79757 979 ## COG1338 Flagellar biosynthesis pathway, component FliP 78 33 Op 7 . - CDS 79750 - 80127 409 ## Teth514_1678 hypothetical protein 79 33 Op 8 20/0.000 - CDS 80129 - 81091 846 ## COG1886 Flagellar motor switch/type III secretory pathway protein 80 33 Op 9 . - CDS 81116 - 82006 901 ## COG1868 Flagellar motor switch protein 81 33 Op 10 . - CDS 81999 - 82844 858 ## COG1360 Flagellar motor protein 82 33 Op 11 24/0.000 - CDS 82845 - 83279 494 ## COG1558 Flagellar basal body rod protein 83 33 Op 12 . 
- CDS 83291 - 83647 392 ## COG1815 Flagellar basal body protein 84 33 Op 13 15/0.000 - CDS 83669 - 84049 532 ## COG1516 Flagellin-specific chaperone FliS 85 33 Op 14 . - CDS 84066 - 86972 2265 ## COG1345 Flagellar capping protein 86 33 Op 15 . - CDS 86984 - 87460 587 ## gi|160939635|ref|ZP_02086983.1| hypothetical protein CLOBOL_04527 87 33 Op 16 5/0.000 - CDS 87447 - 87725 372 ## COG1551 Carbon storage regulator (could also regulate swarming and quorum sensing) 88 33 Op 17 . - CDS 87719 - 88165 385 ## COG1699 Uncharacterized protein conserved in bacteria 89 33 Op 18 . - CDS 88215 - 88814 686 ## Closa_3431 hypothetical protein 90 33 Op 19 21/0.000 - CDS 88836 - 89819 1125 ## COG1344 Flagellin and related hook-associated proteins 91 33 Op 20 . - CDS 89830 - 91284 1653 ## COG1256 Flagellar hook-associated protein 92 33 Op 21 . - CDS 91327 - 91773 637 ## gi|160939641|ref|ZP_02086989.1| hypothetical protein CLOBOL_04533 93 33 Op 22 . - CDS 91786 - 92088 281 ## gi|160939642|ref|ZP_02086990.1| hypothetical protein CLOBOL_04534 - Prom 92157 - 92216 4.8 - Term 92371 - 92426 12.1 94 34 Op 1 . - CDS 92491 - 93009 421 ## Cphy_1341 hypothetical protein 95 34 Op 2 . - CDS 93002 - 93772 715 ## COG1192 ATPases involved in chromosome partitioning - Prom 93810 - 93869 10.0 - Term 93873 - 93928 3.5 96 35 Tu 1 . - CDS 93992 - 95497 1060 ## COG1344 Flagellin and related hook-associated proteins - Prom 95631 - 95690 3.6 - Term 95658 - 95703 9.1 97 36 Op 1 . - CDS 95705 - 96631 932 ## COG1876 D-alanyl-D-alanine carboxypeptidase 98 36 Op 2 . - CDS 96676 - 97326 676 ## gi|160939647|ref|ZP_02086995.1| hypothetical protein CLOBOL_04539 99 36 Op 3 . - CDS 97345 - 97737 346 ## COG2257 Uncharacterized homolog of the cytoplasmic domain of flagellar protein FhlB 100 36 Op 4 . - CDS 97724 - 99124 1527 ## Closa_3422 hypothetical protein 101 36 Op 5 . 
- CDS 99124 - 100884 1565 ## COG1315 Predicted polymerase, most proteins contain PALM domain, HD hydrolase domain and Zn-ribbon domain 102 36 Op 6 8/0.000 - CDS 100936 - 101703 989 ## COG4786 Flagellar basal body rod protein 103 36 Op 7 . - CDS 101717 - 102457 935 ## COG4786 Flagellar basal body rod protein 104 36 Op 8 2/0.000 - CDS 102462 - 103238 872 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 105 36 Op 9 . - CDS 103245 - 105299 2246 ## COG1298 Flagellar biosynthesis pathway, component FlhA 106 36 Op 10 . - CDS 105302 - 106408 1103 ## COG1886 Flagellar motor switch/type III secretory pathway protein 107 36 Op 11 3/0.000 - CDS 106421 - 107305 1055 ## COG1291 Flagellar motor component 108 36 Op 12 1/0.000 - CDS 107344 - 107550 382 ## COG1582 Uncharacterized protein, possibly involved in motility 109 36 Op 13 . - CDS 107571 - 109256 1901 ## COG4786 Flagellar basal body rod protein 110 36 Op 14 . - CDS 109327 - 109722 390 ## Amet_2717 flagellar operon protein 111 36 Op 15 . - CDS 109805 - 110506 679 ## TTE1435 flagellar hook capping protein 112 36 Op 16 . - CDS 110538 - 111962 1158 ## gi|160939661|ref|ZP_02087009.1| hypothetical protein CLOBOL_04553 113 36 Op 17 . - CDS 111995 - 112444 484 ## gi|160939662|ref|ZP_02087010.1| hypothetical protein CLOBOL_04554 114 36 Op 18 . - CDS 112455 - 113759 1550 ## COG1157 Flagellar biosynthesis/type III secretory pathway ATPase 115 36 Op 19 . - CDS 113772 - 114578 845 ## gi|160939664|ref|ZP_02087012.1| hypothetical protein CLOBOL_04556 116 36 Op 20 19/0.000 - CDS 114571 - 115578 1353 ## COG1536 Flagellar motor switch protein 117 36 Op 21 . - CDS 115580 - 117187 1729 ## COG1766 Flagellar biosynthesis/type III secretory pathway lipoprotein 118 36 Op 22 . - CDS 117217 - 117522 343 ## gi|160939667|ref|ZP_02087015.1| hypothetical protein CLOBOL_04559 - Prom 117567 - 117626 10.1 119 37 Op 1 1/0.000 - CDS 117967 - 119175 1315 ## COG5263 FOG: Glucan-binding domain (YG repeat) 120 37 Op 2 . 
- CDS 119198 - 120418 1090 ## COG5263 FOG: Glucan-binding domain (YG repeat) 121 37 Op 3 . - CDS 120448 - 121635 1078 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 121737 - 121796 5.4 - Term 121776 - 121831 13.4 122 38 Op 1 1/0.000 - CDS 121878 - 123395 990 ## COG1070 Sugar (pentulose and hexulose) kinases 123 38 Op 2 . - CDS 123406 - 124266 664 ## COG1737 Transcriptional regulators - Prom 124303 - 124362 3.8 124 39 Op 1 . - CDS 124368 - 124583 153 ## BP951000_1856 glycine/sarcosine/betaine reductase selenoprotein GrdB 125 39 Op 2 . - CDS 124632 - 125678 782 ## Amet_3591 selenoprotein B (EC:1.21.4.2) 126 39 Op 3 . - CDS 125690 - 126994 1018 ## TDE2120 glycine reductase complex proprotein GrdE2 127 39 Op 4 . - CDS 127013 - 128296 685 ## PROTEIN SUPPORTED gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 128 39 Op 5 . - CDS 128297 - 128788 181 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 - Term 128803 - 128850 7.7 129 40 Op 1 . - CDS 128865 - 129947 758 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component 130 40 Op 2 . - CDS 129997 - 131103 354 ## COG0371 Glycerol dehydrogenase and related enzymes - Prom 131195 - 131254 9.1 - Term 131397 - 131452 13.4 131 41 Op 1 . - CDS 131478 - 132026 749 ## COG2002 Regulators of stationary/sporulation gene expression - Prom 132061 - 132120 2.0 132 41 Op 2 . - CDS 132134 - 133207 1075 ## COG0502 Biotin synthase and related enzymes 133 41 Op 3 35/0.000 - CDS 133207 - 134016 202 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 134 41 Op 4 33/0.000 - CDS 134013 - 135032 1062 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 135 41 Op 5 1/0.000 - CDS 135078 - 136310 1278 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 136 41 Op 6 . - CDS 136383 - 137618 939 ## COG2710 Nitrogenase molybdenum-iron protein, alpha and beta chains 137 41 Op 7 . 
- CDS 137600 - 138949 1032 ## CLJU_c23040 hypothetical protein 138 41 Op 8 . - CDS 138942 - 139721 930 ## COG1348 Nitrogenase subunit NifH (ATPase) - Prom 139960 - 140019 5.1 + Prom 140011 - 140070 2.8 139 42 Op 1 . + CDS 140154 - 140405 399 ## Shel_16520 hypothetical protein 140 42 Op 2 . + CDS 140423 - 140740 485 ## ELI_3158 hypothetical protein + Term 140753 - 140811 19.5 - Term 140745 - 140795 13.1 141 43 Op 1 7/0.000 - CDS 140825 - 144427 3846 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) 142 43 Op 2 . - CDS 144447 - 145016 720 ## COG0193 Peptidyl-tRNA hydrolase - Prom 145221 - 145280 5.1 - Term 145245 - 145285 8.1 143 44 Tu 1 . - CDS 145317 - 146030 998 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) - Prom 146084 - 146143 5.2 144 45 Tu 1 . - CDS 146213 - 146305 71 ## - Prom 146419 - 146478 4.0 + Prom 146359 - 146418 6.2 145 46 Tu 1 . + CDS 146469 - 147881 1635 ## COG2509 Uncharacterized FAD-dependent dehydrogenases + Term 147884 - 147944 7.2 - Term 147878 - 147925 14.1 146 47 Tu 1 . 
- CDS 147949 - 148863 622 ## Tthe_0908 hypothetical protein - Prom 148915 - 148974 1.8 Predicted protein(s) >gi|157101661|gb|DS480663.1| GENE 1 1 - 189 99 62 aa, chain + ## HITS:1 COG:no KEGG:CD3335 NR:ns ## KEGG: CD3335 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 56 246 301 302 94 87.0 1e-18 GKEYIFSELVNPVYNRKGNQVTASLAVKYLDNQTMTTQVSQFKLVLEKDEGKWKILCSMK AQ >gi|157101661|gb|DS480663.1| GENE 2 431 - 1108 64 225 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939540|ref|ZP_02086889.1| ## NR: gi|160939540|ref|ZP_02086889.1| hypothetical protein CLOBOL_04433 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04433 [Clostridium bolteae ATCC BAA-613] # 1 225 8 232 232 402 100.0 1e-110 MISVVLAVVMLFSLSAQSVFASSVGADTLNGIKIIEQKDTDTTMYAKYYMTINGETTLYT EYGEIQKDRVVVDSVSIKVDEHMVIIPSTEQRQHFEYPITAPDSSNSISNSDNLTTRASS CQYKPHTETFSLKIHQFTLSVVKAAIITSLGLGAGDAGVVAGAMIDVVISHGGSYIPDSI YFDGKRCVSKSTGKIYYRYKGNIYMDSSKTELLAKNVSWSRRWGH >gi|157101661|gb|DS480663.1| GENE 3 1856 - 2275 337 139 aa, chain + ## HITS:1 COG:no KEGG:CD3329 NR:ns ## KEGG: CD3329 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 138 2 139 140 246 97.0 3e-64 MKPSEFQTTIENQFDYICKVAMEDERKDYLKALSRQCKRETLFCDMDDYTVNLLSSKDTY PSHFHTFEMDGFTVRIENSLLAEALENLDGKKRDVILRYYFLGFDDTEISKILEVNRSTI QRRRHAGLEFIKKFMEEET >gi|157101661|gb|DS480663.1| GENE 4 2272 - 2511 243 79 aa, chain + ## HITS:1 COG:no KEGG:CD3328 NR:ns ## KEGG: CD3328 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 79 1 79 79 143 96.0 3e-33 MTAKHPMIPFPVIVKAADGDIEAINQIVRHYSGFIASRSMRPMKDEYGNTHMVVDETLRR RMETRLIAKILSFEIREPN >gi|157101661|gb|DS480663.1| GENE 5 3082 - 3276 145 64 aa, chain + ## HITS:1 COG:no KEGG:CD3327 NR:ns ## KEGG: CD3327 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 64 1 64 64 115 96.0 6e-25 MITVTKKDLIELGYGTSFAADIIREAKRLMISKGHTYYQSRKLDRVPREAVEELLGINFT 
DKSN >gi|157101661|gb|DS480663.1| GENE 6 3313 - 4497 722 394 aa, chain + ## HITS:1 COG:BS_ydcL KEGG:ns NR:ns ## COG: BS_ydcL COG0582 # Protein_GI_number: 16077547 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus subtilis # 6 368 3 368 368 182 32.0 1e-45 MAKDLIKKAENGTYYFRANLGFHPITGKQIQKYKSGFKTKKEAREAYSKLILASTEELAE KKQQLSFKQFIEETYLPWYKTQVKESTYLNRRSTIQKHFSYFYKMTVDEIKPSNVQNWQL ELAKKFNPNYIRIVQGMLSIAFDRAIVLGLAKKNPSRMIGNIKSKKTKVDFWTLDEFQKV ISLLYKGDYYEHYLFISYWLLFMTGMRIGEAAALQWSDINFETGLLSITKTLYYKTMDEY KFVEPKTQASIRTIYIDTDTIKELKAWQEVQQKVLKDCDLVLSYNGIPTSKHTLPRALKK LAGLAGVHRIKIHALRHSHASLLISMGENPLLIKERLGHEKIQTTLGTYGHLYPNTNVEV AKKLTGVLTYTPATTSVAAYTSNQHTSIYHRAVE >gi|157101661|gb|DS480663.1| GENE 7 4578 - 5879 1401 433 aa, chain - ## HITS:1 COG:BH0607_2 KEGG:ns NR:ns ## COG: BH0607_2 COG0519 # Protein_GI_number: 15613170 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Bacillus halodurans # 119 433 1 315 315 425 64.0 1e-119 MKQDMIVILDLGSTENTKLARDIREMGVYSEIYPHDITASELKELPNVKGIIINGGPNNV VDGTPIDVRSELYEAGYPVMAAGHGAAACERSIHSWDEADSAQILRSFVFDTCKAQANWN MKNFISDQVELIRQQVGDRKVLLALSGGVDSSVVAALLIKAIGKQLTCVHVNHGLMRKGE SESVIDVFKNQMDANLVYVDAVDRFLGKLAGVADPEQKRKIIGAEFIRVFEEEARKLEGI EFLAQGTIYPDIVESGTKTAKVVKSHHNVGGLPEDLNFTLVEPLRQLFKDEVRACGLELG LPHSMVYRQPFPGPGLGVRCLGAITRERLEAVRESDAILREEFANAGLDKTVWQYFTIVP DFKSVGVRDNARCFDYPVIIRAVNTVDAMTASIERIDYDVLQKITDRILKEVKNVNRVCY DLSPKPTATIEWE >gi|157101661|gb|DS480663.1| GENE 8 6039 - 6635 730 198 aa, chain - ## HITS:1 COG:no KEGG:Closa_0782 NR:ns ## KEGG: Closa_0782 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 193 2 195 199 274 67.0 1e-72 MDKVITVSREFGSGGRELGVKLAEKLDIPFYDKELISMAADDINIAEEAFQNYDEHIVIH DPLDRQYYHAFSDIYKVPMSDQIFVAQSNVIRRLAAHGPCVIVGRCADMILDDSINLFIY SKMRDRIKRMLILEPGSDEKEMERRIREVDRKRKEYYQYYTGNTWGRAQNYHLCLDSGLT GVDGCLRAVLAYLGELSE >gi|157101661|gb|DS480663.1| GENE 9 6862 - 9471 3049 869 aa, 
chain - ## HITS:1 COG:FN1022 KEGG:ns NR:ns ## COG: FN1022 COG0474 # Protein_GI_number: 19704357 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 5 863 4 860 862 842 53.0 0 MKEWYQQTKEEILSQFQVTEQGLTSSQAEKILAEKGENVLEEGKRKSTLQVFLEQFCDLL VVILIIAALISMVSGNVESTVVILAVIILNAILGTVQHAKAEKSLDSLKSLSSPNAKVLR DGQKVEIPSAKVVPGDILYLEAGDLVVADGRILENYSLQVNESSLTGESTNVDKSDGTLH SDCALADRANMVYSSGLVTYGRAVVLVTATGMDTEIGKIAALMNATKEKKTPLQVSLDQF GSRLAMAIMVICALVFLLSLYRKMPVLDSLMFAVALAVAAIPEALSSIVTIVQAMGTQKM AKEHAIIKELKAVESLGCVSVICSDKTGTLTQNKMTVQNIYTNGQTITIDQLNLKNQLHR YLLYDAILTNDSSIVDGKGIGDPTEFALVEMGRKATVDENLLRELMPRLEEIPFDSDRKL MSTKYELHDVPTVLTKGALDVLLDRTVKIRMEEGIRDITQEDREAILQKNLEFSQEGLRV LAFGYKEVPEDYTLSLDNENDFIFLGLISMMDPPREESKAAVADAKRAGIKPVMITGDHK ITATAIAKQIGIFEDGDIAMTGRELDAMSEEELDRKITEISVYARVSPENKIRIVDAWQR RGSITAMTGDGVNDAPALKKADIGVAMGITGTEVSKDAAAMILTDDNFATIIKAVANGRN VYRNIKNAIKFLLSGNMAGILSVLYTSLAALPVPFAPVHLLFINLLTDSLPAIAIGMEPA EKDLLSEAPRNPKTGILTKDFMTTILTQGGIIAVCTMIAFHTGLRTGSAATASTMAFATL TLARLFHGFNCRSKHNIFKLGFSSNWYSLGAFAAGVVLLGIVMFVPFMQNLFSVTPLTQS QLINVCLLAAVPTVLIQMFKIIRDIKHRK >gi|157101661|gb|DS480663.1| GENE 10 9835 - 11286 1593 483 aa, chain - ## HITS:1 COG:no KEGG:Spirs_3076 NR:ns ## KEGG: Spirs_3076 # Name: not_defined # Def: hypothetical protein # Organism: S.smaragdinae # Pathway: not_defined # 1 478 2 510 514 177 26.0 1e-42 MKYAMDVKVNLKPIFSNLVHTDWWEGPCRVGPMEEGTPEFERRVGKEQFKIWYEELQNNL DLTRCNLMDPVYMEFDESFVMRDSEFDKLMPENHQVDLYLITYRVPGIERLNKPVSMINL GPTPIDLVGYYRDIGLEAYMAHDYEEYNRILTCLQVRKAVANTKILILSNSEQTPASVNT SCCDLVSLFTRYGIRHNRIDFRQVFNYFDEIPADEGIHQEARALMEGADKVDITEEYLCS DLRYFHAVRRMMERYDCNAFTTPCKELCASRLPQKNKCVPCITHSLNKDDRIPSACEEDL AVWMAMMMMMYLTRQSVFMGNPVLVLAGSRTLEQLGMPKLLTQPGQVFDHDVLEIHHAVP VRRMRGFDQPEQKYELAHFTTQGWGAHYHVDMAEEEGQVVTFGRFNRQGTRMMVAVGHTL GCEFRPVYCSPAVYYEVEGGAREFRQALAKGGYGHHQAIIYGNHVKELQELGEIVGFEVE VFH >gi|157101661|gb|DS480663.1| GENE 11 11387 - 11827 313 146 aa, chain - ## HITS:1 COG:no KEGG:Rmar_2026 NR:ns ## 
KEGG: Rmar_2026 # Name: not_defined # Def: Fe-S metabolism associated SufE # Organism: R.marinus # Pathway: not_defined # 24 137 14 128 147 69 31.0 4e-11 MDEKDQTDICQSDICRTDICRAEKKYIDSLLRLQDPGAQCEYLLMLGMEKPLLDSLRVDR YRIGGCRTAIWLRAEDRDGMVHFYSDSDSLLVRGVLSILEELYQARTPEIIKSHPMRFLD YISDDVIYPEIKENGLSKCYQMLAHM >gi|157101661|gb|DS480663.1| GENE 12 11827 - 13113 801 428 aa, chain - ## HITS:1 COG:mlr0021 KEGG:ns NR:ns ## COG: mlr0021 COG0520 # Protein_GI_number: 13470346 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Mesorhizobium loti # 2 422 11 407 413 327 40.0 2e-89 MDVTEIRRDFPILNQEGKPLVYLDNGATTQKPQAVIDRLCRYYSMENSNIHRGSYPLSSQ ASRMYERARETVRSWVDAEYGEEIVFTKGCTESVNLAAAAVFETYVCPGDNVIVTELEHS SNYFPWKNQCEKKCAQFRVALAQTDGSLRAEDVLDQMDSRTRLIAVTAMSNVTGFCPDVD RIIREAHKRGVLVLIDAAQAVVHRAISVRQMKCDFLCFSGHKIYGPMGIGVLYGRKMYLE EMAPYLYGGDMVVKGDRGEVSYKKSPSKYEAGTQDIAGALGLEAALLYLNRRGFEEMIQY EAELGAYLRERLEAVKGVHVIGEPAVRQPLSGRSEPQPQSGGSVPQPQSAGSVSPIALFE TDKLGAYDIGVLLANSGIAVRSGSHCAYPLMKRMGKESLCRVSLSFYNTKQEIEYMAERL ERICGRGS >gi|157101661|gb|DS480663.1| GENE 13 13115 - 13528 456 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939555|ref|ZP_02086904.1| ## NR: gi|160939555|ref|ZP_02086904.1| hypothetical protein CLOBOL_04448 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04448 [Clostridium bolteae ATCC BAA-613] # 1 137 1 137 137 218 100.0 1e-55 MIDEIVKSTREASAVLDRIKSMPCEGLEKEVILPLLRKAVNLKLMIQEDTQTDIRKLVII SIKRQDLRKGNLPDEVIQREIKKYDCHQTSLAVQKKVLLLMFIERELGIAMEDDEASDIE NLDELADAVIRHLKGGK >gi|157101661|gb|DS480663.1| GENE 14 13716 - 14603 616 295 aa, chain + ## HITS:1 COG:CAC2818 KEGG:ns NR:ns ## COG: CAC2818 COG2207 # Protein_GI_number: 15896073 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 12 289 8 275 279 79 23.0 8e-15 MDIPFSRYTEKLPEEFSVRHKITETTEKRNFHLHRQMEIIFALSGNLKCKFETGEIDIPR NGLILFNQMDLHYIYPEDGSGICDRYVLYFSSNYISRFSTPEVNLLECFLFRPGNQPVVL 
TVPDRQLDSFLSVLSRMEDYNRPESGCSASPVYGRDLHLKFLLGEFLLLTNQLYTERFGP LNTAAYQEHARLVYDIYEYIGGHYSDNLSTDSLSRLFLISKTQLYHVFKEISGMTVSDYI TEYRITRAKDFLINTDYSVEMIGQAVGYTNLSSFSRVFKEQAGCSPIQYRRKHAD >gi|157101661|gb|DS480663.1| GENE 15 14627 - 15544 1080 305 aa, chain - ## HITS:1 COG:lin2239 KEGG:ns NR:ns ## COG: lin2239 COG0191 # Protein_GI_number: 16801304 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Listeria innocua # 1 283 1 283 284 244 42.0 1e-64 MYTDIKTILDDANKNNYAVLAASAFNMELARGLIAAADEMQAPLIILMGQAQMTKHARPE LMVPMIKTLAEQTNVPVALILDHGRDWEKITWAYRQGFSSIMVDASAYDMEENIARTKKA VELCHVQGLAVEGELGHVGQAATLDGCDVSLYTKPEEAVRFVQETGVDSLAIACGTAHGD YPKGFIPTINFDVIKGVKQAVNMPLALHGGSGSGDENIRKAVEAGINKVNIATEIFNACR DYAKNRLDEKPDLDYMSLMMEVEQECKKTVKHFISLTGSEGKAAGFKKKYAFCHGISQID TGIGE >gi|157101661|gb|DS480663.1| GENE 16 15564 - 16436 843 290 aa, chain - ## HITS:1 COG:lin2238 KEGG:ns NR:ns ## COG: lin2238 COG0191 # Protein_GI_number: 16801303 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Listeria innocua # 1 288 1 287 299 261 43.0 1e-69 MLLNLREILEPAVKDNFAVGGFNVTESTMFKAIVEEAQYREAPAIIQVSPNEFQFSEREL YLYFSVRLQRSRNPFVLHYDHSKSYEGCIRAIQAGFTSVMFDGSQMEYDQNVECTRRVVE AAHGAGVSVEGEIGTIGETEDYLNGTVRDMVYTSPELARRFVEDTGVDALAVSIGTVHGI LPKGYVPKLQLGLLKELAAAVPVPLVLHGGSGTPHDEVAAACRTGIHKVNISSEFKHAYY QSVREFVTNHPTLVSPTRVLKDAVQEVRNVAGTKMQVLGSTGKSHCYRKY >gi|157101661|gb|DS480663.1| GENE 17 16471 - 17559 1364 362 aa, chain - ## HITS:1 COG:lin2240 KEGG:ns NR:ns ## COG: lin2240 COG1299 # Protein_GI_number: 16801305 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Listeria innocua # 4 348 1 353 363 286 45.0 6e-77 MGGLKLSEIKKHVMTGISFMIPIVVGAGLCMALGTAIGGTKVSEAEGTLIWYIWRTGKIG MGFVVPVITAAIAYSIADRPAIAPGLILGSIASEIYTGFIGGIIAAFLVGYTVLFLKKYL KVPKTMQGLVPVLILPFLTTLICGALMFAVIGKPVAFVINALQNWLIGLQGGSKFILGAI 
VGGMCGVDFGGPIGKTASLFANGLLVDGIYGPEAVKLVTCMIPPVGVTLSYILTRNKYTK AEKEAVKAAFPMGICMITEGVIPLAMNDPLRVIGSSVIATSIAGGLTMVLGIENYVTAGG WFIIPLSNKPMMMAAMVVLGGVIMGVILSLWKKVVTEDDSMDFGMEQENGESTTEGMTFE NF >gi|157101661|gb|DS480663.1| GENE 18 17564 - 17893 437 109 aa, chain - ## HITS:1 COG:lin2241 KEGG:ns NR:ns ## COG: lin2241 COG1445 # Protein_GI_number: 16801306 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Listeria innocua # 1 101 1 101 103 99 51.0 1e-21 MKIVGVTACTAGIAHTYIVKEKLEQAARKLGYTCRIETQGSIGIEEQLTPEEIQEADVVI LAIDVKISGENRFKGKPVVRMSTEKAMKNTGAMLQSIEEALTKKKNGGN >gi|157101661|gb|DS480663.1| GENE 19 17931 - 18377 586 148 aa, chain - ## HITS:1 COG:SP1619 KEGG:ns NR:ns ## COG: SP1619 COG1762 # Protein_GI_number: 15901456 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Streptococcus pneumoniae TIGR4 # 1 148 1 149 149 103 34.0 1e-22 MNLLEVINENAVAVNVKAQNKEEVFECVCDLLLKDGSITSKEEFKQDLYIRESQGKTGIG DGIAIPHGKSLAVKRNCISVLKLEQPIQWETLDGLPAQVFIIFAINQKDKDDYFLRLMAS VAKKLAQEGTCGKLMGSSTKEDILEAFC >gi|157101661|gb|DS480663.1| GENE 20 18374 - 20374 1816 666 aa, chain - ## HITS:1 COG:BS_yjdC_1 KEGG:ns NR:ns ## COG: BS_yjdC_1 COG3711 # Protein_GI_number: 16078265 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 497 4 495 510 116 23.0 2e-25 MQERQYKILKLLYRSDGFVTVERLAADVNCSLKTIRNDLKQLGSFLEEYGLGKVVSKSNK GVCLIKDKDWDTAKQAWNDVNQMELYGTDKHFQILELLLKQRSIQKMKLEQRLLISRLDA EKAASQASVWLTAHHVAVTCKRGQGLQIECSEYYWRMAMWELFADISRKRQEFGDQTLKS QTEKFLWGFDMAGVTGAIAGLEKKFGIHYSFDGYQQLSFLLSLCVIRIRKKEYAEIPADF GHQRAEWQGLEYDLSMAETCMESLKEYYHTDFPEGEKQFTVLVLGAAEIQEFTDEASRQT YMEANQNICRVTDKVISITGNILGQGLDRDDYLFDSLLLYMKSKTLKLRCGILEENPLKE VVKSKYPNIYAAAWSASLIIESELEVRIGEDEVAYLALYIGGAIERLNVGVEVCILCNHG IGISRILKEQIERSIQNINVVDVLTTRDTCKIQRSQCDFLISSVPVGDVFAGRDVVQIGN 
VLQPWDIQQIQNKMKQVRKKKMRRIAEKTELSEYQLFHPSLVYHFPERTHKKEIISFLCA RLAEAGYVTKDYEQTVLDREETTSTVLELGVAIPHGAAFCVCRPVIAAAILEEPVDWGRD RRVDRIFLLALNLDERFKAKGQIIRFYSAIVTLLDDKEAYEEFHSLRGQEEIAGYLNSIV KGDRKA >gi|157101661|gb|DS480663.1| GENE 21 20794 - 21555 395 253 aa, chain + ## HITS:1 COG:CAC1333 KEGG:ns NR:ns ## COG: CAC1333 COG2207 # Protein_GI_number: 15894612 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 10 241 54 280 286 93 29.0 4e-19 MLTPNSVCHVSEKNWKVPKNSIILFNNMDLHKITLNTHERFERYTLSFKPEYIESLSSKE TDLLECFFLRPFSNPYILPLTIPQAEECLTQLKRLITCNDDTAKQDYGYDLKVKFLLGEF LIFINNLYRKYHNISSDTITSSYSLIYSVINHIHTHLSDELSLELLSSTFYINKFYLCNL FKNVTGTSPNQYIISCRIMKAKELLSQNLPVDSVCSLVGYKNLSHFSRIFKQHTGLSPKQ YSKFKQAEDFKRK >gi|157101661|gb|DS480663.1| GENE 22 21654 - 22571 901 305 aa, chain - ## HITS:1 COG:lin2239 KEGG:ns NR:ns ## COG: lin2239 COG0191 # Protein_GI_number: 16801304 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Listeria innocua # 1 283 1 283 284 238 42.0 2e-62 MYTDVKTILDDASRNNYGVLAASAINLELARGLIAAADELQSPLIILMGQMQMTKHARAD VMVPLIRTLAEETNVPVALILDHGRDWEVITHAYRNGFSSIMIDASAYEMEENIRRTKKV IEMCHPQGVAVEAELGHVGQAALGDQTDDSCYTRPEDVTEFLNQTRADCLAIACGTAHGQ YPKGVTPEIRFDIIRAVKKVTDVPIALHGGSGSGDENIKKAVEAGINKVNICTEIFNYVR DEIRKTLEQTPEIDLLSLMSRTEQAAKEIGKHFICLTGSQGKAANFQRKSAFQYGFAQTE TDGGE >gi|157101661|gb|DS480663.1| GENE 23 22592 - 23581 878 329 aa, chain - ## HITS:1 COG:ydjJ KEGG:ns NR:ns ## COG: ydjJ COG1063 # Protein_GI_number: 16129728 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 329 17 345 347 321 47.0 1e-87 MDCDMPSVSDTDVLVEVAHVGICGSDMHIFEDPHYAVKDIKLPVVLGHECAGRVAAVGKS VKGIIPGDRVALEPGVPCGSCEFCMGGRYNLCPDVRFLGARPWLNGAFSRYVSHPARWTF RLPDAMDTVEGALLEPLVVGMHAVDRANLRTGQSVLILGAGCIGLMTLEACLARGITNVT 
MSDLYENRLDMAGTIGARHVVNSSEEDIISRSAQITANRGYDVIFETAGSQKTAALTADL VKRGGKIVMVGNVFGETPFNFFKTNSKEADILGVFRYRNLYPAAIELCSEGQAEPKKIVT NYFEFEKIQAAMEYAITQKQEAVKTVIRM >gi|157101661|gb|DS480663.1| GENE 24 23632 - 24468 637 278 aa, chain - ## HITS:1 COG:mlr7094 KEGG:ns NR:ns ## COG: mlr7094 COG0395 # Protein_GI_number: 13475911 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 4 278 6 281 281 198 37.0 8e-51 MKMKRTKRIQKLVMLIPLLLIVCFVIFPYIWTFLTSIKPTDELYTTNIKILPRETTFQNY VTLFTSTDFLASMFHSIVIAVITSCISMLVSSMASYAFARYRFRGKNLALSGILLLYMFP SVLYLTPLFVVFNKLKLIGSPVALVVSYCTFTIPFSIWLLTSYMKSIPLELEEAGKIDGA NVPQLLYYVVMPLLKPGLIATGTYVFINSWNEYLFAVMFTTSNNRTLPVSLASLVGEYDL RWDIISAGAVAAMIPVVILFMFIQKNLVAGLTAGSVKG >gi|157101661|gb|DS480663.1| GENE 25 24449 - 25366 517 305 aa, chain - ## HITS:1 COG:SMc01978 KEGG:ns NR:ns ## COG: SMc01978 COG1175 # Protein_GI_number: 15966265 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Sinorhizobium meliloti # 12 272 20 280 311 200 40.0 3e-51 MNRGKAAGKRCRKSDIYIPIILLMPAFLLVVGVIIYPICYAVGLSFQYYKLTDYVNRQFI GLENYISVWRNETFLASLGNTVKWVGITVACQFLFGLVLAMILNVPFRGRGIIRSITLMP WVTPGVVIALMWVWIYNGNFGVLNKCLTSLGIISKNIPWLGSSQTALYSQIVTMIWQGIP FFAIMILAALQTISADLYEAADISGANSWQKFLYITLPELMPTIITTCMLRIIWVFNNVE VLYLMTGGGPGHSSMTVSLVAYIKAQKSLDFGQGSTIAIYGTLFMILFMTIYLKLTRRGD EDEKD >gi|157101661|gb|DS480663.1| GENE 26 25393 - 26742 1110 449 aa, chain - ## HITS:1 COG:AGl3270 KEGG:ns NR:ns ## COG: AGl3270 COG1653 # Protein_GI_number: 15891755 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 81 440 45 403 419 122 26.0 2e-27 MKKGLAKVTAIGLCVSMLISLAGCSGGGGSATTDKAAEKAAEAVTDKNIQSEAEGTQEPI ELTYWYWEDEQGTIENMLKDKWEAYAGDRIKVNFESVPSSSFHDKLITAISTGTGPDVFI 
CKPMWAPELYGMGGLMNMEDVFEGWEYADEVDDFMLEQMRAGLDKLYLYPRTTIVMYLYC RKSMFEKAGIDYPKTVDEFFDACEKLTVDTDGDGKTDQYGFGMRGGNGGHYMWSSFVFSA LKGKDYYDAEGKASLADAKLAEMNQKYIDLYQNGYVPPSAITDGFSEVLTNFKSGVTAML YHHIASITTIKETFGDDFEVIPVPTGESGQAFGCQEMTGWAINPNSKNLEAAEEFVKWAS SPEIHDVRCEKLQQVPFMSSVQALDKYKNDQAYKVSMDNMLSAHTLPVGPEITTYTEEMW AQTFQRALMGELSSMEMLEQLDKCLNGEM >gi|157101661|gb|DS480663.1| GENE 27 26775 - 27866 695 363 aa, chain - ## HITS:1 COG:YPO1816 KEGG:ns NR:ns ## COG: YPO1816 COG0524 # Protein_GI_number: 16122068 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Yersinia pestis # 48 215 38 197 319 66 28.0 8e-11 MESKKILVAGSIVLDIIPEVYIPKGQPIPEIFLSEGKTTECSYTHIYLGGEVGNTGLGLK KLGCDVRLVSKIGDDSVGEIVREILNRYDTDYTLIEMMGEQSSASVAIALPGKDKMTLHS RGASQLFKAADITDEMLEGVKLFHFGYPPSMKYLVENEGEELEDLLKNVKSKGITISLDM SLPDLKTFLGHVNWRPILQRILPYVDLFLPSLEESIFFLHREDYVEMVRKAGANNLLDYI DVRGMAEKLADELLEMGGTIVMLKCGHEGMYLRTAEKERWGNMGKAAPNSLNGWYDRKIW QKPVKVKRILSRTGAGDIAIAGFLSSFLHEDNAKTALGIAAWAASICIQSYDTISGLCPL NEL >gi|157101661|gb|DS480663.1| GENE 28 28805 - 28963 80 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939573|ref|ZP_02086922.1| ## NR: gi|160939573|ref|ZP_02086922.1| hypothetical protein CLOBOL_04466 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04466 [Clostridium bolteae ATCC BAA-613] # 7 52 1 46 46 87 97.0 3e-16 MRTSDSLNYLVSCIFIHYMNKQIIKTVFVYNLKPAYPGRLGVCESFSVIKDL >gi|157101661|gb|DS480663.1| GENE 29 29007 - 29420 388 137 aa, chain + ## HITS:1 COG:YPO0011 KEGG:ns NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 47 137 34 124 402 82 40.0 2e-16 MAREKKSVHKVQMTDGKRNIIRQLLEEYEIESAQDIQDALKDLLGGTIKEMMESEMDEHL GYRKSERSDCDDYRNGYKTKQVNSSYGSMKVEVPQDRNSTFEPQVVKKRQKDISDIDQKI ISMYTKGMTTRQITETL >gi|157101661|gb|DS480663.1| GENE 30 29542 - 29634 96 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MESTISTGYGVIRKLAAFVVLGIIQMLKEF >gi|157101661|gb|DS480663.1| GENE 31 29819 - 29995 208 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936473|ref|ZP_02083841.1| ## NR: gi|160936473|ref|ZP_02083841.1| hypothetical protein CLOBOL_01364 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01364 [Clostridium bolteae ATCC BAA-613] # 1 56 1 56 413 103 100.0 4e-21 MAREKKSVHKVQMTDGKRNIIRQLLEEYEIESAQDIQDALKDLLGGTIKEMMESEMAS >gi|157101661|gb|DS480663.1| GENE 32 30152 - 30355 66 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939578|ref|ZP_02086926.1| ## NR: gi|160939578|ref|ZP_02086926.1| hypothetical protein CLOBOL_04470 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04470 [Clostridium bolteae ATCC BAA-613] # 1 67 1 67 67 130 100.0 3e-29 MAGLLPVFIGGLFNYSLNATFKECVKKRPVDKVTKSASYMVKYKETILWRLIIYDEKTKQ SNADDNS >gi|157101661|gb|DS480663.1| GENE 33 30446 - 32038 732 530 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939579|ref|ZP_02086927.1| ## NR: gi|160939579|ref|ZP_02086927.1| hypothetical protein CLOBOL_04471 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04471 [Clostridium bolteae ATCC BAA-613] # 1 530 1 530 530 952 100.0 0 MKVLTSIRKLLVTCSLALAMVSFYATAQGLHEYVFTQMWQALLVSAAIQTALFILNLKLI YFFTLNPRIVLILWATMLVSSSLFSFVFISNEIYNTSLYYSDAERIMDEFTIKKCYQYQN YLDNYLIYTKETMDAYCSVPFVSHSSESNTEITKILDDCKTDISSFTFSSNCAAFHRSLL KNLDSIKSSLPSSYTEQQANDALTALEGEKNTAETLKNNMLSNISSKEAVWNNINLRLSQ YRNFNDQAFIDLQNDNNKRDTEINSMKTEATQLDSVISALQTCINKVTAHKMNNTENTME SYRHDLMVEINKDAPDVDILDTITSDMFALLLKEGFTSSSYEMTNYFSFKNSLSVRKEIL ECQDKTDAIIHMLDNKRQDKNAFMFTSKADENTEKQIESWKQLWYGQTNELRELVKSLPL PQDLIIYSPDAENGTVTLNPGINLPEPDRTEEIDDISALERRYLANLNPLEKALNLLHTK YNTMALFSAAAAAVLDISSALMGAFVYTLERAFKNRKTAPATPAVLASTE >gi|157101661|gb|DS480663.1| GENE 34 32035 - 33201 347 388 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939580|ref|ZP_02086928.1| ## NR: gi|160939580|ref|ZP_02086928.1| hypothetical protein CLOBOL_04472 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04472 [Clostridium 
bolteae ATCC BAA-613] # 1 388 1 388 388 694 100.0 0 MDTSKSNTIISKRVLSVLPLASILVIAFLLTEFHPFASDFKSFFTFFLSGWFLFTLYLSV KGKHLHERYMIFILSLIAIINGLYRREIFEAYFQIPIEYGACIIIVFYLLRISFPYIKKL SLWFLSHTQEPPDTDTNTPNNGASKMRYARHRFVHHTKSAAEADSQASSTNAKVVCPTTQ YHKGISLGQSLLYIALTVIPFLFTYNMVTRLSQTTNYRIANTETLFHDVNSIIIVFLIGI LSMVVVFTLLFCYYKLIKIMVTTIQGNENPSAYVLTIAALSAFLAKTGVLSQDTVLNLFA QGDIFSLPLLILILYPVCLLSVHALINLAKNHNVTSWILDRAIQLAKKAASIGNNILNSS IDLIAFATAGFLSSAMELIDEQEGDEEE >gi|157101661|gb|DS480663.1| GENE 35 33134 - 33265 64 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLARGKTDKTRLLMMVLDLLVSIRSSFRLAGFPVSIIGYKVSP >gi|157101661|gb|DS480663.1| GENE 36 33505 - 34572 646 355 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939581|ref|ZP_02086929.1| ## NR: gi|160939581|ref|ZP_02086929.1| hypothetical protein CLOBOL_04473 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04473 [Clostridium bolteae ATCC BAA-613] # 1 355 12 366 366 677 100.0 0 MDKEYGGEFSVEMIEARRVLKTNIKKIREVEGLSQEAFGKKYGKSMSSISNYETGMSLPS FEFLLTLREQYDFSMEEFITSTDIEITKGGNQGAKAEGTGKNLNKYVGAYFIYLFDTSSH GDEHEKSDIEAVHFGVLSIYKKRGRKDTLGVMGCFNMADSKTKVFHDECSRIFEINDAEC EQRLKEVYRNRRNTSVYYGDLAFSRDNLFINLNSGEKDKAFIILRIPNSESPKYIGGIGT INSVSKGRYSSPCIQNIGLSRFYLKVNPAAIGEKLIFEEINPELIRESRELTEFFKLLYC NDRPITAFEEETKRRLIQSLMENYIKRYIEANLFRTNKVEVERNDDWYKFIKDNR >gi|157101661|gb|DS480663.1| GENE 37 34598 - 35968 320 456 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939582|ref|ZP_02086930.1| ## NR: gi|160939582|ref|ZP_02086930.1| hypothetical protein CLOBOL_04474 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04474 [Clostridium bolteae ATCC BAA-613] # 1 456 1 456 456 875 100.0 0 MSRAELLFERMKQRMKYVYPVYNGYLCETADRWNVRRTAEGLMFQRTKGHYCRMALKMSE IRLPAYMLLAALLHDQQQDVLIQDIPKTIPEEHKIKLMQVLQYYHDLEDTIATSVVMEKL LNDMEDVHYLALILLFAKNLDLLRVRSRKVLSGELEITRKYLVLWARRLHIRYFEKEFRE VEFRVTCKDYFQKTDILLQKLYSVNSESTKSLMEDIDGNIFELKFINKLRKSEFYNQLCN KGYTVSAEDYNNPVIYEIPLYNVYLLNSESEPLYMRRKSEEWVKYYLDCLVKKGYLLLGY EDETEQYLSHFILQGKWGVRYRIFICTTEICKRYFYSSKIQEFSEEKWHGKRKSGIIVYD 
KEKRPHEFDEKLTALDFAFYIHKDIGLCAKQAIINNEMRPLDTILKHGDTVTIISQSDVD NQKYSAEPGWMRFIKTQKAKEYIIRYLEIRFSELEG >gi|157101661|gb|DS480663.1| GENE 38 36066 - 37229 838 387 aa, chain + ## HITS:1 COG:YPO0011 KEGG:ns NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 47 387 34 374 402 320 45.0 3e-87 MAREKKSVHKVQMTDGKRNIIRQLLEEYEIESAQDIQDALKDLLGGTIKEMMESEMDEHL GYRKSERSDCDDYRNGYKTKQVNSSYGSMKVEVPQDRNSTFEPQVVKKRQKDISDIDQKI ISMYTKGMTTRQISETLEDIYGFEASEGFISDVTDKLLPQIEDWQNRPLSDVYPVLYIDA IHYSVRDNGVIRKLAAYVVLGINSDGLKEVLTIEVGENERAKYWLSVLNGLKNRGVKDIL LLCADGLTGIKEAIAAAFPKTEYQRCIVHQVRNTLKYVSDKDRKLFAADLKTIYQAPTEE KALEALERGTKKWSEKYPNSMKSWHQNWDAIVPIFKFSTTVRKVIYTTNAIESLNATYRK LNRQRSVFPSDTALFKALYLSTFEATK >gi|157101661|gb|DS480663.1| GENE 39 37514 - 39067 911 517 aa, chain - ## HITS:1 COG:BH2728 KEGG:ns NR:ns ## COG: BH2728 COG4753 # Protein_GI_number: 15615291 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 517 1 500 510 157 27.0 4e-38 MIKTILVDDEDIIREGLSNFIDWPALGLELVGQASNGREAVELVKIMAPEVIVTDVKMPM MSGIELLALAKDQKPDCVVIMISSYDEFEYAQQSLNLGAFAYLLKPIDTDKLIQLLKEAV LKIQKARSGEKILNTYRDVLEQTQRKILDNKIKLMALGGKVDTLNEAETQYLGQFTKFCV ASVNTESPRYHEESFMEEMGIHLEEWKNGHGVAADIKQFQNQDRMTVFCIMEKGDMTIQF RKLLTIYSRYYIKAVLGISDIVESVRELSAAYRQSLEAVEYRFFTDRSLICYSTVREEIQ SSFKDMPDWNRYLEKSMKHGGECEIKTFTDDFLSYVYQAKPSPVIIRTAVSAILLETIRT LRTAGGKPEDLFLSVSDTITAVLNESNPALMARKLREILLSASVYISRLEKLRPNSVVQK ARKYIEENYWNPDLRLDDVAGYVYINASYFSSVFSKEMGISFGDYLTKVRIEKAMDLLKN THMKIYEIADLVGYQNPSWFNVAFKRYTGQKPGDFRK >gi|157101661|gb|DS480663.1| GENE 40 39060 - 40793 1123 577 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal 
ATPase domain # Organism: Bacillus halodurans # 9 563 15 587 602 204 27.0 5e-52 MMNYRLHTRLMIYFSTMVIVSMLLVASISYIASYRIVEELAYSFSNQSAQSVADNLNDIF NEAENLAGLVENNSIFQNILNAEMPQDIKERYAAELKYDFELYQLAGYTINEFGGLYVLG DNGFNFKSHNVAFQEKDFRQEEWYRRIHQADGVVWVGPQVYSRTVKSIDRSYAAMGCPVI NKANGSRIGVVLVEIEADTIEYIIQEFGNIENGVIQVLDGDNTILFKKNNTGNGVKEAKS QKDRNKNHSVFYYAETMDNGWTVESYIPKAILLGKIINLGVWLGMVIFLMILLAVKVTSV ISGTVTNPINKLIDLMEQAEQKEFAVQMHVKYKDELAVLGNKFNNMMDFTRHLIVVNNKE QENLREAELKTLQMQINPHFLYNTLETVIWLIRSNENEKAISVITSLSKFFRIGLSRGRN IITLREELEHVKEYVKIQNTRYRDKIDFSIHIDEDSMLDYPIPKLTLQPLVENAIYHGIQ EKPEGGAISIEIVHETGDRIRISVIDDGIGMTPAQLERLQEGLKEMAVSGFGMYNTNQRL RNYFGRESALRIESQFGEGTNVSFSINGNKRGEKAYD >gi|157101661|gb|DS480663.1| GENE 41 40917 - 41894 490 325 aa, chain - ## HITS:1 COG:no KEGG:BF3495 NR:ns ## KEGG: BF3495 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 15 322 40 326 328 255 45.0 2e-66 MLNYDPCTIGCQAGWRKYEHNPVLGDSGDFCFDNHVLKVGDKLRMYFSWRTHYSIAYTES EDGLHWGERHVVLSPRQDISWEEDLNRPAIDYRNGVFHMWYSSQTTGGFNKRKWVDSYME ASKEDKGGSVIGYAWSRDGIFWERLEEPVVVPDCTWEKRSLMCPTILWDQKKEIYKLWYC GGGWFEPDAIGYAESKDGIVWEKCGKNPVFTPDKKNLWERAHVAGCQVIQMDGWYYMFYI GYEDLFKARICLARSKDGVSGWERHPMNPIISAGLPGAWDCESIYKPFLYFDEDQDRWLM YFNARTGTTERIGIAIHNGRDFWNG >gi|157101661|gb|DS480663.1| GENE 42 41888 - 42865 726 325 aa, chain - ## HITS:1 COG:BS_rbsC KEGG:ns NR:ns ## COG: BS_rbsC COG1172 # Protein_GI_number: 16080648 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus subtilis # 11 307 19 316 322 193 42.0 5e-49 MGDSKKYIVWIQEKSAVVVFLFLFLISALRYKQFFTPINLINLFRQSSIIGVIAVGMTFV IITGCIDLSVGAIVAVCGILAARMCTVNIVAAILVPLCVGALIGIINGTFVTKLEVPPWI TSLSMMMCLRGIAYIMTNESSVNVEKVSPAFQMIGRGKILGLPVPGILFLLFVVIGTYGL KYTRFGRSVYATGGNREGARMMGIRIDKTIILSYMLCGIGAAMSGLILASRLGAAQSTAG ELYEMQVIAAVVLGGTLLTGGVGHMPGTLFGVFTMSMITNIFNMQGNISTWWQNVIMGFM ILAIVIMQSGLEQIKMKKSEEEAIC >gi|157101661|gb|DS480663.1| GENE 43 42841 
- 43821 693 326 aa, chain - ## HITS:1 COG:SMb21343 KEGG:ns NR:ns ## COG: SMb21343 COG1172 # Protein_GI_number: 16264667 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Sinorhizobium meliloti # 4 301 22 328 334 150 37.0 4e-36 MSKLLKKYGVYLALFAMVAFNSLVTRNFFSLNTCWNIMIQSTTVMFVSLGMTAAIASGGI DISIGPVMALSAIVFARLLDISVMGAFICALALALLCGAMNGFIIARFSIQPMIVTLGMM NMVRGFAELVNDGRTYSFSHPVISNLGFYKVLGVVPIQVLFIIIAVGAMYILIKRTRFGA YVETIGDNPKAARLSGIRISGMMVLIYMLSGFLAGAAGLVEALRMSAADPINFGLQIEVD AIASTAIGGTNMAGGKANLAGTVAGVFIMQLITVMVNMNNVPYSYSLVIKTLVVIIAVCA QNGKFTRLVRFRKRMEVNAWAIQKNI >gi|157101661|gb|DS480663.1| GENE 44 43814 - 45238 949 474 aa, chain - ## HITS:1 COG:SMb21344 KEGG:ns NR:ns ## COG: SMb21344 COG1129 # Protein_GI_number: 16264668 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Sinorhizobium meliloti # 1 459 34 493 497 400 45.0 1e-111 MGENGAGKSTLMKILNGLYSSDSGRVLFDGKEIHPQSSLEAQKLGISTIYQELNLIPELS ICENIYIGREPMGKWGIDWKKIENNARELLAGIGLDYDVKQPLSLQGAAVAQMVAIARAL AIQSKLVVMDEPTSSLDNQEVKILFQVIRRLKSQGVSIIFISHRLNEVYEICDRVTILKD GVCEGEYGINELDEISLVSKMIGQKVENSEARLQNDDFHSTSQPLCEMEKITNRKLREVS FKMYPGEVLGFAGLLGSGRTELIKVLFGEDTDYKGNIVINKRKVRFKMPSDAIRQGMALC PEDRRTEGIIPNLSVRENISIVLLPKLTKMGIISKNSQEKVVKEFIEKLGIKTPDMEQKV KNLSGGNQQKVLLARWLCTSPRLVIMDEPTRGIDVGAKAEIEKIVRDLAKQGMSVIMISS EISEVVRNSNRVMVMRDGHKLGELTGGEINQEEIMKRIADNRVFSKCGKEAQYE >gi|157101661|gb|DS480663.1| GENE 45 45378 - 46439 572 353 aa, chain - ## HITS:1 COG:SMb21345 KEGG:ns NR:ns ## COG: SMb21345 COG1879 # Protein_GI_number: 16264669 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 60 353 32 327 327 217 43.0 3e-56 MAATLLAAAIFCSGCSTDTSPSGNTTVGQQSSAAQTGETVKKETDTSSEKEAPAQDGGNK IVNVSDYLIGYSQLANTGTYRIAETNSMKDEAEKRGFKLIVTDAQDNTAKQVSDVEDLIA 
QNIDLLILSPREFEGLETALQTAKENSVPVILVDRLAKGEAGVDYVTYIATDFVWEGQAA GEWLKDKTGGTCNIIELTGTVGSSSAQDRATGFAGVVDQNEGMKIIASQTGNFERSEGQK VMENLLQAHGGEVDAVFCHNDQMALGAVQAIKAAGYKPNEDILVIGIDGEMDSFKSVIAG EMSATVVSSPMYGPITFDTVEKILAGQEVPEQTIMEGVVVDAGNAQENMELAY >gi|157101661|gb|DS480663.1| GENE 46 46674 - 47048 397 124 aa, chain - ## HITS:1 COG:no KEGG:Achl_2086 NR:ns ## KEGG: Achl_2086 # Name: not_defined # Def: hypothetical protein # Organism: A.chlorophenolicus # Pathway: not_defined # 1 121 1 118 119 71 30.0 1e-11 MDSSVLLFFEKMPQALPIYEAFTKRLLKELGPVQVKVQKSQIAFSNRYQFAFIWHPSRRF RGRKGVYMVVTFGVSHRIEDSRIEAATEPYPNRWTHHVIVQSADEIDDKLMGWIREAYEF ALVK >gi|157101661|gb|DS480663.1| GENE 47 47251 - 48024 450 257 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. AzwK-3b] # 11 249 16 254 263 177 41 2e-43 MNRIQQVFENKKAFIPFITAGDPDLSVTEALVPAMAEAGADLIELGIPFSDPVAEGIVIQ EADLRALASGTTTDKIFDTVRRIRQKTDVPLAFMTYINPVFAYGVDKFMKNAADCGIDAL IVPDMPFEEREELLPACKKYDMTLIYLVAPTSRERAQAIARESDGFVYCVSSLGVTGVRS EITTNIKEMVDTVKQGRDVPCAIGFGISTPEQASRMASQADGVIVGSAIVKMVARYGINC VEPVCDYVKQMKAAVGV >gi|157101661|gb|DS480663.1| GENE 48 48017 - 49201 1370 394 aa, chain - ## HITS:1 COG:aq_706 KEGG:ns NR:ns ## COG: aq_706 COG0133 # Protein_GI_number: 15606106 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Aquifex aeolicus # 3 391 8 396 397 525 66.0 1e-149 MSRGRFGIHGGQFIPETLMSAVTELEEAYEYYKKDPQFKAELKYLLEEYTGRPSRLYYAE KMTKDLGGARIYLKREDLNHTGSHKINNALGQVLLAKKMGKTRVIAETGAGQHGVATATA AALLGLECEVYMGKEDTDRQALNVYRMELLGAKVHPVTSGTQTLKDAVNETLREWTNRIS DTHYVIGSVMGPHPFPMMVRDFQSVISREIKEQLMEKEGKLPDMVIACVGGGSNAMGAFY EFIDEKDVKLVGCEAAGRGVETEQTAATIATGRLGIFHGMKSYFCQDEYGQIAPVYSISA GLDYPGIGPEHAQLHDSGRAQYVPVTDDEAVDSFEYLSRMEGIIPAVESAHAVAYARKIA GSMDKDQIIVINLSGRGDKDVAAIARYKGVELYE >gi|157101661|gb|DS480663.1| GENE 49 49229 - 49849 696 206 aa, chain - ## HITS:1 COG:CAC3159 KEGG:ns NR:ns ## COG: CAC3159 COG0135 # 
Protein_GI_number: 15896407 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylanthranilate isomerase # Organism: Clostridium acetobutylicum # 5 206 3 202 205 134 38.0 1e-31 MKTCRLKICGLSRFGDILAANDILPDYIGFVFAESSRRVTPDKAAQLRRELDKRIRAVGV FVKAPMEEILELAGHREKERVIDLIQLHGDEDEAYIRQLKKHTPLPVIKAVRVRSREQIL KAQELSCDYLLLDTYTKGRYGGSGTQFDWNMIPELKKPYFLAGGIHLGNIRQAASHGPYC IDVSSAVETGGRKDRRKMEEIAQVLR >gi|157101661|gb|DS480663.1| GENE 50 49936 - 50766 751 276 aa, chain - ## HITS:1 COG:CAC3160 KEGG:ns NR:ns ## COG: CAC3160 COG0134 # Protein_GI_number: 15896408 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Clostridium acetobutylicum # 55 276 39 259 262 187 47.0 2e-47 MILDTIARSAGKRVEQRKQVKPLEQVMKEAFKCREREGIKREDIKGEDGTGGGYSFKAAL SKPGVSFICEVKKASPSKGLIAPDFPYVEIARQYQEAGADAVSVLTEPEFFLGADRYLEE IHGEIGLPLLRKDFTVDEYQIYEAKVLGASAVLLICSLLDMEQLKRYMGICGRLGLNSLV EAHTDREVAMAAEAGADIIGINNRNLDTFEVDFTNALRLRKLVDRGTIFVAESGIRTPED IELLAENQVDAVLVGETLMRAADKKQALRELKSRIR >gi|157101661|gb|DS480663.1| GENE 51 50763 - 51797 1130 344 aa, chain - ## HITS:1 COG:MJ0234 KEGG:ns NR:ns ## COG: MJ0234 COG0547 # Protein_GI_number: 15668409 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Methanococcus jannaschii # 1 332 1 331 336 315 50.0 1e-85 MIQKAIHELVENKNLDFETTKEVMDEIMSGNATQAQISSFLTALRMKGETIDEITACATV MRDKAVKLSPPFPVMDIVGTGGDEVGTFNISTTSAFVAAAGGIRVAKHGNRSVSSKSGAA DVLERLGAELALTAEQAETVLEDTGMCFLFAPAYHSSMKYAAPVRKEIGIRSIFNILGPL SNPAGASMQLLGVYSRDLVEPLAQVLANLGVTRGMVVCGGDGLDEATLTGPTHVCEIRYG KITAYDMTPEELGLTVCNLEELIGGTPEENAQITRNILSGSLKGPKRDVVVLNAAISLYL GIDDCTVRDCVKTAQDMIDSGAAMAKMEEFIQATKKAAGEVKAS >gi|157101661|gb|DS480663.1| GENE 52 51858 - 52490 638 210 aa, chain - ## HITS:1 COG:TM0141_1 KEGG:ns NR:ns ## COG: TM0141_1 COG0512 # Protein_GI_number: 15642915 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 
Anthranilate/para-aminobenzoate synthases component II # Organism: Thermotoga maritima # 2 208 47 238 246 199 48.0 3e-51 MIVLIDNYDSFSYNLVQMIGSIEPDIKVVRNDQVTVDDIESMKPSHLILSPGPGYPRDAG ILEEAVIKLKGKMPILGVCLGHQGICEAFGGTIAHAKKLMHGKQSDIRLDSHIKEEKSGL PEKMGCGCRLFEGLPSVIPAARYHSLSAVPESLPEELEVVALDCMEGEVMAVSHRDYPIY GLQFHPESILTPDGRKILENFLNIGMIQAM >gi|157101661|gb|DS480663.1| GENE 53 52487 - 54055 1292 522 aa, chain - ## HITS:1 COG:aq_582 KEGG:ns NR:ns ## COG: aq_582 COG0147 # Protein_GI_number: 15606032 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Aquifex aeolicus # 8 518 9 490 494 383 42.0 1e-106 MIIPACNEILSLSGEYDIIPICREIYADVVTPITLLRRIAERSSRFYLLESIEGGEKWGR YSFLGYNPVMRVACRCGKVKIESHVSRFPSREIQTDHPMDVLREILAQYKSPRLKGQPPF TGGFVGYFAYAMIEYAEPTLHISRGGANDFDLMLFDKVIAYDHLKQKIQIVVNMKTGRTE RGICEELERCKGRKRCEGRENGNTSGGLEYLMKEYKRACEEIQEMASLISDTAPLKRLSS PEKPSFVCSVSKEEFCGTVEKTKEYILDGDIFQAVISRQFTSSYEGSLLNAYRVLRTTNP SPYMVYMNIDQDEIISTSPETLVRLENGRLTTFPVAGSRPRGAGDEEDRRLEEELLADEK ELSEHNMLVDLGRNDIGRIAEFGSVEVTEYKMIHKYSKIMHICSQVEGNIKPGLDGCSAV EALLPAGTLSGAPKIRACEIIEKLESVPRGIYGGALGYLDFTGNLDTCIAIRMAVKQAGK VTVQAGGGIVADSVPELEYEESANKAKAVIQAILQAGEVDDR >gi|157101661|gb|DS480663.1| GENE 54 54630 - 55310 761 226 aa, chain - ## HITS:1 COG:SP0725 KEGG:ns NR:ns ## COG: SP0725 COG0352 # Protein_GI_number: 15900622 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Streptococcus pneumoniae TIGR4 # 4 213 2 208 210 167 44.0 2e-41 MRFDRRQLLIYAVTDKAWIGKLSLKEQVEEALKGGATMVQLREKELDKGNFEEVLRTAWD IRRITEDYNVPLMIDDNLELALACRADGLHVGQNDMEASEARRLLGPDRILGVTAKTVDQ ARKAQAAGADYLGSGAIFGTSTKADARPMTMETLNAICDCVDIPVVAIGGICLDNIGHLA GSHAAGAAIVSGIFGAPNIRETTEKLVKAMEEITRLSGQDLRAGSV >gi|157101661|gb|DS480663.1| GENE 55 55297 - 56136 840 279 aa, chain - ## HITS:1 COG:CAC3096 KEGG:ns NR:ns ## COG: CAC3096 COG2145 # Protein_GI_number: 15896347 # Func_class: H Coenzyme transport 
and metabolism # Function: Hydroxyethylthiazole kinase, sugar kinase family # Organism: Clostridium acetobutylicum # 1 277 1 271 273 199 40.0 4e-51 MQRQWMKDLRERSPRVHCITNYVTAGDVANMVLAAGGSPVMAQGLHEVEDVTSICSSLVL NLGTLEERTIPAMEAAGRKAAGLGLPVILDPVGVTASRFRRQTAIDIIRKTSPSVIRGNE AEIRALNQELRGHETGNTCGVDSETFGTIESRMETACSLRDLTGAVVVMTGVEDVVAGKE KKFIVRNGHPWMARITGSGCMLDGLMGAFCGLYGEPGLAGLTEEAAVTALSAHGLCGELA AAETAKRGGGTGTFRMYLLDKMSLLDDGTLERGKKIEIR >gi|157101661|gb|DS480663.1| GENE 56 56160 - 56696 607 178 aa, chain - ## HITS:1 COG:SA1136 KEGG:ns NR:ns ## COG: SA1136 COG4732 # Protein_GI_number: 15926877 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Staphylococcus aureus N315 # 6 160 4 159 163 110 42.0 1e-24 MKVNIKKLAVSAMLTAVAVSLSGFSIPIGASKCFPVQHLANVLAGVFLGPWYGVGMAFCT SFIRNLMGTGSLLAFPGSMVGAFLGGYLYQRFGRLTLAYVGEVFGTGILGGMLCYPVATL VMGKEAAIFAYVIPFLMSTMCGTVIAAFLIGVLYKSGAFQYMRRMLDLDVQTGKAMGR >gi|157101661|gb|DS480663.1| GENE 57 57152 - 57979 860 275 aa, chain - ## HITS:1 COG:lin0668 KEGG:ns NR:ns ## COG: lin0668 COG0561 # Protein_GI_number: 16799743 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 271 9 288 288 154 35.0 2e-37 MDGTLLRPDGHISERNVRALKGLKAMGAEFLICTGRSYIDALEPLKEAGLRAPVVCMNGA AVYDWDGRLMDEIRLSERQVRGILDCCQKEDIIFDFMTDRGSYTTAGEAQFRACFERNVL LPMAEFTYEGVRDRFSFVEEGQLFELGLAFYKISVIHESQEVLGRIKERLSGIAELAVAS SFATNLELTHSYAQKGRALAFYASSRGVRPDEIMAIGDSENDYSMLSMDIKYTVAMGNAM ESIKRIARCQTRSNIQDGVAYAIETLVLTREARAY >gi|157101661|gb|DS480663.1| GENE 58 58032 - 58772 789 246 aa, chain - ## HITS:1 COG:lin1330 KEGG:ns NR:ns ## COG: lin1330 COG0584 # Protein_GI_number: 16800398 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Listeria innocua # 3 244 2 233 235 172 39.0 8e-43 MKTQVWAHRGASAYAPENTLEAFRLAAEMGADGVELDVQLSRDGELVVAHDETIDRVSNG TGYIKDYTLAQLKKLSFNRLFPKFKDARIPTLKEVYELLKPAGLTVNVELKTGIILYPEI 
EEKVLALTASMGMEDRVIYSSFCHPSLVRLKELDSGLKTGLLYSDGWIGAADYGRHTVGA DALHPALYHMQDPDLIPAARRQGLAVNVWTVNEEPHMEMLVQQKVDAIITNRPDLCRRVV DRYTDL >gi|157101661|gb|DS480663.1| GENE 59 58801 - 60246 1643 481 aa, chain - ## HITS:1 COG:CAC0429 KEGG:ns NR:ns ## COG: CAC0429 COG1653 # Protein_GI_number: 15893720 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 58 481 20 447 447 422 51.0 1e-118 MKKRLIGYAAALLAVSMLAGCGAPAAGTGTAADPGSQAGASQPGTSQSGTSQADASQTGT SQTDASQSAGAEGASITFWHSMGGVNGEALEYLVNKFNSENTMGIQVEAQYQGEYDDAIN KLKSAQIGNMGADLVQIYDIGTRFMIDSGWVVPMQELIDADGWDKTQIEPNIAAYYTVGD TLYSMPFNSSTPLLYYNKDMFDKAGITQAPGSLGEIGRIADDLVNKGGAGEPISLSIYGW FFEQFTCKQGLNYVNNGNGREAAATAVEFDQNGAGAKTLAAWKELYDKGYAPNVGRGGDA GLADFSSGRSAMTLGSTASLKQILEDVNGKFEVGTAYFPMVSESDKGGVSIGGASLWALN NEDEAKKAATWEFVKFLVSPESQAYWNAQTGYFPVTVKAHEEPVFRENLEKYPQFGTAID QLHDSAPQSAGALLSVFPEARQVVETEIENMLNNGTSPEEAASEMAQQINKAIGEYNLLN E >gi|157101661|gb|DS480663.1| GENE 60 60341 - 61252 819 303 aa, chain - ## HITS:1 COG:CAC0428 KEGG:ns NR:ns ## COG: CAC0428 COG0395 # Protein_GI_number: 15893719 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Clostridium acetobutylicum # 32 303 2 273 273 271 49.0 8e-73 MEQTMTIKGRKRVSREEIIKVQEKANRKAMTVRRVRTGALIAANVLVSLFVLLPLLYAVS VAFMPPGELFTTEMNLVPRQPTWDNYRQALVKIPLVRFVCNSFITAGLITAGQIISCSLA AFSFSFLEFKGKNALFMLVMATMMIPGEATIISNYLTVSRWNWLDSYQVLIVPYLTSAMG IFLFRQFYKSFPISLYESAKIDGCGNLRFIFRILLPLTKSAIGAMAVYTFINAWNMYMWP LLVTGSNEMRTVQIGISMLNSVDAQSITLMIAGVVMIILPSISIFILGQKQLIRGMFSGA VKG >gi|157101661|gb|DS480663.1| GENE 61 61252 - 62118 954 288 aa, chain - ## HITS:1 COG:CAC0427 KEGG:ns NR:ns ## COG: CAC0427 COG1175 # Protein_GI_number: 15893718 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Clostridium acetobutylicum # 5 288 20 303 304 243 46.0 4e-64 
MKSDSLLKKAGPYGYILPALLVFAVFLFYPFFKTIYLSLYKTNKMGQAKLFVGLGNYMEL FQSESFYNSLAVTLLFVAIVVAGSMALGLLAAVLCNKAFPGIRIFSTAYALPMAIASSSA AMIFKIMLHPSIGIVNKLLGLDINWINDPKYALICVALLTAWLNSGINFLYFSAGLSNID ETIYERASVDGANGIQKFFRLTLPGLSPIMFYTIVVNIIQAFQSFGQVKILTQGGPGEST NLIVYSIYRDAFFNYRFGSAAAQSVILFAIIMVMTLMMFKAEKKGVSY >gi|157101661|gb|DS480663.1| GENE 62 62127 - 63266 1085 379 aa, chain - ## HITS:1 COG:CAC3237 KEGG:ns NR:ns ## COG: CAC3237 COG3839 # Protein_GI_number: 15896483 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Clostridium acetobutylicum # 1 377 1 369 369 408 53.0 1e-113 MADISLVNVCKTYKGGQNAVKDFSLDIRDRELIIFVGPSGCGKSTTLRMIAGLEDISSGE LWMDDCLMNMVEPKNRNLSMVFQNYALYPHMTVYENMAFGLRVRRTAPGEIDARVREAAR ILEISHLLDRRPAALSGGQKQRVAIGSVIVRKPKAYLMDEPLSNLDAKLRAQMRVEIAKL HKQLNATIIYVTHDQVEAMTLGTRIVVMNQGMIQQVAPPAELYRNPVNKFVAGFIGSPTM NFLDVDVLEEDGTVWLLGQGWRLPLEGYQARKLKEKGFAGKRVTLGIRPEDLHQEECSRK NGSDGLNGNDSGWIGVHISVREMLGSEVLLHGNTGDMGRLSARMPASCRVKPGEFLRLYV DMPQIKLFDIQSEENILFD >gi|157101661|gb|DS480663.1| GENE 63 63546 - 63923 431 125 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939610|ref|ZP_02086958.1| ## NR: gi|160939610|ref|ZP_02086958.1| hypothetical protein CLOBOL_04502 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04502 [Clostridium bolteae ATCC BAA-613] # 1 125 1 125 125 192 100.0 5e-48 MENQKMDLILQKLGGIEGGLASVREDVRSLQGDMQTIKGDVQTIKGDVQTLKEDVQTLKD RVTNIEITLENETNRNIQLIAEGHLNLDRKLNEALKELQPNTMYHLKVNHLDGEVTKMKR MLNMA >gi|157101661|gb|DS480663.1| GENE 64 64134 - 64532 431 132 aa, chain - ## HITS:1 COG:no KEGG:Closa_4143 NR:ns ## KEGG: Closa_4143 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 132 1 126 126 153 56.0 2e-36 MIFIGGISQGQKILDYVKTVICDRCGGYGRYQVYMTYMYFSFFFIPLFKWNKRFYVKMSC CDAVYELDPEMGKALLHGQQADIKSSDLTLVQEGRRRGYYDSGTYKTWKKCGNCGYETEE DFEYCPKCGRRL >gi|157101661|gb|DS480663.1| GENE 65 64636 - 66420 2019 594 aa, chain - ## HITS:1 COG:lin0483 
KEGG:ns NR:ns ## COG: lin0483 COG4716 # Protein_GI_number: 16799558 # Func_class: S Function unknown # Function: Myosin-crossreactive antigen # Organism: Listeria innocua # 9 593 11 565 566 789 66.0 0 MNKKVGKILTAAAAGAAVGVAVKKLQDSRKEKEDERIAAIEEAVMNNRDYGDRKAYLVGG GLATLAAAAYLIRDCRFPANQITVYEGMHILGGSNDGIGTPEQGFVCRGGRMLNEETYEN FWELFGSIPSLRQPGRSVTEEILEFDHAHPTCAKARLVDKDGNILDVKSMGFNQADRMAL LKLLMTDEKKLDNLTIQDWFKETPHIFETNFWYMWQTTFAFQKYSSLFEFRRYMNRMIFE FSRIETLEGVTRTPLNQYDSVIRPLETYLRKAGVNFRENCEVTDIDFADGPGITVKTLYL KKKVESTDDSGEESDAPAGPSYVTEEVQLNKSDICIMTNACMTDSATLGSLYKPAPAPEK KPISGELWAKVAGKKPGLGNPEPFFTKPEETNWLSFTVTCKGDDILKTIENFTGNVPGSG ALMTFKDSSWLMSSVVAAQPHFVNQPADQTIFWGYGLHTEAIGDYVKKPMKDCTGQELLN EYLHHLHIPEDRIAELMKTVINVIPCYMPYVDAQFEPRKMSDRPPVIPAGSTNFAMVSQF VEIPEDMVFTEEYSVRAARIAVYGLLDVKKKICPVTPYNRQPKILLKALKKSYL >gi|157101661|gb|DS480663.1| GENE 66 66558 - 67436 916 292 aa, chain - ## HITS:1 COG:BS_ksgA KEGG:ns NR:ns ## COG: BS_ksgA COG0030 # Protein_GI_number: 16077110 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Bacillus subtilis # 4 282 5 285 292 263 51.0 3e-70 MATLGNPKNTIEIIQKYQFAFQKKFGQNFLIDTHVLDKIISAAGITGDDCVLEIGPGIGT MTQYLAEHAGKVVAVEIDTNLLPILDETLKGYSNVTVINSDILKLDMNQLVDEYNDGRPI KVVANLPYYITTPIIMGLFESNVPIDNITVMVQKEVADRMQVGPGSKDYGALSLAVQYYA KPYIVANVPPNCFIPRPNVGSAVIRLTRYKEPPVQVDEPGVMFRLIRASFNQRRKTLQNG LNNSPEVPYTKEQIAVAIESLGVPASVRGEALTLEQFAGLSNYFTRTAGNGV >gi|157101661|gb|DS480663.1| GENE 67 67608 - 68372 884 254 aa, chain - ## HITS:1 COG:CAC2989 KEGG:ns NR:ns ## COG: CAC2989 COG0084 # Protein_GI_number: 15896241 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Clostridium acetobutylicum # 1 251 1 250 253 239 47.0 3e-63 MIFDTHAHYDDEAFDGDRPELLGRLKEAGVGAVMNVAASLGSCRSTLKLAEAYDWIYGAM GVHPSETGELDQEGLQWIKEQCGRRKIKAVGEIGLDYYWEEPAHDIQKKWFEAQMDLARQ VKLPVIIHSRDAAKDTLDMMKAAKAGDIGGVVHCFSYTREMAREYLNMGFFLGIGGVLTF 
NNARKLKEVVEYIPLESIVLETDCPYLAPVPNRGKRNSSLNLPYVVEAVSQLKGVDPETV VKVTWENGKRLYRL >gi|157101661|gb|DS480663.1| GENE 68 68377 - 69150 690 257 aa, chain - ## HITS:1 COG:no KEGG:Closa_0158 NR:ns ## KEGG: Closa_0158 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 9 253 4 249 251 173 38.0 5e-42 MGRGRSRGPYGGRRQGLTGNQLKILGIVAILIDNIGAVVIQGGILHGTDSALYHAVLLTP SGHYWVIAGQVCRYVGRLGFPILAYLTAEEFVRTRDRRWYAVRMLLFALLSEVPFDLAVY HTMYYPHYQNMMFTLFAGVLVMAVMESTQNPCLKAGALAAGCALSWVLQFDYNVVGVLFI AAMYWFRRSDTAQVVAGVAICAVESISCYCVSALSFAPIVLYNGRRGAFQLKYMFYVFYP VHFLVLYGVSMWIAKGV >gi|157101661|gb|DS480663.1| GENE 69 69306 - 70655 968 449 aa, chain - ## HITS:1 COG:MTH450 KEGG:ns NR:ns ## COG: MTH450 COG0438 # Protein_GI_number: 15678478 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanothermobacter thermautotrophicus # 139 347 153 350 411 74 27.0 3e-13 MRILSITAQKPHSTGSGVYLTGLVKGFAALGHEQAVVAGVYEEDEMHFPEGTRVYPVYYK TESLPFPIAGMSDEMPYESTIYSRMTEDMVDVFKRAFLEKTREAVECFRPDLILCHHLYL LTAVVREAFPQYKTAAICHGTGLRQVKKNSLERDYILAHIKELDKVFCLHREQREEISRI YGVPGERLEVIGSGFDDSIFRYMPVQKENGVKRLIYAGKLSEKKGVMSLLRSLTYLEQLQ RKDGEQQGQGIGQPAAPMKIEVWLAGGYGNQLEYETIKKLADQSPYPVKFLGRLDQPQLA ERMNQADVFVLPSFYEGLPLVVIEALACGLQVVCTDLPGVRPWLEENIGTCAVKFVPLPA IVNADEPVEQELPQFERRLAEAIAESLGADGSRQGVDSSRQGPDSRPPETPGGESLAAAG PSIASPLPDLSRISWTGISKKILESCNLI >gi|157101661|gb|DS480663.1| GENE 70 70826 - 72820 2389 664 aa, chain - ## HITS:1 COG:CAC2991_1 KEGG:ns NR:ns ## COG: CAC2991_1 COG0143 # Protein_GI_number: 15896243 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 6 525 4 520 536 636 58.0 0 MCQDCKKPYYITTAIAYASGKPHIGNTYEIILADSIARYKREQGYDVFFQTGTDEHGQKI EEKAEAAGVTPQEFVDKAAAEIKRIWDLMNTSYDKFIRTTDQDHEAQVQKIFKKLYDQGD IYKGYYEGLYCTPCESFFTESQLVDGKCPDCGREVKPAKEEAYFFRMSKYAPRLIEYINE HPEFIQPVSRKNEMMNNFLLPGLQDLCVSRTTFSWGIPVDFDPKHVTYVWLDALTNYITG 
IGYDCDGSSTDQFKKYWPADLHLIGKDIIRFHTIYWPIFLMALDVPLPKQVFGHPWLLQG DGKMSKSKGNVLYADTLVDFFGVDAVRYFVLHEMPFDNDGVISWELMVERMNSDLANILG NLVNRTISMSNKYFDGVVCDKGVCGEADEDLKKVVLEEVKKADAKMEQLRVADAMTEIFN IFRRCNKYIDETTPWTLAKDESQKDRLATVLYNLTEAIAIGASLLYSFMPETAEKILAQI HTGKRELFQMDAFGLYPNGQKVTDKPEILFARMDIKEVLEKVEAMHGAEAADQKQAGGQD GSSDNAEDSGIDLEAKPEITYDDFAKLQFQVGEIIKCEAVPKSKKLLCSQVKIGSQVRQI LSGIKAYYSPEEMVGKKVMVVTNLKPAKLAGMVSEGMILCAEDAEGSLALMTPEKSMPAG AEIC >gi|157101661|gb|DS480663.1| GENE 71 73095 - 74969 1995 624 aa, chain + ## HITS:1 COG:CC1056 KEGG:ns NR:ns ## COG: CC1056 COG4805 # Protein_GI_number: 16125308 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Caulobacter vibrioides # 114 612 90 601 613 109 23.0 1e-23 MRQRKTIFLQYLLSLLLVICLTAGCVAPARPQASAPLAKESAAQESTQSYEQYEKKTIDA QASFDQLTNDLFLDEVDNSLITLHYTLANPAAYGITDYDKTLGTVSLEDSKQALNDAKSL KVQLNDIDSRLLREDQLLTYNILSSYLNTMLDSEGLELYDQPLSSSLGIQAQLPILLSEY VFYSKQDVEDYLSLLSSIDTYYDSIIAFEKEKTDAGLGLCDTVIDRILNSCNAYLLDADH SFMAETFAERLEQVEGLTKQEKEDFIARNHTAIDEHFVPAYQKLIEGLTPLKGTGTNDKG LYYFPQGKKYYQYLVNAYTGTSYQDIPALKKAMSDQMMDDLTAMDELLTENPTLAKKLYS YSFALTDPNQILEDLRKQCAKDFPAIEDYTCSIKNVPAALEATLSPAFYLTVPIDRPQDN SIYINNGSTNTARNLYSTLAHEGYPGHMYQTLYFNKHNTCNLRKLLSFSSYSEGWATYVE YYSYSLNNGLDPDLGELLQHNAAFTLALYAILDVNIHYEGWDIKQVEDYLNHYFRISDSS VITTIYYDIAENPANYLEYYVGYLEILGMQREAKKTLGSRYTNMEFNRFLLDIGPAPFSV IKPYFAEWLARQDEAGRQNTESRN >gi|157101661|gb|DS480663.1| GENE 72 75106 - 76038 1006 310 aa, chain - ## HITS:1 COG:L2 KEGG:ns NR:ns ## COG: L2 COG2177 # Protein_GI_number: 15672955 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Lactococcus lactis # 3 303 2 308 311 91 24.0 2e-18 MGIRTLLYLFRLGLKNLWHHRVYTAASVITMSACIFLFGLFYLAAANVDSMVKKTEQEVY VAVFFDEGILPERVEEIGNLIQSRPEVLRTVYVSADEAWSGFKEKYYENAEALDGIFETD NPLASSGNYQVYIDRIGSQEGFVEYVQSIEGVRKTTHSADTVMALLKLRQGAARLIAGSA VLLVLISVLLIHNTLSVGIEAQKEKTRVMRLMGAREGFVKIPFMAEAVVMGAAGVVIPLI 
LLMWLYRWGLAFAVSGLNGLGSLRGLESGLLTEAQVFPGLIRASVLLGIFTGVAGGLSVM GKLKRRKKER >gi|157101661|gb|DS480663.1| GENE 73 76028 - 76714 284 228 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 215 1 217 245 114 29 3e-24 MIFFDRVTKIYGNGQKALDNVSLKVGRGEFVFLEGDSGAGKTTMLELILKETDPTQGNIQ VNGIELAGLRERQIYQYRRYIGMVFQDFRLFSDFTVYENVAFAQRVVEADYKTMKRRVME VLNQVGLGSKAKYYPDQLSGGEKQRTAIARAIVNRPVLLLADEPTGNLDQKNAEDIMRLL ERINGQGTTVLVVSHNQELVRSMHRRQVVLRRGQVARDLAREAAAYGY >gi|157101661|gb|DS480663.1| GENE 74 76760 - 77821 1177 353 aa, chain - ## HITS:1 COG:BS_flhB KEGG:ns NR:ns ## COG: BS_flhB COG1377 # Protein_GI_number: 16078701 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhB # Organism: Bacillus subtilis # 3 351 11 358 360 291 44.0 1e-78 MAAEEKTEKATPKRRQDERKKGNVFQSNDVAAVCSLLVLFHSLKALAPMMYESFGECMHL FFSYASSPDIFWQEGGKGLFMKAVLVFFKASLPLLFIGVLTAVIVTFFQTRMAFSWDVLK FKFDRISPLKGFKRMFSLRSVIELLKSLIKITVLGWVVWFFLKGRLTELTGLMEGTVASA LIYVGNTVVSLVDTVGIAFVFLGAVDYLYQWWEYEKNLRMSKQEIKEEYKQTEGDPQIKG KIREKQRQMASRRMMQNVPNADVVIRNPTHFAVALGYDSGKNRAPVVLAKGADHLALKIV EVAEASGVYIMEDRPLARGLYDTVEVDMEIPEEYYQTVAKILAFVYKLQKKDM >gi|157101661|gb|DS480663.1| GENE 75 77904 - 78683 995 259 aa, chain - ## HITS:1 COG:BH2440 KEGG:ns NR:ns ## COG: BH2440 COG1684 # Protein_GI_number: 15615003 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliR # Organism: Bacillus halodurans # 11 259 10 256 258 90 25.0 3e-18 MDSSVFEYFDIFLLVLARMGGLVFINPVFSRRGVPPMVRTGLVLALSLLIAPGVRSGAGQ VMAFSTFDMVESLIREVILGLVIGCVFHIFFYMLYVAGDFLDTVFGLAMGKVMDPAGGVQ TSILGQFVNVFFYLYFFATGCHLTMVRLFAYSYQVVPVGAGAILGGRILWYIITLFGSVF LMVIKLVLPFVAAEFILEMTMGVLMKLIPQIHVFVINIQCKILLGIMLMMLFAYPMGAFM DRYTEAMMTEAQKLLMMFG >gi|157101661|gb|DS480663.1| GENE 76 78699 - 78962 437 87 aa, chain - ## HITS:1 COG:CAC2149 KEGG:ns 
NR:ns ## COG: CAC2149 COG1987 # Protein_GI_number: 15895418 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliQ # Organism: Clostridium acetobutylicum # 1 85 1 85 89 57 45.0 5e-09 MSAEQVMEIMKEAMLVAFEIAGPLLIISIAVGLLVAIFQAATQIHEQTLTFVPKLIVIAL VLLALGSWMSKVMNEFVVELFAIMAAL >gi|157101661|gb|DS480663.1| GENE 77 78981 - 79757 979 258 aa, chain - ## HITS:1 COG:BS_fliP KEGG:ns NR:ns ## COG: BS_fliP COG1338 # Protein_GI_number: 16078698 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FliP # Organism: Bacillus subtilis # 59 257 29 221 221 199 55.0 3e-51 MSKILRKTGKIIGAGTFAWMWAPGAVTSYATDVSLTLQGSSGTGKPMDMLDIMFLFLFLA VVPSLIIMMTSFTRIVIVLSFLRNALGTQQSPPNQILVGLALFLTLFIMSPVIASVNTQA YQPYREGQITREQAFERAQEPMKEFMLKQTEKKSLDLFLSISKAQPPDISQGQEGYMKLG LTTIVPSFILSELNKAFTMGFLIFIPFLIIDLVVSSTLMSMGMVMLPPTMIALPFKIMMF VLVDGWSLVIKTLVQSFR >gi|157101661|gb|DS480663.1| GENE 78 79750 - 80127 409 125 aa, chain - ## HITS:1 COG:no KEGG:Teth514_1678 NR:ns ## KEGG: Teth514_1678 # Name: not_defined # Def: hypothetical protein # Organism: Thermoanaerobacter_X514 # Pathway: Flagellar assembly [PATH:tex02040] # 4 122 9 124 125 62 34.0 4e-09 MAFLKILFYLIVLIAVLVLAYYTTRMLGRGMGRTRGTGGMEILDQMALGRDSYLLVVKVQ ERIFLIGVSPGRISKVEELESYDKKGETEGAPDFVSLLSSHMKEHFQEKGQRNRQDKKAG EKNHE >gi|157101661|gb|DS480663.1| GENE 79 80129 - 81091 846 320 aa, chain - ## HITS:1 COG:BH2445_2 KEGG:ns NR:ns ## COG: BH2445_2 COG1886 # Protein_GI_number: 15615008 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar motor switch/type III secretory pathway protein # Organism: Bacillus halodurans # 247 320 33 106 112 74 50.0 2e-13 MDKEAIKELTEMEAVPICNILGSLLGRSVKMSVSKVEETDMGSLAEGLPHFNVAIESRKA VGGMSADQLYIFAKEDMIRLTNYIMGVPVDAQSPLDEIALSTLKEVASQCVGAAMDELND 
FLGRDMRDTITRISAFDNTERIQDIIRSWNAEDSVLLMGLHYVIDGVVESDAYIVAAQAL KQVLGISDMADMPCQETGGQTPGMPAAGMQTAEHKEAIAVQEVSFPEFKYTPIVYAKEQI GEEKKKLLDITLDVSVRLGSTVCSVKDILSLKEGQLLVLDKQAGSPADVVVNGELIGRGD VLVTGDRFGARIIEVVGKRE >gi|157101661|gb|DS480663.1| GENE 80 81116 - 82006 901 296 aa, chain - ## HITS:1 COG:jhp0393 KEGG:ns NR:ns ## COG: jhp0393 COG1868 # Protein_GI_number: 15611461 # Func_class: N Cell motility # Function: Flagellar motor switch protein # Organism: Helicobacter pylori J99 # 5 292 40 319 354 141 29.0 2e-33 MIKSYDFKSPKKFTKERMSTVENLYDSFARALAPYLTGLLQSFCEININGIVEKRYQEFS SSVQDHSLFGMITMSPANKDYNEAPLTMEVDTRLAFFMIERLLGGVGSEYDLSRDFTDIE KAILQYVLKKITEFLGEAWGQYLDIEASLTGLQTNPHLLQMSAQEDVVIQVELEVVIEKL RARINMVMPAPNVEELTSKFGYSFAVSHKKQDEDKQKSKRHYIEQHLLESEVELRAILHE FTLDAQDILQLAPGDVIPLNKKVDSAVSLYVEDVACFEARAGQTKMRKAVEISREL >gi|157101661|gb|DS480663.1| GENE 81 81999 - 82844 858 281 aa, chain - ## HITS:1 COG:aq_1002 KEGG:ns NR:ns ## COG: aq_1002 COG1360 # Protein_GI_number: 15606305 # Func_class: N Cell motility # Function: Flagellar motor protein # Organism: Aquifex aeolicus # 1 242 1 231 235 85 27.0 9e-17 MKRQKDEEGGQEWLNTYADMITLVLTFFVLLYSISNVNLTKLEEVASAMQRQLGIEAKTE IEDVPSDLKYPVVGEGAQAPEDAEAPLQQTQQQYQASAREMADMARDIQTYFDSENLDAV VTNSENAVYIRFKNDLLFAPDNANLTDASKSMLDAVGIMLKEKQDNILAIYINGHTAQAA NSLINDRLLSSERADNVAIYLEEQVGIPPKKLICRGYGKYYPIADNTTREGREQNRRVDM IILGTGYKPPDTVQGMETMDPLFPVTMPGDETMMQEGTASD >gi|157101661|gb|DS480663.1| GENE 82 82845 - 83279 494 144 aa, chain - ## HITS:1 COG:BH2460 KEGG:ns NR:ns ## COG: BH2460 COG1558 # Protein_GI_number: 15615023 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Bacillus halodurans # 1 144 1 152 152 130 49.0 7e-31 MGYLDSLNITGSALTAERFRTDIIMQNLANQNTTRTAEGGPYKRKQVVFRENTLNFKSEL GKAMTKAENGGVFVEEVVESQNPSVPVYDPDHPDADEDGYVMMPNVNSAEEMVDLMAASR AYEANVTALNVAKSMALKALEIGK >gi|157101661|gb|DS480663.1| GENE 83 83291 - 83647 392 118 aa, chain - ## HITS:1 COG:BH2461 KEGG:ns NR:ns ## COG: BH2461 COG1815 # Protein_GI_number: 15615024 # 
Func_class: N Cell motility # Function: Flagellar basal body protein # Organism: Bacillus halodurans # 1 113 1 128 132 72 39.0 2e-13 MPLFDDRALGALERGMDGMWLKQQIASHNIANVETPGYKAKKVEFRDVLYETAQGTERIS KPVVEEDGNTQARPDGNNVQVEKEELELWKAYTQYSALTGRVSGKLSTLRYVINNTGK >gi|157101661|gb|DS480663.1| GENE 84 83669 - 84049 532 126 aa, chain - ## HITS:1 COG:TP0943 KEGG:ns NR:ns ## COG: TP0943 COG1516 # Protein_GI_number: 15639928 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport; O Posttranslational modification, protein turnover, chaperones # Function: Flagellin-specific chaperone FliS # Organism: Treponema pallidum # 6 115 9 125 148 67 33.0 5e-12 MQNPYAKYKQQSIMTMTQGDMINLLFDETINRLNKGLAGLEAGDCEATNTHFKKAQAIIS HLASTLDPQYPVSKGLSSLYEYFNYQIIQANVRKNPETVNEILPMIEELKEAFAQADKQV RIGHTG >gi|157101661|gb|DS480663.1| GENE 85 84066 - 86972 2265 968 aa, chain - ## HITS:1 COG:BS_fliD KEGG:ns NR:ns ## COG: BS_fliD COG1345 # Protein_GI_number: 16080587 # Func_class: N Cell motility # Function: Flagellar capping protein # Organism: Bacillus subtilis # 654 964 186 494 498 150 33.0 8e-36 MSSINTVSGNGNKNYVSGLASGMDTESIVKSLLMGTQTKIDKQSGLKQQLEWKQDIYRDL ITKINTFSDKYFSYYGSGDTNLMSQSLFKTMTGISSSGAIKISSVSSNAVSSMNISEIQQ LATACSVKSSGNVTGEPKGQEADLNALAEGENSFSITLDGVNRTITFTKGATEQDTIDNI NQALYRNFGTSVGMSLTDSTAEDGSSIKVMKLVKLNENGKETSEAVDSSRRVIIESAGDN RDTIKHMGFSGGFSNKLDYGTSLKNMNFATPLEGNRYEFQINGVTIKGLTGDSTLSDVIS AINSSDAGVRVSYSSAADKFIMESSSTGEISNITMSQTYGNLLTTMFGVEATGVQSSLFS QNITSDNSVDFNTIVENLNSGRDQSLTFQVDGEDYTVKLEGKNGVGNYKNAQAVIDAINS NLSREFGSGKVNFSLEDDPDIAGNSFVAVNSSEHSILFTTANVNDGGADAFGFAENQTNV LGKDTTWDAAGMAGKIKVNGTEVYITGNTTLEKVAVRLQTAITNAVGGNPKVEYKDGRFN ITGLTGNVKIEGVEGDITDADGSIIKKNPVEIMFGKKSIQNVPNIKTVTSNAVTDTRILK GNLKLTLEDGTTEVIDVNNLSIPKNFADLAAEIQGIVGAGVSVSYTTDGKIQIQGTSKGF NIEGTDTEGKALMNSLFGQDGMTFSSQMTAQVTAGQNARLTVDGTVIERNSNTFELDGIT MELTSEYHDTGTPIRLTTSRDTDKIVDSLKSFVEDYNALVEELNEHLKETANYKKYAPLT DEQKKEMSDKEIELWEEKSKEGLLHNDANITSFLGDMRMVLYSSVEGAGLSLYDIGIETS 
DNWRDNGKLVIDEDVLKSMAATNPDAICKLFTDRDQGLGTGMQNALKDAANVSSGSPGTM VRYAGTKDVLTTSNTLYEEMKHISETLSNLNTKYQLEKTRYWNQFNAMEQAISNMNSQSS WLTQQFSS >gi|157101661|gb|DS480663.1| GENE 86 86984 - 87460 587 158 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939635|ref|ZP_02086983.1| ## NR: gi|160939635|ref|ZP_02086983.1| hypothetical protein CLOBOL_04527 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04527 [Clostridium bolteae ATCC BAA-613] # 1 158 1 158 158 261 100.0 2e-68 MEVLEQMRMLLREKAILFGQYEQETLRLDTDDLDAVDDIVDAVQARQALIDKINGLDRRI AAIGEASAYGARCFHIGKNQCDYAGLTEAEQAVFRVGQEVFAIMTRIRELEDGIPGKMAV IQEQLQEKIKKNNVNGKFTGYLKQMGQGSKGVLYDKRR >gi|157101661|gb|DS480663.1| GENE 87 87447 - 87725 372 92 aa, chain - ## HITS:1 COG:BH3617 KEGG:ns NR:ns ## COG: BH3617 COG1551 # Protein_GI_number: 15616179 # Func_class: T Signal transduction mechanisms # Function: Carbon storage regulator (could also regulate swarming and quorum sensing) # Organism: Bacillus halodurans # 1 63 1 64 75 57 53.0 5e-09 MLILSRKKGESIKIGDDIEIFVSEIKGDKVRLGISAPGDMKICRTELYLTMENNKEASDK VDLLKVFQLSRNLKALAQEDEKEVREEKNGSA >gi|157101661|gb|DS480663.1| GENE 88 87719 - 88165 385 148 aa, chain - ## HITS:1 COG:BH3618 KEGG:ns NR:ns ## COG: BH3618 COG1699 # Protein_GI_number: 15616180 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 3 141 4 142 151 73 32.0 1e-13 MKIETRDFGILEIEERNIITFKQPIFGFEEYTQFVLVNDSNMGNGICWLQSIEQKDICFI MLNPLEVKRDYAPVVMQDILILLQAVPEDDLDCWVLMVIGETFRNSTVNLKSPIIINHKT NLAVQVILDQDYPIRQSIFSQGERDGLC >gi|157101661|gb|DS480663.1| GENE 89 88215 - 88814 686 199 aa, chain - ## HITS:1 COG:no KEGG:Closa_3431 NR:ns ## KEGG: Closa_3431 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 198 4 197 202 96 35.0 7e-19 MNVLKITSTPIKLSMTSQRARLESRLPDPEVGIIRNPGKLNMKSDHIKVDIDTSRSRDSL GFKTARGLMKDAASAGLKAASDATAQYSRVGNQMMQIQDGVTVSGIIKNQMMAQTNVVTG IAFIPSVGPDISWEPADLSIDFAPAQLAFDPQVQEPAARYVPGELSVNVEQYPKVEIEYM 
GDPIYVPPSANPAGNEGPQ >gi|157101661|gb|DS480663.1| GENE 90 88836 - 89819 1125 327 aa, chain - ## HITS:1 COG:BS_flgL KEGG:ns NR:ns ## COG: BS_flgL COG1344 # Protein_GI_number: 16080593 # Func_class: N Cell motility # Function: Flagellin and related hook-associated proteins # Organism: Bacillus subtilis # 1 327 1 298 298 77 23.0 4e-14 MRITENMMTSTYNRNLQRNISDLASSNLKLSSKRQYNHVSEDPATASKAFAIRDQIARSE EHISVVKNARGELDTADSNIVTLNSILETVYEKATKAGGASSQDSMDAIAEELGGLKEEI LQTMNAKYGDKFLFSGSSNSEAPFTVDGDGNLLFNGKPVDAYDKNDPDTYFDENKPVYLD IGFGTYASGTNTAKTGIKISTSGVDVLGYGKDENGLPNNLYSLIGKVEEQLRGGDKEGAM DTLSQLKKKQSNISIATSELGTREKLLDRTEDRLETGLINLQKSQTDLESVKIETEAINN KSCETAWMITLQLGSSIIPPSIFDFMK >gi|157101661|gb|DS480663.1| GENE 91 89830 - 91284 1653 484 aa, chain - ## HITS:1 COG:BS_flgK KEGG:ns NR:ns ## COG: BS_flgK COG1256 # Protein_GI_number: 16080594 # Func_class: N Cell motility # Function: Flagellar hook-associated protein # Organism: Bacillus subtilis # 1 484 6 507 507 146 26.0 9e-35 MGLETAKRGIQVNQKAIDIVGNNISNSKTKGYTRQRLDTVSVHTYGSSQYNYSSIPLAGQ GVDARGVAQIRNPYLDAKFRQQYGDVGYYDQKAAIMDQIEGIISDPEVEGTGIKAALTTL SQALADFSQSPYQETNANIVMNAFKGVTQVLNEYDANLKTLMEQTKNDLSVAVEDINTKL DQLSELNSSIAHEIFSNNDYDGVNYGPNDLLDQRNVILDDLSRYGDVQVTDLEDGKIQVK INGKVVVDASGASYTNDKLNIGSDGVTLTWNSNNQNANLGAGAIRGFTDMLTGADPINAG IPYYQNKMDNFAQTLADVFNNKIHADDPDFPDKYKFLLQGGVDGKVTAGNISISDQWADD VSYIIRKQNPDGELDNQDILDMKAAFEQDFNFGGEYTGTFSEFITSVTNTLGSGIKTNSA RMSASLAIAESADSERMGVSGVSLNEEGISMMTYNKAYQALGRLMTTMDEQLDMIINKMG IVGR >gi|157101661|gb|DS480663.1| GENE 92 91327 - 91773 637 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939641|ref|ZP_02086989.1| ## NR: gi|160939641|ref|ZP_02086989.1| hypothetical protein CLOBOL_04533 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04533 [Clostridium bolteae ATCC BAA-613] # 1 148 1 148 148 234 100.0 2e-60 MEEIIDFLGEYAETLEAMEQKQIEKLGLLMTRELDKVEQTIMMQEAMDKKLENMEQRRRK LFVSYGIEGRTLKQIAEDAGPEQRKELMDLYRRMDGAIGNIQYYNQKAEALAKSELEQMG IDSRYVGNPTGIYGRPAFSKGSKLEKKA >gi|157101661|gb|DS480663.1| GENE 93 91786 - 
92088 281 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939642|ref|ZP_02086990.1| ## NR: gi|160939642|ref|ZP_02086990.1| hypothetical protein CLOBOL_04534 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04534 [Clostridium bolteae ATCC BAA-613] # 1 100 1 100 100 138 100.0 1e-31 MDIKVSRNYAAYQKAVSNAKHTEKSLSAPVQAAGKARGDAICISSEGVKMSGASSFSAAL SRSMEEGAPADRIAAIKQQVQEGTYQVPAEQIARRLMSGL >gi|157101661|gb|DS480663.1| GENE 94 92491 - 93009 421 172 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1341 NR:ns ## KEGG: Cphy_1341 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 87 172 2 88 93 84 45.0 1e-15 MSKSRTEIDKEIMYRKIMPSASKRDAKTAQADTGNSGAGQSDTALYDRVNGGEGAMGMPA LSAPSAAAMAKTLKRSPVTFPFKEENNMVLVNLMEELVINKLESTLDRFNCCKCDKCKKD IAALALNRLKPRYVVMKEGEQEKRRKAELENGSEVTGALVQAILVVKKAPWH >gi|157101661|gb|DS480663.1| GENE 95 93002 - 93772 715 256 aa, chain - ## HITS:1 COG:CC3753 KEGG:ns NR:ns ## COG: CC3753 COG1192 # Protein_GI_number: 16127983 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Caulobacter vibrioides # 5 250 9 258 267 184 44.0 2e-46 MSLTIVLSNQKGGVGKTTSAYVLSTALKEKGYKVLAVDMDPQGNLSFAMGADTESATIYD VLKGELKPRYAVQKSALVDIIPSNILLSGIELEFTGARREFLLKEALESLKSSYDYILID SPPALGVLTVNAFTASDYVLVPMLSDIFSLQGITQLDETICRVRNYCNPRIQILGVFLTK HNPRTNFSKEVEGALRMVAEDLDVPVLDTFIRDSVALREAQSLQRSVLEYAPECNAVQDY KKLIQELIQRGLKADV >gi|157101661|gb|DS480663.1| GENE 96 93992 - 95497 1060 501 aa, chain - ## HITS:1 COG:BH1477 KEGG:ns NR:ns ## COG: BH1477 COG1344 # Protein_GI_number: 15614040 # Func_class: N Cell motility # Function: Flagellin and related hook-associated proteins # Organism: Bacillus halodurans # 1 501 1 464 464 227 36.0 4e-59 MRIQHNIMALNSNRQLGVNNSAVSKSLEKLSSGFRINRAGDDAAGLAISEKMRAQIKGLE AATDNSQDAISLVQTAEGGLQEVHSMLNRMTELATKSANGTYTDDVDRKALQDEVSALKD EINRIADGTDFNGIKLLDGTMGVGTTGVSGKVDLTAGKVDAFNATISGAGENTSVDFKTA 
VGTATGVKAEWLGGKLTVTITGANANDKITQEQINQALASATGTPETAKNIKIELDGDIN INAKTTIDGTATLVTAKAVQATSADTGASGISISSTKAGANTHTLTTKTAGTIGAIVDVD GNVALNLTGTKSYTATEINKMLSDAGSDIRMDFEGSKTGTEISGAKSGVAANVLKLGENG KAGTGLAAGGGMKLQIGDTNDSYNQLELSIADMHVNALDLNSVNISTREGASAAMSKIKT AINTVSTSRGKLGAIQNRLEHTINNLGVTTENITSAESRIRDVDMAKEMMNYTKNSVLVQ SAQAMLAQANQQPQSVLQLLQ >gi|157101661|gb|DS480663.1| GENE 97 95705 - 96631 932 308 aa, chain - ## HITS:1 COG:BS_yodJ KEGG:ns NR:ns ## COG: BS_yodJ COG1876 # Protein_GI_number: 16079020 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus subtilis # 116 301 83 267 273 122 37.0 1e-27 MAQRRPRDQRVVRNRRVAIYTFLLILTLFYISAHRTVQKERELEALRAETAAEGPGGQVK ELAEGWPGGMPEEVVGGQTGGQPKEPVQEQPEGLPGARVYKERGEMDADGGLKILPEDMW CLILTNAEYPVPEDYEVELEAIPGTEQSVDKRIYEPLMTMIGDMKDQGLSPIVCSGYRTL DKQEKLFNRKVLSFVKAGHTKEESYNLARQTISIPGSGEHCLGLAVDFYTRRYHKLERAF EDTPESKWLVEHAQDYGFVMRYGENKTDITGIQYEPWHYRYVGVEAANYMKDNELSLEEF YIEQSLYG >gi|157101661|gb|DS480663.1| GENE 98 96676 - 97326 676 216 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939647|ref|ZP_02086995.1| ## NR: gi|160939647|ref|ZP_02086995.1| hypothetical protein CLOBOL_04539 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04539 [Clostridium bolteae ATCC BAA-613] # 1 216 1 216 216 418 100.0 1e-115 MVDLLDGYVGSSCVIKAKNNDLISMGVLHRIGKNFIDVGSSRNELPGIPYNLLIKLEIYN TQLGFKVLMGRVYLSSPKLVRIVELSEATNDERREYFRISTRDEGIIYNCIRGNGTLDMG EESEDYNGLKVRLVDISLGGLMFCTREEFKVNDRFNIVIPAMGDSMLFICEIRRRVDRPE GGYGYGCEFVEMATKQEDLLYRYILRRQGDQLRRIR >gi|157101661|gb|DS480663.1| GENE 99 97345 - 97737 346 130 aa, chain - ## HITS:1 COG:BH2473 KEGG:ns NR:ns ## COG: BH2473 COG2257 # Protein_GI_number: 15615036 # Func_class: S Function unknown # Function: Uncharacterized homolog of the cytoplasmic domain of flagellar protein FhlB # Organism: Bacillus halodurans # 5 78 6 79 92 60 41.0 7e-10 MSKYKKNKAVALRYNVDEDTSPVVIASGYGTVAEHIIDIAEKKGIPVFKDDSAASLLCML DVGSNIPVELYEVVAAIYCKLIETSAQIRGVEKSRSESASAGHKEEASQPGRLRRNLVSR 
HSKDDSQQTV >gi|157101661|gb|DS480663.1| GENE 100 97724 - 99124 1527 466 aa, chain - ## HITS:1 COG:no KEGG:Closa_3422 NR:ns ## KEGG: Closa_3422 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 466 1 473 473 104 21.0 8e-21 MADILRVTSPVINKNIIQPDKALKDPSVPFDMQEIRHITKNPSGSGLLGHHNVLEGEAGA ATLMNLLKDPAVTVNYLKNIYMLEEIINLLPVNNNTVTQEIRQLFDALLIDPDHIVDELV DQEYASTLFKGELFDLLRGILEKTPDLRMETAELLKAVNASISRQDVLDSVANSLEYLSE QLTGSRNLSGMFSQLAARVRESRGSLEFEELKPQVQQALKELEESVLYTPQAARVIPNIL YNMSRYQDNDAFLQEALMNLLIHVNDREEKAHIRELVKEYIAGFREPERQMNRSRIMDTL AKIIAKQDRETPMNSLNGEKIEKIITSLLSSPCNYTPLLHFVIPVDYQGMKAFSELWIDP KDEGRNGEPGSREGGHVHVLITFDVPGVGQMEAEFMMAGKELGFHLYCPESYIGIFESLV PEFRNIMEEHGYRASGIEVERMDHVRSLMEVFKNLPFKRTGVDVKI >gi|157101661|gb|DS480663.1| GENE 101 99124 - 100884 1565 586 aa, chain - ## HITS:1 COG:TP0710 KEGG:ns NR:ns ## COG: TP0710 COG1315 # Protein_GI_number: 15639697 # Func_class: L Replication, recombination and repair # Function: Predicted polymerase, most proteins contain PALM domain, HD hydrolase domain and Zn-ribbon domain # Organism: Treponema pallidum # 132 584 178 637 656 167 26.0 8e-41 MNEKSLIYSLFSKFLDEGQDEIPREEDGKESPEREQDKKSRPQGQPYDRADETEHADRNH DNRVETVEDWNETEQRRRQQMERELLGQEHPESATQESESYDEDRSVSGDSGAYEGSGAQ KDGIKKEEVTRDAWVEVLVSDDRMSVSMMVYGPSGGGSDITEEMVYDVLEQKGICFGIDQ KKISWVVSGQQYMQMVMIAAGEPARNGEDGHIKDYYPRKAQIKYASKGNGGIDFKSTNLI HNVKKGDVVCDITLPTEPGDGMDVFGQPVRGKKGTMPPVPQGRNIVYSEDRDRLLAACEG NLTFRSGRFHVENVFTVSGNVDNSVGNINFTGSVVIHGDVLEGYSVKAKGDITVMGIVEG ARLSAGGDILLHKGMRGMRTGVLEAEGDITAKFLEDCNIFSRNNIQAEYIINSEVSCGHD LTLIGKRGAFIGGSCSVYNCMNVKAVGAPSHIATSVTLGLTPQLMDEMEAVGKEMILLSR KLTELNKDISYLSGKLKEGTITPSQRERLSKLKLEAPINSLKEKKLKQQGAELSRKLREV GKSRLTAREVYPGTVINIGDCKMSISKKEDSCTFYYLDGEIKKGIR >gi|157101661|gb|DS480663.1| GENE 102 100936 - 101703 989 255 aa, chain - ## HITS:1 COG:TM1542 KEGG:ns NR:ns ## COG: TM1542 COG4786 # Protein_GI_number: 15644290 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: 
Thermotoga maritima # 1 255 1 260 261 122 34.0 1e-27 MNISYYTAVSAMNAFQKDLDVTSNNMANISTNGYKSMRSSFNDLLYTQMDMRPQAQVGHG VRNDGPGTTFQQGIFRKTDRELDFAISGNAFFAIQTSEDEEEPAYTRDGAFQISSTDDGN YLTTSDGSYVLDQDGEPIELEYKTSEDGGDNDAGELDLDGLVERIGLFVCENPEGLQHVG MNLFRTGETSGEWLSMDDLDDEQEKSKAMSHALEMSNCNMSTEMVNLMQTQRAFQLNSRI VTTADQMEEMINNLR >gi|157101661|gb|DS480663.1| GENE 103 101717 - 102457 935 246 aa, chain - ## HITS:1 COG:BB0775 KEGG:ns NR:ns ## COG: BB0775 COG4786 # Protein_GI_number: 15595120 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Borrelia burgdorferi # 4 245 22 299 300 118 30.0 9e-27 MFSGFYTAASGMLMNQRSLNMAANNIANVKTSGYKPKRLVKTTFDQQLVREMNGRTEGLG QGSTISVGTREMTTHGQGPIEDTGKTYDLAINGDGFFVIQGENGQYLTRNGHFTRDDEGN LMLPGVGQLMGDGGPVYVDEEGFRVGADGIIYDNEDGALDQIQIVVPDNYDSLEFYDNGT YGAGAGVELSQVYPTVYQGKLEQSGVNLNNEMTRAMEVQRAFQSCGKALTIIDQMNQKTA NEIGKL >gi|157101661|gb|DS480663.1| GENE 104 102462 - 103238 872 258 aa, chain - ## HITS:1 COG:BS_sigD KEGG:ns NR:ns ## COG: BS_sigD COG1191 # Protein_GI_number: 16078710 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus subtilis # 3 257 1 254 254 134 32.0 1e-31 MLMEQTDSEDMVRVLEEYRKNGSQELRNRLIMHYLPIVRSAAVQLRSMAGSLLEQEELID QGVLALIECLERYDPDRGARFETFAFMRVRGAMIDYIRSQDWVPHRARNFQKKVEEACSI LSHKQMREPDVNEVADYLDLPVEKVENHIKYMNHANLLSFESVLQDMTAIVAKGELESGD IEGKPEENLFYKELMGTLTSAIDGLGEKERLVITLYYYEELKYSEIAQVMGIGQSRVCQI HTRAIQKLKASLEDYMRG >gi|157101661|gb|DS480663.1| GENE 105 103245 - 105299 2246 684 aa, chain - ## HITS:1 COG:BH2438 KEGG:ns NR:ns ## COG: BH2438 COG1298 # Protein_GI_number: 15615001 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis pathway, component FlhA # Organism: Bacillus halodurans # 22 682 22 677 677 596 50.0 1e-170 MKRFNILVTAGIIGIIFLILIPLPTPILDFLFILNISLSLLILVMSMYIRETLEFSVFPS LLLITTLFRLGLNISSTRKILFDNGYAGQVVKTFGQFVIRGNAVIGFIIFLIIVLVQFIV 
ITKGSERVAEVAARFTLDAMPGKQMAIDADLNSGLIDEQQARERRSKIQREADFFGSMDG ATKFVKGDAIISIIITFINFVGGLIVGIMNSQGSIQDIMQIYTTSTIGDGLVSQIPALLI SVATGMVVTRSASENSLSEDLTKQFLAQPRVLMTAGGAAACLCLIPGFPVLQILIISAGM VGGGYYLYRHQKSLVEETAEELVETEVTSEASYYKNIENVYGLLNVEQIEMEFGYSLIPL ADEGNGGNFIDRVVMFRKQMALDMGFVIPSVRIKDSGQLNPNQYSILLKGEEVARGDILM DHYLALPPGTDADDVPGIDTIDPAFHIPAKWISGDRKIQAELAGYTLIDPTSVVITHLSE VVKEHLHELLNRQEVNNLLEALKKTNSSIVEDTIPSVISVGGLQKVLANLLREEIPIRDM ETIVETLGDYGSQVKDTDMLTEYVRQALKRTISRRFSEAGQMKVISLDDRIENMIMSSVK KMDTGSYLALEPTAIQSIVASATAEINKIKDLVNVPIVLTSPVVRIYFKKLIDQFYPNVA VVSFSEIDNNIQIQALGNISLSQR >gi|157101661|gb|DS480663.1| GENE 106 105302 - 106408 1103 368 aa, chain - ## HITS:1 COG:BH2445_2 KEGG:ns NR:ns ## COG: BH2445_2 COG1886 # Protein_GI_number: 15615008 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar motor switch/type III secretory pathway protein # Organism: Bacillus halodurans # 279 364 23 108 112 102 54.0 8e-22 MAKAEGGQDVTAMTDGQMFEELIKIHAQSAGEAISSILRADIEIGAPRLQEMTIKDVEYG ILEPAIFVKSCLTSQVAGNIVLVLRQRDMQVFLNELMGVDDLPDPDFVFDDVAMSAAGEL MNQMSHASASSMAEYLGDTMDLSDCQLVLSDGSQDLALVIGEPKDAKVMVVSYTMKIKDM VESEFIECISSTAMDSLTEEIQARRAACEKARQEEEEAQKRALIMEQNSVIGQDRSSGTG GGQSPADAGTQKSAYSGAVRGTTASGGRFMGMPGSTLQNQSMNGNLSLIMDVPLNVSVEI GKTRRRLKDVLNFSNGTVVELDKQSDAPVDIIVNGQLIARGEVVVIDDNFGVRISEIVNT RSIIGNGE >gi|157101661|gb|DS480663.1| GENE 107 106421 - 107305 1055 294 aa, chain - ## HITS:1 COG:BS_ytxD KEGG:ns NR:ns ## COG: BS_ytxD COG1291 # Protein_GI_number: 16080025 # Func_class: N Cell motility # Function: Flagellar motor component # Organism: Bacillus subtilis # 34 248 32 246 272 189 45.0 5e-48 MDVLSILGFILAVGLVMFGMTFDQESMRIVYHNLKAFLDIPSMAITIGGTLGVMMISFPA GAFKKIGSHLKIIVKPYQYSPTDSVDQIVNLATEARMKGLLSLEDKLNEIDEPFLHNSLM LVVDSVDSEKVRKAMETELEQLDERHALDRRFYEKAASFAPAFGMIGTLVGLILMLGNMS DVDALAKGMAVALITTLYGSLLANVVCLPMASKLKARHDEEFLCKQLVMEGVLAIQEGEN PKFIEEKLYKLLPASYKKPADKGEKQDGNNDGKKNSKKKGIGKRRRWNKKEIEE >gi|157101661|gb|DS480663.1| GENE 108 107344 - 107550 
382 68 aa, chain - ## HITS:1 COG:TM0675 KEGG:ns NR:ns ## COG: TM0675 COG1582 # Protein_GI_number: 15643439 # Func_class: N Cell motility # Function: Uncharacterized protein, possibly involved in motility # Organism: Thermotoga maritima # 1 65 1 65 67 62 52.0 3e-10 MIILKKLNGEEFVLNSELIETIMETPDTTILLTNGKHLIVRESKEEVVKKVVEFRRDAFR GILEQIKG >gi|157101661|gb|DS480663.1| GENE 109 107571 - 109256 1901 561 aa, chain - ## HITS:1 COG:BH2449 KEGG:ns NR:ns ## COG: BH2449 COG4786 # Protein_GI_number: 15615012 # Func_class: N Cell motility # Function: Flagellar basal body rod protein # Organism: Bacillus halodurans # 1 144 1 142 263 129 50.0 1e-29 MLRSLFSAVSGMQAHQTKLDVIGNNIANVNTYGFKSSRARFQDVYYQTLQSATGGDNNKG GTNASQVGYGAQLAGVDLNMGRSALQSTGRPMDVAITGEGFFQVMDSDGNIFYTRAGNLQ LDNNSGNLVDANGYTVLGVTGDPLGKAAGADKIHLSIPGKNNAQSANSQTINGIEYTITS QNTTSAANVTISFRLDDTLPDGADIVVKSGELKDSSITVSVNKNAIFTSLSDFNSKMNAA ITRANNGKAHPAGDFTITAVPADKIFKTSLTGEELLGENFGIKAGVQKFTDAADIKDGIF GGLKPSSTSTEPKFTASGKVSYEMTLQAATDTEPAAWIIKATVESDDGSKRYFTGKVSEN STSANSVWLKEDGTTPLNDQPGQYIELKHPGYAAINDACKDLATGDLTKTDVGTITAATN SNDLGLKSKNFTLTNGSEGGDVALSGVGVTILSNGIIEATHPDLGKIQIGRIDLVTFDNP YGLEGVGNSYFKTTANSGDPKLCQAGENGTGALKTSSLEMSNVDISTEFADMIVTQRGFQ ANSRIITVSDSILEELVNLKR >gi|157101661|gb|DS480663.1| GENE 110 109327 - 109722 390 131 aa, chain - ## HITS:1 COG:no KEGG:Amet_2717 NR:ns ## KEGG: Amet_2717 # Name: not_defined # Def: flagellar operon protein # Organism: A.metalliredigens # Pathway: not_defined # 35 131 33 129 129 89 40.0 5e-17 MDSMIYNKMLHTPIYTGTPGEPPKSRPREKTNDNAFKELLEQRLKEESQVSFSKHAMERV VERGVDVSSEKLDRLNEGVRMAEEKGLREPLILLGTTAFVVNVKNNKVVTVVNEDSLKGT VFTNIDGTVMI >gi|157101661|gb|DS480663.1| GENE 111 109805 - 110506 679 233 aa, chain - ## HITS:1 COG:no KEGG:TTE1435 NR:ns ## KEGG: TTE1435 # Name: flgD # Def: flagellar hook capping protein # Organism: T.tengcongensis # Pathway: Flagellar assembly [PATH:tte02040] # 93 201 23 131 131 73 36.0 9e-12 MPIQSNYSSPYTTGAGTASGPGAGSRSALSGTASSKARALSDDNGGDTEKSNSSDKTNGS 
EKTNSKEGTNSTDSTGSSGNVIDAVFGDGDDKKVSMDDFLTLMVAQLKNQDFMNPVDDTQ YVTQLAQISTMQQMEEMAYNAKSSYVASLVGKNVSAAKFTVSGELKKADGVVEKISLLDG KFVIYVDGEAFSMDEIMEIKDKPAAEGSGENNETPPDETPPDGTPPDKTPPDK >gi|157101661|gb|DS480663.1| GENE 112 110538 - 111962 1158 474 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939661|ref|ZP_02087009.1| ## NR: gi|160939661|ref|ZP_02087009.1| hypothetical protein CLOBOL_04553 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04553 [Clostridium bolteae ATCC BAA-613] # 1 474 1 474 474 722 100.0 0 MNAGTLTQALTASRYKANPTAGSREKQDDTVSFMDMFAQNLKDARKTGTKAADKKRDGAA DSKAQEMGDTGRNQASVDQGKKSESSEPIEDGTCRDDSKKAESGVENTDKDSDSGKRHNE GAAEEAINLQAMSLLMQEVPVPAQAGEAQDKQAAQDAAMLTEYVPEPGKTEQTLLSNTAL AGYEQTAALQTQELVQPPYAEAVREQMAQETAARSAAGQEPSGKTSDTVRPVQSDTMEAD MTAVVQGHKETRQSGEDMFFSADSRQNKTLNQLREQTGTADEEPKDANQVLEELKRNADA RGIDLTGKAVQNRMNYGLRPVSDAREQQIDTPVLEQLKTGVERAAQTGNRELTIHLKPEG LGDIVVHLTSSGDKTTVRIGVTNPETEKLVTSQMESLKDMLRPLNTEVQEVYHSSQNAMD FSGFSQHMQERRGQQSHTGYSFHAAEDSGSEEELLMEAKRMMAESRMSRLYAYI >gi|157101661|gb|DS480663.1| GENE 113 111995 - 112444 484 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939662|ref|ZP_02087010.1| ## NR: gi|160939662|ref|ZP_02087010.1| hypothetical protein CLOBOL_04554 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04554 [Clostridium bolteae ATCC BAA-613] # 1 149 1 149 149 182 100.0 6e-45 MKKFVFTLERMLAYQEQNLEKEKGILARIMAERDKLEEDRRDKELRIRHIQEDIGQRQIQ GTTVFILKGCYSVLEGVRIRLEEIETELTRTQARAEKQRRVVTEASQEVKKLEKLKEKQM EEYRHEEAKEQQDLIIEHVAGTFVRNGVS >gi|157101661|gb|DS480663.1| GENE 114 112455 - 113759 1550 434 aa, chain - ## HITS:1 COG:BH2455 KEGG:ns NR:ns ## COG: BH2455 COG1157 # Protein_GI_number: 15615018 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway ATPase # Organism: Bacillus halodurans # 1 432 1 432 437 479 56.0 1e-135 MELDQIERVIENAGLYTYLGKIDKIVGTTVESTGPACRLGDVCTIDVVEGGSPVMAEVVG 
FRENKVLLMPYGETEGIGYGSAVRNTGERLHIAVSNHLIGRTVDAMGNPIDDLGPVEDVS YYSITGRPSNPMTRPRIDTIIQMGVKAIDGLMTVGKGQRMGIFAGSGVGKSTLMGMIAKN VKADVNVIALVGERSREVVEFIHRDLGEEGLRRSVLVVATSNQSAMMRSKCTMTATTIAE YFKDQGMDVLLMMDSLTRFAMAQREVGLSIGEPPVARGYTPSIYSELPKLLERSGNFEEG SITGIYTVLVEGDDANEPISDTVRGIIDGHIMLSRKVAARNHYPAIDILSSVSRLMNDIV TPEHKAAAGKLRRLLSVYESNRDLVSIGAYKKGTNPELDDALERIDRIYGFLQQKTDESF SLDESVIQMIQATD >gi|157101661|gb|DS480663.1| GENE 115 113772 - 114578 845 268 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939664|ref|ZP_02087012.1| ## NR: gi|160939664|ref|ZP_02087012.1| hypothetical protein CLOBOL_04556 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04556 [Clostridium bolteae ATCC BAA-613] # 1 268 1 268 268 437 100.0 1e-121 MSKVLKQGQTFTGSARVRVYQPEPGMPGQAGTGESTDGTENQEADSAGAGGQDTAGKRYA LISEEKRKILEHARKQAEQSAARILEEAYAQRDNIVNTARGEAGRIHDQAKTEGYNQGLN QALEDISQDMAGIQAAVDRLGQGLEEFKRQMNERVAGLAFMMAEKILRKKVEYDEAELAD MVAGAVLSERDKENITVHIPEDAVGLVEALEKRLEPLRDKMGGVLRIKTESRPPGFVQVE TEEGIVDASLDVQLDNLKKQLMALNSRE >gi|157101661|gb|DS480663.1| GENE 116 114571 - 115578 1353 335 aa, chain - ## HITS:1 COG:BH2457 KEGG:ns NR:ns ## COG: BH2457 COG1536 # Protein_GI_number: 15615020 # Func_class: N Cell motility # Function: Flagellar motor switch protein # Organism: Bacillus halodurans # 10 335 10 335 335 266 45.0 5e-71 MPAEERRPAEKAAAVISILGSERASEVFRYLTENEVEQLSMEITRLPRLAEDELEDIAKD FYNCCVTEKVITEGGRDYAKEVLEKAFGQQQARNLMDRVSKALKTKAFSFIRKVDYKTLM AAIQNEHPQTLALILSYATPEQASKIIANLSQDTRVDVVERIANMDRALPMAIKIAEEVL EKKIGSSSSEETMEVGGLNYIADVMNHVDRSTERDIFDELNMKNPQLAEDVRKLMFVFED IAYLDPLSVQRFLREIDSKDLSVALKVANRDVTATIFSNMSNRMRESIQSDMEYLHNIRM SDVEEAQQRIVAVIRRLEEEGEIVISKDGKDEIIV >gi|157101661|gb|DS480663.1| GENE 117 115580 - 117187 1729 535 aa, chain - ## HITS:1 COG:BS_fliF KEGG:ns NR:ns ## COG: BS_fliF COG1766 # Protein_GI_number: 16078684 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Flagellar biosynthesis/type III secretory pathway lipoprotein # Organism: Bacillus 
subtilis # 58 375 53 375 536 105 28.0 2e-22 MGIKDKIKPGAGGNGVIGRILGSGRKKAMAAGAAAAVVVALAASIMLNRGNAGSYVTLFP GISREENNEILAVLGGRGVAAKRNDEGEVTVPEDQLGDIMIEMSELGYPKTALPFDIFSD NMGFTTTEFEKRQYLLMNLQDRLERTLKDMTGIKNAIVTLNVPDESAYVWEDEDSGSTGS VSLTLMPSYDLSPEKVSAIKNLVADSVPRLAPERVTVVNAETMQEMVSDDIGNTAYGLNR LDFEAKVEERLQNKIKNVLTLAYSPDKIRVSATVVIDYDKMITENLEYEPQENGQGVVDH YRESRGVAGGQIGAGGIAGEENNTDIPAYGTAGTGGTAQGTGDYYRDVDYLVSYIKKQIE KDNVKLQKATVAITVNDDNLTESKKQQLIDAASKAANIAPEDIVVSSFREVPKETAPAKP ALPVQAPVDTGIDYRTIAIAAGAGILLFLLILFLIFGRRRRRQMKEDQELFAPFEGEVPP SADMEERAEEAAVQNDLGRMQVNPEDPVEQVRSFAQMNPEIIASMISSWLKEDKK >gi|157101661|gb|DS480663.1| GENE 118 117217 - 117522 343 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939667|ref|ZP_02087015.1| ## NR: gi|160939667|ref|ZP_02087015.1| hypothetical protein CLOBOL_04559 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04559 [Clostridium bolteae ATCC BAA-613] # 1 101 1 101 101 161 100.0 1e-38 MFIIPVIRMEGVGDLTDLKKDSTVKAVPEKRFDSVLKEAVREADQAIKDTQIMDMKLASG QVDNLHTALIQAEKTAAAVEFTTQITTKAVNAYNQIMGMQV >gi|157101661|gb|DS480663.1| GENE 119 117967 - 119175 1315 402 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 255 398 508 651 693 116 41.0 9e-26 MYGWKKAAAALGIWAVLTAVFPAVSIAAVKTETRTPITSVSLKVRSDVQADYELNEATVY ATTDSSLYTIGAYKWAAKNNKDYWEPGDEPKVQIEIHARSGYYFNRTTGPSKFQIDGATY KSVKRTNNDETLLLTVLLTPASGTLEIPDTAEWMGYPPGKAAWAQVPYAGAYELKLYRNG QMVQGIPKVIATNYDFYPLMTTAGTYQFRVRAIPKDTEETAYITSSDWVYSDELDIDADE VCTLSGGAVGGPDKADALTPSQLGWIKDDEGWWYRNTDGTYPIGTWKNIDGRWYLFDFSG YMLTGWQQKDGNYYFLDMNGIMQTGWLQDSRKWYYLGNDGIMYKGWLTAGDGMYFFDQDG SMHTGWLLDGGNWYYMSPENGRMVKNAYIEGRYLDGSGIWHN >gi|157101661|gb|DS480663.1| GENE 120 119198 - 120418 1090 406 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only 
# Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 263 387 563 685 744 119 46.0 1e-26 MGIRNRFIYFAVRAGAGVCLVLGIWWSCAMTSFADEAIKSVTIAVTSSVEMGGSGSVSAT ADSTKYHVVSCEFINGKTSWKAGEIPRISIELAAEEDYYFGSISSSNAHIRGAEYVSSKK SDAKRSLTITAKLKGVKGTLDQVEEAYWEETPLGKARWSKVDGASSYEVKLFCDESMVYH VPRTNSVSYDFFPYMTEEGDYYFKVRAVAKTESESDYLKAGGWTESDNQQITRKDAQAAD KRAVSQGKGTAGVSVKDGTGPNAQAPGWAQDGNGWWYRNQDGSYPVGCWQEIGGKWYLFD INGYMLTGWQWKNDREYYLTSNGDMVTGWLQFNRIWYYLDPEKGKLTGQWLQQGNDWYYL NPDGSMAAGWLQWQGSWYYLDPSSGQMVRDKAVDQHYVNQAGIWVP >gi|157101661|gb|DS480663.1| GENE 121 120448 - 121635 1078 395 aa, chain - ## HITS:1 COG:Cgl0662 KEGG:ns NR:ns ## COG: Cgl0662 COG0791 # Protein_GI_number: 19551912 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Corynebacterium glutamicum # 281 369 174 260 286 87 49.0 6e-17 MAVLVLAVQAPLPGRASPVTRAEQERRRVEEEKSRAESEAQQLEEKLGQSRQKEQALEEE LVRLLALKDILESDMEELKTQIQVADRDYRQAEEKRQRQYDILKKRIQFLYEEGDITYLD ILLKAKNIGDVVSQTEYFRQLYEYDQEIIQRYEKLKQEAAGKKELLEEKQSQLEVMEEEN ESQQKELEGFIAARQKESSSFALELEAAQARAAQAAGEVIRKTEEIRILRARQEEDRIRQ EKERIRQEQESAGREPGSAGQASGGAGTAGGRPVKSIGGTEFGRNVADYALQFVGNPYVY GGTSLTGGTDCSGYTQSVYRHFGVSIPRTSGEQAGFGREIPYEEMEPGDLVCYSGHVAMY IGGGRIVHASSRKEGIKVSNDPAYRTIVSIRRPWQ >gi|157101661|gb|DS480663.1| GENE 122 121878 - 123395 990 505 aa, chain - ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 1 486 1 490 500 295 33.0 1e-79 MKKLFGVDFGTGGCKATVIDLEGNILASAFQEYPSEHKKPGWSEQDPALWIEAFINTVTA CRTQMNDGFDGMLGLAVTASTHNAVLLDGNGQVIRHCIMWNDQRSGEQCRYLKEKHGEEI FKIGMQMPTPTWTLPQLQWVRENEPENYKKIRRLMFTKDYIRSWVTGDFCTDIVDAQGSL LFDARKQCWSEELCNMVGLPVEVLPEIRKTKEIVGTVREEVSKQTGLPGGLPVIAGCSDT AAEDFSAGAVKNGQIIIKLATAGNVNLVTDQAVPHDKTFTYPYSVEGKWYTVTATNSCAS 
AYRWMRDSLYPAEKEQCDREGRDVYLLMDEQAASVELGCQGLLFHPYLLGERCPYFNPNA RGDFFGVSMVHKKGHFARALLEGVAFSLYDCLQVLKEFTDNMEDIVIIGGGAKSPLWCQI VSDIFGLEVKMPQNAESSFGGALLAGVGVGAFADELEAAGKCIRMKKTYYPDMENHKKYM ELFKIYKEIADMSGSVWKKLAEFSA >gi|157101661|gb|DS480663.1| GENE 123 123406 - 124266 664 286 aa, chain - ## HITS:1 COG:BH2675 KEGG:ns NR:ns ## COG: BH2675 COG1737 # Protein_GI_number: 15615238 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 13 285 14 286 287 208 39.0 9e-54 MGENISNGTLYDRVEEKRGSLSKKELETAEYMAAHQEQLIYSSITDLAERAGSSEATVTR VCGKLGYRGFQALKVGVARELTTPQEKIHENLEADSPAEVIIEKIFSSAIETLSMTQKSI NVKAVAASIDALCRARRIIIIGNGNSASIAADAQHKFLRLDLNAHAYTDDHMQMIAVSSM TDQDVLLAISYSGSSRNVVEAAHQAKEQGATVISLTNEGSSPVSKLADICLNTYSQETRY RTYAISSRMAELTIIDTIYTGVALRLGDKAIANFEALERALVVKKY >gi|157101661|gb|DS480663.1| GENE 124 124368 - 124583 153 71 aa, chain - ## HITS:1 COG:no KEGG:BP951000_1856 NR:ns ## KEGG: BP951000_1856 # Name: grdB # Def: glycine/sarcosine/betaine reductase selenoprotein GrdB # Organism: B.pilosicoli # Pathway: not_defined # 2 61 364 423 432 64 53.0 1e-09 MGIPVTQICAVVDIAKSVGSSRILRGFAITCPVGNPSLSQTDEKVSRRRYVEKALSMLTK PGSRGAVEDVL >gi|157101661|gb|DS480663.1| GENE 125 124632 - 125678 782 348 aa, chain - ## HITS:1 COG:no KEGG:Amet_3591 NR:ns ## KEGG: Amet_3591 # Name: not_defined # Def: selenoprotein B (EC:1.21.4.2) # Organism: A.metalliredigens # Pathway: not_defined # 5 348 3 349 436 318 45.0 2e-85 METKKWRVVCYVNQFFGQIGGEDMAHVGFSVEDKPVGPAVLFQNLMKNDCEVVGTIICGD NYFAENIERAVVEGVELVRSMKPDLFIAGPAYNAGRYGISCGNMAAAVGRELGIPTVTGM YPENPAADLFRKDTYIVKTEILSSTLRKAAPVMCSIGLRLLKGEHIAGAQEEGYIIRDII LNEEQPLNAAERSIEMLMKKMRGEAFKSELLPPAFDVVDPAPPVQDLKTARLALVTDGGL IPESNPDKLKPNGSTTVGCYNWDSLMSDKYFVIHSGYDGTWVMENPYRLLPVDVLREIIE EKKLGYLDPQVYVTCGNCASVAAAKVKGEQIAKGLLDKGITAAILTST >gi|157101661|gb|DS480663.1| GENE 126 125690 - 126994 1018 434 aa, chain - ## HITS:1 COG:no KEGG:TDE2120 NR:ns ## KEGG: TDE2120 # Name: grdE-2 # Def: glycine reductase complex proprotein GrdE2 # Organism: 
T.denticola # Pathway: not_defined # 1 390 1 393 429 182 32.0 2e-44 MKLTVETIRIQDLQFGEETCFRDGILYVNKEDILAFAASEPCFETLKIDISYPGDSTRII NVVDVVQPRCKVSGNIDWPGVLTEEYEIAGTGVTRAVEGAGIVLCQNDTYWSRKWGSFDM SGECAAMNPYAKMPELVIEPMAPENADFRDYREALRRIGYKTSVLLAKVTLGCKANTSET FDNETQYPDLPSVAYSYQIYSKQYDTQNYREPMVYGNAVPDSLPLVMQPTEILDGAISLC GGFRCITTYEIQNHPIIVNLMRRHGKDLNFAGVLITVTSVEAKHRYLVSKMAANLLKEVF HADGVIVTKGVGGASTLCVGAIASEAEKLGIKAVPVIQILNGKSNLGIECMISEKNVDSI VCSGTYYHNFTLPPVETLLGGPQDALYLSGDDGVIGGHKIATGDPAKGQVRSTYMKQVGL MSQIGYSHGMAVDY >gi|157101661|gb|DS480663.1| GENE 127 127013 - 128296 685 427 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 [Lentisphaera araneosa HTCC2155] # 1 424 4 428 432 268 32 1e-70 MITIAIITLLIFLVLGVPVSFAIGLSGLLAIVMGSDVPSFMAVQQAIRGMNSFSLMAGPL FILAGEILGAAKLSKRILDFCRACISQLRGGLGMVSVMANMIFAGISGSGAATMSAVSTL TVPELKKAGYDRAFIASMIAGSGALAPIIPPSTNMIVYASLTGFSVGKLFMGGIIPGLLI GFALMGMCYWYARKYNVDAGTSTLDLHVVWKSFKDAFFALITPIIIIAGVVSGIFTATES GIIACLYALICGLFIYRTLKLKDLLGIFKRATSSSAMLMMIMGISNIYSYIFARENLADT IKTFMLSVSTNPTVVVLIIIGIMLVIGCFMETLAATAVILPIVYPLVMSLGVDPLMFGVL FSIATVVGALTPPVGLYLFLSMSIAEAPFKEAIRYTVPVVLIILAVMVLMLIWPPLVTFV PHLLMGT >gi|157101661|gb|DS480663.1| GENE 128 128297 - 128788 181 163 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 11 148 8 146 164 74 32 3e-12 MKTLLKLDQAIETIQTWFCALMFIMILIFGTIQVFGRFVLSSAPPWTEEAMRFCGIYLTF IGSALTVRADAHVSVDILIGFLKDNKIRAGLFVVSRLICVVFLIAFFPGAITLVQKSGNS LGAAIRIPYSYIYAAAPAGIIMMLCAYASAIPKLARQYAKGEK >gi|157101661|gb|DS480663.1| GENE 129 128865 - 129947 758 360 aa, chain - ## HITS:1 COG:BH2673 KEGG:ns NR:ns ## COG: BH2673 COG1638 # Protein_GI_number: 15615236 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Bacillus halodurans # 3 342 2 328 348 145 29.0 9e-35 MKKRLIRVMSSLMFVCLLAGCGSKSSSGNAVSEAPSKNEGTESVAEVAASGNAEITLEFG 
HIQNPGHALAIAPEEFKALVEEKSGGRVAVNIYPSSQLGSAREMMEQVTMGTLDMTCCDT ADWAAALNIPELAVFNMPFLTKDLATQAELIRTIIPEEVPKMLEGSGTRLLMTYSNGIRQ PLLKNKPITCLEDIKGLKMRTAETPLYVNLWNALGASTVTSAWSEAYTLLQQGVADAVEA DVTGLVNQNLQEQGKYLSKIGHLGAIYCVFINEDKWNSIPEDLQGIISECALESQEKQLS SRQASDDAAEKVMADAGVEINEISQEERSRMKDACQSIYDEYINNYNLGDLIDRMLALNN >gi|157101661|gb|DS480663.1| GENE 130 129997 - 131103 354 368 aa, chain - ## HITS:1 COG:SMc02038 KEGG:ns NR:ns ## COG: SMc02038 COG0371 # Protein_GI_number: 15966303 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Sinorhizobium meliloti # 7 363 3 361 364 281 44.0 1e-75 MVNSTTRAFGSPFKYVQGPGEFNHLPEYSKPYGKACVIIDSFLYAELNQKLENTYGESEG EFFSICFKGECCDDEVQRICEIARSNGAALIVGVGGGKTLDTAKLCADSVDLPVIIVPTS ASTDAPVSEIAVLYTADGEYIGSRKMKHNAHLVLVDSEIIVKAPKRLFVAGMGDALATYP EALACQGSDSPNYIAGGRRRCKAGMAIAKSSWDILFSDGRSALTALERGVITESLENVIE ANTLLSGLGFLNTGLATAHGIHSGLTMIPSTHKYLHGEKVAFGIVCQMVMENTPAETVDK VMRFMVDIGLPVTLEQLDVEPIEENITAIAKKTADGPLVHQEPVFITEKIVYDAILMADE LGKKYRGM >gi|157101661|gb|DS480663.1| GENE 131 131478 - 132026 749 182 aa, chain - ## HITS:1 COG:CAC3214 KEGG:ns NR:ns ## COG: CAC3214 COG2002 # Protein_GI_number: 15896461 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Clostridium acetobutylicum # 1 181 1 182 183 194 57.0 6e-50 MKATGIVRRIDDLGRVVIPKEIRRTLRLREGTPLEIFTDREGEIILKKYSPMMELTSFAV QYAEAMAQSTGLFVCITDRDQVIAVAGGAKKDLLQRNISRQLEQAINERSTVIAAREDKA FIQLVDEELEGISAQVVAPIICEGDAIGAVALMSREPRAKFGDAEMKLASTAAGFLGRQM EG >gi|157101661|gb|DS480663.1| GENE 132 132134 - 133207 1075 357 aa, chain - ## HITS:1 COG:CAC1631 KEGG:ns NR:ns ## COG: CAC1631 COG0502 # Protein_GI_number: 15894909 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Clostridium acetobutylicum # 1 347 1 343 350 263 39.0 3e-70 MERVITLVDKLAKSHVLSREEFSFLLDNIGEEDSYLYEKAREAALANYGNKIYVRGLMEF SNYCKNDCYYCGIRRSNQKASRYRLSPEQIMECCSIGYGLGFRTFVLQGGEDPWFTDEKI 
AYLVERMKKQYPDCAVTLSVGERGYDTYKRWFDAGADRYLLRHETANPCHYASLHPPQMS SEYRKECLHNLKAIGYQTGCGIMAGSPYQTTAHIAEDLEFMHGLQPEMVGIGPFIPHHDT PFKDRPAGTLRQTLLLLAIVRLMLPDVLLPATTALGTIEPDGREQGVMAGANVVMPNLSP MEVRKKYMLYDNKISTGMEAAANIKELKRRMASIGYEVVTDRGDHKKIKEVPALKTS >gi|157101661|gb|DS480663.1| GENE 133 133207 - 134016 202 269 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 19 268 25 270 309 82 25 1e-14 MIFEVRDGGFGYSSHECILKNVNFSLDEPQVLSVLGANGAGKTTLLKCMLGLLEWNTGGT FVDGVNIKDIPYNQLWQKIGYVPQAKASAFAYSTQEMVILGRNAHLGTFRQPKKRDWEIA EACMEEIGIGYLKGKLCSHISGGELQMVLIARALAANPSILVLDEPESNLDFRNQLIVLE AIERLCREKHISAVVNTHYPEHALSISQKALLLTAHGTTLFGPTDEILSEEHLNEAFGVE VRLYPLTLRNRTYTCVLPLGLKERMERGA >gi|157101661|gb|DS480663.1| GENE 134 134013 - 135032 1062 339 aa, chain - ## HITS:1 COG:FN0306 KEGG:ns NR:ns ## COG: FN0306 COG0609 # Protein_GI_number: 19703651 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Fusobacterium nucleatum # 17 333 14 329 333 248 44.0 1e-65 MNGSKRSWNRYWTLGSLAVIFAAAAAAMAIGRFTIAPKELAAVLIPGSFPDVQVSGQVRT VISNIRFPRVVLALLSGAGLAAAGAAFQGLFANPLATPDTLGAANGASFGAVLGILLGLP AFGIQTLAMITGVAAVLMAWSVAQVRGTIPPLMVILAGMVVSSLFSALVSLVKYAADPQD VLPSITYWLMGSMASTTRTTLIMGAPFILAGSLLLFLLRWKLNTMSLPEDEAKSLGIPVK RIRSLVILGAAMVTAAVVSMCGQIGWIGLLIPHMVRMIFGNDNRSVVPASMASGALFMLI IDTAARSVMATEIPVSILTSLIGAPCFILLLRKTGGIRL >gi|157101661|gb|DS480663.1| GENE 135 135078 - 136310 1278 410 aa, chain - ## HITS:1 COG:FN0305 KEGG:ns NR:ns ## COG: FN0305 COG0614 # Protein_GI_number: 19703650 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Fusobacterium nucleatum # 72 394 10 322 336 182 33.0 1e-45 MKKLTAIILSVCLGAAGLTGCQAGASETPAASSRAADTGRQQAEKEPKEETKSREDSAGE PDASQSGGTVIITDHAGSQVEIPREVNRVVVTDILPLPSIITVFLGSGEKIVGMAPASMS AARTGLLGELFPDILNASTDFMNGSDINMEELMKLEPDVVFYNAGSKEIGEALRSAGFAA 
VAVSVNKWNYDSIETYDQWISLLSQIFPEKGDTAEQVSRYSQNIYEDIQNKVSGLKDEEK KDILFLFNYSDTTMVTSGKKFFGQFWCDAVGGRNVAEGVAAENSNAVINMEQVYEWNPDI IFITNFTPVQPEDLYENAVGQDDWSPVKAVQDKNVYKLPLGTYRSYTPSTDTPMTLLWMA KQVYPDLFSDVDMTQEVKDYYKELYGVDLSGEQVQRMYHPGREAAAGYKK >gi|157101661|gb|DS480663.1| GENE 136 136383 - 137618 939 411 aa, chain - ## HITS:1 COG:FN0304 KEGG:ns NR:ns ## COG: FN0304 COG2710 # Protein_GI_number: 19703649 # Func_class: C Energy production and conversion # Function: Nitrogenase molybdenum-iron protein, alpha and beta chains # Organism: Fusobacterium nucleatum # 12 396 33 395 415 99 24.0 9e-21 MGVLLLKRTNRILPVYAADISGICSALYELGGMTVMHDASGCNSTYTTHDEPRWYSMDSM VYISALSELEAVMGDDEKLVRDIVDAALNLCPAFICIGGTPIPMMMGTDFKGIARMIENK TGIPTFGIATNGMHSYVRGAGEALRWIVKRFCPPGIKADRPAALKVNILGVTPLDFSLTG NTARLKDFLRSHGMETVGCWAMESGLDELRLAGLADINLVVSAVGLPAAAELKKMYGTPW VVGTPVGRRTSGRLAECIRQAARTGENQSLFDGAGYVASWDADACIRPDLGLAGLPEDLS GRKVLIIGEQVIAGSLRCCLREEWGADQVTVLCPTEGEKALALEGDIIDWDEDEVIRAIA ESTVVVADPLYRQVCPAHRNIVFVDFPHEACSGRMYHSIMRPFVCAPDDSV >gi|157101661|gb|DS480663.1| GENE 137 137600 - 138949 1032 449 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c23040 NR:ns ## KEGG: CLJU_c23040 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 25 449 26 441 442 419 48.0 1e-115 MLKRLHDTAGQDTALALSQVDSLAPFTYGLEYSAPARGGWTIVHIGMLLPESHQVFVCAQ SCLRGVVLSAAELGDSSRFSTISVDEDSVLEGNCEEIIIEGVTHILEGLPALPKALLLFT SCIHHFLGTDMKIVFGELNTRFPQVGFVQCWMNPIMRKTKMPPDPFMRKQLYSLLKPADM FTKQVNVIGNNLALRKSSELYNILEKNGFNIKDICTCRDFREYETMASSKLNLVTNPLGR PAAKELCGRLGQEWVYLPVSYDWEQIREHWKCLSDLLELDFDSIWKEVEMDQLWQKAETA LETLAGELAGWSVAIDYTASSRPFGMAKLLVKCGFRVERIYADTISPEEEDTFAWLKKQC PRMLLCPTVYHKMAVLPRTWCREAGGKVLAIGQKAAYFTGTSHFVNMVEDGGLYGYGGIL ELAAMMEEAAGEEKDTEKLIQVKGWGCCC >gi|157101661|gb|DS480663.1| GENE 138 138942 - 139721 930 259 aa, chain - ## HITS:1 COG:MJ0879 KEGG:ns NR:ns ## COG: MJ0879 COG1348 # Protein_GI_number: 15669069 # Func_class: P Inorganic ion transport and metabolism # Function: Nitrogenase subunit NifH (ATPase) # 
Organism: Methanococcus jannaschii # 1 245 1 245 279 238 51.0 1e-62 MLKIAVYGKGGIGKSTISSNLSVALSERGYKVMQIGCDPKADSTIQLHQGHTVTSILDII RARGDKAGLDELVTEGSGGVLCAEAGGPTPGMGCAGRGIITAFEALEERNAFEVYKPDVV IYDVLGDVVCGGFAMPIREGYADKVFIVTSGENMAIYAAANIAAAVKSFEARGYASLGGI LLNRRGVKREQEKVEELAEDMETEIIGSLDYSPLVGEAEELGKTVMEAYPDSDMAGQYRK MADAVLAACGEEEVKKAYA >gi|157101661|gb|DS480663.1| GENE 139 140154 - 140405 399 83 aa, chain + ## HITS:1 COG:no KEGG:Shel_16520 NR:ns ## KEGG: Shel_16520 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 1 76 1 76 83 68 42.0 1e-10 MEDRLAVISCIISNPDSVATVNEKFHLQSGLIIARLGVPCRDYGVSVISVVLHGTQNQIS SFAGDLGKVDGVKVKSIQVPVNK >gi|157101661|gb|DS480663.1| GENE 140 140423 - 140740 485 105 aa, chain + ## HITS:1 COG:no KEGG:ELI_3158 NR:ns ## KEGG: ELI_3158 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 102 1 102 122 109 59.0 3e-23 MEHLEWLLSLVANTAIIIFEFIGVGIIIYSGLTSFLKFLRRSPDTKIYLAKGLAMGLEFK MGSEILRTVVVREWKEIGIVAGIIALRAALTFLIHWEIKEEERAE >gi|157101661|gb|DS480663.1| GENE 141 140825 - 144427 3846 1200 aa, chain - ## HITS:1 COG:CAC3216 KEGG:ns NR:ns ## COG: CAC3216 COG1197 # Protein_GI_number: 15896463 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Clostridium acetobutylicum # 3 1195 8 1167 1171 977 43.0 0 MENPLLELQEYDNLVQALKSGKGPLQVTGTLDSQKVHLMYELGEASAFSWKLVVTYDDTR AKEIYDDLRSFTSRVWLYPAKDLLFYSADIHGNLMARQRIAVLRRLMEDREGVVVTTMDG LMDHLLPLKYLREQSITVESGQVIDLDVWKERLIAMGYERVAQVDGMGQFSIRGGIVDIF PLTEEVPVRIELWDDEVDSIRTFDLESQRSVEQLENITIYPAAEVVLSADQLAAGIRRLE KEEKTYEKALREQHKPEEAHRIHTIIGELRSGLDEGWRIGGLDAYIRYFCPDTVSFLEYF PQGESVIFLDEPARLKEKGETVELEFRESMVHRLEKGYLLPGQTELLYPAAEILARMQKP YAVMLTGLDQKLPGMKVNQKFSIDVKNVNSYQNSFEILIKDLTRWKKEGYRVILLSASRT RASRLASDLREYDLRAYCPDGREGESGNAGGEGAGSADTGNPGAVNTSVRKVRPGEILVT YGNLHRGFEYPLLKFVFITEGDMFGVEKKRKRRKKTNYQGKAIQSFTELSVGDYVVHEEH 
GLGIYKGIEKVERDKVIKDYIKIEYGDGGNLYLPATRLESIQKYAGAEAKKPKLNKLGGT EWNKTKTRVRGAVQEIARDLVKLYAARQEKAGFQYGTDTVWQREFEELFPYDETDDQMDA IDAVKKDMESRRIMDRLICGDVGYGKTEVALRAAFKAVQDSKQVVYLVPTTILAQQHYNT FVQRMKDFPVRVDMLSRFCTPARQKRTLEDLRKGMVDIVIGTHRVLSKDMQFKDLGLLII DEEQRFGVAHKEKIKHLKENVDVLTLTATPIPRTLHMSLAGIRDMSVLEEPPVDRTPIQT YVMEYNEEMVREAINRELARNGQVYYVYNRVTDIDEVAGRVQALVPDAVVTFAHGQMREH ELERIMADFINGEIDVLVSTTIIETGLDISNANTMIIHDADRMGLSQLYQLRGRVGRSNR TSYAFLMYKRDKLLREEAEKRLQAIREFTELGSGIKIAMRDLEIRGAGNVLGAEQHGHME AVGYDLYCKMLNQAVLALKGETLEEDSYDTVVECDIDAYIPGRYIKNEYQKLDIYKRISA IETEEEYMDMQDELMDRFGDIPRSVENLLKIASIRALAHQAYVTEVVINRQEVRLTMYQK AKLQVDKIPDMVRSYKGDLKLVPGDVPSFHYIDRRNKNQDSLEMMGKAEEILKSMCGIRI >gi|157101661|gb|DS480663.1| GENE 142 144447 - 145016 720 189 aa, chain - ## HITS:1 COG:CAC3217 KEGG:ns NR:ns ## COG: CAC3217 COG0193 # Protein_GI_number: 15896464 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Clostridium acetobutylicum # 1 188 1 186 187 215 55.0 3e-56 MYIIAGLGNPTREYEKTRHNVGFEVIDVLADRLGTTVEEKKFKGCYGRGIIGGQKVLLLK PQTFMNLSGESVRAAADFYKVDPEHIIIVYDDISLDVGQLRIRKKGSAGGHNGIKNIISH LGTQEFPRIKVGVGDKPKKMDLADYVLSRFSKEDRAVMEDAFREAAGAVEMMITQGADAA MNQFNGHKG >gi|157101661|gb|DS480663.1| GENE 143 145317 - 146030 998 237 aa, chain - ## HITS:1 COG:BS_pbpF KEGG:ns NR:ns ## COG: BS_pbpF COG0744 # Protein_GI_number: 16078075 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Bacillus subtilis # 44 228 57 242 714 165 42.0 7e-41 MKVRTGIKWVAAIMLMVVLCFSVTVAVKGYGMYREAVDSMSLEDKVASIRAKENYTTFDQ LPEIYVDAVLSVEDHRFYNHPGIDVIAIGRAVFNDIKAGAFVEGGSTVTQQLAKNMYFSQ EKELVRKVAEVLVAFDLERNYSKNEILELYVNTIYYGNGYYCVRDASNGYFGKEPEDMTD YESTLLAGIPNAPSCYAPTVSQELAARRQQQVLDRLVKCGYFTEERALETMAVGSSR >gi|157101661|gb|DS480663.1| GENE 144 146213 - 146305 71 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MWCYARAESRKVLMEKASVKVPVKAWRKAR >gi|157101661|gb|DS480663.1| GENE 145 146469 - 147881 1635 470 aa, chain 
+ ## HITS:1 COG:BH1470 KEGG:ns NR:ns ## COG: BH1470 COG2509 # Protein_GI_number: 15614033 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Bacillus halodurans # 7 466 5 477 480 410 46.0 1e-114 MNPTTNYDVIIIGAGPSGIFCAYELIQKKPGMKILMVEKGRPIEKRVCPKRTTKACVGCK PCSITTGFAGAGAFSDGKLSLSPDVGGNLPEILGYDKTVELLKESDNIYLKFGADTNVYG MDKQKEIEEIRRKAIGANLKLIECPIRHLGTEEGYKIYTRLQEHLLAQGITMEFNTMVRD IIIEGDTVKGIITDKDETYYAPEVVSAIGREGSDWFSHICGNHGIQTQVGTVDIGVRVEV RDEVMKFLNENLYEAKLVYYTPTFDDKVRTFCTNPSGEVATEYYENGLAVVNGHAYKSQE YKTNNTNFALLVSKNFTKPFNEPIEYGKHIAQLSNMLCGGRIMVQTFGDFQRGRRTTEER LCRNNLIPTLKDAVPGDLSLVFPHRIMTDIKEMLLALDKVTPGIASDETLLYGVEVKFYS NKVVVNADFETSIRGLRAIGDGASVTRGLQQASANGISVARSILKQMETL >gi|157101661|gb|DS480663.1| GENE 146 147949 - 148863 622 304 aa, chain - ## HITS:1 COG:no KEGG:Tthe_0908 NR:ns ## KEGG: Tthe_0908 # Name: not_defined # Def: hypothetical protein # Organism: T.thermosaccharolyticum # Pathway: not_defined # 8 256 4 274 316 101 29.0 5e-20 MDKKRLQWHPGFFAVLQIELEEERRFLRFYAEYNFTRKPLQIDVLVVRKETGRVIQKNIG RIFRQYNVVEYKGIKDYISINDFYKSIGYACLLQSNTERVQEILPSQVTVTLAGEHYPRS LHVFLEKAYGVHMEEEAPGIYYIKGLLFPLQILVIRELSKEDNIWLSRLRSGLKPDEDIE VLMKEYKGKEQNPLYETAMDLILRANWETCQEVEKMCDALRELFADELEERETIGLEKGL EQGKMAKLITQVMRKREKGQSAARIAEDLMEPAEVVQRLYDLIGLHPDSDAEHILAYMES WDKV Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:24:46 2011 Seq name: gi|157101660|gb|DS480664.1| Clostridium bolteae ATCC BAA-613 Scfld_02_5 genomic scaffold, whole genome shotgun sequence Length of sequence - 2588 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 325 - 384 7.5 1 1 Tu 1 . + CDS 461 - 1882 458 ## BSn5_08155 levansucrase + Term 1887 - 1939 10.7 - Term 1876 - 1926 10.3 2 2 Tu 1 . 
- CDS 2020 - 2286 156 ## COG1662 Transposase and inactivated derivatives, IS1 family - Prom 2432 - 2491 2.5 Predicted protein(s) >gi|157101660|gb|DS480664.1| GENE 1 461 - 1882 458 473 aa, chain + ## HITS:1 COG:no KEGG:BSn5_08155 NR:ns ## KEGG: BSn5_08155 # Name: not_defined # Def: levansucrase # Organism: B.subtilis_BSn5 # Pathway: Starch and sucrose metabolism [PATH:bsn00500]; Metabolic pathways [PATH:bsn01100]; Two-component system [PATH:bsn02020] # 1 473 1 473 473 883 100.0 0 MNIKKFAKQATVLTFTTALLAGGATQAFAKETNQKPYKETYGISHITRHDMLQIPEQQKN EKYQVPEFDSSTIKNISSAKGLDVWDSWPLQNADGTVANYHGYHIVFALAGDPKNADDTS IYMFYQKVGETSIDSWKNAGRVFKDSDKFDANDSILKDQTQEWSGSATFTSDGKIRLFYT DFSGKHYGKQTLTTAQVNVSASDSSLNINGVEDYKSIFDGDGKTYQNVQQFIDEGNYSSG DNHTLRDPHYVEDKGHKYLVFEANTGTEDGYQGEESLFNKAYYGKSTSFFRQESQKLLQS DKKRTAELANGALGMIELNDDYTLKKVMKPLIASNTVTDEIERANVFKMNGKWYLFTDSR GSKMTIDGITSNDIYMLGYVSNSLTGPYKPLNKTGLVLKMDLDPNDVTFTYSHFAVPQAK GNNVVITSYMTNRGFYADKQSTFAPSFLLNIKGKKTSVVKDSILEQGQLTVNK >gi|157101660|gb|DS480664.1| GENE 2 2020 - 2286 156 88 aa, chain - ## HITS:1 COG:insB_g3 KEGG:ns NR:ns ## COG: insB_g3 COG1662 # Protein_GI_number: 16128015 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, IS1 family # Organism: Escherichia coli K12 # 1 88 80 167 167 171 100.0 3e-43 MATLGRLMSLLSPFDVVIWMTDGWPLYESRLKGKLHVISKRYTQRIERHNLNLRQHLARL GRKSLSFSKSVELHDKVIGHYLNIKHYQ Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:24:53 2011 Seq name: gi|157101659|gb|DS480665.1| Clostridium bolteae ATCC BAA-613 Scfld_02_6 genomic scaffold, whole genome shotgun sequence Length of sequence - 3120 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 40 - 192 74 ## gi|160938565|ref|ZP_02085918.1| hypothetical protein CLOBOL_03461 2 1 Op 2 . 
+ CDS 179 - 691 555 ## gi|160938564|ref|ZP_02085917.1| hypothetical protein CLOBOL_03460 3 1 Op 3 . + CDS 707 - 865 73 ## gi|160938563|ref|ZP_02085916.1| hypothetical protein CLOBOL_03459 + Prom 954 - 1013 80.4 4 2 Tu 1 . + CDS 1218 - 1577 202 ## gi|160942217|ref|ZP_02089527.1| hypothetical protein CLOBOL_07104 + Term 1794 - 1828 -0.8 - Term 1326 - 1367 0.5 5 3 Tu 1 . - CDS 1602 - 2054 -15 ## gi|160942218|ref|ZP_02089528.1| hypothetical protein CLOBOL_07105 - Prom 2164 - 2223 80.4 6 4 Tu 1 . - CDS 2420 - 3118 329 ## Closa_0759 hypothetical protein Predicted protein(s) >gi|157101659|gb|DS480665.1| GENE 1 40 - 192 74 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938565|ref|ZP_02085918.1| ## NR: gi|160938565|ref|ZP_02085918.1| hypothetical protein CLOBOL_03461 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03461 [Clostridium bolteae ATCC BAA-613] # 1 50 189 238 238 96 94.0 7e-19 MNYYRIGMRKGGTPGVKEIVNYTVNNGEESITASYDRFFALKGLTLNASG >gi|157101659|gb|DS480665.1| GENE 2 179 - 691 555 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938564|ref|ZP_02085917.1| ## NR: gi|160938564|ref|ZP_02085917.1| hypothetical protein CLOBOL_03460 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07092 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07092 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03460 [Clostridium bolteae ATCC BAA-613] # 1 170 1 170 170 316 100.0 4e-85 MLLDEKMLPGIKIRIPQMEDLLQAEQAYLDVVLEVVEWLRERMILLNEEVMNIPNLKAKI KQITGWDCEILEDAEHLTLTIRYYFDAREPVLEQEERIMKYIPAHLKAVHEYLQRYAGKR KLYAGTMLGTYVRYIGRPQETNGHREGKAHVRVQGNMYIHTKMIVYPEGR >gi|157101659|gb|DS480665.1| GENE 3 707 - 865 73 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938563|ref|ZP_02085916.1| ## NR: gi|160938563|ref|ZP_02085916.1| hypothetical protein CLOBOL_03459 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03459 [Clostridium bolteae ATCC BAA-613] # 3 47 7 51 338 70 80.0 3e-11 MPIQATGTFVVSKGFRLLTKLAASREVCSSHGAAVGTGKPPEGFSPEIPDLA 
>gi|157101659|gb|DS480665.1| GENE 4 1218 - 1577 202 119 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942217|ref|ZP_02089527.1| ## NR: gi|160942217|ref|ZP_02089527.1| hypothetical protein CLOBOL_07104 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07104 [Clostridium bolteae ATCC BAA-613] # 1 119 1 119 119 221 100.0 2e-56 MVLLCSLFFWPSGHYTKIACVELDLSAYGITSGQAVRINNVVIITFWHNTAVPTGKAMNA VALPDDFKPTTNKVGTAVYMDSSSAVALGTSIVKPDGYIYLDSAITRVRVEATIAYTVK >gi|157101659|gb|DS480665.1| GENE 5 1602 - 2054 -15 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942218|ref|ZP_02089528.1| ## NR: gi|160942218|ref|ZP_02089528.1| hypothetical protein CLOBOL_07105 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07105 [Clostridium bolteae ATCC BAA-613] # 10 150 1 141 141 277 100.0 2e-73 MNSVTTAVQMMWDKLYPSTLYLFNHGIIEGIPVSPIYRQSNAFVNVNNESISAGNTQETV CHFAAGIGNAIDITGFSKLIFETSGSKLLSIGIGRNPVPTYDRANFVVSADGTGTIELDV RDYLGYYYIGFRDGNININRSEIHEVFLIP >gi|157101659|gb|DS480665.1| GENE 6 2420 - 3118 329 232 aa, chain - ## HITS:1 COG:no KEGG:Closa_0759 NR:ns ## KEGG: Closa_0759 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 41 129 65 139 485 63 39.0 5e-09 SRLWWIQFAEFTLYVLIGSVSNVIAAVTPGSIITRDTFTAANLKAIDTHGILGGEAGAGT TGQGLIDALTNKLLTEFVTNTGLMERLGTYVLKSKIVNDFLSTDEETVLSGPMGKLLKEQ LNVLNTNFNNKQNREWIYAATLSNKNTASVGVNIRNYREACIIINGHRKSGSIMIVDDVL HFIAPEAGTGTMAFMYLVKVNIVNGGFGLYVEAANRICFLESNAWLEKQKLV Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:25:35 2011 Seq name: gi|157101658|gb|DS480666.1| Clostridium bolteae ATCC BAA-613 Scfld_02_7 genomic scaffold, whole genome shotgun sequence Length of sequence - 3302 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 152 - 322 175 ## gi|160940488|ref|ZP_02087833.1| hypothetical protein CLOBOL_05381 - TRNA 353 - 425 80.2 # Ala TGC 0 0 - 5S_RRNA 431 - 547 94.0 # CP000885 [D:447851..447967] # 5S ribosomal RNA # Clostridium phytofermentans ISDg # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. - SSU_RRNA 633 - 1203 97.0 # Y10028 [D:1..1487] # 16S ribosomal RNA # Clostridium sp. # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. - Term 1217 - 1285 29.7 2 2 Op 1 . - CDS 1327 - 1557 102 ## Mpal_1544 beta-lactamase domain protein 3 2 Op 2 . - CDS 1634 - 1762 117 ## gi|160941983|ref|ZP_02089304.1| hypothetical protein CLOBOL_06875 4 2 Op 3 . - CDS 1759 - 2100 204 ## COG5470 Uncharacterized conserved protein - Term 2135 - 2172 1.0 5 3 Op 1 . - CDS 2174 - 2608 301 ## Sterm_0500 TetR family transcriptional regulator 6 3 Op 2 . - CDS 2624 - 2833 159 ## gi|160941981|ref|ZP_02089302.1| hypothetical protein CLOBOL_06873 7 3 Op 3 . - CDS 2830 - 3165 279 ## gi|160941980|ref|ZP_02089301.1| hypothetical protein CLOBOL_06872 Predicted protein(s) >gi|157101658|gb|DS480666.1| GENE 1 152 - 322 175 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940488|ref|ZP_02087833.1| ## NR: gi|160940488|ref|ZP_02087833.1| hypothetical protein CLOBOL_05381 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07099 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07099 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05381 [Clostridium bolteae ATCC BAA-613] # 1 56 1 56 56 77 100.0 4e-13 MLKTSEETEAYHMYLENRILNKNELYILNQNIKTSEAIQEIVWKNKLVNAKAFMKT >gi|157101658|gb|DS480666.1| GENE 2 1327 - 1557 102 76 aa, chain - ## HITS:1 COG:no KEGG:Mpal_1544 NR:ns ## KEGG: Mpal_1544 # Name: not_defined # Def: beta-lactamase domain protein # Organism: M.palustris # Pathway: not_defined # 1 62 71 132 294 64 46.0 2e-09 MTHSHPDHIGALKNIKGMYSCCVYATGESGIGFEDIDLQYEKRPIPGFYSLVGWVCYVDR ILGLTGTCCCPETWAS >gi|157101658|gb|DS480666.1| GENE 3 1634 
- 1762 117 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941983|ref|ZP_02089304.1| ## NR: gi|160941983|ref|ZP_02089304.1| hypothetical protein CLOBOL_06875 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06875 [Clostridium bolteae ATCC BAA-613] # 1 33 1 33 286 73 100.0 5e-12 MKINDRVHQLRVQFNITPEITRYVYVYIITGKKRMFTWLTQG >gi|157101658|gb|DS480666.1| GENE 4 1759 - 2100 204 113 aa, chain - ## HITS:1 COG:MA2290 KEGG:ns NR:ns ## COG: MA2290 COG5470 # Protein_GI_number: 20091128 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 16 97 2 83 88 65 36.0 2e-11 MSCYFIISVFLEPGKDRADYDDYIKRVRPIVEEHGGRYVVRSEGIEYIGKDWHPDRFIMI WFPDRESIDRCFSSEKYRKIMSKRENTVDSQAIIVTAEEMYGDNSGKRERRME >gi|157101658|gb|DS480666.1| GENE 5 2174 - 2608 301 144 aa, chain - ## HITS:1 COG:no KEGG:Sterm_0500 NR:ns ## KEGG: Sterm_0500 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: S.termitidis # Pathway: not_defined # 1 128 52 179 191 96 38.0 4e-19 MLQAMGQKRWHPDLEPDDFMIETGNLTEDLCQFARIYMKKVTPEFVKLSLGLRTPELAED TREGILAIPQVFKTGVTAYFRKMYEKGKLISDDYESMAMMFLSLNFGFVFFKASFGSGLT EMKADEYIIKMGRCVCPWSGQVNA >gi|157101658|gb|DS480666.1| GENE 6 2624 - 2833 159 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941981|ref|ZP_02089302.1| ## NR: gi|160941981|ref|ZP_02089302.1| hypothetical protein CLOBOL_06873 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06873 [Clostridium bolteae ATCC BAA-613] # 1 68 1 68 216 136 100.0 4e-31 MTGSLDREMGLLYDASVHLRFGEGDMDQTGQNIIDAAMELVVERGYTATTTKDIAKRAGV NECTIFRKI >gi|157101658|gb|DS480666.1| GENE 7 2830 - 3165 279 111 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941980|ref|ZP_02089301.1| ## NR: gi|160941980|ref|ZP_02089301.1| hypothetical protein CLOBOL_06872 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06872 [Clostridium bolteae ATCC BAA-613] # 1 111 168 278 278 216 100.0 6e-55 MTSVDGSSLICIMSESLADFGGYADLVYTYQESVLDAAVEAELGTPQTKTTSQLSNGLWY 
SYHYMDASSLGIPGFLKVYARINGDRLQMVMFGGSISSLDTEGIMNSLTFL Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:26:36 2011 Seq name: gi|157101657|gb|DS480667.1| Clostridium bolteae ATCC BAA-613 Scfld_02_8 genomic scaffold, whole genome shotgun sequence Length of sequence - 127872 bp Number of predicted genes - 130, with homology - 126 Number of transcription units - 49, operones - 25 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 72 - 839 788 ## COG1489 DNA-binding protein, stimulates sugar fermentation + Term 872 - 915 11.1 2 2 Tu 1 . - CDS 966 - 1334 250 ## gi|160935109|ref|ZP_02082492.1| hypothetical protein CLOBOL_00004 - Prom 1361 - 1420 6.1 + Prom 1278 - 1337 9.1 3 3 Tu 1 . + CDS 1468 - 1707 127 ## gi|160935110|ref|ZP_02082493.1| hypothetical protein CLOBOL_00005 4 4 Op 1 1/0.167 - CDS 1723 - 2955 1571 ## COG1541 Coenzyme F390 synthetase 5 4 Op 2 11/0.000 - CDS 2972 - 3565 694 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 6 4 Op 3 . - CDS 3620 - 5485 1726 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits - Prom 5577 - 5636 7.0 7 5 Op 1 . - CDS 6005 - 7258 1000 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 8 5 Op 2 . - CDS 7287 - 9284 1965 ## COG0513 Superfamily II DNA and RNA helicases - Term 9304 - 9354 7.1 9 5 Op 3 . - CDS 9358 - 9753 587 ## bpr_III171 hypothetical protein - Prom 9799 - 9858 4.6 + Prom 9613 - 9672 4.3 10 6 Tu 1 . + CDS 9886 - 11544 503 ## COG0370 Fe2+ transport system protein B + Term 11643 - 11671 -1.0 + Prom 11642 - 11701 4.3 11 7 Op 1 . + CDS 11723 - 12790 818 ## COG0628 Predicted permease 12 7 Op 2 . + CDS 12847 - 14058 839 ## Closa_3889 hypothetical protein + Term 14306 - 14350 3.1 - Term 14036 - 14071 4.3 13 8 Tu 1 . - CDS 14079 - 14807 711 ## COG2186 Transcriptional regulators - Prom 14855 - 14914 8.8 + Prom 14918 - 14977 10.8 14 9 Op 1 . 
+ CDS 15042 - 15638 461 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 15 9 Op 2 . + CDS 15651 - 16448 700 ## COG0491 Zn-dependent hydrolases, including glyoxylases 16 9 Op 3 . + CDS 16457 - 17677 643 ## COG5441 Uncharacterized conserved protein 17 9 Op 4 . + CDS 17690 - 19078 1187 ## COG2610 H+/gluconate symporter and related permeases 18 9 Op 5 . + CDS 19101 - 19787 611 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases + Term 19849 - 19889 7.2 19 10 Op 1 5/0.000 - CDS 19784 - 20380 545 ## COG1186 Protein chain release factor B 20 10 Op 2 . - CDS 20391 - 21419 637 ## COG1690 Uncharacterized conserved protein 21 11 Tu 1 . + CDS 21919 - 24066 1754 ## COG3590 Predicted metalloendopeptidase + Term 24090 - 24135 11.3 - Term 24077 - 24123 15.3 22 12 Op 1 . - CDS 24141 - 26177 1968 ## COG1164 Oligoendopeptidase F - Prom 26205 - 26264 2.3 23 12 Op 2 . - CDS 26286 - 27851 1292 ## COG1409 Predicted phosphohydrolases 24 12 Op 3 . - CDS 27886 - 28695 834 ## COG0561 Predicted hydrolases of the HAD superfamily 25 12 Op 4 . - CDS 28673 - 29257 985 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 26 12 Op 5 . - CDS 29293 - 30492 1538 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) - Prom 30584 - 30643 6.3 27 13 Op 1 . - CDS 30659 - 32053 1676 ## COG1362 Aspartyl aminopeptidase 28 13 Op 2 . - CDS 32126 - 33100 696 ## Closa_4147 hypothetical protein - Prom 33133 - 33192 6.3 29 14 Op 1 16/0.000 - CDS 33253 - 33942 761 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 30 14 Op 2 . - CDS 33935 - 35416 1652 ## COG2205 Osmosensitive K+ channel histidine kinase 31 14 Op 3 1/0.167 - CDS 35425 - 35841 535 ## COG0569 K+ transport systems, NAD-binding component 32 14 Op 4 . - CDS 35838 - 36275 535 ## COG0569 K+ transport systems, NAD-binding component - Prom 36305 - 36364 6.0 33 15 Op 1 . 
- CDS 36461 - 36658 301 ## gi|160935148|ref|ZP_02082531.1| hypothetical protein CLOBOL_00043 - Prom 36678 - 36737 3.3 34 15 Op 2 15/0.000 - CDS 36739 - 38733 2116 ## COG2205 Osmosensitive K+ channel histidine kinase - Term 38761 - 38807 8.0 35 15 Op 3 18/0.000 - CDS 38827 - 39432 792 ## COG2156 K+-transporting ATPase, c chain 36 15 Op 4 20/0.000 - CDS 39488 - 41560 2272 ## COG2216 High-affinity K+ transport system, ATPase chain B 37 15 Op 5 . - CDS 41576 - 43327 1966 ## COG2060 K+-transporting ATPase, A chain 38 15 Op 6 . - CDS 43324 - 43413 141 ## - Prom 43503 - 43562 7.2 - Term 43542 - 43593 17.0 39 16 Op 1 . - CDS 43621 - 44562 1004 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase 40 16 Op 2 . - CDS 44585 - 45355 725 ## COG0813 Purine-nucleoside phosphorylase 41 16 Op 3 . - CDS 45362 - 46129 859 ## COG2188 Transcriptional regulators 42 16 Op 4 26/0.000 - CDS 46200 - 47105 1211 ## COG1079 Uncharacterized ABC-type transport system, permease component 43 16 Op 5 24/0.000 - CDS 47109 - 48212 1168 ## COG4603 ABC-type uncharacterized transport system, permease component 44 16 Op 6 15/0.000 - CDS 48193 - 49713 1779 ## COG3845 ABC-type uncharacterized transport systems, ATPase components 45 16 Op 7 . - CDS 49744 - 50829 1316 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein - Prom 50852 - 50911 7.5 - Term 51660 - 51714 1.6 46 17 Op 1 15/0.000 - CDS 51729 - 52211 539 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 47 17 Op 2 12/0.000 - CDS 52204 - 53094 894 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 48 17 Op 3 . - CDS 53150 - 55480 2230 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 49 17 Op 4 . - CDS 55516 - 56268 983 ## Closa_2900 hypothetical protein 50 17 Op 5 . - CDS 56308 - 57624 1478 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 51 17 Op 6 . 
- CDS 57695 - 58141 641 ## LC705_00064 hypothetical protein 52 17 Op 7 . - CDS 58138 - 59034 1035 ## Closa_2902 hypothetical protein - Prom 59143 - 59202 7.4 + Prom 59018 - 59077 7.8 53 18 Tu 1 . + CDS 59211 - 59954 857 ## Closa_2903 Crp/Fnr family transcriptional regulator + Prom 59987 - 60046 7.6 54 19 Op 1 1/0.167 + CDS 60075 - 62048 1750 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 55 19 Op 2 . + CDS 62066 - 63001 631 ## COG0583 Transcriptional regulator + Term 63198 - 63239 -0.6 - Term 62826 - 62862 4.0 56 20 Op 1 . - CDS 63020 - 63895 842 ## COG1082 Sugar phosphate isomerases/epimerases 57 20 Op 2 . - CDS 63912 - 64850 812 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase 58 20 Op 3 . - CDS 64886 - 66367 1053 ## COG1070 Sugar (pentulose and hexulose) kinases 59 20 Op 4 21/0.000 - CDS 66391 - 67374 683 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 60 20 Op 5 16/0.000 - CDS 67396 - 68892 1359 ## COG1129 ABC-type sugar transport system, ATPase component - Term 68916 - 68966 5.3 61 20 Op 6 1/0.167 - CDS 69019 - 70068 1035 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 70133 - 70192 6.1 - Term 70213 - 70262 9.9 62 21 Tu 1 . - CDS 70291 - 71457 309 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 - Prom 71550 - 71609 6.8 - Term 71585 - 71629 3.0 63 22 Tu 1 . - CDS 71654 - 72088 219 ## Ccel_2418 hypothetical protein - Prom 72190 - 72249 5.5 - Term 72121 - 72163 3.4 64 23 Tu 1 . - CDS 72259 - 73059 720 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 73111 - 73170 4.8 65 24 Tu 1 . - CDS 73178 - 73753 665 ## COG0212 5-formyltetrahydrofolate cyclo-ligase - Prom 73779 - 73838 5.5 + Prom 73729 - 73788 4.2 66 25 Tu 1 . + CDS 73898 - 75304 1218 ## COG1767 Triphosphoribosyl-dephospho-CoA synthetase + Term 75494 - 75536 3.1 - Term 75105 - 75153 -0.1 67 26 Tu 1 . 
- CDS 75265 - 75651 446 ## Cbei_3026 heavy metal transport/detoxification protein - Prom 75690 - 75749 4.4 - Term 75791 - 75852 9.2 68 27 Tu 1 . - CDS 75914 - 77398 1540 ## COG1292 Choline-glycine betaine transporter - Prom 77439 - 77498 2.8 - Term 77492 - 77535 13.8 69 28 Op 1 . - CDS 77593 - 77832 191 ## CLJU_c27830 betaine reductase complex component B subunit gamma (EC:1.21.4.2) 70 28 Op 2 . - CDS 77860 - 78912 1226 ## CLJU_c27830 betaine reductase complex component B subunit gamma (EC:1.21.4.2) 71 28 Op 3 . - CDS 78938 - 80263 1308 ## CLJU_c27360 putative glycine/sarcosine/betaine reductase component B subunit alpha - Prom 80315 - 80374 5.5 72 29 Tu 1 . - CDS 80414 - 81004 569 ## Shel_25340 hypothetical protein - Prom 81026 - 81085 2.0 - Term 81042 - 81084 0.2 73 30 Op 1 . - CDS 81110 - 81814 803 ## COG1309 Transcriptional regulator 74 30 Op 2 . - CDS 81804 - 82445 303 ## Shel_25350 hypothetical protein - Prom 82470 - 82529 6.9 + Prom 82441 - 82500 4.9 75 31 Tu 1 . + CDS 82587 - 83081 345 ## EUBELI_00140 hypothetical protein + Term 83299 - 83361 2.6 - Term 83302 - 83334 1.0 76 32 Tu 1 . - CDS 83470 - 83814 284 ## Odosp_1093 hypothetical protein - Prom 83847 - 83906 3.1 - Term 83852 - 83899 4.3 77 33 Tu 1 . - CDS 84009 - 84752 362 ## COG0327 Uncharacterized conserved protein - Prom 84774 - 84833 2.8 78 34 Op 1 . - CDS 84924 - 85310 253 ## PC1_2348 hypothetical protein 79 34 Op 2 . - CDS 85386 - 85631 166 ## gi|160935201|ref|ZP_02082584.1| hypothetical protein CLOBOL_00096 80 34 Op 3 . - CDS 85613 - 87463 115 ## PPSC2_p0627 hypothetical protein 81 34 Op 4 . - CDS 87479 - 87682 110 ## gi|160935204|ref|ZP_02082587.1| hypothetical protein CLOBOL_00099 82 34 Op 5 . - CDS 87696 - 88283 497 ## COG1704 Uncharacterized conserved protein 83 34 Op 6 . - CDS 88303 - 89418 385 ## gi|160935206|ref|ZP_02082589.1| hypothetical protein CLOBOL_00101 84 35 Tu 1 . 
- CDS 89572 - 89784 65 ## - Prom 89995 - 90054 2.1 - Term 90394 - 90436 -0.8 85 36 Op 1 9/0.000 - CDS 90468 - 91751 822 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 86 36 Op 2 . - CDS 91748 - 92467 517 ## COG3279 Response regulator of the LytR/AlgR family - Prom 92631 - 92690 4.0 87 37 Op 1 . + CDS 92867 - 93247 176 ## Cbei_3562 transposase IS116/IS110/IS902 family protein 88 37 Op 2 . + CDS 93171 - 93518 142 ## ELI_1812 hypothetical protein - Term 93476 - 93512 1.1 89 38 Tu 1 . - CDS 93634 - 94161 156 ## Bacsa_0336 death-on-curing family protein - Term 95006 - 95053 -0.9 90 39 Op 1 . - CDS 95171 - 96178 785 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 91 39 Op 2 . - CDS 96206 - 96823 134 ## smi_1699 hypothetical protein - Prom 96865 - 96924 8.5 + Prom 98327 - 98386 6.2 92 40 Tu 1 . + CDS 98567 - 98911 121 ## gi|160935220|ref|ZP_02082603.1| hypothetical protein CLOBOL_00115 + Term 98956 - 98988 4.0 - Term 98792 - 98831 -0.9 93 41 Tu 1 . - CDS 98985 - 99758 667 ## Cphy_0799 hypothetical protein - Prom 99778 - 99837 2.3 - Term 99774 - 99807 5.2 94 42 Op 1 . - CDS 99845 - 100267 470 ## Closa_0862 toxin secretion/phage lysis holin 95 42 Op 2 . - CDS 100283 - 100411 271 ## gi|160935223|ref|ZP_02082606.1| hypothetical protein CLOBOL_00118 96 42 Op 3 . - CDS 100452 - 100565 197 ## 97 42 Op 4 . - CDS 100569 - 100913 393 ## gi|160935225|ref|ZP_02082608.1| hypothetical protein CLOBOL_00120 98 42 Op 5 . - CDS 100932 - 101645 339 ## bpr_II210 hypothetical protein 99 42 Op 6 . - CDS 101652 - 102095 478 ## Closa_0858 hypothetical protein 100 42 Op 7 . - CDS 102086 - 102511 239 ## gi|160935228|ref|ZP_02082611.1| hypothetical protein CLOBOL_00123 101 42 Op 8 . - CDS 102508 - 104604 1419 ## Sgly_0356 hypothetical protein 102 42 Op 9 . - CDS 104606 - 105004 267 ## gi|160935230|ref|ZP_02082613.1| hypothetical protein CLOBOL_00125 103 42 Op 10 . 
- CDS 105006 - 108839 2710 ## gi|160935231|ref|ZP_02082614.1| hypothetical protein CLOBOL_00126 - Term 108859 - 108887 -1.0 104 42 Op 11 . - CDS 108888 - 109451 492 ## Sgly_0348 putative protein GP15 105 42 Op 12 . - CDS 109448 - 109897 475 ## gi|160935233|ref|ZP_02082616.1| hypothetical protein CLOBOL_00128 106 42 Op 13 . - CDS 109956 - 110492 524 ## gi|160935234|ref|ZP_02082617.1| hypothetical protein CLOBOL_00129 107 42 Op 14 . - CDS 110495 - 110944 434 ## gi|160935235|ref|ZP_02082618.1| hypothetical protein CLOBOL_00130 108 42 Op 15 . - CDS 110941 - 111366 230 ## Clole_0824 minor capsid 109 42 Op 16 . - CDS 111378 - 111701 260 ## gi|160935237|ref|ZP_02082620.1| hypothetical protein CLOBOL_00132 110 42 Op 17 . - CDS 111760 - 112131 237 ## gi|160935238|ref|ZP_02082621.1| hypothetical protein CLOBOL_00133 111 42 Op 18 . - CDS 112146 - 113336 1151 ## Ethha_0094 PEGA domain protein 112 42 Op 19 . - CDS 113351 - 113884 385 ## gi|160935240|ref|ZP_02082623.1| hypothetical protein CLOBOL_00135 113 43 Op 1 . - CDS 114003 - 114179 67 ## gi|160935241|ref|ZP_02082624.1| hypothetical protein CLOBOL_00136 114 43 Op 2 . - CDS 114164 - 115918 992 ## ELI_3213 hypothetical protein 115 43 Op 3 . - CDS 115931 - 117310 953 ## Clole_0815 bacteriophage portal protein, SPP1 GP6-like protein 116 43 Op 4 . - CDS 117325 - 118704 958 ## EUBREC_1284 hypothetical protein 117 43 Op 5 . - CDS 118688 - 119407 403 ## COG5484 Uncharacterized conserved protein + Prom 119465 - 119524 8.1 118 44 Op 1 . + CDS 119544 - 119663 222 ## + Term 119669 - 119694 -0.5 + Prom 119738 - 119797 5.5 119 44 Op 2 . + CDS 119850 - 120041 174 ## gi|160935248|ref|ZP_02082631.1| hypothetical protein CLOBOL_00143 - Term 120066 - 120099 1.0 120 45 Op 1 . - CDS 120123 - 121679 508 ## Pedsa_1277 hypothetical protein 121 45 Op 2 . - CDS 121660 - 122130 67 ## gi|160935250|ref|ZP_02082633.1| hypothetical protein CLOBOL_00145 122 45 Op 3 . 
- CDS 122167 - 122304 74 ## gi|160935251|ref|ZP_02082634.1| hypothetical protein CLOBOL_00146
        - Prom 122328 - 122387 12.2
        - TRNA 122444 - 122517 61.8 # Met CAT 0 0
        - Term 122511 - 122567 5.0
123 46 Op 1 . - CDS 122744 - 123196 296 ## Closa_1384 hypothetical protein
124 46 Op 2 . - CDS 123202 - 123495 185 ## gi|160935253|ref|ZP_02082636.1| hypothetical protein CLOBOL_00149
125 46 Op 3 . - CDS 123470 - 123670 204 ## gi|160935254|ref|ZP_02082637.1| hypothetical protein CLOBOL_00150
        - Prom 123730 - 123789 4.3
126 47 Tu 1 . - CDS 123948 - 124190 278 ## gi|160935256|ref|ZP_02082639.1| hypothetical protein CLOBOL_00152
        - Prom 124251 - 124310 3.9
127 48 Tu 1 . - CDS 124781 - 125428 112 ## COG0863 DNA modification methylase
        - Prom 125450 - 125509 1.9
128 49 Op 1 . - CDS 125561 - 125998 402 ## gi|160935259|ref|ZP_02082642.1| hypothetical protein CLOBOL_00155
129 49 Op 2 . - CDS 125988 - 127358 1002 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family
130 49 Op 3 . - CDS 127339 - 127632 244 ## CLJU_c36520 phage-like protein
        - Prom 127720 - 127779 1.6

Predicted protein(s)
>gi|157101657|gb|DS480667.1| GENE 1 72 - 839 788 255 aa, chain + ## HITS:1 COG:CAC0144 KEGG:ns NR:ns ## COG: CAC0144 COG1489 # Protein_GI_number: 15893439 # Func_class: R General function prediction only # Function: DNA-binding protein, stimulates sugar fermentation # Organism: Clostridium acetobutylicum # 11 254 12 230 230 191 44.0 2e-48
MTYEHIVAGTFVSRPNRFIAHVKTGNRTVVCHVKNTGRCRELLIPGAAVILEFHPDAAVS
GRKTEYDLIGVYKNDLFINMDSQAPNKAAWEWLTSLDGSMDSCGCTEKAGSPFSPLGPYV
PCDIRREVTHGDSRFDLAFSLRDRDTKAVSPAFMEVKGVTLEENGVAMFPDAPTERGIKH
LKGLIRAHEEGYEAYVLFVIQMKGIRGFTPNDMTHPAFGDALRQAREAGVHVLAYDCLVT
PDTMIVDSPVKVILD
>gi|157101657|gb|DS480667.1| GENE 2 966 - 1334 250 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935109|ref|ZP_02082492.1| ## NR: gi|160935109|ref|ZP_02082492.1| hypothetical protein CLOBOL_00004 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00004 [Clostridium bolteae ATCC BAA-613] # 1 79 1 79
122 67 100.0 5e-10 MNYNNCNGWNEYAGTYEGAYYAGDAAFARTAAAGGNCGSTAAAAAAGTAGSCGGTGRTCD LYWQGYMAGFYANNRCNQSAEARTAAQAADRTAAASNCRCSCNNCNNCGNWNNCGNCGNC RS >gi|157101657|gb|DS480667.1| GENE 3 1468 - 1707 127 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935110|ref|ZP_02082493.1| ## NR: gi|160935110|ref|ZP_02082493.1| hypothetical protein CLOBOL_00005 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00005 [Clostridium bolteae ATCC BAA-613] # 1 79 1 79 79 138 100.0 2e-31 MVYPLHELKPGERAEIVWVISEPPMSLRLEELGFTAPALVTCILKGRRGRMSAYQIQDTV IALRPQNAREVLVRPLQPV >gi|157101657|gb|DS480667.1| GENE 4 1723 - 2955 1571 410 aa, chain - ## HITS:1 COG:MA1725 KEGG:ns NR:ns ## COG: MA1725 COG1541 # Protein_GI_number: 20090577 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanosarcina acetivorans str.C2A # 4 409 23 433 435 495 59.0 1e-140 MKMKESQFLLIKEQLKKLTAKECFYKHKLEGVNVNQIQSQEDFEKLPFTWKGDLREAYPL GLMAAPEEKIVRIHSSSGTTGTPVIIPYTQKDVDDWALMFARCYEMAGITNLDRIQITPG YGLWTAGIGFQLGAERLGAMTVPMGPGNTDKQLRMMMDMKSTVLCATSSYALLLAEEIAK RGIGERIHLKKGVIGSERWGEKMRNRIAAELGVDLFDIYGLTEVYGPGIAINCEKQGAMH YWDDYIYIEIVDPRTGEVLPDGEVGEIVITTLKKEGAPLIRYRTHDLSRIVPGDCPCGSP YPRIGTLIGRTDDMVKVKGVNIFPSQIEELLSSIEGASSEYQVMVDHLMGKDVLTLFVET NPSINKYALEIEIQNQFKGRIGLTPVVKLVELGELPRSEKKSTRVFDNRY >gi|157101657|gb|DS480667.1| GENE 5 2972 - 3565 694 197 aa, chain - ## HITS:1 COG:PAB0857 KEGG:ns NR:ns ## COG: PAB0857 COG1014 # Protein_GI_number: 14521497 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Pyrococcus abyssi # 4 193 5 196 202 132 39.0 4e-31 MTKNVLLCGVGGQGTVLASRLIALAAMEKGMEARGAETIGMAQRGGSVVSHVRIGEEIYS PLIPHGGADVIIGFEPAEAVRCLPYLKKGGCVVVSPAPIRPVTASLTGGAYTGREMMEYL EHAGENLVVVDAASIGMECGSPKVMNVALLGAAIASGLIGISLEEMEAAIEKRVPEKFKD MNMKALKLGAAASAMIR >gi|157101657|gb|DS480667.1| GENE 6 3620 - 5485 1726 621 aa, chain - ## HITS:1 COG:MA1727 KEGG:ns NR:ns ## COG: 
MA1727 COG4231 # Protein_GI_number: 20090579 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Methanosarcina acetivorans str.C2A # 3 617 4 612 619 564 48.0 1e-160 MSKSFLMGNEAIGLGAVRAGVKVVSGYPGTPSTEVLETVAKHNPGDIYVEWSVNEKAGME VAAAAAYTGARTMVTMKQVGLNVASDPLMSLAYVGVKGGMVVVVADDPGPISSQTEQDTR RFGQFSKLPVFDPSSPEEAYEMIKDAFEYSEKYHTPVLFRPTTRLCHGCASVELKERVKL PAPEGFVKDSGKWVIFPRLSHANHRMIEARNPIIGEDFSSYRFNLLHREEGNTVKGIITH GISYGFVMEALNGYRGARVLKVSTPNPMPERLLLEFAEGLDQVMAVEELDPVLEQELLLL SGRHHLPLEVKGKLTGEVQTAGENSVESVRRVLEAYLGEAYIRYLEGLEGGAAEPEAALP VPPLPVRPPVLCAGCPHRASFYAVKRAMEKLNEGLGEGAAPIEGVYCGDIGCYTLGNAKP LDMVDTCLCMGAGITMAQGLQRVEPDKRYFSFVGDSTFFASGLTGIVNAVYNEASLTLCI LDNSTTAMTGHQPHPGTGRTMMGNVVEKVDITKVLEGIGVKNTVTVDALDLNACVDAVLK LSAMKGVKAIIFKSPCAAIIKSFRTCRIAEDRCVNCRTCINEIGCPALVLDGDMVGIDSG LCTGCGLCSQICSVDAIETVK >gi|157101657|gb|DS480667.1| GENE 7 6005 - 7258 1000 417 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 4 404 2 417 447 389 47 1e-107 MENQRIIQVEDKVPFNLLVPLSIQHMFAMFGASILVPFIFGINPAVVLFMNGVGTLLFIG VTKGKAPAYLGSSFAFLAPAGVVISNFGYEYALGGFVAVGFCGCILAFIVYKFGTEWIEV VLPPAAMGPVVALIGLELSGSAASNAGLLDEVLDPRKTIVFVLTLGTAVFGSILFRKFFS VIPILIAVIVGYASALAAGIVDFAEVAAAPVFALPNFSTPKFNLEAIMIILPVLLVITSE HIGHQVVTSKIVGRDLLKDPGLHRSLFGDNFSTMISGLIGSVPTTTYGENIGVMAITRVY SVRVIAGAAVLSIVCSFIGKFSTLISTIPGPVIGGISFLLYGMIGTSGLRILVDSRVDYG NSRNLALTSVIFVTGLSGIAVKFGNVQLTGMVLSCVVGMILSLVFYVLDYFKLTNDQ >gi|157101657|gb|DS480667.1| GENE 8 7287 - 9284 1965 665 aa, chain - ## HITS:1 COG:CAC0778 KEGG:ns NR:ns ## COG: CAC0778 COG0513 # Protein_GI_number: 15894065 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Clostridium acetobutylicum # 90 612 44 582 585 316 37.0 1e-85 MEIKKECEQYLAEYYQGLFFGNVNKRYRAMTGAELRKRMSRLTEANLKPLVKSNELSDVH 
AMLSKVCAYLMRREGHLTGPALEQWESGCRQMNEFCEALARKDLEYVNGLSLEDLEQVLK MQGIRRYLLTNSLERAYQLFYIPKTIKKGILESVKQKPEQEYPGAREMKRRFILHVGPTN SGKTHDALERLKECRHGAYFGPLRLLALEVYDKLNTEGLSCSMVTGEETLEVPGAVCQSC TVEMLNDHEYFDIVVVDECQMIADPYRGHNWTRAVLGLRAEEIHLCMAPEAEDIVVQMIK RCGDQYRVVRHKRNTRLTMEKKPYNLKQDLKKGDALIVFSKKSVLALAAHLENEGIHCSV IYGSLPPATRREQVRRFLARETEVVVSTDAIGMGLNLPIRRIVFVETRKFDGVNKRTLNP EEIKQIAGRAGRYGLYDEGFVAAIDEPEVIEDGLSRMPMPIMKAYVGFPEQLLNLPAEID TLVKIWAGMDTPSIYEKMEVDELLALYMSFEHVHRDDMGEYSRQEIYKLITCSIDIDNKM VMDLWKDYCREYRDVTELEFPYSPGEDLYDLESYYKMLDLYFQFSRKVGLPVQAENLAEE RRSTEEEISRILRTECASYTRKCSACGKELSWDYPFSICERCFERGKISRGHNRSGRRRA AGNRV >gi|157101657|gb|DS480667.1| GENE 9 9358 - 9753 587 131 aa, chain - ## HITS:1 COG:no KEGG:bpr_III171 NR:ns ## KEGG: bpr_III171 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 3 122 2 119 126 66 38.0 3e-10 MGYQDDYVMRTISDLVRAIARLVLGKSDIDYDLPEDEDKYTDLDRVYKRLKDLVDEGNIN DAENLLTDELDTDSLDCLEMALTFYMYLNQLKDEELYTANYSREEIVDGINSVCAEYGIS GFEHFVDTTMV >gi|157101657|gb|DS480667.1| GENE 10 9886 - 11544 503 552 aa, chain + ## HITS:1 COG:MJ0566 KEGG:ns NR:ns ## COG: MJ0566 COG0370 # Protein_GI_number: 15668746 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Methanococcus jannaschii # 37 217 6 178 668 71 28.0 4e-12 MHQRFCGTYAILRLNLPAYRQILYFRSDWMEHPHFIIAFTGNEQTGKEVLFKKLTAKKGP AMPSCMRDMSCPHGDFSYDSRHYKAVNLPGVLSLSSPSEEASCTASYLCSGHPDAILVIG HALHLEILLGLLKEILSLGPVRDSSIPVVLCVSRCEEARRQGIRIDFSLLHDVLQIPVVP LHGYGREQMDDLKAALHYALQPHHRHDFLYDCLDFSPCRLAHECMMPEISPSQGKKRNIS PEAGWIRRICEALLLLLLIGLTACLSIRLTDCLWPLFFETETVLWAWAEWFGIPAWLAAP MVHGAFRAVSCTILVMLPMLFLLFPLLGLLGRCSCFPWAVYLSDLLTEFFIEKPCNTASH YKSYGIFSTLFSSTADHTIPVLDKAAAAAVPSGILIWFLGSMAYAGPETGYGTLLFSDIA GGNLLTAITHFLEHPARLLGLDGTILAAFILGIFSHGMALPAMMMIYLKTGGIPSPSSPF ILGQVLANHGWTWRTALCTCLMALTRFPSITVCLKMRRSPGHTPYFFWGRLLVFILGITL CFLIALTGRISG >gi|157101657|gb|DS480667.1| GENE 11 11723 - 12790 818 355 aa, chain + ## HITS:1 COG:BS_ytvI KEGG:ns 
NR:ns ## COG: BS_ytvI COG0628 # Protein_GI_number: 16079968 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus subtilis # 31 340 30 350 371 108 26.0 1e-23 MDIQKRRTFIISFIYFGIIGVLCYIGLKKLLPILIPFMAAMAIAAMLEPAVSLLDRHMKG GKGAAAAVVLLVFYGSILTIMCVSGSQILSSIQEQAKKLPGIYSQTIEPGLSHFFSLLEN SFPGHSIHISALGQSLERFMENASVGISSGLLGWGASAISGIPALVLDFVIAVIASFFLT GNYRETLDFLLYQIPDDRRQMLLQVLLHIRKVACRLLRAYALLMLLTFTELYIGFLVLGI PAGFTLACITSLVDILPVLGTGTVLLPWALIAWTTGSGSLAMGLVCLYLLIAVVRQTLEP KIIGHQMGLSPIATLLCIFAGGKLLGLTGIFLFPIAATVLAELHSGNNLPAQNQV >gi|157101657|gb|DS480667.1| GENE 12 12847 - 14058 839 403 aa, chain + ## HITS:1 COG:no KEGG:Closa_3889 NR:ns ## KEGG: Closa_3889 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 43 399 12 349 349 381 52.0 1e-104 MKFFHNVTAKIAPAFSSKAALKSTALQNKAPERTEKKVRSFRFFAAGLVLAAVSGLLGLC ARTVPGFAQFYSVAVYPLLAGTLGRLCSLFPFSLSEIGLYSLCLFCICYMILHIRQPVVV FSRAFFLCCVLLFVFTINCGINYYRNPFSYEAGIAAEKTSTEELLALCRYLTNQINSSLT EIDHSGEILDGLYPGQTEATPAPSARDLSKLGRDGRAAMVRLGQSYPQLEGYYPYPKPLI NSRLLSVQQLCGIYSPFTIEANYNREMPYYNIPHTICHELSHLKGFMREDEANFIGYLAC IGSDSPDFRYSGYLTGWVYAGNALARVDPESYYDLYTKLSPQAAQDLAWNNQFWERFEGP VAEASTQMNDRYLKAHSQEDGVRSYGRMVDLMLAYYKDQLRED >gi|157101657|gb|DS480667.1| GENE 13 14079 - 14807 711 242 aa, chain - ## HITS:1 COG:BMEII0071 KEGG:ns NR:ns ## COG: BMEII0071 COG2186 # Protein_GI_number: 17988415 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Brucella melitensis # 2 219 36 252 270 81 34.0 2e-15 MARQKLSEVILEDLKKRILTGEYRTNQKLPTERELANYYDTSRIPVREALRQLADAGIVR TSAGSGTVVISSGSTLSDTSYGTPFMENVTLLKETIILRRLIESQAAREAAQNRSADDIK ELQNALFDSINQIRKLKAKENNSFFEADAWFHKAIAKASHNQLLVDCLDAIPYITANHQF LSLKYTTPRDEVVSYHTQIYENILDMNGERAYESMYNHLYRVETLMMNHKQEVGGLGGED GI >gi|157101657|gb|DS480667.1| GENE 14 15042 - 15638 461 198 aa, chain + ## HITS:1 COG:TVN1450 KEGG:ns NR:ns ## COG: TVN1450 COG0235 # Protein_GI_number: 13542281 # Func_class: G Carbohydrate transport and 
metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Thermoplasma volcanium # 5 198 2 190 218 85 30.0 7e-17 MYREFLKAKDDFLAAASRTYESRIQTGTGGNLSVRIPGTDLMIVKPSGFSYGQCSEENIT ITDFQGNLQEGLYKPTRESTLHGNLYARYPKIGGIVHTHSPYSILISLNDRQLELITLHS QLKLKKPVKVVDVTTQAVTQEELPKVFKVLDTEPETAAIILKGHGIVAISSSAVKAGQIA ELIEETAMIAWEQKKLRK >gi|157101657|gb|DS480667.1| GENE 15 15651 - 16448 700 265 aa, chain + ## HITS:1 COG:AF1509 KEGG:ns NR:ns ## COG: AF1509 COG0491 # Protein_GI_number: 11499104 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Archaeoglobus fulgidus # 40 253 32 256 262 105 30.0 1e-22 MTNYTITPINTGFTNTSKGQYLYHHSVHKFYDVEGFVKLPVTVFLVQGGGKKIMIDTGMS DTEIAGKYHHPGSVQPEGYAVYEQLENLGIKCSEITDIIFTHLHWDHVYYLDRFVNADLH VQRTEYQFAVNPIPLYYKSYEYPVLGLKPQFDGRSFKLYDGEAEIFDGISVYPTPGHSVG HQTVVVNTSEGQYHCCGDLIFTYDNLKPVPEMHYDITPPGRFLDIVDEWNSIVELKKRAK SQEFILPTHAPEMIEIVDSKRVYGN >gi|157101657|gb|DS480667.1| GENE 16 16457 - 17677 643 406 aa, chain + ## HITS:1 COG:YPO3839 KEGG:ns NR:ns ## COG: YPO3839 COG5441 # Protein_GI_number: 16123974 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Yersinia pestis # 6 390 5 384 405 159 29.0 8e-39 MEHNRIALIASLDTKLQETLYAKHEIESCGGQGLIIDISTKTLVDAGDAVGPADILARYG MTWEEFGPLDKAQSIETMSNALTAVMPALYAQGLFDAVISIGGGQNARMAAAAMKSLPFG VPKIVASSLACGRRTMEQYVGDKDIMVVHTVADISGLNYTTKTVIHNVCHAALGMLQHQR QVTPDSRKKIAATMLGITSKGVEGALRLLPDGTYEKTCFHANGVGGRCMEKLIEEGAFDL IADITLHELTCEVLGGYCTGANNRLEAAVRHHVPMVVVPGALDMLDFFIDEDGRGLPDDI DRRKKVYHNSSIVHTKIYREEAVKLARVLAGRLNKSTAPVTLILPDEGFCEAAAKGGPMY DPEVDEAFISTIKPLLEQHIKIIEVRGNINSDSCQKAVAAAIMNLV >gi|157101657|gb|DS480667.1| GENE 17 17690 - 19078 1187 462 aa, chain + ## HITS:1 COG:FN0815 KEGG:ns NR:ns ## COG: FN0815 COG2610 # Protein_GI_number: 19704150 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # 
Organism: Fusobacterium nucleatum # 18 460 1 446 447 390 47.0 1e-108 MTTTISVAGLFLALFLLVYLAFKGHSVIIIAPIVAMVAVIFSAGFDSHLMANYTEVYMTG FANYARNYFPLYLFGAVFAKLMEVSGYADAISHLIASRLGKERAILAVVLCCAVLTYGGV SLFVVTFVVLPIAISLFRAADIPKRLIPGSIALGAFTFTMTALPGSPQVQNTIPMTYFGT DAFAAPVLGIIAAIMMFSLGMSWLNYRAKKAKQAEEGYGDHDDEKKEISESDLPGIIHPV IAFLLIVAVNLLCSKVIYPRLNTAYLEQYNTKISAVSGNWSVLIALLVAIIYLIITGLPR FKALKDSLKSAAGNSLMPLINSCAVVGFGSVIKGLAIFAIVQAFILSISANPLITEVLAV NLLCGMTASASGGLGATLEALAPTFLEQGARIGIGPQVLHRIASISSGGLDSLPHNGATV TTLSLCGITHKEGYLDMFVTSVVIPIATALVIVVLATLGLRF >gi|157101657|gb|DS480667.1| GENE 18 19101 - 19787 611 228 aa, chain + ## HITS:1 COG:ygbL KEGG:ns NR:ns ## COG: ygbL COG0235 # Protein_GI_number: 16130645 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 7 211 8 210 212 118 34.0 8e-27 MKNYLSEQEARRAILNVGKRIYDRGYVAANDGNISCKISDTTILVTPTGVSKGFMEPDSL VKMDLDGAILDGGKPSSEVKMHLRAYQENPDITGVVHAHPASATSFSVMGMEMKKPVVAE AVLVTGNILIAPYAKPGTYDVPDSIAPFINRYNAVLLANHGALSWGKDLYQALYRMEALE HQAGILFRTLILAHITGRDVPYLSGEALAGLVQIRDEMGILTGGIPEN >gi|157101657|gb|DS480667.1| GENE 19 19784 - 20380 545 198 aa, chain - ## HITS:1 COG:RSp0699 KEGG:ns NR:ns ## COG: RSp0699 COG1186 # Protein_GI_number: 17548920 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Ralstonia solanacearum # 3 193 4 202 207 131 40.0 7e-31 MRVQISSGQGPAECELAVAMLYEELRKEAGDVRLVGCTPGKKPGCMSSVVFETDRDLREL EGSVQWICKSPYRPGHRRKNWYVDVSILDQVPRISEERMVRFETFRSGGKGGQHVNKVET GVRAIHIPTGTAVVSTQARSQHMNKQIAMDRLCSILAEMNARNQQKEKSLAWMEHARLER GNPVRIYEGMKFERSRKH >gi|157101657|gb|DS480667.1| GENE 20 20391 - 21419 637 342 aa, chain - ## HITS:1 COG:PA5471 KEGG:ns NR:ns ## COG: PA5471 COG1690 # Protein_GI_number: 15600664 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pseudomonas aeruginosa # 1 341 22 372 379 269 45.0 5e-72 
MEQAAKDQLLAVSRLPGVVKAVGLPDLHPGRIPVGTAVLSRGVLYPHLLGNDIGCGMSLF DTGVKKKKFKQEKWVSRLEAVRELEDIPFSSPYEEECPIRDLGTLGGGNHFAEFQCVERI YDQEAAGSLGLCADRILLLVHCGSRGYGQEILSRFWVPEGLADGSEQAEAYMAEHDRALR WAVRNRRAAAQKLLAWLGGASEPELLMDSCHNYLERTKDGLLHRKGSVSASKGAVVIPGS RGSLTYVCIPREDTAVSLNSLSHGAGRKWARSICKSRIDQKYDRDSIRCTGLKSRVVCHD TNLLFAEAPEAYKNVEQVMGVLQEYGLIDIAATLRPLITFKG >gi|157101657|gb|DS480667.1| GENE 21 21919 - 24066 1754 715 aa, chain + ## HITS:1 COG:SP1647 KEGG:ns NR:ns ## COG: SP1647 COG3590 # Protein_GI_number: 15901483 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted metalloendopeptidase # Organism: Streptococcus pneumoniae TIGR4 # 82 715 4 630 630 304 32.0 5e-82 MKKSKSPAEQITAARAVSPRKALTLTAAVLAAAITATGCSGTGRTETEPESQAEITAGPA VNQPESPERPHGGQTVQGPAAYVDDFYDAVNHDTLNGWEIPAEQADMSWFRKAREDNYSK VNDLIRQASSEAGQAEQEAGSDLYNIRALNLTGLDRETRDREGYGRTAGAFLKEVDSAGS VSELLKACLQFQRDTGLFSLMGWYYEGDSGDSSVKVLYLSQPDSGLRREVWFSEDASNQK RVEEYKKYLTKLHENNGLSPEEAGVTVARVTDMMKDLASSTLKIEEVYDPEKTYNVCTAG EAANLYSSALPFRLLNSIFGIQEGEKTVVSQPDACRKLGSYLKEENLPLLKEYVKTCLYS DLSMMTDTASLDAAQEYQTAAGGIEKKKDFERTVSETVQEKLGFQCGRVFCETYFNEDAK QDVASIIRQIIDVYDNRLAHMEWMTESTRQEARKKLKAITVKVGYPDEWPQDKLNLVLKS PDQGGVYIDNIMEILKASQDYSFKTRHDPVDRNEWSMTPQTVNAYYNPGNNEIVFPAGIL QAPFYDPQAEPVANLGGIGAVIGHEITHAFDTSGAQYDEKGNLRDWWTAQDKQHFRELAQ KVIDYYDTMEVNGIQVNGTLSVTENIADLGGVSCITEIAREKGYDLKELYHAYGVIWATK YRDEYLSYIMTNDTHSPGITRVNAVLSATDDFYTAFGVREGDGMYRRPEDRPHIW >gi|157101657|gb|DS480667.1| GENE 22 24141 - 26177 1968 678 aa, chain - ## HITS:1 COG:TM0963 KEGG:ns NR:ns ## COG: TM0963 COG1164 # Protein_GI_number: 15643723 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Thermotoga maritima # 174 574 106 518 547 108 24.0 3e-23 MKNGMWKRSLRRGAALVLALVMSLTITSCSASQGSLTARERAREPEYVSKAGGKVHEAVP YSRRPYEHYDPAAMEQAMADFEQACASQGREEEVLRLYDAIVDEYDRLATLTYMAQLNYD RDVSDEQAAAEQAYTTDIYSEMGDKAAACLKKGMNSSYKELLTDKMGIEYAPYIEYYREN SGELTELNRREQELLREYDKLAAGDFTVELEGGQWSYSRLEREPDLDDGQYAEIEDALVR 
EKNRVLGSKFRELVQVRRRTAELKGYDNYARYSFEAVYGRDYTLQDAEGLCSDVKTRIVP LNNDIWYMDVAQESYDSLDLVEESSAQDILAGVGRAVGSVNPELGEIFRYMQDNELYDIQ SGEDGQDRMDNNYTVGLPSYGDAFIFINRNHTFTDYQSLIHEFGHFSSYYYNSVPELFQG YSVDVCEVQSQGLEMLVNQYAGDMFGEGAEAFEFETVTDMLYVIIMSCMLQEFEEAVYMD PDMSLEDMNRTFKEIQDSYHGWYFDVYDEGVCYDWVDVSHLFYSPLYYMGYGTSALSALD LWTMSRRDWDGAVDTYMGLLNEGLDAPYRDTMYRCGLRDVFDSSELEALAGDVRRIQGLD QDGEGAEDPSQENPSQDIPSQEDPDQAGTNSEGSSGSGLRAAGFLVLAGICVILVFQVMI LCTGFVIIWLLVRKKREE >gi|157101657|gb|DS480667.1| GENE 23 26286 - 27851 1292 521 aa, chain - ## HITS:1 COG:lin2791 KEGG:ns NR:ns ## COG: lin2791 COG1409 # Protein_GI_number: 16801852 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Listeria innocua # 102 512 32 443 443 232 33.0 1e-60 MKRKTWMISALMYAMVFMACFGVLHVLKTLMEPSGPVSVESRESGGPEGTREAKTGEEWE EPAGPELATDSQAQEPQDEVIRQEGYKEWKKKFPGMWTPPQAPEEPYIPPRLILATDLHY QSALAGDGGEAFRLFVERSDGKVVQYLPELLEAFLDEVIEERPSALVLSGDITMNGERLN HEELAARLKRVQDAGIQVLVIPGNHDINNHDASVYYGARKEPAPYIDGPEFHEIYHEYGY DQALSRDDSSLSYVYALDDRNWLLMLDSCQYEPDNKVEGRLKDSTMAWADEQLARAKEEG VFVIPIAHHNLLAQSRMYTTQCAMDNNKEVISLLQKYELPLFFSGHLHVQRIRKHKSEPG VPEGNYGIQEIVTDALSIPPCQYGEVLWEEDGSISYATRAVDVSAWAQKNGSTNPDLLDF EDWSYRYIQKLISDQIRGVVKNLGADVEHSMAATYAGVYIDYYAGREIDVKGIKSTKGYR FWERNMPDSYLIRELDAMIADSDRDNNYFLLPEEEGWITGD >gi|157101657|gb|DS480667.1| GENE 24 27886 - 28695 834 269 aa, chain - ## HITS:1 COG:CAC2244 KEGG:ns NR:ns ## COG: CAC2244 COG0561 # Protein_GI_number: 15895512 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 11 269 4 266 266 174 36.0 1e-43 MCSEMMNNRTKALFYDIDGTLLSEKTGRVPESAVQALKEARAKGHLVFINTGRVYAHLGE IKKLVEADGYLCGCGTYILAEGRVMYSYHIPHERGLEIKGHIDACGLDGALEASDGIHIH RSQSPIPRVEQMKASLRRSDCVSEYDWEDDCYDYDKFYLISGENSRPKELFGRLRDMEII DRGNGEYECVPRGHSKATAIDLVLKHYGISLDDAYVFGDSGNDLAMFRYASNCILMGKHD EVLEPYATFETKDVEEDGIAYAMKELGII >gi|157101657|gb|DS480667.1| GENE 25 28673 - 29257 985 194 aa, chain - ## HITS:1 COG:BH1514 
KEGG:ns NR:ns ## COG: BH1514 COG0503 # Protein_GI_number: 15614077 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Bacillus halodurans # 1 193 1 191 198 160 47.0 2e-39 MQILKDRIRKDGKIKAGNVLKVDSFLNHQMDIKLFGEIGKEFKRRFADVEVTKILTIEAS GIGIACIVAQYFDVPVVFAKKTQTKNIAGEVYTTKVESYTHGRVYDIIVSREFLGAGDKV LLIDDFLANGKALEGLAALVRDSGAELVGAGIVIEKGFQVGGDLLRSEGIRLESLAIVES MDEEAQTIVFRDDE >gi|157101657|gb|DS480667.1| GENE 26 29293 - 30492 1538 399 aa, chain - ## HITS:1 COG:CAC2641 KEGG:ns NR:ns ## COG: CAC2641 COG0544 # Protein_GI_number: 15895899 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Clostridium acetobutylicum # 35 289 116 372 431 136 33.0 8e-32 MKKLALICLCGLVAASIMTGCGASQTEGKENLGTVELSEYKGVKVNVPAVMVTDAEVESK INQVLSQNPKIEEVDRPAAEGDIVNIDYVGKQDGVEFAGGTGEGQDLTLGSGRMIDGFED GLIGTKKGDKKELNLTFPEDYSEKALAGQAVVFEVTVNAVKEKQSAVLDDAFVQSVSDFK TVDEYRASVKEDLLKQKQQSADLQIQQYVLKNVVENSKFKLNKNTLSRRYNDRIKQYEEQ AKMYGTNLSGMAKANGMDVPGLQEAVYASVKDDVKNQLVILKVAELEGITLEDADRQAFA EMNGQTLEAAVEYFGQETLDEMALNQKVMKFMADNAVNEAQDAAAPAEETTAAGETAAAE ETTEAETTAAETETSEETPAAEETTAPAETTAAETTAAQ >gi|157101657|gb|DS480667.1| GENE 27 30659 - 32053 1676 464 aa, chain - ## HITS:1 COG:CAC1091 KEGG:ns NR:ns ## COG: CAC1091 COG1362 # Protein_GI_number: 15894376 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 27 462 19 462 465 456 51.0 1e-128 MAKKTKGQELQEQLTWAFPHIGKDKPEQAEKAFKYCEGYKEFLDVGKTERECVKEAVKML KKAGYKPFDSGESYEPGDKVYYVNRGKSLIATTFGKKPLDQGLRINGAHIDSPRLDLKPN PLYEKEDIAFFKTHYYGGIRKYQWGTIPLAMHGVVVKKNGETVEVSIGEDEKEPVFCVTD LLPHLAARQNERPLKDGLKGEELNIVVGSLPFQDEEVKEPVKLMALSLLNEKYGITEKDF FRAEIEMVPAVKARDVGFDTSMIGAYGQDDRVCAYTALTAEIDAKKPAHTTVTILADKEE IGSTGNTGLNSDFVLHYIEDLAEQAGVSPREVLRKSLCLSSDVNAAFDPNFPDVYEGRNT 
SYINKGCVLTKYTGARGKSGSNDASAETMAKVIAIMEEEGVYWQAGELGAVDVGGGGTIA QFVAHMDVDTVDLGVPILSMHSPFELASKLDVYHTYKAFKAFYK >gi|157101657|gb|DS480667.1| GENE 28 32126 - 33100 696 324 aa, chain - ## HITS:1 COG:no KEGG:Closa_4147 NR:ns ## KEGG: Closa_4147 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 306 3 321 325 230 38.0 8e-59 MRILYEKTLDGVCLQRCYGLDKILEIPGTVDGMPVTEFAGYLFSDTVRRREPPPCEYLGE PELCGSRVEEIILPDTIRKVGAYGFYNCSGLKRLSCSSAVEDWGAGVFTGCTGLRCLDIR VAEGRKSCFKDILSELHQTLSVNYRSGSGTLLAKLIFPEFFEESVENTPARIIMREMHGC GHMYRYCFDGGDFRFDEYDRLFPHVKVQEKPELAVRLALYRLYWPCGLRESAEDEYWDYV RTHAGEGAKGLIERGERDILGFMARSARLGEDEIKKMIEAAAGSGDAQSSALLLDVKHRR LSPAGRNGQGEEAKRCGRKRTFEL >gi|157101657|gb|DS480667.1| GENE 29 33253 - 33942 761 229 aa, chain - ## HITS:1 COG:CAC3677 KEGG:ns NR:ns ## COG: CAC3677 COG0745 # Protein_GI_number: 15896909 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 229 3 232 232 261 56.0 1e-69 MNETTILVIEDDKSIQNFMKISLKTRGYAYILADNGLSGISLFYADHPDLILLDLGLPDI DGMEVLRQIRQKSDVPIIVVSARGQEREKISALDQGADDYVTKPFSAEELLARIRVGLRH KSSTMNHQEQEFCLDYFRLDFEKRKLYVHDQEVHLTPLEYKMMVLLVNNSGKVLTHHYIQ QEVWGYDTTDDYQSLRVFMANIRRKIEDNTANPRFILTEVGVGYRFVDE >gi|157101657|gb|DS480667.1| GENE 30 33935 - 35416 1652 493 aa, chain - ## HITS:1 COG:CAC3678 KEGG:ns NR:ns ## COG: CAC3678 COG2205 # Protein_GI_number: 15896910 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Clostridium acetobutylicum # 12 488 403 899 900 312 35.0 8e-85 MLYSWDWVLDFVKACAKMAVIWVISTILAIVLDSCGIRAENLLLVYLLGVLISILATSSL AWALCAAIVFTFTFNYLFTEPKLTFHMDDVNYVVSSIVFVAAASIVATLVVKLQKQMQIA NKKREITARMNEIGSGFLNLSGYQQIKEYSEESLSNLTGKPVSVLLKEDEKKEFPNYMAE WCYRQSIPCGHGECQFPDDNGLYVPIRNREKTYGVIIFDCEGWTLSEEEKIYVDTVISQL TLVIERELLSREKENTRIQVERERLKSTLLRSISHDLRTPLTGITGSAGFLLDNLGIMDE 
ATIKSMLKDICSDSEWLSTMVENLLNLTRIQEGRLDINKKKEVVDDLVASAVRLVSNRVG NHTLKMETPEDILLVSVDGRLFIQVLVNLLDNAFRHSGTGTTVTLRVKQDGNCLKFVVSD NGIGIPNDKIDKIFDNFFTTAYENGDKQRGVGLGLTICKAMVEAQGGTIRAFNSPQGGAV FEVAMPMEDRKDE >gi|157101657|gb|DS480667.1| GENE 31 35425 - 35841 535 138 aa, chain - ## HITS:1 COG:TM1088 KEGG:ns NR:ns ## COG: TM1088 COG0569 # Protein_GI_number: 15644627 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Thermotoga maritima # 3 129 1 127 218 71 29.0 4e-13 MSMKIAVALGGEYADFLIGLLMEKKYKVTVIDPDKAFCEHLCASYNVNAVLGDPCRQFIL EEAGIRNYDVILALGREDTDNFEICQMGRKVLGIKRSVCLVHNPRNAALFEELGVDRAVN LPMILAQAILGQKQEEGE >gi|157101657|gb|DS480667.1| GENE 32 35838 - 36275 535 145 aa, chain - ## HITS:1 COG:Rv2691 KEGG:ns NR:ns ## COG: Rv2691 COG0569 # Protein_GI_number: 15609828 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Mycobacterium tuberculosis H37Rv # 9 130 3 124 227 79 34.0 2e-15 MLFHKNKCIVIAGCGPLGAALAKSLYKKGHKVVVLDKDRESFRYLPEEFRGYEMEGDPTD PQVLKAAGIEDAGLFIASTQDDNTNLLLAQIASRIFHVSQVYARLDDTGRHQLVTDLPVK PISLYDLSADDYGRLAETVSREVAV >gi|157101657|gb|DS480667.1| GENE 33 36461 - 36658 301 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935148|ref|ZP_02082531.1| ## NR: gi|160935148|ref|ZP_02082531.1| hypothetical protein CLOBOL_00043 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00043 [Clostridium bolteae ATCC BAA-613] # 1 65 1 65 65 104 100.0 2e-21 MTAKMIFKLKSPVGQKIDNLCERFPLISSVIVQLAAAIFMVGVVTSVGLIGGTVIWLFYK AFGVM >gi|157101657|gb|DS480667.1| GENE 34 36739 - 38733 2116 664 aa, chain - ## HITS:1 COG:pli0050 KEGG:ns NR:ns ## COG: pli0050 COG2205 # Protein_GI_number: 18450332 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 24 655 13 639 888 519 42.0 1e-147 MDGQRDERPDPELLLKQNFPRESDKARGRLKIFFGYAAGVGKTYAMLEAARQEKAAGRDV 
VAGYIEPHDRPDTMELLAGMELLTPMAVTYKGIILKEFDLDAALMRRPDLILVDELAHTN AQGCRHKKRYQDVIELLDAGINVYTTVNVQHLESLHDIVASITHVRVYERIPDDIFDDAG QVELVDIAPEELRQRLEDGKIYHKDQADKASVRFFTVENLSALREIALRRTADRVNREVV RERDAAGKDYYTGEHVLVCLSPSPTNARVIRTAARMAGAFHADFTAVLVETADMKREHAK ISRSMEANINLAKQFRANVITLYGDDIVQQIAGYARRNGISKIVIGRTVRKRSFLHLKST IIDRLIKAVPNMDVYVIPDAAARENGKIKKPENGRKYSMDILKTGLIMAGCTFLSMVLDH YKLGIINVAMIYMMGTMLAAYVTRAGTYGFLASVASVLLFNFFFTPPRYTFQADGSIYPF TFAAMLVCALITSSMAARLKREAEKSEAESGYMQILFGINRNLRKGTTREEILDICGRLV QSLLKRNIVIYDMLDGKMNRMKIYPASEREDASSLTSLFKNPDEMAVAAWVYKNHHKAGA TTDTLPGARARYTPVYKDDTVYAVIGVEMKGGDIIKPNDKNVLNIIAEETSYRLKDMSGP RICR >gi|157101657|gb|DS480667.1| GENE 35 38827 - 39432 792 201 aa, chain - ## HITS:1 COG:CAC3680 KEGG:ns NR:ns ## COG: CAC3680 COG2156 # Protein_GI_number: 15896912 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, c chain # Organism: Clostridium acetobutylicum # 1 201 2 198 205 139 38.0 3e-33 MKTVKSVLPKMAGFFLVMLIITSVIYPAVITAAANVAFGHEANGSVIVGKDGTKYGSALL AQEFTGNQYLWGRVMNVDTGTFTDENGEPLAYAGASNKTPAGSELEAMIAERVEKIRAAH PEKGEEPIPVDLVTCSGSGLDPEISPAAAGYQAERIARERNMAVEDVEAVISRYTTGRFL GIFGEPRVNVLKVNLALDGLI >gi|157101657|gb|DS480667.1| GENE 36 39488 - 41560 2272 690 aa, chain - ## HITS:1 COG:lin2829 KEGG:ns NR:ns ## COG: lin2829 COG2216 # Protein_GI_number: 16801889 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity K+ transport system, ATPase chain B # Organism: Listeria innocua # 18 689 10 681 681 907 72.0 0 MNEQKKSIFSNHEIMCRALKDSFIKLNPRTQIKNPVMFMVFVSAILTFVMFLFSLAGIRD AAPGYILAISVILWFTVLFGNFAEAIAEGRGKAQADALRAGRKDTMASRIPSADRKSEAV RVKSTELKKGDLVYIKAGEQIPADGDVVEGAASVDESAITGESAPVIRESGGDRSAVTGG TTVISDWIVVRVSSNPGEGFMDKMIAMVEGASRKKTPNELALQIFLVALSIIFVLVTMSL YSYSVFSSVEAGAANPTSITNLVALLICLAPTTIGALLSAIGIAGMSRLNQANVLAMSGR AIEAAGDVDTLLLDKTGTITLGNRQADAFIPVDGHKEMELADAAQLSSLADETPEGRSIV VLAKERFGIRARNMEELQASFVPFTAKTRMSGIDYNGNEIRKGAADAIKAYVDRHGGAFS RECEDIVKRIANQGGTPLVVAKNGRVMGVVELKDIVKQGVKEKFADLRKMGIRTIMITGD 
NPLTAASIAAEAGVDDFLAEATPEAKLDMIRDYQARGHLVAMTGDGTNDAPALAQADVAV AMNTGTQAAKEAGNMVDLDSSPTKLIEIVKIGKQLLMTRGSLTTFSIANDVAKYFAIIPA LFMGLYPGLSALNIMGLHSANSAVFSAIIYNALIIVALIPLALKGVRYREVSAARMLSGN LLVYGLGGIIIPFIAIKAIDVLITAAGLVV >gi|157101657|gb|DS480667.1| GENE 37 41576 - 43327 1966 583 aa, chain - ## HITS:1 COG:RSc3382 KEGG:ns NR:ns ## COG: RSc3382 COG2060 # Protein_GI_number: 17548099 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, A chain # Organism: Ralstonia solanacearum # 6 577 6 586 590 546 52.0 1e-155 MMNVALQMIIYCVILVVLAIPLGSYMGKVMNGERVFLSRLLLPVENLIYRVLRIDKEEEM GWKKYSVCTAVFSVMSLAVLWIILCFQNLLPLNPQGMGKMSWHLGFNTASSFTTNTNWQA YSGESSLSYFSQMIGLNFQNFVSAAVGIVVLFALIRGFVRVKERGLGNFYTDMTRTVLYI LIPLSVVVSLAIASQGVPQTFKQYDEVQLVEPVVVEQEDGTAAEVTKGVVPLGPAASQIA IKQLGTNGGGFWGNNSAHPFENPTPLSNLFEMISLLLIPAGLCFTFGRNVKDKRQGVAVF LAMFIMLAAALAIVGCSEQAGTPQIAQDGAVYMGTEGQAGGNMEGKETRFGIATSGTWAV FTTAASNGSVNSMHDSYTPFGGLVPMLLMQLGEVVFGGTGCGLYGMLGFAILTVFIAGLM VGRTPEYLCKKIEPFEMKMAVLVCLATPVAILVGSGIATLLPSTHDSLNNPGAHGLSEVL YAYSSAGGNNGSAFAGFNANTPFLNVSIGLIMLFVRFLPMFATLAIAGSLAAKKKVAVSS GTLPTHNAMFIFLLIFVVLLVGALSFFPALSLGPVAEFFQMIA >gi|157101657|gb|DS480667.1| GENE 38 43324 - 43413 141 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MALLLGIIGITAVIILAYLVIILLRGDKQ >gi|157101657|gb|DS480667.1| GENE 39 43621 - 44562 1004 313 aa, chain - ## HITS:1 COG:Cgl1931 KEGG:ns NR:ns ## COG: Cgl1931 COG1957 # Protein_GI_number: 19553181 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Corynebacterium glutamicum # 2 309 4 307 316 198 39.0 1e-50 MKVLLDVDTGVDDSIALLYALFNPEIEIVGISAVCGNVEAWLAAENTMKILDLAGAPDIP VAVGAEKPSCREWDGRVAFIHGKNGLGNVELPPSRRSTRDVDVSRFHMDLAEQYEGELVV ITLGPLTNIARTIREYPGFVHKVKGLVMMGGTLTMRGNVSPVAEANVACDPQACDQVFTS GMDITVVGLDVTMRTRLKMEHLDWLSGCCKPACRPAVDYMRQAIVHYLRGNQTQNYCMGD CPLHDPLAVMCAVMPSLVRTESRKARVECGGTYCRGMIVTDLREHPFQAEYVRFAVEVDS ERAVRELMSVFWE >gi|157101657|gb|DS480667.1| GENE 40 44585 - 45355 725 256 aa, chain - 
## HITS:1 COG:SA1940 KEGG:ns NR:ns ## COG: SA1940 COG0813 # Protein_GI_number: 15927712 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Staphylococcus aureus N315 # 3 233 5 235 236 231 51.0 1e-60 MKTFHNEAGPGQIGKAVLMPGDPLRAKYIAETYLTDVFCYNRVRGMNGYTGYYKGKKLSV QGSGMGIPSMGIYAYELYHNYDVDTIIRVGTAGAIHRNVEVGDLVLAMGCCTNSNFPSQY HLPGTFAPIADFGLLEAAAAHARNSHTSFRAGNVYSSDVFYEDCQGEKGQWEKMGVLVQE METLSLYCTAARAGRRALSMVTVGSSIHHDRALTNEEREQSLDSMIRCALELAAEAAEAA EEAGTADRILAPGYTA >gi|157101657|gb|DS480667.1| GENE 41 45362 - 46129 859 255 aa, chain - ## HITS:1 COG:SPy1870 KEGG:ns NR:ns ## COG: SPy1870 COG2188 # Protein_GI_number: 15675689 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 16 246 15 246 247 89 28.0 6e-18 MKDNVINEKNVKIPLYVTAYETISQWLKEGKYKAGDKLPGENILAEQLAVSRGTLRQAML LLQEDGLIINHQGKGNIVLSNQDMSKSGLEKVGNPIVDFCSRPIDKVVTAIGFQPATQKH QQVLKLRPSSVVAVIDITYYCGDTPAGFAMVYMPHEVLDNSDVDLENPDTVYQYYTGLLS EGGLYSDAKIRLGQARERLANILNMQEGDPILILEEVLYSEYDTPVLSQKLFFGSEQHEL SIRRKNDRNIIKKSE >gi|157101657|gb|DS480667.1| GENE 42 46200 - 47105 1211 301 aa, chain - ## HITS:1 COG:alr5368 KEGG:ns NR:ns ## COG: alr5368 COG1079 # Protein_GI_number: 17232860 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 1 297 4 301 312 189 35.0 4e-48 MDMINNFIGLTLQLSVPIILGALCGTIAERGGIILLGVEGIMLIGAFAGAAGSYAAGSAM AGVLLSVVLGGMIGWFYALFCLKWKAHQSVVGVGINLFASGITAVMLKAIWQTEGMSESV PSISNLTIPGLSGIPVIGALFKDQSPYLYAMFLIVAAVWIVFYKTKFGLRYRAIGDQPYA VQTAGVQVNRYRYIAMIAAGSIAGLAGSYLSISQNNLFVADMTAGRGFMGLAANIFGGWQ PLGSMGAGMIFAVAQAARFYLTDASVPSQFVQMLPYGVTLLVLLFVGKRVKGPEALGKLV D >gi|157101657|gb|DS480667.1| GENE 43 47109 - 48212 1168 367 aa, chain - ## HITS:1 COG:TM0104 KEGG:ns NR:ns ## COG: TM0104 COG4603 # Protein_GI_number: 15642879 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Thermotoga maritima # 42 361 14 334 344 197 38.0 2e-50 MKWNSNKNAQADGGDRTGEKARTVGKICGGRHTAEIGKSLASIGLALLIGALFILLAGES PVKAYGSLIQGALGTPQSLANTISKSIPLAFTGLAVAMGSRCGMLNIGAEGQLHAGAMAS VITALYFSALPAPLLLVISILAGIAAGMLVGSVPGIFKARFSTNEVIVAIMLNYICTLFT SYLVNGPFKTEGSTAQTELIAEGIWFGKLVPRTQLTYALFLLLFVAAAMYIFLWKTSVGY QLRAVGANPSASGTAGIRVNMFLIMSMTISGGIAALAGITEVFGKYHRFIEGFSPSFGFT GIAVAILGRNHPAGVLLTALLFGIMDMGSLRMSRETMVSTNMVTVIQSLVILLVAAPELI KWSRKRG >gi|157101657|gb|DS480667.1| GENE 44 48193 - 49713 1779 506 aa, chain - ## HITS:1 COG:BS_yufO KEGG:ns NR:ns ## COG: BS_yufO COG3845 # Protein_GI_number: 16080207 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Bacillus subtilis # 4 497 5 503 510 449 50.0 1e-126 MDAIRMIGIVKCFGPVRANDGIDLIVGQQEIHCLLGENGTGKSTLMNILFGLYHPDAGEI FINEKKAAIANPNDAYALGIGMVHQHFMLVNQLTVLENIIMGKESGGLFLNRKESRKKVE ELVERFGFRIDLQEKVVNLSVGMKQRVEILKTLYRGADIIILDEPTAVLTPQEVDELFKI LRQLKENGKTIIFITHKLNETMSLSDRITVIRKGKVVFRCDTSSTNEKELATQMVGRQVE NIVAKRGQKTGQAVLELKGVRLHERAGETVSLKVRAGEILGIAGVDGNGQQELEEMIVGN RRVREGEIFINGIPVQNMAVKDRKAMGLGYIPSDRHKNAMIPSFSITENFLLGYQETPEY CRKGFIDYERLRQDAEKQVEEFEIKVAGVDQEIGQLSGGNQQKVILGREISHDPGLVVVA QPVRGLDIGAIERVHKTLLQLKEQGKAILLISAELSEVMNLSDRIAVFYEGEVSARFDNG EYTKEEIGLFMAGKKQEVRAHEMEQQ >gi|157101657|gb|DS480667.1| GENE 45 49744 - 50829 1316 361 
aa, chain - ## HITS:1 COG:VNG0903C KEGG:ns NR:ns ## COG: VNG0903C COG1744 # Protein_GI_number: 15790034 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Halobacterium sp. NRC-1 # 57 359 37 348 349 199 38.0 9e-51 MKKRIRQWAALCAAVGIAGSAVMGCGSSASKPDAGSGQTSREAADGTGTGSGGGAHIGII FTEAGLGGNSFNDLALEGVKKAAADYGITYDEVEPKSVSDEEIIQDEMAASGDYDLIICV GAEQVDALTNVASTYPEQRFALLDATSDLPNVASYSCKEQEGAFLAGALAALAKQEKTDD KLGDGKTIGFIGGVDNPLINRFAAGYKAGAEYIEPEMKVLVDYAGGFNDPTTAKTIANTF VEKGADVIFHAAGASGMGMFQAAEEKGFAAIGVNLNQNSIAPDYIMASMLKKLDSCAYHA IASIVEDTYTGENQMLGLSDGGVDVTVEGSNIKVSDDILARLDELKQKIISGEIQVPSEL N >gi|157101657|gb|DS480667.1| GENE 46 51729 - 52211 539 160 aa, chain - ## HITS:1 COG:mlr1925 KEGG:ns NR:ns ## COG: mlr1925 COG2080 # Protein_GI_number: 13471825 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Mesorhizobium loti # 5 153 6 154 157 139 46.0 2e-33 MLKVIKLKINGKEQETAVDDRESLLDTLRRLGFTSVKKGCEVGECGACTVLVNGEAIDSC IYLSMWADGKSIMTVEDLKGPNGELSPIQKAFIEEAAVQCGFCTPGLIMSAVEIVGTGKK YNREELKKLISGHLCRCTGYENILNAMERIVEETYRVVGE >gi|157101657|gb|DS480667.1| GENE 47 52204 - 53094 894 296 aa, chain - ## HITS:1 COG:ygeT KEGG:ns NR:ns ## COG: ygeT COG1319 # Protein_GI_number: 16130769 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli K12 # 1 293 1 290 292 256 42.0 5e-68 MFDIKSFYEADSVEDAIAALTADPQSQVISGGTDVLIRVREGKDAGRGLVSIHNIPELKG VSLEEDGTIIIRPATSFSHITNDPIIKKHLSMLGEAVDQVGGPQVRNTATIGGNICNGAT SADSASTMCALNATVVLKGPEGVREVPVTEFYTGPGRTVRKQNEVCTAFKITRENYEGWE GHYIKYGKRRAMEIATLGCAVRVKLSPDKTVLEDVRLAYGVAAPTPVRCYEAEEALRGKQ VSDATIYDLFADKALSQVNPRTSWRASREFRLQLIGELARRALKTSITLAGGKADA >gi|157101657|gb|DS480667.1| GENE 48 53150 - 55480 2230 776 aa, chain - ## HITS:1 COG:ygeS KEGG:ns NR:ns ## COG: ygeS 
COG1529 # Protein_GI_number: 16130768 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli K12 # 11 776 2 752 752 663 45.0 0 MNNIVGKGVTRVDAYAKVTGEAKYTSDLEPRDCLCARVVHSTIANGLVKRFDLEDALKVP GVVKIVTCFDVPDYQFPTAGHPWSVEAAHQDIGDRKLLNRRVRLYGDDIAAVIASDDVAC QQAARLIKVEYEEYKPMVTVEQAMAEDATPLHPDLRKDNVIVHSHLTMGPEDFTFEEGLR EAEEKYGKENLVVLEREYHTPRISHCHIELPVSWAYQEPGGKITVTSSTQIPHIVRRVIS QALGLPVGKIRVIKPYIGGGFGNKQDVLYEPLNAYLTTVVGGRPVRLEISREETIYGTRT RHAIDGKCRALATKDGKMLARKLEAYANNGGYASHGHAICANCGNVFKDIYQDRLGAEID CYTVWTSTATAGAMRGYGIPQAAFLSECLTDDVCRAIGADPLAFRMENCMGEGFVDPANG ITFHSYGLKKCMEEGAKYIGWEEKRRAYANQTGPVRRGVGMAIFCYKTGVYPISLETASA RMILNQDGSMQLQLGATEIGQGADTAFSQMAAETTGISLGKVYIVSTQDTDTAPFDTGAY ASRQTYVTGMAVKKCAQQLKARILDYAGFMLGGRPAEEMDVAHDHIVEKATGKELLDMET LALTAFYSLERSEHITAEVTNQCRDNSFSSGCCFVEVEVDMPLGQVTVKDIVNVHDSGML INPELAEAQVHGGMSMGLGYGLSEEFLYNDAGRPLNDNLLDYKIPTAMDTPDLHVKFIQL EDPTGPYGNKSLGEPPAIPVAPALRNAVLHATGVAMDTIPMTPQRLIEAFQAKGLI >gi|157101657|gb|DS480667.1| GENE 49 55516 - 56268 983 250 aa, chain - ## HITS:1 COG:no KEGG:Closa_2900 NR:ns ## KEGG: Closa_2900 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 84 248 14 178 181 219 83.0 9e-56 MVFWMIVMAANLVVGAFVGLTGVAGFLLPIVYTSPLGMGVTEGLALSFAAFIVSGALGSV NYNKAGNLDIPFGIRLSLGSLAGAVLGVKLNLIIPESTVKVILYVVVLLSGISILLRKDK SQEESGRAYVISDHLAATLILGFVTGAVCSLSGAGGPVLVMPLLVVLGIGIRTAVGVALF NSVFIGIPACIGYMMQCSVKDLLPVMAAALVFHGIGVVYGSRNAVKINQILLKKGIAVFS ILIAIWKLAL >gi|157101657|gb|DS480667.1| GENE 50 56308 - 57624 1478 438 aa, chain - ## HITS:1 COG:MA1276 KEGG:ns NR:ns ## COG: MA1276 COG0402 # Protein_GI_number: 20090140 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Methanosarcina acetivorans str.C2A # 26 433 36 435 442 187 33.0 3e-47 
MQTCDLLIKDGSLLIKYDQLEEHVDMAVSGGRILETGKELWKKYQAVETINGKGKLFMPG LIDSHMHTGQQLLKGLVLDAKPIIWTRIMLPFESTLTPDKMRLCAQAAALEMIKCGTAGF IDAGSYHMETAASVYEESGLRGALSYSTMDEEGLPESIAMDAEEALKRTDSLYDRFHGKG NLKVYYSLRALNSCSSRLVELEAMRARERGTMLQAHMNEYMGEINGIVDREGMRPYEYLE KMHVLGSNFLGAHSLILSEQEKSLVMERGVKTCHCPFSNCGKAVPDTPQLLEMGIPVGLG SDGAAHGGLSLWNEMKIFRSVMNIYHGVPNGNPKVMPAETILKIVLEGGAAALNEEGSLG RLEAGYKADIISINMDQPHLCPTGNKIHTLFECVNAGDVEDMIVGGRILMKNREVLSLDE ERIMYESRKYMEENADTF >gi|157101657|gb|DS480667.1| GENE 51 57695 - 58141 641 148 aa, chain - ## HITS:1 COG:no KEGG:LC705_00064 NR:ns ## KEGG: LC705_00064 # Name: not_defined # Def: hypothetical protein # Organism: L.rhamnosus_Lc705 # Pathway: not_defined # 1 148 1 148 148 140 60.0 1e-32 MRYIEWLVLFVLAACGTGLANLIGYGVGITDSIPGLAVLIVISMAAVVLTKVLPLKLPIV AYCSIIGLLSASPISPVRDFVIQAASNINFTAPFTMVGAFAGLAISDQLKTFISQGWKIL IVGIFVMTGTFLGSCLWDQMLLSLAGVI >gi|157101657|gb|DS480667.1| GENE 52 58138 - 59034 1035 298 aa, chain - ## HITS:1 COG:no KEGG:Closa_2902 NR:ns ## KEGG: Closa_2902 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 10 298 1 286 286 483 91.0 1e-135 MAGWKGGALMSGKKETYVYGSLKERFALEWKIYALAFGFILIADNIGQIKIPVGKGMFIL FPIFYAIILGVLSGPQVLKIVDNRHVKAASKLVVVGICPFIAKLGITAGANIDTILQSGP VLLLHGFGNLLGPLLALPVAILLGMKREAIGSCHSINREYHMALINNIYGPDSAEARGSL SIYIVGGMVGTIYFGLMASVVGMTGWFHPQALGLGSGVGAGIMMASSSASLCAIYPEWTD TISAMASVGETIAGITGMYITMFIAIPMTDKLYTILEPKLGRFALKNKEGRDIKEETV >gi|157101657|gb|DS480667.1| GENE 53 59211 - 59954 857 247 aa, chain + ## HITS:1 COG:no KEGG:Closa_2903 NR:ns ## KEGG: Closa_2903 # Name: not_defined # Def: Crp/Fnr family transcriptional regulator # Organism: C.saccharolyticum # Pathway: Two-component system [PATH:csh02020] # 1 247 1 235 235 377 75.0 1e-103 MTLTELEAAAPALKEYTKNMPEDIRSRCTVRTHAAGSIIHQKNMELGYFGIVAKGENRVI NEFENGNVYMIESNKAIDFIGEVTILAGMSHTSVTIEAVTDNVVAYISRRDAERWLASDI NILNLAARHTAFKLYRSSYNNGAKLFYPPSYLLLDYMVKYGRQNGMESSRGMGSSRGMGT SRAPASVTVLRTRQMLQEEIGVNVKTLNRTIRQLKEEGFFSICKGKITFTREQYEAAIEW LEAEKDK 
>gi|157101657|gb|DS480667.1| GENE 54 60075 - 62048 1750 657 aa, chain + ## HITS:1 COG:CAC0006 KEGG:ns NR:ns ## COG: CAC0006 COG0187 # Protein_GI_number: 15893304 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Clostridium acetobutylicum # 7 657 1 637 637 694 53.0 0 MNQTSALSPVTEDYGAAQIQILEGLEAVRKRPSMYIGSTSSQGLHHMVYEIVDNAVDEAL AGHCTLIQVYLNKDGSVTVRDNGRGIPAGIQTKTGLSAIQVVFTVLHAGGKFDDSNYKVS GGLHGVGASVVNALSSHLHVQVDQDGFSYAQSYERGIPAAPVKQLGPCSQGTHGTTVTFM PDQDIFPSIAWDRNILEERLRETAFLTQGLEISLYDLRRDPAGEATCPVTAWSRTFCFQN GLKDFVSWLEQDGEPLYTGIISGRHETDKVQAEFALVHNSSYSEQLLSFANNISTPEGGT HVTGFRQALSRAVNDYARNAKLLKDTDSSLSRDELSEGLTAVLSVRLQDAQFEGQTKQKL GSSHVRPVVEQMVYTRLTLFLEQNPAAAKAICDKAMLARKARTAARTAREMARRKTALSG LSLPGKLADCSSKNPGECELFIVEGDSAGGSAKLARSRENQAVLPLRGKILNVEKASLDR ISANAEIKAMISAFGTGILDTFSLENLRYNKIILMTDADVDGAHITTLLLTFLFRFMPKL IEEGHVFLAQPPLYRVEKGKKLWYAYSDSELEQTINAIGRDGTYKIQRYKGLGEMDAGQL WETTMDPSRRTLLRITINDMEMYHETCQTFSILMGDEVEPRKNFILEHAQYISMLDI >gi|157101657|gb|DS480667.1| GENE 55 62066 - 63001 631 311 aa, chain + ## HITS:1 COG:ECs0391 KEGG:ns NR:ns ## COG: ECs0391 COG0583 # Protein_GI_number: 15829645 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 6 280 6 269 299 63 23.0 6e-10 MNRDDLQLFMTILEENNLIKAAEQLYLSPSTAGSRLRAMEEELGYPLFERRKGVKSALPT SRGLAFSHIASQMLALWNEADQIGGMSDSPFLTIATVDSFLDYNLTPFYRELISLHGFSL DIKCYPADMIYSLVSSKQADVGFALYDISTPHVNVTPLSEDDMVLVVPEGSGLIPAEDCP AQPVHPSVLPPDRELFTGSSANSNIGWGPQFKLWHDRYIRSGSRPLVTATSISILNDFLE ARDYWTIMPSTTAMGLKKQYPVRILPLSPAPPRRTLYMLKHNTPSRISLENMTLFTGYLN EYLDRKNSLIL >gi|157101657|gb|DS480667.1| GENE 56 63020 - 63895 842 291 aa, chain - ## HITS:1 COG:AGl260 KEGG:ns NR:ns ## COG: AGl260 COG1082 # Protein_GI_number: 15890243 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 290 1 288 289 307 50.0 
2e-83 MKYGIYFAYWTKEWFADYKKYMDKVSALGFDVLEISCAALRDVYTTKEQLIELREYAKEK GLVLTAGYGPTKAENLCSEDPEAVRRAMTFFKDLLPKLQLMDIHILGGGLYSYWPVDFTI NNDKQGDRARAVRNLRELSKTAEECDVVLGMEVLNRYEGYILNTCEEAIDFVDEIGSSHV KIMLDTFHMNIEETNMADAIRKAGDRLGHLHLGEQNRLVPGKGSLPWAEIGQALRDINYQ GAAVMEPFVMQGGTIGSEIKVWRDMVPDLSEEALDRDAKGALEFCRHVFGI >gi|157101657|gb|DS480667.1| GENE 57 63912 - 64850 812 312 aa, chain - ## HITS:1 COG:DR0403 KEGG:ns NR:ns ## COG: DR0403 COG1957 # Protein_GI_number: 15805430 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Deinococcus radiodurans # 6 310 7 308 314 98 28.0 2e-20 MGREKILLDTDIGTEVDDAITLAYLLANPDAEIMGITTVTGEAAERARLASVLCRIAGRE DIPIYPGCENPIFVPQIELYAPQKKVLSRWEHQKDFPANRAVAFMQDAIRRNPGEITLVC IAPLTNVGLLFAMDPEIASLLKRIVLMCGSPTYTRYDNTGETMSAMEKSDVIVLGSKGVI ENNALIDPHATKIVYGADVREHITCGLNVTSQLIMKPEEAEGLFRHPILEPVLDIAREWF KDEERVTFHDPLAAVSLFHDDVCLFEQGDLFIETDSSLLSGFTYWKKKSDGRHRVAMKVN KEKFFEHLFEVF >gi|157101657|gb|DS480667.1| GENE 58 64886 - 66367 1053 493 aa, chain - ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 1 485 1 494 500 367 38.0 1e-101 MSVILGIDLGTSSVKAMLLDSVKGVISVESSPYEVSIPFEGYAEQNPETWWLETKQVLGR LRDKNEQEFNEIQAVGFSGQMHGIVVADCEGKPLRPAILWLDQRSRKELESINSVLDFRE MGEIFRNRAFTGFAFPSLMWLKEHEPEVLLKAGAVLMPKDYIRFRMTGNLASDVTDASAT ALFDTAKRDWAFGIIKKFGLPEEIFPVCKESMEVAGTVTRQCHEECGLKEGIPVVYGSGD QMAQSIGNGVFREGEVISNIGTGGQISAYIKEAKYDRELRTHTFCHALDKAYTIYGATLC SGMSLNWLKNKVLGIEDFNSMSHMAQEIDPGCRGLIFLPYLSGERTPHMNPSAKGMFFGL SLCQDRRYLTRAVMEGVTFSLRDSLTIFEELGIQCETVIASGGGSHSDVWLQIQADVFNK KVKVCEVDEQACLGACILAGTGCGIFNSIEEAARRFVSFREKVYEPIPEHVGLYEKQYRV FRELYVANERFMI >gi|157101657|gb|DS480667.1| GENE 59 66391 - 67374 683 327 aa, chain - ## HITS:1 COG:YPO2499 KEGG:ns NR:ns ## COG: YPO2499 COG1172 # Protein_GI_number: 16122720 # Func_class: G Carbohydrate transport and 
metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 29 324 36 329 330 214 43.0 1e-55 MEAKKSDGILPGFGRNVVSTVKYNGGIIIGLLIICLLLSVMTDSFCTVRNLSNIMRQISI NVILACGMTMVIILGGIDLSVGSVIAVSGCLCCGLITNVGVPSIIAIPISICAGTLVGVF NGFVISRTTIPPFIVTLAMMNIGRGFARIYTKATTILVDDPLFTFIGSGKILGGVPIQFV YMLVVIVISGLILNRTKFGRNIYAVGDNKQAASYSGINSRRVTMTVFVIMGIFASCAGIL SSARTFSAQFNVGEGSEMDAISAVVLGGTSMSGGVGRLSGTIIGCIVIGVLNNGMNILGI DSSWQYVVKGVVVLIAVFIDYIKKMKD >gi|157101657|gb|DS480667.1| GENE 60 67396 - 68892 1359 498 aa, chain - ## HITS:1 COG:PM1379 KEGG:ns NR:ns ## COG: PM1379 COG1129 # Protein_GI_number: 15603244 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Pasteurella multocida # 5 492 28 514 519 425 46.0 1e-118 MVSAVLEMKHIAKSFSGNKVLLDVDLTVEEGEVHALLGENGAGKSTLMKILGGIYTKDAG SILINGREVQIHNVADARDNGISIIHQELMLARHMTIAENVFMGREQKKAGGFVDLARQE AETQRFLDHYGIKLKADTRLDRLTIAQQQMVEIIRAVSFGSRIIVMDEPTSSLSDAEVDI LYEMIRILKEQKVSIIYISHRLNELYDIADRTTVLRDGEHVGTVRMKETERAELIAMMVG RDLASYYTKNDNAKDEVVLEVKELSDGKMVKKVSFDLRKGEILGVSGLVGAGRSETVECL FGIRRKVRGTVRFKGREVDFRNPREAMANGFGMVPESRKEQGLFLQSGVRFNTTINVVPR FLKHFIWNRQAEEGIVEGKINDMHIRVTGPEQVVGKLSGGNQQKVLIGRWLCSTQSVLIL DEPTRGVDVKTKSEIYALIDQLAASGMSIIMVSSELPEIINMSDRVLVMCNGYSTGILNR DELTQERIMTLATTEIGA >gi|157101657|gb|DS480667.1| GENE 61 69019 - 70068 1035 349 aa, chain - ## HITS:1 COG:CAC1453 KEGG:ns NR:ns ## COG: CAC1453 COG1879 # Protein_GI_number: 15894732 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 44 349 26 325 325 152 30.0 7e-37 MKKTLAVVLAAMMAASLTACGGSAAKDTTAAAAKAPAAESTAEKTGEGAGTDETEKAAGT EEKKQYKFGFTEMSAGSFFDACYSGVEEVVKANGDEIVHVEGKADSAYQLGVIEDFISQG CDLVFYNPSDAAASAAAVKALNDAGIPIVNFDSAVSDLSKVNCFVVSDSYSCGQIAGEEL IKNHPEGGKVAVLDFPASAAAADRAKGFVDTVTADGLFEVVAQMDAGAKPEKGLTVMQDL 
LQAHSDLTAVFCINDECAQGAYSAITTAGDKIEIYSVNAGPEAKAAMTKDGVDGIWKCTA AQSPIGIGQKSAEVAYKILNGESYESEIKIPSFAVTPENIDQYNKSDWQ >gi|157101657|gb|DS480667.1| GENE 62 70291 - 71457 309 388 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 90 369 24 313 323 123 27 4e-27 MNRTLLLNLLRKEGICARAHLANLSQLKQATVTNIVSDFIDWGLVKEVGFLVGNKGRRSI GISINNDDFGVLAIRLARKNYTVGIFNLSGRLLDKKRVELDVNQPPRVTFEAIIHEAGEL IRSSESRKIIAIGMAIPGPYSEKRGRIELMTGVLGWNEIPIQEKLQDIFKIPVFLEQDAN AGALAQYWHNDEDYKNGVVVYIAVGQGVGAGIINNGELLKGCIGVAGEVGHTSICYNGPR CACGNYGCLENYCSSIAFTKEVNRVLRPEIEYNFRQVSQLLRDGDQVVTDIFLDACDKLS VGIVNIINSFNPSVIVIGDEMSHILPSVMLERVKSNVKERVLPEIYANMNITMSVVSNDS MAHGAAIVAINDIFNHPLTYFESNQRDD >gi|157101657|gb|DS480667.1| GENE 63 71654 - 72088 219 144 aa, chain - ## HITS:1 COG:no KEGG:Ccel_2418 NR:ns ## KEGG: Ccel_2418 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 3 139 2 138 142 141 43.0 1e-32 MTWMEAYPAHRQPDMEQIGRYIASPCWQPLLAWLEDTFHISPRIEYSRCSMQGGWNVKYK KGSRAVCTLYPEEGYFICMISVGAKEAPEAELALNGCTAYVRQLYHDTTPFNGGRWMMIE VRNGEVLDDVKELIGIRMRKKRSV >gi|157101657|gb|DS480667.1| GENE 64 72259 - 73059 720 266 aa, chain - ## HITS:1 COG:BS_sigW KEGG:ns NR:ns ## COG: BS_sigW COG1595 # Protein_GI_number: 16077241 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 9 186 5 180 187 66 26.0 5e-11 MEKQEKYDMERLVDLAASGNRDALEELLGDVQDMVFNLSLRMLGTVPDAEDACQEILIKV MTHLSDFRKESRFSTWVFRVAVNHVLAYRKHMFSHHPLSFEYYGEDIANGREEDVPDMTG GVDRGILEEELKLSCTNVMLQCLDGESRCIFILGTMFRLDSAVAGEILGISPAAYRKRLS RIREKMAAFLGEYCGLAGGKCSCRRRVDYAIASRRLTPERLEYNGMEQGRSRIAEQCRAI MEEIDGQSRIFAEMPFYRTGPDAGKT >gi|157101657|gb|DS480667.1| GENE 65 73178 - 73753 665 191 aa, chain - ## HITS:1 COG:BS_yqgN KEGG:ns NR:ns ## COG: BS_yqgN COG0212 # Protein_GI_number: 16079545 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate 
cyclo-ligase # Organism: Bacillus subtilis # 7 182 2 179 187 91 32.0 9e-19 MDLEERKKALRQEIKAAAVALDEGYTKEADLEIFSHVAGLSEYEQAGTLFCFVGTSGEID TAPILEDALRKGKRVGVPKCTARGIMEVREIRSLGDLEAGKYGILEPGAEAPVIQPEEIN LAIVPCMSCSHDGRRLGYGGGYYDRYLVRTRAVKAVICRERIMRADIPVEPHDQLMDMVI SEHGVRRLGSR >gi|157101657|gb|DS480667.1| GENE 66 73898 - 75304 1218 468 aa, chain + ## HITS:1 COG:VC0801 KEGG:ns NR:ns ## COG: VC0801 COG1767 # Protein_GI_number: 15640819 # Func_class: H Coenzyme transport and metabolism # Function: Triphosphoribosyl-dephospho-CoA synthetase # Organism: Vibrio cholerae # 185 457 30 303 313 194 39.0 3e-49 MDINLSTLVTGREVSLKDMLDAREHRQEVQRMLLSEHHLPVISFTLNIVGPVKVFPLALR TFHEGIRLIETQCHAWKIPIIATYSTTSHAGHEYFWAVDGDARFIKENLCLLEDSVALGR LFDIDVIQTDGMKISRTDLGFSTRKCLICNQEAFVCSRARTHSVKELLEQECQIMTNYFA KQHARKLSSLSMQALLYEVSVTPKPGLVDRNNTGAHQDMDIFTFEASAVSLNHYFEQFAL CGIENGHEPFSRIFSRLRSLGIQAEETMFRATNQVNTHKGLIFSLAIMNGALGYMYANHI PYSPDALLKINRKLVADVLEDFNDVTVENARTNGERLYALYGMKGARGEALSGYHTVLKK ALPVLKHQLDRGLSLNDAGAVTLLYIIAHSEDTNIVNRSSYHSMKKIQALLRETLNDPEF INKDPIPYIESLDREFIKNNISPGGSADLLALTFFLYLFENSGLSSIL >gi|157101657|gb|DS480667.1| GENE 67 75265 - 75651 446 128 aa, chain - ## HITS:1 COG:no KEGG:Cbei_3026 NR:ns ## KEGG: Cbei_3026 # Name: not_defined # Def: heavy metal transport/detoxification protein # Organism: C.beijerinckii # Pathway: not_defined # 15 122 15 122 122 93 42.0 3e-18 MSTAIICAVLIVIAIIGIKSYAKKLTSGCCGASSQPSVKKMKVRDKDKSHYPYSRLLKVD GMSCGNCASHVENALNSLEGVWAQVDLEKGEALVRMKQEYGNNELKQAVKDAGYVVYKIE ESPEFSNR >gi|157101657|gb|DS480667.1| GENE 68 75914 - 77398 1540 494 aa, chain - ## HITS:1 COG:BS_opuD KEGG:ns NR:ns ## COG: BS_opuD COG1292 # Protein_GI_number: 16080059 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Bacillus subtilis # 1 494 2 491 512 432 47.0 1e-121 MKDKNQVFVVSLAITIIMAIWAVAFNANFTVVSNAAYSFLTNNFGWLYLMAMTAFVIFSV AIAFSKWGKIKLGPDDSKPEYSTVSWFAMLFGAGMGVGLVFWGISEPLAHFSNPIPGIEA 
GTEAAANFAIRSSYMHWGLHPWANYCIIGLGLAYFQFRKGKPGLISSIFEPLIGEKGING WVGKTIDVLAVFATVAGVVTSLGLGVMQINAGLNYLFGIPTTLVIQIIIIAVISVIYIWS AVDGISRGIKIISDANLYIAIGLITVTFLVGPKLEILNNLTNGLGQYLQNFFGDSLMINN YGDNTWVGAWRVFYWAWWIAWAPFVGSFIARISKGRTIREFIAGVVLAPALGSILWFAIM GSLGLHLGMDGTLSVAQLADIASKPETGLFIVMGQYPLGMILCVVSLILLCTFFITSANS GTFVLAMFSSKGDLNPKNGRKVLWGVVQSLLAVGLLVAGGLKPLQTISLAAAFPFIFIML FAGAALVKALMKEK >gi|157101657|gb|DS480667.1| GENE 69 77593 - 77832 191 79 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c27830 NR:ns ## KEGG: CLJU_c27830 # Name: not_defined # Def: betaine reductase complex component B subunit gamma (EC:1.21.4.2) # Organism: C.ljungdahlii # Pathway: not_defined # 1 79 359 437 437 125 75.0 4e-28 MVKEFERAGFPVVLMCNLLPVAQTVGVNRMVPTISIPYPLGNPSTSKEEQHLLRRSRVEA ALDALATDIKKQTIFKVEI >gi|157101657|gb|DS480667.1| GENE 70 77860 - 78912 1226 350 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c27830 NR:ns ## KEGG: CLJU_c27830 # Name: not_defined # Def: betaine reductase complex component B subunit gamma (EC:1.21.4.2) # Organism: C.ljungdahlii # Pathway: not_defined # 1 350 1 348 437 468 65.0 1e-130 MKNVICYINQFFAGIGGEDKADYEPEIRDGVMGPTAAINQMLAEIDAQVTTTIICGDNFM STHRDEAIERILTALEGREFDLFLAGPAFQAGRYGVACGEIGKAVSARFGVPVITSMHAE NPGVEMFKKDMYVMTGSHSAASMKKDAGAMVKLADKLLKGQTPEGPDAEGYYARGIRHQV WREDGVAAADRAVDMLIAKATGQPYQSELIIPKKDLVPIAPALKDLKHARIALANTGGIV PVDNPDRIQSASATRWGRYDISKMDRLAGGEFKTIHAGFDPAAANADPNVIMPVDVMKEF LNEGKYGSLHDYFYSTVGTGTTQGEARRMAKEIIPLLKEDHVDAVIMVST >gi|157101657|gb|DS480667.1| GENE 71 78938 - 80263 1308 441 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c27360 NR:ns ## KEGG: CLJU_c27360 # Name: not_defined # Def: putative glycine/sarcosine/betaine reductase component B subunit alpha # Organism: C.ljungdahlii # Pathway: not_defined # 1 441 1 441 441 738 78.0 0 MKLELGNFYVKDIVFGDKTKYENGILTVDKEDCLAFVKRDPHITEADLRIVKPGDMVRLV PVKEAVECRVKVNGDALFPGYTGPLSQAGDGRTHCLKDMSLLAVGRHWGGFQDGLIDMGG EGAKYTYFSQLKNLVLVADTDEDFEKREQQKKNRAIREAVHKLAEYIGSCVKDMEPEEVE SYELEPVIRRAPETEKLPSVVYVMQPQSQMEEMGYNDLCYGWDCNHMLPTFMNPNEVLDG 
AMISGSFMPCSSKWATYDFQNCPVIKRLYKEHGKTLNFLGVIMSNLNVALEQKERSALFV AQIAKSLGADAALVTEEGYGNPDADFTGCIVALEDAGVKTVGLTNECTGRDGKSDPLVSM DEKEDAIVSTGNVSELIELPPMPEVIGELEALARDGLSGGWAGDEILGPSVRPDGSIIME NNAMFCGDQIIGWSTKTVREF >gi|157101657|gb|DS480667.1| GENE 72 80414 - 81004 569 196 aa, chain - ## HITS:1 COG:no KEGG:Shel_25340 NR:ns ## KEGG: Shel_25340 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 12 193 3 184 187 171 49.0 2e-41 MSKETANGYYTVYTRQSKKVLEELEKTGEYRVKEEYIRMKNDSISEYYLKLYRWFAERCR ERIDVPKGCSLPIWLSMHDEYRLRNTEDTVCFTLRIPREKVHVISEYAWGFRVNYMYVPL NLEDERAFNEELKRYGIENEMALVTESLGNYYPMLKKRIISSWDRVFELKPNSPADELGV CFEIQREWIENIESLT >gi|157101657|gb|DS480667.1| GENE 73 81110 - 81814 803 234 aa, chain - ## HITS:1 COG:FN1803 KEGG:ns NR:ns ## COG: FN1803 COG1309 # Protein_GI_number: 19705108 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 19 193 15 189 217 83 28.0 3e-16 MENERVRRKYFKDLNRKVYIERALEIIRDEGTEAISIRRMAKEFQCSTTSMYRYFENVEE LLYYANLGYLDEYLEELNVHEKEWRDIWDMHIGIWECYCRIAFRYPQAFDIIFFSAASRN LTTAIREYYDMFPERINIVSPYLQVMLQSTDLFERDMVMAERCAQAGVITMGNAMNMNRM VCLLYKGYLKEILDYGLEPDEIEGKVQEFKRDVEVIVGIYASDTLGHDYLANAR >gi|157101657|gb|DS480667.1| GENE 74 81804 - 82445 303 213 aa, chain - ## HITS:1 COG:no KEGG:Shel_25350 NR:ns ## KEGG: Shel_25350 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 22 193 8 180 188 158 43.0 2e-37 MEVDMLRNNKEDGFSPGSMLHLWTAQTDAVLDCIRENGFSQVKMEFIDKKYEESAWVFKE AYGFFKQRARLMVKPPEGAESPVWLFFDPGWVYLSPDSYLLELSVPRERVVLFDRERWQR VLNLSYVGKEQEDEARFEQKMNQMGVSTYWEVFQSAFYPYLKSEIKKSWERIFDIDNTEQ TNLGAAVWQLRQEDVVGINSHSALEERDGFSGK >gi|157101657|gb|DS480667.1| GENE 75 82587 - 83081 345 164 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00140 NR:ns ## KEGG: EUBELI_00140 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 162 13 178 178 95 36.0 5e-19 
MNISLTIAGILCVSILGTLLHFTYRWSGRNPLIGLIAPVNESVWEHMKLLFFPMLLFGLW NLKGVTDACRISAFHAGLLMGTLLIPVLFYAYTSVLGRNFLVLDIALFYICVIAAFLIYR GLSGSCRLEKYSHVTSMAVFLLLICFLLFTYFPPGFSIFKDPER >gi|157101657|gb|DS480667.1| GENE 76 83470 - 83814 284 114 aa, chain - ## HITS:1 COG:no KEGG:Odosp_1093 NR:ns ## KEGG: Odosp_1093 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 114 1 114 114 187 74.0 9e-47 MSKVKITVLKTTLDKELAREYGAEGLTACPMMKEGQIFYVDYAKPEGFCDGAWIAIHPYV FSLVHGTGNELFYYGNWIRKPGVAICSCNDGLRPVIFKVESAGQDSAMEYEPVR >gi|157101657|gb|DS480667.1| GENE 77 84009 - 84752 362 247 aa, chain - ## HITS:1 COG:HI0105 KEGG:ns NR:ns ## COG: HI0105 COG0327 # Protein_GI_number: 16272079 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Haemophilus influenzae # 8 245 35 278 279 61 27.0 2e-09 MEYKYLKEQLLSMFDSDKLHLFPEEWGFFNETERDIKCIGYATNLTGEIIERAGEKGVDF LLTHHDSWEFIYGLKEHCNKMLREAGMTHAFFHAPLDDADFGTSASLAQALGMKGCKKVM PYREQYYGGVAGEIVPTDFESFAASLSHILQENVRCHRNNDRPVCRIAVAAGGGNMTSDM RTAVELGCDTYVTGEYALYSQQYAGFCGMNLFVGSHTNTEILGVKSMAERLTCGGKIELI RIREPND >gi|157101657|gb|DS480667.1| GENE 78 84924 - 85310 253 128 aa, chain - ## HITS:1 COG:no KEGG:PC1_2348 NR:ns ## KEGG: PC1_2348 # Name: not_defined # Def: hypothetical protein # Organism: P.carotovorum # Pathway: not_defined # 4 106 3 106 119 80 40.0 2e-14 MNKAIKSDWKTLDMPEKTASFSIDIGLTEGEFSALQNGHIPCEMEDKWFEYFENNILYIH RSWTGICIYKVQFSTDRRIEEVVVNRDSEQYRETNIERDKVQVMMRINSLCGRTGNGELM LKYIKSGR >gi|157101657|gb|DS480667.1| GENE 79 85386 - 85631 166 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935201|ref|ZP_02082584.1| ## NR: gi|160935201|ref|ZP_02082584.1| hypothetical protein CLOBOL_00096 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00096 [Clostridium bolteae ATCC BAA-613] # 1 81 1 81 81 160 100.0 4e-38 MQQKTLVGFFTEDLEALEQEAVCTPEFGNGRMFYYYMKEGTSNASAGVGWRIYLGRIIEK AVITNYSYLAKKCRFNHNAKG >gi|157101657|gb|DS480667.1| GENE 80 85613 - 87463 115 616 aa, chain - ## HITS:1 COG:no 
KEGG:PPSC2_p0627 NR:ns ## KEGG: PPSC2_p0627 # Name: not_defined # Def: hypothetical protein # Organism: P.polymyxa_SC2 # Pathway: not_defined # 376 612 18 261 452 87 28.0 2e-15 MNEKRGQYAYEIGHVFSTKHGSMAVIARQKVEKKPDHFRKYYTLQCGRGHQYEVGESYLQ QGRLRTCKHCYHPPIAETDPDFALWFAEPQIPRERSRYSHTLADFYCQECGSLVRDKSIH TVYQRKYVPCPYCRDGMSYPERYVNAFLAQLNISFHRQYMVPFEKEGKRSHYKYDFYDEP QGILLEVHGLQHFAPDVFKRIGGWSLEMIQERDREKERFAKEVLHLQYIYLDCRKSEPDW IRKEIISKLACYPLDGVDWGKVRQDANTSMVLQMIELSKQGYTQKQIGEKLQVHPSTVCQ KLKKAEADGLFDGRCPRVEQAEQNRQHKQEKRIRYLKQKMRLQDQKKYLEKMQYRPNEEL KSNSRSGICCEQEYPQIQMLGSYVNTRKPIRFLCNQCGQEFECSATWFMEHHACPYCKQL ARIQGRIAEKYGEEIQVLSIYKNCKTSMTMYHTVCNETFQITYTDFMKRGCPVCGKRNRI IHSAETRRNREIQSFYQKLPEIEARGYTLESDVCTRLGVPHKFRCHHCGEIWEVTPSVVM HGRNHICISPCKKKTPEQFQQQVEALVGEEYTVLSEYQNAFAHVKMRHNACGLEYSVAPA HFTSTGRRCPKCSRKR >gi|157101657|gb|DS480667.1| GENE 81 87479 - 87682 110 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935204|ref|ZP_02082587.1| ## NR: gi|160935204|ref|ZP_02082587.1| hypothetical protein CLOBOL_00099 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00099 [Clostridium bolteae ATCC BAA-613] # 1 67 1 67 67 117 100.0 2e-25 MGKDDNNQEFLRKLDHFDRAGNRILHWIVVNLYAVFTGTRWKESSRFLRSGRSINEKSTL KQQGKNR >gi|157101657|gb|DS480667.1| GENE 82 87696 - 88283 497 195 aa, chain - ## HITS:1 COG:FN1125 KEGG:ns NR:ns ## COG: FN1125 COG1704 # Protein_GI_number: 19704460 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 25 188 19 182 183 162 50.0 3e-40 MGDIGKYIVIGGAVLIVAVIFLVIIKTRNGFVVLQNRVKNQLAQIDVQLKRRYDLIPNLL ETAKGYANFERSTLEAVVKARQSAMAAADFDQAAAANGKLQAALRRLFAVSEAYPELKAN ANFMQLQSELSDTENKIALSRQFYNDTVLKYNNAIQMFPASFIAGLCGFHPMSFWNAEEE ARERITIQAEDMKFS >gi|157101657|gb|DS480667.1| GENE 83 88303 - 89418 385 371 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935206|ref|ZP_02082589.1| ## NR: gi|160935206|ref|ZP_02082589.1| hypothetical protein CLOBOL_00101 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00101 
[Clostridium bolteae ATCC BAA-613] # 1 371 10 380 380 688 100.0 0 MKRWQKLDRLALLAPLVLFLFLSIGKEGRLLWGIVLRERNFVVTIAALLLLALAAVLASL PIVLIWRAVSHTMKKAAIQNATFQADEDFDYYREKLTGVPPATISLLMDLQIEAKKDMAA LLLKYTKMGAVSMKAGTVHVQNQELPGLLPSDRTLLALIAGGQAQPANLGAWRRQAVTEA VESGNLKYRGMRQNVHSASRSCLTGCLGGCLLPILIFLGMGITAVAINNSDWMEKLDGFL AAAPQSFGMRQMEYLLSSPDMVIAIPLTAFFVLSFLAMFLLPIAAVLRTALSIYGTGTRL KRTQAGEILTAQIWGLKNFIRDFSNLAESEKEQLVLWDDFLIYAVVLEENERIIEDIFRL RNLKYRDFILF >gi|157101657|gb|DS480667.1| GENE 84 89572 - 89784 65 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MENTIISMKNRMEREEECTRIPLLRMDTMWTNMECGRQPQTFQNCYRRVPNYWIDVEYFD PQNGQTKRRA >gi|157101657|gb|DS480667.1| GENE 85 90468 - 91751 822 427 aa, chain - ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 234 412 242 419 433 92 34.0 2e-18 MIAYILTHLWVFMEMLFLDILADAFLIKKKSGSRSKRVFGLLFLTMVMSVSVTLFEDIVA LKLFFDFLLVYAYLLYFYQTTLGEALSVYVINYSVIICIDFIGVSGYYYIFGTSENMPMF YLMCLMVKSTELGSGFLIRWFWKKGGNAAIRSEGIRALFPSFVIIAGAGIFASQFLMRTQ TVPVEVTVLMAGMIGLNIFLVFNMLLAARVELEKNSLKDAARQTKMELLIYQNKQELYTK QGKRLHEYKNQLLTISDMLEQGLVSQTLQYTQGLTGEIGKALERIYTNHPVVDAILNMKR QEALNRGVNVNLMCGDLRDMILKEDEIIILLGNLLDNAIEAAEKCESEGMVLVRIVREKR QLVITVKNSYAGQLHLEDSRLMTSKLDEENHGYGLAAIQDIAGRYDGTFVVKAEGDYVKA TVLIPDM >gi|157101657|gb|DS480667.1| GENE 86 91748 - 92467 517 239 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 223 1 218 234 79 25.0 4e-15 MLNILICDDEKPFLDRLANKITVYLENREIPFQMAAFCSGEELLKHNEDRLFDIAFLDIS MNAVSGMDAAEYLRDRNRKICIVFVTGYMDYVLEGYKVGAFRYLVKDSLEISFEECMEAV LKKFHIDTDEICLEFTDRKLYLKADEICLVESRGHKLLFLAVDGKHVIGTMNRKLTDVEG LLGGHGFLRVHQSYMVNMKYIVNIASYRLELEPDIILHVPRNRYPYVKREYAIYKGDSL 
>gi|157101657|gb|DS480667.1| GENE 87 92867 - 93247 176 126 aa, chain + ## HITS:1 COG:no KEGG:Cbei_3562 NR:ns ## KEGG: Cbei_3562 # Name: not_defined # Def: transposase IS116/IS110/IS902 family protein # Organism: C.beijerinckii # Pathway: not_defined # 1 112 80 191 398 71 30.0 9e-12 MNPHLIKNFGNNSLQKVKSDPADARKIARYTLDNWTELRQYSGMDNTRTQLKTLNSQFSL FMKQTVAAKANLIALLDNTYPGVNKLFNSPPREDGSEKWVDYAYSFWHVERLQIYPQGST YRLCMR >gi|157101657|gb|DS480667.1| GENE 88 93171 - 93518 142 115 aa, chain + ## HITS:1 COG:no KEGG:ELI_1812 NR:ns ## KEGG: ELI_1812 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 44 111 2 69 74 82 57.0 6e-15 MPIHSGMWNVSRFTHREALTAFACVDPGVDEPGQHKSKSNRASKVGSARLRKTLFQIMTT LLQNAPEDNPVYRFLDKKRVQGKPYYAYMTAAANKFLRIYYGKVKECLRNLEQAE >gi|157101657|gb|DS480667.1| GENE 89 93634 - 94161 156 175 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0336 NR:ns ## KEGG: Bacsa_0336 # Name: not_defined # Def: death-on-curing family protein # Organism: B.salanitronis # Pathway: not_defined # 1 174 1 175 176 231 64.0 1e-59 MKRTEEQLDAKQILSVIEKYSLALDLLDAYDHQNMKRPEGGNTIYILSYQECRKVIDSMG YGESSDVFGNEKDDSFKGSIGAIYQTFAGQEVYPSLEEKAANLLYFITKNHSFSDGNKRI AAAIFLYFMDRNQALFLDGEKIISDHTLVALTIMIAESRPEEKEMMISVIMNCLK >gi|157101657|gb|DS480667.1| GENE 90 95171 - 96178 785 335 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 8 328 11 335 339 306 47 2e-82 MEHQFYEQIKRILSEARNKVYQTANFAMVEAYWNIGKSIVEQQDGYEKAEYGSRLIAELS KQMTVDFGKGFTPTNLKYMRQFYLTFPNSHALRDELSWTHYRLLMRVENENAREFYTEEA IKSNWSTRQLERQINSFFYERLLSSQNKETVSEEIQKLEPAKVPEDIIRDPYVLEFLGLS PNDDFYESDLEEALITHLQKFLLELGRGFSFVARQKRITFDGRHFRIDLVFYNYILKCFV LIDLKIGDLTHQDLGQMQMYVHYYERELMNEGDNPPIGIVLCADKSESVVKYTLPENETQ IFASKYKLYLPSEEELSQELQREYRALEYDKEKSK >gi|157101657|gb|DS480667.1| GENE 91 96206 - 96823 134 205 aa, chain - ## HITS:1 COG:no KEGG:smi_1699 NR:ns ## KEGG: smi_1699 # Name: not_defined # Def: hypothetical protein # Organism: S.mitis_B6 # Pathway: 
not_defined # 89 181 1 93 149 155 87.0 1e-36 MLEAITYQNMLRDINSNEIKEDTIGILITRPDLEVGKSILNSLNYFHHLSRNNTNFYLPG YGAYWYESYPDGQVVTKIEGVDWSFSDKMFVCFINDLETYSKWEYSGESELLMLEYKDGI LSYDNMMQFYLDNMMRDRVIVSIPSFFQQLLRICKNDKSLKDISNAFGKDKLIQVTKENI LNNIPSSLANVFVQEKYFCIRNCGK >gi|157101657|gb|DS480667.1| GENE 92 98567 - 98911 121 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935220|ref|ZP_02082603.1| ## NR: gi|160935220|ref|ZP_02082603.1| hypothetical protein CLOBOL_00115 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00115 [Clostridium bolteae ATCC BAA-613] # 1 114 378 491 491 236 100.0 5e-61 MFLCSCVFAISLATGAFNSGIGNMLKVQTERYIAFNNLMSSGAIQNFDDGTNIYMPEYHC INNSEEYMQYYAKLYTDKDLFFTNDYEQLDFGHPVVEFRYNPGEGCVEYNYLTR >gi|157101657|gb|DS480667.1| GENE 93 98985 - 99758 667 257 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0799 NR:ns ## KEGG: Cphy_0799 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 149 3 165 216 89 33.0 1e-16 MSKTAAGLIQHCKDKLGTPYVYGAKGEVLTQAILDRLARENPGTYTSTYKTKAAKYIGQR CTDCSGLISWYTGVLRGSYNYHDTAVERVGVDHLDESMVGWALWKPGHIGVYIGDGWCIE AKGINYGTIKSRVAATPWQKVLKLRDIDYTPVPVTYTQGFQPAADGQRWWYQFTDGSYAA NGWYWLREATDGTCGWYLFDSEGYMLTGYQVDPAGEAFLLCPVKGSDEGKCMITDARGAL RIAEEYDMVNRRYVFEW >gi|157101657|gb|DS480667.1| GENE 94 99845 - 100267 470 140 aa, chain - ## HITS:1 COG:no KEGG:Closa_0862 NR:ns ## KEGG: Closa_0862 # Name: not_defined # Def: toxin secretion/phage lysis holin # Organism: C.saccharolyticum # Pathway: not_defined # 3 140 2 139 139 115 46.0 4e-25 MKKDIICAIAGMAAAAGVKLFGGWTPTLSIVLILMGLDLLAGFLVAVVFKKSPKSESGAA SSNAMLKGLCKKFMMVCLLAVAHQLDVALGVDYIMLAATYGFIANESLSIVENAGLMGIV KSDVIVNAIEVLKGKSQKIE >gi|157101657|gb|DS480667.1| GENE 95 100283 - 100411 271 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935223|ref|ZP_02082606.1| ## NR: gi|160935223|ref|ZP_02082606.1| hypothetical protein CLOBOL_00118 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00118 [Clostridium bolteae ATCC BAA-613] # 1 42 17 58 58 64 100.0 2e-09 
MAVIYATLIVKGRKTFRQVPDKIKDQVRQVLVDLECEELITE >gi|157101657|gb|DS480667.1| GENE 96 100452 - 100565 197 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MADYMAIVYADLIRKGKKTIEQVPEKLRAEVEAVLNA >gi|157101657|gb|DS480667.1| GENE 97 100569 - 100913 393 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935225|ref|ZP_02082608.1| ## NR: gi|160935225|ref|ZP_02082608.1| hypothetical protein CLOBOL_00120 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00120 [Clostridium bolteae ATCC BAA-613] # 1 114 1 114 114 207 100.0 1e-52 MKALVIYDVTGRIWSIIYGEETLPQGLRCMWVDIPDGAQLNYIDVTDASNPQPVFAYLPE SDIGRLQEQVVSLDSQLTEAQLALTEQYEANLALAEEVTNTQLALTEIYEGMEV >gi|157101657|gb|DS480667.1| GENE 98 100932 - 101645 339 237 aa, chain - ## HITS:1 COG:no KEGG:bpr_II210 NR:ns ## KEGG: bpr_II210 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 19 149 550 668 808 75 41.0 1e-12 MGRLWIPGSGGGADLDVITAAASDVRKGKVIVDKDGNPLTGTMAEKGAATYYGQNYDQVI AANQYLTGNQTIVGDGNLQPWNIKRGVTIFGRAGTFEGWLDRYYNIFLDGNTTGINYSGS YTNYVNIGSTISFATNSDQPSRKGVAFNSPVSFSSYGKLYVRYSCNVSLTVGVVRQGADY GSWEVSTSNSYSIDSNTREVALDIFDIGRQPTVFIGTSGYSGGYSATIYRIILGRPV >gi|157101657|gb|DS480667.1| GENE 99 101652 - 102095 478 147 aa, chain - ## HITS:1 COG:no KEGG:Closa_0858 NR:ns ## KEGG: Closa_0858 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 62 147 55 140 144 71 44.0 9e-12 MALKTDYKADVFEGNRKYQIIQDGEGKSEILDVTEYSQEGDVFGPKDINATNKAVNALNH VVPVTLQASGWSTAAPYTQTVPIEGLTTEDNPILVKVIADGATPEQVKAYNKAFGMIDDG DTADGQATFKCYNKKPTIDLTVGLKGV >gi|157101657|gb|DS480667.1| GENE 100 102086 - 102511 239 141 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935228|ref|ZP_02082611.1| ## NR: gi|160935228|ref|ZP_02082611.1| hypothetical protein CLOBOL_00123 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00123 [Clostridium bolteae ATCC BAA-613] # 1 141 1 141 141 258 100.0 1e-67 MSTVLETLITDRTAADLANDTDRAYIAYTDLNRVEEACALLAGRLGVTIQTKAWKMEDFR 
TDTEMSRLLNNIKTLRAAYYTKASTPAIPAKITYESIYQANDIEQILKDLGDMYDSMVSG QQRLAFRLGMRAIGNRRQEWH >gi|157101657|gb|DS480667.1| GENE 101 102508 - 104604 1419 698 aa, chain - ## HITS:1 COG:no KEGG:Sgly_0356 NR:ns ## KEGG: Sgly_0356 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 210 486 63 362 542 81 24.0 2e-13 MEWDIRVGTNGQQPYSSVDDLTSFEQNLPPYAYCLPRYARLDGTYANTPNAIPIGKNGYI STALSDQDGAFGVPPMITVTFDRLKTSNGVSMVFNRVSGDYASRLKISWYKDAELVQEQE FEPDGVEYFCRAKVPLFNQLVITYLETSRPYRYLWLSVLKNQRMTDAGGLKIVYDDIALG ASEDNTAASGDHDYYVDLQDLKSGVEFPDYAMCLPRYARMDGNYNNAPDELADMGYVSDS ISDEGGTFGNPPSITFTFGQTYSSVGITLRFNDYSGDYCSMVNIKWYRGDELLSDRDYSP DSPDYFCYGIVDYYNRVVVTFLRTSKPYRNVFLTGITWGLIRVFKDDEIEDISCLMELSP ISEEVSINTMDYTIRSKSDYAFEFQKRQKQTLYFDEAILGIFYLKDGKQLGAKRYSVETQ DAVGILDNNQFMGGVYNNALVSDILAGIMAGEGITYFLDDVYVDARVSGYLPICTKRVAL QQLAFAIGALVDTSYDRQLYIYPQQTEVTSEFTAKDIRLGLSVEHSDIITGIRLYVHSYT QGMESAQLYKGVLDDTTKIEFSEPYHSLSITGGILGEHGDNYACITGTGNEVVLTGLKYN HSTAMLLKEEPKITQNKNIAEVKEATLVTAGNAQAVLDRVYGYYSNNESISFRSTINDQE LGNRVNVFTGFRGTMTGNITKLDFKFSRRKVTAEVTVR >gi|157101657|gb|DS480667.1| GENE 102 104606 - 105004 267 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935230|ref|ZP_02082613.1| ## NR: gi|160935230|ref|ZP_02082613.1| hypothetical protein CLOBOL_00125 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00125 [Clostridium bolteae ATCC BAA-613] # 1 132 1 132 132 251 100.0 1e-65 MDSVFLLDGKVYNVEVEKDSLERSFAVTDTEQSGRTLDYAMDRDIIGTFYNYTMKVYPKT EDLAAYDAFYDAVSDPNYASHEMTFPYGQETLTFQAYITQGKDKLRIRRGKNIWGLDGLS LNFTAMEPQRRR >gi|157101657|gb|DS480667.1| GENE 103 105006 - 108839 2710 1277 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935231|ref|ZP_02082614.1| ## NR: gi|160935231|ref|ZP_02082614.1| hypothetical protein CLOBOL_00126 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00126 [Clostridium bolteae ATCC BAA-613] # 1 1277 1 1277 1277 2313 100.0 0 MAADGSLKFDTKINVEGFEEGISTLSKAMDRLTGAVNRLSSNILSRFNGAGQAITKTAQS 
AGEASDAVESIGESADGSLKDVKRLQEQMDAIRVQAMEAADTEPVEAVAVSTNPESLNYD PKAMAAVFGEEAAEIHNYAEAVEQYGEQGAAALNKMDLEAQELKQQIEQLHVQDIEAVET TPIEAYAVPNSAESMGYDPKAMAAVFGEAAAEIHNWSEAVQEYGSQAGAALNGDEIEQEA GVANEKIVELSKRLQELKERQKELQSEGIGLGHVEYDSNAAEIAQINSVLKDYQKSLTDT GRETRKFSKTTQSAFLKAASAVGNFAKSIGRGLANKAKQAVSSLKGLGKSSNNVSKSILK LSNMFKLMLIRMAMRAAIQGVREGMQNLVQYSDRANQSMSGLMTNMTYLKNSFAAAFAPI LSYVAPVLNTLINLLATAVGYINQFFSALGGGSTYIRAKKANEDYAASLKKTGGAASKAG KDAKKALAPFDDLVQIQQQGADASGGGGGGASPSDMFETVGIDKGISDFANKLKEMFAAG DWEGIGKLIGEKINEAVQKFTEFISWDNVGAQITAFITAFTTMLNSLVATIDWYAIGIMM GTGINTLANTLYLLLTQIDWLMLGNALSQSLMGMVDTVDWNLVGATIGAYFQAQISGLLG FIIGTDWGAIGAALATCLMGITGAIDWGQFGYLMAAGLNGAFALLLEFASTFDWTEFGNN VATGISSFFQTFQWAQAGEALSTFVIGILDFLITAVQQTDWASFVQGIVDCIEAVDWIGL AGKIYTLLYSALGVAFGALANFIGTLIADGFAKAKDYFNGKIEECGGDVWEGMLKGIVDA AKGVVSWIKTNVVDPFINGVKAGFGIHSPSTVMAGMGQYLWEGFCEGVKEFFSDPGAFIK ANITDPFVNGIKSLLGIHSPSTVLASIGSNTVAGFNQGVTNEQAASQSVVQSWASGVASW FSNKFGISTGDSTEARQWATSILSGFNNSVSKNYTKSQTVMQTWAENVRKWFVGADEAQG VNELSWTKFADLIIQAFKVKIEGSHTETQAPMEAWAKNVREWFWGDSNPEGTGGMYAAFY NMARRINEGFANGISDFAYMAKNAIRKWAREAMEAAEEEFDINSPSREFYSIAEYVVRGF NDGISAMASSSRSTVQKWLDGVLDVFDGVNVQLPIGINIPNAASYLPRMASGTIVPPRAG EMSSSMRNMAGYGQEEAMGYLIGKMEEMISRLQAEGNKPVQIVLNLTGNLAALARVLKPE LDKEAARKGVSLVIVGG >gi|157101657|gb|DS480667.1| GENE 104 108888 - 109451 492 187 aa, chain - ## HITS:1 COG:no KEGG:Sgly_0348 NR:ns ## KEGG: Sgly_0348 # Name: not_defined # Def: putative protein GP15 # Organism: S.glycolicus # Pathway: not_defined # 4 186 7 188 191 92 31.0 7e-18 MIGQLPTSLDVGGVSYPIETDYRNILVFLAACSDPELSPAEKLEILMKRLYRDGFSQIPQ EHLEEAILQAKWFVDCGQEDDDKKPARKVMDWEQDEPILFPAINKVAGMETRATQYIHWW TFSGYFMEIEEGTFSTILGIRQKKAKGKKLEKWEQEFYRNNRRLCDIRKRYTEEEQAEID YWNNLLG >gi|157101657|gb|DS480667.1| GENE 105 109448 - 109897 475 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935233|ref|ZP_02082616.1| ## NR: gi|160935233|ref|ZP_02082616.1| hypothetical protein CLOBOL_00128 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00128 [Clostridium bolteae ATCC BAA-613] 
# 1 149 1 149 149 291 100.0 9e-78 MAKKMKSLLFDDGYESFAVNDDPTRIIRFNPADPEIINRVLDVQRHFKDYSPPEGIELNP DGTPKSDMERDGAYVAEFSGEMRKAFNGIFLSDVYDTIFAGQSPLCIVGQKYLYEGVLEG LLVLMKPAVEEYARKNREKSRKYLEDIGK >gi|157101657|gb|DS480667.1| GENE 106 109956 - 110492 524 178 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935234|ref|ZP_02082617.1| ## NR: gi|160935234|ref|ZP_02082617.1| hypothetical protein CLOBOL_00129 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00129 [Clostridium bolteae ATCC BAA-613] # 1 178 1 178 178 349 100.0 5e-95 MRKMNLQLFAESIPAAGKIKRKWMAHYIDAALPSASKAEYSRLGKDLEEYIVEMNANVET KNNIWGETSVNLDSYQPQASADPYYAEIGEPLFDRLQDIVDERQTLDDLKTSVVEVHLWE PVEAADGTYVAYKEDAIIEVSSYGGDTTGYQIPFNVHHTGNRVKGKFVLATKTFTADT >gi|157101657|gb|DS480667.1| GENE 107 110495 - 110944 434 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935235|ref|ZP_02082618.1| ## NR: gi|160935235|ref|ZP_02082618.1| hypothetical protein CLOBOL_00130 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00130 [Clostridium bolteae ATCC BAA-613] # 1 149 1 149 149 294 100.0 2e-78 MTIIDFMRQKLTEYPKISEFLVDGDIHVDFTEPGSSYGLSSTGDSLVKEDMLGNQTRRHN FAMYAVAPSFTDYCRLANSNFLLELGYWLEQLPEEDGLSANIGNQEMEARFIKAITSNAM AMQPMGETVNDGILYQIQIQVTYQIESEE >gi|157101657|gb|DS480667.1| GENE 108 110941 - 111366 230 141 aa, chain - ## HITS:1 COG:no KEGG:Clole_0824 NR:ns ## KEGG: Clole_0824 # Name: not_defined # Def: minor capsid # Organism: C.lentocellum # Pathway: not_defined # 1 123 3 125 141 85 38.0 9e-16 MKVELKMLPSEAVLQNHGLQEGGSVQKLVDNETMRYMSAYMPRRQAGELEHMMVMATVIG SGQIDIPGPYANYLHEGILYVSPTTGSAWAKKNEIKVPTDRELTYAGAPMRGKKWFERMK ADHKDDILQAAQALVNRGGKV >gi|157101657|gb|DS480667.1| GENE 109 111378 - 111701 260 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935237|ref|ZP_02082620.1| ## NR: gi|160935237|ref|ZP_02082620.1| hypothetical protein CLOBOL_00132 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00132 [Clostridium bolteae ATCC BAA-613] # 1 107 20 126 126 200 100.0 4e-50 
MYTRIPIEGVYWEDVRQSTYLKTGQREGTSVLLVIPMESLAGTIKLTQGKDLAVKGIIED EVDCSSQEAMSKSLAALKTAHRFLTVTTVDERLYGSESVQHYELACK >gi|157101657|gb|DS480667.1| GENE 110 111760 - 112131 237 123 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935238|ref|ZP_02082621.1| ## NR: gi|160935238|ref|ZP_02082621.1| hypothetical protein CLOBOL_00133 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00133 [Clostridium bolteae ATCC BAA-613] # 1 123 1 123 123 232 100.0 6e-60 MRAYTDETYYINDYLKGRKPVITAGFLFYARSASQVIDRYTFNRLKGVSDVPEEVQMCCC ELSESEYHREKQQKESGGKTSEKIGTYSVGFASAQESATAISREQRSIVMKWLAHTGLCY QGV >gi|157101657|gb|DS480667.1| GENE 111 112146 - 113336 1151 396 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0094 NR:ns ## KEGG: Ethha_0094 # Name: not_defined # Def: PEGA domain protein # Organism: E.harbinense # Pathway: not_defined # 1 393 1 387 391 452 61.0 1e-125 MPVNITSRADAEAIIREQVISTIFQDAPKQSTFMSLARKLPNMTSNQTRMRVLDFLPTAY WVDGDTGMKQTTRQAWDNVFIEAAELAVIVPIPEAVLDDAEFDIFGEITPRVNEAIGQRV DSAIIFGVNRPRNWQNDIITLARQAGNNVAVGSSPDYYNLLLGEGGVISKVEEDGYMATG ALAAMTMRAKLRGIRSTDGSLIFKSDMQGSTNYALDGAPMYFPQNGAYDSTIAQLIVGDF KQAVYSIRQDVTVKILDQGVIQDPVTKEIEYNLAQQDMVALRIVFRMGWALPNPATRMDE DRLGCPFAYLEPTSPVTTQKVTFTVKDNAETPAAIDGAIVDVNGSRVKTDVSGVAEFNLR AGTYPAKIKKSGYGQITETVTVAAEAVTKDVTLIKQ >gi|157101657|gb|DS480667.1| GENE 112 113351 - 113884 385 177 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935240|ref|ZP_02082623.1| ## NR: gi|160935240|ref|ZP_02082623.1| hypothetical protein CLOBOL_00135 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00135 [Clostridium bolteae ATCC BAA-613] # 1 177 1 177 177 304 100.0 2e-81 MKTEDLQAKGLTQEQIDYVMAEYGKDINGIKQERDTYKTQLCTAQATLKSFEGVNISELQ GKIQTLTTDLANKDAEYQKQLAERDFNDLLKTTAEGFKPRDIKAVMPFLDVEKLKGSKNQ ESDIKAALEAVKKDKGYLFQDVGIPRVVAPTPGPGGEKTDDTRTQANNALRSILGRE >gi|157101657|gb|DS480667.1| GENE 113 114003 - 114179 67 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935241|ref|ZP_02082624.1| ## NR: gi|160935241|ref|ZP_02082624.1| hypothetical protein CLOBOL_00136 [Clostridium bolteae 
ATCC BAA-613] hypothetical protein CLOBOL_00136 [Clostridium bolteae ATCC BAA-613] # 5 58 1 54 54 93 98.0 5e-18 MAKAVIVIPICLKCEYCEKGMKCRVYPQGIPREIVLAQKPPEDICKDYKYKWENEASD >gi|157101657|gb|DS480667.1| GENE 114 114164 - 115918 992 584 aa, chain - ## HITS:1 COG:no KEGG:ELI_3213 NR:ns ## KEGG: ELI_3213 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 411 2 414 526 219 35.0 2e-55 MTPEELEKLPKPLERTMTALELSIMDEIIQRIKEAARVTPVIDWLLVRMDAIGAGRVRIK QLIGEGIRKAGLQVDDIYEQAVKSDCIRNKAIYEAAGKGYQPYEGNQWLQQIVDAARRQT KDSLRPLENITQTTGFNVPMGGSKKVFTPLSEYLERSLDKAMLGITTGARTYSQAIGEVI DEMTASGIRTVDYASGKSDRIEVAARRAVMTGVAQMTDKVNEKNMEALQTDYCEVDWHMG ARNTGTGYQNHQSWQGKVYSSEEMRTVCGKGQMLGFGGINCYHIAFAFIPGISKRKYTDE WLAEQNQKENEKKVYKGREYDTYAALQHQRRLERTIRKQKQDVELLEKAGADKEDVTAAR CRLRLTNKTYLDFSKEMGLRQQRERLRVSKDIVAVAAKADIIKEIRLPNEALNAKNITPD IVDEIQSGIDEMKREYDLRIDRALVQDVSDRFPDTPYLTRVVDNHGSREVEFVINKGYNF SDFRRIVKAGYETGYFAGRTIKDHAIHEMTHVMAGQQFKTISGYNAFKERLESQYVPGVS GYSDAMKDGFETLAEAFVKMRNNEAVPDEAKQLVIKHIERWRKQ >gi|157101657|gb|DS480667.1| GENE 115 115931 - 117310 953 459 aa, chain - ## HITS:1 COG:no KEGG:Clole_0815 NR:ns ## KEGG: Clole_0815 # Name: not_defined # Def: bacteriophage portal protein, SPP1 GP6-like protein # Organism: C.lentocellum # Pathway: not_defined # 22 453 28 453 469 386 48.0 1e-105 MRFTKMLDLITNVLNKDADTQVDVCLTSQMANQIELWTRMYENRSPWVNNKDVLSANLAP AIASEIARLVTLELKSEVTGGTAADYLNEQYQRKVIKDLRRYVEYGCAKGGLVMKPYITQ QGIEVQFVQADCFFPLSFDSSGRITQCVFTEQFRKGQKIYTRLEVHTLQGKRVHITNRAF VATNDYSLGSEVVVSSIDRWSELVPELLLEGADRLLFGYFKVPLANADDSDSPLGVSVFS RAVDLIREADRRYSNICWEYEGTQLAVHIATSLLKYNQDRDKFEYPGGKERLYRNVEYNT GAADKPFIDTFSPEIRDTALFNGFNNQLKLVEFNCNLAYGTLSDPQSVDKTATEIKTSKQ RSYVMVSDTQMALQDALEDLVYAMSFWSALYGLVPAGNDCEVSSDWDDSVMVDAETEREQ DRKDVAMGVMRLEEYRAKYYGETLEEAVKNLPEPAMTEE >gi|157101657|gb|DS480667.1| GENE 116 117325 - 118704 958 459 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1284 NR:ns ## KEGG: EUBREC_1284 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: 
not_defined # 25 423 1 394 429 241 35.0 6e-62 MAAVNRFVRKKTIPFNFSEKHKDYIRRCEACMYNVAEGAVRAGKTVDNVFAFAHELKTTP DRIHLATGSTMANAKLNIGDANGYGLEWIFRGQCHWGKYKDNEALYIKGPDTRWKQKIII FAGAAKEDSYKKIRGNSYGMWIATEINLHHDNTIKEAFNRQLAAQRLKVFWDLNPDNPRA PIYAEYIDKYQRQADAGDFPGGYNYMHCTIYDNINITPERLREVESRYDKNSIWYLRDIK GMRVVANGLIYRRFADDTSTKQYTFRLTDKPKDIMEIILGIDFGGSGSGHAFTATAITRG YHNVVVLASEWIGCKDEKGNQIEIDPEMLGTMFCNFCQKIISRYGYITTVYADSAEQTLI AGIRSSLRKHGLGWVRVENALKTEINDRINATAILMAQGRFYYVQGECQSLVDALSTAVW DPKELTKNVRLDDGTSDIDSLDSFEYTFERQISRLIKYG >gi|157101657|gb|DS480667.1| GENE 117 118688 - 119407 403 239 aa, chain - ## HITS:1 COG:lin2395 KEGG:ns NR:ns ## COG: lin2395 COG5484 # Protein_GI_number: 16801458 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 229 1 226 246 155 43.0 9e-38 MPRPRSPNRDKALQLWLDSGRKRQLKDIAAELQVSEEQIRKWKNQDKWDKVTLPNAKGNV TNHKGAPAGNQNAVGHGAPKQNKNAEKYGFFSKYLPEETVSIIQEMPTDPLDILWDQVQI AYAAIIRAQSIMYVRDQKDVTITKIGHKDGETVTEERWEVQQAWDKQGNFLQAQARAQKT LEGLIKQYDELLHKNWELASEEQKARIAQLRAQTDKLTGNNQELEDMEEIEGDIYGSGK >gi|157101657|gb|DS480667.1| GENE 118 119544 - 119663 222 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKDLPKKIKKLNKLCDQLIELLVKAYIILLVIEALTKIV >gi|157101657|gb|DS480667.1| GENE 119 119850 - 120041 174 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935248|ref|ZP_02082631.1| ## NR: gi|160935248|ref|ZP_02082631.1| hypothetical protein CLOBOL_00143 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00143 [Clostridium bolteae ATCC BAA-613] # 1 63 1 63 63 110 100.0 5e-23 MSEDKKYTSQQKHLRTKYVRFPLDLKPEVLDAFKAKCDIMGTTPTTEIKKFINNFISEDE AAD >gi|157101657|gb|DS480667.1| GENE 120 120123 - 121679 508 518 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_1277 NR:ns ## KEGG: Pedsa_1277 # Name: not_defined # Def: hypothetical protein # Organism: P.saltans # Pathway: not_defined # 4 362 157 513 685 121 27.0 1e-25 MLEIQNEDSGYTNEQRRIINDFLDWDDKEKNEILFRLIAFSVDYCRLTTKKNINSFTALL KGKKFYIDANVIYRLMGINNESRRKAMKEFVEKCRETGIELLYTNVTYKELTETFVYYVD 
KIKQVLQVTNASPTKIKKLYDVKDDNGFYDIYYKWAGENKSYGKWNDFLYYLKGQLRDTI ASFRKVTIQNSQVFDKEKFEILVDDLECFKRNSSNSNRRSINRVNVEYDINNILHLLDER KKNGKNAWEINEYMISADHGLVEWADRMFLGEVPYVVLPSIWYSLLLKLTGRTEDDYKAY VEFMKLRYTQTINYTPEEIIYDITQLTEKGEIQDRVIDILTDENLLITKDNEQLEIQEKV RIAYDKAVDEIRKNEYNDGYSLGNVEGYRSGKEDAEKFSYEKGKRVATLEIKKDALLRQI ADKAKKKKRINYIIIGIGTIFILAVVFLVCKWVWHDLSPDKVDQFDLIITILSCALGGTV WMMIKYFLCVDLKVLEDRERNKVIDELDDIERNLKELK >gi|157101657|gb|DS480667.1| GENE 121 121660 - 122130 67 156 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935250|ref|ZP_02082633.1| ## NR: gi|160935250|ref|ZP_02082633.1| hypothetical protein CLOBOL_00145 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00145 [Clostridium bolteae ATCC BAA-613] # 1 156 1 156 156 269 100.0 6e-71 MNENMLRMALVINTSHDNFIKNLSNILLYLLYNDGSPNQYTADELKDLVEKTIHLQFTVN EIESALKNCESDRLIFKYNDKYTLDELGCKKVTRNSNNDIKRLIDKYLLVYQIEGYGHED LFQLICNYLYNLLDSNYKRFIASSRYRKTGYVRDSK >gi|157101657|gb|DS480667.1| GENE 122 122167 - 122304 74 45 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935251|ref|ZP_02082634.1| ## NR: gi|160935251|ref|ZP_02082634.1| hypothetical protein CLOBOL_00146 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00146 [Clostridium bolteae ATCC BAA-613] # 1 45 1 45 45 72 100.0 1e-11 MINENDVKKEYTMYLFHIMLAEEFPTYIMSYEEFRDSYIEDACTP >gi|157101657|gb|DS480667.1| GENE 123 122744 - 123196 296 150 aa, chain - ## HITS:1 COG:no KEGG:Closa_1384 NR:ns ## KEGG: Closa_1384 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 150 1 150 150 199 72.0 4e-50 MDKEVLIQYCEMKEEIKDIRRRIQKLDRFLEEPHQVSDTVKGTRRDGTIGSIKVTGYPVP EHYRKQRLRERYRLLLERKEAELLELTCQAEEYIQGIPKSEVRTMFRLYYIDGLPWWKVA QAMNRMFPKRRVKFTEDSCRVRNNRFFEEI >gi|157101657|gb|DS480667.1| GENE 124 123202 - 123495 185 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935253|ref|ZP_02082636.1| ## NR: gi|160935253|ref|ZP_02082636.1| hypothetical protein CLOBOL_00149 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00149 
[Clostridium bolteae ATCC BAA-613] # 6 97 1 92 92 169 98.0 8e-41 MASRPVYYDLYDCGQYDGRYRAAELMVMLGIRHRQQIEHYSDVGILYQKRYTFVRVEDGN ASELAYEWDRVTQVLKGCGHDLGRIPIVVSRDKRKRR >gi|157101657|gb|DS480667.1| GENE 125 123470 - 123670 204 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935254|ref|ZP_02082637.1| ## NR: gi|160935254|ref|ZP_02082637.1| hypothetical protein CLOBOL_00150 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00150 [Clostridium bolteae ATCC BAA-613] # 1 66 1 66 66 102 100.0 7e-21 MRKKGSKQSKISRIDRSKALAAQADEAIKERIRTAPAYMYTSLCPVPELRRPPKGVIVRG IKTCVL >gi|157101657|gb|DS480667.1| GENE 126 123948 - 124190 278 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935256|ref|ZP_02082639.1| ## NR: gi|160935256|ref|ZP_02082639.1| hypothetical protein CLOBOL_00152 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00152 [Clostridium bolteae ATCC BAA-613] # 1 80 2 81 81 150 100.0 2e-35 MWKIIFTYPDGVKVKLTNSSIPMDKRLANKYYDIYGYNSDGGIFQQYPKKKYRPMAMATV VDILNAGGNLEKEILIDADD >gi|157101657|gb|DS480667.1| GENE 127 124781 - 125428 112 215 aa, chain - ## HITS:1 COG:jhp0043 KEGG:ns NR:ns ## COG: jhp0043 COG0863 # Protein_GI_number: 15611114 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Helicobacter pylori J99 # 153 214 168 229 230 66 51.0 3e-11 MIFTDLPYGTTRNGWDCVIDLKRLWEHYSRIIKNNGCIALWAQSPFDKVLACSNLKMYRY EWIIEKTKGTGHLNAAKMPMKCHENVLIFYKHLPTYNPQITTGHSPVHSYTKHVSDGSNY GKTRTGISGGGSTERYPRDVLRFKWDTQRERLHPTQKPLEACKYFIRTYTNSGDTVLDSC MGSNTTGVACQELGRKYIGIEKDTVNYRIALDRVD >gi|157101657|gb|DS480667.1| GENE 128 125561 - 125998 402 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935259|ref|ZP_02082642.1| ## NR: gi|160935259|ref|ZP_02082642.1| hypothetical protein CLOBOL_00155 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00155 [Clostridium bolteae ATCC BAA-613] # 1 145 1 145 145 235 100.0 8e-61 MVIRFNIPNGKMEINLETFFQEARRPQIRKMLKWVRASWPDEENAREIREWLTDRRQDET 
DRAKAFAKKYVDCRTELAELQEMYERMQSPCYAVYTRNKEKLTNAKKDVSRYKAKTVRYK REMGEHRKLAERYEGILKDVDKLLS >gi|157101657|gb|DS480667.1| GENE 129 125988 - 127358 1002 456 aa, chain - ## HITS:1 COG:XF0680 KEGG:ns NR:ns ## COG: XF0680 COG0553 # Protein_GI_number: 15837282 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Xylella fastidiosa 9a5c # 8 446 6 456 472 333 43.0 3e-91 MHYEPHDYQKYATEKIIELPACALFMEMGLGKTVSTLTAIDELIYDRFEVEKVLVIAPYR VADDTWTTETEKWDHLKHLRVSKVLGTSGERIAALEADADIYVINRENVTWLVNLTGKEW PFEMIVVDELSSFKSNSAKRFKSLRIVRPLARRFVGLTGTPAPNGLLDLWPQVYLIDRGE RLGKTYTGYKDRYFLPDRRNGFVVYSWTPKEGAKEAIEQKLSDICISMKADDYLNLPAQI VNDVYVSMDRHEMRKYRELEKEKLLELDGKEITALSAAAVWGKLLQLANGAAYDGEGNVI PLHDRKLDALAEILEASGGHPVLVFYNFRHDYDRLMGRFKGYNPRTLKSQQDIRDWNEGR IPLLLAQPASMGHGLNIQAGGHIIVWFGLNPSLELYLQANARLHRQGQTEAVIIHRLITK GTVDEDVVKKLWVKDETQDGLMESLKARIRRIKDGN >gi|157101657|gb|DS480667.1| GENE 130 127339 - 127632 244 97 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c36520 NR:ns ## KEGG: CLJU_c36520 # Name: not_defined # Def: phage-like protein # Organism: C.ljungdahlii # Pathway: not_defined # 1 91 1 91 91 74 43.0 1e-12 MLEKDIEDWLNKQIEKMGGLAFKFVSPGNPGVPDRIYILPDGRVWFVELKQQLGKVARIQ KWQRERLIRLGCNYRLVKGMDDARAYVGEMRDALRTA Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:34:18 2011 Seq name: gi|157101656|gb|DS480668.1| Clostridium bolteae ATCC BAA-613 Scfld_02_9 genomic scaffold, whole genome shotgun sequence Length of sequence - 187012 bp Number of predicted genes - 168, with homology - 167 Number of transcription units - 88, operones - 41 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 82 - 141 8.3 1 1 Tu 1 . 
+ CDS 185 - 583 408 ## COG2050 Uncharacterized protein, possibly involved in aromatic compounds catabolism + Term 667 - 718 16.6 - Term 655 - 706 12.8 2 2 Op 1 3/0.000 - CDS 810 - 1232 521 ## COG4747 ACT domain-containing protein 3 2 Op 2 1/0.133 - CDS 1252 - 2550 1401 ## COG1541 Coenzyme F390 synthetase - Prom 2580 - 2639 3.0 4 3 Op 1 11/0.000 - CDS 2676 - 3251 781 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 5 3 Op 2 . - CDS 3253 - 5001 1761 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits 6 3 Op 3 . - CDS 5068 - 6375 1552 ## COG1541 Coenzyme F390 synthetase - Prom 6472 - 6531 4.4 7 4 Tu 1 . - CDS 6765 - 7274 514 ## COG0517 FOG: CBS domain - Prom 7300 - 7359 8.6 8 5 Tu 1 . - CDS 7415 - 9574 2101 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 9707 - 9766 7.4 - Term 9826 - 9882 9.1 9 6 Op 1 . - CDS 9978 - 10235 405 ## Closa_1801 hypothetical protein 10 6 Op 2 . - CDS 10266 - 11555 1456 ## COG1301 Na+/H+-dicarboxylate symporters 11 6 Op 3 . - CDS 11592 - 12761 1071 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities - Prom 12925 - 12984 2.6 - Term 12892 - 12955 10.2 12 7 Tu 1 . - CDS 13019 - 15154 1848 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 15177 - 15236 5.1 + Prom 15194 - 15253 3.3 13 8 Tu 1 . + CDS 15327 - 16619 1134 ## COG3681 Uncharacterized conserved protein + Term 16651 - 16689 -0.5 - Term 16550 - 16598 2.0 14 9 Tu 1 . - CDS 16628 - 18088 1468 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain - Term 18483 - 18545 11.1 15 10 Op 1 . - CDS 18652 - 19683 686 ## COG2320 Uncharacterized conserved protein 16 10 Op 2 . - CDS 19673 - 20479 917 ## COG2357 Uncharacterized protein conserved in bacteria 17 10 Op 3 . 
- CDS 20543 - 21064 452 ## COG0242 N-formylmethionyl-tRNA deformylase - Prom 21091 - 21150 2.0 - Term 21123 - 21165 9.2 18 11 Tu 1 . - CDS 21187 - 21618 556 ## COG1661 Predicted DNA-binding protein with PD1-like DNA-binding motif - Prom 21654 - 21713 6.6 + Prom 21783 - 21842 9.5 19 12 Tu 1 . + CDS 21893 - 22225 280 ## COG1733 Predicted transcriptional regulators - Term 22092 - 22123 -0.9 20 13 Op 1 . - CDS 22332 - 23198 679 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily 21 13 Op 2 . - CDS 23226 - 24524 1282 ## CLL_A2678 hypothetical protein - Prom 24556 - 24615 4.9 - Term 24598 - 24669 28.2 22 14 Op 1 . - CDS 24678 - 26312 884 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 23 14 Op 2 1/0.133 - CDS 26347 - 27048 583 ## COG2186 Transcriptional regulators - Prom 27076 - 27135 7.7 24 15 Op 1 . - CDS 27146 - 28441 1169 ## COG1593 TRAP-type C4-dicarboxylate transport system, large permease component 25 15 Op 2 . - CDS 28442 - 28960 393 ## Arnit_2259 tripartite ATP-independent periplasmic transporter subunit DctQ 26 15 Op 3 . - CDS 28968 - 30032 229 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 27 15 Op 4 . - CDS 30049 - 30453 282 ## COG0346 Lactoylglutathione lyase and related lyases 28 15 Op 5 . - CDS 30466 - 32043 1049 ## COG4670 Acyl CoA:acetate/3-ketoacid CoA transferase 29 15 Op 6 10/0.000 - CDS 32106 - 33290 925 ## COG0183 Acetyl-CoA acetyltransferase 30 15 Op 7 2/0.067 - CDS 33302 - 34048 221 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 34142 - 34201 7.3 31 16 Tu 1 . - CDS 34390 - 35019 507 ## PROTEIN SUPPORTED gi|229859876|ref|ZP_04479533.1| acetyltransferase, ribosomal protein N-acetylase - Prom 35090 - 35149 6.8 32 17 Tu 1 . - CDS 35200 - 36066 800 ## COG0789 Predicted transcriptional regulators - Prom 36183 - 36242 5.8 - Term 36177 - 36221 1.6 33 18 Tu 1 . 
- CDS 36357 - 36554 374 ## gi|160935303|ref|ZP_02082685.1| hypothetical protein CLOBOL_00198 34 19 Tu 1 . - CDS 36670 - 37017 262 ## gi|160935304|ref|ZP_02082686.1| hypothetical protein CLOBOL_00199 - Prom 37111 - 37170 7.4 - Term 37246 - 37317 11.1 35 20 Tu 1 . - CDS 37359 - 37997 201 ## PROTEIN SUPPORTED gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 - Prom 38094 - 38153 7.0 - Term 38122 - 38173 -0.8 36 21 Op 1 11/0.000 - CDS 38211 - 39383 1649 ## COG4214 ABC-type xylose transport system, permease component 37 21 Op 2 11/0.000 - CDS 39387 - 40925 179 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 - Term 40937 - 40968 0.7 38 21 Op 3 . - CDS 41014 - 42189 1571 ## COG4213 ABC-type xylose transport system, periplasmic component - Prom 42294 - 42353 6.6 - Term 42281 - 42313 2.4 39 22 Tu 1 . - CDS 42447 - 43667 240 ## gi|160935313|ref|ZP_02082695.1| hypothetical protein CLOBOL_00208 - Term 43733 - 43773 -0.5 40 23 Op 1 . - CDS 43791 - 44852 1209 ## COG4213 ABC-type xylose transport system, periplasmic component 41 23 Op 2 7/0.000 - CDS 44895 - 46505 1639 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 42 23 Op 3 1/0.133 - CDS 46480 - 48000 1805 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 43 23 Op 4 . - CDS 48028 - 49029 1019 ## COG1879 ABC-type sugar transport system, periplasmic component 44 23 Op 5 . - CDS 49026 - 49592 681 ## gi|160935318|ref|ZP_02082700.1| hypothetical protein CLOBOL_00213 - Prom 49683 - 49742 4.4 + Prom 49759 - 49818 7.5 45 24 Tu 1 . + CDS 49893 - 50252 257 ## gi|160935319|ref|ZP_02082701.1| hypothetical protein CLOBOL_00214 + Term 50384 - 50430 17.5 - Term 50370 - 50421 18.0 46 25 Op 1 . - CDS 50432 - 50971 294 ## gi|160935321|ref|ZP_02082703.1| hypothetical protein CLOBOL_00216 - Prom 50995 - 51054 2.1 47 25 Op 2 . 
- CDS 51219 - 51644 407 ## gi|160935323|ref|ZP_02082705.1| hypothetical protein CLOBOL_00218 - Prom 51676 - 51735 2.5 48 26 Op 1 . - CDS 51919 - 58932 4326 ## DvMF_1880 outer membrane autotransporter barrel domain protein 49 26 Op 2 . - CDS 58987 - 59763 657 ## gi|160935325|ref|ZP_02082707.1| hypothetical protein CLOBOL_00220 50 26 Op 3 . - CDS 59777 - 60340 352 ## gi|160935326|ref|ZP_02082708.1| hypothetical protein CLOBOL_00221 - Prom 60370 - 60429 1.8 51 27 Tu 1 . - CDS 60449 - 62185 780 ## gi|160935327|ref|ZP_02082709.1| hypothetical protein CLOBOL_00222 - Term 62275 - 62314 3.1 52 28 Tu 1 . - CDS 62320 - 62928 472 ## gi|160935328|ref|ZP_02082710.1| hypothetical protein CLOBOL_00223 - Term 62944 - 62992 6.6 53 29 Op 1 . - CDS 63085 - 63705 263 ## COG0681 Signal peptidase I 54 29 Op 2 . - CDS 63702 - 63878 144 ## gi|160935331|ref|ZP_02082713.1| hypothetical protein CLOBOL_00226 - Prom 63957 - 64016 3.4 - Term 64154 - 64202 2.8 55 30 Tu 1 . - CDS 64379 - 65224 635 ## COG0582 Integrase - Prom 65314 - 65373 8.8 - Term 65394 - 65431 5.1 56 31 Tu 1 . - CDS 65635 - 66678 565 ## COG1680 Beta-lactamase class C and other penicillin binding proteins - Prom 66717 - 66776 6.9 57 32 Tu 1 . - CDS 66800 - 67291 427 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 67350 - 67409 4.3 - Term 67467 - 67509 6.3 58 33 Op 1 . - CDS 67575 - 69068 1698 ## COG0516 IMP dehydrogenase/GMP reductase - Prom 69119 - 69178 4.3 59 33 Op 2 . - CDS 69244 - 69768 554 ## COG0717 Deoxycytidine deaminase - Prom 69793 - 69852 5.5 - Term 69863 - 69907 12.1 60 34 Op 1 . - CDS 69922 - 70623 547 ## COG0637 Predicted phosphatase/phosphohexomutase 61 34 Op 2 . - CDS 70635 - 71048 496 ## CA_C3497 hypothetical protein 62 35 Op 1 2/0.067 - CDS 71157 - 72044 754 ## COG0524 Sugar kinases, ribokinase family 63 35 Op 2 . - CDS 72063 - 72851 1007 ## COG0524 Sugar kinases, ribokinase family 64 35 Op 3 . - CDS 72855 - 73295 538 ## COG0432 Uncharacterized conserved protein 65 35 Op 4 . 
- CDS 73337 - 74185 661 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 66 35 Op 5 . - CDS 74237 - 75787 1140 ## COG1070 Sugar (pentulose and hexulose) kinases 67 35 Op 6 . - CDS 75801 - 76610 608 ## COG0434 Predicted TIM-barrel enzyme 68 35 Op 7 1/0.133 - CDS 76612 - 77445 709 ## COG1082 Sugar phosphate isomerases/epimerases - Term 77480 - 77536 3.6 69 36 Tu 1 . - CDS 77549 - 78871 1316 ## COG2704 Anaerobic C4-dicarboxylate transporter - Prom 79104 - 79163 7.0 - Term 79130 - 79180 0.5 70 37 Tu 1 . - CDS 79238 - 80068 678 ## COG1082 Sugar phosphate isomerases/epimerases - Prom 80161 - 80220 3.4 - Term 80083 - 80115 5.0 71 38 Tu 1 . - CDS 80228 - 81238 964 ## COG2222 Predicted phosphosugar isomerases - Prom 81280 - 81339 8.4 + Prom 81316 - 81375 6.6 72 39 Tu 1 . + CDS 81411 - 82124 500 ## COG2188 Transcriptional regulators + Term 82143 - 82193 10.5 - Term 82131 - 82181 9.7 73 40 Op 1 . - CDS 82285 - 83073 799 ## COG0300 Short-chain dehydrogenases of various substrate specificities - Prom 83107 - 83166 4.7 74 40 Op 2 . - CDS 83195 - 83776 709 ## COG1309 Transcriptional regulator - Prom 83882 - 83941 9.2 + Prom 83974 - 84033 6.5 75 41 Tu 1 . + CDS 84112 - 84312 199 ## gi|160935358|ref|ZP_02082740.1| hypothetical protein CLOBOL_00253 + Term 84346 - 84383 1.5 76 42 Tu 1 . - CDS 84356 - 84484 59 ## gi|160935359|ref|ZP_02082741.1| hypothetical protein CLOBOL_00254 - Prom 84517 - 84576 2.5 + Prom 84483 - 84542 6.1 77 43 Op 1 40/0.000 + CDS 84569 - 85243 756 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 78 43 Op 2 . + CDS 85240 - 86496 501 ## COG0642 Signal transduction histidine kinase - Term 86313 - 86356 0.2 79 44 Op 1 . - CDS 86538 - 88499 1424 ## Sgly_0480 hypothetical protein 80 44 Op 2 . - CDS 88512 - 89183 673 ## Sgly_0479 hypothetical protein 81 44 Op 3 . 
- CDS 89176 - 89895 465 ## Sgly_0478 VTC domain-containing protein - Prom 89921 - 89980 6.9 + Prom 90276 - 90335 13.9 82 45 Op 1 . + CDS 90564 - 91220 386 ## COG2964 Uncharacterized protein conserved in bacteria 83 45 Op 2 31/0.000 + CDS 91274 - 92110 702 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 92121 - 92156 3.9 84 45 Op 3 17/0.000 + CDS 92167 - 92841 392 ## COG0765 ABC-type amino acid transport system, permease component 85 45 Op 4 34/0.000 + CDS 92847 - 93494 254 ## COG0765 ABC-type amino acid transport system, permease component 86 45 Op 5 2/0.067 + CDS 93491 - 94222 630 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 87 45 Op 6 . + CDS 94251 - 94709 353 ## COG0251 Putative translation initiation inhibitor, yjgF family 88 45 Op 7 . + CDS 94745 - 95860 273 ## COG3616 Predicted amino acid aldolase or racemase 89 45 Op 8 2/0.067 + CDS 95875 - 96513 442 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase 90 45 Op 9 . + CDS 96540 - 97559 387 ## COG3734 2-keto-3-deoxy-galactonokinase + Term 97583 - 97624 6.0 - Term 97561 - 97618 8.4 91 46 Op 1 35/0.000 - CDS 97630 - 99372 241 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 92 46 Op 2 . - CDS 99369 - 101183 207 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 93 47 Tu 1 . - CDS 101284 - 101907 563 ## Closa_2536 TetR family transcriptional regulator - Prom 102017 - 102076 7.1 - Term 101963 - 102013 5.1 94 48 Tu 1 . - CDS 102132 - 102695 626 ## COG0450 Peroxiredoxin - Prom 102853 - 102912 9.6 - Term 102881 - 102929 7.2 95 49 Tu 1 . - CDS 102942 - 103970 1105 ## COG1299 Phosphotransferase system, fructose-specific IIC component - Term 104468 - 104514 2.3 96 50 Tu 1 . - CDS 104583 - 106658 2377 ## COG2217 Cation transport ATPase - Prom 106693 - 106752 2.1 - Term 106706 - 106745 7.6 97 51 Op 1 . 
- CDS 106754 - 107083 356 ## Rumal_2099 hypothetical protein 98 51 Op 2 22/0.000 - CDS 107148 - 109322 2198 ## COG0370 Fe2+ transport system protein B - Prom 109347 - 109406 6.3 99 51 Op 3 2/0.067 - CDS 109408 - 109629 274 ## COG1918 Fe2+ transport system protein A 100 51 Op 4 . - CDS 109669 - 109878 330 ## COG1918 Fe2+ transport system protein A - Prom 109965 - 110024 6.2 + Prom 109982 - 110041 2.7 101 52 Tu 1 . + CDS 110146 - 110535 414 ## COG1321 Mn-dependent transcriptional regulator + Term 110536 - 110580 0.0 - Term 110416 - 110449 1.0 102 53 Tu 1 . - CDS 110545 - 111918 1306 ## CD3111 hypothetical protein - Prom 112041 - 112100 5.5 + Prom 112000 - 112059 10.1 103 54 Op 1 5/0.000 + CDS 112102 - 113274 926 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 104 54 Op 2 . + CDS 113274 - 114599 1225 ## COG0477 Permeases of the major facilitator superfamily + Term 114613 - 114668 19.1 - Term 114601 - 114656 19.1 105 55 Op 1 2/0.067 - CDS 114778 - 115602 754 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 106 55 Op 2 . - CDS 115599 - 116618 746 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 107 56 Tu 1 . - CDS 116749 - 117429 411 ## COG5015 Uncharacterized conserved protein - Prom 117465 - 117524 3.9 108 57 Op 1 36/0.000 - CDS 117665 - 119629 1460 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 109 57 Op 2 4/0.000 - CDS 119626 - 120324 230 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 - Prom 120364 - 120423 4.6 - Term 120455 - 120499 9.0 110 58 Op 1 40/0.000 - CDS 120534 - 121544 849 ## COG0642 Signal transduction histidine kinase 111 58 Op 2 . - CDS 121548 - 122216 515 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 112 58 Op 3 . - CDS 122242 - 122847 531 ## gi|160935403|ref|ZP_02082785.1| hypothetical protein CLOBOL_00298 - Prom 122996 - 123055 2.2 113 59 Tu 1 . 
+ CDS 122885 - 123898 634 ## COG3049 Penicillin V acylase and related amidases + Term 124112 - 124160 -0.9 - Term 123688 - 123722 1.2 114 60 Tu 1 . - CDS 123872 - 126718 2493 ## COG0642 Signal transduction histidine kinase - Prom 126763 - 126822 6.2 - Term 126888 - 126918 3.0 115 61 Op 1 6/0.000 - CDS 126996 - 127586 508 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 116 61 Op 2 . - CDS 127586 - 128857 1232 ## COG3395 Uncharacterized protein conserved in bacteria 117 61 Op 3 . - CDS 128869 - 130239 1004 ## COG4091 Predicted homoserine dehydrogenase - Prom 130259 - 130318 6.0 - Term 130352 - 130392 -0.9 118 62 Op 1 . - CDS 130566 - 131921 996 ## PPE_04861 hypothetical protein 119 62 Op 2 . - CDS 131942 - 132541 505 ## Cfla_0433 aminoglycoside-2''-adenylyltransferase - Prom 132704 - 132763 2.3 - Term 132650 - 132694 9.0 120 63 Op 1 2/0.067 - CDS 132778 - 134706 1202 ## COG1331 Highly conserved protein containing a thioredoxin domain 121 63 Op 2 . - CDS 134745 - 134909 60 ## COG1331 Highly conserved protein containing a thioredoxin domain - Prom 134961 - 135020 5.5 - Term 134959 - 135016 5.3 122 64 Op 1 . - CDS 135028 - 136113 1000 ## COG3949 Uncharacterized membrane protein 123 64 Op 2 . - CDS 136110 - 136523 300 ## gi|160935416|ref|ZP_02082798.1| hypothetical protein CLOBOL_00311 124 64 Op 3 . - CDS 136538 - 136729 137 ## COPRO5265_0224 glycine reductase complex component B subunit gamma (selenoprotein PB gamma) (EC:1.21.4.2) 125 64 Op 4 . - CDS 136793 - 137836 879 ## COPRO5265_0224 glycine reductase complex component B subunit gamma (selenoprotein PB gamma) (EC:1.21.4.2) 126 64 Op 5 . - CDS 137852 - 139141 1061 ## STH2870 glycine reductase proprotein - Prom 139187 - 139246 6.5 - Term 139235 - 139277 7.2 127 65 Tu 1 . 
- CDS 139304 - 140176 583 ## COG0789 Predicted transcriptional regulators - Prom 140211 - 140270 6.0 - Term 140335 - 140390 5.5 128 66 Op 1 1/0.133 - CDS 140414 - 142309 1126 ## COG3534 Alpha-L-arabinofuranosidase 129 66 Op 2 38/0.000 - CDS 142315 - 143157 691 ## COG0395 ABC-type sugar transport system, permease component 130 66 Op 3 35/0.000 - CDS 143172 - 144059 754 ## COG1175 ABC-type sugar transport systems, permease components 131 66 Op 4 . - CDS 144085 - 144957 664 ## COG1653 ABC-type sugar transport system, periplasmic component 132 66 Op 5 . - CDS 145036 - 145371 211 ## Pjdr2_1617 extracellular solute-binding protein family 1 133 66 Op 6 . - CDS 145368 - 147128 904 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 147205 - 147264 6.1 + Prom 147162 - 147221 6.8 134 67 Op 1 . + CDS 147288 - 147428 73 ## gi|160935428|ref|ZP_02082810.1| hypothetical protein CLOBOL_00323 135 67 Op 2 . + CDS 147437 - 148435 502 ## COG4189 Predicted transcriptional regulator - Term 148528 - 148553 -0.5 136 68 Op 1 . - CDS 148582 - 149235 222 ## EUBELI_01773 hypothetical protein 137 68 Op 2 35/0.000 - CDS 149249 - 151123 254 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 138 68 Op 3 . - CDS 151110 - 153044 1475 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Prom 153076 - 153135 2.3 139 69 Op 1 . - CDS 153140 - 154009 382 ## COG1715 Restriction endonuclease 140 69 Op 2 . - CDS 154015 - 154266 89 ## 141 70 Tu 1 . - CDS 155015 - 155785 185 ## HDEF_1084 putative R of type II restriction and modification system - Prom 155960 - 156019 4.7 142 71 Op 1 4/0.000 - CDS 156117 - 157283 116 ## COG0286 Type I restriction-modification system methyltransferase subunit 143 71 Op 2 . - CDS 157286 - 158119 317 ## COG0286 Type I restriction-modification system methyltransferase subunit - Prom 158170 - 158229 6.2 144 72 Tu 1 . - CDS 158584 - 159153 183 ## LDBND_1115 hypothetical protein - Prom 159394 - 159453 4.1 145 73 Tu 1 . 
- CDS 160127 - 161101 334 ## COG3943 Virulence protein - Prom 161158 - 161217 5.5 146 74 Op 1 . - CDS 161296 - 161841 151 ## Ccel_2698 hypothetical protein 147 74 Op 2 . - CDS 161889 - 162401 130 ## DSY3426 hypothetical protein - Term 162438 - 162480 9.2 148 75 Tu 1 . - CDS 162514 - 164175 989 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase - Prom 164282 - 164341 5.3 - Term 164461 - 164524 0.2 149 76 Tu 1 . - CDS 164641 - 166494 1654 ## COG0006 Xaa-Pro aminopeptidase 150 77 Op 1 . - CDS 166660 - 168084 652 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 - Prom 168115 - 168174 5.9 151 77 Op 2 . - CDS 168187 - 168987 959 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 169083 - 169142 76.3 + TRNA 169066 - 169139 70.4 # Pro GGG 0 0 - Term 169760 - 169823 16.6 152 78 Op 1 2/0.067 - CDS 169853 - 170785 707 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 153 78 Op 2 . - CDS 170818 - 171372 426 ## COG1396 Predicted transcriptional regulators - Prom 171497 - 171556 4.7 154 79 Op 1 . - CDS 171614 - 173050 1367 ## COG1288 Predicted membrane protein 155 79 Op 2 . - CDS 173090 - 174085 964 ## COG2355 Zn-dependent dipeptidase, microsomal dipeptidase homolog - Prom 174111 - 174170 7.7 + Prom 174213 - 174272 11.3 156 80 Tu 1 . + CDS 174370 - 175203 609 ## COG0345 Pyrroline-5-carboxylate reductase + Term 175397 - 175446 15.1 - Term 175387 - 175430 10.6 157 81 Tu 1 . - CDS 175437 - 175664 234 ## gi|160935461|ref|ZP_02082843.1| hypothetical protein CLOBOL_00357 - Prom 175749 - 175808 2.4 + Prom 175718 - 175777 6.3 158 82 Tu 1 . + CDS 175818 - 176216 430 ## gi|160935462|ref|ZP_02082844.1| hypothetical protein CLOBOL_00358 + Term 176314 - 176377 10.4 159 83 Op 1 . - CDS 176189 - 176524 181 ## gi|160935463|ref|ZP_02082845.1| hypothetical protein CLOBOL_00359 160 83 Op 2 . 
- CDS 176570 - 176776 299 ## gi|160935464|ref|ZP_02082846.1| hypothetical protein CLOBOL_00360 - Prom 176834 - 176893 4.1 + Prom 176836 - 176895 5.8 161 84 Tu 1 . + CDS 176925 - 178568 1045 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Prom 178670 - 178729 4.5 162 85 Op 1 . + CDS 178825 - 180162 1216 ## COG2211 Na+/melibiose symporter and related transporters 163 85 Op 2 . + CDS 180155 - 181051 711 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 164 85 Op 3 . + CDS 181048 - 182364 1183 ## COG4690 Dipeptidase - Term 182310 - 182346 1.2 165 86 Tu 1 . - CDS 182570 - 183385 1037 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 183449 - 183508 9.1 + Prom 183447 - 183506 7.3 166 87 Tu 1 . + CDS 183643 - 185166 955 ## COG1680 Beta-lactamase class C and other penicillin binding proteins - Term 185252 - 185290 2.6 167 88 Op 1 21/0.000 - CDS 185293 - 185916 858 ## COG1392 Phosphate transport regulator (distant homolog of PhoU) 168 88 Op 2 . 
- CDS 185934 - 186992 998 ## COG0306 Phosphate/sulphate permeases Predicted protein(s) >gi|157101656|gb|DS480668.1| GENE 1 185 - 583 408 132 aa, chain + ## HITS:1 COG:AF2264 KEGG:ns NR:ns ## COG: AF2264 COG2050 # Protein_GI_number: 11499845 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized protein, possibly involved in aromatic compounds catabolism # Organism: Archaeoglobus fulgidus # 3 122 13 137 154 82 39.0 2e-16 MDAQKLEQIQKVFANDRFATDNGAVIEQVDEGYAKCWLEIQPHHLNAAGTVMGGAIFTLA DFAFAVASNWNKPLHVSTTSQITYLGVARGTRLIAEARKVKEGRSTCYYLVEVKDDLGNE VAHVTASGFSKG >gi|157101656|gb|DS480668.1| GENE 2 810 - 1232 521 140 aa, chain - ## HITS:1 COG:AF1672 KEGG:ns NR:ns ## COG: AF1672 COG4747 # Protein_GI_number: 11499262 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Archaeoglobus fulgidus # 3 136 2 135 137 115 44.0 2e-26 MFVKQLSVFIENREGRLEQVTEVLKEQDINILSLSMADTTEYGMLRMIVSDPDKAKAALR EKGFSAMLTDVLAVKLEDQVGELHKMTHILCSQGLNIEYMYALASANRRAMIVKASDGET AARAVTDGGLVLYSNEEMCW >gi|157101656|gb|DS480668.1| GENE 3 1252 - 2550 1401 432 aa, chain - ## HITS:1 COG:MTH161 KEGG:ns NR:ns ## COG: MTH161 COG1541 # Protein_GI_number: 15678189 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanothermobacter thermautotrophicus # 1 431 4 434 434 483 55.0 1e-136 MIWNETKECMSRDEMTNLQSARLRKLVDRVYHSVEYYRKKMQAVGLEPGDIRGIEDLERL PFTTKDDLRDTYPFGMFAVPNSQITRIHASSGTTGKSTVVGYTRKDIDVWSECVARCITM AGLGSNDIIQVAYGYGLFTGGLGAHYGAEEMGATVVPMSTGNTKKLVTMMVDFGVTGIMC TPSYLLHIAETIEEMGVKDKIKLKASINGAEPWTEKMRTQIERQLGIHAHDIYGLSEIMG PGVATDCQFHEGLHIHEDHFLPEIVDQNTLKPLRDGMTGELVISTLTKEGLPLIRYRTKD LTSLDHSTCQCGRTTARIARFTGRTDDMLIIRGVNVFPSQIESALLEMGGTTPHYLLIVD RVNNLDTLEVQVEVEERFFSDEIRELENLTGKIAYMIQQAIGLAVKVKLVEPKSIERSMG KTKHVIDKRNLN >gi|157101656|gb|DS480668.1| GENE 4 2676 - 3251 781 191 aa, chain - ## HITS:1 COG:CAC2000 KEGG:ns NR:ns ## COG: CAC2000 COG1014 # Protein_GI_number: 15895270 # 
Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Clostridium acetobutylicum # 1 186 1 187 192 189 51.0 3e-48 MTSKNIMIVGVGGQGTLLTSRILGKLAIGAGFDVKLSEVHGMAQRGGSVVTFVRYGEKVA EPIVEEGQADVLIAFERLEALRYLHFLKKDGVVIVNDWRIDPITVVTGVAQYPEGIIDRL KEARRTIVVEATEEAKKMGAPKAFNVIVLGAAAKHMGFEKEDWLKVIETTVPPKTVEINK KAFETGYALSV >gi|157101656|gb|DS480668.1| GENE 5 3253 - 5001 1761 582 aa, chain - ## HITS:1 COG:CAC2001 KEGG:ns NR:ns ## COG: CAC2001 COG4231 # Protein_GI_number: 15895271 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Clostridium acetobutylicum # 4 574 2 573 584 610 51.0 1e-174 MSEKRILLGNEAIARGAYEAGVKVSAAYPGTPSTEVSESLVQYDEIYAEWAPNEKVATEV AIGASIAGVRSMCVMKHVGMNVAADPLYTAAYTGVRGGLVLVVADDPGMYSSQNEQDSRM VARAAMVPIVEPSDSAEAKEFMKYAYDLSEKYDTPVILRSTTRLSHSQGLVELEERAEPF DIPYERDMAKYVMMPGNAIKRHVVVEARMKQMAEDANSLPINRVEYNDLSVGFITNGIAY QYVKEAMPQASVLKLGLLNPLPRKLIEEFAAKVDKLYIFEELEPVVEEQVKSWGIQKAVG KEIFTVQGEYSANLIRERVLGQTSQVDKAAQVPARPPILCPGCPHRSVYTVLNKLKIHAA GDIGCYTLGAVAPLSVIDTTICMGASISTLHGMEKAKGREYIKSWVAVIGDSTFMHTGIN SLMNMVYNQATGTVIILDNSTTGMTGHQDHAATGKTLKGQVVPAINIYGLCKSLGIEHVC EVDAFDQAELERVIKEEVARDAVSVIITKAPCALLKGIKFPNKCRPLPDKCKKCGACLRP GCPALTKNEDGTISIDETMCNGCGLCKQLCKFDAINLVKAGE >gi|157101656|gb|DS480668.1| GENE 6 5068 - 6375 1552 435 aa, chain - ## HITS:1 COG:MTH161 KEGG:ns NR:ns ## COG: MTH161 COG1541 # Protein_GI_number: 15678189 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanothermobacter thermautotrophicus # 1 432 4 432 434 419 48.0 1e-117 MIWAKEETMSRAEIEAIQLSRLKDTVNRVYEKVPAYRAKMEEAGVRPEHIQTLKDLQKLP FVTKQDMRDNYPYGLFAVPKKELRRIHASSGTTGKPTVVGYTENDLEVWKECVARLAVAG GASDEDVAQICFGYGMFTGALGLHNGLEKVGAAVVPSSTGNTQKQLMYMKDFETTLLVAT PSYAMRIAEVALEMGINPRKDLKVKTLVLGSELMTEAMRNELYKVWGEDVNLTQNYGMSE 
LMGPGVSGECLELKGMHINEDHFIAEVIDSATGEVLPPGEKGELVITCISKEALPLIRYR TRDITRLMYEPCPCGRTTARMENLSGRTDDMLKIRGVNVFPSQIEEVLINTEGIGPNYEI VVDRKNHSDILIIKVEVEAESMMDSYAALERLEDTLKDKMRLMLGLDAKIQLVSPNTLQR FEGKAKRVTDLRKEV >gi|157101656|gb|DS480668.1| GENE 7 6765 - 7274 514 169 aa, chain - ## HITS:1 COG:CAC3674 KEGG:ns NR:ns ## COG: CAC3674 COG0517 # Protein_GI_number: 15896906 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Clostridium acetobutylicum # 1 131 1 131 140 154 53.0 1e-37 MNILFFLTPKSDVAYISEDDTLRQALEKMEHHKYSAVPIVSRTGRYVGTLTEGDLLWGIK NQFNLDLKGAERIPVTAIRRRCDNRPVKADADMEDLIGKALNQNFVPVLDDQKSFIGIIT RKDIIKYFYQKCETAEERSKKCQDQGGKEHKKTDLSAVVKLRSQEGVLV >gi|157101656|gb|DS480668.1| GENE 8 7415 - 9574 2101 719 aa, chain - ## HITS:1 COG:BH0483 KEGG:ns NR:ns ## COG: BH0483 COG2207 # Protein_GI_number: 15613046 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 546 712 596 764 769 80 30.0 9e-15 MYRNLSREILGYNENILERLGDDLVTMNGRLMEAGNRISFSGPVVSRSRDAQNQIQMVDM LREENAANPYITKAYILYENEDMSFSSQGRYAKDILLAGQLRLPEEERQPFLEKLYNMDH PGYDVINKEYHMKDGIITRPQMVFIYPVYNYSFKVESWVVLELQPSKISRSMTTSPGSYS HGIAVLNGEGKILATEGEDLGLDEVMGRVLEAEPGSCTTFQERAGQGRYIVYPMDHPSLI ILSKITMPDLLADVVQNNSLLVTGVLLFLILGCFMAIYMAYHYYKPIHQLAQYMKQDNEN SEDRSRDELVFIKHQYDSLNSVRDSLAKEIEKQWPLVEERLVTKLLYGEDDTREDGTRED DMVNGILTEQMKGRNHMVILAAGGNMSSRELSALYKEKEGPIKERFQTDYGVYTSFLYYF GAIAVILSRKVLTEEHRIGARERLKEIFGSCECVISAGNIHEDASGVHLSYLEALTNMKF RLLNPHKELDFENMEQGKGEEYSATISRYQTECLLCISRCLESGSPEGVEGSVQEVVLSL EELPGQMALMCCYDIVSHLLKEVKKNDIRLTEKELYQLTAFRTVEEFGEKLGDALMRING VLAASRQDTRNDLVKQVLVYIEDNYRDSSLCLVGMAEHFGYSSSYMSKFITQNLNTSFSE LISKKRLDYTKDCLIHTDKQIAQIAQDAGYANLSNFTRRFKSSENMTPGQYRNLYGDGK >gi|157101656|gb|DS480668.1| GENE 9 9978 - 10235 405 85 aa, chain - ## HITS:1 COG:no KEGG:Closa_1801 NR:ns ## KEGG: Closa_1801 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 85 1 85 86 117 65.0 1e-25 
MKVTVIGSHLCPDTLYALCKLTEKQADIDFKNLSASLPDLKQYLEVREQNEMYESVRRAG GIGIPYFELEDGTGTLDLNEVMARI >gi|157101656|gb|DS480668.1| GENE 10 10266 - 11555 1456 429 aa, chain - ## HITS:1 COG:CPn0528 KEGG:ns NR:ns ## COG: CPn0528 COG1301 # Protein_GI_number: 15618439 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Chlamydophila pneumoniae CWL029 # 19 395 10 393 414 141 27.0 2e-33 MNKSEIWKNYRFPIILLSGIMIGAVIGVIFGEKAKVLAPLGDIFLNLMFTIVVPMVFVSI SSAVGNMVNMKRLGKILSSLVMVFVITGAIAGVLVLVVVNVFPPAAGTSIHFASAEMEEM SGIGEMVVGALTVDDFSGLMSRKNMLPIIVFAILSGFCVSACGGEDSPVGRMLNNLNSII MKIVSIIMLYAPIGLGAYFASLVGEFGPGLIGDYGHAMLVYYPMCIVYFMAAFPAYSYFA GGREGVRRMFKHILNPAVTAFATQSSIAALPVNCEACDKIGVPRDIRDIVLPMGATMHMD GSVLSSIVKISFLFGIFGQSFTGIGTYAMAMAVAILSAFVLSGAPGGGLVGEMLIVSLFG FPAEAFPLIATIGFLVDPAATCLNASGDTIASMMVTRLVEGKDWLEKQIYKENSMGNTAA AQAAIAETK >gi|157101656|gb|DS480668.1| GENE 11 11592 - 12761 1071 389 aa, chain - ## HITS:1 COG:YPO3006 KEGG:ns NR:ns ## COG: YPO3006 COG1168 # Protein_GI_number: 16123185 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Yersinia pestis # 1 389 1 392 393 376 46.0 1e-104 MKYNFDEVIDRSHNRAAKYDERMKKFGREDVIPLWVADMDFRTAEPIIDALKERAEEGIW GYTSRPDSYFEAIRGWQKRRNGWDIKKEMMSWSLGVVPALSSIVKLFSEPGDSIMIQTPV YSEFYDVTEAWGRNVLENRLVEKDGAWGIDFEDFEAKAKEARIFLLCSPHNPLGIVWSRE EMTRMADICMANNTLLVSDEIHSDLVFHGKKHIPTATLSAETAGRVISCISGTKTFNLAG LQASTTIFPNLEMKKKFDDFWMAMDIHRNNAFSSVAMEAAFNEGEEWLEQLLAYLDGNFT FIREYCSANIPKIKPTVPDATYLVWLDCRELNMDNETLRTFMIEKAGLGLNEGYTFGRSL SGYMRLNAACPRSVLERALGQLKNAVDRL >gi|157101656|gb|DS480668.1| GENE 12 13019 - 15154 1848 711 aa, chain - ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium 
acetobutylicum # 211 562 190 545 627 263 39.0 1e-69 MVKKVAVIALDPRASAFYGKQVQELFGEYVRVNSYSVREGTISNMERADLYLMSTDAFDN REDVPQYIPIEAQISEIHVTYLWETIRLLRTIPAGTRALFVNMTEKMVREATSGLNQLGV NHIEFIPFYPGAEPVEGIGLAVTPDEERYVPDSVKQVVNIGQRCLTSETMIEAALKLGLE ELLETPKFQAYFSSVVTSTYNFDMMFARSRRLESQFDILMEILDEGIIGINERAEIFAMN RKAETITGLSRTLVMHKQASASLAYLPFIECLKTKEKIEPRVIRINNVNVGVTVVPVLRR EDCIGAFAILQPFNDAENRQHELRNQLMHKGYRAKYTFEDVIGVSGEILRTKAILRRMAY TESPVLLIGETGTGKELFANAVHLASRREKGPFVAINCGAMPENLLESELFGYEEGAFTG AKKGGRPGLFEFAHHGTLFLDEVEGMSPALQVKLLRVLQEHEIMRVGGNKIISVDVRIVA ATNESLEEKVRDGSFRRDLYYRLNTLPVLIPPLCRRGEDVFLLMERFCRELKGEFVLSEE VKKIFRTYSWPGNIRELRNLAEYFCFTGSQVITVKDLPPTFQYDGIGTGLSYCVSKKAEY PEPVLQPDTAPWKQAMREQGLEPEEYWFVLKMLYCSSENGETVGRERILEAARKEKLLLS QPAVRSILGIFREYGLARVGRGRGGSRITSLGRKIWEGYEFNQINHIINQK >gi|157101656|gb|DS480668.1| GENE 13 15327 - 16619 1134 430 aa, chain + ## HITS:1 COG:yhaN+M KEGG:ns NR:ns ## COG: yhaN+M COG3681 # Protein_GI_number: 16132252 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 4 429 11 434 436 215 35.0 1e-55 MNKEELIALLKQEVVPALGCTEPVCVALAAADAAKAAGGTILSVRVEMNPGIYKNGMNVG IPGFDRVGLKYAAALGACLANPQKGLKLLEDLSPAVSSQAINLVESGMVTLKLKEDEPLL YAHAEAVSSAGTGASTIRNTHSNIVCTARTVNGVLTPFFQKEYSLTGRTALHEKLASMTI SQIVSLVESCSPEELAFMEDGARMNSALADYGMQHKLGIGAASSLAKYCRTDIVGNSLMS RIMTLMAGSAEGRMCGCPYAVMSSAGSGNHGLTTILPVVETARYLGSDKEHLVKALAISH SINVYIKEFTGKLSATCGCGVSAASAASASMVWLMGGSHEQMGYAIINMSGNLTGMICDG GKIGCALKLATASGAALMSACLAVDKTVLDPKDGICHATPEGAIRNMGRISNPGMAQTDR TILEIMMEKD >gi|157101656|gb|DS480668.1| GENE 14 16628 - 18088 1468 486 aa, chain - ## HITS:1 COG:CAC3230 KEGG:ns NR:ns ## COG: CAC3230 COG4624 # Protein_GI_number: 15896476 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 152 481 93 425 450 207 40.0 4e-53 MAYHTEWPVDKRIEELPYKIIPGEVGNFRNDVFLERAIVGERLRLAMGLPYRSAAEHAPV SDNIEAADRPETYYTPPLINVIKFACNACHEKRVFVTDGCQGCLAHPCEEVCPKDAIKLD 
RTNGRSHINDDKCIKCGRCADVCSYKAIIIQERPCAAACGMDAISTDENGKADIDYDKCV SCGMCLVNCPFGAIADKSQIFQVIRAIQSGERVYAAVAPAFVGQFGPKVTPGKLRAAMKA LGFADVFEVAVGADLCATQEAIDFIQEVPEKLPFMATSCCPAWSMMAKKLFPEHADCISM ALTPMTLTARLIKRHHSNAKVVFIGPCAAKKLEAMRTDIRSDVDFVLTFEEMDGIFDARH VDVEKIEEDPEGVNDASMDGRNFAVAGGVAKSVVDVIHERYPDKEIKVANAEGLKECRKL LTMAKAGKYNGYLLEGMACPGGCVAGAGTMQPIKKSQASVNLYASKAKHKTSNETEHVKE LDKLVD >gi|157101656|gb|DS480668.1| GENE 15 18652 - 19683 686 343 aa, chain - ## HITS:1 COG:PA4798 KEGG:ns NR:ns ## COG: PA4798 COG2320 # Protein_GI_number: 15599992 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pseudomonas aeruginosa # 29 183 29 184 242 83 36.0 8e-16 MKDRQDPFSGMSLEELWQLFPVFLTEHRPVWTQWYQEERERLSGILPMEDICEISHVGST SIPSIWAKPIVDILVEMRECGDMQAMKEHIIGGGYICMMEKAGRISFNRGYTPLGFAEKV FHLHLREAGDNDELYFRDYLREHPEAAREYEELKLGLWKEYEHDRDGYTEQKKAVVERFT REAKNLYPGRYKRQALRFARAEPEDTEALRRLARASEAHWGYDEAFMENFDAGFNVTEDF IRRNPVYAAGERGCPAAFWGIRQDRDAWELEYFYVAEERLGRGLGKQMWEHMTGWCGKQE ICRIHFVTSPQAVGFYRKMGAVRDGETRSPVDGRPVPHFVYDL >gi|157101656|gb|DS480668.1| GENE 16 19673 - 20479 917 268 aa, chain - ## HITS:1 COG:FN0926 KEGG:ns NR:ns ## COG: FN0926 COG2357 # Protein_GI_number: 19704261 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 4 251 6 238 259 132 35.0 9e-31 MIDKEKFLTSYNIEPEELKAAGLNWEELAAIYDDYLSVEGKLRSIGKDFVYDYLYDIERA GIHSYRYRTKDPGHLIEKIIRKRNEHFERFEDINRGNYHKYITDLIGIRVFFLYREDWIH FHKYITSVFENNPDLYVVDRIKDFDEDTGHYYIAERPKVYRRNGDTRIYDENLIEIKSNG IYRSLHYIIKYKEYYVEIQGRTLFEEGWSEIDHDIVYPYFQDDEMLKDFSTLLNRLSGMA DEMSSYFRRMKGVKEEWEKQLERYDDEG >gi|157101656|gb|DS480668.1| GENE 17 20543 - 21064 452 173 aa, chain - ## HITS:1 COG:TP0757 KEGG:ns NR:ns ## COG: TP0757 COG0242 # Protein_GI_number: 15639744 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Treponema pallidum # 20 166 5 145 162 76 35.0 3e-14 
MYGNENSAEGGFEVERTILLLGNPELYEASQEVKIEELHQMEQVREDLKDTLLAFRARYG VGRAIAAPQIGVKKRVIYRHLDTPVLFINPRLTFPEQEMIDVLDDCMSFPDLLVRVKRYK RCIIYYKDLEWNDCSMELKGDMSELIQHEYDHLDGILATLRAVDERALVMKPH >gi|157101656|gb|DS480668.1| GENE 18 21187 - 21618 556 143 aa, chain - ## HITS:1 COG:SA0649 KEGG:ns NR:ns ## COG: SA0649 COG1661 # Protein_GI_number: 15926371 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with PD1-like DNA-binding motif # Organism: Staphylococcus aureus N315 # 1 142 1 140 140 89 35.0 2e-18 MEYKRFGQKIILRLNPGEEVTECIREVCRTEKVALGEVSGLGAACEVELGVFDTTEKKYY GKSFKGIYEIAFLTGNITQQGENPYLHLHMVIGNPLVGECHGGHLNRAVISATAEIFITV IDGTAGRRMSESIGLNLIEFPQA >gi|157101656|gb|DS480668.1| GENE 19 21893 - 22225 280 110 aa, chain + ## HITS:1 COG:FN0589 KEGG:ns NR:ns ## COG: FN0589 COG1733 # Protein_GI_number: 19703924 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 7 107 6 106 107 137 66.0 4e-33 MDNAIKIPDCPVEMTLQLIGDKWKVLIIRDLLTGTKRFNELMRSVNGITQKVLTSHLRAM EQAGLITRKIYPQVPPKVEYTLTETGYSLKPILDSMVAWGTDYKKNIGQM >gi|157101656|gb|DS480668.1| GENE 20 22332 - 23198 679 288 aa, chain - ## HITS:1 COG:PM0638 KEGG:ns NR:ns ## COG: PM0638 COG4667 # Protein_GI_number: 15602503 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pasteurella multocida # 1 280 1 278 280 260 45.0 2e-69 MKTGIVVEGGGMRGIYAAGVLDVLLEHGIKADGLVGVSAGAIHGCSFVAGQAGRSIRYNL KYCRDPRYMSFRSLVMTGDLVGNEFCYHELPERLDPFDNDAFEASPMEFYATCTDVETGR PVYHRCGSLRGENIQWIRASASMPLVSRIVEIDGQKLLDGGVSDSIPVAAFRKMGFKRSL VILTRPDGYRKKPNQMLPAIRRVYRRYPEFIKTMECRHLMYNRELEEVRRMESAGEIFVI RPSRLIKISRTENRPEKIQQMYDLGRLDGMKALQDTAVFLGLDPAAQA >gi|157101656|gb|DS480668.1| GENE 21 23226 - 24524 1282 432 aa, chain - ## HITS:1 COG:no KEGG:CLL_A2678 NR:ns ## KEGG: CLL_A2678 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 1 432 1 429 429 328 39.0 3e-88 
MEPFENDKNQNTEPVMAENMTSAPAKRYEGKSFIFSDIPSQEKIIQVLCTTFGLDPEGKG DTLVLHNGDIEIRLAVASESAGEKAEEFIKRQVEGTCDHFYQVKTGVVDVKTNLLYQIGR ARSFVLVEYAFDVEDEEDIEDKKTMIEDMFVSILNDLEGIILIQNQDEKEDGMFCSGENG EKLLILSDKGGSAFTRYLPYQEPDLKAGKDITQEQVDRRMRTMQTLIGNAIYVPGWLPAV APASQITCPSLEETAQRALAIMGVAIYSECLLDEKQGMEKAQNFLMDYIDNNNTQDYFSP KEWKYLHDTAPDRKEILSFLRQYESLHVVEWALGLVEELGFPDHPCNPADLVKLLRNGTS ISKIMAKSRPKSPRELLDACDLISCLEWSCINARRHELPEPVGMKTGVVREWHKALNWLT GHGQWDEVRTNS >gi|157101656|gb|DS480668.1| GENE 22 24678 - 26312 884 544 aa, chain - ## HITS:1 COG:BH1131 KEGG:ns NR:ns ## COG: BH1131 COG0318 # Protein_GI_number: 15613694 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Bacillus halodurans # 44 535 17 544 546 218 30.0 3e-56 MLSWENYWPDLLRERIKEAGAGGRNIHVYEGMADSIYHAVGLTAARWPGKTAVVDDKGQA LTYGRLLDMTDRMAASLKKKTGIQKGERIGILLYNGPEFCVSYLAANKIGGVAVPLPGKF QRPEILSLLEKADVTTIICEEKFEPWFENLEHVTVIVSKSGPDFGLDWILRDMEDNEITD GVGTWDDSAILMFTSGTTSRSKGVAMKNYQVMNSVEAYAATLRLTQLDSSVIATPMYHVT GMICILSVMLAVGGTVYLMKKVDADRILTCFLENNITFYHASPTVFTILLEKRSSYPLIP SVKSFACGSGNMAPENIRKLKEWMPQAEFHTVYGMTETSGAGTIFPGGAADSPWIGASGI PMPDLRIKIADDDGTELMENQIGEICLKGSFVVEEYYKQKVDSITEDGWLRTGDLGYYNQ AGYLYIVDRKKDMINRGGEKVCSFDIENTIHTLPGVVEAVVVGVPDSKYMEVPAAVIKLE KGTNVRAEDIKEMLKTRVARFKIPEYIVFVEDIPKTHNGKIDKRMIKEWMRNIMEGSPEI SGVK >gi|157101656|gb|DS480668.1| GENE 23 26347 - 27048 583 233 aa, chain - ## HITS:1 COG:PA1627 KEGG:ns NR:ns ## COG: PA1627 COG2186 # Protein_GI_number: 15596824 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Pseudomonas aeruginosa # 6 209 6 204 215 110 31.0 2e-24 MGEGKKKYKYDIVADQILEEIRKGRWKVGDRLPPEAELALEFCVSRVCLREGLKKLNVLG IVKIMQGDGTYVNEVNPSEFMKPLFNLLTVTENDIDEIYNARLFVESGACRLAAKNRTEE EIKVLKRYLDNMEEAIAFNEFASYSKYDRKFHDLLVSASKNRILVLISQMFNDIAQRYTD RLNRDPKIVSRSMMDHRQLFGAVEDGDGDFACHIMQVHLERSRRALLDIEHSE >gi|157101656|gb|DS480668.1| GENE 24 27146 - 28441 1169 431 aa, 
chain - ## HITS:1 COG:BH2671 KEGG:ns NR:ns ## COG: BH2671 COG1593 # Protein_GI_number: 15615234 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, large permease component # Organism: Bacillus halodurans # 1 429 1 424 426 299 45.0 9e-81 MAGTILIGGLFVLVFMGFPIAIALGLVCCLYSNLFGTIDMSLIVESFLNGTNSFTLLAIP FFILSGELMIEGGISQRLINFCMSLVKDRPGGLAIVTIIASMFFAAISGSGPATVACIGG IMIPQMIRAGYSKEFSTAAAAAGGALGPIIPPSILFLLYGVATNTSVAKLFLAGAFPGIL LGIAMIIIAKKISVKEKYVPGPEVKAELQKVYDMGFWYNFKEAIWALLVPIIILGGIYSG VFSPTEASVVACVYALFAGMFIYKDLKLTNLPGVFMRAAKSCSFIVIISFSTAFAKLLTL EGVTSSIANGVLDMTTNKYLILILINVLLLVVGMIMDAAPATIILSPILLKIVQPLGVDP IQLGVIIVMNLSIGMITPPVGINLFVGCKISKIQLESTLKFALQLLVYMLVVLALTTYIP AISLFLPQMLT >gi|157101656|gb|DS480668.1| GENE 25 28442 - 28960 393 172 aa, chain - ## HITS:1 COG:no KEGG:Arnit_2259 NR:ns ## KEGG: Arnit_2259 # Name: not_defined # Def: tripartite ATP-independent periplasmic transporter subunit DctQ # Organism: A.nitrofigilis # Pathway: Two-component system [PATH:ant02020] # 14 161 7 157 208 65 23.0 8e-10 MIQEEHMKAYYNENKFYNHLEEWVGAVLLAVMVVMLFVQVVMRYIFKASAAWISEYSLYM FMYFVFLACSGAFLRNDHIQIMAVIDKLPGKVKEAAHLFLYVVDFLFVTVVGYYVFLRVI DQFQMKTVSITQFPLWIMSASLFVGMGASAIRCLMNIYFIIRYEFISGREGE >gi|157101656|gb|DS480668.1| GENE 26 28968 - 30032 229 354 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 21 326 31 335 346 92 25 1e-17 MMRRYGFLVMAAVLLLAGCGKKDAGSAEQTAKPASGETFTLKIGNTVSDIDPFNVAYREF EEEVEKASEGRIQVELFTSGSIGETDADVLDKVRSNTLQMGNTTPNCFADLSGMSEYYVY GIPYLMSTDEELNAVAESEWFMGLNKDFAKATGVKVFDGGVNMGWFGFSATGKEIHQIAD LKGMAIRINQTNQMIALSEGAGINPQFIAFSEVYTALAQGTVDGMVTNVPLFYSNGFYDQ LKSILTTHCGESYHFYIINDKFYNSLPDDLKPIIDKNVDKLIARTRELEVDYLEEAIGKM REKNVIVTEASKEDEDMLRQISVDKVWWDENVSLTSKENVKKVLEILGRTDEAQ >gi|157101656|gb|DS480668.1| GENE 27 30049 - 30453 282 134 aa, chain - ## HITS:1 COG:SSO2426 KEGG:ns NR:ns ## COG: SSO2426 COG0346 # Protein_GI_number: 15899175 # Func_class: E 
Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Sulfolobus solfataricus # 1 130 1 138 142 61 30.0 5e-10 MEHMKLHHVGIVMNSQERADRFMKKFGLEKDYMEYVESYKSNCLFTKHKEDETPIELVIP LEGVLATYNNGKGGMHHIAFEVEDVEEVKRKYEAEGMKLLEESAVPGAGGIIVNFLRPRF GEGILVEFVQKLKK >gi|157101656|gb|DS480668.1| GENE 28 30466 - 32043 1049 525 aa, chain - ## HITS:1 COG:BH3898 KEGG:ns NR:ns ## COG: BH3898 COG4670 # Protein_GI_number: 15616460 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase # Organism: Bacillus halodurans # 3 519 2 514 525 440 46.0 1e-123 MRKPIFVSAQEAIGMIQDHSTLCTIGMTLVSASETVLKAIEQKFLDTGTPRGLTLFHTCG QSDRKDGIAHLAHEGLVTKVIGGHWGLCPPFMEMISENRLEAYNLPQGQMANMFHSMALR EPGKLSKIGLGTFVDPRIEGGKMNSRTMDKEDIVAVVTVDGEEYMQYREVPIDTLVIRGT YADEAGNISTEEEAMVLEVLPAVMAAKRFGGKVICQVKQVVRTGTLDPKRVAVPGVMVDA VVVCENPMEEHKQTSSWYFDPSYSGQIHTPLRASSSIPLNIRKVIGRRAAMLLKSDVIIN VGTGIPNDVIGPILAEEDVQDDVMLTVESGIYGGVPAGGIDFGISCSPQALIPHDRQFEY YTGAGIDYTFMGAGEMDAQGNVNATKMGAVAAGAGGFIDITSTAKNVVFCSTFTGGQIKV EFDECGIKILQEGRFRKLVKKVQQISYNGKMAVSRGQNMFYVTERAVFQLTEDGPMLIEI ARGADLERDILANMEFRPLIAEHIKETPIHIYREGPFGLKNIIKG >gi|157101656|gb|DS480668.1| GENE 29 32106 - 33290 925 394 aa, chain - ## HITS:1 COG:BH2029 KEGG:ns NR:ns ## COG: BH2029 COG0183 # Protein_GI_number: 15614592 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Bacillus halodurans # 1 391 1 392 394 380 52.0 1e-105 MSERIVIVSGCRTAVGKFGGALKDLEATDLGGIVIKEAVKRSGLKPEDIEEVVYGCVGQA AENAFAARLASVKAGIPYEATALTVNRLCSSGLQAIVTAAQEIESGMCEIAVAGGCESMT NIPFYLRKARYGYRMGHGELEDGLITALSDPFTRSHMGITAENVANQYNVTRQEQDRWAA ISQQRASKAQEEGKFDSEIVPIEVKVSKKETMIFDKDEHVRPGTTAESLAKLRPAFKDDG SVTAGNAAGINDAAAAVVLMKESKANALGLKPLVRLVDVAAAGVDPSIMGIGPVPAFRKL LKKTGIRKEEIGLIELNEAFAAQAVACMKELELDENIVNVNGSGISLGHPIGATGCIITI KLINEMIRRKVRYGVSTLCIGGGQGLAVLFELCS >gi|157101656|gb|DS480668.1| GENE 30 33302 - 34048 221 248 aa, chain - ## PROTEIN SUPPORTED ## NR: 
gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 5 244 1 238 242 89 29 8e-17 MGRDLQGRVAIVTGSGRGIGLGIAKKLSEKGASIVISDVNEDNARNGVAEIEAMGGKAIY ILADVSKYEDAQNLVDATIKEMGSLDILVNNAGINKDRMLHKMSVDDWNAVIAVNLTGVF NCMRAAVNYMREREYGRIINISSASYLNGNIGQANYAAAKAGVVALTKTCAKENGKKNIT CNAIVPGFIDTEMTRGVPEKAWDIMVGKISMGRAGTPEDIGNMIAFLASDEASYITQGVF EVGGGMVL >gi|157101656|gb|DS480668.1| GENE 31 34390 - 35019 507 209 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229859876|ref|ZP_04479533.1| acetyltransferase, ribosomal protein N-acetylase [Streptobacillus moniliformis DSM 12112] # 1 187 1 186 187 199 48 6e-50 MKNSGTVRMETKRLVLRPYVIEDADAMFRNWANDPQVTKYLSWEPHKDVEETKQILEGWI KSYESKDFYTWAIARKEDEGNVIGSISVPQLNQKAGRVTVGYCLGRNWWGHKIMKEAFAE LIRFFFEVEGANRVEALHDTRNVNSGKVMAACGLKKEGTLRRYGWNNQGICDECIYGMVA SDYFEGKQRWENKLGKQEIEGYVWTAGYR >gi|157101656|gb|DS480668.1| GENE 32 35200 - 36066 800 288 aa, chain - ## HITS:1 COG:CAC3475 KEGG:ns NR:ns ## COG: CAC3475 COG0789 # Protein_GI_number: 15896713 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 2 284 1 240 243 136 33.0 5e-32 MMTVKQMSSLTGVSVRTLQFYDEIGLLKPAQVTEAGYRMYDENAVAELQQILFFKELDFT LKEIKAVMASPLYDRKDAFEKQRELIRIKRDRLNALLELLDRLIKGERTLDFKEFDMSNY FQVLTDYKRTHMEEIIKQLGSMESFDEMISELKANEGHIAQMAVKQYGSIEKFTKAMEEN LQHFLENGPGFTQKEADEAVDKTDTLTKKLTADLSKEAGSDEVQKIAGELVLLINESNGG MDMGSSYWSFMVNNYMTNPAFQEITDRKYGEGASEFMGRALKVYFGME >gi|157101656|gb|DS480668.1| GENE 33 36357 - 36554 374 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935303|ref|ZP_02082685.1| ## NR: gi|160935303|ref|ZP_02082685.1| hypothetical protein CLOBOL_00198 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00198 [Clostridium bolteae ATCC BAA-613] # 1 53 1 53 65 99 100.0 6e-20 MRIMNQKVEHKRYGIGTIFALKGKKVYVAFGKLYGDMAFPYPKVFEGDMKLVNQELQEEL MEELA >gi|157101656|gb|DS480668.1| GENE 34 36670 - 37017 262 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935304|ref|ZP_02082686.1| ## NR: 
gi|160935304|ref|ZP_02082686.1| hypothetical protein CLOBOL_00199 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00199 [Clostridium bolteae ATCC BAA-613] # 1 115 16 130 130 165 100.0 1e-39 MKNESVKHVRYGTGRVAEVVQNHMVVLFDGEAGRKVFAYPDAFERFLCFDDPILQKRAEA AVMELKKKRTEEAKQRLVVYQLYEAKRKQEQTELLKKRRKAARERLAREKMAKVI >gi|157101656|gb|DS480668.1| GENE 35 37359 - 37997 201 212 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 [Haemophilus influenzae 7P49H1] # 59 206 64 213 378 82 32 2e-14 MTKEKLALEIIDRLKKEYPDAGCTLDYDHAWKLLVSVRLAAQCTDARVNVVVEDLYAKYP DVDALAEADVEDIERIVKPCGLGHSKARDISACMKILKEQYGGKVPDDFDALLKLPGVGR KSANLIMGDVFGKPAIVTDTHCIRLVNRMGLVDGIKEPKKVEMVLWKLVPPEEGSDFCHR LVFHGRDVCTARTKPRCEACCLNDICKKAGVE >gi|157101656|gb|DS480668.1| GENE 36 38211 - 39383 1649 390 aa, chain - ## HITS:1 COG:AGc4262 KEGG:ns NR:ns ## COG: AGc4262 COG4214 # Protein_GI_number: 15889623 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 385 20 397 398 292 47.0 1e-78 MDQKLKISEVLKKYTMIIALAVIVILFTVNTGGKMLLPQNVNNLIAQNAYVFILATGMLF CILTGGNIDLSVGSVVCFVGAIGGKMMVEMGISPYLTLIAMAIMGMLVGAWQGFWIAYVR IPPFIVTLAGMLMFRGLSNVVLQGMTLAPIPDQFLVLFNTYVPDFFAVEGFNLTCFLVGV IACVVYAFIVFNGRRQKLAKGYYVEPLTGVVIKMAVICVVLLSFMFRLAQYKGIPNALIW VAVIIGIYSYIASKTTTGRYFYAVGGNEKATKLSGINTNRVYFLAYLNMGLLAAIAGMVT VARLNSANPTAGNSYEMDAIGACFIGGASAYGGTGTVPGVIVGATLMGVLNLGMSIMGVD QNLQKVVKGGVLLAAVIFDVVSKRKSFIVK >gi|157101656|gb|DS480668.1| GENE 37 39387 - 40925 179 512 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 263 495 9 245 563 73 26 6e-12 MAEILLEMRNITKTFPGVKALDNVNLKVEKGEIHALVGENGAGKSTLMNVLSGIYPYGTY EGDIIYDGQVCKFDKISDSEEKGIVIIHQELALIPYMSIGENMYLGNEKGKKMAINWDQT YDEADRYLKTVGLNESSHTLIKDIGVGKQQLVEIAKALAKKARLLILDEPTASLNEDDSK ALLELLLKFKQEGMTSIIISHKLNEIAYVADKITVIRDGSTIETLDKQTDDISEDRIIKG MVGRELTDRFPRRENINIGNTALEVRDWVVYHPLYSDRKVVDHVSIHVNKGEVVGICGLM GAGRTELAMSIFGKSYGTNISGQLFIEGKEVHLNTVHDAISHKIAYVTEDRKGNGLILSN PIRVNTTLANMGAVSSHGVIDKDKEYMVAVEYKDKLKTKCPGVEQNAGNLSGGNQQKVLL AKWMFTEPDVLILDEPTRGIDVGAKYEIYCIINRLVEEGKSVIMISSELPEVLGMSDRIY IMNEGRLAGEVDASEATQEYIMSCILKADKGE >gi|157101656|gb|DS480668.1| GENE 38 41014 - 42189 1571 391 aa, chain - ## HITS:1 COG:BH3442 KEGG:ns NR:ns ## COG: BH3442 COG4213 # Protein_GI_number: 15616004 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, periplasmic component # Organism: Bacillus halodurans # 57 391 20 358 359 343 52.0 3e-94 MKKKLLVTMLTLAMAGSLMACGSKEGAAPASEAKTEAQTEAKAEDAKAEDTKAETEAAKD EAASGEKHLVGVAMPTKDLQRWNQDGSNMKAELEAAGYEVDLQYASNDVQTQVSQVENMI SNGCELLVIASIDGSSLGEPLGQAKEAGIPVISYDRLLMNSDAVTYYATFDNYMVGQKQG EYLVEALGLEENDGPFNIEMFTGDPADNNCNFFFGGAMDVLQPYIDSGKLVVKSGQTTFE QVATANWDSEKAQNRMDTIIAGNYSDGTVLNAVLCSNDSTALGVENALASSYTGEYPIIT GQDCDIANVKNLIAGKQAMSVFKDTRTLASQVVKMVDAVMQGGEAEVNDTKSYDNGTGVI PTYLCEPVVVTIDNYKEMLIDSGYYTEDQLK >gi|157101656|gb|DS480668.1| GENE 39 42447 - 43667 240 406 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935313|ref|ZP_02082695.1| ## NR: gi|160935313|ref|ZP_02082695.1| hypothetical protein CLOBOL_00208 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00208 [Clostridium bolteae ATCC BAA-613] # 3 406 1 404 404 787 100.0 0 MSMYGNRLVKHEALKRWVKTISLDNINTVDIGGELLELTEKSKKILDIQIALFSKLVESM KPGDDWRALQNVLSPLFYNAFFRVGNNAIRIANYYECMVIPSNMKTYKKIIKGVDYQEIG SVKLYDGKRCIGEIGAKSDLMWSVFYDYFINIGKWGEITHTHTNHERYLSIQLFDIEFLS NDEICRMINEILLKVSMEHDLDFSVVEMDAIYKLEGEAKTYGIQFHSLEFEYIPALYLIN ALNTREARLAYLSYYQVLEYFFVRAQNYYFLNEYGALSMPAVNHNELRKVLHKYKNILNE 
RESLKLVLTRSLDISKFKSWILQDKNRVNKYCNSQKNSIDITKSDEKIVGRLGERIYSMR CSIAHAKGDVDEYIAVPTVSDSEISLEIELIKYAAYEALKACSEIK >gi|157101656|gb|DS480668.1| GENE 40 43791 - 44852 1209 353 aa, chain - ## HITS:1 COG:HI1111 KEGG:ns NR:ns ## COG: HI1111 COG4213 # Protein_GI_number: 16273036 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, periplasmic component # Organism: Haemophilus influenzae # 39 349 18 331 332 236 42.0 6e-62 MGIRKKRTAGSIVCILFTALFSVLAVSGCAKTEDAAAGSVEQEEQEVQIGLSFESFVIER WMRERDVFVSAAKELGAEVNVQNANGDVEEQISQIEYFIKKQVDVIVVVAGDCEALSDVM IKAREAGIKTVSYDRLIMNAGCDLYISFDNTEVGRLMAESMLENIPDGGDIFLIQGPLTD NNVSLIREGIDETLAGSNLNVVYEANCPGWIAENAYTYTKEGLKLDRNAKGIICGNDDLA SQAFRALSEERLAGKVCVTGQDGDLAACQRIVEETQEMTAFKSVEQEASLAAKCAVLLGL GEEIEEVQDTAYDGTYNVPYLELQPIAVTKENMDEVIIKGGFHAKEDVYLNVN >gi|157101656|gb|DS480668.1| GENE 41 44895 - 46505 1639 536 aa, chain - ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 534 1 521 530 189 28.0 8e-48 MFRILLADDEGIMLESLKKIITTSFGSECDIRCVKTGRGVVEMAESFHPDIAFVDIHMPG LSGIQAIQEIRKVDKEMMIIIITAYDKFAYAKEAVNLGVMEYLTKPISNKKIILDVCMKA MQRVEEARQKRSDDLKIRERLEIVVPMIESGFIYNMLQDDGADSSGYEELLGIREHYGFC LVLEFGDRDESGRLTNVIGSSVKANKYYQAMREIVRDYFSCIAGPVMGNRVVLFVPSPEP VLKYEERVEVITRTRTMLQKLEEHIGSDFRSGIGSTKTLGNAKESYGEALRALRESVSHV VHITDIPSRQEDVAGEYPKELENSYYERGLKADSEGAAACGAELFSCMQGRYDCREDLEA KVLELIMQLEYRICAKDGTEFGLKPRGSYIREIKEASDSDKLCRWFLEKTREICQCAGTE KKKEMEYLVDTVRKYIDENYQKDISLDEISRMVNISPYYFSKLFKQGTGENFIEYLTRTR MRQACVCLRNPDYSIKQICAMVGYSDPNYFSRIFKKYEGVNPSEYRERMENGAGKV >gi|157101656|gb|DS480668.1| GENE 42 46480 - 48000 1805 506 aa, chain - ## HITS:1 COG:BH3841 KEGG:ns NR:ns ## COG: BH3841 COG2972 # Protein_GI_number: 15616403 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with 
a C-terminal ATPase domain # Organism: Bacillus halodurans # 19 495 8 477 481 192 27.0 1e-48 MERPGKNIWSHLSLEKKILLAVTLMILTLFISNIFIYLQVNKIAGKMDTVYASNVNLSEL SEALEHVQKDMGDYVSVKSSDSLESYYKSQQEYQGLLERLNDKTIDKPGKILEKNIRMMS ESYLDMTQDAVQAKRGRNVERYREAYERALTLYQYINTRISELNELQFKNNSASYQTMRE ALRYMEVFNMVILMIVMLAGLMVLRMLVKDMVTPLTSLARTANLVGQGNFNVKMPDTDSN DEIGIVTRTFNQMVGSLEEYVAKVKEGAEKEQELLERELLMENHLKEARLKYLQSQINPH FLFNSLNAGAQLAMMEDAEKTCIFLDKMAEFFRYNVKKGLEDATVGDETAAAENYIYILN VRFAGDIQFKESIDQRVLDVKMPSMILQPIVENAVRHGIRGMEENGTILLTVHGCRDSIR ISVRDNGKGMSRERILEVMEGRKQETPDDGDSTGIGLDNVVSRLELYYQQKGLVQIVSEG PGMGTEVIIHIPVKEEIGHVSDITGR >gi|157101656|gb|DS480668.1| GENE 43 48028 - 49029 1019 333 aa, chain - ## HITS:1 COG:BH3840 KEGG:ns NR:ns ## COG: BH3840 COG1879 # Protein_GI_number: 15616402 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 51 227 47 216 332 82 28.0 1e-15 MKTRVRLERIIVIAGILVVLFLTLVTSSYYQNVLNQVGKDGGEDGNEYRYHYVMIVEDCD MPFWKDVYESTREAAREHNALVELMGKSLSSTIEIESLMDMAIASGVDGIILEYTGGHKI DERINEAGKAGIPVVTVLKDAPDTSRISFVGVNDYQLGRQYGEQILKLVPPDKDDVEVML LLHDRDNSSQVQIAEQINNLLVTSPDTSGRVRLIQETMRSTGKVDADETIRNIFFRKEGI PDVLVCTEAVDTEAAYQAMIDYNKVGELQLLGYYKSDQILEAVRKGNIPVTLVIDTEQMG RYCVQALEEYLQEGHANSYYSVDLQFVTKENVD >gi|157101656|gb|DS480668.1| GENE 44 49026 - 49592 681 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935318|ref|ZP_02082700.1| ## NR: gi|160935318|ref|ZP_02082700.1| hypothetical protein CLOBOL_00213 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00213 [Clostridium bolteae ATCC BAA-613] # 1 188 25 212 212 369 100.0 1e-101 MEYQVGDIFLVDSTWYGGVNVTSKSGIPLSLDKEEYEFVNREEAVHVIDTYSYGLGAMDC FCEMVSAGLKTLAMSHPCDTREERDSYLQDAEKLCRKYGVKLYPEDEAFITDLFPEELNK GKYNYLFYRTGDVLERYMGLKEQQKRLIADHSYTGQERYRIAVEFGKLLSYPEDGIERLI ERAGREKQ >gi|157101656|gb|DS480668.1| GENE 45 49893 - 50252 257 119 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935319|ref|ZP_02082701.1| ## NR: gi|160935319|ref|ZP_02082701.1| 
hypothetical protein CLOBOL_00214 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00214 [Clostridium bolteae ATCC BAA-613] # 1 119 1 119 119 241 100.0 2e-62 MVLRCFSAPDEKVNLCFYPCSCGTGPGDHDTYCQESLCIYDTDYTKLLLSYLESEFPEFD VCWDNRIPAGTWNIIIQKIENDMSHIHFDDTVFDFYSRFISWIADALEQEDMIIVEGNQ >gi|157101656|gb|DS480668.1| GENE 46 50432 - 50971 294 179 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935321|ref|ZP_02082703.1| ## NR: gi|160935321|ref|ZP_02082703.1| hypothetical protein CLOBOL_00216 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00216 [Clostridium bolteae ATCC BAA-613] # 1 179 1 179 179 328 100.0 1e-88 MLLAQLGVTQSFHGKGTLVVMRTAKMNFRMPEILEGMSLYLESLQLLALTIRDITLYTIK SCSGEAQERLTGKFDLLRQENKVHLCFELYFKWIEHQCPMVMIRECYRKQYGLLTWGYPI MLYRIRNQKLQLRYMGFTEEIIKLLREKRWEDFSEAWKGLMEQEEREARKFMLDVNLRL >gi|157101656|gb|DS480668.1| GENE 47 51219 - 51644 407 141 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935323|ref|ZP_02082705.1| ## NR: gi|160935323|ref|ZP_02082705.1| hypothetical protein CLOBOL_00218 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00218 [Clostridium bolteae ATCC BAA-613] # 1 141 71 211 211 279 100.0 6e-74 MDGKRTAIVVYEAEPDKIREHAAKYFALREEGIKDMIHSVPLLMGPLWLEGLRRGTNQDW DKIRNALACPSLGVISILVELYSLTLGTLKNKLILNLFWEEIRYLRFPYLTDKLLEIQET LKPEASKEEIVARLRRIFFGS >gi|157101656|gb|DS480668.1| GENE 48 51919 - 58932 4326 2337 aa, chain - ## HITS:1 COG:no KEGG:DvMF_1880 NR:ns ## KEGG: DvMF_1880 # Name: not_defined # Def: outer membrane autotransporter barrel domain protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 900 1267 76 453 936 72 29.0 2e-10 MDGKRQKFQLRRQRHNRQRILAFCLIFSLVLSNLNFGSMLTYATEGKRRTFEIGSDVKAV LKDGVLTVKGYGDTRDFSCDTAPFAEYADEIRTLVIEDGITYIGSCLFYGLGGLQGELIL PESIVGIGDYAFSGESPAQAARFTVIRNEFEGGEITEWKKAEPEESETITLATPSQAEKV TSEERIEEGEQNGEADKTGDENAVNEGNREETKQETAEQTEQGETEPGKPDRIEQGETEQ EETEQEEAERTEPKETQEEMERTEPEETEQEETERTEQEETAHGDTLSLIARIENHQVPF VGEGMPEGDSDSSGSESGMDAGSADSGDDSGDAGSGGADTGGGGDADSGSGDTGGGGDAD 
SGSGDTGGGGDTNNGSGDTGSGDANSGSTDTGSGNTDVDSGSTDTGGDNADADGGGSPDT GSGQEGGLSKDDSMDSGDNTEAGDTSGTSGDKPGDAAGQGSHNQELPSESENGTGGKDSR PQKVTENLEAENPDYDIEYISQQKIENPETVFYEGQTGMVICSPENGTFIEAAEYAGYVM ADSCITVILDDMMEMELPACEGQVCLPECPEEIKCPWDDNAIFTAKFAGWTLEPGTEPAK ALVPGTYTDTGEEHLNLYSVWKPVGTYQMGVRAERQGGTAIYTLFDKETGETVNSMDGYA FLYQWQYSAQGADIPDDKNPYGEDVNLTDNHGNGDDGLWEDIAGEDSPVYKRSVKAGDMG MRFRCAVTPVKMTRSMTAASSVTLYSEPANGVAVLETVYVNQTSGDDGNSGLENAPVKTV EAAVKKLKSREDGGTSDGNQIILMQDYDYGNTTDFLKEHAVPVTIKGITADIEFFSTAAG TESDFHLYEDITFDSLTIEKMNHLYGNGHNITIGDNVSCSGLYLYGASKDGTLTVKPGII QVRSGSIARIIGYIRSNGSVDANNNEAVIIVDGTAKVDTIVAGSASGEIKNGNVGIYIRG GTVNKIIGGCQGFSNVKAPYTGKTSINISGGTVNMVYGAGSGRATGIPTYQGILDIIVTG GSISSIYGSGSAAYVVSGEDATKVNISVSAGTVGNIYAAGVGGEAGVDGGNDVNGTPAAD FGSLTGKADITIDNDAVITGNIYASGQGYTQKDYGKGNAYLDGEANITVNGGTIKGNIYG GGEGVNKAGFEKSARVCKNSKVTVEIKGGLIEGYVYGGGKVAEVEGSTNVIISGGTIKGN VYGGGEAGLTHGRAAVHMANGTVNGSIYGGALGKVGDRFVYGGSTVNMTGGWVRGNVYGG SELSNDGKSGEQDEKSASGLVFVNLAGGTVSGKVFGGGFQGIVNGSTHVHIGLGAIDKCN YYKSHAEEKPVLDASGLSVGGSVYAGGDYGGDSADLNYDTITVNGFSHVYIDGTDYSFGA GDTGGKKMEIAGGVFGSGASCDAGDVRLVTLDHFGAAASGGGQPEQVTGTVTSIQRADQV RLIQTHVRLAGQSDVANSNQTTPYSLNRIGDKGNFDKLGELGNSLVLEGGSTLVLDSASN SVANLKSVDDTESKVVTQADLAHCPNTVVLSTGTVFRISNTSQGDSEEYGAVTGYFYMIA GDTADAYAFARAKTETLNPSDGGFVVPGEKTELGYTNIGTSYRYWKVAGDNAGATRHVVL TARNSDSLGIGEDGFAAAEGTIELPPVGDTENIRYAIKSITLSDKMTLVDAAMNLETGTW ITSGPDVSAEQEKARIRNTPASTFGLVMWQNTGKTEVVSNVISDQTAGTDNSNSIIGKDW EYHVLDQETSGIPNIEFKLTYYNDGIRISQNLGTVSVVVERYAGDILAEEITMNVEVVTR ASALSEQTIDLYASESGSYTGRMVIPADSSRNLSLAGVKVPENGITLKAVNSELEHYDVA VTLQPVQSQGWNTANLMSEPYDISTYKESAGTVLLGTTDSRYEAPIDFVLYNHGAFAPKD TDEIILTLSDADGNKVPVTLNIHWKESIVSAVQMASGKQYEGMNDTAPVVISQESAVTAM FTLGSQNGTLAASSVWLELRDSSGIRHALPDGAKLTLLSGNRYYLYTLTGGEKDGKIPLD NFAEMWGSSRFNENIAAKTITVIAAFDKTASLMPGEYSLRLRNDTGADSVGGYFTVNNST AAITLDSQEGLARGEHRISIFVSPQQDTRLSDKAAVVLTTPEGEAFPKGTAFIYGEKRCY PVNGRVYLILDRGQAHTVIMDTTDTAGLTLENNRISAQLFAAGVNAGKGRITGTELTYNI KANPVYGLSVKADQGEERVATPGAAIGFTVSYSMEHITQAEIIQVEVRRKEDGAYIEADG 
WEVSGDLDLGQGDGAKREGNAAVTITVPENTGEGTYRLIFKLGDKKALYNIIIRSCD >gi|157101656|gb|DS480668.1| GENE 49 58987 - 59763 657 258 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935325|ref|ZP_02082707.1| ## NR: gi|160935325|ref|ZP_02082707.1| hypothetical protein CLOBOL_00220 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00220 [Clostridium bolteae ATCC BAA-613] # 1 258 2 259 259 478 100.0 1e-133 MNKIRRKRQRAALMLITAAGIMLSLAGIAAARYMKQESQSGVVEAQTFYFTSDLLKEDGG TAAYFIDPMAQSFTIELYNFADSERVTSADITYRVSVTGGTVEGGDTGIQGTIKAGEKST VPISVIPAAKPEGGGGTVELEEIRVVAESVFPYKKTLTGVFKRQLGNQYVVEDESGRNAA VLTMVCADGEKDITVTLPSGVIPDEADIRVTGYKDNKCTFHSPGYGIYSLVLLKSEADIG LSGKDSFADAIDLSQSGG >gi|157101656|gb|DS480668.1| GENE 50 59777 - 60340 352 187 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935326|ref|ZP_02082708.1| ## NR: gi|160935326|ref|ZP_02082708.1| hypothetical protein CLOBOL_00221 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00221 [Clostridium bolteae ATCC BAA-613] # 1 187 1 187 187 337 100.0 2e-91 MGKHLTKTGIAGYGLLLAAASISVWSFSYARFTAQENGMAAATVAAWGSGSSTIDINVSG LRPGDKKTYIFEVTNTENGVTSEVGQDYSIIVETTGNLPLDFALTYDNSNPDHQGLFVNT GENGTLTFNGGMAAVRGGRLPHSVSAIHTYTLAVSWPEDKAGTEYTDEIDLVTLTVSSEQ TVPAIQE >gi|157101656|gb|DS480668.1| GENE 51 60449 - 62185 780 578 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935327|ref|ZP_02082709.1| ## NR: gi|160935327|ref|ZP_02082709.1| hypothetical protein CLOBOL_00222 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00222 [Clostridium bolteae ATCC BAA-613] # 1 578 1 578 578 921 100.0 0 MIKKHGNRFAPEKKTGNHKWIPLLLLFLLMLLSSCAAGYILGRSAFPESSGRRIDTILLT PDGDAAEQQMQQIFHLAGKINYTNGRPYAQGRVQLHSEPRETVTDTNGSFTFFHVEGGEH QIRILDRTGNVLGERKVNLFSAQAESGVNVSLQQNGQYTIEVAVDVRLMEVAIELDEKNN SLYINPEKLTYLTSDGRLVTPTGEAYYWEGTLVTPGGLVITADKTVVAPASEGPLGIALL TPENQVIYPEQDAALPDGTKLLENGNILLPGETMIERNENGTVILTPDGLRGQPGAGGVV ISGDNQVSPIGTGTSGPDSNTHGNPAGPNESVSGNGYDTGTNPAERNDSSQIGNGSGGVN AGLGSGEGGESNAGSGGVSGDSSEGNDKDGGGISGGGGSGDSSGGNSGDSSGGNSGNSGG 
ETGGDNGGGGDDKDEGPIPNFRVSWTQGQEIELFSDLSGNVQTLLPGAEGSYPFILENNN SFHVIFTLSIEEDSLHIPMRFRIVTYKDRKPLTDWWDTQNGGAAESDTVRLGAGESKSYL LEWQWPYESGKDGIDTNAGKADLDYRVRLKIWAQQEGP >gi|157101656|gb|DS480668.1| GENE 52 62320 - 62928 472 202 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935328|ref|ZP_02082710.1| ## NR: gi|160935328|ref|ZP_02082710.1| hypothetical protein CLOBOL_00223 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00223 [Clostridium bolteae ATCC BAA-613] # 1 202 8 209 209 352 100.0 1e-95 MREKSYVARLGALALVLTLMTTCLTGGTLAKFVTEVEGEAEATVAAWEFKATGNNGAALS SIDMGRTAYKGETIGDNVIAPGTQGSFDIVLDGTGSDVGIDYIVTIDKSSEPDTPDLPDD LVFQVDGTDYSLGNEIEGKIEHAAGDDAMKRTITVSWKWDYGSADTAEADKNDNKYAGKS WKLKIKATGTQTKPTETTAPTA >gi|157101656|gb|DS480668.1| GENE 53 63085 - 63705 263 206 aa, chain - ## HITS:1 COG:BS_sipW KEGG:ns NR:ns ## COG: BS_sipW COG0681 # Protein_GI_number: 16079519 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Bacillus subtilis # 12 194 3 185 190 94 34.0 1e-19 MSRDRKISVRSVMGGILCVIFIPVIFINLTLIISTYTKPGEMPGVFGIKPVIVLSGSMEP VIQTGDMIFLHSTDPARLQTGDVICYLDSGQAITHRIVGIREGEDGQVRYVTQGDGNNTA DRQAVSADQVQGIWRGGKIPGMGSGIMFMQSAAGMILFMICPLCLFILWDIWRRRRLDRE EAEHRAALEAELREYRSREKGDPDSK >gi|157101656|gb|DS480668.1| GENE 54 63702 - 63878 144 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935331|ref|ZP_02082713.1| ## NR: gi|160935331|ref|ZP_02082713.1| hypothetical protein CLOBOL_00226 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00226 [Clostridium bolteae ATCC BAA-613] # 1 58 14 71 71 95 100.0 1e-18 MEHDRNQFIVNVFTNGAAPEEKRGDSQEHDMAYIESCREMFRLRQKLQHDISSKEEGK >gi|157101656|gb|DS480668.1| GENE 55 64379 - 65224 635 281 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 23 277 9 262 265 117 29.0 2e-26 
MNIYGEFQEKLCLFRNYLLEEERSAATIEKYGRDVLAFLSWLSDREEISKEVVVGYKQAI IGKYKTTSANSMLVSVNRFLDFIGKKDCQVKLFKIQRNPFRKKDKELTKEEYNRLILAAK AKSSSRLFLMIQAICSTGIRVSEHRFITREALERGRITIYNKGKERVVFLPKKLKKCLLQ YCRQNGIYSGPVFVTKNGTPVNRCNVWAEMKALCKEAGVSPEKVFPHNLRHLFAVTYYRM QKDIVHLADILGHSNIEYTRIYTFTSEEEHARVLSRMCLLI >gi|157101656|gb|DS480668.1| GENE 56 65635 - 66678 565 347 aa, chain - ## HITS:1 COG:lin1811 KEGG:ns NR:ns ## COG: lin1811 COG1680 # Protein_GI_number: 16800879 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Listeria innocua # 26 339 9 320 323 251 41.0 1e-66 MQNENWPVPEWRTTDPGEQGMDRALPEKLRSVITSEYANTNGIVVVRNGCIVCEEYFHGF GPDVPVHVASVTKSVLSALTGIAVDKGYIPSVSRRIMEYLPEYVSETHNRNCGNVTIENL LTMTVPYAFEEWKEPLEQLCTSPDWVGFTLDMMRDVRETDKPRRHFQYCTAGAHLLSAVL QRAVGMSTREFANRYLFEPVGMTVIPDYPMETYTCDSLFGEGVRGWVHDPDGVTAGGWGL TMTPRDMARLGLLYLNMGRWNGARIISESWVRQSVKRTFSRYGYMWWRFEKKGVAVHAAM GDGGNMICWVPDRNLVAAIASGFVPEAKNRWLMVRDYILPAVEENRV >gi|157101656|gb|DS480668.1| GENE 57 66800 - 67291 427 163 aa, chain - ## HITS:1 COG:all4397 KEGG:ns NR:ns ## COG: all4397 COG0454 # Protein_GI_number: 17231889 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Nostoc sp. 
PCC 7120 # 3 163 20 180 182 85 29.0 4e-17 MDIRIREYTEADAEEAVRIWNHIVEEGAAFPQLELLTAETGNRFFSEQSFTGIAYDGGGG EIVGLYILHPNNVGRCGHICNASYAVKSQLRGHHIGEALVKHCMEQGKKLGFRILQFNAV VKSNASALHLYEKLGFTQLGVIPGGFLMKDGSYEDIIPHYHVL >gi|157101656|gb|DS480668.1| GENE 58 67575 - 69068 1698 497 aa, chain - ## HITS:1 COG:lin0179_3 KEGG:ns NR:ns ## COG: lin0179_3 COG0516 # Protein_GI_number: 16799256 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Listeria innocua # 227 495 1 269 276 489 85.0 1e-138 MAFYFEEPSRTFSEYLLIPGYSSTQCIPSQVNLKTPLVKYKKGTEPDISINIPMVSAIMQ SVSDDRMAVALAQEGGISFIYGSQAIEKQAEMIHKVKRYRAGFVVSDSNVSPDMTLADVL AITEETGHSTIAVTADGQPNGKLLGIVTNKDYRVSRMGPDTKVKDFMTTLDNLVYADEST TLKEANDIIWEHKINCLPLVNKNQELVYLVFRKDYDTHKKNENELIDSSKRYMVGAGINT RDYEQRVPALLDAGADVLCIDSSEGFSEWQKITIDYIRKNFGDSVKVGAGNVVDGDGFRF LAEAGADFVKVGIGGGAICITREQKGIGRGQATALIEVAKARDEYYKETGVYVPICSDGG IVHDYHITLALAMGADFIMLGRYFARFDESPTKRVNVNGSYMKEYWGEGSARARNWQRYD MGGDKKLSFEEGVDSFVPYAGSLKDNVNLTLSKVRSTMCNCGALTIPELQEKAKITLVSS TSIVEGGAHDVMLRDKR >gi|157101656|gb|DS480668.1| GENE 59 69244 - 69768 554 174 aa, chain - ## HITS:1 COG:BH0368 KEGG:ns NR:ns ## COG: BH0368 COG0717 # Protein_GI_number: 15612931 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidine deaminase # Organism: Bacillus halodurans # 1 174 1 175 177 201 59.0 4e-52 MILSDRTLFKMLEAGTLSITPLDKEQVQPASVDIRLGNTFSIVEDLSTGVITLKDEVRYK TINTDTYILLPGQFVLATTMEYVSLPDNLTAFVEGRSSLGRLGLFIQNAGWVDPGFQGEI TLELFNANRCAIELKAGRRVGQLVFAEMDDTALKPYNGKYQGQKGATGSRVYMD >gi|157101656|gb|DS480668.1| GENE 60 69922 - 70623 547 233 aa, chain - ## HITS:1 COG:CAC0632 KEGG:ns NR:ns ## COG: CAC0632 COG0637 # Protein_GI_number: 15893920 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Clostridium acetobutylicum # 6 206 5 209 215 93 28.0 3e-19 MGTEFVLFDFDGVIANTEESNAAYLEKALSAFGIQLTDDDRKALIGTNDKARIESLLSRS LVKVTAEDLAGKRRQVGNTYENSCIRPMPGVVSLIQEIRGRGMKTGLVTSTSTRLIMIAL 
NRMRMTALFDVVICGDMCANHKPDPECYKKAMDYLGAEPCKCIVFEDSAVGIHAARLAGA RVAAYRGSGIQQDVSEADLIVDSYDECKDMLDDIMQGRKMQDRITQGGIIPCK >gi|157101656|gb|DS480668.1| GENE 61 70635 - 71048 496 137 aa, chain - ## HITS:1 COG:no KEGG:CA_C3497 NR:ns ## KEGG: CA_C3497 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 1 137 1 137 137 244 86.0 9e-64 MRPDVYENNGEGILCVYKNEKWLVSIKNWKPDNDIDGIAHLEIHHSTDEQFILAAGKAIL ITAAKADDGFKIELTLMEQGKVYNVPAECWFYSITQKDTKMMYVQDSNCSMENSDFCDLS RDEIAYIQTNARRLFAQ >gi|157101656|gb|DS480668.1| GENE 62 71157 - 72044 754 295 aa, chain - ## HITS:1 COG:CAC3498 KEGG:ns NR:ns ## COG: CAC3498 COG0524 # Protein_GI_number: 15896735 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Clostridium acetobutylicum # 1 295 1 295 295 453 74.0 1e-127 MKDYKIIAVGDNVCDKYLSRGRMYPGGQCVNTCVYAKLNGAYTAYLGKYGTDQVAECVRD TLKQIGIDDSHSRCHEGENGFALVTLKGNDRVFLGSNKGGVAREYGYDFTEEDFAYISQF NLIYTNLNSYIEDDLKSLKETGVAIAYDFSTRWTDAYLEKVCPYVTVAILSCAHLTRQER ETEMKKVQSYGVKVVLGTIGEDGSYVLYDDSFLYAPAVHADDVIDTMGAGDSYFAAFLCS LLETSQTGAVVEGTEERMKERLVEAMERGAAFAAKMCAKEGAFGYGVPVLGRTEI >gi|157101656|gb|DS480668.1| GENE 63 72063 - 72851 1007 262 aa, chain - ## HITS:1 COG:yhfQ KEGG:ns NR:ns ## COG: yhfQ COG0524 # Protein_GI_number: 16131252 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 1 256 1 256 261 147 30.0 3e-35 MKRAIAMADNCIDVYYKLDRYYLTGNSIDFAFNYRDLGGDVTEMTILGNDVFAMALEERL KERGIALRVLKRADRPTGMATMDIVDGDKKHLHFEGNAMEEIELSGEDLEFVKSFDIVYA ERWSRIDRYIKELKQPGQIWVYDFSKRLEQEGNDLILPYLDYAFFSYDRDDSYIRDFMVK VRKKGTGTVIAMLGEHGSLAYDGDRFYEKEAEKVPVVNTVGAGDSYIAAFTYGVSLGESI SQCMDRGKKRATEIIRQFNPYK >gi|157101656|gb|DS480668.1| GENE 64 72855 - 73295 538 146 aa, chain - ## HITS:1 COG:PH0581 KEGG:ns NR:ns ## COG: PH0581 COG0432 # Protein_GI_number: 14590477 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pyrococcus 
horikoshii # 5 138 2 128 133 89 36.0 3e-18 MEFKLKEAAVCADREFEMIKITPEVRAFVEESGIENGLVYVITEHTTTGITVNESLDCVE KDIMECLSRIAPEDYPYHHNHYLPTYGTIGGNTPGHLKSLLTGNHCVFPIVRGELKLGHA ADIYFCEYDGIKMRHYMIYAVGERKE >gi|157101656|gb|DS480668.1| GENE 65 73337 - 74185 661 282 aa, chain - ## HITS:1 COG:AGc2623 KEGG:ns NR:ns ## COG: AGc2623 COG2159 # Protein_GI_number: 15888748 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 56 279 67 289 291 91 28.0 2e-18 MVIDTHIRPALFGPICRDRDRFRKRCDEMNYHLMGPSDMDLLKKQYALADIQKIFMLPDD NSSKTGEPAISNDEIAQLAACDSDFFIGFAAADPGNKDGADELRRAFGELGLKGLYVNTA RLHMYPYDERLTPLYDICREFGRPIIFHSGLSLEQQAFSRFSRPADFEEICCRYPDINIC LTHVGWPWVQETAALLLKYENCYTSTALMNFDGPYQIYHKVFKEDMGELWVEHNISRKIM FGSGSPRIRPVRCKRGLDSLGFSSETLENIYYKNALRFLGHA >gi|157101656|gb|DS480668.1| GENE 66 74237 - 75787 1140 516 aa, chain - ## HITS:1 COG:TM0284 KEGG:ns NR:ns ## COG: TM0284 COG1070 # Protein_GI_number: 15643053 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Thermotoga maritima # 4 511 2 506 506 253 32.0 7e-67 MKQYFMGIDIGTYESRGMLIGEDFRVTAEHTMAHGMSHPRQGWFEHDADKVWWGDFCGIS RHLIEAGGIAPEQISCIGASTLGTDCLPVDEQCRPLRPAILYGIDSRADKEIQWLTRHYG RDVRRLFGHPICSGDTATKILWIKNNEPEVYRRTYKFLTGSSFITAKLTGRYVIDQFLAK GSFRPLYRADGTVNEEECGLYCRPDQIAACAYSTDVAGRVTREAAAQTGLAEGTPVICGT GDSTSEAISVGLTEPGTAFFQYGSSMFYYYCVDRMVQDYISSGGNGSVKGGKVFTIPGTF CLGDGTNAAGTLTRWVRNTFYGEELEMERKGGKNAYAVMAGEAGEIQPGSEGLIILPYIY GERSPIQDPLATGMIFGLKGSHTRKHINRAALEAVGYSTLQHLMLFREMGLPPHTLITAG GGTKNDVWMQIICDMTGMPLHIPETFQCSSYGDAMLAALGAGCLKDFHQLKEKLPEGRII GPDRENHEFYQRNYPIYRDLYLNNRELMHRMHMDKE >gi|157101656|gb|DS480668.1| GENE 67 75801 - 76610 608 269 aa, chain - ## HITS:1 COG:STM1615 KEGG:ns NR:ns ## COG: STM1615 COG0434 # Protein_GI_number: 16764959 # Func_class: R General function prediction only # Function: Predicted TIM-barrel enzyme # Organism: Salmonella typhimurium LT2 # 2 
266 3 267 268 244 47.0 1e-64 MWIHDMFGADKPIIALLHLDALPGDPGFCGDMDVVLDHAAHDLTALQDGGVDGILIANEF SLPYQPVADIAVISAMAYIIGKLKDRIRVPFGVNVVKNPIATIDLAAATGARFGRSCFSG AYMGEYGVYVSNSGEAVRHRKALGMEHLKLLFKVNPEADAYLVQRDIQVVARSIMFGDFA DGLCVSGAAAGAEPDDVVLSRVHEAAKQKNVPVFCNTGCNHQNVREKLKHCDAVCIGSAF KKDGIFNQRVDEKRVRDFMEIVKEIRGES >gi|157101656|gb|DS480668.1| GENE 68 76612 - 77445 709 277 aa, chain - ## HITS:1 COG:CAC3499 KEGG:ns NR:ns ## COG: CAC3499 COG1082 # Protein_GI_number: 15896736 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Clostridium acetobutylicum # 1 270 1 270 271 477 81.0 1e-134 MKFAPMNYHYLRYPIKKFLDKVESSPFDSIDLYCSAPQLNIFDYSLSRLIELDRDIRQRN LKVMAMTPENCVYPVNFCTQDPVTRQSSIRYYQRAIDTAEFLGCPNIQISTGFGYFDQPR EEAWKYCRESLAQLAVYCQRKQVRLFLEELKVTTTNVLITSKDIAGMLDEIGSPCIVGMV DMDQMTYAGETVDDYFDHLGERLMHIHFNDRGHTVPGHGDFPMKEYYDAIKRRGYDGTVS FEICDRRYYCDPDKAIDDIVSWMKKNTHELGDVIEEI >gi|157101656|gb|DS480668.1| GENE 69 77549 - 78871 1316 440 aa, chain - ## HITS:1 COG:CAC3500 KEGG:ns NR:ns ## COG: CAC3500 COG2704 # Protein_GI_number: 15896737 # Func_class: R General function prediction only # Function: Anaerobic C4-dicarboxylate transporter # Organism: Clostridium acetobutylicum # 1 440 1 439 439 629 82.0 1e-180 MFWIELIIVLAIIFLGVRIGGTFLAMAGGIGMFIMTFILHVTPSDPPITVILIMIAVICA AATMQACGGLDYLVKIAEKILRRNPRMITVLDPVVAYIFTFMCGTGHIVYSLLPVINEIA IETGVRPERPISASIVSSQQAITACPISAATVAVLAFMAEATGYGSVNIFTLLLVCIPAT LVGTVVCALAVIKKGKELADDPEFKARVAAGQIEDYTKAEKKERPATKEAKISVAIFLIA MAGIVLLGAVKPLVPTLADGNKLPLTTVIEIFMLVAAAAMVLITKLDSDKVLDQPVFRTG MFAVVLAFGLCWLVNTFIGDQSSFITDNMSALTNQYPWIYIIAVFIVGAITTSQSSTTMI MVPIGIALGLPANIIVAGWIACSSNYFIPASGQCVAAIAFDSAGTTKIGKFVLNHSYMVP GLVCTVVSVAAALLLGSVIF >gi|157101656|gb|DS480668.1| GENE 70 79238 - 80068 678 276 aa, chain - ## HITS:1 COG:CAC3499 KEGG:ns NR:ns ## COG: CAC3499 COG1082 # Protein_GI_number: 15896736 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Clostridium 
acetobutylicum # 10 273 7 265 271 175 34.0 8e-44 MKISQLCAANYHYKRYTLDYFLDSANRLGFQSVELWASGPHLHLEDFDGGRLRELNRTIK DHHLKIACFTPEQCVYPISVSHPDKRYRDRTVDFFQRHIEAAVCLDCSCMVVSTGFAYLD VDGEDAFKWTADSFFQICRKAESEGVTLALEPFTKYTTHICNEASQLLRLLRTVGSPALK GLGDTDVIATTGVDTFETFIGILGRENLAHVHFVDGNPGGHLVPGDGNLNLDQALHTLEA FDYKGYLGLEILDRRYVMNPEDAMRRALAWYSERIG >gi|157101656|gb|DS480668.1| GENE 71 80228 - 81238 964 336 aa, chain - ## HITS:1 COG:CAC3501 KEGG:ns NR:ns ## COG: CAC3501 COG2222 # Protein_GI_number: 15896738 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Clostridium acetobutylicum # 1 336 1 336 336 606 87.0 1e-173 MAYFGADIKNIIAEILEAKKSKGGVKNLYFVGCGGSLGALYPAKTFMEKECASIKSAWIN SNEFVHSTPRDFGENSIICLACHKGNTPETIEAAKLGKEKGAAVIVLTWLEESEIIEFGD YIIRYAFDASPDHLKGDIDYAGEKTICALLVAVELTAQTEGYENYDKFQEGLGMISNIIK NARAHVAERALEFAETYKNDPVIYTMGSGAAYGAAYMESICIFMEMQWLDSSSIHTGEYF HGPFEITDANRPFMLQISEGSTRPLDERALKFLRTYAKRIEVLDAKELGLSTIDASVVDY FNHSLFNNVYPIYNHALAEKREHPLVTRRYMWKVEY >gi|157101656|gb|DS480668.1| GENE 72 81411 - 82124 500 237 aa, chain + ## HITS:1 COG:CAC3502 KEGG:ns NR:ns ## COG: CAC3502 COG2188 # Protein_GI_number: 15896739 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 237 1 237 237 376 80.0 1e-104 MELNNTIATPLYKQLEEKLQKEIETGERPAGSRLPTENELSETYNVSRVTVRKALAGLSE LGYLYRKSGKGTFVAEKKIQRGITNGVLSFTDMCRMMNASPGAKITKIALEDPTEKEMEL MSLDKDEKILVLERIRYADGKPVQLELNKFPESFSFLFSENLNDSSLYEILKKHNIILDH SQKTLDIVFATSKEAKALGISSGYPLLRIESVIHDVDNTITNLCRQLCIGDKFKFII >gi|157101656|gb|DS480668.1| GENE 73 82285 - 83073 799 262 aa, chain - ## HITS:1 COG:mll2507 KEGG:ns NR:ns ## COG: mll2507 COG0300 # Protein_GI_number: 13472269 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Mesorhizobium loti # 2 228 7 231 264 139 37.0 6e-33 MKHYALITGATSGLGLAYAKWFAGKGYDLIMTGRRKEVIERRAQEIRDKFACKVIVILVD 
LAQEDGVRKLLASLEGREIHVLVNNACFGFRTPFSDTDIDGLKRLIYLQTIAVAEITHYV LRGMKERNKGIIINISSDGAFAVVPGNVTYAAAKRFIVTLTEGLHMELMGTGIRVQAVCP GFVDTDFHENNGMYVDKTRKGMFGFRQPGEVVDEAMKEFEKGSIVCVPDKAGKLVKAMAE YMPKKMYYRFADDFVNKNIRKN >gi|157101656|gb|DS480668.1| GENE 74 83195 - 83776 709 193 aa, chain - ## HITS:1 COG:mlr0908 KEGG:ns NR:ns ## COG: mlr0908 COG1309 # Protein_GI_number: 13471041 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Mesorhizobium loti # 7 150 3 151 236 60 26.0 1e-09 MERKTARISKEPEVRRQEILDTAMSVFMEKGYEAATMRDIAAAMHVVPGLCYRYFESKQM LYDTAIEQYVKDITHPMIERLKEKDDSLDGFLDKMEQLFIQTDGKEKYHHFFHKKENHDL QMLMSVKLCETIEPYMEEKLREMNENGITRVGNIPLTASYMLFGAVPVLENDGLSTAEKA LGIRMIMKCILEA >gi|157101656|gb|DS480668.1| GENE 75 84112 - 84312 199 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935358|ref|ZP_02082740.1| ## NR: gi|160935358|ref|ZP_02082740.1| hypothetical protein CLOBOL_00253 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00253 [Clostridium bolteae ATCC BAA-613] # 1 66 1 66 66 103 100.0 5e-21 MDYLGLIPLFVLGLAMIIKPEYLWEVEHIFAVDGVGPTGLYLTFIRVSGIFFMICSVAAA LYVLLP >gi|157101656|gb|DS480668.1| GENE 76 84356 - 84484 59 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935359|ref|ZP_02082741.1| ## NR: gi|160935359|ref|ZP_02082741.1| hypothetical protein CLOBOL_00254 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00254 [Clostridium bolteae ATCC BAA-613] # 1 42 1 42 42 77 100.0 4e-13 MLVKDIVFLYQHMRSGVNLKYYGERLKEESWKGTGLKKAADF >gi|157101656|gb|DS480668.1| GENE 77 84569 - 85243 756 224 aa, chain + ## HITS:1 COG:BH0372 KEGG:ns NR:ns ## COG: BH0372 COG0745 # Protein_GI_number: 15612935 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 2 220 3 220 222 172 41.0 5e-43 MRILVIEDEKRLAFTLADMIGGAGYYTDVSHDGAEGLELAQTGIYDAIVLDVMLPGMDGY 
AVLKNLRSTRNMTPVLMLTARSELSDRVRGLDAGADYYLTKPFATEEFLACLRTVLRRHG DIAPDTLTFGDLILTPSMPELICGENSVSLSAKELELMSLLIQNSSQYLPKETLLLKVWG YDANVNSNSVEAYISFLRKKLSLLGSCVSISVMRNIGYKLGVSL >gi|157101656|gb|DS480668.1| GENE 78 85240 - 86496 501 418 aa, chain + ## HITS:1 COG:BH0373 KEGG:ns NR:ns ## COG: BH0373 COG0642 # Protein_GI_number: 15612936 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 188 417 225 449 459 127 39.0 4e-29 MIRKLRWKFSAIMMTIVAFLLAVIFITLYYTTKANYVQRSMGTLHSAMLENYPGSMREQA GQPPSAGQRPPSRPRRQELPLMVVDIRQGNYTVVKNQLPDIESQEAETLISLAESKGGES GTLKDLHLRFLRGKTKPDQTVRYVFADIYEEQNSLYWQVIHSSIIGASSFAVFFLFSILL SGWAVKPVEIAWQRQRQFVADASHELKTPLTVILSNADMLEKDRDIPKGQNRQRISHIKA ESLRMKQLIESLLAMARSDSGQESAVCAPVDLSFIVNSSLMAFEPLAFDMDKHITYDIEA SLHVNGDDKKLRQLTDILLDNACKYSQDNGSICVTLKKTSAKEALLTVSDEGAPLTKEEI KHLFLRFYRGDTARSNIPGYGLGLSIAQSIVSEHGGKIEASTDGFATNSFIVRLPIQE >gi|157101656|gb|DS480668.1| GENE 79 86538 - 88499 1424 653 aa, chain - ## HITS:1 COG:no KEGG:Sgly_0480 NR:ns ## KEGG: Sgly_0480 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 25 537 21 585 629 275 37.0 6e-72 MGRNGLKYRKLKIFGIAAGLAGVLFLAGCTKTAGQDGSKAGAAGTSETAGIQEAAGTQDT NRTDADSSFGDGEETRITGNGTTVAIEGTGAAADGANVTISSSGTYRLTGNITGGGVVVD AGKEDEVCLILDGVSITSPDYSAIYASQSGLLTIVLEDRTENRVSDGNTYTYPQTGEDEP DAAIFSKDDVVFEGNGMLTVSGNYMNGIRGKDRLTINSGTYVIEAAEDGVKGKDAVEVNG GEITITAGSDGIQSNNNGEEDKGYVRITGGTTAITAGKKGILAETLVEVTGGIIDINAQD DGIHSGKNVRLFSGELTLSAGDDAVHSDNLVEVSGGTIIVEQSREGLEGLCLEITGGTIQ INSEDDGINAARGTDTSGGPNAAGGSFGATEGAYIRITGGNVKINASGDGIDSNGDLYLE GGTVLAEGPAEGGNGALDYNGTGTISGGTILAVGSAGMFQTFSESSSQPMLMVYFDEMQD AGTTISIKDGLGNQLTETEISKPFEALLFSSPELKSGETYYIEAGEQDIQMAVDGILNQY GTPSSGGFRRIPGGGEGKLKERPGAGTGEENFGRTQEGREGENAGGSREGRRKMQPEAGN ADNIGSESEAEGAGNVSGKASVVVIQETLLAGNSPEESLPEGIHVLGSSGSME >gi|157101656|gb|DS480668.1| GENE 80 88512 - 89183 673 223 aa, chain - ## HITS:1 COG:no KEGG:Sgly_0479 NR:ns ## KEGG: Sgly_0479 # Name: not_defined # Def: 
hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 1 223 1 225 225 266 60.0 3e-70 MLESVLTETVNTGITMGQFLLCTITSVMLGAVLAAVHTYRNQYSRSFILTLVLLPVMVQT VIMLVNGNLGTGVAVMGAFSLVRFRSLPGNAREIGSIFLAMALGLAAGMGYLGTAMLLMI VSGGITILLISLPAGRAGRKELKITIPENLDYSGIFDDIFAKYTKKSELVRVRTVNMGSL YELCYQVDLKSEFIEKNMLDEIRCRNGNLTIVCGRLPDGRDEL >gi|157101656|gb|DS480668.1| GENE 81 89176 - 89895 465 239 aa, chain - ## HITS:1 COG:no KEGG:Sgly_0478 NR:ns ## KEGG: Sgly_0478 # Name: not_defined # Def: VTC domain-containing protein # Organism: S.glycolicus # Pathway: not_defined # 3 239 5 241 241 236 53.0 4e-61 MTQEIFKRYEKKYMLTQKQYDALMPALERQMDTDHYGEYTISNIYFDTPDFQLVRQSIEK PEYKEKLRLRAYGKATDTSVVYAELKKKFDGVVYKRRIPVPLCQARKYLYYGIRRIEESQ ILKEIEYVLNRYDLKPAAFVAYDRAAYYGKGNEELRITFDRNICCRCSGLDLKNGVYGTM LLDKDQILMEVKIPGAMPLWMSRLFSDMGLFPVSYSKYGAYYKEYLYQGIFIEGGRICA >gi|157101656|gb|DS480668.1| GENE 82 90564 - 91220 386 218 aa, chain + ## HITS:1 COG:HI0575 KEGG:ns NR:ns ## COG: HI0575 COG2964 # Protein_GI_number: 16272518 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Haemophilus influenzae # 23 216 32 221 221 79 28.0 6e-15 MKTIENEMPFFQTLLKLLKNQFGMESEFILHDWSKGYDHSVVAIENGHITNRKIGDCGSN LGLEVIRGTATSDMKCNYVTRTREGKVLLSSTMYIKNDDGQALGALCINTDISERIKIKE YLDSSLPEELNKKDSNEEFFATDVSDLLVYLIDQGINNIGRPVSEMTKEDKMKVLRFLDD RGAFLISKSSTKICQILEISKFTLYNYLDEVRERKESI >gi|157101656|gb|DS480668.1| GENE 83 91274 - 92110 702 278 aa, chain + ## HITS:1 COG:SMa0506 KEGG:ns NR:ns ## COG: SMa0506 COG0834 # Protein_GI_number: 16262719 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Sinorhizobium meliloti # 36 271 24 260 265 132 32.0 5e-31 MKKIKQIKRIAAVVMAVSVAVGLTACQQKTDSQGNSTQTGTLQKVLKERKMTVGCILSFP PFGYKSEDGTPMGYDVDMIHELADSLGVEVEIVEVTADARIPSLETGKVDAVFGNFTRTL ERAQKIDFSNPYVVAGERLLVKKGSGINTVDDLTGKKVAVTKGSTNAELMATLNPNAEVV 
FFETSADALQAVKNGQCATFLEDSNFQQYQAAQNDDLEVVGDSLISLEYNAIGVKKGDQD WLNYLNLFIFNTNVSGKNDEMYSKWFGEPMPFDLNPQY >gi|157101656|gb|DS480668.1| GENE 84 92167 - 92841 392 224 aa, chain + ## HITS:1 COG:TM0592 KEGG:ns NR:ns ## COG: TM0592 COG0765 # Protein_GI_number: 15643358 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Thermotoga maritima # 9 215 9 214 216 146 39.0 4e-35 MFLDWWTPIENLPALLKGALVSLELTLTVFALSLFFGTIVGFLRYNKHHKAGYRIATVYV ELIRNTPMLVQIFFLYFGLPQFKIYLPALFAGILGLTINNTAYIAEIVRSGIQSIPKGQW EAANCLGLKPIHTFLDVIFPQALRNIFPSLINQFIMILFGTSLLSVLDIKDLTQVASILN SQNFRTLELFTFAIGIYYIMSVICSKILRYVNNRFFPSISNGGR >gi|157101656|gb|DS480668.1| GENE 85 92847 - 93494 254 215 aa, chain + ## HITS:1 COG:AGl666 KEGG:ns NR:ns ## COG: AGl666 COG0765 # Protein_GI_number: 15890450 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 212 11 210 216 156 42.0 2e-38 MDGFLRVFSYWRLLLDGFLITLKLSLTVIITSTLLGVICGMLFTLKNRGIRGVLRIYVEL FRGCPLLMQLFMGYYGLAYLGIQIDIFSAVVLVYTLYGGAYIAEIIRSGIESIPKGQWEA AKCIGLNPVDVLRDVVLPQSFKISLPALIGFHLGLIKDTSIASIIGYSELLREGKTIMNV TGYPFETYILVAVIYFIICYPLSKYVGWTERKAGL >gi|157101656|gb|DS480668.1| GENE 86 93491 - 94222 630 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 239 1 242 245 247 51 3e-64 MIEVKNIYKSFGTLDVLKDVSLTVNKGEVVVIIGPSGSGKSTLLRCINHLETPDKGEIVV DGEILTGKSGQIRKVRANLGMVFQQFNLFPHMNVYDNITLGPKRVQSMSKEDMDKTVDDL LAKVGLSDKKFSYPPQLSGGQQQRIAIARALAMNPEYMLFDEPTSALDPELIGEVLEVMR QLAKDGMTMIIVTHEMKFAKEVADRVIVMADSHIIEQGPPEELFTDPKEARTRSFLRSVL EKE >gi|157101656|gb|DS480668.1| GENE 87 94251 - 94709 353 152 aa, chain + ## HITS:1 COG:MT3779 KEGG:ns NR:ns ## COG: MT3779 COG0251 # Protein_GI_number: 15843295 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation 
inhibitor, yjgF family # Organism: Mycobacterium tuberculosis CDC1551 # 11 152 11 150 151 114 46.0 8e-26 MDVYENIRKSGYELPPSPPKGGIYKPVKQVGNLLYISGQGATKDGQPVVSGKLGSERTIE EGQEAARICTLNALSVLHEYLGDLNKIKSVVKILAFVASAPGFNSQPAVVDGASQLLKDI FGDDNGVGARSAISAIELPGNISVEIEFIFEI >gi|157101656|gb|DS480668.1| GENE 88 94745 - 95860 273 371 aa, chain + ## HITS:1 COG:CC3106 KEGG:ns NR:ns ## COG: CC3106 COG3616 # Protein_GI_number: 16127336 # Func_class: E Amino acid transport and metabolism # Function: Predicted amino acid aldolase or racemase # Organism: Caulobacter vibrioides # 42 365 11 365 369 94 28.0 2e-19 MDENRYLLKHTDEIITPCLIYYKDIIEQNIKEMISVAGNANRLWPHVKSHKMKKLVELQI KMGITRFKCATIAEAQMVAESGASDALVAYPLVGPNIDRFLELVKTYAGTRFWAIGDDLE QVSLLNSAAGSQETIVNFLADVNMGTNRTGISINTLPDFYRNCCKLPWLSVQGLHCYDGN NGIAEYDAREKAVDKIDRAVFSTVAALEQTGLPCPVLVMGGTPSFPCHARYKRVYLSPGT GIISDFGYASKFTDETFIPAGLLMTRVISHGGPGLFTLDLGYKGIAADPPGLKGCLMGDW HASPINQNEEHWIWKMDEGYEDQRPGIGSVLYVIPTHICPTSALYPEALVALDGQIVDIW EVTARNRKLSI >gi|157101656|gb|DS480668.1| GENE 89 95875 - 96513 442 212 aa, chain + ## HITS:1 COG:BH3723 KEGG:ns NR:ns ## COG: BH3723 COG0800 # Protein_GI_number: 15616285 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Bacillus halodurans # 3 209 9 209 214 115 32.0 5e-26 MNILDNRIIAILRDVDAENIVDVINALVGGGIISIEIPLNHSTPAAKAASLRTIQKSHET FGDQIFLGAGTILSPEEAESAANAGAKYIISPNMDPEVIKKTKSLGLASMPGAMTPSEIV DAYKNGADIVKIFPAGNLGTDYIKAIKGPLNFIPLAAVGGVNLENAGKLLMSGYQMIAVG GNLADKKAIQAGDYRKISSLARAYKEIVDTAV >gi|157101656|gb|DS480668.1| GENE 90 96540 - 97559 387 339 aa, chain + ## HITS:1 COG:RSc2759 KEGG:ns NR:ns ## COG: RSc2759 COG3734 # Protein_GI_number: 17547478 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-galactonokinase # Organism: Ralstonia solanacearum # 6 322 4 321 331 226 38.0 4e-59 MGNVCITIDGGTTNTRCLIWDQEARVIASCKCNIGARNGAMEHGNDTLKVQLKHNLEQVL TSSGHSWRDVEYILACGMITCSSGLYELPHLSTPAGLKELSEGIQTILLPEVSPVPIHFI 
PGVKNHVNHLDFESIDQMDMMRGEETESMALLRLHHTGSPMLLILPGSHNKYISVDEQSR ITGCMTSMSGEILSSLTQNTILAASVNCSFLDESELDRHWLLNGYKNSCELGIGRAAFLT RVLHLFCELSPQQLGSYLLGTVLALDLEALRGSKAIIFNHGFKIVIGGRGTMALALSQVL EEEYPNSEVLLDRTEYLAGMGARLIWQTMNTRKYIQKTS >gi|157101656|gb|DS480668.1| GENE 91 97630 - 99372 241 580 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 339 562 132 355 398 97 32 4e-19 MSKYLQKKFALSEQGGRDLTKAIFSCALTNMGLIFPMGILFLFMERLLGPLVGIQAPSLG MAGYVAACIVLLVIIFIFWRIQYDATFVASYTESANMRIRLAERLRTLPMSFFGKRDLAD LTTTIMADSAFIEKAFSHFIPELIGAFLSTAIIGVCLMAADWRMGLAVLWVVPISFLLAA GTRPMVDQVERRQKGKKIAASDGIQECIENIQDIRANNQQEEYLRGLDQKILNVESITIR LELLNGTMVTASQMFLKVGMATTVLAGAALLSSHSIGFMMFLMFMVASTRLYGPLTGCLQ NLSAVYSALLVVEQMKGIEEQPVQQGSREAAYDGYDIVFDHVGFSYKEGKQILKDVSFTA RQGQVTALVGPSGGGKSTSAKLAARFWDIDRGKITLGGTDISGIEPETLLRSFSIVFQDV VLFNNTIMENIRLGRKDASDQEVMEAARAAQCEEFIHRLPNGYETRIGENGSALSGGERQ RLSIARALLKNAPVILMDEATASLDVENETLVQQAISNLVKNKTVLIIAHRMRTVAGADR IVVLKDGYVAEQGTPEELLEKKGIYHHMAELQGKSLNWSL >gi|157101656|gb|DS480668.1| GENE 92 99369 - 101183 207 604 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 359 582 133 355 398 84 29 3e-15 MDGSRKHIMTRLPDTGKVTLMTFAGKYKYLTVTGCILSGISAVIALVPYLCMWMVIKLAV LNWPGGLAGGTLVYWGWMAVASSLLSMLIYFGALMCTHLSAFRTARNMKTAALHHLAELP IGYLRGTGSGKLRRIIDDGAGQTETYLAHQLPDLAGALVTPAAVLVLLLVFDWRFGLISL IPMAVGTFFLSRMMGTGMAECMKQYQNALEDMNSEAVEYVRGIPVVKTFQQSIFSFKSFH DSIIRYKNWAVNYTLSLRIPMCCYSVSINSIFAVLIPAGLLLAGDAAGGQGFVTMALDLV FYILFTPVCVTMMDKIMWTSENTMAANDALERILNVIREKPLSEPAAPRKPENHRIEIKD VSFSYNKDGVNALEHVSLTVPQGATTAIVGASGSGKTTLVSLIPRFFDVDQGSICIGGVD VREMETKELMKRVSFVFQDSHLFKDSLMNNIRAAKPEADKEEVMGAVKAARCEDIIKKMP QGLDTVVGTKGVYLSGGEMQRIALARAILKDAPIVLLDEATAFADPDNEYLIQLAFEKLV EGKTVIMIAHRLSTVCRAHRIFVMEKGRVAEQGSHDELLKARGLYAKMWKDYQTSAEWKV GGKA >gi|157101656|gb|DS480668.1| GENE 93 101284 - 101907 563 207 aa, chain - ## HITS:1 COG:no 
KEGG:Closa_2536 NR:ns ## KEGG: Closa_2536 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 205 1 205 205 250 60.0 2e-65 MEDRYNGTVTAILEAGKKEFLTYGYEKASLRRIAKEASVTTGAIYGYFPGKKALFDALTS DTAEELLDLYRKEHRDFAALPPEQQPARLNEITEQYIPWMVNYIYDHFEVFKLLLCCGAQ EARDRYFDRLAAVEEQSCRDFIKAMESLGHSAEGMSNTLIHILCRSFFQQLHEFVSHDLP REQAITCAVTLSRFQHAGWVRIMELGE >gi|157101656|gb|DS480668.1| GENE 94 102132 - 102695 626 187 aa, chain - ## HITS:1 COG:FN1983 KEGG:ns NR:ns ## COG: FN1983 COG0450 # Protein_GI_number: 19705279 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Fusobacterium nucleatum # 1 187 1 188 188 251 63.0 6e-67 MSLIGKEISDFTVQAYAGGAFREVKKSDVLGKWAVFFFYPADFTFVCPTELEDLADKYED FKSAGCEIYSVSCDSHFVHKAWHDASRTIQKIKYPMLADPTGSLARDFDVMIEADGMAER GSFIVNPEGKIVACEVIAGNVGRNADELFRRVQASQFVAEHGDQVCPAKWQPGADTLKPS LELVGLL >gi|157101656|gb|DS480668.1| GENE 95 102942 - 103970 1105 342 aa, chain - ## HITS:1 COG:SPy0146 KEGG:ns NR:ns ## COG: SPy0146 COG1299 # Protein_GI_number: 15674356 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Streptococcus pyogenes M1 GAS # 1 325 1 325 339 246 47.0 3e-65 MTIITGLGLLLLTLAAFSLFSMKAPKGSAAMSGMANAAVATFLVEAVHRYISGDLLHLAF LGETGTISGNMGGVAAAIMVPIGMGVNPVFAVVAGVALGGYGILPGFIAGYAIGFIAPFI EKHLPPGLDSVLGALMIAPIARFIAFLVDPAVNAALAHIGGMISAATEQSPILMGLLLGG VIKMICTSPLSSMALTAMLGLTGLPMGIAAIACFGGSFTNGVIFKMLHFGDNSNVAAVMM EPLTQAHIITKHPIPIYCSNFFGGGFSGVAAAFLGIINNAPGTASPIPGLLAPFAFNPPL KVLMALLLAAISGTLAGIVGAIVFKKKYDMKPELSVINSFEE >gi|157101656|gb|DS480668.1| GENE 96 104583 - 106658 2377 691 aa, chain - ## HITS:1 COG:SP2101 KEGG:ns NR:ns ## COG: SP2101 COG2217 # Protein_GI_number: 15901916 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pneumoniae TIGR4 # 102 687 100 683 687 381 34.0 1e-105 
MKFVIRHEIRGRVRVHFYQKDMSIRQADLLHYYLCTLPGVKNVRVYERTADAAVVYEGSR RCILEGIQRFSYDSERMEELVPKNSGRALNREYKEKLVRKVVVRAFAKSFLPASVRAVYT AAHSVRYLLKGVRCLLRGKLEVEVLDAAAIAVSVLRRDFDTAGSVMFLLGIGELLEEWTR KKSVGDLARSMSLNISGVWQNVDGTEVLVPVSKIREGDLVTVHMGNVIPLDGVVASGDAM VNQASMTGESVPVRKEEGSYVYAGTAVEEGEITLRVRTAAGDTRFERIVTMIEESEQLKS TAEGKAATLADALVPWSLGGTILTWLLTRNVTKALSILMVDFSCALKLAMPLSVLSAMRE AGSCHITVKGGRYMEAAAAADTIIFDKTGTLTKARPQVADVVVFNDMGKEELLRIAACLE EHFPHSMAKAVVNEAVKRGLVHKEMHSRVDYIVAHGIATYVGDERVVIGSYHFVFEDEGC RIPEDKRTVFDSLPVEYSHLYLAIGGSLAAVICIEDPLRDEADAVIAALHRQGITKIVMM TGDSERTAAAVAGRVGVDEYYSEVLPEDKARFVDEEKKKGRKVIMIGDGINDSPALSAAD AGIAISEGAEIAREIADITISEDNLFQLVTLRAISRALMDRIDRNYRFVIGFNLGLILLG VGGVITPAASAMLHNTSTLAISLKSMTNLLD >gi|157101656|gb|DS480668.1| GENE 97 106754 - 107083 356 109 aa, chain - ## HITS:1 COG:no KEGG:Rumal_2099 NR:ns ## KEGG: Rumal_2099 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 4 72 3 71 91 89 63.0 5e-17 MIDCFKCKKMGLFLGGVLFGTAGVKILGSDDAKKLYINCLAAGLRAKNCVMTTASNIQEN AEDILAEAREINAQRCQEVFEDESREESPEHTSEYLKEDETASADTVTA >gi|157101656|gb|DS480668.1| GENE 98 107148 - 109322 2198 724 aa, chain - ## HITS:1 COG:L190009 KEGG:ns NR:ns ## COG: L190009 COG0370 # Protein_GI_number: 15672169 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Lactococcus lactis # 5 709 4 702 709 717 51.0 0 MSIKIALAGNPNCGKTTLFNGLTGSNQFVGNWPGVTVEKKEGKLKGNKDVIIMDLPGIYS LSPYTLEEVVARNYLITQRPDAILNIVDGTNLERNLYLTTQLMELGIPVLMAVNMMDVVK KSGDRIDIQALSRELGCPVVEISALKGTGIMEAANKAVELARDSRASVPVHKFCADVEAA LEEIEQCIGGNVPEAQRRFYAIKLFERDDKIGAQMTMVPDVGSITARMEEAMDDDAESII TNERYTYITSIIGRCYTKKSREKMSVSDKIDHIVTNRFLALPIFAAVMWVVYYVSVTTVG TWATDWANDGVFGEGWSLFGLEVPGIPVVVEGFLAAVGCADWLSSLILDGIVAGVGAVLG FVPQMLVLFIFLAFLESCGYMARVAFIMDRIFRKFGLSGKSFIPMLIGTGCGVPGIMASR TIENDRDRKMTIMTTTFIPCGAKLPIIALFAGALFGGASWVAPSAYFVGIAAIICSGIIL KKTKMFAGDPAPFVMELPAYHLPTVSNVLRSMWERGWSFIKKAGTVILLSTIFIWFTSSF GFVDGHFGMVEDLSDGILASIGQAIAWIFAPLGWGNWQAAVAAITGLVAKENVVGTFGIL 
YGFAEVAEDGSEIWGTLAQSFTAVSAYSFLVFNLLCAPCFAAIGAIKREMNNARWTWFAI GYQTILAYVVALCIYQFGRWFTEGTFGVGTVAAIGAVAVFIYLLVRPYKEGSSLNVNVKA KTAS >gi|157101656|gb|DS480668.1| GENE 99 109408 - 109629 274 73 aa, chain - ## HITS:1 COG:L192240 KEGG:ns NR:ns ## COG: L192240 COG1918 # Protein_GI_number: 15672170 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Lactococcus lactis # 4 72 80 148 152 70 50.0 1e-12 MRTLKEVKCKSTVTVVKLHGEGAVKRRIMDMGITKGTEVYVRKVAPLGDPVEVNVRGYEL SLRKADAEMIEVQ >gi|157101656|gb|DS480668.1| GENE 100 109669 - 109878 330 69 aa, chain - ## HITS:1 COG:MA3478 KEGG:ns NR:ns ## COG: MA3478 COG1918 # Protein_GI_number: 20092289 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Methanosarcina acetivorans str.C2A # 1 69 1 69 80 57 46.0 5e-09 MPLTLVKEGTVASIVRVGGKEEVKRHLENMGFVPGASVTVVSANNGNVIVNVKESRVAIS KEMANKIMV >gi|157101656|gb|DS480668.1| GENE 101 110146 - 110535 414 129 aa, chain + ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 5 118 3 117 122 109 52.0 2e-24 MKIQESAENYLETILILSRKNPYVRSIDIANELAFSKPSVSVAMKNLRENGYILMDDQGH ITLSSMGKEIAETMYERHTMLSQWLMYLGVDEQTAAEDACRIEHVVSAESFRAIKDHITK GHELPEDNL >gi|157101656|gb|DS480668.1| GENE 102 110545 - 111918 1306 457 aa, chain - ## HITS:1 COG:no KEGG:CD3111 NR:ns ## KEGG: CD3111 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 9 437 3 433 442 197 30.0 6e-49 MGGMELKRKIAVIATEYIKEFLQSMLGDLDVDYCIFTYKTFSDIQYIYENVTEEFDGILT SGSFPAHMIHLYYKEEKRPICFFNTDEAALYRLFLKLLNENRNLDFTRVYADIVEIFGVG LKDFVEGRSPMPDIRELSADEFDMERMLGIEQEEYGKHVRLWEEGKIDLSVTRFSSIVPA LQEAGVKVYFPFPSKRYVGEMCDKLLNEIERRKLEEQIPGVIIVKLSENGSGGEMFQGLD YDYMRLENLVIEFIGASMIDCSVHRRHYGLEIVSTKKHVSGWTGDFKEDRLSPFLREKKL 
SARFSIGCGLGNSLSQARLNALDACHEAELKQSLAYLINEREQIIGPMGDCGQLLLNVDN SEVLDVQSKLSPLTVKKIFTAISASEKQEITARTLALRLGITKRSANRFLAVLEQEGYLK IAYKTRTTTKGRPESVYIRTGGPGNPEEQQGQQQEYF >gi|157101656|gb|DS480668.1| GENE 103 112102 - 113274 926 390 aa, chain + ## HITS:1 COG:PH1043 KEGG:ns NR:ns ## COG: PH1043 COG1473 # Protein_GI_number: 14590880 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Pyrococcus horikoshii # 1 386 1 386 387 263 37.0 3e-70 MNEFLRRAKELEHQMQKDRRYLHQHAEAGEHLPGTTKYVMERLSSIGLSPREICDSGITA LIEGSCPGKTILLRADMDALPMKEMNSLPFQTVTEAAHNCGHDIHTAMLLAAAQILHERR DELCGSVKLMFQPAEEVFTGSENMIKAGILSNPTVDAAMGIHVMLDTPVPSLNYGTGFMT SSCDGFKITVKGVGCHGAMPHLGIDPINVGFHIYSAFQNLIARECDPGEKASLTLGAFNA GNTSNIIPDSAVLMGTLRTYNKDLRARLVKRMHEIAEYTGKVFGVAIEYESLSEVPSTCS DPDLTRELAEYASEVVPDFIKHTDYSVTPSDDFARVSEKVPTVYFMIGCRVDGCTVQHHN PGVLFDETVMPYGAAVHAACAFNWLNRRNA >gi|157101656|gb|DS480668.1| GENE 104 113274 - 114599 1225 441 aa, chain + ## HITS:1 COG:BH2694 KEGG:ns NR:ns ## COG: BH2694 COG0477 # Protein_GI_number: 15615257 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus halodurans # 3 411 2 396 418 98 25.0 3e-20 MERKKIYYGWFVVFGCLMITCTMVPPIMALSNKFLIQVTGELQISRSAFTLANTILQGLG IFLSPVVSARLARGNMKRIQTVSIIGFVLSYASFSLATNVIHLYISSFFTGIFFLNASLI PVSMMITNWFVKKRGLAMSIAMAGIGVGGTIFSPVITWLLGAYGWRSTYRIMALIILVLA LPAALFILRKRPEDMGLLPYGSQDSTIENSSLQDSSSRDAASKRIPQKADVMFPLSVKES RTKLFFILFIFGMLCNGLINTGSLGHFPPAIEELQGPQVQALIISLYSMIGIFGKLVLGW LNDRFGVVASTAFGCITFALSFIFILFGQNISMLYIMAFLFGLGDAIGTVTPPLITSAIF GAEKYGEAYGIANSFTQIGLSLGALMVAAIYDTSGSYNTAWILLLILTLGAFAGWVGAYA ASRKYCRKPAAGNTQNMQAAQ >gi|157101656|gb|DS480668.1| GENE 105 114778 - 115602 754 274 aa, chain - ## HITS:1 COG:SA0314 KEGG:ns NR:ns ## COG: SA0314 COG2110 # Protein_GI_number: 15926027 # Func_class: R General 
function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Staphylococcus aureus N315 # 7 274 8 262 266 197 40.0 2e-50 MKQMGQEARLNYIVEELKKDSVSFRDMEVRPRDRRRVMRSLMNIRMPGPLPAGFLEIQDE FLREEAREKGIVSLADIPTVKEQYGSGVLFGDRISIWQGDITRLQADAIVNAANSRMLGC FVPCHGCIDNAIHSAAGLRLREACSRYMDDRRREDPDYEEPVGRAVLTPGFCLPCRYVIH TVGPVVGMRLTKELKQDLRNCYVSCLEAAADQGLRSIAFCCISTGEFHFPNDKGAEIAVD TVTKFIKQRKAAFDRIVFNVFKDGDRELYEKLLS >gi|157101656|gb|DS480668.1| GENE 106 115599 - 116618 746 339 aa, chain - ## HITS:1 COG:SPy1215 KEGG:ns NR:ns ## COG: SPy1215 COG0846 # Protein_GI_number: 15675179 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Streptococcus pyogenes M1 GAS # 52 316 18 283 293 215 38.0 8e-56 MKGRGQKTAAQETGIQETGMQKTAVQETEAQKKDEKDIGDPNPAYGDCLLALKRRLEQAD AVVIGAGSGLSAAAGLTYAGRRFEDHFREFIERYGMRDMYSAGFYPFPTQEEKWAYWSRH IYYNRYDMPAAKLYRDLYAMMKDKNHFVLTTNVDHQFWLAGFQDECIFATQGDYGLFQCK QACHKRLYDNEAQIRAMLREQKDCRIPSCQVPKCPVCGGDMEVHLRCDGNFVEDEAWYRA AVRYETFLRNNQEKTLVFLELGVGMNTPGIIKYPFWRMTKQWKQAFYISVNLEQAWVPDD IMDRSLVLKADIAHVLEHLAGTDAGCNDGAGWGEGSCQP >gi|157101656|gb|DS480668.1| GENE 107 116749 - 117429 411 226 aa, chain - ## HITS:1 COG:MA0739_1 KEGG:ns NR:ns ## COG: MA0739_1 COG5015 # Protein_GI_number: 20089624 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 5 153 23 163 163 98 38.0 8e-21 MDTGDYLNILKDDIHSVVFATVDRDGHPAARVIDIMLADENTLYFITAKGKEFYRQLTER KYVAISGMTGGGGSLSKKAISIRGEVKELGAGLVDRVFEENPYMAEIYPDKESRMAITVF CVTRGRGEYFDLSVKPITRKGFLIGDWDKEEAGGYFITRQCRGCGNCLSKCPQTCITTAQ VPFEIQKEHCIRCGNCLEVCPFGAVVRREEDGTTAGQWRKDDGRGS >gi|157101656|gb|DS480668.1| GENE 108 117665 - 119629 1460 654 aa, chain - ## HITS:1 COG:CAC2731 KEGG:ns NR:ns ## COG: CAC2731 COG0577 # Protein_GI_number: 15895988 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: 
Clostridium acetobutylicum # 73 653 287 875 875 166 21.0 2e-40 MIWKEYSRKYLKDHRESSVSIMAAAFVSSLFISFITTLFYNMWVDSVRRAAEKGGDWQAG AQPSLLTGFYFVIMCMVCLSMALVLYHAFVMGADERMHQLGILQTVGATPGQIRACLLQE AIALSLLPVLAGIAAGVGAAALFLHMANSISRALGMKAAYLTYHPLLFLLSLGICVLTVG AAGGRAAIQLSRVNPLDAVMGGREEPVRQVKSSRLFSEIWGVEGELARKSLYARRKAFRA SSLSLTLSFMVFSLFLNFWVLSGLSTKYTYFERYKDTWDLMAAVKEESGGMPPGLLGDIR SLSGVTRCIRYEKAEGYTWLEEGMLSEELRKAGGLESLKNTGIQKLDGRYRIKVSMVILD QDSFHEFLEEAGLDSSVTEVTVNRIWDNVNSHFRAKAYLPFISALDHPVAVAVDEGNGMG ENGDRGSIQLHTGAYTNQVPDLREEYPDFSLVHIMPEQIYEEKAGDRVPGEVYYQIKAVS DEAVPKLEESVSRLMDGKMDYELENRQTEEAYEMDVRRGYQLIMGSLCGLLALVGIANVY ANTLGGLYMRKREFARYLSVGVTPGGLARILAMEACIVGIRPILASIPVNIGFVIFTVRQ SRFTFSDYLSVIPLKPLALFGGAILVFVGMAYWTAGRRTGNINIVEGMRDDTMI >gi|157101656|gb|DS480668.1| GENE 109 119626 - 120324 230 232 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 20 223 35 244 329 93 31 7e-18 MELLRTEHLCKVYGTGENQVHALRDVSFSVEKGEFVAIIGQSGSGKSTLLHMIGGVDVPS SGHVWVDGSDVYARNRKELAIFRRRQVGLIYQFYNLIPVLNVVDNMTLPVKLDGQKVNRE RLEELLDILDLRDRAAHLPRQLSGGQQQRAAIGRALLYAPALVLADEPTGNLDSRNSQEI MNLLKYSNRTYRQTLILITHDQEIALQADRIIRLEDGRIVSDAPNGRQGGRI >gi|157101656|gb|DS480668.1| GENE 110 120534 - 121544 849 336 aa, chain - ## HITS:1 COG:CAC0525 KEGG:ns NR:ns ## COG: CAC0525 COG0642 # Protein_GI_number: 15893815 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 5 333 2 329 329 244 38.0 2e-64 MAEWLRNPEVRLQTGIHGVLLVTAAAAAAIIWDIGCALYAAALCLAGSGISLWFTAKRYD RLLQLSREVDRVLYGNDSMAGIPDEEGELAVLASKIYKMTIRLRDQAEELRTDKAYLQES LADISHQVKTPLTSIHMLLRQLKEEENEQERSRISRSIGSLLARIEWLIAALLKMASLES GTVVLKQEMVLAGDVIQKAAEPLAVPMELKGQQLILKGQEGARYTGDFLWSVEALGNILK NCMEHTPAGGTVTVSTEENPVYTQIQVEDTGKGFSKADLKHIFERFYRGENACGDSVGIG LALSGMIIKEQKGTILASNRENGGARFCIRFYKGAV >gi|157101656|gb|DS480668.1| GENE 111 121548 - 122216 515 222 aa, chain - ## HITS:1 COG:CAC0524 KEGG:ns NR:ns ## COG: CAC0524 COG0745 # 
Protein_GI_number: 15893814 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 222 1 222 228 266 59.0 2e-71 MNRILIVEDDRDIAGNLLLLLKKEGFTAVAVGTREESLEQIRRESFDLVLLDLMLPDGNG YSVCTAIKREADIPVIFLTAMGDEESVVTGFELGADDYIAKPFRPLELVSRVKNVLRRRG RSRSVLRAGDLLVDTVKGTVTRDGQEIILSALEYRLFLVFLNRQGEVLTRNRLLEEIWDI AGDYVNDNTLTVYIKRLRDKIEKDPAHPEIIETVRGRGYKVG >gi|157101656|gb|DS480668.1| GENE 112 122242 - 122847 531 201 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935403|ref|ZP_02082785.1| ## NR: gi|160935403|ref|ZP_02082785.1| hypothetical protein CLOBOL_00298 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00298 [Clostridium bolteae ATCC BAA-613] # 1 201 14 214 214 387 100.0 1e-106 MVMYIPFGGGTFPAISGFIKYKIDLEKNMEYQWDTVTEEELKHLYYEEGMTDRKIAERFG VSMGKVAYKRRKYGISIKNMIYQQFMDENSELFAQLNENSRERLLRGENIDAISKAVTHY AFRNGPVEDMHANGQLSQQDMKTLNKYMVNRIAGLLAAAMDGSWLQLEQLFSYYRFFGGD WDAAEPDMGEMKLLMERLKKR >gi|157101656|gb|DS480668.1| GENE 113 122885 - 123898 634 337 aa, chain + ## HITS:1 COG:BS_yxeI KEGG:ns NR:ns ## COG: BS_yxeI COG3049 # Protein_GI_number: 16081005 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Bacillus subtilis # 1 305 1 303 328 252 39.0 6e-67 MCTAMVTQTSQGDIYFGRTMDFSYPLEPELYFVPKGYQWNNIMNTHQVRNRYRFMGIGQD ISPVVFADGVNEMGFGAAVLYFPGYAQYDVPDPEDSSRPAITALELTGFLLGLSASVEEA ASILRTIRIVGAEDSITGTIAPLHWLVADRTGKSMVIEKTADGLHLMDNPVGVLSNSPDF QWHLTNLRNYMNLSPTQEQSQTWGSLELTPFGQGGGGFGLPGDYTPPSRFVRTAFLKTHT PIPAGRDEAALTCFHIMESVSIPKGPVITSRNTPDYTQYTAFINLSSREYFFKTYGSSCI TSASLPGNPDAQTGIKSLAALNQPAGFCQLPASQDAI >gi|157101656|gb|DS480668.1| GENE 114 123872 - 126718 2493 948 aa, chain - ## HITS:1 COG:all0638_1 KEGG:ns NR:ns ## COG: all0638_1 COG0642 # Protein_GI_number: 17228134 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # 
Organism: Nostoc sp. PCC 7120 # 533 795 328 594 615 188 37.0 5e-47 MKNALQAGKKIAGMACILVFGCLLSVLPCSDAQAGEGDQDKAQGPDRILRVAYPELKGMS ETDENGTRHGIVVDYLNEIAKYTGWQYEYIDTAAEDIIPEFLDGQYDLMGGTYYQPEFEQ YFAYPDYNTGYNKSVLLVRRDDKSIKTYDWKSMSGKTIGVYERAEENIRRLQAFLDLNGI DCTIRYFSKDQLVDGNLYQCLEDREVDMLLGNNADVKGNLRVVAEFDSQPYYIVTTPDNQ EVLDGMNMALGKIVDSNPNFAEERYNANFPDSGIESICLNDEEQAYIREKQKVSVAVVKE WHPLFSQEEDTDMHNGLIPDVLEKVTEFSGLEFSYEYTDTYGDALNLVNQGKADILGAFL GSEEDGADMGLALSKAYASMSDIIVRNKGVSYPSDGLVGAVIEGRRMPTGIKADEIRYFP DVRAALRAVNNGEVDFFYGISTKIEHDMQAHHYPNVVPNTLVNNRNDICFAVTRPVDGEL LPILNKSVNSLSSEQKTALTNQNMITIGSRSASIVELMYANPVMFVTVTACVFLAVMILV MAAARSRIRAARMQSSLERAEAANRAKGEFLSRMSHEIRTPMNAIVGLSDLTSMMDDVPE QVRENLGKIRSSSQYLLRLISDILDMSRIESGKMTVSNEAFSMNRAVEELEDMLTAEATR RGLEFTVEKDVENRTLMGDVIRLQQVLTNLVSNAFKFTPAGGSIIMRVTRTKSTNQQVIY KFQVIDNGVGISVENQKRVFGAFEQVGPNYSKSQGTGLGLTISRTIVGLMGGELKLKSEL GKGSEFYFSAAFSLADSNVEEEIKIKQEKQKAGGVESSCLEHINILLAEDNDLNAEIATE LLKMKGASVRRAENGKQAVELFMQSSPGTYHAILMDLQMPEMNGLEACRAIRRIQRQDAV SIPIIAMTANSFKEDADAAAEAGMDGFVTKPVDVEYLYQVLDRVLRRG >gi|157101656|gb|DS480668.1| GENE 115 126996 - 127586 508 196 aa, chain - ## HITS:1 COG:HI1012 KEGG:ns NR:ns ## COG: HI1012 COG0235 # Protein_GI_number: 16272947 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Haemophilus influenzae # 4 196 3 195 210 136 37.0 3e-32 MDRELAQKLDDAVWIGRSLFERSKTSGSSANISFFHNGTMYISRSGSCFGTLKADEFAVM DMNGASLSGSKPSKEWPLHLKVYQKKPGTGAVIHTHGTYGVLWSFVPAEDETDVIPDHTP YLKMKLGKVGLVPYEKPGSQALFDAFEQRVMDSDGYLLKQHGAVVPGKSLMDAFFCLEEL EESARIAWMLRQAGMR >gi|157101656|gb|DS480668.1| GENE 116 127586 - 128857 1232 423 aa, chain - ## HITS:1 COG:PM1365 KEGG:ns NR:ns ## COG: PM1365 COG3395 # Protein_GI_number: 15603230 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 3 410 1 405 413 281 38.0 1e-75 MPIFGCVADDFTGASDAASFLVKGGMSVQLFNGIPAKGSRVGEGIQAIVIALKSRTQDTG 
EAVADTLRALGWLKEQGVERFYLKYCSTFDSTPKGNIGPAADAAMEFLGVSYTVLCPALP VNGRVVMKGELYVGGVPLHESPMKNHPLTPMWDARIANLMKPQSKYSCVELWKEQMDRPA DEIHAMLSDQADTQEHYYVIPDYQDMEDGERIVELFGDLALLTGGSGILEPLAKHLSRRQ DILPELDIRAEGAAVLIAGSCSRATLEQIAHFQTHGGFSYKMDPMAMLEGRESLEDAWRF VRENWGTPVLIYSSDTARNVKEVQQYGQERVAEMLEQAAAELAERAVAHGIRRVVVAGGE TSGAVTKRLGFSSYRIGASVAPGVPVMVPMENEALRLVLKSGNFGQEDFFVRVLEMTQTE DGK >gi|157101656|gb|DS480668.1| GENE 117 128869 - 130239 1004 456 aa, chain - ## HITS:1 COG:AGl3299 KEGG:ns NR:ns ## COG: AGl3299 COG4091 # Protein_GI_number: 15891768 # Func_class: E Amino acid transport and metabolism # Function: Predicted homoserine dehydrogenase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 402 38 427 489 132 26.0 2e-30 MDYTTLLKNSGGKKSRVGIIGATRGYGYTLLAQIPKVDLMELRLISSRHAAECIEVLTGL GYDKDSITVCENKEQVRDAAGDAVIIVNDYRLVLESDITALVECTGNTAVSSEVAENALR RGINVYMVSKETDSVCGPVLNHIADQYGTVYSLVNGDQPRNLLDLYSWGKILGLEIVAAG KASEYDFVWNPDTCEFIYMEGEEKPHEIMPELREFWHYEGRETLEGRRKVLERYMAVISA DLCEMNLVSNVTGLLPADARLNYPVAKISELADIFIPEEDGGILKKTGVVDVFCNLRAPD EASFAGGVFIIIRCGNETVWKLLTSKGHVVSRNGKYACVYLPYHYMGLETPLSILLGDIL GIGTHGECRQVSVMTGVAQRNLAKGTILKVQGHHHSIEGLIPELWELKEAGNVAPFYLLN GMELLEDVMAGSPITKDVVDLSGSLPYELYNRGIRL >gi|157101656|gb|DS480668.1| GENE 118 130566 - 131921 996 451 aa, chain - ## HITS:1 COG:no KEGG:PPE_04861 NR:ns ## KEGG: PPE_04861 # Name: not_defined # Def: hypothetical protein # Organism: P.polymyxa # Pathway: not_defined # 27 268 13 249 302 132 34.0 4e-29 MRRKQESRPPNPSNTYEPLSDAHAAYLKDLVKRAGKKDGTREIFGSSKHGYKLNPTVTRE EVRRFEARWHLTLPDEYVFFLTKVGNGGAGPYYGLYSLENLDRYNEYLGFYDARDKEALP PFIDRNMSPVNWARAMEEMEDINDDNEYDVRMKQVCSGLLVIGTQGCTYDNLLMWKGSER GKIVYIDWNMEPEYGPFLTGLTFLEWFEQYFLEILAGHNLTSYGYISLKSQQELRKEYGG AVLTEDKLRILAGFHKFTEAEPETIELLLQRDRPETDGAKGELLFKLAPMKGLKMFEETI VGNHPQGAVACARRMPDQYKDRFYGKMLGLLYRPDIKDKARILFFLGDCSNRKAADLKGF AENMENSSEDRSTAVYVMGKCPDKLDFAEFFAKRMQGEDYWLAHAALQAMSRTPCAELTD TYEWMWKRYEGDRMMRSNLQIAFEANGIEKK >gi|157101656|gb|DS480668.1| GENE 119 131942 - 132541 505 199 aa, chain - ## HITS:1 COG:no 
KEGG:Cfla_0433 NR:ns ## KEGG: Cfla_0433 # Name: not_defined # Def: aminoglycoside-2''-adenylyltransferase # Organism: C.flavigena # Pathway: not_defined # 34 169 10 143 157 131 45.0 2e-29 MQWQETLYKMIMEKDEGIMIKKETVTKTDLLKVLDLMEASGIQYWLDGGWGVDVLVGKQT REHRDVDINFDARCTDVLLDTLVSRGYEIVTDWRPVRIELYHPELSYVDIHPFVINDDGT AKQADLEGGWYEFEADYFGSAVFEGRTIPCISAKGQKVFHTGYELREVDKHDIRNIDHLL AVQASSRDETGGDVCQAEK >gi|157101656|gb|DS480668.1| GENE 120 132778 - 134706 1202 642 aa, chain - ## HITS:1 COG:BS_yyaL KEGG:ns NR:ns ## COG: BS_yyaL COG1331 # Protein_GI_number: 16081134 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Highly conserved protein containing a thioredoxin domain # Organism: Bacillus subtilis # 1 642 61 684 689 518 41.0 1e-146 MERESFENEVIAEILNREYVCVKVDREERPDVDSVYMSVCQAMNGQGGWPLTIIMTPDCR PFFSGTYFPPRARYGRPGLEELLTAAAGQWKVKKEKLLDQAGQIEKYLKSQERTERQAEP ELGAVHQAFRQLADCFDSKNGGFGSAPKFPAPHNLIFLMEYGAREKRPEALAMAEKTLVQ MYRGGIFDHIGGGFSRYSTDGQWLVPHFEKMLYDNSLLVMAYIKAYGSTGRKMYGCVAEK ILEYVRRELTDSQGGFYCGQDADSDGVEGKYYVFTREEIREVLGEKAGRDFCRQYGITGH GNFEGRSIPNLLENDNYEEICEEPWGNGDHGGNICHGSCDTIGGRENEECRRLYQYRIDR ARLHKDDKILVSWNSWMICACAMAGAVLGEEQYVDMAVRADAFIKSHLVKEGRLMVRYRD GDAAGEGKLDDYACYSLALLELYRVTFRVDYLKRAAAWAEIMTEQFFDRERGGFYLYAKD GEQLIVRTKETYDGAMPSGNSVAAQVLYRLTRITGDVIWQKVLEKQLCYLAGAMDGYPSG HSYGLLTMMDVLYPAKELVCTLSSGSDTERRRKLAAQLANLAVTAEGLTVVIKTEENARE MERLIPYTKDYPVPDTGELFYLCVGHECMQPVPQLEQLIEKL >gi|157101656|gb|DS480668.1| GENE 121 134745 - 134909 60 54 aa, chain - ## HITS:1 COG:MA3726 KEGG:ns NR:ns ## COG: MA3726 COG1331 # Protein_GI_number: 20092523 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Highly conserved protein containing a thioredoxin domain # Organism: Methanosarcina acetivorans str.C2A # 1 54 1 54 697 100 81.0 6e-22 MDNQNNRPNRLINEKSPYLLQHAYNPVQWFPWGEEAFEKARLENKPVFLSIGYS >gi|157101656|gb|DS480668.1| GENE 122 135028 - 136113 1000 361 aa, chain - ## HITS:1 COG:SA0609 KEGG:ns NR:ns ## COG: SA0609 COG3949 # 
Protein_GI_number: 15926331 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Staphylococcus aureus N315 # 33 361 1 320 326 113 27.0 7e-25 MNKKLNIGRVVTIGGAFIAFAIGSGFSTGQEIMQFFAAYGTEIVLCAVVFFIGNLYMNYN FLEAGRKGQFEKGSQVFNYFGGKYIGLFFDWFSVIFSFASYFVMIAGAGATLQQQFGIQP VIGAVIIAALAGLTVMLGLNKLVDIIGKIGPAIIVLCLLAGLVGIINGNLSFQEAEVLIP GSGIIAASGTWWGSMISYLGFGMLWFAPFLAAIGKEERNPKDAQYGTIMGVLVLTLAVLI VGSALIKNIDLVGASQIPLLLILHQSSPFLATVFSVVIFAGIYTTACPLLWTPVNRVAKE GTNTYKIAAIVLAAAGCVIGLVAPYAAIVNFIYAVNGKVGFLLVALVIIKNIRDAAARKN A >gi|157101656|gb|DS480668.1| GENE 123 136110 - 136523 300 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935416|ref|ZP_02082798.1| ## NR: gi|160935416|ref|ZP_02082798.1| hypothetical protein CLOBOL_00311 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00311 [Clostridium bolteae ATCC BAA-613] # 1 137 1 137 137 284 100.0 1e-75 MNVLVVTNNPEAGERLEKLGLCVEFTDGNAKRSLLHARDLIIKGWHLAADPRGGYNKRYN PCHTVFLCDEAISNMGEDVLVLEKLAQFHDSPYRPDALYTDLELKDFQILDMSIALRTTE HLKCYMDELSYKKVRKV >gi|157101656|gb|DS480668.1| GENE 124 136538 - 136729 137 63 aa, chain - ## HITS:1 COG:no KEGG:COPRO5265_0224 NR:ns ## KEGG: COPRO5265_0224 # Name: not_defined # Def: glycine reductase complex component B subunit gamma (selenoprotein PB gamma) (EC:1.21.4.2) # Organism: C.proteolyticus # Pathway: not_defined # 1 63 370 432 435 85 61.0 5e-16 MHVCTVTPISINVGANRVVPAMAIPYPTGNIELEKEDEKLFRRHMLDTAVEALATEITEQ TIF >gi|157101656|gb|DS480668.1| GENE 125 136793 - 137836 879 347 aa, chain - ## HITS:1 COG:no KEGG:COPRO5265_0224 NR:ns ## KEGG: COPRO5265_0224 # Name: not_defined # Def: glycine reductase complex component B subunit gamma (selenoprotein PB gamma) (EC:1.21.4.2) # Organism: C.proteolyticus # Pathway: not_defined # 1 346 2 346 435 368 51.0 1e-100 MKIIHYINQFYAQIGGEEKADTPLSVRENAMIGPGVALKAAMGDEAEIVATIVCGDNYFN ENLDEVTAGVAKALENYKPDIVIAGPAFNAGRYGMACGNILKICSLKGIPAFSGMYPENP GAEMFRTYGFITKTRNSAGSMRSAIADMTALIRKVLTDPRSLDPKADNYIQRGQRINKFS 
NKTGAERCVEMMLNKVNGRPYETELPMPTYNRVPPAPAIKELSKAVIALVTTGAVVPEGN PDHIETGIATKFGKYSFSKDYGGFHMPRHQMIHGGCDPVYAQEDPNRMIPADVLKDMEAE GKIGKLSDTLFVTSGNGCATNNAVAFGQAIAAELKAMKVDGIILTSA >gi|157101656|gb|DS480668.1| GENE 126 137852 - 139141 1061 429 aa, chain - ## HITS:1 COG:no KEGG:STH2870 NR:ns ## KEGG: STH2870 # Name: grdE # Def: glycine reductase proprotein # Organism: S.thermophilum # Pathway: not_defined # 1 424 1 429 434 370 45.0 1e-101 MSTDKNLRLDMEIVHIKDIRFGAKTEVRSGILFINKQEMVEAIADPFFSSIDIDLARPGE SVRIIPVKDVVEPRVKLDGKGAGSSFPGFAGAYEGCGEGRLKVFKGCAIVTTGTIVGVQE GVIDMKGTCADYCYYSKLNNIVVVGNCPDGTEPHAHEAACRLLGLHAANYVAEAAMDADA DAYETYQLTPVDPARKLPKAGVIYMAMAQGLIHDNYIYGVNAGKTNPVYMNPNEIFDGAI VNGCCVIASDKNTTWDHQNNPLLKELYKHHGVDLEFAGCIVTPTHTVLHDKERNSHAAVR MARSLGWDVAAVVEEGAGNPDADLMMMIRSLEKAGIKTVGMLSACCGEEGTTDSTPEADA CINTGTDADYMPVHLEKMDRIIGDAKQIRVLSGGSLDGYNADGSVDVNMVAIMCSMNQMG MTRMYSVVL >gi|157101656|gb|DS480668.1| GENE 127 139304 - 140176 583 290 aa, chain - ## HITS:1 COG:BS_ydfL_1 KEGG:ns NR:ns ## COG: BS_ydfL_1 COG0789 # Protein_GI_number: 16077613 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 9 127 4 115 117 60 30.0 3e-09 MRDEQGNNLYSIGKLADIVGIPATALRFYDEQGVLKPQIRDDETGYRYYTEEQVMKCMFI SEMRRLGLTVADIKALCEKSSLTFQREALINRMVLLNDELDHLLFKKAYLEELMRNVDMG LGYFESAKICRENHIPIKIKYYPAAYAVTIPVYRCFYEQEKFSTTHRLLYDICEQKHLRI CGPATSRFEDPGMEHLKKPFYKSQWILPVVKPEEDMKGITLFPEMWCVYTSHVGPYRNIL EQYEKLMDYCAEYGLKISGNPIEEYMISCANICGSENYITNVMFPIELPE >gi|157101656|gb|DS480668.1| GENE 128 140414 - 142309 1126 631 aa, chain - ## HITS:1 COG:TM0281 KEGG:ns NR:ns ## COG: TM0281 COG3534 # Protein_GI_number: 15643050 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Thermotoga maritima # 191 604 48 454 484 239 35.0 9e-63 MKAEITIKEGNKRPVNPLIFGHFIEYMRDCIDEGMWAQLLKNRSFEISEHVKDGVPDFWH RTGVKDAFHFEQDWDHTISREGCSLKITDRNHYDGYAGTAQTGLCIQNGRYEGYVWMKSE QEVPVSIEIYDKEGREFLSKTISVNGEWKKYCFDFRSAAVTYTGTMEIRLKAAGTLWVDG 
CSLMPSDTVDGIWREVFERIKNISPAVIRFPGGCFADCYHWEDGIGERDTRPVRKNEHWG GYEDNSFGMDEYMEFCRKIGCEPMICVNFGSGTAEEAANWVEYCNGLADTPYGRLRAAHG HPEPWNIKYWDIGNETFGDWEIGHLDASGYAVKYLSFYEAMKEKDPTITFMVCGGDGDSI SQEWNRKISEIIGDKMDVVCLHMYSQKEIQGEHDSRDIYYATAGSVKKYEGILNDSCETI RKSGNPAAMAAVTEYNVGTIIDSYREQTLEAAIFNAGMLNMFLRNTDKLAMCNLSDLVNG WPGGCIVSKDGHAFGTATYHVMSMYSGSRLKEVLDIDVFSPTYSTSEQIGNIEPLSGVPY VDAAACLDENGNTVIFALNRSCDQEAVLEIKYETPNKTVKKAEITTITSEKTSDMNTLPD ESIVPAACMMQTQSGRITLAKHSVNRIVLLP >gi|157101656|gb|DS480668.1| GENE 129 142315 - 143157 691 280 aa, chain - ## HITS:1 COG:all4824 KEGG:ns NR:ns ## COG: all4824 COG0395 # Protein_GI_number: 17232316 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Nostoc sp. PCC 7120 # 20 280 18 279 279 208 38.0 8e-54 MTRTNNIYRKNTGKIIITILLAALAFIVLYPFIWMLVTSFKPEAEIVSYPPKLFSGRFSL DSYINIWKRVPFLKYYRNTIIFSTAVTVTSLVFDTMAGYAFARMNFPGKNIMFLMVMGTM MIPFQVIMVPLYIEIFKFGLINTYPGLVLPRATNAFGIFMMRSFFISLPKGLEEAGRIDG CTEFGIFRKIMVPLCKPAIVSLFIFHFMYQWNDLLYPLLLTTDDNMKTLPSGLATFMGTH VVEYGIIMAGACLSLLPILIAYFIAQKQFVQGIAMTGMKG >gi|157101656|gb|DS480668.1| GENE 130 143172 - 144059 754 295 aa, chain - ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 16 269 16 268 292 186 42.0 7e-47 MKTSLERKRERAGLYFVLPSMTIFTVFVFIPLIIAFIFSFFKFDMMFQNFQFQKLGNYAK LAGDKKFWNALFNTVYYTAFTVPVQIGLALVTAVAVKRKGWLNGFFKSVYFIPAICSMTI VSILWSFLVNPDIGMFCYWAKLLGFNPINVLSSPTWAMPTVILVSVWKNFGFNMVILLAG LNGIDESYYEAANIDGATGFQKFRGITIPMLMPTLSFTVVNCIIGSFQVFDQVYIMTKGG PLFKTQTLVQLIYSMAFDSFNMGYASTIAVALFIITFIISVFTFKHMTKGEEDLG >gi|157101656|gb|DS480668.1| GENE 131 144085 - 144957 664 290 aa, chain - ## HITS:1 COG:AGc4715 KEGG:ns NR:ns ## COG: AGc4715 COG1653 # Protein_GI_number: 15889858 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport 
system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 8 233 124 356 410 102 32.0 7e-22 MVPDALKDNLKWKGTYYGVPMNFATLLLYYNKTIFTEAGLDPETPPATWDELEQYAEQIV EKTGKYGFDMAVKETTPMWCIMLWGNGGDFIKDGKAVFNSPENVETITRWAENIRDKKFG PEVLTGGEIDKLFESQKLAMYFCGPWATTGFKNAGIDFGVAQAPKGPEQQVTQANGVGMF LTASSKNKEAVYDFFKYWNSVDTQVEWCLGTGFPPARTDIADDARLNENPYIKEFAKPAN ESQMYLQQLTNFADIENQAITPAFEKILLQNADVKESLDEANDIIQSMLE >gi|157101656|gb|DS480668.1| GENE 132 145036 - 145371 211 111 aa, chain - ## HITS:1 COG:no KEGG:Pjdr2_1617 NR:ns ## KEGG: Pjdr2_1617 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: Paenibacillus # Pathway: not_defined # 35 111 52 127 451 88 51.0 9e-17 MRKRLLALGLAGIMAASAALTGCSGSNGAAETKSEAAQTEGSEGGKTVITFWNGFTGSDK DTLEALVQKYNDTNDKNIEVQMNIMPWDSLYQKLATVLPVGEGPDILAFAP >gi|157101656|gb|DS480668.1| GENE 133 145368 - 147128 904 586 aa, chain - ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 22 561 11 535 570 99 24.0 2e-20 MTKFYKKGYPRPQFVRESFFSLNGEWNFRFDPRKEGIKCGWQNGFEDRKIIVPFSYESPA GGIGVQDECNVVWYSRSQSFELHEGKRILLNFEGSDYRTELWVNGIYMGVHEGGYTRFTF DITEALKKTGNQLVCRIEDTYSTVQPRGKQRWLKQNYDCWYTQTTGIWKDVWAEETDSSY LTALYLTPLYDENSLEVDYELNLWKDSSDSFEIETIITFENFTIYSGKDKVIKNAFSKKI SLVNDEIKWKVRYWCPKSPNLYDVCIRIYKNGKVIDEVGSYFGLRKISTDSGRVKLNNMD FYLKMVLDQGYWESSLLTPPDEEALILDIKKILEYGYNGVRKHQKIEDERFYYWADVYGL TVWCEAPSFYEFDRRAIECVTKEWVDIVKQHYNHPSIITWVPFNESWGIPRILCDKKQQV FTEAIYYLTKSLDATRPVISNDGWEHTKSDIITLHDYAEYGEDLLSHWTDWEQNLSNTQS FNGERYVFAGGFRYEGQPIILSEFGGIAFCKDEKAWGYGNAETSEGSYLERLNSLTDAIY SMDFISGYCYTQLTDVEQEQNGHMDMNRRDKVGAEKIRTIIQGGRK >gi|157101656|gb|DS480668.1| GENE 134 147288 - 147428 73 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935428|ref|ZP_02082810.1| ## NR: gi|160935428|ref|ZP_02082810.1| hypothetical protein CLOBOL_00323 [Clostridium bolteae ATCC BAA-613] 
hypothetical protein CLOBOL_00323 [Clostridium bolteae ATCC BAA-613] # 1 46 1 46 46 69 100.0 9e-11 MAELTFYLLVFYFYTRYNNTIKHSEWVWLKKVCYAGSCMCVKVSYV >gi|157101656|gb|DS480668.1| GENE 135 147437 - 148435 502 332 aa, chain + ## HITS:1 COG:AGl3407 KEGG:ns NR:ns ## COG: AGl3407 COG4189 # Protein_GI_number: 15891821 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 28 330 6 305 309 233 38.0 5e-61 MHKSQLRGVSMNRQSANEKEKIDRHIVLDIDSEENYELFERLGKALSSRVRIRMLALLKH KSMNIVELAQELTIPVSSAAFHVNMLEEANLINTEMLPGVRGSQKVCSSRAEDVFLSINT SPLEKERRSFSVDMPIGNYYDFRIHPTCGMVNEETYIDSCDDIKSFYSPGRSAAQLIWFQ RGFVEYRFPNHFISGKNPNYVSFSLEICSEAPGYRNVWPSDITFSINDRELLTYTSPGDF GGRHGKLTPIWWTDGNTQFGLLKTISVDGSGTYLDGVMKSEEINIDTLALSAQPYISFKI EIKKDAKHVGGINIFGKSYGDYPQNIVMHVEY >gi|157101656|gb|DS480668.1| GENE 136 148582 - 149235 222 217 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01773 NR:ns ## KEGG: EUBELI_01773 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: Mismatch repair [PATH:eel03430] # 1 205 1 205 216 234 54.0 2e-60 MRLQEACVRFNKLAGIRFEELFSQNDMNMIIINKGKTGQLLELALGMHLSSTNLDFDDGE LKTNKCDNAGNPKETVFITQISSVIDELIQERPFEETHLYEKISNILYVPVCKDGSPKDW MFLPSIHIDLSSPRYAQLRDIWRSDYYSICRQLRMHIETSPDRRIHTSNGQHIQVRSKDS MPYHPIYSGVYGRYVSNKNHAFYFRKEFVYDIRSMSR >gi|157101656|gb|DS480668.1| GENE 137 149249 - 151123 254 624 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 363 587 132 355 398 102 32 1e-20 MAAYEEQEYNNPFSFRIWMKLFPFFKPYKKFFAITLGLNILLAGVDILVPLFQSYAIDTF IIPDRLDGLGVFTLVYISVIVAQTVCVYVSVHAATSIEMNVGKDLKRAQFKHLQTLSFSY YNTTPVGYIHARVMSDTLRIAGMIAWGLVDMFWALIYVVSVFVIMFMLNAQLAFIITMIV PCIALLTVYFQNRILKWNRKVRRINSQITSAYNEGITGVKTSKSMVIESDNEKRFFEKTS DMHQAAIRSAKLNAVYIPAILFFSSVASAVVLAKGGYMVQENIIKLGTLSVFISYAVVIF EPIQQLARLLAELISCQANIERVMDLLEQKPNVVDRTDVLRKYGDAFSPNKCNWEKIQGD IVFEDVSFMYPDGKEYVLEHFNLHVPAGMNVAIVGETGAGKSTLVNLVGRFFEPTKGRIL 
IDGVDYRERSQLWLHSQIGYVLQNPHLFSGTVRENIRYGRLDASDEEIEEAARNVSADMV VAKMEKGYDSDVGESGGLLSTGEKQLISFARAVLANPAIFVLDEATSSIDTQTEKLIQDA TNHLLKGHTSFVIAHRLSTIRQADLILVVKDGKIIERGSHPELLEQKGYYYELYSRQFEE EEAMKVFAGATFAGETDWESVCQP >gi|157101656|gb|DS480668.1| GENE 138 151110 - 153044 1475 644 aa, chain - ## HITS:1 COG:BS_yknU KEGG:ns NR:ns ## COG: BS_yknU COG1132 # Protein_GI_number: 16078496 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Bacillus subtilis # 29 601 18 571 585 352 35.0 1e-96 MKNIQKNTIPKHRKLHLIFRFLQGSKLYFAAAVAASLVSTILNALTPQIFRFSIDEVLSG SSGTISSSGTSGSYLSSHLWVLALMIVAVAIASGIFTYISRTNTARAGENFAKNLRDTLF IHVQKLPIRWHDRNQTGDIIQRCTSDVEVIRGFVVTQLLEVFRTAFLVLTSFVMMLSMNV KLSCIVLLFVPVVVVYSTVFYRLIAKRFTTADEAEGELSTVVQENATGVRVVRAFGREQF EMERFDKKNNAFAKLWIRLGTLSGLYWGIGDLITGLQVIAVIIFGVVEAVNGSISVGEFV AFAAYNSTLVWPIRGLGRILSDMSKAGVSFERVDYIIRSQEEAYGKTDEKGGKEREHSSG KYDISFQHVSFGYEEGKDVLRDISFTVPEGSTFGILGGTGSGKSTVIQLLSRLYELEENR GSICIGGRVVKEIPIEELRRNIGMVLQEPFLYSRTIRENIAASVPGASLEEIRYAAQIAC IDDAIMSFPDGYDTLVGERGVTLSGGQKQRIAIARMLLRKAPIMVFDDSLSAVDSQTDYR IRCALKEHMREATVILISHRVTSLMGADEILVLNQGKIEERGTHGELIRRNGIYRRIYEI QMSRDDRDLMGQDEKDKKNKKDKKSKENNKIEDKMDGGIINGSI >gi|157101656|gb|DS480668.1| GENE 139 153140 - 154009 382 289 aa, chain - ## HITS:1 COG:MA4541 KEGG:ns NR:ns ## COG: MA4541 COG1715 # Protein_GI_number: 20093325 # Func_class: V Defense mechanisms # Function: Restriction endonuclease # Organism: Methanosarcina acetivorans str.C2A # 7 281 3 290 303 97 29.0 3e-20 MRKTEEVPAYHEMMQELFQAIKELGGSGTIQEIDDKTIEILGLSPEVLTIMHGDTSKSEV EYRLAWTRTYMKKVGILENSARGVWALTTVGRELQEINSDEIVKKVREMTLLKMKDTKEI SLEDHNLENDGVDTPDEIQTWREKLKNVLKNLKPDAFERLTQRLLRESGFTQVKVTGKTG DGGIDGMGIIKLNGIISFHMLFQCKRYTGSVSAGEIRDFRGAMQGRADKGLFITTGKFSA PAIEEANRPGATPIDLVDGDELVDKLRELQLGVAPVNDYVIDDAWFLSI >gi|157101656|gb|DS480668.1| GENE 140 154015 - 154266 89 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MISKILLLEVENKERTRLEINPRIKRESPVLKNHVIIFRKTIISSDSQKGISLVSKYGLY 
LPSRKGLDNYVRRVLRYYVRKRG >gi|157101656|gb|DS480668.1| GENE 141 155015 - 155785 185 256 aa, chain - ## HITS:1 COG:no KEGG:HDEF_1084 NR:ns ## KEGG: HDEF_1084 # Name: not_defined # Def: putative R of type II restriction and modification system # Organism: H.defensa # Pathway: not_defined # 3 255 81 324 336 235 47.0 1e-60 MTSIRKAIETRFTYNAKATKIALKNTKIVLPIGKDGKVDYAYMENYVSSLEKDGLVTIEN YLKNNLLEDFDLTDKEKEVLESFKTLKFHEYNIADLFTIKNAGSILSRSIIENSGETPYL CASAENNAVSSYISYNEKYLDEGNCIFIGGKTFVVTYQESDFYSNDSHNLVLYLMDKEKR NKLIQLYLVTCINNSLKHKYSWGDSVSKAKIQNDKILLPTKNNKPDYDFIEMYMKAMEKI VLRDVINWKNNILKTE >gi|157101656|gb|DS480668.1| GENE 142 156117 - 157283 116 388 aa, chain - ## HITS:1 COG:HP1472 KEGG:ns NR:ns ## COG: HP1472 COG0286 # Protein_GI_number: 15646081 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Helicobacter pylori 26695 # 1 388 279 674 679 209 36.0 5e-54 MKSFTEINKDHDRDKITALDDIVAELLEKEASVTKQIFTYIFENIFLSIDSMSGHLDIMG EMYSEFLKYAFGDGKELGIVLTPPYVTKMMSQILDIDENSKVMDLATGSAGFLISAMKLM IECVEQKYGKNTTKANKKIDEIKQQRLLGVELNAEMYTLASTNMILRGDGSSNIRKGSSF DEPPELYRNFNANALLLNPPFTFKENGLPFLKFGLENMKIGGKAAIIIQDSAGSGRGIIS CKEILSKNQLVASIKMPVDLFLPMAGVQTSIYILEHTGKEHDYKKQVKFIDFRNDGYKRT KRGIYELDSPSQRYRDIVEVYKNGITANVSSELWDIKNQVVMDVISRNGDDWNFEQHQKI DLVPTEEDFKKTVRDYLSWEVTKFLRGE >gi|157101656|gb|DS480668.1| GENE 143 157286 - 158119 317 277 aa, chain - ## HITS:1 COG:jhp1365 KEGG:ns NR:ns ## COG: jhp1365 COG0286 # Protein_GI_number: 15612430 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Helicobacter pylori J99 # 61 275 47 292 678 70 28.0 2e-12 MASWKLEDNVNDWVKSEFARIGQNNYTVESAMSPHLKSALQMGVKLKRLELEIEGEKEKG KSWKPDFELESFNIPVIIENKLGTAKLSAIKEGKVKRDIKSVQNFAVNGAIHYAQCAIMS KKYSEVVAIGIAGDSEENVSIEVYYVFGATDETYKLVSSYNTLDFLENKLSFAEFYKAAT LTEEEKHRVLIDSQAKLQEYAKKLNKLMHNHAITAPQRVLYVSGMLLSMQDIADKKKGLI PNDLKGLDLDDERDGDLIVKHINNYLNVKKYLLTKLH >gi|157101656|gb|DS480668.1| GENE 144 158584 - 
159153 183 189 aa, chain - ## HITS:1 COG:no KEGG:LDBND_1115 NR:ns ## KEGG: LDBND_1115 # Name: not_defined # Def: hypothetical protein # Organism: L.delbrueckii_bulgaricus_ND02 # Pathway: not_defined # 35 189 1 155 155 117 39.0 4e-25 MAAETAKKQAESEADQLIESVDEELQKLQRQIEDLTHANEALLYENQGLRAKLNGIDTIP ILYLGDEDEFFPNEIKEMILVALEEKLAKIDSKTRQADILKDVVYNNGGCQHIASQKAQK LKSALKGYKNVSSSMRQLLTELGFVISEEGKHYKLTYYGDGRYWTTIAKSPSDNRTGTNV ALTIIKNML >gi|157101656|gb|DS480668.1| GENE 145 160127 - 161101 334 324 aa, chain - ## HITS:1 COG:NMA1039 KEGG:ns NR:ns ## COG: NMA1039 COG3943 # Protein_GI_number: 15793995 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Neisseria meningitidis Z2491 # 1 318 5 322 336 284 48.0 2e-76 MYTTEDGVTKVEVTFDNDTVWLSLDQIADLFQRNKSTISRHIKNIFLEGELSRNSVVANF ATTGSDGKRYHVDFYNLDVIISVGYRVKSLRGTQFRIWATNILKEYMIKGFALDDERLKN LGGGNYFDELLARIRDIRSSEKVFWRKVLEIYATSIDYNPKAESSVQFFKQVQNKMHWAA HKHTAAEVIYQRADADKDNMGLTTWSGKRIKLSDVEVAKNYLDEKELDALNKIVTAYLDI AEVHALNQEPMYMKDWLETIDDYLRMTRRDILTTKGKVTHQQALEKAHLEYEKYKRNPEY ILSPVECHFLEGIGELDKLDRDSK >gi|157101656|gb|DS480668.1| GENE 146 161296 - 161841 151 181 aa, chain - ## HITS:1 COG:no KEGG:Ccel_2698 NR:ns ## KEGG: Ccel_2698 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 177 15 190 194 202 57.0 5e-51 MLKNEKYKGDMMLQKTFTEDYLNGIRKKNIGQRTRYYVKGSHPAIISPEIFDKVQEEMLN RARLLRTSDGNQISSGNRYRSKYSLGNLLVCGYCGGGFRRRTERGKIVWRCGTRMEKGKA ECENSPTLNDQDVREMLGKVVCNGEYDENVVKDRVKRIDVYEKRLIICYAEKEGYQICEF E >gi|157101656|gb|DS480668.1| GENE 147 161889 - 162401 130 170 aa, chain - ## HITS:1 COG:no KEGG:DSY3426 NR:ns ## KEGG: DSY3426 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 5 162 40 241 566 115 37.0 9e-25 MPVKKENMEPLRVAAYCRVSTKKETQLASLAHQIVAYTEQISDHPGWVFSGVFWDCGRSG LRKKGRNGLKHMLESAAEGKFDYIITKSAKRVSRNTVELLQIMRYLKERGIQRKFEEGLF SSYKYFMVYRCVEGELVIVPEQAKIVELIFELYSKGYTFSQIKNTWRKVV >gi|157101656|gb|DS480668.1| GENE 148 162514 - 
164175 989 553 aa, chain - ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 171 549 73 455 458 343 42.0 5e-94 MDKRADRGMSGKMAGKTVKRVGTDSGTGSRKGVKRDKGGVRDRVRDNVKDGVKDRVRDNV KDGVKDRVRDNVKDRMKDNTKDNIKDSKHKVRDGEGKNGAGESGKRFDGNEGAFSRQRSG EKCRDGSSPRVRKGAEGLSVSKKQNGGALNAEKRREDSKRGGSYGGRGSRRSNSICPVLN LCGGCQLLDMEYAKQLAFKQKQAEELLKGLCPVKPIIGMKDPFHYRNKVHAVFDRDKKGN IISGIYEENTHHVIPVEKCLIENQKADEIIGTIRGMLKSFKIRTYDEDTGFGLLRHVLIR KGFSTGEIMVVLVTASPVFPSKNNFVKALREKHPEITTIVQNINGRGTSMVLGDKEHVLY GKGYIVDELCGCRFRISSKSFYQVNPVQTEILYEKALSLAGLTGQELVVDAYCGIGTIGI IASKAAGKVIGVELNQDAVRDAVNNAKMNGIENIRFYCNDAGRFLVNMAEQGEKADVVIM DPPRSGSTEEFMDAVGKLGAEKVVYVSCNPETLARDVRYMKKMGYRAAEAWLVDMFPGTV HTESIVLLQKSVK >gi|157101656|gb|DS480668.1| GENE 149 164641 - 166494 1654 617 aa, chain - ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 12 617 3 584 584 502 42.0 1e-142 MKRKDGNVMNVIRDRLDALRKLMKERGMDAYMIPTADFHESEYVGEHFKCREYMTGFTGS AGTALITMDEACLWVDGRYYVQAAAQLKDSTVTMMKMGQEGVPSLRAYLEDKMPEGGCLG FDGRVVNAAEGLALEEMLRERGARISYGEDLAGMIWQERPELSAEPAWVLDERYAGKSAL DKIADVREAMEKVHASVHVLTSLDDIAWLLNIRGNDILYNPVVLSYALVTMDQLYLFVNS SVLEGKAYPYLEDEKGISVREYLERTGVTVMPYDGVYDMVEGLKNEKVLLEKCRINYAVY RLIDGSNKVIDRINPTASMKAVKNDVEIENEKRAHIKDGVAMTKFIYWLKKNTGRIPMDE ISVSDYLGKLRMDQEGCIGLSFATISAYGAHGAMCHYSATPESSIPLEPRGLYLIDSGGQ YYEGTTDITRTIAMGPVTDEEKEHFTLVLMSMLRLGDVKFLHGCRGLSLDYAAREPLWRR GLNYEHGTGHGVSYLSSVHERPNGIRFKMVPERQDNAVMEAGMITSDEPGVYIEGSHGIR TENLVLCVEDEKNEYGQFLRFEYLTYVPIDLEVIDREIMSERDVELLNRYHEQVYEKISP YLDEDERVWLAEATRAV >gi|157101656|gb|DS480668.1| GENE 150 166660 - 168084 652 474 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal 
protein S1 [Flavobacteriales bacterium ALC-1] # 4 470 3 451 458 255 30 9e-67 MAREFDVVIIGAGPGGYTAALKAAEFGLKVVVIEAKKIGGTCVNRGCIPTKALLHASDMF HMMQSCDEFGVSTDFISFDFGKMQKYKKSAVVKYRDGIKYGFEKLNVEIVYGTAVLRRDR TVEVELKEGGREFFRGNAVIIATGAVPYMSRIPGADLTGVWNSDRLLAAESWNFDRLTIM GGGVIAVEFATMFNNLCSHVTIVEKQKHLMAPMDDVMSAELEKELRQKGIDVYCDATVTE ILEDEGGLSCVITPNGEGEPIKMRAGQILMAIGRRPNVEKLLGKDISLEMEGGKIAVNSD FETSERGIYAIGDVSARTQLAHVAAAQGTYVVEKIAGRPHSIKLEVVPNGMYVSLPIVPN CIYTDPEIATVGITEEIAREKGLKVRCGHFSMRENGKSIITGGENGFIRLVFEAYSNTIV GAQMMCPRATDMIGEIATAIANGLSAEEMSFAMRAQPTYNEGIGAAIEDAMRKK >gi|157101656|gb|DS480668.1| GENE 151 168187 - 168987 959 266 aa, chain - ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 10 266 3 256 256 102 31.0 6e-22 MTQKPSKKKRIVSFDLDMTLLDHKTWKIPDSAMKALELLRKDSVIAIGSGRNMDHDMSVV YRDMIMPDAVIHMNGTRVVAEGNVLFEHLMDKERLRALLEYGDAHGISLGVSQEGKDYYI HPEGVVRMDRLRWGVSERNFMDGWELLDMPVRTLVYIGGPEGARELEAHFPEFKFPMFSG NMGADVVEQEASKAEGLKRLCQYYDIGLEQTVAFGDSMNDYEIVREAGIGIAMGNSVEEL KAVADYVADDIDRDGVWKACRHLGLI >gi|157101656|gb|DS480668.1| GENE 152 169853 - 170785 707 310 aa, chain - ## HITS:1 COG:PA4783 KEGG:ns NR:ns ## COG: PA4783 COG0697 # Protein_GI_number: 15599977 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Pseudomonas aeruginosa # 10 296 8 289 296 90 28.0 4e-18 MKSDSSLRRLLPFAAYFNICFFWGTAGVANKLSSSIIHPFFAGSLRFLIASVLMALWVGF RHESLAVGLRDGKTLAVGSVLMYFLNTVLLLFASRRVDAGISTVMVSLIPITILLVDSLA ARKLCVSWVGIGGIIGGFAGIAVVVVGSVSGGNADPIGIALLLLADIVWSVGTVYLKYQS ISASVQVQIFYQSAIPAILFFFCAVLTGNFDLGKLSWRGALPMAYMGIADSIVGLTSYLY LLRRWKTSVVSTYAYINPVVGILLSALLLSEEITMAKVGGMLLILVSVFVIQREDWLSGL LTRKVKRIAD >gi|157101656|gb|DS480668.1| GENE 153 170818 - 171372 426 184 aa, chain - ## HITS:1 
COG:DR1029m KEGG:ns NR:ns ## COG: DR1029m COG1396 # Protein_GI_number: 15807634 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Deinococcus radiodurans # 2 179 17 188 190 90 30.0 2e-18 MVGEKIRKLRKEKKLTLKDIAEATGLSIGYISQLERGAVEPSLSSLRKVSEFLGVSPYLL VDQSEHHPAMVKSDQRPIIKFPKSEIFYEIVSPMSAPEYTPSSMVIQFQIEPGGHDSEEF LTHPSEEIVILLQGEADMVLGETSYHLNAGDSLVVRPNMPHRTINTGETPAIGFCVMTPM GWTT >gi|157101656|gb|DS480668.1| GENE 154 171614 - 173050 1367 478 aa, chain - ## HITS:1 COG:yfcC KEGG:ns NR:ns ## COG: yfcC COG1288 # Protein_GI_number: 16130233 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Escherichia coli K12 # 13 477 9 511 513 240 33.0 6e-63 MQKNKNVHTNGGTAATARKPIKIPHVYVIICAIILFVAVCTWFVPPGQFDTQTVEVEGVS RTLVIPGTFHTLEEKHPAGLVTILSSLHRGLVSGAEVTMLIFLVNGAFSMVLKTGAFNAF LGSLLRRFSSRAKLVIPVFFLTFAVLSSTFGMWNEYNGLIPVFMGLGVALGYDALAGFAI LELGKGIGWSAATLNPFSVPVSQGIAGVPIFSGMGIRVVSFVIFSTLGIAYIFYYGQKVS KNPSKSLILGDKLDINFNREEIINTKTTKKQMWILLEILVSLIAIFYGSLKLGWGNAELA GIFVLMGIFAGTVSGWGPSKMAEEFLEGASSVVMGALVVGFAKAILVILQGAMIIDTIVF YASQVLQGMPPIIAAQGMLILQTLINFFIPSSTGQAATVIPILAPLGDLLGVSRQVTCLA FQFGDGFSNILWPTCSLAISIGISGIPMHKWWKFFMPLFGILYIAQALILSAAVLMGI >gi|157101656|gb|DS480668.1| GENE 155 173090 - 174085 964 331 aa, chain - ## HITS:1 COG:RSp0056 KEGG:ns NR:ns ## COG: RSp0056 COG2355 # Protein_GI_number: 17548277 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent dipeptidase, microsomal dipeptidase homolog # Organism: Ralstonia solanacearum # 2 330 6 321 323 184 33.0 2e-46 MKEDLIIDGCAFTETEFTGVTEDVLHSKMDAFFLTIPGEGNGYVASTRSIGKIYNMLDNP KYGLMIAKSVADIYKAKEDGKKALILAFQNPNSIENSLEQLRVFYELGVRVIQMTYNDAN FIGTGCCESQDGGLTDFGKRVLKMMNRLGMLADLSHVGKKTTMDIIRLSEKPVAVTHAGV YNITPSVRTKTDEEMLAIKENGGVMGISPWAPLIWKKEKGCRPDVKDYVDHVDYVVNLIG IDHVSFGADNTLDGNKDDKGTADQAVLYPAVVGAYNSCVGTRSDERHAKGFEGCWQLENV IDEMKRRGYSEEDIAKLTGKNLLRVLRANWN >gi|157101656|gb|DS480668.1| GENE 156 174370 - 175203 609 277 aa, chain + ## HITS:1 COG:MA4102 KEGG:ns NR:ns 
## COG: MA4102 COG0345 # Protein_GI_number: 20092895 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Methanosarcina acetivorans str.C2A # 7 276 4 270 270 193 42.0 4e-49 MSQAIQKRIAYIGAGQMCEAIFSGLISSGAILPEHILLYDVNESRLTDLQTRYRTNILTP SAHSYEDLVSWADIVFLSVRPQDARETLSKAGMLFRPEQTVISIMGGVKLSFLEEYIKNA AVVRVMPNTPMLVMEGAAGIALGARCTSQTEELVTSIFNVIGSSYVLPESLIDPLTGISG CGPAFAYLFIESLADGGVEMGLSRDVALNLAAQCLIGAGKMVLKTNTHPGILKDRVCSPG GGTIAGVHTLEEGKFRGVVMDAVVESIRRMQNVGDKA >gi|157101656|gb|DS480668.1| GENE 157 175437 - 175664 234 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935461|ref|ZP_02082843.1| ## NR: gi|160935461|ref|ZP_02082843.1| hypothetical protein CLOBOL_00357 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00357 [Clostridium bolteae ATCC BAA-613] # 1 75 14 88 88 126 98.0 5e-28 MIKNDKGTNVGANIRRIRLESDIGQTDLVRILQLMGADITREALVKIEKGTQHIKVSQLK AIKAALGTSYEDLLE >gi|157101656|gb|DS480668.1| GENE 158 175818 - 176216 430 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935462|ref|ZP_02082844.1| ## NR: gi|160935462|ref|ZP_02082844.1| hypothetical protein CLOBOL_00358 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00358 [Clostridium bolteae ATCC BAA-613] # 1 132 1 132 132 261 100.0 1e-68 MEQYDFTDTGNAKDIYGLLDCMSEKELEMAREAVRNIRETAQLAQFERYNAWFDHTLLPI FKEYAQMTSSLLQIERDNGTIDVLFRNSGGLDITENCKGMYMALMIAVHIFMDSDAGDAV LALTYDCCRIVS >gi|157101656|gb|DS480668.1| GENE 159 176189 - 176524 181 111 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935463|ref|ZP_02082845.1| ## NR: gi|160935463|ref|ZP_02082845.1| hypothetical protein CLOBOL_00359 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00359 [Clostridium bolteae ATCC BAA-613] # 1 111 1 111 111 231 100.0 1e-59 MDGSAWNHYREHFLEGLEQAGGIRVTKTHGRRSVAGLNQMDNCLWKIPALVKKGQLFQPV HCHEVNRERCRMAGYEGYQYPVQCFKADMERMLSGRPDELASFHDTILQQS >gi|157101656|gb|DS480668.1| GENE 160 176570 - 176776 299 68 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|160935464|ref|ZP_02082846.1| ## NR: gi|160935464|ref|ZP_02082846.1| hypothetical protein CLOBOL_00360 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00360 [Clostridium bolteae ATCC BAA-613] # 1 68 14 81 81 101 100.0 2e-20 MNIRRIRKEKGIGQTELIQKIDLEEWDFEVNLTREALVKIERGIQHIKVSQLKAIKVILE TTYDELLK >gi|157101656|gb|DS480668.1| GENE 161 176925 - 178568 1045 547 aa, chain + ## HITS:1 COG:BS_rocR KEGG:ns NR:ns ## COG: BS_rocR COG3829 # Protein_GI_number: 16081087 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus subtilis # 78 545 8 455 461 316 37.0 8e-86 MPVCFFACDDTSVLYCNDDCLKFFREQVPGTESPAADQKAASRFPGYQVFPLGPYTCCMF TFSRTASASGFSQGNDRLPVIYESVLNEVEEGVVISDHENRVIFINRAAEAIEGVDARMS LGKRMEELYLPVGNGKKKNSHAAVLNTGIAPNEHLNQYVVKSSHKIMNVVERMYPVNIHG RTMAVFSLIKNLPVIQKSMEQSLELYDYFRENTPHNGTRYTFRSIVGSDIRFVEAISDAK CVARNQTTVLIYGETGTGKELFAQSIHNASPYQNGPFISVNCAAIPSTLLESMFFGTVKG SYTGAANTTGLFEQARNGTLFLDEINSMDISLQAKLLKVIENKTVRRIGDMKEREIRCRI LCALNEAPLACIENGKLRRDLYYRLSSCILYIPPLRERKSDIPLLCSFFLKHFNKEYGRH IHRIDPELMDRFLEYPWPGNVRELQHVVESAFSVSRKDMEILQLGHLSHYYREFFTACKP EAVSDSQAAPGPVSPRSLQPLKAYMDSCEKNFLAAALCCHSNNITATARSLQITRQALQH KIRKYGL >gi|157101656|gb|DS480668.1| GENE 162 178825 - 180162 1216 445 aa, chain + ## HITS:1 COG:CAC3451 KEGG:ns NR:ns ## COG: CAC3451 COG2211 # Protein_GI_number: 15896692 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Clostridium acetobutylicum # 5 441 6 450 458 182 28.0 2e-45 MENKKLTMREMICYGIGDITANVYLQFIALFAIVFYTDVLGISATLAGLIFMGSRVFDGI NDIAVGYISDRYGHYKRWILYGSIATAVAFVIMFTNFHLSTKMQSVYALAAFCFWTLMYT CYAIPFNAFASTMTQNTEERTLLNSIRFAIVAVPSLIISLATPYLKSGTQESNSTYGNIA LVFAVIATVCTLICVAGIVERAKAPTARQRTSGKEYFKAILKNRQLLVVSAAFFCRTLGY YIYSSSMAYYFNYYLGSAKLMGIILGISAPISAVAALCVTPVSRYMGKKKALMGCGVIFA LSSVIRYFMPLNTAVVAVTCWIGMFVMSATLAIFFTMIADTTDYGMWLTGKNVRAVNYGF 
YTFCQKIGMAFSGTIVGILLDMAGYVPNQQQTDAALRGILNIYCLIPAAIYLIMTAFMLL WKLDEGRMHEIVAELEVRRRLETND >gi|157101656|gb|DS480668.1| GENE 163 180155 - 181051 711 298 aa, chain + ## HITS:1 COG:RSc1002 KEGG:ns NR:ns ## COG: RSc1002 COG0697 # Protein_GI_number: 17545721 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Ralstonia solanacearum # 12 297 24 307 312 63 27.0 5e-10 MTKQIRKGYGTVILASLAYGVMPVVSKLLLLSGMNSESVVFYRFFLTCLFSALILAAFRG FRAVTLPQAVLLALFGILGFGFTMQFLTISYQYIPIGLATVLHFAYPLFVTLIMLAVYGE KPAPARLWGCAAALAGICLMVDLKGGFSPGGVIYALLSAVTYSAFVVSNKKACYGSLSPM LCLFAFSLSASLFFGLSCTLTGTLQVPCSLYQWGCLMAVSLLCTVFAFCTLMTGVRILGA AKASVINMLEPATGVVFGVILFKEKLSLKIMIGCACILVSTLITVLARDSRKDGGKKI >gi|157101656|gb|DS480668.1| GENE 164 181048 - 182364 1183 438 aa, chain + ## HITS:1 COG:PA4677 KEGG:ns NR:ns ## COG: PA4677 COG4690 # Protein_GI_number: 15599872 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidase # Organism: Pseudomonas aeruginosa # 7 353 2 341 412 183 37.0 8e-46 MNELLSCDTMVALGNSTKSGNVIFAKNSDRPLGESQPLCLFEAKDYPDQELLSCTYISIP QVSRTYKVLGSKPYWIWGFEHGMNEWGVAIGNEAVWSREEEERENGLPGMDLVRLGLERS KTAYEALHVITDLLKTYGQGGNASVTMDFRYHNSFLIADTCEAWILDTVNRRWVARRVKN AEGISNCYSTGEHWDEDSGDIQEHAYEMGYADRGQTFDFARAYGAVNLKLRAAYPRYKRL NQLLDRKKGILTPDYIGSILKDHFEGEIIEPRWSPADGILASVCMHNMDESSSKTAAGSV VELSKDKTPVWWSCMSNPCISVFLPFTMETAIPQKVSQGSARYSDDSLWWKLERLSYELE EDYPRNTALWNPVKERLQEYICSLAEKGLDDSLLCEMASRLEEAADEMYLVLRDANASSS QPQRKAALKAARTMAGII >gi|157101656|gb|DS480668.1| GENE 165 182570 - 183385 1037 271 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 24 122 646 744 744 87 42.0 2e-17 MKKRLAVTLAAVALSVVMSVPAYAGTWKYVNDQWKYQRGANKFAYKEWIEDNGNWYYMDN 
NGVMTTGWQQIDGQWYYMDQAGVMQKGWFKDNDKWYFLLPNGAMATNTVIDGRQIGADGV WIAAEGEVEPANITDLSTPYLVQNLLDGLSTKGYNIITSGKTSTGERWNNAIRLKGKGSY VKYAANGQYRLLSGVIAPSSQFSSGIMAKVTVYGDNDTVLYTSQDIHYNEKLMHFGVDVT GQNEIRVEVSLVTDNEWDDPIILIDGLSLYK >gi|157101656|gb|DS480668.1| GENE 166 183643 - 185166 955 507 aa, chain + ## HITS:1 COG:SA2230 KEGG:ns NR:ns ## COG: SA2230 COG1680 # Protein_GI_number: 15928020 # Func_class: V Defense mechanisms # Function: Beta-lactamase class C and other penicillin binding proteins # Organism: Staphylococcus aureus N315 # 24 218 43 241 498 90 27.0 1e-17 MNTREHTAMKMRGNPDISYLGHTVDQMIWSFMEQEKIDGLTLAIVQAPYIPRVAGYGYSD SQRRRLASPNTMWPAGPISQAFAAVAVMQLHEDGGLDVNRRASVYIPELPDTWNDITVLQ LLRHASGLPDYRRAPEFSPDTPWTFDALLQLAARNPLHFVPGTDVEQSATNFLLLTEIVE RVSGLSYHDFVTKRQIEFLGLGHTGFKEDLGQFHHEDISRTADIHQLFKKDRLYINPTEP AVSYDSSGTPSTAAETTALRGFSDIWASAQDISFWDIGLAGSVLIHQPENRALIYAPWNL PDGRTVPAASGWQFYHHRGLMDIKGSVHGYSSFLSRFTDASELVCVTLMCNKEGVDFTNL GRRIAGAYGDLLSTGYDDNRLYLLEGQFPAAETVSRLESALKDRGIPVFAKFDHGQNAAG AGLFLRPTTVLVFGSPQVGTGLMEEDQSVSLELPLRISVWEDEAGSTWLAFPRLDKVFGE YGLENHPAVPKMQNLLEQLVHIGGSIY >gi|157101656|gb|DS480668.1| GENE 167 185293 - 185916 858 207 aa, chain - ## HITS:1 COG:CAC3094 KEGG:ns NR:ns ## COG: CAC3094 COG1392 # Protein_GI_number: 15896345 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate transport regulator (distant homolog of PhoU) # Organism: Clostridium acetobutylicum # 6 206 8 209 210 117 34.0 1e-26 MSKKSDSYYFQNFIECVECGCQAAKMLEDNLNHFDTGLLQDKLDELHRIEHDADKKKHEM MAVLVKAFITPIEREDIILLSQSIDEVTDKIEDVLIRIYINNVQQIRPEALAFIKVIIRC CEALKEVMEEFADFRKSKTLHGLIIEINALEEEGDRLFIESMRRLHAEVTDPIEIIAWRE IYVYLEKCCDACEHVADVVESVIMKNT >gi|157101656|gb|DS480668.1| GENE 168 185934 - 186992 998 352 aa, chain - ## HITS:1 COG:SA0619 KEGG:ns NR:ns ## COG: SA0619 COG0306 # Protein_GI_number: 15926341 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Staphylococcus aureus N315 # 31 350 21 335 335 168 34.0 2e-41 
MDMSFSFFLRQLVTNPALLITVLLTLGVICVNGITDAPNAIATCVSTRSLDVNLAIVMAA VCNCAGVVVMTMVNSTVAMTITNMVNFGEDTSIALVALCAALFSIVVWSSFAVSIGCPTS ESHALIAGLTGAAVAIRGGFSAINSAEWMKVVYGLIFSVVLGFLSGYLSCKLVAWTCRDM DRRKTTGFFRYAQIGGAAAMAFMHGAQDGQKFMGVMLLGICMVNGSSNTQGVQFPLWMLL LGSAAMALGTSIGGKKIIKSVGMDMVRLEKYQGFSADMAGAACLLGFSIFGIPVSTTHTK TTAIMGVGAVKRLSAINFSVVKDMVMTWVMTFPGCGIIGFLIAKLFMVLFHG Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:41:55 2011 Seq name: gi|157101655|gb|DS480669.1| Clostridium bolteae ATCC BAA-613 Scfld_02_10 genomic scaffold, whole genome shotgun sequence Length of sequence - 317461 bp Number of predicted genes - 289, with homology - 287 Number of transcription units - 135, operones - 73 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 34 - 327 340 ## gi|160935496|ref|ZP_02082871.1| hypothetical protein CLOBOL_00385 - Prom 380 - 439 6.2 2 2 Op 1 . - CDS 442 - 678 332 ## gi|160935497|ref|ZP_02082872.1| hypothetical protein CLOBOL_00386 3 2 Op 2 . - CDS 724 - 1317 401 ## Mlab_0595 hypothetical protein - Prom 1343 - 1402 9.2 - Term 1461 - 1499 1.3 4 3 Tu 1 . - CDS 1527 - 1973 282 ## gi|160935500|ref|ZP_02082875.1| hypothetical protein CLOBOL_00389 - Prom 2041 - 2100 6.7 + Prom 1984 - 2043 8.3 5 4 Op 1 . + CDS 2253 - 2882 324 ## Closa_2883 hypothetical protein 6 4 Op 2 . + CDS 2889 - 3074 105 ## Closa_0840 hypothetical protein + Term 3120 - 3166 -0.5 - Term 3108 - 3154 6.1 7 5 Tu 1 . - CDS 3167 - 3919 402 ## COG0384 Predicted epimerase, PhzC/PhzF homolog - Prom 4046 - 4105 5.6 8 6 Op 1 . - CDS 4136 - 4381 206 ## gi|160935504|ref|ZP_02082879.1| hypothetical protein CLOBOL_00393 9 6 Op 2 . - CDS 4406 - 4645 343 ## COG4443 Uncharacterized protein conserved in bacteria - Prom 4881 - 4940 6.8 10 7 Op 1 . - CDS 5083 - 5439 150 ## gi|160935508|ref|ZP_02082883.1| hypothetical protein CLOBOL_00397 11 7 Op 2 . - CDS 5451 - 5912 180 ## gi|160935509|ref|ZP_02082884.1| hypothetical protein CLOBOL_00398 12 7 Op 3 . 
- CDS 5968 - 6540 288 ## Ava_B0296 hypothetical protein - Prom 6584 - 6643 7.2 - Term 6642 - 6702 18.3 13 8 Op 1 . - CDS 6729 - 8045 989 ## COG0531 Amino acid transporters 14 8 Op 2 . - CDS 8051 - 9238 767 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily - Prom 9302 - 9361 9.2 + Prom 9264 - 9323 6.7 15 9 Op 1 . + CDS 9468 - 10352 643 ## COG0583 Transcriptional regulator 16 9 Op 2 . + CDS 10393 - 11346 507 ## COG1266 Predicted metal-dependent membrane protease + Term 11368 - 11421 5.5 - Term 11356 - 11409 13.1 17 10 Op 1 . - CDS 11452 - 11580 149 ## gi|160935517|ref|ZP_02082892.1| hypothetical protein CLOBOL_00406 18 10 Op 2 . - CDS 11596 - 12519 808 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 19 10 Op 3 44/0.000 - CDS 12537 - 13574 934 ## COG4608 ABC-type oligopeptide transport system, ATPase component 20 10 Op 4 44/0.000 - CDS 13561 - 14586 521 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 21 10 Op 5 49/0.000 - CDS 14620 - 15531 1010 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 22 10 Op 6 38/0.000 - CDS 15542 - 16501 976 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Term 16527 - 16566 5.1 23 10 Op 7 . - CDS 16581 - 18314 1662 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 18549 - 18608 7.3 + Prom 18415 - 18474 6.6 24 11 Op 1 . + CDS 18592 - 19305 642 ## COG2186 Transcriptional regulators 25 11 Op 2 . + CDS 19356 - 19544 57 ## gi|160935526|ref|ZP_02082901.1| hypothetical protein CLOBOL_00415 - Term 19534 - 19596 18.4 26 12 Tu 1 . - CDS 19658 - 26098 3327 ## Closa_3324 LPXTG-motif cell wall anchor domain protein - Prom 26137 - 26196 6.9 - TRNA 26298 - 26368 63.1 # Trp CCA 0 0 27 13 Op 1 . - CDS 26545 - 27933 1394 ## COG1066 Predicted ATP-dependent serine protease - Prom 27956 - 28015 3.8 - Term 27973 - 28004 3.4 28 13 Op 2 . 
- CDS 28022 - 28438 586 ## Closa_1335 hypothetical protein - Prom 28480 - 28539 10.0 29 14 Op 1 5/0.000 - CDS 28557 - 31187 2394 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 30 14 Op 2 7/0.000 - CDS 31220 - 32260 535 ## PROTEIN SUPPORTED gi|163764772|ref|ZP_02171826.1| ribosomal protein L5 31 14 Op 3 . - CDS 32247 - 32864 236 ## PROTEIN SUPPORTED gi|163764773|ref|ZP_02171827.1| ribosomal protein L24 - Term 32941 - 32966 -0.5 32 15 Tu 1 . - CDS 32986 - 33399 409 ## COG1725 Predicted transcriptional regulators - Prom 33469 - 33528 6.7 - Term 33587 - 33626 5.9 33 16 Op 1 . - CDS 33695 - 33922 183 ## CHY_2393 glycine reductase, selenoprotein B 34 16 Op 2 . - CDS 33950 - 34993 1053 ## CLJU_c27830 betaine reductase complex component B subunit gamma (EC:1.21.4.2) 35 16 Op 3 . - CDS 35007 - 36305 1291 ## FMG_1472 glycine reductase complex proprotein 36 16 Op 4 . - CDS 36339 - 37847 1258 ## COG0591 Na+/proline symporter 37 16 Op 5 . - CDS 37890 - 38048 188 ## gi|160935541|ref|ZP_02082916.1| hypothetical protein CLOBOL_00431 - Prom 38109 - 38168 11.7 - Term 38059 - 38122 12.4 38 17 Op 1 . - CDS 38185 - 39615 1287 ## COG2508 Regulator of polyketide synthase expression 39 17 Op 2 . - CDS 39654 - 39878 59 ## gi|160935543|ref|ZP_02082918.1| hypothetical protein CLOBOL_00433 - Prom 40019 - 40078 6.7 - Term 40038 - 40088 11.2 40 18 Op 1 . - CDS 40092 - 40820 507 ## CDR20291_2751 hypothetical protein 41 18 Op 2 . - CDS 40837 - 41538 745 ## CDR20291_2752 hypothetical protein - Prom 41579 - 41638 7.7 + Prom 41639 - 41698 6.6 42 19 Tu 1 . + CDS 41807 - 42574 730 ## COG1402 Uncharacterized protein, putative amidase + Term 42623 - 42660 8.2 43 20 Op 1 . - CDS 42731 - 43138 505 ## bpr_IV159 YolD-like protein 44 20 Op 2 . - CDS 43056 - 44408 1127 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair - Prom 44500 - 44559 5.2 - Term 44982 - 45030 -0.5 45 21 Op 1 . 
- CDS 45104 - 45589 440 ## gi|160935553|ref|ZP_02082928.1| hypothetical protein CLOBOL_00443 - Prom 45611 - 45670 5.9 - Term 45616 - 45651 -0.5 46 21 Op 2 . - CDS 45747 - 46367 224 ## gi|160935554|ref|ZP_02082929.1| hypothetical protein CLOBOL_00444 - Prom 46401 - 46460 9.5 + Prom 46680 - 46739 5.1 47 22 Tu 1 . + CDS 46766 - 47461 461 ## BDI_2733 hypothetical protein + Term 47674 - 47719 -0.6 - Term 47268 - 47310 -0.5 48 23 Op 1 . - CDS 47493 - 48710 793 ## COG1763 Molybdopterin-guanine dinucleotide biosynthesis protein 49 23 Op 2 . - CDS 48707 - 50584 1785 ## COG2199 FOG: GGDEF domain - Prom 50649 - 50708 8.9 - Term 50739 - 50777 5.9 50 24 Tu 1 . - CDS 50784 - 51671 1091 ## COG3588 Fructose-1,6-bisphosphate aldolase - Prom 51813 - 51872 7.6 - Term 51934 - 51977 10.0 51 25 Tu 1 . - CDS 51992 - 53473 1777 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) - Prom 53634 - 53693 6.5 + Prom 53558 - 53617 3.9 52 26 Op 1 . + CDS 53756 - 55129 1350 ## Closa_1352 hypothetical protein 53 26 Op 2 . + CDS 55170 - 57023 1795 ## COG0367 Asparagine synthase (glutamine-hydrolyzing) 54 27 Tu 1 . - CDS 57039 - 58424 1127 ## COG2199 FOG: GGDEF domain - Prom 58478 - 58537 8.4 + Prom 58463 - 58522 7.4 55 28 Tu 1 . + CDS 58556 - 58903 345 ## Ethha_1949 helix-turn-helix domain protein + Term 59082 - 59120 -0.7 - Term 58991 - 59058 5.2 56 29 Tu 1 . - CDS 59096 - 59449 475 ## COG0662 Mannose-6-phosphate isomerase - Prom 59480 - 59539 5.8 57 30 Tu 1 . - CDS 59590 - 60477 615 ## gi|160935569|ref|ZP_02082944.1| hypothetical protein CLOBOL_00459 - Prom 60551 - 60610 5.6 - Term 60623 - 60663 8.0 58 31 Tu 1 . - CDS 60703 - 61905 1356 ## COG0426 Uncharacterized flavoproteins - Prom 61957 - 62016 5.9 - Term 62118 - 62161 5.2 59 32 Op 1 2/0.050 - CDS 62200 - 63255 1127 ## COG2221 Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits 60 32 Op 2 6/0.000 - CDS 63274 - 64065 730 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 61 32 Op 3 . 
- CDS 64082 - 65116 1170 ## COG1145 Ferredoxin - Prom 65154 - 65213 10.9 - Term 65290 - 65338 -0.3 62 33 Op 1 . - CDS 65400 - 67142 1958 ## Closa_1330 cellulosome anchoring protein cohesin region 63 33 Op 2 . - CDS 67182 - 69449 1844 ## COG4193 Beta- N-acetylglucosaminidase - Prom 69515 - 69574 6.5 64 34 Tu 1 . - CDS 69690 - 70892 1157 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 70960 - 71019 4.2 65 35 Tu 1 . - CDS 71648 - 71848 59 ## gi|160935583|ref|ZP_02082958.1| hypothetical protein CLOBOL_00473 - Prom 71868 - 71927 4.4 + Prom 71745 - 71804 6.3 66 36 Op 1 . + CDS 71840 - 73084 1020 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 67 36 Op 2 . + CDS 73142 - 74041 297 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 68 37 Tu 1 . - CDS 74048 - 75643 1217 ## COG0144 tRNA and rRNA cytosine-C5-methylases - Prom 75664 - 75723 3.3 - Term 75718 - 75770 11.1 69 38 Tu 1 . - CDS 75793 - 76353 766 ## gi|160935587|ref|ZP_02082962.1| hypothetical protein CLOBOL_00477 - Prom 76410 - 76469 7.1 70 39 Op 1 . - CDS 76517 - 78541 1364 ## COG1376 Uncharacterized protein conserved in bacteria 71 39 Op 2 . - CDS 78548 - 78991 397 ## COG1683 Uncharacterized conserved protein 72 39 Op 3 . - CDS 79048 - 80409 1363 ## COG0534 Na+-driven multidrug efflux pump 73 39 Op 4 . - CDS 80462 - 81595 1168 ## COG0628 Predicted permease 74 40 Op 1 . - CDS 81803 - 82885 1219 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 75 40 Op 2 . - CDS 82945 - 84138 1479 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 84209 - 84268 7.8 - Term 84382 - 84427 1.4 76 41 Op 1 . - CDS 84563 - 85918 865 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 77 41 Op 2 . - CDS 85945 - 86934 926 ## Closa_2398 PHP domain protein 78 41 Op 3 . 
- CDS 86974 - 88278 1515 ## COG0477 Permeases of the major facilitator superfamily - Prom 88429 - 88488 6.3 + Prom 88580 - 88639 6.4 79 42 Op 1 . + CDS 88686 - 89465 937 ## COG1349 Transcriptional regulators of sugar metabolism 80 42 Op 2 9/0.000 + CDS 89506 - 90165 806 ## COG1760 L-serine deaminase 81 42 Op 3 . + CDS 90202 - 91137 939 ## COG1760 L-serine deaminase + Term 91191 - 91250 16.4 - Term 91179 - 91238 16.4 82 43 Op 1 . - CDS 91261 - 93672 2370 ## Closa_1704 hypothetical protein 83 43 Op 2 8/0.000 - CDS 93647 - 94546 900 ## COG1131 ABC-type multidrug transport system, ATPase component 84 43 Op 3 . - CDS 94539 - 94931 429 ## COG1725 Predicted transcriptional regulators - Prom 95059 - 95118 6.7 + Prom 95090 - 95149 6.4 85 44 Op 1 9/0.000 + CDS 95171 - 96910 1583 ## COG3275 Putative regulator of cell autolysis 86 44 Op 2 . + CDS 96891 - 97721 730 ## COG3279 Response regulator of the LytR/AlgR family + Prom 97792 - 97851 7.3 87 45 Tu 1 . + CDS 97881 - 99320 1712 ## COG1966 Carbon starvation protein, predicted membrane protein + Term 99362 - 99416 1.6 - Term 99349 - 99405 21.1 88 46 Op 1 . - CDS 99441 - 101327 1869 ## Closa_1353 ATP-dependent OLD family endonuclease - Prom 101445 - 101504 4.7 - Term 101454 - 101512 14.1 89 46 Op 2 . - CDS 101529 - 102644 1023 ## Closa_1256 hypothetical protein - Prom 102714 - 102773 3.7 - Term 102774 - 102809 7.4 90 47 Tu 1 . - CDS 102859 - 104466 1501 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) - Prom 104497 - 104556 5.3 - Term 104590 - 104639 16.0 91 48 Op 1 . - CDS 104704 - 105168 577 ## COG2731 Beta-galactosidase, beta subunit 92 48 Op 2 . - CDS 105254 - 106087 822 ## COG3773 Cell wall hydrolyses involved in spore germination + Prom 106581 - 106640 5.6 93 49 Tu 1 . 
+ CDS 106681 - 108252 1714 ## COG0591 Na+/proline symporter + Term 108273 - 108322 13.4 - Term 108261 - 108310 5.1 94 50 Op 1 12/0.000 - CDS 108342 - 109130 937 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 95 50 Op 2 3/0.050 - CDS 109147 - 109722 716 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA 96 50 Op 3 13/0.000 - CDS 109735 - 110526 896 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 97 50 Op 4 12/0.000 - CDS 110542 - 111153 862 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 98 50 Op 5 12/0.000 - CDS 111156 - 112091 1208 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 99 50 Op 6 . - CDS 112107 - 113432 1424 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC - Prom 113567 - 113626 8.8 100 51 Op 1 35/0.000 - CDS 113629 - 115317 191 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 101 51 Op 2 4/0.000 - CDS 115509 - 117248 181 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 102 51 Op 3 . - CDS 117249 - 117734 387 ## COG1846 Transcriptional regulators - Prom 117802 - 117861 11.5 + Prom 117884 - 117943 7.5 103 52 Tu 1 . + CDS 118018 - 118422 251 ## Hac_0481 alpha (1,3)-fucosyltransferase fragment 3 (EC:2.4.1.214) - Term 118436 - 118471 6.4 104 53 Op 1 . - CDS 118563 - 119186 678 ## COG0655 Multimeric flavodoxin WrbA 105 53 Op 2 . - CDS 119190 - 122342 2969 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 106 53 Op 3 . - CDS 122426 - 124213 2124 ## COG0018 Arginyl-tRNA synthetase - Prom 124255 - 124314 1.6 - Term 124260 - 124295 4.0 107 54 Tu 1 . - CDS 124336 - 124893 542 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) - Prom 124955 - 125014 6.3 108 55 Op 1 . - CDS 125026 - 126435 1363 ## COG0169 Shikimate 5-dehydrogenase 109 55 Op 2 . - CDS 126445 - 126963 586 ## COG2179 Predicted hydrolase of the HAD superfamily 110 55 Op 3 . 
- CDS 127000 - 127452 511 ## COG1327 Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains - Prom 127497 - 127556 6.7 + Prom 127536 - 127595 6.0 111 56 Tu 1 . + CDS 127635 - 128393 612 ## COG0710 3-dehydroquinate dehydratase - Term 128556 - 128592 7.1 112 57 Op 1 . - CDS 128627 - 128893 255 ## COG1873 Uncharacterized conserved protein - Prom 128913 - 128972 1.6 - Term 128954 - 128991 3.1 113 57 Op 2 . - CDS 129041 - 129841 449 ## Closa_2337 hypothetical protein - Prom 129903 - 129962 8.9 - Term 129921 - 129971 8.5 114 58 Op 1 22/0.000 - CDS 130038 - 130961 1220 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen - Prom 130992 - 131051 3.7 115 58 Op 2 32/0.000 - CDS 131216 - 131881 793 ## COG2011 ABC-type metal ion transport system, permease component 116 58 Op 3 . - CDS 131883 - 132899 1268 ## COG1135 ABC-type metal ion transport system, ATPase component 117 59 Op 1 . - CDS 133324 - 134469 601 ## gi|160935647|ref|ZP_02083022.1| hypothetical protein CLOBOL_00537 118 59 Op 2 . - CDS 134526 - 135929 1412 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 119 60 Tu 1 . - CDS 136071 - 136517 211 ## PROTEIN SUPPORTED gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 120 61 Tu 1 . - CDS 136777 - 138381 1945 ## COG4108 Peptide chain release factor RF-3 - Prom 138450 - 138509 6.3 + Prom 138555 - 138614 5.5 121 62 Tu 1 . 
+ CDS 138680 - 139909 1375 ## Fjoh_1885 hypothetical protein - Term 139950 - 139989 -0.9 122 63 Op 1 16/0.000 - CDS 140039 - 140956 1167 ## COG1879 ABC-type sugar transport system, periplasmic component 123 63 Op 2 21/0.000 - CDS 141025 - 141966 1254 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 124 63 Op 3 9/0.000 - CDS 141963 - 143456 1915 ## COG1129 ABC-type sugar transport system, ATPase component 125 63 Op 4 6/0.000 - CDS 143484 - 143900 437 ## COG1869 ABC-type ribose transport system, auxiliary component 126 63 Op 5 13/0.000 - CDS 143950 - 144843 1004 ## COG0524 Sugar kinases, ribokinase family - Prom 144864 - 144923 5.2 127 63 Op 6 . - CDS 144937 - 145914 956 ## COG1609 Transcriptional regulators - Prom 146003 - 146062 8.8 - Term 146186 - 146234 7.6 128 64 Tu 1 . - CDS 146311 - 147525 1295 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase - Prom 147570 - 147629 9.6 + Prom 147515 - 147574 7.6 129 65 Tu 1 . + CDS 147618 - 148391 910 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor + Term 148437 - 148483 12.6 - Term 148234 - 148265 -1.0 130 66 Tu 1 . - CDS 148472 - 149863 839 ## COG3507 Beta-xylosidase 131 67 Op 1 . - CDS 149968 - 150120 187 ## Bacsa_1659 xylan 1,4-beta-xylosidase (EC:3.2.1.37) 132 67 Op 2 . - CDS 150125 - 151159 624 ## Pjdr2_2530 hypothetical protein 133 67 Op 3 38/0.000 - CDS 151206 - 152045 896 ## COG0395 ABC-type sugar transport system, permease component 134 67 Op 4 35/0.000 - CDS 152055 - 152954 1101 ## COG1175 ABC-type sugar transport systems, permease components 135 67 Op 5 . - CDS 153055 - 154422 1586 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 154535 - 154594 3.9 + Prom 154489 - 154548 5.3 136 68 Op 1 7/0.000 + CDS 154659 - 155411 788 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 137 68 Op 2 . 
+ CDS 155462 - 157294 1756 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 138 68 Op 3 . + CDS 157325 - 157471 133 ## gi|160935674|ref|ZP_02083049.1| hypothetical protein CLOBOL_00564 139 69 Tu 1 . - CDS 157618 - 161691 3074 ## Closa_3324 LPXTG-motif cell wall anchor domain protein - Prom 161759 - 161818 8.9 + Prom 161907 - 161966 14.2 140 70 Op 1 . + CDS 162140 - 162916 850 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 141 70 Op 2 . + CDS 163001 - 163927 1045 ## COG1432 Uncharacterized conserved protein + Term 163941 - 163998 1.6 - Term 163807 - 163850 1.0 142 71 Tu 1 . - CDS 164019 - 166313 1495 ## gi|160935679|ref|ZP_02083054.1| hypothetical protein CLOBOL_00569 - Prom 166393 - 166452 4.6 - Term 166428 - 166475 5.8 143 72 Op 1 . - CDS 166476 - 167216 951 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 144 72 Op 2 . - CDS 167256 - 168092 791 ## Closa_2351 peptidase U4 sporulation factor SpoIIGA - Prom 168126 - 168185 5.2 - Term 168227 - 168289 28.2 145 73 Op 1 . - CDS 168366 - 169328 973 ## COG1052 Lactate dehydrogenase and related dehydrogenases 146 73 Op 2 . - CDS 169384 - 170286 942 ## Spirs_2575 xylose isomerase 147 73 Op 3 12/0.000 - CDS 170315 - 171307 1024 ## COG3958 Transketolase, C-terminal subunit 148 73 Op 4 . - CDS 171307 - 172134 810 ## COG3959 Transketolase, N-terminal subunit 149 73 Op 5 1/0.250 - CDS 172131 - 173033 1004 ## COG1082 Sugar phosphate isomerases/epimerases 150 73 Op 6 3/0.050 - CDS 173030 - 174553 1332 ## COG1070 Sugar (pentulose and hexulose) kinases 151 73 Op 7 . - CDS 174553 - 175218 599 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 152 73 Op 8 11/0.000 - CDS 175229 - 176518 719 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 153 73 Op 9 11/0.000 - CDS 176518 - 177027 596 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 154 73 Op 10 . 
- CDS 177104 - 178216 351 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 155 73 Op 11 . - CDS 178255 - 178443 228 ## gi|160935692|ref|ZP_02083067.1| hypothetical protein CLOBOL_00582 - Prom 178477 - 178536 10.1 + Prom 178476 - 178535 5.7 156 74 Tu 1 . + CDS 178606 - 178953 432 ## ELI_2972 hypothetical protein + Term 178977 - 179015 6.5 + Prom 178961 - 179020 6.5 157 75 Tu 1 . + CDS 179101 - 179439 212 ## COG1396 Predicted transcriptional regulators - Term 179519 - 179553 7.0 158 76 Tu 1 . - CDS 179578 - 180891 1453 ## COG0206 Cell division GTPase - Prom 180982 - 181041 5.9 - Term 181230 - 181285 18.1 159 77 Op 1 . - CDS 181343 - 182299 858 ## COG1052 Lactate dehydrogenase and related dehydrogenases 160 77 Op 2 . - CDS 182315 - 183157 859 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes 161 77 Op 3 2/0.050 - CDS 183196 - 184548 1028 ## COG2610 H+/gluconate symporter and related permeases - Prom 184623 - 184682 6.0 162 78 Op 1 1/0.250 - CDS 184687 - 186213 1080 ## COG1070 Sugar (pentulose and hexulose) kinases 163 78 Op 2 . - CDS 186307 - 187044 742 ## COG1737 Transcriptional regulators - Prom 187193 - 187252 10.0 - Term 187288 - 187324 3.5 164 79 Op 1 . - CDS 187366 - 188097 798 ## Closa_2458 hypothetical protein - Term 188129 - 188159 1.4 165 79 Op 2 . - CDS 188172 - 189446 1036 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase 166 79 Op 3 25/0.000 - CDS 189540 - 190769 1411 ## COG0772 Bacterial cell division membrane protein 167 79 Op 4 28/0.000 - CDS 190783 - 192141 1601 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 168 79 Op 5 4/0.000 - CDS 192305 - 193273 1117 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 169 79 Op 6 3/0.050 - CDS 193344 - 195134 1956 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 170 79 Op 7 . 
- CDS 195206 - 197629 2564 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2
171 79 Op 8 . - CDS 197607 - 198137 495 ## Closa_2465 hypothetical protein
172 80 Op 1 29/0.000 - CDS 198252 - 199199 961 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis
173 80 Op 2 . - CDS 199199 - 199624 447 ## COG2001 Uncharacterized protein conserved in bacteria
- Prom 199809 - 199868 7.4
- Term 199809 - 199852 9.8
174 81 Op 1 . - CDS 200011 - 200901 969 ## COG0682 Prolipoprotein diacylglyceryltransferase
- Prom 200928 - 200987 7.8
175 81 Op 2 . - CDS 200989 - 202326 1626 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases
- Prom 202418 - 202477 4.9
- Term 202654 - 202683 -0.2
176 82 Tu 1 . - CDS 202922 - 203041 61 ## gi|160935720|ref|ZP_02083095.1| hypothetical protein CLOBOL_00610
- Prom 203157 - 203216 6.0
+ Prom 203117 - 203176 5.4
177 83 Tu 1 . + CDS 203257 - 204354 1457 ## COG0012 Predicted GTPase, probable translation factor
+ Term 204444 - 204509 12.1
+ Prom 204475 - 204534 7.1
178 84 Tu 1 . + CDS 204746 - 205330 543 ## Trebr_1792 helix-turn-helix domain protein
+ Prom 205482 - 205541 6.5
179 85 Op 1 11/0.000 + CDS 205579 - 206256 539 ## COG1309 Transcriptional regulator
180 85 Op 2 . + CDS 206256 - 207503 1190 ## COG0477 Permeases of the major facilitator superfamily
+ Term 207507 - 207545 0.3
181 86 Tu 1 . - CDS 207568 - 208356 511 ## COG0500 SAM-dependent methyltransferases
182 87 Tu 1 . - CDS 208457 - 208636 93 ## gi|160935728|ref|ZP_02083103.1| hypothetical protein CLOBOL_00618
- Prom 208707 - 208766 5.1
+ Prom 208516 - 208575 4.6
183 88 Op 1 5/0.000 + CDS 208772 - 209269 260 ## COG3547 Transposase and inactivated derivatives
184 88 Op 2 . + CDS 209340 - 209936 34 ## COG3547 Transposase and inactivated derivatives
+ Term 209982 - 210016 1.1
- Term 210587 - 210651 21.7
185 89 Op 1 . - CDS 210670 - 210813 186 ##
186 89 Op 2 .
- CDS 210877 - 212025 1478 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit
187 89 Op 3 . - CDS 212059 - 213396 1704 ## COG1253 Hemolysins and related proteins containing CBS domains
- Prom 213440 - 213499 6.4
188 90 Tu 1 . - CDS 213518 - 214354 1049 ## COG1284 Uncharacterized conserved protein
189 91 Op 1 35/0.000 - CDS 214469 - 216238 179 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9
190 91 Op 2 . - CDS 216228 - 218048 1868 ## COG1132 ABC-type multidrug transport system, ATPase and permease components
- Prom 218096 - 218155 9.7
+ Prom 218081 - 218140 8.8
191 92 Tu 1 . + CDS 218184 - 219056 893 ## COG0583 Transcriptional regulator
- Term 219082 - 219119 4.0
192 93 Op 1 . - CDS 219129 - 219968 944 ## COG1295 Predicted membrane protein
193 93 Op 2 . - CDS 219985 - 220257 211 ## COG1254 Acylphosphatases
194 93 Op 3 . - CDS 220265 - 221209 1325 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases
- Prom 221269 - 221328 6.0
- Term 221346 - 221406 12.4
195 94 Op 1 . - CDS 221434 - 221901 583 ## COG0251 Putative translation initiation inhibitor, yjgF family
196 94 Op 2 . - CDS 221919 - 223049 1086 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases
197 94 Op 3 . - CDS 223075 - 224181 1033 ## COG3616 Predicted amino acid aldolase or racemase
198 94 Op 4 .
- CDS 224193 - 225434 1032 ## COG1228 Imidazolonepropionase and related amidohydrolases
199 94 Op 5 38/0.000 - CDS 225486 - 226322 865 ## COG0395 ABC-type sugar transport system, permease component
200 94 Op 6 35/0.000 - CDS 226312 - 227196 1053 ## COG1175 ABC-type sugar transport systems, permease components
- Term 227212 - 227243 1.7
201 94 Op 7 2/0.050 - CDS 227266 - 228687 1748 ## COG1653 ABC-type sugar transport system, periplasmic component
- Prom 228723 - 228782 7.4
202 95 Op 1 7/0.000 - CDS 228858 - 230438 1414 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
203 95 Op 2 . - CDS 230432 - 232309 1739 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
- Prom 232330 - 232389 2.1
- Term 232499 - 232527 -0.1
204 96 Op 1 7/0.000 - CDS 232541 - 234079 2015 ## COG0747 ABC-type dipeptide transport system, periplasmic component
205 96 Op 2 44/0.000 - CDS 234111 - 235088 802 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9
206 96 Op 3 44/0.000 - CDS 235088 - 236089 597 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9
207 96 Op 4 49/0.000 - CDS 236089 - 237003 1316 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components
208 96 Op 5 . - CDS 237000 - 237968 321 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21
209 96 Op 6 . - CDS 238003 - 239823 1801 ## Pat9b_4973 peptidase M28
- Prom 239867 - 239926 7.6
- Term 239986 - 240025 4.2
210 97 Tu 1 . - CDS 240067 - 240525 327 ## COG3708 Uncharacterized protein conserved in bacteria
- Prom 240574 - 240633 3.1
- Term 240582 - 240620 0.6
211 98 Op 1 . - CDS 240818 - 241186 376 ## COG3603 Uncharacterized conserved protein
212 98 Op 2 . - CDS 241273 - 241404 150 ## gi|160935763|ref|ZP_02083138.1| hypothetical protein CLOBOL_00653
213 98 Op 3 . - CDS 241460 - 242554 873 ## Closa_2898 acyltransferase 3
214 98 Op 4 .
- CDS 242589 - 243437 567 ## COG1237 Metal-dependent hydrolases of the beta-lactamase superfamily II
- Prom 243460 - 243519 3.0
215 99 Op 1 . - CDS 243571 - 244092 345 ## Amet_3236 hypothetical protein
216 99 Op 2 . - CDS 244116 - 244601 369 ## Amet_3237 hypothetical protein
- Prom 244641 - 244700 5.0
+ Prom 244604 - 244663 4.2
217 100 Tu 1 . + CDS 244700 - 245362 565 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases
- Term 245303 - 245351 7.3
218 101 Op 1 . - CDS 245412 - 245756 252 ## COG3323 Uncharacterized protein conserved in bacteria
219 101 Op 2 . - CDS 245819 - 248074 1470 ## COG3973 Superfamily I DNA and RNA helicases
220 101 Op 3 . - CDS 248067 - 248915 759 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Prom 248982 - 249041 3.9
221 102 Tu 1 . - CDS 249045 - 249785 254 ## gi|160935773|ref|ZP_02083148.1| hypothetical protein CLOBOL_00663
- Prom 249879 - 249938 5.4
- Term 249946 - 249987 9.5
222 103 Tu 1 . - CDS 250043 - 251416 764 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21
- Prom 251488 - 251547 3.9
+ Prom 251488 - 251547 3.5
223 104 Tu 1 . + CDS 251647 - 251808 104 ## gi|160935776|ref|ZP_02083151.1| hypothetical protein CLOBOL_00666
+ Term 251847 - 251886 -0.0
- Term 251764 - 251813 4.0
224 105 Tu 1 . - CDS 251979 - 252173 106 ## gi|160935778|ref|ZP_02083153.1| hypothetical protein CLOBOL_00668
- Prom 252247 - 252306 4.0
+ Prom 252291 - 252350 7.6
225 106 Tu 1 . + CDS 252524 - 252700 167 ## gi|160935779|ref|ZP_02083154.1| hypothetical protein CLOBOL_00669
+ Term 252705 - 252742 -0.5
- Term 252738 - 252794 12.1
226 107 Op 1 . - CDS 252846 - 253763 921 ## COG1897 Homoserine trans-succinylase
- Term 253797 - 253842 6.5
227 107 Op 2 . - CDS 253843 - 255798 2287 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II)
228 107 Op 3 .
- CDS 255812 - 256567 644 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases
229 107 Op 4 . - CDS 256602 - 257651 976 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase)
230 107 Op 5 . - CDS 257673 - 258749 1271 ## Closa_0811 hypothetical protein
- Prom 258807 - 258866 7.4
231 108 Tu 1 . + CDS 258682 - 258948 118 ## gi|160935786|ref|ZP_02083161.1| hypothetical protein CLOBOL_00676
- Term 258984 - 259027 4.3
232 109 Op 1 . - CDS 259154 - 259516 184 ## gi|160935789|ref|ZP_02083164.1| hypothetical protein CLOBOL_00679
233 109 Op 2 . - CDS 259527 - 259718 226 ## gi|160935790|ref|ZP_02083165.1| hypothetical protein CLOBOL_00680
- Prom 259758 - 259817 5.0
234 110 Op 1 . - CDS 259821 - 260999 551 ## COG0582 Integrase
235 110 Op 2 . - CDS 261027 - 261233 246 ## gi|160935792|ref|ZP_02083167.1| hypothetical protein CLOBOL_00682
236 110 Op 3 . - CDS 261252 - 263279 289 ## BACI_c54550 hypothetical protein
- Prom 263355 - 263414 8.5
- Term 263483 - 263544 10.0
237 111 Tu 1 . - CDS 263548 - 264882 317 ## gi|160935794|ref|ZP_02083169.1| hypothetical protein CLOBOL_00684
- Prom 264923 - 264982 5.6
- Term 264908 - 264951 1.0
238 112 Op 1 . - CDS 265115 - 265234 105 ## gi|160935796|ref|ZP_02083171.1| hypothetical protein CLOBOL_00686
239 112 Op 2 . - CDS 265227 - 266150 526 ## Closa_0805 hypothetical protein
240 112 Op 3 . - CDS 266179 - 266550 247 ## Sulku_2419 hypothetical protein
241 112 Op 4 . - CDS 266526 - 266618 70 ##
- Prom 266646 - 266705 2.0
242 113 Op 1 . - CDS 266740 - 267057 186 ## Shel_03820 plasmid stabilisation system protein
243 113 Op 2 . - CDS 267038 - 267307 260 ## HMPREF0421_21057 prevent-host-death family antitoxin
- Prom 267330 - 267389 6.2
244 114 Op 1 . - CDS 267420 - 267677 309 ## gi|160935801|ref|ZP_02083176.1| hypothetical protein CLOBOL_00691
245 114 Op 2 3/0.050 - CDS 267695 - 268495 594 ## COG3723 Recombinational DNA repair protein (RecE pathway)
246 114 Op 3 .
- CDS 268515 - 269444 889 ## COG5377 Phage-related protein, predicted endonuclease
247 114 Op 4 . - CDS 269517 - 270002 318 ## COG2003 DNA repair proteins
248 115 Tu 1 . - CDS 270142 - 271074 727 ## CKL_2932 hypothetical protein
- Prom 271135 - 271194 5.8
- Term 271216 - 271266 12.0
249 116 Op 1 . - CDS 271276 - 271845 193 ## gi|160935806|ref|ZP_02083181.1| hypothetical protein CLOBOL_00696
250 116 Op 2 . - CDS 271897 - 273069 548 ## COG0515 Serine/threonine protein kinase
- Prom 273090 - 273149 3.7
- Term 273091 - 273129 2.0
251 117 Op 1 . - CDS 273260 - 274045 550 ## Cbei_3805 hypothetical protein
252 117 Op 2 . - CDS 274064 - 275737 569 ## COG0464 ATPases of the AAA+ class
253 117 Op 3 . - CDS 275755 - 276954 372 ## COG3468 Type V secretory pathway, adhesin AidA
254 117 Op 4 . - CDS 276951 - 277157 123 ## gi|160935811|ref|ZP_02083186.1| hypothetical protein CLOBOL_00701
255 117 Op 5 . - CDS 277154 - 277537 126 ## COG0602 Organic radical activating enzymes
- Prom 277596 - 277655 4.0
- Term 277683 - 277719 -0.9
256 118 Tu 1 . - CDS 277831 - 278061 202 ## JDM1_2528 hypothetical protein
- Prom 278108 - 278167 4.5
257 119 Op 1 . - CDS 278183 - 278701 281 ## gi|160935814|ref|ZP_02083189.1| hypothetical protein CLOBOL_00704
258 119 Op 2 . - CDS 278701 - 278949 132 ## gi|160935815|ref|ZP_02083190.1| hypothetical protein CLOBOL_00705
- Prom 279160 - 279219 6.8
+ Prom 279376 - 279435 4.8
259 120 Tu 1 . + CDS 279455 - 279634 131 ## gi|160935816|ref|ZP_02083191.1| hypothetical protein CLOBOL_00706
+ Term 279678 - 279733 2.4
- TRNA 279820 - 279891 70.3 # Arg CCG 0 0
- Term 279732 - 279795 17.0
260 121 Op 1 12/0.000 - CDS 279939 - 280379 593 ## COG3610 Uncharacterized conserved protein
261 121 Op 2 . - CDS 280373 - 281170 787 ## COG2966 Uncharacterized conserved protein
- Prom 281195 - 281254 6.3
+ Prom 281365 - 281424 5.8
262 122 Tu 1 . + CDS 281628 - 282557 755 ## Closa_0805 hypothetical protein
263 123 Op 1 .
- CDS 282601 - 284700 2045 ## COG0370 Fe2+ transport system protein B
264 123 Op 2 . - CDS 284697 - 284933 355 ## Closa_0801 FeoA family protein
- Prom 284968 - 285027 7.3
- Term 285026 - 285072 7.0
265 124 Tu 1 . - CDS 285127 - 286311 1281 ## COG5263 FOG: Glucan-binding domain (YG repeat)
- Prom 286360 - 286419 12.5
266 125 Tu 1 . - CDS 286458 - 288440 1993 ## COG0556 Helicase subunit of the DNA excision repair complex
- Term 288913 - 288964 13.0
267 126 Tu 1 . - CDS 289067 - 291853 1922 ## COG0642 Signal transduction histidine kinase
- Prom 291951 - 292010 8.9
- Term 292004 - 292048 10.3
268 127 Op 1 1/0.250 - CDS 292062 - 293435 1698 ## COG0793 Periplasmic protease
269 127 Op 2 2/0.050 - CDS 293514 - 294722 1351 ## COG0739 Membrane proteins related to metalloendopeptidases
270 127 Op 3 28/0.000 - CDS 294738 - 295646 972 ## COG2177 Cell division protein
271 127 Op 4 . - CDS 295716 - 296387 381 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc)
272 127 Op 5 4/0.000 - CDS 296402 - 297490 1497 ## COG2508 Regulator of polyketide synthase expression
- Prom 297537 - 297596 8.5
- Term 297546 - 297602 13.1
273 127 Op 6 . - CDS 297632 - 298744 1365 ## COG3839 ABC-type sugar transport systems, ATPase components
- Prom 298910 - 298969 6.6
- Term 298805 - 298841 -0.6
274 128 Tu 1 . - CDS 299010 - 299849 1063 ## COG1284 Uncharacterized conserved protein
275 129 Op 1 . - CDS 299988 - 300821 324 ## PROTEIN SUPPORTED gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains
276 129 Op 2 .
- CDS 300823 - 301632 639 ## COG3884 Acyl-ACP thioesterase
- Prom 301796 - 301855 8.9
- TRNA 301695 - 301769 65.2 # Gln CTG 0 0
- Term 301914 - 301958 8.2
277 130 Op 1 1/0.250 - CDS 301971 - 303476 1247 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases
278 130 Op 2 1/0.250 - CDS 303522 - 304982 1246 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases
- Term 305016 - 305060 3.0
279 131 Op 1 7/0.000 - CDS 305138 - 306688 1482 ## COG0747 ABC-type dipeptide transport system, periplasmic component
280 131 Op 2 44/0.000 - CDS 306739 - 307707 840 ## COG4608 ABC-type oligopeptide transport system, ATPase component
281 131 Op 3 44/0.000 - CDS 307700 - 308692 332 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17
282 131 Op 4 49/0.000 - CDS 308711 - 309568 1004 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components
283 131 Op 5 . - CDS 309592 - 310515 272 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21
- Prom 310682 - 310741 8.9
+ Prom 310667 - 310726 6.4
284 132 Tu 1 . + CDS 310763 - 312277 1173 ## COG2508 Regulator of polyketide synthase expression
- TRNA 312343 - 312414 64.6 # Gln CTG 0 0
- Term 312284 - 312333 7.2
285 133 Tu 1 . - CDS 312573 - 313247 857 ## COG2364 Predicted membrane protein
- Prom 313293 - 313352 7.2
286 134 Tu 1 . - CDS 313403 - 314434 971 ## COG0392 Predicted integral membrane protein
- TRNA 314621 - 314692 63.7 # Gln CTG 0 0
- Term 314566 - 314607 7.3
287 135 Op 1 . - CDS 314836 - 315402 605 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
288 135 Op 2 . - CDS 315410 - 316594 1267 ## COG0053 Predicted Co/Zn/Cd cation transporters
289 135 Op 3 .
- CDS 316591 - 317337 573 ## COG0101 Pseudouridylate synthase Predicted protein(s) >gi|157101655|gb|DS480669.1| GENE 1 34 - 327 340 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935496|ref|ZP_02082871.1| ## NR: gi|160935496|ref|ZP_02082871.1| hypothetical protein CLOBOL_00385 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00385 [Clostridium bolteae ATCC BAA-613] # 1 97 1 97 97 181 100.0 2e-44 MKAFDCVNKQEVEVTKEGLIDFMKKDRQIDMKFAEKRTDDMGYLTWDAENWTCVDGQNKF MRCYSLEGRVLRDSTSHNIYDMENDFFPEQAMEIQIN >gi|157101655|gb|DS480669.1| GENE 2 442 - 678 332 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935497|ref|ZP_02082872.1| ## NR: gi|160935497|ref|ZP_02082872.1| hypothetical protein CLOBOL_00386 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00386 [Clostridium bolteae ATCC BAA-613] # 1 78 1 78 78 131 100.0 2e-29 MERKVIKIFGAQDNIQAEMILETLGNNQIPAYKKGVGSSDIMNIYGGNSLYGEDIYVAEE DAQKAMEVLEGMGLTEER >gi|157101655|gb|DS480669.1| GENE 3 724 - 1317 401 197 aa, chain - ## HITS:1 COG:no KEGG:Mlab_0595 NR:ns ## KEGG: Mlab_0595 # Name: not_defined # Def: hypothetical protein # Organism: M.labreanum # Pathway: not_defined # 9 141 8 151 208 70 33.0 4e-11 MSISRNYAVRCPECGCIIPFELYDSITSSLDPGLRQQILDGQFETVVCPDCRAVSYIQYD ILYHDPARRFMVCVGTDYSDVFQQSDCPKDYKLRYVDDYKQLAEKIRIFEAGLNDRIMEI TKESMRHMAKRNLELYYAGSRRKLMYFIVPGHERYVAVSQGLYTLAGKYYDKIPVEKEFG FLRINRKYAEALKTVAV >gi|157101655|gb|DS480669.1| GENE 4 1527 - 1973 282 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935500|ref|ZP_02082875.1| ## NR: gi|160935500|ref|ZP_02082875.1| hypothetical protein CLOBOL_00389 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00389 [Clostridium bolteae ATCC BAA-613] # 1 148 6 153 153 266 99.0 4e-70 MSTNYYIFNRKKREEIQEFNRFWEEIFIPGIKQQIEDHCSKQNGAYVNTDFGNEIIGEKI SGISGAPGKSESYETVIGVSHWNGKRNLFQWEGSYVEEHIIRDESSLVEFFNSKMNQQQY SIVDEFDKEYTLEAFLNAIKYGGDESVS >gi|157101655|gb|DS480669.1| GENE 5 2253 - 2882 324 209 aa, chain + ## HITS:1 COG:no KEGG:Closa_2883 NR:ns ## KEGG: Closa_2883 # 
Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 198 13 210 219 182 48.0 1e-44 MLYGNIPVFTYHIAYPSFSTTCVLSAAQTANIYYMQLAENTEQYCRTVLYPQAVESARYI TSNHPPFNRYTLDMNYQITYNSGCITSLYMDTYTYMGGAHQELERISDTWDFSTGKQLHL DDISALTPAALNGLQTSVERQIAERLKESPGSYFEDYPYLLRNKFNQNHFFLRPGYIVIY YQQYEIAPYATGIPEFSFRMPAYQMTTRR >gi|157101655|gb|DS480669.1| GENE 6 2889 - 3074 105 61 aa, chain + ## HITS:1 COG:no KEGG:Closa_0840 NR:ns ## KEGG: Closa_0840 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 51 1 51 60 65 72.0 5e-10 MQSCELALSVSTLACCIAEGKSPEEIALISSIFMQLGDTLATIAAHQALCCPPENTDTQK S >gi|157101655|gb|DS480669.1| GENE 7 3167 - 3919 402 250 aa, chain - ## HITS:1 COG:CAC3446 KEGG:ns NR:ns ## COG: CAC3446 COG0384 # Protein_GI_number: 15896687 # Func_class: R General function prediction only # Function: Predicted epimerase, PhzC/PhzF homolog # Organism: Clostridium acetobutylicum # 1 250 52 301 302 249 50.0 5e-66 MFETDSPDYNIEVRFFTPTTEVPICGHATIAAHYVRAKERNMEGGTVVQKTKAGILPVEV VRTLSDYSITMTQGTPAVSAPFGDAVRERIADALGIAHEELCREYPVAVASTGHSKVMVP LYSNELLHGLRPDLQKLTKISKQIECNGYYVFTLNPQNKILVHGRMFAPAIGIAEDPVTG NANGPLGAYLVHFNILQEEDAPCFDFDILQGEAIKRDGTMHVHVEKEHGAPKLVQITGNA VIAFSTEITI >gi|157101655|gb|DS480669.1| GENE 8 4136 - 4381 206 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935504|ref|ZP_02082879.1| ## NR: gi|160935504|ref|ZP_02082879.1| hypothetical protein CLOBOL_00393 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00393 [Clostridium bolteae ATCC BAA-613] # 1 81 1 81 81 121 100.0 1e-26 MLERWKKQLFSEDGMKVVNVLFILLYLVRNPLFTICAYTVWFVYLVYSVRHTKSKGMKIF YSVLSMLAMATVLLNAYFTIL >gi|157101655|gb|DS480669.1| GENE 9 4406 - 4645 343 79 aa, chain - ## HITS:1 COG:L114363 KEGG:ns NR:ns ## COG: L114363 COG4443 # Protein_GI_number: 15672295 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 1 69 1 70 72 82 67.0 1e-16 
MADIKYDIIKEIGVLSENAKGWRKELNLISWNGGAAKYDIRDWAPDHEKMGKGTTLTEEE MENLKEILGLIETVTEAFV >gi|157101655|gb|DS480669.1| GENE 10 5083 - 5439 150 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935508|ref|ZP_02082883.1| ## NR: gi|160935508|ref|ZP_02082883.1| hypothetical protein CLOBOL_00397 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00397 [Clostridium bolteae ATCC BAA-613] # 1 118 2 119 119 239 100.0 4e-62 MDGHIWFVGWYDGGDYHEIKITQKEVDNCGTAGCGLNSILYYIGGKYGDEALYNMIYLQS IEPFGTNGIFRMCVTDPIKVHPEFQNGGIYLVDCNRTLICRNGDELYMELVKEKGDIS >gi|157101655|gb|DS480669.1| GENE 11 5451 - 5912 180 153 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935509|ref|ZP_02082884.1| ## NR: gi|160935509|ref|ZP_02082884.1| hypothetical protein CLOBOL_00398 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00398 [Clostridium bolteae ATCC BAA-613] # 1 153 1 153 153 298 100.0 6e-80 MSHTKEYRGEKIEAGLTARLIPPGAQYSDSFCRLILTERHLYILEDNYNGTYNELYAFFI ERIQDMEAQVKGLRYKKSVLGEMFSNGILSLFGGVLYASEKKKEDNGIRFVITYDDGMGK CNQLYFKDLQTNSNHMVKVYHTLKKSFDKTIMR >gi|157101655|gb|DS480669.1| GENE 12 5968 - 6540 288 190 aa, chain - ## HITS:1 COG:no KEGG:Ava_B0296 NR:ns ## KEGG: Ava_B0296 # Name: not_defined # Def: hypothetical protein # Organism: A.variabilis # Pathway: not_defined # 7 184 1 180 182 116 32.0 4e-25 MPEETYMTGQIADVYEYNGNRYNIVAIKGTMNFDIHELGFDPVPIHTACWRGYHCIYNIT EEGLFLKDLFVNDRNGSYPPVNGVEPEEDRFDGKVYYGINIPVKFDGGIIIGNDFLTRYY IHMGFQRVYAYKTVYELIFKEGLITKKTDLSKRAALIRTEIDEKGIDRANISKFVEESFS LDYEDKWTVY >gi|157101655|gb|DS480669.1| GENE 13 6729 - 8045 989 438 aa, chain - ## HITS:1 COG:yhfM KEGG:ns NR:ns ## COG: yhfM COG0531 # Protein_GI_number: 16131248 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 427 18 448 462 186 32.0 7e-47 MGESENGLKRQIGLFGAVAILVGAVIGSGIFMTPGTVAASAKSFGPFMVAWILAGASGIL CSLVYAELSPAMPKAGGPYVYITEAFGNGFGFVYGWSMTIGNYIPLVAMLATGFASNLAK LIPGITPVGIKMVASAVIIALMILNIRGTKLGSTIANIFTVGKLLALLLVIIGGFFIISP 
ENFTSVTTSSQVAEWNGVLSAAFPAFLAFGGYYQLAYMSADIKDPKKTLPKAMIIGMIIV IAVNILISVVCVGTVGFAQLAGSETPVIDAGTAIFGPVGTIIVAIGASVSIFGALNGGIM SYPRVSYSMSQNGLMFKSFGRLHNRYNTPYIPTLFICLTALIFVWTGSFGTLLGINVFAG RILECIVCLSLLVLRKKKPNMSRPLKMPGYPVTTILAIAVTFIICMTCSGIQMIKSIGLM ATSIPAYFIFRTLDKKKD >gi|157101655|gb|DS480669.1| GENE 14 8051 - 9238 767 395 aa, chain - ## HITS:1 COG:STM3833 KEGG:ns NR:ns ## COG: STM3833 COG4948 # Protein_GI_number: 16767118 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Salmonella typhimurium LT2 # 1 383 2 388 397 353 46.0 4e-97 MKIVSVDIFLLDGGSPGWRPIVCRVNTDEGVSGYGEASVGFDTGASASYSMIKEVAPFVI GMDPMATEAVWNKMYTQTFWAQGGGTIMFSAISAIDMACWDIKAKALNLPLYKLLGGKCR EKLRSYASQLQFGWGKGMVFDRGYKIEDLVEHSLKAVAEGFDAIKINFITYDGNGNRLGF LKGPIMPGTRALIEERVKAVREAVGDGVDIIVENHARTDAVSAVEMSQIIKPYGIMFMEE TCTPMNLQVLETVRNHSAVLQAGGERVYGKYHYANLIKKDIFQVYQPDLGTCGGITEAMK IASMADAVDAGIQIHVCASPIAIAASLHVEASLPNFVIHEHHVTNRSENNIRLGVYDYQP DGCGFCHVPELPGIGQELSRWAMDHALQKETVKGE >gi|157101655|gb|DS480669.1| GENE 15 9468 - 10352 643 294 aa, chain + ## HITS:1 COG:PA0491 KEGG:ns NR:ns ## COG: PA0491 COG0583 # Protein_GI_number: 15595688 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 4 196 6 197 308 61 25.0 2e-09 MNNISLQQIQYFKIAMKCSSFSQAAKLAYTTQSTISKNIATLEKIMGEPLFTRQKKGIMP TQRAILLDLELTEIYDKIDLLLNNTRQSKQERVSIGFCQSIDFSSSIPEFFSLFRRENYI PTAVIKLQCCENSDVVNGVLDGSIDLGFILSDTNISNPNIKLHTITSTEPQIFFSVNSPL NKSGLTINDFANYPIVTTKYLIEKNDYRMINLLPFTPKGIEIVKSYDDIPIYLATGLYVT LLRPYVNLANNKNIMSYKLPDSYHYKQGITMIWFSHNKNKYLKRILSILYKANT >gi|157101655|gb|DS480669.1| GENE 16 10393 - 11346 507 317 aa, chain + ## HITS:1 COG:CAC0420 KEGG:ns NR:ns ## COG: CAC0420 COG1266 # Protein_GI_number: 15893711 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Clostridium acetobutylicum # 86 301 52 267 276 125 
37.0 8e-29 MHSWNNSIFKTIFKNKNGFLRSGWTILIVMLLYYALLYFASFLVLTALIVILTATGDLNR AADYFSPLANWVNDACLPVAMLILTDVMMIIIPIIAWKCFIKRPLSEMGLHPFKSTKKEC GAGMILGMVNCTLVFLLVVWFGGGKVASWQPRITALTLTWLFAFILVAYAEELLNRGFIM AVLRRCRNHVFVLIFLPSVIFGAIHLGNPSVTLLSVFNIIIVGILFSYMFIKSGNIWMCI GYHFTWNVFQGIIYGMPVSGLSIPGIITTHFTRDNILNGGGFGIEGGILTTLVTLLSFLF VRYYYRNSTYDFMNNMD >gi|157101655|gb|DS480669.1| GENE 17 11452 - 11580 149 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935517|ref|ZP_02082892.1| ## NR: gi|160935517|ref|ZP_02082892.1| hypothetical protein CLOBOL_00406 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00406 [Clostridium bolteae ATCC BAA-613] # 1 42 1 42 42 67 100.0 4e-10 MSGSKPQYSVLERNRGVLAAVAGTAITRPGEITKRFVAAMKQ >gi|157101655|gb|DS480669.1| GENE 18 11596 - 12519 808 307 aa, chain - ## HITS:1 COG:STM3339 KEGG:ns NR:ns ## COG: STM3339 COG0329 # Protein_GI_number: 16766634 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Salmonella typhimurium LT2 # 9 287 5 286 297 177 37.0 2e-44 MSRFEMNQIKGVIPAMMTFFDREENVDTECTRRMVEFMLEHGADGFYLTGSTGECFTMTV EERNLVVDTVIDQVKGRVPVVVHVGDIGTKKSIELAEHAYRAGADAISSVPPFYWKFRAG DIYNYYRDISESTPLPMVVYNIQLAGLMDMDLLLQLAGLPNVHGLKYTARSHDEMGFIKE TLGPDFMIYSGCDEMAFSGMCSGADGIIGSFYNLFPDLYKQILKKVEESDIKGGMRLQRI ADEVIFAALKYDFPSVLHNLMNWRGLESGYSRRPFYNYQDSELEGLKEDIRRIKDRYQAE ELDMFRL >gi|157101655|gb|DS480669.1| GENE 19 12537 - 13574 934 345 aa, chain - ## HITS:1 COG:BH0027 KEGG:ns NR:ns ## COG: BH0027 COG4608 # Protein_GI_number: 15612590 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Bacillus halodurans # 8 339 5 332 338 360 53.0 2e-99 MPETKDVLIKVRNLKTYYPVKQGVFKHTVGHVKAVDDVSLDVYRGEILGVVGESGCGKTT LGKSILQLIKPTEGSIRYAFEGGEKELIGMNKKELDEARCKMQIVFQDPYSSLNPSFTIF GSMQEPLIRFGEKSRKARREKIAKLLEAVNLPADYMERYPNEFSGGQRQRIGIARALSVS 
PEFIVCDEAVSALDVSIQAQVLQLLKKLQEERGLTYMFITHDLSVVEYISDRIVVMYLGR MVEMADTQELFENTLHPYTRALLSAIPIADIDRRRKRIPLQGDVPSPVNPPSGCPFHPRC PECMERCKTEVPHPVIITRDGREHMVCCHRVQANAGGGNDFKTHI >gi|157101655|gb|DS480669.1| GENE 20 13561 - 14586 521 341 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 24 326 36 324 329 205 37 2e-51 MNDNIVLSVRDLEVNFYNNERCNRVLRGVSFQLQKGKIMCIVGESGCGKSVTANSIMGLL PDLSRIEKGEIHFFHNGRDIRIDQLKRKGKEMRQIRGDGISMIFQDPMTALNPVFTVGYQ INESLLYHDRAKTKSEAKQKAIRLLKDMGIPLPEKRVDEYPYQFSGGMCQRAMIAMAMSC QPKVLIADEPTTALDVTIQAQIFELMQDLKNNNDAAILLITHDMGVVAELADNVAVMYMG NIVESGDARAVLTRSAHPYTRALLKSIPVLGRGKKQELEPIKGSTPDPYDRPKGCQFAPR CGYATEQCMQEMPPETEIGPGHFCRCFHNLMGKEGQPDAGN >gi|157101655|gb|DS480669.1| GENE 21 14620 - 15531 1010 303 aa, chain - ## HITS:1 COG:BS_appC KEGG:ns NR:ns ## COG: BS_appC COG1173 # Protein_GI_number: 16078205 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 15 302 15 302 303 224 40.0 1e-58 MKHSLDRKMDRILERENSGRIKKGIRNRALRKLFANRLSVLGFIIFIVILVMCLGAPLFT QYSATKVDLRNILSPPTSSHIFGTDKIGRDIWARILYGGRISIGVGLGSALVATVLGVSL GTMAGYKGGWFDGIIMKVSEILMSFPQIILVLILVTITGQSLWNLIFIFSITGWPSMYRM ARSQMLSLREEEYVQALKAFGIGSMRIAFVHMLPNAIGPIFVNITLSTAMFILQEASLSF LGLGVPLEVATWGNILNAAQDLMILKNSWWVWLPAGIVVTLFVVGINFIGDGLRDATDPS QIG >gi|157101655|gb|DS480669.1| GENE 22 15542 - 16501 976 319 aa, chain - ## HITS:1 COG:BS_appB KEGG:ns NR:ns ## COG: BS_appB COG0601 # Protein_GI_number: 16078204 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 1 317 1 316 317 207 39.0 2e-53 MSNYIIRKILTLIPMMLVISFLIYLGMELMPGDAVDFLIPPDALSTMSPEQLNAMRDALG LNDPFFLRYLKWLAGLVRGDFGYSLQSGVPVLTLMKNHLPATIELTLTSLVMSSIFGILL 
GVVCALKKGTALDQILSVVGMVGVAIPQFLLGLICINAFALHTSILPVGGRMAYAGQNFI QRLPYLIMPATVLGFSLTSGVMRYGRSSMLDSMGRDYMKTARSKGLPEWRVNLLHGLRAA MTPVVVLIGFRLPMLIGGAVVVEEVFQWPGIGVLFVDAVRSQNTPLVMIIGFFSVLLVLV ASIVVDIITALLDPRIKLS >gi|157101655|gb|DS480669.1| GENE 23 16581 - 18314 1662 577 aa, chain - ## HITS:1 COG:SA0850 KEGG:ns NR:ns ## COG: SA0850 COG0747 # Protein_GI_number: 15926580 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Staphylococcus aureus N315 # 334 557 78 298 571 81 29.0 5e-15 MRKRRLASLLLAASLAASALAGCGSGGKTETTAAAAGGQTEAAGGTAGETGAAGGTSAAD LGTPLADVRVRQALAYAIDMNAIVDSLFEGKAEVAKSFTAPGDWLNAGIPVYEYNPEKAK ELLKEAGWPSDYTLDVVYYYDDQQTVDLMTIIGQYWQEVGVKAQFRKLEGDLAAQLWVPP ADMEKGPSAVKWDLAYAAVAALAESEFYNRFASTASNNSSVPKQEGLDEMIAASNATMDV GEQKEAFYKIQQFVAENELAMPLYHQVCFIYTSDKLDTAGSAFGNDQFSYEKNILDWKID RDDRTMYTNGGPQEFFWYPMVNPGYMINTELVFDKLINADSSLNPTDGMLAESYKVSEDD KSIEFVLRDGLKWHDDEPLTAEDVKFTLELMLRTPGTNAVASEVMKAIQGAQDFLDGKTD NLEGVVIDGSKITVNFDTVSANALAVFSQWPILPKHCLENASPETLQQDQFWQKPIGSGP FKVDEVVLNNYATLKRWDGYYKTGTGNIETIYMFASGENDSNLVKNAGAGKIDYAWSKST DDAKAIESMDGMKVSTANIRYTRCFYINQFPHEPNIK >gi|157101655|gb|DS480669.1| GENE 24 18592 - 19305 642 237 aa, chain + ## HITS:1 COG:BMEII0858 KEGG:ns NR:ns ## COG: BMEII0858 COG2186 # Protein_GI_number: 17989203 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Brucella melitensis # 15 185 14 175 242 107 35.0 2e-23 MAESTGMKNLKSRRLYLQVYDEIKSYIEKNHLLPGDKLPSEMKMCEMLGVSRNVLREAIK SLEITGAVHSTPGVGIVIQEFNTDFFLSNLIYSISDEDQLHKEIEELRRVLELGFAKDAF DRMTEKGIALLSEKVVTMEELFNQIKRSHSSVYGVKFAETDAAFHKILFQDVDNRLLKSI IEFFWACDRYYKVKTTHSNMELTVEKHQRICHALKAGSYSDFYDAMQFHFNVGYFKR >gi|157101655|gb|DS480669.1| GENE 25 19356 - 19544 57 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935526|ref|ZP_02082901.1| ## NR: gi|160935526|ref|ZP_02082901.1| hypothetical protein CLOBOL_00415 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00415 [Clostridium bolteae ATCC BAA-613] # 13 
62 1 50 50 80 100.0 3e-14 MSVKAVIPTSGGMENLYLHTLNTAGISGSKEKKAAAATTGKIHRIQMTPIICTASVLVLK RK >gi|157101655|gb|DS480669.1| GENE 26 19658 - 26098 3327 2146 aa, chain - ## HITS:1 COG:no KEGG:Closa_3324 NR:ns ## KEGG: Closa_3324 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 30 696 33 644 2050 137 27.0 5e-30 MNKNKLVKRTLAIVLSAAMVFTNIGHSPTVAYAASGNSVDFMVSGTDFVAAIEDMIHSGA EPLKQEELDFTNGKAEKYYQFFFDGEDPVYEFYPEFDGQDMEAEVRVFVRLPEAADDTYS LTGEEEIIFLYINNSEDTISCSTTILMSDGSEKRTKRVTVKDYETAFGDNRIEYKSNTSA TGQTEAEIPESDAAEIEKPGTDAPMVETPETSMEESGVTEPETGSGIPDQETEEKENEAV ETDEVIEDTEPEIVEKSEPETQSPDVSEEGKTQADSPVASRIRHAVPVVAAVDTGEENST IEPSVDKQQEDTETGGTHDDSEKHSNDKDNETVGSVPKESSAEAESETESLTEESGKQTE ASADETGEETAAVESSEPATDIQTPSEESRTETSEETQTETSEETQPVESEPVQVTETQE SQISTETESIPVSSIGNGDLVGMDGCSTAKAYVTTLKKLKVLDEVDGYRVTYEITPEDKA AIVDKCDRVKAGESVTFGVEEQEGFRVGQVEANGDVVESAYSEENIVWFTLEYVEEDLEI LITMVPDGTSMPFNKTISMDDGMDITISALPGVLPENTEVTAEIVTSQVEEAIRSDQSEE GREIIAAPAYDINLCLDGEKLDDEIWNRNGAVTVSFSGPLMDQLIQESQMAQVLWVKDEN NSTEVMEGSYKDITREGTTEDLSFETSHFTTFAAVFAVAPDISNVFNIDEHGLMDEDFPK LVVRDKDENTLDIEITEQEDGARRVTLKDGKPLPRNISTWFEISYDFKNSSDPFFETYSV QKGDHFTYQFPSNILYDKKQTVVKDINGGEAVIGTASINAEGYMDVEITAENVGQNYRGT AEAGGKLDLEKVLEEKNEITLVGDEVEYIMELVPAPVQENYSVKVVKGPKSGKWTDQDIR YDENRLPSALQYHVTVTADAQNSGPITNVKVSDTLGNTKNGTIVFDKTQEIKFESNNQDT VIDNVTFGGMSNGGKTIGIQLNTKDGMPASMMPGESVSLSYWVKLGAGAWSGQNTGNSSK EMSASLFLELKNTVNVTADKKVSGSDSTTFKRTIECVTKSGIVHYNYEIDGKTEPVVIEY HVFVNPRLINMTGWTIEDKLDKNQKYLGNVEVIAYTGKNGTSLGKVDDIEPIISKNPQTW KYVIENPGKYYYDFKYFTTPNEATESNLSNEIKITWPGGGSGGIGGSTGVGFLYNTYTMT KKNLSSNMKSSNNINSGGVYEPVNNTAGNVGWGDEFGTIRWQSVLTPYADGQTKGATIPA GTVYKDSLSVIKWGNKNKEDAARANMHTFKDDGSFKASFVLKDGNGRAISPDDGTYTLSF DEKLGNRGFSVVFSKDVPGPVVIEYESFVDLEMLENVGVFDKDGTKDNLRFKNVGEMTIN GTTWKSATSQPYYIEDYISKNWVKNDTKAGTITWNLNINKGYGNDAPQNLGRFTVDVIEY IPEGLTLDHIKFQSMNYTLKPEKGDYDIDGNKVTIHLNTVARNFSGYKMSKHINLNIITK VTDPSIKSFTNSAQMVIDGNMLHKVSATTGFDKSFLKKGMAYSRDTAPYAEYTILVNQPG 
ADISNGTLTVIDTLGVPGKMAYVPDSFKVTNAENGTPIEEANIRVGKDSFGNDSFEISNL PNKTPIKIFYKVMITGAINETVDTKNTASLLYSREGIITETIERSVTIVKAAFTGGGKFS IKLYKVDQNKQPLTGAAFTLSKVSEVGSPDLEYVGEFTPVIDSGNPQAGAAVLITGLEKG QLYYLEETTVPEGYQKAEGEYFIIPDRENGIGVPDGAIAIENAGTPHVVENIKNSGTLKL TKDVNGRESGSDFETDFYFTVYGGDRYYDRDGSYADLRTVKLHYDSRIADNSLILSLPIG EYVVKEVADAEGTPINGDNFYYNVQINGEDRNEAVIAIEKEQQAECLVKNLYEADGNLQF TAMKTMEGRPLEEGEFTFRIYEGDTLKAEAQNGLNGVILFPEINYKANDAGRHTYTIMEQ KGTLTGVDYDDTVYTVTVEVTDNHDRTVSAEIVNIQRSDGAEAEVIQFNNVYHENPETTP DPSPNPNPSPSPGGSTGGGGGTSNTGGGRYQPSDGGPGVTISITPEEVPMAQIPSQPDSL ITILDDDVPLAPLPKTGDMSVDHYMLTVISMLLTGIYLALTKRKKE >gi|157101655|gb|DS480669.1| GENE 27 26545 - 27933 1394 462 aa, chain - ## HITS:1 COG:BS_sms KEGG:ns NR:ns ## COG: BS_sms COG1066 # Protein_GI_number: 16077155 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Bacillus subtilis # 1 459 1 452 458 507 56.0 1e-143 MAKARTTAFFCKECGYESSKWMGQCPACKAWNSMVEEPVSKPSGSQGGLGRLASGSQKSP VPAARPSLLSEIDIEEQDRISTGFGELDRVLGDGIVAGSLVLVGGDPGIGKSTLLLQVCR NLAGAGQRVLYISGEESLKQIKMRANRIGPVTGELKFLCETSLEHIERAIDTENPQIAVI DSIQTMYREEISSAPGSVSQVRESTGILMQIAKSRGIAIFVVGHVTKEGVVAGPRVLEHM VDTVLYFEGDRNASYRILRSVKNRFGSTNEIGVFEMQEQGLAEVENPSEYMLDGRPEEAS GAVVSCSFEGTRPILLEVQALVTETNFGMPRRTAAGTDYNRVNLLMAVLEKRCRYEMSRL DAYVNIAGGMKMNEPALDLAIIMALMSSYKDRPVDPKMLIFGEVGLSGEVRAVSQASQRV NEAAKLGFTACVLPSVCADKMKPVEGIRLIGVSNVREAIGIM >gi|157101655|gb|DS480669.1| GENE 28 28022 - 28438 586 138 aa, chain - ## HITS:1 COG:no KEGG:Closa_1335 NR:ns ## KEGG: Closa_1335 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 137 1 137 137 177 70.0 2e-43 MPVIEELICIEQDGTISFGNYKLGQKAKKSDFEYQGDMYKVKTYNEITKLERNDMFVYES VPGTAAEHFKVTDEDVEFAVEGSRDAQITIQLENDTDYEVYVDGAAVGSMKTNMSGKLSV SVELEEGVSVQVKAVKRA >gi|157101655|gb|DS480669.1| GENE 29 28557 - 31187 2394 876 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus 
selenitireducens MLS10] # 7 814 1 806 815 926 57 0.0 MQGGHDIMMERYTPQAQEALGLAVGVAETLNHGYVGTEHLLIGLLQEGTGVAAKVLEENG VEEDRVIELVSQLIAPNPTVQTADRTAYTPRARRVIENSYREAVRFKAAQIGTEHILIAM LREGDCVASRLLNTIGVNIQKLYIDLLAAMGEDAPAAKDDLQGARAGKRGNATPTLDSYS RNLTQLATAGKLDPVIGREQEIQRVIQILSRRTKNNPCLIGEPGVGKTAVVEGLAQMIAS GNVPETIADKRVVTLDLSGMVAGSKYRGEFEERIKKVISEVVESGDVLLFIDEIHTIIGA GGAEGALDASNILKPSLARGEIQLIGATTINEYRKYIEKDSALERRFQPVTVDEPTEDES VAILKGLRSRYEEHHKVEITDHALEAAVKLSSRYINDRFLPDKAIDLIDEAASKVRLQNY TKPAKIKDYEAEIDGLEEAKEEAIKREAYEKAGEIKKKQEKIREKIAQTMEKWQKDKESK KLIVSDNEIADVVSGWTRIPVRKLAEEESERLRNLEGILHQRVVGQEEAVTAISKAIRRG RVGLKDPKRPIGSFLFLGPTGVGKTELSKALSEAMFGTENALIRVDMSEYMEKHSVSKMI GSPPGYVGYDEGGQLSEKVRRNPYCVILFDEIEKAHPDVFNILLQVLDDGHITDAQGRKI DFKNTIIIMTSNAGAENIISPKRLGFGMVSDAKADYNFMKDRVMDEVKRLFKPEFLNRID EIIVFHQLTREHIKGIADIMLGTIGKRCKEQLGIGLEVTDSAREHLIDKGYDDKYGARPL RRTIQNLVEDRMAEEMLDGRIKAGSLVEVGFDGEKLTFSVKAKTARPKAAAKPANKPAAK PASAGEGSEDKSKAAGSKSRAGRASGKKKETPAPKA >gi|157101655|gb|DS480669.1| GENE 30 31220 - 32260 535 346 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764772|ref|ZP_02171826.1| ribosomal protein L5 [Bacillus selenitireducens MLS10] # 1 342 9 350 365 210 33 5e-53 ESMSKWFEEIDSKNSNVIYSRVRLARNWDEYVFPSRMTAAQCGEMVERLREGLKGLKDQD GRDLSYSQLDQMQELEKMALRERRILNMACVSRKEPSGLLLSADESGSIVLGGDDHIRMQ ILSPGLKLDELWHKADVLDDYVNERFSYAFDEKYGYLTSFPTNVGTGLRACAVLHLPTLS QVRKFQSIVADMSRFGTAIRGLYGEGSDNYGSLYEVSNQRTLGQSEKEIVELVTKAAAQL NNQEMRVRSATLNTQRLEREDEAYKSYGVLKYARKITEKDARIFISQLMAGEEDGLMKFN EECSLYSLIIGIKPANLNLWAKRPLDKDELDAVRAAYIRQNLPEII >gi|157101655|gb|DS480669.1| GENE 31 32247 - 32864 236 205 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764773|ref|ZP_02171827.1| ribosomal protein L24 [Bacillus selenitireducens MLS10] # 1 200 1 172 179 95 30 2e-18 MLCERCKIREATIKYTEIINGVKTEHNLCSHCAKEMDFGQYSALLDGEFPLGKLLSGLLG LEDDEEETDVRGNVVCPTCGTSFEDFVENSRFGCADCYGVFDLFIKDKMKQLQGSESHKG KHPKYRSTFEKEHPEASGGSGEETKENTDCLAEAGSQDSQTAVVRKQIRELDSKLKEAVR QEDYELAASYRDQIKELKKEAGIHE >gi|157101655|gb|DS480669.1| GENE 32 32986 - 33399 409 137 
aa, chain - ## HITS:1 COG:CAC0599 KEGG:ns NR:ns ## COG: CAC0599 COG1725 # Protein_GI_number: 15893888 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 121 1 123 125 109 43.0 1e-24 MFLKIDFNSDEAIYIQLRNQIIMGIATDMIREGDTLPSVRQMADYIGINMHTVNKAYSVL RQEGFVKLDRRKGAVVSLDVDKIQAIDDMRGDLDVVLAWGICKNISRDEIHHLVDEIYDK YTRGCMGTNDGQGTGEG >gi|157101655|gb|DS480669.1| GENE 33 33695 - 33922 183 75 aa, chain - ## HITS:1 COG:no KEGG:CHY_2393 NR:ns ## KEGG: CHY_2393 # Name: grdB # Def: glycine reductase, selenoprotein B # Organism: C.hydrogenoformans # Pathway: not_defined # 1 75 360 434 435 65 36.0 6e-10 MVNEMEKSGIPTALITNMTAVARSIGSPRVIPGIAINNPCSDIDLPMEEQLKMRRNYIDR ALKAVSAEVSGQEFF >gi|157101655|gb|DS480669.1| GENE 34 33950 - 34993 1053 347 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c27830 NR:ns ## KEGG: CLJU_c27830 # Name: not_defined # Def: betaine reductase complex component B subunit gamma (EC:1.21.4.2) # Organism: C.ljungdahlii # Pathway: not_defined # 1 347 1 348 437 361 51.0 3e-98 MKRIVMYINQFFGGIGGEDKAGYEPSVEEGPVGPGNVILSCLKDAEITHTIICGDNFMTG HRDEAIERMDQFLEGIEFDLFLAGPAFQSGRYGMSCGEICKYITEKYHVTAITSMNEENP GVAAYAKTPDVYIMRGSKSAVRMRQDASAMAGLAAKVLSGEEILWAEAEGYFPHGVRVSV KSEEAPADRAVRMMLSKLQGQPFKTEFPIEQEDTVVPAAPVDAGRAKIAVITTGSLVPVG NPDRIPSGSASVWKRYDIRGLEAFKKNEFYSVHGGFSTNNVNEDPEVLVPLTALKEAERE GKIGKLDDYYYVTTGNLTILKEARKMGREIVEQLKMDGIQAAILVAT >gi|157101655|gb|DS480669.1| GENE 35 35007 - 36305 1291 432 aa, chain - ## HITS:1 COG:no KEGG:FMG_1472 NR:ns ## KEGG: FMG_1472 # Name: not_defined # Def: glycine reductase complex proprotein # Organism: F.magna # Pathway: not_defined # 3 432 2 441 441 340 43.0 8e-92 MKKLEIGNFLVRDIVFGEKTAYEGGILTVNREEATKAINPLGKLKNIELHIVHPGDSVRI CPVKGAVEPRFRPDGRALFPGYTGPVTSCGEGTLHAMKGIAVLVCGRYSSMGDGILDMSG PGADYSYHSTTTNLVIFAERTNEKELDYTLREEDEFRISAHYMAEYLGKTLAGREPDRWE CYDLDEGSQQAQEKKLPRVGYVMTVISQLAKGINDLFLGRDCNNMMPLLVHPQEILDGFL LGGMGLAGQALTTYDAQNHPMIKRLCQEHGKTIDFAGVILSPADVSDSMKLRNAISATQI 
AECLKLDGAFVQEWGGGSNVDVDTFYLLSHLEDKGIKTVGIMDEHIGKVYTDPKADAIVS VGETSTIIELPPMDKVIGDDESLVRDYYYGAWSTHTTLGPSLRPDHSVIINTYALACGGN PAAQLKRAVKEY >gi|157101655|gb|DS480669.1| GENE 36 36339 - 37847 1258 502 aa, chain - ## HITS:1 COG:BS_opuE KEGG:ns NR:ns ## COG: BS_opuE COG0591 # Protein_GI_number: 16077734 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Bacillus subtilis # 8 474 6 472 492 164 26.0 4e-40 MKIGWLQIIMAVIYMGILIGVGLYMSRKVKSSDDFWIGGRQVGPVATALSYGAAYVSTVA IIGDPPMYYNYGLGYAAFEIMISVFFCCILIFIVFAPKMRALSERLNVVSLSGFLAIRYK SNQFRLLCGTLVSVMMVPYAISAMKGIADAMHAIVGIPYAMGVVVIAVVSFCYLVTAGYW GVSTTDIIQGITIAVSCLLVAVIVITKSGGVTAAVTYMGSVSPSHIQPSSGLTFAQLFSW AGVWALIAFGQPQLVTKFMGLRDSRTVGTVIRVAVVWEIIFMVSIAMIGAAAFKLFRAVP FDNVDAIVPTLVAEHTNAVVSGIFLCGILAAGLSTTVALVLTSSGAVTKDIYEDYYCARS NAKQDSGKSVRISRIATGIILVVVVIGSIYPPDFIWSLSTMSAGVMGAAFTAPLVLGLYW KRATRQGCMAGIIGGSAVSIIWYFAGLTSVVHSFVPGTAVSFLLMVLVSLCTPRMEEEHL DVFFEPGCSKNKIQAAIAAGSK >gi|157101655|gb|DS480669.1| GENE 37 37890 - 38048 188 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935541|ref|ZP_02082916.1| ## NR: gi|160935541|ref|ZP_02082916.1| hypothetical protein CLOBOL_00431 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00431 [Clostridium bolteae ATCC BAA-613] # 1 52 7 58 58 99 100.0 7e-20 MGPLGLPWGTFLMFASAVIGCPLITELILRSKWWKKKSDYYWSFNQTDDEEE >gi|157101655|gb|DS480669.1| GENE 38 38185 - 39615 1287 476 aa, chain - ## HITS:1 COG:CAP0121 KEGG:ns NR:ns ## COG: CAP0121 COG2508 # Protein_GI_number: 15004824 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 315 464 391 538 543 84 28.0 6e-16 MDIRILLEVLDTRQNQYKYTGPMEGQVYGVQVQNGSYSYLEGMLYVQPSLTHPRPPKNTC VIAGETDLYEVLQKILLNYNRKESLLIKMQDKLTAFRSSETDMISGVMDNPCILLREDYK PLVWNGFDGEDEFLWIASIDFKAVSKKGQHLNICYQECSDEIPYPLMIVHYTNPRGKKRY 
CVLAERNTRFDRVTDPIFLKKICNILQCYTFTNNGIVQPISRLDELLEGMIRRIPAQPEL MLNELLKAGWVEKEKYYVLLIDVSLGKTKVNDVGELSKRLNARVFSHENYYVCLLTGTCA EEYDSRTQLGIEDFLVERNFYAGLSYGFFNIVNISIGYKQSVEAIRVLFNTLNGVHYYAF ADSIVTYLVKTSVNCGEFTMESLCHPSVYKIFEYDKKYGSDYMNFLRIYIYSGGSVKRTA EALFMHKNTVYQKIEKLKEFFHVDVNDLYAYVKLYVSLVILEQMPIYDSNDFLKWM >gi|157101655|gb|DS480669.1| GENE 39 39654 - 39878 59 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935543|ref|ZP_02082918.1| ## NR: gi|160935543|ref|ZP_02082918.1| hypothetical protein CLOBOL_00433 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00433 [Clostridium bolteae ATCC BAA-613] # 1 74 18 91 91 114 98.0 2e-24 MVFSLIALMAVLQVVRKDLPGFNLNTAYGQWIAGYAILSNIPVFSKNSSTLVIHSIAFSV AFILGIIGMKINLI >gi|157101655|gb|DS480669.1| GENE 40 40092 - 40820 507 242 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2751 NR:ns ## KEGG: CDR20291_2751 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 10 236 3 222 225 154 39.0 4e-36 MKNEKAENQKMEDVYNKEYLPSIVRVGRFVLLASTLFFFAPFLYTWIGWGVMPDWPAIVK GTFAWLLINLPWWISEPVAYFPVQGVTGMLLCSLAGNASNMRMPCAITAQKAAGVDPGTQ KGSLISTIAISITCFVSIGILAVAVFAGQAALSRLPASVSNGLSLLLPALFGCIFAQFLS GNEKSGAFSILTALGVLLLYKKGFLSAIPADGGSIIVMLVPILGTMAFAYATAKKKGKEP VI >gi|157101655|gb|DS480669.1| GENE 41 40837 - 41538 745 233 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2752 NR:ns ## KEGG: CDR20291_2752 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 7 230 5 228 231 149 38.0 7e-35 MPREIQQLVNSPILWMITLMSIALVLGMAAYFLVKSIKVAKSIGVTKEQISITVKTAGIS AVAPSVVIAVGMISLLVMVGAPTALLRLSVVGNVGYELQSVGIAADAFGTTASAATLTPG IFQTTVFLMAFGCIGYLVIPALLCTKMDKVIRKISGKDATVATIISTAAILGCYAYVDAP YILKMDASTVALAGGFAVMLLLQAIQKRTRKKWLLEWGLLISMAAGMCLGLFV >gi|157101655|gb|DS480669.1| GENE 42 41807 - 42574 730 255 aa, chain + ## HITS:1 COG:slr0596 KEGG:ns NR:ns ## COG: slr0596 COG1402 # Protein_GI_number: 16332320 # Func_class: R General function prediction only # Function: Uncharacterized protein, putative 
amidase # Organism: Synechocystis # 14 245 20 261 273 102 28.0 6e-22 MKNQLLWEQLRAPELKKLAEKNAVVMIPMGAVEQHGPHLPVATDTILAQWIAERAAKEMW DKGIPVVIAPAFVIANSMHHMHFSGSLSLTPGTFIQVLKEQCRAIAVQGFRKIAIINGHG GNTAPIDVALIDINQELGFPVYNVPYTAGVDESPFLDKQNYMIHSGEVETSLILAYDESL VDPSYTNLSGNPGGCSDYEDCGALSTFHYMESHTENGIMGESCAASKEKGIALADAYCKR LVEVLSDERLWSVPV >gi|157101655|gb|DS480669.1| GENE 43 42731 - 43138 505 135 aa, chain - ## HITS:1 COG:no KEGG:bpr_IV159 NR:ns ## KEGG: bpr_IV159 # Name: not_defined # Def: YolD-like protein # Organism: B.proteoclasticus # Pathway: not_defined # 33 132 1 100 102 85 44.0 6e-16 MAGTKAKSAAKSVKAQNTKPEGTEMAGRPRAKMSVQDRAKQFLPFAALKGLPEALAEKER VVVPKIILTEDMSEELNRKMQQIEPGMIIGVVYFHKDEYLKVTGMVARFNISSRVLQVVN TKISFDDVLDIEFCR >gi|157101655|gb|DS480669.1| GENE 44 43056 - 44408 1127 450 aa, chain - ## HITS:1 COG:SA1196 KEGG:ns NR:ns ## COG: SA1196 COG0389 # Protein_GI_number: 15926944 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Staphylococcus aureus N315 # 1 418 7 419 420 239 33.0 6e-63 MEQRIYIVIDLKSFYASVECVERGLDSMTARLVVADPERTEKTICLSISPAMKRLGIHNR CRVFEIPKSVDYIMAQPRMQLYIDYAAGIYGIYLKYISKEDIHVYSIDECFMDVTNYLPL YHMTARELGQTIMADIFREYGIRATCGVGTNLYLAKIALDITAKHSPDFIGELDEETFQR TLWDHKPLTDFWRIGRGIAAKLEKYGIHTAGDIARTDEDFLYKLFGVDAELLIDHAWGRE TATIAAIKSYKARTNCLTSGQVLGCDYSFEDAKLIVKEMMDLLCLEMVEKNLVSDSVTLQ VGYSRQLAVESSRGTASLDAETNADIILVPAVSALFERIANPDLAIHRLNISVNHVIEEE YRQYSMFTDVQELERNRKLQTAMLDIKKKFGKNAILKGINLQEASTMQERNQQIGGHKSG IYKDGGNKSKERGKEREGAEHKAGGNRDGR >gi|157101655|gb|DS480669.1| GENE 45 45104 - 45589 440 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935553|ref|ZP_02082928.1| ## NR: gi|160935553|ref|ZP_02082928.1| hypothetical protein CLOBOL_00443 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00443 [Clostridium bolteae ATCC BAA-613] # 1 161 1 161 161 325 100.0 6e-88 MGIFGKLFKGPEIDMVKSNANAKKMRALFNQAVENGDDYRLIFGFTEDVSRFNYGFVHGS 
KSKIGNLLVGWNEVNQTIVVVPTVPDLSGCGDPTYYRRSEILKAYRNKYPTDAFIIYPDR KSYIGINAYDWLEDEKLYVYVSQDEELAAFTEFFKTHFATK >gi|157101655|gb|DS480669.1| GENE 46 45747 - 46367 224 206 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935554|ref|ZP_02082929.1| ## NR: gi|160935554|ref|ZP_02082929.1| hypothetical protein CLOBOL_00444 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00444 [Clostridium bolteae ATCC BAA-613] # 1 206 10 215 215 372 100.0 1e-101 MPKVGYSEKERVQIREALIAVGLDLMTKQGIQHTTVEQIYKKVGISRTFFYSFFPNKEEL IVEALYLQQPQLIEYARKLMNDPAISWRDKVKTFLHSLCYGEKYGIAVLTMDEQRLIFKR LSRDSYKVFRDKQLSLLHAILNSFEIEADTEQLKLVLNLTLLIVITCRALPENLPLLVSE ASDGTAEFQMEALVDYLEKLKCKPIL >gi|157101655|gb|DS480669.1| GENE 47 46766 - 47461 461 231 aa, chain + ## HITS:1 COG:no KEGG:BDI_2733 NR:ns ## KEGG: BDI_2733 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 225 1 225 227 297 64.0 2e-79 MKYFGRGLSEQHKELSSIIRKTSETERSKDLFLQIHAKLHSSVVSGTDKNEVDNLLCDLK QNEYAIMPTGKDETIAWGLWHIARIEDLTMNILVARKEQVFNQDWKERLNARITDTGNAL SDDEIIDFSRNVNTEQLICYRNAVAQTTRDIIRSLSAADLKRQIRQDDIEKILSVGGVTQ QETSIWLLDFWEKKDVAGILLMPPTRHVMLHLNDCCKWKEAIRTKKKFFRS >gi|157101655|gb|DS480669.1| GENE 48 47493 - 48710 793 405 aa, chain - ## HITS:1 COG:VC1527_1 KEGG:ns NR:ns ## COG: VC1527_1 COG1763 # Protein_GI_number: 15641535 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin-guanine dinucleotide biosynthesis protein # Organism: Vibrio cholerae # 201 324 4 131 206 83 36.0 8e-16 MTQQEEQDMKHGWGSVILSGGQGRRMGGIDKGGLDYKGESFCGHIQKQLQALDIPCYLSR ACYAGSGGREGGLEVIEDTVKGAEGEWIGPMGGIWSCFQRTGLDGLFFVSCDMPLFRKEM AAILMERWEPGADAVLWRTRDGRIQPLCGFYAATCLEALGDTIRQGNYRLMKFLNAVRCI VVDTSEAHIPDIWFANVNSPSAYRSLEGLRTPVLAVSGRKNTGKTALLEMLVGALDRVGI RSAVIKHDGHEFEADVPGTDSRRIKEAGAYGTVVYSGTKFSMTKEQPSMKAEDFFGFFPE ADIIFLEGQKDSDYPKLEVLRREVSDVPVCRPETVLAYIWSGQIARSGENTWSGENARSG EDCPGRGNRRNGKCPVLTAAQKDCILEIVIEHMDRYCRGDGGDEL >gi|157101655|gb|DS480669.1| GENE 49 48707 - 50584 1785 625 aa, chain - ## HITS:1 COG:AGc2058_2 
KEGG:ns NR:ns ## COG: AGc2058_2 COG2199 # Protein_GI_number: 15888451 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 451 613 49 216 218 108 38.0 3e-23 MKRKLIGKWHVAAFLVFCFVFALTASGVFYNQRGKLAGEFGNMVAANLTSYTQAQKRYLN SSITDAWNTLKGISGLVEQIIPEYTEDSLNGYLDQLNLQNRDYMIEYVTLQELERRLRVQ QASERDWSLFGEIEQGTGVVSDIWDWKKAGNKSVFTVAEPVWKNGEVIGVLQTRLEPLAI TEQVPEASAFTRSSTLIVRRDGTILASENHNRGDISTGNLFDSVKVAGITDEVVEQMEER FYGDGSDSFMFEGKGDSYYFSWDYLGYNEWYIVNFVRSPDVAIHYDNILKELIYASLFLI GLTAALGGGIVVLFLHYRRSLDFETKKYGLLAEFSDTALFEYDRRKDTLEFTNNARRILM LDELKISHVMGKKTRTDLFVQEDRKVMEDMLRGRTGSGEDKIQYAELRLKSISGEYHWFG CHYKAITSDAGTVAKVVGKLADISRQRSREQELREQAMRDVLTGIYNKAGEKLIDRMVKE KGQGLFLMLDLNDFKSINDTMGHAAGDAILTEMGRVLKGACRENDIVARIGGDEFVMFLP GAFDWQTGKRKIGEIQDSLRTVMITTWGIRGIRASIGAALCPEDGMDYETLFKAADEAMY LDKEQSKNRNESREAGEEKQDRTIS >gi|157101655|gb|DS480669.1| GENE 50 50784 - 51671 1091 295 aa, chain - ## HITS:1 COG:CAP0064 KEGG:ns NR:ns ## COG: CAP0064 COG3588 # Protein_GI_number: 15004768 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1,6-bisphosphate aldolase # Organism: Clostridium acetobutylicum # 1 295 1 295 295 377 67.0 1e-104 MNQEQLRIMSGKKGFIAALDQSGGSTPKALKNYGIREDQYSNDEEMFNLVHEMRTRIITS PSFTSDHILAAILFENTMERKIGDKLTADYLWEEKGIIPILKVDKGLAEEADGVRLMKPV PGLDELLVRAVERHIFGTKMRSVIKQANPVGIKKIVDQQFEIGLRIAAAGLVPILEPEID IYSPDKAESEQIMKDEIKKHLAALPEDTRLMFKLSIPDKHGFYSDLMEDSHVVRVVALSG GYSRQEANERLSRSPGLIASFSRALSEGLNANQTQGEFDRMLAQSIKEIYDASIT >gi|157101655|gb|DS480669.1| GENE 51 51992 - 53473 1777 493 aa, chain - ## HITS:1 COG:FN1480 KEGG:ns NR:ns ## COG: FN1480 COG2239 # Protein_GI_number: 19704812 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Fusobacterium nucleatum # 45 493 1 449 449 440 50.0 1e-123 MMIAIITGGVCNYGCFVVAVRPSRAGLRHFACCFHDVTIHGRNKMQENFDLKELMELLDT MQLRLLKEKLIEMNEVDIAAFIEELDSEKTVVVYRMLPKELASDVFACLPVEKQEHIINS 
ITDYELSAIVNDLFVDDAVDMLEELPANVVKRVLKNSTPDTRKLINQFLKYPEGSAGSIM TAEYVGLKKRMTVEEAFAYIRRHGVDKETIYTCYVMDAKRALEGVVTVKDLLMHPYEEVI GNIMDSHVIKAVTTDDQEEVAESFRKYDLLSLPVVDHENRLVGIVTVDDVVDVMEQEATE DFEKMAAMLPSEKPYLKTGVFALAKNRLAWLLILMISSMITGSILAKYEAAFAVIPLLVT FIPMLTDTGGNAGSQSSTMIIRGMAVGEIEAGDILRVLWKELRVGVIVGVLLGLVNYIQL VIRFPGQEMLCLTVVLSLLATVMLAKTIGCVLPIAAQVLHLDPAIMAAPLITTIVDAVSL IIYFQLACSLLKI >gi|157101655|gb|DS480669.1| GENE 52 53756 - 55129 1350 457 aa, chain + ## HITS:1 COG:no KEGG:Closa_1352 NR:ns ## KEGG: Closa_1352 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 457 1 456 456 575 75.0 1e-162 MIESAVVLTPLHYAYLIGVIVILAVMVLKKDTPAVCIAFLFILGLIGLKSVTGGIITVFS AVLYAGREFMEVLATIALVTSLSKCLKDLGSDYLMMVPMAGIMKTPSLTWWILGLTMFLF SLFLWPSPSVALVGAIMLPFAVKAGLNPLAAAMAMNLFGHGFALSYDVVIQGAPAISAGA AGLDTSSILRQAWPLFWMMGIVTVLSAYVLNRSQLDARGPRLSIQAEPSRPQGHKKAALF LAVFTPIAFAGDIILMLLCGLKGGDATSMVSGTALLITCLGALLGFGKQSLEKVTGYVTE GFLFAIRIFAPVIIIGAFFFLGGDGITSIMGERFNRGIMNDWALWLAQNAPLNKYMAAFI QMVVGGLTGLDGSGFSGLPLTGALARTFGTATGSSVPVMAALGQIAAIFVGGGTIVPWGL IPVAAICNVSPLELARKNLIPVCIGFACTFALACFML >gi|157101655|gb|DS480669.1| GENE 53 55170 - 57023 1795 617 aa, chain + ## HITS:1 COG:BH1508 KEGG:ns NR:ns ## COG: BH1508 COG0367 # Protein_GI_number: 15614071 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Bacillus halodurans # 1 617 1 615 615 593 46.0 1e-169 MCGIAGFYNPHDHYLKEKEHYESVLQAMAERLRHRGPDDAGRWLSEHGGLSHARLSIIDL AGGHQPMVKSGSGQTFAIVYNGELYNTDELRQELMAAGYSFQTSSDTEVILAGYMEYGPD FVKRLNGIFAFAIMDTALNRLCLFRDRAGVKPLFYTIRGEELIFSSELKGILAYPGVKAQ LDLRGLNEVFSVGPARTYGCGVLKDMEELLPGHYLICTPDGIRQQSYWKLVSRPHEDSYE RTIEKTSFLVEDSVRRQMVSDVPICTFLSGGVDSSLVSAICAKELKKQGKRLTTFSFDFV DNDKFFRANSFQPSQDRPYVDQMVKFLDSDHHYLECGNLTQADRLFDSVLAHDLPAMGDI DSSMLQFCSMVKEHNKVALTGECADEIFGGYPWFHKKECFEAHTFPWTMDLAPRKALLDD SFLEYLDMDRYVLDTYERSVAETPVLEEDSPEEARRREIAWLNLRWFMQTLLNRMDRTSM YSGLEARVPFADHRIMEYVWNVPWDMKTRDGLVKNLLRQSGKGLLPDEILFRKKSPYPKT 
YDTAYEKLLVDRVREMMDDSHSPVMDFLDRKKLARFLTSPSDYGKPWYGQLMAGPQMLAY LLQVNFWLKTYHIEVVG >gi|157101655|gb|DS480669.1| GENE 54 57039 - 58424 1127 461 aa, chain - ## HITS:1 COG:mll7787_2 KEGG:ns NR:ns ## COG: mll7787_2 COG2199 # Protein_GI_number: 13476460 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Mesorhizobium loti # 282 455 1 175 180 121 39.0 2e-27 MGEGTEDRMQVYIKHVVLEPGLEMEYAQPLLTLFDELLAWDLASGDSRIIFYTEGRSRLR ENRTEQEFQAFAWSHVHPRECRQFQAFFSAVHMKELYGSKHPEKLVYRQRTAEGGYVWVE VVAITAGGGGTMLCYVRDMGKEEQKRYIQEQIIDQYVYKKCDFFLCLDGDTGYYEILQKN DSRIQGQMPHSGNYEQELEECIRKYVVSEDRDLLLEKASMNNVLDVLEREGELMVSYGEK GTDGRYARKLLRYVYFDRQEQKILLMRQDITAEYLEQRRQKKRLSDALLHARIDSLTGLY NRQAVNAKINGILRGAAVMPPSVLLFIDLDNFKTVNDTLGHRYGDKVLCSVAECFRQVLR TSDIIGRVGGDEFVAFLSGVTSLDEARECAGRLCQAVSRIPDLDLKGCGLSCSIGGAVCP RDGRDYDSLLMKADIAVYEAKRRGKNQFAFYEPGMRPHSEP >gi|157101655|gb|DS480669.1| GENE 55 58556 - 58903 345 115 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1949 NR:ns ## KEGG: Ethha_1949 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: E.harbinense # Pathway: not_defined # 1 92 1 92 113 77 43.0 2e-13 MHDYVKTLGTVIRKAREEQGMSQASLAEKLGIDVRTIINIENFRGNPKFEVLYPLVTTLR IPADRIFYPETESMGDSKQQLFWELHSLSEKEALELLPVIRCLAEMMRKHGSENP >gi|157101655|gb|DS480669.1| GENE 56 59096 - 59449 475 117 aa, chain - ## HITS:1 COG:TM1287 KEGG:ns NR:ns ## COG: TM1287 COG0662 # Protein_GI_number: 15644042 # Func_class: G Carbohydrate transport and metabolism # Function: Mannose-6-phosphate isomerase # Organism: Thermotoga maritima # 1 113 7 119 121 82 37.0 1e-16 MIRRKDEMEPVTRHELQKGMGDVTFHTFMTKEEAYGSGRTFARVVFEPGTSIGIHEHHGE FEGYYILKGQALVTDNGVETVLNPGDYHMCKDGDSHGIGCYGDETMEIIALIINVPA >gi|157101655|gb|DS480669.1| GENE 57 59590 - 60477 615 295 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935569|ref|ZP_02082944.1| ## NR: gi|160935569|ref|ZP_02082944.1| hypothetical protein CLOBOL_00459 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00459 [Clostridium bolteae ATCC BAA-613] # 1 295 1 
295 295 612 100.0 1e-174 MGIMRLIEGIRAGKAVSVFSGNQEYLFSAGKPGPACPDGFWEISVRCRPEAVSDRTGPGN GPFLFFDSRTNRFLDFTGGLLEALFPMSSEHIYDMAVLEQEVNLRVNRYITHNLIYKIQE GAAKYGREEFDEGEIREQAYVCYMTGAWGHLHVPARAFDMEQFCRMDGRRSRTGEGPFVE AYARLIMLHLSEQFPSGEVARSFLEDICRKDTGRQRNYDVICREYAKDSFVEQELEHLMR EATDSMKRAKSLCQLLKAKCSVKPQVELDQEILAVSLASDCPGPVIGRKRDLYYS >gi|157101655|gb|DS480669.1| GENE 58 60703 - 61905 1356 400 aa, chain - ## HITS:1 COG:FN1423 KEGG:ns NR:ns ## COG: FN1423 COG0426 # Protein_GI_number: 19704755 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 1 399 1 403 405 473 54.0 1e-133 MYCVKRINDDLFWVGGTDRRLALFENAYPIPRGVSYNAYLLLDEKTVLFDTVDRAITEQF FENIDALLKGRALDYVVVNHMEPDHCATLGEIVLRYPGVQVVCNPKTIPIIKQFYNFDID SRAIVVKENDTLCTGRHTFTFLMAPMVHWPEVMVTYDTTDKTLFSADAFGTFGAMNGNLF ADEVNFERDWLDDARRYYTNIVGKYGPSVQTLLKKASGLDIRTLCPLHGPVWRKDISWYV DKYLTWSSYEPEEKAVMIAYGSIYGNTENAANILACKLAERGIRNIAMYDASSTHPSTII SEAFRCSHLVFASATYNGGIFSSMEHVLMDLKAHNLQNRTVALMENGSWGVLSGKQMKEI IGTMKNMTVLEQMVTVKSSLKESQMEELDAMADAIVNSMK >gi|157101655|gb|DS480669.1| GENE 59 62200 - 63255 1127 351 aa, chain - ## HITS:1 COG:STM2550 KEGG:ns NR:ns ## COG: STM2550 COG2221 # Protein_GI_number: 16765870 # Func_class: C Energy production and conversion # Function: Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits # Organism: Salmonella typhimurium LT2 # 1 334 1 334 337 359 49.0 5e-99 MNHDIDIKKVRINCFRQSKVAGEFMLQMRVPGAMIEAKHLKLVQDLCERFGNGTFHLGTR QTFDIPGVKYENIEEVNKYIKSYIHDVDVEMCGVDMDVTDYGYPTIGARNIMSCIGNAHC IKGNANTYELARKIERLIFPSHYHIKVAVAGCPNDCVKANFNDFGIMGINKQVYDIDRCI GCGSCVDACKHHATGVLSLNQNGKIDKDACCCVGCGECSLICPTGAWTRGDKKLYRVTLG GRTGKQNPRAGKLFLNWVTEDVILGMFANWQKFSAWALDYKPEYLHGGHLIDRVGYKEFV KHIFEGVEFNPEAKMADDIFWAENEQRGNMHVMPLSEHKHAGPQEPAKKKA >gi|157101655|gb|DS480669.1| GENE 60 63274 - 64065 730 263 aa, chain - ## HITS:1 COG:CAC1514 KEGG:ns NR:ns ## COG: CAC1514 COG0543 # Protein_GI_number: 15894792 # Func_class: H Coenzyme transport and metabolism; C Energy 
production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Clostridium acetobutylicum # 7 263 8 264 264 249 47.0 3e-66 MNNIVKPVACKIREVRRESLHEYTFRVETDIRPSHGQFLQLSIPKVGEAPISVSSFGDGW MDFTIRSVGKVTDEIFEKQPGDILFLRGAYGKGWPVEQFQGKHMVVITGGTGLAPVKSML NMFWEQEDFVKSVHLISGFKNEDGIIFKHDLDRWKEKFTTIYALDTDRKDGWETGFVTEF VSRIPFKDFGEDYSVVVVGPPPMMKFTGLEVLKQGVPEEKIWMSFERKMSCAVGKCGHCR IDEVYVCLDGPVFNYTQAKYLVD >gi|157101655|gb|DS480669.1| GENE 61 64082 - 65116 1170 344 aa, chain - ## HITS:1 COG:CAC1513 KEGG:ns NR:ns ## COG: CAC1513 COG1145 # Protein_GI_number: 15894791 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 339 1 335 338 323 46.0 3e-88 MGYRMDLKDADRMFAKLQEEYEIWAPKRFVGKGRYSDTDLIKYARVDRAEEIEYREKSDF PAKEVLSPITQSLFYFTEDEFIESKAGSKKLLIFMRPCDIHAQHHQEKIYLTNGGYEDIY YKRMNEKVKIVMMECGEGWDTCFCASMGTNKTDDYSMAVRFGEDGLEFQVKDHAFASYFE GMEECPFEPQFVEKNELTLTVPEIPDKEVLTKLKEHPMWRQFDGRCISCGACTVACSTCT CFTTTDIIYNENRNVGERKRTSASCQVKDFTDMAGGMSFRNRAGDRMRYKVLHKFHDYKA RFKDYHMCVGCGRCMDRCPEYISIVATVNKMADAIEEIKAGKQA >gi|157101655|gb|DS480669.1| GENE 62 65400 - 67142 1958 580 aa, chain - ## HITS:1 COG:no KEGG:Closa_1330 NR:ns ## KEGG: Closa_1330 # Name: not_defined # Def: cellulosome anchoring protein cohesin region # Organism: C.saccharolyticum # Pathway: not_defined # 1 327 1 332 620 228 41.0 6e-58 MIRRMRHAAAVLAMACMMTVMMPAQMITAFAANARIAFSDPSATVGGKIKVNMKITSSDN LANADVMLSYDSNILEFVDGTNADGGAGAVRVHGDAGTPNTGTLVFTLNFNAIAAGTSKI EVTSQEIYDSNSQIVTVNQQGNSTVTVSALQSASKDATLKSLQISPGSLTPAFSPDVDTY AVTVGTDVDKVIVSADCTDENATKVVSGNEGLQMGENRVTCRVTAQDGETIKEYVIVVTK AEGGASAADADASSQVRMRIAERVITILPPDPEVKVPEGFKESTINIDGHKVQGWVWGSE AEHQYCVVYGMNEAGEKGFYRYDMKDDERTIQRYFEDPAISGSVTKEVYDELALTYDKIY DDYSLLKILLIVTIVIAIILLVILLVVVFTRGSGKKGGSRDELGDKKDGPRGSRGRNSRK EKAEEPELLENSEDYQEYAGEDEYQDSDEYDESREYEDDSVYDDDEDYRDLPEEDYPVGE EDGDIYDEEEDVYEQEEEPQVETPVKKPVKPARPVQAPAHAPARTPAHAPVQNPERTPAQ APASRQKPPVTQEQLSARREAEQVPAKPDGQDDDFEIFDL 
>gi|157101655|gb|DS480669.1| GENE 63 67182 - 69449 1844 755 aa, chain - ## HITS:1 COG:SA0905_3 KEGG:ns NR:ns ## COG: SA0905_3 COG4193 # Protein_GI_number: 15926639 # Func_class: G Carbohydrate transport and metabolism # Function: Beta- N-acetylglucosaminidase # Organism: Staphylococcus aureus N315 # 433 582 19 164 165 60 32.0 9e-09 MKGKKLKRGLALVLGLMLAAEVPASVATPFQMFDSYAYTGAATVKATSLNVRSGAGTGYS SVGRLAAGAAVTVIGEQRGTDGNTWYQIQYTGTGGAVNTGYVSSVYIRLPVAYTKDSDFE AYLTSQGFPESYKNGLRQLHAQYPNWVFKAKNTGLDWNTVIENESVLGRNLVATGSVSSW KSVANGAYNWDNSTWTGFDGSNWVAASEDIIRYYMDPRNFLDETYVFQFLNHEYDANTQT KEGLNSLISGSFLSGTTNSTGTGGSDFSSGPGSSTGGSGSGGPGSSGSSSGSSSGSSRGP GRSSTYEDTQHGPGVSGSSNGSGAAAVSPGGSSSGGSSGGSSPSGSGEVSLESPHASIEP RERNVVSTSVSLVAPGQDGSGSSNGPVSSGQSGSSSGPVASPGSDQVVDQTGTSPGGGST GAAAPYADIIMAAASQSGVSPYVLAAMILQEQGNNGTSPLISGSYSGYEGYYNFFNVEAY QSGAMSAIEMGLRFASQSGSYGRPWNTVEKAIRGGAQNYGDNYVKAGQNTFYLKKFNVQG SNLYKHQYMSNIQGAASEAAKLSQAYTADLKKTALEFHIPVFNNMPEQPCVAPTGDGSPN NKLSGLGVDGFNLTPSFNRDTQEYNLIVDSSVSNITVSAYASDSNARVDGAGNVSLQNGG NDISIAVTAKNGSVRTYTIHVVKQDGGPTQGSGGSPVYGGGSSSGGIVSPDGSSGGSSGG SSGSSGPGGSGGPGSPSGSGSGPGGSNVTIVEVQS >gi|157101655|gb|DS480669.1| GENE 64 69690 - 70892 1157 400 aa, chain - ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 8 291 23 296 425 163 36.0 6e-40 MNAAADSWTSVPATKEVREDGDFLKLYAQSAVLMDGDTGRILYGKGQDIIRPMASTTKIM TCILALENGRGEDAVKASSNAASQPKVHLGVREGEEYRLEDLLYALMLESYNDAAVMIAE HIGGSVEGFAAMMNEKAAGLGCGDTYFITPNGLDGVKTDEQGKEHIHSTTASDLARIMRY CVTESPAKDEFLKITQTQNHYFTDMAGKRSFNCYNHNAFLTMMEGVLSGKTGFTGGAGYS YVGAMENQGRTYVIALLGCGWPPHKTYKWSDARKLYTYGLEHFKMKDVFQEETFARVPVS GGLCWKEGEGAIDQVELSMMLNPDDMHLTLLMGEGEEARVVKKVPSILMAPVREGQQVGT VDYYLGEVLVKRFPVYAMQGVEKMNLRRAADRVLDLFLVR >gi|157101655|gb|DS480669.1| GENE 65 71648 - 71848 59 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935583|ref|ZP_02082958.1| ## NR: 
gi|160935583|ref|ZP_02082958.1| hypothetical protein CLOBOL_00473 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00473 [Clostridium bolteae ATCC BAA-613] # 1 66 8 73 73 124 100.0 3e-27 MSHDKTPAFLCCCLFTDIIAEYVFGGKRYMGRGTRMGEVLASVGEREEAGGDTAAGQVRS KGLRIG >gi|157101655|gb|DS480669.1| GENE 66 71840 - 73084 1020 414 aa, chain + ## HITS:1 COG:CAC0285 KEGG:ns NR:ns ## COG: CAC0285 COG0389 # Protein_GI_number: 15893577 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Clostridium acetobutylicum # 6 400 3 394 396 356 45.0 5e-98 MGHGERLIFHIDVNSAFLSWECVYRLSKDPQALDLRMVASAVGGDAKSRHGIVLAKSPLA KKYGVTTGEPLVQALRKCPDLVVVPSRFDFYIKCSRRMMDLLEEYTPDHEKFSIDEIFLD MTSTIHLFGSPMEVANEIRERIRKELGFTVNIGISTNKLLAKMASDFEKPDKCHTLFPQE VPDKMWPLPIRELFFVGGAAQRKMENLGLHTIGQLACCNLEVLKSHLGEKYAVLIRQYAN GIDDDPVAEKEPVNKAYGNSITLSRDISDYESACQVLLSLCETVGARLRADHVLCNNVCV ELRDWEFRNQSHQMVIPEPTDSTSVIYEYSCRLLRESWNMTPLRLMGVRAGKISDDGFSQ MSLFDDPGLQKKKDFEKAVDAIRSKYGIDSVKRASFLRKDAIVDHAASKKKHLN >gi|157101655|gb|DS480669.1| GENE 67 73142 - 74041 297 299 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 18 290 21 281 285 119 30 2e-25 MRRSAICYTIGDMEDGIVLGQFLKAKGFSHRLAVRIKAAQGLAVDGIAAYAGYRLKTGQT VEVTLPEEEDSGNIVPVKLPLSIVYEDEDILVINKDAGVPIHPSQGHYDNTLANAVSWYF HEKGEAFTYRVINRLDRDTTGLLVLARHMLSACILSEQMTGRRIRREYRAIVLGHTPEEG TVDSPIARAEGSTIERRVDHETGERAVTHFRTLLYNEEKDLSLVSLSLETGRTHQIRVHM RSIGHPLPGDFLYCPDYRYIDRQPLHSYLLRFEHPITGKEMEFTAGLPEDMKQLLPSGI >gi|157101655|gb|DS480669.1| GENE 68 74048 - 75643 1217 531 aa, chain - ## HITS:1 COG:SPy1246_1 KEGG:ns NR:ns ## COG: SPy1246_1 COG0144 # Protein_GI_number: 15675206 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pyogenes M1 GAS # 117 372 51 295 297 211 46.0 3e-54 
MADLPVRFEERMKALLGEEYPAFAASYDKERVQGLRFNSLKFPDRIRIQDAVGSGENREA GKNGEGKEICEAKAVCEAKADCEAKADCEVEVTWEEAGAAEAAKQIGQETGFTLERIPWV KEGYYYSGSRPGKHPYHEAGLYYIQEPSAMAVVELLDPRPGERVLDLCAAPGGKSSHIAS RMKGSGFLLSNEIHPARARILSQNMERMGVRNAVVSNEDAQSLAGTFDHFFHKIVVDAPC SGEGMFRKDEDARSQWSEEHVKMCAARQGEILDHAASMLAPGGRMVYSTCTFAPEENEGT VLAFLRRHPEFCTEQVPAYSGFTKGKPQWAGPEAEGWGLERTFRIMPHILEGEGHFMAVL RKEGEPETAVKAGNRDQLYLDGRKRKEVFRDYEPFIRDTLSEPDTFLERKEYVLFGDQLY LMPADMPDMKGLKILRPGLHMGTLKKNRFEPSHALALALRKEEAQCFWELSPSGDSIIRY LKGEALSEDAGTPEGCLKGWVLVCTGGFSLGWAKYAKGILKNHYPKGLRWN >gi|157101655|gb|DS480669.1| GENE 69 75793 - 76353 766 186 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935587|ref|ZP_02082962.1| ## NR: gi|160935587|ref|ZP_02082962.1| hypothetical protein CLOBOL_00477 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00477 [Clostridium bolteae ATCC BAA-613] # 1 186 1 186 186 125 100.0 1e-27 MVEKKDVKATVKAAAEAAKPAEAKTEAKPAAPEKVEAKAEAAVPVKAEAAPVEKEAKAAP AKKTTARKTAAKTTSKKETAPKEAAKKEAAPKAEAKKEAATKAAAKKETAKKAPAKKAAE PKAAVHFQFDGKDLVAKDVLDRAVKAFKKSHRGVEIKTIDLYIVANEGAAYYVVNGEGGD DYKIIL >gi|157101655|gb|DS480669.1| GENE 70 76517 - 78541 1364 674 aa, chain - ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 119 561 23 465 466 344 40.0 4e-94 MREERNKAESVPARSEERRPRRSRSREAFRQRQKELAESRGQEKRVQENGGQEKRVQEKR VQEKRVQENRVQDTPGTGEDKTHGGQPENDGQGTFNTQGKGPFDKFLSWKTAVLVLILAV AVLAGIYAYKALGYRNTYFPHTVINGMDVSGKTVGEVKELIASGVNGYGLVLKLRDEEQE TITGEKIGLHTVFDGSLEEIIRQQNPFRWPRYLLKGPSYDIKTMIAYDADALAQTLGGLS CFDSSRAVLPSDAYLSDYVSGQGYSIVPETEGTTLDMDKVRAQVDEAISSLAPELDLDAL GCYKSPSIRRGNTSLAAARDARNRYVNMTVTYTFGSKTEVLDGDEIHEWLVSDGEQVSID PDQAAAYVKSLASKYNTAYRKRSFATSYGQNVEVSGFYGWRINQSEETRELLGILEAGES VSREPVYLQTAASHDGPDYGSTYAEVNLTAQHLIFYKDGQKVLESDFVSGNVSRGHTTPP GIFSITYKQRDAVLKGEGYASPVKFWMPFNGGIGFHDASWRSSFGGSIYKNGGSHGCVNM 
PYDKAKELFENVYAGMPVICYDLPGTESKKSSQSSGRAPRETTAPAQPAPVPTQAPPAVP PSVLPETVPSETLPSEAAPPETSPQPSQPQVIIQPADQTTAAPAQTQPSPNTEEGYGPAF QTPQSGSAAGPGVS >gi|157101655|gb|DS480669.1| GENE 71 78548 - 78991 397 147 aa, chain - ## HITS:1 COG:CAC2339 KEGG:ns NR:ns ## COG: CAC2339 COG1683 # Protein_GI_number: 15895606 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 3 135 6 148 150 127 46.0 7e-30 MNILVSACLLGVECRYDGRGVLMSQAEELLSRHHLIPVCPEIMGGLATPRTPAERTGSGV VTRDGEDVTAAYEKGAGEVLKLAQLYGCQAAILKERSPSCGSGQVYDGTFTGTLTKGDGV CAACLKEHGIRVYGESQVGRLLEDIER >gi|157101655|gb|DS480669.1| GENE 72 79048 - 80409 1363 453 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 435 1 433 448 230 31.0 4e-60 MKVENNLDTDKVRGLVWRLAFPSMLAQFVNVMYSIVDRMYIGNIPEIGALALAGVGICGP VVTLISSFASWIGVGGAPLMSIRLGQKNERAAAQMVANCFALLTGMALIIMIVSYLFKDQ LLVFFGAGPAIFPYANQYMSWYLTGTAFALISAGMNQFIICQGFATVGMKSVVLGAVSNI VLDPVFIFVMNMGVRGAAIATVLSQMASCIYVLLFLFGKRPLVRITFGGYRWKTMKQVLL VGLSPFLIIASDSLLIIVMNMVISSYGGPERSDMLLTCNTIVQSFMLIITMPLGGITGGT QTVMGYNYGAGRPDRIRKAEKHILLLSVAFTTVMFIIAQAGPGYFVRIFTREMSYVRVTE WAIRIFTLCIIPLAVEYTVVDGFTGMGIAKVAISLSMFRKSVYFLGMILLPRYFGVEAVF YAEPISDIMACAAASTTFIMLSGKVLKDNRRLF >gi|157101655|gb|DS480669.1| GENE 73 80462 - 81595 1168 377 aa, chain - ## HITS:1 COG:BH0463 KEGG:ns NR:ns ## COG: BH0463 COG0628 # Protein_GI_number: 15613026 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus halodurans # 33 335 29 339 372 102 26.0 1e-21 MEETGWKHYLRLILNIVIPLTGWLLICLLGPKLLKFFMPFVIGWVIAMIANPLVRFLERR VKLVRRHSSIVIVAAALALVIGLLYLLVSRTFVLLRQFIIDLPGLYAGIEGDVARSMEQL EHLFDFMPDSIQQSWAQFGNNLGSYMGTVVEKIASPTVEAAGTVAKSLPAMLVYTVVTIL SAYFFIVDRDRILAAIKAHMPQWAGRYGLYLKGEVRHLIGGYFMAQFKIMAVVWLILTVG FIVLGVGYGPLWAFLIAFLDFLPVFGTGTALLPWGLIKLLGGEYAFAAGLLLIYVLTQVT 
RQIVQPKLVGDSMGLPPLLTLFLLYLGFKADGIAGMILAVPIGLLFVNLYHYGAFKDITD SLSVLVRDIDRFRRKEE >gi|157101655|gb|DS480669.1| GENE 74 81803 - 82885 1219 360 aa, chain - ## HITS:1 COG:CAC3031 KEGG:ns NR:ns ## COG: CAC3031 COG0079 # Protein_GI_number: 15896282 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Clostridium acetobutylicum # 4 348 5 349 352 345 48.0 9e-95 MRPWEMNIRRVVPYVPGDQPAGDKLIKLNTNENPYPPAPGVERALREMDADRLRKYPDPS SAELVTALAEYYGVGEDQVFVGVGSDDVIAMSFLTFFNSQRPVLFPDITYSFYKVWADLY RIPYETPALDGNMAIRSLDYKRENGGIIFPNPNAPTGLYMPLDQVEEIVKANQDVIVIVD EAYIDFAGPSARELLPGYDNLLVVQTFSKSRSMAGVRIGFALGSPALIKALNDVKYSYNS YTMNLPSQIAGTQAVKDRAYFEETRAKIMTTRERSKKRFAELGFTFPDSMTNFILVTHER VPARAIFDALKKEQIYVRYFNAPRLDNSLRVSIGTDEEMDVLFRFLEEYLKEWKPAGDSE >gi|157101655|gb|DS480669.1| GENE 75 82945 - 84138 1479 397 aa, chain - ## HITS:1 COG:CAC2832 KEGG:ns NR:ns ## COG: CAC2832 COG0436 # Protein_GI_number: 15896087 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 1 393 1 392 393 434 54.0 1e-121 MISEKMKPLVNNNSAIRAMFEEGKRLAAIHGAENVYDFSLGNPNVPAPAEVNQAVCDILK EEESTYVHGYMSNAGYEDVRQAVAESLNKRFGTQLHQNNILMTVGAASGLNVILKTLLNP GDEVIAFAPYFVEYGNYVRNYDGNLVVISPDTTDFEPNLAEFEQKINEKTKAVIINTPNN PTGVVYSLETLTKMADIMRAKEKELGTTIVLLSDEPYRELAYDGVDVPYVTNVYDNTVIC YSYSKSLSLPGERIGYLVIPDALKDSGEVFNAATIANRVLGCVNAPSLMQRVILRCLDAK VNLEAYDRNRNLLYNGLKDLGFECIKPQGAFYLFVKSPEADEKKFCENCKKYNILVVPGT SFACPGYVRISYCVSYEQIERSLPAFAKVAADYGLTK >gi|157101655|gb|DS480669.1| GENE 76 84563 - 85918 865 451 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 4 445 3 443 456 337 42 3e-91 MKELQQILTTFDDWLWGNWLLFVLLGVGLLYTIITGGIQIRRFGYIIRKTIWEPILSKSK ESKEAGTISSFQALCTAVASCVGSGNIVGVSTAVLAGGMGAIFWMWVAAFVGMATKYGEI ILGMLYREKNEEGNYVGGPMYYIKKGLHAPWLAVLCAFFMVTQIIGGNFIQSNTISGVMK 
ENFGVPYLTTGIALVLIIFVITLGGLKRLAHVAQRLVPTMALIYVVGGLLIILANITKVP GVFADIFAGAFGLRAMAGGAMGSMLIAMQKGVARGLYSNEAGEGSAPVIHSAAQVNHPVE QGITGVAEVFLDTFVICTITGLVLGVTGVLDSGAPGNVLAINAFASVWEPLRYVVAVCLL MFSSTTLMSQWYFGFVGLNYIFGTKVADKFKYVFPCFCIIGALAEIELVWVIQDVALGLL TIPNLIAMVALAPQVRKATKEYFGREAGGRG >gi|157101655|gb|DS480669.1| GENE 77 85945 - 86934 926 329 aa, chain - ## HITS:1 COG:no KEGG:Closa_2398 NR:ns ## KEGG: Closa_2398 # Name: not_defined # Def: PHP domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 11 327 12 319 323 154 31.0 8e-36 MKEYHGFKNCGNWYKGNIHSHTTVSDGMLTPEQSVKLYKDNGYHFLCFSEHDIFTDYREQ FNTEQFIIVPAVESSVVLYRAKGTSERYKIHHLHGILGTQAMQDAAPLGTYGHMQFIPPM KFFGEWTGPEAAQKVADMLRDHGCVTTYNHPIWSRVTEEEFIHTEGISSLEIFNFNTVQE SNTGYDVTYWDRMLRMGKKINAFASDDNHNEGLFEDSCGGWICVQSPELSHDGIVQNFID GNYYSSSGPEIYDWGVRDGKVWIDTSNVNHVNFVCGNIINDGTTVLGERFKDTLNHAEYA LKGHETYIRVEAVDKYGRTAWTNPIYLEW >gi|157101655|gb|DS480669.1| GENE 78 86974 - 88278 1515 434 aa, chain - ## HITS:1 COG:YPO1668 KEGG:ns NR:ns ## COG: YPO1668 COG0477 # Protein_GI_number: 16121932 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 15 430 2 411 411 390 50.0 1e-108 MNNETRAKIVKAAPNRWLVFCILTLSGGIAFKLSSMKDMFYVPMQQFMGLTNTQIGGALS AYGIVQTIGLIAGIYICDMFSKKYMIGGSLIGLGIVGVYLSTFPGYWGFLAAFGVMAILG EVTYWPVLLKAIRLLGDEKTQGRMFGFLEMGRGVVDVIIASSALAIFKAMGEGAAALRGG LLFLSAVTAAAGVLCLIFVPNDEKRVDETGKEVNKAQAAFGGMMQAIKSVDIWAVSLNGF TVYCIYCGLTYFIPFLNQIYFLPATAVGMYGIINQYGLKMVGGPIGGFMSDKVHKSSAKH IRAGFVVCIIAMALFLMVPHESLGQGGNWIIGAACTLAFGAIVFTMRAVFFAPMDEVRVP KEITGAAMSLASLVIYLPNAFAYVMYGSFLDRYPGMAGFRIVFSVMIGWAVIGVGVSTFL IHRIKKCQAAAKAE >gi|157101655|gb|DS480669.1| GENE 79 88686 - 89465 937 259 aa, chain + ## HITS:1 COG:lin0370 KEGG:ns NR:ns ## COG: lin0370 COG1349 # Protein_GI_number: 16799447 # Func_class: K Transcription; G Carbohydrate transport and 
metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Listeria innocua # 1 259 1 253 254 134 34.0 1e-31 MLAKQRYTKILELLDKDGIVHTAELVKLMGVSSETIRKDLEYLDSQGRLSRVHGGAVPAD SGKPADIPGGYISLQIRNSQNLEQKAAITSKAAGLVSEGQVIALDYGSTSQMMALVLKER FRNLTVVTNSIQNALMLAENPGITIILTGGILNRDEYTLVNEFSSTLESIHIDIMFMTVT GIDPVIGCTDQRLSEIRIQNQMHRSATRTIVLADSTKFGKASLLKICPFEEVDTIVTDSG ISPQMESRIRGAGAKLIIA >gi|157101655|gb|DS480669.1| GENE 80 89506 - 90165 806 219 aa, chain + ## HITS:1 COG:L43452 KEGG:ns NR:ns ## COG: L43452 COG1760 # Protein_GI_number: 15672812 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Lactococcus lactis # 2 210 8 216 223 190 47.0 2e-48 MNVFDILGPVMIGPSSSHTAGAARIGRITLALLGAPAVKADIFLHGSFAKTYKGHGTDKA LIAGIMGMATDDSRIRRAPELAREQGLEVSITTGEIDGAHPNTARVTLTDVTGNTVSLLG SSIGGGNILVKEVNGMEVSITGQHTTLIVLHRDAPGTIAAVTEVMADAGVNICNFRLSRQ SRGGEAVMTIEIDGSFGPELNEKIKVLPNIFSSTMLQPI >gi|157101655|gb|DS480669.1| GENE 81 90202 - 91137 939 311 aa, chain + ## HITS:1 COG:BH2496 KEGG:ns NR:ns ## COG: BH2496 COG1760 # Protein_GI_number: 15615059 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Bacillus halodurans # 7 300 2 295 295 251 49.0 1e-66 MEPEFNYSSVASIVAAAEKSGLPISAIVLRQQAEQMEQTEESVYEHMRKNYQVMTECIDP GCNKDLKSTSGLTGGSAYKMRRISENGKSLTGSFLSGALYRALAVSELNAAMGRIVAAPT AGSCGILPAALLTMQAEKQIPERDCVMSLFTASAVGMVIANNASLAGAQGGCQAECGSAA AMAAAAIVELAGGTPKMAEHAIAIAIKNILGLVCDPVAGLVEIPCIKRNASGVAGAFVAA ELALAGIESAIPADEVIWTMKKVGDAMSSTLKETAEGGLAATPTGRRLHEQVFGTVREPK SGCSGCGGCSS >gi|157101655|gb|DS480669.1| GENE 82 91261 - 93672 2370 803 aa, chain - ## HITS:1 COG:no KEGG:Closa_1704 NR:ns ## KEGG: Closa_1704 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 707 1 648 725 372 33.0 1e-101 MTSGSLYFKLLKEDFRRRLWAIALVFLAFFFTLPIGLALNMENAANTNYYRYNDYEPFIQ DAAMTDLQYKTRLLELKTEVVLNGVEYGNGMIAFLMITAAVVIGVSSFAYLHNKKKVDFY HSIPVRREALFGVQFSGGIFIVGAAYGLNLLLLTGVAMSYGVPAGRILGAMAGGWALNML 
YYALMYSTAVAAMMMTGNIVVGVLGTGVFFFLLPGIMLLLCWYCGTFFVTTARYLWSSDQ SPFMWGVRYLSPFSVYINSFGWKMNELSKHVPELICTVLAFLAVTTLDLGLYRKRPSEAA GKAMAFKKSMPPIRILLVLGIGLAGGMFFWSLQSRLRWGLFGTIVSVVLTHCIVEIIYHF DFKKMFSHRLQLGLCLAAGILVFLSFRYDWYGYDSYIPSKEKVVSASLDISVDSRWLSDK TFETGSDGKLQIRYADPYEYIEENMVLTDKEAVMSMVEEGRKRALERRDERLGIRKAVAR VDGYHTNAASSISYIGGADGPTSIFLAMKSGGGEDSDQEKFETNMTVCYRLVNGREVRRS YSLPLSAVMDSYEALYRQDAYKEGMYRILSQEPGQVKQIMYREADQVTDVSQDSRVQEAV LKAYQQDMRELTVEERLVEMPVGSLMFMTDQENTYLQQQRQAARSSFGGIARYQVEDVSQ YWPVYPSFERTITLLKEQGIEPGTCLAPEKVERIAIDVQNYFYEQQGELPEGETLAELQK INPNYQEDGSLIFEDSHTIGLLLEAIAEDESSGMNYLCETVEGGPHCTIRTEKEAGVGGL LIKNRITPEIAELFTGIPMNNGK >gi|157101655|gb|DS480669.1| GENE 83 93647 - 94546 900 299 aa, chain - ## HITS:1 COG:lin2912 KEGG:ns NR:ns ## COG: lin2912 COG1131 # Protein_GI_number: 16801971 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Listeria innocua # 2 283 1 281 295 160 33.0 3e-39 MIEAVNLTKRFDDIVAVDHISATIRDGSVFGLIGTNGAGKSTFLRLASGVLKPDQGCITI DGHEVFENIPAKKRFFYISDEQYFFNNTTPVEMMRYYSKVYQGFDEARFHKLMGNFGLDE KRKIHTFSKGMKKQVSVICGVCAGTDYLFCDETFDGLDPVMRQAVKSIFAADMEDRNLTP IIASHNLRELEDICDHVGLLHKGGILLSKDLDDMKLNIHKIQCVLKPDMKPEDLTSLDKV NVEQRGSLCTITVRGSRSEVEAVMASYEPVFFECIPLSLEEIFISETEVAGYDIRKLIL >gi|157101655|gb|DS480669.1| GENE 84 94539 - 94931 429 130 aa, chain - ## HITS:1 COG:BS_ytrA KEGG:ns NR:ns ## COG: BS_ytrA COG1725 # Protein_GI_number: 16080098 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 1 80 1 80 130 82 45.0 2e-16 MIVLDYRDSRPLYEQVAERLRELMFKGALPQDAQLPSVRSLATELSINPNTIQRAYTELE RQGYIYSIKGKGSFVADNSRMKAGILQEWKGHFNSAVEEGIKVGVTAGEMKEMIDTACRN HAMERGESND >gi|157101655|gb|DS480669.1| GENE 85 95171 - 96910 1583 579 aa, chain + ## HITS:1 COG:BS_lytS KEGG:ns NR:ns ## COG: BS_lytS COG3275 # Protein_GI_number: 16079945 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Bacillus subtilis # 9 556 8 579 
593 320 36.0 5e-87 MSNTMFFNLLLNIGLLVLIATMLTKLPMVRSMLLEDSHAMGSRAALAVIFGLVSIFSTYT GVRAQGAIVNTRVIGVLAAGLLGGPYVGIGAAVIGGLHRYLFDIGGFTAVSCAVSTFVEG LIGSAFSKRFKAGKIDGAGVFLITALAEIGQMAIILLISRPFPAALDLVRLIAFPMIIMN ALGMVVFLGTFNMVFMEEDSQFAEKMRLAMGIVDQSLPHLRKGLYSTADMEATADIIYQS TSCAAVMITDAQKILAMKGENWYDVLRDEAVLKPILSSIHDRKPTTFSYAEKTDPLYHIL KNHIIVAAPLIEMDKPVGSLTMMVKKHWHTSQSNLDFASELARLFSTQLELSDLDYQKQL RRKAEFKALQSQVNPHFLYNALNTISCVCRENPDRARELILTLSSYYRQALENDQYMLSL HTELYHVASYLELEQARFEEKLVVELDVEDDLNCKVPSFILQPLVENAVRHGGDRTGARH VSISVHSVEGMAEISVADRGQGIPREVVLGLYTGQGKGVGLSNVHKRLKSIYGEDNGLKI ETSEAGSRVWFRIPLEPEEIPVPSHEIKQEESYNEDSCD >gi|157101655|gb|DS480669.1| GENE 86 96891 - 97721 730 276 aa, chain + ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 5 276 6 240 240 127 27.0 3e-29 MKIAVIDDERPARKELVHQIMDAMPGSQIEEADSGASAIELISRRTFDLLFVDISLGDMD GTTLAAAARRILPDAQIVFATAYSQYGVKAFELGVNNYILKPFDPDRVRRVLEKCQKDLQ KNQALTASAVCSCDDHQPWGESCRPASSSLYASQQEAGDPVTAAFPASRMPINMNRTIIL VDIQQIVYIETSGRSCIIHTPTRDYTENLLLGEYEKRLAPHGFCRIHKSYLVNLSFITEL FPWANNSLAVKMQGFEKNILPVSREKSKMLRQIIGV >gi|157101655|gb|DS480669.1| GENE 87 97881 - 99320 1712 479 aa, chain + ## HITS:1 COG:MA1905 KEGG:ns NR:ns ## COG: MA1905 COG1966 # Protein_GI_number: 20090754 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 1 443 1 448 479 469 55.0 1e-132 MISFFVCLAILIIGFMLYSKVAEKVFGPDDRKPPALSMTDGSDYVPMGTPRIFLIQLLNI AGLGPIFGALAGACWGPSVFLWITFGTLLGGGVHDYMVGMMSMRHKGASVSELTGTYLGN AMKQVMRVFSVVLLVLVGVSFSTGPANLLAMLTPNVLDAKFWLAVVLIYYFIATFVPIDK VIGKIYPIFGICLIVMALGVGGAIILGGYHIPEITLSNLHPSGTPIWPTMFISVACGAIS GFHATQSPLMARCLTNEKDGRKVFYGAMVAEGVIALIWAAAGVAFYNGTGGLLEALGNGQ SSVVYEICFKLLGPVGAVIAMIGVIACPISSADTAYRSARLTLADWFKIDQKPIKNRLAL 
TVPLLAVGAVLTQVDVQVVWRYFSWSNQTLAMIALWAASVYLFRRKRFYWLTALPATFMS AVSCTYILMAKEGFQLSTAIAYPAGIIFAAVCLGTFVFTCVMGKGKNADSPAVEEPYGV >gi|157101655|gb|DS480669.1| GENE 88 99441 - 101327 1869 628 aa, chain - ## HITS:1 COG:no KEGG:Closa_1353 NR:ns ## KEGG: Closa_1353 # Name: not_defined # Def: ATP-dependent OLD family endonuclease # Organism: C.saccharolyticum # Pathway: not_defined # 1 615 1 615 618 852 67.0 0 MQLTYLRIHNFKSIRSMEIRDIERALILVGKNNTGKTSVLDAICAVCGCYEIQDRDFNEN RQAIRIDACFSIEEEDLHLFHHMGIVSQYRRYDVWRRVFSERLPSFQNGELSFTFHVNQD GKVRYEDLYRKNNPYIHMVMPRIYRINAERELNQLQNDLLMFQEDEELRRLRSGSCIFER AKKCNHCFQCIGLINKKKPEELSAFETARLLEFKIYQMNLSGFSGKVNENFFKNGGYEEI QYTLSCDADQLFNVEVTAHNRQRGTVKPVELMGKGMRSIYMLSLLETYISEQGRIPSIIV VEDPEIFLHPQLQKTCSEILYRLSKKNQVIFKTHSPDLLFNFSIRQIRQVVLDDERYSVV RPRTNMSEILDDLGYGANDLLNVSFVFIVEGKQDKSRLPLLLEKYYSEIYDEEGNLYRIS IITTNSCTNIKTYANLKYMNQVYLRDQFLMIRDGDGKDPEELASQLCRYYDERNLEDVDR LPKVTRKNVLILKYYSFENYFFNPAVMARLGIVESEDVFYRTLYGKWREYLYRIRSGQQL MEVLGRDFSSPEDMKEHMEEVRTFLRGHNLYDIFYGPFRDREKEILRAYIDLAPKEDFKD ILDAIDRFVYFDSRKRPEDERQAEKSMM >gi|157101655|gb|DS480669.1| GENE 89 101529 - 102644 1023 371 aa, chain - ## HITS:1 COG:no KEGG:Closa_1256 NR:ns ## KEGG: Closa_1256 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 6 309 2 297 320 247 47.0 6e-64 MKKWKKKRTWTLAMTLVLLALLSLSACGSKLSNTAATAETMAAGYAMDVREAAITMAPAG EDLNGLKAESGSGLSPSPDIGSPGEAGRKLIRDVSMNVEARDFDGVLSQITDQVGELGGY VESSDVSGVSVDSYGGSQQRHADIRARIPADRLDRFVETVESAGNVTSKQEQVTDVTLQY SDTESRKKSLEIEQERLWALLEKAESLDAVVALEARLSEIRYELESYTSQLRLYDNQVHY STVSIYMREVKDLTPTAPDSIGTRIQKGFNRSLNNLGEAVTDLIVWLAVNSPILFVLAVI IGAVVLIVRRLSRKLRGGNRSPRRFSPFRGKPMGGKKGAKDQKEQAEKGQPEKEPPEKEP PEREQAEKEQN >gi|157101655|gb|DS480669.1| GENE 90 102859 - 104466 1501 535 aa, chain - ## HITS:1 COG:VC2738 KEGG:ns NR:ns ## COG: VC2738 COG1866 # Protein_GI_number: 15642731 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Vibrio cholerae # 3 535 10 541 542 816 
73.0 0 MMANVDLSQYGITGTTEIVYNPSYEMLFEEETKPTLEGYEKGQVSELGAVNVMTGIYTGR SPKDKFIVMDENSKDTVWWTTDEYKNDNHPASQEAWDTVKKIALEELSNKRLFVVDAFCG ANKDTRMAIRFIVEVAWQAHFVKNMFIQPSAEELENFKPDFVVYNASKAKVENYKELGLN SETAAMFNITSREQVIVNTWYGGEMKKGMFSMMNYYLPLKGIASMHCSANTDMNGENTAI FFGLSGTGKTTLSTDPKRLLIGDDEHGWDDNGVFNFEGGCYAKVINLDKDSEPDIYNAIR RNALLENVTLDADGKIDFDDKSVTENTRVSYPIDHIEKIVRPVSAAPAAKDVIFLSADAF GVLPPVSVLTPEQTQYYFLSGFTAKLAGTERGITEPTPTFSACFGQAFLELHPTKYAEEL VKRMEKSGAKAYLVNTGWNGTGKRISIRDTRGIIDAILDGSIAKAPTKQLPFFNFEIPTE LPGVDPKILDPRDTYAEVSQWEEKAKDLAQRFVKNFAKYEGNEAGKALVSAGPQL >gi|157101655|gb|DS480669.1| GENE 91 104704 - 105168 577 154 aa, chain - ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 1 152 1 150 152 116 40.0 2e-26 MIFDEKKNLDFYRNLGIEGRYAKAVDFLQNTDLEALEPGKYEIDGKNVYANVTAYTTIPW EEAKYEAHEHYTDIQYVIEGSEIMSYAPVHEMTVKVPYNPDKDVVFFENTTPGLRVVTGA GQYLIFNPWDAHKPKAANGEPAPIKKVIVKIKED >gi|157101655|gb|DS480669.1| GENE 92 105254 - 106087 822 277 aa, chain - ## HITS:1 COG:BS_ykvT KEGG:ns NR:ns ## COG: BS_ykvT COG3773 # Protein_GI_number: 16078446 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall hydrolyses involved in spore germination # Organism: Bacillus subtilis # 156 263 91 195 208 74 37.0 2e-13 MLTLHSFLRSVVLFVKSLTVKVTKQMYRSSSVVMAGMMVVAIIAFSANGFGGGGRNALAA PMTEESDAIDMAEEENEGSGLVTEAKVQFGHLNTDSEGQHLAGALLEADVREKQRKQAAA QTQIETLQKQILKERQEEEARKKAEEERRAARRIKYTDEDYQVLLRIVQAEAGICDPKGK ILVADVIINRVLSGKFPDSVKAVVYQPSQFQPVSNGTINTVKVTAETIECVDRALAGEDY SNGALYFMNRRASGSAASWFDRRLTYLFAHDGHEFFR >gi|157101655|gb|DS480669.1| GENE 93 106681 - 108252 1714 523 aa, chain + ## HITS:1 COG:PM0588 KEGG:ns NR:ns ## COG: PM0588 COG0591 # Protein_GI_number: 15602453 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Pasteurella multocida # 5 480 8 
479 500 171 29.0 5e-42 MTVKLLMLVIFFAIMLGVGFYSRRHATNVNDFVLGGRGVGPWLTAFAYGTSYFSAVVFIG YAGQFGWKYGLASTWIGIGNALIGSLLAWVILGRRTRVMSKHLSSATMPEFFGSRFHSQG MKIVASLIIFVFLIPYTASVYNGLSRLFGMAFDIPYSVCVMGMAILTAVYVILGGYMATA INDFIQGIIMLAGIAAIILSVLSGQGGFTAAVGRLAQIPSEVPVTLGQPGAYTSFFGPDP LNLLGVVILTSLGTWGLPQMIHKFYTIRSEKAIATGTVISTLFAIVISGGSYFLGGFGRL FDSPAIYNGNGSVAYDTIIPAMLSGLPDLLIGIVVMLVLSASMSTLSSLVLTSSSTLTLD FIKGSLVKNMKDKTQLVLMQCLIVAFIVISVVLALDPPDFIAQLMGISWGALAGAFLAPF LYGLYWKRTTPAAVWASFATGIGITVSNMFLHYIASPINAGAVSMIVGLIIVPLVSLLTP RMDSRDINAIFSCYDNLVEATTKKVLPDDQSFDSPAFKKRKKK >gi|157101655|gb|DS480669.1| GENE 94 108342 - 109130 937 262 aa, chain - ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 1 261 1 261 264 187 46.0 2e-47 MVTGIIIAAAVVGILGILIGVFLGIASEKFKVEVDEKEILVRNELPGNNCGGCGYAGCDA LAKAIAAGQADVGACPVGGASTAEKIGAIMGVAGGAAEKKVAFVKCKGSCDKTAVQYNYY GVDDCKKVSVVPGAGEKACAYGCMGYGSCVRACAFDAIHVVDGVAVVDKEKCVACGKCVA SCPNHLIELVPYKAEHLVQCSSHDKGKDVKAVCAAGCIGCTLCTKQCEFDAIHMENNLAV IDYEKCTNCGKCAEKCPVKVIV >gi|157101655|gb|DS480669.1| GENE 95 109147 - 109722 716 191 aa, chain - ## HITS:1 COG:FN1592 KEGG:ns NR:ns ## COG: FN1592 COG4657 # Protein_GI_number: 19704913 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Fusobacterium nucleatum # 19 190 21 192 194 192 62.0 4e-49 MKELLLIAIGSALVNNVVLSQFLGLCPFLGVSKKVETSAGMGAAVIFVITIASAVTSLVY TGILVNLHLEYLQTIVFILVIAALVQFVEMFLKKSMPSLYEALGVYLPLITTNCAVLGVA LTNVTKSYNFVQSVVNGIGISVGFTIAIVMLAGVREKIEHNDVPYSFQGSPIVLITSGLM AIAFFGFSGLI >gi|157101655|gb|DS480669.1| GENE 96 109735 - 110526 896 263 aa, chain - ## HITS:1 COG:TM0247 KEGG:ns NR:ns ## COG: TM0247 COG4660 # Protein_GI_number: 15643019 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 
# Organism: Thermotoga maritima # 8 188 7 185 200 207 57.0 2e-53 MNNCSERLYNGIVKENPTFVLMLGMCPTLAVTTSAVNGLGMGLSTTVVLLFSNMIISALR KIIPDRVRIPAYIVIVASLVTVVQLLLQAYVPSLYSALGIYIPLIVVNCIILGRAESYAS KNGPLVSTFDGIGMGLGFTLALTCIGLVREILGSGAIFGNVVIPEDYHIAIFVLAPGAFF VLAVLTALQNKFKAPSATNGSVPQSRLACGGNCMECTGSSCMSNHEVLETRKRQAEEKAL AAKKEAVAKKEAELKAATSKKEN >gi|157101655|gb|DS480669.1| GENE 97 110542 - 111153 862 203 aa, chain - ## HITS:1 COG:FN1594 KEGG:ns NR:ns ## COG: FN1594 COG4659 # Protein_GI_number: 19704915 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Fusobacterium nucleatum # 10 200 9 173 177 71 30.0 1e-12 MNKAGFMKDALILFVITLVSGCLLGGVYQVTKEPIEKATIAANNKAYKAVFNEAESFEPD DALTAAVEGCNADLAGMDFGGVKVENVLKAVDGSGSQLGYVITSLSNDSYGGAVKVSVGI KEDGTITGIEFLEINDTPGLGLKAKEPKFKDQFIGKNAESLSVTKMGNADDTQINAISGA TITSSATTNAVNAALYYLHNCIK >gi|157101655|gb|DS480669.1| GENE 98 111156 - 112091 1208 311 aa, chain - ## HITS:1 COG:FN1595 KEGG:ns NR:ns ## COG: FN1595 COG4658 # Protein_GI_number: 19704916 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Fusobacterium nucleatum # 1 311 1 311 314 211 42.0 1e-54 MSKLLNVSSSPHVRSHETTHSIMLDVAIAMLPATAFGVYHFGMHALLVLIVTVAACVLSE YVYEKLMKKPSTISDCSALVTGLILALNMPPEIPVWIPALGGIFAIIVVKQLYGGLGQNF MNPALAARCFLLISFAGKMTNFSYSGFDGVSGATPLAVLKSGGTVDVPAMFVGNIPGTIG EVSVIALLIGAAYLLAKKVISIRIPGTYILTVVVFAILFGGHGFDLNYIAAHLCGGGLIF GAFFMATDYVTSPITPKGQIVFGILLGVLTGLFRIFGGSAEGVSYAIIISNILVPLIEKV TLPKAFGKEGK >gi|157101655|gb|DS480669.1| GENE 99 112107 - 113432 1424 441 aa, chain - ## HITS:1 COG:FN1596 KEGG:ns NR:ns ## COG: FN1596 COG4656 # Protein_GI_number: 19704917 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Fusobacterium nucleatum # 1 438 7 441 441 325 40.0 8e-89 MGLATFIGGIHPYEGKELSENKPVQVLMPKGDLVYPMSQHIGAPAKPLVAKGDHVLAGQK 
IGEAGGFISANVIASVSGTVKAIEPRRMANGAMVPSIVVENDGEYKTVEGVGEDRDPSGL SKEEIRNIVKEAGIVGLGGAGFPTHVKLTPKDENAIEYILVNGAECEPYLTSDYRMMLEE PEKIVGGLKVILQLFDNAKGVIGIENNKPEAIKKLTEMVKDEPRITVCPLMTKYPQGGER SLIYAVTGRKVNSSMLPADAGCIVDNVDTVISIYMAVCKSTPLMRKIITVTGDAVADPRN FSVKLGTNYQELLDAAGGFKTEPEKVLSGGPMMGQAIFDLNIPVSKTSSALTCFTKDQVA EMEPSACIRCGKCVSVCPSHIIPVMMMQAALRDDCETFEKLNGMECMECGSCTYICPAKR PLTQAFKEMRKTVAANRRKKG >gi|157101655|gb|DS480669.1| GENE 100 113629 - 115317 191 562 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 325 548 132 355 398 78 27 4e-13 MMLAFFCVVLNTASTLAGSYMLRPIINTFIAPLTGGPGDPAGLARALAVLAAVFAVGVLA NYAQAKVMLTVAQNALQKIRNDLFGRMQKLPVRFYDTNSNGDLMSRFTNDVDTIGQMLSS TLVQLFSGALSIIGTLFLMVYTNLILTLVTLVMIPIMMKAGGAVAGWSQKYFTAQQTSLG AVNGYIEETITGQKVVKVFCHEDMAREEFGILNHDLRHNQIRAQFFGGIMGPVMNSLSQI NYSLTACVGGLLCVLRGFDVGGLTVFLNFSRQFSRPINEISMQVSNVFSALAGAERVFAV MDEEPEPANTRDCIVPSPMIGHVVFHHVTFGYDPDKTILRDINLYAKPGQKIAFVGSTGA GKTTITNLLNRFYDIQSGSITIDGADIRTISRDALRSNIAMVLQDTHLFTGTVRENIRYG RLDATDDEVIQAAKTASAHSFIMRLPHGYDTMLEGDGANLSQGQRQLLNIARASISKAPV LILDEATSSVDTRTEKHIEHGMDRLMADRTTFIIAHRLSTVRSANAIMVLEQGEIIERGT HEELLEMKGRYYELYTGMKELE >gi|157101655|gb|DS480669.1| GENE 101 115509 - 117248 181 579 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 333 551 279 502 563 74 25 5e-12 MKRYRKYIKPYLGAFILGPLLMLTEVAGEIMLPKLMSMIINNGVVQRNVQYILSVGGLMI LATLIMAIGGIGGAYFSVKAAVSFTSDLRDDLFSKVQEFSFKNIDSYSTGSLVTRLTNDI QQIQNVLMMGLRMAMRAPGMFIGALIMAFMMNARLAVVILVVIPLLTAAIAVILKTAFPR FTAMQKKLDQLNSGIQEALTNVRVIKSFVREEFEEEKFRDMNQDLKNSSLDAMKIVIVTM PVMTLAMNITTLAVVWYGGNIIIAGDMPVGDLTAFTTYIVQILMSLMMLSMVFLQLSRAV ASIRRVGEVMDTEIDLTDRDASRKDLAVLKGKVEFKHVSFSYTDNRDEMVLEDINFTAEP GQIIGIIGSTGSGKTTLVQMIPRLYDATKGQVLVDGVDVREYSLKNLREGVGMVLQKNVL FSGTIEENLRWGKEDASMDEIREMAQSAQADSFVTSFTNGYDTDMGQGGVNVSGGQKQRL CIARALLKRPKILILDDSTSAVDTATEARIRQCFNTSLKDTTKIIIAQRIGSVEEADRIL VLDEGRLIGQGTHEQLMEECKAYQEIYYSQRDRKETAVS >gi|157101655|gb|DS480669.1| GENE 102 117249 - 117734 387 161 aa, chain - ## HITS:1 COG:FN2010 KEGG:ns NR:ns ## COG: FN2010 COG1846 # Protein_GI_number: 19705306 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Fusobacterium nucleatum # 45 151 36 142 160 65 35.0 3e-11 MDKASRSCFVQSPEHCLVEKYIRVARLHRSMMERRLDGTGVYRSQHQILMFVSDNPNVSQ KELARMYGVSGATIAVSLKKLERGGYIRRLVDQEDNRCNQICITDKGRKVVEDSVKIFRQ MESRMFEGFSENDMKVLGQLLDRIYGNLDREFTDRAQREES >gi|157101655|gb|DS480669.1| GENE 103 118018 - 118422 251 134 aa, chain + ## HITS:1 COG:no KEGG:Hac_0481 NR:ns ## KEGG: Hac_0481 # Name: not_defined # Def: alpha (1,3)-fucosyltransferase fragment 3 (EC:2.4.1.214) # Organism: H.acinonychis # Pathway: not_defined # 5 88 239 322 409 63 36.0 2e-09 MTDSDKLDVILTEIIDMKTDIRDMKTDIQGVKTEMQGMKSDILGVKTEMQGMKSDILGVK AEMQGMKSDIQNIQSDIKSLNTRMDSLEFQIKSTERVLRNQITKSETLILGEVERVHLIL DQHIHNQTMHTALA >gi|157101655|gb|DS480669.1| GENE 104 118563 - 119186 678 207 aa, chain - ## HITS:1 COG:CAC3341 KEGG:ns NR:ns ## COG: CAC3341 COG0655 # Protein_GI_number: 15896584 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 207 1 208 208 282 64.0 3e-76 MKVLMLNGSPHEKGCTYTALAEVAGELNQAGIETEIMHVGGGTVHGCMGCGACGKLGKCI YSDDKVNEAVEKMRQSDGLIVGSPVHYASAGGAITSFLDRFFYSGSSAAAHKPGAAVASA 
RRAGTTATLDQLNKYFMITQMPVVSSQYWNMVHGQCPEDVKKDEEGMQIMRVLGRNMAWM LKSIEAGKAAGITLPETEEKKRTNFIR >gi|157101655|gb|DS480669.1| GENE 105 119190 - 122342 2969 1050 aa, chain - ## HITS:1 COG:FN1160 KEGG:ns NR:ns ## COG: FN1160 COG0553 # Protein_GI_number: 19704495 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Fusobacterium nucleatum # 44 1050 75 1089 1089 538 32.0 1e-152 MKIVRMNTSSFWKGEAGVKGLVEDQGETFEVNIYLGSGRVRDYSCSCSKGNSYRGMCAHG EALFAYYNQQREEASKPTVHTSSQVHTMIREYTNREVALILAEEVDAQVRLEPVLILDGK DTRLEFEVGITRFYAVRDLRAFKEAVENGAHVAYGKDLSFHHHKSAFTDSSRELLALLMG GVQNQKAVRSLTLNRMNRDQFFEIMSGRTAKVQLPGGNRVMMDMEDSDPVVSLKVEKTGR DGLKASLMGVAPMKGSEGPRQVAGCFRGERFLYVVSGQRLYRCSESCTQVMGLFMEQMCM ERDESVMVGQRDIPLFYERVIKHILPYCRLMLEDVDFKDYEPEPLKVSFRFDTGEDGALV MEPTLAYGGYEFHPLEDENLPRTICRDVPGEFRVSQLIHKYFKYKDPEGIRLVIRNDEDE IYRLMTEGMDEFRSMGDVYVSENLRQWKVLAPPRVTVGASAAGGWLELDVDMGDMNSQEL NRILAAYSQKKKYYRLKNGQFLGLDEGGLTVISRMASELGVTRKELQSGKVRLPAYRAFY LDYLLKESTGVTYYRDQMLKAMVRSVKSVEDSDFTAPERLRGVLREYQRIGYVWLRTLDS YGFGGILADDMGLGKTIQIIALLEDAYGSGEQSPSLIICPASLVYNWEHEIRRFAPDLKV LSVVGSSSEREVLLNEVGRNPQDYQVIVTSYDLLRRDVGLYEAIHFRYQVIDEAQYIKNA STQSARAVKSLDVQTRFALTGTPVENRLGELWSIFDYLMPGFLFGSQFFKREYEIPIVRE GDGAALKRLKRLIGPFVLRRVKKDVLKELPDKMEEVVYSNFETEQKKLYAANAAKFKEKL STGGFGQAGEGKLQILAELMRLRQICCDPRLCYDNYRGSSAKLETCMDLVRRGVAGGHKI LLFSQFTSMLDIIHTRFEKEGIMSHMLTGATSKEERIRLVGDFGKDEVPVFLISLKAGGT GLNLTAADIVIHYDPWWNVAAQNQATDRTHRIGQDKQVTVYKLITRNTIEENILKLQEAK SHLADAVVPEGTISFGSLTRDDILNIIKEE >gi|157101655|gb|DS480669.1| GENE 106 122426 - 124213 2124 595 aa, chain - ## HITS:1 COG:CC3359 KEGG:ns NR:ns ## COG: CC3359 COG0018 # Protein_GI_number: 16127589 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Caulobacter vibrioides # 1 595 1 600 600 468 42.0 1e-131 MNKILDMMERELKQAFTASGYEDSFAKVVLSNRPDLCEYQCNGAMAAAKVYKKKPIDIAN QVVEHLVSGGSHPVFSEAEAVMPGFINLKLSEAFLAEYTGFMAQSDKLGLEAPEKPETVI 
VDYGGANVAKPLHVGHLRAAIIGESIKRMGRFLGHHMIGDVHLGDWGLQMGLIIEELRDR KPELVYFDESFEGPFPQEAPFTISELEEIYPAASAKSKADEVFKERAHQATLKLQRGYAP YRAIWQHIMAVSVADLKKNYANLNVEFDLWKGESDAEPYIGDMIQMLVDKGLAHESQGAL VVDVAQDTDTKEIPPCLVRKSDGASLYATSDLATIVEREQDFKPDRYIYVVDKRQGMHFE QVFRVAKKAGIVKEDTPMIFLGFGTMNGKDGKPFKTREGGVMRLEKLIEEINEAVYQRIM ENRTVSEDEARSTAAVVGLAALKYGDLSNQAAKDYVFDIERFTSFEGNTGPYILYTIVRI KSIIAKYRENGGQVSQDAVEKKILACAGSSEKALMLMLARYNEVLENSFAETAPHKICQY IYELANAFNSFYHDTKILAEEDEARKESYIGLISLTRRVLEACIGLLGIEAPERM >gi|157101655|gb|DS480669.1| GENE 107 124336 - 124893 542 185 aa, chain - ## HITS:1 COG:CAC2094 KEGG:ns NR:ns ## COG: CAC2094 COG0231 # Protein_GI_number: 15895364 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Clostridium acetobutylicum # 1 185 1 185 186 212 58.0 4e-55 MISAGDFRNGITLEIDGQVVQIMEFQHVKPGKGAAFVRTKLKNVINGGVVERTFRPTEKF PQARIDRVDMQYLYADGELYNFMNQETYDQVALNQDIIGDALKFVKENEVCKVCSYNGNV FSVEPPLFVELEITETEPGFKGDTATGANKPATVETGATVYVPLFVEIGDKIKIDTRTGE YLSRV >gi|157101655|gb|DS480669.1| GENE 108 125026 - 126435 1363 469 aa, chain - ## HITS:1 COG:MK0117 KEGG:ns NR:ns ## COG: MK0117 COG0169 # Protein_GI_number: 20093557 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Methanopyrus kandleri AV19 # 1 288 7 290 290 182 39.0 9e-46 MDGKTRVCGLIANPVEHSMSPMMHNFFAQRTGVNLAYVPFKVEEDRVGDAVKGAYALNIL GMNVTVPHKQRVMEFLSELDEDARAIGAVNTLVRTEAGYKGYNTDGAGLKRAFAEAGIRT EGEHCLLIGAGGAAKAAAYVLAKGGAAKIHILNRNEARARELAGYINILTGRGLAEALPL EGYRNLPARKGGYLAVQSTSVGMHPHVEDVIIDNPRFYEMIHTGVDIVYTPARTRFMKMV QEAGGRAVNGLDMLLYQGVIAYELWNPQVRVDSGTIELAKQMIENYLQKKQAGEGRGDNL ILIGFMGAGKSRVGEHFAARYQMPIIDTDKEIEAAAGMAISDIFATQGEEAFRRLETGVL EKLLEQGGRSVISVGGGLPLREENRALLKQLGTVVYLDVLPETVMERIGSDVSDRPMLHG GDVMGRIVSLLESRKPHYLEASHIIVDVNGRDVDDIVEEIYRRAADKTC >gi|157101655|gb|DS480669.1| GENE 109 126445 - 126963 586 172 aa, chain - ## HITS:1 COG:BH1322 KEGG:ns NR:ns ## COG: BH1322 COG2179 
# Protein_GI_number: 15613885 # Func_class: R General function prediction only # Function: Predicted hydrolase of the HAD superfamily # Organism: Bacillus halodurans # 1 162 1 163 171 117 38.0 2e-26 MLEMFYPRRYEVSTYVIPFDYYHAQGMQGVIFDIDNTLVPHDAPADGQAVELFERLRAMG MKTCLLSNNKEPRVKPFADFVGSCYIHKAGKPGVKGYEKAMELMGTDREHTLFVGDQLFT DVYGANRAGLYSILVRPMNPREEIQIVMKRYLEKPVLYFYKKHARDINQYRE >gi|157101655|gb|DS480669.1| GENE 110 127000 - 127452 511 150 aa, chain - ## HITS:1 COG:CAC1698 KEGG:ns NR:ns ## COG: CAC1698 COG1327 # Protein_GI_number: 15894975 # Func_class: K Transcription # Function: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains # Organism: Clostridium acetobutylicum # 1 150 1 150 151 166 58.0 1e-41 MKCPYCNEADTKVIDSRPADDNSSIRRRRQCERCGKRFTTYEKLETMPLMVIKKDNSRET YDRSKIEAGIIHSCHKRPVSPQQINSMIDEIENQIFNMEEKEVPTSAIGELVMGKLKDLD EVAYVRFASVYREFKDMNTFIDEIGKLLKK >gi|157101655|gb|DS480669.1| GENE 111 127635 - 128393 612 252 aa, chain + ## HITS:1 COG:lin0494 KEGG:ns NR:ns ## COG: lin0494 COG0710 # Protein_GI_number: 16799569 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase # Organism: Listeria innocua # 1 252 1 252 252 234 46.0 8e-62 MKNVIIKQTILGRGMPKICAPLVEKDYESLIHQAASFTEQPVDIAEWRADWFDSILEPDT LDQVLPGLESALKDIPLLFTFRTQREGGHLSAPLPAYRSLAEHAICSGHVHLVDLELFSG DDMIRETIDLAHRHQVSVILSNHDFAATPKEEEILRRLHHMEDLGADIAKIAVMPQSAGD VLTLLSATHKASQSLSCPLITMSMKGTGLISRLSGEVFGSCLTFGSVKEASAPGQIEVGK LKEFLTAIHQNL >gi|157101655|gb|DS480669.1| GENE 112 128627 - 128893 255 88 aa, chain - ## HITS:1 COG:BS_ylmC KEGG:ns NR:ns ## COG: BS_ylmC COG1873 # Protein_GI_number: 16078600 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 3 79 2 78 81 68 41.0 2e-12 MDVRIYDLKQKEVINVKTCKRLGFVGDVDFDMETGCLLALIVPGPGCICGFLGREKEYVI PFCDICQVGNDIILVDIKEKDVTESIKC >gi|157101655|gb|DS480669.1| GENE 113 129041 - 129841 449 266 aa, chain - ## HITS:1 COG:no KEGG:Closa_2337 NR:ns ## KEGG: Closa_2337 # Name: 
not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 166 243 12 89 115 68 49.0 2e-10 MENENGNRYGTGEGQGTDYHTGNREPVIVAEQPAREVTEEDSGQTSAGQDGSDGWSSPVD APLQEDRWSETGDKPYLGSPWEPTPSGASYDAGKENGVSGNDSQPVYKPAPASYGQGQPG ANQGQPGINQGQPGTGQGGDFSQNDSRRERGGSQPGGNGQCQPQYQTQYQTQYQSYQPAY QEKNNMATASLVMGILSLCSICCCMLFGVVFGVLGIIFAIMSKRGDRMDSQAKAGLILSI IGVAVTVLIIIFFVVIEVLAVIPELN >gi|157101655|gb|DS480669.1| GENE 114 130038 - 130961 1220 307 aa, chain - ## HITS:1 COG:PA5505 KEGG:ns NR:ns ## COG: PA5505 COG1464 # Protein_GI_number: 15600698 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Pseudomonas aeruginosa # 53 307 9 260 260 222 50.0 8e-58 MKKNILGTAVLAAVLVSGALSACSGSPKETAAADTTAATAGTQAADSKEEETTAAEAETT QASAELKEIVVGASPAPHAEILNAAKEVLASKGYELKIVEYTDYVQPNNALDSGDLDANY FQHKPYLDSFNAQNGTKLVSAGAIHYEPFGIYAGKTASLEELPDKATVLVPNDVSNEARA LLLLEAQGLIKLKDGVGLEATKNDIVENTKNLDIVELEAAQLPRSISDGDIAVINGNYAI EAGLKVSDALATEDSQSLAATTYGNVVAVREGEESSDATKALVEALTSPEVKQFMEETYE GAVVPLF >gi|157101655|gb|DS480669.1| GENE 115 131216 - 131881 793 221 aa, chain - ## HITS:1 COG:FN0659 KEGG:ns NR:ns ## COG: FN0659 COG2011 # Protein_GI_number: 19703994 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Fusobacterium nucleatum # 3 220 14 231 233 184 48.0 1e-46 MQFDATTINMLVKGIWETIYMVFLSSALSYVIGIPLGIALVVTDREGISPVPLFNKVLGL IINLLRSVPFIILLIMVLPITKFIVGKTIGSNATVVPLIIAAAPYIGRMVESSLKEVDAG VIEAAKSMGASTWQIIVKVLLPEAKPSLLVGAAISVTTILGYSAMAGFTGGGGLGDIAIR YGYHRYQTDMMMVTVVLLVIIVQLIQEVAMRMSRKSDKRIR >gi|157101655|gb|DS480669.1| GENE 116 131883 - 132899 1268 338 aa, chain - ## HITS:1 COG:BH3481 KEGG:ns NR:ns ## COG: BH3481 COG1135 # Protein_GI_number: 15616043 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Bacillus halodurans # 9 337 1 335 338 
293 47.0 4e-79 MEANNENPIIQLVGLGKQFQTMNGPVTALEDINLEIRYGEVFGIIGLSGAGKSTLVRCIN YLEVPTSGKVVFEGKNLSVMKDREKRLARQSMGMIFQQFNLLSQRNVLQNVCFPLEIAGV SKAEAKKRAEELLTLVGLEVRMKAYPAQLSGGQKQRVAIARAMATNPKVLLCDEATSALD PNTTKSILELLKKINREMGITVIVITHEMAVIEAICDRVAIIDHSHIAEVGNVSDIFSGP KSDIGRQLILGDVAEQNLSFGNSRQIRIIFDGRESSEPVIANMVLACKVPVNIMHADTRD IEGKAMGQMIIQLPEDDTDAGRICNYLKTANVKFEEVR >gi|157101655|gb|DS480669.1| GENE 117 133324 - 134469 601 381 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935647|ref|ZP_02083022.1| ## NR: gi|160935647|ref|ZP_02083022.1| hypothetical protein CLOBOL_00537 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00537 [Clostridium bolteae ATCC BAA-613] # 17 381 1 365 365 644 100.0 0 MVRLGCVPEQGGPSAYMENAKAARTAPPGRQEQTDDLGKNGPETTAVKSNDSACRDQAVW SWTEGADDWREQAVRMLEQLKKQYPGIVFHLAQEKDAGDLAALAVRAGTGVHLYLSREFV GRMGSSREDFSQCWSELVSAAGTLLAKQGQTRAVGAFLGEEGTSYWSVQDKTEKEAVPVV SHTGAKETPSAGSDPRVRVSVSLNVSSHFSRVARARTKGQVQEAVWDIYRSISNLKRAAA CGDDQQRVKASRALRSLQKLLGRSGRKMRKLERERLKAQEQKKAEKEHQDKRAFRLKQEK RRMRSARLGSDAGLVKEGHGDDFYIQAYRRFSGTAYKWEHQMPAGTMAGLFDAVPSGGSG MAGTAEAGFTAADVIVSGDTL >gi|157101655|gb|DS480669.1| GENE 118 134526 - 135929 1412 467 aa, chain - ## HITS:1 COG:SPy1070 KEGG:ns NR:ns ## COG: SPy1070 COG0624 # Protein_GI_number: 15675062 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Streptococcus pyogenes M1 GAS # 10 465 10 467 469 276 37.0 5e-74 MYRNEIEGFIDSHREEMIEDICTLCRINSEKMPYVEGKPYGEGPFQALQAALGMAEGYGF SIRNYDNYVGTADLNDKERQLDILAHLDVVPAGEGWTETKPFEPVVKDGKLFGRGTADDK GPAVAALYAMRAVKELGIPLNKNVRLILGTDEECGSSDIVNYYDKEDEAPMTFSPDAEFP VINIEKGRLEGHFHASFQASEAMPRLVKVEAGIKANVVPGKARAVVEGIDLKTLEAAAEA VEKETGISFVLEGDLPVMTVTAQGEGAHASTPQEGKNALTGLLVYLTRLPFAECPQMDMV KNLLKLFPHGDTCGKAAGISMEDELSGALTLAFSMLTVGADGLEGVFDSRCPICATEESV LGVVKQAMADQGLTLENDSMIPPHHVDGNSHFVRTLLSVYEDYTGLEGSCQATGGGTYVH SLKNGVAYGAALPGTDNRMHGADEFAVVDELVLSAKIFAQVIVELCS >gi|157101655|gb|DS480669.1| GENE 119 
136071 - 136517 211 148 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 [Lactobacillus johnsonii NCC 533] # 1 137 1 136 147 85 34 2e-15 MSKIDAVRAAMVEAMKAKDKARKDSLSMLLSALKNAEINKREPLTEEEENAVVKKEIKQT QETYEMAPADREDIRSEAAARMAVYKEFAPEDMSAEQIREVISAVLSELGIENPTAKDKG AIMKVLMPRVKGKADGRLVNETLASMFR >gi|157101655|gb|DS480669.1| GENE 120 136777 - 138381 1945 534 aa, chain - ## HITS:1 COG:CAC0630 KEGG:ns NR:ns ## COG: CAC0630 COG4108 # Protein_GI_number: 15893918 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: Clostridium acetobutylicum # 6 533 2 526 526 746 67.0 0 MSQIDNQMTEEIKKRRTFAIISHPDAGKTTLTEKFLLYGGAINLAGTVKGRKAAKHAVSD WMEIEKERGISVTSSAMQFNYDGFCINILDTPGHQDFSEDTYRTLMAADSAVMVIDGSKG VEAQTIKLFKVCVMRHIPIFTFINKFDREARDPYELLDEIEEVLGIRTCPVNWPIGCGKS FKGVYDRFSKKITTFTAAMGGAREVDSQEFAVDSADVDSIIGQDYHQQLRDDIELLDGAS DELDMERVRIGDLSPVFFGSALTNFGVETFLEHFLKMTTPPLSRKTLDGVIDPFRPDFSA FVFKIQANMNKAHRDRIAFMRICSGKFEAGMEVNHVQGGKKIRLTQPQQLMAQDRKIVEE AYAGDIIGVFDPGIFAIGDTLCASNEKFQFEGIPTFAPEHFARVRQVDTMKRKQFIKGVN QIAQEGAIQIFQEFNSGMEEIIVGVVGVLQFEVLTYRLRNEYNVEVILEKLPFEHIRWVE NPGEVDVARIQGTSDMKRIKDLKDNPLLLFINSWSVGMVLDRNPALKLSEFGRA >gi|157101655|gb|DS480669.1| GENE 121 138680 - 139909 1375 409 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_1885 NR:ns ## KEGG: Fjoh_1885 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 189 409 33 255 256 102 31.0 4e-20 MKCTKPKKLALLFCLTAALASAALQGCGAKTPEDMLAQMQEFSTPDDTASIYLDKSWKTQ EAPMDGWLIAASSREDNAVFLLQFIKNGPSSQAGSMEDVKDIIKTSYNLSGETKAEAPEV PGMTNVTAITGRMSVDSTKVDAYIVYGETDYAYYSMAYAADKMNDSKIASFKVSCSKFRE SAPEVEDRTTAKLTDTIRWFNASYAVLTDLNGWDYNRFAGLAANDDVMAMEQKSLEEWWS VTDRATADSTLDWILTEGHRDTFAEDMAYLEEAGIRDIAPNERSAFLLDQFQMTADEAQN YADMFGFYEQYGPDAIAGWDYCRAMNLMSFYYLAGYYTEQEALDKSLEIARTMQPLFESW DDLMSSYMRGYEYWAEESADERRALYEDLKTREDNPYSVDFKTELEKTW >gi|157101655|gb|DS480669.1| GENE 122 140039 - 140956 1167 305 aa, chain - ## HITS:1 COG:BH3732 
KEGG:ns NR:ns ## COG: BH3732 COG1879 # Protein_GI_number: 15616294 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 28 304 30 308 311 272 59.0 5e-73 MKIKKILALMAAGTMLMTMGGCNAITIDGEENVREGSSGNVIGFSVSTLNNPFFVTLTEG ARKAATENNVELVVVDAGDDAAKQTSDIEDLVSRNVGVLIVNPVDSDAVAPAVKSAMSQG IKVIAVDRGVNGVDVDCQIASDNVAGARMATEYLMDLVGEGAKVAELQGVPGASATIDRG AGFHQVADQSLQVAASQTANFNRAEGMTVMENILQSDGTIKGVFAHNDEMALGAVEAVAA SGKDIKIVGFDATDDAQKAVKDGKMAATVAQKPDKMGETAIGTAVKIMAGETVEKSIPVE VELIK >gi|157101655|gb|DS480669.1| GENE 123 141025 - 141966 1254 313 aa, chain - ## HITS:1 COG:BS_rbsC KEGG:ns NR:ns ## COG: BS_rbsC COG1172 # Protein_GI_number: 16080648 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus subtilis # 13 313 20 320 322 299 59.0 5e-81 MNENKNKVINYAQDFGALIALILLVAGISIASPSFRTGANFLSLLRQSSINGLIAFGMTF VILTDAIDLSVGSVLALSTALCAGMISAGVPAGLAMILALVLGCAMGVISGVMVTKGRLQ PFIATLITMTVYRGLTMIYTNGKPISNLGDSFILKVVGKGKFYGVPIPVILLILLFLLFY FLLNKTTFGRRIYATGSNWKSAKLAGVNIHRTKIIAYAISGTMAALSGLILLSRLGSAQP TLGNGYELDAIAAVALGGTSMSGGRGKIYGTLIGVLIIAVLNNGLNILGVSSYYQDVIKG LVILIAVLSDRKR >gi|157101655|gb|DS480669.1| GENE 124 141963 - 143456 1915 497 aa, chain - ## HITS:1 COG:BS_rbsA KEGG:ns NR:ns ## COG: BS_rbsA COG1129 # Protein_GI_number: 16080647 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Bacillus subtilis # 1 494 1 492 493 536 57.0 1e-152 MKIEMKGINKSFGQNQVLKDAGFVLDDGEVHALMGENGAGKSTLMKILTGVYTRDRGTVI VDGKEVTYKSPQEAERAGIVFIQQELNVLFDLTVEENLFLGKEIKKGFGICDRKAMRAKA EETLKRLGVSIPTDKVMSELSVGQQQMIEICKALMVDAKVIIMDEPTAALTQSETEVLFE VMQSLRKKGVSIVYISHRMEEIFELCDRISVLRDGTYIGTKKIPETNMNEIVKMMIGREI GERYPKRDCAIGDEVFRVEGLTKEGVFRDVDFSVKAGEVLGVSGLMGAGRTEIMQAIFGN LPYDRGSIFINGKQEVIKNPLQAIEHGIGFITEDRKTEGLLLEESIEKNVSLTNLGRISG 
KSKVVISREKEKGLVTKAIEELHIRCFGPGHECVNLSGGNQQKVVFAKWIYTEPRILILD EPTRGVDIGAKKEIYSIINQLAKKGVAIIMVSSELPEVLGMSDRIMVVREGLVRGIIGQA EADQEKIMTLATGGTLS >gi|157101655|gb|DS480669.1| GENE 125 143484 - 143900 437 138 aa, chain - ## HITS:1 COG:BH3729 KEGG:ns NR:ns ## COG: BH3729 COG1869 # Protein_GI_number: 15616291 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type ribose transport system, auxiliary component # Organism: Bacillus halodurans # 1 138 1 129 129 140 49.0 4e-34 MKRHGILNSNISRVLSSMGHTDRIAIADCGLPVPEGTERIDLAVTFGNPRFMDVLKAVSS DMKIEAVVLAEEIKEQNPQILEEIQELFASFETGFKPEKVEFVPHTRFKEMTKECRAVIR TGETTPYANIILQSGCIF >gi|157101655|gb|DS480669.1| GENE 126 143950 - 144843 1004 297 aa, chain - ## HITS:1 COG:BS_rbsK KEGG:ns NR:ns ## COG: BS_rbsK COG0524 # Protein_GI_number: 16080645 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Bacillus subtilis # 5 293 6 293 293 253 46.0 3e-67 MKTAVVGSINMDMTVTAERIPLKGETLKGEDIRYIPGGKGANQAVAMAKLGADVEMFGCV GGDEAGRSLIKNMEDMGVGTTHIKVAPGVNTGLAVITVGDHDNTIVVVAGANDKVDREYV DSIRDALLACDIVLLQHEIPQDTIEYVIDICHENGVKVVLNPGPARPVKPEVLAKVDYLT PNEHEAAILFGDMPVEEMLRAHPGKLVITQGSRGVSTCLPTGEILTVPARKANVVDTTGA GDTLNGAFTVEIAGGSTMEEALRFANTAASLSTERFGAQGGMPMRSQVLEAMNQENM >gi|157101655|gb|DS480669.1| GENE 127 144937 - 145914 956 325 aa, chain - ## HITS:1 COG:BH3727 KEGG:ns NR:ns ## COG: BH3727 COG1609 # Protein_GI_number: 15616289 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 324 1 322 331 237 38.0 2e-62 MTGIRDVAKLAGVSPSTVSRVMNGTARVDDEKRQKVERAILETGFKPNEVARSLYKRSSK IIGILVPNIINPFFNELAGAVEEECDRRGYRLTLCNSNDDLEKEKRNLSLLERMNADGVI LMTNMDEIHKEVGNCRIPVVMIDRQIEGGKEIACIQSNHYQGGRMSVEHLLECGCRHIVQ MSGPLHLSSARKRHQGYLDVCRERGIEPAWIEGDYSFESGTEMAARLLEQYPDTDGIIAA NDMVAISVYKELHRKGIRVPEDMQLIGFDNISLSSLFTPEITTIAQPIKRMGQEAVRMLT EYVAGAQISRNKVFDVKLIKRDTTL >gi|157101655|gb|DS480669.1| GENE 128 146311 - 147525 1295 404 aa, chain - ## HITS:1 COG:BH2510 KEGG:ns NR:ns ## 
COG: BH2510 COG0452 # Protein_GI_number: 15615073 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus halodurans # 1 398 1 402 404 357 48.0 2e-98 MLKGKTVLLGVTGSIAAYKTASLASMLVKKGVKVHVLMTKNAAHFINPITFESLTGNKCL VDTFDRNFEFSVEHVALAKQADVVMIAPASANVIGKIAHGIADDMLTTTVMACRCKKILA PAMNTNMYENPILQDNLDICRRYGMEVITPAWGYLACGDTGAGKMPEPEVLYEYILREAG YPGDLAGRRILVTAGPTRESIDPVRYITNHSTGKMGYAIARVAAYRGADVTLVSGPVDLP RPLFVETVPVESAREMFEAVTARSGEQDIIVKAAAVADYRPATVGTEKIKKSDGDLSIAL ERTNDILGWLGQNRRDGQFLCGFSMETQNMLENSRAKLAKKHIDMIVANNLKVEGAGFGT DTNVVTLITRDSETELPIMTKEEVADCLLTEILKRTKEQDGIQG >gi|157101655|gb|DS480669.1| GENE 129 147618 - 148391 910 257 aa, chain + ## HITS:1 COG:BH0086 KEGG:ns NR:ns ## COG: BH0086 COG1521 # Protein_GI_number: 15612649 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Bacillus halodurans # 1 253 1 253 254 210 42.0 2e-54 MILAIDMGNTNTVIGGIDETKTYFIERVTTDQGRTDTEYAVSFKNILEMHQIDIASIEGA ILSSVVPPVNSTILNAVEKVTGIRPLLVGSGMKTGMNIIMDNPKTVGSDQIVDAVAASHE HPLPLVVIDMGTATTLCTVDKKGNYIGGVILPGLKVSLNSLSSKTAQLPYISLEVPSHVI GKNTIDCMRSGIIYGNVDMIDGILDRMEKELGEPPTIIATGGLAKFITPLCRHKIVVDDA LLLKGLLILYRKNTEVH >gi|157101655|gb|DS480669.1| GENE 130 148472 - 149863 839 463 aa, chain - ## HITS:1 COG:CAP0114 KEGG:ns NR:ns ## COG: CAP0114 COG3507 # Protein_GI_number: 15004817 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-xylosidase # Organism: Clostridium acetobutylicum # 39 459 180 525 531 95 25.0 3e-19 MFIPLVDEGILVARSKDPYGEFEWNMLCESKGWIDPCPFWDDDGRAYMVFAYAKSRSGIK HRLSLVEIDPQCRSLLTEPRLIFDGEQIAPTSEGPKMYKRNGYYYILMPSGGVETGWQSC LRSRSVYGPYDYRVVMHQGNSRVNGPHQGGWVTSPDGRDWFVHFQDVVELGRIIHLQPMC FLDDWPFIGQDQNGDGIGEPVREWSVPAGERTTGERATGERATGEHTTGERTTGKHTAGD HTEGERAAREDPAGDVPEYAIRQSDDFEEKTLGLQWQWQANPDAGFYSLRENPGHLRLYC YRNPGRENLLWYAPNVLTQIPQKQKLSMTVKLSLSGREDGDFGGIGMVGHDYGYAGLYRM GKGIQIRCYRGTVTQPMFEGEAGEKLVFSQEADREQAWFRLVLEEDKTYGFSYSFDGRNF 
TRIEYSFPLKRATWTGAKLCLWSCSRENRESDGYCDYEYVEIK >gi|157101655|gb|DS480669.1| GENE 131 149968 - 150120 187 50 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_1659 NR:ns ## KEGG: Bacsa_1659 # Name: not_defined # Def: xylan 1,4-beta-xylosidase (EC:3.2.1.37) # Organism: B.salanitronis # Pathway: not_defined # 3 39 40 76 559 67 75.0 2e-10 MRTYKNPILYADYSDPDVVRVGEDYYMVSSSFTYIPGFRFYIPEIWFTGS >gi|157101655|gb|DS480669.1| GENE 132 150125 - 151159 624 344 aa, chain - ## HITS:1 COG:no KEGG:Pjdr2_2530 NR:ns ## KEGG: Pjdr2_2530 # Name: not_defined # Def: hypothetical protein # Organism: Paenibacillus # Pathway: not_defined # 1 333 1 335 347 149 32.0 1e-34 MEFGSTILHNVESLIRDEEAGDYLLCRVPEPTRLGLNMKAQKNILFASNSEIRFVMKGDS AVLGLRRMPAKGDIRSQGVLEVFQGDYQGSYEISPWSVTADRTEITIRRQDWSAIQRFAG NGYSFHPSVTRILLPYDWGCCLGNIKGDIRPPKKQEMPGKRLLVYGSSITHGGNASVPSG TYAFRLARKLGYDLINLGCAGSCQMDEAMASYIGSRDDWDMALFELGINVIETWDGDRLY GKALAFLAAVLEARPDKQVFVTDMYYNRYDFEGDKRAARFRGAIQRCAAVLSGQYGGRLH YINGLEAMDTPQGLSSDGLHPSDLGHGMIAERLAGYMAEKKEVN >gi|157101655|gb|DS480669.1| GENE 133 151206 - 152045 896 279 aa, chain - ## HITS:1 COG:YPO1721 KEGG:ns NR:ns ## COG: YPO1721 COG0395 # Protein_GI_number: 16121981 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Yersinia pestis # 1 279 28 306 306 285 48.0 5e-77 MKEKKIKNGITYVILGIVGVVMLFPVIWMFFACFKTNNEIFGSLSLLPQGWSPEAFIKGW KTTGTYTYAQYFINTFALVVPTTLLTLVSCSIVAYGFARFDFPGNQFLFMVLIATLMLPN AIIIIPRYALFNKLGWLDSYMTFYAPAAVGCYPFFVFMMVQFLRGLPRDLDESACIDGCG PFQCFIQILLPLLKPALFSAGLFQFLWTWNDFFNTNIYINTVSKFPLSLALRVSIDVTSN IQWNQVMSMALVSVIPLILLFFAAQKYFVEGIATSGLKG >gi|157101655|gb|DS480669.1| GENE 134 152055 - 152954 1101 299 aa, chain - ## HITS:1 COG:AGl3351 KEGG:ns NR:ns ## COG: AGl3351 COG1175 # Protein_GI_number: 15891796 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 17 297 7 287 293 276 51.0 3e-74 
MEKAKPRHGGWTKRNRIGFMYVCIWIFGFLVFQLYPFISSLLYSFTNYDIFHAPEFVGLD NYVRLFTKDREFWNSMAVTLKYTFITVPGKVVLALIIAVILNRNLKGINFIRTVYYIPSL LSGSVAVAILWKVLFMNDGFINSLLGLVHIGPVKWLGTPDMAVITICMLEIWQFGSSMVL FLSALKQVPQSLYEAARIDGASKPRMFFKITLPMITSIAFFNIIMQLITALQNFTSAFVV TNGGPNKATYVLGMKLYTDAFKYFKMGYACATSWILFMVILVMTIILFVTSKKWVYYDN >gi|157101655|gb|DS480669.1| GENE 135 153055 - 154422 1586 455 aa, chain - ## HITS:1 COG:YPO1719 KEGG:ns NR:ns ## COG: YPO1719 COG1653 # Protein_GI_number: 16121979 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 51 441 23 415 430 159 28.0 8e-39 MRKARQITGLLMAGIMASAVLAGCTNEPAPSSTSQAEAGKDSAAGSGAPAASGDQITLRF MWWGGDERNEATLAVIDQYQKLHPEVRIEAEMNSDQGYIDKVSTMLANGTAPDILQQNVD SLPDFVSRGDFFVDFYEYKDTFDTSGFDETFISQFGTFDGKLLALPTGMSCLATVANRNA AETCGVDLTKQITWDSMLEDGKKLHAENPDYYYLNTDTKILCEYVLRPYLRQMTGESFII DSEKKMSFTREQLVEVLTYIKDCYDGGVFEPAEDSATFKGQIHTNPKWMDGKFVFAYGPS SSINLLMDAVPEAECTVVQMPMAEDRVNDGFFADTPQYMTVNKNSEHVEEAVDFLNYFYN NAEAQETLKDVRSIPPTSTARQLCADKGLLNQTVVDAVDLAAGLNGKSDKGYTTSAEVYA IQEDMIESIAYGQSTPEEAADNAIELINDYLSGLN >gi|157101655|gb|DS480669.1| GENE 136 154659 - 155411 788 250 aa, chain + ## HITS:1 COG:SPy1556 KEGG:ns NR:ns ## COG: SPy1556 COG4753 # Protein_GI_number: 15675449 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pyogenes M1 GAS # 1 243 1 244 246 132 32.0 5e-31 MYQVIIADDEAKIRSGLANLFPWNQLGFEITGSFSNGRDAYDFALSSPVDLILSDIRMPL MDGLELSERLLSRKKIKIIFFSGYQDFDYVRKALRNGVFDYLLKPVKYEDLADCLTRVKE LLDSEQGQDMPEVCESLSYYEKIISTVKNYLDTSYQDATLEQAARLVSLSPNYLSKIMKE HSDTSFSDYLLKTRMENAARMLRDIGYKQYEIAYRVGYDNPKNFSRAFHQYFNMTPTQYR NCGGRPEPQL >gi|157101655|gb|DS480669.1| GENE 137 155462 - 157294 1756 610 aa, chain + ## HITS:1 COG:BH1909 KEGG:ns NR:ns ## COG: BH1909 COG2972 # Protein_GI_number: 15614472 # Func_class: T Signal transduction mechanisms # Function: Predicted signal 
transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 346 578 356 589 597 129 32.0 2e-29 MRKRNFIKSFLYYSAIIAIPILTVFLVFLAFVFQGKQREIREDAETAAQAVRDNYSLVLD SAATQYDILARNPRLSISLRGFISHKQIGYLDVILTNTILSNFNSLTNNAPYIASVYYYL DGYDSILTSDTGGTTRLSQFSDLSWKDSYDHSDEDEWLERREMHQYSYTQPVPVLTYFKK LSMFKGVVMVNIYEEQLKQLINTSSQLDRFFILDKDGNFLLQGGAQTAQLSPEDKDLLVS RLEQSPDDAYRGWVDLSRGKYYVRIIPADCGTYIAACISHSNYYQALYTLLLQFLMVLAA VISTSLWIAYTITKKNFQQIDYFVEVFSDAERGVFNPKKPAFIKDEYDIILNNLVQLFLN QTFLRSQLALKEQEQKLAEMTALQLQINPHFLFNTLQTLDFKAMEYTHKPTAVNQMIEAL SDILKYSLQNPHSMVTLKDEIDYLKKYDSIQRYRYEDKYILYYEYEKELESIPIMRLILQ PLIENSLYHGIKPLEGSGFIKLRIVRRGGHLIFSVIDSGVGMEKADIQKLYEKIANPGSE NIGLTNVNSRLIMRYGECAGLKILSKKGMGTCISFRISIEEIACSVPHKDTKTHQPSKSA VIFQDMPNKI >gi|157101655|gb|DS480669.1| GENE 138 157325 - 157471 133 48 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935674|ref|ZP_02083049.1| ## NR: gi|160935674|ref|ZP_02083049.1| hypothetical protein CLOBOL_00564 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00564 [Clostridium bolteae ATCC BAA-613] # 1 48 1 48 48 76 100.0 6e-13 MTIYPICTTIMKYRIIQTLAEAYYRYRVNKPCFIMRQNNRWTLVTMDR >gi|157101655|gb|DS480669.1| GENE 139 157618 - 161691 3074 1357 aa, chain - ## HITS:1 COG:no KEGG:Closa_3324 NR:ns ## KEGG: Closa_3324 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 371 822 251 644 2050 139 30.0 8e-31 MERRRSSGTITKSMYYSQKRILAFLLTAAMVFTNLGTGLNVSYAGTADQVTFQMKGADLV AAVEEAIESGHEITADDLGFTNGRTAEFEKVFFGTKGAVYEVYPEMEGSSMEAELRVFVR LPEDAGDMYMVTGDEEIILLYVNNGEETISCTTEITRMDDGVEKVKKTKRVTIRSYEDAF GDEEVDIISKPAQEETEASESETSVESGADGTGESTLPEESADGSGSGENSGNSGQDASE NSSESSGENQEEISRPEDGSLGSAGNTDHGSDGNADHSTADTEAVPPDDAQGTDTEKDNA ADTKAEVPEKNSGSGSDEKPEKDIADQDKDTQKQEDREEKSNKDEDKKEEGAVPAAAISR HDAPVVSEKEEGSPADSADVKENVPEKEEASNEKETEPREKEEHKEVKETEEAAETGKEK ETETVKETEKPSAEETREPAAPTTEVNVPDTDEVQTTKADDEGSAPESTQAADSTAGSSE PSFDSAETSTAPSADASLPSESEQETAPEPVPEDDTKKASTSDLAGMGYCSTAKVYTTSL 
NQLKALEDFDGYKVTYTSFPEASARIMEGTRGVKEGESLTFGVKTQLGYTIDNVTANDED LEADSVTDELDGSQTAWYTISEVYEEQNIQVYTLETLEHPAFDKSVEVNGVTIRITAPEG VLPADTQIQAEEITEQVESALTEKVDAESDDGTVVSTIIAYDINLMYDGVKLDNSWAEDG SNLVTVSFSGERIEKASMEADQIEILHLETPTQEVQAVSAKAELEGSGETEAVAQVPVLD GISADDIAVDSEGRQELDVSGSASVGQIHFQTDHFSAFTAIFKTSPYLVIANELQEGSEG AEKVEYTYKVSITDSRKNKNEKLYLLTGDGGAEEYNTTAVTEGDTNTYTFNLKPNGKMKI GNVGPSAEYRVVQTIEDTYNEFRGVTETARTKIDWIETDIDLGTEGEGESFPYGYGSNNQ YGSLMERAAVLMRAYTKCREYDETPGRKDRFEQRRALLGISNNNVTNDSIIAKLLEDHYG TSKWPLASEEGFGDLAGIVSERLEKDNVPNNKRPKNLYLHIYFRTVDSTENMEIIPYLTG SDALKDWNAYAAFDSQEGMWYQYLKAYPVSTTYMISSLHAGGNVNTPKTMHMLMEEDSGN WHAMCPGEVPDIGDPIMQPAVDGVISKFGTAYQIEYTFKNYYYNSTSAGDGSTDTGSTDN DSSGNDSSEKPDNDSNNGNGNESGGVSGNDTADTDTPGSSAGGTTNTSQGSEGSGSSGGG TTTTSGGTGRYQGTDAQSGPGAQQNVIESSQVPLASLPADASAAFSQSMAIIDDGEIPLA AIPKTGDRGSSAHELSFILSGMLMAVYLVLGKKKKES >gi|157101655|gb|DS480669.1| GENE 140 162140 - 162916 850 258 aa, chain + ## HITS:1 COG:CAC1696 KEGG:ns NR:ns ## COG: CAC1696 COG1191 # Protein_GI_number: 15894973 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Clostridium acetobutylicum # 5 257 5 257 257 335 66.0 5e-92 MSGYKVEICGVNTAKLPLLSNEEKEELFQRILDGDKEAREQYIKGNLRLVLSVIQRFSNS QENIDDLFQIGCIGLIKAIDNFDITQNVKFSTYAVPMILGEVRRYLRDNNSIRVSRSLRD TAYKALYAREQLTRTNSKEPTLMEIAQEIGVSKEDITYALDAIQTPVSLYEPVYTDGGDP LFVMDQISDKKNLEENWVEDLSLVEAMKRLPERERHIIDMRFFEGKTQTEVAQEIHISQA QVSRLEKNALKSMKSYLS >gi|157101655|gb|DS480669.1| GENE 141 163001 - 163927 1045 308 aa, chain + ## HITS:1 COG:slr1870 KEGG:ns NR:ns ## COG: slr1870 COG1432 # Protein_GI_number: 16330259 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Synechocystis # 1 240 1 246 249 164 38.0 1e-40 MEQNERRFAVLIDADNVSPKYIKYILDEVSDVGIATYKRIYGDWTDNEKRSWKNVLLDWS VNPIQQYSYTTGKNATDSAMIIDAMDILYSGNVDGFCLVSSDSDFTKLAQRLREAGMFVM GIGEQKTPKPFRAACDTFKLLEIISSDDAPETSVIENQKTITNIDEIQKAITKLLIENNS QNQPIILAKVGNFLTKRFSDFDVRNYGYSKLSTFLESLNNNDFQVVKLHGGYFVQEKSAS 
ISKTEIEKEIIRIIKGYNGHVDNLSIIHDELKKTYPSFDVKQYGFSRISSFIRSFGTFRI RDNMVQLK >gi|157101655|gb|DS480669.1| GENE 142 164019 - 166313 1495 764 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935679|ref|ZP_02083054.1| ## NR: gi|160935679|ref|ZP_02083054.1| hypothetical protein CLOBOL_00569 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00569 [Clostridium bolteae ATCC BAA-613] # 2 764 15 777 777 1422 100.0 0 MSAAQVGERLCFRLSEQEYKGFLRWYKKDFLKEKDLGRVRPDFMNLKEDCCITMENEVIR IQTGNTDIWLLYSALTEVHRKEGLILFFGQKEFWAIPERVLGGDLEAQEWFRCLNTKCLE NRDGRIPMGDVEEACRRRTVPFCCYTRSVDQIADACRALGLPWRNVQKLRKAALFPYRYV GLQLLALEEDGICEYGERSVVRHAYGDFEKAVYTPEYVYLMKGRGGAVMIPVEPLAQIGG IKMLFQICNERGSGKPLSLKQKKVHVPAGKWGTGRRLLAAAAAAAALGLAAWGAGSGEST GTTGSTGTAGSTGTTGGTGTTGSTETAEGSVQARMAMAEVSEEGTADSGENASQTAEHGN KAADKIQDGIMITVPDDTVFDQVGSDGTYVSSSLYYKIRLPQEEWTQQGNRDFGDVLVSE WGSILVRGYREQPQFFAIAGIDTPRTKADYIRRMESGAAVKDDSMPEVLEYTYRRVKGDE IVQKEIRYKAEENTGQIYGGGEYGEGAYRYSVELSVLGPEYFYTVTVRLREETPESIGEA RRALDSFRILDTSAGICKKMEDEVFHGYYGKNYVMTNCLVLLEEDLSPEEIGDCLELVKK IDTGFFGVKSGDALAARTKDSPWLGIDSQSLQENCTMENARKVSKIFKSDVILYDEFDGD LLMVAYSDKHQKHVYQRATSFNKEILESEFGCYGKEQEFPEDILKYTDLTREEAEAIWTD PDAVFQMDKWYEITSHMTKMPVPEEFIGIRDYKEFKDGFEIIRR >gi|157101655|gb|DS480669.1| GENE 143 166476 - 167216 951 246 aa, chain - ## HITS:1 COG:BH2556 KEGG:ns NR:ns ## COG: BH2556 COG1191 # Protein_GI_number: 15615119 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus halodurans # 29 241 23 232 237 268 68.0 5e-72 MIIKVSIPNRFQLKTIPSFRTVLMPRQGEVHYIGGADILPAPLETEEEGRMISLLGSEDD KEARAALIEHNLRLVVYIAKKFDNTSVGVEDLISIGTIGLIKAINTFKPDKNIKLATYAS RCIENEILMYLRRNNKTRLEVSIDEPLNVDWDGNELLLSDILGTDEDVIYHDIEDEIEKS LLNNAISRLNPRERKIVELRYGLTNEDGEEMTQKEVADLLGISQSYISRLEKKIMKRLKK EIVRFE >gi|157101655|gb|DS480669.1| GENE 144 167256 - 168092 791 278 aa, chain - ## HITS:1 COG:no KEGG:Closa_2351 NR:ns ## KEGG: Closa_2351 # Name: not_defined # Def: peptidase U4 sporulation factor SpoIIGA # Organism: C.saccharolyticum # 
Pathway: not_defined # 5 276 28 296 297 197 39.0 3e-49 MVYTVYIDVVFAVNTIMDMMVLTILNRVLSYRTTKRRILAGAVIGGIWSCVVSLVPGLPA AVEILGTYVAVSSLMAVAAYHLKSPREVIKSVAGIYLVSVVLGGVMLVLYEHTRAGYYAW LLVESGHGRRIPVMGWILMIAGAAAACYGFSGGIKELIRTMAHRKDLCRVTMIYGEKKET VTGLIDTGNRLREPVSSQPVHVAAAGIMKQLCPSVKGVVYVPYQSVGTSRGILPAVYIDR MEIEQEGGRYSLEKPLIAITKQELSPSGEYQILIQKSD >gi|157101655|gb|DS480669.1| GENE 145 168366 - 169328 973 320 aa, chain - ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 319 1 324 324 373 53.0 1e-103 MRIVVLDGYTENPGDLSWKGLEDLGDLTVYDRTPKDRVVERIGQAEAVYTNKTPISARTI ESCPNLRFIGVLATGYNIVDIRAAGDAGIIVSNIPSYGTDAVAQYAIALLLELCHHVGFH SDCVRAGEWTRSRDWCFWKYPLTELAGKTMGIIGFGRIGRRTAVIAQALGMKVLAFDRYQ DKGLETDSCRYADLEELLSGSDVISLHCPLFPETEKIINRRTIEKMKDGVLIINTSRGQL VEEADLREGLDSGKIGGAAVDVVSAEPIQEDNPLLGAENILITPHIAWAPRESRQRLMDI AVENLSRFMSGTPQNVVNEP >gi|157101655|gb|DS480669.1| GENE 146 169384 - 170286 942 300 aa, chain - ## HITS:1 COG:no KEGG:Spirs_2575 NR:ns ## KEGG: Spirs_2575 # Name: not_defined # Def: xylose isomerase # Organism: S.smaragdinae # Pathway: not_defined # 1 300 71 368 368 417 64.0 1e-115 MKEPIQKYFQVGTIQWMTHPPVSYPVCDSVRTICCDPYFGALEITHIPDSETRERVKKML DQSHLRVCYGAQPNLLGKGLNPNHLEETERRTAEEELTRAVDEAAYMGAGGIAFLAGKWE PDSREQAYSQLLKTTRAVCTHAAKKGMMVELEVFDYDMDKAALIGPAPLAARFAADMGMT HHNFGLLADLSHFPTTYETSRYVVRTLRPYITHFHIGNAVVKEGCEAYGDQHPRFGFPES ANDTEQLTEFFRVLKEEGFFNEKEPYVLSLEVKPWGDEDGEIILADTKRVINRAWAMVED >gi|157101655|gb|DS480669.1| GENE 147 170315 - 171307 1024 330 aa, chain - ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 6 293 5 293 309 255 45.0 7e-68 
MGEMTAIRDAYGAALKELGEQDVRIVGLEADVASSTKSGIFGSAFPERYFNVGISELNMV SMAAGLARTGFIPFVNTFAVFLTTRGADPVQSLIAYDSLNVKLCGAYCGLSDSYDGASHQ AITDMAFVRSIPNMTVIATADGTETRKAVFAIAEHQGPVYLRLSRAPAPVFYGDNMRFEI GKGIRVREGNDVSIITTGTLLHNAIRAALLLEQEGIQAAVVDMHTVKPIDQNLILECAEQ TGAIVTAEEHSIYGGLGSAVAEVLAEHCPVPMERIGAVDFAESGDYGQLMEKYGYGPESI AQRCRAVMRRKQDNGTGRLPERTLRPFDMP >gi|157101655|gb|DS480669.1| GENE 148 171307 - 172134 810 275 aa, chain - ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 17 272 15 270 270 270 51.0 2e-72 MREERKETLQKLCLVFRNKLIDLLHSVQTGHPGGSLSCTEILTALYYELMDVDPMNPEKE DRDHLILSKGHGAPMLYLVLAHKGFFPLAELKNLRQTGSMLQGHPCVHKTPGVELSTGPL GLGLSAGLGMALGSRLKGYDSYTYVIMGDGEIQEGCVWEAALSASKFKADHLIGILDNNG VQLDGTLEDIMPMGDIKAKWEAFGWNVIPCDGHDVEDICRAVEEAKMTADKPSLILAKTV KGKGVSFMEGKNTWHGKAIGDQEYVQAKAELGGDR >gi|157101655|gb|DS480669.1| GENE 149 172131 - 173033 1004 300 aa, chain - ## HITS:1 COG:TM0416 KEGG:ns NR:ns ## COG: TM0416 COG1082 # Protein_GI_number: 15643182 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Thermotoga maritima # 1 269 1 265 270 137 32.0 2e-32 MKLCYQAATPDVAIADSVTAYQGPLAKTFGDLGALGYDGVELMTLNPGKLDWREVKETAE ENNLSVVLVCTGEIFGQLGLSYTHPHRSLRREAVERTKEIIDFAGYLGANVNIGRIRGQY CRELDREETIELAQNAFTELADYAAPKNVDIALETVTIMQTNFINTLAEGAAMADRVGRP NFRLMMDIFHLNLEEKDLYEAIRTYSSYNIHVHLADNNRRYPGQCGLDFEKILTVFRECG YDGNYCTEIFQLPSMEEAARESIRYLRPIADRVYGGNTHCNEAGRNKDGRGRSRKEGMQI >gi|157101655|gb|DS480669.1| GENE 150 173030 - 174553 1332 507 aa, chain - ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 6 500 4 500 500 263 31.0 5e-70 MHKDLLLGIDVGTTGTKCSVYDFNGKRVASAYREYPMLHPRPGWTEQDPSRWWDAVCCNL 
QTIFQSQDIDKNRIAAVGTSSTNAVFLADRQGQPLCNAISLHDQRSGPQVEWLKENIGEE RIRALAANRIANGSFSLPALRWLVENRPDLIKGAHKLMVPCGYVIQKFTGEFTMNRPRMS LTLMADIRTGQWDTEIAEQTGLPARLLPSPCGSAEIVGTVTQWAASMTGLAAGTPVTGGT IDTVAATIGAGAVDEGDFALTIGSSGRLCSIAGQPMEDPRLLNLYGAYDGQYIIVQSTNN ACVSLRWFRDTFGREAARQAQAAGCGIYPYLDRLADTAPAGAGGLIWLPYLAGEQSPVWD TRARGVFYGAGLETDYPSFIRAVLEGVAFSQRHCLEVVLDRAARPDIIPLGGGAANSPLW CRIFADVLGIPVARLRSNETETLGDIIIAAQAMGIHEIAPDFGKVLAADGEVFWPDDVRA CVYDRQYRVYRELYQALKPVFAGGGEQ >gi|157101655|gb|DS480669.1| GENE 151 174553 - 175218 599 221 aa, chain - ## HITS:1 COG:ygbL KEGG:ns NR:ns ## COG: ygbL COG0235 # Protein_GI_number: 16130645 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 6 204 9 199 212 129 37.0 4e-30 MDMDRKNMLERVVKAARKAYMKGFAAGSGGNVSIRIQGQVYISPSGICLGDLTQEDMIML EADGVAPPADMSDPGRRRPSKEAGMHLACYRSRPDILCLFHLHSPYSIAAACRRQGNGSP ATGMPAYTPGYAMRVGRIPVVPYYLPGSRELAEAVSDVICSRDSLLLANHGMVAAGKSPE AVLATAEEIEENAHLTILLGDRGIPLDEEQTEALFRAGGTV >gi|157101655|gb|DS480669.1| GENE 152 175229 - 176518 719 429 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 1 425 1 426 435 281 34 2e-74 MNTGLIVLFVVLGICLLIGLPVAFSIGISCMALLTVNGYPPLDIVVQRMASGAKSFNMMA MPMFIFAGSLMVYGSTPRLMRFANMMLRKMPGGLGATALAACGFFGAVSGSGVASAAAIG KIIGPEMLEQKYPRGLTVGLIAAGGTMACIIPPSIVMVVYASSSGASVGDMFLAGFVPGF ICILALIGLNTFFAVKRGNKENVEKTDYTARERLRITGDALLPLLMPIFVLGGVFSGICT ATEASVVAVIYSFILAVFVYRELTLREFYKVAAESVVTTGVIMLIISVATPFGWIMSIQN VPTLFAGWLLSITSSKFLIMAFIFLLLLLLGTFMETMCIIILVTPILLPIAQTLGMGVVH YGVSMLMALMVGSLTPPLSVNLFTSCRVLNVKYDEAFPDTLWVILTVTACALLTFAVPGI TEFLPAILK >gi|157101655|gb|DS480669.1| GENE 153 176518 - 177027 596 169 aa, chain - ## HITS:1 COG:SMb21352 KEGG:ns NR:ns ## COG: SMb21352 COG3090 # Protein_GI_number: 16264676 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # 
Organism: Sinorhizobium meliloti # 16 167 1 149 165 57 25.0 1e-08 MVLKLIDKVKFLLRILSCVTVCTLTIIIGIQVVNRYVFGTSFTWVEELAGMAMVYITYFG AAMATINNSNTRIDFFIRKLPGPVYRGFEILDDCICIAFLLVICRYAAKLVGTNLNALSA AMKVPLSVNYVGVLSGCVLMIIFYVLRLWIDVEKLRGRNMDYVEEELSR >gi|157101655|gb|DS480669.1| GENE 154 177104 - 178216 351 370 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 62 350 41 329 346 139 30 1e-31 MKKRLVSLFLALAMTWGLTACAGGSSAPASAAAPASAADPAASADAASAQGEVEGSKTDN TGTDAKASYTLNIGSAMSSTNPSSIALQSFKKAVEERTGGDLAVNIYTDSALGGEADLLE QVTSGTVEGMMQMGAANWEPYNSEVNVALLPFLFTSLDNARQAWAGEFGQQFCEKLLEPT GVTILSVWESGYRHMTNNTRPILAPADIAGIKFRTNENSMKVKMYEAVGGSAVIMAFSDV YTGLQNKTIDGQENPLANIYTSSLQDVQTYLSLTGHMYDAAPLAVNTAWFETLPEEYQTI LFEEADKAREVDLQENDESKYLELLKEAGMEINEVDKEAFQEAMSGIWEEFASQYEDGQY WIDLATSFNK >gi|157101655|gb|DS480669.1| GENE 155 178255 - 178443 228 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935692|ref|ZP_02083067.1| ## NR: gi|160935692|ref|ZP_02083067.1| hypothetical protein CLOBOL_00582 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00582 [Clostridium bolteae ATCC BAA-613] # 1 62 1 62 62 96 100.0 7e-19 MQKEEERQNRVLMEIIDNAKELPVESQNLLLILARGMVFTRNCMMRQNAADQPRMPPDEH SA >gi|157101655|gb|DS480669.1| GENE 156 178606 - 178953 432 115 aa, chain + ## HITS:1 COG:no KEGG:ELI_2972 NR:ns ## KEGG: ELI_2972 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 3 105 2 105 118 82 39.0 6e-15 MNTLGERLNYARKQKGYTQDSLAQTIGVSRGVIFNLEKNKTEPQAIVINAVCQILNINKE WLTDGTGDMESGNQAFQSAKLLEELYDIAKELSEDEQLFLLDTVKALKLRLGKRQ >gi|157101655|gb|DS480669.1| GENE 157 179101 - 179439 212 112 aa, chain + ## HITS:1 COG:BS_yonR KEGG:ns NR:ns ## COG: BS_yonR COG1396 # Protein_GI_number: 16079161 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 3 68 2 67 108 60 46.0 6e-10 MGFSEQLKKARVSMGYTQQQVADALGLTASTYCGYETGKRQPDVAKLKQLARILNTTGSF 
LLETEPVQNTCRPDPGSGNDLMVSYGRLNERSKKAVDDLVALLLDLQEKSKE >gi|157101655|gb|DS480669.1| GENE 158 179578 - 180891 1453 437 aa, chain - ## HITS:1 COG:CAC1693 KEGG:ns NR:ns ## COG: CAC1693 COG0206 # Protein_GI_number: 15894970 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Clostridium acetobutylicum # 13 362 12 355 373 346 60.0 6e-95 MLEIKINESENAARIIVVGVGGAGNNAVNRMIDENIAGVEFIGINTDKQALQFCKAPTAM QIGEKLTKGLGAGARPEVGEKAAEESSEEISQALKGADMVFVTCGMGGGTGTGAAPVVAK IAKDMGILTVGVVTKPFRFEAKTRMSNALSGIEQLKNSVDTLIVIPNDRLLEIVDRRTTM PDALKKADEVLQQAVQGITDLINVPGLINLDFADVQTVMIDKGIAHIGIGKAKGDDKAIE AVKQAVSSPLLETTIEGASHVIINISGDISLIEANEAASYVQELAGDEANIIFGAMFDEN AQDEATITVIATGLDEHGANASVAKAMTGFTSFKTKVPAAGAPAGQQAHAAAPEAAATAS APYTSRPASQGYSTHGYNPAGGSQGYTAPKYNPNQGYSAGGSQNAGQNPTPAPSQPFRPA TNKEMQINIPDFLKNKR >gi|157101655|gb|DS480669.1| GENE 159 181343 - 182299 858 318 aa, chain - ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 318 1 323 324 374 56.0 1e-103 MKIVVLDGYTENPGDLSWEGLEALGTLTVYDRTPGDKVIERIADAQAVYTNKTPITGETI EKCANMKFIGVLATGYNVIDIKAARSANVVVSNIPSYGTDAVAQYAVALLLELCHHIGEH SDCVKAGGWSRSRDWCFWKHPLVELAGKTFGVIGFGRIGQRTAKIAEALGMKIVAYDERP VKELEGENFRYVSLDELLSMSDVISLHCPLFPSTEGIINKDTIAKMKDGVKIINTSRGPL IVEQDLRQALDSGKVSGAAVDVVSEEPIREDNPLLGAKNSIITPHIAWAPRESRQRLMDI AVSNLKAFMEGNPVNVVS >gi|157101655|gb|DS480669.1| GENE 160 182315 - 183157 859 280 aa, chain - ## HITS:1 COG:MJ0400 KEGG:ns NR:ns ## COG: MJ0400 COG1830 # Protein_GI_number: 15668576 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Methanococcus jannaschii # 4 264 10 266 273 171 36.0 1e-42 
MAMLGKEVRMSRLVNPESKKMMAITVDHAISRGIAPMKGLQPIQETIEKIVMGRPDAMTM TKGIAEHCMWPFAGKVALLMKVSNYSPVSPTRDTIFGSVDEAVRMGADAVSMGAMTLGDF QGEQFAAIGRFSEECMAKGMPLIGHVYPKGESVPADKQTSWENIAYCVRSACELGMDIVK TTYTGDPDSMARVVSCVPSSFRIVIQGGDKCRTLDDYLTMTREAMDCGVGGVTMGRFVWE YEDVTALVIALRYMIHKGYTVKQAKELLAQLEHDKNYEDF >gi|157101655|gb|DS480669.1| GENE 161 183196 - 184548 1028 450 aa, chain - ## HITS:1 COG:FN0225 KEGG:ns NR:ns ## COG: FN0225 COG2610 # Protein_GI_number: 19703570 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Fusobacterium nucleatum # 6 417 5 420 452 183 36.0 6e-46 MSGITLAVNLAISIAIILFLVLKFKINPVISMILASLYMGISCSLGFMDTITSINSGFGS LMTGIGFPIGFGIMMGQILEDSGAAESLAKSILKAFPGKKAPWALGLTAFLLSIPVFFDV TFVILIPLGIAVAKETKRPLAYFAGAIAIGGVSSQTFVPPTPNPLAAATILDFDLSYIII AGTIVGLAAAVFSMFVWFRMLDRPGFWDPNKDETGLLDMDAAVVHRVDLPSPWAAVIPIC LPVLAILIGSFWPVVTGSDAPVVIQFISQKTIAILLGLLAAYIILLKRMGWSGLNESVSK SLKQAGVVLLITGAGGAFGAVIQATNIGEVLIAGLTEGQSSTMLILCLTFGIGVLFRVAQ GSGTVASITAMTIMASVAPSAGCHPVYIALAALAGGNFIGHVNDSGFWVVTNLSGASVTG GLKTYTWNTITLAGMAFILSLVGATVLPMV >gi|157101655|gb|DS480669.1| GENE 162 184687 - 186213 1080 508 aa, chain - ## HITS:1 COG:AF1752 KEGG:ns NR:ns ## COG: AF1752 COG1070 # Protein_GI_number: 11499341 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Archaeoglobus fulgidus # 2 490 10 486 493 242 32.0 1e-63 MFCGIDLGTSSVKASVFDEHGRQLAFSRREIDLVLPQTGLAELDPENYLVKVYETLRDVS AQCHGNIASIAVSSQAQAVVPIDRMGNPLYNIIVTMDNRTLRQYRFWKEHHDEWEIYKRT GNGFASIYTVNKIMWFIENRPDIYEKAWKFCCVQDYVVFKLTGEGPFIDYSMAGRTMMLS SERPEWDSGVLDIAGISKEKLSEPVSSTATIGRVREDIRAGFGLSRTCQVVLGGHDQACG AVGSGVITPGMVMDACGTVDAMVSVLPGFIIDRAMLDNKLPCYRHVDGTNYITMAINTNG GLFLKWYKNTFCHEESQLCGEQNCDIYTYIIKECANTPADIYVLPHLEGAGTPVNDPLSL GAVIGLRVTHTKKDITRAVLDSLGYEMKLNLSAIEQSTGQSIEEIRMIGGGAKSPKWLQI KADIFNRSITVLETQEAASLGAAITGAVGTGYFEDFGQAIQHMVHPKETYIPNPAMVKEY ETRFEEYKRIYPGLKDISHRISSRTSIY 
>gi|157101655|gb|DS480669.1| GENE 163 186307 - 187044 742 245 aa, chain - ## HITS:1 COG:BH2675 KEGG:ns NR:ns ## COG: BH2675 COG1737 # Protein_GI_number: 15615238 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 2 240 44 282 287 222 45.0 6e-58 MSVADAAEACGVSEAAIVRYAQKLGYKGYQAMKISIAQDVIEPGQQIYGQLSKGDTIPTI VDKIFDSNIQSLRDTSDVLSRENIDEAVNLILGCRRLLFFGVGGSGCVAMDGQHKFLKIG YMAMAFTDSNLQAMAASVLTSRDVLVAVSHSGASKDILMAMDIAKQSGAKTIAITNYGKS PIVEKADVVLYTSSNETAFNSDALSSRIAELTIIDMLYIGVSYKRYDESYANILKTRKAL DSTKI >gi|157101655|gb|DS480669.1| GENE 164 187366 - 188097 798 243 aa, chain - ## HITS:1 COG:no KEGG:Closa_2458 NR:ns ## KEGG: Closa_2458 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 6 242 7 240 242 249 50.0 8e-65 MRNYGSKHHIGAGIAVAVVLLLAILLLSVQIQHVTVTGSDRYSAKQVEELLFTGRWGKNS AHAYFSDKFRPHIQIPFVEDYKVVFHNPLDVEVIIYEKSIVGYVSYMSSFMYFDKDGIVV ESSGSQLPGVPWITGMEFGRIVLHRPLPVDDKNIFEEILNLTQQLSVYDIAVDRIQYDSH GQATLFIGKMEVTMGDHTDIDGKISTLNDILKAQPQLTEISGTMKLDNYSESNSGAGITF KKK >gi|157101655|gb|DS480669.1| GENE 165 188172 - 189446 1036 424 aa, chain - ## HITS:1 COG:BS_murA KEGG:ns NR:ns ## COG: BS_murA COG0766 # Protein_GI_number: 16080729 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Bacillus subtilis # 1 424 1 421 436 299 40.0 5e-81 MSVIQVCGLTPLKGEISIQGSKNAVLPMMAAALLHRGVTVLTNVPVIRDVACMLDILESL GCRCCHKGDCLVMDARSVTGTSIPEEYVTAMRSSIVVLSALLGRMGEGSCCYPGGCLIGA RPIDLHLMALRALGADIRERDGTIEASCRKNGGLKGTEIHLSYPSVGATEQAILASVLAD GVTIIHQAAREPEISQMCRFLNNMGAVICGMGTDHLMVQGVAGLHDSSFRVEGDRIVAGT YGAAVVAAGGEALLRGICPSDLKVPLEEFQKAGAAVDADEKNRQIRICMGKRPLPLLIKT EPYPGFPTDLQSPFMAFLATAQGTSYIEEQVFEGRFATAKILEQMGAVIRTEDQRAVIEG HYPLKGAAVNACDLRGGAALMVAALAAEGDTFIGECHHIERGYEDICRDMAALGAHIRWV GKDS >gi|157101655|gb|DS480669.1| GENE 166 189540 - 190769 1411 409 aa, chain - ## HITS:1 COG:BS_spoVE KEGG:ns NR:ns ## COG: BS_spoVE COG0772 # Protein_GI_number: 
16078585 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus subtilis # 58 402 26 361 366 218 40.0 2e-56 MAGKVQTAAGPSGQMRRPAGSGERMEQQMPKKKKKAKRFYDYSLLFCIIFLTAFGLVMIY SASSYSAQLSKAYNGNGAYFMQRQAGIAAVGLVAMLIISKIDYHIFTRFSVFAYLMSYIL MIAVSLVGREVNGKKRWLGVGPLSFQPTEFVKIALIVLLAAVITTMGMKINKWKNMGYVV ALTLPIAGLVAMNNLSSGIIVCGIAFVMLFVSCKVKWPFFSIGALGLTTLAFAGPIGKFL TTVGLLQPYQYRRIEAWLNPESDPTDKGFQVLQGLYAIGSGGLVGQGLGESIQKLGFLPE SQNDMIFAIICEELGLFGAVSIILIFLFMIYRFMIIANNAPDLFGALLVVGVMGHIAIQV ILNIAVVTNTIPNTGITLPFISYGGTSVLFLLMEMGIVLSVSNQIKLEK >gi|157101655|gb|DS480669.1| GENE 167 190783 - 192141 1601 452 aa, chain - ## HITS:1 COG:BH2567 KEGG:ns NR:ns ## COG: BH2567 COG0771 # Protein_GI_number: 15615130 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Bacillus halodurans # 3 452 8 448 450 270 37.0 3e-72 MNQSQRVLVAGTGISGIAAAKLVLERGGDVVLYDGNAALKEEDIKEKFDRDAKVTVVLGE IKRSDLLGVELCIISPGIPLDSPFVAVVDDAKIPILGEIELAYQCGLGKLAAITGTNGKT TTTALTGEILKARYEETFVVGNIGDPYTSKVLEMTEDSVTVAEVSSFQLETIMDFRPNVS AILNITPDHLNRHGTMENYIRIKECVTLNQTKEDAVVLNYDDPVLREFGQNPELKPRVIW FSSREMLKDGFCMDGDNIVYCQNGRQTVVVNVHDMQLLGRHNYENAMAAAAIALEMGVPM SDITRVIEGFHPVEHRIEFVRERTGVRYYNDSKGTNPDAAIQALRAMPGPTLLIAGGYDK NSEYDEWVSEFEGKVRYLVLIGQTRDKIAECAKKHGFTEIMYAEDMQEAVQVCAVYADIG DYVLLSPACASWGMFKNYEERGRVFKECVMAL >gi|157101655|gb|DS480669.1| GENE 168 192305 - 193273 1117 322 aa, chain - ## HITS:1 COG:CAC2127 KEGG:ns NR:ns ## COG: CAC2127 COG0472 # Protein_GI_number: 15895396 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Clostridium acetobutylicum # 6 322 5 316 317 245 47.0 9e-65 MIHETILAIIIAFAISALLCPIIIPFLHKLKFGQQVRDDGPESHLKKQGTPTMGGLIILS SIIITSVFYIPSYPKIIPVLFVTVGFGIIGFLDDYIKIVMKRSEGLKPMQKLVGQFIITG 
IFAWYLLNSGEVGTDMLIPFTGGFDGGSFLSLGIFFVPALFFIMLGTDNGVNFTDGLDGL CTSVTILVATFLTIVAIGEDMGISPITGAVVGSLLGFLLFNVYPAKVFMGDTGSLALGGF VAASCYMMRMPLFIPVIGLIYLVEVLSVIIQVTYFKRTGGKRIFKMAPIHHHFELCGWSE TRVVAVFAIVTAILCMVAYLGL >gi|157101655|gb|DS480669.1| GENE 169 193344 - 195134 1956 596 aa, chain - ## HITS:1 COG:BH2572 KEGG:ns NR:ns ## COG: BH2572 COG0768 # Protein_GI_number: 15615135 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Bacillus halodurans # 3 588 4 571 644 372 38.0 1e-102 MYSNLTKHRKNIAVLALLIAIVMVSLSGRLGYLMLFCSEHYSAMAEDLHQRERTIKAARG RIIDANGVVIADNRTVCTISVIYNQVKDREQVIRVLCDELDLDEELVRKRVEKRSSREII KTNVDKELGDQIRSYRLAGVKVDEDYKRYYPYDSLASKVLGFTGGDNQGIIGLEVKYEKY LKGMNGKILTMSDAKGVEIENAAEDRIEPIPGNDLHISLDVNIQKYAEQLAYQVLEKKNA KKVSIIVMNPQNGELMAMVNVPEFNLNDPFTLNQNLRSQSLQEMAAANQGATGAGKQELL NQMWRNTCINDTYEPGSTFKIITAAAGLESGVVKLTDQFSCPGFRVVEDRKIRCHKVGGH GAETFLQGAMNSCNPVFIDVGQRLGVDGYYKYFTQFGLKGKTGIDLPGEAATIMHKKENM GLVELATVSFGQSFQITPVQLITTAASIVNGGNRVTPHFGVETVSADGTSVHKLDYPSGG RILSEETSATMRYVLEQVVSEGSGKKAKLEGYRIGGKTATSEKLPRSLKKYISSFIGFAP ADNPQVMALITIDEPEGIYYGGTIAAPVVGDLFKNILPYLGIEAVQEETAVHLSNP >gi|157101655|gb|DS480669.1| GENE 170 195206 - 197629 2564 807 aa, chain - ## HITS:1 COG:CAC2130 KEGG:ns NR:ns ## COG: CAC2130 COG0768 # Protein_GI_number: 15895399 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 48 684 12 590 729 270 32.0 8e-72 MKTSRNTKSGGRMGKGSSGSGNGPRDQEQYRNEQRNNQRGKYYYPKYIQEKLAITVLVIT LALFALVMILYRIIKDNNEQYNKIVLSQRQQEYDSRVIPYRRGDIVDRNGTYLATSEKVY NLIIDPGQIMSDPENYLEPTIQALVTNFGFDAGELRTLIEERKESAYLRYNKGRQLTYDQ RVAFEQMAKETNEAYRKSDNDAEAKKRVKGIWFEDEYKRIYPYNSLACNVIGFTSSDGSV GTGGIEQYYNSSLIGVNGREYGYLDQDSNLEGVVKPASNGNTVVSTIDVNIQNIVQKYID EWQTNVGSKVTAAIVMNPNNGEILAMGTSNKFDLNNPRDVSGYTEQELFDLGKKEAAAVY RRENDGAVITEDQVTEHFSREDVISYGQQVAWNQIWRNFCVSDTYEPGSPSKIFTVATGL 
EEGVLKGNESFECTGYLHVGDYNIKCTAWRRGGHGWLNLQESLMQSCNVAMMRIGAMIGR ERFTKYQGIFGFGDKTGIDLPGEADTSGLVYSADDIGPTDLATNAFGQNYNCTMVQLSAA FCSVLNGGSYYEPHVVKQILNEQGSVVEKKEPVLVRETVSQSTSDFLKEAMFQTVETGTG KAAQVIGYDVGGKTGTAEKQPRSAKNYLVSFAGFAPIEDPQVFVYVVIDTPNYPPGEQQA HSSYASAVFSKIMTEILPYLNIFPTKDLPENEAIQESLPSSEGINEPVSETEGAEGTEET PAETEKQYETDEYMPGGEEGEGAGSGVPDAVPTVAGDGNGEGTGPSSLPVKAPPRQSNES STQAGPAGESTQATEPSSAGETAKPSE >gi|157101655|gb|DS480669.1| GENE 171 197607 - 198137 495 176 aa, chain - ## HITS:1 COG:no KEGG:Closa_2465 NR:ns ## KEGG: Closa_2465 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 16 176 5 160 160 174 57.0 1e-42 MAARAQASSGRRNTHRNTRPVYGNDYVQGSAAPKLRPQTAPKQETRPGKPSGSQSVRSNH RIRRNQERAMYMDLPYVIMLTIASVFTLYLCINYLHVQSSITARMHHIEQMEAELEQLRA ENDALETSINTSIDLNKIYEIATKELGMVYAKKDQVLLYDKTESEYVRQYEDIPEH >gi|157101655|gb|DS480669.1| GENE 172 198252 - 199199 961 315 aa, chain - ## HITS:1 COG:CAC2132 KEGG:ns NR:ns ## COG: CAC2132 COG0275 # Protein_GI_number: 15895401 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Clostridium acetobutylicum # 6 312 3 309 312 350 58.0 1e-96 MEESTFVHKSVLLYETVGSLNIKPGGLYVDGTLGGGGHAYEVCKRLGSGKLIGIDQDADA IEAAGKRLLPFQDKVTIVRSNYANIKEVLKELEIEGVDGIYLDLGVSSYQLDTASRGFTY REDAPLDMRMDQRNDVTAAHIINTYSEMELYRIIRDYGEDNFAKNIAKHIVRARGEHPIE TTGELAEIIKGAIPARVRATGGHPAKRTFQAIRIECNHELEVLEQSIDTMIDLLNPGGRL SIITFHSLEDRIVKSRFRLNENPCICPHDFPVCVCGRKSKGKVVSRKPILPSEEELEENS RSKSAKLRVFERGQS >gi|157101655|gb|DS480669.1| GENE 173 199199 - 199624 447 141 aa, chain - ## HITS:1 COG:lin2148 KEGG:ns NR:ns ## COG: lin2148 COG2001 # Protein_GI_number: 16801214 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 1 141 1 143 143 174 57.0 6e-44 MFMGEYNHTVDAKGRLIVPSKFREQLGDEFVVTKGLDNCLFVYENSEWAALEEKLRTLPL TNAAGRKFSRFLLAGATTCEVDKQGRILLPAVLREFAGIEKDAVLVGVGSRIEIWSKDKW 
LDANTFDDMEEIAEHMEGLGI >gi|157101655|gb|DS480669.1| GENE 174 200011 - 200901 969 296 aa, chain - ## HITS:1 COG:BH3589 KEGG:ns NR:ns ## COG: BH3589 COG0682 # Protein_GI_number: 15616151 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Bacillus halodurans # 30 287 18 260 289 166 38.0 3e-41 MEAVTGADLSFVNLGITIEHLPNSITVFGFRIAFYGIIIGLGMLAGMAIACSDAKRRGQD PDIYLDFALYAIIFSIMGARAYYVIFDWANYKDDLLQIFNLRAGGLAIYGGVIAAALTLI VFTKKRKLSFFSMADSGVLGLITGQMIGRWGNFFNCEAFGGYTDNLFAMRIRKSLVNPAM ISQQLVDNEIVENGIAYIQVHPTFLYESLWNLGVLVFMLWYRKRKKFEGEMLWVYFLGYG LGRVWIEGLRTDQLKIPGTGFAVSQLLSAALVGVSAAVIIYRHRHPKKGDKVEESA >gi|157101655|gb|DS480669.1| GENE 175 200989 - 202326 1626 445 aa, chain - ## HITS:1 COG:SPy1150 KEGG:ns NR:ns ## COG: SPy1150 COG0446 # Protein_GI_number: 15675127 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Streptococcus pyogenes M1 GAS # 5 445 4 455 456 592 62.0 1e-169 MKETIVVIGANHGGTAVLNTILDQFPGNRVVAFDRNSDISFLGCGMALWIGGQISGPEGL FYSSKEAFEEKGAVIHMETEVERIDFDGKAVYARGKDGKSYVQEYDKLILATGSVPLRPE IEGMELDNVQFVKLYQNARDVIEKLKDTSVKKVAVVGAGYIGVELAEAFKRNGREVVLID LADTCLSGYYDREFSDRMAHNLQEHGVKLAFGEKVVRLEGKRRVERVVTDQGCYEADMVV FGIGFRPNTSLGQGRLRTFGNGAFLVDRRQETSVPGVYAVGDCATVYNNAIQAADYIALA TNAVRSGIIAAHNACGKVLESVGVQGSNGISIWGLHMLSTGISLARAKRMGYDAAAADYE DWQKAEFMENGNHKVKIRIVYDRKSRIILGAQMCSDYDVSMGIHMFSMAVEEQVTIDKLK LFDLFFLPHFNKPYNYVTMAALGAE >gi|157101655|gb|DS480669.1| GENE 176 202922 - 203041 61 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935720|ref|ZP_02083095.1| ## NR: gi|160935720|ref|ZP_02083095.1| hypothetical protein CLOBOL_00610 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00610 [Clostridium bolteae ATCC BAA-613] # 1 39 1 39 39 72 100.0 9e-12 MVPSVAVSKPEPGTKQGDRGKFRRNLDDFSSMCHRLRAV >gi|157101655|gb|DS480669.1| GENE 177 203257 - 204354 1457 365 aa, chain + ## HITS:1 COG:CAC2134 KEGG:ns NR:ns ## COG: CAC2134 COG0012 # Protein_GI_number: 
15895403 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Clostridium acetobutylicum # 1 365 1 365 365 455 65.0 1e-128 MKLGIVGLPNVGKSTLFNSLTKAGAESANYPFCTIDPNVGVVPVPDVRLDKLTAMYNSEK TTPAVIEFVDIAGLVKGASKGEGLGNQFLSNIREVDAIVHVVRCFDDPNVIHVDGSIDPI RDIETINLELIFSDIEILERRIAKQSKGAFNNKELAKEVELLKAIKAHLEDGKLARSFEP DDDDAMEFINTLNLLTWKPVIFAANVAEDDLANDGADNQYVQAVRGFAGENNCEVFVICA EIEQEIAELDDDEKKMFLDDLGLTESGLEKLIAASYRILGLMSYLTAGPQESRAWTIKVG TKAPQAAGKIHSDFERGFIRAEVVNYDDLMACGSMNAAKEKGLVRSEGKEYVMRDGDVVL FRFNV >gi|157101655|gb|DS480669.1| GENE 178 204746 - 205330 543 194 aa, chain + ## HITS:1 COG:no KEGG:Trebr_1792 NR:ns ## KEGG: Trebr_1792 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: T.brennaborense # Pathway: not_defined # 1 194 1 194 195 245 56.0 7e-64 MDCSKVGNLIYTLRTEKGMTQKALANAMNISDRTISKWERGAGCPDVSLLRELSDILGVN IEKILLGDLEPNKKDGGNMKRIKFYVCPVCGNVISSTGMAEISCCGRKLNALKARPADLS HSVSAKAVEGDYYITLEHDMEKQHYITFMAYAVYDRVLLMKLYPEQAAQVRFPRSGHGVL YICCNSHGLFSVSL >gi|157101655|gb|DS480669.1| GENE 179 205579 - 206256 539 225 aa, chain + ## HITS:1 COG:CAC3606 KEGG:ns NR:ns ## COG: CAC3606 COG1309 # Protein_GI_number: 15896840 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 205 1 202 202 108 30.0 5e-24 MTRIIKEADERRNEILDAAETLITEKGYSKTTIIDILNQVGIAKGTFYYYFKSKEEVMDA IIERFIEQDVQKARLIAMDKSISPVRKICRIIAAGQPRTDGPKDRMIEEFHLPANALMHE KSIVRSILALSPILGEIASQGVKERLFSTEHPLEAMQILLVSGQILFDSSMFTWTPCEME QKINGFIEAMEAVLGAEKGTFEPMKQILAAGSGCILQESESRRDE >gi|157101655|gb|DS480669.1| GENE 180 206256 - 207503 1190 415 aa, chain + ## HITS:1 COG:PAB0724 KEGG:ns NR:ns ## COG: PAB0724 COG0477 # Protein_GI_number: 14521293 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: 
Pyrococcus abyssi # 9 385 2 373 418 138 27.0 2e-32 MANSPISGNRDFRLMVIGQIISILGSALLRFALSLYVLDITGRADIYAALYAFSNIPLLI SPVGGAVADRFNRRNLMVLFDFTSGIIISLYYVSLRLGGTSIFLTGAVLILLSVISSMYG PAVTASIPLLVKEEHLEGANGLVNGVQALSNVAAPLIGGMFYGIFGVKALVCVSGTAFIC SAVLELFIHIPFRKREFTMPVIPTIAADLKEGFSYVGRNPLILRSMILASLLNLVLTPFF VVGGPVILRTAMKSTDAMYGIGMGTINLATILGALSMGAAAKKMRMENLHRLLAAIALLF LPMALSVTPAWISRGFYPSYLMFLACAVPIAMIMTIISIFVITKVQKETPNENLGKVMAI ITAVSQCAAPLGQLVCGFIFQTFPARIYLPALFTCIAMAVISLAARKLWSETAVS >gi|157101655|gb|DS480669.1| GENE 181 207568 - 208356 511 262 aa, chain - ## HITS:1 COG:FN0736 KEGG:ns NR:ns ## COG: FN0736 COG0500 # Protein_GI_number: 19704071 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 10 252 9 251 251 316 58.0 2e-86 MKTEELRKIWKAEEALAHIHGWDFSHIDSRYREEGSLPWDYRKIILSHLGDRDYLLDMDT GGGEFLLSLGHPYSRTSATEAYPPNAELCRRLLAPLGIDFREAPGNQTLPFADSCFDMVI NRHGNYDAGEVNRILKPGGIFITQQVGEDNDRELVELLLPGTPKPFPGMNLKEQSKKLKE SGFDILEQGEAFCPIRFFDVGALVWFAKIIEWEFPGFCVDGCLDRLAAAQGLLESGGSIE GTIHRYMMTARKREPVYLEPMY >gi|157101655|gb|DS480669.1| GENE 182 208457 - 208636 93 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935728|ref|ZP_02083103.1| ## NR: gi|160935728|ref|ZP_02083103.1| hypothetical protein CLOBOL_00618 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00618 [Clostridium bolteae ATCC BAA-613] # 1 59 1 59 59 99 100.0 7e-20 MAIHSCRVVEATKKHQKSQYLNVLNHETEKERKYVVFASNNIIQGYMRKSMSDNYPAQP >gi|157101655|gb|DS480669.1| GENE 183 208772 - 209269 260 165 aa, chain + ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 131 1 131 391 118 46.0 6e-27 MIYVGIDIAKLNHFAAAISSDGEILIEPFKFSNNYDGFYLLLSHLAPLDQNSIIIGLEST AHYGDNLVRFLISKGFKVCVLNPIQTSFMRKNNVRKTKTDKVDTFVIAKTLMMQDSLRFM 
ALEDLDYIELKSSVDSARTCEAAYTLKNSADFLCRPGIPGTTILL >gi|157101655|gb|DS480669.1| GENE 184 209340 - 209936 34 198 aa, chain + ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 6 198 196 388 391 175 44.0 4e-44 MHLTHLAHTLEVASHGHFGKDKARELRVLAQKSVGVNDSSLSIQITHTIEQIELLDSQLF STELEMANLVTCLHSVIMTIPGIGVVNGGMILGEIGDIHRFSNPKKLLAFAGLDPTVYQS GNFQAHRTRMSKRGSKVLRYALMNAAHNVVKNNATFKAYYDAKRAEGRTHYNALGHCAGK LVRVIWKMLTDEVAFNLE >gi|157101655|gb|DS480669.1| GENE 185 210670 - 210813 186 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEHVGIALEIMGKGMGGIFAAIIIIMAAVMILGRITKDRKDHKDGKE >gi|157101655|gb|DS480669.1| GENE 186 210877 - 212025 1478 382 aa, chain - ## HITS:1 COG:SPy1184 KEGG:ns NR:ns ## COG: SPy1184 COG1883 # Protein_GI_number: 15675155 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Streptococcus pyogenes M1 GAS # 1 379 1 373 373 381 61.0 1e-105 MSFLLEGITAVTWQQAVMYVVGIILIWLAVKKEYEPSLLLPLGFGAILVNLPYSGVVDQM VQGKVPAAGIIEWLFKTGIEASEAMPILLFIGIGAMIDFGPLLSQPVLFLFGAAAQFGIF AAILIACLMGFDLRDAASIGIIGAADGPTSILVSQVLGSRYIGPIAVAAYSYMALVPIIQ PFAIRLVTTRKERCIHMEYNPKSVNKQLRIAFPIAVTIIVGLISPQSVALVGFLMFGNLL RECGVLGSLSQTAQNEFANIITLLLGITISFSMRAEQFVNPATLMIMVLGLVAFVFDTIG GVLFAKLINVFLKMAGKKPVNPMIGGCGISAFPMSSRVVQRMAAREEPGNIILMQAAGTN VSGQVASVIAGGLVISIVSQYL >gi|157101655|gb|DS480669.1| GENE 187 212059 - 213396 1704 445 aa, chain - ## HITS:1 COG:CAC0460 KEGG:ns NR:ns ## COG: CAC0460 COG1253 # Protein_GI_number: 15893751 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Clostridium acetobutylicum # 1 438 1 439 443 361 46.0 1e-99 MESDPEAANILFQITILVILTLINAFFAGSEMAVVSVNKNKIHRLSEQGNKNAALIERLM KDSTVFLSTIQVAITLAGFFSSASAATGIAQVLAVRMAQWNLPYSQTLAGVVVTIILAYF 
NLVFGELVPKRIALQKAEAFSLFCVRPIYYISRIMNPFIKLLSLSTSGFLKLIGMHNENL ETDVSEEEIKSMLETGSEAGVFNDIEKEMITSIFSFDDKKAKEVMVPRQDMVALDINEPL EEFLDEILESMHSKIPVYEGEIDNIIGVLSTKALTIEARRTSFDKLDVRTLLKPAYFVPE NRRTDALFREMQANKIKLAILIDEYGGVSGMVTLEDLIEEIVGDIHEEYEEEEPELTELE PHKVYRVSGGITLFDLKEEMHLHMDSSCDTLSGYLMEQLGYIPSREQLPLTVVTPEADYE ILEVEDRVIEWVKLTLKETKKEQEE >gi|157101655|gb|DS480669.1| GENE 188 213518 - 214354 1049 278 aa, chain - ## HITS:1 COG:SP2113 KEGG:ns NR:ns ## COG: SP2113 COG1284 # Protein_GI_number: 15901928 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 8 272 43 308 313 122 27.0 8e-28 MAAFLCILLGSLIVAVNMNTFVEQGELVPGGFTGLAKLIQRIGLSFYDVQISFTLLNVLF NAVPAVFAYRLVGKKFTILSCICLLTVSILVDELPVVPITGDILLISVFGGIINGLGMSL VLNSRACGGGTDFIAMSLSAKYKVSTFNYMLLFSAVIILISGTIFGMDKALYSIIFQFCN TQVINTFYKKYKKKTLLIVTDNPAAVSADLLELTNHSSTILKGFGSYSANKKYLIYTVLS DSDVRKMKKRIREQYPDTFVNVINSSDVVGNFYIQPLE >gi|157101655|gb|DS480669.1| GENE 189 214469 - 216238 179 589 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 364 571 36 249 329 73 27 9e-12 MSSKKNALDKNRSRETLKRILKYISQYKWSVILSLFLALITVALTLYVPILTGRAVDRIV SAGHVDFAGLKGILITILGAVAVTSVSQWLMNHINNIITYRVVKDIRTRAFNQLEVMPLS YIDVHPSGDIISRIIADIDQFSEGLLMGFTQLFTGVLTIAGTLLFMLSIQPAITAVVVVL TPISLFVASFIAKKTFVMFRKQSETRGELTSLTDEMLGNMKVVQAFGHQEEAQKQFEEIN ERLAGYSLRATFFSSITNPATRFVNSLVYAAVGITGAYGAIRGLITVGQLTSFLSYANQY TKPFNEISGVVTELQNALASAARVFELIDEKTMIEDREDATVLRGAKGSVDLEQVCFSYT PEKKLIEDFNLSVQPGQRVAIVGPTGCGKTTIINLLMRFYDVDSGAIKVEGTDIRDVTRE SLRTSYGMVLQDTWLKSGTIRDNIAYGHPNATEEEIIQAAKQAHAHGFIMRMPDGYATVI SEDGGNLSQGQKQLLCIARVMLCLPPMLILDEATSSIDTRTEIKVQKAFAQMMEGRTSFI VAHRLSTIREADVILVMRDGKIVEKGKHEELLGMHGFYADIYNAQFARG >gi|157101655|gb|DS480669.1| GENE 190 216228 - 218048 1868 606 aa, chain - ## HITS:1 COG:CAC2393 KEGG:ns NR:ns ## COG: CAC2393 COG1132 # Protein_GI_number: 15895659 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport 
system, ATPase and permease components # Organism: Clostridium acetobutylicum # 1 601 1 577 577 559 47.0 1e-159 MKRLFSYMKDYKLESILGPLFKMLEASFELFVPLVVARMVDVGIRGRDSGYILKMGGVLV LLALIGLACSLTAQYFAAKAATGAATSLRNDLFSHIGRLSYTEIDTVGTSTLITRMTSDI NQVQSGINMTLRLLLRSPFVVFGAMVMAFTVDVRSAFVFVVTIPVLCVVVFGIMAVSMPL YKSVQRQLDKVLLTTRENLLGVRVIRAFNRQKSETEKFDRENGNLVRMQVFVGKISALLN PVTYVIINIAVVAVIWVGAEQADSGIITQGKVIALVNYMSQILVELIKMANLIIIISKAV ACMNRVDSIFKVESSIEDKGRHGSRKPGSQNSGSQNSGSQNPGPQNSGPQNPALRIPKVE FKDMEFVYAGAKEPALKDISFCAMAGQTIGVIGGTGSGKSTLVNLIPRFYDAASGQVLVD GTDVKEYSLDELRDKTGVVPQKSVLFKGTLRDNMRWGKQDASDEEIYRALDTAQAREFVD SKGEGLDLYIDQGGHNLSGGQRQRLTIARALVRRPEILIMDDSASALDFATDARLRKAIR ENTGDMTVFIVSQRATTIKSADTILVLDEGRLAGMGTHKELLKDCQVYREICLSQLSKEE VERDEQ >gi|157101655|gb|DS480669.1| GENE 191 218184 - 219056 893 290 aa, chain + ## HITS:1 COG:CAC2394 KEGG:ns NR:ns ## COG: CAC2394 COG0583 # Protein_GI_number: 15895660 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 283 1 284 286 221 40.0 9e-58 MNLKQMEYFTAIVNEGSISAAAKALHISQPPLSTQMKLLEEELGLTLFQRGSRSILLTDA GKVLYERAQGILDMTGAVLEELEQMGQGLLGSLRLGMISSVETEPIIGKIAAFRQEHPGV KFRIYEGNTYQLLDKLNTGIIEAAIVRSPFPEEPYDCFYLSGDIMIAVGERCFFPRPGAD TISLKELCSCPLVVYRRWEGVLNRLFAPDCPDYLCINDDARTSLTWAETGSGVAVVPASI ARCVRPDILMKPLDTRDFTSRITLVCRKNGPLSAVMKEFLDYFTGQPPLS >gi|157101655|gb|DS480669.1| GENE 192 219129 - 219968 944 279 aa, chain - ## HITS:1 COG:lin1818 KEGG:ns NR:ns ## COG: lin1818 COG1295 # Protein_GI_number: 16800885 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 3 266 13 273 289 124 29.0 2e-28 MFGIIVAGKRVYDKFAEDEMTVYAAQVSFFIILSVVPFIMLLLTAVQMIPSISNARFMEL IVGLVPVDYKSLAFRVVNDLSLKSPATMISVTAVTALWSAGRGMFSVARGLNRVNGHGEK RWYVINRLICSGYTIVFILVCILSLGLLVFGNMIQAFMVSRFPIIADVTTHIINFRALWA LMILIIFFLGIYTFVPDKSLKLRDQLPGAVFSTVGWMAFSFAFSLYFSHIGGKNYSYMYG SLTAIVLLLLWLYFCMCILFFGAEINYFWKELFPGETKE >gi|157101655|gb|DS480669.1| GENE 193 219985 - 220257 211 90 aa, chain - ## HITS:1 
COG:CAC2830 KEGG:ns NR:ns ## COG: CAC2830 COG1254 # Protein_GI_number: 15896085 # Func_class: C Energy production and conversion # Function: Acylphosphatases # Organism: Clostridium acetobutylicum # 12 90 11 91 91 61 41.0 5e-10 MKKVRKHIIFSGRVQGVGFRYTSCYLARPLGLTGWVKNLWNGDVEMEVQGDPLAIGRFLR NLEQGRFIHIEHMEAEDIPVIEEGSFREIY >gi|157101655|gb|DS480669.1| GENE 194 220265 - 221209 1325 314 aa, chain - ## HITS:1 COG:CAC0089 KEGG:ns NR:ns ## COG: CAC0089 COG0111 # Protein_GI_number: 15893385 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 313 1 316 318 333 53.0 4e-91 MKMVIIEPLGVEEGKLLAMAKESLGDRVEIVYYNTKTTDTAELIERGKDAEIIVVSNLPL NAQVIEGCRNLKLLAVAFTGVDHIAMDVCRKNGVMVCNCAGYSTCAVADLVFGMLISLYR NVIPCDKVCREEGTKDGLVGFELEGKTFGVVGTGAIGLRVAAIAQAFGCRVLAYSRTAKD VPGVRYVDLETLLAESDVVSLHTPLTEETRGLMNEKRIGLMKKNAVLINTARGPVVDSDA LAGALKEGRIAGACIDVFENEPPVRKDHPLFSAPNTIVTPHVAFATKEALVKRAVIVFDN VVNYLDGTPRNVIG >gi|157101655|gb|DS480669.1| GENE 195 221434 - 221901 583 155 aa, chain - ## HITS:1 COG:mlr3802 KEGG:ns NR:ns ## COG: mlr3802 COG0251 # Protein_GI_number: 13473262 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Mesorhizobium loti # 5 154 4 151 153 110 44.0 1e-24 MRDVYEVLKEKGITLPQPPAKGGVYTPVQELGEKFLYCSGCGPDLGNGNTVVGKLGKDLT VEDGQKAAYNCVLNLLANLNEKTGDLNRIKRFVKVLAFVNSADDFGMQPQVVNGGSNLIA GLFGEEAGLPARSAIGTNALPGGIACEIEVLVELK >gi|157101655|gb|DS480669.1| GENE 196 221919 - 223049 1086 376 aa, chain - ## HITS:1 COG:AGpA709 KEGG:ns NR:ns ## COG: AGpA709 COG0624 # Protein_GI_number: 16119709 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 349 14 362 387 182 33.0 7e-46 
MNCIELLRELIGFDTVNPPGNEKPAARYLAGILEPMGFKCEVQDLGGSRANLIAVLDGGD GPELMLNGHLDVVPAVGEWDSSPFSMEEKDGKLYGRGTCDMKGGIAAMCEAAMRCAARKE PMKGKLKLLFVADEECSNLGTLSYLKTHERSDYAIIGEPTRLEVAVAHRGVSRDYIDVKG APRHAALPAGEEDAVMKACRAVRAVKDMNETLRHITHPVLPPPSIAVTMMEGYEKDNVVP GNVRLLLDFRIHPGMDHEQVGQFLDKGFEQAGIDGFQRTLHFYMPGGEIPQDDRFVKLCL EERERQFGIKSDPQPFDASCEQCFLAREGIQTVICGPGDIAQAHTVGEFTWEKQVRDAVS LYERIIDRVLYKTNYI >gi|157101655|gb|DS480669.1| GENE 197 223075 - 224181 1033 368 aa, chain - ## HITS:1 COG:BMEII0546 KEGG:ns NR:ns ## COG: BMEII0546 COG3616 # Protein_GI_number: 17988891 # Func_class: E Amino acid transport and metabolism # Function: Predicted amino acid aldolase or racemase # Organism: Brucella melitensis # 8 364 11 358 360 94 25.0 3e-19 MEHMYELKNTDTIISPSLIYYEEIIRENIKKAIRTAGSPERLWPHVKSHKSKDMVRMQME YGITKFKTATVAEAEMAAEAGAEKVILAYPLIGPNMERFVKLAKAYPDTVFYGVEDDLAQ FEALSGVCVEHNAVLPMLVDVNMGMNRTGVPIERLEELYRSASSLPGLRLCGMHCYDGNH NNKDAAVRQAQVDDTDKKVSDIMERLNGDGICCELAVAGGTPSFPCHAGATRWYLSPGTA FITDAGYYMNLPDLDFIPGAAVMTRVISHPAKGVFTTDLGYKAIASDPAGQRGYIVGLED AVPIIHSEEHWAFRLEDESRIPAIGSCLYVIPTHICPTTALYPEILVAKEGQVADRWQVT ARNRRITY >gi|157101655|gb|DS480669.1| GENE 198 224193 - 225434 1032 413 aa, chain - ## HITS:1 COG:BH2935 KEGG:ns NR:ns ## COG: BH2935 COG1228 # Protein_GI_number: 15615497 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Bacillus halodurans # 51 405 46 381 394 123 26.0 9e-28 MEIPVLVHGKGIYDSDDMKMKYTGMILEGSRIKEMGSFEELGPAYPGARVMDFSDCYILP GLVNTHVHLEFTPCRDTYGIYRREDRKTRLARAVEHAGTLLRSGVTTVRDAGSSMELIGA LWGGDAGAKPGTELPRIQAAGMPLTPDGGHLAFLGKTSNGTEGLIQAVKERKSAGCGCVK LIVSGGQLTPGSVPEHDSYSREEIRAAVEAAHELGLPTAAHCLTTSSYVNAMEAGVDSVE HCACFRRNPFLGLLERCYEPDVMEAFRGDHRYFMIGISNNYHQLDQVREGVRKPDEREAF LLEQEKRECEIFGRLADLGMRPLVGTDAGCGYTFFDETWLEMELLCSRCALTPEKVIHAA TLEGAGALGWGDFLGRLAPGYEADFIAAEQNPLRDIKALRHIKHVVCRGKIIE >gi|157101655|gb|DS480669.1| GENE 199 225486 - 226322 865 278 aa, chain - ## HITS:1 COG:AGc4555 KEGG:ns NR:ns ## 
COG: AGc4555 COG0395 # Protein_GI_number: 15889772 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 278 12 285 286 183 37.0 3e-46 MAVKKYRAKKIISKLLFGILVLVIVIPILFPLYFVLISSFKNMAQVYIMPPKLFGFKPIL DNYIYIFKTQHYGTYMMNSTIVAVASTALSLLLGVPAAYAIARYKMGKANTAILTARLLP NISILLPYYFIFSKLRMIDTYGVLILSHMVLSLPLIVWIMVGFFSDLPLELEEAAIVDGC TRQKCFKDVLLPISAPGLVTCSTLSFLGSWNNFQFALILSGEKTRTLPVSLQYFVSGADI RWGRMLAATIVVIVPTIILTMILQKYIVQGMTAGAVKG >gi|157101655|gb|DS480669.1| GENE 200 226312 - 227196 1053 294 aa, chain - ## HITS:1 COG:AGc4553 KEGG:ns NR:ns ## COG: AGc4553 COG1175 # Protein_GI_number: 15889771 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 11 287 29 303 312 198 38.0 8e-51 MNNFVNKHIKWVFSLPALIFMMAMIAVPIAMTFLFSLNDWNLLMGNGMRFNWGKNYLDVL ASREFWQSLGITFYYTAIATVAEMLLGLAIALVLNKDFWGKNGVKAVVLLPYMMAPVAVG LMWMLFYEPSSGLLNYIFTTVGLPRSAFVSAKQTVIPAIAAVETWQMTPMVVIVCLAGLS SLPTDPMEAARVDGATPIQTFFQVTLPLLTQTLFSIGLLRFIDVFKSFDLIYAMTKGGPA NASRTLNLFAYETAFSYYKFGLSSTMLMILFVIVLLLSVLVMKIQRKLVAYYGG >gi|157101655|gb|DS480669.1| GENE 201 227266 - 228687 1748 473 aa, chain - ## HITS:1 COG:AGc4552 KEGG:ns NR:ns ## COG: AGc4552 COG1653 # Protein_GI_number: 15889770 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 35 459 16 438 443 175 30.0 2e-43 MKRKLCLALSAVMLAGSLSGCSGGTSGLPGDTSKASEAPAGKEAEGTKGESAAGEAGEKP YAGTTIRVLSMTGQISDAIEAYLDDFEEETGIKVNLELYGEAQLREKITTEFLAGSSTID AFLMSPLQDMAAYSNNGWIEPLDEYLKDPDFDWDDFSSAPMEQIKIKSDGAIGALPLYSS VQLMFYRKDVFEEKGVKVPTNYDELLDVCGKINDPANNFYAIACRGEKIALTSQFSPFLY GFGGAYFKDGTCAFNTPENLEAARFYGKLLGDYAPEGILTAGYSQMTQLFNAGQVGMCVD AIALYQTLIDPNESSFYDKVGVAPIPEGPAGRQSYKQVVWGASIYSGSKNKEAAWEFLKY 
AAGKDIAADITPKGMPTFRASVWEDERVTAAMPEDYIQAYNEEIQADTTNQYGLPRMTAV SEARDAMGEAIVYSIETKGEGADLESKMNAAAEKVDQLLKDAGEYGADYPYED >gi|157101655|gb|DS480669.1| GENE 202 228858 - 230438 1414 526 aa, chain - ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 523 1 522 526 140 24.0 7e-33 MLKVMIVDDEIIVRVGFQSCINWEAYGCQVVSTCESGGDAVEYMNRDVPDIVFTDIMMPG MDGIQLVKYISENHPDTKVVVLSCVDEIDYVKKAIKLGAEDYILKLSFTQDTLVELITRL KSMIEEERKKAGRGSLDMRVQSFNREEDLRTLLLGNQGPGDNEMLLDRLGYTYNPFETYQ AGCFLVDDERMNRPGSGTDSHIRKYGLLNIVREYLENLPKSDLAFTGEHEIVVLFRREGG ESPACFFPDTLDLLNHALKTHLNLTLSMGMGQECLSRMDIPAGFAQARRMAALRFFDGPA AFHCGREAAAGRPFIAKRSVQRNMQEAIFRQDAGEAFRLIDSWFEEMAGLRSYDQIQAIR RGVVETWVFISGYSIPEGADMPEYDEIYSTGDFWGAETLADLKGCFKAAVQSVVDYLMAN KNANPEITRFLQYLEDHVDENISLEEAAGWCALGKSQFCILFKKAAGDTFVNYFNGLKMK KAFTLLGSSNIQVQEAASRIGIHDISYFSRLFKKYYNMSPSDVRKL >gi|157101655|gb|DS480669.1| GENE 203 230432 - 232309 1739 625 aa, chain - ## HITS:1 COG:BH1909 KEGG:ns NR:ns ## COG: BH1909 COG2972 # Protein_GI_number: 15614472 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 124 623 77 595 597 139 25.0 2e-32 MKWPSGRSGGVSLMKKMKVNLRIKIIYTCIIAVFVSVFLIMQYSFNRSRNLLITQEAGII SQYMNRNELALTEVTDSIRKLSAASSTNKQVASYLNQSCDGNIYSSENVSRIRAVEETLT FYRNIFFDYRLHYIILGADSTVYSVADGIDNSSYFGEKFAQSVRQQPWYREFLEGKEVSG WIAPCIYNDKGVFKERIESGKDEDFILFIRRIKDYNSLKFLGVSFVSFPTENLAQVLIPY DGAVLALFDKGNHLIYATEQSDILDDISAGRLSEDLQEDQDYFHYTKDGREYLVNHVTMA GTGWRMVNLVPLSHITQAVDKLHSTVTSLTVLMALGACMVCLAMYFYVNAPLNRLIHKVS EVNIGGTKIADMEDAKGRQMPVFGIVEAELEISRMVDYIEKLSAQTIKQKEIEQNLRYEM LRAQINPHFLFNTLNVIKWSSMISGAGNIADMITSLGILLESTMNRREEEVPLKEEIRVV KAWVEIKNWGLKNRIELVTDIQQELEEFKVIKFFLQPLVENAVLHGMGEATHGTIWVTAE PYGERVCVTVQDNGVGMEQGKLDEILKEMDENSKKRHVTGIGLTSIHELMKARYGPDYGL 
YIESRIGYGTKVFAVFPYRRGSETC >gi|157101655|gb|DS480669.1| GENE 204 232541 - 234079 2015 512 aa, chain - ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 9 475 12 475 511 235 32.0 1e-61 MKKAIAMCLIACMVMTGCGGSKGGAGEPQETVAGSDGQNVKIKDEIIFCQGNDLTTLDAT IGQQERAYSISNNIFDPLFTYDSDMKMQPCLATEYEWADDTTLKLTLREGVKFHDGSDMN AEDVKFTLDLIDQRGALFVGNYDGCTIDDDYHVTVKLKAPNPAFLSILTLPQASVVPAEY DEAAFGANPIGTGPYKLKEAVEGDYYTLERFDDYWGGPAKTKLLTMKIVPEASQRTMMLE TGEVDVAYEVPNNDISRIQENKDLQILTSPSMKIILMELNCSSDGPVGNPDVRKAVECAI NKQTIVDSLLYGFGTVSDNIIPQSAQDYREYETNGYDPEKAKSLLAQAGYSDGIQLTLWT NSNQTNTEIAQVLQSQLAEVGINLNIVTQDDNTSFSLIEAGEDFDLMLDFWQTNTGHADY VFNGMLLSTSVNNFSRYYNPEFDETYVKYASTGEGEEREALLKQLYDDMVTDTPIIGLYS ETKVIAATSKLQGLMLSQIGAHEYQNAVVTED >gi|157101655|gb|DS480669.1| GENE 205 234111 - 235088 802 325 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 6 316 11 326 329 313 52 5e-84 MEHKGLLEVRNLKKYFDTNRGPLQAVDNISFSIEKGRTLGVVGESGCGKSTLGRTILRLQ EPTSGEIWFNGEDIVKYDKKQVKALRSKMQIIFQDPYSSLNPRMTVSQAIEAPLVLQGIY KKNDREGLQKKVREMMDLVGLAYRFANSYPHELDGGRRQRIGIARALALNPEFVVCDEPV SALDVSIQAQVLNLMQDLQEQMGLTYLFITHDLSVVKHLSDDIVVMYLGQLVERAKPDEL FEHPLHPYTQALLSAIPIPDPTIKMEREILNGELSSPINMGDGCRFAKRCPKATPECTRP IREVEASPGHFVSCHLLCKENQAGS >gi|157101655|gb|DS480669.1| GENE 206 235088 - 236089 597 333 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 324 3 328 329 234 37 3e-60 RGSKTLSHILEIKDLQVQYHTEDGTVYALNHVDLELDDGETLGLVGETGAGKTTLAKSIM RLIPDPPGKIEGGVIEFEGRDLLKASGAEIRAVRGNQISMIFQDPMSSLNPVVRVGEQIA EVIRVHEKCTKKEAEEKAMEMLETVGIPAERAGDFPHQFSGGMKQRVVIAIALACNPKVL IADEPTTALDVTIQAQVLEMMRRLKDKYHTATLLITHDLGVVAQMCEKVAIIYAGEVVES GAVLDIYKHTLHPYTEGLLGAIPQVHLHVHRLSPIDGLMPDPMAQRTGCPFADRCKYAFE 
RCRVEHPRFIDAGNGHRVRCFKVENELRKGAGE >gi|157101655|gb|DS480669.1| GENE 207 236089 - 237003 1316 304 aa, chain - ## HITS:1 COG:FN0398 KEGG:ns NR:ns ## COG: FN0398 COG1173 # Protein_GI_number: 19703740 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 4 304 1 289 289 265 44.0 9e-71 MSTLKKTENNLGQDEKVRSLWKQSVIQFKHNRLAVVGLVIIVLFIVTILTTLAVDQITNY SFYNEYILKQDLSLRLAKPSLAHPFGCDEFGRDLLLRILWGTKLSFAIGVVAIALSVIMG APFGMIAAFYGGKTDNAIMRVMDVLLAVPYMLLAMAIVAALGTSTFNLLLALAVSGIAKY ARIARAAVLTVKDSEFVEAARAVGASDGTILFQYILPNALAPILVQVSLGIGDSILAVAG LSYLGLGVQPPQPEWGAILTTARTYMRDAWHISVFPGLFLILAVIAFNLFGDGLRDALDP KLKR >gi|157101655|gb|DS480669.1| GENE 208 237000 - 237968 321 322 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 65 312 43 316 320 128 27 3e-28 MLRYIIKRIIFAIPVLIGATFLVFTIMDLAPGDPARIILGSNATEAALASLRDSMGLNDP FLVRYGRFILGLLHGDLGTSYRNGQAVFSLIMGRLGNTVVLATSGIVFSVIIGIPVGIIS AIKQYSVLDNITMFITLFLMAVPTFWFALILVIIFSLNLGWLPSSGMQHGFPDNLISLIL PTLTLGCGYAALIARTTRSSVLEVVRQDYIDMARAKGLTENAVIWKHMLKNALIPIVTVV GLNFGTLLGGSVLTESIFSWPGVGRFVVDSISYKDTPAVLGSVVTLAILFTIVNLIVDLL YAFLDPRIKSEYQSSWKRAKKK >gi|157101655|gb|DS480669.1| GENE 209 238003 - 239823 1801 606 aa, chain - ## HITS:1 COG:no KEGG:Pat9b_4973 NR:ns ## KEGG: Pat9b_4973 # Name: not_defined # Def: peptidase M28 # Organism: Pantoea_At-9b # Pathway: not_defined # 1 584 1 532 546 253 30.0 1e-65 MDMDSWAEMEEQILKELNLHRMKDDAEVFSRLERYSGSEAGEAAVDYLVQELEAAGIEHE RHYYELMRSLPVQASVTVKKSGEPDFTVEAIAAVYSGEAHGLTGELAWDEMCAKGQLNGM EQEERFRTFKGKIVLTYDISFPFYYEAARAGALGIVAIWPKDIHHHDTMGGVWGMPGSRD RDLYPYLPYVQILGQDGLKLIEMVKAGTAAVGTEAAKGMAVTAQMDVAMDNRIVRSSMPV ATIPGKSESFVLLSGHYDSWYEGMTDNGAANVLMLETARALEKFKDRLNRTVVIAWWSGH SDGRYSGSTWFCDHHYEYLRKHCVAHINMDICGCKGSNAVRFDMSGMEGEAFNDEFLASY NSRKPLAYRALDRSSDQTFWGTMTPVSIAPQFYMDDGQTPQPPKSSDILRPAAMPAAFGV 
GGPFYWWHTREDTLDKIGDDVLARDCEIAARLVLRYAMEKPLPIDMNGFMGEMQSYFEAF AEELDPDFDVAPVLASIALTRKSVEKLSDAIRAYPKQDADSILIRTAGELVRLKYTYSSP YGHDYAVEHQPYAVFSSLLGVHRDNTPEDRYLMSQTDFIRQRNRMTGQLHEVCEAIELQL YRWQVQ >gi|157101655|gb|DS480669.1| GENE 210 240067 - 240525 327 152 aa, chain - ## HITS:1 COG:FN0643 KEGG:ns NR:ns ## COG: FN0643 COG3708 # Protein_GI_number: 19703978 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 152 1 145 145 169 51.0 2e-42 MAYSLKEVTIRTDNSEEGMEHIGELWRHIMDGRLPILFDSGHQFRAGISPVSRYSGYDGD ETGSYDLTVMAVTSDFFTQMDEKVKMGQYRKYDAWDENGDLGACTRKAWETVWSQQKAGI IHRCFTEDYESTVPAEYTKDGKAHCYLYIAVR >gi|157101655|gb|DS480669.1| GENE 211 240818 - 241186 376 122 aa, chain - ## HITS:1 COG:MA2818 KEGG:ns NR:ns ## COG: MA2818 COG3603 # Protein_GI_number: 20091642 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 120 6 131 135 98 39.0 4e-21 MEIKVIEGAFSVCRLKELGQAVIEDDLYFLGRTDEEISLVCRTESVPQDTAAREDGWRAF RIQGELDFSLIGILAGISAVLAENRIGIFVVSTYNTDYVLTKADDFERALGLLESRGYGI AK >gi|157101655|gb|DS480669.1| GENE 212 241273 - 241404 150 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935763|ref|ZP_02083138.1| ## NR: gi|160935763|ref|ZP_02083138.1| hypothetical protein CLOBOL_00653 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00653 [Clostridium bolteae ATCC BAA-613] # 1 43 9 51 51 82 100.0 7e-15 MSIGIDFTVKVLHRKTFMDRVKELAEAEDYNIHIWDEGLRVEL >gi|157101655|gb|DS480669.1| GENE 213 241460 - 242554 873 364 aa, chain - ## HITS:1 COG:no KEGG:Closa_2898 NR:ns ## KEGG: Closa_2898 # Name: not_defined # Def: acyltransferase 3 # Organism: C.saccharolyticum # Pathway: not_defined # 19 364 6 352 356 237 37.0 7e-61 MRQTEKNRKNPRNGNSKARDPRYDCLRAVSVMAIIMVHAMPVETVSARQWWFNSIMTPFL LSFVGIYFMLSGMFLLEHGTERIGEFYRKRFITVGIPFLVYGLIYYCYNVHADGVVLPVW KHAGRYLAQVLTAGLPRAGHMWFMYAIVSFYICAPFLSRMVKNMTDREMKVFLLLMLGIH 
VVEVAGEILGLDVKPWAQFVLYTGWVYYFLLGYGLKRLCRKEQFPIFAVLAVFGLAMDVA ADIGLSWWVPQTPHKSPAMVFICAGVFLLFEYYGGKVPVWAGKLGIFVSRYSFSIYLIHF LILSYYVNPVLLKDMAARHYILGTMVSTVVTFCLSLACSMVTDHLAVRPLERLAERIRFE KDKI >gi|157101655|gb|DS480669.1| GENE 214 242589 - 243437 567 282 aa, chain - ## HITS:1 COG:CAC0535 KEGG:ns NR:ns ## COG: CAC0535 COG1237 # Protein_GI_number: 15893825 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily II # Organism: Clostridium acetobutylicum # 9 274 1 266 268 302 53.0 6e-82 MKKAGTDGMKLRVLVDNNTYIDRYYLGEPAVSYYIECDGIRLLFDAGYSDVFLKNAEALG IDLGLVTHMVFSHGHNDHTRGLKFMKERLELSGMEVIAHPSCFDEKVFGEESIGAPYSAA DMAGICRYRPCGGPCNISERLVFLGEIPDAVDFETRKAIGRTRIRGKWQDDYVRDDSALV YRGEEGIFIITGCSHSGICNIVQYAQAVCGQQRVLGIIGGFHLFEADERLVRTIDYLKCC GAEQIYPCHCVSLRAKARMMEALNVKEVGVGMELCLDGQAAH >gi|157101655|gb|DS480669.1| GENE 215 243571 - 244092 345 173 aa, chain - ## HITS:1 COG:no KEGG:Amet_3236 NR:ns ## KEGG: Amet_3236 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 43 167 49 173 178 117 45.0 2e-25 MMGPWKAAAAAYLAVFIFIPERSGEIGACACTGLGQFIWNLVPVFLCVGLMDVWIESDKM IKMMGDRSGLSGMGISLLLGMLTAVPVYALLPIAGLLLKKGGRISNVLIFLCSSVSIRIP LLLFEISSLGIRFAACRFVLNLAVVFVISFLVERFLSERDRQDIRRRNQINQV >gi|157101655|gb|DS480669.1| GENE 216 244116 - 244601 369 161 aa, chain - ## HITS:1 COG:no KEGG:Amet_3237 NR:ns ## KEGG: Amet_3237 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 5 161 2 158 159 107 45.0 1e-22 MNTGFTLFFYGLSAILLSLSFRKDREKTNRAVRKALFMMLGVLPYFLIILLVTGTAFFIL PPKTVQTLMGTESGIRGMLLAAVTGAAALVPVLAVFPVVSELLKNGAGTAQMAVFISTLT TVGIVTIPLEVKYMGVKAAVLRNLLFFLLAFATSSLLEVLL >gi|157101655|gb|DS480669.1| GENE 217 244700 - 245362 565 220 aa, chain + ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory 
subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 13 219 14 216 229 79 25.0 6e-15 MNENIQDIMQAKLFSGLSQKDVCRLLACLKAQSVNYAGETVVVEEGCPVKKFGILLAGRG KSYKTDSQGHTLTVTLLKEGSEIGVILAASPGRKSPVSVTVEQGSSVLFISYNRLINNCT RNCPCHRQLLKNFMWIVAEKGLVLHERLDCLLRPTARDKILTYLKNFSGLENGLPFTIPL DRNAMAEYLNMDRSALSRELSGMKKDGIIDYYKSTFRLFH >gi|157101655|gb|DS480669.1| GENE 218 245412 - 245756 252 114 aa, chain - ## HITS:1 COG:CAC0464 KEGG:ns NR:ns ## COG: CAC0464 COG3323 # Protein_GI_number: 15893755 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 8 105 7 104 113 104 51.0 5e-23 MEPFTYCKLEIFIPETHLKALQKALQEVDAGHIGKYDSCMSYSPVTGCWRPLEGSSPFIG KCGTISMEPELKAEVTCRTERLEETLAAIKKVHPYEEPVINVIPLYMTGMDIVS >gi|157101655|gb|DS480669.1| GENE 219 245819 - 248074 1470 751 aa, chain - ## HITS:1 COG:BS_yvgS KEGG:ns NR:ns ## COG: BS_yvgS COG3973 # Protein_GI_number: 16080398 # Func_class: R General function prediction only # Function: Superfamily I DNA and RNA helicases # Organism: Bacillus subtilis # 83 590 91 673 774 233 31.0 1e-60 MSDYPAIPFPDELVHLEQVSARLSKALQRAEASVSRADREYTDTKKYMADHRGEIDPHEM FQNELLLKQTDRTGAFAVEMRDRIVKLKDSPYFARIDFKEEEEEEGQTYYIGRFAFQYEN EPLIFDWRAPISSMFYDCEVGEAGFLAPAGWTAGELTRKRQFKIRNGIMEYALESSANVQ DDVLQRELSHTSDEKMKSIISTIQKEQNQIIRNEKEGTLIIQGVAGSGKTSIALHRIAFL LYRFKDRLSARNVTILSPNKVFGDYISGVIPELGEEPIYEMSFGELADIQLEGVIGFEPD RDPFEMQDRAWEERVRFKSTLDFVYMMDQYIEQMPEFIFVPTDYVYEGFRVTGEWIRDRF LAYGTCPVKKRLAMVADDIHDRFETDNIMEQEVPRPRVILKQLNSMLAMKDTLAVYKDFY KRMGIAEQFVMAARRTLEWADVYPFLYLHAAFRGLKESHITRHLVIDEMQDYTPIQYAAL NRMFPCQKTILGDYGQYINPNHLHCLEDLRTIYDKARFVELNKSYRSTYEIMCFAKKINH VSALEPMERHGEPPELVPCLDAADEIRKIREVIRRFRAGGNVSLGIILKTDAAARDMYEV LAGYDGAKGNGSEGCEKEGSKREGSKWEGSEREGSEREGNEREGNEGKRGKIGESGAKER GINLITRDSASFQNGISITSVRMSKGLEFDEVLIPQADSRTYASDFDRSLLYIACTRAMH RLTLTYSGRETRFISGQKLSGRHSGKESPGS >gi|157101655|gb|DS480669.1| GENE 220 248067 - 248915 759 282 aa, chain - ## HITS:1 COG:BH3634_1 KEGG:ns NR:ns ## COG: 
BH3634_1 COG2207 # Protein_GI_number: 15616196 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 1 114 1 112 132 89 41.0 6e-18 MSSMDEVILAINFIEANLTKKMDLDMISGAVHYSKYHLHRVFSDTVGLTIHDYIQRRQLT EAAKLLVFSGKPIVQIAFLSGYQSQQAFTNAFTAMYKMPPNMYRENERFYPLQLKFNFEG SYEMLDRKEKALWEISFASDEDIPSWMELVRLVIDGFPHLNEEEYIRVLRQKISTGQALI MKDRGSAIGIMLFSYENGSIDFMGSHPLYRKNGVPKAMLDKVMKELLKGKEISITTYREG DKADTGYRKEIKGLGFAEAELLVEYGYPTQRFILQQEEREHE >gi|157101655|gb|DS480669.1| GENE 221 249045 - 249785 254 246 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935773|ref|ZP_02083148.1| ## NR: gi|160935773|ref|ZP_02083148.1| hypothetical protein CLOBOL_00663 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00663 [Clostridium bolteae ATCC BAA-613] # 1 246 1 246 246 499 100.0 1e-140 MEKLDNQALTRFERLWKIEGDKLLRANIIPSAVKDGEIYPVDYCISAIARLIQQPHGAEV VCGIRNALSELFEAADTQFIYPDESLHVSLLGCTQRKNTNVFEHAQINKIKHICIKEIEK KEPAEIILRGIGIVGNQIFIQGFPQNRNWEELRVSLGEELVNSGEFPILYEDKSPVHMNI IRIVDAAPHVLASLHRAVSRLRDVELGTVKLQTVEFVITDFCVTKKNVVWLHKIECRNSS GNSCRV >gi|157101655|gb|DS480669.1| GENE 222 250043 - 251416 764 457 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 4 448 6 433 456 298 38 1e-79 MLQIVNTINSYLSDYILVFLLVAVGLWYTVKTKCVQRYLWKGLKQLFGGFSLRGGSQGGG MSSFQAVATAIAAQVGTGNIVGSAGAILVGGPGAIFWMWIIAFLGMATIYAEATLAQKTR VTDAEGNIQGGPVYYIQTAFKGKFGTFLAGFFSIAIILALGFFGCMVQSNSIGSTIQMAF GIPSWIVGIFLVIICGFIFMGGVDRLASVTEKLVPVMAAFFLIGGLGVLAMRLQYIPETF GLIFKYAFQPQAIIGGGFGIAIKTAVSQGAKRGLFSNEAGMGSTPHAHALALVKNPHEQG CVAMVGVFIDTFIVLTLNALVIISTLYAGNGPLANGYVGEAANTLKNTNLAQVAFGVVYG DKAGAVFVAICLFFFAFSTILGWNLFAKINVTYLFGKKAQRPFMLVALVFIFLGTIGESD LVWACSDMFNQLMVIPNAIALFALTGVVTKILGDRDK >gi|157101655|gb|DS480669.1| GENE 223 251647 - 251808 104 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935776|ref|ZP_02083151.1| ## NR: gi|160935776|ref|ZP_02083151.1| hypothetical protein CLOBOL_00666 [Clostridium 
bolteae ATCC BAA-613] hypothetical protein CLOBOL_00666 [Clostridium bolteae ATCC BAA-613] # 1 53 14 66 66 102 100.0 7e-21 MARLTASAYLMVQIVRIYYGIVKNNLCILEQAVHTFLHFIFKDRLSGVHKVIP >gi|157101655|gb|DS480669.1| GENE 224 251979 - 252173 106 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935778|ref|ZP_02083153.1| ## NR: gi|160935778|ref|ZP_02083153.1| hypothetical protein CLOBOL_00668 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00668 [Clostridium bolteae ATCC BAA-613] # 1 64 1 64 64 129 100.0 8e-29 MSDEGPERFSSGGGGDYKMFNGLELAAKDRSCPFWYITGTLQDACFKRIVSPDKYFKGMV YIKC >gi|157101655|gb|DS480669.1| GENE 225 252524 - 252700 167 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935779|ref|ZP_02083154.1| ## NR: gi|160935779|ref|ZP_02083154.1| hypothetical protein CLOBOL_00669 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00669 [Clostridium bolteae ATCC BAA-613] # 1 58 43 100 100 90 100.0 4e-17 MSPTKKKAKSKKNHTQPSVSNKRKNDGYVSVKYDEKKSYANVVLIVIVGILLLYLFLK >gi|157101655|gb|DS480669.1| GENE 226 252846 - 253763 921 305 aa, chain - ## HITS:1 COG:CAC1825 KEGG:ns NR:ns ## COG: CAC1825 COG1897 # Protein_GI_number: 15895101 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Clostridium acetobutylicum # 1 301 1 301 301 376 59.0 1e-104 MPIKVQRDLPAKAILEEENIFIMDEDRAMSQDIRPLEILILNLMPLKEETETQLMRALSN TPLQVDCTFLMLSTHVSKNTSASHLNKFYVTFDDIKKKKFDGMIITGAPVENLEYEEVNY WPELSMMMEWSKTHVTSTIHICWGAQAGLYYHYGIPKYPMEKKLSGIYNHRVLDRKVPLV RSLNDFFLAPHSRYTQVRKADILKHPELAILAESDEAGVFLVMSRDGRQIFVQGHPEYDR MTLNNEYHRDLNKGLNPEVPYNYYEDNDPFSAPPLTWRNTSNTLYTNWLNYYVYQVTPYL LDAEE >gi|157101655|gb|DS480669.1| GENE 227 253843 - 255798 2287 651 aa, chain - ## HITS:1 COG:SPy0751 KEGG:ns NR:ns ## COG: SPy0751 COG0272 # Protein_GI_number: 15674800 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Streptococcus pyogenes M1 GAS # 6 651 3 652 652 345 35.0 2e-94 
MEDRIQRMKELVSILREAGKAYYQESREIMSNFEYDKLYDELVSLEKETGVTLSGSPSQQ VGYEILSELPKETHGSPMLSLDKTKSVEDLQEWLGDQKGLLSWKMDGLTIVLTYEEGTLA KAVTRGNGEIGEVITPNARVFANIPLNISYQGQLILRGEAVITYSDFERINSGIEDVDAK YKNPRNLCSGSVRQLNSQITAERNVHFEAFALVKADGVDFHNSRKAQFEWLKGQGFEVVH YEEVDAASLPASVEGFAKAVEGNDIPSDGLVLTYDDIAYGESLGRTAKFPRNSIAFKWKD EIRETALSYIEWSASRTGLINPVAVFEPVELEGTTVSRASVHNLSIMEGLELGVGDTITV YKANMIIPQIEENLTRSGVKDIPEECPVCGGRTEIRKVNDVKSLYCTNPDCQAKKIKSFT LFVSRDALNIDGLSEATLEKFIQAGFIHEYADIFHLEEHRDAIVEMEGLGQKSYDNLIAS IKTASNTTLPRMVYGLGIAGIGLANAKMLCREFKYDFDKMRHAGEEELVAVDGIGGVLAQ AWMDYFALGRNNQMVDRLLAELNIEKEQPETAGEAVFEGMNFVITGSVEHFANRKELQEL IESKGGKVTGSVTAKTAYLINNDAASNSSKNKKAKELGVPIISEEEFLRML >gi|157101655|gb|DS480669.1| GENE 228 255812 - 256567 644 251 aa, chain - ## HITS:1 COG:VC2223 KEGG:ns NR:ns ## COG: VC2223 COG1187 # Protein_GI_number: 15642221 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Vibrio cholerae # 2 251 18 260 340 238 50.0 7e-63 MEERLNKWLSQMGVCSRREADRLIEAGKVLVDNRPASMGQKVTPGQKVVCNGKMVRVDRS QRPRPVLLAVNKPKGIVCTTTDKDRAENIVDFLGYPERIYPVGRLDKDSEGLLLMTNQGD LVNKIMRSGNAHEKEYIVTIDKPVTERFLEQMAGGIRLPELDQVTRPCKVKKVERNTFSI ILTQGLNRQIRRMCEACGCQVKRLVRVRIMNIRLGNLRTGAYRQITKEEYEELTGLLKDS SSLSLKERKNQ >gi|157101655|gb|DS480669.1| GENE 229 256602 - 257651 976 349 aa, chain - ## HITS:1 COG:CAC2283 KEGG:ns NR:ns ## COG: CAC2283 COG0809 # Protein_GI_number: 15895551 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Clostridium acetobutylicum # 1 340 1 340 341 467 62.0 1e-131 MDVKDFDYELPEELIAQDPLEDRSASRLMVLDRETGEIQHRIFKDITEYLRPGDCLVIND TKVIPARLFGVKKDTGAKIEILLLKRRENDVWETLVRPGKKAKPGTVIEFGEGLLTGTVV DTVDDGNRLIQFHYKGIFEEILDQLGQMPLPPYITHQLKDKNRYQTVYARHEGSAAAPTA GLHFTQELLAEIEGMGVKIAHVTLHVGLGTFRPVKVENVQDHHMHSEFYIVEESEAKKVN DTRAAGGRVICVGTTSCRTVESASTEDGVLKAGTGWTEIFIYPGYRFKVLDALITNFHLP 
QSTLVMLVSALAGREHILDAYREAVKEGYRFFSFGDAMFIAAHPAAEKR >gi|157101655|gb|DS480669.1| GENE 230 257673 - 258749 1271 358 aa, chain - ## HITS:1 COG:no KEGG:Closa_0811 NR:ns ## KEGG: Closa_0811 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 15 248 14 227 250 105 28.0 3e-21 MRQNRRRTGSRRSADDKIKYMRMAAIPLVIVLVLVAVILVMDRKPSDKEASAPTDVSLSS DEDINISSDSSAAPDDSGVEPDNNQYTTDFSQYELKKDEIPQVNQLISEYFQAKVDQDAQ TLYRIFGKSDDTGLDARKEELKNEAVYIEDYVDIVCYTKPGLTEDSYVAYVTYEVKFRRV ETLAPGLMWCYVVKDDNGNYIIRENVVGDEADYVAKQNQSEDVKLLSNQVNERLRQAIES DTVLAGIYKDLRNGAVVHSSEEETETGDSTVILEEEGGETQENGGPQPSQDPSADAGQNS QGEGIQNVPAGQEGTNARESEGTPAAAQENQGTDSGTAGGSGTEESAGTAGGSSVKIE >gi|157101655|gb|DS480669.1| GENE 231 258682 - 258948 118 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935786|ref|ZP_02083161.1| ## NR: gi|160935786|ref|ZP_02083161.1| hypothetical protein CLOBOL_00676 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00676 [Clostridium bolteae ATCC BAA-613] # 38 88 1 51 51 93 98.0 6e-18 MRIYFILSSAERLLPVLLLFCLMIQSSCPDCEARSMLVFKYLRSILAKSGSNCNLYQKNT ADNPTLKQIICGILTRAFCPFLTNTKCV >gi|157101655|gb|DS480669.1| GENE 232 259154 - 259516 184 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935789|ref|ZP_02083164.1| ## NR: gi|160935789|ref|ZP_02083164.1| hypothetical protein CLOBOL_00679 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00679 [Clostridium bolteae ATCC BAA-613] # 7 120 1 114 114 226 100.0 5e-58 MRIPVSMYIQGTVIDIEFPFQGDSQKTKRRPAVVTDFDDMHTTVILLKVTSHEPRTDYDY VLLDAGMAGLKEGSVIRCNHILTVNNDLLCDKRGDLSRRDFIRVLALYQTALISGSEELY >gi|157101655|gb|DS480669.1| GENE 233 259527 - 259718 226 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935790|ref|ZP_02083165.1| ## NR: gi|160935790|ref|ZP_02083165.1| hypothetical protein CLOBOL_00680 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00680 [Clostridium bolteae ATCC BAA-613] # 1 63 29 91 91 113 100.0 4e-24 MKQVTAETISQMSDAEVLKLLGDMRDIPISSKRRVWKSLNTGIYDSSGKLIGDRAADEDD YVE 
>gi|157101655|gb|DS480669.1| GENE 234 259821 - 260999 551 392 aa, chain - ## HITS:1 COG:SPy2122 KEGG:ns NR:ns ## COG: SPy2122 COG0582 # Protein_GI_number: 15675872 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pyogenes M1 GAS # 21 377 15 362 381 103 26.0 7e-22 MVATRRVSSPYTIRQRSDGRYEVRLYTGTDSNGKKTYKSIYGRTSKELKEKLKEIATEQK KNRTPAANINFRDYALHWMRLYKYPVLKPVSYDRLEQTYNKVCEYLGWIQMGNISTDDIQ EMINDLARTKAYSTVKKHHEFVKNVFSHAYKTGELDFNPCEAVALPIERNMTVKTKPAEI LLEEETEAMYAFNEKIKKSRNQFFKQMPALLLMLNTGWRVGELLALEWTDIDFKKRTARI NKTLAKAKTRNDAGESISRHKTTFPEPTKTKAGERLTPLNDMAVSLLKQIKEYNQRMGIQ SNYVVCTKDGGYVSERNLLRTFKSVMGIIGAEKDYTIHSLRHTYASRLLKRGVDVSVVSK LLGHSDINTTYGKYIHVLHTQLMQSAQSVERI >gi|157101655|gb|DS480669.1| GENE 235 261027 - 261233 246 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935792|ref|ZP_02083167.1| ## NR: gi|160935792|ref|ZP_02083167.1| hypothetical protein CLOBOL_00682 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00682 [Clostridium bolteae ATCC BAA-613] # 1 68 1 68 68 116 100.0 5e-25 MEELNRNSSKKLLHKKEIGVMFGYGRDKTRRLLESGILPVIQINNDYVISAEELDKWMKR NAGKKIKL >gi|157101655|gb|DS480669.1| GENE 236 261252 - 263279 289 675 aa, chain - ## HITS:1 COG:no KEGG:BACI_c54550 NR:ns ## KEGG: BACI_c54550 # Name: not_defined # Def: hypothetical protein # Organism: B.anthracis_CI # Pathway: not_defined # 91 477 176 556 644 65 23.0 7e-09 MVLSIKNLEIIDETYYCDGEPVTDFTVEFLGHLVRWENGIPNSYYKIRIITTTKQKVVMD VSAADMNSVRWIYKIPFAVWCKNKKKFTGMLITAIKTQLQGIAPIEVIYDRVGFVERNCE WIFLLSNGSISKRGFGFKEHSGIQRYNFLGKVIMEQEERGTQIEEFICLLKQNIEIYYPL FLITLMSVVRLPLRKQGIELGITVWLYGNSGSGKTELAEALAVFTEVNEFGNKELMYATT SKRYILETLSCSKGMPVILDDVKQEKVQGQRDHCRIAVDAVLRGVFQGCLTENYGQKDTS VDTCAVITGEYMETTESQNARLLYKDVSGFLEDKGNSESLRKFQRNSRLLPGIIGDFICW LCMEMEDQDVLDKWKKHYHRGQDEESIYANVPNGARLKANDNMMRFMHYIWMKYITDTGF NNEGLKVEFQRKGEYSIQKIVEDTSLLLGGLEALLVQAMTEVVQTSRIRHAKYRRESLLV CDGCVWQQQEFCLWEEDDFLYIDDIERSWSTKSCEQSADIRGILVIGRDKLLDKMLYVVD 
NKIKQGVIPEHFLDKVSLGAMARCGLIACSLRDEKQYRYAKYYPCIAPKDEEEIGEYFSD YCNHNLPRNQRRRTVEYKRDMVVQCNLGHSAFAHVLDCMIEDAKEGMLSGVENTEVMSMR RAFSQRKVLLKSVSE >gi|157101655|gb|DS480669.1| GENE 237 263548 - 264882 317 444 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935794|ref|ZP_02083169.1| ## NR: gi|160935794|ref|ZP_02083169.1| hypothetical protein CLOBOL_00684 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00684 [Clostridium bolteae ATCC BAA-613] # 1 444 1 444 444 817 100.0 0 MIEELNNADKKNLIWNSEEIVSEYENLYGPISPKEGKKYATLTKNLSKAVKILSKSCKGK DNNEGQDMTKLPEILGGTRGKGYYLVRKYRYDFILKNNRAAEYFMKDRNKRKDCFYKELE ELEQTLDFQIFFDYKSELKDICRLFWLCERFTKKPDIFIIYDLVQRLGGVIFSEQLQMDK GLETLMKYRELKEDLYQLAVYYIEDLLWQGKLDPGIQRIYNDINLCVIDKDEIKEAIKPE ITKIIFEYQEIVDNMLSMRCADKIMMHCEYKYKRSKPKLPLSCILSDLEEIDDYEDRTCL TPRNLLEDARFCYEIRKTGQRGLSQRRIEKELDNIFKKIRELEVCDRECLLYDFLSELQN WENMTDYLINTVRYLRFCDMVKFQYQVKECDRVEFESIYKIANAELFEEYRQWEKCIPSK ALCMQTKAMYYREKESSEKLMVSR >gi|157101655|gb|DS480669.1| GENE 238 265115 - 265234 105 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935796|ref|ZP_02083171.1| ## NR: gi|160935796|ref|ZP_02083171.1| hypothetical protein CLOBOL_00686 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00686 [Clostridium bolteae ATCC BAA-613] # 1 39 1 39 39 75 100.0 1e-12 MLEYISELHYNYKLRNKCPEIFALIKATGGGGTVIMGHA >gi|157101655|gb|DS480669.1| GENE 239 265227 - 266150 526 307 aa, chain - ## HITS:1 COG:no KEGG:Closa_0805 NR:ns ## KEGG: Closa_0805 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 305 4 308 321 170 35.0 7e-41 MLNVSLELFADEVVECVKEILGDGYDIELRRVPKNNGIILTGISICRAGEKVSPVIYLDD YYAERADTETEVELTARKIVNCFRYNGDLSDTVLKSAEHLYDFGKARDRVMLKLIHTKNN EQRLNQVPNIPYLDLSIVFYLYMDEDSGSMMTAQIQNEHLALWGVDTQMLYRIALQNMQR VMPAQINSITQIIEDLKCAFGEEPDEDRGEEDAPFYVLTTSRGINGAACMLYSGVLEKFA EQRGKDIIILPSSVHEVLLLEDTGDIDCKALTELVKCINATEVPMEDVLSENIYRYERIC NRITIIA >gi|157101655|gb|DS480669.1| GENE 240 266179 - 266550 247 123 aa, chain - ## HITS:1 COG:no KEGG:Sulku_2419 
NR:ns ## KEGG: Sulku_2419 # Name: not_defined # Def: hypothetical protein # Organism: S.kujiense # Pathway: not_defined # 5 115 9 126 294 101 40.0 1e-20 MEHKKFKVGCHLKAGRLLYEHHGIYIGEGLVIHYAFDGITVDTVEKFARGEIMEEVPHFD SPYTGEEIRKRAFSRLGEDRYDLVSNNCEHFANWCCTGEAESEQVEEAVRLAGTLLLKIL EGC >gi|157101655|gb|DS480669.1| GENE 241 266526 - 266618 70 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIRKRIVAFFLSNEIRQERFKAIWSIRNLR >gi|157101655|gb|DS480669.1| GENE 242 266740 - 267057 186 105 aa, chain - ## HITS:1 COG:no KEGG:Shel_03820 NR:ns ## KEGG: Shel_03820 # Name: not_defined # Def: plasmid stabilisation system protein # Organism: S.heliotrinireducens # Pathway: not_defined # 4 97 3 98 104 81 46.0 1e-14 MEQKHYKLRYLPLFVSDMEEIVDYITLQNPDAAYHFLERVEAAIWKRLEFPVSFEPYQGT GARKHPYYRIYVGNFIIYYVVIDDVMEVRRILWDKRDAGWLLERD >gi|157101655|gb|DS480669.1| GENE 243 267038 - 267307 260 89 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0421_21057 NR:ns ## KEGG: HMPREF0421_21057 # Name: not_defined # Def: prevent-host-death family antitoxin # Organism: G.vaginalis_ATCC14019 # Pathway: not_defined # 1 84 1 85 85 99 57.0 4e-20 MIQIRPVSDLRNKFPEIETLVKENQPVYLTKNGYGAMVVLSLEEYSRITGDVGMKLDEAD KQAEETDIRLTHEEVFSGLKGRIHGAEAL >gi|157101655|gb|DS480669.1| GENE 244 267420 - 267677 309 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935801|ref|ZP_02083176.1| ## NR: gi|160935801|ref|ZP_02083176.1| hypothetical protein CLOBOL_00691 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00691 [Clostridium bolteae ATCC BAA-613] # 1 85 1 85 85 155 100.0 1e-36 MEPTAYDFLEEMIMERINQILHGDLKGTKIFYKEQEAIMKSLDQETKDKFEEFASNMLCI SAEECVAVYKDAFLDGLRLGHKAFR >gi|157101655|gb|DS480669.1| GENE 245 267695 - 268495 594 266 aa, chain - ## HITS:1 COG:ECs1933 KEGG:ns NR:ns ## COG: ECs1933 COG3723 # Protein_GI_number: 15831187 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecE pathway) # Organism: Escherichia coli O157:H7 # 11 239 13 243 269 187 40.0 2e-47 
MGVNVKHELEQRAAGQGASVRLTKNMTIVDMVKALEPEIRRALPAVLTPERFTRMALSSI NNTPELAECTPMSFIAALLNAAQLGLEPNTPLGQAYLIPYKNKGKLECQFQLGYKGLIDL AYRTGQVQIIQAQVVREFDSFEYQYGLDSKLVHKPGEGARGEITYVYGLFKLSNGGYGFE VSNKTEMDTFAARYSKSFGSKYSPWTEDYESMAKKTVIKRVLKYAPISSDFQKALSMDET IKTGIAVDMSEIRNECLPEEAGSEAA >gi|157101655|gb|DS480669.1| GENE 246 268515 - 269444 889 309 aa, chain - ## HITS:1 COG:lin2414 KEGG:ns NR:ns ## COG: lin2414 COG5377 # Protein_GI_number: 16801476 # Func_class: L Replication, recombination and repair # Function: Phage-related protein, predicted endonuclease # Organism: Listeria innocua # 6 307 14 316 319 238 42.0 9e-63 MITKVSTLGMTHEEWLKRRKEGIGGSDAGAICGLNPYASPMSVYQDKTSQEISASDNEAM RQGRDLEEYVARRFTEAIGLKVRKSNMMYVNSRYPFMLADVDRLVVGEDAGLECKTASAY NADKWKDGEIPPHYVIQCYHYMAVTGRKNWYIAVVVLGQGFQYRKLSWDEEMIRNLISIE DDFWNSHVLKRVMPDPDGSNACDEVLEQYFHHARKGTAVPLIGFDEKLNRRQEIVQLMKK LEQEQKQIEQEIKLYMKDNESAFNERYRITWTNVDTARLDTKRVKEEKPEVYRQFMQTTS SRRFTVKAA >gi|157101655|gb|DS480669.1| GENE 247 269517 - 270002 318 161 aa, chain - ## HITS:1 COG:CAC1241 KEGG:ns NR:ns ## COG: CAC1241 COG2003 # Protein_GI_number: 15894524 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Clostridium acetobutylicum # 30 153 107 229 229 105 44.0 4e-23 MMRRPIPKKKVGIVHLQMVREGRALYGMTRFTDPGMAAEMVWPLFEMADREMALVLSLNT KLEPQALEIAAVGGLNACSIDCRDIFKHAVLNNAAFIICFHNHPSGDPKPSMEDRKLTKR LEECGKILGIPLIDHIIVGEGPCCYSFKEQGMLSYTDGEVA >gi|157101655|gb|DS480669.1| GENE 248 270142 - 271074 727 310 aa, chain - ## HITS:1 COG:no KEGG:CKL_2932 NR:ns ## KEGG: CKL_2932 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri # Pathway: not_defined # 1 309 1 309 311 414 62.0 1e-114 MSAEVETMFYTREKPWHGLGTRVEDAPGSREALELAGLDWQVIQRPIATTDGQAISGFKA NIRNTDSRVLGVVTDRYKVVQNEDAFAFTDQLLGEGVTYETAGSLQNGRRTWLLAKLPQR YIISGDEITPYMVFMNTHDGTGAIRVAMTPVRVVCMNTLNLALSTAKRSWSTNHTGDIAG KMEDARYTLLYADRYMSELGKAIDHMKRLRLSERQVMEYIDALFPLYDNPTPQQQKNLNR MKEDMKTRYFDAPDLKHVGKNGYRFINAVSDFATHARPLRESANHKENLFAKTVEGNALI DRAFAMLQAA 
>gi|157101655|gb|DS480669.1| GENE 249 271276 - 271845 193 189 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935806|ref|ZP_02083181.1| ## NR: gi|160935806|ref|ZP_02083181.1| hypothetical protein CLOBOL_00696 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00696 [Clostridium bolteae ATCC BAA-613] # 1 189 3 191 191 337 100.0 3e-91 MFKKSFVMMLFAAIIAVGSAVPVFAEEEYYGNVKADSFYEEESDNAIIFEYPSYIGYEAG KNLSDYFSYCTYASPDAHQTKNFDLAEYRWFYGTVNLVADYGDSGYQIWVSGYDPSYTDT VGWPIESNVAAAKDAEIKEGDTVIVFGKVFSYDSGFNKELYVGGADILITTRSFTDILTE YYNYLENMQ >gi|157101655|gb|DS480669.1| GENE 250 271897 - 273069 548 390 aa, chain - ## HITS:1 COG:alr3997 KEGG:ns NR:ns ## COG: alr3997 COG0515 # Protein_GI_number: 17231489 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Nostoc sp. PCC 7120 # 196 278 408 489 496 62 36.0 2e-09 MRMGLAVLLAFGLAASVTACGDVNTGEDKKEAATIGTTVESENVTGVSDTTEIQNAEEEY KSGCIESFDYNEYFRYPDNHVGERLCVTMQIVQLLNGGFRGMDMNGKIYITLYEEGNPNL QLDDIITVWGEYKGVIGYETTDENYMGFFTINAKYTGFVDEMIANSFQEVLSEEYDVSEN FSETVADNPYRDSILLWKSYDSELSDSDVVGFSKEELRIARNEIYAAHGRIFTSQDLIDY FNSKPWYQGIVPSDQFSESVLTKIQKANIEKIKAFEEQADSAFDASIEIPKKDGIYTYRN MDPDEMSVGVEYMKMTVEGSRVSMIMYDTESASGNITEYSGFYNVNDEAYVSEEEGGIVI GFDEYGQYATVIFPWDYEPKFSLVSVSPLN >gi|157101655|gb|DS480669.1| GENE 251 273260 - 274045 550 261 aa, chain - ## HITS:1 COG:no KEGG:Cbei_3805 NR:ns ## KEGG: Cbei_3805 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 6 260 1 243 244 99 28.0 1e-19 MSISLMGVMLGLPLALAMRGIMGKEKFEQWIESKDYLLYTTFDSMEEMRRDIEASGYDVT EWYGSLKTHLTQNKSSYFVWKDIDGKIVARMSVYDDKAAIKEFIEAVEGRAKKKIFYEDG KDSVNISAEEIQQAAESVTDLYYEEYPTIYVDVQLLCDIIERYQIEILTYSEDEIKCRYE NYDMIFSRQKGEEPFNLEINSSSKNMRHLHDCIDCLNEDYYAALQEKTYIGIRKKIMEEG FEIEEEQVMEDNSIVMMISVG >gi|157101655|gb|DS480669.1| GENE 252 274064 - 275737 569 557 aa, chain - ## HITS:1 COG:all1872 KEGG:ns NR:ns ## COG: 
all1872 COG0464 # Protein_GI_number: 17229364 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Nostoc sp. PCC 7120 # 9 509 1 483 503 398 43.0 1e-110 MDRQSGINVKKHLSNLFRARFPIIYIPTWEESRAIEILSDIANDVQLIKTQRTLYIWSQT QGLCSYTTGKAIADTKTPVKALEYIADCSEAAIFCLKDFHVYFGADNSKADYGVIRKVRD SVEELKNSRYPKNMVFISPNIVLPNDLQKEVTIVEFGLPTLEEVSETLSEMIEANQEKID ILLSEREKIQMCKAALGLTLQEVENALARAMVEKGRLSIDELSIILDEKNQVIKKTGMLE FVKSDLSIDDVGGLENLKRWLLKRNKSWSDTASKYNLPLPKGVLITGVPGCGKSLTAKAI SAVWKLPLLRLDMGKIFSGIVGSTEENMRKAIQTAEAVAPSILWIDEIEKGFSGVSGSSG DGGISSRIFGTFLTWMQDKTKAVFVVATANNISNLPSEMLRKGRFDEIFFVDLPTEKERK DIFQLHLKKRLTNSEISGNVDINDGLLTELSGRTEGFAGAEIEQIVVAALFEAFSEDRAL EIKDLFKVISNMVPLSVTQSEQIMQIRNWANVRAVAATAREDRHEYERKEKENIQDPKIT EQEDEVKKNRGGRSVDF >gi|157101655|gb|DS480669.1| GENE 253 275755 - 276954 372 399 aa, chain - ## HITS:1 COG:NMA0905 KEGG:ns NR:ns ## COG: NMA0905 COG3468 # Protein_GI_number: 15793871 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Type V secretory pathway, adhesin AidA # Organism: Neisseria meningitidis Z2491 # 175 302 1000 1138 1773 64 39.0 4e-10 MKKLGIGLQVFMCAVLSFLGFVNYGVWIWMAGKMGLKKMKLAGMVLCGLMVMLMVLAGVY DAESADILMSICMILLFLPTFVILANVGKLKRCLDLRLLVKKNNINVQKLGEQECLFIEI PENERLLKTSKMLFGDKGEAVLRVIAFNEKEKADKALAEKEDRIRQKKLQAEAEIAKAEA QKAEAEKAKAEARKAEAEKVRAEAERARAERASLEAQKAEAARAKAEAQRAEAARAAAEA KKAETEKVKAEAEKAKAEAERAIAEAERIRAEAEKAKAENQKKEIEKDRKEVRRVKTEDN REKEIKPVSIGKISEQPVDINFCDEQELSLVPGIGIILAKKAIRIRNEKGNFESVDDFVK CIGIRESNAASIKSHLICTAENNCNTVDIVRKAGRKIDI >gi|157101655|gb|DS480669.1| GENE 254 276951 - 277157 123 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935811|ref|ZP_02083186.1| ## NR: gi|160935811|ref|ZP_02083186.1| hypothetical protein CLOBOL_00701 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00701 [Clostridium bolteae ATCC BAA-613] # 1 68 1 68 68 112 100.0 8e-24 
MKQIKVTIYANGYVQAETVGIKGQKCEEFRPLLENILQNRVVDQAYTDEFYEMALEETYV NEMEKARI >gi|157101655|gb|DS480669.1| GENE 255 277154 - 277537 126 127 aa, chain - ## HITS:1 COG:MTH287 KEGG:ns NR:ns ## COG: MTH287 COG0602 # Protein_GI_number: 15678315 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Methanothermobacter thermautotrophicus # 1 114 112 224 237 65 35.0 2e-11 MDQKEAVAQLVEAVKRQGLSVILFTGYLYDDLKKSRDICVQRILRNIDLMIDGPFIENKL DFSRPWMGSENQKYIFLSDKYSEKDLVDIKAKSQYEVRIYPDGSVKINGMGDLKKIRKFL KVDGGMV >gi|157101655|gb|DS480669.1| GENE 256 277831 - 278061 202 76 aa, chain - ## HITS:1 COG:no KEGG:JDM1_2528 NR:ns ## KEGG: JDM1_2528 # Name: not_defined # Def: hypothetical protein # Organism: L.plantarum_JDM1 # Pathway: not_defined # 5 72 14 81 87 62 35.0 6e-09 MAGIYKDIAEVAGEDVAKILYKNFRGQQVVFPNKLYSSNFTAQKIREEYNGKNAKELALK YGFTERWVRTILKSTK >gi|157101655|gb|DS480669.1| GENE 257 278183 - 278701 281 172 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935814|ref|ZP_02083189.1| ## NR: gi|160935814|ref|ZP_02083189.1| hypothetical protein CLOBOL_00704 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00704 [Clostridium bolteae ATCC BAA-613] # 1 172 1 172 172 311 100.0 2e-83 MATVKVFGGKSYDDKYSGKNYTNYNALECVLNYMFREKGNRDDIIPEIKGGYGVNLESTE TIIEDMRMVKDVYEKNGGRQLRHFSVNLSAEETETINDLNAFAYDISGYYGNRYQSVFAA HKKGRGVHIHVCVNSVSFVDGKKFSDRNGGLKEYKDYVNRLVQKHKDTNDKK >gi|157101655|gb|DS480669.1| GENE 258 278701 - 278949 132 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935815|ref|ZP_02083190.1| ## NR: gi|160935815|ref|ZP_02083190.1| hypothetical protein CLOBOL_00705 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00705 [Clostridium bolteae ATCC BAA-613] # 1 82 15 96 96 129 100.0 6e-29 MSQKMYTKIEKKMREKNYKNISAYIRDLVQAEQAKNELSKPAKQRAVEAICTIQTYINCN GDFCEGNERILDALKTVKEVLR >gi|157101655|gb|DS480669.1| GENE 259 279455 - 279634 131 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935816|ref|ZP_02083191.1| ## NR: 
gi|160935816|ref|ZP_02083191.1| hypothetical protein CLOBOL_00706 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00706 [Clostridium bolteae ATCC BAA-613] # 1 59 1 59 59 110 100.0 5e-23 MEDAYISLKEFCERTSISPSTAYRMIQEGRLPAEKLAGGRRYKIPVDFLKSLRPTHRHY >gi|157101655|gb|DS480669.1| GENE 260 279939 - 280379 593 146 aa, chain - ## HITS:1 COG:SA0699 KEGG:ns NR:ns ## COG: SA0699 COG3610 # Protein_GI_number: 15926421 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 10 144 11 145 164 75 28.0 2e-14 MVVAIQVAGSFMAVLSFGLVLEMPRKYLGWSGLTGGVCWLVYLVVKAGTGSMILGAFLSS LSVALMGHLFARIFRAPVSVFLVPGILPLVPGTSIYNSVYYVIRNSREESMYYLVETLQI AGAIALAVFLMDSVFKLVGKKKRIKV >gi|157101655|gb|DS480669.1| GENE 261 280373 - 281170 787 265 aa, chain - ## HITS:1 COG:CAC2265 KEGG:ns NR:ns ## COG: CAC2265 COG2966 # Protein_GI_number: 15895533 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 12 264 6 257 257 117 33.0 3e-26 MDEHHWSEDHLIASTAVLAGEIMLISGAEIARIEGTIHYILGCCHDRNAQTMVFSTGIFV SLDSPEGEALTLVRRVEARSTNLNRIYRVNEVSRRLCGGLMSPEEAYRELEEIKDSCQYK RELKALSYAFIAVFFGVVLGGRPADCLGAAVIGGILGLVVYAISGLGFNDFCVNGLGAFT IGIAALAMNRWILTGASNDVVIISAIMPLLPGVIFTTAVRDTLNGDYSSGAARMLEAVVT ALAVAAGVGAGMALFHQLTGGGGIW >gi|157101655|gb|DS480669.1| GENE 262 281628 - 282557 755 309 aa, chain + ## HITS:1 COG:no KEGG:Closa_0805 NR:ns ## KEGG: Closa_0805 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 308 1 316 321 242 41.0 1e-62 MVYEQFLAAVKKRMEAELGSSYELELHKVPKNNGLILDGLCITRGNAHIAPAIYLNPCYD QYRKGMSLDQIVAELLTMYRQNDTPPPLNYEMLSDYEGIRSNIACKLIHAASNEAILKDV PYIPWMDLAIVFYLCIHEDDTGLMTAMIHNSHLRIWNISLEDLKTSALSNSPRLFPPVIS SMACIIEEMNRGLNPHFQETHPKPETPAPFYVLSNRSGINGAACILYEDVLKNFADGVEK NLIILPSSIHEVLLLPDDGDISYEEMSRLVTHINRSEVPEEDRLSNQVYLYSRETGEVTM ASSGPASIC >gi|157101655|gb|DS480669.1| GENE 263 282601 - 284700 2045 699 aa, chain - ## HITS:1 COG:lin2209 KEGG:ns 
NR:ns ## COG: lin2209 COG0370 # Protein_GI_number: 16801274 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Listeria innocua # 40 696 5 662 664 477 40.0 1e-134 MNGRARADGTDTYGSVILGDSGAKKMDFQDQHEDDRTPITVGFVGNPNCGKTTLFNAFTG AKLKVANWPGVTVERVEGETSYKGRPIKVIDLPGIYSLTSYTIEEKVTRKCIEDGGVDVI INVVDASSLERNLYLTMQLIELKKPVILALNMMDIVEERGMEIDMHRLPELLGEIPVVPV SARKRTGLDVLMHAVVHHYEEGPQGVVIRYSQEVEEKIRLIEEALYKKYGEMDNMRWHAI KFLEFDQVVIDDHPLDDIAVILPRNYEKQIINEKYDYIEEVIKECLFNKDEKAATTDKVD RLMTHPVWGIPVFLGIMALVFFLTFTVGDFLKEYFQMGLDMFSGAVLSLLTSVHAAQWIT SLIVDGIIAGVGGILTFLPNIFILFLALAFLEDSGYMARVAYVMNETMGMVGLSGKAFLP MLLGFGCTVPAVMATRALESVKDRRRTILITPFMSCSARLPIYVLFAEMFFPDRALIVAY SLYIIGLCMAILVAFIVHIHGKNDSAQNALLIELPEYKTPNGRTVSIYVWDKVKDYLSKA GTTIFIASIVLWFVLNLGPEGFVSDVSHSFAAIIGHVLVPVLRPAGLGQWQVAVALISGL SAKEVVVSSFSVLYGISNVNSAAGMTALSTQLAMAGFGGVNAYALMIFCLLYSPCIAAVA TIKRETGSWRWTIGMVLFQLAVAWAGAMLVFQLGSLIFG >gi|157101655|gb|DS480669.1| GENE 264 284697 - 284933 355 78 aa, chain - ## HITS:1 COG:no KEGG:Closa_0801 NR:ns ## KEGG: Closa_0801 # Name: not_defined # Def: FeoA family protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 74 25 95 100 87 73.0 2e-16 MQKKLNECEIGGRYVVSGVQVDESITRRLEALGVNEGTPVNILNKKGSGSVIIKVRGTRL ALGKRLSEGITVREEQAS >gi|157101655|gb|DS480669.1| GENE 265 285127 - 286311 1281 394 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 236 393 465 620 621 158 49.0 2e-38 MRRKRGLCLLLAASMALGSSFTVLAADKKIDTVKLSFSYSKAPESGDEIGDITVKAGSSG FEVDSAEYTNIEDQDTWTVGDIPEVKVELSAKDGYKFSYTSKSHFSLSGCNADFKKAKIY DDGQYIEVYAELKRIGGKLQGASELEWNDTTAEWEEIEGAKSYDVKLLRDDKTVTTVSTT GTSYNFAGYFNREGDYTFRVRAISRYNNKTGEWSEDSNSLYIDEDELGRYGGSGHWEQDG TGVWYAYDTGGYPASCWKQINGAWYYFNNKGYIVTGWQKLDGTWYYLNPSNGIMMTGWQA INGKWYYMNGSGEMQTGWQAINGRWYYLDGSGAMQTGWQRLGGQWYYLDPSGAMQTGWQA 
INGKYYYLGTNGAMYYNTWTPDGKYVDGSGAWVQ >gi|157101655|gb|DS480669.1| GENE 266 286458 - 288440 1993 660 aa, chain - ## HITS:1 COG:BH3595 KEGG:ns NR:ns ## COG: BH3595 COG0556 # Protein_GI_number: 15616157 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Bacillus halodurans # 4 654 5 657 660 855 66.0 0 MDHFELVSEFQPTGDQPQAIEQLVKGFKEGNQFETLLGVTGSGKTFTMANVIQQLQKPTL IIAHNKTLAAQLYSEFKEFFPKNAVEYFVSYYDYYQPEAYVPSTDTYIEKDSSINDEIDK LRHSATAALSEREDVIIVASVSCIYGLGSPIDYKEMVISLRPGMIKDRDEVIHKLIDIQY DRNDMDFKRGTFRVRGDVLDIYPAYSDGVAYRVEFFGDEVDRISEIDTLTDETKAQLGHV AIFPASHYVVPKEKMMEATENILTELEERVTFFKSEDKLLEAQRISERTNFDVEMMRETG FCSGIENYSRHLTGGLPGEPPCTLIDYFPEDFLIIVDESHITLPQVRGMYAGDRSRKTTL VDFGFRLPSALDNRPLAFPEFESKINQMMFVSATPSAYEAEHELMRVEQIIRPTGLLDPE ISVRPVEGQIDDLVSEVNKEVAAHHKVLITTLTKRMAEDLTDYMREVGIRVKYLHSDIDT LERAEIIRDMRMDVFDVLVGINLLREGLDIPEITLVAILDADKEGFLRSETSLIQTIGRA ARNSEGHVIMYADKVTDSMAVAIEETNRRRQIQQKYNEDHGITPTTIKKAVRDLIAISKA VNADDKHFKKDPESMDEKELKKLSKELEKKMHQAAAELNFEEAARLRDRMIEIKKMLQDM >gi|157101655|gb|DS480669.1| GENE 267 289067 - 291853 1922 928 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 306 779 196 663 676 184 30.0 9e-46 MGDMTEQDSAHAAENTVIKCPAVLEALRTGIICCLLDEDLTFLWGNSSFFSGIGYTAERF GGLFSTLRQYYAAMPDVFSCIRQELTQAVENERQDIELTVRLPLKEGGYSWKHLYGTVRE DSLAGGKVLQAELAGVDALAAGKEELERLYRQKLQYFHFMLDTYEGNAYISDMDTYELLY VNQHSCEVLGMPAVKAAGRKCYEVIQGRTSPCPFCTNSKITENEFYEWEFQNPVLERTFR IKNRIIDWEGHRARLELSHDMYSTEYKLAKKDQERDALVRSVPGGLARVDGRDMRTVLWY SGSFLDLIGYTKEEFEQEMHSQCSYVHPDDKERVAAVMLQSRETGRPTMVESRIFTQDGK VKILLMTCSYVSGEENWDGIPSYYTVGIDVTAERTEQARQRQALEDACQAAQIANDAKTN FLSSMSHDIRTPMNAIIGMAVIAQANLQSPEKIQDCLNKINVSSRHLLNLINEVLDMSRI ESGKIDLISENVSLPELIEDVMDVFRPLAAEKHLELQINADHVRHEKVVTDQNRLQQVLV NLLSNAIKYTPEGGSVGLRVREIPAFAKGKGQYEFIVTDNGIGMSGDFIPHIFEPFSRTE 
ESKTNQIQGTGLGMAITQNIVSMMNGTIEVKSVLGEGSQFIVAVSFKLCEEAEDNNAELS GLPVLVVDDDQVICESAAEILDDLGMRSSWVLSGKEAIRRVVEAHEAEDDFFSLILDWKM PGMDGLETLKVIRRKLGMDVPIIVVSAYDYSEIEDEFRMAGADAFITKPLFRSKIAHTFH QFCREGRTDASSLPGGEVYTIMEGKRILLVEDNQLNREIAVELLKMHGFLIDEAENGRLA VEKFASSGPEEYDCILMDIQMPVMDGYQASEAIRALTREDARTVPILALTANAFATDIGK AHCAGMNDHVAKPIEVERFMETLRRWIR >gi|157101655|gb|DS480669.1| GENE 268 292062 - 293435 1698 457 aa, chain - ## HITS:1 COG:CAC0499 KEGG:ns NR:ns ## COG: CAC0499 COG0793 # Protein_GI_number: 15893790 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Clostridium acetobutylicum # 118 454 77 399 403 224 39.0 4e-58 MGGQPDGPYRDNGPGPQDGNRNGSGKNRFWAGALVGALVTAFVGLIVVGMSAGIYIFGKR VMSRQPSVAIEGKGPGITDGSSHEGGVDFDRVTAKMSMIQQIIDLHFLYDEDAGNVEDFI YRGMLAGLDDPYSTYYTEEDFRSINDSTKGTYSGIGAMLSQNRTTGLCTIVKVFEGSPAL EAGMQPGDIIYKVGDTLVASESLDVLVNNYIKGEEGTDVAITVYRADKDEYVDMSVTRRK IEVPTVEYSMLDDKIGRIAVSEFDVITVEQFEQAVDELQKDGMEGLIIDLRSNPGGVLDS AVKMVDYILPDDLDQYEKGKGKTLIVYTADKNEKGDVFTASDGHELKMPIVILVNGDSAS ASEVFTGALKDYDWATVVGTTSYGKGIVQNLIPLGDGSAIKITTAHYYTPSGFDLHGKGI EPDVEVELDEKLKTQAVVKPEEDNQLQKAVQVLKENK >gi|157101655|gb|DS480669.1| GENE 269 293514 - 294722 1351 402 aa, chain - ## HITS:1 COG:Cj1235 KEGG:ns NR:ns ## COG: Cj1235 COG0739 # Protein_GI_number: 15792559 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Campylobacter jejuni # 247 401 114 266 273 118 42.0 2e-26 MRIRHWICAAALSGGLLMGGMAAFSAYATSVEIEDAKKQVSALEEEKKKVESTLNQLEGL KADTAAYVKKLDGSLSSLAEELEQLGNRITLKEEEIDQAQIQLEEARREEEHQYDSMKLR IKYMYENGQNNLLDMVMESGSISELLNRAEYVSQIAEYDRKMLTAYASAKEQVAAREQNL EKEHGELLVLQESTQAKQASMQQLMDSKQKELDSYNSKIAMAQDELDQYNADIKAQEDQM KRIEAEMKRREEEARKKAEAAGKTYTVSNLGNISFKWPCPSSSRITSNFGDRESPTEGAS SNHKGIDISASTGADIIAAADGEVVISTYSYSAGNYIMIDHGGGVSTVYMHSSKLLVGVG EKVTKGQVIAKVGSTGYSTGPHLHFGIRSGGTYVNPRSYVSP >gi|157101655|gb|DS480669.1| GENE 270 294738 - 295646 972 302 aa, chain - ## HITS:1 COG:BH3601 KEGG:ns NR:ns ## 
COG: BH3601 COG2177 # Protein_GI_number: 15616163 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Bacillus halodurans # 1 302 1 298 298 116 28.0 6e-26 MRISTFWYCLKQGIINICRNILFSLASIATISACIFLFCLFFALIANVQNVAKTAETTVG ITVFFDEDMPEEQILAAGDGIRGWEEVREAQYISAAQAWENFKTDYFEGMEELAEGFADD NPLSGSASYEIFLNNIEEQDKIVERLEGMEGVRKVRYSSTAVAGLTSAGKMVGAMSAVII CVLLAVAVFLISNTISVAAAFRRRENEIMRLIGATNYMIRAPFVVEGVLLGALGAAVPLA GMYALYQRAVIYISEHYQMLTGMFEPIPLGNIFPYMAATAGCLGVGIGFFVSYFTIHRHL KV >gi|157101655|gb|DS480669.1| GENE 271 295716 - 296387 381 223 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 215 4 219 223 151 38 3e-35 MIEISNVSKTYETGNKAIKDVSLTIDDGEFVFIVGRSGSGKSTLMKLLLKELEPTKGRIV VNDMDLGRMPRRYVPKYRRRLGVVFQDFRLLKDRTVFENVAFAQRVIGVPPRIIRETVPE MLRLVGLSSKYKAYPRQLSGGEQQRVAIARALINDPEVLLADEPTGNLDSFNTHEIMRLL EEINQRGTTVIVITHSQEMVDEMNKRIITMERGSVISDVGGYY >gi|157101655|gb|DS480669.1| GENE 272 296402 - 297490 1497 362 aa, chain - ## HITS:1 COG:CAC3236 KEGG:ns NR:ns ## COG: CAC3236 COG2508 # Protein_GI_number: 15896482 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 143 355 111 312 312 90 31.0 4e-18 MISNQILQTTLDGLKAITRIDLGICDTEGKVLASTFTDAEDSESSVLVFVDSPADSQVIQ GYQFFKVFDEHQLEYILLAKGASDDVYMVGKLAAFQIQNLLVAYKERFDKDNFIKNLLLD NLLLVDIYNRAKKLHIDTDVKRVVYIIETHNEKDVNALETVRSLFASKTKDFITAVDEKN IILVKEVRQGETYGELDKTANTVLDMLNTEAMTKVRVAYGTIINDIKEVSRSYKEAKMAL DVGKIFYANKNVIAYNNLGIGRLIYQLPIPLCKMFIREVFDGKSPDEFDEETLATINKFF ENSLNVSETSRQLYIHRNTLVYRLDKLQKSTGLDLRIFEDAITFKIALMVAKYMKYMESL DY >gi|157101655|gb|DS480669.1| GENE 273 297632 - 298744 1365 370 aa, chain - ## HITS:1 COG:CAC3237 KEGG:ns NR:ns ## COG: CAC3237 COG3839 # Protein_GI_number: 15896483 # Func_class: G Carbohydrate transport and 
metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Clostridium acetobutylicum # 1 369 1 369 369 475 64.0 1e-134 MASLSLKNVTKKYPNGFVAVKEFNLEIADKEFIIFVGPSGCGKSTTLRMIAGLEDISSGE LYIDDKLVNDVEPKDRDIAMVFQNYALYPHMSVYDNMAFGLKLRKTPKDEIDKKVHDAAK ILDLEHLLDRKPKALSGGQRQRVAMGRAIVRSPKVFLMDEPLSNLDAKLRGQMRVEISKL HQRLETTIIYVTHDQTEAMTLGTRIVVMKDGIIQQVDNPQNLYDKPCNKFVAGFIGAPQM NMIDATVGKDGALTTLSFGGHTVALSEAKSKKLEDAGYIGKVVTLGIRPEDLHDEESYLA MSPKSVFEATVRVYEMLGSEVLLYFDIEDANFVAKVNPRTTARPGDTIKLAMDLEKIHIF DKDTELVVLN >gi|157101655|gb|DS480669.1| GENE 274 299010 - 299849 1063 279 aa, chain - ## HITS:1 COG:lin2652 KEGG:ns NR:ns ## COG: lin2652 COG1284 # Protein_GI_number: 16801713 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 9 276 19 285 287 167 38.0 2e-41 MKKRNPAADYAMIVAGTFLIGFAIKNIYDPVSMVTGGVSGVAIIAKELWSVPLWLTNTVL NIPLFAAGFFFCGWRFIKRTLFATVMLSVSLYVLPEASYLGNDLFLSALFGGIISGVGTG LVFLTSCTTGGTDLLAALIQKRLKHYTLAQIMQVLDGLIVVAGASVFGIRSALYALIAIF CVAKVTDGLIEGLKFSKQAYIISEHYREIAEAIMEQMGRGVTSLEARGMYSDQEKKMLFC VVSKKEIVRLKEIVAEFDSSAFVIVSDAREVFGEGFIEY >gi|157101655|gb|DS480669.1| GENE 275 299988 - 300821 324 277 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains [Anoxybacillus flavithermus WK1] # 4 267 2 270 285 129 33 1e-28 MIELGKVQELVIIRTKEFGVYLAEKEGDEAAVLLPKKQVPEGAGIGDKVEVFIYKDSSDR LIATTGKPKLQVGETAKLEVKDVAKIGAFLDMGLEKDLLLPFKEQTHPVRKGEQCLVTLY VDKSRRLAATMRVYSHMSNQSPYKAEDWVTGTIYEINETIGAFVAVDHKYYGLIPKKELY GRFREGDTVQARVTRVREDGKLDLCVREKSYVQMGTDAEKVLKVMDEFDGVLPFNDKAKP EVIMREFSMSKNAFKRAVGRLLKEGRIRITDKTIERV >gi|157101655|gb|DS480669.1| GENE 276 300823 - 301632 639 269 aa, chain - ## HITS:1 COG:CAC3591 KEGG:ns NR:ns ## COG: CAC3591 COG3884 # Protein_GI_number: 15896825 # Func_class: I Lipid transport and metabolism # Function: Acyl-ACP thioesterase # Organism: Clostridium acetobutylicum # 39 232 16 
211 248 102 30.0 7e-22 MKGVIPGRGMEAPAAKAASWSRENEVNMYTFNSRVRYSETDEGGCLSVSGIINYMQDCST FQSEDCGVGVGYLEASHKAWLLSSWQIMIDRYPGLGEKLVVGTWHCASKGIYGYRNFVLR DGEGTDVVRAESMWFLYDTDKNIPIRVRPGDTAPYGEPEPRLVLSPCPRKIPVPEDCKAG EPIVVARHHIDTNHHVNNAQYVEMAREAIPEGLVIKEIRADYKKAAVLGDVLTPRVSVAR DREWTVVLEGSTGEICAVVWLRTKGQEED >gi|157101655|gb|DS480669.1| GENE 277 301971 - 303476 1247 501 aa, chain - ## HITS:1 COG:SMc02414 KEGG:ns NR:ns ## COG: SMc02414 COG0402 # Protein_GI_number: 15966343 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Sinorhizobium meliloti # 1 482 1 483 491 404 43.0 1e-112 MYDLLLTNAVVITMDEKRRVIENGAVGVKDGRIAFVGDSMEAEQFEAKEVLDCKDHVVMP GFVDAHGHGGHSAFKSIVDLTSYWMPVMTHTYKNYITDDFWYNEGRLSALERLHAGVTTG VCVLGSQPRCDSPVPAFNNARAYAEVGVKDIVCTGPCHVPWPHRFSRYTEDGERHMSQVS YETVLESLETVVSELNHANSDRTRAYVAPFGAVTSVDPSAPTSADRCVKLTEHDLQQARD MRRIAEKYNTRIHTDAFGGMVHLAYQDKENALLGPDVHLQHCTGLSFDEALIIQKTDTHI SVAPGMRQLVNRTPVIELLELGVTVALTTDGSMLTSGFDMFDAMKRAQMIFRRAMNDDYY LPAEKVLEMTTIDAARCVGLDEEIGSLETGKRADIIAIDLMNPRLMPRINLIETLIGNGH PSDVDLVVVDGEIRLRDGKAVGINEREILLKAEEEALETAKRAKHLHPFAWPEKEHWGQT KIYFDEVRFDLDENRKDGGHY >gi|157101655|gb|DS480669.1| GENE 278 303522 - 304982 1246 486 aa, chain - ## HITS:1 COG:SMc02421 KEGG:ns NR:ns ## COG: SMc02421 COG0402 # Protein_GI_number: 15966350 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Sinorhizobium meliloti # 3 484 18 491 497 321 36.0 3e-87 MITRVKGSYVIGYNGADHEIIRDGEVVYENDTIIYVGKHYEGEVSKTEDAGNAVIMPGFI DLNALGDIDHDILHTEAFSNIRSSLNPSEEYFEKGTHEVMTAEEEAFKSLYAYTQLIMHG ITTAMPITSTYYKKWADTYEEEEAGVHHAGRLGLRLYTSPSYQCGLNVVRPDGTMTVRYL EGQGEEGLERAVRFIRKYDGAYDGLIRGALLPERIETQTEENLIRTKQYAEELKCPIKLH AAQGLFEYRFIKERTGKSPVEYLESIGFLGKNVGIPHCYIVKGTRWVEDGGDDLSILSNT GTTAIFCPVIIGRSGHYQDSFAKYRARNINVAVGTDTFPPDFFQNIRIASWFSQMAENKV 
EGSAMADVYRAVTLGGSALLGREDLGRLAPGAKADMIAVDLEDFHMGAVDDPIRTIFMCG CGRDVKLSIINGKTVMKDRTIEGVDLEEIKAKGQKYYDKMRMGYMERDYRHLSEDELFRP SFPIHK >gi|157101655|gb|DS480669.1| GENE 279 305138 - 306688 1482 516 aa, chain - ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 4 512 1 510 511 308 34.0 1e-83 MRKVKKAAGILLALTLAASLVTGCGGSSSKAQEEKDTLTVAITGDPPTLDPHASSTSSSV NNLNPVYETLVRYDDDGEIKPLLATEWKRLDDLNWEFKLRDDVYFHNGEKMTANDVLYSF KRATGPDGAKVSYIMSAIDADNCKVVDDNTIIIATHEPFSPLVGYLPYIGAVVVSEKEFT EDPEKAAQNPVGTGPFKFVSWAKNDRCVYERNDEYWGDKPAYKNLVIRTIVEANSRVIEL ESGNIDIAFDIPANDVERLENNKDTAIVKRASTIVEYMVMNVTKKPLDDVRVRQAIDWAL DENAILTAVWRGSAQYSPTTVAPSMKYFDDSDTDCRYDVEKAKALLAEAGVSDLHLKLTC AENTNRLNEATIIQDMLAQVGITVEIQSYEAGTFYDMVDAGETEIFLIGFGAVGFPEPDN NIYGPFNSKQIPTNNMGFYSDPEMDKMLEAQRNTSDGPEREQIIKDIQKYLRVNLPVIPV ANTEQVLGIRSNVKGFVPTPAGSHFFQNVHFENAEQ >gi|157101655|gb|DS480669.1| GENE 280 306739 - 307707 840 322 aa, chain - ## HITS:1 COG:FN0400 KEGG:ns NR:ns ## COG: FN0400 COG4608 # Protein_GI_number: 19703742 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Fusobacterium nucleatum # 8 322 1 314 320 392 59.0 1e-109 MSNPEERLVEINHLKKYFHVNKGLLHAVDDVSFYINKGETLGLVGESGCGKSTTGRTIIR LLEADSGEVNFHGENVLGFNKSQMKTFREHVQMVFQDPYSCLNPRLSVFELIAEPLTNME EYRRDKKKLDARVRELMDVVGLSRRLINAYPHELDGGRRQRVGIARALSVKPEFIVLDEP VSALDVCIQAQIINLLKRLQMDLGLTYLFISHDLSVVKHISNRIGVMYLGKIVELSDYRS IFSGPLHPYTKALLSAIPIPKVGVEREQIILKGDVPSPVNLPGGCRFAGRCPYVTERCRQ EDPELKNAGDNHFVACHLAGEM >gi|157101655|gb|DS480669.1| GENE 281 307700 - 308692 332 330 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 6 269 278 531 563 132 31 2e-29 MEKEKLLDIKDLSVTYKTEDGIVYAVNNLNLTLEPKETLGFVGETGAGKTTTAMTIMGLL PTPPGKVTSGEIMFDGKDMLKIGDKERRGILGEKISMIFQDPMTSLNPSMKVGDQIVEMI RLHKVIGKKEAFEEACGLLEMVGIRRERANDYPHQFSGGMKQRVVIAIALACRPKMIIAD EPTTALDVTIQAQVMNLIKGLKEKLGTSMILITHDLGIVAEICDKVAIMYAGQVVEYAEV SKIYHNPCHPYTKGLFNSIPKLDTEEDWLEEIHGLPPEPTERIDGCSFNPRCPNCMEICK HKKPEITVVNGEHYTRCFLYEKQEEVNKDE >gi|157101655|gb|DS480669.1| GENE 282 308711 - 309568 1004 285 aa, chain - ## HITS:1 COG:FN0398 KEGG:ns NR:ns ## COG: FN0398 COG1173 # Protein_GI_number: 19703740 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 3 284 5 288 289 320 61.0 2e-87 MEEKKSRSQVVTILKALSRNRMAVLGLVILIILLFTGIFAHVIAPYDFAKQNLADAFQHP SPSHIFGTDEFGRDIFSRVVYGARMSLLVGFVSVGIAVIIGGVLGAIAGYYGRRVDNLIM RFMDVLLAVPQTLLAIAIVAALGTGLMNLMIAVGISSVPTYARIVRASVLTIREEEYIEA ARASGTSNTKIIIKHILPNCVAPVIVQVTLGIAGAILTAAGMSFIGLGIQPPNPEWGNML SSGRDYIRGYAYMTMFPGLAIVITVLSLNLLGDGLRDALDPKLRN >gi|157101655|gb|DS480669.1| GENE 283 309592 - 310515 272 307 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 65 307 43 315 320 109 25 1e-22 MLKFTMKRLVYLVLVLVGVSFLVFLLLYMTPGDPVRMMLGESATPEAQAELRLELGLDDP FLVQYGRYIKNIVVHQDLGTSYSTRRPVLDEIMTVFPNTVKLATAAIIIAVILGTFLGIV SAVKQNSLLDNAVMVLALIGTSAPIFWIGILMIILFSVNLGWLPPSGFGSFKQLIMPALA LGMQSTAVVARMTRSSMLEVIRQDFVKTARAKGQKESVVIMKHVFRNALIPVITVVGLQF GTLLGGAMLTEVVFSIPGVGRLMIEAIKQRDFPIVQGSVLFVAACFSLVNLAVDLLYAVV DPKVSKE >gi|157101655|gb|DS480669.1| GENE 284 310763 - 312277 1173 504 aa, chain + ## HITS:1 COG:BS_yunI KEGG:ns NR:ns ## COG: BS_yunI COG2508 # Protein_GI_number: 16080295 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Bacillus subtilis # 4 502 1 530 531 
115 22.0 2e-25 MKKLTVADILQLDILKNVTICAGAAGLSHFVTGITTAEDPDLIQWLTGGEILLTSLYGAV TGPLSFKEYIDALASKHISALFIKTGERMPEIPAEITQSGSQYGIPIVQLPPDVSFFDVV SSVMKQILNNKAQYFIELQNNMSQLLASGAGEQDILDYLAEYVPASIHLCDSDKNIIFTS QSEEFHSEMAIADRISLPIMCMGEVCGYLEAATNRYLDENLEGLLKNAANLSAVLYLKKY YTVEIEQRYISGFLKELFRQNMGVKQIVEKAEGYGWHENDSYLAVSIQLEAARNITKVPE ALAEMTRLIPKSNYYFCIQDDLLNIIYRTEKKLSPQELYDLVTEVITNLNQYISKTYGSF TFYAGISSIASELFQLSQKIQESADSLQFCRVFNNRLVKYEDMGALRMLATYSHRSNLEQ VIPPAVTRLAEYDKLNNTQYLETLDSLLGNNLNLSKTAKQLFIHYKTMLHRMDRICEIAE ISLDDRQTRLDVELGVKLYMMLPK >gi|157101655|gb|DS480669.1| GENE 285 312573 - 313247 857 224 aa, chain - ## HITS:1 COG:CAC0198 KEGG:ns NR:ns ## COG: CAC0198 COG2364 # Protein_GI_number: 15893491 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 198 1 198 227 143 42.0 3e-34 MKNSDKLKRYSMFIAGVIFSALGISLITKAGLGTSPITSLAFVLTFIFPHSLGEFTMLVN TVMFLIQAALLGRAFGKVQLLQLPAALLFSACIDGWMYILGFWKTGSYVGQVLMLLLGCV FLGLGIAFEVIPNVLILPGEGLVRVIAGLTGWRFGLVKTGFDMSIVASAVGVSLFFTGHV LGIREGTVLAALIVGSISHFFIEKVSNVLEGWIPDYSDAGDACP >gi|157101655|gb|DS480669.1| GENE 286 313403 - 314434 971 343 aa, chain - ## HITS:1 COG:lin2698 KEGG:ns NR:ns ## COG: lin2698 COG0392 # Protein_GI_number: 16801759 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Listeria innocua # 5 330 6 331 357 96 24.0 5e-20 MSDKKKNLINGIFLLTVFVLTIYSVFSGEDLSDIADTISEASPVYLLMGVGCVIFFIWAE SAILHYLLGTLGIKTKRRTCFLYSSVGFFFSCITPSAGGGQPAQVYYMRKNMIPVPVATV VLMVVTITYKSVLVVIGCLLAVFGQGFLNRYLYEVMPVYYLGLGLNVVFCAAMLVLAFHT SLARDMAMRVLGLLEHFRFLKKKTSRTERLLLTMEHYNDTAAFLRQHKHVLLRVLFLTFC QRLALFSVTYFVYRAFGLKGTAMITLVLLQSTISVSADMLPLPGGMGISEHLFLTIFDPI FYGDLLLPAMVLSRGIAYYVQLGFSAAVTMYAHITLVRKHRKI >gi|157101655|gb|DS480669.1| GENE 287 314836 - 315402 605 188 aa, chain - ## HITS:1 COG:AGc3605 KEGG:ns NR:ns ## COG: AGc3605 COG1595 # Protein_GI_number: 15889274 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # 
Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 18 185 52 222 223 77 31.0 1e-14 MGAGLVLEGGRWMTQEEMIEALGDGDMEAFHLFYQDTARPVYSFILSLTRNPEEAEDVMQ DTYLTVWTKAGSYVPQGKPLAWVFTIARNLCYMRFREQKMKADVALEDLAEKEEGEYCAP LEQAADRKVLLDALGRISREERQIVLLHAAAGMKHREVAEALQMPLATVLSKYNRSMKKL QELLRCSG >gi|157101655|gb|DS480669.1| GENE 288 315410 - 316594 1267 394 aa, chain - ## HITS:1 COG:CAC0606 KEGG:ns NR:ns ## COG: CAC0606 COG0053 # Protein_GI_number: 15893895 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Clostridium acetobutylicum # 1 388 11 400 403 318 42.0 1e-86 MTGFLVKRFVKDHDNVEDVRVRTAYGTLTSMVGIGCNLLLFVSKLCIGILVHSISVMADA FNNLSDAASSIIGFVGVKMAEKPADDDHPFGHGRIEYIAAFIVAFLVIQVGFSLFKTSFG KVLHPEDMSFRWISILILFMSVAIKLWLSCFNKKLGKRINSKVMLATSADALGDVVTTSA TICSIGIYGAFGFNVDGIVGLIVSVVVMIAGVNIAKDTLAPLIGEAIDPEIYDKISSFVE SFDGIVGSHDLIVHNYGPSRSMASIHAEVPNDCDVERTHEIIDRIEREAMRRMGILLVIH MDPVETHDERVVEFKELVTSVLDGIDSRLTFHDFRMVDGVERINLIFDLVVPREYKPSVL GKLKARITEDVAKKDRRCCCVITMENSFISESRE >gi|157101655|gb|DS480669.1| GENE 289 316591 - 317337 573 248 aa, chain - ## HITS:1 COG:BH0167 KEGG:ns NR:ns ## COG: BH0167 COG0101 # Protein_GI_number: 15612730 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Bacillus halodurans # 3 237 4 238 263 200 45.0 2e-51 MNIGVWLSYDGTRYDGWQKQGNTDRTIQGKLEAVLEKLEGSPVEVHGSGRTDAGVHAESQ VANFHLAKNMTPDEIMRYMNRYLPEDIAVTRACVMPERFHSRLNALRKTYRYQIETAPKK NVFERYYYYGLGQDLDVGKMEQAAAILTGTHDYKSFCGNKKMKKSTIRTIESICFYRMGS RIHISFTGNGFLQQMVRILSGTLIEVGTGKRKPEEMAAILEAGDRSMAGFTAPPEGLFLE KVEYGDLI Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:52:18 2011 Seq name: gi|157101654|gb|DS480670.1| Clostridium bolteae ATCC BAA-613 Scfld_02_11 genomic scaffold, whole genome shotgun sequence Length of sequence - 132169 bp Number of predicted genes - 122, with homology - 122 Number of transcription units - 45, operones - 27 average op.length - 3.9 N Tu/Op Conserved S 
Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 63 - 350 447 ## COG2088 Uncharacterized protein, involved in the regulation of septum location 2 1 Op 2 7/0.111 - CDS 428 - 1549 958 ## COG0448 ADP-glucose pyrophosphorylase 3 1 Op 3 . - CDS 1546 - 2796 1058 ## COG0448 ADP-glucose pyrophosphorylase - Prom 2851 - 2910 8.0 + Prom 3272 - 3331 6.7 4 2 Tu 1 . + CDS 3406 - 4797 1653 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases + Term 5002 - 5029 0.5 5 3 Op 1 5/0.111 + CDS 5109 - 6179 805 ## COG0420 DNA repair exonuclease 6 3 Op 2 . + CDS 6184 - 8049 1691 ## COG4717 Uncharacterized conserved protein + Term 8081 - 8128 5.6 - Term 8068 - 8116 5.0 7 4 Op 1 . - CDS 8161 - 8517 398 ## COG0251 Putative translation initiation inhibitor, yjgF family 8 4 Op 2 7/0.111 - CDS 8580 - 10256 1394 ## COG0747 ABC-type dipeptide transport system, periplasmic component 9 4 Op 3 44/0.000 - CDS 10308 - 11249 548 ## COG4608 ABC-type oligopeptide transport system, ATPase component 10 4 Op 4 44/0.000 - CDS 11246 - 12235 865 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 11 4 Op 5 49/0.000 - CDS 12267 - 13160 815 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 12 4 Op 6 . - CDS 13202 - 14140 799 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Prom 14245 - 14304 9.9 + Prom 14231 - 14290 8.5 13 5 Op 1 7/0.111 + CDS 14425 - 16260 1336 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 14 5 Op 2 . + CDS 16253 - 17365 727 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 17372 - 17425 18.0 + Prom 17382 - 17441 6.3 15 6 Tu 1 . + CDS 17541 - 18344 861 ## GAU_1495 hypothetical protein + Term 18435 - 18473 5.5 + Prom 18392 - 18451 2.4 16 7 Op 1 . 
+ CDS 18495 - 20459 935 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 17 7 Op 2 1/0.222 + CDS 20456 - 22291 773 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 18 7 Op 3 . + CDS 22353 - 23513 699 ## COG1228 Imidazolonepropionase and related amidohydrolases + Term 23571 - 23619 -0.0 - Term 23555 - 23611 4.3 19 8 Tu 1 . - CDS 23630 - 24730 945 ## Closa_0068 hypothetical protein - Prom 24885 - 24944 4.1 20 9 Tu 1 . - CDS 24969 - 26369 1361 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases - Prom 26465 - 26524 6.7 + Prom 26413 - 26472 7.7 21 10 Tu 1 . + CDS 26608 - 27465 630 ## COG2207 AraC-type DNA-binding domain-containing proteins 22 11 Tu 1 . + CDS 27573 - 27815 249 ## Closa_0067 hypothetical protein 23 12 Op 1 44/0.000 - CDS 27841 - 28875 446 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 24 12 Op 2 . - CDS 28788 - 29768 196 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 25 12 Op 3 . - CDS 29831 - 31189 850 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 26 12 Op 4 49/0.000 - CDS 31220 - 32134 780 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 27 12 Op 5 38/0.000 - CDS 32149 - 33069 262 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 - Prom 33113 - 33172 4.7 28 12 Op 6 . - CDS 33178 - 34785 1388 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 34911 - 34970 6.9 + Prom 34887 - 34946 7.7 29 13 Op 1 . + CDS 35028 - 36947 1069 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains 30 13 Op 2 1/0.222 + CDS 37000 - 37827 870 ## COG0789 Predicted transcriptional regulators + Term 37878 - 37912 6.0 + Prom 37854 - 37913 7.1 31 14 Tu 1 . 
+ CDS 37944 - 39314 1419 ## COG0534 Na+-driven multidrug efflux pump + Term 39389 - 39436 7.2 - Term 39376 - 39424 12.0 32 15 Op 1 . - CDS 39512 - 40861 1570 ## COG0166 Glucose-6-phosphate isomerase - Prom 40938 - 40997 7.1 33 15 Op 2 . - CDS 41017 - 41766 655 ## Closa_0056 hypothetical protein - Prom 41893 - 41952 5.8 34 16 Tu 1 . - CDS 42003 - 43175 270 ## gi|160935893|ref|ZP_02083267.1| hypothetical protein CLOBOL_00786 - Prom 43272 - 43331 3.8 - Term 43286 - 43329 3.1 35 17 Tu 1 . - CDS 43408 - 44100 749 ## COG0775 Nucleoside phosphorylase - Prom 44126 - 44185 3.9 + Prom 44156 - 44215 5.7 36 18 Tu 1 . + CDS 44245 - 44487 248 ## Closa_0054 hypothetical protein + Term 44517 - 44580 9.3 - Term 44503 - 44566 10.1 37 19 Op 1 . - CDS 44621 - 45139 193 ## gi|160935897|ref|ZP_02083271.1| hypothetical protein CLOBOL_00790 38 19 Op 2 . - CDS 45160 - 45930 383 ## gi|160935898|ref|ZP_02083272.1| hypothetical protein CLOBOL_00791 - Prom 46077 - 46136 9.8 + Prom 45994 - 46053 5.6 39 20 Tu 1 . + CDS 46107 - 47171 563 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase + Term 47304 - 47339 1.5 40 21 Op 1 . - CDS 47174 - 47725 270 ## Closa_0149 RNA polymerase, sigma-24 subunit, ECF subfamily 41 21 Op 2 . - CDS 47731 - 48147 292 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 42 21 Op 3 . - CDS 48190 - 48810 473 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) - Prom 48834 - 48893 5.1 43 22 Op 1 . - CDS 48993 - 50531 454 ## Closa_0140 FHA domain containing protein 44 22 Op 2 . - CDS 50521 - 50913 274 ## Closa_0139 hypothetical protein 45 22 Op 3 . - CDS 50910 - 51458 223 ## Closa_0138 hypothetical protein 46 22 Op 4 . - CDS 51483 - 52475 428 ## Closa_0137 hypothetical protein 47 22 Op 5 . - CDS 52524 - 54521 1437 ## Closa_0136 hypothetical protein 48 22 Op 6 . - CDS 54521 - 54700 230 ## gi|295115363|emb|CBL36210.1| hypothetical protein 49 22 Op 7 . 
- CDS 54719 - 55885 934 ## Closa_0134 hypothetical protein - Prom 55910 - 55969 1.7 50 23 Op 1 8/0.000 - CDS 56007 - 56900 651 ## COG4965 Flp pilus assembly protein TadB 51 23 Op 2 . - CDS 56806 - 58002 745 ## COG4962 Flp pilus assembly protein, ATPase CpaF 52 23 Op 3 . - CDS 58080 - 59084 1080 ## COG1192 ATPases involved in chromosome partitioning 53 23 Op 4 . - CDS 59141 - 59725 218 ## Closa_0130 peptidase A24A prepilin type IV - Prom 59882 - 59941 5.1 + Prom 59927 - 59986 6.5 54 24 Op 1 . + CDS 60021 - 60374 411 ## Closa_0129 ArsR family transcriptional regulator + Prom 60386 - 60445 4.2 55 24 Op 2 . + CDS 60469 - 60723 305 ## Closa_0128 hypothetical protein 56 24 Op 3 4/0.111 + CDS 60792 - 62132 1339 ## COG0366 Glycosidases 57 24 Op 4 . + CDS 62134 - 64062 1786 ## COG0296 1,4-alpha-glucan branching enzyme + Term 64164 - 64204 -0.5 + Prom 64107 - 64166 3.2 58 24 Op 5 . + CDS 64224 - 64973 865 ## Closa_0125 aminoglycoside phosphotransferase + Term 64988 - 65047 19.9 - Term 64986 - 65024 10.3 59 25 Op 1 40/0.000 - CDS 65157 - 67529 2520 ## COG0642 Signal transduction histidine kinase 60 25 Op 2 1/0.222 - CDS 67538 - 68233 978 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 61 25 Op 3 . - CDS 68277 - 69878 1621 ## COG3409 Putative peptidoglycan-binding domain-containing protein - Prom 69987 - 70046 8.3 - Term 70030 - 70092 14.0 62 26 Op 1 . - CDS 70107 - 70775 823 ## Closa_0311 MerR family transcriptional regulator 63 26 Op 2 16/0.000 - CDS 70869 - 72227 1573 ## COG0305 Replicative DNA helicase 64 26 Op 3 9/0.000 - CDS 72299 - 72745 639 ## PROTEIN SUPPORTED gi|239629097|ref|ZP_04672128.1| ribosomal protein L9 65 26 Op 4 . - CDS 72790 - 74838 675 ## PROTEIN SUPPORTED gi|85057286|ref|YP_456202.1| exopolyphosphatase-related protein - Prom 74880 - 74939 4.2 + Prom 75016 - 75075 5.5 66 27 Op 1 . + CDS 75109 - 75453 416 ## bpr_I0018 hypothetical protein 67 27 Op 2 . 
+ CDS 75446 - 75841 196 ## Clole_2235 hypothetical protein + Term 75892 - 75942 3.7 68 28 Op 1 38/0.000 - CDS 75852 - 76685 567 ## COG0395 ABC-type sugar transport system, permease component 69 28 Op 2 35/0.000 - CDS 76682 - 77518 422 ## COG1175 ABC-type sugar transport systems, permease components - Term 77586 - 77624 5.3 70 28 Op 3 . - CDS 77633 - 78955 1081 ## COG1653 ABC-type sugar transport system, periplasmic component 71 28 Op 4 . - CDS 78977 - 79708 534 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 72 28 Op 5 3/0.111 - CDS 79715 - 80767 680 ## COG1609 Transcriptional regulators - Prom 80805 - 80864 7.0 + TRNA 81133 - 81217 52.9 # Tyr GTA 0 0 + TRNA 81317 - 81400 64.1 # Leu TAA 0 0 - Term 81397 - 81455 7.1 73 29 Tu 1 . - CDS 81605 - 83728 2115 ## COG0366 Glycosidases - Prom 83960 - 84019 4.8 - Term 84033 - 84073 6.1 74 30 Op 1 11/0.000 - CDS 84128 - 85942 1279 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 75 30 Op 2 10/0.000 - CDS 85965 - 86498 568 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 76 30 Op 3 1/0.222 - CDS 86519 - 87910 975 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 77 31 Tu 1 . - CDS 88049 - 89467 1422 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit - Prom 89490 - 89549 4.5 78 32 Op 1 . - CDS 89572 - 89835 377 ## Closa_4064 Septum formation initiator 79 32 Op 2 . - CDS 89894 - 90274 233 ## Closa_4065 Spore cortex biosynthesis protein, YabQ-like protein 80 32 Op 3 . - CDS 90283 - 90570 390 ## Closa_4066 sporulation protein YabP - Prom 90592 - 90651 3.2 81 33 Op 1 1/0.222 - CDS 90683 - 90922 341 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) 82 33 Op 2 1/0.222 - CDS 90926 - 91201 171 ## PROTEIN SUPPORTED gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 - Prom 91241 - 91300 4.5 83 34 Op 1 . 
- CDS 91302 - 91682 438 ## COG3956 Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain 84 34 Op 2 16/0.000 - CDS 91692 - 92312 842 ## COG1394 Archaeal/vacuolar-type H+-ATPase subunit D 85 34 Op 3 16/0.000 - CDS 92318 - 93709 1827 ## COG1156 Archaeal/vacuolar-type H+-ATPase subunit B 86 34 Op 4 . - CDS 93725 - 95491 1942 ## COG1155 Archaeal/vacuolar-type H+-ATPase subunit A 87 34 Op 5 . - CDS 95484 - 96077 841 ## Closa_4197 hypothetical protein 88 34 Op 6 . - CDS 96104 - 96412 554 ## Closa_4196 Vacuolar H+transporting two-sector ATPase F subunit 89 34 Op 7 . - CDS 96417 - 96851 652 ## Closa_4195 H+transporting two-sector ATPase C subunit 90 34 Op 8 . - CDS 96883 - 98823 2325 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I 91 34 Op 9 . - CDS 98841 - 99860 880 ## Closa_4193 H+transporting two-sector ATPase C (AC39) subunit 92 34 Op 10 . - CDS 99865 - 100176 448 ## Closa_4192 hypothetical protein - Prom 100203 - 100262 6.0 - Term 100277 - 100329 16.1 93 35 Tu 1 . - CDS 100379 - 101374 872 ## Closa_4191 SH3 type 3 domain protein - Prom 101411 - 101470 10.1 + Prom 101468 - 101527 7.8 94 36 Op 1 . + CDS 101560 - 102222 619 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase 95 36 Op 2 . + CDS 102259 - 102420 160 ## gi|160935963|ref|ZP_02083337.1| hypothetical protein CLOBOL_00858 + Term 102666 - 102733 13.1 + TRNA 102638 - 102713 85.8 # Lys CTT 0 0 - Term 102625 - 102693 31.2 96 37 Tu 1 . - CDS 102783 - 104435 1708 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 104577 - 104636 6.3 + Prom 104510 - 104569 6.2 97 38 Op 1 . 
+ CDS 104604 - 105473 854 ## COG0005 Purine nucleoside phosphorylase 98 38 Op 2 15/0.000 + CDS 105501 - 106625 1398 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein + Term 106638 - 106674 1.1 99 38 Op 3 24/0.000 + CDS 106688 - 108214 1278 ## COG3845 ABC-type uncharacterized transport systems, ATPase components 100 38 Op 4 26/0.000 + CDS 108207 - 109256 1116 ## COG4603 ABC-type uncharacterized transport system, permease component 101 38 Op 5 . + CDS 109253 - 110173 1163 ## COG1079 Uncharacterized ABC-type transport system, permease component 102 38 Op 6 . + CDS 110176 - 110721 608 ## COG1719 Predicted hydrocarbon binding protein (contains V4R domain) + Term 110968 - 111036 11.7 + TRNA 110938 - 111013 85.8 # Lys CTT 0 0 + Prom 110941 - 111000 78.9 103 39 Op 1 . + CDS 111153 - 111800 547 ## Closa_4227 integral membrane protein TIGR01906 + Prom 111814 - 111873 5.1 104 39 Op 2 . + CDS 111902 - 113536 1394 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Prom 113541 - 113600 2.5 105 40 Op 1 16/0.000 + CDS 113622 - 114731 898 ## COG2205 Osmosensitive K+ channel histidine kinase 106 40 Op 2 . + CDS 114735 - 115427 854 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 115448 - 115486 5.1 - Term 115426 - 115482 11.3 107 41 Op 1 10/0.000 - CDS 115495 - 116130 657 ## COG2376 Dihydroxyacetone kinase 108 41 Op 2 . - CDS 116169 - 117164 1038 ## COG2376 Dihydroxyacetone kinase 109 41 Op 3 2/0.222 - CDS 117183 - 117836 770 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 110 41 Op 4 . - CDS 117859 - 118302 478 ## COG4154 Fucose dissimilation pathway protein FucU 111 41 Op 5 4/0.111 - CDS 118307 - 119407 1354 ## COG0182 Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family 112 41 Op 6 . 
- CDS 119468 - 120691 1431 ## COG4857 Predicted kinase 113 41 Op 7 21/0.000 - CDS 120709 - 121686 1122 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 114 41 Op 8 16/0.000 - CDS 121710 - 123245 1546 ## COG1129 ABC-type sugar transport system, ATPase component - Prom 123266 - 123325 2.5 - Term 123272 - 123313 9.3 115 41 Op 9 . - CDS 123372 - 124604 1422 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 124829 - 124888 8.4 + Prom 124828 - 124887 8.8 116 42 Tu 1 . + CDS 124965 - 125966 901 ## COG1609 Transcriptional regulators + Term 125978 - 126023 14.4 - Term 125966 - 126011 10.6 117 43 Tu 1 . - CDS 126047 - 127060 763 ## Closa_4231 Exonuclease RNase T and DNA polymerase III - Prom 127093 - 127152 4.6 + Prom 127268 - 127327 8.5 118 44 Op 1 . + CDS 127360 - 127989 183 ## gi|160935986|ref|ZP_02083360.1| hypothetical protein CLOBOL_00883 119 44 Op 2 . + CDS 127992 - 128543 418 ## COG1268 Uncharacterized conserved protein 120 44 Op 3 8/0.000 + CDS 128554 - 129747 815 ## COG0183 Acetyl-CoA acetyltransferase 121 44 Op 4 . + CDS 129793 - 131046 690 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II - Term 130992 - 131036 8.7 122 45 Tu 1 . 
- CDS 131051 - 132136 1090 ## PRU_1044 hydrolase domain-containing protein Predicted protein(s) >gi|157101654|gb|DS480670.1| GENE 1 63 - 350 447 95 aa, chain - ## HITS:1 COG:CAC3223 KEGG:ns NR:ns ## COG: CAC3223 COG2088 # Protein_GI_number: 15896470 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein, involved in the regulation of septum location # Organism: Clostridium acetobutylicum # 1 81 1 81 95 116 71.0 1e-26 MQVTDVRVRRVEKEGKMKAIVSITLDNEFVIHDIKVIEGEKGLFIAMPSRKAADGEYRDI AHPINSNTRDMIQTIILNKYETTLLENEAAEEGMA >gi|157101654|gb|DS480670.1| GENE 2 428 - 1549 958 373 aa, chain - ## HITS:1 COG:TM0239 KEGG:ns NR:ns ## COG: TM0239 COG0448 # Protein_GI_number: 15643011 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Thermotoga maritima # 1 349 1 350 370 320 43.0 3e-87 MRAIGIVLAGGNSKRMRELSNKRAVAAMPVAGSYRSIDFALSNMTNSHIQNVAVFTQYNS RSLNLHLSSSKWWDFGRKQGGLFVFTPSITPENGDWYRGTADALYQNLTFLKNSHEPYVV IAAGDGVYKLDYNKVLEYHIEKKADITVVCKDMEDGTDVTRFGCVKLNDDGRITDFEEKP MASDATTISCGIYVIRRRQLIELLERCAAEDRYDLVNDILVRYKNLKRIYAYKLDGYWSN IASVDSYYKTNMDFLKPEVRDYFFKQYPDVYTKIDDLPPAKYNPGALVRNSLISSGCILN GTVENSILFKKAYVGNNCVIKNSIILNDVYIGDNTVIENCIVESRDTIRANTTHIGTPEN IKIVVEKNERYTL >gi|157101654|gb|DS480670.1| GENE 3 1546 - 2796 1058 416 aa, chain - ## HITS:1 COG:TM0240 KEGG:ns NR:ns ## COG: TM0240 COG0448 # Protein_GI_number: 15643012 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Thermotoga maritima # 1 410 7 419 423 457 54.0 1e-128 MLLAGGQGSRLGVLTQKVAKPAVSFGGKYRIIDFPLSNCINSGVDTVGVLTQYQPLRLNS HIGIGVPWDLDRNVGGVTILPPYERSKGSDWYTGTANAIYQNLEYMESYNPEYVLILSGD HIYKMDYEVMLDYHKANNADITIAAMPVPIEEASRFGILITDESNRITEFEEKPPVPRSN LASMGIYIFSWPVLKEALIKMKEEPGCDFGKHIIPYCHARGDRIFAYEYNGYWKDVGTLG SYWEANMELIDIIPEFNLYEEYWKIYTKNDVIPPQFISNEANIERSIIGEGTEIYGEVMN SVIGAGVTVAKGAVVKDSIIMQGTVIGAGTVVNKAIIAENVRIGSGVELGTGEYAPSTYD PKVYQFDLVTIGENSVIPDGVKVGKNTAIAGETTVGDYPDGLLAGGNYIIKAGGVK 
>gi|157101654|gb|DS480670.1| GENE 4 3406 - 4797 1653 463 aa, chain + ## HITS:1 COG:CAC3260 KEGG:ns NR:ns ## COG: CAC3260 COG0017 # Protein_GI_number: 15896505 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 1 463 1 463 463 674 68.0 0 MDLITVRELYKHPEEYLDKEINVGGWVRSLRDSKAFGFIVVSDGSFFETLQIVYHDTLPN FAEVSKLSVGTAILVKGTLVSTPQAKQPFEIQAAEVTVEGTSSPDYPLQKKRHSLEYLRT ISHLRARTNTFQAVFRVRSLVAYAIHQYFQERDFVYVHTPLITGSDCEGAGEMFQVTTMD LNNIPKTEEGKVDFSQDFFGKATNLTVSGQLNAETYAMAFRNVYTFGPTFRAENSNTTRH AAEFWMIEPEMAFADLDDDMCLAEGMLKYIIRFVLEHAPEEMNFFNQFVDKGLLDRLHNV LNSEFGRVTYTEAVKILEEHNDEFDYKVSWGCDLQTEHERYLTEQVFKKPVFVTDYPKEI KAFYMKMNPDNKTVAAVDCLVPGIGEIIGGSQREDDYGRLVARMDELGLKKEDYDFYLDL RKYGSTRHAGFGLGFERCVMYLTGMSNIRDVIPFPRTVNNCEL >gi|157101654|gb|DS480670.1| GENE 5 5109 - 6179 805 356 aa, chain + ## HITS:1 COG:MJ1323 KEGG:ns NR:ns ## COG: MJ1323 COG0420 # Protein_GI_number: 15669513 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Methanococcus jannaschii # 1 201 1 207 366 75 30.0 1e-13 MKFIHTGDIHWGMCPDANKPWGKERAQAIKDTFRIIIEKTKEMDVDCLFISGDLFHRQPL MKDLKEVNYLFSTIPSVKIILIAGNHDRIRENSALLSFIWSPNVTFIMDEDLTSVYFEDI NTEVHGFSYHTAEIKAPLLEDIQVPLNSRIQILMAHGGDAAHLPINFNSLELSPFSYIAL GHIHKHHVLADGKIAYCGSPEPLDLTETGSHGFYAGEIHPVTRKLINLQFIPAASLQYIP LAVNVSKDTTNGELCDRISNEIENRGSQHIYRFRIRGMRDPDIEFDLEQLSARYRIAEII DDSEPQYDFSALFAEHSSDMIGFYIQALQKEDMSPVEKKALYYGVNALLHTTDERS >gi|157101654|gb|DS480670.1| GENE 6 6184 - 8049 1691 621 aa, chain + ## HITS:1 COG:BS_yhaN KEGG:ns NR:ns ## COG: BS_yhaN COG4717 # Protein_GI_number: 16078056 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 372 619 715 962 963 60 26.0 1e-08 MKILSLHIEGFGKFHDLDISFKDGLNVVYGKNEAGKSTLHTFIRGMLFGIEKQRGRAARN DVYSKYEPWKGSGTYEGWMRLESEGQIYRIQRRFQKQNKELTIVNETQGREMEPTKALLD QLRCGLSETAYDNTISIGQLKCATDGGMVAELRNYIANLNTSGSIALNITKATAYLKSQR 
RELENQVVPEAARSYTALLGDIRKTEQEISAPQYANQLIEYRNKKSEIKKQLECCQKEKE GLLEKSAKGRQILLNSQFTDQESVQKYMEDTRKLYGEYQNTLAACSKKSRTVCAVFMFLI TLAFAGAAGLCLAAPEMVRSLVLLPFPLTAAAGISGAIALAACITGILKLGKNRRFRKEL DTSTRLLQEIFARHLGDSSISQDAMNALEGRMAEFLRLSKAVEQSEQTLVQLSQEIGKLQ SSEDQFSEEIEQQQRIQWELEQKLEHLADCKTRAEALKHVLAENERIQEELDAIDLAQDT MTTLSTSIRDSFGLYLNREASDLISGITGGIYTSMSIDENLDVFMNTPSKLVPIEQVSSG TMDQIYLALRLAAAKLVQNGRRDTMPLIFDDSFVLYDDERLKTALKWLVKAYDNQIIVFT CHQREAQLLTANQVKYHLVRI >gi|157101654|gb|DS480670.1| GENE 7 8161 - 8517 398 118 aa, chain - ## HITS:1 COG:BMEII0278 KEGG:ns NR:ns ## COG: BMEII0278 COG0251 # Protein_GI_number: 17988623 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Brucella melitensis # 1 113 15 127 128 73 38.0 7e-14 MEIIRHDMCKHFARVTEYNGVLYFTGHIAAGKQPTMTEQMVALTHRLDELLEQFGSDKNH ILYAKINIHDWSMFDEFNAVWDAWFDPGCMPARTTAQSVLTDGYLVEITLTAAKISNN >gi|157101654|gb|DS480670.1| GENE 8 8580 - 10256 1394 558 aa, chain - ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 528 1 484 511 250 33.0 6e-66 MKRKLALVMAMLMVLGTLPACSSNSGGSPSEKGGSAVSGGAAGTEAGAVIETGDKINSTY TNNTPDANLSVTAVEDTLTIALASEPPSLDIFQLSDSAATLVSKVHCEGLLRRNDATMEY EGVLAKSWENVDDTTIRFHLRDDVYFHNGEKMTAEDVLYMYQRGNELPLQSSFYNMYDLE KCKIIDDYTLDIVTFEPFPPMEDFMSDPADCVLSKKAIESSTPEALVRDLNGAGTGPYKF VEWIAGDRIVFERNENYWGEKPEFKNLVFRIITDSTARTLAFEAGDVDICMQPLTTSVDS LRQNPNCDIWTCDTFTTANIVFNCTVAPLDNKQVRQALAYALDLDTIVDTVYSGTGRKHD SFFTPQNEGYHPADPEKDKALLEYNPEKAKELLAEAGYPDGFTIKLWTNENQNRIDLAEI LQNNWGAIGVKVDVSIMEFATLMAEYSKGIHEVVICAFAPTGKDGQFYYPYMHSTGPFRR NYAGIRNPEIDKLMENAQSSLDKEERAADYAKVIDILREEVYMIPLHNSENIFGVRSTLT NFEPNPVSLPRLAIVKSK >gi|157101654|gb|DS480670.1| GENE 9 10308 - 11249 548 313 aa, chain - ## HITS:1 COG:FN0400 KEGG:ns NR:ns ## COG: FN0400 COG4608 # Protein_GI_number: 
19703742 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Fusobacterium nucleatum # 4 311 1 309 320 435 67.0 1e-122 MSVLLEIRNLKKYFHVPDGWLHAVDDVSFSLEAGKTLGIVGESGCGKSTLGRTIVRLLDS TEGEICFDGKRITNARGKELKKLHTDMQIIFQDPFSSLNPRMTVKQIIAEPLRCCERLSA REIDKRVDELMELVGLASRLAGSYPHELDGGRRQRIGIARALALNPRFIVCDEPVSALDV SIQAQILNLMQDLQERLGLTYIFITHDLSVVKHISDEIMVMYLGQVVEKCGSDDLFKDPR HPYTKALLSAIPIPSLGVKRERIFLKGELTSPVNPRPGCRFAPRCPYADEECFKRNPLLA DDGRGHFAACHHI >gi|157101654|gb|DS480670.1| GENE 10 11246 - 12235 865 329 aa, chain - ## HITS:1 COG:FN0399 KEGG:ns NR:ns ## COG: FN0399 COG0444 # Protein_GI_number: 19703741 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Fusobacterium nucleatum # 2 317 3 317 335 397 61.0 1e-110 MNETLLEIKDLVIHYETDSGVVEAVNGMNLVLNKGQSLGLVGETGAGKTTTALGILRLLS EPPARIISGEIHFEGRDVLAMDKDTLQKYRGKKVSMIFQDPMTSLNPVMTIGEQVMESLE IHEPNLSREELAAKAQETLEMVGIREERFNDYPHQFSGGMKQRVVIAIALCCNPELLIAD EPTTALDVTIQAQVLDMIRTLRNELQTSMILITHDLGLVVENCDMVAIIYAGEIVEYGNL ADIFENARHPYTVGLFGCIPNIDKDVHRLKPIEGLMPDPMNLPDGCPFHDRCEYCMEICR KEKPADLEIAPGHIVKCHLCDTGEKEGSK >gi|157101654|gb|DS480670.1| GENE 11 12267 - 13160 815 297 aa, chain - ## HITS:1 COG:FN0398 KEGG:ns NR:ns ## COG: FN0398 COG1173 # Protein_GI_number: 19703740 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 12 297 2 289 289 293 53.0 2e-79 MMKARTGGKPRKTTKKYGTSQWGEAWIRLKRDKKAMMGLVIICILLCCALVPSLLAPYGL DEQRPAEALQWPNLLHLMGTDNFGRDIFSRIIYGARTSIVVGFIAVGFACITGTLIGAAA GYYGGKADNVLMRFMDVLMAIPSTLLAISIAAALGSGLVNLMIAAGIGTIPSYARTVRAS VLTVKDQEYIEAARAGGASDLRIVLRHILPNCLAPIIVRSTLGVASAILTCASLSYIGLG IQPPTPEWGNMLNTGKAYLLNEWYMALFPGLAIMMVILALNLLGDGLRDALDPRMKR 
>gi|157101654|gb|DS480670.1| GENE 12 13202 - 14140 799 312 aa, chain - ## HITS:1 COG:FN0397 KEGG:ns NR:ns ## COG: FN0397 COG0601 # Protein_GI_number: 19703739 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 298 12 308 308 277 52.0 2e-74 MIPVLLGVLFLVFTMNEISPGDPAAMIAGDAASVEVVEQIREDLGLNKPLPVRFFNYTKN LVLHGDLGTSYKTKRPVLDEVMDRLPTTILLSLTSAAFAVFLSIPIGIISAIKQNTWIDN LLMVLALIGVAMPAFWQGLMTIILFSVKLGWFPSYGFTTPAHWFMPVLTIGTGAMASLVR ITRSSMLEVIRQDYIRTARAKGQTERKVIISHALRNSMIPIITAIAIQLGSMLGGAIVTE TVFAIPGIGMLMIQSIKARDYPTIQGAVVVIAVMFSLLNLVVDIIYTFVDPRLKSIYQTK RKVKHFIAKSET >gi|157101654|gb|DS480670.1| GENE 13 14425 - 16260 1336 611 aa, chain + ## HITS:1 COG:BH2110 KEGG:ns NR:ns ## COG: BH2110 COG2972 # Protein_GI_number: 15614673 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 312 602 285 580 585 159 33.0 2e-38 MKKGFIKQKSRSVMERLMNQSLTVKISVFIYLIVIILFLFLSLFLFGIVGRASEQRLSEE HSKLLRVAILMINSRQQYMIGVADYYALSPDIKNMMDTSNSGLPVATVLDESRFKTPTII ILSTVFYNAEGEAVYYTTKDTSRAPINQKNNDAFQKLASGQATYVWDYIDHNNSDFMKYD FSPKLCLWRAVRDSNDVHMIGAVAVTIDSRSLLGFDETAKQLSENLAIVNSSNQVVYNRS NIDLNEDDIRLLSNNSFGSDAGLYTADLSSGKYRVSYQAIPSTDFVAFFLYRHTPFAFGI DVVYSYILAGLFLYILSLFPSIILVSKMITKPLVSLTKSIQQFAEGQQNVKVQFKYTDEI GLLGKAFNDMVLDNERLRISEYDLRLKNKDAELALMQAQINPHFLYNMLNAIQWQALKSG NKEIADIAYSMAQVFRISLSRGRSIISVKQELDLVSYYLSLQKYRLGKKIDYHIDFDEDV LDRQIPKLILQPLVENSIVHGMAKDSSLNLVISLSVALSEDGKSLHFMIQDNGCGIPPEI LRYLPSEVIPAAAEQGQRPNKSNRFAVKNIYDRLTLVYGSDFTFRIHSEQGTSIEIILPV KEIDERSTSNA >gi|157101654|gb|DS480670.1| GENE 14 16253 - 17365 727 370 aa, chain + ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and 
AraC-type DNA-binding domain # Organism: Bacillus subtilis # 1 360 1 359 368 128 29.0 1e-29 MLRLAIIDDEEETRQIMVNFIDWDTYGITVVGEASDGFAAYDLINSLRPDIVLIDIQMPG MTGLDVLEKIRSEDLISPAFIIMSGYDDFEFARRAVSLAAVEYLLKPFRPDDVIMAIQKS IRHLELIRSSYPDTPSPDSKKKSLAGNAIDFSLLHYPSREEHRLLECLKVGTKESVLGEL DCFWNCLCELNGDTESRLNCAIILYVEVCRFLLERGSCFSAPYFENRDSSGSDSSREIHQ TLTSIVLDAISLLESGDVSRVYVNKAIQYIQAHYNENLSLDSVANEIELSSPYLSSLFSR ILGTTFVNYIQSVRIQKAKELLCGTTMKVYEIAYCIGYDDEKYFSQVFRKTEGITPSQFR AQSSSIPHRM >gi|157101654|gb|DS480670.1| GENE 15 17541 - 18344 861 267 aa, chain + ## HITS:1 COG:no KEGG:GAU_1495 NR:ns ## KEGG: GAU_1495 # Name: not_defined # Def: hypothetical protein # Organism: G.aurantiaca # Pathway: not_defined # 14 257 56 312 327 77 25.0 5e-13 MVRSENQVANPVRTNIELSEKVLELNKQLAKDPENAELWIKRGLALVEQNLMREAVESYS KALCLDPFNWLGYRNRAHRHLSCWEFQEACSDFIMSTRLVPNTVDIHEIWEAWYLLALSH YLLGEYEKAADAYKTCYSLARGDDIITTTGWYWSTLMRLDRKEEAEGILAFITKDMKFEA SREAYFSTLRMFKGWVSPEDTEAQCNMPDCKDKFTRLYAISNYYYITGDLDNSNRVIDRI LDEAGPEWWMVFGYLAAMVDKKARSNA >gi|157101654|gb|DS480670.1| GENE 16 18495 - 20459 935 654 aa, chain + ## HITS:1 COG:FN1128 KEGG:ns NR:ns ## COG: FN1128 COG1506 # Protein_GI_number: 19704463 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Fusobacterium nucleatum # 13 649 4 660 660 438 38.0 1e-122 MTNKTTENHSIPVRLTDLLHFKSLSNPSLSLDGTRAAFMTHQCSRTQDGYDSNLWIYDLS RSRTIQVTSGNKKQVFCWAPDGSLLYATPDAETHFTKVNPEGSLSPAFSIPIRVKAIQPV TDALWLVEAASFVNEEAKGLLTEKICVVLEELPAHFNGVGYTSGSRNSLYLFNQFTGSLR QITPPRFETMGYTLCMKEQKIVCHGHEYTDVRGIRGSVYLYDIPAETGSIVVKPNIYRIY YAAMTGQNLLFAGTKGNHDSNENPIFYTLNLSSNEVSELCYPDHYIGGLGIGSDCRYGTG TTSLCSGDSIYFTSAHFESSHLFQVTSDGSFRQISRREGSIDCFDIKGDKAVFVGMRDMG LQELYTLNLVTGQETRLTNFNRDYADSHRICPPVPCYFKNHDNMTITGWVIPPANYDVNR TYPAILDIHGGPKACYGMVFYHEMQLWANEGYFVFFCNPRGSDGHGDAYSTISGINGTLD YDDLMQFTDHVLDIYPQIDRTRVGVTGGSYGGFLTNWIAGHTDRFAAAASQRSIGDWMVH YAACDSGFWVTSEQFPPSPLYDAKNAWEHSPGKYAMNIKTPMLFIHSDEDRRCPLSEMMP 
VYTGAILAGIPVRMCLFHGENHELSRAGKPRCRMKRLAEITNWMDKYLKGERME >gi|157101654|gb|DS480670.1| GENE 17 20456 - 22291 773 611 aa, chain + ## HITS:1 COG:FN1128 KEGG:ns NR:ns ## COG: FN1128 COG1506 # Protein_GI_number: 19704463 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Fusobacterium nucleatum # 122 611 156 659 660 218 30.0 2e-56 MKHPELLSLHSVSDLRISPGTSRCAFLVHTPCEAENNYETHLYVSDFTVSYPIGMSGITS FAWVDDAALAVGRPSADSTQFALLSLKDGSEGTVWNIPFDARIEGWVLGQLLISARRPIT EEKAQEDGSWTILDELPIWEDGEGYRAKIRRQLFLCSYGGQPIRISPEEMDVRIVSAHSN GLAYAGYILGNHGRIVNEIRYWAGEDRLLRRNCGDIRHLALGSNYAFIYALDIERETDSA PALIQVSLDTGKADPLYVSGIAVGNYIVSDIGLQGKILCADGDALYFAATKEGSSQIYKL SASGEPLCLTLDPGSIEQLDVRAGRIVFAGLRRGSCQEVYVLEDGEQMISHLHGNDTLPF YPMVEIPCHGIQGWALREETEEKTCPAVIFLHDGPQQAFGKVYHFGMQLLAQNGYVVLFA NLPGSMGYGKDFSMLDGHLGDDDCNGLIRFLDAALDACPEIDPSRLAVIGTGYGAYLAAA ATGKCNRFGAAICDGVISNCVSMVSTSDHGIAFAEKQMKACAFKQTGELWKRSPLSRIAS MKTPTLLLHGENDRSSHLSQGQMLFTALKIHGVPARLCVFPGENHSLASKGTPLARDRYH TEILRWLKMYL >gi|157101654|gb|DS480670.1| GENE 18 22353 - 23513 699 386 aa, chain + ## HITS:1 COG:CAC3332 KEGG:ns NR:ns ## COG: CAC3332 COG1228 # Protein_GI_number: 15896575 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Clostridium acetobutylicum # 2 385 4 389 393 317 44.0 3e-86 MAMLLIKNGLIHNAVDRNPFIADILVDNKKIVKIEPDMFWDGAEVYDAAGSEIYPGFVEA HAHTGLSGYGIGYEGHDYNELGDIISPQLRAIDGIEPRDETLLEARNAGVTTLCVGPGSA NVLGGTFAAIKPVGCRVEQMCLKKEAAMKCAFGENPKRNYKEKGNSSRMTTAARLRDVLF RAREYEAKLASAGDHIADRPPFDMKLQAMLPVLRKEIPLKAHAHQANDFFTALRIAREFD VDITLEHVTEGHLVAEELAKENVMLAVGPTLGHANKFELRNKTWDTPGVLAARGCHVSII TDAPFIPQQYLALCAGLAVKAGMDEFEALKAITIHPAEHIGVAHRVGSLEAGKDADIVVT DGNPFDIQTNILSVFIDGRQVFSCRR >gi|157101654|gb|DS480670.1| GENE 19 23630 - 24730 945 366 aa, chain - ## HITS:1 COG:no KEGG:Closa_0068 NR:ns ## KEGG: Closa_0068 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # 
Pathway: not_defined # 1 359 1 359 363 330 48.0 8e-89 MGQAITDYVKFLADARDAVYRLNCDQNTAGQLADEEERKERELEAARKAVADSVNQTIRK RLDEINSSYDKEIGKGQERLKKARAKREKAKSQGMKERIAEETSELREHNRDLKVQLRTL FQKNRVPAWCGTTFYYALYYPRGFKETMIFLLTLIIFFLGIPCGIYFLIPQRVIWYLAAV YFADVLLFGGVYVMIGNKTRLRHQDTLKRGRDIRNVLRSNQKKIKVITKTIQRDGNEDIY DLEKYDDEIACVEQELSEIASRKKDALSTFENVTKTIISDEIIASHKENLDRLKQELDQV SADLRSLEGSIKEQNIRITDTYGPYLGREFLQPEKLVQLSRFIQDGSASNITEAIKVYRD AGQDER >gi|157101654|gb|DS480670.1| GENE 20 24969 - 26369 1361 466 aa, chain - ## HITS:1 COG:BS_lplD KEGG:ns NR:ns ## COG: BS_lplD COG1486 # Protein_GI_number: 16077780 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Bacillus subtilis # 11 454 10 440 446 489 52.0 1e-138 MIYENDVVRDVQIAYIGGGSRGWARTFMTDLAMEPRMRGTIRLYDIDTEAAKANETIGNH LSQRKETVGKWIYRTCMSMEEALTGADFIVISILPGTFDEMAVDVHMPERLGICQSVGDT AGPGGMIRALRTIPIYVEFAEAIRQYAPDAWVINYTNPMSLCVKTLYHVFPQIKAFGCCH EVFGTQKVLKGIAEDVLGLKHIDRKDIQVNVLGINHFTWFDYASYKGIDLFPMYRKYVDE HFEEGYAERDTNWANACFACAHRVKFDLFRRYGLIAAAGDRHLAEFMPGNEYLKDPETVE KWKFALTSVDWRKEDLENRLEKSRRLLNGEEEINLEPSGEEGILLIKALCGLERVVSNVN IPNTGLQIGNLPAKAVVETNAVFERNAIRPILAGPLPECVRELIMPAVENHEAILEAALT CNEELVVRAFMNDPQVKGRKCKEQDIRQLVRDMLDGTKKYLPKGWK >gi|157101654|gb|DS480670.1| GENE 21 26608 - 27465 630 285 aa, chain + ## HITS:1 COG:CAC2608 KEGG:ns NR:ns ## COG: CAC2608 COG2207 # Protein_GI_number: 15895866 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 30 279 35 278 284 99 30.0 7e-21 MEQILYAEKTDGIAIDRIRREAAFTMPCKHLHNEYEIYYLLEGQRYYFIDRRTYHVTGGS LVFIDRNQIHQTSQAGPGSHERILISLDQNPFSDFLAFTGEMSLEGFFRRNSGILNLDKE GQNLAEYLLDGAAKELHEKKPGFRHITMSKLSQLFILAQRHMSGPSSLSADLPLSTQPKH RKVDEVASYIIDHYDKALSLEIIAAHFYVNKCYLSRIFKEASGFTVNEYINMIRIRRARE LLSNTSMSITEVSESVGYETITYFERVFHQYTQTSPSGYRKQYTF >gi|157101654|gb|DS480670.1| GENE 22 27573 - 27815 249 80 aa, chain + ## HITS:1 COG:no KEGG:Closa_0067 
NR:ns ## KEGG: Closa_0067 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 80 9 88 88 109 72.0 3e-23 MKDNHLVRDTVACISDYSLSRTQMVFDSLDSICYEFDLGKPIWLDSSVKEFQLHDKVRFT QDNFIESIDFDFLEIQVIEE >gi|157101654|gb|DS480670.1| GENE 23 27841 - 28875 446 344 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 1 281 246 524 563 176 38 5e-43 MQDNRSKGHVHRRGAHCKMSFIRKGRTGMSDTILEVHNLKKYFKVPAGTLHAVDNVSFHI DKGETLGLVGESGCGKSTVGRVILRLIPHTDGQVIYQGEDILSYDNRKMQSLRKKMQIIF QDPYSCLDPRKTIGQIIAEPLRIHKTYPSGEINDRVLDLMAKGGVSRKLFNSYPHELDGG RRQRVGIIRALALNPEFIICDEPVSSLDVSIQAQILNLLQELQSTMGLTYLFISHDLSVV RHISNRILVMYLGQMVEICDSRELFRNPVHPYTKALLSAVPIAKYGYKRERILLEGDVPS PINPKPGCKFANRCLMAQDICRRQEPPVYTVSPGHQVCCHMAEH >gi|157101654|gb|DS480670.1| GENE 24 28788 - 29768 196 326 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 22 290 144 398 398 80 28 5e-14 MSENLLEIKDLSIEFRTYDGVVKAINHMSFDVPKGKTVGLVGETGAGKTTTALSVMGLVP NPPGVITSGSILFEGTDLLKLNESAMCQYRGKKIGMIFQNPMTSLNPVYTVGHQISDVIK QHQNVDQKEAWKRAGDMLEIVGIPRERLTNYPHEFSGGMKQRVCIAMGLSCNPQLLIADE PTTALDVTIQAQILELMKGLKEKYKTSTIFITHALGVVAEIADYVVVVYAGCVIEKGTIQ EVFEQPSHPYTQGLFGCLPDIESEESAMLHVIRGSMPDPMDLPAGCKFCPRCDKAMDICR TTDPKDMSIGGEHTVKCHLYGKEELV >gi|157101654|gb|DS480670.1| GENE 25 29831 - 31189 850 452 aa, chain - ## HITS:1 COG:BH3875 KEGG:ns NR:ns ## COG: BH3875 COG0624 # Protein_GI_number: 15616437 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Bacillus halodurans # 1 438 1 444 458 293 36.0 5e-79 MKEIFEYIEQHHDAYLEELFTFLRCKSISTRDEGVKECAELLAGIMEKSKIKASIYPTDR HPIVYGERIEDETLPTMLVYGHYDVQPPEPLGEWESEPFEPTIRDGKIFCRGVSDDKSQL FTHIKAAQAWAEVKGRLPVNIKYLFEGEEEIGSPDLLPFVESHKELLKCDVCFYSDSHYH ESGRPQINLGVKGLCYVEITLREAETDLHSMMATCIQNPAWRMVKLLNTMRDAEGNITID 
GFYDDVRPLNQLEIDAVSKIPSDNEVLKKQYGVDHFIKGRKSDNFYYNLIFEPTCNIAGI SSGFTGNGSKTVLPREAVVKIDMRLVPGQTPDKIYETLRRHLDNHGFEDAQLKHYGMVTP SRTPVDHPYVEVAAQALREGFHEEPVIFPGIGGVAPDFVFTGHLGVPSIVIPYAAADQKN HAPNESMVLDGFFKGIRTSAALPYYLAKKTQA >gi|157101654|gb|DS480670.1| GENE 26 31220 - 32134 780 304 aa, chain - ## HITS:1 COG:BH0030 KEGG:ns NR:ns ## COG: BH0030 COG1173 # Protein_GI_number: 15612593 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus halodurans # 3 303 6 301 301 265 46.0 1e-70 MPDKTENLEAGGALQADIKISKSYQIKRFWHNFKKRKIAVLGLVIVLLYVAIAIAAPWIA PYDPEEANFSIMLKAPGTIEGHLLGTDELGRDILSRLVYGARISIGIGVTVVIAAFFIGV PLGIVSGYYGGKMDFMLMRIMDVLMAFPQLLLCILFVAVLGANLTNAVMAVSIYTVPNFA RMARSETLAIKNGEYIEAAKALGANNFRIIVSHILPNIMSPLIVLATLNFGNAVLTTSGM GFLGIGAQPPTPEWGAMLSSGRQYLLVAPHVTSIIGLAILFLVLGLNLFGDGLRDILDPK LKSD >gi|157101654|gb|DS480670.1| GENE 27 32149 - 33069 262 306 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 65 304 43 313 320 105 25 1e-21 MAKYIFKRMLSLIPVVIVVSLLVFLMVHMMPGDPARLIAGEQATTEDVERIRVAYGYDKP LYVQYFKYVGGILQGNFGTSTRTGRPVAEELAVRYPNTLLLAATSTIVAIIGGVGIGLLS AVKRFSIWDNLCMFLALIGLSTPAFYLGLMLMLVVCVKLQWLPITPQSTALSLILPTVTL SSRSLATIARMTRSSTLEILGQDYIQTARAQGFSKRKVIFGCALKNAMNAIVTVAGLQFG LLIGGAVITEKVFGWPGLGDLIVTSIKARDFQVVQSAILVIAASFVVVNLIVDLLYAVIN PRIKLS >gi|157101654|gb|DS480670.1| GENE 28 33178 - 34785 1388 535 aa, chain - ## HITS:1 COG:BH0031 KEGG:ns NR:ns ## COG: BH0031 COG0747 # Protein_GI_number: 15612594 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 1 529 3 531 538 266 31.0 6e-71 MKKRTAKVFAMTIASLFVLSACSGTASNQTSTGETAGAAGTTDTTVPSSTGQSSGGNHFT YVIGTEPLTMDTHLMSDANTGRVSVQIHENLVKRDLEGNFQPVLATEWSPNEDATEWTFK LREGVTFHDGEPFNAEAVKYNIERLKDPATGSPKSSLVTMISDFDIVNDYEIKFILDQPC 
AVIPAMVSTYSTGMMSPKALEEYGKDYSTHAAGTGPLKLKEWIPGTSMSLEKNESYWGKA ATAETIDIKIIAEDSARAMMLKTGDADVAANIPSVLVDELQSDPNVEIEMVPGYRTIYLG LNFQDEKLANLKVREAIDYAIDRDAIINGILGGYVTYPSTGVISSSIQNAKQGIGDDYKY DPEKAKQLLAEAGYPDGFTTKINTPEGRYAMDRQVAEAIQAMLSQVGITAEVNVLDWGAY TEAAAAGDTEIFLLGKGCATGDLAQDLMYNYRTGELQNYTFYSNPEYDKVCEEQQRTADE NARKELLYKMQDIIHEDRASIILYYENQTFGTRADVSGLVIYPNETIELAYLARK >gi|157101654|gb|DS480670.1| GENE 29 35028 - 36947 1069 639 aa, chain + ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 46 630 44 626 627 357 37.0 6e-98 MLINIIFIIPYPELEEKVNKIFQNHPLRARLRKRIEMRTADELETFSFNAHSDVIIARGY TACGLRKMTPELPIIELPITAYDLIRALNECITLYSPKKVAFIGNYTALAEADELGNFVN CDIKSYHSESTDGITRAIDQAVADGYTAFIGGYSVNLICSQRRLPSVTIRTGNEAITRAL DEAVRTVDVVRMERERSAMFTTITQTAMDGVLYVDIQQTIKICNPAALKMYTGPKKKLQG LSLQSAFPFLSEPVSTVISTGTELPNQLIRWKNYTINATIHPVIINSRITGIVVTFQDVT QIQQVEIQIRKKMTDKGLNAHYHFNDIIHESPEIDYVIEKAKKFASVSSNILIEGETGTG KELFAQSIHNASTRCNGPFVAVNCAALPENLLESELFGYVEGAFTGTAKGGKTGLFELAH KGTLFLDEISEIPLSIQGKLLRVLQEHEVRRVGGDRVISVDVRIIAATNHSLSRITEQGR FRQDLLYRLNVLKLFLPPLRRRCGDTKILFLHYLNLYYRQLGRIIPHVATAAIQMLSDYH FPGNIRELRNVTERLVVICMDKEEITADDMREALHPEELPVKFQSTDMPIPPFYQDGERE FLIRILEESNYNQTLTAEKLGINRTTLWRRMKKHGIRRP >gi|157101654|gb|DS480670.1| GENE 30 37000 - 37827 870 275 aa, chain + ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 1 117 1 117 117 125 56.0 1e-28 MKNLFTIGEMADLFGINIRTLRYYDEIGILHPETADPDTGYRYYSTRQFERLNTIKYLRA LGVSLKKIALFFENRDVDIMLDLLKEQKAETQAKISELTQIEQKLEYRLKALEDAVHTQC GEIRTLHFDQRQIAYLRKDIRLGEDLEFSLRELERANTLEPVMFLGKVGVSLSAANLKAR KFESFSGIFVLLEKEDHFMGEEQYLQAGDYVSVRFSGTHQEASQYYIQLLEYMDQMGYAC CGDSVEITLIDAGFTNDTSRYVTEIQIPYLKCMKE 
>gi|157101654|gb|DS480670.1| GENE 31 37944 - 39314 1419 456 aa, chain + ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 7 405 8 407 452 187 30.0 4e-47 MKNHNLLTEGNVFHVLLAFSVPFLIANIIQALYGAVDLMVIGWYCTPESVAAVSTGTQVT QIITSMVSGLTLGGTILVGKYTGMKDDERTCRTIGTTLSVFAIVALVLTIGMLIFRDTIL TALRTPAASMDEARQYVTICFCGIFFICGYNAISAILRGYGDSRRPMYFVALSCVLNIAG DIIFVKYLKLGVAGTALATVLSQSISMICSIVYLNRKKFIFTFTLKNLRIDRDRLKELAM VGIPISSQECMVRLSFLYLTSVTNRLGVNAAAAVGIASKYDVFAMLPATSIASALAAITA QNYGAGRPERARQSLAAGIGFAVLASSVFFLWAQLSPHTMIGLFNTNSEIIEAGIPFFQS CSFDYLAVSFVFCLNGFMNGRSKTVFTMVSCCFGALVLRMPLIWLAYTYCPDNLFVIGLI APAVSGFMAFYTLLFVIRQLKQDGTERQKELTPRRV >gi|157101654|gb|DS480670.1| GENE 32 39512 - 40861 1570 449 aa, chain - ## HITS:1 COG:BH3343 KEGG:ns NR:ns ## COG: BH3343 COG0166 # Protein_GI_number: 15615905 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Bacillus halodurans # 1 449 1 449 450 608 64.0 1e-174 MGKEVMFDYSKAAGFVSAEEMANFKTTVMSAKETLLEKTGAGNDFLGWIDLPVDYDKAEF ARIKKAAEKIQNDSDVLLVVGIGGSYLGARAAIEFLSHSFYNVLPKSVRKTPEIYFVGNS ISSKYIHDLKQVLDGKDFSVNIISKSGTTTEPAIAFRVFKEMLIEKYGKEEANKRIYATT DKAKGALKNLADEEGYEEFVVPDDVGGRFSVLTAVGLLPIAVSGADIDKLMEGAASGRQK ALDAPYESNAALQYAAVRNILHRKGKAVEIVANYEPSLHYVSEWWKQLYGESEGKDQRGI FPAAVDLTTDLHSMGQFIQDGSRIMFETVLNVEESPAEILLNKEDVDTDGMNYLAGKSVD FVNKSAMNGTILAHTDGNVPNLMVKIPEQNEFYLGELFYFFEFACGVSGYILGVNPFNQP GVESYKKNMFALLGKPGYEAQREELLKRL >gi|157101654|gb|DS480670.1| GENE 33 41017 - 41766 655 249 aa, chain - ## HITS:1 COG:no KEGG:Closa_0056 NR:ns ## KEGG: Closa_0056 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 249 1 313 313 213 40.0 4e-54 MMSFLQTGKALYVLAAICLAGVFSRLMAGSFYKRLIKESTNMALTKNKYLRGLKQNAEDT YRMNQGMNNTRVYLERQMYSLRTFGMSVKGLDNLSGQLTLLCFLAGGGAAFLSYWFRSDN 
YYIVLYGTTGILAGMFTMLVDYGVNLESRRQQLLTCLQDYLENVMWPRMNREGTAEHVPA VTAEEDRREPANLRSIGRDRRSNSNNRRGLEQTAASREKQEEKSKDNWLQDLNPEQKRML GELLKEFIS >gi|157101654|gb|DS480670.1| GENE 34 42003 - 43175 270 390 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935893|ref|ZP_02083267.1| ## NR: gi|160935893|ref|ZP_02083267.1| hypothetical protein CLOBOL_00786 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00786 [Clostridium bolteae ATCC BAA-613] # 1 390 1 390 390 749 100.0 0 MEKKKKWMIFGGTAVVVVGVFLCIAAFFLYKGVTLFFDREIVADQYLEQMTSAASQGDRD AMRSLYMPGAVPEEHMERQIQENVDIWGKSQLDSFKKKGLNIRTSTKNGVKTKSVECSYG IKTEDGQRFTAILNRMESSDGNQGIISFIIRDNVSLAPQGTLRSISHWNVFQWGLFLLSL IMIFSTIATAIICYRQNPRYKWGWIILILIAYLSMGFSAIRTPNTWEFKVNWSAATVRLS RYVTYENGSREFRLFLPAGMAVYWIMKKRLDRESVHKKYGDREAATKKHGNWESGQEKRG NRESVHRKHENQETGPENRADQETGPENRADRENDPKKRADRENGQENRADQEPGPKKRA DQEPGPEKRADRETRQEQRPDADRGSWRIP >gi|157101654|gb|DS480670.1| GENE 35 43408 - 44100 749 230 aa, chain - ## HITS:1 COG:CAC2117 KEGG:ns NR:ns ## COG: CAC2117 COG0775 # Protein_GI_number: 15895386 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Clostridium acetobutylicum # 2 227 3 227 230 208 46.0 5e-54 MLGIIGAMDEEVAMIKAQLTDVQVETRAAMDFYKGKLEGKEVVVVRSGIGKVNAAMCTQI LADIYGVTGVVNTGIAGSLKAEIDIGDIVLSSDAVQHDMDATGFGYEPGQIPRVETLAFK ADEGLIHLAEECCSKVNPDIHTFVGRVVTGDQFISDKGKKKWLTDTFGGYCTEMEGAAIA QACYLNSIPFLIVRAISDKADDSATVDYPAFEAKAIIHSVNLLTEIVRSI >gi|157101654|gb|DS480670.1| GENE 36 44245 - 44487 248 80 aa, chain + ## HITS:1 COG:no KEGG:Closa_0054 NR:ns ## KEGG: Closa_0054 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 78 1 78 82 135 82.0 7e-31 MRFKTQGVCSREISFEVKDNKLTNVQFVGGCSGNTQGLSRLIEGMDVDEAIRRLDGIQCG PRPTSCPDQLARALKQFKGR >gi|157101654|gb|DS480670.1| GENE 37 44621 - 45139 193 172 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935897|ref|ZP_02083271.1| ## NR: gi|160935897|ref|ZP_02083271.1| hypothetical protein CLOBOL_00790 [Clostridium bolteae ATCC 
BAA-613] hypothetical protein CLOBOL_00790 [Clostridium bolteae ATCC BAA-613] # 1 172 1 172 172 342 100.0 6e-93 MRRLFKVLISTILVMNMVVMTAMAAPYNPDKFTSVNIESELLAPQIRSSRVTELPKPRGV FFSAADLIISDEGNGDVGVFAKAYMEVPVDEAYITVYLDQWDEDAERWRQVTFYDAEFYL KDYPNGITEPEVNMIFKNQPKGYYYRLRGVFGAVLDGRFEGFSPTTAGILVK >gi|157101654|gb|DS480670.1| GENE 38 45160 - 45930 383 256 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160935898|ref|ZP_02083272.1| ## NR: gi|160935898|ref|ZP_02083272.1| hypothetical protein CLOBOL_00791 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00791 [Clostridium bolteae ATCC BAA-613] # 1 256 1 256 256 509 100.0 1e-142 MDCPIWGTDDISDEELLKELEAVKNMAVPLPIPSPSPDEFEKIWARIQEERAESKNVPES DQPEHPKVIKPRFGWKRLAAIGLIACLVAGSGCMVAMGTKSYFYREKELGDGQQTVFVND FYKGDVNGEEEAYNLIEKELGIQPLKLGYIPSDMYFLDVYIKDGYARLCFTYNDEFVYFI QSKFNKKVSYDYKSDREEIISVKNKWLNKDIDIKIATLEDGSSRNEISFVDDGKYYRLWG PIEIQEFKEIVERLTY >gi|157101654|gb|DS480670.1| GENE 39 46107 - 47171 563 354 aa, chain + ## HITS:1 COG:CAC2231 KEGG:ns NR:ns ## COG: CAC2231 COG0707 # Protein_GI_number: 15895499 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Clostridium acetobutylicum # 3 350 5 352 359 397 54.0 1e-110 MKKIILTGGGTAGHVTPNLALLPSLKEAGYEIRYIGSYQGMERKLIETAGIPYDGISSGK LRRYFDIKNFSDPFRVVKGYAEARRLLKRHKPDVIFSKGGFVAVPVVLAAKHYKIPVIIH ESDMTPGLANKICIPSALKVCCNFPETLKYLPSDKAVLTGSPIRAELLQGDRLSGLSYAH LSAGRPVLLVIGGSLGSVAVNTAVRNILPRLLSSYQVIHICGKGNLDESLIGTAGYVQYE YVDAPLKHLFAAADLVISRAGANSICELLALRKPNLLIPLSAAASRGDQILNANSFAKQG FSKVLEEEALSDDSLFDAINDLYLNRNSYIQAMEQSNLNNAVKTVVSLIESCVK >gi|157101654|gb|DS480670.1| GENE 40 47174 - 47725 270 183 aa, chain - ## HITS:1 COG:no KEGG:Closa_0149 NR:ns ## KEGG: Closa_0149 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: C.saccharolyticum # Pathway: not_defined # 9 177 6 174 180 81 31.0 1e-14 MKIEKMDTEDKKLFEEMYLEYEVYLRRIAYVNDIPVDYIEDVVQDTFVSYARYKYSLDMS 
EESKRALLIRILKSRCMDFHRRMKYRSYGELDEEAYNSEDYPAHDKAANLPDYVVSKERC QALLKEIERMPENWRQVATLRLIEGRPTREVCAMLNITEKACYSRVSRIRKYLEELLKSD NWP >gi|157101654|gb|DS480670.1| GENE 41 47731 - 48147 292 138 aa, chain - ## HITS:1 COG:BH1189 KEGG:ns NR:ns ## COG: BH1189 COG0537 # Protein_GI_number: 15613752 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Bacillus halodurans # 3 111 4 112 142 115 47.0 3e-26 MTDDNCIFCKIAGGVIPSTTLYEDEDFRVILDLGPASRGHALILPKQHFADVCALDGDIA AKVLPLGAKIGSAMKKSLGCAGFNLVQNNGEAAGQTVFHFHMHVIPRYEGGPDMVSWTPG KASPEELAEVADKIKGCL >gi|157101654|gb|DS480670.1| GENE 42 48190 - 48810 473 206 aa, chain - ## HITS:1 COG:RSp1525 KEGG:ns NR:ns ## COG: RSp1525 COG0705 # Protein_GI_number: 17549744 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Ralstonia solanacearum # 9 200 204 398 569 122 38.0 4e-28 MVTGINRSRPYVNIALAAVNVLVFLYLEAIGSTEDGVFMVKHGAVFAPFVILGGEYYRLF TAMFLHFGVSHLANNMLVLLVLGEKMERALGHIKYLIFYLASGVAANSISLAVQVRTGQA SVSAGASGAIFGVVGGLVYVIAIHHGQLDGLTNRQLGFMVLLTLYHGFTSAGVDNMAHIG GLISGFILGILLYRRKDAARISGMAG >gi|157101654|gb|DS480670.1| GENE 43 48993 - 50531 454 512 aa, chain - ## HITS:1 COG:no KEGG:Closa_0140 NR:ns ## KEGG: Closa_0140 # Name: not_defined # Def: FHA domain containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 220 1 220 437 170 45.0 2e-40 MEISYRREAKRNYLVAGVADATAGYEARMLAHNEIRGLLRMYITYQDGLPLYCYDITSRQ PLSRLLETRFITREEICQLLIQIHAALSGMEEYLLDAGGVLLEPEYIYVEPELFQTGLCL IPGVQDDFQGKLSRLLQYILKRINHKDRESVVLAYGLYQESLKENCGMDDLLGLIASERR KGKKRELLSEPEGDYIKKGGLQETDHWEKGLKEKGQGKEQGKEQGNEQNRDCITGKWKGK SGETEKEHNSDRGEKRRKKEKGKLERCEIKQEGSGRTEPKWKGTTFRRQFIVWLSAAVLC PSALWMFRGMTMVIDNWRLLAAIDGGLLFMLSAMNLYRLFIGHRAAKADGAGSDDEQDPW RILYEDEDDEDEDLGNQPSTAYMNGDKNEYLQKMPESSAGECFQTVLLSERPAQGEEVRR 
LSALNGSDEDIVISYYPFVIGKHKDLADYVLLKDTVSRFHIRLDEDNGSYTVTDLNSTNG TRVKGRLLEANETMQLEPGDQIFIADCGYIFY >gi|157101654|gb|DS480670.1| GENE 44 50521 - 50913 274 130 aa, chain - ## HITS:1 COG:no KEGG:Closa_0139 NR:ns ## KEGG: Closa_0139 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 121 1 125 129 71 35.0 1e-11 MREKTLKGSLTVEAAWIMAMVLLSIAVMLQQACRIHDETKAAMGLHEALEKGRHGSAKEL EASASGVQEHLGRLMFFPDCDLAIKEKGKRIYGEGRGGKWKREIEAERFRPETFLRKITL IEGLVKEDGN >gi|157101654|gb|DS480670.1| GENE 45 50910 - 51458 223 182 aa, chain - ## HITS:1 COG:no KEGG:Closa_0138 NR:ns ## KEGG: Closa_0138 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 93 182 54 143 143 67 44.0 2e-10 MGILNGDLTEQLTAGHDIAGVLLRCGFLMTLSIGAWQDMRNHRIKLSLIVMSAAAGAVLR CMYIVLDAAIIYRDSGHKIPSDLVSGQIIDTAMAMAVGAGLLFLSRITNEAVGKGDGWFF LISGIYLGALKNLVLLAGGLGICFLLSMVLVFKGIIQGTDRGRLRIPLLPFLIPAGIGVM FL >gi|157101654|gb|DS480670.1| GENE 46 51483 - 52475 428 330 aa, chain - ## HITS:1 COG:no KEGG:Closa_0137 NR:ns ## KEGG: Closa_0137 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 19 327 10 305 307 218 39.0 3e-55 MPFFLVQNKKSNPYCPYKKTAKINTTAPLQVRKCCRLFHGKRALSWNSCQMQGSLSVEAA LSLSLFLFFCFCLIMPMKMLDRQRQIQAVMESVGEELSQYAYVEYCFRSAEEGGAEVDVG RAEDTATSVLAAAYASARILGEINREWVEAVSFEGTDIGEDEMVHIVMRYRMRLPFSVLG VSSIPVEQVCSRRMWNGADGGRFGDGDRDGNGEEDEIVYIGKNSTRYHRLRTCHYLYNDL KAVDLGAVGELRNEAGGRYSSCGTCGGGSGGTGSGGAGSGSMGSSGTVYVMPYGSSYHVS KSCRSIIAYVQAVPLTQVEYLGECSYCRGK >gi|157101654|gb|DS480670.1| GENE 47 52524 - 54521 1437 665 aa, chain - ## HITS:1 COG:no KEGG:Closa_0136 NR:ns ## KEGG: Closa_0136 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 632 1 613 630 437 39.0 1e-121 MCRKGEITVFLAMILFSVCALLCVIVESARTAGARCYLRMAVDSSADSLMAQYHRELWNR YRILGLEYDKAETLEKEFREFMRPYMEAENWYPMKAEQIRITDMTDLTQGDGRYFEQEIL DYMKYGLLDADWDELDEAGASELLEVWKEGNSVNRVSELYAAHSREAVKVEKALEIINST 
LLAQRERWEQGKGCLDQLDGGGFVSQANKLIRELERLPGQVNSYEKRADELYEKLADSRE RFLEEASDLSGDVRAGLEEEIRQYEAYAAQDGQRRREVEALTNLSRDRILWIRDVIDMAE AVMEYISSWEPEDEDDELDEDALWQPVRARWSQYGMLTLGVEFGVRDKEKEGFLEQVGNM AGREMLELVLPEGTVVSGTALRLSGTPSVQRKTDGGGWKETDQDGDSKSTGFLTGVRTLI QRLLIGEYDIRFFKRLKKEMQKGEFYELEYIIHGKEKDRDNLSGVAARLVAFREGLNLVH ILSDAGKRQEARNLALTIVGGTGILPLVSVVAFFIMAVWALGEALLDVRYLLEGKRVPVF KTGSDWKLDLAGLLEMGRSGSLIDEEGGNGSGADYKGYLRILIFGAYDTDLVYRMMDVMQ IVTAVKQPGFSLANCVCTVDAEALVSGKHVFFSNGLWKSRGREDGYAYDTRMAVAGSYLE DYKSP >gi|157101654|gb|DS480670.1| GENE 48 54521 - 54700 230 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295115363|emb|CBL36210.1| ## NR: gi|295115363|emb|CBL36210.1| hypothetical protein [butyrate-producing bacterium SM4/1] # 5 59 23 77 78 68 76.0 2e-10 MKLGKEIREFWKDEQGVGVIELVLVLVVLIGLVIIFKKQITTLLQNIFKEINSQSKEVY >gi|157101654|gb|DS480670.1| GENE 49 54719 - 55885 934 388 aa, chain - ## HITS:1 COG:no KEGG:Closa_0134 NR:ns ## KEGG: Closa_0134 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 9 387 42 419 420 292 43.0 2e-77 MWQRTDSALLEGDTLARGNYGQGDQTYSLIVEGLGEKEVPVEMVLSERMYTKQEAHKTYE AVMVQLPQLILGNNPSLEDVRQDLNLLTYLDSYGIRLRWESDEPELLDSFGELHVQELKQ RGIGEQGIKISLHVRMTEGNWPEEYELSVCVKPPVLTQQEQTEEAFVEFLKSQDELQSYT AYMKLPEEYQGQSLKYSLPEESVFVPVMGLGVAGAVLCFFGDQARQQNRTEERKRQLALD YPEVLSRLTIFLGAGMSIRTAWDKIALEYKMMVEQGRRRKRYVYEEMYETSCQMKGGVPE GKAFEEFGRRCGLQSYMKLGGLLEQNRKNGSKNLRNLLRTEMTDAFEQRKHQARRLGEEA GTKLLLPLFILLSVVMVMIAVPALMEFK >gi|157101654|gb|DS480670.1| GENE 50 56007 - 56900 651 297 aa, chain - ## HITS:1 COG:mlr6484 KEGG:ns NR:ns ## COG: mlr6484 COG4965 # Protein_GI_number: 13475421 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein TadB # Organism: Mesorhizobium loti # 86 267 121 293 323 64 23.0 2e-10 MERWSWNHCLNTTVRRTGFVRKGEGHRRDLVRQEIARRKPEKKVPRARPPDRVDYSTYRL TAGEWLLYGALGVGACGLASYVFYRSVVVFLILMPVGACYPLYRRSNLKKERARRLELQF KEGIQVLSSFLSAGYSLENGLTLSIKELEILFGRREMITEEFRILSDGIRMNRPAEELFM 
DFGRRSGVEDVDNFAQVLSAAKRSGGELVEIIRQTAGIIRDKVQVKEEIHTMLASRIFEQ RIMNLIPFLIVLYIDLTSPGFFSVMYGTWMGRSVMTMCLAAYAGALVLAGKILDIEV >gi|157101654|gb|DS480670.1| GENE 51 56806 - 58002 745 398 aa, chain - ## HITS:1 COG:SMa1568 KEGG:ns NR:ns ## COG: SMa1568 COG4962 # Protein_GI_number: 16263305 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Sinorhizobium meliloti # 76 380 142 447 497 299 52.0 7e-81 MEERESKRRSREGVRWLIRERIMEELQERHHMDDGELLEMIDRSIGEMGQEIFLPLKERL WLRGSLFDSFRRLDILQELIDDSSVSEIMVNGAGKIFMERNGRMELWDRNFERPEQLEDI IQQIVSRVNRVVNVSSPMVDARLEDGSRVHVVLPPVALDGPVVTIRKFPDPITMEKLIRF KAISTEAAGFLEQLVEDGCNMFISGGTNSGKTTFLNALSSFIPSGERVITIEDSAELQIT QVPNLVRLETRNANTEGEGEITMSQLIKAALRMNPNRIVVGEVRGKEALDMLQAMNTGHP GSLSTGHGNSPRDMISRLETMVLMAADLPLAAIRSQIVSALDIMVHLGRLKDGKRRVLSI MRIGGLQNGEVELEPLFEYDGKEDRLCQKGGRTQAGPG >gi|157101654|gb|DS480670.1| GENE 52 58080 - 59084 1080 334 aa, chain - ## HITS:1 COG:CAC0037 KEGG:ns NR:ns ## COG: CAC0037 COG1192 # Protein_GI_number: 15893335 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Clostridium acetobutylicum # 1 286 1 296 361 81 21.0 3e-15 MVKILAVYDADPAFGERLADYVNQKEKIPFTAMAFSSLDRLIEYGEDHPIEILLVDESSR ALVGDVKAEQVMVLCDGALVDGPDIFPAIYKYQSGDCIMREVLASYCSRPVEPALALLGS RALVVGIYSPVNRCFKSSLALTIGQVMAKKESVLYLNLEEYSGFTRLIDSEYKADLSDVL YLYRQGGYNWMKLKSMISNWGNMDFIPPVRYAEDLSQVAPEDMAQLIDRIARESGYDRLV VDVGQMGRGALPVLSMCNVVYMPVREDYISAAKIEEFEEYLEEADDAGVRDRIQKLRLPR HTGIAKQEGYLEQLIWGEMGDFVRQLLNGKQPER >gi|157101654|gb|DS480670.1| GENE 53 59141 - 59725 218 194 aa, chain - ## HITS:1 COG:no KEGG:Closa_0130 NR:ns ## KEGG: Closa_0130 # Name: not_defined # Def: peptidase A24A prepilin type IV # Organism: C.saccharolyticum # Pathway: not_defined # 8 192 10 169 171 88 37.0 1e-16 MSYALYGIMSYCAFMDIRRYRIPNRALAAAAAAGLAVSVEAWVSGAGNVGLWAAGIGFWM QDTGFWAAAAEGAAVFIMRLMLAAAAGFPFFLLRMVGAGDIKFMALVAGCFGLERGFWSV 
VIGLCLGAVLALGKMLREGSICQRFLYLTAYIRRLIQSKEIEAYYCPERDGYKCVIPLGA CFFAGTLISVLWKG >gi|157101654|gb|DS480670.1| GENE 54 60021 - 60374 411 117 aa, chain + ## HITS:1 COG:no KEGG:Closa_0129 NR:ns ## KEGG: Closa_0129 # Name: not_defined # Def: ArsR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 4 116 3 121 125 119 53.0 5e-26 MTTAQELQKRFHACMPLFIALGDEMRLSIIEVLTEEALDRRQTPGPIHFEQYGLNVNEIT RRTSLSRPAISHHLKILKDAGIVGVRQEGTANYYYLTLRESNRRLMELGYRLEEFLY >gi|157101654|gb|DS480670.1| GENE 55 60469 - 60723 305 84 aa, chain + ## HITS:1 COG:no KEGG:Closa_0128 NR:ns ## KEGG: Closa_0128 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 24 83 12 71 72 82 80.0 4e-15 MFDIMKAVRETAAARSAVTRARRPRPELIRLRVEIERTKCAIEAARNHFEQAVDPTLIDC YIYELNAAQLRYQFLLRKFKSQED >gi|157101654|gb|DS480670.1| GENE 56 60792 - 62132 1339 446 aa, chain + ## HITS:1 COG:CAC2686 KEGG:ns NR:ns ## COG: CAC2686 COG0366 # Protein_GI_number: 15895944 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Clostridium acetobutylicum # 4 442 3 449 451 393 42.0 1e-109 MSTWYERGVFYHMYPLGMTGAPKHNDATEVTNRFEELDKWISHIRSLGANAIYIGPLFES TSHGYDTRDYKLVDRRLGDNGSFRKFVDQCHQEGIKVVVDGVFNHTGREFFAFKDIQEKR WDSPYKDWYKGVNFDWQSPCGDSFGYEAWQGHFELPCLNLFNPDVRSYLFDVIRFWIDEF DIDGIRLDCANVLDFNFMKEMRSQTEAMKEDFWLMGEVIHGDYSRWVNNEMLHSVTNYEL HKSLYSGFNDHNFFEIAHNVRRLEAIGRQLYTFVDNHDEDRIATKLKLREHLFPIYICLF TLPGIPSIYYGGEWGVEGKRTNTSDEALRPAISIEQEGELHCELTDLIAQLGQIHSQQEP LHTGRYQELLLTNRQYAFARHGEDSVIITAVNNDDEPAELAIPVPLQAGEAVNLLEPADR LPISDGKVHIKLKGNWGAVLKLKGEN >gi|157101654|gb|DS480670.1| GENE 57 62134 - 64062 1786 642 aa, chain + ## HITS:1 COG:sll0158 KEGG:ns NR:ns ## COG: sll0158 COG0296 # Protein_GI_number: 16331275 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Synechocystis # 21 642 109 752 770 637 49.0 0 MVAKKRTAASSKKKASATGFISDLDCYLFGAGTHYDIYQKLGAHPKTYKGKEGIYFAVWA 
PHAKEVHLVGDFNNWNPEASPMERISESGIWEIFNPGMKLGELYKFAITTQSGKILYKAD PFAFSAEYRPGTASVTADIQGFSWTDSSWIEKRTQADVQKLPMSIYEVHLGSWRKRDRPE KDGFYTYTEAAHELAAYVKEMGYTHVELMGIAEHPFDGSWGYQVTGYFAPTSRYGTPEQF MYFVNYLHKNGIGVILDWVPAHFPKDAHGLADFDGEPLFEYADPRKGEHPDWGTKVFDYG KNEVKDFLISNALYWVEQYHVDGLRVDAVASMLYLDYGRQEGQWVPNKDGGNQNLEAIEF FKHLNTVVQGRNHGALVIAEESTAWPKVTEHPEQDGLGFTFKWNMGWMHDFLEYMKLDPY FRKYNHHRMTFGLTYFTSENYILVLSHDEVVHLKCSMINKMPGLGRDKFSNLKAGYTFMM GHPGKKLLFMGQDFGQLHEWDEKVSLDWYLTDEDDHRELQNYVKDLLHLYKKYPSLYRQD NDWNGFQWINANDGDRSIFSFIRRDETGKKNLMFIINFTPVERPDYRVGVPKRGRYTLLL DNHGAYKAAEAPCFSSSKSECDGQPYSFSYPLPGYGTAIFRF >gi|157101654|gb|DS480670.1| GENE 58 64224 - 64973 865 249 aa, chain + ## HITS:1 COG:no KEGG:Closa_0125 NR:ns ## KEGG: Closa_0125 # Name: not_defined # Def: aminoglycoside phosphotransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 249 1 249 249 340 66.0 3e-92 MAKQLISSSETKSVYRDGSTAIKEFCEGFPKAEVLNEALCNARVETIEVLNVPKVLGVSV IDGKWSITREFVEGKTLQQLMEENPDKTGEYLEQMVDLHLLIFAQACPLLNKLKEKTIRA LKSEEQLDENLRYELLTRLDGMPKHTKLCHGDFCPSNIIVGDDGKWYLVDWVHASQGNAS ADVARTYLLLSLKDKKIADMYMDLFCDKTGTEKRYVQGWLPIVAAAQLAYKRPDEKELLE SWINVFDYQ >gi|157101654|gb|DS480670.1| GENE 59 65157 - 67529 2520 790 aa, chain - ## HITS:1 COG:BH1154_2 KEGG:ns NR:ns ## COG: BH1154_2 COG0642 # Protein_GI_number: 15613717 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 520 763 31 272 274 228 47.0 3e-59 MTDEKRKFRRRKLFLTLFHIIFVSMSVFGISAMYLNSNYGKGVKWIYEDAYEDSPQFDKQ LTEDIDRIFTYVGYKDMFETDGKLDMHKVIVRVSNGPGEQGTEWTLDRIVRYIKTRGYYL DENFQLAGSPISMDDDEEEITVDFQSYNPDFVDADAGDAPRMTMEDLALDILVHLSEYYT IYYNYIENDTNLNFRIQYTSNKGVTKVSTNVPDKTMEDIIGTGKYLFVPGNTIRMETNMA TVPANAATLLEIWNPYKNDKNYMVLSVDTSYPNTDAYSIEAKAYKEARNEFILGMGSVIL GGLGCVVTLAMLMLMSGHTTDGSTEIRLFPIDEIHTELCLILWAAATAVFIYIGRYVGIR LFSLFAPNEQWNYWNKMIKLVIVYGSVVLCGFDLLRRYKARTLWSNSLAKRALEASKDYV GRVSYAVGTGFCYLLFLGFNAGMLWGLIFLFFYKENRISYQIMFYVFVVLYLGLDGWIYH QLFKKSVQRDVLDMAVSTISQGDTSYQIDTSRLSGKERDMAEHLNNISSGLDSAIQEQVK 
SERLKASLITNVSHDIKTPLTSIINYVDLLKREKIQDPKIAAYLEVLDQKSQRLKTLTED LVEASKASSGNMKLDISDIDLVELVQQTNGEFEERFAMRRLELVSNIPDEVLIIQADGRR LWRVLENLYTNAVKYAMEHSRVYVDVLDEDGKAIFTIKNVSESSLNISPDELTERFVRGD VSRATEGSGLGLSISKDLTQLQGGEFQVVIDGDLFKAVVSFPVKRVEFKADKTSGEPGDI SQDFGFEEEI >gi|157101654|gb|DS480670.1| GENE 60 67538 - 68233 978 231 aa, chain - ## HITS:1 COG:BH1153 KEGG:ns NR:ns ## COG: BH1153 COG0745 # Protein_GI_number: 15613716 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 4 230 7 232 232 269 60.0 3e-72 MQTILVCDDDKQIVEAIDIYLTGEGFKVIKAYDGYEALEYLDKNEVDLLIVDVMMPGLDG IRTTLKVRENSSIPIIILSAKSEDADKILGLNIGADDYITKPFNPLELVARVKSQMRRYT QLGNLNQQADGQIYKCGGLQINDDNKEVTVDGDAIKLTPIEYNILLLLVKNAGKVFSIDE IYEQIWNEEAIGADNTVAVHIRHIREKIEINPREPKYLKVVWGVGYKIEKQ >gi|157101654|gb|DS480670.1| GENE 61 68277 - 69878 1621 533 aa, chain - ## HITS:1 COG:BH1295_1 KEGG:ns NR:ns ## COG: BH1295_1 COG3409 # Protein_GI_number: 15613858 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative peptidoglycan-binding domain-containing protein # Organism: Bacillus halodurans # 83 392 10 331 459 135 34.0 2e-31 MNSKYRKYVTYTAAVGVLTLSSVLGGCTKKSTSEVGDATPSEVNVIQAVPETIAQLGLNG QVLPGLNDINLPDAEPAPEYLRIGVRHEIIKRLQQRLMDLGFMDNDEPTDYFGEMTQQAV KHFQRQNELPTDGIVGNVTWDAIMSPDAKYYAVSKGTQGDDIERIQQRLYELGYLATADL VTGNFGDSTEAAVLKLQEVNGLEQDGKVGQRTINLLYSDEIKPNFLSYGEKSDVVLACQE RLKELGYLTTTPDGAYGEDTVVAVKQFQARNDQVVDGYLGPSTRIALNSPDARANGLMLG ERGDAVTKVQQLLNKHGYLVSGNVTGYYGEATENAVRNFQSRNGLTSDGLVGVQTMAKLT GDNVRRPAANSSGSGTTTRPNNSGNSGNTGNNGGSGNTGKPSGNTTPPVSIPASGGASAL ISVASSKLGSPYVWGAKGPNSFDCSGFIYWCLNQVGVNQSYLTSSGWRNVGRYTKITNFN DIQAGDIVVVRGHVGIAAGGGTVIDASSSNGRVVHRSLSQWWANNFICAWRIF >gi|157101654|gb|DS480670.1| GENE 62 70107 - 70775 823 222 aa, chain - ## HITS:1 COG:no KEGG:Closa_0311 NR:ns ## KEGG: Closa_0311 # Name: not_defined # Def: MerR family transcriptional regulator # Organism: 
C.saccharolyticum # Pathway: not_defined # 1 218 1 219 219 216 61.0 4e-55 MDEVHYLISDASKKVDVEAHVLRYWEEELELDIPRNEMGHRYYTDLHIRLFKQVKNLKEK GYQLKAIKHALNQVLKKDGKAQGELSDEILERDMSAALKEFKEEDSATSLSTVKGDGVSV VAMEEKMEQFQQIMNLIIGRALEVNNEKLSQDISCLVNEKMGKELEFLFQASDQKEEERF RQLDETLRSYQKGGQAEAAAAKVPFFKRRRFGRSGKKLRDGK >gi|157101654|gb|DS480670.1| GENE 63 70869 - 72227 1573 452 aa, chain - ## HITS:1 COG:CAC3715 KEGG:ns NR:ns ## COG: CAC3715 COG0305 # Protein_GI_number: 15896946 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Clostridium acetobutylicum # 8 439 6 433 442 462 53.0 1e-130 MEEALIKRVLPHSLEAEQSVIGSMLMDREAVIAASEIVTGSDFYQQQYGIMFDAMVELFN EGKPVDLVTLQDRLKEKDVPPEVSSLDFVRDIVTMVPTSANVRSYATIVGEKAVLRRLIK TTEEIANTCYAGKEPLENILADTEKSVFNLLQNRGGQDFVPIKQVAINVLEKIEDAYKNQ GTVTGIPSGFIDLDYKLSGFQPSDFILIAARPSMGKTAFVLNVVDYVAVRKSLPCMVFSL EMSKEQLVNRMLSMESNVDSQKLRTGSLTDSDWDAVVEGIGVIGNSKLIIDDTPGISISE LRSKCRKMKLEYGLSVVIIDYLQLMSGSKKGGGDNRQQEISDISRSLKALARELHAPVIA LSQLSRACETRQDHRPMLSDLRESGAIEQDADVVMFLYRDDYYNKDTEHPNEAEVIIAKQ RNGPIGTVTLIWKPEYTKFVNAARPQGGGQDA >gi|157101654|gb|DS480670.1| GENE 64 72299 - 72745 639 148 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239629097|ref|ZP_04672128.1| ribosomal protein L9 [Clostridiales bacterium 1_7_47_FAA] # 1 148 1 148 148 250 87 2e-65 MQVVLLEDVKALGKKGQIVNVNDGYARNFILPKKLGVEANSKNLNDLKLKKANDDRIAAE QLAAAKELGAKLDDSSVTLTIKAGEGGRAFGSVSSKEIAKAIGDQLGIDIDKKKIMLSDP IKSIGSFEVPVKLHKDVTARLAVKVVEA >gi|157101654|gb|DS480670.1| GENE 65 72790 - 74838 675 682 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|85057286|ref|YP_456202.1| exopolyphosphatase-related protein [Aster yellows witches'-broom phytoplasma AYWB] # 179 680 193 697 849 264 30 1e-69 MRDKGRTRGIFKVYLQWPLFLSVLLIVLTAVVAAVSVKAGIIVSVFTIVYIGIALWLYFS RKRGILAGLIAFSSAYDQSRQNLLEEMLMPYAVTDGSGHLLWMNREFTAILEEDKNSYKN ITAVFPEITKEMLATGGEVVSIHSSLGDRKFRIDLKQVMVDSLDAAVTAGGLEGQGSSVT AVYLLDETQTLKYMQQINDQKLVAGLIYLDNYDEALESVEEVRRSLLTALIDRKISKYIS 
GMNGIVKKIEKDKYFFAIKQQYMARIQEERFSILEDVKTVNIGNDMAVTLSIGIGMNGDS YSQNYEYARTSIDMALGRGGDQAVVKDSDKIQYYGGKAQQMEKTTRVKARVKAHALRELM ENKDRLLIMGHRLADIDSFGAAVGIYRIAMSMNKKANIVVNEVTSSVRPMMERFTGNAEY PEDMLLTGPKAAELVDQGTMLVIVDVNRPSITDEPALLEMVKTVVVLDHHRTSSEIIDNA VLSYVEPYASSTCEMVAEVLQYIADGIKIKSAEADAMYAGIVIDTQNFTNQTGVRTFEAA AFLRRSGADITRVRKLFRENMKDYQAKADAVRKAEVFMDAFAISECSAEGLESPTIIGAQ AANELLEIRGMKASVVLTDYNGTIYFSARSIDEVNVQVMMEKLGGGGHRTIAGAQMQGVT VEEAKERLKDVIRQMMEEGEVS >gi|157101654|gb|DS480670.1| GENE 66 75109 - 75453 416 114 aa, chain + ## HITS:1 COG:no KEGG:bpr_I0018 NR:ns ## KEGG: bpr_I0018 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 111 82 190 194 126 50.0 3e-28 MKQRNIVVCILLSLVTCGIYGIYWMIVLNDETNALSGRQGTSGGLVFLFSLITCGIYALY WMYQMGNAVEMLHDQHGEPRGSAPVVYLLLSLFGLGIVAYALLQNELNKYLPNA >gi|157101654|gb|DS480670.1| GENE 67 75446 - 75841 196 131 aa, chain + ## HITS:1 COG:no KEGG:Clole_2235 NR:ns ## KEGG: Clole_2235 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 19 129 22 131 131 87 41.0 2e-16 MHKRIQTLCIKTGGLLGAGLFYAFICILAGRPLIPCLFHTVTGLYCPGCGVSRMCLSLLR LDFASAFQANAAIFLLLPPGLFMAVWMAVRYVRTGSTRLTRVQACVFYVMIGVLLVFGVL RNLPGFGWLRP >gi|157101654|gb|DS480670.1| GENE 68 75852 - 76685 567 277 aa, chain - ## HITS:1 COG:mlr7094 KEGG:ns NR:ns ## COG: mlr7094 COG0395 # Protein_GI_number: 13475911 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 2 277 9 281 281 133 31.0 3e-31 MKVMKNIPRLIMLAVLTLIAVVFVIPIFYSVFNSFKSQKEILSTTMTFFPNSPSLENYLY VFQHGSQYLGYYVNSLKITFIGVILTVILSAMSGYAFARLPFKGSGAVMAFILFVITFPL AALLIPIYIMEYNVGLLNTTMGLVFPNVMNVLPFSIFIMRGIFLGLPIELEEAARMDGCS VFRTWKDVMAPLAKNGMIIVLVFSFYNIWGEYTMGKTLATQDSAMPIAVALTLLKGDSWN FGVLGAVITMTIIPPVIIFIIFQRQLVDGIAMGAVKG >gi|157101654|gb|DS480670.1| GENE 69 76682 - 77518 422 278 aa, chain - ## HITS:1 COG:lin0760 KEGG:ns NR:ns ## COG: lin0760 COG1175 # 
Protein_GI_number: 16799834 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 5 274 26 291 296 166 38.0 3e-41 MPSFLIIAVFVAYPLCYTAYLSMSEYDFMYDLTPSFVGIKNYVSLFSDPDFLVSMKNTAV FASLDFLLLMVLSLIIALLIFFCKKGTGIFRTAVFTPIVVPASLACIIFGWLFSENFGYI NYFLVKIGLPQFTRAWLTSRNTAMGTTIAANLWYNIGFITILFLAGLQSISADILEAAVV DGATGIKKIFYIVLPNLRESFVITGIWGIVTALKVFVEPMVMTGGGPGNATRTLYMYIYN TAFKYFDMGYASAMAFVLSGLILLFSLINLMVSKGENE >gi|157101654|gb|DS480670.1| GENE 70 77633 - 78955 1081 440 aa, chain - ## HITS:1 COG:TM1120 KEGG:ns NR:ns ## COG: TM1120 COG1653 # Protein_GI_number: 15643877 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 57 325 31 300 436 73 27.0 6e-13 MRKMISVLLAGVLAVNLGACSSAASKASKEAESAAPAGTTAQESGGKRTLRIAGESWQVT KLFLNEAAEAFMKDHPDVTVEVQTYADPTVVSNYSINWQKGDTPCDLVVIDGASFAEQFV TKDLIYDFDQDLKFFDTYSKDKFIPAALEMAKLHDKQYVIPVIQEVTAININKAMFKDAG LVDENGDALVPKTWDDVYEFAKKLTKTENGVVTQQGCTIQWSKDIHGTVLGILQASCGGL YQEDGITVNFQSQEFKDILAIWQKGVKEGVFSTETFADYDAGKNSYKAGKVAMLLQSGSN WVEAIPTIGEDNASVVPIPGGEENGSVGYVNGVIIPKCSPSSDLAVQFVQEQLLGEYTQT STVNQYGKIPVITEYYNKTESPNWQNIKGSIEKAVTYPPYKDLGEFVIKCQEIFQASLVD GTDVNTTAEELQNMVNKLDK >gi|157101654|gb|DS480670.1| GENE 71 78977 - 79708 534 243 aa, chain - ## HITS:1 COG:all0727 KEGG:ns NR:ns ## COG: all0727 COG0363 # Protein_GI_number: 17228222 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Nostoc sp. 
PCC 7120 # 1 243 13 258 258 162 36.0 5e-40 MAVYILDNSEELGEKAAELIAQKLNEAIEKKGKARIILSTGASQFETIKYLVEKNVDWEK VTMFHLDEYLELPETHKASFRRYLKERFTSKVPVKVHFVNTEGDVEENLKELTSEIRKDI IDIGVIGIGENGHIAFNDPPADFETREAYRIVELEERCRKQQLNEGWFPTLDDVPFKAVS MTPYQIMQCETIVSSVPGERKAEAVRNTLKSDEVTNMVPATLLKTHKDWHLFLDKESSSL IDS >gi|157101654|gb|DS480670.1| GENE 72 79715 - 80767 680 350 aa, chain - ## HITS:1 COG:VCA0132 KEGG:ns NR:ns ## COG: VCA0132 COG1609 # Protein_GI_number: 15600903 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 6 344 3 333 334 209 32.0 7e-54 MNKKTTIKDIAKKANVSIATVSRVLRRTDYPVSQDVKKRVFDAADELEYKPNLFGRMLRG DASIEIGIIVPSINNPFYAAQVAASEEECLSRNFIPIICNSNSNPGLESWYLEMLEERQA AGILLTTIQNEETFVKRIERLNMPVLLVDQGIENYTGDRVLFNFFKAGYMAAEYLIECGH KDLALASGPITRHNRREILRGFKQALLDYDIVFNKRRIVNYDSKYDIYNIGDDQGAIQII DKMFEEEYLPESIVAINDSLAIKMINELHNRGIHVPNDISIVGMDDIFVSKMVTPRLTTI HEPAEEMGKRSAKYLIDKIEGKSRNIVNITMQPTLVERNSVRKVHSKIRR >gi|157101654|gb|DS480670.1| GENE 73 81605 - 83728 2115 707 aa, chain - ## HITS:1 COG:ECs0453 KEGG:ns NR:ns ## COG: ECs0453 COG0366 # Protein_GI_number: 15829707 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli O157:H7 # 121 658 118 605 605 258 32.0 2e-68 MEDNLKGRINRDALFSSETEDYRMPMEPDADDDVLLRMRTAKGNVDHVYYVEDKAEVEMT KAKSDELFDYYEYEITVGTDQVLYHFKVVSEQDVCLYNRLGPTQDGQPCFDFKITPGFHT PEWAKGAVMYQIFVDRFCNGDESNDVESYEYVYIGRPVQKVTEWDRYPAAMDVGYFYGGD LQGIWDKLDYIKKLGVEAIYLNPVFVSPSNHKYDCQDYEHIDPHFTRIEKDGGGLVRPDA TDNKEADRYVQRSAGKENLEASDAFFARFVEEVHKRGMRVILDGVFNHCGSFNKWLDAEG IYEHSGDYEAGAYESKSSPYHSFFQFHDDSDGAWPYNRTYDGWWGHDTLPKLNYENSEKL VEYILNVARKWVSPPYSIDGWRLDVAADLGHTAEYNHEFWRRFRQAVKEANPEALILAEH YGDPASWLEGDQWDSIMNYDAFMEPVTWFLTGMEKHSDSYNGGIWGDGEAFFNAMGYHMS RMQTNTVMTAMNQLSNHDHSRFLTRTNQMVGRIQTAGPEKAGQNIKPCVFREAVVIQMTW PGAPTLYYGDEAGVCGWTDPDSRRTYPWGKEDLELIEFHRYMTGIHRKIPAIRKGAVKPL LAGYQKIAYGRMYGKYQAVVVISNSPDSQTMDIPVWQLGVTDDMVLGRPILTAETGYNAG VMLYRIKNGMLKLNMPPYSAAVFVSQPDEFYPVVDSRFADDGAAVEK 
>gi|157101654|gb|DS480670.1| GENE 74 84128 - 85942 1279 604 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. McKiel] # 33 601 35 595 636 497 47 1e-139 MRQQQTFRGFIFILLMLILIATAVRFPYARQADKVTNQDFIKILEDGQAADVSIHQNPQT PTGEVVLTLLDGQVKRLYVSDVKDAQKLLEAHDMAYTTMDVPQENYLVTIILPFMLSIVV VVIIIMVMNRSAGGGGANARMMNFGKSRARMSRDSKVNFSNVAGLVEEKEELEEVVDFLK NPQKYTSVGARIPKGLLLVGPPGTGKTLLAKAVAGEAGVPFFSISGSDFVEMFVGVGASR VRDLFEEAKKNSPCIVFIDEIDAVARRRGTGMGGGHDEREQTLNQLLVEMDGFGVNEGII VMAATNRVDILDPAILRPGRFDRKVAVGRPDVKGREEILKVHSKEKPLSEDVDLHRVAQT TSGFTGADLENLMNEAAIISARENRRFIKQSDIDRAFVKVGIGAEKRSKVISEKDKKITA YHEAGHAILFHVLPDVGPVHTVSIIPTGIGAAGYTMPLPEKDEMFNTKGRMKQNIMVDLG GRIAEELIFDDITTGASQDIKQATQIARAMVTQYGMSEKVGMIQYGGDENEVFIGRDLAH TKSYGNEVADTIDSEVKRIIDECYQKAKDIIKQYDYVLHACADLLIEKEKISQSEFEALF TPVQ >gi|157101654|gb|DS480670.1| GENE 75 85965 - 86498 568 177 aa, chain - ## HITS:1 COG:FN0288 KEGG:ns NR:ns ## COG: FN0288 COG0634 # Protein_GI_number: 19703633 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 172 1 172 175 186 60.0 2e-47 MADRIRVLLTEEEVDKRINEVAAKISEDYAGKQVHMICILKGGVFFTCELAKRMTVPVSL DFMSVSSYGGGTVSSGVVRIVKDLDESLEGKDVLIVEDIIDSGRTLAYLIEVLKQRGPKS IHLCTLLDKPERRVKKQVKVDYTCFTIPDEFVVGYGLDYDQKYRNLPYIGVVELDAE >gi|157101654|gb|DS480670.1| GENE 76 86519 - 87910 975 463 aa, chain - ## HITS:1 COG:CAC3204 KEGG:ns NR:ns ## COG: CAC3204 COG0037 # Protein_GI_number: 15896451 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Clostridium acetobutylicum # 1 463 14 461 461 231 32.0 3e-60 MLEPGDRVIAAVSGGADSVCLLALLCAWREPRGISIRALHVHHGLRGEEADRDADFVRSL CEGLHVPCHILKVDVRGLAAEKGMSEEEAGRFLRYEALEKEALDWEGEDRASGARSGQDG DWAHGGLAGSSLSPVKIAVAHHSGDQVETILHNLFRGSGLFGMKGIVYRRGRIIRPLLDV 
DRDCILKWLADHDLPYVQDSTNDTLHYTRNRIRNQLLPEIEQYVNRGAAGNILRLGRLAA QADEYLESQAAAWIKAHVRKNPGAYIPAEAFLQEPEILRSYVVMSLLKELGGASRDLGLV HVSQVMELAGRAVGKQVDLPYGLSAIREYEGIWMGRGDPASEEEWGDLPIVDMEVFSRKK GMEFPKNVYTKWFDCDKIKGTPVVRTRRPGDFIVLSDNNHKALNRFMIDEKIPRQMRDKI PLLADGSHVMWIIGYRISEYYKIGPDTVRVLQAEAGQTGERKE >gi|157101654|gb|DS480670.1| GENE 77 88049 - 89467 1422 472 aa, chain - ## HITS:1 COG:BH0078 KEGG:ns NR:ns ## COG: BH0078 COG2208 # Protein_GI_number: 15612641 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Bacillus halodurans # 91 467 429 806 830 147 26.0 4e-35 MFIREREVKGRTDVNYYTARRLGDMAESLNQLARAFDDGIEKNGQLTKDDGLAAMQASAA LVCDNCSRCNLYADSEKEDSYYLYYLLRAFEQKGHIDFEDMPQMFQSGCRKKEDYLAQLN RSLGRATMNLSWKNRFLESRDAVISQFRELSVILEEFSHQIDRARDITDEYEYILKKYFK RYHVALGNLLLLEYENGQKEAFLTARTTNGRCITSKDAALIMGEVMDGTRWSPAKDSRSI ITKQYETVRFLEEGGYRMLYGASRIPKKGEKYSGDNYTFCESPGNQVVMSLSDGMGSGET AARESKQVVELTEQLLETGFSPRAALKLVNTVLLLAGPEQHPATLDLSCIDLHTGVLEAM KLGAVPTFIIGEEGVEIMEAGEVPMGILSGVEPVLMSRKLWEGDRIVMVSDGVLDALPGD DKEQAMQQYLESVEEMGPQELADQVLDFAVSFIPAPRDDMTVLTAGIWKRRS >gi|157101654|gb|DS480670.1| GENE 78 89572 - 89835 377 87 aa, chain - ## HITS:1 COG:no KEGG:Closa_4064 NR:ns ## KEGG: Closa_4064 # Name: not_defined # Def: Septum formation initiator # Organism: C.saccharolyticum # Pathway: not_defined # 1 86 19 104 107 89 61.0 5e-17 MAIMGITLVVMFLAVAINIKGADLKKSDLEYSIREQNLEQQKEEEEKRTAELQEYKIYVK TKQYAEEVAKEKLGLVNPDEILLKPTE >gi|157101654|gb|DS480670.1| GENE 79 89894 - 90274 233 126 aa, chain - ## HITS:1 COG:no KEGG:Closa_4065 NR:ns ## KEGG: Closa_4065 # Name: not_defined # Def: Spore cortex biosynthesis protein, YabQ-like protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 114 1 114 120 109 50.0 3e-23 MSSRIQYEAWLMVLSLVTGGWLMLAYDTLRVFRLVIRHGPFWTGVEDFLYWLYAGLVTFI LLYEQNDGVFRAYVIGGVFAGMILYDRFISRIFFKCLKKAGKCLRMIISRHRQKAVCSPE KDGAGD >gi|157101654|gb|DS480670.1| GENE 80 90283 - 90570 390 95 aa, chain - ## HITS:1 COG:no 
KEGG:Closa_4066 NR:ns ## KEGG: Closa_4066 # Name: not_defined # Def: sporulation protein YabP # Organism: C.saccharolyticum # Pathway: not_defined # 1 95 1 94 94 120 71.0 2e-26 MEEKIGSNRQHKLILQNRGKGNITGICDVVSFDENAVVLDTDMGLLTIKGKELHVSRLTL EKGEVDIEGTIDSMVYSSNEALRKSGESLFTRLFK >gi|157101654|gb|DS480670.1| GENE 81 90683 - 90922 341 79 aa, chain - ## HITS:1 COG:BH0073 KEGG:ns NR:ns ## COG: BH0073 COG1188 # Protein_GI_number: 15612636 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Bacillus halodurans # 1 76 1 76 88 75 61.0 2e-14 MRLDKFLKVSRLIKRRTVANEACDAGRVLVNDKPAKASVSVKTGDIIEIQFGAKAVKVEV LDVQETVKKDEAKELYRYI >gi|157101654|gb|DS480670.1| GENE 82 90926 - 91201 171 91 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 [Haemophilus influenzae PittEE] # 1 91 4 94 96 70 36 4e-11 MNKTELIAAVAEKAELSKKDAEKAVKAFTDAVSEELVKGGKVQLVGFGTFEVSERAAREG RNPKTGKTMTIEASKTPKFKAGKALKDEVNK >gi|157101654|gb|DS480670.1| GENE 83 91302 - 91682 438 126 aa, chain - ## HITS:1 COG:BS_yabN KEGG:ns NR:ns ## COG: BS_yabN COG3956 # Protein_GI_number: 16077126 # Func_class: R General function prediction only # Function: Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain # Organism: Bacillus subtilis # 2 122 231 351 489 129 52.0 1e-30 MYTFDDLVGIIAKLRSPEGCPWDREQTFESLKKCLADETEEVFQAIDNKDMENLCEELGD VLLQVILNSQIAKEEGRFTIDDVIDGLCRKMVRRHPHVFGGVKVNSVEEGTALWNQIKMQ EKAKKP >gi|157101654|gb|DS480670.1| GENE 84 91692 - 92312 842 206 aa, chain - ## HITS:1 COG:PH1972 KEGG:ns NR:ns ## COG: PH1972 COG1394 # Protein_GI_number: 14591709 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit D # Organism: Pyrococcus horikoshii # 7 200 9 205 214 112 37.0 6e-25 MDPNTFPTKGNLIAAKNSLGLAYMGYGLMDKKRNILIRELMELIDEAKGIQSEIDSTFRE 
AYAALQKANIELGINYVQEIGASVPIDDRVKIKARSIMGTEIPLVQHESQPLALTYGFYN TTESLDEARYHFEKVKELTIKLSMVENSAYRLANSIKKTQKRANALKNITIPHYEDLSKS ISNALEEKEREEFTRLKVIKRNKEKV >gi|157101654|gb|DS480670.1| GENE 85 92318 - 93709 1827 463 aa, chain - ## HITS:1 COG:TP0528 KEGG:ns NR:ns ## COG: TP0528 COG1156 # Protein_GI_number: 15639518 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit B # Organism: Treponema pallidum # 5 454 6 454 480 553 59.0 1e-157 MAIEYLGLSEINGPLVVLEGVQNAAYDEIVEMTVAGNLHKIGRIIEIYGDKAIIQVFEGT DGMALRNTHTRLTGHPMEIEVSEEMLGRTFNGIGQPIDGLGDIISDIKLDINGKPLNPVT REYPRNYIRTGISAIDGLTTLIRGQKLPIFSGNGLPHDQLAAQIVQQASLGENSDEEFAI VFGAMGVKYDVADFFRRTFEESGVSEHVAMFINLANDPVVERLITPKIALTLAEYLAFEK GMHILVILTDMTSYAEAMREVSSSKGEIPSRKGFPGYLYSDLASLYERAGIVAGRPGSVT QIPILTMPNDDITHPIPDLTGYITEGQIVLDRQLNGQNIYPPINVLPSLSRLMKDGIGEG YTRADHQDVANQLFSCYAKVGDARALASVIGEDELSPIDKKYLEFGKAFEEYFIGQAPHE NRTILDTLDIGWKLLGMLPREELDRIDTKVLDQYYKPADTMKE >gi|157101654|gb|DS480670.1| GENE 86 93725 - 95491 1942 588 aa, chain - ## HITS:1 COG:MK1017 KEGG:ns NR:ns ## COG: MK1017 COG1155 # Protein_GI_number: 20094453 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit A # Organism: Methanopyrus kandleri AV19 # 5 585 7 589 592 643 54.0 0 MAKTGVIYGINGPVVSLLGDSGFQMNEMVHVGQEKLVGEVIGLNSEKTTIQVYEETSGIK PGETVTGTGAPVSVTLAPGILSNIFDGIERPLSEIAKGSGYYISRGISVNSLDTDKKWDT HMVVKEGDRIMPGTIIAEVPETRAITHKVMAPPDAEGYVLTVVPDGRYTIEEPLLTLQLM DGTERTLTMTQKWPIRVARPSARRFPAVKPLITGQRILDTMFPLAKGGTAAIPGGFGTGK TMTQHQVAKWSDADVIIYIGCGERGNEMTQVLEEFSELIDPKTGNPLMDRTTLIANTSNM PVAAREASLYSGLTLAEYYRDMGYHVAIMADSTSRWAEALRELSGRLEEMPAEEGFPAYL ASRLSAFYERAGMMQNLNGTEGSVTIIGAVSPQGGDFSEPVTMNTKRFVRCFWGLDKSLA YARHFPAIHWLTSYSEYINDLSSWYMDNVDKRFVDDRNRLMALLVQESSLMEIVKLIGAD VLPDDQKLVLEIAKVIRLGFLQQNAFHKDDTSVPLIKQFKMMEVILYLYKKSRSLVSMGM PVSVLKEENIFDKIISIKYDVPNDRLEMFDDYMKMIDGFYDRVVERNA >gi|157101654|gb|DS480670.1| GENE 87 95484 - 96077 841 197 aa, chain - ## HITS:1 COG:no KEGG:Closa_4197 NR:ns ## KEGG: 
Closa_4197 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: Oxidative phosphorylation [PATH:csh00190]; Methane metabolism [PATH:csh00680]; Metabolic pathways [PATH:csh01100] # 1 190 1 190 200 198 59.0 1e-49 MTTEEKLKHFQDICMTDARERSARMLDDYMNALEAAFEEHAADARRRADMQVEAETEKLE REINKRLSIGQLDLKREASHKQEELKDKLFVELRDKLANFMETQDYQRLLDRQVKAAKEF AGDEELIIYMDPSDVDKLQRIAMHHNATIKVSEYSFNGGTRAVLPGKHILIDNSFQTKLN EARHEFKFDLTGGRQSG >gi|157101654|gb|DS480670.1| GENE 88 96104 - 96412 554 102 aa, chain - ## HITS:1 COG:no KEGG:Closa_4196 NR:ns ## KEGG: Closa_4196 # Name: not_defined # Def: Vacuolar H+transporting two-sector ATPase F subunit # Organism: C.saccharolyticum # Pathway: Oxidative phosphorylation [PATH:csh00190]; Methane metabolism [PATH:csh00680]; Metabolic pathways [PATH:csh01100] # 1 102 1 102 102 165 85.0 6e-40 MKMYLISDNIDTLTGMRLAGVEGAVVHKREELKEELDKALADKTIGIILLTEKFGREFPE IIDNVKLERKLPLIVEIPDRHGTGRKPDFITSYVNEAIGLKL >gi|157101654|gb|DS480670.1| GENE 89 96417 - 96851 652 144 aa, chain - ## HITS:1 COG:no KEGG:Closa_4195 NR:ns ## KEGG: Closa_4195 # Name: not_defined # Def: H+transporting two-sector ATPase C subunit # Organism: C.saccharolyticum # Pathway: Oxidative phosphorylation [PATH:csh00190]; Methane metabolism [PATH:csh00680]; Metabolic pathways [PATH:csh01100] # 1 144 1 143 143 120 78.0 1e-26 MSLLVKITLSAALILSIAVPFGAFAAGEKTKGRYRTALAANVFLFFGTMVFAAVVMFSGN AAAAETGAQAAGSSAAGMGYLSAALATGLSCLGGGIAVSAAASAALGAISEDSSILGKSL IFVGLAEGVCLYGLIISFMILGKL >gi|157101654|gb|DS480670.1| GENE 90 96883 - 98823 2325 646 aa, chain - ## HITS:1 COG:AF1159 KEGG:ns NR:ns ## COG: AF1159 COG1269 # Protein_GI_number: 11498759 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit I # Organism: Archaeoglobus fulgidus # 3 641 5 669 676 127 23.0 9e-29 MIEKMKFLSITGPKEDIDRVVDTYLSRYEIHLENALSELKTVKDLRPYIETNPYKDVLEK AQGLVESYHDLITDQTGRNMKLEEAVEVINSLDTQLKELTDQKNVLVHQRDELKASRDRV IPFAGLNYNVKEILKFRFMKFRFGRISREYYEKFSSYVYETIDTVMFKCEEEGDYVWVVY 
FVPESLADKIDAIYASMHFERCFLPDEYEGTPVEAGHVLDDRIGSLQKEIDEADRNIVEA MSSRKDEFAAALRRIRTFSTNFDVRKMAAVTKHDTHNFYILCGWMTENDAKRFQKEIECD SDTFCIIEDDHNNIMSTPPTKMKNPGLFRPFEMYVEMYGLPSYNELDPTILIGITYSILF GFMFGDAGQGLCLLIGGFLLYRFKKIRLAGIISCCGVFSTIFGLLFGSVFGFEDIIDAVW LRPQEAMTDLPFIGRLNTVFVVAVAIGMGIILLCMILNIINSLRTHDVEKAYFDTNGVAG LVFYFALASVIVLYMTGNPLPATVILVIMFLIPLLVIFFKEPLAAILEKKAERMETGLGM FITQGIFELFEVLLSYFSNTLSFVRVGAFAVSHAAMMQVVLMLAGAEAGGNTNWAVVVGG NLFVCGMEGLIVGIQVLRLEYYELFSRFYRGSGRAFEPYGKAAGEQ >gi|157101654|gb|DS480670.1| GENE 91 98841 - 99860 880 339 aa, chain - ## HITS:1 COG:no KEGG:Closa_4193 NR:ns ## KEGG: Closa_4193 # Name: not_defined # Def: H+transporting two-sector ATPase C (AC39) subunit # Organism: C.saccharolyticum # Pathway: Oxidative phosphorylation [PATH:csh00190]; Methane metabolism [PATH:csh00680]; Metabolic pathways [PATH:csh01100] # 1 334 1 334 348 355 53.0 2e-96 MGSLIAYSGIATKVKAMERWRIKDEQFLEMAALETVPDAVQYLRAFLPYREIFGQVEDKD LHRGNIEQHLNLSQYRDFAKLYQFANIKQRRFLDLYFMHYEITILKTCLRNAAGHREQRQ DLSMFRDFFERHSALDLITLSQSQSMEEFIGNLKGSVYYGPLYQMQQNGRVTLPACETAL DMLYFRSMWKIKDKYLSKNEQKIIAQCFGTRMDMLNLQWICRSKKFYQLEPGEIYAMLIP IQLHLTRKEINRMVEAENLDILYGLIGASWYGRTGLVHLDQDQDLDHLAHDIIDRIYLKT SRKDPYSIAILNSYLYFKEREIQRIITTIEKIRYGLGAS >gi|157101654|gb|DS480670.1| GENE 92 99865 - 100176 448 103 aa, chain - ## HITS:1 COG:no KEGG:Closa_4192 NR:ns ## KEGG: Closa_4192 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 102 1 102 103 69 51.0 5e-11 MDTVINRLSEIEAAAGAIVEEANARKKAFAEEMDAKTAAFDKSMEQETARRIAEIQEKME ADMNGLLAKQKAESEALLKELEDNYNRHHEEYAEALFQKMIKG >gi|157101654|gb|DS480670.1| GENE 93 100379 - 101374 872 331 aa, chain - ## HITS:1 COG:no KEGG:Closa_4191 NR:ns ## KEGG: Closa_4191 # Name: not_defined # Def: SH3 type 3 domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 12 319 11 361 377 194 43.0 5e-48 MSNVERTTSWQQIQQGVKEAERLIARKEYNLVMVRARQVLEYMVRCMAERACVVEGDLSD TIDQLYEGQWINKATKDNYHTIRILGNKAVHEGDDTAYDANQAFQLLTQEVYVFANEFAG 
GSGGRPVRTSSAARNTSGRPAPRQGTGGGRSASGRNSQRVPARNTRSGPSRQGGPRKKRR RVSPAAYILRLLIPVLLVILLIVVIRVFTSDKGKDADKTTTVPTVTTAPVEPAPTLPPEP ETEPETEPPAETYRIKGSSVNVRSEPSTSGRILVQLGSGTEVVYVKRYDNDWAVINYDGQ EAYVSSKYIEKVEPVASTGGETEGSEAQTAN >gi|157101654|gb|DS480670.1| GENE 94 101560 - 102222 619 220 aa, chain + ## HITS:1 COG:BS_ytmQ KEGG:ns NR:ns ## COG: BS_ytmQ COG0220 # Protein_GI_number: 16080042 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Bacillus subtilis # 1 204 1 204 213 217 50.0 1e-56 MRLRHIQGAEETIASSPFVIQEPEQHRGTWNQVFGGSHPLEIEVGMGKGRFIMEKASANP DINYIGIERYSSVLLRGLQKRSEMDLENIYFMCIDAKDMAEIFAPGEVDRIYLNFSDPWP KDRHAKRRLTSPVFMSVYDKILAPQGVVEFKTDNRGLFEYSLESIPDAGWEITEYTFDLH NSPMAEGNVMTEYEMKFASEGKPICKLIARRTGTFGQDPA >gi|157101654|gb|DS480670.1| GENE 95 102259 - 102420 160 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935963|ref|ZP_02083337.1| ## NR: gi|160935963|ref|ZP_02083337.1| hypothetical protein CLOBOL_00858 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00858 [Clostridium bolteae ATCC BAA-613] # 1 43 1 43 53 67 100.0 3e-10 MALPPGITKKLLQATSDKTDLNKQALRFNRSFKEVGKLLGSGSAKAKKDKDKK >gi|157101654|gb|DS480670.1| GENE 96 102783 - 104435 1708 550 aa, chain - ## HITS:1 COG:ECs3742 KEGG:ns NR:ns ## COG: ECs3742 COG3829 # Protein_GI_number: 15832996 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli O157:H7 # 3 547 7 589 592 145 23.0 2e-34 MKRSVLSACAAIVQKEIEFYAGFTCLEYDVNDENTMRVAGTGIFRSIIYANENQPYSAIE QAKTDRKPVFVRNRNSSICRKCSTNRICTGKAEVVVPLYIRQAFLGTLSVLCYDEKIHKD MLEHTEYYKNLAVHVGQVITRIAESYLLEDALHVTEHALSDLVDSMEDGAVLIENRRVSK VNKNARIQMQRMGIGEELFRETVKVIKRRNSNWLDMRAENRSIRLHAKHQIFDSEFLMGK RLEYVVLEQNLVTEQSYFQDASRNLSLDYFEGTSEVLTKMKKNVMKISEISRYMVIQGER GLGKEMWVKAIHNSGRYAQQELVIFDCNAFSDLNFANQVFDEETGIFCGRDITVCLKEIP RLPYWLQLKLADNITLLENHNIRLIATTLEDLSKMADHNAFSKRLFNLFYPAILCIPPIR 
ERDMDIDYYIQKYLIYYQDFEKKKVACSSRALTMLKNHSWPGNFKQMEQVLSYLISINTS GIITEEDVLRLPDFKQNDGPFNLKEQERQLIQKALQRFTGPYGKQAAADALGISKATLYR KIQEYRLDEA >gi|157101654|gb|DS480670.1| GENE 97 104604 - 105473 854 289 aa, chain + ## HITS:1 COG:mlr7546 KEGG:ns NR:ns ## COG: mlr7546 COG0005 # Protein_GI_number: 13476267 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Mesorhizobium loti # 5 272 17 291 306 133 31.0 5e-31 MICKEIPKTMYCHIGGSGTWGCEFPEDLNMEGITLLKRDMEFETPFGVTACMKLYEMDAS ITADHKTRQVLYVPFHGWKGLSPYNDTPSERIFWALREAGVKYILADGSGGGINPLLEPG DIICPQDFIDYTKRVSYLGKFTPYSMRMREIICPDLHRILLEKASGVFRRVFRTGIYGVA EAPRFESAAEIQKMYADHCDVCGQTMMPEAALARAIGACYAALYVISNSAEGINPNWERS IFDIYSECAPVVGKVMLEAMAAINPETMNCHCGENIIEMPKAVQNRLDM >gi|157101654|gb|DS480670.1| GENE 98 105501 - 106625 1398 374 aa, chain + ## HITS:1 COG:CAC0702 KEGG:ns NR:ns ## COG: CAC0702 COG1744 # Protein_GI_number: 15893990 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Clostridium acetobutylicum # 70 374 36 349 357 162 33.0 1e-39 MKHLKLFTALALTAAMTVSLAGCQGKAASSPDAGTQTAADSGETDSNGSASKEADAKAPA DGQTVPSGGSEPFKVALMIGIGGLGDGGFNDSMLAGVEAAKADYGIEYQLVEPKEISEFE ANFTDLSASGKYDLIIGGGFDAVEALQKVASEFPEQRYLFVDGEVTGCDNVTSVLFKDNE KTYLVGTVAGLNTKTNKIGMVVGVDSPSQNIFVAGYMAGAKAANPDVEVIVKYVGSFADT TTAKELALAEADAGADVIFAAAGGSGLGVFNAAQQGTFKAIGADVNQCLIDPDHIMLSAI RKIDVVIKDGIKSAIDGTLEGGTMVPGLKEDALGITNEGSKVEIDPASLDAAQTAKDKII SGEITVPSTIDEVK >gi|157101654|gb|DS480670.1| GENE 99 106688 - 108214 1278 508 aa, chain + ## HITS:1 COG:TM0103 KEGG:ns NR:ns ## COG: TM0103 COG3845 # Protein_GI_number: 15642878 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Thermotoga maritima # 2 497 3 499 507 450 47.0 1e-126 MYAIEMQGITKTFHGSYANKEVSLKVEQGEIHSLLGENGAGKTTLMKILFGLYKANKGRI LINGSPAAIRSPHDAIDYGISMVHQHFVLVGNLTITENIILGHEPRKGILLDRGKARQKV 
DALIRRYHFDLNPESRIENLSVGEKQRVEILKALYNESSIILLDEPTAVLTPQEVQELFA MLRQLKKDGRTIIIITHKLKETLAIADNITILRRGENIASLPATQATEDKLAELMVGRPI SFDVQRVPYDGSGPVKLGLKNICLKKKGRPVLTDISLECRAGEILGIAGVEGNGQTELIE VVTGIENHFTGQVCCGSQPVSAISPRRMLDYVGHIPEERGKRGFVKGFSNWENLILGYHY RPEYVKHGFLNIRRIKQVARDTIKQFDVRTDGIDQLTSSLSGGNQQKLIIGRTLLHQSNI IVAAQPTRGVDIGAIEYIHQHLMQMRDSGSAILLISADLDEVIKLADRIAVLYEGRVVAE KETGAFTKTELGYYMLGGNARKEAADND >gi|157101654|gb|DS480670.1| GENE 100 108207 - 109256 1116 349 aa, chain + ## HITS:1 COG:lin1427 KEGG:ns NR:ns ## COG: lin1427 COG4603 # Protein_GI_number: 16800495 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Listeria innocua # 8 334 6 336 350 203 38.0 5e-52 MIKYLKSRTMAFTLISFLISLVLAGIIIAAAGYSPVEAYSAMLGGAFGSTYNLAQTAGTA IPLIFAGLAMAVASKAGIFNIGIEGQILAGALPAALAGTYITGLPAVLHIPVCILVAAIF GGIWAMLAAVIKNKLQISEVIITIMLNYVALYLVEYLVNNPFKSEGMVVRTEEIQDSARL VSLVAHTRLNTGIFIALLAVFLFWVLFYKTKLGYEMRATGDSPFAAESAGINNKKNLLAA MFISGALAGIGGACEVMGVQGYYISGMTTGYGFSGIAIAVMGHNQPLGTLASALLFGALN AGASSMNRMTDIPGEFISVLQALIILFISTPGIVIAIQNLYKRRKAVTS >gi|157101654|gb|DS480670.1| GENE 101 109253 - 110173 1163 306 aa, chain + ## HITS:1 COG:CAC0705 KEGG:ns NR:ns ## COG: CAC0705 COG1079 # Protein_GI_number: 15893993 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Clostridium acetobutylicum # 14 306 16 309 310 209 42.0 5e-54 MSGQILSLLAGMVRVAIPISFAALAGMLSERAGVINMGLEGIMLIGSFFGVVGSYVTGSA WMGLLFAALSGILMGLVLVTLTVGFKCEHVLAGVGINIFASGITIVLLEMIWGTKGKSSM VAGLGMLRVPVISKIPVIGDIIGTISPLFYILVLCVFLLWFLLYRTPAGLRILVIGENPE MAGTMGVNVYKMQYLCVMASGMLAAIGGAYLSIGDINMFSKDMVSGRGYIALSMVILGNW KPLWVALGGLVYGFAQSLQFRLQGVNIPPQLVQMLPYILTLVVLLFARKKSSAPAASGKH YYQKGE >gi|157101654|gb|DS480670.1| GENE 102 110176 - 110721 608 181 aa, chain + ## HITS:1 COG:MJ0150 KEGG:ns NR:ns ## COG: MJ0150 COG1719 # Protein_GI_number: 15668322 # Func_class: R General function prediction only # Function: 
Predicted hydrocarbon binding protein (contains V4R domain) # Organism: Methanococcus jannaschii # 70 180 60 167 173 72 32.0 4e-13 MGKIQIDTEHARKMADDYSYVMRSFADAYAAQKKGVKIRPRLGDHMPFNYMRERQTLTFL FYPELKEAAYEISRIIGESITSDQIRGKTLPEIMESNQPIAQRDGYAFEEVVEADETHAV YRHYECADCYGLPNIHSKICVYEAGTAAGMFSTALGKPCRVTETKCCANGDPYCEFLVEV L >gi|157101654|gb|DS480670.1| GENE 103 111153 - 111800 547 215 aa, chain + ## HITS:1 COG:no KEGG:Closa_4227 NR:ns ## KEGG: Closa_4227 # Name: not_defined # Def: integral membrane protein TIGR01906 # Organism: C.saccharolyticum # Pathway: not_defined # 1 215 18 231 236 251 62.0 2e-65 MIILFITSVEAVVYWTPGYFEKEYTKYNVLDSLPSMTMDDLLHVTDEMMDYLRGGREDLH VTTTMGGEQREFFNEREIAHMEDVQVLFLKALSIRRICLVLGAGLLILMAATKARMGRVL PPSLCMGTGLFFALITALGLIISTDFTKYFIMFHHIFFSNDLWILDPATDMLINIVPEGF FMDTAARIALLFGSLSLILFGVCLLLTIKNRRKVS >gi|157101654|gb|DS480670.1| GENE 104 111902 - 113536 1394 544 aa, chain + ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 29 283 27 289 425 216 44.0 9e-56 MKRFLMLFICLCLCAGLFPITAAAAPEWPSDVSIQADAGIVMDSDTGTVLYGKNMDQPYY PASITKILTALIVLEQCDLNEMVTFSHDDVYNVEAGSSSAGIDEGDVLTVRDCLYALMLA SANESANALACHVSGSREAFAQLMNEKARSLGCTGSHFNNPSGLNDENHYTTAHDMALIA RAAIQNPEFLTINGTRSYQLAPTKRTPEGGYVANHHKMLNKNEAVYYPGAFAGKTGYTSL AGNTLVTCARKNDMTLIAVVLNGHQSHYSDTKALFDFGFRNFQSLRTVDYETRYKSLEND MTIAGMTSGDSISLELDRSGRVVIPRDADFTDTQSALTYDLDGSHPQAAIACISYTYNDR PVGSVYLCSPGLEGSAASLTSQDASGASAVSIGQDGEADNPAPSDPSGRPDAPGLQDTPE APSNAETPHGQTPAPTPAPRQDITSKEGTAASIRIPANTLAILGIAFSLAVIIAIVAAVK IHVRRKEEEDLYLRRQRRLERLEDIGFSSSDFEKLVAQRRVSSPPLGEKGKKGRGRRRKK SFFR >gi|157101654|gb|DS480670.1| GENE 105 113622 - 114731 898 369 aa, chain + ## HITS:1 COG:lin2827 KEGG:ns NR:ns ## COG: lin2827 COG2205 # Protein_GI_number: 16801887 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria 
innocua # 100 364 627 889 896 215 41.0 1e-55 MKSFHINKTDLKAAAINLVRNLPITIEYMAAATLVSTIFFHFSNNVTNISIIYTLAIIMI ARATSCYGAGILASLFGVFWVNFAFTYPYLTLNFTMSGYPITFLGMALISSLSSSICIMI TKQNVQLQEKDRMLLNAEKETMRANLMRAMSHDLRTPLTSIIGSSSTYLAQEEYMSPEEK RKLVRNIEEDAQWLLNMVENLLSVTRIQDEKGVASVVKADESLEEVISESVQRFRKRFPD VQVRVSIPDSVIIIPMDATLIEQVINNLLENALFHSGTNGPIDLVAASEKSGLSVSIKDY GKGIAPELLDTIFDGGGTSENHTGDGHKGMGIGLSICKTIINAHGGEIHAGNHLRGAQFT FTLPDWREY >gi|157101654|gb|DS480670.1| GENE 106 114735 - 115427 854 230 aa, chain + ## HITS:1 COG:lin2826 KEGG:ns NR:ns ## COG: lin2826 COG0745 # Protein_GI_number: 16801886 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 1 230 1 231 231 256 57.0 2e-68 MNTKANIIIIEDEKNICNFIETVLSPQGYQVTCANTGTDGLKLIESLKPDVVLLDLGLPD MDGLELIQEVRSTSALPIIVISARTLERSKVAALDLGADDYLTKPFGTAELLARIRTALR HSQRAAGTQSLKYEVGDLLIDFERRLVKVKGQDVHLTQIEYKLVSLLAQNAGKVMTYESI ISKIWGPFADSDNQILRVNMAHIRRKLEENPAEPQYIFTEIGVGYRMREE >gi|157101654|gb|DS480670.1| GENE 107 115495 - 116130 657 211 aa, chain - ## HITS:1 COG:PM1648 KEGG:ns NR:ns ## COG: PM1648 COG2376 # Protein_GI_number: 15603513 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Pasteurella multocida # 27 209 44 225 228 82 30.0 4e-16 MERITCHELPDLFQGAADIFGEKKEELCEMDARMGDGDLGLTMQKGFGALPQLIRDNEAE GDIGKTLMKAGMKMAALVPSTMGTLMSSGIMEGGKAIGKKGEMGPGELADFLAGFARGIA HRGKCQLGDRTILDAVDAGARKAQEACAEGADMGAMLRRAAEGAKEGVKATEDMLPKYGK AAVFSAKAKGVPDQGAVAGQYFLEGLGRYFL >gi|157101654|gb|DS480670.1| GENE 108 116169 - 117164 1038 331 aa, chain - ## HITS:1 COG:mll7280 KEGG:ns NR:ns ## COG: mll7280 COG2376 # Protein_GI_number: 13476064 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Mesorhizobium loti # 2 322 6 328 337 280 45.0 3e-75 MKKIINAPEAYTDDMLRGIYAAHPDMLKYVEDDLRCYCTAKKKQGKVAIITGGGTGHLPL 
FLGYVGENLLDGCGVGGVFQSPSSEQLYNVAKEVEAGAGVLFLYGNYTGDIMNFDMAAEM LDMDDIRTASIVGADDVLSNKDAQVRRGVAGIFFMYKCAGAMAARMGTLEEVLDAAKKAK ENTRTVGFALTPCVIPEIGHSNFTLAEDEMAFGMGIHGEPGVWNGPVKTANDLAEESVGE ILGDMPLNPGEEVCLLINGLGATSQEELYILSGSVHRILEEKGIKVYRTFVGEYATSMEM AGASISICRMADQEMKQWIDMPVHTPFYTQV >gi|157101654|gb|DS480670.1| GENE 109 117183 - 117836 770 217 aa, chain - ## HITS:1 COG:PH0191 KEGG:ns NR:ns ## COG: PH0191 COG0235 # Protein_GI_number: 14590126 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Pyrococcus horikoshii # 5 187 4 179 189 116 37.0 3e-26 MLDILKEQVVAVAKEAERLGMCRHKSGNFSIYDPETGYVVITPSGVARDVLGPEHVCVMD LSGKVIERAAEVKPSSEAMMHLYIYKERKDIRAIVHTHARYSTAFSIMNKPIMPIVYECA YLGKTGTVPVAPYGRAGTVDLAEKVAEVMKEYDCCLMESHGAIAADKDMDAAFLKAQYVE EIAEMYYITLAAGNGAEPHALPVEELQKWAYPKEIIL >gi|157101654|gb|DS480670.1| GENE 110 117859 - 118302 478 147 aa, chain - ## HITS:1 COG:SP2165 KEGG:ns NR:ns ## COG: SP2165 COG4154 # Protein_GI_number: 15901975 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose dissimilation pathway protein FucU # Organism: Streptococcus pneumoniae TIGR4 # 1 145 1 138 147 115 42.0 2e-26 MLSGVPASISPELLKVLHEMGHGDTLVIGDANFPAASIAAEKNHINIRCDGHRATDMLDA ILQLMPLDGFVEKPVTIMDKMEMHRDLKCPVWDEFTDIVAKHDERGADAVGFLDRFAFYD AAKDAYAVVSTSETAFYACIMIQKGCL >gi|157101654|gb|DS480670.1| GENE 111 118307 - 119407 1354 366 aa, chain - ## HITS:1 COG:TM0911 KEGG:ns NR:ns ## COG: TM0911 COG0182 # Protein_GI_number: 15643673 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family # Organism: Thermotoga maritima # 12 357 5 335 343 296 48.0 5e-80 MTVEEFFDQVITVRMREDRSSLDIIDQTLLPGTIKRIHLNTKEEIWEAIKKLRVRGAPAI GAAAAYGIALLASGIAEDQYDAFYSRFKELKDYLASSRPTAVNLFWALNRMEAAVIANKE RGVAAIKELLFEEADRIREEDIQISRGIGEIGFGLLKELKKEGKEIGILTHCNAGTLATA KYGTATAPMYIALEKGWAGTDMHVYCDETRPLLQGARLTSFELYNAGITTTLQCDNMASI 
LMKSGKIDIIFVGCDRVARNGDAANKIGTSGVAILAKHYGIPFYVCAPTSTIDMDTLTGD QIPIEMRSPDEVTEMWYKERMAPEGINVYNPAFDVTDHSLITGIITEKGMCRAPYDKAFE ALGIGR >gi|157101654|gb|DS480670.1| GENE 112 119468 - 120691 1431 407 aa, chain - ## HITS:1 COG:FN1412 KEGG:ns NR:ns ## COG: FN1412 COG4857 # Protein_GI_number: 19704744 # Func_class: R General function prediction only # Function: Predicted kinase # Organism: Fusobacterium nucleatum # 3 381 2 382 384 267 38.0 3e-71 MSRFDHYFQMNVGDVIEYTLEKTTEIPWDRDSMEAVVPPSHGNLNYVYRVWDGKGHSIYI KQAGSEARISKDIKPSRDRNRLESEILMLEDQYAPGMVPHVYFFDTVMCACGMEDCSDYA VMRDAMLKHETFPGFAEDVSTFMVNTLLLSSDVVMDHKEKKELVKKFISPDLCDITEKLV LMEPYMDLYNRNNVYAPNRDFVKKELYEDEALHLEVSKLKFRFMTDAQALLHGDLHTGSI FIRQDSTKIFDCEFGTYAPMGYDVGNVVANLIFAYDNGLSTDDGPFCDWCLKTIEETVDL FIEKFGRKYDEAVTEPMAKVKGFKEWYFDGILKDTAGYAGTELHRRTVGMANVVDVTTIQ DEKKRLLAERINILAGKEYIMNQGAFRTGADYVAAVLRAKEAAQKTL >gi|157101654|gb|DS480670.1| GENE 113 120709 - 121686 1122 325 aa, chain - ## HITS:1 COG:YPO1553 KEGG:ns NR:ns ## COG: YPO1553 COG1172 # Protein_GI_number: 16121826 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 4 323 14 331 331 218 41.0 2e-56 MKSRTNIKRFLVDWAAVLALMVCFVAFTAYKGNSFMSTSNMVNILRAMAINTVFGIAATI TMAPDGFDMSAGTLASCSAYVFVSAYLWLGQSLGMSILICILATLVMYQLTMFLILVCKI PDMLATCALMFVHQGIGQWYIGGGAVSTGMKTSWGAAPARTALSESFSAIGRAPWIIIIM LGCVLVAYLFLNFTKHGRYLYAMGGNKEAAKLSGIDVKKYRYLAGMITAVFIAIGGILVA SRGSSAQVMCCDNYLMPSLAAVFVGRTVGGAEKPNALGTMIGAMLVSTLENGLTICAVPF YVLPAVKGAVLALALIAAYASKKED >gi|157101654|gb|DS480670.1| GENE 114 121710 - 123245 1546 511 aa, chain - ## HITS:1 COG:AGc5112 KEGG:ns NR:ns ## COG: AGc5112 COG1129 # Protein_GI_number: 15890066 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 503 24 515 521 373 41.0 1e-103 MDRIKLETKHVSIEFPGVKALDDINFEVTTGEIRAVVGANGAGKSTLMKVLAGANSTYTG 
EVLLNGKKVEVRTPVDAKKLGIQIVYQEVDMALYPTLSVAENIVQNDMVMGKSGYIVNWK KAYKQGREALDRLHIGRDQVDEHELVQNLSLAQKQMVLIAKAIRSECNFLILDEPTAPLS NTETEELFRIVKHLHETENIAILFISHRLNEILEVCENYTVMRNGKMIDTTPVTGETTTK EIVEKMLGRKFEENFPKEACQIGDVLFQTEHLSGAGGKVKDVSIQVKKGEVVGIAGLVGA GKSELCKTIFGAYKKTGGKVLLKNRELKIANPSGAVKNRMALVPEERRKEGVLVNENVSF NLSAACLSRFCTGPFIRRRKVDDNAKRFVKDLGISTPSVRQQVKNLSGGNQQKVAVGKWL AADCDVYIFDEPTKGVDVGAKQDIFHLINEIAKQGNCVIYASCENSELLSLTDRMYVMYN GTVMAELETANTTEDEIMYYSVGGKDNKNGK >gi|157101654|gb|DS480670.1| GENE 115 123372 - 124604 1422 410 aa, chain - ## HITS:1 COG:YPO1517 KEGG:ns NR:ns ## COG: YPO1517 COG1879 # Protein_GI_number: 16121790 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 100 345 74 306 363 78 25.0 3e-14 MRKMKKVVSVIAAAAMAASLLAGCGGSGGSSTTAAETKAAETTAAAAETTAAEAAADTTA APAAAENPVAGKKVAYIMLLPSATIFQMWKDSCASLCDALDVKFDFFFCDGDFNKWQDTI RTCASAGYDGLLVSHGNQDGSYVFLKEITEQYPDMKIVTFDTQFYSDGEYQKLPGVTQMF QQDKSLVTVLLDQMIEKYGEGVRLIKVWRGPNYNSPFDRREVGWQEYEAAGKIVTVGEVQ PLQDSVDSANTVTAAYLQGVNRADVDGVIAYYDLYGQGVYNAIAENDNFNGKNGDALPMA SVDIDPVDVTNMQTRPDIWTAAGTTDWTMNGEIGMRILMLELADQYDKIFDPATGESGVD VVEVPGTAIKADALKSDSTVENLGNIAGETYGNLDYLSAADWMPKDLIHK >gi|157101654|gb|DS480670.1| GENE 116 124965 - 125966 901 333 aa, chain + ## HITS:1 COG:VC1721 KEGG:ns NR:ns ## COG: VC1721 COG1609 # Protein_GI_number: 15641725 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 2 332 3 333 336 127 29.0 4e-29 MTVRELAKQIGVSPATISIVLNGKKGVSEDTRKKVLDAVKACQYMPPARKPKSGKNVLLV KYYKSGMLVEENQGFISMIIDSIEEQLRAEQLGMTMTVVKTGLKSALDSIDYSKYCGMFL IATEVMRDEFSALDSIPIPFVVVDNTVPNHYCSSVCMNNAENVHIALQYCKECGHTELGY LGSTTGAENFNERHIAFLRYVKELNFQFDSKNEFRVKPTMLGAHDDFFRILDQNPVLPSC FFAENDTIALGAMKALKEKGYKIPNDVSLIGFDDIPYSSISSPTLTTIHVQRKIMGKQSV IQLMQLIEDPRFMPMKTQITGKLVERSSVKHLA >gi|157101654|gb|DS480670.1| GENE 117 126047 - 127060 763 337 aa, chain - ## HITS:1 COG:no KEGG:Closa_4231 NR:ns ## KEGG: Closa_4231 # Name: 
not_defined # Def: Exonuclease RNase T and DNA polymerase III # Organism: C.saccharolyticum # Pathway: not_defined # 2 328 3 327 330 381 58.0 1e-104 MNYIVFDLEWNQSPNGKEDSVEHLPFEIIEIGAVKLNGNFEQTGTFHKLIRPKVYKKMHF KISEVTHMDMAELRQEGEPFDVVMNRFLAWCGEEEYCFCTWGSMDLTELQRNMAYHKLPN PFPRPLLYLDIQKLYCLQYGDGKNKVSLDMAVQIQGMEEERPFHRALDDAYYTGRILSVL DMETFGTYVSVDYYGLPRNKAEEYRLHFPEYSKYVSREFDSREDILKDKDITDMKCVKCS RMLRKKIRWFPYGQRFYFCLAVCPEHGYVKGKIRVKKSEDDRLYAVKTAKMADAQDVELL AQKKEDGRKKRSAKAHHKAASKSLKGKSKQNQPDKVS >gi|157101654|gb|DS480670.1| GENE 118 127360 - 127989 183 209 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935986|ref|ZP_02083360.1| ## NR: gi|160935986|ref|ZP_02083360.1| hypothetical protein CLOBOL_00883 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00883 [Clostridium bolteae ATCC BAA-613] # 1 209 10 218 218 429 100.0 1e-119 MIKLIQAACQWGQKDFDGWVIGKARYCIHSHSEPVSCESIPQKAAQGYVLLETTSYDACR QPCQSLEICLVDIKVFRDSWGAKASTFPASQAASPATSGCMASGCMPGQMTMTFPQKCLD RFLADSGDSNPIHHGPNAVIPGLWILSRLEEMYGSHRPAETLSIRFLRPVHTGGSVRLEQ KNNIVTGTMGSATCFTMTIHTSDQTQKRR >gi|157101654|gb|DS480670.1| GENE 119 127992 - 128543 418 183 aa, chain + ## HITS:1 COG:MA4340 KEGG:ns NR:ns ## COG: MA4340 COG1268 # Protein_GI_number: 20093128 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 5 169 11 182 187 95 38.0 4e-20 MNQKFNTVELTKMALLTALICVSAYIVIPLPFTPASLTAQTLVVNLIALLLTPRQAAFTM VVYILLGLTGLPVFSGAMGGPGKLFGPTGGYIMSWIPAVILMSRLKGTYYSFKRYCLVTI LVGMPVIYLVGSAYMKFITGMDWAATFTAAVIPFIPLDIFKCFAAALIAKPVQISLSNAQ RAR >gi|157101654|gb|DS480670.1| GENE 120 128554 - 129747 815 397 aa, chain + ## HITS:1 COG:SPy0524 KEGG:ns NR:ns ## COG: SPy0524 COG0183 # Protein_GI_number: 15674625 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Streptococcus pyogenes M1 GAS # 1 373 1 339 382 257 42.0 2e-68 MDKVFILGGLRSYIGVRNSAYRHVPAEHLGAAVLKELTNRYQPSKIDMIICGNCVGGGGN 
ITRLMALEAGLSESIPSFTVDLQCASSLEAVITAAARIQSGLADLVIAGGFESSSTQPMR CYNPNHPYAAAAHDDTLSGNPAGICNAHRTELSLTYSTAKFIPGPHREDVMLQGAEKTIC HYHITPEEMDAWVLESHRRAYRTAREGLLDGIIVPVYGLAHDEGIRPRLNQRLLDRLPCV LKDGRYLNAANACTMNDGAAFLLLCSEQYLKKHKLSSCFRFSNACTVGADPLMSPASVLP AVRGLLERSGLSMDDIGAIECNEAFAAIDVLINRAYPDKASRYNPLGGALAYGHPYGASG AIILLHLMRSMELSKSCRGICCAAAAGGIGSAILLER >gi|157101654|gb|DS480670.1| GENE 121 129793 - 131046 690 417 aa, chain + ## HITS:1 COG:SPy0526 KEGG:ns NR:ns ## COG: SPy0526 COG0318 # Protein_GI_number: 15674626 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Streptococcus pyogenes M1 GAS # 16 416 13 408 415 171 30.0 3e-42 MKHSTDYLSLIQSIPPHRTALVENGRFYTYSMLTREACRIRSLTPEPSVPSAAWIRAASV HGQLVQFLALSGTSHIPVIVPADMKYVPESLITEGLPAGACMGVLTSGSTGQPKLWFRTL DSWASFFPTQNAIFGMTAQTRLFAHGSLAFTGNLNLYLSLLSLGASIITASHVNPFVWCR EIELHHADALYLIPSKLRLLAKYISHPSPDIKTILAGSQSLGHGDIVLLKAAYPNSCCFL YYGASELSYVTWLTDQEMNDNPACIGKPFPGVDVTLRNGEIYVDTPYSAIGITGPYSVGD MGYTDHKGYLYFNGRKDNVYNIHGRKVSAVKIENALNSLTQIQEAAVILQNNALQAHVVL TVPNPSGQDAVSLRRIIKRGLNDYLESWEIPRDIIFHENLPKNNSGKTVKRLLSAKE >gi|157101654|gb|DS480670.1| GENE 122 131051 - 132136 1090 361 aa, chain - ## HITS:1 COG:no KEGG:PRU_1044 NR:ns ## KEGG: PRU_1044 # Name: not_defined # Def: hydrolase domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 6 361 1 351 355 196 36.0 2e-48 MRTRKMKRWLSIILAGLAAVILIAAAGGAFLFRHELKTLHSLKKVDDNVLYTMKYDGDYG FDEFLETGASSDSELVEFVTNRLLKGIPLEFSIPDLGCSTFSAQTEDGARIFGRNFDLTY SPAMFVLTEPANGYRSMSTVNLAFLGFGEDKLPDTLKRKIITLAAPYAPLDGVNEKGLAV AVLRIGDEPTNQDTGKTDITTTTAIRLMLDKAANVDEALELLAQYDMHSSAGSCYHFQLA DALGNSAVAEYIDNEFEVIGKKGDYQAATNFLLSEKKFNFGNGQDRYQILEQALGECAGI VRDEQEAMDLLEAASKDWHVSETTGRLNATQWSIVYNCTDLTASVVTGRQYDKPAHEFSL K Prediction of potential genes in microbial genomes Time: Thu Jun 30 16:56:55 2011 Seq name: gi|157101653|gb|DS480671.1| Clostridium bolteae ATCC BAA-613 Scfld_02_12 
genomic scaffold, whole genome shotgun sequence Length of sequence - 287909 bp Number of predicted genes - 249, with homology - 242 Number of transcription units - 112, operones - 50 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1314 801 ## COG1653 ABC-type sugar transport system, periplasmic component - Term 1236 - 1265 -0.9 2 2 Op 1 1/0.147 - CDS 1409 - 2974 851 ## COG0747 ABC-type dipeptide transport system, periplasmic component 3 2 Op 2 . - CDS 2974 - 5556 1183 ## COG0642 Signal transduction histidine kinase - Prom 5626 - 5685 4.4 4 3 Tu 1 . - CDS 5688 - 10502 2978 ## COG1352 Methylase of chemotaxis methyl-accepting proteins - Term 11027 - 11081 2.2 5 4 Tu 1 . - CDS 11119 - 12591 390 ## ELI_2097 regulatory protein GntR - Prom 12766 - 12825 6.9 6 5 Tu 1 . - CDS 13056 - 14318 788 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase - Prom 14369 - 14428 7.2 + Prom 14308 - 14367 8.8 7 6 Tu 1 . + CDS 14527 - 15195 712 ## COG1802 Transcriptional regulators + Term 15205 - 15256 6.2 + Prom 15219 - 15278 5.5 8 7 Op 1 . + CDS 15299 - 16630 1249 ## COG0161 Adenosylmethionine-8-amino-7-oxononanoate aminotransferase 9 7 Op 2 11/0.000 + CDS 16649 - 17131 328 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 10 7 Op 3 9/0.000 + CDS 17132 - 18421 796 ## PROTEIN SUPPORTED gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 11 7 Op 4 . + CDS 18488 - 19561 314 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 + Term 19574 - 19611 5.5 12 8 Op 1 6/0.000 + CDS 19635 - 20981 735 ## COG3395 Uncharacterized protein conserved in bacteria 13 8 Op 2 . + CDS 20984 - 21580 469 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 14 9 Tu 1 . - CDS 21776 - 22858 768 ## COG2008 Threonine aldolase - Prom 22985 - 23044 5.4 + Prom 22949 - 23008 6.7 15 10 Tu 1 . + CDS 23072 - 23950 600 ## COG0583 Transcriptional regulator + Term 24123 - 24159 -0.1 16 11 Tu 1 . 
- CDS 23995 - 25062 802 ## COG3835 Sugar diacid utilization regulator - Prom 25112 - 25171 5.6 17 12 Op 1 1/0.147 + CDS 25428 - 26444 989 ## COG2195 Di- and tripeptidases 18 12 Op 2 . + CDS 26471 - 27868 1431 ## COG1288 Predicted membrane protein + Prom 27893 - 27952 10.7 19 13 Tu 1 . + CDS 28017 - 29192 1223 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase + Term 29220 - 29280 13.2 + Prom 29248 - 29307 8.1 20 14 Tu 1 . + CDS 29344 - 30399 503 ## COG1609 Transcriptional regulators + Term 30530 - 30569 1.1 + Prom 30538 - 30597 10.0 21 15 Op 1 . + CDS 30625 - 31827 833 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase 22 15 Op 2 . + CDS 31858 - 32910 374 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 23 15 Op 3 . + CDS 32922 - 33425 418 ## TepRe1_0135 Tripartite ATP-independent periplasmic transporter subunit DctQ 24 15 Op 4 . + CDS 33422 - 34687 733 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 25 15 Op 5 1/0.147 + CDS 34718 - 35899 574 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase 26 15 Op 6 . + CDS 35901 - 36683 560 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 27 15 Op 7 . + CDS 36709 - 38367 711 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 28 16 Tu 1 . - CDS 38394 - 39773 1047 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs - Prom 39835 - 39894 7.7 + Prom 39827 - 39886 8.2 29 17 Tu 1 . + CDS 39930 - 41108 881 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs 30 18 Op 1 40/0.000 - CDS 41175 - 43154 1402 ## COG0642 Signal transduction histidine kinase 31 18 Op 2 . 
- CDS 43183 - 43869 516 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 43912 - 43971 6.7 + Prom 44057 - 44116 6.1 32 19 Op 1 9/0.000 + CDS 44173 - 44907 741 ## COG3279 Response regulator of the LytR/AlgR family 33 19 Op 2 . + CDS 44927 - 45577 657 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Term 45668 - 45723 -1.0 + Prom 45647 - 45706 1.9 34 20 Op 1 . + CDS 45739 - 45987 213 ## gi|160936033|ref|ZP_02083406.1| hypothetical protein CLOBOL_00929 35 20 Op 2 . + CDS 46004 - 46300 476 ## Fisuc_0373 anti-sigma-factor antagonist 36 20 Op 3 . + CDS 46326 - 48056 1609 ## COG0534 Na+-driven multidrug efflux pump 37 20 Op 4 . + CDS 48108 - 48704 368 ## TherJR_1559 cell wall hydrolase SleB 38 20 Op 5 . + CDS 48723 - 50414 1293 ## SOR_0328 cell wall surface anchor family protein + Prom 50439 - 50498 2.8 39 21 Op 1 . + CDS 50542 - 53994 2324 ## COG5263 FOG: Glucan-binding domain (YG repeat) 40 21 Op 2 . + CDS 54047 - 54877 784 ## COG0223 Methionyl-tRNA formyltransferase 41 21 Op 3 5/0.000 + CDS 54883 - 56532 1637 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit 42 21 Op 4 . + CDS 56532 - 56936 571 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 43 21 Op 5 . + CDS 56982 - 57890 613 ## DvMF_2421 hypothetical protein 44 21 Op 6 . + CDS 57887 - 58525 532 ## gi|160936043|ref|ZP_02083416.1| hypothetical protein CLOBOL_00939 45 21 Op 7 . + CDS 58513 - 59139 559 ## Mpal_0500 hypothetical protein 46 22 Tu 1 . + CDS 59248 - 59820 524 ## COG1126 ABC-type polar amino acid transport system, ATPase component + Prom 59874 - 59933 4.7 47 23 Tu 1 . + CDS 59993 - 61888 766 ## gi|160936046|ref|ZP_02083419.1| hypothetical protein CLOBOL_00942 + Term 61957 - 61995 9.0 + Prom 62046 - 62105 7.8 48 24 Op 1 . + CDS 62171 - 62875 615 ## Acear_0463 hypothetical protein 49 24 Op 2 . + CDS 62927 - 63331 391 ## Acear_0461 hypothetical protein 50 24 Op 3 . 
+ CDS 63383 - 64147 619 ## COG0388 Predicted amidohydrolase + Term 64291 - 64337 1.2 + Prom 64266 - 64325 3.4 51 25 Tu 1 . + CDS 64528 - 65268 405 ## COG3641 Predicted membrane protein, putative toxin regulator 52 26 Tu 1 . - CDS 65363 - 66289 550 ## COG0583 Transcriptional regulator - Prom 66334 - 66393 16.2 + Prom 66319 - 66378 11.5 53 27 Op 1 8/0.000 + CDS 66403 - 66591 139 ## COG1146 Ferredoxin 54 27 Op 2 23/0.000 + CDS 66594 - 67724 876 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 55 27 Op 3 22/0.000 + CDS 67724 - 68536 786 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 56 27 Op 4 . + CDS 68536 - 69120 565 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 57 27 Op 5 . + CDS 69145 - 70683 1168 ## COG0591 Na+/proline symporter 58 27 Op 6 3/0.000 + CDS 70676 - 71593 731 ## COG0280 Phosphotransacetylase 59 27 Op 7 . + CDS 71593 - 72714 908 ## COG3426 Butyrate kinase + Term 72785 - 72823 -0.7 60 28 Op 1 9/0.000 - CDS 72715 - 73596 704 ## COG1760 L-serine deaminase 61 28 Op 2 . - CDS 73586 - 74278 712 ## COG1760 L-serine deaminase - Prom 74408 - 74467 9.5 + Prom 74367 - 74426 9.2 62 29 Tu 1 . + CDS 74454 - 75368 818 ## COG0583 Transcriptional regulator + Term 75540 - 75601 11.7 63 30 Op 1 . + CDS 75770 - 76789 849 ## Pjdr2_1991 extracellular solute-binding protein family 1 64 30 Op 2 . + CDS 76810 - 77757 736 ## Pjdr2_1989 putative sensor with HAMP domain + Prom 77811 - 77870 5.6 65 31 Op 1 . + CDS 77927 - 78643 522 ## COG2045 Phosphosulfolactate phosphohydrolase and related enzymes 66 31 Op 2 . + CDS 78669 - 80117 1552 ## Bsel_2944 ABC-type sugar transporter periplasmic component-like protein 67 31 Op 3 . + CDS 80126 - 82300 1590 ## COG2199 FOG: GGDEF domain + Term 82335 - 82368 -0.5 - Term 82117 - 82161 -1.0 68 32 Tu 1 . 
- CDS 82360 - 83250 729 ## COG0583 Transcriptional regulator - Prom 83368 - 83427 9.6 + Prom 83305 - 83364 6.2 69 33 Op 1 35/0.000 + CDS 83543 - 85354 202 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 70 33 Op 2 . + CDS 85354 - 87330 196 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Prom 87365 - 87424 6.4 71 34 Tu 1 . + CDS 87471 - 87920 417 ## gi|160936072|ref|ZP_02083445.1| hypothetical protein CLOBOL_00968 + Prom 87930 - 87989 7.3 72 35 Tu 1 . + CDS 88036 - 89415 1091 ## COG4826 Serine protease inhibitor + Term 89637 - 89682 0.6 + Prom 89468 - 89527 6.7 73 36 Op 1 . + CDS 89757 - 90665 907 ## Clocel_3685 hypothetical protein 74 36 Op 2 40/0.000 + CDS 90662 - 91321 782 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 75 36 Op 3 . + CDS 91318 - 92730 1460 ## COG0642 Signal transduction histidine kinase 76 36 Op 4 . + CDS 92727 - 93578 759 ## gi|160936078|ref|ZP_02083451.1| hypothetical protein CLOBOL_00974 + Term 93616 - 93663 9.1 + Prom 93584 - 93643 4.2 77 37 Tu 1 . + CDS 93675 - 95210 1102 ## COG0860 N-acetylmuramoyl-L-alanine amidase 78 38 Tu 1 . - CDS 95214 - 95774 385 ## COG0110 Acetyltransferase (isoleucine patch superfamily) - Prom 95802 - 95861 9.2 + Prom 95766 - 95825 9.1 79 39 Op 1 6/0.000 + CDS 95942 - 96940 765 ## COG1609 Transcriptional regulators + Term 96942 - 96982 0.9 + Prom 96947 - 97006 2.4 80 39 Op 2 16/0.000 + CDS 97028 - 98029 672 ## COG1879 ABC-type sugar transport system, periplasmic component 81 39 Op 3 21/0.000 + CDS 98069 - 99583 191 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 82 39 Op 4 . + CDS 99597 - 100553 709 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 83 39 Op 5 . + CDS 100634 - 102016 916 ## COG2407 L-fucose isomerase and related proteins 84 39 Op 6 . 
+ CDS 102071 - 102964 463 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 85 39 Op 7 . + CDS 102961 - 103998 621 ## gi|160936087|ref|ZP_02083460.1| hypothetical protein CLOBOL_00983 + Term 104006 - 104037 2.1 + Prom 104020 - 104079 5.3 86 40 Op 1 . + CDS 104127 - 104522 368 ## gi|160936088|ref|ZP_02083461.1| hypothetical protein CLOBOL_00984 87 40 Op 2 . + CDS 104536 - 104862 247 ## gi|160936089|ref|ZP_02083462.1| hypothetical protein CLOBOL_00985 88 40 Op 3 7/0.000 + CDS 104855 - 105691 548 ## COG1352 Methylase of chemotaxis methyl-accepting proteins 89 40 Op 4 8/0.000 + CDS 105684 - 106049 467 ## COG0784 FOG: CheY-like receiver 90 40 Op 5 . + CDS 106069 - 107745 1629 ## COG0840 Methyl-accepting chemotaxis protein 91 40 Op 6 8/0.000 + CDS 107764 - 108387 691 ## COG1776 Chemotaxis protein CheC, inhibitor of MCP methylation 92 40 Op 7 . + CDS 108380 - 108856 358 ## COG1871 Chemotaxis protein; stimulates methylation of MCP proteins 93 40 Op 8 . + CDS 108880 - 109812 893 ## COG3706 Response regulator containing a CheY-like receiver domain and a GGDEF domain 94 40 Op 9 . + CDS 109828 - 111855 1846 ## COG0643 Chemotaxis protein histidine kinase and related kinases 95 40 Op 10 . + CDS 111845 - 112285 362 ## gi|160936097|ref|ZP_02083470.1| hypothetical protein CLOBOL_00993 + Term 112294 - 112343 8.1 - Term 112282 - 112330 9.1 96 41 Op 1 . - CDS 112338 - 112676 395 ## EUBELI_20572 hypothetical protein 97 41 Op 2 . - CDS 112698 - 113471 858 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 113535 - 113594 7.6 + Prom 113514 - 113573 9.2 98 42 Op 1 40/0.000 + CDS 113808 - 114518 781 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 99 42 Op 2 . + CDS 114490 - 117123 2605 ## COG0642 Signal transduction histidine kinase 100 43 Tu 1 . 
- CDS 117232 - 118842 1875 ## COG2268 Uncharacterized protein conserved in bacteria - Prom 119046 - 119105 5.3 - Term 119057 - 119098 -1.0 101 44 Tu 1 . - CDS 119193 - 119423 101 ## - Prom 119454 - 119513 2.4 + Prom 119219 - 119278 4.4 102 45 Op 1 23/0.000 + CDS 119485 - 120000 650 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit 103 45 Op 2 1/0.147 + CDS 120000 - 123101 3307 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 104 45 Op 3 . + CDS 123112 - 124905 1874 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain + Term 124968 - 125017 6.5 + Prom 125097 - 125156 5.4 105 46 Op 1 . + CDS 125274 - 125414 58 ## gi|160936109|ref|ZP_02083482.1| hypothetical protein CLOBOL_01005 106 46 Op 2 . + CDS 125462 - 125947 499 ## CLM_0794 hypothetical protein 107 46 Op 3 . + CDS 125988 - 126086 86 ## + Term 126125 - 126166 9.1 - Term 126108 - 126159 14.6 108 47 Tu 1 . - CDS 126192 - 127934 1285 ## COG0165 Argininosuccinate lyase + Prom 128156 - 128215 6.1 109 48 Op 1 . + CDS 128270 - 128794 472 ## Dret_0071 DctQ (C4-dicarboxylate permease, small subunit) 110 48 Op 2 9/0.000 + CDS 128791 - 130101 618 ## PROTEIN SUPPORTED gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 111 48 Op 3 . + CDS 130137 - 131183 1092 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component 112 48 Op 4 . + CDS 131209 - 131604 323 ## Amet_3596 GrdX protein 113 48 Op 5 . + CDS 131626 - 132912 1233 ## STH2870 glycine reductase proprotein 114 48 Op 6 . + CDS 132925 - 133968 1081 ## Toce_1618 selenoprotein B, glycine/betaine/sarcosine/D-proline reductase family 115 48 Op 7 . + CDS 133996 - 134226 207 ## BP951000_1856 glycine/sarcosine/betaine reductase selenoprotein GrdB 116 48 Op 8 . + CDS 134238 - 134495 92 ## gi|160936119|ref|ZP_02083492.1| hypothetical protein CLOBOL_01015 117 48 Op 9 . 
+ CDS 134492 - 134569 76 ## + Prom 134619 - 134678 7.9 118 49 Op 1 8/0.000 + CDS 134701 - 136089 1401 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 119 49 Op 2 10/0.000 + CDS 136104 - 137402 1219 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 120 49 Op 3 8/0.000 + CDS 137446 - 137748 386 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 121 49 Op 4 . + CDS 137748 - 138056 412 ## COG1447 Phosphotransferase system cellobiose-specific component IIA 122 49 Op 5 . + CDS 138077 - 138910 759 ## COG0406 Fructose-2,6-bisphosphatase + Term 138940 - 138995 15.6 - Term 138927 - 138981 16.1 123 50 Op 1 . - CDS 139001 - 140884 1140 ## COG3711 Transcriptional antiterminator 124 50 Op 2 1/0.147 - CDS 141035 - 142780 1831 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 125 50 Op 3 1/0.147 - CDS 142799 - 144589 1956 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 126 50 Op 4 . - CDS 144604 - 144981 374 ## COG3411 Ferredoxin 127 50 Op 5 . - CDS 145048 - 145146 66 ## 128 50 Op 6 . - CDS 145163 - 145720 376 ## COG0642 Signal transduction histidine kinase 129 50 Op 7 . - CDS 145725 - 146219 417 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit - Prom 146301 - 146360 8.0 + Prom 146329 - 146388 5.8 130 51 Op 1 . + CDS 146434 - 146781 411 ## Closa_3389 DRTGG domain protein 131 51 Op 2 . + CDS 146877 - 147299 559 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 132 51 Op 3 . + CDS 147330 - 148724 1604 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 133 51 Op 4 . + CDS 148783 - 149115 419 ## Closa_3386 hypothetical protein 134 51 Op 5 . 
+ CDS 149112 - 149822 595 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) + Prom 149923 - 149982 3.7 135 52 Op 1 39/0.000 + CDS 150057 - 151046 1201 ## COG0226 ABC-type phosphate transport system, periplasmic component 136 52 Op 2 38/0.000 + CDS 151124 - 151990 704 ## COG0573 ABC-type phosphate transport system, permease component 137 52 Op 3 41/0.000 + CDS 152072 - 152914 600 ## COG0581 ABC-type phosphate transport system, permease component 138 52 Op 4 . + CDS 152899 - 153648 216 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 139 53 Tu 1 . + CDS 153760 - 154227 594 ## COG4087 Soluble P-type ATPase + Prom 154231 - 154290 3.5 140 54 Op 1 . + CDS 154345 - 154485 104 ## gi|160936143|ref|ZP_02083516.1| hypothetical protein CLOBOL_01039 + Term 154491 - 154547 1.1 141 54 Op 2 . + CDS 154589 - 155182 649 ## gi|160936145|ref|ZP_02083518.1| hypothetical protein CLOBOL_01041 + Term 155249 - 155295 12.8 - Term 155235 - 155283 -0.5 142 55 Tu 1 . - CDS 155291 - 155524 282 ## Closa_1058 hypothetical protein - Prom 155729 - 155788 4.1 + Prom 155429 - 155488 2.7 143 56 Op 1 28/0.000 + CDS 155635 - 156861 1110 ## COG0420 DNA repair exonuclease 144 56 Op 2 . + CDS 156858 - 159818 2957 ## COG0419 ATPase involved in DNA repair 145 57 Op 1 . + CDS 159922 - 160389 423 ## COG3467 Predicted flavin-nucleotide-binding protein 146 57 Op 2 . + CDS 160415 - 160669 175 ## CLJU_c17650 hypothetical protein 147 57 Op 3 . + CDS 160688 - 161290 504 ## COG1357 Uncharacterized low-complexity proteins 148 57 Op 4 . + CDS 161340 - 161966 947 ## COG3404 Methenyl tetrahydrofolate cyclohydrolase 149 57 Op 5 . 
+ CDS 161999 - 162844 1002 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase + Prom 162863 - 162922 5.6 150 58 Op 1 31/0.000 + CDS 162982 - 163842 1344 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 164011 - 164039 1.0 151 58 Op 2 34/0.000 + CDS 164065 - 164757 926 ## COG0765 ABC-type amino acid transport system, permease component 152 58 Op 3 . + CDS 164747 - 165505 259 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein + Term 165661 - 165720 1.4 - Term 165938 - 165994 2.8 153 59 Tu 1 . - CDS 166124 - 166621 543 ## gi|160936159|ref|ZP_02083532.1| hypothetical protein CLOBOL_01055 - Prom 166696 - 166755 6.8 + Prom 166692 - 166751 7.6 154 60 Op 1 . + CDS 166837 - 168714 1205 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 + Term 168766 - 168807 0.2 + Prom 168846 - 168905 3.4 155 60 Op 2 . + CDS 168929 - 169213 384 ## gi|160936161|ref|ZP_02083534.1| hypothetical protein CLOBOL_01057 + Term 169240 - 169285 7.2 + Prom 169287 - 169346 7.5 156 61 Op 1 15/0.000 + CDS 169466 - 170758 1592 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein + Term 170805 - 170842 5.3 157 61 Op 2 24/0.000 + CDS 170886 - 172418 2077 ## COG3845 ABC-type uncharacterized transport systems, ATPase components 158 61 Op 3 26/0.000 + CDS 172411 - 173538 1170 ## COG4603 ABC-type uncharacterized transport system, permease component 159 61 Op 4 . + CDS 173551 - 174531 1296 ## COG1079 Uncharacterized ABC-type transport system, permease component + Prom 174563 - 174622 3.2 160 62 Tu 1 . + CDS 174647 - 174802 65 ## + Term 174919 - 174953 1.2 + Prom 174983 - 175042 6.8 161 63 Op 1 . + CDS 175087 - 176799 1399 ## COG4187 Arginine degradation protein (predicted deacylase) 162 63 Op 2 . 
+ CDS 176903 - 179728 3125 ## COG0178 Excinuclease ATPase subunit + Term 179772 - 179808 5.0 + Prom 179776 - 179835 3.7 163 64 Tu 1 . + CDS 180072 - 180830 867 ## COG1802 Transcriptional regulators + Term 180885 - 180925 -0.2 - Term 180869 - 180914 1.1 164 65 Tu 1 . - CDS 181002 - 182075 904 ## Spirs_0032 hypothetical protein - Prom 182201 - 182260 6.0 + Prom 182209 - 182268 11.0 165 66 Op 1 49/0.000 + CDS 182400 - 183347 277 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 166 66 Op 2 44/0.000 + CDS 183357 - 184265 1352 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 167 66 Op 3 44/0.000 + CDS 184297 - 185265 1177 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 168 66 Op 4 7/0.000 + CDS 185283 - 186257 1163 ## COG4608 ABC-type oligopeptide transport system, ATPase component 169 66 Op 5 . + CDS 186349 - 187953 1894 ## COG0747 ABC-type dipeptide transport system, periplasmic component 170 66 Op 6 . + CDS 187978 - 188739 856 ## COG1402 Uncharacterized protein, putative amidase + Prom 188812 - 188871 3.9 171 67 Tu 1 . + CDS 189057 - 190121 1085 ## COG1363 Cellulase M and related proteins + Term 190366 - 190403 4.9 172 68 Tu 1 . + CDS 190649 - 192190 1439 ## COG0519 GMP synthase, PP-ATPase domain/subunit + Term 192201 - 192245 10.2 + Prom 192287 - 192346 7.4 173 69 Tu 1 . + CDS 192406 - 192498 123 ## + Term 192550 - 192600 0.0 + Prom 192555 - 192614 3.9 174 70 Tu 1 . + CDS 192746 - 193087 327 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family 175 71 Op 1 9/0.000 + CDS 193289 - 194029 657 ## COG3279 Response regulator of the LytR/AlgR family 176 71 Op 2 . + CDS 193995 - 195305 1132 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Term 195337 - 195368 0.9 + Prom 195333 - 195392 6.6 177 72 Op 1 . + CDS 195465 - 196217 659 ## gi|160936187|ref|ZP_02083560.1| hypothetical protein CLOBOL_01083 178 72 Op 2 . 
+ CDS 196238 - 197269 826 ## gi|160936188|ref|ZP_02083561.1| hypothetical protein CLOBOL_01084 + Term 197334 - 197392 2.2 179 73 Tu 1 . - CDS 197421 - 197927 180 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 198233 - 198292 6.2 180 74 Op 1 . + CDS 198378 - 199316 864 ## Nther_1480 transcriptional regulator, GntR family 181 74 Op 2 44/0.000 + CDS 199406 - 200389 1011 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component + Prom 200434 - 200493 2.4 182 74 Op 3 7/0.000 + CDS 200524 - 201483 751 ## COG4608 ABC-type oligopeptide transport system, ATPase component 183 74 Op 4 38/0.000 + CDS 201502 - 203061 1562 ## COG0747 ABC-type dipeptide transport system, periplasmic component 184 74 Op 5 49/0.000 + CDS 203120 - 204088 1038 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 185 74 Op 6 . + CDS 204063 - 204959 1059 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components + Term 205048 - 205096 4.1 186 75 Op 1 . + CDS 205129 - 206310 1170 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 187 75 Op 2 . + CDS 206373 - 207605 916 ## COG0006 Xaa-Pro aminopeptidase + Term 207640 - 207681 8.1 188 76 Tu 1 . - CDS 207629 - 207796 70 ## gi|160936203|ref|ZP_02083576.1| hypothetical protein CLOBOL_01099 - Prom 208010 - 208069 1.7 + Prom 207648 - 207707 6.2 189 77 Op 1 . + CDS 207728 - 208645 886 ## COG0598 Mg2+ and Co2+ transporters 190 77 Op 2 . + CDS 208689 - 209357 570 ## Rumal_2048 hypothetical protein + Prom 209426 - 209485 6.6 191 78 Tu 1 . + CDS 209589 - 209669 58 ## + Term 209670 - 209707 2.5 192 79 Tu 1 . + CDS 209779 - 210729 692 ## COG2267 Lysophospholipase + Prom 210733 - 210792 3.8 193 80 Op 1 . + CDS 210828 - 213806 2358 ## COG1221 Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain 194 80 Op 2 . 
+ CDS 213853 - 214491 673 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase 195 80 Op 3 13/0.000 + CDS 214488 - 214961 624 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 196 80 Op 4 10/0.000 + CDS 214989 - 215267 413 ## COG3414 Phosphotransferase system, galactitol-specific IIB component 197 80 Op 5 . + CDS 215356 - 216723 1722 ## COG3775 Phosphotransferase system, galactitol-specific IIC component 198 80 Op 6 . + CDS 216710 - 218347 1674 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase + Prom 218479 - 218538 8.1 199 81 Tu 1 . + CDS 218585 - 219289 559 ## Swol_1999 hypothetical protein 200 82 Tu 1 . - CDS 219307 - 220635 937 ## COG3325 Chitinase - Prom 220723 - 220782 8.1 + Prom 220811 - 220870 6.5 201 83 Tu 1 . + CDS 220929 - 225146 2841 ## COG1201 Lhr-like helicases + Prom 226852 - 226911 4.3 202 84 Tu 1 . + CDS 227015 - 227710 566 ## gi|160936221|ref|ZP_02083594.1| hypothetical protein CLOBOL_01117 + Prom 227791 - 227850 5.3 203 85 Tu 1 . + CDS 228080 - 228553 234 ## Rumal_2970 GCN5-like N-acetyltransferase + Prom 228589 - 228648 4.1 204 86 Tu 1 . + CDS 228676 - 229992 1281 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase - Term 229970 - 229997 -0.8 205 87 Tu 1 . - CDS 230056 - 231162 758 ## COG0006 Xaa-Pro aminopeptidase - Prom 231201 - 231260 10.2 + Prom 231157 - 231216 6.9 206 88 Tu 1 . + CDS 231386 - 232360 1093 ## COG0407 Uroporphyrinogen-III decarboxylase + Term 232569 - 232600 0.1 207 89 Tu 1 . 
- CDS 232379 - 233041 764 ## COG1802 Transcriptional regulators - Prom 233089 - 233148 12.7 + Prom 233045 - 233104 5.0 208 90 Op 1 49/0.000 + CDS 233165 - 234124 279 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 209 90 Op 2 5/0.000 + CDS 234144 - 235031 1058 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 210 90 Op 3 5/0.000 + CDS 235094 - 236785 1962 ## COG0747 ABC-type dipeptide transport system, periplasmic component 211 90 Op 4 44/0.000 + CDS 236846 - 237835 602 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 212 90 Op 5 1/0.147 + CDS 237826 - 238809 877 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 213 90 Op 6 . + CDS 238876 - 240276 1193 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases + Prom 240311 - 240370 4.3 214 90 Op 7 . + CDS 240435 - 241223 1006 ## COG2859 Uncharacterized protein conserved in bacteria + Term 241241 - 241301 19.9 - Term 241234 - 241284 10.3 215 91 Tu 1 . - CDS 241330 - 242691 1605 ## COG0534 Na+-driven multidrug efflux pump - Prom 242722 - 242781 7.8 + Prom 242764 - 242823 6.6 216 92 Tu 1 . + CDS 242865 - 244319 1425 ## COG0069 Glutamate synthase domain 2 + Term 244510 - 244555 1.0 - Term 244122 - 244169 10.7 217 93 Tu 1 . - CDS 244326 - 244718 406 ## Closa_1072 hypothetical protein - Prom 244866 - 244925 5.6 + Prom 244813 - 244872 2.0 218 94 Tu 1 . + CDS 244910 - 245077 65 ## gi|160936239|ref|ZP_02083612.1| hypothetical protein CLOBOL_01135 + Prom 245310 - 245369 6.8 219 95 Op 1 1/0.147 + CDS 245502 - 246833 1511 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases 220 95 Op 2 31/0.000 + CDS 246867 - 247160 396 ## COG0721 Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit 221 95 Op 3 21/0.000 + CDS 247174 - 248694 480 ## PROTEIN SUPPORTED gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 222 95 Op 4 . 
+ CDS 248687 - 250135 1640 ## COG0064 Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) + Term 250175 - 250223 4.0 + Prom 250177 - 250236 6.0 223 96 Op 1 . + CDS 250323 - 251612 1300 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs + Term 251682 - 251729 13.3 + Prom 251625 - 251684 1.5 224 96 Op 2 . + CDS 251802 - 252374 463 ## Closa_1315 hypothetical protein + Term 252392 - 252432 9.8 - Term 252372 - 252425 11.6 225 97 Tu 1 . - CDS 252476 - 253663 1406 ## COG1454 Alcohol dehydrogenase, class IV - Prom 253796 - 253855 7.3 + Prom 253638 - 253697 7.1 226 98 Op 1 21/0.000 + CDS 253897 - 255099 1333 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 227 98 Op 2 . + CDS 255099 - 255980 1027 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs + Term 256049 - 256094 9.6 - Term 256031 - 256087 11.1 228 99 Tu 1 . - CDS 256107 - 257033 707 ## COG0583 Transcriptional regulator - Prom 257168 - 257227 7.4 + Prom 257132 - 257191 4.9 229 100 Op 1 . + CDS 257226 - 257690 533 ## COG1522 Transcriptional regulators 230 100 Op 2 . + CDS 257735 - 258577 923 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase + Prom 258580 - 258639 1.6 231 100 Op 3 . + CDS 258672 - 259634 1036 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 232 101 Op 1 . + CDS 259735 - 260163 491 ## Closa_0818 GtrA family protein 233 101 Op 2 . + CDS 260179 - 263325 2787 ## COG4485 Predicted membrane protein 234 102 Tu 1 . + CDS 263463 - 264854 1627 ## COG0165 Argininosuccinate lyase + Prom 264866 - 264925 7.0 235 103 Tu 1 . + CDS 264964 - 265656 547 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Term 265690 - 265736 1.8 - Term 265494 - 265520 -0.6 236 104 Tu 1 . 
- CDS 265762 - 266775 768 ## Closa_3658 hypothetical protein - Prom 266800 - 266859 2.0 237 105 Op 1 4/0.000 - CDS 266865 - 268256 1343 ## COG3225 ABC-type uncharacterized transport system involved in gliding motility, auxiliary component 238 105 Op 2 24/0.000 - CDS 268262 - 269128 805 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 239 105 Op 3 . - CDS 269125 - 270222 1164 ## COG1131 ABC-type multidrug transport system, ATPase component - Prom 270297 - 270356 10.8 + Prom 270259 - 270318 8.3 240 106 Tu 1 . + CDS 270495 - 272477 2249 ## COG0021 Transketolase + Term 272511 - 272562 17.3 + Prom 272583 - 272642 6.4 241 107 Tu 1 . + CDS 272690 - 276058 3549 ## COG0642 Signal transduction histidine kinase 242 108 Tu 1 . + CDS 276228 - 278486 2352 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases + Term 278513 - 278549 -1.0 - Term 278309 - 278345 -0.8 243 109 Tu 1 . - CDS 278541 - 280592 1679 ## COG0840 Methyl-accepting chemotaxis protein - Prom 280632 - 280691 5.7 - Term 281003 - 281052 6.0 244 110 Tu 1 . - CDS 281086 - 282318 1406 ## COG0137 Argininosuccinate synthase - Prom 282355 - 282414 6.8 + Prom 282394 - 282453 10.9 245 111 Op 1 1/0.147 + CDS 282566 - 283606 1272 ## COG0002 Acetylglutamate semialdehyde dehydrogenase 246 111 Op 2 . + CDS 283679 - 284143 505 ## COG0456 Acetyltransferases + Term 284155 - 284194 2.4 247 112 Op 1 10/0.000 + CDS 284228 - 285472 1589 ## COG1364 N-acetylglutamate synthase (N-acetylornithine aminotransferase) 248 112 Op 2 13/0.000 + CDS 285582 - 286472 1154 ## COG0548 Acetylglutamate kinase 249 112 Op 3 . 
+ CDS 286465 - 287700 1444 ## COG4992 Ornithine/acetylornithine aminotransferase Predicted protein(s) >gi|157101653|gb|DS480671.1| GENE 1 1 - 1314 801 437 aa, chain + ## HITS:1 COG:BH2226 KEGG:ns NR:ns ## COG: BH2226 COG1653 # Protein_GI_number: 15614789 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 84 357 70 347 424 63 25.0 6e-10 TGGDRMKKIRTKRGLLGTLMIAAILGTGCNTEQSVPELYVDDNIRETPVTAFTENIEVSK AIEERCKAVLNQDGNTNIAVYSDSADYYAAEGLSYRELLLKRLASGNADDLYTIHAEDVL EFDEKGYIYDMSSLEFTGNLSHDALSQSTYNGKVFSIPLSYTGFGFIWNVDMLKQYGLKL PENLDEFLDVCETLKENGILPYGANKDFALTVPVMCAGLYDVYQSDDCNQKLEALSNGTV AISEYMEKGFDFLQMMIDKGYMNPERALDTLPKKEEEKSFFANGNCAFICAIYSGKALEG YPFEIAMTPIPVLKNGSICVVGADQRMAINPKSKHLDSAIAIVEALGQAETLDSLAKSQG RISSSKNATAPDIPQTNSFIACIAEGRQIPNQDFSLNFNVWENVRDLSQQLCRGKSAAQV AAEYDSIQMEEISKYGN >gi|157101653|gb|DS480671.1| GENE 2 1409 - 2974 851 521 aa, chain - ## HITS:1 COG:mlr6670 KEGG:ns NR:ns ## COG: mlr6670 COG0747 # Protein_GI_number: 13475567 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Mesorhizobium loti # 43 520 35 511 513 165 27.0 2e-40 MIMKRALTILSVSILAAAWLCGCGKIRKTESRDADVLYHACYSEPYVTLDPSAEQSNGIR ILYNVYETLTHYDDKTGEVIPKLAVEWSSNADATEWVFKLRNDVQFHDGEKMNAETVKKS IDRTIALNKGAAYIWDSVDSIEVTGEYEVTFHLRYGASIPLIASAGYAAYIMSPNAADKD TEWFNAGNDGGSGPYTIATVSARAVTLSSYEGYRGGWEDNQYKNVYIQEISDSGTRRRLL EAGEAHLTSELSAEDLAALSENDGVTILPADSFTNVILMLNTKSSPCSNADFRKALAYAF PYEEAVHDILQGNAAQSCGMIPKGLWGHQDNLTQYHCDLKKSAEYLEKSGLMDATVTVTF IVNDAVYREILQLYKENLAQIGVTLKLLNMDWDAQLALAKSPNPDDCQDILLMKWWPDYA DPAGWFSPLLMSSGDSVGYNFCYLDDDIFGEKNRDAVRLTATNKNEAEKLYIEMQEEILD QCYMIFAYNTLQYYAVNNAIHGVYENPAYQTCVSYYDITKD >gi|157101653|gb|DS480671.1| GENE 3 2974 - 5556 1183 860 aa, chain - ## HITS:1 COG:PA3462_2 KEGG:ns NR:ns ## COG: PA3462_2 COG0642 # Protein_GI_number: 15598658 # Func_class: T Signal transduction mechanisms # Function: Signal 
transduction histidine kinase # Organism: Pseudomonas aeruginosa # 344 615 1 273 385 179 36.0 3e-44 MKIKQGKGRYIALALGNIAVVAFILILSISYVQDVQRDKAATAKESFTAAMESTAQLSYG YMSSLQNECDSWASYLENHEYSMDEAIEYLKEVNINDYVSVHILYYDTLSGLSAGLEQSV NKVDYSQLTDAFSYILPKMVNGTRGEGTIYISSAYINPMDGVNSVSFCSLLTIRDEHGGR AKAILVKTIPVEILHAQWIFPGAYREAEVSLLDINGQYIIQSDSMGGDTFWTFIKQHNDL SYVDIGEYQSSFQKKDHYLVELMDQDGKPAYYVSAQIENTPNCTFVGYIQSEKLSAMGYD YDWQMSLYIVMGFIFLLLLDGSYVLHINKRLRRSIEETQNANMAKTKFLSSMSHDIRTPM NAIIGMTEIAKKQIHNPAQVQYCLDKITLSGNHLLTLVNDVLDISQVESGKMALHPVVFS LPECCANLINIIKSQIQEKELQFDVHVRNLKYEYLYADELRMNQIFINLLTNAVKYTNPG GKIIVEFEEALQPGDKGNVVLTYTVQDTGIGMSSDFMRDMYQTFTRAVDSRIDKVHGSGL GLAITKQMVELMNGTINCESEPGKGTRFIVALPLPATEITDDLMLPPLHLLLADNDKTSL DAAENMLVSMGVAVDTADNGHTAIEMAEAKHNAGEDYPIMIVSLKMPDMNGIEIAGTIRA KVGEDSPIILISAYDWFEMEDAAKAAGINGFINKPMFKSTIYEKINEYLHFTEKNEKTAE SSSDELQGLHLLVAEDNDLNWEIIQELLKYRGITAVRAVNGQVCIDILKHAEPGTFDAVL MDIQMPVMNGYEASAAIRAFPDMYIRNIPIIAMTADAFAEDIRACLDTGMNGHIAKPVDM KKLFKELRNAGLTNGNERKR >gi|157101653|gb|DS480671.1| GENE 4 5688 - 10502 2978 1604 aa, chain - ## HITS:1 COG:MA3542_2 KEGG:ns NR:ns ## COG: MA3542_2 COG1352 # Protein_GI_number: 20092349 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methylase of chemotaxis methyl-accepting proteins # Organism: Methanosarcina acetivorans str.C2A # 221 717 6 530 531 259 34.0 2e-68 MDKMNHGAVKCQLKYCVGIGASAGGVEALQELFRNMPLDTGASFIVVQHLSPDSVSMMDK ILQKSVKMPVRLAEENMELRPNEVYLNIPGMILEVESGHLHLSPVHDRVQRYAPINQMLN SLASDKNIHTIAVILSGSGSDGTIGIGSVKENGGTIIVQKPLEAQYASMPQSAIATGLVD LTENVSKIGLAIRDYLKNPNNQYLHYDDGVENQELAGYYNQIIDAISKYSDIDFSAYKSN TIYRRIERRIAINKYTCIEDYMDHLLASEEEKGLLCSDLLIGVTSFFRDEAAFKSLGEHV LAPLLREKKSIRIWSIACSTGEEAYSIAILLCEYMERLNYNADVKIFASDTDPDAIAAAQ RGFYTEGSLASINELMIEKYFDKKEDGYIIKDMIRKMIVFAKHNIFRDAPFSKLDLIVCR NMFIYVKPEVQQRTIGNFHHLLNDDGCLFLGNSESLGDLEGAFDVLDRKWKIFRKNKEFH TENNGFYLWDNLSHPQKAMQEPAGNLRPKVRPTNIFERLFFDFAGPSVLVDGYGKIVQII KSGGRYLRLQDGQFDNSISSCFAPSLTILLNHIMEQMKESSCSCVEKNVTGVADYPDESL 
NIKVNYFNLDNDEYFLFQIGIGRTAMPDSADKSLDISELTNSRIAQLEYELGESNWRLSL AMEESESRNEELQATNEELMASNEELQSTNEEMQSINEELYTINAEHQSKITELTTAYSD FDNLLVNAEVGALYVDRDMKIRKITPIMLQNTNLLNSDLGRPVYHINFIDSYESFNSDIK KVAETGTIIEREVTDARNSIWLVRIRPYYEKPEQVGGLLVTMFDITRRLEAAKFELKRLT DSVPGGVVRMHYDGELIIDYANDSFYAMTGCSPEDVRYRFHNRLSKMILSEDWTQMEQQI ESAAAGNGILKYEYRGGRDALSARWYSIQATVFQVNGLTELQGIIMDISKLKDYEQRLQR ERDYYNTLYQNMLCGIAQYAYDEDAMDFIGINPESLRILGYSSLEEFKGQGGPTMLDITY KEDRERIVGMLQTLKKEGDCVSFDHRIVRSDGEIRWVTGASKLITSPEGKLLIQSTFIDT TEEKRTLEQLKEERDRYDLLYETSYNMAVCGVIQADVKHNKIIAVNKEALQLLGTDAAGL ETQLFSREAEKGVPNLSSIGEMMQAAAENEHKRFSHILNFKDRYMSIEGAVDYIMEDKSV RVVQFTFLDNTERELLRQAETKLEIATKSSKAKSYFLSKMSHEIRTPLNGIVGMIDSAML YRHDKEKLLDSLNKLKRSSLHLQQLVSEVLDLSKIESGKMDVNMGPVNLQLLLSDVIEEF GTMAKERGIGLTQTGKLHHKYVNTDKVKLHEILANLIGNALKFTDSSGVVLLNIEETVIA ERESLYTFQVRDNGKGISREDQERIFEAFDQGSHDHLYGNSGSGLGLTICKSLVAMLGGS LKVDSIENAGSEFSFTLTMDLLDEEKETDTPVKSCSCYQGHRVMVAEDNLINGEIAETFL RSFGFEVDLVKNGQEAFDTFADSPEGTYSIILMDIQMPVMNGYDAARQIRSCENGDAKSI PILAMSANAFQEDVACSLDAGMNEHLSKPIDMETLFSVIGKYIK >gi|157101653|gb|DS480671.1| GENE 5 11119 - 12591 390 490 aa, chain - ## HITS:1 COG:no KEGG:ELI_2097 NR:ns ## KEGG: ELI_2097 # Name: not_defined # Def: regulatory protein GntR # Organism: E.limosum # Pathway: not_defined # 6 490 4 485 485 432 44.0 1e-119 MGLIYDGLMYERIYEILKDRIESGVLPAGTKLPSRDNLCREFGTSAKTIRRVLSMLKENG LIETHQRKRPVVSFHQQTRRPMINLALKRVDTSITDEVLKTGVLLGYPLIENGIALCRQE DFIIPRKIVENMDINNSAEFWRLAKQFQRFFIRRNENDLSLRVMDSLGLAGLRPLQDNLE IRTRFYEQMQELMRVIENRGDTDSVHFDDLSGMYGLTYGSEPAFDVPVDSAAVLGRKQLE KLLLREEVRYSAVYMDLLGLITMGRYQPGDKLPTHNELQKLYNVSVDTTLKAIQILQEWG VVKAVRSKGIFVMMDIQALKKIEIPPHLIACHVRRYLDTLDLLALTIEGVSAYAAEHISQ KEIEEAMSEIKRCWNEDHRYMLTPSFLLNLIVKHTGDGSLNAIYILLRQNLGIGRSIPAL RETEKTAEDYELYEQCVTALNQLYAGRQEDFSNGTAKAFRYIYDYVTEKCKNLGYYDSAM AVYDGSALWK >gi|157101653|gb|DS480671.1| GENE 6 13056 - 14318 788 420 aa, chain - ## HITS:1 COG:RSc0480 KEGG:ns NR:ns ## COG: RSc0480 COG0334 # Protein_GI_number: 17545199 # Func_class: E Amino acid transport and metabolism # Function: 
Glutamate dehydrogenase/leucine dehydrogenase # Organism: Ralstonia solanacearum # 2 420 20 433 433 407 48.0 1e-113 MSDTYDPYQDVLQIIEDCAKACGYERDDYIALAYPERELKVAIPIEMDDGSIQVFEGYRV QHSTSRGPAKGGIRYHQDVNINEVKALAAWMTFKSAVVDIPFGGGKGGIKVDPSTLSIHE LRRLTRRYTSMIAPIIGPQQDIPAPDVGTNPIVMGWIMDTYSMLNGHCIPGVVTGKPLEL GGAVGRKEATGRGVMFTVHNLIRALDLDISSCTAAIQGFGNVGSTTARLLYESGVRILAV SDVSGGVFCESGLPIVKILEFCKSGALLKEYAVKDSSVTFLSNDKLLALPVTFLIPAALE NQINRTNLETIHAQYIVEAANGPVSMEADALLHDKGILIVPDILANAGGVVVSYFEWVQN IQEMWWTEKKVNQTLEEKMGAAFSDVWETAKIRRISLRKAAYLIAVKRVIDTKKLRGIWP >gi|157101653|gb|DS480671.1| GENE 7 14527 - 15195 712 222 aa, chain + ## HITS:1 COG:BH1062 KEGG:ns NR:ns ## COG: BH1062 COG1802 # Protein_GI_number: 15613625 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 3 181 14 186 226 66 28.0 4e-11 MNELIYEEIKAKIINNELKPNEKLDVDFLSNSLGVSRTPVTNALKSLNKDGYVIINPRSG SYVRELSKEELSCIFNFREALESQVIREVISIADMQVLDGFGDEFRKLLNIEPDGGLDAK KRLVEDFFDIEMRFHEYLIDLCPKIIGDEIKNLIDLTKRIRILHITYNVNSAPLEKFHYE IKIHCQLIEALRQHLLDKCIELIAQDIRNTKNEIMDCFDEIN >gi|157101653|gb|DS480671.1| GENE 8 15299 - 16630 1249 443 aa, chain + ## HITS:1 COG:BH2260 KEGG:ns NR:ns ## COG: BH2260 COG0161 # Protein_GI_number: 15614823 # Func_class: H Coenzyme transport and metabolism # Function: Adenosylmethionine-8-amino-7-oxononanoate aminotransferase # Organism: Bacillus halodurans # 9 423 13 421 445 317 39.0 3e-86 MSNVFYANLDKKYKCVERAEGSWVYDTDGKKYLDCAAGIAVTNIGQGIKEVIDAMYQQAS NVSYVYGGTFTSEAKERLSHQIIELSPEGMDKVFFCSGGSEAVESMGKIARQYQIEAGRP GKYKIISRWQSYHGNTIATLTYGGRPSWREKYDNYLMKMPHIAQCNCYRCPYGLSYPECG LPCAEELERIIKYEGPDTVAAFLIEPIVGTTSCATMPPIEYMKRIREICDKYNVLFCVDE VITGFGRTGKNFAMDHFGVVPDLISVAKGLGGGYVPIGAVIAHKKVVDAFEKGSKNLIHS FTFAGNPLVCAGASAVLSYTVNNNLVERAAKEGEVFLGKLKEALGDSPLIGDIRGVGMLI GVELVMDREKKEPFDPEKNVSGFLAGYCFDHGLLISSGVTGTADGIAGEALQISPPLILD EEEMDFAVDILKTAVLEAEHRFL >gi|157101653|gb|DS480671.1| GENE 9 16649 - 17131 328 160 aa, chain + ## HITS:1 COG:AGl3307 KEGG:ns NR:ns ## COG: AGl3307 COG3090 # 
Protein_GI_number: 15891772 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 19 147 53 179 207 60 29.0 9e-10 MKKFHFDLIVSSVVLVAMIVVILIQIIFRLMGNPLSWTEEVARWMLVWATFAGASYAFKN GGLIRVDFFVKKLFKKKGRRITDIAAMMFMIVYFGLLGVSAASYMKMTISKGQTYPITQV PYAITLISLVYCGLSCCIFAVKEGLKAYHSTDTQEEGGNA >gi|157101653|gb|DS480671.1| GENE 10 17132 - 18421 796 429 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 [Lentisphaera araneosa HTCC2155] # 4 429 2 430 432 311 37 2e-83 MSSLTVTLILFSLLLLLLFLNVPLVVSIGLPTALIMLGSGMKVATLAQRTYASVDSFTLM AIPFFMLAGKLMEVGGMSKRIVRLADCLVGWMAGGLAHVIVVASAFFGALSGSAAATTAA IGSTLIPEMKKRGYPADFVAGIQAVAGALGVIIPPSITMIMYGVCSSTSIGKLFMAGIVP GIVLALFLMVTIAYQAKKRKISQMNTFTFKELGTAFLDAIPALLVPTIILGGIYGGIFTP TEAGAVAATYGFLAGAFWYKEINLQNIGNIMGGAIVNTVLVMIVVGASGAFSWLLTSAGV SSLLGKAISAIAGNKYIFLLVANLIFIFIGMFIESVAGILIVTPILLPIATSLGVDPVHF GIIMVVNLALGLTTPPVGENQYIAAAIAEIPFEEEVKASIPFLIAGFAALAVITYVEPLS LWLPGILGM >gi|157101653|gb|DS480671.1| GENE 11 18488 - 19561 314 357 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. 
PR1] # 85 355 53 325 328 125 29 2e-27 MKKQLISMVLMAAMTVSMLTGCKTAGGKKAEAPSSGNQAAAEAKTEDEEKASEGGEDYSY HWKLATTESSDYYMTQLSQEFLDIVEEKTGGKVTGEVFASGQLGGLVDALEGLEMGNIDI VMDGVSSLSAVDDIFNVWCLPFLYDNKEHQYRFWDNHFDEVSDMVAEQSGFRLVSVIDGM NRELASTVPVESMADLKGLKIRVPNIPGYVRIWECLGAAPIPMSLSEVYTSIQTGVVQGQ ENDIALTLSLKFYEVAPYCVMTDHVAYEGSFYFNEEEYQSYPDELKQIIQEAGEEIKLKS RSIIKEQETECMKQIEDMGVTICRPDLEEFKKATAVMYDEYDMCAPIIELVDKARNE >gi|157101653|gb|DS480671.1| GENE 12 19635 - 20981 735 448 aa, chain + ## HITS:1 COG:PM1365 KEGG:ns NR:ns ## COG: PM1365 COG3395 # Protein_GI_number: 15603230 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 7 442 2 408 413 295 39.0 2e-79 MKNKFYIGCVADDFTGAGDVASFFVKAGLVTVLYNGIPDDSHTVAEGTQAVVIALKSRTQ DREQAVADSLRAFGWLLQEGARKLYFKYCSTFDSTKEGNIGPVADAVMEKFGYPYTILCP ALPVNGRTVEKGKLYVNGVLLEESSMRNHPLTPMRESELGRLIEMQSRYKGISMAGKTKE QWKKEQETLCRQEGHCYLIPDYYEESHGKEIAREFCDIIFYTGGSGLAEHVGRLLAEKAM ADRDADVDVSGEACGKEEACGREKVCDKEEEYGKEKACSRQEALLLAGSCSEATLRQIRS YEKGGYPCYRIDAGRVLRGEETAEHVWNIVKMHPGENMLVYSSDSPEQVRKIQEEGKEKI SFMLEQLTAELAARAAGSGYRRLIVAGGETSGAVIQRLGYQGFWIGHSIAPGVPVMVPLE DTRMRLVLKSGNFGQEDFFSRAVKELEG >gi|157101653|gb|DS480671.1| GENE 13 20984 - 21580 469 198 aa, chain + ## HITS:1 COG:HI1012 KEGG:ns NR:ns ## COG: HI1012 COG0235 # Protein_GI_number: 16272947 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Haemophilus influenzae # 8 196 6 195 210 144 38.0 1e-34 MDIITERQLETAVEIAHSLFERGKVSGSTANISIRIGDFIYISGSGTSFGTLEKEQFSTL LLDGTHVEGIKPSKEYPLHAMIYRYKPEMKGVVHTHSFYSVLWSCLEHEKKTDVIPAYTP YLKMKVGTVGIIPYAKPGSQELFGYMEERIMNSDAYLLKQHGPVVAGRTILDAFYGLEEL EESAKTAWYLRGEHVQEC >gi|157101653|gb|DS480671.1| GENE 14 21776 - 22858 768 360 aa, chain - ## HITS:1 COG:SA1154 KEGG:ns NR:ns ## COG: SA1154 COG2008 # Protein_GI_number: 15926897 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: 
Staphylococcus aureus N315 # 10 348 3 341 341 399 53.0 1e-111 MELCMHSRPSFASDYMEGAHPAILARLMETNMLKTQGYGSDPFSQSAREKIRAACGCPDA EIHFLVGGTQTNATVIRGLLHSYEGVIAAGSGHISVHEAGAIELGGHKVLTLPHVSGKLR AADIEGLIGDYRKDANSGHMVMPGMVYISQPTEYGTLYSLSELREISSVCRKNGLPLFLD GARLAYALSCPCNEVSLKDIARLCDVFYIGGTKCGALFGEAVVITQQNLIPHFFTVIKQN GALLAKGRMLGIQFDTLFEDDLYMKIGASAIAAADQIRDTLTKCGYRLFLDSPTNQVFIV MENEAAAQLARKAEITLWEKYDDTHTMVRFVTDWATQQRDVDMLVEILEQECNPTLACVS >gi|157101653|gb|DS480671.1| GENE 15 23072 - 23950 600 292 aa, chain + ## HITS:1 COG:BS_yvbU KEGG:ns NR:ns ## COG: BS_yvbU COG0583 # Protein_GI_number: 16080452 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 1 288 1 289 292 134 28.0 2e-31 MDVTKCEAFLAAIDHGSLTAAGVFLGYTQSGITRMINALEEEVGFPLFIRTKKGVSPTEN GKAMIPAFREIVRAHNHALEIGADIRGILSGALTIGSYYSVSAMWLPPILKRFRQLYPNV RINMKEGGNREMTRWLNEMSVDCCFFAEPAAGTICDWIPIRQDELLAWLPKNHPGAGEAS FPLCALENEPFIITMPNQDTEIDRLLGSRHLTPDIRFSTADAYTTYCMVEAGLGMSLNNR LITLSWSGDVVTLPFDPPQFVSLGIGVPSWKEASPAMKKFIECVKDMLEELP >gi|157101653|gb|DS480671.1| GENE 16 23995 - 25062 802 355 aa, chain - ## HITS:1 COG:BH2731 KEGG:ns NR:ns ## COG: BH2731 COG3835 # Protein_GI_number: 15615294 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Sugar diacid utilization regulator # Organism: Bacillus halodurans # 1 345 1 363 371 100 24.0 5e-21 MIDNKIMYEIIEKLHESFSASISICDVSGRVIVSTDSSCMGEMNLLAIEALNINSKVTVS MDSKIQKAGAAMPLRFQKSRMGAVVLQGAGSSSSQLAELLSKTIELLYEELILSKKKQNR TQERDQFLYEWLHLQSDYTENFIKRGEHLGIDITGNHTIILMERKQDDLFTSTSIIQNLL DDRDILLPLSQDQNLIILKENEHFEKKYNRVIAAGHNCHTGICSGSAHLHTAYQAALESL KLGKILFPDEHLHCFEKMKLAIALSKTPIPGLEDSFSLLVAKGRNAQLADTAITYIRLNG DIQKICDKLHIHRNSIPYRIRRIHEICGRNLMDYYDMLCLYASFIRYAGKEADIQ >gi|157101653|gb|DS480671.1| GENE 17 25428 - 26444 989 338 aa, chain + ## HITS:1 COG:VC1343 KEGG:ns NR:ns ## COG: VC1343 COG2195 # Protein_GI_number: 15641355 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Vibrio cholerae # 1 331 33 
362 368 127 27.0 2e-29 MEQELAGMGIPFERQELGNEVVTDGWNILARIPGRSGKTPFLFVFHLDTAAPSNQVEVCV EGGCIRSKGSSVLGADGKLAIAVVMEAVERLMQEGEINRPIELLFTVCQELGLHGAKYAD YSRIESEEAIVIDHYVTGEVLTRTPARLYMNVELIGRAAHVIRNEEPGVNALLTAVEIIH QIPVGRLNDNLSINVFDLVSLSPSNAVPKYARFDVEIRCFGNDIKQEIRDRIRRKAEETA ERMGCKCNIREEEDVPETDFSENTDMLERLASIYSRSGLSMIPARSFGVLDATCTNQLGI RTVPIGFNIYHSHSAREYVVIEDVKKMLELVENIIRYF >gi|157101653|gb|DS480671.1| GENE 18 26471 - 27868 1431 465 aa, chain + ## HITS:1 COG:BH2308 KEGG:ns NR:ns ## COG: BH2308 COG1288 # Protein_GI_number: 15614871 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 15 465 24 470 470 210 30.0 4e-54 MTKDKQKANKPLNPFVPLVLMVLACAIVSYFVIPGAYDRETVDGVTRVLADSYHATERTP VSFFNIFRSIPEGLTASANMMFCVMLIGGIVEIYKRTNTVGAAINSVLKASERMSSQVII AIVMIVFFIFGGILGWSEHIIPFVPIIVSLALSLGYDSLVGMAISGFACLISFAVAPFNV YTVGISHTIAELPMFSGWELRIAALVCVWILSLVWVMRYAKKVKADPSKSLVKDVDTSSL RIPVDPGLKFDLPKKVSVISLTISILVTIYGILNLSWSYTEMAATFMIGGIVSAAINRVN LDDGINMVLDGARGAFSGALIIGVARAVQWTMTNGGLVDPLVHGLSNLMRSASAYVSTVG MFIVNFFVNALIPSGSGQATAVMPIMVPLADMLHITRQTAVLAFQFGDGISNTFWFTNGT LLIYLSLGKVPLKSWYKFILPLHGLFLILQLIFLFVAVQIGYGPF >gi|157101653|gb|DS480671.1| GENE 19 28017 - 29192 1223 391 aa, chain + ## HITS:1 COG:alr4934 KEGG:ns NR:ns ## COG: alr4934 COG1473 # Protein_GI_number: 17232426 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Nostoc sp. 
PCC 7120 # 15 388 28 402 405 301 41.0 1e-81 MNVIEEVGKYHDYAVSMRREFHKHPELSWKEVETAGRIRDELAGMGIPYEEVAGTGTIAT LKGKEDQPVIGLRCDIDALPIREVKSLPYCSQNQGVMHACGHDAHISMLLTAARVLAEHQ DELKCTVKLIFQPAEELTNGAVKVLESGKVGKLDTVAGMHIFPYLESGTISVDPGPRYTS ASFMNIKIIGKSGHGAMPQYAVDPIYVGAKVVDALQSIASRETSPMDTVVVSICTFHSGT MANVFAETAELSGTVRTFNPKLQKELPGMIERIIKSTCEAYRAEYEFDYYSDIPATINDE YCSGIAAESVRKILGDKGLVKYAGTPGGEDFSYFLEKFPGVYAFVGCRNESKDCCYSLHN ERFDLDEDALVNGAAFYVQYVLDAQEKFGAV >gi|157101653|gb|DS480671.1| GENE 20 29344 - 30399 503 351 aa, chain + ## HITS:1 COG:TM1200 KEGG:ns NR:ns ## COG: TM1200 COG1609 # Protein_GI_number: 15643956 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 15 303 4 293 333 192 38.0 7e-49 MVVFNYKRLQFGGWKMATIKDVAKLAGVSICTVSRALANKENITPKTMEKVLSAVRELDY KPNYSARSLKIGSTDTLGLIVPDITNPYYPKVAKSIEEYAEKKGYMILLCNSNEDLNKEK RLVDTLKKRNVDGVIILPCSRHIEHFRGFDNAGIPYVFLNRSFKGIANCIPSDNFYGAYT VTKYVIGRGHKNICAAYLGFENQIYQERFEGTMEALREHGLEQCAKQFIFDIKDIQDSYM RIRKVLEGRERPTALIAANDMLCFGAYSAASDCGLSIPGDFSVTGYDDISMASLMMPPLT SFRQPEDVMAKGGVDYLLECIAGNSPAPPARLRGELVVRQSVCDIRGDTCL >gi|157101653|gb|DS480671.1| GENE 21 30625 - 31827 833 400 aa, chain + ## HITS:1 COG:mll1018 KEGG:ns NR:ns ## COG: mll1018 COG1804 # Protein_GI_number: 13471132 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Mesorhizobium loti # 1 399 1 398 405 336 43.0 5e-92 MLEGVKVLSFTHYLQGPSAAQALADLGADVVKVESCRGAYERGWSGCNTYKNGVSVFYLM ANRNQRGVALDLKSDEGRETIYRLVRDKGYDVILENFRPGVMDKLGLGYEELKKLNPGVI YCSCTGYGSSGPKVRKPGQDLLIQGMSGLAALAGPGDHPPMPTGTALVDQHGAILAALGI TAAVYDRDKTGKGHRIEASLLGSALDLQIEPIGYYLNGGTLTPRADTGLSTRIHQSPYGI YKTADKYITLSLTSFENLEQAFTPGALEGFSAKDQMDNRIEFDKVVCRELLKRTTNEWEL IFEELGIWYAPVNEYEDVVKDEQVVYNQCFMTMNHPVAGEVKVLGHANRYDGQPVPLRRL PPELGENTIELLKEAGYSDSQIQEMMKQGKAVSPEKKQDK >gi|157101653|gb|DS480671.1| GENE 22 31858 - 32910 374 350 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 
[Lentisphaera araneosa HTCC2155] # 53 333 48 329 346 148 29 2e-34 MKKIAVLLTTVLTAGMLAACGQGKNAETTSAGVSDVKTEADSVPEAKPEKDAITFKLGMV DPDGSNFHKGALAIAEEVNKATGGRITIQVFAGGQLGNERDMYEGAQMGTIDMFTASNAV LTSFIPEMAVLDQPFLFETADEAHRVIDGTVGTLIAEKTEEQKIHTVGWMDVGFRNIFST RPVTSLEDMKNLKIRTMENDLHIAAFNAMGAIATPMASGDVFTGLQQGTIDAAENAIANL IANRYYEITKNVTWTNHVFGYMGVFMSDKAWNQIPDDLREDFVEGVKAGAQRQRDYLVEA NEAAVKELTELGVEFYEIPIDDMRKLVEPAMEQFADRMDPAWVEAIEAEK >gi|157101653|gb|DS480671.1| GENE 23 32922 - 33425 418 167 aa, chain + ## HITS:1 COG:no KEGG:TepRe1_0135 NR:ns ## KEGG: TepRe1_0135 # Name: not_defined # Def: Tripartite ATP-independent periplasmic transporter subunit DctQ # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 1 143 4 143 158 75 29.0 9e-13 MKRLFANLQKIEDFILVATFTVMVLASFAQVVNRNIIHAGVSWLEELARYCMVYMALLAT EAGLRDNTQISITAITDKIHGMTGRVVRIISKAVVAIFSGVCFFSSFTILKTQLSSGQVS PGLHLPMAIPYFALTLSFGIITVIQAAALVCLVMEKAAEDQGKEDAS >gi|157101653|gb|DS480671.1| GENE 24 33422 - 34687 733 421 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 1 420 3 425 435 286 34 5e-76 MTAIILFGTMIVLLIIGVPISVSLGAATMASLLCFDVPMAVVPQRMFTALDSVSIMAIPL FVLAGNLMTEGGISRRLVDFCNSIIGNIRGGMGYAMVLACAFFAALSGSAPATVLAIGGM LYPDMVRLGYPKERCAGLLTVSGGLGPIIPPSIIMVVYCTITNSSVGDMFKGGIVAGLMI SIVLVVICIYYSNKEKWPKNTERTSVSQLAKHFIHALPALFLPFVILGGIYSGLMTPTES AAVAVIYSFIVSVFVYRELHPRDIYRILLASGKSSAMILFIIATSTAFSWLFTFSGISGQ LVDAIVAMNLAPWMFCLIVALVLLVFGTFLEGIATCVLLVPVLWPIAQTLGIGVIHFGMI MCIANVIGCMTPPVAVNLFAAASVSKLKMGDIAKGELPFFLGYTAVFFLIVFVPALSTLL L >gi|157101653|gb|DS480671.1| GENE 25 34718 - 35899 574 393 aa, chain + ## HITS:1 COG:mll1018 KEGG:ns NR:ns ## COG: mll1018 COG1804 # Protein_GI_number: 13471132 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Mesorhizobium loti # 1 393 1 398 405 224 34.0 2e-58 MLEGMKILSFVHGLYGGSASQILADLGAEVVRVEWDMTGRIRPGSVLDGENGLFFQMTGR 
NQKSISFNPQSPEGIDILHRLLQDYDIVLDNSPPGLLEGWGIQYDELLRSHKQLIWCICT ADGSGCCQNPDDELLMEAKSGLASLNGPGSKPPVPAGAALIEQHAAVLIALGIIAAARHR NITGQGHKVETSLLSASLDLQIETIGYYLNGGRFIDRADTGLSTRIHQSPYGVYPTADGC ITVSLTHHDRLCSLFTPGVIENFTEQDAMERRVEFDQMINQEMQKKTTAQWVEIFERMED MWFAPVNEYEQVMKDDQILYNRPFIELGESEGRTLRGLGHANYYDGVRPAVRNLPPAFGQ HTDEILEKAGYTQRELNQLERKGLIVRNHERKN >gi|157101653|gb|DS480671.1| GENE 26 35901 - 36683 560 260 aa, chain + ## HITS:1 COG:SSO2624 KEGG:ns NR:ns ## COG: SSO2624 COG1024 # Protein_GI_number: 15899349 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Sulfolobus solfataricus # 15 259 18 260 266 173 40.0 3e-43 MEYENNLILVEKPADKVALITLNNPPLNLVTLELSRELKETLFRLEQDDEVRVVVLTGSG EKAFCVGSDVKEFPLVWDDVIGKKLQKENEVFNAIEFLDKPVIAAMEGNVCGGGYEMAMA CDLRILSESGRIAQPEINLGVFPGSGGIFRLPKLVGASKAMEMMFLGEFIDAEDCLRLGL VNRLAPAGKTVSAALDLAERIAHKPFEAIKLIKKGVREIGMVSSDKCFYKNLEFSRSIFK TEDCKEGVDAFLGKRSPQFR >gi|157101653|gb|DS480671.1| GENE 27 36709 - 38367 711 552 aa, chain + ## HITS:1 COG:MJ0033 KEGG:ns NR:ns ## COG: MJ0033 COG1053 # Protein_GI_number: 15668203 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Methanococcus jannaschii # 30 536 25 507 539 204 31.0 3e-52 MDRLVTDVLVIGGGGAALRAALAACDAGADVIIAAKMPLGLGGATTYPVAEMAGYNAGDP HVPGDVECHYQDIMAAGQGMADEKLAAILAEGAPETIDELESWGVEFDREGDGYYTFKSC FASIPRTHVIRGHGEPIVQALIRQIRRRPDIQSLTGVTIIELAKKNGICCGAWGVDSRGK LIYISAGAVILASGGASQAFEKNLNPGDVSGDGYMLAYESGARLVNMEFMQSGIGFSHPL MNIFNGYIWAAHPLLTNSEGESFLEKYIPDSMQPEYVMDEHRKHFPFSSSDDSKYLEIAI QKEIRAGRGGRHGGIIADLRHMKDDYICHVPDDCGLHHMWPVARNHMKDKGVDLLTQKVE INVFAHAINGGVCIDAEGSTSLPGLYAAGEVAGGPHGADRLGGNMMVTCQVFGKIAGTNA AKWAGIHRPQRESAAMYPDDRLWEILHRKADTQGLIHELQSVNQHHLLISRNEAGLMQVL GTVERIGRQLAESPRQDAINMEAFRLYCLITASRLMGEAALRRQESRGAHYREDYPIVNP KLDRPMFSDQSI >gi|157101653|gb|DS480671.1| GENE 28 38394 - 39773 1047 459 aa, chain - ## HITS:1 COG:BS_ydeF KEGG:ns NR:ns ## COG: BS_ydeF COG1167 # 
Protein_GI_number: 16077585 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Bacillus subtilis # 1 458 4 461 465 425 46.0 1e-119 MSINSFDNYPMSWKPDLSRTSGPKYLALVSLMEEDIKNGTLKAGTKLPPQRELADYLNVN LSTISRAFKLCGQKGLLSASVGNGTYVCADVSAETILLCGRENPRMIEMGALVPQVSGNR LVKIYTERLLKRPDALNLFSYDDPEGTGRQREAGVAWLQKSGFSTDKDHVLLAAGGQNAL TAALGALLERGDKLGTDPLTYPGVKTAAKLLGIHLIPIQNHDYEMTEEGIRYAVQNEKIK GIYVIPDYHNPTSHIMSLETRKRIAELAREKHILVIEDGINNLLTENPLPPIASFAPEQV IYLSSLSKTIAAGLRTAFVHVPDQYHRCLATTLYSMNISISPLLAAVSAGLIADGTADEI IRGRKEELRRRNHIISQSLKGFALDCPPTSPMRYLRLPDYFTGKSFEICAKQAGVQVYGA ERFSVGNKPAEKAVRISVITPPSMGDLEEGIRRLRGILG >gi|157101653|gb|DS480671.1| GENE 29 39930 - 41108 881 392 aa, chain + ## HITS:1 COG:PAB2227 KEGG:ns NR:ns ## COG: PAB2227 COG1167 # Protein_GI_number: 14520410 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Pyrococcus abyssi # 1 391 12 406 410 318 41.0 1e-86 MEHLLSERIKSTPPSFIRSILKTAADPQIISFAGGLPNPVSFPQEELLVSMERIVRDYGS SLFQYSITAGLPELRQYIADRYNCRFGLKLGIENILITTGSQQALDLISKVLLNTGDGVI VEKPSYLGAIQAFSQYQPVFYPVELTEEGMDIEQLQDALKHDVKFIYAIPDFQNPTGLSY CAENRACIREVLREREIVLVEDDPYGELRFDEGGRIPYIGAGQLPGSILLGTFSKTVTPG MRTGFMICADTELLKHISVAKEASDLHTNIFSQYLIWDYLMHNDYDAHIDQIRQLYKEQA QAMTEAMDRYFPETVRYTRPRGGMFLWVTLPEGVSALSLFPKALEKKVAFVPGDPFYINM TDTNTMRLNFTNADCSMIDEGIRRLGELLREI >gi|157101653|gb|DS480671.1| GENE 30 41175 - 43154 1402 659 aa, chain - ## HITS:1 COG:BS_phoR_3 KEGG:ns NR:ns ## COG: BS_phoR_3 COG0642 # Protein_GI_number: 16079962 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 435 654 51 275 279 121 33.0 5e-27 
MTQYRNSKWLNLTAAAALLLAVLILFQFLYQKDNKYTNGPPYGSQGVFAFSARDLTGSRP LFLIDGWEFYPDQLLTPDTVGSVPDQSMSFIFIGQYSNFSFLSSGHSPFGRATYRLLLRY DGEPRLLTLEVPELFTDYQLWINGQPVEQGSPFATFYMDGTAEILLNTDNRSHYYSGLTY PPALGTPEAVHRMLFLRHMFYGILCIFPLALCLYAAAAWRIRDRDRRLLHFGLLCLFFSI HCAYPFIHQLGLSGRLWYAVEDVSWMAMLYEVMVLSSVEAGFSGKLWYRRFLRPAAAAAC AVCGLSVLFLIPASPRLVNLYGAFLDGYKLLIWLYLLECAVYGLWMNRDSAGLVLAGCLT MGAAMVSNLLDNNQFEPVYTGWQTEYSGFILVIVFWILTVRHISRVLKQNQALTEHLEDQ VQKRTRELHAVLDERKAFFSDLAHNLKAPVVAIHGFTDLILRGNLYLDDDLKEYLDKISS ENEELCRRMYVLGDLNAFDKITEPRELIEINGLLSQVSHDNEPEACISGIELRVEKLEDQ AFILAQKRKLLLLFENLIYNAISFTPEDGVITISPCLDQDGVTIRVSDTGMGIEPEHLPH IFERFYSGRPDASESSGLGLYIARITVEELGGSIHAESVKGQGSVFTVRIPLAGTVPSL >gi|157101653|gb|DS480671.1| GENE 31 43183 - 43869 516 228 aa, chain - ## HITS:1 COG:CAC0321 KEGG:ns NR:ns ## COG: CAC0321 COG0745 # Protein_GI_number: 15893613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 223 1 228 230 142 30.0 5e-34 MPYLLFVDDDSNVLNINRNYFESRGFSVSVAQTTDQALDHVKKYPVDCVVLDIMMPGFDG FLLCRRFKTQMTAPVIFLTSLTEKEYLYQGFSLGGDDFLTKPWDLKELEMRIRTRISQCS NRSLKGERLEFPPLTIDAGTRQASIGNVCVPLTAYEFDILLLLARSPGHVFSPDSIYREI WKLPDLDNTQTVKVHVARMRHKMEAACPGHSFIGTVWKQGYRFLPENS >gi|157101653|gb|DS480671.1| GENE 32 44173 - 44907 741 244 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 203 1 197 234 106 30.0 4e-23 MLRIAICDDLKSERDMVKGFLRSFFAAVPYEYTLAEYSRGETMVDDYDDGSVDFDLIFMD IFMDGMLGMEAARCLRRYAPHVSIVFLTTTPAYALESYDVYAYGYLVKPLDGEKTAALLR RFLQEEYEGNQKTLLLKKGCRGRRIAYREIEYIESRRNVLLVFLENGEEYRVYAKLDDVE KELKGHGFLRCHQSFIVNMNRVRVAEEDFLMMSGAHIPIRQRGSRAIRDAYFGYLLERAE LTRI >gi|157101653|gb|DS480671.1| GENE 33 44927 - 45577 657 216 aa, chain + ## HITS:1 
COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 23 189 265 433 452 89 29.0 5e-18 MSADKRQLEQYYAFLAGHIDEVRRMQHDILHHLRVMSGYAQAGDYDRLKEYLDALVEQMP DMDGLYYCGDHAANILLKYYGEQARNGGIRFNCDAQIPPDLPCTPVDLCTVLGNAMQNAV EACQRQGRGAGRYISLLARLVGHNLIMEIRNSYDGRVKWDGDRLVTLKEENGHGLGLDSI RHVAERYHGYCAVSHTDDEFTVKVVLALEWKEATPC >gi|157101653|gb|DS480671.1| GENE 34 45739 - 45987 213 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936033|ref|ZP_02083406.1| ## NR: gi|160936033|ref|ZP_02083406.1| hypothetical protein CLOBOL_00929 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00929 [Clostridium bolteae ATCC BAA-613] # 1 82 63 144 144 175 100.0 1e-42 MSEKKIRGAVIGVTDDGYPIVTEDYSCPYFTGTSRSMPAVRECWYCRYADFRKSTKLHIR SSICRCMENRVDTRFGFMKNEE >gi|157101653|gb|DS480671.1| GENE 35 46004 - 46300 476 98 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_0373 NR:ns ## KEGG: Fisuc_0373 # Name: not_defined # Def: anti-sigma-factor antagonist # Organism: F.succinogenes # Pathway: not_defined # 1 97 1 97 98 100 55.0 3e-20 MTITTTKNNGCLTLELEGRLDTTTAPELETVIKNGLEGVDRLVLDMEGLAYLSSAGLRVI LAAQKQMNRQGTMTVRNVCGTIMEVFEVTGFTDILTIE >gi|157101653|gb|DS480671.1| GENE 36 46326 - 48056 1609 576 aa, chain + ## HITS:1 COG:lin2873 KEGG:ns NR:ns ## COG: lin2873 COG0534 # Protein_GI_number: 16801933 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 11 422 16 426 450 125 25.0 3e-28 MKNRDRFTGRIFRRMWGPAIISSAGWALSDIADAVVVGQRMGAVGLAAIALILPVYMINC MFAHGLGLGGSVRYSRLLGEGKPGEAVDSFNQVVTGALAFSILTGILGNVFMTPFLAILG TVPDDGPLFEATRSYLVVLVSATPLFYMSNILNYYLRNDDNEKIAGLGSVAGNLTDIGFN ILFVLILGYGTAGAALSTMIGQAVAILCYLPGILGGRHILKFRPVRPAPGAAARAFKDGC SSSVQYLYQLIFLLLCNNILIRAGNEASVAVFDMLQNASYLILYLYEGTTRAMQPMVSTY CGEHRGEGMRNVRRYAFRYGCSAGCLAALLIFLFPQFICTLFGLREAAAAAMGAGALRLY 
CAGTVFAGISILLAGYFQSCNQERESLIISSLRGAIVLIPGVLFFSFLRMDLFWWMFPAV EVLSLGLWIIRFIAGRGRTEEFDGSRVYTCTVNQGNQDMTLVTGEAGAFCEKWEASPRQT YYVTMTIEELCVVILRDGGGDDVCIEITLVAGQQGEFVLHIRDTARYFDPFALETGRVSQ EGGFDMDAMGVRVIKSKAKEFFYRRYGGFNSLVVKI >gi|157101653|gb|DS480671.1| GENE 37 48108 - 48704 368 198 aa, chain + ## HITS:1 COG:no KEGG:TherJR_1559 NR:ns ## KEGG: TherJR_1559 # Name: not_defined # Def: cell wall hydrolase SleB # Organism: Thermincola_JR # Pathway: not_defined # 59 194 119 255 257 66 33.0 6e-10 MKKFVGVNLSFIVILSFTIHAMTVFGLDGADVNGQYGKMVTGIRVPETICSIAKGELQKK REKPAGNKGSPQNRWGIQLDEIELDMIARITMLEAGAEPDRGQQAVVEVILNRMYSDQFP DTVYEVLSQKDNGCSQFVTWKNRNMDAAKPSERVKRNVKAVLDGETQILPFKTMYFSLEK ENGHIQCVIGNHVFCNQT >gi|157101653|gb|DS480671.1| GENE 38 48723 - 50414 1293 563 aa, chain + ## HITS:1 COG:no KEGG:SOR_0328 NR:ns ## KEGG: SOR_0328 # Name: not_defined # Def: cell wall surface anchor family protein # Organism: S.oralis # Pathway: not_defined # 83 301 1682 1900 2064 117 27.0 2e-24 MRIYNRKKRAISLVMAIMVMSSSIQITGLAEGEKEEGKKIIISEFQELPEELSSMKVPEG TSQEEFEKLFPDTLMVKAYRDDAEKQKPEQPTSAVTPTVPDGVQKPSEGEMPEGNLNPSE GEKPGEPQNPSEGETPGENQSQPEGEKPGENQGPSEGEKPGENQGPSEGETPGENQGPSE GEKPGENQGPSESEMPGENQGPSEGEKPGENQGPSEGEEPGENQGPSEGEKPGENQNPSE GEDPGENQGPSEGETPEGDQSPSGGEKPEGNQSSSEGAVPEGNQDTSHGASADSGSSDDS GASISTSTGNADGYDSTDNNDFGDSVVIMGQMVPWAGLPENADTGADVTENDGKTGSIEA YTETAAEKVNVKDGNTESGNTENSNTEETDNQWTEITVKAWEPDEGEYAYSPEDEAGTTY CYIPVISSRYDMDTELPMIFIEITDAVIIYPTSAMESVYEKIQALPSAEEIVELVERISV ATASQIPDIQGIMEQEKEAKKAYDALSQDEKDEFDSELYRKLMDLDTFFSNMKLVNTIEE QVISIRVDGGMELTGRRFEGSDF >gi|157101653|gb|DS480671.1| GENE 39 50542 - 53994 2324 1150 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1016 1139 581 693 744 91 37.0 9e-18 MPDVSSGSIFPKTICSVELGQAARIGQNAFNACGQLHELTVPESVTSIGSNALGTGSRLV 
VSLPRNTGVPEGIANSITGSGSVIIRTDQLELGRALYDVCKSEYPRIKVTFSSDLSHELE PVETENGVLSFRLSASADNYDVHGIKDGKNVQTTYSNNGYYLGVCAGKEPTKFSDCTSIK NLEREPVFVELKSGEILAAGISADFVSDGKYLKVTHTLVNVSDKTMERVCIGVNADIQIG SNDAASIYRTGTGFRMEDPGSQQQFSLACKKAPGADDISTMWFGYYSDRQSNQFNDTSAD DLIGSDSGLAFSWKDMELSPGERITRSVLFSVGEIADPPVLSKDPFSFTLEAEGKKSIKV DARVTDKAGRTDSIYYTIKKDGQPQQEERNLASVLADGSEKTITGVIGEAEFPDDGTYIV NVWIMNDAGAISEAVTREIVVENGQITAGIDGAAPIVDVTVTLSAPSDLVYSGEEKAASA VSAGADLTEEDYTVNYTRTTDSGQSVPVGHAPVSAGSYTAVFALTESGAQKYELTEDSVT AISYIIKPKPVSITVSGRKVYQSDPIGDSGFIVTWPGDEGLTDAVKASIRSALIVSSEGN AKEAPAGSYDIRITQIQVNGDADNYQVMPEVIPGGFLVESVHNPVLSLSAPKDITPNSAM LEGIILLGNVPLKEVTVSYKKAGTEDGYQIVNHELSQDSDMEIVSVKAALAGLDPETDYQ VKMQISYEENGGGGIKTHEQLVRFRTAALNGPLGEISGTVTDNSGMADTLIYVTLERGNT VQAGAGPLQSPGQFKFEGLPDGFYNLVANNGYYRVTQIVEIKSGNSVAGIQMSIGKTQSI IEIATEDTPEVVAGGLNELFHSELYTQSDADVVRDGGTVEFKLRVENKPRNEVAQDAERI AQEMAPGGQVGMYLDLQVLKTVKNAAGVTAGDHETPVPDLKGKKLTIVIPLPEEIRNRAP YFVYKVHGGTVSAVDNTYQEEHHTLTIRADEFSTYAIAYTQETEETAGAVQAEHDSGTVR EGRWMQNDTGWWYAYSNGTWPSARWAYLYYNGRYDWYYFDPKGYMKDGWIEIGGKWYYLH KLYDGRRGYMYTGLRQIDGKIYFFGSDGVMFSGGWKNVNGHWWYFNADGSMAVNAVVGGC KVNQDGIIIN >gi|157101653|gb|DS480671.1| GENE 40 54047 - 54877 784 276 aa, chain + ## HITS:1 COG:HI0623 KEGG:ns NR:ns ## COG: HI0623 COG0223 # Protein_GI_number: 16272566 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Haemophilus influenzae # 18 218 23 231 318 65 26.0 2e-10 MKIVYFGSDVFLDVFEYLHLNHEILALYTYHMEEDYFNEHNVVQLAHMSGIPVCYGQITE DEMLSYMEKEGCELFFVAEYSHKIPVPDDSRFYGVNIHSSLLPEGRSYYPVECAMERGLG RSGVTMHKIAKSLDRGDILAQRKYDIQPGNDSVDIYLKSGRQALSMVKELMEDFDRVWSN ARPQKEKLPYWNRPKKEALALNHSMTVEEARGVYRCFNKMTTVCIDGRSCYVDGFAAGSV LLGQDDSHNLFVRENRVLYGVADGHLRLDLVPVWEE >gi|157101653|gb|DS480671.1| GENE 41 54883 - 56532 1637 549 aa, chain + ## HITS:1 COG:CT588 KEGG:ns NR:ns ## COG: CT588 COG2208 # Protein_GI_number: 15605317 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of 
sigma subunit # Organism: Chlamydia trachomatis # 228 533 337 646 650 144 31.0 5e-34 MRSRKKGKLQLKFLIGLIAMTVVLMITLCLVITRQYRRSMESYYTKMAFDQAKIAAEYID GDTIKDYALTRKTDSYYDQVSRYLLYMKETIGIKYFYVVIPCEDHMYYIWDVGKTGEEGV CRLGDVDDYYQGGREIMRGAYLHPDGEDTILITDNEEYGYLASAYVAIRDSSDRPVALSC VDISMDMINRQIGSFAAAVVLVILAVLLVFIVGYFFFIRQSVLRPLNRLSQAARTIVSEQ MDDLSNFHVDVKTGDEIEELGEAFSHMAHELYSYIENLSAVTAEKERIGAELDVATHIQA SMLPGIFPAFPNRSEFDIYATMQPAKEVGGDFYDFFLVDQGHLAVVIADVSGKGVPAALF MVIAKTLIKDHTQAGATPAEVFGEVNAQLCESNEEGLFVTAWMGVLEIATGHMVYVNAGH NPPLVRQAGGSFGYMKLRPGFVLAGMEGIRYKSGEIDLSPGSTLYLYTDGVTEAANRENE LYGEERLLSVLAAHEDAAPEELLPAVKADIDAFAGDAPQFDDITMLALKLSEKLKLSEKR GQPAEEGAL >gi|157101653|gb|DS480671.1| GENE 42 56532 - 56936 571 134 aa, chain + ## HITS:1 COG:CPn0670 KEGG:ns NR:ns ## COG: CPn0670 COG2172 # Protein_GI_number: 15618580 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Chlamydophila pneumoniae CWL029 # 2 130 7 136 144 73 34.0 1e-13 MELTVPAALEELEHVQEFVEQALEEQGVSMKIQMQISIAVEEIYVNIARYAYHPEIGQAT VRCAVGGNPLQVTIQFLDSGKPFDPLAKPDADTTLSAEERDIGGLGILMVKKSMDDVVYE YRDGCNILTLMKRL >gi|157101653|gb|DS480671.1| GENE 43 56982 - 57890 613 302 aa, chain + ## HITS:1 COG:no KEGG:DvMF_2421 NR:ns ## KEGG: DvMF_2421 # Name: not_defined # Def: hypothetical protein # Organism: D.vulgaris_Miyazaki_F # Pathway: not_defined # 105 255 20 170 170 86 31.0 1e-15 MKINIRKCLLACMASMTLISFTAYGSEQSDQDQILREIAERIAASGDILDKIQEGAPWED VQKLVEDLKPMDVGGPGLDSEEGESRGASMTGAMVGAGGTDSGGGSSSNINLELARLQIQ LANSTKDTMTPYMNEINRIQEDQNRAGSFLSQARQIQKSAESSRKSIAMPDTMKFYMDTN KISYPKSNQGLLYSSDQWKSVIRGLEDYIEKTGAQVQSLMVKMQELMGEYNSYSQGASSS LQGGYKPLQGITRGQSLFSQHGGAVNTAPIATSMIIGVFIGMAVMWGILKKKGPGSGKEA HS >gi|157101653|gb|DS480671.1| GENE 44 57887 - 58525 532 212 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936043|ref|ZP_02083416.1| ## NR: gi|160936043|ref|ZP_02083416.1| hypothetical protein CLOBOL_00939 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00939 [Clostridium 
bolteae ATCC BAA-613] # 1 212 1 212 212 375 100.0 1e-103 MTGLKLAAALVCVLYGGWECIRRKQPLFFKLLLYAMASYLLGTLFTAGYALVYGHEPEGF HVGILGYMGTYFFLLSSYYGAMNRLADGGEARYRGYRLAAASGPLLVLVLLAHSVRTLGD GACLTYALLCLPIAPTLYFSIKHLILPDVDLGIIRVMRPYNACVIVFCMVQVLLQAPACS GLDIRAAGSVLSSVLLMTLLPMAEMGVRKWFI >gi|157101653|gb|DS480671.1| GENE 45 58513 - 59139 559 208 aa, chain + ## HITS:1 COG:no KEGG:Mpal_0500 NR:ns ## KEGG: Mpal_0500 # Name: not_defined # Def: hypothetical protein # Organism: M.palustris # Pathway: not_defined # 1 199 1 205 219 89 35.0 8e-17 MVYIIFLCLTIPMLLMLPLLEIRSRCIVGFMLLGTVTALSAYEINTIVYPMTGLSARSFS ELVPPMTEELLKAMPVLLYAVLLDDRRNRVLPIAMAVGVGFAILENSMILIDNVGSVTVL WATARGFSASLMHGLCTVVAGTGITYVKKQKKLFYTGTFGLLSISITMHALFNLLIQSDY DYIGMAMPLVLYGLIYWLYKGQRVRLPF >gi|157101653|gb|DS480671.1| GENE 46 59248 - 59820 524 190 aa, chain + ## HITS:1 COG:CAC0378 KEGG:ns NR:ns ## COG: CAC0378 COG1126 # Protein_GI_number: 15893669 # Func_class: E Amino acid transport and metabolism # Function: ABC-type polar amino acid transport system, ATPase component # Organism: Clostridium acetobutylicum # 1 66 174 239 243 84 56.0 8e-17 MAGEVLSVIRALAGEGLTMLIVTHEMKFVWDVSSRIFYMDQGELYEDGPPEQIFGHPKKE RTRAFVKGLEVFEQEITSRRFDYIEINTAIEEFGRRQILSQRHINNIELIFEELCVQTLL GRMGDEIRLGFAVEVSEADESCLVTVTYGGNAFNPFMDCADSLSMVLLSRMVRQYSHRFQ NGNNQMNLYL >gi|157101653|gb|DS480671.1| GENE 47 59993 - 61888 766 631 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936046|ref|ZP_02083419.1| ## NR: gi|160936046|ref|ZP_02083419.1| hypothetical protein CLOBOL_00942 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00942 [Clostridium bolteae ATCC BAA-613] # 1 631 16 646 646 1049 100.0 0 MKRKTWIAVLVLLCFIAVSDVNTGILSGKQSFCLAGPIAIARIAGKEDLLNWYQSDSANG GFAVLTADLVIDGKVELGMEQGNAGRRRELNTRDFTIRIANKGRLTVDNPDLYFMGNKEI MTVESGGRLSLSQGGTYESLYSDHILVQAGGKYEISSRFHLAGGIRDENREPETSDPPGE SWGEDVPAARPLLPNPDSTVPTVACAEGEPPKKEEYPAYDLVLYQADNGKYEWAELPVQW DLSSVDFNRAGVYRVWGNYSKEALTDRNLCNPSDLRASLDVAVQSRTSMKIKAADILSAD ADGYAVMRIRIAGIPDNAVGLYLYYSFDQVSYEMAAWTGKDGEYTNYLEQNAASGQAYDY 
IVFKYNIGNRPIWLKIQLKVRDGDAVRTVTSDPVKCQVSLVPEATKPSSDSDGSSGGNRG GGGQGTSGRGGVVSGNASGGPGVGGVNGGGVNGAGGVDSAGGGEMAGDQAGMGEGSPGGA GVWGGGFGIPATGQPDSIRPDTARVTQGDASASDHPEYGLFGERNRLGRDAYGRRGMPGD ENGDNTEEAGDALQANAGKVESGPGGDGQEGGEGDGSSASGGRHGRVKRFVYVFAGSAGI LLSANLGVGLLKGSRGESGVKRLVRKWRRKG >gi|157101653|gb|DS480671.1| GENE 48 62171 - 62875 615 234 aa, chain + ## HITS:1 COG:no KEGG:Acear_0463 NR:ns ## KEGG: Acear_0463 # Name: not_defined # Def: hypothetical protein # Organism: A.arabaticum # Pathway: not_defined # 16 232 16 231 235 234 54.0 3e-60 MLKFKILEKDGTEGTIEFEERYVINAGYTGRDQAAVKAHIDELKEEGIPAPDKTPVYFVK LPGKITQEKTFEVLDETDHSGEVEFVLLCDGESIYVGVGSDHTDRKLEVVDIPKAKQIYP NTISRELWRLEDVAGHWDDIIIRSWVINNGDKKVLQEAKLTAMLDPMDLLERVKKLLKNP DDTQGLVIYSGTVAALFKADYSLYFESELEDPVLGRKLNNVYELNCVSSWYKGD >gi|157101653|gb|DS480671.1| GENE 49 62927 - 63331 391 134 aa, chain + ## HITS:1 COG:no KEGG:Acear_0461 NR:ns ## KEGG: Acear_0461 # Name: not_defined # Def: hypothetical protein # Organism: A.arabaticum # Pathway: not_defined # 1 127 1 128 132 162 66.0 3e-39 MAKQEYEIRDVDLFGEWRLVEGGYNGTMEKILSYDPETGNYTRLLKFPPKTDMPEVLTHD FCEEIYVVDGYLTDTKKNLTMSKGYYGSRLPGMLHGPYSIPLGCTTMEFRYQDPGKPIDP ECSLLKLKLGGPRE >gi|157101653|gb|DS480671.1| GENE 50 63383 - 64147 619 254 aa, chain + ## HITS:1 COG:BS_ykrU KEGG:ns NR:ns ## COG: BS_ykrU COG0388 # Protein_GI_number: 16078421 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Bacillus subtilis # 31 250 33 253 259 161 38.0 1e-39 MKVAIVQFAVNDEESKEQKIIRMEGILDNLKGTDLIVLPELWNIGFFAYEEYAKQSELLT GPTFSSLARKAKELGSYIFTGSISEQDGDKCYNTAGLIDREGRLLGTYRKMHLFAAERQY MERGDKPVVIDTEFGKIGMSICYDIRFPELFRKEVEMGAEILVNCAGWPYPRVESWNMLH PVRAMENQCYMLSCCCAGASRGNPFIGRSMVIDPWGTVQAAAGNMETVVKSEIFPEQLAG IRETFTALKDRLLY >gi|157101653|gb|DS480671.1| GENE 51 64528 - 65268 405 246 aa, chain + ## HITS:1 COG:BH3254 KEGG:ns NR:ns ## COG: BH3254 COG3641 # Protein_GI_number: 15615816 # Func_class: R General function prediction only # Function: Predicted membrane protein, putative 
toxin regulator # Organism: Bacillus halodurans # 6 242 111 330 336 176 46.0 4e-44 MQFFFGKLVSKRTAVDLLVTPTVTIVVGVMVANFLAPPIGAGASAVGYMIMWATEQQPLL MGVLVATIVGIALTLPISSAAICAALSLVGLAGGAAVAGCCAQMVGLAVCSYRENRVNGL SSIGLGAAMLMLPNILKKPVLWLPPLITGMITGPIATCVFRLKMNGTPISSGMGTCGLVG PIGVVTGWFSPDQAAVLINETAISPVLKDWIGLALICFVLPALLSFVISEFMRKKGWIQQ GDYTLM >gi|157101653|gb|DS480671.1| GENE 52 65363 - 66289 550 308 aa, chain - ## HITS:1 COG:CAC0023 KEGG:ns NR:ns ## COG: CAC0023 COG0583 # Protein_GI_number: 15893321 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 299 1 298 299 159 30.0 6e-39 MEFRNLYSFLRVVELGSFTKAAQELGYAQSTITTQIKQLEAEMGFSLFEHVGRRVSLTVY GQQVIPYVNQILQIQEQISSISLTAPSEIHGTLKIGIVESIMNSLLLTNIKKYRERFPNV IIQIYPAVTRPLFEMLRRNEVDLIFAMGDQISMSDCVCACSHAESSVFISAPEHPITQMK QVTLAKVLEEPMILTGEITFLRQELTKKARSLGIELCPVIQTESSSIIINLVRQNLGISF LPEHLVRSAFLSGKVAVLPITDYELPFHVHICYHKNKYLTPQMAGLIQLTQEYWSEIDRL DDGVRQHP >gi|157101653|gb|DS480671.1| GENE 53 66403 - 66591 139 62 aa, chain + ## HITS:1 COG:PAB0341 KEGG:ns NR:ns ## COG: PAB0341 COG1146 # Protein_GI_number: 14520719 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Pyrococcus abyssi # 2 59 36 99 101 59 46.0 1e-09 MSTITINQKWCKTCGICIAFCPRHVFKMNENGYPYPARQEDCIGCRLCELRCPDFAVKVE EG >gi|157101653|gb|DS480671.1| GENE 54 66594 - 67724 876 376 aa, chain + ## HITS:1 COG:TM0878 KEGG:ns NR:ns ## COG: TM0878 COG0674 # Protein_GI_number: 15643640 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Thermotoga maritima # 1 373 3 381 386 339 47.0 5e-93 MGELRFMQGNETMTEGAIAAGARFYAGYPITPSTEVAETSSIRLPQVGGVYIQMEDEISS MAALIGASCSGKKSYTATSGPGLSLMQENLGVAVLGEIPCVIIDVQRSGPSTGLATKPAQ GDVLQARWGTHGDHSMIVLSPSSVQDCFDLMITAFNYSEQYRTPVIFLSDEIIGHLREQV IIRSPEEIKVVERRHPDCPPQMYKPYDHTEGLAPLAAYGSKYVFKVNGSMHDEMGYPCAR PDNADRMIRHLSDKIESHRDEIAITRKYAMKDAQYVIIAYGGSARSALEAMKKGRSQGIP 
IGVLQLVTIWPIAEKDIREAMEQAEAVVVPELNLGQYIGEIRMRNPKHIPVEGVNRVDGK PIEPADILKKIEEVAR >gi|157101653|gb|DS480671.1| GENE 55 67724 - 68536 786 270 aa, chain + ## HITS:1 COG:Cj0537 KEGG:ns NR:ns ## COG: Cj0537 COG1013 # Protein_GI_number: 15791898 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Campylobacter jejuni # 4 261 7 265 281 285 51.0 4e-77 MLQDFLLEEKLPHFWCAGCGNGIVLHAILNSIDRLGWTCQDTVVATGIGCWGKADDYVKT NTFHGTHGRALAFSTGIKLANPRLHVLTLMGDGDGITIGGNHLIHAARRNVDVTAIMVNN LNYGMTGGQYSGTTPKNAITKTSPYGHVEQSFDVCDVVKAAGGSFVARGTVVDAVQLERL IEQAMSHKGFSFVEVISPCPTHYGRSNKLGKAPAMMEWIKENTVSDAKASSMNREELKGK FVTGIMENREQEDYSTAYQHVIDELKGGEA >gi|157101653|gb|DS480671.1| GENE 56 68536 - 69120 565 194 aa, chain + ## HITS:1 COG:MJ0536 KEGG:ns NR:ns ## COG: MJ0536 COG1014 # Protein_GI_number: 15668716 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Methanococcus jannaschii # 4 182 9 187 187 119 37.0 4e-27 MGKQNEIIISGVGGQGLILCGTLMAQAAVLHDHRQATLSSEYGVETRGTFAKSDVIISDQ EIYFPDVTEPDIIVCLAQTAYERYAGKYGPEVLIIYNQDEVAADPAYSASQQGIDITKIS RELGNTAVANIVTLGIVAGVLKVVTAEGVLNAIREFFGSRGEKTVALNIRAFETGYEIGR NMDGCHSGPVPLCG >gi|157101653|gb|DS480671.1| GENE 57 69145 - 70683 1168 512 aa, chain + ## HITS:1 COG:PA1418 KEGG:ns NR:ns ## COG: PA1418 COG0591 # Protein_GI_number: 15596615 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Pseudomonas aeruginosa # 16 392 15 344 463 68 21.0 4e-11 MTVWTAAGIALVLFLIVGVGLFSGKRVKGASDFVDGGKKAGPFLVCGTIMGSLVSSQATV GTAQLAFHYGLAAWWFTLGAGIGCLILGVMYARPLRDSGCITELQIISRSYGAAAGSLGS VLCSTGIFISVLAQVVACSGLAITLFPHIPVWAAALASIIIMCFYVIFGGAWGAGMGGIV KLVLLYAASLVGMVYALSVSHGIGGIFSELIEQLCGTGLGLVQQSANGVSNLKDTADLAD RYGNLVARGAMKDIGSGLSLMLGVLSTQTYAQAVLSAESDRKAKRGALLSAMLIPPLGIA 
GICIGLFMRSHYMLQAEAETLKAAGMAIPDLPVLTSTIQVFPAFVLDYMPPLLAGIALGT LLITSVGGGAGLSLGMATILIKDIYQRMTTKRMDEKKELAATRGILGAILVTAACVASLV PGSTINDLGFLSMGLRGSVVLVPMSCALWLKYDINKRWVLTAIVLAPAAVLVGKLLALPI DSLYLGILVSVLCCCAGGIMGREPGKRRICDD >gi|157101653|gb|DS480671.1| GENE 58 70676 - 71593 731 305 aa, chain + ## HITS:1 COG:CAC3076 KEGG:ns NR:ns ## COG: CAC3076 COG0280 # Protein_GI_number: 15896327 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 1 298 1 298 301 249 46.0 5e-66 MIKTLGDVLRTAAQSRPLVMAVAAAQDEAVLTAVHLAAEHGLVKPLLVGDKTGIIHIAQK AGIELELGDVIDVPDKESACMKAAELVRDKKADVLMKGMVDTGLIMRAVLDKENNLRKSP VISHVAVMQVPGFDRLFYVTDSAMNIAPTLEQKAAILKNAVEVAHALENEMPRAAVLCAV EKVNPKMPCTLDAKELARQNKQGDIPGCLVDGPLALDNAVSIEAAKHKGVDSPVAGRADI LLVPDIEAGNMLNKAMEYFAGADKAGVMMGAKVPIILTSRAASPQSKLYSIALGVLIASK TKELF >gi|157101653|gb|DS480671.1| GENE 59 71593 - 72714 908 373 aa, chain + ## HITS:1 COG:CAC1660 KEGG:ns NR:ns ## COG: CAC1660 COG3426 # Protein_GI_number: 15894937 # Func_class: C Energy production and conversion # Function: Butyrate kinase # Organism: Clostridium acetobutylicum # 5 357 3 355 356 318 44.0 1e-86 MSEVYQIIAVNPGSTSTKVAVFDNDKKRFSVNVSHPVSELKQFQEIVDQLDYRKKTILSA LEEHHVDLRKTDAFSGRCTGLLPMTGGVYEVNDLMYRHGSMGIGSKHPGNLGPMIARDFG LQFGSRSFVVNPSSTDEFRLEARLTGLGDIQRTSRGHPLNQKEVAARYAARLGKCYEDIN VVVVHMGGGISVTAHEKGKMVDTADSTRGEGRMAPTRTGALPAASLVELCFSGKYTQKEL MDKIMKTGGWTDLLGTADALEVEQRIGEGDMYAKLVYETTGYQIAKDIGAYAAVLCGEVD GILLTGGLAHSQYLVEIITQMTKFIAPVCVFPGEFEMEGLAAGALRVLRGIECPKFYTGR PVFQGFHKMGQLA >gi|157101653|gb|DS480671.1| GENE 60 72715 - 73596 704 293 aa, chain - ## HITS:1 COG:CAC0674 KEGG:ns NR:ns ## COG: CAC0674 COG1760 # Protein_GI_number: 15893962 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Clostridium acetobutylicum # 1 284 1 284 290 221 44.0 9e-58 MDFKNAKELLDFCQKLNCPISDIMKQRECTLAETSLEDIYAKMDHALSIMRESASSPIQE PKKSIGGLIGGEARLADLHRARGAAICGSVLSRAITYSMAVLETNTSMGLIVAAPTAGSS 
GIVPGLLLALQEEYSIPDSRVIDALFNAGAIGYLAMRNATVAGAVGGCQAEVGVASAMAA SAAVELMGGTAKQCLNAASSVLMNLLGLVCDPLGGLVECPCQGRNAAGAAVALTAAELAL SGIRQLIPFDEMLAAMYRVGRQLSPDLRETALGGCAATPTGKALGCRACPCSP >gi|157101653|gb|DS480671.1| GENE 61 73586 - 74278 712 230 aa, chain - ## HITS:1 COG:lin1927 KEGG:ns NR:ns ## COG: lin1927 COG1760 # Protein_GI_number: 16800993 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Listeria innocua # 1 217 1 215 220 177 40.0 1e-44 MAFISIFDVLGPEMIGPSSSHTAGACSIALLAGKMADSPITHVEFTLYQSFAKTHKGHGT DLALLGGIMGFSTDDRRIPLACSVAEERGLSYRFITDDTDSDTHPNTVDIRITCLNGRTY SVRGESLGGGKVRISRIDHIDVDFSGEYSTLIIIHRDRLGVLAHITRCLSEGYVNIAFMK LFRETKGDRAYSIIEFDGSLPDHMVSRIYENPDVQDVMFIPVKGENENGF >gi|157101653|gb|DS480671.1| GENE 62 74454 - 75368 818 304 aa, chain + ## HITS:1 COG:CAC0023 KEGG:ns NR:ns ## COG: CAC0023 COG0583 # Protein_GI_number: 15893321 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 289 1 289 299 166 31.0 7e-41 MELRQLQTFVTITQTESFSRAAEVLGYTQSALTVQIRLLENELGVKLFDRMGKRVYLTAP GRQFLGHANQIIRDIKCAREFAKSEEELCSPLHVGTLESLCFSKLPQILNSFRSSHPRVP VKVTASTPNQLIDMMERNQLDLIYILDRPRYNDNWNKVMEVREPIVFVASPSCGLGGNRV ISLEEIMDEPFFLTEKNENYRRELDCFLESQGMVLTPFLEISNTEFIIRMIRKNRGISYL PLFAVQEYVKRGELAVLDVADFRLAMYQQVIYNRNKWLTKEMDAFIRITANHSGTGSICG GNDI >gi|157101653|gb|DS480671.1| GENE 63 75770 - 76789 849 339 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_1991 NR:ns ## KEGG: Pjdr2_1991 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: Paenibacillus # Pathway: not_defined # 3 269 76 354 433 146 32.0 2e-33 MPEAENQTYARRLEVLAAQGEFNDIVELRETDSLVQAGLLAPMPEEVYSLVENPGTCGGI CYGVPLYTTTLGMIYNAEIFERLGLCAPRTYEEFLRVCATLKASGYDAIALGAAGDRHMK YWGNYLFCNYIASGDGQACWTKEKAAEMLGDFRDLASRGYINSRYRAVTDRETARAISTR QAVMVYAGPQMLQQIENLNPQIRLGFFFLPGKNGTVYAMDDRSVQWGISSLTAGDGKKMD ACARFLQFCYSVGVYETILEIMNGNSVTVRRVNMPDTPDRKIMEAAYEGNPVYTEFLLKD AQPPDGFIECYDRMLVETLWGGTSISSLADNMLERWEEP >gi|157101653|gb|DS480671.1| GENE 64 
76810 - 77757 736 315 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_1989 NR:ns ## KEGG: Pjdr2_1989 # Name: not_defined # Def: putative sensor with HAMP domain # Organism: Paenibacillus # Pathway: not_defined # 3 315 14 322 592 66 21.0 2e-09 MIRLATSFVILGLVPLLAAGTALTGHFRSNMERVVLDDIGRMVSYGGSNTEAVLEECSSL TKHIYDVSTDDGMFLYQILKSPGLGREERKMEITLLLSDFLDRDSRLRSVYFKDRKGQIY YATRNAYKVLDEDAFRYWTGKEDEKGDFSVLPAHMDDYFQDSGSQVITFRRSYQDITSLK TIGSCLGHFYMDMDVSRLQSALSDIDLGFGASFYIMDDSGACIYRVDTDGSGNTADSMAS LLPSMNQETGSLLGDGTYVVYKRLEPYGWTVAVQAAQDRVLVMESTIRYTAVFLGAVFLA FLCLYSYFPERIHRP >gi|157101653|gb|DS480671.1| GENE 65 77927 - 78643 522 238 aa, chain + ## HITS:1 COG:BS_yitC KEGG:ns NR:ns ## COG: BS_yitC COG2045 # Protein_GI_number: 16078158 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Phosphosulfolactate phosphohydrolase and related enzymes # Organism: Bacillus subtilis # 18 227 18 224 228 97 28.0 2e-20 MEIRILELIEGAKKAEGLTVIIDVFRAFSLECYLYARGASAVFPAGSVEEARHMKQVHPE YLLIGERWGRRCEGFDYGNSPSQTRDADLSGKKIVHTTSAGTQGIVNAVHAEEILTGSLV NARAVADYISRRQPETVSLVAMGNGGERTAREDVICARYIKCLLEGRPCLIDEEIRSLKT AGGEHFFNPHTQEIYPQEDFWLCTKHDIFPFVLRVEKRDDGSLESVKIDSGEGRDAES >gi|157101653|gb|DS480671.1| GENE 66 78669 - 80117 1552 482 aa, chain + ## HITS:1 COG:no KEGG:Bsel_2944 NR:ns ## KEGG: Bsel_2944 # Name: not_defined # Def: ABC-type sugar transporter periplasmic component-like protein # Organism: B.selenitireducens # Pathway: ABC transporters [PATH:bse02010] # 13 476 14 478 485 409 44.0 1e-112 MKKREGGILAWAAAVFLLVGMAAGCSRSEEARYQLNPKQPVMITVWNYYNGQTKQAFDNL VSRFNETLGTEMGIIVDSVSYGDVNQLAEEAYNSASGRLGADAMPNLFAAYPENAYRIDV LGKLVDLNQYFSEDELKVYRADFLQDCSFGGNYGLKILPVVRSTENLFLNKTDWDRFAGD TGVRLDRLSTWEGVAETAKAYYEWSGGKAFLGIDSLANYIIVASVQTGNDIYKLQDGQVT FSFDQDLARILWDELYVPYVKGYYANLGKFRSDDEKTGDILAYVGSSAGAVYFPKEVTLN QNEIYRIESSVLPYPCFRDGDLCAMVQGAGFAIAQSDSTHEYASAVFLKWLTQEEQNLDF AVNSGYLPVENSSLNQGYAKAMAAYSGTPEYNPTIGGSAEATLKMLSEYRFYSNPPFEGS YEMRQFYGEYMEASVDNARLEAEDKMKAGMTKEEAQAQVTDEAHFQEWYEGWQDNAAKLL GS 
>gi|157101653|gb|DS480671.1| GENE 67 80126 - 82300 1590 724 aa, chain + ## HITS:1 COG:aq_035_2 KEGG:ns NR:ns ## COG: aq_035_2 COG2199 # Protein_GI_number: 15605636 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 534 717 75 241 251 85 30.0 3e-16 MKQKQSIASKCSAFTAGLVILQSLLVLGMLIAGGILKQTRDTAYQSFASTVTLRASSIQD QIDNRWTNIHPYVQELSDRLSESGYGEGKMDAESFLIGSADTLISMLHVSGTTGTFLILD EPGQGKSMPALYFRDYDPETYSADNSDLFQVMGPAEISRRYQIPLDSVWHYGLSLNEENR KFFDMPFQAAGQSRDYEKLGYWSRPFYMTPEDVPVITYTVPLFDRAGQVHGVIGVEISLE HIRKLLPANELSAQDSPGYMVAAMQENPQELFVLVKKGEYQNRLFGDEPLLRIKQEDETY SICQVSSLTGEKIYGCIKSLDLYGTKNVPYDSGNWVVLGLMRGDALYGFLHRIIKIFAVT LMVSLLLGAGAAGFISRKITEPVAKLAAKVRNFDYTRKIRLERTGIEELDLLSQAVEISN QNLLDNTLKMSEIMDLLGLHIGAFEYSPGGYGVQVTKQIFPLMELPCEDENHLYVDEKVF LSRLSDMQRCPVPEENGIYRISGKTERWVRITLSLRHGHSLGIIEDVTGDMMKKQRIKYE RDHDSLTRIYSRAAFQREAEAILEQAGEMPGQTGEMPGLSGNGPVNLGTAAMVMLDLDGL KSVNDTYGHESGDIYIRETAVCMKGIPEDHAIVGRRSGDEFFILLFGYEDRNGVRKQLRD FYDILDKHPAVLPDGIQLKIQISSGIAWYGGELCTFEELVRCADIALYEAKSTMKGQTVE YSKG >gi|157101653|gb|DS480671.1| GENE 68 82360 - 83250 729 296 aa, chain - ## HITS:1 COG:hcaR KEGG:ns NR:ns ## COG: hcaR COG0583 # Protein_GI_number: 16130462 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli K12 # 1 291 1 293 296 96 30.0 4e-20 MNTVQLECFLAVAEYLNFSKAADFLKITQPAVSHQITSLEDELGTRLFVRTSKNVNLTEA GLMFMEDASSILKIACGAKKRLGIGSNDVLFLTVGCHNQSELDLLPPVIRRLADRFPTLH PTVKLIPFKSMANLLDEERIHVMFGFQQDEQKKPIGLFHELARIPLACVCSAGHPYAGRT CLNVDELEGPIVLGEPHRLPATVLQSQIRASSSCHPSELFFVDTYESILALVRAGLGYSL LPFYPGSDKKGLCYIPVTGIPPLAFGLYHKGVKGNTILKEFIRLLDEEVAAGEFRA >gi|157101653|gb|DS480671.1| GENE 69 83543 - 85354 202 603 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 357 571 141 354 398 82 29 2e-14 MRTILAQLKQYKKDSILTPLFTALEVIMEVMLPFITALIIDEGIQEGNMRKVYTYGGVMI VMAFISLISGAMAGKYAASASSGLACNLRKGIYDKVQDFSFSNIDKFSTAGLVTRMTTDV 
TNIQNAYQMCIRVAVRAPLMLVCSMVMSFMISPRLSMIFLGAILFLACILGIIMMAARKI FNVVFTKYDDLNAGVQENVSGIRVVKAYVREDYENQKFTSAAENLCRLFVKAEGTLAFNN PAMMLVVYGCILAVSWFGARFIVIGTMSSGELTSLFSYIMSIMMSLMMLSMIFVMIAMSA ASIRRIEEVFKEEPDIRNPERPVMDVADGSVDFNCVCFSYGKGKEDAGGTDKGMDKGRGG QALSDIELHIRSGETVGIIGGTGSGKSSLVNLISRLYDADSGSVCVGGRDVREYDLETLR DSVAVVLQKNVLFSGTILENLRWGNQDATEEECREACRAACADEFIERLPHGYHTFLERG GTNVSGGQKQRLCIARALLKNPRILILDDSTSAVDTATDARIREAFGAAIPGTTKIIIAQ RISSVQHADHILVIEDGRINGYGTNDQLVRSNEIYREIYESQSRGGGDFDQVQAENAGKE SVA >gi|157101653|gb|DS480671.1| GENE 70 85354 - 87330 196 658 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 433 639 16 226 245 80 29 9e-14 MEDRQSSRKKQNSGQAQSSRKEQTGRQEQARETKEAIRVLGRVIRYMAANYKLAVTAVLA CIIVSAGATLKGMVFIQSLVDDYIRPLAQQASTGNTAPDFSPLAGALCRMAVLYLIGILA AYAYNRIMVTVSQGTMKRLRVELFTRMESLPIRYFDTHAHGDIMSVYTNDVDTLRQLMSQ SVPQVANSAVTLAATAVSMLILNIPLSILTFAMACVMLFATKCLSGKSAAHFGKQQNSLG EVDGYIEEMMDGQKVVKVFCHEEQAKKEFEVLNERLRDNADKANRYANLLMPVNANIGNL SYVLCAVAGGILALNGIGGITIGTIIAFVGLNKNFSQPITQISQQMNSVVMAAAGAGRVF ELLDQKQEEDNGYVEFVNVRERKDGTLEETKERTGVWAWKHPHKAEGTVTYKKMEGGIVF DGVDFGYDPSKMVLHDIKLFAEPGQKIAFVGSTGAGKTTITNLINRFYDIQDGKIRYDGI NINKIKKPALRRSLGIVLQDTHLFTGTVMDNIRYGKLDATDEECVNAAKLANADSFINRL PHGYETILTGDGGSLSQGQRQLLAIARAAVADPPVLILDEATSSIDTRTESLVQAGMDAL MNGRTTFVIAHRLSTIRNADCIMVMEQGRIIERGSHEELIGRKGKYYQLYTGNFEEAG >gi|157101653|gb|DS480671.1| GENE 71 87471 - 87920 417 149 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936072|ref|ZP_02083445.1| ## NR: gi|160936072|ref|ZP_02083445.1| hypothetical protein CLOBOL_00968 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00968 [Clostridium bolteae ATCC BAA-613] # 1 149 1 149 149 293 100.0 4e-78 MKIRHRLAGVAVMVSILGCTFPAWAGEWIQEESGQWVYEENQELLKGWNRIDGIWYCLDT ETGVWIEKPSMTSEAACRLLENKLLEMGMYQDEEEPLQFKVDYENNQMVQVSVGYEDKPD VFHRINTYEIDKKKGTADPVVGDKEFSLW >gi|157101653|gb|DS480671.1| GENE 72 88036 - 89415 1091 459 aa, chain + ## HITS:1 COG:all0778 KEGG:ns NR:ns ## COG: all0778 
COG4826 # Protein_GI_number: 17228273 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Serine protease inhibitor # Organism: Nostoc sp. PCC 7120 # 71 438 3 356 374 108 26.0 2e-23 MKRNIISIAAAGVLATAISVLAAGSPAAFAAPFNGVSGEGYQCESGSRDPDPIPYNDSGS MRAIAQNNTVNPEFTDGISRFSYKVFLPLLRDSRENMNFSPVSLYYALAQASEGAEGNTR AQMLALMGIGSQTPDGSPEPAQALHTAQPPAGLAASCGRLYRQIFRNNEKTKLKLVNSMW LKEGAPLNDSFKRTSENEYYASVFRVDFGNPDLPAIIRQWIGWQTDQAISMDVATASGES MKLVNAVNYQAQWTGGFRTEDNITDVFHSADGQDITCQYMTGHWNGTYYRGQGFLRASLS LTDNASMIVVLPDQGVTIRELLSSEQYTREMFGRIDELESYGGIYWKIPKFQTATTADLG PSLASLGLTGAFDVSRADFTGITGESLSLSGVIQSTVFSVNENGVGPFSENGEAGFAGLY SGEPVLQMNLDRPFLYAVTGYMGAPLYIGVCNRPAVNNP >gi|157101653|gb|DS480671.1| GENE 73 89757 - 90665 907 302 aa, chain + ## HITS:1 COG:no KEGG:Clocel_3685 NR:ns ## KEGG: Clocel_3685 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulovorans # Pathway: not_defined # 153 273 207 322 347 64 28.0 5e-09 MDSQIFPMTMKGEGEVSFIPAFDRESNRLVLFFARADGTMAYKTDQLETNNRIRGQLRQP DSRVAAVSFQDMDGDGWQDIVLITACANEGAGKQGKPYKVGDVLFQKNDGFYRDYRLSEK MNRFGMNKSIRFITSFVRDGYSTEFLYTATAQDELLSHGMTVISEQSRSIRFEKFGRLLV VPGTYRMAEYTVFMLYLVNEQGYIVWSFQPMGEYEHLYALKGVICQDIDGDGLKDIVILA DYSYEGSDGGTVVEGSYAIYYQRTGGFFEDTDMKQTLILEEGDTLSGLTDRARAYWGWRT EP >gi|157101653|gb|DS480671.1| GENE 74 90662 - 91321 782 219 aa, chain + ## HITS:1 COG:CAC1506 KEGG:ns NR:ns ## COG: CAC1506 COG0745 # Protein_GI_number: 15894784 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 216 1 216 217 207 48.0 1e-53 MIKVLIVDDEKPICDLIDLNLSASGYHCTSVQDGAEALRLIENQFFDLILLDIMLPGVDG YDIMSYIRPMDIPVIFITARHEVKDRVKGLRLGADDYLVKPFDVVELVARVEAVLRRYNK TERLLTAGDVTVDVEARRVTRAGRNVELTNKEFGLLVLFIQNKNVALFRETLYEKVWEGE YSGDSRTLDLHVQRLRRKLGWENNLVAVYKVGYRLEVVQ >gi|157101653|gb|DS480671.1| GENE 75 91318 - 92730 1460 470 aa, chain + ## HITS:1 COG:CAC1507 KEGG:ns 
NR:ns ## COG: CAC1507 COG0642 # Protein_GI_number: 15894785 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 1 458 1 469 473 218 31.0 2e-56 MKFSFKLLLWTMIVMALGFGFSGYYFVNYVFQTSMDREVNQAMDESSILRFAFETAALNV PAKYDVLQDTTVSQIASNLESGGRSSGRLLRICDEQKNVLYASTGFGGGDGLLDRTGQDT RTYRVIEQDGAYYIQTGTAVNALDRVLYLETMKDVSEVFRDRSRGFSVYRRVTVAMLLCG TAVMYLIASFLTRPIRLLTRATKRMASGDYHHRARKVSSDELGQLTTDFNQMANALEDNI NQLEDEIQAREDFVAAFAHELKTPLTSIIGYADALRSRRLDEEKQFMSANYIYTEGKRLE AMSFRLLDIMVTRRGEVEFTMVSAESLFLYLYDMYVINKSMKIYFNYDDGMVRAEANLIK SVLMNLLDNAFKASEADGLIEVYGHLMKDGYCFEVKDHGVGIPKEELHKITKAFYMVDKS RSRSRNGAGLGLALCAEILELHKSRLNIRSELGKGTTMSFTLPLWKEDQL >gi|157101653|gb|DS480671.1| GENE 76 92727 - 93578 759 283 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936078|ref|ZP_02083451.1| ## NR: gi|160936078|ref|ZP_02083451.1| hypothetical protein CLOBOL_00974 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00974 [Clostridium bolteae ATCC BAA-613] # 1 283 1 283 283 556 100.0 1e-157 MRRMAANLLFLLFAGLIMALSICGPEMLTRYRDRTMLGEIHAMTADTEGEGYRYTLSAAE KLHILSESLSSQTFPETGQALIPETGQALTAANGSQAEYESLEGAYAFVMNHRGPSGQEI TDSQIYETCNRGLEELKTLGILPESVRPVEKDEYDAVLYSAIDVLEPRNNVAVWKLSLIN SQKNTDKENRLMEAYIDGDNGRIYEFYARSSRLWDDMDPEQIVGTWSSYMGLEKPAAFGD QNPLMEATPYFEKYVISQGEEEETIVTVGFYEGINELFLKISR >gi|157101653|gb|DS480671.1| GENE 77 93675 - 95210 1102 511 aa, chain + ## HITS:1 COG:BH3665_2 KEGG:ns NR:ns ## COG: BH3665_2 COG0860 # Protein_GI_number: 15616227 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 128 303 6 180 180 126 39.0 1e-28 MDRIEIKPYDYGRQTERRRQARRRARRRAFFLAAAELAAGVCMTMAVMAMVIPEKRELGM AGTEQGRIKQEITERERIERQCTGRGGTEILSRILEDTCVLQSSIKGQGESLLSFGSQNS LYTGPLVIAVDAGHGGEDEGASQEGVMEKDINLAIAERLKVKLEDMGYTVVMVREDDAYR SKEERVEAAHKVRAGAYVSIHQNTWEDAAARGIETWYSGKDGASDSGRLAALVHKEAVRS TGAEARELRGDAEFTVTGQTFVPSCLIETGFLSNPRERERLTDPQYQEKLAGGIAKGIDL 
YFNPKTMYLTFDDGPSAENTSAVLDVLKARNIRATFFVVGENVRKHPDVAKRIAAEGHTI GIHCNRHDYKELYESRDSYLADFEEAYRAVLEVTGVKPVLYRFPGGSINGYNRKVYKEII AEMDARGFVYFDWNASLDDALKKSRPQELIDNAMKTIMGRQQVVLLAHDMVHSTSLCLDG LIDQLPEYRMEPLTPEVAPVRFREQDSAGVY >gi|157101653|gb|DS480671.1| GENE 78 95214 - 95774 385 186 aa, chain - ## HITS:1 COG:all1011 KEGG:ns NR:ns ## COG: all1011 COG0110 # Protein_GI_number: 17228506 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Nostoc sp. PCC 7120 # 1 185 7 191 192 181 48.0 9e-46 MINTEKSKRLDGQLYYNQDPLLLEEHLNVQMLLHQYNHLPPNKRKKRRKLLKQILNHPPK DACIEQPFHCDFGYNIHIGSHFYSNYNLTILDCNKVTIGHHCLIGPNVSIITVNHPQDRK LRSRDLEYATPVTIGNDVWIGCGAIINPGVSIGDNVIIGSGSIVTKDIPSNSLAVGNPAR VIRTLD >gi|157101653|gb|DS480671.1| GENE 79 95942 - 96940 765 332 aa, chain + ## HITS:1 COG:BS_degA KEGG:ns NR:ns ## COG: BS_degA COG1609 # Protein_GI_number: 16078147 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 4 329 2 331 337 177 31.0 4e-44 MEKRATIKDVARLAGVSISTASLAINGKEGVTDKTRKMVLDAVEELSYQPNNAARNLRTS GTKVIGLLIPDVRNCFYSEITEYIRREVESYGFFLILGITGNSSNNEKKYISEFISRKVD GVIWVPQLTYVNDIEHMKKLDEYGIPYLFLAAYYEQSSAPFVMCNLKKGSYLMTQYLIER DIHNIVMLIGDRRVDETYISGYKMALEEAGLVYSESNVYESTYHFDDVQKLAMEILKERP EAIMTNSDFTACAVLQAARIKGLSIPRDLSVCGYDDVIYATINQIPLTTVSQPIERMCKL AVNNLMNLLQYGIVPDSVILEPSLIIRDTVKR >gi|157101653|gb|DS480671.1| GENE 80 97028 - 98029 672 333 aa, chain + ## HITS:1 COG:PM1377 KEGG:ns NR:ns ## COG: PM1377 COG1879 # Protein_GI_number: 15603242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 61 289 32 258 309 88 30.0 2e-17 MKKKMVAMLLVLSLMGTLAGCSASGGNAGDKAEQKHTDTAATSGGDNKADSGMEFVYVTQ DSANPYFIQVYDGFQSKCAELGIKTQILDSKYDVATQVSYVEDAVQRDVDGIMISPLDEN ALKTVVDQAKEKGIVVSAEAQPVSNAQIDGSLDEYNIGYAIGNAAAEWIKAEQGGKGRAL ILAQDDVEAVIRRGDGINDGILENAPEAEVVARQNASNTDLGMKVTESVLLAKPEVNVIC 
ACDDFTAIGAYEAVSAMQKIPDNFYIGGCDATDEGKEKMKVENSVYRSSCNLFPYEAGED LAEAMYKYLTEKQEDAVVQRRYEPVWQKDVVGN >gi|157101653|gb|DS480671.1| GENE 81 98069 - 99583 191 504 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 280 492 17 225 245 78 25 3e-13 MNNRLEETPQIDDTDIILEVHKISKRFGGIKALNNIDFSVRRGEVHALVGENGAGKSTFI KILTGAHAPSEGKIMFDGKEYSGLTPALALDAGITAIYQEFNLIPFLSVSENIYFGRELM KQGFLDYEQMNENTKKMFKEIGTAINPKIRVQRLGIAQQQLVEIVKAISKNAKLIIMDEP SAPLTEQETETLHTIVRQLKGRGITIIYISHRMEEIFEICDRVSVFRDGKYISTHPVSET SRETLIADMVGREMGETFPDLGHINEEKVLEVSHVSNRRLKDMSLVLRKGEILGIGGLVG AGRTELLRALYGADEITEGEIILKGKKINITSPAVALKNSIALLPEDRKQQGVIMGMSIE HNISFAILDRLSRFAVIKKKQEKDICNKLVADLSIKISSLLQPVMKLSGGNQQKVVLAKC LATECNVLFIDEPTRGIDVGAKQEIYKIMRQLADNGTSIIMVSSEMSELIGMSDRIIVMK EGQVVKELMPEEYSQELILNYAAL >gi|157101653|gb|DS480671.1| GENE 82 99597 - 100553 709 318 aa, chain + ## HITS:1 COG:BH3731 KEGG:ns NR:ns ## COG: BH3731 COG1172 # Protein_GI_number: 15616293 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus halodurans # 1 308 1 310 314 243 49.0 3e-64 MKNKMNINTFIAEYAIIIIFIVLFAAMSIFAPNFLTGSNIANVLRQVSINGICAIGMTFV ILTGGIDLSVGAIIGVSGVLTAMMMINNVNPIIASLISLALCTLIGFATGLVISHIGIPP MVATMGTMTSLRGVAYLITGGTPVFGFDSRYAVIGQGYVGAVPIPVIILVLSFAAGIFFL SKTRHSRYIYGVGGNEEVARLSGISVHRIKAFVYAVSGFCSALAGLVMLGRLNSGQPRAG ESYEMDVITAVVLGGVSLTGGVGKISHVVFGVLIIGVLTNGMTMMAVDDYWQRVVKGLVL LLAVGFDRFIQKRQQKSE >gi|157101653|gb|DS480671.1| GENE 83 100634 - 102016 916 460 aa, chain + ## HITS:1 COG:TM0307 KEGG:ns NR:ns ## COG: TM0307 COG2407 # Protein_GI_number: 15643076 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Thermotoga maritima # 28 460 26 472 473 107 23.0 4e-23 MMIRPKMGFIVFGVHKDGLKDPLGVPFINQKIIDESKAAIEAKGVKLVENEIVVAYKQEA REALARLKHDDSVDGVILFSGTWVWASEIVAAVRDFERAGKGVIIWTVPGSQGWRLVGTL 
ALGASFKEIGMKYRSVYGSDNDTVEKVACFSRACALRNKLNMTTIGAFGGRGMGLTCGCA DPSQFMREFGVDIDSRDSMDILKAAEEVTEEEIQDVKENLIKPYFQEMPPDDGCTERSIR LYLAVKKIIEKEKFDMYVIQSFPGLAEEYAASCFTQSMMLQQGIPTATLCDYNNVLTVFL LSNLTPDPVYYGDFQCIDKEKKVVKVIGDGACAPSLAGPDKARFAHHGLPTEGSAGGLSV EAVLKPGKVVMGRIGRDNGMFEMILHRGQVYTPDPDDLKDQRTESGMWFWPHAFIKMDVD YDYGVQVWDSEYITLAYGDEEIYGTLKEFCYLTDIKLIEM >gi|157101653|gb|DS480671.1| GENE 84 102071 - 102964 463 297 aa, chain + ## HITS:1 COG:VCA1025 KEGG:ns NR:ns ## COG: VCA1025 COG0363 # Protein_GI_number: 15601778 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Vibrio cholerae # 26 273 1 237 266 76 28.0 5e-14 MIKKQDLYKWCSIPAEELESHPELVIRFRMVKDSCEMGQMMARELVDEIAQAGREERQFR AIVPCGPKCWYAPFAEYINKNRINMKHVTIFHMDECLDWQGNLLAQDDPYNFRTFMLREF YGPIAPDLNIPEANRNFLTPKNMYEVKEKIAEAPLDYTLGGWGQDGHIAYNQSRRHPFSH ITIEELKESSIRIQENNLDTIITLGQRSYGAAYQFVPPMSITLGIRECLSAKKVRLYSDT GSWKQTALRVALFSEKDSEYPMTLLQDHGDAIITATYETANHPISRHPEWKFAGVNI >gi|157101653|gb|DS480671.1| GENE 85 102961 - 103998 621 345 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936087|ref|ZP_02083460.1| ## NR: gi|160936087|ref|ZP_02083460.1| hypothetical protein CLOBOL_00983 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00983 [Clostridium bolteae ATCC BAA-613] # 1 345 1 345 345 706 100.0 0 MRKLKYDYVVGTGGIGSGILFRFSDNITLGRNESRLAELMDVRDFCKLHIILHYVITYLN KSIPVYAIGRVGDDVSGREVLGLMEQYGINCRYVNVDSHEKTMYSVCFVYPDSDGGNITT SNSASGNMREEDIAAFFEQEQLKGRGLVLAAPEVPLDIRLYLLKKGRENSCFNVASFTSD EMGECISSGAFQYVDLLAINKHEMETLVTAGGVEAESGAPECYGFLRSQNPGIQLIVTQG GEGAQFFANGYSERIPAIWKPVNSTAGAGDCFLATVISGLICGVPFYEHDSSVGLGGLAA LASSLKVTCKDTIDFTLDRERLKEEAAKLGIDFSEKIKKLFFNTD >gi|157101653|gb|DS480671.1| GENE 86 104127 - 104522 368 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936088|ref|ZP_02083461.1| ## NR: gi|160936088|ref|ZP_02083461.1| hypothetical protein CLOBOL_00984 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00984 [Clostridium bolteae ATCC 
BAA-613] # 1 131 16 146 146 251 100.0 1e-65 MADGVNSYLLFFSAGSHFLLPLSCVKRVTDMKEREPEMALADFLELPEAGNLSDQAYLII ADCGDREIGIRAEVVTGLVQVEEEQIYAIPEAVRSSRNQYMDRMAALDAEEEKGKLAFIV VPSLLIPRPVL >gi|157101653|gb|DS480671.1| GENE 87 104536 - 104862 247 108 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936089|ref|ZP_02083462.1| ## NR: gi|160936089|ref|ZP_02083462.1| hypothetical protein CLOBOL_00985 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00985 [Clostridium bolteae ATCC BAA-613] # 1 108 2 109 109 207 100.0 2e-52 MELVTVKDGSCTVYVEKNLIKQVVIHPQIQQVPGAPDHICGIMLYERTPVVFCRLGDSGG CTCGFILGLRDGTMIGVIGEPGAEDSGEPEKMMEVMPGIWEKKCDTIE >gi|157101653|gb|DS480671.1| GENE 88 104855 - 105691 548 278 aa, chain + ## HITS:1 COG:BB0040 KEGG:ns NR:ns ## COG: BB0040 COG1352 # Protein_GI_number: 15594386 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methylase of chemotaxis methyl-accepting proteins # Organism: Borrelia burgdorferi # 5 265 18 282 283 200 42.0 3e-51 MNDREFCEIVGYIRDNYGINLEKKQVLIECRMARELEKRGLTSFAQYMEQMRKDRTGEMA GEMVNRLTTNYTYFMREPAHFSILNNKILPELFENRKMGGCSIWNAGCSTGEEVYSIAML IQDYGDRFERMPAVRILATDISAEVLRKAEKGIYPLKEMEDLPELWKRKYCTVEDNHTFR VDEKLKYNIRFRRHNLMEMPPGPEKFDLILCRNVMIYFDRISREKLIKQLERCLSPGGYL LVGHAELLSREETRLEAVFPAVYKKPVKENEDRGGLYG >gi|157101653|gb|DS480671.1| GENE 89 105684 - 106049 467 121 aa, chain + ## HITS:1 COG:BH2444 KEGG:ns NR:ns ## COG: BH2444 COG0784 # Protein_GI_number: 15615007 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Bacillus halodurans # 1 119 1 119 121 117 50.0 6e-27 MDKKILVVDDAMFMRSIIRKILKEDGYTQVWEAQDGERAMELFREVSPDLVLLDITMPGR SGLEVLEEMLSLVPNIRVIMCSAVGQEMMIQKALTIGAADFIVKPFKADEFSRIVNRCLE Q >gi|157101653|gb|DS480671.1| GENE 90 106069 - 107745 1629 558 aa, chain + ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: 
Clostridium acetobutylicum # 5 558 4 514 555 195 28.0 2e-49 MRERFRNYKTGRKMVIAFFSIILLYVVTVTVAIVNVETISSRMEAIYADQFANVQSSLRM TASLRAVGRNIAILAATEGLVDEDEYLQNTRELIRDEQAALSELSTGYITAPEKVEELNE KFQILAETRARIMTLLEAGEDERVLSLYADEYLPRSNDVRIVLSEVADLSAEETAASIET EHHSNHRVIIMLVILSAVCIAATILICSMVTRNIVRPINEVKEASNTISNGRLNIALGYR SKDELGQLADDIRRTAKVLNSYVSEIQSGLTALGNGRLNYRSEVEFKGDFVALGNGLKEI SRLLRNSLQQINSSAEQVSLGAEQVSNSSQALAQGASEQAGSIEELAVSINEIAQSVKDN ADSAVDSSRQAALVGQKLEECDGQMETLMKSIHEVKNNSGKITGIVRQIEDIAFQTNILA LNAAVEAARAGDAGRGFSVVAEEVRRLATKTAGASKLTAELVEKNSDAVSDGMDAVNLTA QTLKTSVEGARQVSHKMDKISETSVQQADAITQIRKSVELISEIVQGNSATSEESAAASE ELSAQAQLLKELVERFEF >gi|157101653|gb|DS480671.1| GENE 91 107764 - 108387 691 207 aa, chain + ## HITS:1 COG:BH2434 KEGG:ns NR:ns ## COG: BH2434 COG1776 # Protein_GI_number: 15614997 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Chemotaxis protein CheC, inhibitor of MCP methylation # Organism: Bacillus halodurans # 1 201 1 203 209 115 33.0 4e-26 MDGYVQLDEISRDILKEIGNVGTGNAVTSLSQMMEQPIELDMPSLKVVKYYKMHELLEQP EELQTGILIEVTGQLKGIFLFMLSEAFTKTVINTILGEEERNLTSLDDMECSLISELGNI MCGSYIRALSQLMDMDMDVSVPELCIDMGGAILTYPMSKWVIVSDDILLIENIFHMSGEI FKGRILFLPEQEDLGTMLSRLREQRYG >gi|157101653|gb|DS480671.1| GENE 92 108380 - 108856 358 158 aa, chain + ## HITS:1 COG:MA0011 KEGG:ns NR:ns ## COG: MA0011 COG1871 # Protein_GI_number: 20088910 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Chemotaxis protein; stimulates methylation of MCP proteins # Organism: Methanosarcina acetivorans str.C2A # 4 151 7 152 159 108 37.0 4e-24 MDKIIVGIAEGKVARENQVLISYALGSCVGICLYDRRERIAGMAHIILPERKYSIHWDNA YKFADEGVHELINQMREQGARPAGLIAKIAGGARMFGAPSGMLDIGQQNVVAVRECLAQE GIRLVAQHTGQNYGRTILFYAENGRLEVNTVRHSAVVL >gi|157101653|gb|DS480671.1| GENE 93 108880 - 109812 893 310 aa, chain + ## HITS:1 COG:BH2234 KEGG:ns NR:ns ## COG: BH2234 COG3706 # Protein_GI_number: 15614797 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing a CheY-like 
receiver domain and a GGDEF domain # Organism: Bacillus halodurans # 5 305 2 310 314 166 35.0 5e-41 MDEKQTILIVDDSLLICEQIKTALKEESIIICEAHSGLEAQEQLRQCQPDLILLDVVLPD ADGYELYKILQDLDHKNAVIIFLTSRSSDEDVVKGFSMGACDYIKKPFVKKELQSRIRAH LIQKKQRDDLDRQNRELRNNMEKLNYMAYRDGLTGLYNRRYVVGDLMEDMRSCGQDEVKT VFIMADIDDFKQVNDTYGHDAGDMALVCIANILESNCRRHRVVRWGGEEFLLILFNVTKD EAYELSEKVRRQVEQFVIPYGDKGFFCTMTLGLHIFHEQEGLEEGIGCADKALYYGKRHG KNQSVWYKEQ >gi|157101653|gb|DS480671.1| GENE 94 109828 - 111855 1846 675 aa, chain + ## HITS:1 COG:CAC0118 KEGG:ns NR:ns ## COG: CAC0118 COG0643 # Protein_GI_number: 15893414 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Chemotaxis protein histidine kinase and related kinases # Organism: Clostridium acetobutylicum # 11 659 10 637 649 300 27.0 8e-81 MGYFESDALEMLEVYLLETRQLNGQLSEILLEAERAGRFNESSIHSIFRVMHTIKSSSAM MDINGLSSLAHKLEDLFSYYRENMDKLEHPEPDLFDLLFAASDFIAQELEQMNREDYAPG DTSVLEQMTDDFLGHVSREEEKDSGKGGKTQGEEEGKEELPVIPELFVSKSGTVVNVTLE EGCRMENIRAFMLVRSITGLCTLVETYPENLEKQGESAQYIEGHGVFIRFESGDKEKVLE TLRKGLFVAQCRVLADSQPQDTAAAAQKAAPERENTDSREGEFMEVRTERLDYLQNLASE LLLQMQVLENELKRNGMEDLESGLGHQISLLVGQIERSVMEMRLVPVSRIVPKLKRALRD ICRDQKKEADLVVRCSDVEADKSVVDYVSEALLHILRNAVDHGIESPEERLAAGKDRRGK ITFSADNVAGELRMTISDDGKGIDEERVRERARQKGLFASADEEYDPQKIREFILYPGFS TNEKVTEYSGRGVGLDVVKNIMEDVGGNLSIRSTIGQGSAFSISVPLNLASIECVRFRVG QYRFSIPARHVYHFLEYESFRGQIREINGRDYILYEDRMMPLINLRRFYSLGGEIPERAI LVYVKGAEKEGCFFIDSIYEQKRIVVKNLPALLGTGFRSRTGICGCSIMGSGRICAALDT EIIISRYEKEGRYGG >gi|157101653|gb|DS480671.1| GENE 95 111845 - 112285 362 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936097|ref|ZP_02083470.1| ## NR: gi|160936097|ref|ZP_02083470.1| hypothetical protein CLOBOL_00993 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00993 [Clostridium bolteae ATCC BAA-613] # 1 146 1 146 146 294 100.0 2e-78 MADEALKTLLCLPGKHRNYAVEFPYAEEICKDMMISLMPCLPGHFAGVCNYKGSIVPVVC QEGSASGGEEAEVRQMVLVLRHQKYFLGILLYQEPYLTQIGVDEQIKGPEQQESGLWVEK AYFMWNGSLYSLIDVEKTLEKLVIFE >gi|157101653|gb|DS480671.1| GENE 
96 112338 - 112676 395 112 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20572 NR:ns ## KEGG: EUBELI_20572 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 111 1 109 110 62 30.0 6e-09 MTILESYIDCMQRGDEAALADLFNEYGVLHDSSLIKAGMDTLHLEGRMAVEMMFHHKFGF NGGPFPIHSVKYLDGNTVWYFITYNEHVVPVTAFLSGVDEDGKILRLNIYPL >gi|157101653|gb|DS480671.1| GENE 97 112698 - 113471 858 257 aa, chain - ## HITS:1 COG:XF2269 KEGG:ns NR:ns ## COG: XF2269 COG1028 # Protein_GI_number: 15838860 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Xylella fastidiosa 9a5c # 3 250 7 252 255 86 30.0 4e-17 MSTYVITGGTTGIGAATRKLLLSQGHEVFNIDFKGGDCLADLSAPQGRQGAIDEVFRRYP DGIDVLICNAGVGPTAPPQMIFALNFFASVKIAEALRPLLKKKKGNCVVTSSNSITNMTV RMDWVDMLSNVMDEERILEFVKDIPRTQAASCYSSSKHALARWVRRVSPSWAVDGLRINS VAPGNTTTPMTQNMTDAQMESALLIPIPTRYGRKEFLDAEEIANGITFLASPMASGINGV ILFVDGGIDALLRSERF >gi|157101653|gb|DS480671.1| GENE 98 113808 - 114518 781 236 aa, chain + ## HITS:1 COG:BH1153 KEGG:ns NR:ns ## COG: BH1153 COG0745 # Protein_GI_number: 15613716 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 3 234 4 232 232 240 54.0 2e-63 MNGDRILLVEDDKEIREGVEIFLKSQGYAVFQAADGVEGLKVLEEREIDLAIVDVMMPRM DGITMTAKLREKYDFPVIFLSAKSEEVDKILGLNMGADDYITKPFTPMELMARVNSQLRR YHRFLDRLNSKQTENPKVHRLGGLEVNEETVEVTVDDKTVKVTPIEFKILLLLIQNPGRV FPAEEIYERVWNERAVNTDTVMVHIRNLREKIEVNPREPKYVKVVWGVGYKIEKQP >gi|157101653|gb|DS480671.1| GENE 99 114490 - 117123 2605 877 aa, chain + ## HITS:1 COG:BH1154_2 KEGG:ns NR:ns ## COG: BH1154_2 COG0642 # Protein_GI_number: 15613717 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: 
Bacillus halodurans # 524 763 33 269 274 218 47.0 5e-56 MDTKLKSSHRLGILIIILALLLASAGTIGLYPYMKTKAESYHNRRSVRQVEESSNFGNLA TQVMNFSYEIWHQKMQEDTGRRLTYAQTFLPGLEEKIRQQQPEDEAAVQIEYGEDGEQTM VYDSGDPYDIYYYQSMQQTIDSAGSEWESLYRQYYSSLSYGICEQDGTYGRSNVTDPKTF FQEPLKDDQIQFTINFNSNGILKVLDIKGNRDDDITRIKESLTRFEFYDPLEVRLDENYR YSGVEFQGPKDMTVVFRCSPSQMFGYTDVRSNSADQQLTIRDYRFSNGYFAIGASVLGLL TVLALALPAVKRFEIGKSALCRLSFEPLSCIGVAWLGIIGEGTIPMALIASTMDGSLKSE LMRAEFLPWSADVMAVVINLAFWMVSYGMFYWGITCYRAIFSLGPWRYFKERTWLGRFLR FIKRWTLNALNVFNETDWERRSTKIIGKAIIANFVILTIISCLWFWGIGALVIYSLVLFF LLQKYWGQMQQKYNTLLKSINEIAEGNLDVEIQEDLGIFNPFKEQLSRIQDGFKKAVAQE VKSERTKSELITNVSHDLKTPLTAIITYVNLLKQENVTAEERKSYVRVLDQKSMRLKALI EDLFEVSKASSGTVSLHQEDVDIVSLLKQVRFELSDKIEASGIEFRYNLPEEKILLYLDS QKTYRVFENLLVNITKYGMPGTRAYIQVVREDDGHVLITMRNISARELEVSPEELTERFV RGDTSRNTEGSGLGLAIARSFVEVQGGTMKLEVEDDLFRVSIRWKIEEPGGAGQEPGRDP EREPVREERETAAVPGAAGEPDMSCMPNMPEIPNMPEMPGAADGPGKTAADVQKGNVRVS AGAEADHDIDISAGTFDMEGVVWGAKEAQEGKKDREE >gi|157101653|gb|DS480671.1| GENE 100 117232 - 118842 1875 536 aa, chain - ## HITS:1 COG:BS_yuaG KEGG:ns NR:ns ## COG: BS_yuaG COG2268 # Protein_GI_number: 16080153 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 19 509 13 493 509 212 32.0 1e-54 MQKGVLMFDLDFILTSLGFIVPVILVIIIITQGYVKAPPDHAYIISGLRKQPRVLIGRAG IKIPFFEQLDKLYLGQITVDIKTDEYIPTNDFINVMVDAVAKIRVADDDERMKLAMRNFL NKEPANIAADLQDSLQGNMREIIGTLTLRAINTDRDSFSDQVMIKASKDMEKLGIDILSC NIQNVTDEHGLIQDLGMDNTSKIRKDASIAKAEAERDIAIAQAAADNAANDARVAAETEI AQKNNELAIKKAELQKASDTKKAEADAAYEIQKQEQQKTIQTATVNAQIARAEREAELRK QEVLVQQQALEAEINKKADADRYAIEQAAAAGLTKRQREAEAKKYEQEQEALAKKAQADA EQYEREKDAEAQKAIAEAQKYSMVQEAEGIRAKGEAEAAAIRAKALAEAEGMEKKAEAYQ KYNKAAMAEMMIQVLPDIAGKIAEPLSQIDKITIIGGGSDSDNGVGAIAGNVPVVMAKLF ESMKETTGVDLAEIMKADTYDAKVNRNVNLTGAPDIILGSGQKALHPNPDSDNIQP >gi|157101653|gb|DS480671.1| GENE 101 119193 - 119423 101 76 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPLARKRPSLDRFAQWLTSKSVALLAGQTAAVCLSAPSPQVPPIASGGLYCEKLYKVKAG ATLPVTPAKKTLYNFY 
>gi|157101653|gb|DS480671.1| GENE 102 119485 - 120000 650 171 aa, chain + ## HITS:1 COG:TM0012 KEGG:ns NR:ns ## COG: TM0012 COG1905 # Protein_GI_number: 15642787 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 13 167 18 175 176 158 49.0 5e-39 MGENTHACCCDHSEEALLKRIGELAAEYRGKEGSLIQVLHMAQGIYGYLPLEVQKVIADA LDISLAEVSGVVTFYSFFSTQPRGEHTIRVCLGTACYVRGGKKIVERLKELLDVEIGETT ADRRFTFEVARCIGACGLAPAMSIDDQVYKQVNPDKLEQILERYYEEEGQE >gi|157101653|gb|DS480671.1| GENE 103 120000 - 123101 3307 1033 aa, chain + ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 28 550 8 527 527 672 60.0 0 MDAVKNRDDLLSIKEEYLRRQKSYRRQVLVCGGAGCISSNCGEVRDALIKSVSTYKLDDE VKVMVTGCMGTCAMGPVILVEPEGIFYTKMNPEKAEDVVARHLLKDEVCEEYTYFDEDEG VHVPRMEDIPFFKGQVRVALRNCGHMEYASLEAYISRDGYGALARALSEMTPAQVVEEMK AAGLRGRGGAGFPTGVKWEAGMKAVSEDGRKFMVCNADEGDPGAFMDRSILEGDPHSIIE GMMLGGYAIGASKGYVYVRAEYPIAVERLGNAIDQARKAGILGENVMGSGFSFDLEIRIG AGAFVCGEETSLLASIEGKRGEPRQKPPFPFEKGLFESPTIINNVETLANAAPIILNGSQ WYRGFGTEKSAGTKVFALAGDIVNAGIIEVPMGTILGDIIYKIGGGIIGGKKFKAIQSGG PSGGCLTKTHLNTPVDYESLSSLGAIMGSGGLIVMNEDTCMVDTARYFMDFICDESCGKC LPCRNGTRRMLEILERITEGKGQPEDLELLEELAGTIQQTAMCGLGQTAPNPVLSTLKYF RSEYEAHIYEKHCSAGVCAELQTSPCRNACPANVFIPGYMSLLAAGRTEDAYRLIRQDNP FPAVCGRVCTHPCEDHCRRAQVDEPLAICSVKRFIGDYALRDEYNVPLVEPRPATGKRIS VIGAGPSGLTCAYYLANMGHEVHVYESESVAGGVLYWGIPEYRLPKAVLQKEIHAIERSG VQIHLNTRIGSDITFEDMKKQSDAVYIASGTQVSRLLDVPGEDLKGVESGLGFLKRIGLK KDMSVPNKLVVIGGGSTAMDVARTAVRLGTSEVTVIYRRSEKEMPAGKDEIIEAREEGVK LIAMASPVEFLGADGKVTGIHLVRRKETDYGSDGRRRTAHIPDSDFFLECDGIVTAINQD VDHQVYRTTHVEVAKNGKLDIGRFTSETGESGVFAGGDVSPWGANVVIHAIADGKKAAMK IDQYLGGKGELNMGEWFDIPEIADIEVNEPHKRFPTRCLSVEERKDNFREVNCGFHKLDA MGEALRCLHCDRR >gi|157101653|gb|DS480671.1| GENE 104 123112 - 124905 1874 597 aa, chain + ## HITS:1 
COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 212 596 5 369 372 340 50.0 6e-93 MIHVTINGKPVEVSEGSTIMEAAETVGIRIPSLCHMKNVHQYGSCRICVVEVEGMKNLQA SCMAKVREGMVIRTHSPKVLAARKVLYELLLSNHPKDCLNCSRNTNCELQELGYELGVSE SRFEGAMTKPMVDISPSITRDTSKCVLCRRCVTVCTQIQKVGAIQAQNRGFDTVVSPAMG LPLNSTACAMCGQCTVVCPTGALKETDGLAPVWRALADPEKRVVVQVAPAVRAALGEEFG LPVGTPVTGKMATALHEIGFDDVFDTLFSADLTILEEGTELLGRLNAALKGDESQKEKAV LPMITSCSPGWIKHIEHQFPEELDHLSTCKSPHTMLGAVVKSFYAERTATAPENMFVVSV MPCTAKKYEIQRPEMEVDGNRDVDAVLTTRELARMIKTAGIDFVNLPEGEFDAPLGLGTG AADIFGVTGGVMEAALRTVYEVVTGKELPFDKLHVAPIVGLEQVKTASITIEDTLPAYEH LKGVTVNVAVTSGLEGASMLMDEVAAGTSPYHFIEVMGCPGGCINGGGQPRCTEENYREK RSNALYSEDERKVLRKSHENPDLMKLYSEYLGEPNGHLSHHLLHTHYVKRGKYNQLV >gi|157101653|gb|DS480671.1| GENE 105 125274 - 125414 58 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936109|ref|ZP_02083482.1| ## NR: gi|160936109|ref|ZP_02083482.1| hypothetical protein CLOBOL_01005 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01005 [Clostridium bolteae ATCC BAA-613] # 1 46 14 59 59 89 97.0 9e-17 MYDIRQGVLVWQGDIKYNGGRFLYGLPEQEDVWVDHEYIWNLRVTA >gi|157101653|gb|DS480671.1| GENE 106 125462 - 125947 499 161 aa, chain + ## HITS:1 COG:no KEGG:CLM_0794 NR:ns ## KEGG: CLM_0794 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A2 # Pathway: not_defined # 7 147 94 230 271 69 30.0 4e-11 MIYNVPVATLALCTSNVKVLKRERMNGNSIYRIEPIRLKDGNADKVFQKSERKIRKKKRL KRCDVFSLLLTPLMSGTMEISERICRGMDILENPGLDMKEEDVKRMQSVLYALAVKFLDR NQLMDVKEKIGMTILGQMLFEDGEKKGMERGEKQGIQRQDV >gi|157101653|gb|DS480671.1| GENE 107 125988 - 126086 86 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGYLVEAFEAGSVLTYDGMHYASECSAKQSAE >gi|157101653|gb|DS480671.1| GENE 108 126192 - 127934 1285 580 aa, chain - ## HITS:1 COG:MJ0791 KEGG:ns NR:ns ## COG: MJ0791 COG0165 # Protein_GI_number: 15668974 # Func_class: E 
Amino acid transport and metabolism # Function: Argininosuccinate lyase # Organism: Methanococcus jannaschii # 83 557 6 472 484 284 34.0 3e-76 MSNTSFNQKGVSLYYQVETYIRTKIESGEWPSGFKLPTETELCQYFGVSRTTIRQAVNKM VEGGLLMRKQGSGTYVTQPAYSRNRLSTQPSDSVCKYIYMPILQDDMEHSYQNLLLTHIS HILMLCEQKLISQEDGKNALDFIVPLMDMHPQTIGFNPLNEDYFLNFEQYLISHLGIDLA GKIYTGRSRNDMTPTVMRMSIRDSMLAVYERLLALIRRLLALAEENQGRIITGYTHCMPA QPITLDHYFLAIAEALVRDMDRLLSAYQNLNRSPLGACAMAGTSFPINREYTAQLLGFDG IITNTLDAVATRDYLLELAADFSTLGSTLSRFAQDLYLWSTAEFNYVSFSDAYSCCSSIM PQKKNPLSIEHIKSKSSHLTSTYLDIAMCLKGTSYGHCRDLFECMPPFWDAVEQVTGMLE LSIGTLQDITFHYDRMEFTASMNDSILTDMADFLVQKDRIPFRSAHNIIASAVRAQGDNP SSKITLKQLNKSSKQHLGHDTTLTESEWASLQSPRSSVANKRSEGSPAQSSCRKMLLSLR MAADRCNEQYNKIINSLQLAEQFRKEQIDVLKNGTHFEHN >gi|157101653|gb|DS480671.1| GENE 109 128270 - 128794 472 174 aa, chain + ## HITS:1 COG:no KEGG:Dret_0071 NR:ns ## KEGG: Dret_0071 # Name: not_defined # Def: DctQ (C4-dicarboxylate permease, small subunit) # Organism: D.retbaense # Pathway: not_defined # 12 158 6 151 162 105 36.0 9e-22 MKKLGKTLIQVQYGASIVAALMLLLMIGYEVFARYVLKSSLMGIEELMLFPIIWLYMLGG ANASYEKSHIECGILTLYIKKERSKLIFDAIKRTLCIIILAWICYWGFYFFSYSLKTWKL ADITYAPLFFANIALTVGFVLMFIYAVRDVYMAYRALIGQMKNKDEAREGGENL >gi|157101653|gb|DS480671.1| GENE 110 128791 - 130101 618 436 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 [Lentisphaera araneosa HTCC2155] # 4 433 2 429 432 242 30 1e-62 MTSLSIILMDVILLVFLLMLSIPLPVCFAGALLFLSLFGDVSMKTMLIWVFSQSTGTVLL ASPLFILAGTYMGGSGIAKRLLDCVDAFVGHIKSGLGVVAVLVCAIMGAISGSGFTGVAA TGPIMIPKMEEQGYPRGYATALVTVSSVLGLLIPPSVVMIMFGWVTECSILACFLSTLGP GLIITALFCIIHVFWSRQFDLKVTEKKPFGEFASNAGKMTFKAVPALIMPLIILGGIYGG VFTPTEAAAMAAIYSIPIGLWVYKACTFKGFLAMTKEATSTVGSIMVMIVCCLMLSRVFT VYRVPELVLELLMSITTNKYILLFIIDIFLFIVGMIVNDTTGVLICAPLLMPVMNQLGIS PIHFASIMGVNLAMGGVTPPYASILYLGMRIGKAEFHEILKPTMTFLILGYVPIVFLVTY WPDLALAIPRMAGFAV >gi|157101653|gb|DS480671.1| GENE 111 130137 - 131183 1092 348 aa, chain + ## HITS:1 COG:SMb20036 KEGG:ns NR:ns ## COG: SMb20036 COG1638 # 
Protein_GI_number: 16263787 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Sinorhizobium meliloti # 31 346 20 331 338 76 23.0 7e-14 MKNIKRGLSIAIVAAMVLGMTACGGSGKTESAGKAAEKKSFSLKLSHYRAEGSPADINAK EFADNVKAADDSLEVVVYPAAQLGDYTAVQELVSVGDVEMQMATLSSTVDKYLGITSAPY ICATWDEAKAIFSRDSEFTETIRTHLEDQNIKLLSVYPLYFGGAILNKEPNQPANLDVRE GYKVRCQTMKGAELVTNMLGYNATPMALNDCFTALQTGVIDGMIGSGAEGYYSSYRDVAT YYLPYNDHFEVWYLYINLDKWNEMTTEQQDALQTAANKLEDDRWEVAPKETAEYEKKLED YGVKIIEFTDDELQNFYNKCWESVWPELTELYGEDAVELVKNLKAEAK >gi|157101653|gb|DS480671.1| GENE 112 131209 - 131604 323 131 aa, chain + ## HITS:1 COG:no KEGG:Amet_3596 NR:ns ## KEGG: Amet_3596 # Name: not_defined # Def: GrdX protein # Organism: A.metalliredigens # Pathway: not_defined # 8 125 6 122 127 86 35.0 3e-16 MGHPFRCITNNPMLIDRGFTDLEYYETDVLELFRVVFQKVNCGYRLLTHPLTGSIRPDIT PYKTVLLSGTAGTIDMESVTLIGKAIRYAEDLYRLRDIPVYKKWGKAAREDFQLIDLSII ERALEVEEMGK >gi|157101653|gb|DS480671.1| GENE 113 131626 - 132912 1233 428 aa, chain + ## HITS:1 COG:no KEGG:STH2870 NR:ns ## KEGG: STH2870 # Name: grdE # Def: glycine reductase proprotein # Organism: S.thermophilum # Pathway: not_defined # 1 426 7 432 434 480 54.0 1e-134 MKLTLNKFHVQDIQFAEASSFCSGVLYINKAELAEYLMEDTNIRSVDFDIARPGESVRIV PIKDVVQPRFKQSGSGQVFPGFVGDVETVGNGETNVLEDCAVATSGKIVAFQEGIIDMSG PGAEFNSFSKMNILVPLIYPVEGLDKHTHEATVRIAGLKTAAYMAKTTLGLEPDEKEVFE HESILEMGQKYPHLPKVVYVDMVQCQGLMHDTYVYGLNVKGILSTMIGPLEVLDGAIISG NCAAPGHKNATIHHENNPIILDLLRRHGKELCFVGVLLSNESAMLKEKKRAAYYTSSLSQ LLGVDGVIVSEEGGGNPETDLMFNCRLHENKGIKTVLVTDEYCGRDGASQGLADVTPEAD AVVTNGNGNQFVVLPKMERVIGDIETVRIITGGNVDSIGEDGTVSVEIAAIMGSCCEMGY EHMTTRLR >gi|157101653|gb|DS480671.1| GENE 114 132925 - 133968 1081 347 aa, chain + ## HITS:1 COG:no KEGG:Toce_1618 NR:ns ## KEGG: Toce_1618 # Name: not_defined # Def: selenoprotein B, glycine/betaine/sarcosine/D-proline reductase family # Organism: T.oceani # Pathway: not_defined # 3 346 5 348 349 411 57.0 1e-113 
MKRIVHYLNQFYGQIGGEEFADQEPILREGIVGPGTGLQAVLGEKAEIVATIICGDNYFA DHQEDALETILTMTEGLAPDMFIAGPGFNAGRYGLACATICAEVKKRLSIPVLTALYPEN PGAEMFCSKIYIVETSISAAGMRQALPVMGTLAEKLLEGGEIGFPEEEGYIAQGFRVNVH TKKNGAERAVDMMIKKLRGEPYVTELPMPVFDRVKPAPAVKDMAHARIALITSGGIVPQG NPDHIPSANAGIYGIYDIENLDRLTSEGFMTVHGGYDPVYATEDPNRVLPLDVCREIERD GGIGELYGRYYSTVGNTTAVSSAKRFAEEIAQDMIRNEVQAAILTSN >gi|157101653|gb|DS480671.1| GENE 115 133996 - 134226 207 76 aa, chain + ## HITS:1 COG:no KEGG:BP951000_1856 NR:ns ## KEGG: BP951000_1856 # Name: grdB # Def: glycine/sarcosine/betaine reductase selenoprotein GrdB # Organism: B.pilosicoli # Pathway: not_defined # 1 76 356 431 432 106 65.0 3e-22 MAKTIESFGIPVVHICSIVPISMAVGANRIIPAVGIPYPVGNPGLPKEDELQVRRRIVKK ALEALSTDIEEQTVFE >gi|157101653|gb|DS480671.1| GENE 116 134238 - 134495 92 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936119|ref|ZP_02083492.1| ## NR: gi|160936119|ref|ZP_02083492.1| hypothetical protein CLOBOL_01015 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01015 [Clostridium bolteae ATCC BAA-613] # 1 85 6 90 90 165 100.0 1e-39 MNHWLKSETLRRCLFPLAFVCIFIVWRLAEHGVRSVWVYLLLTVGVICLALDRVFKPWEH DRMRKRQERDRGLEDNSIKKQGLNQ >gi|157101653|gb|DS480671.1| GENE 117 134492 - 134569 76 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIQPEADAGVAIASASFILRQALFC >gi|157101653|gb|DS480671.1| GENE 118 134701 - 136089 1401 462 aa, chain + ## HITS:1 COG:lin0328 KEGG:ns NR:ns ## COG: lin0328 COG2723 # Protein_GI_number: 16799405 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Listeria innocua # 2 459 3 458 463 616 62.0 1e-176 MNDFLWGGATASYQCEGGWQEGGRVESMWDVYLHENHLENGDVASDHYHRFREDIRMMKE GGQNSYRFSLAWPRIIKNREGEVNQEGIDFYNRLIDACLEYGITPMVTIFHWDLPQYLEE KGGWLNRDTCVAYTHYAKVCFERFGDRVKLWATFNEPRYYTNSGYLIGNYPPGHQDIQET VTASYYMMLASAMAVEAFRTGGYDGQIGIVHSFSPVYTTDTSVESAIARRFADNFYNNWI LDTAAIGEIPGDLLGELKKTCDLSMMTPEDLAVIRRNRVDYLGLNYYARVMVKPYESGET 
TLIVNNQGKKAKGTSQTIIKGWFEQVRPESSRYTEWDTEIFPEGLYEGIQQVWNKYHLPI YITENGIGLYEDTSVNQVEDDDRIEFMDMHIAAVLKAKEGGCDVRGYYAWSPFDLYSWKN GTEKRYGLVAIDYENGLERRPKKSYYWYKDVIETDGKHITGK >gi|157101653|gb|DS480671.1| GENE 119 136104 - 137402 1219 432 aa, chain + ## HITS:1 COG:lin0326 KEGG:ns NR:ns ## COG: lin0326 COG1455 # Protein_GI_number: 16799403 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Listeria innocua # 1 431 1 431 437 459 53.0 1e-129 MNIQSMSVSMEKYVMPLANKLGNEIHLRAIRDAFMSLLPITFTGGIAAVLSSAPSVDAAG TGITLAWARFVESNSMIFSWINALTLGAMSLYICIGIIHFLCKSRKIESFLPILLGVCGF LMLIVEPMSLGWDGKMAELSYMDGKGLIPAMFISILTADLYCYMRKRDFGKISLPDTVPA SLSDVFASIVPGAVLMVIYIALYAIMNKAGTTLPKLIYNAISPSLTAADSVGFTIIITLM VHIFWFFGIHDAALSGVLAPIRDGNLSINAAAHAAGQALPSIFTTPFWVYFVVIGGCGSV LALTALLCFSKSKQLKTIGRLGIVPAFFNISEPVIFGLPLMLNPVFFVPFLLTSVINGTA AYLTMQVGLIGKSFAMLSWQMPSVIGAFFSTMDWKAPLLILVLIVVDGLVYFPFFKIYEK NLVKLESGEEDI >gi|157101653|gb|DS480671.1| GENE 120 137446 - 137748 386 100 aa, chain + ## HITS:1 COG:lin2472 KEGG:ns NR:ns ## COG: lin2472 COG1440 # Protein_GI_number: 16801534 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Listeria innocua # 1 92 1 94 104 84 51.0 5e-17 MKRIVLLCNGGLSTGILVKKMKAAAEAQDFACEISAAPVSDAEAVGAEADLILLGPQVRF QMETVKKQVACPVTSIDPVSYGTMNGEKVLNQVKKELGEA >gi|157101653|gb|DS480671.1| GENE 121 137748 - 138056 412 102 aa, chain + ## HITS:1 COG:SP2024 KEGG:ns NR:ns ## COG: SP2024 COG1447 # Protein_GI_number: 15901845 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Streptococcus pneumoniae TIGR4 # 2 101 9 108 108 63 35.0 1e-10 MEDMELICLQIISNAGEARSESMAALAAARNSRFDEAGEHLAAAGEKMKEAHHVHTRLIT MDASGELDKIGLIMIHSEDIMMGAEITLALAREMVEMYKSRA >gi|157101653|gb|DS480671.1| GENE 122 138077 - 138910 759 277 aa, chain + ## HITS:1 COG:SPy1766 KEGG:ns NR:ns ## COG: 
SPy1766 COG0406 # Protein_GI_number: 15675611 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Streptococcus pyogenes M1 GAS # 53 274 6 226 235 134 37.0 3e-31 MKIRKLCLGGAMILGVVALAGCQRQANTESVVQEAVEQAAEPSGAPEKKDVTIYLVRHGK TFFNTTGQVQGWADSPLTEEGESQADAAGKGMKDIVFTTAFSGDLGRQRATAKHILAQNQ GDIPELQEVIGLREEFYGGFEGKPDEELWRPIYEANGASYDEKGSTYLELSKKEKVDTIA AIDPLHMAETYDDISKRSIEVMDTIVKATQEAGGGNALAVSSGDEIATILDLLVPEQYQG ERIANCSVTVLTYKDGLYELKACGDVSYMQAGEESNK >gi|157101653|gb|DS480671.1| GENE 123 139001 - 140884 1140 627 aa, chain - ## HITS:1 COG:lin0325_1 KEGG:ns NR:ns ## COG: lin0325_1 COG3711 # Protein_GI_number: 16799402 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Listeria innocua # 1 457 3 449 480 176 25.0 1e-43 MTARQKRIWDLLAEHDCLTSQQIAALLHISDRTVRSDIKEINGEHASEVIRSKKGQGYFI DAGQPEKNVPASKPQRPEDDLEWAIIRQVLFCRETPYLELAEELFISDTLLSKIVSELNR NITRRHSLPAILKQNGILSLAASEEEKRSYYTLYIMNRNVNHYFDMEQFQPFFDIVDLKE LKELMFRELDLLSRHLYDTTIVRLIIDTAVMAERAVCGFFMPETPSVTPETPYGKSSEEG AVNTGRHFLEELGSRLFISFPPSEYEYFSRLFQNDFYYVREQPDSQAESLLEKILIEINV EYGFDFTVDKNFCHEMTEQLHGALERRRHRQHVINPVLSTIKSKYPLEYDIAIFFADRFK NLSGISLSEDEISLFAIHFIRAMETNLGRTEQRVGLINPYGKQIKELMVKRLGDMGECRF QIAYTWSVFDYPHEMPKDILAVLTTVPLPVQPADVPVILCRNFLNYHEKEKLLTVVRDSE VNSIRTYFRTLFKPSLFFTDMEFDSRRSAVAFLCGKLREQGYVGPGFLESVMQRESIAPT AFEPGFAFAHAMENNAKRTAVCVCVLKNKLPWGEYNVKIIFLFALAPTWNHTIIPIYNVM IDNLFKTNTIYKLAKIRDCRQFMDLLI >gi|157101653|gb|DS480671.1| GENE 124 141035 - 142780 1831 581 aa, chain - ## HITS:1 COG:CAC0028_2 KEGG:ns NR:ns ## COG: CAC0028_2 COG4624 # Protein_GI_number: 15893326 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 218 581 5 370 371 391 52.0 1e-108 METINLKINGIAVEAPKGSTILEAARLAHIEIPTLCFLKEINEIGACRICIVELKNGKLV TSCVYPAEEGMEVYTNTPKVLDSRKKTLQLLLSNHNRSCLSCVRSGHCELQELAHELGVD DEGYYDGEITPSEIDTSAAHMIRDNSKCILCRRCVAVCENVQGIGVIGANNRGFSTSIGS 
AFEMGLGETSCVSCGQCIAVCPTGALTEKDYTADVFAAIADPKKHVIVQTAPAVRAGLGE EFGLPIGTDVEGKMAAALRRLGFDKVFDTNFSADLTIMEEAHEFIDRVQNGGVLPLITSC SPGWVKYCEHYFPDMTENLSSCKSPQQMFGAIAKSYYAEKAGIDPKDIVSVSVMPCTAKK FEIGREDEDANGMPDVDISITTRELARMIKKAGIKFLDLPDEEFDAPLGLGTGAAVIFGA TGGVMEAALRTAVETLTGEELPKPDFTEVRGTKGIKEAAYNVAGMEVKIAVASGLGNARE LLNKVKSGEANYHFIEIMGCPGGCVNGGGQPQQPGHVRNTTDIRGLRAKVLYDIDTANPI RKSHENPAIKELYATYLGEPGSEKAHHLLHTTYVKRSINQH >gi|157101653|gb|DS480671.1| GENE 125 142799 - 144589 1956 596 aa, chain - ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 6 527 8 527 527 682 62.0 0 MYRSHVLVCGGTGCTSSGSPKIMEALKYEIKRQGLEEEVAVVETGCHGLCALGPIMIVYP DATFYSMVQVNDIPEIVSEHLLKGRVVTRLLYQETVSPTGGIKALIDTDFYKKQHRIALR NCGVINPENIEEYIGTGGYQALGKVLTEMTPDDVIQTLLDAGLRGRGGAGFPTGLKWKLC KQNDADQKYVCCNADEGDPGAFMDRSVLEGDPHALLEAMAIAGYAIGSNQGYIYVRAEYP IAVERLKIAIQQAREMELLGKNIFGTGFDFDIDLRLGAGAFVCGEETALMTSIEGKRGEP RPRPPFPAQKGLFGKPSILNNVETYANIPQIILNGPEWFASMGTEKSKGTKVFALGGKIN NTGLVEVPMGTTLRTVIEEIGGGIPNGKKFKAAQTGGPSGGCIPAEHFDIPIDYDNLISI GSMMGSGGLIVMDEDDCMVDIAKFFLEFTVEESCGKCTACRIGTKRMVEILTKITKGQAV MEDLDKLEELCHYVKANSLCGLGQTAPNPVLSTLHFFRDEYEAHIKEKRCPAGVCKALLS FNIDRDKCRGCTLCARNCPAGAIVGSVKNPHVIDQNKCIKCGACMEKCKFGAIYKK >gi|157101653|gb|DS480671.1| GENE 126 144604 - 144981 374 125 aa, chain - ## HITS:1 COG:TM0011 KEGG:ns NR:ns ## COG: TM0011 COG3411 # Protein_GI_number: 15642786 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Thermotoga maritima # 1 115 6 119 128 75 35.0 2e-14 MKTLAELQAIREKMQSQVNMRAEDHDHIRVVVGMATCGIAAGARPVLNTLADEVQKRGLT NKISVTQTGCIGLCQYEPIVEVMEPGKDKITYVKMNADKALEVVERHLVGGHPVEKYTMG AAGLK >gi|157101653|gb|DS480671.1| GENE 127 145048 - 145146 66 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTCFARKRLKDRRQVLITSHSTRFVKHLTKWR >gi|157101653|gb|DS480671.1| GENE 128 145163 - 145720 376 185 aa, chain - 
## HITS:1 COG:TM1665 KEGG:ns NR:ns ## COG: TM1665 COG0642 # Protein_GI_number: 15644413 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Thermotoga maritima # 1 182 4 182 186 112 34.0 4e-25 MMTEISLNVLDVSENSTRAGASLVTILVTADTSADKLTIVIADNGCGMTPEQAAHVTDPF FTTRTTRKVGLGIPFFKYAAESTGGSFHIESEPGKGTTVTAVFGLSHIDRMPLGDMNATI HNLIVYHPDTDFLYTYTYNDASFTLDTRQMREILGGIPLNTPEVSAYIMEYLKENQQETD GGALI >gi|157101653|gb|DS480671.1| GENE 129 145725 - 146219 417 164 aa, chain - ## HITS:1 COG:TM0012 KEGG:ns NR:ns ## COG: TM0012 COG1905 # Protein_GI_number: 15642787 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 33 163 41 172 176 148 53.0 4e-36 MACKKQTVPFKGTPEQEAALKSAIAELGDQPGALMPVMQKAQEIYGYLPIEVQTMISDEM GIPLEKVYGVSTFYAQFALQPKGKYKISVCLGTACYVKGSGEIFRKLEELLGITNGECTA DGKFSLDSCRCVGACGLAPVMMINGEVYGRLTVDDIPGILAKYN >gi|157101653|gb|DS480671.1| GENE 130 146434 - 146781 411 115 aa, chain + ## HITS:1 COG:no KEGG:Closa_3389 NR:ns ## KEGG: Closa_3389 # Name: not_defined # Def: DRTGG domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 115 1 115 115 194 85.0 1e-48 MTVKDVRDVLGARVLAGEEFLDREVKSACGSDMMSDVLAFSKDHSVLLTGLCNPQVIRTA EMLDIVCIIFVRGKKPDESMLEMARDRGLVVMETGHRMFSACGMLYQAGLHGGAI >gi|157101653|gb|DS480671.1| GENE 131 146877 - 147299 559 140 aa, chain + ## HITS:1 COG:TM1354_2 KEGG:ns NR:ns ## COG: TM1354_2 COG2172 # Protein_GI_number: 15644106 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Thermotoga maritima # 7 137 32 163 181 114 48.0 7e-26 MEEAIVLTYDISADDFTRAGEASSDVKRKLKQMGVSPDAIRKVAIAMYEGEINMVIHAKG GVITVEITTEKIKMILADVGPGIPDVKLAMQAGYSTAPDEIRSLGFGAGMGLPNMKKYSD SMDIDTRIGEGTTITMVVNL >gi|157101653|gb|DS480671.1| GENE 132 147330 - 148724 1604 464 aa, chain + ## HITS:1 COG:TM1421 KEGG:ns NR:ns ## COG: TM1421 COG4624 # Protein_GI_number: 15644172 # Func_class: R General 
function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 12 305 9 276 301 105 26.0 3e-22 MDKFYHSVRLDESLCKGCINCIKRCPTEAIRVRGGKASINNKFCIDCGECIRVCPHHAKL ATYDKLDVMKKYKYTVALPAPSLYSQFNNLDDVNIVLNALLLMGFDDVFEVSAAAELVSE ATRQYISEHEEQLPLISTACPSVVRLIRVRFPNLIPHLLPINPPVEVAAVLAVRKAMEDT GLPREEIGIMFISPCPSKVTYVKSPLGTEKSEIDQVLAIKDVYPKLLACMKTVVGEDYPV IGTSGKIGISWGHSGGEASGLFTENYLAADGIENVIRVLEDMEDQKFTNLRFVELNACNG GCVGGVLTVENPYVAEVKLKRLRKYMPVARSHMHESEENVIKWTTGVQYEPVFRLGNNMM ESFSRLNQVERLMKKFPGLDCGSCGAPTCKALAEDIVRGDANETDCVYYLRENLHKLSEE VAVLADDLHNGDRGGQETLRILKGYIQRISDEMSVLDKKDEDEE >gi|157101653|gb|DS480671.1| GENE 133 148783 - 149115 419 110 aa, chain + ## HITS:1 COG:no KEGG:Closa_3386 NR:ns ## KEGG: Closa_3386 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 109 1 109 116 144 69.0 9e-34 MTVQELIDKQIFGVVNLGDSLDRQITVPFCCDLLSIAMGRAPAGCAWVTVMANMNTLAVS ALTDTACVILAEGAVLDDAARKKALDQEITVLSTDMPVFEAALKIHGMLS >gi|157101653|gb|DS480671.1| GENE 134 149112 - 149822 595 236 aa, chain + ## HITS:1 COG:TM1352 KEGG:ns NR:ns ## COG: TM1352 COG0613 # Protein_GI_number: 15644104 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Thermotoga maritima # 7 236 4 220 232 106 31.0 4e-23 MMNLAYDLHIHSCLSPCGDDDMTPANIAGMAALKGLDVVALTDHNTCRNCPAFMAAAAEY GVLAVPGMEINTSEEVHAVCLFPTLEKALDFDAYVYGKLIKFPNREEIFGKQQIYDDRDQ VCASEPNLLINATEISFDSLWNLVRSYDGVMFPAHVDKAANSLIANLGFIPPDSCFKTAE VRDLKKLHQLKRDNPYLEQCRIISNSDAHYLEHINEANLTMQVEERSAEAVVRALL >gi|157101653|gb|DS480671.1| GENE 135 150057 - 151046 1201 329 aa, chain + ## HITS:1 COG:TM1264 KEGG:ns NR:ns ## COG: TM1264 COG0226 # Protein_GI_number: 15644020 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Thermotoga maritima # 59 325 20 272 274 122 30.0 7e-28 
MAMAASLLTACGSNAASTETTTAAASEAASEKETAGKEAEETEPGAKTDGAASAQFQSSI LFCGSTSLYPILSSLASSFTEKYVTWDAVDSSFPDSNISVYVAPGGSGVGVSAAIDGTAD FGMLARDIKDSEIEALGEHYQDFVVAKDALTVSVNAENPICGIMDDMPVETIRQIFAGEI ATWDQVDSTLPAEAINVYIRDLSGGAYEVFQKSVMGDSEVTASATQSASMTELATNIAGD KWGIGYAGFGAYNKANADGQVLTAMKVDGVEATAENIISGAYTIQRPVMFVTGAEITPSE QAFIDYIFSQTGYDVVEANGYIPAFTPEA >gi|157101653|gb|DS480671.1| GENE 136 151124 - 151990 704 288 aa, chain + ## HITS:1 COG:MA0888 KEGG:ns NR:ns ## COG: MA0888 COG0573 # Protein_GI_number: 20089772 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 15 276 17 281 296 156 35.0 4e-38 MNKQLNALFSVLIKALTALTVMVLAFVIYFIVKEALPTFDEVAVPDFLLGRRWMPIAYTG EPSFGIFNFIAATLYVSLVAMVLAVTVGVGAAIYLSCVATQRMRGILYPFVDLLAGIPSV IYGFMGLTLMVRLFIRAGVHTGSCVLAAGILLAVMLLPFLISSCSETMLKVRKRYQPAAD ALGISKWHMVATIVLPGSWKNILLSMILAIGRAMGETMAVMMVIGNANLFPSLLGKSESI AAVIALEMGTAVVGSTHYHALYAAGLVLMLLLFLINSGITLLRSKLEQ >gi|157101653|gb|DS480671.1| GENE 137 152072 - 152914 600 280 aa, chain + ## HITS:1 COG:MA0889 KEGG:ns NR:ns ## COG: MA0889 COG0581 # Protein_GI_number: 20089773 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Methanosarcina acetivorans str.C2A # 15 279 41 307 307 143 34.0 4e-34 MHLRASLTKFWAYSSMLLVTGIILFLFGYVFYRGAGTISWEFLSQPPRGAVLGEEGGIWP AIVGSLCFTATAIVLGGIPAVATALFMVFYCKGRRTAGLIRMVIQCISGIPSIVLGLFAY SFLVRDLAWGRCILSSGVALAVMILPFIEVRAEKTFRELPKQLVQSSYALGCSRFYTIRK IVLPACKGEIVSGVILGGCYAMGATAPLIFTGAVAYSGLPARLTAPAMALPMHLYLLVAQ GATSMDAAYGTAFVMMALILFSSLLATIYARRSQKKWNIS >gi|157101653|gb|DS480671.1| GENE 138 152899 - 153648 216 249 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 5 224 12 223 312 87 29 4e-16 MEHIMTLHNVSASYDGREVLSDLSLDIPVRRITAVIGPSGCGKTTLLRCMNGLLAEEKGV SVSGRILLDGQEIKSMQKDVLRRRVGLVFQTPAPFPFSIYKNMTYAPRYYGVRDKASLEA 
LVKEKLRMAGLYDEVREDLGRNALKLSGGQQQRLCIARALTVDPEVLLLDEPCSALDVKS SAVIEEMLTELKKQYTIVIVTHNIAQARRISDYGAFLLGGRLVEWGESDQLFGCPAQEET RAFLAGIYG >gi|157101653|gb|DS480671.1| GENE 139 153760 - 154227 594 155 aa, chain + ## HITS:1 COG:MK0529 KEGG:ns NR:ns ## COG: MK0529 COG4087 # Protein_GI_number: 20093967 # Func_class: R General function prediction only # Function: Soluble P-type ATPase # Organism: Methanopyrus kandleri AV19 # 1 154 1 157 158 107 44.0 1e-23 MIVIEIPGREPLKINHVVLDYNGTAAVDGMLLKGVKERIARLKELAHVYVLTADTYGTVT RQCEPLGITVRTFPREGAAGCKEEIVKGLEGGVACLGNGFNDIQMFDAAQLSIAVLEREG MCAALLSHASVVVRSIEDGLDLLLKPDRLRATLRS >gi|157101653|gb|DS480671.1| GENE 140 154345 - 154485 104 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936143|ref|ZP_02083516.1| ## NR: gi|160936143|ref|ZP_02083516.1| hypothetical protein CLOBOL_01039 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01039 [Clostridium bolteae ATCC BAA-613] # 1 46 14 59 59 80 100.0 5e-14 MVNYEILGKGLKQFFKFNYGTINVGLAALTKGARLFMLECITGRGG >gi|157101653|gb|DS480671.1| GENE 141 154589 - 155182 649 197 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936145|ref|ZP_02083518.1| ## NR: gi|160936145|ref|ZP_02083518.1| hypothetical protein CLOBOL_01041 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01041 [Clostridium bolteae ATCC BAA-613] # 1 197 1 197 197 251 100.0 2e-65 MKRKLAVLGTAIMAAAAITACSGKDAKTTTAAEVVTQADTTAGMTDGETEDAGAASDSET AAESQSSESASQEAALLEEAEDVSHAGAELEQSEIKPFAEKVQKAVADKDMEALAGLCVY PVYVSLGEGQGEEIADESAFLKMDAGQIFTESLLKEIADTDVDTLEQFGAGVIMGEENSI IFNNVDGQAAITGINLN >gi|157101653|gb|DS480671.1| GENE 142 155291 - 155524 282 77 aa, chain - ## HITS:1 COG:no KEGG:Closa_1058 NR:ns ## KEGG: Closa_1058 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 75 1 74 75 70 49.0 3e-11 MNPMNLLQLKPAWNQFKANHPKMLSFMKAASRDGVMDEGTLIEITVTSSTGQTIASNIRV KKSDLEFLGALKDALED >gi|157101653|gb|DS480671.1| GENE 143 155635 - 156861 1110 408 aa, chain + ## HITS:1 COG:lin1687 KEGG:ns NR:ns ## 
COG: lin1687 COG0420 # Protein_GI_number: 16800755 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Listeria innocua # 29 402 1 367 374 261 38.0 1e-69 MEKDGPYGLVNGARLVVLWLKMYMKGRTMKFIHLSDLHIGKRVNGFSMLEDQKYILDQIL MIAEEEMPDGVLIAGDIYDKPVPPAEAVQVFDAFLTGLADRNLPVFVISGNHDSPERLAF GGQLMKDRRVYMAPVYDGHLEPVQLEDRYGSLRVYMLPFIKPAVVRRCCPEEGIETFEDA VRWALEHMAEHKKGEDGRNILIAHQFVTGASCCDSEELSIGGLDQVSAELFDSFDYVAMG HIHGPQKVGRDTLRYSGTPLKYSFSEVNHRKSVTVVELLEKGNVTVNTRPLRPLHDMREL RGSYEELTSRDFYQGTAVDDYLHITLTDEEDILDAIGKLRSIYPNVMKLDYDNKRTREGR VVEAAANADKPPIVLMEELYQLQNNQPMTEQQADFAVKLMEEIWEGSR >gi|157101653|gb|DS480671.1| GENE 144 156858 - 159818 2957 986 aa, chain + ## HITS:1 COG:lin1686 KEGG:ns NR:ns ## COG: lin1686 COG0419 # Protein_GI_number: 16800754 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Listeria innocua # 1 984 1 1019 1023 239 26.0 2e-62 MRPTELIISAFGPYAGEVTLDMASLGDRGLYLITGDTGAGKTTLFDAIAFALYGNASGDS RKPRMLRSKYARPDARTYVEMGFSYSGKEYRVRRNPEYMRTKQRGEGETREKPDAQLHMP DGRLVTGDKAVTVEVEGLLGLNREQFSQIAMLAQGSFSRLLSGRTEDRGIIFREIFKTKP YQLFQEKLKDRAKGLYGRYADSRKSMEQYAGGVITEGHSEELKLRWKEVPQGSLEALLEV LDQLIGADEAQQQGRDRAMALVRDEMAALGMELGKMQGDAAVCRDMAAAADVLRENMPVL EQAKVRYEREKERQADRDGLIGAVTRMEENLKTYDRFDGLQENLKACRKETESLEKRLEQ AALEERQWKERNDSDEKILESLRQAGEEYQAALTAGERLREYSGRIGLLAEELGQYGQER KKLEAARERYRAAGEESRRADDAYREMYQRFLDNQAGILASRLTEGQPCPVCGSVTHPAP ARYLEEGKEAAKDKVDRLKAAAEEKDKAAARLSLEAGRLAGSLDTRYERMKQQIDAEVAT WKEDWQQRIRQAEADAGALVAKTGDVRQGRRYFLEQWELMMGQLKGQLERQAAAQKEKIQ EKKKRKEYKEAIEGRRVLGRQSLEAASELKQQAQKMLAESQAKERELESRLKEMESSLPY ENRKAAQAELAEKKAFLAGLEKAFKEAEESYNRISRRVSDAQARLEALRGRMAEKDAEGR ESVGEDRDPVNGPDTMLPGGMDVRTVMERLEARMQENRLRQSEARERLTGLEQEKSRLHH RLETNRMARERISEQKASMEEIQKEWTWVKALSDTAAGEVGGKEKITLETYAQMAYFERI IARANTRFMVMSGGQYELKRCTEEDNRGKNGLGLNVIDHYNGTERSVKTLSGGESFQASL SLALGLSDEIQSAAGGIRLDTLFVDEGFGSLDEDTLNLAMKSLGDLAEGRRLVGIISHVG ELKERIQHQIVVVKDKTGGSRAYIEL >gi|157101653|gb|DS480671.1| GENE 145 159922 - 160389 
423 155 aa, chain + ## HITS:1 COG:FN0030 KEGG:ns NR:ns ## COG: FN0030 COG3467 # Protein_GI_number: 19703382 # Func_class: R General function prediction only # Function: Predicted flavin-nucleotide-binding protein # Organism: Fusobacterium nucleatum # 2 155 1 158 158 147 48.0 8e-36 MVRRKDREIKDTYGIREIIRECDCCRLAFPDGKSAYIVPLSFGYDEEENALYFHGAAEGK KMDLVRQTGYAGFEMDTAHGLKTADQACGYSFRYRSVVGEGPIRVVEETQEKKKGLNCIM GHMSGKDSWDYPEAMLKRTAVLRLDVEQMSGKEQV >gi|157101653|gb|DS480671.1| GENE 146 160415 - 160669 175 84 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c17650 NR:ns ## KEGG: CLJU_c17650 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 1 83 1 83 85 102 74.0 5e-21 MFESRCGVCCNQCERKEAVRCTGCATMEKPFWGGECGVKSCCEAKGLNHCGECPDFPCEM EASMGTDMGFDAEPRLKQCREWAK >gi|157101653|gb|DS480671.1| GENE 147 160688 - 161290 504 200 aa, chain + ## HITS:1 COG:BS_yisX KEGG:ns NR:ns ## COG: BS_yisX COG1357 # Protein_GI_number: 16078153 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Bacillus subtilis # 3 200 17 212 212 72 26.0 6e-13 MLSEDMAEAANQECPILDKVYEDQTVVGISGKKVEFDTVKFVRCRMEECDFSGASFCNVV FDKCDFSNCSFRDTYWKNVNVSDSKGDGSQFCNSTFKWVKLLDSQFHYGNFSTAFWEFGE IKGCNFRESFMSEVKFKKTVFSSTDLAGTDFFRTPLKGMDLSECVIDGIMVSDQFTELAG VKVSLLQAAELARLMGVKIV >gi|157101653|gb|DS480671.1| GENE 148 161340 - 161966 947 208 aa, chain + ## HITS:1 COG:SPy2084 KEGG:ns NR:ns ## COG: SPy2084 COG3404 # Protein_GI_number: 15675842 # Func_class: E Amino acid transport and metabolism # Function: Methenyl tetrahydrofolate cyclohydrolase # Organism: Streptococcus pyogenes M1 GAS # 5 205 6 206 208 122 35.0 6e-28 MVESMTIQEFLDVLSSKEPVPGGGGASALAGALGNALGQMVANLTIGKKKYALVEDEIKE LAERMKGIQGQFSALADQDAKVFAPLAKCYSLPSGTEEEKAYKAEVMEARLLDASLVPME IMEKAWEMLEIMDILADKGSRMAVSDVGVGVQFIRTALLGAVMNVYINTKSMKNREKAEE MNEKAERLIKEGTEAADRIYQKVLEQLR >gi|157101653|gb|DS480671.1| GENE 149 161999 - 162844 1002 281 aa, chain + ## HITS:1 COG:BS_folD KEGG:ns NR:ns ## COG: BS_folD COG0190 # 
Protein_GI_number: 16079487 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Bacillus subtilis # 3 277 5 280 283 220 40.0 2e-57 MKILKGAEVSAKIKEQVTQMMEGLEGPAPKLAIVRVGEKPDDMSYERGAVKKMENFGLRV QTYVFPEQISDSDFKAEFSAINRDPDVSGILLLRPLPKQIAETDIEKMIDPEKDLDGISP ANIAKVFAGDKTGFAPCTAEAVIEVLKANEIDMTGKNVTIVGRSMVVGRPLSMLMLKENA TVTICHTRTRDLETECRRAEILVAAAGKAKMLDGRHVGQDAIVIDVGINVDENGKLCGDV DFPSIEPLASMATPVPGGVGAVTTAVLAKHLVLAGRRQRGN >gi|157101653|gb|DS480671.1| GENE 150 162982 - 163842 1344 286 aa, chain + ## HITS:1 COG:FN0800 KEGG:ns NR:ns ## COG: FN0800 COG0834 # Protein_GI_number: 19704135 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Fusobacterium nucleatum # 71 284 18 230 230 167 45.0 2e-41 MKKKVLAITMAALMAASLTACGGGAKETTAAATTAEEKAEDTTAAESKDETSAEAAETEA AKEAAGGKLVMATNAEFPPYEYHDGDAIVGIDAEIAKAIADELGMELEIEDIAFDSIIPE IVSGKADMGLAGMTVTEDRMQSVDFSDTYAKASQKIIVTEDSEIASPDDLKGVIVGVQLG TTGDIYVSDLEADGTTVERYNKGFEAVQALSQGKIDAVVIDGEPAKTFVAETEGLKILDE SFTDEEYAIAVKKGNTELLEKINGALKTLKDNGTLDEIVAKYIKAE >gi|157101653|gb|DS480671.1| GENE 151 164065 - 164757 926 230 aa, chain + ## HITS:1 COG:FN0802 KEGG:ns NR:ns ## COG: FN0802 COG0765 # Protein_GI_number: 19704137 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 13 226 11 233 236 160 46.0 2e-39 MWKTFTDAFYLNFVSDNRWKYLTEGLKTTLIITFFACLMGIVLGFLVGMIRSTYEKTHKL KVLNAICKVYLTVIRGTPVVVQLLIIYFVVFGSVRVDKVIVAILAFGVNSGAYVAEIFRS GIMSIDPGQMEAGRSLGFNYAQTMWYIIMPQAFKTVLPTLCNEFISLLKETSVSGYIAIQ DLTKGGDIIRSRTYSAFMPLIAVAIIYLIIVMIFTKLIQILERRLRQNER >gi|157101653|gb|DS480671.1| GENE 152 164747 - 165505 259 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 
[Acinetobacter baumannii AYE] # 16 242 7 230 311 104 26 4e-21 MSGNTGMFNGQPLIRVQDLGKSFGSIDVLKGITVDIHKGDVVFVVGPSGSGKSTFLRCLN LLEEPTSGHIFFEGTDITDPKTDIDKHRQKMGMVFQQFNLFPHMDIMKNLTIAPIKLQGK GQKEAEDEAMELLERVGLADRAHAYPSQLSGGQKQRIAIVRALCMKPDVMLFDEPTSALD PEMVGEVLSVMRDLAREKMTMVVVTHEMGFAREVATRVMFMDEGHFMEEAAPEEFFSNPK NERLKSFLSKVL >gi|157101653|gb|DS480671.1| GENE 153 166124 - 166621 543 165 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936159|ref|ZP_02083532.1| ## NR: gi|160936159|ref|ZP_02083532.1| hypothetical protein CLOBOL_01055 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01055 [Clostridium bolteae ATCC BAA-613] # 1 165 1 165 165 297 100.0 2e-79 MNREPADSRHVGEHFMEFAIDFFNLAKNSYHQQNQIRANSAAFQVLMQLNQPDRQAPTMS EMALQLGITKQQLTKLINDLEEKNLVRRQHSSRNRRHVYLMITPEGASIMKQLREAMLEC TVSRLSSYNQEELAELDDCLIRLSTLLEKFAAAEDIASCGDFPEL >gi|157101653|gb|DS480671.1| GENE 154 166837 - 168714 1205 625 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 49 615 37 595 636 468 45 1e-131 MGANEDSGKRNLKSGTPKKPFIYYYFLVILAVMVLNALVFPMMEHSVEVPYSEFLKELDA GNVKDVYVSNGESQIQFTKKDQKQWNVSYKTGPIPGDDLVKRLQNADVENFTGEIQTKAS PLLSFLMSWVAPILMFVVIGNIMGRMLSKRMGGSNAMTFGKSNAKIYAENETGITFADVA GEEEAKDALKEIVDFLHNPQKYADIGANLPKGALLVGPPGTGKTLLARAVAGEAHVPFFS ISGSEFVEMFVGMGAAKVRDLFKQANDKAPCIVFIDEIDTIGKKRDGGGMSGNDEREQTL NQLLTEMDGFDGKKGVVILAATNRPESLDKALLRPGRFDRRVPVELPDLKGREAILKVHG QNVKMSDDVDYNAIARATAGASGAELANIINEAALRAVRMGRSAVVQADLEESVETVIAG YQKKNAVISQKERRIVAYHEVGHALVAACQSHSAPVQKITIIPRTSGALGYTMQVEQGER YLMSREEALDKIATFTGGRAAEELIFHSITTGASNDIEQATKIARSMVTRFGMTDEFDMV AMETVNNQYLGGDTSLICAPDTAKRIDEQVVSIVKEQHRKALSILRENEGRLHEIAAYLL EKETITGDEFMEIFNRGEEEQVRGE >gi|157101653|gb|DS480671.1| GENE 155 168929 - 169213 384 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936161|ref|ZP_02083534.1| ## NR: gi|160936161|ref|ZP_02083534.1| hypothetical protein CLOBOL_01057 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01057 [Clostridium bolteae ATCC BAA-613] # 1 94 1 94 94 178 100.0 9e-44 MGKILQQLYRGDLCPAENTIRGNAEYDALTRQSMDDFNRFTDKLDRDMKEEFDLLMEHYL ELTFIEKTQCFTDGFRIGAGVMCEVFYENAAERN >gi|157101653|gb|DS480671.1| GENE 156 169466 - 170758 1592 430 aa, chain + ## HITS:1 COG:SMc00242 KEGG:ns NR:ns ## COG: SMc00242 COG1744 # Protein_GI_number: 15965420 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Sinorhizobium meliloti # 60 390 18 336 356 167 34.0 3e-41 MKLKKLASMAMAGVMAVSLAACGGSDTAKTTAADAATTAEAETTAAAEDAETKETEASEA PAAGGIAKEDLKIGFVYIGDENEGYTAAHYNGAMEMKEKLGLNDDQIIVKWNIPEDETAK DAAMDLADQGCQIIFANSFGHESYVIEAAKEYPDVQFCHATGFQAASSGLSNMHNYFTSI YESRYVSGVVAGLKLNQMIEDGTVKADACKIGYVGAYPYAEVISGYTSFFLGVRSVCPDA TMEVKYTNSWASFDLEKEAADALISDGCVLISQHADTTGAPTACEAAGVPCVGYNISMIA TAPKQALTSASNNWAAYVTEAVQHVIDGTEIPVDWCKGFSDGAVLITELNEAAVAPGTKE KVDEVEAKLASGELHVFDTSTWTVGGETLDTYKKDGSDIEYISDGYFHESEFGSAPAFDI LIDGITTIDN >gi|157101653|gb|DS480671.1| GENE 157 170886 - 172418 2077 510 
aa, chain + ## HITS:1 COG:AF0887 KEGG:ns NR:ns ## COG: AF0887 COG3845 # Protein_GI_number: 11498492 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Archaeoglobus fulgidus # 9 508 1 491 495 389 41.0 1e-108 MSEEYAIELKNITKRFGKVTANDKVCLSVKRGEILSVLGENGSGKTTLMNMISGIYYPDE GQIFVNGKEVTIRSPKDSFALGIGMIHQHFKLVDVLTAAENIILGLSGREVLDMKAVSAK INELTAKYGFELDPDQKIYTMSVSQKQTVEIVKLLYRGVDILILDEPTAVLTPQETDKLF AVLRNMREAGHSIIIITHKLHEVLALSDRVSVLRKGQYIGTVYTADATEESLTEMMVGKK VSLNIERPDPMNPERRLVVEGVTCVDKEGVTTLDHASFTAYSGEILGIAGIAGSGQRELL ESIAGLEPMTGGTITYYPPEGGEKQLSGMSASAINKLGIKLSFVPEDRLGMGLVGAMDMT DNMMLRGYRRGRSFFTDRKTPKNLAEQIIEELEIVTPGVATPVRKLSGGNVQKVLVGREI SSNPKVLMVAYPVRGLDINSSYTIYHLLNEQKKQGAAVICVGEDLDVLLELCDRILVLCA GQVNGVVDGRTATKEQVGLMMTRTGGGLSE >gi|157101653|gb|DS480671.1| GENE 158 172411 - 173538 1170 375 aa, chain + ## HITS:1 COG:mll0503 KEGG:ns NR:ns ## COG: mll0503 COG4603 # Protein_GI_number: 13470722 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Mesorhizobium loti # 36 374 22 365 375 152 31.0 1e-36 MSKEMTAVSNKEPLMHISKRDEIDMWKAWAIRLGAVLLSLVVCAGVIYALTGLNPLEVYK GIFDGAVGTKRRTWMTIRDTLVLLCIAIGITPAFKMRFWNIGAEGQVLIGGVTSAAMMIY LGNSLPPLVLFPLMLLGSALSAMVWAAIPAFFKAYWNTNETLFTLMMNYVAMQVITYCII FWENPKGSNSVGTINSSTKGGWLPKLFGLEYGWNLVIVLALTLAIFIYLKYSKQGYEIAV VGESENTARYAGINVKKVIIRTMALSGAICGIAGFIIVSGASHTISTSTAGGRGFTAIIV SWLSKFNAFAMVLVSFFLVFMQKGAGQIASQFNLNENASDVITGIILFFILGCEFFINYK VGIRSKRKEAGAKLS >gi|157101653|gb|DS480671.1| GENE 159 173551 - 174531 1296 326 aa, chain + ## HITS:1 COG:AF0889 KEGG:ns NR:ns ## COG: AF0889 COG1079 # Protein_GI_number: 11498494 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Archaeoglobus fulgidus # 3 323 6 306 310 167 38.0 2e-41 MGFLVIFIQKAIGQGIGILYGALGEIMTEKSGNLNLGIPGMMYMGGIAGLIGAFLYENGT 
DNPNGLVGLLISFVCAFVCAALGGLVYSVLTITLRANQNVTGLALTTFGVGFGNFFGGSL AKLAGGVGQISVAVTGAAYKTPIPGLSKFGVIGQIFFSYGFLTYLAVALALALAFFLAKT RRGLNLRAVGESPATADAAGINVTAYKYLATCLGGGISGLGGLYFVMEYSGGTWTNNGFG DRGWLAIALVIFALWKPVNAIWGSILFGGLYILYLYIPGLDRGAQEIFKALPYVVTIVVL VITSLRKKREYQPPQSLGLPYFREER >gi|157101653|gb|DS480671.1| GENE 160 174647 - 174802 65 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSKIKDPPAGRIEYQYLTGPGGSLRFSVTTAFVFRTPMFFNGFPSAISEIK >gi|157101653|gb|DS480671.1| GENE 161 175087 - 176799 1399 570 aa, chain + ## HITS:1 COG:CAC0367 KEGG:ns NR:ns ## COG: CAC0367 COG4187 # Protein_GI_number: 15893658 # Func_class: E Amino acid transport and metabolism # Function: Arginine degradation protein (predicted deacylase) # Organism: Clostridium acetobutylicum # 13 567 15 549 549 229 27.0 1e-59 MFEKQIYDYLRGMVSIPSVSNTPQEKEVSDYIAGCLERQPYFAKHPSLCGQCALEGDSLG RTVVYGLVRGKGAGTVVLTGHYDVVDTDEYGRFRALAYDMEAWKHIHGEELEALKSMLPQ EARDDLASGEWLFGRGVNDMKGGLSIGLAIMDWFGQKVLEYPETTGNLLFAAVADEEAYS AGMRGAVSLFTGLRQEYGLTYDCLVDLEPSFNEGGKQQVYIGSVGKTMPAVLVQGAKAHV VECFHGLNAIGVLAEMFMATELAPEFSETFEGEHCPPPTWFNLRDRKYGYDVSVPLRAAG YMSMLGFSKTTSQVMERLKEMGRRSFASYMKRMESQEVLVRSGNILPKVDLEHCVLEYGE LAEICRKKKGYGKWYQDLYGRIESDVRTGAMNYPQATLEMMDAMLTFSGITSPVMVISFA PPYYPAFHSDRLGETDRAGRTGGIQEGGIQEGGIQAGEGTRLYNLLQKAAEDTCGLCLGK RNYCCGISDLSYCGGPDREEMASYAVNAPLWGTMYRMDLDAMADFRVPSLLFGPIGKDAH QMSERVHARSLLEEVPVILQQFIEQMFANA >gi|157101653|gb|DS480671.1| GENE 162 176903 - 179728 3125 941 aa, chain + ## HITS:1 COG:CAC0503 KEGG:ns NR:ns ## COG: CAC0503 COG0178 # Protein_GI_number: 15893794 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Clostridium acetobutylicum # 3 940 2 939 939 1245 65.0 0 MAKQYIKIRGANEHNLKDISVDIPRNELVVLTGLSGSGKSSLAFDTIYAEGQRRYMESLS SYARQFLGQMEKPDVESIEGLSPAISIDQKSTNRNPRSTVGTVTEIYDYMRLLYARIGIP HCPKCGKEIHKQTIDQMVDQIMSMPERTKIQLLAPVVRGRKGEHVKVLDQARKSGYVRVR IDGNLYELSEEIRLEKNIKHNIEIVVDRLIVKEGIEKRLTDSIETVMALSGGLMMVDVSG GEPVNFSQSFSCPDCAVSIDEVEPRSFSFNNPFGACPECFGLGYKMEFDEDLMIPDKSLS 
IDQGAIVVMGWQSCADKGSFTRAILEALSEAYDFRLDTPFQDYPKEIHDVLLYGTGGREV KVRYKSQRGEGVYDVAFEGLIENVNRRYKETFSEASKAEYETYMRITPCPVCKGQRLKKE SLAVTVGGMNIYEATNMSIVRFKEFMDGLKLTPMQETIGASILKEIRARICFLIDVGLDY LSLSRATGTLSGGEAQRIRLATQIGSGLVGVAYILDEPSIGLHQRDNDKLLKTLTHLRDL GNSVLVVEHDEDTMRAADCIVDIGPGAGEHGGKVIAMGTAEEIMKNPDSITGAYLSGRVK IPVPAERRKPTGWITVKGAAENNLKNINVDIPLGIMTCVTGVSGSGKSSLVNEIVYKSLA RKLNRARTIPGRHKSIEGVEQLDKIIDIDQSPIGRTPRSNPATYTGVFDLIRDLFASTPD AKAKGYSKGRFSFNVKGGRCEACSGDGIIKIEMHFLPDVYVPCEVCGGKRYNRETLDVKY KGKNIYDVLNMTVEEALKFFENVPSVTRKIQTLYDVGLGYIRLGQPSTELSGGEAQRIKL ATELSKRGTGKTIYILDEPTTGLHFADVHKLIDILHRLSEGGNTVVVIEHNLDVIKTADY IIDIGPEGGDKGGTVIAKGTPEEVARCPVSYTGMYVKKYLN >gi|157101653|gb|DS480671.1| GENE 163 180072 - 180830 867 252 aa, chain + ## HITS:1 COG:Cgl2264 KEGG:ns NR:ns ## COG: Cgl2264 COG1802 # Protein_GI_number: 19553514 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Corynebacterium glutamicum # 22 234 2 195 221 74 26.0 2e-13 MLGAGIVMDKEKKRSLSEVAADYVREQILQGKLAQREKLVENDIAVTLGMSRGPVRDALK QLMYEGFLDYETNKGYSVTMLSPKDAYEIFFLRGNLENLALERCGGRLLSDSIFLMEDAV LAMKEAVSEDDMGKIIKYDEIFHKQILLSGQMERLTSLWETLSPLNGAMFYTIKQINGYV SEEQVDLDQEDIAAQAGRRGNNYLEHKWLLEILQRGNLEQSRSAINTHYIANGERIYRIG MKMKNQELFYHV >gi|157101653|gb|DS480671.1| GENE 164 181002 - 182075 904 357 aa, chain - ## HITS:1 COG:no KEGG:Spirs_0032 NR:ns ## KEGG: Spirs_0032 # Name: not_defined # Def: hypothetical protein # Organism: S.smaragdinae # Pathway: not_defined # 4 347 3 339 349 298 45.0 2e-79 MKQIGGFFTYEPLPETENHYLENLCPPGGSLRFFMSGRCANYYALQDICLSDTKKVAYVP VYTCETVLAPFLKAGYELLFYDVSRDLTPIFDERVLDRISVVNLCGYYGFCSYDREFIRK CSERGIIIVEDTTHSIFSADGVDPHCDYVVGSMRKWIGVPAGGFAIKRKGGFSLPVLPPD ETHLSMRSMSMRGKQQLLQAGTPDPQAMQKFNDTFWDAEMMLRRIFDSYGSDRDSEYIIR HYDFEQLKQRRRANYQYLLEHMPEHPQITVIFPVLKEGAVPSHFTVYAKDRDMVQSYLAG QGAKTTTYWPQGPMVDTEGHGDADYIYSHVLSFTCDQRYGTEDMEYICRQIEGMPEE >gi|157101653|gb|DS480671.1| GENE 165 182400 - 183347 277 315 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 
[Haemophilus parasuis 29755] # 65 312 43 316 320 111 26 4e-23 MYKYVLKRLLMLIPVIIGVTFIVFFILNLSPGDPAAIILGDQASAEALAMKREELGLNDP LLVRYGHYMINMLHGDLGTSYKNNLSVWSQVMERFPNTAVLAVAGILVAILIGIPTGIIS AKKQYSMMDNTAMLLSLIGVAMPNFWFGLLMVILFALTLGWLPSQGMGEGFIPLIKSLVL PAVTLGTGAAAMITRMTRSSMLEVIRQDYIDTARAKGVSDKKITTRHMLKNAMIPIITAV GLQFGTLLGGAMLTETVFSWPGLGRLMVDSIKSKDIPMVLGSVIFLAIMFTVVNLAVDII YAFVDPRIKSQYKRK >gi|157101653|gb|DS480671.1| GENE 166 183357 - 184265 1352 302 aa, chain + ## HITS:1 COG:FN0398 KEGG:ns NR:ns ## COG: FN0398 COG1173 # Protein_GI_number: 19703740 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 10 301 3 288 289 304 57.0 1e-82 MGKKHLIAAEAEGDQRSLWGEAWRRFKKNRLAMIGMYFILALVIIAIATIIIDAVTGKSI YLNYVVKQDLRGRLQGPSLAHPFGLDEFGRDMLLRMLWAVRYSLFMGTVAIIISCIFGGL LGAFAGYYGKTADNIIMRIMDILLAIPSMLLAIAIVAALGTSITNVLIAIAISYVPTFAR TVRASVLTVKDQEFIEAARSIGCSDWRIIIKYVIPNSMAPIIVQVTLGIAGAILSIAGLS FLGLGIQPPTPEWGAMLSNARSYIRDAWHVTIIPGMGIMLTILALNLVGDGLRDALDPRL KS >gi|157101653|gb|DS480671.1| GENE 167 184297 - 185265 1177 322 aa, chain + ## HITS:1 COG:FN0399 KEGG:ns NR:ns ## COG: FN0399 COG0444 # Protein_GI_number: 19703741 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Fusobacterium nucleatum # 1 319 1 318 335 391 63.0 1e-109 MADKPLLEVKDLVIHYETDDGVVKALNGVNIHIGVGETLGLVGETGAGKTTLAKGIMRLI PNPPGKILGGEVIFEGQDLLKLSTNGMEAIRGRDISMIFQDPMTSLNPVLTVGEQIMEVI ENHNTSLSRQEARKWAENMLERVGIPAERFGEYPHQFSGGMKQRVVIAIALACNPKLLIA DEPTTALDVTIQAQVLEMIYKLKSENNTSMILITHDLGVVAQNCDYVAIIYAGEVVEYGT LREIYKDTKHPYTEGLFGSIPSLTSDVKRLQAIDGMMPDPTKLPEGCVFCERCKYAVPEC SKTHPGMVTVGGTHQVRCIRYR >gi|157101653|gb|DS480671.1| GENE 168 185283 - 186257 1163 324 aa, chain + ## HITS:1 COG:FN0400 KEGG:ns NR:ns ## COG: FN0400 COG4608 # 
Protein_GI_number: 19703742 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Fusobacterium nucleatum # 8 324 1 315 320 438 67.0 1e-123 MQEQKTEILRVEHLKKYFTTPKGTLHAVDDVNFSIRTGETLGVVGESGCGKSTMGRAILR LHEPTSGKVYFEGRDILGYNKKQLKDLRKDMQIIFQDPFASLNPRMTVSEAIIEPLLVQG IYKPNEKAAITQQVEKIMNLVGLAKRLVNTYPHELDGGRRQRIGIARALAVNPKFIVCDE PVSALDVSIQAQILNLMQDLQEELNLTYMFITHDLSVVRHFSNDIVVMYLGQMVESAPAK ALFKNPMHPYTKALLSAIPVPDPDFKMERIPLKGELTSPINPEPGCRFAKRCPYATEGCT SNEMTLKEMEPGHFVSCRMVQEQG >gi|157101653|gb|DS480671.1| GENE 169 186349 - 187953 1894 534 aa, chain + ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 51 532 29 509 511 412 45.0 1e-115 MRKTWKKVIALSLVAAMAVSAAGCSKSGQTSTEGGSAQTEAGSKGAEAGGSAGGSGEYKD TLVWAQGADVTSLDPHQGKETPAVEVTDQIFDTLTVVDAATGELQPQIAESWDQNSDTSY TFHIRQGVKFHDGSEVKAEDVKFSLDRAINSAAVSYIVDFIDKVDVVDDYTVNVELKAPY APALRNLSVPFAAIVPKAVVEADEEAFKLHPIGTGPYKFVEWKQGDSVTLERFDDYYAGP AETQKLVMKVIPEASQRTIALETGEVDLAYDLLPNDLEKVRSNSDLQLFEAPSLTCYYIS MNMNKAPYDDQKVRDAVNYAIDRQLIIDTIASGSGEPADAIIAPAVFGYFSPGAYEYDPE KAKSLLAEAGYPNGFETSIWVNDNQTRVEVCQAVQAMLQDVGITCNIEVMEFGSFISRTS AGEHDMGYFGWVTSTTDADYTYYSLEHSSQQGAAGNRSFIGDPEVDKLIETGRTSADESV RLKAYEDLAVKLKEVNNNAPIYYSTITAGANAKVEGFVMDPIGYHKLEHVKVAK >gi|157101653|gb|DS480671.1| GENE 170 187978 - 188739 856 253 aa, chain + ## HITS:1 COG:MK0183 KEGG:ns NR:ns ## COG: MK0183 COG1402 # Protein_GI_number: 20093623 # Func_class: R General function prediction only # Function: Uncharacterized protein, putative amidase # Organism: Methanopyrus kandleri AV19 # 17 251 14 222 224 90 26.0 2e-18 MRLSKITWPTAERYFKENDMVILPIGSTECHGRHMPLGTDTLIPDKILDLIEEKNDNILI APMIPYGACQSLADYPGTINIDSDVLYQYVYQIVDGLYKHGARKILVLNGHGGNIKTIER IGLEFDKKGAMVVMLNWWLMAWDMKPEWKGGHGGGEETAGIMAVDPSLVDMSKIDLPLEM 
TNLTENLVATGFRTVRFKGVEIEILRNTPKVTDNGWIGPDHPNTATVEWGQEMVQTTADY ILDLIEELKKVSL >gi|157101653|gb|DS480671.1| GENE 171 189057 - 190121 1085 354 aa, chain + ## HITS:1 COG:lin1180 KEGG:ns NR:ns ## COG: lin1180 COG1363 # Protein_GI_number: 16800249 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Listeria innocua # 47 348 58 356 359 182 36.0 8e-46 MTNAFGPSGFEDEVIREVKNWCGGLDVANDAMYNVYATMKKKKAGRPVLMLDAHLDECGF MVQSILENGLLSILMLGGFHLTSLPAHSVIVRTRDGKKHRGITTSKPVHFLTAAQKNDNS LAIEDILVDVGASSREEVTEVYGIRPGDPIMPDVSFEYHKENGILYGKAFDNRLGCVSII DTMQALRDEADQLAVEVVGAFAAQEEVGTRGATVTAQQVQPDLAIVFEGSPADDFYFRAG IAQGCLRSGVQIRHMDNSYISNVEFIRFAQETGDKLGIKYQDAVRRGGSTNAGKISLTGK AVPVLVLGIPSRYVHSHYNFCAEEDVDATVRMAVEVIRGLDEERIARILRKEVI >gi|157101653|gb|DS480671.1| GENE 172 190649 - 192190 1439 513 aa, chain + ## HITS:1 COG:CAC2700_2 KEGG:ns NR:ns ## COG: CAC2700_2 COG0519 # Protein_GI_number: 15895957 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Clostridium acetobutylicum # 195 513 1 316 316 444 67.0 1e-124 MNREMIIVLDFGGQYNQLIARRVRECSVYCEVHPYDMSLEKIKEMNPKGIIFTGGPDVVF EDGAPRCSKEIFELGIPVLGICYGAQLMSYLLDGVVKKAVVSEYGHTEVDVDTSSKLFDG VSSKTVCWMSHGVYIAEVPAGFRITAHTDSCPVAGMECPEKNFYAVQFHPEVVHTKEGTR MLSNFVYNVCGCSGDWKMDSFVDTTIKALREKIGNGKVLCALSGGVDSSVAAVMLSKAIG KQLTCVFVDHGLLRKNEGDEVEAVFGPEGAYDLNFIRVNAQERFYARLKGVEEPEAKRKI IGEEFIRVFEEEAKKIGAVDYLVQGTIYPDVIESGLGKSAVIKSHHNVGGLPEHVDFKEL VEPLRLLFKDEVRKAGLELGIPEYLVFRQPFPGPGLGVRIIGEVTADKVRIVQDADAIYR EEIAKAGVDKNLGQYFAALTNMRSVGCMGDERTYDYAVALRAVLTSDFMTAESAQIPWEV LGKVTSRIVNEVKGVNRVLYDCTGKPPATIEFE >gi|157101653|gb|DS480671.1| GENE 173 192406 - 192498 123 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGTYEILSLLFLGGSFLIALLTYIDRHNKK >gi|157101653|gb|DS480671.1| GENE 174 192746 - 193087 327 113 aa, chain + ## HITS:1 COG:CAC0417 KEGG:ns NR:ns ## COG: CAC0417 COG1393 # Protein_GI_number: 15893708 # Func_class: P Inorganic ion transport and metabolism # Function: 
Arsenate reductase and related proteins, glutaredoxin family # Organism: Clostridium acetobutylicum # 1 112 1 111 112 128 66.0 3e-30 MNIQIFGTKKCFDTKKAERYFKERNIKYQLIDLKEKGMSKGEYTSVKQAVGGLDVMLNDD CKDKDALALIKYIAAEDRDEKVLENQKVLKSPIVRNGKKATIGYQPEVWKTWD >gi|157101653|gb|DS480671.1| GENE 175 193289 - 194029 657 246 aa, chain + ## HITS:1 COG:lin0801 KEGG:ns NR:ns ## COG: lin0801 COG3279 # Protein_GI_number: 16799875 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Listeria innocua # 4 240 3 239 240 89 25.0 6e-18 METLRVAVCDDDRVYLEQIKEEIKEYIGSMPGTVDMQVSLFTDGRRLYEAGQQQPFDLVF LDIEMPERDGFWLAEQLGISCPETRLIFVSSHESWVFDAHEYMPLWFVRKGLLKRDMGRA LQKYFQVTARRKISYKIQGGFGAGEVLLRDIMYIECNGHTLTYKMSGGAEYSVYGSLKPV AEELRDYGFLRIHRNYLVNQAYISQIEKQDAVLKTGAVIPMGRDRKKGVREMMMEYGRKH GSRLCD >gi|157101653|gb|DS480671.1| GENE 176 193995 - 195305 1132 436 aa, chain + ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 206 427 202 425 433 79 25.0 2e-14 MGESTAAAYVIDVLSTLVAMEYMDHGFDKKYSGAARIMLFTGGCLAYYGIMVLLNRYIVF EGFLGISYGVVLTAYGLIALNGKAQKVIVHSLLWILIAISSSYMIYGTLGIITGRSLNTL MIMDRDAVVFPILAGCALKFSMGRAALALYGRKKGPVQAEDGMLAGTFFCMFILIMGMFL MEEGRLDQRGRYVLALCMLMGIFGVIVFIGGFYHRLERYRREEREAEFRRETKCQQEEQI RDLYRMGREVNRLRHDMKGRFNVLYRLVAKERYAEAAEYIERMGADLGNYPELPQDTGNE GLNAALIKAVQECREKGIRFRYVVMGRPDRIDSMDMGTLLYNLLSNGIEACMAVETEREL ELVVREEDGTTEIFMENTISGSVLKNNPNLESRKQDKRYHGFGMESIRGIIGKYQGHYSY REESGRFIQEIDLRDG >gi|157101653|gb|DS480671.1| GENE 177 195465 - 196217 659 250 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936187|ref|ZP_02083560.1| ## NR: gi|160936187|ref|ZP_02083560.1| hypothetical protein CLOBOL_01083 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01083 [Clostridium bolteae ATCC BAA-613] # 1 250 23 272 272 502 100.0 1e-141 
MTAQNKETYLESMNEEYKQKHYPEGSVFKAHKKAMKGVNAGLGLFFMGAFLAGSIAGLVW SINRTLEIIRDKEENMLGVGIGISVFFLLLVAVFGVAVFFITKDIRKTVDDWIRVAAKAG GLEEQEIREFDKQAMGSDSLILNHLGKLKSFTTGQKRGILTRDYICLYGGNMPCVLKLSR LTQAYIKDNTYYVKVGKTRKQAHYLTIHLMTEDKKTAWAETSREAARALQEELESRCPGI DTAGGSVLPG >gi|157101653|gb|DS480671.1| GENE 178 196238 - 197269 826 343 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936188|ref|ZP_02083561.1| ## NR: gi|160936188|ref|ZP_02083561.1| hypothetical protein CLOBOL_01084 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01084 [Clostridium bolteae ATCC BAA-613] # 1 343 7 349 349 659 100.0 0 MGSDNTNRRTVETEFKKAMVRVNVPVGLLVTAFGIGLMVLVGSYLIKYFRGAVPLREVYG TEQAGDEYVSFDINYVLDAFMESYRTRSGVRTKSSIYYLVLDEEAGAVPVRISGKMEAEM DRIMDETWDYLKGVRSDPPQGLHVEGTLKKLKGDGKKYYDRAVRSFDLETETEGYYFWAG MLDNQTPSSAMGTTGLGAVIAFLGLFILSSMLKKHHVAAVRTFLDSHKTINRERLDEDFR NARKISGTFWIGEDLTYGIVGKPVILVNQELKKVRFQVKKVGRGTSTELVCVMADGKEYE FTMNRKDADEAIALYHAKFPTLKMASSLKGKLIVPEDGDTESN >gi|157101653|gb|DS480671.1| GENE 179 197421 - 197927 180 168 aa, chain - ## HITS:1 COG:SP1236 KEGG:ns NR:ns ## COG: SP1236 COG0454 # Protein_GI_number: 15901098 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Streptococcus pneumoniae TIGR4 # 9 142 10 141 162 91 39.0 6e-19 MIYTITEMTEKHYYGKAYVQYAAWFETYRGLMPDEFLDNRKLESCYESTLKHPQDTYVAL VDETVAGFIWYSPDAREYTRKTGMSEINALYVLKKYQGHGIGRALMEKSLSMMPHKEAVL YVLRGNNPAIGFYEHMGFMLTGNTFTQDTGFGNIVELEMMRTGRLTGP >gi|157101653|gb|DS480671.1| GENE 180 198378 - 199316 864 312 aa, chain + ## HITS:1 COG:no KEGG:Nther_1480 NR:ns ## KEGG: Nther_1480 # Name: not_defined # Def: transcriptional regulator, GntR family # Organism: N.thermophilus # Pathway: not_defined # 10 311 1 313 321 81 23.0 4e-14 MIHYAEVWKINKDSKIPYTDQLVRNIRWSIFTGEVRWTEKLPPIRVLADELGISINTVRN AYKRLEQQELVVTRPHIGTIVQTESMDKKKLEEELVTSIKNAIYYQLSVDEVRSIVDKVL KEIEESKKKSVIFVYEEKNIGHRFAMQIAREADVQVEEVHLDQLRDYLEEHRDQIEHLDA 
IITTYFQYAQVRSIARSYQPIIYGMTVEVAPDVMEALQSLESGSVAAVICKKEESADGFV NLVHRMCPELDVEVYYEDKRSGWKKIAQKASVICVSPQLAEKVRQCDWKIPVYEMWDRIN EQSMNMLKDYLH >gi|157101653|gb|DS480671.1| GENE 181 199406 - 200389 1011 327 aa, chain + ## HITS:1 COG:FN0399 KEGG:ns NR:ns ## COG: FN0399 COG0444 # Protein_GI_number: 19703741 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Fusobacterium nucleatum # 1 316 1 317 335 391 62.0 1e-109 MEEKLLEIKDLSVEYTTDEDIVKAVNHVSLTLNKGETIGLVGETGAGKTTLALSLMRLLP KVSGRITGGEIMFNGEDLVKAAEEDMRKVRGEKISMIFQDPMTSLNPVLTVGNQIGEALE LHMDLTPEEKQERIEEMLKLVGIAESRKKDYPHQFSGGMKQRVVIAMALACKPELLLADE PTTALDVTIQAQILDMMQELKEKINTSIILITHDLGIVAEICDKVAIMYAGEIIEYGTAE DIYEGKNRHPYTEGLFGSIPDLEVESRRLNPIDGLMPDPTDLPPGCKFHPRCPKCMDICK NAQPPECVEGTHMIRCHLYQDGPEGVS >gi|157101653|gb|DS480671.1| GENE 182 200524 - 201483 751 319 aa, chain + ## HITS:1 COG:FN0400 KEGG:ns NR:ns ## COG: FN0400 COG4608 # Protein_GI_number: 19703742 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Fusobacterium nucleatum # 7 315 1 308 320 415 64.0 1e-116 MSDKKILLETKSLKKYFHTPSGELHAVDDVNMQIVQGQTLGVVGESGCGKSTLGRTILRL SEPNSGQIFFDGKDILKCGRKDMQNLRTDMQIIFQDPYASLNPRMTVSETIGQPMMVNKI CRKKSEVEKRVAELMDTVGLASRLYNAYPHELDGGRRQRIGIARALSLNPKFIVCDEPVS ALDVSIQAQILNLLMDLQEQRGLTYMFITHNLSVVKHISNEIMVMYLGQCIEKASSRELF KNPLHPYTQALLDSIPVPRLQGRNRKRQSIKGEIVSPIDPAPGCRFAPRCPRATEKCMGH DIPLKNVSENHQVACVLFN >gi|157101653|gb|DS480671.1| GENE 183 201502 - 203061 1562 519 aa, chain + ## HITS:1 COG:FN0396 KEGG:ns NR:ns ## COG: FN0396 COG0747 # Protein_GI_number: 19703738 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 1 488 1 476 511 213 30.0 9e-55 MKKKLISMLLVCAMAAGALAGCGGSSGGSSGGGAGAGTEKAADAGNQEGGKDTIKIAVYS 
DFAGFDPYNSGMDLDKVVYSNIFDCLLRYYDGKYENVLCKDYSISEDGTEYTFNLKEGVK FHNGETLTAGDVVFSMMRAKESSEMSNFTKNIVKAEAVDDLTVKIVLDQPYVPFLTAVAS TVCIMNEKAVNEAGDNIRQMPVGTGPYKFKQWDSGSQVVLERFDDYHDALPQIREANYVV LTNPETALTALQTGEIDMTYTIPPIAVQELKDSQDLVLDLNPTMGSGYIVMNLEAPFLSD PNFRMALAYATNREHIVEVGMDGVAKVSSLLWDERTAGYSGKYTTPEFNLDKAKEYLAKT DYNGEEIPFVVGYENYKKIGVVYQEELKQIGVNISVEQLEANTWVSDMKSGNFAMSTIVQ TMDPDVDFWSTVFMSSAIGGYNFSRLNDADVDKAFQDGSVCQDPEQRKDIYSVIEKKLYE DTIIIPIYDRVVTCAYNKGVTVDRSYDSGFSTLRDMHWD >gi|157101653|gb|DS480671.1| GENE 184 203120 - 204088 1038 322 aa, chain + ## HITS:1 COG:FN0397 KEGG:ns NR:ns ## COG: FN0397 COG0601 # Protein_GI_number: 19703739 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 1 309 1 307 308 287 51.0 2e-77 MIKYVLKRLLLMIPVLLGVSFIVYGIMSFTPGDPASAILGTSATKEQIDKLNEELGMNDP FVVRYVHYLEDALHGDFGTSYRTRQPVFDEIFSRFGVTFRIGVLSICLAVMIGVPIGVIS AVRQYSGLDFVSTILAMFFAAVPQFWLAMMFVMLFSLKLGLLPSTFTGVVTMKHFIMPVV TLAMATAASILRLTRSNMLETIRQDYVRTGRAKGASEHTVIYKHALRNALLPVITVIGNE FGYLLGGTVLVESIFGIPGLGSLTITSIRSKDIPQTTACILFLAALNVVIILIVDILYAY IDPRIKAKYVGGVRRAKRQPAT >gi|157101653|gb|DS480671.1| GENE 185 204063 - 204959 1059 298 aa, chain + ## HITS:1 COG:FN0398 KEGG:ns NR:ns ## COG: FN0398 COG1173 # Protein_GI_number: 19703740 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 12 297 3 288 289 284 51.0 1e-76 MPNDNLQPDTVKKDRKKQSRAAEIWKRFRRSKLAMVGFVIIVTVVLLAAFASVICDYEAD AITQNYDARFLGPSLHHLFGTDDYGRDIFARVLFGARVSLTVGIVSTAVACLIACLLGSI VAYYGGKVDTVLMRILDVFMGIPTLLLAIAIAASLGAGIRNLILALIISQVPGFTRVVRS AVFNIVDMEYIEAAKAYGCSPVFIIVRHILPNAVGTIIVQATMAVAAQILNTAALSYLGL GQQPPAPELGSMLSDAKEYMRYSVWGVLFPGLTIATIALSLNLVGDGLRDALDPRLKN >gi|157101653|gb|DS480671.1| GENE 186 205129 - 206310 1170 393 aa, 
chain + ## HITS:1 COG:YPO3006 KEGG:ns NR:ns ## COG: YPO3006 COG1168 # Protein_GI_number: 16123185 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Yersinia pestis # 1 388 1 390 393 358 42.0 8e-99 MKYNFDEPVDRSRNFSAKYDECARKFGREGLIPMWIADMDLKTAQPIMDAMEERNKQGIF GYTTRPASYYEAVCRWQKDRNGYDFDKSLAAFALGVIPAICTIVREFTEPGQEVMYFTPV YSEFHDSVVNSGRTPLTVSLKEQDGYYEIDYEAFEEAAKRRPGLLIMCNPHNPVGRSWTR EELEKVGDICVRYQIPIISDEIHSDLMLYGHRHTVMAGVSGEIADITITCTSATKTFNLA GLQAATLFFPNHEMKDRYEKFWFGMDVHRNNCFSLVAVETAFREGGEWLEQLLDYLEGNI DYTWDYLKKNIPEIHFLKPESTYLLWLDCRELGLSGDELQTFMVQDAGLALNDGRGFGPE GGGYMRMNIACPRATIERALGQLKAAVDRKMGR >gi|157101653|gb|DS480671.1| GENE 187 206373 - 207605 916 410 aa, chain + ## HITS:1 COG:YPO1950 KEGG:ns NR:ns ## COG: YPO1950 COG0006 # Protein_GI_number: 16122196 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Yersinia pestis # 13 403 14 395 405 136 24.0 6e-32 MKYEKVVIKPKERRIKSMLRERGLDGAIVSSPENFHYVTGFAGHQHTVSRQPGFALAVMS SDDKTATQLTTMDYEELTFEKRVREQKERLGEEASEFVVRPYDTWVGVKTWDEITGAAKT VDKQSMESSFDMLKKMVRDMDLAHKKLGVELAYLNVPYFRQLQETFPEAEFVDISSMFVL ARSVKAEDEIQMFRTLCRIADEGFYQVSRMVRPGVREREMADCFRQSVIATGFCVPSSWS MFSTGPSGARLTLPEDGIIESTHVVKYDAGVNAEFDFYTTDTSRAWVMKDAAPELFRLKD RLYEGQRRMIAAARPGLPVCELFRTAYEYVKEQYPCYRRGHCGHSISLGPATAEEPYINA SAERLLEPGMVLAMEVPCYIRGVNGFNIEDMVLITEDGSEVLTPKTPHYL >gi|157101653|gb|DS480671.1| GENE 188 207629 - 207796 70 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936203|ref|ZP_02083576.1| ## NR: gi|160936203|ref|ZP_02083576.1| hypothetical protein CLOBOL_01099 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01099 [Clostridium bolteae ATCC BAA-613] # 16 55 1 40 40 74 97.0 2e-12 MGLPGGYTGIRVILAVIIPMVFHMVSPMLFYVIVTRNSQDVKRPGACDWLPGLLF >gi|157101653|gb|DS480671.1| GENE 189 207728 - 208645 886 305 aa, chain + ## HITS:1 COG:CAC1852 KEGG:ns NR:ns ## COG: CAC1852 COG0598 # 
Protein_GI_number: 15895127 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Clostridium acetobutylicum # 125 305 174 354 354 96 28.0 6e-20 MKYHWNDHSKDDPDAGVSAGQSHPVEILTMEEFRTACPDIYRRIFSGSGLNHIRFCKADV FRAAVAGTFAIPSKEDPSKEKQVFGYCLTGEKLIFMDDSGYVRGLLDEMQDYQMTDVTSP SLQLFDVMEYLLKEDVIFLQQYEDRLTRLEEGLLKRESEEFDVRILAARKDLSALSVYYE QLSDMGETLQQDAAERGNERDSLLFGLFSDKAGRLYATVQMMKEYSMQLREMHQTQVDIR QNEIMKFLTIVTTVFMPLSLIAGWYGMNFINMPELGFPYGYAVICILCLLIVAVEIWFFK RKKWF >gi|157101653|gb|DS480671.1| GENE 190 208689 - 209357 570 222 aa, chain + ## HITS:1 COG:no KEGG:Rumal_2048 NR:ns ## KEGG: Rumal_2048 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 1 208 1 207 220 127 37.0 3e-28 MNRIHMELVQTIYDYGEYYWMIDGRPIVRYLNEAVSAGACPRLEVFGSLEGLLPAWTGEL VWKAENRFIWEMVDSVEDLNVPVLVCEDDCDLSCIVIMAKIRKEPDTVYWDSLGVLNLEN QDFRMEKQSGILCLEAYSDQDWEEYGDNIACEQFDSPEYRKWVSEHWDEEMIRRRRNYTK PYMQREENITWICSPLWQFERKEYERMVEDYRKVYEDRMGRD >gi|157101653|gb|DS480671.1| GENE 191 209589 - 209669 58 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDDADSDGSQDCDYNNIGLCTGRFKA >gi|157101653|gb|DS480671.1| GENE 192 209779 - 210729 692 316 aa, chain + ## HITS:1 COG:RSc2319 KEGG:ns NR:ns ## COG: RSc2319 COG2267 # Protein_GI_number: 17547038 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Ralstonia solanacearum # 3 305 6 311 319 194 35.0 2e-49 MYYGKDGYWKKVQEYLPEENRLTGVWRPDEYFVGIGRFGIHIDHYRVKEAKARVILFHGV GGNGRLLSLIAVPLMKQGYEVICPDLPLYGMTEYYGKVVYQDWVDCAAEIVMYYQAREIR PTFLFGLSAGGILAYQTASGLPEIQGVIATCLLDQREPLVRKNVVSSGWMAEHGLRLISK VSAWLGFIKVPIKWVGNMKAIVNNEELAEVLMRDRRSSGAKVPVAFLHTLLNPHIHPEPE RFSRCPVLLVLPENDRWTDASLSRIFYDRLNCSKDMKLLKGAGHFPIEKEGLMHLEQYCM EFLEECLAQRLVSNCQ >gi|157101653|gb|DS480671.1| GENE 193 210828 - 213806 2358 992 aa, chain + ## HITS:1 COG:lin0778_1 KEGG:ns NR:ns ## COG: lin0778_1 COG1221 # Protein_GI_number: 16799852 # Func_class: K Transcription; T Signal transduction mechanisms # Function: 
Transcriptional regulators containing an AAA-type ATPase domain and a DNA-binding domain # Organism: Listeria innocua # 146 497 110 461 464 296 41.0 2e-79 MEERILQIIKGEDSKNPLTDEEIASRLQVFREDVTTVRREHHIPDSRKRRKPVIFEDMKR ILTEHPDVSDRGLTRMLEEAGYRIGKYAAGRLREELLEIWTPFGDCQEKEGAGLAPDLAK HTEYAKHAEYEEPAESGAEDERANAFSAFVGFDGSMKTQITRAQAAVLYPPRGLHCLIYG PSGVGKSFLAELMHEYACGTENFGKDAPYFEFNCADYADNPQLLLAQLFGYSKGAFTGAS DNKKGVVELCNGGILFLDEVHRLPPEGQEILFYLMDKGRFRRLGEVDTQRESHVMVIAAT TENPQSSLLLTFRRRIPMVIEIPSLKDRPAGEKLQFILRFFQWESRRLGKQIKIRQQALW CLLGGDCPGNVGQLKSDIQVCCAKAFLEGRARKDGRITVSFASLTENLRKGYVPDRITKE IKELCPDDCYVFPDDTGESAGKSGTETFLEGFNIYESLEEKYDQLLRDGLREQDIEVQLT DEIESRFQRHIHEFQKSGIREDEIASIVGDDILRMTRDICDLARKRLPGLEEQVVFPLAI HLNMAMERMRSHGRMVYPGMENIRQQSYEDYEAACYAVDEIQKKYYLTLPEEEKAFLAMY FRKFKKKDMAQEGRIGVLVVSHGPVASGMAQVANAIMGTEHAVGVDMNLWDTPAQMAEKT VDMARRVNQGKGCIVLADMGSLLNVGQKIREETGMQVRVLPRTDTMMVVEAVRKTMWTDE SLNEIADELDKIKLTGTREQSESRKKAILCLCITGQGAAVKLKEHLTERLKSNLAGTVVV TRGYIEDSSIDRIIANTEKEYEILAIVGTINPDSGTYPFISVSRVYSPEGIRQLRGILKR SAMFEENQLSEVISLGHIYIRTEAEFKDQIIDEAVGHMAEEGLVRQEFLLSVYKREGMMT TRLSQGIAIPHGDPGLVTKPVISVTKLDKPVLWDGVNTVDVIFVLAIQEDSRKYFEQLYQ IISDESMVSAIRASRTREEIRNLLCKNTKSVN >gi|157101653|gb|DS480671.1| GENE 194 213853 - 214491 673 212 aa, chain + ## HITS:1 COG:BH3723 KEGG:ns NR:ns ## COG: BH3723 COG0800 # Protein_GI_number: 15616285 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Bacillus halodurans # 9 210 14 210 214 104 34.0 1e-22 MNIEKFPKVTVILRGYTYSQTRTVVKNLVGTCLGSVEITMNTPGALESIRKISQEFGAQI MVGAGTVLTYEEAVEAIEAGAAFLLSPTVFEKRILDLCRERGVVSVPGAFSPSEIRQSFL DGADIVKIFPAGRLGSRYVSDIQAPLGKMPLMVVGGIGTSNVQEYFAAGASFAGIGSGIF LKEDILNENEEGIRGSIAAMEEKMKSMEEKES >gi|157101653|gb|DS480671.1| GENE 195 214488 - 214961 624 157 aa, chain + ## HITS:1 COG:BH0192 KEGG:ns NR:ns ## COG: BH0192 COG1762 # Protein_GI_number: 15612755 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase 
system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Bacillus halodurans # 1 152 1 152 160 111 42.0 5e-25 MSGIAFNESLILEFDKPMTDREVLSAMTDYLCEKGIVKDTYKQAILEREQSFPTGLKTGG INVAIPHADICHVNEAAICVAVLKTPAPFRAMDEPDNDVPVSLVIMLALTEAHGHIEMLQ RIVKLIQNQEDMKHIVEAGQPDTIHKIIKQYLLEDEN >gi|157101653|gb|DS480671.1| GENE 196 214989 - 215267 413 92 aa, chain + ## HITS:1 COG:gatB KEGG:ns NR:ns ## COG: gatB COG3414 # Protein_GI_number: 16130031 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Escherichia coli K12 # 2 92 3 93 94 68 37.0 3e-12 MKRILVACGNGIATSTVVATKIREKCEDNGIPVSVTQCKLLEVESKADDFDLLVTTGKFT GGNVNIPVVGAISLLTGINEDATLDEILSHLK >gi|157101653|gb|DS480671.1| GENE 197 215356 - 216723 1722 455 aa, chain + ## HITS:1 COG:SA0238 KEGG:ns NR:ns ## COG: SA0238 COG3775 # Protein_GI_number: 15925950 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Staphylococcus aureus N315 # 1 414 1 415 419 320 44.0 3e-87 MQILYNIFNTILDAGAVVMLPIIITIIGLIFRVKLSKAFRSGLTIGIGFAGINLVINMMK SNLGPAAQAMVQNFGIKLDILDVGWGAIAAVTWSSPIIAILIFCILATNIVMLVLKATDT LDVDIWNYHHMAIVGIMVYFVTKNVAMGIGATVIMAIITFKLSDWTAPLVEDYFGIPGVS LPTMSALSSIIIAAPLNWLLDKIPGINKIDFEIKDAKKYLGFFGEPTIMGLVLGCVIGVL AKYSVADIMALGVNMAAVMVLIPKMTSLFMEGLMPISEAAKKFTQEKFKGRKFLIGLDAA VVVGNPDVITTSLIVIPLTILMAAVLPGNRVLPFADLAVVTFRVALVVAITRGNLFKNII MGLVCTGAILLAGTATAPVLTQLATSVGLDLGEGMITSFAATSLTVSFLVYEAFIKKPYI FVPVLIAVMAAVWVFMNRYNKKKTMLAGGNDVAGN >gi|157101653|gb|DS480671.1| GENE 198 216710 - 218347 1674 545 aa, chain + ## HITS:1 COG:PAB0895 KEGG:ns NR:ns ## COG: PAB0895 COG0129 # Protein_GI_number: 14521553 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Pyrococcus abyssi # 11 542 11 549 551 423 43.0 1e-118 MRAIDKLEPFQKAISKAHLASAGVSIDSLDHKPLIAVANSWNEICPGHEPLKQLAEEVKK 
GILEAGGEPIEFNTVAMCDGIAQGHSGMRYCLPHREIITDSIEAMVVGEGVFDGVVYMGS CDKIVPGMLNAAARIDLPSVIVTAGPCYAQIKPNESKELRQQFLRGEITERQLIEGTLKY YTGPGICPFLGTANTMGALCESLGMMLPGGSLIPSSTSMRRFSARESGRAVMKMVEKQIR PSQIMTREALENTITLLSAIGGSLNALIHLPALAAELGISLDWDDFSKITTATPVLCSIV PNGNRTVVDLHYAGGIPAVMKELESVMNKDVLTVTGETLGQILERTSQGDREVIKSMDDP VSRADGIQVLYGNLAPQGALVKTSAVPMEVRRFKGTARVFEGEDECYAAFRAKEIKEGDA VIVRYEGPKGGPGMKELHRITEIIKGIPNTAVITDGRFSGASGGLSIGYLCPEAAEGGPI ALVKDGDRITVDLYKDYLHLEVSDEELEARRREWKAVRRPREKGLLGRYARLVNTARSGA TLSRE >gi|157101653|gb|DS480671.1| GENE 199 218585 - 219289 559 234 aa, chain + ## HITS:1 COG:no KEGG:Swol_1999 NR:ns ## KEGG: Swol_1999 # Name: not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 1 233 1 233 237 266 51.0 5e-70 MEEQTTGCLVCGAPLEYFPSQIEMECTYCHKTFMSNARCVNGHFICDSCHEQMGLRVIED ICRHTDSKDPVAIMKKIILSPYIYMHGPEHHVLVGSALLAAYKNAGGELDLDAALEEMRN RGAQVPGGICGLWGTCGAAVSTGIFVSLITGASPLSGKEWGLCNEMTSRSLGAIAKTGGP RCCKRDSYTAILEAVDFVGEKFGIWMERPKKTVCGLYGRNEQCLKEKCPYNPQG >gi|157101653|gb|DS480671.1| GENE 200 219307 - 220635 937 442 aa, chain - ## HITS:1 COG:BH0916_1 KEGG:ns NR:ns ## COG: BH0916_1 COG3325 # Protein_GI_number: 15613479 # Func_class: G Carbohydrate transport and metabolism # Function: Chitinase # Organism: Bacillus halodurans # 130 370 25 338 450 77 26.0 4e-14 MDKRKTGIFLLIFLLGLIAGFLLSSAVRGHAAADTSGSRISGRQENVPLITTGEDSALPA SPEQAAVSISAPQAILTSGPSYPEWDKNTTYTGGDQVVCGGKIYIARWWTLGELPGQADV WEAAPQTPPPADSQAAPKAGASSPVKVVGYYPDWKSYQPQKLQMDVLTHIAYAFAIPTPD SRLLPLENPETAVRLIEDAHKNQVKVLLAVGGWSYNGAELEPVFVSATSTSEKTRQLGNE ILAMCNEYGFDGVDMDWEHPRVDGPSKDQYQELILYLADALHAQGKLLTSAVVSGVSADG NIYYDAAAHSDAVLNAVDWIHVMAYDGGDGERHSSYDFAVNSAAYWCGTRKMPAGKVVLG VPFYGRPGWAGYGDILAADPDAGNKDHAMVSGMDVWYNGISTIEKKAAYARNNLGGIMIW ELTQDTDDSGKSLLSAIGRGIQ >gi|157101653|gb|DS480671.1| GENE 201 220929 - 225146 2841 1405 aa, chain + ## HITS:1 COG:ECs2362 KEGG:ns NR:ns ## COG: ECs2362 COG1201 # Protein_GI_number: 15831616 # Func_class: R General function prediction only # Function: Lhr-like helicases # 
Organism: Escherichia coli O157:H7 # 13 1347 15 1476 1538 548 29.0 1e-155 MEELMQEDTLKQFLPATESWFRKTFGEPTKIQREAWPAIADGKHVLVSAPTGTGKTLSAF LVFIDQLGGLAERGELKEELYVIYVSPLKSLAGDIRENLRKPLEGIGAEAPGESKAESRP GNPSIGQIQTAIRTGDTPQKERQRMVKHPPHILIITPESLYLMLTSKSGRQVLKTAKAII LDEIHALIDTKRGAHLMLSVARLDKLCGRRLQRIGLSATIEPLELAAAYLSPEPAVICAP SMDKQVDIQVIGTAPAVGRRKDPVWEELGMAVYRQCLSCKSVIAFSEGRRYVEKLAYYVN LLGGDGFARVHHGSLSKEQRDEVERDLREGRLRLLCATSSMELGIDVGDVEQVLQVGCPR TISSTMQRLGRAGHNPGRVSVMYMYPRTAPESVYCGMTAQVARRGGVERANPPRMCLDVL AQHLVSMASGESYSIDDVMDVLERSYPFYQVTRKDVEDVLAMLAGDYEHSREIPVRPRIL YDRIHGRVAGDVYSRMLATAAGGTIPDKGMYAAKTEDGVKLGELDEEFVYESRIGDRFML GSFGWRVVRQDKDSVIVTQAPAEGARLPFWKGETKGRDLRTSLAFGRILGELEKACREGE LEKALEKLGLDESASVHASGFITRQIQATEGLPDDRTIVVEHFKDSTGSHQVMVHALFGR RVNGPLSLVLRHMIRNTMGMDVGSVDEEDGFLLYPYGREQLPEGLLFQINPDQVRPVLEA ILPDTPLFGMTFRYNAARALMMGMKRNGRQPLWMQRLKSTEMLSSLMDQPDHPLIRETKR ECLEHQWDISGVMEILARIRSGFITVREIWLDVPSPMSLPFQWQAEAEEMYEYSPVTPKI RQTVQEDLKKALLKPTPEELSRGWSRRKMPENEEELHSLLMMEGDLEAGELDIPADWLEA LARQGLACYVEPGIWIAAEHEDEYMRALQQQDEEAAMHIVRRMLYYRGGQTAVDVRERYF LSEEMTEALLDKLCRCKKAVEDQGVYYHEKLYERAREGHIRILRSQTATQPASHYAALMA ARAVTHSTAEEQLREAVEKGCRKPCPVRFWENVYFARRVERYGASLLDRLLAQGDYFWRM SPEGTLCFCSYEDIDWEASAGAETAGLKGDELLMYQELKRRGASFLKFLTKIPKEGSAQE VLLSLAEKGLVCADSFVPVRQWQNQDKVKKVAVRQRVNTRVMALSAGRWDIVRPVRKYGP EEWLEQFFDENMILCRETFRKSVQAVSDSPYVQTGEDGKKWEWSWAQALEILRIWEYTGR IRRGYFVEGMSGAQFVRGEDYDAVTGALREPDDHTVWLNGTDPALVWGKVLELPEKCGFL AVAGTAVALKAGKLAAVMERQGRVLRICDAEDMKEVMAEFVRAFKQGAIYPEQKRLVLKE YPAEAAEALQHAGYMREMKDYVIYK >gi|157101653|gb|DS480671.1| GENE 202 227015 - 227710 566 231 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936221|ref|ZP_02083594.1| ## NR: gi|160936221|ref|ZP_02083594.1| hypothetical protein CLOBOL_01117 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01117 [Clostridium bolteae ATCC BAA-613] # 1 231 1 231 231 454 100.0 1e-126 MIEIIYRDKRFLVKGSFSIGIAGNYVNEDFGDENIMINDTLEEIMKELQDEDSFWYKPLF PYLKSETADSGGIARGLTAYYNQKEKEIRENEKQINDCILYRLFSDLTGSGYPFWEIEQA 
VIPGRMKNGGGEFREKEVYSKETAEVFQWADEFDCVPNNGTVDKTDVEERLRELFPMFNF EGLVKTMIPEGLSLQGRFMAFQFSDGWGSDLLECAYDEMDEEFAFRDWHNH >gi|157101653|gb|DS480671.1| GENE 203 228080 - 228553 234 157 aa, chain + ## HITS:1 COG:no KEGG:Rumal_2970 NR:ns ## KEGG: Rumal_2970 # Name: not_defined # Def: GCN5-like N-acetyltransferase # Organism: R.albus # Pathway: not_defined # 1 154 1 154 157 197 59.0 1e-49 MDYKIRTISEGEESLLQDFLYQAVFVPEGVPAPPKSIISQPELQIYITGFGTKKDDIGLV AEVNKKAVGAVWVRIMNDYGHIDNATPSIAVSLYKDYRGLGIGTALMKEMLRILKDRGYK QASLSVQKANYAVNMYQKTGFEIVDEKGEEYSMLCRL >gi|157101653|gb|DS480671.1| GENE 204 228676 - 229992 1281 438 aa, chain + ## HITS:1 COG:CAC0326 KEGG:ns NR:ns ## COG: CAC0326 COG2256 # Protein_GI_number: 15893618 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Clostridium acetobutylicum # 3 435 5 436 443 567 61.0 1e-161 MEQMSLFDDREVYNPLASRLRPDDLDGFVGQEHLLGKGKLLRQLIEQDKIPSMIFWGPPG VGKTTLAGIIAKRTNAQFINFSAVTSGIKEIKEVMVQAENSRRMGIKTLLFVDEIHRFNK AQQDAFLPYVEKGSIILIGATTENPSFEINSALLSRCRVFVLQALTENDLARLIKTALKS PKGLGYLNVEITDPMIDMIAGFANGDARTALNTLEMAVTNGVISPDKTTVTEDVLKQCIG KKSLLYDKKGEEHYNLISALHKSMRNSDPDAAVYWLARMLEAGEDPLYVARRLVRFASED IGMADSQALTLAVSAYQACHFLGMPECNVHLSHTVIYLSMAPKSNSAYMAYESARTDAQN MLAEPVPLTIRNAPTGLMKDLHYGDGYVYAHDTEEKIARMQCLPDSLAGREYYHPTDQGA EEPVKSRLEEIKKWKRGQ >gi|157101653|gb|DS480671.1| GENE 205 230056 - 231162 758 368 aa, chain - ## HITS:1 COG:PAB1637 KEGG:ns NR:ns ## COG: PAB1637 COG0006 # Protein_GI_number: 14521304 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Pyrococcus abyssi # 1 366 1 351 351 203 35.0 5e-52 MNIYDQRIAKARNRMEDAGISLLVITPSSDLMYLTGYSKPATDRLTALLLTEHDSYFLCP AVEKNYFKEVDCQAAPILWEDSGDPFKKMYSVLSAAPGISRKSCIKAAVAGRMQSSCLIR LMELYPGISWNNADSVLVPLRKSKTPEEITIIRRAQDMAERAFARLLENGLEGKTERQLS EQLMKLRLEEGFDAVGPGLIACGPGSASPHPILSDNKVQAGDTVMFDFGGTYKGYHADMT RTCAVGYASDEFKEVYSIVLEAHLAVLKAAAPGTACRDMDLAGRSIIERAGYGAYFTHRL 
GHGIGLDIHEPPFASASEEGVLETGNVISNEPGIYLPGQFGIRIEDLIVITEKGCESLNT MTKELMIV >gi|157101653|gb|DS480671.1| GENE 206 231386 - 232360 1093 324 aa, chain + ## HITS:1 COG:MA4355 KEGG:ns NR:ns ## COG: MA4355 COG0407 # Protein_GI_number: 20093142 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III decarboxylase # Organism: Methanosarcina acetivorans str.C2A # 4 319 5 367 370 70 22.0 4e-12 METMTKRERIRNVLEGKPVDRTPVGFWLHFPEEMHHGEKAIEAHLDFMKATDTDILKIMN ENILYDGNTKIERLGDISKFRGFSRKDAIFQDQMEIIKRIADKAKGEYPIMSTIHGLIAS AFHETGFAGNYTSMGYQLTLFCRERPAEMKKVFRTVAETLMEFSDCSLEAGAEGIFYAAL GGERNFFTDEEFEEFVAPYEKMVYDHIKKTTDFNVLHICKSNIDFNRYKDLHPAIVNWGI YGNDFSLTKGAGLFSDSIILGGFPDRHGVLVTGTEEEIFKHTSEVLEEMKGRPFIVGSDC TLPTEIAYDRIRCVVDSVEKLEQR >gi|157101653|gb|DS480671.1| GENE 207 232379 - 233041 764 220 aa, chain - ## HITS:1 COG:SMb20773 KEGG:ns NR:ns ## COG: SMb20773 COG1802 # Protein_GI_number: 16265213 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 7 162 20 175 228 68 28.0 1e-11 MDNQSPKLSLKDTVYQQLIELICQGKLLPDTLFTENHMISYFGVSKSPVREALIQLCHEN VLKSIPRCGYQVTAISSKNVRDVTELRLYLELSSLPKVMENITTADIEELKRQNQVRLVN PEKKDLWIAWRNNYQFHMTVVRLAGNEQVNNAMEQAMTTYRRAYAQLYTLKKDVIAANKA SYHDFFVSSLERHDVFSAHEYLKKDILFMEDELLGTGITT >gi|157101653|gb|DS480671.1| GENE 208 233165 - 234124 279 319 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 68 319 43 315 320 112 28 2e-23 MERYRYIAKRLLTAIPTFLGITILVYVVASLAPSSPLEILFNDPYATAEEMARKSKELGL DQPVIIQYIRWLVQLVQGNMGMSIRTNGQVLDMILERIGPTLILTGSAMALTIIIALPLG IMSAYKPYSGWDYISSGLSFIGSAMPNFFAGLVLIYFFCIKVKLFPTSGMYDSSGVRTWP MFFHHLVLPAVVLAIQQIGSLIRQCRGSMLEVLQDDYVRTARAKGLHEPSVLVRHALRNA WIPLVSWFGMQIPFLIGGAVVTEQIFGWPGLGSLMVQSINARDYPVIMGITVVIAIVVLL GNLLVDLMYGLLDPRIRYD >gi|157101653|gb|DS480671.1| GENE 209 234144 - 235031 1058 295 aa, chain + ## HITS:1 COG:BS_appC KEGG:ns NR:ns ## COG: BS_appC COG1173 # Protein_GI_number: 16078205 # Func_class: E Amino 
acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 21 293 31 302 303 227 44.0 3e-59 MKQDERLKDIPANSYFHDVLQRFCAHKPAMVSAFVLTLEIILVIALPYILHLDPYTSDYE VMFGSAPSALHWLGTDDIGRDIFARFIYGGRVSLSVGIISAVISMAVGVPLGLLAGYYRG IAETLIMRCADVFMSFPAMVLILVLVSVLGPSVATVTIVIGVLGWPKFARLIYGNVLSAR EQDYVEAARAIGTKDGTILTKYILPNTFAPVLISFTFRAAQAIITESALSFLGMGVQAPQ ASWGNILYDAQSISVLSQRPWLWFPPGLALLITVLCINFLGDGARDALDTKMVIK >gi|157101653|gb|DS480671.1| GENE 210 235094 - 236785 1962 563 aa, chain + ## HITS:1 COG:CAC3179 KEGG:ns NR:ns ## COG: CAC3179 COG0747 # Protein_GI_number: 15896427 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Clostridium acetobutylicum # 1 551 1 553 567 141 23.0 3e-33 MRKRLTKMTALALCLMMGLSGCSAPSAPSDTKAPAATGTAAEGGDAKAEGSIDGNAQENT ASQGQKIVTMAMTAPWDTLIPFNTTNANTDAVLELIFDKLIVVKADGSFKPRLADSWSMD DETHTKITFNLNKNAKWHDGQPVTADDVVYTAELMSNPDISVARRSKVAPFAGTEDGIRV EGEEFGVKAIDANTVEFTLKDPTNLDYMLDIMFRDFYILPKHCLEDIPVKDMLAADFWKA PIGSGPCIYESEISGERVEFTANKDYHINAPEFDRFVVKIVPASNLLSGLMNGEIDIVAG AGIGNIPLNDWDMAGQQENLTTEAVESLGYQYLSCNTKNIPQEVRQAVNMAINRDALVNN LMKGEGVPAPGPLAPSHRYFNEELLPIPYDVDKAKQMVEESGFDTSKTYRLNVSKGNEVR EKSAPMIQQDLAKIGINVEIITTDHPTLLSTARKGEYDFALIGSGGSPDPGESVMNVTPG HLNNFSQNEDPSLGEVGRKGLNQFSFEDRKPYYDEYQMMLREQCPFVWLYFQKDLVAYNK KISNVCFEDYSLLNRSVWEWHVE >gi|157101653|gb|DS480671.1| GENE 211 236846 - 237835 602 329 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 2 322 8 328 329 236 38 8e-61 MAQPLLEVKHLKTQFRTKRGTVTAVDDVSFSVNQGDTLGIVGESGCGKSVTSLSILQLLP HAVGSVAGGEILYKGEDIAKKGNKEMCRIRGREISMIFQDSMTSLNPVLTIGRQLEETIS VHSSLSKDEIKKQALDILTKVGVSSPEQRLKEFPHQLSGGMRQRVMIAMALSCSPSLLIA DEPTTALDVTIQAQIIDLMVELKKNINASIILITHDMGVVAEMADYIMVMYAGKVMEYGS ARQIFKEAMHPYTQGLLASIPRLDKDLDRLFSIDGTVPSLDRMPKGCRFCARCLQAMDKC 
REHQPPVYITDDGHKVSCWKYENLKEEKV >gi|157101653|gb|DS480671.1| GENE 212 237826 - 238809 877 327 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 327 3 329 329 342 51 1e-92 KGMSGNETILEVRNLKKYFFSGKKRRGREPRAVKALDGVSFSMEYGETLGVVGESGCGKS TLGRTILRLHEPTEGQVVYRGKNLAEYGREEMRQMRRELQIIFQDPYSSLNPRMTVKELI CAPLDVFAMGSKEEKEARVIEIMDKVGLPLEFMNRYPHEFSGGQRQRIVIARALVLNPRF VVCDEPVSALDVSVRAQVLNLMMDLQKELDLTYLFISHDLSVVRYICDRIMVMYLGNVME IADKKALYEDAKHPYTQALLSAIPIPDMDIEYNKIHLEGDVPSPFNPPQGCRFHTRCRYA TEKCRVQVPELKDVGDRHFVACHLYDE >gi|157101653|gb|DS480671.1| GENE 213 238876 - 240276 1193 466 aa, chain + ## HITS:1 COG:BS_ytjP KEGG:ns NR:ns ## COG: BS_ytjP COG0624 # Protein_GI_number: 16080050 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Bacillus subtilis # 7 460 7 461 463 224 31.0 3e-58 MDVKEWVLSQRENMLDTAMELMKIRSVSEPGTREHPYGEGCAMVLDKALEIGKQMGFETE NHDYRCGSILMPGRTGQEIGIFVHLDVVHEGNGWTTEPYKPVIKDGWLYGRGSADNKGPA AAALYSMKYLKEQEVPLEHTIRLYLGCSEERGMEDIEYYTSHYPAPEFSFTPDASFPVCY GEKGILEGEFSCAIPEGHVTGFSAGVASNAVPAEARVWLSRTSCDAVRDYVDSLDNPQDF TVEGGSQIVIKAAGKTAHAAFPEGSDSAAVKLAHMLAKAPFLTEEEQACFRFLDQGFADY YGEGMGIAFEDGLSGRLTLVGGMARTERGRFIQNFNIRYPVTAGAEALVKQMSAQAGKYG WILDWSRDNPPCVIDPESPVVKELTALCRQVLGTETKAYSMGGGTYARKLPNAVAFGPGI RGQKKPCPPGHGGGHQPDECVKIENLTNAMVIYIEALKRLDVLIGE >gi|157101653|gb|DS480671.1| GENE 214 240435 - 241223 1006 262 aa, chain + ## HITS:1 COG:slr1258 KEGG:ns NR:ns ## COG: slr1258 COG2859 # Protein_GI_number: 16330444 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 53 260 47 249 251 94 31.0 2e-19 MEENKEKKEKTGFMAQGRSVIIAVIAALSAIICVSIFTGGLVDYKKSGGGSGITATGSAS CDFESDLIVWRGSFSVHGDTPRDAYAIIKKDAELVRQYLEENQVADDEMIFSSVNISQTY TSRYDEEGKYLGDEPDGYDLTQSLTVSSYDIDKVENISRDITKLIESGVEFESELPEYYY TKLDEVKLDLIEKATANAKERIDIMSAGSGAKAGKLLSATLGVFQITAKNSGSESYSYDG 
YLDTSSRYKTANITVRLNYAAE >gi|157101653|gb|DS480671.1| GENE 215 241330 - 242691 1605 453 aa, chain - ## HITS:1 COG:MA2050 KEGG:ns NR:ns ## COG: MA2050 COG0534 # Protein_GI_number: 20090897 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 7 441 8 443 468 212 34.0 1e-54 MSQENKMGVMPINKLLISMSLPMIISMLVQALYNIVDSVFVSMINQAALTAVSMAFPIQN LLIAVSAGTCVGVNALLSRSLGEKNSKAANLAAANGLFLAFLSFLAFAVFGLFGSRLFFE SQTNNPVIIEYGVQYLQIVCIFCFGLFGEMMFERILQSTGQTFYCMITQGTGAIINIILD PILIFGLFGVPKMGIQGAAAATVFGQIVAMFLALILNHSKNKEVRINFRSFTPNWRTISI IYQVGVPSIIMQSISSVMTFGMNKILIGFSETAVAVFGVYFKLQSFIFMPIFGLNNGMIP IVAYNYGARSKKRIMETVWLSIGIAVGIMLIGLMVFHLMTPQLLMLFQADEEMLAIGVPA LRIISLSFLFAGYCIIVGSVFQALGNGVYSLIISVARQLVCILPLAWFFARTFGLHAVWL SLPLAEITSVVLSSILFRKIYLDKIKPLGQAAA >gi|157101653|gb|DS480671.1| GENE 216 242865 - 244319 1425 484 aa, chain + ## HITS:1 COG:MK0550 KEGG:ns NR:ns ## COG: MK0550 COG0069 # Protein_GI_number: 20093988 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Methanopyrus kandleri AV19 # 94 484 27 422 429 315 43.0 2e-85 MAVYRCRFCGSIYDEEKEGVPVSDLKVCPVCMVTADKLVKAEDGGAGAKAPDREKEKPVK KDGTRETCQDCGGSTGEAAAYDPEYARTQTDARYMDEIHEMAVKGQSIIAAMGTQMPMPG WDDILFLGAQLNPMPLDEHAPVKTETIIGKHAAKPMVLDHPVYISHMSFGALSRETKTAL SRGSAMARTAMCSGEGGILPEEKAAAYKYIFEYVPNQYSVTDENLREADAIEIKIGQGTK PGMGGHLPGGKVTPEIAAIRNKPLGQDVISPSRFPGIDTKEDLKALVDRLRLVSGGRPIG IKIAAGRIEKDLEFCVYAGPDFITIDGRGGATGASPKIIRDSTSVPTIYALYRARKYLDQ AGCGAQLVITGGLRVSSDFAKALAMGADAVAIASAALMAAACQQYRICGTGMCPVGVATQ DEKLRSRFKEDAAAQRVANFLRVSLEEIKTFARITGHDNIHDMDADDLCTVSREISEFTN IRHA >gi|157101653|gb|DS480671.1| GENE 217 244326 - 244718 406 130 aa, chain - ## HITS:1 COG:no KEGG:Closa_1072 NR:ns ## KEGG: Closa_1072 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 121 41 162 197 120 49.0 1e-26 MSAISLYVFNQLITGYNPDVSAVFHKVSIVEMHHLEIFGKLAQQLGEEPRLWTQHGCKKI 
YWSPSYNQYPGELRTLMRNVIDGEKAAISKYQHQISYIRDENITENLRRIILDERLHVEI FEKVYGAYCS >gi|157101653|gb|DS480671.1| GENE 218 244910 - 245077 65 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936239|ref|ZP_02083612.1| ## NR: gi|160936239|ref|ZP_02083612.1| hypothetical protein CLOBOL_01135 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01135 [Clostridium bolteae ATCC BAA-613] # 1 55 1 55 55 95 100.0 1e-18 MPGPAFHPLYGMTADSFNCYNKYSDMDMFIVKLSKIHPGKIFALDFDSVMYYFSK >gi|157101653|gb|DS480671.1| GENE 219 245502 - 246833 1511 443 aa, chain + ## HITS:1 COG:DR1055 KEGG:ns NR:ns ## COG: DR1055 COG0017 # Protein_GI_number: 15806075 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Deinococcus radiodurans # 22 443 15 435 435 374 46.0 1e-103 MEFVNGVKEKRVLDIREVLEGEYEGKEIRMNGAVHTIRHMGEVAFVILRKSRGLVQCVYE AGITDFDIRELKEESAVEVMGVVKAEERAPQGFEIRLKEIRVLSRPAEPLPLAVSKWKLN TSLETKLSLRPISLRNVRERAKFKIQEGIVRGFRDYLLSRDFTEIRTPKIVARGAEGGSN VFKLEYFNKKAELGQSPQFYKQTMVGVYDRVFEAAPVFRAEKHNTTRHLNEYTSLDFEMG YIDSFRDVMDMETGMLQYVMKLLEQDYKKELDMLGVTLPEVGRIPAVRFDQAKELVSRKY DRKIRNPYDLEPEEELLIGRYFQEEYGSDFVFVTHYPSKKRPFYAMDDPADPRFTLSFDL LFRGLEVTTGGQRIHDYREITAKMEKRGMDPEDIASYLMIFKYGMPPHGGLGIGLERLTM RLLDEQNVREASLFPRDVTRLEP >gi|157101653|gb|DS480671.1| GENE 220 246867 - 247160 396 97 aa, chain + ## HITS:1 COG:lin1868 KEGG:ns NR:ns ## COG: lin1868 COG0721 # Protein_GI_number: 16800934 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit # Organism: Listeria innocua # 5 93 4 92 97 78 49.0 3e-15 MANIISDETIEYVGILAKLELSPEEKEEAKKDMGRMLDYIDKLNELDTQGVEPMSHVFPV HNVFREDVVTGTDQRDQILSNAPQQKDGAFKVPKTVG >gi|157101653|gb|DS480671.1| GENE 221 247174 - 248694 480 506 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 [Phaeobacter gallaeciensis BS107] # 7 474 8 452 468 189 29 1e-46 MSISDMTALELGRRIQSGDITAVQAAEASLARIKAMEPSVHAFVTVNEEKTMEQAGKVQA 
DIEAGRLKGPLAGVPVAIKDNMCTKGMRTTCSSRILGNFIPTYTAQAVSNLEQAGAVILG KTNMDEFAMGSTTETSAFGVTRNPWNLEHVPGGSSGGSCAAVAAGECFYALGSDTGGSIR QPSSFCGVTGIKPTYGTVSRYGLIAYGSSLDQIGPVAKDVSDCAAVLEVLASHDPKDSTS MERRDCDFTSALSEDVRGMRIGIPESYFGQGLDQEVKDAVLEAARVLGEKGAIVETFDLK LAEYAIPAYYVIASAEASSNLSRFDGVKYGYRAPEYEGLHSMYKKSRSLGFGPEVKRRIM LGSFVLSSGYYDAYYLKALRTKALIKKEFDRAFASYDVILAPAAPSTAPRLGQSLGDPLK MYLGDIYTISVNLAGLPGISLPCGLDSKGLPIGLQLIGDCFKEKNIIRAAYAYEKTREWK LSLLAAGKAKEPSGALTGRAERRSHE >gi|157101653|gb|DS480671.1| GENE 222 248687 - 250135 1640 482 aa, chain + ## HITS:1 COG:CAC2669 KEGG:ns NR:ns ## COG: CAC2669 COG0064 # Protein_GI_number: 15895927 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) # Organism: Clostridium acetobutylicum # 4 473 2 469 476 505 50.0 1e-143 MSKQYETVIGLEVHVELATKTKIFCGCSTRFGGAPNTHTCPVCTGMPGSLPVLNKQVVEY ALAVGLAANCRINQYCKFDRKNYFYPDNPQNYQISQLYLPVCRDGWVDIETESGLKKRVG IHEIHMEEDAGKLIHDEWEDCSLVDYNRSGVPLIEIVSEPDMRSAEEVIAYLEKLRLIIQ YLGASDCKLQEGSMRADVNLSVREAGAPEFGTRTEMKNLNSFKAIARAIEGERRRQIELL EEGRQVIQETRRWDDNKESSKAMRSKEDAQDYRYFPDPDLVPVVISDEWIARVKARQPEL RTEKIARYKEEYGLPEYDARLLTSSKHMADIFEETTRLSGRPKEASNWLMVEAMRLMKEQ EMEPEDMEFSPSHLAALIRMVENNIINRTVAKTVFEEIFRHNADPEAYVEENGLKVVKDE GALRDTVKAVMAANPQSVEDYRNGREKAMGFLVGQTMKAMKGKADPSMVNQLVRELLAGG DV >gi|157101653|gb|DS480671.1| GENE 223 250323 - 251612 1300 429 aa, chain + ## HITS:1 COG:ML2336 KEGG:ns NR:ns ## COG: ML2336 COG1167 # Protein_GI_number: 15828259 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Mycobacterium leprae # 4 425 40 459 463 387 45.0 1e-107 MKPYAEMTKEELLSLRQDLKAQYHEFQAKDLKLDMSRGKPCVEQLDLSMGMMDVLSSTSD LTCEDGTDCRNYGVLTGIDEAKVLLADMMEVNPSTIIIYGNSSLNVMYDTVSRAMTHGIM GCTPWCKLDKVKFLCPVPGYDRHFAITEYFGIEMINVPMSEDGPDMDMVEELVASDDSIK 
GIWCVPKYSNPQGYSYSDETVRRFARLKPAASDFRIFWDNAYGVHHLYDHDQDHLIEILA ECKRAGNPDLVYKFASTSKISFPGSGIAAIATSHNNLEDIKAQLKNQTIGHDKVNQLRHA RFFKDIHGMVEHMRKHADIIRPKFEAVEEILERELEGLGIGRWTKPRGGYFISFDSLEGC AKDIVARCKKAGVVMTSAGATYPYGKDPHDSNIRIAPTYPTLGDLITAAELFALCVKLSS VEKLLKTMA >gi|157101653|gb|DS480671.1| GENE 224 251802 - 252374 463 190 aa, chain + ## HITS:1 COG:no KEGG:Closa_1315 NR:ns ## KEGG: Closa_1315 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 18 172 10 163 187 64 30.0 2e-09 MQGRNVRFQGKYDLFTIYLIPGVTIFLASFDSWTGTNLSVLGNRTGNKLLFAVWGFATGI YYCVYVRYLFHIGRYRNPGGRTLMYTAAAFLLMAVMIPYMPEKYPLKADLHVLLAFFSPV LLAFSIIGFLRFLSSRDRMRFRRAWGILWMMAACSVLFLLEAGFITSFLEIFIITGLCGY LRYMEQLLAA >gi|157101653|gb|DS480671.1| GENE 225 252476 - 253663 1406 395 aa, chain - ## HITS:1 COG:SP2026_2 KEGG:ns NR:ns ## COG: SP2026_2 COG1454 # Protein_GI_number: 15901847 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Streptococcus pneumoniae TIGR4 # 5 389 8 409 419 315 42.0 1e-85 MRFTLPRDLYHGKDALEALKTLAGKKAIVVVGGGSMKRFGFLDKVVDYLKEAGMEVQLFE GVEPDPSVDTVLKGAAAMQEFEPDWIVAVGGGSPIDAAKAMWAFYEYPDTSFEDLITPFS FPTLRTKAKFCAISSTSGTATEVTAFSVITDYKKGIKYPLADFNITPDVAIVDPALAETM PKKLTAHTGMDAMTHAIEAYVSTLHCEYTDPLALHAIEMINEYLIPSYNGDMEARDKMHD AQCLAGMAFSNALLGIVHSMAHKTGAAYSGGHIVHGCANAMYLPKVIKFNSKEPEAAARY AKIAKFINLPGTTDEELTDALIARLLEMNKALDIPACIKDYEGGIIDEKEFMDKLPKVAE LAVGDACTGSNPRTITPEVMEQLLKCCYYDEEVDF >gi|157101653|gb|DS480671.1| GENE 226 253897 - 255099 1333 400 aa, chain + ## HITS:1 COG:BH3155 KEGG:ns NR:ns ## COG: BH3155 COG0330 # Protein_GI_number: 15615717 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Bacillus halodurans # 46 339 14 301 319 161 33.0 2e-39 MNFFMGKKFVGPGGGKKDQGPEVVFDNGEESNKRMKRMLRPIKIAIAVAVLAIAAFDSFY TLSENEMAVLTTFGRPSSVTTSGPKFKVPFIQKVHKMSKEIKGMPIGYDPDYNAQNHADS 
ENNPITVSSESEMITKDFNFVNVDFYIEYQIVDPIKAYIHSDTAIPILKNLAQSYIRDTV GSYSVDEVITTGKSEIQAKVKALLSERLEQEDIGLGINNVTIQDAQPPTDAVNNAFKAVE DAKQGMDTKINEARKYQSERLPAANAEADKAARDAEAYRQQRISEAEGQVSRFNDMYQEY AKYPLITKKRMFYETMEDILPGLKVIINGSDGTQTMLPLDSFVSGTQSSTGSPAGSYNSQ ANQPDQTGRANQAGADSGSQSGSYDQEELYEGNDQNGGGQ >gi|157101653|gb|DS480671.1| GENE 227 255099 - 255980 1027 293 aa, chain + ## HITS:1 COG:BH3154 KEGG:ns NR:ns ## COG: BH3154 COG0330 # Protein_GI_number: 15615716 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Bacillus halodurans # 4 291 18 307 310 179 34.0 7e-45 MKTLTKFMRNMGIIVIVLLAVTIFNPLVVTKSNEYSLIIQFGKVVRVENSAGPSLRVPFL QSVQKIPKYKMISDLYPSDVTTKDKKVMTVDSFVIWDINDPVKYLASLNASKEKAEVRLG NVVYNSIKNVLSSTNQADIISGRDGNLAKTITENIGDAMDSYGIHIYAVETKKLDLPDSN KESVYQRMISERNNIAAQYTADGDYQSSLIKNETDKTVKETIAKANAEAEKIKAEGEARY MQILSDAYNDEAKADFYNYVRSLDALKASMKGDNKTVILNEDSELARILSGSW >gi|157101653|gb|DS480671.1| GENE 228 256107 - 257033 707 308 aa, chain - ## HITS:1 COG:SP0676 KEGG:ns NR:ns ## COG: SP0676 COG0583 # Protein_GI_number: 15900577 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 1 246 21 263 322 98 27.0 2e-20 MNLSYLKYAVEVEKTGSITKAAQNFYMNQPHLSKIIRELERDLGCDIFDRTSRGMVPTKK GEEFLRYAKAILVQEEQIEALCVKNSEKALEVNLCVPRATYISYAFSDFLKHMDSCPSVN VHYMETNSRDTIRGVSGKTFDVGIIRCQALYESYYMKLLQDENLEWKELWQFSCNVLMSP SHPLASRDKLTYLDFTDYIEIVQGDIQNPSFTFEAQDASATGGNSKKTVSIYDRGSQFEL LRQLPGSYMWTSPVPYGCLTSSGLIQKTCSYPGNIHKDLLIYRNGYRFSKEEEEFLRCLS RMVSRLSD >gi|157101653|gb|DS480671.1| GENE 229 257226 - 257690 533 154 aa, chain + ## HITS:1 COG:RSc0761 KEGG:ns NR:ns ## COG: RSc0761 COG1522 # Protein_GI_number: 17545480 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 1 140 3 145 176 93 31.0 2e-19 MDKLDYKILNILKENGRETASDISKAIHLSVSAVLDRIKKMEESGIIKGYTVRVDAKALG NDVTALMEISLEHPRFHDSFTEVIMDHPNILDCYYLTGDYDFVLKISCASSDELERIHRT 
IKSIPGVSATRTHFVLKEIKNMYSAIPEEEKKRF >gi|157101653|gb|DS480671.1| GENE 230 257735 - 258577 923 280 aa, chain + ## HITS:1 COG:CAC0792 KEGG:ns NR:ns ## COG: CAC0792 COG0115 # Protein_GI_number: 15894079 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Clostridium acetobutylicum # 1 278 1 279 280 412 66.0 1e-115 MKTLGYYNGKYDELENMSVPMNDRVCWFGDGVYDAGPCRNYKIFAIDEHVDRFFNSAGLI KMQVPCTKAELKALLQDLVNKMDTGDLFIYFQVTRGTAVRTHEFPENVPANLWVMLKPMG TPDLYKKIRLLTQEDTRFLHCNIKTLNLLPSVMAAQKAKEAGCQEAVFHRGDRVTECAHS NCHILKDGILYTAPADNLILPGIARAHLIRICKKLEIPVSETPYTLKEMMDADEVLVTSS TKLCVAANEIDGTPVGGRAPETLKNILDAVMDEYMEETKA >gi|157101653|gb|DS480671.1| GENE 231 258672 - 259634 1036 320 aa, chain + ## HITS:1 COG:lin2785 KEGG:ns NR:ns ## COG: lin2785 COG1477 # Protein_GI_number: 16801846 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Listeria innocua # 16 313 41 341 360 217 38.0 2e-56 MCTAWLAGCSKKVADPVTRSSFLLNTFVTVTLYDSEDQTILDGCMTLCSDYEKLLSRTLE GSEIYKLNHRRPGERTITVSEKTAQVLSEGLEYCRMSRGAFDITIEPLSSLWDFTGKTPH VPPEEEIEADLKKVGYENVLLDGRQVTFLNDETTIDLGAIAKGFIADQMKAYLEENGVKS AVINLGGNVLCVGKRPDGAPFKIGLQRPYATHTETVAALKIDDMSVVSSGVYERHFVENG VNYHHILDPATGYPYENGLTQVSIISPRSVDGDGLSTTCFALGLEEGTRLIESMDQIYGI FLTEDGELHYTRGAEDFLDR >gi|157101653|gb|DS480671.1| GENE 232 259735 - 260163 491 142 aa, chain + ## HITS:1 COG:no KEGG:Closa_0818 NR:ns ## KEGG: Closa_0818 # Name: not_defined # Def: GtrA family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 142 1 138 141 87 40.0 2e-16 MLRKLWEKCVNYETVSYIICGFLTTAVDFAVYALLRQIEIGVGLSQAMSWLAAVLFAYVV NKLIVFRNYNLRPSYLVKEVGAFVAARAFSGAVTWVLMVVMVKLGGDRGMLYELFCKFTS SVINMVLNYIFSKLWIFKNKPE >gi|157101653|gb|DS480671.1| GENE 233 260179 - 263325 2787 1048 aa, chain + ## HITS:1 COG:L48341 KEGG:ns NR:ns ## COG: L48341 COG4485 # Protein_GI_number: 15672817 # 
Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 203 1021 10 878 883 244 25.0 1e-63 MEREEIERVERELDAILKLVQPEPTMQDKAREEALSKEPVSGGDKTTSEEYLTEEGTLEA YLFGDDSSEENASEYDGADKYAADKYAANKYAADKYAVDKYAVDKYAADKYTADKYAADE YAADEYAADKYAADKYAADKYAADNYIADRYTADEYTSEEDEYGMEEENETAGQDRQQYD KRETGMSQTENRRGLRLLKPSDGLVAAFFVPVVAMIIIFAQRGIFPFGEECFLRTDMYHQ YAPFFSEFQYKLTHGGSLLYSWDIGMGVNFSALYAYYLASPVNWLLILCPKKFIIEFMTI LIVLKTGLSGLSFSWYLKKHFGVKKFGVGFFGIFYALSGYMAAYSWNIMWLDCIMLFPLI LLGLEQLVKEKKGTLYCITLGLSILSNYYISIMICIFMVIYVIAQMTLNPPKSFGEFMGT GLRFAFYSLLAGGLAAVVLLPEIYALQVTASGDFDFPKTFSSYFSIFDMIARHIGNVGTE IGLDHWPNIYCGVAVLMFFLLYLGSRKITFREKAVYCSMLVLFYASFSVNVLNFIWHGFH YPNSLPCRQSFIYIALMLFMCYHAYIHLEDMSWKSVCLAFWGAVSFVLLAEKLVDNPEQY HFSVFYAALLFIAIYGGLIYLYHKGKWSLDAILLTALAVVAIEAAVNTAVTSIPTTSRTS YVKDNQDVEDLVWNIKSDTFYRVEKTDRKTKNDGAWMNFPSVSLFSSTANASLSDFFRRM GCESSTNAYSITGSTPLVDSLMSVRYGIYGDQQPADGLRDLSARKGSMWLYENKFTLPVA FMLPSDVEGNWILDSGNPAHVQNDLCDVLDTEHVLLPNESVTEGRKLTFTAQETGDYYVY VTNKKVKEVTAVIGEQTESFDNVDRGYFMELGRINKDVEVRLETGDDGSPTLQAEVWRFN PQALEAVYGKMNQNPMVLSRWTDTELSGSITAESAGVMYTSIPYDKGWTILVDGKAVTPR KMFDTFLAVDIGEGTHRISFSYEPEGLRTGAWITGASAAVLGVTVLVGISRKKKKDRVVS GYSIRKNSKENKKSQKKENSSKQESGGK >gi|157101653|gb|DS480671.1| GENE 234 263463 - 264854 1627 463 aa, chain + ## HITS:1 COG:BH3186 KEGG:ns NR:ns ## COG: BH3186 COG0165 # Protein_GI_number: 15615748 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate lyase # Organism: Bacillus halodurans # 2 457 3 456 458 506 54.0 1e-143 MKLWGGRFTKETNQLVHNFNESLSFDQKFYRQDIRGSIAHVKMLAKQGILTEQDRDDIIR GLESILEDLDNGKLAISSDSGYEDIHSFVEGTLTERIGEAGKRLHTGRSRNDQVALDMKL YTRDEIDAIEALVKELLEELLAVMEKNTDTFMPGFTHLQKAQPITLAHHMGAYFEMFSRD RSRLKDIRRRMNYCPLGSGALAGTTYPLDREYTAELLGFDGPTLNSMDSVADRDYLIELL SALSTISMHLSRFCEEIIIWNTNEYRFVEIDDSYSTGSSIMPQKKNPDIAELIRGKSGRV YGALVSMLTTMKGIPLAYNKDMQEDKELTFDAIDTVKGCLALFTGMISSMSFRKEVMEAS AKNGFTNATDAADYLVNKGVPFRDAHGIVGQLVLACIAKGIALDDLSLEEYKAISPVFEE DVYEAISMKTCVEKRLTIGAPGQEAMKKVIGIYKKQLDMDDVR 
>gi|157101653|gb|DS480671.1| GENE 235 264964 - 265656 547 230 aa, chain + ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 5 220 6 217 229 75 26.0 1e-13 MEEWLPLVKKSPLFYGINDDELYTLLKCCASEQVLYEDGEYIFRMGESVNKVLVLLRGKA CVIQENFWGHQEHIYHIKEGELFGQSYSCARTPVLPVSVVAEGGCETLFLNYQRMITFCP LACDFHTRFIHNMLRLVSEHNVKLENKLEHVCRRTTREKLLSYLSEQAMAKGGRDFDIPH NRQELAEYLCVDRSAMSSELSKMRAEGILDFQKNHFVLHDRGAGDRREGA >gi|157101653|gb|DS480671.1| GENE 236 265762 - 266775 768 337 aa, chain - ## HITS:1 COG:no KEGG:Closa_3658 NR:ns ## KEGG: Closa_3658 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 331 1 319 326 206 38.0 1e-51 MKKHIRPMVPALGALVLLCAAYGIITWHQGRNAESQALSPENASVYMTDLPELSSLSWTK DGKSLSFTREGGTWYYKGDTDCPIRQYPLTTLSDTLSHLKAERKLEGADSPEAYGLDNPS VRFDTVSSDGSSHSILVGSQVPGTGGSGPDGSQLPAQYYAAMDGDSQIYTIGSYLTETAA KGLYDFVETESLPHVAGADIREITVSRNGQTRRFCKKTVDDAGNIAWYKDSDADEANRLD DNGALNNLADAISGLSFQSCVSYKASDEELGSYGLSEPIMTLSWTYESGDRDSAVSLAIG NSTEDGTGYYTRKDDSRAVNLISKEAAERCLNAAYPK >gi|157101653|gb|DS480671.1| GENE 237 266865 - 268256 1343 463 aa, chain - ## HITS:1 COG:slr2105 KEGG:ns NR:ns ## COG: slr2105 COG3225 # Protein_GI_number: 16330592 # Func_class: N Cell motility # Function: ABC-type uncharacterized transport system involved in gliding motility, auxiliary component # Organism: Synechocystis # 15 278 81 329 595 74 24.0 4e-13 MTRKCKKGSYMALFTLVVIAAVIVLNLIVGKLSAKYLKHDLSSNKILTLGDTTRDILAQL DRDVTIHIVADPNSVDERITSFVNLYKDLSPHISVEANDPVLHPDVLASLDTEPGNLIVS CESTGKKTSISFDDIIQFDPMVYYQYGQIKETAFDGEGQITSAIRYVTSDTAATVYTLAG HGETALSSAVSDAMDKSGMNAADLNLLTQGAVPQDCSLLIMNAPVKDISADEKETLTDYL KNGGRMLLLAGCTQDTLSNLTDFISQYGMDLKNGFVADPAPQHYYNNNPFNVIPDYDFAS SLLSGIDSSAAALLVQPAGMTIQENPSEDIAVQPFLTTSDSAFLVDPATQEKTQGTYVLG 
AVATETVNEEEGTSSMLTVITAPSMIDDSILSRFPSITNLTLFMNAAASGLPGVTPLSIP SKSLDITYNMVTSAGLWSALFIIVIPVIFLIAGFVIWLKRRKL >gi|157101653|gb|DS480671.1| GENE 238 268262 - 269128 805 288 aa, chain - ## HITS:1 COG:PA4038 KEGG:ns NR:ns ## COG: PA4038 COG1277 # Protein_GI_number: 15599233 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Pseudomonas aeruginosa # 4 170 7 175 244 60 30.0 5e-09 MRAIYKRELKSYFCSMTGYVFIAFMTMFMGIYFMVYNMINGYPYFSYTLNSILMILMIAV PILTMRSMSDEHRARTDQLLLTSPVSVWGIVMGKFLSMVTILAVPMAIACLCPLIIRSAG TAYLTEDYASILAFFLLGCVYIAIGLFISSLTESQLIAAAGTFGILLLSILWPGLLNFIP AAASGSLVGFLILWTLVCFILYRLTGHIPLSLGLEAAGAAVLIGTFFVKQSIFERALTGL LGKIVLTDVFDGIVSSHMLDVSGLVYYLSVAGILLFLTVQSIEKRRWS >gi|157101653|gb|DS480671.1| GENE 239 269125 - 270222 1164 365 aa, chain - ## HITS:1 COG:sll0489 KEGG:ns NR:ns ## COG: sll0489 COG1131 # Protein_GI_number: 16331772 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Synechocystis # 1 236 1 236 342 256 54.0 4e-68 MIEVKNLVKDYGGHLAVDHLSFRVEDGQIYGFLGPNGAGKSTTMNIMTGYLGATEGQVII DGHDILEEPEAAKKRIGYLPEIPPLYMDMTVDEYLEFAACLKKLPKAGQKDAIQEVMELV KITDVRKRLIRNLSKGYRQRVGMAQAILGKPSVIILDEPTVGLDPKQIIEIRDMIKGLGK QHTVILSSHILSEVSAVCDHIMIIAHGKLIASDTPENLEQRLKGAAGLELTVKGSEETVM PVLEEIHGLSIERIGAGGEAGTCVVNLQFGGDEDSSVSGSGIGSGGGIAPGGGIAPGGIA SGAKTDIRETIFYALANHRLPILSMAPSKASLEEIFLELTEDSPKLREDRGVTDPTDPSK EEDSQ >gi|157101653|gb|DS480671.1| GENE 240 270495 - 272477 2249 660 aa, chain + ## HITS:1 COG:CAC1348 KEGG:ns NR:ns ## COG: CAC1348 COG0021 # Protein_GI_number: 15894627 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Clostridium acetobutylicum # 4 660 3 658 663 829 61.0 0 MNQIDTMSVNAIRVLAADAVQKAKSGHPGLPLGAAAMSFELWAKHMNHNPKNPGWENRDR FILSGGHGSTLLYSLLHLFGYGLTKEDMMNFRQMDSLTPGHPEYGHTVGVEATTGPLGAG MGMAVGMAMAESHLAAVFNKDGYPVVDHYTYVLGGDGCMMEGISSEAFSLAGTLGLGKLI 
VFYDSNRISIEGSTDIAFTENVQKRMEAFGFQLITVEDGNDLDAIGKAIEEAKADTARPS FITVKTQIGYGCPAKQGKASAHGEPLGDDNVKAMKEFLNWPSMEPFYVPDEVYANYKAYA ERGAETEEKWNALFAEYCGKYPEMKELWDKFYNPNLANDVYDSEDYWAFEDKPDATRSLS GKQLQKLKNLMPNLIGGAADLAPSTKTYMADMGDFSKDNYAGRNLHFGVRELAMSAIGNG LMLHGGLRAFVSTFFVFSDYTKPMARLSALMGVPLTFVFTHDSIGVGEDGPTHEPIEQLA MLRAMPNFHVYRPADATETAAAWYSAVSSKKTPTALVLTRQNLPQLAGSSKEALKGAYIL EDSAKEVPDAIIIATGSEVELAVGAKAELAKEGVDVRVVSMPSMDVFEEQPEDYKEKVLP KAVRKRVAVEALGDFGWGRYVGLDGTTVTMKGFGASAPAGQLFKKFGFTVENVVAAVKSL >gi|157101653|gb|DS480671.1| GENE 241 272690 - 276058 3549 1122 aa, chain + ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 449 715 3 266 280 169 39.0 2e-41 MKQIRKKHKFSLPITITVCIALAIVFYLSLSYFLEGYRRDIYDRIINSNISALKETTSTL ISKTTEGFDHCRHEIRVLGIGMSEELKENGFDSIDGLSNGDRSYLRDFNKVSVFDYCVLL NESGRGVYSQGDSVKPINLYSSQAYVDCLNSPDGEAISFISDPFSASGQDVVAFSCRAGS VILIGIYSQESFRALYDSTAFGDNASYMITTDSGLILSRTHVNRELGETLNLFTYFEENP LNAEFFKPDAQGGTEYERVLSDFRDDKGGSAEIYFGDSRYELVYAPIPRTGWDFISCVSY DHITADAAQINQETIQLTMFIIVLMLGMFLVIVIMLVFVVRANASREAVRRDRIFTLMTH YVPNVIVIADSESGAIEYASRNTEQVLGIEDAFGNAIEDKFLSCISGSDRQAVLDLISRV RKGDCASGSLKLHFTRPDIQKDIVLSLNAYLISEQDSSQRFITVTLEDITEREQSRRRLE EALESEARANAAKSTFLASMSHDIRTPLNAVIGLTNLALYSPEDSKKVTECLQKIANSSQ LLLGLINDVLDMSKIESGKMQLAETEFELGEWLSGVVTVTQSQTGVRSQHFDVNVWNITH ELLCGDTVRLGQVLTNVLGNAVKFTPKDGEIYLDITEVPSQDPSFASFVFRVSDTGVGMS REYMGRIFEMFSRDQNSYKQGIQGTGLGMAITKRIVDLMGGTIRVESEPEKGTTFTIGLS IRLSSRHVPLAAGCAMLVIGEPGSEEKCRDAVEKLEEIGVDAAWETDYQAAVSLAAQKRA AGRPFSMAFLPYKMLRFDRTLTGEQLKRDLGGDTLWLLGLKPDEQELFEEIKQKGFKYTI SLPLFRMSLYRKITGMLGQGESISGGLAGELKGVRLLVVEDNAVNLEIAVEMLSGLMGAA VDTARNGEEGCTRFLDAPEDTYDVILMDLQMPVLGGLDAARRIRESIHPRAASIPIIALT ANAFEEDRREALEAGMNGFVPKPIDFAVLAKEIGRVRREGRCLRLLLAEDNELNREIAAE LIRADGQQVDAVENGHQALEAYLNAPDGTYDAILLDLRMPVMDGFEAAAAIRSSQRKSAG SIPVFAFTSSSEDEVRSEAARAGFGAVLTKPINMEQLRGLLG >gi|157101653|gb|DS480671.1| GENE 242 
276228 - 278486 2352 752 aa, chain + ## HITS:1 COG:BH0026_2 KEGG:ns NR:ns ## COG: BH0026_2 COG0737 # Protein_GI_number: 15612589 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Bacillus halodurans # 480 669 284 452 471 75 34.0 3e-13 MKKRMVIGLFLVLVLISAATGCTAVWDTGATEDKKQSGAQPAEEPVVLHVYNTMGENINT EQAEAYIHRTMPGVKIESTYEIAPKMENAQLAAVRSGGGPEILYTQDYYTYVQSGYLKDL TSEPFLKNYMISALNDAEVGGKVYALPVGNGYISGLMVNRQLLADMGYDMPRTQEEFIGL CRRIQERREETGVRAYAFGMLYNDAAAVVAMPFLLDAYTDSEYVQWLNLYRNNPAGVSFE NPAFARVLKGIDELESMGLYENGDFLMSSKQNLREVVRGNAVMCSVSYVDYAVHFEEKLR DVNGVPSFKIEKDGKQVWVKAEDFVFIPYMGRTREDRWLATNGDWYLGINANITDETVLR ACRLYLEYVASTEFAPEYYSASVPVGATTYYRRDDTLNYDYFQNTHPEFYQCLTEHSIVK NPYQFYGSDLFTFALRHYLCGQNYYAGLEGASSYMKIGGTEGMLEALEEYRTTGKNRYQV PDHVVGKTDKTYNYVRIFSRSNESALGNLLADALRRYTGADLAAINAGTLTAELEEGEIT ESDLATSMLYGLNNHVVTVRCKGANLISILSTNNLTDAVRADQSSVFGGMVIPSGFTYTI TYEPLEDNAYGGQAEISDVRLSNGETIDPDAWYTVATTDYELGGSDGWDAFTVLPPDKPA QLPEGIALYQTFDPGNESACRLFDLSTDTYDEQYQQITDWARNQPNIIDAVIAYIESHSE NGVLEPVTIDGRIRIVNMPEGLDVEKNKIRLD >gi|157101653|gb|DS480671.1| GENE 243 278541 - 280592 1679 683 aa, chain - ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 280 654 188 514 555 235 42.0 3e-61 MKSIKGKILLCMSLTVVICLSILGGAGICLNYSSTNQLLEQTLRETVKIASDKVANELTS YLNVAIDTGTIARLTDPQQSPGNKQLLIDDKVKQYGFQRGNIIGPDGISALDGKDYSDRD YFHKAMAGQANISDPLVSKVNGELSIIVAAPLWKDGRPNTTIAGVVFFVPQSTFLNDIVS QVHISDNSAAYAINSDGYTIADNTTGTIMTQNIEEEAKSDSSLARLAAIHGKIRKGESGF DKYEINGVQKLAAYSPIAGTDGWGMAITAPTSDFMGATLTSIGITCALLVISLIAAVAIA YWLAKKIGNPIHICTERIQKLSEGDLKSQVPVIHSKDETGRLSEATRLITDSISNIINDI DWGLSQMAAGNFAITSQSQDLYVGDFKPLSDSMYKILKEISAVLRNIDQSAEQVAGGSEQ VAAGAQALSQGATEQASSIEELAATVNEISNHIDKNAGNAKTASEKAASVMDQAKESSQR MKEMLTAMDDINQGSGEISKIIKTIEDIAFQTNILALNAAVEAARAGEAGKGFAVVADEV 
RNLAGKSADASKNTSSLIEGSLHAVERGTKIASETALALDEVVNGVEDVSAAINEISAAS ADQASSARQVTQGIDQISSVVQTNSATAQESAAASEELSGQAQILKNLVGQFKFGSTESG EEPVMPVPDESNYMVQGCDETKY >gi|157101653|gb|DS480671.1| GENE 244 281086 - 282318 1406 410 aa, chain - ## HITS:1 COG:CAC0973 KEGG:ns NR:ns ## COG: CAC0973 COG0137 # Protein_GI_number: 15894260 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Clostridium acetobutylicum # 1 402 1 398 400 446 54.0 1e-125 MKEKVVLAYSGGLDTTAIIPWLKEHFGYEVICCCIDCGQGSELDGLDERAKLSGASKLYI EDLVDDFCDNYIMPCVKANAVYENKYLLGTSMARPAIAKRLVDIARKEGASAICHGATGK GNDQIRFELGIKALAPDIKIIAPWRMTDVWTMQSREDEIEYCKQHGIDLPFSADSSYSRD RNLWHISHEGLELEDPSQEPNYEHLLVLGVTPEKAPDEGEYVTMTFEKGVPTSVNGQKMK VSDIIRKLNELGGKHGIGIVDIVENRVVGMKSRGVYETPGGTILMEAHQQLEELVLDRAT METKKTMADKLAQIVYEGKWFTPLREAVQAFVESTQEYVTGEVKFKLYKGNIIKAGTTSP YSLYSESLASFTTGDLYDHHDADGFITLFGLPLKVRAMKMAEVESAKKDQ >gi|157101653|gb|DS480671.1| GENE 245 282566 - 283606 1272 346 aa, chain + ## HITS:1 COG:Cj0224 KEGG:ns NR:ns ## COG: Cj0224 COG0002 # Protein_GI_number: 15791596 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Campylobacter jejuni # 2 340 3 337 342 393 54.0 1e-109 MIKAGIIGSTGYAGGELARLLLQRDDIEIKWYGSRSYIDQKYASIYQNMFRIVDDACMDD NMKELAEQVDVVFTATPQGLCATLMDEDVLSKVKVIDLSADFRIKDVSVYEKWYKLTHAS PQFIGEAVYGLPEINREKVKKARLIANPGCFPTCSFLSTYPLVKEGLIDPNTLIIDAKSG TSGAGRGSKVDSLYCEVNENIKAYGVASHRHTPEIEEQLSYAAGKPVTISFTPHLVPMNR GILVTAYASLTKEVSCEEVKAVYDKYYGDEYFVRVLEKDVVPQTRWVEGSNFADVNFKID TRTNRVVMMGAIDNMVKGAAGQAIQNMNLMFGLPENTGLKQIPIFP >gi|157101653|gb|DS480671.1| GENE 246 283679 - 284143 505 154 aa, chain + ## HITS:1 COG:Cj0225 KEGG:ns NR:ns ## COG: Cj0225 COG0456 # Protein_GI_number: 15791597 # Func_class: R General function prediction only # Function: Acetyltransferases # Organism: Campylobacter jejuni # 7 152 1 146 148 183 58.0 8e-47 MAGTMDILIREMTIADYDQVYGLWTEIKGFGIRSIDDSREGVDRFLGRNPTTSVVAVQNG 
HIIGNILCGHDGRTGCFYHVCVAPGYRKHGIGYRMVRFAMEALQKEGVSKISLIAFKGNE VGNAFWQGIGWQEREDVNTYEFILNEENITRFVR >gi|157101653|gb|DS480671.1| GENE 247 284228 - 285472 1589 414 aa, chain + ## HITS:1 COG:TM1783 KEGG:ns NR:ns ## COG: TM1783 COG1364 # Protein_GI_number: 15644527 # Func_class: E Amino acid transport and metabolism # Function: N-acetylglutamate synthase (N-acetylornithine aminotransferase) # Organism: Thermotoga maritima # 19 414 5 397 397 373 50.0 1e-103 MDETGMGIKIITGGVTAARGFKAASTAAGIKYKDRQDMAMIYSQEPCRSAGTFTTNIVKA APVKWDKNQVTSGAPARAVVINAGIANACTGEEGMGYCGQTAEAAALALGISAESVLVAS TGVIGMQLPMDRITAGVKAMAPLLDGSLESGTGASMAIMTTDTKNKEVAVRFELGGTTVT MGGMCKGSGMIHPNMCTMLSFVTTDAAISKELLQEALSQDIQDTYNMISVDGDTSTNDTV LLLANGLAGNKEITEKNEDYHTFCRALKIVNETLAKKMAGDGEGCTALFEVKVVGAETKE QAKILAKSVICSSLTKAAIFGHDANWGRILCAMGYSGAQFDPEKVDLFFESAAGKMQIIK DGVAVDYSEEQATRILSEPEVTAVADIKMGDAKAAAWGCDLTFDYVKINADYRS >gi|157101653|gb|DS480671.1| GENE 248 285582 - 286472 1154 296 aa, chain + ## HITS:1 COG:Cj0226 KEGG:ns NR:ns ## COG: Cj0226 COG0548 # Protein_GI_number: 15791598 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Campylobacter jejuni # 6 284 7 280 281 332 60.0 4e-91 MNLEPMQKAAVLIEALPYIQRFNRKIVVVKYGGSAMVDEELKKKVIQDVVLLKLVGFKPI IVHGGGKEISRWVGKVGMEAQFKNGLRVTDEPTMEIAEMVLNKVNKSLVQLVSELGINAV GISGKDGMMLKCRKKYAGGEDIGFVGDVEEVNPKIIYDLLEKDFLPIICPIGFDEDFLSY NINADDAACAIATAVEAEKLAFLTDVEGVYKDYEDKSSLISEMTVEEAQNFVDSGMLGGG MLPKLQNCINAIKNGVSRVHILDGRIPHCLLLEIFTDKGIGTAIYNSEEDERYYHG >gi|157101653|gb|DS480671.1| GENE 249 286465 - 287700 1444 411 aa, chain + ## HITS:1 COG:Cj0227 KEGG:ns NR:ns ## COG: Cj0227 COG4992 # Protein_GI_number: 15791599 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Campylobacter jejuni # 7 408 3 391 395 397 49.0 1e-110 MADKQLMERAEQALYKVYNRFPVVFDHGEGVKLYDTEGNEYLDFGAGIAVMALGYGDEEY TKAVEEQLHKLTHISNLFYNQPSIEAGEKLLQVSHMDKVFFTNSGTEAIEGALKIAKRYH 
YNKLHETMGDGCDGCEDAEPDMTGEIIAMNHSFHGRSLGALSVTGNAHYQEPFKPLIPGI RFADFNDLDSVKKLINSKTCAVIMETIQGEGGIYPATEEFIKGVRALCDEHDLLLILDEI QCGMGRSGEMFAWQHYGVKPDVMTVAKALGNGLPVGAFLAWGKAASAMVPGDHGTTYGGN PLVTAGASAVLDIFEKRRITEHVKEIGAYLYEKLEELSKELDCIRAHRGMGLIQGLEFSV PAGPIVSKALLEQKLVLISAGTNIIRFVPPLVIEKADVDEMMKRLRAAAAE Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:03:24 2011 Seq name: gi|157101652|gb|DS480672.1| Clostridium bolteae ATCC BAA-613 Scfld_02_13 genomic scaffold, whole genome shotgun sequence Length of sequence - 4295 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1003 430 ## Ccel_2731 resolvase 2 1 Op 2 . + CDS 1019 - 2362 390 ## Dtox_1525 recombinase 3 1 Op 3 . + CDS 2304 - 2729 74 ## Ccel_2729 resolvase 4 1 Op 4 . + CDS 2602 - 3933 1032 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Prom 3991 - 4050 4.4 5 2 Tu 1 . 
+ CDS 4140 - 4293 172 ## gi|160935490|ref|ZP_02082866.1| hypothetical protein CLOBOL_00380 Predicted protein(s) >gi|157101652|gb|DS480672.1| GENE 1 2 - 1003 430 333 aa, chain + ## HITS:1 COG:no KEGG:Ccel_2731 NR:ns ## KEGG: Ccel_2731 # Name: not_defined # Def: resolvase # Organism: C.cellulolyticum # Pathway: not_defined # 1 330 193 524 526 185 32.0 3e-45 KEDEAVWVRRMASMYMNGMSGVRIAEYLNQNQVAAPYGGLWSSEAVLRILFNEKTVGDCL HQKTYSTGMVPFRTEKNRGKKPQYFIRDDHEGILSRDEQIRLQQIKEHRLGRQARAKSCG SEGTYILSGKVVCGECGRAFIRKKEQRRKGVSIKWRCPGHRSEACQCFTNEIWEADIERM FVNTFNTLKEHADEILDPVIRGMKKMQETNGLYEALGTLNQKKLELKEQRHILRQLKANE CIDSALYFEESQKIEQELSRCRSEEKQLHSRGTHQDTITGLQATLHQLQGYDGKMQAFDG DQFILLVREVSVGKERELGFHLRCGLNLYEPVL >gi|157101652|gb|DS480672.1| GENE 2 1019 - 2362 390 447 aa, chain + ## HITS:1 COG:no KEGG:Dtox_1525 NR:ns ## KEGG: Dtox_1525 # Name: not_defined # Def: recombinase # Organism: D.acetoxidans # Pathway: not_defined # 1 290 2 291 299 114 26.0 1e-23 MNNYITYGYRLVDGYMRMNPEQSEAVHLIFDAYDGGESIKKIVGMLKDRKAPTTRGKPSW TPALIRKILRNKDYLGVEPYPRMIEEEQFDRVQECLKRSRDLWNQRHSTGSIQGTSMYTG RLICSSCGGVYAIYKQSSRCDNRNVRYWKCRHYDPITKEPCKMPIMTEKQLDGMFLQALM RMKTEPTLYHEETKKILKGLIKERQRMESELSVCWNRCNEDDERIEQLYFQIAARRYREI KMTDLTCQIEDYLRGLDGSIQAEKIDSSIFKIIKRVEVQPDRSLRFQLINNATVLQYPEI KTTKDKKLKNIRYCVPFGYWLDGQETPIIHEQYGPIVQMLFQSYAEGMSLTALAKELIRQ SIPNQKGKPAWSHSCIRNILTNPVYLGDKKFPALVTRELFTQVQTRLEENGRKKRESRTK AADGKEELHAKGGTNRRGHSGNLEPDR >gi|157101652|gb|DS480672.1| GENE 3 2304 - 2729 74 141 aa, chain + ## HITS:1 COG:no KEGG:Ccel_2729 NR:ns ## KEGG: Ccel_2729 # Name: not_defined # Def: resolvase # Organism: C.cellulolyticum # Pathway: not_defined # 1 96 1 96 535 109 57.0 3e-23 MPRAERIVEVIPATWNPTDESSREIRKLRVAAYCRVSTELEQQQSSYDIQIEYYTRHIMQ NPNWIFAGVFADDGRSATNTFRRDDFNQLMEPVSEGEGGHGHYEIHQPVCQEYGGLHLLG EEAQGKERGRVLREGKPQHPG >gi|157101652|gb|DS480672.1| GENE 4 2602 - 3933 1032 443 aa, chain + ## HITS:1 COG:SA0057 KEGG:ns NR:ns ## COG: SA0057 COG1961 # Protein_GI_number: 15925764 # Func_class: L Replication, recombination and repair # 
Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 2 292 79 360 542 78 25.0 2e-14 MVITKSISRFARNTVDCISWVRKLKEKNVAVYFEKENLNTLDDSTEMILTILSSQAQEES RAISTNVKWGYARKFEKGESTRQRSYGFRKAPTGEMCIVEEEAAVIRNMARWFLDGDSLE RIKHRLEDAGIETTTGKKTWSTGTIYNILTNEKIMGDVLLQKTFTADYLTKRRVKNSGQQ KQYYVKNHHEAIIPKTVYYKIQEEIARRSSLKKAGTRKGKTAQGVYSSKYALTGIMVCNE CGAHYRRTTWAKSGKKVIVWRCINRLEHGTKRCHESPTLKEEVIQEAIMGKLHSLSIDQE EENFLNGVKEDILRAAKVVGGACTAEEIDKTIEELRDQLMDYVGMAAREHGGENWYSDRM RKLGLQISELKKRRESIQEQEKIRDEYEYLDQEISRIIGETGGASGAEFDNIFIRQIVRE IRVVSNDRLQIQLRTGMMLDVNL >gi|157101652|gb|DS480672.1| GENE 5 4140 - 4293 172 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160935490|ref|ZP_02082866.1| ## NR: gi|160935490|ref|ZP_02082866.1| hypothetical protein CLOBOL_00380 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_00380 [Clostridium bolteae ATCC BAA-613] # 1 51 13 63 107 87 100.0 2e-16 MFKVKLSFDTDRLEQDILEQLYDKVDEIFESEDLNCVDRALVRVYEDKGRE Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:03:50 2011 Seq name: gi|157101651|gb|DS480673.1| Clostridium bolteae ATCC BAA-613 Scfld_02_14 genomic scaffold, whole genome shotgun sequence Length of sequence - 29514 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 8, operones - 7 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 423 431 ## Closa_0559 transposase IS66 2 1 Op 2 . - CDS 491 - 850 167 ## COG3436 Transposase and inactivated derivatives 3 1 Op 3 . - CDS 840 - 1193 216 ## gi|160936291|ref|ZP_02083661.1| hypothetical protein CLOBOL_01184 - Prom 1357 - 1416 6.1 - Term 1315 - 1362 11.0 4 2 Op 1 . - CDS 1442 - 4588 1721 ## COG3587 Restriction endonuclease 5 2 Op 2 . - CDS 4585 - 5883 482 ## Swol_0509 hypothetical protein 6 2 Op 3 . - CDS 5884 - 7911 1101 ## COG2189 Adenine specific DNA methylase Mod 7 2 Op 4 . - CDS 7904 - 8779 403 ## SG2001 hypothetical protein 8 2 Op 5 . 
- CDS 8798 - 9397 582 ## Amuc_1520 hypothetical protein 9 2 Op 6 . - CDS 9419 - 12412 1621 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family - Prom 12541 - 12600 3.8 - Term 12671 - 12714 -0.9 10 3 Tu 1 . - CDS 12720 - 12929 187 ## COG3655 Predicted transcriptional regulator - Prom 12977 - 13036 7.6 11 4 Op 1 . - CDS 13418 - 13672 61 ## gi|160936299|ref|ZP_02083669.1| hypothetical protein CLOBOL_01192 12 4 Op 2 . - CDS 13766 - 14929 571 ## MGAS2096_Spy1144 hypothetical protein 13 4 Op 3 . - CDS 14926 - 15120 179 ## gi|160936302|ref|ZP_02083672.1| hypothetical protein CLOBOL_01195 - Prom 15205 - 15264 9.5 + Prom 15162 - 15221 4.3 14 5 Op 1 . + CDS 15314 - 15913 341 ## bpr_I0634 HTH domain-containing protein 15 5 Op 2 . + CDS 15950 - 17050 656 ## COG0582 Integrase + Prom 17236 - 17295 7.9 16 6 Op 1 . + CDS 17478 - 17792 246 ## CD3346 hypothetical protein 17 6 Op 2 . + CDS 17814 - 18191 432 ## CD3345 hypothetical protein 18 6 Op 3 . + CDS 18196 - 18435 169 ## gi|160936308|ref|ZP_02083678.1| hypothetical protein CLOBOL_01201 19 6 Op 4 1/0.000 + CDS 18467 - 19864 1137 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins + Prom 19913 - 19972 2.1 20 6 Op 5 . + CDS 20016 - 21239 855 ## COG2946 Putative phage replication protein RstA 21 6 Op 6 . + CDS 21291 - 21512 287 ## CD3342A hypothetical protein 22 7 Op 1 . + CDS 21642 - 22136 252 ## smi_1327 hypothetical protein 23 7 Op 2 . + CDS 22153 - 22650 493 ## CD3340 putative conjugative transposon antirestriction protein + Term 22653 - 22696 5.2 24 8 Op 1 . + CDS 22740 - 23132 370 ## CD3339 conjugative transposon membrane protein 25 8 Op 2 . + CDS 23116 - 25569 2209 ## CD3338 hypothetical protein 26 8 Op 3 . + CDS 25566 - 27896 892 ## CD3337 conjugative transposon membrane protein 27 8 Op 4 . + CDS 27893 - 28897 733 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 28 8 Op 5 . 
+ CDS 28913 - 29513 459 ## CD3335 hypothetical protein Predicted protein(s) >gi|157101651|gb|DS480673.1| GENE 1 3 - 423 431 140 aa, chain - ## HITS:1 COG:no KEGG:Closa_0559 NR:ns ## KEGG: Closa_0559 # Name: not_defined # Def: transposase IS66 # Organism: C.saccharolyticum # Pathway: not_defined # 37 136 16 107 513 63 46.0 3e-09 MAFLLKEKELANLDKQTLIKMLVAATESNQKLFEATEQLQKSVNLLTEEVVNLRQHRFGR SSEKGLTIGEDGCSQLCFAFNEAEMTIDLDPAFPEPELEDIFPKPYKRGKKKTGKRQEEI KDVPVTVVIHTLSEEELLTA >gi|157101651|gb|DS480673.1| GENE 2 491 - 850 167 119 aa, chain - ## HITS:1 COG:SP1442 KEGG:ns NR:ns ## COG: SP1442 COG3436 # Protein_GI_number: 15901292 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pneumoniae TIGR4 # 11 112 11 111 116 116 50.0 1e-26 MLSENTGFQAIYIYCGKCDLRKGIDGLATLVKEQFHLDPFQKDVLFLFCGCRTDRFKGLV WEGDGFCLLYKRIEAGRLRWPRSQEEAAGISPEELHLLLAGMTILERSSIQECNCTEIG >gi|157101651|gb|DS480673.1| GENE 3 840 - 1193 216 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936291|ref|ZP_02083661.1| ## NR: gi|160936291|ref|ZP_02083661.1| hypothetical protein CLOBOL_01184 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01184 [Clostridium bolteae ATCC BAA-613] # 1 117 1 117 117 223 100.0 3e-57 MNEVMQVRAASWTAMIKQRSDSGLTIKEWCAANGIQESVYYYRLNRLRKMALNACETPDP TKDHDPSGTFAQISVASTVQTSNAAIRIRRGDTVMEVSRDAPDRILSFLKEVMFRAL >gi|157101651|gb|DS480673.1| GENE 4 1442 - 4588 1721 1048 aa, chain - ## HITS:1 COG:PM0699 KEGG:ns NR:ns ## COG: PM0699 COG3587 # Protein_GI_number: 15602564 # Func_class: V Defense mechanisms # Function: Restriction endonuclease # Organism: Pasteurella multocida # 1 1042 1 1038 1043 859 47.0 0 MKFNFKIQQYQTDAVDAVVKIFQGQGYHDKISYIRDVGKQQENQQMRLNVDMDQQYSLDM GESHQLNIYDMDDDTGYKNEVIELSDEQLLINIQKLQSQNNIKQSKSLVKDLGRCSLDIE METGTGKTYVYIKTMFELNKKYGWSKFIVVVPSIAIREGVKKTFEITADHFMEHYGKKAR FFVYNSSNLNQLDNFSSNAGINVMIINTQAFASSLKEDGRSKEARIIYSKRDEFGSRRPI DVIKANRPIIILDEPQKMGGAVTQKALKNFNPLFTLNYSATHAVKHNTVYVLDALDAFNK 
RLVKKIEVKGFEVKNFRGTDIYLFLEQIVLSSKKPPMARIELEIGYNKSINRETRMLGVG DDLYYVSQEMEQYKGYTISDIDPFRGTVTFTNGETIQTGEVSGDVSESDMRRIQIRETIL SHFEKEEKLFNMGIKSLSLFFIDEVAKYRQYDADGNEVLGEYGQMFEQEYLNVMNEYITV FDTPYQKYLKSTCSDVSAVHKGYFSIDKKTGHSIDSQLKCGSEFSDDISAYDLILKNKER LLSFEEPTRFIFSHSALREGWDNPNVFQICTLKHSDSQTAKRQEVGRGLRLCVNQHGNRM DAESCGDSVHDINMLTVIASESYKGFVADLQSDIKTVLYDRPTTASSEYFNGKYVKVDGT PTLIDKNAADAIEFYLISNSYVDMKRKVTDKYRTDMKMGTVAPLPEEIQPMAEGIHTLIQ SIYDDSVLKDMFTDGHETKVKDNPLNDNFARAEFQALWKQINHKYAYTVEFDSNELIRKS IDHINEKLFVAELQYTTTVGRQKAKMDEYEVGRGDSFVRESSGTYTLKHTQTSQIKYDLI GKVSEGTVLTRKTVAAILKGLRVDKLYMFKNNPEEFISKVIKLIKEQKATMIVEHISYDQ IDGEYDSTIFTAEKSAQTFDKAFRANKAIQDYVFTDGSAEKSIERRFVEDLDAAQEVCVY AKLPKGFHIPTPVGNYSPDWAIAFYEGTVKHIFFVAETKGTMDSLNLRPIEQAKISCAKK LFNEMSTSNVKYHDVDSYQNLLNVMKSL >gi|157101651|gb|DS480673.1| GENE 5 4585 - 5883 482 432 aa, chain - ## HITS:1 COG:no KEGG:Swol_0509 NR:ns ## KEGG: Swol_0509 # Name: not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 1 429 1 429 432 506 59.0 1e-141 MKQITEITRRDIFSLFIYGMDIEEFFDNKRIQYGYCGRLSELDFLKRIYDLKSLPSFDDR FDDAEGDIWQHTVNNNDYEEGWIFEDDRFELLNGDDEVLLKFLCAVFHPAVRTESGYWKE FLGEVNGLLQNDGYELYPESKISGRDVYGWRKHDLEESSLFVPFSQRNEQDIKEKKLKLS LNMKTRNQIYKLIEKHSTVYRETDETGWGYDIKTSECVFRDMSQFYKPKCFDVKNNYIET SSMEEFILHNYPFYVFDAIELYEKYNADSDYTAQMNMIFKLNSVPYKLEQGRIVSTLEIA IDPKILAKIPEKGLKELLSEADAYYRSENKQIAVEKIWDAFERLKTYYSPTLNKAQSADK IIDNMSNSEPNYKELYEAEFKALTSIGNSFRIRHHETTKVDITDNRQYDYFYKRCLALVS VAILYLEGGINA >gi|157101651|gb|DS480673.1| GENE 6 5884 - 7911 1101 675 aa, chain - ## HITS:1 COG:PM0698 KEGG:ns NR:ns ## COG: PM0698 COG2189 # Protein_GI_number: 15602563 # Func_class: L Replication, recombination and repair # Function: Adenine specific DNA methylase Mod # Organism: Pasteurella multocida # 3 670 4 631 636 503 43.0 1e-142 MDKLKMHTPNKADENFKKLAVLFPNAVTETIDENGEVVRAIDKDVLMQEISCTVVDGNEE RYQFTWPDKKKTVLLANAPLNKTLRPCREESVDFDNTENLYIEGDNLEVLKLLQETYLGK IKMIYIDPPYNTGHDFVYEDDFSQSSNEYLSNSGQFDDAGNRLVANTESNGRFHTDWLNM 
MYPRLRLAKDLLTEDGVIFISIDDNEQCNLVKLCDEVFGAENCIGPIIQNKQNAKNDTVN IQKNHEFILVYRKSSNSINGSIAATLSRKVITYRNAYEDDGRFYYLNDPITTRGEGGTLN ARPNLGYTVYYNPETKDKIAVADYNVTLAKTSNDEGEIYTTDNELLCKGYVAIRPPRVRG KLGCWTWALEKFNAQKDEIIITGKPGSYAVRKRTFVKKEDIVTVDGKMQYTASSMTNSRS ILDFSTNDGTNTLNTVLGKNAVFSNPKNLEMIKYLIQLVADKSFIVLDFFSGSATTAHAV MQANAEDGGKRRFIMVQISEETEDKSDAYKAGYKTICEIGKERIRRAAKKIAEENPDAKF DSGFRVLKCDTSNMKDVYYSPADFDMNLLDMMADNIKEDRTPEALLFQVMLDLGVELSSK IEETTIAGKKVFDVADGFLIACFDKDVNEETIKAIAQKQPYYFVMRDSSLASDSVATNFE QIFATYSPDTVRKVL >gi|157101651|gb|DS480673.1| GENE 7 7904 - 8779 403 291 aa, chain - ## HITS:1 COG:no KEGG:SG2001 NR:ns ## KEGG: SG2001 # Name: not_defined # Def: hypothetical protein # Organism: S.enterica_Gallinarum # Pathway: not_defined # 1 280 1 292 293 171 40.0 3e-41 MSSEQYQRTVNSLDKEIADLEKKKAAKDREVANLQGKINTLKKSINSHTSASTLNSKMRQ IAIHESDQAKKSQDSADLGKKIAEKRKKRAEAYLRLQKEQQNEQKKQDKVNQQIQASYEA RIKELQQQLLQPVVTTTTTPIAADEEYDVFVSHAYEDKESFVDEFVEALRNQELKVWYDM DKLKWGDSMREKIDRGLAKSRYGVVILSPNYIAEHKYWTKAELNGLFQVETVNGKTILPI WHNLTKKQVVEYSPIIADRKAMTTALMTPEEIAAELKELFTAEDTEDENNG >gi|157101651|gb|DS480673.1| GENE 8 8798 - 9397 582 199 aa, chain - ## HITS:1 COG:no KEGG:Amuc_1520 NR:ns ## KEGG: Amuc_1520 # Name: not_defined # Def: hypothetical protein # Organism: A.muciniphila # Pathway: not_defined # 1 198 1 200 200 157 45.0 2e-37 MLGLPKATELSMQLPKNAIYAKFQMNTAEKAKIDADISRITIVNEVSADKVHIAAGEQVK SFFVLLVALKKKDFDDRTIITISKLIPQNMLLVLECGDEAKLATYHTKLMQTGWKRKEEL SVELKGLNMDAVWENIIVQIGGITVEQGNTLDEQIAVDEYRHKIEKEIAKLEKQARAEKQ PKKKFELAQQIKKLNDELR >gi|157101651|gb|DS480673.1| GENE 9 9419 - 12412 1621 997 aa, chain - ## HITS:1 COG:PM0696 KEGG:ns NR:ns ## COG: PM0696 COG0553 # Protein_GI_number: 15602561 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Pasteurella multocida # 1 996 86 1086 1089 1025 52.0 0 MYGTEFEVKLRNELTQKAIAKECAEWIRKKVTFKSNVTNENMMGFINLDDKNYMPINGFT TVDLGCERGNNAYNMVQKTEVPFSTAYIDLFDNLWNDTSKLQEVTDEVIENITAAYNENS 
PDFIYFVTLYNIFSEFLEDVSEDHLPNEATGFKESKIWSMLYNFQKDAVLAIISKLEKFN GCILADSVGLGKTFTALAVIKYYENRNKSVLVLCPKKLTNNWNTYKDNYVNNPIASDRLR YDVLYHTDLNRTHGKSNGLDLDRLNWSNYDLVVIDESHNFRNGGKLSGEDNEKENRYLKL LNKVIRKGVKTKVLMLSATPVNNRFNDLKNQLALAYEGNTDLIDDKLNTTKSIDEIFKNA QRAFNTWSKWDPADRTTENLLRMLDFDFFEVLDSVTIARSRKHIQKYYDTSDIGTFPTRL KPISLRPPLTSLKKAINYNEIYEQLTQLSLSIYTPTHFILPSKMEKYAEMYEDNKVNIGF TQANREQGIRRLTAINLMKRMESSVHSFNLTLKRIYSLIDSTIHSIDTYDKTSSVRLELT DISDIDEFDSEDQNGDELFTFGKKVKIDIGDMDYKSWRDSLAKDRDTLELLTLMVGDITP EYDSKLQELFQVIKNKLEHPINEGNKKIIIFTAFADTAQYLFDNVSKYVKDNFGQNTAMV SGSVEGRTTVPRLKSDLNTVLTCFSPISKDKHLLMPNDSTEIDFLIATDCISEGQNLQDC DYLINYDIHWNPVRIIQRFGRIDRIGSKNAFIQLVNFWPDVTLDEYIDLKAKVETRMKIV DMTATGDDNLLSDEEKTDLEYRKAQLKRLQEEVVDIEDMSTGISIMDLGLNEFRMDLLEY IKNHPDIDKAPFGLHSVAAASEETPAGVIYVLKNRSNSVNIDNQNRLHPFYMVYISNEGE VICDHLSPKQMLDKMRFLCKGKTEPIPELYRQFNKETRDGKNMVVFSKLLGDAIASIIEV KEESDIDSFLGGGQMSFLTNEIKGLDDFELICFLVVR >gi|157101651|gb|DS480673.1| GENE 10 12720 - 12929 187 69 aa, chain - ## HITS:1 COG:SPy0544 KEGG:ns NR:ns ## COG: SPy0544 COG3655 # Protein_GI_number: 15674643 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 1 65 1 65 69 67 53.0 6e-12 MDVSYKRLWKLLIDRDMKKKDLAEKANLSNYTINKMNKGENVNTDTLVKICGALNCRIED IMEVVPDEK >gi|157101651|gb|DS480673.1| GENE 11 13418 - 13672 61 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936299|ref|ZP_02083669.1| ## NR: gi|160936299|ref|ZP_02083669.1| hypothetical protein CLOBOL_01192 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01192 [Clostridium bolteae ATCC BAA-613] # 1 84 1 84 84 149 100.0 6e-35 MKNVQIPYDLFVALLQYHLMMDGDYEDEIKQGLEQKLEAMVRHEWYAKYKTAPTPEEREA ARQRYLDERGVPQSFRWTTSPWER >gi|157101651|gb|DS480673.1| GENE 12 13766 - 14929 571 387 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1144 NR:ns ## KEGG: MGAS2096_Spy1144 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 80 387 23 328 331 220 42.0 9e-56 
MMQENEKTTVPIPSVGADGEQSLSYVTNEIITTGNEEINPIDESMEEMLRQMQRMSDPSY LATMTMSQLYDTVYESRLPVIDGLLYPGTYLFVGDPKVGKSFLMAQIAYHVSTGLPLWNY SVHAGTVLYLALEDDYRRLQERLYRMFGVDGTDTLHFATCAKQLGAGLYEQLARFVSEHR DTRLIIIDTLQKIREASGDRYSYASDYEIIGQLKHFADQTGIALLLVHHTRKQQADDKFD MISGTNGLLGAADGAFVLQKEKRTGNTAVLEVSGRDQPEQRLILKKDMEHLVWELEKAET ELWQIPPDPILEKVAALLAEDILEWSGTPTELAEALKLELKPNLLTKHLNVNASRLFHEC HTQYENIRTHAGRRIILKRVSEQRDDA >gi|157101651|gb|DS480673.1| GENE 13 14926 - 15120 179 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936302|ref|ZP_02083672.1| ## NR: gi|160936302|ref|ZP_02083672.1| hypothetical protein CLOBOL_01195 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01195 [Clostridium bolteae ATCC BAA-613] # 1 64 1 64 64 124 100.0 3e-27 MRAQFITAEEVKQVLDVSRSKSYQIIRDLNKELKSMGYHTIAGKCPIQFFKQKFYGLEIG GDNA >gi|157101651|gb|DS480673.1| GENE 14 15314 - 15913 341 199 aa, chain + ## HITS:1 COG:no KEGG:bpr_I0634 NR:ns ## KEGG: bpr_I0634 # Name: not_defined # Def: HTH domain-containing protein # Organism: B.proteoclasticus # Pathway: not_defined # 7 170 5 168 174 156 46.0 5e-37 MDNFDYSAIGQRIKTLRKRKGLNQTQLANLIGKSLRTVQKYETGEIEVSIDVVNEIAKHL DTTPTFILGYETNTAPIRSLADIMSFLFELNKVSSLKFDVDVKKPPRSNEWTCSIRFNGK EMDADHNADMCLFLEQWEDMREDVRSYNCSAAALHSWQEQTLAYYAGASVECVEPEELDE DERLARRRAYLEKEFGSKK >gi|157101651|gb|DS480673.1| GENE 15 15950 - 17050 656 366 aa, chain + ## HITS:1 COG:BS_ydcL KEGG:ns NR:ns ## COG: BS_ydcL COG0582 # Protein_GI_number: 16077547 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus subtilis # 22 355 23 367 368 144 30.0 3e-34 MSVTRDKNTGKWMSQIRVKDWTGKEIHKKKRGFKTKKEALQWEQDFISQADGSLGMKFKD FVALYMKDADPRLRETTLANKHYLFNKKVLPYFGEMPINAIKPTDIRNWQNELINYRRPN GRGYSPTYLRTINNQLTAAFNFAVKFYGLRENPCHKAGTMGKKNADEMLFWTNEEFKVFI KAMESRPVGYTVFMVMYYTGLRVGELLALTPEDIDFDRHIISVNKNYQHLNGKDYIYPPK TEAGYREVVMPKVLESCLKDYLAKLYDVQPTDRIFPYDRGWVGRQMKYGCDHSGTQKIRV HDVRHTHASLLIDMGCTPLLVAERLGHERVQTTMETYSHLYPNKQSEVANQLDAMAEKDD GYKKLG >gi|157101651|gb|DS480673.1| GENE 16 17478 - 17792 
246 104 aa, chain + ## HITS:1 COG:no KEGG:CD3346 NR:ns ## KEGG: CD3346 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 103 1 103 104 199 99.0 3e-50 MELKFVIPNMEKTFGNLEFAGEDKTEQRRINGRMAVLSRGFNLYSDVQRADDIVVILPAE AGEKHFDFEERVKLVNPRITAEGYKIGTRGFTNYILHADDMVKV >gi|157101651|gb|DS480673.1| GENE 17 17814 - 18191 432 125 aa, chain + ## HITS:1 COG:no KEGG:CD3345 NR:ns ## KEGG: CD3345 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 125 1 125 125 218 89.0 9e-56 MRLANGIVIDKEATFGTLKFSALRREVHIQNEDGTVSEEIKERTYDLKSRGQGRMIQVSI PASVPLKEFDYNAEVEIIHPVADTVATATFQGADVDWYIKAEDIILKKGVAMNLQPPKKD HPAEK >gi|157101651|gb|DS480673.1| GENE 18 18196 - 18435 169 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936308|ref|ZP_02083678.1| ## NR: gi|160936308|ref|ZP_02083678.1| hypothetical protein CLOBOL_01201 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01201 [Clostridium bolteae ATCC BAA-613] # 1 79 1 79 79 145 100.0 1e-33 MKSIAGYDEYEQDFQLLALPEYRWMAEQLLIEAFEDYQPKEPFHVYLPDCLCGEPSGYTF FFEACTEGRGHSLLYTGFY >gi|157101651|gb|DS480673.1| GENE 19 18467 - 19864 1137 465 aa, chain + ## HITS:1 COG:BS_ydcQ KEGG:ns NR:ns ## COG: BS_ydcQ COG1674 # Protein_GI_number: 16077553 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Bacillus subtilis # 4 465 10 467 480 450 49.0 1e-126 MKPFRSRGRRIRPTDKDLVFHTALAALLPVFLLIVLLFHAGQLRRIDWQHTSLSQMVQGI NLPYLLFSLGIAGLLCLGTACASYHYCRDGIKQLIHRQKLARMILENKWYEVEQVQADGF FKDLSSSRSKEKITCFPKVYYQLKNGLIHIRAEITLGKYQEQLLNLEKKLESGLYCELTG KELKDSYVEYTLLYDTIAGRISIEDVQAKDGKLRLMKNVWWEYDKLPHMLIAGGTGGGKT YFILTLIEALLRTNAALFVLDPKNADLADLQAVMPDVYYKKEDMLACIDRFYGEMMKRSE DMKLMENYRTGENYAYLGLPAHFLIFDEYVAFMEMLGTKENAAVLNKLKQIVMLGRQAGF FLILACQRPDAKYLGDGIRDQFNFRVALGRMSEMGYGMMFGETTKDFFLKQIKGRGYVDV GTSVISEFYTPLVPKGHDFLKEIKKLTDSRQGVQAACEAKAAETD >gi|157101651|gb|DS480673.1| GENE 20 20016 - 21239 
855 407 aa, chain + ## HITS:1 COG:BS_ydcR KEGG:ns NR:ns ## COG: BS_ydcR COG2946 # Protein_GI_number: 16077554 # Func_class: L Replication, recombination and repair # Function: Putative phage replication protein RstA # Organism: Bacillus subtilis # 77 407 25 352 352 240 40.0 3e-63 MKVDIRFFRQEGFSLNEETWVREIKEKREIYGISQQKLALAAGITRPYLSDIETGKAHPS EALQEAIMEALERFNPDAPLELLFDYVRIRFPTTDVKHIVEDVLRLKLPCFIHEDYGFYS YTEHYYLGDVFVLVSPELEKGVLLELKGRGCRQFESYLLAQERSWYEFFMDVLMEDGVMK RLDLAINDKTGILNIPHLTEKCRNEECISVFRSFKSYRSGELVRRGEKECMGNTLYIGSL QSEVHFCIYEKDYEQYKKHDIPIADAEVKNRFEIRLKNERAFYAIRDLLEHDNPERTAFQ IINRYVRFVDRDDTKPRSDWRINEEWAWFIGERRGSLKLTTKPEPYSFERTLHWLSHQVA PTLKLALRLDKMNHTQIVHDIITHAKLTEKHEKILKQQAAAAKEVVL >gi|157101651|gb|DS480673.1| GENE 21 21291 - 21512 287 73 aa, chain + ## HITS:1 COG:no KEGG:CD3342A NR:ns ## KEGG: CD3342A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 73 1 73 73 106 98.0 4e-22 MNFGQNLYNWFLSNAQSLVLLAIVVIGLYLGFKREFSKLIGFLVVSLVAVGLVFNAEGVK DILLELFNKIIGA >gi|157101651|gb|DS480673.1| GENE 22 21642 - 22136 252 164 aa, chain + ## HITS:1 COG:no KEGG:smi_1327 NR:ns ## KEGG: smi_1327 # Name: not_defined # Def: hypothetical protein # Organism: S.mitis_B6 # Pathway: not_defined # 1 163 3 164 166 132 46.0 7e-30 MRIYILNTTRFYHEDFEEYPGAWFSCPVDFEEVRERLGVQNEEEIEIEDYELPFPLEGST RLWEINALCRIVQEMQGTPLYYEMDVIQKRWFHSFTEFIDNKDHIRCYPVQDGESLARYL VQETQIFGEVHPDLMNHIDYTAIGRKLETSENYLFTDNGIFSYR >gi|157101651|gb|DS480673.1| GENE 23 22153 - 22650 493 165 aa, chain + ## HITS:1 COG:no KEGG:CD3340 NR:ns ## KEGG: CD3340 # Name: not_defined # Def: putative conjugative transposon antirestriction protein # Organism: C.difficile # Pathway: not_defined # 1 165 1 165 165 275 95.0 7e-73 MEEMRIYIANLGKYNEGELVGAWFTPPVDFEEVRERIGLNDEYEEYAIHDYELPFAIDEY TPIEEVNRLCEMVEDLPEYIQEELSELQSYFGSIEELCEHEDDIIFHSGCDDMADVARYY LEETGQLGELTAHLQNYIDYEAYGRDMELEGTFIVTNHGVYEILR >gi|157101651|gb|DS480673.1| GENE 24 22740 - 23132 370 130 aa, chain + ## HITS:1 COG:no KEGG:CD3339 NR:ns ## 
KEGG: CD3339 # Name: not_defined # Def: conjugative transposon membrane protein # Organism: C.difficile # Pathway: not_defined # 1 130 1 130 130 235 97.0 4e-61 MKKIRSYTSIWSVEKILYSINDFKLPFPITFTQMAWFVISVFAVMLLGNLPPLSFIGGAF LKYFGVPFALTWFMCQKTFDGKKPYGFLKSVLAYLVRPKLTYAGKPVKLEKEYPAQPITA VRSDIYGISD >gi|157101651|gb|DS480673.1| GENE 25 23116 - 25569 2209 817 aa, chain + ## HITS:1 COG:no KEGG:CD3338 NR:ns ## KEGG: CD3338 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 817 1 817 817 1595 99.0 0 MAYPIKYIENNLVFNHDGECFAYYELLPYNYSFLSPEQKYQVHDSFRQLIAQNRDGKIHA LQISTESSIRAAQERSKQEVTGKLKDVACAKIDAQTEALISMIGENQVDYRFFIGFKLLV NEQEVTMKQFRREAKTAVSDFLHEVNHKLMGDFVSMSNEEIWRFQKMEKLLENKISRRFK VRRLDKDDFGYLIEHLYGQTGTAYEDYEYYLPKKRFQEETLVKYYDLIKPTRCLIEENQR YLKIEQEDGTVYAAYFTINSIVGELDFPSSEIFYYQQQQFTFPIDTSMNVEIVTNRKALS TVRNKKKELKDLDNHAWQNDSETSTNVVDALDSVNELESTLDQSKESMYKLSYVVRVTAP DLEELKRRCNEVKDFYDDLNVKLVRPFGDMLGLHGEFLPASKRYMNDYIQYVTSDFLAGL GFGATQMLGEPEGIYIGYSLDTGRNVYLKPALASQGVKGSVTNALAAAFVGSLGGGKSFS NNMIVYYCVLFGAQALIVDPKAERGRWKETLPEIAHEINIVNLTSEEQNRGLLDPYVIME NPKDSESLAIDILTFLTGISSRDGEKFPVLRKAIRAVTNSEERGLLKVIEELRAEGTTIS TSIADHIESFTDYDFAHLLFSDGDVTQSISLEKQLNIIQVADLVLPDKETSFEEYTTMEL LSVAMLIVISTFALDFIHTDRSVFKIVDLDEAWSFLQVAQGKTLSMKLVRAGRAMNAGVY FVTQNTDDLLDEKLKNNLGLKFAFRSTDINEIRKTLTFFGVDSEDENNQKRLRDLENGQC LISDLYGRVGVIQFHPIFEDLFHAFDTRPPVRKEVEE >gi|157101651|gb|DS480673.1| GENE 26 25566 - 27896 892 776 aa, chain + ## HITS:1 COG:no KEGG:CD3337 NR:ns ## KEGG: CD3337 # Name: not_defined # Def: conjugative transposon membrane protein # Organism: C.difficile # Pathway: not_defined # 1 775 1 792 793 1345 91.0 0 MKIRKPIQQGTALVPCRTKGQNSLKKIGSIAGKVLLALLLILLLLAVSGTAAHAAGLVDD TVDAANEYSKYPLDNYQLDFYVDSGWDWLPWNWLDGIGKQVMYGLYAITNFIWAISLYLS NATGYLIQEAYSLDFISSTADSIGKNMQTLAGVTTGGLSSKGFYIGFLLILILVVGIYVA YTGLIKRETTKAIHAVVNFVVVFVLSSAFIAYAPDYIGKINEFSADISNASLTLGTKIVL PNSESQGKDSVDLIRDSLFSIQVKQPWLLLQYGNSDVESIGADRVESLLSTSPNENNGQD 
REEIVVEEIEDRENTNLTITKTINRLGTVFFLFMFNIGISVFVFLLTGIMIFSQVLFIIY AMFLPVSFLLSMVPSFEGMSKRAITKLFNTILTRAGITLIITVAFSISTMLYNLSGEYPF FLTAFLQIVTFAGIYFKLGDLMGMFSLQSGDSQSMGSRIMRRPRMLMHAHMHRLQHKLGR SVAALGAGTAAYHAGKQAGSDQKNASNSGSSKRTQADHSRPDGQTAPEKKSAWKRAGSAV GAVADTKDKIADTTGQLREQAKDLPVNAKYALYHGKKQVSEGVRDFTSSVTQTRTARAEQ RNAQAESRRQTIAERRAELEQAKQTQKTASEAPKGVAPVHERPATAKTKQPASPAIRERG QVSYGGTVAERTSVPVVKAASIHHEQIPPVRTERQIVPPASPDKPDERQKIAPAITPAAP RPARPVQNDTAPVIPERKRAAPVVKESNFTIRRTTARKEWTKTGKAAAKQKKGEKT >gi|157101651|gb|DS480673.1| GENE 27 27893 - 28897 733 334 aa, chain + ## HITS:1 COG:BS_yddH_2 KEGG:ns NR:ns ## COG: BS_yddH_2 COG0791 # Protein_GI_number: 16077564 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus subtilis # 214 333 5 124 124 138 57.0 1e-32 MKLRHLFFACSGVFVMMFSMLLLVVIVFSDEEDGGSGGNLIYGGVSVSQEVLAHKPMLEK YAREYGIEEYLNVLLAIIQVESGGTLEDVMQSSESLGIPPNSLNTEESIKQGCKYFSELL TAAETKGCDLNSVIQSYNYGGGFLDYVAGHGKKYTFELAESFAREKSGGKKVTYTNPVAV EKNGGWRYSYGNMFYVFLVSEYLTVAQFDDETVQAIMEEALKYEGWTYVYGGDSPSTSFD CSGLVQWCYGKAGIALPRTAQEQYNVTQHIPLSEAKAGDLVFFHSTYNAGTYITHVGLYV GNNQMYHAGNPIGYADLTGSYWQQHLAGAGRIKQ >gi|157101651|gb|DS480673.1| GENE 28 28913 - 29513 459 200 aa, chain + ## HITS:1 COG:no KEGG:CD3335 NR:ns ## KEGG: CD3335 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 199 1 199 302 315 96.0 9e-85 MIQIRKEENQKKQKEKKLKVYKVNTHKKTVIALWVLLAVSFLFAVYKNFTAIDIHTVHET KVIEEKIIDTHKIENFVENFAEVYYSWEQSAASIDNRTNALKGYLTGELQALNVDTVRKD IPVSSALTDFQIWEIIEEKEQHYQVTYTVEQRITEGESSKTVRSAYQVTVYVDGSGNLTI VQNPTITSVPVKSGYTPKAA Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:06:05 2011 Seq name: gi|157101650|gb|DS480674.1| Clostridium bolteae ATCC BAA-613 Scfld_02_15 genomic scaffold, whole genome shotgun sequence Length of sequence - 142307 bp Number of predicted genes - 122, with homology - 119 Number of transcription units - 51, operones - 24 average 
op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 822 619 ## COG1083 CMP-N-acetylneuraminic acid synthetase - Term 1115 - 1159 3.4 2 2 Tu 1 . - CDS 1171 - 1287 166 ## - Prom 1368 - 1427 4.5 - Term 1379 - 1413 -0.3 3 3 Tu 1 . - CDS 1502 - 1690 89 ## gi|160936322|ref|ZP_02083691.1| hypothetical protein CLOBOL_01214 - Prom 1874 - 1933 2.2 4 4 Op 1 . - CDS 1936 - 3030 318 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 5 4 Op 2 . - CDS 3033 - 3626 260 ## gi|160936324|ref|ZP_02083693.1| hypothetical protein CLOBOL_01216 6 5 Op 1 . - CDS 4547 - 5596 287 ## LAR_1288 hypothetical protein 7 5 Op 2 . - CDS 5600 - 6823 450 ## gi|160936326|ref|ZP_02083695.1| hypothetical protein CLOBOL_01218 8 5 Op 3 3/0.000 - CDS 6823 - 7833 266 ## COG0438 Glycosyltransferase 9 5 Op 4 . - CDS 7834 - 8958 494 ## COG0381 UDP-N-acetylglucosamine 2-epimerase 10 5 Op 5 2/0.000 - CDS 8986 - 9840 298 ## COG1091 dTDP-4-dehydrorhamnose reductase 11 5 Op 6 4/0.000 - CDS 9927 - 10970 467 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 12 5 Op 7 . - CDS 10986 - 12176 475 ## COG0438 Glycosyltransferase - Prom 12234 - 12293 2.5 - Term 12240 - 12284 1.0 13 6 Tu 1 . - CDS 12298 - 12432 110 ## gi|160936332|ref|ZP_02083701.1| hypothetical protein CLOBOL_01224 - Term 12627 - 12675 6.3 14 7 Op 1 3/0.000 - CDS 12707 - 13312 503 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 15 7 Op 2 5/0.000 - CDS 13309 - 13980 294 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 16 7 Op 3 . - CDS 14005 - 15210 420 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 17 7 Op 4 . - CDS 15266 - 15415 97 ## gi|160936491|ref|ZP_02083859.1| hypothetical protein CLOBOL_01382 - Prom 15486 - 15545 5.1 + Prom 15295 - 15354 4.6 18 8 Tu 1 . + CDS 15551 - 16717 341 ## COG3547 Transposase and inactivated derivatives + Term 16763 - 16797 1.1 19 9 Op 1 . 
- CDS 17401 - 19713 2223 ## COG5263 FOG: Glucan-binding domain (YG repeat) 20 9 Op 2 . - CDS 19710 - 20198 393 ## Shel_01150 PAS domain-containing protein 21 9 Op 3 . - CDS 20235 - 22319 1722 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 22 10 Op 1 . - CDS 22424 - 23104 653 ## gi|160936342|ref|ZP_02083711.1| hypothetical protein CLOBOL_01234 23 10 Op 2 . - CDS 23193 - 24116 781 ## COG1316 Transcriptional regulator 24 11 Op 1 5/0.000 - CDS 24277 - 25077 824 ## COG0489 ATPases involved in chromosome partitioning 25 11 Op 2 2/0.000 - CDS 25080 - 25850 664 ## COG3944 Capsular polysaccharide biosynthesis protein 26 11 Op 3 . - CDS 25892 - 26638 387 ## COG4464 Capsular polysaccharide biosynthesis protein - Prom 26689 - 26748 4.1 - Term 26920 - 26960 1.4 27 12 Tu 1 . - CDS 27013 - 27930 623 ## COG0582 Integrase - Prom 27967 - 28026 4.7 - Term 28030 - 28077 9.3 28 13 Op 1 . - CDS 28091 - 28573 648 ## COG2606 Uncharacterized conserved protein 29 13 Op 2 . - CDS 28635 - 29447 890 ## Closa_2353 hypothetical protein 30 13 Op 3 . - CDS 29444 - 30667 1338 ## Closa_2354 hypothetical protein 31 13 Op 4 . - CDS 30682 - 34272 3756 ## Closa_2355 hypothetical protein + Prom 34486 - 34545 9.3 32 14 Op 1 . + CDS 34594 - 35748 1369 ## COG0281 Malic enzyme 33 14 Op 2 . + CDS 35793 - 36296 402 ## Closa_2357 hypothetical protein + Term 36420 - 36451 1.5 34 15 Tu 1 . - CDS 36382 - 37236 747 ## COG0622 Predicted phosphoesterase - Prom 37309 - 37368 5.5 + Prom 37684 - 37743 5.0 35 16 Tu 1 . + CDS 37788 - 39206 1618 ## COG0591 Na+/proline symporter + Term 39263 - 39300 -0.6 - Term 39015 - 39065 5.5 36 17 Op 1 1/0.059 - CDS 39226 - 40572 1444 ## COG2252 Permeases 37 17 Op 2 . - CDS 40609 - 42399 1494 ## COG1001 Adenine deaminase - Prom 42426 - 42485 9.2 + Prom 42418 - 42477 5.7 38 18 Tu 1 . 
+ CDS 42512 - 43423 743 ## COG0583 Transcriptional regulator + Term 43430 - 43488 3.2 - Term 43416 - 43475 9.2 39 19 Op 1 4/0.000 - CDS 43522 - 44103 757 ## COG0218 Predicted GTPase 40 19 Op 2 18/0.000 - CDS 44133 - 46442 2566 ## COG0466 ATP-dependent Lon protease, bacterial type 41 19 Op 3 24/0.000 - CDS 46593 - 47912 284 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 42 19 Op 4 29/0.000 - CDS 47989 - 48570 618 ## COG0740 Protease subunit of ATP-dependent Clp proteases - Prom 48678 - 48737 3.4 - Term 48684 - 48743 3.0 43 19 Op 5 . - CDS 48755 - 50041 1569 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) - Prom 50120 - 50179 7.9 - Term 50151 - 50202 8.1 44 20 Tu 1 . - CDS 50223 - 50777 668 ## COG0693 Putative intracellular protease/amidase - Prom 50854 - 50913 8.9 + Prom 50810 - 50869 10.9 45 21 Tu 1 . + CDS 50935 - 51387 453 ## COG2954 Uncharacterized protein conserved in bacteria + Term 51493 - 51536 7.4 - Term 51912 - 51957 7.1 46 22 Op 1 10/0.000 - CDS 52063 - 53718 1431 ## COG4211 ABC-type glucose/galactose transport system, permease component 47 22 Op 2 16/0.000 - CDS 53734 - 55242 1395 ## COG1129 ABC-type sugar transport system, ATPase component - Prom 55279 - 55338 1.5 - Term 55277 - 55312 5.3 48 22 Op 3 3/0.000 - CDS 55341 - 56681 1608 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 56734 - 56793 6.8 49 22 Op 4 . - CDS 56840 - 57868 1240 ## COG1879 ABC-type sugar transport system, periplasmic component 50 22 Op 5 . - CDS 57944 - 58915 947 ## Closa_2376 ABC-type sugar transport system periplasmic component-like protein 51 22 Op 6 1/0.059 - CDS 58905 - 59309 452 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 52 22 Op 7 7/0.000 - CDS 59156 - 60700 1078 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 53 22 Op 8 . 
- CDS 60766 - 62403 1895 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 54 22 Op 9 . - CDS 62427 - 63488 742 ## COG4677 Pectin methylesterase - Prom 63549 - 63608 7.1 + Prom 63647 - 63706 7.6 55 23 Tu 1 . + CDS 63778 - 65184 1475 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase + Term 65257 - 65292 7.2 56 24 Op 1 . - CDS 65200 - 65877 658 ## COG2082 Precorrin isomerase 57 24 Op 2 2/0.000 - CDS 65834 - 67522 1427 ## COG1492 Cobyric acid synthase 58 24 Op 3 9/0.000 - CDS 67495 - 68619 861 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 59 24 Op 4 1/0.059 - CDS 68695 - 69693 784 ## COG1270 Cobalamin biosynthesis protein CobD/CbiB 60 24 Op 5 8/0.000 - CDS 69708 - 70139 339 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase 61 24 Op 6 8/0.000 - CDS 70142 - 70975 836 ## COG0368 Cobalamin-5-phosphate synthase 62 24 Op 7 2/0.000 - CDS 70972 - 71613 641 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase 63 24 Op 8 2/0.000 - CDS 71668 - 72726 1157 ## COG2038 NaMN:DMB phosphoribosyltransferase 64 24 Op 9 . - CDS 72723 - 74426 999 ## COG1797 Cobyrinic acid a,c-diamide synthase 65 24 Op 10 . - CDS 74411 - 75709 1180 ## COG2242 Precorrin-6B methylase 2 66 24 Op 11 3/0.000 - CDS 75702 - 76523 837 ## COG2099 Precorrin-6x reductase 67 24 Op 12 . - CDS 76510 - 78024 1304 ## COG2875 Precorrin-4 methylase 68 24 Op 13 . - CDS 78038 - 79285 1023 ## COG1903 Cobalamin biosynthesis protein CbiD 69 24 Op 14 . - CDS 79335 - 80504 627 ## COG2073 Cobalamin biosynthesis protein CbiG 70 24 Op 15 . - CDS 80560 - 80772 341 ## Closa_2419 hypothetical protein - Prom 80889 - 80948 5.1 + Prom 80851 - 80910 5.4 71 25 Op 1 9/0.000 + CDS 80998 - 81729 455 ## COG3279 Response regulator of the LytR/AlgR family 72 25 Op 2 . 
+ CDS 81726 - 83069 828 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Term 83160 - 83209 12.9 73 26 Op 1 21/0.000 - CDS 83223 - 85214 1177 ## PROTEIN SUPPORTED gi|227872165|ref|ZP_03990534.1| possible ribosomal protein S1 74 26 Op 2 1/0.059 - CDS 85217 - 85894 261 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 - Prom 85926 - 85985 5.4 75 27 Op 1 . - CDS 86020 - 87261 1453 ## COG2081 Predicted flavoproteins 76 27 Op 2 . - CDS 87258 - 88760 1604 ## COG2326 Uncharacterized conserved protein - Prom 88783 - 88842 5.1 - Term 88890 - 88937 10.1 77 28 Op 1 . - CDS 88965 - 90284 1706 ## COG0527 Aspartokinases - Prom 90371 - 90430 8.9 78 28 Op 2 . - CDS 90500 - 92584 2362 ## COG3808 Inorganic pyrophosphatase - Prom 92666 - 92725 9.9 79 29 Tu 1 . - CDS 92798 - 93388 574 ## COG2316 Predicted hydrolase (HD superfamily) - Prom 93427 - 93486 7.0 80 30 Tu 1 . + CDS 93616 - 94128 481 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 81 31 Tu 1 . - CDS 93973 - 94422 238 ## - Prom 94456 - 94515 5.3 - Term 94587 - 94624 3.0 82 32 Tu 1 . - CDS 94673 - 98449 4346 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain - Prom 98663 - 98722 7.3 + Prom 98508 - 98567 8.8 83 33 Tu 1 . + CDS 98700 - 100178 1382 ## COG4868 Uncharacterized protein conserved in bacteria + Term 100250 - 100291 2.1 84 34 Tu 1 . - CDS 100332 - 101579 1551 ## COG0205 6-phosphofructokinase - Prom 101618 - 101677 4.9 85 35 Op 1 . - CDS 101695 - 102297 657 ## COG0406 Fructose-2,6-bisphosphatase 86 35 Op 2 . - CDS 102326 - 102922 709 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) - Prom 103046 - 103105 6.9 + Prom 102919 - 102978 6.2 87 36 Tu 1 . + CDS 103075 - 103950 1037 ## COG0583 Transcriptional regulator - Term 103820 - 103872 3.8 88 37 Tu 1 . 
- CDS 103996 - 106413 2495 ## COG1199 Rad3-related DNA helicases - Prom 106449 - 106508 5.8 - Term 106457 - 106506 16.1 89 38 Tu 1 . - CDS 106523 - 106768 374 ## Closa_1342 Phosphotransferase system, phosphocarrier protein HPr - Prom 106862 - 106921 6.0 - Term 106968 - 107018 7.1 90 39 Op 1 . - CDS 107094 - 108332 1300 ## COG0205 6-phosphofructokinase 91 39 Op 2 . - CDS 108375 - 108779 411 ## EUBELI_00568 hypothetical protein - Prom 108908 - 108967 6.3 - Term 108931 - 108979 1.7 92 40 Tu 1 . - CDS 109014 - 109448 270 ## gi|160936429|ref|ZP_02083798.1| hypothetical protein CLOBOL_01321 - Prom 109678 - 109737 7.5 + Prom 109657 - 109716 6.0 93 41 Tu 1 . + CDS 109740 - 110162 515 ## EUBREC_1216 hemerythrin + Term 110273 - 110320 10.1 - Term 110260 - 110306 9.1 94 42 Op 1 . - CDS 110389 - 111819 968 ## COG1757 Na+/H+ antiporter 95 42 Op 2 . - CDS 111876 - 113000 480 ## COG0787 Alanine racemase 96 42 Op 3 . - CDS 113018 - 114118 671 ## COG0498 Threonine synthase - Prom 114226 - 114285 1.7 - Term 114214 - 114261 1.8 97 43 Op 1 2/0.000 - CDS 114295 - 114669 224 ## COG0251 Putative translation initiation inhibitor, yjgF family - Prom 114704 - 114763 6.1 - Term 114781 - 114826 1.4 98 43 Op 2 . - CDS 114871 - 115752 258 ## COG0583 Transcriptional regulator 99 43 Op 3 . - CDS 115843 - 117009 1306 ## Closa_2524 hypothetical protein 100 43 Op 4 . - CDS 117052 - 117213 71 ## gi|160936438|ref|ZP_02083807.1| hypothetical protein CLOBOL_01330 101 43 Op 5 . - CDS 117245 - 118681 1189 ## COG3395 Uncharacterized protein conserved in bacteria 102 44 Op 1 . - CDS 118810 - 119724 993 ## COG1402 Uncharacterized protein, putative amidase 103 44 Op 2 . - CDS 119752 - 120534 1048 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 104 44 Op 3 . - CDS 120518 - 121444 758 ## COG3386 Gluconolactonase 105 44 Op 4 . 
- CDS 121462 - 122895 1607 ## COG2407 L-fucose isomerase and related proteins 106 44 Op 5 1/0.059 - CDS 122910 - 123737 219 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 107 44 Op 6 . - CDS 123774 - 124841 1175 ## COG2055 Malate/L-lactate dehydrogenases - Prom 124868 - 124927 7.9 + Prom 125191 - 125250 3.5 108 45 Tu 1 . + CDS 125297 - 126343 1000 ## COG1609 Transcriptional regulators + Term 126465 - 126504 9.1 - Term 126453 - 126490 7.1 109 46 Op 1 . - CDS 126516 - 127802 1418 ## COG1593 TRAP-type C4-dicarboxylate transport system, large permease component 110 46 Op 2 . - CDS 127809 - 128417 601 ## ELI_2933 hypothetical protein 111 46 Op 3 . - CDS 128419 - 129459 350 ## PROTEIN SUPPORTED gi|114773040|ref|ZP_01450335.1| TRAP-type C4-dicarboxylate transport system, periplasmic component 112 46 Op 4 . - CDS 129495 - 130931 1551 ## Spirs_3076 hypothetical protein 113 46 Op 5 . - CDS 130934 - 132064 962 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes - Prom 132091 - 132150 3.1 114 47 Tu 1 . - CDS 132190 - 132258 56 ## - Prom 132288 - 132347 8.2 - Term 132397 - 132444 10.5 115 48 Op 1 3/0.000 - CDS 132479 - 133525 943 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 116 48 Op 2 21/0.000 - CDS 133537 - 134484 1054 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 117 48 Op 3 16/0.000 - CDS 134486 - 136000 1302 ## COG1129 ABC-type sugar transport system, ATPase component - Term 136022 - 136066 5.1 118 48 Op 4 . - CDS 136075 - 137127 1260 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 137188 - 137247 6.5 119 49 Tu 1 . - CDS 137952 - 138404 420 ## gi|160936461|ref|ZP_02083830.1| hypothetical protein CLOBOL_01353 - Prom 138554 - 138613 3.2 - Term 138704 - 138764 5.6 120 50 Op 1 . - CDS 138793 - 139698 662 ## COG2199 FOG: GGDEF domain 121 50 Op 2 . 
- CDS 139712 - 140008 199 ## gi|160936464|ref|ZP_02083833.1| hypothetical protein CLOBOL_01356 - Prom 140251 - 140310 6.6 - Term 140170 - 140210 9.8 122 51 Tu 1 . - CDS 140319 - 141752 1413 ## COG2252 Permeases - Prom 141977 - 142036 6.5 Predicted protein(s) >gi|157101650|gb|DS480674.1| GENE 1 3 - 822 619 273 aa, chain - ## HITS:1 COG:Cj1311 KEGG:ns NR:ns ## COG: Cj1311 COG1083 # Protein_GI_number: 15792634 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-N-acetylneuraminic acid synthetase # Organism: Campylobacter jejuni # 8 224 6 230 232 65 24.0 8e-11 MSEKVTAIIPAKGMSNRLPSKNTLEFGESNLLVHKIRQLKQVDAITEVIVSSEDDDILTM ADKEGARAIKRPYEYSAETLPFQDFLYYITGEAREDHVMWACCTSPMVGADVYRKGCKTY FEKLKEGYDSLITVMEYHHYLLDKDMKPYTFKWGPEHRNSQDLDKLYFFTDGIQLAPKAS MREWYYYFGHNPYGMEVDLKTSTDIDTIFDYLLARMQYHMDDDAMMAKGINSKDGEAVYH YVLGKMISQLSDEKLAKALMEMPDGKRANVLGL >gi|157101650|gb|DS480674.1| GENE 2 1171 - 1287 166 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLEIVTSLIPSRLIRTLRLKKLKKTFRGAGDIFYKCVI >gi|157101650|gb|DS480674.1| GENE 3 1502 - 1690 89 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936322|ref|ZP_02083691.1| ## NR: gi|160936322|ref|ZP_02083691.1| hypothetical protein CLOBOL_01214 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01214 [Clostridium bolteae ATCC BAA-613] # 1 62 49 110 110 124 100.0 2e-27 MLTTAVEHTADVVQCRRKVDNAEGSGNVDVFQGVELTDHHDILDAFLCFQKNYKWTSTRR CK >gi|157101650|gb|DS480674.1| GENE 4 1936 - 3030 318 364 aa, chain - ## HITS:1 COG:SP1771_1 KEGG:ns NR:ns ## COG: SP1771_1 COG0463 # Protein_GI_number: 15901601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 8 238 7 244 259 139 35.0 7e-33 MIKGNPLVSIIVPVYGVEAYLDQCINSIVEQTYQNIELILIDDGSSDRCPEICDAWATRD CRIQVLHKENGGQSSARNIGLDTAKGEWIFFVDSDDWIEPECISVLVEISLVNNADISVI YPQNHKGNMVIPKPFFLKDKGDVYCLSAEDAVTYFMEQAVAVMGKLYRADILQDIRFPVG RKAEEYIVQLAALKKAKIIAFCNRNLYDYLVRNDSDSHDIKPKYRVDNIQAISEALDICR 
RDFRCEEEWAFRWLCALLHEFYSVSEFAKEEKKKYNDILVYALSQVGGMEQIYKKMEQPL DRIIYVANQYQKILKEEEYKILQKQYREVYRSQAKREVKGIKYRIANLNLKFLVALYNGR EVIK >gi|157101650|gb|DS480674.1| GENE 5 3033 - 3626 260 197 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936324|ref|ZP_02083693.1| ## NR: gi|160936324|ref|ZP_02083693.1| hypothetical protein CLOBOL_01216 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01216 [Clostridium bolteae ATCC BAA-613] # 1 197 229 425 425 396 100.0 1e-109 MLSKINAIQCGISEKQIVVVGPMKYAGFQFEYEKTNVLSKIGIVFDGLDNFKNNLAMLKS AHQVAEHLELKCYIRFHPSNKREEYAKYLSDTDVICDDLKEFEMIPDFYITYISTMYTDM LFKKRISFRYATQEPNMFTSLTDTTFSNADELEAVVINARENYAACLEWQKATKQEIFGA DEGVNQYQKFFKVWGES >gi|157101650|gb|DS480674.1| GENE 6 4547 - 5596 287 349 aa, chain - ## HITS:1 COG:no KEGG:LAR_1288 NR:ns ## KEGG: LAR_1288 # Name: not_defined # Def: hypothetical protein # Organism: L.reuteri_K # Pathway: not_defined # 1 347 1 335 338 170 35.0 1e-40 MGKYILSVVGANGQNNAGSKAGNDVLRVSQECGYKLIPLYESNQVRTRVQDIISGVIATY SLRNKLVDGDIVLMQYPLNRLLMKNIFRILKRCKSKIRIATLIHDIDYLRDIPLGDKGVD GMKVLELSLLGSSDYLICHNPFMIRTLQKEKLSVEYISLDLFDYLYDGTPATISEDKSTV IVAGNLLESKAGYLYQIKKDKHKFALSLYGSNYAVDKMQMDNATYHGSFKPDELIANLYG AYGLVWDGSSTETCSGSYGKYLRINNPHKVSLYIAAGIPVVIWKEAALCSLIEENALGFG ISSLDELEEALKSHEHLYQSYRNNVLNMKEKVCSGGFLKYVLVQIDRME >gi|157101650|gb|DS480674.1| GENE 7 5600 - 6823 450 407 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936326|ref|ZP_02083695.1| ## NR: gi|160936326|ref|ZP_02083695.1| hypothetical protein CLOBOL_01218 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01218 [Clostridium bolteae ATCC BAA-613] # 1 407 1 407 407 647 100.0 0 MVRNNNNFIYTFLSIVVVLFLVNRFLMPITMVLAIVLLYMMAHEGKVALYMNQITKLFVV IAGMGIVYMVLSIFGGFSALGSSRYAVKEIERGIIYILIAQIIMRSKVDIRKYISIWNML LTLSVVVAILQFTKVIDMNTILKNFYGDSIQFYNTQFTVLSSFRCGSIFVNPNVFACFLV ACLGSHLVVSRKYGCGFMRNIVVYLLCIIGFVLAGSRTGFVLGVVILIYHLFTEGRANQG YAMRFLGTMLFFFLIFVLFMNLTGNIGSGILDDLRLFKVNEGMSNSFGGKISIFTNLIKQ MQPWNYLIGYGPFDYSLSSSFLVDFDLGYFIVYYGLIGIGLYVAMLRSLLHYEFMDGTTD 
HRLPRMLVVIMIVFGLTAGTYFNLRIFAIYMLMFLPMLANNREGIEA >gi|157101650|gb|DS480674.1| GENE 8 6823 - 7833 266 336 aa, chain - ## HITS:1 COG:wbbK KEGG:ns NR:ns ## COG: wbbK COG0438 # Protein_GI_number: 16129972 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Escherichia coli K12 # 3 322 6 333 372 112 28.0 8e-25 MRILVYDVAAENGGAVTVLNWFYGIHKEDKENQYVYLLGKHTLENTDNITVINVPEVKKT WLHRLWFDYIGVKKYLCDYKIEQILSLQNTDIPFCKLPQTVYIHNALPFSEHRFSIDEDK KLWIYQNLIGSMIKQSIKRANIVIVQTNWMKNEVVRQTGISETKVQVVFPEINLFEDVIW LPADKCTFFYPASDARFKNHDVILEASKLLYQKGITQFRCVFTLKGNENSRISRIQEEAV DCGMDFLWVGQQNREQMSDWFAQSILLFPSYLETVGLPICEAMSVGAPVLLSDCLYARDV AKEYANARYFVKDNAAALAQLMEVCIVKRQDNGGAR >gi|157101650|gb|DS480674.1| GENE 9 7834 - 8958 494 374 aa, chain - ## HITS:1 COG:SP0360 KEGG:ns NR:ns ## COG: SP0360 COG0381 # Protein_GI_number: 15900289 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine 2-epimerase # Organism: Streptococcus pneumoniae TIGR4 # 3 374 17 393 394 493 61.0 1e-139 MSKLKLMTIVGTRPEVIKMSAIIKKAHQYFDHILVHTGQNYDYELNKIFFDDLGLREPDH YLGVVGENLGQTMGNVISKSYELMSAVKPDAIIVLGDTNSCLCVISAKRLKIPVFHMEAG NRCKDENLPEEVIRRIVDVTSDINLCYSEHARRYILDSGAKPEYTFVVGSPMAEVLTDHK EKIDSSDVLKRLGLEAKKYILLSAHREENINIEENFFSLMNAVNAMAEKYDMPILYSCHP RSKKYIEQRGFIFDKRVIQHQPLGFFDYNKLQQNAFCVVSDSGTVPEEGAYFKFPAVSVR TSTERPEAMDQGVFTIGSISTEQVLQAVELVTAMYANGDLAAEVPCYQDRAVSTKVIKII QSYSGIVNKMIWRK >gi|157101650|gb|DS480674.1| GENE 10 8986 - 9840 298 284 aa, chain - ## HITS:1 COG:RP332 KEGG:ns NR:ns ## COG: RP332 COG1091 # Protein_GI_number: 15604200 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Rickettsia prowazekii # 1 268 1 280 284 171 36.0 1e-42 MRFLILGSNGMAGNTIAVYLKETGHEVLGYAREKSLYFDSVIGDAYNTVQLKEVIDSGNF DSIINCVGLLNQFAENNKPGAVFLNSYLPHYLAGITEGTSIQVVHMSTDCVFSGEKGNYL ESDFPDGKTFYDRTKALGELNDDKNITLRNSIVGPDIKASGIGLLSWFMKQSGTINGFTK AMWTGQTTVQLAKTMEYVAKERMGGLYNAVPHNSISKYELLKLFNKYLRDNSVTINPVEG 
IVADKSLVPSKLGDDFEIPNYEQMVAEMALWMRAHKEMYPHYEL >gi|157101650|gb|DS480674.1| GENE 11 9927 - 10970 467 347 aa, chain - ## HITS:1 COG:SP0358 KEGG:ns NR:ns ## COG: SP0358 COG1086 # Protein_GI_number: 15900287 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Streptococcus pneumoniae TIGR4 # 1 342 1 343 351 536 76.0 1e-152 MSLFKDKTLMITGGTGSFGNAVLNRFLKTDIGEIRIFSRDEKKQDDMRHEFQTKMPEVAD KISFYIGDVRDIQSIKSAMHGVDYIFHAAALKQVPSCEFFPIEAVKTNVLGTENVLTAAI DDGVETVICLSTDKAAYPVNAMGTSKAMMEKVIVAKSRTISKTKICCTRYGNVMCSRGSV IPLWIEQIRKGNAITLTDPGMTRFIMSLDDAVSLVLFAFEHGERGDILVQKASACTIGVQ AEAVCDLFGGKKEDIKIIGIRHGEKMYETLLTNEECAHAIDMGNFYRVPCDKRGLNYDKY FIHGDTERNTLTEFNSNNTDLLSVEQVKEKLLELAYIREELQKGSEL >gi|157101650|gb|DS480674.1| GENE 12 10986 - 12176 475 396 aa, chain - ## HITS:1 COG:TM0631 KEGG:ns NR:ns ## COG: TM0631 COG0438 # Protein_GI_number: 15643396 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 11 384 32 416 434 142 27.0 1e-33 MGFTNLENRGIYPDLLREFVRRGDIVYVVSPIERKNNKKTHIVKEDSWEILKVRTGNIQQ TNLIEKGIATVLLEQQFINAIKKYYSNTKFDLVLYSTPPVTLARVVAYIKKRDKALSYLM LKDIFPQNSIDLGILKKTGLKGVIYKYFSLKEQKLYKLSDVIGCTSEANIRYVKEHDELD KKKIIEFCPNCSDWYDLSLPDNGKKEVRNKYGLPVDKKIFVYGGNLGRPQDVPFIVKCLE ACKDMKNVYFLVVGSGTEKHYLDEYVEKESCSHVRVMGQLPKQEYDSMVACCDCGIIFLD YRFTVPNTPSRLLAYIQAGIPVLTCTDPATDVGDIVEDNGFGWQCTSDKIENFVRLVEHI ANLDIDPRMKENGLKYLEENYTPKVGYKAIVKHLNI >gi|157101650|gb|DS480674.1| GENE 13 12298 - 12432 110 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936332|ref|ZP_02083701.1| ## NR: gi|160936332|ref|ZP_02083701.1| hypothetical protein CLOBOL_01224 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01224 [Clostridium bolteae ATCC BAA-613] # 1 44 1 44 44 73 100.0 3e-12 MVATFIDPEIEILSKHRAEFAAIGVELLVSYEETAHLCFDKFEM >gi|157101650|gb|DS480674.1| GENE 14 12707 - 13312 503 201 aa, chain - ## HITS:1 COG:NMB1820_2 KEGG:ns NR:ns 
## COG: NMB1820_2 COG0110 # Protein_GI_number: 15677656 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Neisseria meningitidis MC58 # 25 198 1 178 190 126 43.0 3e-29 MNTLVILGASGHGKVIADIGLKNEYENILFLDDNATGKCLGFPIVGTSEDLERLNDGKTD FVIAVGNNAIRRRIAESHDVNWVKLIHPSAQIALGVQISKGTVVMAGAIVNAEVTIGEHC IVNSGAIVEHDNVLGDFVHISPNAALGGTVHVGDNTHIGIGAVVKNNIDICSNCTIGAGT VVVENLFIEGTYVGTPAKIIA >gi|157101650|gb|DS480674.1| GENE 15 13309 - 13980 294 223 aa, chain - ## HITS:1 COG:PM1011 KEGG:ns NR:ns ## COG: PM1011 COG2148 # Protein_GI_number: 15602876 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Pasteurella multocida # 22 220 2 200 200 238 55.0 5e-63 MGNVLENEVKPIHKMGIYEKYIKRPLDFILSLCAILVLSPVLLLVAILVKIKLGSPVIFK QERPGLYGKIFNMYKFRTMTDEKDENGNLLSDEKRLTGFGKMLRSTSLDELPELFCILNG CMSCVGPRPLLMKYLPRYNQRQLRRHEVKPGLTGYAQAYGRNSLTWEEKFEKDVYYVEHM SLWLDVMILFKTVAVVLKREGINSETSATMEEFMGVPEEETVG >gi|157101650|gb|DS480674.1| GENE 16 14005 - 15210 420 401 aa, chain - ## HITS:1 COG:BS_yvfE KEGG:ns NR:ns ## COG: BS_yvfE COG0399 # Protein_GI_number: 16080476 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Bacillus subtilis # 1 251 12 266 301 269 48.0 7e-72 MHREELEYMREAYETNWMSTVGANINEVEKQMAEIIGCKYAVALSAGTAALHLAMKVAGV KQGDKVFCSDMTFDATVNPVVYEGGIPIFIDTEYDTWNMDPVALEKAFEIYPDVKIVVVA HLYGTPGKIDEIKAVCDKHGAVIVEDAAESLGATYKGKQTGNFGKVGICSFNGNKIITGS AGGVLLTDSKEEAEKVRKWSTQSRENAPWYQHEELGYNYRMSNVIAGVVRGQIPYLNEHI AQKKAIYERYRAGFKGLPVQMNPIEYENSEPNYWLSCLIIDPDAMCKQVRGEGEALYIPK HGKSCPTEILDTLATHNAEGRPIWKPMHMQPIYRMNGFVTREGNGRAKTNSYIAGDAIGR DGMPLDVGMDIFHRGLCLPSDNKMKPEQQDVVIEIIRACFE >gi|157101650|gb|DS480674.1| GENE 17 15266 - 15415 97 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936491|ref|ZP_02083859.1| ## NR: gi|160936491|ref|ZP_02083859.1| 
hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] # 1 44 1 44 454 86 100.0 5e-16 MAIHSCRVVEATKKHQKSQYLNVLNHETEKERKYVVFASNNIIQDKYVY >gi|157101650|gb|DS480674.1| GENE 18 15551 - 16717 341 388 aa, chain + ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 388 1 388 391 350 45.0 3e-96 MIYVGIDIAKLNHFAAAISSDGEILIEPFKFSNNYDGFYLLLSHLAPLDQNSIIIGLEST AHYGDNLVRFLISKGFKVCVLNPIQTSFMRKNNVRKTKTDKVDTFVIAKTLMMQDSLRFM ALEDLDYIELKELGRFRQKLVKQRTRLKIQLTSYVDQAFPELQYFFKSGLHQNSVYAVLK EAPTPNAIASMHLTHLAHTLEVASHGHFGKDKARELRVLAQKSVGVNDSSLSIQITHTIE QIELLDSQLFSTELEMANLVTCLHSVIMTIPGIGVVNGGMILGEIGDIHRFSNPKKLLAF AGLDPTVYQSGNFQAHRTRMSKRGSKVLRYALMNAAHNVVKNNATFKAYYDAKRAEGRTH YNALGHCAGKLVRVIWKMLTDEVAFNLE >gi|157101650|gb|DS480674.1| GENE 19 17401 - 19713 2223 770 aa, chain - ## HITS:1 COG:CAC1079_2 KEGG:ns NR:ns ## COG: CAC1079_2 COG5263 # Protein_GI_number: 15894364 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Clostridium acetobutylicum # 119 551 136 520 2566 95 26.0 5e-19 MKIKTRMQIKKTLMQMVTCTSAVLVISAMPVYGADGWQQDTAQQWFYMEHDKKVVNRWVT WADGTPRYLGGNGLIVTNNWVNAGGDRYRVKEDGSRYENEWFSVTSNPSLPSGKPSTAWY YAGADGKILVNGWHELEGRYYYFYPGGNSPRTSFFTAEDKRYYVDAEGVRPEPGWFSTEN VDSKGNPYVNWYYVQPDGSLLTDGWHELDGITYYFDKNGNSPRKRWVNLEDNRYYVDDTG ELMKGWFSITGTSSNGQEYTNWYYGDSNGVLWRGGWGAIDGKWYYFDPNGLNYRKRWYID NVKGKRYYLDEDGVLQDKGWFKIENVNSVTKAVTESWYYAGEDGAVLKGGYKTIGDGAAE NTAATERTYYFDANGLNYRKRWITDNNGDKRYMGDDGAMKKKEWFVISGLDSRNADYNNW YYAESDGKVIVDKWHKIDGKYYCFNASGVMRKGWLTETPEDDDKESSYYYCAEDGARAAG WQWLEIPETWMDNTDVADYVQEHGQYAYFYFSTTSGKKKRSGSGKQEKDVDGITYCFDSY GIMYPGWVKMSSTVPEIKGYRYFYQPETEKDKKYIPGEKVESAWLKLEGPPDASGSGQKE WYYFDNTGKPVCGGEDSYEIKKIGDGYYIFDMFGAAQYGLVEVKGDFYYCGAEDGDRKCA 
VGKAMVDDGVSATRSQYYFDIKGKGITGIKDGYAYYKGKMQKADKAARYEVFDIPGEGKC LVNAAGKVMKNTKVTDGNGQKWETGAGGSIKVYGSDEVAELEVPEATVSY >gi|157101650|gb|DS480674.1| GENE 20 19710 - 20198 393 162 aa, chain - ## HITS:1 COG:no KEGG:Shel_01150 NR:ns ## KEGG: Shel_01150 # Name: not_defined # Def: PAS domain-containing protein # Organism: S.heliotrinireducens # Pathway: not_defined # 3 135 375 508 524 99 37.0 3e-20 MPEKPKVYIQTFYGFDVFVNGRVIYFPSKKSKELLAILINQRGGSVSLAQVVHILYENLK ESTAKRNIRVVYHRLRKTLEEYGCEEMLIHKRGIFSIHTEWFECDLYEFVKGNETYMTAY TGTYMAEYPWASATIPYLDKIYARMNDNDKTHREIYSEEWIG >gi|157101650|gb|DS480674.1| GENE 21 20235 - 22319 1722 694 aa, chain - ## HITS:1 COG:BH3718 KEGG:ns NR:ns ## COG: BH3718 COG1086 # Protein_GI_number: 15616280 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Bacillus halodurans # 70 628 58 604 608 439 42.0 1e-122 MDKYLPRSKRVRMCLLWAYDMAAVWICACLALGLRFDLDFNSIPASYVHALWKYGLLQMA VVTLLFYGGRLYAIMWGSAGTREMLEVVLICIIAALAQPAGILISGQRMPRSYYLLWFIL MTGAAICGRISFQVLQRTVNRMNRAWEKDTTPQVMVVGAGQAGTLIIREMKASQKIHGFP ACIVDDDRDKQGRFIDGVVVAGTRDDIPELVRKKGIDAIYVALPSAPVDDIKDILGICQK TGCQVKYLPGVYQLMNGEVNISRLKEVEIEDLLGREPVQVNLNEIMGCVRDKVVLVTGGG GSIGSELCRQLAGHGVRQLIIFDMYENNAYEIQQELKRNVPDLNLVVLIGSVRNTNRLDY LFRTYRPDIVYHAAAHKHVPLMEDSPNEAIKNNVLGTYKTARAAMQYGTGRFILISTDKA VNPANIMGASKRLCEMVIQMCNQKSGTEFVAVRFGNVLGSNGSVIPLFKKQIENGGPVTV THRDIIRYFMTIPEAVSLVLQAGAYAMGGEIFVLNMGKPVRILDMAENLIRLSGYEPYRD IDIQFTGLRPGEKLYEELLMDEEGLQRTGNERIFIGKPIEMDYHRFDRGLGKLDEAAWLE KDNIRELVHELVPEYHYRAEEDACGAMTDQDEVFAQAAAGMEPGYGEGTGPESAYDTKAG AGSPVKSRMALHNAMAVYSRKQEITMTKKENTRI >gi|157101650|gb|DS480674.1| GENE 22 22424 - 23104 653 226 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936342|ref|ZP_02083711.1| ## NR: gi|160936342|ref|ZP_02083711.1| hypothetical protein CLOBOL_01234 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01234 [Clostridium bolteae ATCC BAA-613] # 1 226 1 226 226 383 100.0 1e-105 
MKKRKAIITLAMALTVSQATCILAANSPGTGPVIVGGGSDGSSHDIGEKVDQIKGGASAG STVTSSGQGNNAVGIAVGQTAGEHAVETNSRGQAVIGDTALEFVQGANAAVAGLPEPVVN TINSINSGKPLSEAVPGLDLAGYNALVGTHAIMTKDAATNTEKTGQVEVPLYVPNLLDGL GDVEVLFYNNMTGTWQLIKPTRVDAGSKMLWFNAPGSGTLSVVYKK >gi|157101650|gb|DS480674.1| GENE 23 23193 - 24116 781 307 aa, chain - ## HITS:1 COG:slr0883 KEGG:ns NR:ns ## COG: slr0883 COG1316 # Protein_GI_number: 16330199 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Synechocystis # 40 285 108 338 481 77 30.0 4e-14 MSREAQQWHRVQAGNTPAEPDGGVEYQGKSYRRNTYIKAILCMGIDRPGSLEETRTAGFG GQADGIFLVAQDTARDRIKVLVIPRDTMTEITLTDLSGNELGSSIQHLTLAYAYGDGREK SCRYMTEAVSRLLGGMSIDGYMAVSMSVLPLANDKVGGVTVTIEEPGLKQADPVFVQGQT VTLRGEQAEKYVRYRDTGRAQSALVRMERQKTYIKGFLDAARNKSRLDDSLIPDLMKEIE PYMVTDMTKDRYLDMALDFLGGNQDFTDADMVTLPGTAVETVIYDEYHPNQEEIMEIVLD WFYRPRE >gi|157101650|gb|DS480674.1| GENE 24 24277 - 25077 824 266 aa, chain - ## HITS:1 COG:SP0349 KEGG:ns NR:ns ## COG: SP0349 COG0489 # Protein_GI_number: 15900278 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Streptococcus pneumoniae TIGR4 # 18 234 18 226 227 162 41.0 7e-40 MDEILHLKNTGRKDYFYEEAIKTLRTNIQFCGSGLKTIMFTSSMPDEGKSETAFALASSF GNIGKKVLLVDADIRKSVMVKRYEIKGNPNGLSQYLSGQKSLEEICYETDMENLDMVLSG PFSPNPAELLEDELFKTMIESVKEIYDYIIIDTPPMANVIDGAIIASQCDGAVIVIESGA ISYRLVQKVRSQLEKSGCRILGAVLNRVGGGYEHSYYEKYYGRRGGKYYGKYGRHYGRYE EGKAPPVNEVSGARIKNDTTAQSQDT >gi|157101650|gb|DS480674.1| GENE 25 25080 - 25850 664 256 aa, chain - ## HITS:1 COG:SP0348 KEGG:ns NR:ns ## COG: SP0348 COG3944 # Protein_GI_number: 15900277 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 7 225 4 225 230 114 31.0 2e-25 MEKRYREDDMEIDLLELLFEFKKRAWVIILAAVLGCLGAGAYSRLILTPVYTSTAMVYVL SKETTLTSLADLQIGSQLTKDYSVMITSRPVLEQVIKNQGLNMTYGQLKARIRISNPADT 
RILNMTVSDTDPVRAKAIADEVANASSDYIGDIMEMVPPKIIEQGVVPAAPASPSIKKNA ALGGLACIAAACGVITLKVIMNDTIRSEEDVGKYLGMSVLASVPDDDGMAKEQQRNRSKR KLERKKRKTEKNRRGE >gi|157101650|gb|DS480674.1| GENE 26 25892 - 26638 387 248 aa, chain - ## HITS:1 COG:SP0347 KEGG:ns NR:ns ## COG: SP0347 COG4464 # Protein_GI_number: 15900276 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 11 248 2 243 243 120 30.0 2e-27 MADKGGTKGWVDIHAHILPGVDDGASDWEETGEMLCKAHKQGITHIIATPHYVSGQDTKR LREMMKRLKETAFSISENMMVSLGQEVQYFEELPSFLEQGKVLTLAGSRYVLVEFLPGDG YMRLFRAVRRLVQSSYLPVIAHAERYDCLKEKGRTKELARCGAYLQMNAGSLGGGLFDRR AAWCRKEILRGNIHFIATDMHGVVIRPPELEEAVRWMTRQDRNGQDAGGMAKRLLRQNQE HILRDSVL >gi|157101650|gb|DS480674.1| GENE 27 27013 - 27930 623 305 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 14 272 8 262 265 143 32.0 3e-34 MDMAYTVEQFLDFLKEEEKSDATISKYTYELQMFLQFLGKREIGKELMIQYRTYLSSRYR PQTVNGKLSAVNAFLKFTGLYEYRVKFLKVQRRAYIDETRELTQKEYERLMETAGRQGKY QLYYLMMTICSTGIRVSELRYVTVEAVMRGKAEIFMKGKYRIVIFPKNLAAQLKAFARKN GIRSGSLFCTRSGRPLDRSNICHAMKKLCAKAGVKKDKVFPHNFRHLFARSFYAAEKNMA HLADILGHSSIETTRIYVAASIKEHERILNKLKIGVINKLPQNKHSVVYNSLVFTCRSKN SIQYI >gi|157101650|gb|DS480674.1| GENE 28 28091 - 28573 648 160 aa, chain - ## HITS:1 COG:FN0673 KEGG:ns NR:ns ## COG: FN0673 COG2606 # Protein_GI_number: 19704008 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 153 1 153 154 172 56.0 2e-43 MAFDIAKEHLRKAGLEDRIYEFEVSSATVELAAQAVGCEPARIAKTLSFMADQKAVLIVA AGDAKVDNHKYKEQFHTKAKMLSPDEVTELVGHSVGGVCPFGVKEGVAVYLDESLKRFDV VYPACGSASSAVKLTIPELETASGYLGWIDVCKGWGEDSI >gi|157101650|gb|DS480674.1| GENE 29 28635 - 29447 890 270 aa, chain - ## HITS:1 COG:no KEGG:Closa_2353 NR:ns ## KEGG: 
Closa_2353 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 267 1 267 271 268 49.0 2e-70 MSLILCRQEPVKHPYYIDVLGIHIHSSQELCYIVYNHPVLVMDDFLDDLLVDFIRKDLDM DYLAGRMEKLVETGTRPEEVLALFLSECDYYSDKEIQKFKQTTAALRALHPAQYEKQRAD YMFGQQQYGKAAARYSKILEYPRDKVVEDLFLAKVYNNLGACYAMMFQFHKALGCYDKAY ELGKDDALLKRIYFLTVFAPELDVKDKYQSVFTDERKRTWSGEMDLAALEAGQAEEVRAL RALFKKDPIKRMSGAAEMVRRWKQEYRMMV >gi|157101650|gb|DS480674.1| GENE 30 29444 - 30667 1338 407 aa, chain - ## HITS:1 COG:no KEGG:Closa_2354 NR:ns ## KEGG: Closa_2354 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 407 1 407 407 550 61.0 1e-155 MDGLVIGLDLNDDYTQICCYDKEKSWTIPTVICRRKEEEVWLSGEEAYAATLLGEGVIVD KLLKMAAKDGTSTIGGICYGGGTLLKLFIEKMLGYPRKEFGTDEVAQLVITLQSVDCRLL DTLMYCADYLEIPRDRVHVISHTEGFIYYVLSQKKELWTNQVGLFELSGERLCYYEMKVQ RGMRRNMVQAEAQNQEEAFNLDILDSPSGSRLADKILTACGEKLLNRKLFSTVFLTGKGF ERQDWAGGFMRLICNRRKVFVEPCLFARGAAFKGADYTHQETSYPYVFICEGRLKAEVSL KVMRRGRENQLVVASYGDNWYESKSSMDLIVDGQKEIEFIISPLDSKKKKLVRIPLAGFP ERPPKTTRIELKVAFTDEGTMTMSIRDKGFGELFPSSGAVVKQEVNL >gi|157101650|gb|DS480674.1| GENE 31 30682 - 34272 3756 1196 aa, chain - ## HITS:1 COG:no KEGG:Closa_2355 NR:ns ## KEGG: Closa_2355 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 1196 1 1190 1190 1086 48.0 0 MRERINRLARGIIDNGSPELVLAPERVEASVPAGEVIRGEILVSSGNNLHIKGLVYSSHE RVRVVNSAFGGLRNRIIYEVNVGFSEHGDEIKGSFYLVTNGGEREIPYSLRVQAGDAGEV LGNLKTPRDFAVLAKKDLEKALRMFEYQDFTEAPFMQDSRVRTIYDGLKGRAGRRNLLEE FLVALQVKEPVKLTLETGTRIYENLTGIAEDYIDIAAGTWGYVSADITVDAPFIELGTFR ITDQDFVQGRCRVGYRIVPSRLHRGRNFGCIRVKSLREEFLISVEAEGHHGSGSTERESG SDRFMDHGSLYKYLSLRLDYEAGVYEPALLLNQMMKETEHLRADFPGDARAKLIQAELLI LNGREDNASLALDDARDHVLAHREKQVELYCFYQYLRLEIKPSVQQKESLVRYIRKLLWE DGEIRPYLFLMLVKLDETMAQNPLELYRSMASLFNQGCCSPFLYASACRLMEAHPDLFVR LGDFEIQTLYLGVQKGIVSRVTALKAAGLALGLKHYRKLVERLFTKLYQTYEEQEILEAV CSLLIKGDCKEKDAFAWYEKALEQGVNLTRLYEYFLYSLPEDYGHLLPKEVLLYFSYDKD 
LDRHSRQMLYRNILEYMNPSAELYQNYTRHMEQFAMEQLFQSRINRSLAVIYEHMIYKDM IDSRVARVLPGILKSCRIRCGDSRMKYVIVRYEELEDEEAFLLEDQSAYVPLFSERSVLL FQDAYGNRYLDVKHWKVSVMDRPELLAQCYEVYPDHPMLKLSECRDIVNRGVETDEEAAL LEEIVTQMHLSPAFEGRLMQAVTDYYCQKASEDKDGDGVFNCAYLVQIDKKPMAVRQRQQ ICETLISQNYMREAYNMIRDYGSQYISTPRLMKLCTKTILMNLFDQDDLLLGLSHQVFCQ GCYDSVILDYLCEHFNGTVSQMYEVLIQGVREHVETYDLEERLLGQMLFTGCCDQMDSVF ELYMKRKTTKEMVVKAYFTQKSVQYFLEEKDMDQRVFEYLKQAVGNTFDKDRMPTIYLLA LTRYFSTLDKVDGGDAELLKTMTSLLLEEGLVFPYTRELSKHIPVPEDIMDKAMVEYRGR KDAHPELQVRILPEETGFHSEDIRRVYQGIFVKQKVLFEGEIMEYRIYDYLDGHRRLAAE GQVECDHKLEGKENSRFACLNEMGAAIKDRDDSRLLNAMEDYLKKSAALGRLFPME >gi|157101650|gb|DS480674.1| GENE 32 34594 - 35748 1369 384 aa, chain + ## HITS:1 COG:BH3168 KEGG:ns NR:ns ## COG: BH3168 COG0281 # Protein_GI_number: 15615730 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Bacillus halodurans # 2 369 3 371 410 423 61.0 1e-118 MTTNEKALLLHEEWKGKLETTSKAKVASREDLALAYTPGVAEPCKVIAADPEAAYRYTIK SNTIAVVSDGSAVLGLGNIGPLAAMPVMEGKAVLFKEFGGVNAFPICLDTQDTQEIIETV VRIAPAFGGINLEDISAPRCFEIEEALKERLDIPVFHDDQHGTAIVVLAGVINALRLTGK KKEDCRVVVNGAGSAGIAITKLLLTYGFSHVTMCDKSGILSKGAEGLNWMQEKMMDVTNL EHRTGTLADAMKGADIFVGVSAPGIVTQDMVASMNKDAILFAMANPVPEIMPDLAKAAGA RVVGTGRSDFPNQVNNVVVFPGIFKGALEGRASAITEEMKLAAAEAIAGLVDEKDLSDEN ILPEAFDPRVADVVSSAVKDHIRK >gi|157101650|gb|DS480674.1| GENE 33 35793 - 36296 402 167 aa, chain + ## HITS:1 COG:no KEGG:Closa_2357 NR:ns ## KEGG: Closa_2357 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 167 6 172 174 242 74.0 4e-63 MTLYKLMNLYMLKQVNFPLTNAQLTNFFTEHEYTTYFTLQQALNELEDAGLVHKEASHNS TRYDITREGEETLNFFGKNISTAIIEDMDQYLKENKFRLREEVGTTADFYKGTNQDYIVH CEVRENKTTLIRLDLSVPDREQAEAMCNTWKTKSQEIYAHVMRSLLS >gi|157101650|gb|DS480674.1| GENE 34 36382 - 37236 747 284 aa, chain - ## HITS:1 COG:PA0351 KEGG:ns NR:ns ## COG: PA0351 COG0622 # Protein_GI_number: 15595548 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: 
Pseudomonas aeruginosa # 5 152 7 157 157 109 37.0 7e-24 METGPIRVAIMADTHGVLRPEVERILETCDVIVHAGDFDNQMLYHKLNVDQPLYAVRGNN DRGWSGGLGQVNRFEIGGVKFIMAHERVDIPSVLKDIQVVIFGHSHMYYQQEISGRLWLN PGSCGYKRFTLPLSMAVMTIEDGTYEVETVWLEHGYGTPGAATSQREKAKASKYEKQQKR YKQKQAKGEGQADKAKGAVKAARYGGQPAARQEAVKPAPDQEKEYLFLIAKILRLRKAGE SREWVIRNLGENFRLASTIYDICEKRPDSNARQILELLLEQISF >gi|157101650|gb|DS480674.1| GENE 35 37788 - 39206 1618 472 aa, chain + ## HITS:1 COG:PAB2354 KEGG:ns NR:ns ## COG: PAB2354 COG0591 # Protein_GI_number: 14520249 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Pyrococcus abyssi # 9 463 14 478 492 140 29.0 4e-33 MHELTNLDLVIIVLYMIGMLFVGVWFVKRIKNTGDYYVAGRTLGPFVLTATVCASIIGGS AMMGRAGIAYTTGFKAVMTAIPYLIGMFIFSAIAGRIQEVGFTRNINSIPELFNYRFNST ARCILAFMIVFTMMGTVAAQVTATATIIKMLGNEFGISYEAGALAATLIFIIYTAASGLF GVVYTDVLQFFMLIIFVYILIPFSSIRYLGGFSSFWQNLDKSYLVPHINGSILGDIVTYL VFTLAGAEMWQRAFAAKSKKAATQGMFWGTSIYMVTIALVFFMGLAGRQILPDVVASYGS SDAVVPALAVSILPPGLTGLALAGILSVMMSTADSYLLVSVQTFVHDLGKNFMPEMTEKH EILMSRVMSVVLAMGALIIALYIKNAYNVLMFAWSFYAAAAGLPSLAALYWKKATSQGII SAMTGGFVVCVGWKLAGNPWGLGETVPGALVCGILLVAVSLATCKKHPCRMA >gi|157101650|gb|DS480674.1| GENE 36 39226 - 40572 1444 448 aa, chain - ## HITS:1 COG:FN1877 KEGG:ns NR:ns ## COG: FN1877 COG2252 # Protein_GI_number: 19705182 # Func_class: R General function prediction only # Function: Permeases # Organism: Fusobacterium nucleatum # 18 432 13 427 442 386 51.0 1e-107 MSKETEGTAKRGASNTWIDRQFHLAEHGTTVRTEILAGITTFVTMAYVLGTVPNIMANAG LDRGVMLTSMILLIIVTTVAMALVTNRPFALAPGLGSVAIVSGMIANSGISPEITAGVIF WSGVIFIVISYIGLRDAVVNAIPASLKHAVSSGVGLFIALLGCKNAGLIIALEGKNNLAF GDLSSPSVLLALIGFLLILITKQMRVPGYMIFSILITTLVGIPMGLTRMPEAFFMAPSSP FGQFMKIDFLGALQFAYLPFLIAMFVPDFFSTFGTALGVGAKAGFLDEDGNLPGIDKVFK VDAVATALGSLFCMPCMTTYLESSSGVEAGGKTGLTPISTSVCFAFTLLLTPFALMIPSA ATAPALIIIGVGMLSSMRNINYDDFTESFPAFICVAFTIFANNIANGICVALPVYVVLKL ASGKIKEMPKIMYVLLGVCVLYFYSIIK >gi|157101650|gb|DS480674.1| GENE 37 40609 - 42399 1494 596 aa, 
chain - ## HITS:1 COG:BH0640 KEGG:ns NR:ns ## COG: BH0640 COG1001 # Protein_GI_number: 15613203 # Func_class: F Nucleotide transport and metabolism # Function: Adenine deaminase # Organism: Bacillus halodurans # 6 594 12 584 585 318 33.0 2e-86 MLNGGAMETERADFLVRGAMVYNSYFKKFVPADVSVSGGKFLYIDSRRSGAVEADETVDA EGLYMVPGLIDIHMHIESSMLTPGAFCRRLAECGVTTIVSEPHEMANVNGMQGVLDMIRA GENSPVDIFYGIPSCVPSTSPELETTGGIITCEEMEGLKENPWVACVGEVMNYRTIIREN QLEITRFLKQLREKDRIFPIEGHCPALLDLDLAKFLYLGINGDHTEHSLEELEQRFANGM FMEIQEKMLRREVLDYIREHGLEEHFCFVTDDVMADTLCIQGHLDALVRRAMELGMPAEQ AVYNASFTPAKRMNLLDRGAIAPGKIADFLLLSDLDTFAVSATYKNGKCIWRRGQDEREQ PAQEQHGPDRGQDMRFPEPYYHSIHLEFQEQSRFEVPMESDSGTVMVRVMEIQDGSTKTK EIIMEMPVRDGFLKWEGTGCLLAAVFERYGKGQAVGYGLVTGDCHKRGAVATSYAHDCHN ILVAGANPADMKLALNRVIEMQGGMAAADGGRIQAELPLHVGGILSDRPAKEVGASLARV REAMTGLGYRHYNPIMSFCTLTLPASPALKLTDKGLIDVKACKIVPLKVEGQRTQP >gi|157101650|gb|DS480674.1| GENE 38 42512 - 43423 743 303 aa, chain + ## HITS:1 COG:CAC1046 KEGG:ns NR:ns ## COG: CAC1046 COG0583 # Protein_GI_number: 15894333 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 4 300 4 301 303 233 38.0 4e-61 MISKEIRYMLAIRRYGSITKAADSLFITQSALSKAVKNIERQLGSPLFSRIGNDLIPTHI GNRYMDYAEKICQICSDWNSECEDLLGEEKGQLTIAVALMRGSCLLPDVLSRFYSRYPDV QINLLEESHSVEKQLMLSPNVDFAIYNSTATDPAQTAEYLGQEEIVLVAAPGHPLAEKAI AREGFRYPWVDIRHTEHEPYILHPRDQTTGRISAGLFKKAGLTPKVLLFTRNSDIAIRLA ASGTALCLAPESYVRKIHFYNPPLCFSVGDPATVTTLYAVYQKGRYIPSYGRYFIELVKE NYR >gi|157101650|gb|DS480674.1| GENE 39 43522 - 44103 757 193 aa, chain - ## HITS:1 COG:lin1593 KEGG:ns NR:ns ## COG: lin1593 COG0218 # Protein_GI_number: 16800661 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Listeria innocua # 1 190 1 190 194 190 46.0 1e-48 MIIRDVNLETVCGVTSKLPENTLPEFAFAGKSNVGKSSLINALMNRKAYARTSSQPGKTQ TINFYNINGALYYVDLPGYGYAKIAVAVKEKWGKMIERYLRNSQMLKMVFLLIDIRHEPS ANDKLMYEWIIFNGYHPVIIATKLDKINRSQVQKHVKMVREGLGMEKDGIIIPFSAETKQ GRDEIWDLIEGSL >gi|157101650|gb|DS480674.1| GENE 40 
44133 - 46442 2566 769 aa, chain - ## HITS:1 COG:BH3050 KEGG:ns NR:ns ## COG: BH3050 COG0466 # Protein_GI_number: 15615612 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Bacillus halodurans # 8 768 9 770 774 723 50.0 0 MGEQVITMPVVALRGLTILPGMVLHFDINRPKSVAAVERAMVGDQKLFLVAQRHPEIVEP EQGDLFQVGTVAVVKQLVKLPGKVVRVLVEGLERAELLCLEAEEPAMMGEIAAIEAEEDD LDSLTQEAMLRILKDKLEEYGRVNPKITKEILPNLMIITDLNEMLDQIAIQLPWDYTIRQ TVLENSSLSARYEVVMHTLLTEMEIYRIKKEFQEKVKADIDQNQKEYILREQMKVIRQEL GEDAVSDADEYQKKLDTLKADKEVKEKLHKEIERFRNMPAGSQEANVLRTYVETLLDLPW KKMSKDNDDIKHAERILNEDHYGLEQVKERILEYLAVRALTKKGTSPIICLVGPPGTGKT SIARSVARALNKKYVRISLGGVRDEAEIRGHRKTYVGAMPGRIVEGMRQASVSNPLMLLD EIDKVSSDYKGDTSSALLEVLDGEQNVKFRDHYVELPIDLSQVLFIATANTTQTIPGPLL DRMELIEVNSYTENEKFHIARDYLVTKQMERNGLKEGQITFSDKSLEKIIHNYTREAGVR NLERRIGDVCRKAARQFLEDKKKSIKITESNLEKYLGKEKVTFENANEEDEIGIVRGLAW TSVGGDTLQIEVNVMPGKGSLLMTGQLGDVMKESAQTALTYVRSVCPEYGIDDDYFEKHD LHIHIPEGAVPKDGPSAGITMATAMLSAVTGKSVQAKVAMTGEITLRGRVLPIGGLKEKI LAAKMAHIEKVLVPDKNRPDMAELSKEITRGLDIVYVKAMEDVVSEAFV >gi|157101650|gb|DS480674.1| GENE 41 46593 - 47912 284 439 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 156 416 248 461 466 114 32 3e-24 MPMRPDDTIRCSFCGRTQDQVRKMIAGAGNNNVYICDECIELCSEILEEELEGQDEGESP ALADIRLLKPKEIKAFLDEYVIGQDDAKKVLSVAVYNHYKRVTACQNMDVDVQKSNILMV GPTGSGKTYLAQTLAKILNVPFAIADATALTEAGYVGEDVENILLKLIQAADYDISRAEY GIIYIDEIDKITKKSENVSITRDVSGEGVQQALLKILEGTVASVPPQGGRKHPHQELLQI DTTNILFICGGAFDGLEKIVEQRLSAGSIGFNAEVANKNDLNIDELLKKVEPKDLTKFGL IPEFIGRVPVMVSLEQLTKDAMVRILSEPKNALMRQYQKLFELDNVKLEFTQEALEEIAQ LAVDRKIGARGLRSILESVMMDLMYEIPSDDSIGICTITRDVVKKAGQPELVYRDMPPAS ARKPLSERLRGDSKTGEIA >gi|157101650|gb|DS480674.1| GENE 42 47989 - 48570 618 193 aa, chain - ## HITS:1 COG:CAC2640 KEGG:ns NR:ns ## COG: CAC2640 COG0740 # Protein_GI_number: 15895898 # Func_class: O Posttranslational modification, protein turnover, 
chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Clostridium acetobutylicum # 1 193 1 193 193 305 77.0 3e-83 MSLVPYVIEQTSRGERSYDIYSRLLKERIIFLGEEVNDVSASLVVAQLLFLESEDPGKDI HLYINSPGGSVSAGFAIYDTMNYIKCDVSTICIGMAASMGAFLLSGGAKGKRFALPNAEI MIHQPSGGAKGQATEIKIVAENILKTRKKLNEILAANTGKPLEVIERDTERDNYMSALEA KEYGLIDEIIAHH >gi|157101650|gb|DS480674.1| GENE 43 48755 - 50041 1569 428 aa, chain - ## HITS:1 COG:BH3053 KEGG:ns NR:ns ## COG: BH3053 COG0544 # Protein_GI_number: 15615615 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Bacillus halodurans # 1 428 1 430 431 311 46.0 2e-84 MSVQVEKLEKNMAKLTIELSAEQFEDAMKKAFNKSKNKFNIPGFRKGKAPRAMIEKMYGE GIFYEEAADEAINATCMEAMNESGLEIVSRPEVAVEQIGKDKPFIYTATVAVRPEVTLGE YKGIEVEKVDAAVKPEDVEAELKKVQEQNARLLTVEDRPVADGDQTVIDFEGFVDGKTFD GGKAADYPLTIGSHSFIDTFEEQLVGKNIGEECEINVTFPEEYHAAELAGKPATFKVTVK EIKVKELPELDDEFAGEVSEFDTLDEYKKDIEAKILERKQKEAASENENRVVDKVVAGAS MEIPDRMVEGQIDNMVQDTARRMESQGLNMDLYMKYTGMTMEQMREQMKPQALKRIQTRL VLEEVVKAENLQVSDERLDEEIAKMAAAYQMEADKLKGYMSDRDKDQMKEDIAVQEAVDF LVAEAKLV >gi|157101650|gb|DS480674.1| GENE 44 50223 - 50777 668 184 aa, chain - ## HITS:1 COG:CAC1629 KEGG:ns NR:ns ## COG: CAC1629 COG0693 # Protein_GI_number: 15894907 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Clostridium acetobutylicum # 3 172 2 170 188 135 42.0 5e-32 MAKVYAFLADGLEEVECLAVVDVLRRSGVEVTLVSVTGDRKVTGSHGIELGTDALFEDVN PDVADVLFLPGGMPGTNNLKAHMGLRAAVECANKQGRRIAAICAAPSILGSMGLLKGRTA TCYPGFEDQLTGVSYTSQGVVTDGNITTGRGLGFALDMGLELIRLLQGPQQAQKIAAAIQ YNWK >gi|157101650|gb|DS480674.1| GENE 45 50935 - 51387 453 150 aa, chain + ## HITS:1 COG:SMc03154 KEGG:ns NR:ns ## COG: SMc03154 COG2954 # Protein_GI_number: 15966638 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # 
Organism: Sinorhizobium meliloti # 2 147 4 148 157 76 38.0 2e-14 MEIERKFLIPCLPEGYDAHPFHQIEQAYLCTEPVVRIRQEDDSFYLTYKGKGLLSREEYN LPLTREAYAHLLPKADGVVLTKKRYLIPLDGSDHLTIELDVFEGRYAGLLLAEVEFTSEE EAMAFQAPEWFGRDVTFTREYQNSRLACGK >gi|157101650|gb|DS480674.1| GENE 46 52063 - 53718 1431 551 aa, chain - ## HITS:1 COG:TP0686 KEGG:ns NR:ns ## COG: TP0686 COG4211 # Protein_GI_number: 15639673 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type glucose/galactose transport system, permease component # Organism: Treponema pallidum # 30 551 30 531 531 404 41.0 1e-112 MAKGNNETLTAEQEMQLRQPIEDYVGEIQKKIDSLRADGTNRVVALQNNIDATKRDRSLT KEERENRIAANSAEMEKAKAVEAKNKDEIAKLISEAESYLKEHFDKDYYLPLKESCEREK ALAKEKYGIKVAELNKEHENTLSKLTDHQEIKDEKYVHKNRLFDAKMDLDKELQRIKDRR HAAFVWQYHLIDMLRLSKFTFMETRAQKWENYKYTFNRRTFLLQNGLYIAIILIFIGLCI AAPVMKNVKLLTYNNILNILQQASPRMFLALGVAGLILLTGTDLSIGRMVGMGMTTATII MHQGINTGAVFGHVFDFTGMPIGLRVVFALVMCIILCTCFTAIGGFFTAKFKMHPFISTM ANMLVIFGLVTYATKGVSFGAIEPAIPKMIIPKINGFPTIILWAVAAIVIVWFIWNKTTF GKNLYAVGGNPEAAAVSGISVFGVTMGAFILAGILYGFGSWLECARMVGSGSAAYGQGWE MDAIAACVVGGVSFTGGIGKISGVVVGVLIFTVLIYSLTILGIDTNLQFVFEGIIIITAV TLDCLKYVQKK >gi|157101650|gb|DS480674.1| GENE 47 53734 - 55242 1395 502 aa, chain - ## HITS:1 COG:TP0685 KEGG:ns NR:ns ## COG: TP0685 COG1129 # Protein_GI_number: 15639672 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Treponema pallidum # 9 502 4 496 496 612 62.0 1e-175 MADKKDDIVLSIRGMSKSFGRNRVLDHINLDVKRGTVMGLMGENGAGKSTMMKCLFGTYQ KDEGNIFLNGKEVSFSGPKDALENGIAMVHQELNQCLERNVIDNLFLGRYPVNSMGVIDE GRMKKEAAELFRKLGMTVNLTQPMGRMSVSQRQMCEIAKAISYNSRVIVLDEPTSSLTVQ EVNKLFEMMRMLKEQGIALIYISHKMDEIFEICDEISVLRDGNLVMTKSTKDANMNELIA AMVGRSLDNRFPPVDNTPGDVILSIQNLSTKFEPHLQDVSFDVKEGEIFGLYGLVGAGRT ELLETIFGVRTRAAGRVYFNNRLMNFSSAKEAMEHGFAMITEERKANGLFLKGDLTFNTT IANLQQYMSGIALSDAKMIKATSKEIKIMHTKCMGPDDMISSLSGGNQQKVIFGKWLERS PQVFMMDEPTRGIDVGAKYEIYELIINMAKQGKTIIVVSSEMPEILGITNRIGVMSNGRL SGIVNTKETNQEELLRLSAKYL 
>gi|157101650|gb|DS480674.1| GENE 48 55341 - 56681 1608 446 aa, chain - ## HITS:1 COG:TP0684 KEGG:ns NR:ns ## COG: TP0684 COG1879 # Protein_GI_number: 15639671 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Treponema pallidum # 63 445 35 402 403 281 43.0 2e-75 MKKALSVALACSMALSLVACGGGSKPTEAPTTAAAGDTTTEAAADTTAADTAAAPEAGAE VANKDKPLVWFNRQPSSSATGQLDMTALNFNKDTYYVGFDANQGAELQGTMVKDYIEKNI DSIDRNGDGIIGYVLAIGDIGHNDSIARTRGIRKALGTAVEKDGNVNSDPVGTNADGTAT VVQDGSIEVGGKTYVVRELASQEMKNSAGATWDAATAGNAIGTWSSSFGDQIDIVASNND GMGMSMFNAWSKDNKVPTFGYDANSDAVAAIAEGYGGTISQHADVQAYLTLRVLRNALDG VDIDTGIGTADDAGNVLSPDVYVYKADERSYYALNVAVTAENYTDFTDSTMVYAPVSNQL DEATHATKNVWLNIYNSADNFLGSTYQPLLQKYDKLLNLKVDYIGGDGQTESNITNRLGN PSQYDAFAINMVKTDNAASYTSLLEQ >gi|157101650|gb|DS480674.1| GENE 49 56840 - 57868 1240 342 aa, chain - ## HITS:1 COG:STM2190 KEGG:ns NR:ns ## COG: STM2190 COG1879 # Protein_GI_number: 16765520 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Salmonella typhimurium LT2 # 22 314 18 303 332 188 37.0 1e-47 MAACMGAAVLIGVWQYMGREDGRKADKNSVRIGVLLYRGDDTFIGTLRTGLEDKAKEYEQ ETGIKVKLDIMDAKGSQNTQNSQVERLISLGCDALCINSVDRSSASIIIDKAMDAGTPVV FFNREPVEEDMNRWEKLYYVGADAKESAVLQGDILVDAYNQDTRTLDANGNGLVSYVLLE GETSHQDSLIRTEWSIQTLKDGGVPLEKITGGIANWDRSQASALMEQWLKEYPGQIELVV SNNDDMALGAIDAMNRAGIGANSIKVVGIDGTPVGIKALEEGYLFGTVESDKERYSQAIF DIAWSLSLGQDPKDRVPQLEGNYYWCPQRALTQTEAKLLKNQ >gi|157101650|gb|DS480674.1| GENE 50 57944 - 58915 947 323 aa, chain - ## HITS:1 COG:no KEGG:Closa_2376 NR:ns ## KEGG: Closa_2376 # Name: not_defined # Def: ABC-type sugar transport system periplasmic component-like protein # Organism: C.saccharolyticum # Pathway: ABC transporters [PATH:csh02010]; Bacterial chemotaxis [PATH:csh02030] # 1 323 1 323 323 302 45.0 1e-80 MLNKGKILWCVWAGILVLLFLMSSTDLIIKEKKIEVYPISVIIEGDNDDYYVNFKKGMDQ AAVEFHGDVSFITLYASDDQDQQMDLVKREIRDGTRAVILAPVKPEEAVRGLEDMNPGCP 
VILLGQSPDEGGTLDTIGVDGRKIGRLLGEAAASQAPRDVPVYLFCRGLDYGDSAQVYEG VRTVLDERGYHYRLIERKNQDTYHQAIQETDSPGGGRITIIALDVQSLDQATRILEENTI YQGRVAGLYGAGSTTSLLRALDKGIVTGMTAYNQFDEGYLSVKQAVEAIQGTRQKQQTML EAIYVDEDKLRDKTYEKMLYPIE >gi|157101650|gb|DS480674.1| GENE 51 58905 - 59309 452 134 aa, chain - ## HITS:1 COG:SP0662 KEGG:ns NR:ns ## COG: SP0662 COG2972 # Protein_GI_number: 15900563 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 14 121 460 560 563 87 40.0 8e-18 MEAEDEVLDLASLKLVLQPLVENAIYHGMEFMDGDGEIRIRAWRQDKDLFMSVSDNGLGM TREQVKRLFGDTDHMPSGRGSGIGVKNVHQRIRLYFGNDYGLEIQSEPDEGTTVTAHLPA IPYEEVKKEGSYVK >gi|157101650|gb|DS480674.1| GENE 52 59156 - 60700 1078 514 aa, chain - ## HITS:1 COG:SP0662 KEGG:ns NR:ns ## COG: SP0662 COG2972 # Protein_GI_number: 15900563 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 15 457 8 439 563 192 28.0 1e-48 MLFEKQRIKLQKASIRYIIFIYFTLTALAASIFIGISLYSRLSGQMSATIREENQILINQ ITRSMEDYLRTVMKLSDSLYYGVVKNADLASGSIGSEMTLLYDNNKDNVENIALLSKEGK LLEAVPAARLKTNVDVTGENWFTSTLEKTENMHFSTPHVQYIFDGSENQYRWVISLSRAV EITRGPSTDQGVLLIDIRYSSLEQLFDGVNLGNGGYVYLISSSGEIIYHPQAQLIDSGIV RENNLEVSGLRDGNYRQTVDGEERTVTVKTVGYTGWKVVGVTPLDGVSLNNIKTKLLVIF MIAFVLFIMTIINSYISSRITDPIKELEKSVNEIEAGNLETEVRSGGSYEIQHLGNSIQN MARQIRRLMDDIVAEHESKRKSEFDTLQAQINPHFLYNTLDIIVWMIENENLTDAVRVVT ALARFFRISLSRGKSIITVRDELEHVRNYLMIQHMRYKISLLIPWRQRMRSLTWPVSSLS SSRWWKMPSITAWNLWTGTGKSVSGHGGRIRIYL >gi|157101650|gb|DS480674.1| GENE 53 60766 - 62403 1895 545 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 3 542 1 520 525 176 24.0 2e-43 
MDLYRIILVDDEEEVRQSIIRKIDWTGAGFCVVGDAENGEEALEKVEALEPDLILTDIRM PFMDGLSLAERVRQKYPSVKIVIFSGYDDFDYAKQAIKLNVTEYILKPVNVEELTAILKR IKSNLDDEIEQKRNVSLLRENYIRSLPILRDQFLNELVGSTVPGNVLKEKLAEYDIPLAG AKKWVAAAIDIEPEEIREGNMLPLHKERDLIPISVMQVVEEKLGNYCRSSVFTSSRTSWS ELALIAAIDEDNSQTGLIDVLGDICKETKKILEVPITIGIGHSCQDLGEISGSYKTAVDA LGYKAIVGAGSTIYINDVEPVSGGKLSFDSKDEAELIAAIKFGPREKIEEAVQAVMDKMS DAKVHFRQCQAYMLSVSSSIVQLIQQYDLDLEQLAEEGEEQGDTFAVIPRMMKKEDFAHW LLSAALRMNQAMNRERDNTMKQVIQKAKEYIMDNYQDPDLSVEKICRQLHMSPAYFSTMF KKETGQAYTAYLTQVRLDKAVELLNKTDDKTYVIAAKVGYQEQNYFSYVFKKRFGVSPTK FRGAK >gi|157101650|gb|DS480674.1| GENE 54 62427 - 63488 742 353 aa, chain - ## HITS:1 COG:CAC3373 KEGG:ns NR:ns ## COG: CAC3373 COG4677 # Protein_GI_number: 15896615 # Func_class: G Carbohydrate transport and metabolism # Function: Pectin methylesterase # Organism: Clostridium acetobutylicum # 18 347 1 318 321 337 46.0 2e-92 MTEPDGQNERQRQQAVRLTVAKDGSGDYDTVGAALEALGDKEGPPSCIFIKEGVYRERLE IRRPGITLEGQSAGGTVITGGLSAGMTMEDGSKRGTFRTYSVLVDTHDFTAKNLTIENSA GPGEAVGQALALYADGDRILLKGCRLLGGQDTLFTGPLPPKEIQKNGFIGPKQFSPRING RHCYQDCFIRGDIDFIFGSATAYFDRCELYSAGRDTEKKGYVTAASTPKGQKYGYVFRNC RFTGNSSRESVYLGRPWRDYARTVLIDCYLGPHICREGWHDWDKKNARSTLFYGEYGSYG PGAEEQEAVEQGKRDHCTSRPDWVFMLTREQIQDYTMEQVLGGTDGWNPGAYI >gi|157101650|gb|DS480674.1| GENE 55 63778 - 65184 1475 468 aa, chain + ## HITS:1 COG:lin1880 KEGG:ns NR:ns ## COG: lin1880 COG0034 # Protein_GI_number: 16800946 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Listeria innocua # 3 453 13 436 475 186 31.0 9e-47 MGGIFGVASKNSCTLDLFFGVDYHSHLGTKRGGMAVYGSQGFTRSIHNIENTPFRTKFDG DLDELEGTLGIGCISDNEPQPLLIQSHLGSFAITTVGKINNEAELVASAYENGHIHFMEM SHGRINATELVAALINQKSTLVEGLLYAQEKIEGSMSILLLTPEGIYASRDKFGRTPIVI GHKDDAYCASFESFAYINLGYTDYKELGPGEIVYMTPESVETVSPPREEMKICSFLWVYY GYPTSSYEGINVESMRYECGKLLAKRDDAQPEVVAGIPDSGIAHAIGYANESGIPFARPF IKYTPTWPRSFMPQNQGQRNLIARMKLIPVDALIRGKKLLLIDDSIVRGTQLGETTEFLY 
QSGAKEVHIRPACPPLMFGCPYLNFSRSTSELDLITRRIIRDREGDNVSRDVLNTYTDPD SANYKEMIEEIRKRLGFTTLRYHRLDDLVKSIGISPCKLCTYCWSGRK >gi|157101650|gb|DS480674.1| GENE 56 65200 - 65877 658 225 aa, chain - ## HITS:1 COG:FN0970 KEGG:ns NR:ns ## COG: FN0970 COG2082 # Protein_GI_number: 19704305 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin isomerase # Organism: Fusobacterium nucleatum # 10 216 9 211 219 163 43.0 3e-40 MTDIQLEQVLPGDIERRSFEIIQEELSLRGIYLEPENEMVIKRAIHTTADFDYAQNLVFS GQAVKRGTEALKAGAVIVTDTNMGLSGINKKGLSALGGEAFCFMADEDVALQAKEQGTTR AVASMEKAARLYGLKGRPLIFAVGNAPTALVRIYELIEEGLAAPALVIGVPVGFVNVVQS KELILTLEDTPYIVARGRKGGSNVAAAICNALLYQIPVQGVAEST >gi|157101650|gb|DS480674.1| GENE 57 65834 - 67522 1427 562 aa, chain - ## HITS:1 COG:STM2019 KEGG:ns NR:ns ## COG: STM2019 COG1492 # Protein_GI_number: 16765349 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Salmonella typhimurium LT2 # 5 533 2 504 506 486 49.0 1e-137 MSQHTKSIMIQGTMSNAGKSILAAGLCRIFHQDGYRTAPFKSQNMALNSYITPEGLEMGR AQAVQAEAAGIVPNADMNPILLKPTTDVGSQVIVHGISRGNMKARDYFACKKSLVPDIMA SYRRLEEQYDVIVIEGAGSPAEINLKTDDIVNMGMADMADAPVLLVGDIDRGGVFAQLYG TVALLAEEERKRVKGLIINKFRGDRSILAPGISMLEDLCGIPVVGVIPYMDVDIEDEDSL SSRLETGKGKAPAAVDLAVIRFPRISNFTDFNVFSGIPGVSLRYVTRACDLGTPDMVILP GTKNTIHDLLWMRQNGLEAAILKLADRQVPVWGICGGYQMMGEVLVDEKGVESDLPSRTS GMGLLPLKTEFEKEKIRTQVEGAFGQLDGCLGGLSGKSIEGYEIHMGRTYIDREEGAADT ARIVSWRREPEICRPMDYVMETGKDRLLSRKARMDGWNRGNIYGTYVHGIFDSPGIARTV AEALAAAKGLCLEQAADMDYRTYRQQQYDKLAEELRENLDMAAVYRIMGISQKGKMGISQ KGKGEEKPDDGYTVGTGTAGRH >gi|157101650|gb|DS480674.1| GENE 58 67495 - 68619 861 374 aa, chain - ## HITS:1 COG:STM0644 KEGG:ns NR:ns ## COG: STM0644 COG0079 # Protein_GI_number: 16764021 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Salmonella typhimurium LT2 # 5 366 8 362 364 254 40.0 3e-67 MEYQHGGDIYSQDIQMDYSANINPLGMPEAVRQALRDCLEREVCSLYPDSRCERLRQALG 
RHHGVEDKWIICGNGAADLIFGLAAALRPVRGLVTAPAFSEYEQALRSVGCRTDYYYLRE DRGFALDAAGMCGCVRAAAERGEPYDVVFLCNPNNPTGIPVKKEEVKQLADTCERLGAYL AVDECFCDFLDNSELYSVIPDLARFTHLFVLKAFTKLYAMAGLRLGYGLCSDSGLLERLQ AVRQPWSVSGPAQCAGVAALGETDFAEHTRAVLKGERERLSAALHDLGFQVYPSQANYLF FRDREEDGILEKGRLYQELLGRRILIRSCANYPGLDSSFYRICVKLRQDNEQLIREMARV LERRKPCPNIQSQS >gi|157101650|gb|DS480674.1| GENE 59 68695 - 69693 784 332 aa, chain - ## HITS:1 COG:lin1155 KEGG:ns NR:ns ## COG: lin1155 COG1270 # Protein_GI_number: 16800224 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobD/CbiB # Organism: Listeria innocua # 21 332 11 314 315 204 39.0 2e-52 MDITWLDIWKFHGAALLAGCVIDWIIGDPIWLPHPVRLMGTMIAGTESRLRKWTEGTGRK GQSRAGVILAVFMCIIWWCVPWAIVHGAEHLWLSGAWAVLLLAEGFLCSLMLAARGLYTE SMKVCRSLEEGSTEEARYNVSMIVGRDTAALDEGGIARAAVETVAENASDGVIAPFLFMA LLGPAGGTLYKAVNTMDSMVGYKNDRYMYFGRCAARLDDLLNLAPARLTGILMVLGAWVL PGMDGAHAFKVFLRDRKRHASPNSAHGEAACAGALHLRLAGDAWYFGTLHKKPYIGDDDR QIQPADIRRANRLMFFAEGMMVLGLAVLIMIL >gi|157101650|gb|DS480674.1| GENE 60 69708 - 70139 339 143 aa, chain - ## HITS:1 COG:STM2018 KEGG:ns NR:ns ## COG: STM2018 COG2087 # Protein_GI_number: 16765348 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Salmonella typhimurium LT2 # 73 143 107 181 181 59 41.0 2e-09 MILVTGGSFQGKRVFVRQYLGEQNADGAVWTEGAEASWEEFMDGRFCRDFQLFVRRVMEG SAGPSFREQKGPAMDQILEQLLEELLAGPKDRVLVTDEIGCGIVPADAFERLYREETGRL CCRIAGEADEVWRVCCGTGIRIK >gi|157101650|gb|DS480674.1| GENE 61 70142 - 70975 836 277 aa, chain - ## HITS:1 COG:FN0912 KEGG:ns NR:ns ## COG: FN0912 COG0368 # Protein_GI_number: 19704247 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin-5-phosphate synthase # Organism: Fusobacterium nucleatum # 8 259 5 263 278 73 25.0 5e-13 MNGLYSCIIAISMYSKIPMPNVEWSEDRMRYVMCFFPLVGIVQGAALGLWLHFALDVLNL SVGAAALTGAAIPLLVTGGIHMDGFLDTMDAIHSYGDRSRKLEILKDPHLGAFAVISFGV 
YMMLYLGVFHEYLSLVLREDRGDRYFLYAVPCLVFVMERAFSGLSVVTFPQAKKKGLAAG FGGAARKRTDSLVLLLWMLICLAAGVAAAKVGCHGAGMLAGVLLMTHLAVFIWYYRMSVK QFGGVTGDLAGCFLQVCELAGLAAAAVLLKAGLMGGM >gi|157101650|gb|DS480674.1| GENE 62 70972 - 71613 641 213 aa, chain - ## HITS:1 COG:ECs2788 KEGG:ns NR:ns ## COG: ECs2788 COG2087 # Protein_GI_number: 15832042 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Escherichia coli O157:H7 # 1 192 1 181 181 76 35.0 4e-14 MITLVTGGSGSGKSEYAEGLILDSPCSRRFYVATMIAYGKEGRDKVERHRVLRKGKGFIT IEKPRNVGEVMVGEYETGLSPLSRTGRALLLECVSNLAANEMFKEGTGKTEAGEHQGGPI QCLSHKIAEDIISLAGQVQDMVIVTNEVDRDGICYEPETMEYIRLMGCLNQKLASAADRV VEVVYGIPVPLKPSALGPAAIGADFKRRECQGE >gi|157101650|gb|DS480674.1| GENE 63 71668 - 72726 1157 352 aa, chain - ## HITS:1 COG:CAC1372 KEGG:ns NR:ns ## COG: CAC1372 COG2038 # Protein_GI_number: 15894651 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 1 351 1 351 352 270 38.0 3e-72 MTLKEAMNSIKPASQALRQQAKQRWDQVAKPLGSLGVLEEDIIRIAAASGSLAVSLRPRA IVVMCGDNGVVKEGVTQTGQEVTAIVAGNMARGNSCVCLMARQAGADVIPVDVGMACRDA VPGVVNRKVAYGTKDFLEEPAMTMEETVRAMEAGADMAMELKERGYVLAGSGEMGIGNTT TSSALLSVLLNMDPELVTGRGAGLTTAGYNRKVQVIREGIRIRKPSGEDPVDALSKVGGL DLAALTGFYIGCAACGLPVVLDGLITGAAALAAVRISPDVKGYLLASHVSAEPAGAMVLD ALGLKPAISAGMCLGEGTGAAALFPLLDMAAAVYTQMSSFDDIHVDAYEHLL >gi|157101650|gb|DS480674.1| GENE 64 72723 - 74426 999 567 aa, chain - ## HITS:1 COG:FN0972 KEGG:ns NR:ns ## COG: FN0972 COG1797 # Protein_GI_number: 19704307 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Fusobacterium nucleatum # 211 525 137 433 444 157 31.0 7e-38 MAGVMIAAPGSGSGKTMVTCGLLTLLKRKGYNPAAFKCGPDYIDGLFHRRVLGVENGNLD SFFETGTHMRQKLSRGMERHFVVAEGVMGYFDGLGGVSVQGSSFEIGSILGLPAILVVDG RGASLSLMALIQGFLEYDAMLEGKDEEKTHNKTDCDAECAAGMQYEGCRKRRWKENEENY 
REKKSHTSNGLHGCVCGNTDCHESKGKKNNNIRGIFFNRMSPMIYNRIKPMVEELFRIPV IGYLPELNFLHVGSRHLGLVLPDEIDGIREQLEQLADRMDENLEWEPLLKIAAEAGGRDR EVPVREEAGAGQQTGNVAGIKTGNKLGNEPGNIPHFRLGIARDEAFCFYYEDNLRAMEQA GARLVYFSLLHQEKLPDGLDGLILGGGYPENHGEALAANRTMRDSVAAAAASGMPILAEC GGYLYLLDSLEGADGTVYPMAGVLKGHGYRAGKTGRFGYITLGPNRCLPYLREGEEIKGH EFHYWDCDCGEDEFCMTAAKPKGGRSWPCMRTAKQVMAGFPHLYYPSCPGLVRRFAGQCV EYGQRHDRKHDRQERNGQEHDGEEHEL >gi|157101650|gb|DS480674.1| GENE 65 74411 - 75709 1180 432 aa, chain - ## HITS:1 COG:FN0964 KEGG:ns NR:ns ## COG: FN0964 COG2242 # Protein_GI_number: 19704299 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6B methylase 2 # Organism: Fusobacterium nucleatum # 232 414 5 181 189 116 36.0 1e-25 MDDLRPIVYLVGIGMGGEDQLTGRAMDCLEAAQVVMGADRMLDSVSAYTDKKRVFSAYKP SEMVQWLGSFRWDEAALVLSGDTGFYSGAEAAAKAFLREGWDVEFVPGVSSLSYFCARLG KSWQNVHPVSSHGRDCDVVAYIRSFKSCFILLGGAGSVPDLCRQLVSSGMSSVTLWAGEN FSYEDERIAWQMTPAELLMEDERLPFGSLACVLVENTEAAEGQLYPQPGPQDEDYIRGQV PMTKKEVRRISLDKLCIGSQAVCYDIGAGTGSVAVEMGMEIRKWGGSGQVYAIERNREAL QLIESNCQKFHGSWMGFHIVEGEAPEAMKGLEPPTHAFIGGSGGHMREIVAALLDANPHV RIVANAITLETVGEILACMKEFGFATGEITQVWAAPVDTVGSYHMPKAQNPVYVAVMQDP KDDEGEIQWQEL >gi|157101650|gb|DS480674.1| GENE 66 75702 - 76523 837 273 aa, chain - ## HITS:1 COG:lin1163 KEGG:ns NR:ns ## COG: lin1163 COG2099 # Protein_GI_number: 16800232 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6x reductase # Organism: Listeria innocua # 8 243 2 225 250 114 32.0 2e-25 MKEIKKSVILFAGTTEGRLLAEYLQELDQPFVICVATGYGRDLLEDISRPDENGCMEIRQ GRLDVTEMERMFDELRPELVIDATHPYAVAVTQNIKMACSKRKGLRLLRCLREPGAGDNR PGVIHMPDVEAAAAWLSSVEGNILVTTGSKELAIYTRIRDYDQRVYARILPSAEAVAQCR SLGFEGRHIIAMQGPFSEEMNLVQLREYGCAYLVTKDGGAAGGFSEKIKAAHRAGAAAVV IDRPGSGEGISLDDIKGQLKEWMDHEKKGSIYG >gi|157101650|gb|DS480674.1| GENE 67 76510 - 78024 1304 504 aa, chain - ## HITS:1 COG:MJ1578 KEGG:ns NR:ns ## COG: MJ1578 COG2875 # Protein_GI_number: 15669774 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-4 methylase # Organism: 
Methanococcus jannaschii # 4 248 9 253 259 261 51.0 2e-69 MIHIVGAGPGAVDLITVRGQKLLSEADVVIYAGSLVSRDLLGWARQDARIYDSASMDLEQ VMEVMEQAERDGLTTVRLHTGDPCLYGAIREQMDGLDARGIKYDICPGVSSFCGAAAALG MEYTLPGISQSVVITRMAGRTPVPERESIGKFAAHGSTMVIFLSAGMTGELSEELVKGGY PGDTPAAIVYKATWPDEKVVRCTVATLEETADREGIHKTALIVVGNTVAQTGYERSKLYD PAFTTGFRAGREIVPGKLEPGTLYVVGMGPGEKKQMTGQALEVMGRCQVIAGYTVYVDLV RGLFPQKEFLTTAMTREEERCRKAFECCMEGKNTAMICSGDAGVYGMAGLILELAQQYPG VRVRIVPGITAACAGAAVLGAPLMHDFAVISLSDRLTPLEDIWNRVEAAAQADFVICLYN PASKGRPDFFRQACSRILKYRAEDTVCGLAVNIGREGEEMEVLTLEELKDRRVDMFTTVY IGNSHTRQIGPYMVTPRGYRYEGN >gi|157101650|gb|DS480674.1| GENE 68 78038 - 79285 1023 415 aa, chain - ## HITS:1 COG:MJ0022 KEGG:ns NR:ns ## COG: MJ0022 COG1903 # Protein_GI_number: 15668193 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiD # Organism: Methanococcus jannaschii # 6 397 2 356 362 209 33.0 8e-54 MGDSYIYKNQKKLRWGYTTGTCAAAASLAAAVMLLRGRRMEQVSLTTPKGVRLDLEVEEM ETGENCVCCGVRKDAGDDPDVTDGLMVYSQVRLPDADSGGAGDAREAGDNTEAGDDRNAC GDYVYEKDGLRLILSGGVGVGRVTQCGLSCEVGKAAINPVPRQMIFEQVAGVCRESGFKG VLSIEIRVPEALKVADKTFNSRLGIQGGISILGTSGMVEPMSETALLDTIRLELRQRIRK GEKNLLVTPGNYGESFVGTVLGLGLGQAVKCSNFIGSTIDMAVEEGAESMLLIGHGGKLI KLAAGIMNTHSSWADGRMEILAAHGAACGAKRELVEQIMEAVTVDEGLRLLETEDGLREQ VMKRVMARLEQHVKRRAGGRLRAEVIVFTNERGILGATTGADDMLLYFTNRLRNR >gi|157101650|gb|DS480674.1| GENE 69 79335 - 80504 627 389 aa, chain - ## HITS:1 COG:CAC1370 KEGG:ns NR:ns ## COG: CAC1370 COG2073 # Protein_GI_number: 15894649 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiG # Organism: Clostridium acetobutylicum # 55 377 38 324 326 181 34.0 1e-45 MKIGMICFTARGTAICRLLCRRFRDTGTECTGYVPQRFWKPEWEAEGIKPQDKSLSEWTG SMFGQKRALVFIGAAGIAVRAIAPFVRDKMTDPPVVAVDEAGHFCIPLLSGHVGGANELA ERMACWLQGIPVITTATDINGIFAVDVFAARRGLCITDRKKAKEISAWLLDGGKVGFFCD SECRRAEGASGAGTTKNGSSGRGAENVGDGTEGICRHNICMNSVCRHNIWITFRNSERPS HSPDREAVFLRLVPKVAVVGVGCRKGTPPDILEGCVLEALERGGIDPAAVKALSTIDIKA 
GEEAVTRLARRYGWELRTFTSRELLAVEGEFQESDFVRSAVGVGNVCERSCTARGGRLLI HKQAGKGVTVAAAIEPVAGKLDIEKCSWI >gi|157101650|gb|DS480674.1| GENE 70 80560 - 80772 341 70 aa, chain - ## HITS:1 COG:no KEGG:Closa_2419 NR:ns ## KEGG: Closa_2419 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 70 1 70 73 79 71.0 6e-14 MGKSKKNEAFDPHNLSAQDQLKFEIAQELGLDAKVMEGGWRSLTAKESGKIGGLITKRKR ELKKEALQQE >gi|157101650|gb|DS480674.1| GENE 71 80998 - 81729 455 243 aa, chain + ## HITS:1 COG:SPy0245 KEGG:ns NR:ns ## COG: SPy0245 COG3279 # Protein_GI_number: 15674427 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Streptococcus pyogenes M1 GAS # 1 210 2 220 246 68 26.0 1e-11 MNIAVIDDSIDEARQLAGFISTYCSDTHTCQQTTIFNTAADFLRCWQRGSFDLIFIDIFL RNETSGIRIAERVRRDDDSCTIVFTTSSTDFALKGYEVRALDYMVKPIRYDKFSQTMEYF LSVSRRKQSHYIEVKESRIMQKIPIDSILYTDYSNHYIQIHLPDKTVRTYMRFEDFSGML LIYSQFICCYRNCIINMDKVFSMEKSEFILTTGEHLPITRTMRSQIHQQYADYQFSKLNG GIG >gi|157101650|gb|DS480674.1| GENE 72 81726 - 83069 828 447 aa, chain + ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 246 437 235 426 433 67 26.0 4e-11 MSFAQIQLISVILAEIIFIIVPFLAVFRREWRYSMSKTTGILMGYLFFVYFLGLTCFSPF LSRPWLELAWSCTTMTTNILVCKWMVRTDWPVNIYSLFLFKNFTDTAGYCADLSNASASL NKSSSLITGDKLLSNLIFLFLMVWVAYYILHKYLTKAVEYTRLLPVWKYLAAIPVLFFIM FRLASRSLAPARMLQHHPDMIFFAVCWFACIYTVHYVSLRILSRLSESYAAKEQYRTTRL LASVQKSQMATLQYNLEQFKKARHDYRHHLITLKGLLEQKETEFALEYINDYLGTYATLD TTQYCQNPSSNALLNYYIQTAQSQGIAVNSSICLPQSLPVPEIDFCTILGNLLSNAVEAC QRQTQGTPSITINIGQAGESMIALSIQNTYSHLIRMKDGRFLSSKREDMGTGTTSVRYLV ERYHGILKFDYSNGIFEASLLLNPAMK >gi|157101650|gb|DS480674.1| GENE 73 83223 - 85214 1177 663 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227872165|ref|ZP_03990534.1| possible ribosomal protein S1 
[Oribacterium sinus F0268] # 288 659 1 366 367 457 61 1e-128 MDRKVTVAKSAGFCFGVERAVDQVYEQIEAVKNKKASGPIYTYGPIIHNEEVVRDLSDKG VMVLETEEELRALPRNQGTVIIRSHGVSRHVYEILKERNVAVVDATCPFVKKIHRIVAEH SAAGEAVVIIGSPDHPEVEGIKGWGNEHTTVIMDGEGVDNYRLPEGRKLCIVSQTTFNYN KFKELVEKFLKKGYDSSVLNTICNATQERQVEAARIASQVDAMIVIGGRHSSNTQKLYEI CLKECKDTFYIQTLGDLDPDSVSSVRSVGITAGASTPRKIIEEVHTRMAEQSFAELLKEE EESTKPIRTGEIVTGQVIEVSKDEIILSIAGNKSEGVITRDEYSNDSNVDLTTKVQIGEE MEAKVIKLNDGVGMVSLSYKRMAADRGNKRLEEAFENQEVLTAKVAQVLDGGLSVEVEGA RVFIPASLVSDSYEKDLSKYAGQDIDFVITEFNPKRRRIIGDRKQLLVAQKAKLKEELFA RIQPGDTVDGTVKNVTDFGAFIDLGGADGLLHISEMSWGRVENPKKVFKAGDQVRVLIKD IQGEKIALSLKFPEANPWVNAADKYAVGNVVYGRVARMTDFGAFVELEPGVDALLHVSQI SREHVEKPADVLRIGQEIEARVVDFNEADHKISLSIKAMLAPAAQDADVADVDIDAMSQY TEE >gi|157101650|gb|DS480674.1| GENE 74 85217 - 85894 261 225 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. Nichols] # 6 225 37 290 863 105 31 2e-21 MEYFNIAIDGPAGAGKSTIAKLAAKRLGFVYVDTGAMYRSMALYFIRQGISPEDEEAIAG ACGQVEVTIRYADGVQQVILNGEDVSGLIRSEEVGRMASATSVYRPVREKLVQLQKQLAR EADVIMDGRDIGTCVLPDAPAKIYLTASVEIRARRRFRELEEKGTACDIREIEKDIEERD YRDMHRENSPLRQAEDAVLLDSSHMTIEQVVDSIIRIARERGYEG >gi|157101650|gb|DS480674.1| GENE 75 86020 - 87261 1453 413 aa, chain - ## HITS:1 COG:CAC1849 KEGG:ns NR:ns ## COG: CAC1849 COG2081 # Protein_GI_number: 15895124 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 17 409 1 389 393 404 53.0 1e-112 MSETNHVIVIGAGAAGMLAAIAAAREGAAVELWEQNEKPGKKLFITGKGRCNVTNACDME ELLGSVVSNSKFLYSSFYGFTNQDMMDLLEQAGLRLKVERGNRVFPQSDKSSDVISALSA LLRQAGVRVCLNQKAEGLAVEEGVCKGVRCGGRIQTADRVIVATGGLSYPTTGSTGDGLK WAADSGHRLTELSPALVPFEVKETETVKELQGLSLKNIEAAVYDGKKELYREFGEMLFTH FGVSGPVLLSASSFCAKAIRKRPLRLVIDLKPALSWEQLDERILRDFSDSRNKQFKNALN HLYPSKLIPVIIDRSSVDPDKKVNEITREERRGLAEATKALEFTLTGLRGYKEAIITQGG ISVKDVNPSTMESKKIQGLYFAGEILDLDAVTGGFNLQIAWSTGWAAGKAASS >gi|157101650|gb|DS480674.1| 
GENE 76 87258 - 88760 1604 500 aa, chain - ## HITS:1 COG:MA2391 KEGG:ns NR:ns ## COG: MA2391 COG2326 # Protein_GI_number: 20091222 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 493 67 567 569 470 50.0 1e-132 MLEKIDLSKKVDKKTYRRVMDEAEEKLGLLQRECKDAGIPVILVFEGMGAAGKGVQINRL IQALDPRGFDVYACDRPTEDEQMRPFLWRYWTKTPAKGRIAVFDRSWYRSVQVDRFDGLT REDKLGDAYQDILSFEKQLCDDGTVIMKFFLYIDKDEQKKRFKKLEGSKETSWRVTEEDW NRNKDFDRYLKMNEEMLEKTDTDYAPWVIIEAVDKDYAALKIVSTVMDRLEYELEHRRPE DEKTAQRQESKTRERFKNGVLSGIDLSKSLTEEEYKTRLKKLQKRLAELHSELYRLRIPV VIGFEGWDAGGKGGAIKRLTSNLDPRGYRVNPTAAPNDIEKVHHYLWRFWNSVPKAGHIA IFDRTWYGRVMVERIEGFCSEAEWRRAYQEINEMESHMANAGAVVLKFWLHIDKDEQERR FRERQANPAKQWKITDEDWRNREKWDQYEEAVNEMLIRTSTTYAPWIVVEGNDKRYARVK VLQTVVDALEKKVKEVKTDK >gi|157101650|gb|DS480674.1| GENE 77 88965 - 90284 1706 439 aa, chain - ## HITS:1 COG:CAC0278 KEGG:ns NR:ns ## COG: CAC0278 COG0527 # Protein_GI_number: 15893570 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Clostridium acetobutylicum # 4 434 5 436 437 488 53.0 1e-137 MKKVVKFGGSSLASARQFKKVADIIKSDKSRRYVVPSAPGRRSDKDEKVTDLLYACYDAV AEGRSYKKILEKIKSRYMDIIDGLDLNLNLDHEFDRIEENFLAKAGRDYAASRGEYLNGI VAANFLGFEFVDAAEVIFFDENGNFESEHTNRELSERLEHVEKAVIPGFYGSKPDGTIKT FSRGGSDVTGSIVAKAIHADMYENWTDVSGVLVTDPRIVENPEVIETITYKELRELSYMG ASVLHEDAIFPVRKEGIPINIRNTNRPEDKGTLIVESTCRKPRYTITGIAGKKGFCSINI EKAMMNAEVGFGRKVLSVFEQYGISFEHMPSGIDTMTVYVHQSEFEEFEQSVIAGIHRAV EPDTVELESDLALIAVVGRGMRSTRGTAGRIFSALAHARVNVKMIDQGSSEWNIIIGVEN DDFETAIRAIYDIFVITEL >gi|157101650|gb|DS480674.1| GENE 78 90500 - 92584 2362 694 aa, chain - ## HITS:1 COG:MA3879 KEGG:ns NR:ns ## COG: MA3879 COG3808 # Protein_GI_number: 20092675 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Methanosarcina acetivorans str.C2A # 4 689 14 685 685 554 53.0 1e-157 MVAMYFVGAGSLMALVFAAFMFVRVRREPGGNPEMLRISGAVQKGANAYLRRQYKGVGIF 
FAAVFVILLVMAFCGFLSFFTPFAFLTGGFFSGLSGFIGMRTATMANCRTAQGASRNLNK GLRVAFSAGSVMGFTVVGLGLLDLTVWYFILNMAFGALPDSQRIAQITANMLTFGMGASS MALFARVGGGIFTKAADVGADLVGKVEAGIPEDDPRNPAVIADNVGDNVGDVAGMGADLY ESYVGSIVSTAALAVAAGYGGKGVAVPMMLAALGVLASILGTFFVKTEEDASQKNLLKAL RTGTYISAALVVAAAYAAIRILLPDHMGIYAAILSGLAAGVLIGAVTEYYTSDTYNPTRK LAASSRTGGATVIISGLSLGMLSTVAPVVIVGVSVLISYYCSGGSADFNAGLYGVGVSAV GMLSTLGITLATDAYGPVADNAGGIAEMTHMPEEVRQRTDALDSLGNTTAATGKGFAIGS AALTALALIASYIDKVKQIDPSLSMDLSITNPTVLIGLFIGGMLPFLFAALTMEAVGEAA QSIVVEVRRQFREIKGLMEGKAEPDYGACVDMCTISAQRLMVAPAMVAVIIPVAVGLLLG PEGVSGLLAGNTVTGFVLAVMMANAGGAWDNAKKYIESGQLGGKGSEEHKAAVIGDTVGD PFKDTSGPSINILIKLTSMVSIVFAGLIVAVHLL >gi|157101650|gb|DS480674.1| GENE 79 92798 - 93388 574 196 aa, chain - ## HITS:1 COG:DR2421 KEGG:ns NR:ns ## COG: DR2421 COG2316 # Protein_GI_number: 15807410 # Func_class: R General function prediction only # Function: Predicted hydrolase (HD superfamily) # Organism: Deinococcus radiodurans # 4 182 29 204 205 120 40.0 2e-27 MNTALTRERALEALKKYNREPFHILHALTVEGVMRWFAQDQGYGEETDFWGTAGLLHDID FEMYPEQHCVKAPELLKEAGAEDELIHAVCSHGYGLVSDVKPEHQMEKILFASDELTGLI GAAARMRPSGSVMDMEVSSLKKKFKDKRFAAGCSRDVIREGAEELGWTLEELMDKTIQAM RSCEAAVSEEMKLYEA >gi|157101650|gb|DS480674.1| GENE 80 93616 - 94128 481 170 aa, chain + ## HITS:1 COG:CAC2751 KEGG:ns NR:ns ## COG: CAC2751 COG0454 # Protein_GI_number: 15896008 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 168 1 165 167 105 31.0 4e-23 MTYRFADTEDLDQLVAMSDQAKESFKARNIDQWQKGEPNRQVMEASILQSQLHVLEDMGQ VVGMITIVPGPEASYASIDGAWLNQEPYFAFHRVCVKDSLKGRGLAARLFSEAEQYVLQS GIRNIRIDTHPDNQAMQRALAKSGYIRCGTLILTEGSEAGDLRVGYQKRI >gi|157101650|gb|DS480674.1| GENE 81 93973 - 94422 238 149 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQMRRYVVGERDGGGGGGGIYRSVGITWDAGNGFKIQAYPCLSLGNGFQAEPIARVGSPE SCLKPFSAALVIPAPQPASASPAPIPFTDGCPFPLLHPLNPLLIPNPQIPCFASFCENQG AAADVAAFSQGALHGLIVRMGVNADIADA 
>gi|157101650|gb|DS480674.1| GENE 82 94673 - 98449 4346 1258 aa, chain - ## HITS:1 COG:CAC1655_1 KEGG:ns NR:ns ## COG: CAC1655_1 COG0046 # Protein_GI_number: 15894932 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Clostridium acetobutylicum # 3 981 5 979 985 1066 54.0 0 MSVRRIFVEKKPAYAVKAGELREELENYLGIHNVKNVRVLIRYDIENLSEETYKKALVTI FSEPPVDLVYEEEFTRNPGDLVFSVEYLPGQFDQRADSAEQCVKLLKEDEEPVIRSATTY VVTAPLTAEQEEAIKSFCINPVDSRQADEEKPETLVTEFEVPEDILYFDGFADSSEEELK ALYGTLNLAMTFKDFLHIQNYYKKEEHRDPSVTEIRVLDTYWSDHCRHTTFSTELKNVSF TDGDYREPMEETYKQYLADREVIYKGRDDKYVCLMDLALMAMKKLRSEGKLQDMEVSDEI NACSIVVPVEIDGATEEWLVNFKNETHNHPTEIEPFGGAATCLGGAIRDPLSGRTYVYQA MRVTGAADPTRPLGETLKGKLPQRKIVNSAAQGYSSYGNQIGLATGYVKEIYHPGYVAKR MEIGAVMGAAPRKDVIRETSDPGDIIILLGGRTGRDGIGGATGSSKVHTTSSIEVCGAEV QKGNAPTERKIQRMFRRPEVSRIIKKCNDFGAGGVSVAIGELAAGLKIDLDKVPKKYAGL DGTEIAISESQERMAVVVDPKDVDQFLGYAAEENLEAVTVAVVTEEPRLKMDWRGKTIVD LSRAFLDTNGAHQEADVTLEVPGRAGNLFDKADVADVREKWLGMLSDLNVCSQKGLVERF DGSIGAGSVFMPYGGRYQLTETQAMVAKLPVLKGHTDTVTMMSYGFDPRLSSWSPYHGAV YAVVSSVAKIVAAGGDYSRIRFTFQEYFRRMNENPSRWSQPFSALLGAYSAQLGFGLPSI GGKDSMSGTFDTEDGEINVPPTLVSFAVDVNSHKNVITPEFKRPGSKIVVFRIAKNQYDL PDYSQVMDGYGKIFEDIKAGRIISAYAVEGNGLAEAVSKMAFGNKLGVKIEHNVDPRDFF AAAWGDIVCEVPDGMVGQLSISYTVIGEVTDRGAFEYGNVTISMEEALDAWMTPLEDVFP TAAGREARGEAAALEEKLYHSGAVYVCDHKIGQPTVFIPVFPGTNCEYDSARAFERSGAK VVTRVLRNMSAEDIRQSVDEYRREIAKAQIVMFPGGFSAGDEPEGSAKFFATVFRNAVMK EEIEKLLGERDGLMLGICNGFQALIKLGLLPEGEIKEQGPESPTLAMNTIGRHVSKMVYT KVVSDKSPWLAGAGLGSVYCNPASHGEGRFVANESWLHRLFANGQVATQYVDDQGRCTMD EYWNPNGSYMAIEGITSPDGRILGKMAHSERRGEAVAMNIYGEQDMKIFESGVRYFGL >gi|157101650|gb|DS480674.1| GENE 83 98700 - 100178 1382 492 aa, chain + ## HITS:1 COG:Cgl2942 KEGG:ns NR:ns ## COG: Cgl2942 COG4868 # Protein_GI_number: 19554192 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Corynebacterium glutamicum # 1 491 1 492 495 588 58.0 1e-168 
MKIGFDNEKYLTMQSEHIKQRIGEFGDKLYLEFGGKLFDDYHASRVLPGFAPDSKLRMLL QMADQAEILIAINAADIEKNKVRHDLGITYDEDVLRLIQEFRDKGLYVGSVVITQYSGQS GADQFKTKLEHRDIKVYRHYCIEGYPSNVPLIVSDEGYGKNDYIETTRPLVIITAPGPGS GKMATCLSQLYHENKRGIKAGYAKFETFPIWNLPLKHPVNLAYEAATADLNDVNMIDPYH LEAYGITTVNYNRDVDIFPVLNAIFEGIYGESPYKSPTDMGVNMAGFCIMDDEACCEASK QEIIRRYYHALNRLAKEEGTSDEVYKINLLMNKAKIDSTMRAVIAPCLERSKVSGGPAAA MELNDGTIVTGKTSSLLGASAALLLNAIKVLGDIPHNIHLIAPSAIEPIQTLKTQYMGSK NPRLHTDEVLIALSMSAATDETAKKALAQLPKLRGCQVHTSVMLSDVDIKVFSKLGVQLT SEPVYEHKKLYH >gi|157101650|gb|DS480674.1| GENE 84 100332 - 101579 1551 415 aa, chain - ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 8 408 16 413 427 245 39.0 1e-64 MTEVRGNVIVGQSGGPTAVINSSLAGVFKTAKDRGARKVYGMLHGIQGLLQERYVDLEEH IKSDLDIELLKRTPSAYLGSCRFKLPEICDDREIYDKIFSILDKLEVEYFFYIGGNDSMD TIKKLSDYSLLNGSKIRFMGVPKTIDNDLAATDHTPGYGSAAKYIGAITKEVIRDGLVYD QQNVTILEIMGRNAGWLTGAAALAKGEDCEGPDMLFLPEVTFDVDSFMKKVEELHKKKAS VVVAVSEGVKLADGRYVCELTDGIEFVDAFGHKQLTGTARFLAEKVNREVGCKTRAIEFN SLQRCASHIVSRVDITEAFQVGGAAVKEAFEGETGKMVILKRVSDDPYVCVTDIYDVHKV ANVEKKVPREWINEAGDYVTEEFVNYIRPLIQAELTPIMTDGLPRHLYYTDVERK >gi|157101650|gb|DS480674.1| GENE 85 101695 - 102297 657 200 aa, chain - ## HITS:1 COG:slr1124 KEGG:ns NR:ns ## COG: slr1124 COG0406 # Protein_GI_number: 16329243 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Synechocystis # 1 181 131 318 349 111 38.0 8e-25 MKLYIIRHGQTDWNVQGRIQGRQDIPLNAAGRSQAQMLAKGMEKRPVTAIYSSPQLRAME TAMALAGNQGVEVIPLPELVEIGYGDWEGRTASDILTKERKLYEEWWQHPATVAPPGGET LNQVDARCKKAWERIKGEIKGDTAVVAHGGTLAHFIVHLLEGQPDAAEIVVGNASITTIE YDPVTGQCSLEGLNDCSHLL >gi|157101650|gb|DS480674.1| GENE 86 102326 - 102922 709 198 aa, chain - ## HITS:1 COG:HI0533 KEGG:ns NR:ns ## COG: HI0533 COG0568 # Protein_GI_number: 16272477 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, 
sigma subunit (sigma70/sigma32) # Organism: Haemophilus influenzae # 17 168 367 523 629 69 30.0 4e-12 MEHDFFDMYLEEMRSITPLERDEMVSVLEGTARGDAGARSRLVEGCLGEVLEMVREYGDS ELPLSDLVQEANTALMLAAIEYDGSEPWNELMTRRVKESVELALEEQRTENQIEETMAAR VNVLQTVSQVLAKELGREATLEELSAKMKMSEDEVRDIMKLALDALTVNGEGQAAASGQD EEESPNPVRNGWNLEEGL >gi|157101650|gb|DS480674.1| GENE 87 103075 - 103950 1037 291 aa, chain + ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 3 291 1 289 296 170 34.0 3e-42 MDINYELYKVFYHVAVTLSFSEASKQLFISQSAVSQSIKVLEKKLNQPLFIRSTKKVQLT PEGEILLKHIEPAMNLIKKGENQLLEAHTLNGGQLRIGASDTICRYYLVPYLNRFHKEFP NIHIKVINQTSIECARFLENGQADFIIANYPNSALAGSMNTRVINEFHDVFVASRSAFPL KGSKLALEELLRYPIMMLDRKSTTSEFLHQVFQKSQLDLVPEIELSSNDLLIDLARIGLG IAFVPDFCIPKNDEDLFILDLKEQLPSRQMMVAYNESLPVSQAARQFMDML >gi|157101650|gb|DS480674.1| GENE 88 103996 - 106413 2495 805 aa, chain - ## HITS:1 COG:CAC1672 KEGG:ns NR:ns ## COG: CAC1672 COG1199 # Protein_GI_number: 15894949 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Clostridium acetobutylicum # 5 801 4 788 791 602 41.0 1e-171 MGNHSQPQIRISVRNLVEFVMRSGDLDNRRTAGAKKEAMQAGSRLHRKIQKRMGTSYRSE VMLRHQVQEDMFDILVEGRADGIITEPAGVTIDEIKCIYMDVSRLEEPDPVHLAQALCYG WFYSTQNELETIGIQITYCNIETEEIRRFKEARSFEELKAWFEGLIHEYVKWARYLYHHG IRRQECLKELPFPYPYREGQKELAGNVYRSIARKRNLFIQAPTGVGKTLSTIYPSLKAMG EGHGEKLFYLTAKTITRSVAEEAFSILRREGNLYFNTVTITAKEKLCVMEKPDCNPQACP RAKGHYDRVNDAVYEIIQEVDGITRDKVLEYAERFKICPFEFCLDISNWVDGIICDYNYV FDPNVKLKRYFDQGEPGQGYLFLVDEAHNLVPRAREMYSASLIKEDVLLTKRILKTQPGA AKVIAQLDKCNQRFLELKRSYGGDEGERRTVLGSTYELLPDVNVLALNLMTMFGELETFM NENIEFPDRDLVLEFYFAVRDFLYVYDRLDESYRIYDQILADGSFMVKLLCINPAVNLKE CLNKGVSTLFFSATLLPIQYYKELLSGSQEEYAVYAKSPFPEENRMVLAASDVSSRYSRR GPSEYEKIVDYICRVVEGKKGNYMVFCPSYQYLHAIEDILAAREASGALSFIWNAQTNHM TEEDRETFLHSFEEERDCSMAALCVMGGIFSEGIDLKEERLIGAVIIGTGLPQVNTEQEI 
LKEYFDEHGEHGFDYAYQYPGMNKVMQAAGRVIRTVHDRGIIALLDDRFLRPEYVALFPR EWGTYTVVNRYNVDQAVRAFWDGAV >gi|157101650|gb|DS480674.1| GENE 89 106523 - 106768 374 81 aa, chain - ## HITS:1 COG:no KEGG:Closa_1342 NR:ns ## KEGG: Closa_1342 # Name: not_defined # Def: Phosphotransferase system, phosphocarrier protein HPr # Organism: C.saccharolyticum # Pathway: not_defined # 1 81 1 80 80 90 70.0 2e-17 MKEKKIMLPTMAEAKRFVDEATKCDFDIDVFYNRVTIDAKSILGVLSLDLTRILTVQFNG ENQGFEEYLETISPEANASAA >gi|157101650|gb|DS480674.1| GENE 90 107094 - 108332 1300 412 aa, chain - ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 4 411 13 415 427 262 41.0 9e-70 MGKKRNIIVGQSGGPTSVINSSLAGVYKTAKERGFHKVYGMLHGIEGLLDEMYVDLSTQI HSDMDIELLKRTPSAFLGSCRYKLPDIHENPAFYEKIFAILDKLDIEVFIYIGGNDSMDT IKKLSDYAIVKGHGQKFLGVPKTIDNDLALTDHTPGFGSAAKYIATSTKEVIRDAMGLSY RRKTITIMECMGRNAGWLTGSTALARTEDCCGPDLIYLPEIPFDIDKFLKKCKDLIQKKP SIVIAVSEGIKVPDGRYVCQLSGGSDYVDAFGHKQLAGTADYLAGFLAGELGCKTRSVEL STLQRSASHVASRVDINEAFMVGGAAVKAADEGDTGKMVVIDRVSDDPYMSAAGIYDVHK IANNEKTVPRSWVNKDGSYVTQEFVNYVEPLIQGDYQPFMVNGLPQHLVLKR >gi|157101650|gb|DS480674.1| GENE 91 108375 - 108779 411 134 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00568 NR:ns ## KEGG: EUBELI_00568 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 114 10 128 157 116 53.0 4e-25 MKRDDLVWSDRKRNWLGLPWTFTVYGLTEDRLFIKTGVLNIHEDEVRLYRILDLSLRKTL WQRMVGLGTIHVDSSDKTMKAFDISNIRNCEDVKEQLSRLVEQERDNKRVSSREFIGGYG DDEDHGFDEEDSPY >gi|157101650|gb|DS480674.1| GENE 92 109014 - 109448 270 144 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936429|ref|ZP_02083798.1| ## NR: gi|160936429|ref|ZP_02083798.1| hypothetical protein CLOBOL_01321 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01321 [Clostridium bolteae ATCC BAA-613] # 1 144 5 148 148 259 100.0 4e-68 
MSRTISAGYVYRQYSTEGVYIADYKSSRAAHEATGVSIGSIARAAHGERRTGGGYVWRKV LADSPKESIEIDLISKIGNHDKRPLVQKNLDGEIVGEFLSIAHASRSLKISRRSLSCALS GAQKTAGGYIWEEKMEDDMQVQEQ >gi|157101650|gb|DS480674.1| GENE 93 109740 - 110162 515 140 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1216 NR:ns ## KEGG: EUBREC_1216 # Name: not_defined # Def: hemerythrin # Organism: E.rectale # Pathway: not_defined # 5 139 9 143 144 131 52.0 1e-29 MYAEFTDDLVTGNEMIDTQHKELICKINDLLKSCEERSNQSGAARMLNFLADYTEYHFNE EEALQESINYPGIKEHKEKHEELRRTVQELHEMLTEEEGPTDAFVEKVSEKVRDWLYYHI QTFDRSVAEFKFMRDNAERI >gi|157101650|gb|DS480674.1| GENE 94 110389 - 111819 968 476 aa, chain - ## HITS:1 COG:BS_mleN KEGG:ns NR:ns ## COG: BS_mleN COG1757 # Protein_GI_number: 16079413 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Bacillus subtilis # 2 443 8 450 468 306 40.0 6e-83 MSLVPAIIVFVVICAAIAVQKVIFDGDMGAMFLMLWVVLIPFGMFYGFKASELEDIAVKF TAKSLPAVFIMLSVGALIGTWIAAGTTPAVIYYGIKFINPKFFLVTAIILCSIISLLSGT SLGSVGTAGVAMMGIGNSLGIPAPVTAGACICGAFFGDKMSPMSDTTILASSICGVNIFK HIRHMVYDQVPSYLITLIFFFVLGFKYGGNIDSPEVNAMMEGLQGNFKLGIMAFLPLLVT IVLLLKKVSATLCIIIGSVMGIAVAVFYQGMDVAAAFNTFYSGFTLETENEMLFTLLNRG GISSMWSLVGVTLFGFTVAGMLDHMDVLKCIADSTVKHIHGTAGVTFLTILFGFVGNAVA MSQNFAIVMSGTLMAPLYKRYNMLPKNCSRDLEAGGTYGALFIPWNTNALFCAGALGVSV LSFIPYIPLLYITPVVVIIYSITKFHIDKIYEDEDFVDVSERLEKDHDKQNVMGEL >gi|157101650|gb|DS480674.1| GENE 95 111876 - 113000 480 374 aa, chain - ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 7 372 11 374 386 205 35.0 1e-52 MMYTQMDIDLDVIAENYKKVSDRLPEETGIIAVVKADAYGLGAVPIAKKLEKAGCPMFAV TFMEEAVKLREAGIKAPILVMMPAESKELLIAGKHKLIITITDYEMAKQISDELVKQSYS LPAHLKVDCGLSRMGIVLKDRLEEAAEEAAKIHALPNLNTEAVFSHLTAAGTEGAYAALD LAEQQRFFELIKMLKKKNISIKTHLMSSDPYIWYPQTPGDYIRVGSILYGICPERYSCCG IKNCMSMNTYIVQIKWIPRDTEVSYGPLYRTVRRTKIAVVPVGFADGIRRALSNKGEMLI 
HGKRVPIIGKICCDHTILDVTDIEEVKVGDEVTIFGRNGEESQTAGDYAELCNASAPETT VIFSARIPRNYLNM >gi|157101650|gb|DS480674.1| GENE 96 113018 - 114118 671 366 aa, chain - ## HITS:1 COG:mll9542 KEGG:ns NR:ns ## COG: mll9542 COG0498 # Protein_GI_number: 13488402 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Mesorhizobium loti # 23 332 76 387 420 210 41.0 3e-54 MFPGNSMETYRELLPFKGNQPLVDLGVGNTALTKLPAIGKELGLDLYIKDETRNPTWGHK DRLNAVLINKAMELGVPGVVYASTGNNGASGAAFAAKAGMPCVILTVKGVNKTLETFMQV YGAKLIGVKTVSERWEVLKYLVNQCGWYPATNYVVPIVGSNPWAIEGYKLIAYELYKQMD ELPDKIVVPICYGDALFGIMKGFSELKEMGFINHIPQMVSVEDYGPVAKAYNSGCEMIEP VEAWDSVASSISTKWGTYHSLYALRKSRGLAIALEKNEDFYQAQSELAQKEGVYCESASA ASYAGLKELVAGGKIKKGEKVVLLITASGVKDTKVTSSYLPEFPESITGLDGALDILEKQ YQMKVR >gi|157101650|gb|DS480674.1| GENE 97 114295 - 114669 224 124 aa, chain - ## HITS:1 COG:BS_yabJ KEGG:ns NR:ns ## COG: BS_yabJ COG0251 # Protein_GI_number: 16077116 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Bacillus subtilis # 3 124 4 124 125 124 52.0 3e-29 MSSVSTKKAPAAIGPYSQGISANGFVFVSGQLPINPSTGTIAEGSIKDRTIQSMMNIGEI LKEAGSGLDKVVKTTIFVTNLELFGEVNEAYGTFFGGNAPARACVQVAALPKGADIEIEA IATV >gi|157101650|gb|DS480674.1| GENE 98 114871 - 115752 258 293 aa, chain - ## HITS:1 COG:CAC3466 KEGG:ns NR:ns ## COG: CAC3466 COG0583 # Protein_GI_number: 15896705 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 290 1 289 292 162 31.0 6e-40 MNFDCLQTFIVLSQCKNFTRASEIMYVAQSTISNRIRLLEEYTKCQLVIRNKTGIELTEE GTLFLQYAKQLLNVETVALREIHMLNVYQDHLNVACAQWIHDCWFADDMVRYSRDFPDIA VNMTVDHGETLISMMHTSTLDLAITAYNINTNSLISQLYKKSKVVLVGKREQYSCLQKGI RGEDLLNLPLIYSDIWDNYLSDISEHLLLDERIYRIHCNMLGSAKNFCIAGLGCCFLPAD LVKHELEAEILVEIPVGEISARYVNLYVTYNRARLGSAAMRYFFELFPQMIPK >gi|157101650|gb|DS480674.1| GENE 99 115843 - 117009 1306 388 aa, chain - ## HITS:1 COG:no KEGG:Closa_2524 NR:ns ## KEGG: Closa_2524 # Name: not_defined 
# Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 382 1 382 389 633 80.0 1e-180 MNKKEANEIKKLFTPAGCAITRICGCYVDAEKNKKTELKEAFLSLQEEEAFKYFTIFRNA LSGTIEKNLINMEFPLHTEAEGGTQHFLLKLRDSQLKDDAILEEFYDKVIAAYDYGENYY IILIHCAYDIPAKATDGTEMFDASDYVYEFIQCTICPVKLSKAGLCYNSLTNTIENRDRD WLVEAPVQGFLFPAFNDRNTDIHSLLYYAKNPEELPDTLIDELLGCVIPMSAKSQKETFQ AIVEETLGENCDFETVKNIHENLSELVEETKDEPVPLTLDKYQVKKLLETNGATPEKLEE FEQRYAQVEDGPGTSFVAANVVNTRSFEIKTPDVSIKVSPDKTYLVENRMIEGRPCIVIA INEHVEINGITVRPVAAPKGDEDENVPF >gi|157101650|gb|DS480674.1| GENE 100 117052 - 117213 71 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936438|ref|ZP_02083807.1| ## NR: gi|160936438|ref|ZP_02083807.1| hypothetical protein CLOBOL_01330 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01330 [Clostridium bolteae ATCC BAA-613] # 1 53 1 53 53 95 100.0 8e-19 MGCTVLSVTAVPALFRLPAVQTGHLCLPFKPDSRCLVIAGCRTLDFLGNLVYY >gi|157101650|gb|DS480674.1| GENE 101 117245 - 118681 1189 478 aa, chain - ## HITS:1 COG:slr1342 KEGG:ns NR:ns ## COG: slr1342 COG3395 # Protein_GI_number: 16330749 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 37 475 6 439 445 263 37.0 5e-70 MREITRLPAGELEQKAERDEGRIERLLGDAIKRSGRKIVVLDDDPTGVQTVHDVSVYTDW SEESVLQGFQEEGNLFYILTNSRGFTEDETKKAHQEIAGNIVKAARKTGRDFLIMSRGDS TLRGHYPLETQVLRDVLSREGQQETDGEVICPFFKEGGRFTIGNIHYVRYGRELVPAGET EFAADRTFGYRSSNLADYVEEKTKGAYPAGEVICIGLDDLRHGRVDKVAGQLMEVRNFNK VIVNAVDYGDLKVFVLALYEAMGQGKRFLFRTAASLVKVMGGITDQPLLTREKMALPENG HGGVIVVGSHTAKTTRQLMELKKVPGIRFLEFNSDLVLNDAAFREETERTLELISQTIAE GKTAAVYTRRKLLEFKDDTREAALMRSVKISEAVQSLVGNLKVVPAFVVAKGGITSSDVG TKALRVKRATVMGQICPGVPVWQTGKESRFPHIPYVIFPGNVGEDATLREAVEILLGR >gi|157101650|gb|DS480674.1| GENE 102 118810 - 119724 993 304 aa, chain - ## HITS:1 COG:TM0413 KEGG:ns NR:ns ## COG: TM0413 COG1402 # Protein_GI_number: 15643179 # Func_class: R General function prediction only # Function: Uncharacterized protein, putative amidase # Organism: Thermotoga 
maritima # 9 275 1 275 311 137 31.0 3e-32 MKKQTEFKLDKRDSVWFQDNTAAVNCAYAKEVCDIAILPIGAIEQHGPHCPCGSDSFNAM GIAEAVARKSGAMILACPMYGSHPAHHWGMPGTIPLTFETHVGLLTDIVRGAANAGFNKF LIISAHGQTMSMFEAVHKLGIEGYFTIGSTWYDFLRDNKKTLDTFMWHADEAETSVALSL YPDKVHMDLAVPGKAHGLIDSKWKIAPGQAAGEGMLYHFEGTFSLMEKDDLDTGVIGDPT NASKEKGDKIVERCADLYCELLEEVKTKYPCGVNPLGFRNPLGYNGTNNIQYDAMHDEHG NLKK >gi|157101650|gb|DS480674.1| GENE 103 119752 - 120534 1048 260 aa, chain - ## HITS:1 COG:STM2289 KEGG:ns NR:ns ## COG: STM2289 COG3836 # Protein_GI_number: 16765616 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Salmonella typhimurium LT2 # 5 240 7 239 267 140 37.0 3e-33 MVRENKVKQKLKNGEAVIGTFVKTVDPSVIEILGCSGMEFFVIDSEHVSYNPETITDLVR ASDLTGIVPIVRVREATAVNIMQALDTGALGYHAPNVDTPEQARIAVDAGRYTPLGNRGF APTHRAANYGMMDKQEYIDKANKEILTIIHCETMESLGNLDEILKLEELDVVFIGPMDLS QSLGRDVMGQRKHPKLLEAIDQIIEKVNKAGKAVGTVADNAQMAKELMEKGVRYIPISSD QGMIGNMAKNIVKEFNQYSK >gi|157101650|gb|DS480674.1| GENE 104 120518 - 121444 758 308 aa, chain - ## HITS:1 COG:AGl1647 KEGG:ns NR:ns ## COG: AGl1647 COG3386 # Protein_GI_number: 15890941 # Func_class: G Carbohydrate transport and metabolism # Function: Gluconolactonase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 288 51 337 348 194 38.0 1e-49 MGLFYESFDERFFDVVDVKQDLATVCSGFTFIEGPIWNDVTGCLTFNDIPESRTYRLEQG KTVRLLREDTHKANGNAYDMDSNIIVCEHVRSCISRTNDQGENLEVLVSHYGDKELNSPN DVVVRSDGVIFFTDPRFGRNPSRVGLERPQELDFQGVFSFDPGTGELKLLADDFENPNGL CFSPDEAFLYVNDSPKKHIRKFRVEADGTLSGGAVWAETTGEGAGLPDGMKTDSRENVYC CAQGGIHIFNKNADYLGIIKVPEQTGNCAFGGPDMDILYLAASTTLYSFRIKGRADKNGR KILNGQGK >gi|157101650|gb|DS480674.1| GENE 105 121462 - 122895 1607 477 aa, chain - ## HITS:1 COG:TM0307 KEGG:ns NR:ns ## COG: TM0307 COG2407 # Protein_GI_number: 15643076 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Thermotoga maritima # 5 477 3 473 473 270 34.0 6e-72 
MKKMGLPKVGMITFSDAREHEYKVLYEEKTAARIKRTVEYFADYEMELHTFDTVARSPQD IDRQTDELKKKGVEVFIANIPCWTWPNGVVRGVLNMGLPTILLSNDDPGTHGTVGFLGSG GALNQIGYKHLRIQQDFTDEHPNLFDTKMMPYIRAAAAVRKLQGRIFGFIGGRSLGIDTG SFDPMQWKKLFQIDSDHIDHEEIVRRAEAIDESRVEAVFKWLVDSVGSVKYDEKLTEERL KYQVACYLATKDIIADRHLDFGAIKCMPDMTNFRAPQCISTALLNGGYDADGEFDPFPIS CEADADGALTMEMLKLISGGMPTMFADVSHVGYKEKLIYLPNCGSMCTYYAGRCCDGCRN MKNIELRRANRPSGGAVTFTVPSAGEMTMARLCRVDGKYQMFIIETEFVDPSKEILDAFV KARGVHQLPVAFMKADFDIEKFVDRFNSNHISGVAGHYKKELINLCSLLDIEPVVIQ >gi|157101650|gb|DS480674.1| GENE 106 122910 - 123737 219 275 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 3 248 1 240 242 89 25 1e-16 MKLENKVAIITGGTKGIGYGIAEEYLKEGAKITICSRNAQEGVKAAKELGKLGEVLYLQA DVSSIEDNQMLVDETVKKYGRLDIFVANAGINDPDKTHYLHITEEQYDRILGVNLKGLFF GGQLAARQMVKQGQGGAIVNISSVNAYLALDSQMCYTTSKGGVSQLTKVQAVALTPYDIK VNAICPGPIESELMRRVGSDEQLFNTVISRTPIGRIGTPNECGRLAVFLACDDSNFVYGQ SIYIDGGRSFQAFPVPGYKTVTDEQYEKLMELEQK >gi|157101650|gb|DS480674.1| GENE 107 123774 - 124841 1175 355 aa, chain - ## HITS:1 COG:MK0392 KEGG:ns NR:ns ## COG: MK0392 COG2055 # Protein_GI_number: 20093830 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Methanopyrus kandleri AV19 # 7 349 3 339 341 210 32.0 3e-54 MGNTVLMRGETVTAFTQELFQKAGMSEPDAKYHAECLVNTNYWGIDSHGVIRVPAYFKRM LNGAINTNPQIKVVRGEGCLEVLDVDAAAGFIGGRAGMERAMDHAAVSGIAACGVINSNH FGAGALYARMAAERGMIGIAMTNVKPLIVAPGASVPVTGNNPIAFGIPTYGEFPFVLDMS LSVVAGGKLTLAIKKKEKIPMDWATDKNGRPTDDPQAAFDGYLLPMGGHKGLGLSYVVDI LSGLITGGVFSKDMKSMYANPKDPSLTGHFFIAMDISKIISKEQMKERMAQFKDQLKATP MWQEGAQMLLPGEIEYRKEQERRQNGLPIPVTTYEELAELKKEYDIQAPLTIIQE >gi|157101650|gb|DS480674.1| GENE 108 125297 - 126343 1000 348 aa, chain + ## HITS:1 COG:VCA0673 KEGG:ns NR:ns ## COG: VCA0673 COG1609 # Protein_GI_number: 15601431 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 24 310 16 301 343 192 35.0 8e-49 
MKSSDSGDSIQNTGKRANGRGHNTSIDVAKLAGVSQATVSRAFNPASKMNPSTRQRVLDA ASALKYAPDAIARSLISNSTNIVAVIMQNTTNPFYATVLSKLSIRLRSMGKQILFFYLDD EGHMKDLIHQILQYRVDGILITSVTLTSDLADTCVDFGVPVVLFNRYMNTPGIQAVCCDN VQAGRDCADYFFMKGYRSFAFIGGHDEASTSHDRRRGFVSRLEERGIHDCSVINTPFTYE GGRDGFRRLLEQEGTCPKALFCVSDLVCAGVLDCARFEYNLKVPEDVAVIGFDDIELASY NSYSITSFVQPMDSMIDRVLDLLLSEKQPSESTNLTLFPCTLKTRGTA >gi|157101650|gb|DS480674.1| GENE 109 126516 - 127802 1418 428 aa, chain - ## HITS:1 COG:BH2671 KEGG:ns NR:ns ## COG: BH2671 COG1593 # Protein_GI_number: 15615234 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, large permease component # Organism: Bacillus halodurans # 1 426 1 426 426 344 52.0 2e-94 MVAILLFVSLILFLLLSIPIGVSLGLASLTTIRTMDCISMTTFVQSMIQGLNSFPLMAVP LFTFAGEVMGKGGISRRLIDLARVFTGRFTGGLGIVTIVTSLFFAAIAGTGSAAVAAIGL IMIPAMVKNGYDKGYSSALVATAGTVGVIIPPSVCMVVYAVAAGASISGLFMAGIMPGLL IGAGLILYSVFYSRRHGYKGEDKRYTPGEIWQIILKAIPGLMIPVIILGGIYGGVFTPTE AAAIAGVYGVIVGLFIYREIKPSDLIYIMYRSVLMCAPVLLIIGISTGFGRILTITQVPV MIADAILGLTSSKILVLLLINVLLLIVGTFMETNAAIIILTPILLPIVTRLGVNPIHFGI IVVMNLSIGFITPPLGANLFMACQVGEISFDKLARKIWPWIVVMVALLMLVTYIPQISMA LPEAMGVI >gi|157101650|gb|DS480674.1| GENE 110 127809 - 128417 601 202 aa, chain - ## HITS:1 COG:no KEGG:ELI_2933 NR:ns ## KEGG: ELI_2933 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 202 1 203 205 183 46.0 5e-45 MKKTLRWLDVNFEAILMVLFFSLMILLVTVQVVLRFIFKTGFSWGEEVARLLFVWMSFSS FGYLTRGSRHVRVGFFRELFPVRTQKAILILCDVLFLVFSAGGLRAMVQLCSDAWRFQDM LTAVPWNYNALYLAGMLGFFMMVVRNIQILVWKFRHWGDDLERFVNYDGDYYENNRICFE PKVKDVLELAEKEAAELLEKGE >gi|157101650|gb|DS480674.1| GENE 111 128419 - 129459 350 346 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|114773040|ref|ZP_01450335.1| TRAP-type C4-dicarboxylate transport system, periplasmic component [alpha proteobacterium HTCC2255] # 10 297 6 288 329 139 28 7e-32 MDRWRIKAAMGLVCILTVSAFALTGCAQTEKSDVITIRIAHDNNVNTPLHKAFLKFKDLV 
ETGSEGRMEVVIFPGGQMGSVQDTFEQCRRGDIEMSGSTTSNFTRAMPEFAAWESFYMFD DTAHAKRVFESEAGKKMMEPLKRMNLTGIGYMELGFRNFSNSKRPIQTDEDLKGLKIRGY NPLQIKAWESVGVNTTSVSWNELFTSLQQRLIDGQECATTSFYTEKFYEAQKYWSLTRHV FTNFLWYANEDFMNSLSDSDRAFIMESAQEAIDYNWELADQSEEEILKQLEDAGFPVNDV DISVRRQLGEKINASIKGDIIANCGEDTYNMLMAAVAAERREQGEE >gi|157101650|gb|DS480674.1| GENE 112 129495 - 130931 1551 478 aa, chain - ## HITS:1 COG:no KEGG:Spirs_3076 NR:ns ## KEGG: Spirs_3076 # Name: not_defined # Def: hypothetical protein # Organism: S.smaragdinae # Pathway: not_defined # 7 476 3 512 514 249 30.0 1e-64 MSKINEKSTLNVKLTVKPILFTMYHIHAYMGPCRYGEGYALTTESDMEVTQKEFDIFKKE MEEEADLRNVEFLEPVVIFWNEDFALKEEQIQKALEDDAKTDFYLIRGLRISSYFAVELS KRTDKPIGHTPNKSAISKCDHVDMTAHLFAEGKPGYAFLDFEEMNRTFAALRVKKALACT KIFFPLKSAMLTFGCQSSYLNLEDITRKFKVRFSHVNSDEVFQWLDALTDEEKAEAKALA DKLEKESQGVHMPAEYILNDTEFYETIKKMMAHYDCNAFTIPCFEVCATRELNRRRLTFC LAHSLFKDEGIASACAGDVGSVLTITILMNIARKAPYMGNTMVLDKNTNQCRTLHDVACA QMKGYDEPALPTEFVSFTMDSWGTTMRYDFSKDNGETLTMINMSPDMKKIMVAKGTINGC DDYLTPECKHAVRYTVKDAKRFHECQKFVGHHFALVYGDYVEEVKAFAKECDLEVLEA >gi|157101650|gb|DS480674.1| GENE 113 130934 - 132064 962 376 aa, chain - ## HITS:1 COG:BH3204 KEGG:ns NR:ns ## COG: BH3204 COG1104 # Protein_GI_number: 15615766 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Bacillus halodurans # 1 372 1 372 380 251 39.0 2e-66 MIYLDNSATTQVDKDVAMTACEVMTEVYGNPSSPYLLGRDALGRLTAARQQVAQVIGAPT RQIFFTSGGTEANNLAIRGSVLAARGEGRNKGRIITTAIEHSSVLESCRSLEEDGYEVVF IRPRDGRIQADDVIHAVNQDTLLVSVMAVNNETGERLPVEEIAGGVKRKNPDVLVHCDCV QGFGKILFSLNRIKADMVTMSSHKIHGPKGSGAIYVREGVNVRPLCFGGSQESKVRPGTE NVPGIAAFGHASLKALRDIRENWEHVKELNLYLTDCLRQMEDVVINSPSGAVPYVLNISV LPFATEELLHEMRMRGIYLSGSSACEKGARSHVIRAMGIEGPRADSVVRIGLGKKNTREE VTVLVETLNELIKRGK >gi|157101650|gb|DS480674.1| GENE 114 132190 - 132258 56 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKTKGKSNKKHIDVLGFNCDNA >gi|157101650|gb|DS480674.1| GENE 115 132479 - 133525 943 348 aa, chain - ## HITS:1 
COG:BS_gutB KEGG:ns NR:ns ## COG: BS_gutB COG1063 # Protein_GI_number: 16077682 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Bacillus subtilis # 1 346 9 349 353 172 31.0 9e-43 MKALVMRAPGEYGVEDVKKPVPGFQEVLVRMHSVAICGSDPPLLAGESLKDGLPAYMPFI PGHEGAGVVETVGSGVTDFKPGDMVAAEAHLGCGHCANCRMGRYNLCLNFGIREQGHKQY GFTAQGCYAQYCVYHIRAVHHIPDGLTCEHGAMADTMATALHGIESVGIRPGGLTAVMGC GPVGMSAMLLAKAMGSRVILCGRGDRLERGKQMGADVVLDYTAEDIGERIHSISGGIGAD QVIECAGTELSFRNCIAGVKRGGRIALLSSPRKSSYGIDMKAIMWNELTLCGSRGNPNSH QQVMQMMKMGIADPLPMITHRFALEDMREAVDVFTQRRDGCIKALIQL >gi|157101650|gb|DS480674.1| GENE 116 133537 - 134484 1054 315 aa, chain - ## HITS:1 COG:BH2321 KEGG:ns NR:ns ## COG: BH2321 COG1172 # Protein_GI_number: 15614884 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus halodurans # 3 315 11 322 324 205 45.0 1e-52 MENKERAAASRLIKKYSIILILIALMIVLTMIKPQFIKPGNLVNMLRQISLVGILTMGMM LVILIGDIDISGGPQIALTSVICAYFASRECPLVLAFLMPLLAGLCCGLVNGVLTAKMRI PSFIATLGMMSVAKGLALIVSKGQPISGIRPEIEWLGSKKLMGIPVLVIVFIICAAVVSI MLQKTAFGKKIYALGGNQEAARVCGINVDLTRIKVFLLMSVFATISGILITGRVSSGSAT NGIEYHMDAIASTVIGGVSMTGGSGNVMGVVGGVLIMGVLQNGLDLMMVSPYLQQIFKGA IIVIAVVADNMKNRR >gi|157101650|gb|DS480674.1| GENE 117 134486 - 136000 1302 504 aa, chain - ## HITS:1 COG:BH2322 KEGG:ns NR:ns ## COG: BH2322 COG1129 # Protein_GI_number: 15614885 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Bacillus halodurans # 7 498 5 497 522 520 55.0 1e-147 MGSDTGYILEMNHITKEFPGVKALKDVTFKVRKGVVHGLMGENGAGKSTLMKILQGIYTP TSGTMIFDGKPLELKTIYHALEAGISMIHQEMSPIPDMTVAENIFLGREPKNKWGFVDHK QLNEMTACLLGRLRLKLSPTILMRELSVGNMQMVEIAKAISFHSKLIIMDEPTSAIADKE VESLFNIIRMLKEEGVTIIYITHKMSEVFEITDEISVLRDGELVGGDLTADLDNNKLISL 
MVGRELTEMFPKVYYEPGEELLRVEHFSDGKRFFDVSFCVHEREMLGFTGLIGSGRSELL EAVFGLRPHTEGRVFIRGKEVHIRNPHDAIALGIGLLTEDRKASGCFLPLDITDNMIMSN LELQKKGMFLDFNRIKAHTESMRAKLAVKTPSMEQHIELLSGGNQQKVLMGRWMINAPLI LIVDEPTRGIDVNAKAEIHRLLGEMNKNGTAVILVSSEMPEVLGMSDRIVIMHEGRKAGE IGREDATQELLMQIAFGEDEGRKH >gi|157101650|gb|DS480674.1| GENE 118 136075 - 137127 1260 350 aa, chain - ## HITS:1 COG:BH3901 KEGG:ns NR:ns ## COG: BH3901 COG1879 # Protein_GI_number: 15616463 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 76 344 55 322 327 183 37.0 4e-46 MLKGRIVSGKIVSILLAATMMAGCILTGCGSAESGSLEAADKVAETANAPAAGNGEEGPG GTSGEPAEVSVIIETFDNTFASFVMNAMKKYEEEHKDKIHITYLDSKQDANTQISQVENQ ILNGTQAIICLAVDAKQSEPIVRACKEAGVPLIAFNRIFENCDSFVGADGAVAGSLLARF TGEKTGGKGNVAIVNGIMGQENQFIRRDAIVETLGEMYPDMQVVIEGAGDWKRDKALQLV ETWLQSGTDIDAILCLNDDMAMGTQLAVEQAGKSDDIIVCGVNGDPDGINAVRDGKIAAT VFQNPDRQAEEALEAALQYIHGEKPESKIMIPFELVTADNVDDYVQYSQK >gi|157101650|gb|DS480674.1| GENE 119 137952 - 138404 420 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936461|ref|ZP_02083830.1| ## NR: gi|160936461|ref|ZP_02083830.1| hypothetical protein CLOBOL_01353 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01353 [Clostridium bolteae ATCC BAA-613] # 1 150 1 150 150 116 100.0 5e-25 MPGRIVPQFQDQSGRTGGTGLGGMGLGGVGLGGMGLGGTGLLGHPGLSGMGRPGLSGSGG TGLSGRPGLSGIGGTGLSGRTGLSGIGRPGLSGSGATGLSGIGRLGIACMGLASLSCMPK SFSWSDQSLPSSGRGRMGVVSWEFSHSSEP >gi|157101650|gb|DS480674.1| GENE 120 138793 - 139698 662 301 aa, chain - ## HITS:1 COG:VC1216_2 KEGG:ns NR:ns ## COG: VC1216_2 COG2199 # Protein_GI_number: 15641229 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Vibrio cholerae # 2 126 34 160 204 75 29.0 8e-14 MLSQEALSEALTNLGNRNRFEKTIQAFEFDKQGCGTLGCIYIDVNGLHEINNHLGHQAGA RMLKTISDIFLEYFDSQDIFRIGGDEFVILCKNMGREELVRRIEQVCRKTEEAGFSLSTG LEWRESDLRIEDIVQKAERAMQENKKGFYSSKGGERQRRELNQHMERLISEKKDADRFLS 
ILAPVFEGVFFVNLETDTVRQIFIPSYFQEMLKECGDKYSRALLLYADRMVEPQYASLFE LCCDYSRLEAMLEGEEIPDFTYRKKDGSQLRLRILKFYHYDSTGKETLWIFSDIEIIYIE S >gi|157101650|gb|DS480674.1| GENE 121 139712 - 140008 199 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936464|ref|ZP_02083833.1| ## NR: gi|160936464|ref|ZP_02083833.1| hypothetical protein CLOBOL_01356 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01356 [Clostridium bolteae ATCC BAA-613] # 1 98 1 98 98 183 100.0 3e-45 MWSRLKQFLGIIPDFTDSEEVFRALHGDVVKRNFLFHIFIIVFEFVMVISISMRPGGPFA KPRRVTYFVLYLVLILAAAGVTWLETWIDRKRQAVPAD >gi|157101650|gb|DS480674.1| GENE 122 140319 - 141752 1413 477 aa, chain - ## HITS:1 COG:SP0287 KEGG:ns NR:ns ## COG: SP0287 COG2252 # Protein_GI_number: 15900221 # Func_class: R General function prediction only # Function: Permeases # Organism: Streptococcus pneumoniae TIGR4 # 13 476 19 490 490 315 42.0 1e-85 MEDIALKQETGRIDRFFQVSQNHSSVRTEVLAGITTFITIAYILILNPQILADPYVIMGD AAMAGKIANGVFIGTCIGAFIGTILCALYARVPFAQAPGMGLNAFFAYTVVLGMGYTYGQ ALVVVFISGVFFIVITAIGLREAIIRSIPDAVKTAITPGIGLFITIIGLKNAGIVISNPA TLVSLVDFSQWKIEGADLALMSSALVALAGLVIMGMLHARKVKGSILLGIVAATLIGIPL GVTHISNLDMNIGMKFRDFAEVSFMKMDFAGLFSGANMVETIFTVTMLVISFSLVNMFDS IGTLLGAAKQSGMIDENGEVIRMKQALMSDAISTAAGAMVGTSTVTTVVESSAGIAAGGR TGLTSLVTALMFLGAILFAPIVSIVPAAATAPALIFVGILMLGNIRDVDFSDMSNALPAF CTIVFMPFTYSIANGVAFGLITYCLMKLTTGRRQDVKILTLAISVVFVVRYAFMTLG Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:09:16 2011 Seq name: gi|157101649|gb|DS480675.1| Clostridium bolteae ATCC BAA-613 Scfld_02_16 genomic scaffold, whole genome shotgun sequence Length of sequence - 52995 bp Number of predicted genes - 44, with homology - 43 Number of transcription units - 24, operones - 11 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 543 212 ## Closa_0537 hypothetical protein - Prom 645 - 704 6.1 2 2 Op 1 1/0.000 + CDS 815 - 2566 1307 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 2602 - 2660 10.9 + Prom 2611 - 2670 5.5 3 2 Op 2 . + CDS 2698 - 3711 171 ## COG5263 FOG: Glucan-binding domain (YG repeat) 4 2 Op 3 . + CDS 3783 - 5024 870 ## COG3328 Transposase and inactivated derivatives 5 3 Op 1 . - CDS 5277 - 5582 253 ## COG1109 Phosphomannomutase 6 3 Op 2 11/0.000 - CDS 5659 - 6990 608 ## COG1109 Phosphomannomutase 7 3 Op 3 . - CDS 7013 - 7972 544 ## COG0836 Mannose-1-phosphate guanylyltransferase + Prom 8117 - 8176 7.3 8 4 Tu 1 . + CDS 8308 - 9132 67 ## ELI_1182 hypothetical protein 9 5 Op 1 . - CDS 10329 - 11480 274 ## Acear_2148 hypothetical protein 10 5 Op 2 . - CDS 11464 - 11994 206 ## gi|160936481|ref|ZP_02083849.1| hypothetical protein CLOBOL_01372 11 5 Op 3 . - CDS 11972 - 12946 432 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 12 6 Op 1 . - CDS 13576 - 13737 58 ## gi|160936484|ref|ZP_02083852.1| hypothetical protein CLOBOL_01375 13 6 Op 2 . - CDS 13805 - 14056 117 ## gi|160936485|ref|ZP_02083853.1| hypothetical protein CLOBOL_01376 + Prom 14164 - 14223 3.0 14 7 Tu 1 . + CDS 14360 - 14443 73 ## - Term 14827 - 14877 1.5 15 8 Tu 1 . - CDS 14931 - 16823 515 ## gi|160936488|ref|ZP_02083856.1| hypothetical protein CLOBOL_01379 - Prom 16941 - 17000 6.4 16 9 Tu 1 . - CDS 17151 - 17540 234 ## gi|160936489|ref|ZP_02083857.1| hypothetical protein CLOBOL_01380 - Prom 17567 - 17626 2.4 - Term 18588 - 18623 -0.4 17 10 Op 1 . - CDS 18676 - 19101 111 ## COG0615 Cytidylyltransferase 18 10 Op 2 . - CDS 19106 - 19264 108 ## gi|160936491|ref|ZP_02083859.1| hypothetical protein CLOBOL_01382 - Prom 19296 - 19355 3.3 + Prom 20350 - 20409 4.6 19 11 Tu 1 . + CDS 20606 - 21772 341 ## COG3547 Transposase and inactivated derivatives + Term 21818 - 21852 1.1 20 12 Op 1 . - CDS 21874 - 23541 340 ## EUBELI_00268 hypothetical protein 21 12 Op 2 . 
- CDS 23551 - 24612 37 ## Calhy_1088 haloacid dehalogenase domain-containing protein hydrolase - Term 25232 - 25269 -0.8 22 13 Tu 1 . - CDS 25350 - 25610 119 ## gi|160936495|ref|ZP_02083863.1| hypothetical protein CLOBOL_01386 - Prom 25656 - 25715 3.2 - Term 27302 - 27336 -0.9 23 14 Tu 1 . - CDS 27393 - 29168 210 ## COG5610 Predicted hydrolase (HAD superfamily) - Prom 29199 - 29258 8.6 24 15 Tu 1 . - CDS 29521 - 30033 7 ## gi|160936497|ref|ZP_02083865.1| hypothetical protein CLOBOL_01388 - Prom 30172 - 30231 6.5 25 16 Op 1 . - CDS 30744 - 32822 137 ## COG5610 Predicted hydrolase (HAD superfamily) 26 16 Op 2 . - CDS 32825 - 33766 214 ## COG1088 dTDP-D-glucose 4,6-dehydratase - Prom 33789 - 33848 3.1 - Term 33813 - 33851 -0.9 27 16 Op 3 . - CDS 33857 - 34213 66 ## COG1211 4-diphosphocytidyl-2-methyl-D-erithritol synthase - Prom 34288 - 34347 6.1 - Term 34777 - 34824 2.4 28 17 Tu 1 . - CDS 34858 - 36591 610 ## Tthe_2726 transposase - Prom 36617 - 36676 6.5 - Term 36669 - 36713 2.5 29 18 Tu 1 . - CDS 36803 - 38005 588 ## COG3547 Transposase and inactivated derivatives - Prom 38185 - 38244 6.9 + Prom 38195 - 38254 5.8 30 19 Op 1 . + CDS 38397 - 39062 473 ## Closa_3160 transposase 31 19 Op 2 . + CDS 39078 - 39389 133 ## Reut_C6337 transposase IS66 + Prom 39477 - 39536 1.8 32 20 Tu 1 . + CDS 39597 - 40025 116 ## COG3436 Transposase and inactivated derivatives 33 21 Op 1 . - CDS 40523 - 40747 124 ## bpr_I0637 IS110 family transposase 34 21 Op 2 . - CDS 40762 - 41157 241 ## bpr_I0637 IS110 family transposase - Prom 41230 - 41289 5.9 + Prom 41240 - 41299 5.8 35 22 Op 1 . + CDS 41355 - 42800 202 ## gi|160936508|ref|ZP_02083876.1| hypothetical protein CLOBOL_01399 36 22 Op 2 . + CDS 42860 - 43792 286 ## gi|160936509|ref|ZP_02083877.1| hypothetical protein CLOBOL_01400 - Term 43675 - 43741 8.1 37 23 Op 1 2/0.000 - CDS 43803 - 45008 1022 ## COG1134 ABC-type polysaccharide/polyol phosphate transport system, ATPase component 38 23 Op 2 . 
- CDS 44998 - 46065 552 ## COG0438 Glycosyltransferase 39 23 Op 3 . - CDS 46077 - 46538 460 ## gi|160936512|ref|ZP_02083880.1| hypothetical protein CLOBOL_01403 40 23 Op 4 3/0.000 - CDS 46566 - 47342 434 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 41 23 Op 5 25/0.000 - CDS 47329 - 48504 403 ## COG0438 Glycosyltransferase 42 23 Op 6 12/0.000 - CDS 48505 - 49632 348 ## COG0438 Glycosyltransferase 43 23 Op 7 . - CDS 49635 - 51044 613 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis - Prom 51105 - 51164 4.8 - Term 51093 - 51149 2.9 44 24 Tu 1 . - CDS 51187 - 52818 531 ## Closa_0537 hypothetical protein - Prom 52920 - 52979 6.1 Predicted protein(s) >gi|157101649|gb|DS480675.1| GENE 1 3 - 543 212 180 aa, chain - ## HITS:1 COG:no KEGG:Closa_0537 NR:ns ## KEGG: Closa_0537 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 26 179 12 160 514 104 39.0 1e-21 MDNIKTKEISDEKIINLLAAWRKYNSVIAGIFTVLIIFVFPLVFHDYYYDILVAKYIFYY GSVILMIVAMLVGVFFFWYKDKREIGGNAVKEMRSNAKLNALKMSDWAMIIFLIAVTIST LQSEYRFESFWGNEGRYMGMFLILLYGISFFVISRCLRYRQWYLDTFLAGGIIACMIGIL >gi|157101649|gb|DS480675.1| GENE 2 815 - 2566 1307 583 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 29 224 526 703 744 83 29.0 1e-15 MRKQTKLVAVLSTAALLAIGASMTSFAATGWAEEDGTWVYYNRDGERATDQWKKSGNNWY WLNSDGEMAIDELIQDGDNYYYVDINGVMAANQWVAIDNEDAGQDDEPDHYWYYFQANGK ALKNGDNARVALKTVNGKKYAFDDEGRMLFGWVDADNAERIDNTDGDGFKEGDYYFGGED DGAMTVGWLQMDITYDEATSDYEVSPVFNDDEDQTRWFYFKSNGKKVKAENGDTQKSKTI NGKKYEFDQYGAMTAEWSLDVDTASTSGIRGDYSTATHSVSAKYAQQWRYFQDVENGARV SKGWFKVVSAEYLNYDKYNDDEDAWYYADGSGNLYAGQFKTIKGKKYAFRNDGRMIDGLK FIQQDAQNNITDVRADDDDNHPFDTDDDFNKNSIYWSDLGYRCFYFGGGDDGAMRTNKTN IDIDGDRFNFYFEKSGAHKGAGLTGEKDDKFYQSGKLITAGKDEKYQVVKRSNKSGVSTA 
ASKYTYSKLDDVRKFLDDISGKYTDNVINDTTVDQLRDLGVNKKAEDIKELYIIDTTQVN ANDYFVVNTSGKVIDTKSKNKDGNDYYYVIKKGGQIAAIYAEN >gi|157101649|gb|DS480675.1| GENE 3 2698 - 3711 171 337 aa, chain + ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 159 298 565 691 693 61 32.0 2e-09 MIKKKWISLCLAITLATTTPIVSFADTSTDTTTMENCGAYLFEWRPVDCGNGHYFAILVG GDSIAERNVILNRGYDFAYTFEPMKYPRPWGEVPTLNNVDGVWAIPENWPMLPEDSQSTL QIVLFTNNNKLPDKERYIDVVKLPNNVTVSSLPPEVKKYLINADGTDAGAYEGTETSGWV QAEDGRFKYRKPDGTFVSNGWLNVEDKLYYMDENGYMLADTLAPDGSYVNASGIKQKYMP GWFQNERGWKYVLKNGYFAASTWVQDTDGKYYYFDLGGYMKTDYDTPDGYHVGPDGVWDG QATTHVEDIKKLGPGAQPQIENAEQSLASENTTTKSL >gi|157101649|gb|DS480675.1| GENE 4 3783 - 5024 870 413 aa, chain + ## HITS:1 COG:YPO0011 KEGG:ns NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 47 411 34 398 402 343 45.0 3e-94 MAREKKSVHKVQMTDGKRNIIRQLLEEYEIESAQDIQDALKDLLGGTIKEMMESEMDEHL GYRKSERSDCDDYRNGYKTKQVNSSYGSMKVEVPQDRNSTFEPQVVKKRQKDISDIDHKI ISMYAKGMTTRQISETLEDIYGFEASEGFISDVTDKLLPQIEDWQNRPLSDVYPVLYIDA IHYSVRDNGVIRKLAAYVVLGINSDGLKEVLTIEVGENESAKYWLSVLNGLKNRGVKDIL LLCADGLTGIKEAIAAAFPKTEYQRCIVHQVRNTLKYVSDKDRKLFAADLKTIYQAPTEE KALEALERGTKKWSEKYPNSMKSWHQNWDAIVPIFKFSTTVRKVIYTTNAIESLNATYRK LNRQRSVFPSDTALLKALYLSTFEATKKWTMPIRNWGQVYGELSLMYEGRLPM >gi|157101649|gb|DS480675.1| GENE 5 5277 - 5582 253 101 aa, chain - ## HITS:1 COG:lin2618 KEGG:ns NR:ns ## COG: lin2618 COG1109 # Protein_GI_number: 16801680 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Listeria innocua # 4 96 477 570 576 68 40.0 2e-12 MDGIMSGFRTAAPSEIGGLKVISISDYKESLIKYGDGRETIIKLPKSDVMKFTLEGNVSM VARPSGTEPKLKLYFSICADTEADAKQLEMKIKEDIEKVLL >gi|157101649|gb|DS480675.1| GENE 6 5659 - 6990 608 443 aa, 
chain - ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 6 435 4 434 575 407 48.0 1e-113 MTERVKDIYKYWLNQNLEDTELTKELETISLDAKAIEERFYKDLEFGTGGLRGIMGAGTN RMNVYTVARATQGYCSYLKKIYSGEIHVSIAFDSRLNSVNFARTAGEVFAANGIIVHIYP ELMPTPSLSFAIRQLHCNGGIVITASHNPSEYNGYKVYGNDGCQITTGTAKEIQKEISQI DVFADVARMPFEKAQAEGLIAYISKETVDAFIQAVSKETLCEAGMKKDITIVYTPLNGAG GYCVRRALEENGFQNVFVVKEQEMPDGCFPTCPYPNPEIEQAMELGIRDAKHLNCDLLLA TDPDCDRVGVGVKSGCEYLLLSANEIGMLLLEYICKRRSASGSMPCRPLMMKTIVTIDMA ERIASEYGVDTVNVLTGFKYIGEQIGLLEEKGEEDRFIFGFEESYGYLSGSYVRDKDGVG ASLLICEMAAYYKTRERIKRSTG >gi|157101649|gb|DS480675.1| GENE 7 7013 - 7972 544 319 aa, chain - ## HITS:1 COG:CAC2968 KEGG:ns NR:ns ## COG: CAC2968 COG0836 # Protein_GI_number: 15896221 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 5 311 40 347 356 268 43.0 8e-72 MIQLTVERIASLVKMEDIYIATNKDYRNLVLEQLPGIPEDNILCEPVSRNTAPCIGLGAV HISKKHKDALMLVLPSDHLIKFNKMFVSTLRDACRIAEMGDNLVTIGIMPDYPETGYGYI KFNNHEMEGRAYKVERFVEKPSLEVAKEYLATEEYLWNSGMFIWKVSSILKNIKKFMPAM YKGLEQIQSAIGTDGQQVVLEQEFNKMESQSIDYGIMERAKDIYILPGAFGWDDVGSWLA IERIKKSNEFGNVVDGNVITVNTHNCIIQGDRKLIAAVGLKDIVVVDTEDAILICEKDST GDIKRVLENLKICNRDEYI >gi|157101649|gb|DS480675.1| GENE 8 8308 - 9132 67 274 aa, chain + ## HITS:1 COG:no KEGG:ELI_1182 NR:ns ## KEGG: ELI_1182 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 3 274 39 309 314 200 39.0 5e-50 MSLISGIYWCVRKQSALYLLFNFHLGNFINQAIKITVCAYRPWIRDNRITPVDSARPGAS GYSFPSGHTAKAMAVWGGLAVYDFKRNKPIASFLLLIILVVGFSRNYLGVHTPQDVIVSL ILGGFILYSTYYLLIWADKKRGRDWITAGSITLASILLVFYAKNKSYPLDYIDGQLLVNP QQMINGCFRGCGGIIGLAYGWVLDRKLIHYNETRGTWQTKLYRYLLGMIGLTLLFKTFPS LISFFISPPKAAFINGFIIPFYITAIYPILTRKL >gi|157101649|gb|DS480675.1| GENE 9 10329 - 11480 274 383 aa, chain - ## HITS:1 COG:no 
KEGG:Acear_2148 NR:ns ## KEGG: Acear_2148 # Name: not_defined # Def: hypothetical protein # Organism: A.arabaticum # Pathway: not_defined # 10 383 5 372 372 171 33.0 6e-41 MKRVNSYLENNKWMLPLVILFIFLLSRLLMYLVFIVWQYVNNVDIFIFDRMNNWDAGWYK SIAMDGYAPAPSPDETGEANWAFFPLMPLVMRYLIRFTHGNVNVVTCIANSLFFLAGLVA SASYICKTRNDYIEALLFCLIYSFGPYSFYFSIFYSESLFFLFVVLFFIFMKEERYIAMG VVGALASATRNLGVLLVFAVAVQYTSDYLRRGSGKSLWNYWKEAFKNYRLVLGVALIPLG LFGFMVFLWYKMGDPLAFAHIQYAWGEHESSLQEVIKALTGRGTRRKIYFALVGIFGLIG SIHLARTKRWGESVMGVLFILVPFSVRTTSLPRYVMGAYLPVLGWTDVFCKQKAKWLGII LIILALLECVLFYWWLEGKNILI >gi|157101649|gb|DS480675.1| GENE 10 11464 - 11994 206 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936481|ref|ZP_02083849.1| ## NR: gi|160936481|ref|ZP_02083849.1| hypothetical protein CLOBOL_01372 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01372 [Clostridium bolteae ATCC BAA-613] # 1 176 1 176 176 343 100.0 2e-93 MKRPFLISAYIVSAALFIFYIWVLWNARTPEVNLAYRMYYINRELEKWPGIEGFDYVVGI PVYMNQEQKRCGSGWEDRNNLGRRVSSEGAWIYFDGLPEDDLRLTIVVDDIGKKNILTVL SNREETPLTLEKGRENGEREQLWTAPIKAENGEISLLINGTEDVWVKEITIDEASE >gi|157101649|gb|DS480675.1| GENE 11 11972 - 12946 432 324 aa, chain - ## HITS:1 COG:L29089 KEGG:ns NR:ns ## COG: L29089 COG0463 # Protein_GI_number: 15672983 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Lactococcus lactis # 1 314 1 318 318 393 61.0 1e-109 MRKTLYIVIPCYNEEDVLRETARQLLAKMDSMSKKISGGSRIVFVNDGSKDKTWDIIKEL HQSNPVYSGINLSRNRGHQNALLAGLMTVKDYCDMAISMDADLQDDIGAIDEMVDKFYRG YDIVYGVRSKRNTDTFFKRFTAEGFYKMMSALGADTVYNHADYRLMSKRALEGLAKFKEV NLFLRGMVPMIGYSSATVYYVRNERFAGESKYPLKKMLAFAFEGITSLSTKPIRIIVMLG SIILGVSVLMLIWSVAGFYRGTTVPGWASIMVSIWGIGGILILSVGVVGEYIGKIYLETK ERPRYIVESFINQEGTVDEKTISY >gi|157101649|gb|DS480675.1| GENE 12 13576 - 13737 58 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936484|ref|ZP_02083852.1| ## NR: gi|160936484|ref|ZP_02083852.1| hypothetical protein CLOBOL_01375 [Clostridium bolteae ATCC BAA-613] 
hypothetical protein CLOBOL_01375 [Clostridium bolteae ATCC BAA-613] # 15 53 27 65 65 84 100.0 3e-15 MGLEWGWGGGVEGVLQRDTAVFRTTRLHEKGHVEFAAHINTDVKWVYKNYLLS >gi|157101649|gb|DS480675.1| GENE 13 13805 - 14056 117 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936485|ref|ZP_02083853.1| ## NR: gi|160936485|ref|ZP_02083853.1| hypothetical protein CLOBOL_01376 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01376 [Clostridium bolteae ATCC BAA-613] # 1 83 1 83 83 155 100.0 1e-36 MFLVEVLDSEEPGSCKVQEGFIAVVIQIDLFVDATKAQVVSNNEGIYVIIFRQVIIRKFE YGDLFWIEDVNHPVKRRKMGIFP >gi|157101649|gb|DS480675.1| GENE 14 14360 - 14443 73 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEQQIKVFDKAIAEQMELITNTLTSIK >gi|157101649|gb|DS480675.1| GENE 15 14931 - 16823 515 630 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936488|ref|ZP_02083856.1| ## NR: gi|160936488|ref|ZP_02083856.1| hypothetical protein CLOBOL_01379 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01379 [Clostridium bolteae ATCC BAA-613] # 1 630 99 728 728 1256 100.0 0 MSVTIGCTYLFKAFIRLLTQSVILADLAAILAVVLFSVQKDMPHAVFLCMGGNATISLML VLASLNFCVLSIRKKILWEGIAEVLSVIFYFLAMMTYEATYPFILLIILLVFYEESSRER KLRKVSPYLIVWIMTLGLYCVLKMFAKESYDGVSVGLSVKNIMRGFAVSVSAPLPFVNGF FRGHAANSFKDFFNQINLLDVLLIVVFIYLLNSIINKRNTSAVSNNKSYFLMIIGIMFVV FPAILMAVSEKYQVLPFGVGYQVIIYQYYGWGCIGSAIILKAWRYFLTKFKHAEKKLIVF DGIIMLITVYLIAVNQQNTRWTIKEMELKEASVSATFDQTLRNAFSNGMIESIPLDAENN LIISSRQYYGMFAKGAMLKELTGQDFQVKWYDELDEIFLEKQPCYALNSGSGWVLVGAIT EWDSPTSSAMAGQIWLYCDDEFDRLLVLSRDKLSFLNLKDADEVTHCAVGNIYSINKYYD INQLVPIKHCKYKLGTRFPVSVSRQDIGAYLYGGISDREGGFTWTSDKTMGMYFDLEDAD YEKIGVEINVKMIYGERQPVNIHVNGVCVYSEVLSGKSKIEFECIMPQNGILDMHFEFPE ARSPKETEESDDARKLSIAMDDMILVGLSN >gi|157101649|gb|DS480675.1| GENE 16 17151 - 17540 234 129 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936489|ref|ZP_02083857.1| ## NR: gi|160936489|ref|ZP_02083857.1| hypothetical protein CLOBOL_01380 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01380 [Clostridium bolteae ATCC BAA-613] # 1 
129 369 497 497 257 100.0 2e-67 MNILYLMGWIKGRLVKGRLNILTPKDITIYYIFLVIGTIGLVLCPKNFNELTSVSAIYSY YSGELEDYIEETDNRRELLEGDESIVVLKPYKARPRLLYYYNDIQSDEHDWKNEIIAKWY EKEIVYLEE >gi|157101649|gb|DS480675.1| GENE 17 18676 - 19101 111 141 aa, chain - ## HITS:1 COG:aq_1368 KEGG:ns NR:ns ## COG: aq_1368 COG0615 # Protein_GI_number: 15606564 # Func_class: M Cell wall/membrane/envelope biogenesis; I Lipid transport and metabolism # Function: Cytidylyltransferase # Organism: Aquifex aeolicus # 11 138 12 130 168 81 37.0 4e-16 MKVYKTGYIAGVFDLFHIGHLNLIRKAKERSEYLIVGVLSDELVLHFKNKAPYIPFQERL EIVKAIKYVDKAVPVTFDNIDKLDAWNLYRYDCLFSGDDYVNNESWIRDKKKLKRLGSDI QYFPYTKSTSSTQIKEAMNRG >gi|157101649|gb|DS480675.1| GENE 18 19106 - 19264 108 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936491|ref|ZP_02083859.1| ## NR: gi|160936491|ref|ZP_02083859.1| hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] # 1 52 403 454 454 104 100.0 2e-21 MDIPKKNLHIIICNIYYRQIGKQLEKMGFSEYYIYVQNKAWIIEDFMKDNEG >gi|157101649|gb|DS480675.1| GENE 19 20606 - 21772 341 388 aa, chain + ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 388 1 388 391 350 45.0 3e-96 MIYVGIDIAKLNHFAAAISSDGEILIEPFKFSNNYDGFYLLLSHLAPLDQNSIIIGLEST AHYGDNLVRFLISKGFKVCVLNPIQTSFMRKNNVRKTKTDKVDTFVIAKTLMMQDSLRFM ALEDLDYIELKELGRFRQKLVKQRTRLKIQLTSYVDQAFPELQYFFKSGLHQNSVYAVLK EAPTPNAIASMHLTHLAHTLEVASHGHFGKDKARELRVLAQKSVGVNDSSLSIQITHTIE QIELLDSQLFSTELEMANLVTCLHSVIMTIPGIGVVNGGMILGEIGDIHRFSNPKKLLAF AGLDPTVYQSGNFQAHRTRMSKRGSKVLRYALMNAAHNVVKNNATFKAYYDAKRAEGRTH YNALGHCAGKLVRVIWKMLTDEVAFNLE >gi|157101649|gb|DS480675.1| GENE 20 21874 - 23541 340 555 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00268 NR:ns ## KEGG: EUBELI_00268 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: 
not_defined # 24 543 37 565 646 315 35.0 4e-84 MATLNMDFIKTQETSISRPQCDYDIYDYISFNQDNDFKLILEKDNRWEVFYHLSKMRESL LNWYPFGKDAQVLEVGGEFGALTGTLCKRCKRVVSIERYSLRAEGIFKRYKTCSNLEVIA GEFLDIKFIEKFDFVIIIGKLEYQSGSKFTSEIYENYLKKAKSLLKENGRLLLAIENRFG IKYFCGCIDPFSGVPFAGINNYPLAASGRSFTKDEIKNILDKVGFGTIKFYYPIPDYKLP QLIFTDEYMPKSSIKERVIPYYINSNSLLAFENDLYDDLVANGMFATMCNSFLIECSVNE VLSNIIFAALSTDREDANAFITTIHSEGLVRKYPVNKEGVAVLERSYKNILELKNRGINV VLHKWNGSYIEMPYISSITLSDYIGKEAGLDKNKFIKVLDKLYNCILRSSEKSSHLNRVF PQNDDLDWGIILKKAFIDMVPMNCFYDNDELVFFDQEFMREDFPAKYIMFRAIKYAYLFF PYIEDVISKKDLAVKYGLLQLWEYFDTEENSFVFHNRQQKLYKNFLNWSTINREKINRNA KALIGETSYEKSSIN >gi|157101649|gb|DS480675.1| GENE 21 23551 - 24612 37 353 aa, chain - ## HITS:1 COG:no KEGG:Calhy_1088 NR:ns ## KEGG: Calhy_1088 # Name: not_defined # Def: haloacid dehalogenase domain-containing protein hydrolase # Organism: C.hydrothermalis # Pathway: not_defined # 1 271 372 652 736 117 31.0 1e-24 MIGMFIAYAFNSPFITLDDKKRIVIGNEYDWAYLFVAPLVTGFMLWLIKILQTEKYEGIL FSSRDGYLIKQLYDNIPVCYCRDILPEGLYFYVSRVACSKVAIMNDKDIKWIAGPKCQGS MEKVVRERYEIEYLNTQNTLTKDEELLDYAMLNKECIYKKSKTIRENFLSYINKLGIKDG KKYAFFDFVSCGSCQFFLNQVMPVELEGVYCCLYNSYEEQKAKLPIHSYFYNNHPRNPES NLFENYLQLEVIMTSYEPSIVSYDRDGRPVFADEVRGQDELLFVKKVQKAIKDFWSDFIN NLWDYYGDGIRNVVVDELFKYSKECYTNIRSSNLRSMNLLNDFGQEVIEIRRK >gi|157101649|gb|DS480675.1| GENE 22 25350 - 25610 119 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936495|ref|ZP_02083863.1| ## NR: gi|160936495|ref|ZP_02083863.1| hypothetical protein CLOBOL_01386 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01386 [Clostridium bolteae ATCC BAA-613] # 1 86 589 674 674 178 100.0 1e-43 MIYGAGAYGRNIAKKLDDLQIRYDCFVVTNKKGNLDWLMEHEVKALNEIEENVDTHGIIV ALNKENQKQVKPLLEQLGFDYYLIDL >gi|157101649|gb|DS480675.1| GENE 23 27393 - 29168 210 591 aa, chain - ## HITS:1 COG:jhp0540 KEGG:ns NR:ns ## COG: jhp0540 COG5610 # Protein_GI_number: 15611607 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Helicobacter pylori J99 # 
17 316 9 296 591 87 27.0 5e-17 MNKLDNEYFSISRELLQNKIDTYDIISFDVFDTLLMRKTLQPKDIFSLVEVRAMKEKNIQ CAFEKNRILAEQNLLASFIPNYLQIYNELQFITGISDDEKEYLMDLELRIESQMLIPRKE MVRVFDYAKSRNKKIYLISDMYLSSNWIGNQLKKAGIIGYDELMISNEYSTSKGQNLFNV FIHKVKDKKILHIGDSEAFDIMAAQRNGIDNFRILSAYDMLQISVYKEVLYEIVDLTGRL QVGYFISRIFNSPFALFESDGKGRVKTGRDIGYIYMAPIITDFLLWLANKIKDKKSARVL FSARDGYLIDKLYGILRKKESLIKLPEGLYFLTSRILCISASLFNEDDILWSMNNTFSGS PEELLKKRYFLTEGEIEYFDPNKYSSISQYILAHKKAILSKSAEIREKYMEYISGLGLNG DDDLIFFDFVSTGTCQLCLSRLLNADLKGYYFYRFNTMDIAKQKLLIEAPYTSEGILAAN SLLLECILTSFVPTVKGFDSSGKPVYLEERRTEEELCYIKEVQDGIVEYFEDFIKYRDVD DLDAKDISLGEIFLKFITPQYSTIINNIFRTYIIDDEFHNENLHVSNIYQY >gi|157101649|gb|DS480675.1| GENE 24 29521 - 30033 7 170 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936497|ref|ZP_02083865.1| ## NR: gi|160936497|ref|ZP_02083865.1| hypothetical protein CLOBOL_01388 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01388 [Clostridium bolteae ATCC BAA-613] # 1 170 204 373 373 322 100.0 7e-87 MAVDGTISHAIERILGFVALNNGYDVLEIINLDYAKQRLCDLQLRFTNIVKDLQNDYLIN NIDFLNEEKKKTKELLTYCKSYKNIYIYGAGTYGKQCNDLLKRYGIKINGFIVSNDQKRS SLYDGSQVWEFREIINQLNDSGIVIAVNNQYQKQILELIKQNSRTDVFCI >gi|157101649|gb|DS480675.1| GENE 25 30744 - 32822 137 692 aa, chain - ## HITS:1 COG:jhp0540 KEGG:ns NR:ns ## COG: jhp0540 COG5610 # Protein_GI_number: 15611607 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Helicobacter pylori J99 # 128 556 11 412 591 80 23.0 7e-15 MAENQTKMFWKYLNARIAIYGISVQTKIFIEENSDLNIVGFLDSYKKEGIIYGKPILDLD ELLKRDIDVIIIIARVNSTEMIFRKISDFCLKNHIQVLNSKGKEIKHEDYKDQFEEVNNI SLEIMLREIDSHEVVSFDIFDTLLMRKVLFPNDVFYLMQKREGLNDSFPENRISVEKKMM SEGIIAPSIFEIYKRLIDEYPEAGYDEQKLINLEYELDRNLILCRNDITYAFIYARMKKK RIFLISDMYYSKGQIQQILLDNDIVGYEDIFISSEYGINKEEGLFVKLLEKLDSKKVLHV GDCYEADILAAEQCQIDTFWIKSAVSVMNSSRYRHLLKYTNSLEQRVMLGLFVARIFNSP FALYHTLGKGIIDNSRDITYLFAAPIITEYILWLCSQVDNKKCQVIFPSRDGFILKVLYE NLKSNKKLFKKLPNSVYLVTSRAVCIGASLFDEDDINYAAVFPYEGTPEHLLIERFFLGK 
GEIEQFDCEKYKDLIEYVRSHKKKILKRSQEIRLNYEKYLFQVLDNRELIFSDFVSSGTC QMGLQKIIGREIKGCYFVKSSDAYPDKQKLKVASLYNEESNVAKKQYVFENILTDRSPSF LFFDGDKPIYAAEKRTIEEIDFVSSSQDEVLKYFNEYTSLVNYHYNYEKSLIADYLLEVM GRKYTMIEDRVFKNYILHDDFCNRTMSIDDCL >gi|157101649|gb|DS480675.1| GENE 26 32825 - 33766 214 313 aa, chain - ## HITS:1 COG:DRA0041 KEGG:ns NR:ns ## COG: DRA0041 COG1088 # Protein_GI_number: 15807711 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Deinococcus radiodurans # 1 304 17 311 364 99 25.0 9e-21 MVTGATGFIGMLLVGFFCFLNQKLNAQINIVVYVRNKEKVKTLFNNASIEVVVGDISDKI NYSGDMDYIFHCASNTDSKFMVDKPVECIKSIVNGTQAVCEFAVAKKVKSMVYLSSMEIY GNIEGLVDKISEENMGIIDLLNKRSCYPLGKRMAENICYCFSQEYQCPIKIARLSQTFGA GYILDNSSKVFAYIGRCILNQRNIILHTDGSSMGNYVYSADAISGLMYILVDGDIGEAYN IANEELTMTIREMVEFVIKNFNNNISIVYDIPKENVFGYASKTNIKLSSQKLQKIGWSPQ YNMKDMYQRMLGV >gi|157101649|gb|DS480675.1| GENE 27 33857 - 34213 66 118 aa, chain - ## HITS:1 COG:all5167 KEGG:ns NR:ns ## COG: all5167 COG1211 # Protein_GI_number: 17232659 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2-methyl-D-erithritol synthase # Organism: Nostoc sp. 
PCC 7120 # 3 113 119 228 228 63 33.0 8e-11 MIEVVRAKRTAVAVSKATETIGVVDETGEIIALPDRTYSRIAKAPQGFYVADLMKAHLQV QGEGINWMIDSATLMKHCGYKLSTVECSQYNIKITTPSDFYVYRALYEAEENTQIFGL >gi|157101649|gb|DS480675.1| GENE 28 34858 - 36591 610 577 aa, chain - ## HITS:1 COG:no KEGG:Tthe_2726 NR:ns ## KEGG: Tthe_2726 # Name: not_defined # Def: transposase # Organism: T.thermosaccharolyticum # Pathway: not_defined # 1 576 1 569 570 629 56.0 1e-178 MRLTTNKTKNGISYYIIRSVRRDGKRSSEVVERLGTEREIMEKYHCTDANKWAKAHLDEL NQAEAEKQQKVLVPLRTDILIPFDKQNSYNVGYLFLQKIYYDLRLPNICRKISKEYSFSY DLDSILSRLVYERILNPSSKLSCYEQSSDLLEPPAFELHQIYRALSVIADESDSIQAELY EASRRLVKRQTGVLYYDCTNYFFETECQEGLKQYGPSKEHRPSPIVEMGLFIDRSGIPLA FCIHPGNTNEQTTLIPLEEQILRDFSLSKFIVCTDAGLSSERNRKFNNFGGRCFITTQSI KKLKKDLRQWCLEPTGWHLKDSLDTYDISRLEDTAKNRSKLFYKQLYVEGNDGKRDIDFD QTLIVTYSLKYRNYQQQIRNQQISRAMKAIDTEPKRIDKHSQNDYRRFIKKTSITADGEC AANKIYEIDQDAVQEEAQYDGFYAVYTNLDDDPSEIAAVNQGRWEIEESFRIMKSEFEAR PVYLKRDDRIKAHFTTCFIALLIYRILERKLDSQFTCDEIISTLRKMRVTSIGNEGYVPS YTRTKLTDALHEYAGFRTDYELIKKRTIKGICRHSKE >gi|157101649|gb|DS480675.1| GENE 29 36803 - 38005 588 400 aa, chain - ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 341 1 329 391 83 24.0 6e-16 MISVGIDVSKGKSTVCVLKPYGEIVCRPFEMQHVEKDLEDFHKLLKKLEGEIRIVMEATG IYHLPVLTFLHDKGYFVSVVNPFAMKKYAKDNSIRGAKTDKLDSIMIANYGIEKWFKLQS YEGDEEIYAELKLLGRRYRYYMELHVKALQELTHILDYVMPGIKKMFNSWNEANGKDKLS DFVERFWHFDLITSMSLEKFTEEYLVWAKEKKYHRSRSKAEAVYELTSSGISTLSSGTPS TKMLVQEAISVLRTVDSSLSPILTRMQVLAKSLPEYSTVREMGGVGDVLAVKLMAEIGDV RRLHSAKALIAWAGIDPPPYESGQFIGSKRKITKRGSSTLRKVGYEVMRVLKSHPAPKDD AVYNYILKKEGEGKSKKHAKIAGLNKFLRIYYARVIAVYQ >gi|157101649|gb|DS480675.1| GENE 30 38397 - 39062 473 221 aa, chain + ## HITS:1 COG:no KEGG:Closa_3160 NR:ns ## KEGG: Closa_3160 # Name: not_defined # Def: transposase # Organism: C.saccharolyticum # Pathway: not_defined # 6 221 7 239 555 125 34.0 
1e-27 MGSKHTLGELNNLSREELITIILSMQEQLDALNENIEKLIEQVRLANQQRFGRHTETMAS IEGQLSFFDEADALCNPLVQEPDPDEILPKKAKKKKSKGQREADLKDFPEDVIPTHTVSK EVLDAFYGEGNWKQMPSETYKRLRHEPESWTVEIHMVDVFVGTDGYHQDEFLRGDRPKDL FRGSIATPSLVASILNVKYVNSAPLHRIEQEFSRNGVNISK >gi|157101649|gb|DS480675.1| GENE 31 39078 - 39389 133 103 aa, chain + ## HITS:1 COG:no KEGG:Reut_C6337 NR:ns ## KEGG: Reut_C6337 # Name: not_defined # Def: transposase IS66 # Organism: R.eutropha # Pathway: not_defined # 1 99 242 340 544 77 38.0 2e-13 MINSSHRYLAPLVECMKQELLKLPVTQSDESPTQVINDDRAPGSKSYMWVHRSGEFHTEF PMVIYEYQKGRNHEFPLQFYQNYEGILVTDSLCQYHLIEKKLR >gi|157101649|gb|DS480675.1| GENE 32 39597 - 40025 116 142 aa, chain + ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 1 138 320 450 450 87 33.0 9e-18 MVEEYFAWVKSQLKKTAVPPKGKTAEGLKYSLNQEKYLKVFLTDGKVPIDNSASERSIRT FCIGKKNWMFHDSIQGAQASAVIYSISETAKLNNLRPYYYFKHLLTELPKRCDEKGNIDS AKLNHLLPWAKELPAECRKPRR >gi|157101649|gb|DS480675.1| GENE 33 40523 - 40747 124 74 aa, chain - ## HITS:1 COG:no KEGG:bpr_I0637 NR:ns ## KEGG: bpr_I0637 # Name: not_defined # Def: IS110 family transposase # Organism: B.proteoclasticus # Pathway: not_defined # 1 53 138 190 372 84 66.0 2e-15 MRDDHRLALKKMKQEINAFCLRHNLRYGGRSTWTAAHTKWLRGLEPEGLYREIQKQQKAS GRDPGPFQIRAIYG >gi|157101649|gb|DS480675.1| GENE 34 40762 - 41157 241 131 aa, chain - ## HITS:1 COG:no KEGG:bpr_I0637 NR:ns ## KEGG: bpr_I0637 # Name: not_defined # Def: IS110 family transposase # Organism: B.proteoclasticus # Pathway: not_defined # 1 101 1 101 372 132 61.0 4e-30 MYNTTIFVGMDVHKETFSLCCYSIEKDEFSHPHTTEADYIHVLSYLASLRAALGENVSFI CGYEAGCLGYTLYHQLTEHNITCVILAPTTMPVIKGKKKSKRISGMRVTLQNALPTMITA LCMSQLARMNR >gi|157101649|gb|DS480675.1| GENE 35 41355 - 42800 202 481 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936508|ref|ZP_02083876.1| ## NR: gi|160936508|ref|ZP_02083876.1| 
hypothetical protein CLOBOL_01399 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01399 [Clostridium bolteae ATCC BAA-613] # 1 481 1 481 481 966 100.0 0 MQNTTFNITSLQKITDDTEFIIQCYHTMLGRSLDPFEMQHAIQMLSNGLSKKGFLYWITR SQEFGNRFVIKEIASYKLDCYLYKGMEKLLKLLHLDSIPKYNSECFSTPLHCGRFGYLAE QFTEHDYSFDTEYSALAEGQISQISAYLPKLSDLTYIGHLASVLSPANSVLNTSITHTKV ADITEYMDSHMPAPKSSCFITNPDVLFQLLTTDRLSSFANSIGDTLVLTMPQLPSSQAYV SIVWDGSWDRMEFRPDGTICRWMAGLELDGSIHLINHSMDYCQITLDFILTVLDINSEVS ISFHGKTIPIQYSGTKCNVTLKLSLNPGCNTMSLLYAGKRIKQPDASGRPALLSVDNLAL SFLGSSFLSLSGEAAYSLDEQSHGTGYYPYLLPDSFIRSQLHRNGFFEISAFRVSKTYAV SQLPTTRYDYLRDERHHDCFYILDSDSKEEDITSFSSVILYIACRTGHLSPEFYNYELEE H >gi|157101649|gb|DS480675.1| GENE 36 42860 - 43792 286 310 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936509|ref|ZP_02083877.1| ## NR: gi|160936509|ref|ZP_02083877.1| hypothetical protein CLOBOL_01400 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01400 [Clostridium bolteae ATCC BAA-613] # 1 310 1 310 310 629 100.0 1e-178 MPLAHYMERKGHRVLVIDDQKFPYERSGLYHRLEDMKNQSGIVLISSAHVWFDEFNYHYF YGKDRNMISVIELIDFLKPVYSVFYPHDMESFVHCSEIPWLDLFDMIMLPYTHNLYYRLL GCCQRVEVVGWMKKQREVHPAIDQENPVYSPVFFPSNIITFYEKLGTEGYADWFRRYAGP NIPLKMPAGDAGIVPMLSKEGYQILDSSQSVYDVMAGHNLIIGSGHSSIIFEAAFSGIPV ISLLDGVFPDKEYIKSLSGIKGIYPLHPEELLEFIKDINTRHILLERGPDILMPFDFERV YKYLTEFPEN >gi|157101649|gb|DS480675.1| GENE 37 43803 - 45008 1022 401 aa, chain - ## HITS:1 COG:CAC2328 KEGG:ns NR:ns ## COG: CAC2328 COG1134 # Protein_GI_number: 15895595 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate transport system, ATPase component # Organism: Clostridium acetobutylicum # 7 396 4 417 419 263 37.0 4e-70 MKAENAIEVHDIKKSFRVYLDKGRTLKELVLFSKRRKYEERQVLQGISFEVKKGEALGLI GHNGCGKSTTLKLLTRIMYPDSGTIEMRGRVSSLIELGAGFHPDMSGRQNIYTNASIFGL TRKEIDARVDNIIEFSELEAFIDNPVRTYSSGMYMRLAFAVAINVDADILLVDEILAVGD ANFQAKCFNKLREIKANGTTIVIVSHSLGQIEEICERSIWIHEGKIQKEGNPREVHPAYL 
EYMGQKRPEAASEKVKSEGERPGDGRVRIKTVEVISGKEGESNVFRTGEPVTLNISYNVV EKVEEASIGLEVYNGDGVKCYSTDTRTEKMDYIKLERDGEIHLILENLELLNGKYTMDFS IKSKDSFPIDSYTKAFSFEMYSDVKDTGISRLAHKWEVKNA >gi|157101649|gb|DS480675.1| GENE 38 44998 - 46065 552 355 aa, chain - ## HITS:1 COG:MTH335 KEGG:ns NR:ns ## COG: MTH335 COG0438 # Protein_GI_number: 15678363 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanothermobacter thermautotrophicus # 96 347 122 368 373 96 33.0 6e-20 MLSSLTYGDAIGNVVLALKEAIQKLGYETEVYAEAIDTRLPAGSAKKIQYLPELHKDDVV INHLSNGTSWNHRFGDFSCRKIIYYHNVTPPRFFEDFSVELQMNCHRGLKDVEYLADKVD YCLAVSEFNKQDLRRMGYRQKIDVLPILIPYGDYDKTPSQNILDKYGDGRTNILFTGRIS PNKKQEDVIKAFFYYKNYMNQDARLFFVGKYAGMEAYYEQLKRYAEVLDLKDVYFTGHIK FDEILAYYRTADVFACMSEHEGFCVPLVEAMYFGVPIVAYDSSAIADTLGNGGILTEDKD PKLVAEIINRLVQDETLRKEIISRQKEQLKRFEYDKVTSLFSGYLEKFLEEHNEG >gi|157101649|gb|DS480675.1| GENE 39 46077 - 46538 460 153 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936512|ref|ZP_02083880.1| ## NR: gi|160936512|ref|ZP_02083880.1| hypothetical protein CLOBOL_01403 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01403 [Clostridium bolteae ATCC BAA-613] # 1 153 2 154 154 251 100.0 2e-65 MSDPINIDEIMQELRNNAKNKSYPKEAVAFDAVCARKKEAEDFDFFRELMERDVRYMDRS YYVDYDQPITGRSPRIKKFIKSLYQFHLRPLWDAQNCFNLKAASAMSQMRNFVLQQIEQN EQMEKHLEELRKVCREQELKIEQLEKKLSEEKE >gi|157101649|gb|DS480675.1| GENE 40 46566 - 47342 434 258 aa, chain - ## HITS:1 COG:CAC2329 KEGG:ns NR:ns ## COG: CAC2329 COG1682 # Protein_GI_number: 15895596 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Clostridium acetobutylicum # 5 258 6 258 258 145 35.0 5e-35 MQLFKEIYAYREMIFSLVRRDLKGRYKGSALGFFWTFLNPLLQLVVYTFVFSVIMKSDVE YYYLHLFVALVPWLFFSMSVSEGCSCIRSQQDMVKKIYFPREVLPIAYVTSQFINMLLSF VVVLLVVLLSGRGLNLVALLYLPIIAIVEYLLCLGSALLVSAITVYVRDMEYLLKIVTMA 
LQFLTPVMYSIDIVPERYMAIYVLNPMTPIIVAYRDILYYGKIPRLTTLLHAVLMGVVLL VIGFLVFGKLKRRFAEEL >gi|157101649|gb|DS480675.1| GENE 41 47329 - 48504 403 391 aa, chain - ## HITS:1 COG:MA3757 KEGG:ns NR:ns ## COG: MA3757 COG0438 # Protein_GI_number: 20092555 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 143 381 114 351 351 176 40.0 5e-44 MKIGFEAQLLLDRHKTGIGWCAVNLIQNLVSMEGHNDYQMHYFSLGKSPELLVEMKQYED LGIMLKPCKWFRFVLYKLIWPFVPVPYSLFFGRECDVTQFFNFFVPPGVKGKTVTIVHDM AYKACPETVKSRTRHWLNLVLKKSCNRADIIITVSQFSKDELVRYLGVPKEKIRVMHLGV DQSKFSNHQSRERIDAVRGKYHIKERYYLYLGTIEPRKNIKRLLEAYGRLHDKMSNAPQL VLAGGRGWLCDDIYETAKALDLGDDILFTGYVEEGEAPVLLAGAMAFLFPSLYEGFGIPP LEAMACGTPVLSADKASLPEVLGDAALLINPESVDEICMGMERLAVDRALRSELSRKGSE RVKIYTWERSAHILKGIYEELVTLEERHAVV >gi|157101649|gb|DS480675.1| GENE 42 48505 - 49632 348 375 aa, chain - ## HITS:1 COG:aq_516 KEGG:ns NR:ns ## COG: aq_516 COG0438 # Protein_GI_number: 15605985 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Aquifex aeolicus # 4 370 1 366 368 186 34.0 8e-47 MKKLNVLQVSRMYYPARGGIERVIQDISEGLSDLANIEVLTCQVKGNEVKETIHGVPVTR CASRGMLLSLPIAPSFFQKFKTMSKGKDIVHIHLPFPMADLAAVTSGYDGKLVIWWHSDV VRQKKLMLLYRPIMEKCLKRADAIVVASEGHIEGSSYLGPYKDKCHVIPFGVEKKMEKRA DEYVKVKTVNSSSNIRFLFVGRLVYYKGCDVLIRAMSKVKYGHLDIVGEGPLKTELTELS EKLGLTDRVSFLGEVADSVLDECFRKCDVFVLPSVEKSEAFGIVQMEAMAYGKPVINTNL KSGVPYVSLHRITGLTVEAKNSSELADAMNWLALHPEVREVYGKAGYERIKEYFSQSNML KQLFTLYENLTCEEA >gi|157101649|gb|DS480675.1| GENE 43 49635 - 51044 613 469 aa, chain - ## HITS:1 COG:CAC2330 KEGG:ns NR:ns ## COG: CAC2330 COG2148 # Protein_GI_number: 15895597 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Clostridium acetobutylicum # 58 446 58 444 461 224 36.0 4e-58 MKKDYEPYKRMVRFLFSAVLIGLEVMVYGHVWLKYYNGFMEFPYNRTGNWLIMAVYGILL LVFSAIYGGLRIGYLRIFNIIYSQVLTSFCTNIIIYLQITLLTKHFQNPVPLLAMTGVEG 
VWITVWSVLCTQIYLRIYPPRKVVLVYGEHPVYSLMVKLYGRADRYDIKELVHISKGMDV IKEKVKQYEGVILCDIPSQMRNQLLKYCYNESIRTYMVPKISDIIIRSAENLHLFDTPLM LARNTGLSFEQQFIKRTFDIIIAVCALVVLSPIYLVTAACIKLYDRGPVIFKQKRYTKDG KIFDIYKFRSMIVNAEAAGTSVPAADRDPRITPVGRVIRALRIDELPQFINILKGEMSVV GPRPERVEHVELYTKDIPEFVYRLKVKGGLTGYAQVYGKYNTTAYDKLKLDLMYIQNYSI WLDIEIIFKTIKILLIKESTEGFSEEQAAAITKECGDLTETMEEKDRKL >gi|157101649|gb|DS480675.1| GENE 44 51187 - 52818 531 543 aa, chain - ## HITS:1 COG:no KEGG:Closa_0537 NR:ns ## KEGG: Closa_0537 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 26 538 12 512 514 370 41.0 1e-101 MDNIKTKEISDEKIINLLAAWRKYNSVIAGIFTVLIIFVFPLVFHDYYYDILVAKYIFYY GSVILMIVAMLVGVFFFWYKDKREIGGNAVKEMRSNAKLNALKMSDWAMIIFLIAVTIST LQSEYRFESFWGNEGRYMGMFLILLYGISFFVISRCLRYRQWYLDTFLAGGIIACMIGIL HFFKIDPIGFKRGLVGDDYTIFTSTIGNINTYTSYVAMVSGMGTVMFSVEKNKYRRLWYL VVAILSLFALIVGISDNAYLALLVLFGLLPLYLFGNIDGVKRYVLLLAILFTEFQVIGTL SHAYPNHVVEFKGLMNVIASFGGLPYLIIGLWVCTACLYLLAYKLPENHALKSGSNIGRW IWLGLILLACAGVIYVLYDVNIAGNVEKYGPLNQYLLLNDDWGTHRGYIWRIGMEFYQKQ PIHHKLFGYGPDTFGIITVKNYFEDMVRRYGEKFDSAHNEYLQYLVTIGIVGLTAYLALL TTSIIEMLRTLKKRPELIAIAFAVICYGAQAAVNISVPIVAPIMLTLMMLGVAAVRNVSG NTE Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:13:19 2011 Seq name: gi|157101648|gb|DS480676.1| Clostridium bolteae ATCC BAA-613 Scfld_02_17 genomic scaffold, whole genome shotgun sequence Length of sequence - 227371 bp Number of predicted genes - 208, with homology - 204 Number of transcription units - 95, operones - 57 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 2/0.182 + CDS 2 - 127 146 ## COG1592 Rubrerythrin + Term 164 - 210 10.3 + Prom 282 - 341 6.9 2 2 Tu 1 . + CDS 386 - 925 609 ## COG1592 Rubrerythrin + Term 1032 - 1084 -0.7 - Term 1009 - 1080 17.0 3 3 Op 1 . - CDS 1221 - 1661 740 ## COG0698 Ribose 5-phosphate isomerase RpiB 4 3 Op 2 . - CDS 1698 - 2837 1182 ## COG0082 Chorismate synthase - Prom 2893 - 2952 3.8 5 4 Tu 1 . 
- CDS 3005 - 4063 1057 ## COG2706 3-carboxymuconate cyclase - Prom 4113 - 4172 5.6 + Prom 4205 - 4264 6.6 6 5 Op 1 4/0.000 + CDS 4361 - 4894 588 ## COG1396 Predicted transcriptional regulators 7 5 Op 2 30/0.000 + CDS 4927 - 6003 1274 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 8 5 Op 3 36/0.000 + CDS 5996 - 6835 868 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 9 5 Op 4 25/0.000 + CDS 6835 - 7632 809 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 10 5 Op 5 . + CDS 7662 - 8777 1388 ## COG0687 Spermidine/putrescine-binding periplasmic protein - Term 8795 - 8856 5.4 11 6 Tu 1 . - CDS 8896 - 9078 176 ## - Prom 9230 - 9289 7.0 + Prom 9235 - 9294 10.4 12 7 Op 1 . + CDS 9327 - 9707 414 ## COG1321 Mn-dependent transcriptional regulator 13 7 Op 2 . + CDS 9718 - 10287 471 ## COG0778 Nitroreductase 14 8 Op 1 15/0.000 - CDS 10446 - 12380 2243 ## COG2217 Cation transport ATPase 15 8 Op 2 2/0.182 - CDS 12414 - 12632 357 ## COG2608 Copper chaperone 16 8 Op 3 . - CDS 12709 - 13068 551 ## COG0640 Predicted transcriptional regulators - Prom 13128 - 13187 5.5 - Term 13229 - 13269 7.0 17 9 Op 1 13/0.000 - CDS 13292 - 15082 2084 ## COG0173 Aspartyl-tRNA synthetase 18 9 Op 2 . - CDS 15105 - 16370 1524 ## COG0124 Histidyl-tRNA synthetase - Prom 16401 - 16460 6.7 + Prom 16129 - 16188 2.3 19 10 Tu 1 . + CDS 16339 - 16506 63 ## 20 11 Tu 1 . - CDS 16467 - 17390 1056 ## Closa_1289 Uroporphyrinogen-III decarboxylase-like protein - Prom 17449 - 17508 3.9 - Term 17459 - 17506 10.0 21 12 Op 1 1/0.318 - CDS 17559 - 19109 1629 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 22 12 Op 2 . - CDS 19141 - 19827 641 ## COG0491 Zn-dependent hydrolases, including glyoxylases 23 12 Op 3 . - CDS 19849 - 22134 2688 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases - Prom 22164 - 22223 5.5 24 13 Op 1 . 
- CDS 22254 - 24212 1869 ## COG2199 FOG: GGDEF domain 25 13 Op 2 . - CDS 24253 - 25725 1501 ## Closa_2368 lipoprotein - Prom 25774 - 25833 6.2 - Term 25845 - 25891 7.1 26 14 Op 1 . - CDS 25916 - 26440 705 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 27 14 Op 2 1/0.318 - CDS 26437 - 27480 1106 ## COG1609 Transcriptional regulators 28 14 Op 3 . - CDS 27537 - 29105 1595 ## COG5434 Endopolygalacturonase - Term 29148 - 29183 4.3 29 15 Op 1 38/0.000 - CDS 29220 - 30071 942 ## COG0395 ABC-type sugar transport system, permease component 30 15 Op 2 . - CDS 30078 - 30971 993 ## COG1175 ABC-type sugar transport systems, permease components 31 15 Op 3 . - CDS 30991 - 31287 339 ## gi|160936549|ref|ZP_02083916.1| hypothetical protein CLOBOL_01439 32 15 Op 4 . - CDS 31340 - 32686 1683 ## COG1653 ABC-type sugar transport system, periplasmic component + Prom 32889 - 32948 8.2 33 16 Tu 1 . + CDS 33132 - 34262 1111 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins + Term 34358 - 34401 10.3 + Prom 34322 - 34381 5.6 34 17 Op 1 . + CDS 34457 - 35281 732 ## COG0657 Esterase/lipase 35 17 Op 2 . + CDS 35283 - 36032 999 ## COG1349 Transcriptional regulators of sugar metabolism + Prom 36140 - 36199 8.3 36 18 Tu 1 . + CDS 36259 - 36912 961 ## COG0176 Transaldolase + Term 36998 - 37053 10.1 - Term 36955 - 36992 4.1 37 19 Op 1 3/0.000 - CDS 37165 - 37869 880 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 37893 - 37952 6.6 38 19 Op 2 2/0.182 - CDS 37998 - 38912 864 ## COG0524 Sugar kinases, ribokinase family 39 19 Op 3 21/0.000 - CDS 38934 - 39908 1123 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 40 19 Op 4 16/0.000 - CDS 39911 - 41398 1698 ## COG1129 ABC-type sugar transport system, ATPase component - Term 41409 - 41448 10.1 41 20 Op 1 . 
- CDS 41474 - 42565 1306 ## COG1879 ABC-type sugar transport system, periplasmic component 42 20 Op 2 . - CDS 42601 - 43521 963 ## lwe0277 AP endonuclease (EC:5.3.1.5) - Prom 43543 - 43602 7.3 - Term 43823 - 43889 23.2 43 21 Op 1 32/0.000 - CDS 44000 - 44662 861 ## COG0704 Phosphate uptake regulator 44 21 Op 2 41/0.000 - CDS 44684 - 45445 324 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 45 21 Op 3 38/0.000 - CDS 45486 - 46391 1102 ## COG0581 ABC-type phosphate transport system, permease component 46 21 Op 4 39/0.000 - CDS 46381 - 47319 917 ## COG0573 ABC-type phosphate transport system, permease component 47 21 Op 5 . - CDS 47347 - 48303 1178 ## COG0226 ABC-type phosphate transport system, periplasmic component - Prom 48390 - 48449 6.4 - Term 48473 - 48509 6.3 48 22 Op 1 . - CDS 48546 - 49652 1042 ## Closa_3988 aminoglycoside phosphotransferase 49 22 Op 2 . - CDS 49649 - 50584 943 ## Closa_3989 hypothetical protein 50 22 Op 3 38/0.000 - CDS 50641 - 51531 1006 ## COG0395 ABC-type sugar transport system, permease component 51 22 Op 4 35/0.000 - CDS 51548 - 52426 1007 ## COG1175 ABC-type sugar transport systems, permease components - Prom 52510 - 52569 1.6 - Term 52494 - 52534 6.6 52 22 Op 5 . - CDS 52580 - 53920 1555 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 54009 - 54068 8.2 53 23 Tu 1 . - CDS 54154 - 56469 2121 ## BCAH187_A2105 hypothetical protein - Prom 56536 - 56595 10.6 + Prom 56602 - 56661 8.9 54 24 Tu 1 . + CDS 56706 - 57605 891 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 57630 - 57680 4.2 55 25 Op 1 . - CDS 57755 - 58171 207 ## Closa_0226 Zn-finger containing protein 56 25 Op 2 . - CDS 58254 - 58718 -263 ## gi|160936578|ref|ZP_02083945.1| hypothetical protein CLOBOL_01468 - Prom 58844 - 58903 4.1 57 26 Tu 1 . 
+ CDS 58833 - 60266 1482 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Prom 60286 - 60345 5.3 58 27 Op 1 10/0.000 + CDS 60376 - 61221 814 ## COG0725 ABC-type molybdate transport system, periplasmic component 59 27 Op 2 9/0.000 + CDS 61259 - 62038 644 ## COG0555 ABC-type sulfate transport system, permease component 60 27 Op 3 . + CDS 62035 - 62793 317 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 61 27 Op 4 . + CDS 62790 - 63563 773 ## Taci_1616 hypothetical protein 62 28 Tu 1 . - CDS 63548 - 64438 763 ## COG0583 Transcriptional regulator - Prom 64467 - 64526 9.3 + Prom 64461 - 64520 5.7 63 29 Op 1 . + CDS 64545 - 65228 434 ## COG3619 Predicted membrane protein 64 29 Op 2 . + CDS 65284 - 68223 2139 ## COG1026 Predicted Zn-dependent peptidases, insulinase-like + Term 68295 - 68332 5.4 - Term 68281 - 68320 2.1 65 30 Op 1 3/0.000 - CDS 68384 - 69160 817 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 66 30 Op 2 . - CDS 69197 - 70444 917 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 67 30 Op 3 . - CDS 70457 - 71536 816 ## COG0371 Glycerol dehydrogenase and related enzymes 68 30 Op 4 . - CDS 71558 - 72361 417 ## COG3718 Uncharacterized enzyme involved in inositol metabolism 69 30 Op 5 . - CDS 72374 - 73648 674 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 70 30 Op 6 . - CDS 73645 - 74148 548 ## Tmz1t_0544 tripartite ATP-independent periplasmic transporter DctQ component 71 30 Op 7 2/0.182 - CDS 74169 - 75230 329 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 - Prom 75267 - 75326 7.7 - Term 75263 - 75311 5.2 72 31 Op 1 . - CDS 75379 - 76137 615 ## COG2186 Transcriptional regulators 73 31 Op 2 . - CDS 76195 - 77547 1586 ## COG0733 Na+-dependent transporters of the SNF family - Prom 77625 - 77684 4.5 74 32 Op 1 . 
- CDS 77727 - 78353 605 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) 75 32 Op 2 17/0.000 - CDS 78365 - 79123 235 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 76 32 Op 3 21/0.000 - CDS 79113 - 80792 1270 ## COG1178 ABC-type Fe3+ transport system, permease component 77 32 Op 4 . - CDS 80839 - 81996 1387 ## COG1840 ABC-type Fe3+ transport system, periplasmic component - Prom 82069 - 82128 7.6 + Prom 82186 - 82245 5.1 78 33 Tu 1 . + CDS 82336 - 83724 1289 ## COG0733 Na+-dependent transporters of the SNF family + Term 83754 - 83794 2.2 - Term 83574 - 83606 -0.4 79 34 Tu 1 . - CDS 83742 - 84632 904 ## gi|160936603|ref|ZP_02083970.1| hypothetical protein CLOBOL_01493 - Prom 84853 - 84912 8.3 - Term 84878 - 84927 13.2 80 35 Op 1 . - CDS 84948 - 85412 384 ## COG2731 Beta-galactosidase, beta subunit 81 35 Op 2 . - CDS 85426 - 86241 753 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 82 35 Op 3 . - CDS 86287 - 87576 997 ## COG1593 TRAP-type C4-dicarboxylate transport system, large permease component 83 35 Op 4 . - CDS 87594 - 88097 -29 ## SpiBuddy_2701 tripartite ATP-independent periplasmic transporter DctQ component 84 35 Op 5 . - CDS 88124 - 89239 232 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 85 35 Op 6 . - CDS 89266 - 89916 675 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase - Prom 90030 - 90089 6.3 + Prom 89920 - 89979 11.8 86 36 Tu 1 . + CDS 90147 - 91016 633 ## COG0583 Transcriptional regulator + Prom 91071 - 91130 6.2 87 37 Op 1 23/0.000 + CDS 91176 - 91670 533 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit 88 37 Op 2 2/0.182 + CDS 91689 - 93014 1290 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 89 37 Op 3 2/0.182 + CDS 92999 - 95716 2168 ## COG3383 Uncharacterized anaerobic dehydrogenase 90 37 Op 4 . 
+ CDS 95737 - 96441 479 ## COG1526 Uncharacterized protein required for formate dehydrogenase activity 91 38 Tu 1 . - CDS 96452 - 97405 972 ## COG0679 Predicted permeases + Prom 97651 - 97710 7.9 92 39 Op 1 5/0.000 + CDS 97772 - 98092 253 ## COG1695 Predicted transcriptional regulators 93 39 Op 2 . + CDS 98089 - 98919 788 ## COG4709 Predicted membrane protein 94 39 Op 3 . + CDS 98924 - 99799 896 ## Closa_2421 hypothetical protein 95 39 Op 4 . + CDS 99826 - 100011 270 ## COG1983 Putative stress-responsive transcriptional regulator + Term 100144 - 100184 8.1 96 40 Tu 1 . - CDS 100018 - 101046 716 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 101121 - 101180 2.9 97 41 Tu 1 . - CDS 101360 - 102109 614 ## COG3279 Response regulator of the LytR/AlgR family - Prom 102314 - 102373 6.0 + Prom 102175 - 102234 4.4 98 42 Op 1 . + CDS 102271 - 102636 236 ## Clole_3645 response regulator receiver 99 42 Op 2 . + CDS 102655 - 103692 826 ## COG2199 FOG: GGDEF domain + Term 103771 - 103821 1.7 100 43 Op 1 . - CDS 103641 - 104828 979 ## gi|160936628|ref|ZP_02083995.1| hypothetical protein CLOBOL_01518 101 43 Op 2 . - CDS 104809 - 105648 856 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 105807 - 105866 7.2 102 44 Op 1 . + CDS 105860 - 108361 2513 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 103 44 Op 2 11/0.000 + CDS 108429 - 109454 564 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 104 44 Op 3 11/0.000 + CDS 109469 - 109969 241 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 105 44 Op 4 . 
+ CDS 109969 - 111270 1107 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 + Term 111301 - 111362 27.1 - Term 111294 - 111344 23.6 106 45 Op 1 34/0.000 - CDS 111466 - 112188 583 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 107 45 Op 2 31/0.000 - CDS 112172 - 112840 715 ## COG0765 ABC-type amino acid transport system, permease component 108 45 Op 3 . - CDS 112866 - 113783 1218 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 113806 - 113865 8.5 109 46 Op 1 . + CDS 114101 - 115000 833 ## COG0726 Predicted xylanase/chitin deacetylase 110 46 Op 2 . + CDS 114997 - 115830 870 ## COG0726 Predicted xylanase/chitin deacetylase + Term 116035 - 116071 2.4 - Term 115870 - 115920 12.4 111 47 Tu 1 . - CDS 115994 - 116644 610 ## COG2135 Uncharacterized conserved protein - Prom 116701 - 116760 4.4 112 48 Op 1 3/0.000 - CDS 116771 - 118225 681 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 113 48 Op 2 . - CDS 118261 - 119289 763 ## COG0095 Lipoate-protein ligase A - Term 119360 - 119402 -0.8 114 49 Op 1 12/0.000 - CDS 119414 - 120847 1650 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain 115 49 Op 2 16/0.000 - CDS 120844 - 122211 1467 ## COG0403 Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain 116 49 Op 3 18/0.000 - CDS 122264 - 122644 596 ## COG0509 Glycine cleavage system H protein (lipoate-binding) 117 49 Op 4 . - CDS 122713 - 123822 1251 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) - Prom 123855 - 123914 2.8 + Prom 124195 - 124254 5.5 118 50 Tu 1 . + CDS 124419 - 125291 825 ## Closa_4282 hypothetical protein + Term 125358 - 125410 11.3 + TRNA 125486 - 125562 74.4 # Arg CCT 0 0 - Term 125558 - 125603 2.2 119 51 Op 1 . - CDS 125794 - 127254 449 ## COG0582 Integrase 120 51 Op 2 . 
- CDS 127274 - 127474 149 ## gi|160936651|ref|ZP_02084018.1| hypothetical protein CLOBOL_01542 - Prom 127521 - 127580 4.1 121 52 Op 1 . - CDS 128013 - 128252 220 ## CD3328 hypothetical protein 122 52 Op 2 . - CDS 128249 - 128668 374 ## CD3329 hypothetical protein - Prom 128830 - 128889 6.2 - Term 128867 - 128912 9.8 123 53 Tu 1 . - CDS 128927 - 129169 105 ## gi|160936655|ref|ZP_02084022.1| hypothetical protein CLOBOL_01546 - Prom 129299 - 129358 5.3 - Term 129372 - 129407 0.3 124 54 Op 1 . - CDS 129435 - 129554 85 ## gi|160936656|ref|ZP_02084023.1| hypothetical protein CLOBOL_01547 125 54 Op 2 . - CDS 129571 - 129879 144 ## gi|160936657|ref|ZP_02084024.1| hypothetical protein CLOBOL_01548 - Prom 130036 - 130095 6.1 126 55 Op 1 . - CDS 130215 - 131492 252 ## COG4219 Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component - Prom 131553 - 131612 2.6 127 55 Op 2 . - CDS 131614 - 132027 178 ## gi|160936659|ref|ZP_02084026.1| hypothetical protein CLOBOL_01550 - Prom 132066 - 132125 11.4 - Term 132318 - 132361 10.5 128 56 Tu 1 . - CDS 132396 - 133067 163 ## gi|160936660|ref|ZP_02084027.1| hypothetical protein CLOBOL_01551 - Prom 133304 - 133363 4.6 129 57 Tu 1 . + CDS 133361 - 134572 556 ## COG3547 Transposase and inactivated derivatives - Term 134409 - 134441 -1.0 130 58 Tu 1 . - CDS 134679 - 137591 2061 ## COG0642 Signal transduction histidine kinase - Prom 137765 - 137824 8.8 - Term 137885 - 137945 9.1 131 59 Tu 1 . - CDS 138040 - 139932 1758 ## COG2199 FOG: GGDEF domain - Prom 140176 - 140235 7.5 + Prom 140183 - 140242 6.2 132 60 Tu 1 . + CDS 140321 - 142168 1445 ## COG2200 FOG: EAL domain + Term 142211 - 142274 21.9 - Term 142201 - 142258 12.3 133 61 Tu 1 . - CDS 142265 - 143365 799 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 143385 - 143444 7.6 + Prom 143401 - 143460 6.3 134 62 Op 1 . + CDS 143503 - 144858 1228 ## COG2211 Na+/melibiose symporter and related transporters 135 62 Op 2 . 
+ CDS 144871 - 146511 969 ## Pjdr2_5097 alpha-L-rhamnosidase + Prom 146564 - 146623 7.6 136 63 Tu 1 . + CDS 146667 - 148139 1107 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs + Prom 148170 - 148229 4.4 137 64 Op 1 5/0.000 + CDS 148296 - 149621 1216 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 138 64 Op 2 . + CDS 149642 - 150907 1122 ## COG0477 Permeases of the major facilitator superfamily 139 64 Op 3 . + CDS 150940 - 151746 766 ## Bcer98_0475 hypothetical protein 140 64 Op 4 . + CDS 151743 - 152207 463 ## SSP0162 hypothetical protein + Term 152377 - 152414 -1.0 141 65 Op 1 . - CDS 152234 - 153835 1219 ## COG0591 Na+/proline symporter 142 65 Op 2 . - CDS 153906 - 154913 852 ## Ilyop_1572 hypothetical protein 143 65 Op 3 . - CDS 154932 - 156227 1334 ## COG0477 Permeases of the major facilitator superfamily 144 65 Op 4 . - CDS 156228 - 157478 1346 ## CDR20291_0606 hypothetical protein 145 65 Op 5 . - CDS 157450 - 157578 77 ## 146 65 Op 6 . - CDS 157580 - 158611 830 ## COG1609 Transcriptional regulators - Prom 158638 - 158697 7.7 147 66 Tu 1 . - CDS 158991 - 160313 1010 ## CD3111 hypothetical protein - Prom 160356 - 160415 2.8 - Term 160372 - 160426 1.0 148 67 Op 1 3/0.000 - CDS 160446 - 161651 857 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 149 67 Op 2 5/0.000 - CDS 161681 - 162973 1141 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 150 67 Op 3 . - CDS 163018 - 164307 1124 ## COG0477 Permeases of the major facilitator superfamily - Prom 164422 - 164481 5.9 - Term 164555 - 164605 11.3 151 68 Op 1 . - CDS 164610 - 164879 300 ## gi|160936686|ref|ZP_02084053.1| hypothetical protein CLOBOL_01577 152 68 Op 2 . - CDS 164910 - 166088 1069 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 153 68 Op 3 . - CDS 166136 - 166801 539 ## Amico_1858 hypothetical protein 154 68 Op 4 . 
- CDS 166819 - 167538 508 ## Amico_1857 hypothetical protein + Prom 167892 - 167951 7.9 155 69 Tu 1 . + CDS 167978 - 169612 632 ## COG2508 Regulator of polyketide synthase expression + Prom 169795 - 169854 6.8 156 70 Tu 1 . + CDS 169946 - 171418 291 ## Elen_3094 regulatory protein GntR HTH + Term 171526 - 171562 -0.4 - Term 171287 - 171337 0.9 157 71 Op 1 10/0.000 - CDS 171538 - 174753 2135 ## COG0642 Signal transduction histidine kinase 158 71 Op 2 . - CDS 174800 - 177259 1797 ## COG0642 Signal transduction histidine kinase - Prom 177358 - 177417 5.9 + Prom 177314 - 177373 5.3 159 72 Tu 1 . + CDS 177410 - 177580 89 ## gi|160936695|ref|ZP_02084062.1| hypothetical protein CLOBOL_01586 - Term 177496 - 177533 6.2 160 73 Op 1 . - CDS 177594 - 177809 103 ## 161 73 Op 2 . - CDS 177781 - 178962 1039 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 162 73 Op 3 . - CDS 178964 - 179818 728 ## EUBREC_2330 hydrogenase-4 component C - Prom 179951 - 180010 5.6 - Term 179961 - 180012 -0.9 163 74 Op 1 1/0.318 - CDS 180166 - 180708 450 ## COG0655 Multimeric flavodoxin WrbA 164 74 Op 2 . - CDS 180723 - 181664 674 ## COG1073 Hydrolases of the alpha/beta superfamily - Prom 181684 - 181743 2.6 165 74 Op 3 . - CDS 181758 - 182249 545 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family - Prom 182327 - 182386 2.9 166 75 Op 1 . - CDS 182417 - 182647 323 ## gi|160936702|ref|ZP_02084069.1| hypothetical protein CLOBOL_01593 167 75 Op 2 . - CDS 182741 - 184123 1210 ## COG0534 Na+-driven multidrug efflux pump - Prom 184157 - 184216 6.8 + Prom 184197 - 184256 7.0 168 76 Tu 1 . + CDS 184306 - 185193 623 ## COG0583 Transcriptional regulator + Term 185270 - 185323 12.2 169 77 Tu 1 . 
- CDS 185328 - 185846 280 ## Spirs_4177 hypothetical protein 170 78 Op 1 40/0.000 - CDS 185954 - 186649 641 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 171 78 Op 2 . - CDS 186651 - 188141 1025 ## COG0642 Signal transduction histidine kinase - Prom 188243 - 188302 2.3 172 79 Op 1 . - CDS 188309 - 188950 511 ## BMD_1530 LysE family translocator protein 173 79 Op 2 . - CDS 188963 - 190009 778 ## PRU_1044 hydrolase domain-containing protein - Prom 190209 - 190268 2.8 174 80 Tu 1 . - CDS 190321 - 190914 494 ## gi|160936715|ref|ZP_02084082.1| hypothetical protein CLOBOL_01606 - Prom 191007 - 191066 9.6 175 81 Op 1 . - CDS 191084 - 193441 1521 ## Ethha_0330 protein of unknown function DUF214 176 81 Op 2 . - CDS 193457 - 194218 301 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 177 82 Op 1 . - CDS 194349 - 195692 1055 ## COG1541 Coenzyme F390 synthetase - Term 195709 - 195747 3.7 178 82 Op 2 . - CDS 195755 - 196888 790 ## PRU_1044 hydrolase domain-containing protein - Prom 197135 - 197194 4.7 - Term 197142 - 197180 4.1 179 83 Op 1 . - CDS 197223 - 197747 484 ## gi|160936721|ref|ZP_02084088.1| hypothetical protein CLOBOL_01612 180 83 Op 2 . - CDS 197802 - 198995 785 ## COG1073 Hydrolases of the alpha/beta superfamily - Prom 199073 - 199132 2.6 181 84 Op 1 34/0.000 - CDS 199178 - 199717 468 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 182 84 Op 2 . - CDS 199690 - 200310 533 ## COG0765 ABC-type amino acid transport system, permease component - Prom 200341 - 200400 4.8 - Term 200741 - 200799 0.6 183 85 Op 1 . - CDS 200965 - 201867 367 ## gi|160936728|ref|ZP_02084095.1| hypothetical protein CLOBOL_01619 184 85 Op 2 . - CDS 201917 - 202141 101 ## gi|160936729|ref|ZP_02084096.1| hypothetical protein CLOBOL_01620 - Prom 202294 - 202353 3.2 185 86 Op 1 2/0.182 - CDS 202355 - 202597 185 ## COG0583 Transcriptional regulator 186 86 Op 2 . 
- CDS 202650 - 203087 200 ## COG2936 Predicted acyl esterases - Prom 203110 - 203169 2.4 187 86 Op 3 . - CDS 203186 - 203380 61 ## gi|160936732|ref|ZP_02084099.1| hypothetical protein CLOBOL_01623 - Prom 203512 - 203571 4.7 188 87 Op 1 . - CDS 204237 - 205634 1084 ## COG2211 Na+/melibiose symporter and related transporters 189 87 Op 2 2/0.182 - CDS 205646 - 207121 827 ## COG0554 Glycerol kinase 190 87 Op 3 12/0.000 - CDS 207125 - 208060 756 ## COG3958 Transketolase, C-terminal subunit 191 87 Op 4 1/0.318 - CDS 208057 - 208875 680 ## COG3959 Transketolase, N-terminal subunit 192 87 Op 5 . - CDS 208922 - 209701 930 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 209740 - 209799 5.3 + Prom 209922 - 209981 7.0 193 88 Tu 1 . + CDS 210008 - 211048 778 ## COG1609 Transcriptional regulators - Term 210937 - 210968 -0.9 194 89 Op 1 . - CDS 211011 - 211352 168 ## gi|266621899|ref|ZP_06114834.1| ATPase 195 89 Op 2 . - CDS 211396 - 211599 115 ## gi|160936741|ref|ZP_02084108.1| hypothetical protein CLOBOL_01632 - Term 211903 - 211952 3.1 196 90 Tu 1 . - CDS 212050 - 212616 164 ## Clole_2805 hypothetical protein - Prom 212805 - 212864 6.2 197 91 Op 1 . - CDS 213114 - 213662 565 ## COG0784 FOG: CheY-like receiver 198 91 Op 2 . - CDS 213715 - 215865 1861 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain 199 91 Op 3 . - CDS 215943 - 219155 2562 ## COG0642 Signal transduction histidine kinase 200 91 Op 4 . - CDS 219160 - 219522 277 ## ELI_3416 hypothetical protein - Prom 219560 - 219619 7.0 + Prom 219603 - 219662 4.8 201 92 Op 1 . + CDS 219736 - 219876 131 ## gi|160936748|ref|ZP_02084115.1| hypothetical protein CLOBOL_01639 202 92 Op 2 . 
+ CDS 219903 - 221321 845 ## ELI_1104 hypothetical protein + Term 221407 - 221446 -0.4 - Term 221395 - 221434 -0.4 203 93 Op 1 7/0.000 - CDS 221503 - 223284 1059 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 204 93 Op 2 3/0.000 - CDS 223274 - 224047 612 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 224099 - 224158 5.2 - Term 224103 - 224149 11.5 205 94 Op 1 3/0.000 - CDS 224204 - 225361 809 ## COG0225 Peptide methionine sulfoxide reductase 206 94 Op 2 13/0.000 - CDS 225436 - 226017 500 ## COG0526 Thiol-disulfide isomerase and thioredoxins 207 94 Op 3 . - CDS 226040 - 226738 580 ## COG0785 Cytochrome c biogenesis protein - Prom 226776 - 226835 4.8 208 95 Tu 1 . + CDS 227077 - 227371 68 ## MGAS2096_Spy1746 transposase Predicted protein(s) >gi|157101648|gb|DS480676.1| GENE 1 2 - 127 146 41 aa, chain + ## HITS:1 COG:CAC3597 KEGG:ns NR:ns ## COG: CAC3597 COG1592 # Protein_GI_number: 15896831 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 40 141 180 181 63 80.0 8e-11 DLAKRAKALNLDAIHDTVHEMAKDEARHGKAFEGLLNRYFG >gi|157101648|gb|DS480676.1| GENE 2 386 - 925 609 179 aa, chain + ## HITS:1 COG:FN0455 KEGG:ns NR:ns ## COG: FN0455 COG1592 # Protein_GI_number: 19703790 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Fusobacterium nucleatum # 7 179 5 179 179 179 56.0 2e-45 MASKYAGTKTEQNLKDAFSGESQARNKYTYYASAAKKAGYEQMSALYLETADQEKEHAKM WFKELHGIGTPEENLADAAAGENYEWTDMYARMAREAREEGFEELAVKFELVAKVEAAHE RRYNKLLESLKNDKTFKGDAPLGWKCRNCGYIHEGPEAPEVCPCCAHPKAYFERKVENY >gi|157101648|gb|DS480676.1| GENE 3 1221 - 1661 740 146 aa, chain - ## HITS:1 COG:rpiB KEGG:ns NR:ns ## COG: rpiB COG0698 # Protein_GI_number: 16131916 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Escherichia coli K12 # 2 146 3 147 149 174 62.0 7e-44 
MKIAIGNDHTAIELKNIIKEFVEEKGYEVIDLGTNSTESCDYPIYGEKVGRAVADGQADL GIAICGTGVGISLAANKVKGIRACVCSEPYTAKLSRVHNNSNVLAFGARVVGSELAKMIT EAWLDAEFEGGRHERRVNMIMEIENR >gi|157101648|gb|DS480676.1| GENE 4 1698 - 2837 1182 379 aa, chain - ## HITS:1 COG:MA0550 KEGG:ns NR:ns ## COG: MA0550 COG0082 # Protein_GI_number: 20089439 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Methanosarcina acetivorans str.C2A # 1 362 1 355 365 338 51.0 1e-92 MAGSALGTILRVTTWGESHGKGIGVVVDGCPAGLPLEEGDIQSYLNRRKPGQSKFTTARQ EGDQVEILSGVFGGRTTGTPIALEVRNTDQRSHDYSNIMDIYRPGHADYTFDQKYGFRDY RGGGRSSGRETIARVAAGAIACRLLESLGITVRAYTREVGGICIAQERFDLDEMWNNRLY MPDARAAAQAEARLEELMAGKDSCGGVVECIIRGMPAGAGEPVFEKLDANLAKAMLSIGA VKGFEIGDGFAASRARGSENNDGFGISHKAADGSPYLSGRIEKQTNHSGGTLGGISDGSD IVMRAAFKPTPSIAQPQHTVNQNGEETVVEIKGRHDPIIVPRAVVVVEAMAALTAADMLL LSMTSRLDRVQAFFARDRV >gi|157101648|gb|DS480676.1| GENE 5 3005 - 4063 1057 352 aa, chain - ## HITS:1 COG:BS_ykgB KEGG:ns NR:ns ## COG: BS_ykgB COG2706 # Protein_GI_number: 16078366 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Bacillus subtilis # 4 344 3 342 349 148 28.0 1e-35 MSGKYMAYVGSYSYTGKAKGITVFDVDVKNGTFRERCEMEVDNSSYVIASSDGKTLYSIA DEGVVSFRILQNGSLARLNSAPIRGMRGCHLCTDMEDKYVFVSGYHDGKTTVLSLNKDGS VGQIVDEVFNKGYGSVAERNFRSHVTCTRRMPDGRYVLSVDNGTDQVKVYRFNDKDKRLF QVDAIHCDLESAPRHFRYSSDGRFLYLMHELQNVICVYSYETGDRAPVVEKIQTISTTST KNPGNLTAACAMRLSPDEKYVYCTNAGENTVSIYSRDEKSGLLTMICCLPVSGEYPKDVA IFPDQKHIASINHDSGSISFFEVDYEKGLIVMCGRSVMVNEPNCCAIAKIGN >gi|157101648|gb|DS480676.1| GENE 6 4361 - 4894 588 177 aa, chain + ## HITS:1 COG:CAC0841 KEGG:ns NR:ns ## COG: CAC0841 COG1396 # Protein_GI_number: 15894128 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 177 1 179 179 199 55.0 3e-51 MEIGSKLKELRVLKGLTQEELADRAELSKGFISQLERDLTSPSIATLMDILQCLGTTIGE FFNETPDEQVVFGKQDYFEKIDTDLNNEIKWIIPNAQKNVMEPILLTLKAGGSTYPDNPH 
EGEEFGYVLQGSISIHIGSKTYKAKKGESFYFTPDKKHYLTSRNGALLLWVSSPPSF >gi|157101648|gb|DS480676.1| GENE 7 4927 - 6003 1274 358 aa, chain + ## HITS:1 COG:CAC0840 KEGG:ns NR:ns ## COG: CAC0840 COG3842 # Protein_GI_number: 15894127 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Clostridium acetobutylicum # 5 351 6 351 352 397 55.0 1e-110 MSQPFIDLQHISKSFDGELVLDDLNLSIRENSFVTLLGPSGCGKTTTLRILGGFTTPDTG KVIVGGQDISMIPPNKRPLNTVFQKYALFPHMNIAENIAFGLKIKNKPKSYIDDKIKYAL KLVNLPGYEKRTVDSLSGGQQQRIAIARAIVNEPRVLLLDEPLGALDLKLRQDMQYELIR LKNELGITFIYVTHDQEEALTMSDTIVVMNQGYIQQMGSPEQIYNEPENAFVADFIGESN IVPGTMIRDELVEIFGARFACVDKGFGNNKPVDVVIRPEDIDLVKPEDGTLQGVVTHLIF KGVHYEMEVTTPDGFEWLVHSTDMFPVGRQVGIHVEPFEIQIMNKPASEDEEALGVNE >gi|157101648|gb|DS480676.1| GENE 8 5996 - 6835 868 279 aa, chain + ## HITS:1 COG:CAC0839 KEGG:ns NR:ns ## COG: CAC0839 COG1176 # Protein_GI_number: 15894126 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Clostridium acetobutylicum # 18 275 12 269 277 228 46.0 8e-60 MNSEKKGAAIGKALLSGPYLIWMAGFTLIPLAMIIWYGLTDKGGGLTMANVIAIASADHA KALWLSLGLSLVSTVLCLLLAYPLAMILRGRGAAAGGFIVFIFILPMWMNFLLRTLAWQT LLEKNGVINGILNWLHLPSQSMINTPSAIVLGMVYNFLPFMVLPIYNVLSKIDDNIINAA MDLGANSFQRFTRIWFPLSIPGIISGITMVFVPSLTTFVISDLLGGSKILLIGNVIEQEF TRGSNWHLGSGLSLVLMVFILISMAMIAKYDKNGEGTAF >gi|157101648|gb|DS480676.1| GENE 9 6835 - 7632 809 265 aa, chain + ## HITS:1 COG:CAC0838 KEGG:ns NR:ns ## COG: CAC0838 COG1177 # Protein_GI_number: 15894125 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Clostridium acetobutylicum # 7 259 1 253 260 209 47.0 3e-54 MTKSSKVLRKIISDFYMVLILIFLYAPILTMMVLSFNKSKSRTQWGGFTLQWYTQMFDSR SIMSALTTTLIIAFVSALVATVIGTAAAISISSMRTVPRTIVMGVTNIPMLNADIVTGIS LMLAFIAFGISLGFKTVLISHITFNIPYVILSVMPKLKQTDKYTYEAALDLGASPLYAFF 
KVGFPDIMPGVLSGFLLAFTMSLDDFIITHFTKGAGINTLSTLIYSEVRRGIRPSMYALS SIIFLTVLILLIFVNFRSIKKPEKA >gi|157101648|gb|DS480676.1| GENE 10 7662 - 8777 1388 371 aa, chain + ## HITS:1 COG:CAC0837 KEGG:ns NR:ns ## COG: CAC0837 COG0687 # Protein_GI_number: 15894124 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Clostridium acetobutylicum # 3 371 2 353 354 287 40.0 2e-77 MKKKMAITLAAASMAVLSAALLTGCGKSGSSPEGGNKEDNKLNVYNWGEYIDEDVITQFE EETGIQVVYDVFETNEEMYPVIEAGAVKYDVVCPSDYMIQRMIENGLLAELNFDNIPNYK EIGQQYLDISKGFDPENKYSVPYCWGTMGILYNTKRLEELGVPAPTKWSDLWDERLSGEI LMQNSVRDAFTVALKMKGYSLNSTNPDELAEARDLLIEQKPLVQAYVIDQVRDKMIGGEA AVGVIYSGEMLYIQDAVAEQGLDYSLEYVIPEEGTNYWIDSWVIPANAEHKENAEKWIDF LCRPDIAKKNFEYITYATPNTGAYDLLDDDLKNNKALFPDLDKLPKCEIIQYLGDDVDTI YNDMWKEIKSE >gi|157101648|gb|DS480676.1| GENE 11 8896 - 9078 176 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNLSTLIILVILAVIIGAVVRIMIRDKKAGKGCGGCAGGCGGCAGSSMCHGYHEHGKREG >gi|157101648|gb|DS480676.1| GENE 12 9327 - 9707 414 126 aa, chain + ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 5 118 3 117 122 108 49.0 3e-24 MKIQESAENYLETILMLAKEQPYVRSIDIATELGFSKPSVSVAMKNLRQNGYVQMDDQGH ITLTPSGQAIADTMYERHTLLSNWLIYLGVDPRTAAEDACRMEHTLSAESFDAIKRHITH GHELPD >gi|157101648|gb|DS480676.1| GENE 13 9718 - 10287 471 189 aa, chain + ## HITS:1 COG:FN1880 KEGG:ns NR:ns ## COG: FN1880 COG0778 # Protein_GI_number: 19705185 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Fusobacterium nucleatum # 1 189 1 189 192 105 34.0 5e-23 MLKDLVKQCRSYRRFYQDTVIPYSELADMVDTARLAASASNAQALKFKILCTPEECAGIF PYTAWAGALKDWEGPEEGERPSAYIVIACDLSIGKNKQWDDGITAQTIMLAAAEKGYGGC MIGSCMRSEIGRLLGLDPEHYSIDLVLALGKPKEEVVLVPVKEDGSTAYYRDGNQVHYVP KRSLEDILL >gi|157101648|gb|DS480676.1| GENE 14 10446 - 12380 2243 644 aa, chain - ## HITS:1 
COG:BS_yvgW KEGG:ns NR:ns ## COG: BS_yvgW COG2217 # Protein_GI_number: 16080402 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Bacillus subtilis # 43 644 119 699 702 654 56.0 0 MTKKQKAMLSRIIAAFIIYITLAVTDHMEILPEWLGLWGKMALYLVPYVLIGWDIVYKAF RNIRNGQVFDENFLMTVATFGAFGVGEYSEAVAVMLFYQVGELFQSYAVNRSRQSITELM DICPEYANIEEDGQLKQVEPDDVQVGDIIVVKAGERIPLDGKVVFGNSMVDTSALTGESV PRKVAEGDDIISGCVNGSGLLRVQVTKEFDDSTVAKILELVENASSKKAHVENFITRFAR YYTPLVVMGAVLLAVIPPLFFGQSWSEGVRRACTFLVISCPCALVISVPMSFFSGIGVAS KKGVLIKGSNYLEALSEMDTIVFDKTGTLTKGEFKVTEIHAVTGSPVPEASEVPRTRVWA ASQPSDEGKRSLLELAALAESYSDHPISRSIKDAWGDAIDMNRVSNAQEISGHGVQAEID GHTVLAGNSKLMKEMNVAYQECMSMGTVVYVALDGVYCGYIVIADSIKDEAFEAIKNLKK VGVRRTVMLTGDKKEVGEAVAARLGLDEVHAELLPGDKVAKVEELLAQQTGKHRLAFVGD GINDAPVLSRADVGIAMGSMGSDAAIEAADIVLMDDNPARIADVVRISRKTMSIVKQNIV FALGVKAVVLLMGAAGMANMWEAVFADVGVSVIAILNAMRALSL >gi|157101648|gb|DS480676.1| GENE 15 12414 - 12632 357 72 aa, chain - ## HITS:1 COG:FN0259 KEGG:ns NR:ns ## COG: FN0259 COG2608 # Protein_GI_number: 19703604 # Func_class: P Inorganic ion transport and metabolism # Function: Copper chaperone # Organism: Fusobacterium nucleatum # 1 67 1 67 73 60 49.0 1e-09 MKKKFKLQDLDCANCAAKMEEAIKKIEGVSDATVSFMAQKMTIEADDSRFDEIMKEVVSV CRKVEPDCVILL >gi|157101648|gb|DS480676.1| GENE 16 12709 - 13068 551 119 aa, chain - ## HITS:1 COG:CAC2242 KEGG:ns NR:ns ## COG: CAC2242 COG0640 # Protein_GI_number: 15895510 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 4 119 6 121 122 130 56.0 7e-31 MAINDVECCDFYQVHENVVKAVNDKMPDEDKLYDLAEIFKVFGDSTRIKILYVLFEAEMC VCDIAQLLNMNQSAISHQLRILKQNRLVKSRRDGKAVFYSLADSHVRTIINQGMEHIEE >gi|157101648|gb|DS480676.1| GENE 17 13292 - 15082 2084 596 aa, chain - ## HITS:1 COG:CAC2269 KEGG:ns NR:ns ## COG: CAC2269 COG0173 # Protein_GI_number: 15895537 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Clostridium 
acetobutylicum # 1 596 1 594 595 751 60.0 0 MAESMVGLKRTHRCTEVTTAHIGQEVTVMGWVQKSRNKGGIIFVDLRDRSGILQIIFEEN DCGAENFAKAEKLRSEFVVAVTGRVEARSGAVNTNLATGAIEIRANSMRILSESETPPFP IEENSKTKEELRLKYRFLDLRRPDLQKNLMIRSQVATLTRAFLASEGFLEIETPTLIKST PEGARDYLVPSRVHPGSFYALPQSPQLFKQLLMCSGYDRYFQLARCYRDEDLRADRQPEF TQIDMELSFVDVDDVLDVNERLLKKLFKEICGFDVQLPIPRMTWQEAMDRFGSDKPDLRF GMELKNVSEVVKGCEFAVFKGALENGGSVRGINAQGQGHMPRKKIDALVEYAKGFGARGL AYVAISEDGTVKSSFAKFMKEEEMTALISAMDGKPGDLLLFAADRNKVVFDVLGNLRLEL ARQLDLLKKDDFKFLWVTEFPLLEYSEEEDRYVAMHHPFTMPMDEDLQYIDSDPGRVRAK AYDIVLNGVEMGGGSVRIHQADIQSKMFEVLGFTPERAGEQFGFLLEAFKYGVPPHAGLA YGLDRVVMLMVGADSIRDVIAFPKVKDASCLMTEAPGQVDEKQLEELHIAVAAEEE >gi|157101648|gb|DS480676.1| GENE 18 15105 - 16370 1524 421 aa, chain - ## HITS:1 COG:APE0662 KEGG:ns NR:ns ## COG: APE0662 COG0124 # Protein_GI_number: 14600873 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Aeropyrum pernix # 6 340 10 327 438 196 36.0 1e-49 MALAKKPVTGMRDITPAEMQIREYVMNQIKETYKSFGFTQIETPCVEHIGNLTSKQGGDN EKLIFKILKRGDKLNLEGSKEENDLVDSGLRYDLTLPLSRYYANNAANLPSPFKALQMGS VWRADRPQKGRFRQFVQCDIDILGDATSLAEIELILATTTALGRICPDNGFTVRINDRGI LRGMALYCGFAEESMDQVFITLDKMDKIGYEGVEKELLESGHSPESVSKYLDMLKNVTKD SAGVRALGQMLSEVMPADVSEGLAHIMDTVTEVAEADFGLEFDPTLVRGMSYYTGTIFEV SMEGFGGSVAGGGRYDKMIGKFTGMETPACGFSIGFERIVTILMDNGFTVPGASDRKAFL FEKGVDSARLAAVIREAMEERKKGVQVLVAQMNKNKKFQKEQLGREGYTEFKEFYKESLK N >gi|157101648|gb|DS480676.1| GENE 19 16339 - 16506 63 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPVTGFFANAIRVLQSCFLFNIFILSNFYDFASIFIDFSLFGPYFMASAAIRMRS >gi|157101648|gb|DS480676.1| GENE 20 16467 - 17390 1056 307 aa, chain - ## HITS:1 COG:no KEGG:Closa_1289 NR:ns ## KEGG: Closa_1289 # Name: not_defined # Def: Uroporphyrinogen-III decarboxylase-like protein # Organism: C.saccharolyticum # Pathway: Porphyrin and chlorophyll metabolism [PATH:csh00860]; Metabolic pathways [PATH:csh01100]; Biosynthesis of secondary metabolites [PATH:csh01110] # 6 307 16 317 319 365 55.0 1e-99 
MKHEPVDRIPCYFTTHFAREAAFGQAAVEEHLKFFREVNPDIQKIMNEHLVPNMGSIKTA EDWKCIRTITLKDKFMEDEIELVKRVLDRCDRDAFNIGTLHGIVASTIHPIEPDYGYMPV RQLLCSHLRQNKGPVLDAMKRIADGLCQLAGKFVELGLDGILYAGLGGERFLFTDQEFEE YIMPFDKQIMKACLEAGGHNILHMCKTDVNFDRYDSYKGLYSIANWGIYETGMSLAEGRK RFADCAIMGGLSNHKGPLIEGTPDEIRQEVRRVMAEAGETGFILGTDCTIPMGVPYERMR IAAEAIK >gi|157101648|gb|DS480676.1| GENE 21 17559 - 19109 1629 516 aa, chain - ## HITS:1 COG:CAC2271 KEGG:ns NR:ns ## COG: CAC2271 COG0635 # Protein_GI_number: 15895539 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 119 512 69 467 476 346 45.0 6e-95 MIGLLIQDNQYEQDIRELLMSFYPGEAYVHEVKDGADFYVETRLGDGAVSIYIWENAAAT EDCGGDKTGPEGCGGGAAGPEPGKEQAPSCPEGLKGWMLGDSCTRPSDLSDHSATKNVIK KMFYLMLAARTGKEMPWGSLTGIRPTKIALTRLEEGWKEEDIRSFMKETYLASDEKIDLS IEIAAREKKLLEPLDYERGYSLYVGIPFCPTTCLYCSFTSYPIGKWKGRTGLYLEALFKE LEYTARKMEGRPLDTVYFGGGTPTSLSAEDLEALLSRLEQLFDLSRVLEFTVEAGRPDSI TMDKLKVLRDHGITRISINPQTMNQKTLDLIGRHHTVDMVKDRFYMARELGFDNINMDLI MGLPEENMDDVRRTLEEVKALAPDSLTVHSLAIKRAARLNMFREEYGGLKIQNTPEMIEL SAACARQMGMEPYYLYRQKNMAGNFENVGYSLPGKACIYNILIMEEMQTIAACGAGTTTK VVFPSENRRERCENVKEVEQYISRIDEMIQRKERIL >gi|157101648|gb|DS480676.1| GENE 22 19141 - 19827 641 228 aa, chain - ## HITS:1 COG:BH2820 KEGG:ns NR:ns ## COG: BH2820 COG0491 # Protein_GI_number: 15615383 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Bacillus halodurans # 11 211 8 209 211 148 41.0 7e-36 MSDIRIKTCVLGQVSTNCYLVYDDQTKEGVVIDPADNAPYILNKCSELGITLTAVLLTHG HFDHIMAVPDVIRAFRVKLYAYEAEENMLADTKLNMTGGFRGPQTSLHADVLLHDGQELE LLGTRWKVLFTPGHTAGSCCFYLPDEGIIFAGDTLFRGSYGRTDLATGNTIRIVSSIVDV LFALPDDTMVYTGHGDPTTIGFEKQGNPVLTVRDNILRVKGPDAFKQE >gi|157101648|gb|DS480676.1| GENE 23 19849 - 22134 2688 761 aa, chain - ## HITS:1 COG:CAC2274 KEGG:ns NR:ns ## COG: CAC2274 COG0317 # Protein_GI_number: 15895542 # Func_class: T Signal transduction mechanisms; K Transcription # 
Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Clostridium acetobutylicum # 29 759 6 738 740 714 49.0 0 MANEGTKEKLDIELEAPADFTSPEVLYQDLINTIRKYHPSDDITMIEKAYRIADEAHKEQ KRKSGEPYIIHPLCVAIILADLEMDKETIAAGLLHDVVEDTIMTLEQLTKEFGSEVSFLV DGVTKLTQLNWDKDKVEIQADNLRKMFLAMAKDIRVIIIKLADRLHNMRTGQYWKPEKQK EKARETMEIYAPIADRLGISKIKIELDDLSLKFLQPDVYYDLVEKVALRKDTREAFVQEI VDEVKSHMEDAGIESTVGGRVKHFFSIYKKMVNQNKTLDQIYDLFAVRIMVETVKDCYAA LGVIHEMYKPIPGRFKDYIAMPKQNMYQSLHTTLIGSNGQPFEIQIRTYEMHRTAEYGIA AHWKYKESGSGQVAAGSAEAKLSWLRQILEWQRDTDDSKEFLSMVKGDLDLFSDSVYCFT PSGDVKTLPSGSTPIDFAYCIHSAVGNKMVGARVNGKLVPIEYVIQNGDQLEIITSQNSK GPSRDWLGIVKSTQAKNKINQWFKSQMKEDNIVRGRDQIERYCKAKGINWSDIGKPEFME KVMNRYAYKDWDSVLASIGHGGLKEGQVINKMMEEREKKLKREITDADVLSGIADTAGRA GEASVRKSKSGIMVKGIHDLAVRFSKCCNPVPGDEIVGFVTRGRGVSIHRTDCVNIINLP EDERSRMIDAEWQMAEGASNNEQYSTEIKIFANNRIGMFADISKVFTEKQIDITSMNVRT SKQGKATIIMTFDIHGTDELNRLTDKIRQIEGVLDIERTTG >gi|157101648|gb|DS480676.1| GENE 24 22254 - 24212 1869 652 aa, chain - ## HITS:1 COG:mll2240_2 KEGG:ns NR:ns ## COG: mll2240_2 COG2199 # Protein_GI_number: 13472065 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Mesorhizobium loti # 478 650 8 173 181 87 34.0 1e-16 MKKRLLSNRVRNAILMAILLQSVIFGFGLMVTGTFSGTANRPYKVMESQVAEKDSLISGY MNNSLLIANTMEKELAKIKDDRKIHDRLIDNLNHASSVDGVFYLDLDQKEAVIYRDGEPQ IFSSAYGDIYCVVGSSKSEYPIALSKDWSPKLTPQEWAKAELFWTERTKGGKWFFDDNRL YYVISQEVYGSRRMMGFQISGEVLDSYLGLDNPPYKGMQVLLMTDEKILYSRDKAFIGTD YDYREDGGRISLTYNGASYDGVRSVLQVYGHMEGGSVYVGAVCYHSELAELSRSTIVMVA GVYLLSIVIAIIFSYIAIWMVLKPIKKLQEDITCQNPEEVHFRESGIEEIDRIHQALNDM AAKLEQSYSRYSFTLESAGEKVGSFEYQEGGSRVKLSSSILQLLDIGSDRMGSDGCLDYD VWEDILGGLERVMELEDGFSYTDRACNTRAVSIRQRHEEHGVFGIVIDKTDAYMEILRLR NISQHDQLTGLYNASYLKKEGQKILDGNRSRVNALVFCDLDNLKFVNDNYGHEMGDHYLK AMADLLTGIAFGEQCMAVRLSGDEFVLFFYGYSDRKTIEDKVRSGYEGRSSIQLPDGTSR RINASIGLAFAQRESERMEDLINRADRAMYRVKRTEKNGIAIYDKTDEAGPV >gi|157101648|gb|DS480676.1| GENE 25 24253 - 25725 1501 490 aa, chain - ## HITS:1 COG:no KEGG:Closa_2368 NR:ns ## 
KEGG: Closa_2368 # Name: not_defined # Def: lipoprotein # Organism: C.saccharolyticum # Pathway: ABC transporters [PATH:csh02010] # 51 490 29 475 475 234 31.0 6e-60 MRRLRMASITALCIGAAVLAGGCAGENKKAEIPQAEESSALTEENDSDTAMTIEFWHYYN DAQKQHLDQLVKEYNGTLGLERGVAVEAYSQGSIADLTNKIDLVLNGTTNDVEMANIVLA YRDMVVNTVKLHPDRLVELGSVVPEADLAQYNQAYLNEGYIDGKLYILPVAKSTELLLMN QTRLDEFLDANPQYREENMESWEGLERMAEGYFEWTDSMTPGTDGDGRPFIGVDNLANYF VAMNHAMGSDIYHYDESGTMVPDLDKGIIERLFLNYYEPFTKGYYGAKGRYRSDDVKQSY LAGCIGSSSSVLYFPEEVADSQGNMVPVTTGVYQYPVMEGTTPTAIQQGAGVAVFNRSDE ENRAALDFIHWLTIDKGFELATSMSYMPVDNNGMTEEQEQQIDDARVLKGIETGLKQSNS YQMVYGFDFENSYDVRTGVDACFSEALSQGRMEFEEYLARGMSMDEAAASMDYGSKAEAF YEQVKTIFEE >gi|157101648|gb|DS480676.1| GENE 26 25916 - 26440 705 174 aa, chain - ## HITS:1 COG:SA1461 KEGG:ns NR:ns ## COG: SA1461 COG0503 # Protein_GI_number: 15927215 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Staphylococcus aureus N315 # 4 172 3 171 172 186 54.0 1e-47 MKKLEEYVRSIPDFPEPGIIFRDVTSILQDADGLHMAVDSLIDMVKDLDYDLVIGPESRG FIFGVPVAYAQHKGFVPVRKKGKLPCETISMEYDLEYGQATIEMHKDAIRPGQKVIIVDD LIATGGTTEAIVKLIEQLGGQVVKICFVMELAGLKGREHLKGYDVDSAIIYEGK >gi|157101648|gb|DS480676.1| GENE 27 26437 - 27480 1106 347 aa, chain - ## HITS:1 COG:CAC0693 KEGG:ns NR:ns ## COG: CAC0693 COG1609 # Protein_GI_number: 15893981 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 12 340 1 333 334 246 42.0 7e-65 MLIGDSKEGDIMAVTIKDIAKRAGVSYSTVSRALNGIGAENTESRKNILRLAQEMGYVPN QAAINLKKSRSYVIGLYFSTISKMTSPFVLHDVLTGVYSVAGSKYNVVVKGIDMHEPGTL NPSYFDGIIVLSQRSEDMEFMNEVLDKKIPMSVICRAVDVDAPNVTTDEALAMERAMDYL LENGHRNIGIIEGNPGLDSSRLRRRGWRTSMTSHGLDPDALPVISGNYRYASGYTAAKQL LTFRPTALLCFNDEMAFGARTAIVEAGLKVPDDVSLVGFDNWDMSGYSDMHLTTVERNMG EIAREGARVLLRRLDEGIVDNRRIYLNNKLIIRDTVKNLNVEERKPS >gi|157101648|gb|DS480676.1| GENE 28 27537 - 29105 1595 522 aa, chain - ## HITS:1 COG:CAC0355 KEGG:ns NR:ns ## COG: CAC0355 
COG5434 # Protein_GI_number: 15893646 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Endopolygalacturonase # Organism: Clostridium acetobutylicum # 1 516 1 513 513 456 43.0 1e-128 MDLKVIYKNARCAVLEIEDGGTYETSRMYRLYLNGKFIRETGNAVTSIYDLKPDTSYQVE ARSGEGLVLKSSFTTDSEFVTLDVRDFGAKGDGIQDDTLFIQAAIMACPEKSRVLIPAGT YRIVSLFLKDDVNIELAEGAVLSAYTDRTRFPVFQGMIQSYDEQGEYNLGTWEGNPLPMF TGIINGVNVKGAVIYGQGTIDGNAGDSEGNWWHEPKVIHTACRPRMIFLERCRQVTVQGI TVRNSPSWNIHPYFSDHLRFFDLKVLNPKDSPNTDGLDPESCQDVEIAGVYFSLGDDCIA VKSGKIYMGSTYKRPSKDISIRRCCMRDGHGSVTIGSEMAGGVKNLTVKDCMFLHTDRGL RIKTRRGRGKDAVVDGIVFEHIRMDHVMTPFVINCFYFCDPDGHSEYVRTKEALLVDERT PLIKSLCFKDIEAENCHVAAAYMYGLPEQRIERVEMDHVRVTYAASAREGQPAMMDGCSS HICRMGIYANQIEELILTDVKVEGQDGPAIITENIGRVCGSI >gi|157101648|gb|DS480676.1| GENE 29 29220 - 30071 942 283 aa, chain - ## HITS:1 COG:AGl3353 KEGG:ns NR:ns ## COG: AGl3353 COG0395 # Protein_GI_number: 15891797 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 283 22 294 294 295 54.0 1e-79 MTRETKRKVSTAIRYTLLILVGFIMVYPLIWMVGATFKTNSEIFTSAWFLPKNPVFDGYS NAFKSYGGKISLITSLINTYKIVIPKVIFTVVSATIAAYGFGRFDFKGKKIFFALLMSTL FLPQVVLNAPQYIMFNKWKWVNTYLPLIVPSLFAADTYFVFMLIQFLRGVPKELEEAAKI DGCNTLKTLWYVIVPMLKPSLVSCALFQFMWTNNDFQGPLIYIADMQKYTNSVYLRMSMD GDVGFQWNRILAMSLISIVPSLIVFFCAQDSFIDGIAAGGVKG >gi|157101648|gb|DS480676.1| GENE 30 30078 - 30971 993 297 aa, chain - ## HITS:1 COG:AGl3351 KEGG:ns NR:ns ## COG: AGl3351 COG1175 # Protein_GI_number: 15891796 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 12 295 7 291 293 288 56.0 6e-78 MKKKKGFWALNIGLLYILPWLVGFLLFKVYPFASSLYYSFTNYDLFNGITKTGLLNYEKI FTDEDIIRAFVVTFKYAIFDVPLKLAFALFIAYILNFKLKGVNFFRTAYYIPSILGGSIA IAILWRAVFNTDGLINTVITFFGGPKINWMAGANSALFVIVLLRVWQFGSAMVIFLAALK 
GVSEDLYEAASIDGAGKWTQFFKITVPLITPVIFYNLITQLCQAFQEFNAPFLVTKGGPN GATTLISMLIYNNAFLRHKMGMASAQAWILFLIVMTLTAVAFISQKKWVYYSDEEGR >gi|157101648|gb|DS480676.1| GENE 31 30991 - 31287 339 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936549|ref|ZP_02083916.1| ## NR: gi|160936549|ref|ZP_02083916.1| hypothetical protein CLOBOL_01439 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01439 [Clostridium bolteae ATCC BAA-613] # 1 98 6 103 103 187 100.0 2e-46 MDVRKAVKHRENYDSIVTYFKTLKTPGMDQMVLLIDTIEQMSPEIYEHYRALQDIFRMRL KEMLAGGNPGPQEQLAYMIQKGCSTGTLLREKYERYLD >gi|157101648|gb|DS480676.1| GENE 32 31340 - 32686 1683 448 aa, chain - ## HITS:1 COG:YPO1719 KEGG:ns NR:ns ## COG: YPO1719 COG1653 # Protein_GI_number: 16121979 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 40 380 20 360 430 213 33.0 6e-55 MKKNLKKAVSLALASVMAFSLTACSGGSGTTTAAQADGKEAGSGTGDVTIKFCWWGGDSR HEATEKAVDAFMAKYPDIKVECEYGAWTGWEEKQSLAIQGGNAADVMQINWNWITSYSGN GTNFANLEDYSDVLDLTQFSQDSLDQCKADGKLMAVPVALTGRLFYWNKTTFDEVGCEIP TDEASLLEAGAKFKAYNEDYYPLAMMEYDRAIFLVYYLESVYGKPWVENGELQYTAQEIQ AGMDFMTKLEEAHVIPTIATLQGDMADSLDKNAKVIDGKYAGVFEWDSAVSKVQKAIAES TTKPGQELVIGDFLKFGDYDGGFTKISMALAVSANSAHPKEAAMLVNYLLNDDEAIEICG TERGIPCSAEGVKILEEKGIGDKLVMEANAKVLAYSKFPLDPMFEHNDLKANPDGVYYKV FGKLSAGDITSAEAAEDLMKGIDDCYNF >gi|157101648|gb|DS480676.1| GENE 33 33132 - 34262 1111 376 aa, chain + ## HITS:1 COG:CAC0359 KEGG:ns NR:ns ## COG: CAC0359 COG4225 # Protein_GI_number: 15893650 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Clostridium acetobutylicum # 35 369 23 357 361 308 45.0 1e-83 MKILDRYIDQLIEQSTPECPVWNIEKIREGSKSTWNYIDGCMIKAIIELFLITRNHRYLD FADAFTGRFVREDGSIESYDPKEYNLDHVNAGKTLFDLYKLTGKEKYRRAMDTVYSQLKE QPRTSTGNFWHKKIYPHQIWLDGLYMAQPFYMQYEAEYNGCKNCEDSYQQFLQVYGRMRD 
PLNGLYYHAYDDSRQMFWCDKVTGLSENFWLRAMGWYAMALIDTMEVMPESMACQKARLN AIYKELIDAMLPYQDQATGMWYQVVNRGGIAPNYLEESGSAIFAYAIMKSVRLHYLPEEY FKYGQKAFDGICSTYLSEKDGSLQLGGICLVAGLGNTDMREGTFEYYMREPVVENEAKGI APLILAYIETMFRCQS >gi|157101648|gb|DS480676.1| GENE 34 34457 - 35281 732 274 aa, chain + ## HITS:1 COG:CAC2917 KEGG:ns NR:ns ## COG: CAC2917 COG0657 # Protein_GI_number: 15896170 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Clostridium acetobutylicum # 35 269 37 267 272 246 47.0 3e-65 MVTDQECGHARESGNPADLTAYLLDRLPHKPELLRPAVVLCPGGGYGFVSPREDQPVAME FLAAGCQVFSLHYSVAPETFPVALMELAKAVALVKSHACEWNIDTERIMVCGFSAGGHLA ACLGMMWNRDFLYGPLNMAPEDIQPKGMILCYPVITSGEFGHKRSFEQLLGGKAGDPRLR ELVALELQAGPHTPRTFLWHTWTDQSVPVENSLLLAQALKKAGVSLEMHIYPSGRHGLSL ATEEVSDSTGDCLVPHCQGWMELVKEWIKEKDRE >gi|157101648|gb|DS480676.1| GENE 35 35283 - 36032 999 249 aa, chain + ## HITS:1 COG:AGc4939 KEGG:ns NR:ns ## COG: AGc4939 COG1349 # Protein_GI_number: 15889977 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 218 8 221 266 139 36.0 6e-33 MLKEERQALILNKLHSSGKVVVSRLAVELNVSEDTIRRDLLDLDQKGQVKRVFGGAIPME RPVINFFDRETTDVELKQRLALKALDFLEKDQLVAIDGSSTNLQFAKNIPNGLKLTVLTN SYSIAHACSMKDQVDVIVLGGRLLKESMTNVGETAASQAALYHPDLCFMGVYAIHPEYGM TIPYPDEVSIKRQLIQSSRRVIALVNPIKLNTVSRYHVCGIEAFTTLITDNGVSGEMAVD YRNKGLDFL >gi|157101648|gb|DS480676.1| GENE 36 36259 - 36912 961 217 aa, chain + ## HITS:1 COG:lin2886 KEGG:ns NR:ns ## COG: lin2886 COG0176 # Protein_GI_number: 16801946 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Listeria innocua # 1 213 1 211 214 301 76.0 9e-82 MKFFVDTANVEDIKKANDMGVICGVTTNPSLIAKEGRNFAEVIKEITDIVDGPISGEVKA TTVDAEGMIKEGREIAAIHPNMVVKIPMTVEGLKAVKVLHAEGIKTNVTLVFTSAQALLA ARAGASYVSPFLGRLDDISMPGIDLIYDITEIFQMHNIETEIIAASVRNPIHVIDCAKAG ADIATVPYKVLEQMTKHPLTDQGIAKFQADYKAVFGE >gi|157101648|gb|DS480676.1| GENE 37 37165 - 37869 
880 234 aa, chain - ## HITS:1 COG:lin2431 KEGG:ns NR:ns ## COG: lin2431 COG1349 # Protein_GI_number: 16801493 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Listeria innocua # 1 229 1 229 250 130 33.0 2e-30 MVPELRKEEILKSIKKRDISYIKDLAAELNISLSTVRRDVAALEQEGSVIVMRGGAVKYK AEDFDEPVVKKKLIHSEEKEIIARKAAALVEDGDCVYVDSGTTTVGMLKYLQGKRITIVS SSTELLDNLPIKNAKCILLGGEVRDDLESVLGALTEKMISDMYFDKAFIGANGYIPDGGI YTYDDREARKKVIVKDHSKVVYVLMDTSKKNKYAFSKVFDLGEAILITEEEDNR >gi|157101648|gb|DS480676.1| GENE 38 37998 - 38912 864 304 aa, chain - ## HITS:1 COG:PA1950 KEGG:ns NR:ns ## COG: PA1950 COG0524 # Protein_GI_number: 15597146 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Pseudomonas aeruginosa # 9 303 5 300 308 179 35.0 6e-45 MKDCEREYICVIGSLNYDIIMKQKRMPLQGETYTADSITYSGGGKGANQAVQCAKLGIDT IMVGKVGRDTFGASLVEKLKDYGVDCSCIGRSSSPTGVGVVHALEDGTVYASIITGANFD ITSREIDGLDELIRNSRIVILQLEIPTEVVEHIIRKAKQYHVYTILNAAPAKEMDLEVLK MADCLIVNETEASFYAGVEVTDGDMVRTHADRLRRLTEGTVIVTLGSKGSMLLGQEGAVA IDPVKVEHVTESTGAGDSYIGAFAYGKYKGMTDLQACRFAARAASVTVTKIGAQEAMPCL NEIN >gi|157101648|gb|DS480676.1| GENE 39 38934 - 39908 1123 324 aa, chain - ## HITS:1 COG:VCA0129 KEGG:ns NR:ns ## COG: VCA0129 COG1172 # Protein_GI_number: 15600900 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Vibrio cholerae # 6 306 26 329 332 191 42.0 2e-48 MKNVIKKFSRELEVVAALAVIMVVFGVIQPIYFSPSNLLDILDQSVINGLLAIGMTLVII TAGIDLTVGSTLAIVIVSVGNFLVMGLNPFAAVLAGAAIGFGLGLVNGILVAKMKLQPFV ATLGMMSVYRGVAYLITGGWPVLNIPQNFRDIMDGDVIGTLSSSMILFLVVTVAAYILLK HTKFGTYLYSIGSNEEATRLSGVNVDFNKIMSYAICGVCVAFAGMVMLAKLGTGEPAAGA SYETNAIAAAAIGGTSLAGGKGSIIGTFLGAILLQALKVGLVVCGVDTFWQYIATGLIIV FAVYVDVIQGKLAAMRLNKKAKAE >gi|157101648|gb|DS480676.1| GENE 40 39911 - 41398 1698 495 aa, chain - ## HITS:1 COG:BH2322 KEGG:ns NR:ns ## COG: 
BH2322 COG1129 # Protein_GI_number: 15614885 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Bacillus halodurans # 4 494 6 496 522 416 43.0 1e-116 MATMIHFSNITKQFPGNKALDSVDFTVDKGEIHALLGENGAGKSTLLNILHGIYPDYDGN VEIDGRKVGFKNANDAIEFGIAKVHQEVNLVTELTVGQNIALGYEVTRGFFVDYDAMYKK TDVILERLGCKFRSRDRVTSLSTGEMQMILIAKALFHNAKIISFDEPTSALTDREVDRLL KMIMELKDQGITILYITHRLDEVFAIADRASILRDGTYITTLNMKETSKEELIRHMVGRD VSAYAVRNNPLCAAGEVVLEARDLCKNGVFEHISFELHRGEILSFAGLVGSKRTDVMRAI FGADPYTSGEVYIKGERAEIKSPKDALQYSLGLIPENRKTQGFVKNFTNASNMALASMNR FCSRGFVNARKIYDNCMYYTGEMNLHPADPDYLTESLSGGNAQKVIIAKWMTTDADIIIF DEPTKGIDVGAKAEIYRLMEELVAGGKSIIMVSSELPEVIGMSDRVVVMREGKIMGVLNH DELNEEVIMSFAMGG >gi|157101648|gb|DS480676.1| GENE 41 41474 - 42565 1306 363 aa, chain - ## HITS:1 COG:BS_rbsB KEGG:ns NR:ns ## COG: BS_rbsB COG1879 # Protein_GI_number: 16080649 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 67 340 37 301 305 142 33.0 2e-33 MKKRGLVLSVTAIAMAAGLMAGCTNKQAAATAQTEAAADTFNPDKEAGVSIEELREKLGD VPKQDGEVKLGAVAKSFDNEYWRTLREGYGEYQTKAAAAGCNITIDVTSAQGENDEQGQL SIVKDMINKKYTGLLLSPISDGNLTPGVEDAKKAGIPVINVNDGLIANADSFVGPKADQN GELAAEWIADKLGGKGKVAIVIGMPKAFAARQRTEGFENWMAANAKDIEIVAKQNADWDR QKSKDLADTWIKQYPDLNAIFCNNDTMALGVVEAVKSSGKDILVVGVDGIGEAYDSIRRG ELDATIDSFPFYKAQIAGEVMLRRLGGQEIPRVLWTPQALIDSENVDKPAEEIIGWTDPV YAK >gi|157101648|gb|DS480676.1| GENE 42 42601 - 43521 963 306 aa, chain - ## HITS:1 COG:no KEGG:lwe0277 NR:ns ## KEGG: lwe0277 # Name: not_defined # Def: AP endonuclease (EC:5.3.1.5) # Organism: L.welshimeri # Pathway: Pentose and glucuronate interconversions [PATH:lwe00040]; Fructose and mannose metabolism [PATH:lwe00051]; Metabolic pathways [PATH:lwe01100] # 1 306 1 307 311 328 50.0 2e-88 MKYATRINSFLRFDSNLLNAFRSIGSIEGVDYVDLNYPEHFADYDIEVIKAKMEECGLRC NAINLRFRDKYIGGEFGNHEPAISQDAITLCREAADACRKLGGNQMIIWLGFDGFDYSFQ 
IDYVSYWNRIVKAFRDVCDYSKVPVSIEYKPYEERVHAFIDSFGTAVSILHDVDRENLGV TLDFCHMLMKKENPAFAAAWLLERGKLYNIHLNDGEGSTDDGLMVGTVNFWKTVEVMYYL KKYDFQGVIYFDTFPKREKAVEECEANIKMCRRIEQLIDAYGLCRMEKVVEQNDAVAVSS MMVALL >gi|157101648|gb|DS480676.1| GENE 43 44000 - 44662 861 220 aa, chain - ## HITS:1 COG:CAC1709 KEGG:ns NR:ns ## COG: CAC1709 COG0704 # Protein_GI_number: 15894986 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Clostridium acetobutylicum # 4 212 3 212 219 142 41.0 6e-34 MSVRVTYEHELEVLKQDLKEMAHMVESAIEQTFMAFEDQDYTMAEDVIKGDRTVNDMERA IESRCLSLILRQQPVASDLRQVSTALKVVTDLERIGDHASDIAELILRIKSEHAYHIVKH LPVMAASANAMVHDAIEAFINQDLDSAMEIIERDDEVDTLFNQVKDDVIHLLKSSPDQAD QGIDLLMVAKYLERIGDHAVNVCEWTQFSKTGALKNVRIM >gi|157101648|gb|DS480676.1| GENE 44 44684 - 45445 324 253 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 248 2 239 245 129 35 1e-28 MESKTKISTRNLNLYYGENHALKNINLDLYEHKITAFIGPSGCGKSTYLKTLNRMNDLVP GIKIDGTVLLDDEDIYDSQVDTTLLRKKIGMVFQQPNPFPMSIYDNIAYGPRIHGIKNKA QLDEIVENSLKGAALWDEVSDRLKKSALGLSGGQQQRLCIARALAVEPEVLLMDEPTSAL DPISTLKIEDLMDDLKSKYTVAIVTHNMQQATRIADHTAFFLVGEVVEYATTDELFTMPK DKRTEDYITGRFG >gi|157101648|gb|DS480676.1| GENE 45 45486 - 46391 1102 301 aa, chain - ## HITS:1 COG:VCA0072 KEGG:ns NR:ns ## COG: VCA0072 COG0581 # Protein_GI_number: 15600843 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Vibrio cholerae # 28 296 16 285 289 224 48.0 2e-58 MTNDIAVRVHKDSVVQDSIYGRRVRGLDVFLTVIIYTCAGFSILLLAGIMGYVFVRGIPH VTWAFLSTASSATKGTFGILGNIINTLYIVIITLLIATPLGIGSAVYLNEYAKPGKVVRL IEFTTETLSGIPSIIFGLFGMVFFGNALGLGYSILTGALTLTIMILPLITRTTQEALKTV PDSYRHGALGIGATKWYMIRTILLPSAMPGILTGIILAIGRIVGESAALLFTAGSGYYLP KNLFSKIFESGGTLTIQLYLFMQKAKYNEAFGVAVVLLVIVLGINGLAKYMSHRFNVEAG A >gi|157101648|gb|DS480676.1| GENE 46 46381 - 47319 917 312 aa, chain - ## HITS:1 COG:VCA0071 KEGG:ns NR:ns ## COG: VCA0071 COG0573 # 
Protein_GI_number: 15600842 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Vibrio cholerae # 19 299 50 318 327 209 44.0 7e-54 MRAETFSIRSCHGSKSTVEKAAQGIFTACGFFAVLAVASITLYMLISGTPALFKVGLLEI LFGTVWQPAAASPSFGILFVILTSIVGTFLAILIGVPIGVMTAIFLAEVAPKRLSDMVRP AVELLAGIPSVIYGLLGILILNPLMYKMELILFKGSETHQFTGGANLISAVLVLALMILP TVINISESALRSVPAHLKSASLALGATKIQTIFQVIVPAAKSGIITAVVLGTGRAIGEAM AISLVSGSSVNLPLPFSSVRFLTTAIVSEMGYASGLHRQVLFTIGLVLFAFIMIINISLT RILKRGGNRNDK >gi|157101648|gb|DS480676.1| GENE 47 47347 - 48303 1178 318 aa, chain - ## HITS:1 COG:MTH1727 KEGG:ns NR:ns ## COG: MTH1727 COG0226 # Protein_GI_number: 15679719 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Methanothermobacter thermautotrophicus # 77 310 32 263 271 174 45.0 2e-43 MRMNGKKLTVLLGTAVLAMGILAGCGSSAPETTAAGTAAPATEAGTDAGSSAASGDTKDT EAAKAENAPSADLSGTISMVGSTSMEKFANALSESFMEKHPGVTVTAEFVGSGAGVEAVS NGTADIGNSSRNLKDEEKADGVAENIVAIDGIAVVVDAANTVEDLTKQQLSDIYEGKITN WKDAGGNDAPIVVVGRESGSGTRSAFEELLELEDVCKYSNELDSTGAVMAKVASTPGAIG YVSLDVLDDTVKAVKLEGAEPTEENIKAGSYFLSRPFVMATKGEISEQNDLVKALFDYIY SEEGAEIVKSVGLIAVDK >gi|157101648|gb|DS480676.1| GENE 48 48546 - 49652 1042 368 aa, chain - ## HITS:1 COG:no KEGG:Closa_3988 NR:ns ## KEGG: Closa_3988 # Name: not_defined # Def: aminoglycoside phosphotransferase # Organism: C.saccharolyticum # Pathway: not_defined # 10 362 9 365 370 506 68.0 1e-142 MSGEMDKHILKEAAAAFATDGEAVSCQRYGSGHINDTFRLICEKHPYILQRMNTDIFQDP VSLMRNIEGVTTFLRQEVIKNNGDPDRETLNLIRTREGAPYYVDSRGNYWRMYLFIEGAT CYNLVEKPEDFYQSGKAFGHFQRLLAHYPARELAETIPGFHDTPGRFRAFRKAVEEDICG RVSEVQNEIQFVMDREQDMGLAMDMLAKGELPLRVTHNDTKLNNIMIDDKTGQAICIIDL DTVMPGLSIFDFGDSIRFGANTAEEDETDLTKVSLSVPLFEIYTRGFLEGCAGSLTEAEV KMLPQGARLMTLECGIRFLTDYLSGDTYFKIAREKHNLDRCRTQFGLVEDMEKKWGEMER IVEAVCKG >gi|157101648|gb|DS480676.1| GENE 49 49649 - 50584 943 311 aa, chain - ## HITS:1 COG:no KEGG:Closa_3989 NR:ns ## 
KEGG: Closa_3989 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 308 1 308 309 492 75.0 1e-137 MSQKPVLVIMAAGMGSRYGGLKQIDPVDPYGNIIIDFSIFDAREAGFEKVIFIIKKAIEE DFKIHIGNRISKYMDVAYVYQELDKIPDGYQVPEGRVKPWGTGHAVLCCSALIDGPFAVI NADDYYGKEAFHMIYDQLSSVEDTDRYQYTMVGYRLYNTLTENGYVARGVCRTDAKSRLV DIHERTRIEKHGSQAEYTEDDGATWTELPESTIVSMNMWGFTKSILGELENRFGAFLDKN LPVNPMKCEYFLPFVVDELLKADLAEVTVLKSVDRWYGVTYKEDKETVVKAIKDLKDKRL YPEKLWEELVK >gi|157101648|gb|DS480676.1| GENE 50 50641 - 51531 1006 296 aa, chain - ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 28 296 38 300 300 137 29.0 2e-32 MSGLFQKKEKLHREPVRWKKELKLAPGYLILTLWILFTLALLGWVLLASFSTTKEIFSNK LLDSGFHIENYVKAWVNSNVSTIFFNSLFYAAVSCTLLILICAPAAYALSRFDFVCNKLI QSSLVASMGVPVVMIVLPLFAVVAGLNILNNVSANRATLIFLYIGINVPYTTIFLTTFFA NLSRAFEEAAAIDGCPPVKTFWLIMLPMAQPGIVTVTIFNFINIWNEYFISLIFGNSDKV RPVAVGLYSMINSMKYTGDWAGMFSAVVIVFLPTFILYIFLSEKIIAGVTGGGVKG >gi|157101648|gb|DS480676.1| GENE 51 51548 - 52426 1007 292 aa, chain - ## HITS:1 COG:XF2447 KEGG:ns NR:ns ## COG: XF2447 COG1175 # Protein_GI_number: 15839038 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Xylella fastidiosa 9a5c # 1 229 10 231 293 106 31.0 6e-23 MFMTPAVVMFVLVFLYPIIRTILMSFFKIEGITDPMSKWTFSGIDNYVKLANTTLFRISM WNLARIWFIGGIIVMSLALLFAVIITSGIRFKSFFRAMIYLPNIVSAVALATMWLQYVYS PKYGLIKNFFQALGLDSLAKIQWLDNDHKFMALLLAYCFGMVGYHMLIFASGIERISDDY FEAATLDGANKFNQFRYITLPLLKGVFRTNVTMWSVTSVGFFVWSQLFSTVTADTQTITP MVYLYMQIFGAGNSVTERNSGVGAAVGVMLSVCVVIVFWLCNHLLQDKDLEF >gi|157101648|gb|DS480676.1| GENE 52 52580 - 53920 1555 446 aa, chain - ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type 
sugar transport system, periplasmic component # Organism: Bacillus halodurans # 6 385 9 372 438 68 23.0 3e-11 MKNRKWMALALAAAMVGSLTACGGGSGTADTTAAAAAADTTVAPAADTTAAADAQAPAGD GTGLVYWSMWESTEPQGQVIKEAVDKFSADTGVAVDVQFKGRTGIREGLQPALDAGTNID LFDEDIDRVNTTWGAYLMDLEELAKASDYDATANASLMEACREVGGGTLKSIPYQPNVFA FFYNKAIFEEAGVAAVPTTWAELDAACQKIKDAGYTPITCDDAYILCLFGYHMSRLNGYD KTSDIVKNNKWDDPSVMETAKAYADFAQKGYFSENIASNVFPAGQNQELALGTAAMYLNG SWLPNEVKNMAGDDFQWGCFSYPAVEGGTDGTEAANYGGQVLAINKNSQHAEDAFKLITY ITKGEFDKKLSEESLGIPSDTTNSEWPVQLTDVKPVMESLKTRYPWAAGAEDNVDMTPII KENMTKLCSGAITADEFVSNLQAAGK >gi|157101648|gb|DS480676.1| GENE 53 54154 - 56469 2121 771 aa, chain - ## HITS:1 COG:no KEGG:BCAH187_A2105 NR:ns ## KEGG: BCAH187_A2105 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_AH187 # Pathway: not_defined # 1 771 1 719 723 1034 63.0 0 MSKTKGRVTLPSESGFLKETKELMERWGADAIRDSDGTKLDKDVKSLDAEIYTTYFVARG HNEFAAKHMEECQQLYLMSRRFTAVGNHLEMDFMEGYFDQQVVPDCYHDPKKWWEVMDRT TGQAVPVSQWELTGASMPEGFCSGSEFKGVFKGAGDTADAGNAADAGNTAETKKAEAVSA PMRVVLEHASPFHEYTVSFLAYAVWDPTQMYNHITNNWGDKPHEIPFDVRQEASGLFARE YLVQWLKDNPDTDVVRFTTFFYHFTLVFGSDAKEKFVDWFGYGATVSVKALEEFQAEYGY ALRPEDIVDNGYYNSSFRVPTRQYRDYMDFIQRFVAEKARELVELVHQAGRKAMMFLGDN WIGTEPYGAYFPEIGLDAVVGSVGGGATLRLISDIPGVSYTEGRFLPYFFPDTFYEGNDP VPEAVENWICARRALMRKPVDRIGYGGYLSLAYKFPGFVDYIEKVAEEFREIYGNIGGSR PFCGLKAAVLNCWGKLRSWQPYMVAHALWYKQTYTYFGILEALSGAGVDVVFMSFDDIRS SGIPEDVDVIINAGDAGTAWSGGDEWLDEQIVTAVRRFVWEGGGFVGVGEPSAVQRGGRY FQLADVLGVDKEQGFTLSTDKYHVTQADSHFITQDVPRDESGRLVLDFGEGMKNVYALGT DTEIGEYSDHEVHLSAHPYGRGRGVYLAGLPYSHENTRLLIRSMYYAACKEGEMKKWFSD NLFCEVHGYPEAGKYAVVNNTSRGQSTVVYDGDGHGTSMELLPCEIKWFDL >gi|157101648|gb|DS480676.1| GENE 54 56706 - 57605 891 299 aa, chain + ## HITS:1 COG:lin0157 KEGG:ns NR:ns ## COG: lin0157 COG2207 # Protein_GI_number: 16799234 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 36 287 32 271 277 74 20.0 2e-13 MAYKSVVLEDSVTINRIISVHYFQYMSDFSFPGESHDFWELVCVDRGEIDALAGDRRLTL 
KKGNILFHKPNEFHNVLTNGKVSPSLVVIGFECHSPAIKSFEDQLMSVQDTEKELMAQII VEARNTFSGRLDDPYQEELIFNSEPLTFGSAQLISHYLEQLMIHLYRRYFSYSLPVRSSR FLAEASSGNDTYNRIVRYMEEHLGERMTIDRICRDNLVGRSQLQKLFRDTKGCGVIEFFS MMKIDTAKQMIRDNQLNFTQIADRLGYNSIHYFSRQFKQITTMTPSEYATSIRLLSEKP >gi|157101648|gb|DS480676.1| GENE 55 57755 - 58171 207 138 aa, chain - ## HITS:1 COG:no KEGG:Closa_0226 NR:ns ## KEGG: Closa_0226 # Name: not_defined # Def: Zn-finger containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 138 1 134 134 140 53.0 1e-32 MNKIREKLQRFMIGRYGMDQLGQFIMYTVLVLIFLNVVVRVRILSSFLYILELAGFFLLY FRMFSKNVGKRYQENQVYLRLRFYVTEYFRKIKFRFTEGRKYRIFKCPDCGQKVRIPRGH GKVSVHCPKCGTDFIKKS >gi|157101648|gb|DS480676.1| GENE 56 58254 - 58718 -263 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936578|ref|ZP_02083945.1| ## NR: gi|160936578|ref|ZP_02083945.1| hypothetical protein CLOBOL_01468 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01468 [Clostridium bolteae ATCC BAA-613] # 1 154 1 154 154 300 100.0 3e-80 MPVRCSGCIWIDRCTPLYKNMALWNMIPCMTERNDKDTQILPGFLLPGKRVCKSSAAVQS SNNHISVSHLSLYYIVLISCITVLRPILIIGDGKYNVKVEKIILSAYFVRSRVRAEVWAE VWALRACGLHFDACIYKIIIKTLSVHCFSMRIRL >gi|157101648|gb|DS480676.1| GENE 57 58833 - 60266 1482 477 aa, chain + ## HITS:1 COG:BH1900 KEGG:ns NR:ns ## COG: BH1900 COG0488 # Protein_GI_number: 15614463 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus halodurans # 1 473 2 478 538 336 39.0 7e-92 MLLNALNVKKEYGIQTVLDIEKLVIRDGDRIGLIGRNGAGKSTLLGVLSGRIACDEGVVK RYCPIAEILQTGETEDEPEARLLSQLKLRDSAVKSGGEKTRRAIGAAFSSQAPLLFADEP TTNLDMEGVELLEKMMKGYKGAILMISHDRTLLDRVCNQIWELDKGNIRVFDGNYSDWAA QKERERGFQEFEYQQYQKEKKRLERTTDALQRKSQTMTKPPKRMGSSEWMLYKGVAAVQQ GHVQSNKTAVMSRLEHLEKKERPDELPQVSMKLPDAGKIRAKNAAAIRHLTVSYGERIVL DNVSLEIEAGRRTFITGNNGAGKSTLIKALIDRAPETFITSEARVAYFSQDLDTLDPKKT VLENVSEDAAYPQHICRAVLSNLYMTKDDMFKPVSVLSGGEKVKTALAKVLVSGCNFMIL DEPTNHMDVYTMEGLEHLLESYDGTLLAISHDRTFISHTGDVVCRLENGDIQKSTEQ 
>gi|157101648|gb|DS480676.1| GENE 58 60376 - 61221 814 281 aa, chain + ## HITS:1 COG:MTH924 KEGG:ns NR:ns ## COG: MTH924 COG0725 # Protein_GI_number: 15678944 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, periplasmic component # Organism: Methanothermobacter thermautotrophicus # 53 279 26 258 260 134 33.0 2e-31 MKSNVLGILLAGAIGAALITGCGAASSGTARSSAPQQTQESAPEDGDAGKSAADTPDLTG HTLMIYCGAGMTKPFQEIADSFKAATGCEMNVTYANAGQIQSQITTSEEGDMFIAGSAEE LKPVESYVSESKALVKHIPVLAVQSGNPKNITSLAGLAEEGVSLIIGDIDSTPIGKIAKK ALTDAGIFDKVQIEASTATAPQMATAIAAGEADAAIIWKENCDAQGVEIVDNSGLEDYIK TIPAASLSCSADHDARQAFLDYLNSGDVQEIWLKYGYEIAE >gi|157101648|gb|DS480676.1| GENE 59 61259 - 62038 644 259 aa, chain + ## HITS:1 COG:MTH921 KEGG:ns NR:ns ## COG: MTH921 COG0555 # Protein_GI_number: 15678941 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type sulfate transport system, permease component # Organism: Methanothermobacter thermautotrophicus # 31 241 43 254 269 129 34.0 5e-30 MTVAVALLVIFFIGSAVCAIAAGGIPAFAGTIRQKEVLFALRLSAATASLSTLIVMALAL PTAYALTRTSMPFKELSGIIIELTLSLPYLLLGLSLLIMFSSPAGKWLKEQGFKVVFSPA GIVMAHILVNLPYAIRLIRTAFEASDQRMEFIALTLGASPWRCFLTILLPLCRRSLVSTF ILTWSRALGEFGATLMLVGITRFKTETLPGSIYLSISTGDNQAAMATAMLMLIISGLTLF LSRLLASPADRNKRQVFIR >gi|157101648|gb|DS480676.1| GENE 60 62035 - 62793 317 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 18 235 17 241 245 126 30 7e-28 MNQLVINNFSVKRGSFTLSPISLTIQMGEIFAILGRTGSGKTVLLEAISGMFPGDTGSIL LDGTDIRDIPPAQRKLGLVYQDYGLFPHMKVKENILYGLKMHGYSKKEQDSRAQELMEML SISHIGDQYPGTLSGGESQRTALARALALAPQVLLLDEPFSALDPATRQSLYQELRRIHR HFGCTIIFVTHDFLEARTLAGRAAILLDGKLQTVTDTSRLMDGHFSAEVERFLGRNQIIT APERKQEKRIVQ >gi|157101648|gb|DS480676.1| GENE 61 62790 - 63563 773 257 aa, chain + ## HITS:1 COG:no KEGG:Taci_1616 NR:ns ## KEGG: Taci_1616 # Name: not_defined # Def: hypothetical protein # Organism: T.acidaminovorans # Pathway: 
not_defined # 10 253 6 250 251 276 53.0 4e-73 MTYMNDALSFYSELRLRFMKLLTEEGILGEQVVINTKSLTPEEAIGITKRKDFPIITGKD VMVQAECMGSLGQAFTDAPSAYRGTLEEICSLDLADDPYSRGLFIAALNAVMKHSGRADC TVHCRNEGPESCAMDVVRYISEHYGRPAIALIGYQPAMLEQLAKEYDVRAADLSLANIGQ KRFGVLIEDGRIPETSQSLCRKADLVLCTGSTVCNGSIVDFLPFKEKILFYGTTLAGAAP LMGLPRLCFADRYQEPQ >gi|157101648|gb|DS480676.1| GENE 62 63548 - 64438 763 296 aa, chain - ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 2 294 1 289 291 158 32.0 1e-38 MMDLKDLQYFLAVAQAGTITGAAAKLCMAQPPLSRQMKELEEELGTTLFIRGKRHIRLTE EGIFLRQQAEEILSLMEKTRGQLSRMGTGAGGLISIGVTESCGAGVLSDIIERFHSDFPG IRYNIWCGNGDEINDKLDKGLVDLGIVREPFRTEKYESALIKTESWTALLSREHPMAEQP GDTILLSDIAGSPLIIPSRPPLQEEIRGWFHRIEREHTILCTYNTLSCIVPLVERNVAVA ICPEAVRYFTDRQRLVCRRLTEPEHVSRLLLVRRRNQLMPAAAGCFWDFARHYCGS >gi|157101648|gb|DS480676.1| GENE 63 64545 - 65228 434 227 aa, chain + ## HITS:1 COG:MA1282 KEGG:ns NR:ns ## COG: MA1282 COG3619 # Protein_GI_number: 20090146 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 14 208 27 225 240 75 26.0 7e-14 MKEKERTEAYIHYIMSLSGGFMGAYALLNRCEIFGSAQTANMIHSILDLSSGSWTSLLLR LGSLLIFITAIIAATLIPRLISIDIRLICILAEIPVIAVLGFFPANMDPLMALYPVFFIM AFQWCSFTGVKGYASSTIFSTNNLRQFISSLSNYFADRNKKHLEKAWVYGFTLFFYHIGV ALCWYFTKIFGIFSVWLNFLPLLLALVLVEREHIFINLRSFLQNTSA >gi|157101648|gb|DS480676.1| GENE 64 65284 - 68223 2139 979 aa, chain + ## HITS:1 COG:CAC3006 KEGG:ns NR:ns ## COG: CAC3006 COG1026 # Protein_GI_number: 15896258 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases, insulinase-like # Organism: Clostridium acetobutylicum # 13 972 12 972 976 170 21.0 2e-41 MASLPRIGETISGFTVTELGTIHMLGARTVLFNHESSGAQLLYIQNDDRELGFNLIYRTP QLDERDNSHILEHLILSSCQKYPSRDIFFDMDSKSYTTFMNGLTDNTFTCYPVCSQSQEQ LVKLMDVFLCCMEEPDALKDKHFYLREAIRYELSSPRGPLTMQGTVLSEDWGHLTDILEN 
ADSAMSHTLYQDTRTANLLGRAHLHYRELSYEQARETFERCYSYSNCLITLYGNMDYRSV LHFLDREHLSRAASKGRSLLGEFEEPVKPGFRTCTAQSPAYQGSPSEHASVMDYGIDLSD CTKEELIYWDLFADILDNDTSPWHRYAREMGINHVMEVYLDTLLPHPSLKFRLRNGDSCL KEAFLGSIKKALEDISRNGLSPKLYRASMKENRLSDSLTREAPHLGFNLSEEIGRYWSIT GRTDYFQLYEEAFRRFEEDSGQSIIKKLAASALQPAASALVVTEPVPGLAEQMEEEKEQY LKEKLASMTAAEQKQLIEQTAAFHDWNSRERSNMDFLIGPGELPEPSESCPFTKRQWGTI TCYTSPAPSRDVGSYQLYFDISGIEKDDLNYLTLYQMLLTELDTKRFTVEQQKNLEQEYL HDCTFDELYPPKEAGALNHPMMSVFWYGLTGDFEAGLDFLLDIMGGGDYSDTDTIIQVLE KYLPDYDQSKADNASALAFSLSEGYMRQECRFRNMLNSQENYYFLKDVLNRLKEDPDFGA AAAARLETISHTILNRRGLVFLAAASPDVLGTLEQTAVDILGRLPGFKEGHSSPAPAVWL PDQRQKLAVCVEASCQETRLMGDFHGNPGFKGRYLPFLMAAADKYLKPAIRYQGGAYDSG IDFYIPAGYFSLWSTADPEVRSTIKLFLATGAQLMELPLTHEELDGYILNAYAQALPPSG TLSTRMRHMRRDMVNMDTRKINEMISDIRNAALCHQKEAALVIDRILGRSAIVTVGNEKC IMRDKDVFDQVLIYHTDYV >gi|157101648|gb|DS480676.1| GENE 65 68384 - 69160 817 258 aa, chain - ## HITS:1 COG:CAC0361 KEGG:ns NR:ns ## COG: CAC0361 COG1028 # Protein_GI_number: 15893652 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Clostridium acetobutylicum # 9 257 13 260 260 189 42.0 5e-48 MVYDYKEKFSVQGKKCIVTGGAQGLSRGMAEGLLENGAEVVLMDLQKEKLEQAVEEYCKK GYKAHGVAGDLSKKAELDRMFDEAMELLGGKLDVMIPAAGIQRRYEPWEFPEEMWNLVIN VDLNHVWFMCQRAIQVMRDKDTVGKIINIGSMNSFFGGTTVPAYSAAKGAVTQLSKSIAS DCADHSICCNVIAPGYMDTEMCANMSQERKDECTKRIAAGRWGRPEDLKGPILFLASSAS DYLNGAVIPVDGGFLVKS >gi|157101648|gb|DS480676.1| GENE 66 69197 - 70444 917 415 aa, chain - ## HITS:1 COG:rspA KEGG:ns NR:ns ## COG: rspA COG4948 # Protein_GI_number: 16129539 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Escherichia coli K12 # 10 396 8 387 404 308 42.0 9e-84 
MAVTIENIRVICTAPEGINLLAVRVDTNVPGLYGLGCGTFSYRHLAVKLVVEEYLTPLLK GRNAEDIEELWQLMYQNGYWRGGPIENNAISGIDMALWDIKGKLAGMPLYQLFGGRVREG VPVYRHADGRDLEELCENIERYRELGIQNIRCQCRGYGGERYGVIPEAAPQGALPGVYLD GARYVRDTVKLFAGIRDKIGYDVKLIHDVHERVAPPEAVKLARELEPFDLLFLEDPVCAE QTKWLREIRCHSGVPIAQGELFNRSVDWRDLISGHLIDYIRVHISQIGGITPARKLQMFA EQYGVRTAWHGPGDMSPIAHAANVHIDLAARNFGVQEWSGIEPPNFVIQDLKGPKGALLE VFPGLPEFRDGYVYPNERPGLGVDINEREAAKYPCENTVTRWTQTRNRDGSLQTP >gi|157101648|gb|DS480676.1| GENE 67 70457 - 71536 816 359 aa, chain - ## HITS:1 COG:L198485 KEGG:ns NR:ns ## COG: L198485 COG0371 # Protein_GI_number: 15673538 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Lactococcus lactis # 31 355 33 352 358 124 31.0 4e-28 MNQYHSIYLPQFTMGEDAFGTFPERIGRGRKAAVICGKQAWEASKDYVLQSLEQAEIIVT GAVVYGKDATWENVKRLEREPAVQEADVILAVGGGKCIDTVKVVGDRLSKPVYTIPSIAS NCAPITQISILYHEDGSFREILKLKNVPVHCFINPKLTLAAPVRYLWAGIGDAMAKHVES AWSAKAGEALDYGSALGITAGSMCFYPMLEKAEKAMEDAAKGVISRELEETILNIIISPG IVSVSAHPDYNGGIAHALFYGMTSRKCIEERHLHGEVVAYGMLVNLMADHDWEKLSRAWR LSKAIKLPVCLGDLELDPTDPLEDVLKVTMENQELKHTPYPVTKEMIREAITALETYQA >gi|157101648|gb|DS480676.1| GENE 68 71558 - 72361 417 267 aa, chain - ## HITS:1 COG:BMEII0569 KEGG:ns NR:ns ## COG: BMEII0569 COG3718 # Protein_GI_number: 17988914 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized enzyme involved in inositol metabolism # Organism: Brucella melitensis # 40 246 32 247 269 60 28.0 4e-09 MVQGVVTRERMEKWKIVSPEEYGFHKVIVPGREDCQVAVMYRLNLAAGQRYELKPGGEEM NGVCIKGAAGLELAGHQYHCGRLDSFYTAGGNSVAIEAEADCVFYIGASVDEGYGTPFFR AFNLHLPLGEIHQIHGKGVGQREVFMTLNQEIQASRLIAGLTWGGNGAWTSWPPHQHEAD LEEIYCYFDMDEPHFGLHLSYTAPGDVDNIVAHPVKSGSMVLAPRGYHPTVASPGTRNAY FWVLAAHSHESRGYDLAVLDPSREQMN >gi|157101648|gb|DS480676.1| GENE 69 72374 - 73648 674 424 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 5 423 6 425 435 264 32 3e-69 
MNAGLILFLIFILCLVIGVPVSISLSVASLAAVVFGGLPASSTAIAQRIFGGLQSSSIMA IAFFVLAGNLMTKGGISRRIVDFADCLVGNVRGGMSLALVLACAFFAALSGSAPATVIAI GTMLYGDMIKKGYPGPRTAGLLVVSGGLGPIIPPSIIMVIYCTLTGASVGNMFKQGMMIG IVIMLVLMVEVLWFAHKEKWPKQNVKRTPAEVVKVFLDAIPALLTPIIILGGIYSGMLTA TESAAVAVVWAAIAGAFIYRELNLPDTIQIIKDSAKSSAMILFIIAASTAFSWVFTISGA SKALVDTVIGMNLNATMFCLVVAVILLIFGTFMEGTAIAVLLVPVLWPIAQSMGINVVHF GMIVCISNVVGTMTPPVAVNIYSAATVSKLKMGEIAKAEIPFLVGYVGVFFLCVFSEWFC TFLT >gi|157101648|gb|DS480676.1| GENE 70 73645 - 74148 548 167 aa, chain - ## HITS:1 COG:no KEGG:Tmz1t_0544 NR:ns ## KEGG: Tmz1t_0544 # Name: not_defined # Def: tripartite ATP-independent periplasmic transporter DctQ component # Organism: Thauera # Pathway: not_defined # 15 156 10 147 161 67 30.0 2e-10 MKFMKKLLSCVTSAEYAVMVAAFVAMVASYFISVLNRNFIQASMPWTEEIAIYSMTYMAL LGTEVGLRDGTQVAVTAVVDKLHGVARKIVDLIRQVILVGFAFIMIKAGGALVLKQLQAG QTTPVMKLPMSLMYFSLVLAFGLIFVIQTITLVQQIAEFGKKEGNRT >gi|157101648|gb|DS480676.1| GENE 71 74169 - 75230 329 353 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 45 325 40 322 346 131 29 3e-29 MLKRTYLLSGIAVLTALGLMACGSSGSGQTEAKGSVSGNSGTEAAANEESLNFTMALVDN SQSNYYKGAEKIAELVNEATEGKIQLTIQAGGVLGGEADTLDMAIQGDLDIATCANSVLA NYIPEMSILDQAFLWDSADQANYAVQNELGDLISAEAEKHGLHVIGYLESGFRDVFSKKP IQTPADFKGVKIRVMQNEGQLAAFTAFGANPVAISASEQFTALQQGTIDACENAVSNCWI NKYYEAGVNSITNTKHCFVYIPICMSDNAWNKIPEDMREPFVNAVQEGCKAQWTYLNEAN AEAVENLEGAGVTFYDIDTEALKAEYQAAQEKNGASYDEKWAAAVDAAKSAVQ >gi|157101648|gb|DS480676.1| GENE 72 75379 - 76137 615 252 aa, chain - ## HITS:1 COG:HI0054 KEGG:ns NR:ns ## COG: HI0054 COG2186 # Protein_GI_number: 16272028 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Haemophilus influenzae # 8 238 14 243 266 86 27.0 5e-17 MEKGGVWVRQKTETLSERVAVKIRQYIIENEMKAGDRLPNELQMAEYCEVGRGTVRDAVK LLRFEGLVEVVQGSGTYLLRQKQKPIEISGGETPNLLGAFLPEELPNKALEFSEVRLMLE PDIAALAATNATYEDCRKLMELEAEVRWQIQMQESHYEKDIEFHLQIARCSKNEIACKLM 
EIVVKGIPLFCKVTNDELANQTVKFHHMISESIERGDASGARYSMIDHLNSTRRKIIEEI EQQKAGKSSNDF >gi|157101648|gb|DS480676.1| GENE 73 76195 - 77547 1586 450 aa, chain - ## HITS:1 COG:FN1989 KEGG:ns NR:ns ## COG: FN1989 COG0733 # Protein_GI_number: 19705285 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Fusobacterium nucleatum # 20 450 8 438 438 460 59.0 1e-129 MNKQKKTDSSPAAVHQNGGFGSSLGFVLACVGSAVGMGNIWMFPYRVGQYGGGAFLVPYI LFIVLFGLVGLSAEFAIGRRARTGTLGAYEYCWKKIEKGRLGYFLGWIPLLGSLGIAIGY AIIVGWVVRALAGSVTGAILTADAAAYFSQATGEFGSVPWHVLVVVTAAGILMFGATKGI EKINKVLMPSFFVLFAILAIRVAFLPGAAEGYKFLFAPDWSDLLKVDTWVMAMGQAFFSL SITGSGMIVYGTYLAKTEDIPKASIRTAFFDTVAAMMAALAIMPAVFAFGIEPNAGPPLM FITLPEIFRQMPMGRLFAVIFFLSVAFAGITSLINMFEAVSESWQNRFKLPRKAAVALCS GIALLVGIFLESEPKVGSWMDFITIIVVPFGAVLGAVSIYYILGYKEIREELEEGRKKPL GEFFGPLARYVYVPLAVIVFILGIVYGGIG >gi|157101648|gb|DS480676.1| GENE 74 77727 - 78353 605 208 aa, chain - ## HITS:1 COG:MA3621 KEGG:ns NR:ns ## COG: MA3621 COG0613 # Protein_GI_number: 20092421 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Methanosarcina acetivorans str.C2A # 1 195 1 195 224 80 29.0 2e-15 MLVDMHLHECTYSSDSKISLEEIVTTARGRGLDAVCITDHDSMGLREYAAEYSDRTGFPI FVGIEFFSLQGDITAWGIEDYPGCRVDAQDFINHVNEAGGFCVSCHPFRNNNRGLEENLR AVKGLHGVEVLNGSTALEANRRAFSYCRELGLKTIGASDAHTKGQIGKYATWLPEKVDCL KDFVEQLKTRQVRPAVWNGSSYDVVDML >gi|157101648|gb|DS480676.1| GENE 75 78365 - 79123 235 252 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 17 209 20 217 223 95 31 2e-18 MYLELDHLKKHFDGREVVRDLSLTLEQGQLLCILGASGCGKTTTLNMIGGFLEPDSGSVR LDGQDITRIPPEQRPVTTVFQSYGLFPHMTVLQNVVYGLKFRGVKREQARQKGMRYLEMV GLGDYASAYIQEISGGQQQRVALARALIVEPKLCLLDEPFCNLDAALRTRMRYELKRLQK DLGLTMVFVTHDQEEAIILADRIAIMDQGELVQNDAPLDLCRKPASPFVAEFMDLESLVW TEDGRLLKIIKR >gi|157101648|gb|DS480676.1| GENE 76 79113 - 80792 1270 
559 aa, chain - ## HITS:1 COG:PM0956 KEGG:ns NR:ns ## COG: PM0956 COG1178 # Protein_GI_number: 15602821 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Pasteurella multocida # 23 559 132 677 685 199 29.0 9e-51 MHGDMYGKRASKDVLFGIAGYGADKVLLAVLLASVFLFVFWPMLCILIRSLWDGKGISYD AYAAVWNQYRGNLWNSVFVGVCASLLCTAFSTAAALFLTTRRGAARMLCMALLLITMVSP PFVSSLAYIQLYGRRGWITYRLLGLSWDPYNCWGVILMQSLSFVPLNALFLGGILTKLDG DSIRAARDLGASPKAILGDIVLPLIRPGIWVALLLSFVRSLADFGTPVIIGGRFSTIASE IYLQLVGYSNLEKASAMNMVLMLPSIAAFFLYRKLMKRSDMLTEGSRGRQEPLALSLGKC GAAGILANLGSILFFFMMALQYGCIFISGFLKSTRGVYSFTTRYLEQMVRFDLDTMGRSV AYALIVSLAGTLFAMLFAYYMERRQIPGRNFFDCLVTLPYMLPGTCFGIGYILAFNSPPL KLTGTALIVISNMLFKQLPTATKICSATLTQVPRAQEKAVRDLGGGQMSVLKTVILPGLR PAFLSCFVYNFSSSMTTAGAILFLIDPGRKLAVFKLFDAVYTGDYALASLIATSIILVVL AVEGMAYLITWKAGKRRVS >gi|157101648|gb|DS480676.1| GENE 77 80839 - 81996 1387 385 aa, chain - ## HITS:1 COG:BMEII0565 KEGG:ns NR:ns ## COG: BMEII0565 COG1840 # Protein_GI_number: 17988910 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Brucella melitensis # 61 379 29 354 358 131 28.0 2e-30 MRKKIVALLMALTMVPALAACGSSNGTGNNTAGAPAESQTGETTAAQETEVRSQGETKAE ANAESKAEAPALSGTMKVVATSENYVTLFDKFTEDTGVKVELLSMSSGEVLSKLRAEGGT PSADLWFGGGIDAFMSAKDDGLLEQVTFDASGDLAEDFKDPEGYWYSKGITIVGFLLNDG LMEELNIEAPASWDDLLKAEYEGEIIMSNPAVSGTNYAVVNALLQAKGEEDGWSYFEGLN KNIAYYAKRGSDPKNKVTTDEYAIGITYIDGTIESLLDEYDVSIVYPADGIPWVPEGVAA FKNAENTDAAKYFIEWLFSSDENLQMLAEIDQKNSVKTIKPSLEGLDLGYDTGMLMKEDL SLFGEKRTEILEHFEALMGDKAADE >gi|157101648|gb|DS480676.1| GENE 78 82336 - 83724 1289 462 aa, chain + ## HITS:1 COG:BH1128 KEGG:ns NR:ns ## COG: BH1128 COG0733 # Protein_GI_number: 15613691 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus halodurans # 5 451 9 444 453 263 38.0 4e-70 MEHKRSQWASNIGFILAAAGSAVGLGNIWKFPGKVAAGGGGAFLICYALIVLFVGFPVML 
AELSIGRSTQKNVVGAFRRLNPRWSFAGGIGVLTLFVIMSYYSVVGGWVMKYISVYLTGA DFSAFGNDSSRFSAYFADFISRPAEPLLWGAAFLLLCIYVVVRGVSEGIERVSKFLMPGL FVLLTGIVVYAITLDGAAEGLKYMLAVDPAKFNGGTVVAALGQAFFSLSVGMGIMVTYGS YVPRTENLAKSAGCICILDSLVALLSGLAIVPVVYITAGPEAMGMGGGFAFMALPEVFSR LPGGVFFGFAFFLLLFVAALTSAISILESCIAWLTEEFGFARLKATLLLVIPMSFLSAGY SLSQGAMDIKLPWFDFTNGVQYLPMNAVMEKFTDNLMIPLGALCFCLFVGWVWGTDNASK EIEASGHSMAWKRLWAFLVKFLAPAVIIVILYFTVGRGQGLS >gi|157101648|gb|DS480676.1| GENE 79 83742 - 84632 904 296 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936603|ref|ZP_02083970.1| ## NR: gi|160936603|ref|ZP_02083970.1| hypothetical protein CLOBOL_01493 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01493 [Clostridium bolteae ATCC BAA-613] # 1 296 1 296 296 442 100.0 1e-122 MRINFNAIQPKNTIINAARVNASFAKNNSRTGKSAGTGRSDRADFSPQGKLMSMIENLTK QKQTIIDRKNELVENTLKNGGKVDDIKDQLKNYGKQLADLDKQIAGLYAQQAKACAEPDD KKKTDRTDPHKTEEQREVERLSSLANVSDGIRNAEKISAVKDRVEGEMHVKEAEVSQGGV HVDALVSKGMGGAANVADMVDNETSALKRKEMEIAALGDRASALGDSQAKSLGDSMDALE ENREADDPVREDKAEKSREENKTEIRAENNTGIWTEESGNAKNTGSGKTDTGTLVS >gi|157101648|gb|DS480676.1| GENE 80 84948 - 85412 384 154 aa, chain - ## HITS:1 COG:yiaL KEGG:ns NR:ns ## COG: yiaL COG2731 # Protein_GI_number: 16131447 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Escherichia coli K12 # 14 151 13 151 155 138 43.0 3e-33 MISSSIYTNEDLGKYPEGIQKALEYLMSRDFTQMEPGTYAIDGKDIYAMVMDITTCASAD KRPEIHKKYADVQFVAKGREMLGFAPDIGDLTITDGKEEGDIYFYDDVGDESFLIATKGC YNIFFPNDIHRPGCMVEHPGKVRKVVVKVSMDRV >gi|157101648|gb|DS480676.1| GENE 81 85426 - 86241 753 271 aa, chain - ## HITS:1 COG:STM3249 KEGG:ns NR:ns ## COG: STM3249 COG3836 # Protein_GI_number: 16766547 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Salmonella typhimurium LT2 # 4 250 8 244 256 117 31.0 2e-26 MRENKMKQILDEGKLAVGMGIFTGSPIVVQMIGYSGFDFVFIDMEHTPVTIDRELQALII 
AANESGMGTCVRVKYNDEVLIRTTAEFGADAVVVPHCRTAEDARKMVQAVRFPPYGVRGS ATDCRSAGYGCYPDFNFGEYVEKCNRETLIIPLAEDPEFFDNMDEILDVEGLGGVQLGPS DLALGLGIKETYNFDNPQVKSRFERLFKKAKEKGIPLMGPIAPPNLERAIEMAENGVRVM TLRNDVTNFKMLLAGLRKDVYIPLKKHFETV >gi|157101648|gb|DS480676.1| GENE 82 86287 - 87576 997 429 aa, chain - ## HITS:1 COG:SMb20372 KEGG:ns NR:ns ## COG: SMb20372 COG1593 # Protein_GI_number: 16264106 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, large permease component # Organism: Sinorhizobium meliloti # 9 425 6 422 425 253 36.0 5e-67 MNIGIVVFLAAFAMIFLTGMPIALGMLSVGVIYMLATGGNMGVIPNCICEAYWNNYVIIA IPLFVFTANVMNTGKISDMIFRFSDGIVGRMRGGMAQVNVLVSLIFSGMTGSAMADASGV GIMEIAEMKRQGYDDGFSCAITAASATIGPIFPPSMTMLLYSSLTGASVGALFMGGMLPG VMLALFLGAYCLVVSYRRKYPYGTKYTIPLFLTYTFKAIPALLTPVILLGGIYGGAVTPT EAGAVAGAYALIVSVLVYRMLNWESLKKILLETARTVGTVSITVGCASVIVYISTLERIP TLLGGFIMGITDNRYVFLLICNIMFLILGMIFDNNTITLVFIPMIYPLVQGFGIDPVHFG VMFVINMMIGMVTPPYGPLLFVTSAISETPLKDIIKEILPMIAVMIAFLLVITYIPDVVL VLPKIFLNY >gi|157101648|gb|DS480676.1| GENE 83 87594 - 88097 -29 167 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_2701 NR:ns ## KEGG: SpiBuddy_2701 # Name: not_defined # Def: tripartite ATP-independent periplasmic transporter DctQ component # Organism: Spirochaeta_Buddy # Pathway: not_defined # 5 154 1 151 176 88 33.0 1e-16 MRAKIKWLFLRIRDFVEVFIPVITFSALFLAFVYSVFSRYILNRPPAWGTEVQVATYIWT VLLGACYVRRLNKHVNFTMIYDLLPQDGQRWLRIFRNLLMGVTYGILMKPTVQYIVKYRT VSPSLKIPVKFYFFPIIILVFGVMCYSFADLYRDIRGWFRQKRGQSG >gi|157101648|gb|DS480676.1| GENE 84 88124 - 89239 232 371 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. 
PR1] # 143 351 105 307 328 94 29 5e-18 MINKNRLSVLAIAALITVSLAACSSGKKESAAGSQTSAAQVDENQTSGTQTTETSAGDTA GKDAVKLITSTSQLPGASTQLLQSRISEVSEQKGGGNFQFEHYDSATLFKSSAELGALLD GNIDIGFIQLGYFYDNGALWANMFDIAFLYDDVDDMMQVLDTQGEVGKWVQDQIWDEFHV KVIAPFYLGRRDVWISDGSLTVQTPEDLKGVKIRMPNSASFLDMGTAIGAEPTPLDSSET YLAMQTGTVDAQENIILSSYANAMQEVSKSIVMTDHMITANLVCVNGDKWENMTPEQQEL LTQVIREAVEQNNQDVMAEEAKILDECVEKYGINLQEPDKDAFKEYAFKYYLSNPDNEKN WNMDIFNEIIK >gi|157101648|gb|DS480676.1| GENE 85 89266 - 89916 675 216 aa, chain - ## HITS:1 COG:all4771 KEGG:ns NR:ns ## COG: all4771 COG0800 # Protein_GI_number: 17232263 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Nostoc sp. PCC 7120 # 1 210 1 207 210 99 30.0 3e-21 MGGNEVLAGIEKEKLIVIFRGVPTEDAADVARALADGGVKFIEITYNHCKDDPEAYFEEQ MMAVKAAVGDRVCMGAGTVLTVDQVELACRLGAGYIVSPNSNDAVIKRTKELDMVSIPGA MTPTEIVHARDCGADIVKLYIVDDINHVKYLMGPLGHIPMQATCNVSLDTIPQFLKAGIK AFGTRAFLTNDLIEKKDYKEIEKRAREHIEIIKANS >gi|157101648|gb|DS480676.1| GENE 86 90147 - 91016 633 289 aa, chain + ## HITS:1 COG:CAC3361 KEGG:ns NR:ns ## COG: CAC3361 COG0583 # Protein_GI_number: 15896604 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 248 1 245 312 145 35.0 1e-34 MNTRQLEYILAIAEEKNILRASEKLFISQSTLSQTLINLEKELGTPLFIRNQRELAITEA GQYYIEAARDIMHIKEQAYDRIRSISVTGKERYRVGISSQEGMERFLTASGLFQKKYPGV ELYATDESTKNLLKKLNMGQLALIITSLDSLQKITLPYKVLNREEILLLVPRTLPYKIWG ENRRLKWELLKNEHFILSKQNSTMRSITDHIFKQLGVNPDIVCELDNTSATIHMVAEGTG LAFLPSGLKVENEQIRYVSLTPKVYRYQVVIYQEEMSQNPKMTDFIDLL >gi|157101648|gb|DS480676.1| GENE 87 91176 - 91670 533 164 aa, chain + ## HITS:1 COG:TM1424 KEGG:ns NR:ns ## COG: TM1424 COG1905 # Protein_GI_number: 15644175 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 10 162 9 159 164 110 33.0 1e-24 MEITSQLTAEEKRAIIRDNGGDKEHLLAILYELQNASGYNYIDEETAALVAEEVGMNPTR 
VYDIITFYAMLKTEPKARYVLKVCNSTPCHFSRSEEIAQILKEELGVGIGETTEDGVFAY HYIPCVGACDIGPVIKVKDTVYGNLDRRKIRQLLADLRSGKRNQ >gi|157101648|gb|DS480676.1| GENE 88 91689 - 93014 1290 441 aa, chain + ## HITS:1 COG:TM0228 KEGG:ns NR:ns ## COG: TM0228 COG1894 # Protein_GI_number: 15643000 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 4 427 109 529 545 402 47.0 1e-112 MEIQESVLLRRVGRMDPVSVPEFEALGGFSGIRKALSMEKEEILEEISRSNLKGRGGAGY PAGRKWKSLYRIEGDTKYIVCNADEGEPGTFKDRVLLEETPLSVIEGMLIAGYVFGSKQG YIYIRGEYRRIQKIFETALENAEDAGYLGKDILGVPGFDYSITVVSGGGAYVCGENSAML NSIEGKTGRPRIKPPHLAEVGLYGKPTLVNNVETLCCVPIILEEGSQKFLSYGTTESGGT KLVSISGHVKNRGVFEVGLGISLRQLIMDEKYGGGTSTGRKPGFFHMGGQSGPIGFPEQL DTPYTYETLAPKGLAVGSGAIVVLDDSVCIVEYIQKVFEFFVHESCGKCTPCRLGTTRIL ELLTDFVEGRAKDGDVERLEYMARQVSQLSACGLGQSVNVALMSALRHRRGDFEAHINGG ACPAGACACHENGRNKACLTI >gi|157101648|gb|DS480676.1| GENE 89 92999 - 95716 2168 905 aa, chain + ## HITS:1 COG:MTH1552 KEGG:ns NR:ns ## COG: MTH1552 COG3383 # Protein_GI_number: 15679548 # Func_class: R General function prediction only # Function: Uncharacterized anaerobic dehydrogenase # Organism: Methanothermobacter thermautotrophicus # 27 903 2 862 865 702 43.0 0 MSHHIDPDTRIHIEIDHIPCEVAPETTILSAAAGLGIQIPTLCYMKELAPDGSCRMCVVE VDGGRKKGLVTACSEHCTEGMVISTRSEKVMEARRFVLDLLMSNHREHCFSCPQNGSCKL QDYCMEYGVEYTSYDSGKRVNAPKDTSNPFFEYNPELCIMCRRCVRICEELQCRNVISLK DRSFDTIITTAYGRPWSETTCESCGNCVSHCPTGALSAKDHKKYRSWEVRKVRTTCPHCA TGCQMDLLVRDNRIVGAEPADGPSNHNLLCVKGKFGSYKFVGSGDRLTHPLIRRNGQLEQ ASWDEALDYMAEQFREIQKKYGNDAIAGFSCSRAPNEDNFVFQKMMRAAFRTNNVDNCAR ICHSASVTGLAMTLGSGAMTNPIADITGDVDVILLVGSNPTEAHPVAGTQIRRAVRRGTK LIVVDPRKIDLTNDAALHLQIKPGTNVAFANGMMHIILKEGLADMDFISSRTEGFEILEQ ILEDYPPEKVAQICHIRQEDLIQAARMYAGARKAPIIYCLGVTEHSTGTEGVMSLSNLAM MTGKLGRPGCGINPLRGQNNVQGACDMGCLPTDFPGYQKVANPDVIHKFQTAWSVELNTR PGLTSTEVFPAAIKGSIKGLYIFGEDPIVTDPDTGHIEKALKSLDFLVVQELFMTETAAY ADVVLPGVSYAEKEGTFTNTERRLQRVRKAVEPAGLARDDIQIFCDVMGRMGYPCHYDSA 
AQVMDEIACVTPSYGGISHARLDAGESLQWPCPSKDHPGTPILHAGRFSRGLGYFYPAVY KESRELPDEEYPILLTTGRMLYHYNTRAMTKRTEGLNEICSESYIEVNEEDAKVLGISHG DKVLVSSRRGRITSTARVGQKVSPQEAFMTFHFADGNVNRITNAALDDLARIPEYKACAI RIQKA >gi|157101648|gb|DS480676.1| GENE 90 95737 - 96441 479 234 aa, chain + ## HITS:1 COG:lin2729 KEGG:ns NR:ns ## COG: lin2729 COG1526 # Protein_GI_number: 16801790 # Func_class: C Energy production and conversion # Function: Uncharacterized protein required for formate dehydrogenase activity # Organism: Listeria innocua # 1 223 1 254 261 116 28.0 4e-26 MEFTKQYPVIRYEDGRETPSMDEVLAECSLNVHINGRCTEVLHCIPEYLEEMLLGRLYTS GVISSPEDIMDLSIRPSLNRADLKIRETDRTMVRLPDLHICPDHILSLSQHLLDASELFR KTGNVHSVMLCRDCAVLYMTEDLDRSRAFEKTVGRALKDRVDFAGTSVYTTGRIPRPIAE KAIWAGIPVIVSRSAPTDLTLELAAEYNLTVIGFARRNRMNIYRTSRDGSATPS >gi|157101648|gb|DS480676.1| GENE 91 96452 - 97405 972 317 aa, chain - ## HITS:1 COG:CAC2949 KEGG:ns NR:ns ## COG: CAC2949 COG0679 # Protein_GI_number: 15896202 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 6 316 5 303 305 115 30.0 1e-25 MNAGIIAGQMLVLFFMMLAGFIAFKAHIVSQSGSKSLSSVIINIFNPCLIISGVLNKKIT YSPRMVAENLVLVAVFFLVLIVLSRLFTKMMRLLPGEVNSYRLMMIFPNLGFMGIPLVRS LYGEEAVIFVAFYIVGYNLLVYSYGILLAMGTRDDCEDAGKEAAKRAGLPWKKMLNIGMG ACITAIFIFAFHIRVPEPAADFVDSMGNAAVPLSMMTVGIMVAQSDLKSLVTDKKQLLFA FISLLVIPAVCIPFMRMLPVDKVSYGIFILLIAMPVGSVVMLVEKEYGATDGRISAKSIA LTSILSVLTIPVITYLA >gi|157101648|gb|DS480676.1| GENE 92 97772 - 98092 253 106 aa, chain + ## HITS:1 COG:SPy2172 KEGG:ns NR:ns ## COG: SPy2172 COG1695 # Protein_GI_number: 15675909 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 93 1 91 108 74 39.0 4e-14 MIFNTGSALLDAIVLAVVSKDAEGTYGYKITQDVRQAIDISESTLYPVLRRLQKEDCLEV YDMAIDGRNRRYYRLTEKGRVQLNLYRGEWVTYSQKISQIFKEAKL >gi|157101648|gb|DS480676.1| GENE 93 98089 - 98919 788 276 aa, chain + ## HITS:1 COG:SPy2173 KEGG:ns NR:ns ## COG: SPy2173 COG4709 # Protein_GI_number: 15675910 # 
Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 1 68 1 69 195 66 44.0 6e-11 MTRDEFMKELAYLLQDIQDEDKDDAVQYYTDYFEEAGPDKEAEVIQELGSPERIAAIIRA DIAGHLESGGEFTEQGYQDERFRDPGYAMAKRLDLPEEQESGGQGSGSRGQGYNREQNFD REQSFDQKRGFNHGPEQTYTEENNGTSNPPPPRTSKVIKLILWAVLIIVAFPVFLGVGGG LLGLAAGVLGILTAVLVTLGATTFAVLLGGIVMLPYGVFHMFSHPLDGLMASGTGLILLG AGALCLALSVLFYGRFIPFIIRSIVNGLSRLLHRGR >gi|157101648|gb|DS480676.1| GENE 94 98924 - 99799 896 291 aa, chain + ## HITS:1 COG:no KEGG:Closa_2421 NR:ns ## KEGG: Closa_2421 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 287 1 296 300 103 28.0 7e-21 MKRFVKICAITGLILLLLGLGITAVSAAMGGRYTNSLPSRLASRVWHNHMPWLVRWGDDR ADDWDDRDDWDAWDNWDDWNNPDDPDDTDRSPGSSQIKTVSDPSYAQARKLDVDIDKGVI RVLEKEGISQIQVNVQDTYNRTQCYMDEFTLKVKRESGRSRSNEAPRIEILIPAGYGLDK LSLDLGAAECTVLGVTVSKLEIDTGVGAITFSGTVNGDVEIETGVGDVTLNLTGSQDGYN YQVECGVGSIDVGAEHYSMLSHETHINNKAPYTMELECGVGNIAVNFDQIL >gi|157101648|gb|DS480676.1| GENE 95 99826 - 100011 270 61 aa, chain + ## HITS:1 COG:CAC2659 KEGG:ns NR:ns ## COG: CAC2659 COG1983 # Protein_GI_number: 15895917 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Clostridium acetobutylicum # 1 61 1 62 63 74 62.0 5e-14 MDKKLYRSNINKMLCGVCGGIGEYFNIDPTIIRLIWAIFACSGPGIIAYFVAAIIIPLAP N >gi|157101648|gb|DS480676.1| GENE 96 100018 - 101046 716 342 aa, chain - ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 97 336 215 452 452 63 23.0 4e-10 MFGNANFIESRFFWDFSDRHPYLIYNTARVAILLITYPFILHFLGHTVREALKMEDRKMW QYMWKIPLFSTLFGMLYCTVQDVYAYASWQFLISRYLMLLGTCYVSYVFLKVLEISRRKT QLEEELKYADRSLLAQKKQYETVSAHMEEMKKARHDLRQHLAVVQSYIERNDRAGLSEYI 
EIYKTELPPDIIEIYCRNQVINALICYYASQARRHKIRFEAQADYPEQCPVPDTEVTVIL GNLLENAVEACLREKNGKCSIRLRIFRKNSSLVFLTDNTCSGTVPFDADGKTPLSSKRKG VGIGVSSIREIADRYGGDTLFEQKAGMFYASVWLQIPEKADN >gi|157101648|gb|DS480676.1| GENE 97 101360 - 102109 614 249 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 9 239 2 230 234 85 25.0 1e-16 MPVRGEDFMQVAIIDDLAVCRQEIKEYLLRYIRENYQGEEPELDEFESGTRFLDAFRPQV YDVIFIDQYMEGMSGMDTAREIRRVDEEAALIFVTTSCDHAVDSYGVRACGYLIKPFTYE AFGETMRLARLEKIMNARFIALGDDRILLKEILWCDRDGHYVQIHTDMRGVLRYRVTIAF LEAVLSAYPQFLPCYRGLIINMDRVRRMEELEFLMDTGERVPFRKRDHKEIKSRFSQYMF RRARGEDLL >gi|157101648|gb|DS480676.1| GENE 98 102271 - 102636 236 121 aa, chain + ## HITS:1 COG:no KEGG:Clole_3645 NR:ns ## KEGG: Clole_3645 # Name: not_defined # Def: response regulator receiver # Organism: C.lentocellum # Pathway: not_defined # 1 112 2 115 121 63 28.0 2e-09 MEIIICDDNLSERSLMIDYILKWASEKHLEIQINTCKDWMELAAALEGREWDIVIVSLDG VRGLDTASSAQPLSRRLIWISDLDFGVQAYRMCVSYFFMKPVSCQKMQRALECALKGQGT T >gi|157101648|gb|DS480676.1| GENE 99 102655 - 103692 826 345 aa, chain + ## HITS:1 COG:BMEI0929 KEGG:ns NR:ns ## COG: BMEI0929 COG2199 # Protein_GI_number: 17987212 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Brucella melitensis # 8 178 53 225 255 91 34.0 3e-18 MKQRQHTDIYKKTRKQLEQENCALQTLAELDGLTGLYNRITMERRVDDHLCRKQPGTMIV LDLDHFKQVNDRYGHIAGDMLLCTIANVLKKMFSGRNLVGRVGGDEFVIFIPSVLDDPMT AGECLRIQERFREVRLGDSILIKLSITIDSADSRTCTKYRDMFDCADQKIMAKKRLRNLK DDTGLISPSREPAGIQPDMSLIAMEMEEKETQPGAYCQDYETFKQIYRLEERRLRRDKKR AFIILFTLTDRNNGFLPLDKRDAEMDILGEKIKQSLRMGDVYTRYSSCQYLVMVLDVSIE NVGVIADHICSAYYQMHQETADNLILHHSYPLKPARGRPRASRPQ >gi|157101648|gb|DS480676.1| GENE 100 103641 - 104828 979 395 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936628|ref|ZP_02083995.1| ## NR: gi|160936628|ref|ZP_02083995.1| 
hypothetical protein CLOBOL_01518 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01518 [Clostridium bolteae ATCC BAA-613] # 1 395 1 395 395 733 100.0 0 MAEGRFDTARPGGNRAAKMPKGQPYAGFNMQFCILSALGMFFVVDGHLNNSYLDIGGLVP YYSFHMPLFAFISGYFYKKGSEHRLGAYTKKKIIRLLGPYMVYNLIYGMIVQGLHRAGFA FGGDLSLWNLFVEPFITGHQFEYNLAAWFVPALFLVEMANVLLRRLLKRVDSEYAVFFLY LSIGVWGILLAFSGRYQGGWLTLVRMMFLLPCYGAGTLYKEKLEAGDRADHGLYFGIILA AWLLLAMSGRPLIYSVAFCNGFTGILMPYVTAALGIAFWLRVSRILAGAFGEGRYIRYFG SHTYAVMMHHIMALMVLKTLFAALAKYTALFPGFSFEQYKADLWYCYFPKALPQFRVVYL IWAIVLPLLFQWIFDRLRLHWGREALGRPRAGFKG >gi|157101648|gb|DS480676.1| GENE 101 104809 - 105648 856 279 aa, chain - ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 31 268 47 299 313 100 27.0 3e-21 MIDQAGIMMAQGRSIAGERIQGVNDNMSKSHYHDYFELYYLEEGERFHIIQDRLYCIHPG EFLIFSPYVMHHSYGQENVPFKRLLVYFSKEEVDSVQLLKVLEKGTGVYKPDQREKQVVH RMLDDLLKEQDDPQAYHEDYIHTLLNMALLTIARNIQTPVKPEKRNRITDVISYIHQNYQ EDISIDQLAQMSYVSPYYLCREFKRYTNSTIIQYVNVTRVMNAQRMFMETNKNITEISRD TGFSNITHFNRVFKSVMGMTPSNFRKMYKTEAGHGRRTI >gi|157101648|gb|DS480676.1| GENE 102 105860 - 108361 2513 833 aa, chain + ## HITS:1 COG:BH0704 KEGG:ns NR:ns ## COG: BH0704 COG1501 # Protein_GI_number: 15613267 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Bacillus halodurans # 9 730 12 737 801 509 39.0 1e-143 MKVCTNAYTVAKKDSYFSVMTNSVEIRILFLTDSILRIRAGFDGDFAEESYSLVMTAWED RMDDFLKGRRTRVEAADAVLSDGDREAVIEGRILKVVVEKDPFRICVYDKEGTLLHADIV DLAYMEDSNHRRIHTSEISPEDCFYGFGEKSGSFNKAQKFMSMSPKDAMGYNPRETDSLY KHIPFYIKLNRGTRKAVGYFYHNTCECDFDMGREKSNYWKPHSRYRTDGGDIDLFLIAGP SVRQVVERYTDLTGKSAMLPRYALGYLGSSMYYPELDNDCDDAILDFIDTTREEKIPVDG FQLSSGYCTVETDKGIKRCVFTWNKKRFKDPREFFAQMEKRGVTVSPNVKPGILLIHPKL DEMKAKGMFIKASDSDEPGIGTWWGGKGVFADFTNPSTRTYWKEMLKENVLEYGTSSVWN 
DNCEYDSLVDKDCRCDFEGKGGTIGQLKSVMSNIMCHITDEAIHETFTNTRPYIVCRSGH CGIQRYAQTWAGDNLTCWDSLKYNIATILGMSLSGVANQGCDIGGFYGPSPEAELMVRWI QNGIFQPRFSIHSTNTDNTVTEPWMYGDCTDYILEAIGLRYQLSPYLYSLMERAHETGLP IMEPMCSAFQEDVKCYEEGVDFMLGDSLLVANVVEKGAVSRKVYLPEGETFYDFYTRAAY EGGRTVELPVDLGSIPLFVRSGAIIPMAEDRLDNLKTQQAEHIRILCAADRDGRFELYED DGISMDYEKGGCLKTSITMTAGERTVLDFHQEGHYETAVKTLYLDMIHREKAPYWVKADG ETIPHFLHRRKFEDADCGWYYSQRLKSVQIKYPNPKKDYQVIVSFEQFDLIGM >gi|157101648|gb|DS480676.1| GENE 103 108429 - 109454 564 341 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 39 341 48 346 346 221 38 2e-56 MKKFAAAMAAALSLAVLGLSGCAQSGAKTSAEASAENPMVLTLAHGLSETHTVHIAMTEF ADKVEERTNGRIQVRILPNGQLGSENENMEQLMAGVISMTKVSAPGLATYNESYHAFGLP YIFDDTEDFYHVMDSDPMQDFFLSSSDDGFVTLTYYTSGARSFYTKNKAIRTPADLKGLK IRVQDMKSQTDMMKALGGIPVAMSYGDVYTSLQTGIIDGTENNETALTTGKHGEICKVYS TDQHAMIPDVMVMSAKVWKTISPQDQNIILEAARESTESHKIAWDAAIDEAIQEAESQMG VEFVNDVDKDAFRQATSGMIDDYCAQYPGVKELLDTIDSVE >gi|157101648|gb|DS480676.1| GENE 104 109469 - 109969 241 166 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 1 144 1 142 164 97 36 5e-19 MNEILSVIKKWMTRLLAGIATVLLSVMTLLVLYQVFTRYVLNSPAAFTEELVRYFLIWTG FIGAAYAFITREHMCLVLVRDSLSPAGRRILMTVIDVLILLFAVFVITIGGFKLALSAQK VFSALLGIPRSLVYAMAPISGLFIILAQIINIYEDVTGITIEGGNE >gi|157101648|gb|DS480676.1| GENE 105 109969 - 111270 1107 433 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 5 427 3 426 435 431 50 1e-119 MSIGLTTVILFAVFGILLFLSCPISVSIVIASIVTAVSSLSWDQITFITMQKMNSGVESF SLLAIPLFILAGNIMNNGGIARRLVNFAKLFVGWIPGSLAQANIVGNMLFGALSGSSVAA ASAMGGCIYPIQKDEGYDPAFATAVNIASAPTGLLIPPTSAFIVYSTVAGGVSISTLFMA GYIPGILMGVGSMVVAFIYAKKHKYPVSGRTPMKEAVKVTLEAIPSLLLIIIVIGGIVGG IFTATEGAGICVLYCLILSICYKSLTVKSFMKILVSSSCTSGIILFLISASSAMSFVMAY SGIPAAISEGIMSISTNKYVILLLMNIILMVVGMFMDITPAILIFTPIFLPIAQSLGMTD 
IQFGVMLIFNMCLGNITPPVGSVLFVGCGISKISIEQVTRTLLPYFAVLFLLLMAVTYIP ALSLGIPGLMGLI >gi|157101648|gb|DS480676.1| GENE 106 111466 - 112188 583 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 239 1 242 245 229 45 1e-58 MIEFRDVHKSFGNLEVLKGINLNIEKGQVVTLIGPSGSGKSTILRCINLLEKPTKGQVFI DSTDITVPKADIQAVRKNIGMVFQHFNLFPHMTVMENMTYAPVKVNKIPKDKAAENALEL LKLVGLVEKADAYPGKLSGGQKQRIAIARALAMEPQIMLFDEPTSALDPEMVKEVLEVIK NLAHTGITMALVTHEMGFAREVSDRICFIDDGQIVEDANPDEFFKNPKSDRARGFLEKVL >gi|157101648|gb|DS480676.1| GENE 107 112172 - 112840 715 222 aa, chain - ## HITS:1 COG:BS_yqiY KEGG:ns NR:ns ## COG: BS_yqiY COG0765 # Protein_GI_number: 16079453 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Bacillus subtilis # 1 219 1 219 219 209 49.0 5e-54 MKLDFTSVWQYWPFLYHGALLTLKITAICAVLGFLLGIVLALVKISKYKPLIWFGRFYTS IFRGTPLLVQLCIVHFGLPQLLGFTLTTTQSGILTFTLNSAAYVSEIFRGGIMGVDKGQY EASMSLGIGYGDMMKDIIIPQAVKHILPSLVNEAIALLKESSILSYIGAVELLRAADQIT VLTFRFFEPYLVIAVIYYVFVMILSAVASRLERWVNRSDRVS >gi|157101648|gb|DS480676.1| GENE 108 112866 - 113783 1218 305 aa, chain - ## HITS:1 COG:BH1461 KEGG:ns NR:ns ## COG: BH1461 COG0834 # Protein_GI_number: 15614024 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Bacillus halodurans # 56 294 27 267 276 123 31.0 5e-28 MKKIALLLCAVLTAGALTACSGGADATTAVSQAAAQEAADKDGAGKDAAETPAAGAEEKS DEAGENSADTATENPFSGKTVKVGCSATFVPFESIEMAADGSKTYVGMNIDIVKAVVEKN GGQVEFVDMPFKSLMAAIQAGQIDFCSGGMAPTAEREKTLDFSEIFFYPRNAIVYRAEDN YPDLDSLKGKKIAYVFGTNYQQVAESVEGAVTVGIQGSPACIEEVKSKRADACIVDGAGA TEFLKNNEGLKLSLMDKVDDDCFAIGFPKESPYYETFNNTLKEMMENGELDDIIASHLSE QFILD >gi|157101648|gb|DS480676.1| GENE 109 114101 - 115000 833 299 aa, chain + ## HITS:1 COG:STM3132 KEGG:ns NR:ns ## COG: STM3132 COG0726 # Protein_GI_number: 16766432 
# Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Salmonella typhimurium LT2 # 17 298 7 296 307 133 31.0 4e-31 MSYEIDGRTNGPAITWPGRARIAVMVTFDFDAEYLRMSRAASKGTTIGFTDYSRGQYGPR EGLKRCLDMLDTHDIKSTFFVPGIVAEKYTAQVESIHARGHEIACHGYMHESVRGISAEE ETEILEKSEAILKRITGRRPVGHRGPESIIHPFTPELLAQRGYLYSSSMKDCDWAYLWEK DGQVLPLVELPCDITMDDFTYFYFTFSDPAVRSMYTNREVFGNWKDEFDGLVEEGNKIFI LKLHPQMIGRASRIAALGNFIAYMKRHGAWITTCEEAARYVLAQAEANQENPEKGGAAL >gi|157101648|gb|DS480676.1| GENE 110 114997 - 115830 870 277 aa, chain + ## HITS:1 COG:SMb21100 KEGG:ns NR:ns ## COG: SMb21100 COG0726 # Protein_GI_number: 16264427 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Sinorhizobium meliloti # 2 265 19 281 292 143 33.0 4e-34 MMTWPDNKRIAVMMAFDLDAETMWTTRGDGNHDHITNLSRGAYGPKQGVPRILDMLDVYG VKATFFIPGVIAEHYPLVVKEISRRGHEIGFHGYLHEESTATSYEEEDATMTRCETIIKD LTGQSIAGHRGPGGVIHDYSLRLFLEHGYLYSSNWRDSDGPFIHTIDGRQVPLVELPKDS IFDDTAYDFYTDSAPERYELKSPREMLEIWKDEFDSLAAEGRMINFVLHPQFIGRASRVN MLSELIGYMLSHGAWIDTNRAVAEYVLENRDLLKPDR >gi|157101648|gb|DS480676.1| GENE 111 115994 - 116644 610 216 aa, chain - ## HITS:1 COG:BS_yoqW KEGG:ns NR:ns ## COG: BS_yoqW COG2135 # Protein_GI_number: 16079108 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 187 1 179 224 74 29.0 1e-13 MCGRYHFSAELLDEIRDITEQKDWKLELGVLDRDIHPGDMAPVIVAALEEASNVPAPAKN AGTASYAGRGGSLRVCRQKWGYPGPGGKGLVFNARSESVFEKRMFRDSVSMRRAVIPVSW FYEWNKNKEKFTFTKEGSRILFLAGFYGRYEDGEHFVILTTQANASMAPVHSRMPLVLER EQVREWILDSAKTKELLGQEPPQLERDCEYEQQTLF >gi|157101648|gb|DS480676.1| GENE 112 116771 - 118225 681 484 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 4 483 3 448 458 266 33 4e-70 MADKFDLVVIGAGPGGYEAAIEGVQKGMKVALVENRELGGTCLNRGCIPTKTIIHTAELY HELQSGPSIGLTVREPAVDMEMVQKRKDEVLEQLRKGIASLMKTNKISVYYGTGTILDRE 
HVKVAAAEDTSEGKSEGKPEGKSEEQPGGQKQDQVVLETSHILIATGSVPACPPIPGSSL PGVVTSDGLLDKKDMFEHLIIIGGGVIGMEFASVYSSLGHGVTVIEALDRILPTMDKEIA QNLKMIMKKRNVDIHTGAKVEEILQTEDGKGLICRYTEKDKPCEARADGILIATGRRAYT GGLITDESSREVKDMAMERGRIITDEKYETSVPGIYAIGDVTGGIQLAHAATAQGRNAVA HMAGEDMVIRTDIIPSCVYTNPEIGCVGISADEAKARGMEAVTKKYIMSANGKSILSQQE RGFIKVVADSGSHRILGAQMMCARATDMISQFAVAMANGLTLEDMAKVIFPHPTFSEGIL EAVR >gi|157101648|gb|DS480676.1| GENE 113 118261 - 119289 763 342 aa, chain - ## HITS:1 COG:STM4576 KEGG:ns NR:ns ## COG: STM4576 COG0095 # Protein_GI_number: 16767817 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Salmonella typhimurium LT2 # 10 312 9 318 338 256 42.0 3e-68 MINRLVWMETDNTYPYRNLAMEEYMTLHVPQGTCILYLWQNRHTVVIGKNQNCWRECRVN FLEQENGYLVRRLSGGGAVFHDLGNLNFTFIVGAGDYDVSRQLDVILEAVRSLGIHAEKT GRNDITVDGRKFSGNAFYRTGDGCYHHGTILIRADKENMSRYLNVSKEKLASKGVTSVKS RVANLCEFKPDITLEEVKSALVEAFSKVYGQPAVRMEEGELPQPDIDARTEKFASWQWKY GRKIPFEYEMEKRFPWGNIQIQFHVDEGMIQDVNVFSDAMDQEMIGMLAEQLKGCAYAEE DICRAVKAMGTDAFQAGEYTGADEPQSCRIKEDIMTLVRESL >gi|157101648|gb|DS480676.1| GENE 114 119414 - 120847 1650 477 aa, chain - ## HITS:1 COG:TM0214 KEGG:ns NR:ns ## COG: TM0214 COG1003 # Protein_GI_number: 15642987 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Thermotoga maritima # 4 474 3 472 474 524 54.0 1e-148 MKLVFEKGSAGRRLDLISPCDVPQVSFEKAHIREKQPRLPHMSENEISRHYTELAKRSHG VNDGFYPLGSCTMKYNPKVNEEAAALKGFRGVHPLQPEATVQGSMEVLYLAEKYLCEITG MDAMTFQPAAGAHGEFTGLLLIKAYHVHHNDTKRTKIIVPDSAHGTNPASASMCGYDVVS IPSREDGCVDLEQLKAAVGEDTAGLMLTNPNTVGLFDKNILEITKIVHDAGGLCYYDGAN LNAVMGIVRPGDMGFDVVHLNLHKTFSTPHGGGGPGSGPVGCKEILKPFIPGVRVEKEED TYHFVKAQDSLGDVKMFYGNFLVVVKALSYILTLGKEGIPEAARNAVLNANYMRVKLSDV YDMAYNETCMHEFVMSLEKMKHDKDISAMDIAKALLDYGIHPPTMYFPLIVHEALMVEPT ETESRETLDEAIEVFHKIYEQALADPEALHSAPHLTPIGRPDEVTAARKPVLRYTWE >gi|157101648|gb|DS480676.1| GENE 115 120844 - 122211 1467 455 aa, chain - ## HITS:1 COG:lin1386 KEGG:ns 
NR:ns ## COG: lin1386 COG0403 # Protein_GI_number: 16800454 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain # Organism: Listeria innocua # 4 439 6 444 448 367 43.0 1e-101 MGSYVPGTKQEQQEMLREIGYASFEEMFSHIPDEVMLKEGVNIPSGLSEQAVKARMEDIA GKNVVFRHIFRGAGAYNHYIPAIVSNVTSKEEFVTAYTPYQAEISQGILQSIFEYQTMLC ELTGMDASNASIYDGATAAAEAIAMCKERKKTTAYISAAANPGVIEVMKTYCFGSNTKVV MVPEKDGVTDGAALEEMLKADPQAACFYVQQPNYYGNIEDGDALGRIVHEAGAKYIMGCN PMALAVMKTPAEYGADIAVGDGQPLGMPLAFGGPYLGFMTATAAMTRKLPGRIVGETADH EGRRAFVLTLQAREQHIRREKASSNVCSNQAWCALTASVYMTAMGADGMAKAAGQCMSKA HYLRDVLKEAGLEPKHNCEFFHEFVTVSPEGGCTDSAHILRALESKGILGGLPLSDREIL WCATEMNTREEMDELASAVREALHRECGQKEVCGS >gi|157101648|gb|DS480676.1| GENE 116 122264 - 122644 596 126 aa, chain - ## HITS:1 COG:PH1317 KEGG:ns NR:ns ## COG: PH1317 COG0509 # Protein_GI_number: 14591129 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Pyrococcus horikoshii # 5 125 15 138 138 121 57.0 3e-28 MNFPEDLKYAKSHEWVKTLEDGTVLVGVSDYAQDALGDLVFVNLPEAGDEVTEGEPFADV ESVKAVSDVYSPVSGTVAEINEVLLDAPESINEAPYEAWFMKVENVTGTAQLMSAAEYEE YVKTLD >gi|157101648|gb|DS480676.1| GENE 117 122713 - 123822 1251 369 aa, chain - ## HITS:1 COG:BH2816 KEGG:ns NR:ns ## COG: BH2816 COG0404 # Protein_GI_number: 15615379 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Bacillus halodurans # 10 368 4 363 365 346 49.0 4e-95 MGTETAGEVLLKTPLYDTHVTYKGKIVPFAGYLLPVQYETGVIEEHMAVRTKCGLFDVSH MGEVICKGPDALKNLNMLLTNDYTVMAEGQARYSPMCNEQGGVVDDLIVYKVRDDCYFIV VNAANKDKDYAWMKAHQSGDVVFDDISSQVAQLALQGPKAMDVLKKVAKEEDIPEKYYTC LFDRMVGGMKCIISKTGYTGEDGVEIYLAPEDAPKMWELLMEAGKDEGLIPCGLGARDTL RLEAGMPLYGHEMDDTISPKEAGLGIFVKMDKDDFIGKQAIQDKGPLTRKRVGLKVTGRG IIREHQPVYIGEQQVGMTTSGTHCPFLGYPAAMALVDIGFKDPGTALEVDVRGRRVAAEV VKLPFYKRQ >gi|157101648|gb|DS480676.1| GENE 118 124419 - 125291 825 290 aa, chain + 
## HITS:1 COG:no KEGG:Closa_4282 NR:ns ## KEGG: Closa_4282 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 290 1 290 295 381 61.0 1e-104 MHKFMRTVGFSMYQKKQDMAKLLKRLVKEAEMTGRLAEQEGSRFCELRTEVAPGMGVAMI GEMSPKGTFSREYYFPYVKNPDVSSEAECSIQRHTERETYAGLLDEYRVGISLIFYMENS MEYRMRRSARQPVQPKGAALTGLAVQGKVLLPVNKTEKQAEDSRQAAKKRSSLLEAAKHG DEDAIETLTIEDIDLYSMVSRRIAHEDVYSIIETCFMPCGIECDQYSVIGEITEISTSAN RITKEEIYNLKLDCNDMVFHVAINKQDLLGEPRIGRRFKGQVWMQGTVLF >gi|157101648|gb|DS480676.1| GENE 119 125794 - 127254 449 486 aa, chain - ## HITS:1 COG:mlr0475 KEGG:ns NR:ns ## COG: mlr0475 COG0582 # Protein_GI_number: 13470699 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 21 442 26 395 399 80 22.0 6e-15 MASIKQRKSKFSVIYWYMDDTGERKQKWDTLETKKEAKARKAFIEFYQQTYGYVLVPLEE QFAHQREEAEQAIDTPDDEITLKEFLVTFVNLYGVSKWSANTYSSKLSSINNYINPLIGD WKLNEITTKKLSQYYNGLLSVPEVPRANRKATGRCVQPANIKKIHDIIRCALNQAIRWEY LDSKMRNPASLATLPKVPKVKRKVWNVQTFKEAIKLVDDDLLLLCMHLAFACSLRVGEIT GLTWDDVIIDEEAIANNNARVIVNKELARISQSAMQKLKEKDIIKIFPTQKPHCTTRLVL KTPKTETSNRTVWLPTTLAQLLVQYKKDQQELKEFLGSAYNDYNLVIALENGNPVESRIV RNRFTMLCEEHNFETVVFHSLRHLSTKYKLKMTHGDIKSVQGDTGHAEAEMVTDVYSEIV DEDRRLNAKKLDEEFYDTLDTEEPEHKPLSEAQKSISDSDKLLLELLKSMTPETKDKLLK ESLSYC >gi|157101648|gb|DS480676.1| GENE 120 127274 - 127474 149 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936651|ref|ZP_02084018.1| ## NR: gi|160936651|ref|ZP_02084018.1| hypothetical protein CLOBOL_01542 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01542 [Clostridium bolteae ATCC BAA-613] # 1 66 49 114 114 129 100.0 6e-29 MLDGKENTHGITTSEKKTYTVEEIAALLNISMKSAYALVKSGQFHYIRAGRMIRVSKISF DKWLHE >gi|157101648|gb|DS480676.1| GENE 121 128013 - 128252 220 79 aa, chain - ## HITS:1 COG:no KEGG:CD3328 NR:ns ## KEGG: CD3328 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 79 1 79 79 141 97.0 8e-33 
MTAKHPMIPFPVIVRAADGDIEAVNQIVRHYSGFIASRSMRPMKDEYGNTHMVVDETLRR RMETRLIAKILSFEIRESN >gi|157101648|gb|DS480676.1| GENE 122 128249 - 128668 374 139 aa, chain - ## HITS:1 COG:no KEGG:CD3329 NR:ns ## KEGG: CD3329 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 139 2 140 140 241 96.0 5e-63 MKPSEFQATIENQFDYICKVAMEDERKDYLKVLSRQCKRETLFCDMDDYTVNLFSSEDTY PSHFHTFEMDGFTVRIENSLLAEALEHLDEKKRDVILRYYFLGFDDTEISKILEVNRSTI QRRRHAGLEFIKKFMEEEA >gi|157101648|gb|DS480676.1| GENE 123 128927 - 129169 105 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936655|ref|ZP_02084022.1| ## NR: gi|160936655|ref|ZP_02084022.1| hypothetical protein CLOBOL_01546 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01546 [Clostridium bolteae ATCC BAA-613] # 12 80 110 178 178 102 100.0 1e-20 MPIPILIYIILRASVKKISIISILAGIFIEPIQLLINLITGFPNYVIDIDDFMLQIAGIL VGLFIITLAEKYHILNWLKE >gi|157101648|gb|DS480676.1| GENE 124 129435 - 129554 85 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936656|ref|ZP_02084023.1| ## NR: gi|160936656|ref|ZP_02084023.1| hypothetical protein CLOBOL_01547 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01547 [Clostridium bolteae ATCC BAA-613] # 1 39 1 39 39 74 100.0 3e-12 MQSLNNRKVIFLGMQVEDTCIPFYKDEEGNYVDIYKTDS >gi|157101648|gb|DS480676.1| GENE 125 129571 - 129879 144 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936657|ref|ZP_02084024.1| ## NR: gi|160936657|ref|ZP_02084024.1| hypothetical protein CLOBOL_01548 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01548 [Clostridium bolteae ATCC BAA-613] # 1 102 71 172 172 188 100.0 1e-46 MTGGIYTKQNWVVSDTPTFDINVWIESYGYPGGDDSDLYIGLEKKSAVTNKWSATDYAYV SAKDGGSVTLTGDGSGTYRIYLRLIEATGYRVTGTLNISYNF >gi|157101648|gb|DS480676.1| GENE 126 130215 - 131492 252 425 aa, chain - ## HITS:1 COG:CAC3437 KEGG:ns NR:ns ## COG: CAC3437 COG4219 # Protein_GI_number: 15896678 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Antirepressor regulating 
drug resistance, predicted signal transduction N-terminal membrane component # Organism: Clostridium acetobutylicum # 59 369 114 426 541 68 23.0 2e-11 MVLVFSAFGMPVLAGVILYKFVFGERIYLVSDDLLYMDTIHKSSISYGTPWGNHWISLLL FLAWFLGFIYYGIFKYTNDRRILKKLEKCSKYTQDKLLLETKREAMNELGLKGPVLLLSN EIIQSPFMTGIFEQKIFLPENNYTQEEWGLLLKHELVHCKNQDYFFRRLVFILCALHWFN PLIYRLSDYFVEVNEMACDEKVLNRQPIKRHTMYAELILQIQEKELGITTVSLTGHTVNG LERRIGNIMKKTEKTKRISFAVLVMSMVLMCPLTVFAASWGMSNLQDLVVKKLWITEVEV QQESTELSEKTEFHYSEDVIEYPIQINPRGVTSIDITIDGKSVKYGDSLSLTSGNKVAFI LQSDSSSDSFKAGLEDSKGKRIYVQTSNGWIDHTFSIEQSGSYKIFLEGTTSKSIHITGS ITILK >gi|157101648|gb|DS480676.1| GENE 127 131614 - 132027 178 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936659|ref|ZP_02084026.1| ## NR: gi|160936659|ref|ZP_02084026.1| hypothetical protein CLOBOL_01550 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01550 [Clostridium bolteae ATCC BAA-613] # 1 125 31 155 167 239 100.0 5e-62 MARPAEAPLNYLGGKKKLSATEFEFMKFIWKHPEGVSSEEIYQAFPQARGTKSTILYHLS EKGYVDKKQSGLHHFYTVLVTQEEYEQALLRQQIEGVLGNTSFERLVAAFCGKTLSKAQI EKTRNLLKELEDELEDK >gi|157101648|gb|DS480676.1| GENE 128 132396 - 133067 163 223 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936660|ref|ZP_02084027.1| ## NR: gi|160936660|ref|ZP_02084027.1| hypothetical protein CLOBOL_01551 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01551 [Clostridium bolteae ATCC BAA-613] # 24 223 1 200 200 358 100.0 1e-97 MKSKKILALILVLSTTSALSFSSMAEEVSLLDSLPAVQYEVETRGNAETLSEDDYLLDGN YENGELINGSIYHSEAEAVEGVKQDRIMARTIANSSESEPDLARIYPNTGIIEGISSTGS FYYNIQTSGTTPSAYWNVHSTVWVDSQSDVYYSKSQSTGTNDNYSYILTVMSDGNAVGSF RGYADDVKGGANFTIPGNSMYIKLATENFPRNEFLHGSGSVDD >gi|157101648|gb|DS480676.1| GENE 129 133361 - 134572 556 403 aa, chain + ## HITS:1 COG:FN1676 KEGG:ns NR:ns ## COG: FN1676 COG3547 # Protein_GI_number: 19704997 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 3 396 1 381 391 97 23.0 5e-20 
MNMNAVGIDVSKGKSMIAILRPYGEIVSSPFEIKHTSSNIQSLIEQIRSIEGESRIVMEH TGRYYEPLARELSLAGLFVTAVNPKLIKDFGAHSLRKVKSDKADAVKIARYTLDSWTELK QYSLMDELRNQLKTMNRQFGFYMKHKTAMKNNLIGILDQTYPGVNTYFDSPAREDGSQKW VDFATTYWHVDYVRKFSLNAFVDHYQKWCKRRKYNFSKDKAEEIYGAAKELVPILPKDDL TKLIVKQSIEQLNTASKTVEELRTLMNDTAAKLPEYPVVMGMKGVGPSLGPQLMAEIGDV TRFTHKGAITAFAGVDPGVNESGTYEQKSVPTSKRGSSSLRKTLFQVMDCLIKTKPQDDP VYAFIDKKRAQGKPYYVYMTAGANKFLRIYYGRVKEYLMSLPE >gi|157101648|gb|DS480676.1| GENE 130 134679 - 137591 2061 970 aa, chain - ## HITS:1 COG:VC1831_1 KEGG:ns NR:ns ## COG: VC1831_1 COG0642 # Protein_GI_number: 15641833 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 430 681 324 583 584 157 35.0 1e-37 MKTKPKKKSIAMLAVPAGICVTALSISVFFLVQITGRMNLSANQNLLTSSKVISEGLNNK ISLDRELLFTLADMLASEQESAVGETLLGYVKSTDFFHFTYITMDGEGVDSDGNLVHVSD LPFDDLALSSGKDGMSAPYYGASGRLQITYQSPVRKDEKQIGAIYADRVINDYNLPTLFT FNNGEGYAYVVDKNGNYIIESRGTAEQPDIYSYLAMQGNSRAVQNTLCEVMQEGISGTLV VINENQKSLLGFLPLDAPEGSYLITMIPRAVLQRESTPIIIMLYTMFTFLLIGAVAIANL VTGRQSMREEARQREYREKLFRNLSANIDFVFLLYTPSTQKVELVSDNLPVLLGITAKQV QERPKLVFSASGMAPDDPAGNSFLDGTLNKQVTREAMVGDGPNEVKRWFAVHMIPADYGQ YLAVFQETTKEHNMREHLADALKQAQNSNQARTAFFSSMSHDIRTPMNGIIGMTNVALNS LDNMEKVESCLNKITAASGHLLQLINEVLDMSRIESGKISLKEENVHLPSMMANLISFIR PDMDKKRQSFELKSQILEHDTVISDGLHLQKILLNLLSNAMKYTQAGGTISMQITETNVD ADTIRLRFVVEDNGIGMTPEFLERVFAPFERAEDSRMSQATGTGLGLAITKSIVDMMEGD IYVESEKDMGSRFTVELPLKLPKLQMQELPDLAGYSVLIVDDDQDACESISLILQETGIR ADWVMNGMEAVEKAWQAHVMEDDYCIVIVDWKMPEMDGVETARRIRARLGSGMPILLLSA YDWENVKEEAIQAGINGFLTKPIFKADILEQMKQYIQGGYQEYEEQQIQETEKLEGVRKL EGVRILVADDNELNLEIMFEILNSSGAEVDCVHNGKEALDTYLKSSQGYYQIILMDVHMP EMDGLQATKKIRNSGRPDASIVPIIAMTADVFKEDIRRCKDAGMDAHIGKPVELDKLYSM VRKFINKACK >gi|157101648|gb|DS480676.1| GENE 131 138040 - 139932 1758 630 aa, chain - ## HITS:1 COG:PA1727_2 KEGG:ns NR:ns ## COG: PA1727_2 COG2199 # Protein_GI_number: 15596924 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas 
aeruginosa # 474 626 29 183 194 97 34.0 8e-20 MKNNKLFQTNLLVSSILVVGFILTAFFSYRANYRASLENIEQVSSLTAEGIYYQLKAMLT KPVNISLTMSHDSLLVDHLLAENSRMDALDFAETTKTYLETYQKKYGFDSVFLISANSGR YYNFNGLDRVMEKGDPENEWYYELLGNDLEYSLNVDNDQVEGADNRITVFVNCKVTAPDG AVIGIVGVGIQIDYLEELLQEYEDSYGNQASLINQDGFIEISTTYNGYEKKDWFGVYGQE HIREQVLDFRESDNSMEFWTDTGPKEGKRSFIVERYIPELSWYLLVEQNTGPIVEAMNRQ LYQTGFILTVVILTVLIVITAVIRKYSSQVTELVEERQALFKKATEQLYDSIYELNLTKN SYVGKQTEEYFKSLGAGGLPFDQGLCVIAREQIKEEFREGYVSMFTPENAIREYNAGNNH LQYDFMISQNGEDYHWIRVETFLFYSEEDSSVHMFSYRRNIDAEKKREWQAAIDEMTKLL SKKATERLIDKQLSESPDKMYAFFIFDIDEFKMVNDRYGHSFGDFCICSFSDILKSHFRE GDILGRIGGDEFAAFIQVSDTECVEEKVKILSSALHTVIGRGDIRCSVTASIGIALYPRD GANFEDLYQQADRALYETKQRGKDGFTIVS >gi|157101648|gb|DS480676.1| GENE 132 140321 - 142168 1445 615 aa, chain + ## HITS:1 COG:SMc00038_4 KEGG:ns NR:ns ## COG: SMc00038_4 COG2200 # Protein_GI_number: 15964712 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Sinorhizobium meliloti # 360 611 1 252 257 196 38.0 1e-49 MKRFFRGGLISLFILLFLFGSITIQTVIQTQTNGRLINYVGIVRGASQRLIKLELSGQPD DDMIAYLDSILLELQGGEATYGLPSPDDPAYQRDLAKLELMWSQVKDEISAYRSGSADSS RLLSLSEDFFEQANHTVFSADAYSARQMRFLLTGCLVMMGVMSLTWIFILWANSKNLLKL EVQNEQLNDLASRDTLTGVYLFSAFKEKAQQLLDTRPEKFAVVYTDFSDFKYINDVFGYD YGDLLLSQYGKLLLKGLHEYELCGRVSADNFVLLLHYNEKSEIAARQADADYEILRFMHT AHDGQMLPTNCGICCIEDVIESLTIDGLLDRANFARKTVKTGNKQNYVYYDESIRSHLRE EKDIENRMLTALENREFTVYYQPKVNLATGKIGCAEALVRWQSQAGAIIPPDRFIPVFER KYMIDRLDQYVYEEVCRWLRSLIDSGRKPLPVSVNVSRLQFYDQNFVDRYVSIRDQYAIP PELLEVEFTESIAFDNTARLLQTVARLKKAGFSCSIDDFGKGYSSLGLLKNLPIDVLKID SLFFDEGPDQNKDMALVEGIIDMVKKFDIRTVAEGIESMDQVEYLKQIGCDYVQGYVFYR PMPQSDYEALIIQGC >gi|157101648|gb|DS480676.1| GENE 133 142265 - 143365 799 366 aa, chain - ## HITS:1 COG:lin2983 KEGG:ns NR:ns ## COG: lin2983 COG2207 # Protein_GI_number: 16802041 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 63 343 41 323 326 98 25.0 2e-20 
MTTGELENQLMEFNEDERFYQQYYNVKQDKWMLEDFLESLDFNEVIQRRLIVPEVSGGWM PSDMKDEAYFDAEDKNHIVISKHNRFTPPFRHRHIFFELAYVVKGHCSQDVGKDRIEMEE GDFCLLAPDTIHSVSVFDDSLLVNILVRRSTFEDIFFNMLRDTNMIATFFNQSLYSGVHN PYLIIPARGDQVLKEYVLSMFLEYLGKSRYYEKILNNQLMILFAKILQSYEDRIQLPSVM RRATEESIRILSYIEDNYQSVTLKQTAAQFHFSQPYCSKIIKEYTGKSFTQIVQEIRFQK AAILLKNTNISIAEISSRVGFENVEHFNRMFRKLYEMPPGKYRKGNTGSRLFTGPGGESR TGPQAL >gi|157101648|gb|DS480676.1| GENE 134 143503 - 144858 1228 451 aa, chain + ## HITS:1 COG:CAC3422 KEGG:ns NR:ns ## COG: CAC3422 COG2211 # Protein_GI_number: 15896663 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Clostridium acetobutylicum # 17 451 10 442 445 258 39.0 2e-68 MKQDSTMRYVKQQIWKQRIGYGVSDFACNLIWQMISLYLLYFYTDVMSLSAKSIAFMFVV TRFIDGGTDLLIGYCIDHTRTRWGKSRPYFLFGAIPFALFAVLAFSVPDISQSGKLIYAY ATYIGLSFTYTVVNIPMASILPSLTTDTQERTALSTTRKFFAFLGSTVVSSTALKLVDSL GKGSEALGFRIVMIIFGFIGCLLFFFTFATVKERVNTAAESVSLKESLMSLGKNKPWLIF ALNIIFMWTGFFLQTSAIVYYYTAVIGSKELSVTIATIMSIVPMAANFLVPALAGRMGKR NLYVGSAAIQLAGLVVILFAGINHGAILAGAVVTALGYGMKESIYFSMQADPVDYGEWKT GVNTAGSLSAINGFLGKVAQAAAGGISGVLLAWGAYQGGASAQSSNAVLAIKAMYLYIPM LLLICSIVTMLFYNLDKIYPQIKAVLDSRNE >gi|157101648|gb|DS480676.1| GENE 135 144871 - 146511 969 546 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_5097 NR:ns ## KEGG: Pjdr2_5097 # Name: not_defined # Def: alpha-L-rhamnosidase # Organism: Paenibacillus # Pathway: not_defined # 13 545 1 512 513 578 53.0 1e-163 MAQITNHIENFKMNINEDFVRKAQELEPALHYKTVRPAQLVSFQPSDAPDKPNGWDLPVH NRADRVFDWIPCRMQPVEALDTMPLSKGSSVCLDFGTHHVGYISFRLTPAGSPPDAPAHI RIKLGEMPCEIMEDTASYRGSISSSWIQEEYLHIDELPARISLPRRYAFRYLELKVMDTS PKYQVLLSDVSCDTVTSGDMSQVDPLNENVPEDLKRIDAIALKTMEDCMQSVFEDGPKRD RRLWIGDLRLQALTNYETFKNYDLVKRCLYLFAGLTQNEHHVGACLFLRPAPMVDDTSLF DYGLFFISCLHDYYQATGDRETLMQLWPAAYHQAELALKRLDGRGVVKDSGDWWCFLDWN DSLNKQAGAQGVLIYTLRQALSLARLMCSETSGAESVKQLESWIETAVRGALEYLWDKEL RFFISGQNRQVSWASQIWLVLAGVLDQESNRRLLEHLVKEDPPVGMVTPYMYHHFIEALI QNGMKDTALHYIRFYWGQMADLGADCFWELFNPRDRYASPYGSRIINSYCHAWSCTPSYF 
IRKYFN >gi|157101648|gb|DS480676.1| GENE 136 146667 - 148139 1107 490 aa, chain + ## HITS:1 COG:TVN0402 KEGG:ns NR:ns ## COG: TVN0402 COG1167 # Protein_GI_number: 13541233 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Thermoplasma volcanium # 123 486 23 395 402 179 29.0 1e-44 MEIQLKKNSTTPLYKQIRNTIRSQILSGQLSDGFKLPSERQLVEKLGVHRNTVKKAYEML IHEGLVYASVKAPRGYFVRNSPSEVKPDGKPDTRKAFSSLDKNFNYHFLGTQNTFQRLYN SSYLSDSISFAGVLVNKETLPLSYLHEIMDEITSGDDLEPFWFCDPQGTERMRACLASLL FGRKIYVQPQNIQIITETYEALSNIAFMYLKEGDFAIVEEPVMPAIVNILLHTGAHVLFV PVEKDGIQIEILENLVERYRPKLIYSMPNFHNPSAAVMSLEKRKRLLQCALSHNIPIVED DSLFDFNYSGITLPSLYSMDRTNSVIYIDSFNLSFFPGARVGYIVAPDNVIKTYRRIVNK DQMFLNSLSQYMWARFFEKGYYEKHRQFLIDFYRKKRDLMCSALSCIPGLSFSVPDGGLV IWIRLPEYTNDRQIAAAAERTGLLLMPGNTFFAEGSHGESYIRVSYSSVTDEEIAEGCRR MAEVLNSCRI >gi|157101648|gb|DS480676.1| GENE 137 148296 - 149621 1216 441 aa, chain + ## HITS:1 COG:Cgl0102 KEGG:ns NR:ns ## COG: Cgl0102 COG1473 # Protein_GI_number: 19551352 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Corynebacterium glutamicum # 30 368 71 393 462 125 30.0 2e-28 MEELYKKTLDRLKAELKKQLPEIEANGDWLWHHPEAGFKETETQKYCLEVLKNHGFEACT WPDVTGFTCTYDTGRPGPCVMIMGEMDSLICYSHPDCNPETGAAHACGHSIQVATAMGAF ITLTESGVLENFGGKVMLAGVPAEECLEIEWRTREIQKGRLHYLCGKAELLYRGAFDGVD IVISMHAGLGNDGTITILGSHNGFISKNVTFQGRSAHAAADPENGINALYMANTALTALN SLRETFKDSDTVRVHPIITKGGTVVNAIPEEVCVECLCRAGKTETVLDVSKKFDQAMGAG AYAFGGKALVRTQPGYLPYRPLPELDQAAVQTADDILGKGRVNKGGHNCGSTDLGDLNEV IPGIQVFVNYMYGDHHGADYRIADRKIYETASLFLAAMACRLLDDGGALAKKVKAAHKPL FASAEDYCAYVKSLETEQILP >gi|157101648|gb|DS480676.1| GENE 138 149642 - 150907 1122 421 aa, chain + ## HITS:1 COG:YPO1668 KEGG:ns NR:ns ## COG: YPO1668 COG0477 # Protein_GI_number: 16121932 # Func_class: G Carbohydrate transport and metabolism; E Amino acid 
transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 13 410 7 407 411 142 29.0 1e-33 METPSLKKNLPLLITLTLSGSFICSLPWFRSYYYNPFMETFGMTNTQMGLCGTALGVGGI AAYLFGGIICDYISIKKLIPAAMISTGALGLVMLTIPSPIIMVIIHGLFAITCLMLFFPA QTKAVRNLASVNEQGKAFGIFEGGRGISNAVYLAIAALIFGQMTIIKSESFGVRGIILFY SVMTILLGVIDIVLLKGIDDGKKGDSNDTVNLKMLTKPLKMPAVWLMIGIIFATLTISTG YYYISPYVTEVFGVSALLGAVLSSSSQYIRPIASFGAGVLGDRINNSKVMLIGQIGLAIA LVIILAVPSSMGVLPILVACLMIFFCMYMCVTMHFAIMDEEYFPPECVGTAIGLICAIGY LPEATSPFVAGVILDHFPGAAGYRLFFVYLFAVTVFGLVLTVIWLRRTKARRAEILEKLK K >gi|157101648|gb|DS480676.1| GENE 139 150940 - 151746 766 268 aa, chain + ## HITS:1 COG:no KEGG:Bcer98_0475 NR:ns ## KEGG: Bcer98_0475 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_NVH # Pathway: not_defined # 19 264 18 260 279 173 42.0 6e-42 MRKKLSDMTKQYSSFFLCVILCCLAAEFIGPIKLSLGRFTVTFLPLLYAMLLMLVLYLAK PVRGVGKAYVPAASRMLSMGVVFAMAKLGISSGASIKEMISASPILILQNIGNIGTVLFA LPIALILGMKREAVGMSYALSREPNVALIADMYGSESDEFKGVMVCYIVGTVFGSIFMSV IPPLFVTLGIFRPEAAAMAVGAGSSSMMAAGLAGVIEASPASNPDTLTAFATISNVISSA ITVYLGIFITIPLSNLIYKVMKRKGVPS >gi|157101648|gb|DS480676.1| GENE 140 151743 - 152207 463 154 aa, chain + ## HITS:1 COG:no KEGG:SSP0162 NR:ns ## KEGG: SSP0162 # Name: not_defined # Def: hypothetical protein # Organism: S.saprophyticus # Pathway: not_defined # 9 149 3 143 150 94 36.0 1e-18 MSEKVSFKSLCVEWVISLTVMVILTMACNIVGYGGRFLESIPGMLILCVISFLGLTAKHF IPSSLPAVTYIGIIGMLAAVPASPVSAPIIYWTGKIDLMASITPILAFAGVLLGKDWKAF LSIGWKGVLVSLLVIFGTFFTSGLIAQFMGAGTV >gi|157101648|gb|DS480676.1| GENE 141 152234 - 153835 1219 533 aa, chain - ## HITS:1 COG:VCA0667 KEGG:ns NR:ns ## COG: VCA0667 COG0591 # Protein_GI_number: 15601425 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Vibrio cholerae # 10 478 9 469 513 193 30.0 6e-49 
MTFTIIDAAIIFIYLMGMLVLGFIMGKTNKTQEDYFVAKRSLPWLPIGLSIAATMISCNS FVGGPGWGYHSGLLPFMQNMTVPLALFISTTIFVPMLYDLKLTSVYEYIELRLGPKTRMA AVLGFILNSLIQVSSMVFVPSLVLQRFMGVSINIIIPVIVVIAVVYTMAGGIKAVIWTDA IQILVIWGGLIGGMVFLFSKSDIGFFETVRMAAQAGKLRAIDTSWQFSLTNENGFWAALV GGLFMWSRYFSFDQAQVQRMFTAKSIRELKRSFVVSGIMMNVLYFLFIFLGALLFIQFGG AAFENANDVMIRFIGGLPAGLVGLLLASLFAAAMSSVDSLLNSMSTVFTKDIYERYIEKE KEASLKTSMMVSAVWGILVYIITMLAFSGTSKSILAVIGSYISYISGPTCGIFLLAMLTK KANDKGTVCGAVVGFIITLLFGTLAGFTWMWNSAFGTVMTFLAGYVFSCLFGEPAGEQAR RYTIRGHMACLKEQGRTEENGISLMPFAVDRYSMILLVFFLIQLAVICFFQRA >gi|157101648|gb|DS480676.1| GENE 142 153906 - 154913 852 335 aa, chain - ## HITS:1 COG:no KEGG:Ilyop_1572 NR:ns ## KEGG: Ilyop_1572 # Name: not_defined # Def: hypothetical protein # Organism: I.polytropus # Pathway: not_defined # 1 333 1 334 337 564 77.0 1e-159 MGQYQEAKKVALSYFDALEKAEPEQVLQVFREYMTEDYNWKGVYPFREQNGREAVAESFW IPLKKALRHMQRRQDIFIAGTNEISGEEWVMSMGHLMGLFDQDFLGIRHTRKMASLRYAE FSCVENGKICKSGMFVDLIGLMIQAGENPLPPSTGNYFIYPGPRYHDGILLEDAPEEEGI KTLALVNKMVDDLDELNKSGAMGCPPEVLEKAWASDMIWYGPGGIGASYTIPRYQEQHQL PFRNHLKDKTFNGHVCRFAEGNFACFFGWPNLTNTPIGGFLGLPGGGRGDMQVVDVYCRR GDKLSENWVLIDLPYWLKQQGLDIFERTRQILNHD >gi|157101648|gb|DS480676.1| GENE 143 154932 - 156227 1334 431 aa, chain - ## HITS:1 COG:VCA0669 KEGG:ns NR:ns ## COG: VCA0669 COG0477 # Protein_GI_number: 15601427 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Vibrio cholerae # 25 422 14 416 431 180 31.0 5e-45 MSAIALNTSEAENEIILGKNLSAGRRSLAFLSIVIVYFFYCYNFMVGTFVKPTMIAEAAS GGFGFTLQQSETIFAVMSFGTIPGTIVFGILNAKIGKKYTLIVVASLIALTTFLPMMTPG SYQVWLVSRLITGGTLGGVFGCAMPLVTELFPQKYRGKLAAVLTSLFSLAMIFGGQLYSF MGDSNWQVLMYTAIIPPAVGAVMIFLFVPNDYQNTRQLNEVSKQQNEKISYLNMYKGKYL WIGLGVILLSGANFTAYSAFSNNATTYLRNSLGLSAAVAGGIYSLQGIGQLIGYNIWGFI SDRFGRKKPLIGMALSAVFVFLFMRLGAGNITQLKLVSVFLGFAVGFSGAWGAYYTELFP 
QKFSALSSGISFNGGRIISTFAIPAIAGLGSAAAGMQGIFMVSMAVFVAGAVIWMFLPET YQKNEKAADNE >gi|157101648|gb|DS480676.1| GENE 144 156228 - 157478 1346 416 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_0606 NR:ns ## KEGG: CDR20291_0606 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 37 402 100 465 465 516 65.0 1e-145 MQEKEPGTYEKTGIPDPVIPGKTDMKKGEEKGTKMKEKPKSSGKIYYGETEEVNPVRHAD YNDYLELAGSKQQLPGFDEEYRDIVDYVLKITHRIWEEKGIGVIYDTYHNDVTMHCASSN LVGIKDVVANTLMTLHSFPDRRLIGEQVIWSKHDTKGYLSSHRVLSTATNLGDSDFGKAT GKKVNFRTVIDCAIENNRIYEEWLVRDNLWLVKQLGLDVHEVARNMARGAKGKTPALQSR FGLGESMQGQFFPEKYKAADDSVGEMMLEMHSQVFNYKMINQVKAYYAPDAVVHYICGKD LVGYEEIQGMVISLLASFPNAHHSVDRVTCNRRAGDDEWDVAVRWRLRGLHEGRGMFGAP SMKPVDILGINHYRIAGNQIQEEWVTFDGLDVLKQIYMETEDSYAKTEDNMTEGER >gi|157101648|gb|DS480676.1| GENE 145 157450 - 157578 77 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFYGFYVFTLSLVKGFVLIQNFKRLKKEGGRYDYAGKGTGYI >gi|157101648|gb|DS480676.1| GENE 146 157580 - 158611 830 343 aa, chain - ## HITS:1 COG:TM1200 KEGG:ns NR:ns ## COG: TM1200 COG1609 # Protein_GI_number: 15643956 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 1 293 6 292 333 137 32.0 3e-32 MTIKDIAKNLNVSYSTVSLSLNNDPRVAEKTKKRVIEEAERLGFTFNANARSLVTKKTNR IGIIFSDNFNAPDYRWFFNEIETYSTRAVENCGYDFLIQPNKNIHSQSNILRMVNGKMVD GLVIFSKTLTEAEYKFLGDAGIPYVYVYFKPSFSADIPDHFFWDDNRKGGYWATKHLLEH GHRKILTIRSNDPELKMYDDRTAGYMEAMAEYGAEPAVLTARMDFESQKEFVRQNIEYIR KFTGIFSQQDLPALSIVQQLKKYYGMDVPEDLSIVGYNDIELIHYLDIPLDTIADPREEV ITNAVNSLVCRIEGREDLNPRKIYPRLITRGSVRDAAQAGPVK >gi|157101648|gb|DS480676.1| GENE 147 158991 - 160313 1010 440 aa, chain - ## HITS:1 COG:no KEGG:CD3111 NR:ns ## KEGG: CD3111 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 440 1 437 442 154 24.0 7e-36 MMKLAILAAPGLIPYLSDKLESTGQAAGVKLDLYDYGHLSMLPELYPKLRDKYDGFLLSG LVVQTAIQRYFPEDDKPVAAFDTELEEIYHALLDLLDRDRSLDLTKVVVDIYLQVNEKHN 
CRALLDIKDMDAARQKTMDYWKNISLKEIEGLGDRLFAQIKRRWEAGEITYVISRNSSVQ PKLMACGIPHTFLYPSEKQLLRTVHGLLRQITAQKSADYMAAAVVVARESVKGLVEADEE EELLLQQAVLDYKKETLADFQIQKTTNGLFLFTNLKTVSRMTGNFRHCGLSHYLQRRLRF CTYVAYGVAPELSTAKSSALEALREGVYRKAVYAVVEDSLVGPLMPETEPLKQVPLDEDL MEIAGKAMLAPGTVRKIKTILGIRGNNEITAAELAERLNVKIRHTNHIIERLTTAGYAVQ TGMKSTGTKGRPTRIYRIDI >gi|157101648|gb|DS480676.1| GENE 148 160446 - 161651 857 401 aa, chain - ## HITS:1 COG:CAC1014 KEGG:ns NR:ns ## COG: CAC1014 COG1473 # Protein_GI_number: 15894301 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Clostridium acetobutylicum # 1 396 1 395 396 328 45.0 1e-89 MDIKQRIEEIYPELVEIRRDFHRHPEPGFEEKWTSARIRERLDGWGISYEYPVAGTGIAA MIQGSRPGRGNTVALRADMDALPLTENPACPYGSMNPGFMHACGHDAHVSIALGAARILR ECKAEWSGCVKFFFQPAEETTGGALPMVEAGCMEEPHVDYVIGLHVMPTHESGEIEIRYG DLNASSDHIHIKLRGKSCHGAYPDTGVDAIVMAAAVINSLQVLVSRCISPLDNGVLTIGT ICGGTAGNIVAGEVEMTGTLRTTDRMVRQNIIDTMTRMVKCTCQAMGGNGEVTVTPGYAA LINTDQVVDVLVETASDIIGEEHIHWKKCPSMGVEDFSFFLDKAKGVFYHLGCASRRKKI TAPLHSQDFDMDETCLKLGVEMQTRLALRLLEREVNPAISG >gi|157101648|gb|DS480676.1| GENE 149 161681 - 162973 1141 430 aa, chain - ## HITS:1 COG:PH0722 KEGG:ns NR:ns ## COG: PH0722 COG1473 # Protein_GI_number: 14590599 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Pyrococcus horikoshii # 6 403 6 386 388 253 37.0 6e-67 MGETKMNYNYLEEACQLNDELVAYRRYLHANPELGLSLPGTKAYVMKALSDFGYQPLEYG ESGVAVLAGGKKPGKVFLIRADMDALPIEEESGVPYASRIKGRMHTCGHDNHTAMLLGAA KLLKRHEDEIEGTVKILFQPGEETLEGAKAMIEAGILENPHVDAALMFHDIAGSPVKEGT VGVFGPGAVYASSGRFEIHIKGVGGHGAAPDWTHNPLGAMCAIYQGIHEIVAEHRSPFDS CTMTVGKMSAGDSANIIPETAIMQGTIRTFSKQVGERLRKDLVTLVEYVGKAKGVTAEVT FGTACPPLVVDSNVHKSFLCSMKEMFGDEAWDMRTAWGGAYSKSTSAEDFSYIAEKVPSG FGWIFLGDSRQGRKWGCHNPKVQFNDAYLYRGAAAYTKAAMDWLLENAAAGTVSDCIRQA EETLYEMAGN >gi|157101648|gb|DS480676.1| GENE 150 163018 - 164307 1124 429 aa, chain - ## HITS:1 COG:BH2694 KEGG:ns NR:ns ## COG: BH2694 
COG0477 # Protein_GI_number: 15615257 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus halodurans # 4 417 1 409 418 77 23.0 4e-14 MDAVKEKRYYGWYMVMVSAIIYALVYSTTSTGSVFTISITQEQGFSRSLWSSKSIFTCIG SLISCYMSGRLIQRFGLKRIMLISTAGVASTFFIPLMFTDIRQYYVLSVYSSATWCAATI MSVPILINRWFGPKKKGIAISLALAGSGVGSMLMSPLLSYLCENYGWRKTYIFEGVMLLV VMLPLIYFTVVERPQEKGYTERLGEKESTGSGASAGKTVLTGVPYKEGIKSIVFFTVVMG IMLIQICNASVTNHRTPYLTDLFNGNAVKAASISGIAIGSLTIGKIVLGSLCDKIGVKKG LLLGMTSYFCCLLMLFLSIYSTNFIYLYLMFFCIGGSVGTIVTPLIVSTVFGDKDYGRYL GTIQSITAFGTPIGNMFSAYVFDKTGGYSGAWMVVSTAALIAIPMIFISIVIANRKRAKL PVSMKEATV >gi|157101648|gb|DS480676.1| GENE 151 164610 - 164879 300 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936686|ref|ZP_02084053.1| ## NR: gi|160936686|ref|ZP_02084053.1| hypothetical protein CLOBOL_01577 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01577 [Clostridium bolteae ATCC BAA-613] # 1 89 1 89 89 150 100.0 3e-35 MSPGLLFLAYGVIVIMAMYVFVYIPNKKKHRKMQEMHDSIAAGDMVITIGGVVGRVVKKE GDYLTLVIDEEKEVTMRIVLYAVNQKIDK >gi|157101648|gb|DS480676.1| GENE 152 164910 - 166088 1069 392 aa, chain - ## HITS:1 COG:FN1063 KEGG:ns NR:ns ## COG: FN1063 COG1473 # Protein_GI_number: 19704398 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Fusobacterium nucleatum # 3 391 5 394 394 358 46.0 1e-98 MWEKCKELQEELVRMRRELHQIPELGGELPKTRAYVEEKLKELGIPFVENKTDSGLIATI KGEKEGKTIVLRADMDALPIQEANEVDYISRHEGCMHACGHDTHMTMLLGAAKILSEHKD QIPGTVRLLFQTDEEGSRGAQRLCAEGAMDGVDAVFGTHIGTIISKDIKAGTVISVPGCC MASFDKFVIKVKGIGCHGSTPEKGVDPVNIAAHIIINLQEIIAREIPAVKPSVLTIGHVK AGFAYNVIPSEVLIEGTIRALEEDVRQELAKRIGEIAEATAKAFRGEAEYEMIWGAPPVI NDAGMAKLAADCARDVVGDDMVIDHLDAPNMGGEDFAYYLEKAPGAFMFLSSSNPEKHTD VPHHNPLFNVDEDVFWIGSAIFVRIVERFFKM >gi|157101648|gb|DS480676.1| GENE 153 166136 - 
166801 539 221 aa, chain - ## HITS:1 COG:no KEGG:Amico_1858 NR:ns ## KEGG: Amico_1858 # Name: not_defined # Def: hypothetical protein # Organism: A.colombiense # Pathway: not_defined # 1 220 2 228 229 179 47.0 5e-44 MSDNEIMKQWKKSSIVIGSPTNILAALTSFIPVIWLCTTYDCWPPISTVLTAWGMVAASF GAFYFVEPVSYYAALGMTGTYLSFLSGNIGNMRVPCAALALDVTHSTPGTIQAEVVSTMA ICGSIITNLIGTTSAVIVGGAVLSVLPEFIVVGLTKYASSAIFGATFGNFAVKYPKVAVF ALAIPVILKMTTGLPAAAVIVCTVFGTLLIARVFYKKGLVK >gi|157101648|gb|DS480676.1| GENE 154 166819 - 167538 508 239 aa, chain - ## HITS:1 COG:no KEGG:Amico_1857 NR:ns ## KEGG: Amico_1857 # Name: not_defined # Def: hypothetical protein # Organism: A.colombiense # Pathway: not_defined # 5 232 3 240 243 177 48.0 4e-43 MSNIEQVQQVSNSPILWLLCGVTVVISAIQTVLYMRQCSKTTRAAGLDPKIPKEAFRVGL ISAIGPALGVFIVMVGLMASIGGPMAWLRLSIIGAAATELSAANIGAEACGVTLGTEQYT LICLAVSWFTMALNGAGWLLFTGVATPSLETLRNKVSGGDMKWLVVLSGACSLGIFGYLN AGEILKGMGNTLAVLTGAVSMVLLVKTVCKKYPKLMEYSLGIAMIIGMIFALGYDQLLK >gi|157101648|gb|DS480676.1| GENE 155 167978 - 169612 632 544 aa, chain + ## HITS:1 COG:BH0757 KEGG:ns NR:ns ## COG: BH0757 COG2508 # Protein_GI_number: 15613320 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Bacillus halodurans # 11 540 17 554 557 144 22.0 3e-34 MVTVADFTSDLEQIRCVAGFCGTNREVTSFTIIDTPEILDWLHGGEFVVDSGYITSHNPS MLKGFIASLKEKGCAALGVKLHRYHNEIPGIILDDGNKLDFPIYEMPYNLYFCDFASMIY KRIFEEQLDDMYHVSIAYKDIVNSFSKYKVPERMLSELSLVLRNPVFLINSQFELIEYSY GPSDDFSLQNLLHKEVLMELDKNIQNTLLKRHQQQPYQFHQFSISSPKGEINCILLSSGN IDGDTTFLIIPEINKLEDWHYRVVKDISALFQLTANNKVNTQDFAASVIMAQSNSNETIL RYCKMYNFDYSKHRICLSIEFDSEVPTTSTRKYITREIFSSITTWIMNTYGCNVFFLKTH THYVLYLLFSDVISQETAEYQASEAAKKIWHIFNHNNISTYIGVSLYSSDLQQISKAFFQ AIDAMKLGRKLYPEENVFLFRTIQVYQWLSTMPQESLEEMYRDTIEPLSKAGNSDVNYVH ILRTYIGNHYNISKTASYLHIHRNTLMNYLTRIQEMLPYNIGLPENMLKLQIGIHAMYLL DRNI >gi|157101648|gb|DS480676.1| GENE 156 169946 - 171418 291 490 aa, chain + ## HITS:1 COG:no 
KEGG:Elen_3094 NR:ns ## KEGG: Elen_3094 # Name: not_defined # Def: regulatory protein GntR HTH # Organism: E.lenta # Pathway: not_defined # 1 422 1 421 486 144 25.0 8e-33 MSHDQQLYQYLYQTITTQICNGTFIYGQEFPSQREICQRYNAGITTVRKVMKLLAEEGYI RTAQGQPSVVTYRTSKEAHAEFLVQNREEIADAYKGLGVLMPILYREGAKRCNESDLYWL DKILNHISEQMELPDHYQLANAFFTALLRPLKNQLIIDLEFDSENYLHVPFVPIPGIDNP FATTFERTKTWFSKAIIQIKQKQFDDFYADASKLYRDSGERVDNYLYALGKYTESNSNKN EDIRWFRIKGHSELYSRLAMTIVRRIIAGEFDNEPYLPSIPKLMEEYNITKNTARRAVSL LNSLGIAQTIDKKGSIITPEGYTSFKSELDLAEPVIQERLSFFLEALQIVALTARCCASS ISFIPDYLGQSVESRLRTASDNRITPLSFQLLMNAFIQLLPYQSLKNIFRQLDDLLIWGH YMQAIDKSYYPDYSEIALAMEAVITALKTQRSDALPDAFSNAFALIYQDVCMVLSKLPSG SIFNTLKDPE >gi|157101648|gb|DS480676.1| GENE 157 171538 - 174753 2135 1071 aa, chain - ## HITS:1 COG:all4496 KEGG:ns NR:ns ## COG: all4496 COG0642 # Protein_GI_number: 17231988 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 525 781 311 571 575 177 39.0 1e-43 MEINLSYEQLNAEYSMLMSVLGASVSKHLVDDHFTCIWANDYYYELIRYPKPLYEHLFRN HCDEYFYNNPKGWTRLTEQVHSALEQGKSGYSLYLPMIYPDGGTFWVRLQAVFTDEFING YRVSYTTMVDVTDMMQARLEQKETRQDFDKMMQEQAMIMSALNISVSKHLIDEHYTCVLA NEYYYKLIGYSKARYEALFHNHPDEFFANNPEGWELLTSKVEAVLKNNGDGYEAIVPMKY EDGSSYWVKLVSFFTDEYIDGFRISYTVMTDVTELMRTKNEQEMLMSAMEVSVSRHLMDE HFTVIWANECYYRMIGYTKEDYETRFHNCCDAYFWDNETGLRIINENIQSMTNADEKSYE AYLPLKMPDGSMRWVKIVGFLTDEYMDGIQIAYTTMIDVTDLVQTQKERSIAYENIPGFI VRHRILPQGIVMLDASERIKDIFDVSMEDIAAIDPLEVMSGESRELIISHFPGLRRGEPL EATIRIKDKYDRDRWFHFNSTCIDTISDDPVYLSMFIDITDITELRELQRKLEERTDMLN SALEAAKYANAAKSDFLSRMSHDIRTPMNVITGMAEIAAAHIGDQDRVKDCLQKIRLSSH HLLGLINDVLDMSKIENGNFSISMAPMCLPDELQEIITIVLPGIRKMNQRFDVYLEGPGW ENFMSDALRFRQILLNLLSNASKFTPKDGRIILKIEIQETESAKKAMICFTVSDNGIGMS PEYLEQIFEPFTRERDSRIDQIEGSGLGMAITKRLIDLLGGTITVKSQLGVGTVFDVLMP MDVLADSEEEQKDFRGLHVLLIDDDPVQQACAGSLLRSMGMEVECTDFEEAGKRFLTHWG TENFGFIILGLSLAEVERMEKTEMICRGIHGRVPLVVSAYDVSGMKERAMAWGVCGFMSK PLCRSGVIDCMNRCRIGEELMRPDTKLVRLEKKPGQPDTEPLDFAGSQVLLVEDNELNRE 
IALELLGGFGMELEAAVNGREALKRFEESRPGYYSLILMDIQMPVMNGYEAARAIRSLSR EDAKTVPILAMTADAFVEDIENTKAAGMNGHLAKPLDFKQLVREIGQYLHV >gi|157101648|gb|DS480676.1| GENE 158 174800 - 177259 1797 819 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 261 656 266 663 676 187 34.0 8e-47 MEILNETLKRTGYEYDTLMNLLHVSVSKHILDEYYTLVWCNDYYYELIGYSREEYEALYH NKPAHYYMNDDLNIHDEKLWMQLSARVMETLESGEKSYNLVTRMRRKHNGNHHDYMWVRL NAYLTDEYVDGYQVSYSVMTDITDVMDMKLEQSVTYDSLPGFVSKFRVNRNLDFKLLEAN DRFFDFFGKDSMDGMDNALFRENIIRNMPVFLEHKNAVTNGNSVHFTVQMKNRSGNDAWL QINASCIGYQEGDPVYLAIYIDITNETELRQMQQKLEAQAKELRSALETAEEANRAKSDF LSRMSHDIRTPLNAVLGMKDIAEAHLDDPAKVKDCLRKIGLSGQHLLSLINDVLDMSKIE SGEMVLREDSMSLPEVLENVVAIMQPQFGEKEQHFSIRLRSVVHEQFLSDALRMRQIFIN ILSNAYKFTPVNGTVSMDVCEKQAVDGTALFRFSITDTGIGMKPEYLAQIFTAFSRERDS RVDKTEGTGLGMAITKRFVDMMGGTIEVTSELGKGTCFRVELPLKIEENQFIKGEFPGLR IIVTDDDAVMCEYMAEMLGQLGIHADWVDSGIQTVEMIKTAKQNGEMYDAVLLDWKMPDL DGFETTRRIRELCGGQLPVLIISAYDWSDVEKDAKKAGVTGFLQKPIFVSTLIHGLQRYV LEGQPARKTQGTDTGARFTGRRFLLIEDNAINQEVARELLADMGAEVDTASDGAEGVETF RRSPDGFYDVILMDVQMPVMNGYEATRAIRTMDRDGAKAVPILAMTADAFSEDIQAAKDA GMNGHMAKPLDKAILWREIGKYLNSDFSPETAGKSGVNH >gi|157101648|gb|DS480676.1| GENE 159 177410 - 177580 89 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936695|ref|ZP_02084062.1| ## NR: gi|160936695|ref|ZP_02084062.1| hypothetical protein CLOBOL_01586 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01586 [Clostridium bolteae ATCC BAA-613] # 1 56 1 56 56 97 100.0 2e-19 MEGIAFLGLQVIEQPYLLHRLSKISAIPGIRKKPRKSRLPEPGIYFLCPEKELFFY >gi|157101648|gb|DS480676.1| GENE 160 177594 - 177809 103 71 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRNMQPEAVDDDRRVTMYYKKFRSLNLSALGLGALRLPMEKDNPKSVVRRNSCMYGPSIF KVGAIRFYNID >gi|157101648|gb|DS480676.1| GENE 161 177781 - 178962 1039 393 aa, chain - ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # 
Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 3 383 2 382 384 365 43.0 1e-100 MPYNFDELTERRGTGSMKWDVPEGELPMWVADMDFETAPEIREALMKRAAHGIFGYTVVP DAWREAVTGWWERRHGFVMEKDWLVFTTGVVPAISSGVRKLTTVGENVLVQTPVYNIFFN SIRNNGRNIIENRLLYDGAGYRIDWEDLEKKLSDPQTTLMILCNPHNPTGMIWTREELGR IGELCAANHVKIMSDEIHCDLTEPGYGYIPFASVSETCAQNSVTCIAPTKAFNLAGLQTA AVMVPNEELHHKMERALNTDEVAEPNAFAIEAAIAAFTKGEGWLDALRVYLAENRKTVRK FLNDEIPEMSLVPAHATYLLWLDCRKIIVDASELCRYLRKETGLYLSAGMVYGGNGREFL RMNTACPKERLLDGLERLKQGVKTYEEYAAGSC >gi|157101648|gb|DS480676.1| GENE 162 178964 - 179818 728 284 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2330 NR:ns ## KEGG: EUBREC_2330 # Name: not_defined # Def: hydrogenase-4 component C # Organism: E.rectale # Pathway: not_defined # 2 278 5 281 283 385 65.0 1e-106 MKQINQKFLTGERALFQGKDLKIYDTTFADGESPLKESRNIELYRSMFKWKYPLWYSHNI KMEDCVLFEMGRAGIWYTENISMKNCIVEAPKNFRRTTGIELERVTMPNAGETLWNCYQI RMDQVSAKGDYFAMNSSNLYIDGFELVGNYSFDGVKNAEIHHARLLSKDAFWNSENVTVY DSFISGEYLGWNAKNLTLINCTIESSQGMCYIDNLVMKNCRLLNTTLAFEYSTVDVEVNN TIDSVLNPNSGIIRAGKIEELIMDASKIDVTKTFIQTEELKEAV >gi|157101648|gb|DS480676.1| GENE 163 180166 - 180708 450 180 aa, chain - ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 180 1 179 179 159 46.0 3e-39 MSKRILVLSTSPRIGGNSETLADVFIKGAEEAGHETKKICLYDKKIEFCKGCLGCQTTGK CILRDDAERIIAQMKAMDVLVFATPIYFYEMSGQMKTLLDRTNPLFPAEYEFRDIYLLAT SADEEESSMEAAVKGLEGWISCFEKSRLAGVVRGTGADKKGAIEECGEALSAAYEMGRNV >gi|157101648|gb|DS480676.1| GENE 164 180723 - 181664 674 313 aa, chain - ## HITS:1 COG:MA0419 KEGG:ns NR:ns ## COG: MA0419 COG1073 # Protein_GI_number: 20089312 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # 
Organism: Methanosarcina acetivorans str.C2A # 7 313 17 326 327 357 55.0 1e-98 MANYIAELSKDVERKHVRYNNRYGLAIAADLYQAKAIDGSETHPAIIVGAPYGGVKEQGP CVYANELAKRGFVVLTFDQSFMGESAGEPRNLSSPEIFTENFSAAVDYLGVKVPYVDCDK IGVIGICGSGGFALSAAQADTRIKAVATASMYVMTTAVRGTQDQAALAEMKERLSLKRWE DFEQGYPEYIPSFPEEPIDAVPEGMPEPGAEWFRFYAVKRGHHPNARGGFTTTSDLAFLN ADLTAFVGEISPRPVLFIAGDRAHSKFFSEMVYEKAQEPKELYVVEDAEHIDLYDRTDRI PFDKLERFFRENL >gi|157101648|gb|DS480676.1| GENE 165 181758 - 182249 545 163 aa, chain - ## HITS:1 COG:CAC3298 KEGG:ns NR:ns ## COG: CAC3298 COG1979 # Protein_GI_number: 15896542 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Clostridium acetobutylicum # 7 162 230 390 390 90 32.0 9e-19 MIHSSRIAVKNPLDYEARSNIMWTATWALNTLVSKGKTTDCIVHMIGQPAGAYTDATHGM TLSAVSMAYYRFLMPYGLAKFKRYAVNVWDVDPAGKTDEELAAEGLKQMDSYMKELGLVM SIRELGVTEEMIGGIANGTFVMDGGYKVLTHDEIVQILKESMA >gi|157101648|gb|DS480676.1| GENE 166 182417 - 182647 323 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936702|ref|ZP_02084069.1| ## NR: gi|160936702|ref|ZP_02084069.1| hypothetical protein CLOBOL_01593 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01593 [Clostridium bolteae ATCC BAA-613] # 1 76 1 76 76 130 100.0 4e-29 MANYVENPDSVIQNLEDAGCDMEMVAEFMKLGIAGNRQNQLKLLEQHRKRLLEKVHTNEK RIDCLDYLVFQMNKEE >gi|157101648|gb|DS480676.1| GENE 167 182741 - 184123 1210 460 aa, chain - ## HITS:1 COG:BH2163 KEGG:ns NR:ns ## COG: BH2163 COG0534 # Protein_GI_number: 15614726 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 2 429 5 430 464 166 27.0 1e-40 MQKESIMGTTPIPRLIIKTSIPLIISLLVNNLYNLVDSIFVSRVSEDALTGISIAAPLQM LMIALGCGLAVGLNAMVSKAMGEKQPEEVKKTTSAAIVMAFGAYLINVIVCLLLVKPYFA WQAGGNEVIARYGISYIRICMLFSFGQMLQWVFDRLLIATGKSNLFLATLGTASLVNVIL DPILIFGYFGFPAMGTAGAAAATVTGQLAGGLLGIYLNIKRNKEIPVHFTLHIQKKYVAN ILKVGIPTAIVQGIMSVMGIFINTILYQFSSTAVAIYGILMKVQNLAQIGVQGMNNGLIP 
IIAYNYGAGKEKRIHETIRCAFLYAVIIMTVVLISLELIPDKILMLFDASEQMIRLGIPA VRIMSVSFFVSSVGIIFAAIFQGFGNGNYSMYLAFMRQVILLIPILLIGAAMGNITVIWS GFAVAEALSIPFGMWLYGKVTQNMIAPMAETYGENRVETA >gi|157101648|gb|DS480676.1| GENE 168 184306 - 185193 623 295 aa, chain + ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 288 1 286 291 271 45.0 1e-72 MDIRVLRYFMAVTREESISGAAESLHMTQPTLSRQLMDLENEIGKKLLIRGNRRITLTEE GMLLRKRAAEILDLVDKTEAELTAPDEAIGGDIYIGGGETDAMRLIARIATDLQKSCPDI RYHLYSGNADDVMERLDKGLLDFGILIEPADMKKYDYIRLPATDTWGLLMPKDCLLAAHD FIRPEDIWELPLISSRQTTVSNEFSGWLKKDYDKLNIVATYNLVYNASLMVEEGLGYALC LDKLVNTSEGSRLCFRPLKPKMDAHLDIVWKKYQVFSGAADRFLKNVQDAFHSSI >gi|157101648|gb|DS480676.1| GENE 169 185328 - 185846 280 172 aa, chain - ## HITS:1 COG:no KEGG:Spirs_4177 NR:ns ## KEGG: Spirs_4177 # Name: not_defined # Def: hypothetical protein # Organism: S.smaragdinae # Pathway: not_defined # 39 172 17 152 152 113 42.0 3e-24 MRLYKKKLFAGLVLAAMFVSSAYSAGNLQPEEAARSEHTAETTMETSEIATTEESRAITM NVQIGSSTFTATLEDNTAVDSFVRMMQTAPVVIQMNDYSGFEKVGSIGTSLPASNSQTTT QSGDIVLYNGNQIVIFYGSNSWSYTRLGKIDDLSGWTEALGSGDVTVTFSLE >gi|157101648|gb|DS480676.1| GENE 170 185954 - 186649 641 231 aa, chain - ## HITS:1 COG:CAC0321 KEGG:ns NR:ns ## COG: CAC0321 COG0745 # Protein_GI_number: 15893613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 229 1 229 230 260 55.0 1e-69 MNKVLIVEDDELIAELERDYLEAGGFEVELIFDGNEGLNRAMAGGWDALILDVMLPGKSG FEVCREVRKSLHLPVIMVTAKKEDVDKIRGLGLGADDYLVKPFSPAELVARVKSHIQIHE RLLEHREQQKEDAIIVKNLAIRPKERRVYLNGEEMNLANKEFEVLLFLAENPNIVFSKDT LFDRIWGMDSVGNTATVTVHINRLREKLEKNADSSPFIETVWGAGYRFKVE >gi|157101648|gb|DS480676.1| GENE 171 186651 - 188141 1025 496 aa, chain - ## HITS:1 COG:CAC0317 KEGG:ns NR:ns ## COG: CAC0317 COG0642 # Protein_GI_number: 
15893609 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 1 483 1 482 498 193 28.0 6e-49 MTIRQKIKLSNVMMALIPIMLTAIIVAVCLQTSLGSYWYTLETMYQDENGIQSAQSLIYT YQQELWENNWGQGVHREAGTEIRKSDKMNHLENKLSRMGYHFLITKNGNEIYSNISEEEM RAAESVAGTAMESAKTLTASYHDVSVVKHTFYHGEKAVCILAVHTGETDQEVISYLQNYI LNYIVGFISAFLVLTVCVNGILSCWISKSVLVPLQKLSRGTKEIKEGNLDTSIDYTKEDE FGEVCRDFDEMRAYLKESVEQRLKDEQRKKDLIIGISHDHRTPLTSISGYLDGLMDGIAN TPEKRQRYLSAIKTRAKDLSRLVDSLSEYNRLDSSSFHYHLEPADLKRFIGQYLISYEEE ARQNKVEFKLTSEGGEFPVLLDKEEMKRVLDNLFTNTIHYRRHPHSNVVISLRRVERSTV VEFIFSDDGPGVPEESLDRLFESFYRVDGARSNSGEGSGIGLAVVKEIISGHRGIVHAEN HGGLAIIIHLPMTEEK >gi|157101648|gb|DS480676.1| GENE 172 188309 - 188950 511 213 aa, chain - ## HITS:1 COG:no KEGG:BMD_1530 NR:ns ## KEGG: BMD_1530 # Name: not_defined # Def: LysE family translocator protein # Organism: B.megaterium_DSM319 # Pathway: not_defined # 5 203 5 197 204 121 36.0 2e-26 MLENYLLKGLVIGVVFGVPAGVVGILSIQRVLTQGAFAGFLTGIGSSAADIFYACVGVFG LTFISDILLKHQSTICMVGCLMVVAIGVRTIKKKESHSFASAAENNKEEHPGHIFSCFLS SFVIAITNPATILSFMVVFSMFRIGGNESVGENVQLICGIFAGTCIWWLAIAVIVSLYRK KVTDDVYFILNRIFGALMILFGIVIGVRGCLNG >gi|157101648|gb|DS480676.1| GENE 173 188963 - 190009 778 348 aa, chain - ## HITS:1 COG:no KEGG:PRU_1044 NR:ns ## KEGG: PRU_1044 # Name: not_defined # Def: hydrolase domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 1 348 1 351 355 250 39.0 8e-65 MKKGVRILLIVLMLLISGAAYLYFRIRAGVDSAVNVQKLEDGFYYMEYRDDYGFDRFLEQ GGAASDLDVAKFVGKALFKGFLYPRFLGGSFGCSTLSARNPKGGVMYGRNFDWMECTSMV VKSKPERGYASVSTVNLDFLNLGTEYDPEKTVSKIISAAALYAPMDGINEEGLCVAVLMI DDSAVTEQDTGKPDLTTTTAVRLLLDKAASVDEAVQLLGQYDMHSSAGMMLHLALSDKSG RSVAVEYVNNEMSVVETPVVTNFYLTQGDKYGIGTEQSHTRYEMLLERLSEQPAMDMENM KDAMSSVSKNNFGEFASTEWTIVYSQDSGEIRYYHREDYDNYYSFSVE >gi|157101648|gb|DS480676.1| GENE 174 190321 - 190914 494 197 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936715|ref|ZP_02084082.1| ## NR: gi|160936715|ref|ZP_02084082.1| hypothetical protein 
CLOBOL_01606 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01606 [Clostridium bolteae ATCC BAA-613] # 12 197 25 210 210 362 100.0 7e-99 MLALAAASMLAGCQKPDETRRENSNPSTSAGTTMSAGRDNSTGNAGTKAEEAGEELTGAL ALEVRFGDDGAPFVMQMEDNATADAIAGYVGTANWRLPIYHYDDYENWEVMQYYDIPSHY EIPSEPEMVTAEKAGEVYYSEPNRIILFYQDGEVTGEYTKIGTIEDTGEFREAVESNPVL EGWGNKIVLISSLVNLF >gi|157101648|gb|DS480676.1| GENE 175 191084 - 193441 1521 785 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0330 NR:ns ## KEGG: Ethha_0330 # Name: not_defined # Def: protein of unknown function DUF214 # Organism: E.harbinense # Pathway: not_defined # 139 785 137 780 780 123 23.0 3e-26 MEDTILLMADLKRHKGSLSGIFILVLLVCTALGTVLSIWMNSERYLREEMERAGFGTLTA WVSNVPDMDALADNIADLEAVGNVDTQVVIYSDYTVNDQDSDSEGQLIPHGTEDRRYKFF EDELSGYRGDIPEIRPGEIYVSPSMISMFGVQIGDEIRFQVARGGQAADFTVKGFYEDPF MGSSMIGMKGFLISEADYSGIIQTIRDTGIDALARDGAMLHIFQADTNGEGVTVSQLNQL INENTELPQYLEFLHSGNAIAGFMLILQNAFSGLLAAFVVVLIFVVLIVLGHGIGSSVEA DYVNMGILKTIGFTGGKLRRIQLAQYMLVILTGMVLGILAAVYVSRIVSAATLTTTGIRI PTDLPAAWCFLINGLLFLVLTGFIVFKTGKLKRITPMKAIGGEAFIYGTRVKKGFPIRGN YLKYSMALRQLTAGARRYSGACIVAILLVFFASMIGRMDTWLGADGKGMMDAFNPADHDI GVQIFGEHTPQEAEETVLSYTGITDRYLLAMQGVSVNGIDYTANAITDPERFHIVEGRTS KADDEIVMTEFVASDLGVAVGDTVTVQADLGSEEYVISGIYTCANDMGDNIGMSREGYLK IGSDDPAIWCHHYFLEDASQKEAIIVALDESYGGDVHVHENTWPGLYGIISAMQALLVFL YGMVAAFILIVTIMTGSKILSAEQRDIGIYKAVGFTSGQLRGTFALRFMIVAVLASIIGT ILAAALTDSLVSIVMRMAGISNFSSGMTAGNTLVPSAVVILLFTGFAYAVSWKVKKVDLT VLITE >gi|157101648|gb|DS480676.1| GENE 176 193457 - 194218 301 253 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 5 241 1 237 245 120 29 5e-26 MKKTLLAAKDISKEFGKDRGAVPVLDHVSVDIYEDDFTIIMGPSGAGKSTLLYVLSGMDR VSSGTVVYKGKVISRFKESQMAKIRSREFGFVFQQTHLVSNLTLFENVTVAGYLGKERKP EETRKQAEALLEQMNVGSAKDRMPAEVSGGEAQRAAIARAMINEPGIIFADEPTGALNKR NTGEVLDLLTELNKEGQSILMVTHDLRAAVRGTRLLYLEDGKILDELSMPVFHPEDERTR EAAINDWLASFQW >gi|157101648|gb|DS480676.1| GENE 177 194349 - 195692 1055 447 aa, chain - ## HITS:1 
COG:MA1063 KEGG:ns NR:ns ## COG: MA1063 COG1541 # Protein_GI_number: 20089933 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanosarcina acetivorans str.C2A # 28 447 44 443 445 93 22.0 7e-19 MNYFNLLWNLYQLKRNTGMRREQLITLQEKKLRELLVYAYDNSTYYHRVFEEAGITRKQI PLMPLSAFPVLDKQLLMEHFNELVTVSDLKQEDLRRFDREESTEQKKFKDEYHVVHSSGS TGTPGYFVYDEAAWSQMLLGIIRAALWDMTMPQILKLLWKGPRIVYIAATDGRYGGAMAV GDGIEGVHADQLFLDVKTPLAAWIRQIKEFKPNMVIGYPSAIKILGELVEKGEVSVDIFR IVSCGEPLGASLRNYLETIFEADVVNIYGASESLALGVETSHAEGMYLFDDLNYIEVENG AMYLTSLYNYVQPLIRYRISDQLNLREPEDGSPYPFTLAGNIMGRNEDMMWFEDGRGNRD FLHPLVIEGFCLDGLLDYQFRQLDSHSFEMLAEVSDTGKIKEIREEMMKQMALLLKEKSL EYVRFSIRFVGEIRPDQRTGKKRLIVA >gi|157101648|gb|DS480676.1| GENE 178 195755 - 196888 790 377 aa, chain - ## HITS:1 COG:no KEGG:PRU_1044 NR:ns ## KEGG: PRU_1044 # Name: not_defined # Def: hydrolase domain-containing protein # Organism: P.ruminicola # Pathway: not_defined # 63 377 38 354 355 213 37.0 1e-53 MNKMKRMLLFTTLTITGVLTGCAGGAPVLQQTAYAQNNGALTQTVTAAGKASDAAITPGS EIMKLEDGFSAVSYDGNYWFDDFLAQGGATNDAGVIDFLTRHMMAGESRPSLRTGGFGCS TLAVKSPAGEALFGRNFDWQPCEAMVVRSRPEGAYASVSTVNMDFIRSGYGAGLSQLPDE VRTLIALYAPLDGMNEKGLAVSVNMIQDYDAICQDTGKPNLTTTTAIRLFLDRAANVEEA LALLGQYDMHSSMGMMVHFALADRSGRSVVVEYIDDEMVVTDTPAVTNFYLAEGRKQGIG TAQSHTRYERLMKRLAEDEAMDMDQVRDALDSVSKDNFGEFESTEWSIVFNLNNGEACYY HRENYGNSYVFSVNEEA >gi|157101648|gb|DS480676.1| GENE 179 197223 - 197747 484 174 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936721|ref|ZP_02084088.1| ## NR: gi|160936721|ref|ZP_02084088.1| hypothetical protein CLOBOL_01612 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01612 [Clostridium bolteae ATCC BAA-613] # 1 174 1 174 174 333 100.0 4e-90 MKKAGLKKYAVAIMVVIGLTAALTGVASAEMTGEVVTASQLEVRFGDNGAAFTMTLENND TAAAIKKHVGTADWRLPIYHFDDYDNWEVMQYYDISSRYEIPSVDVKTVTSQKAGEVYYS HPNRIILFYGDGEVTGDYTKIGEIEAADDLKKAVVNNPILQGWGNKIVQIKSVK >gi|157101648|gb|DS480676.1| GENE 180 197802 - 198995 785 397 aa, chain - ## HITS:1 COG:PA2218 KEGG:ns NR:ns ## COG: 
PA2218 COG1073 # Protein_GI_number: 15597414 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Pseudomonas aeruginosa # 79 394 50 365 367 344 55.0 1e-94 MRKNSQRIIAGALMLTVLVSTGCSVQNSQQTVTEQQTQQSTTSETEPETGAPVADTDETT DGVVNYGMMEDTSIKVEPIELTDEWDKVFPLSDEVNHRKVTFVNHFGNTLAADLYEPKEY TGKIPAIAVSGPFGAVKEQSSGLYAQEMAERGFLAIAFDPSFTGESGGYPRYFNSPDINV EDYQAAIDFLSTQDNVDPEQIGIIGICGWGGMALQTAALDTRVKATAAMTMYDMSRNTAL GYFDSIDEDGRYESRVAYNQQRTDDYKNGTYTLGGGLPEEAPEEAPQYVKDYVAYYKTDR GYHPRSVNSNNGWAATAPGSLMNMRLFEYAPEIRSSVLLVHGEEAHSLMYSHDAYELLQG DNKELLIIPGATHTDLYDQMDKIPFDKLESFFTEAFQ >gi|157101648|gb|DS480676.1| GENE 181 199178 - 199717 468 179 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 11 175 78 242 245 184 51 2e-45 MGTETEAGVKIRRKTAFIFQNYNLFRNKTALQNVTEGLVIARKMPRSQAEKAGKRALDKV GLWDRYDYYPHQLSGGQQQRVGIARAIAMEPEVILFDEPTSALAPELIGEVLEVMRKLAG EGMTMLVVTHEMNFAKNVGTKVIFMDKGVVVEENTPKEFFESPKEDRARQFMSSIKWSG >gi|157101648|gb|DS480676.1| GENE 182 199690 - 200310 533 206 aa, chain - ## HITS:1 COG:SMc03893 KEGG:ns NR:ns ## COG: SMc03893 COG0765 # Protein_GI_number: 15967029 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Sinorhizobium meliloti # 66 203 78 215 226 155 57.0 4e-38 MTAAAVLTTGLLAGSGSLNAGAGTIPAETTVSGTTVEAQTAGEDAGTEGEKAADGESTDG RCEGAGLPGLGITLDAFPSAVVVFSVNTGAYASETIRAAIEAVPAGQLEAGYCVGLSYTQ TMMRIILPQALRIAFPPLSNSLIGLVKDTSLAANITVMEMFMSAQRIAARTYEPFALYRE VGIIYLIFCTALTRLQSVWEQKLRLE >gi|157101648|gb|DS480676.1| GENE 183 200965 - 201867 367 300 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936728|ref|ZP_02084095.1| ## NR: gi|160936728|ref|ZP_02084095.1| hypothetical protein CLOBOL_01619 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01619 [Clostridium bolteae ATCC BAA-613] # 1 300 1 300 300 566 100.0 1e-160 MRRITGMFVCISLLLNLSFTSYASERLIHSEGNEYQEILSMKDEVLYHINYGGARGELNR 
DVEADEVEMERAYKVYANSELLKTNDVKESLENSYYKWQIPVYTDGFTILVDITKVISIP ADIPEDAKETLKQKLNIWTVGAVYVYDAETVDYDNTVTTSLEEAGYNSDEYSYEVVSGLS GIRYPAAIIFNADEKPEFVIPAQKSTTHAFIGEWPTAAKNREKTATPLNAQNSNNDNING FSIYYYDDVVRVSNSYNQSGVGLSYKKDIIEKNTIAVSIFGAAGILLIIGTKKRNKKRMK >gi|157101648|gb|DS480676.1| GENE 184 201917 - 202141 101 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936729|ref|ZP_02084096.1| ## NR: gi|160936729|ref|ZP_02084096.1| hypothetical protein CLOBOL_01620 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01620 [Clostridium bolteae ATCC BAA-613] # 1 74 1 74 74 115 100.0 8e-25 MCFRTSQRRTEDEKKTDLKQAIFYVICAAVIISKPTTVSASNEQTAVEDTGEQQPSLRQE IPKLKKENWVNQTA >gi|157101648|gb|DS480676.1| GENE 185 202355 - 202597 185 80 aa, chain - ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 66 1 66 291 64 46.0 5e-11 MKLSSLRYYIAVAKYGNFIKESDWLFISQPTLSRTIQELKEELGTQLFIRERRFLKLTED GVWLLNGKPCSRNISKVVNA >gi|157101648|gb|DS480676.1| GENE 186 202650 - 203087 200 145 aa, chain - ## HITS:1 COG:L61341 KEGG:ns NR:ns ## COG: L61341 COG2936 # Protein_GI_number: 15673217 # Func_class: R General function prediction only # Function: Predicted acyl esterases # Organism: Lactococcus lactis # 1 143 445 571 572 140 50.0 8e-34 MQKLDQNGNMLLEITVPNHGAAIRDFTDDGASITRYKGSWGKLRLSMRHLDEQEITDEIP AYSFDRVEKLEKGQMVEDDIVMSPVGMAFHKGESIRLILSSKKEYRNSMMQPPPPTGCIP KNWGWQVIYTGSDKASYLQLLILKI >gi|157101648|gb|DS480676.1| GENE 187 203186 - 203380 61 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936732|ref|ZP_02084099.1| ## NR: gi|160936732|ref|ZP_02084099.1| hypothetical protein CLOBOL_01623 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01623 [Clostridium bolteae ATCC BAA-613] # 9 64 1 56 56 110 98.0 3e-23 MHHLFRAVVDFIFRAIFYKYISGLTECGNVTYTKLYLDGTFRQLKREIPDKGAAAKYDTQ GLPG >gi|157101648|gb|DS480676.1| GENE 188 204237 - 205634 1084 465 aa, chain - ## HITS:1 COG:BS_ydjD 
KEGG:ns NR:ns ## COG: BS_ydjD COG2211 # Protein_GI_number: 16077683 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 2 463 5 460 463 176 26.0 1e-43 MEKMNNRYQSNEDLNTFRLTPMKRMYYALGDFGYNFMYYWLSTYLMIYYTDTIGIPAATV SVMMLVVRIFDAFNDPVIGSLADRTNSRWGRYRPWFMLGSIAMACFIVLIFAASPGWQMT SRLLWMWGIYLLLTVASTCSNMPYGALNGCITPDSEDRAKVSGLRMMFANVSSMVTVIIA VPLMIAFSSDGSASSARGYFWAVLITCILGLPTMIVSCCKTKEVLTPPPTQNKIPMNLQM KSLVKNKAVLILIIGQFILGCVLYGRAAMLAYYWQYNAGNAAYATTYGLIGLAATIVGTG WFGNFLFKRVRHKGKVCAVLNFITAGAYFFMYFTTAPSVMFWFLTFVSSMSFYAYMGIHF GAIGDAVDFGEFISGVRCDGFLSSFVSVANKAGGAVMPAIGAAVLAAMHYVAGGAQPAQI LNAVNWFITLIPAILSLINAVFYLMYPISTEKHKEIMGELIRRRQ >gi|157101648|gb|DS480676.1| GENE 189 205646 - 207121 827 491 aa, chain - ## HITS:1 COG:BS_glpK KEGG:ns NR:ns ## COG: BS_glpK COG0554 # Protein_GI_number: 16077994 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Bacillus subtilis # 3 483 4 490 496 422 43.0 1e-118 MGYILGIDQSTQGTKAILVDNKGRLRSRADQSHRQIVNERGWVSHDLEEIYQNTCRVIQD VVKRTGISKNEIEVIGISNQRETTAIWDRDGTPLNGAVVWQCSRAKEVSQALAPYGDLIR NKTGLMLSPFFPAAKMAWLLSNTEGAKERAPEDLCLGTIDSWLIYRMTRGKAFKTDYSNA SRTQLFNIHTLSWDEELYELFGVSAKSLAQVCDSNSLFGMTDLDGYFDHEIPIHSAAGDS HAALFGQGCHSAGMIKTTYGTGSSIMMNTGMECVSSSHGLVTSLAWGLDGQVNYVLEGNI NYTGAVMTWLKDELHLISSPAESEAFAKAANPVDKTVLIPAFSGLSAPYWSDTAKAVIYG MTRTTGKNEIVRAALESIALQIQAVLEAMEKDSSICIKELRVDGGPTRNSYLMQLQSDLS KLDVAVAGIEELSAMGAAYMAGIGAGILNQSELFSEEGRTLYSPQMAEEERKEKIGNWNE AIRLVCGATVQ >gi|157101648|gb|DS480676.1| GENE 190 207125 - 208060 756 311 aa, chain - ## HITS:1 COG:TM0953 KEGG:ns NR:ns ## COG: TM0953 COG3958 # Protein_GI_number: 15644625 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Thermotoga maritima # 3 311 2 310 311 290 48.0 2e-78 MNKITNRQAICDTLLEASKTDRDIVVLCSDSRGSASMTPFAKTYPEQFVEVGIAEQNLVS ISAGLAACGKKAFAVSPACFLSTRSYEQAKIDVAYSNTNVTLVGISGGISYGALGMSHHS 
LQDIAAMCALPDMRVYLPSDRFQTGKLIEALLQDEKPAYVRVSRSATEDIYEEQMKFELD KAHVLSEGEDAMIIACGEMVPYALEAARILEKDGIRVGVVDMYCLKPLDEEAVLQYASRV KCLITVEEHSVYGGLGSMVAAVTAEKHPIKVKKIALPDGHLIPGSNTELFAYYGMDGAGI AETVKKTLTEG >gi|157101648|gb|DS480676.1| GENE 191 208057 - 208875 680 272 aa, chain - ## HITS:1 COG:TM0954 KEGG:ns NR:ns ## COG: TM0954 COG3959 # Protein_GI_number: 15644626 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Thermotoga maritima # 4 272 12 278 286 285 48.0 1e-76 MDALTRKAYELRRDIIEMVHNSRAGHVGGDLSVTDILTVLYFKVMNVSPEKGENPDRDRF LLSKGHCADALYCVLGERGFYDKEETIKTFSQFGSRFIGHPNMEVPGVEMNSGSLGHGIS VGVGMALAARMNHQSYRTYVVLGDGEMAEGSNYEGMMAAGHYKLDNLCATVDLNRLQISG TTGQVMDSASLADKFRDFGWNVIEVLDGNDCAQLVNAYEQAAAYKGKPTAVIASTVKGKG VSFMENQISWHHGVMTEEQYEQAVKELKEALQ >gi|157101648|gb|DS480676.1| GENE 192 208922 - 209701 930 259 aa, chain - ## HITS:1 COG:YPO3351 KEGG:ns NR:ns ## COG: YPO3351 COG1028 # Protein_GI_number: 16123501 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Yersinia pestis # 4 258 5 255 256 202 43.0 4e-52 MIQYEKVNGNFDLTGKAAIVVGGAGGIGEATAQMYARKGARVAIVDCKDNCPEVAAKIAE EFGVQVIGVKCDVTSQESVDAMMKAVLEAFGEINILAAMPGFVDLYRASDISIATIEKHM SINAVGVFRVCQAVANQMIKQGKGGKMVCVTSQADFIAINKHVGYTMSKAAVVGMIKVCA LEWADYGINVNGVAPTVVNTYMGEKAWQGKVKTDMIEKIPAHRFAETDEIAAAVLFLSCN ESNMITGEDLVVDGGFTIY >gi|157101648|gb|DS480676.1| GENE 193 210008 - 211048 778 346 aa, chain + ## HITS:1 COG:VCA0132 KEGG:ns NR:ns ## COG: VCA0132 COG1609 # Protein_GI_number: 15600903 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 2 339 3 330 334 135 26.0 8e-32 MTNREIAQKLGISPAALSLIINHKPGVSDSTRDRVLTELQEMGCDNLIKKAPAVPSNNLC FIIYKRHGEVLDLHPFFLLLMESIETRARSYGYNILLCNLDKRKSMEPQLAHLEELDCQG 
AIVFATEMEDEDMELLQELPIPMVALDNDFSRLSCNSVSINNQMGTFQAVEYLVRMGHRR IGYLKSLIRINSFKERQSGYEDALAHFGLSFSDGHILDVHYSEEESYRDIRQFFDSHPSY DIPDAFVCDDDTMASGALRAFSEHGYKVPGDISIIGFNDRPNCEVTTPPLTSINVSKQGL AGESVDELMRMIQNPDKSTQEFRSRKIRIGTKLTVRESVSPPSKNV >gi|157101648|gb|DS480676.1| GENE 194 211011 - 211352 168 113 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266621899|ref|ZP_06114834.1| ## NR: gi|266621899|ref|ZP_06114834.1| ATPase [Clostridium hathewayi DSM 13479] ATPase [Clostridium hathewayi DSM 13479] # 4 55 268 319 642 66 59.0 7e-10 MENWEKKDRTFAEADKLLINQSLETIDGNFDNLPTTHPAKRPYQKYGSNYNRKINNQKKV VKNPVWMTNLRKRACSYCNELAAVYGDGYSPAVSNYRLLLDIIHFLREGTPIL >gi|157101648|gb|DS480676.1| GENE 195 211396 - 211599 115 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936741|ref|ZP_02084108.1| ## NR: gi|160936741|ref|ZP_02084108.1| hypothetical protein CLOBOL_01632 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01632 [Clostridium bolteae ATCC BAA-613] # 1 67 1 67 67 109 100.0 5e-23 MGCFMDCVPEASLMAEKLTGGLRRLVIEKCLGPGGIQMKASVRGEIQRGKICGKPLTGGE KRGRFRA >gi|157101648|gb|DS480676.1| GENE 196 212050 - 212616 164 188 aa, chain - ## HITS:1 COG:no KEGG:Clole_2805 NR:ns ## KEGG: Clole_2805 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 187 1 186 186 212 55.0 6e-54 MEDVYKNCPIYEDKDYTLRLVRLEDKVDLLKVYSDEKAIPFFNGDNCGGDDFHYTTENRM EQAIEYWLWEYSRQGFVRWTIVSKAAHEAIGTIELFHREGGDYFTNCGLLRLDIRSDHEI QSEIIQILALIIEPTYTLFHCNKIATKAIALATERIKALHILGFEPTDEKLVGHDGTKYD SYFVVKKN >gi|157101648|gb|DS480676.1| GENE 197 213114 - 213662 565 182 aa, chain - ## HITS:1 COG:slr2099_1 KEGG:ns NR:ns ## COG: slr2099_1 COG0784 # Protein_GI_number: 16330585 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Synechocystis # 4 125 10 130 130 79 37.0 3e-15 MAAKKKILIVEDNLINREVLGSILSPVYEVLEAENGEEGLAFLKEQKAGLSLILLDIVMP VMDGYAFLSRVKADAELALIPVVVTTQGSSEADEVAALSYSAADFVSKPYRPEIILHRVA NIINLRETAAMVNELQYDRLTGLFSKEYFYRCVRDQLDAAPGKMYDIICSNIENFYWQAV FF 
>gi|157101648|gb|DS480676.1| GENE 198 213715 - 215865 1861 716 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 275 686 326 749 776 179 30.0 1e-44 MQESFFEKTIIDQQHLIAYASDPDTYEILYMTQAAASLCGLNAREAYGKKCYELIQGRNA PCPFCTNDKLKLDVPYRWEHYNEKLQRWMDITDILVPFQGKNRRLEFARDITEQKETFDR VSNRLSVEEALVDCIQTLSGETDVTVAVNRFLALIGRFYAADRAYIFEYGENVIYNTFEW CAPGVSREIENLQEIPLEYIANWNDKFEHDGEFYITSLGRELVVDSNEYRILHDQGIESL SAAPLMKQGKIIGFLGVDNPTENVEDLSLLRSVCGFVLDEMERRRLIEELERSSYTDLLT GVSNRNCYIKKLNQLSCSTLHTLGVIYIDINGMKKLNDNFGHEYGDRVIQNVADILRARA ESNAFRVGGDEFVVLCENTKQEDFQALAESLRRDFEESRDFEVSIGCTWKDRDISVDNEI MRADDLMYAQKQGYYHTILHGGGHTRTGIATEILREIADRRFVIYYQPQISLKTGRIMGA EALIRKRDDSGRLIPPDQFIPYYERERVISHLDLHVFQTVCSDLREWAKQGVKTRISSNF SRMTLMASDIVQQLAQICRDYDVPTQQLTIEVTESISKLEAGQLLTLMRLLLGEGFSISL DDFGSQYSNLSILTMLEFNEIKFDKSLVEELGSNIKSRVVMKNAMQICRELPKTCSLAEG IETIEQLELLRQYQCEYGQGYYFAKPMKVEAFLDLLKKEQALDAPVLTGNNPPDEE >gi|157101648|gb|DS480676.1| GENE 199 215943 - 219155 2562 1070 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 473 920 214 663 676 185 30.0 5e-46 MLENEETVKQWEKQAGTDYEYRILVGLLSVGISKHLLDGHFTLIWANDFFYSLIGYSQEE FATRFHNCPDEYFYNNPEGYQALEKSVRDARANGEKGNSICVRLVTSDGCSLWVKLQAAF TDEYVDGYQVAYTTLTDVTEMMEAQLEQEHTQQTLEQMVHEQEMLMSILKVSVSKHMVDE HYTCVWANEYYYQLIGYPKEKYESLFHNHPDEYYQNNPEGWEQLTATVADVLENGKDQYD IITRMKHEDGSSFWVKLFSYFTDEYIDGYRSTYTVMMDVTELVQMKNEQEMLMRAMEVSV SLHLVDKHFTLVWANDFYYDLIGYSKTEYETLFHNHCDEYFAENKNSWNKIHEKVQEITE AGKRSYELFVPLRIPDGSTRWVKMTGFFTDEYQDGKQMAYTTMVDVTELMQIQQEKAVAY DNIPGFIVKHRILPDKIVMADASDRITDIFNVDTGKLDSFDIYSVLEPESRTMIEASHES 
FRQGKPFDGTICLRDRYTRERWFSIHSTCIDSIADDPVYLTVFIDVTDVTELRVMQKKLT EQKTALQDALEAAKHANRAKSDFLSRMSHDLRTPLNAIQGMARIIKSHVYDPERVLDSTD KIMLSNDLLISLINEVLDTSKVESGQMLLAEEEVNLAELVQGVVNMVQPQLGEKNLRFKT YANRITHETVVSDLQRLQQLLLNLLSNAVKYTPEGGSITLEINEKPSEQTDMAYYEFVVS DTGIGMKPEFLARVFEPFERADDAKIQAVQGTGLGMSICKKIAELMGGTIEVESTYGKGS RFTVSVYLRVQEVKIDDGVLAGLRVLVVDDDEIACRNTCERLEELQMTAKSVSDGQTAIS EVEAAHAVYQDYFAVLLDYRMPGLDGVETARQIREKVGNSLPIIMLSAYDLSDQVDAAKE AGANGFITKPLFRSRLVYKLKQFIGAAGMEEPETWKPARCSYAGKRILLVEDNALNQEVA IEMLVESGIPPENVDVAENGQAAVDRIKASRPGTYDLIFMDMQMPVMDGCTAAIQIRALP RDDVKTVPVIAMTANAFDDDRKKTKDAGMNGHLAKPVEPDQLRQVLETWL >gi|157101648|gb|DS480676.1| GENE 200 219160 - 219522 277 120 aa, chain - ## HITS:1 COG:no KEGG:ELI_3416 NR:ns ## KEGG: ELI_3416 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 6 95 1 94 117 73 45.0 3e-12 MVTNWMTALKEAGADSDGALRRLSGNISLYQKILRMFPQDETYKQIEPALQANDWAALLA AAHTLKGVAGNLGLIPLSNARSDTVALLRAEQYVEAVASCAKIRDADTGFIEIIRLLEEA >gi|157101648|gb|DS480676.1| GENE 201 219736 - 219876 131 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936748|ref|ZP_02084115.1| ## NR: gi|160936748|ref|ZP_02084115.1| hypothetical protein CLOBOL_01639 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01639 [Clostridium bolteae ATCC BAA-613] # 1 46 1 46 46 88 100.0 1e-16 MAGVEERDIENIFAYPDKQVGETVKYKVAVTVMGLRFFLKECRPKG >gi|157101648|gb|DS480676.1| GENE 202 219903 - 221321 845 472 aa, chain + ## HITS:1 COG:no KEGG:ELI_1104 NR:ns ## KEGG: ELI_1104 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 6 445 7 453 500 137 26.0 1e-30 MQNSSMTDIIYDYFYSRILFGYFLPGDQLPSISYICRQFQVSALIVRTAFARLREEGMLE TIERKGSTVTYRPGKEQGRLYRESFLSRREGMEDICRHSDIIFGSMAQVYMQKQTEDSIH QIRTQLKKRKGHPAKQITMFYAEAMQPLNNPLAINLHWEIVRYLRIPYFEAANFGDTGIQ AEDHIERMLTLIEAGNAEKAVEVMQAFNRDGPRLFIKGLPFMMNGEPPVEQIPFKWQIYR EHPQLCYTLAAELMSKIDAQAYKQGEFLPSCQALAQEYGVSLITMRRTLELLNNICVTET LNGVGTRVLSGKSAGMPKLFQPIQKILVLYLQALQIGALSCHDVAIHTLSSLDDDGYDTL 
DRIIGRHIEERRAFLLAETCLRFIGGHSPSAFVKEVYHQLYHLLLWGHALHFFSQKMDAS QTHEAYAHKLRDALSRRDAESFASQYAELMGLFLKGTKLLLLQLGFEEQQLV >gi|157101648|gb|DS480676.1| GENE 203 221503 - 223284 1059 593 aa, chain - ## HITS:1 COG:FN0190 KEGG:ns NR:ns ## COG: FN0190 COG2972 # Protein_GI_number: 19703535 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Fusobacterium nucleatum # 19 574 10 552 552 406 38.0 1e-113 MAGKAKKISIKSWFRRFGVQISMFYLAAGLMIVMVLSGAIYFFAANLMMDETVLKTRDQL EMSGANISTYIARIKGESNVFAADPMLRQYLSGKEDAPEPDLMGQIMTILSNDAYIKSLV VVSKDGRILSNEENLDMSVSEDMMKEGWYTDAIHNTMPVLTGARMQSFSADKDNWVISVS TEITDSSGENLGVLLMDMRYSVIEDHLRRLNLGKEGYVFLLNDKGEPVYHKDTSYFSDPE KKRQLLEILKAEDVYNKRQNQLTYRTNIENAGWTMVGVVSLDNLKMLEQQLFEALLLTGG LLFLAVLFIGILFTRRLAAPMADLERGMSEIEKLTEISVHRNSFYEVELLAGNYNRMIRR IRALMDEIKEKEETLRHSELNVLVSQINPHFLYNTLDTIVWMAEFNDSERVIALTKSLAA FFRLSLNGGRELVSVADELDHMRQYLYIQKERYGDKLNYEIGEADELSDYTVPKIILQPV VENSIYHGIKPLDGPGYVTVTVKEMGERLIFTVADNGVGFDPEEPQTGKRKGKGSVGLKN VDERLKLYYGPDYGLQIDSSPGKGCTVYLIVGKKILGAGVCISGAEDAEGYIK >gi|157101648|gb|DS480676.1| GENE 204 223274 - 224047 612 257 aa, chain - ## HITS:1 COG:FN0189 KEGG:ns NR:ns ## COG: FN0189 COG4753 # Protein_GI_number: 19703534 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Fusobacterium nucleatum # 1 250 1 252 261 205 44.0 5e-53 MYNLLVVDDESILRRGIRAFIDFDGLGIGEYFEAENGEGALKIIETSQVDLILADINMPK MNGLEFCRQAKERDPAVKIAILTGYDYLDYAIRAIKIGVDDYALKPLSRNDVQKLLFRLI EKKKDAEGFAAVRQAVGQLARIAGSDSDAGVKGQLADVLEQHLADPGFSLGAMAEQLGYS LSYLSTLFKKTFGENFRDYLLDLRLERAKILLLSTEMKNYEISAAVGIEDPNYFSVCFRR KFQKTPKEFRMGAGHGR >gi|157101648|gb|DS480676.1| GENE 205 224204 - 225361 809 385 aa, chain - ## HITS:1 COG:NMB0044_2 KEGG:ns NR:ns ## COG: NMB0044_2 COG0225 # Protein_GI_number: 15675984 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptide methionine 
sulfoxide reductase # Organism: Neisseria meningitidis MC58 # 62 220 4 162 181 215 61.0 1e-55 MAAAILTLAGCSASAAEAGMAAGAKRETESLSMDDSVMDSGKEKKLTAEQEDMMISGEDL HTIYLAGGCFWGIEAYVKKLPGVSSTDVGYANGNTENPSYKEVCYNNTGHAETVKVVYDT SRISTDQLLDGFFKVVDPTSVNRQGNDRGSQYRTGIYYVDEADQAIAKAAVARQEEKYSS PIATEVLPLSNFYMAEDYHQDYLDKNPGGYCHIDLNDADEFIRESGLDMADVASLIKVED YPVPSDAKLKETLTDIQYRVTQMGDTERPYTNEYSSTFEKGLYVDVVTGEPLFSSVDKFE SGCGWPSFSKPIIADVVKENQDTSFNMVRTEVRSRAGDTHLGHVFDDGPRELGGLRYCIN SASIRFIPYGDLEAEGYGFLKSIFD >gi|157101648|gb|DS480676.1| GENE 206 225436 - 226017 500 193 aa, chain - ## HITS:1 COG:SPy1558 KEGG:ns NR:ns ## COG: SPy1558 COG0526 # Protein_GI_number: 15675451 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Streptococcus pyogenes M1 GAS # 32 193 46 207 207 145 44.0 6e-35 MKRKYFIIPVLCLAFFAFAAYGARKPRNAFPMVNSEEMGSGVDRNMDTKKMNKGNPAPDF EMMDLNGNTVKLSDFAGEKVYIKYWASWCPICLGGLEDINTLSAEDNGFKVLTIVAPGSK GEKNDEDFKTWFAGVEHTENITVLLDIDGVYTNRAGVRGFPTSEYIGSDGVLISLAPGHA DNETIKSTFEAIN >gi|157101648|gb|DS480676.1| GENE 207 226040 - 226738 580 232 aa, chain - ## HITS:1 COG:HI1454 KEGG:ns NR:ns ## COG: HI1454 COG0785 # Protein_GI_number: 16273360 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis protein # Organism: Haemophilus influenzae # 1 224 1 212 213 162 45.0 5e-40 MAAQTLYLGTVLTAGLLSFFSPCIVPLLPVYISVFSAGSGAAQEDIFQRRIRTLFKSCLF VAGISTCFIILGFGAGALGSIIGSSFFMTAIGLIVIVLGLHQTGLIHIKWLSYEKKVNLK RSHRSDYFGVFLLGLTFSFGWTPCIGPVLGAILGLSATSSRPFYGGFLMAVYSLGFLIPF LILALFSDVLLQRIKKLNKHLGKIKAAGGVVIIMMGILLMTDNLDKILMLIN >gi|157101648|gb|DS480676.1| GENE 208 227077 - 227371 68 98 aa, chain + ## HITS:1 COG:no KEGG:MGAS2096_Spy1746 NR:ns ## KEGG: MGAS2096_Spy1746 # Name: not_defined # Def: transposase # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 3 96 133 227 268 77 42.0 1e-13 
MGMCVKAALIFPNTGSRDHVKTVTRHIWVPYMEMCEDIRHSTGMNELYALRINHREIVWN SKRESWLPVYADDREGPEGDEAGLMFACMNLKKLAKWL Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:19:58 2011 Seq name: gi|157101647|gb|DS480677.1| Clostridium bolteae ATCC BAA-613 Scfld_02_18 genomic scaffold, whole genome shotgun sequence Length of sequence - 124621 bp Number of predicted genes - 117, with homology - 114 Number of transcription units - 50, operones - 26 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 33 - 185 73 ## 2 2 Op 1 . - CDS 398 - 1120 636 ## Closa_0179 Colicin V production protein 3 2 Op 2 . - CDS 1139 - 1966 943 ## COG0561 Predicted hydrolases of the HAD superfamily 4 2 Op 3 . - CDS 1963 - 4491 2464 ## Closa_0177 hypothetical protein 5 2 Op 4 23/0.000 - CDS 4538 - 5134 559 ## COG0353 Recombinational DNA repair protein (RecF pathway) 6 2 Op 5 30/0.000 - CDS 5134 - 5490 185 ## PROTEIN SUPPORTED gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 7 2 Op 6 . - CDS 5533 - 7215 1686 ## COG2812 DNA polymerase III, gamma/tau subunits 8 2 Op 7 . - CDS 7293 - 8372 1355 ## COG0205 6-phosphofructokinase - Prom 8426 - 8485 7.5 - Term 8820 - 8860 -0.9 9 3 Op 1 . - CDS 9012 - 11234 1937 ## COG3345 Alpha-galactosidase 10 3 Op 2 . - CDS 11252 - 12964 1287 ## COG0366 Glycosidases 11 3 Op 3 . - CDS 13007 - 16291 2768 ## COG0383 Alpha-mannosidase - Prom 16325 - 16384 3.2 - Term 16336 - 16393 24.0 12 4 Op 1 38/0.000 - CDS 16418 - 17260 948 ## COG0395 ABC-type sugar transport system, permease component 13 4 Op 2 35/0.000 - CDS 17265 - 18146 889 ## COG1175 ABC-type sugar transport systems, permease components 14 4 Op 3 3/0.111 - CDS 18321 - 19715 1485 ## COG1653 ABC-type sugar transport system, periplasmic component 15 4 Op 4 3/0.111 - CDS 19777 - 20961 1239 ## COG0626 Cystathionine beta-lyases/cystathionine gamma-synthases - Prom 20990 - 21049 4.8 - Term 20989 - 21036 1.1 16 4 Op 5 . 
- CDS 21055 - 21738 740 ## COG2186 Transcriptional regulators - Prom 21776 - 21835 7.3 + Prom 21929 - 21988 6.1 17 5 Tu 1 . + CDS 22064 - 22735 361 ## Halsa_1450 transcriptional regulator, Crp/Fnr family + Term 22797 - 22853 3.2 + Prom 22925 - 22984 6.8 18 6 Op 1 . + CDS 23011 - 24417 1131 ## COG1288 Predicted membrane protein 19 6 Op 2 . + CDS 24455 - 25645 876 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 20 6 Op 3 . + CDS 25664 - 26581 656 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 21 6 Op 4 . + CDS 26598 - 27953 205 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 22 7 Tu 1 . - CDS 28037 - 28771 569 ## COG0590 Cytosine/adenosine deaminases - Prom 28812 - 28871 8.9 + Prom 28761 - 28820 8.9 23 8 Tu 1 . + CDS 28866 - 29564 530 ## COG1180 Pyruvate-formate lyase-activating enzyme + Term 29717 - 29769 12.5 24 9 Tu 1 . - CDS 29763 - 32354 692 ## Closa_0170 hypothetical protein - Prom 32449 - 32508 6.1 + Prom 32413 - 32472 7.3 25 10 Tu 1 . + CDS 32564 - 33289 975 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase + Term 33358 - 33404 5.1 - Term 33400 - 33461 24.1 26 11 Op 1 . - CDS 33472 - 34044 720 ## Closa_0155 Lipoprotein LpqB, GerMN domain protein 27 11 Op 2 . - CDS 34073 - 34576 509 ## COG1846 Transcriptional regulators 28 11 Op 3 . - CDS 34646 - 36028 1130 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 29 11 Op 4 . - CDS 35992 - 37446 1191 ## HMPREF0868_1153 putative UDP-N-acetylmuramoylalanine--D-glutamate ligase - Prom 37602 - 37661 5.7 30 12 Tu 1 . - CDS 37674 - 38201 588 ## COG0700 Uncharacterized membrane protein - Prom 38287 - 38346 4.2 - Term 38289 - 38332 6.7 31 13 Op 1 . - CDS 38381 - 38578 67 ## gi|160936906|ref|ZP_02084270.1| hypothetical protein CLOBOL_01795 32 13 Op 2 . - CDS 38635 - 39141 453 ## gi|160936907|ref|ZP_02084271.1| hypothetical protein CLOBOL_01796 - Prom 39176 - 39235 2.9 - Term 39278 - 39324 8.1 33 14 Tu 1 . 
- CDS 39425 - 40297 671 ## COG0582 Integrase - Term 40395 - 40437 9.2 34 15 Op 1 . - CDS 40495 - 41586 1092 ## COG1915 Uncharacterized conserved protein - Prom 41616 - 41675 2.5 35 15 Op 2 . - CDS 41739 - 42671 798 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 - Prom 42869 - 42928 6.6 + Prom 43041 - 43100 10.3 36 16 Tu 1 . + CDS 43201 - 44052 725 ## CLM_0794 hypothetical protein + Term 44055 - 44122 21.6 - Term 44049 - 44104 12.1 37 17 Tu 1 . - CDS 44115 - 44813 256 ## COG2186 Transcriptional regulators - Prom 44976 - 45035 10.8 + Prom 44804 - 44863 6.6 38 18 Tu 1 . + CDS 45104 - 46213 833 ## COG1879 ABC-type sugar transport system, periplasmic component + Term 46222 - 46268 5.1 + Prom 46219 - 46278 4.5 39 19 Op 1 . + CDS 46329 - 46844 360 ## Dtox_0837 cupin 2 conserved barrel domain-containing protein 40 19 Op 2 21/0.000 + CDS 46874 - 48367 1029 ## COG1129 ABC-type sugar transport system, ATPase component 41 19 Op 3 3/0.111 + CDS 48380 - 49354 565 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 42 19 Op 4 . + CDS 49373 - 50446 802 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 43 19 Op 5 . + CDS 50462 - 51406 488 ## COG1250 3-hydroxyacyl-CoA dehydrogenase 44 19 Op 6 . + CDS 51410 - 52504 221 ## COG2706 3-carboxymuconate cyclase + Term 52530 - 52590 12.2 - Term 52524 - 52571 7.1 45 20 Tu 1 . - CDS 52577 - 53680 1171 ## Closa_1264 glycoside hydrolase family 25 - Prom 53722 - 53781 5.3 46 21 Op 1 . - CDS 53811 - 54287 510 ## COG1720 Uncharacterized conserved protein 47 21 Op 2 . - CDS 54331 - 55167 527 ## COG0789 Predicted transcriptional regulators 48 21 Op 3 . - CDS 55240 - 55836 541 ## COG2715 Uncharacterized membrane protein, required for spore maturation in B.subtilis. - Prom 55965 - 56024 3.2 49 22 Tu 1 . + CDS 56193 - 57617 1619 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Prom 57722 - 57781 6.9 50 23 Tu 1 . 
+ CDS 57870 - 58553 533 ## COG1284 Uncharacterized conserved protein 51 24 Tu 1 . - CDS 58606 - 59475 596 ## COG0789 Predicted transcriptional regulators - Prom 59639 - 59698 3.0 - Term 59479 - 59518 -0.4 52 25 Op 1 . - CDS 59702 - 60835 1187 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 53 25 Op 2 4/0.000 - CDS 60835 - 62556 1719 ## COG0777 Acetyl-CoA carboxylase beta subunit 54 25 Op 3 4/0.000 - CDS 62721 - 64073 1483 ## COG0439 Biotin carboxylase 55 25 Op 4 4/0.000 - CDS 64094 - 64519 536 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases 56 25 Op 5 4/0.000 - CDS 64547 - 65065 590 ## COG0511 Biotin carboxyl carrier protein 57 25 Op 6 11/0.000 - CDS 65082 - 66320 1672 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 58 25 Op 7 26/0.000 - CDS 66369 - 67109 222 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 59 25 Op 8 3/0.111 - CDS 67127 - 68050 1177 ## COG0331 (acyl-carrier-protein) S-malonyltransferase 60 25 Op 9 4/0.000 - CDS 68122 - 69048 1246 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 61 25 Op 10 6/0.000 - CDS 69147 - 69380 453 ## COG0236 Acyl carrier protein 62 25 Op 11 . - CDS 69431 - 70393 935 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III - Prom 70479 - 70538 4.9 + Prom 70509 - 70568 5.1 63 26 Tu 1 . + CDS 70665 - 70928 345 ## COG2002 Regulators of stationary/sporulation gene expression 64 27 Op 1 1/0.333 - CDS 71057 - 71902 919 ## COG0313 Predicted methyltransferases 65 27 Op 2 1/0.333 - CDS 71989 - 72771 714 ## COG4123 Predicted O-methyltransferase 66 27 Op 3 1/0.333 - CDS 72761 - 73693 1210 ## COG1774 Uncharacterized homolog of PSP1 67 27 Op 4 . 
- CDS 73729 - 74718 1061 ## COG2812 DNA polymerase III, gamma/tau subunits - Prom 74819 - 74878 5.3 68 28 Op 1 7/0.000 - CDS 74908 - 75669 909 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 69 28 Op 2 1/0.333 - CDS 75662 - 77380 1511 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 70 28 Op 3 . - CDS 77382 - 78491 253 ## PROTEIN SUPPORTED gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 - Prom 78541 - 78600 4.7 + Prom 78565 - 78624 5.1 71 29 Tu 1 . + CDS 78661 - 79515 642 ## Closa_2529 hypothetical protein + Term 79546 - 79595 5.0 - Term 79405 - 79456 5.6 72 30 Op 1 7/0.000 - CDS 79594 - 80727 930 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 73 30 Op 2 10/0.000 - CDS 80802 - 82013 1227 ## COG0477 Permeases of the major facilitator superfamily - Prom 82096 - 82155 3.3 - Term 82108 - 82157 11.1 74 31 Op 1 40/0.000 - CDS 82193 - 83656 1254 ## COG0642 Signal transduction histidine kinase 75 31 Op 2 . - CDS 83653 - 84318 738 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 76 31 Op 3 . - CDS 84290 - 85912 1249 ## COG1292 Choline-glycine betaine transporter 77 31 Op 4 . - CDS 85962 - 86093 67 ## gi|160936958|ref|ZP_02084322.1| hypothetical protein CLOBOL_01847 - Term 86183 - 86235 14.0 78 32 Op 1 . - CDS 86266 - 86742 289 ## Spirs_1952 hypothetical protein 79 32 Op 2 . - CDS 86773 - 88212 1527 ## COG1109 Phosphomannomutase - Term 88229 - 88270 7.1 80 33 Op 1 17/0.000 - CDS 88282 - 89934 1860 ## COG1178 ABC-type Fe3+ transport system, permease component 81 33 Op 2 7/0.000 - CDS 89928 - 91007 1246 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components - Prom 91063 - 91122 1.6 - Term 91078 - 91120 1.5 82 33 Op 3 . 
- CDS 91185 - 92324 265 ## PROTEIN SUPPORTED gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 - Prom 92444 - 92503 7.8 - Term 92525 - 92573 8.3 83 34 Tu 1 . - CDS 92628 - 93191 416 ## Mevan_0161 hypothetical protein - Prom 93314 - 93373 8.2 - Term 93409 - 93470 15.1 84 35 Op 1 . - CDS 93492 - 94631 1146 ## COG0520 Selenocysteine lyase 85 35 Op 2 . - CDS 94741 - 94977 264 ## Closa_1020 hypothetical protein - Prom 94997 - 95056 2.0 - Term 94994 - 95058 1.0 86 36 Op 1 . - CDS 95114 - 95320 352 ## PTH_0551 redox protein 87 36 Op 2 . - CDS 95333 - 96427 1190 ## Ccur_00560 hypothetical protein - Prom 96513 - 96572 4.5 - Term 96518 - 96582 13.5 88 37 Op 1 . - CDS 96609 - 97541 925 ## COG0583 Transcriptional regulator 89 37 Op 2 7/0.000 - CDS 97548 - 99467 2035 ## COG3276 Selenocysteine-specific translation elongation factor 90 37 Op 3 . - CDS 99472 - 100782 1474 ## COG1921 Selenocysteine synthase [seryl-tRNASer selenium transferase] 91 37 Op 4 . - CDS 100779 - 100880 80 ## - Prom 101027 - 101086 5.2 - TRNA 100915 - 101007 27.7 # SeC(p) TCA 0 0 92 38 Op 1 . - CDS 101183 - 102304 1378 ## Clos_0960 glycine reductase (EC:1.21.4.2) 93 38 Op 2 . - CDS 102332 - 103888 1400 ## CLM_1422 glycine reductase complex component C subunit beta (EC:1.21.4.2) - Prom 104034 - 104093 2.0 - Term 104041 - 104083 6.3 94 39 Op 1 . - CDS 104125 - 104451 215 ## Shel_25200 glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.4) 95 39 Op 2 . - CDS 104467 - 104598 190 ## Amet_3592 glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.3 1.21.4.4 1.21.4.2) 96 40 Op 1 11/0.000 - CDS 104727 - 105044 451 ## COG0526 Thiol-disulfide isomerase and thioredoxins 97 40 Op 2 . - CDS 105078 - 106019 581 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 - Prom 106074 - 106133 5.6 98 41 Op 1 . - CDS 106225 - 106455 219 ## Shel_25230 glycine reductase, selenoprotein B 99 41 Op 2 . 
- CDS 106483 - 107532 1017 ## Clos_0958 selenoprotein B (EC:1.21.4.2) 100 41 Op 3 . - CDS 107542 - 108822 1293 ## Bmur_2722 glycine/sarcosine/betaine reductase complex protein B alpha and beta subunits 101 41 Op 4 . - CDS 108827 - 109222 336 ## CLOST_1107 grdx protein 102 42 Tu 1 . + CDS 109628 - 110650 948 ## COG3604 Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains + Term 110702 - 110754 12.1 - Term 110689 - 110741 14.2 103 43 Op 1 1/0.333 - CDS 110761 - 111387 458 ## COG0194 Guanylate kinase 104 43 Op 2 . - CDS 111443 - 112960 1157 ## COG1982 Arginine/lysine/ornithine decarboxylases 105 43 Op 3 . - CDS 112957 - 113478 526 ## COG0563 Adenylate kinase and related kinases 106 43 Op 4 . - CDS 113569 - 115068 1256 ## EUBREC_2822 hypothetical protein 107 43 Op 5 . - CDS 115058 - 115288 245 ## PROTEIN SUPPORTED gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 108 43 Op 6 . - CDS 115304 - 115750 455 ## gi|160936992|ref|ZP_02084356.1| hypothetical protein CLOBOL_01881 - Prom 115925 - 115984 9.1 + Prom 115978 - 116037 6.7 109 44 Op 1 . + CDS 116072 - 116509 353 ## COG1846 Transcriptional regulators 110 44 Op 2 . + CDS 116548 - 117228 894 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump + Prom 117245 - 117304 3.5 111 45 Tu 1 . + CDS 117349 - 118002 626 ## COG1145 Ferredoxin 112 46 Tu 1 . - CDS 118385 - 120139 1833 ## COG0441 Threonyl-tRNA synthetase - Prom 120232 - 120291 5.5 + Prom 120558 - 120617 7.0 113 47 Op 1 . + CDS 120731 - 121153 483 ## Closa_1010 hypothetical protein 114 47 Op 2 . + CDS 121140 - 121382 296 ## Closa_0017 hypothetical protein + Term 121540 - 121606 8.6 115 48 Tu 1 . - CDS 121398 - 123146 1270 ## gi|160937004|ref|ZP_02084368.1| hypothetical protein CLOBOL_01893 - Prom 123268 - 123327 6.5 - TRNA 123357 - 123429 50.2 # Pseudo GAA 0 0 - Term 123561 - 123605 9.1 116 49 Tu 1 . - CDS 123613 - 123708 74 ## - Prom 123750 - 123809 3.2 + Prom 123800 - 123859 7.0 117 50 Tu 1 . 
+ CDS 123961 - 124452 41 ## COG4974 Site-specific recombinase XerD Predicted protein(s) >gi|157101647|gb|DS480677.1| GENE 1 33 - 185 73 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQLFVNNFFYFFFVVCLSSNFYILPYLLLFVNNFFNSFFRVVLPFFAVDL >gi|157101647|gb|DS480677.1| GENE 2 398 - 1120 636 240 aa, chain - ## HITS:1 COG:no KEGG:Closa_0179 NR:ns ## KEGG: Closa_0179 # Name: not_defined # Def: Colicin V production protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 240 4 244 244 232 48.0 9e-60 MDMILEHWLSAGAGVFLIGMVLYGHYRGALKQCVSLGALILTIVLVKIATPYMTDFIKDN PSIRQSAAEAILDAAGWEEPSAENTELPAAQRIAIENLNLPQSVKETLLENNNSEFYHML GVDQFAEYVSTYLADILINAVSSIILFAVVYILIHLVVRWLDLIARLPILYGLNHIAGAV LGLIQGLLFLWIGCFLVGIFSATPLGMTLEEQINASTWLRFLYQYNLINIVLGGIIRGIL >gi|157101647|gb|DS480677.1| GENE 3 1139 - 1966 943 275 aa, chain - ## HITS:1 COG:BH0497 KEGG:ns NR:ns ## COG: BH0497 COG0561 # Protein_GI_number: 15613060 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Bacillus halodurans # 6 275 7 245 247 124 32.0 2e-28 MMTIGLIALDLDGTLLDSQKNLSERNRRALERCARMGIQIVPTTGRAVDGITQEVRSLPG VNYAISTNGGTVADLVKGTSLKRCTLSNAKALEVMDIVKRYRAMYDPYIDGRGISQPEFI GHMEDFGLSPVIQKMVLSTRDVVPNILNYVTECKKDVEKVNVYLADVNDIVPLRKELSAV EGIVISSSLYNNLEINALGATKGIALMWLADYLGIAPEATMAFGDGENDLSMLEAAGVGI AMGNGLDIAKNAADQITLTNDEDGVADAIERLIFN >gi|157101647|gb|DS480677.1| GENE 4 1963 - 4491 2464 842 aa, chain - ## HITS:1 COG:no KEGG:Closa_0177 NR:ns ## KEGG: Closa_0177 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 221 1 221 765 317 77.0 1e-84 MDKYEFNIKVEQIKKMVSRGDYETAMKIADTIDWRRVRNVNILSMVATIYEKNEEYQEAK DILLLAFERAPIGKRLLFKLAELAIKEGSFQEAEDYYREFCDLAPDDPRQYILRYLILGA KGAPAEQLIHTLEQYCNIELDEKWIYELAELYAEAGMGELCIMTCDKIMLMFGLGKYVEK AMELKIQFAPLTKYQMDLVENRDKYEARLKAVEQEYRAGKRPADYEPVPEKKTEVSPQSY MARQEAAFAAEPEAEYPEDGYSESGYAEGGYQEDGYAEDGYAEDGYPNDGYAEDGYPESG YAEDGYTEERYTERGYRNDRYAQNEYQNDRYTEAGYPEDDYSERDYQKSGYQAEAYQEDS 
YTDEDGSEEVYARQGRYGNEYPDQNVQEDTQDGAEAGYPEEHHGIAVGAEEPYEDEGAHT RVTIESEDSDEGEEPLKEPQSQWMPDKPRILSDEAVRARMHEAEVQANLAKEMSRISDER HRPETTSAQTRVLTDIKDLGRESSVQESHHFMIEAEYSSSGLDQAIELLKKIHKETGAKN QAAKITGEKLNSKGVFAIADKLTGKDLIIEQAGAMEESTLQELNQLMARDETGMNVVLID VAGNLARIHKVYPGLAKRFEYVGKMGPEEVAYVAPEGDDRPVRPVQLRKTEEAAAPRTAL RRTEEPAAGKVQPVAEPAAIRSLPKPETEQKPAAERKAGTEQKPKPERRKMEESRPEPAA PREEHKEVSLKAAETPPVLPQPEEPDYDDQQEMDIDEFAQYACQYASDIDCSISGKSMLA LYERIEIMEEDGVPLTKVNAEDLIEEAADKAENPSFIKRITGIFSSKYDRDGLLILKEEH FI >gi|157101647|gb|DS480677.1| GENE 5 4538 - 5134 559 198 aa, chain - ## HITS:1 COG:CAC0127 KEGG:ns NR:ns ## COG: CAC0127 COG0353 # Protein_GI_number: 15893423 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Clostridium acetobutylicum # 1 198 1 198 198 238 57.0 4e-63 MDYYSSQITKLIEELSKLPGVGAKSAQRLAFHIINMPKEQVEELAGAMTGARNNVRYCKE CFTLTDKELCPICSSDRRNHKTIMVVENTRDLAAYEKTGKYDGVYHVLHGAISPMLGIGP GDIKLKELMQRLQSDVEEVIIATNSSLEGETTAMYISKLIKPTGIKVTRIASGVPVGGDL EYIDEVTLLRALEGRTEL >gi|157101647|gb|DS480677.1| GENE 6 5134 - 5490 185 118 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 [Roseobacter sp. 
AzwK-3b] # 10 109 6 105 114 75 38 9e-13 MAKRGGFPGGMPGNMNNLMKQAQRMQRQMEETTKELENKEYTAASGGGAVSVTVSGKKEV VSVKLSPDVVDPDDIEMLEDLIVAATNEAFRQMEAESSEAMSKLTGGLGGALGGGFPF >gi|157101647|gb|DS480677.1| GENE 7 5533 - 7215 1686 560 aa, chain - ## HITS:1 COG:BS_dnaX KEGG:ns NR:ns ## COG: BS_dnaX COG2812 # Protein_GI_number: 16077087 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Bacillus subtilis # 1 466 1 445 563 377 43.0 1e-104 MSYTALYRKWRPVSFEDVKGQDPIVQTLKNQITSERIGHAYLFCGTRGTGKTSIAKIFAR AVNCEHPVDGSPCNECPTCRSIQSGSSMNVVEIDAASNNGVENIRDIREQVQYPPTEGRY RVYIIDEVHMLSIGAFNALLKTLEEPPSYVIFILATTEVHKIPITILSRCQRYDFKRISL ETIANRLRELTQAEQIQVEDKALLYVAKAADGSLRDALSLLDQCVAFHYGKVLTYDNALE VLGAVDSGVFSQMFGAIVEGRTKDCICSLEEIVIQGRELGQFVTDFIWYMRNLLLIQSAD DAEGLVDMSEENLKQLRSDAGKTDGTTLMRYIRIFSELSNQLRYASQKRVLVEVALIKLT RPSMEPNLDAILQRLGDLEAQMEDLEAGRMAIPMAAVYQEAGSAPAGQAPRTGAGPGADG GTVPQAAAPYAPGLQADPGAAPEKVALPQAQLEDLKLVRNEWAKIVRSMGGGARSYLRDT VVEPGGEGCLTIVFMDPMNYDMGKRPTVIGELERYVEANFGRSIYFKTRLAGRGERLNTI YVTQEELEEKIHMDITYEDE >gi|157101647|gb|DS480677.1| GENE 8 7293 - 8372 1355 359 aa, chain - ## HITS:1 COG:Cgl1221 KEGG:ns NR:ns ## COG: Cgl1221 COG0205 # Protein_GI_number: 19552471 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Corynebacterium glutamicum # 3 339 5 333 346 266 43.0 5e-71 MRRIGMLTSGGDCQALNATMRGVAKSLYNMFDDVEIIGFEDGYKGLMYADYQIMKSSDFS GILTQGGTILGTSRQPFKLMRTPDENGLDKVEAMKHTYKKLKLSCLVVLGGNGSQKTANM LSEEGLNVVSLPKTIDNDLWGTDMTFGFQSAVDVATNAIDCIHTTAASHGRVFIVEVMGH KVGWLTLYAGLAGGADIILLPEIPYDINIVCDALTKRSKAGKRFSILAVAEGAISREDAK MSKKELKEKKKSGVVYPSVAYEIGAKIQEVIGSEVRVTVPGHMQRGGEPCSYDRVLATRL GAAAARLIAQEQYGYMVAVKNNDITQVPLSEVAGRLKTVDPECSMIKEAKMVGISFGDE >gi|157101647|gb|DS480677.1| GENE 9 9012 - 11234 1937 740 aa, chain - ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: 
Bacillus halodurans # 1 734 1 738 748 847 56.0 0 MGIIYEEKSRQFHLYNDKISYIMTCLPNGQLGQLYYGKRLRHRESFGHLLEFAHRDMAPC VYEDDTTFSLEHIRQEYPAFGTGDMRYPAYGIRRENGSRVSCFVYKSHTIQGGKPELPGL PAVYVEQDEEAQTLELYLEDPVAGMELVLLYTIYENMPVLTRSARFICRGPEPVCLERAM SLSLDLPDPRYQMVDLTGAWCRERYVDCHGLHRGVQSIHSMRGHSSHQFNPFICLKRPDA GEDSGEVIGISLVYSGNFLAQAEVDTMDTVRVTMGIHPEGFDWPLCSGEEFQTPEAVMVY SGDGLNGMSQAFHQLYRTRLARGYWRDRERPILINNWEATYMDFDEEKILSIADKARELG IELFVLDDGWFGKRDDDTTSLGDWYPNLHKLPDGITGLAGKIRDMGMKFGLWFEPEMVSR DSCLYRAHPDWMLGSTDRPVCTGRHQYVLDFSKDQVVEYIGDRMEEILGDGGVSYVKWDM NRSISDLFSAGRDARYQGTVYHRQILGVYKLYERLTRDFPHVLFESCASGGARFDPGILY YAPQGWVSDDTDGVERVKIQYGTSLVYPVSSMGSHVSAIPNHQTQRNVPLASRAAAAFFG TFGYELDLNLLTEEEQYQVKEQIRFMKEHRKLIQQGRFYRLISPFEGNEAAWIVVSEDEN KAIAAYFRFLQPVNTGFKRLVLKGLNPGTMYRVSGMDYTSYGDELMSAGLILSDGASGVR TSPVPQGDYLSRLFLVEAQP >gi|157101647|gb|DS480677.1| GENE 10 11252 - 12964 1287 570 aa, chain - ## HITS:1 COG:BH2903 KEGG:ns NR:ns ## COG: BH2903 COG0366 # Protein_GI_number: 15615466 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 7 567 5 555 561 632 56.0 0 MEEQRAWWKECVVYQIYPRSFQDSNGDGIGDLNGIRSRLDYLKDLGIDVIWLSPVYQSPG DDNGYDISDYKAVMDVFGTMEDFDRLLAEIHEKGMKLVMDLVVNHTSDEHAWFVKSRRSP DNPYRDYYIWKAPVSGGPPNNWASRFRGPAWEYDEGTGMYYLHIYSRKQPDLNWENPRVR QEIYDMMAWWGDKGIDGFRMDVISMLSKDQRFPDGVLKEGNPYGDGLPYYANGPRIHEFL QEMNREVLSRFDWMSVGEGVGVGTREAVQYAGYESHELNMMFHFEHVELNPGPFGKWNDN PVCLRELKEVLNRWQTELEGKAWNSLFWENHDYPRAVSRFGNDSPVFWELSAKMLAVCLY MMKGTPYIYQGQELGMTNFQGSSIEDFQDIEIHNVYREMVGSGRISHEEMIRYINHMGRD NARTPMQWDDTAQAGFTEGTPWLGVNPNYTWLNAREQAGRGDSVLSFYKQLIKLRHTMDI LVYGTYRLLLEESGSVYAYERTLGDERLTVLCNFTDQTQSCGLSCRDLGTDNVLIGNYPD QSGTFRGNGPALHEEEDICLRPYEAIVLTQ >gi|157101647|gb|DS480677.1| GENE 11 13007 - 16291 2768 1094 aa, chain - ## HITS:1 COG:lin2123 KEGG:ns NR:ns ## COG: lin2123 COG0383 # Protein_GI_number: 16801189 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Listeria innocua # 34 1089 35 1029 1032 633 35.0 0 
MSILFELERMERLAEDLDRLRRPVRIPVSCYKRMEGKEKGSPWCSTLDWEDCSIEEPWAC LDSHRWYRTTVEIPPHLDGAHVEFLITTGREGQWDATNPQMLFYLNGNIVQGVDVNHREI LISSGAKAGERYDIAILAYSGSVPGDLIIRTELVRVDDAVEKAYYDFLVPVQAARLLKKP DEENYRRILVKLGPAADALDLREPYSSRFYQSIEEMERIVEKEFYQKVNAASPVVSAIGH THIDIAWLWTVEQTREKAVRSFSTVLELMDRYPDYKFMSSQPILYQFVKEQEPELYECIR ERVREGRWETDGAMWLESDCNLPAGESLVRQIIKGEQFFQEEFGISSRCLWLPDVFGYSA AIPQILKKCGIPYFLTTKIAWNQFNQLPNDTFMWKGIDGSRVFVFMPTACDFDKTLGLNV SFTDTRNTTTYTGIVNPNMTLGTFKRFQNRDLTEDTLMLFGFGDGGGGPTKEMLEEAKRL QYGLPGIPRLVQENERTFFDRIHHDIGSRPDMPVWDGELYFEYHRGTLTSMGKNKRYNRK SEQMYEQLETLGVMAELKGLEYPAGVIKRGWDIILLNQFHDIIPGSAIGPVYEQTDREYE EILKSGAETALHLAEGMGSRICGQDKDTSREEPEEQKPEEHKIIVFNTQGYEREDTVTVS GVARGEAAYACDCLGYRAPVQYVGEDTLIFHAKGIPSCGYAVYSLVGAEDSVNQGTSGSA KEHPVLAAEHPGPDPEPSGFDAECPRPWPGFFENDWYRAAFNDKMELVSLVEKETGCQLL KEGRVGNQLLTFEDRPMNWDNWDVDMFYQRKPYNADHVTAPVLKEWGPVRTVMSISHRFA GSLVEQDIVFYPNLPRIDFVTRADWRDHHVLLRVYFPARINATKAAYEIQFGNVERETTS NHSWDTAKFEVCAHKWADLSENGLGLSLLNDCKYGHSIKDGEMGLTLIKSGTYPNENADI GFHEFTYSIYPHKGRWQEARTVEMAYDLNSPLVSALVPAKGAGGFWSMVSVDQPDCFVEM MKKAENGNGYILRIYENRNTRTPMTVKLGFEISRVEECDLLERPLRPIETDGYRFKDFIK PYEIKTYRICPVRA >gi|157101647|gb|DS480677.1| GENE 12 16418 - 17260 948 280 aa, chain - ## HITS:1 COG:mll0851 KEGG:ns NR:ns ## COG: mll0851 COG0395 # Protein_GI_number: 13470996 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 15 280 18 282 282 179 39.0 4e-45 MTMKTKRRLKSLLKYLVLACFLVLTMFPFVWMLLTSLKGSQGEIYAFPVRYLPNPASAVN YVAMLWKGNFGRYMLNSFFSSAVAATCAVFIAILASYVISRFHFRFKNVLLLFFLVTQMI PMFIMLAPLYQLLAKAKMLNDLRSLAMLYTNMMIPFSVVTLCGFFDSVPKSLEEAAWIDG CGYFKALFKVVVPVIMPGIAATFIFAFINSWNELFMAVMFIDVDKFKTIPVGLNGLILKY DIKWGEMAAGTIMSLVPTMCLFAFAQKYMIEGLTAGSVKG >gi|157101647|gb|DS480677.1| GENE 13 17265 - 18146 889 293 aa, chain - ## HITS:1 COG:AGl3272 KEGG:ns NR:ns ## COG: AGl3272 COG1175 # Protein_GI_number: 15891756 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport 
systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 268 59 312 337 200 42.0 2e-51 MRRARFIFIMVLPAILFPMIFTYYPMIKGSLMAFQSYNLMNVKNIKWVGLENFSKLFAHN TSNTFYSTMLNTVKWVGISLFVQFTVGFAMALLLKKKFKGSSLYQGLIFFPWAVSGFIIG IMWRWMFNGTSGVINDLLMRIHLISQPVGWLASKNTALYSCIIANIWYGIPFFTIMITAA LRGVSEDLYEAADVDGASPYQKFTNITVPCIRSVLLLTVLLRVIWIFNFPDLIYSMTQGG PAGSSNIITSYMMQLVQSLDYGLASAVGVLCILFLIVFAAIYLVATKYTDEEA >gi|157101647|gb|DS480677.1| GENE 14 18321 - 19715 1485 464 aa, chain - ## HITS:1 COG:SMc01977 KEGG:ns NR:ns ## COG: SMc01977 COG1653 # Protein_GI_number: 15966264 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 60 450 24 406 420 195 33.0 2e-49 MKKKQLAAMALAAVMLTGCGGGSAAPDATKAADSGASTAADAGKKEADSQKAESGEAVTI TLVESLTSPERTAVLREIADKYQAEHSNITIDIISPPLENADAKITQMLMNGSGADIVEV RDSTVTQYATNEWIADLQKYIDAWDEKGTLTESADEVIHYLKGGAYLIPYGFYQRGLFYR ADWFEEKGLDQPETWQDIYDAGLAITDPANSRFGYSFRGGPSGYQYADTIYWSWIGTDKV ADPNAAYFLKDGDGATIFTLPEVKEALHFYKDLFKNTCPTDSIAWGFSEMVQGFVGGTTA MLIQDPEVIATCSADMQDDQWALVPFPKGPSGQAVFPNGFAGWGMTSFTEHPDETADFLL YLSNSENNTYFAKNYSTIPIHNNAAEKDPYFSEGRFAMYMDMAKEPDVYRHAAYPQMYEA FATYKTEVDTMYQKYLTDEITDDELLQWLDEFWTQAYKDEGQKW >gi|157101647|gb|DS480677.1| GENE 15 19777 - 20961 1239 394 aa, chain - ## HITS:1 COG:APE1226 KEGG:ns NR:ns ## COG: APE1226 COG0626 # Protein_GI_number: 14601268 # Func_class: E Amino acid transport and metabolism # Function: Cystathionine beta-lyases/cystathionine gamma-synthases # Organism: Aeropyrum pernix # 58 393 45 383 384 274 41.0 2e-73 MTNQNEYLKSTKLITAHYGEEYEHYYNAIVPPVFMNSLNVFETLEDYYDFDRTDKHKYCY GRVQNPTVRILEDKIAALEHGVGALAFASGMAAATTAVLTACKAGSHVVCLHNAYGPLKD FLSKYCTEHLDITLTYVTGDRVEEFEEAVTDATDLIVLESPSSVLFSLQDIRAVSRIARK HQALVYIDNTFCTPIYQQPLDMGADIVMHTTSKYIGGHSDIIGGMLAVKDQELMKKLTAN RELLGGIVGPMEAWLMIRGLRTMEVRLAAHQAAAMEVAKFLEAHPKVQRVNYPGLPSHPQ YDLMKQQQTGNTGLMSFEIHGTTEDAIKVAESLKIFKIGVSWGGFESLVFLPHARLDEET CRVLGAGQNVIRIHCGLEGTEALISDLESALAKI 
>gi|157101647|gb|DS480677.1| GENE 16 21055 - 21738 740 227 aa, chain - ## HITS:1 COG:mll0857 KEGG:ns NR:ns ## COG: mll0857 COG2186 # Protein_GI_number: 13471000 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 5 224 25 248 260 99 29.0 4e-21 MNSKIQRTSLQSEIIRYIQNYIQENGLKPGDRLPSQEKLIEMMGVSRTSLREAMKTLEAR NVLEAKNGKGIYVGSQEVNDLLRLIDFAKEKELLLDTLEVRKILEREILRMVIHSATREE LEELGAITRVLMAKFRRGEQQTAEDKKFHYTIYRLSHNQVMYQLILSISGVMDKFWEFPL NMEDPFLESLPLHEELYNAICEKNVKKAQAINEKLLDAVYRDIRNQR >gi|157101647|gb|DS480677.1| GENE 17 22064 - 22735 361 223 aa, chain + ## HITS:1 COG:no KEGG:Halsa_1450 NR:ns ## KEGG: Halsa_1450 # Name: not_defined # Def: transcriptional regulator, Crp/Fnr family # Organism: Halanaerobium_sapolanicus # Pathway: not_defined # 31 201 35 205 231 73 25.0 7e-12 MQLDNSSFTLPNKRNAVLERLIECHGETCVYPARKIFLTPGQTLQGVYYIITGRTRHYMI GTDGTEKILYTLSDGWFYGETPLSLRESTGLFSQAEVETRVRIIPYQDYEILLDTNKVFR EAILESYSKKMLIMRHEIESLAFNSCKDRLKRLFCSTADTSCLLENKWYNLKVHYTQQEL SAIIGSSRITISKTIKELCNEDFIRILNRNTQINAAAYNKVMQ >gi|157101647|gb|DS480677.1| GENE 18 23011 - 24417 1131 468 aa, chain + ## HITS:1 COG:BS_ycgA KEGG:ns NR:ns ## COG: BS_ycgA COG1288 # Protein_GI_number: 16077371 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 77 465 5 393 396 206 35.0 7e-53 MKEGNKRSWFPNVYTILFLLAIVSAILTWIVPAGSYERVTEENVTKVVAGTYHVIGQNPQ GPWEIFQALVTGFKNQSSLIYMILFVGAAVYMITETKAIDTIFMKLAKAVKGREEIAIFC VMFFMSMGGATGVFGNATLVLIPIGIFLSQAMGFDKTLGFFMIFFGQFAGFNVGWANAGV LGVAQAIAEVPLFSGFSARVIFHIVNFTLSYAFVIFYLHQIKKDPCKSLNYEQGLKTNDI MGYQDGELGDAPVTKVQVLSMLCMVAGLAAVVIGALKFKWGADKISATFLVVCLLIGCVS CKDINVGFNRFIKGCASTVGAAFIVGFANCLTVLMSNGMILDTIVYWLAKPISHMGAVLG AGFMFLANALINFFISSGSGQAAAVMPIMVPIADLTGITRQVAVQAFQFGDGFTNCVIPT IGTLMGGLGFAGISYGKYLKKAMPLILVQITLAFFTLMFLQSIGWTGL >gi|157101647|gb|DS480677.1| GENE 19 24455 - 25645 876 396 aa, chain + ## HITS:1 COG:FN1063 KEGG:ns NR:ns ## COG: FN1063 COG1473 # Protein_GI_number: 19704398 # Func_class: R 
General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Fusobacterium nucleatum # 1 391 1 393 394 259 38.0 9e-69 MTIHELVASLEPEINAVRQQIHENPELGLKEYNTSALIEKELREKTHVDRIEHIGETGLL VEIKGTKPGTAHCIALRGDMDALPIKEDESHELRSRTDGIMHACGHDVHTSILLGAVRVL EHYRNQIAGSILFFFQPSEETLQGGKLFAESPKIDFHTIDGVAALHVTPDLCAGKIGVRY GAILGSSDELTITVKGKGGHGAHPDTVIDPILLAAQIVQALQMLVSRELAADESGVVSLC SIQGGNAFNIIPDSVVLKGTIRALDPKVREHILKRIPEICKGIAAAGRGDAEVNIKLGPP PLVSDSQWVDRVKRCGSKLLGHENVIELAHPSMGAEDFAFVMEKAPGVFVRFGSRSEGGP YGGLHSPHFYCDRKALTTGILTLAGIALDFFDVDFE >gi|157101647|gb|DS480677.1| GENE 20 25664 - 26581 656 305 aa, chain + ## HITS:1 COG:MJ0244 KEGG:ns NR:ns ## COG: MJ0244 COG0329 # Protein_GI_number: 15668419 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Methanococcus jannaschii # 6 298 4 289 289 166 32.0 5e-41 MVQLHGSWVAMPTPFTEDDRIDFKGFETLIDRQIKYGTSQIFILGSAGEVTLLTLEEKKA IVHEVIKIVNGRIPVFFSAASMTTQASVDFARYCEQEGADGVIFTVPPYVLINQTAAFIH MDTCMSSVSIPCGIYNNPSRLGVQIMPETIKKLSDAHPNFVVDKEALPNVEQLVQVQRLC QGKVKIMCCDFPKYSIVLPTLAVGGTGTANIGGNIIPEEAALYSRPWTDMNIIEECRENY FKWFPLLQALYTFSNPVVIKAALNILGLPGGHLRKPYQDYAGTKLKDLENLMGEMGVIDK YGVKN >gi|157101647|gb|DS480677.1| GENE 21 26598 - 27953 205 451 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 95 411 119 418 458 83 26 4e-15 MAEKLIVIGGTAAGLSAASRARKEKPDMEIQVFEKSGFVSYGACGLPYFVGGLIHDINDL VAINAESLKNMRNISAWIHHEVLLIDPEKKEVSVKNLDTDQVSIHSYDKLVIATGAVPVV PPIPGIHSDGVYYLRNMEDGIRLKAAAREHGRVCIIGGGAIGLEAAEELRNAGLSVSVYE QFPRLLPFLDNAFSQALEDTLIKHGINVHTGTQIAEILSEDGKASGIRTAFGEIEPSDII LVSAGVKPAGALAEQAGLALGLKGGIIVDDEMRTSHKDIWACGDCVQMKNRITGKPAYVP LGTTANKQGRIAGGNVAGGHDTFKGILGSMVTKVFELFIAATGLSKEQAAGEGYDAISVS ITKADRASYYPGGRDNHICLIVDKKTGRLLGAQGIGSESIAGRINVLATAITCGMTVEEI NELDLVYAPPAAPVYDPILIAANQAMKKITE >gi|157101647|gb|DS480677.1| GENE 22 28037 - 
28771 569 244 aa, chain - ## HITS:1 COG:BH0033 KEGG:ns NR:ns ## COG: BH0033 COG0590 # Protein_GI_number: 15612596 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Bacillus halodurans # 93 244 2 154 159 174 55.0 1e-43 MAKGPSDIGEKNMALKQENMASNQKRIELNQKTVETSQEPLMPNRENRNREAAEIEAARL KGQETRRRNYEKRMEKQRLAALAAEEQRLRQRQKDEGFMREALRQAQKAAAIGDVPIGCV IVRGDKIIARGYNRRNADKSVLSHAEIISIKKACKKIGDWRLEDCTMYVTLEPCPMCAGA IVQARIPRIAVGCMNPKAGCAGSVLDMLHVPGFNHQAEVTEGVLEQECSKLMSDFFQSLR ERKK >gi|157101647|gb|DS480677.1| GENE 23 28866 - 29564 530 232 aa, chain + ## HITS:1 COG:MTH1586 KEGG:ns NR:ns ## COG: MTH1586 COG1180 # Protein_GI_number: 15679581 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Methanothermobacter thermautotrophicus # 1 191 4 186 233 124 41.0 2e-28 MKICGLQKTTLLDFPGRVAATIFTGGCNFRCPFCHNSGLLGSDAEEYECQEDILAFLEKR KRVLEGVCITGGEPTLQPDLEEFIRKVRSLGLAVKLDTNGYMPGILKDLCAKGLLDCVAM DIKAGRNHYEEAAGVSGLSMERIDESIEFLLSGSIPYEFRTTVVRGIHTSDDFRQIGPWI KGCPDYYLQCFTESGEVLVPGLYSDFSKDEMMAFADLVRPYVGQVSLRGIDY >gi|157101647|gb|DS480677.1| GENE 24 29763 - 32354 692 863 aa, chain - ## HITS:1 COG:no KEGG:Closa_0170 NR:ns ## KEGG: Closa_0170 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 330 1 344 561 182 39.0 5e-44 MTNYRRLISYIYAYEGEVKGKNIGFAKLESRNGQCKLSVNVKKVYVGSSDLGVYLLAPGK EIFLGSIFIRSGGGEFRAVVNVENAADSGCSMDQCYGLTIHEQGDTWRAYTTIWEDAVAH AAEVELADVTSSKVGDAEAQIHRKVEELNQELRSGEKAAEGGDGPVLSSSPAEARQMSAG RGAMKNAETPEGAEVVQETEVLAQTSEEVYQTENEANQMTEALDQAAEKVVGAEEILPES IEADTVEMSQKQENQDDRDDLALEPELLPDDKDSMELEPESLTDGMPDEMSNEIEEAQES GGIPDSQLESQSENTAESEITAELGKAAEEHKAGFQNSGLAGITPKRDMEQMDGMITPNM GREQETDHRAWSGNGDFGEQPNSKHGWDEKNVTSIRKQLYEENESVPRELQTAHDPLNMG LPRQEGMVPDTRIPLQPDNMPENRTVPRPNIMTRPGMPRRHMPAGTVRTGSSGGQSDQEQ MMDQCPLNQQPMPGQQPMPGQQPIPGQQQVPCQQPMSNQQSRMLRQLMPSNQTMPSRQQT 
PGQSSLGQHPVTNRQPVPGQQAIQNQQPISNQQPMPNQQPISNQRPMPNQQPMPGQRPTS NQQPMPNQQPMPNQQPMPNHRPTSNQQPMPNQQPMPNHRPMPNQRPMPNQQPIPNQQPMP NQQPMSNRQPLPNQRPGSQIQNQGRQMRPAGPEAGDNSEAAGQAEERVPVVDAGEETLIL GNPQDLERLEQEEQQSDFPGRLWEGFRKRYPKIQAFDSANGCEILTIKPQDIGLLPRENW NYGNNSFLLHGYYNYRYLILARIGDETKGRTRYILGVPGNYYSNEKYMASMFGFPHFVLA RKQPSQDGRFGYWYTDVRLENQD >gi|157101647|gb|DS480677.1| GENE 25 32564 - 33289 975 241 aa, chain + ## HITS:1 COG:CAC0187 KEGG:ns NR:ns ## COG: CAC0187 COG0363 # Protein_GI_number: 15893480 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Clostridium acetobutylicum # 1 241 1 241 241 266 54.0 2e-71 MKIYKAKDYDELSRKAASIIASQVLMKPECVLGLATGSTPIGTYKQLIEWYNKGDLDFSG VKSVNLDEYRGLTRDNDQSYYYFMYNNLFKHININMDCTNVPDGTQPDSDKECSRYEDVI KSLGGIDLQLLGLGHNGHIGFNEPDEEFAKTTHCVDLTQSTIEANKRFFASIDDVPKQAY TMGIGTIMKAKKILLVVSGSDKAQILHDVLCGPVTPHVPASVLQLHNDVIVVADEAAMAK L >gi|157101647|gb|DS480677.1| GENE 26 33472 - 34044 720 190 aa, chain - ## HITS:1 COG:no KEGG:Closa_0155 NR:ns ## KEGG: Closa_0155 # Name: not_defined # Def: Lipoprotein LpqB, GerMN domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 190 1 178 178 134 50.0 2e-30 MKKWLIGVMTVLMAVSLAACTPTTEKQKKETTAAAAAKSQEDMVPPSTGGASDKVPDPNV MPVAVISIYHGGGGSLVQDMDSLDTEGLDAQLLVDKLIEYGVLTDGTEVLSFDIEGKDEN AVGTLDLSQAESAEGCSDKMFLTEIGNTFTENFELSKLKLKVNGGNYEGDDIKQGDSDYL TYNADYESVE >gi|157101647|gb|DS480677.1| GENE 27 34073 - 34576 509 167 aa, chain - ## HITS:1 COG:CAC3579 KEGG:ns NR:ns ## COG: CAC3579 COG1846 # Protein_GI_number: 15896813 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 30 165 8 143 154 122 50.0 2e-28 MCEHLKYCTISKKLLKQKVKGRNVGAYETLNEVLVSLFRDVNDIEQKAIITSEFSDITNN DMHVIDAIGIDRPKNMSSIARELSVTVGTLTISVNSLVKKGYVVRNRSSEDRRVVFISLS EKGVKAYYHHKKFHEQMIDSVVKELTEEELEALVKALTKLNTWFRSF >gi|157101647|gb|DS480677.1| GENE 28 34646 - 36028 1130 460 aa, chain - ## HITS:1 COG:CAC3194 KEGG:ns 
NR:ns ## COG: CAC3194 COG0771 # Protein_GI_number: 15896442 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Clostridium acetobutylicum # 19 459 11 460 462 191 29.0 2e-48 MLYRRLARRTDVIEKIEPWIKGKRILLLGYGREGQSTWNVLRRLGTYKVLDIADLKAPAA VPEDGTVWHTGPDYQKCMDDYDVVFKSPGIVLERPENEYRCSILSQTEVFFQCFRDQIIG ITGTKGKSTVTTLLYHLLKQAGMDALLVGNIGIPALDHMEEVKPDTRIVFELSCHQLEYM TVSPHIGILVNIHEEHLDHYGTMEKYVEAKHHIFKNQGPDDILICNVQCLPEEGTCPSGL IRAGMDGSGKELDVVQEQDGTWVHFRGKSFCIPTDEIKLLGQHNYFDIGVAYGVCSILGM DDQVFARGLKTYEPLPHRLQYIGEREGVKYYDDSISTICDTTIQALKTLKDTDTVLIGGM DRGIDYRELIEYLSDCQVPHIILMEATGKRIYQEIHKYYPEFKNRARLILAEHLEDGVKR ARQITRPGTSCVLSPAAASYGIFRNFEERGETFSRLVFNK >gi|157101647|gb|DS480677.1| GENE 29 35992 - 37446 1191 484 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0868_1153 NR:ns ## KEGG: HMPREF0868_1153 # Name: not_defined # Def: putative UDP-N-acetylmuramoylalanine--D-glutamate ligase # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 2 418 4 408 1019 351 45.0 5e-95 MKQYLSLREQYPRFAYRGYEIEENDSCLKITYRFETLGLSEFAPVWVFPKTEGDCRRWSE DRLMQDMIFSLGMVELVSYWKIACPPIVTVEAGWLNQDQIDWWKDLYFNGLGEFFYVNGI KEADPNHFMDIRCVGQHETQCACQLKDPCTDQYKERHEECGVETDGKGNGVLVPIGGGKD SAVTLELLRLAGRPVCAYIINPRGATIHTTEVAGLDAAHVISAKRTLDSNMLELNRQGYL NGHTPFSALVAFSGIIAARMHGLTMVALSNESSANESTVQGSTVNHQYSKSFKFEEDFHY YQTTYLKGSAYYFSMLRPLSEFQIARFFAGQKQYHGIFRSCNAGSKTDSWCGHCPKCLFV YLILSPFLKPQEVRDIFGRNMLDDWDMKETLDQLIGIEEEKPFECVGSRDEINTAIVLTI KGLEDAGEALPCLLSYYKTTDLYQTYRTRGDQYSSYYDGNNLVPDELAGLVRKCCTDGLQ GEQT >gi|157101647|gb|DS480677.1| GENE 30 37674 - 38201 588 175 aa, chain - ## HITS:1 COG:CAC0470 KEGG:ns NR:ns ## COG: CAC0470 COG0700 # Protein_GI_number: 15893761 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Clostridium acetobutylicum # 6 170 3 167 173 112 33.0 3e-25 MKFLIFLSEAMVPLMIFYIVGFGLLSGRPVLDDFIDGAKDGMKTVAGILPTLVGLLVSVG VLRASGFLDFLGELLQMPAALLHIPPQIVPVVLVRLVSNSAATGLVLDIFKEYGTDSSLG 
LIASVLMSSTETVLYCLSVYFGSVGITRTRYTLAGGLIATAAGVAASVVLAGAVR >gi|157101647|gb|DS480677.1| GENE 31 38381 - 38578 67 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936906|ref|ZP_02084270.1| ## NR: gi|160936906|ref|ZP_02084270.1| hypothetical protein CLOBOL_01795 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01795 [Clostridium bolteae ATCC BAA-613] # 9 65 1 57 57 103 100.0 5e-21 MVYTLQVPMAQWPEEPAQITILDEDVPLAPLPKTGEGPRSYYLAAVISSLLSGLYIALHG RKRDS >gi|157101647|gb|DS480677.1| GENE 32 38635 - 39141 453 168 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936907|ref|ZP_02084271.1| ## NR: gi|160936907|ref|ZP_02084271.1| hypothetical protein CLOBOL_01796 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01796 [Clostridium bolteae ATCC BAA-613] # 1 168 1 168 168 328 100.0 1e-88 MGGTLVQDEEAIKRIEELIVSDRASTMLVCIQGTREGEPCGEIYNCYLREPVLFNGAGDM VLKLDEMCNWVGAPQSTTDYRFLNREMERQYREDSARHPQVSRDNLVYSIDRIPFQHALK AREVLVVYIKYRQNSSLQGSVRGRITKGEVVSFRSALELMRMVRMIQV >gi|157101647|gb|DS480677.1| GENE 33 39425 - 40297 671 290 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 11 276 1 262 265 111 25.0 1e-24 MAKRIITDNMIKDYCGEMHQNEMSSGTIKKYFYYLNLFRKYADGKPVSKELVIVWKDTLR EQFSPVTANSALAALNGFFKWFGWEDCTVRLIKVKHRMFCSGQRELSREEYIRLVNVARG EGNERLSLLLQTVCSTGIRISELSFITVNAVDKQVAEVDCKGKIRTVFLTNGLCRLLKAY ARKRNIISGMIFVTRSGKPMDRSNIWREMKQISHKAGVNPDKVFPHNLRHLFARVYYSQE KDLVRLADILGHSSVNTTRIYTMESGENHLRQLEKMNLLIDRYNRIPLLL >gi|157101647|gb|DS480677.1| GENE 34 40495 - 41586 1092 363 aa, chain - ## HITS:1 COG:MTH867 KEGG:ns NR:ns ## COG: MTH867 COG1915 # Protein_GI_number: 15678887 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanothermobacter thermautotrophicus # 29 354 86 396 411 202 38.0 7e-52 MSFKLVKYKEPDFSQSQFTDAPDAVTAPAPKDKVAPEHYHATTIFPEYFKVKGQWLLAEE 
SRMDCVAVYENGKIYVREFRNLKQGDLVFTGRTENCEDGIFVYPHGFDEGEGGHDEAFAF RQRRSRETAFSKDYDNIYELLRHEREHGYVVWVMGPACAFDSDSREAFSKLVMNGYVDAL LAGNALATHDLEGAYLNTALGQNIYTQYSQPNGHYNHIDTINRVKFHGSIPAFVEKEGID NGIIYSCVKKDVPFVLVGSIRDDGPLPEVYANVYEGQDAMRDCVRKATTVICMATTLHSI ATGNMTPSFRVMPDGTIRQVYFYSVDVSEFAVNKLGDRGSLSAKSIVTNVQDFVVNVSKG LGL >gi|157101647|gb|DS480677.1| GENE 35 41739 - 42671 798 310 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 308 3 304 308 311 54 7e-84 MSKIYKGAIELVGNTPLVEVTNIEKKEQLKARLLVKLEYFNPAGSVKDRVGKAMIEDAER TGRLKPGSVIIEPTSGNTGIGLAAVSAVKGYRMILTMPDTMSVERRNILKAYGAEIVLTP GEKGMSGAIEKAEELAKEIPDSFIPGQFDNPVNPRAHMESTGPEIWQDTDGQVDIFVASV GTGGTLTGTGTYLKEKNPRIKVIAVEPSTSAVLSGGSAGPHKIQGIGAGFIPKVLDTRVY DEIITVDNEAPFATAKMLARTEGLLTGISSGAVLYAGIEVARRPENEGKTIVALLPDSGD RYYSTALFVE >gi|157101647|gb|DS480677.1| GENE 36 43201 - 44052 725 283 aa, chain + ## HITS:1 COG:no KEGG:CLM_0794 NR:ns ## KEGG: CLM_0794 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A2 # Pathway: not_defined # 1 238 11 248 271 138 33.0 2e-31 MKTAAQFFGEDLLPYIGVKGKILTVAPTEHVHLEMRRLEEDFNFIMEKGAIRHLEFESDS ITEQDLRRFREYEAYIGMIYSAPVMTTVICTSNTKVQKKELINGDSVYRIEVVRLRDRNA DKVFKKLRQKIHKGKRLRRRDIFPLLLTPLMSGDMEMSERIYQGMEFLQCEELQVSADER KRMQSVLYALAVKFLSRNELERIKERIGMTVLGKMLLEEGIEKGIEKGIEKGIEQGMEQG VKQGQGRVNALIVKLAEAGRMEDIVRAASDREYQDRLFSEFGL >gi|157101647|gb|DS480677.1| GENE 37 44115 - 44813 256 232 aa, chain - ## HITS:1 COG:CAC3603 KEGG:ns NR:ns ## COG: CAC3603 COG2186 # Protein_GI_number: 15896837 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 7 202 8 201 225 90 30.0 3e-18 MNENVVESIYNKMVESITNGQWAAGSKIPTENELAKTYQVSRSSVRQAISRLKALGLLES RQGSGTVIKKSGISATLSDIIPTIMFEIEDNLQIFEFHKGIQIECAKLACLRYTDEQMEQ LMFHSRKMNEYYSDDNRNRAIFHDLECHKTICEMSGNLMFVRATEIIYQRLEQCFFQICA TFDYKESIVFHERLIAALKDRNPIFSSSVMEAHQWDTYQKFLNISKNSHRSP >gi|157101647|gb|DS480677.1| GENE 38 45104 - 46213 833 369 aa, chain 
+ ## HITS:1 COG:TM0958 KEGG:ns NR:ns ## COG: TM0958 COG1879 # Protein_GI_number: 15643718 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 110 366 60 318 323 76 28.0 1e-13 MKKLAVLFLAAFMGLTACSSAQKDTPTESPTTEASTTATGTETADTQSVPSEKEQIPHTE VDIKYPAIIEVNEDVYADKPYKIAYANWNDENSISIIVRDSVLDACRYYGVDCMVFDNKQ DANQAITNCDTAIQAGVDYYLNFNQDESINAAIADKLEKAGIPMISIQTKARDGIDPEYH IDNYKFGTLCGELLIKEAGERWPDEKPVLFVAGHPESGVNFIQRAEGCKDSVLEALPKTE VFEISTEGDPEITRQRTADFLTAHPNGKIMMWCHIDQNTMGMLAAVKAAGRVDDVIITSC GANPVAFEELKKDDSPILGTVAQFSERFGWDLLPMVIEHLNDGTKPPLETVPPLDIITKE NLYTYYPDA >gi|157101647|gb|DS480677.1| GENE 39 46329 - 46844 360 171 aa, chain + ## HITS:1 COG:no KEGG:Dtox_0837 NR:ns ## KEGG: Dtox_0837 # Name: not_defined # Def: cupin 2 conserved barrel domain-containing protein # Organism: D.acetoxidans # Pathway: not_defined # 18 145 2 128 130 127 47.0 2e-28 MDKKFNITPDEEALYARNRELTPYVRTFEEDGISYPIPDLIGSSRLIAIDEDTVGAKEIT FGYSEFAPGSSVHKPHIHPDCEEVMYILKGRGISGLNGMDFICKEKDILFVPKGAEHYFY NPFDESCCFVFLYTKGSLKEAGYTVASNGYNEIGGDVEALQKSGVNRFDSE >gi|157101647|gb|DS480677.1| GENE 40 46874 - 48367 1029 497 aa, chain + ## HITS:1 COG:AGc5112 KEGG:ns NR:ns ## COG: AGc5112 COG1129 # Protein_GI_number: 15890066 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 495 25 515 521 398 42.0 1e-110 MNQAVLEVKKIHKTFGEVSVLKGVDFDIQKGEIHALVGENGAGKSTLIKIISGVYTRDSG FLHLEGKEMSFANPKDAMDAGIRVIHQEINMAQTLTVAENIFLGNYPTRHGIIDWKELYK NARKVMEILGDSIDVEQKVSKVSIAEQQMIEIAKAVSVEPKVLIMDEPTAALNDQETESL FELLESLKKKGVSIIYITHRFSEICRLADRVTVLRDGESVATLTKAEISDDLLVKLMVGE GKAARYIKRETRKGGEIFRIEGLSAEGSLNNIRLSIRRGEIAVVFGLVGAGQTELCRVIF GDLPHTKGTLLLDGKEVHINNVQDACREGIGYVSDDRKNEGIIPLLSVQENICIPSYPGK LSNQLGFIKKKTAKQTAQLYYNKLHVKSSGLGQKIGSLSGGNQQKGMICRWLANDVKLLI LNMPTRGVDVGARAEIYRALEDLADQGVAVLAVSPEMQEVLALADVVYVMHEGMITGKVE GKDATQEKLMKLALGID 
>gi|157101647|gb|DS480677.1| GENE 41 48380 - 49354 565 324 aa, chain + ## HITS:1 COG:BH3731 KEGG:ns NR:ns ## COG: BH3731 COG1172 # Protein_GI_number: 15616293 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus halodurans # 27 318 20 309 314 202 47.0 7e-52 MAENHMLNKKANLQAVRQRLLDSILVLFAVCMIIYFSFASQYFLSVRNFMNIFSSVSVVG IIATGMTLIIITRGIDLSVGSIIALTGCVAAILIVNFKVSWPLAILATLIIGFLVGGFNG LLITKFNVVPFIATLGSMNIIRGIAFIITNGQAIYVPDPVISFMGTGKLFSLIPVPAIIM LCLYILFWLITKFTVFGRNVFAVGGNSVASRLAGIRVKHLTMVLYVLTGILSAVAGLVMT GLTSTAMPSAGDGYNLDVITAVYLGGNSASGGEGSVWRTFMGILIIGILNNGMALLSVQS YWQTFVKGCLLIIAVIFDMLRRRG >gi|157101647|gb|DS480677.1| GENE 42 49373 - 50446 802 357 aa, chain + ## HITS:1 COG:BS_gutB KEGG:ns NR:ns ## COG: BS_gutB COG1063 # Protein_GI_number: 16077682 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Bacillus subtilis # 3 357 5 353 353 194 33.0 2e-49 MNIPVKMKAMVLTNPGKWEITEADVPTPGPEEVLCRIDAVAICGSDPEIIHGGLAGIWPP SYPFIAGHEWAGTVVASGNKSGDFKIGDRVAGEAHKGCGYCSNCLKGSYNLCLNYGKNNT GHRHYGFTSQGAYAQYNVYHIKSLSKLPDSVSFEEAAMCDTAGVALHGLELAGVDPGSTV AIIGPGPIGIMAMKLSRAMGASRILVVGRMPRLEAAGNLGADELVDFSKCNPVDEVRRLT GGLGANLVLECSGAPGTITQSLGMCCKGGKVVMLGVAKDGVTEPIPLKYTTHNELTLYGS RANPNTGRKIVEMIAGGQLKVKDMVTHTFTLNEVNLAFETFEKRIGGAMKVIIYPNK >gi|157101647|gb|DS480677.1| GENE 43 50462 - 51406 488 314 aa, chain + ## HITS:1 COG:MT1754 KEGG:ns NR:ns ## COG: MT1754 COG1250 # Protein_GI_number: 15841175 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Mycobacterium tuberculosis CDC1551 # 6 285 11 275 304 148 35.0 1e-35 MTGSVAVIGSGMMGSAIAAMSALAGNRTLMVDLNMEKALNGRKNALECIKMRADNGLNTQ EEAVLAADNLICCGSLKKEAENIGLVIEAIVEQLPAKQNLFEELDKLLLPTVPICSNTSG LRITDISAKCMYPERTITTHFWLPGHLVPLVELVMGEKTREDIALCVRDELTHWKKVPVL 
VKRDVPGQLANRIFQAIIRESIDIVASGLAGAEDVDAAISYGMAMRFPEWGPLKHLDAIG LDLGISVQDTVLPDICAVRHSNTYLKNLVDEGKLGVKSKQGFYDWKVRDIEKEIRKRDQF IIETVKLVTRLDKG >gi|157101647|gb|DS480677.1| GENE 44 51410 - 52504 221 364 aa, chain + ## HITS:1 COG:BS_ykgB KEGG:ns NR:ns ## COG: BS_ykgB COG2706 # Protein_GI_number: 16078366 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Bacillus subtilis # 45 355 38 346 349 176 34.0 7e-44 MKQYICIGTYTEDILFGTGECFKGKGKGLLLGIFEDGEIELCRIAQTINPSYLCINKVNK KIYAVNETKIFQGKRGGAVTQFSYNINGELLQEACFYTDGEDPCHIASSLDGKCLAIANF ASGSLSVFELNEHGNITGKKNLFQHTGSGSHPIRQSGPHAHSAIFSPDGKFLYVPDLGID CIKAYACTSGKIEPIPEADVHMPSGSGPRYGEFSRDGKHFYLINELSSQVTHFFYHNGKM IEKDSVCTLPDDFTGYNICSDLHLTPDGAFIYASNRGHDSIVCYRINANGELSFVQRISS GGKTPRNFCIDPTGAYLLTGNQDSDTVAIFLIQPNGLLKMCGTKYVESPVCIQIFTPGHC HQEL >gi|157101647|gb|DS480677.1| GENE 45 52577 - 53680 1171 367 aa, chain - ## HITS:1 COG:no KEGG:Closa_1264 NR:ns ## KEGG: Closa_1264 # Name: not_defined # Def: glycoside hydrolase family 25 # Organism: C.saccharolyticum # Pathway: not_defined # 57 364 7 316 316 436 65.0 1e-121 MRILGKAGKMGEIGESEKTGGPGKTEGPGKTGGPGRSGAPGRTGASGKSAKPVTKGKRAA ACAAAAIIGAMSLFAPAAAYGAQPWSLENGQYVDASGAPIAGALEKGITVTKYQNRANEN GIDWSRVASDGVSFAMIRVGYYKDKDPYFDRNVTEAFANGIHTGVFFYTQALDVQTAIDE ANFVLKVIKDFPISYPIAYDVESQHLLDNGLTRQQITDNVNAFCKTISDAGYHPVVYGNN EWLTRNMDTGQIPYDIWYARYGTVNSYPNRTIWQCTDTGSVDGINGNVTIELAFTDYSAV IPADGWKHVDGRWYYMKGYVKQTGWVEVDSAWYYLDANGVMIHDTTMDIDGVSYTFDSNG VMAEPTR >gi|157101647|gb|DS480677.1| GENE 46 53811 - 54287 510 158 aa, chain - ## HITS:1 COG:MK0151 KEGG:ns NR:ns ## COG: MK0151 COG1720 # Protein_GI_number: 20093591 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanopyrus kandleri AV19 # 15 141 38 165 173 89 39.0 3e-18 MKQFTVRQIGTMEMDKDGMYVKLLPEYIPALKSLEGFGHADILWWFDGCDDEENRSVLEA ESPYKQGPDTMGIFATRSPQRPNPIALSVVQILYIDHEAGVIYIAYADARQGSPVLDLKP YTPSLDRVEHPQVPDWCGHWPRSLEESGDFDWENEFNF >gi|157101647|gb|DS480677.1| GENE 47 54331 - 55167 
527 278 aa, chain - ## HITS:1 COG:MA0989_1 KEGG:ns NR:ns ## COG: MA0989_1 COG0789 # Protein_GI_number: 20089866 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 4 103 8 106 124 80 43.0 4e-15 MFRIGAFSKLTRVSVRMLRYYDNAGLLTPAVIDKFTGYRMYTTDQIPVLQKICLLRDMGF NVTEIGAVMECWESSGVTEYLEDKKRQLLDSIRLEQQRIRKIEIAMRDFKESTIETHYNV TIKSVPGCKILSLRKTVADYYCEGRLWEQLYAFVREEQIKLSPGTNNLAIYHNGGTTEDG VDIEVGVMVEREGADKGGFCYRETEAVEDMACIMVYGPYENIGLSYHAFACWMEEHSQYE IAGPSRQICHIGSWDESDSQKFLTEVQTPVRRILTLTP >gi|157101647|gb|DS480677.1| GENE 48 55240 - 55836 541 198 aa, chain - ## HITS:1 COG:BH1574 KEGG:ns NR:ns ## COG: BH1574 COG2715 # Protein_GI_number: 15614137 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for spore maturation in B.subtilis. # Organism: Bacillus halodurans # 1 194 1 185 197 147 43.0 2e-35 MLNYLWAGMTLTGILWGALHGQMTAVTDGAIQASKEAVTLCITMLGVMTLWTGVLEIGHR SGLVDQLARRMGPVLTFLFPRLDPDGEARKQISVNMIANILGLGWAATPAGLKAMEELKK VEEERGMGGAARQEGTASNEMCTFLIINISSLQLIPMNMIAYRSQYGSVNPTAIVGPALA ATFISTVVAVIFCRIMDR >gi|157101647|gb|DS480677.1| GENE 49 56193 - 57617 1619 474 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 29 314 526 718 744 84 26.0 5e-16 MKRQAKWFLIPCAAMALTMGSALVSFAATGWAEENGEWVYYNNDGSKATDVFKKSGNNWF YLDSDGNMAKNQLIEDDDNYFYVNSAGAMVTNQWRSIENEDAGSDEPDEWWYYFQSNGKA VKKSGNSDNVKFVTLPTSTGQAKFTFDDEGRMLYGWIDESGEMLTDDDAWKSGMYYCSDN GDGRMATGWKYITAVNDEDDDREGDGYWFYFSTNGKKTADNDSKKINGRKYRFDENGAAH FEWYNNPGTASSSTVNKYYNTEEECWMSTGWFKTVPSEDVDPEAHDDDEAHWFYAESNGS LVTAQIKKINGQYYGFDVNGKMLQGLYRIEFEENGKTIRSAEEIEDVDELPDEDEDGVFV YYFGDSPKEGAMKTGTMTMEIDGDKYYYSFEKSGSKRGAGTDGIDGDSIYVKGRRLEAEE GTKYQPVTYKDETYLISTSGKLVKNKKNVKDSDDVYYKTDSKGRILDSGTEKLD >gi|157101647|gb|DS480677.1| GENE 50 57870 - 58553 533 227 
aa, chain + ## HITS:1 COG:SPy2153 KEGG:ns NR:ns ## COG: SPy2153 COG1284 # Protein_GI_number: 15675895 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pyogenes M1 GAS # 27 210 16 198 290 107 33.0 2e-23 MKQLKKLACRAVQELSITFPPKTVLSIILGTAITTFGIYNIHQQADITEGGILGLILLFH FWFGMSSSILSPVLDALSYALGFRFLGKEFLKTSIFATICMAGFFRLWELFPPVLPSLAD YPLLAALAGGCFIGTGCGLVVRQGASCAGDDALALVISKVTGCRISRAYLLTDVSVLVLS LSYIPAGRIVYSLITVTVSSFMIDFIQNFGIPRKDEDNGKETAADNG >gi|157101647|gb|DS480677.1| GENE 51 58606 - 59475 596 289 aa, chain - ## HITS:1 COG:ybbI KEGG:ns NR:ns ## COG: ybbI COG0789 # Protein_GI_number: 16128471 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli K12 # 25 137 3 114 135 61 30.0 2e-09 MKKLNETEGVKMEKALAEKTGGYLIGDVAQMVGLSRDALRFYEKKGIIRADKKENGYRYY SEDDIYRLMYILYHRKMNTSLEEIGGLMSGQNSTSAMRQHVRQRMAEEEEALRRHSQAIM RLKLVEKDISRIEACMGRYSIRKFPKAFVMANCSDLQEGLRTWFKLSSTIPGLDMAYFYN VLTYTGQDLEEKGTQLLLYEGLEKGLGQEFDRSLYSMTEEPECIYTIIESEYTLPDFDMI HRMVQWAHKHGLEPMECVYANDMTSFFAKDRTTYCLEIYMPFKRIASPV >gi|157101647|gb|DS480677.1| GENE 52 59702 - 60835 1187 377 aa, chain - ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 3 375 2 350 355 334 46.0 2e-91 MSELKPLVIGDLVVEKPIIQGGMGVGISLHRLAGAVAKAGGMGIISAAQIGFREPDFTTN FVEANLRSIRREMKLAREIAPQGAIGFNIMVATKHYDMWVKEAVKAGADIIISGAGLPVS LPEYVEAAYAEMEKKPDRRIKLAPIVSSAKSAMVICKMWDRKCHIAPDLVVVEGPLAGGH LGFSLDQLSAYGADTSDVPATYDREAYDREVKAVIKVVEEYGTKYGRHIPVVTAGGIYTH EDVMHQLELGADGVQVATRFVTTQECDASDAYKQAYINAGKEDIVITQSPVGMPGRAILN PFLSQIKGGTRPAIKSCFQCLEHCDIRTIPYCITMALVYAAEGDTDHGLLFCGSNAYRAE KIDTVDNVMKELTGELA >gi|157101647|gb|DS480677.1| GENE 53 60835 - 62556 1719 573 aa, chain - ## HITS:1 COG:CAC3569 KEGG:ns NR:ns ## COG: CAC3569 COG0777 # Protein_GI_number: 15896803 # Func_class: I Lipid 
transport and metabolism # Function: Acetyl-CoA carboxylase beta subunit # Organism: Clostridium acetobutylicum # 6 283 6 284 285 345 59.0 1e-94 MRNMFKKTYTKIDTKYKPQAGSDQEPVIPEGLWRKCNKCGQPIYVEDVKNNYYVCPKCNG YFRVHAYRRIEMLADEGTFEEWNKEMPFSNPLNFPGYEKKVIAAREKSRLNEAIVTGKCK VDGNPAVVGVCDARFIMSSMGHVVGEKITDAVERATREKLPVILFACSGGARMQEGIVSL MQMAKTSAALKRHHEAGQLFISVLTDPTTGGVTASFAMLGDIILAEPHALIGFAGPRVIE QTIGQKLPEGFQRSEFLLEHGFVDKIVERKDQKRVIGQILYMHRSHKMNIELKSPVETAR TFVDKPSQLQADSGKTAWDTVLLSRKSDRPVATDYINALFDEFIEFHGDRYFKDDGAIVG GVAMFHGMPVTVIGQQKGKNTKDNIIRNFGMPSPDGYRKALRLMKQAETFNRPVICFVDT PGAFCGLEAEERGQGEAIARNLFEISDLKVPVLSIVIGEGGSGGALAMAVADEVWMLENA IYSILSPEGFASILYKDSKKAPDAAKVMKVTAKDLLQLELIERIVPEEEPACTDNLYRLA EYMDSAMAEFFKTYLAMSDEELREHRYQRFRRM >gi|157101647|gb|DS480677.1| GENE 54 62721 - 64073 1483 450 aa, chain - ## HITS:1 COG:CAC3570 KEGG:ns NR:ns ## COG: CAC3570 COG0439 # Protein_GI_number: 15896804 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Clostridium acetobutylicum # 1 445 1 445 447 575 64.0 1e-164 MFNKILIANRGEIAVRVIRACREMGIQTVAVYSEADRDCLHTMLADEAICIGPAASTDSY LNIERILAATVAMKADAIHPGFGFLSENARFAKLCAECNIAFIGPSAEIINKMGNKSEAR KTMMEAGVPVVPGSKEPVHAAEDGLAMAKEIGFPVMIKASSGGGGKGMRISRSPEDFTEL FNAAQMESVKGFSDDTMYIEKYIEKPRHVEFQIMADKHGNVVHLGERDCSIQRRHQKVLE EAPCDVISPGLRKAMGDTAVKATKAVGYENAGTIEFLLDKDKNFYFMEMNTRIQVEHPVT EMVTGMDLIKEQIRVAAGETLSVSQEDVRIEGHAIECRINAENPSKNFMPCPGRITNVHI PGGNGVRVDTHIYNDYKVPANYDSMLMKLIVYDKDREAAISKMRSALGEVIIEGIETNIN FQYEILENDAFQQGDTDTSFIEKHFPDYVR >gi|157101647|gb|DS480677.1| GENE 55 64094 - 64519 536 141 aa, chain - ## HITS:1 COG:CAC3571 KEGG:ns NR:ns ## COG: CAC3571 COG0764 # Protein_GI_number: 15896805 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Clostridium acetobutylicum # 1 140 1 140 141 167 60.0 7e-42 MLMGVKEIQEIIPHRHPFLLVDCIEELQPGVRAVGYKCVTYDESFFKGHFPQEPVMPGVL LIEALAQTGAVAILSQEGFKGKVAYFGGIEKAKFKRKVVPGDRVKLECEIIKQKGPVGVG 
KATATVDGKLAVSAELTFMVG >gi|157101647|gb|DS480677.1| GENE 56 64547 - 65065 590 172 aa, chain - ## HITS:1 COG:CAC3572 KEGG:ns NR:ns ## COG: CAC3572 COG0511 # Protein_GI_number: 15896806 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Clostridium acetobutylicum # 1 171 1 156 159 102 39.0 3e-22 MNVDEIIKLMQAVSDNGLTSLELKEGELKLSLKREKEMPQIVTVSAPSPDAPSMQMAAMQ NGLAAPSMMSPAMMPQGVMMPQAAQDTAGRADIGSDKVVTSPLVGTFYNASSPDAAPFVQ AGDTVKKGQVLGIIEAMKLMNEIESEYDGIVEAVLVNNEEVVEYGQPLFRIK >gi|157101647|gb|DS480677.1| GENE 57 65082 - 66320 1672 412 aa, chain - ## HITS:1 COG:CAC3573 KEGG:ns NR:ns ## COG: CAC3573 COG0304 # Protein_GI_number: 15896807 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Clostridium acetobutylicum # 1 410 1 410 411 452 56.0 1e-127 MKTRVVVTGMGAITPIGNDVESFWQGLKDKTVGIGPITYFDTTDYKCKLAAEVKGFDPKQ YMDAKAARRMEAFSQFAVAASKEALEQSGIDMEKEDPYRVGVCVGSGIGSLQAMEKDVKK LNEKGPSRVNPLLVPLMISNMAAGNVAIQFGLKGKCFNVVTACATGTHSIGEAFRSIQYG EADVMVAGGAEASITPIGIAGFTSLTALNTTEDASRASIPFDEDRNGFVMGEGAGVVVLE SLEHAKARGANILAEVVGYGATCDAFHITSPAEDGSGAARAMENAMKDAGMAAEDIDYVN AHGTSTHHNDLFETKAIRLALGDHAEKVKINSTKSMIGHLLGAAGGVEFITCVKSIQDGF VHATVGLEKPGEGCDLDYTMGDGVSMNVDVAISNSLGFGGHNASLIVKKFSE >gi|157101647|gb|DS480677.1| GENE 58 66369 - 67109 222 246 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 5 242 4 238 242 90 27 5e-17 MLEGKIALVTGASRGIGRQIALTLGREGASVIVNYNGSAAKAEEVVKEIEAAGGKAEAVQ CNVSDFESCGRMMADVVSRYGRLDILVNNAGITRDNLLMKMSEEDFDAVISTNLKGVFNC IKHISRQMLKQKAGRIINISSVSGVLGNAGQANYCAAKAGVIGITKSAARELASRGITVN AVAPGFIATEMTDVLSDSVKAAATEQIPMKHFGSTQDIAETVAFLASDKAGYITGQVLSV DGGMAM >gi|157101647|gb|DS480677.1| GENE 59 67127 - 68050 1177 307 aa, chain - ## HITS:1 COG:CAC3575 KEGG:ns NR:ns ## COG: CAC3575 COG0331 # Protein_GI_number: 15896809 # Func_class: I Lipid 
transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 1 301 1 306 308 311 52.0 1e-84 MSKIAFIFPGQGAQACGMGQDFYEQTETGKRIFDKATELMGFSMPQLCFEENDRLDITEY TQAAMVTASIAMMRVLEENGIKPDVAAGLSLGEYCALAAAGVMSDEDAIRTVRQRGILMQ EAVPVGEGAMAAILALDASAIEEVTGAMEGVWIANYNCPGQIVISGEKAAVEDACEKLKA AGAKRAVMLNVSGPFHSGMLADAGEKLGEVLSQVELHEPQIPYVANVTAQYVKNAAEVKE LLTRQVSSSVRWQQSVEAMIADGVDTFIEIGPGKTLAGFMRKISRDVKTLNVEKLEDISK VAEALKS >gi|157101647|gb|DS480677.1| GENE 60 68122 - 69048 1246 308 aa, chain - ## HITS:1 COG:CAC3576 KEGG:ns NR:ns ## COG: CAC3576 COG2070 # Protein_GI_number: 15896810 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 1 301 2 302 310 390 68.0 1e-108 MKTRITEMLGIEYPIIQGGMAWVAEHNLAAAVSEAGGFGLIGGANAPGEVVRDEIRKARE LTDKPFGVNVMLLSPHAEDVAKVVVEEGIKVVTTGAGNPEKYMEMWKAAGIKVIPVVASV ALARRMEKYGADAVVAEGMESGGHIGEQTTMTLVPQVADAVSIPVIAAGGIGDGRGIAAA FMLGAEAVQMGTRFVVAKESIVHENYKQRIIKAKDIDSTVTGRSHGHPVRGLRNQMTREY IKLEQEGKSFEELEYLTLGTLRKAVMEGDVNGGTVMAGQIAGMISKEQTCKEMIEEIMAQ AGKLMGWN >gi|157101647|gb|DS480677.1| GENE 61 69147 - 69380 453 77 aa, chain - ## HITS:1 COG:aq_1717a KEGG:ns NR:ns ## COG: aq_1717a COG0236 # Protein_GI_number: 15606797 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Aquifex aeolicus # 3 73 5 75 78 67 60.0 5e-12 MLEKMKEIIAEQLSVEADTVTEASSFKEDLGADSLDLFELVMALEDEYSVEIPAEDLEQL TTVGEVMNYLKAKGVEA >gi|157101647|gb|DS480677.1| GENE 62 69431 - 70393 935 320 aa, chain - ## HITS:1 COG:CAC3578 KEGG:ns NR:ns ## COG: CAC3578 COG0332 # Protein_GI_number: 15896812 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Clostridium acetobutylicum # 2 320 3 323 325 362 55.0 1e-100 MTTRIVGTGSYVPEQIVTNDDLAKIVETNDEWIRSRTGIGARRIATSESTSYMAAEASIK 
ALENAGVKPEEIDLILLATSSPDYCFPNGACEVQGRIGAVNAACFDLSVACTGFVYALNT AHAFISSGIYKTALVIGADVLSKLIDWTDRGTCVLFGDGAGAVVVKADETGILGMNMHSD GTKGGVLTCGSRTNGNFLMGKKPELGYMTMDGQEVFKFAVKKVPQCIMEVLDDTGVKAED VRYFAIHQANYRIIESIAKRLKVNVDRFPVNMEHYGNTSGASVPLLLDEMNRKGMLSAGD KVVLSGFGAGLTWGAALLEW >gi|157101647|gb|DS480677.1| GENE 63 70665 - 70928 345 87 aa, chain + ## HITS:1 COG:CAC0310 KEGG:ns NR:ns ## COG: CAC0310 COG2002 # Protein_GI_number: 15893602 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Clostridium acetobutylicum # 1 79 1 79 81 75 50.0 2e-14 MKTTGVIRKLDELGRITLPIELRRSLDLDVKDGLEITVQDDCIILKKAECADIFTGSTDD LLEYEGKKVSRQSVVALAKLAGLNVTE >gi|157101647|gb|DS480677.1| GENE 64 71057 - 71902 919 281 aa, chain - ## HITS:1 COG:CAC0307 KEGG:ns NR:ns ## COG: CAC0307 COG0313 # Protein_GI_number: 15893599 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Clostridium acetobutylicum # 3 278 4 276 282 273 48.0 3e-73 MAGTLYLCATPIGNLEDITFRVLRTLKEVDLIAAEDTRHSIKLLNHFDIKTPMTSYHEYN KVDKARYLVGQLEEGVDIALITDAGTPGISDPGEELVRQCYEAGIQVTSLPGPAACITAL TMSGLSTRRFCFEAFLPSEKGDKKERARILEELRQETRTIIVYEAPHHLVRTLEDLYKVL GDRNITICRELTKKYETAYRTTFRDALEHYHEEEPKGECVIVIEGKPLEELRKEQISRWE EMEIQEHLAFYTDQGMDRKEAMKAVARDRGITKRDVYQQLL >gi|157101647|gb|DS480677.1| GENE 65 71989 - 72771 714 260 aa, chain - ## HITS:1 COG:CAC0306 KEGG:ns NR:ns ## COG: CAC0306 COG4123 # Protein_GI_number: 15893598 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Clostridium acetobutylicum # 19 259 3 243 244 227 46.0 2e-59 MTTDTVTVKHTWRQEGTIMLRDDERIDDLQRNHYGIIQRKGAFCFGMDAVLLSGFAVVKK GEKVLDLGTGTGIIPILLTAKTEGSHFTGLEIQEESADMARRSVAYNHLEGKVDIVTGDI VEASRLFALASFDVVTTNPPYMNESHGLKNPGDAKAIARHEVKCTLEDVVREGTRVLKPG GRFFMVHRPRRLIEIITVMKRHGLEPKRMKMVHPYADREANMVLIEAVRGGGPLLKMEAP VIVFDQNGEYSPEIRTTYGY >gi|157101647|gb|DS480677.1| GENE 66 72761 - 73693 1210 310 aa, chain - ## HITS:1 COG:CAC0301 KEGG:ns NR:ns ## COG: CAC0301 
COG1774 # Protein_GI_number: 15893593 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Clostridium acetobutylicum # 1 264 1 263 303 331 64.0 1e-90 MITVIGVRFRTAGKIYYFDPAGRQIKTGDHVIVETARGIEYGYVVLGNREVDESKVVPPL KSVIRMATDEDETAEARNKQKERDAFKICQEKIKKHHLDMKLIDAEYTFDNNKVLFYFTA DGRIDFRELVKDLASVFKTRIELRQVGVRDETKIMGGIGICGRALCCHSYLSEFIPVSIK MAKEQNLSLNPTKISGVCGRLMCCLKNEEETYEVLNSKLPNPGDYVTTSDGLKGEVHSVN VLRQQVKVIVTVERDEKEIREYKVDQLKFKPRKKKGKGGGEKHGGDTPDAELKQLEALEK KEGKSKLDDN >gi|157101647|gb|DS480677.1| GENE 67 73729 - 74718 1061 329 aa, chain - ## HITS:1 COG:SA0436 KEGG:ns NR:ns ## COG: SA0436 COG2812 # Protein_GI_number: 15926155 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Staphylococcus aureus N315 # 4 328 15 348 565 140 29.0 2e-33 MLGFNDILGHEQIKEHFRNAVQTGKVSHAYILSGEAGMGRKSLANAFALSLLCEKGMEEP CMQCHACKQVLSGNHPDLIYVTHEKPASIGVDDIREQINDTIQVRPYSSYYKIYIMDEAE KMTVQAQNALLKTIEEPPAYAVILLLTTNQDAFLPTILSRCVQLKLKPLKDSVVKEYLIQ SLGENESEADIYAAFARGNLGKAIHLAQSEEFGVMYREMLHLLKHIKDTDISELLDYIRK VKEENLDIRECLDFMQMWYRDILMYKTTKDINLLIFRDEFSAVKSVSTTSGYDGLEKILE AIDKARIRLDANVNTELVMELMLLTMKEN >gi|157101647|gb|DS480677.1| GENE 68 74908 - 75669 909 253 aa, chain - ## HITS:1 COG:SP0661 KEGG:ns NR:ns ## COG: SP0661 COG4753 # Protein_GI_number: 15900562 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 2 249 3 240 245 157 36.0 2e-38 MYKVIVVEDETMVRRGIILTIDWAALDCVIAGEAANGEEGAELAVRLSPDIIVTDVKMPR MDGVEMITKLREQGCRAKFIILTAYSDFKYAQSALRLGVSDYLLKPLRDGDLEQAVRHIR EQMEERPAKEEETDAAPVLRFHSDKKSKNKYVEAATRYIREHYREDITISTVASFLEISE GYLSRVFKKETDYTFTNYLAYYRMQMAMNLLKDCRVKVYEVADQVGYSDTAYFSAQFKKI LGVSPSEYQDRCR >gi|157101647|gb|DS480677.1| GENE 69 75662 - 77380 1511 572 aa, chain - ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T 
Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 291 569 298 576 577 161 33.0 4e-39 MIWYRRLSFKTRVFLGCLLVALVPLTFSSVVMMRLFTASINRQITVDGNQQLEEVRERFT QLLENCQKACETLTEDGSAAWVMIDNKTIEFQKDLYLSMYQAVQEIYSHAQFSIYDAGGK LRFTTDTAPKSSRLPVNWGLLNKASGEQGITYYRTDPYLPSSGSDVLMQGAFSLESIHGA RTGYVVLDFTRENFDNLLNGFYSSGDTLLVLDSHQKPLYCSRPEYGEREISDIIGHTISG QGGTEKEGVYTRYLWTREPSQGFYILLRCSAPISAPAVRTMGTVSLALSGLGLVLCLFIS GALSRSIGQPVSLLDKAMAKVKKGDLSIRIRTNRQDELGRLTESFNQMTGDLQKYLDDTV QKQKDLNKTTLKLYQTQLNPHFLYNTLDSIKWNARINQVPEIAVLAENLAVILRKSISSR PFITLREELETIESYVEIQKIRFTGRFLYETEIPDQLEDCMVPKMILQPLVENAIIHGLD GCGHGYICIYAFQKEGILNISVTDNGCGMSREMEDWINSEAPAKRDGHLGLYNVINILKI YYGQEYGVKAAVTEDGTTITLRLPVQKEVPDV >gi|157101647|gb|DS480677.1| GENE 70 77382 - 78491 253 369 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 [Haemophilus parasuis 29755] # 63 356 32 338 346 102 25 1e-20 MKKTVMGMKAAVKGKRGSVKGINRTAWGKRDIAFCALCFLCTAALSSCQSGPQTAPMPAV PDLVLYTAQEEEIYEPIIKEFEERTNLMVKVERGSSEEMTGRLENEEEKPDWDVVFGVGI ETLEQTKEHWQEYKSPEAAFITDSFQCEDNRWTSFSALPLVIMYNTNVVTYRELPVGWNS LLEPRWKGRIAFVDPRRSDVYSAALVTAVHTWEKKGDYLERFMENLEYGTLDSMQEVNAG ILDGRYSLGVTMEESAQALLSEGADVDYIYPQEGTTALPDGTAIVKGCANPDAARQFLDF TVSRDTQRILVSDLNRRSVRGDVPPLPGLSPIGRLPLIEMDLEELSREKKDVLARWNSIL SRQNAGGGA >gi|157101647|gb|DS480677.1| GENE 71 78661 - 79515 642 284 aa, chain + ## HITS:1 COG:no KEGG:Closa_2529 NR:ns ## KEGG: Closa_2529 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 281 1 278 281 355 60.0 1e-96 MGVQIIEQFICSKYLDQKKCEDGLFISNDFIAVIDGVTSKGELTWPDGSEGGSVFLSPMT SGRYAREILAQALKTMEPGIDAASAMEYLNQALAKAGSGRREFLRDHPEERLQAVVILYS CQKREVWAFGDCQCLIGDTLHSHGKEIDGLMAEIRCLYNQAELILGGAEDDFARHDPGRA CILPLLRRQFLLANQDRPYGYDVLDGFAIHSHHVSVYPVPPQTQVVLASDGYPVLKDTLA ESEKSLDELLQKDPQCLWENRGTKGLVKGNQSFDDRTYVRFAVF >gi|157101647|gb|DS480677.1| GENE 72 79594 - 80727 930 377 aa, chain - ## HITS:1 
COG:AF2099 KEGG:ns NR:ns ## COG: AF2099 COG4948 # Protein_GI_number: 11499682 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Archaeoglobus fulgidus # 1 370 7 374 375 220 35.0 4e-57 MKITGIEIKQYAKPLDPPFLASWDPNPRVKFASTIVYVYTDEGITGIGSGDLMTGFAGHE KFFIGKDPFEMENHNKIIDNIDFHYGRCWPLDIALWDIIGKKCGQPIYKLLGGKQDKIKA YCSCGARIPAEERAEKAREYVEMGFKAMKIRFHHEDVREDIKVVEAVRGAIGDKMEIMVD GNQAWKMPWDVERIWDLKKAVKVARELEKLGVYWLEEPLHHADYDNLARLREMTDIRIAG GEMNRRWHDFRDLADKGSYDVYQPDTVLCGGFTRTKKIADYVQSKGAVFCPHTWSNGVGL LANLHMACAVSDAEYLEYPFDPPVWTVDRRDYILKEEDRVLVDKDGYMHVPQKPGLGFEP DEEALREYEIKDYYVGE >gi|157101647|gb|DS480677.1| GENE 73 80802 - 82013 1227 403 aa, chain - ## HITS:1 COG:YPO1668 KEGG:ns NR:ns ## COG: YPO1668 COG0477 # Protein_GI_number: 16121932 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 1 399 1 408 411 200 33.0 3e-51 MKKWTSLLVLCCGTGVIFQLPYIRNTFYIPLVEALNLSNEEFGALSTSYATISMICYFFG GWIADRVSPRKLLTISFISTGLVGLLFSTFPSYGTMRLIFAAFGVTTVLTYWSALIKAVR CLGDSNEQGRLFGILEGGRGVVSATAVFVLLGLFNSLGGGKFGLSWVIRSYALLAIIVGI ALWFLIKDKKEDVVSGTSLVQQIKSVLLMPKMWMICLIIFTAYSIYAILSFLPPYMVSVY GMSKNFSVQIGGIRYIIQFAGGITGGFLADKIGSRIKTVLLGYTAIIAFLIVFLLTPQQP SLAMLCVFDFILLNVTVYAVRGIYFAVIDEAQIDRSVTGAVSGMASCIGYLPDVFLYTMI GRWIDQGVQGYRSMFLYGIGTCIVGAAVAVILIKSIRKIPAKG >gi|157101647|gb|DS480677.1| GENE 74 82193 - 83656 1254 487 aa, chain - ## HITS:1 COG:CAC1507 KEGG:ns NR:ns ## COG: CAC1507 COG0642 # Protein_GI_number: 15894785 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 1 443 1 461 473 180 28.0 6e-45 MKLWQKIFLGAFIASVAIFFFCGFYIVVTGLHFTLEGEMKRTEYYQSVYARKIKSFFREP 
DVQSGLSGFMDGFREELAGEGRYLQVWEGEELRYDSFEKFPPAPRYTFVEGQAAYTPTYM KDIGDSRYLYVESEFPCDNAMYKVVMIKDLSDIYKSCDMQLNIFFALCLIQAFITAMIMF LITKRITSPIERLNEAAKLMAQESIPLKLAVEGEDEISELSHNFNLMSEAINRNIDDYKQ IVGNLTHEIKTPLTSIIGYAELLKNHECDRELQEQALDYILEEGKRLNSITRKMIKLSRI QPGLLNIERQDVKKMLETSLMAVRMKAGQKQIHFTSDIPEGIFCYGDREFLITMMENVLD NAIKASHEGGVIEVRAAGEGGALAYIQVTDHGIGIPADILDKIDQPFFKGDKAHSQGGDG FGMGLAICKSVMLHHHGSLKYESVVGAYTTVTMRFPAADYLRLESPEQTDQVKKDSLWVY EASNIIR >gi|157101647|gb|DS480677.1| GENE 75 83653 - 84318 738 221 aa, chain - ## HITS:1 COG:CAC1506 KEGG:ns NR:ns ## COG: CAC1506 COG0745 # Protein_GI_number: 15894784 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 216 1 216 217 204 48.0 8e-53 MYRILIAEDNKAISNLIKTHLSFAGYDTSQVYDGEFALEILEAEPFDLLILDLMLPKVSG VDILARVKDCQMPIIVVTALDDLSSKVQCLQVGADDYITKPFDSMDLLARVGAVLRRCRR RPMSLNVRNISVDPDQHRVLKDGKEIELTPKEFDLLYYLCSNCDKVLTRKEILSDVWGYN YVGETRTVDIHIQRLRKKLDWEGAIQTITKVGYLLKGSDDR >gi|157101647|gb|DS480677.1| GENE 76 84290 - 85912 1249 540 aa, chain - ## HITS:1 COG:PA5291 KEGG:ns NR:ns ## COG: PA5291 COG1292 # Protein_GI_number: 15600484 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Pseudomonas aeruginosa # 10 507 8 495 661 223 29.0 8e-58 MEKVSGQKRKSRLRGWVFWPPFLVLLMVLILGFVSQDAFLKVVNGVKDWIWGNFKWLFSG YGLAAVGVCFYACFSKFGNTVIGGKDAKPILGKFNWFAISLCTTIAAGLMFWAAAEPLYL MSDPSPFFDIEPNSPQAAVFAMAQMYLHWGITPYAIYALAATVFAFTYYNMKKPFSLGAC ISPLLGERATKGRLADIIDALCIFILCAGMSGSLSTGAFSVAGGISNITGIPTNGIMLII VMAAIILVYTISASSGIMKGIKWLSSFNVYIFSFLILFIFLFGPTKFILNFGVESVGELM DSFFMRNLFTDSMNLGNSSWSSLGNMKVGYWASYLAWAPVTALFLGKLARGYKIRDCIII NLFVPTVFSMIWVSIFGGTAIHMHMQESLIPLLETSGAESIVYKLFEVLPLRQLCIPIFV LATFISFVTAADSNTNAIASVCTSGMSEEEMEAPAGLKILWGMIIGVMAAVMAVFTGVGG AKTLASIGGFPSMFLQIASCAALIKIMRHPEKYDKCRQDTGNGKEDTGEGKHVQDTDCGR >gi|157101647|gb|DS480677.1| GENE 77 85962 - 
86093 67 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936958|ref|ZP_02084322.1| ## NR: gi|160936958|ref|ZP_02084322.1| hypothetical protein CLOBOL_01847 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01847 [Clostridium bolteae ATCC BAA-613] # 1 43 1 43 43 78 100.0 1e-13 MKLADYLPGFCIKVVKLFLDSRFKWYNEPNYFILQVNIYNHRL >gi|157101647|gb|DS480677.1| GENE 78 86266 - 86742 289 158 aa, chain - ## HITS:1 COG:no KEGG:Spirs_1952 NR:ns ## KEGG: Spirs_1952 # Name: not_defined # Def: hypothetical protein # Organism: S.smaragdinae # Pathway: not_defined # 3 104 4 103 148 73 36.0 3e-12 MVIKPHHFMDIIKLYGGGIQVFVPDTDYGHDFYRAANEIIGNHRIQIKVTSGADDICGPC RFLGSDKTCTDCISHIKGIRSKNSYNKMLDCRIMDILGLGEDAVYTAEELCRIMGDCPGL VADVWKEEAEACRETARKRQQLFEAGVRKYLDQGNEVI >gi|157101647|gb|DS480677.1| GENE 79 86773 - 88212 1527 479 aa, chain - ## HITS:1 COG:VC0611 KEGG:ns NR:ns ## COG: VC0611 COG1109 # Protein_GI_number: 15640631 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Vibrio cholerae # 1 479 1 470 470 494 51.0 1e-139 MVKFGTGGWRAIIGEEFTKENIQKLALAICLKMKAEGVEGQGVVMGYDRRFLSKEAIIWS CEIFGNQGIRVWFVNRSSPTPLVMFYVMKHELSYGMMVTASHNPAIYNGIKVFTYGGRDA DEKQTAEIEYYLEQAEELYLPNEEENGQMPPPSYLELVRKGMVREINPMNEYLDNIISVI NMDAIKERDLRIAIDPMYGVSLTALSTILSIARCTIETINSRHDTLFGGKMPAPTERTLR NLQNYVLDRYCDIGIATDGDADRLGVIDDQGHYLHANNILVMLYYYLLKYKGWRGPAVRN LATTHVLDKVAESFGQKCYEVPVGFKHISAKMQETDAIIGGESSGGLTVKGHIHGKDGIY AAALLVEMIAVSGKRLSEIAADIRKEYGAIHMAERDYRFTPEEKERIHNILMEERKLPEM PFETDHVSYMDGCKVYFKNGGWVIARFSGTEPLLRIFCEMEYEPDTVRVCNLFEEYLGL >gi|157101647|gb|DS480677.1| GENE 80 88282 - 89934 1860 550 aa, chain - ## HITS:1 COG:FN0377 KEGG:ns NR:ns ## COG: FN0377 COG1178 # Protein_GI_number: 19703719 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Fusobacterium nucleatum # 1 546 1 546 550 399 44.0 1e-111 MVKSRKKLDFWFWVKVLVVGFMLIFLFYPFCTLITRSFFSKNAAGFTLENYIRFFTKKYY 
YNALGRSLFVSVVTTLTTLVVGVPIAYVMSRYNVVGKRFIHIFIIMSLMSPPFIGAYSWI TLFGRSGFITSLFANIGIHLPSIYGKLGIIIVFTFKLFPYVYLYTSGAMGSIDSSLEEAA ENLGSNKLRRLWTITLPVVLPSIAAGAIMVFMTSLADFGTPMLIGEGYMVLPVLVYNEYM SEIGGNAHLASALSVIIVICSTTVLLVQKYYVSRKNYVMTAMRPPKEEVLHGFKRFLATF PVMLVTFVGILPQIVVLITSFIKCDFTGFQKGFSLGSYITIFNRLWTNIRNTFVFSTVAI VFIIALGMLISYIVVRQKGVAGQLMDLIIMFPFVIPGAVLGISLIVAFNKPPVVLTGTAL IIIIAFVVRKLPYTVRSGSAFLQQMDPSVEEASISLGVSPMKTFGKVTARLMAPGVLSGA ILSWITCINELSSSVMLYGGKTSTISVAIYTEVVRNSYGTAAALASILTVSTVVMLLIFL KVSKGKVSIV >gi|157101647|gb|DS480677.1| GENE 81 89928 - 91007 1246 359 aa, chain - ## HITS:1 COG:FN0376 KEGG:ns NR:ns ## COG: FN0376 COG3842 # Protein_GI_number: 19703718 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Fusobacterium nucleatum # 1 349 1 348 371 343 51.0 4e-94 MSHGVIIKDAVKRYGDFTALKGVNLEIKQGEFFTLLGPSGCGKTTLLRMIAGFNSVDGGE ICFDDKVINNLEAHKRDIGMVFQNYAIFPHLTVAENVAYGLKAKKYPKDQIPGKVEEALD LVQIKNLKDRKPNELSGGQQQRVALARAFVIEPGVLLMDEPLSNLDAKLRVQMRTVIKKL QRRLGITTIYVTHDQEEALAISDRIAVIKEGEVMQVGSPENIYKKPQNTFVAGFIGVSNF VECDVNGENPAAAVLDIKGECTITCSLKAPFKGKGIISARPEQLFFDEKEGLPGKIVIST FLGDFIEYEIELDNGQTVQLNEYTKDVRELRPDGQRVKVNFDINAVGVYDAGTQEVISW >gi|157101647|gb|DS480677.1| GENE 82 91185 - 92324 265 379 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 [Haemophilus parasuis 29755] # 93 359 50 322 346 106 28 5e-22 MKKRFLTASLATMMVLSLAACGGGKAADTTTAPAATEAPADTKAEDKAEETKAEETKAEA EAAGEKAPEDYKGTVVVYSPHDADPLNAGVNLFMEKYPNVKVEVVAAGTGELCNRIAAET ANPIADVLWGGGADSLAAFKEYFEPYVCANDEFIGAAYKDPDGLWIGESPLPMVIFYNKD LIEKDGLTIPETWEDLTKPEWKGKIAYCLPSKSGSAYTQLCTMILGHGGKEDGWDFIKKL YDNLDGKIVDSSGKCHKMVADGEFYVGLTLEKAAVQYKDDPSVGFVYPKDGTSAVPDGVA LVKGCPNEENAKLFIDFVTSKECQTEQSGNWGRRPVRSDMEVGEGMAKLEDIPLVDYDFD WAANEKEAIIEHFNDIMVD >gi|157101647|gb|DS480677.1| GENE 83 92628 - 93191 416 187 aa, chain - ## HITS:1 COG:no KEGG:Mevan_0161 NR:ns ## KEGG: Mevan_0161 # Name: not_defined # Def: hypothetical 
protein # Organism: M.vannielii # Pathway: not_defined # 3 159 8 164 164 150 50.0 3e-35 MRSITTLDLQYAHRFYGFKGEAQYLHGHTGILTLEVEDTVNAGVNMVFPCNEIQKTAWEV LKNFDHATILREDDPLLPALLGVYEAQGIKDGAPNNIMKGPAFKTDLATAYPECRLVVTK ETMTVEGMIKIVYDLLKDKLNIVKITFTSGVNAASEEFKPQRTMERCPLCGIALDENGVC PKCGYRK >gi|157101647|gb|DS480677.1| GENE 84 93492 - 94631 1146 379 aa, chain - ## HITS:1 COG:CAC2354 KEGG:ns NR:ns ## COG: CAC2354 COG0520 # Protein_GI_number: 15895621 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Clostridium acetobutylicum # 1 375 1 374 379 326 44.0 5e-89 MIYFDNAATTMRKPDCVIDAVADAMRNFGNSGRGAHEASLDASRMIYETRERISELFNLG NPLQVAFTSNSTESLNTAIQGLFSKGDHVITTVLEHNSVLRPLYLMENRGVSLTILPCDE KGVLCYDQLEESIRPETKAVVCTHASNLTGNRMDLERVGELCHRHGIRLVADASQTAGVL PIDMKRMHIDVLCFTGHKGLLGPQGTGGICVGDGIRIRPLKAGGSGVHTYLKEHPGDMPT ALEAGTLNGHGIAGLHAALGFIKETGVETIHRREAELMRRFYDGVVQIPGVRVYGDFTSD ERAAIVALNIGDYDSSEVSDELAVTYGISTRPGAHCAPLMHESFKTVEQGMVRFSFSYFN TEEEIDAGIQAVRELAAEE >gi|157101647|gb|DS480677.1| GENE 85 94741 - 94977 264 78 aa, chain - ## HITS:1 COG:no KEGG:Closa_1020 NR:ns ## KEGG: Closa_1020 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 73 1 73 78 79 49.0 5e-14 MRKKEPKLIITFHTTAEAIAMEKLCKENQKTGRMIPVPREISAGCGLAWCCEPDMEQEMS EFMKEKGMEHEEMVRLMY >gi|157101647|gb|DS480677.1| GENE 86 95114 - 95320 352 68 aa, chain - ## HITS:1 COG:no KEGG:PTH_0551 NR:ns ## KEGG: PTH_0551 # Name: sirA # Def: redox protein # Organism: P.thermopropionicum # Pathway: not_defined # 4 67 5 68 70 62 51.0 5e-09 MYEVDARGLSCPEPVMLTAAALKKHKGEPVKVLVSEPHTRMNVEKYAKSQGKTATVTKKG SEFEIVIE >gi|157101647|gb|DS480677.1| GENE 87 95333 - 96427 1190 364 aa, chain - ## HITS:1 COG:no KEGG:Ccur_00560 NR:ns ## KEGG: Ccur_00560 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 1 356 1 356 381 408 69.0 1e-112 MNLTSSKKTLLLSGVVLGILACILAYFGNPKNMAICVACFIRDSAGAMKLHTAAVVQYFR 
PEIVGFVCGSFLIALATREYRSTAGSSPMLRFILGVIMMIGSLAFLGCPLRMVIRMAAGD LNAYVGFLGFAAGVFTGTIALKHGFSLGRAHETGKINGAVLPVVLVVLFVLSLTTTLFAF SEKGPGSMHAPALLALAVALLFGAIAQKCRTCFAGSIRDVILMRNFDLITVIGGLFAVML VFNLATGGFKLSFAGQPVAHSQHIWNILGLYAVGFAAVLAGGCPLRQLVLAGQGSSDSAV TFLGLLVGAAFCHNFGLAASAAAAATADAPAAPGGLAMAGKAAVIGCIVILFIIGFMPKP QEGK >gi|157101647|gb|DS480677.1| GENE 88 96609 - 97541 925 310 aa, chain - ## HITS:1 COG:MJ0300 KEGG:ns NR:ns ## COG: MJ0300 COG0583 # Protein_GI_number: 15668475 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Methanococcus jannaschii # 6 294 8 293 296 136 31.0 4e-32 MEFRQLEAFVNAVKYKSFSKAADATFLTQPTISAHINNLENEMGTTLVNRTGREITLTKQ GELFYPYAIDMLHTRSQALATVQAQCEAMDGVLDLYASSIPGQYYLPRLIGEFHAKCPKI RFYVDQSDSKTVIENVMSQKGEIGLTGYKMHNSLVYEPVFMDELVLIVPDTERYARWKMG STVSFRDFEHETFILREEGSGTKQEMEKAEIHGVPVFKNVDVIARMNSTETIKQAVAGGL GISILSRMAAGEKEETHRIKYLKIDGLDKKRTFYMVYSKNIRLSPIAEAFRDLVIDYRDR NWQQDIGRFM >gi|157101647|gb|DS480677.1| GENE 89 97548 - 99467 2035 639 aa, chain - ## HITS:1 COG:SMa0015 KEGG:ns NR:ns ## COG: SMa0015 COG3276 # Protein_GI_number: 16262460 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Selenocysteine-specific translation elongation factor # Organism: Sinorhizobium meliloti # 4 632 1 628 666 279 31.0 2e-74 MQNVIVGTAGHVDHGKTCLIKALTGTDTDRLKEEQKRGITIELGFANLPNDAGIHIGIID VPGHEKFVKNMLAGIGGIDLVLLVVALDEGVMPQTVEHFEILKMLHIKQGIVVFTKADLV DEDWAELVNDDVDNLVKGTFMEKADRIQVSAYTGKNIDVLKQMIVDKVRAAGTRRQEKEL FRLPIDRVFTMEGFGTVVTGTLLEGCCQVGQEVELYPTEKTVKIREIQTHGHKVDMAYAG QRTALNLVNIKKDEINRGDVLAAQDSLLKSQFIDAKVQLFSSTDRELRNGDRVHINYGSA QAICKAVLLDKDVLSAGEEAYVQFRFDEPVAVRRNDRFIIRFYSPTITFGGGIVLEAEAL KHKRNHEEVIDSLHIKELGTDLEVLELELKEESRYFPVPKILAAKLNWTNQETEEQLEVL VKGKKAVRLSDGSFIHKDYWNEITQYGTELLNQFHKENPISDGMEKEEFKSRILDNFKIR ESKKADVLLNEMIKRSITVTMASAIAAAGFNAEYSHELKGMLKDIEDTYLKAGYEIPATD DVIGKFKDKKMAKQIVNDLVKKGTLIKINPGALIHKDNWDKAMGLLKGYFDSHPNISLGD YRDLLGTSRKYAVLFLEYCDQQKITKKQDDVRILVQKSR >gi|157101647|gb|DS480677.1| GENE 90 99472 - 100782 1474 436 aa, 
chain - ## HITS:1 COG:aq_1031 KEGG:ns NR:ns ## COG: aq_1031 COG1921 # Protein_GI_number: 15606324 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine synthase [seryl-tRNASer selenium transferase] # Organism: Aquifex aeolicus # 2 422 27 442 452 352 46.0 1e-96 MIVDSVREVIDELRKDILEGRRSQVGTKETLMTEIVARITGKKKKSLRRVINATGVVLHT NLGRANLSDKACESIMDVARNYTNLEYDVKRGSRGSRHDHVEKILTKITGAEAAMVVNNN AAATMLCLSALAKDKEVIVSRGELVEIGGSFRVPEIMEQSGARLMDVGTTNKTKPSDYLN AYHEGETGALMKVHTSNYRILGFTQEVELPEMVELGKKLNLPVIYDMGSGLMADLTDCGV DEPTVLDALKTGIDVILFSGDKLLGGPQGGIIAGKKEYIDKMKAHPLARAFRVDKMTLAA MEATFFEYSDIRQARKTIPVLNMITTPAAELKDKAERLAGEIRRTTHHFTVEVEACKDQV GGGSAPTVLLDGYAVAVQGRTLAPEKIERLLRKEEIPIIIRITHNQVYLDVRTIREDEFE YIVAAFTAMDSRQVEG >gi|157101647|gb|DS480677.1| GENE 91 100779 - 100880 80 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEDRRELLRKIPKIDEVLQDERLFFLRKVRPEQ >gi|157101647|gb|DS480677.1| GENE 92 101183 - 102304 1378 373 aa, chain - ## HITS:1 COG:no KEGG:Clos_0960 NR:ns ## KEGG: Clos_0960 # Name: not_defined # Def: glycine reductase (EC:1.21.4.2) # Organism: A.oremlandii # Pathway: not_defined # 4 373 3 388 394 412 57.0 1e-114 MSDKKIKAMIANTFLEVADALETGQYGKKPVIGFASEGSEHGQDNIEEAMKIAEMKGLKA VLIEGEDLHKKMEEMLDSGEIDGAVTMHYPFPIGVSTVGKVVTPGKGKPMYIATTTGTSD TDRISGMVKNAIYGIIAAKADGVENPTVGIANIDGARQTEKALLKLAENGYDINFAESVR ADGGIVMRGNDLLGGTPDIMVMDSLTGNLMMKVFSSYTTGGNYESLGFGYGPGVGEGYDR LIMIVSRASGAPVIAGAMEYATNLIRHDWKKIADEEFEKANKAGLKDIINELKSAGKKDA GAAAEEVKMPPKEVVTSQIPGIEVMDLEDAVKALWKAGIYAESGMGCTGPIVLMSDANKD KAHEILKAAGYVS >gi|157101647|gb|DS480677.1| GENE 93 102332 - 103888 1400 518 aa, chain - ## HITS:1 COG:no KEGG:CLM_1422 NR:ns ## KEGG: CLM_1422 # Name: grdC # Def: glycine reductase complex component C subunit beta (EC:1.21.4.2) # Organism: C.botulinum_A2 # Pathway: not_defined # 4 516 3 512 512 677 62.0 0 MNGYAVLKGASYTLAVTPDMVIHNGTTQTTEMIVNPESEYLKELPSHLRSFEDAVNYMPN QVYIGNEKPQKMAEVGFPWYDKLMDGASKDGKYGEIMPEDEFYGLIQICDVFDLVKLEKG FAADVKAKLDAHHMFTDEHVAILDKNSETTQEEIAALVNDEHAEPLYFNNVLVGAVKRAH 
DVDVNLSAHVMLENLVTKASNVLSLLNLVKKAGVNPADIDYVVDCCEEACGDMNQRGGGN FAKAAAEVAGFVNATGSDVRGFCAGPAHAMLHAAALVKAGTFKNVVVTAGGCTAKLGMNG KDHVKKGLPILEDAIAGFSVLVSADDGVSPQIRTDIVGRHTVGTGSAPQNVISSLVTNPL DEAGLKIVDIDKYSPEMQNPDITKPAGAGDVPEANYKMIAALGVKRGELQRNEIADFVKE HGMTGWAPTQGHIPSGVPYLGLLRDEILAGETKRAMIIGKGSLFLGRMTNLFDGVSFVVQ ANDGSASEENGGVDESKVKAMIGQAMRAFAEGLMGEEE >gi|157101647|gb|DS480677.1| GENE 94 104125 - 104451 215 108 aa, chain - ## HITS:1 COG:no KEGG:Shel_25200 NR:ns ## KEGG: Shel_25200 # Name: not_defined # Def: glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.4) # Organism: S.heliotrinireducens # Pathway: not_defined # 1 107 50 156 156 156 74.0 2e-37 MDLENQKRVKDFAEQYGAENIVVVLGAAEGEAAGLAAETVTAGDPTFAGPLAGVQLGLSV FHVCENEIKDEVDSSVYDDQISMMEMVMDVDDIAKEMSSIREQYCKYL >gi|157101647|gb|DS480677.1| GENE 95 104467 - 104598 190 43 aa, chain - ## HITS:1 COG:no KEGG:Amet_3592 NR:ns ## KEGG: Amet_3592 # Name: not_defined # Def: glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.3 1.21.4.4 1.21.4.2) # Organism: A.metalliredigens # Pathway: not_defined # 1 43 1 43 158 72 79.0 7e-12 MAILKDKKVIIIGDRDGIPGPAIEECVKTAGAEVVYSSTECFV >gi|157101647|gb|DS480677.1| GENE 96 104727 - 105044 451 105 aa, chain - ## HITS:1 COG:Cgl3031 KEGG:ns NR:ns ## COG: Cgl3031 COG0526 # Protein_GI_number: 19554281 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Corynebacterium glutamicum # 1 88 4 91 107 63 35.0 1e-10 MVDLTKDNFEEEVLKAEGTVLVDFYGDGCVPCQALMPHIHAMEGDYGDKIKFTSLNTTKA RRLAIGQKILGLPVIAIYKGGEKVAECVKDDATVENIKAMIEANL >gi|157101647|gb|DS480677.1| GENE 97 105078 - 106019 581 313 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 4 303 1 297 306 228 42 1e-58 MDKIYDVIILGAGPGGLAAGIYAGRARLDTLLIEKGKDGGQIAITDEIENYPGQMVDGES GPSLIERMTKQVEKFGAERVSDTIVSVDIEGPVKTLVGSNGTYKARTLIIATGAFPRPIG 
CKGEGEFMGKGVSYCATCDASFFEDLEVFVVGGGDSAVEEAMYLTKFARKVTVIHRRDEL RAAKSIQERAFKNPKLFFMWDTVVEELHGDGILSGMTVKNVKTGELTRIDADEDDGLFGV FGFIGYNPRSELFEGKLEMDRGYIKTDEDMHTNVEGVYAVGDIRVKSLRQVVTAAADGAI ASMQVERWLAEQK >gi|157101647|gb|DS480677.1| GENE 98 106225 - 106455 219 76 aa, chain - ## HITS:1 COG:no KEGG:Shel_25230 NR:ns ## KEGG: Shel_25230 # Name: not_defined # Def: glycine reductase, selenoprotein B # Organism: S.heliotrinireducens # Pathway: not_defined # 1 76 361 436 437 115 75.0 4e-25 MVKEIERAGIPVVHIATVVPISLTVGANRIVPAIAIPHPLGNPALTMEEEKEIRRHILTK ALTALETPVTEQTVFE >gi|157101647|gb|DS480677.1| GENE 99 106483 - 107532 1017 349 aa, chain - ## HITS:1 COG:no KEGG:Clos_0958 NR:ns ## KEGG: Clos_0958 # Name: not_defined # Def: selenoprotein B (EC:1.21.4.2) # Organism: A.oremlandii # Pathway: not_defined # 1 349 1 349 435 510 72.0 1e-143 MSKLRIVHYINQFFANIGGEEKADYQPELREGIVGPGMAFNQAFGEEAEIVATVICGDSY FNENVEAAKKTILDMISSQKPDAVICGPAFNAGRYGVACGTVALAVKEELNLPVLTGMYK ENPGADMYKTKVYVVETKNSAAGMRSAVATMAPLALKLAKGEKLGSSKEEGYMPNGIRVN FFEDKRGSQRAVEMLIKKLSGKEFTTEFPMPDFDRVDPNPAVKDMAHAKIALVTSGGIVP KGNPDHIESSSASKYGKYDIDGVMDLTEATYETAHGGYDPVYANEDADRVLPVDVLREFE KEGKIGSLHRYFYTTVGNGTSVANSKRFAAEFAQELRNDGVDAVILTST >gi|157101647|gb|DS480677.1| GENE 100 107542 - 108822 1293 426 aa, chain - ## HITS:1 COG:no KEGG:Bmur_2722 NR:ns ## KEGG: Bmur_2722 # Name: not_defined # Def: glycine/sarcosine/betaine reductase complex protein B alpha and beta subunits # Organism: B.murdochii # Pathway: not_defined # 1 425 1 425 428 603 70.0 1e-171 MKLELGCVQINDIQFASESKVENGTIYVNAEELKALLLEDENLKSVELEIAKPGESVRIM PVKDVIEPRVKVNGGGNLFPGVISKVETVGSGRTNVLKGSAVVTTGKIVGFQEGIIDMCG EGAKYTPFSQLNNLVVVCEPIDGLAKHAHEKAVRFAGLKAADYIGKLAKDVEPDEVKVYE TCSVKEGIEKYPELPRVAYVLMLQSQGLMHDTYVYGVDMKQSLPTILNPTELMDGAILSG NCVSACDKNTTYHHLNNPVIADLFEQHGKTLNFVGVIITNENVYLADKMRSSDWTSKLCE FLGVDGAIVSQEGFGNPDTDLIMNCKKIEAKGVKTVIITDEYAGRDGASQSLADADASAN AVVTGGNANMVINLPAMDKVIGYPEVADIIAGGFDGSLQKDGSIVAELQVITGATNEMGF NKLSAR >gi|157101647|gb|DS480677.1| GENE 101 108827 - 109222 336 131 
aa, chain - ## HITS:1 COG:no KEGG:CLOST_1107 NR:ns ## KEGG: CLOST_1107 # Name: grdX # Def: grdx protein # Organism: C.sticklandii # Pathway: not_defined # 3 124 2 118 119 97 45.0 2e-19 MYIVVTNNVKCRDRYQDKLKVDFLENGSYIDVLIKVRNYIHKGYRLETHPMAGSLKPNQI PYKTIIISDSEVEKEEFYQFEMVMENSIQSCEKFMKDRRTPDWPENILEDFRDVDLSLIE GAVKKIIYKEE >gi|157101647|gb|DS480677.1| GENE 102 109628 - 110650 948 340 aa, chain + ## HITS:1 COG:aq_218 KEGG:ns NR:ns ## COG: aq_218 COG3604 # Protein_GI_number: 15605774 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing GAF, AAA-type ATPase, and DNA binding domains # Organism: Aquifex aeolicus # 23 329 206 504 506 247 44.0 3e-65 MDENDMKKKFFFNTKSPAYRRALEDCERVAASNVNVLLIGESGSGKDVAAQYIHACSRRS DMPLIAVNCNAYTESLLEAELFGHEQGAFTGAIKGRAGKFELADKSTLFLDEVGDINLTT QVKLLRAIETKRIERIGGNNSKQLDFRLITATNRDLVAKVRNSTFREDFFYRISTIVIRI PALKERKEDLVDMIHFFLEESQRVNQIRIDSIDEKAWDFLINYDYPGNIRELRNHVDRMV VLSKNHVITTEGIPILYNLKTTDDDSPSTAFTQAGIVPWREFKRRSEKEYLQWVLNQMGW NISATAYKLGISARQLFNKINEYNLENPQTGRPQKPVHSI >gi|157101647|gb|DS480677.1| GENE 103 110761 - 111387 458 208 aa, chain - ## HITS:1 COG:CAC0298 KEGG:ns NR:ns ## COG: CAC0298 COG0194 # Protein_GI_number: 15893590 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Clostridium acetobutylicum # 1 190 1 191 195 196 52.0 2e-50 MGKIFYVMGKSASGKDTIYKKLHSRMPELKTVTMYTTRPIRDGEHNGVEYFFVDPSYLDQ CRKDGILIECRTYDTVYGPWSYFTADDGQIDLKTGNYLIMGTLESYGKMRAYYGEEALVP IYIHVEDGLRLRRALEREQQQTVPKYKEMCRRFLADEEDFSPSHLEQSNITRQYENVVLE DCLREIVQEIGRVCHTQSNAGQPEVQDI >gi|157101647|gb|DS480677.1| GENE 104 111443 - 112960 1157 505 aa, chain - ## HITS:1 COG:CAC2338 KEGG:ns NR:ns ## COG: CAC2338 COG1982 # Protein_GI_number: 15895605 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Clostridium acetobutylicum # 2 503 6 487 487 290 34.0 5e-78 MRQEELLINRLAAYARSDMYPFHMPGHKRRTGPEDFFMNSCVDSFTNPFAVDITEIEGFD 
NLHHPEGILRDSMKWAADVYGADQTYYMINGSTGGILAAVCGSVPRGGRILVSRNCHKSV YHGICLNQLKTSYVYPQEIEGLGIQGGITAEDVDRMLNRYMDTQAVLIVCPTYDGIVSDI EAIARIVHRAGLPLIVDEAHGAHFRYDAMFPVSALDLGADVVIQSVHKTLPSLTQTALLH IKCNRPDGGCYADRERIDRYIHMVQSSSPSYVLMASIENSIYQMEQTDMAPYGKQLHKLR RRLGQMRHLRLADTGLIGQAGIRDLDISKIVVSTRGTCLYPAEDGLTGFTGAQLDDILRR EYHLEMEMCGADYVTAITTVMDSGEGLERLGDALTRIDSQLTDAGYKPDGRSGNQKSVYS MRCDTAMSMGEAMEENMASVGLEDSAGCISGEFVYIYPPGIPIVAPGEWISRPILEVILE YRDKGLPVQGPADQSLRTIRVVQKD >gi|157101647|gb|DS480677.1| GENE 105 112957 - 113478 526 173 aa, chain - ## HITS:1 COG:SPy1995 KEGG:ns NR:ns ## COG: SPy1995 COG0563 # Protein_GI_number: 15675784 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Streptococcus pyogenes M1 GAS # 1 163 6 167 168 141 44.0 5e-34 MGYSGAGKSTLAKKLGRLYDCPVLYLDRIQFEPGWKERNREEAKRMAEEFLNENQDTGWI IDGNYAKFCQERRLEEADLIVFMDYTRRICLWQAVKRYLEYRDKTRESMAEGCREKIDWD FVKWILRDGRDENHSYRSIKARYPHKTLVVRNRKQLKDSMTRLILRSRERTLG >gi|157101647|gb|DS480677.1| GENE 106 113569 - 115068 1256 499 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2822 NR:ns ## KEGG: EUBREC_2822 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 37 495 55 528 692 232 33.0 3e-59 MDDKMYLAATAQGEVKVTGGRAEEKLGARLQRIGKEFGFFGVLSIFYGVSASFCLFRNPL GITVPLFVAVTYGAAFLIFRKMGVPIKKDSYLLAAVSLLIGLSACFTGNMAVGYYMNRLA LILLFCIFILHQCHQDNKWNIGKYMASIVLYLAQAVGMVLYPFRHLGEYILSLKDKRSRN VFRLLAGACAAVPAVIFLCVLLSGADMVFRNMLSVVISKFLNPVTLFMVVIQTVFWSLAM YCLVCSAYGGSISDGMVNVRRHSPVAAVSFMAMVGLVYLVFCVIQIVYLFMGKGSLPEGM TYSQYARQGFFQLLFVAVLNLVMVLMCLKYFREHVLLNGFLLLVSLCTYVMLASAVYRMV LYVQQYQLTFLRILVLWFLAMLFVLMAGVVILIFNHEFPLFRFCLAVVSSFYLVFAWMRP DYITARYNVAHRDSIAGVEQSDFMRLSTDAAPALKGMEDSEIKERLLSWYAKRYEVWDDG NPMGLRTFNFSVLKARNKL >gi|157101647|gb|DS480677.1| GENE 107 115058 - 115288 245 76 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 [Kordia algicida OT-1] # 1 65 1 65 67 99 70 1e-19 
MGIVINLDVMMARRKKGLKELSAEVDITMANLSILKNNKAKAIRFSTLEAICKALDCQPG DILSYEQEEGEDENGR >gi|157101647|gb|DS480677.1| GENE 108 115304 - 115750 455 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936992|ref|ZP_02084356.1| ## NR: gi|160936992|ref|ZP_02084356.1| hypothetical protein CLOBOL_01881 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01881 [Clostridium bolteae ATCC BAA-613] # 1 148 1 148 148 217 100.0 2e-55 MNMRLVRFTKGLLDIMFYVGMAITAAIPLIFHYVGMYIEKFRTYYVVQCALYMASGVLAL LLVLELRRMFATVLADDAFVMENAASLKRMGKCSFLIAVLSVLRLFVAFTPATAVIIIVF SIAGLFCFVLCLVFEQAIRYKQENDLTI >gi|157101647|gb|DS480677.1| GENE 109 116072 - 116509 353 145 aa, chain + ## HITS:1 COG:CAC0197 KEGG:ns NR:ns ## COG: CAC0197 COG1846 # Protein_GI_number: 15893490 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 2 134 4 136 140 120 47.0 1e-27 MTFHYLLMAGHAIYLKQLMAELYHTGLTPGQPKVLDYLMEHDGANQKEIANACHIEAATL TSVLNGMEAKSLIERRRTDGNRRSFYIFLTDRGREMGREVTEAFARLEEYTFQDIPTEKV GKFMEMFQELYQRMENHADTGKPTK >gi|157101647|gb|DS480677.1| GENE 110 116548 - 117228 894 226 aa, chain + ## HITS:1 COG:CAC1482 KEGG:ns NR:ns ## COG: CAC1482 COG1811 # Protein_GI_number: 15894761 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Clostridium acetobutylicum # 1 225 1 233 242 140 37.0 2e-33 MPYGIITNCAAVFLGGILGAVLKNYFPQKLKEALPNIFGLSAITMGISLIIMLKSLSAVV LALIVGTVIGELLNLEYNLLKGLHKLEQKLPVNLDDEQMDVLISMIILFCFSGTGVFGAM NSAMTGDHSILYAKAVMDFFTAIIFGTTAGYLVGFIAVPQFAVGILLFGGASYILPLVTD TMLANFKACGGIITLAVGLKIAGIKRTKVLNILPAIAIIFPLSCLL >gi|157101647|gb|DS480677.1| GENE 111 117349 - 118002 626 217 aa, chain + ## HITS:1 COG:CAC0885_1 KEGG:ns NR:ns ## COG: CAC0885_1 COG1145 # Protein_GI_number: 15894172 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 81 1 81 115 104 56.0 1e-22 MIRRIIKINEDLCNGCGACADACHEGAIAMVDGKARLIKDDYCDGLGDCLPTCPTGAITF 
EEREAAAYDEAAVKARMHQSRNLSPSDAPLHRECPESRLTQWPVQIKLAPVQAPYFKDAR LLIAADCTAFAYGNFHNDFIAGRITLIGCPKLDMVDYSEKLTEIIRLNDIRDVTIVRMEV PCCGGIENAAKRALQASGKFIPWQVVTVRTDGSLVRN >gi|157101647|gb|DS480677.1| GENE 112 118385 - 120139 1833 584 aa, chain - ## HITS:1 COG:CAC2362 KEGG:ns NR:ns ## COG: CAC2362 COG0441 # Protein_GI_number: 15895629 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 14 582 65 633 637 828 68.0 0 MKIILPDGSVQEAEDELRAIRHTASHVLAQAVKRLYPDAKLAIGPAIDDGFYYDFDREGG FTPEDLEKLEAEMAKIVKENLPVKPFVLPRAEAVRFMEEKGEPYKVELIEDLPEEETISF YQQGEFVDLCAGPHIMYTKGVKAFKLTSIAGAYWRGSEKNKMLTRIYGTAFANKTDLESY LTMMEEAKKRDHRKLGKELGLFMFAEEGPGFPFFLPKGMTLKNTLIDYWREIHYRDGYQE VSTPIILSRKLWENSGHWDHYKDNMYTTVIDEEDYAVKPMNCPGGMLVYKNQPHSYRDLP LKVGELGLVHRHEKSGQLHGLMRVRCFTQDDAHIFMRDDQIEDQIKGVTKLINEVYTQFG FEYFVELSTRPEDSMGSDEDWEMATNGLRKALEDMGLDYIVNEGDGAFYGPKIDFHLRDS LGRTWQCGTIQLDFQMPQRFDLEYTAEDGSKKRPIMIHRVCFGSIERFIGILIEHFAGKF PVWLAPVQVKVIPVSEKSMDYATGVYEKLKAAGIRTELDHKDEKVGYKIRQAQLEKVPYM LVLGEKEAAEGAITVRSRDKGDLGAAGLDAFIEDIKKMIQTKEK >gi|157101647|gb|DS480677.1| GENE 113 120731 - 121153 483 140 aa, chain + ## HITS:1 COG:no KEGG:Closa_1010 NR:ns ## KEGG: Closa_1010 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 130 1 130 134 177 75.0 9e-44 MKIVIIDGQGGKMGKSVIEQLKKRFPGLETYAIGTNSIATSAMLKAGATYGATGENPVIV NAQDADIIIGPIGIVMANSLLGEITPAMATAIGISKAYKILIPVNRCNHYIVGCQNSTLS DYINMVCDEVGKEAGINENH >gi|157101647|gb|DS480677.1| GENE 114 121140 - 121382 296 80 aa, chain + ## HITS:1 COG:no KEGG:Closa_0017 NR:ns ## KEGG: Closa_0017 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 76 1 76 76 111 75.0 9e-24 MRTIEFKMERQGLVNKGDKVKIIEREGTSSYSYIIDPAVAMSGCFAARYRIKSEYGIVKD IRENSRGFYVIVDFDEDEPR >gi|157101647|gb|DS480677.1| GENE 115 121398 - 123146 1270 582 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937004|ref|ZP_02084368.1| ## NR: 
gi|160937004|ref|ZP_02084368.1| hypothetical protein CLOBOL_01893 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01893 [Clostridium bolteae ATCC BAA-613] # 10 582 29 601 601 1099 100.0 0 MAAIIMIAAVYVIMSVVGKIRSDYKNLDFTTEEKEMPSITIDTKDEPKDGWNETDKGWMY YLDEKEYVTDQWKDIEGFLYYFDSDGIMATGELKQEGQTYTCHDTKGYLKNIQIDPDYVP ESTGENLDSLVRTNAFWCYLRAEDTGLFKTILYRKTVENKVVVLGGETAPEKTTRNSMRA YGDYVYFLPKVKESQKQGLSEAERGLCDKLFRMRPGSDTKELIAENVDGYMVLGDIIYYS QGGKIYQTTLGTEVATGEARYSVVIKDGNCYLVDDMGNPAVAESGSSVSIGDRVYRIEED GKIKYVKHGQVTIDGKTYYLGGSGTKSSVCVKRDGSDTAVIRENYGVQSYCIVDNQIYYS SYVDKGTGGEWYSQIFKAGLDGQNKQAVSERFPGVMQNMYYYEDEGQIFGEYNPAIWKQA YGTVVMIARDGGIYQIEDASARTGKHVDGNDMLEIIMARDGKVICLWHDCQWERNSGITS ILWSKAIELNAGDRRLIDMVPSTESAESSEAQETGDIIQPIETMPPASTETVSPVEPSIN NDPVISTEGPHTDTAVPTVPAPAPTVPPVTEPSAEVKIVPLG >gi|157101647|gb|DS480677.1| GENE 116 123613 - 123708 74 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDVLNFLVAVMASVAGYYICKWLDRNRKDS >gi|157101647|gb|DS480677.1| GENE 117 123961 - 124452 41 163 aa, chain + ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 30 156 67 190 297 66 30.0 3e-11 MQHIFVSNCVIPLSLLNVPIIMVLLHHSLSKATISTYVRNARIFLRWVHAEYGLSFDPVK IKVPKPPKKNVHVLSSAEIQYLLDSAKCSVPWLTARNKAIIALMLGSGLRQNEICTLVKK GLDFDRGALQVTGKGCKDRLVPLGYISKILITIAEREILVSTI Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:23:54 2011 Seq name: gi|157101646|gb|DS480678.1| Clostridium bolteae ATCC BAA-613 Scfld_02_19 genomic scaffold, whole genome shotgun sequence Length of sequence - 132116 bp Number of predicted genes - 110, with homology - 110 Number of transcription units - 63, operones - 31 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 2 - 1133 1067 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 1191 - 1250 6.7 2 1 
Op 2 . - CDS 1322 - 2320 1160 ## COG1609 Transcriptional regulators
- Prom 2446 - 2505 4.9
3 2 Op 1 . - CDS 2542 - 2901 365 ## COG3304 Predicted membrane protein
4 2 Op 2 . - CDS 2907 - 3359 562 ## bpr_IV092 hypothetical protein
- Prom 3567 - 3626 9.6
+ Prom 3722 - 3781 5.2
5 3 Op 1 . + CDS 3811 - 4503 414 ## COG1794 Aspartate racemase
6 3 Op 2 . + CDS 4576 - 5337 358 ## COG0491 Zn-dependent hydrolases, including glyoxylases
+ Term 5432 - 5469 7.1
- TRNA 5509 - 5581 81.6 # Ala GGC 0 0
7 4 Tu 1 . + CDS 6008 - 6298 151 ## gi|160937017|ref|ZP_02084380.1| hypothetical protein CLOBOL_01906
+ Term 6306 - 6348 1.9
- Term 6294 - 6331 6.1
8 5 Tu 1 . - CDS 6385 - 7080 580 ## COG3773 Cell wall hydrolyses involved in spore germination
- Prom 7143 - 7202 3.3
- Term 7197 - 7229 5.4
9 6 Op 1 . - CDS 7234 - 8028 629 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase
10 6 Op 2 . - CDS 8018 - 9046 803 ## COG3541 Predicted nucleotidyltransferase
- Prom 9071 - 9130 7.7
11 7 Op 1 . - CDS 9175 - 9399 158 ## gi|160937023|ref|ZP_02084386.1| hypothetical protein CLOBOL_01912
12 7 Op 2 . - CDS 9565 - 9828 296 ## Closa_2254 hypothetical protein
- Prom 9945 - 10004 9.6
+ Prom 9828 - 9887 3.7
13 8 Tu 1 . + CDS 10091 - 10750 383 ## Closa_2883 hypothetical protein
- Term 10760 - 10800 8.8
14 9 Op 1 . - CDS 10934 - 11218 304 ## COG4728 Uncharacterized protein conserved in bacteria
15 9 Op 2 . - CDS 11299 - 11643 310 ## COG2315 Uncharacterized protein conserved in bacteria
- Prom 11737 - 11796 4.3
- Term 11857 - 11917 4.8
16 10 Tu 1 . - CDS 11930 - 13093 1168 ## COG2230 Cyclopropane fatty acid synthase and related methyltransferases
- Prom 13134 - 13193 3.1
- Term 13173 - 13216 11.3
17 11 Tu 1 . - CDS 13250 - 16786 3261 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit
- Prom 16890 - 16949 8.1
- Term 16971 - 17031 10.6
18 12 Tu 1 . - CDS 17037 - 18866 1757 ## COG2720 Uncharacterized vancomycin resistance protein
- Prom 18980 - 19039 7.6
+ Prom 18944 - 19003 9.2
19 13 Op 1 . + CDS 19082 - 19921 752 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family)
+ Prom 19925 - 19984 7.3
20 13 Op 2 . + CDS 20048 - 20284 335 ## Closa_1693 hypothetical protein
+ Term 20335 - 20374 7.3
- Term 20323 - 20362 9.2
21 14 Op 1 . - CDS 20466 - 21659 1312 ## COG0628 Predicted permease
22 14 Op 2 . - CDS 21671 - 22843 1226 ## COG2872 Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain
- Prom 22970 - 23029 4.8
- Term 23344 - 23379 5.0
23 15 Tu 1 . - CDS 23404 - 23556 171 ## gi|160937041|ref|ZP_02084404.1| hypothetical protein CLOBOL_01930
+ Prom 23489 - 23548 5.7
24 16 Op 1 . + CDS 23693 - 24109 527 ## Closa_1687 hypothetical protein
25 16 Op 2 . + CDS 24154 - 24435 330 ## Closa_1686 hypothetical protein
- Term 24406 - 24448 7.1
26 17 Tu 1 . - CDS 24490 - 25332 645 ## COG3786 Uncharacterized protein conserved in bacteria
- Prom 25362 - 25421 7.0
- Term 25442 - 25490 8.2
27 18 Tu 1 . - CDS 25524 - 27938 2954 ## COG0495 Leucyl-tRNA synthetase
- Prom 27984 - 28043 3.5
28 19 Op 1 . - CDS 28337 - 28702 377 ## Cpin_5888 hypothetical protein
29 19 Op 2 . - CDS 28689 - 28805 65 ## gi|160937047|ref|ZP_02084410.1| hypothetical protein CLOBOL_01936
- Prom 28858 - 28917 4.4
+ Prom 28803 - 28862 8.5
30 20 Op 1 . + CDS 28902 - 30434 1691 ## Closa_3302 hypothetical protein
31 20 Op 2 . + CDS 30424 - 31122 618 ## Closa_3303 hypothetical protein
32 21 Tu 1 . + CDS 31202 - 34486 2269 ## COG0210 Superfamily I DNA and RNA helicases
+ Term 34538 - 34583 -0.6
- Term 34526 - 34571 9.0
33 22 Op 1 3/0.077 - CDS 34623 - 35963 1644 ## COG0372 Citrate synthase
34 22 Op 2 . - CDS 35994 - 38297 2335 ## COG1048 Aconitase A
- Prom 38372 - 38431 8.3
35 23 Tu 1 . - CDS 38517 - 39476 906 ## HMPREF0424_1232 CAAX amino terminal protease family protein
+ Prom 39743 - 39802 6.4
36 24 Tu 1 . + CDS 39837 - 40487 595 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases
37 25 Tu 1 . - CDS 40476 - 43028 2285 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain
- Prom 43063 - 43122 5.9
- Term 43107 - 43155 12.0
38 26 Tu 1 . - CDS 43166 - 43330 242 ## COG1592 Rubrerythrin
- Prom 43395 - 43454 5.2
- Term 43492 - 43532 8.0
39 27 Tu 1 . - CDS 43545 - 45641 2460 ## COG0480 Translation elongation factors (GTPases)
- Prom 45675 - 45734 6.4
40 28 Tu 1 . - CDS 45766 - 46725 401 ## COG2199 FOG: GGDEF domain
- Prom 46773 - 46832 4.2
- Term 46966 - 47005 4.1
41 29 Tu 1 . - CDS 47007 - 47216 62 ## gi|160937065|ref|ZP_02084428.1| hypothetical protein CLOBOL_01954
- Prom 47290 - 47349 3.9
- Term 47305 - 47336 0.1
42 30 Tu 1 . - CDS 47397 - 47879 441 ## Clole_1402 hypothetical protein
- Prom 47910 - 47969 6.1
+ Prom 48477 - 48536 6.4
43 31 Op 1 5/0.000 + CDS 48558 - 49580 1271 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase
+ Term 49593 - 49626 2.1
+ Prom 49618 - 49677 2.0
44 31 Op 2 6/0.000 + CDS 49718 - 50851 1302 ## COG0287 Prephenate dehydrogenase
45 31 Op 3 . + CDS 50879 - 52159 1129 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase
- Term 52152 - 52207 14.7
46 32 Op 1 . - CDS 52271 - 53047 887 ## COG0428 Predicted divalent heavy-metal cations transporter
47 32 Op 2 . - CDS 53090 - 53725 799 ## Closa_3203 hypothetical protein
48 32 Op 3 . - CDS 53792 - 54304 694 ## COG1335 Amidases related to nicotinamidase
- Prom 54481 - 54540 7.0
+ Prom 54718 - 54777 5.8
49 33 Op 1 9/0.000 + CDS 54884 - 56029 1438 ## COG2984 ABC-type uncharacterized transport system, periplasmic component
50 33 Op 2 13/0.000 + CDS 56060 - 56992 1061 ## COG4120 ABC-type uncharacterized transport system, permease component
51 33 Op 3 . + CDS 56985 - 57758 225 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
+ Term 57784 - 57825 1.8
52 34 Tu 1 . - CDS 57829 - 58188 430 ## COG1396 Predicted transcriptional regulators
- Prom 58261 - 58320 6.0
+ Prom 58206 - 58265 6.0
53 35 Tu 1 . + CDS 58438 - 59742 996 ## COG2199 FOG: GGDEF domain
+ Term 59788 - 59838 15.0
- Term 59775 - 59826 15.2
54 36 Op 1 . - CDS 59864 - 60580 828 ## CLH_2547 hypothetical protein
- Prom 60701 - 60760 6.5
55 36 Op 2 2/0.077 - CDS 60767 - 61801 1194 ## COG0524 Sugar kinases, ribokinase family
- Prom 61830 - 61889 3.0
- Term 61834 - 61881 12.1
56 37 Op 1 8/0.000 - CDS 61909 - 62940 1184 ## COG0524 Sugar kinases, ribokinase family
57 37 Op 2 1/0.077 - CDS 63018 - 63668 797 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase
- Prom 63729 - 63788 5.6
58 38 Op 1 . - CDS 63798 - 64559 912 ## COG1414 Transcriptional regulator
- Prom 64617 - 64676 9.4
59 38 Op 2 . - CDS 64708 - 66189 1784 ## COG2721 Altronate dehydratase
- Prom 66278 - 66337 12.4
+ Prom 66309 - 66368 10.2
60 39 Op 1 2/0.077 + CDS 66402 - 67403 1130 ## COG1609 Transcriptional regulators
61 39 Op 2 . + CDS 67426 - 68841 1629 ## COG1904 Glucuronate isomerase
+ Term 68857 - 68894 3.1
- Term 68846 - 68881 4.1
62 40 Tu 1 . - CDS 68915 - 69958 892 ## COG0392 Predicted integral membrane protein
- Prom 69998 - 70057 5.4
- Term 70084 - 70134 3.1
63 41 Tu 1 . - CDS 70166 - 70543 468 ## Cphy_1011 hypothetical protein
- Prom 70583 - 70642 7.0
64 42 Op 1 7/0.000 - CDS 71228 - 72736 1674 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
65 42 Op 2 1/0.077 - CDS 72717 - 74579 1774 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
66 42 Op 3 . - CDS 74569 - 75561 1047 ## COG1879 ABC-type sugar transport system, periplasmic component
- Prom 75597 - 75656 6.5
- Term 75632 - 75682 13.0
67 43 Op 1 11/0.000 - CDS 75703 - 76992 743 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16
68 43 Op 2 11/0.000 - CDS 76996 - 77487 711 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component
69 43 Op 3 . - CDS 77513 - 78541 292 ## PROTEIN SUPPORTED gi|239995924|ref|ZP_04716448.1| ribosomal protein L22
- Prom 78608 - 78667 4.0
- Term 78714 - 78761 8.3
70 44 Op 1 . - CDS 78791 - 80443 664 ## PROTEIN SUPPORTED gi|39938628|ref|NP_950394.1| ribosomal protein L13
71 44 Op 2 9/0.000 - CDS 80484 - 81293 900 ## COG0327 Uncharacterized conserved protein
72 44 Op 3 5/0.000 - CDS 81280 - 82065 705 ## COG2384 Predicted SAM-dependent methyltransferase
73 44 Op 4 31/0.000 - CDS 82058 - 83182 1672 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32)
74 44 Op 5 3/0.077 - CDS 83213 - 85039 2043 ## COG0358 DNA primase (bacterial type)
75 44 Op 6 . - CDS 85096 - 86106 1260 ## COG0232 dGTP triphosphohydrolase
- Prom 86126 - 86185 2.8
- Term 86174 - 86220 5.1
76 45 Op 1 . - CDS 86238 - 88013 1764 ## COG1404 Subtilisin-like serine proteases
77 45 Op 2 . - CDS 88047 - 89375 1451 ## COG0733 Na+-dependent transporters of the SNF family
78 45 Op 3 . - CDS 89372 - 90667 1474 ## COG1757 Na+/H+ antiporter
79 46 Op 1 41/0.000 - CDS 90786 - 91706 1316 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component
80 46 Op 2 . - CDS 91724 - 92461 192 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
- Prom 92518 - 92577 5.6
- TRNA 92676 - 92759 64.6 # Leu CAG 0 0
- Term 92628 - 92662 4.0
81 47 Tu 1 . - CDS 92805 - 96761 3452 ## COG2200 FOG: EAL domain
- Prom 96820 - 96879 5.3
- Term 96888 - 96943 14.0
82 48 Op 1 40/0.000 - CDS 96975 - 99407 3061 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit
83 48 Op 2 . - CDS 99456 - 100475 1196 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit
- Prom 100561 - 100620 2.5
+ Prom 100949 - 101008 5.0
84 49 Tu 1 . + CDS 101060 - 102043 794 ## COG0122 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase
- Term 101867 - 101920 6.9
85 50 Op 1 . - CDS 102050 - 102661 177 ## PROTEIN SUPPORTED gi|163764517|ref|ZP_02171573.1| ribosomal protein L32
86 50 Op 2 . - CDS 102707 - 103297 625 ## gi|160937121|ref|ZP_02084484.1| hypothetical protein CLOBOL_02012
87 50 Op 3 . - CDS 103322 - 104206 721 ## COG1091 dTDP-4-dehydrorhamnose reductase
- Prom 104226 - 104285 8.1
88 51 Tu 1 . - CDS 104302 - 105744 1158 ## COG1305 Transglutaminase-like enzymes, putative cysteine proteases
- Prom 105833 - 105892 4.4
- Term 105871 - 105916 5.5
89 52 Op 1 31/0.000 - CDS 105932 - 107083 1183 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain
- Prom 107110 - 107169 1.5
90 52 Op 2 29/0.000 - CDS 107252 - 109114 2451 ## COG0443 Molecular chaperone
91 52 Op 3 21/0.000 - CDS 109164 - 109838 902 ## COG0576 Molecular chaperone GrpE (heat shock protein)
92 52 Op 4 . - CDS 109845 - 110897 1458 ## COG1420 Transcriptional regulator of heat shock gene
+ Prom 111158 - 111217 6.8
93 53 Op 1 . + CDS 111274 - 111843 463 ## Closa_0879 3D domain protein
+ Term 111856 - 111907 6.0
+ Prom 112021 - 112080 5.0
94 53 Op 2 . + CDS 112116 - 112844 703 ## Closa_0878 3D domain protein
+ Term 112846 - 112902 17.1
- Term 112834 - 112888 17.5
95 54 Tu 1 . - CDS 112925 - 115198 2430 ## COG2385 Sporulation protein and related proteins
- Prom 115260 - 115319 3.6
- Term 115317 - 115350 -0.6
96 55 Op 1 4/0.000 - CDS 115509 - 116852 1411 ## COG0213 Thymidine phosphorylase
97 55 Op 2 . - CDS 116854 - 117270 369 ## COG0295 Cytidine deaminase
98 56 Tu 1 . - CDS 117377 - 117844 590 ## COG2606 Uncharacterized conserved protein
- Prom 117987 - 118046 6.0
+ Prom 117957 - 118016 6.6
99 57 Tu 1 . + CDS 118051 - 119145 1158 ## COG2006 Uncharacterized conserved protein
100 58 Tu 1 . - CDS 119156 - 119902 636 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily)
- Prom 119924 - 119983 2.9
- Term 119952 - 120000 13.6
101 59 Tu 1 . - CDS 120027 - 121736 2155 ## COG0513 Superfamily II DNA and RNA helicases
- Term 122086 - 122120 -0.5
102 60 Tu 1 . - CDS 122221 - 123081 976 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily
103 61 Op 1 . - CDS 123190 - 123645 612 ## COG2731 Beta-galactosidase, beta subunit
104 61 Op 2 . - CDS 123647 - 124423 867 ## Cphy_3765 hypothetical protein
105 61 Op 3 . - CDS 124458 - 126587 2603 ## COG0145 N-methylhydantoinase A/acetone carboxylase, beta subunit
106 61 Op 4 . - CDS 126604 - 127947 1700 ## Apre_0573 citrate transporter
107 61 Op 5 . - CDS 127990 - 128997 1242 ## COG2055 Malate/L-lactate dehydrogenases
- Prom 129145 - 129204 8.2
- Term 129198 - 129248 6.2
108 62 Tu 1 . - CDS 129403 - 130401 1110 ## COG1609 Transcriptional regulators
- Prom 130480 - 130539 4.5
109 63 Op 1 . - CDS 130737 - 131693 1237 ## Closa_2430 G3E family GTPase-like protein
110 63 Op 2 .
- CDS 131714 - 131953 276 ## Closa_2431 cobalamin synthesis protein P47K Predicted protein(s) >gi|157101646|gb|DS480678.1| GENE 1 2 - 1133 1067 377 aa, chain - ## HITS:1 COG:SA2167_2 KEGG:ns NR:ns ## COG: SA2167_2 COG1263 # Protein_GI_number: 15927957 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Staphylococcus aureus N315 # 97 377 5 293 385 329 60.0 7e-90 MDYAKTASLVIKYVGGKNNIKSVTHCATRLRFQLRDNGLRNEEAISDLEGVKGVFLTQSQ FQIIFGSGIVNMVCDEVQKQLGTLEEKPEEKEEGKGNVVQRFIKMLSDIFVPIIPAIVAG GLLMGLNNLLTSPLVNGQSMIELYPMWQGLASAINTFANAPFTFLPVLIGFSATRKFGGN AFLGAAMGMIMVHPDLLNAYQIGLAQPPVWDIFGFQIAAIGYQGTVLPVLAVSWILANIE KRLHKVTPSWLDNLTTPLLSILVTSFLTFICVGPVLREAGNLLASGITWLYNTLGPVGGA LFGFAYAPITMTGMHHSFIAIETQLLADSAHTGGSFIFSTASMNNVAQGAAVLAVLLMTK NDKMKSICSASGISALL >gi|157101646|gb|DS480678.1| GENE 2 1322 - 2320 1160 332 aa, chain - ## HITS:1 COG:BH1855 KEGG:ns NR:ns ## COG: BH1855 COG1609 # Protein_GI_number: 15614418 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 7 326 3 323 326 226 37.0 7e-59 MSSGKITMDDIANMAGVTKSTVSRYFNGGSVRESTRERIQEIIREYNYEPNTFARLKAKE SNVIGVVLPTLNSKVSSRVVTSIGRYLREQGYETLIKDSDHSVDLELKNIQRLITLKVDG IILSAITITEAHRKLIQGSPVPVVVLAQDYDGGISIVHDDYGAGKAMGVRVAKARPARVG YMGVSALDVAVGINRKQGVMDGLKEWGIKDVVTAAGDYSYASGIRMAEEMMDQGPVDAII CATDRLAFGAYHVLGERGLRIPEDVSVAGFGGYDESTLLKPDLTTLRFDSYGLGYLGAET ILKMIHREPVPKKQIVDYTMIEGHSVKEQSDF >gi|157101646|gb|DS480678.1| GENE 3 2542 - 2901 365 119 aa, chain - ## HITS:1 COG:CC2111 KEGG:ns NR:ns ## COG: CC2111 COG3304 # Protein_GI_number: 16126350 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Caulobacter vibrioides # 6 119 7 130 132 86 44.0 1e-17 MGCLGNVLWFIFGGCISGLSWLLTGCLWCITIIGIPVGLQCFKFAGLSFFPFGKEVVYGG GAASLLLNIIWLLVSGLPLAIESAIIGVLLCITIVGIPFGLQQFKLAKLALMPFGSQIV >gi|157101646|gb|DS480678.1| GENE 4 2907 - 3359 562 150 aa, chain - ## HITS:1 COG:no 
KEGG:bpr_IV092 NR:ns ## KEGG: bpr_IV092 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 7 139 6 136 138 99 42.0 4e-20 MERKSRQQEISEALMAADMASMSLQKAEELLQKASSWGIWDMLGGGFFSTMFKHNRMDEA QAAMNEARGHLRRLKRELLDVNLTGDLKMDVGSFLTFADYFFDGVIADWMVQSKIGDALN QVREARRQVSGIRKRLQEMRQTLETEQEGR >gi|157101646|gb|DS480678.1| GENE 5 3811 - 4503 414 230 aa, chain + ## HITS:1 COG:YPO2758 KEGG:ns NR:ns ## COG: YPO2758 COG1794 # Protein_GI_number: 16122962 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Aspartate racemase # Organism: Yersinia pestis # 1 229 2 231 231 260 56.0 2e-69 MKTIGLIGGMSWESTVSYYQIINEEVKRRLGGLHSAKVILYSVEFDEIEKCQSDGDWKKS GDILGTAAKSLEAAGADFILICTNTMHKVVPQIESMIQIPIIHIADATAQELLACQIRKV GLLGTKYTMTQDFYKQRLIDRGIDVITPDAADIDIVNDIIFHELCVGNIREESRRQFQVV IDNLKAKGAHGVILGCTEIGLLIHQTDSSLPVFDTTVIHAKQAVELALEV >gi|157101646|gb|DS480678.1| GENE 6 4576 - 5337 358 253 aa, chain + ## HITS:1 COG:PA1415 KEGG:ns NR:ns ## COG: PA1415 COG0491 # Protein_GI_number: 15596612 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Pseudomonas aeruginosa # 2 211 7 220 242 115 31.0 6e-26 MDNWFTIDKTDSNTYIISEYRHWEETHCYLLNGRERSLLIDTGLGISNIFDEVVKLTDKP ITAVATHIHWDHIGGHKYFPDFYAHAEELDWLNGKFPLSIETIKEMVADRCNLPDGFDVN DYEFFQGVPSKILTDHDIIDLGGRQVEVFHTPGHSPGHLCFWEKDTGYLFTGDLIYKDIL FAYYPSTDPEAYLNSLKKISSLPVKKVFPAHHSLDIQPEILFRMRKAFEQLEADGKLHHG SGTFNYGDWGVWL >gi|157101646|gb|DS480678.1| GENE 7 6008 - 6298 151 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937017|ref|ZP_02084380.1| ## NR: gi|160937017|ref|ZP_02084380.1| hypothetical protein CLOBOL_01906 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01906 [Clostridium bolteae ATCC BAA-613] # 1 96 1 96 96 173 100.0 4e-42 MAYPKVHIVNSTNFSVKGKVKYASAFCSDDNYEIAPWGSWTAGSRGVCLLTEVSATVHTP GQDAKATPYESSGTSYSQFAVLQTAPGKFTMTRIVT >gi|157101646|gb|DS480678.1| GENE 8 6385 - 7080 580 231 aa, chain - ## HITS:1 COG:BS_sleB_2 KEGG:ns NR:ns ## 
COG: BS_sleB_2 COG3773 # Protein_GI_number: 16079350 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall hydrolyses involved in spore germination # Organism: Bacillus subtilis # 120 189 8 77 125 60 35.0 2e-09 MKRIVILGAMALTLSLAVPMTVQAEENGTDMVLAAKSVAGEFKTALQEAAHKTIKFANQD GINIYKELSTGAVVLDQAFLNTSFEVLGEYGDWSMITTEAGHAYIETKYLSGIEIRRPSY SEEDLYILAHVICGEAQNCPDDEQLLVGSVVLNRVNHGRFPNTIKGVVFAPGQYSCTRDG NYYREPTASNWANAKWLLENGSILPGNVVWQSGGRQGRGEYLRTRYHSYCY >gi|157101646|gb|DS480678.1| GENE 9 7234 - 8028 629 264 aa, chain - ## HITS:1 COG:FN0243 KEGG:ns NR:ns ## COG: FN0243 COG0617 # Protein_GI_number: 19703588 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Fusobacterium nucleatum # 39 254 231 442 451 99 29.0 4e-21 MNCNHDIPISAACIIKELELRGWEFTQGQNLEMDNWQHILFTVVPELKTMCGFQQNNPYH VYDVWQHTAAALRAAGDDPVVALTILFHDIGKPECYTEDEDGRGHFYGHGPVSARITNQV MKRLKFDEETIDTVVELVRYHDSDLHEKRRSIRTWLDRLGEKQFRRLLEVRRCDVSGQNP ACLAERLARIHRLEALLDEVVEQTGQFHIKDLDINGSDLIQIGYTPGSSIGSTLSKLADQ VVEGVMENKRDILLREAGRWLIRE >gi|157101646|gb|DS480678.1| GENE 10 8018 - 9046 803 342 aa, chain - ## HITS:1 COG:CAC1496 KEGG:ns NR:ns ## COG: CAC1496 COG3541 # Protein_GI_number: 15894775 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Clostridium acetobutylicum # 7 342 4 339 339 335 50.0 6e-92 MTLGQIKERLQSSEYDFLRTDEHLGHHIILLTLGGSHAYGTSMETSDLDIRGCALNSKRE ILIHEDFGQVTDPHTDTTIYAFRKLIPLLTNCNPNTIEMLGGREDHYLHIAPVGRELLDH AHLFLSQRAVYSFGGYANQQLRRLDNKAARLVEQTDNERHILKTIEHASFDWKNRYFYMP GDSVKLYIGKALKEEYDSEIFMDITLRHYPLRDYKCLWSEMQSIVKSYAKVGKRNQNAIE HGKLGKHMMHLIRLYMMCLDILEKEQIVTYREKEHDLLMDIRNGRYLDENRQPVPEFFEM VDEYEKRLDYARKNTSLPEEPDIRKINDFAASVNERVVKGEL >gi|157101646|gb|DS480678.1| GENE 11 9175 - 9399 158 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937023|ref|ZP_02084386.1| ## NR: gi|160937023|ref|ZP_02084386.1| hypothetical protein CLOBOL_01912 [Clostridium bolteae ATCC BAA-613] 
hypothetical protein CLOBOL_01912 [Clostridium bolteae ATCC BAA-613] # 1 74 19 92 92 140 100.0 2e-32 MEECCGINIEQEMTIENLYCFIRASLQALQSTGGYGEADFVCPLCGKKAHIKRLKGELYN TGEIGCRCGYSFRF >gi|157101646|gb|DS480678.1| GENE 12 9565 - 9828 296 87 aa, chain - ## HITS:1 COG:no KEGG:Closa_2254 NR:ns ## KEGG: Closa_2254 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 74 1 73 84 93 64.0 3e-18 MKITKDNLNEVLFENHDARLLIADVVTHTSANLYYYHDIEITVQKALDIWHKAQAADGEE NGFYSVSFLDFSKESAPLPSGLQHAFT >gi|157101646|gb|DS480678.1| GENE 13 10091 - 10750 383 219 aa, chain + ## HITS:1 COG:no KEGG:Closa_2883 NR:ns ## KEGG: Closa_2883 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 212 1 212 219 194 47.0 2e-48 MQTITQHILTDQMDYQDTVVLNCQIRYPQLNASCNQEAIQAINQYYVTQANEKARYCRSV LYPQAAEEANYSRVRNVPFFPYEFNISYVVTYNQDCIFSLFSEQYTFTGGAHGSTLRTSE TWDAQSGRKMTLSDFYQNNPSYIQDIQNWIQFEIAERLKANPGTYFDNYQELLRNSFHPE NFYLTPGGIVIYYQQYDIAPYSSGIPEFLLPFDADTSNV >gi|157101646|gb|DS480678.1| GENE 14 10934 - 11218 304 94 aa, chain - ## HITS:1 COG:CAC1208 KEGG:ns NR:ns ## COG: CAC1208 COG4728 # Protein_GI_number: 15894491 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 10 93 77 148 173 61 40.0 5e-10 MNRQNEHTGIGPGTIVRHFKRETLSAEEKKTNRYLYVVRDMAHHTETGELLVIYQALYGE FGTFARPAEMFFSEVDRAKYPEIRQRMRFEAVSS >gi|157101646|gb|DS480678.1| GENE 15 11299 - 11643 310 114 aa, chain - ## HITS:1 COG:L35832 KEGG:ns NR:ns ## COG: L35832 COG2315 # Protein_GI_number: 15674158 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 4 106 3 109 114 90 41.0 8e-19 METKSDLITYCLTYKDVYEDYPFRDHEWTVIRHRGNKRVFAWIYRREGQVCINLKCAPEW IEFWRKAYSGVRPGYHLNKKHWNTVVLDGSVPDEDIKRMIGESYDLTGDCSKKR >gi|157101646|gb|DS480678.1| GENE 16 11930 - 13093 1168 387 aa, chain - ## HITS:1 COG:CAC0877 KEGG:ns NR:ns ## COG: CAC0877 COG2230 # 
Protein_GI_number: 15894164 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cyclopropane fatty acid synthase and related methyltransferases # Organism: Clostridium acetobutylicum # 10 387 12 388 391 359 46.0 4e-99 MLEEKAMVQFLNRFTDYPFLVKWKDHEDRVGHGNPMFAVDIRETIPVKKLMDSTSLALGE AYMKGKLEVEGSLYEALDHFLGQMDRFSVNRLHLRKLLHPSNSRKNQKMEVSSHYDIGND FYRLWLDETMNYSCGYFKTENDTLYEAQVNKTDYILKKLALEEGMSLLDVGCGWGFLLIR AAREYGARGTGITLSTEQYDGFKKRIEEEGLEHLLDVRLMDYRDLPQSGMKFDRVVSVGM VEHVGRENYGRFMDCVDKVLNPGGVFLLHFISSQKEHAGDPWIKKYIFPGGVIPSLREII SLMAEHDFHILDVENLRNHYNKTLLCWEKNFKEHEKEIKDMFDEEFVRMWDLYLCSCAAT FHNGVIDLHQILVSKGVNNELPMVRWY >gi|157101646|gb|DS480678.1| GENE 17 13250 - 16786 3261 1178 aa, chain - ## HITS:1 COG:CAC2229_1 KEGG:ns NR:ns ## COG: CAC2229_1 COG0674 # Protein_GI_number: 15895497 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Clostridium acetobutylicum # 3 402 2 404 413 563 67.0 1e-160 MARKMKTMDGNHAAAHASYAFTDVAAIYPITPSSPMAEATDEWATDGRTNIFGREVQITE MQSEAGAAGAVHGSLAAGALTTTYTASQGLLLMIPNLYKIAGEQLPGVFNVSARALASHA LSIFGDHSDVYACRQTGCAMLCESSVQEVMDLTVVAHMASIKGKVPFINFFDGFRTSHEI QKIETWDYEDLKEMVDMDAVDAFRKHALNPNHPCQRGSAQNPDIFFQAREACNPYYDALP AIVQEYMDKVNAKIGTDYKLFNYYGAADAEHIIISMGSVNDTIEETIDYMVKQGQKVGVV KVRLYRPFCVQALIDAIPDTVKVISVLDRTKEPGAIGEPLYLDVVAALKGSKFDQVKVLT GRYGLGSKDTTPAQIVAVYENTTKSPFTVGIVDDVTNLSLEIGAPLVTTPEGTTNCKFWG LGADGTVGANKNSIKIIGDNTDMYAQAYFDYDSKKSGGVTMSHLRFGHSPIKSTYLIRTA NFVACHNPAYVRKYNMVQELVDGGTFLLNCPWDMEGLEKHLPGQVKKFIADHNINFYTID GVKIGIETGMGPTRINTILQSAFFELTGIIPAEKANELMKAAAKATYGRKGEDVVQKNWA AIDAGAKGMHKVEVPESWKNCEDEGLDYAVVTQGRKDVVDFVNNIQTKVSAQEGNSLPVS AFTEYADGSTPSGSSAYEKRGIAVKVPVWNPDNCIQCNFCAYVCPHAVIRPVAMTADEAA KAPADMKMKDMTGMAGYKFAISVSALDCTGCGSCANVCPGMKGNKALVMESLEANLGEQA IFDFGQSLPVKEEVLAKFKENTVKGSQFRQPLLEFSGACAGCGETPYAKLITQLFGDRMY IANATGCSSIWGNSSPSTPYTVNAKGQGPAWDNSLFEDNAEFGYGMLLAQNAIRDGLKAK 
VESVMANEKATDEMKAACKEWLDTFGTGALNGTATDKLVAVLDGVDCDVCRDIVKNKDFL AKKSQWVFGGDGWAYDIGFGGVDHVLASGKDINIMVYDTEVYSNTGGQSSKATKTGAVAQ FAAGGKDVKKKDLASIAMSYGYVYVAQICMGADMAQTVKAIAEAEAYPGPSLIIAYAPCI NHGIKKGMDKAQTEEKLAVECGYWNNFRYNPAAEKKFSLDSKAPKLETYQDFLKGEVRYM SLAMKNPERAAELFARNEAEAKERYAYLEKLVTLYGND >gi|157101646|gb|DS480678.1| GENE 18 17037 - 18866 1757 609 aa, chain - ## HITS:1 COG:CAC0691 KEGG:ns NR:ns ## COG: CAC0691 COG2720 # Protein_GI_number: 15893979 # Func_class: V Defense mechanisms # Function: Uncharacterized vancomycin resistance protein # Organism: Clostridium acetobutylicum # 197 472 132 401 411 130 36.0 6e-30 MNQLENNQNRRSSSAGRSRGRGSDKGTGRGTSRNTGRSSSSGSGSRSSYSYRSGGQAGRM NAAQTARRNSPHRRKKGPDYKKIAIAGVILIILIAGISLAMKWKGAGGSEPDHSVEETTE TEMMKEVKVDGITITGMSKNQARNAILKEYPWDMTVTYDGDTYNVTNLAAEKVDALLDEI YSGEPAESYTLNMDGLEEAAAGEAENCAAKWNKKAKNGSIDSYDKEKGAFVFAGEESGFA IDQEKLAADILKALKDKKFDASITATGSETAPDISAAAAKEKYKTISTFTTKTTANKARN TNIRLASEALNGTVVHPGEEFSFNAAVGQRTEAKGYQGAAAYNNGEVVQEIGGGVCQVST TLYNAVYRSGLGEESITFRRSHTFEPNYITPGQDATVSWEQPDFRFKNTSSTSIGIKASY ADQKMTVSIYGIPILEEGTKLDLVSEKVEELDPPAPTYEEDQTLEPGVEIQKSAGSKGSR WITYKVLYKDGKEVSREQDHKTTYKGHAPVIRRNTTGVVLPPEETTLSTETTTPTIDGMP DGYIPGDETITAPSGNSSPESGPGTVTQPAAPTTAPAAAPNPTQAPQPSETEAAGPGGDN PVIAPLKPE >gi|157101646|gb|DS480678.1| GENE 19 19082 - 19921 752 279 aa, chain + ## HITS:1 COG:BH3372 KEGG:ns NR:ns ## COG: BH3372 COG0613 # Protein_GI_number: 15615934 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Bacillus halodurans # 5 249 13 250 266 150 36.0 2e-36 MSYVDLHVHSNASDGTLSPEQVVRLAHEAGLSAIALTDHDTTAGVAGAAEAGAELGLEVV PGIEVSSSFRDHEIHILGLFVDTDSPSLQAALKEFRHHRDQRNQEMLGRFDAHGIHLTAE DLCAGNPDTVITRAHVARALLAKGLGPDLDHIFKKYLQYGGTYCPPKEFLPPEAVMEALL DSHAFVALAHPFQYKLGDKATEELIAYLAGLGMKGLEVYHSSHNRLESAKLQEMAARHHL LSTGGSDFHGSNKPDISIGTGRGGLRVSSLLLDAIKNTL >gi|157101646|gb|DS480678.1| GENE 20 20048 - 20284 335 78 aa, chain + ## HITS:1 COG:no KEGG:Closa_1693 NR:ns ## KEGG: Closa_1693 # Name: 
not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 78 1 79 95 125 84.0 7e-28 MKIQNITDVEKFFSVIDQCRGTVELVSPEGDRINLKSKLAQYLSMATIFSNGYIKELDLV AHEKEDIERLIKYMYQGE >gi|157101646|gb|DS480678.1| GENE 21 20466 - 21659 1312 397 aa, chain - ## HITS:1 COG:CAC0730 KEGG:ns NR:ns ## COG: CAC0730 COG0628 # Protein_GI_number: 15894017 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Clostridium acetobutylicum # 1 373 1 379 383 215 36.0 1e-55 MELNSDTMKKIRGLIVFTVVIVVAGINYRRLVDVAAGLMHIVWPFILGAAIAFILNVPMR NIERHLTVFGQGSRLRRPVSLVVTILLVTGGLFLVIFVVAPQLVKTFMNLQSSIPVFFAG VRDEAERLFASNPQILEYMNQMEVNWQEVFQNMVAFLKSGAGTMLNTTFSAAVSIVSGVS SFLIGFIFAIYILLQKETLGRQIKKVLEAFLPEPAVGRILDITALTERTFSHFLTGQCAE AVILGTMFFVVLTVIRLPYALLIGVLIAFTALIPIFGAFVGLAVGVFLMLLVNPMDALIF TITFFVLQQIEGNLIYPYVVGNSVGLPSIWVLVAVTVGGSMMGIVGMLIFIPLCSVLYAL LRDGVNTRLGKKGGGTAVCEDACGPAGEKDGDQGQQE >gi|157101646|gb|DS480678.1| GENE 22 21671 - 22843 1226 390 aa, chain - ## HITS:1 COG:CAC0906 KEGG:ns NR:ns ## COG: CAC0906 COG2872 # Protein_GI_number: 15894193 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain # Organism: Clostridium acetobutylicum # 5 373 4 370 387 177 32.0 3e-44 MDKSRLYYQIPYVKSFMCTVEHCEESGKGTWLVTLNQTGFYPEGGGQPYDTGTLNGISVV SVHEKGEQVIHELAQPIEEGTLAEGIIDWQRRYDNMQLHTGEHILSGLVHKHFGYDNVGF HMGAEEVTVDFNGIIEPEDLDWLEDEANQLVYANVPVKVLYPSEDELKTMEYRSKKELSG LVRIVEVPGADVCACCGTHVENTGEVGIIKIRSMIHYKGGVRLSMLCGRRALLDYRDRLK DEARISNLLSAKLVQVPDAVEKLKSDSQDKDYLIGKLGSSLLALKAGQFPESTEPLIVFE EDLTPIQVRQYATLLYEQNKAGVAAVCSGNDRDSEYHYALGSSRMDMRALSKAMNSRLSG RGGGSSLMAQGTFRAAREAIEAAFLEETEL >gi|157101646|gb|DS480678.1| GENE 23 23404 - 23556 171 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937041|ref|ZP_02084404.1| ## NR: gi|160937041|ref|ZP_02084404.1| hypothetical protein CLOBOL_01930 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01930 [Clostridium bolteae ATCC 
BAA-613] # 1 50 47 96 96 93 100.0 4e-18 MYSRLIIDGNAVYEIDEECLKQRQGRKGYGGKNRGTKSAFSSGETGEAGK >gi|157101646|gb|DS480678.1| GENE 24 23693 - 24109 527 138 aa, chain + ## HITS:1 COG:no KEGG:Closa_1687 NR:ns ## KEGG: Closa_1687 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 137 1 188 189 112 36.0 6e-24 MNPSDDYKLTDLDYLIGDHHLQMIKASLPYLGVPEQKAISLFVKVQELRKTVELFETEEV ASMGICSLERPQGNGSMRDLLKAIRPYGNPMEQDMIDMAETLMEGQTPMDQLRRFLTPEQ QSRFETMEMIFTAMQAMA >gi|157101646|gb|DS480678.1| GENE 25 24154 - 24435 330 93 aa, chain + ## HITS:1 COG:no KEGG:Closa_1686 NR:ns ## KEGG: Closa_1686 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 87 1 87 95 82 55.0 5e-15 MDNSWKQDPRLKAMNKDKLAMLTEFAERIEHSDKNNMMEAFMAINMEARQKGVQFNDRET DLLVNILSSRMPPSEKKKIDLLKMLSKKMAGPR >gi|157101646|gb|DS480678.1| GENE 26 24490 - 25332 645 280 aa, chain - ## HITS:1 COG:ML2522 KEGG:ns NR:ns ## COG: ML2522 COG3786 # Protein_GI_number: 15828358 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mycobacterium leprae # 93 236 39 183 218 92 36.0 8e-19 MRHERRTMEMNGKCHKIGRAAAAFVLSGAIMAAGGITAMAAGSAGAGTAAGTAVNPAEQG PGVPAASKAEPEEKKAAVENGVIQPQNLEGAKDADQLVVVVGTGGCSADVYYYKKSGEAW EMVWKEAGIVGRNGITDQKTEGDGSTPSGTYGFTMAFGLRENPGSVLPYHKITKGDYWVD DSASPYYNKLVNTSQVAKTWNSAENMAAASPYYNYALALNYNEACEPGKGSAIFLHCFTA ARDNGSAGCIRLPQERAKELVQSATEHTKIVIAPDLERLN >gi|157101646|gb|DS480678.1| GENE 27 25524 - 27938 2954 804 aa, chain - ## HITS:1 COG:BS_leuS KEGG:ns NR:ns ## COG: BS_leuS COG0495 # Protein_GI_number: 16080084 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Bacillus subtilis # 5 802 3 802 804 868 52.0 0 MATQYNHSAIEKKWRENWEEKPINVNDGRKEKYYCLDMFPYPSGSGLHVGHWRGYVISDV WSRYQLLKGKYVIHPMGWDAFGLPAENYAIKMGVHPAKSTEANIRNIKRQIKQIAAIYDW DMEVNTTDPDFYKWTQWIFVQMFKKGLAYEKEFPINWCPSCKTGLANEEVVNGCCERCGT 
TVTKKNLRQWMLKITAYAERLLNDLDKLDWPEKVKKMQTDWIGKSYGAEVDFPIDGREEK ITVYTTRPDTLYGATFMVLAPEHALAKSLATDETREAVEKYIFDSSMRSNVDRMQAKEKT GVFTGSYAVNPLNGAKTPIWLSDYVLADYGTGAIMCVPAHDDRDFEFARKFGLPIVQVIA KDGKEIENMTEAYTEANGIMINSGDWNGLESAVLKKEAPHMIEEKGFGHKTVNYKLRDWV FSRQRYWGEPIPIIHCPKCGCVPVPEDQLPLKLPEVESYQPTGTGESPLAAIDEWVNTTC PVCGAPAKRETNTMPQWAGSSWYFLRYVDSHNKEELVSREKADKYLPVDMYIGGVEHAVL HLLYSRFYTKFLCDIGAIDFDEPFKKLFNQGMITGKNGIKMSKSKGNVVSPDDLVRDYGC DSLRLYELFVGPPELDAEWDDRGIEGVSRFLNRFWNLVMDNKDKDVKASKEMIKLRHKLV YDIEYRFNQFSLNTVISGFMEYNNKLIELARKEGGIDRETLKTFVILLAPFAPHIGEELW QQLGGDDSVFHAQWPECDEEAMKDDEIEVAVQINGKTRAVISISADSSKEDAIAAGREAV KEKLTGNVVKEIYVPGKIVNIVCK >gi|157101646|gb|DS480678.1| GENE 28 28337 - 28702 377 121 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5888 NR:ns ## KEGG: Cpin_5888 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 1 92 21 117 139 63 36.0 2e-09 MTLYDRNGTPAAYIDSDGEHIYLFDGKPAAYLNEDAVYGFGGTQYGWFEKGWIRDLDGYC VMFTDEARTQGPAKPSRHETPAKWARAARPVRRSRETKRSKTAYKAAWSEIAAGDFLSGN S >gi|157101646|gb|DS480678.1| GENE 29 28689 - 28805 65 38 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937047|ref|ZP_02084410.1| ## NR: gi|160937047|ref|ZP_02084410.1| hypothetical protein CLOBOL_01936 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01936 [Clostridium bolteae ATCC BAA-613] # 1 38 1 38 38 72 100.0 1e-11 MLRNSLNPGQRSSIIDPLRAYIAAENGNKWGEWKYDIV >gi|157101646|gb|DS480678.1| GENE 30 28902 - 30434 1691 510 aa, chain + ## HITS:1 COG:no KEGG:Closa_3302 NR:ns ## KEGG: Closa_3302 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 506 1 506 513 683 78.0 0 MNQKLKEKITESLSSVLPVTCIVLILGITITPIPLDPLMLFLAGAAFLIIGMGFFTLGVD MAMMPIGEKVGAQLAKAKKLPVIVIACVIIGAMVTVAEPDLQVLARQTPAVPDMVLILAV AAGVGFFLALAFLRSLFGWSLSHILLVCYIILFILAFYIPGDFLAVAFDSGGVTTGPITV PFIMALGVGLSSIGQGKNADSDSFGLVSLCSIGPILSVMILGLLFQSSSGAYEPMTIPSI SNSQDLWLAFQSELPHYAAEVFTALFPILAFFLLFQIFFLKMRKRQVVKILVGMLYSYIG 
LTLFLTGVNVGFMPAGSYLGKQMASLPYRWILVPVAMLIGYYIVKAEPAVQVLNKQVADV TGGSISEKTMMTGLSIGMALSLGFSLFRVITGLPLMAFLLPGYALALGLSFVVPPVFTSI AFDSGGVASGPMTATFLLPLCMGACEALGGNILTDAFGIVAMVAMTPLIIIQLIGLSFEL KTRREKAVSRHPVVDMEQEFINLEEEPYEP >gi|157101646|gb|DS480678.1| GENE 31 30424 - 31122 618 232 aa, chain + ## HITS:1 COG:no KEGG:Closa_3303 NR:ns ## KEGG: Closa_3303 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 231 2 232 236 248 57.0 1e-64 MSHKSIKLAAAIVNRGAGNRAAAIFHTYNHEILLAVRGHGTASSAVMDCLGLDEPEKDLV LGLSSKTGADRLLRALGREMEFCKPGHGIAFTLSLTGISFAAMDILSRHEDQTGPNSRPA EHKEDIPMTNSHSYELIASVIHTDLSAPVMEAARKAGCQGGTLIKAREIGSDAGKKLFGM TLSQEKAILLILTPTSLRAPILKSICETVMKETGEHAVAISLPVDAVEGLTG >gi|157101646|gb|DS480678.1| GENE 32 31202 - 34486 2269 1094 aa, chain + ## HITS:1 COG:CPn0772 KEGG:ns NR:ns ## COG: CPn0772 COG0210 # Protein_GI_number: 15618681 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Chlamydophila pneumoniae CWL029 # 495 1071 5 633 639 237 30.0 1e-61 MYIGDLHIHSRYSRATSKELTPEHLDLWAGKKGINIVGTGDFTHPAWRAELAEKLEPAEP GLYMLKKEYSLQRPSILGQSRPRFVISGEISSIYKKNGRVRKVHSLILLPSLEAAEVLSR RLEAIGNIHSDGRPILGLDCHDLLAITLEACPDAIYVPAHIWTPHFSLFGAFSGFDTIEE CYEELTPQIHALETGLSSDPAMNGRLSALDSFQLISNSDAHSPAKLGREASLFDIPMSYA GLYGAIQRGEGLKGTIEFFPEEGKYHFDGHRKCHLCLSPSQARKYKGICPVCGRKLTTGV LHRIEQLADRDEDFLLPQGRPFENLVPLGEVIASSVGSSPSSVKVSRQYEHLLEELGNEF YILRQAPLEDISHAAGSLTAEGIRHLRDGKVQWRPGYDGEYGTMRLFQSAELDNVEGQMY MTFETANADLSETMGPGSPGAPGVTGDAELAADAVPSANTALSGKAGVSHGSTASREASY ETAVSNILTVSNPSSALNRDQQQAVESVFPVTAVIAGPGTGKTKTLVSRIEHLMGERGVK PSEITAVTFTNKAADELTQRIRQALPGRRSLNQMQVGTFHSLCYRLLARSGTELSLADQG QAEECAGETIEALELACTVRQFLTAISLHKAGLSSAQGFAAYMADDSAAETGFILTQEGA DRYQQALECRKLMDFDDLLLNMLRLLEQEQKAEYRKQHFSYLLVDEFQDISPVQYRLIKA WNKGGRELFVIGDPDQSIYGFRGSDSRCFDLLEQDFPGTSTIRLTTGYRSTPEILSASLP LISRNPGGERSLSAARPHGGPVRLVTAQTSLSESIFIAKEINRMVGGIDMLDAHSQTPGL LDTARSFADIAVLYRTNHQARLLEQCLQKEGIPYVVAGREDYLDHPAVQGTISFFGALLK 
SSGLDEDTTKLCRKRISPSADYSSLAERFLPRLKRDRPWKLLADWISCLGLGSDKSMDTF VCMSYFHKTMEDMLSTLAFGQAHDLKRSGQDSRPSDAVTLMTLHASKGLEFPVVFLYGLR QGILPLKTGKGAVNPEEERRLMYVGMTRARDELILTTSEEPSPFLEELPKDACRREKART PGSGMKQLSLFDFM >gi|157101646|gb|DS480678.1| GENE 33 34623 - 35963 1644 446 aa, chain - ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 18 446 14 441 441 374 42.0 1e-103 MVKESFLAQFYGKSQNYTDIPNKWYKEYNVKKGLRNEDGTGVRIGLTRVADVVGYEETDG GIKAIPGRLLYRGVDVLDLVKGKKESHFGYEETCFLLLFGYLPKKAELEKFCNVVRHCYT IPDEFVEMNLLRLPGKNLMNKLQDAVLTLYNYDDDPDNIDVYQTLVKGVNLLAKLPSIAC YAYQSKVHYYDRESLFIHYANPEYSIAENFLSLLRKDQKFTSKEANLLDTILMLHADHGG GNNSTFTNVVIASTGTDIYSTVCGGIGSLKGPRHGGANIKVAEMMDAVLAETGGYDADDK KIRQAVEKLMAGELYDRSGLVYGFGHAVYTVSDPRAEVLRACCMEVAKEQKHTRELDFYM RFERIAKEVMEEIKGKALPTNVDFYSGFAYGMLRIPRDLYTPLFVCSRVVGWVAHNIENK LYDGRIMRPATKYVGDNMDYVPMSER >gi|157101646|gb|DS480678.1| GENE 34 35994 - 38297 2335 767 aa, chain - ## HITS:1 COG:ybhJ KEGG:ns NR:ns ## COG: ybhJ COG1048 # Protein_GI_number: 16128739 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli K12 # 1 767 9 760 761 1100 68.0 0 MVSLYDSGIYLVNGTEIVPESESGKVKKLTGRDADQASARKGTIAYSILEAHNTSGDMEH LKLRFDSMASHDITFVGIVQTAKASGMETFPIPYVLTNCHNTLCAVGGTINEDDHMFGLS AAKKYGGIYVPPHIAVIHQYMREMMAGCGRMILGSDSHTRYGALGTMAIGEGGGELVKQL LKDTYDVAYPGVVAVYLDGAPRPGTGPQDIALAIIGAVFKNGYVKNKVMEFVGPGVASMT TDYRNGVDVMTTETTCLSSIWKTDSDTRAYLAKHHREEDYKELAPADVAYYDGCVYVDLS TVKPMIALPFHPSNTYEIDEFKANLDDILGMVEAEAAKVSGGKASFTLRDKIEGGQLMVQ QAVIAGCAGGTYSNVMEAARVLKGKNCGCDEFSLAVYPSSQPVFADLVKKGAVSDLMDAG AIIRTAFCGPCFGAGDTPCNNGLSIRHTTRNFPNREGSKPGNGQMAAVALMDARSIAATA ANGGRLTSAWELEGWDNVPEYEFDDISYKNRVYMGYNKGDGEKELVYGPNIKDWPEMSPL ADNILLKVCSKIMDPVTTTDELIPSGETSSYRSNPLGLAEFTLSRRDPEYVGRAKEVDKL EKARTKEGAAKEVELEPVLEKLFDAIRGIEGNENVTVQDTEVGSMIYAVKPGDGSAREQA ASCQRVIGGLANICKEYATKRYRSNVMNWGMVPFQMEEDPCFEVGDYIYVPGVRKALDGD 
LKDIKAYVLGKEGSSIRTLDLYIADMTAEEREIVKAGCLINYNRGRH >gi|157101646|gb|DS480678.1| GENE 35 38517 - 39476 906 319 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0424_1232 NR:ns ## KEGG: HMPREF0424_1232 # Name: not_defined # Def: CAAX amino terminal protease family protein # Organism: G.vaginalis # Pathway: not_defined # 1 307 23 319 333 128 29.0 4e-28 MGLAAAAFMLLTLAAQVVVMVLVELLNSLLGDFIDFYGFSGRMLMSSLPMYLVAFPAVAG LLQLVPKCGAPQKEQWGFWKFASFFVIAVGIGLAGNILGRLVGLLQPSGPDSAELDQLIR NSSLWVNLLTTVIMAPVVEELFFRKLVMDRLLGYGQKTAIIISGIMFGMAHGNFSQFFYA FGIGILWAYVYAKTGKVRYTIGFHMLFNLLGGVITVELSKGAQGLAEGPWMIRQIESILG VDMGWLVTAFSSIMILAYLFFMLACLIAAITLIIMYRRQITFQPGQWPLRKGRAFQTAFL NVGMIVYFVICCGLFILNW >gi|157101646|gb|DS480678.1| GENE 36 39837 - 40487 595 216 aa, chain + ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 9 215 14 216 229 75 23.0 5e-14 MTEYFQKNLFKDISIDEYKKLLSCLDGRTKQFKAGETICDYDEGFHEIGIIGKGTASVVR YEYNGARTILERLGPQDIFGHLLSYEGSEHAGISVVCDSPCDILFIHYATIGSPCTRSCA HHHQLTQNLLDLISERALNLSRRVEVLSQRTIREKLICYFMQLAAQAKASSFSLPFTMVD LADYLSIDRSAMTRELKRMKEEGLIEMDKRCVRLFL >gi|157101646|gb|DS480678.1| GENE 37 40476 - 43028 2285 850 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 430 847 345 756 776 233 35.0 1e-60 MEEQQNSFKDSFLAAVLDQMDTNFYITDVETDEILYMNKTMKETFGLIRPEGCVCWQVLQ KDMKGRCPYCKIRQLEQMEDNKICVWRETNSVTGRKYRNYDSLIQWKGKLYHIQNSVDMT EYDQISETARTDELTRMLNRRGGSERLIRALEQGSRENQIVTVVLYDINELKSVNDRYGH SEGDKLLQYVASIAKECLGRLDFMYRLSGDEFVMVFYGRDMNAADESMKRLLTRAGQERE RLLKEYDVSFSYGIAEVYPGEMGSVEDIISRADEQMYIQKRGYHIDRAKRRLEEKPESGD 
ETFDYAGGSLYEALSASTDDYIFVGNMKTGVFRYPQAMVEEFGLPGQTVREAAAFWGQLI HPHDEAYFLESNQSIADGREDYHNIEYRARNVRGEWIWLRCRGKMLRDKDGQPELFAGMI SNLGKKNQIDHMTGLYNKYEFEGNIKKYLAGRKSMDSIGLMVLDMDSFKNINDLYDRSFG DEVLRITAQKAASMLPPNAMLYRLDGDEFGILLLEGDAKEAARIYGKLQQNFCRQQEYGG RKYFCTLSAGYAGYPEDGSNYLELLKSANYSLEYSKMMGKNRMTVFSRDILADRERRLEL MELLRESVERGFAGFTIHYQPQVDTLSGRLCGAEALARWHCSKYGDVSPGEFIPLMEQSG LIIPFGQWILSHSMAQCLEWSRSKPDFQISVNLSYIQLTQGDFISFLRTVLEENSLAPSR LIMELTETYLAKAEEDTLRLIAQMKELGVKIAMDDFGVGYSSLFSLKSIPVDVVKIDRGF VKGITSDLFNATFIRSITDLCHDVGRKVCLEGVETKEEYEAVRELGMEYIQGYYFGRPMS ARQFEDRLKE >gi|157101646|gb|DS480678.1| GENE 38 43166 - 43330 242 54 aa, chain - ## HITS:1 COG:alr1174 KEGG:ns NR:ns ## COG: alr1174 COG1592 # Protein_GI_number: 17228669 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Nostoc sp. PCC 7120 # 2 54 181 233 237 82 58.0 1e-16 MKKYVCDPCGWIYDPEVGDPDGGIEPGTAFEDIPDDWVCPLCGVGKDMFSVVEE >gi|157101646|gb|DS480678.1| GENE 39 43545 - 45641 2460 698 aa, chain - ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 695 3 690 690 637 47.0 0 MNVYGTEQIRNVVLLGHGGAGKTTVAEALALLTGVTKRMGKVPDGNTISDYDKEEIKRQF SISTTLIPLEYEGEDGPIKINLLDTPGFFDFVGEVEEAISVADAAIIVVNCKAGIEPGTE RAWEMCEKYNLPRLIFVTNMDDDHASYRELVVKLETKFGRKIAPFQLPIRENEKFVGFVN VVKMGGRRFTHLSDYEECEIPDYVQKNLTIARDALIEAVAETSEEYMERYFLGEEFTQEE ISTALRTHVIEGDIVPVMMGSGINCQGFKVLLQAIDKYFPSPDHFECIGVDVSTGERFTA KYNDDVSLSARVFKTIVDPFIGKYSLMKICTGTLKGDTVVYNVNKDTEEKIGKLYILRGK DQIEVQELKAGDIGAVAKLTVTQTGDTMAVRTAPIVYHKPQTSTPYTYMAYKAVNKGEDD KVSTALAKMMEEDLTLKVVNDAENRQALLYGIGDQQLDVVISKLQSRYKVDVTLSQPKFA FRETLRKKVKVQGKHKKQSGGHGQYGDVIMEFEPSGDLETPYVFEEKVFGGSVPRNYFPA VEKGLQECVLKGPVAGYPVVGLKATLLDGSYHPVDSSEMAFKMATILAFKQGFMEANPVL LEPIASLKVTVPDKFTGDVMGDLNRRRGRVLGMNSNHNGKQVIEADIPMSELFGYNTDLR SMTGGLGDYSYEFSRYEQAPGDVQKREIEARAAAKDGE >gi|157101646|gb|DS480678.1| GENE 
40 45766 - 46725 401 319 aa, chain - ## HITS:1 COG:BMEII0654 KEGG:ns NR:ns ## COG: BMEII0654 COG2199 # Protein_GI_number: 17988999 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Brucella melitensis # 158 305 64 211 212 95 35.0 1e-19 MGENISANDFCRYIWKSYLEDRRYDLVNEFVSDKISVIGTGAHEIERNLHEFIAKFTEES HEWTGHFVIRDQWYQTTELSDSHSLVMGELVVREDSEEGILYDMCFRFTVVLEKSENGWK VVHIHQSVADPNQMSDEFFPHHMVEKTHTQIVYNLRHDSLTGLLNRLYFKETCERFQADG DNGAFLMMDIDWFKKINDQYGHPIGDKTLISFSESLKSVISPASLAARMGGDEFTLYLSG IRQRGEVEDFFLSLMTDWEERQKALHIHDQVSISAGVAFTSNADTSYEALLAQADQALYL AKKKKNNELIHWEFSGSHA >gi|157101646|gb|DS480678.1| GENE 41 47007 - 47216 62 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937065|ref|ZP_02084428.1| ## NR: gi|160937065|ref|ZP_02084428.1| hypothetical protein CLOBOL_01954 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01954 [Clostridium bolteae ATCC BAA-613] # 1 69 17 85 85 130 98.0 4e-29 MCYHTVKRNECEEVNDITKLELVAAIGPNQLELVQVYLRGLLSAIELEKLIGKRKAKIVD IFSTEYAHM >gi|157101646|gb|DS480678.1| GENE 42 47397 - 47879 441 160 aa, chain - ## HITS:1 COG:no KEGG:Clole_1402 NR:ns ## KEGG: Clole_1402 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 2 158 3 164 165 186 54.0 3e-46 MKEIGYQREAVLEYAKRWAKGRNPRYLDFEHMGGDCTNFASQCIYAGSGVMNYTPTMGWY YSSGSNRSPSWTGVAYLYNFLTGNKSAGPYGVETNEKGVEPGDIVQLGNKNGYYHSPVVV DVAGNNILVAAHSYDAYMRPLDSYIYEKVRFIHIQGVRNW >gi|157101646|gb|DS480678.1| GENE 43 48558 - 49580 1271 340 aa, chain + ## HITS:1 COG:TM0343 KEGG:ns NR:ns ## COG: TM0343 COG2876 # Protein_GI_number: 15643111 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Thermotoga maritima # 1 334 1 334 338 366 52.0 1e-101 MIIIMKPNASDEAVKKVTNLIESKGLTAHLSKGSQVTIVGVVGDKTKLLDTNIEISEGVD KVVAVTESYKLVNKKFHPEDSVIPVGNTHIGPGTMTVMAGPCAIESQDQLMETAYAVKKA GATFLRGGAYKPRTSPYSFQGLEEEGLKYMKEAREATGLNVICEVTSAHAIESAVKYVDM 
LQIGARNMQNFELLKEAGKSGTPVLLKRGLCATIDEWLNAAEYIISEGNPNIVLCERGIR TYETSTRNTLDISAVPVIRQKSHLPIIVDPSHATGVRAYVEPLAKAAIAVGADGLMIEVH PCPARALSDGPQSLTFDAFEQLMKDLAPYAELSGRTFKAQ >gi|157101646|gb|DS480678.1| GENE 44 49718 - 50851 1302 377 aa, chain + ## HITS:1 COG:BH1666 KEGG:ns NR:ns ## COG: BH1666 COG0287 # Protein_GI_number: 15614229 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydrogenase # Organism: Bacillus halodurans # 6 377 1 366 366 227 38.0 3e-59 MEKTTMTDSTIAFIGLGLIGGSIARAIRKTRPDIRIMAYMRSRSRLEQARKEGIVDVILD GIDENLRECDLIFLCTPVEYNAGYLSAIRPYLKQGALITDVGSTKCGIHEEIRRQGLEYC FVGGHPMAGSEKTGYENSTDHLLENAYYIVTPPEGKNSSRPQALSLNAERIVAVAKTIGA IPLVLDYREHDKVVAAISHLPHLVASSLVNLIKDHDTADGTMKQVAAGGFKDITRIASSS PEMWEQICMTNVAPISDILEAYIASLNKVLEEIKEHKGQDIYRLFETSRDYRNSITDKAK GSVEPSYEFTVDVSDEVGAISTISVILAAKGISIKNIGINNNRERGEGALKIAFYDEASM EAAWQRLDKYKYEMFRM >gi|157101646|gb|DS480678.1| GENE 45 50879 - 52159 1129 426 aa, chain + ## HITS:1 COG:BS_aroE KEGG:ns NR:ns ## COG: BS_aroE COG0128 # Protein_GI_number: 16079317 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Bacillus subtilis # 1 425 1 426 428 397 48.0 1e-110 MEFKRASHFKGEIRVPGDKSISHRAVMFGSVAQGLTEIHNFLQGADCLSTIACFEKMGID IENKGGKVLVHGRGLHGLNAPREVLDCGNSGTTTRLISGILCGQNFDVTLTGDESICKRP MKRIMEPLAMMGARITSIKDNGCAPLLIKGGPVHGIHYDSRVASAQVKSAIMLAGLYADS PTSVTEPYVSRNHSELMLRLFGAQVSTEGTTAVIKPANELHGNQVMVPGDISSAAYFIAG GLMVPGSQVLIRNVGINPTRDGILRVCRDMGAHIELLNVSGGTGEPTADILVRHGSLHGT VIGGSVIPTLIDELPVIAAMACFAEGETVIRDAGELKVKESNRIAVMVQGLAAMGADVTE TEDGMIIRGGAPLHGAVIDSRKDHRIAMTFAVTALCADGITEIRDADCVNISYPGFYSDL KKLAQG >gi|157101646|gb|DS480678.1| GENE 46 52271 - 53047 887 258 aa, chain - ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 15 258 25 269 269 213 55.0 2e-55 
MSSIFAGIMIPFIGTTAGAACVFFLKNALKPSVQKAFLGFASGVMVAASVWSLLIPSMDM SGHMGKLAFIPASVGFLLGIGFLLLLDEIIPHLHIDTEQPEGPKSDWKKSTMLVLAVTLH NIPEGMAVGVAFAGLLSQNGTITMAGALALSVGIAIQNFPEGAIISLPLKEAAGKGKAFI YGTLSGVVEPIGALLMLVLAEFLGPVLPYFLSFAAGAMIYVVVEELIPESAEGEHSNVAT IGFAVGFVVMMILDVALG >gi|157101646|gb|DS480678.1| GENE 47 53090 - 53725 799 211 aa, chain - ## HITS:1 COG:no KEGG:Closa_3203 NR:ns ## KEGG: Closa_3203 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 209 1 212 215 263 62.0 5e-69 MNDKHFVITIERQYGSGGRLTGKRLAEELGIHFYDEEILKMTSETSAIGEQYFRLADERA GNNMLYRIVTSMKPELTEPDKDGPNITSPENLFRFQSSVIRRLAASETCIIVGRCGNYVL QDQMDNLVRIFVYADTVTRIRRVMDVDKVDEAEALRRMRRIDKERTEYHRYFTGREWMDM ENYDLPINASRIDYDQMIQLIKDYLKLKGFL >gi|157101646|gb|DS480678.1| GENE 48 53792 - 54304 694 170 aa, chain - ## HITS:1 COG:L67226 KEGG:ns NR:ns ## COG: L67226 COG1335 # Protein_GI_number: 15672251 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Lactococcus lactis # 1 169 2 168 171 188 53.0 3e-48 MSRLLVVVDMQKDFVDGALGSPEARAIVPNVLDKVKAYQDAGDEVVFTLDTHFADYMDTM EGKKLPVVHCVKGTPGWELVPELRGVPGTRFEKHTFGSRELADYAARGQYDFVELAGLCT DICVISNAMLIKAAVPDTSIQVDGSCCAGVTPQSHHNALSAMGMCHIDIV >gi|157101646|gb|DS480678.1| GENE 49 54884 - 56029 1438 381 aa, chain + ## HITS:1 COG:PA3836 KEGG:ns NR:ns ## COG: PA3836 COG2984 # Protein_GI_number: 15599031 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Pseudomonas aeruginosa # 71 375 22 323 325 189 38.0 8e-48 MKKTIAMLLTAALSASLLSGCVSTSQGTSAAQSAETSAENGSENGSVQNAESSASGVSAA DTSSADKTGESKAESSSDSYTIGVGQFAEHGSLDNCREGFMAGLAEEGIEEGVNLTVLYD NSQADGGTASQIATNFAGKGVDLMCGIATPMAQAEYGVAKKSDIPVIFTAVTDPVAAELA NADGTPVGEITGTSDKLPVEAQLKMIRQILPEAKNIGIMYTTSEVNSESAIAEYKELAPQ YGFEIVDTGISSSADIALAADTLINKVDCITNLTDNTVVASLPVILDKAAAKNIPVFGSE IEQVKIGCLAAMGLDYIELGKQTGKMAAQVLRGEKKASEMNFETIKEAAFYGNTQVAGNL 
GITLPEDLTGNAAELFTAIAQ >gi|157101646|gb|DS480678.1| GENE 50 56060 - 56992 1061 310 aa, chain + ## HITS:1 COG:FN2080 KEGG:ns NR:ns ## COG: FN2080 COG4120 # Protein_GI_number: 19705370 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Fusobacterium nucleatum # 10 305 7 275 278 223 47.0 4e-58 MTTIILGIFEEGLVYAIMALGVYITYRILDFPDLSVDGTFPLGAAFTATGIAGGFIHPAA ALPLSFLLGAMAGCVTGLIHVKLKVRDLLSGIIVMTALYSINLRVAGKSNLPIFSKDTIF SNPWLERTIPQALRPYTVTIILLVIVLICKLLLDAYLKTRSGYLLRAVGDNDVLVTSLAK DKGMVKIIGLAIANGFAALAGSVYCQQKGFFEISTGTGTIVIGLANVIIGTQLFKRFGFV KSTTAVIIGSIIYKACVSLALLLGDLTLTFGNITVSFPVTASDLKLITSILFLIILVASS SRGKKVKSHA >gi|157101646|gb|DS480678.1| GENE 51 56985 - 57758 225 257 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 254 1 245 245 91 28 2e-17 MLELKHINKYYNAGTVNEMCLFEDFSLTIKDHQFVSVVGSNGSGKTSLLNIICGSIPLDE GRIIISGKDITRIPEYKRQRRIGRVYQNPAMGTCPTMTILENMSLADNKGKAFNLRPGTN RQRLEYYRESLRSLGLGLEDKMDVKVGVLSGGQRQAMALLMSTMTPIEFLILDEHTAALD PKTAETIMELTGKIVKEKNLTTIMVTHNLRYAVEYGDRLLMMHQGRAVVDKEAGEKQALC IDDILEKFNEISIECGN >gi|157101646|gb|DS480678.1| GENE 52 57829 - 58188 430 119 aa, chain - ## HITS:1 COG:CAC1578 KEGG:ns NR:ns ## COG: CAC1578 COG1396 # Protein_GI_number: 15894856 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 104 1 104 110 60 30.0 5e-10 MELDYKAIGKRIKIQRIQREMTQEKLAELTGLSNPHISNVETGSTQVSLKSLIAIANALE ITPDVLLCDNIRYGKHIFKNAVMEAVDDCDEVEIRIMADMVQALKRSIRYRKKYFKEQE >gi|157101646|gb|DS480678.1| GENE 53 58438 - 59742 996 434 aa, chain + ## HITS:1 COG:RSp1203_2 KEGG:ns NR:ns ## COG: RSp1203_2 COG2199 # Protein_GI_number: 17549424 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Ralstonia solanacearum # 247 424 1 174 202 105 35.0 1e-22 MVETKKTLRREAVTGGIFECRYDKSLTLTQADNSLFRLLGMNAGSLKSGTPISFLDRICA 
EDRPIFLEKIQYQLSRSRVFMSELHIVTSEGQLRWIHVCGELLNAEQEIPCLHCIFHDVT DARDRQERLAIESQIYDIILSQTQDIIFELDCLTRDIYYSPTFEKKFGYQIPVKGFPDSM FATDIIHEKDKILLRRCFQSILAGQNQMQCEYRLKHRNGRYLWVSVNATAIRDLRGRLLK IVGIISDINEQKEEILRAKKDALLDPLTSLLNRRECIRRIEAYTRTEEEMGALILIDIDN FKSINDTNGHLFGDKVLVDLAAALRLAFRRNDITARIGGDEFIAFMPHIHVRGDVVPKLE TILYSLSRCAETDGKADAAPESDSAVTFSIGVSFYPDHGRDFSSLYEKADAAMYHAKRRG KNQYYIYKEDAPEA >gi|157101646|gb|DS480678.1| GENE 54 59864 - 60580 828 238 aa, chain - ## HITS:1 COG:no KEGG:CLH_2547 NR:ns ## KEGG: CLH_2547 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 70 235 46 209 339 112 41.0 1e-23 MRKYHLYAVGICAAIALGAAGCSAGANAETMSGSSVTAEENTEETTVSDTTVVESVTGAI ESMEESQPEVPQSVRIYGPVTKMEDGRLSIDNQSGMSTSGEIILNVADDMTYILDAMTGL PVALEDVKDGDTIYAYIGPAMTMSLPPMTNAVMIFTNLPADAKAPDYVEVKSMVTDAGTS KSVLTAADGTEFTLAEDCNIFPYLTRNIVTLDDLTQGKKCVVWSDDENTASKIMLFAE >gi|157101646|gb|DS480678.1| GENE 55 60767 - 61801 1194 344 aa, chain - ## HITS:1 COG:TM0067 KEGG:ns NR:ns ## COG: TM0067 COG0524 # Protein_GI_number: 15642842 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 11 329 3 328 339 165 31.0 1e-40 MVETNGRNFDLLTLGQVLLRLSPPDNERLSRGDTFVKQVGGAELNVAVGAALLGLHTGVI SKLPAHDIGSFMKGKIRSFGVSDDFFMYDQSPAARVGIYYYENGAYPRKPKVIYDRNHSS FFSLDIDEIPEEVYTASKCFHTTGITLALNEDIRETTIEMIKRFKAAGTLVSFDVNFRGN LWTGDQARECILRILPYVDIFFCSEDTARLTFLKTGTAREMMKSFTEEFPISIVASTQRI VHSPKRHTFGSVIYDAARDEYYEEEPYRDIEVVDRIGSGDAYISGALYGLLSTDGDCARA VAFGNATAAVKNTIPGDLPSSNLEEIETIIHAHNQAGPQSEMAR >gi|157101646|gb|DS480678.1| GENE 56 61909 - 62940 1184 343 aa, chain - ## HITS:1 COG:TM0067 KEGG:ns NR:ns ## COG: TM0067 COG0524 # Protein_GI_number: 15642842 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 3 343 2 339 339 328 50.0 9e-90 MAKIVTMGEIMLRLSTPNNEKFIQADEFDINYGGGEANVAVSLANYGHEAEFITAVPANP 
IGECAVAALRKYNVGTKHIARSGERLGIYFLETGSAMRASNVVYDRAHSSISTATADQFN FDEIFADADWFHFTGITPAVSDAAIELTEAALKAAKAHNVTVSVDLNFRKKLWSSEKAQK VMTNLMQYVDVCIGNEEDAEKVLGFKPGNTDVTSGELELAGYVDIFNQMADKFGFKYIIS SLRESHSASDNGWSACIMDGKTREFYHSRKYHITPIVDRVGGGDSFAAGLICGLVDGKDM KAALEYAVAASALKHTIPGDFNLVTRADVENLAGGDGSGRVQR >gi|157101646|gb|DS480678.1| GENE 57 63018 - 63668 797 216 aa, chain - ## HITS:1 COG:CAC2973 KEGG:ns NR:ns ## COG: CAC2973 COG0800 # Protein_GI_number: 15896226 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Clostridium acetobutylicum # 1 216 1 213 213 243 60.0 2e-64 MKKEQVLARMKEDCLVAVVRAKNYEQGEKVVDAIIAGGINFIELTMTMDDGDPVGFIKLM AEKYKGNDKIVIGAGTVLDPETARACILAGANYIVSPGLNVDTIRLCNRYRVPMLPGVMT PSEAITAMEAGCDIIKIFPGNIVGPAAISSFKGPLPQGEFMPSGGVDVDNVDKWIKAGAY AVGTGSSLTKGAKTGDFAAVTAKAQEFVAAVAAARA >gi|157101646|gb|DS480678.1| GENE 58 63798 - 64559 912 253 aa, chain - ## HITS:1 COG:BH2137 KEGG:ns NR:ns ## COG: BH2137 COG1414 # Protein_GI_number: 15614700 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 9 249 7 248 251 129 30.0 5e-30 MKLNRTTLRTVEILKLVSKKPDGITLDEICARLDMPKTSAYDIVTTLVQTGMIHVAREQK QRYTIGLTAYRIGINYTNNLDFISTIDPVLKAFSREVGKTVFFGIRSDDSIVYICKFEPE NPIITTATVGTKNPIYCTSLGKAIMAYTEDEDRDGVLSRISFKKHTDRTICTREEFEREL KSVREKGYAFDARELEEHMECVGAPVFGQDGNVMGAISVSSLYKPTEDYDALGRLVSDKG AQLSRLLGYIGHA >gi|157101646|gb|DS480678.1| GENE 59 64708 - 66189 1784 493 aa, chain - ## HITS:1 COG:CAC0696 KEGG:ns NR:ns ## COG: CAC0696 COG2721 # Protein_GI_number: 15893984 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Clostridium acetobutylicum # 1 493 1 492 492 561 55.0 1e-159 MQDIVKIHENDNVAVALRPLAKGETLDVAGVKVSLLEDIPQGHKFALRFIKAAEEVVKYG FRIGYAKEDIEAGAWVHVHNVRTALGDVLTYDYVPQENDVAPTEHVYFEGYRRPDGKAGI RNEIWIIPTVGCVNSIAKALENQAKSLAVGNVEDVIAFPHPYGCSQMGDDQENTRKVLAD MIRHPNAGGVLVLGLGCENSNIPVLMDYIGAYDEDRVKFLQCQDVEDEMETAMGLLKELA 
AYAGAFSREKIDAAELVIGMKCGGSDGLSGITANPVVGAFSDLLVSKGGTTILTEVPEMF GAETLLMNRCASPELFDKTVDLINHFKNYFTSHNQTIYENPSPGNKKGGISTLEDKSLGC TQKSGSALVRGVLEYGEPVKVKGLNLLSAPGNDLVASTALAVSGAHMVLFTTGRGTPFAS PVPTVKISSNSRLAGHKDNWIDFNAGRMVEDMTKEELAKELLNYVLAVASGRKVKSEEAG FHDMAIFKQGVTL >gi|157101646|gb|DS480678.1| GENE 60 66402 - 67403 1130 333 aa, chain + ## HITS:1 COG:BH2219 KEGG:ns NR:ns ## COG: BH2219 COG1609 # Protein_GI_number: 15614782 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 3 331 5 328 335 159 32.0 8e-39 MNIYDVSEHAGVSIATVSRVLNGNPNVSEKTREKVLKVMDELGYTPNVFARGLGLNTMKT IGIMCSDSSDPYLAGAIYYLERGLRAHGYDAILCCTGYDLDAKQKYFDLLRSKRVDAIIL AGSKFVEMRAKDNAYIIEAASDLPIMLVNGFLEGENIYSTVCDDHAAVYGAASQLLSSGR RRILYVYTSYSYSGVNKLEGYKDALKAKGITPSDELIRQCPKDIEAARELLLSLHREGLE FDAVITSDDSLAVGAVKYAHIADISIPRDLSIIGYNNSLLARCTDPEITSIDSKVEALCT TTINTLMGVFSGVNVPSRTTIASDLIKRKTTNF >gi|157101646|gb|DS480678.1| GENE 61 67426 - 68841 1629 471 aa, chain + ## HITS:1 COG:CAC0692 KEGG:ns NR:ns ## COG: CAC0692 COG1904 # Protein_GI_number: 15893980 # Func_class: G Carbohydrate transport and metabolism # Function: Glucuronate isomerase # Organism: Clostridium acetobutylicum # 1 467 1 467 467 562 57.0 1e-160 MKQFMDKDFLLSTDMAKTLYHDFAENMPILDYHCHINPQEIYEDRKFDTITQVWLGGDHY KWRQMRSNGVDEKYVTGDSTDREKFQAWAETMPKLIGNPLYHWSHLELRKYFGYQGYLNG DTAEEVWNLCNMRLQEDSMTVRNLIKQSGVTLICTTDDPVDSLEWHKKIAEDDTFDVQVL PAWRPDKAMNVEKPTFGAYMAQLSQVSGVQITDFASLKEALKKRMAFFASMGCCVSDHAL EYVMYAPAADSEVDAILAKGLKGEAVSKEEELKFKTAFMLFVAREYNAMGWIMQIHYGCK RDNNGYMFEKLGADTGFDCINNYAPSAQMADFLNALSTNNEIPKTILYSLNPNDNASIGT IIGCFQEKFPGKIQQGSAWWFNDHKTGMTEQMTSLANLGCLGNFIGMLTDSRSFLSYTRH EYFRRILCNLIGGWVENGEYPADMKALEEIVKGICYNNSVKYFGFKLDEVK >gi|157101646|gb|DS480678.1| GENE 62 68915 - 69958 892 347 aa, chain - ## HITS:1 COG:lin2698 KEGG:ns NR:ns ## COG: lin2698 COG0392 # Protein_GI_number: 16801759 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Listeria innocua # 40 335 37 331 357 73 23.0 7e-13 
MKQKKMNWKEMSIRLCLVALLAWFTFHMLLGSHSVQNLGSTFMRADIRFVATGFACMFLF VNCEAANIRLLMGTFGKKVPYWRSLSYAFTDFYFSAITPSATGGQPMQLYYMVRDGFGAA HSSFSLLATAAVYQMTVLVYGCVMVGANLSFVMGQGRIIRLLLVFGVLVNGFCSGLILLI IFHGLLAEKIMLCIAGGLSRAGIIKNRKRAIRKVEGLIDEYSRGGAYLRQYPLAAVRIFI HSAVQLTALYLVPYWACRALGISASAKDFLALQAILSLAVTAVPLPGSVGASEGSFLVLY RTLLGAGQSFSVMLLSRGISFYAMLCISGLATAFLQFARRGKPASPV >gi|157101646|gb|DS480678.1| GENE 63 70166 - 70543 468 125 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1011 NR:ns ## KEGG: Cphy_1011 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 6 124 7 125 126 134 49.0 2e-30 MAQMDPLLLLDNGTLAYIIGNGDYRFTNIKFEEARAIIDMKGEADVVRVYANPDLEHSMY EYLGIQQGNFQYAPVRSMKIGQDAIAFKLYITPTGTQPIVLGDDGQQAKKIKNMYIYCQH VVRLK >gi|157101646|gb|DS480678.1| GENE 64 71228 - 72736 1674 502 aa, chain - ## HITS:1 COG:BH1910 KEGG:ns NR:ns ## COG: BH1910 COG4753 # Protein_GI_number: 15614473 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 4 499 5 502 506 144 25.0 6e-34 MRAIICDDDEIILKGLCSVIDWVSLGIELVGTAGNGKEGLTLLKEQRPELLMTDIRMPHV DGLQLIEEGKKLNPGLMAIVFSGYDDFEYARRALKLGVQDYLTKPINMDELTYLASDCVR RFDEARKDSFEDREDLLRKLLMYEVNDSGLITEMEHDYCMVLIFESADSGWLMADAAGHM REQGAYVLVQRENLCEIAVIAPTRMTVEMRASLFLSHTRQLFSEQGMSLTCAQSNVQWGI RLLDQCYQEAHEALKLKYIRGDNQNLRFQDIKGDEKESDPSPILDIDLITPVKSGNVQLL KKRMGELEKLLKQAGMDSYIYMQFMVGSLYSTVLKELGDAGICADLVFENPVEEYKKLIG CETIGKAIGVLGENLKRICEYVGSQKSGVYSRPVVKALAYMEENYSSPDLSMEDTAVAAG LSSARFSTSFKSETGVTFTDYLLKLRMEKAKDLMRDPNRKIYEIAMMTGYSNIPYFSTAF KKYTGCPPSEYREKQHGLQGKV >gi|157101646|gb|DS480678.1| GENE 65 72717 - 74579 1774 620 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 133 605 114 588 602 204 30.0 3e-52 
MHIKKTMKAYLNKRTCFKACRNLGARLMLKEKMILSFAAASTGIVVLTGALHYYMTASRI LEEAEATTVQMVGNACQDINELFLQTFQVCSAMNDNISMQKLIRRQFSSIKEMYSHDLEG SMEMMVLPSYNRDIFGVYVLGDNGGRYKSNSSSFLVSDPRDTDWYRIIHATGQAQWFPPH ENSYVVSTPYTSYVSMGIPFIDKATGRTKGVVLADIDSDKLSDIAARAVSGAGDVFLMDE RNRIMNLTAGEGPWKDRDGVEKARRMVEENLSEIPEYGVSHVLKDKGQLVVYQTLNQTDW KIVGVIPKASITRSVEYIKVLMIMLLALAVVISMILSEIVADSVTRPLFTLVKAMEQVCA GNLETSVETERGDEIGVLYDSFNHMTGEMRNLIDTIYREQEKLRREELKALQAQINPHFL YNTLDSIIWSLRMRQVEESIEMLTALTDFFKISLSKGRDIITIEEEVKHIKSYLSIQHRR YQEKFDYDIHVDPSILSCRTPKLILQPVVENAIYHGIKPMEEKGYIYIHIFPQEGDVILQ VQDTGMGMDEDTCRRLNRELDTISSDQQGTGYGVRNVNDKIKIVYGSKYGVTIESAMGEG TVVTLRIGKKGEASDESDYL >gi|157101646|gb|DS480678.1| GENE 66 74569 - 75561 1047 330 aa, chain - ## HITS:1 COG:AGl3185 KEGG:ns NR:ns ## COG: AGl3185 COG1879 # Protein_GI_number: 15891710 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 327 1 312 319 142 29.0 7e-34 MRKKIIAVCLAVMLACLLWLVKDRWSLPAAQETEHEGLRVGLSQAEPLTPWKTAQIKSFK EAARRRGAQLIYHAPEEESLEWQIEDIYGLLDADIDYLILIPSARNGYDQVLLEARQRGV PVILAEQDVEMEPEYDREDYILGYVASDYYAEGELCADILARHFGIEPCHTMVIQGEKSS AMDWERYMGFMHGVREHSNLYVSKRVESSGNRLTAQKVTESVIADQDIDFNAVFAQSDED GLGVLQALKLAGMEPGEDVVIVSIGGIQDVFKAIIAREYLATVKSSPEYGDAVFDIINRH QRGENPDRQTMVKHREYTMDNAEELFEDAY >gi|157101646|gb|DS480678.1| GENE 67 75703 - 76992 743 429 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 1 428 1 429 435 290 37 2e-77 METSIILISTVVLLVLLFLKVPVFISILSGSMAYFYLTGGPFTKILGQRVLAGVESIPLL AIPFFICAGVFMNYSGVTKRVMGFCNVMTGRMTGGLAQVNVLLSTLMGGLSGSNLADAAM EAKMLVPEMERHGYSKQFSTVVTAMSSMITPLIPPGIAMILYGSIANISIGKLFIAGIGP GVLLCVAMMILVHFISKKRGYKPMRTERLTMKQAAPEIASAFLPLCLPVIIIGGIRLGIF TPTEAGAVAIVYSLVLGVVYKELKLKDIGTGLKETVATTASIMLIVGGASAFAWILTREQ IPQQFTALVLGTIHNKYLFLFVIILFLIVVGMFIEGNAALIVLVPLLAPVAEAYGIDPIH FAMVYIFTMACGGVTPPMGALIFVTCGITRCKIKAFIKESIPFYVMLFICMLLMAYVPLF STGLVNLFY 
>gi|157101646|gb|DS480678.1| GENE 68 76996 - 77487 711 163 aa, chain - ## HITS:1 COG:BH3392 KEGG:ns NR:ns ## COG: BH3392 COG3090 # Protein_GI_number: 15615954 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Bacillus halodurans # 24 159 32 163 175 58 27.0 5e-09 MKSDNKIGNLLRNADVLAACGALVLLVGVTFFGVIMRYCLGDPFVWQEEVQLALIIWVVF LGGRYAFVCGNHAAIDVIVEMFPEKLQKILSVLIAAAAVIILCYVGYQGVRYIMQMVRYD RVTNILKIPYSLIYAPLPIGCLTMAVQVCLNTYRDLTGKGEEV >gi|157101646|gb|DS480678.1| GENE 69 77513 - 78541 292 342 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 [Alteromonas macleodii ATCC 27126] # 1 325 1 327 327 117 27 4e-25 MKKVLALLVSCTLALSLAGCSSQTDSGQTAEGSGKKVRIMIGYENNPGEPIDLACHEWKR LVEEKSGGTMRVDLYPSSQLGSKDNIIDQAVNGDCVVTLANGPFFQDRGIQDFGVLFAPY LFEGWEDLEKVVDSDWFEGKAEELEQNVGLKILTTWRYGIRHTITKKPVNGVSDLKGKKI RVPNQTILIKVFESLGATPTPMAMSDIYTALQQGTIDGAENPLPVILNGAYQEVAKNLVL DAHTYDMTCWIMGADYFHTLTEEQQNILMECGDQAGVFNSQQLEKADEEALEALKDAGVT VSELTPEDFKAAGQAFYEDPQIKAMWSEGLIEEMKTIIGRQD >gi|157101646|gb|DS480678.1| GENE 70 78791 - 80443 664 550 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|39938628|ref|NP_950394.1| ribosomal protein L13 [Onion yellows phytoplasma OY-M] # 33 546 30 542 546 260 31 3e-68 MALVKIHGITKEYPEGTTWMEVAREHQKEYEYDILLVRVNGKLQELHKQVKDCELSFVTA KDKPGMSAYQRSASLMMLKAFYSVAGAGNVEKLMIDFSIGRGFFVEARGNFVLNQEFLDA VKAKMREYVERKIPIMKRSVSTDDAIELFEKLGMYDKARLFRYRMVSRVNIYSIDGFEDY YYGYMVQNTGYIKHFDLIPYHYGFVMVMPDRKTPDVLHRFTPSDKLFATLSESTEWGRRM DLETVGALNDRIAKGDMSHLILIQEALQEKKIAEIAAQIAARKNARFVMIAGPSSSGKTT FSHRLSVQLEAIGLKPHPIAVDNYFVNRVDSPRDEHGNYNYEILECLDVELFNRDMTGLL EGKQVELPYYNFKKGVREYKGNFLQLGEGDILVIEGIHCLNDRLSYTLPADSKFKIYISA LTQLNIDEHNRIPTTDGRLLRRMVRDARTRGSSARETIRMWPSVRRGEEENIFPFQEEAD AMFNSALVYELAVLKQYAQPLLFAIPKDSEEWLEAKRLLKFLDYFIGVSSEDIPKNSILR EFIGGSCLNV >gi|157101646|gb|DS480678.1| GENE 71 80484 - 81293 900 269 aa, chain - ## HITS:1 COG:FN1316 KEGG:ns NR:ns ## COG: FN1316 COG0327 
# Protein_GI_number: 19704651 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 250 1 242 258 146 33.0 5e-35 MKCSEIMDVCRKLAPEDCACDWDNPGLLAGRSDKEVKKIYIALDATDKVVEAAQALGADM LLTHHPLIFKAIKKVNDQNFITRRLVKLIQADISYYAMHTNFDAAPGCMADLAAERLGLA GCVPLEALGEMSGPEGAVSYGIGKSGFLKEEMTVRELAVRVKDVFGLPFALVYGDELMDQ KVSRISVCPGAGGSEIEGALAWGAQVLVTGDISHHQGIDAAARGMAIIDGGHYGLEHIFI PYMAEFLRKNLGGQVELEQAPPQWPAMIV >gi|157101646|gb|DS480678.1| GENE 72 81280 - 82065 705 261 aa, chain - ## HITS:1 COG:L84257 KEGG:ns NR:ns ## COG: L84257 COG2384 # Protein_GI_number: 15673054 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Lactococcus lactis # 18 250 4 228 230 117 33.0 2e-26 MNSDKLETRDKRKQAGTIKLSARLLTVAGFVRQGSRIADVGTDHGYIPVYLAQTGRIASA LAMDVRSGPLERAQAHVREYEERERARRQDVWAVPIHLRLSDGLKELKPGEADTVIIAGM GGELIIKILDEGRHVWDSVKQYVLSPQSDLDKVRRYLAAHGFAIEDEAMVKDEGKYYTVM SVKRGFMEYESQAQYLYGKILIDKKDVILREYLGREMLRIEKILVSLQEKDGITDTETRA EARISRQKELSWIKEAQDEMQ >gi|157101646|gb|DS480678.1| GENE 73 82058 - 83182 1672 374 aa, chain - ## HITS:1 COG:BH1376 KEGG:ns NR:ns ## COG: BH1376 COG0568 # Protein_GI_number: 15613939 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Bacillus halodurans # 15 373 23 372 372 411 68.0 1e-114 MEGTDVFEEKLAGLLEEAKKKKNVLEYQEVMNYFGQEPPVSNQLDRLFEFLDQNKVDVIR MGTEDELDPDLFIEEEMELDEEEEIDVEHLDLSVPEGVSLEDPVRMYLKEIGKIPLLGME DEVELAKKMELGDPEARKRLAESNLRLVVSIAKRYVGRGMQFLDLIQEGNLGLIKAVEKF DYTKGYKFSTYATWWIRQAITRAIADQARTIRIPVHMVETINRLVRTSRQLLQELGREPT TEEIAARLDLPVERVSEIMKMSQEPVSLETPIGEEEDSHLGDFIQDDNVLVPQDAAAFTL LHEQLMEVLLTLTEREQKVLRLRFGLDDGRPRTLEEVGKQFNVTRERIRQIEAKALRKLR HPSRSKKLKDYLDE >gi|157101646|gb|DS480678.1| GENE 74 83213 - 85039 2043 608 aa, chain - ## HITS:1 COG:CAC1299 KEGG:ns NR:ns ## COG: CAC1299 COG0358 # Protein_GI_number: 15894581 # Func_class: L Replication, recombination and repair # Function: DNA primase 
(bacterial type) # Organism: Clostridium acetobutylicum # 5 518 8 496 596 330 36.0 7e-90 MYYPDEVIEEVRMKNDIVDVISGYVKLQKKGANYFGLCPFHNEKSPSFSVSPGKQMYYCF GCGAGGNVITFLMEYENYTFQEALSSLADRAGVNLPKMEYSREAREQADLRARLLEVNKL AANYFYYQMKQPQGKAAYDYFHLKRGLADETIVHFGLGYSNKTSDDLYRYLKGKGYEDSF LKDTGLVTLEERGGRDKFWNRVMFPIMDVNNRVIGFGGRVMGDGEPKYLNSPETKLFDKS RNLYGLNYARLSREGYLLICEGYLDVISLHQAGFTNAVASLGTAFTSQHANVLKRYTDQV ILTYDSDGAGVKAALRAIPILKEVGMSIKVLNMKPYKDPDEFIKNMGAEAFRQRIREAKN SFLFEVDVLRRGYEMDDPEQKTRFYQETARKLLQFGEALERENYLQAVAREQMIPAEELR GLVNRMGMSMGLKAGESLNHSGRVVPQETQEEESGGSGLSGSRKPRRPDKEDGIRRSQRL LLTWLIETPALFEKIEGIITADDFVENLYHQVAQMVFDGHREGNLNPAAILSRFINDEDQ YKEVAALFNASLKESLNNEEQKKAFSETVMKVRKNSLDVASRNAKDITQLQEIIRQQAAL KQLHISLD >gi|157101646|gb|DS480678.1| GENE 75 85096 - 86106 1260 336 aa, chain - ## HITS:1 COG:RSc2968 KEGG:ns NR:ns ## COG: RSc2968 COG0232 # Protein_GI_number: 17547687 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Ralstonia solanacearum # 12 328 15 382 387 223 37.0 3e-58 MNIRESLEALEESYMSPYASLSSRTRGRDKPEQLCDMRPEYQRDRDRILHSKAFRRLKHK TQVFLAPEGDHYRTRLTHTLEVSQIARTIAKTLRLNESLTEAIALGHDLGHTPFGHSGEY ILNKICEDGFTHYEQSVRIVEVLEKDGRGLNLTWEVRDGIRNHRTSGHPSTLEGAIVRLS DKIAYINHDIDDAIRARMFTEHELPRGYTDVLGHSVRERLNNMIHDIILNSMDKPAITMS EGMEEAMQGLRKWMFDNVYKNDIPKAEEGKAQNMITQLYEYYMGHVDKLPVEYVLLMVNK GEKKSRVVCDYIAGMSDIYAIDQFEELFVPKRWNVY >gi|157101646|gb|DS480678.1| GENE 76 86238 - 88013 1764 591 aa, chain - ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 50 589 10 521 1118 331 36.0 2e-90 MDNQKRENLLNLALDATEEERLKSVNLNVGYDPGEKTWELIVRYNGSLESLRDEGIRVDE LAAGYAVLVVPESRIEQVSAMEQIVYIEKPKRLFFASNMARAASCLSTIQTSSGAGAGGA GTGGAGAGVISSLESGLTGKGVLVAVIDSGIDYFHPDFRNPDGTTRIGLLADQDRDRIYT REEINAALETGSRTSALALVPSTDPSGHGTAVAAIAAGNGREGNGVYRGVAYESELMVVK 
LGTPLTDSFPRTTQLMKALDLVVRRAQDMNRPLAVNISFGNTYGSHDGTSLLETFINDMS GIGRNVIVAGTGNEGTGAGHRAGSLVMGQEENAQLSIAPYETGMGVQLWKSYVDQFSIRL VTPSGEIIGPIDSRLGPQTLRYGGTQILIYYGKPSPFSRAQEVYFDFLPVRDYLDSGIWT FRLTPERIVTGRYDMWLPSRGILNPSTRFLRPVPETTLTIPSTAANVISVGAYDDSYRAY ADFSGRGFTRQTGQVKPDLAAPGVDIVTARRGGGYEAVTGTSFAAPFVTGSAALLMQWGI LQGNDPFLYGEKVKAYFTRGARHLPGYDVWPNERLGYGTLCVRDSLPLGNS >gi|157101646|gb|DS480678.1| GENE 77 88047 - 89375 1451 442 aa, chain - ## HITS:1 COG:BH1128 KEGG:ns NR:ns ## COG: BH1128 COG0733 # Protein_GI_number: 15613691 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus halodurans # 4 438 9 453 453 322 43.0 7e-88 MKTKNQWASKLGFIMATAGAAIGLGNLWKFPYLMGRNGGASFLVVYLFFICILGVPVMML EMSLGRKTRHDPVQAYGDIHPNARIAGVFGVAAAFLILSYYSVIGGWIIKYIVSYGTTFH APEDFEAFISAPAEPVFWHFLFLFMTTAVCYHGISGIENVSRIMMPLLFVLLIVIIVRSV TLPDAVLGLKFIFIPSSAAFNMKAVSAALGQVFYSLSLCMGITITYGSYLGDKENIPRSC ASVAGLDTIIAVMAGIAIFPAVFSFGLEPGQGPGLIFGTLPKVFASIKGGSVFAILFFAL VLFAAVTSAIALLEVVVSFAVDSLKWTRKKATVILGLLIFLTGVPSALSFGTLGNVTILN YSVFDFVGMVTDNILLPFGGLAMCYFIGWKWNPRYLVDEVEKNGVTFRLKKLWIFCIRFL TPILVAAVTLTGFASIYSIVRG >gi|157101646|gb|DS480678.1| GENE 78 89372 - 90667 1474 431 aa, chain - ## HITS:1 COG:SA2117 KEGG:ns NR:ns ## COG: SA2117 COG1757 # Protein_GI_number: 15927906 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Staphylococcus aureus N315 # 1 421 23 443 459 356 49.0 6e-98 MGENREGRAAALLPILIFLLIFLGSGFITGDFYSMPAIVAFLIALLAAFVQNRELGFAEK IKLAAQGVGDENIITMCLIFLAAGAFTGAVKAAGGVDSTVNLGLSILPSNVAVAGLFLIG CFISISMGTSVGTITALAPIAVGISQKTGFSMAVCAGAVVGGAMFGDNLSMISDTTIAAV KTQGCEMKDKFRENFLIVLPAAIVTILLFLVLGKDASFHIQGSLDYSLLRVVPYMVVLAG ALAGVNVFLVLMTGTVLSLIAGVATGAFGVGEMFTHVGAGITGMYDITVISVIVACIVAL VKEYGGIAFILSFIKKRINGERGGELGIAALALLVDMCTANNTVAIVMAGPIAKEISDDF GVTPRRSASLLDMFSSMGQGLIPYGAQLLAAASLTGLTPFEIIPYCFYPILMGVSGLLFI FIKKRHGKVLV >gi|157101646|gb|DS480678.1| GENE 79 90786 - 91706 1316 306 aa, chain - ## HITS:1 COG:AF2365 KEGG:ns NR:ns ## 
COG: AF2365 COG0719 # Protein_GI_number: 11499942 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Archaeoglobus fulgidus # 43 306 85 343 369 90 30.0 4e-18 MADSIQERILEEVADLHGVPAGAYNIRANGQTAGRNTTANIDIATKTDKPGIDIRIKPGT RNESVHIPVVVSASGLKEMVYNDFFIGEDSDVVIVAGCGIHNCGDQDSEHDGIHRFFVGK NAKVKYVEKHYGEGDGNGKRILNPGTEVYMEENSYMEMEMVQIEGVDSTNRTNCAELAAG AKLIVRERLLTHGSQNAESTYIVNLNGEDSSADVVSRSVAKDTSKQTFNSKIVGNAKCSG HTECDAIIMDDAKIFAIPGLIANNIDAALIHEAAIGKIAGEQIVKLMTLGLTEEEAEAQI VNGFLK >gi|157101646|gb|DS480678.1| GENE 80 91724 - 92461 192 245 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 223 1 226 245 78 25 1e-13 MLTLENLSFDVNDEKGEIGIIKDISFTVDDGKFVVITGPNGGGKSTTAKLIMGIERPTGG RILFDGQDITDMSITDRANLGISFAFQQPVRFKGVQVIDLLRLAARKKLTVSEACQYLAE VGLCAKDYINREVNASLSGGELKRIEIATVIARASKLSVFDEPEAGIDLWSFQNLIQVFE RMRQNTNGSILIISHQERILNIADEIVVIADGKVVNQGTRDEIMPQIMGTASAVDACSKF YETAQ >gi|157101646|gb|DS480678.1| GENE 81 92805 - 96761 3452 1318 aa, chain - ## HITS:1 COG:all4225_2 KEGG:ns NR:ns ## COG: all4225_2 COG2200 # Protein_GI_number: 17231717 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Nostoc sp. 
PCC 7120 # 1065 1308 7 250 260 179 36.0 4e-44 MDIIYMNEQIILLNDSMPAGFFRCCAGEDRRLDIVNHELLSLFGCTDTKELEELTGGTLK GMLDVRDHARVLDEVLEQLHQGKEIIKAQFRIRSRNGGICWVDFRGRVVEDERGNNWVHA VMIDDTEAQNQKQEYWEKSMRDSLTGLFNRDAAVQSVNRYMQDGDKAPDGTLMVIDLDNF KEINDTKGHLFGDSVLTEAARCIAGVFDSQDMVGRIGGDEFLVFSKQLSGREEIMKRGRQ LIEDIASLPSVKKSGSKLSCSIGAAVYPADGSTYEQLFLKADMALYAGKNWGKGRCVVYD PSLRPADSGKHALGRFTDGTMIDSENGRYFVANKVIHYVFKALYESTDLREAINTILKII GLQFNVSRAYIFEDCLDGLAMDNTFEWCNEGIEPQKDKLQHMSYEWSAYGYRENFGEDGV FFCMDIDHLPRNQREVLEPQGVRSLLQCGIYDSGQMVGFVGFDECRENRQWSREQIEVLT YVARVIGIFLTKDRAHRILEQRLATLENVLEETSGKVRGSSKGAFDALTNLLTAGAFQEQ TEQYLETGPAPYTTALFILDMDDFKRINEEKGKLFGNVVLMNVANSLKQACREGDLIARF GGDEFLVLLKNTNREDASAAGRTIQKEIGGILSDTGSKKERISCSMGVRLAGEGERNFSD IIVKANQALLQVKKSGKGGILFYEAVEDKENGSISYDYLKQAREAKRREQSSLDDKTTTA VALEVFEKSATVDEAIHILMGFVGNRFRLNRIVLYMNGEGGHGKQSAYQWVDDRTALLYD PTDSFRREEFYIFYHLYDGNGIAVLRRREYESYNAGLKRILDRAGAHTMLSAGIFIEGRY SGMILLVNTETEREWSRAECEAVSEMARIVASGIKNTSMLIEAKQEAEYYRNHDALTGLM RYDRFKESCQMLMDEGKEEYVLVASDIKGFKFINEAIGYTQGDNILRMFGDMLTQNGLET NCYTRVSADQFLSFGVCGMDRNDFVNMVQGLNNEFCRMENEIYSNINLMIRSGIYFIEKD CREIETAVDRATIARKSVDYIIRSTSVVFNDGPFDSSYRENEIINRMEYALKHGEFKVYL QPKVGLDGLGIVGAEALVRWQHEDGRIVPPGDFIPLFEKNGFITQVDTYVFGTVCSQLSQ WMEQGGEPLPISVNLSSVDIASEQLIPQILEITRACGLDHKYLEFELTETAFLSDSARTF HVMKTLQEEGFITSIDDFGSGYSIMNMMADIPTDVIKLDCGFVQSCTKTGRGREFLGQLI QMTNKMGFISLCEGIETAQELKMLREMGCELGQGYYFSRPLPMNVFFEKFCRKVIDKS >gi|157101646|gb|DS480678.1| GENE 82 96975 - 99407 3061 810 aa, chain - ## HITS:1 COG:CAC2356_2 KEGG:ns NR:ns ## COG: CAC2356_2 COG0072 # Protein_GI_number: 15895623 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Clostridium acetobutylicum # 164 809 8 654 654 610 49.0 1e-174 MNTALSWIKAYVPDLDVTAQDYTDAMTLTGTKVEGYTCLDKNLEKIVVGEILEVKGHPDA DKLVVCQVNVGAEEPVQIVTGAPNITEKSIGEKVPVVLDGGRVAGGHDGGALPEEGIRIK QGKLRGVESCGMMCSVEELGASRDMYPDAPESGIYILPGDSVPGSDAVEIMGLRDVVFEY EITSNRVDCYSVIGIAREAAATFGKAFHAPVVTPTGNGEDIHDYLNVRVENSELCPRYCA 
RMVKNIKLAPSPQWMQRRLAGNGIRPINNIVDITNYVMEEYGQPMHAFDYDQLAGREIVV KCAKDGDVFQTLDGQERKLDSTILMINDGEKEVGIAGIMGGENSKITDNVQTMVFESACF NGTNIRLSAKKVGLRTDASGKYEKGLDPNTAEEAVNRACQLIEELGAGEVVGGIIDLYPE KRTSKRIPFDAVKINRLLGTDIPEETMLSYFKTIELDYDPETKEVIAPTWRQDLERMADL AEEVARFYGYDKIPTTLPSGEATTGKLSYKLQIEEVAREIAEFCGFSQGMTYSFESPKVF DRMLIPEDSALRKTVEISNPLGEDFSVMRTTSLNGMLTSLSTNYNRRNKDVRLYELANVY RPKSLPLTELPDERMQFTLGMYGDGDFFTMKGVVEEFFEKAGMHKKPHYDPSAEHPYLHP GRKADILYDGAVVGFLGEVHPDVADNYKIGDRAYIAVIDMPSIMDFTTFDRKYTGIARFP AVTRDISMVVPKDILVGQIEDVIEQRGGKVLESYKLFDVYEGAQVLAGHKSVAYSITFRA KDHTLEEKEVTSVMNKILNGLGGLGIELRA >gi|157101646|gb|DS480678.1| GENE 83 99456 - 100475 1196 339 aa, chain - ## HITS:1 COG:CAC2357 KEGG:ns NR:ns ## COG: CAC2357 COG0016 # Protein_GI_number: 15895624 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Clostridium acetobutylicum # 1 339 1 339 339 451 60.0 1e-126 MKEQLEKIKEEALRQIESSEALERLNDIRVSYLGKKGELTNLLKSMKDVAPEDRPKVGQM VNDVRGLIEGRLEEAKTALARKAREEQLKREVIDVTLPARKNNVGHSHPNTIALEEVERI FVGMGYEVVEGPEVEYDRYNFEKLNIPKGHPARDEQDTFYINENILLRSQTSPVQVRTME KGRLPIRMIAPGRVFRSDEVDATHSPSFHQIEGLVIDRNITFADLKGTLAEFARELFGPE TKVKFRPHHFPFTEPSAEVDVTCFKCGGSGCRFCKGSGWIEILGCGMVHPHVLEMCGIDP DEYSGFAFGVGLERIALLKYEIDDMRLLYENDIRFLKQF >gi|157101646|gb|DS480678.1| GENE 84 101060 - 102043 794 327 aa, chain + ## HITS:1 COG:CAC2707 KEGG:ns NR:ns ## COG: CAC2707 COG0122 # Protein_GI_number: 15895964 # Func_class: L Replication, recombination and repair # Function: 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase # Organism: Clostridium acetobutylicum # 11 320 22 286 292 134 30.0 3e-31 MIHRYETPCMDLQQIHQSGQCFRMVPLPEHTYPSGTKNGYRLISGLRVLRAWQDGPFVCL DCPHEDLSFWLSYLDMDRDYQAVIASIDRENTYLRAAAMSGTGIRILRQDPWEMIITFVI SQQKTIPNIRQLVEALSSRYGTLLEDRQNRGSGEVREKDEVREKEGARREGAALEESLPP AFSFPAPSQLCLASLEDLMGLKLGYRAKYIHRLCQDAVSGRLDLSHLAALNYEEAMEYLT GFYGIGKKVANCVCLFGLHHIDAFPVDTWIEKILMEQYFDRKKYRRIPKNRLCETIVEDV FGRYSGCAGIMQQYIFYYERSVRQGRE 
>gi|157101646|gb|DS480678.1| GENE 85 102050 - 102661 177 203 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 [Bacillus selenitireducens MLS10] # 26 195 12 174 190 72 31 8e-12 MRQGSSTGSRTMVQRHGIARDLTVCGLFTALTAVGAFIKIVIPVGADTMNFTLQWFFVLL AGLLLGSRRAFLSVSTYLLIGLVGIPVFARGGGPSYLIRPTFGFLLGFALAAYVMGKIVE WMHASKPGACLFSATVGYVIYYGMGILYFYFITHFVVVTPDTVGWGAIFAVYCLPTMIPD YILCILAVTVAGRLRPSMGRLME >gi|157101646|gb|DS480678.1| GENE 86 102707 - 103297 625 196 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937121|ref|ZP_02084484.1| ## NR: gi|160937121|ref|ZP_02084484.1| hypothetical protein CLOBOL_02012 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02012 [Clostridium bolteae ATCC BAA-613] # 1 196 7 202 202 394 100.0 1e-108 MNRRNISPQAVRICGLTLLCLAIMFGVAAANKAKGGEKLVSEYTVQDDLLPRHVKFESMP ADMPQMTAAWFYKYKGLGQFDMCSRLFPQDQLEALNFEQEDRDFKDGYYIQEYIVHGFKT LSQEEYEDQKARYDQLAASYGYEEYKVVRVSFSQKWSPKALQKAPQWGDGAFTRDFAVGR EAGLRERWKIFELGMM >gi|157101646|gb|DS480678.1| GENE 87 103322 - 104206 721 294 aa, chain - ## HITS:1 COG:BH3365 KEGG:ns NR:ns ## COG: BH3365 COG1091 # Protein_GI_number: 15615927 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Bacillus halodurans # 5 256 4 245 283 97 27.0 2e-20 MNRRLLITGAGGFLGSRICEYYSNRDGYEAVGVTHRELDIEDFVAVSAFIKAIRPDYVLH CAAISNTGTCERNPVLSEKVNVRGTINLAKACRNAGSRMIFTSSDQIYNTSHSMEPNREG SEGKPGNVYGRDKKRAEEAMLTYLPDAVALRLTWMYDAPSRGRAAGQGLLQKLADALKDR REVEFPVHDYRGITYVWEVVRHLEEAMGLPGGVYNFGSENRYNTFDTARLFLRELAGSDS GGILKRNDERFASCPRNLTMDTDKIKSYGIRFFNTSEGIRKCCMEHREMSRASC >gi|157101646|gb|DS480678.1| GENE 88 104302 - 105744 1158 480 aa, chain - ## HITS:1 COG:TM0007 KEGG:ns NR:ns ## COG: TM0007 COG1305 # Protein_GI_number: 15642782 # Func_class: E Amino acid transport and metabolism # Function: Transglutaminase-like enzymes, putative cysteine proteases # Organism: Thermotoga maritima # 13 478 8 438 438 280 39.0 3e-75 
MDNSYFDSLAVVLPEDIEKLKWRGEFDRAVRRIDSRLEKDIPQALRKRLMIEKEILKRMP GQYPYTWAEALGILRDNVRDFKDCELETLWEEDAADWIYVDGEVRFRSSFLKNVLKTRKE YENRAMDLELVRSKAENFALLNRTIASMKERGGAGCRFCIRGTLGILEEAERDGEDIKVY LPLPIEYAQVKNVKIHSVRIGEREAEAQEYTIARPDAFQRTICFHTKHRKGQKYLVEFSF ENREAYKDFSRGSLEHGGPERIKERTGANGFAEPMDAWLAQQPPHIRFTPLIRELASDIV GEEKDPLKKARRIYDYITTHVMYSFVRSYFTLTDIVQYAASSLKGDCGVQALLFITLCRA AGIPARWQAGLYTSPLEISCHDWAQFYTPSHGWRYADCSFGGAAFREGETERWDFYFGNL DPFRLPAAREFGYEFEPPMDHLRNDPYDNQTGEAEYRDRSLAQEEYSTEYEMISLEDIIY >gi|157101646|gb|DS480678.1| GENE 89 105932 - 107083 1183 383 aa, chain - ## HITS:1 COG:STM0013 KEGG:ns NR:ns ## COG: STM0013 COG0484 # Protein_GI_number: 16763403 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Salmonella typhimurium LT2 # 4 382 2 367 379 319 51.0 5e-87 MAESKRDYYEVLGVPKDADEDALKKAYRKLAKKYHPDANPGDKEAEAKFKEASEAYSVLS DPQKRQQYDQFGHAAFEQGGAGAGGFGGGFGGFDFGGDMGDIFGDIFGDIFGGGRSGRAR SGPSRGANIKTSVRITFEEAVFGCDKEIEINFKETCASCHGTGAKAGTSPQTCSKCNGKG KIMYTQQSFFGTVQNVQTCPDCNGTGQVIKEKCPDCYGTGYKTVRKKFKVSIPAGIDNGQ CVRLAGGGEPGTGGGERGDLLVEAVVSQHPIFKRQDTSIYSTVPISFARAALGGPIRIKT VDGEVEYDVRPGTQTDTKVRLKGKGVPSLRSRSVRGDHYVTLVVQVPERMNQAQKDALRR FDDAMNGVASPEEEKPKKKGIFK >gi|157101646|gb|DS480678.1| GENE 90 107252 - 109114 2451 620 aa, chain - ## HITS:1 COG:CAC1282 KEGG:ns NR:ns ## COG: CAC1282 COG0443 # Protein_GI_number: 15894564 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Clostridium acetobutylicum # 1 618 1 610 615 715 67.0 0 MSKIIGIDLGTTNSCVAVMEGGKPTVIANAEGVRTTPSVVAFTKTGERLIGEPAKRQAVT NADKTISSIKRHMGTDYRVDIDGKKYTPQEISAMILQKLKADAESYLGETVTEAVITVPA YFNDAQRQATKDAGKIAGLDVKRIINEPTAAALAYGLDNEKEQKIMVYDLGGGTFDVSII EIGDGVIEVLSTNGDTRLGGDDFDNAITNWMISEFKKAEGVDLSNDKMALQRLKEAAEKA KKELSTATTTNINLPFITATAEGPKHLDMNLTRAKFDELTHDLVERTAIPVQNALKDAGI TASELGKVLLVGGSTRMIAAQEKVKQLTGKEPSKTLNPDECVAIGAAIQGGKLAGDAGAG 
DILLLDVTPLSLSIETMGGVATRLIERNTTIPTKKSQIFSTAADNQTAVDIHVVQGERQF ARDNKTLGQFRLDGIPPARRGIPQIEVTFDIDANGIVNVSAKDLGTGKEQHITITSGSNM SDDDIDKAVKEAAEYEAQDKKRKEAIDARNDADSMVFQTEKALEEVGDKIAPGDKAEVEA DLNSLKEAINRAPVEEMTDAQVEDIKAGREKLMNSAQKLFAKVYEQAQASGAAGQAGPDM GGAGNAAGGDDVVDADFKEV >gi|157101646|gb|DS480678.1| GENE 91 109164 - 109838 902 224 aa, chain - ## HITS:1 COG:BS_grpE KEGG:ns NR:ns ## COG: BS_grpE COG0576 # Protein_GI_number: 16079602 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Bacillus subtilis # 86 222 49 185 187 97 44.0 2e-20 MSEDMKDEALKKEDEALNTEQEAGTEQTSEAGSGEAGNTEAAAETGETAEDVPEGAEEDT EQAPAAEKEEKKGFFKKKKDPRDEKIEDLTDRVKRQMAEFENFRKRTDKEKSAMYEMGAK DIIERILPVIDNFERGLATVPEDAKGTPLAEGMEKIYKQFRKTLEEAGVKAIEAVGQEFD PNYHNAVMHVDDESLGENIVAEELQKGYMYRDSVVRHSMVKVAN >gi|157101646|gb|DS480678.1| GENE 92 109845 - 110897 1458 350 aa, chain - ## HITS:1 COG:CAC1280 KEGG:ns NR:ns ## COG: CAC1280 COG1420 # Protein_GI_number: 15894562 # Func_class: K Transcription # Function: Transcriptional regulator of heat shock gene # Organism: Clostridium acetobutylicum # 7 347 3 339 343 215 36.0 9e-56 MPLDALLDDRKVTILKAIIKTYLETGEPVGSRTISKYSDLKLSSATIRNEMSDLEEMGYI VQPHTSAGRIPSDKGYRFYVDQIMLDKESEVTEFKEMMVQKVDKLELVLKKMAQVLAANT NYATMISGPSYHKTKLKFIQLSKMEKHKLLVVVVVEGNIIKNTMIDIEDNLSDEELLNLN ILLNSSLNGLTIEDINLDVISKLKGDAGIHSQVVDLVLNEVAEAIKAGEEDLQIYTSGAT NIFKYPELSDGDKASRLIDTLEHKEVLQEFVAEVNSGTEEEAGIQVYIGDESPVQSMKDC SVVTANYDLGGGLRGTIGIIGPKRMDYEKVLGTLRNLMNQLDTAFKKDER >gi|157101646|gb|DS480678.1| GENE 93 111274 - 111843 463 189 aa, chain + ## HITS:1 COG:no KEGG:Closa_0879 NR:ns ## KEGG: Closa_0879 # Name: not_defined # Def: 3D domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 32 187 22 185 188 102 42.0 7e-21 MHYFKKTLLAAAMAAFSLTVAFPAFADEPVGMDVTVVEAGASDTSEDSSQEIQETSASRE SKGRRAAVMNEEEYIAQGPGAQKEEKEEAPAPKETSLGRFTITGYCGCEQCSGGHNLTFS GTVPTPNHTISADLDYFPLGTRLKVDGTVYTVEDKGSSVNGNILDIFYGSHEEALAKGTY TAEVFLVQD 
>gi|157101646|gb|DS480678.1| GENE 94 112116 - 112844 703 242 aa, chain + ## HITS:1 COG:no KEGG:Closa_0878 NR:ns ## KEGG: Closa_0878 # Name: not_defined # Def: 3D domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 9 241 12 240 240 198 44.0 2e-49 MKLRHLLTGTILTVSLLAATAITALAAAPKGRFETVNSTTISGWAYNSSTPDDALNVRIS VKDKNTGEEVFSQRMTAGEYSDKLYGSGKGNGCHAFTLNMDWSALPDGVYLVEGYVGDND FSNPRTYTNGNPDAKQETPAAQAQNLVPLGVFKTTGYCPCRACSEGWGRHTCTGAVATAG HTIAVDPRVIPYGSKVMINGVVYTAEDRGGAVRGNHIDIFFNTHAETRQHGTQSAEVYLV QS >gi|157101646|gb|DS480678.1| GENE 95 112925 - 115198 2430 757 aa, chain - ## HITS:1 COG:sll1283 KEGG:ns NR:ns ## COG: sll1283 COG2385 # Protein_GI_number: 16329811 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Synechocystis # 438 755 127 390 391 121 32.0 5e-27 MKNRMIAWVSGVVLVVVTLMVIIVKLEPPRDGIIRAQAMKAMALALTDKEECEKRAKERG TSHFSAKEKDNWFVKYMDYLYDEGYLDPELTPASLASAQGYLTYAEAAYMAGQVSGKLKL QAGSTRNNRDQAFPEEDWWQLYGSILKETDPEGKVEEMEAVLYGTPSNLDQAESWTAYTT VGNFGFQGLALDAFLDCEIRFLARSGEMITSSLVSRNVVYENVWLAESDGRHFKAYLGTA YREFPVSAKLGGEDGMAGNLADLHMEDGKLVKITMKRDRLSGKVLSVTEDAIEIEGYGEI PLAPNFHVYKVYGDFKVLNASDILVGYNLQEFVAADGKLCAALLEREFDAKTIRVLLMDT GFKSVFHASADLVLGSGADLEYENAKGKMVGERLEAGTQLTVGPDSPYLEYGRMIITPDE PEAITIRSIERSQGTPVYSGSLEIKGTPEGLVLVNDLFLEDYLTKVVPSEMPPSYEKEAL KAQAVCARTYAYRQIQGNTYSQYGAHVDDSTNFQVYNNTSANDKSTQAVKETYGKMLFYE DKPIEAFYFSTSCGRTADAGVWGTDSGKYPYLRAVEVKEGGKSLGKEDNDGFESYIKRED VIAYDTSYPMFRWQTDLPADVASAQISGAGQIQDMTVTDRGPGGIAGELTVTGSDGTVTI KGQSAIRSALGNPSLIITKKDGGTMTGSATLPSAFIAIEKRTGEDGSLSFHIYGGGFGHG VGMSQNGAQGMAKTGKGYKQILDFFYHGTELRECNEG >gi|157101646|gb|DS480678.1| GENE 96 115509 - 116852 1411 447 aa, chain - ## HITS:1 COG:SA1938 KEGG:ns NR:ns ## COG: SA1938 COG0213 # Protein_GI_number: 15927710 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine phosphorylase # Organism: Staphylococcus aureus N315 # 1 424 14 437 446 440 54.0 1e-123 
MRMYDLIEKKKRGGSLLTEEIAYMVRGFVAGDIPDYQMSAMLMAVYFKGMDDREITDLTL EMAHSGDMVDLSPIEGIKADKHSTGGVGDKTTLVTGPIVAACGVKVAKMSGRGLGHTGGT VDKLESIPGFKTAIPREEFFRIVNENGLALIGQSGNMVPADKKLYALRDVTATVDSIPLI ASSIMSKKLAAGSDCIVLDVKTGSGAFMKTLEDSISLAREMVSIGTLAGRRCCALITNMD VPLGHAIGNSLEMEEAIETLKGRGPEDLTCVCLELAANMLHMAGRGDMEQCRKIAQGAME DGSALKCLKSMVESQGGDGRYVENPGLFHKAPLSREVKASETGFITHMDTEGIGVASVML GAGRNKKEDTIDYSAGILLEKKYGDSVREGETIAVLYASHEELFGPAMEKFLASCTLGKQ MPEPEPLIFARISDAGVERRTIEEKIP >gi|157101646|gb|DS480678.1| GENE 97 116854 - 117270 369 138 aa, chain - ## HITS:1 COG:BS_cdd KEGG:ns NR:ns ## COG: BS_cdd COG0295 # Protein_GI_number: 16079584 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Bacillus subtilis # 1 135 1 130 136 112 46.0 2e-25 MERELLIQTALDGLGRSYAPYSHFHVSAALLCRDGKVYTGNNIENAAYSPGICAERCAFA KAVSEGEREFEAIVICGGPDGKAGDYCPPCGVCRQVMREFCNPDSFCIILAKSAEEYREY TLAQLLPESFGPDHLGGA >gi|157101646|gb|DS480678.1| GENE 98 117377 - 117844 590 155 aa, chain - ## HITS:1 COG:lin0783 KEGG:ns NR:ns ## COG: lin0783 COG2606 # Protein_GI_number: 16799857 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 3 151 4 152 158 147 47.0 5e-36 MVKTNVMRLLDAARIVYETGTYEVDENDLSGSHAADLMGVDHDRMYKTLVLKGEKKGYLV CCIPVDEELDLKKAARAAGEKKVEMIHVKDMFDITGYIRGGCSPIGMKKHFPTYIEEMAQ LYDTIMVSAGQRGVQVTLSPEDLRSYLQGQFVSLV >gi|157101646|gb|DS480678.1| GENE 99 118051 - 119145 1158 364 aa, chain + ## HITS:1 COG:AF1306 KEGG:ns NR:ns ## COG: AF1306 COG2006 # Protein_GI_number: 11498904 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Archaeoglobus fulgidus # 40 254 32 244 275 73 26.0 6e-13 MEKNDIIIIHGTDYKNMAKRVLEQAGVAADIGDVNKKIALKPNLVTAKAPSSGATTHSEL LAGVIEYLREHGFQNITIMEGSWVGDHTEDAFQAAGYHQVCSRYSVPFVDLQRDTWKEYD AAGMKIKLCDKAAAVDYMINMPVLKGHCQTTVTCALKNNKGVIPNSEKRRFHTLGLHKPI AHLNTIARSDFILVDNICGDLDFEEGGNPVVMNRVLGFKDPVLCDAFVCDCMGYSVDDVP YITMAEKLGVGSTDTAHANMIYLNQATEADSKFRMTRRVQHLAAYTAPEDACSACYGSLI 
YALDRLNDMGYLRKGLPPVCIGQGYKGQDGAIGVGQCTSCFAKTLGGCPPKAADIVDFLS TQWK >gi|157101646|gb|DS480678.1| GENE 100 119156 - 119902 636 248 aa, chain - ## HITS:1 COG:mlr0493 KEGG:ns NR:ns ## COG: mlr0493 COG0596 # Protein_GI_number: 13470715 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Mesorhizobium loti # 3 243 25 263 268 95 28.0 8e-20 MKEGYVDNRFASLYYKEMGQGMPLIMLHGNGESHEIFARLSELMSRYYRVILMDSRGHGS SLMKEEAAGGDLTIPDMAADVVQVMEFLHVPRAVILGFSDGANVALETASCYPGRVSAVV AVSGNALPRGMSAASLMEVKLKYALWRGLEKVPLPRGVRERAVKSRQLLGLMARWPRLDK DKLSCIQAPVLILTGRRDMIRPRHSLWMGEQIPDSKVVFVKGADHFTLLKKEKAYGAHIM AWLRQRGL >gi|157101646|gb|DS480678.1| GENE 101 120027 - 121736 2155 569 aa, chain - ## HITS:1 COG:BH2384 KEGG:ns NR:ns ## COG: BH2384 COG0513 # Protein_GI_number: 15614947 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Bacillus halodurans # 4 564 6 524 539 468 45.0 1e-131 METVKFDELQLDERIIRAITEMGFEEASPIQAQAIPVAMEGRDMIGQAQTGTGKTAAFGL PLLQKVDPKVKKLQAIVLLPTRELAIQVAEEMRRFAKFMHGVKVLPIYGGQDIVKQIRSL KDGTQVIVGTPGRVMDHMRRKTVKVDHVLTVVLDEADEMLNMGFLEDMETILSQLPEERQ TLMFSATMPQAIAEIARKFQKDPVTVRVIKKELTVPKVTQYYYEVKPKNKVEVMSRLLDM YAPKLSIVFCNTKRQVDDLVQELQGRGYFAEGLHGDLKQVQRDRVMDSFRNGRTDILVAT DVAARGIDVGDVEAVFNYDIPQDDEYYVHRIGRTGRAGREGKAFSLVMGKEVYKLRDIQR YCKTKIIPQAIPSLNDITEIKVEKILDQVQEVLNDTDLTKMVNIIEKKLMEEDYTSMDLA AALLKMSMGDESEDIIDSFETARSLDELDSFGRGSSRGRGRERSAYGNRRKGATDRAAVD YVLGEGDEKMARLFINIGKAQRITPGDILGAVAGESGIPGRMVGSIDMYDGYTFVDVPGR YADDVLKAMAHAKIKGKNIHVEKANTNRR >gi|157101646|gb|DS480678.1| GENE 102 122221 - 123081 976 286 aa, chain - ## HITS:1 COG:NMA0225 KEGG:ns NR:ns ## COG: NMA0225 COG1752 # Protein_GI_number: 15793247 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Neisseria meningitidis Z2491 # 4 285 47 
295 300 112 29.0 1e-24 MGYGLALAGGGTRGAAHVGVLKALMEAGLRPDAVAGASAGGIVAGLFASGMSVSRMEQAV LHLEKHGGEYLDPDYSGLLAFMPQLLTGKGVNLSGLIKGDKLQDYFFQMTGRKHMDEAVF KLVIPAVDLISGNTICFTNSDQVREMEHVTWEWNGYLCEAMMAGASVPAIFAPRKLGRYF LVDGGVTDILPVNLLQAAGIRDVLAVDIGGDYEAPSDHSVMEVASHSFSIMSGRLKDCAS TGEVLLLKPPLSSKAGLLTFELMGECMERGYEYTRKMLPQIRKVLN >gi|157101646|gb|DS480678.1| GENE 103 123190 - 123645 612 151 aa, chain - ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium acetobutylicum # 1 151 1 151 152 102 31.0 3e-22 MLTTSIDLIQKYDYLEDKFKKGYEFLRNTDLKALPLGRVDIDGDEVFASVQEYTTMPADA CRYESHDLYFDIQYVVEGQERFGYTKRAGLEEEAPYNEADDIVFFKEPEQGGAVLLKAGD CAVVAPEDAHKPRCIAGTPCKVRKIVVKVRV >gi|157101646|gb|DS480678.1| GENE 104 123647 - 124423 867 258 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3765 NR:ns ## KEGG: Cphy_3765 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 40 246 5 217 233 135 37.0 2e-30 MKREDGRKLFRAFLGNSRETAGEQRAMMEQMAVPSRERSLYSMVEHLLSLDETAWYLYAW SRDPLEGRFSREQKLAYGFRAAECGRNEAQLILGKTDVWDIAAGMGLKVLTPQVPGGGGH VIFAQYQEPDSITIFMDGAERAQKLIREEKLEGLLGNRAVEDVLLAHELFHVTEYRKKDF IFTQTEKVELWKKPFSNRSRILCLGEIAGMEFARCLTGITFTPYVLDVLMMYGYDKEAAT ALYEEIAAFAGNGEGKEG >gi|157101646|gb|DS480678.1| GENE 105 124458 - 126587 2603 709 aa, chain - ## HITS:1 COG:SMc02580 KEGG:ns NR:ns ## COG: SMc02580 COG0145 # Protein_GI_number: 15963814 # Func_class: E Amino acid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: N-methylhydantoinase A/acetone carboxylase, beta subunit # Organism: Sinorhizobium meliloti # 8 377 7 373 668 140 28.0 6e-33 MGNRLVRMGIDVGGTHTKAVAIDNATHEIIGKSSVKTTHDDRYGVAAGVVKAFQNCLREN DIAPEDVIFVAHSTTQATNALIEGDVACVGVLGMGPGGMEGFLAKHQTRLKDIDLGSGRS IKIKNRYLSDSEEDENTVRKAVEELKAEGAQVLVASDAYGVDDIAGEQFVFEIAHDEMGM ETTVASDITKLYGLTRRTRTAAINASILPKMLNTANSTEESVRKAGVEVPLMIMRGDGGV 
MEINEMKKRPVLTMLSGPAASVMGSLMYLRASNGVYFEVGGTTTNIGVIKNGRPAIDYSV VGGHRTYISSLDVRVLGVAGGSMIRVNKSGVHDVGPRSAHIAGLDYAVFTDEEEIVEPRL EFFSPKEGDPEDYVAIRLKSGKRITITNSCAANVLGLVTPKDFSYGSVNSARKAMQPIAD YLNTTVEDVASQILTRAYEKIEPIIMELADKYRLEHDQISLVGVGGGATSLIGFCANKMN IKYSVPENAEVISSIGVALAMVRDVVERVVPNPTPDDIRSIKREAVDKAVESGAAPGSVE VHIEIDPQTSKLTAIALGSTEVKTTDLLKECSEEEARNLAAEDLRSGPEELNLVGQTPNF YVYNMESRGRKPLRILDKKGFIKVQRTDGEVVLAKASSYRSIVSRMWEEMAIFKADAILR PDYYLCVGARVMDFNGTGELEQILMLMDVEMQMVDPSEDIVIVGAKNQL >gi|157101646|gb|DS480678.1| GENE 106 126604 - 127947 1700 447 aa, chain - ## HITS:1 COG:no KEGG:Apre_0573 NR:ns ## KEGG: Apre_0573 # Name: not_defined # Def: citrate transporter # Organism: A.prevotii # Pathway: not_defined # 17 447 17 433 434 346 44.0 1e-93 MEYIIGILILASFFGLAVYAARGGNLMMGMLIMAILWTILPMTGNLLASNPEFIAANGDS IRITWIQAFSKVFQSGPEGWGSVLVNVVFGAWFGRVLLETGIAATLIRKTTELGGDKPAV TCILLCIVTAAIFSSLFGAGAVVAIGVIILPIFMSLGIPKVLSVVSFMLSIGAGMFLNPV LFGQYTAFFLDGDGKVLYPYEQYVKWGGIALAVQLVFTIILILFCMRKKSVHSWAARRPV RQKLDFAPTPSLLTPFIPVFLLIVFKVPIIMGFLIGGFYALFVCGKLKSFRAACRTFNKD FFDGVVDTAPLVGFLLVVPMFNKAAELCIPYFNALLGNIIPHSTLFITIIFCVLAPLGMF RGPFTLYGCGAATLGILKGVGFEVAFLAPLMIAATTVMNVSCCITQSWIVWGISYAKVST KEFLKMSVVCGWIICCILQIITFIMFG >gi|157101646|gb|DS480678.1| GENE 107 127990 - 128997 1242 335 aa, chain - ## HITS:1 COG:yiaK KEGG:ns NR:ns ## COG: yiaK COG2055 # Protein_GI_number: 16131446 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Escherichia coli K12 # 1 333 1 332 332 308 47.0 7e-84 MRVKYEDLLEKFQRILESRGFSGKHAKDAATVFANNSLDGVYSHGVNRFPRVVEYLDKGE IDSAAIATCESSMGAIERWNGHRGFGPLNAKLAMDRAVELAREYGVGVVALGNNNHWMRG GSYGWQAADKGCIGICWSNTMPNMPAWGGRDRKIGNNPFIMSIPRSNGKHAVIDCAVSQF SYGKIEEAKLKGQQLPVPGGYDTKGNLTTDPAEIEKTWRVLPMGYWKGSGISIALDLIAT VLTNGNSVSRIGTFGDEVGLSQIMIAIDPLRFNTPEETDSIVDEILADIKSSEPIREGGE VYYPGELELITRETNTRDGIPVIDEVWETLNSLER >gi|157101646|gb|DS480678.1| GENE 108 129403 - 130401 1110 332 aa, chain - ## HITS:1 COG:YPO2387 KEGG:ns NR:ns ## COG: YPO2387 COG1609 # 
Protein_GI_number: 16122610 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Yersinia pestis # 3 329 4 330 341 146 29.0 5e-35 MNIYDIAAEAGTSISTVSRVLNNKGNVNPKIRDRVEAVLKKYDYKPSAIARGMVSKTMKN IAILTVDVRVTHYARMIYVIEQEFSNLGYNVSVCNTGGSIQECDRYFEILSEKQTDGIVL IGSVFNELINYPEITAKIKDTPVVIANGQVKLPNFYSVLVDDTKAIRMASDYMFDHGRMD LFYVFDIATDSGRAKRQGFLEAMKLRDVPDGNDRVVIVDESSLEGGRDAARKILAAGRPV NGVVCGEDITAIGLMKEFKERGYRVPEDIAITGCNNSPDSRICEPELTTLDNKPELLGGM CASLLRDRMEGKESATSVAIQPELVVRGSALF >gi|157101646|gb|DS480678.1| GENE 109 130737 - 131693 1237 318 aa, chain - ## HITS:1 COG:no KEGG:Closa_2430 NR:ns ## KEGG: Closa_2430 # Name: not_defined # Def: G3E family GTPase-like protein # Organism: C.saccharolyticum # Pathway: not_defined # 5 318 7 317 317 424 66.0 1e-117 MAYEIEDDIMPLFLINGFLEAGKTQFLDFTMQQDYFKTEGKTLLIVCEEGDTEYDEKKLK KNKTAVVFVDDMEKLTPQYLNELEVIYQPERVLMEWNGMWNQDDLKLPDDWTIYQQITII DGSTFELYVQNMKPLLGAMLRGSELVIVNRCDGIGDDKLTAYRRTIRAMSRESEIVLEDK EGEIEQAALEEDLPYDLGADVIEIKPEDYGIWYIDCMDQPERYKDRVVEFTAMVLKTPKF PKGQFVPGRMAMTCCEADMTFLGFMCKWPEAEDYKTKQWVKVRAKVGIEYQKDYHGEGPV LYAEHVERAEEIKDVVQF >gi|157101646|gb|DS480678.1| GENE 110 131714 - 131953 276 79 aa, chain - ## HITS:1 COG:no KEGG:Closa_2431 NR:ns ## KEGG: Closa_2431 # Name: not_defined # Def: cobalamin synthesis protein P47K # Organism: C.saccharolyticum # Pathway: not_defined # 4 78 373 447 447 109 70.0 4e-23 MTHEQLEDILKRLSSTREFGDVLRAKGMLPTENPGEWLYFDLVPEQYEIRQGRPDYTGKV CVIGASLKEEELNSVFGRG Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:26:36 2011 Seq name: gi|157101645|gb|DS480679.1| Clostridium bolteae ATCC BAA-613 Scfld_02_20 genomic scaffold, whole genome shotgun sequence Length of sequence - 172914 bp Number of predicted genes - 151, with homology - 145 Number of transcription units - 72, operones - 38 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 67 - 1038 817 ## COG0524 Sugar kinases, ribokinase family 2 1 Op 2 1/0.143 - CDS 1044 - 2468 1167 
## COG2211 Na+/melibiose symporter and related transporters 3 1 Op 3 1/0.143 - CDS 2489 - 3514 1076 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 3596 - 3655 7.0 - Term 3680 - 3725 9.5 4 2 Op 1 1/0.143 - CDS 3737 - 4879 1049 ## COG1940 Transcriptional regulator/sugar kinase 5 2 Op 2 . - CDS 4906 - 5418 122 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 6 2 Op 3 . - CDS 5475 - 6065 260 ## gi|160937164|ref|ZP_02084526.1| hypothetical protein CLOBOL_02054 7 2 Op 4 . - CDS 6081 - 6224 136 ## gi|160937165|ref|ZP_02084527.1| hypothetical protein CLOBOL_02055 - Prom 6276 - 6335 6.7 8 3 Op 1 11/0.000 - CDS 6374 - 8737 1561 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 9 3 Op 2 15/0.000 - CDS 8734 - 9231 297 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 10 3 Op 3 . - CDS 9212 - 10090 713 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs - Term 10102 - 10158 12.3 11 4 Op 1 . - CDS 10170 - 10814 461 ## COG1335 Amidases related to nicotinamidase 12 4 Op 2 11/0.000 - CDS 10836 - 12110 860 ## COG1593 TRAP-type C4-dicarboxylate transport system, large permease component 13 4 Op 3 11/0.000 - CDS 12125 - 12676 427 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 14 4 Op 4 . - CDS 12710 - 13849 1053 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component 15 4 Op 5 . - CDS 13879 - 15045 966 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 16 4 Op 6 . - CDS 15092 - 16831 1225 ## Sthe_2748 peptidase M28 17 4 Op 7 . - CDS 16857 - 18002 640 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 18 4 Op 8 . - CDS 18050 - 19021 440 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold - Prom 19042 - 19101 7.8 - Term 19171 - 19214 8.1 19 5 Tu 1 . 
- CDS 19236 - 19886 503 ## COG1802 Transcriptional regulators - Prom 19911 - 19970 6.4 + Prom 20068 - 20127 4.9 20 6 Op 1 . + CDS 20221 - 20355 122 ## gi|239627993|ref|ZP_04671024.1| transcriptional regulator 21 6 Op 2 . + CDS 20384 - 20716 252 ## gi|160937179|ref|ZP_02084541.1| hypothetical protein CLOBOL_02069 22 6 Op 3 . + CDS 20701 - 21267 387 ## Amico_1744 hypothetical protein 23 6 Op 4 . + CDS 21267 - 21587 233 ## gi|160937181|ref|ZP_02084543.1| hypothetical protein CLOBOL_02071 + Prom 21596 - 21655 3.2 24 7 Op 1 . + CDS 21676 - 21816 96 ## gi|160937182|ref|ZP_02084544.1| hypothetical protein CLOBOL_02072 25 7 Op 2 . + CDS 21844 - 22725 647 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 26 7 Op 3 . + CDS 22780 - 22929 92 ## gi|239627991|ref|ZP_04671022.1| conserved hypothetical protein + Prom 23036 - 23095 6.9 27 8 Tu 1 . + CDS 23129 - 24574 669 ## ELI_1104 hypothetical protein 28 9 Op 1 . - CDS 24605 - 27118 1755 ## COG0642 Signal transduction histidine kinase 29 9 Op 2 . - CDS 27115 - 28347 788 ## MTES_1114 sugar ABC transporter periplasmic protein 30 10 Op 1 1/0.143 - CDS 28466 - 32431 2275 ## COG0642 Signal transduction histidine kinase 31 10 Op 2 . - CDS 32495 - 34195 1423 ## COG0840 Methyl-accepting chemotaxis protein - Prom 34282 - 34341 8.4 - Term 34284 - 34324 1.9 32 11 Op 1 . - CDS 34456 - 34704 188 ## gi|160937191|ref|ZP_02084553.1| hypothetical protein CLOBOL_02081 33 11 Op 2 . - CDS 34697 - 34930 196 ## gi|160937192|ref|ZP_02084554.1| hypothetical protein CLOBOL_02082 34 11 Op 3 . - CDS 34930 - 35376 238 ## COG0642 Signal transduction histidine kinase - Term 35693 - 35727 5.1 35 12 Op 1 . - CDS 35879 - 37753 1617 ## Elen_2576 extracellular solute-binding protein family 1 36 12 Op 2 . - CDS 37778 - 40090 1402 ## COG0642 Signal transduction histidine kinase 37 12 Op 3 . - CDS 40145 - 40606 430 ## gi|160937197|ref|ZP_02084559.1| hypothetical protein CLOBOL_02087 - Prom 40639 - 40698 3.3 - Term 40653 - 40692 -0.8 38 13 Op 1 . 
- CDS 40764 - 42419 1406 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 39 13 Op 2 . - CDS 42343 - 43344 490 ## Dtox_1525 recombinase 40 13 Op 3 . - CDS 43347 - 44954 1269 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 41 13 Op 4 . - CDS 45032 - 45193 193 ## gi|160937202|ref|ZP_02084564.1| hypothetical protein CLOBOL_02092 42 13 Op 5 . - CDS 45205 - 45615 337 ## Closa_3172 Resolvase domain protein - Prom 45646 - 45705 2.4 43 13 Op 6 . - CDS 45707 - 46051 266 ## COG2337 Growth inhibitor - Prom 46157 - 46216 3.6 - Term 46580 - 46609 1.6 44 14 Op 1 . - CDS 46729 - 46989 57 ## 45 14 Op 2 . - CDS 46995 - 47405 389 ## LM5578_1859 hypothetical protein - Prom 47425 - 47484 4.3 46 15 Tu 1 . - CDS 49398 - 49637 130 ## Trebr_1033 MATE efflux family protein - Prom 49753 - 49812 6.0 47 16 Tu 1 . - CDS 50055 - 50198 65 ## gi|160937213|ref|ZP_02084575.1| hypothetical protein CLOBOL_02103 - Prom 50233 - 50292 5.4 48 17 Op 1 8/0.000 - CDS 50316 - 50909 346 ## COG0500 SAM-dependent methyltransferases 49 17 Op 2 35/0.000 - CDS 50906 - 51667 195 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 50 17 Op 3 . - CDS 51715 - 52722 502 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 51 17 Op 4 . - CDS 52761 - 53255 190 ## Clole_0369 oxidoreductase/nitrogenase component 1 52 18 Tu 1 . - CDS 53985 - 54689 218 ## Swol_0407 nitrogenase molybdenum-iron protein alpha and beta chains-like protein - Prom 54782 - 54841 2.5 53 19 Op 1 1/0.143 - CDS 55307 - 56083 526 ## COG1348 Nitrogenase subunit NifH (ATPase) 54 19 Op 2 . - CDS 56124 - 57245 679 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component - Prom 57271 - 57330 3.6 - Term 58647 - 58691 7.4 55 20 Op 1 . - CDS 58698 - 59048 264 ## BBIF_0517 hypothetical protein 56 20 Op 2 . 
- CDS 59024 - 59266 118 ## gi|160937224|ref|ZP_02084586.1| hypothetical protein CLOBOL_02114 - Prom 59446 - 59505 4.6 - Term 59494 - 59532 -0.6 57 21 Op 1 35/0.000 - CDS 59619 - 61340 221 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 58 21 Op 2 . - CDS 61337 - 63082 220 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 - Prom 63286 - 63345 2.4 + Prom 63119 - 63178 4.6 59 22 Tu 1 . + CDS 63368 - 64582 821 ## COG3547 Transposase and inactivated derivatives + Term 64818 - 64874 9.4 - Term 64806 - 64861 13.0 60 23 Tu 1 . - CDS 64873 - 65343 234 ## COG0783 DNA-binding ferritin-like protein (oxidative damage protectant) - Prom 65364 - 65423 8.7 + Prom 66601 - 66660 7.1 61 24 Tu 1 . + CDS 66764 - 66865 65 ## + Prom 66927 - 66986 3.2 62 25 Tu 1 . + CDS 67084 - 68529 406 ## COG3464 Transposase and inactivated derivatives + Term 68630 - 68682 1.0 - Term 68460 - 68493 -1.0 63 26 Op 1 . - CDS 68689 - 68835 145 ## gi|160937230|ref|ZP_02084592.1| hypothetical protein CLOBOL_02120 64 26 Op 2 . - CDS 68869 - 71634 1861 ## COG0474 Cation transport ATPase - Prom 71666 - 71725 2.0 - Term 71663 - 71713 1.5 65 27 Op 1 . - CDS 71799 - 72110 193 ## CD3374 hypothetical protein 66 27 Op 2 . - CDS 72117 - 72269 106 ## 67 27 Op 3 . - CDS 72357 - 73034 300 ## COG1285 Uncharacterized membrane protein - Prom 73055 - 73114 4.6 68 28 Tu 1 . - CDS 73174 - 74097 300 ## COG0053 Predicted Co/Zn/Cd cation transporters + Prom 74136 - 74195 8.4 69 29 Tu 1 . + CDS 74395 - 74550 94 ## gi|160937236|ref|ZP_02084598.1| hypothetical protein CLOBOL_02126 + Term 74676 - 74722 9.0 - Term 74660 - 74711 10.1 70 30 Op 1 . - CDS 74714 - 75217 303 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 71 30 Op 2 . - CDS 75233 - 77893 1506 ## COG0474 Cation transport ATPase - Prom 77922 - 77981 4.9 + Prom 79063 - 79122 4.5 72 31 Tu 1 . 
+ CDS 79142 - 79279 176 ## COG4716 Myosin-crossreactive antigen + Term 79431 - 79471 4.0 73 32 Tu 1 . + CDS 79907 - 80206 158 ## COG4716 Myosin-crossreactive antigen + Prom 80488 - 80547 5.6 74 33 Tu 1 . + CDS 80663 - 80782 61 ## gi|317501057|ref|ZP_07959263.1| hypothetical protein HMPREF1026_01206 + Prom 80811 - 80870 4.2 75 34 Tu 1 . + CDS 81068 - 82042 289 ## COG3049 Penicillin V acylase and related amidases - Term 82896 - 82944 8.5 76 35 Op 1 16/0.000 - CDS 82955 - 84103 1122 ## COG1879 ABC-type sugar transport system, periplasmic component 77 35 Op 2 21/0.000 - CDS 84151 - 85167 856 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 78 35 Op 3 . - CDS 85184 - 86692 1640 ## COG1129 ABC-type sugar transport system, ATPase component 79 35 Op 4 2/0.000 - CDS 86728 - 87177 408 ## COG4154 Fucose dissimilation pathway protein FucU 80 35 Op 5 3/0.000 - CDS 87171 - 87875 532 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 81 35 Op 6 4/0.000 - CDS 87912 - 89381 923 ## COG1070 Sugar (pentulose and hexulose) kinases 82 35 Op 7 . - CDS 89394 - 91187 1122 ## COG2407 L-fucose isomerase and related proteins - Prom 91388 - 91447 6.1 + Prom 91172 - 91231 9.2 83 36 Tu 1 . + CDS 91399 - 92250 312 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 92232 - 92273 -0.4 84 37 Tu 1 . - CDS 92371 - 92682 164 ## COG2056 Predicted permease - Term 94934 - 94985 16.3 85 38 Op 1 . - CDS 94991 - 95392 213 ## Cthe_1620 hypothetical protein 86 38 Op 2 . - CDS 95389 - 95664 287 ## Cthe_1619 AbrB family transcriptional regulator - Prom 95709 - 95768 5.2 87 39 Tu 1 . - CDS 96274 - 97953 479 ## Closa_0537 hypothetical protein - Prom 98078 - 98137 5.2 88 40 Tu 1 . - CDS 98146 - 98331 64 ## - Prom 98351 - 98410 4.4 - Term 98406 - 98443 5.0 89 41 Tu 1 . - CDS 98592 - 98747 56 ## gi|160937268|ref|ZP_02084630.1| hypothetical protein CLOBOL_02158 - Prom 98784 - 98843 1.5 90 42 Tu 1 . 
- CDS 98872 - 99300 128 ## gi|160937269|ref|ZP_02084631.1| hypothetical protein CLOBOL_02159 - Prom 99353 - 99412 3.1 91 43 Tu 1 . - CDS 99467 - 99931 133 ## gi|160937270|ref|ZP_02084632.1| hypothetical protein CLOBOL_02160 - Prom 99956 - 100015 3.6 92 44 Op 1 . - CDS 100062 - 100688 431 ## gi|160937271|ref|ZP_02084633.1| hypothetical protein CLOBOL_02161 93 44 Op 2 1/0.143 - CDS 100746 - 105212 2461 ## COG5263 FOG: Glucan-binding domain (YG repeat) 94 44 Op 3 1/0.143 - CDS 105224 - 111511 4464 ## COG5263 FOG: Glucan-binding domain (YG repeat) 95 44 Op 4 . - CDS 111547 - 113199 1290 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Term 113226 - 113262 2.1 96 45 Op 1 . - CDS 113327 - 113959 365 ## Closa_0878 3D domain protein 97 45 Op 2 . - CDS 113991 - 114428 136 ## gi|160937276|ref|ZP_02084638.1| hypothetical protein CLOBOL_02166 98 45 Op 3 . - CDS 114320 - 115108 290 ## PPE_04051 uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain - Prom 115262 - 115321 7.2 - Term 115219 - 115277 2.0 99 46 Op 1 . - CDS 115371 - 116261 201 ## COG0582 Integrase - Prom 116307 - 116366 9.1 - Term 116339 - 116385 -0.4 100 46 Op 2 . - CDS 116429 - 116569 64 ## - Prom 116699 - 116758 9.8 101 47 Tu 1 . - CDS 116846 - 116992 63 ## gi|160937279|ref|ZP_02084641.1| hypothetical protein CLOBOL_02169 - Prom 117090 - 117149 4.2 102 48 Tu 1 . - CDS 117176 - 117331 80 ## gi|160937282|ref|ZP_02084644.1| hypothetical protein CLOBOL_02172 + Prom 117761 - 117820 6.2 103 49 Op 1 . + CDS 117871 - 119289 341 ## COG1696 Predicted membrane protein involved in D-alanine export 104 49 Op 2 . + CDS 119307 - 120275 487 ## bpr_I2151 GDSL-family lipase/acylhydrolase + Term 120373 - 120417 3.2 - Term 120842 - 120880 0.3 105 50 Op 1 . - CDS 120900 - 121610 278 ## COG1285 Uncharacterized membrane protein 106 50 Op 2 . 
- CDS 121628 - 122506 258 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) - Prom 122640 - 122699 5.4 - Term 122681 - 122732 5.2 107 51 Tu 1 . - CDS 122852 - 123418 176 ## PROTEIN SUPPORTED gi|226309687|ref|YP_002769581.1| probable 50S ribosomal protein L25 - Prom 123479 - 123538 8.7 108 52 Tu 1 . - CDS 124361 - 124510 204 ## PROTEIN SUPPORTED gi|224476463|ref|YP_002634069.1| 50S ribosomal protein L33 - Prom 124647 - 124706 2.3 - Term 124666 - 124692 -0.3 109 53 Op 1 7/0.000 - CDS 124869 - 125822 541 ## COG3689 Predicted membrane protein 110 53 Op 2 . - CDS 125819 - 127378 705 ## COG0701 Predicted permeases 111 53 Op 3 . - CDS 127375 - 128286 253 ## COG0523 Putative GTPases (G3E family) 112 54 Tu 1 . - CDS 128393 - 129718 746 ## Closa_2429 polysaccharide deacetylase - Prom 129752 - 129811 2.8 - Term 129720 - 129752 0.6 113 55 Op 1 42/0.000 - CDS 129918 - 130715 792 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 114 55 Op 2 25/0.000 - CDS 130702 - 131412 251 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 115 55 Op 3 . - CDS 131433 - 132527 953 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin - Prom 132554 - 132613 3.4 116 56 Tu 1 . - CDS 132710 - 133021 238 ## gi|160937302|ref|ZP_02084664.1| hypothetical protein CLOBOL_02192 - Prom 133171 - 133230 5.3 - Term 133413 - 133449 -0.4 117 57 Tu 1 . - CDS 133464 - 133898 155 ## Closa_4176 ferric uptake regulator, Fur family 118 58 Op 1 . - CDS 133975 - 134121 67 ## gi|160937304|ref|ZP_02084666.1| hypothetical protein CLOBOL_02194 119 58 Op 2 . - CDS 134137 - 134361 90 ## gi|160937305|ref|ZP_02084667.1| hypothetical protein CLOBOL_02195 120 58 Op 3 . - CDS 134358 - 134468 56 ## - Prom 134703 - 134762 5.6 121 59 Op 1 3/0.000 - CDS 134805 - 139073 1292 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins 122 59 Op 2 . 
- CDS 139060 - 140160 362 ## COG4693 Oxidoreductase (NAD-binding), involved in siderophore biosynthesis 123 59 Op 3 . - CDS 140163 - 141239 363 ## COG1748 Saccharopine dehydrogenase and related proteins 124 59 Op 4 . - CDS 141253 - 148599 2348 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins - Prom 148780 - 148839 5.3 - Term 148769 - 148803 1.1 125 60 Tu 1 . - CDS 148849 - 150015 341 ## COG3547 Transposase and inactivated derivatives - Term 150081 - 150117 1.1 126 61 Op 1 . - CDS 150220 - 150750 198 ## gi|160937311|ref|ZP_02084673.1| hypothetical protein CLOBOL_02201 127 61 Op 2 1/0.143 - CDS 150747 - 151391 232 ## COG2091 Phosphopantetheinyl transferase 128 61 Op 3 7/0.000 - CDS 151402 - 152889 628 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins 129 61 Op 4 3/0.000 - CDS 152914 - 156057 1081 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins 130 61 Op 5 . - CDS 156062 - 156631 253 ## COG3208 Predicted thioesterase involved in non-ribosomal peptide biosynthesis 131 62 Op 1 34/0.000 - CDS 157046 - 158545 215 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 132 62 Op 2 . - CDS 158559 - 159284 297 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 133 62 Op 3 . - CDS 159284 - 159871 302 ## SLGD_01924 hypothetical protein 134 62 Op 4 . - CDS 159950 - 160567 276 ## COG1309 Transcriptional regulator - Prom 160615 - 160674 7.1 + Prom 160537 - 160596 7.0 135 63 Op 1 35/0.000 + CDS 160694 - 162448 650 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 136 63 Op 2 . + CDS 162449 - 164173 226 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 - Term 164195 - 164225 3.4 137 64 Op 1 . - CDS 164413 - 164532 70 ## gi|169351275|ref|ZP_02868213.1| hypothetical protein CLOSPI_02054 138 64 Op 2 . 
- CDS 164563 - 164889 212 ## COG2946 Putative phage replication protein RstA - Prom 165109 - 165168 4.7 + Prom 165483 - 165542 5.4 139 65 Tu 1 . + CDS 165636 - 165902 224 ## SGGBAA2069_c00880 HTH-type transcriptional regulator + Prom 165905 - 165964 1.7 140 66 Tu 1 . + CDS 166056 - 166286 138 ## gi|160937326|ref|ZP_02084688.1| hypothetical protein CLOBOL_02216 - Term 166475 - 166515 -0.6 141 67 Tu 1 . - CDS 166519 - 167460 262 ## COG2378 Predicted transcriptional regulator + Prom 167470 - 167529 7.0 142 68 Tu 1 . + CDS 167555 - 167974 294 ## CLD_3697 hypothetical protein + Term 168014 - 168058 3.3 - Term 167998 - 168047 6.1 143 69 Op 1 . - CDS 168072 - 168254 185 ## Tthe_0801 PilT protein domain protein 144 69 Op 2 . - CDS 168251 - 168490 169 ## Cthe_1619 AbrB family transcriptional regulator - Prom 168521 - 168580 5.9 - Term 169152 - 169181 0.5 145 70 Op 1 . - CDS 169377 - 169733 262 ## COG2873 O-acetylhomoserine sulfhydrylase 146 70 Op 2 . - CDS 169759 - 170382 206 ## BCZK1570 hypothetical protein 147 71 Op 1 . - CDS 170997 - 171350 140 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase 148 71 Op 2 . - CDS 171390 - 171614 196 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase - Prom 171655 - 171714 3.9 149 72 Op 1 . - CDS 171778 - 171993 81 ## GYMC10_6261 peptidase C60 sortase A and B 150 72 Op 2 . - CDS 172036 - 172593 366 ## COG4509 Uncharacterized protein conserved in bacteria 151 72 Op 3 . 
- CDS 172598 - 172882 295 ## gi|160937344|ref|ZP_02084706.1| hypothetical protein CLOBOL_02234 Predicted protein(s) >gi|157101645|gb|DS480679.1| GENE 1 67 - 1038 817 323 aa, chain - ## HITS:1 COG:BH1857 KEGG:ns NR:ns ## COG: BH1857 COG0524 # Protein_GI_number: 15614420 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Bacillus halodurans # 9 323 5 316 319 199 36.0 5e-51 MENQKKYDVIALGELLIDFTPAGTSAQGNPVLEQNPGGAPCNVLAMLARYGRSTGFIGKI GNDIHGRFLCRAVRESGIGCGGLVMSDEVHTTLAFVSMDESGDRSFSFYRNPGADMALTE DEVNLDMIRCSRIFHFGTLSMTHEGVRKATIRAAACARENGCLISFDPNLRPPLWADMEE ARKQMLYGVSLCHILKITDEELRFMTGIEDEKQAVGHLQAVHGIPLILVTAGAFGSTAYW AKEALKAEAFLTDRTIDTTGAGDTFCGCCLNYLLDHPLETLNREGVRDMLTMASAAASIV TTRKGALKSMPEPGEIEEVVRQK >gi|157101645|gb|DS480679.1| GENE 2 1044 - 2468 1167 474 aa, chain - ## HITS:1 COG:BS_ydjD KEGG:ns NR:ns ## COG: BS_ydjD COG2211 # Protein_GI_number: 16077683 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 10 469 1 457 463 219 31.0 7e-57 MTEMVLRSKMKRTGQTAAAETDLRADTDVKVSLISKIAYGMGDVGCNFSWMFVGNFLMIF YTDVFGISMGAVAGLMLFSRFWDAVNDPIIGSLTDRTHSRWGRYRPWLLVAAPLTAVVLI LTFWAHPGWNSTAKIVYMGITYCILVLGYTCVNIPYGTLCGAMTQNIEERAQINTFRSVA AMIAIGIINIITVPLIETLGAGSDKRGYLLIAVIYGGIFALCHMFCFAKTREVVAIPERK KISLKEQLRAVSRNRPYILALAGQMLFGFTLYGRNADALYYFTYVEGSKAFFTVYSMCII IPSIIGAACFPMLFHWTNNKGRAASIFAFLTGMSMMGLGMFTAKESPILFYGVSALTQFF FSGFNTAIYAIIPDCVEYGEWKTGLRNDGFQYAFISLGNKMGMALGTSLLAMVLGMCGYV AGGQQNPAVLAVIKHSFTTIPGMFWIITGIVLFFYRLNRRRYNEIMKEMYTGGV >gi|157101645|gb|DS480679.1| GENE 3 2489 - 3514 1076 341 aa, chain - ## HITS:1 COG:YLR070c KEGG:ns NR:ns ## COG: YLR070c COG1063 # Protein_GI_number: 6323099 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Saccharomyces cerevisiae # 5 341 11 356 356 188 34.0 1e-47 
MLQQVMTEPGKIEFHEVPVPEISDREVLIKIMKIGICGSDIHVYHGEHPFTSYPVTQGHE VSGEVVKAGRAVTGLKPGQKVTIQPQVVCGQCYPCRHGKYNLCEDLKVMGFQTTGVASHY FAVDAAKVTLLPDDMSYDEGAMIEPLAVAVHAVKQARDVKGAKIAVLGAGPIGILVAQAA KGLGAEQVMITDVSGLRLEKAKECGVDFCVNTRNKDFGEAMTESLGPDKAEVIYDCAGNN ITMGQAIKYARKGSTIILVAVFAGLGQIDLAVLNDHELDLNTSMMYRNEDYLDAIRLVNE KKVVLSPLISRHFAFGDYLKAYQYIDENRESTMKVIINVQE >gi|157101645|gb|DS480679.1| GENE 4 3737 - 4879 1049 380 aa, chain - ## HITS:1 COG:CAC3673 KEGG:ns NR:ns ## COG: CAC3673 COG1940 # Protein_GI_number: 15896905 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Clostridium acetobutylicum # 16 376 6 378 385 184 33.0 2e-46 MGCPIMKINEKALEKKRQNKARIASHILHRKQISKPELAIELGLSMPTVLQNVKELQEAG IVEEVGEYESTGGRKAKALSVVKNVRYAAGVDITANHISVVIIDLKGDMVHSRRFRRRFE NTAEYYEDLAAELEQFLDESRVDKKKILGVGISLPGIVDRENSRLIRSHILQVSDINLEV ISRLIPYPVHFENDANSAAVAELQGTDRNAVYLSLSNTVGGSIYLDNDIYGGDHFRSAEF GHMIIYPEGRLCYCGKRGCADPYCSAGVLSSFAGEDGGLAEFFRLVEKKERDAMAVWDTY LEDLSIVITNLRMCFDCDIVLGGYVGGYLKPYMSQLGRKVMVNNKFDNDTLYLRNCRYEK EASAVGIAMTFMDEYFRTII >gi|157101645|gb|DS480679.1| GENE 5 4906 - 5418 122 170 aa, chain - ## HITS:1 COG:BS_fabG KEGG:ns NR:ns ## COG: BS_fabG COG1028 # Protein_GI_number: 16078654 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Bacillus subtilis # 49 139 62 149 246 70 41.0 2e-12 MTSPAKIAEYMQVSVITVRRTLTLFNQLGVTHSVNGVGTRILGAENSMDVSNPDDSEILV SSTMETFGKLVDILVNNAGARDEGLRPVDKEIDSEIDRVVGINIKGVTYCTRAILKDMLK NKKGVIVSVASSSGLSGGPDNGPVCRPGLWFQFIVKYSGFPSCHMLLLDV >gi|157101645|gb|DS480679.1| GENE 6 5475 - 6065 260 196 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937164|ref|ZP_02084526.1| ## NR: gi|160937164|ref|ZP_02084526.1| hypothetical protein CLOBOL_02054 [Clostridium bolteae ATCC BAA-613] hypothetical protein 
CLOBOL_02054 [Clostridium bolteae ATCC BAA-613] # 1 196 1 196 196 402 100.0 1e-111 MKIEGYITLTKKAGAAVAVQFQEAEFEKNIQTFFTLRKEAVMDMCKSFGPMFSQAQWHGL KNAGPEQMDELERLRSQTNILRPYIMVQHIQLIYGSLNNQLLLRLIWQAFLFYQAPFLSL PANLMDFEDSDGPLLYMIRFCRQKDWDGLWGSVTYCQEQITCAISRFYANRITLEPPGEQ IPFCWNVYQNTSQKCY >gi|157101645|gb|DS480679.1| GENE 7 6081 - 6224 136 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937165|ref|ZP_02084527.1| ## NR: gi|160937165|ref|ZP_02084527.1| hypothetical protein CLOBOL_02055 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02055 [Clostridium bolteae ATCC BAA-613] # 1 47 1 47 47 90 100.0 5e-17 MDKVMELRQVIYGLLAAQIEFGTYHYKDPLPKIEEVSQWVGVSLDTV >gi|157101645|gb|DS480679.1| GENE 8 6374 - 8737 1561 787 aa, chain - ## HITS:1 COG:mll4880 KEGG:ns NR:ns ## COG: mll4880 COG1529 # Protein_GI_number: 13474083 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Mesorhizobium loti # 8 774 21 773 774 344 32.0 5e-94 MTLNESCIGQRYIRPDSISKVTGYAKFTADLLAGRTDVLVAKVRRIDCARARIKSIDTGE AKKVPGVVAVLTAGDLKDGSGYGYILKDKPVLAKDEIICDCDPVAFVAAETEEAAYRAVN LIKVEYDALPFETDPLKNMLPGAEQVRKEYGSHHCGNVSNDVRVAKGGMEEAASKTVIEI TADFRTPMVEHTALERDIAFAEPDAVNGGLTFYCPVQNVHGMREGLCEALHLPVSRIRVV SPLVGGGFGGKECSSVDCGVAAGILALCTGRAVVYEMTREEMFRYTSKRHRSYIKYKIGV DTHGCMTGMWSEAIFDKGAYKSVDVIPHRSALLSGGPYVIPAVDVRNRSIYTNHVYGGAF RGLGAPQQYYALECAMDDMARTIGMDPVEFRLKNLIKEGSETIFSQQMSKSDAAGIRECI EKVRQELEWDKPLDCSDPYKKRGRGIACYMYGTGSSFPKDAGHVYLELNLDGSLNVNLAQ NEMGQGLITAMSQIAAQAMGVSIEYVNVGISDSMCGPEAGPTSASRATVFQGNAIIAGCR SLKKRILDVAADMLEADAHELTIKNSEIFSEKEPDKKVTLKSAAARARVSQVSLAQVGNW YPPMTEKDPENLNQTTRWTTFAYGAHGVLVEVDTRNGMITVCRSVQATDVGKAINPDTVE GQMDGGAAQAIGWALMEECFLRHGLVKETSLHEYLIPTALDVPSLESIIVESGSESGPFG AKGVGEPTILGGAPAIRNAVLDATGLAMYEIPMTPVRVMEALERAETASAAEKKPFFRVP LQCGLWD >gi|157101645|gb|DS480679.1| GENE 9 8734 - 9231 297 165 aa, chain - ## HITS:1 COG:SSO2433 KEGG:ns NR:ns ## COG: SSO2433 COG2080 # Protein_GI_number: 15899181 # Func_class: 
C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sulfolobus solfataricus # 4 155 15 165 171 161 52.0 4e-40 MPVVHFDLNGRAVCIEVDGKELLVDTLRKRFLLTGTKKACGTDDCGACTVLIDNRAVRSC AYLTCMVEGRNVMTIEGLGTPWHLHPLQQAFIDAGAVQCGYCTPGMIMTVYGLLEEKENP DEEEIRMAISGNLCRCTGYQKIVNAVKLAAERRLMTANHSWEEIK >gi|157101645|gb|DS480679.1| GENE 10 9212 - 10090 713 292 aa, chain - ## HITS:1 COG:SMb20130 KEGG:ns NR:ns ## COG: SMb20130 COG1319 # Protein_GI_number: 16263878 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Sinorhizobium meliloti # 5 232 6 235 287 94 29.0 2e-19 METNYYHPRELEEAAALLEEQEGTVLVNGGSDIILSISAGKITPTGIIDIRGIKSMNTIT DDGQYVHIGGNVTYRQMQDSPVCGRIGGLARAAGSVGSPVIRMVATPAGNIATAAPSADC ASMMLALDTELVLVSVRGERIVRQKDIYLGAYKTDIQSNEIIKEMRFPSPGPDKGSGYAR FSRRKSQDIAKVIAGAVVTVDKKGRCLSAVISLGALNATAVLAPSIGERIKGLGRDEAIQ LALKFFPREAGLRESYFKRYKEETTCNAIAEALDMAFAETERGDSCYACGSL >gi|157101645|gb|DS480679.1| GENE 11 10170 - 10814 461 214 aa, chain - ## HITS:1 COG:mlr4169 KEGG:ns NR:ns ## COG: mlr4169 COG1335 # Protein_GI_number: 13473531 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Mesorhizobium loti # 7 208 21 216 223 78 30.0 1e-14 MEDFKAALIVVDVTRAHFDYDFNYLPVPEDDCQRVLGGLVKVIPEFRSRGLPVIFVRTGH KINPLTGETMSLASPFWRYQMENKISIGGHVRRPVNTENSPAMEIMPQLGAEEQDIIVHK QRYSAFMGTPLEMYLRVMGINTLFFTGANTNNCVLCTAFEAYNRDYRVIAIEDCCASMNG KAYHELAIMQIKASLGWVVDSGSVADILDGRTKL >gi|157101645|gb|DS480679.1| GENE 12 10836 - 12110 860 424 aa, chain - ## HITS:1 COG:PM0273 KEGG:ns NR:ns ## COG: PM0273 COG1593 # Protein_GI_number: 15602138 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, large permease component # Organism: Pasteurella multocida # 54 422 54 424 427 248 40.0 1e-65 
MTAGLFIIALLVCLFLGIPIFIAMMIPSTYYLLNVLNMQPIMAVSSFTAGLNKFSTLCLP FFILAASVMGKGQIGSRLLKFCRSIVGHMYGGIGMTAIVCCIIIGAISGASTAGILIIGA LIYQEMIDNGYPRGFSAGLICTVSSVGMLIPPGIAFVVYAMNTNTSVLKLFMGGLSSGIV MAAIFMVYVYVYARRHKVPRLPRISLKEFFTAVKEAIWALGLPVIIIGTMYTGICTPTEA AAISAVYAIFVEMVIYKDVDFKTLIDICGDCAKTVCTIYILLAAGQLLGYVMTLARIPQL MEAMLGSTSRLMILLVINIMFLIAGMFMGAGSSIVIIMPMVYQLALKAGIDPIFLGNVVV TNLAIGMSTPPFGLNLFVSAKVMKLSFVEIVRSSMPFIMLTILVLIIITLFPGLSLFLPN LLVE >gi|157101645|gb|DS480679.1| GENE 13 12125 - 12676 427 183 aa, chain - ## HITS:1 COG:BH3392 KEGG:ns NR:ns ## COG: BH3392 COG3090 # Protein_GI_number: 15615954 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Bacillus halodurans # 9 174 9 173 175 63 22.0 2e-10 MNKTYEKYKAFDKAFKKFELTFCALSILFVTFLVTLGIVLKNFFGFSFPWTEELCQYIMV WVACIGGVISVEKHEHVSVDIIYNMLPQKLHSYYRMILAVIATVFLAAFAYFSYLEVVSI KATARTSVTMPWFQMWWMYLGTMMGCGLMALEYFKTIFVLFKEGKIKAGVDTKDQDLSKL ETM >gi|157101645|gb|DS480679.1| GENE 14 12710 - 13849 1053 379 aa, chain - ## HITS:1 COG:SMb20036 KEGG:ns NR:ns ## COG: SMb20036 COG1638 # Protein_GI_number: 16263787 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Sinorhizobium meliloti # 65 375 27 332 338 142 30.0 1e-33 MKKVLSILLAAAMVSSLWACGAPASKEASAAKETPAAEDMSVQENASQAENASSGESAKA ASDYEYPEMTIILAHSGAATDARQEGALAIETYIEETSGGKIQVDVYPAGQLGDSNTLVE SVQNGGIQMCIQPPANVCAFNPLLSILDVPYFFPPDIDKARAVFETDAADALLDTMQDYD MVGLGYWVDLFKAFTSNVPLRAPEDFNGLKFRVMASPVLMKFIESLGGTALTIDYQETFT ALQTGAIDGQEAGIGAGIYNMKFYEVQKYMEITNHILASQLIFANKTWFEGLDDPCKQLV MDAVNIAGNDAYQEVRGDIEQAALDAIGAQCEIITLTDEELQAMSDACRQPCLDLYIESN GEAGQKILDAFNADIEKLN >gi|157101645|gb|DS480679.1| GENE 15 13879 - 15045 966 388 aa, chain - ## HITS:1 COG:MA0362 KEGG:ns NR:ns ## COG: MA0362 COG0596 # Protein_GI_number: 20089259 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases 
(alpha/beta hydrolase superfamily) # Organism: Methanosarcina acetivorans str.C2A # 58 370 94 422 422 89 23.0 9e-18 MSKYRQNESWISNKFVGQLGIEAFMPNANMGLEEKGFKHADIQRANRRMTCLRAMPKAWK AVAEQQEKLAEESAAMGNMVTAGAFYHRAALYYGKAQLYHHKDDAMKLELHTSCVRCYEK AIAGYDYTIERVVLPFEGKNIYGIFHAPKGALNLPTVLFTPGMDMIKEDYPNLNDNFFVR RNMCVLVMDGPGFGETRVHGLPVTLTNYPEAASLFLDWLCGRPEVNPDQIGIFGGSMGSY YGPTVAVNDSRIKAVVGLLGCYLEKDLQFNSCQPGLRNNFKYMCNIYDDDAFDEFVAEMT LEKVVDRLKVPFLMSTGEFDEMCPPELTERFFHMLDCPKEMWIMEGEFHSCAGMYHELFA WAIDWLREALTTGKPENLEVRKVIPERW >gi|157101645|gb|DS480679.1| GENE 16 15092 - 16831 1225 579 aa, chain - ## HITS:1 COG:no KEGG:Sthe_2748 NR:ns ## KEGG: Sthe_2748 # Name: not_defined # Def: peptidase M28 # Organism: S.thermophilus # Pathway: not_defined # 14 571 4 564 570 347 34.0 1e-93 MKDYVMELEHIENSLVDRVDGKKLMDYTSNIAKWVRISGTQEEVDSLLYCEKVMKEIGYE TKLTFHDAFISVPVRAHVEMVSPVPMGFRALTHCFTRSAPEHGLEGGAVDSDSPDINGLI AIKDGLPNADQVRDMEKRGAVAVIYVQDDNLHNSPVSSLWGGPTEKTEGLLPGIPVVSVV RQDGAFIRDQMKKGPVKIWIQSVVDTGWRKVPLLEAQLKAEGTEDFLLFGSHIDSWDYGA MDNGAANATMIECARLLAAEQKSWKRGLRLVFWAGHSQGKFFSSAWYADNHFEELEKHCA GYVYVDSTGGKDAVVIDEAPVMPQTRSLAASVIKKQTGIEFIGKRIGHFADQSFYGVGLT SIFGTFSEQDIEKTRDILSFRTGTPKHAGGLGWWWHTEHDTMDIIDRDILIRDTKIYVAV VWRLLSSAVLPYDFREAVEEMKETVESLGSLLGDRFDFMPLRERLCLLERRMGNLYRQIE GNEIPDEDADSVNALLQKLSHKIVRITFHGENHFDFDLSGAMYPIPSLGDGVRLAGCNKD SYRYFVLRTQLMRGYNRVMSYVREAAELLAGYDDIDGER >gi|157101645|gb|DS480679.1| GENE 17 16857 - 18002 640 381 aa, chain - ## HITS:1 COG:BS_amhX KEGG:ns NR:ns ## COG: BS_amhX COG1473 # Protein_GI_number: 16077370 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Bacillus subtilis # 16 366 12 360 389 247 39.0 3e-65 MKSSICEYVNAHHTDILSTFNELHSMPEPALREVRTADYLKTQLKQAGFDVKGDFAKTAV LGIRDTGIDGPVVGIRADMDALCYEKEGKTIYIHSCGHDANCTAVLWAAKALLETGQLKA GSLKVLFQPAEETLQGAEAVIESRVLEGTQYLVSTHLRPQEELPMGQVSPAVLHGASGHI RVKIYGHEAHGAKPHQGVNAILTASAVIGTVNALPFNPSVPHSIKPTKISSGSNPFNIIP NYAEIMFDIRAQTNEVMKQIRESLTKAAVTSAESMGAKALAEWLGGVPAASRCDELIEIA 
SEAIRESLGEDALGPVIITPGGEDFHNYPLAISGLRTTVLGIGAGLKPGLHMSDMTFDTN AVFNAVTAIGSTVVNIYKSNL >gi|157101645|gb|DS480679.1| GENE 18 18050 - 19021 440 323 aa, chain - ## HITS:1 COG:slr0619 KEGG:ns NR:ns ## COG: slr0619 COG2159 # Protein_GI_number: 16331820 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Synechocystis # 32 318 38 333 348 64 25.0 2e-10 MIIDVHAHYIPASLADNEKFSDLFYVKEDTFGKTMFIKNRELRPFNPGLIYLDEQIADMD KAGIDMRFISLPPFALNYEDCRCKDWTRESNKALSEDASLHRNRFRYLATLPMADMEGTI REIDRVINDPLCAGIEIATNIAGMELDDAYLEPFWKKAAEYEIFVLLHPHYTIKSARLER YHLRNLMGNPLDTTIAAFALMTGDVASKYPGVKICFSHSGGYTPYAIARFEHARKVRKEF EGVQKSYEECCRSFYYDTILHDAETLQFVESKVTSSHLLMGTDYPFDMGDEEPVHTVSSM KIPREKISDILGGNIERLIGDGI >gi|157101645|gb|DS480679.1| GENE 19 19236 - 19886 503 216 aa, chain - ## HITS:1 COG:BS_ydhC KEGG:ns NR:ns ## COG: BS_ydhC COG1802 # Protein_GI_number: 16077637 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 13 201 17 209 224 71 26.0 9e-13 MGRKDKSLEDKAYNYLHDMIIQHKLQAGKRIIESDIADCLGMSRTPVRAALRRLEEEKLV YSLQNLGTFVEQLTVKDVLEMNELRIVLEVKALESCVMKAADEDIDECIRVTESITVTDS AQDIKAKDEYFNNFIIHYCENSRIRDILHSLNAELSHPRHLVAFTEKRLEVMKQDHLKIL KYIKARSYSRAAKALEKHHNNRYYGFMEEYMKMIIK >gi|157101645|gb|DS480679.1| GENE 20 20221 - 20355 122 44 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|239627993|ref|ZP_04671024.1| ## NR: gi|239627993|ref|ZP_04671024.1| transcriptional regulator [Clostridiales bacterium 1_7_47_FAA] transcriptional regulator [Clostridiales bacterium 1_7_47FAA] # 1 43 1 43 299 75 90.0 1e-12 MTLQQMKYFVAVADSVSISEAAKRLFAAQSSVSEAVRVVEGQWD >gi|157101645|gb|DS480679.1| GENE 21 20384 - 20716 252 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937179|ref|ZP_02084541.1| ## NR: gi|160937179|ref|ZP_02084541.1| hypothetical protein CLOBOL_02069 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02069 [Clostridium bolteae ATCC BAA-613] # 1 110 1 110 110 224 100.0 1e-57 
MLTGDGKELFIEFKGSLNRLYHLDLKYEDKKESNHAFYVAAQHHICGMDSFMSIIQSLDV PEYYIGFREYKTSGVFEQVEKDLADLGVIFFCGTAERPDDAGTQGAWDPL >gi|157101645|gb|DS480679.1| GENE 22 20701 - 21267 387 188 aa, chain + ## HITS:1 COG:no KEGG:Amico_1744 NR:ns ## KEGG: Amico_1744 # Name: not_defined # Def: hypothetical protein # Organism: A.colombiense # Pathway: not_defined # 54 181 87 214 271 135 53.0 6e-31 MGSSLTPFPTAQPMPISRRPILWQNAARYGSSGRVFYYAGTSGVCSWKRVRAAGANLSKL VAVGPTLILQEFGNLGTIFISLPIALLLGLKKEAIGACYSINRDSNLGLTTDIYGPDAKE TEGTFAVYIVGSVIGTVFISLLAGVVASWNIFHPLAMGMASGVGSGSMLSSAAETLGKSI PHMQKISW >gi|157101645|gb|DS480679.1| GENE 23 21267 - 21587 233 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937181|ref|ZP_02084543.1| ## NR: gi|160937181|ref|ZP_02084543.1| hypothetical protein CLOBOL_02071 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02071 [Clostridium bolteae ATCC BAA-613] # 1 106 3 108 108 173 100.0 3e-42 MGGASDMLTGITGIYIGTFVGLPVTRKLYALLEPKIGRKPLSPVKDIIIQVTSKVTFIAP ATALGAFAGISLGKDLKNFVKMGWKMVVITLLVMAGTFIFSAFVAD >gi|157101645|gb|DS480679.1| GENE 24 21676 - 21816 96 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937182|ref|ZP_02084544.1| ## NR: gi|160937182|ref|ZP_02084544.1| hypothetical protein CLOBOL_02072 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02072 [Clostridium bolteae ATCC BAA-613] # 1 46 1 46 46 86 100.0 5e-16 MVKCDILINGCRILKPDMTVSEEMSVAIHDSWIKKVGPAGEIKEIF >gi|157101645|gb|DS480679.1| GENE 25 21844 - 22725 647 293 aa, chain + ## HITS:1 COG:MTH1505 KEGG:ns NR:ns ## COG: MTH1505 COG0402 # Protein_GI_number: 15679502 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Methanothermobacter thermautotrophicus # 2 267 53 313 427 149 38.0 9e-36 MLLMQGLVDGHTHACQQLLRGRVSDEYPMVWTRFLVPFESNLRPDDSYINGQLACLDMIK NGTTSFADSGGIHMERVADAVLESVMQATIAKFTIDMGNAITGAMKETAEEAIMHTRDLY KAYDGKGDGRISIWFAIRQVMTCSWVLIAMARDAAAAELNTGIHAHLCEHKDEVSFCLQN 
YHLRPAQFLKSMGVLDPNLLTAHNVMLSDEDIAMMARRDVKVIHCPRANLSNHGFPKTPQ MLQAGLNVSLGCDGTAPSNLDLFDEMKALRYSMIAYWGLPFFNPMVMSLSHAS >gi|157101645|gb|DS480679.1| GENE 26 22780 - 22929 92 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|239627991|ref|ZP_04671022.1| ## NR: gi|239627991|ref|ZP_04671022.1| conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] # 1 49 317 365 394 92 87.0 8e-18 MEEGRKADVILLNIDQPHLSPTQNLINTIVEAANGHDVTDSIMNGKIVM >gi|157101645|gb|DS480679.1| GENE 27 23129 - 24574 669 481 aa, chain + ## HITS:1 COG:no KEGG:ELI_1104 NR:ns ## KEGG: ELI_1104 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 451 1 455 500 130 24.0 2e-28 MKGNMERYQVVYSVLKTHIQFGIYGFGDVLPSIENAAADFFVSVDTMRAAYQQLQRDGLV TISTNIGTTVVRNYGKEEIEQNIQMFYAQRKNILIDLSKSLRPLLGHAQWIGFKNIPAEI YGNLRQLEDSHSLPQTTTFEHVVRAYTALGNNLLFRLVWQIFMFCENPFFNMRDNPWRTF IIKDFLPRSWDCCIKQDWDCLRELVYASQDSLSLALCRFYDEKITMPPPEQEVAFQWNSY NKASQICYSLAMDLLVSINNGVYPADTLLPSLNKLSQEKQVSVSTVRRALSLLNGVGATK SAKRIGTRVLPSHEIVKNCDFKNPAVRKRLLDMAQSLQFLTLSCRDVAEGTIQALTEDGL QTCRKRLAALKNSQQYELIGYATLELLKCFAPYKAIRTVYKELLQLLFWGYSLKDIWKTD EGRIHYYLSCFEEFDLFLAEKDAAGFSMKLEELMAGEFHFTIEIMISQGIHEAKKLLVIG I >gi|157101645|gb|DS480679.1| GENE 28 24605 - 27118 1755 837 aa, chain - ## HITS:1 COG:MA2256_2 KEGG:ns NR:ns ## COG: MA2256_2 COG0642 # Protein_GI_number: 20091095 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 420 707 20 305 345 173 37.0 1e-42 MKIKEGRKTYKYLIYISCLCLLLAIGAMTAVSLMSVQSVRKLVRNTTTKSITELTVSKAQ FLDEKIHSELLSLQSFAANLGAFDDLFAYPELLEDYKEKHGAARMWIIDTNGTYQDTGGM NKNFADKKELFTEALQGNEGITDVFLGELGRRQIMFQIPIYRDGKVVGGLYEAYPVELLQ NAYHGSTYNDAGYSYVLDDNGSIVLAPVRFSYLQIYSTIQDVLKDGENDEESIRKFSAAL RSQASGTAVFDFEGEKQFLTFMPLMQKHNWYYVSVIPLSMVEKDGTAIIMHTMKMAFIIT AAIVFSVAVIAAFFLLRNKKRRDYERYVQNINEAIAQNIDTAIFIVDGHTGHVEYAFENA 
EKILGIKPQNLGEDCGRCSGSFYETLCSVLKERIDEKTVKTIPVYNDLLSRQMWIQIAAL PVKLLGELKYIFAVTDVTHDRQIRENLNAAVAAAEHANASKSVFLSNMSHDIRTPMNAIV GMTKLAEIHIEDREKVKDCLYKIGISSKHLLELINDVLDMSKIESGKLVLTSESFSLTEL IQRNIAIVNPQYQAKNQVFVTETKDIRHEYLLGDSLRFNQVLLNLLSNAGKFTPENGTIT LTVKELPQKHSGHAVYRIAVADTGIGISPEFLPKLFLPFERERSRHQNQAEGTGLGLVIS KNIIAAMGGQIYVESRLGEGSVFTVEVELPLSKEADDELCGRTGPRDSRNIFTGRRFLLV EDNELNCEIASQLLEVSGAAVECAPDGAEGVKAFEEKDPGYFDAILMDIQMPVMNGYEAA RRIRKSTHPQAGSIRIIAMSANTFSDDVHAALESGMNAHVGKPIDMDVLADVLSAVL >gi|157101645|gb|DS480679.1| GENE 29 27115 - 28347 788 410 aa, chain - ## HITS:1 COG:no KEGG:MTES_1114 NR:ns ## KEGG: MTES_1114 # Name: not_defined # Def: sugar ABC transporter periplasmic protein # Organism: M.testaceum # Pathway: not_defined # 14 206 35 230 421 68 25.0 5e-10 MEPKAQRDLPQGKTLTFFAPVEGKSSGAVSYRRLIDKYNKSHDVHVVFEGIATADGYNEY LEERLRTGKGDDIFIVNEDSVKTLAHNGYFQDLSSLEAFQKLNDSAREEAVIGDTVYCIP MNMTAYALFVNMDVLERYGLEAPDNLEEFKVCCTEIKALGGTPISLSRWHATAVPTIANG LYKLYDGPDIQKHLELLNSGEEQIGDYMVEGFEAFQTSVENGWYGDGVDGDAADALRAGE KDISDFTSGMTAFYFGPLEYIPLVEEANPPPDYHVQGIPVPGGTALLITVVSRLCVNPDS DNLDEAMEFVSYLSSEYYKESMENGTIILPVYKSSDFTLSNEKMRPAYDTYVSGIRIPAE DMHLKFGSWDVVRELCLEMFNGRTAVEATEEYNKIQLEQISAYDKQEKHP >gi|157101645|gb|DS480679.1| GENE 30 28466 - 32431 2275 1321 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 913 1307 267 654 676 187 35.0 2e-46 MSQNPFHGSSSELVESVLNNAPASIIVCSAESRVLLYANERARKLFLKEDYRPGITCYEA AGRCRPCPFCQYEEPNSSEFTVCNYMDPVTRRIYQVSSKMIDWEETRAHIEYIVDVTGAQ TKTERYRNRSEGPDRTLCSVPCGLCIYPNDCVHLETELKDTRMKMEHLVNSIPGGIVSYE IQGKKAVPDFVSDGLLALSGHTRSGYNELVKGNTIDIIYEADRGRVLSAARSAVESGRVM DISCRIRHKDGHLVWIHISGRRIGPKADTMRFYAVFTGMSEDAILFRSIANDTAERVYVI DKESYELIYTSESAGFLCQGADCIGQKCYAALYGKNQPCEFCTLKSHKPDGAAHGMDYHE NGRFYSTRFRETLWNGIPAYLKYVRDVTEEVKAQKEKEHIEQYFQTLVRKLPGGVAVVRI EKDGRKTPEYFSDGYAMLSGMSMDQVWKSYGEDGMAAVHPDDVEQLNAELSDFMASGEEQ 
REFTYRIKRGDGEYIWIKNMTSMLRRGEGEVILYASYRDLTVEFKEQEQIRRQYRELILQ HYRTPGPNALIVGHCNITRSQITEISDYTDSGLLECFGTERDAFFTGISTLILDEKERRD YMNRFLNEPTRAAFQAGKRELELDCFIRLPKDVRGRYVKFKVNLVEEPDTGDITGILTVY DITEQTIAERNLKKLSTSGYDLIADVDLFHDFCTILSGRFTKDDLCAKSGRHSERLAYMM ERQLVPKDRPRIMKMLEPEYILERLKQEDPYSLSYSILGEAGEIQTKKLTVSATDLRLGR VCLARADITDSVREQQGLLNVVAYTFEMLGIIHLGSRHITLYTHQAVLQVLEPRRASIDS WLADIKKKYAPKGGEEEVERCFGLQNMLARLEERPGGYDFVLPYLEENQIRYKQINILWG DGDHKMICIVRQDVTETMTAERRSKETLEQALALAEEANRAKSDFLSSMSHDIRTPMNAI MGMTTLARAHLDEREKVENCLRKITLSSRHLLSLINDILDMSKIEQSKITLNNDKVFLSD LLEQLYSIIGQQAQEAGLQFDVRTSNIIHPYFYGDALRINQILINILGNAVKFTPDGGTV TFLTEELPTIKGPGYVRYCFTIRDTGIGMPESFLSHLFEPFTRNRNTEHVEGSGLGLSIT KGLIELMGGELLVESRERAGTTFRVEMEYRIAPEDNRGESGDKTGVSTPSGADRLTGRCL LVAEDNEINAEVLSELLLLQGVRTVVKTNGVQVLNEFQSAAYGTYDAVLMDIQMPEMNGY EAARAIRNLEQKTGRHMPIIAMTANAFAEDIQEAMKAGMDAHVAKPIDMSALMKTLDKLL V >gi|157101645|gb|DS480679.1| GENE 31 32495 - 34195 1423 566 aa, chain - ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 5 563 7 526 555 216 29.0 8e-56 MLKSLELSKKLILGLGIMVAISAAMMIIAIVSLHDIGGLVNRLYQSPFTVSTQSIMLQKE IQNMGREIRGMVLYEDPSYFDSVLASSGRAKANLALVEKRFLGDQQLILDMYQSLDEIEA AGKEINRLVAGGKIEEAKNSADIDFRTAMKSGIETSQEIVDFALDKALEFNEDAGIALEN ATVMLIVLLVVMVVLCMGVTTVLSRAVSRPISQLTDAAKKLAAGALNIEIDYYSKDEVGT LAEMFREMSGSMKAVIKDIGQQLGAMSNGDFMVAPRAEYTGDYVSIKNALINIRESLSNT LNEINLSADQVFSGSAQVSDSAQTFSEGAADQAGSIEELAAAINEISFQVRETAANMEAA RRLTAKAGEQVAVSNRQMEEMLLAMGEIGAKTEQIRAINNTIEEIAFQTNILALNAAVEA AHAGESGKGFAVVAGEVRRLAGKSTDAAKRTSDLIDGTVQAVEKGRKIANITAESLHNVV ESTNEVLNTVDKIDEAAQHQAGSIVQVTQEIDQISYVVQNNSATSEESAAASEELSGQAQ MLKELVGRFKIDGSENVNQDHAYIYH >gi|157101645|gb|DS480679.1| GENE 32 34456 - 34704 188 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937191|ref|ZP_02084553.1| ## NR: gi|160937191|ref|ZP_02084553.1| hypothetical protein CLOBOL_02081 [Clostridium 
bolteae ATCC BAA-613] hypothetical protein CLOBOL_02081 [Clostridium bolteae ATCC BAA-613] # 1 82 1 82 82 168 100.0 1e-40 MSNLTNLWDYRDRGMEVFPVNFPSLDGRGYAVVRLDNLNWQGVRDVFYLNIQDLLKGTKS PEEVAKAIDETCNAALEQGRPK >gi|157101645|gb|DS480679.1| GENE 33 34697 - 34930 196 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937192|ref|ZP_02084554.1| ## NR: gi|160937192|ref|ZP_02084554.1| hypothetical protein CLOBOL_02082 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02082 [Clostridium bolteae ATCC BAA-613] # 1 77 1 77 77 140 100.0 3e-32 MIEEVGLDEFIPEQDTIAHWSTEDFNTVFAGLKESITDETTYVFMMYALNSQGDSHIMTL LQAMGGSLYDENGNLHV >gi|157101645|gb|DS480679.1| GENE 34 34930 - 35376 238 148 aa, chain - ## HITS:1 COG:PA3462_2 KEGG:ns NR:ns ## COG: PA3462_2 COG0642 # Protein_GI_number: 15598658 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 40 134 41 136 385 73 40.0 1e-13 MNQKKVKILPLSILIISVLCFLSVGYWIYQKRVRNVILQKTVASTTFLKDIINDVLDMSK IESGQLEPICRNIDLNGLLSEVETLLETQARHKQIDFTVNTQGIKQKWIAGDPIRIKQIL MNILGNAIKFTPEGGGQNRGACSSGRYH >gi|157101645|gb|DS480679.1| GENE 35 35879 - 37753 1617 624 aa, chain - ## HITS:1 COG:no KEGG:Elen_2576 NR:ns ## KEGG: Elen_2576 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: E.lenta # Pathway: not_defined # 35 587 39 595 614 190 27.0 2e-46 MKLGNYLCISALVLTVLTLSSCGKKHNEEIDGFAAPGTENDRVTVTLYGIQPLPHFEEAV ENKFPNIDLVQYLYTGDYAECEFTARIKSGDTCDIMMAKAGRIALLDTTESLMDLSTMSF PANYTKNALPQNERGEIYFVPGPLSFNAYIYNKTLFEEHGWEVPESFEEYLALCRQIDDS GIRGCEFPLGHANMPMYVFCVRLAMDYMTTVEGQTWYEDFLAGRASSQEPLEDTLDVYSR MMDEGIIRVEDLSLSSEERLELLRNRQLAMTTAEINTIKALNEEGTDEFAFYPQYGVADN QGWLLSLGYYFGAGKYLEQPENMAKREAVMDILEFISTDEGQNLLIEDNIGMVSTVKGAV IPADKVFDGIRRAVMQGNYVMRPVLGSLIPVLQPELTSYIRGDISKKEIMQHCREAMEGK VKEEVSIGIAAENFDILETAQLKADTMREVLGTDVALMGVSEAGGFDPVQGVLSKLYAGT IGEQDVKRIAQRNDTVPVYCWTAQVKGEKLLEILDYGAYSEKEREAEETEHFHPYAVAGM KVRYERNARIGARIAEAAMVDGSAIEPDKSYSVAYLEGVLEPEQVSEPKRTGRLLTEVLT 
EKIKAEGCVKPGSVSIYYTFDIKE >gi|157101645|gb|DS480679.1| GENE 36 37778 - 40090 1402 770 aa, chain - ## HITS:1 COG:SMb20356_1 KEGG:ns NR:ns ## COG: SMb20356_1 COG0642 # Protein_GI_number: 16264090 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Sinorhizobium meliloti # 229 485 389 645 667 170 40.0 8e-42 MKLFVKRENWDGGIWMKLPASEEQAEQVQKKLAGYHPSRMFPFRGDVEAPVAGLTHLLIG EIVDNVIYEDENRLIFFMPIEELKINGKEMKAIGVSYDTDKLFRVLQIEAFQGQSNLYVI HEDGTEVFSAVNSEGVKGYNLLNSLEKLTFKNGSPEEFRRGLETGSREDNLMTVEFEGKD YYINHTPITGENWILVSMIPVEAVSGKMEQYSYHAFFLMAGVCALIIIGFVIFYSDAAKQ ILRAEEKARKAAESANEAKTNFFTSMSHDIRTPMNAIIGMTEIAVNHLDDPNKTMNCLGQ IRRSGKLLVGLVNDILDMAKIESGKMEIHAERTSLAELIENIVVVITPLVKEKNQDFQIC IREVRHEHLMLDSVRMNQVLMNLISNAIKFTPEGGSIRVTVEESLSPHPGQAHFTFTVSD TGMGMSREFQNNIFTAFVRERDSRVDHIEGSGLGMAITGMIVANMGGSIRVESEPGMGSI FTVELDFCRAEPEMDSESEGIAISSVKCLLVEADPEICKCVCEFMNVLGLVPSTAGSGLE AVELMRTAREKGEDYQFVILDRGLAVYDACEAVGRLKELAGERQLTILLAAYDWADMETK ASEAGVDGFVRKPIFKSTLEQIVRQYVLHTEAVRRPECAETPILAGKRILLVEDNLINQE IVKELMEQTGAAVDAAYNGREGADRYASMEDGYYDLILMDIQMPVMNGYEATRLIRGMNR RDAGSIPIFAMTADAFAEDIAMARQAGMDRHLSKPIDAVSLMQEITRVLI >gi|157101645|gb|DS480679.1| GENE 37 40145 - 40606 430 153 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937197|ref|ZP_02084559.1| ## NR: gi|160937197|ref|ZP_02084559.1| hypothetical protein CLOBOL_02087 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02087 [Clostridium bolteae ATCC BAA-613] # 1 153 1 153 153 269 100.0 5e-71 MVSEELKKMFEGRIEMQDMHYVGKACYGRLEENLRGKIELGQGFMDSGYTRLTVSVLERT NGLVDQMKFLISDVTGLKQETEGERMAGPELRSYKDSVWWNCEMEEEDYQKIAEAVNGYL SLFQSEELVQGQKEGESQGQNEQVGVLPSNILT >gi|157101645|gb|DS480679.1| GENE 38 40764 - 42419 1406 551 aa, chain - ## HITS:1 COG:SA0057 KEGG:ns NR:ns ## COG: SA0057 COG1961 # Protein_GI_number: 15925764 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 30 481 3 450 542 119 26.0 1e-26 
MRRARTATVVKKNISVIPANPEYSRDTRAQYKRLRVAAYCRVSTLQEQQETSYEAQVNYY TEKINSNPEWVFAGIYADDGKSATMTRKRNDFQAMIDDCMAGKIDMVITKSISRFARNTV DSLTHIRKLKAKNIAVYFEKENINTLGDGGEMLITILSSQAQEESRNLSENVHWGYVRQF ENGVVYVNHNKFLGYTKDEEGNLIIVPDEAKLVRRIFRLYLEGNSINKIAETLTKEGIRT VTGNTVWHATVISKMLMNEKYMGDALLQKTYTVDFLEKKRAVNKGIVPQYYIEGNHEAII PRELFYQVQEERARRANVYRPAKKKGTTIRGKYSSKYVLADIMYCKECGQPYRRQVWVNG GTKKPVWRCYSRLKSGTKKCKHSPSLEEKALHEAIMEAVNSVVKDEGEFIDAFRDNVIRI LGSYSDNVEPTEYDEKIDALQKQMLALIEDSAKTESADEEFDRAYREIADQIRAFKKKRT ELIREKQLAEAYDQRVEGMGQYIQKTNYLKCQFDDELVRRLIKAIMVISEDKIEIQFHSG IVMSQRINEYD >gi|157101645|gb|DS480679.1| GENE 39 42343 - 43344 490 333 aa, chain - ## HITS:1 COG:no KEGG:Dtox_1525 NR:ns ## KEGG: Dtox_1525 # Name: not_defined # Def: recombinase # Organism: D.acetoxidans # Pathway: not_defined # 1 296 1 296 299 270 47.0 5e-71 MRQRHMPLGYRMAEGKIVIDPETAEIINRIFQEYSEGASLYQLARKLTEQGALNANHKPV WNHGTVGKILENRKYLGDEFYPTMIEPELFTKVQERRIEKSRELGRIMQPNSFRNRSMFA GSLVCGVCGQPYRRYVEHCKQPGEKIVWKCKHYINENRVCCRNIFLVDSQIIQAFMELMR GVKDGRLQIEPKTIMQGLTYSREAAELTRKLQELEEKPEYPSEEMVRLIYERARLQYRVS KIQDQNYQTEKLKTALEENLLQKEFDSELFKATIHRITVHKDRRFEFELMNGVKIELPIR EAAEGRKPDETGQNSNRCKEKYLCNSSQSGIQP >gi|157101645|gb|DS480679.1| GENE 40 43347 - 44954 1269 535 aa, chain - ## HITS:1 COG:SA0057 KEGG:ns NR:ns ## COG: SA0057 COG1961 # Protein_GI_number: 15925764 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 24 377 4 353 542 98 26.0 3e-20 MAKKVVKKITKIEPVSSGSAANVVEVKRVCAYCRVSTNSADQKNSFEAQVDYYTRLIGEK EGWVLAGIYADEARTGTKLYRRDDFQQMMQDCRQGRIDLILTKSVTRFARNTLDSIKAIR ELKALGIGVYFEKERVNTLMEKSEQMITILSSIAQGESESISTNNKWSVVRRFRNGTFII SSPPYGYENNEYGELVIKKEEAEIVRWIFDEYLGGKGSYTIAAELETAGIPTIRGAKEWQ DSVVKGILQNCAYEGDLLLQKTYTTDGVPFIRKNNRGELPQYLIADDHEPIITREEASAI RQIYEYRREQQCVEDTSVYQNRYAFSSRILCGECGTKFRRQKIYIGKPYEKIQWCCYQHI KDSKECSQKAVREDVIQAAFLRLWNRLASNYEEILIPMLAALKAIQGNPEQDREMQEIEQ NIQELKRQGYRLGRVLTEGSISSAIFIERQNQIEAQLEANRRRLRQLKGQKAFEWEISQT 
EYLISVFRNRPAILEAYDEELFLLLVERITVLPGRRLVFRLKNGLELEENEQEVS >gi|157101645|gb|DS480679.1| GENE 41 45032 - 45193 193 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937202|ref|ZP_02084564.1| ## NR: gi|160937202|ref|ZP_02084564.1| hypothetical protein CLOBOL_02092 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02092 [Clostridium bolteae ATCC BAA-613] # 1 53 1 53 53 85 100.0 1e-15 MGQKKETNEVRYSVARKLLNLMLENSFISDEEYKKIDTLNRETFSPELSKVYG >gi|157101645|gb|DS480679.1| GENE 42 45205 - 45615 337 136 aa, chain - ## HITS:1 COG:no KEGG:Closa_3172 NR:ns ## KEGG: Closa_3172 # Name: not_defined # Def: Resolvase domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 113 2 112 120 104 43.0 1e-21 MNEKKKAWVYTRIDAPEDRYGSLKRQDQKLTEYASRMGWEVVGHSQDTGSCRDMNRPGLQ ALTDAVSQGCVNILVVSGPDRLSRNAKDIIAYAEFLQNFDVEAYSPEQGKIEVLCTSVLK DSALAVPDDSPNLSIR >gi|157101645|gb|DS480679.1| GENE 43 45707 - 46051 266 114 aa, chain - ## HITS:1 COG:CAC0494 KEGG:ns NR:ns ## COG: CAC0494 COG2337 # Protein_GI_number: 15893785 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Clostridium acetobutylicum # 3 113 5 115 122 147 65.0 5e-36 MNILRGDLYYADLTPVTGSEQGGVRPVLMIQNDMGNRFSPTVIVAAVTSRQDKHPLPTHV PISPNHCGLKEHSVVLLEQIRTIDRVRLREYIGRLTEEDMEQVDHALRISIGLS >gi|157101645|gb|DS480679.1| GENE 44 46729 - 46989 57 86 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFSEVVGPFTFSFIGVKGLIQSPLFLENFISSDYDTLAVQRESESGIERYATTTCRISSI LSIRPYRRCRAVFSVLEMARGRAIPP >gi|157101645|gb|DS480679.1| GENE 45 46995 - 47405 389 136 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1859 NR:ns ## KEGG: LM5578_1859 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 136 1 136 137 116 52.0 2e-25 MAKERIYTIIVQRERVEVSREVYYAYHKAREAERYQNRVIRQIEMSLERFQEEGVNAEYR VVRSAPGIEEEMIRAEDNHRLYQALDELNVEERLLINALFFSGITEGDLAARLGVTQQAV SKRKKRLLRKLREKIE >gi|157101645|gb|DS480679.1| GENE 46 49398 - 49637 130 79 aa, chain - ## HITS:1 COG:no KEGG:Trebr_1033 NR:ns ## KEGG: 
Trebr_1033 # Name: not_defined # Def: MATE efflux family protein # Organism: T.brennaborense # Pathway: not_defined # 10 79 225 294 449 68 48.0 8e-11 MQWLLFGILSEKIRINAIRIEKTLFSGIIGVGFSAMLMQVTILVQQNVLYNLAAQHGGET WQIILGVAYRVVPFAFIPL >gi|157101645|gb|DS480679.1| GENE 47 50055 - 50198 65 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937213|ref|ZP_02084575.1| ## NR: gi|160937213|ref|ZP_02084575.1| hypothetical protein CLOBOL_02103 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02103 [Clostridium bolteae ATCC BAA-613] # 1 47 28 74 74 84 100.0 2e-15 MDKFVMTSAEDWQQDKNTTGCISVPLVSVIKISKIIESSVSYAHTKQ >gi|157101645|gb|DS480679.1| GENE 48 50316 - 50909 346 197 aa, chain - ## HITS:1 COG:MA4535 KEGG:ns NR:ns ## COG: MA4535 COG0500 # Protein_GI_number: 20093320 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 8 154 17 160 168 86 34.0 2e-17 MIPFSYDIMQPMYPLLIQQFLDDYNLSAGVALDIGTGPGNLAVELAKVTAMDLILVDIDG EALRTAQSRLFELGVDNRISTLCADVEKLPLRDNLADFIMCRGSIGFWPVPVQGITEIYR VLKPGGCAIVGVGAGRYMPETMRRRIYESMSASDRTTSPRQYTLEDYDAFARQAMLSDYR VIIEDDISKGCWLEVKK >gi|157101645|gb|DS480679.1| GENE 49 50906 - 51667 195 253 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 218 21 232 305 79 27 8e-14 MSGISFELEEHKVMTVLGPNGVGKTTLLKCIMDFLHWQQGETFIAGQPLRTYKEKELWQV ISYVPQAKRSVFSYRVLDMVVMGLNASQNFFKVPDQSQYDQAEELLNLLGCQKLIDRYCN QLSGGELQMVMIARALISRPQLLVLDEPESNLDMKNQLRILDAIERASREMNTACLINTH FPNHALNISDQTLMLGLGQKKLFGATKDLLNEENIERFFSVRSKIVSFQADGQEHQTIFP YRIASHSRGGGIT >gi|157101645|gb|DS480679.1| GENE 50 51715 - 52722 502 335 aa, chain - ## HITS:1 COG:MA1234 KEGG:ns NR:ns ## COG: MA1234 COG0609 # Protein_GI_number: 20090098 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: 
Methanosarcina acetivorans str.C2A # 12 333 26 341 343 260 49.0 3e-69 MVLLLAMALTGVFLILLCVGRYSVPPLEAFRIMGVALFRDGQALEALKAENPFGVSVILG SRLPRLLMGMIVGAGLSVAGATYQAIFGNPLVSPDILGVSSGAGFGASLGILLSTSTLII QGSALTFGLLAVLMVFLVSKVNGRTELFTLVLSGVIIQALFDALVSLIKYVSDPQEKLPA ITVWLMGSLASSSYKDIIVSCVIILPCIAVLLTLRWKMNLLSLDEEEARSLGVNVHRLRS VVIVLSTLMTGVTVSLCGVIGWIGLVIPHVGRMLVGNDHRALLPASALLGALYLLLIDTI ARTATAAEIPLSILTAIIGAPFFAWMLRKTSGGST >gi|157101645|gb|DS480679.1| GENE 51 52761 - 53255 190 164 aa, chain - ## HITS:1 COG:no KEGG:Clole_0369 NR:ns ## KEGG: Clole_0369 # Name: not_defined # Def: oxidoreductase/nitrogenase component 1 # Organism: C.lentocellum # Pathway: not_defined # 47 141 273 367 395 81 43.0 1e-14 MSWLAGSFSGRKPLQRFLIQLENLLSEVPVQEAEEKTVEKYEGLCALVIGEQLMANSVRE SLELDREFTSVDVCTLFDADQQLMRSGDQAHISEAQLEERIRSGRYDLVIGDGFFRELDW VDSPTGYVELPHVAVSSRLGWTSQVCPFGDKFLELLDAALQEGV >gi|157101645|gb|DS480679.1| GENE 52 53985 - 54689 218 234 aa, chain - ## HITS:1 COG:no KEGG:Swol_0407 NR:ns ## KEGG: Swol_0407 # Name: not_defined # Def: nitrogenase molybdenum-iron protein alpha and beta chains-like protein # Organism: S.wolfei # Pathway: not_defined # 1 228 203 431 435 125 33.0 1e-27 MNEVGFEKVLQLSSCQTFEQFMEMRHSSHNLLIKPFGRVACQEMERHLGIPWHKEFIRYH PEGIRRGYEGLETFLERKISWQSCYESCMDSISQWRRRLKGLRVGVGIGVNGSPFEIAMA LALCGVEISFLLSDAIMEYEWESIAQLKRSFPDVPVYTSYHPSMSMAEAVPEEADVTIGF SASYVCPSSKLFTLDADDQHYGFEAFTALLQGVEDALNSDISARALLYTKGLVV >gi|157101645|gb|DS480679.1| GENE 53 55307 - 56083 526 258 aa, chain - ## HITS:1 COG:MJ0879 KEGG:ns NR:ns ## COG: MJ0879 COG1348 # Protein_GI_number: 15669069 # Func_class: P Inorganic ion transport and metabolism # Function: Nitrogenase subunit NifH (ATPase) # Organism: Methanococcus jannaschii # 1 246 1 245 279 221 47.0 1e-57 MRKIAFYGKGGIGKSTLTSSIAAAIAGMDKRVMQIGCDPKADSTLNLRAGQELTSVMDVL QAYGGLCPSLDAISVKGYRGIVCVEAGGPTPGSGCAGRGIIKTFDTLDDFNAFQVYAPEY VFFDVLGDVVCGGFAVPIRQGYADEVVIVTSGEKMALYAAANIKKALDNFQERGYAKLRG IVLNCRNVPDEVAIVEDFVQRIGTEIIGVVPRDSDIQRAEEQNMTVVQMDSELPVSQTII DIAKRIMVPVEEQKEGAV 
>gi|157101645|gb|DS480679.1| GENE 54 56124 - 57245 679 373 aa, chain - ## HITS:1 COG:FN0305 KEGG:ns NR:ns ## COG: FN0305 COG0614 # Protein_GI_number: 19703650 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Fusobacterium nucleatum # 55 373 10 331 336 240 37.0 3e-63 MKTRVLSLLLACVLGLSLCACSGGGQMETSIAEDLFTQTIDKEDRSVQESDKIVITDQLG RTIEFDEPPQRLATTIMPFPYIFYAVMGNNDYMVGCNPSSVIAYEDSTLRHMYPTMAKVD TSWVDTSFVVNIEELLKLEPDVVFQWNYMTEEIEKMENAGLKVIALQYGTMDDLETWIRI ISSLYQQEERGEELIDYFYAQVAEVSDRVATLNTSEFRSILQLSDNLKVAGQGFGYYLWD NSGAINPAADLSGEELNVDMEQIYLWNPEIIYIGNFTDLQPSDLLENKLEGQDWSLVRAV KEGEVYKIPIGGYRWDPPGVETPLTIKWMAKIQHPELFADMDMETELRDFYQEMYGFTLT DEMVVEILGDTQN >gi|157101645|gb|DS480679.1| GENE 55 58698 - 59048 264 116 aa, chain - ## HITS:1 COG:no KEGG:BBIF_0517 NR:ns ## KEGG: BBIF_0517 # Name: not_defined # Def: hypothetical protein # Organism: B.bifidum # Pathway: not_defined # 7 112 7 112 114 102 45.0 5e-21 MDIVLWRMRIKDGKEERAQEWIAFLQEHQEEGNKTLKNEKEHLEIYFFNQENGAAYAYMF VLADDLDYAAKIAENSGNPLDAKHMEYMSVCVDLEDCTQLSPVLALGDFSVFHSKK >gi|157101645|gb|DS480679.1| GENE 56 59024 - 59266 118 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937224|ref|ZP_02084586.1| ## NR: gi|160937224|ref|ZP_02084586.1| hypothetical protein CLOBOL_02114 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02114 [Clostridium bolteae ATCC BAA-613] # 1 80 1 80 80 151 100.0 1e-35 MRLLLKNMGLYEYHIVHLAILSDALKVASSSEKLCSYMLRGLWKSAVNIVFHSEREENIS NKELKPIQKEVFPWILCYGV >gi|157101645|gb|DS480679.1| GENE 57 59619 - 61340 221 573 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 332 554 132 354 398 89 29 8e-17 MRELFKKHFALTDKGAKDLQKASAASFFAYVINFFPAMLLLLLVDELLLNNVKETGLYLW GSILVLAVMWILLRIEYDALYNTTYQESANLRTEIADILTKLPLSYFSRHDLSDLAQTIM ADVAAIEHAMSHAMAKAVGFLFFFPLLSVLLLLGNVKLGLAIILPILLGFGLLLLSKNLQ 
IRESFKHYQKLRDNSESFQEAIENQQEIKSFGLTQKIRQTLYQKMEESEKIHLRAEISAG IPMLCSNVILQFAFVLVILIGVQMLHTGEINILYFLGYVLASIKVRESVEAVSMNVAELY YLDSMCKRIREIRETKIQQGKDQTISSYDIEFDQVSFSYDKDTEVLKNISFAAKQNEVTA LVGVSGCGKTSILRLMSRLYDYDSGSIRIGGLDIKEISTKSLFEKIAIVFQDVTLFNASV LENIRIGKKTATDEEVIQAARLANCEEFIRRLPEGYKTMIGENGATLSGGERQRLSIARA FLKDSPIIILDEIAASLDVENEKKIQDSLNRLILDKTVVIISHRLKSVENADKIVVIDCG RVEASGTHLELLKTSPTYNNLVEKAKLTEEFQY >gi|157101645|gb|DS480679.1| GENE 58 61337 - 63082 220 581 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 351 552 294 502 563 89 31 1e-16 MFGVYKKLLYYVPKQRPLAYFAIGLTVVSTLLNVGAYYYLYEFLKRLVVDEDMGHAQYNA FLIAGLLIGGSLLYFAAVLLTHMLGFRLETNLRKRGIDGLTNASFRYFDLNSSGKTRKLI DDNAAQTHSIVAHLIPDNAGAILTPFLALVVGFLISLRVGIVLLVLFLLSGVLLVLMTGE KKFMEIYQKALETMSSETVEYIRGIQVLKIFGIDATSFKTLHRAITDYARHALNYSMSCR RPYVLFQLILFGFIAILVSVAALTLNRIQDPAVLAVELIMTLFLGGVLFTAFMKVMYVGM YAFMGTSAVEKLEQVFSDMQKDRLIFGNKSTFKNYDIEFERVCFGYTDQMVLEDVSFTLK EGRSYALVGSSGGGKSTIAKLISGFYKVDGGAVKIGGESIEAYTEDALIRNIAFVFQNVR LFRISIYENVRLANPDAQRTEVMEALYQAGCDSILDKFKDRENTVIGSKGVYLSGGEKQR IAIARAMLKNAKIVILDEASAAIDPENEHELQKAFAHLMKGKTVMMIAHRLSSIRNVDEI LVLEDGKIVERGTDAALMATDTKYREYQTLYEQANEWRVVR >gi|157101645|gb|DS480679.1| GENE 59 63368 - 64582 821 404 aa, chain + ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 5 395 4 381 391 95 23.0 2e-19 MYNAVGIDVSKGKSTIAVLQPGGTVIRKPFDVSHTSQNLNELADYLGSLEGPSKIVMECT GRYHEPMLKVLYEAGLFVSVVNPHLIKNFGNNSLRKVKSDPADARKIARYTLDNWAELRQ YSDMDDTRTQLKTLNSQFSFFMKQKVAAKANLIALLDNTYPGVNKLFDSPTREDGSEKWV DYAYSFWHVDCVRKIGLKTFTERYKAFCKKHHYIFQQDKPEKLFNASKELVAVFPKEKTY KLLIQQSIQQLNLASGHVERLRQEMNALASTLPEYNTVMGIYGVGKTYGPLLIAEIGNVS RFTHREALTAFAGVDPGVDESGQHKSKSNRASKVGSARLRKTLFQIMTTLLQNAPVDDPV YRFLDKKRAQGKPYYVYMTAGANKFLRIYYGKVKECLRNLEQAE >gi|157101645|gb|DS480679.1| GENE 60 64873 - 
65343 234 156 aa, chain - ## HITS:1 COG:FN1079 KEGG:ns NR:ns ## COG: FN1079 COG0783 # Protein_GI_number: 19704414 # Func_class: P Inorganic ion transport and metabolism # Function: DNA-binding ferritin-like protein (oxidative damage protectant) # Organism: Fusobacterium nucleatum # 19 156 7 144 144 112 43.0 2e-25 MYFKSKEAVILKKELATQVNAYLANIGVSYIKLHNLHWNVVGSQFKAAHEYLETLYDALA DVLDAIAELLKMNGEAPLASMKGYLSVATIPELESVELDVKSAMQTVLKDMESLRTQAFF IREQADKENQFDVVAHMEDDVANYNKTIWFIQSMLK >gi|157101645|gb|DS480679.1| GENE 61 66764 - 66865 65 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVFAMEQELINLLDSTLKCLDCKIKSNVVIIEV >gi|157101645|gb|DS480679.1| GENE 62 67084 - 68529 406 481 aa, chain + ## HITS:1 COG:AGpA139 KEGG:ns NR:ns ## COG: AGpA139 COG3464 # Protein_GI_number: 16119324 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 52 472 44 401 404 139 26.0 1e-32 MIDRILMTSVKLSSVSASALLKADAIQVSKSSICDLLKKMPVHVDKSSVINICVDDFAFR KRYSYGTVMVNLDTHRIIDIIDSRETKKVEEWLKSYPNLKVISRDGAQTYSSAASNAHPD AVQVSDRFHLLKNLSDAVEKYMHRLFPSRLIIPAITANPQMQALYDTRNRAERIVFAHRK RNEGYTINDIALLLHSATTTVREYLAIPENEIPEIQENARERQYIQGLENKQLAINEVRQ LYAGGHSIDEISRLTGHTTVTIKKYLKDDCPVGNGHYDSRRPGKLAPYESDVIKMRAQGI TYQKIHEYICQKGYDGTVDSLRVFMQKERTHQKRISAAEADAVEYVPRKCLCQLIYRKLE KANGITEEQYEAAVKKYPILGQLYDLLREFHRIMFSGKYDELDLWIETAQSLNVDEIDTY VNGLKSDIDAVKNAIKYKFNNGLAEGSVNKIKLIKRIMYGRNNFRLLKAKVLLNEYYYQI N >gi|157101645|gb|DS480679.1| GENE 63 68689 - 68835 145 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937230|ref|ZP_02084592.1| ## NR: gi|160937230|ref|ZP_02084592.1| hypothetical protein CLOBOL_02120 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02120 [Clostridium bolteae ATCC BAA-613] # 1 48 9 56 56 70 100.0 6e-11 MTFFGTTEWLGLNIGFWMAMAVVLLIVIVMNVIFWSMKPKADAKELYK >gi|157101645|gb|DS480679.1| GENE 64 68869 - 71634 1861 921 aa, chain - ## HITS:1 COG:L85514 KEGG:ns NR:ns ## COG: L85514 COG0474 # Protein_GI_number: 
15673239 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Lactococcus lactis # 10 921 4 910 910 1112 61.0 0 MNKKENRMAVRQTVEKTVIRDEQNRRMQFAATNPIKEVLKNLHTTLRGLDAGAVPVSRTK YGTNKVTHEKKKSLAKRLVSAFINPFTAILFCLALVSTITDMILPYFSLFGSVPDDFDCL TVVIILTMVFISGTLRFVQESRSGNAAEKLLAMITTTCTVTRRNQEKIEIPMDDLVVGDI VHLSAGDMIPADVRILDAKDLFVSQASLTGESEPIEKKPNVSAQKESVTDYTNIAFMGSN VISGSATAVVVCVGDHTLFGSMASAVAGEAVETSFTKGVNAVSWVLIRFMLVMVPLVFFV NGITKGDWLEAFLFGISIAVGLTPEMLPMIVTTCLAKGAVSMSKKQTIVKNLNSIQNFGA IDILCTDKTGTLTQDKVVLEYHLNVNGEDDTRVLRHAYLNSYFQTGYKNLMDLAIIHKTE EEEAADSRLIDLSENYVKVDEIPFDFTRRRLTTVVQDKNGKTQMVTKGAVEEMLSICAFA ECDGGVKPLTDDVRCRILETVDDLNDKGFRVLAIAQKSNPSPVGAFGVKDECDMVLIGYL AFLDPPKESTADAIKALKDHGVTTKILTGDNDKVTRTICKQVGMKVRNMLLGADLEHMSD TELARAAEFTDVFAKLTPDQKARVVSVLRENGHTVGFMGDGINDAAAMKTADIGISVDTA VDVAKESADIILLEKDLMVLEQGIIEGRKTYANMIKYIKMTASSNFGNMFSVLAASALLP FLPMMSVHLIFLNLIYDLSCTAIPWDNVDEEFIAKPRKWDASSVGSFMIWIGPTSSIFDF TTYIFMYFVFCPLFVSNGVLFNDLPAHFSGAELATLQKQYIGMFQAGWFVESMWSQTLVI HMIRTPKLPFIQSRASAPVTLLTMTGITVLTIIPFTVFGTMLGFVALPATYFAYLIPCIL LYMILATSLKKAYVRHFGELL >gi|157101645|gb|DS480679.1| GENE 65 71799 - 72110 193 103 aa, chain - ## HITS:1 COG:no KEGG:CD3374 NR:ns ## KEGG: CD3374 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 101 1 101 103 160 82.0 1e-38 MDFTCIFFGILFTIAGFVFACGKGHIHLSEWKNMPQEEKDKIKIIPLCRNIGEVIALNGI IFLMKGLWSGFPNHWFVCAMIAWLIVAGFDVWYIEKSNRYRRP >gi|157101645|gb|DS480679.1| GENE 66 72117 - 72269 106 50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFSKLRILMFQIWYQVTAVIWFILAVTVIILLIRLLRLNKLKNEVKHWKE >gi|157101645|gb|DS480679.1| GENE 67 72357 - 73034 300 225 aa, chain - ## HITS:1 COG:BMEII0053 KEGG:ns NR:ns ## COG: BMEII0053 COG1285 # Protein_GI_number: 17988397 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Brucella melitensis # 13 224 24 236 246 137 38.0 1e-32 MFYTDFILRISLSLVLGFLIGLERQLTGHPAGIRINVLICMGTSFFTLFPMLYGSDQVFR 
VGSSIISGVGFLCSGVIFKDSGTVRGMNTAATLWCTAAIGILASTGMYAMAITAAGILIA SNSILRPLARKMNPITAGDEFEKQYRISVICQEAAEQEIRLLLINSNSCKTLFLNNLESG DVVGDKVEIIAKYCSVGKPKNNVLEGIVGQALAIPEVVSAGWEVL >gi|157101645|gb|DS480679.1| GENE 68 73174 - 74097 300 307 aa, chain - ## HITS:1 COG:MA3366 KEGG:ns NR:ns ## COG: MA3366 COG0053 # Protein_GI_number: 20092180 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Methanosarcina acetivorans str.C2A # 8 305 3 303 341 207 39.0 3e-53 MEKKSNPHNTAQKAAMYVSSISIIVNLLLSLFKLIAGIIARSDAMISDAVHSASDVLSTI VVIVGSKISSKESDTEHPYGHERIECVSSIILSGMLLVTGIGIGIVGVKKIIAGSTGDDL TVPGILALMAAVVSIIVKEWMYWFTRSVAKKINSGSLMADAWHHRSDALSSIGSFAGILG ARLGYPILDSIASVIICVVIVKVSMDIFYDAINKMVDHSCNEATEDKIRSLIATIPGIRR IDLLHTRLFGMKIYVDIEIAVDENLRLKEAHHIAEQVHYSVENSFPEVKHCMVHVNPAST VPVGSTK >gi|157101645|gb|DS480679.1| GENE 69 74395 - 74550 94 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937236|ref|ZP_02084598.1| ## NR: gi|160937236|ref|ZP_02084598.1| hypothetical protein CLOBOL_02126 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02126 [Clostridium bolteae ATCC BAA-613] # 1 51 1 51 51 84 100.0 3e-15 MNMNAVGIDVSKGKSMVAILRPLNIILFEIMEKRSNTTLKISWGFILFIHV >gi|157101645|gb|DS480679.1| GENE 70 74714 - 75217 303 167 aa, chain - ## HITS:1 COG:BH0820 KEGG:ns NR:ns ## COG: BH0820 COG0745 # Protein_GI_number: 15613383 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 33 165 98 232 234 101 42.0 5e-22 MKKNISYIPMDSTLEIWLLEKLIKEPPKSDCVEDFLSGIRDMISGEVSSLILGASEETEL SASALSDGIIIGELEIHPKSRKVLQKGSEISLTPKEFDILYFLAQNRGEVFTKEQIYRAV WEDDYLLDDSNIMAFIRKLRKKIEPDPDAPKYILTIWGIGYKFNDQL >gi|157101645|gb|DS480679.1| GENE 71 75233 - 77893 1506 886 aa, chain - ## HITS:1 COG:L85514 KEGG:ns NR:ns ## COG: L85514 COG0474 # Protein_GI_number: 15673239 # Func_class: P Inorganic ion transport and 
metabolism # Function: Cation transport ATPase # Organism: Lactococcus lactis # 21 886 37 910 910 779 45.0 0 MSKATLFDSRIKKYAYCDPLEIYRDIGTSPDGLTIEQIESMREKYGKNSFNERKNDTMMK RLQRAFINPFHVILFVLGIVSLITDVFVVSNFARNATTAIIIFSMILISGVIRLIQELRA KSAAAQLDRLIHESVTVRRDGELMEIPTEELVVGDLVLFSAGDRVPADIRLTKNTDLFIS QAAITGESAILEKSCRKLSYGEQETLTQLENLAFMATTVISGRGEGIVLAVGKDTLYGSF TKPDPDDKNSFQKGANSIAWVMLRFMAVLVPIVFVILGITGGKWLESFAFALSVAVGLMP EMLPMVITACLAKGSLAMSKKQTIIKDINAMQGFGSMDVLCMDKTGTLTNESILLEYYMD ILGNENTEVLDLAYLNSAYHSGVRNTIDNAILACKTMPGRETHYSALLTQYQKADEIPFD YARKFISTLVRDIDGECQLIMKGDIGHIVSRCSHVEYRGEILPIEKDDMQSVSSVVDDML QEGMKVIAIARKNVGHQREITPADEFNMTLVGYLSFFDAPKQTAKESVAALKRLKVTPKI LTGDQAAIAVSVCHRVGISAEAVLTGAKLDEMTDSELRKAVEEIYVFAELTPGQKVRLVS ALQENGHSVGFLGDGVNDIPALNEANVGISVDTAVDSAKDAADVVLLQKDLNVLEQGVLE GRKTFTNMLKYIKITASSNFGNIFSIICASAFLPFLPMTSIQILLLNLLYDILCIVLPWD NVDEEETLSPRDWSGKTLGRFMMSFGPLSSLFDIATFLFLYYFLCPMLCGGATYLNITDP SLQLQYVSLFQTGWFLESMWTQVLILHFLRTPKVPFMQSRTSTPVICITLAGIVAFTAMT FTSGASLFGMTRLPLWYFAFLLFVALAYMLLTTVAKYFYKKYYELI >gi|157101645|gb|DS480679.1| GENE 72 79142 - 79279 176 45 aa, chain + ## HITS:1 COG:SA0102 KEGG:ns NR:ns ## COG: SA0102 COG4716 # Protein_GI_number: 15925810 # Func_class: S Function unknown # Function: Myosin-crossreactive antigen # Organism: Staphylococcus aureus N315 # 1 45 1 45 591 83 75.0 1e-16 MYYASGNYEAFARPRKPEGVESKSAYIIGTGLAALTAACYLVRDG >gi|157101645|gb|DS480679.1| GENE 73 79907 - 80206 158 99 aa, chain + ## HITS:1 COG:L180241 KEGG:ns NR:ns ## COG: L180241 COG4716 # Protein_GI_number: 15672937 # Func_class: S Function unknown # Function: Myosin-crossreactive antigen # Organism: Lactococcus lactis # 3 58 375 430 587 73 53.0 9e-14 MSGIVTGKDSNWLMSCTFNRQSQFRNQPKGQLVGWIYGLFSDKPGNFVKKKRSDCTGKGP YITAVCKTTDFLNAVDIIVISSHTNSLSTFAKDIFFISS >gi|157101645|gb|DS480679.1| GENE 74 80663 - 80782 61 39 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|317501057|ref|ZP_07959263.1| ## NR: gi|317501057|ref|ZP_07959263.1| hypothetical protein HMPREF1026_01206 [Lachnospiraceae bacterium 8_1_57FAA] 
hypothetical protein HMPREF1026_01206 [Lachnospiraceae bacterium 8_1_57FAA] # 1 39 108 146 210 63 74.0 5e-09 MPDFVTKVDFDWAIEEKAKKKKQDFTKIKFFTYDKGECV >gi|157101645|gb|DS480679.1| GENE 75 81068 - 82042 289 324 aa, chain + ## HITS:1 COG:BS_yxeI KEGG:ns NR:ns ## COG: BS_yxeI COG3049 # Protein_GI_number: 16081005 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Bacillus subtilis # 1 305 1 309 328 160 32.0 4e-39 MCTAATYKTRDFYFGRTLDYEISYGETVTVTPRNFKLMFRSMGTMDHHFAFMGMAHITEN YPLYYDAVNEKGLCIAGLNFVGNAHYREKEKGKDNIAQWEFIPWIMGQCATVNDARELLG RINFLAEPFSPHFPMAQLHWIISDKNSSITVESVKDGIFVYDNPVGVLTNNPPFPFQMFN LNNYMRLSIDTPKNCFSEQIRLNAYSRGMGAIGLPGDLSSMSRFVRVAFTKMNSLSENDE KSSVSQFFHILGSVEQQRGCCRLADGKYEISIYTSCCNADRGIYYYKTYDNYSIVGVDMN REDLNGEALISYKLMTDKALEIQN >gi|157101645|gb|DS480679.1| GENE 76 82955 - 84103 1122 382 aa, chain - ## HITS:1 COG:AGc5109 KEGG:ns NR:ns ## COG: AGc5109 COG1879 # Protein_GI_number: 15890064 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 30 381 32 357 357 99 29.0 1e-20 MKIRLLSTVICAAMVSTMVMGCSSGASAPATTAAAAAAPETEVKDTTAGDTETAKADAAV DYSDVTVEIVAKGFQHDFWKAVKMGSEQAAEDLGLKSTNFVGPASESAVAEQIEQLNNAV NKAPSAICLAALDTQAALDAISNAQAANIPIVGFDSGVPDAPAGAIVANAATDNYAAGGL AAEKLYEMIKDKVTDPAEEVRIGVVSQDATSQSIGERTGGFIDRMRTLVGEDKCMVVGHD KYNKAVDEAKVIIDVGIPATVDDAACVTVANTLLNKNDLIAIYASNEFTAKNLVTANESL QVLGKDKVIGVGFDSGAIQLDAIKSGVLAGSITQNPVQIGYQAVTLAAKAAKGEPVEDVD TGALWYDSTNMDSAEVAPCLYK >gi|157101645|gb|DS480679.1| GENE 77 84151 - 85167 856 338 aa, chain - ## HITS:1 COG:AGc5111 KEGG:ns NR:ns ## COG: AGc5111 COG1172 # Protein_GI_number: 15890065 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 22 334 26 344 344 182 40.0 1e-45 
MSNNGNQVKKEPNGLFKAIGTQKLVAVIALIVLFMFFWIFGDNFAKYSTLVSIFDSSYYI GFMAIGVTFVIITGGIDLSIGTVCICCSLISGTLYTKLGLPMPLCVILAILMGGLFGLAN GLMVSVMRLPAFIATLGTMMITRGLGSIVTNTATVTFPQASAPGGSFRGIFKLKGAGLPQ AGIPTGFILLIILAILMAILLNKARPGRYILSLGSNKEATRLSGVNVVKWETLAYVISGL FAGLAGVSYAAVYSTLMPGTGNGFELDAIAGVVIGGTSLAGGVGSISGTLIGVFIMSVLK TGLPFVGVQPHYQLLITGFVLIIAVFVDVLNRRKQGKN >gi|157101645|gb|DS480679.1| GENE 78 85184 - 86692 1640 502 aa, chain - ## HITS:1 COG:AGc5112 KEGG:ns NR:ns ## COG: AGc5112 COG1129 # Protein_GI_number: 15890066 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 9 499 24 514 521 469 49.0 1e-132 MEGDDVGEVILTMKDIDKSFVGVHALKNACLELKKGEVHALMGENGAGKSTLMKILTGIY SKDSGTIVYEGRQVEFKSPREAQDAGIVIVHQELNMMNHLTVAQNIFIGKEKMNGKLIND RRMVEEAEKLFGLLNIRIDPRETMGRLTVGRQQMCEIAKAISTDAKIIVFDEPTAALTDS EISELFKIIRDLRAKDIGIVYISHRMDEINQITDRVTVMRDGEYVGTLITRDSTKDDIIQ MMVGRTVYEEPKSFSNVKPSAPVVLKVEHLNVGKTVKDVSFELRKGEILGFSGLMGAGRT ETARAIFGADRRDSGDIYVNGKRVDIKNPQDAVDAGIGYLSEDRKRYGVIIEKTVAENTT MASLKAFCKGLFIDKNAEKETAEQYVEALKTKTPSVEQQVRNLSGGNQQKVVIAKWLTRN CDILIFDEPTRGIDVGAKSEIYTLMNNLVKQGKSIIMISSELTEVLRMSDRILIMCEGRK TGEISIEEATQEKIMHAATLRD >gi|157101645|gb|DS480679.1| GENE 79 86728 - 87177 408 149 aa, chain - ## HITS:1 COG:SP2165 KEGG:ns NR:ns ## COG: SP2165 COG4154 # Protein_GI_number: 15901975 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose dissimilation pathway protein FucU # Organism: Streptococcus pneumoniae TIGR4 # 1 147 1 143 147 129 45.0 2e-30 MLRGIPHILSPELLKILCEMGHSDRIVIADGNFPAESMGRDAHVIHMEGHGVPEILEAIL SLFPLDTYVEHPVCLMSVTDGDTVETPIWDQYMTLVEACDSRGRSAIGHIERFAFYEEAK KAYAIIATSEKALYANIMLQKGVVTDQGL >gi|157101645|gb|DS480679.1| GENE 80 87171 - 87875 532 234 aa, chain - ## HITS:1 COG:TVN1450 KEGG:ns NR:ns ## COG: TVN1450 COG0235 # Protein_GI_number: 13542281 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related 
epimerases and aldolases # Organism: Thermoplasma volcanium # 5 214 2 209 218 115 30.0 7e-26 MELSYMELREQICDVCHKMWQLGWVAANDGNVSARLDDGTFLATPTGMSKSFITPEKLVR INGKGEVLEGLPGYRPSSEIKMHLRCYKEREDVNSVLHAHPPVATGYAVANVPLDEYSMI ETVIALGSIPVTPYGTPSTYEVPDNIAPYLGEHDAMLLQNHGALTVGADVITAYYRMETL ELFAKISLNARMLGGAQEISRENIDRLISMRKGYGVTGRHPGYKKYSKQEENRC >gi|157101645|gb|DS480679.1| GENE 81 87912 - 89381 923 489 aa, chain - ## HITS:1 COG:TM1073 KEGG:ns NR:ns ## COG: TM1073 COG1070 # Protein_GI_number: 15643831 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Thermotoga maritima # 6 486 5 470 476 370 41.0 1e-102 MSHKNVLAVDFGASSGRVMLGNFDGNRLSIQEIHRFPNDPVILNGTMYWDFLRLFYEVKQ GLMKAKKYGKIDSIGVDTWGVDFGLIDKEGNLLENPVHYRDARTAGMLEKSFGKMDKDRF YQITGSQFMEINTVFQLLALLEKRPGLLEGTSRLLLMPDLFNYYLCGAQVSEYSIASTTQ MLDARTKMWSKEVLESLGIPERILGGIVPCGTRLGFIKNGICQELGIEPMEVIAVAGHDT QSALAAVPARDKDFIFISCGTWSLFGTELDEPVINGVSERLNITNEGGCQGKISFLKNII GLWLIQESRRQWIREGKEYSFGHLEQMAADAKPFKALIDPDAPEFVPAGNIPERIRSYCR KSGQKVPENEAELVRCIDESLALKYRMTLEEIKMCTGREYPAIHMVGGGTQSGLLCQMTA NACHCPVIAGPVEATVLGNAAMQLLATGELRDLDQVRSVVGASAKVSIYEPKDTDQWEQV YGQYKKMLE >gi|157101645|gb|DS480679.1| GENE 82 89394 - 91187 1122 597 aa, chain - ## HITS:1 COG:SP2158 KEGG:ns NR:ns ## COG: SP2158 COG2407 # Protein_GI_number: 15901968 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 10 597 4 588 588 798 64.0 0 MAASRLVGSYPVIGIRPTIDGRQGYMKVRESLEEQTMNMAKSAAALFRENLFYSNGEPVN VVIADSTIGRVGESAACADKFRRAGVDITLTVTPCWCYGAETMDMDPHTIKGVWGFNGTE RPGAVYLASVLATHAQKGLPAFGIYGHDVQDADDTSIPEDVKEKLLRFGRAAVAAATMRG KSYLQIGSICMGIGGSIIDPAFIEEYLGMRVESVDEVELIRRMSEEIYDKKEFEKALAWT KANCKEGFDKNPPEVQKTREQKDRDWEFVVKMMCIIKDLMNGNPNLPEGREEERIGHNAI AAGFQGQRQWTDFYPNGDFAESMLNSSFDWNGAREPYILATENDTLNGLGMLFMKLLTNR AQIFADVRTYWSPEAVKKASGYDLEGVAKEAGGFLHLINSGAACLDATGKALDEDGQPVM KPWYEVTADDQKAMLEATTWNEADFGYFRGGGYSSRYVTEAQMPCTMIRLNLVKGLGPML 
QIAEGWTVKLPDDVTDILWKRTDYTWPCTWFAPRVTGRGAFASAYDVMNNWGANHGAISY GHIGADLITLCSILRIPVSMHNVPEEKIFRPAAWNAFGMDKEGQDFRACAAYGPLYK >gi|157101645|gb|DS480679.1| GENE 83 91399 - 92250 312 283 aa, chain + ## HITS:1 COG:BH0724 KEGG:ns NR:ns ## COG: BH0724 COG2207 # Protein_GI_number: 15613287 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 33 278 40 288 298 99 26.0 7e-21 MHYLDYNEKKQHGTMDFPIEYYHVDEHHPRYNMPFHCHKELEIIRILEGILFLKLDDEEF EAKAGDIIFINEGVIHGGLPHNCIYECIVFDIQRLLMHTDTCRFYIRLITKHQIIVQNHF TKNNHSLHRIINHLFISMQHSEPGSELITMGTLFELFGIIYQKKLYQDKIDATKSSRKMM QLKPVLEYIDSNYGQTITLNELSRIAGMSPKYFCSYFHSIIHRTPIDYLNYYRIERACSE LSTSDLTLTEVAYRCGFSDVCYFIKTFKRYKGITPHRYCMNIG >gi|157101645|gb|DS480679.1| GENE 84 92371 - 92682 164 103 aa, chain - ## HITS:1 COG:FN1403 KEGG:ns NR:ns ## COG: FN1403 COG2056 # Protein_GI_number: 19704735 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Fusobacterium nucleatum # 1 102 333 434 436 111 55.0 4e-25 MLPVGLVIVMGTGTSFGTVPVLAVLYVPLCSERGFGLGATTLLVAAAATIGDTSFHVSDI TLAPTSGLNVDGQHDYIWDTCVIPLIALDIPMFILACLCPLFF >gi|157101645|gb|DS480679.1| GENE 85 94991 - 95392 213 133 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1620 NR:ns ## KEGG: Cthe_1620 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 1 132 1 132 134 178 61.0 6e-44 MRVLIDTNVLISAALSANGIPFQAYVKAASYPNHGLICEQNVDEMKRIFNKKFPNRLASL DKFLSVALLTLELVPIPTEKDSSEIQIRDRNDRPILRAAIEAKADVLLTGDKDFLESGVK NPMIMTPADFLQY >gi|157101645|gb|DS480679.1| GENE 86 95389 - 95664 287 91 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1619 NR:ns ## KEGG: Cthe_1619 # Name: not_defined # Def: AbrB family transcriptional regulator # Organism: C.thermocellum # Pathway: not_defined # 7 89 6 88 90 93 61.0 3e-18 MLANTFVDNAKVMAKGQVTIPKDIREVLGVASGDRITFIVEGNTVRIVNSAVYAMQVLQK EMAGEAERSGLTSDDDVMAFVEELRNEDENA >gi|157101645|gb|DS480679.1| GENE 87 96274 - 97953 479 559 aa, chain - ## HITS:1 COG:no 
KEGG:Closa_0537 NR:ns ## KEGG: Closa_0537 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 14 506 13 507 514 300 36.0 1e-79 MKDLMKRWKLANEVIACITTVIIISILPLVFHDYYFDILDTKYYFYCGTVICMAAVMLLL TLIILRLNKGTIVMTPKLRKSDWAMIGFLFSIVLSCIWSDYRFEAFWGTEGRLMGTFLYL ICGISFFTLGHCLKFKRWYLEAFLVTGMVVCSIGIMQYFLIDPFGFKKDIHASHYTSFIS TVGNINTYASYVALIFGMSTTLFVRERNMARKSWYVFTVILSLFALVTGNSDNAYLTIGV IVLFLPLYAFRDMYGLKQYMLLISMISTEFMAIGEIDKRYASHVLPIGGLFRLIVGSDWL IFVVAALWGCTAFIYILERFVRKKYKDECGICICRRIWGTVLILAIAVLIYMTFDATRMG NGQQYGILKDYLLINDEWGTHRGYIWRIGIECYMKYSPLRKLFGSGPDTFGIVTQSYYRE MIERYYEIFDSAHNEYLQYLITIGIVGLFSYLFLLVSSIIEMVKASKREDSVMSIVYAVS CYAAQAVVNIGVPIVFPVMFTLLSVGISVQNCTDKTSEVKVMRGTKRENKNNKYLYKDKV QFTRDYTGEVSDGRGSTPS >gi|157101645|gb|DS480679.1| GENE 88 98146 - 98331 64 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDIKCQYCGKNLATIEGTGRDTFTSKIVFTCRFCGRTTSLNIIRNKRRKNEKERNRAAIE L >gi|157101645|gb|DS480679.1| GENE 89 98592 - 98747 56 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937268|ref|ZP_02084630.1| ## NR: gi|160937268|ref|ZP_02084630.1| hypothetical protein CLOBOL_02158 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02158 [Clostridium bolteae ATCC BAA-613] # 1 51 1 51 51 77 100.0 3e-13 MWGGTFAVAERKGRQRRNPRTGEVVEIKPRRVPVFRAGSILKREVIQGNQG >gi|157101645|gb|DS480679.1| GENE 90 98872 - 99300 128 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937269|ref|ZP_02084631.1| ## NR: gi|160937269|ref|ZP_02084631.1| hypothetical protein CLOBOL_02159 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02159 [Clostridium bolteae ATCC BAA-613] # 1 142 1 142 142 281 100.0 2e-74 MRWKSTGEAEINIGVLGSRNDVLWEFRDFLICLDAESDYKISMMLAKNRETLLRVMSFVP GGFDIIIVTDHLPGFVYLEMAEMAFAFNAEVKILFQMDRGMEFSDKLYPHALFVPGREQL KSLAKEKMDDMITKNSEKGAAK >gi|157101645|gb|DS480679.1| GENE 91 99467 - 99931 133 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937270|ref|ZP_02084632.1| ## NR: gi|160937270|ref|ZP_02084632.1| hypothetical protein CLOBOL_02160 
[Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02160 [Clostridium bolteae ATCC BAA-613] # 1 154 1 154 154 252 100.0 5e-66 MRKDRFNICISFYAVLAFVLAILGQTLLCGMLLGFVIVIQKDEWLTKQVMQAFFLALVES IISCVTNIFSGLYAIPILGIAFSGIVGLISGILSFLLLIIGLIAIGKVSKDSDARIPIAS KLANKAFGLTCTIQYKPASDASVQDPDTSNDHKD >gi|157101645|gb|DS480679.1| GENE 92 100062 - 100688 431 208 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937271|ref|ZP_02084633.1| ## NR: gi|160937271|ref|ZP_02084633.1| hypothetical protein CLOBOL_02161 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02161 [Clostridium bolteae ATCC BAA-613] # 1 208 1 208 208 404 100.0 1e-111 MLLCDEEYEAMTDERNMMRDEVIDMEESDFYTLANTDRMGIFLICIKNKQGELSGDIVNF YLKEPVAFTGILDLGLKIDTLCAKFQRPMATTEPRFLTAHMKRAYEKRELDKEHIPMKYC TKIQNMASFMALGISACETLIIEIHQRQFSSMQGRIKGKCSGGNGVSFRSALELMRMLRE YEYMRNEKNKNGGSNGKTRKEQETVETP >gi|157101645|gb|DS480679.1| GENE 93 100746 - 105212 2461 1488 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1334 1483 473 619 621 78 33.0 9e-14 MKRTNKIKAYLARVLAAAMVITMAAPPAPAYALNYGDPAVIRFDPKEGPDLSHSNYMNVP RDGNRITYATGRAGHPLAEASDFNGIAVENLGSGDRPVLPAFDADLSWPGYTFDGWYNAD GNKILYLPFAFPYNSTTAYEARWNGDASSQFDFTVMHYRDLNEDRNGNMNGEDPNAWPEA DDSQIYKFFDDGSWTTRVTANTAVSATYKRDIPGYKVSSVIIKNNKTRRFDDASGHGTLG EAATINESTRSVRGNMPNDDLTVAYRYEPDSSKKFALRVEYADGSGRAIRTPESYLYSAE SQVSAAPAQITAYTLTGAQIKTGSGDIDDLSGRGIYSAQTAGCTFDAQKNFTGKMPNQPV TVVYTYEIDPSYVTHVTINRIDNHNNVLAEPETREVSPNETVTVNVERKAGYTYPPNIGW NGTFTDISMDQTAATLSFKTDFTGGTVTITYNEDLNDTAHWARINYYNSEHGSLSGDSSP RSLRLGTHSIDTITEGITPSPEQHYMFNGWYKANASGTGKVGTVLTGDIELTGDLKLYAD FVEDPGQWCDIRFESGSRGSISGTNSMHVPKGTQWSQISLPQTTPDSNYMFAGWFDENGN QVTDPNMTILADQTYRARFTPVGGDDGILCIPDGTGTIGNDGMGKIEISGANEARKYALT DSDGRLLAVMTGEQLGSSCFEQLPPCDSYYVYEMAESAAPVVGDILTDSVDPSLISQPAR VMVPAIGGNYSTADDTTEGLRKIVIRPAGENTVYAVLDMDGNVVSQDGNEDGWVSPSGSP 
RTAELAGLEPNVSYIIVAKQADNGDAPSDRMINGSQVMVTGTSQQNRVYTFRLLNGGYVE SVTRNGEALDVEEHADEVLVKAGDQIRISAEDMNPAGQAFKQWEVQIGNLHMSYLTRKNQ TVEMTAGDVILQAMYEPSPVATASNATVDYSPKNGIFALDKSEEALQELKEELVDNDEDS TALSNGQRIAYTVKFDRHAPVASASEAVRQEVDDGESVKTPWSLDIGLTRKVDGTNKPLV EDANLTPAIKVFGKLDTSLLGNLEYRLWKINFGEDDGDTTCEEVLMTPDPNENEDFTGSF AFEANIGDTLVFSYYKAYEVIIMDTGRGQVHTFKVRDGRSLDDTEDYMDLDIHEGYTDSV TGIAYEFTGLSKRQSGGGMYDTSDPVTRDLKLYAVYEPEDDTQWQEAKERLQEEINIANA LKNNGSVSAEDRDALTEAVDEAVEVLNRLPRPSVDDLESAFDTLKALVDSISSGGSGGSG GSSGAGGSGGSGGSSGSGGSGGSGGSGGSGGSKGFSGGGHGPGSDGTENSYRTYLDGTEG IWNNVDQTNHKWAFVLNSGTRIKDSWANIRYTHNGSSQIATYHFDREGIMDSGWFLDQDT DNWYFLSNVHDGWFGRMTKGWHYDDDDGRWYYLSPFTGVMLMGWQKIDGIWYYLTADSQQ KTWTFNEGSKRWEYTNQNGRPLGSLYINEMTPDGYQVDENGAWTRETP >gi|157101645|gb|DS480679.1| GENE 94 105224 - 111511 4464 2095 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1918 2092 536 741 744 69 28.0 6e-11 MNSGIYTFYKRLTAMLLAICMIAGLVMGQPNILSYASTELQNEDLGDGASEATAYTWKNG SVTGQGGGGNSWRFDLRGLQAGQHNYAQAGIKTTYSNGGYATWFQVGTNSKQLIGGNTNG GVQSLDTYGIEVKIAVSPSPDNKYVFVDYYVYDKNGQGGSTGRTIKMGTGTDVMIGGTQE DDYATVYKNDRGFHMVNQHVKTTFDCITNDSSLGVTPPDTRWIGHYNSWGSNVFNESSND VVSGTDSGMAYSWEFQLHPYETVHRRVAFAIRDTSYYVSDQYGQDSTSAEGTYSSPFKTI EYALNKIGNNKGYIYVMDYPEISSAIDVTGNSQKDITIASTDYDHEGHPMNEDGDYIKTL TRASGYTGPLFNVSGPTLKFTDIVLDGNHAESQDSLISASSGKLEINSGAVITNCSGSES SQGSAVNVTGSAGLSMNFGTVSGNVSAGKGAVYYNGSGAFEIRNRNQISDNTTPSGKKAN VYLAQDKYITVMSDLDTSQIGVTAEQLPLASPGGISSQPSQEVKIAVPSSSYPGAAGSCP FADNFKADQEAGNSGVYVSAGTEILGNGRNAVLKRNGYTVSFIYRDSATGGTVNGAPASI PLSYAGGDAVEVNAPAAITGYGLTGVTIDQGTGGTLSAQADAGVADFGKVTGTMPGQDVV ITYEYTRNSGSIVFEANGGTPEPQALTGSVGGSVNALLPNISRYGYQFVGWSTVNDRVNP DLISGLPSEFPEDPVTYYAIFSPDSNVKFDYTVDYVNADGSIVFQSTTTEDAYSVEAEVH SGKKNIHGYTWSLADSSTNPAIYNYGDGHGPVPFGNFNGGTGDFSGKMPGQDAAVKYGYQ VDRSNPNAKSAFTVKYVTENGTVVHTADIQDVFPEDAIAAVPADVYGFRYLSGSITAGST 
ADDTDGHLVSAVQGGFDSDGRFTGTMPNQPVEITYLYEATEEGYEYKIHYLDNGTQDERL RNITGPDIQNVTADTPVTAEFKNMYGYVFQDERSEPASAGTFDSSHNFTGTMPNDRLAVT YRYDRDPSKWANLIYKAGEHGVLRPGNGMSSDVVILSGGAYRVSVLINDGTVEGTAGSYT WNDILEKRLVPEAVADTYYRFEGWFIDQNGNGIKDSGEELLSADSRFTGSAAVTAYFAED PDQWVDIHFAAGEHGTIDAGENVNLHIQYDRTWADTAGNRPAYTPEINYLVDGWYDGGVS VEDDSRLVNGNTYTIRFYPDPAVFGTDVAARDASAGIDTQGKGRITVYGTTQGYQYIITD MEGNIIDVVRGNISGRVYFEDLYPGTRYLIYEATGTTQAQTGRPIDGVTGIISSATEVLV PVVETNYQVLYDEEHEGKTVFVIKPADTDSDYAILDKDGHVVITPETGDGGWQSAGGSQP ASLTFSGLDYNEEYVVVARPHGSSGITAESKLEDGTHISTDPGGELDIPNYIVEAINGQV HSVDGVELDIPRYDEVHKGDEVVLHAEETDGNGQNFLYWKVTIGAVPGMTETIRRQDVTF TMPDSNVVMTAYYERASATPSNATVTDEVRGGNKHEMALDPNEIENLEEELTTDADRTLM DVNHADVTYKVVYKKNVVKATESNAIKRSSYYDSDHEEAYHGAWGLNVDIERYVNGRRVG VASPSEATFNTYVQLDKEDVDMMDYQLFAISEDDDGELIIDSVTMSDDPEETGGLFTFTA QAGMRYVLIYSRAYRIYFINNKEEPKYRYYFKVRRGETPNSGDYSFEYSQVETPIDSFID NEGVEFNYIGWSYREDRLKEFDPDKEIKRKTYVYAFYDDNSGEVNDARKQLEDAIKAAIE KSDDYFLTLKETEKIREAIEEAMDVFDRTGPRATLDELLEALNRLEETCKPFDKILEDRY DHYDKLQNGGNSGGTSGGDGWGSGSHGGSGGSKGGSKGGGGSSRSAGTGGPVTTPAPYVA ETSKSYVVGTNGNWELVDPESDKWAFVLNGGIRLTSIWAKLDYANGDVNRNGWYHFNASG LMDYGWFRDEHMNWYYCNAKKDGWLGKMKTGWHYDEVDKHWYYLDLITGQMVMGWKEIEG KWYFFTPQNTAQTYSYDRSTGKWVFMQNQERPLGSMYSNEKTPDGYQVGADGALQ >gi|157101645|gb|DS480679.1| GENE 95 111547 - 113199 1290 550 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 29 184 545 711 744 61 27.0 6e-09 MRKSMTAIMALTLSVGFSAALGMGSVQAAEGWTKINEEWRYYDQNDQLVTLTWKKSKDYW YYLSEDGTILKHCVFRDKDSIYYVDEEGKMVQDTWIYADAGSVTIDGVEEGWYYFGADGK GYRRKNSSFKRNIGGSIYIFDENGKMLSGWFDEEGNPIEDSDTPFVEGVYYAKEDGRLLT GEWLDYGEIDEGIGGSDLDSEVAGRNYTDYDKMWIFFDSHSKKVKSNGDRLRQKTINGAK YGFDENGIMIPWWSKVASISNADKSNPSSDVSARYYSGYDGGALLKDSWFWMYPSENLDT EDYYDQECSWWHTDERGEVYRNRIRKINGRSYAFDGIGRMQTGFVLFDGKSEFVARYEVD DWSSEDFIEGTIYGIEKSDLYLFSPDELNDGSMQIGKDIAVELEDGVFHFGFASNGKAFG 
NRNRLQKKEGMYYINGLRLEADEEYGYGVVKVEDGADTYYQVVDTRGKVIEGKKKIVKDK EGGYLLIIRNRLVAWCGDEDKPRWRKGEEGTGFYHYDRGNKEDHFAGGLIAAAGMEPDID GLPEEERLNF >gi|157101645|gb|DS480679.1| GENE 96 113327 - 113959 365 210 aa, chain - ## HITS:1 COG:no KEGG:Closa_0878 NR:ns ## KEGG: Closa_0878 # Name: not_defined # Def: 3D domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 118 209 148 239 240 98 50.0 1e-19 MFAVIDENEAREAYIGELDVAAYCGCQYCIGRSDTYVTWSGRLPTQGTTVAADLNGFEIG DVLRIGSDIYRVEDKVSPGAREALCLYFSNHADAIAFGRQILPVFKIKSREEHLGEPLGI FEVTGYCGCEKCCGLKGGLTKAEKIPKAGHTIAADPEVLPMESKVEINDIVYVVEDTGKL VKGNVIDIYFNTHEEAVRFGRQKINVYLVP >gi|157101645|gb|DS480679.1| GENE 97 113991 - 114428 136 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937276|ref|ZP_02084638.1| ## NR: gi|160937276|ref|ZP_02084638.1| hypothetical protein CLOBOL_02166 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02166 [Clostridium bolteae ATCC BAA-613] # 69 145 1 77 77 142 100.0 1e-32 MEALRYYMGQECGATPMETGAGGENGGGPHAQDFPVENGGLSHRRVSPKRQQMTEFRRAK KRRRIRRKMIRYLTVAGLTIVVGIGTYFFASEGASVQDYIQVPPLTVVLESTEALGGSEK VTENGSTCRCTQGRCRSRRCTAVGR >gi|157101645|gb|DS480679.1| GENE 98 114320 - 115108 290 262 aa, chain - ## HITS:1 COG:no KEGG:PPE_04051 NR:ns ## KEGG: PPE_04051 # Name: not_defined # Def: uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain # Organism: P.polymyxa # Pathway: not_defined # 130 236 125 233 375 70 38.0 8e-11 MKRVYGPCLLILFLCLAVIWSLPACGMNAQAQTQMKDIEFEKRGRYAEEMTGRIPDYEVC GDKMLELIENSEELSEWKDSGVVFGTVEDALSFGRYFYRYIYLGKQEIRLAVGEKGDRAV VYVQCDDPHQAAIQHRQTEDRLYDIVAEGAGMAEREKASFFYEWVYSKVEYDTELKRKTV YEAVMEGRSVCWGYVSAYLMLCRMAGMDCEPVYGGGHAWNRVWIDGGWKHCDITWDKSAG LRRWKLVPEAKMEEDPMHRTFL >gi|157101645|gb|DS480679.1| GENE 99 115371 - 116261 201 296 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 30 
282 14 262 265 101 25.0 2e-21 MKRNGRLVKRIVTQESVNRYCEWLYGCEKSSGTIKQYKHYLLLFMQYMNGKSVEKRDVIM WKGILRERMAPVTVNSALAAVNGFLTHNAWQDCRTKFLKVSRRVFCPESREIDRDEYKRL VKCAYKKGDERMAMLLQTICATGIRVSEVPYITIEAVKQGRAEVECKGRIRTVFLTSRLC YMLLDYAKKSHIDSGMIFVTRSGKALDRSNIWRNMKKLCEGADVLWDKVFPHNFRHLFAR LYYEQEKNLVRLADILGHSNINTTRIYTMESGRNHMRQLEKLEVLFDFYNKFSLLL >gi|157101645|gb|DS480679.1| GENE 100 116429 - 116569 64 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MINNVSKNQGSSVKVRVLSDFFVYRARYNSMNYFGDFVATFILIKD >gi|157101645|gb|DS480679.1| GENE 101 116846 - 116992 63 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937279|ref|ZP_02084641.1| ## NR: gi|160937279|ref|ZP_02084641.1| hypothetical protein CLOBOL_02169 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02169 [Clostridium bolteae ATCC BAA-613] # 1 48 1 48 48 84 100.0 2e-15 MVTVTGAFEISEDGESLILKAQSIEVSRKPEEGYVYPCFQSENRMREN >gi|157101645|gb|DS480679.1| GENE 102 117176 - 117331 80 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937282|ref|ZP_02084644.1| ## NR: gi|160937282|ref|ZP_02084644.1| hypothetical protein CLOBOL_02172 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02172 [Clostridium bolteae ATCC BAA-613] # 1 51 29 79 79 80 100.0 4e-14 MEGYSENDAYPEDDSRQSSREENPWYDLKGLDEVEKTLRLQMEITTHGSMN >gi|157101645|gb|DS480679.1| GENE 103 117871 - 119289 341 472 aa, chain + ## HITS:1 COG:CAC1564 KEGG:ns NR:ns ## COG: CAC1564 COG1696 # Protein_GI_number: 15894842 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Clostridium acetobutylicum # 1 472 1 473 473 304 39.0 3e-82 MVFSSLVFLFHFLPLFLLLYYICPAKWKNAVLFAGSLIFYIYGVKNEPWYFFLFLLSVLV NYHIGLLLGRRRWLEHRKKWLIFGLCYNLIWLFLFKYSGFFAGNLNWIFKKMGIGFSIQI PEFPLPVGISFYTFQAISYLADVYRKDVLHETSFINYGTYITMFPQLIAGPIVTYASVRK QIHQRSHYLKNVENGLREFTIGLGLKVLLANQVGRLWSQVGAIGYESISTPFAWLGLAAF SFQIYFDFYSYSLMAKGLGQLMGFSLPDNFNYPYLSLSMTEFWRRWHMTLGGWFRDYIYI PLGGNRCGKATTIRNMLVVWLLTGLWHGASWNFILWGSVIFLLMLIEKMGLKHLLDGCPL 
IGHLYMIMAIPLTWLSFAVTDLVQMKIYLLRLFPFLAPEKQSLYFAGDFLKYGRIYVLSL TLSLIFITQLPRKLYKQWKRSFLTAVILLAVFWLCVYLIKMGADDPFLYFRF >gi|157101645|gb|DS480679.1| GENE 104 119307 - 120275 487 322 aa, chain + ## HITS:1 COG:no KEGG:bpr_I2151 NR:ns ## KEGG: bpr_I2151 # Name: not_defined # Def: GDSL-family lipase/acylhydrolase # Organism: B.proteoclasticus # Pathway: not_defined # 102 317 120 330 336 123 35.0 1e-26 MKKYWYTLLLAFSLLISAIISTLPASVYGIVTSVFVRVCSNGQEMIENIDAELSAVKHDH PETSQTENTHPEEPAIHDTAHVVPKHSAHKNDSSESVSMEASSFDSIDTTSEPATSAPQT PAATSVGPSYIPPNPDFTSTLFIGDSRTVGLFEYGDLGNAEIFANSGMSVFNLFESTVKT KSGLKQNLEEVLFQQQYHTIYLMLGINELGYDYNSIIRKYQSVVDTIKARQPNAMIVLEA NLHVTAEKSASSNTYTNEKINQINSGIRAIAENSDCCYIDVNDIFDDENGALKTTYSTDG SHVLGKYYSVWTDWLKGKSPDV >gi|157101645|gb|DS480679.1| GENE 105 120900 - 121610 278 236 aa, chain - ## HITS:1 COG:SP1823 KEGG:ns NR:ns ## COG: SP1823 COG1285 # Protein_GI_number: 15901652 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 17 236 8 236 236 142 36.0 4e-34 MEIMKFFFHKITVTGELTVAAIIFRIFVSLCIGGILGAERGIKNRPAGFMTYVLVSVGAT IYMLTNQYITTVIPGSDPTRFGAQVISGIGFLGAGTIIVTRQNEIRGLTTAAGLWVAASL GLAVGTGFYSGAFIGILFTVFALVLLKKIDVYIKQHARSMEIYIEHNIEFSFRNLVDYIE KRGFDVFEMQRSKIKTLDKDLGVLIFSLDLVKKVNHQEVLNHLNEIEGIEYVKEIA >gi|157101645|gb|DS480679.1| GENE 106 121628 - 122506 258 292 aa, chain - ## HITS:1 COG:BH1376 KEGG:ns NR:ns ## COG: BH1376 COG0568 # Protein_GI_number: 15613939 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Bacillus halodurans # 21 290 101 371 372 228 48.0 1e-59 MKQEKTLEKAKVDNVYEIETDSVQWYLKQAGVGPLLTKEEEIELAYLIQEGSDEARQKMI SSNLRLVINQAKKYVTYTQSLTLLDLIQEGNLGLIKAVEKYNPELGFRFSTYASWWIKQA ITRAISVLDKAIRLPVHFCEDIQKVKRATHEIMQQNDIVTPEKLKDITGFSEGKIEKILM NMNHTISLETPVGEEATLGDFVEDTLTPSIEEQAMENAMQREISRQMECLRPREQTILAL RFGLNGQNPHTLEEVGKQMGVTRERIRQIEMKALRKLRHPNRSRYLREFILG >gi|157101645|gb|DS480679.1| GENE 107 122852 - 123418 176 188 aa, chain - ## PROTEIN 
SUPPORTED ## NR: gi|226309687|ref|YP_002769581.1| probable 50S ribosomal protein L25 [Brevibacillus brevis NBRC 100599] # 1 175 1 173 202 72 26 1e-11 METICVQRREGAIKAKKLRRLGFIPANVFGKSLPDPISIQLKESDARRLLHQKREGSKLM LDLEGTGLPVQIKEKEIDTIKGQIQHISFQALTADEKVNSVIHILLVNDDKIPGQLDKML MEIPYASLPGDMIDTITVDLDGITVGTILTVKDIPELMSDKIELKVDSEEIVLRVSERKH HVEVKEAE >gi|157101645|gb|DS480679.1| GENE 108 124361 - 124510 204 49 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|224476463|ref|YP_002634069.1| 50S ribosomal protein L33 [Staphylococcus carnosus subsp. carnosus TM300] # 1 49 1 49 49 83 73 7e-15 MRVKVTLACMECHDRTYATTKNKQKHPERMEVMKYCPRLRKYLLHKETK >gi|157101645|gb|DS480679.1| GENE 109 124869 - 125822 541 317 aa, chain - ## HITS:1 COG:BH1517 KEGG:ns NR:ns ## COG: BH1517 COG3689 # Protein_GI_number: 15614080 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 22 317 24 319 321 72 22.0 1e-12 MTLRGKGINLQVLTECICYLLFGYLLFQLTYSGGYLNYVTPRMKPYLYGMSALMFLWAVF TGRYLLTPRYRVKMARSFVFIIPILLLVFHPADPRGSSMVRSYESSGFSMSSGNGGGQMA GNPGRQTAGSTVRQTLPAAGGESRGPFSDGESGSGLPDGENTETDGYSGTDAYPEDDTGQ SSGEENPWYDLNGLDEAEKTITIADEDYYTWMYELSNFYEKYEGYTIVMKGFVYRAPDIQ KKCDFALVRLSMWCCAADLSPIGFLVDSDNEVTFKDDDWVTVKGTFGISEDGASLILKAQ NIEAAEKPEEEYIYPYF >gi|157101645|gb|DS480679.1| GENE 110 125819 - 127378 705 519 aa, chain - ## HITS:1 COG:BH1518 KEGG:ns NR:ns ## COG: BH1518 COG0701 # Protein_GI_number: 15614081 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Bacillus halodurans # 214 509 35 333 337 207 36.0 4e-53 MTTPVWVITGLLDSGKTTLINQLAEQELDELDILVIQFESGETPLIEQERVKKLAFSKSQ LEQSPFGIADTIIHYIDKHRPDLILIEWNGMEHFHKLEKMLLQFSAKAVLSIEKVVYAAE GASFKTRIPDAGVATFSQIAGCDCAYVRLKGNRKSRNGMDVLYGCNPDIQVYTSWNRFVR ALFRFGMKPRHWFLMLLTAVMLYMAAFSMLNDAGALPERYISIFLGVFLQAAPFLAVGVL LSSLIQVYLKPDWIQRRFPRKILAGQLFAVLAGFCLPVCDCASIPVFKGLVKKGVPIPAA VTFMLVSPVINPVVILSTWYAFNGNYKIIAARCGLGILCSVLCGLTYLFKPPVNYLVEDA MPIQAACGDYSLLVEKRTQLSRFSLMMQHAQNEFFSVGKFLLTGIFASTLFQDMIPRAVA 
AGGAAAPWKALLVMMGLAFVLSLCSSSDAVVARSMAGSLPVGAVLGFLVFGPMMDIKNAA MLLSGFKSSFVIRLFATTFLVCFLVIGVFMTGGSGGIRI >gi|157101645|gb|DS480679.1| GENE 111 127375 - 128286 253 303 aa, chain - ## HITS:1 COG:FN0779 KEGG:ns NR:ns ## COG: FN0779 COG0523 # Protein_GI_number: 19704114 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Fusobacterium nucleatum # 8 297 26 294 294 112 28.0 8e-25 MVHSAFKNKKLVVIENDFGEAGIDAGLLKECNLAVTSLSDGCICCSLTGDFEKAMERILK DYIPDAVLVEPSGVVRLSDLIKICLKQEDRGLIHLKRTITVVDIRSFDKYRKNYGGFFED QIAYADLILLSHQEEGGQEIQSVKIKIQEMNPEARVEADFWESIPASVFRYGPRNSNIFK LEMEAAVSMKPVRIRSCREPSRRTGFIRRHFARDVFSSVTIEWKEPITEERLGKKILHVV KHAEGEILRGKGIVADGRRGLVFHYLPGSLSIEPANIVGNQVCFIGTGLDEQQIHTLFRE ETT >gi|157101645|gb|DS480679.1| GENE 112 128393 - 129718 746 441 aa, chain - ## HITS:1 COG:no KEGG:Closa_2429 NR:ns ## KEGG: Closa_2429 # Name: not_defined # Def: polysaccharide deacetylase # Organism: C.saccharolyticum # Pathway: not_defined # 2 438 49 466 469 392 45.0 1e-107 MRILLGDREISTTASAAESASASGKEASSSPGEKASLSLFDKTDELLAQVDRAVSQYDYD RAVELLVGNGADKEDPRIGEKLSDIEAVKKTLVEQDIYQITHIFFHILVEDPGRTFLDTA QGKGYNSVMTTVPEFEAILTQMYQRGYVLVGIHDLAEVSEDETGMEQMRAKKILLPPGKK AFVMSQDDVNYYEYMEGSGCARKILLDDRGKPVCEYMDQDGTVWIGEYDLVPILDRFVEE HPDFSYRGAKAILALTGYNGVLGYRTDETYDPASPNYDPDMKPNPNIEEDRAYVKKLTQA LKEDGYEFASHSWGHRDYGKIDLEHMKADIERWEKNVAPLLPDPCDIMIYPFGSDVGDWR PYTEENEKYRYLQSLGFRYFCNVDSRPYWVETGGQFLRQARRNLDGYRLWMDYGCGANRL SDLIDVNTVFDARRPTPVGWK >gi|157101645|gb|DS480679.1| GENE 113 129918 - 130715 792 265 aa, chain - ## HITS:1 COG:CAC2878 KEGG:ns NR:ns ## COG: CAC2878 COG1108 # Protein_GI_number: 15896132 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Clostridium acetobutylicum # 1 254 2 255 268 164 38.0 1e-40 MEIFEYAFMQRAFIVGILLAAIIPCIGLVIVCKRLSMIGDALSHTSLAGVAAGLLLGLNP VLTAAAACVVAAIGIEAIRRKIPRFSEMSIAIIMSAGIGLAGVLSGFIKNAANFNSFLFG 
SIVAISNGELYLVIAVSMAVLLMFLLMYKELFYVSLDEQSARLAGVPVKTVNFLFTVLIA VTVSVAARTVGALIVSSMMVVPVACAMQFGTSYRQTLILAVVLDILFMITGLFAAYYLGL KPGGTIVLVGVLVLILVFVGKRILR >gi|157101645|gb|DS480679.1| GENE 114 130702 - 131412 251 236 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 222 1 221 311 101 33 3e-20 MMDVIAVKGLDFSYGNTQILKNINFTVPEGRFAVLLGPNGAGKSTLIKLMLGELPLSGQT GKIELLGQEIRQFKDWKKISYVSQNGMASCQNFPASVEEFVQAGLYTQIGKFRFAGKRER EQVRRVLHQVGMGDFAKRLIGRLSGGQQQRVLLARALINAPRLLLLDEPTSGMDEDSTTL FYRLLYQINREQGVTIWIVTHDRKRLMAYADDIWLLEDEKMKLVQPGREEGEHGNL >gi|157101645|gb|DS480679.1| GENE 115 131433 - 132527 953 364 aa, chain - ## HITS:1 COG:lin0191 KEGG:ns NR:ns ## COG: lin0191 COG0803 # Protein_GI_number: 16799268 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Listeria innocua # 66 364 22 312 312 247 45.0 2e-65 MNKKLNSTFCTLAVTAMVSVAMLSGCSGTTAAPGASDTDQASTTAVAGGDKSMGEASMEA ENTDAGIESKDTDGQKLKVMASFYPMYDFAVKIGGDKAEVTNMVPAGTEPHDWEPAAADI KNLEEAALFVYSGGGMEHWVEDVLASLETKKLVSVEASAGVTLRSGHIHDEEEHKGAEEH TEEEDHGHDHGQFDPHVWLSPVNAKKEMENIKNAYVKADPENKDYYEKNYQTYTAEFDKL DQEYKDTLSALPNKSIVVSHEAFGYLCDAYGLTQMGIEGLSPDSEPDPARMAEIIDFVKA NHVKVIFFEELVSPKVAETIAAETGAQTRVLNPLEGLSDEELKNGADYFSVMEDNLKQLK AALE >gi|157101645|gb|DS480679.1| GENE 116 132710 - 133021 238 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937302|ref|ZP_02084664.1| ## NR: gi|160937302|ref|ZP_02084664.1| hypothetical protein CLOBOL_02192 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02192 [Clostridium bolteae ATCC BAA-613] # 1 103 1 103 103 187 100.0 3e-46 MDYLDEHKRIFAFVTAFLVTAVLLLSGFFIVTHIEHECTGEDCPVCAELQECAATIRLIT EAAGTGAIVIFAYIITQKLLSSYQTGLYLCPVSLVSLKIRLDN >gi|157101645|gb|DS480679.1| GENE 117 133464 - 133898 155 144 aa, chain - ## HITS:1 COG:no KEGG:Closa_4176 NR:ns ## KEGG: Closa_4176 # Name: not_defined # Def: ferric uptake regulator, Fur 
family # Organism: C.saccharolyticum # Pathway: not_defined # 1 132 1 132 140 205 74.0 5e-52 MSERGRYKTKQQDLILECLKKQKWRFLTVSQFMDCLREKGVVVGQTTVYRVLERMVDDGK MMKLPMEDGTKVRYCFAGEEDWDKLGKLVCMGCGRLIPLECSKMADFLEHIYKEHGFELD QKHTILYGYCECCNNKRSVPKIGL >gi|157101645|gb|DS480679.1| GENE 118 133975 - 134121 67 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937304|ref|ZP_02084666.1| ## NR: gi|160937304|ref|ZP_02084666.1| hypothetical protein CLOBOL_02194 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02194 [Clostridium bolteae ATCC BAA-613] # 1 48 2 49 49 90 100.0 5e-17 MGIVLEPEELVNTIFYLREGELGDDALSGISIVTMLCDFLRGCWVSIN >gi|157101645|gb|DS480679.1| GENE 119 134137 - 134361 90 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937305|ref|ZP_02084667.1| ## NR: gi|160937305|ref|ZP_02084667.1| hypothetical protein CLOBOL_02195 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02195 [Clostridium bolteae ATCC BAA-613] # 6 74 1 69 69 130 100.0 5e-29 MKSTLMDSVQEMSNTGKHIRYLDREILPPCYEDESLNTNCESYENSIYLKELSATYGKTG MARAEDWLMTKKGR >gi|157101645|gb|DS480679.1| GENE 120 134358 - 134468 56 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTLIVQHYGFHMISQCIIVISGRLLENKITRHIVNL >gi|157101645|gb|DS480679.1| GENE 121 134805 - 139073 1292 1422 aa, chain - ## HITS:1 COG:YPO0776 KEGG:ns NR:ns ## COG: YPO0776 COG1020 # Protein_GI_number: 16121089 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Yersinia pestis # 14 944 20 991 1939 682 38.0 0 MNHVENMKIILELFKNKGITLFMDGEKLKYRASHGINHDDILLLKENKEALIQYFKEHAE DIRIISAPEDMYKPFALTDVQAAYALGRKDTFEYGGVACHVYMELTYSELHVDKVQAAWQ RLIARHEMLRAIVNENGTQQVLPEVNEFKIQEYDLLHISEAGKALEKIRNRMGNAIYELG SWPMFEIAVSHTDKESILHFSMEFLIADWKSIWMLLYEFETLYFTPEQLLPEVGVTFRDY LIAERKLKETEQYKVDKEYWLNRIESLPKAPELPILRNPIKNRSDFSRYFMEIDQEKWNR LKRYSNHFGITPTAVVLTAYAEVMERWSQTSEFCINLTVLNRAPIHQKVEEIVGDFTSVS LLETRMEKSKPFIEKAREISGQLFEDLDHRLFNGVEVIREISRKKGREHALMPIVFTSAI 
GLSDKKLQGEFHGNGISQTPQVFIDCQVMDGEYGLQADWDVRNGVFPDGLIKDMFETFRS RLLELAETDVNWTKVSELELPSWQLERIAGINSTEKEISPQMLYDEFLEQVKKQPAKTAV IDSADKFTYLELHHMALGIADRLSEMGIKPGSNIAVLLPKSRFQVAAVLGILYTSCIYVP VDIEQPEQRWNTIVANADIKAVLIHSGHSAVFEHVPVLPVDQIEAISNDSMILRGTPDDL AYIIFTSGTTGVPKGVAITHKAAWNIIKDINQKFFVSSQDSVLGLSKLNFDLSVYDIFGL LSCGGTLVYPKLSRYMDPSHWVELIQEYEITIWNSVPAFMQILTGYFAGKNEKLPLRIVL LSGDWIPVGMPGDIQKCAGDAMVISLGGATEASIWSIYHECVDNEIREVSIPYGKPLSNQ GFSIYDAKGRPCPVYVTGELCIWGTGLAEGYYNDHKLTEAKFVTGRENGRMYKTGDNGCY LPNGEIEFKGRNDNQIKLRGHRIELGEIQSTLEQHKSVSQAMVVLNEVKTDIYAFVKTVQ GNVKNSDLKQYLEAYLPKYMIPADIISVEEFPLTANGKIDRDKIKKLIIKQENAEGKELR EENPDELQNKLIELWNSVVPQSAVGVFDNFYDLGADSLILAQMATKVREQISKDIAFDAL LRELIYNPTIQKLSEFIRNNTETEVEEKKMEDTNSELVFVREYGGDLEKGLSIIFHGALG SFNRFDYLADSLIASKGEKVIGIGIKDTNKYCSIPSTNLIWMLSDIYSQIICSYQVKAVS LIGYCFGGVIALETANRLNENSITVKSLSILDSLLLPPSFKVEDELLMEMMFLDNIPVSY ADIEMPNPALYEQIIVQESMKKGVINKGFLTSISGTTEREQLGNFFRELATMQKEERFGL YAVQCKKNTNVEMPMELIKSLYQVCVATFEAMHYEICPYLGTINYFFAPEGEVGFKKMVL ETWEGLALGAFNVIETPGNHYTCVEQKENAEYLAGSLCDLKE >gi|157101645|gb|DS480679.1| GENE 122 139060 - 140160 362 366 aa, chain - ## HITS:1 COG:YPO0774 KEGG:ns NR:ns ## COG: YPO0774 COG4693 # Protein_GI_number: 16121087 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Oxidoreductase (NAD-binding), involved in siderophore biosynthesis # Organism: Yersinia pestis # 1 298 2 312 375 181 32.0 2e-45 MQKIRTVVCGTTFGLFYLEALRRLNDKFEIVGILANGSERSKRCAANFNTTLYTDIRQVP KDIDLACVVIRTRALGGKGTEIAAHFLENGINVIQEQPIHPKDLEECYRIALKNGVYFQT GDVYPHLDGVETFIRSAHKLNEIHDRPLYMNLSFAPQVSYPLMDILMNSLPDIRSWEFSG NVKAGPFRVITGTLNKIPVCMEIHNEVFPKDPDNNMHLLHSITYMYESGHLSLQDTFGPA IWNPRLDIPVELYNYGDVKSEVPEYMYEKTGHILGGYHSRNFKQIITDIWPEAVKKDLLL VLKGISDKRLLNAKGQKEMICSRQWNEMTKYLGYADIMDKKLSKATLTDILEKSVQEQRR HEYESC >gi|157101645|gb|DS480679.1| GENE 123 140163 - 141239 363 358 aa, chain - ## HITS:1 COG:AF1588 KEGG:ns NR:ns ## COG: AF1588 COG1748 # Protein_GI_number: 11499183 # Func_class: E Amino acid transport and metabolism # Function: 
Saccharopine dehydrogenase and related proteins # Organism: Archaeoglobus fulgidus # 1 162 1 169 408 62 30.0 2e-09 MIGIIGGSGNVGQWATQALTRFSSAYIKVGGRNKEQFEKLARSLPDNAAVEFRQVDFKNI HSVKQFAEDCTIILNCAGPTYNHASELAEAVLNMGIGYVDVGYGKAMDQIFKRYGDKRIL CKAGAIPGLSGVLPKYLAQQVEEIHQIEVYYCALGKFTKTAAADYLDGVFDNRMELPSKR REARRKLLPLSINEAYLYPYYDEECEHVERITKCNVSKWFMALEGECTHNFMNTAADSYK MDSDKAVEDLCLATYVDTLGRTEYAGFVIQISGETAGKHMIRTMQVKAPTAEYMTGTVAA ASALLLDNTKELEIGGELGKIKEVNEFFSILDRIGQRLLISIVDKSIEDLTAVEEGEI >gi|157101645|gb|DS480679.1| GENE 124 141253 - 148599 2348 2448 aa, chain - ## HITS:1 COG:all2645_2 KEGG:ns NR:ns ## COG: all2645_2 COG1020 # Protein_GI_number: 17230137 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Nostoc sp. PCC 7120 # 1438 2325 5 931 1356 585 34.0 1e-166 MSLEHIVGMVDFMFLPFFTGCTCVYSEISSFLSNPREWFDLMEKYNVTVSVAPNFAYKYM SDCITDSDNWDLSSLRILMSIGEPISRQVIQRFFSKVNRFHIDEEKFVAAYGLSETCSGV VVNKGNVLKAKVAPEVNITSGIDFDSSHFDYKKADLICQGEPIADTKVRIMDDHREELKE NCIGNIEIKGSAVCKGYFEYDGRQPVEDGWLSTGDIGFYSGGALYVIGRRKDVIFSYGKN IYLSDLEYAVAKYFQLRSVACGDNVSSEGEFYIYLFVEINEAEGQQMKSKIRHTILRELG IKLDDIIFMDKLSNTKAGKISKAEIMRKYTMSRDSERRDNIEGFIKGYFAGEVTEDTSIL DVAEDSLSMFRLIGAIHKQFGIKMGILDLTRCETIHDLCDYINKLSGREIPVEKITCENQ MFCTSIQEAYILGRQQEFYGNTNMSHFYLEVRHNLSSQKIEEAIRKLIRRHSMLRSIFEN SRRVVLGEDIADTFFLKTLEEKSLESHRKAANQKVNEPDKWPLFEICNIKEGDSYIAAID IDMLIVDGFSLSVLVRDFKSLCRRLELEPISDGFGKYLEAYSIRKSGNRYLQDKQYWLER VKTLPGSPQIPILAAVNRGPNVFLRKEKAFTEQAWEELENVAERNGVTISVLLLTLYIHT LKRWSEEPEFTLNVTVMDRPTQMKDTERFVGDFTTNTLFNYENESGSPRNLYQDLSHVKG RLYEYLDHNNFDGVEVIREYAKVHQGEMALMPVVFTSMLFKEDNALTYEDINYFVTQTSQ VYLDCQIYHYDTTYVIAWDYLEKVFPDNMIHNMFDYYMNLVEGIICGREINEIPVGNERA VTRYNETDYSEELMTIPEMLKRSLDFFGPKTAITDGETVLTYEQLKRRMEDMIEDYIKQG MKCGDTVMVLAGKDIDSVVAILAVAKIGATFIPVPEDYPKERIETIREYSSAEWIVHVQN NKIERCVSAKRQRNHTTSKLEDTAYIIFTSGSTGNPKGVAISHEGVMNTLVDMKKRFALD ENEGVLGLSALNFDLSIFDIFGSIYMGGFLSLVKDPRNAEEINHMLEQFPITIWNSVPAI 
MKLFLESLPVDYRNTEINHIFLSGDWIGTELPEKIKKVFTNAEIISLGGATEASIWSIYF PIREVRKEWSSIPYGYPLGNQKIYILNKEGDACPTNVVGEICIGGRGVALGYVGDEEKTN RSFVDIPSIGRVYKTGDYGKFSEEGYVIILGRKDGQVKINGFRIELGEIEAVARKFYTVD NAIAIMDSKKKLALFYTGKEIEDRELQVHFEKYLPAYMIPYRFVYMEQFPLSKNGKIDRA ALLQIRELKEESEISAPANEIEMKISVIWGKYLESDQVSVDENFFALGGDSIIAQRIAQE LTKQFAVKIPFVKIVETGTVKELARFIQTETESKEEPKATLLSENQGEKTVEASQTELPM TNIQKAYFNGRNDSFELAHYNAHYYFEVEFPYEISDIERSIAQMLERHEVLRSIFLKNGT QKILADVPEYKVDITIAETEQEFAEKNEQMRRELSHKNYDFSKWPLFTFEVLKRGCDDFK ILCASINLMVCDGDSLRIFLKEFTEGLNNKKLPPIHYSYHEYVNSLNDCTNEREYQEAKA YWLNKIPCFPEFPHLPMLKNLTQCKNYRISRVSETIDAETWGTFKNTVRKHAVSPSTVLC TIYARVLAKWSNQKDLLLNMTAFERKEFDQNVSGILGDFTKILPLEVHIDGADFWDNAMQ IQTRILNNLDHKAFDGTEIIRELSRLSNGIGKALMPVVFTCVLFDSDENWFEQIGKLRYA VTQTPQVFLDNQILEMKHELYISWDYVEDLFNPAIIAAVFEDFISGIKNIAVNGVFETRS QIETDKVWDGYNMPVERVAPNTLQAFFEKQVRETPDGTALYYGDKQYTYAAIEAKANQVA RYLCSHGICGANNRVGVLGERKPETIISIIGVLKTGAAYVPVDAKFPKERRDFILENSGS RLLLSEELYRSGELEQYDESPIGIQSLPDSLAYIIYTSGSTGKPKGVMINHMAACNTISD LNNKIKLNKTDRLIGISSVCFDLSVYDIFGAFSTGAALVMVRDARDCSEIVSLLAERKIT VWNSVPVICSIVVQHMLERGMKDSLSLRQIMLSGDWIPLDLPDKAKSIFNKTKIMSLGGA TEASIWSIYYEIGEVKPAWKSIPYGMPLKNQTMYVLNFAGELCPMDVVGEIYIGGVGVAQ GYCGDTEKTEAAFIQHDHLGRIYKTGDMGVLRKEGYIEFLGRIDNQVKIRGYRIELGEIE SAINNVPGIEKSVISYVVNRTGNKQLIAYYIPTSNDLASETIKESIGAFLPEYMVPQHYI SIKELPLSANGKIDRSKLPELNLDKKERALSKDVISPLQKTLLGIWQKVFENTEITITDD FYGLGGDSIVLMKLLDEIALAGLENISIEDILEYETIEALSAYMEENK >gi|157101645|gb|DS480679.1| GENE 125 148849 - 150015 341 388 aa, chain - ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 388 1 388 391 350 45.0 3e-96 MIYVGIDIAKLNHFAAAISSDGEILIEPFKFSNNYDGFYLLLSHLAPLDQNSIIIGLEST AHYGDNLVRFLISKGFKVCVLNPIQTSFMRKNNVRKTKTDKVDTFVIAKTLMMQDSLRFM ALEDLDYIELKELGRFRQKLVKQRTRLKIQLTSYVDQAFPELQYFFKSGLHQNSVYAVLK EAPTPNAIASMHLTHLAHTLEVASHGHFGKDKARELRVLAQKSVGVNDSSLSIQITHTIE 
QIELLDSQLFSTELEMANLVTCLHSVIMTIPGIGVVNGGMILGEIGDIHRFSNPKKLLAF AGLDPTVYQSGNFQAHRTRMSKRGSKVLRYALMNAAHNVVKNNATFKAYYDAKRAEGRTH YNALGHCAGKLVRVIWKMLTDEVAFNLE >gi|157101645|gb|DS480679.1| GENE 126 150220 - 150750 198 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937311|ref|ZP_02084673.1| ## NR: gi|160937311|ref|ZP_02084673.1| hypothetical protein CLOBOL_02201 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02201 [Clostridium bolteae ATCC BAA-613] # 1 176 1 176 176 347 100.0 3e-94 MRQNGSIMEIFYNFFNNSENEIIFYDLSEERFSADNLRNMVEKTMAKLSSVERREYVILQ IGDKKEFLISLISLFCLDAIPVILPMCTSQSGYEQLKKILKAHPGTLVIADNQGSQNLAH FTDGSINIQNIEADAMSLCRRGRAYGEQDEAIIMYSLYDVVRSKHNILPFFFCFVV >gi|157101645|gb|DS480679.1| GENE 127 150747 - 151391 232 214 aa, chain - ## HITS:1 COG:CAC1329 KEGG:ns NR:ns ## COG: CAC1329 COG2091 # Protein_GI_number: 15894608 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheinyl transferase # Organism: Clostridium acetobutylicum # 15 159 12 155 201 63 31.0 4e-10 MWNDGVYIFKELDRISESDLERVSMILPEYRLEKVKKYKRDLDRKNCVIAYLLLTIALKR EFNIEGRIAMDYGIHGKPYLKEYQDVYFNFSHCSQGVVCAVSINDIGVDIETIDEKRMFF SSDIFSPKELEFIVNSQKEDFFRLWVLKEATVKCTGIGLVYGSNYIEFLTPKDGTFCKDE FVYHLKKIGSMYLAVCSKYPVRYQMVTRKELLNI >gi|157101645|gb|DS480679.1| GENE 128 151402 - 152889 628 495 aa, chain - ## HITS:1 COG:all2645_2 KEGG:ns NR:ns ## COG: all2645_2 COG1020 # Protein_GI_number: 17230137 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Nostoc sp. 
PCC 7120 # 75 488 5 424 1356 238 32.0 3e-62 MDKTKEFIWLADNELRINLNKLASDGINVWLEDEKLKFKGEQGKVTEEALKWMRQNKTAI ISYLLEKEEVQDKGFDLTPIQKAYFLGRDNLYELGGISANYYFEIDMDKLDVKKFESIFN QVIHDNEALRGIILKCGRQAVLSNVPWFSIEVKRVNDEAEQKALRLEWESKIYPLESWPM FHVGLTTKDNCCYRLHVGFDCILLDAWSVTLMLKQIYKLYENDKIMVPAYTFREYCNEHG KLLNEKIQEQAERYWSQRCEMLPKAPALPYAKKLSDINIPHFRRLNYELPRSDTETIYMK AKEYKVTPAAVLCTVYMKVLAEYAESPEFSINLTVFNRLSANREVQEVLGDFTNISIAVY EAKKGRSFKEEVLETQKEFWNLVKYRGYEGVKIIRKMPKNNPVQASLPVVFTGILQGLKQ SDYYLPNWAREIYAYSQTPQVSLDYQATDFKGNLSVNWDYVVQAFENFVIQEMFDKNIAL LRKIIQLDWEQEIEI >gi|157101645|gb|DS480679.1| GENE 129 152914 - 156057 1081 1047 aa, chain - ## HITS:1 COG:YPO1911_2 KEGG:ns NR:ns ## COG: YPO1911_2 COG1020 # Protein_GI_number: 16122159 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Yersinia pestis # 121 1033 430 1365 1381 427 33.0 1e-119 MKENISIIKGMDTLRTIAEERKVSVNVLLKGIAVEIYYRWFANETAIEYVTLNDLISEVI DRSKDKDTIVINDYVCSVMCNNDKAVLCIECEQLECDMMEGLSAFTEEIILEWADLADNA PMVHQIPRNQMAVRKNRNRTKKTFSDECLHHGFFENAANAPGKAALIYYKNGNKEEISYS CLQDMILKLAGYLKENDVKSGTLVGVSIPKGINQVIAVYGILAAGATYVPIGVHQPVERK KKIIDTGRIRYIVTSNEMRSEEDLGVSVICLEAVLQKALPAKEPGFPETNQPAYIIFTSG TTGIPKGVVISHSEAHNTIADINQRFNIGRNHTGIAISDLDFDLSVYDIFGLLSAGGTLV VLSEETKREPVFWRKAIIDAGVNVWNSVPALFDMLLTTCETDKQMLPLQLVLLSGDWIKM DLYDRLKAISECCKFVSLGGATEAAIWSNYFVVDGIDSSWKSIPYGEPLSNQYLRVVDGN GYDCPDYVNGELWIGGDGVAEGYLNDPELTDAKFVAIDGVRWYKTGDLVRYNSAGIVEFL GRIDNQVKINGYRIELGEVENVIKRSPDVSNVVVGVVDEKGKKELGAVIVPKINAAVNVH ITCCEDNNKYSDFHMKERERVVRQLILEFCHMQSRVSAEYRPIIEFWEGWLECHMEYSAD IFYAREEMEPLQKVMNARNLFAEILTGEQRVEELLRREEVSPEFWSLKGEDTKYFLDKIL AGDISNRKIAVLGARTGEIIKQYWDGFCQAEEITLFDTSAGILKLAQEKLGGMNANIRYC STYNEGVEESELNKYDIVLAVNLLHTYEEPLKEVEWAALLLKREGDFYAIEYEELDPMGI LISGLLENGFVNRKRGRREHTPLLLLDDWKNIYGQAPFGEVAINRRGRLGALLIHAKNVT PEVIEMRSRILEYMNENLTAYMIPTKRMIVKQVVLSKNGKVDRQQTLALLAVKQTAEKES IILQGIEADIAELWTNILSCKIFDRSQSFFESGGDSLSATRFLAEIKKKYSVDIPLKDIF 
STPSLKSVADLLQARLAENEGMVEGEI >gi|157101645|gb|DS480679.1| GENE 130 156062 - 156631 253 189 aa, chain - ## HITS:1 COG:alr2045 KEGG:ns NR:ns ## COG: alr2045 COG3208 # Protein_GI_number: 17229537 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Predicted thioesterase involved in non-ribosomal peptide biosynthesis # Organism: Nostoc sp. PCC 7120 # 23 168 88 234 253 95 34.0 7e-20 MAVDFNEILPELSLQIADVSCGRKIVLYGHSMGAAMAFYTAHYLWNQLDRKCEKIIVAGR QAPDQENPYEFKTYMNDDALIEELVRYNATPKEVLENKELLNFILPGLRQDYILNESLVY HGEVLDIPIVAHTGSQDYEANAEIMNLWKHMTTNIFQLNLFEGSHFFVTDLGNKYRDIVI QEAINKRED >gi|157101645|gb|DS480679.1| GENE 131 157046 - 158545 215 499 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 251 488 1 239 312 87 29 4e-16 MIEIKDVSFTYESGESENSLRNINLKIKDGETVLLCGESGCGKTTLTRLINGLIPHYYNG QLTGQIYLDGKEVKDYQLYQIAPMVGSVFQNPRTQFYNVDTTSEIVFGCENMGLPVQEML ARLENTTKCLKLENLLGRSLFALSGGEKQKIACASADAICPNTFVLDEPSSNLDISSIKD LTEVIRQWQSENKTVIVAEHRLYYLAPYADRIIYMKRGEICNEYTREEFLGLDPKELKKM GLRALNPFDLMPEKTPEANEKSLQIDNFWFSYEKRGYPIVNIPDLTLPQGEIIGIIGDNG AGKSSFARCLCGLDKSSKGTLKLNGNALNAKKRRHISYMVMQDVNHQLFTEDVLDELLLS MDGENEEEDKVRAQEILAGLDLSDKLKLHPMSLSGGEKQRVAIGSAVASGKEIIIFDEPT SGLDYRHMLEVSEKLIQLKEMKKTLFLITHDPELIYKCCTHLMFIEHGRILWHRPMDNEA VRLLQEFFSHEKGSSAMAI >gi|157101645|gb|DS480679.1| GENE 132 158559 - 159284 297 241 aa, chain - ## HITS:1 COG:SP1437 KEGG:ns NR:ns ## COG: SP1437 COG0619 # Protein_GI_number: 15901289 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Streptococcus pneumoniae TIGR4 # 91 226 3 133 147 82 34.0 8e-16 MHMLAVTETKGFHLDPRTKLLLMAVVATAEFLYGHTAFMLAVALIPFILLLTNRQYKAAT IFIVLFVAALVVKEIQNSIRFHMVVNMVVVLLVGLVLRLFPAFAMGNYIIKSTTASECIT SLSRMHIGRNITIPLSVLFRFLPTMQEESAAIKDAMRMREIQFGTKKFWQNPMALLEYRF IPLMISVVKIGDELSAAALTRGLDNPADRSSITRVGFTCYDAIAVVISGVMLFVTLFIIK W 
>gi|157101645|gb|DS480679.1| GENE 133 159284 - 159871 302 195 aa, chain - ## HITS:1 COG:no KEGG:SLGD_01924 NR:ns ## KEGG: SLGD_01924 # Name: not_defined # Def: hypothetical protein # Organism: S.lugdunensis # Pathway: not_defined # 6 195 3 192 192 140 44.0 3e-32 MQKSNKLNGKDLINVGIYTAMTLVIFFVFGLLTSLPVIYPFLLFIWPFVCGIPMMLYYTK IQKFGMLTITGVICGLFFFLIGYTWIGLVGWTLGGILADIVLKAGQYKSFKVTFLSYACF CLGMMGCPANLWIAGQSYWENIHSSMGDQYAETLQSMMPSWMMYAGFLILFVGGICGALL GHKMLKKHFERAGIV >gi|157101645|gb|DS480679.1| GENE 134 159950 - 160567 276 205 aa, chain - ## HITS:1 COG:aq_2179 KEGG:ns NR:ns ## COG: aq_2179 COG1309 # Protein_GI_number: 15607114 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Aquifex aeolicus # 1 117 7 114 192 61 31.0 1e-09 MSRDFQQTHEKLLACAKKHFLEFGFERASIREICKDANVTNGAFYNHFADKEALFGSLVE SVVQTIQKIYSESIDKHFDLVKTDELKNLWRLSESTIIQIIEYIYENFDVFRLLLMYSSG TKYAGFLDDLVRADVQETIKLIAELKARGVPVNDLDEDEWHMLVHSYYASIAEIVMHNYP KPAALKYAHTLSVFFSSGWQTVLGI >gi|157101645|gb|DS480679.1| GENE 135 160694 - 162448 650 584 aa, chain + ## HITS:1 COG:SA2217 KEGG:ns NR:ns ## COG: SA2217 COG1132 # Protein_GI_number: 15928007 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Staphylococcus aureus N315 # 9 572 8 574 587 417 39.0 1e-116 MKSKEPGAIRQLFMFVGSSKGKMKFSILMAVLGELFGMVPFLMLAMLADSLYFGTATTKD TVILAGIAALCQCIKTFLIWRSSLLSHRISFTILQNIREKIADKMAKVPMGVMLETPSGS FKNLIVDNVAKLEDSMAHFMPELPSHITAPLCSILLIFILDWRMGLASLITVPLGGLFFA AMMRGYASRMENYMCSANEMNSSLVEYVNGIQVIKAFNRSSSSYGQYSKAVNYFHDCTME WWSECWLWNAAARAVLPSTLLGTLPIGALLFMNGSITLPVLMICLIVPLGFTGPLMKVSE AMEQVSMIKGNLEQVTAFLKTPELQRPADPVTLTEPTFEFSHVCFGYNKSEVLHDISFKT SPKSMTAIVGPSGSGKSTIAKLMAGFWDATSGTVLFGSQDIRNIPFEQLMGEISYVAQDN FLFDKTIRDNIRMGNPAATDEQIETVAKAANCHDFIMQLEKGYDTMAGDAGDRLSGGERQ RITIARAMLKKASVVILDEATAYADPENEALIQEAISKLISGKTLIVVAHRLNTIRNADQ ILVVADGKIAGCGTQNELLENCPLYKEMWENYSDAVTSKTKGES >gi|157101645|gb|DS480679.1| GENE 136 162449 - 164173 226 574 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 332 547 279 502 563 91 29 2e-17 MFEMISRIYQIAGSSRKKITVGILCNILKSFFQSFMMLSVFLIMLNLDTLTSSIILKAVI IILISILGRFFFQWMSDRTMSGTGYDIFRDYRLEIGERLKQAPMGYFSEQNLGTIQTILT TTIADLEGYSMMAIEQMTSGVAMAFLMSFMMFFFNPIIGTLSMIGLVIGLTVLRLVRQRA AEHSPIYQAAQENLVNKCLEYIRGIAVLRSFSNGKSGQQEVHTAFQKKWDADYSQEKATA GVLRLYGLVYKLMSCVLIAIAGILYMDGKISLPFCMTFLFCAFTVYSDLETMGNSAFLSK KINTELDRLEEVTNIPQMDTTTDKLHPSNYEISLKDVSFGYGSRRIIDHISLNIPEHTTC AIVGPSGSGKTTLCNLIARFWDVQEGSVCIGGQNVKDYTADSVLDCISMVFQKVYLFHDT IENNIKFGKPDASLEEVMEAAKRACCHDFIMALPDGYQTIIGESGSTLSGGEKQRISIAR AILKDAPIVILDEATSSVDPENEQALLSAIEELTKNKTLISIAHRLSTVQKADQIIVIDN GQVKQKGTHEQLSKKDGIYRNFLKFCVEATGWQL >gi|157101645|gb|DS480679.1| GENE 137 164413 - 164532 70 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|169351275|ref|ZP_02868213.1| ## NR: gi|169351275|ref|ZP_02868213.1| hypothetical protein CLOSPI_02054 [Clostridium spiroforme DSM 1552] hypothetical protein CLOSPI_02054 [Clostridium spiroforme DSM 1552] # 3 39 189 225 331 68 86.0 1e-10 MTSMTFKDRQVELGELAFGLLADNIRFVVPNKNERNKSR >gi|157101645|gb|DS480679.1| GENE 138 164563 - 164889 212 108 aa, chain - ## HITS:1 COG:BS_ydcR KEGG:ns NR:ns ## COG: BS_ydcR COG2946 # Protein_GI_number: 16077554 # Func_class: L Replication, recombination and repair # Function: Putative phage replication protein RstA # Organism: Bacillus subtilis # 2 108 118 220 352 57 38.0 7e-09 MFRRCERRYGLYNFHFTRLDVAIDDKNEKPFFTLEQIKKKCEKEEFIANSEGYHFDESKF DDFDTAKTGYIGAGKSGLFYRFYDKDKEVCLKYNKTLDEVGSWKRTEM >gi|157101645|gb|DS480679.1| GENE 139 165636 - 165902 224 88 aa, chain + ## HITS:1 COG:no KEGG:SGGBAA2069_c00880 NR:ns ## KEGG: SGGBAA2069_c00880 # Name: not_defined # Def: HTH-type transcriptional regulator # Organism: S.gallolyticus_gallolyticus # Pathway: not_defined # 1 65 1 68 210 62 45.0 5e-09 MNDKHFGPNIQTFRKLKGMKQQELADAIGINLQSLSKIERGVYYPTFETLEKIMVVLEVA PNEMLSGTWKFASHIEKEVCQRGRTIKC >gi|157101645|gb|DS480679.1| GENE 140 166056 - 166286 138 76 aa, 
chain + ## HITS:1 COG:no KEGG:no NR:gi|160937326|ref|ZP_02084688.1| ## NR: gi|160937326|ref|ZP_02084688.1| hypothetical protein CLOBOL_02216 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02216 [Clostridium bolteae ATCC BAA-613] # 1 76 1 76 76 147 100.0 4e-34 MKFQKILDRYDEFYSLDMFGESTEGHKRMPPYVPFAKRKVIVNMNILMFKMNILYPLTED TTNKKSDQECKHHLRW >gi|157101645|gb|DS480679.1| GENE 141 166519 - 167460 262 313 aa, chain - ## HITS:1 COG:lin0383 KEGG:ns NR:ns ## COG: lin0383 COG2378 # Protein_GI_number: 16799460 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Listeria innocua # 1 313 1 313 315 322 53.0 5e-88 MRKSTRLNDMMIFLNDKSYFNLKDIMQEYGISKSTAIRDIQSLEEIGMPIFSETGRNGRY GILKNRLLSPIIFTVDEMYALYFAMLTLNAYESTPFHLSIEKLKKKFETCLSDELIKTIN KMEHTLSLEGTKHYNSSPLLKDILEMAVKDTVCCISYKKGEFVSEVQVQFFNISSSFGQW YVTGFNYDRQRVQVFRCDKIIDISESECYIPKPRQELLNRSTNIFKSENATDFEVQILDK GVDLFYKENYPSMELVKYDGQYKIKGFYNVGEERFIADYFLSYGKLIKSIYPEHLKKLVS KRAKELYQYYDNL >gi|157101645|gb|DS480679.1| GENE 142 167555 - 167974 294 139 aa, chain + ## HITS:1 COG:no KEGG:CLD_3697 NR:ns ## KEGG: CLD_3697 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B1 # Pathway: not_defined # 1 139 1 140 140 115 49.0 6e-25 MDTLKEFQKMMAEQTDLALATIGTDGIPNVRVVNYYFNPENNIVYFATFKNNDKVKEMKQ NQNVAFTTIPKKENEHIKVRGTAIKSKKNIFDLADCFCEKIPDYRETIEYGGKSLIVFEI HFETATVTLDLSNSKTLKL >gi|157101645|gb|DS480679.1| GENE 143 168072 - 168254 185 60 aa, chain - ## HITS:1 COG:no KEGG:Tthe_0801 NR:ns ## KEGG: Tthe_0801 # Name: not_defined # Def: PilT protein domain protein # Organism: T.thermosaccharolyticum # Pathway: not_defined # 1 51 1 51 134 79 58.0 6e-14 MRGLIDTNVLSSAALNANGIPFQAYVKAVSYPNHGLICEQNVDEMKRVFSKTNSKTNIFP >gi|157101645|gb|DS480679.1| GENE 144 168251 - 168490 169 79 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1619 NR:ns ## KEGG: Cthe_1619 # Name: not_defined # Def: AbrB family transcriptional regulator # Organism: C.thermocellum # Pathway: not_defined # 1 77 12 88 90 84 59.0 1e-15 
MAKGQVTIPKDIREVLGVASGDRITFIVEGNTVRIVNSAVYAMQVLQKEIAGEAERAGLT SDDDVMEFVKELRNEDENE >gi|157101645|gb|DS480679.1| GENE 145 169377 - 169733 262 118 aa, chain - ## HITS:1 COG:PA5025 KEGG:ns NR:ns ## COG: PA5025 COG2873 # Protein_GI_number: 15600218 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pseudomonas aeruginosa # 35 118 196 281 425 77 46.0 7e-15 MADFFRYSWILQRGERVFNLGVDTKKGWNLKLFNDILRGSFGKISGHGITLGGIIVNSGN FGWKASGKTTPIEKNTPSYHGDSFVEAAAPAAFVTYIHAILLRDKGATLSHFNAFLFL >gi|157101645|gb|DS480679.1| GENE 146 169759 - 170382 206 207 aa, chain - ## HITS:1 COG:no KEGG:BCZK1570 NR:ns ## KEGG: BCZK1570 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_ZK # Pathway: not_defined # 17 203 16 212 213 76 30.0 8e-13 MAKKFKSVLCQLGMERLEHPLFYHAPVGIRFRIGGEETIYLDGKINSDYVQGALDRAAAI YRSLPNPPDILRIDGYPEKETDESVLTMVRKMSGLPVPQEQEASVGLDEDGDAHRQIRFY WDLRRVLFCPEQLIREIILGDIGGWSGFTSNVYLAGPGPYLYHLYDDRGLDVLGSSQELL LPLYRQFHSWILEYDLEQIDRLFASED >gi|157101645|gb|DS480679.1| GENE 147 170997 - 171350 140 117 aa, chain - ## HITS:1 COG:SPy0189 KEGG:ns NR:ns ## COG: SPy0189 COG3969 # Protein_GI_number: 15674394 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Streptococcus pyogenes M1 GAS # 1 117 108 225 444 149 61.0 1e-36 MYWYPWDDTKQDIWVRQIPKYSYVINLTKPFAYYRYRMHQRDFAKHFGRFYKEEHGYGRT VCLLGMRADEPLQRYSGFVNKKYRYHNEGWISKQFKDVWCASTLYDWSNSDVWCANY >gi|157101645|gb|DS480679.1| GENE 148 171390 - 171614 196 74 aa, chain - ## HITS:1 COG:SPy0189 KEGG:ns NR:ns ## COG: SPy0189 COG3969 # Protein_GI_number: 15674394 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Streptococcus pyogenes M1 GAS # 1 74 20 93 444 83 55.0 1e-16 MFQEFKSIYISFSGSKDSDVLLNLLLYYWNNHASDRVIGVFHQDFEAQYTVTTDYITRTF KRLENEYGIELYWV >gi|157101645|gb|DS480679.1| GENE 149 171778 - 171993 81 71 aa, chain - ## HITS:1 COG:no 
KEGG:GYMC10_6261 NR:ns ## KEGG: GYMC10_6261 # Name: not_defined # Def: peptidase C60 sortase A and B # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 3 70 190 257 259 76 51.0 3e-13 MYYQDETDVFRFYAYQDLSDPVLFSEYVDNVTAASLYDTGETAKYGDTLLTLVTCSYHAE NGRFVVVARKC >gi|157101645|gb|DS480679.1| GENE 150 172036 - 172593 366 185 aa, chain - ## HITS:1 COG:lin2285 KEGG:ns NR:ns ## COG: lin2285 COG4509 # Protein_GI_number: 16801349 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 89 184 67 163 246 77 37.0 1e-14 MGSARPRWGRSRGRNKHPWYVTLGGLSSCVMLVVSLFMLGHSLLGLEKSQVVFTTLTTRV EEERKRPKPTTTIELVTEDTHASSPYAVLKEENGNLFGWVKIAGTKLDYPVMYTPEEPEY YLHRAFDKSSSVSGVPFLDGNYIDGGKNYLIYGHNMKNGTMFHTLLNYVKADFWKEHPTI TFDTL >gi|157101645|gb|DS480679.1| GENE 151 172598 - 172882 295 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937344|ref|ZP_02084706.1| ## NR: gi|160937344|ref|ZP_02084706.1| hypothetical protein CLOBOL_02234 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02234 [Clostridium bolteae ATCC BAA-613] # 1 94 1 94 94 176 100.0 5e-43 MIKKKLAIQIAQRIGNGREETTKLLDVLCEVLGDTLDAGVFVHLTGLGRLTVREYPAQKG FDPRTRGFINMAAKKSPVFKANKSLKGKSNEAGA Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:33:44 2011 Seq name: gi|157101644|gb|DS480680.1| Clostridium bolteae ATCC BAA-613 Scfld_02_21 genomic scaffold, whole genome shotgun sequence Length of sequence - 301038 bp Number of predicted genes - 262, with homology - 260 Number of transcription units - 107, operones - 63 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 127 - 200 75.2 # Gly TCC 0 0 - TRNA 233 - 307 86.4 # Pro TGG 0 0 1 1 Op 1 7/0.030 - CDS 397 - 897 536 ## COG0622 Predicted phosphoesterase 2 1 Op 2 . - CDS 917 - 1516 396 ## PROTEIN SUPPORTED gi|71274727|ref|ZP_00651015.1| Ham1-like protein 3 1 Op 3 . - CDS 1558 - 3297 1499 ## COG4870 Cysteine protease 4 1 Op 4 . 
- CDS 3308 - 4471 1019 ## COG0116 Predicted N6-adenine-specific DNA methylase 5 1 Op 5 . - CDS 4502 - 5059 570 ## COG0622 Predicted phosphoesterase 6 1 Op 6 . - CDS 5095 - 5427 333 ## Closa_0690 Rhodanese domain protein - Term 5518 - 5568 12.5 7 2 Tu 1 . - CDS 5606 - 5839 214 ## - Prom 5907 - 5966 9.7 + Prom 6099 - 6158 7.5 8 3 Tu 1 . + CDS 6281 - 6484 200 ## COG1278 Cold shock proteins + Term 6502 - 6542 6.4 - Term 6587 - 6624 7.6 9 4 Op 1 7/0.030 - CDS 6647 - 7369 540 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 7445 - 7504 8.4 10 4 Op 2 . - CDS 7609 - 8328 190 ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 - Prom 8394 - 8453 8.2 - Term 8422 - 8457 5.1 11 5 Op 1 . - CDS 8565 - 9068 576 ## Closa_0686 hypothetical protein 12 5 Op 2 . - CDS 9068 - 11005 1949 ## COG3855 Uncharacterized protein conserved in bacteria - Prom 11040 - 11099 5.9 + Prom 11008 - 11067 5.0 13 6 Tu 1 . + CDS 11138 - 11941 775 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III + Term 11961 - 12002 2.1 + Prom 12063 - 12122 4.1 14 7 Tu 1 . + CDS 12142 - 13062 534 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 12940 - 12977 -0.4 15 8 Tu 1 . - CDS 13059 - 13685 804 ## COG3601 Predicted membrane protein - Term 13859 - 13897 5.1 16 9 Op 1 . - CDS 14134 - 14724 494 ## COG0778 Nitroreductase - Term 14749 - 14791 -0.9 17 9 Op 2 . - CDS 14855 - 16039 1231 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family - Prom 16072 - 16131 4.5 - Term 16317 - 16355 7.3 18 10 Op 1 . - CDS 16502 - 17344 723 ## CPE0145 hypothetical protein 19 10 Op 2 2/0.121 - CDS 17378 - 18715 1523 ## COG1840 ABC-type Fe3+ transport system, periplasmic component 20 10 Op 3 8/0.030 - CDS 18732 - 19886 1278 ## COG3839 ABC-type sugar transport systems, ATPase components 21 10 Op 4 . - CDS 19905 - 21770 2266 ## COG1178 ABC-type Fe3+ transport system, permease component 22 10 Op 5 . 
- CDS 21845 - 22855 1039 ## COG1609 Transcriptional regulators - Prom 22914 - 22973 5.3 + Prom 22970 - 23029 3.6 23 11 Tu 1 . + CDS 23176 - 24507 1479 ## COG0739 Membrane proteins related to metalloendopeptidases + Term 24702 - 24737 0.0 - Term 24465 - 24499 -0.9 24 12 Op 1 7/0.030 - CDS 24549 - 25370 1128 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 25 12 Op 2 . - CDS 25419 - 27116 1824 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 26 12 Op 3 17/0.000 - CDS 27191 - 28843 1834 ## COG1178 ABC-type Fe3+ transport system, permease component 27 12 Op 4 7/0.030 - CDS 28833 - 29954 1356 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components - Term 29970 - 30011 4.2 28 13 Op 1 . - CDS 30046 - 31161 271 ## PROTEIN SUPPORTED gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 29 13 Op 2 . - CDS 31209 - 31880 858 ## COG0274 Deoxyribose-phosphate aldolase - Prom 32035 - 32094 6.6 + Prom 31931 - 31990 5.4 30 14 Tu 1 . + CDS 32185 - 32421 241 ## gi|160937383|ref|ZP_02084744.1| hypothetical protein CLOBOL_02274 + Term 32442 - 32495 16.1 - Term 32425 - 32483 3.8 31 15 Op 1 . - CDS 32526 - 33800 970 ## Jann_4086 hypothetical protein 32 15 Op 2 11/0.030 - CDS 33825 - 35108 705 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 33 15 Op 3 11/0.030 - CDS 35108 - 35650 179 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 34 15 Op 4 2/0.121 - CDS 35671 - 36747 285 ## PROTEIN SUPPORTED gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 - Prom 36848 - 36907 5.2 - Term 36887 - 36933 5.7 35 16 Tu 1 . - CDS 36940 - 37947 726 ## COG1609 Transcriptional regulators - Prom 38022 - 38081 9.0 - Term 38095 - 38123 0.7 36 17 Op 1 . 
- CDS 38171 - 38950 921 ## Tmar_1888 xylose isomerase barrel 37 17 Op 2 21/0.000 - CDS 38919 - 40097 1086 ## COG3839 ABC-type sugar transport systems, ATPase components 38 17 Op 3 38/0.000 - CDS 40136 - 40960 904 ## COG0395 ABC-type sugar transport system, permease component 39 17 Op 4 35/0.000 - CDS 40950 - 41840 979 ## COG1175 ABC-type sugar transport systems, permease components - Term 41850 - 41898 11.0 40 17 Op 5 . - CDS 41930 - 43375 1382 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 43443 - 43502 5.7 - Term 43498 - 43548 6.1 41 18 Op 1 4/0.061 - CDS 43622 - 45106 1471 ## COG2721 Altronate dehydratase 42 18 Op 2 . - CDS 45154 - 46494 1409 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases - Term 46514 - 46557 8.6 43 19 Op 1 . - CDS 46575 - 47858 1068 ## Spirs_0433 hypothetical protein 44 19 Op 2 . - CDS 47884 - 49170 640 ## PROTEIN SUPPORTED gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 45 19 Op 3 . - CDS 49163 - 49675 470 ## Amet_0551 tripartite ATP-independent periplasmic transporter DctQ 46 19 Op 4 1/0.121 - CDS 49705 - 50823 312 ## PROTEIN SUPPORTED gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 47 19 Op 5 . - CDS 50875 - 51894 1029 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 51942 - 52001 9.6 + Prom 51962 - 52021 6.7 48 20 Op 1 . + CDS 52148 - 52822 673 ## COG1802 Transcriptional regulators 49 20 Op 2 . + CDS 52829 - 53017 88 ## gi|160937404|ref|ZP_02084765.1| hypothetical protein CLOBOL_02295 + Term 53233 - 53272 0.5 - Term 53487 - 53543 2.1 50 21 Op 1 2/0.121 - CDS 53660 - 54922 1456 ## COG4198 Uncharacterized conserved protein 51 21 Op 2 6/0.030 - CDS 54974 - 56137 1452 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 52 21 Op 3 . 
- CDS 56174 - 57259 1235 ## COG1932 Phosphoserine aminotransferase - Prom 57343 - 57402 4.1 - Term 57312 - 57382 9.2 53 22 Op 1 6/0.030 - CDS 57499 - 59181 2151 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 54 22 Op 2 1/0.121 - CDS 59227 - 60891 2144 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 55 22 Op 3 . - CDS 60948 - 62033 1297 ## COG0473 Isocitrate/isopropylmalate dehydrogenase - Prom 62094 - 62153 6.1 56 23 Op 1 . - CDS 62303 - 62935 711 ## Closa_1592 GCN5-related N-acetyltransferase 57 23 Op 2 . - CDS 62943 - 65042 1512 ## Clole_0471 peptidase M28 - Term 65051 - 65095 8.3 58 24 Op 1 . - CDS 65121 - 66092 1195 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 66124 - 66183 4.2 59 24 Op 2 . - CDS 66201 - 67532 1181 ## COG2206 HD-GYP domain - Prom 67568 - 67627 5.7 - Term 67613 - 67650 6.3 60 25 Op 1 45/0.000 - CDS 67655 - 68398 699 ## COG0842 ABC-type multidrug transport system, permease component 61 25 Op 2 . - CDS 68395 - 69180 177 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 62 25 Op 3 . - CDS 69230 - 70003 817 ## COG0789 Predicted transcriptional regulators - Prom 70064 - 70123 4.1 - Term 70093 - 70140 11.2 63 26 Op 1 . - CDS 70165 - 70380 236 ## Closa_0502 hypothetical protein 64 26 Op 2 . - CDS 70520 - 71434 537 ## COG2207 AraC-type DNA-binding domain-containing proteins 65 26 Op 3 . - CDS 71457 - 72860 997 ## ELI_1104 hypothetical protein 66 26 Op 4 1/0.121 - CDS 72865 - 76203 3303 ## COG0642 Signal transduction histidine kinase 67 26 Op 5 . - CDS 76203 - 77477 973 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 77535 - 77594 6.4 - Term 77588 - 77635 10.0 68 27 Tu 1 . - CDS 77696 - 78172 278 ## COG4422 Bacteriophage protein gp37 - Prom 78349 - 78408 3.2 - Term 78347 - 78381 -1.0 69 28 Tu 1 . 
- CDS 78499 - 78963 330 ## COG0346 Lactoylglutathione lyase and related lyases - Prom 79165 - 79224 6.0 - Term 79166 - 79222 7.3 70 29 Tu 1 1/0.121 - CDS 79247 - 80797 1359 ## COG1418 Predicted HD superfamily hydrolase - Prom 80839 - 80898 7.1 71 30 Op 1 14/0.000 - CDS 80932 - 81549 419 ## COG2137 Uncharacterized protein conserved in bacteria 72 30 Op 2 . - CDS 81546 - 82682 1434 ## COG0468 RecA/RadA recombinase - Prom 82725 - 82784 4.6 73 31 Tu 1 . - CDS 83156 - 84205 823 ## Closa_3771 hypothetical protein - Prom 84275 - 84334 2.0 74 32 Tu 1 . - CDS 84342 - 85511 1505 ## COG0462 Phosphoribosylpyrophosphate synthetase - Prom 85561 - 85620 4.5 75 33 Op 1 3/0.091 - CDS 85632 - 86612 992 ## COG1484 DNA replication protein 76 33 Op 2 . - CDS 86634 - 87860 913 ## COG3935 Putative primosome component and related proteins - Prom 87881 - 87940 4.0 - Term 88057 - 88110 -0.4 77 34 Tu 1 . - CDS 88123 - 89508 1525 ## COG0773 UDP-N-acetylmuramate-alanine ligase - Prom 89562 - 89621 8.9 - Term 89709 - 89749 2.1 78 35 Tu 1 . - CDS 89791 - 90288 621 ## COG4708 Predicted membrane protein - Prom 90448 - 90507 8.3 79 36 Op 1 7/0.030 + CDS 90738 - 92735 1261 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 80 36 Op 2 . + CDS 92756 - 94381 892 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Term 94355 - 94401 7.2 81 37 Op 1 . - CDS 94412 - 95185 675 ## gi|160937443|ref|ZP_02084804.1| hypothetical protein CLOBOL_02334 82 37 Op 2 . - CDS 95223 - 95993 699 ## gi|160937444|ref|ZP_02084805.1| hypothetical protein CLOBOL_02335 83 37 Op 3 . - CDS 96007 - 96756 480 ## Celal_1465 hypothetical protein 84 37 Op 4 . - CDS 96775 - 96942 112 ## 85 38 Op 1 . - CDS 97078 - 98442 1225 ## GYMC10_0716 extracellular solute-binding protein family 1 86 38 Op 2 38/0.000 - CDS 98520 - 99350 709 ## COG0395 ABC-type sugar transport system, permease component 87 38 Op 3 . 
- CDS 99350 - 100186 756 ## COG1175 ABC-type sugar transport systems, permease components - Prom 100298 - 100357 5.8 88 39 Op 1 . - CDS 100401 - 101216 611 ## COG1082 Sugar phosphate isomerases/epimerases 89 39 Op 2 . - CDS 101213 - 102541 994 ## COG0534 Na+-driven multidrug efflux pump - Prom 102607 - 102666 2.8 + Prom 102678 - 102737 4.2 90 40 Op 1 7/0.030 + CDS 102759 - 104690 1629 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 91 40 Op 2 2/0.121 + CDS 104708 - 106276 1404 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 92 40 Op 3 35/0.000 + CDS 106293 - 107756 1630 ## COG1653 ABC-type sugar transport system, periplasmic component 93 40 Op 4 38/0.000 + CDS 107767 - 108666 711 ## COG1175 ABC-type sugar transport systems, permease components 94 40 Op 5 2/0.121 + CDS 108656 - 109486 885 ## COG0395 ABC-type sugar transport system, permease component 95 40 Op 6 . + CDS 109500 - 111770 1840 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 111812 - 111872 19.0 - Term 111798 - 111857 1.4 96 41 Op 1 2/0.121 - CDS 111893 - 112285 485 ## COG1433 Uncharacterized conserved protein 97 41 Op 2 . - CDS 112263 - 112658 247 ## COG1342 Predicted DNA-binding proteins - Prom 112702 - 112761 6.2 - Term 112768 - 112826 9.5 98 42 Op 1 . 
- CDS 112866 - 113273 499 ## COG2337 Growth inhibitor - Prom 113330 - 113389 2.2 99 42 Op 2 5/0.061 - CDS 113401 - 115359 1740 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 100 42 Op 3 5/0.061 - CDS 115390 - 116862 1370 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 101 42 Op 4 5/0.061 - CDS 116877 - 118364 1333 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 102 42 Op 5 21/0.000 - CDS 118408 - 119940 1456 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 103 42 Op 6 16/0.000 - CDS 119927 - 120307 558 ## COG1006 Multisubunit Na+/H+ antiporter, MnhC subunit 104 42 Op 7 11/0.030 - CDS 120310 - 121113 849 ## COG2111 Multisubunit Na+/H+ antiporter, MnhB subunit 105 43 Op 1 5/0.061 - CDS 121361 - 121612 311 ## COG2111 Multisubunit Na+/H+ antiporter, MnhB subunit 106 43 Op 2 . - CDS 121599 - 121919 428 ## COG1320 Multisubunit Na+/H+ antiporter, MnhG subunit 107 43 Op 3 . - CDS 121919 - 122302 477 ## gi|160937471|ref|ZP_02084832.1| hypothetical protein CLOBOL_02362 108 43 Op 4 . - CDS 122314 - 122778 623 ## COG1863 Multisubunit Na+/H+ antiporter, MnhE subunit - Term 122804 - 122843 7.2 109 44 Op 1 1/0.121 - CDS 122868 - 124109 1416 ## COG1228 Imidazolonepropionase and related amidohydrolases 110 44 Op 2 . - CDS 124160 - 125608 1539 ## COG0531 Amino acid transporters - Prom 125658 - 125717 6.3 + Prom 125740 - 125799 11.6 111 45 Op 1 9/0.030 + CDS 125829 - 126539 921 ## COG3279 Response regulator of the LytR/AlgR family 112 45 Op 2 . + CDS 126571 - 127875 984 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 113 46 Op 1 40/0.000 - CDS 127949 - 129397 1173 ## COG0642 Signal transduction histidine kinase 114 46 Op 2 . 
- CDS 129397 - 130098 723 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 130141 - 130200 2.4 115 47 Tu 1 . + CDS 130318 - 130794 557 ## gi|160937480|ref|ZP_02084841.1| hypothetical protein CLOBOL_02371 + Term 130863 - 130890 0.1 116 48 Tu 1 . - CDS 130860 - 131735 1009 ## COG0583 Transcriptional regulator - Prom 131766 - 131825 9.6 + Prom 131744 - 131803 9.4 117 49 Tu 1 . + CDS 131917 - 132957 1249 ## COG2855 Predicted membrane protein + Term 132981 - 133035 15.1 - Term 132969 - 133021 14.7 118 50 Op 1 . - CDS 133026 - 133775 864 ## COG1802 Transcriptional regulators 119 50 Op 2 . - CDS 133789 - 134436 676 ## Cphy_3765 hypothetical protein 120 50 Op 3 . - CDS 134486 - 136615 2583 ## COG0145 N-methylhydantoinase A/acetone carboxylase, beta subunit - Prom 136643 - 136702 2.1 121 51 Tu 1 . - CDS 136748 - 138091 1711 ## Cphy_3767 citrate transporter - Prom 138265 - 138324 6.0 - Term 138464 - 138504 7.3 122 52 Op 1 2/0.121 - CDS 138581 - 139411 891 ## COG1737 Transcriptional regulators 123 52 Op 2 3/0.091 - CDS 139461 - 140357 314 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 124 53 Op 1 1/0.121 - CDS 140511 - 140966 512 ## COG2731 Beta-galactosidase, beta subunit 125 53 Op 2 2/0.121 - CDS 140990 - 142492 1815 ## COG0591 Na+/proline symporter - Prom 142602 - 142661 2.3 - Term 142514 - 142551 0.3 126 54 Op 1 3/0.091 - CDS 142681 - 143598 1184 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 127 54 Op 2 . - CDS 143665 - 144351 925 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase - Prom 144409 - 144468 8.5 128 55 Op 1 7/0.030 - CDS 144582 - 146195 1913 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 129 55 Op 2 . - CDS 146179 - 147996 1630 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 130 55 Op 3 . 
- CDS 148043 - 149200 969 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 131 55 Op 4 14/0.000 - CDS 149191 - 150534 1201 ## COG1653 ABC-type sugar transport system, periplasmic component 132 55 Op 5 38/0.000 - CDS 150579 - 151430 898 ## COG0395 ABC-type sugar transport system, permease component 133 55 Op 6 35/0.000 - CDS 151431 - 152354 757 ## COG1175 ABC-type sugar transport systems, permease components - Prom 152419 - 152478 3.5 - Term 152462 - 152497 1.2 134 55 Op 7 . - CDS 152512 - 153927 1606 ## COG1653 ABC-type sugar transport system, periplasmic component 135 55 Op 8 . - CDS 154022 - 155059 995 ## COG1363 Cellulase M and related proteins - Prom 155081 - 155140 6.0 - Term 155255 - 155306 9.6 136 56 Op 1 . - CDS 155332 - 156840 1716 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) 137 56 Op 2 . - CDS 156863 - 157093 441 ## EUBREC_1028 hypothetical protein 138 57 Op 1 . - CDS 157215 - 158621 1486 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins 139 57 Op 2 . - CDS 158618 - 159577 950 ## COG1073 Hydrolases of the alpha/beta superfamily 140 57 Op 3 . - CDS 159635 - 159952 486 ## EUBREC_1025 hypothetical protein 141 57 Op 4 . - CDS 159949 - 160500 460 ## COG2091 Phosphopantetheinyl transferase - Prom 160626 - 160685 8.0 + Prom 160573 - 160632 3.3 142 58 Tu 1 . + CDS 160661 - 161278 631 ## COG1309 Transcriptional regulator + Term 161294 - 161346 8.1 + Prom 161290 - 161349 7.6 143 59 Op 1 . + CDS 161485 - 164304 3197 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 144 59 Op 2 . + CDS 164291 - 164638 400 ## gi|160937513|ref|ZP_02084874.1| hypothetical protein CLOBOL_02404 + Term 164680 - 164743 19.3 - Term 164668 - 164731 19.3 145 60 Op 1 . - CDS 164836 - 165492 826 ## COG0703 Shikimate kinase 146 60 Op 2 . - CDS 165489 - 166100 682 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 147 60 Op 3 . 
- CDS 166134 - 167804 1972 ## COG1227 Inorganic pyrophosphatase/exopolyphosphatase - Prom 167992 - 168051 7.6 - Term 168104 - 168152 11.5 148 61 Tu 1 . - CDS 168186 - 169640 1712 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 169723 - 169782 8.0 + Prom 169671 - 169730 6.1 149 62 Tu 1 . + CDS 169967 - 170515 527 ## COG1247 Sortase and related acyltransferases - Term 170796 - 170843 9.2 150 63 Op 1 . - CDS 170895 - 172397 1390 ## COG1640 4-alpha-glucanotransferase 151 63 Op 2 . - CDS 172594 - 173613 648 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 173639 - 173698 5.5 + Prom 173650 - 173709 5.5 152 64 Op 1 . + CDS 173758 - 176046 2296 ## COG0058 Glucan phosphorylase + Prom 176061 - 176120 7.4 153 64 Op 2 . + CDS 176154 - 177407 1360 ## COG1404 Subtilisin-like serine proteases + Term 177561 - 177607 14.3 154 65 Op 1 . - CDS 177385 - 178443 606 ## COG2706 3-carboxymuconate cyclase 155 65 Op 2 . - CDS 178498 - 179157 717 ## COG0684 Demethylmenaquinone methyltransferase 156 65 Op 3 . - CDS 179159 - 180181 434 ## Cwoe_4678 hypothetical protein 157 65 Op 4 3/0.091 - CDS 180253 - 180996 199 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 158 65 Op 5 . - CDS 181019 - 182110 720 ## COG1312 D-mannonate dehydratase 159 65 Op 6 21/0.000 - CDS 182110 - 183126 872 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 160 65 Op 7 16/0.000 - CDS 183123 - 184616 1277 ## COG1129 ABC-type sugar transport system, ATPase component 161 65 Op 8 . - CDS 184652 - 185743 1135 ## COG1879 ABC-type sugar transport system, periplasmic component + Prom 185820 - 185879 5.6 162 66 Tu 1 . 
+ CDS 185925 - 186797 473 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 186825 - 186875 5.0 - Term 186813 - 186861 12.2 163 67 Op 1 4/0.061 - CDS 186888 - 188300 1753 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase 164 67 Op 2 9/0.030 - CDS 188431 - 189576 1365 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 165 67 Op 3 . - CDS 189627 - 190019 325 ## COG0511 Biotin carboxyl carrier protein 166 67 Op 4 . - CDS 190070 - 190873 857 ## Closa_1325 sodium pump decarboxylase gamma subunit 167 67 Op 5 . - CDS 190891 - 192360 1564 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) - Prom 192436 - 192495 8.8 - Term 192367 - 192404 1.3 168 68 Tu 1 . - CDS 192598 - 193812 1520 ## COG0527 Aspartokinases - Term 193841 - 193875 8.1 169 69 Tu 1 . - CDS 193945 - 195675 2219 ## Closa_3778 hypothetical protein - Prom 195713 - 195772 7.2 170 70 Op 1 . - CDS 195912 - 197345 1355 ## COG1129 ABC-type sugar transport system, ATPase component 171 70 Op 2 7/0.030 - CDS 197342 - 198352 921 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 172 70 Op 3 . - CDS 198368 - 199966 1560 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 173 70 Op 4 21/0.000 - CDS 199990 - 200895 1223 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 174 70 Op 5 . - CDS 200997 - 202499 177 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 175 70 Op 6 1/0.121 - CDS 202518 - 203324 629 ## COG0491 Zn-dependent hydrolases, including glyoxylases - Term 203351 - 203390 4.1 176 70 Op 7 . - CDS 203403 - 204512 1314 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 204758 - 204817 8.1 - Term 204929 - 204983 12.5 177 71 Op 1 . 
- CDS 205061 - 205747 651 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III 178 71 Op 2 38/0.000 - CDS 205785 - 206600 1001 ## COG0395 ABC-type sugar transport system, permease component 179 71 Op 3 35/0.000 - CDS 206615 - 207511 1107 ## COG1175 ABC-type sugar transport systems, permease components 180 71 Op 4 . - CDS 207690 - 209087 1443 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 209134 - 209193 7.1 + Prom 209302 - 209361 6.9 181 72 Tu 1 . + CDS 209419 - 211704 2061 ## COG1609 Transcriptional regulators + Term 211728 - 211780 0.6 - Term 211711 - 211773 5.3 182 73 Tu 1 . - CDS 211820 - 212674 855 ## COG1737 Transcriptional regulators - Prom 212717 - 212776 16.9 + Prom 212715 - 212774 6.6 183 74 Tu 1 . + CDS 212809 - 213357 172 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase + Term 213386 - 213417 -0.1 184 75 Op 1 . - CDS 213408 - 214289 1126 ## COG1284 Uncharacterized conserved protein 185 75 Op 2 . - CDS 214294 - 214518 111 ## gi|160937563|ref|ZP_02084924.1| hypothetical protein CLOBOL_02454 - Prom 214538 - 214597 6.0 + Prom 214489 - 214548 6.9 186 76 Tu 1 . + CDS 214592 - 214822 326 ## gi|160937562|ref|ZP_02084923.1| hypothetical protein CLOBOL_02453 - Term 214707 - 214745 3.3 187 77 Tu 1 . - CDS 214794 - 216155 1223 ## COG0534 Na+-driven multidrug efflux pump - Prom 216338 - 216397 11.0 + Prom 216292 - 216351 9.5 188 78 Op 1 1/0.121 + CDS 216390 - 216797 495 ## COG1959 Predicted transcriptional regulator + Prom 216802 - 216861 4.3 189 78 Op 2 1/0.121 + CDS 216896 - 218779 1932 ## COG1151 6Fe-6S prismane cluster-containing protein 190 78 Op 3 4/0.061 + CDS 218790 - 219242 300 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 191 78 Op 4 . 
+ CDS 219239 - 220483 1169 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases + Term 220507 - 220544 9.4 - Term 220495 - 220532 9.4 192 79 Op 1 30/0.000 - CDS 220605 - 221798 1436 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 - Prom 221832 - 221891 3.5 193 79 Op 2 51/0.000 - CDS 221917 - 224034 1794 ## COG0480 Translation elongation factors (GTPases) 194 79 Op 3 56/0.000 - CDS 224083 - 224553 706 ## PROTEIN SUPPORTED gi|240146972|ref|ZP_04745573.1| 30S ribosomal protein S7 - Prom 224770 - 224829 2.8 - Term 224773 - 224801 -0.9 195 79 Op 4 . - CDS 224857 - 225276 665 ## PROTEIN SUPPORTED gi|240146973|ref|ZP_04745574.1| 30S ribosomal protein S12 - Prom 225357 - 225416 8.1 - Term 225362 - 225404 1.6 196 80 Tu 1 . - CDS 225434 - 226627 1164 ## COG1906 Uncharacterized conserved protein - Prom 226856 - 226915 4.4 + Prom 226600 - 226659 4.9 197 81 Tu 1 . + CDS 226831 - 227184 299 ## gi|160937579|ref|ZP_02084940.1| hypothetical protein CLOBOL_02470 + Term 227191 - 227225 4.1 - Term 227400 - 227446 13.5 198 82 Op 1 58/0.000 - CDS 227488 - 231195 4270 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit 199 82 Op 2 . - CDS 231212 - 235096 4296 ## COG0085 DNA-directed RNA polymerase, beta subunit/140 kD subunit - Prom 235267 - 235326 5.4 + Prom 235289 - 235348 5.8 200 83 Tu 1 . + CDS 235422 - 235883 683 ## Closa_3786 hypothetical protein + Term 235887 - 235920 5.5 - Term 235869 - 235915 9.5 201 84 Op 1 1/0.121 - CDS 235962 - 236387 470 ## COG4506 Uncharacterized protein conserved in bacteria 202 84 Op 2 . - CDS 236405 - 237466 1132 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes - Prom 237507 - 237566 7.4 + Prom 237484 - 237543 4.3 203 85 Tu 1 . + CDS 237587 - 238240 565 ## Closa_0527 stage II sporulation protein R + Term 238464 - 238492 0.5 - Term 238163 - 238204 -0.5 204 86 Op 1 . - CDS 238245 - 238448 331 ## gi|160937587|ref|ZP_02084948.1| hypothetical protein CLOBOL_02478 205 86 Op 2 . 
- CDS 238448 - 239155 873 ## COG1802 Transcriptional regulators 206 86 Op 3 . - CDS 239178 - 240086 1131 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase - Prom 240123 - 240182 6.0 - Term 240188 - 240221 3.2 207 87 Tu 1 . - CDS 240245 - 240976 695 ## Closa_0515 hypothetical protein - Prom 241007 - 241066 5.2 + Prom 241048 - 241107 7.2 208 88 Tu 1 . + CDS 241161 - 241832 731 ## Closa_0514 hypothetical protein + Term 241845 - 241888 12.1 - Term 241533 - 241571 2.6 209 89 Op 1 . - CDS 241771 - 242520 636 ## Closa_0513 GCN5-related N-acetyltransferase 210 89 Op 2 . - CDS 242573 - 243091 625 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family 211 89 Op 3 . - CDS 243109 - 244746 1203 ## Elen_3094 regulatory protein GntR HTH - Prom 244788 - 244847 6.4 + Prom 244722 - 244781 7.2 212 90 Tu 1 . + CDS 244831 - 247113 1420 ## COG2200 FOG: EAL domain - Term 247323 - 247360 5.7 213 91 Tu 1 . - CDS 247497 - 248888 854 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 - Prom 248996 - 249055 3.6 + Prom 248955 - 249014 6.9 214 92 Tu 1 . + CDS 249060 - 250043 677 ## Closa_0500 hypothetical protein + Term 250125 - 250181 21.5 - Term 250112 - 250169 22.0 215 93 Op 1 . - CDS 250254 - 251354 1222 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) 216 93 Op 2 . - CDS 251379 - 251927 590 ## gi|160937603|ref|ZP_02084964.1| hypothetical protein CLOBOL_02494 - Prom 251974 - 252033 8.2 + Prom 252010 - 252069 7.8 217 94 Tu 1 . + CDS 252115 - 253674 1458 ## COG1292 Choline-glycine betaine transporter + Term 253735 - 253779 8.1 - Term 253721 - 253766 8.3 218 95 Tu 1 . - CDS 253797 - 254051 216 ## gi|160937605|ref|ZP_02084966.1| hypothetical protein CLOBOL_02496 - Prom 254124 - 254183 6.8 - Term 254197 - 254237 -0.9 219 96 Op 1 . - CDS 254293 - 255492 1179 ## Closa_0526 hypothetical protein 220 96 Op 2 . - CDS 255525 - 256760 1410 ## Closa_0525 hypothetical protein 221 96 Op 3 . 
- CDS 256796 - 257428 657 ## Closa_0522 TetR family transcriptional regulator - Prom 257511 - 257570 7.7 - Term 257633 - 257674 10.2 222 97 Op 1 16/0.000 - CDS 257701 - 258828 1184 ## COG1879 ABC-type sugar transport system, periplasmic component 223 97 Op 2 11/0.030 - CDS 258844 - 259869 947 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 224 97 Op 3 21/0.000 - CDS 259872 - 260879 930 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 225 97 Op 4 . - CDS 260885 - 262399 203 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 226 97 Op 5 . - CDS 262450 - 262602 71 ## gi|160937613|ref|ZP_02084974.1| hypothetical protein CLOBOL_02504 227 97 Op 6 . - CDS 262632 - 264125 1004 ## COG2207 AraC-type DNA-binding domain-containing proteins 228 97 Op 7 . - CDS 264122 - 265978 1161 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 229 97 Op 8 . - CDS 265950 - 266780 648 ## COG1082 Sugar phosphate isomerases/epimerases 230 97 Op 9 2/0.121 - CDS 266777 - 268438 909 ## COG1069 Ribulose kinase 231 97 Op 10 8/0.030 - CDS 268453 - 269091 587 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 232 97 Op 11 . - CDS 269108 - 269953 477 ## COG3623 Putative L-xylulose-5-phosphate 3-epimerase 233 97 Op 12 . - CDS 269913 - 270581 347 ## COG0036 Pentose-5-phosphate-3-epimerase - Prom 270607 - 270666 6.0 + Prom 270598 - 270657 6.4 234 98 Tu 1 . + CDS 270735 - 271550 594 ## COG0434 Predicted TIM-barrel enzyme + Term 271628 - 271671 10.3 + Prom 271576 - 271635 5.8 235 99 Op 1 3/0.091 + CDS 271694 - 273202 777 ## COG1070 Sugar (pentulose and hexulose) kinases 236 99 Op 2 . + CDS 273221 - 273982 718 ## COG1349 Transcriptional regulators of sugar metabolism - Term 273984 - 274040 11.1 237 100 Op 1 1/0.121 - CDS 274062 - 274997 927 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 238 100 Op 2 . 
- CDS 275108 - 275950 924 ## COG3475 LPS biosynthesis protein 239 100 Op 3 4/0.061 - CDS 276020 - 277378 1285 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 240 100 Op 4 5/0.061 - CDS 277394 - 278452 926 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 278638 - 278697 3.3 241 101 Op 1 . - CDS 278778 - 279593 776 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 242 101 Op 2 . - CDS 279675 - 281108 1207 ## Closa_3794 hypothetical protein 243 101 Op 3 . - CDS 281159 - 283021 1839 ## Closa_3795 hypothetical protein 244 101 Op 4 . - CDS 283044 - 283553 423 ## Closa_3793 hypothetical protein 245 101 Op 5 . - CDS 283480 - 285384 1228 ## Closa_3793 hypothetical protein 246 101 Op 6 . - CDS 285450 - 286235 729 ## Closa_0493 glycosyl transferase family 2 247 101 Op 7 4/0.061 - CDS 286225 - 286866 797 ## COG0279 Phosphoheptose isomerase 248 101 Op 8 2/0.121 - CDS 286863 - 287450 514 ## COG0241 Histidinol phosphatase and related phosphatases 249 101 Op 9 1/0.121 - CDS 287508 - 288578 1206 ## COG2605 Predicted kinase related to galactokinase and mevalonate kinase 250 101 Op 10 2/0.121 - CDS 288613 - 289320 796 ## COG1208 Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) 251 101 Op 11 2/0.121 - CDS 289407 - 293351 4115 ## COG0438 Glycosyltransferase 252 101 Op 12 . - CDS 293358 - 294866 1730 ## COG1134 ABC-type polysaccharide/polyol phosphate transport system, ATPase component - Prom 294930 - 294989 8.5 + Prom 294937 - 294996 6.5 253 102 Op 1 . + CDS 295040 - 295498 484 ## COG0607 Rhodanese-related sulfurtransferase + Prom 295553 - 295612 3.5 254 102 Op 2 . 
+ CDS 295651 - 295968 452 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 296003 - 296044 6.3 255 103 Op 1 . - CDS 296044 - 296616 672 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 256 103 Op 2 . - CDS 296598 - 296735 62 ## gi|160937646|ref|ZP_02085007.1| hypothetical protein CLOBOL_02537 - Prom 296939 - 296998 1.6 + Prom 296617 - 296676 5.9 257 104 Op 1 . + CDS 296760 - 297008 257 ## Slip_1595 4Fe-4S ferredoxin iron-sulfur binding domain protein 258 104 Op 2 . + CDS 297001 - 297738 608 ## CD0628 hypothetical protein + Prom 297766 - 297825 7.3 259 105 Tu 1 . + CDS 297882 - 298565 684 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 260 106 Tu 1 . - CDS 298574 - 299530 248 ## PROTEIN SUPPORTED gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 - Prom 299556 - 299615 11.1 + Prom 299501 - 299560 7.1 261 107 Op 1 7/0.030 + CDS 299667 - 300305 682 ## COG2059 Chromate transport protein ChrA 262 107 Op 2 . 
+ CDS 300302 - 300874 648 ## COG2059 Chromate transport protein ChrA Predicted protein(s) >gi|157101644|gb|DS480680.1| GENE 1 397 - 897 536 166 aa, chain - ## HITS:1 COG:CAC2664 KEGG:ns NR:ns ## COG: CAC2664 COG0622 # Protein_GI_number: 15895922 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Clostridium acetobutylicum # 1 166 1 155 155 69 31.0 2e-12 MRILIVSDTHRRDENLKEVIRRTGPLDMLIHLGDAEGSEHAIATWVNEDCDLEIILGNND FFSCLDKEKELMIGRYKTLLTHGHYYNVSVGAEYLKQEARARGFDIVMFGHTHRPFYEVE KKEGDKDLIVLNPGSLSYPRQDGHKPSFMLMEIDKEGEAHFSLNFL >gi|157101644|gb|DS480680.1| GENE 2 917 - 1516 396 199 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|71274727|ref|ZP_00651015.1| Ham1-like protein [Xylella fastidiosa Dixon] # 1 194 1 195 200 157 44 6e-37 MGHRIIFATGNEGKMREIRLILADLGLPILSMKEAGAEPEIVENGSTFGENAEIKARAVW NLTGDIVLADDSGLEVDYIGGEPGIYSARYLGEDTPYAVKNRSIIERLKEAGGQERSARF VCNIAAMLPDGQVLHTEAVMEGLIAGEPAGEGGFGYDPILYLPEFGKTSAEITMDQKNEI SHRGKALRAMKEALEGVLK >gi|157101644|gb|DS480680.1| GENE 3 1558 - 3297 1499 579 aa, chain - ## HITS:1 COG:MA1513_1 KEGG:ns NR:ns ## COG: MA1513_1 COG4870 # Protein_GI_number: 20090372 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cysteine protease # Organism: Methanosarcina acetivorans str.C2A # 176 577 106 507 511 261 38.0 3e-69 MVVVSDIVRNGIRKKAAALFLAVSLSLGTAGCGQANNPAETQSPGRIQDPDGTQDTGGIQ NPGGIQDSGGIQNPGGIQDSGGMQEPGGTRDSGGMQEPGGTQDSGGIQNPGGTQDSGGIQ NPGGTQDSGSGRNPGGTGNPSGAGGFAAAGGDTSENGENAQGSSTGSSTTGVYQWNRLDT ANIKLPSAYDYRKTGRAPQIGNQGSLGTCWAFASLTALESSLLPGKSMTFAVDHMSMHNS FLLGQDEGGEYTMSMAYLLAWQGPVLESQDPYGDGVSPDGLAPSVHVQEIQVLPSKDYEA IKRAVYLRGGVQSSLYTSMRDYQSQSVYYNRETNSYCYIGNEKPNHDSVIVGWDDNYSRE NFNLDLAGDGAFICTNSWGEDFGDQGYFYVSYFDSNIGVHNIVYTGVEPVDNYDYIHQSD LCGWVGQIGYGQEEAWFANAYRADKGENLAAAGFYATDKNTEYELYLARNLPDAGGGEME RALDRRMLLAKGRLDNAGYYTIPLDKKIPLEDNEKFAIIVKIITPGTVHPVAIEYDAGDG IAQVDLTDGEGYLSHDGKVWEHVEETQSCNLCLKAYTKK >gi|157101644|gb|DS480680.1| GENE 4 3308 - 4471 1019 387 aa, chain - ## HITS:1 
COG:BH1771 KEGG:ns NR:ns ## COG: BH1771 COG0116 # Protein_GI_number: 15614334 # Func_class: L Replication, recombination and repair # Function: Predicted N6-adenine-specific DNA methylase # Organism: Bacillus halodurans # 1 382 1 381 385 389 50.0 1e-108 MKTFELIAPCHFGLEAVLKKEILDLGYEISLVEDGRVTFIGDDEAICRANVFLRTAERVL LKAGSFKAETFEELFQGTRNIPWEDFIPEDGKFWVAKASSIKSKLFSPSDIQSIMKKAMV ERLKNRYGVTWFPENGASYPLRVFLYKDMVTVGIDTSGESLHKRGYRTLTSKAPITETLA AALILLTPWNRDRILVDPFCGSGTFPIEAAMMAANMAPGMNRSFLAEEWRNVIKRKCWYE AMDEAGDLVEEDVQVDIQGYDVDGDIVKAARTNAQSAGVDHMIHFQQRPVSALSHPKKYG FIISNPPYGERIEEKENLPALYREIGERFAALDAWSMYLITSYEDAQKYIGRKADKNRKI YNGMLKTYFYQFMGPKPPRRSQENGSN >gi|157101644|gb|DS480680.1| GENE 5 4502 - 5059 570 185 aa, chain - ## HITS:1 COG:ECs3184 KEGG:ns NR:ns ## COG: ECs3184 COG0622 # Protein_GI_number: 15832438 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Escherichia coli O157:H7 # 1 180 2 176 184 182 49.0 4e-46 MKIMFASDIHGSAYYCRKMLDIYSESGAGRLVILGDILYHGPRNDLPREYAPKEVIAMLN PLKDQIYAIRGNCDTEVDQMVLEFPIMADYGLMVLEGKAFYATHGHVYNQDHLPPLQDGD ILVHGHTHILKAETIEAEGGRHIAVLNPGSVSIPKGGNPSTYAMLEDGVFSIRTLEGEVV KELKL >gi|157101644|gb|DS480680.1| GENE 6 5095 - 5427 333 110 aa, chain - ## HITS:1 COG:no KEGG:Closa_0690 NR:ns ## KEGG: Closa_0690 # Name: not_defined # Def: Rhodanese domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 103 2 101 102 79 36.0 6e-14 MTTFPMISYRQFDQWMEQGKIAQLVDLREPWMFEQDRIWGSVNIPYDELENLMGEIRKDG TIVFYCDRGAKSMVVCRDLWRMGYHAVDLAGGMLNYRGKYIDRRPLSALE >gi|157101644|gb|DS480680.1| GENE 7 5606 - 5839 214 77 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSNKNNFNNDMENSSRNCSQQNNSQNSSQNSSRNSSQNNSQNNSQNSSQNNSQNNSQNSS RNSSQNSSRNSSQNNNY >gi|157101644|gb|DS480680.1| GENE 8 6281 - 6484 200 67 aa, chain + ## HITS:1 COG:SA2494 KEGG:ns NR:ns ## COG: SA2494 COG1278 # Protein_GI_number: 15928289 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Staphylococcus aureus N315 # 1 64 1 63 66 84 71.0 6e-17 
MNKGTVKWFNSQKGYGFITNEENGMDIFVHFSGIASNGFKSLEEGQAVTFDITNSPRGLQ AVNVCAA >gi|157101644|gb|DS480680.1| GENE 9 6647 - 7369 540 240 aa, chain - ## HITS:1 COG:NMB2001 KEGG:ns NR:ns ## COG: NMB2001 COG0791 # Protein_GI_number: 15677829 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Neisseria meningitidis MC58 # 125 230 122 229 251 87 34.0 2e-17 MRKMLGSLLKAGAVCGMLMVMNPFTSLAAIGPGFTAGTYIATITAESVNINKSQDSEEVL MTAKTGNTYEVLEDMGNGWMKVRVNESEGFLPVSGNATVEQVGESEMAEVQQDAIESSDT YKRQQVVNYAMQFVGGRYKYGGSDPRTGTDCSGFTRYVLQNSAGISMNRSSGGQAQQGIA ISADQMQPGDLIFYGSGRSINHVAMYIGNGQIVHASTERTGITVSNWNYRNPVKIVNVLG >gi|157101644|gb|DS480680.1| GENE 10 7609 - 8328 190 239 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 123 229 61 162 175 77 36 5e-13 MSIRGLGKGIIKASGLICLCAGLWAVNPMDSQAAVKQAEVQTRSSYVVKIEAPAVDVHRS ASEGSSRQGQVMRGQTYEVLGRTEQGWVKIRTGGREGYIKTSGNATVVEKAHETVDEDAK MRRQVVEYALQFVGGRYQYGGVDPNKGVDCSGFTRYVLGKAASISLPHSSTGQSSYGKAV TEEQMQPGDLLFYAGGGGINHVALYIGDGEVVHASTEKTGIKTSPYNYRKPVKIVSLLS >gi|157101644|gb|DS480680.1| GENE 11 8565 - 9068 576 167 aa, chain - ## HITS:1 COG:no KEGG:Closa_0686 NR:ns ## KEGG: Closa_0686 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 167 1 157 157 116 41.0 3e-25 MSRTEFLQGLKSELEGRVPYSVIQENLRYYDSYIMEEAAKGQTEDEVIESLGGPRIIART IVDAALDTEDRPDGFDSFESEASYRSGPAGSSQEEREPFRGKKPEVHYVDFGKWYVRLIA GLVVFLVIFLVMTVFFGIMGLAGWILSYIWPVLLVMLAVWMFRGPRR >gi|157101644|gb|DS480680.1| GENE 12 9068 - 11005 1949 645 aa, chain - ## HITS:1 COG:CAC1572 KEGG:ns NR:ns ## COG: CAC1572 COG3855 # Protein_GI_number: 15894850 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 3 645 15 664 665 686 52.0 0 MRDLAYLKLLAKEYPTVKEATSEIVNLTAICSLPKGTEYFFSDLHGEYEAFIHLLRSSSG 
ITREKIKETFGHLIPEEEQVQLANLIYYPERNLARMMKQGHYTEDWQKITIYRLVQSCKE VSSKYTRSKVRKKMPQEFAYIIDELLHVDYNDDNKKLYYNEIIHSIIENDTADKFIIALC HLIQNLTIDNLHIIGDIYDRGPRADIIMNELMCFHDVDIQWGNHDISWMGAATGNLACIC NVLRIAISYNSFDVLEDGYGINLRPLSMFAAKVYQDDPCTRFMPKILDENIYDAVDPGLA AKMHKAIAVIQFKVEGAMIKRHPEYEMDNRVLLSGVDFEKGTVVIEGKTFPMLDMNFPTV DPKNPLELSRGEKELLRTLQASFKHGELLHKHIRFLYSHGAIYKCYNSNLLYHGCIPMKK DGSFDTITMNGVSYGGKALMDFFNQQVQNAYFMPEGTPGKEQAMDMMWYLWCGAKSPVFG KDKMTTFEHYFVEDKSTYKEVMNPYYQLSLKEEFCNRLLEEFGLPVEGSHIINGHVPVKL KDGEKPMKAGGKLFIIDGGLSKAYQSTTGIAGYTLIYNSHHLALAEHRPFDPKKESTPKV SVVEKVKSRVMVADTDKGKELKGQIADLKELVAAYREGTIKERVE >gi|157101644|gb|DS480680.1| GENE 13 11138 - 11941 775 267 aa, chain + ## HITS:1 COG:CAC1330 KEGG:ns NR:ns ## COG: CAC1330 COG1234 # Protein_GI_number: 15894609 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 1 267 1 267 268 311 51.0 6e-85 MEELYVFGTGNAQATRCYNTCFAIKDGDEYFMVDAGGGNGILRILEDMDVDLCHIHNIFV THEHTDHIMGIVWMVRMVAAAMKKGKYEGELQIYCHQGLVDTISTICRLTIQGKFYKMIG SRIFLVPVEDGETVHILDYDVTFFDILSTKARQFGFTTTLKNGRRFTCAGDEPYNAECRP YVEGSDWLLHEAFCLYRDRDIFEPYEKHHSTVKEACELAKELSIPNLVLWHTEDKHLSER KALYTAEGREYYDGNLLVPDDGEIIPL >gi|157101644|gb|DS480680.1| GENE 14 12142 - 13062 534 306 aa, chain + ## HITS:1 COG:BH3506_1 KEGG:ns NR:ns ## COG: BH3506_1 COG2207 # Protein_GI_number: 15616068 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 8 104 7 103 130 84 42.0 4e-16 MKEQIEAVQRMQDYIESHWQESITLADLSKASHFSPWYSARLFRKLTGLSPADYIRRYKL SRSALRLRDEDCKVIDVAFDLGFGSVDGYQRAFLREFGCNPREYAAHPVPLYLMTPYGVK YRFTEKEPMNMEQIKNVFIQVIEKPARKVILKRGIQAYDYMTYCNEAGCDVWGLLTSIKS ISGEPLCLWLPEQYRPEGTSQYVQGVEVSSDYSGPVPEGFDVIQLPAASYLMFQGEPFEE EDFCQAIQMVQTAMERYDLSVIGLAPDPENPRIQLEPIGTRGYIEMLPVKQAVSSSHSHP APAVNP >gi|157101644|gb|DS480680.1| GENE 15 13059 - 13685 804 208 aa, chain - ## HITS:1 COG:CAC2841 KEGG:ns NR:ns ## COG: CAC2841 COG3601 # 
Protein_GI_number: 15896096 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 200 1 196 209 138 42.0 9e-33 MKQNHLLSVRNVTVMAMFGALAAVLMIFEVPLPFIAPSFYGMDISEVPVLVGTFALGPVA GVVMELVKILVKLILKPTSTGFVGEFANFCFGCSLVLPAGIIYRLKKTKKGAVMGMAAGT VIMTIVAVILNAVVMLPFYSHFMPLDTIIAAGAAINPAISNVWTFVILAVGPFNILKGAI VSLLTALVYKRVSVIIHSDSGRRAVKAS >gi|157101644|gb|DS480680.1| GENE 16 14134 - 14724 494 196 aa, chain - ## HITS:1 COG:AF0131 KEGG:ns NR:ns ## COG: AF0131 COG0778 # Protein_GI_number: 11497748 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Archaeoglobus fulgidus # 4 194 11 191 194 96 34.0 3e-20 MDAIDRRRSIRRFSDKEIPHEVICKIVESGIKAPSAKNRQPWKFMVIQGQAKENMLRAFQ AGIQREKAGKAMLPDSSRHLGGAEYTVQIMEQAPVVIFVLNPLGKGMFSPLNDEDRIYEL CNIQSVSAAIENMLLEATDMGIGSLWICDIFFAYEELCAWMDCGDELVAAVAFGYPEESP PPRPRKKLEDVVVWKR >gi|157101644|gb|DS480680.1| GENE 17 14855 - 16039 1231 394 aa, chain - ## HITS:1 COG:CAC3299 KEGG:ns NR:ns ## COG: CAC3299 COG1979 # Protein_GI_number: 15896543 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Clostridium acetobutylicum # 1 389 1 386 389 303 41.0 3e-82 MENFEFFTPTRMIFGKNTHQQVGKIVKEYGFKKVLVHFGGSSAKKSGLLDAVTGSLETEG IQYVTLGGVQANPTLSKAKEGIELCLKEGVDLVLAVGGGSVIDSSKCIADGAGNPDVDVW TYFRQEAVPARALPVGVILTLSASGSEMSASCVITNEENGFKRGFNSVTHRPLFSICNPE LTCTVSKYQTACGTVDIMMHTLERYFSKTRDTELTDRIAEALLKSTVEAGRIADKDPDNY EARAALMWGGSLSHNGLTGAGREFFMQVHQLEHELSGMYPSIAHGAGLAALWPSWARFVC PSDVNRFARYAVQVWNIDMDFDCPMNTALAGIRATEDFYKSLGMPSSLKELGVEEERLEE MAVKCTNMGKRTLPGIRELGKEEMMEIYRMALGE >gi|157101644|gb|DS480680.1| GENE 18 16502 - 17344 723 280 aa, chain - ## HITS:1 COG:no KEGG:CPE0145 NR:ns ## KEGG: CPE0145 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens # Pathway: not_defined # 1 280 1 280 281 323 52.0 6e-87 MPDILEGYRKQKEYLICVDSDGCAMDTMDSKHITCFGPCLIPVWGLSQWEQKIRRRWDEI 
NLYTMTRGINRFKGLAMILAEIDRTYKAVPDVDAFVRWTGEAPELSNQAVKAQWERTGKE IFRQALEWSEEVNRKIGAMPEEAKCAFDGVREALKAAHETADVAVVSSANKKAVEEEWQR QGLMEYVDVVLSQDSGSKSACINALLKKGYCPDHVLMVGDAPGDCEAAADNGVYYYPILV KHEEMSWRRFPAEALKRLVMGEYGGAWQEQMVRTFRDNLS >gi|157101644|gb|DS480680.1| GENE 19 17378 - 18715 1523 445 aa, chain - ## HITS:1 COG:SMa0522 KEGG:ns NR:ns ## COG: SMa0522 COG1840 # Protein_GI_number: 16262728 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Sinorhizobium meliloti # 109 307 34 232 242 132 35.0 2e-30 MRRRGKRLLALALAGVMAASALTGCEKIEDTKAASGNTTAAKVADEGKAGEQPGAGETQK EAAGDKAGENAAGENAAGGNAASAPEEQWAVDAGLYEDESSDELYKKALDEEGGKVVIYS ISSRMAKVKASFEEDYPGMTLEVYDINANDMSTKLSTEYNSGIRTADVIHSKEQTGDYMI NFFNKGILHNYQPESIYKNVNQEYLKDMTPLYFETDWWFYNTGIYTETPISNWWDVTKPE WKGSFVLQNPIGTVSYMALFTTMVEHADEMAEAYKACYGEDIVLADDEPTAAHAWMKRVL ENDPVIFDSNNEVIQSVGEGKESKLIGYTPSSKYRERADKGWNIESEPLKMEPVSGVAFM NFVAVVNEAPHPNGAKLLIRYLLGGEDGNGNGIKPFNTIGGWPVRPETTPAEGNIPLEDM KLWMINYDFVYKNLQDVQDYWYQFR >gi|157101644|gb|DS480680.1| GENE 20 18732 - 19886 1278 384 aa, chain - ## HITS:1 COG:TM1232 KEGG:ns NR:ns ## COG: TM1232 COG3839 # Protein_GI_number: 15643988 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Thermotoga maritima # 1 365 1 351 355 323 44.0 3e-88 MSRIELRGINKYYGRNHVLKDINLVIEDGEFMTLLGPSGCGKTTTLRVVAGLEKPQEGYM YMGNREIVNAPEKFYEEPGKRHLNLVFQSYALWPHMTVFDNVAFGLEVAKVPKEEIRRKV ENALKRMRIEAFIDRYPSELSGGQQQRVAIARAIVSEPEVLLLDEPLSNLDAKLRIEMRS EIKRLHQELGTTIIYVTHDQTEALTMSTKIAIFFEGVLEQVAPPMELYQNPASLRVADFI GNPKVNFLPGKARVAGDTMTVESVLGTASYDRKSFTGEAPGDGSFEAVVGVRPELVEILD GPAEGSIEAEVYSVMPAGSETTIYLEVGGERILSKQNGLKQYSPDQKVWLRINPDKMNVF CKDSGRLAKLAVMDAQQDQQNSNK >gi|157101644|gb|DS480680.1| GENE 21 19905 - 21770 2266 621 aa, chain - ## HITS:1 COG:SMb20155 KEGG:ns NR:ns ## COG: SMb20155 COG1178 # Protein_GI_number: 16263903 # Func_class: P Inorganic ion transport and metabolism # Function: 
ABC-type Fe3+ transport system, permease component # Organism: Sinorhizobium meliloti # 30 621 34 614 616 347 35.0 3e-95 MQTDTKAGTKGGVREFSRKRRLMNQAGSIVKNPYNIMVVISVIFMVYLILIPLGQMIIQT LTLAGADAARVDGGREGMFTPYYWKRVLTSSISQKMLWTPLKNSLLISVCTAAFGITLGS LLAWLMVRSDLPYKKFFSLALIIPYMIPSWCKSMAWLTIFKNERVGGAQGLLGFLGIQTP DWLAYGPFAIIVVLVIHYYAYAYLLVSSALSSINSELEEMGEILGASKARILTKITFPLV MPAILSAVIMTFSKAMGTFGVPSVLGLKIGYYTVSTTMYNSIQNGQNRVAFAISLILIAI ASFSVFLNQKAIGSRKSFSTIGGKGSRSNPIRLGKWRPVIVGILIAFIAVAVVFPVVILL YQSFMLEPGNYSLSNFTLHYWTGGSVPTIDQGEPGIFKNPQFMKYVVNTIKLVVLTSLFA TVFGQLIGYINSRGRKLFSGKLVEQLVFIPYLIPSIAFGAMYLALFATAKKIEMFGYRIT LVPSLYGSFALLVLIAVVKNMPFASRAGTSNMLQISTELEEAAQVENAGFLRRFFKIVFP LSKGGMISGFMLVFISIMKELDLIVILMTPSQQTLPYMAYAYSAENLIQLSSAVTIVMFV LVFFVYWFANTFTDADITKGF >gi|157101644|gb|DS480680.1| GENE 22 21845 - 22855 1039 336 aa, chain - ## HITS:1 COG:CAC0360 KEGG:ns NR:ns ## COG: CAC0360 COG1609 # Protein_GI_number: 15893651 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 4 334 2 326 328 183 33.0 3e-46 MKQVTIKDIAREAGVSIASVSRALNGMDGISEENRSRILRVCERLSYTPNGLARSLVKRR TQTIGIIMPDIMSPFYSELMVKASDAAHKRGYQVLLCNSFRELKAERDYLKLLAEHQVEG ILIFPIGPRSTESMAEFIHNVPMVALNELTGQCGIPYVCADEEQAGRIAAEYLISKGCRN LLFVGFKPERLAHRYRAASFLETARAKRIPAQVYECNLDFRTSFERGYNHFKHFLQSGLA MPDGIVAASDATANGIVKACRENGIRIPEDFSLIGFDNITEELPYIELTTVAVSHEEQME TAVELLLGMVEEGAPVGGKASVKLEPRLVERRSCKG >gi|157101644|gb|DS480680.1| GENE 23 23176 - 24507 1479 443 aa, chain + ## HITS:1 COG:BH3436 KEGG:ns NR:ns ## COG: BH3436 COG0739 # Protein_GI_number: 15615998 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Bacillus halodurans # 244 342 212 305 337 91 46.0 3e-18 MRFFRMLREHLHISTKTVIYGEMMVLSLLCLSIFLFKTDPGQAVMSQASRLAGYMNPLEN IDGADPDKTGTDKKEAKNKDYIKWVDFGVTSEAMNQAFRLDVDTCQADVHLNWVDLLAYL GTRYGGDFSKYKPEHMTAVAEKLKNGETTIEELTKDMKYYSYYRQAYGAVLDGMVGQFEA EIPAGEAPAAYLPSGVAGGAPGDPATVWVTKYGLKAFSPIAKGFPYSDFDDFGVARSYGF 
KRQHLGHDMMGQVGTPVVAVESGYVEALGWNQYGGWRLGIRSFDHKRYYYYAHLRKNYPY QSGLEVGSVVQAGDVIGYLGRTGYSRTENTNNIDDPHLHFGLELIFDESQKDSNNEIWVS CYELVKFLNLNRCEAVKVEGTKEWKRLYGIKDPLVPVQPATPSQAGPAAESQAEPLAKSQ AGPQTESQAGPQTDPQAGSQAES >gi|157101644|gb|DS480680.1| GENE 24 24549 - 25370 1128 273 aa, chain - ## HITS:1 COG:SPy1062 KEGG:ns NR:ns ## COG: SPy1062 COG4753 # Protein_GI_number: 15675054 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pyogenes M1 GAS # 2 257 1 247 262 162 37.0 9e-40 MLLKVVIVEDEEIIRKGLTFALNWLDMGCIIVGTAKDGAEGLETIRREQPDIVLTDIKMP KMNGLKMIETAIKDQQFYSLVLTSYSEFELARQAIHMGVTDYLLKPVDEDELKESLDKIR QQIAYASKYEKIEKISQDRLLTVYDEWKIFDSAKNSMDPYVKRTYEFVKQHYKEKVSINQ VAESLGVSTSYLSRKLKAGLNTTFVDLLNQYRIKKALNMLSQGTMRIYEISDELGFSEYK HFCGVFKKYTNVTPSEFLKNGGVAVVGEKDSRG >gi|157101644|gb|DS480680.1| GENE 25 25419 - 27116 1824 565 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 305 565 324 594 602 161 35.0 3e-39 MKKRKQTCFRDEIRKSLIFHALAPCFISLVVLLLVFTAVGSQQIIRKSRTMLEHFSGEFE GVIDSYVDENKRMAGELDVEQFKGLPSYKTEAVSEIYRFLNGQVYRGDYYLFDRDRNLVF STNSQSNVIQYIGNYLPWNTTDQNDSKNDCIFIYDNTVIDGRALPAWLMFQTVVKDGRLQ GYSGFVLKADAFKERLGNMEQPVLLINKFNRLFTDGVSRFQNDRGKLVEEFRDGGSMVHL ENRWYYTDSVSVLDEEASVQVIYDCTSFVQLCLMSLVLVGFLALAVTLAIYRSAGKVADK KTEIIYDLIGALDQVEKGDLDVSLEITSGDEFERIGHSFNTMIGSIRHLLARHQELAKEN MLATVQILESQFNPHFLFNTLESIRYMIKFGPGEAEKMLVSLSRMLRYSIQNGKDVVTVK EEMDFISRYLQVMLYRYGDRLRYSIDLEEGSRNASIPRMALQPIVENSIKYGFGEDRDCL EIRISTRIQNEVLSVIISDDGVGISTELLEKLKANLDQGQNQTDHIGIYNVHKRIRLVYG SGYGVGIDSEMEKGTVVTLRVPCEE >gi|157101644|gb|DS480680.1| GENE 26 27191 - 28843 1834 550 aa, chain - ## HITS:1 COG:FN0377 KEGG:ns NR:ns ## COG: FN0377 COG1178 # Protein_GI_number: 19703719 # Func_class: P Inorganic ion 
transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Fusobacterium nucleatum # 9 550 9 550 550 471 51.0 1e-132 MTNKKIRWDFWTIVTVVIIALFALFLLYPLISLFLSGFKDTETGVWTIDNYARFFSKKYY KSALLNSFKLTFSVTVVAIILGVPLAYFMSFYKIKGKGLLEILFIISMMSPNFIGAYSWI LLLGRSGTVTQFLKGLGIHMPSIYGFGGMLLVFALKLYPFIYMYVSGALKKIDVALSEAA ESLGCGGLKKVFTVIMPLVTPTLIAAALLVFMNCMADFGTPALIGEGYRVMPTLVYSEFV GETGGSANFAACMATIMVVVTATVFLLQKWYVNTKSFTMSSMRPIQPKEVHGIKGFFIHL FIYLLAATSIIPQCMVVYTSFRATKMQVFVDGFSLESYRKVFSTAIQSIKNTYLYCLAAI VIIVVLGMLIAYLAVRRKSWLTNAIDTIAMFPYIVPGSVLGITLLIAFNHPPMMLAGSAI IMVISLVIRRQAYTLRSSSAILYQISPSMEEAAISLGDSPSMSFVKVTAKMMLPGVASGA ILSWITLINELSSSVMLYTANTRTMSVAIYNEVIRASYGTAAALATILTVTTVVSLLIFF KVSGNKDVTM >gi|157101644|gb|DS480680.1| GENE 27 28833 - 29954 1356 373 aa, chain - ## HITS:1 COG:FN0376 KEGG:ns NR:ns ## COG: FN0376 COG3842 # Protein_GI_number: 19703718 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Fusobacterium nucleatum # 1 368 1 360 371 399 58.0 1e-111 MSVSIGIDNVVKKYGDTTIIPDLSAFIKNGEFFTLLGPSGCGKTTLLRMIAGFNSIEGGT IKFDDKVINDIPAQKRNIGMVFQSYAIFPHLTVRQNVEYGLKIRKVPKEHMKERVDEILK VVKIDEYQDRLPERLSGGQQQRVALARAIVIHPQVLLMDEPLSNLDAKLRIEMRSAIRDV QKQVGITTVYVTHDQEEALAISDRIAVMKLGIIQQIGSPQDIYARPYNAFVSTFIGHSNL FFGTIRKQGSVPSVVFEGGYHVEMANLVDTVTEGQKVVVSVRPEEFSLNPEGMECEIISR TFLGKYTNYFLRFGDGMVLEDQPSIEYSQDLGHVDRIFSVGEKIRLRPNPNKINVFTQDM EQSLIKDVKRYDE >gi|157101644|gb|DS480680.1| GENE 28 30046 - 31161 271 371 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 [Haemophilus parasuis 29755] # 60 350 24 321 346 108 25 2e-22 MKKRRLIPVLCAAAVAVSLAGCSGSSGSATEKGADTQAPKAEAGAETGKEEDGGEKKEEA GASKDEKLIIYSPLTESMIDSMLAMFEEDTGIDAECLAMGTGDALKRIQTEADNPQADIL WSGTIGTVKNKSEYFADYTTSNEDAFYDEYKNTEGNLTRFDTIPSVIMVNTDLIGDIKIE GYEDLLNPELKGKIAFADPAASSSSFEHLVNMLYAMGGGNPDNGWDYVKQFCAQLDGKLL GGSSAVYKGVADGEYTVGLTFEQGSAQYVGAGAPVKTVYMSEGVIFRGDGVYIIKGCPNE 
SNAQKFVDWLTSKDVQEFMNNTQYRRTIRKDVEAGDAMVPMDQIKVIQDDETDTAAHKSE WLDEFKELFTE >gi|157101644|gb|DS480680.1| GENE 29 31209 - 31880 858 223 aa, chain - ## HITS:1 COG:SPy1867 KEGG:ns NR:ns ## COG: SPy1867 COG0274 # Protein_GI_number: 15675686 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Streptococcus pyogenes M1 GAS # 3 222 1 220 223 220 51.0 1e-57 MGMTKEEILVHVDHTLLKAVSAWDEIQTLCEEAIENHTASVCIPPSYVKRVSKAYGDKLN VCTVIGFPLGYNTTETKVYETQKALADGAGEVDMVVNLGDVKNGEFPKITREIEAVKKAA GNHVVKVIIETCYLTEDEKRELCKCVTDGGADYIKTSTGFGTAGARIEDVRLFKRYIGEG VKIKAAGGVKTREDLEQFLEEGCERIGTSSALKLLGGQEAGTY >gi|157101644|gb|DS480680.1| GENE 30 32185 - 32421 241 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937383|ref|ZP_02084744.1| ## NR: gi|160937383|ref|ZP_02084744.1| hypothetical protein CLOBOL_02274 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02274 [Clostridium bolteae ATCC BAA-613] # 1 78 16 93 93 150 100.0 3e-35 MANDLGLYTTVHQALKAGQIPCEVKVVNTGTRNRRTGALIGRVGERPELEIQYYIYTPLE YADRAKYVIDEYRKNLSR >gi|157101644|gb|DS480680.1| GENE 31 32526 - 33800 970 424 aa, chain - ## HITS:1 COG:no KEGG:Jann_4086 NR:ns ## KEGG: Jann_4086 # Name: not_defined # Def: hypothetical protein # Organism: Jannaschia_CCS1 # Pathway: not_defined # 69 403 44 375 380 215 36.0 3e-54 MSEVKNTKNQAAQAEDILREPSGAQKESGTGGKKEDCDRNIRYRGNVYQETLGQLSVPLK DDITVMWKDGKAREQEVPMPGFGKWRDPADFIIRHTYDIWDDKGMGKIYDHYKHNALVHT SDGVTYGRDQVIANSVMKLAGYPDIKDYIDDVIWASDGNGGFHTSMRWTWIGHNTGHTIY GPPTGRKVVVWGVANCYVKGDRVVEEWVVYNEISLIRQLGYDPMKVLEATANKGGSAKAD TGTWGEIERLNGEFAPDRFGDNNGRYSDIEYFIRKSYHEIYNRRMFNQFKDNYAPDYRYH GPSDRELVGRGDFLQDQLNLFQAFPNLAMQVEDVYYLYDEKRDEYRCATRWNIIGTHEGF GIYGPATGAHVVISGISQHIIKDGMFREEWSIYDEFALMRKLAQKRMAMKDNTDGAGNDD GLKA >gi|157101644|gb|DS480680.1| GENE 32 33825 - 35108 705 427 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 1 423 3 426 435 276 35 9e-73 
MSVAVLFVLFLGLAFLGVPIGFAIGIGVLGSVFTSDIISLSYFVRAMANSVDSFTLTAVP FFVLAGQIMSEAGISEGLFKAANVWVGRLKGGIMMVTVLACMAFGAISGSAYATIAAIGL IALPELRKQGLSQGAAAALIATAGCCGQMIPPSMGLVVFGSLNNVSISKLFTAEILPGLF IGCCFLVYCHIYGKKHNICSGNDPIPLRRKLRTMWEAKFSLIMPVIILGGIYSGIFTPTE AAVVSVVYGLIYGLLKRNSGLEARMLPQMFYKTVITTCTILFILGVSSGFGKILTLEEIP TKLAALILGSVSSKIVALIILNALVLFLGTFLDGIAINVILSSILLGIATQYGIDPIHFG VMFIFNSTLGIITPPVGGNLFVAAQISKAPFEEIVKEIWPWILTMGIGLIFVIFIPQMST WLPGLMK >gi|157101644|gb|DS480680.1| GENE 33 35108 - 35650 179 180 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 1 150 2 144 164 73 27 9e-12 KDVKKSVKKVIQWVDDKLEVFIGAVFLALMVVFTTLQIVGRYLFSTPFPWTEEMTRYMFV WMVFISIGYAVKTGEHIRITFVRSLLRPGLRLYLDILCNILCFGFSVICMIEGYKLLSII RQGGQTAVSLPIPMWLLYLVMPLGFLLVCLRLVQCTVKIAGQIRQGGIEADKIEEEKGGA >gi|157101644|gb|DS480680.1| GENE 34 35671 - 36747 285 358 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 [Alteromonas macleodii ATCC 27126] # 5 311 1 287 327 114 25 5e-24 MKKTMKKVITISLVGLALALTGCGTEKPAAAKQEQGSGEDAPSPSGAESSKLEIRIAHQA NEKEATHQGFLKFKEIIEGEDVGIEVKIFPNAQVVGSDRDSIEAVTLGEIDMTSVAELQF APNIKEFYVFNADYLFDDLKDAKAKLDGEQGDLLKKAADDKNMGVKVAAFFGSSGRIFWN NVREVNTLADFQGIKSRAPENPINIAELEALGCIPTPMAWNELYTGLQQGTVDGLVSSKM PIIQQGLIDVLKYATDTNHSYSINVILISSAKYNSFSEEQRAAYDKAIAGAAEEEWKIAM KEEEDSVQKIKELQEQGKILYTELTDEARAEIKEAMVAATEPKVEEMCGKEILETFRN >gi|157101644|gb|DS480680.1| GENE 35 36940 - 37947 726 335 aa, chain - ## HITS:1 COG:VCA0673 KEGG:ns NR:ns ## COG: VCA0673 COG1609 # Protein_GI_number: 15601431 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 4 332 14 339 343 177 33.0 4e-44 MIKSISSFDVAKRAGVSQSSVSRVFNEKWNDRISHKTREKVLKAAEELGYSPNALARGLN SNRTGIIGVVLSDTYNVYYYGLLNILTNLLQEYGLRVMVFNTAPDSDINQILRRLLEYCV DGIIITSSALTHKISGEWKKKGIPLVLLNVYSPNTDINMVYSDNYGAGERAARFLHGLGM REFAYVSAEESCYLNHDERQEGFLGGLKQCGITQCMIQAGDYSYESGYRAGDRIFASGLP 
VEAIFCANDLMALGVMDCGRLKYKYVPGEDYSIMGLDDTFAASLRSYSLTVLQQQNDILC KEAVRVLIENMENPEMPPETVAVPMQLIVRNTVRH >gi|157101644|gb|DS480680.1| GENE 36 38171 - 38950 921 259 aa, chain - ## HITS:1 COG:no KEGG:Tmar_1888 NR:ns ## KEGG: Tmar_1888 # Name: not_defined # Def: xylose isomerase barrel # Organism: T.marianensis # Pathway: not_defined # 25 239 32 247 336 80 28.0 5e-14 MALLISTNMYKAEEFKRILSYVEKWKGQVGVEVFPMFHRQAFASLLEESMDMLKHVPISF HGPYYKAEHSAPQGTQEYEYTMELVGQTLRYSERLNSRYMVFHHNNCAVRDEERMLRDSC ANYRRVEELFRPLGIPVAVENAGVMDRNNMLLDESGFVSLCRAEGYPVLIDIGHAHANGW DLRSVMERLKGQILAYHLHNNDGVHDSHRRIGDGTLDFDSFMAWAGEYTPQADLVVEYGM ETADDCLGIEADVERLLKI >gi|157101644|gb|DS480680.1| GENE 37 38919 - 40097 1086 392 aa, chain - ## HITS:1 COG:DR2153 KEGG:ns NR:ns ## COG: DR2153 COG3839 # Protein_GI_number: 15807147 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Deinococcus radiodurans # 1 364 36 407 455 304 46.0 2e-82 MKNIVFDHVVKAYDKNVIVKDLNMEIREGERLILLGPSGCGKSTTLRMIAGLERLSGGNL YMDGRLVNDVPCGERNVSMVFQNYALFPHMTVQSNIVYGLKAHKMDPAEIKTRLSEVLDM LDLKGLEERRPKDLSGGQRQRVALARAVVKRSDYFLLDEPLSNLDAQLRLRARKELVKIH EMYHQTLIYVTHDQIEAMTVGQRIALMHEGKMQMLDTPANVYNRPANVFTAKFIGSPSMN IVEASYTRGTLVIGRQVVWLPDMWSGLASRNESGRLFLGIRPEHMILHRRRQENTLEGTV KYVEDYGNRYGVYVQVDNMEIIAVSEGDVPAPGESVYIQPDFDRIHLFDRATQVSLGYPE QLRAAREPYILQKGASKKGENGHGFAHQYQYV >gi|157101644|gb|DS480680.1| GENE 38 40136 - 40960 904 274 aa, chain - ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 20 273 29 281 282 165 37.0 8e-41 MKIKEMQPGWHLLLLCAVFLILMPIVFAVSNSFKTMQDAFNTVFQIIPARPTVQNYIHVF SKLPFLKITMNTFLIAATVTVFKTVTGLFAAYSFVYFDFRGKGILYFIMLATMFIPFTVT MIPNYLMISKIGLSDKIWGVALPQLADVLGIFLLRQSMRGIPRALIEAARMENVKNLKIM RDIIIPLVRPSIISTGIIFFINSWNEYVWPVLILKSKENYTLSLALQMYISAEGGTEFTI AMAVSVMTMVIPLALYILFQRYIINTFAMSGIKG 
>gi|157101644|gb|DS480680.1| GENE 39 40950 - 41840 979 296 aa, chain - ## HITS:1 COG:RSc1265 KEGG:ns NR:ns ## COG: RSc1265 COG1175 # Protein_GI_number: 17545984 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Ralstonia solanacearum # 11 294 12 293 293 132 31.0 1e-30 MNQEEKKQTLLGYLFLVPSLAVFAVFMFYPLFYTIYLSFFEWNMVKPVKKFVGLANYAAI FRDPNSWKIAGNTAVYILVLLVLNFVLPYILSFILSAVIKRGQGFYKAVFFLPSVISLVV GSILYIWILNPISGPVALVLKYFGLALPFWSKAEGWVIAVLSIITSWKVFGYNFIIILAG VSGVSQEVVEAARLDNIPLWKIFRDMVVPMSSATGIYVFITTIVQGLQYVFTPIKVVTQG GPNYASSNLIYMSYHEAFTLYRTGTSSAVSVITMVLFIVLLLLEFRFVEKGVYYEN >gi|157101644|gb|DS480680.1| GENE 40 41930 - 43375 1382 481 aa, chain - ## HITS:1 COG:BMEII0115 KEGG:ns NR:ns ## COG: BMEII0115 COG1653 # Protein_GI_number: 17988459 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Brucella melitensis # 96 478 40 418 421 131 29.0 3e-30 MALAMAVSVSASMAVLSGCGAASGSKARDNAGAANAQTGTAAVNQTGSGAEAGTDTNTDT STDTGTGSGTGGNSSSGEKIVIEYWHCNAETQGGLVVDELVKQFNESNDHIQVVAKYNPD MYKGLMQNLQAEAAAGNSPALVQIGWAFLDYFSNNFSYVSPQEAIDRFDKEDAGFLQDKF LPNVLELAVNSQGSQVGIPYSLSNPVLYINKDILREAGLAEDGPATWQEVSEFAKTIKEK TGKYGFYMQEPADFWAQQGLIESSGAKMLTTNAEGKKEASFATEDGIEAMQLIADMVKDQ TALHISWEEGCQSFIDGNCAMLYTTIARRASVQKGAQFDVATVKSPLWKDKERMVPAGGC FLAITAQGDDQIEAAWEFEKFLYSVESMAAWTEGTGYVPPRKDVAGAENGLKTFLAENKM MNAAIQQMDGVVPWTAFPGDAGLQAEQMLLDMRDQILGGQVTAEQGMKAAQDAINQLLAV Q >gi|157101644|gb|DS480680.1| GENE 41 43622 - 45106 1471 494 aa, chain - ## HITS:1 COG:CAC0696 KEGG:ns NR:ns ## COG: CAC0696 COG2721 # Protein_GI_number: 15893984 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Clostridium acetobutylicum # 1 494 1 492 492 561 55.0 1e-160 MQEFIKINREDTVAVALKPLSKGSTVASDPYTVVLNEDIPQGHKFAVCQIPEGAPVVKYG CRIGYASRQINPGDWVHIHNVRTALGDVLEYQYEPQIKELPKSSPSSFMGYKRGDGGCGV 
RNELWILPTVGCVNSVARAMEQEAKRRFMLGNVEDIVAFAHPYGCSQMGEDQENTRKVLA DIIHHPNAGGVLVLGLGCENCNIPVLMDYIGEYDPERVKFLQCQDCEDEMEAAMELLGQL YEKASGDVRVECDASQLIIGMKCGGSDGLSGITANPAVGAFSDILISKGGTTILTEVPEM FGAETILMNRCRDEETFARTVKLINGFKEYFTSHNQTIYENPSPGNKKGGISTLEDKSLG CTQKSGSSPVCGVLEYGERVREKGLNLLSAPGNDLVAATALAVSGAQIVLFTTGRGTPFA TLVPTMKISSNSKLAGYKAGWIDFNAGEMVETKTKDQVAMELFQYVLRVASGEKVKSEEA GFHDLAIFKQGVTL >gi|157101644|gb|DS480680.1| GENE 42 45154 - 46494 1409 446 aa, chain - ## HITS:1 COG:CAC0695 KEGG:ns NR:ns ## COG: CAC0695 COG0246 # Protein_GI_number: 15893983 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Clostridium acetobutylicum # 5 413 16 426 482 376 45.0 1e-104 MARKETVIQFGEGGFLRGFADYFFQKMQDKGLFDGSVVIVQPIEKGMCSVLEQQGCEYNL FLRGIDNGQVVDEHTHIDIISRCVNPYDDFEAYMKLAENPDFRFIVSNTTEAGIEYVDSN QFTDAPARSFPGKLTQLLYKRYRLGLKGFIILSCELIDHNGEELEKCCLRYAKQWELGEE FETWLVRENDFCSTLVDRIVTGFPRDEHEELCRRIGEQDNMMDTAEIFHLWVIQGSHEDE LPLQKAGFHVVWTDNVDPYKKRKVRILNGAHTSMVLGAHLYGLKTVGECLKDEKVSALLR KCIFREIIPAIGDTEDNRKFGEAVLERFSNPFIRHMLLSIALNSVSKFRARVLPTILEYR DMFGSCPQGLTFSLAALMAFYRTDDANDSQEIMDFMKTAPVEDILKREDYWGQDLSPLLA DVKKWYELIETEGMDKAYDAVLSESE >gi|157101644|gb|DS480680.1| GENE 43 46575 - 47858 1068 427 aa, chain - ## HITS:1 COG:no KEGG:Spirs_0433 NR:ns ## KEGG: Spirs_0433 # Name: not_defined # Def: hypothetical protein # Organism: S.smaragdinae # Pathway: not_defined # 2 427 5 430 430 575 61.0 1e-162 MRERQFVADLVSSQDLPKMFKVRQTFPRPRIEPEDIPGIIRGYLSEESFAAKVRPGMRIA ITAGSRGIANVALVTKTIADYVKSRGAEPFVVPAMGSHGGATAEGQKDVLESYGITESYL GCPILSSMEVKKIGVNEEGMDVFIDKYAAEADGIIVSCRIKPHTAFRGPYESGIMKMMAI GLGKQHGAEVCHEAGFKNMAKYVPMFGKAIIENAPVLFAVAVIENAFDETCKIAAVQAED IVEKEPPLLKEAFTYMPRILVDSCDVLVVDQIGKNFSGDGMDPNITGTFCTPYASGGINA QRVCVLDLSPETHGNGIGLGYSSATTKRVFNQLDLASMYPNAITCTVLGGVRIPIVMESD KEAIQVCVRTCNEIDKKNPRIVRIPNSLHLEHIMLSEAYYDEVRNHPGITIESEPEYLPF DEDGNLW >gi|157101644|gb|DS480680.1| GENE 44 47884 - 49170 640 428 aa, chain - ## PROTEIN SUPPORTED ## NR: 
gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 [Lentisphaera araneosa HTCC2155] # 1 426 1 427 432 251 32 3e-65 MNNLVIICAILLLVLLFLKVPVYIAVLSASAVYFIGTPGMNLSIFAQKTISGAEGLSLLA IPFFVCAGIFMNYTGVTKRIMNCCEVLTARMPGGLAQVNILLSTLMGGLSGSSLADAAMQ CKMLVPTMESKGYSKSFSTVITAASGMVVPLIPPGVGLIIYGCINNISIGKLFIAGIGPG IVLCVTLMIFTHFLSKKRGYLPSRTTKIPVSEKIAAVRPAFLPLLLPIIIIGGVRIGVFT ASEAGTVAIVYAVLLGLLYKELTLKNVMQGCKETVTTTASIMLIVAAATCFSWILTKEQI PQQFSQWMISNIHNKYVFLIMVNIFLLIVGMFIEGNASMIVLAPLLHPVAMAYGIDDIHF AMVYIFNCTIGAFTPPMGTLMFVSCGITKCPTKDFIKEAVPFYILFAIDIIVLTYIPQLT TFLVNVFY >gi|157101644|gb|DS480680.1| GENE 45 49163 - 49675 470 170 aa, chain - ## HITS:1 COG:no KEGG:Amet_0551 NR:ns ## KEGG: Amet_0551 # Name: not_defined # Def: tripartite ATP-independent periplasmic transporter DctQ # Organism: A.metalliredigens # Pathway: not_defined # 1 168 1 169 171 112 34.0 6e-24 MKKDSKLFKILINLDIVIASIALVILVGCTFAGVIARYVVGKPFGWIEEIQAAFIVWVVF AAGGAAYRTGNHSAIEILYEIFPKPVRKIVSIFIGLVVTAVLGFLCYTSIGYLQLFMRTG RTTAVLNLPFTWIYMIVPVSCMLQIFNYFLVNVFGYEDEVEKLVEEDEDE >gi|157101644|gb|DS480680.1| GENE 46 49705 - 50823 312 372 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 [Alteromonas macleodii ATCC 27126] # 45 350 10 309 327 124 28 3e-27 MRFNRLAALTMAGVLAAASLTACGGGSTGSTTAAGTTESAAAADSATDTTAAQADADTEP AAGDVVTIQVGYENATSEPAAKAVEKWKELVESKSNGTIKMELFPNSALGKKTELIDQMI LGEPMITVADGAFLADYGVPDFGIFYAPFMFDTWDEEWSAIDSDWYRGLCDELAQKASIR VLSSNWVYGARNILSVKPVTLPEDLAGLKLRVSSNDLSISSFNSLGASSVGMDMGDVYQA LQAKTIDAVENPITPLANRSFQEVAKYLVEDEHILATSMWICGDTFFQSLTADQQKILTE AADEAGLYNNELQEAAEEDARQKLLDAGVTITTLTDEQKAKWIEAGQPFYDIAGETLGWS DGLYDTVKEAAK >gi|157101644|gb|DS480680.1| GENE 47 50875 - 51894 1029 339 aa, chain - ## HITS:1 COG:HI0053 KEGG:ns NR:ns ## COG: HI0053 COG1063 # Protein_GI_number: 16272027 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Haemophilus influenzae # 3 332 10 338 
342 201 33.0 2e-51 MYINAPGQVEFRDIDKPVRQKGEVLLKLLYGGICGSDLGSYKGTFAYFDYPRIPGHEFSA EIVEVDEDNGQGLRPGMIVTCNPYFNCGHCYSCEHGIVNACMDNQTMGCQRDGAFQEYIT MPEERVYDGKGLDPMLLAAIEPFCISYHGVSRAEVKPGDKVLVIGAGTIGVLAAVAAKAK GAVVYISDVSMGKLDMAREFGVDGIILNDSPEGFDKAVKEITDGNGFDIAIEAVGLPSTF QNCIDAACFGGKVVVIGVGKKNLDFNFTLLQKKELNVFGSRNALKKDFVELIDLVKEGKV PVKGIITNIYKFEDGAKAFNDFANNPGNMLKVILDFTDR >gi|157101644|gb|DS480680.1| GENE 48 52148 - 52822 673 224 aa, chain + ## HITS:1 COG:mlr4749 KEGG:ns NR:ns ## COG: mlr4749 COG1802 # Protein_GI_number: 13473980 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 10 159 32 180 248 90 34.0 2e-18 MVKTNLKTLAYNTIKQKIVTCEYAPGTFLNEEILTDELKISRTPIRDALSRLEQEGLIEI KPKKGITVTALSIKDVNMIFEIRKLYEPYILKNYGSFLDEDKLNEFYHIFSHKDANSECF QNNNYFYDLDSSFHQMIMDACPNIYLRQNYALIQTQSERFRFMTGNISNNRLEDTFREHI DIIIPCLQKDWDTAVEKLLYHLDESKRSTFKLVFDSIDSNNIGF >gi|157101644|gb|DS480680.1| GENE 49 52829 - 53017 88 62 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937404|ref|ZP_02084765.1| ## NR: gi|160937404|ref|ZP_02084765.1| hypothetical protein CLOBOL_02295 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02295 [Clostridium bolteae ATCC BAA-613] # 1 62 3 64 64 89 98.0 6e-17 MDVNEAEGHPSVSGMGAGEADAGCGAGITKAAEIGFKQDSGDPTRALDTARNPGTELRQG YA >gi|157101644|gb|DS480680.1| GENE 50 53660 - 54922 1456 420 aa, chain - ## HITS:1 COG:CAC0016 KEGG:ns NR:ns ## COG: CAC0016 COG4198 # Protein_GI_number: 15893314 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 419 1 413 414 468 54.0 1e-131 MAVVKPFMCIRPAADKAARVAALPYDVYNRKEACQAVKGNPLSFLNIDRAETQFGDDVDT YDSRVYDKARELLDCQIEEGVYVTDPEENYYLYELTMEGRSQTGIVACCSIDDYVNGVVK KHENTREDKELDRIRHVDATNAHTGPIFLAYRQNQELKALVAQVKEEAPLYDFVSEDAIR HRVWMIGARSMVEAVEAAFAAIPGTYIADGHHRAASAVKVGLKRREENPDYTGKEPFNYF LAVLFPDEELKILPYNRVVKDLNGLGQEDFLAAVAERFDVAAWGDPGCGEPGADAFWPRE KGTFGMFLGGRWYCLRVKPEFQSSDPVKGLDVSILQDQLLGPVLGVGDPRTDKRIDFIGG 
IRGLKELERRVSEDMEAAFSMYPTSIEELLAVADAGLLMPPKSTWFEPKLRSGLFIHRLG >gi|157101644|gb|DS480680.1| GENE 51 54974 - 56137 1452 387 aa, chain - ## HITS:1 COG:lin2956 KEGG:ns NR:ns ## COG: lin2956 COG0111 # Protein_GI_number: 16802015 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Listeria innocua # 1 387 1 389 395 397 50.0 1e-110 MKKIHCLNPIAKCGTELFPAGYEMTDKAADADAVLVRSASMHDMELPENLLAVGRAGAGV NNIPLDSCAQKGIVVFNTPGANANGVKELVIAGMLMASRDIVGGIEWCKANAEDDNITKD TEKSKKAFAGCEIKGKKLGVIGLGAIGAEVANAATHLGMEVYGYDPYISINAAWRLSRNV KHITNADIIFQECDYITIHVPLMDSTRGMINKEKLAIMKDGVVILNFSRDTLVNDDDMAE ALDAGKVRYYVSDFPNPKVANMERVILLPHLGASTKESEDNCAVMAVKELTDYLENGNIK NSVNFPSCDMGMCQAESRVAVLHMNIPNMIGQITAILAEQGMNISDMTNKSRDKYAYTLL DLEHKAEDSTIQKLRAIKGVLRVRVVK >gi|157101644|gb|DS480680.1| GENE 52 56174 - 57259 1235 361 aa, chain - ## HITS:1 COG:BH1188 KEGG:ns NR:ns ## COG: BH1188 COG1932 # Protein_GI_number: 15613751 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Bacillus halodurans # 1 360 1 361 361 376 50.0 1e-104 MSRVYNFSAGPAVLPEDVLKEAAAEMLDYRGTGMSVMEMSHRSKAYDTIIKEAESDLRDL LHIPDNYKVLFLQGGASQQFAMVPMNLMKNKAADYILTGQWAKKAYQEGAIYGKANAIAS SADKTFSYIPDCSDLPVSEDADYVYICENNTIYGTKYWTLPNTKGKLLVADQSSCFLSEP VDVTKYGLIFAGAQKNVGPAGTVIVIVREDLVTEDVLPGTPTMLRYKTHADAESLYNTPP TYGIYMCGKVFKWLKARGGLEAMKEYNEKKAEILYNFLDESQLFKGTVVKKDRSLMNVPF VTGDEELDALFVKESKAAGFENLKGHRTVGGMRASIYNAMPAEGVGKLVEFMADFEKKHR K >gi|157101644|gb|DS480680.1| GENE 53 57499 - 59181 2151 560 aa, chain - ## HITS:1 COG:CAC3169 KEGG:ns NR:ns ## COG: CAC3169 COG0028 # Protein_GI_number: 15896417 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: 
Clostridium acetobutylicum # 3 550 2 549 554 633 57.0 0 MSKLTGAEIVVECLKEQGVDTVFGYPGGAILNIYDALYQHQDEITHILTSHEQGASHAAD GYARATGRVGVCLATSGPGATNLVTGIATAYMDSVPVVAITCNVTNSLLGKDSFQEIDIT GVTMPITKYNFIVKDVNRLATVIRRAFTIAQTGRPGPVLVDITKDVTAAPCEYEKQVPEE IVRQSDTITEQDMDRAVEMIRKASKPFIFVGGGAVLANASDELRAFAHKIQAPVADSLMG KGAFDGADELYTGMVGMHGTKTSNFGITEADLLIVVGARFSDRVTGNASKFAKNAKILQL DIDPAEINKNIKVDASIIGDVKVILRKLNARLDPVNHDEWIAHIERMKDMYPLRYDKNLL TGPFIVQTINEVTGGDAVIVTEVGQHQMWAAQYYQYRQPRTLLTSGGLGTMGYGLGAAIG AKMGCRDKTVINIAGDGCFRMNMNEIATATRYNIPVVEVIVNNHVLGMVRQWQTLFYGKR YSQTILNDSVDFVKIAEAMGARAYRVTQKEELEPVLREAISLNIPVVIDCQISCDDKVFP MVSPGAPIADAFDDTDLKIN >gi|157101644|gb|DS480680.1| GENE 54 59227 - 60891 2144 554 aa, chain - ## HITS:1 COG:CAC3170 KEGG:ns NR:ns ## COG: CAC3170 COG0129 # Protein_GI_number: 15896418 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Clostridium acetobutylicum # 1 552 1 548 552 676 62.0 0 MKSDAVKTGMQQAPHRSLFNALGMTEEEMRKPLVGIVSSYNEIVPGHMNLDKIVDAVKLG VAMAGGTPVVFPAIAVCDGIAMGHIGMKYSLVTRDLIADSTECMALAHQFDALVMVPNCD KNVPGLLMAAARINVPTVFVSGGPMLAGHVKGHKTSLSSMFEAVGAYAAGNMSEEDVKEF ENKACPTCGSCSGMYTANSMNCLTEVLGMGLKGNGTIPAVYSERIKLAKHAGMKVMEMYE KNIRPRDIMTKEAFMNALTMDMALGCSTNSMLHLPAIAHEAGVELNVDIANEISARTPNL CHLAPAGPTYIEDLNEAGGIYAVMKEISKKGLLNLDCMTVTGRTVGENIKDCVNRNPEVI RPVENPYSQTGGIAILKGNLAPDSAVVKRSAVAPEMLKHEGPARVFDCEEDAIDAIKNGR IVAGDVVVIRYEGPKGGPGMREMLNPTSAIAGMGLGSTVALITDGRFSGASRGASIGHAS PEAAVGGPIALVEEGDIISIDIDNHELNVLVSDQEMEARKAKWQPRTPQVTTGYLARYAS LVTSADRGAVLQVK >gi|157101644|gb|DS480680.1| GENE 55 60948 - 62033 1297 361 aa, chain - ## HITS:1 COG:PA3118 KEGG:ns NR:ns ## COG: PA3118 COG0473 # Protein_GI_number: 15598314 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Pseudomonas aeruginosa # 1 356 1 353 360 405 59.0 1e-113 
MNYNVTVIPGDGIGPEIVREARKVLDQVGKVFGHSFDYTEILMGGCSIDAYGVPLTEEAL ETARKSDAVLLGAVGGDVGNSRWYDVAPNLRPEAGLLAIRKGLGLFANIRPAYLYKELAE ACPLKKEIIGNGFDMVIMRELTGGLYFGDRYTREVDGVMTAVDTLTYNEKEIRRIAVKAF DIAMKRRKKVTSVDKANVLDSSRLWRKVVEEVAADYPEVELSHMLVDNCAMQLVMNPGQF DVILTENMFGDILSDEASMITGSIGMLSSASMNESKFGLYEPSHGSAPDIAGKDMANPIA TILSAAMLLRYSFDMDREADAVEKAVQAVLTEGYRTGDIMSEGCTRVGTCKMGDLIAERI G >gi|157101644|gb|DS480680.1| GENE 56 62303 - 62935 711 210 aa, chain - ## HITS:1 COG:no KEGG:Closa_1592 NR:ns ## KEGG: Closa_1592 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 209 1 209 211 293 66.0 4e-78 MESRQLYRVKEEDLPRLEKLLTGCFEHDPLYCKLIPDEEIRKRLLPELFACDLTEFYETC EIYSDSPEMNSLLVVSDETEPYNPLTYYLAEAWASLRTDEFLIKEDPSLKTLWKFMLGKD YLNSRWTAQLHRTERIHIIYLAVRPDMQHHGLAALLLGEVIRYADQHKLMISLETHNPDN VPLYEHFGFKVFGVMEKHFALRQYCLVREA >gi|157101644|gb|DS480680.1| GENE 57 62943 - 65042 1512 699 aa, chain - ## HITS:1 COG:no KEGG:Clole_0471 NR:ns ## KEGG: Clole_0471 # Name: not_defined # Def: peptidase M28 # Organism: C.lentocellum # Pathway: not_defined # 7 662 2 653 678 734 55.0 0 MNGEEWRKRICMETDTGYSYSLAKRMEKYRTNPVLGYRTAGSRAEFETGEMLVREMESLG LSDVHKDRICVDSWEFEKAVMVFTDSRGREHRFQLGAYQTDFHTGGFREFSLVYLGKGTA ADYGQVDVTGKLVLIDINQRDEWWINFPVYQAWLKGAAALIAVQEQGYGEINDEALNAQD IAGPDYAPAFSMSRADALVLKASMGEDKQVQVRLDASSTVLKDQETYNIIGSIPGTEEGM ILLSAHYDSYFSGFQDDNAAVAMMLGIARAVLRSGYKPRKTLVFCAMAAEEWGVVNSKYD WSAGAYEQVFTARPDWAGKVIADLNFELPAHAHGRKDAVRCVYEYADFLEEFITGLPAGF FADQPYPEGVEVLSPIQTWSDDFSMAVGGIPSMVNDFSAGEFMETHYHSQYDNENFYQEP VYRFHHEMYAGLVLAFDRTAAVPMNFARLFEGAAASIDGSVCRGADADLPGLHLVLKQGI QAGEALYSKIREINRAYGRLLDRGEEEKAFLLAAGCRSLNRELLRIFKKEQDYFVRLSWH DDVLFPQQAVQENVRAMDRAIACLENRNLPGALEAVYTIDNNRYAFLFEEEVYTYFTEYV LKQPPGRLKWGAGRIVHHQNLFRLVERLKARAGEENPDTREEYDILVRARENQLGCYGDD IRYMISSAGKLLKAIRQAGYRASELKHDLYTSESKIQEE >gi|157101644|gb|DS480680.1| GENE 58 65121 - 66092 1195 323 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R 
General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 211 318 507 617 621 63 34.0 4e-10 MLHMKRKLMMAAAGVLAALMVFPAYGASRKPIKSISLTIKAEIKPDTDFGDELIEIETSS NKYSVDGYEILNDDVEWREDTVPRIQITLTANDDYYFQSLPKDKVTIKGGAEFKNSKRED SSTTLLMDVELQALNTSLHALTNVVLTEDGIATWDAIPAAGSYEVRVYRDDKAVGAVMTV STNTCNCRERMQKGNTAYTVRVRPVSKFDSKEKGDWAESPSTYIPEEKAAQFRENPTGGN GEWVQTPENGKWWYRNADGSYPANSWQQIGDKWYFFDAQGYMATGWIQWEGREYYCADNG EMLTNCLTPDSYWVGDDGAKINQ >gi|157101644|gb|DS480680.1| GENE 59 66201 - 67532 1181 443 aa, chain - ## HITS:1 COG:VCA0681 KEGG:ns NR:ns ## COG: VCA0681 COG2206 # Protein_GI_number: 15601438 # Func_class: T Signal transduction mechanisms # Function: HD-GYP domain # Organism: Vibrio cholerae # 19 398 30 414 431 139 27.0 2e-32 MPEYSLPVGNKDLINIARRAFNLVDPRLIGHGARVSYLVFQMLKEDGTYTPSEMRNLLIL AALHDIGAYKTEEIDRMVEFETKEVWNHSIYGYLFFHYFTPFEYWDSVVLYHHMPWNRLR KQKDVPERVREAAQILNLADRADIYFGSSGYTGGYQRFTEFLRRQEGLAYSPRVSGLFRR LGPEVLDRLAVQGQEYCLPEDVSGQFDRTMEEVPFTEDEQKALLRMLTYVIDFRSSHTVT HTITTTVISEELAVRMLDSGNDIRDVICAAMLHDLGKIGIPVEILEFPGKLSPQAMRVMR THVDLTEHILGTNVSARVRDIALRHHEKLDGSGYPRGLNAESLDMPQRIVAVADIISALT GTRSYKDAFSKEKTVFILEDMAEKGLVDSVIVSMFVENYDTIMGAVRQRSQPILDKYHEM QRQYQVLERRMEDRNSQAAARES >gi|157101644|gb|DS480680.1| GENE 60 67655 - 68398 699 247 aa, chain - ## HITS:1 COG:RSc0165 KEGG:ns NR:ns ## COG: RSc0165 COG0842 # Protein_GI_number: 17544884 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Ralstonia solanacearum # 71 245 202 377 381 63 27.0 4e-10 MKRFYTILKTELKLSIRGMDMIIFAICMPLVAISLLGIIFGSRPAYPGAGYTFLEQSFGA VSAIAVCAGGVMGLPLVISEYRQKKILKRYQVTPASPALLLAVQTAVYTVYSLASLVLVY AAAVLFFGARFPGSAAAYLAVFLFVMLVIFSIGMMIGGIAPDVRTAGMWASILYFPMLVL SGATLPYETMPSGLQRIADMMPLTQGIKLLKAVSLGLPAEGIWKSCLYMAVCFLVCGGIA LRYFKWE >gi|157101644|gb|DS480680.1| GENE 61 68395 - 69180 177 261 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter 
sp. AzwK-3b] # 40 241 27 249 563 72 29 2e-11 MTNREEQRKGRETAVNAQAALREKAVTVTGLSKSYSGKKVIDNLNLTIRPGQIYGLLGAN GAGKSTAVECILGTRRADSGEVRILGMDPVTERRHLFERVGVQFQEASYPDRIKVRELCE ETACLYEKTADYRDLLRQFKLEDKADSMLSQLSGGQKQRLFIILALIPGPKVVFLDELTT GLDTKSRREVWKCLLEMKDKGLSICLVSHFMDEVEMLCDQIGILRDGMFVFEGTVKEAVD NSPYSKLEDAYLWWTQEEENR >gi|157101644|gb|DS480680.1| GENE 62 69230 - 70003 817 257 aa, chain - ## HITS:1 COG:CAC3509 KEGG:ns NR:ns ## COG: CAC3509 COG0789 # Protein_GI_number: 15896746 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 250 1 249 250 110 30.0 3e-24 MNTYRTVDLARMFGIHVNTVRLYERYGLIPKAERTQSGYRIFTELHVEQFKLARAALRVE VLQNGLRKQAVTIVKTSASGDFETALKLTKRYSDQVDQEIKHAEEAVRICRRMLAGIKEP CSGKDREQTMYTRKEAAGILGVTIDTLRNWELNGLFSVRRMANGYRVYTGEDIQRLTIIR SLRCANYSLSSILRMLNALSRNPSADIRKIIDTPGEDDDIITACDKLLTSLAEAKENAGY VESQIGKLMEMNRREQL >gi|157101644|gb|DS480680.1| GENE 63 70165 - 70380 236 71 aa, chain - ## HITS:1 COG:no KEGG:Closa_0502 NR:ns ## KEGG: Closa_0502 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 70 9 78 79 72 45.0 5e-12 MNGFLQIVNQCSGAVNAIFPDGKRRDLNKSYAAQKVLWDQFRENQGSLALKLDFQKPEDY ISIVYYYISEI >gi|157101644|gb|DS480680.1| GENE 64 70520 - 71434 537 304 aa, chain - ## HITS:1 COG:BH0724 KEGG:ns NR:ns ## COG: BH0724 COG2207 # Protein_GI_number: 15613287 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 25 296 23 290 298 121 28.0 2e-27 MKIIRKVELKPGSYEEELPGISAEFPYIASYVELDKYAGKQSPWHWHKEAELFYMEQGSL EYDTPHGKAVFTAGSGGFVNSNVLHMSRAKDGEGSVVSLLHIFNPLLISGQSGSIIDRKY VSPLMTASHVEIIGLYPDQPSHVPVLETLRQSFRISEDDCGYEIRLRSVLSEIWCGLLAM TETMEKSPKYNSRSNQKIKIMLAFIHKHCGDKLSVKEIAASAFISERECFRVFRDCLNMT PVEYVTSFRLQRACWMLSEGNEPITSICHICGLGSSSYFGRVFRSSMGCSPKEYRQKWQN SDTR >gi|157101644|gb|DS480680.1| GENE 65 71457 - 72860 997 467 aa, chain - ## HITS:1 COG:no KEGG:ELI_1104 NR:ns ## KEGG: ELI_1104 # Name: not_defined # Def: 
hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 442 1 451 500 131 24.0 6e-29 MRADSGLNRLVYEYFEARILYGYYKYGESLPSINKICQMFHLAQATVRAGLALLEKGGYV RVDPRTAPVVVYKAGPAGFRESAARYFVSRKAGIADWVLSGKLLFEPLWRECLMRWSDED WKRFLDNMKHLSVESVSMPAALYILVFKAPDNRLFLNLYWETVRYIRFAYLMVRKDQAVL TDQELEGLSREEVISLLGQTFETLYGRITADLLSFIAQAQDEFDLGDMEQVPFYWNIYRK RPQMRYTLVSRILREIERGDYPRGTYLPSLPCMAQRYEVSFNTVRRSLSILDSMGITRSL QGKGTLVLAEAEKADLANPEIREGMKLYLDSLQLMTLTIKPVVQYTLENAPADAVEKLQY DVRQLGKNRNSYKIFEVVLTFIAENCRLNLVKECYNKLRELIVWGYPIALLRLREREFHM EYNDMVAQAGRYLDEGNAEAFSGSVKEFMGREYQEVRIFLESADTRQ >gi|157101644|gb|DS480680.1| GENE 66 72865 - 76203 3303 1112 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 575 967 268 665 676 186 31.0 2e-46 MDKNSTSIKKKIAMAAVLVGAVLSILTFLFIHEVKEELWKQSVQTIIESTQQGCNTLKVQ LMDDYQSMGTVTLNLREIASGQREELELLMDNYAQIENGISLYLQDGVCIPSGSVIDEKA EAVLLDTDRGNGIIDPHISSVTGVNVFDLFMGVTLKDGVRGYLVKEYEVDCIVDSFTLSF YNNAGFSYVVNQKGDVLIRSPHPNSNKTVQNLFDMLPETQNSRESLDQFARSLQSSYTGW ATFNYQGEATVFCYTPLKLQSDWYLISIIPQSVVNAQTNEILMRSFTLIGCILLGIALLL VFYLRYANRTNRKLKNQAVYISRLYNAVPEGIALMSVEAPYDFRLMNQEGMRLFRCQEGA SNEAMKGKSLRDVIHPEDYENLVQLFKDTADSGRKNIFEVRVITLDNGFFWAAGIVERTL DENGNPVLISAFHDITDEKLAEEAAEREKLQERLTLVGAISNAYPVIISLNLTRDSLNFI YVKPGLMLELGNQESYSQLYEYVAASIHPEHLEEFCRRFSAENLNKTLGQEKNEVFLDVR QMLGDGRYHWVSMQIIHVDNPYSGDKLAILISRRIDEQRYEEEQKRSALQSALDNARAAS EAKSQFLSNMSHDIRTPMNAIVGMTAIASTRLDDRERVMECLKKISLSSKHLLSLINDVL DMSKIESGKLSIREEPFNLAELVTESAELVRPQAEAKHLEMDVHLKGLKNEEVVGDSLRI RQIYINILSNAAKYTPEGGSVHVEVRQEQGAGRGYGRYVFRCADTGIGMSSEFMSKLFQP FERAQDSTNSRVTGTGLGMAITKNLIDLMNGDIRVESRPREGSVFTVALPLQFQDAAPEA VPEEWVGIHCLIVDDDEQTCENASELLEDMGLRPQFVTKGAEAVRRVLDLKDSDDPFRLV IVDWKMPDMDGVEVARRIRQEVGREIPVIVLTAYDWSEIEAEAREAGVSAFLAKPFYRSK ICYLLSGLSGEKEPVQWSGFTGKSDFTGKRVLLVEDNEMNREIAGTLIEEMGVEVEEACD 
GEEALERFKQVPEHYYNMILMDVQMPKMDGYESTRAIRALEREDAASVPIIAMTANAFAE DVQNALHAGMTAHFAKPIDASVLEQMLYKYLR >gi|157101644|gb|DS480680.1| GENE 67 76203 - 77477 973 424 aa, chain - ## HITS:1 COG:SP1897 KEGG:ns NR:ns ## COG: SP1897 COG1653 # Protein_GI_number: 15901724 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 4 361 5 355 419 88 26.0 2e-17 MGRRMVGLYAVFLASALMLGGCGTGNKVVNFEIPNENVTTITFFGNKYEPENVTVIEEIL SDFMKENPGIRVSYESLKGNEYYEALGKRFDAGKGDDVFMVNHDVLLEMQSRGQLMDLSG LKTIQDYSDRMLRQMDEHGKIYWVPTTVSGFGLYCNKKLLKEHKQKLPENLQEWRQVCSY FKEQGITPVIANNDISLKTLAIGRSFYSAYQENREDEMFRHLNEGKEELSFCMTPGFSLV KEFIDNGYVDAGKALETRKTSDDLEEFVKGESPFMLTGAWAAGRVEGMDPDFDFVVAPYP VLEDGGLVVINADTRLSVNRASEHIDEALAFVEFFTRSENIQKFADQQSSFSPLREGIPS SVKEIQPIVSCYQSGRTVIGTDALLEFPIWELTARASERLMKGEELEDTMKWMDQEAAGR RGGW >gi|157101644|gb|DS480680.1| GENE 68 77696 - 78172 278 158 aa, chain - ## HITS:1 COG:SMa2239 KEGG:ns NR:ns ## COG: SMa2239 COG4422 # Protein_GI_number: 16263660 # Func_class: S Function unknown # Function: Bacteriophage protein gp37 # Organism: Sinorhizobium meliloti # 1 129 106 232 259 61 31.0 8e-10 MIRERTDIDFLILTKRIDRFLVSLPPDWGTGYDNVNIGCTVENQKMADYRLPLFLSYPIK RRFIACAPLLEAIDLTPYLHGVDHVTVGGETGRDARVCDYDWVLDIREQCVKANKTFWFK NTGSFFRCNGAVEKINPFKQTGLAKELGIDISDGKRLF >gi|157101644|gb|DS480680.1| GENE 69 78499 - 78963 330 154 aa, chain - ## HITS:1 COG:MA0108 KEGG:ns NR:ns ## COG: MA0108 COG0346 # Protein_GI_number: 20089007 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Methanosarcina acetivorans str.C2A # 1 140 10 151 163 116 43.0 1e-26 MKYEGVCIAVKDVNLSKKFYQDLFDLEVFQDYGINVSFGGLSLQQEFDWLTGVPKESILE KAHNMELYFEEDDFDGFIDRLGQRNDIQYIGDGVKEAGWGQRTVRFYDLDGHIIEVGENM KMVVRRFIHSGMSMEETSRRMDVSVSDLEKLLQS >gi|157101644|gb|DS480680.1| GENE 70 79247 - 80797 1359 516 aa, chain - ## HITS:1 COG:CAC1816 KEGG:ns NR:ns ## COG: CAC1816 COG1418 # 
Protein_GI_number: 15895092 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 9 516 7 514 514 535 61.0 1e-152 MSQIMVPIMIIVMIVVAIITWFAARNYERKQYDSKVGSAEEKSREIIDEALKTAETKKRE ALLEAKEESLKTKNELEKETKERRAELQRYEKRVLSKEENVEKKADALEKKEADLVRREN ILSKRTAEVEAQYEQGIQELERISGLTSEQAKEYLLKSVEEDVKHDTAKLIKELENKAKE EADKKARDLVVTAIQRCAADHVAETTVSVVQLPNDEMKGRIIGREGRNIRTLETLTGVEL IIDDTPEAVVLSGFDPIRREVARIALERLIVDGRIHPARIEEMVEKAQKEVENNMREEGE AACLEVGIHGIHPELVKLLGKMKFRTSYGQNALKHSIEVAQLSGLLASELGVDVRLAKRA GLLHDIGKSVDHDMEGTHVQLGADLCKKYKESATVLNAVESHHGDVEPTSLISCIVQAAD TISAARPGARRETLETYTNRLKQLEDITNSFKGVDKSFAIQAGREVRIMVVPEQISDDDM ILLARDVSKRIEDELEYPGQIKVNVIRESRVTDYAK >gi|157101644|gb|DS480680.1| GENE 71 80932 - 81549 419 205 aa, chain - ## HITS:1 COG:CAC2410 KEGG:ns NR:ns ## COG: CAC2410 COG2137 # Protein_GI_number: 15895676 # Func_class: R General function prediction only # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 11 203 16 214 214 65 23.0 7e-11 MMVTAVIPVDKRKSKVFLEEGFAFVLYRGEVERYRIEEGRELEDEVYEEILRDVLRPRSK EYALHLLKDSGKTEKWMKEKLGKAGYPREAVEYAVNFLKEYRFLDDNAYAQSYVRSYAGK KSRRQMVYEMQQKGVDSSYIEEALDQSPVDEEESARQALTRRLKGKQEASYQEKGRLAAF LGRKGYSFEVINRVLREIKTAENQD >gi|157101644|gb|DS480680.1| GENE 72 81546 - 82682 1434 378 aa, chain - ## HITS:1 COG:BMEI0787 KEGG:ns NR:ns ## COG: BMEI0787 COG0468 # Protein_GI_number: 17987070 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Brucella melitensis # 15 364 29 377 378 464 67.0 1e-130 MAREEKSNVINRDMNKDEKLKALDAALTQIEKAYGKGSVMKLGDSRANMNIETVPTGSIS LDIALGLGGVPKGRIVEVYGPESSGKTTVALHMVAEVQKRGGIAGFIDAEHALDPVYARN IGVDIDNLYISQPDNGEQALEITETMVRSGAVDIVIVDSVAALVPKAEIDGEMGDSHVGL QARLMSQALRKLTAVISKSNCIVIFINQLREKVGVMFGNPETTTGGRALKFYASVRLDVR RIETLKQGGEVTGNRVRVKVVKNKIAPPFKEAEFDIMFGRGISKEGDILDLAVKENIIEK SGAWFAYNGSKIGQGRENSKQYLSDNPAVCAEVEAKVRAKYELPGAEEAMEAAEAFKTAA KQAETKTGETDADAPELV 
>gi|157101644|gb|DS480680.1| GENE 73 83156 - 84205 823 349 aa, chain - ## HITS:1 COG:no KEGG:Closa_3771 NR:ns ## KEGG: Closa_3771 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 349 1 345 345 262 44.0 2e-68 MEEKQFRDKIYWLTFLFSLLVVWVHSFNGELFMGTPEAALKVAGIERILGERLGQIAVPG FFMVSSYLFFRGFCWEKLIPKWRSRIRSLVLPYLLWNFLYYIGYVAATRLPGLSGVVGKT PIPFNGTQLFDALVCYSYNPVFWYLFQLILLVALAPLVYTVMRGNLTGAAALGIMAFGLW KNWAMPLLNLDALFYFCAAAWASLHRDTWGRGIEERTGVRKNMTAGAILLTAMGLLLYLG RFGGLLWERPLYTVCWRLWGVCGAVLAVKAADLPDAREWMKHNFFLYAIHFAWVRFINKA AAAAFPGSAVIALAAFILMPVLMTAVSALIGGAMRRFVPNVYYMLSGGR >gi|157101644|gb|DS480680.1| GENE 74 84342 - 85511 1505 389 aa, chain - ## HITS:1 COG:CAC0819 KEGG:ns NR:ns ## COG: CAC0819 COG0462 # Protein_GI_number: 15894106 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Clostridium acetobutylicum # 14 384 11 366 371 399 51.0 1e-111 MLYEEKIIETIPVGPLGLIPLKSCTELGDKVNAYLVDWRRERESEHKSTIAFSGYQRDSY IINAVTPRFGTGEGKGLVTESIRGDDLYIMVDICNYSLTYSLFGMTNHMSPDDHFQDLKR VIAAAAGKAHRINVIMPFLYESRQHRRSGRESLDCALALKELVNMGVENIITFDAHDPRV QNAIPLRGFETVQPIYQFLKYLLKNETDLQIDSSHMMVISPDEGGTGRAVYFANMLGLDM GMFYKRRDYTKIVGGRNPIVAHEFLGASVEGKDVLIIDDMISSGESVMDVAKELKRRKAR KVFVCATFGLFTGGLTKFDEYYDKGIIDRILTTNLVYQTPDLLERPYYINVDMSKYIALI IDNLNHDASLSDLLTPTKRINRLLAQYRS >gi|157101644|gb|DS480680.1| GENE 75 85632 - 86612 992 326 aa, chain - ## HITS:1 COG:CAC3588 KEGG:ns NR:ns ## COG: CAC3588 COG1484 # Protein_GI_number: 15896822 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 1 323 1 320 329 169 30.0 9e-42 MALSNSQYDAIMREYGRQQIENHHKLEERRQEIYARLPVVRQLEAEIAERSVACAKKLLE GDKSVLDRLKEDLKDLREQKSLIIRAAGYPDDYLELHYRCPDCRDTGLIDGRKCHCFLQA QMKLLHAQSNLEDVLERENFNALSYEYYDDTEILSQLGITNAAYMRRVVAGCREFVRDFD KKHDNLLFTGSTGVGKTFLTNCIARELMDDFHSVIYLTASDLFDVFSRNKFDYDNAEDMK 
DMYRFILDCDLLIIDDLGTELNNSFTSSQLFYCINERMNMSRSTIISTNLTLARLRDSYT DRVTSRIMSGYRIIPLYGGDIRLLKK >gi|157101644|gb|DS480680.1| GENE 76 86634 - 87860 913 408 aa, chain - ## HITS:1 COG:CAC3587 KEGG:ns NR:ns ## COG: CAC3587 COG3935 # Protein_GI_number: 15896821 # Func_class: L Replication, recombination and repair # Function: Putative primosome component and related proteins # Organism: Clostridium acetobutylicum # 168 395 105 316 328 108 24.0 3e-23 MGGNMKDRWSVPVTAVANEFIDTYMAAANGEYVKVYLYVLRHQGEDITIELIADALNHTE SDVRRALSYWKKAGVLTASEQGQPVQGEPVRPESGGHTFTRGQDGGQAAVPDLRVEPFQR ETGTAGRAEVRGMTGMREPAVVQEPAVVRERTETTGQEDGCIPVYSTEQVNRLAQDEEFS QLLYIAQKYMNKVFTPRDCQVFAYLYEGLSMSSELLEYLVEYCVQNGHISVRYMETVAMS WHEKGIRTALEAKDYSASYNRDSFAVMKAFGINNRKPAAPEQKLMDKWFRDYGFSREVVL EACSRTITAIHNPSFQYADKILSDWQKAGVRGLADIKDLDAKRTAAREESGESREKRLQK YDSGVSSARQGSGARKNSSNQFHNFKQRDTDYDALVLKQVKEWVGQQP >gi|157101644|gb|DS480680.1| GENE 77 88123 - 89508 1525 461 aa, chain - ## HITS:1 COG:CAC3225 KEGG:ns NR:ns ## COG: CAC3225 COG0773 # Protein_GI_number: 15896472 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Clostridium acetobutylicum # 10 461 11 458 458 448 50.0 1e-125 MYQIDFNKPEHVHFIGIGGISMSGLAEILLDEGFSISGSDSHETALTEHLETKGARVYYG QRAANIEREPGISVVVYTAAIHEDNPEFRAARDQGLPMLSRAELLGELMRNYKEAIAISG THGKTTTTSMLTDILLAGDLDPTISVGGILKDIGGNIRVGGPDLFVTEACEYTNSFLSFY PTMEVILNVEEDHLDFFKDIEDIRHSFRLFAEKLPAKGLLVVNSGIEKVEEITRGLRCRV VTFGKDLSSTYTARNITYDGFARPSYDLVVQGETVTRITLGVTGEHNVYNSLAAIAIALE LGVGFDAIMSGLGRFTGTDRRFEKKGEVAGVTVIDDYAHHPQEIKATLAAARNYPHRKLW CVFQPHTYTRTRAFLDQFAEALSAADEVVLADIYAARETDTLGVSSRDVADRIEKLGTKA HYIPSFDEIETFILENCMNGDLLITMGAGDIVKVGEKLLGQ >gi|157101644|gb|DS480680.1| GENE 78 89791 - 90288 621 165 aa, chain - ## HITS:1 COG:CAC2413 KEGG:ns NR:ns ## COG: CAC2413 COG4708 # Protein_GI_number: 15895679 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 11 158 10 159 166 83 35.0 2e-16 
MRQNNRKLYGLAQGAMIAAIYVALTMVFAPISFGPVQFRIAEALCILPFFTPAAVPGLFA GCLLSNLLCGAMPLDVVFGSLATLIGAAGSWYLRKHKWCVCIPPIMANTIIIPWVLRYAY GSEDMILLAMVTVGIGEILAIGVLGNALLITLERYKNLVFRQQAA >gi|157101644|gb|DS480680.1| GENE 79 90738 - 92735 1261 665 aa, chain + ## HITS:1 COG:BH3678 KEGG:ns NR:ns ## COG: BH3678 COG2972 # Protein_GI_number: 15616240 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 241 538 220 507 605 120 28.0 7e-27 MKKTFQSIGYKLLSTLVIATTFPMLVYCFFFSYDMRKSAENSYIEQTQHTWQIVSSNIEH SLSTSIDVANKGIYLNGTLQDLLFSRDGNAFSYAESANSSLLFSYMTNIYSMTPEATQIQ FSTYKAGKSFLLTTKNLQKYLRNNSLEKNSPSPVPAFQAYIMPPHRQTSYGHQLNHISLS QEEEYADTPGTGELVFTVCLPIYHLPSPMTPIGQLRIDISMEFFQKVCNFLYDQSEQFFI VDSDYHVIFASDPSAIGTMTSEPWIQSLIGQAADDDDLFVSKTSGDALRICQKLPGDSYH WYMLKSIPKHTIYQSSNNQLATLLLTFSICLLITILINGYSILHYTNPLKKATTFLNHIN THTQNLNSRLSEYVTYKSEDEIGILFRSLEEMMDTINNFVIRQYELEIINRTTELKMLEA QINPHFIYNTLQCLATKSLEHHDQEQYDYISSFGQLMQYSMDTKHTLVTINEELSHIDRY IKLQKMRFTNHLSVFFEVEETVRKIIVPKMILQPLIENSFKHGNLFKREKGTILVQAHLK EDRLLHVYIVDNGCTPSPEKLAKINKLLIKLKQEYTHKLLTPNRLSSIDGSREKPACQDA DLSIKNAAENLYATNNIGLTNVLLRLLLNFGQECSMELYANELGGTTVHLIISHKTLWNQ KPTDT >gi|157101644|gb|DS480680.1| GENE 80 92756 - 94381 892 541 aa, chain + ## HITS:1 COG:lin2118 KEGG:ns NR:ns ## COG: lin2118 COG4753 # Protein_GI_number: 16801184 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Listeria innocua # 1 530 2 490 494 104 23.0 4e-22 MKVLISDDEHHVIQAIRLLVPWEEFGIDQIYTASNGMEALDIITREVPEIVITDIVMEDK NGIDIMDFIATRHPSIKVIAVSGHNDFEYVRTMLTKGCMDYLLKPLESATLISTVGKAVQ SWKDEHETNLRHRHLQEKVHSLSAFYSGVLLYKMLDSRCVETAYEELLQADASFSSVDTC KIIYYDTSYFPMQNAGFACLLKDFESQVRRSLGGQHGFILSNPDHPGEIMFFLLGAGDMT ATSIIKAAQSFFMNTAYPFHLGISREHPFPGRFSDSYTQAKNAFFCFYGDIASSSTAMAA LPSMEKLRFCPETEQLHQLEDQIFSSLLIGDRREIEGSAADWLAQILPEREIPLYLIHYV 
LETFHGLLKQWASKIRKKNPAFTYMPAENPLVYEELMGENYLFSREALRQAMFTSLYCIC DELKESRSTSDIMLQIAQYMELNYTKPFIQSEYAKLFYINKDYMSRKFTSTFHVNMLTYL NQIRIRHARELLSDFSLKIQDIAYAVGFKDEKYFAKQFKKLTNTTPGDYRAALRKEKRQG Y >gi|157101644|gb|DS480680.1| GENE 81 94412 - 95185 675 257 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937443|ref|ZP_02084804.1| ## NR: gi|160937443|ref|ZP_02084804.1| hypothetical protein CLOBOL_02334 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02334 [Clostridium bolteae ATCC BAA-613] # 1 257 1 257 257 533 100.0 1e-150 MKNEYGKNYYVVKADQVEKHFDDNGFSRTELLPGVYDGGIRSYKCFLKAGCKISPELYKD KGVVILFGQGKGVLKDKEGEHAITELAFYVPDFDKDTYEIRAEEEMEFVINVIEFNQWDW EGFHAWHIGLPYFRLHSQCVPYVQDCKGPNTEARMVLCPKWFGRVLMGTTRADGEGTVEK GHPAVHQWNYCVGNSDFTMSVGYKDQNNMENVNHKAGDWSFIPAGADHDLVSEPGKEVYY VWFEHFTDEKKFAKPEN >gi|157101644|gb|DS480680.1| GENE 82 95223 - 95993 699 256 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937444|ref|ZP_02084805.1| ## NR: gi|160937444|ref|ZP_02084805.1| hypothetical protein CLOBOL_02335 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02335 [Clostridium bolteae ATCC BAA-613] # 1 256 1 256 256 541 100.0 1e-152 MEKYYLVKAGDIPQAYNDSGFAMSELLAGTYEGGIRNYKCFLKAGCEVSPDLCKEDTVLL MFGKGTGYIYSKNDMHNISELSFYAPDFDKEPYTIHAIQDMEFILSIVEMNQWDKELFEA SHVRLPFFLQVSDAVKYDQDCKGPHTTSWHILSPKMLGRIMVGVVRAEGEGTTEKGHSKV HQWNYCVGNSDFSLTVDDCPAVNHRTGDWSFIPAGPDHALVAEPGKEVYYVWYEHYTREK DFCIALAKGEEFDGKY >gi|157101644|gb|DS480680.1| GENE 83 96007 - 96756 480 249 aa, chain - ## HITS:1 COG:no KEGG:Celal_1465 NR:ns ## KEGG: Celal_1465 # Name: not_defined # Def: hypothetical protein # Organism: C.algicola # Pathway: not_defined # 54 238 74 262 278 69 26.0 1e-10 MAIYIAQKEDLVSEYDDSGFAKKEALPGTYEGGIRNYKCFLKAGCQVEPEMYEDKLVLFF FGKGTGYITDQKGAYSIEELCFYVPEFNRTPYAVHATGDMEFMMCVVDMNQWDWKVYNAS HVRLPFFRTVSKCIKYDQDCKGPHTTSWFVLTGQQLGRIMVGAVRAVGEGTAEKGHPAVE QWNYCVGDSDFELTVDGETRPHRSGEWSYVPAGLDHSLMAEPGKEVYYVWFEHFTKEKDF IVKPLLENF >gi|157101644|gb|DS480680.1| GENE 84 96775 - 96942 112 55 aa, chain - ## HITS:0 COG:no 
KEGG:no NR:no MPDSMRQWMILEVAGIFGYTGACYLLTAGICILGSTRASVLNTLEPTYKDRKSDR >gi|157101644|gb|DS480680.1| GENE 85 97078 - 98442 1225 454 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_0716 NR:ns ## KEGG: GYMC10_0716 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 83 452 62 428 429 154 31.0 1e-35 MKKCLALVLAAALAASALAGCGSGSSSTDTKASETKAEETKAAGDTAAAGESGTSEKEPG SVNLSIMLALGQWTDNFDSVIEDYKKDNPQIGTIEYEFPSSSTYWDLLKGYLSAGTMPDI FGCGFGEQIGNWHEYLADLSDLESAKDLSAEQVAMCSVDGKTIQVMPMVMEGWGILYNMK YLNQVGWEKPPETISELEKLCKDLKAAGIQPFIHHYAETSLSLTNHLGSTWITAKEDPLG YFEELKSGKDMNLAEDPELNAMLDYYDLVLQYGNDDAIATDKWTGRNAFFLEEAAMIDDE GSWEIPNIMDVNPDLADHVVQGRVPITDDASKNKLQTASICAAVYKDSPQLDEAKKFLDY LVKSDYTAYWHQDVMGNIPGLSTVPVSENLACLGQDVFTLMQDGLTHETMTPWCPDEVKD SIGEVWSLYVGKQIDRPQFFKQYQEIWTNYAANK >gi|157101644|gb|DS480680.1| GENE 86 98520 - 99350 709 276 aa, chain - ## HITS:1 COG:Cgl2406 KEGG:ns NR:ns ## COG: Cgl2406 COG0395 # Protein_GI_number: 19553656 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Corynebacterium glutamicum # 9 276 33 304 304 149 35.0 5e-36 MKAKKTFLKYITVGGLMICLAIVLVPFYIIISNSFKPYAEIAKHIFSLPHEFTTRNYSEA WRRLNFANSLKNTVIISVLSNFGGVVFSSMCGYWITRHQNRGTRLVFFMLIAFMSIPFQA LMIPFAKITSKMHLTNSLFGLSVCFWALTVPISTFITSGAVKSVPIEIEESALIDGCSPL RMYWTIVFPLIKASVFTFATINTLWFWNDYLMTQLMLSKKTLRTIQISMQALFNEAFFAW DVALAALTLAILPLFIFFVIAQKQVLEGASAGAVKG >gi|157101644|gb|DS480680.1| GENE 87 99350 - 100186 756 278 aa, chain - ## HITS:1 COG:mlr7001 KEGG:ns NR:ns ## COG: mlr7001 COG1175 # Protein_GI_number: 13475831 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mesorhizobium loti # 1 263 36 301 317 162 34.0 5e-40 MGPAVLLFVVIGLIPFFYEIWYSFTNWDGIHANYQFVGLKNYIDVLTNDPKYWHAMWFTI KFAFCVLIFSNVLGFIWAYGLSKAIPFRNVMRAGFYIPRIVGGVVLGFLWRFIITELFPV IGEATGIGWFSQSWFQTEASSFWAMVIVMTWSMAGYMMIIYVAGLTSISSDYVEAATIDG 
AKSSHVLRYIILPLLMPSVTQCLFLSMVNSLKVYDLNISLTNGNPFRLSEAVTMNVYQTA FSSNLMGYGSAKALLLVLVIAGVSLIQVAITSSKEVEL >gi|157101644|gb|DS480680.1| GENE 88 100401 - 101216 611 271 aa, chain - ## HITS:1 COG:TM0416 KEGG:ns NR:ns ## COG: TM0416 COG1082 # Protein_GI_number: 15643182 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Thermotoga maritima # 8 250 9 246 270 119 32.0 8e-27 MRIATTGTADGKGTFLVYHGPYREIIPRMAACGYNGVEMHIEDSACILRDQLWELLKAYG MRLTSIGTGAIYGRRHYNLVDSDSSVRQAAIQHLRQHMITAQPDHGLVIIGLITGRMQDC TCREEFFWNLEESLYQLDQLAESYDVQLGFELTNRYEREHLIRIADGVEYLRNHDYKRIL LHLDTVHMNIEEADIRQSILGAKGYVGHVHIADNDRWYPGHGHYLFLETLQALKDIGYEG TLALETNCLPSEEISARKSLENLRVMLGQLK >gi|157101644|gb|DS480680.1| GENE 89 101213 - 102541 994 442 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 420 5 428 447 176 30.0 1e-43 MTRGDSLKVIIRFALPLVAAAVIQQLFSLTDAMVLGIFSGNRGLAVLGVCSWPVWFQVSV LTNFGQASCLLTAVRFGAKNETNLKKAIGNVYFASFLLGLVMVPGLLGSAGTLLRIQNTP PEVFDDALSYLRILYIGTVFLLVYNTLSSLLRALGDSYTSFLAITVSALVNVVMDVILVA GFGMGVRGAAIATTASQVLAALICLVKISRYPVFHIRRKYLRPDGLLLKEYIGICIPMMA QSIVIAVGGTFVQSHINQYGTVFAAGISAAGKIFSVVETGAIALASASASFVSQNVGARQ FERIRTVVKQVCGLSEVTAVVIAIVLLLFGRYILSFYVTDEAVVYAMGNLKVYSAGLLLM YPMYALRQTVQALGNVRIPLLAAVLQLVMRILAASFLPMLIGSSGIYFTSFAAWAVSLIL IGFVYPVQFRKCMENSRGGGIS >gi|157101644|gb|DS480680.1| GENE 90 102759 - 104690 1629 643 aa, chain + ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 328 641 314 598 602 138 29.0 3e-32 MKPIFKSSLIKDVVISYVCMIVFPVMIILGMVFLLAGRYIYRAATESVTVTQQAVLELYK EETRGSALALSRVVNMNGGTTIQDASGYDTQRQEQAMEQLYTMYQMLAAPEYKILDIHIY 
RKDGTAIIPKGEMRYSVEELRQKEWYQTALRQPGLVHTSLVEEDYFYRIRSSQEILEISA YAPREDTEAECIVLYRVSKIPDAIKNYNKNGRMGTICLINSQGSIVADPGQKTRLPSAVT EKIRQTAADLAASGQPAAGNPPASTNHFAASTQQQISYRGIRYMILPLPEPGYYLVSAVN ETSLFGQFTLFSLISMAVILCVVLMFIFYFKWYMNRLLIPLGHLTEGMRQVEHRNLDTQV EMCRQQDIARMISTFNNMVEQIKELIHDKEQVEKEKYQEELHTLQSQMNPHFLMNTLNTL KFMAISAHYTGMQDMVVALENILAAVLNRDGGFYTVRDEQSVLESYIYIMQFRYMDSFEV DINLSPGILDCKVPKLILQPIVENCITHGFDNLGDRLGRITIKGTIQEGSLVFEITDNGK GMDQDTIQQILGNPLAPGHPEYMAKPESMDTPGSLKSPESLPCAEAASRPEPTHRRTIGV SNTDRRLKLNFGPLYGVSIDSKPGVYTRVTLTLPVITNTTEDE >gi|157101644|gb|DS480680.1| GENE 91 104708 - 106276 1404 522 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 516 1 522 525 135 24.0 2e-31 MYTVLIADDEAIVRMMLSSMIDWDEMELKLAGCVSNGREALAYLEQHPVDILITDIQMPV IDGLELIRRVKEFHTQPEILVLSAYNDFPYVRQAFKLGIYDYCLKREIHEEMLKRHLTNM KQLLSQKGRTQTSGAEHVKDRKQLLAQLLCGELEPEAAGLPERYYMVYFSIQRYQEIRRN FGNDFDKNFHVSLLNLAGQISQVANHGVIVPDHVTNLVMVYETGPQDNQDQALEGLLRVC RRLLNAWKNYMNVDASAAVSSLALGPSAFEEKLAEATMNLTMKYVMKSQNLFSSLDYSRF SPLEAMARENEFREIIQALKSNNTVSFESARSLLLARIQESPVRQAGLLALYLAYHIAAS LVHQLEDTDYVYSTSLLEEISGLRTSQEVCIWTVNFLSDMKRYVQFHYQFDFPDEISRAI NYIDSHYYKPDLTLYEVASEAGFSEKYFSTLFSRKMGMSFSSYMKHLRIHHAKTMLKETS MKLKEISEAVGYNSVEYFVRVFSAQEGVSPSSYRKQSRTIVQ >gi|157101644|gb|DS480680.1| GENE 92 106293 - 107756 1630 487 aa, chain + ## HITS:1 COG:SP1897 KEGG:ns NR:ns ## COG: SP1897 COG1653 # Protein_GI_number: 15901724 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 101 388 47 329 419 65 24.0 3e-10 MFSSIRIVTLNLTKAITKEEAMRKKRFLLGAVALTAALAATGCSSTPNGSQAADSAKNQA DTGKDSGSRDSGSEDTSKAAGADRNISCTLTLATWDVAAAKTFEELDMEGRFQELYPNVE IDIEEFNAEPEYFNSMKIRSSASELPDLMFHKSDSMNQYSEYLADLTDTEANKNNQIASE 
YAVDGKVLGIPDRKASDYVYYWTDMFDEAGVEVPETWDEFVEASEKLQEHFGPQDPDFGA ICMGAKDAWPTYPLMEYGPAAQSGNGYYWDAMTTEDAPFAEGTDIHTVYGKIYDLFQKGV LGKDPLGITYDQSRALFAKKKGAMTLNSPMFLTAYKADGYDTAGMSTFYLPFRDSSQDEF NLVTQGDCFLTVTNNSENPEVAKAFLEFYFSEAWYPDYIASISSDSTMTPYAKELDPVLQ TASELQPEVKFVTYGAGGSDFINMVSETKFDCKQLGVEMLTDGFDYAKRCEELNQTWAEA RTKLGLK >gi|157101644|gb|DS480680.1| GENE 93 107767 - 108666 711 299 aa, chain + ## HITS:1 COG:mlr7001 KEGG:ns NR:ns ## COG: mlr7001 COG1175 # Protein_GI_number: 13475831 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mesorhizobium loti # 10 282 29 303 317 164 32.0 2e-40 MNRLNRKRLTFLSLSLAVPVILLCVFVVYPFWELIHMSFLEWDGVSSVRSLAGFDNYTQL FFKSPEFWQAFRNNLTYLLIHGIMMPVELVLAVMLSSRFKGSGFVKAIIFLPFIINGVGI SYSFSYFFSPVEGGFNYILTRLGMEGLIHNWLSDPKIVNYVLASVSIWRYMGFHIVLFMA GLSSVPKDMMEAAVVDGANAWQRLRYIQLPAIRTVIDFMLFDVINGSLQMFDIPYIMTAG GPNGASNTFSIYTIDTAFKYNNFGMASSMAVIMIFMIVCVQIAQRFITNRLRRRDQHGA >gi|157101644|gb|DS480680.1| GENE 94 108656 - 109486 885 276 aa, chain + ## HITS:1 COG:BH2224 KEGG:ns NR:ns ## COG: BH2224 COG0395 # Protein_GI_number: 15614787 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 14 275 15 275 276 124 32.0 3e-28 MARKKISLAGKYAVSLILVLMVFLPLLVTIISSLKIPGSLDAQSPLWISARDMTLNNYLS VFKERYLVRAFTNTIKIVAVSIFFNVLVGSITAYCLERFEFRFKKIIYILFYLAMMVPTN IVEIARFQVIRGMGLYNTLGAPIVIYIAANLMQLYIYRQFISGISVSLDESAMLDGCGYF RIFAQIIFPLLTPATATLVIIKTVSIVNDMYIPYLYMPKNQNKTLTTFLMRYAGAQQNSW PLLAAGIIVVALPTVLLYVFFQKYIIEGITAGAVKE >gi|157101644|gb|DS480680.1| GENE 95 109500 - 111770 1840 756 aa, chain + ## HITS:1 COG:SMb21655 KEGG:ns NR:ns ## COG: SMb21655 COG3250 # Protein_GI_number: 16263752 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sinorhizobium meliloti # 32 716 32 722 755 548 41.0 1e-155 MDKPVLWNLNWKFYDYFEENLLTEDSDRCSDVNLPHTVKELPLSYFSHQETAMISTYVKL 
LSVTSRMLENRIILVFDGVMTYFELFVNGQRAGEHKGGYSKSMFDITGLCKEGPNRIVLR VDSHEREDIPPFGYAIDYMTYGGIYRDVWLYCCSQVFVERALIRYDLENKNAILKPELFL DNSGPECSLTADICLKDKEGAIVASYKRCIQAAPGKSSIILEPQCVPSPILWDPDNPYLY TVDITLESGGRVLDRHHVRTGFRTVACTPEGLFINNRKIKIMGLNRHQSFPYVGYAMGRR AQEKDADILKNYLNANTVRTSHYMQSEYFLDRCDEIGLLVFSEIPGWGHIGGEEFKKVMM QDVESMILTQYNHPGIFIWSIHINESLDDDQLYTRANALAHKLDSSRPTTGVRYITNSHL LEDVYSLNDFTYSDCNPDGTKLFKGRREATGLEYPVPFMITEFSGTAFPTKLWDSADKRT THARAYARVYSHSNLSKDLLGIIGWCAFDYNTHADYGSGDKICYHGVMDMFRIPKYAAYV LRSQKNPEQEIVMEPTTEFSRGDNKGNRLVSPFMVLTNCDYIEVEMYGKPPVRYYPDNRY IGLEHPPIEIEEEIGVWQDLWQDGSITGYYKGKAVIQKKFLRDSSLHDMEVTADDTRLYG DYTDATRVVCRVTDRVSTTLVYFPGIIQVETNGPIQVIGPSAIPVRGGCAAFWVKTNPQS FSGNAETAEVHIRLTDTPVEKKCVKIDVVKSDMEVG >gi|157101644|gb|DS480680.1| GENE 96 111893 - 112285 485 130 aa, chain - ## HITS:1 COG:CAC3167 KEGG:ns NR:ns ## COG: CAC3167 COG1433 # Protein_GI_number: 15896415 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 118 1 116 118 72 39.0 2e-13 MKIAVTYENGQVFQHFGHTEEFKVYDTEDKKILSSQVVGTGGSGHEALAVFLKNLGVKVL ICGGIGGGARTALSQAGIELYPGASGDADQAVEALLNGSLDYDPNTMCSHHHEGGEHNCH GHGHSCGGHN >gi|157101644|gb|DS480680.1| GENE 97 112263 - 112658 247 131 aa, chain - ## HITS:1 COG:MJ1243 KEGG:ns NR:ns ## COG: MJ1243 COG1342 # Protein_GI_number: 15669428 # Func_class: R General function prediction only # Function: Predicted DNA-binding proteins # Organism: Methanococcus jannaschii # 2 94 5 99 123 70 38.0 8e-13 MARPYKARRICSVPAIDTFGPLNQEMDSAVELTLEEYETIRLIDWLDCTQEQCADQMGVA RTTVQAVYNSARKKLADCLVNGKRLEIRGGNYQLCPDGGNCCGKNCEKRGCRRRRCNNKP GGDCNEDCCNI >gi|157101644|gb|DS480680.1| GENE 98 112866 - 113273 499 135 aa, chain - ## HITS:1 COG:CAC0494 KEGG:ns NR:ns ## COG: CAC0494 COG2337 # Protein_GI_number: 15893785 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Clostridium acetobutylicum # 9 122 5 118 122 143 62.0 8e-35 MNQDAHQTILRGDLYYADLSPVVGSEQGGIRPVLVIQNDVGNKYSPTVIVAAITSRSTKA 
AIPTHVCIRRMRGGLKQDSTVLAEQIRTIDRNRLKEYIGHLDSGQMAGIEQAMVTSLGLG HLTGNMGQVFLPSYS >gi|157101644|gb|DS480680.1| GENE 99 113401 - 115359 1740 652 aa, chain - ## HITS:1 COG:MA4569 KEGG:ns NR:ns ## COG: MA4569 COG0651 # Protein_GI_number: 20093353 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Methanosarcina acetivorans str.C2A # 54 402 54 418 518 180 33.0 6e-45 MEGNIWLLGAVAWPFLAAFISLLAGKKGERNRDVFASAAMVMEFTGMLCLYPVNSPAAFS WAGFCEMGIWLRADGFRWLYSTIAAFMWMMTTLFSMEYFARHGNRGRYCFFSLLTCGATM GVFLSDSLFSLFLFFEIMGLTSFVMVIQEETEAAGRAARTYLAIAVIGGLCALFGIFMIA AGTGNLSMDSLEQFRKATGGTWPLYLAGALLLAGFGAKAGMYPLHVWLPNAHPVAPAPAS ALLSGILTKTGVFGILAVTVTMFRHDMAWGMVLLVPGAVTMVLGAILAVFSIDLKRTLAC SSMSQIGFILTGCAMQCLLGEENGLAVSGTVLHMVNHSMIKLVLFMAAGCIYMNLHKLDL NEIRGYGRNKPFLMLVFAMGAYGICGIPLWNGYVSKTLLHESIVEYIEVLGAQGADALPF QVLEWAFLLAGGLTVAYMAKLFAAIFLEKAPMGRERDDREKRYAGTAACFALGGSACLLP LMGMAPHQIMDRMTDISRPFLRGAQVPHQVEYFSEANLRGACISITIGILVYVLFIRRYL METGEDGTRVYVDRWPAWLNLEDRVYRPLVLAVLPFLGAVLARCVNGICEGPLPLFMKRP EKNQVVAPGESSRFFRQAQDVRLLHMIYSSMGYGFMMLGAGLILMTAYVLFL >gi|157101644|gb|DS480680.1| GENE 100 115390 - 116862 1370 490 aa, chain - ## HITS:1 COG:SMa1541 KEGG:ns NR:ns ## COG: SMa1541 COG0651 # Protein_GI_number: 16263292 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Sinorhizobium meliloti # 58 474 63 474 487 239 38.0 8e-63 MNPYYLLVPVLLPMVTGAAVLGLRPKERRKREILVMGGILAASAVIGALVLNRPGQPLVL YRFGSRMNISLGLDGLSGVFACLIAVLWPLTALYSFEYMKHEGKENKFFGYFSITYGVVA GVALSHSLITLYFFYELMTLATLPLVMHAMDTRAIYAGKKYLLLSMAGAAMVFVSIISLH EYGTTLDFTWGGVIPQGLSHVERRHLYGAFILAFFGFGVKAAVVPFHSWLPAASVAPTPV SALLHAVAVVKGGVFALMRVVYWCFGAGFLIGTKVQAAVLCVCCITILYGSLRALCTQHL KRRLAYSTVSQLSYILMGVMLMTPAGLAAGLTHMVCHALMKITLFFCAGAILYKGGREYV 
YELRGVGRAMPVTMTCFTLAGLGLVGVPLFAGFVSKFMLGSAAAEAGGLGMLGVACLIIS AFFTLFYMVLIIGAAWFPAEGREDSHWNTHCDPNWNMKLPLIAITAMSMVLGLWSGPLIQ VFNRIAAGGM >gi|157101644|gb|DS480680.1| GENE 101 116877 - 118364 1333 495 aa, chain - ## HITS:1 COG:SMa1541 KEGG:ns NR:ns ## COG: SMa1541 COG0651 # Protein_GI_number: 16263292 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Sinorhizobium meliloti # 27 440 33 440 487 221 36.0 2e-57 MREMWMMAPVLFPVVSGGALWIWNPGKNKIHTAAAVLSLAEAVCSWLVILNCMGMELPLW SIGPGMELCLRMDGTGALFSGLASLIWVLVVFFAFEYMEHEKEDARFFGCLIMSLGALTG VAWAGNFVTLYLFFEMMTFFSVPLVFHSRMREALRAGMVYLAYSMLGASIALGGYFFFRQ YACGTDFKAGGVLTEAAGMVQGRPQILLLAVFCMAAGFSCKAGLMPLHPWLPIAHPVAPA PASAVLSGLITKAGVVAVIRVVYNMAGPAFLRGTWVQYALLSMAVVTIFTGSMLAYKEKK LKRRLACSSFSQVSYVLLGVFLLSMEGLYGSLLQVVFHALAKNALFLCAGAVICKTGCTR VKELRGMGKRMPAVMVCFALASLSLVGIPPAGGFLAKWHLAVGAMRADAGVFAWLGPVVL MVSALLTAGYLFPVIVEGFFPGKDWRKTDQSEDGRLSVSAFMGVPLAVLGISLLVLGALG NPVFELLRGIASDMV >gi|157101644|gb|DS480680.1| GENE 102 118408 - 119940 1456 510 aa, chain - ## HITS:1 COG:PH1452 KEGG:ns NR:ns ## COG: PH1452 COG0651 # Protein_GI_number: 14591242 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Pyrococcus horikoshii # 64 445 64 433 495 152 31.0 2e-36 MRLVENIPFFSIFIMMAGGIVTPLLGGRNRARVLHLVLVGAVAVMSGLLLGYTSDGTCFR FMMGHFPAPWGNEVRAGALEALMSLVFSLVMFLTVAANKRSLEHDIPGERAGLYYVMMNL LFSSLLALIYTNDLFTAYVFIEINTISACAIVCAKESGETVAAAIRYLIMSLVGSGLILI SIALLYCQTGHLLMEPMAGAVRELASSGKDLFPLKMALVMMTSGLAVKSALYPFSSWLPG AHANATAASSSVLSGLVLKGYIILLLKVYMRILGMDLIIRLRINDLLFVFGILAMVIGSA KALQQHMVKRVIAYSSIAQIGYIFMGIGMGTSAGLAAAVLHMIVHAVTKPMLFTAAGGLM DCCGHQKEIAALRGSARRNPAAGAGLVVGSMAMIGIPFFSGFVSKISFAMASFHDLDRTV LVLGALAVSTVLNAMYYIPLIIIVFSERGPEKADYEAAGLEDAAVPEAGEPVKDVSFAAA MACFMAGNVLIGTCYGPVVRMIETGLANFG 
>gi|157101644|gb|DS480680.1| GENE 103 119927 - 120307 558 126 aa, chain - ## HITS:1 COG:VNG0565C KEGG:ns NR:ns ## COG: VNG0565C COG1006 # Protein_GI_number: 15789777 # Func_class: P Inorganic ion transport and metabolism # Function: Multisubunit Na+/H+ antiporter, MnhC subunit # Organism: Halobacterium sp. NRC-1 # 1 109 2 107 118 71 37.0 5e-13 MQALVTNYYETAAMILFSIGFTMLLLQKNLIKKAIGLNIMDMSVYLFLAAKGYIAGGRAP IADGRTSVEGFINPLPSGLVLTGIVVSVSVSAFLFSLIQRFHHYYKTLDLDGMLYEGQEE TDGEAG >gi|157101644|gb|DS480680.1| GENE 104 120310 - 121113 849 267 aa, chain - ## HITS:1 COG:RC0355_2 KEGG:ns NR:ns ## COG: RC0355_2 COG2111 # Protein_GI_number: 15892278 # Func_class: P Inorganic ion transport and metabolism # Function: Multisubunit Na+/H+ antiporter, MnhB subunit # Organism: Rickettsia conorii # 25 266 16 265 265 112 27.0 9e-25 MGRRFAEFTEEHALRVFGRVYRAVAQIVCLAGMAVLLMMVTHLPAFGSADNPANNEVVGR YLENGLSETGAVNTVAGIILDYRAFDTFGESVVLFLAAVSVIALLRLKTESGYSGKEDRS GGTEHEEDEERDIILEQVARLLVPFILMFGVYVVLNGHLSPGGGFSGGTVMGAGLILYAN AFGHRRIHRFFSFKTFTVMSSSALMVYALSKGYSFFMGANHLESGIPLGRPGNIFSAGLI LPLDICVGLVVAGTMYCFYSLFREGEV >gi|157101644|gb|DS480680.1| GENE 105 121361 - 121612 311 83 aa, chain - ## HITS:1 COG:VNG0566C KEGG:ns NR:ns ## COG: VNG0566C COG2111 # Protein_GI_number: 15789779 # Func_class: P Inorganic ion transport and metabolism # Function: Multisubunit Na+/H+ antiporter, MnhB subunit # Organism: Halobacterium sp. 
NRC-1 # 4 73 3 72 176 59 44.0 1e-09 MKAVEIILLLFLMACSLSVFLCRNLLVAAILFMANSLIMSIIWVILQSPDLAITEAAVGA GVTGILLFITLYKVDGLRGEDSE >gi|157101644|gb|DS480680.1| GENE 106 121599 - 121919 428 106 aa, chain - ## HITS:1 COG:BS_yufB KEGG:ns NR:ns ## COG: BS_yufB COG1320 # Protein_GI_number: 16080217 # Func_class: P Inorganic ion transport and metabolism # Function: Multisubunit Na+/H+ antiporter, MnhG subunit # Organism: Bacillus subtilis # 4 105 7 110 124 58 36.0 4e-09 MGLVTGGIFIVLGVFVLCVATFGIFRFEYALNRIHVAAKCDTLGSMLILAGLIISNGWNM ESAKIGLILLFIWIGNPAASHMVARAEARSNPNLKEECGVSEYEGG >gi|157101644|gb|DS480680.1| GENE 107 121919 - 122302 477 127 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937471|ref|ZP_02084832.1| ## NR: gi|160937471|ref|ZP_02084832.1| hypothetical protein CLOBOL_02362 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02362 [Clostridium bolteae ATCC BAA-613] # 1 127 1 127 127 211 100.0 1e-53 MGAEIGAKIGAEMGAAEGLFTGILILSVIILSINMIFCLLRAVLGPRFSDRLIAINMIGT KTVLMIALLIKVFHEDYLVDICLIYALISFLSFVVLTGLLGRHMGCEKERGESGGAGKEN TKKEAVL >gi|157101644|gb|DS480680.1| GENE 108 122314 - 122778 623 154 aa, chain - ## HITS:1 COG:PH1456 KEGG:ns NR:ns ## COG: PH1456 COG1863 # Protein_GI_number: 14591246 # Func_class: P Inorganic ion transport and metabolism # Function: Multisubunit Na+/H+ antiporter, MnhE subunit # Organism: Pyrococcus horikoshii # 2 128 20 145 174 80 33.0 1e-15 MYVLLYLLWILLNGRVTAEILAVGVPVAGLVYGFGYLALGYRLSGDLLLLKRLGHILVFL AMVAWEVVKANVAIIRIVLSPHMEITPCLVPVKTDLKSAGARAALANAITLTPGTITVDV KDDVLWVHALTAEMAEGLKDWHVARQLARIEEVR >gi|157101644|gb|DS480680.1| GENE 109 122868 - 124109 1416 413 aa, chain - ## HITS:1 COG:CC2672 KEGG:ns NR:ns ## COG: CC2672 COG1228 # Protein_GI_number: 16126907 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Caulobacter vibrioides # 5 412 31 427 429 161 30.0 3e-39 MKKVAFKGGLLIDGTGAEPVVNSLVLTEDDKIAYAGADKEIGPDYEVVDITGKTIMPGLI 
DSHLHFSGNLTDDDSDWVLEDVVQKTVVAVQQAHECLENGLTTVGEISRSGIQIRNMVEA GVMKGPRVVATGLGFCRTCGHGDSHKLPSYYNDESHPWAERVDGPWDLRKAVRRRLRQNP DAIKIWSTGGGIWRWDQKLDQHYTLEEIQAVVDECRMVGIPVWSHAEGYGGALDSARAGV HLIIHGQTLNDECLDIMAEKGIYFCPTIQFLNEWFKTYAPPYIPEVHDQYPGDTVAEKEL NRVYANLRKAKAKGIGLTIGSDSFCSSLTPYGTTAIGEMYSFVEKAGISEMDTIVAATKA GAEMLKVDNVTGSLAEGKSADLLVINGNPLENIRAIAVENMDVIMKEGYFVKR >gi|157101644|gb|DS480680.1| GENE 110 124160 - 125608 1539 482 aa, chain - ## HITS:1 COG:SA0541 KEGG:ns NR:ns ## COG: SA0541 COG0531 # Protein_GI_number: 15926262 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Staphylococcus aureus N315 # 8 475 9 463 494 155 29.0 2e-37 MSGPSGNDAGGMKKTLSIWNFFTIGFGAIIGTGWVLLVGDWMVIGGGPIAAMIAFAIGAV FLLPIGAVFGELTAAIPISGGIIEYVDRTYGRNVSYITGWFLALGNGILCPWEAIAISTL VSDMFGGLFPVLRAVKLYTIMGADVYLFPTVIALGFAVYVILLNFRGASAAAKLQAFLTK ALLAGMLLAMAISFVKGGPSNILPSFTQVEGPSTVTSANNMFAGIISVLVMTPFFYAGFD TIPQQAEEAAEGLDWNKFGKVISMALLAAGGFYMICIYSFGTIIPWTDFIKSTVPALACL KNINMFLYVAMLIIATLGPMGPMNSFYGATSRIMLAMGRKGQLPDSFAVLDEKSGAPKMA NTVLAVLTIMGPFLGKNMLVSLTNVSALAFIFSCTMVSFACLKMRYTEPDLPRPYKVPGG KAGICMACLAGTIIIGLMVVPFSPAALKPVEWAIVVAWLVIGLGLSAVTRAKANNKAAAA AK >gi|157101644|gb|DS480680.1| GENE 111 125829 - 126539 921 236 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 234 1 232 234 108 28.0 6e-24 MFRIVLCDDSPADLMVLCGHLEQLKEKRPVDIISYTDGLDLLKDYKQKGFCDILVLDMRM ETMGGIEVAKHIRQLDDEVPILIVTATVDYAVEGYKVGAFRYIVKPVESGDFLSAVEELL DRQQKKQASIFSFPSISGTTKLMTDHIYYLESDLRTIRVVAKEGTFTFTGTISSLEEQLR PDGFIRIHKSFLVNVSHIYNIFKDSVTMDNKEVLPMSKHKRREVNQEFLSYMEANL >gi|157101644|gb|DS480680.1| GENE 112 126571 - 127875 984 434 aa, chain + ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: 
Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 139 431 142 452 452 92 24.0 2e-18 MMNVIQHAWVSFVLSVAETLILYWLCSNFLHRNLRPMSQYTVGILIYFIFQWVTYWFNAP LFSMCVFYCAFTMLVSCIFFTDSLRTKVLVAYLFVVLNYACKLLSAVLLRFLHNEPLPSE PNFLIQSPLAQMTACILFFFFTLLFILCRNMRKSNKATLYNAISFLVPSVILFITIQIFH MRNSVSHFYLEISGILFCSSMFLFYLVDDYAIINEESQKNMIADKLLTMQSSYYQKVEDS QREITSLRHDLKKHLHSLVVFLKAGQYEEALTYIEQIYESANGMKVPLTGGNSMVNILVN NAQQQAAACCVPLTANIMVPPELPFENVDLCIILGNLLDNALEACCRMGKGANRFIHTEI RCQKAFLIISISNSYNGQLRIEGNRYESIKIGEQYCGIGLSNVSTVIHKYNGDMKISHNN DVFTVSVMLPLVCR >gi|157101644|gb|DS480680.1| GENE 113 127949 - 129397 1173 482 aa, chain - ## HITS:1 COG:BH1945 KEGG:ns NR:ns ## COG: BH1945 COG0642 # Protein_GI_number: 15614508 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 239 454 222 443 462 124 30.0 5e-28 MSAIRSRIFVPVVILIVLFPVLSWALFSSASGWYVKRISSQSLEHLMESVQSMADQVYPR DGNETLTRDEEKMYSREFLSQVREYIRKGRPEASLIAFNSRLKLTFPRQGEEEWEGEEIS GACREMIAGGMFEENKGADVTAAVGKRQYMFRLYETESSGNIRGKYLVGYVEIPDTKALL SYTGGLLAAIAGVLAFLSLVAAWFVAGSISGQLKGLCRQAGAIGKGEFEPSRVHYPISEI EDLKNAMNGMAAELDRTEQKQLAFFQNASHELRTPLMSICGYAQGIQCNVFPDHAQAAAV ILSETMRMKELVDGILTISRLDSHDTRLRTEVISLGEFIEEQIDILQGLGIAEKVSIGME KEQEDIRVSADPSLLGKAFQNVVNNCARYAQSSVTVSLKREGEWAAVCVKDDGPGLDEKE IPHLFERFYKGNKGNFGIGLSIARAAMEYMGGRVQARNRRPPCHGAEFRLMLPVEEAAGR KG >gi|157101644|gb|DS480680.1| GENE 114 129397 - 130098 723 233 aa, chain - ## HITS:1 COG:BH1580 KEGG:ns NR:ns ## COG: BH1580 COG0745 # Protein_GI_number: 15614143 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 4 226 6 234 238 164 41.0 2e-40 MAKRIYVADDELHIRTLIQTFLVNEGYEVETFEDGKSLFEAFHASCPDLIILDIMMPGMD GFALCTAIRRESRVPIIIVSAKDAPMDRVTGITLGSDDYMVKPFLPLELVARVKALFRRA GYIRQDAGETLVCANLSLDPRSRCMTVDGAHFSVTPTEFAFLQYLMERRGEAVSKQVLLK 
EVWELPDYDLDTRVADDLVKRLRRKFREAGCRAAIETVWGYGFRLEEDRGERS >gi|157101644|gb|DS480680.1| GENE 115 130318 - 130794 557 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937480|ref|ZP_02084841.1| ## NR: gi|160937480|ref|ZP_02084841.1| hypothetical protein CLOBOL_02371 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02371 [Clostridium bolteae ATCC BAA-613] # 1 158 1 158 158 240 100.0 2e-62 MKKITVFICAAALACSMSFPAFAAQTKKEYNTETAQVRQELADTSASIKNLQKSSRTASD AAKTVRQGWKDSGQLKEHKDTWNQVRELKDDITEVQVSYTEASGQARLLKAQAKNDVKDG KYDEAIAKLNQCLEQKKKALSSMEQINGIWAQIDSLLK >gi|157101644|gb|DS480680.1| GENE 116 130860 - 131735 1009 291 aa, chain - ## HITS:1 COG:PA5437 KEGG:ns NR:ns ## COG: PA5437 COG0583 # Protein_GI_number: 15600630 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 5 276 7 276 311 106 28.0 6e-23 MLDFRVYTFLAVCEYMNYTRAAEALHITQPAVSQHIRYLENMYQVKLFLAEGKKIHLSPA GERLLHAATTLKNDEVFLRKQMFEGAGKGLSLRFGTTRTIGESVISTPLARYIHSHPKDR ISVVINNTDELLHKLVSGEIQFALVEGYYDDVDFDSMVFRTEPFIPVCAAGHVFAGEPVQ LRDLLEEHLLIREPGSGTRDILEKNLDIKNIRLSDFAHITEIGSMHVILQLLEKDAGITF LYRTAVEEGIREGVFRELELRDFQMEHDFAFIWNKGSIYADTYRKICRELQ >gi|157101644|gb|DS480680.1| GENE 117 131917 - 132957 1249 346 aa, chain + ## HITS:1 COG:SPy1056 KEGG:ns NR:ns ## COG: SPy1056 COG2855 # Protein_GI_number: 15675048 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 22 346 22 339 339 324 58.0 1e-88 MASLQKKWKGLAFCLLLAVPSWFLGKMFPIIGGAVIAILAGMVIAVFMKTKNGLEDGIKF TSKKILQWAVILLGFGMNLNVVLETGRQSLPIIVCTITTSLVIAFVLHKLMHIPSNISTL VGVGSSICGGSAIAATAPVINADDDEVAQAISVIFFFNVLAAILFPTFGKFLGFDTASGN AFGVFAGTAINDTSSVTAAAATWDSMWNLGAATLDKAVTVKLTRTLAIIPITLCLAFIRT RKEEKADQASGQSVSFKDIFPFFILYFIGASVITTIAMGAGIPASVFAPIKELSKFFIVL AMSAIGLNTDLVRLIRTGGKPILMGACCWVGITAVSLALQSTLGIW >gi|157101644|gb|DS480680.1| GENE 118 133026 - 133775 864 249 aa, chain - ## HITS:1 COG:L0240 KEGG:ns NR:ns ## COG: L0240 COG1802 # Protein_GI_number: 15673615 # Func_class: K 
Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 8 222 7 223 227 105 30.0 6e-23 MGSTDSRNSEDNIYDVLRTDILNLKLRPGMIFSIRDISEAYQVGRTPVRDALISLSKEGL ITFLPQRGTMISKINYDKVLNERFLRTCVEEKVILEFMAVCDLKAITELEMSLDRQEKFV EEEDIRAFLAEDMYFHSIFYVGVNKGYCNDIISANSGHYMRIRLLAMADQGIDREALKQH KEITDAILAKDWERLHTILNFHLNRLVNQERALLSKYPDLFERENVEVRREPDELGVDFL VETKLKYHA >gi|157101644|gb|DS480680.1| GENE 119 133789 - 134436 676 215 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3765 NR:ns ## KEGG: Cphy_3765 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 6 215 5 220 233 186 44.0 7e-46 MEEMNLRKMMKGLAAVDDMTWGMYAFSRDALRDKVDCGRKKEMIEKSICCGYETADKIMD QTGTSNPREIAGKLGLRVEYLDKGQIADRVLFALFTPPDLIQIMREPIDKAVKGGSLDGF TTREQLEDLILGHEIYHYLEEEYDGIYTRTEKIRLWKILGFENRSTIRALSEIAGMYFSK KLNGFPYSPFALDILLYYNYNSETALNMYREVAEI >gi|157101644|gb|DS480680.1| GENE 120 134486 - 136615 2583 709 aa, chain - ## HITS:1 COG:SMc02580 KEGG:ns NR:ns ## COG: SMc02580 COG0145 # Protein_GI_number: 15963814 # Func_class: E Amino acid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: N-methylhydantoinase A/acetone carboxylase, beta subunit # Organism: Sinorhizobium meliloti # 7 335 7 334 668 135 29.0 4e-31 MGRSVRMGIDVGGTHTKAVAIDNATHEIIGKSSVKTTHDDPRGVAAGVVKSFENCLNENG ISPDDVVFVAHSTTQATNALIEGDVATVGVIGMAKGGLEGFLAKKQTKLDDIDLGNQKKI KIVNDFINVKKMNMAVVEDTINSMQKSGAQVLVSSMAFGVDDGGPEAEVYEAASARGIPT TMASDITKLYGLTRRTRTAAINASILPKMLDTANSTEGSVREAGVHVPLMIMRGDGGVME ITEMKKRPVLTMLSGPAASVMGSLMYLRASNGVYFEVGGTTTNIGVIKNGRPAIDYSIVG GHATYISSLDVRVLGVAGGSMVRANKQGVVDVGPRSAHIAGLDYAVFTDTDKIRGPKVEF FSPKQGDPDDYVRIRMEDGSAVTLTNSCAANVLGLVKPEHFSYGNVESARKAMKALADYC GTTVEDIAKQIMEKAYAKIEPVILELAEKYKLEKDQISLVGVGGGAASLIVYFSNKMGVK YSIPENAEVISSIGVALSMVRDVVERIMPSPSKDDIRAIKNEALNKAIESGATAESVEIH IEIDPQTSKVTAIATGSTEVKTADLLKECDETEARHLAAEDMRLADGDTALLATSPYFYV YGERTEPGTPGAVRIVDKKGFIKVQRGRAMCLKTTAAAYLDALKQLWDDMAVYQTELIAR PDYYLCMGARVMDFTAQDFEQLELLTDLEVSSLEPDEEILVVAANIRQN 
>gi|157101644|gb|DS480680.1| GENE 121 136748 - 138091 1711 447 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3767 NR:ns ## KEGG: Cphy_3767 # Name: not_defined # Def: citrate transporter # Organism: C.phytofermentans # Pathway: not_defined # 3 447 4 449 449 561 72.0 1e-158 MNINVIIGIIMILSFLGMVWYCVKGFNLMVGFAIMATFWTALALVGNAFSPTSAMEGKTF IQVLEYVYAEGPAGYAKSILVNVFFGAFFGRVLVDTGIAATLIRKVVELGGDKPRITMAL LCIVTAIIFMSMTGIGPVISIAVIVLPILLSLGIPAPVAMFSFMGSIMAGIFGNIVNFQQ YQTIYAGFNAAAADYTYNDYFKIGVTCMVVAIVVVLVVANLSMNKKAAHAWAATASSASD NAPAISWISVILPVLGVVLFKIPIILGFCLSGIYALITTGKMKGSYSEICRLFAKLFTDG SIDVAPMVGFLLTLAMFNNAATYAAPYFTAILGGFVPTTAITLAIVFAILTPLGFFRGPM NLVGCGTAILAVVVATLPNAPVAFLYPLFAVTTIAPQHLDITQSWVAWGFGYTKVSTKDY MKMSIPTGWIVGIICCAAVFVLYGGLM >gi|157101644|gb|DS480680.1| GENE 122 138581 - 139411 891 276 aa, chain - ## HITS:1 COG:SP1331 KEGG:ns NR:ns ## COG: SP1331 COG1737 # Protein_GI_number: 15901185 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 5 260 5 266 269 214 44.0 2e-55 MEYLKSVIPVIESNYVNFTTVERSIADFFIYNREQMDFSAKSVAGKLFVSEASLSRFAKK CGFRGYREFIYQYEENFVVRQDSMTGNTRTVLNAYQELLNKTYNLVNEKQIERVCRYMAK AGRVFVCGTGSSGLAAREMELRFMRIGLDITSMDHSDMIRMRGVFQDENCLVFGVSISGE KEEVLYLLEEAHKRGARTVMLTARNRPEFDAFCDEVVLLPSLLHLNHGNVISPQFPILLL TDILYSYYLSQDPYKKEVLHDSTLRALEKGRRENVM >gi|157101644|gb|DS480680.1| GENE 123 139461 - 140357 314 298 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 7 284 8 308 323 125 28 2e-27 MKQYICIDIGGTEIKHGVLDENEKFLAKGKVPTEAFKGGPALLKKVMGIAADYLKDRKTE GICVSTAGMVDVEKGEIFYAAPLIPEYAGTKLKASMEAEFGLPCEVENDVNCAGLAEAVS GAAAGAQSALCLTVGTGIGGCIIIGGRVYHGWSGSACEVGYMHMDGSDFQTLGAASILVK RVAAAKGEPEAQWNGYRIFQLAGEGDSICVEAIDQMCDCLGKGISNICYVLNPQIVVLGG GIMAQEEYLRPRLEKALARYLVSSIYEKTELAFARHQNDAGMRGAYYHFLERQGKARS >gi|157101644|gb|DS480680.1| GENE 124 140511 - 140966 512 151 aa, chain - ## HITS:1 COG:YPO3447 KEGG:ns NR:ns ## COG: YPO3447 COG2731 # Protein_GI_number: 16123595 # 
Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Yersinia pestis # 1 149 1 151 152 90 36.0 8e-19 MIFGNITQEKTYAFLPEDLKECFAYAKEHDLASYEKGCHPIDGERLFVNVVEYETTRPEN RFWEAHRNYLDVHLMLDGQEQIDLNFIENMEQKEYVEKDDFLPMDGAPNSHVVLRPGDFL ICYPEDGHRTAVAVNEPEKIKKAIFKVRIDH >gi|157101644|gb|DS480680.1| GENE 125 140990 - 142492 1815 500 aa, chain - ## HITS:1 COG:SP1328 KEGG:ns NR:ns ## COG: SP1328 COG0591 # Protein_GI_number: 15901182 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Streptococcus pneumoniae TIGR4 # 3 481 5 485 513 555 67.0 1e-158 MQGFTTIDLIILIVYLAAVLFAGLFFSKKEMKGKEFFKGDGTIPWWVTSVSIFATLLSPI SFLSLAGNSYAGTWIMWFAQLGMLVAIPITIKFFLPIYSKLDIDTAYHYLEIRFGSKGLR VLGAVMFIVYQVGRMSIIMYLPSMVLSKLTGLNVNLLIIAMGVIAIIYSYTGGLKSVLWT DFIQGSVLLVGVTFALFYLANGINGGFGAIFQTMGEGKFLAADQPIFNPNILKDSVFLLI VGAGLNTCSSYVSSQDIVQRFTTTTDMKKLNKMMLTNGALSIFIATVFYLIGTGLYVFYS QNALPPAAQQDQIFASYIAYELPVGITGLLLAAIYAASQSTLSTGLNSVATSWTLDIQER LSKKEMSFALQTKIAQYVSLGVGIVAILVSMVLANGEIKSAYEWFNGFMGLVLGVLGGTF ALGTFTKVANAKGAYVAFFVAAVTMVCIKYMAPAGSVSIWSYSIISIAISLVVGIPVSMI TRKMDGDTSKPVKDATIYHN >gi|157101644|gb|DS480680.1| GENE 126 142681 - 143598 1184 305 aa, chain - ## HITS:1 COG:SP1329 KEGG:ns NR:ns ## COG: SP1329 COG0329 # Protein_GI_number: 15901183 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Streptococcus pneumoniae TIGR4 # 1 300 1 300 305 443 70.0 1e-124 MANLEKYKGVIPAFYACYDAEGNVSPERVRALTEYHIRKGVKGVYVNGSSGECIYQSLED RKITLENVMAAAKGRLTVIAHVGCNNTKDSVELAKHAESLGVDAIASIPPIYFHLPEYAI AAYWNAMSQAAPNTDFIIYNIPQLAGVALTQSLFAEMRKNPRVIGVKNSSMPVQDIQMFK LAGGDDYIIFNGPDEQFISGRVIGAEAGIGGTYGVMPELFLKMDELVKAGKMEEAMKVQY AANDIIYKMCSGHGNMYGMIKEMLRINESLDLGGVREPLPQLIESDLAIAQEAARMVKEA VAAYC >gi|157101644|gb|DS480680.1| GENE 127 143665 - 144351 925 228 aa, chain - ## HITS:1 COG:BB0644 KEGG:ns 
NR:ns ## COG: BB0644 COG3010 # Protein_GI_number: 15594989 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate epimerase # Organism: Borrelia burgdorferi # 8 228 7 228 232 246 56.0 2e-65 MNEKVESLKGKLIVSCQALPHEPLHSSFIMGRMALAAKEGGAYGIRANTKEDIAEIQARV DLPVIGIVKRDYEDSKVYITPTMKEINELMEVKPDIIALDATHSLRPGGRTLDEFYREIR KSYPEQLLMADCSTVEEALHADQLGFDFIGTTLVGYTDQSRDLKIESNDFEIIRQIVAKV KHRVIAEGNINTPEKAKRVIELGAFSVVVGSIITRPQLITKSFAEALD >gi|157101644|gb|DS480680.1| GENE 128 144582 - 146195 1913 537 aa, chain - ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 8 535 1 525 526 145 23.0 2e-34 METGYDQLLKVMFVDDEEKTRKLLRVFVDWQSIGYEPVEDAASANQAMELIEEEHPDVVI TDIEMPYINGLEFAQMLAEEYPQIVVIVLTAHDLFQYAQKSVELGIKSFLLKPIRRAELI EVMKDVRTDIMEERRAIFEFEGLKSRIAESRTLIVQNFLNNLLLNKVDQESLWGTIRYYD IPLRRDTGHYNVMVLMPEKHNDPEEDELRRQQCMEVLSPAVSRLTDVLLLKDIHQNLILL SQNKKISLMSYGAHFTVLIKEKLSMGIYAGCGNPVDSLEEIRYSYKQAYKNAMIASYSHN QAFLSSGRESDMGGLQELIQTLTEELPLYLGIPSGDKVRQLVGTAYGTLKGLPQSSLSDY LVISYSIVNVTLSTLSDNGIPYIEIYSTDHLPYEKILRLTGAEEIEKYISQFTDFTVCQI EEYMNKKNTMLVHRIVQYMNEHMDDSTLSLAKISKINYINNSYLSRTFKCVMGINFMDYL VSIRIEAAKKLLKGTDLRIYEVARSVGIEDPNYFSKFFKKHTQETPAQYRDRCQEAL >gi|157101644|gb|DS480680.1| GENE 129 146179 - 147996 1630 605 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 154 599 145 598 602 162 27.0 1e-39 MYMKGLCENVKGAWDWLFHEHISRRIMAVFVIFFLISFFLTYSLYSWAARKQEDSMIEQS SVQMVTGVRNNLDEMVLNVNDDFNMLLKGGTLDVVNHLPRPEDRKTYDNMLFTALDSNHY LESIYLLDFSAHLYGVDKHDMKRLSVDSTLKCPWYRKALEAGGSYILEYEAGGIFYKRYN 
SRFVSVIRTINSLETQKPQGFIIMNIPTDAIAEICSAAAYGSQAMIGIMDQDGNLIAGTP GFEYQDFLKTPAGSSDEEKLLVSRKGDGEHTTSLYLPDYGWHLIARIPSQVNRSAFKDLF LIATLIFLVNVVLMLFGSLYISKSISNPIYALIGMMNTVKDRNFRKIPVKYPKSEIGILQ TNYNNMVDEISSLLSRLVEEQKIKEKAELHALFEQVKPHFLYNTLDAIGYMALTEEPSQT YEAIETLGSFYRQSLSQGDQMITVEKETAIVNDYVKLLRLRYGDLFDVEYQVDESCLRVL VPHLILQPLVENAVYHGIKPMGGNGHIRIWIGPEEGGRMKIQVRDNGVGIDASALSKLQD YHKIWNIDIDVSGTGIGLIGTIQRIVLIYGEQASYQIRSVPEGGTEITFLLPVTCGGEED HGDWI >gi|157101644|gb|DS480680.1| GENE 130 148043 - 149200 969 385 aa, chain - ## HITS:1 COG:BS_patB KEGG:ns NR:ns ## COG: BS_patB COG1168 # Protein_GI_number: 16080196 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Bacillus subtilis # 26 380 28 382 387 257 37.0 3e-68 MEVRYVTDRHDTLCAKYRQEYRDNPDYLYCGVADMDLAPPDPVLEAMKRRLDHGVFGYTE LPDGYEKLVSDWMRERYHCEVKQEWVLFSPRINMALNMAVDTFTEPGDSIIVNTPAYPAL TGAVEKWGRNLKESPLILKNGRFTIDFDGLERLADQKTRAYILCNPHNPTGRVWTDEELS GVTAFCKRHNLLLLSDDIHGDFVWPGHVYKPLLTMYHNPEQEKVIVFNSLTKTLNIPGLI FSSVILPNPEMRRAMEETIDRWGLHNPNVFAADILKPAYRECGEWVDRMRAQVYENIKTA QEFINRELPWLDAYVPDGTYLMWIGYGRTGLSEEDMRRLLEEKGHLIPLMGTHFKEAGRG WFRLNMGTSGEQVNKILERLALCWT >gi|157101644|gb|DS480680.1| GENE 131 149191 - 150534 1201 447 aa, chain - ## HITS:1 COG:BH3680 KEGG:ns NR:ns ## COG: BH3680 COG1653 # Protein_GI_number: 15616242 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 2 426 4 426 438 185 29.0 2e-46 MRHLTIWMCVLAAVLGMGACSRAEKDEDVRVDMTAPAKDSPGERPVIRLWHIYGSDDDQA ARIMEQLTREAEDKFNVTIEVDTAENEGYKTKIKAAAAANELPDIFYTWSHEFLKPMVDA GKVLEVSRYYSDDFQSHLNDAMMKGIQFDGGTYALPLDSSVAMVYYNMEMMDRYGLEIPV TWDEFIRVCQTFVDNGITPMPVGGNEPWTIAMYYDLLALREVGPEKVADAAAKRTDYSDP GFLEAARKLRQLVDMGAFTADSATALREEAEALFMQGQAPMYLNGSWTSSRVYRESSRVA GKVKAAPFPVTGSGKSTIYDFTGGPDSAFAVSSSTKDPELTVDLAEYMAMGLAVELYKCK SNDLPYINVDTGDAQLNPLMQEIHEYTDHAASYTIWWDNLLEGTDAARYLDSLECLFRGD 
ITPEEFISRMNGLKQDETDDRRETGWK >gi|157101644|gb|DS480680.1| GENE 132 150579 - 151430 898 283 aa, chain - ## HITS:1 COG:BH1926 KEGG:ns NR:ns ## COG: BH1926 COG0395 # Protein_GI_number: 15614489 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus halodurans # 8 282 13 284 285 169 33.0 4e-42 MEKKELKKYSVKTKFFRAITYGVLVLLALICIVPFYNMIINCTHDNAALATSFQMTPGKS LKTNYQHLAANVNIWRGLFNSLFISLASTVITVYVGALTAYGFAKYKFKYNKQLFWLVLA TMMLPGQLGIIGYFQLMNNIHMLDNYAALLITCFAPPAMVFWNRQYIDSYVPDELIESAR IDGCGEFRIFNEIIFPIIMPGVATQAIFTFVESWNAFVKPSILLFSQDKFTLPILVQQMQ GVYKNDYGTVYIGVALSVIPILIMFVLCSRRILEGATAGAVKG >gi|157101644|gb|DS480680.1| GENE 133 151431 - 152354 757 307 aa, chain - ## HITS:1 COG:BH1925 KEGG:ns NR:ns ## COG: BH1925 COG1175 # Protein_GI_number: 15614488 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 11 295 5 290 305 166 36.0 6e-41 MTEKGKSIQPSKRRSLDKSNRGLLFILPYFIIFLCFTLYPILYTFKLSFHSWDGLGTPVF TGLANYRRLLNDAVFFKSIFNTVFIALLAMAPQMIFGLILAFLLNQHRFRGGNLFKTIFY FPNLVTAVSLGVLFSLLFDFKAGSVNKLLMALHIIAEPVNWKMNGLFTQLIVAFVLFWQY FGYYTIIFTAGIKGIPYDIMEAAEVDGANKFQTFFHIVLPQLRNIMTYAFVTSIIGGLQL FDLPFTFAGATGGPSKSVLTMVMYMYNIAFQNYDYGYGSAISYGLFVIIIVFSLIFMKYT AGKERNF >gi|157101644|gb|DS480680.1| GENE 134 152512 - 153927 1606 471 aa, chain - ## HITS:1 COG:BH1924 KEGG:ns NR:ns ## COG: BH1924 COG1653 # Protein_GI_number: 15614487 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 87 421 55 380 426 64 22.0 4e-10 MRKRPLALLTAAALAAGTLAGCGSSASAPTGAAQKAEASGKESSQAAEGKTEAAGQAGGA AGVDVSGATGEINVWTNLSQSEMNFYVEEFNKRVPGVKVTVTVMPGNEYRTKLQNAFRTG TNAPDVATFEISDFGMYKNTDLLENLSTESYDAEALSKDMIPYVDELSRDNSGNLRGLSY QATPGGFWYKKALAREYLGTDDPEELHEMLKDWDSIIEVGKQVNEKSGGKIGLLDDIETV EQIYCSYKGAPWVDENNHIISDDFLMEQLNLMQRVVDNNVDARADQWSAGWTSGMYSQDM 
FILIGLPSWALNFCIKPGIPEGEMEKAADTWGYMEAPAPYQNGGTWYGIYSGSKNKEAAY AWVKTMTADKDYLVNGLAGQLGDFPAYIPAIEQCIEEGHTDIVTGDQQYFQSFYDTAMNV KADPMTKYDRQAWTSMTAQMDLLAGGGATPEEACQGFKEDMQRIYPEIVID >gi|157101644|gb|DS480680.1| GENE 135 154022 - 155059 995 345 aa, chain - ## HITS:1 COG:MA3849 KEGG:ns NR:ns ## COG: MA3849 COG1363 # Protein_GI_number: 20092645 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Methanosarcina acetivorans str.C2A # 11 329 10 337 348 181 33.0 1e-45 MDQEFFYRLLSCQGVSGRETNIQKTAAAYMEGFADEVRREVNGNVTGILNPDSSFKILLA GHIDEIGLMVTDHTEEGFLRAARVGGITPRMYLGQKVRIDTGKDLVYGAVASSGELQKKE ELSDTDLVIDIGAGNREAARRHVPVGSGVVFDTDWRPLLGECITSRALDDRSGAFIVMEA LRRAKEKKCRNGVYAVTTVGEETGFSGGAWAGAGIRPNMAVAVDVTFASDGPGGDDGVRG RVALGGGPGICVGIMGHPVMNRILEQAAGRLDIRTQPVITPGRTFTDTDAIHISGHGVPA ALVNLPLRYMHTPAEVCSLKDVRDCIELISEFVCMVDGQTDLRPC >gi|157101644|gb|DS480680.1| GENE 136 155332 - 156840 1716 502 aa, chain - ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 58 502 41 498 600 211 32.0 2e-54 MAAETLRDIIRHGAEAYGEQTAFRYKVKKEIIDKTYNEVNLDSMAVSRAVEALGMKGKHI AVIGTTSYQWITTYFGIVNSGSVAVPIDAQLPAEAVCELLNRADVEMLVYDELRSDVAGV VRGKCPGIRHVVSMQARETVGDVLSLSRLIAGHAGTYETELAGGQLCTILFTSGTTGKSK GVMLSHRNLTDNAVCLDMRIPAGTVSMTLLPINHVYCLTMDIIKGLYIGMIICINDSIMH VQRNMKLFKPEIVLLVPLVIESIYGKLKDAGSLIPKKMVAKAAFGGSLRIICSGGAYLDP DYVDRFREYGITILQGYGMTECSPVISTNLEWENKKGSVGKLLPNCEARVVDEEIWVRGS SVMQGYYKMPEQTAETLEDGWLKTGDLGYVDEDNFVYITGRRKNLIILANGENVSPEELE NELSRSELVKEILVREKDKIIEAEIFPDYEYAKKKHIKDIQGKLQELIDGFNKDMPVYKR IYSLIVRETEFEKTPSKKIKRF >gi|157101644|gb|DS480680.1| GENE 137 156863 - 157093 441 76 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1028 NR:ns ## KEGG: EUBREC_1028 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 74 1 74 76 75 58.0 8e-13 
MFEDLKEIICEYVDVAPETIKEDSRFIEDLGFNSYDFMSMVGEIEEKFDVEVEEREVVNV KTVKDAVAYIQSLQAE >gi|157101644|gb|DS480680.1| GENE 138 157215 - 158621 1486 468 aa, chain - ## HITS:1 COG:BS_ppsA_1 KEGG:ns NR:ns ## COG: BS_ppsA_1 COG1020 # Protein_GI_number: 16078895 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Bacillus subtilis # 10 464 6 432 1045 107 20.0 6e-23 MKTRKGYKVYPLTSAQKLHFYCLKYCPKKQVLNIGSSLTIQVDLDWDVLKSCIQEAIARC DTMRLRFTHDKEGSIYQYVVKEETKEIEHFDFTGWKEEDAEQKLREWTEVPFERYDSPMH HIVMIKMPDGYQGLYICVDHMTMDAQSLILFFRDVIELYASRFYDEVEHPKEMSSYIRQL EKDLAYEAGSRACEKDRQFFQELIASSEPIFTDIYGPKKLLEERKAARNPKLRAATNTSD NVEANITNFHLEGEPSRRLLDFCEKYGISMTCLLLMGLRTYLQKENDQDDVSVTTTISRR ATLSEKRCGGSRIHCFPFRTIVSREDTFMEGLLKIRDAQNQYFRHAGYSPSEYFNYRHDY YKLKDGQTYEPLSLTYQPLAMKYDGPGLDKLGDIKYKTARYSNGVAAHTLYLTVSHRAED NGLDFGFEYQTGVVTPEKLEYIYYYLCRIIFRGVEDPERPVGEIIEMV >gi|157101644|gb|DS480680.1| GENE 139 158618 - 159577 950 319 aa, chain - ## HITS:1 COG:lin2180 KEGG:ns NR:ns ## COG: lin2180 COG1073 # Protein_GI_number: 16801245 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Listeria innocua # 61 310 65 317 319 191 37.0 1e-48 MGIKGWMLGLAAAGAAGEYGIARYFFHRTVVRGNAKRDRTQKMAGTDWDAYIPGIRASRE WLAGQPQEDVYITSRDGLRLHGTFFCCEGSRRAVVCFHGYTSEGLNDYTSIAKFYLNQGF NLMVVDERAHGKSEGTYIGFGCLDRYDALQWMEYVVERLGEDCGLMLHGISMGAATVLMS TGLELPEQVKAAVSDCAFTSAWEVFSHVLRSMYHMPAFPVMQIADRMARREAGYGLDECN ARKEVMRARIPILFIHGDRDTFVPCSMVHELYGACASPKELLVIPGASHAEAYYKDTDRY EHAIEELIARVFGKEENRV >gi|157101644|gb|DS480680.1| GENE 140 159635 - 159952 486 105 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1025 NR:ns ## KEGG: EUBREC_1025 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 105 1 105 110 148 70.0 9e-35 MSYEEVFTKVKEIFMKADVSRVEEHLAFQFNITGEGEGVFYAEVKDGVLRVEPYEYYDRD AKFICSADTLMKLASGKLDPVFAFTTGKLKVEGSLEKALMLQMFV >gi|157101644|gb|DS480680.1| GENE 141 
159949 - 160500 460 183 aa, chain - ## HITS:1 COG:CAC1329 KEGG:ns NR:ns ## COG: CAC1329 COG2091 # Protein_GI_number: 15894608 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheinyl transferase # Organism: Clostridium acetobutylicum # 40 178 61 201 201 71 34.0 8e-13 MIYLASYEPGGSLYNREREHILGRSLLNFGLMKEYGRTWEVEQEAGSKPRLKGAEGVEFN ISHTKGLVVCAVADRALGADTEQIRPFKAGLMRRVCSGTERSFVMAGRSEAERRERFFRL WTLKESYAKAIGIGLAFPLGDITFSLEEGVIKGSIPGWRFYQSRVYQSYIISVCAADEKG ESV >gi|157101644|gb|DS480680.1| GENE 142 160661 - 161278 631 205 aa, chain + ## HITS:1 COG:CAC3606 KEGG:ns NR:ns ## COG: CAC3606 COG1309 # Protein_GI_number: 15896840 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 4 201 9 201 202 80 30.0 2e-15 MQNHNKKDMILDAMQELMGSANVQAISVSDIAQKAGIGKGSIYYYFSSKNDIIDAVIERN YSRVLDEGRELAASSHLDAFKKLEIIYHACLDSSMELRRREEIHTFNEQLESAFIHQKFS RIIITKLKPILADIIRQGVREGSIQCSFPEETAQIVLTVLTITLDNHLIPSDDEEIARIL SAFTEMQEKGMGIPSHTLRFLMRKI >gi|157101644|gb|DS480680.1| GENE 143 161485 - 164304 3197 939 aa, chain + ## HITS:1 COG:alr5324 KEGG:ns NR:ns ## COG: alr5324 COG0744 # Protein_GI_number: 17232816 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Nostoc sp. 
PCC 7120 # 20 680 31 634 643 285 30.0 3e-76 MKDTLKQAARKTLLVIRDIGVSLLGITGWLFHALLAVITAGFILGAILCLLIYAKVKPDL DECRELAYDKLAQMERTDFSKLSDTVIYDKDGRQIGLINAGHYQYVGITGISMNIQNAYI AQEDRRFKSHTGVDWIATFRAGLALVKHRGQVTQGGSTITQQVIKNTYLTQEQTFTRKIV EILLAPEIEKKYSKADIMEFYCNTNFYGHQCYGVEAASRYYFGKHAADLNVEEAAVLVGI SNSPSAYDPVAHPKASLEKRNDVLKSMNEIGELSDEDYEKALSKPLTVVKQESEGTDENY QSSYAIHCAALELLKKDGFQFRYTFEDKDDYNSYMERYTTAYTEKSDLIRAGGFKIYTSL DSSLQDMLQADIDQGLSSYTELQDNGKYALQGAGVIVDNRSNYVVAIVGGRGSGDQFNRA YLSARQPGSTIKPLIDYGPAFDTGEYYPTRMVNDHKWEDGPSNSGGNYHGSVTVREALNR SLNTVAWQILEDIGVDYGLSYLGEMEFQKLTYVDNGVPSLSIGGFTNGVRVVDMAKGYST LANYGVYNDRTCITQIIHEHDGDLTKDLPPAAKQVYRDDSAFMLTDILKGTMTSPYGTGR GLELDNGMPAAGKTGTTNSSKDTWFCGYTRYYTTAVWVGYDIPRKMPGIYGSTYAGKIWK NIMNQIHEGLEPWDWEQPESVELKADPRTGTQDYFSTTAEFRAQQSLHDKEQAKLTAQLE QDIQEFTTREISSVEDTYSVKSQYQDITSRLPLLDDGELRAGMLEQVENRYDYFTGIISQ MGDTIALYEKQKAIDDARAREEAQKQAEENRKQAEKDTKKNEFLQALETVEELEYQQKNA QDLVQDAIGKLSLVAGDPEQQALSDRLQAAITRISGLPTEEQWNASQAESRASEEAAMKQ AEQQVQVQQNQLRSSLNSEKFKWNNMEYYGPGGRPDDEN >gi|157101644|gb|DS480680.1| GENE 144 164291 - 164638 400 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937513|ref|ZP_02084874.1| ## NR: gi|160937513|ref|ZP_02084874.1| hypothetical protein CLOBOL_02404 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02404 [Clostridium bolteae ATCC BAA-613] # 1 115 1 115 115 217 100.0 2e-55 MMKTDLHKKAGHTGAPRLRLPKKAPVEPEKDTANPYRMPPRRWFLTFMCMNIPIAGWVYL LCLAFGKKENQLRDFAKAYLVYKLVFLAAALIILGILVYIGLDLADKLLAYMEML >gi|157101644|gb|DS480680.1| GENE 145 164836 - 165492 826 218 aa, chain - ## HITS:1 COG:MA3237 KEGG:ns NR:ns ## COG: MA3237 COG0703 # Protein_GI_number: 20092053 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Methanosarcina acetivorans str.C2A # 4 151 8 156 175 117 43.0 2e-26 MKPNITMIGMPSSGKSTIGVLLAKRLGFSFVDVDIVIQEKEGRLLKEIIAQEGMDGFLKV EERINAGLDVTLSVIAPGGSVIYGEKAMEHLKEISEVVYLKMSYEEMENRIGNVVDRGVA LKPGFTLRDLYNERVPYYEKYADITIDEEGKTPGETVDALRDIIEGMMDRSMIERIVEEQ 
KKILEEKDRKIEAYEAEIAALKEELALFRAAETGGCGN >gi|157101644|gb|DS480680.1| GENE 146 165489 - 166100 682 203 aa, chain - ## HITS:1 COG:FN1468 KEGG:ns NR:ns ## COG: FN1468 COG1853 # Protein_GI_number: 19704800 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 20 184 20 185 197 171 48.0 1e-42 MSKEYWKPGNMLYPVPAVMVSCGREGETPNIITVAWAGTICSDPAMVSISVRKERFSHSI IRDTGEFVINLVNKRLVRATDYCGVKSGRDVDKFKETRLNPQASRYVKAPGVEESPVNIE CKVVEVKELGSHDMFIAKVMGVTIDNQYMDDRGKFNLNASGLVSYSHGEYFELGKKLGSF GYSVKKPVKRQSDKKRTARRKKA >gi|157101644|gb|DS480680.1| GENE 147 166134 - 167804 1972 556 aa, chain - ## HITS:1 COG:FN1824 KEGG:ns NR:ns ## COG: FN1824 COG1227 # Protein_GI_number: 19705129 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase/exopolyphosphatase # Organism: Fusobacterium nucleatum # 8 556 3 537 538 302 34.0 1e-81 MEQELKRKTMVIGHRNPDTDSICSAICYANLKRAITGDAYIPARAGHVNGETRFVLDYFG MEEPKLVEDVRTQVKDIEIRKTRGVADNISLKKAWNIMQENNVVTIPSVRPDGTLEGLIT VGDITKTYMNIYDSSILSKAHTQYSNIIETLEADIVVGDADVYFDKGKVLIAAANPDLME FYIEPHDLVILGNRYESQLCAIEMGADCIIVCEGAGVSMTIKKIAQDRGCTVIATTYDTY TAARLINQSMPISYFMTREHLITFNSDDYTDEIREVMASKRHRDFPILDKEGRYLGMISR RNLLGAKGKQVILVDHNEKSQAVAGIESAEIMEIIDHHRLGTVQTMAPVFFRNQPLGCTA TIIYQMYLENKVDIEPKIAGLLCSAIVSDTLLFRSPTCTPVDEMAARSLAAIAGLDIEKY AMEMFGAGSNLKDKSDEEIFYQDFKKFSVGKALIGVGQITSLNDIELETLKEKMLGYMEK AKESNSLDMAFFMLTNILKESTDLICYGQGAVQLAAKAFHLDIEEATEKPEPVLSLPGVV SRKKQLIPELMLAEQD >gi|157101644|gb|DS480680.1| GENE 148 168186 - 169640 1712 484 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 28 228 525 689 744 73 25.0 1e-12 MTRRRILTGWAIGLLLGVMAAGTALGAESGWKTEDGVLKFVDNKGNYVTNEWRTRDGKSY FLGSDGEIEKSTWIENTYYVDDKGVMVKNSWVHTDGKDGLKEEGWYFLGKNGKTEEGWKN 
VGDGRYNFDSDGKMRTGWFYEGDNIYYLGGKDDGAMRRGWVCLEFDEDDLPEIGDISKEY KKADENSRWFFFQNNGKAKRSLTGNYDQETIHGEKFYFDRNGAMLTGWHAVKEKADSDDA TGISRFVYLGGKDDGAIAKGQWKQLSEHPGDSDDGAAILKKGDEELPREGDKEWYYFENN GTPAYLKTGISTMNAATIRVDGENYFVNQYGCRRNGLVKIVSGGHVLVGYFGEKGTDGRM LTGRNTGIADDDGKRCTFYFNTSGSNKGAGFSGEKDGFLYYNGLLVTADGDSQYEVYQVD GKYYLVNESGKVQTNEKAYKSDGDYVYLVEDREVYYTDDNGKKTKKADSSATLPDFTCDQ VYEL >gi|157101644|gb|DS480680.1| GENE 149 169967 - 170515 527 182 aa, chain + ## HITS:1 COG:L19745 KEGG:ns NR:ns ## COG: L19745 COG1247 # Protein_GI_number: 15673759 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase and related acyltransferases # Organism: Lactococcus lactis # 9 166 4 161 187 180 53.0 2e-45 MSDTAVTIRMAEEADAAALLSIYAPYVEKTAITFEYEVPTVQEFKNRITSTLKRYPYLAA IRDGRIIGYAYASQFKERSAYDWAVETSVYVSGDARRTGAGSLLYEALENYLRRQNVINV NACIAYPNPGSITFHEKYGYHTVGHFTKCGYKLGQWWDMVWMEKMLGPHPEQPEPFIPAG RL >gi|157101644|gb|DS480680.1| GENE 150 170895 - 172397 1390 500 aa, chain - ## HITS:1 COG:alr3871 KEGG:ns NR:ns ## COG: alr3871 COG1640 # Protein_GI_number: 17231363 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Nostoc sp. 
PCC 7120 # 1 496 4 497 502 446 46.0 1e-125 MRECGMLLPVASLPSRYGIGAFSKEAYDFIDTLKAAGQSFWQILPLGPTGFGDSPYQSFS AFAGNPYFIDLEKLTEEGLLTREECLDADFGNDERDIDYKKIYDGRFPLLRKAYERWKAG RSPELVHELAGRKLGDETREYCFYMAVKNHFDSCSWNLWDEGIRLRKPEALEAYRAMLTD EIGFYEFQQIKFEEQWDALKRYAHEQGIRIIGDIPIYVAFDSADSWSRPELFQFDENNMP GAVAGCPPDGFSATGQLWGNPLYNWEYHRKTGFAWWMRRMEYCFRMYDLVRVDHFRGFDE YYAIPYGDATAEYGRWEKGPGIEIFRQMQEHFGGDKLPIIAEDLGFLTPTVRQLLKDTGF PGMKVLEFAFDAGENSDYLPHKYGSNCVVYTGTHDNDTAEGWFASLDEHDRNFVREYIGA GYTPEGEIHWDLVRAALGSVADLAVIPVQDYLGYDTSARINEPSTLGKNWRWRMSREDLN EDTVKRCRRLAGIYGRLGEA >gi|157101644|gb|DS480680.1| GENE 151 172594 - 173613 648 339 aa, chain - ## HITS:1 COG:CAC1451 KEGG:ns NR:ns ## COG: CAC1451 COG2207 # Protein_GI_number: 15894730 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 84 333 36 285 295 117 27.0 3e-26 MQDGMKMGSTESGYQGRDRSGMQGNGAYGEVGREEYGEGGREEYRGRGREQYGYHEIKEH AGRDFPFNIYPCSIPADFRQVPVHWHEDMEIIAVKKGRGVVTVDMEPYEARAGEAVVVFP GQLHGISQCGSEAMEYENIIFLPSMLMSAESDLCTYDFLRPMTEGGIGKPLHITGGLPDY GAFMECIGILDQLCGKKNYAYQMGVKGTLFWFLGLIAGIWGPEAAGQPRKSRERMKRLLE YMEEHYGEKITVEDGAELCFYSNSHFMKYFKQYMGVPFIQYLNEFRLEKAAGMLLTTPDP VTAVAQRCGFDNISYFNRLFRRKYGKTPGEYRKNGEIYL >gi|157101644|gb|DS480680.1| GENE 152 173758 - 176046 2296 762 aa, chain + ## HITS:1 COG:SP2106 KEGG:ns NR:ns ## COG: SP2106 COG0058 # Protein_GI_number: 15901921 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Streptococcus pneumoniae TIGR4 # 6 757 4 752 752 967 63.0 0 MNQNQLEQMLTEACKKDISSCTDKELYTGLLCLVNKLSKGRESARGKKKLYYISAEFLIG KLLSNNMINLGIYSQVKSILASRGKDLSVLEELEPEPSLGNGGLGRLAACFLDSIATLGL NGDGIGLNYHFGLFRQLFRDNLQNEEPNPWMENTSWLSKTNVTYQIPYRNFTLTSRMYDI EVTGYDNRTNKLHLFDVETVDESIVEDGISFDKTDIRKNLTLFLYPDDSDRAGQILRIYQ QYFMVSNAARLILSEAAAKGSTLYDLPDYAAIQINDTHPTMVIPELIRLLTTEHGMEMDD AIQVVTDTCAYTNHTILAEALETWPMDYFKEAVPQLIPIMEVLDDKVRRKFDDPSVYIID KGDRVHMAHIDIHYGFSVNGVAALHTDILKKNELNNFYRLYPEKFNNKTNGITFRRWLLH 
CNPQLADYISELIGDGFKKDAFELKKLDDRAFTANIKTLHRLLAIKDSKKQELADYLKET QDIDIDPKSIFDIQVKRLHEYKRQQMNALYLIHKYLEIKRGSRPATPITAIFGAKAAPAY TIAKDIIHMILCLQQVIDSDPDVKPWLNVVMVNNYNVSLAEKLIPACDISEQISLASKEA SGTSNMKFMLNGAVTLGTDDGANVEIHELVGDDNIYIFGESSETVIRRYAEGSYCSRSYY EADPDLKEAVDFIISPAMTAAGSVEHLERLYNELLNKDWFMTFPDFKAYCQAREQALSDY ENRTAWARKMLINISKAGYFSSDRTIEEYNRDIWKLDTGLVP >gi|157101644|gb|DS480680.1| GENE 153 176154 - 177407 1360 417 aa, chain + ## HITS:1 COG:alr1615_2 KEGG:ns NR:ns ## COG: alr1615_2 COG1404 # Protein_GI_number: 17229107 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. PCC 7120 # 117 402 43 327 416 164 38.0 2e-40 MKKLITATALFLALATASGAVPYNAADNAFSAYAVGPGLNNLSIKDDKASYQWALKNDGQ AQKIVQELDIDSLDPAYVHRNKRGKVDAIALPPLEPANILTSAIDAVPGIDINIQPAWEA YSQTETKRPLTVAVIDTGVDISHPDLQGSIWVNEDEIPGDGIDNDGNGFVDDVNGWNFYS NTNQVYEGPEDVHGTHAAGTIAASKDNGGIVGIADNNYVKIMPVKALGGEEGKGSPDNVI AAIKYAKANGAQICNLSFGSQNCTEEFKAAIRDSGMLFIVAAGNGDDNAIGYNIDASPIY PASLPFDNVITVANLMFDGKLDESSNFGAGSVDIAAPGTFILGITPDNGYAFMSGTSMAA PMVTGTAAMLYTSRPELSLTDIKNVILASARKMDSLNGKVLSGGMLDANGAMQWGRQ >gi|157101644|gb|DS480680.1| GENE 154 177385 - 178443 606 352 aa, chain - ## HITS:1 COG:BS_ykgB KEGG:ns NR:ns ## COG: BS_ykgB COG2706 # Protein_GI_number: 16078366 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Bacillus subtilis # 2 346 3 343 349 195 33.0 1e-49 MEYRIYAGTYAGADENGIFRYRMDGNSQILERELALPGISNPSYLALSQNGTMMYAVMED MEYHGKAGGGVCAIKCRENSLEILNSRGTTGTLPCHVTLDEEAGVIYAANYMSGSLSMFP LMPDGSLGEMSDCRQHKGKGPNTLRQEGPHVHFSGLSPEKDGIWCVDLGLDCILFYGIDV HTMTLSHRPERDIMLPKGTGPRHFVFQPGNAHRMYVVCELSSEVFAVNILKEGCQILQRI SSLSAPNDKSSCAAIKCSEDGRFLYVSNRGDDSIAVFRIDPKNGLLHLVQIEKAGGRTPR DILVLKDMILSANQDSNTITCFYRNEDTGILTRRMGETFCHAPVCLTAVPTA >gi|157101644|gb|DS480680.1| GENE 155 178498 - 179157 717 219 aa, chain - ## HITS:1 COG:MTH888 KEGG:ns NR:ns ## COG: MTH888 COG0684 # Protein_GI_number: 15678908 # Func_class: H Coenzyme 
transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Methanothermobacter thermautotrophicus # 19 211 31 214 220 83 31.0 3e-16 MVLKKHELLEVIKKELYTPVVGDILDQMGLYHQFLPQAVRPLRDDMKLAGYAMTVLMIDV FGQQKKPFGYLTEALDDLQEDEIYVAAGGTMRCAYWGELLTATAKKRGAAGAVVNGWHRD TPQVLDQNWPVFSRGCYAQDSSVRTQVVDYRCRIEIEGVTVMSGDLIFGDVDGVLVIPAE HIEYVIEKALEKARGEKNVRKAIEKGMSATEAFATFGIL >gi|157101644|gb|DS480680.1| GENE 156 179159 - 180181 434 340 aa, chain - ## HITS:1 COG:no KEGG:Cwoe_4678 NR:ns ## KEGG: Cwoe_4678 # Name: not_defined # Def: hypothetical protein # Organism: C.woesei # Pathway: not_defined # 4 319 10 334 365 205 35.0 3e-51 MYSRNYGCRITDQIIMDGNKAVVMENQKLRLTFLADRGMDCVEMLYKPEDIDFMWRSPAG LHKRSEYLSNSGDSLGNYLDHNSGGWQEILPNGGGECSYKGACLGMHGEISSVPWDCRII KDSEEEVVLKACITTLRSPFRLEKEISLKMDESAITIRESLTNLAKEPMELMWGHHPTVG KPFLDSSCRIDTNGTVGFSMDQRDFETQRLKPGTRFEWPAAGNGVDFSNVPEEDADTADM IYITGFPERAWYRVHNETKNISYGMSWDGKLFPYMWMWQVCGGSYGYPWYGRTYNLALEP WTSYPSSGLIKAIENGSALCLEAGEIRQTELCFWIKKEEI >gi|157101644|gb|DS480680.1| GENE 157 180253 - 180996 199 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 9 238 4 238 242 81 30 4e-14 MQLFDLTDKNVLITGGTRGIGFAIAQGVKQAGGRVWIHGSSEDRTRRIAEENGFDYLYGD LRDQEGLDGMLRPFLSREDRLDVLINNAGFETHSAIENAEEESMDAIYNVNTKSPYFLLQ KLLPALKKAGRASVINVTSIHQKVPVRENGYYCMAKASLGMFTKVAALELAKYGIRVNNL APGAILTDMNRELVEAMDFEKWIPIGRVGRADELIGPAVFLASDASSYVTGTTLYVDGGY SENLLRY >gi|157101644|gb|DS480680.1| GENE 158 181019 - 182110 720 363 aa, chain - ## HITS:1 COG:CAC1332 KEGG:ns NR:ns ## COG: CAC1332 COG1312 # Protein_GI_number: 15894611 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Clostridium acetobutylicum # 70 333 41 311 351 143 32.0 5e-34 MRVAWRKEMKLGVGLYRYMLKDEEFTFARQCGCSDVIIHLANYYDGENDIVKATDEKENY GIARAGESVWSLDHMMVLQKQAKEKGLNIYGIENFSPADWYDILLDGPKKEEQMEHLKEI IRNAGKAGIRCFGYNFSLAGVWGHQKTKKARGGAISTCFHAGQLNLDARIPKGEIWNMTY 
AENARGEYIEDIGYEELWNRLKWFLERILPVAEEAGVMMALHPDDPPMPFLRGTPRLVYQ PQLYQKVLDLVPSPCNGIDFCMGSIQEMTDGNIYDALEQYSSQGKIGYAHVRNVRGKVPD YTEVFVDDGDINIKRTLEILKKNHFEGVLIPDHTPQIQCQDTWHAGMAFALGFIRAELMG LEG >gi|157101644|gb|DS480680.1| GENE 159 182110 - 183126 872 338 aa, chain - ## HITS:1 COG:BS_rbsC KEGG:ns NR:ns ## COG: BS_rbsC COG1172 # Protein_GI_number: 16080648 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus subtilis # 32 329 28 317 322 181 40.0 3e-45 MKSVKIDSLHMELFSCKKTKNLILDHLIEILLLVLIIGVSIAAPSFLKSGNILNILRNAA MKGVIAYGMCLVIISGEIDLSVGSQVALSGVLAGFISKGLAEAGIMPIELGALVGIAAAT LVALLTGLFHAWARQKFNMPSFIITLATLNVMYGFAAIVCGGFPIANAFPTWYAFLGSGR ILGIPFPAIVFILIFLFFWFLTEKTDLGRRIYAVGGNAEAARLNGISVWKTRIFVMCMVQ LMCVLSGIMNSAQVRSATFTFGRGWETQIISSVVIGGTSMLGGIGTIWGTLIGVLFTGVI TNAMTLLNVNEFMQYVINGALMFFAVMFNTYMTKKRNN >gi|157101644|gb|DS480680.1| GENE 160 183123 - 184616 1277 497 aa, chain - ## HITS:1 COG:AGc5112 KEGG:ns NR:ns ## COG: AGc5112 COG1129 # Protein_GI_number: 15890066 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 3 474 25 497 521 400 44.0 1e-111 MSILRTEGIIKDYPGTRALDDVTVSFDSGKVHAFIGKNGSGKSTLVKVFAGAIQPTSGHF YLDEEELHFSTPVEALNKGIATVYQELSLVPHLSVAENIFLGRLPKKGRLVDWKRTYQMA GELLHAMGVDIDPHEKVFRLSMWQCQVVEITKAMSFKPKVLMLDEPTSALAANETQKLFE AIRMLKKQDVIIIYISHRLQELWEIADSVTVLRDGKYIGTEPISSLQHKDIITMMFGDVE IKERPGDLKVQDGIIMDVKHLTRKGKFQDVSFQLRKGEVLGIAGMLGSGRTELLRCIFGA DPYDSGEILVDQKTAPKNAGPIGMMQMGLGLTPEERKTQGVILIHSIRDNLCYASMDKMT QRHVINEKTRKEFSERQIRDLQIKIPDLMAPVSSLSGGNQQKVIIGNWLNTAPKIMMYDE PSRGIDVKAKQQIFQIMWEQSRKGISSIFVSSELEELVEVCHRILIMHMGKIVGEVCLDD HVTIDALYAYCMGGKVE >gi|157101644|gb|DS480680.1| GENE 161 184652 - 185743 1135 363 aa, chain - ## HITS:1 COG:ZyphF KEGG:ns NR:ns ## COG: ZyphF COG1879 # Protein_GI_number: 15803073 # Func_class: G Carbohydrate transport and 
metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Escherichia coli O157:H7 EDL933 # 66 305 32 272 327 92 30.0 1e-18 MKKIALAAMILGMAGVLGACSMDGDGASKTTGGTTVAQGTSQEEKGQKEKAGEGEEQTDT SGAGKSIAGIVFQEDQFFKLMSAGYEAAAKDLGYEIQLSNVTNDQMKETELVNTYTAQGV AGIAISPLSATISPEALKVSSDKGVKICVANTAMDSYPFSIAAYTSDNYAFCKQTGEIAV DFIKEKYGDEKIELGILQFKTQIPETSAERVNGFLAALDEGGVNYEIVSNQDAWLQDTAI AKAGDMLAANPDIDIMFAANDGGTIGAVMAVENAGQAGKTFVFGTDASEQIVDLLKADND ILQAVTGQDPFQIGYKTVETLVQSIEGKDVADQGKTVIVEGIPLRRGDTKQLDDFMSDLK SKM >gi|157101644|gb|DS480680.1| GENE 162 185925 - 186797 473 290 aa, chain + ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 22 283 14 286 292 72 24.0 1e-12 MQISAKNVNTPSVFPIDCEVYYSIRTLIDANEYPQVHDFYEIILVTENSLDILLNNDRIK LGHGNLLLIRPGDIHTKIQTGDSVHINLAFPAYTVNSLFGYLYNSQAPRKELFAGSHSPI VRLSTIDTLMIQNRLSFLNQLSSSDMEEKNTHLRAILIDIMYSYIIPELEKQKRSSDASN LPAWLVPALEGLSNPKNLTMGMDYLVRQTQRSPEHICRMFRKHLGITPSAYINGKRLNYA ANLLIHTDMEIVDVIYESGFQSINYFYHLFKKEYGISPVKYKKIHLVQKL >gi|157101644|gb|DS480680.1| GENE 163 186888 - 188300 1753 470 aa, chain - ## HITS:1 COG:FN1376 KEGG:ns NR:ns ## COG: FN1376 COG5016 # Protein_GI_number: 19704711 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Fusobacterium nucleatum # 9 453 4 448 448 536 57.0 1e-152 MAEIEKRPVKIVETVLRDAHQSLLATRMSTEQMLPIIDKMDKVGYHAVECWGGATFDSCL RFLKEDPWERLRKLRDGFKNTKLQMLFRGQNILGYNHYADDVVEYFVQKSISNGIDIIRI FDCLNDIRNLETAVKATNREKGHAQIALCYTLGDAYTLDYWKDIAKRIEDMGADSLCIKD MAGLLTPYAAAELVQALKEGTSLPIDLHTHYTSGVASMTYLKAVEAGCEIIDCAMSPLAL GTSQPATEVMVETFRGTPYDTGYDQTLLAEIADHFQPIREQALKSGLLNPKVLGVNIKTL QYQVPGGMLSNLVSQLKEAGQEDKYRQVLEEIPRVRKDFGEPPLVTPSSQIVGTQAVMNV IMGERYKMVPKESKKIMLGEFGQTVKPFNPEVQKKIIGDETPITCRPADLIAPQLPQFEK ECAQWKQQDEDVLSYALFPAVAKDFFEYRAAQQTKVDSSAADKDSKAYPV >gi|157101644|gb|DS480680.1| GENE 164 188431 - 
189576 1365 381 aa, chain - ## HITS:1 COG:SPy1177 KEGG:ns NR:ns ## COG: SPy1177 COG1883 # Protein_GI_number: 15675149 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Streptococcus pyogenes M1 GAS # 5 380 3 376 376 356 54.0 3e-98 MDFGNIINNLLNQMAFFHLGAGNYVMIVVALVFLYLAIRKGFEPLLLIPIAFGMLLVNIY PDIMYSPEQTSNGTGGLLWYFFQLDEWSILPSLIFLGVGAMTDFGPLIANPISFLMGAAA QLGIYSAYFFAILLGFSGKESAAISIIGGADGPTSLFLCSKLGQTQIMGPIAVAAYSYMA LVPIIQPPFMKLLTTEKERKIKMEQLRPVSKLEKILFPIIVTIVVCLILPSTAPLVGMLM LGNLFRESGVVKQLADTASNALMYIVVILLGTSVGATTSADAFLNVTTIKIVALGLTAFI FGTIGGVLLGKLLCKITGGKINPLIGSAGVSAVPMAARVSQKVGAEADPTNFLLMHAMGP NVAGVIGTAVAAGTFMAIFGV >gi|157101644|gb|DS480680.1| GENE 165 189627 - 190019 325 130 aa, chain - ## HITS:1 COG:SPy1176 KEGG:ns NR:ns ## COG: SPy1176 COG0511 # Protein_GI_number: 15675148 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Streptococcus pyogenes M1 GAS # 1 129 1 116 116 68 41.0 2e-12 MKNYTITVNGKAYAVTVEEGAVSGIAAAPAAPAAPAAPAAPAPAPAAAPAPAAAPAGAAG SVSVSAPMPGKVVAVKAAVGQAVKKGDVILVLEAMKMENDIVAPQDGTIASINASTGDSV ESGAVLATLN >gi|157101644|gb|DS480680.1| GENE 166 190070 - 190873 857 267 aa, chain - ## HITS:1 COG:no KEGG:Closa_1325 NR:ns ## KEGG: Closa_1325 # Name: not_defined # Def: sodium pump decarboxylase gamma subunit # Organism: C.saccharolyticum # Pathway: not_defined # 1 265 1 252 256 106 37.0 1e-21 MRRLMKRVAAAVLMAACLLSLSACSAKQDEAGTAPRIATDGTPVDDMMAEYAKLTAVQSL GVSTDGLIAQKVMSEAQGNAVMAAICEEQINARKELGEVRSIDIEDASVVQMADGTYTVM LPVAFTEGSMQYVMNLNMISQQIVDAEFTELASNEEEGKTIGQLMETATVYAVIGIGTVF AVLIFISLLIACFKFIHKWETGQKEKAAPVAPAPVKAAAPVPAAGEDLMGDAELVAVISA AIAAYEGTSSNGLVVRSIRRVQRSKGR >gi|157101644|gb|DS480680.1| GENE 167 190891 - 192360 1564 489 aa, chain - ## HITS:1 COG:PAB1769 KEGG:ns NR:ns ## COG: PAB1769 COG4799 # Protein_GI_number: 14521095 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, 
carboxyltransferase component (subunits alpha and beta) # Organism: Pyrococcus abyssi # 6 489 28 520 522 362 41.0 1e-99 MMEDWKMSNSAQTSASSRIAALLDENSFVEVGAYITARSTDFNMTEQETPADGVVTGYGT IGGCLVYVYSQDASVLGGSMGEMHAKKICNIYSMAMKMGAPVIGLVDCAGLRLQEATDAL DGFGKLYLSQTMASGVIPQITAIFGTCGGGMAVSAGITDFTFMEEKSGKLFVNSPNAIDG NYKEKCDTSSADFQSRESGLADFTGDEASIIAQIRALVSILPANNEDDMSYGECQDDLNR ICDNIEGCTGDTALALTLISDNSFFMEVKKNYEPSMVTGFIRLNGTTIGCVANRTEIYEN GEKKETFEPVLSGKGCDKATDFIHFCDAFNIPVLTLVNVKGYKAKMCSERRIAKAAARLT YAYANASVPKVTVIVKEAFGSAYLTMGSKSIGADVVYAWPGAVIGMMNPSEAVKIMYAKE IAASGDAAALINEKTAEYTALQSSAVAAAKRGYVDDIIKAGETRQRVIAAFEMLFTKRED RPSKKHGTV >gi|157101644|gb|DS480680.1| GENE 168 192598 - 193812 1520 404 aa, chain - ## HITS:1 COG:MT3812 KEGG:ns NR:ns ## COG: MT3812 COG0527 # Protein_GI_number: 15843329 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Mycobacterium tuberculosis CDC1551 # 1 400 1 405 421 342 49.0 8e-94 MGLIVQKFGGTSVADAARLRHVAEIITDTYKAGNQVVAVLSAQGDTTDELIEKANEINPL ASNREMDMLLSTGEQMSVALCAMAVEALGYPVISLTGWQAGMVTNTVARNARIKKVDTER IETELNQKKIVIVTGFQGINRYEDITTLGRGGSDTSAVALAASLRADLCQIYTDVDGVYT ADPRIVKDARKLDEVTYNEMLELATLGAQVLHNRSVEMAKKYNVKLEVLSSFTGHPGTKV KEVAKRMEKSYISSVAKNVNIARIALIGVPNEIGTSFKVFSLLAQNHINVDIILQGIGHE EGKDICFTVAQSDLEAASVLLKEHQESIGFGHLETNSHIAKVSVVGAGMINHPGVAATLF EALYDANININMISTSEIKISVLVDEKDAEKAVQVIHDKFFSMG >gi|157101644|gb|DS480680.1| GENE 169 193945 - 195675 2219 576 aa, chain - ## HITS:1 COG:no KEGG:Closa_3778 NR:ns ## KEGG: Closa_3778 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 22 575 16 556 557 748 66.0 0 MKKPFVKGGKALTLAAALLCGAAFLGSFTAFGAAAQAESQAGAQAEAQVFTPGVAGQITS CKISGDKQNVEIGFSSSGDVTGTDGNIYVFEMQPYEEELGSRSNFIASSPALSAASVSVP LNLGTPQDKLYSRFVLAVFDGTKFIKVSQPHYITNPELIAKNQTPFNDPLTKKGLVIEID MLSDAFDLGVKHVTTNIAFSQIMGTGIDYQYDGKTYHFNKGVVDAYDKTISALSNKGMTV TVILLNDWNPNTPDLVYPGTQKNSNAFYYMFNAETEAGFDQTKAIASFLAQHYSGDNPNY GKVSNWIIGNEINNQQWNYMGPMDLTNYVKAYQKAFRVIYTAIKSTNANDRVYFSLDYNW 
MNEIDGKLKYGGKEIVDSFNSIANTQGQMDWGLAYHPYPCPMTEPEFWDDPQSTGLFTND FNSPVINFANLNVLTDYFVQDTLRAPAGNVRHIILTEQGFTSYSPTRGNIPEIQAAAYAY SYYLVDSNPYIDAYTVSRQVDAPSEAKDGLKFGLWECDMNQPNLIVATKRKKIWQVFRDI DKKNATLEATEFAKPIIGISKWSDVVPNFKWKNLEK >gi|157101644|gb|DS480680.1| GENE 170 195912 - 197345 1355 477 aa, chain - ## HITS:1 COG:AGc5112 KEGG:ns NR:ns ## COG: AGc5112 COG1129 # Protein_GI_number: 15890066 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 475 26 496 521 165 27.0 2e-40 MRIEKIRLDEVKKTYGDIKALEDISLSLYQGELLCVAGLSGSGKSVLADIISGRTSKGRG RMYVDGRPAVFTSISQAQTAGIFCVRYRTSVINDISVEENLCLLQKQSCMSFPARQGIIR MRETLKALEVDCRLQAPVASLSLMERHTIEICRAFCAGMKVLILDGIARDYTQKETEYMK RLVIRLKEKGISILWIDTRPEIMLDAADRLAVLRKGRLAAVLYREEFDRQMIDRIIIGEN RKVGREIIRSQAEPGGRGMLLSVREVRATELRDVSFHVCPGEVVGFIHPDDTYCREIVRL LSGELKVESGSAWLAGRQVELAGGRKNMVRYGFGYIEDYKKSLFPKLDVAENLTIAGLEY FAPGVGVREKLEQAVACDYLALLDIREEDLKRPIETLSHRCQLNVALYKWIIAKARLIVI DNLFSGTDILMRNSVREFLEFAKGRGIGVVYCSPNESELSSVCDRLYELAEGKVSCK >gi|157101644|gb|DS480680.1| GENE 171 197342 - 198352 921 336 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 87 326 355 591 602 154 34.0 3e-37 MSKLQTNRSYFYLLLADRIFVVLLFLFLFFGLLRTRTGIFVVVLFFLYQAVMLAVFCIRF LGPLREIEKISEMIAQPERHMADLEGGRPIHRLIQYVKQSVENEYTSQMLASQAEIHALQ SQINPHFLYNTLDTIRSMAIIRDSEDIAQMAESLSMLFRYSISRPGEMATLKDELDNVKN YLVIQGYRFPGKFKYQQRIEDEELLQYKLPVLTIQPVIENAIHHGLETKVGGGQVSVYAY STQERFIIRIADNGMGMTEQQMDSLRERLSKGRFIYEQDRDLKKRGNTGIALPNVNQRIK FYYGDDYGIKVYSTAGIGTIVEIILPAAADSKGGES >gi|157101644|gb|DS480680.1| GENE 172 198368 - 199966 1560 532 aa, chain - ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # 
Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 2 526 3 521 526 164 26.0 4e-40 MKVLIADDEINIILLIKSLIDRQKTDVEIAGEAGDGITALEMVRQLKPDVVITDIRMPGM NGMDLIRHVREEQIPVEFVIISGYSEFEYARSAIQYGVSDYLLKPIKKDELNDVLAKLES QRASRREQQMQFNLMEDRLASNNGILRKNLLQNLLTGDGELFRQALGTGKVKDIFDYRGS FFTVAAVKLDCVSDSEYQVPEQAVETITAKIMRKLKACCHETEFIISRSWGYICLNYDGG EPALDHICSELGEMLERTEYKNDLFEMTVGIGSRVSSLEHLPDSLETAVKSVMLRIDLGV GNIIRYDRLPENYKKDSYPLPKEFEVKMERQVPLLAAGKLVLIGEEAMNRAVEGQYRYYR LYGLLEEMLERTGLVFSDIMKDYEIPDAEPYIRRLENCTTLNQLKLVWADYVSVSCELCQ RQKDGMGSRPVRMIKEYIGEHYSENITLNDIADIVFLNPAYLSAMFKKETGQTLTQYLID VRIDKAKEMLRNPERTIGEIACMVGYQDERHFSKLFSKMTGVKPTEYRRFYV >gi|157101644|gb|DS480680.1| GENE 173 199990 - 200895 1223 301 aa, chain - ## HITS:1 COG:BS_rbsC KEGG:ns NR:ns ## COG: BS_rbsC COG1172 # Protein_GI_number: 16080648 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus subtilis # 3 290 29 317 322 211 47.0 1e-54 MYLVVFIGLSIACPNFLKFSNIMIGIRQAVYTAIVAFAMTFVISLGGIDLSVGSTVGISG MVLAAMILGGYNIYLAIGVVILLGAFIGLVNGLLVTKLRIAYFIATLGTMSILRGLIYVY TKGIPLYGLKYPEIQFVGQGYIGVIPFPIILTLIILLVCYYLFYKTKFGRYTVSIGSNED AAKLVGINVDKIKILVFMLSGVLCAIVGVILASRSEAAVPEAGNAYEMDAIAATVIGGTS MSGGKGNMVGTAFGAILMATIKNGLSLLNVNTFWHQVVIGIFILFAVALDGYASTRAARN G >gi|157101644|gb|DS480680.1| GENE 174 200997 - 202499 177 500 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 280 489 24 222 311 72 25 2e-11 MFSEEDVVLKMEGIYKAFGSVRVLEGVDLTLHRGEVLGLIGENGAGKSTLIKILCGIHRM DKGSIVLSGKKVDILNASHAQKLGISTIYQELSVMPDLNAVQNIFLNRELTPQRTLVSRL KQKQMKEIAGRVLRDSLNVDLDLDRPLRYLPLAQKQMVEIARTVYADAQIIIMDEPTAAL EANDREQLFKVIRRLKEGGHSIIFISHHLDELMEICDRVSVLRDGQKVSEGYVKDYTVDR IISAMVGKELKNQYPKTKVPIGETLLEVKGLTKKSTFEDISFELHKGEILGLAGLEGCGK NEVIRAVFGAAVYDSGTVKIHGNSLESGGVRKAMDAGIAFVPAERKVDGLFLKQDIAWNM 
TIASLKQISRGRTLARRLEDRVSADYMERMRVKAKGIRQGISALSGGNQQKVMLARWMMM GGDVFLLEEPTRGIDVNAKTEVYEAIGECVKSGKGVVIVSSEEEEVLGICDKILVMKQGR VTRVMDAKSATTEEIKHYSV >gi|157101644|gb|DS480680.1| GENE 175 202518 - 203324 629 268 aa, chain - ## HITS:1 COG:AGpT140 KEGG:ns NR:ns ## COG: AGpT140 COG0491 # Protein_GI_number: 16119885 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 268 5 276 276 203 40.0 3e-52 MFIIPGGMIECDLANMVAMPALADADNRETVSRWVKSPVYYVLIENEEGPVLFDTGCHPK AMTDRWDMGNRKRTPVALEESHFVVNALLGLGYRPDDIPWVVVSHLHEDHAGGLEFFKKS RILVSDQEFTQTMRMYGLGRTSGGYIKKDIEAWLAADLNWETMDEEEAELYPGIRVLNFG PGHAYGMMGLLVTLPGSGNYIVASDAVNTAMNYGPPMQYPGLAYDTRGFERTIDRIRKLE NRYHAQVLFSHDIDQYETYRKAPLEWYE >gi|157101644|gb|DS480680.1| GENE 176 203403 - 204512 1314 369 aa, chain - ## HITS:1 COG:SMb21587 KEGG:ns NR:ns ## COG: SMb21587 COG1879 # Protein_GI_number: 16264775 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 100 357 48 304 320 93 30.0 6e-19 MKRRLALLLAAATTISLMGCATSNQKAAETTTAAQTQEAKAETEKAETTEEAKTAESEKA EPETGGYAVQPRSIKDRTLKVGLIPVTMNTAYTMTINGAKKQIEDKGLDIELLVQAPASN TSTITEQGNIMESMIQQGVDAIALATESDNSMLPYLRAAAEADIPVFLFNMTELDKDNIY YVSSIGYDQYNAGLEIGKYVAEHYGDKEVKLAILEGYAGIVNTMRIDGFKAGIQGKDNIQ IVASQTADWTREKGQSVTESILTSNPDINLIFGDYDEMVLGGLVSIKERGLLDQIDTIGY GCTKDGVAAIENNEMTATIDVGEYGTGIDIIDAVNDFCIEGKSVEKVINRPSKVYDKNNI GELDRVIFE >gi|157101644|gb|DS480680.1| GENE 177 205061 - 205747 651 228 aa, chain - ## HITS:1 COG:CAC0364 KEGG:ns NR:ns ## COG: CAC0364 COG1234 # Protein_GI_number: 15893655 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 5 159 4 164 167 114 34.0 2e-25 MKLHFFGCGSAFNPAMGNTSAWFEMDGCLFLVDCGETVYELLMKRSDLREYRQIYVLLTH LHADHVGSLGSLISYNYCILGRKICVIHPRATVVELLRLMGIKDEFYNYYQELPEDVGCL 
RAEPVPVEHADNMDCFGYILEAEGMRIYYSGDSARMPERIAGMLKDGKLDAVYHDTSLHN PPSPSHCYVGVLEETVPEQLRHKVYCMHLDGECRDMLESRGFRVAEVG >gi|157101644|gb|DS480680.1| GENE 178 205785 - 206600 1001 271 aa, chain - ## HITS:1 COG:TM0598 KEGG:ns NR:ns ## COG: TM0598 COG0395 # Protein_GI_number: 15643364 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Thermotoga maritima # 49 270 58 277 277 162 42.0 6e-40 MKQVKSTAISVFALALALMWLAPLVWLVGTAFSEPTFHMTFLPNSRFTFGNLSYVWNAIP FARYYLNTLILVVVTFCVQFVTSTLAAYALAVMDFKGQTLVFAVIFMQIIIPNDVLITPN FMTLADLGLTDTKVGIMLPFFGSALAIFLLRQHFKSIPKALAEAARIDGANTWQTIWRVY MPCAKPAYMSFAVISVSYHWNNYLWPLIVTNSPSNRTLTVGLAIFAKSKEANMQWANVCA ATFIIILPLLIAFFFLQKQFMNSFVSAGIKE >gi|157101644|gb|DS480680.1| GENE 179 206615 - 207511 1107 298 aa, chain - ## HITS:1 COG:BMEII0113 KEGG:ns NR:ns ## COG: BMEII0113 COG1175 # Protein_GI_number: 17988457 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Brucella melitensis # 29 297 1 270 270 183 39.0 3e-46 MKTKTKKQREIRNNATAWALMAPALIFMLAFTVFPIFRSLYLSLSKYKLGMDGAEFIGLE NYVKLAGSKLFWKVMKNTIVFALMTVIPSMAVGLGLAVLVNRKGKRVGFIRTAYFYPVVM PMIAIASVWMFIYMAKNGLFDQLLIAIGLKPMNVLSSKSTVLPAMAVMYVWKEAGYLMVF FLSGIQSISDEVMEAARIDGAGNWTVFRRITMPLLAPTFLFVSTIAFTNCFKLVDHVVIM TEGAPNNASTLLLYYIYQQGFTNFNYGVSSALTVIMLGLLLVVSLPRFISQDKKIHYN >gi|157101644|gb|DS480680.1| GENE 180 207690 - 209087 1443 465 aa, chain - ## HITS:1 COG:TM1120 KEGG:ns NR:ns ## COG: TM1120 COG1653 # Protein_GI_number: 15643877 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 74 464 28 431 436 217 35.0 4e-56 MKKTKKIAAAVLAMTMAAGLAVTGCSGKGGSTDTDPAAAEQSKETAGSAGESKDDAAKAD DGALHLTFYYPVNVGGSAAQLIEKICADFNAENPDIFVEPVYTGNYDDTVTKIQTAMQGG TPPDVFVSLATQRFTMASTGMAMPLDELIEADGDEGKAYIDDFLPGFMEDSYVDGKIYSI PFQRSTMVLYYNKDAFKEVGLDPEAPPTTWEELAEYGQKLTNDGRYGVGIALNSGSAQWA 
FTGFCLQNSTDGQNLMSEDGKSVYFNTPENVEALQFWLDLQNEYKCMAEGIVQWTDLPTQ FLAGEVAMIYHTTGNMANIHDNADFDFGTAFLPAHKRVGAPTGGGNFYISSNISEDRVQA AWKFIKFATSTDRAAQWSLDTGYVATRQSCFDTDLIKDYYAEVPQASVAYEQLPYAKPEL TTYNAAEIWRVLNDNIQAAVVGDMSAQEALDAAQEQAEEVLSEYQ >gi|157101644|gb|DS480680.1| GENE 181 209419 - 211704 2061 761 aa, chain + ## HITS:1 COG:BH1250 KEGG:ns NR:ns ## COG: BH1250 COG1609 # Protein_GI_number: 15613813 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 212 1 208 338 89 28.0 2e-17 MTTIKDIAKAAGVAQGTVSNVLNGKGNVSSEKIRQVMDAASALGYVPNERAKLLRKGRSN TLAVILPNIRSKQYIDFYLSFKAYAENHGYSVSQYLTSDDNREAEYAAIQDVRSSMAQGM AAVSCCSFADANPYLDEQGMLADHVVFAERRPPFAAPYAGFDYHRAGSELALRALERGFS SLCLLTGSLQLPNESDFFNGFMSIAGSSGCRINHIQTDPYRKLQNIMQMFGAAAPQAIFI SNYGFAESVKDIWNTFYSGDSPEIYTVSPMFTMPENDFQKYELNYRQLGKVAAECLIQDI SKEKKGEKSGPEESEEPNLRTGQDSGQDSGQGSGHPYLLLENSGFRDWFADILIPSSKKP LNVLTLDSPSAYTMRNLSRIYTKKTGVPVNITIYSYEEIYEAFNHMHHDSVFDVLRLDVT WLSWFADKILQPLDQIDPGISSCLDTFLDGTINQYSIVRGRVYALPSTPSVQLLYYRKDL FESPIYRRMYHETYRQELRPPQDFKEFNQIARFFTKACTPSSPVEYGATMTLGSTGVAGS EYLARLFSHQENLYNEDCRVTLNSPAAIGSLRELIALKEYSDPKYCSWWTNTATTFAGGN VAMAILYSNYASDLLSHSSRVAGNIGYALTPGRNPVIGGGSLGVAKYTKRPEDALSFIKW MCSEPVASAATLLGSVSPCRKSYDNYEILNTFPWLNLAKDGFGLAHGRRTPEFSTQPFDE RSFLSIIGMAVKNAYSQVMTPEDALNYAQKRYDEQFLHTGP >gi|157101644|gb|DS480680.1| GENE 182 211820 - 212674 855 284 aa, chain - ## HITS:1 COG:CAC0191 KEGG:ns NR:ns ## COG: CAC0191 COG1737 # Protein_GI_number: 15893484 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 6 282 5 280 283 161 32.0 2e-39 MAEDFLLKIRGGYNQFTKAEKKVADYILSSPKKVLFMSITELAEACGVGDTSVFRFCKTM NCKGYQEFKMLLSLSLHEGKQGFGQMESDISLEDSFSQVAEKVLNSNVKALKETHSLLKE DVFRRVVECFHKAGRICFYGVGTSMTTAMKAADKFLKIEPKVYCSPDSHMQAMMASTMSR GEVAVIFSYSGATKDTIHVAQLARQAGAAIVCVTRFIKSPLTAYADLTLLCGANESPLQA GSSSAEISQLFLIDMLYTEYYRTYYDKCSVNNEKTSASVMAKLC >gi|157101644|gb|DS480680.1| GENE 183 212809 - 213357 172 182 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 14 176 484 655 904 70 30 6e-11 MRHISKKELLSIPNLMGYFRLIMIPVFVWLYLKASTDADYYRAAIVMGISSITDMFDGMI ARKFNMITEFGKFLDPLADKLTHGAILLCLWSRYPLILLLLVLFVLKEGFMLIMGAIKLR EGKKLNGAKWFGKCCTALLFVVLFLLLLFPHMSLVTVNVLILLCAAAMFITLLLYIPVFR SM >gi|157101644|gb|DS480680.1| GENE 184 213408 - 214289 1126 293 aa, chain - ## HITS:1 COG:lin1219 KEGG:ns NR:ns ## COG: lin1219 COG1284 # Protein_GI_number: 16800288 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 9 283 4 275 281 121 29.0 2e-27 MEKKKTTAKIVFSLGEILVGNMMYAAAVVLFIVPNGLITGGTTGLALFVNHSLGIPISLF VSVFNVAMFLLGAWVLGKQFALTTVLSTIIYPVLLGVMEGSGFGGFVLEEKLVAVLYAGL LIGGGIGIVMRAGASTGGMDIPALVLKKKLDVNVSMTLYIMDCVVLGLQLIEADSHAILY GIILIIVYTMVLNQVLMSGNARIQVKIVSRRYEEINRLIAERIDCGTSLLHMETGYLHRE QEMILAVISRRDLPRLNNLVMDEDPEAFMIINQINEVRGRGFTLKRVYKEARQ >gi|157101644|gb|DS480680.1| GENE 185 214294 - 214518 111 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937563|ref|ZP_02084924.1| ## NR: gi|160937563|ref|ZP_02084924.1| hypothetical protein CLOBOL_02454 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02454 [Clostridium bolteae ATCC BAA-613] # 1 74 81 154 154 127 100.0 3e-28 MEGALDRFGETCYHKMQDSLYETESAQNMMVLKAYRRRPRRNRVFIYRRFETSGRQRNLP SFNLGIHSGAVRKD >gi|157101644|gb|DS480680.1| GENE 186 214592 - 214822 326 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937562|ref|ZP_02084923.1| ## NR: gi|160937562|ref|ZP_02084923.1| hypothetical protein CLOBOL_02453 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02453 [Clostridium bolteae ATCC BAA-613] # 1 76 1 76 76 140 100.0 2e-32 MKKLIRCANQYIWESDWKTIGLLKFCLLSLGICVGMAVPEEKKKTVLRAAFLSFVISYIP LMAKFIRIIWSDEGDW >gi|157101644|gb|DS480680.1| GENE 187 214794 - 216155 1223 453 aa, chain - ## HITS:1 COG:FN0667 KEGG:ns NR:ns ## COG: FN0667 COG0534 # Protein_GI_number: 19704002 # Func_class: V Defense mechanisms # Function: Na+-driven 
multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 403 12 414 426 150 26.0 4e-36 MTEGNILTSLMGFALPVLAALFLQTMYGAVDMMIVGQFSHAAEVSAVSTGSWIMQTITTA IVGLAMGTTILLGRKIGEGQPEEAGRVMGASIWIFAVIGLAVTAVMQFLAVPCTAWMRAP KEAFDSTAVYVKICSAGSLFIVAYNLLGSIFRGLGNSRLPLISVSIACLVNIAGDLLLVG VFHMASAGAAIATVFAQAVSVVLSILIIRKQKLPFAFRKESVRYDRQLAGRVVKLGFPIA FQDVLVSISFLVISAIVNTLGVIPSAGVGVAEKLCGFVLLVPSAYGQAMSAFVAQNIGAR RPERAKRALAYGILTSLGVGVFLAYFSFFHGDLLAGVFSRDREVVLAAADYLRAYAIDCL LVSFMFCMTGFFNGCGRTAFVMFQGIAGAFGVRIPVSYLMSRLVPVSLFKIGLATPASTL LQILLCGIYLKVLNRDPLLHGTGEPTSRPHQTI >gi|157101644|gb|DS480680.1| GENE 188 216390 - 216797 495 135 aa, chain + ## HITS:1 COG:FN0893 KEGG:ns NR:ns ## COG: FN0893 COG1959 # Protein_GI_number: 19704228 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Fusobacterium nucleatum # 1 127 1 128 140 63 33.0 1e-10 MMITRESDYAIRIIRALKDGELLPLEQICQRELVPKQFAYKILKKLERSGLVSIRRGAGG GCILNRSLADLTLLDIIKAADEEFFLNPCLDSQYQCGYAASFDGCKVHCELARVQRVLEE ELKRTSLDRLFNKSG >gi|157101644|gb|DS480680.1| GENE 189 216896 - 218779 1932 627 aa, chain + ## HITS:1 COG:CAC0116 KEGG:ns NR:ns ## COG: CAC0116 COG1151 # Protein_GI_number: 15893412 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 1 627 1 628 629 843 64.0 0 MNHCYSCTTCKSADKPLECYIAGLPMETSHHRVEGQSVKCGFGLQGVCCRLCSNGPCRIT PEAPHGICGASADVIVARNFLRAVAAGSGCYIHVVETTALNLKETALSGGALKGMNALEH LCQLFRIPQGDPHTRAAAVAQAVLDDLYLPDYTPMKLQQKLSYAPRLAKWNELGILPGGA KSEVCAGVVKCSTNLNSDPTDMLLHCLKLGISTGVYGLMLTNLLNDVMLGEPEITPAPVG LGVIDPDYINIMITGHQHSIFVHLQEQLKKPEVQELAKAAGARGFRLVGCTCVGQDLQLR GVHYQDIFSGHAGNNYTSEAVLATGAIDAVLSEFNCTLPGIEPICDRLNIRQICLDDVAK KQNAALKPFVFEDRERISREILDEILESYKNRRARVPLNLLPEHGNTNTITGVSEVSLKK FLGGSWKPLIDLIASGDIKGIAGVVGCSNLTAGGHDILTVELTKELIARDILVLTAGCSS GGLENCGLMTPEAASLAGPKLRAVCESLGIPPVLNFGPCLAIGRLEMVATELALALNVDI PQLPLVLSAAQWLEEQALADGAFGLALGLPLHLGLPPFITGSSLVTKILTEDMKALTGGQ IIVNSHAGESADILEQIIMDKRQALNI 
>gi|157101644|gb|DS480680.1| GENE 190 218790 - 219242 300 150 aa, chain + ## HITS:1 COG:TM0396 KEGG:ns NR:ns ## COG: TM0396 COG1142 # Protein_GI_number: 15643162 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Thermotoga maritima # 1 148 5 150 152 94 35.0 1e-19 MKRIWIDAGRCDGCLNCTLACMNAHRADKGSIYDLDLTDPANESRNFIRRQPDGSYRPIF CRHCDEPECVNSCMSGALSKNRETGIVEYDEKQCAGCFMCVMNCPFGVLKPDNTARSRII KCDFCKDSGSEPSCVKACPKKAIWIEEVES >gi|157101644|gb|DS480680.1| GENE 191 219239 - 220483 1169 414 aa, chain + ## HITS:1 COG:TM0395 KEGG:ns NR:ns ## COG: TM0395 COG0446 # Protein_GI_number: 15643161 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Thermotoga maritima # 1 396 1 406 425 209 29.0 1e-53 MRYVIAGAGPAGISAAKTLRQLDRDGEIVLVSKDDQVHSRCMLHKYLGGERDAEGISFIP PGFFKQNNITWYPGRAMVRLDCAERRILLDDGTFLPYDRLLLATGAYYLLPPVPGLKEAE NVYGFRDLSDALAIDKAVRPGARAVIIGSGLVGMDAAYALMERGISPVIVEMAGRILPLQ LDAEAASEYQRLFEHHGCSFRLAARASRTRLDSRGNVSALVLEDGEELACDFIIAAAGVR PEIRFLEHCSVETGRAVTVDQYLSTSVPGVYGAGDVCGLSGIWPNAMKQGAVAAKNMYGI PTPYEDTYAMKNTMNFFGLPALCIGDINRLDSHTLVITEEDSQNYRKALVEDGVLKSILM VGNISGSGIYQYLIKNQIKLPSADRSIFRLSFADFYGFDTSKGSYTWDTDLCCG >gi|157101644|gb|DS480680.1| GENE 192 220605 - 221798 1436 397 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 397 1 407 407 557 67 1e-157 MAKAKFERNKPHCNIGTIGHVDHGKTTLTAAITKTLHERLGTGEAVAFENIDKAPEERER GITISTAHVEYETENRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGVMAQTREH ILLSRQVGVPYIVVFMNKCDMVDDPELLELVEMEIRDLLNEYEFPGDDTPVVQGSALKAL EDPKSEWGDKILELMKAVDEWVPDPVRETDKPFLMPVEDVFTITGRGTVATGRVERGTLH LNDEVEIIGIHEDVRKSVVTGIEMFRKLLDEAQAGDNIGALLRGVQRTEIERGQCLCKPG SVKCHNKFTAQVYVLTKDEGGRHTPFFNNYRPQFYFRTTDVTGVCDLPAGVEMCMPGDNV EMTVELIHPVAMEQGLRFAIREGGRTVGSGRVVSIVE >gi|157101644|gb|DS480680.1| GENE 193 221917 - 224034 1794 705 aa, chain - ## HITS:1 COG:CAC3138 KEGG:ns NR:ns ## 
COG: CAC3138 COG0480 # Protein_GI_number: 15896387 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 3 704 2 687 687 932 64.0 0 MAGREYPLERTRNIGIMAHIDAGKTTTTERILYYTGVNYKIGDTHEGTATMDWMAQEQER GITITSAATTCHWTLQENCKPKPGALEHRINIIDTPGHVDFTVEVERSLRVLDGAVGVFC AKGGVEPQSENVWRQADTYNVPRMAFINKMDILGANFYGAVDQIKTRLGKNAICIQLPIG KEDDFKGIIDLFEMKAYIYNDDKGDDISIIDIPEDMQDDAELYRSELVEKICELDDDLMM TYLEGEEPSNDDLKKALRKATCECAAIPVCCGTAYRNKGVQKLLDAVIEFMPSPLDIPSI KGTDLDGNEVERHSSDDEPFAALAFKIMTDPFVGKLAFFRVYSGSLNSGSYVLNATKGKK ERVGRILQMHANKRNELDRVYSGDIAAAVGFKVTSTGDTICDEKNPVILESMEFPEPVID IAIEPKTKASQDKMGDALVKLAEEDPTFRVRTDEETGQTIISGMGELHLDIIVDRLLREF NVEANVGAPQVAYKETFTKAVDVDSKYAKQSGGRGQYGHCKVHFTPMEANAEETFKFTSS VVGGAIPKEYIPAVGEGIEEACKTGILGGFPVLGVHADVYDGSYHEVDSSEMAFHIAGSM AFKEAMHKGNPILLEPIMKVEVTMPEDYMGDVIGDINSRRGRIEGMEDIGGGKMVRAYVP LSEMFGYSTDLRSRTQGRGNYSMFFDKYEPVPKNVQEKVLSDHKK >gi|157101644|gb|DS480680.1| GENE 194 224083 - 224553 706 156 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240146972|ref|ZP_04745573.1| 30S ribosomal protein S7 [Roseburia intestinalis L1-82] # 1 156 1 156 156 276 86 7e-73 MPRKGHTQKRDVLADPIYNNKVVTKLINNIMLDGKKGVAQKIVYGAFDRVAETTGKDAME VFEEAMNNIMPVLEVKAKRIGGATYQVPIEVRPERRQTLALRWITLYSRKRGEKTQKDRL ANEIMDAANNTGASVKKKEDMHKMAEANKAFSHFRF >gi|157101644|gb|DS480680.1| GENE 195 224857 - 225276 665 139 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240146973|ref|ZP_04745574.1| 30S ribosomal protein S12 [Roseburia intestinalis L1-82] # 1 139 1 139 139 260 92 4e-68 MPTFNQLVRKGRQTSVKKSTAPALQRGYNSLQKKATEVSAPQKRGVCTAVKTATPKKPNS ALRKIARVRLSNGIEVTSYIPGEGHNLQEHSVVLIRGGRVRDLPGTRYHIVRGTLDTAGV ANRRQARSKYGAKRPKEKK >gi|157101644|gb|DS480680.1| GENE 196 225434 - 226627 1164 397 aa, chain - ## HITS:1 COG:PH0014 KEGG:ns NR:ns ## COG: PH0014 COG1906 # Protein_GI_number: 14589976 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pyrococcus horikoshii # 3 390 6 390 395 121 25.0 
2e-27 MKLCFVFVVIIAILWMKRPLFWAISGGLAAALVLYGIRVPDTLSIMGRSMVSKDTVTVVL SFYFITFLQRMLERRNRLKQAEQSLNWLFNNRRVNASAAPAVIGLLPSAGAMTICAEIVR SSCQDYLSNEDMTCVTSFYRHIPESFLPTYSSILIALAVSGVGAGEFVLAMLPLVAALFF IGHMFYLRKVPRSTGQKTEEGGKKAAVMLFKSLWSIILIVVLIIAFDIPVYVATPMAAVL NIFVDHLKPWEIKPMFRTAFEPIIIFNTILIMMFKDIITYTGVIHELPVFFGGLPIPLPM VFALIFFFGTIISGSNAIIPLCMPMAMAAMPDAGVPLLVLLMSSAYAAMQVSPTHVCLFI AAECFKVDIGALVRRNIPMILVFFAVTLAYTALLGVF >gi|157101644|gb|DS480680.1| GENE 197 226831 - 227184 299 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937579|ref|ZP_02084940.1| ## NR: gi|160937579|ref|ZP_02084940.1| hypothetical protein CLOBOL_02470 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02470 [Clostridium bolteae ATCC BAA-613] # 1 117 3 119 119 210 100.0 2e-53 MGFNEYLKLLLLEKNMKLADLCRMAGIQTSLMSEYINGRKSPTTGNAILIADVLNISLDN LAGRKWNTCCDHKKPGTYTGEYTGEYTDELRETLHKLTEKECIFVMDVIKALKQHLQ >gi|157101644|gb|DS480680.1| GENE 198 227488 - 231195 4270 1235 aa, chain - ## HITS:1 COG:CAC3142 KEGG:ns NR:ns ## COG: CAC3142 COG0086 # Protein_GI_number: 15896391 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Clostridium acetobutylicum # 13 1192 6 1178 1182 1635 68.0 0 MPETNETYHPMTFDAIKIGLASPDKILEWSHGEVKKPETINYRTLKPEKDGLFCERIFGP SKDWECHCGKYKKIRYKGVICDRCGVEVTKASVRRERMGHIKLAAPVSHIWYFKGIPSRM GLILDISPRTLEKVLYFASYIVLDPGSTSLQYKQVLSEKEYREEVEKYGGTGGFRVGMGA EAIQELLRAINLEKDSADLRRALADSTGQKRARIIKRLEVVEAFLTSGNRPEWMIMDVIP VIPPDIRPMVQLDGGRFATSDLNDLYRRIINRNNRLARLLELGAPDIIVRNEKRMLQEAV DALIDNGRRGRPVTGPGNRALKSLSDMLKGKQGRFRQNLLGKRVDYSGRSVIVVGPELKI YQCGLPKEMAIELFKPFVMKELVSNGTAHNIKNAKKMVERLQPEVWDVLEDVIKEHPVML NRAPTLHRLGIQAFEPILVEGKAIKLHPLVCTAFNADFDGDQMAVHLPLSVEAQAECRFL LLSPNNLLKPSDGGPVAVPSQDMVLGIYYLTQERPGAKGEGMVFKSVNEAILAYENQEAT LHSRVKVRVSKTMSDGTVKTGTIDSTIGRFIFNEIIPQDLGFVDRSIPENELKLEVDFHV AKKQLKQILEKVINVHGATQTAVTLDDIKAIGYKYSTRAAMTVSISDMTVPESKPKLIEE AQATVDRIAKNYRRGLITEEERYKEVIETWKTTDDQLTHDLLTGLDKYNNIYMMADSGAR 
GSDKQIKQLAGMRGLMADTTGHTIELPIKSNFREGLDVLEYFISAHGARKGLSDTALRTA DSGYLTRRLVDVSQDLIIREVDCCEGRDIPFMEIKAFMDGNEVIEDLEERITGRYIAETI TDPDTGEVVVKANHMCTPKRAAAVMKVLNKTGRKSVKIRTVLSCKSHIGVCAKCYGANMA TGQPVQVGEAVGIIAAQSIGEPGTQLTMRTFHTGGVAGGDITQGLPRVEELFEARKPKGL AIITEFGGVVQIKDTKKKREITVTDNETGNAKTYLIPYGSRIKVLDGQVLEAGDELTEGS INPHDILKIKGVRAVQDYMLQEVQRVYRLQGVEINDKHIEMIVHQMLKKIKIEESGDSDV LPGVSMDVLDYNEMNEALIADGKKPAEGRQVMLGITKASLATDSFLSAASFQETTKVLTE AAINGKVDHLIGLKENVIIGKPIPAGTGMKRYRKVKLDTDDLISDEILLSDDDELVLSSE DESSSGLTEDNVAEEVLGMDDMADVDEEDDAVEEE >gi|157101644|gb|DS480680.1| GENE 199 231212 - 235096 4296 1294 aa, chain - ## HITS:1 COG:CAC3143 KEGG:ns NR:ns ## COG: CAC3143 COG0085 # Protein_GI_number: 15896392 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta subunit/140 kD subunit # Organism: Clostridium acetobutylicum # 6 1233 2 1184 1241 1619 66.0 0 MEKSRIRPVPAGKSVRMSYSRQKEVLEMPNLIEIQKDSYQWFLKEGLKEAFDDISPIADF AGHLSLEFVDFTLCKDDIKYTIEECKERDATYAAPLKVKVRLYNKDKDEMNEHEIFMGDL PLMTDTGSFVINGAERVIVSQLVRSPGIYYGIGHDKVGKELYTCTVIPNRGAWLEYETDS NDVFYVRVDRTRKVPVTVLIRALGFGTNAEILELFGEEPKLLASFGKDTSDNYQDGLLEL YKKIRPGEPLSVDSAESLLNSMFFDPRRYDLAKVGRYKFNKKLSFKYRLNGQVLAEDVVD TATGEILAEAGTKLDKESAMRIQNAAVPYVWVDGPDRKQKVLSNMMVELSAYVPFDPEEV GVTELVYYPALASILEEAGTDEESLKEAVRRNIHDLIPKHITKEDILASINYNMHLEEQV GNADDIDHLGNRRIRAVGELLQNQYRIGLSRMERVVRERMTTQDMDGITPQSLINIKPVT AAVKEFFGSSQLSQFMDQNNPLAELTHKRRLSALGPGGLSRDRAGFEVRDVHYTHYGRMC PIETPEGPNIGLINSLATYARVNQYGFIEAPYRVVDKTTDPSSPRVTDEVVYLTADEEDN FVVAQANEALDEEGHFIHNNVSGRFRTETSEFQKKTIDLMDVSPKMVFSVATAMIPFLEN DDANRALMGSNMQRQAVPLLTTEAPAVGTGIEAKAAVDSGVCVVAKNAGVVERSASNEII IRRDSDGNRDVYHLTKFKRSNQSNCYNQKPIVYKNDHVEAGEVIADGPSTSNGEIALGKN PLIGFMTWEGYNYEDAVLLSERLVQEDVYTSVHIEEYEAEARDTKLGPEEITRDVPGVGE DALKDLDERGIIRIGAEVRAGDILVGKVTPKGETELTAEERLLRAIFGEKAREVRDTSLK VPHGAYGIIVDAKVFTRENGDELSPGVNQTVRIYIAQKRKISVGDKMAGRHGNKGVVSRV LPVEDMPFLPNGRPLDIVLNPLGVPSRMNIGQVLEIHLSLAAKALGFNVSTPIFDGANEN DIMDTLEVANDYVNLSWDEFKAKYEDLLKPSVIDYLGAHLEHRELWKGVPISRDGKVRLR 
DGRTGEYFDGAVTIGHMHYLKLHHLVDDKIHARSTGPYSLVTQQPLGGKAQFGGQRFGEM EVWALEAYGASYTLQEIMTVKSDDVVGRVKTYEAIIKGENIPEPGVPESFKVLLKEMQSL GLDVQVLRDDGTEVEMNENIDYGDTELRAMLEGERRFEEKESYAQFGYQEQEFKDSELVA VEEEEEVSADDYTEEADDEDDLFLDEAADGEDEE >gi|157101644|gb|DS480680.1| GENE 200 235422 - 235883 683 153 aa, chain + ## HITS:1 COG:no KEGG:Closa_3786 NR:ns ## KEGG: Closa_3786 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 6 150 5 149 149 156 73.0 2e-37 MAQIILNVLIVILIIALVALVVLYFLGNKLQKRQVEQQQLLDAAAQTASILVIDKKKMKI TQAGLPKAVTDQTPKYMRWAKVPVVKAKIGPKVMTLIADGRVFDSLPVKTEAKVVISGIY ITEIKSVRGGAIPTPPKKKGFFARFKRNKKDKK >gi|157101644|gb|DS480680.1| GENE 201 235962 - 236387 470 141 aa, chain - ## HITS:1 COG:CAC2894 KEGG:ns NR:ns ## COG: CAC2894 COG4506 # Protein_GI_number: 15896147 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 122 1 120 137 63 34.0 9e-11 MTKDVLITISGIQMIDEEDSDVEMIVRGDYYQKNGKHYILYEEMMEGFTGKVKNVIKISP SGMDIIKKGIANTHMQFEKNKKNLSCYTTPLGDMVVGIQANRIKINEEPDSLLVNVDYSL DINYEHLSDCSIRLDVQSCPQ >gi|157101644|gb|DS480680.1| GENE 202 236405 - 237466 1132 353 aa, chain - ## HITS:1 COG:CAC2895 KEGG:ns NR:ns ## COG: CAC2895 COG1181 # Protein_GI_number: 15896148 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Clostridium acetobutylicum # 3 351 2 340 343 295 45.0 9e-80 MGKLQAAVIFGGQSSEHIVSCMSAVNVIEHIDRTRYDVLLVGITEDGHWIKADSLEDVKN GTWRDKTVSALLLPDATKKCILLMDGDSVREVRVDVVFPVLHGLYGEDGTIQGLLELAKI PYVGCGVLSSAVSMDKLYTKIIVDDLGVRQAAYEAVYREQLVHMESVVERVEKHFSYPVF IKPSNAGSSRGVTKADDRQALEAGLSEAARHDRKILVEETIIGREIECAVFGGGLKEVVA SGVGEILAAAEFYDFEAKYYNEESRTVTDPELPAGAAERVRQAAMDIFRAVDGYGLARVD FFVKDDGEVVFNEINTMPGFTAISMYPMLWEARGITKDQLVDMLMAHGMERYT >gi|157101644|gb|DS480680.1| GENE 203 237587 - 238240 565 217 aa, chain + ## HITS:1 COG:no KEGG:Closa_0527 NR:ns ## KEGG: Closa_0527 # Name: not_defined # Def: stage II 
sporulation protein R # Organism: C.saccharolyticum # Pathway: not_defined # 1 216 1 207 215 222 49.0 8e-57 MKYKISLCISALCFFTAFLILMASRTTGEEALASRIAPEILRFHVLAESDSTRDQNLKLG VKGLVLDYIHGQVPEDTDKEQLKQWIESNKTSIETMAQDWLADQGASYPVKLELTRDYFP TKAYGDMVFPCGTYDAVRITIGSGKGHNWWCVLYPSLCYTDAIHAVVPDSSKKTLSSLLG EDDYDALLSPLDRTKQPQQKPEVRVRFRLMDLFHKKS >gi|157101644|gb|DS480680.1| GENE 204 238245 - 238448 331 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937587|ref|ZP_02084948.1| ## NR: gi|160937587|ref|ZP_02084948.1| hypothetical protein CLOBOL_02478 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02478 [Clostridium bolteae ATCC BAA-613] # 1 67 2 68 68 112 100.0 9e-24 MGDETFKERYLSGEIPFEEIDRYVSRWNNSDDPRTLAQYLGLNAEEEDVWIDVSDEALQD MLDSQKR >gi|157101644|gb|DS480680.1| GENE 205 238448 - 239155 873 235 aa, chain - ## HITS:1 COG:SMb20773 KEGG:ns NR:ns ## COG: SMb20773 COG1802 # Protein_GI_number: 16265213 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 15 209 23 217 228 110 35.0 3e-24 MENNFKVNMNEYLPLRDVVFNTLRQAILKGELAPGERLMEIQLAEKLGVSRTPIREAIRK LELEGLVLMIPRKGAEVAKISEKSLRDVLEVRRSLEELAIELACQRMSEDDMAELERMQG NFKSAIDRGEAMSIAQSDEQYHDVIYQGTRNDKLVQMLGNLREQMYRYRLEYIKDEDKRQ VLLVEHEHILNALKNRNIAEAKKAAREHIDNQEITVSRNIKEQEQEPAVPARGRK >gi|157101644|gb|DS480680.1| GENE 206 239178 - 240086 1131 302 aa, chain - ## HITS:1 COG:BS_yabH KEGG:ns NR:ns ## COG: BS_yabH COG1947 # Protein_GI_number: 16077114 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Bacillus subtilis # 8 275 6 270 289 241 44.0 2e-63 MIKHLRLKAYGKINLALDVLGRRADGYHEVRMIMQTVGLHDRIDLYVTQEPGIKIETNLF YLPDNEQNLAHKAARLLMEEFGIRQGLSIHLRKFIPVSAGMAGGSTDAASVLFGVNKMFG LGLSTGELMERGVTLGADIPYCLMRGTALSEGIGERLSPLPPMPQCQVLIAKPGISVSTK VVYESLDAMKLAPSDHPDIDGMIDAIRSQNLREVAGRFGNVLELVTGERYPAISRIEQVM RDYGALGAMMSGSGPTVFGLFANPRAAQAAYEDLRFGKASELAKQVYLTNFFNVKRNGQD MA >gi|157101644|gb|DS480680.1| GENE 207 240245 - 240976 695 
243 aa, chain - ## HITS:1 COG:no KEGG:Closa_0515 NR:ns ## KEGG: Closa_0515 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 231 1 227 260 251 60.0 2e-65 MLQGIFNYDNPVWRFIGKLGDLIILNILWIVCSIPVFTAGASTTAVYYVTLKLVRDEDDS TIRSFFRSFKSNFKQATAIWLILLAAGIVLGFDFWFFVSGQMSLNGAVKTAMTAVSGGLL LIYLFIITYVFPLQARFYNPVKRTLFNAFFMSVRHLFQTIGILATDLGIAAATYFGLFYF PQFAMLLFLFGMPLIAFVNSYFFTAIFRKYMPKEEEQEHTDATPLLGEEDEEMKEAIRNL KGK >gi|157101644|gb|DS480680.1| GENE 208 241161 - 241832 731 223 aa, chain + ## HITS:1 COG:no KEGG:Closa_0514 NR:ns ## KEGG: Closa_0514 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 222 8 231 233 142 34.0 7e-33 MSIREEAENERQKLKNMSWKDRAWYVWEYYKFHLLAVILVIGVLSTIGTMIYRQTFTTRL SVAIINDRSAGASSSALLESDLREYLGCGKKDLIEINEGLMVDFNEESTSQYGYATLAKI SALVASKSLDVVIGDQSAIDHYETVSAYQNLEELLSPELYARVKDHIYRAKDGEGNLTPV ALSLEDTALEEKTGIIMDPPYLAVIQGSPHKEAAIQMIEYLFP >gi|157101644|gb|DS480680.1| GENE 209 241771 - 242520 636 249 aa, chain - ## HITS:1 COG:no KEGG:Closa_0513 NR:ns ## KEGG: Closa_0513 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 244 1 244 249 176 41.0 1e-42 MVTERSICLDGKTCTAVISDEPEALLAAKAAGRAVVGVESERTDIWELKGIPYVIPDFED ATDELLDLVIRRHLGLPWNICRTDRLLIRELTADDACHIPEEEYGPQEAIFRLRDTLELY CRNQYGFYEYGTWALVRRDDQVLVGLAGVSNPRLKREMEECLDSMGQSVPWLELGYHVFL PYRQRGYCAEAVTAIADYSHEVLGVRLCALIRRENQASRKVAEGLGMTCLMETDTQSSEW QLLYGESLV >gi|157101644|gb|DS480680.1| GENE 210 242573 - 243091 625 172 aa, chain - ## HITS:1 COG:CAC2769 KEGG:ns NR:ns ## COG: CAC2769 COG0652 # Protein_GI_number: 15896024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Clostridium acetobutylicum # 1 170 1 170 174 235 66.0 3e-62 MANPIVTITMENGDVMKAELYPEIAPNTVNNFISLVKKGYYDGLIFHRVINGFMIQGGCP EGTGMGGPGYSIKGEFAQNGVKNDLVHTEGVLSMARAMHPNSGGSQFFIMHKASPHLDGS 
YAAFGKVTEGMDVVNKIADVQTDYSDRPLQEQKIKSMTVETFGVDYPEPEKC >gi|157101644|gb|DS480680.1| GENE 211 243109 - 244746 1203 545 aa, chain - ## HITS:1 COG:no KEGG:Elen_3094 NR:ns ## KEGG: Elen_3094 # Name: not_defined # Def: regulatory protein GntR HTH # Organism: E.lenta # Pathway: not_defined # 38 496 1 461 486 223 32.0 1e-56 MQYLCTYTYSIEINTWYNTVIRNSRRAEHAGCPGGVAMKRKKTLHDYLYENLREQIETGY YKYGEALPSMNRLCETYHVGIRTVRDVLSELRDQGMIRTEERRPAVIVYGKTGEEDYIPD ILAVLQRKTSVLAVYKTMELLMPRMFALSVQACGVETAVEYFKRLKRDRRKEVRARWKAS SNALYNLLDASHNLLFRDIFTSLEIGSRIPFFLEPEYTPPFSSCSGEYDDPMWMTEAVSA GDTRMIQEQFARMYRSLGHDIKRYLDWIESRFETGNIEQKNGYSWNCAQGHDHLYIQIAR DLIDRIGLGFYKDGSFLPSEAALAQEYSVSVATIRRSISMLNRLGFCRTYNVKGTQVTLF NDQAAFQSMKNKVHKKDILMYLSGLQFMALASAPAALLAAPFIGREEIDCIEEELAGPYA VPLSILMHHIIEDLTLEPYQIILRQVSSILHWGYYYSFYQNGIQAGNELNQISHEAFRRL QAGDTEEFARLLSLCYCHALELVRDSMVVWGLSEAALFVTPQEMDKVKPCWTKPAEFAIV DKDSN >gi|157101644|gb|DS480680.1| GENE 212 244831 - 247113 1420 760 aa, chain + ## HITS:1 COG:PA4601_3 KEGG:ns NR:ns ## COG: PA4601_3 COG2200 # Protein_GI_number: 15599797 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Pseudomonas aeruginosa # 520 758 2 239 247 183 37.0 1e-45 MKKTKDSISDDYEKPLFRRGWYIGFLSLVLVSVFALSILNAAKLQIELDRSTQGYLTDVT SQLSRDIRDAINNKITNLVMTADSFSQFTEKQDTDTMSGFLNRAAQILEFKPLIFFDREG FSVSSQTDDGEPPVSPEDFLKLPSVRASFQGSIQASYIGGKSIFYSVPVYRDNGVDGVLV GVRSKENMQSLISSKSFSGQMLSCIVDSQGQVIISPTDLKPFLQLGDIFKQEKNEKIITD IQKMQRDMTENRSGVLKFTATTKEELLLSYNALNINDWFLLTIIPADIISTGTNQYILQS FIIIGATILIFSLFLFAVFRFYNAHKRQLELIAYRDPLTGGLNNAAFQARYRELSKNMTP CTYSIVLLNVRGFKMVNEQFGIRIGNQILSYIYKVLKQHLREEQDEFAARGESDHFFLCL KEYSPGAIQYRLNDMIHDINNFHDTELPDLSLSFKQGACLADDPYQEITLLQDHARIALQ SCGTDLCHSCCFYDESLIRTLKIEQELNALFEDSIENRYFQIYLQPKIGLRSGRLEGAEA LVRWNHPQKGIIYPSDFIPLFENNGKICRLDLYVFTEVCAAIDRWRQEGLPLIPVSVNLS RQHFRNPNFLDTFAGIASSFKIPDKILEFELTESIFFDNQQIKTVRETIQEMHRMGFLCS LDDFGSGFSSLGLLKEFDVDTLKLDRSFFLNMSGQKAKDVISCLIDLSRHLKVKTVAEGI ESTEQVDFLRSMGCDMIQGYVFSAPIPLNEFEKRYLQDQP >gi|157101644|gb|DS480680.1| GENE 213 
247497 - 248888 854 463 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 3 456 5 436 456 333 39 5e-90 MMNVLEQVHQMVWGPWTLVIFLGVGIFFTLKCHFFQIRGFLFWWSRTAGRLISGEASRKE DGTGKGKGISQFQSVCTALAATVGTGNIAGVATALTAGGPGALFWMWVSALIGMITAYAE TMLGIRYRYRDKDGHYICGPFIYMERGLGLRGMGVLYSFLCLMSSLGMGSMVQANSAIST MEYTWHVPALAGGLIFTGLLVLVTWGGIRRIGNVAEKLVPAASGIYIAFSMIVILSCLEQ IPGALASMVTSAFRPEAAAGGTAGYVISRSVRYGISRGVFSNEAGLGTLAVLHGPTEGTT PHEQGMWAMFEVFFDTVVLCTLTALVILCMTGSSPETIPYEGAALAAWCFSRRLGIIGEY LVSGSMVVFAFATIMAWFYLGRQAAAYFIGGCGFGENAQHMFAAKVYPLLFIAAVFLGSE ARLGLVWLLSDIWNGLMAFPNLTALLFLSREVTLPEGFLRRRD >gi|157101644|gb|DS480680.1| GENE 214 249060 - 250043 677 327 aa, chain + ## HITS:1 COG:no KEGG:Closa_0500 NR:ns ## KEGG: Closa_0500 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 322 1 322 326 453 68.0 1e-126 MPGFTTHYVLGMKAYNDMPQNNLKFIIAKYRWLYQLGLQGPDMFFYNLPILRHRDHRNVG SYMHEHHVNYFFRCCFMQLSRIGSRQQREEGLAYMCGFICHYIGDSICHPYVYGRIEYDV NHPGSYYHGLHAKLENDIDALLLQKYKRKKPSQFNQAATICLNGMETQFISQFLSDCINE AFYPLSYKNRYQVTAPMIHRSILALRFGCRTLSDPNSRKKDRIAFFESLFLKNPVASSKL VTDTVENPAWSLNLHHETWCNPWDKSIASKSSFPDLFRQSLGKLSTIFYMINSMLETGQP LKLNALENLLSELGNYSYHSGLPCADD >gi|157101644|gb|DS480680.1| GENE 215 250254 - 251354 1222 366 aa, chain - ## HITS:1 COG:lin0944_1 KEGG:ns NR:ns ## COG: lin0944_1 COG2333 # Protein_GI_number: 16800013 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # Organism: Listeria innocua # 115 318 51 236 289 60 28.0 7e-09 MRIDRYWKQAAAGLLAIMTLTMTGGNLDSLAMTNSQPLKAQGQMLSSPDKIIIRDEGETA QSSQAAAASQPVIQAKAQIAAFGEVTTPKTDSTSLFGAGRLTMLANHDTAAQLLSVIIET GEGGLIVVDGGWTNNTDYLLNQIKQKGGHVQAWLLTHPDSDHVGALADILYKHNGEITID GIYYSFAEDSWYAEKDPEVAKMVAYIKGAFGLVPQNILHGDIVSGQVIQAGPARIQVLNQ AYKMNNDFVNNSSVAYMVSLNGTNTVFLGDLAKSGGEQLMADHDLSALKCDIVQVAHHGQ NGVGYEVYKALRPSVALWPTPQWLWDNDNGGGSGSGIWLTQETKNWMVRLGVNANYCIKD GDQVIE 
>gi|157101644|gb|DS480680.1| GENE 216 251379 - 251927 590 182 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937603|ref|ZP_02084964.1| ## NR: gi|160937603|ref|ZP_02084964.1| hypothetical protein CLOBOL_02494 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02494 [Clostridium bolteae ATCC BAA-613] # 19 182 19 182 182 317 100.0 3e-85 MRKLKKLSLILLASAALLCGCTQSNIVSPSNGVVQGDGQPDLEVDKNINIDWQEVREDLR DQFLEPYGVFADYVMDLDAKYDAGSNRVVVLLPVSHKTTGDIAVVYGQEVLKAVGTYIAE QNFYYEAPDPEDTDSTYYGSYFDEHDVLVQVFPYDKEGDESAYLVNDVMKAGEQRELTAQ IQ >gi|157101644|gb|DS480680.1| GENE 217 252115 - 253674 1458 519 aa, chain + ## HITS:1 COG:TP0106 KEGG:ns NR:ns ## COG: TP0106 COG1292 # Protein_GI_number: 15639100 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Choline-glycine betaine transporter # Organism: Treponema pallidum # 14 503 3 503 510 468 50.0 1e-131 MKQDKNRSIDTPLKNRLDWVTTIIPFVTILALCSLFMIFPDASSQVLGSVRFFLGDQFGG YYLLLGLGALICSLYMAFSRYGKIKLGNLEKPQYSSFQWGSMMFTAGLAADILFYSLCEW MLYAGEPHITDMGAMQDWASTYPLFHWGPIPWSFYIILAVSFGFMLHVRKRNKQKYSEAC RPLLGRRVDGVLGKLIDLTAVFALLAGTATTFSLATPLLSMALSRVLGIPENNFLTIGIL VIICVVYTMAVYFGMQGIAKLAASCVYLFFGLLAYFLLCGGETRYIIETGITAVGNMVQN FVGMATWTDAMRTSSFPQNWTIFYWAYWMVWCVATPFFIGTISKGRTIRQTVLGGYFFGL SGTFTSFIILGNYGLGLQMHGKLDLMGVYAKTGDLYQAIISIFETMPAAKAGLVLLAITM IAFYATSFDALTMVASSYSYKSLGAEEEPHKQVKLFWAILLMLLPIALIFAENSMANLQT VSIIAAFPIGFIIVMIVWSFFKDAGHYLDKENSETRNME >gi|157101644|gb|DS480680.1| GENE 218 253797 - 254051 216 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937605|ref|ZP_02084966.1| ## NR: gi|160937605|ref|ZP_02084966.1| hypothetical protein CLOBOL_02496 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02496 [Clostridium bolteae ATCC BAA-613] # 1 84 1 84 84 160 100.0 4e-38 MLTKQVDSRERVFRPIPISYLIQTANQFDCDIFVTSDKYTANVKQYDEVKNLRMTGFLTF YFKGSDELEAEDRIQKILWEDNEG >gi|157101644|gb|DS480680.1| GENE 219 254293 - 255492 1179 399 aa, chain - ## HITS:1 COG:no KEGG:Closa_0526 NR:ns ## KEGG: Closa_0526 # Name: not_defined # Def: hypothetical 
protein # Organism: C.saccharolyticum # Pathway: not_defined # 5 394 2 392 398 154 30.0 5e-36 MNMNRRKGFMALSLAAVMAVGGAGAALADEPGVVPGDTYTEAMARIQDSHMEYDELTDLI KNCYAPIKSAYSMIDTMEEDQGSIATAMRVVADDYMSQADQLADVLGSGNPAVMQVVGGA RMLRSSAGSMDRSIERSKSSRSVDRQVNMIVSGAETLMNQYEYYLAQRTVVAKALEIAQT AQAIQQTMQAQGLAVDADVLSAAAQLSSAKSQLESIDAGIDQIYKTLCYYTGWEKGADTV IGPVPAADPSVIGTLDLTSDKETAVNNNYSLISMRSGSGSGMSDFQVRTTKEMTQKANKM RTVDYSEDQLRSDMQTLYDTILEKKAAYDSASTAYQSAQLTWNAAQIQRQNGTLSQIQFM QQELAYLQAQSGFKCADLNLQQALRNYQWAVKGVSVSAG >gi|157101644|gb|DS480680.1| GENE 220 255525 - 256760 1410 411 aa, chain - ## HITS:1 COG:no KEGG:Closa_0525 NR:ns ## KEGG: Closa_0525 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 62 410 29 380 383 192 36.0 3e-47 MKNLRNRKYGYRAAAVLAAITMAGTMPLTSFAASGKRPNVSKPDDAHTIAEPPAPDINKD ISPEFAYSEDKWASLRDNVLEFGELKDLVHEYNPTVRSNRSTYKDQKGKDLNAAYDDYMD DIDAIWNNADSDDDVAWATARFQVGVLQQQADNNYQDADMEKIQYDQTEAGLVYQAQQLM VTYEQSRYNMENLQSARNLMQAQYEATAARQAAGMATQADVLSALKSVQDQDTAILTAQK SADNVHRSLCLMLGWAVDGQPEIRDVPEPDLNRIASMNPDADMETAIANNYDVKYFEKKA GNLTSQYLIDSNQAQIQDAKDKAAKSLRNQYNAVLTGRDSLNAAVMALDVASVNLNTATA KRAVGEITELEYQNVLNSYISAKNSVETDKLQLLLAMEAYDWNVKGLTTSN >gi|157101644|gb|DS480680.1| GENE 221 256796 - 257428 657 210 aa, chain - ## HITS:1 COG:no KEGG:Closa_0522 NR:ns ## KEGG: Closa_0522 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 204 1 200 207 110 34.0 5e-23 MPTQRFLKLKEEKKQAILEAAVHEFSRVPYSSASINQIIKEADISRGSFYTYFEDKDDLM RYMLRGFRDNCQERIFRTLKEQGGNPFGTALKLLEAVVDEDHGGLGYKMYRNMLSDLSVV DQNHLFGIKGFLLQDESYGEFVHKLYEGIDRERYPIDEETLSYLVDMTMLIIIRAVTLYY KNVVDREKLLEVTKKEMLILEYGVCRNTGN >gi|157101644|gb|DS480680.1| GENE 222 257701 - 258828 1184 375 aa, chain - ## HITS:1 COG:AGl2685 KEGG:ns NR:ns ## COG: AGl2685 COG1879 # Protein_GI_number: 15891450 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium 
tumefaciens strain C58 (Cereon) # 46 375 32 353 353 172 36.0 8e-43 MKKRTLSLMLAAAMVLGSLTGCSTGQKEESKETAAVTEAAKAGEDAAAAAAQEAGTQPED GEKKVYAFVSKTAADPIFLLMFDGYKAACDKLGVEAVYRGADQPTAEKEIEIITQLIAQN VDAITVIGADFNALQPILQQAMSQGIAVNSVDTAVNPDSRQVHVEQCSIDEVGRAQMQSA LAICGGPGSSGKIGILSANPESQLHADWCKAMLLEVEEHPEDYKNIEVLDIAYGDDLPDK STSEAQAMLQNNPDLKAIISPTTVGILAAAKVIQDQGSDCKVTGVGLPSEMAPYIENGIC YDAYLWNPLDQGALGAYSAHALVTGAATGKVGDIVDAGDLGKFTIEEYYDGGTQVLLGEP LRFDKDNIAQWKDKF >gi|157101644|gb|DS480680.1| GENE 223 258844 - 259869 947 341 aa, chain - ## HITS:1 COG:mll5703 KEGG:ns NR:ns ## COG: mll5703 COG1172 # Protein_GI_number: 13474746 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Mesorhizobium loti # 12 315 18 321 331 200 38.0 4e-51 MEGTAKTRSHSLKSVLFQWETMIFVIFIIINIININLSSNYLNFNNMMNNMVVFMDKALM VFTTMMVLLLGEIDISIASIMCLSGCCTGVAFEAGLPIVPAMMVGLLVGAACGFFNGILL VAFPDLNSTIVTLGNQILFRGIAYMILEDKPITSISSKMSFLAWSKVAGIPLILIVFFVE TLIFAFVIHRTKFGRSLYAIGSNAKASRFSGIKTNQIKVLVFTLSGLCAGFAGLFLISKL GSARASIAKGYEMEVIAMAILGGVSTAGGKGKVLGAVLGVFSIGFLRYGLGIVNVSSQIL MIIIGALLVVAVAIPNLKEVISESASGRWIRQRVIHNKQKY >gi|157101644|gb|DS480680.1| GENE 224 259872 - 260879 930 335 aa, chain - ## HITS:1 COG:mll5704 KEGG:ns NR:ns ## COG: mll5704 COG1172 # Protein_GI_number: 13474747 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Mesorhizobium loti # 19 320 8 308 325 192 42.0 7e-49 MKSISRDFDLKRFVSHNIRELTLFAILIVIAVLVQVRTGGRFLSSSNLNDLLRETAILMM VSVGMMMVILTGCIDLSLGSTMGLAGMICALTLRDHRETPIIVIVLIALGIGLLAGLLNG FIVAKLRIFPLIGTLGVCDMLRGLVYVFSKGAWVGQGDMTQEFMSISTGKVFGINNLVFI AILITVMGYIFLQYFRNGRYMYAVGNNELSAKISGINPVRTKFLAYVINGMIAGLAGMLW ICKFGNAQGESCSSYELNVIASVVLGGVFITGGSGKVGGVVLGVLLFGTLNNILPLIQVS SFWQMGIKGFVIIASVVINSITQQHMDKKALQGRD >gi|157101644|gb|DS480680.1| GENE 225 260885 - 262399 203 504 aa, chain - ## PROTEIN SUPPORTED ## NR: 
gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 258 474 11 221 312 82 28 1e-14 MDEYILELRNITKRFPGVVALNGVQFQLKQGEIHALMGENGAGKSTLIKVITGVHEPEEG DILINGRKVVFKNTNDSAANSIAAIYQHSTVYPYLSVTENIFIGHESLTRAGSIKWNEMH GRAKELLMSLGCDINPRTRMVNLSVAEQQIVEISKAVSSKARIVIMDEPTAALSKRECEE LYRITEKLKAEGTSIIFISHRMEDMYRLADRVTVFRDARYIGTWDVDKISKDELISAMVG REVTQLYPKQAVPIGETALEVKGLTKKGYFKDISFDVKKGEIFGLTGLVGAGRSEVCQGI CGLLKPDSGTILINGEECSFTHPSQALKKGVGYLPEDRQQQGLILSWELYRNMTLSTLDK YNRLIGIDTGRERQAGRELCEKLQIKAKSIFSRADSLSGGNQQKVVFAKLLNSDVNILLL DEPTKGVDVGAKAQIYSIMSDLAAQGYAIILVSSEMPEVMSMADRIGVMHEGHLAAIFDA SHVTQEEILAAAMTVKDEDLKEGA >gi|157101644|gb|DS480680.1| GENE 226 262450 - 262602 71 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937613|ref|ZP_02084974.1| ## NR: gi|160937613|ref|ZP_02084974.1| hypothetical protein CLOBOL_02504 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02504 [Clostridium bolteae ATCC BAA-613] # 1 50 1 50 50 84 100.0 3e-15 MCKKYALYDAGCRGGVKKEQKESEKYIKVQKSATIVLDFTFTKHSACLML >gi|157101644|gb|DS480680.1| GENE 227 262632 - 264125 1004 497 aa, chain - ## HITS:1 COG:BH1958 KEGG:ns NR:ns ## COG: BH1958 COG2207 # Protein_GI_number: 15614521 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 352 497 361 504 516 89 28.0 1e-17 MINVLVVEDEPPIQRMICKTISEMDREFSVRYTAFNGIQAQRILEAEEVDVVFTDIMMIG GDGITLLQYLHEYRPEIQTVVLSGYDKFEYAKQAYKNGVTDYLLKPVNKTELRDILEVLK KQYHLRMGNRRRECFKNLLDGGELTQKSCFYDELQASSWYLLLIRMRSRGCVKGGLNPES QKNIESGLVKMFGNEDTVFQTSDGAGEYVVAVKDSITDVEAKALELLNTVGQSDGVIAIA GLPEPVPVQDFRQYVLELRIQIMTESIYGKSCYSCGGRFCPPHDDTWEKDVKTEAFSVMV KALKAGDTRGACSQLDAFMREFEEKNKTCMAIENFMMEFLERGKGSSHSVRASVWNAVYE TFSYGELKTRVFQLVERMNGQEEDLAEDGSSEGAMPQMVVMIEQYLVDNYQRNIRAGELS MEFGFVSEYISRIFKKYVGLSPSRYLTKIRMEKACQLIKNHPEIQVKEVADQVGYKDIHY FSKVFRKEMGVWPSEYK >gi|157101644|gb|DS480680.1| GENE 228 264122 - 265978 1161 618 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # 
Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 62 595 45 570 602 130 24.0 6e-30 MKSGTNWGYDEQGAIWEETVKRPFVFSNFQKKILQSYLIIIVGGLVLFYTLVAVMVRESS LKTAERNQISMCIKVSQQLESFMVEMRNVGFQVMMNSRLIQDFGTFSDDKANDNYYGDHI VDHINAVSALADINGIQRIATRISVFNRYGDYVSSGVMPQSKAYADNVLRSHMDLDEIIS RLVLNDGHIYVEGPHPDYWNSKVGYEMISLYCPLGNKIQDDMYGIIEIQQDFQKLVKTLL PGLDQDMAVLLFDSSGRCIFNSMESICPPEYYDFYYSQASKNPDSRYGFCRIKVGNRAQE QYLSAGISETSGWMAVMVRNKSSIVNVIQDIQKVIIMAIFVFMVLSAYMAYRISLKMTRP INDMIVSAQSISWSNLDMKTLEGNDENEIMALNNTFKETLKRLSKSMQLELGARLNALQS QMNPHFLYNTLSVISASADQQEKVERMCSRLSDMLRYSTVYEEESNSTLEDEVRHTENYL ELMKDRYEENLIYNIEEAGELERVKVPRVILQPIVENCFKHGFGENGFPWIIHVAVTASH GHWRIHVRNNGRPFQDRDLTELNEKVEGFLKGEQKKISGIGLTNTIIRLRLLYEEQVEYE IYTGNDGYTYVVLKGNYK >gi|157101644|gb|DS480680.1| GENE 229 265950 - 266780 648 276 aa, chain - ## HITS:1 COG:ECs4223 KEGG:ns NR:ns ## COG: ECs4223 COG1082 # Protein_GI_number: 15833477 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Escherichia coli O157:H7 # 1 259 1 259 275 182 34.0 5e-46 MKFGLISLDFKRFSLEYCFKMARRYGLDGVELWGGRPHGYFTDMTRERIQEIDALKKKYK LEIPMYTPNVLNGPYNLCSLEERTIHETIEFFQRSIDIAYELECPRMLVVADHPGYEAPK ALVCRRFIEHMQYLCDYGAPKGVSLVIEPLTPMESPVITTADDCVDAIKRIGRDNVEAMM DVAPPTVANEPFSVYFDKLKDRMNYIHICNNDGMTDAHKRLDEGIIPIKDMFTVFANWKY NGYVTCELYSENYRDPELYLANTMRMVYEIRNELGI >gi|157101644|gb|DS480680.1| GENE 230 266777 - 268438 909 553 aa, chain - ## HITS:1 COG:BS_araB KEGG:ns NR:ns ## COG: BS_araB COG1069 # Protein_GI_number: 16079931 # Func_class: C Energy production and conversion # Function: Ribulose kinase # Organism: Bacillus subtilis # 1 552 1 555 560 510 44.0 1e-144 MGFSIGVDFGSLSARAMAVDVSNGRILKESVYGYPHGIMKESLPTGRKLEPGTALQEPRD YLDAWQFLIQDMFKDKELRADQAVGIGIDFTQCTMMPVDREGIPLCMHTEFRDDPHSYVK LWMHHHAQKEADDITREAGLRQERFLKYYGSKISSELLFPKILEILRQSPDIYQAADQFV 
EGADWMTWQITGTRMRSKSIAAVAALWQEEEGYPSDDFLRALHPEMPDVKQKLRGKLVKP GTCIGGISKEMSDKTGLPAGTPVACGLGDSHSAFAGSGLCSEGAMLMVIGTSGCDILISR NQIPVEGFCGICPDSAIPGYYAYEAGQACMGDHFQWFMENCLPAACREEAAGRNMSVFQW MDEKAGRLKPGSSGVIALDWWNGCRSVLMDSDLGGCLFGMTLQTRPEEIYRALMEGIAFG KRMIIEQMEMAGVRCRQLYATGGVAQKNPLIMQIMADVLGREIRVPVIANGSCMGSAMFG AVAAGRKGGGYDTIEEAVEAMGPPVGKIYIPDQSASAAYDVLFQMYREMYLYMGNSSMLK RLAAMRGKEESTL >gi|157101644|gb|DS480680.1| GENE 231 268453 - 269091 587 212 aa, chain - ## HITS:1 COG:TM0283 KEGG:ns NR:ns ## COG: TM0283 COG0235 # Protein_GI_number: 15643052 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Thermotoga maritima # 1 200 43 240 254 133 34.0 3e-31 MFEKEKREILDAALEIKKCNLISLSGGNVSMRAGENLFLVTPSGMIYEEMTADDVVVIDR DCRVVEGTRKPSSDSPALVYMFEHMPHINALIHTHQPYATAAGFASDQIPEFLVTLIDAN GAAVNVAPFTPSSDIGMGIMAVKYAGEARAVILKHHGIMAYGSDMKEALYSAIYLEEAAK TFVLAKSMGAEIPLLDPEDIAKEKRGWLDYGQ >gi|157101644|gb|DS480680.1| GENE 232 269108 - 269953 477 281 aa, chain - ## HITS:1 COG:STM4387 KEGG:ns NR:ns ## COG: STM4387 COG3623 # Protein_GI_number: 16767633 # Func_class: G Carbohydrate transport and metabolism # Function: Putative L-xylulose-5-phosphate 3-epimerase # Organism: Salmonella typhimurium LT2 # 7 278 5 278 284 202 39.0 5e-52 MDMELIQRPFGIYEKAIKPQEWEKMFADASSAGYQNYEISLDESDARLARLHWDDKQYEE VRRAARNQNIRILSACFSGHRRFPLGSSSRELEERGIRMMREGIDFCQNLGVRMLQVAGY DAFYEPRSHETAARYRENILRCLEWAEQAGVMLAIEPVEVNLVKVSDTLRLVKEADSPWL QIYPDVANMKSLGIDPVTELPQGRGHIANVHVRDSLPDYFYGVPLGAGNMDFIGVFKALD AMEYRGPLTIEMWNETEENYLDIIRCARTFMMDKIRCARCS >gi|157101644|gb|DS480680.1| GENE 233 269913 - 270581 347 222 aa, chain - ## HITS:1 COG:Cgl1560 KEGG:ns NR:ns ## COG: Cgl1560 COG0036 # Protein_GI_number: 19552810 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Corynebacterium glutamicum # 3 209 8 217 219 137 35.0 2e-32 MKIIPSIASASQLDLKCQLDRIRDVPYIHIDVEDGNFLPNITFGMKTVKEMAAYSKAELD 
VHLLTTNPIKYIQELSKSGISAVCGHMEALPYPLEFLHEVRQAGMKAGLALNLSMPIEQV MSYIDSFDYLLLMTSEPDGCGQRFRSCALERIRRARACIPDEVMIYADGGIGRDELGKVA EAGADCVVMGRAVWGAEDPAKAWKEMETLNGYGTDSKAFRNI >gi|157101644|gb|DS480680.1| GENE 234 270735 - 271550 594 271 aa, chain + ## HITS:1 COG:sgcQ KEGG:ns NR:ns ## COG: sgcQ COG0434 # Protein_GI_number: 16132124 # Func_class: R General function prediction only # Function: Predicted TIM-barrel enzyme # Organism: Escherichia coli K12 # 3 268 2 267 268 342 61.0 5e-94 MSNWMQDLFHVEKPIIAMCHLQPLPGDPYYDADGGMEKVMAAAKADLLALQEGGVDGIMF SNEFSLPYLTKVEPVTIAAMARVIGELKPLITVPYGVNCLWDPIASIDLAVAVDGKFIRE IISGVYASDFGLWNTNVGKTVRHKMAVGAKDLKLLFNIVPEAARYLADRDIKDIARSTEF NNRPDALCVSGLTAGSETDTQILGQVKASVKHTPILCNTGCNVNNITRQLSVADGAVCAT TFKYDGVFENAVDVKRVKEFMDKVKEYRATL >gi|157101644|gb|DS480680.1| GENE 235 271694 - 273202 777 502 aa, chain + ## HITS:1 COG:YPO3334 KEGG:ns NR:ns ## COG: YPO3334 COG1070 # Protein_GI_number: 16123487 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Yersinia pestis # 6 497 4 494 498 306 34.0 7e-83 MKKHAYFIGIDNGGSAIKAAVFNENGRELSVSQVQIPMDHPNPGWTERDPHAVWRGNCQV IRDAIGKAQICAQDIKGVSVCGYGGGLGLLDSANSPIAPFIVSTDTRAQSLLGEYSKNGI NSRLYQLTLQNLWAGQQAMLLPWLKKHQPDLLKQACHMVCIKDYIRYCLTGMLATDYTDA SCTNLFNIRDKQFDPEIFELLDIREYFRLACVPILQSFETGGFVTSQAAIQCGLPEGIPV AAGLYDVDSCCLASGILDSSCLCLITGTWSINEYLTDDLEECIGKNGNTIAFLPDYYIIE ESSPTSASNFDWYADHFLSRMYPGLERRELYDYCNKSILESGLKQTDILFMPYLFASNSV PDAKGAFLNLDSSHSPMDILLAIYEGILFSSCYHIEKLTGTIHPQKQIRLSGGITNSPVW TQILADILQLPIQVMEGKERGALGAAICASVACEYYPDFHSAVKKMCHLSRTYIPDSSKA VFYKEKYEKFKKAIQALEVFYK >gi|157101644|gb|DS480680.1| GENE 236 273221 - 273982 718 253 aa, chain + ## HITS:1 COG:lin2465 KEGG:ns NR:ns ## COG: lin2465 COG1349 # Protein_GI_number: 16801527 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Listeria innocua # 1 250 1 249 252 140 36.0 2e-33 
MLKAERLQAIVDITNKEKIVTSEELMRQLNISKATARRDIEELSCQGLIQKTRGGAMSAN HTSLEPSFVKKKESNADEKARIAKAAREHISPGEKIILDSGTTVLELAKLVNDIPDLTVV TNDLHIASEVSIFPNATLLMVGGVVRKGFNSTYGYFAEKMLGSISVNKTFLSIDAVDMEQ GLLSYITDDTNIKKQYIKSGKEVILLCDHTKFQASAFINISQLDCIHRIIVGKELDGEYV ERLESMGIDVELV >gi|157101644|gb|DS480680.1| GENE 237 274062 - 274997 927 311 aa, chain - ## HITS:1 COG:BS_ykcC KEGG:ns NR:ns ## COG: BS_ykcC COG0463 # Protein_GI_number: 16078354 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 3 304 8 311 323 243 40.0 3e-64 MLSVVVSVYNEEKALEEFYRETTHVLEHISWDYELLFVNDGSRDGSQDILDRMAASDSRV RVISFSRNFGHEAAMIAGLDYSRGDGIICMDADLQHPPECIPDILAKFEEGYQVINMVRT KNRTAGLVKNITSSGFYWLINRISDVHFEANASDFFAVSRHVAQVLKNSYREKVRFLRGY VQNVGFKKTAIEYEARARVAGESKYSIKKLFVFSINTILCFSNMPLKLGIYAGIFSALLG LAVMIYTLCTRKGAPSGYATIVVLICFMFAMMFVIIGIIGEYIAILFTELKDRPVYIVDR TENIVQGENAD >gi|157101644|gb|DS480680.1| GENE 238 275108 - 275950 924 280 aa, chain - ## HITS:1 COG:SP1273 KEGG:ns NR:ns ## COG: SP1273 COG3475 # Protein_GI_number: 15901133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: LPS biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 2 280 1 267 267 95 27.0 9e-20 MMKTYDMSRVHEANLTILKEIDRICRKYNIRYMLDAGTLIGAVRHKGFIPWDDDADVAFT RNQYEAFMKVAPRELPDTMELLEPDSYRGGKAFYDFTPRIIYKKSKCHEDSPEMQYYEGK LNHIWVDLFILDKLPASRAGEEFTRFLHKAVYGMAMGHRYKLDFGKYSLIHKIFVGGLAA AGRLVPMKLIFAMQKAAALKDRHSKGSLRYYSNYQPDYLYVTVDKSCCDQVEDSDFEDTR LMIPKGWHQILTEVYGNYMEMPPEDKRVPTHSSQQIRIFD >gi|157101644|gb|DS480680.1| GENE 239 276020 - 277378 1285 452 aa, chain - ## HITS:1 COG:STM2090 KEGG:ns NR:ns ## COG: STM2090 COG0399 # Protein_GI_number: 16765420 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 7 447 2 431 437 496 55.0 1e-140 
MAFEHKTETEARQQILDMVAEYCDTYHNKKGPFQEGDRIPYASRVYDHKEMVNLVDSSLE FWLTSGRYTDEFEKKLADYLGVRFCSLVNSGSSANLVAFMTLTSPLLGERRVKRGDEIIT VACGFPTTVAPAIQFGAVPVFVDVTIPQYNIDAEKLEEALSEKTKAVMIAHTLGNPFDLR AVKAFCEKHGLWLVEDNCDALGTQYTMEEDGKEVTRFTGTIGDIGTSSFYPPHHMTMGEG GAVYTDNPLLNKIIRSFRDWGRDCVCPSGRDNMCGHRFDRQFGELPLGYDHKYVYSHLGY NLKATDMQAAIGCAQLDKFPSFVERRRHNFDRLKAGLEGTEKQLILPEACPHSRPSWFGF LITCREGVERNKVVQYVEKKGMQTRMLFAGNLTKHPCFDEMRAAGQGYRIVGSLDNTDRI MADTFWVGVYPGMTDEMIDYMARTIKEAVECA >gi|157101644|gb|DS480680.1| GENE 240 277394 - 278452 926 352 aa, chain - ## HITS:1 COG:STM2091 KEGG:ns NR:ns ## COG: STM2091 COG0451 # Protein_GI_number: 16765421 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Salmonella typhimurium LT2 # 8 352 5 355 359 379 49.0 1e-105 MKMKELMDFYRDKRVLVTGHTGFKGAWLCRILTLAGAQVTGYSLEPPTDPSLFEIAGLED VMDSVIGDIRNLQGLRDVFERVRPEVVFHLAAQPIVRDSYKDPVYTYETNVMGTVNVMEC VRLTDSVRSFLNVTTDKVYENREWEYGYRECDPLDGYDPYSNSKSCSELVTHSYQKSFFG DGRCAVSTSRAGNVIGGGDFANDRIIPDCIRAAATGKDIIVRNPHSTRPYQLVLEPLAVY LTIAMRQYEDGRYQGYYNVGPDDKDCVTTGCLVDMFCRLWGGDVKWVNQYDGGPHEANFL KLDCSRIKSVFGWRPRYGVEEAVEKTVEWSRAYLDGEDMLEVMDRQITEFFS >gi|157101644|gb|DS480680.1| GENE 241 278778 - 279593 776 271 aa, chain - ## HITS:1 COG:slr0983 KEGG:ns NR:ns ## COG: slr0983 COG1208 # Protein_GI_number: 16329493 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Synechocystis # 16 271 1 256 256 324 58.0 9e-89 MRLEYNVPEIQWRTEMKVVILAGGLGTRISEESHLKPKPMIEIGGKPILWHIMKYYSEFG FHEFIICLGYKQYVVKEFFADYFLHTSDVTFDLANNKMEVHNNYSEPWKVTLVDTGLNTM TGGRVKRIQPFVGDEPFMLTYGDGVSTVDLDELVRFHKSHGRTATITTVNIGQMKGVLDI DENDTVLSFREKEDNDGALINGGFMVMNPEIFNYLEGDSTVFEQGPMQKLAAEGQLKSFY HSGFWQCMDTQREMNKLEGLWQSGRAPWKIW 
>gi|157101644|gb|DS480680.1| GENE 242 279675 - 281108 1207 477 aa, chain - ## HITS:1 COG:no KEGG:Closa_3794 NR:ns ## KEGG: Closa_3794 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 19 476 15 454 461 449 52.0 1e-124 MKAERNRKIIYWLLPLAGILFCLWYVKSATCDVVYSDYIRLVNSYLPDVWNPEKFFVPDV LTRIPINYLCRIVNVELFGFTITLERVLGVVSLGLAGWVFAAYCRSRKIGCLWFALLMAV MFSLNKWEMLTNGSGWSHFFAFACFYYHELVLDRVWAGEEKRRDRLKLLVLPWLIILGTA GPYCAVYAATLLLSYGFCMVMDRRKRKDCPGRRGQGSWDMRYLAYMACALLPLLLYMLSN SMAVEEHAGATGRSLGTILAENPTFPIRFLLKSFAGVLVGGEELERFMENGLLSNRMCYV LGLFVVCGYLMALWINFRFRLYERTIMPLMLLAGGGMNHVIIFISRYIFEKESYALSSRY ALQFQVGILGIILTFALAWQMKERTDRGYRWGMALFCLAILMGNGYTTYREIQKAPSREE SFERKARLALEVPGMSREELKDRGEELETEFEYRKGLDKIQNAFRILEENKLNVFRE >gi|157101644|gb|DS480680.1| GENE 243 281159 - 283021 1839 620 aa, chain - ## HITS:1 COG:no KEGG:Closa_3795 NR:ns ## KEGG: Closa_3795 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 22 618 10 605 607 639 53.0 0 MEAKESTRTRIGFSRMEKKYEWKDHVIFMGILLAMAVWVNRGIEIKGLYMDDLYFWSCYG EQSFLQYVFPVGSTRFRFLYYLAAWLEMAVVRNHVGWFMPFNILLNAAVSWSIYFIGRRL SRAKGIGFLCGSMFLISRMAYYQIGQVLGLMETMALWMAVMILWYLFRYLNEKDSHVYFY LSCVLYFGVCFVHERYMALFPLLLLVLLFKRSRKLWEWGAGLLSFLLVQIIRAFTIGSIL PAGTGGTQVADTFSLADTFKYALSQIAYVFGINAGPEHLNGCPWGQSPLAIKGLVLLADA CIAVIVLAFVVKLVREKRKDYRVQMLKNSTLFILFTGLCIASSSVTIRVEMRWVYVSLVS ALLFLAYMYGELTEGVKPELYLKRLWPWGAAFACYVVFMLPVELYYRADYPKLYLWPNQL RYNSLAEETYGRYGDGIFGKTIYIIGDSYEMSDFTARTFFKVFDRERAAEGTRVEFIENI RDIGLVDDRMLVLREDPDHEGFQDITEIVREMKLQVDYGYYSDGWMDEHASLTVMAGETG VIDLEVVYPGIMSGGEAVRITKDKEEPRYLPVRSNVVNTTIQAEPWQTVHLTFEYNFFMQ NAQEQRGEDRLAAIVHLTVR >gi|157101644|gb|DS480680.1| GENE 244 283044 - 283553 423 169 aa, chain - ## HITS:1 COG:no KEGG:Closa_3793 NR:ns ## KEGG: Closa_3793 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 167 581 748 753 137 41.0 2e-31 MTCTTGWAGVPGFTPVSLRHMGYYDHRREMLDRRAGEGSLTLSCMFTPRSRVLAFGRHPR 
VLDLPCSVQSYYDVTGSGGNVYLVKRLAYFEEFLRYAGTEYFFVEAGYLAGQDRALQMIE DMIAEGSLSDIHYEWGNMTARVTLDAGPPDDSQQALEEFRENYSMTELD >gi|157101644|gb|DS480680.1| GENE 245 283480 - 285384 1228 634 aa, chain - ## HITS:1 COG:no KEGG:Closa_3793 NR:ns ## KEGG: Closa_3793 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 47 582 47 552 753 277 37.0 1e-72 METGKNDKILAIIIGAAGLAVLAVTVFGVLLWGPLEPVLRNTRTLLMMAEVVLVFFWNLF CLAGGRGKKPGTGSHRGVRALGLAAGIVLFTWCHRIFLPLAVSGIYMAVLILAGRQLLRL FLEERQLSAPIPFLQRTSMCLVLGSALWMVLVCLVSLTGHGGLMLWRGLAAVMALMAVVL EILWGRRNKQVLPGAWLKAVKNSADKGRSESMAQAALLAFIITMVLIQAGRMNIELDYDS LHYGLRSAYVLDNGKGIYENLGMINLVYTYSKGLEVLVLPLSGTPTYGFVLAFSLWCTVG ILLLAADIVGRYCGQVKGILAAAILAAIPGIMNMAATAKSDNITLLHQLIIYDFICFALW DGKQGKKSSSAKNIPWLSMAVSTYLLTLVYKPTALVFSTALGGVALVCLFLTRRLRVGDR RGFCLLLIPGAAAAGLWYRTWLLTGVPVTSIFAGFFEKAGCRVKYPYTFNHVIGDPSALT AGEKCSRLWSRVWGILFAPVTEDMAHVIIAWGTGIVTVFLVIWLAVSWKAGRRGLRHPLN LFDCILIPVLALGSLASIYTLYQVDGNYFILFYAVLVISALRMGWGQNWGPAVRRSLSLV LVFIFCSEYSGDLYHWLGRSAGLYACEPASYGLL >gi|157101644|gb|DS480680.1| GENE 246 285450 - 286235 729 261 aa, chain - ## HITS:1 COG:no KEGG:Closa_0493 NR:ns ## KEGG: Closa_0493 # Name: not_defined # Def: glycosyl transferase family 2 # Organism: C.saccharolyticum # Pathway: not_defined # 5 261 4 260 260 299 56.0 8e-80 MRYKHVFAVCAYKDSPYLEQCIRSLKAQTVPSHIIICTSTPSSYIDRLAWKYGLQVCVRR GESGIKDDWNFAYSMAEGELVTIAHQDDMYHRDYSARLLAAHRRYPDMTVFTTDYVIVKQ GALITGDAMLWIKRILRTPLRFPQMNDRAWVKKLAFVLGNPICCPATTYHKAVLGEPFVR SEYSFALDWDNLVRLAEEPGRFICDERPLLYYRVHEGATTKACIRDNRRFAEEREMFCRF WPEPVADIIMGFYKKAYGEYE >gi|157101644|gb|DS480680.1| GENE 247 286225 - 286866 797 213 aa, chain - ## HITS:1 COG:Cj1149c KEGG:ns NR:ns ## COG: Cj1149c COG0279 # Protein_GI_number: 15792473 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Campylobacter jejuni # 18 210 20 186 186 124 37.0 1e-28 MKEMEYLEELTSRYPVLEAVKGDVLAAYEILRDCYAGGGKLMIAGNGGSCADSEHIVGEL 
MKGFVKRRPVSNEFAAALQEADEKRGEELASCLQGGLPAIALTGHAGLATAFLNDVNGEM IYAQQLYGYGKKGDVFIGISTSGNAENVMYAVAVAKASGIKTIGLTGKNGGKMAGACDCS IVVPSDETFKIQELHLPIYHTLCLMLEEHFYEV >gi|157101644|gb|DS480680.1| GENE 248 286863 - 287450 514 195 aa, chain - ## HITS:1 COG:CAC3053 KEGG:ns NR:ns ## COG: CAC3053 COG0241 # Protein_GI_number: 15896304 # Func_class: E Amino acid transport and metabolism # Function: Histidinol phosphatase and related phosphatases # Organism: Clostridium acetobutylicum # 1 156 1 156 181 164 48.0 7e-41 MDRVVFLDRDGTLNEEVHYLHRTSDLKLLPGVPEALQMLKEAGYRLVVVTNQAGVARGYY GEEDVKNLHVYMNHILECMGASIDAFYYCPHHPEHGIGPYKKECSCRKPGTGMFEEAAQR FEVDKAHSFMIGDKLLDVEAGNNYGLTTILVGTGYGSGIRSQEEKEGIPPVYDFYAEDLV KAAQWIIEKGQADVR >gi|157101644|gb|DS480680.1| GENE 249 287508 - 288578 1206 356 aa, chain - ## HITS:1 COG:CAC3055 KEGG:ns NR:ns ## COG: CAC3055 COG2605 # Protein_GI_number: 15896306 # Func_class: R General function prediction only # Function: Predicted kinase related to galactokinase and mevalonate kinase # Organism: Clostridium acetobutylicum # 1 356 1 357 364 486 69.0 1e-137 MVIRGRAPLRVSFGGGGTDVAPFCEEQGGAIIGSTINKYAYCSIVPREDDQIVVHSLDFD MTVKYNTRENYVYDGKLDLVTAALNAMGIDQGCEVYLQCDAPPGSGLGTSSTVMVAMLTA MARWKGVMMDAYALADLAYQVERLDLKIDGGYQDQYAATFGGFNFIEFHGRNNVVVNPLR IKKDIIHELQYNLLLCYTGKIHVSANIIKDQVQNYEKKDAFEAMCEVKALAYALKDELLK GNLHSFGKLLDYGWQSKKRMSSKITNPQIDQLYDEALKAGALGGKLLGAGGGGYLLMYCP YNVRHKVAARMEQAGGQLADWNFELRGAQSWVADESRWQYDQVKVHMPNGEYRFNI >gi|157101644|gb|DS480680.1| GENE 250 288613 - 289320 796 235 aa, chain - ## HITS:1 COG:CAC3056 KEGG:ns NR:ns ## COG: CAC3056 COG1208 # Protein_GI_number: 15896307 # Func_class: M Cell wall/membrane/envelope biogenesis; J Translation, ribosomal structure and biogenesis # Function: Nucleoside-diphosphate-sugar pyrophosphorylase involved in lipopolysaccharide biosynthesis/translation initiation factor 2B, gamma/epsilon subunits (eIF-2Bgamma/eIF-2Bepsilon) # Organism: Clostridium acetobutylicum # 1 232 1 231 234 205 46.0 5e-53 
MQAILLAGGLGTRLRSVVSDRPKPMALIEGKPFMEYVTRELVRHGITDIVFAVGYKGTMV EEHFGDGAAFGFHAGYAYEETLLGTAGAIKNAGRFITEDRFYVLNADTFYQIDYTRLLRQ QDSQNLDMALVLRRVPDVSRYGQAILDRDGFLTGFNEKTEDAREGTINGGIYLMKQQLLD EIPEGKVSLENDMIPKWLSEGRRLGGFVNDGYFIDIGIPEAYFQFQEDVRNKVVI >gi|157101644|gb|DS480680.1| GENE 251 289407 - 293351 4115 1314 aa, chain - ## HITS:1 COG:XF0885 KEGG:ns NR:ns ## COG: XF0885 COG0438 # Protein_GI_number: 15837487 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Xylella fastidiosa 9a5c # 948 1302 82 430 443 154 31.0 1e-36 MKDISYEILDLFRQYGSSPEGIRKALDSQSGPEVLHALSDIRENLLEWLDYTGCGEVLQI GSGYGVLTGLLASRCAHVAVMDEADENLAVNQERNKTYTNITYGYKAGAETGQPEPSFDL VVMAGLRKGQEIQDAAAFGASFLKQEGRLVIACENALGLRYLAGADHEEGFCTRKQAEEA CREAGLARADFYYPVPDHRMPISLYSDRYLPGKGDITSPSVSYDGPRYACLNEGEVYDRL IEEDDFGRFANSFLIIASKDGRQEKTIFAKYNRTRKEEFRIRTTMGDLDGERRVEKAALE SAGRYHIFSLKNKCRQLRQLNPRVRILEPEISADGSKVEFVFLEGITLAECLGRCIRDGR APVEEIKAALQFLYDVAEDQIVPFAVTPEFTRTFGPVPDIDDVSYKVSNIDGLFENLMIT EKDGDEKLYCLDYEWVFDFPVPARFVQYRNLAYFYYKYQGLMSYAGLEEFLQEFGIGSEM ASVYASMEDAFQSYVHGDAVQGYLVNYQQKVTTLSELKETDQALSQAKDRISQLQAEVEE KNLQIRKEQEVQRLTNNHVANLEVMIKDLRHEIDELGKLATYLNGHEALVFKARRKLGAQ VNKAFPRGTRKRKILNYCFNTVKHPVRYGRLYATKEGRNQIEGDFKIGEDYLKYGRLVFP SVPGKGGYVDSGSGETWDGPMVSIVIPCYNQVHYTYACLQSILEFTKDVTYEVIIADDVS TDATAELSRYAEGLVICRNETNQGFLRNCNQAAKAARGKYIMFLNNDTKVTEGWLSSLVN LIESDSTIGMVGSKLVYPDGRLQEAGGIIWSDGSGWNYGRLDDPDKAEYNYVKDVDYISG AAILLSTALWKQIGGFDQRFAPAYCEDSDLAFEVRKAGYRVVYQPLSKVIHFEGVSNGTD VNGTGLKRYQVENSEKLKEKWAEEFKKQCVNTGNPNPFRARERSQGKKIILVVDHYVPTF DRDAGSKTTYQYLKMFLKKGYVVKFLGDNFLHEEPYTTTLLQMGIEVLYGDTYAAGIWDW LKTNGDEISFAYLNRPHIATKYVDFIKDYTNMKVIYYGHDLHFLRLGREYELTGDINIKR EADYWKSVELTMMHKAAVSYYPSYVEINAIHAIDGSIPAKAITAYVYDTFLDNIQDDFAE REGLLFVGGFAHPPNADAVLWFAREIFPLIRRELPDVKFYVAGSRVTDEIKALEEPGNGI IIKGFVSEEELAQLYGHCRVVVVPLRYGAGVKGKVVEAIYNGAPIVTTSTGAEGIPFADT VLEIEDQAELFAGKTVKLYQDNERLNQLCRRTQDYIKKHYSLDGAWKVVEGDFR >gi|157101644|gb|DS480680.1| GENE 252 293358 - 294866 1730 502 aa, chain - ## HITS:1 COG:PA1386 
KEGG:ns NR:ns ## COG: PA1386 COG1134 # Protein_GI_number: 15596583 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate transport system, ATPase component # Organism: Pseudomonas aeruginosa # 12 438 4 366 422 248 34.0 2e-65 MENVMSVKEQDNKYAIDVAEVTKVYRLYEKPIDRLKESMSISHKNYHRDFYALNQLSFRV RKGETVGIIGTNGSGKSTILKIITGVLTPTTGEVKVDGKISALLELGAGFNMDYTGIENI YMNGTMMGYTKKEMDAKLQDILEFAEIGDFVYQPVKTYSSGMFVRLAFALAINVDPEILI VDEALSVGDVFFQSKCYRRMEEIRQKGTTILMVTHDMGSIIKYCDRVVLLNKGEFIAEGP AGRVVDMYKKILAGQLDALKAELERERQQKESLTGEIGQEDAGSGVLTGAGTEETETAGS RAGAGTKSGGVEMSDFSGGMGLEGSRLMKDQLTINYDRTEYGDGRAEIVDLGLFDERGNI TNLLLKGEMFTIKERIHFNTEIQSPIFTYTIKDKKGTELTGTNTMFEGAEIQPVKRGDEY VVEFTQKMTLQGGEYLLSMSCTGFEQGEHVVYHRLYNVTNITVISNKNTVGVYDMESQVK AAKIEAGKIEAGNAGPTVNGGL >gi|157101644|gb|DS480680.1| GENE 253 295040 - 295498 484 152 aa, chain + ## HITS:1 COG:MA0746 KEGG:ns NR:ns ## COG: MA0746 COG0607 # Protein_GI_number: 20089631 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Methanosarcina acetivorans str.C2A # 38 144 26 142 151 68 35.0 4e-12 MKGTIFTAVLICSAILAACSPAAPREAMASTQSPSPSPAEEPADETSDEAYHKITAEEAK QMMDEGNATVVDVRTAEEYAAGHIPGSILIPVESIGDTKPVELPDTEAVLLVHCRTGIRS KRASDQLVELGYKHVYDFGGIVDWPYETVTEN >gi|157101644|gb|DS480680.1| GENE 254 295651 - 295968 452 105 aa, chain + ## HITS:1 COG:YGR209c KEGG:ns NR:ns ## COG: YGR209c COG0526 # Protein_GI_number: 6321648 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Saccharomyces cerevisiae # 10 102 10 101 104 105 51.0 3e-23 MELKHLNKDEFEQAVNAGDELVVVDFFATWCGPCKMLGPVIERAADKFSDVHFYKVDIDE EMDLAARFQVMSVPTLIYFKRGEVLSKSVGLVSPADIDKEISKLK >gi|157101644|gb|DS480680.1| GENE 255 296044 - 296616 672 190 aa, chain - ## HITS:1 COG:L181238 KEGG:ns NR:ns ## COG: L181238 COG0494 # Protein_GI_number: 
15673901 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Lactococcus lactis # 8 182 7 179 186 139 43.0 4e-33 MDRADLIKQIESYCPYNEQEEADRQMILKCLKEEEDIFDRGNCLAHMTASAWVVNPGRTK VLMVYHNIYKSWSWLGGHADNEQDLLQVAVREVKEESGLARVRPVSEDIYSLEVLTVDGH EKRGAYVGSHLHLNVTYLLEADENDALFSKADENSGAAWFGLEDSLKASSEPWIRQRIYA KLNGKLKKQG >gi|157101644|gb|DS480680.1| GENE 256 296598 - 296735 62 45 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937646|ref|ZP_02085007.1| ## NR: gi|160937646|ref|ZP_02085007.1| hypothetical protein CLOBOL_02537 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02537 [Clostridium bolteae ATCC BAA-613] # 1 45 12 56 56 69 100.0 6e-11 MYQRYHKTGKVSVTGVTQETVIKIQKELEHVCQGKEKYIKWTARI >gi|157101644|gb|DS480680.1| GENE 257 296760 - 297008 257 82 aa, chain + ## HITS:1 COG:no KEGG:Slip_1595 NR:ns ## KEGG: Slip_1595 # Name: not_defined # Def: 4Fe-4S ferredoxin iron-sulfur binding domain protein # Organism: S.lipocalidus # Pathway: not_defined # 5 70 266 331 354 63 43.0 4e-09 MSHLSKNLARVDAGTCVACGCCLKVCPRQALSVFKGSYAVVNQDACVGCGLCAKECPASV ISILPRSHQEIESAQTGGNRHE >gi|157101644|gb|DS480680.1| GENE 258 297001 - 297738 608 245 aa, chain + ## HITS:1 COG:no KEGG:CD0628 NR:ns ## KEGG: CD0628 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 6 195 1 190 193 280 71.0 5e-74 MNKPALKAKKHWYDYLWIASLLYLFLGFFNIMFAWLGLICFLVPLAISIATGSKLYCNRY CGRGQLFSILGKKLHLSPNREIPGWMKSKAFRYGFLVFFLVMFCQMLFNTYLVFRGTSGL KQAVTVLWTFRLPWQHAYHGTGLPLWTAQFAFGFYSVMLTSTILGLATMALFRPRSWCVY CPMGTMTQMICKAKAGACPGAASECGTSGCGSSGCSSSGCGASGCSSSGCSSSDCSSSCS DESLS >gi|157101644|gb|DS480680.1| GENE 259 297882 - 298565 684 227 aa, chain + ## HITS:1 COG:FN0217 KEGG:ns NR:ns ## COG: FN0217 COG0664 # Protein_GI_number: 19703562 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of 
cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 12 222 10 216 217 89 31.0 6e-18 MNHSRPNITDTLKKCYPFWEKLTDTQREQLKTSSYIRDYEPGSFVHSRSEECLGILIVLS GQLRTYIQSEEGREVTLFRLGPGEVCTLSAACMMQEITFDIFIESTAASTLLITGASGIH RLMEANIYAENYIYRQTAERFSDVMWSMGQLLFSSFDKRLAAYLADEYVKLGSDVISTTH EQIAKNLGTAREVVSRMLKYFEKEGLVALGRGTITIKNVSGLKKLLT >gi|157101644|gb|DS480680.1| GENE 260 298574 - 299530 248 318 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 [Roseobacter sp. AzwK-3b] # 23 273 1 252 305 100 29 9e-20 MPYVRLQRCTLSLPCAVQEKRNMTFRHLKIFVTVCETGSMTAAASQLFIAQPSISLAISE MEDYYGVKLFDRISRKLYLTENGRRALQYARHIIDLLDEMEQGVRDVDAMGRLKVGTSIT IGTYLLPGYIKKLKQQYPALKVEASIANSGTIEQQILDNDIDVGIIEGVAHSPFILSESF AGDRLVFICPAGHEFAGKTLEELSEVRNQDFILREKGSAGREIFDGILAAQELNVRPIWQ SASNQVILQGVKAGLGISILPYFLVRQDIREGELAEFRIKDLALNRKYSVIYHKNKFLPR SARTLMELCKEGNGADRP >gi|157101644|gb|DS480680.1| GENE 261 299667 - 300305 682 212 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 26 193 6 171 186 91 30.0 1e-18 MVVNIKSKKHASSRAPGHAGSRGSLLAKLFLSTLYLSAFTFGGGYVIVTLMKKKFVDEYH WIDEQEMLDLVAIAQSSPGAIAVNGAIVVGYKLAGMAGVLTAVIATVLPPFTILTLISFC YAAFRSNLFVGWMLNGMQAGVGAVIAQVVWEMGSGIVKDRQWISVLIMAAAFIANYVYNV NVVLIILLCAAIGVVRTLSSGHNKKGTEGGAL >gi|157101644|gb|DS480680.1| GENE 262 300302 - 300874 648 190 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 179 1 168 176 124 43.0 1e-28 MIYFQLFLSFLQIGAFSFGGGYAAMPLIQNQVVTLHGWLNLAEFTDLVTISQMTPGPIAI NAATFVGTRIAGTPGALAATIGCVLPSCILVTLLAKIYLKYRNLSLIQGVLKSLRPAVVA MIGAAGVSILVTAFWGLAGFTPDLGAINLRSVCIFTGAMILLIRFKMNPILVMVLSGLAE TACQLAMRVI Prediction of potential genes in microbial genomes 
Time: Thu Jun 30 17:39:36 2011 Seq name: gi|157101643|gb|DS480681.1| Clostridium bolteae ATCC BAA-613 Scfld_02_22 genomic scaffold, whole genome shotgun sequence Length of sequence - 107048 bp Number of predicted genes - 102, with homology - 98 Number of transcription units - 41, operones - 24 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 61 - 2478 1307 ## COG5545 Predicted P-loop ATPase and inactivated derivatives 2 1 Op 2 . - CDS 2493 - 4478 1179 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 3 1 Op 3 . - CDS 4485 - 5054 580 ## CD196_1448 hypothetical protein 4 1 Op 4 . - CDS 5072 - 6247 959 ## Dred_1188 hypothetical protein 5 1 Op 5 . - CDS 6247 - 6651 418 ## gi|160937658|ref|ZP_02085018.1| hypothetical protein CLOBOL_02548 6 1 Op 6 . - CDS 6648 - 6866 357 ## COG4443 Uncharacterized protein conserved in bacteria 7 1 Op 7 . - CDS 6893 - 7150 168 ## gi|160937660|ref|ZP_02085020.1| hypothetical protein CLOBOL_02550 8 1 Op 8 . - CDS 7166 - 7432 199 ## gi|160937661|ref|ZP_02085021.1| hypothetical protein CLOBOL_02551 9 1 Op 9 . - CDS 7432 - 7629 113 ## gi|160937662|ref|ZP_02085022.1| hypothetical protein CLOBOL_02552 10 1 Op 10 . - CDS 7716 - 7922 156 ## Cphy_2993 putative phage-related DNA binding protein - Prom 7960 - 8019 5.5 + Prom 8022 - 8081 6.0 11 2 Op 1 . + CDS 8104 - 8439 281 ## Cphy_0782 XRE family transcriptional regulator 12 2 Op 2 . + CDS 8445 - 9146 237 ## Cphy_0781 hypothetical protein + Prom 9317 - 9376 6.2 13 3 Tu 1 . + CDS 9495 - 10103 -33 ## gi|160937667|ref|ZP_02085027.1| hypothetical protein CLOBOL_02557 + Prom 10188 - 10247 5.0 14 4 Tu 1 . + CDS 10292 - 11497 444 ## COG0582 Integrase - Term 11489 - 11531 4.4 15 5 Tu 1 . - CDS 11572 - 12951 997 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase - Prom 12976 - 13035 3.3 16 6 Tu 1 . 
- CDS 13268 - 14212 1011 ## COG3481 Predicted HD-superfamily hydrolase - Prom 14241 - 14300 4.0 - Term 14373 - 14415 1.9 17 7 Tu 1 . - CDS 14549 - 15889 1390 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 15915 - 15974 9.3 + Prom 15957 - 16016 6.5 18 8 Op 1 . + CDS 16079 - 18517 2377 ## COG1193 Mismatch repair ATPase (MutS family) 19 8 Op 2 40/0.000 + CDS 18545 - 19255 905 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 20 8 Op 3 . + CDS 19252 - 20658 1594 ## COG0642 Signal transduction histidine kinase + Term 20738 - 20788 -0.8 + Prom 20715 - 20774 9.1 21 9 Tu 1 . + CDS 20823 - 21863 937 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 21977 - 22022 8.2 - Term 21962 - 22013 11.2 22 10 Op 1 . - CDS 22106 - 23050 639 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 23 10 Op 2 . - CDS 23085 - 24374 1202 ## Closa_0071 hypothetical protein - Prom 24408 - 24467 3.2 24 11 Op 1 . - CDS 24478 - 26064 1215 ## Closa_0070 Peptidoglycan-binding lysin domain protein 25 11 Op 2 . - CDS 26124 - 26723 646 ## Closa_0069 sporulation protein YyaC - Prom 26778 - 26837 6.3 - Term 26810 - 26853 7.1 26 12 Op 1 . - CDS 26965 - 27885 955 ## Closa_3994 hypothetical protein 27 12 Op 2 . - CDS 27936 - 29162 1246 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 28 13 Op 1 24/0.000 - CDS 29291 - 30334 1071 ## COG0208 Ribonucleotide reductase, beta subunit 29 13 Op 2 . - CDS 30382 - 32628 2272 ## COG0209 Ribonucleotide reductase, alpha subunit - Prom 32753 - 32812 6.9 + Prom 32815 - 32874 7.4 30 14 Tu 1 . + CDS 33019 - 33882 859 ## COG1284 Uncharacterized conserved protein + Term 33964 - 34017 6.1 - Term 33952 - 34004 4.2 31 15 Op 1 . - CDS 34025 - 35212 1348 ## BHWA1_01065 hypothetical protein 32 15 Op 2 . 
- CDS 35227 - 36735 1250 ## COG1696 Predicted membrane protein involved in D-alanine export - Prom 36966 - 37025 8.4 + Prom 36950 - 37009 8.6 33 16 Tu 1 . + CDS 37176 - 38561 1367 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog + Term 38577 - 38610 2.1 + TRNA 38805 - 38877 75.9 # Phe GAA 0 0 + TRNA 38905 - 38975 74.6 # Gly TCC 0 0 - Term 39030 - 39076 12.7 34 17 Op 1 . - CDS 39102 - 39749 447 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases - Prom 39783 - 39842 3.0 35 17 Op 2 . - CDS 39854 - 40942 1349 ## COG1879 ABC-type sugar transport system, periplasmic component 36 17 Op 3 . - CDS 40990 - 41130 229 ## gi|160937698|ref|ZP_02085058.1| hypothetical protein CLOBOL_02590 37 17 Op 4 21/0.000 - CDS 41173 - 41955 857 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 38 17 Op 5 2/0.000 - CDS 41958 - 43427 174 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 39 17 Op 6 . - CDS 43461 - 44927 1053 ## COG1070 Sugar (pentulose and hexulose) kinases 40 17 Op 7 11/0.000 - CDS 44947 - 45855 651 ## COG1180 Pyruvate-formate lyase-activating enzyme 41 17 Op 8 . - CDS 45915 - 48326 1957 ## COG1882 Pyruvate-formate lyase 42 17 Op 9 . - CDS 48323 - 48418 67 ## - Prom 48513 - 48572 8.0 + Prom 48507 - 48566 7.7 43 18 Tu 1 . + CDS 48638 - 49666 795 ## COG1609 Transcriptional regulators + Term 49880 - 49946 30.0 + TRNA 49862 - 49933 75.4 # Gly GCC 0 0 - Term 49843 - 49913 22.0 44 19 Op 1 . - CDS 50013 - 51143 275 ## COG0582 Integrase - Prom 51177 - 51236 5.7 45 19 Op 2 . - CDS 51244 - 52143 230 ## COG2946 Putative phage replication protein RstA - Prom 52243 - 52302 2.8 - Term 52333 - 52377 4.6 46 20 Op 1 . - CDS 52445 - 52636 161 ## gi|160937707|ref|ZP_02085067.1| hypothetical protein CLOBOL_02600 47 20 Op 2 . - CDS 52626 - 52793 141 ## 48 20 Op 3 . 
- CDS 52978 - 53193 185 ## gi|160937708|ref|ZP_02085068.1| hypothetical protein CLOBOL_02601 - Prom 53270 - 53329 13.2 + Prom 53212 - 53271 5.2 49 21 Tu 1 . + CDS 53364 - 53738 237 ## Clocel_3551 putative transcriptional regulator, XRE family + Term 53781 - 53844 8.9 - Term 53779 - 53821 6.1 50 22 Tu 1 . - CDS 54022 - 54207 61 ## - Term 54395 - 54440 7.4 51 23 Op 1 . - CDS 54483 - 55196 611 ## COG0860 N-acetylmuramoyl-L-alanine amidase 52 23 Op 2 . - CDS 55263 - 56705 1364 ## COG1409 Predicted phosphohydrolases 53 23 Op 3 . - CDS 56764 - 58446 1846 ## COG3858 Predicted glycosyl hydrolase - Prom 58471 - 58530 10.7 54 24 Tu 1 . - CDS 58653 - 58925 291 ## Closa_0332 hypothetical protein 55 25 Op 1 . - CDS 59031 - 59657 378 ## Closa_0331 hypothetical protein 56 25 Op 2 . - CDS 59558 - 61039 531 ## COG0515 Serine/threonine protein kinase - Prom 61115 - 61174 4.9 57 26 Tu 1 . + CDS 61370 - 62119 791 ## Closa_0328 Negative regulator of genetic competence + Term 62149 - 62186 1.0 - Term 62135 - 62176 6.7 58 27 Op 1 1/0.143 - CDS 62238 - 62915 715 ## COG1802 Transcriptional regulators 59 27 Op 2 . - CDS 62948 - 63958 1019 ## COG2055 Malate/L-lactate dehydrogenases 60 27 Op 3 . - CDS 63987 - 65054 1099 ## COG1312 D-mannonate dehydratase - Prom 65126 - 65185 5.4 - Term 65271 - 65306 0.5 61 28 Tu 1 . - CDS 65346 - 65444 161 ## - Prom 65591 - 65650 7.0 + Prom 65545 - 65604 6.1 62 29 Op 1 . + CDS 65786 - 67894 2395 ## COG1882 Pyruvate-formate lyase 63 29 Op 2 11/0.000 + CDS 67941 - 68186 375 ## COG1882 Pyruvate-formate lyase + Term 68207 - 68264 6.1 64 29 Op 3 . + CDS 68278 - 69030 655 ## COG1180 Pyruvate-formate lyase-activating enzyme + Prom 69046 - 69105 10.6 65 30 Op 1 . + CDS 69139 - 69309 209 ## CPR_2442 ferredoxin (FdxA) + Term 69325 - 69362 5.3 + Prom 69311 - 69370 4.0 66 30 Op 2 . + CDS 69407 - 70123 665 ## COG2071 Predicted glutamine amidotransferases + Term 70345 - 70377 1.1 - Term 70178 - 70238 11.9 67 31 Op 1 . 
- CDS 70293 - 71093 783 ## Closa_4202 hypothetical protein 68 31 Op 2 . - CDS 71090 - 71656 467 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 71792 - 71851 7.2 + Prom 71856 - 71915 7.6 69 32 Tu 1 . + CDS 71980 - 73536 1850 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Term 73554 - 73616 16.0 70 33 Op 1 . - CDS 73622 - 74365 613 ## COG1802 Transcriptional regulators 71 33 Op 2 . - CDS 74414 - 75262 370 ## COG1082 Sugar phosphate isomerases/epimerases - Prom 75311 - 75370 7.6 + Prom 75371 - 75430 6.4 72 34 Op 1 . + CDS 75482 - 76585 236 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 73 34 Op 2 . + CDS 76625 - 77113 441 ## SpiBuddy_2557 tripartite ATP-independent periplasmic transporter DctQ component 74 34 Op 3 . + CDS 77127 - 78404 566 ## PROTEIN SUPPORTED gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 75 34 Op 4 . + CDS 78436 - 79287 658 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase + Term 79343 - 79382 3.0 - Term 79329 - 79368 5.1 76 35 Op 1 . - CDS 79388 - 80659 1306 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase 77 35 Op 2 8/0.000 - CDS 80656 - 81426 680 ## COG1540 Uncharacterized proteins, homologs of lactam utilization protein B 78 35 Op 3 21/0.000 - CDS 81521 - 82546 875 ## COG1984 Allophanate hydrolase subunit 2 79 35 Op 4 . - CDS 82550 - 83269 640 ## COG2049 Allophanate hydrolase subunit 1 80 35 Op 5 . - CDS 83328 - 84011 779 ## COG0684 Demethylmenaquinone methyltransferase - Prom 84100 - 84159 5.7 + Prom 84064 - 84123 8.1 81 36 Op 1 . + CDS 84148 - 85071 776 ## COG0583 Transcriptional regulator + Prom 85082 - 85141 7.3 82 36 Op 2 . + CDS 85178 - 86356 1256 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Term 86365 - 86415 7.4 - Term 86167 - 86207 -0.1 83 37 Tu 1 . - CDS 86386 - 87186 779 ## COG1349 Transcriptional regulators of sugar metabolism - Term 87202 - 87263 6.2 84 38 Op 1 . 
- CDS 87295 - 88122 930 ## COG0191 Fructose/tagatose bisphosphate aldolase 85 38 Op 2 . - CDS 88136 - 89503 1488 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component 86 38 Op 3 . - CDS 89534 - 90403 699 ## Oter_0115 xylose isomerase domain-containing protein 87 38 Op 4 3/0.000 - CDS 90419 - 91057 723 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 88 38 Op 5 . - CDS 91200 - 92687 1040 ## COG1070 Sugar (pentulose and hexulose) kinases 89 38 Op 6 18/0.000 - CDS 92767 - 93486 238 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 90 38 Op 7 19/0.000 - CDS 93479 - 94210 230 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 91 38 Op 8 24/0.000 - CDS 94213 - 95226 1048 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 92 38 Op 9 . - CDS 95229 - 96089 979 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components - Prom 96229 - 96288 6.5 - Term 96323 - 96371 10.7 93 39 Op 1 4/0.000 - CDS 96406 - 97167 719 ## COG1349 Transcriptional regulators of sugar metabolism 94 39 Op 2 2/0.000 - CDS 97267 - 98247 1086 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 95 39 Op 3 4/0.000 - CDS 98250 - 99215 895 ## COG0524 Sugar kinases, ribokinase family 96 39 Op 4 3/0.000 - CDS 99205 - 100059 861 ## COG0191 Fructose/tagatose bisphosphate aldolase 97 39 Op 5 2/0.000 - CDS 100098 - 101081 936 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 98 39 Op 6 1/0.143 - CDS 101162 - 102241 973 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 102262 - 102321 11.9 - Term 102489 - 102526 8.1 99 40 Tu 1 . - CDS 102725 - 103693 792 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 103733 - 103792 5.8 + Prom 103685 - 103744 6.3 100 41 Op 1 . 
+ CDS 103945 - 104997 989 ## COG1316 Transcriptional regulator 101 41 Op 2 9/0.000 + CDS 105010 - 105744 649 ## COG3279 Response regulator of the LytR/AlgR family 102 41 Op 3 . + CDS 105741 - 107046 1087 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain Predicted protein(s) >gi|157101643|gb|DS480681.1| GENE 1 61 - 2478 1307 805 aa, chain - ## HITS:1 COG:XF0506 KEGG:ns NR:ns ## COG: XF0506 COG5545 # Protein_GI_number: 15837108 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Xylella fastidiosa 9a5c # 376 710 30 368 488 170 35.0 1e-41 MQVSSIADYRMAVKHDGTITLATGRSRMEKNWKNKSYSWSQLLKHLETPVRTHETLAEYM KMPKDEQDRIKDVGGFVGGALKGGRRKADTVDSRQLITLDADYAPAGLMEDIALLAYYAY AMYTTHKHSPEKPRLRFVIPMDRPVTADEYEAIARKLAEEIGIDYFDDTTYQPSRLMYWP SAAADGEYLFHYEDLPWLSADNILRRYPDWTDTSYWPESSRAKESRVKQAKKQGDPTEKA GLIGAFCRTYDVEDAIAAFLPEVYVKCDLPERYTYAEGSTASGLVIYEDGRFAYSNHSTD PACGKLCNAFDLVRIHKYGIQDEDAAPGTPTTKLPSYKAMMELVQRDKETTLTVARERAE LAREDFADTCDTEEDDSWKGRLSKDKKGLEPSLNNLLLIMRYDQGLKGIRFNQMADNLEI KGPVPWKSQSRFWRDADDAQLEAYLSMTYTEFPKAKILTAITKAADDRSYHPVREYLDGL PEWDGIPRVDTLLIDYLGADDTEYVRSVTRKTLCAAVHRVKYPGCKFDTVLVLCGPQGIG KSTLISRLGGQWFSDSLNLADTRDKTAAEKLQGYWIIEIGEMAGIGSAGVKTLRGFITTQ DDRYRASYGRRVSSHPRQCILIGTTNSEEGYLNDVEGGRRFWPVRVPCIGAKRVWDMTQE EVSQIWAEVLYHVAQGEKLILSGGAADEAVKQQKEAMITDPREEKVRMYLDTLLPEDWYN RDLDKRRDFLYGTECPEPEAVLRRDFVSCQEIWCECFGNSLKNMEPKDTYTIKKILARLP NWENSNERKYIGGEYGRQRGYKRVF >gi|157101643|gb|DS480681.1| GENE 2 2493 - 4478 1179 661 aa, chain - ## HITS:1 COG:mlr3298_2 KEGG:ns NR:ns ## COG: mlr3298_2 COG0749 # Protein_GI_number: 13472868 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Mesorhizobium loti # 181 628 243 615 652 70 25.0 8e-12 MTMNVDIETYSSIDIREAGVYAYASAPDFEILLIGYRYDGQDVKVIDLTDPLADPERDFP DFWEGLYSPDVIKTAYNANFERTCLASWLGRPMPPEQWRCTAVHAATLGLPGTLGGVGEA 
LGLPEDKQKDKIGKSLIQYFCKPCKPTKTNGMRSRNLPEHAPDKWQLFIEYNRQDVVAEA TIREKLQIYPVARQEQELWNLDQHINDHGVRLDMDLADKIIQYDGIYQERLKQEARKLTG LNNPNSLPQLKMWFFMTYGLDVPSITKDSIPVIEAQLKELKTTHDVLPGLRMLQIRKELG KTSVKKYQAMRHAVCPDGYLRGILQFYGANRTGRWAGRIVQVHNLPQNKIPDLDLARDLV KQEDFDTLELLFEGIPFVFSQLIRTAFIPSEGYRFVVSDFSAIEARVIAWLADEKWRLNV FRTHGKIYEASAAQMFHVPIENIKKGSRLRQQGKVAELALGYGGGFGAMKAMDKAGTIPD DEIPMIIANWRKASPNICKLWRNAEAAARAAIEERRTIKLKHGLSFSYINRILFIGLPSG RKLAYYDTRIEDDEKGKSVITYAGVDQETKKWGRLKTWGGKLVENIVQATARDCLAVTME RVSGAGYQIVMHVHDEIIVDVPETDIDALEKITAIMAQPVPWAQELPLRGDGYETPFYKK D >gi|157101643|gb|DS480681.1| GENE 3 4485 - 5054 580 189 aa, chain - ## HITS:1 COG:no KEGG:CD196_1448 NR:ns ## KEGG: CD196_1448 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_CD196 # Pathway: not_defined # 1 176 1 175 190 194 58.0 1e-48 MSEERKVTKVTTGTVRFSYLHVFEPWAAQEGQEKRYSVCLLIPKSDKKTLSKIKEAIEAA KKEGAANKFKGKTAGLKLPLRDGDEERADDYPEYTGMYFMNANSNRKPILLDQDNNEILD QTEMYSGCWGQASINFFPFNSNGNKGIAVGLNALKKKRDDEPLGGTITVDSAISDFEDSD DGFDDDLLG >gi|157101643|gb|DS480681.1| GENE 4 5072 - 6247 959 391 aa, chain - ## HITS:1 COG:no KEGG:Dred_1188 NR:ns ## KEGG: Dred_1188 # Name: not_defined # Def: hypothetical protein # Organism: D.reducens # Pathway: not_defined # 5 389 4 389 397 425 55.0 1e-117 MPDTHARLSASSAKRWMSCPPSVRLEEQFPDSGSEYAAEGTLAHSLAETILRYNNSEISK KAFSTRFNKIKADPMYNREMQDYIEDYTQRVWEIANEVKAACPDARVLFEQRLDVSEYVP DGFGTGDVVIVADDMVNIIDLKYGKGVGVSAKDNPQLRLYGLGAYLEHSMLYDIRRIQMT IIQPRLENISVEELTAEELLDWAEREVRPKAAQAYAGEGEFKVGDHCRFCKARVTCRVRA EYNLELTKLDFVDPALLTDEEIGEVLRRADELDHWVKDVTGFALAEALKGTKYEGWKLVE GTSRRRYTDQDAIAMRLTTEGWEEDEIYKPQELIGITEMTKLIGKKKFEELLSGLVIKPE GKPTLAPESDKRPELNRVAEAKQDFDNKMDE >gi|157101643|gb|DS480681.1| GENE 5 6247 - 6651 418 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937658|ref|ZP_02085018.1| ## NR: gi|160937658|ref|ZP_02085018.1| hypothetical protein CLOBOL_02548 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02548 [Clostridium bolteae ATCC BAA-613] # 1 134 1 134 134 198 100.0 9e-50 
MTITAVFESLEEMQEFAGKLNGGSIQGAVQEFIKGEDIYRTSETKQQTCMQDEISEDRET PADEDADEPMEPTTEQKDEGPGYTLVQVREKLSELQKAGKRAKVQELIAGFGVKKLTEVP EDRYTELMQKAGEL >gi|157101643|gb|DS480681.1| GENE 6 6648 - 6866 357 72 aa, chain - ## HITS:1 COG:CAC0545 KEGG:ns NR:ns ## COG: CAC0545 COG4443 # Protein_GI_number: 15893835 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 72 5 74 74 57 47.0 6e-09 MPDLKYNIIQTLAVLPAVGKWHKELNLIEWGDRKAKYDIRGWNKDRSEMTKGITLSKEEM EYLKESIGGIEI >gi|157101643|gb|DS480681.1| GENE 7 6893 - 7150 168 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937660|ref|ZP_02085020.1| ## NR: gi|160937660|ref|ZP_02085020.1| hypothetical protein CLOBOL_02550 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02550 [Clostridium bolteae ATCC BAA-613] # 5 85 1 81 81 143 100.0 5e-33 MRKKMVKKLTALDCIKESGCQDFETAFPRVTGYISECAKYDPDALYTATEVAELMEYLFE LRRRKKDPRRVEPSKVQVTKKYLQP >gi|157101643|gb|DS480681.1| GENE 8 7166 - 7432 199 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937661|ref|ZP_02085021.1| ## NR: gi|160937661|ref|ZP_02085021.1| hypothetical protein CLOBOL_02551 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02551 [Clostridium bolteae ATCC BAA-613] # 1 88 1 88 88 135 100.0 9e-31 MTKVTELAIRARAAAQYPGWRLDITGPGTAVMTNIMGHRRMVTFRQCRRRRDGPIMRAAK WIVPAVIWLLGMWMVAIVVMAAAMGVRI >gi|157101643|gb|DS480681.1| GENE 9 7432 - 7629 113 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937662|ref|ZP_02085022.1| ## NR: gi|160937662|ref|ZP_02085022.1| hypothetical protein CLOBOL_02552 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02552 [Clostridium bolteae ATCC BAA-613] # 1 65 10 74 74 122 100.0 8e-27 MVEPYKPLYTVEETATVLMTNTDTVYSLIRKGSLRALKLGRIKIRGSDLEQFIEDYPVFQ GEGQT >gi|157101643|gb|DS480681.1| GENE 10 7716 - 7922 156 68 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2993 NR:ns ## KEGG: Cphy_2993 # Name: not_defined # Def: putative phage-related DNA binding protein # Organism: 
C.phytofermentans # Pathway: not_defined # 1 68 2 69 72 72 51.0 5e-12 MYRNFLTAMKDEGVTFAQIGSLLGCRYQTVSDTVNGETKKGFYHEDAVMIRNVLFPKYDL DYLFQREK >gi|157101643|gb|DS480681.1| GENE 11 8104 - 8439 281 111 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0782 NR:ns ## KEGG: Cphy_0782 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.phytofermentans # Pathway: not_defined # 1 104 1 104 108 120 58.0 1e-26 MEKAKILERLIKEQGYSLKSFSAKCAIPYTTLYGIMKNGVGKATVDNVMAICHGLGITMD DLEKMANDKKTTKPEPTYADVERLVARNGKQMSVEQKMRLIKLLSEINNED >gi|157101643|gb|DS480681.1| GENE 12 8445 - 9146 237 233 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0781 NR:ns ## KEGG: Cphy_0781 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 206 1 201 232 130 39.0 4e-29 MNHDFILKKVLETYIFCDFKKFPFDCISAIKKYGYHVYTYSELKEKNPEVYELCASCSDE AYTEPFSRTVAYNEEKPLDRIIFSLAHELGHIVLEHPYKADYYEKEANCFASYVLAPSMV IHYCHCESAWDVHRHFGLSDEAAHNAFAAYRRWYRRATHKMYPVDWEMYSYFYKSESKKF ICAETECFYCGRTFYNRPGDCICPICDAKASQEPYPFNDLLSLENRVLGAMNA >gi|157101643|gb|DS480681.1| GENE 13 9495 - 10103 -33 202 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937667|ref|ZP_02085027.1| ## NR: gi|160937667|ref|ZP_02085027.1| hypothetical protein CLOBOL_02557 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02557 [Clostridium bolteae ATCC BAA-613] # 1 202 110 311 311 392 100.0 1e-107 MDIVLELARVGRYEEAKKEKAFIEDNYFVGYDFSSMHETVLQKTLGSIHQQATDLVEADD APNCDEICAKYRKRIYSISGKDKRFPAMTNEVYNSGLIFFPFIEGISRPKYCSLDNIIEY NNRPFIDDRTDEEKENYKQFSKQRILEERYATDYLEYCQICDFISLLQPKSFKSYQEMKY NNTENFQELMQIAEEAGIDIEL >gi|157101643|gb|DS480681.1| GENE 14 10292 - 11497 444 401 aa, chain + ## HITS:1 COG:SA1835 KEGG:ns NR:ns ## COG: SA1835 COG0582 # Protein_GI_number: 15927603 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Staphylococcus aureus N315 # 8 392 6 374 390 78 25.0 3e-14 MGQLRTRKRGSTWEWSFEGAKINGKRNPISHGGYRTKAEAITAGTQAKAEYDSAGRRFTP 
SDISVSDYLDYWYDNYVKTNLSYNTQKDYEKKIRVHLKPAFGKYRLASLETDVIQKWIDG MKRQGYSRSMVKNTLSCLSGALGYAVYPCKYIKYNPCDYARIPKIVMSDEAKAHTEYICV KEDFAAIIERFGPDSNFYVPLMTGYHCGTRLGEAYGIDLLHDVDFERHTITIQHQLTNEG GKWYYRPPKYDSVRTIKILPEYEKILKTEIHNRKKNMLRYGQYFTKTYQMDDGLIFQAPS NVKIVGKEIMPISAKENGELLTPYSFKYCAKVIHEELGNPLFHSHCLRHTHGTLLAENGA QPKTVMERLGHKDIKTTMDRYVFNTEKMQNDAIVILADAIS >gi|157101643|gb|DS480681.1| GENE 15 11572 - 12951 997 459 aa, chain - ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 1 456 8 457 458 459 52.0 1e-129 MEKNQEFIVTIEDMNDDGAGVGKVDGYIWFVKDAVIGDVVRAKAMKMKKSYGFARLMEVL EPSASRVIPSCPVARQCGGCQLQAMSYEEQLKFKERKVMNNLIRIGKFAEDEIHMLPIMG MEEPWRYRNKAQFPFGKDKDGNVIAGFYAGRTHAIVEAEDCLLGVEENREILDIVKRFMK EMKIEPYDELSHKGLVRHVLIRKGFKTGEIMVCLVINGNKLPSKERLVEMLTGGDGVKGM TSISYSVNQEKTNVIMGKEIVNLYGPGYITDYIGNVKYQISPLSFYQVNPVQTERLYGTA LEYAGLTGNEIVWDLYCGIGTISLFLAQKAQKVYGVEIVSQAIKDARRNAEINGIHNAEF FVGKAEEVLPEQFEKKHVHADVIVVDPPRKGCDAVCLDTILKMRPERVVYVSCDSATLAR DLRYLADGGYVVERGKCCDMFPGTVHVETVVELSRKTNG >gi|157101643|gb|DS480681.1| GENE 16 13268 - 14212 1011 314 aa, chain - ## HITS:1 COG:SA1660 KEGG:ns NR:ns ## COG: SA1660 COG3481 # Protein_GI_number: 15927416 # Func_class: R General function prediction only # Function: Predicted HD-superfamily hydrolase # Organism: Staphylococcus aureus N315 # 1 290 1 287 313 161 33.0 2e-39 MRYIDTFREGMHIADVYLCKNKQIALTKNGKEYGNLVMQDKTGTIDAKIWDLGSPGVGEF ETMDYVHVEADVTLFQSSFQLNVRRIRRAQEGEYVEADYLPVSKKDIKKMYEELLGYIRS VKNPYLQKLLSGYFVENAAFAKAFQFHSAAKTVHHGFVGGLLEHTLSVTKLCDYYAGYYP MINRDLLLTAAIFHDVGKTRELSRFPENDYTDDGQLLGHIIIGTEMVGESIRSIPGFPEK LATELKHCILAHHGELEYGSPKKPALLEALALNFADNTDAKMETMIEALQSGGENKGWLG YNRLLESNIRKTTE >gi|157101643|gb|DS480681.1| GENE 17 14549 - 15889 1390 446 aa, chain - ## HITS:1 COG:RSc2932 KEGG:ns NR:ns ## COG: RSc2932 COG0265 # Protein_GI_number: 
17547651 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Ralstonia solanacearum # 167 446 113 384 403 92 28.0 2e-18 MADKRADQHRDLDRQDQTPDKSHRFINEKIVRQPMTRRQIARKILLTIFCAVLFGVLSAI CFVVSRPVAERILGQETPEESSQITIPKDEPETTAPAAAETEPTEPAMTEAVDEMVRSEM EKYRYSIDDLNKLYSNLRTIASDADDSLVVVHSIQHEKDWFDNPIETSGQFSGVVIARTR TEILILTPDKAVEAADSIEVAFPDGTMMAGIVKQSDTIMGLSVVSVDAGQMDSEQFKKVE AIKLGNSYGVKQGDMVVAIGAPAGIVHSSDYGNISYVVRNVHTVDGSSRVFYTSASGDAE AGTFLLNTSGELIGWVTDKYDEDGVCRMTTVVAISDYKGILEMMSNGIPAPYFGIKGQEV SQAMADSGMPSGIYVISAVTDGPAYNAGIQPGDIITWMNGEKVGSLKDFQNQVESLHAGD KIKVAVLRNGKDEYTEIEFSVTVGAR >gi|157101643|gb|DS480681.1| GENE 18 16079 - 18517 2377 812 aa, chain + ## HITS:1 COG:CAC2340 KEGG:ns NR:ns ## COG: CAC2340 COG1193 # Protein_GI_number: 15895607 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 1 812 1 788 788 601 42.0 1e-171 MNQKALKTLEYDKIINQLTEYAASPLGKALCQNLSPSSDLEEVRTWQAQTTDAVTRIRLK GSVSFSGIRDIGDSLKRLDIGSSLSIPELLSISSLLTVAARAKAYGRHDGEDDARGTGEP QDDFDSLEPLFAGLEPLTPLNNEIKRCILSEDEVADDASPGLSHVRRSMKVTADRIHTQL NSILNSNRSYLQDAVITMRDGRYCLPVKSEYKNQVSGMVHDQSATGSTLFIEPMAIIRLN NEMRELEIQEQKEIEAVLASLSNQAAPCTEELRMDMELLAQLDFIFAKAGLARHYKCSAP VFNDKGYIHIKDGRHPLLNPQAVVPINVWLGREFDLLIVTGPNTGGKTVSLKTVGLFTLM GQSGLHIPAWEGSELAIFDQVFADIGDEQSIEQSLSTFSAHMTNIVRILNEADSRSLCLF DELGAGTDPTEGAALAIAILSFLHNMKCRTMATTHYSELKVFALSTPGVENACCEFNVET LQPTYRLLIGIPGKSNAFAISQKLGLPGYIIDDAKSHLEAKDESFEDLLTSLESSRLTIE KEQAEINAYKDEIASLKNRLTQKEERLDERKDKILKNATEEAQRILREAKETADQTIKQI NKLAASSGVGKELEAERARLRDQLKKTDEKLTVKPKGPSQPISPKKLKIGDGVKVLSMNL KGTVSTLPNARGDLYVQMGILRSLVNIRDLELLNEKDISATLGDGSSISYGGKAARGKGS GSSQIKMSKSSTVSAEVNLIGMTVDEAVPAMEKYLDDAYLAHLQTVRVVHGRGTGALKNA VHKRLRQLKYVKEFRLGQFGEGDSGVTVVTFK >gi|157101643|gb|DS480681.1| GENE 19 18545 - 19255 905 236 aa, chain + ## HITS:1 COG:CAC3220 KEGG:ns NR:ns ## COG: CAC3220 
COG0745 # Protein_GI_number: 15896467 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 6 235 7 228 228 262 56.0 3e-70 MAEKQRILIVDDDANIAELISLYLMKECYETMIVGDGEEALKVFPEFKPNLVLLDLMLPG IDGYQVCRELRASSQVPVIMLSAKGEIFDKVLGLELGADDYMIKPFDSKELVARVKAVLR RYQAAPAPAPSSTEQQGEYVEYPDLIVNLTNYSVIYMGHSVEMPPKELELLYFLAASPNQ VFTREQLLDHIWGYEYIGDTRTVDVHIKRLREKIKDHNSWAITTVWGIGYKFEVKR >gi|157101643|gb|DS480681.1| GENE 20 19252 - 20658 1594 468 aa, chain + ## HITS:1 COG:CAC3219 KEGG:ns NR:ns ## COG: CAC3219 COG0642 # Protein_GI_number: 15896466 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 1 462 1 471 475 220 31.0 5e-57 MTRSLYSKFILGYLIFGLLGFIAIATFSSKMTRDYLVQSKADALYDEANIIASSCSTMYQ GKRLDSEEVSTQIRALSHYLRAEIWVVNRQGTVVMDSLGGSRVQSSINGFDPASIGNRSY TIGDYYGMFSGNVLSVSAPITGNYNTYGYVLIHLPISEINQSQNGILSILYITSAVLFGL SLIILLVFTQTVYLPLRKITVGANEYAAGNLDYRIDVKTHDEMGYLSDTLNYMSDELNKM EEYQKNFIANVSHDFRSPLTSIKGYLEAILDGTIPAEMYEKYLSRVISETERLHKLTESM LTLNSLDARGYLSRTNFDINRVIKDTAASFEGTCESKNVSFDLTFSDDIQMVFADLGKIQ QVMYNLIDNAIKFSHHDSTIYIQASGRYEKIFVSVKDTGIGIPKDSLKKIWERFYKTDLS RGKDKKGTGLGLSIVKEIIQAHGENIDVVSTEGVGTEFIFSLPRSTNL >gi|157101643|gb|DS480681.1| GENE 21 20823 - 21863 937 346 aa, chain + ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 257 343 536 620 621 62 37.0 9e-10 MITWKKVGTALLVASLISSVLAPAAMAASKKKVGKIYLTIDSDIRTGRDGGDVEVTATGD NTGLYYVDSWEVTNDDGDTWSSSKPPQVDIILGVEDEEEYYFSNTSSSNFKLTLGSSSKY RFDKIKFVNAKRQDSNATLILTVQLVFDKDADTSAAAAPSNLQWDSAHNGNVSWNEVSSA KYYQTQLIKNGAAIGEIKDIYNTSYDFREQITAPGTYQFKVRSVKSSNSAKSSWNTSGSW TVSEADIAALGNTVSDNSQPAAGTWQTAADGRWWYSNADGSYPVSSWQQINGQWYYFDAE 
GYMATGWIELDGKSYYLDPSTGAMYANTRTPDNFWVDASGVWIPGM >gi|157101643|gb|DS480681.1| GENE 22 22106 - 23050 639 314 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 1 314 1 318 319 250 44 2e-65 MGMKCIGVDVGGTSVKIGLFEITGELLDKWEVKTRKEEGGTHILPDVARSIRARMEERGL SLKTDLVGIGLGVPGPVMPDGFVEVCVNLGWRRMNPQEELSRLLDGVTVKSGNDANVAAL GEMWQGGGKGYKDLVMITLGTGVGGGVIIDEKIIAGRHGLGGEIGHIHVRDEEWEHCNCG GVGCVEQICSATGIAREARRKMEASDKPSALREYGADVTAKDVLDAAKAGDELANEVMDV VGRYLGLALSMAVMIVDPEIFVIGGGVSKAGQFLIDVIQKHYDYFTPISEYKGKLGLATL GNDAGIYGAARLVL >gi|157101643|gb|DS480681.1| GENE 23 23085 - 24374 1202 429 aa, chain - ## HITS:1 COG:no KEGG:Closa_0071 NR:ns ## KEGG: Closa_0071 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 429 1 418 424 421 50.0 1e-116 MGSTNSMSRENKKKQLKRQIVPSGSPFQEEEDSREIVHKAHQRVVRKRLIILAVLVALAA IAALVLFRYERYHTFTDYQTVWERDLVAEAQEAAAQGEGSFCGYRDFGDGVLKYTKDGAT YLDAKGKVVWVQSYEMRTPVVSVNGDFVAIGDQQGNSIYICDKNGTQGQATTLLPILRVT VSAKGVVAALQEDSKASYIYLYKRDGSPLDIMVKSLLSGDGYPVDLSLSPNGTQLITSYM YLDQGVIKCKIVFYNFGLGKNDPNRVVGIFFPKDLGDAMAGRVRFLDESHSVIFTDKGIQ FFSTRVETSPELVSQIPIEENIRSITYAQDKVGIVTDNVEGGDPYKLRIYDREGSPVFEK TFNYQYTGFDIDGDLVLLYNDSSCKVYNMTGTEKYNGTFDFTVSKVSAGRFPGTLLVMGP QKMTEIKLQ >gi|157101643|gb|DS480681.1| GENE 24 24478 - 26064 1215 528 aa, chain - ## HITS:1 COG:no KEGG:Closa_0070 NR:ns ## KEGG: Closa_0070 # Name: not_defined # Def: Peptidoglycan-binding lysin domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 378 1 375 416 387 52.0 1e-106 MGELYEPFPKLPKNFRQIGERDQVLKLYLEDYVNTYLKRLQPAKGADLRVGLLLGSRETH EDIPFVFVDGALEMDSVTEEGGKVAFTEDAWKKAYQDVEQMFPKRTVQGWFLCAGPGCTL SPLTYWKQHSQYFTGKNQLMYLNCGLEGEEAVYITSSDGFYKLRGYSIYYERNQMMQDYM ILRKDVPRAETGVDDKVIQNFKQKMDERRVEAGRHRSTVGVLSGLCSVLAVTVLAGGVAM FNNYQKLHQMESVIASVVPEGNIKDGLMAFTGKGGTENPSGKGWSAADEPDYVIEEASGK VYPTTAPSPENGDEKPSRVTPETLAPSNSGQTGGTGQGSGQTAAADQGSGQSAASGQGSG QTAASDRGSGQTAAPDQGSGQTAASGQGSGQTAAAGQGSGQSAASGQSSSQKGGTGQENG 
QTSGKGESQTDSAQKKNEQTESGNSQAAAQTKAPADDSTLPPTGGRGEADDQTVSSASYK VYTVADGETLYGICFKLYHNLQHIDEICRVNSLTDENSIFAGQKLLIP >gi|157101643|gb|DS480681.1| GENE 25 26124 - 26723 646 199 aa, chain - ## HITS:1 COG:no KEGG:Closa_0069 NR:ns ## KEGG: Closa_0069 # Name: not_defined # Def: sporulation protein YyaC # Organism: C.saccharolyticum # Pathway: not_defined # 1 199 1 199 199 259 60.0 5e-68 MKLWKTFNVQKREDISYYNTSEGFETEGFAMHLDRLIREEMAAKGKDGIMFLCIGTDRST GDSLGPLVGHMLRSRRLKGAAVIGTLDKPVHAMNLDLYARYIRLHYPDYVVVAIDASVGS LDHVGYATLGRGALQPGLGVSKELQAVGDIAITGIVGGVGSRDPVMLQSVRLSMVMKMAD CICESIFLVERLWENAAII >gi|157101643|gb|DS480681.1| GENE 26 26965 - 27885 955 306 aa, chain - ## HITS:1 COG:no KEGG:Closa_3994 NR:ns ## KEGG: Closa_3994 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 306 1 306 310 432 67.0 1e-119 MKNTALVIMAAGIGSRFGGGIKQLEPVGPNGEIIMDYSIHDAMEAGFNKVIFIIRRDLEK DFKEIIGHRIEKLLPVEYAYQELEDLPAGYEVTPGRTKPWGTGQAVLSVKGMVDGPFLVI NADDYYGREGFRRIHDYMAEHMDSQSELYDICMGGFVLSNTLSDNGTVTRGVCQVDEEGY LTNVTETYNIQMKEDGLHATDESGAPVTISPSQPVSMNMWGLPASFVQELEKGFPVFLDN LKEGDIKSEYLLPKIIDNLVQNKKARVTVLDTPDKWFGVTYREDKQAVADAIRGLIQSGV YKEKLF >gi|157101643|gb|DS480681.1| GENE 27 27936 - 29162 1246 408 aa, chain - ## HITS:1 COG:AGpA709 KEGG:ns NR:ns ## COG: AGpA709 COG0624 # Protein_GI_number: 16119709 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 61 400 51 381 387 124 32.0 3e-28 MTAEEGDRRTWELAGQLVQIDSSDPGAYEGQIEQWIYEWFTKEIEEKAGALKNQIELVQR EVMPGRFNLMARIPGKEQAPAPALVYICHMDTVMLGEGWEEETPALGAVVKEGRLYGRGA CDMKSGLACAMSAFSDMVELTGSTGELPARPFVFIGTVDEEDFMRGVESAIRDGWVDRED WILDTEPTDGQIQVAHKGRTWFELTIQGITAHASNPWKGADAIAAMAEAVSHIRKAIGSC PPHEDLGMSTVTFGQITGGYRPYVVPDSCKVWIDMRLVPPTDTAAAKNIVEEAVKAAEAE IAGVRGSYLITGDRPYVEKDSSSRLLASLKSVCDRVTGEDTPIGSFNGYTDTAVIAGTLG NCNCMSYGPGSLELAHKPNEYVPYEDVCRCRRVLTELARQETFFPFAF 
>gi|157101643|gb|DS480681.1| GENE 28 29291 - 30334 1071 347 aa, chain - ## HITS:1 COG:TP0053 KEGG:ns NR:ns ## COG: TP0053 COG0208 # Protein_GI_number: 15639047 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, beta subunit # Organism: Treponema pallidum # 2 347 6 351 351 546 75.0 1e-155 MSILSAKPLFNPQGDTDKNSRRMIGGNTTNLNDFNNMKYTWASSWYRQAMNNFWIPEEIN LSADLKDYKNLTAAERTAYDKILSFLVFLDSIQTANLPCIAQYITANEVNLCLTIQGFQE AVHSQSYSYMLDSICSPVERDRILYQWKEDEHLLRRNTFIGNLYNEFQEKKDDVTLLRVI MANYILEGIYFYSGFMFFYNLGRGGKMPGSVQEIRYINRDENTHLWLFRSMIAELRKEEP QLFTPKMVDMLRSMLEEGVAQEIQWGCYVIGDDVPGLTREMVTDYIRYLGNLRWSGLGFG TLYEGYDEEPQSMKWVNQYSNANMVKTDFFEAKSTAYAKSAAIVDDL >gi|157101643|gb|DS480681.1| GENE 29 30382 - 32628 2272 748 aa, chain - ## HITS:1 COG:TP1008 KEGG:ns NR:ns ## COG: TP1008 COG0209 # Protein_GI_number: 15639992 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Treponema pallidum # 19 748 117 845 845 1045 66.0 0 MADTKVEIDLDYKGLEQCLEEIGRDFQGEGYGLDRLLERYRSFRKPGMGPAEGMEALIRS AAELTSKETPLWEFAAARLRYCGFACEVEEQLKGLGIKGLYEKISYLTEQGLYGDYILKH YSREEIEEAEGFLDDERNKLFTYAGLDLLLKRYVIQDHSHHPLETPQEMYLGISLHLAMK ESKDCRMQWVRRFYDMLSRMQVTMATPTLSNARKPYHQLSSCFIDTVPDSLEGIYRSVDN FAQVSKFGGGMGLYFGKVRATGSRIRGFEGAAGGVIRWIKLVNDTAVAVDQLGMRQGAAA VYLDAWHKDLPEFLQLRTNNGDDRMKAHDIFPAVCFPDLFWKMAEENLDQDWHLMCPHEI RTVKGYCLEDCYGELWEERYLDCVGDDRISKRTVLLKDVVRLIIKSAVETGTPFIFNRDT VNRANPNSHRGIIYCSNLCTEIAQNMSPIESVSTQVLEVNGETAVVQTTRPGDFVVCNLA SLCLGAIDVDDPDEVEYIAASAVRALDNVIDLNFYPLAYAGITNQNYRGIGLGVSGYHHM LAKHGIRWESEEHLVFADQVFERIHYAAVRASANLAAEKGCYRYFEGSEWQTGAYFEKRG LVSREWKELAEQVAEHGMRNAYLLAVAPTSSTSILSGTTAGLDPVMKRFYLEEKKNAILP RVAPDLTPRTFWLYKSGYTMDQTWTVRAAGVRQRHIDQAQSVNLYITNDYTMRQLLNLYI LAWESGVKTLYYVRSKSLEVEECESCSS >gi|157101643|gb|DS480681.1| GENE 30 33019 - 33882 859 287 aa, chain + ## HITS:1 COG:CAC0496 KEGG:ns NR:ns ## COG: CAC0496 COG1284 # Protein_GI_number: 15893787 # Func_class: S Function unknown # Function: Uncharacterized 
conserved protein # Organism: Clostridium acetobutylicum # 14 284 6 276 279 181 37.0 2e-45 MSDIKSPFWRTAAEYGIITVSIWIMVVGIYFFKFPNHFAFGGVTGFSTVVSEISRWSASD FTFIVNTSLLVLGFIFLGKGFGIKTVYASMVMSISLSLLERVCPLSRPLTTEPLLELLFA IFLPAVGTAILFNLGASSGGTDIIAMILKKYTSLNIGTVLMLVDVAAVASSFFIFGPETG LFSTIGLAAKSLVIDDVIENLNLCKCFNIICDDPDPICDYIINTLHRSATVYHAQGAFTH HEKTVIMTTMKRSQALKLRNYIRTIEPTAFMLISNSSEIIGKGFLAG >gi|157101643|gb|DS480681.1| GENE 31 34025 - 35212 1348 395 aa, chain - ## HITS:1 COG:no KEGG:BHWA1_01065 NR:ns ## KEGG: BHWA1_01065 # Name: not_defined # Def: hypothetical protein # Organism: B.hyodysenteriae # Pathway: not_defined # 5 371 206 554 582 179 30.0 2e-43 MRQRIKAYIFAGMFLGMMILPGLLFIPLKDYVDTENYENRVYQEKPVLNFRNLSAADLKS YPGQYDAYFNDHLAFKNPIVAFGKLADIHVFGEVTSDAVLMGKDGWMFYKLKGEMEDSIA DYQGTNHYSVDEMERFGALLKQAEENLASRGIKLVFYMVPNKEQVYSEFMPSSVKVVQEQ SKADLLYEYLKENTDCDVYYPLDEFRKAKDDWCQIYRKYDTHWNDMGAFMASQMVIQDID GEPFGPESYRSYDIENRGTFSGDLATMLNLQKYYDDDPFLKVSGYREDVSFEMDYQNEHE TITHYSSDAEEKAERILICRDSFGVHMAEYFARNYPDVTLMDYRTEDCGAAALEIQPDAV VIEVAERYTDYMFGLLERLGTIGSGDGILKEATAD >gi|157101643|gb|DS480681.1| GENE 32 35227 - 36735 1250 502 aa, chain - ## HITS:1 COG:CAC1564 KEGG:ns NR:ns ## COG: CAC1564 COG1696 # Protein_GI_number: 15894842 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Clostridium acetobutylicum # 1 502 1 473 473 314 38.0 2e-85 MLFSSMIFLWLFLPFVLVVNLLLPIRLSNFFLLAMSFIFYAWGEQEYIWLLLLSLAVNYA GALGIEAVRGSRPAAGEAAGAAGVTAGTAVSTDSQTASGAGKHTLFTPDKLLLVLLVAVN LGFLGYFKYFDFLTGLVNKLTGGTVAEPKNLLLPVGISFYTFQMISYVADVYRGTTGAER NLLNLGLYMSFFPKMIQGPIERYQSMGARIRNRHVTPELFACGARRFIYGLGKKVILANQ FGSVVDKVLANPMDQISGGLGWYVGILYTLQIYFDFSGYSDMAVGLGKMLGFELTENFNY PYLARSVGEFWRRWHISLSSWFKDYLYIPLGGSRKGTFVTCRNLMIVFICTGFWHGAGLS FIAWGMYYGCLQVAERLFLKRRLECLPAAVSYIYMFFVTVVGWTMFRADSLTRGLMLLKQ MFLLRPGIYDTAMYMSHKTIVYMVIGIVLCGPFQALVPCFKKHMQDGRSIYLSESLGLIL LFAYSIVMAVGSTYNPFIYFRF >gi|157101643|gb|DS480681.1| GENE 33 37176 - 38561 1367 461 aa, chain + ## 
HITS:1 COG:CAC0707 KEGG:ns NR:ns ## COG: CAC0707 COG1508 # Protein_GI_number: 15893995 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Clostridium acetobutylicum # 1 460 3 464 464 242 33.0 1e-63 MDLKLQVKQTQTLSQRMIQSAEILQMTSQELNTYINELALENPVIDIVEPPTAEEQRESI EQQEWLNSFNEENYYLYQRQNNDDDYDFKSSWNINTDDGETLQDYLWSQLITENFTDQET EIIKFMLECLDNKGYLEESTETIASYFGTDTEIVEDLLSDLQALDPSGVCARTLEECLKL QLERRDILTPVLESIIDNCLEMVAKNQIPAIARKLRLSPTETAGYCQIIKSLNPKPGVSF SSRDQLRYIIPDVTIVKFKDHFDILLNESMYPTIELNSYYRQMNQNPESSELKEYLGNKI RQAEWVKQCVTQRGKTLMQVSRAILEHQEEFFTFGPAHLNPLRLADIAQELDIHESTVSR AVSKKYLQCSWGVYPMNFFFSRSVAVQESSSSESGTQSVTAADIKRVLREIIEEENKKKP YSDRLLGEKLAERGISISRRTVAKYREEEGIADASGRKEYV >gi|157101643|gb|DS480681.1| GENE 34 39102 - 39749 447 215 aa, chain - ## HITS:1 COG:TVN1450 KEGG:ns NR:ns ## COG: TVN1450 COG0235 # Protein_GI_number: 13542281 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Thermoplasma volcanium # 4 200 6 199 218 141 36.0 1e-33 MDFKKCIVESGKKLENSGLTVETWGNISIRDPETGLVYLTPSAMKYSTIVEDDVVVCRLD GTIVEGHRKPTIETGLHLAIYRNREEVNAVIHTHPMYSMVYATQGKDIPLIIDEAAQAMG DTCKCTQYALPGSEELADQCEKALGKEANSCLLHSHGAVCVGENMDAAFKVATVLEATAR ILYMIEATGGKPAGISPENVAVMKDFAKNHYGQGK >gi|157101643|gb|DS480681.1| GENE 35 39854 - 40942 1349 362 aa, chain - ## HITS:1 COG:PM1325 KEGG:ns NR:ns ## COG: PM1325 COG1879 # Protein_GI_number: 15603190 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 74 354 32 314 314 162 32.0 7e-40 MKRNMTAAVMALAMSTVMSAAALSGCGSTPKETVPETAKEQQTDAQTADATGESGKDEES SAEAASGNYGDKTIGFVGMTLNNEYHITLANGAKVEAEAKGVKIEVQAGDQHASADAQLG IIENMIANKVDGILLVPSSSDGLESALTKCKEAGIPIINLDTKLTDESLANVGLDIPFYG TNNYEGAKLAGEYVAKNFEKGTKTAILKGIEGQTNAADRYNGFIEGAGDTVTVVAEQTAN WEVDQGYTAAQNIISANPDVELFFCCNDNMGIGALRAIKEANMQEQIQIIGFDAVSEALN 
LVENGEFLATVAQYPAEMGKLGVDNMLKIFDGGEAESYIDTGTEVITKDNVGEFKDYLKT FE >gi|157101643|gb|DS480681.1| GENE 36 40990 - 41130 229 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937698|ref|ZP_02085058.1| ## NR: gi|160937698|ref|ZP_02085058.1| hypothetical protein CLOBOL_02590 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02590 [Clostridium bolteae ATCC BAA-613] # 1 46 1 46 46 70 100.0 5e-11 MGAIILSTLTCGLQIMNVATYYQTIVTGIVIIAAVFADKKKERQSE >gi|157101643|gb|DS480681.1| GENE 37 41173 - 41955 857 260 aa, chain - ## HITS:1 COG:YPO2499 KEGG:ns NR:ns ## COG: YPO2499 COG1172 # Protein_GI_number: 16122720 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 13 260 30 272 330 162 43.0 4e-40 MESYKEGVKAKLPALTLYLGLVVIFIVFAVICSMMGKNFLTLNNMFNIITQASIISIIAI GASLVIVTGGIDLSVGSIVGFVGIFGGLILKAGMPLIAMGILCIAAGAAFGLVNGCLVSY GKVPPFIVTLGAMQIARGLALLINGGQPISSFPKGLGSLMASKVLGTVPVSVIYVFVFYA VIIFVMAYTKFGRHVYAAGGNICAARLSGINVNRTVLMVYVLSGIFAAIGGIMLLSRLTY ADPNAGSGYEMNAIASAVIG >gi|157101643|gb|DS480681.1| GENE 38 41958 - 43427 174 489 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 258 463 23 235 563 71 25 1e-11 MCGINKSFFGVPVLKNAQLEVLPGEVHVLLGENGAGKSTLIKILSGAYGREGGDIQLDGR ELAPMTPKEALDAGISVIYQEFNLVPDLPVYENIFIGKEYGQGLLVDHKRAIKEAEGYMK RVGLEVSPKTLVRELSVAQKQLVEIAKAISNHVKVLVLDEPTAAITDKETNMLFGIIREL QKEKIGIVYISHRMSELFEIGDRCTVMRDGEYIGTVKLSETTEDQLTKMMVGRTVSFEKI ANEAVDMNQVVLKVEKLRYKEFLKDISFELHKGEILGFAGLVGAGRTEVAKCMIGAYRKD GGTVTYKGETLSNSLSRNIEKGIVYLSEDRKDEGLVLMHSVMDNIALPNLKQLAKPFIRK KEMADRAKDYIKSLRIKTHTHLTEAKNLSGGNQQKVVIAKWLYSNADVYIFDEPTRGIDV GARDEIYNIMYDLVNQGASIIMISSDLVEVLKMCDRVAVMREGVLEAILDNSPDLTQETI LKYAMQGGL >gi|157101643|gb|DS480681.1| GENE 39 43461 - 44927 1053 488 aa, chain - ## HITS:1 COG:lin2981 KEGG:ns NR:ns ## COG: lin2981 COG1070 # Protein_GI_number: 16802039 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Listeria innocua # 3 486 2 479 483 249 33.0 7e-66 MKRNMVAFDCGNSSCRVILGVYDGEQITTEVISQIPNYMVRVHQYFYWDILMIYQKLLEG LKEAVKRAGTIHSIGICTWGVDFALFDNQGLMIQNPLSYRNSIGAEVLEQMSDEQRTYLF RQTGILCDKINSVNMIKGMMEKMQSMFSHGHKLLMIPDILNYLFTGCMVNEPSELSTTQL MDARKRELSEEALKTMEIPSGLFAPIGRHGTAIGTLHSNVKEILGISYDVPVICVPSHDT AAAVLAIPAEEEEFLFVSTGTWALIGMELEEPIVNEEVRKRCLTNEIGAFDNITLLRNSA GMFIIQRIKEEYEGEHGAIGWEELNSLGDCHEGAAPLFPVNDSRFFNPVHMSDEIWNYLV KTGQASGEKDWGTIICSFQNSMALSFASVIQGLEEISGKKKDTVYMVGGGSRNVRLCQMT ADATGKKVVTGGKESTSLGNLGAQLKYFKPEMTVGEIRRLLGTGIESAAYCCGKDSGEAL KRYQKLEG >gi|157101643|gb|DS480681.1| GENE 40 44947 - 45855 651 302 aa, chain - ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 6 301 8 298 302 174 34.0 2e-43 MGVCNVFNIERFATEDGRGIRTVIFLKGCSLRCRWCANPESQRAGTEVLVKTNVCINCAR CSVLCPEKAISYMEGYGFITDQSKCRQCGVCIRECYADAREMMGRKYNEEELLTEILKDR QYYGMSGGGVTFSGGEPLYYSEIIGAVGEQMHRRGYNTLVETCGHVPQKALEDINGHVDS IYYDFKQIDPDKHKELTGVDNTLILSNLEWLCGHYSGELSVRYPYIPGCNDDEASINGFF 
EYIKSLDHISEIVFLPYHRLGLPKYQGLGRAYEMGNMPSLKKADLLFLVQRAEKYGLKIK IQ >gi|157101643|gb|DS480681.1| GENE 41 45915 - 48326 1957 803 aa, chain - ## HITS:1 COG:SPy2049 KEGG:ns NR:ns ## COG: SPy2049 COG1882 # Protein_GI_number: 15675819 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Streptococcus pyogenes M1 GAS # 1 803 12 805 805 473 35.0 1e-133 MNTRTERLRKRLFDTDPRICPERCIYFTESMKETEGKPNALRRSQAFYDVLSRMTVYVNA DELIVGNQAQWPKASPIYPEYSTDWLEAELNGTPFFPDKRPGDKFYYTQEDKDKILECVD YWKGKSLYENLRKTLPEKINQAWDANVIDDTWVSAAGLGNVIVDYKGVVDKGLSDVMRRI ENKLKRLDPREPGNTRKRWFLEAALQGNQAVVMFSGRIADCCKEQAEQEKDETRRRELLK LEDICRNVPLNPARTFHEAVQSIYMILLAVHLESNGHAISLGRFDQYVNPYYKRDLEEGR ITREEALEIVECFFIKCNELNKLRSWPDTEFFLGYQMFINLAIAGQTVDGKDATNEVSHL CVEACENVRLFTPSVSIKWFEGTSDGFMMKALKAAQRHQGGQPAFYNDKAFIRTLKNMGI AEEDRVDWVPDGCIEASIPGKWDFAAKGPWLNVEKVLEITLHGGTDPKTGYHFVDLEKRA EDCGSVRELMELYKQTLDYFMGLQVETEHINDEIHIQQDINAFRSSLVYDCIERAMDLVE GGSVYSADGGPTAGTISAGDSLAALDEIVFNQKLLTMEQVLHAMSTNYEDMETVPAGPEI RAILLNKAPKFGNDDERADKWVVELEDYIGSSYRYKYRSSKYGKGPVPCCYSYSQSPVTG NIAFGKSIGATPDGRKAGQPVNNGISPANGSEKKGATAACNSVMKLPSIWFQKGAIFNMR LSKGALDTDENKEKVIAMIKVLFENYGQQIQFNVVDNKVFKKAMEHPDEYKDLMVRVSGY SALFTSLSPECQMDVINRAELEL >gi|157101643|gb|DS480681.1| GENE 42 48323 - 48418 67 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHKYIIDKNEIKCYIENVTSYIHEVNVRGGT >gi|157101643|gb|DS480681.1| GENE 43 48638 - 49666 795 342 aa, chain + ## HITS:1 COG:BH2313 KEGG:ns NR:ns ## COG: BH2313 COG1609 # Protein_GI_number: 15614876 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 15 318 3 307 337 160 32.0 3e-39 MNKEGRPENSGYIGIRDIAKLAGVSTATVSRVINNPEMTSEKVRKKVQKVIEEYNYIPNQ PVKKIFSKTSNTIAVFIYDMENPFFISLIKELNQICLEHNYMLLICDTGSSPELEKKYLY FCLANRCAGIILTEGVDSDIFEACKIHIPIAFLDRKTHGKFSSVRSNNRQMMQPIVDYLY NLGHRKIAFVGCKTPMDSVISRQNGYIETMTGKGLPIIQEYIYDKNPQLTLQAGSQALNY FLSLPAAPTAIICCNDIIALGALNAAYSLGLKIPADMSIVGFDNVISNLHQPQITTVQQN LREISLELFELVVNPPENPVSKIIEASFIEGATCSRIETAQK 
>gi|157101643|gb|DS480681.1| GENE 44 50013 - 51143 275 376 aa, chain - ## HITS:1 COG:SPy1488 KEGG:ns NR:ns ## COG: SPy1488 COG0582 # Protein_GI_number: 15675393 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pyogenes M1 GAS # 58 366 61 368 380 69 26.0 8e-12 MEMVLEVHPYAITELAGKRRRCQTYVKDGDSRRLICSADKDGLFEKLYDFYFVHNGTSSM TMDKLFQEWLVYKEAVTDSLKTIRRHEQHWNKYFAHLKSKKVASYDRLELQKECNQLVKT NNLSSREWQNIKTILSGMFYYASEKHYIQDDIMRDVKITVKFRQVNKKTGKTETFLTDEL DTLIRYLNAEFARTHDMAILAVRFNFFVGCRVGELVALKWCDFLDIQHLHVCREEIKESV HDGDKWKDVYTVVEHTKTHTDRIIPLMPGAIRILNDIRLKMAPGSSDDDFIFMRDGSRIT SRQINYVLEKACKKIGIPVKRSHKIRKTVASRLSAGNVPLDSIREMLGHSNLSTTLGYIY NPLSEKETYALMSKAL >gi|157101643|gb|DS480681.1| GENE 45 51244 - 52143 230 299 aa, chain - ## HITS:1 COG:BS_ydcR KEGG:ns NR:ns ## COG: BS_ydcR COG2946 # Protein_GI_number: 16077554 # Func_class: L Replication, recombination and repair # Function: Putative phage replication protein RstA # Organism: Bacillus subtilis # 73 267 122 315 352 99 33.0 1e-20 MFRQSLYGISILFDGKKDMGIHVNIPGKAIHDCLLHFSRKYSSVTPFGSVAYEVDSFDVS FYSSVLRDLLRKIQEKGHLTRFDIAIDDIGANYYSLPALDKKLANNEYVSKFRGHESKIS YHTGNELKGYTIYLGSRHSDIMLRVYDKQLEQNSRRVNDSDALIECPWIRWELELKDDYA SRAAKFLVDGELLNTVACGILSNYVRFITLDATRKGNCSLDSVWESFIDGVSKLSLYKAP APKTIDDKRNWLFRCVSRVFTTVVASDGYDLGVVHDMLRIGEHRLSGSDIALINQACFG >gi|157101643|gb|DS480681.1| GENE 46 52445 - 52636 161 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937707|ref|ZP_02085067.1| ## NR: gi|160937707|ref|ZP_02085067.1| hypothetical protein CLOBOL_02600 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02600 [Clostridium bolteae ATCC BAA-613] # 1 63 1 63 63 119 100.0 9e-26 MVIKGCKTIKEYKALREHFVDLWYQTNFDSGTTYYDIVGNYVKVVDYTGDSVKVPLSEIP GYH >gi|157101643|gb|DS480681.1| GENE 47 52626 - 52793 141 55 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQCWNMVGVSLLSVGLVRIARHEQPVINPDCQVRATVSNPDYQTQATSKEVQYGH >gi|157101643|gb|DS480681.1| GENE 48 52978 - 53193 185 71 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|160937708|ref|ZP_02085068.1| ## NR: gi|160937708|ref|ZP_02085068.1| hypothetical protein CLOBOL_02601 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02601 [Clostridium bolteae ATCC BAA-613] # 1 71 1 71 71 135 100.0 7e-31 MSKNYISSDTEEWVFSLYSHMPKEEFTKDLQALCSARRVYLQNELADSFVFGYLDSVYEV MRDVLTCKALG >gi|157101643|gb|DS480681.1| GENE 49 53364 - 53738 237 124 aa, chain + ## HITS:1 COG:no KEGG:Clocel_3551 NR:ns ## KEGG: Clocel_3551 # Name: not_defined # Def: putative transcriptional regulator, XRE family # Organism: C.cellulovorans # Pathway: not_defined # 1 66 1 65 66 73 68.0 3e-12 MIDYGKLFALLEIRNMKKTDLLKIISSPTLAKLSKGQNISTDTIDKICIHLGVQPSDIME VYEEEIVDGKKLKIKTRYGEPKTYQENEIRTLIISELGKFLKKEGNKEILDEEKIEETLK KINE >gi|157101643|gb|DS480681.1| GENE 50 54022 - 54207 61 61 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRKKVEMSRLSLHIPTALYRRIDEDSRRYDITKTALIQTMIVNYYRSVDSSSFGKTPDCN T >gi|157101643|gb|DS480681.1| GENE 51 54483 - 55196 611 237 aa, chain - ## HITS:1 COG:BH0239 KEGG:ns NR:ns ## COG: BH0239 COG0860 # Protein_GI_number: 15612802 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 44 229 42 229 238 133 38.0 3e-31 MKHSAWQTVMGILLLCIVALISWQAGTLVAVHNQAAEADKARPVVVIDAGHGGSDPGKVG INGQLEKDINLKITEQLKAYLEASDVEVILTRDSDQGLYSSGDSHKKMADMRKRCEIINE AVPDLVVSIHQNSYHEEYVSGGQVFYYKTSEKGKYLAEILQKRFDYVLGEANKRMAKAND NYYLLLHVKEPIVIVECGFLSNGKEAKRLEDEEYQDRMAWTIHMGIMEYLNTVKQKR >gi|157101643|gb|DS480681.1| GENE 52 55263 - 56705 1364 480 aa, chain - ## HITS:1 COG:lin2791 KEGG:ns NR:ns ## COG: lin2791 COG1409 # Protein_GI_number: 16801852 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Listeria innocua # 75 434 42 404 443 256 41.0 7e-68 MTAILMAGVMAGCSVQAGKQVEPTTGKQQTKETVQETVQETELQKPTHPAPSQIQPFPDV EKETKAEPLIGGKRIILMTDIHYLAESLTDRGNMFQSMVEHGDGKLTNYVWEITDAALEE IKLLSPDALIISGDLSLQGEKKSHEELAKKLDEVEKAGITVLVIPGNHDINNPSASVYSG 
GDRYPAEPTAPEDFERIYKEFGYSEASSRDANSLSYTYDLGPSMRLLMLDTCQYEPRNKV GGMIKTETYEWIKEQLKQAARDSVILLPVAHHNLLEESKVYADDCTIEHSEELIQMLEGE NIPLFLSGHLHVQHFMRNNDIGIYEVVTSSLSTPPCQYGVLDYMEDETFYYYSRKVDMEK WARKNKSTDENLLNFDTYSPPVLKRIFYNQAYDAMKNSAEEETGSVFVKLTESEKQQMAK VYGDINAACYAGRAYEVVKEAVKQPGYAMWKEYCYPAILYEYLEYIIEDAVRDYNVLSME >gi|157101643|gb|DS480681.1| GENE 53 56764 - 58446 1846 560 aa, chain - ## HITS:1 COG:CAC1556 KEGG:ns NR:ns ## COG: CAC1556 COG3858 # Protein_GI_number: 15894834 # Func_class: R General function prediction only # Function: Predicted glycosyl hydrolase # Organism: Clostridium acetobutylicum # 248 553 135 437 446 117 25.0 8e-26 MSRRRQGHGCGITVLVLLFIFILAGSGLGVFFAKKYMPSNELADKSRVFGIKGGQVALIL DNQVQEEKGIYEDGQVYLPVDWVNEHLNERFYWDEGEQLLVYALPESIVYADESTQGEKG PLFKVTQDGMYLSLGVVVNYTDIRTQAFATSQIKRVFIDTSWQPYDTAVLKKTGQVRVKG GVKSQIITEAAAGETVDVLETMDKWSRVRTADGYIGYVENRKLEAGEQIAPVSTFEAPVY TSISMDGKVRLGFHQVTRQEGNNTLEDYASNARGMNVIVPTWFNVVSSDGTYTSLASKDY VDKAHDMGLKVWAMVENVSTQESIKNLNTKTLMSSTSTRKKLIEKLMNEADTYGFDGFNL DFESLKAEAGPHYVQFIREMSVACRNKGLVLSVDNYVPSSYTAFYNRKEQGIVADYVIVM GYDEHYAGGEAGSVSSIPYVREGIENTLKEVPKEKVINAVPFYTRVWTVNEGKTSSKAYG ISDARQWVEENQVELTWDKLLGQYYGETVSGSGQQYIWMEEEDSMKLKIDLIKEFDLAGV ACWKLGFEPADIWDIVSGVK >gi|157101643|gb|DS480681.1| GENE 54 58653 - 58925 291 90 aa, chain - ## HITS:1 COG:no KEGG:Closa_0332 NR:ns ## KEGG: Closa_0332 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 10 77 7 79 81 64 53.0 1e-09 MIRNVEYGDRKKHQRTYLDGLKRYKNKGVRITIDGVECPEKEWEKIFEMGEDGGFYMGDY VGAEQGCLKEIRFDKVYLSDPPKEKKDKDS >gi|157101643|gb|DS480681.1| GENE 55 59031 - 59657 378 208 aa, chain - ## HITS:1 COG:no KEGG:Closa_0331 NR:ns ## KEGG: Closa_0331 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 21 208 23 205 205 137 36.0 2e-31 MGFSKRNHFKHSRQPAKGCRVVGITGTGRGTGVTHLSVLAANYMASALQRRTAVIEWNDH GAWKTMKDICQAGSVKQADSADYRYKIFGVTYYMQGDPSILASCLGGIYDEIIIDFGELR 
PSIRAEWLRCEVKIVMAALSEWKLEAFLELLSEEEGRRAGWIYTAAFGSEDTRKQIERQF GISLVRVPLSVDAFSVDYKTMQWFERIL >gi|157101643|gb|DS480681.1| GENE 56 59558 - 61039 531 493 aa, chain - ## HITS:1 COG:CAC0404_1 KEGG:ns NR:ns ## COG: CAC0404_1 COG0515 # Protein_GI_number: 15893695 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Clostridium acetobutylicum # 1 225 35 272 306 125 33.0 1e-28 MGTGRAGTVWLAVHIGLEEYRAIKRISKRDADYDTFRREALVLKELRHPGIPMIYDLEED SEFFYLIEEYLEGYSLYALIANQGPIQEAEAVRYGMQVCGLVAYMHSACELPILHLDLQP NNLIICNGTVKLIDFDHAAGCRWANIAPKRFGTVGCAAPEQYASDRMLDQRTDIYAIGAV LRFMTAGTLREEAGQSAALSEVFERIIRKCMDSNMEKRYASAGEVEDALQALCTRKLTEE KEQKTIPSHILLVTGMKAGTGTTHLAFGLVCFLNKNGYRALYEEHNASRAVDTMARRAGM RPDRFGIYTMNGCCLKPWYGPAVKLDEGTGFDIIVKDYGTDWQQAELELKAGKADLLGTG SASAWDGRYIRQYISWMNQAWAGRCDGNRVLVFRGTRTICQKELKAMTSSDKALKGIKLF VSPEYENPFRQKREEAFFQSLWSAIAVSGRRWNRKRGLLKSWGFLREIILNTAGSLRKGA GLSESQEQDGEQG >gi|157101643|gb|DS480681.1| GENE 57 61370 - 62119 791 249 aa, chain + ## HITS:1 COG:no KEGG:Closa_0328 NR:ns ## KEGG: Closa_0328 # Name: not_defined # Def: Negative regulator of genetic competence # Organism: C.saccharolyticum # Pathway: not_defined # 1 249 1 239 239 298 65.0 2e-79 MKIERINENQIRCTLTSFDLSARNINLVELAYGTEKARKLFREMIQKASNEVGFEAEDIP LMVEAIPLSSESIMLVITKIEDPEELDTRFAKFSPFTDDNQNSLVSQLASEFLEGAPDSL NYLEPGSVEQVGDESGTDGASEAPAKKNTGSVASSRIFRFNSLDRISEASRAVSGIYDGV NTLYKKPGTRQYYLIVKRLDCGDLDFSRVCNILAEYATKLSGESASEAYFKEHYEVIIQD NALQNLARI >gi|157101643|gb|DS480681.1| GENE 58 62238 - 62915 715 225 aa, chain - ## HITS:1 COG:BH1062 KEGG:ns NR:ns ## COG: BH1062 COG1802 # Protein_GI_number: 15613625 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 11 224 11 224 226 156 40.0 3e-38 MKIEEKRHGETARQYAYRTLRENIISLDLEPGAPFNDVEMSQMIGISRTPVREAVIQLSE ESRIIEIFPQRGMRIALIDVDLVEEARFLRLVLEKAAVELVCGQADEEDLQWLDENIRLQ 
EFYLEGGSSRKLLELDNEMHRKLFSICRKELTYEMSQRLAIHYDRIRSLSVAAVKDHKII EDHKKLLEAIRSRDKKLAAETMDLHLNRWMLNEQMFRDQYPAYFK >gi|157101643|gb|DS480681.1| GENE 59 62948 - 63958 1019 336 aa, chain - ## HITS:1 COG:yiaK KEGG:ns NR:ns ## COG: yiaK COG2055 # Protein_GI_number: 16131446 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Escherichia coli K12 # 1 336 1 332 332 303 44.0 3e-82 MRLTFDEIKSEIKRVLVKYGMSEEKAETCARIHTETTYDGVYSHGTNRVARFVNYIQKGW VDVNAEPSLEREFGALRVYNGNMGPGVLNALYCADRAMELADQYGIGMVGIRNTTHWMRG GTYGLYAARKGYAAIMWTNTESCMPPWGGRECRLGNNPFVMAVPSADGGEPLQLDMAMSQ YSYGKLQVTRLAGEKLPYPGGFDDEGNLTDDPGAIEQSRRVLPVGYWKGSAFSFMLDILG SVLTDGIGAAGMDQKGYGSCGGCSQVMIVVDPGRTTTDAHMKEIVDTAVLYVKSSAPSEE GGSVRAPGQGSARFHEEHDRLGIYVDDDVWKEIQSL >gi|157101643|gb|DS480681.1| GENE 60 63987 - 65054 1099 355 aa, chain - ## HITS:1 COG:TM0069 KEGG:ns NR:ns ## COG: TM0069 COG1312 # Protein_GI_number: 15642844 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Thermotoga maritima # 1 355 1 357 360 457 56.0 1e-128 MDMTLRWFGEGVDSVSLEQIRQIPGVKGVITTLYDIPAGEVWPMERILQMKEQVEASGLK VLGIESVNIHDAIKVGTPDREQYIANYITTLERLGQAGITVVCYNFMPVFDWTRSDLAKE RPDGSTVLAYNQKEIDKINPENMFQSMGEKSNGFELPGWEPERMARIKELFDMYKDVDED RLFDNLVYFLKAIQPVCETYDIKMAIHPDDPAWPVFGLARIITGKERLLKLMKAVDADFN GVTLCTGSLGSNPENDIPDIIRSLKGRIPFAHVRNLQYNAPGDFQEAAHLSADGSMDMYE IMKALYDIGFDGIMRPDHGRAVWGEVSMPGYGLYDRAMGACYLNGLWEAICKQNR >gi|157101643|gb|DS480681.1| GENE 61 65346 - 65444 161 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEISKDMLIGQLIQVDDSIAPILMRAGMHCLG >gi|157101643|gb|DS480681.1| GENE 62 65786 - 67894 2395 702 aa, chain + ## HITS:1 COG:lin1443 KEGG:ns NR:ns ## COG: lin1443 COG1882 # Protein_GI_number: 16800511 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Listeria innocua # 1 679 1 680 743 984 69.0 0 MRTEWNTFKGGVWQREINVRDFIQKNYTPYDGSDEFLEGPTRNTTELWNQVMELSKKEQE 
AGGVLDMDTKIISTITSHGPGYLNKDKETIVGFQTDKPFKRSLQPYGGIRMAMKACQDNG YEVDPEIVEFFTKHRKTHNAGVFDAYTPEMRACRSSHIITGLPDAYGRGRIIGDYRRIAL YGVDALIQEKKEEKDSTRTIMYSDVIREREELSEQIRALEDLKELGNIYGFDISKPACDV REAIQWTYLGYLAAVKEQNGAAMSLGRTSTFLDIYAERDLKEGRYTEKEIQEFVDHFIMK LRLVKFARTPEYNELFSGDPTWVTESIGGIGIDGRHMVTKMSFRYLHTLQNLGTAPEPNL TVLWSTKLPENFKRFCAKTSIESSSIQYENDDLMRVTHGDDYAIACCVSSMRIGKEMQFF GARANLAKCLLYAINGGVDEISGKQVGPKYRPVTTEYLDFDDVMDKYKDMMKWLAKVYVN ALNIIHYNHDKYSYERLQMALHDKKVTRWFATGIAGLSVVADSLSAIKYARVKAVRNDKG IVTDYIVEGDFPKYGNNDDRVDQLAADLVHTFMSYIKGNHTYRGGIPTTSILTITSNVVY GNNTGATPDGRKAGQPFAPGANPMHHRDSRGAVASLASVAKLPFRDAQDGISNTFSIIPG ALGKDDQIFMGDLEVELNGCGGGCEAPTLFEGLDSEADIEES >gi|157101643|gb|DS480681.1| GENE 63 67941 - 68186 375 81 aa, chain + ## HITS:1 COG:lin1443 KEGG:ns NR:ns ## COG: lin1443 COG1882 # Protein_GI_number: 16800511 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Listeria innocua # 3 81 665 743 743 115 67.0 1e-26 MVKASESQIDNLVALLDGYAQEGGHHLNVNVFSKETLLDAQAHPEKYPQLTVRVSGYAVN FNKLTKAQQDEVISRTFHGSI >gi|157101643|gb|DS480681.1| GENE 64 68278 - 69030 655 250 aa, chain + ## HITS:1 COG:SPy0379 KEGG:ns NR:ns ## COG: SPy0379 COG1180 # Protein_GI_number: 15674526 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Streptococcus pyogenes M1 GAS # 3 240 11 249 263 266 54.0 2e-71 MIGRVHSIESFGTVDGPGIRMVIFLSGCPMRCLYCHNPDTWDPKGGSPMTAEEILDQYEQ ARPFYKKGGITVSGGEPLMQIGFVTELFEKAKKSGIHTCLDTSGITFNPGSQAVMAHFDR LLASTDLILLDIKHIDPKEHVKLCAQPQDNILAFAAYLEKKQIPVWIRHVVVPGITDREE YLYRLGRYLGTLKNVKALDVLPYHDMGKAKYGQLGIPYPLEDTAPLPPAKAAESRKIILQ GMRDERVHCS >gi|157101643|gb|DS480681.1| GENE 65 69139 - 69309 209 56 aa, chain + ## HITS:1 COG:no KEGG:CPR_2442 NR:ns ## KEGG: CPR_2442 # Name: not_defined # Def: ferredoxin (FdxA) # Organism: C.perfringens_SM101 # Pathway: not_defined # 1 52 14 65 69 64 76.0 1e-09 MAYVISDACVSCGSCAAECPVSAISEGDSQYVIDADTCIDCGTCAATCPTGAISEG 
>gi|157101643|gb|DS480681.1| GENE 66 69407 - 70123 665 238 aa, chain + ## HITS:1 COG:RC0552 KEGG:ns NR:ns ## COG: RC0552 COG2071 # Protein_GI_number: 15892475 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Rickettsia conorii # 2 237 3 242 242 155 37.0 4e-38 MKKPVIGITPSHNTDNDEISVRPTYLRAIEAAGGLSILLPLEVSAEDLKQLSGLCDGFLF SGGPDIHPFLLREETHMHCGNVSVARDTMELSLLKLAMEAKKPVLGICRGAQVINVGLGG DIYQDITSQAETGFPIAHKQPYSCCLPSHHVDVQRDTLLCGIANGKTQIEVNSSHHQAVR RIAPCLIASGHAPDGIIEALEMPDYPYLLALQWHPEYMWKTDTVSANIFKSFVEACRG >gi|157101643|gb|DS480681.1| GENE 67 70293 - 71093 783 266 aa, chain - ## HITS:1 COG:no KEGG:Closa_4202 NR:ns ## KEGG: Closa_4202 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 36 265 28 261 261 95 28.0 2e-18 MNRRSKPVAQKELEEQIGRALCSRAESIQVTSADTERMRRSVHRKIEEESGMKKWNMKRV FVVAAAICVLGTITAVAAGKVAYTSAGGSHLDDFTYDKLGEMETKLGYTTNAPEVFSNGY RFDMGMPTHQDAMDEDGNVIKSAESFSLHYKKDGSPDIFVTVQGTSLFEEEGQPDQVFEH NGITLNYTRDQYRFVPPDYQASEEEKAKEAAGELYISYGSDTVTDQVAQSMLWKDGEKVY IITAMDNPMTAQDMAQMAGELIDNKQ >gi|157101643|gb|DS480681.1| GENE 68 71090 - 71656 467 188 aa, chain - ## HITS:1 COG:BS_sigW KEGG:ns NR:ns ## COG: BS_sigW COG1595 # Protein_GI_number: 16077241 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 1 177 1 186 187 77 29.0 1e-14 MEQETAGLVRRMAEGDSQAFDRLMEFYYPKILRMAYLISGSHADSEDIVQETFVLCWMNR KKIKEPEYFGNWLYKTLTREAWRYCRKNRKEQPVEEVFGKEEPETASVLEEVMMRSQEKE LYKAIRNLPVKQRTAVVLYYFNQMSTREIAGIMGCLEGTVKSRLYTARANLKQELMEERK RVGREVTL >gi|157101643|gb|DS480681.1| GENE 69 71980 - 73536 1850 518 aa, chain + ## HITS:1 COG:BS_yfmM KEGG:ns NR:ns ## COG: BS_yfmM COG0488 # Protein_GI_number: 16077809 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 517 1 517 518 812 76.0 0 
MSILNVEHLSHGFGDRAIFQDVSFRLLKGEHIGLIGANGEGKSTFMNIITGKLMPDEGKV EWAKNVRVGYLDQHAVLEKGMTIRDVLKNAFAFLFEMEEHMNQICDKMGEAGEEEMAAMM DELGTIQDLLMAHDFYIIDSKVEEVGRALGLADIGLDKDVTELSGGQRTKVLLGKLLLEK PDILLLDEPTNYLDVQHIDWLKRYLQEYENAFILISHDIPFLNSVINLIYHMENQRLDRY VGDYDKFQEVYAVKKSQLEAAYKRQQQEIAELEDFVARNKARVATRNMAMSRQKKLDKMD VIELAREKPKPEFHFLEARTPGKYIFETKDLIIGYDEPLSRPLNLVMERGQKIVLVGANG IGKTTLLKSILGLTPALSGSVELGDYLSIGYFEQEMAPGNTTTCLQEIWTEFPAYTQYQV RSALAKCGLTTEHIESQVRVLSGGEQAKVRLCKLINRESNVLLLDEPTNHLDVDAKDELK RALKEYRGSILLICHEPEFYEGLATDVWDCKEWALKLS >gi|157101643|gb|DS480681.1| GENE 70 73622 - 74365 613 247 aa, chain - ## HITS:1 COG:AGl1783 KEGG:ns NR:ns ## COG: AGl1783 COG1802 # Protein_GI_number: 15891006 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 210 58 262 280 82 25.0 5e-16 MSQEDMMEKDNSLTNKAYETIRQQIFDFDLFPGQIVSDYLLSKELNMSRTPIRQALMRLK NEGLLEEQIGKKNYRVSTITKEDIQDLFDFREGIETTAFMLSWRNGITPEQLENLQEITD KMKETKDAGQAKEHFFYDQQFHNELVALSNNKRLIKAHDEILLQLTRMRFLSFLENSLQS KACMKHQAIIDAIKDQDYERGRQAVIDHVRSSRDDYINLFGNGLSANSLCLLRYFTRSPE TDSKEDE >gi|157101643|gb|DS480681.1| GENE 71 74414 - 75262 370 282 aa, chain - ## HITS:1 COG:TM0416 KEGG:ns NR:ns ## COG: TM0416 COG1082 # Protein_GI_number: 15643182 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Thermotoga maritima # 25 246 23 243 270 116 32.0 4e-26 MEHSLFPFRTGFQCVLESKEQLPQLRRLFGLLHREGFYGVELNLPDLDLIKPEELKSLLN EYQMKMNMLATGVYAKTHGLSLSSGIPEERMRAVNGCKRNIDYAASMGCGIIIGFLKGGP GSDREVSERLFLESMLELKPYIEEKKVQVLIEATNHKEASIIRTLKEGAHVINFLDCPFI QLLADTYHMNIEEPSLVDSLSEYMNYYPHLHISDDNRAYPGLGSLDFTAIYRKLMECGYK GTVAVEGNIQHGLLEDTQICAAYLHKMCSGLHTPSRVPSHAV >gi|157101643|gb|DS480681.1| GENE 72 75482 - 76585 236 367 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 63 352 49 330 346 95 25 1e-18 MRKMMKRIAAATIGMAIVMTSTACSSGTGSTTAAPENKETTAVQSAAEEKKEDPAAAPEE 
TYTLMFGHAQTETHPYQACFQEWADAVAEKTNGGLVIDLYPSNTLGSEEDIINSFKDSDT NWGYNTDFARLGTYVPELAMFNLPYFVETMDDINAVKELDMVKEWINKLETENDIKVVSL NLVQGYRNVVAGKPVLKPDDLKGLTLRCPNTEIWRAAVSSLGCSVQGMGRGDIYNNLVNK VIDGYEDVYPCIVSESYYEIKNVNTISETHNILLLNPVVVDAGWFNSLPKEYQDALIETC DQACNSCSDKMLGEFTEEAKQKCIDEGMTVIDHSEIDTDAFREAAKASYEQLGLTDTYNE IMSALGK >gi|157101643|gb|DS480681.1| GENE 73 76625 - 77113 441 162 aa, chain + ## HITS:1 COG:no KEGG:SpiBuddy_2557 NR:ns ## KEGG: SpiBuddy_2557 # Name: not_defined # Def: tripartite ATP-independent periplasmic transporter DctQ component # Organism: Spirochaeta_Buddy # Pathway: not_defined # 1 162 1 162 162 115 41.0 7e-25 MKKFFEILNKAELAITMTCVVGFTSVLMLGAAVRLLGHPLNWCNDVALLMLAWTTFLGGD VAFRAGRMVNVDILIAKLPVRFQKAVAVMVYAVLLVMMFMMVKQGALLCVNVGKRTLEGI PFLSYVWVAASIPVGFSLMFITAVKRLYDLMMSSDTTVISKM >gi|157101643|gb|DS480681.1| GENE 74 77127 - 78404 566 425 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 [Lentisphaera araneosa HTCC2155] # 1 424 5 428 432 222 30 5e-57 MIYVFIVFLIFLVMGVPVAFAIGASGMVFFLLNTHFPLTHIAQLPLTQVQSISMLAIPLF ILAGNLMNNGGVTKRLVSLATLLTGHMRGGLAQTSVVLSTLMGGVSGSATADAAMEARLL GPDMIKSGYSKGFSANVLTFTSLITATIPPGVGLIIYGTTGSVSIGKLFSAGIFVGLYMM IVLMVSVAVIARMRGYKPVRQNRATWKEIGINLKETVWAVMFPVILLVGIRMGFFSPSEV GAFACFYSVFVGAFIYKELTIKSFITALRDSIADIGSIMMIVSLTALFGYGIPIDKIPQK MTTFITGITTNPSLVLILIIILLVFVGMFMEGSVVILLLTPILLPMCQSVGIDPVHLGLL MCVIVTMGVNTPPVGMSMYTVNTILDCSLQEYTKSMIPFLAAILLAIGLLAFLPDVVLFL PRLLY >gi|157101643|gb|DS480681.1| GENE 75 78436 - 79287 658 283 aa, chain + ## HITS:1 COG:lin0819 KEGG:ns NR:ns ## COG: lin0819 COG0656 # Protein_GI_number: 16799893 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Listeria innocua # 3 283 2 274 274 301 51.0 1e-81 MKNIRDCFVLNNGVKIPCVGYGTWRTPDGIDTVQAVREAIGCGYRHIDTAQLYGNEESVG KGIKESGLNRSELFVTTKLKNTDQGYDSTLRAFEGSMKRLGLDYLDLYLIHWPVPAVFKD DWKQVSRDTWRAFERLYGEGLVRSIGLSNFLPHHIDNIEVSAHIKPSVDQLEIHPFFTQK 
ETAAYCRRKGIQVQAWIPLGHGTILDNPVIKEIGLRYHKTAAQTALRWELQQDILPIPKS LNPERMAQNADIFDFNLTQEECLRITGLGETGRTGPDPDNIDF >gi|157101643|gb|DS480681.1| GENE 76 79388 - 80659 1306 423 aa, chain - ## HITS:1 COG:alr4255 KEGG:ns NR:ns ## COG: alr4255 COG0334 # Protein_GI_number: 17231747 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Nostoc sp. PCC 7120 # 8 417 27 434 437 427 51.0 1e-119 MNKNKYNPYENMLAVLDEAASRLGLKEADYITLRYPEREMIVSIPVRMDNGEMKVFEGYR VQHNSARGPYKGGIRFHQNSDLDEVKALAAWMSFKCAIVNIPYGGAKGGIKVDPSKLSRD ELIRLTRRYTTRILPIIGPDQDIPAPDVNTNGEVMGWIMDTYSMFKGHSVPGVVTGKPIE IGGSIGRTEATGRGVTIITRQCLEHLGMSYENSAYAIQGMGNVGGTAAQILYDKGCKIVA VSDYSGGVYNENGLDIPAIRTYLSDKTKALIDYVSDDVKHISNDEVITCCCDVLIPAALE NQITGENAAGVQAKVIIEAANGPTTVEADKILEEKGIVVVPDILANAGGVVVSYFEWVQN IQSMAWDLDEVNRTLKKIMNKAYDEVDAMSRDNKVTMRMGAYMVAINRICTAGKMRGGPI SMA >gi|157101643|gb|DS480681.1| GENE 77 80656 - 81426 680 256 aa, chain - ## HITS:1 COG:Cj1541 KEGG:ns NR:ns ## COG: Cj1541 COG1540 # Protein_GI_number: 15792849 # Func_class: R General function prediction only # Function: Uncharacterized proteins, homologs of lactam utilization protein B # Organism: Campylobacter jejuni # 1 248 1 248 255 277 52.0 1e-74 MYQIDLNSDMGESFGAYSIGNDDEIIKYVTTANVACGWHGGDPMVMDKVVKKAAERKVAV GAHPGYPDLMGFGRRKMVLKPSEVKNYIKYQVGALMAFTASYGMKLQHVAPHGALGNLCQ YDRDVSRAICEAVYEIDPSIRIFYCAGAVLGDEAEKMGLEALSEIFADRAYMDDLSLVPR GMDGAMITDEDEAIRRCVKMIKEGKVTSVSGKELDIKGDTLCVHGDGAKALAFVSRIRDA FEAEGIQINNFMGGRS >gi|157101643|gb|DS480681.1| GENE 78 81521 - 82546 875 341 aa, chain - ## HITS:1 COG:FN0436 KEGG:ns NR:ns ## COG: FN0436 COG1984 # Protein_GI_number: 19703774 # Func_class: E Amino acid transport and metabolism # Function: Allophanate hydrolase subunit 2 # Organism: Fusobacterium nucleatum # 8 327 9 328 336 256 40.0 4e-68 MGMVIVNPGIYTTVQDEGRFGYEQFGVSPAGPMDRRSFHIANLLVGNDIGEAELEMTIMG AEIRFTGPAVIALTGADMSPLLNEAPVCMYRAFPVGAGDVLRMQSVSGGCRTYLAAAGGL DIPVVMGSRSTLVKNGIGGYQGRPLKKGDAIGLRQNMSTIENLPARWMLAEHSEPGCRQI 
RVIAGPQDDCFTGKGLEDFFHGTYKVTGDSDRMGYRLTGPCPEHVADGNIISDGIVMGAI QVPTSGQPIVMMADCQSIGGYTKIATVITADLPAIGQCKAGDEIRFIPVDIMQAQQAYAD YYREMEMLKAKFETPVAYRPPRLFQVKVGGTSFQVKVEEHS >gi|157101643|gb|DS480681.1| GENE 79 82550 - 83269 640 239 aa, chain - ## HITS:1 COG:FN0437 KEGG:ns NR:ns ## COG: FN0437 COG2049 # Protein_GI_number: 19703775 # Func_class: E Amino acid transport and metabolism # Function: Allophanate hydrolase subunit 1 # Organism: Fusobacterium nucleatum # 10 230 24 248 262 157 37.0 2e-38 MSGAVLRLCGDCAVTVEFENEISIETNRKVIALKYRIEGEHVPGIRELVPSYRSLLIHYD PLKIPYNELERLVEEWTGDAGEIALPKPVITEIPVLYEGEDIGEIARIEKKTVDEIIKIH SQSDYFVYMLGFAPGHPYTARFEHPFSFRRRESPRVRIAGGSVVVQQGLSDIIPFDQPCG WNIIGMTPIKVCDYRKENPFLLRAGQWIRHIPVSRAQYEDIKSQADAGIYVCRTYEKEA >gi|157101643|gb|DS480681.1| GENE 80 83328 - 84011 779 227 aa, chain - ## HITS:1 COG:AGpA472 KEGG:ns NR:ns ## COG: AGpA472 COG0684 # Protein_GI_number: 16119556 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 224 3 222 227 168 44.0 7e-42 MANVGCRIRKEFERPDRKLVEAFKDIPVANIDDCMNRTAAVDSQLKPMNKARLLGTAFTV KAPAGDNLMFHKALDMAKPGDVLVIATFGSQSRSLCGEIMTRYAMSKGLAGFVVDGCIRD SVEIGQITDFPVYAKGVTPNGPYKNGPGEINFPVSCGNQVICPGDILVGDGDGLLVIKPE EAAELAERAKKVSQDETKQFAGIAAGTGLNRDWVDAKLESIGVEYVD >gi|157101643|gb|DS480681.1| GENE 81 84148 - 85071 776 307 aa, chain + ## HITS:1 COG:L0217 KEGG:ns NR:ns ## COG: L0217 COG0583 # Protein_GI_number: 15672359 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Lactococcus lactis # 3 145 2 145 273 69 29.0 6e-12 MDLSEIETFLTIVNTKSITKTADILFLSQPTVSHRLKALEKELGFPLIVRKKGFKNVELT SKGSEFVSIAERYISLWKETQVLAQDRDSTLLSIGCTDSLNIALFAGFYRQMEASGCPID LDIHTHQSSELYSILDRHDIDIGFVYYHLHYKNILSERIYQERLYLVQSDEPAICKSLIH TEELDPSLELFLNWDANFQIWHDQWLTGYSRPRIIVDTITLLNHVWTDKKSWMIAPESVI NELYHYRRLYVSELKNPPPDRVCYKIKHRFPSSATQTAVAYFEQALGNYLCQVKSSLRIG EIWGADI >gi|157101643|gb|DS480681.1| GENE 82 85178 - 86356 1256 392 aa, chain + ## HITS:1 
COG:PAB0525 KEGG:ns NR:ns ## COG: PAB0525 COG0436 # Protein_GI_number: 14520985 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Pyrococcus abyssi # 2 388 1 385 389 329 45.0 7e-90 MLTTAERMEHMPFSGIRVVMEKATQMDARGEHVIHLELGRPDFDTPQVIKEAAYKSLEKG NVFYTSNYGSMELRTAVAEKLRVENNIHYDPSEILITVGVGEGTFNAFGAYLEEGDEVLI PDPVWLNYIHVPEYFGAKAVPYTLKQENDYQIDMEELESKITDRTKMVVIISPNNPTGGV LERETLEKLANIAIRHDLYVISDEIYEKLIFDGEKHISIASLPGMKERTITLNGFSKAYS MTGWRLGYMAAPKDIISASVRLHQHINTCASSFVQEAGITALKQAGPDVQKMLAEYQRRR DYVVEAINQIDGLSCNKPKGAFYLFINIKELGKSAMEMAEYFLEEAKVAMVPGTAFGAAG EGYLRLSYASSYENLAEACARIKAAVEKLRSK >gi|157101643|gb|DS480681.1| GENE 83 86386 - 87186 779 266 aa, chain - ## HITS:1 COG:ECs2479 KEGG:ns NR:ns ## COG: ECs2479 COG1349 # Protein_GI_number: 15831733 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 252 1 252 252 210 44.0 3e-54 MVLKDRLDHLRGIVKTEKKVKVTELSQRFDVTEETIRRDLERLELEGLITRTHGGAVLKV ENMFDRLGYTHRSRTNVEEKKKIAQIVASIIPMQATLFADASSTVMEALTLMADRDDITV LTNSIPVLSSLNQSGLNIMSTGGANNRVSCSLQGIIARTTIMNYHVDFVLTSCKGLKLDE GAYDSREGETEIKQLMISRGQKTIMMADHTKFDRTSFVKYCSFQQIDILVTDRRPADEWM ELLEEYQIRILYPEAEDAYIKETGNS >gi|157101643|gb|DS480681.1| GENE 84 87295 - 88122 930 275 aa, chain - ## HITS:1 COG:STM3253 KEGG:ns NR:ns ## COG: STM3253 COG0191 # Protein_GI_number: 16766551 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Salmonella typhimurium LT2 # 8 274 9 279 284 157 35.0 3e-38 MLKTLDEVLPKYINADAAVGAFNLALFPDTKVLIEAAEEMNAPVILQISPVVAEFMGYDY WGMIAGEMARRSTAEVVLHLDHANMVEPIWKALDAGFSSVMFDGSQLPFEDNVRITNEVV KRAERYGASVEAEIGSVAYLGRDSHKDQLTDPKEAAAFADASGCGCLAVSVGTTHMMRTQ TANIQYDLLESIQNEVKVPLVIHGSTGLPDSQLVKMRGYHVCKVNIGTALRVAFDKGLRA ELEERPNDYIYIDLLKRPLEDEKQVVRKKMELLGF >gi|157101643|gb|DS480681.1| GENE 85 88136 - 89503 1488 455 aa, chain - ## HITS:1 COG:APE2521 
KEGG:ns NR:ns ## COG: APE2521 COG0683 # Protein_GI_number: 14602118 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Aeropyrum pernix # 63 455 39 429 430 261 36.0 3e-69 MYKKFLAAALAVTMVLGLAGCGGSGGSTQAASSQEKESSAAPAGSTGESTTAQAEGADAP AVIKVGFLTPLSGNNATYGAQCKAAGQMIADVINEEHPEMAMALAKTAGIPALNGAKIEL VFADSKGDPTTATSEAKRLITEEGIVALTGQYTSAITKAVAVVTETYGIPLLTAGSSPSL TTPETGLEWYFRFGPNDTYYIRDSFEFIKQLNEEQDAGLMTVALVSEDTEFGANIRIEEE RMAEEYGIEVVENITYSSSATNVSSEVMKIKAANPDVIMMSSYASDAILYLKTFKEQNYV PKMLLGQRGGFVQTDFLDAMGKDTEFIYTTGGWSADIDTDTSKQLIELYKQYTPDGADLS EGHCKDMINVLLIALGINQAGTTEPEALNEALRNLEVDTSTLPIPWNAITMDEYGQNTSA NAFVLQMRDGKYQTVYPSDYAAIEPVLPMPAWNER >gi|157101643|gb|DS480681.1| GENE 86 89534 - 90403 699 289 aa, chain - ## HITS:1 COG:no KEGG:Oter_0115 NR:ns ## KEGG: Oter_0115 # Name: not_defined # Def: xylose isomerase domain-containing protein # Organism: O.terrae # Pathway: not_defined # 1 254 1 255 286 139 31.0 1e-31 MYTNLNPRTMGLNHHPYEDLLAAAGKCGFGGIEVPAHAFGSVEKAREAGKVLESRGMKWG LMMGPCDMFRAGDEEFDKALEQWARWLERARAAGCSRAYNHFWPGADERDFEENFEWHRN RLQRIYHIMKENGFQYGLEFMGARTVCSKFRYPFIRTISGTMALADSVSREIGFVFDCIH WYTSGARKDDLYLYLNNMERVVNLHLDDAYPGRGPDQQEDRERAMPNQWGIIDSTAIVRA FHEKGYEGPVIVEPMAPTTERYETMKLEEAVREAALCLNGVLQEAGVFI >gi|157101643|gb|DS480681.1| GENE 87 90419 - 91057 723 212 aa, chain - ## HITS:1 COG:FN1417 KEGG:ns NR:ns ## COG: FN1417 COG0235 # Protein_GI_number: 19704749 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Fusobacterium nucleatum # 12 208 9 208 222 98 32.0 1e-20 MDSRIKTEIMDEIMEIGASMEDMGLVNTFEGNLSIKDGDLLYITPGRTSKKKLTHDNICV FDESTGEKLWGGRPSSETKMHRGAYTVKEGLHGVIHCHAPYLTAHAVTHVPLDFKCHPEL LFHFKDIPVAPYGMPGSDEIIENARPYLLHRNLVLMANHGVLSIASNLALACQRVVAAEK FARVMCIARQIGEPVDIPETEIYRLLGRELEL >gi|157101643|gb|DS480681.1| GENE 88 91200 - 92687 1040 495 aa, chain - ## HITS:1 COG:TM1073 KEGG:ns NR:ns ## COG: 
TM1073 COG1070 # Protein_GI_number: 15643831 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Thermotoga maritima # 73 490 71 474 476 225 31.0 2e-58 MKESLYAAADIGATGIKMAAAVYDGRKLRIVDTYSEPNLPKVKDGHEFADIGHMLKTIRD GLGWLGENGRFVSLGIDTYGNGYGILDAEGKLIQEPYHYRDRRIDGIMDQVHRCFTDWQL YEQMGNYPVKTRALFHLYRDVLDCSPNIMRGERFLPLSSLLEYLITGQASVERTIASVLY LLEQDGRQWNYEVFRKLGIPEKLFGPLSEPGRPNGSITRSFAAGAGIEGVPVISVAGHDT ESALMAAPGLDKTKVFVSLGTSFIFGARVKAPVVNRESFHDRFKNMRGVGGTYSLCKDFP GFWILERCMEQWRKQVPRLDYEAVCAAAEKVRDNRTFIDISDDRFRVSGDNLPETIREYC LETGQKCVEGIGDTSRCLFESYALYLKWNIRRLSRITGETYRELVAVNGGVRNRLLMQMF ADAAGIPVVAGSPLASVGGNLLMQLYAAGEAKTLWELEQIASATWEPVVYESSHLSRWDE WLEYLEHRGLGMYRT >gi|157101643|gb|DS480681.1| GENE 89 92767 - 93486 238 239 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 7 227 5 224 305 96 31 6e-19 MSDEKLLEVVDLQAGYNGMAVVHDVSFSVREGEILAILGSNGSGKTTTLRAITGTIKPMS GIIRYKGEDITGMPPFKLVSKGIAMVPEGRMLFGGMTVEDNLLMGAYLNQDKKAIQERLK LVYELFPRVEERKKQTASTLSGGEQQMVAIARGLMSDPKLLILDEPSLGLMHKLVQEIFK FVKEIAAIGITVIIVEQNAVDTLALCDYAFVIQNGESVIEGRGEDLLSNEDVKRAYLGG >gi|157101643|gb|DS480681.1| GENE 90 93479 - 94210 230 243 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 216 1 215 305 93 29 5e-18 MGKILEIKNATKRFGGLVANEAVTFDVEEGEIVGVVGPNGAGKTTLFNSISGAHTLTEGN VFFKGKDITSMKPYDICKLGIGRTFQIPQSLNDMMVYENVLVGALCKRHNIEEAMKHVDQ ILELCGMLDMKYVYAGKLNVPQKKRLEIARAMATDPELLLLDETMAGLTATERKDAVNLI KKINTMGVAILTIEHNMDVVMNVSRRVVVLVSGKVLVVGSPDEVTSNQEVINAYLGGGAK KDE >gi|157101643|gb|DS480681.1| GENE 91 94213 - 95226 1048 337 aa, chain - ## HITS:1 COG:BMEII0874_2 KEGG:ns NR:ns ## COG: BMEII0874_2 COG4177 # Protein_GI_number: 17989219 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Brucella melitensis # 
33 312 19 297 315 160 34.0 3e-39 MKKTLRDNKRWFIGLGIGVLAAVIFPQVVRIRFVWNLLSLILVWAIVGMGWNVIGGYCGQ VSNGHSLFYGIGAYTVGLAVSYFKISPWISIWIGVGISMLIAFIVGKPLLRLKGHVFAIS TMALAECTRIIFINWKWCGGATGVYIYSKGVNEWAFMQFKNPLNYYYVFLIFTVAVLVLI KCLDKSKFFYYLRTIKGNEMAAESVGINAASYKIRAYMLSAAIVSIGGSLYAQFILYIDP SMLMTLNVSMMIVLTAVMGGVGTVLGPVLGAIVLTSISEYSRVYLGQYGGLDMILYGTLV ILIVLFIPGGILSILEGRYERIVSYFKNKKKSSVAGG >gi|157101643|gb|DS480681.1| GENE 92 95229 - 96089 979 286 aa, chain - ## HITS:1 COG:BMEII0874_1 KEGG:ns NR:ns ## COG: BMEII0874_1 COG0559 # Protein_GI_number: 17989219 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Brucella melitensis # 1 284 10 294 330 153 35.0 4e-37 MSMFMQALFEGIINGSTLALVAMGIALVWGVMGILSFTQGEFLMIAMFCSFYLNLYFGLD PIVSLPICMAIMFGIGFIVYKLIIARALRGPVLSQRLITFALSMVLVNLALLLFSGDFKT IPEVAISGSIDLGFMVLSKQKIVPFVISVCVAGFMFLFLNKTRTGKAIRATSMNKTAAGL VGINPEKTYALAFGLSAAIAGAAGCALTYFYYIYPNVGANFQLFGFIAVVMGGFGSIPGA FFGGLIMGLADSFTGVYMNTAFKYVGICVVFLLLVQFKPKGLFGGK >gi|157101643|gb|DS480681.1| GENE 93 96406 - 97167 719 253 aa, chain - ## HITS:1 COG:ECs2479 KEGG:ns NR:ns ## COG: ECs2479 COG1349 # Protein_GI_number: 15831733 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 1 252 1 252 252 224 44.0 1e-58 MAQKDRLEQIRQIVRLEKKVVVANLSNQFGVTPETIRRDLEKLEEEGLVVRTYGGAVLNQ TESSEKIDFVKRSETNVEEKRAIAGIVSEMVPPNASIGCDASSTVMETLKYLSEREDILV LTNSVKVIREMERSRFGILSTGGRVNRQSYSMQGGVARSTIQEYHLDMVLISCKGMSMEG GIFDSHELEADVKKSLIERGHKVYLLADHSKFGRVAFMKLTDLEQIDVVVTDRKPSDEWM RLFSRNHVEVYYS >gi|157101643|gb|DS480681.1| GENE 94 97267 - 98247 1086 326 aa, chain - ## HITS:1 COG:ydjG KEGG:ns NR:ns ## COG: ydjG COG0667 # Protein_GI_number: 16129725 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Escherichia coli K12 # 1 325 1 325 326 413 
60.0 1e-115 MKTIKIGTLDKPISRIGLGTWAIGGGPAWGGDKSKDESIATIQKCPELGVNLIDTAPGYN FGQSERIVGEALNGMDREKVAVMTKFGIVWTREGALFNKVGDIQLYKNLSRESILEEVDL SLERLGTGYIDIYMSHWQAVEPYYTPISQTMELLGELKEKGKIRAIGAANVTAAQVEEYI KYGSLDIVQAKYSVLDRAVEEELLPLCRENGIVLQAYSPLEMGLLSGALPRDYKPVGAQC NKKWFQPDNMPRVMDFLDRLEPMCRKYDCAVADLAMAWVLAQGDKVILLSGATTEEQIRK NTRADELELDAEDVVAIREMAEALDS >gi|157101643|gb|DS480681.1| GENE 95 98250 - 99215 895 321 aa, chain - ## HITS:1 COG:ydjH KEGG:ns NR:ns ## COG: ydjH COG0524 # Protein_GI_number: 16129726 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 8 316 12 320 322 334 53.0 2e-91 MKNNECCEVVIVGAAHTDLQLYPVGADVLDTASYSVKQMVLTVGGDALNEATVITRLGHK VRLVSCVGDDVIGSLVLEHCRKNNIGTEYIKVDSSKVTSINVGLIREDGERTFINNKSGS IWTFCPEDVERESVSRGRILSFASIFNNPLLDESLMIPLFERAREKGMLICADIVGCKRG ERLEDIQRALSYVDYFFPNYAEAAAITGKKDLEEIAGTLLDCGVRNVIIKIGKRGCFIKN QEEAFIVPAFADTQCLDTTGAGDNFASGFICGLLEGKGLKECAAFANCTASLSVESVGAT TGVRNREQVEARYQEYLKMEG >gi|157101643|gb|DS480681.1| GENE 96 99205 - 100059 861 284 aa, chain - ## HITS:1 COG:ydjI KEGG:ns NR:ns ## COG: ydjI COG0191 # Protein_GI_number: 16129727 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Escherichia coli K12 # 12 279 12 278 278 331 62.0 1e-90 MLVSTKDMLVKAQKEHYAVANFCIWNVEMLSGVMKACEKLQSPVILSFGSGFLVNTDINH FVNMMRSYATQTSLPCSIHWDHGRSFEIVSHAIDIGYNSLMIDGSAYSFEENIRMTREVV DKFHPMCIPVEAELGHVGAETDYEEALNHYMYTDPSQAAEFVEKTDIDSLAVAIGNQHGA YTAPPQINFEILEKVRRDVSIPLVLHGASGIGDEDIRHAISLGITKINIHTELCEAAMSA IQAHEKGEGYQALNIRVRDAVQKRAEEKILLFGSNGKAEGWFEK >gi|157101643|gb|DS480681.1| GENE 97 100098 - 101081 936 327 aa, chain - ## HITS:1 COG:ydjJ KEGG:ns NR:ns ## COG: ydjJ COG1063 # Protein_GI_number: 16129728 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 327 21 347 347 477 71.0 1e-134 
MPVPAEDEVLIKVAYVGICGSDVHGFECGPFIPPKDPSQKIGLGHEFSGTVVAVGAKVTR FKEGDRVLCEPGIPCGKCEFCLRGHYNICPQVDFMATQPNYKGALTNYLTHPESFTYHLP AHMDMVEGALVEPAAVGMHAAELAGVKPGKNVLILGAGCIGLMTLQACVLMGAERIVVAD VIERRLEKALQLGAWHVINGRKEDTVDRSRALFGGDGADIVFETAGSRVTAMQSFASVRR GGEVMIVGTIPGETPVDFLKINREVKVQTVFRYANNFPMTIQAISSGRFDVLTMVTDEYT YEDVQKAFEESLSRKAEIIKGVIKISD >gi|157101643|gb|DS480681.1| GENE 98 101162 - 102241 973 359 aa, chain - ## HITS:1 COG:ydjL KEGG:ns NR:ns ## COG: ydjL COG1063 # Protein_GI_number: 16129730 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 1 359 1 357 358 550 71.0 1e-156 MKALAKFGRPSDFGTYRMIDIPEPECGDNDIILEVKAAAICGADMKHYKVDNGSDEFGSV RGHEFAGEIVRVGKNVKDWRPGQRIVSDNTGHVCGRCPACERGDYLLCSEKVNLGLGYGY SGGFTKYCRIPGEILAIHKRAIWEIPEGLEYEEAAVMDPICNAYKAIAQRSGFLPGENVV IFGTGPLGLFSVQIAKIMGALNIVVVGLEEDVKVRFGVARQLGATEVVNGSAEDVVQRCQ EICGGSDSIGLVVDCAGANICLKQAIEMVRCNGEIVRVGMGFKPVGFSINDISMKAVSII GHMAYDATSWRNALQLLAAGRIQVKPMITHRLGLSRWEEGFEAMANKEAIKVILTYDFD >gi|157101643|gb|DS480681.1| GENE 99 102725 - 103693 792 322 aa, chain - ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 322 1 324 324 393 58.0 1e-109 MKIVVLDGYTENPGDLSWEGLQELGELTVYDRTPVGDTAEIIRRIGDAQAVFTNKTPLTK EVFEACGHICYVGVLATGYNVVDCETASAKGIPVCNIPTYGTAAVAQYTIALLLEICHHI GHHSQAVHEGRWQSCPDWCFWDYPLIELAGKTIGIIGFGRIGQGTARIAQALGMKVLAYD AYRNEALENENCHYADLDELYRMSDVIALHCPLFPETAGMINKESIAKMKDGVIILNDSR GPLIVEEDLKEALNSGKVGAAGLDVVSTEPIKGDNPLLQARNCFITPHIAWAPRESRQRL MDIAVANLKSFMDGSPVNVVNK >gi|157101643|gb|DS480681.1| GENE 100 103945 - 104997 989 350 aa, chain + ## HITS:1 COG:CAC3046 KEGG:ns NR:ns ## COG: CAC3046 COG1316 # Protein_GI_number: 15896297 # 
Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 48 278 45 272 341 105 30.0 2e-22 MNLNWKTILKKAGIILLSTLLFALCSGLLYITKMAGSINITRPEDEPELASAMTNENLDT ETQEKLGGYWTIAVFGVDSRNGKLGKENNADVQMLCSINRDTGEVRLVSLYRDTFLMNDT TNSGYGKLNQSYFLQGPSGNISAINTNLDMKVEDFVSFNWNSAADAINLLGGVDIELSKA EFYYINAFITETVNITGIPSTHLEGPGMQHLDGVQAVAYMRLRQMDTDFKRTERQRSVIE QVFDNARKADISTLIQVFHTVAPQIMTSISEEEFMDIAYNIKNYHINGTAGFPFEQTTAS LGKTGSVVIPSTLESNVEELHRFLYDDDAYTCSGQVRDISREIIKRAVKN >gi|157101643|gb|DS480681.1| GENE 101 105010 - 105744 649 244 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 241 1 231 234 83 26.0 4e-16 MIRIAVCDDEPCFTQQISHVIANHARDISPSPETVLYTNSGQLLYDVEEGAHFDLLLLDI EMPEKDGMSLAASLRRHLPLSLIIFITSHTQYAVKAYELSVFRYIPKSEMETCLPLALKD ASCILKQSSSDAYIIESARRIQKLSAEDIIYVYKQQKYSVIAAKGAEIPVRKPLAQVLEE LNTLTAGGTGSFLLVERGYIVNLFHVEKLEDEQIYLDNGSVLPVSRNRLKETREAITRYW RKML >gi|157101643|gb|DS480681.1| GENE 102 105741 - 107046 1087 435 aa, chain + ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 110 434 106 433 433 80 24.0 5e-15 MTMNHWYQSMYQMFELVSDFIEAWLCYGFVGLFLPDRVRGKVLFFILSLVLVGSVRAMDI LGIDPIISTLWFVFYICMTTVVLFQVNLFYAVSLVSFYILCLYVINYFCMSVMGVIAGNR QFAQFILNQLSLLRCIYLAADTALLFLLYMLIRRAFKGGLLYSPRMLFTISLLGILGITF LSVVTLQDISVITLFSWSLCLVMVLCFIFLLLFYSNYMKEHELRNILELKDQMVRQEYEM VRQLQQEQQSLSHDLKNHLLVMDTLLKEGKYQEARNYIGQLGIPLERLAPSIWTGTSTLD VLLNHTRNRCIQSHITFTVQADAVDLRPMEDQDICSLFANLLDNAYEAAHSMPDGQGWIL FKMRKAREMLFLDISNSSPAPPCVKNGALISRKQDGRLHGLGLNRASATAQEYGGQLTYG YEGGVFTVSISFLGE Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:42:32 2011 Seq name: 
gi|157101642|gb|DS480682.1| Clostridium bolteae ATCC BAA-613 Scfld_02_23 genomic scaffold, whole genome shotgun sequence Length of sequence - 87517 bp Number of predicted genes - 77, with homology - 74 Number of transcription units - 32, operones - 22 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1597 1744 ## Closa_3788 cell wall binding repeat-containing protein + Term 1604 - 1638 0.7 + Prom 1644 - 1703 5.5 2 2 Tu 1 . + CDS 1730 - 2698 658 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 2728 - 2776 0.9 + Prom 2762 - 2821 7.8 3 3 Op 1 12/0.000 + CDS 2953 - 3822 337 ## COG0582 Integrase 4 3 Op 2 . + CDS 3845 - 4624 411 ## COG0582 Integrase + Term 4774 - 4809 -0.3 5 4 Op 1 . - CDS 4736 - 6013 862 ## COG0582 Integrase - Term 6030 - 6085 8.2 6 4 Op 2 . - CDS 6093 - 6296 205 ## Tresu_1933 Excisionase from transposon Tn916 7 5 Op 1 . - CDS 6726 - 6965 303 ## gi|160937775|ref|ZP_02085134.1| hypothetical protein CLOBOL_02667 8 5 Op 2 . - CDS 6962 - 7405 224 ## lmo1099 hypothetical protein - Prom 7505 - 7564 4.4 9 6 Op 1 . - CDS 7897 - 8022 95 ## 10 6 Op 2 . - CDS 7994 - 8788 187 ## COG4124 Beta-mannanase 11 7 Op 1 3/0.000 - CDS 9009 - 10133 337 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis - Prom 10159 - 10218 3.5 12 7 Op 2 . - CDS 10323 - 11510 429 ## COG1004 Predicted UDP-glucose 6-dehydrogenase - Term 11976 - 12021 -0.8 13 8 Tu 1 . - CDS 12160 - 12996 479 ## COG0582 Integrase - Prom 13084 - 13143 5.4 + Prom 13141 - 13200 7.2 14 9 Op 1 . + CDS 13374 - 13703 241 ## CD1101 putative mobilization protein 15 9 Op 2 . + CDS 13664 - 14998 1004 ## COG3843 Type IV secretory pathway, VirD2 components (relaxase) 16 9 Op 3 . + CDS 15010 - 15366 270 ## EUBREC_3582 hypothetical protein + Term 15431 - 15457 -0.7 + Prom 15963 - 16022 7.0 17 10 Tu 1 . + CDS 16053 - 16754 266 ## COG0515 Serine/threonine protein kinase + Term 16758 - 16807 8.0 - Term 16747 - 16794 16.0 18 11 Op 1 . 
- CDS 16802 - 17002 265 ## gi|160937791|ref|ZP_02085150.1| hypothetical protein CLOBOL_02683 19 11 Op 2 . - CDS 17006 - 21667 3326 ## CD1105 putative DNA primase - Prom 21690 - 21749 2.4 - Term 21694 - 21733 6.4 20 12 Tu 1 . - CDS 21801 - 22400 266 ## Bcell_0463 helix-turn-helix domain protein - Prom 22442 - 22501 2.4 - Term 22451 - 22481 1.1 21 13 Op 1 . - CDS 22503 - 22706 57 ## gi|160937794|ref|ZP_02085153.1| hypothetical protein CLOBOL_02686 22 13 Op 2 . - CDS 22756 - 23232 187 ## FMG_0453 hypothetical protein - Prom 23427 - 23486 6.3 - Term 23525 - 23563 3.1 23 14 Op 1 . - CDS 23608 - 25698 1872 ## COG0550 Topoisomerase IA 24 14 Op 2 . - CDS 25695 - 26462 737 ## CD1107 hypothetical protein 25 14 Op 3 . - CDS 26452 - 26715 301 ## gi|160937800|ref|ZP_02085159.1| hypothetical protein CLOBOL_02692 26 14 Op 4 . - CDS 26739 - 28685 1494 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 27 14 Op 5 . - CDS 28717 - 31122 1963 ## COG3451 Type IV secretory pathway, VirB4 components 28 14 Op 6 . - CDS 31025 - 31474 271 ## CD1111 hypothetical protein 29 14 Op 7 . - CDS 31481 - 31942 306 ## COG4725 Transcriptional activator, adenine-specific DNA methyltransferase 30 15 Op 1 . - CDS 32081 - 32944 902 ## Ethha_1894 hypothetical protein 31 15 Op 2 . - CDS 33020 - 33235 244 ## Ethha_1893 putative conjugative transfer protein 32 15 Op 3 . - CDS 33300 - 35105 1809 ## COG3505 Type IV secretory pathway, VirD4 components 33 15 Op 4 . - CDS 35102 - 35578 343 ## BLJ_1242 hypothetical protein 34 15 Op 5 . - CDS 35622 - 35834 307 ## BLJ_1241 hypothetical protein 35 15 Op 6 . - CDS 35852 - 36985 615 ## BLJ_1240 hypothetical protein - Prom 37186 - 37245 2.8 36 16 Op 1 12/0.000 + CDS 37556 - 38392 807 ## COG0582 Integrase 37 16 Op 2 . + CDS 38403 - 39353 572 ## COG0582 Integrase 38 16 Op 3 . + CDS 39402 - 40556 1050 ## COG4905 Predicted membrane protein + Term 40616 - 40646 -0.9 - Term 40444 - 40472 -0.9 39 17 Tu 1 . 
- CDS 40682 - 42418 2177 ## COG1109 Phosphomannomutase - Prom 42508 - 42567 3.9 40 18 Tu 1 . - CDS 42571 - 43689 957 ## Closa_0390 CotS family spore coat protein - Prom 43824 - 43883 3.0 - Term 43869 - 43930 -0.9 41 19 Op 1 . - CDS 44014 - 44520 555 ## COG0219 Predicted rRNA methylase (SpoU class) 42 19 Op 2 3/0.000 - CDS 44540 - 45712 1449 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 43 19 Op 3 . - CDS 45709 - 46191 596 ## COG1522 Transcriptional regulators - Prom 46214 - 46273 3.2 44 20 Op 1 . - CDS 46275 - 46868 292 ## COG0309 Hydrogenase maturation factor 45 20 Op 2 . - CDS 46868 - 48451 1671 ## COG2720 Uncharacterized vancomycin resistance protein 46 20 Op 3 . - CDS 48481 - 48591 179 ## 47 20 Op 4 . - CDS 48635 - 49372 396 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) - Prom 49441 - 49500 7.2 - Term 49533 - 49579 0.7 48 21 Op 1 . - CDS 49664 - 50947 1355 ## COG3875 Uncharacterized conserved protein 49 21 Op 2 11/0.000 - CDS 51031 - 52011 1181 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 50 21 Op 3 21/0.000 - CDS 52060 - 53010 874 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 51 21 Op 4 16/0.000 - CDS 53007 - 54533 189 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Term 54571 - 54604 5.4 52 21 Op 5 . - CDS 54629 - 55756 1385 ## COG1879 ABC-type sugar transport system, periplasmic component 53 21 Op 6 . - CDS 55800 - 56711 798 ## SpiBuddy_2638 amidohydrolase 2 - Prom 56902 - 56961 8.2 + Prom 56733 - 56792 5.9 54 22 Op 1 . + CDS 56927 - 57595 517 ## COG1878 Predicted metal-dependent hydrolase 55 22 Op 2 1/0.250 + CDS 57601 - 58677 1122 ## COG1879 ABC-type sugar transport system, periplasmic component 56 22 Op 3 7/0.000 + CDS 58700 - 60511 1831 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 57 22 Op 4 . 
+ CDS 60489 - 62048 1508 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Term 61963 - 61997 2.1 58 23 Op 1 . - CDS 62144 - 62929 638 ## COG0177 Predicted EndoIII-related endonuclease 59 23 Op 2 . - CDS 62986 - 64824 1427 ## COG1032 Fe-S oxidoreductase 60 23 Op 3 . - CDS 64829 - 65323 559 ## COG2131 Deoxycytidylate deaminase - Prom 65488 - 65547 7.2 + Prom 65780 - 65839 8.5 61 24 Tu 1 . + CDS 65952 - 66779 659 ## CLB_2021 hypothetical protein - Term 67022 - 67080 6.2 62 25 Op 1 4/0.000 - CDS 67145 - 67525 268 ## COG3862 Uncharacterized protein with conserved CXXC pairs 63 25 Op 2 6/0.000 - CDS 67519 - 68751 1162 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 64 25 Op 3 . - CDS 68755 - 70215 1335 ## COG0579 Predicted dehydrogenase - Prom 70262 - 70321 9.8 - Term 70453 - 70514 6.4 65 26 Op 1 . - CDS 70529 - 71266 832 ## COG2186 Transcriptional regulators 66 26 Op 2 . - CDS 71266 - 72504 1170 ## COG0009 Putative translation factor (SUA5) - Prom 72544 - 72603 6.3 + Prom 72419 - 72478 3.0 67 27 Op 1 . + CDS 72503 - 72616 85 ## 68 27 Op 2 . + CDS 72674 - 73126 505 ## COG1970 Large-conductance mechanosensitive channel - Term 73015 - 73053 -1.0 69 28 Tu 1 . - CDS 73132 - 74829 1437 ## COG1158 Transcription termination factor - Prom 74997 - 75056 4.0 - Term 74952 - 75003 7.4 70 29 Op 1 . - CDS 75062 - 76378 1293 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase 71 29 Op 2 . - CDS 76392 - 77120 812 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 77300 - 77359 5.2 - Term 77228 - 77265 9.2 72 30 Op 1 1/0.250 - CDS 77361 - 79166 1863 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Term 79190 - 79237 1.0 73 30 Op 2 . - CDS 79256 - 80317 1126 ## COG5263 FOG: Glucan-binding domain (YG repeat) 74 30 Op 3 . 
- CDS 80424 - 82295 1896 ## COG4750 CTP:phosphocholine cytidylyltransferase involved in choline phosphorylation for cell surface LPS epitopes - Prom 82380 - 82439 10.4 - Term 82442 - 82500 11.7 75 31 Tu 1 . - CDS 82510 - 84624 2310 ## COG3968 Uncharacterized protein related to glutamine synthetase - Prom 84845 - 84904 6.7 - Term 84739 - 84789 9.7 76 32 Op 1 . - CDS 84936 - 85799 1186 ## COG0191 Fructose/tagatose bisphosphate aldolase - Prom 85820 - 85879 3.6 77 32 Op 2 . - CDS 85890 - 87386 1649 ## COG0498 Threonine synthase - Prom 87427 - 87486 3.9 Predicted protein(s) >gi|157101642|gb|DS480682.1| GENE 1 2 - 1597 1744 531 aa, chain + ## HITS:1 COG:no KEGG:Closa_3788 NR:ns ## KEGG: Closa_3788 # Name: not_defined # Def: cell wall binding repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 531 109 569 569 272 43.0 3e-71 GQDDEPDHYWYYFQANGKALTQGDNDKVSLKTVNGKKYAFDDEGRMLFGWVDEDSAERVD DTDGDGFKEGTYYFGGEDDGAMTVGWLQLDVTYDEATNDDYKYTAPVFNDDEDQTRWFYF KSNGKKIYAEDGDRTKDKTINGKKYAFDEYGAMVAEWSLDEEDLPGKSLASYSDAVESGD VNAGTASASNIITGKAFNAKYSEAWKYFNSVDDGARVSKGWFKVVPAEYLNDEKYNDDED YWYYADGSGNLYAGEFKTIKGKKYAFRNDGRMIDGLKFIYEDKDAQSLTVWADDDDPYRF DSEDDFDDNAPLYEAAGYYCYYFGDGDDGAMRTNKSTVEIDGENFNFYFEKSGGKKGAGL TGEKDDKFYQSGKLLKADSDDKYSVVQRQLVKKTDGTINSELEVVTTKDSTTTEVYNMLD DVDELLSVTKDAGIEILTIDDLNSSAYSGKADSILKAANINKDLEDLREVYIFGTKDENS NFKASELNTKDYFLVNTSGKVLDSKGRHKDGSDYYYALTGSGKIAGIYVED >gi|157101642|gb|DS480682.1| GENE 2 1730 - 2698 658 322 aa, chain + ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 102 298 408 621 621 82 29.0 1e-15 MFKKIAILGMALMLGLSSSTAAFADQTETSTTTMENCGAFLFEWRPVDCGDGHYFAILVG GNTISESDVILNRGYDYPYTFDYTVPRPWGEVPQLVNIDGIWGIPENWAILPEGSQPTLR ITLVTNNNILPTKERYIDVVSLPRNINSRDLPAEVRKYLINVDGSDAGAYDGTTTAGWEE ENGLWRYRKSDNTYVSNTWLKLDDKQYYMNEQGIMLADTITPDGYYVNAKGEKTGYIPGW 
LQDGASWKYVRKNGYYAANQWIQDTDGKWYYFDMAAVMLADTTTPDGYYVDASGVWDGQP ASAVINGENPGPGVDTAAHTEE >gi|157101642|gb|DS480682.1| GENE 3 2953 - 3822 337 289 aa, chain + ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 24 289 1 265 265 248 46.0 7e-66 MEPMESPNFLEHAELLDENTPHYLNAFHKYLIQHNMSPNTITAYIYSVRHFFTYFGQLNS NNVSLYKVYLLDHYQPQTVNMRIRALNCFLKFQGISDYRIRALRLQQKKYTDYIISQADY EYLKRRLREDEQYTFYFIIRFITATGVRVSELISFQIQDVINGYKDIYSKGNKMRRVYIP TALQEDTLRWLESEFRKNGPLFLSHLKRGISVSGIRSQLKTFAYRYHLDPKVVYPHSFRH RFAKNFIENGGDIAFLSNLLGHTSIETTRIYLRRSSTEQSLIVNQIVDW >gi|157101642|gb|DS480682.1| GENE 4 3845 - 4624 411 259 aa, chain + ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 13 249 4 233 265 203 43.0 3e-52 MKTNMQEINGPAFHAYLKKAQSSPNTIRSCLAAYRLYHSLYNDLAVNHLLCFKDYLIRHY KASTVNNRIYGINRYLDFLEEIYGSQMIHFRLTVIRTQQAAFLDNIISQEDYETFKKRLK DSGNMLWYFVVRFLACTGARVSELIQIKAEHLHLGYIDLYTKGGKVRRLHFPDALCQEAL EWLSAKGQVSGFLFVNRGGNQITARGISSQLKTLAIRCNIPPETVYPHSFRHRFAKNFLE SFNRFPIISLDVLICRYSP >gi|157101642|gb|DS480682.1| GENE 5 4736 - 6013 862 425 aa, chain - ## HITS:1 COG:SP1129 KEGG:ns NR:ns ## COG: SP1129 COG0582 # Protein_GI_number: 15900995 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 214 407 204 383 387 68 28.0 2e-11 MSKKERDEKRRDSKGRLLKSGESQRTDGRYAYKYTDTFGEPKFVYSWKLVPTDKIPAGKR PDISLREKIKQIQKDLDDGIDTIGKKMTVCQLYEKYIRQRGNVKRGTHKSRQQLMKLLSE DKIGGASIDSVKLSDAKEWALRMQEKGVAYHTICNGKRSLKAIFHMAVQDDCLRKNPFDF QINEVINDDTVPKVPLTPAQEKELLGFMQSDPVYAKYYDEVLILLETGLRVSELCGLTPA DLNFDKRFVNVDHQLLRSTEDGYYIEAPKTDSGYRQVPMSAAAYKAFQRVLHRRKDGKGV VVDGYKGFLFLNRDGLPKAAVNYDSMFQGLAKKFNKFHAEPLPEVMTPHTMRHTFCTRMA NAGMNPKALQYIMGHSNIVMTLNYYAHATFHSAQEEMERLQAKSQTAAAVNAQPASESAQ 
ESKAA >gi|157101642|gb|DS480682.1| GENE 6 6093 - 6296 205 67 aa, chain - ## HITS:1 COG:no KEGG:Tresu_1933 NR:ns ## KEGG: Tresu_1933 # Name: not_defined # Def: Excisionase from transposon Tn916 # Organism: T.succinifaciens # Pathway: not_defined # 1 67 1 67 67 106 82.0 2e-22 MSNNDVPIWEKYTLTIEEASKYFRIGENKLRRLAEENPSAGWVILNGNRIQIKRQKFEKI IDSLDTI >gi|157101642|gb|DS480682.1| GENE 7 6726 - 6965 303 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937775|ref|ZP_02085134.1| ## NR: gi|160937775|ref|ZP_02085134.1| hypothetical protein CLOBOL_02667 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9475_02725 [Clostridium symbiosum WAL-14673] hypothetical protein CLOBOL_02667 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF9475_02725 [Clostridium symbiosum WAL-14673] # 1 79 1 79 79 152 100.0 6e-36 MKEPTFNGRELLPLSVMEAAHAGDAMAMEQVLRYYEDYINKLCIRTLYDSNGIPYVCVDE YMKHRLEIKLIHSIIVALK >gi|157101642|gb|DS480682.1| GENE 8 6962 - 7405 224 147 aa, chain - ## HITS:1 COG:no KEGG:lmo1099 NR:ns ## KEGG: lmo1099 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes # Pathway: not_defined # 1 137 1 136 139 100 37.0 3e-20 METIPRNDFILRHRYDAFCKAVLRNEAKSYWSEMAHRREREKSLDALTQEEMDKLSVVDD YPSDSYVFSSYGYDLLIDNELVAEAFASLPEQEQSILILHCVLDLADGEIGSLMGMSRSA VQRHRTRTLKQLRMKLMAFMPEGGKRG >gi|157101642|gb|DS480682.1| GENE 9 7897 - 8022 95 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRGITFVRNKKFTKYQIMTRVGTIISILLFVVLGFCIEVIF >gi|157101642|gb|DS480682.1| GENE 10 7994 - 8788 187 264 aa, chain - ## HITS:1 COG:AGl3016 KEGG:ns NR:ns ## COG: AGl3016 COG4124 # Protein_GI_number: 15891623 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-mannanase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 41 208 99 273 320 61 27.0 2e-09 MGVYVQGPEETDVLADSINTLAWFDRFDQTSDYKISLCLDDNKYIAFITLQPTDWDLKLV 
SDGYYDDLIIEYFKKLSSDNRANTELFVRLAHEMEMRPSYKSGWYSWQTDDAHAYVNAWV HIVNLGREYAPNVKWVWSPNRADEYTTKYYPGDEYVDYVGLTLNNTLDSRESFQQFYENE GQRDYLEAYNKPIIFGEIAEHSTSDEVRNEYIQSVFDYLGTYDKCIGFIFLNQDIESARQ YKFTDCELILDTFIENARDYICAK >gi|157101642|gb|DS480682.1| GENE 11 9009 - 10133 337 374 aa, chain - ## HITS:1 COG:SSO1299 KEGG:ns NR:ns ## COG: SSO1299 COG1215 # Protein_GI_number: 15898141 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Sulfolobus solfataricus # 2 233 67 283 422 103 29.0 6e-22 MLLPVVDEPLDLFYSVLMKIARQNPSEIIVVINGPKNEGLENLCVDFNRNLPICFTPVQH YYTPVAGKRNGIRVAMEHINPNSDITVLVDSDTVWTEDTLSELLKPFACDQKIGGVTTRQ KILDPDRKLVTMFANLLEEIRAEGTMKAMSVTGKVGCLPGRTIAFRTQILIDVMYDFMNE TFMGFHKEVSDDRSLTNLTLRKGYKTVMQDTSVIYTDAPTEWKKFIRQQLRWSEGSQYNN LRMTPWMLKNAKLMCFIYWSDMISPMMLVSVYANTIICKVLNILGCAIPTLAYTAPWWQI ILFILLGCIISFGSRNIKVMRSVKWYYTLLLPVFILVLTVVMVPIRLLGLMLCSDDMEWG TRKLEEDNDKVEVP >gi|157101642|gb|DS480682.1| GENE 12 10323 - 11510 429 395 aa, chain - ## HITS:1 COG:BH3708 KEGG:ns NR:ns ## COG: BH3708 COG1004 # Protein_GI_number: 15616270 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Bacillus halodurans # 1 395 1 388 388 444 57.0 1e-124 MQITVVGAGYVGLSLATLLGQKHEVIVLDIDEEKVAKVNSRISPVQDVYLEKFFAERKLN LTATVDTAIAYKDAEYVIITTPTNYEDETNSFDTHAVDSTIEICMANNDHCIMIIKSTVP VGYTRSVRKKYNTSHILFSPEFLRETKALYDNLYPSRIVVGTDIADPAMVIHAQVFSTIL QECANKEDIPVLIIGLDEAEAAKLFANTYLAMRVAFFNELDTFAEMNGLSTKNIIDAICH DPRIGKHYNNPSFGYGGYCLPKDTKQLKSSFHDIPENLITAVCQANHTKKGHVIKGILNK HPGTVGIYRLTAKSNSDNFRSSAVWGVMEGLSKEKQEIVIYEPLLGDAAEFMGYKVVHSF AEFKRSCDVIVANRVSSELSEVMYKVYTRDIFGRD >gi|157101642|gb|DS480682.1| GENE 13 12160 - 12996 479 278 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 9 274 1 262 265 129 28.0 5e-30 
MKQNYNEILREYRIYLTEHEKSHATIQKYVRELVWFLSFLQGEEPTKAKVLEYREQLQQS HQARTVNAKLSAIHSYLDYLGLAACKVRFLKIQHTVFVDDSRDLTEAEYHRLLDAAKRKK DSRLYHVMLAICTTGIRVSELSFLTVEALHKGKAEIRMKGKIRTILLTKELCRKLNAYAK EKGIRTGYLFCTRTGKPLDRSNICHDMKKLCRAARVNPEKVFPHNLRHLFAKCYYAIKKN LAYLADILGHASVDTTRIYVAMGTREHERTLQRMHLIS >gi|157101642|gb|DS480682.1| GENE 14 13374 - 13703 241 109 aa, chain + ## HITS:1 COG:no KEGG:CD1101 NR:ns ## KEGG: CD1101 # Name: not_defined # Def: putative mobilization protein # Organism: C.difficile # Pathway: not_defined # 1 109 1 109 109 115 57.0 5e-25 MANRKRKFVLRVPVTPEERALIQQKMAQLGTKNFSAYARKMLIDGYIVHIDTGPVRAQTA ELQKIGVNINQIARRINSTGTVYAQDLEDIKGALAQIWQLQRSILSSQR >gi|157101642|gb|DS480682.1| GENE 15 13664 - 14998 1004 444 aa, chain + ## HITS:1 COG:SP1056_1 KEGG:ns NR:ns ## COG: SP1056_1 COG3843 # Protein_GI_number: 15900926 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD2 components (relaxase) # Organism: Streptococcus pneumoniae TIGR4 # 1 193 1 210 402 81 30.0 3e-15 MAVTKIHPIKSTLKKALDYIENPDKTDEKLFVSSYGCSYETADIEFQMLLDQAYQKGNNL AHHLIQAFEPGETTAEQAHEIGRQLADEVLQGKYPYVITTHIDKGHLHNHIIICAVDMAN QRKYISNRQSYAFIRRTSDRLCKEHGLSVVKPGKDKGKTYAEWDAQKKGKSWKAKLKIAI DAAIPQSKDFDSFLQLMEAQGYEVKQGKFISFRAPGQERFTRCKTLGEDYTEERITQRIK GIAIDRGPRRRSAGEISLRIALEDSIKAQQSAGYARWAKLHNLKQAANSLNFITEHQIDS YEGLESRLAEISAVGDAAASALKDAERRLGDMALLIKNLSAYKQLRPVVLELRNVKDKAA FQRQHESQLILYEAAAKALKEAGITKLPNLYALKTEYKKLDAERERLSAQYSEAKQKLKE YGIVKQNVDSILRTAPGKELTQER >gi|157101642|gb|DS480682.1| GENE 16 15010 - 15366 270 118 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3582 NR:ns ## KEGG: EUBREC_3582 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 109 8 115 128 133 67.0 2e-30 MSYPQYRKARRLTHECCNYCNGNCLLLDDGEECVCVQSISYSLLCRWFKVAVLPLDAALC AEITKGRDDIKRCTVCGAAFTPNSNRAKYCPDCAVQVRRKKEAERQRKRYLLSTHLGR >gi|157101642|gb|DS480682.1| GENE 17 16053 - 16754 266 233 aa, chain + ## HITS:1 COG:SA0077 KEGG:ns NR:ns ## COG: SA0077 COG0515 # 
Protein_GI_number: 15925785 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Staphylococcus aureus N315 # 4 233 133 355 502 85 32.0 9e-17 MKLIGSGSYANVYKYKDTFYNRPFILKRAKKELTDKEMARFKREFDVMNDLSSPYILEVY CYNPDKNEYIMEYMDYTLDGYIAAHNSTLTIIQRKGIAQQILRAFDYLHSKGHLHRDISP KNILIKEYDDTLVVKLSDFGLVKIPDSTLTTVNTEFKGYFNDPALVVEGFNTYGIVHETY ALTRVIYFVMTGKTNTEKITNQNLRNFVERGLNPDKTKRFQNIRDMISAFKTI >gi|157101642|gb|DS480682.1| GENE 18 16802 - 17002 265 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937791|ref|ZP_02085150.1| ## NR: gi|160937791|ref|ZP_02085150.1| hypothetical protein CLOBOL_02683 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02683 [Clostridium bolteae ATCC BAA-613] # 1 66 1 66 66 111 100.0 2e-23 MYFTVEEENLLCLYHNAKRRRTAANLRTVLPDMDKEMATLACQTADKLDAMSDADFAAQR FHFTDE >gi|157101642|gb|DS480682.1| GENE 19 17006 - 21667 3326 1553 aa, chain - ## HITS:1 COG:no KEGG:CD1105 NR:ns ## KEGG: CD1105 # Name: not_defined # Def: putative DNA primase # Organism: C.difficile # Pathway: not_defined # 239 843 162 775 1343 434 44.0 1e-119 MSYSSYDHDDLEPASTMRIERRIYFESGKADLSEAVKLPLAELLSLRAESAAAEQEVFDR LKAQAAAWEKQAGRTLFLDKVLEYARTLPVTHTSNQWEQPDNYRHIRSNMVYQMYYSISE NTRYDSAAQTSVPYSWTLSWSLRTNAPGSYRQAKIAGQDRKVFASREALDKYLNGRIKAH AHYFTEISPAIPKEYADYFKVNGCLLPGYTIEGEEPAKAAELPTQEEPMQPPTFNTTPER RASVNEVCSIFLDNHAEAQSDEPHGCWLSLPTTAEQVQAALNEIHITADNQQDCFIAGVS APEGHPLELPEDLIQSASVDELNFLAVQLQKLDAVERSQLNAVMQSLEKFQTIGQVIDYS ENTDCFVLIDAKDYRTLGDYYLNHSGLMVIPDEWKPAIDTERLGQFIAQGEQGTFTEYGY LLRTGDKWQRVHEGQPVPEEYRVMAYPAPEIMREESKVQPETAAPAKAPQPVTPILLNSQ NSTDRMKEITDRLEIGIQELFESERYKAYLTSIAKFHSYSFNNTLLIAMQGGQLVAGYNK WRDEFHRNVKKGEKAIKILAPAPFKAKKEVPKLDAQGKVVIGQDGKPVTEVQEIQVPAFK IVSVFDVSQTEGEPLPSIGMEELTGSVERYEEFFKALEQTSPVPMSFEDIPGGSHGYYHL AEQRIAIQEGMSELQTLKTAIHEIAHSKLHAIDPEALAIEQADRPDSRTREVQAESVAYA VCQYYGLDTSDYSFGYVAGWSSGKDLKELKASLETIRATAHELITTIDGHLAQLQKERQA 
QQEQPQTAPLEQAAEQPDPGSAFSKLPPKQQQEMTDSVKAMLQTLIEADLKSTGEVSQGT KEAAQEQGFTIAGDGSLEKAEAPQEAAYRLESGDYLYIQTSENGYDYTLYGPDYKELDGG QLDNPDLSLMEAGKEILAAHELPAGTMEPLTGDALDGFLEAAKQANAIPQPQAWNGIDGI LNGNPLMPDVSPADRAAALIALAEQSAPRLGREERKLIMDYADAVNDPQKIIGLINHLCE QGYELQHGKMDDVVKSQIVSEMAVARAEQTIACDPEAEPIVTILWSESPHLKDGQQMPLH EADAVFQSLDEAQRRGREQPGYTGSWYDKTRFRIDFTMQGQPDNYEGRQDFGDGDGSLIQ HIRGCHEYYEQDESWKNHVLHHEGPEAWEADKAQRDMLLHTFVPYLKQHCNLSRMEQEAQ HLLRSGDPLLPEQTAYLDALVEYVRECRPLLNQGESLPEPPQLFDFDKSLQDYKAQVKTE LEQEAATAGLTVEEYIDSGSYAPAQPNFSIYQVPPGPEGRDFRYRSYEDLQADGLLVDRK NYQLVYTAPLDKDTTLDEIYRRFNMEHPADYKGRSLSMGDIVVFRQDGEQTAYYVDEGAD YRQVPEFFSQPEKQLTPDECMTGEQIQTPRGRFYLTDRSREQMEAAGYGFHHQSEDGKYL IMANGTRAFAIPAQQDSHIKTAEMSTEQNYNMIDGMMNNAPSMEELEAKAKAGEQISLLD VAEAAKAEAKKPKQTRRTTQKPKKPSIRAQLAAAKEEQKKKSPAWEKSKEMEV >gi|157101642|gb|DS480682.1| GENE 20 21801 - 22400 266 199 aa, chain - ## HITS:1 COG:no KEGG:Bcell_0463 NR:ns ## KEGG: Bcell_0463 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: B.cellulosilyticus # Pathway: not_defined # 1 193 1 203 207 101 34.0 1e-20 MALSENIKARRTQIKMSQEYVADQLGISRQAVAKWEAGTSEPTSKNLSELASLFEMSISE LVDPQTYAEEQETQEQKLSTKQRNAKMLFGRWGAVILMNAGWDGYSSGLYGTDFPYYWLA ILAVGLVLLFITSKDMGKKHRLEALQIILGLSMIFSVFFLPRIIPLEQVGVTYLLADVIT AICAILLSLKYWRHIWKVK >gi|157101642|gb|DS480682.1| GENE 21 22503 - 22706 57 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937794|ref|ZP_02085153.1| ## NR: gi|160937794|ref|ZP_02085153.1| hypothetical protein CLOBOL_02686 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02686 [Clostridium bolteae ATCC BAA-613] # 1 67 1 67 67 125 100.0 1e-27 MYTNSRENEVRLLKRLWAELPTICPKCHQAELEHLHKKAKKSNLDWKCPACGEIYRTINM LSELPDS >gi|157101642|gb|DS480682.1| GENE 22 22756 - 23232 187 158 aa, chain - ## HITS:1 COG:no KEGG:FMG_0453 NR:ns ## KEGG: FMG_0453 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 14 158 11 157 157 63 26.0 3e-09 MKKWILIFSAIALLLFSLTFVLCSRENNLAALSRQCGIDLTIGKVVSHKDTHGGFHGDGV 
SYTVVQYPDDSIGEQMEESETWHTLPLPENLDTFLYQPYDDEVSIPEIQDGYYYFYDRHS ESRNPYDDSELFQRFSFNFTFAIYDQSSNQIYLIEYDT >gi|157101642|gb|DS480682.1| GENE 23 23608 - 25698 1872 696 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 5 633 5 655 709 471 40.0 1e-132 MSIQLVIAEKPSVARSIAAVIGAEKKQDGYLEGNGYLVSWCIGHLVSLADAGAYDERFKK WRYDDLPIIPQQWQYIIPDEKKKQFEILRSLIERTDVESLVCATDAGREGELIFRFVYQM AGCKKPFKRLWISSMEDGAIKDGFAHLKPGADYDSLYQSALCRAQADWLVGINATRLFSI LYHKTLTVGRVQTPTLKMLVDRESQIGNFKKEKYHVVHIAAGGMEAASDRFTSPDDAEAA KAACAGAQAVCVSVKREQKTEKPPRLYDLTTLQREANRLFGFTAKQTLDYAQQLYEKKRL TYPRTDSQHLTDDMQPTAESLVSGLWPLLPFAKGLDLSPQLGRVLNSKKVSDHHAIIPTM EFVQKGFDGLTEGEKKLLSLVCCKLLCAVAAPHVFEAVTATFTCAGKEFTAKGKTILTPG WKEIDLRFRTCFKADANENAPELARVLPEITEGQTFDKAEASITEHYTAPPKPYTEDTLL SAMERAGAKDMPEDAERQGLGTPATRASILEKLVQMGFVERKGKQLLPTRDGHNLACVLP DVLTSPQLTSEWETKLTAIAKGQADPEDFMHDIAEMTRSLIVGYSQISEDAQKLFQDERV AIGKCPRCGESVYEGKKNYSCGNRACQFVMWKNDRFFEERGKAFTSKIAAALLGDGKAKV KGLLSLKTGKTYDGTVLLADTGGKYVNYRVEQRGKN >gi|157101642|gb|DS480682.1| GENE 24 25695 - 26462 737 255 aa, chain - ## HITS:1 COG:no KEGG:CD1107 NR:ns ## KEGG: CD1107 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 227 3 228 244 186 51.0 6e-46 MKNKAIRMFTATLAAVLCMAAVSVTAYAGGVDPHPQPLPEETEKPTTGGIEMEPEDVPVT PKGNAALVDDFFGDKQLITVTTKAGNYFYILIDRANEDKETAVHFLNQVDDADLQALLKD GETKPEHCTCTTKCEAGAVNTNCPVCKNNMNACTGPEPEEHKPEEAKPQEEKPNGIGGLV VFLAVVLAGGGAALYFFKFRKPKADIKGGDDLDEYDFGEDEDDEEEAPEPDDTQEPGEDW LIEKEAPEPNKEDEE >gi|157101642|gb|DS480682.1| GENE 25 26452 - 26715 301 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937800|ref|ZP_02085159.1| ## NR: gi|160937800|ref|ZP_02085159.1| hypothetical protein CLOBOL_02692 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02692 [Clostridium bolteae ATCC BAA-613] # 1 87 1 87 87 104 100.0 2e-21 
MAMNKLERIEKDIEKTKDKIAALQKQLRDLEAAKTEQENLQIVQLVRDLHMTPQEFAAFV RDGALQAPPAPQPDFEQDEQEETADEE >gi|157101642|gb|DS480682.1| GENE 26 26739 - 28685 1494 648 aa, chain - ## HITS:1 COG:BS_yddH_2 KEGG:ns NR:ns ## COG: BS_yddH_2 COG0791 # Protein_GI_number: 16077564 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus subtilis # 528 647 4 123 124 110 43.0 8e-24 MKPLKPRDKVTQRMTRAGLTLDNQTTGESMNISSREAEPEYTAKPDGTAEKALERAVDIR DRHKAKQAARHGERMAKEASGPASRLQFTAEERASPELAPYIKQAEKRADRLDTAKSALP NRRVITKETVYNEAKGKARSKLHFEKVEKHPPKLKPNPASRPVQEAGLYLHGKIHEVEQE NVGVESGHKAEELAERQAGKALRNARRRNKLKPYRAAAKAERKSMAANAEFVYQKSLRDN PELAQAVTNPISRLWQKQHIKREYAKAARAAGRGAAGSAKTTASAARKAAEKGKQAASLV ARHWKGALLIGGVGLMLLFLMGGLQSCTAMFGSAGTGLAATSYLSEDSDMLGAEAAYAGM EADLQYELDHYETLHPGYDEYRFELDEIGHDPYVLTSILSALHNGVFTLEEVQGDLAMLF EQQYILTQTVETEIRYRTETSTDSEGNEYEEEVPYTYYICNVTLENRDLSHLPVSLMDEE ALSLYAAYMQTLGNRPDLFPSGSYPNASTIKEPTYYEIPPEALKDEAFAAMIAEAEKYVG FPYVWGGSSPSTSFDCSGFISWVVNHSGWNVGRQTAQGLYSLCTPVSPEQARPGDLVFFV GTYDTAGMSHVGLYVGNSVMLHCGDPISYTNLNSSYWQQHFYCYGRLP >gi|157101642|gb|DS480682.1| GENE 27 28717 - 31122 1963 801 aa, chain - ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 230 771 27 596 617 98 22.0 5e-20 MFKKQRNETSAKAPVRLTRAEQKEIQAVIRRYKGDGKPHSAQESIPYEAMYPDGVCRLTP RRFSKCIEFSDISYQLAQADTKAAIFESLCDLYNYLDASIHIQFSFLNHKIDPRQYAKSL EIRAQGDAFDDIRTEYSAILKDQLVSGNNGLVKRKFLTYTIEADSLKLARARLHRIETDL LGYFKSMGAVAWGLDATARLEVMHRMFHPDGEPFSFDWKWLASSGLSTKDFIAPSSFRFG NARMFGLGGKYGAVSFLNILSPELSDEMLADFLNTENGIVVNLHVQAIDQSKAIKTVKRK ITDLDAMKIQEQKRAVRSGYDMDILPSDLATYGQDAKELLKTLQSRNERMFQLTFLVLNT ADTRQALENDVFWAAGVAQKYNCSLVRLDYQQEQGLMSSLPLGASHIQIERSLTTSSVAV FVPFVTQELFQDGEAMYYGVNAKTGNMIMLDRKRARCPNGLKLGTPGSGKSMSCKSEILS 
VFLCTPDDVYVCDPEAEYYPLVKRLHGQVVKLSPTSKSYVNPLDINLNYSEDESPLALKS DFVLSFCELVMGGKNRLDAIEKTVIDRAVQVIYRPYLADPRPENMPILSDLHKALLDQHI PEADRVAQALDLYVNGSLNVFNHRTNVDIESRIVAFDIKELGKQLKKIGMLIVQDQIWGR VTQNRSQGKATWFFCDEFHLLLREEQTAAFSCEIWKRFRKWGGIPTGATQNVKDLLLSPE IENILENSDFICLLNQASGDRHILAERLNLSPQQLRYVENSEPGEGLLIYENVVLPFKNP IPKHTQLYQIMTTRLGERATV >gi|157101642|gb|DS480682.1| GENE 28 31025 - 31474 271 149 aa, chain - ## HITS:1 COG:no KEGG:CD1111 NR:ns ## KEGG: CD1111 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 134 1 134 134 149 55.0 4e-35 MAYVTVPKDFTKVKSKVVFGLTKRQLICFGGALLVGVPLFLLIRGRIPTSAAALLMVFAM LPGFLLALYERHGQPLEVVVHQILECCFFQPKERPYQTNNAYTALVRQFQMEQEVNAIVQ KAKKRNERKSAGQAHPCRAKRNSGRHPQV >gi|157101642|gb|DS480682.1| GENE 29 31481 - 31942 306 153 aa, chain - ## HITS:1 COG:all7280 KEGG:ns NR:ns ## COG: all7280 COG4725 # Protein_GI_number: 17233296 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Transcriptional activator, adenine-specific DNA methyltransferase # Organism: Nostoc sp. 
PCC 7120 # 1 142 39 192 210 87 34.0 7e-18 MGIDALCALPVETLAAKDCLMFLWATFPMLPEALRLIQAWGFTFKTVAFVWLKRNKKSPT WFYGLGHWTRGNAEICLLAKRGHPKRYSRSVHQFIISPIEEHSKKPDITREKIIELAGDL PRAELFARQKTPGWDVWGNEVDSDFSLSAPETR >gi|157101642|gb|DS480682.1| GENE 30 32081 - 32944 902 287 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1894 NR:ns ## KEGG: Ethha_1894 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 2 287 4 289 289 348 62.0 2e-94 MIDDLIGEWIKGILIDGIKGNLSGLFSTVNTKVGEIASDVGSTPQDWNGGIFSMLQTLSE TVIVPIAAAILALVMCYELIEMIVEKNNMHDFDTSLFFRWMFKSAFAILIVTNTWNIVMG VFDATQAVVNQSAGVIIGETSIDFDTLIPDLETRLEAMDIGPLLGLWFQTLVVGLTMHAL SICIFLVTYGRMIEIYAVTALGPIPLATLGNSEWRGMGQNYLKSLLALGFQAFLIMVVVG IYAVLIQDIGTMEDISGAIWGCMGYTVLLCYGLFKTGSLSKAVFTAH >gi|157101642|gb|DS480682.1| GENE 31 33020 - 33235 244 71 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1893 NR:ns ## KEGG: Ethha_1893 # Name: not_defined # Def: putative conjugative transfer protein # Organism: E.harbinense # Pathway: not_defined # 1 71 1 71 71 82 70.0 8e-15 MEFFNSAVDTLQTIVVGLGGALCVWGGVNLLEGYGADNPASKSQGIKQLVAGGGVALIGM TLVPLLSGLLG >gi|157101642|gb|DS480682.1| GENE 32 33300 - 35105 1809 601 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 131 554 151 562 591 205 34.0 2e-52 MRPEVKKLIIANLPYLLFVYLFGKLGQTYRLAAGADVSEKLLHLADGFSLAFENAAPSFH LFDLAVGVAGAVALRLMVYCKSKNAKKYRRGVEYGSARWGGPKDIAPYIDPVFDNNILLT QTERLTMNNRPKDPKTARNKNVLVIGGSGSGKTRFFVKPNLMQCVSKDYPASFIITDPKG GLIGEVGQLLVRSGYRVKVLNTINFSKSMRYNPFRYIHSEKDILKLVNTLICNTKGEGEK SAEDFWIKSERLLYSALIGYIWYEAPDDEMNFTTLLEMINASEAREDDPEFQSPVDAMFE RLEEKDPEHFAVRQYKKFLLSAGKTRSSILISCGARLAPFDIREVRELMEDDELELDTIG DEKTALFLIMSDTDTTFNFILTMVQSQLINLLCDRADDKYGGRLPVHVRMILDEFANIGQ IPNFDKLIATIRSREISASIILQSQSQLKAIYKDAAEIISDNCDCTLFLSGRGKNAKEIA 
DVLGKETIDSYNQSENRGAQTSHGLNYQKLGKELMSQDEIATMDGGKCILQVRGVRPFFS EKYDITRHPRYQYLSDADKQNTFDVDRYLSSLRRKKRRVVSEDETFDLYDIDLSDEDLAA Q >gi|157101642|gb|DS480682.1| GENE 33 35102 - 35578 343 158 aa, chain - ## HITS:1 COG:no KEGG:BLJ_1242 NR:ns ## KEGG: BLJ_1242 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 157 1 157 158 208 88.0 6e-53 MQEEIEQKTIALAVKTGKLTGQVLQAAMKKFLAARQKGTGKAHHGKQSLRQLKKDGSALS NIEITDANIGLFRPCAKKYGIDFTLRKDRTTHPPRYIVIFKSKQADNLEQAFKEFTAKKL KQQERPSIRKALSAMKQKTAARNRQQAKEKIKERGLSL >gi|157101642|gb|DS480682.1| GENE 34 35622 - 35834 307 70 aa, chain - ## HITS:1 COG:no KEGG:BLJ_1241 NR:ns ## KEGG: BLJ_1241 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 70 13 82 82 133 92.0 2e-30 MKQGTLIFDEQADRYDIRFDLADYYGGLHCGETFDVLVGGRWRPTRIEMAENWYLVGIRT DDLSGLRVRI >gi|157101642|gb|DS480682.1| GENE 35 35852 - 36985 615 377 aa, chain - ## HITS:1 COG:no KEGG:BLJ_1240 NR:ns ## KEGG: BLJ_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 375 1 373 381 645 84.0 0 MYFTSGSDRAFEQLMQTRPGADHFDSGEAGAPEDCGTCRFYRPHWKYQFCIYAECPYQPG KLTVFDTVKFQVKGVCDKMAVFRVEKNRGYTVMSNHHLRNKDLSLKAKGLLSQMLSLPEG WDYTLKGLSLINRESIDAIRTAVWELERAGYITRQQNRDGKGKMADMLYTIYEQPQTEAS VLEQPILENPVLENPTSDNPTPENPTQINKDRSSKEKSNTDGRITDSIPILSPPSPLEKA AAAPPERKGTGAKSQSAVAVYREIIKDNIEYEHLCQHTKGIDRELLDEIVDLLVETVCSA RTTIRIAGDDYPAELVKSKLMKLNSSHMEFVFDCISKNTTEIRNIKKYLLAVLFNAPSTI NGYYTALVAHDMNTGKI >gi|157101642|gb|DS480682.1| GENE 36 37556 - 38392 807 278 aa, chain + ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 13 278 1 265 265 245 45.0 6e-65 MTATPRLMSNHALDQFQKYLMTHNKSPNTICVYRYAVEQFYHLYPQLTPRNLQLYKVYLL EHYRPQTVNLRIRALNCYMEYRQASITPVSMIKIQQKTYLDKIISQADYEYLKRCLAESE 
EYTYYFIVRLITATGVRISELITFQIEDVHTGHKDIYSKGNKMRRVYIPRGLVRELQEWL AATHRSTGPLFLNRFHSPISPSGIRAQFKVFAARYGLDPEVVYPHSFRHRFAKNFIEKCG DISLLSDLLGHESIETTRIYLRRSSSEQYRIVNKVVDW >gi|157101642|gb|DS480682.1| GENE 37 38403 - 39353 572 316 aa, chain + ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 32 316 1 265 265 285 50.0 1e-76 MKELKISVIYGGICPPQADPSMYNPDNYQANLQYFHDFLQRRGSSANTIDSYLTSVRHFH SLYRDVTVENLQSYKAWLMERYKPSTVNTRIYGINQYVAALQSGGDISDTGPYTPAASID FLEPYKLPSVRHQQKPFLNNVISKRDYERLKRGLKRDNNMFWYFVVRFLGATGARVSELI QIKAEHMQIGCMDLYSKGGKVRRIYFPEKLCGEMLSWLDSRQIQTGFIFTNRRGNPITPR GISSQLKVLAKKYRICPDTVYPHSFRHRFAKNFLTKFNDISLLADLMGHESIETTRIYLT RSSDEQRELIDRIVTW >gi|157101642|gb|DS480682.1| GENE 38 39402 - 40556 1050 384 aa, chain + ## HITS:1 COG:L191765 KEGG:ns NR:ns ## COG: L191765 COG4905 # Protein_GI_number: 15673533 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 16 217 7 201 235 122 33.0 2e-27 MVYMYEYTWYQWLTFFFIYCFFGWIFESTYVSAKTGHFVNRGFLRLPLLPLYGTGAVMML WVSLPVKDNLFLVYLAGVIAATILEYVTGWGMEKLFKMKYWDYSNQRFNVKGYICLSSSI AWGFLTIFLTEVVHRPIEHYVLGLPVMVNIIFVLITSLLFAADTAESVKTALDLAKVLDA MTDMRAELDDIQVQMALLKAETSERVAEAREEAAARLSRMKSGAAVHAAALKDDTAERLN ELVENTMEKMSGFVEGTAEKMGGLVENTAEKVARTAERLSELTEDAAARVSGTLAQRAEG SRQAIQRDELNKSAVQDRKDKLAALSQRLVSITEKRHMLSTHMDFYQRGILKGNPTASSS RFAEALKELREIADRKNPDEHDAN >gi|157101642|gb|DS480682.1| GENE 39 40682 - 42418 2177 578 aa, chain - ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 4 576 3 572 575 561 50.0 1e-159 MKDYMKIYQEWLSNPYFDEDTKAELRAIEGDENEIKERFYMDLEFGTAGLRGVIGAGINR MNIYVVRRATQGLANYIIKQGGADKGVAIAYDSRHMSPEFAMEAAMTLAANGIKAYKFES LRPTPELSFAVRELGCIAGINITASHNPPEYNGYKVYWEDGAQFTPPHDKGVTEEVLAIE 
DLSTVKTTDEASAAAAGKYEVIGREIDDKYIAQVKAQVVNQKAIDEMQDQISIVYTPLHG TGNIPARRVMKELGFTHVYVVPEQELPDGGFPTVSYPNPEAAEAFSLGLKLAAEKNADLV LATDPDADRLGVYVKDAGSGEYIPLTGNMSGSLLCEYVLSQKKAAGKIPDDGQVIKSIVS TNLIDAVAKEYGCELIEVLTGFKWIGQQVLKNEKTGRGTYLFGMEESYGCLIGTYARDKD AISATAALCEAAAYYKQKGMTLWDAMVAMYEKYGYYKDAVKSIGLSGIEGLAKIQSIMET LRNNTPKEVGGYKVVSARDYKLDTIKDMASGEVKPTGLPSSNVLYYDLNDGAWICVRPSG TEPKIKFYYGIKGSSMEDADAKSEALGAAVMAMVDKMM >gi|157101642|gb|DS480682.1| GENE 40 42571 - 43689 957 372 aa, chain - ## HITS:1 COG:no KEGG:Closa_0390 NR:ns ## KEGG: Closa_0390 # Name: not_defined # Def: CotS family spore coat protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 365 1 325 326 394 53.0 1e-108 MNEKYVEALEQYDMEVRAVRKGRGSWICETDQGCRLLKEYRGTARRLEFEDEVLGRLDTR GSLRADRYLRNREGGLMTVTGDGTRYILKDWFLDRECNIKDGYEIRQALSRLAMLHSQLR KIEFKEEWNMGSILSGPLEEELERHNREMQRARNYIRSKRKKTEFELCVIGSYQMFYDQA LEAVQGIRELWPGADELPSEGTGPYNTGKCLKLGPYQEDQDALEAAEVQPPVKKDRKPLY LCHGDLDQHHVLMGGNYTAIIEYNRMHLGIQISDLYRFMRKVMEKHGWNLDLGLSMLDSY ERVLPMEPKERGCLYYLFLYPEKYWKQLNFYYNANKAWIPARNTDKLRGLEEQQQARNSF LKRLKADCKGCV >gi|157101642|gb|DS480682.1| GENE 41 44014 - 44520 555 168 aa, chain - ## HITS:1 COG:FN0809 KEGG:ns NR:ns ## COG: FN0809 COG0219 # Protein_GI_number: 19704144 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Fusobacterium nucleatum # 2 151 1 148 150 178 58.0 3e-45 MMNIVLLEPEMPSNTGNIGRTCVATQTRLHLIEPLGFKLNEKALKRAGLDYWDKLDVTVY SDYKDFLERNPQASGNMYFATTKARKVYSDVTYGPDCYLMFGRESAGIPEEILVHNEDHC IRIPMWGDIRSLNLSNSAAIVLYEALRQNGFEKLELRGQLHHLHWKDE >gi|157101642|gb|DS480682.1| GENE 42 44540 - 45712 1449 390 aa, chain - ## HITS:1 COG:BH3350 KEGG:ns NR:ns ## COG: BH3350 COG0436 # Protein_GI_number: 15615912 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Bacillus halodurans # 2 385 8 390 393 446 56.0 1e-125 MRNPLSDKIVKIPPSGIRKFFDIVSEMKDAISLGVGEPDFETPWHIREEGIYSLEKGRTF 
YTSNAGLKPLKVEICEYLKRRCGVTYDPDQEILVTVGGSEAIDIALRAMLNEGDEVLIPQ PSYVSYMPCVTLADGVPVTIELEEKDQFKLTPEKLLEKITDKTKILVLPFPNNPTGAIME KEELEEIVKIVLEKDLFVISDEIYSELTYNGKRHVTIASFPGMKERTVLINGFSKAYAMT GWRLGYAAAPHIILEQMLKIHQYAIMCAPTTSQYAAVSAMKNGDPDVSMMRESYNQRRRF LLHAYEEMGLTCFEPMGAFYTFPNISRFGMTSEEFATRFLEEEKVAVVPGTAFGDCGEGF VRVSYAYSLDNLKEALGRMERFVKRLDGRE >gi|157101642|gb|DS480682.1| GENE 43 45709 - 46191 596 160 aa, chain - ## HITS:1 COG:BH3351 KEGG:ns NR:ns ## COG: BH3351 COG1522 # Protein_GI_number: 15615913 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 5 160 8 164 164 149 48.0 1e-36 MREKILAVIEKNSRIDLKDLAVLLGESEIAVANEIAEMEKEHIICGYHTLINWDNTSEEK VVALIEVKVTPQRGMGFDKIAERIYQYSEVEAVYLMSGAYDFTVFIEGRTMRQIAQFVSD KLGPMESVLSTATHFVLKKYKDHGTIISEQVQDERMLITP >gi|157101642|gb|DS480682.1| GENE 44 46275 - 46868 292 197 aa, chain - ## HITS:1 COG:PH1573 KEGG:ns NR:ns ## COG: PH1573 COG0309 # Protein_GI_number: 14591353 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Pyrococcus horikoshii # 19 197 150 326 326 61 30.0 1e-09 MRTAGQGPGGFEKRDGFMRPGQDLVMAGYAGFAGTVKIGAEKRDFLKTRFSSMFLRCLDE TEPYSVEQWIKREQKEEHCSFTVYEYAGEGGVLAALWNLSGIFNAGIDADLRCMPVRQVT VEVCELFELNPYRLYSGNCAVLVSDRGGQLVRRLREEGIPAGVIGSVAEGSARQIRHGQV GAGFLERPRPDELTKII >gi|157101642|gb|DS480682.1| GENE 45 46868 - 48451 1671 527 aa, chain - ## HITS:1 COG:BS_yoaR KEGG:ns NR:ns ## COG: BS_yoaR COG2720 # Protein_GI_number: 16078932 # Func_class: V Defense mechanisms # Function: Uncharacterized vancomycin resistance protein # Organism: Bacillus subtilis # 128 362 49 279 303 113 32.0 7e-25 MRNRIGNAGLTAAISMALLAAFPVIAWAAPVLPAGITAGSQSLAGMTAEEAKAAVQEYVD GLAAQSVTLSVDGQDVATTAGELGFHWINTDVIDEAVSQYAGGSLIKQYMIEKDLAAAPV EVELETEVDSAKVKNFVDTQCQGITAEPQNASITRENGQFVITDSVPGRVVDVAGTEAAL NEALEGGLDQPIEVTAQVTEEQPVITSEALATIQDVLGTYTTDFSSSGAARSTNLAVGAA KINGHVLMPGDVLSGYECLQPFTTANGYKTAAAYENGQVVDSIGGGVCQLATTLYNASLE 
AEVEIVQRQNHSMIVTYVKPSRDAAIAGTYKDIKIKNNYSTPIYVEGYTSGRKLTFTIYG KETRPANRKVEYVSETLGSTSPGEPQLIVDNTLAPGARVKVQSSHTGLRSRLWKVVTVDG VEQERTLLNKDTYNASKAIYRVGPDLPVALPVVPDVTAPVDTAPAETAPVTGVEGGPGVT QPAPEPAPAPEANPAPAANPAPEANPAPAANPAPAQGGPTVTPVEGP >gi|157101642|gb|DS480682.1| GENE 46 48481 - 48591 179 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLNMRDPKTKKIISTIIVVVLVLAMIVPMALSALMY >gi|157101642|gb|DS480682.1| GENE 47 48635 - 49372 396 245 aa, chain - ## HITS:1 COG:BS_polC KEGG:ns NR:ns ## COG: BS_polC COG2176 # Protein_GI_number: 16078721 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Bacillus subtilis # 4 170 420 588 1437 107 37.0 3e-23 MTNSYIALDLETTGLEARLDKITEIAALRVEDGRVEERFVTLVNPGRQLGERITALTGIT DDMVKDAPVIEDTIGDVVRFCGNLPLLGHNILFDYAFLKRAAVNSGLDFEREGIDTLGIC RLFMPEDVKRNLTCACSFYEVSQSAAHRAQADAEAAHGLYQVMMARHGEEHPEAFAAKPL IYKAKREQPATKRQKEYLHDLLKCHRIDVTVQIDAMSRSEVSRMIDNIISQYGRMTSRIS GEQKD >gi|157101642|gb|DS480682.1| GENE 48 49664 - 50947 1355 427 aa, chain - ## HITS:1 COG:TM0442 KEGG:ns NR:ns ## COG: TM0442 COG3875 # Protein_GI_number: 15643208 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 29 360 22 362 427 110 25.0 7e-24 MLKNIAEFSAQTESGELSSHQAVAVMDCMLEGLPKAGRILLVPPDITRCYSYGGVITSYL YHRLSREAKVRIMPAVGTHRAMSREEQIRFFGEDVPEEVFLYHDWRKDTVPVGTVPGAYC GQVSGGRYTSDIGAEVNRCVVDGSFDLILSIGQVVPHEVIGMSNYTKNILVGLGGRPMIN GTHMLGALCNLETIMGNTDTPVRAVFDYGEEHFLRDVPLAYILTVASEQEGRTALHGIFT GASRQVYEHAAALAKRRCITQVERRAKKVVAYLEPEEFSTTWVGNKAIYRTRMMIEDGGE LLVIAPGIKGFGENPEVDGLIRRYGYKGTPYTMELMENGAFPGSAMVPAHMIHSSSEGRF TITYAVNPEYVSQDEIGRIGYEFMDVKDALARYPVLEMEDGWQEMEDGEEIYVVKAPALG LWRVMGV >gi|157101642|gb|DS480682.1| GENE 49 51031 - 52011 1181 326 aa, chain - ## HITS:1 COG:AGl989 KEGG:ns NR:ns ## COG: AGl989 COG1172 # Protein_GI_number: 15890611 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, 
permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 33 324 26 319 322 197 40.0 3e-50 MGEEKNRQASNLQFIKKNAGSSQVMVYVILVVLLLAAQLMSPGYLKPTHVAGILRLASFM GIAAIGQNLTILTGGIDLSIANTITFANVIAAQIMMGRNENLPAALFAVIVMGVMVGFIN GVGIHWLKIPPFIMTLGVGTVIQGVFLIYTKGAPKGNASPLLKAICGQSFIGILSGIVVI WAVMAAVTIVLLRNTPYGRKIYSVGTNEQAARFSGIHTGRVTFSVYLISAVIAAVSGFFL VGYTGTSFLDVGTSYNTKTIAAVIIGGTAITGGKGGYIGTIAGAIIMTILDDFLTIVNIP EAGRQIMQGVIIVLLVLIYSREKRKK >gi|157101642|gb|DS480682.1| GENE 50 52060 - 53010 874 316 aa, chain - ## HITS:1 COG:BS_rbsC KEGG:ns NR:ns ## COG: BS_rbsC COG1172 # Protein_GI_number: 16080648 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus subtilis # 20 309 25 316 322 170 40.0 3e-42 MKGRLNRNFWHKHGHTVIIYVFLAALMVFVSIFNKDFFTLGNFKNLLRSSFPLLMAALGQ TLIILTGGIDLSLGGIVALCNVVCVMSMNPDSAWGFVPALGAAAAVGLACGAVNGVLVTK GRLAPIIVTIATTAVFDGMALLLMPNPGGSVHKAFAKFLTRGYSGAVPFLLFILILCLVR SLANSTPYGKALRAIGGSENAAYSTGIRVGKIKFCAYCLAGLLCAVAGIFLSAQMNSADA TIGKNYAMNAITATVVGGTAMTGAVGDPLGTIAGVFIISIINNMLNLFGVSSFYQFICQG LILIAALSLSALHKRH >gi|157101642|gb|DS480682.1| GENE 51 53007 - 54533 189 508 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 269 477 13 217 245 77 27 2e-13 MAEEILKMVNITKRFAGITALDRVQFSCSKGEVHVLAGENGAGKSTILKILAGIHQADEG EIYFHGKKVVIRNPEQSQKLGIAMVFQELTLVGEMTVWENIYLNQEPVTRLGRINRKEIK KRILDVMDRYGIHIDPDAPVGSLPVAEQQMAEILKILVRDPELIILDEPTSALAKKEVEQ LYQIIHNLIADGKTIIFISHRLEELFELGDRITVFKDGCYIGTRNMNEINEDDLIRMMVG RSLNHVFPPPVCEVDDSQVIFEVKNLADANHKLNQVSFQVHKGEILGVAGLQGHGQTELL NAVSGLYPLSGGSIIVNGKQVGIKNAGQAIANGIALVPSDRKNQGLMLELSNRHNLAISS MKKRMKGCFIDKKAEQAFSESMASKLSIKMGGLDLPVSSLSGGNQQKVVLGKELATEPRV ILFDEPTRGIDVEAKREFYQIMHDLAARGVAVVMNSSDMLEVIGMSSRVIVMYEGRISGV LEKEELTEERIMQLGMGLGKNRKEGEKG >gi|157101642|gb|DS480682.1| GENE 52 54629 - 55756 1385 375 aa, chain - ## HITS:1 COG:AGl993 KEGG:ns NR:ns ## COG: AGl993 
COG1879 # Protein_GI_number: 15890614 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 58 363 16 327 340 158 35.0 2e-38 MRKMFTVGLAAVMAMGLCACGSQKAAETTAAPAAAETTAAPSQDEKKEEAEETTEAQASS ETAGGEGYVVALCNYSIGNSWRAQMEQEFVAEAEKLKAEGVVSEYYITNSNEDINKQISD MQDLITKKVDAIVITAASPTALAPVVEEAAEAGIKVVSFDNVVETDEQVATVGIDEKEFG RIGAEWLVDKLDGKGKIVVLNGIAGTATDSLRWGGAEEVFKQYPDIEILGSANASWDYAQ GKAAMESMLSAYPEIDGVWSQGGAMTQGAIDAFIAAGRDLVPMTSEGNNGAIRAWIENKD KGLSCIAPSNPTYTSAEALRVVIKALNGEDIPGNVVMDIETVTEENVDQYYRSDMPDSFW VLTELDDATLQKLYK >gi|157101642|gb|DS480682.1| GENE 53 55800 - 56711 798 303 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_2638 NR:ns ## KEGG: SpiBuddy_2638 # Name: not_defined # Def: amidohydrolase 2 # Organism: Spirochaeta_Buddy # Pathway: not_defined # 1 299 1 297 299 88 24.0 3e-16 MNVIDIHAHIYERVAGITRGQPMASTDLGRVAIGNEEVQFLPPSFEQSRSTAEMLIAYMD WCGIEKAVLMPNPYYGYHNRYFADSVKRYPDRLRGVALVDIMKGQEAARELASIYDQGIL FGFKVEVDSTFQCAPGTRLSDSGLSPVWDCCNQYHQPVFIHMFRTEDIEDAGFLASAYPN ISFILCHMGADACFKKGMPESNYREVLDLVKNRSNVFIDTSTVPVYFGEEYPWPSSVEII QACYRHVGPEKMMWASDYPGMLNHGTMQQLINLVEKHCSHIPESHRQMIMADNARRLFFD DQR >gi|157101642|gb|DS480682.1| GENE 54 56927 - 57595 517 222 aa, chain + ## HITS:1 COG:PM1362 KEGG:ns NR:ns ## COG: PM1362 COG1878 # Protein_GI_number: 15603227 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Pasteurella multocida # 2 221 3 222 224 214 48.0 9e-56 MYQILSYPIKKGQPTWPGNPAFSLEPHTSIAGGDTANTCTIHLFNHYGTHLDGPMHFYGK GIPLDQVPFGQFFFHNPLLLDIPKEPGAKLMPEDLIPHREDVKDADLLLIRTGFSKYRRE KPDLYENNGPAVSSRLARYLQDNMSHLKALALDFVSLASYSDTKDGDLAHQIMLGMYHNR YICIIEDVNMEGLPSGFLKNAAAVPLIIEGIDSSPVTMWAEY >gi|157101642|gb|DS480682.1| GENE 55 57601 - 58677 1122 358 aa, chain + ## HITS:1 COG:ECs5205 KEGG:ns NR:ns ## COG: ECs5205 COG1879 # Protein_GI_number: 15834459 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar 
transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 68 331 24 286 318 173 36.0 5e-43 MSRFMLRFKTWFKTWFKILFKTRFKLLFRKHWCLLIICSLCVILAITLYDTSNQAANWEP EPADSSSLHVGFSQIETDNPWRTAQINSFREALLPNGMDFIYHEPEDHSVQWQLEDIRSL IREGVDYLVIVPADLSALTPVLQDAKDAGIPVILIDQSAETIDQSYYVSLISADYLKEGQ ICAGLLADKFGDQPCRIVEIYGTESSPGAQARSKGFHQALMKYPNMEIIDVEYGNFDRVT AQKAMENALIKAANRGQTIDAVFAHSDEDGLGALQAIKVAGLPPGEIAIVSINGIQDVCK AIIAGEYLGTVESNPRWGFIAAFLIQQMERDCKPFPMVMIPYQIITAENAAEYALTAY >gi|157101642|gb|DS480682.1| GENE 56 58700 - 60511 1831 603 aa, chain + ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 8 593 10 595 602 218 26.0 2e-56 MKNFFRQLSFRRKLLISFVTLSCIPVLLVGIAAYHLYTNFIINMTEKSSIETIDLVCDDI DSLLNDAWNLCDMLTGDIKMQKYLRMDFDSVRDQYSNDLAGSMELASISTYRKDIFGVYV FGQNGGRYKSNYYSFKSEDQRETTWYKAIAGSRETTWFPSHEGSFIVRSSISDNFITVGQ PVMDKASGMVNGIVAADIKEDVITQKIKHSLSNGVICIIDQEGSILFRSNAGNDLHYPID ISPSLVSHILESTGTAVGKSMVVPDSGYLVVSRTLMNSNWRIAGIIDRGFLTQSSKDITH IVMLVLLIIAFSSLYVAMLISQSVYKPVQILYRMMEEVENGDFSVRYTYHSSDEFGRLGK NFNQMLERIQKLISQIYEEQKKLKNSELKALQAQIQPHFLYNSLDSVMWLLRMDKNRDAE KMLNELSTLFKISLSKGNEIITIEEELRHISSYLFITNMIYSKKFEYAIECDPVLYSYRT LKLLLQPLAENAIAHAIPMPGQKVFIQVRIYEDEDSLVLSVQDISRGIDQETLEKLQQQL GTAAHPDRRDSGYGLYNVNERIHILFGSSYGLTLTSEPDFGTEVTVKIPKLKGDDIFVPG NAL >gi|157101642|gb|DS480682.1| GENE 57 60489 - 62048 1508 519 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 518 1 522 525 147 22.0 7e-35 MFQAMLCDDNEIILEGLSRQIDWEGLGICLSGTAADGQDAWNQMKDNPPDILITDIRMPY IDGLELSRLAKDLNPNIVLLIISAYDDFEYARTAMHLRALDYILKPIDLDAMNRILTNAV 
SHCRQFHHDKRLMATQLLRNMAVQEAAGPEGIALNELDLEPDSWCCFLQVELDDHSLSKL SDDIRYASERRFSALSQILFGDSSYLLESHQYSYSVCLTSRSRMELAARRKILVDTIRRQ FPHEGSYCDVSIASGSIVQGIRLLHKSVQTCHEAGKLHFIKGSNADIFYEEVESYIYNEP EDSQGYPIPDTRLITFIKEQNKEGITGELEQLKTWLYQKGSESYLYMTFSLGSFYTHLMK ELGKSGINLQDIFQNPIEEFKKVAAGGTLESSIENLKQNLFKICDSIRINKSRYGKLIDQ AILYIQNHYMSSSFSIDEVAGAVCLSTSYFSTVFKSETGITFTDYLIKVRMEKARGLLEN TSMKMYEISSRAGYENAAYFSAAFKRYYGKSPSEFQSRK >gi|157101642|gb|DS480682.1| GENE 58 62144 - 62929 638 261 aa, chain - ## HITS:1 COG:FN0057 KEGG:ns NR:ns ## COG: FN0057 COG0177 # Protein_GI_number: 19703409 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Fusobacterium nucleatum # 65 256 1 192 201 202 49.0 7e-52 MLQKTILNEPETDRKTGRGIKSDRDTKAGRGRKAGGGTKTQGRPKAVRETKAELAARIAR ILDVLDREYGTEYRCYLNHETPWQLLIAVIMSAQCTDARVNIVTADLFQKYDTLEKFAAA DLKELEQDIHSIGFYHMKAKNIIACCRDLVERFGGEVPRTIEELTSLAGVGRKTANVIRG NIYNEPSIVVDTHVKRISRKLGLTKEEEPEKIEYDLMKVLPKDHWILWNIHIITLGRTIC IARRPKCCECFLREECPGREA >gi|157101642|gb|DS480682.1| GENE 59 62986 - 64824 1427 612 aa, chain - ## HITS:1 COG:CAC1021 KEGG:ns NR:ns ## COG: CAC1021 COG1032 # Protein_GI_number: 15894308 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 2 482 3 439 548 351 39.0 3e-96 MKILLTAVNAKYIHSNLGIYSLKAYADQVLGLGEAGGRNARAKKRGTGTGAMEPAKPSIE LAEYTINHQLDQILQDIFRREPDVIAFSCYIWNIEYIRYLIADLGKILPDVPIWLGGPEV SFDAAKVLGELPEATGVMKGEGEETFAQLAGYYVDKLCKKGTDSGKILPDGQAGPENIPG LAFRLPDGSLADTGVRRAMDMSRIPFPYKHMDIKDLEHRIIYYESSRGCPFSCSYCLSSI DKSVRFRSLELVLDELAYFLEAGVPQVKFVDRTFNCNKKHAMAIWRFIQSHDNGVTNFHF EIAADLLDQEEIELLGQMRPGLVQLEIGVQSTNPDTLKAIHRKTDIDEIRRITGTINQAH NVHQHLDLIAGLPNENLESFRHSFNQVYSMEPEQLQLGFLKVLKGSYMEEMALTYGLCYS SRPPYEVLSTRWLSYRDILELKGVEDMTEVYYNSRQFVHTLSGLAAEYGSPYEMFLDMAA YYRENNLTGISHSRIARYEILYSIIARRRDAGLDASVMERFRDLLMYDLYLRENVKSRPS FARDQSPYKQAVRQFFQREEESPHYLMDYEGYDSRQMSKMAHIEAMGDGTLVLFDYRHRD PLTYNARAVRIG >gi|157101642|gb|DS480682.1| GENE 60 64829 - 65323 559 
164 aa, chain - ## HITS:1 COG:FN1902 KEGG:ns NR:ns ## COG: FN1902 COG2131 # Protein_GI_number: 19705207 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Fusobacterium nucleatum # 5 162 15 168 174 187 57.0 5e-48 MTGKRVDYITWDEYFMGVALLSGRRSKDPSTQVGACIVSQDNKILSMGYNGFPKGCSDDE FPWGKENEKEDPYNSKYFYSTHSELNAILNYRGGSLEGSKLYVTLFPCNECAKAIIQSGI KTIVYREDKYADTPAVMASKRMLNAAGVRYYQYQSTGHKIELEV >gi|157101642|gb|DS480682.1| GENE 61 65952 - 66779 659 275 aa, chain + ## HITS:1 COG:no KEGG:CLB_2021 NR:ns ## KEGG: CLB_2021 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 1 237 11 247 271 125 31.0 2e-27 MKTAAQFFGKDLLPYAGIKGRIACVAPTEHVHLEMRRLEEDFNFIMKDGSWRHLEFESDS IQEKDLRRFREYEAYIGLTYDVPVTTTVICTSRVKILKKELVNGMNVYRVEVVRLKDRNA GKVFEKLRRKISRGKTLGRHDVFPLLLTPLMSGKMEMSQRIYQGMEFLQCKELKAGDEDR KRMQSVLYALAVKFLDRNELAMIKERIGMTVLGKMLFEDGVEKGIEKGIEKGVQQGLGRA NALNMKLADAGRADDIIRAASDRTYQEQLFKEFGI >gi|157101642|gb|DS480682.1| GENE 62 67145 - 67525 268 126 aa, chain - ## HITS:1 COG:TM1434 KEGG:ns NR:ns ## COG: TM1434 COG3862 # Protein_GI_number: 15644185 # Func_class: S Function unknown # Function: Uncharacterized protein with conserved CXXC pairs # Organism: Thermotoga maritima # 1 122 1 118 138 86 40.0 1e-17 MLREYTCIICPNGCDIRAQIEEREDGSRRICSVEGAACPKGRAYVEQELTDPQRNIATSV LVKGGILPLASVRLTNPIPRDRIFDAVSEIKKYTLTAPVKAGTVVIPRLMGYDTDVIVTK SVPENR >gi|157101642|gb|DS480682.1| GENE 63 67519 - 68751 1162 410 aa, chain - ## HITS:1 COG:FN0182 KEGG:ns NR:ns ## COG: FN0182 COG0446 # Protein_GI_number: 19703527 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 1 393 3 399 421 386 52.0 1e-107 MEIDVLIVGGGPAGLAAAVRLYRKGIHNILIVEREKQLGGILRQCIHDGFGLTRFKTTLS GPEYAQRFIEEVEELGIPYITDATVLEVAEDRTVTAATRRGLKVWRAKSVILTMGCRERT RGALGIPGERPAGVFTAGVAQAYINLYNVMPAKEVVILGSGDIGMIMARRLTLEGAHVQA 
VFEIQPYPSGLPRNVEQCLNDYGIPLYLSHTVTAVNGDNRLTGVTVSRVDEHLRPVPGTE KEYKCDTLILSVGLIPENELSLDAGVTLDDRTKGAVVDEHFQTDVAGIFAAGNVLHVHDL VDFVSMEAESLADSAAEFVQKGCLPPCPIEIGTDSHINHTVPQRVSGTRDFKLSLRVNSP LKECRIEVRQDGVLVASKKMKKAIPAEMIELTVKADKLVSMGRLEVSAEC >gi|157101642|gb|DS480682.1| GENE 64 68755 - 70215 1335 486 aa, chain - ## HITS:1 COG:TM1432 KEGG:ns NR:ns ## COG: TM1432 COG0579 # Protein_GI_number: 15644183 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Thermotoga maritima # 8 484 4 479 479 276 33.0 8e-74 MKQLYDVIIIGGGVVGSAIAREMSRYRLKIGVLEKNLDVCYETSGRNSGVIHGGFAYDTG SLKARLCVEGNQFMGQLAEELDFKFIRCGKVLVGNTAEDMETLKRTIRQGEENGAAGLEL IGKERLHQLVPAVVGEFAMFSANSGIVDPFNYTIALAENAHANGAAYYFDHEVTGIRRDS EGNYILTTPKGEFHTRWVVNSAGLGCGNISDMLGIRGYKVIGSKGDYIILDKRTGYLLPM PVYPVPSNTYMGIHVTNTTDGNVIIGPNAEMVTDFTYYGVPQENMDYLAKSASDLWPCIH KKDYIRNYSGILPKWVDEDGVIQDFKIEIRDDIAPRAINLVGIESPGLTAAVPIARHAIS LMEEREKLTPNPGFNPVRKGIPHFAEMTKEEQEQMIRQNPDYGQVICRCEKVTRAEILAA IHNPLGVDTMAGVKYRTRSMMGRCQGGYCQMRIAQMIEDELGKKVTEVRYARKDSQMFFG RVREEA >gi|157101642|gb|DS480682.1| GENE 65 70529 - 71266 832 245 aa, chain - ## HITS:1 COG:RSc1078 KEGG:ns NR:ns ## COG: RSc1078 COG2186 # Protein_GI_number: 17545797 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 11 231 15 240 255 94 32.0 1e-19 MHQAGGNMEKQTLGEIASQKLLEMIQRDGYTAGDKLPTEAELVELLGVGRNTVREALRIL MSRNIVTIRQGSGTFISDKNGVSDDPLGFAMIEDRRKLTEDLIQVRVMLEPPIAALAAQN ATVEDIRQLENILLGLEEQMILREDYADKDSQFHAQIANCSHNLVMTNLVPVITDGVRVF AGAVQETEYEQTLKSHRRIFEAIRDRKPVEAQQAMYFHLMYNDNRYKGEMGDGKQGMKGT RAKEQ >gi|157101642|gb|DS480682.1| GENE 66 71266 - 72504 1170 412 aa, chain - ## HITS:1 COG:BH3771 KEGG:ns NR:ns ## COG: BH3771 COG0009 # Protein_GI_number: 15616333 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Bacillus halodurans # 2 411 5 347 348 285 42.0 1e-76 METKRIIIEDRNHIKDEELKEAASILRSGGLVAFPTETVYGLGGNALDEDAARKIYAAKG 
RPSDNPLIAHVSCVEEVAPLVKEIPEAGRKLMEAFWPGPLTMIFPKSEKVPYGTTGGLDT VAIRMPDDPVANRLIALAGVPVAAPSANTSGRPSPTTADHVWQDMNGRIEMIIDGGPVGI GVESTIVDVSSAVPSVLRPGAITMEMLAEVLGEVSVDPAILGPLSADVRPKAPGMKYKHY APKADLTLVEPGTGTERESGAEQVTGAEQKTGAEQVTGAEQKNGAGQKTGAEQKTGADRN TGADPETGLDETQLQAMICKVRELSREKIEAGYKVGVICTDESRGCYTDGEVRSIGARKS QASVAHNLYALLREFDDLGVDYIFSESFPKDHLGQAIMNRLSKAAGYKIVKV >gi|157101642|gb|DS480682.1| GENE 67 72503 - 72616 85 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIHTSIVKYRLLYVQSILAPATANVKPGNPAYPAPVT >gi|157101642|gb|DS480682.1| GENE 68 72674 - 73126 505 150 aa, chain + ## HITS:1 COG:PM1564 KEGG:ns NR:ns ## COG: PM1564 COG1970 # Protein_GI_number: 15603429 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Pasteurella multocida # 6 145 2 133 133 119 49.0 3e-27 MSNINSVMKEFKAFILKGNIIDMAVGVIIGGAFSKIVTSLVNDILMPLLGALTGGASFNT LKYVIHPAATINGVEVEEAAILYGSFLQNILDFLIIGVCMFFMIKTVAVISSRLRHEEDA KEEAPAPAGPTQEELLAEIRDLLKERKDIA >gi|157101642|gb|DS480682.1| GENE 69 73132 - 74829 1437 565 aa, chain - ## HITS:1 COG:XF2699 KEGG:ns NR:ns ## COG: XF2699 COG1158 # Protein_GI_number: 15839288 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Xylella fastidiosa 9a5c # 159 561 10 411 411 408 53.0 1e-113 MREKLQTLPLSELKEMAKSRGIKGISSLRKAEIIDLLCDEAEKNPDIKPAEIKREVKTET SRSQETPRSQEISRSQETPRSQEISRSQETPRSQDQPHAFERNQSRSQDYQENRNDSRGT DNRRPVSRGYDSKYTQNRTQTPQTHGNSSMNNRMSQNSAPSQNDGERFARADNRSDAIGL SSQDMAELDSGIEANGILEVMPDGFGFIRCENFLPGENDVYVAPSQIRRFNLKTGDIIVG NRRVKSATEKFAALLYIKTVNGYPLSATETRPNFEDLTPIFPNRRLHMETPGERNTVAMR VLDLLAPIGKGQRGMIVSPPKAGKTTLLKQVAKAVTTNNPDMHLMILLIDERPEEVTDIR EAIVGPNVEVIYSTFDELPDRHKRVSEMVIERAKRLVEHGRDVIILLDSITRLARAYNLT VAPSGRTLSGGLDPAALHMPKRFFGAARNMREGGSLTVLATALVETGSRMDDVVYEEFKG TGNMELVLDRKLSEKRIFPAIDILKSGTRRDDLLLSREEAEAVDIIRKATNSLKPEDAVE KILDLFARTRNNREFVENAKRIRFF >gi|157101642|gb|DS480682.1| GENE 70 75062 - 76378 1293 438 aa, chain - ## HITS:1 COG:BS_yrvN KEGG:ns NR:ns ## COG: BS_yrvN COG2256 
# Protein_GI_number: 16079807 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Bacillus subtilis # 17 424 3 404 421 412 49.0 1e-115 MDLFDYMREKDMEKESPLASRLRPRTLDEVVGQQHIVGKDKLLYRAIQADKLGSIIFYGP PGTGKTTLAKVIANTTSADFRQINATVAGKKDMEEVVKEAKDNIGMYGRKTILFVDEIHR FNKGQQDYLLPFVEDGTLILIGATTENPYFEVNGALLSRSRIFELKPLEKEDVKELIRRA VYDKERGMGIYDADIDEDAADFLADTANGDARAALNAVELGVLTTAKGADGRIHIDMAVA QECIQKRAVRYDKNGDNHYDTISAFIKSMRGSDPDAAVYYLARMLYAGEDIKFIARRIMI CAAEDVGNADPQALVVAVNAAQAAERIGIPEANIILSQAVTYVATAPKSNAACMAVQKAM EAVRNERTMPVPVHLQDKHYKGAEKLGHGAGYLYAHDYPKHYVQQQYLPEGMEGTVFYEP SDNGYEKQINAHMKWLKS >gi|157101642|gb|DS480682.1| GENE 71 76392 - 77120 812 242 aa, chain - ## HITS:1 COG:SP1479 KEGG:ns NR:ns ## COG: SP1479 COG0726 # Protein_GI_number: 15901329 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Streptococcus pneumoniae TIGR4 # 60 240 268 448 463 171 41.0 8e-43 MKLINKSKLLAAVCIADLLLLGGLIHTVRGEMPAHVSADEGSWVREGGDTLDRAKGLEGK CVALTFDDGPNEDCTEKLLDGLKERNVRATFFLMGQNIEGNEDIVKRMKAEGHLIGNHSY SHVQLTKAGSDAVCQAVDRTSRMIEEITGERPQYMRPPYGDWNEELECRVGMTTVLWSVD SLDWKLRNTNRVVKRVLKDVEDGDIILMHDIFPTSVEAALEIVDTLTKRGYTFVTVDELL ID >gi|157101642|gb|DS480682.1| GENE 72 77361 - 79166 1863 601 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 290 492 524 727 744 194 43.0 4e-49 MKNTTQIMRRCHALALAGLIGAGTIVPAGTSLAVGPGETQSATVISPVGPGQSSAGSTGT ASGTGSTSGQSASTANPNAWKKVNGVYQMPDGSAINNVLFRGIDVSRWQGDINWSQVAAD DVSFVMLGTRSKGAVDPYFHRNIQQASAAGVKVGVYIYSLAMTPEMAVEEANFVLNLIHD YPVSYPVAFDMEDSTQGTLSKDELAAIANAFCGRISEAGYYPVIYANDNWLANKLDMSKM NYPVWVARYSAKPAYQNPVMWQATSTGAVNGISGNVDIDFQFKDFTSVIPANTWRTINGQ TYYYQNYAKQKNNWIQDDGTWYYMNGDGLVSKGWLNQSGKSYYLDDTTGKMITGWKSDSG 
KWYYFGSSGALSKGWINDNGTWYYSNQEGVMQTGWLDDGGRRYFLEGNGAMAKGWTSQNG KWYYLDSSGALSRGWINDNGTWYYSSQEGVMQTGWLDDGGERYYLKGSGAMATGWREMDG AWYYFEGSGRMAKGVIDVGGLHYYMDPSTGRMAAGTTVDIGGVAYTADGSGVLSQVVQEA GNETGDSQTGNVQTQAPDGSQGGQAPQPSQNGGVSNQAPGLGQAVTGSSGGPGVVVTPLG P >gi|157101642|gb|DS480682.1| GENE 73 79256 - 80317 1126 353 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 50 275 526 744 744 167 36.0 4e-41 MGYKKYWVAMAAAGMIGAMSLGSVFTSYAADGWSKSNDSWTYIYNGQTHTGWLQTSEGWY YMDLSTGHMSTGFKQIDGKWYFFRPNGLMAMGWINPEDGKWYYMLNDGSMVTGWLKIGND YYFMRGNGTMATGWREMDGAWYYFQSNGKCVVNSWAQIKDNWYLFGTDGKMMTGWAKVDN DYFYLNTDGRMLIGWLSDSEGNKYYMDTSNGKMSTGWKQIDNSWHYFNSNGHMMTGWIQL DGKYYYLNPANEGKMIAGTTATINGVQYNFDGSGVCQNANGVSAQTPGTVNNNSGSSGGS GGPGVSGSSGSGISSGTPGGSGNSSNGPTSGNSNNSPSGSSTALEPGRTDGPG >gi|157101642|gb|DS480682.1| GENE 74 80424 - 82295 1896 623 aa, chain - ## HITS:1 COG:TP0107_1 KEGG:ns NR:ns ## COG: TP0107_1 COG4750 # Protein_GI_number: 15639101 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CTP:phosphocholine cytidylyltransferase involved in choline phosphorylation for cell surface LPS epitopes # Organism: Treponema pallidum # 79 361 1 251 251 280 48.0 8e-75 MDRHGLVCRAVLEQPDITQRELAGKLDVSLGTANGLIKECLAKGYIGEESSQKGKARWRL MPDGQKLLDQYKVDGALIIAAGFGSRFVPLTFETPKGLLEVFGERMIERQIKQLHEVGIR NITIAVGYLKEKFEYLIDKYGVELLYNPEYTSKNTLTTIYRARKVLEGRNMYVLSSDNWM RENMYHSYECGAWYSSAFQKGETREWCLYFNKKGRITGVKVGGRDQWVMYGPAYFSREFS RQFLPVLEAYYKTPGTEQFYWEQVYVDMLEGEARRRLEEEGGGLLEAAQEAGGLAASRWD EIEMDVNRQPDDQVYEFENLEELRLFDPRYQNHSDNAAMKLVAEVFKVPESSIKDIRCLK AGMTNKSFLFKVEDRHCICRIPGPGTELLINREQEKTVYDAVASLEITEHVLYMNEKTGY KIADYYEGTRNADSADWADMEKCMSMVRRLHQSGITVAHSFDIRERISFYEKLCRSHGAL LFEDYEEVKGWMDWLMDTLDSMERPKCLCHIDANVDNFLFFEDGSVKLLDWEYAGMCDPV MDVSMCAIYSYYKEEDMERLFRLYLRREPSEDELFALYANAALGGFLWCLWAVYKSILGD EFGEYTIIMYRYAKKYYRKLRKL 
>gi|157101642|gb|DS480682.1| GENE 75 82510 - 84624 2310 704 aa, chain - ## HITS:1 COG:CAC2658 KEGG:ns NR:ns ## COG: CAC2658 COG3968 # Protein_GI_number: 15895916 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Clostridium acetobutylicum # 7 704 4 696 696 897 62.0 0 MSDVLNVSEMFGKNVFNDAVMRDRLPKSVYKKLKKTIEDGAELDPSIADVVAHAMKDWAI ERGATHYTHWFQPLTGVTAEKHDSFISAPSDGKVIMEFSGKELIKGEPDASSFPSGGLRA TFEARGYTVWDCTSPAFLREDAIGVTLCIPTSFCSYKGEALDKKTPLLRSMQAVNEQALR ILRLFGNTTAKRVAPSVGAEQEYFLVDREKYLQRKDLIYTGRTLFGAMPPKGQELEDHYF GAIRERIGSYMNDVNKELWKLGVPAKTQHNEVAPAQHELAPIYEGVNQAVDHNQIVMETL KKVAGRHGMACLLHEKPFAGVNGSGKHDNWSLTGDDGTNLLNPGDTPHENIQFLLVLACI IKAVDIHADLLRESASDVGNDHRLGANEAPPAIISIFLGEQLEDVIDQLCSTGEATHSKQ GGTLKTGVDILPDLDKDATDRNRTSPFAFTGNKFEFRMVGSSDSIASPNTVLNTIVAEAF KEAADMLEKAEDFDMAVHDFIKETLAAHKRIIFNGNGYSDEWVAEAERRGLPNIRSMVDA IPALTTDKAVKLFEAFGVFTRAELESRSEVEYENYSKAINIEARTMYDMASKSIIPAVIK YTTQLASSIAAVKGVCQEADVSTQTELLIETSKLLAEIKTALARLLEVTEKAAAAEGGKV QALVYHSQVVPAMEALRAPVDQLEMIVDKELWPMPSYGDLIFEV >gi|157101642|gb|DS480682.1| GENE 76 84936 - 85799 1186 287 aa, chain - ## HITS:1 COG:CAC0827 KEGG:ns NR:ns ## COG: CAC0827 COG0191 # Protein_GI_number: 15894114 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Clostridium acetobutylicum # 1 287 1 287 287 436 78.0 1e-122 MLVSAKDMLEKAREGKYAVGQFNINNLEWTKSVLLTAEELKSPVILGVSEGAGKYMTGFK TVAAMVEAMMEELNITVPVALHLDHGTYDGCYKCIKAGFSSIMFDGSHYPIEENIEKTKE LVKVAHAMGLSLEAEVGSIGGEEDGVVGLGECADPQECKAIADLGIDFLAAGIGNIHGKY PENWPGLRFDVLEQVKAAVGDMPLVLHGGTGIPEDMIKKAISLGVAKINVNTECQLSFAA ATREYIEAGKDLKGKGFDPRKLLAPGADAIRATVKEKMELFGSVGKA >gi|157101642|gb|DS480682.1| GENE 77 85890 - 87386 1649 498 aa, chain - ## HITS:1 COG:CAC0999 KEGG:ns NR:ns ## COG: CAC0999 COG0498 # Protein_GI_number: 15894286 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Clostridium acetobutylicum # 5 496 6 495 496 
566 58.0 1e-161 MGLTYQSTRGGEKEVTASMAILQGLAKDGGLFMPSCIPQLDVPLEKLASMTYQETAYEVM KLFLTDYTEKELKDCIARAYDSKFDTEEIAPLAKADGAYYLELYHGSTIAFKDMALSILP HLMTTAARKNHVDREIVILTATSGDTGKAAMAGFADVPGTRIIVFYPKDGVSKVQELQMR TQKGDNTSVVAIHGNFDDAQTGVKKMFGDKDLEAELMGKGFQFSSANSINIGRLVPQIVY YVYAYAKLLEAGEIEKGENINVVVPTGNFGNILAAYFAKRMGLPVKTLVCASNDNKVLYD FFTTGIYDRKREFILTNSPSMDILISSNLERLIYMSTGCDALASGNLMRGLSQEGRYEVT PEMRAFMSDFVGGFATQEQNAATIKKLFDDTGYLIDTHTGVAASVYGNYRKESGDDTKTV IASTASPYKFSHSVMEAIAGREGLEGKDEFEIVDALSALSGVAVPQAVEEIRHAAVRHNR ECGVDDMKNEVKDILGIS Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:45:09 2011 Seq name: gi|157101641|gb|DS480683.1| Clostridium bolteae ATCC BAA-613 Scfld_02_24 genomic scaffold, whole genome shotgun sequence Length of sequence - 117699 bp Number of predicted genes - 107, with homology - 107 Number of transcription units - 48, operones - 21 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 87 - 159 67.6 # Thr GGT 0 0 1 1 Tu 1 . - CDS 414 - 1319 882 ## COG0039 Malate/lactate dehydrogenases - Prom 1399 - 1458 3.4 2 2 Op 1 . - CDS 1484 - 2545 1168 ## COG2805 Tfp pilus assembly protein, pilus retraction ATPase PilT 3 2 Op 2 . - CDS 2571 - 3050 555 ## Closa_1304 hypothetical protein 4 2 Op 3 . - CDS 3102 - 3617 407 ## Closa_1303 hypothetical protein 5 2 Op 4 . - CDS 3610 - 4104 574 ## Closa_1302 hypothetical protein 6 2 Op 5 . - CDS 4118 - 4447 509 ## Closa_1301 hypothetical protein 7 2 Op 6 . - CDS 4465 - 5523 1209 ## COG1459 Type II secretory pathway, component PulF 8 2 Op 7 . - CDS 5539 - 6462 1267 ## Closa_1299 transglutaminase domain protein 9 2 Op 8 7/0.000 - CDS 6493 - 7236 675 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 10 2 Op 9 8/0.000 - CDS 7233 - 7703 230 ## PROTEIN SUPPORTED gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 11 2 Op 10 2/0.000 - CDS 7688 - 9097 1692 ## COG0215 Cysteinyl-tRNA synthetase 12 2 Op 11 . 
- CDS 9181 - 9732 301 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 13 2 Op 12 . - CDS 9805 - 10728 356 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase - Prom 10924 - 10983 78.5 + TRNA 10907 - 10982 87.3 # Val TAC 0 0 + 5S_RRNA 10913 - 10964 92.0 # AF302131 [D:490..741] # 5S ribosomal RNA # Streptococcus agalactiae # Bacteria; Firmicutes; Lactobacillales; Streptococcaceae; Streptococcus. + TRNA 11000 - 11073 76.1 # Met CAT 0 0 + Prom 10908 - 10967 77.7 14 3 Op 1 . + CDS 11179 - 12306 1109 ## SpiBuddy_1165 major facilitator superfamily MFS_1 15 3 Op 2 . + CDS 12320 - 12958 648 ## Spico_0824 putative phage-related protein - Term 12932 - 12968 6.5 16 4 Tu 1 . - CDS 13016 - 13483 503 ## COG1225 Peroxiredoxin 17 5 Op 1 . - CDS 13591 - 14805 1148 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 18 5 Op 2 38/0.000 - CDS 14870 - 15712 948 ## COG0395 ABC-type sugar transport system, permease component 19 5 Op 3 35/0.000 - CDS 15705 - 16598 790 ## COG1175 ABC-type sugar transport systems, permease components 20 5 Op 4 2/0.000 - CDS 16626 - 17999 1463 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 18089 - 18148 4.9 21 5 Op 5 . - CDS 18211 - 19113 615 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 19217 - 19276 2.7 22 6 Op 1 19/0.000 - CDS 19278 - 21215 2438 ## COG1299 Phosphotransferase system, fructose-specific IIC component 23 6 Op 2 . - CDS 21237 - 22139 1121 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) - Prom 22159 - 22218 8.2 - Term 22258 - 22301 7.1 24 7 Tu 1 . - CDS 22372 - 23247 872 ## COG1737 Transcriptional regulators - Prom 23307 - 23366 8.0 + Prom 23335 - 23394 8.0 25 8 Tu 1 . + CDS 23474 - 24139 579 ## Closa_2516 phosphoesterase PA-phosphatase related protein + Term 24251 - 24302 12.5 - Term 24239 - 24290 9.5 26 9 Tu 1 . 
- CDS 24341 - 25678 391 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - Prom 25823 - 25882 10.3 - Term 25707 - 25749 -0.8 27 10 Op 1 . - CDS 25910 - 27160 805 ## COG0477 Permeases of the major facilitator superfamily 28 10 Op 2 . - CDS 27160 - 28251 766 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities - Prom 28283 - 28342 2.0 29 11 Op 1 . - CDS 28393 - 29160 542 ## COG1402 Uncharacterized protein, putative amidase 30 11 Op 2 . - CDS 29202 - 30581 1291 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 31 11 Op 3 . - CDS 30614 - 31315 748 ## Pden_1201 hypothetical protein 32 11 Op 4 . - CDS 31327 - 32040 678 ## Htur_4152 hypothetical protein - Prom 32074 - 32133 6.4 - Term 32210 - 32246 4.2 33 12 Tu 1 . - CDS 32274 - 32753 470 ## COG1846 Transcriptional regulators - Prom 32795 - 32854 5.5 - Term 32867 - 32925 6.3 34 13 Op 1 . - CDS 32972 - 33304 378 ## gi|160937897|ref|ZP_02085255.1| hypothetical protein CLOBOL_02791 35 13 Op 2 . - CDS 33374 - 34393 1066 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 36 13 Op 3 . - CDS 34419 - 34868 353 ## gi|160937899|ref|ZP_02085257.1| hypothetical protein CLOBOL_02793 - Prom 34896 - 34955 6.0 + Prom 34928 - 34987 6.8 37 14 Tu 1 . + CDS 35027 - 35956 1094 ## COG0549 Carbamate kinase - Term 35985 - 36047 10.5 38 15 Tu 1 . - CDS 36056 - 36562 586 ## Closa_0702 CarD family transcriptional regulator - Prom 36594 - 36653 4.8 39 16 Tu 1 . - CDS 36738 - 37202 533 ## COG0251 Putative translation initiation inhibitor, yjgF family - Term 37237 - 37287 7.8 40 17 Op 1 24/0.000 - CDS 37345 - 40563 3990 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) 41 17 Op 2 . - CDS 40563 - 41636 980 ## COG0505 Carbamoylphosphate synthase small subunit - Prom 41700 - 41759 10.1 - Term 41799 - 41845 8.3 42 18 Tu 1 . 
- CDS 41863 - 42708 479 ## Olsu_0725 transcriptional regulator, MarR family - Prom 42854 - 42913 8.3 - Term 42976 - 43008 6.1 43 19 Tu 1 . - CDS 43055 - 44305 1199 ## COG3409 Putative peptidoglycan-binding domain-containing protein - Prom 44350 - 44409 3.5 - Term 44327 - 44362 2.4 44 20 Tu 1 . - CDS 44419 - 45054 873 ## Closa_3203 hypothetical protein - Prom 45117 - 45176 5.8 + Prom 45449 - 45508 4.6 45 21 Op 1 16/0.000 + CDS 45588 - 46691 748 ## COG1985 Pyrimidine reductase, riboflavin biosynthesis 46 21 Op 2 15/0.000 + CDS 46673 - 47353 515 ## COG0307 Riboflavin synthase alpha chain 47 21 Op 3 18/0.000 + CDS 47445 - 48662 1185 ## COG0108 3,4-dihydroxy-2-butanone 4-phosphate synthase 48 21 Op 4 . + CDS 48733 - 49260 421 ## COG0054 Riboflavin synthase beta-chain + Term 49405 - 49450 -0.1 - Term 49187 - 49230 2.3 49 22 Op 1 5/0.000 - CDS 49269 - 52151 2747 ## COG0642 Signal transduction histidine kinase 50 22 Op 2 . - CDS 52283 - 53998 1429 ## COG2199 FOG: GGDEF domain - Prom 54055 - 54114 7.8 - Term 54132 - 54183 7.0 51 23 Op 1 . - CDS 54245 - 54865 471 ## COG0406 Fructose-2,6-bisphosphatase 52 23 Op 2 . - CDS 54887 - 55930 1182 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 53 23 Op 3 2/0.000 - CDS 55996 - 57480 1596 ## COG0554 Glycerol kinase 54 23 Op 4 12/0.000 - CDS 57513 - 58451 1073 ## COG3958 Transketolase, C-terminal subunit 55 23 Op 5 2/0.000 - CDS 58444 - 59277 956 ## COG3959 Transketolase, N-terminal subunit 56 23 Op 6 21/0.000 - CDS 59292 - 60248 1190 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 57 23 Op 7 16/0.000 - CDS 60254 - 61771 1696 ## COG1129 ABC-type sugar transport system, ATPase component 58 23 Op 8 6/0.000 - CDS 61830 - 62894 1450 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 63080 - 63139 9.9 - Term 63154 - 63199 10.1 59 23 Op 9 . 
- CDS 63226 - 64242 1011 ## COG1609 Transcriptional regulators - Prom 64268 - 64327 5.5 - Term 64302 - 64353 7.2 60 24 Op 1 12/0.000 - CDS 64359 - 64895 551 ## COG0602 Organic radical activating enzymes 61 24 Op 2 . - CDS 64897 - 67056 2704 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase - Prom 67168 - 67227 8.0 - Term 67283 - 67318 6.5 62 25 Tu 1 . - CDS 67342 - 68379 1122 ## COG1609 Transcriptional regulators - Prom 68499 - 68558 5.8 + Prom 68636 - 68695 3.0 63 26 Op 1 3/0.000 + CDS 68741 - 69361 394 ## COG2190 Phosphotransferase system IIA components 64 26 Op 2 1/0.143 + CDS 69390 - 70982 1601 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 65 26 Op 3 . + CDS 71058 - 71921 557 ## COG3568 Metal-dependent hydrolase 66 26 Op 4 . + CDS 71956 - 73455 1287 ## COG1640 4-alpha-glucanotransferase + Term 73458 - 73521 8.9 - Term 73452 - 73503 18.1 67 27 Op 1 . - CDS 73545 - 74327 892 ## COG0860 N-acetylmuramoyl-L-alanine amidase 68 27 Op 2 3/0.000 - CDS 74403 - 74843 574 ## COG1522 Transcriptional regulators 69 27 Op 3 . - CDS 74877 - 76097 1354 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 76252 - 76311 3.7 70 28 Op 1 . - CDS 76313 - 77128 831 ## COG0253 Diaminopimelate epimerase 71 28 Op 2 . - CDS 77205 - 77762 495 ## Closa_1273 ANTAR domain protein with unknown sensor 72 28 Op 3 . - CDS 77799 - 79136 1353 ## COG0174 Glutamine synthetase - Prom 79286 - 79345 9.4 - Term 79165 - 79216 2.9 73 29 Tu 1 . - CDS 79347 - 80207 1123 ## COG0253 Diaminopimelate epimerase - Prom 80307 - 80366 4.8 + Prom 80321 - 80380 11.6 74 30 Tu 1 . 
+ CDS 80443 - 81312 679 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 81364 - 81402 -0.9 + Prom 81362 - 81421 5.4 75 31 Op 1 20/0.000 + CDS 81459 - 82622 1557 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component + Term 82659 - 82689 1.0 76 31 Op 2 24/0.000 + CDS 82777 - 83658 1029 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 77 31 Op 3 19/0.000 + CDS 83669 - 84784 1258 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 78 31 Op 4 18/0.000 + CDS 84784 - 85650 246 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 79 31 Op 5 . + CDS 85637 - 86344 241 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 86351 - 86408 7.6 - Term 86337 - 86396 14.4 80 32 Tu 1 . - CDS 86421 - 87179 442 ## SpiBuddy_1120 GntR family transcriptional regulator - Prom 87233 - 87292 5.2 81 33 Tu 1 . - CDS 87294 - 87680 78 ## COG2421 Predicted acetamidase/formamidase 82 34 Op 1 . - CDS 88246 - 89235 336 ## COG0451 Nucleoside-diphosphate-sugar epimerases 83 34 Op 2 . - CDS 89251 - 89952 356 ## COG1878 Predicted metal-dependent hydrolase 84 34 Op 3 16/0.000 - CDS 89965 - 91029 764 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 91054 - 91113 3.9 85 34 Op 4 21/0.000 - CDS 91125 - 92105 606 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 86 34 Op 5 5/0.000 - CDS 92118 - 93611 787 ## COG1129 ABC-type sugar transport system, ATPase component 87 34 Op 6 . - CDS 93645 - 94403 334 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 94576 - 94635 10.0 - Term 94608 - 94643 4.0 88 35 Tu 1 . - CDS 94679 - 95317 728 ## bpr_III235 orotate phosphoribosyltransferase PyrE2 (EC:2.4.2.10) - Prom 95360 - 95419 6.1 - Term 95465 - 95516 20.1 89 36 Tu 1 . 
- CDS 95536 - 97689 1260 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 - Prom 97747 - 97806 5.7 + Prom 97800 - 97859 8.3 90 37 Tu 1 . + CDS 97928 - 99544 1637 ## COG0504 CTP synthase (UTP-ammonia lyase) 91 38 Tu 1 . + CDS 99651 - 99815 252 ## gi|160937962|ref|ZP_02085320.1| hypothetical protein CLOBOL_02856 + Term 99842 - 99883 -0.7 - Term 99830 - 99871 0.1 92 39 Op 1 34/0.000 - CDS 100003 - 100746 499 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 93 39 Op 2 31/0.000 - CDS 100743 - 101546 984 ## COG0765 ABC-type amino acid transport system, permease component 94 39 Op 3 . - CDS 101591 - 102535 1237 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Term 102571 - 102614 5.1 95 40 Op 1 . - CDS 102647 - 102832 130 ## gi|160937966|ref|ZP_02085324.1| hypothetical protein CLOBOL_02860 96 40 Op 2 . - CDS 102849 - 103757 1006 ## COG4509 Uncharacterized protein conserved in bacteria 97 40 Op 3 . - CDS 103784 - 104644 676 ## Closa_0110 hypothetical protein - Prom 104695 - 104754 6.7 + Prom 104749 - 104808 9.2 98 41 Tu 1 . + CDS 104839 - 106074 1153 ## COG3629 DNA-binding transcriptional activator of the SARP family 99 42 Tu 1 . - CDS 106182 - 107567 1591 ## COG0534 Na+-driven multidrug efflux pump - Prom 107633 - 107692 5.8 - Term 107674 - 107707 6.1 100 43 Tu 1 . - CDS 107719 - 109128 1621 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 109228 - 109287 6.5 - Term 109292 - 109339 5.3 101 44 Tu 1 . - CDS 109361 - 110074 896 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family 102 45 Op 1 . - CDS 110227 - 111450 1353 ## COG1457 Purine-cytosine permease and related proteins 103 45 Op 2 . 
- CDS 111463 - 112254 890 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase - Term 112364 - 112401 6.1 104 46 Op 1 9/0.000 - CDS 112592 - 113896 1137 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 105 46 Op 2 . - CDS 113904 - 114611 803 ## COG3279 Response regulator of the LytR/AlgR family - Prom 114780 - 114839 6.3 + Prom 114719 - 114778 3.9 106 47 Tu 1 . + CDS 114807 - 116264 1706 ## COG0591 Na+/proline symporter + Term 116392 - 116429 0.5 107 48 Tu 1 . - CDS 116357 - 117682 919 ## BT9727_0681 hypothetical protein Predicted protein(s) >gi|157101641|gb|DS480683.1| GENE 1 414 - 1319 882 301 aa, chain - ## HITS:1 COG:BH3937 KEGG:ns NR:ns ## COG: BH3937 COG0039 # Protein_GI_number: 15616499 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Bacillus halodurans # 2 295 17 303 310 320 53.0 2e-87 MVGMSYAYCLLNQSVCDELVLIDVNKKRAEGEAMDLNHGLAFANSSMTIYAGEYDDCSDA DIVVICAGVAQKQGETRLDLLKRNAEVFRSIIEPVTSSGFNGLFLVATNPVDIMTRITCT LSGFNPRRVLGTGTALDTARLRYLLGDYLKADPRNVHAYVMGEHGDSEFVPWSQALLATK PILELCGENGEAVCRQRFDEIEEEVRTAAYKIIEAKSATYYGIGMALTRITKAILGDEHS VLTVSAMLRGEYGQMDVFAGVPCIINQNGVQRVLPLSLTPEELEKLGRSCDTLREGYDGI F >gi|157101641|gb|DS480683.1| GENE 2 1484 - 2545 1168 353 aa, chain - ## HITS:1 COG:aq_745 KEGG:ns NR:ns ## COG: aq_745 COG2805 # Protein_GI_number: 15606134 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Tfp pilus assembly protein, pilus retraction ATPase PilT # Organism: Aquifex aeolicus # 1 351 13 362 366 311 44.0 1e-84 MNVMELLTSAVEKNAADIFLIPGMPFSYKIGGRIIYQGDNRIMPDEMDKMITEIYGLAKN RGMDKVQSHGDDDFSFAIPGVSRFRASVFRQRGSLAGIIRVVRFELPDAGQLHLPDSIIG VSRLTKGMVLVTGPAGSGKSTTLACIIDEINSTRNAHVITLEDPIEYLHRHKQSVVTQRE IVTDTDSYVTGLRASLRQAPDVILLGEMRDYETISIAMTAAETGHLILSTLHTVGAANTI DRVIDAFPPNQQQQIRTQLAMVLDAVISQQLIPTVDGGVQPAFEIMFLNNAIRNMIRESK IHQIDGIIATSQEEGMISMDNSLIKLYRDGVISRENAIAYSSNSELMEKKLAR >gi|157101641|gb|DS480683.1| GENE 3 2571 - 
3050 555 159 aa, chain - ## HITS:1 COG:no KEGG:Closa_1304 NR:ns ## KEGG: Closa_1304 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 159 1 161 164 109 41.0 4e-23 MAGEASRNKVNIGASSLILIFIVLCMATFGLLSLSSAQGDLKLAGRNGEAVQAYYEADSR GQQWLKEVDQVLTEEMGEGKDSTQCSLDIKDRLGELYNRETGLISTDIPMDRGQSLRIEL VLMCGEKHYDIKSWYVYASDEYEIDNSMPVWGGAAPSQE >gi|157101641|gb|DS480683.1| GENE 4 3102 - 3617 407 171 aa, chain - ## HITS:1 COG:no KEGG:Closa_1303 NR:ns ## KEGG: Closa_1303 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 63 1 61 139 69 57.0 5e-11 MNRKSGSGSGPFLMEMLVVVGFFIICASICVLVFVKADNISKDARDINQAVLKAQSLAEE LKAGQALRWSEMLPDRNIWEHLADADEGNQEQVQWNQELVQSEGYEGIHTMYWNSSWEET EPDSEPAFLGVIYTGTVDKMERADILIMRYGRGANKGKMLYRLQTETYAGP >gi|157101641|gb|DS480683.1| GENE 5 3610 - 4104 574 164 aa, chain - ## HITS:1 COG:no KEGG:Closa_1302 NR:ns ## KEGG: Closa_1302 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 159 1 159 163 179 59.0 4e-44 MNRKTDKKGNAVSVLFTMLLFLVFVLCALFTVLIGSRVYENINVRSDANYTGNTALSYIA NKVRQGDRAGMVNVADVDGIQVLEMKQEIGDSEYVTWIYWDEGSIRELFTDTSSGLGLAD GLEILECQGLKFSKDGRLLRIETVGEGGGSLELSLRSGGLETDE >gi|157101641|gb|DS480683.1| GENE 6 4118 - 4447 509 109 aa, chain - ## HITS:1 COG:no KEGG:Closa_1301 NR:ns ## KEGG: Closa_1301 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 106 5 104 107 129 62.0 3e-29 MKVKSEKKEKSQALGYILSLAAMAAMVALFAAAVLNFSGRTGAREEETLRKAVARASVQC YAIEGRYPPSVEYLEENYAVQINRKKYNVFYDGFASNVMPEITIIPIDE >gi|157101641|gb|DS480683.1| GENE 7 4465 - 5523 1209 352 aa, chain - ## HITS:1 COG:aq_747 KEGG:ns NR:ns ## COG: aq_747 COG1459 # Protein_GI_number: 15606135 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Aquifex aeolicus # 12 351 65 406 408 132 24.0 8e-31 
MAKKGEAYYRYSYEELSAFCLQISLLLEAAVPLEEGLAIMAEDAASQGEKDMLLYMAEGV ELGDPFFKVMEDTGVFPAYVVRMAKLGQQTGTMDQMMKSLSDYYEKEYRLLRAIKNAVTY PVMMVVMLLVVLFVLFSKVMPVFNKVYEQLGAQMPPVAASAMRLGGWLSGAALIAGAVLA LALCGIWTASKFGKRFALVERFVNFVKGRSKIALAVANRRFTAVLALTLKSGMELEKGMD LAKELVENESVAVRIGKCSEQLQVGESYYQAMKDTGLFSGFYVQMIKVGTRSGHLDSVMD EISQDYEEMADTAIDDMIARFEPTIVAVLAVSVGLVLLSVMLPLVGVLSAIG >gi|157101641|gb|DS480683.1| GENE 8 5539 - 6462 1267 307 aa, chain - ## HITS:1 COG:no KEGG:Closa_1299 NR:ns ## KEGG: Closa_1299 # Name: not_defined # Def: transglutaminase domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 34 307 54 327 327 353 66.0 6e-96 MKTAYKPVVKLAAAAAAAFFAVSSIQVPVTALAYTKPAGSHVLTPIASGVTTYSNEKATL DASNVSEGYIMVKYTGSVGKIKVQITKSGSETYTYDLSSSGVYEVFPLSEGSGSYSVKVF ENIQGNQYSQAFSQNVNATITNQFGPFLYPNQYVNFNAASAAVQTGASVAASATDQLEVV SNVYNYVINNVTYDTAKASSVQSGYLPNVDVVLAQKKGICFDYAALMTAMLRSQDIPTKL VVGYTGNLYHAWINVYLEGQGWVDNIIYFDGNSWKLMDPTFASSSGQSQEIMQYIGNGSN YRAKYSY >gi|157101641|gb|DS480683.1| GENE 9 6493 - 7236 675 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 8 247 9 249 255 264 51 1e-69 MRYEELTIEGRNAVLEAFRSGKTIDKLFVLDGCQDGPVRTILREARKYDTIINYVPKERL DQISETGKHQGVIAYAAAYEYAEVEDMLKAAEEKGEPPFLILLDGIEDPHNLGAIIRTAN LAGAHGVIIPKRRAVGLTATVAKTSAGALNYTPVAKVTNLTATMEDLKKKGMWFVCADMG GELMYKMNLKGSIGLVVGNEGEGVGKLVREHCDMVASIPMKGDIDSLNASVAAGVLAYEI VRQRLEM >gi|157101641|gb|DS480683.1| GENE 10 7233 - 7703 230 156 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 [Bacillus selenitireducens MLS10] # 21 143 5 130 141 93 38 5e-18 MEESVTIQEFLTYYRQCMRLEPVDENSYSPLVLAYIGDSVYEVIIRTKVINRGSMQVNKM HKQSSELVKAGTQAELIKAIEDMLTQEEHAVFKRGRNAKSATSAKNASVIDYRMATGMEA LVGWLFLRQEYNRLVYLISQGLEKLGRMPDPEGDKT >gi|157101641|gb|DS480683.1| GENE 11 7688 - 9097 1692 469 aa, chain - ## HITS:1 COG:BH0111 KEGG:ns NR:ns ## COG: BH0111 COG0215 # Protein_GI_number: 15612674 # Func_class: J Translation, ribosomal structure and 
biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Bacillus halodurans # 1 469 3 465 466 495 54.0 1e-140 MKIFNTLSRKKEEFVPLTPGEVKMYVCGPTVYNFIHIGNARPMIVFDTVRRYFEYKGYDV RYVSNFTDVDDKIIKKAIEEGVDADTVSRRYIEECKKDMAAMNVKPATVHPQATQEICGM LEMIQTLIDKNHAYVAGDGTVYFRTGSFKEYGKLSHKNLDELQSGFREIKVTGEEGKEDP NDFVLWKPKKEGEPFWESPWCDGRPGWHIECSVMSKRYLGEQIDIHAGGEDLIFPHHENE IAQSECANDKNFATYWMHNAFLNIDNKKMSKSLGNFFTVREISEKYDLQVLRFFMLSAHY RSPLNFSADLMEASKNGLERILTCVEKLRDLEAKASDSSKTAGEQATMEEADKLRGKYEE AMDDDFNTADAISAIFELVKLANTTADESGTREFVSYMKTMIEELCDVLGIITEKKEEVL DSEIEEMIEARQQARKDKNFALADEIRGKLLDMGIVLEDTREGVKWKRA >gi|157101641|gb|DS480683.1| GENE 12 9181 - 9732 301 183 aa, chain - ## HITS:1 COG:CAC0434 KEGG:ns NR:ns ## COG: CAC0434 COG0245 # Protein_GI_number: 15893725 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Clostridium acetobutylicum # 1 154 1 154 155 194 64.0 9e-50 MRIGQGYDVHKLVQGRDLILGGVNIPYEKGLLGHSDADVLVHAVMDALLGAAALGDIGQH FPDTDPAYKGISSIELLKKVGELLDGKGYVIENIDATIIAQRPKLAQYRPQMAANIADAL HLDVSRVSVKATTEEGLGFTGTGEGISSQAITLLTETADYCYDSKIMEGGCGGCPGCSRA KEE >gi|157101641|gb|DS480683.1| GENE 13 9805 - 10728 356 307 aa, chain - ## HITS:1 COG:BH1953 KEGG:ns NR:ns ## COG: BH1953 COG1597 # Protein_GI_number: 15614516 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Bacillus halodurans # 1 286 1 276 295 134 32.0 2e-31 MYNFIVNPKAGSGKGLKIWKAIARYMESHNIEYEVFLTEGIGDARTAAMRLTDSYDEPRY IIAVGGDGTMNEVLDGVSFHGPLSLGYIPAGSGNDLARSLRMPGRTIKCLKKQLAPRHFT MIDYGVLTYGNQEVSHRRFLVSAGIGFDAAVCQDALSSRCRQRLNRLGMGKLSYILIGIH QFFKCKSCKGYILLDGVKKVEFNNILFISCHIQPSEGGGFLFAPRADGSDGKLNICVMSH GTRTKLIPVLLSALIGRRRPKGVRTYECREVSIHTEAALPVHADGESCGFQTDIQASCIA RKVRMMI >gi|157101641|gb|DS480683.1| GENE 14 11179 - 12306 1109 375 aa, chain + ## HITS:1 COG:no KEGG:SpiBuddy_1165 NR:ns ## KEGG: SpiBuddy_1165 # Name: not_defined # Def: major 
facilitator superfamily MFS_1 # Organism: Spirochaeta_Buddy # Pathway: not_defined # 13 355 22 369 398 150 29.0 8e-35 MEKEQIFQRNAGFLAFFLSGICAISSGVVVSMLQETYGFAYGMTGTLLSLMSIGNLLAGF ASGMLPAKIGTKKTVVLLTAGYGAGYLIMGFSGWMLVLMLGFFLVGVAKGSTINTCTILV ADNSGDRTKGMNIMHSCYALGALLCPFFIGAAMKAGNAVPMLVLASCGFLLWLTFCVTPA ETKAMKKDRSIDKGFLKSKKFWLLTGLLFCQNAAETSVTGWMVTYFKGNGIISGSLSPYT VTVMWGATLIARLLIAFVIPIKNSYSAMIKMGIGCIIFYLGLMMAGTQTAAILLLFAFAF AMAGMNPTAVASAGRMTSAASMGIMLPAASSGAIIMPWIIGMVAEHAGIEIGMASNIIPC AGMLLFSVAVKRLKE >gi|157101641|gb|DS480683.1| GENE 15 12320 - 12958 648 212 aa, chain + ## HITS:1 COG:no KEGG:Spico_0824 NR:ns ## KEGG: Spico_0824 # Name: not_defined # Def: putative phage-related protein # Organism: S.coccoides # Pathway: not_defined # 1 161 1 165 188 79 34.0 1e-13 MLYDKDIREPLFDFLEEMYGKIRILEEKQIGKSRADIVMVLPDLVAGIEIKSDADTYARL KRQVKDYDQFYDRNYVAAGSSHALHIEEHVPEWWGIITVEQVAGQADFYLLREARPNPGV NLRKKLSILWRPELAHIQELNDMPKYKEKSKQFVIDKIAEKIPEEILSKQISGALFERDY NEIEHVIQEFKKTRTAGTMKGRAGRTRKGRRS >gi|157101641|gb|DS480683.1| GENE 16 13016 - 13483 503 155 aa, chain - ## HITS:1 COG:CAC0327 KEGG:ns NR:ns ## COG: CAC0327 COG1225 # Protein_GI_number: 15893619 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Clostridium acetobutylicum # 5 150 6 151 151 160 53.0 1e-39 MLSAGIKAPDFTLKDKDGKEVKLSSFLGRKVVLYFYPKDNTPGCTREACAFAGAYIGFKQ RNVEVIGVSRDSEKSHANFAAKHELPFILLSDPELTAIQAYDVWKEKKLYGKVSMGVVRS TYVIDENGVIEKVFEKAKPDTNAQEILDYLDGKAE >gi|157101641|gb|DS480683.1| GENE 17 13591 - 14805 1148 404 aa, chain - ## HITS:1 COG:CC0532 KEGG:ns NR:ns ## COG: CC0532 COG4948 # Protein_GI_number: 16124787 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Caulobacter vibrioides # 4 400 2 399 403 302 41.0 7e-82 MDNITIRDIKVFVTAPRGINLVIVKVETSEPELFGYGCATFTWRHKAVVTAIEEYLCPML 
KGRSVHNIEDIWQTMMGSSYWRNGPILNNAISGVDEALWDIKGKLAGMPLYSLLGGKARE GVTVYRHADGGCLEEVKECISRYIEEGYRHIRCHMGTYGGNFDGRRQQMVKPEGAPEGAY FHPKMYMDSVIRLFEQVRKDFGWELEVMHDVHERLSLADTLAFTKELEPFKLFFLEDSLA PDQVGYYKYMREQTAVPFAMGELFTNPAEWKTIIQNQWIDFIRVHLSDIGGITPAVKLAH FCDAYGVRTAWHGPNDLSPIGMCAQMHLDLNSHNFGIQEFSGFTQEEEAVFPGCPKIRDG YAYVDDTPGIGVGFDEKEAAKYPAVDMDHSWLFARLPDGTAVRP >gi|157101641|gb|DS480683.1| GENE 18 14870 - 15712 948 280 aa, chain - ## HITS:1 COG:SMc01979 KEGG:ns NR:ns ## COG: SMc01979 COG0395 # Protein_GI_number: 15966266 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 4 279 1 276 277 176 36.0 4e-44 MNKMKSKRRMEQVLCYVVLILLALMVLVPVLWMISTAFKTEAQTYSPKPQWIPDPISLES FRKFFTTYNFGRMTLNSLVTCIFAMIICITCACLAGYGVTRFQFKGKKQLMDFLLVTQMF PSVMLVVPFYAVLSKYHMTNKLIGLIIVYAATNVAFSTWMLVSYFKTVPIELDEAARVDG ASSFRIFWNIILPLIVPGIAAVAIFVLFSGWNEYMYSSVLISNDQLKTLTVGIISLNSQY QIKWNDLMAASTVSSLPLVVLFICFQKYFIAGMTGGAVKS >gi|157101641|gb|DS480683.1| GENE 19 15705 - 16598 790 297 aa, chain - ## HITS:1 COG:PM1761 KEGG:ns NR:ns ## COG: PM1761 COG1175 # Protein_GI_number: 15603626 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Pasteurella multocida # 20 290 11 282 294 166 34.0 5e-41 MSKVKTKGTAGKIKYQNDGYLFAFPIVFMLAALIIYPMAYGFYISFFNTNLVTKWKFVGF KYYIEAFTDPSFYKSVLLTFEFMIFVVIGHFVLGFILATLLNREFRGRTVFRVIFMLPWL FPEAVIALLFTWIMNPMYGVLNDMLKSLGIISANISWLGSKELAFPSVVFTCIWKGFPLV MTMILAGLQSISKDLYEAAVIDGANKWDSFRYITLPSLKPILTTVIILDSVWWFKQYTQV FTMTAGGPGTATNLISLSIYGTAFNDLRFGKGAAWGILVFIICYLINSVYKVVLKDE >gi|157101641|gb|DS480683.1| GENE 20 16626 - 17999 1463 457 aa, chain - ## HITS:1 COG:alr4277 KEGG:ns NR:ns ## COG: alr4277 COG1653 # Protein_GI_number: 17231769 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Nostoc sp. 
PCC 7120 # 55 454 65 461 461 90 23.0 5e-18 MLSVVMAASLVTGCAGSSSGTADTKAEEVKAGDTKDNGGTGTGNKDTQAASGEQVTLTVM DGYAPEDPHGQYIYQYAEEFMKENPDIKVEIQAIASGDIYTKLAAMATSPDDLPTMFFTS ADQVPTLYDLGLTEDLNNWMDKEAIDGLANGVMDACTIDGQMTYYPVAVQPQAVIYRMDR FEEAGLKVPATWDEFVDCAKALTKDTDGDGQVDQWGFSMVGSNNSSGQSRFMSYLWSNGY ELAYQEDGSDEWKTDITTDPAFVDVFSKWTDMNNVEGVVPTGITEVDYPTAANYFAMGYT SMFLTGPNALGVAYANNPELKGKLGSFKLPGEYSGTMLGAEGYAITAKSTDAEKAAAAKY LAFFTSHDKDMKFWESSGKIPSTTEGQKVSYITGDDYAGFLKQIEDGCRPTLAFAGISGL KSALGDAYAAVFSNEKTNDQAAEQLVKDLDQLLEDYN >gi|157101641|gb|DS480683.1| GENE 21 18211 - 19113 615 300 aa, chain - ## HITS:1 COG:BH0793 KEGG:ns NR:ns ## COG: BH0793 COG4753 # Protein_GI_number: 15613356 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 173 285 394 503 508 66 30.0 8e-11 MEIKINDYYEPDHFRAISEDNYNISRTCEYEKNEMHHIHNTSEILLVEDGAADYYISGRK YHVEPYDILVIGAMEHHLRRIDRLPFSRYGFTAKPTYYRSIILDHDLQKVFGTPPPDVFV QNYKNVDRAVFGHLIDMLCYLKEEGEVHKPFRTQIQRTIITQIAVLLFRVFRFERSESGI SPSNARMLEIKEYIDVHFNEELNLNMLGEKFFLHPTTISKDFTKYCGYNLNKYINTVRVC EAASLLENSSDSVAVIAERCGYDSVNTFLRQFKSIMDVSPLQYRKSAHEWWETSRRCRRK >gi|157101641|gb|DS480683.1| GENE 22 19278 - 21215 2438 645 aa, chain - ## HITS:1 COG:BH0828_3 KEGG:ns NR:ns ## COG: BH0828_3 COG1299 # Protein_GI_number: 15613391 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Bacillus halodurans # 303 643 2 334 336 321 53.0 2e-87 MRITELLKKESIELGVKVSDKEEAVDRLVSLMDAGGRLKDTAGYKEGILAREALGSTAIG EGIAIPHAKVEAVKEPGLAAMVVPEGVDYDAFDGSLANLLFMIAAPAEGADVHLEALSRL STLLMNPGFKEGLMGAADKDEFLRIIDEAETERFGAPEGEAAKDAKADAEAGASSGYRVL AVTACPTGIAHTYMAAENLENTGKKLGIALKAETDGSGGAQNVLTREEIAAADAIIIAAD KNVEMARFDGKPVIMVPVADGIHKAEELIKRAVDGTVPVYHHTGAAGGESVSEGNDSIGR TIYKHLMNGVSHMLPFVIGGGILIALAFLFDDYSINPANFGKNTPLAAYLKTVGEQAFGM MLPILAGFIAMSIADRPGLAVGLVAGLIAKMGSTFVNPAGGDVNAGFLGALFAGFVGGYI 
VVGLRKLFSKLPKSLEGIKPVLLYPVIGIFLAAVVTTFINPYMGMINDGLTHFLNGMGGT SRIVLGMVLGGMMSIDMGGPFNKAAYVFGTAQLAEGNFEVMAAVMAGGMVPPIAIALCTT FFKSKFTEKERQSGIVNYIMGLSFITEGAIPFAAQDPLRVIPSCIIGSAIAGGLSMAFGC TLRAPHGGIFVLPTIGNHLMYLAAVVAGAAAGCVILGMLKKNAAQ >gi|157101641|gb|DS480683.1| GENE 23 21237 - 22139 1121 300 aa, chain - ## HITS:1 COG:SPy0854 KEGG:ns NR:ns ## COG: SPy0854 COG1105 # Protein_GI_number: 15674887 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Streptococcus pyogenes M1 GAS # 1 289 1 287 303 254 46.0 2e-67 MIYTVTFNPALDYVVKVEHFALGSVNRTIRENIFYGGKGINVSALLANLGYRSTALGFIA GFTGREIERGVRSLGFDSDFIQVEKGMSRINVKLKSDQESEINGMGPEITAGDVDQLFEK LSYLKKEDVLVLSGSIPAAIDSDIYQRIMERLDGRGIRMVVDAEKDLLLKVLKYRPFLIK PNNHELGQMFGLELETEEEIILHARKLKEQGAVNVLVSMAGNGAILVTEDGSVHRRQAAR GTVKNSVGAGDSMVAGFIAGYLEKGDYAYALKLGTACGGATAFSDGIGTKDEIMRLLNTL >gi|157101641|gb|DS480683.1| GENE 24 22372 - 23247 872 291 aa, chain - ## HITS:1 COG:BH2675 KEGG:ns NR:ns ## COG: BH2675 COG1737 # Protein_GI_number: 15615238 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 10 287 7 283 287 131 30.0 2e-30 MRQAEYDMGETRSALGVICSSYDEFFEAEKKIADYMMEHKAEVVDMTVGELARASGTSDA TVSRFCRRCGFKGFHSLKLALAREVLEEEQMDRNVSNDIDRGDLRQSLQNILANKVAELT ETVNMMDPANLEHILDRLEHARMVQLAAVGNTIPVALDGAFKFNQLGIPAVAGDIWETQT AYTFNLGQEDVVMVISNSGSSKRLQTLAEGAKENGCTVILITNNGNSPLARICDYKIVTA TREKLLTEEFWFSRVTATAVMEILYLLLRAGMKGSMEHIRRHEKAISPDKK >gi|157101641|gb|DS480683.1| GENE 25 23474 - 24139 579 221 aa, chain + ## HITS:1 COG:no KEGG:Closa_2516 NR:ns ## KEGG: Closa_2516 # Name: not_defined # Def: phosphoesterase PA-phosphatase related protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 221 1 218 218 274 62.0 2e-72 MKNIITFLSKYKHAWLLCYGFIYLPWFCYLEKTVTRHYHVMHVALDDFIPFNEYFIIPYM MWFLYVAGTIVFFLFRNKEDYYRICTFLFSGMTISLIICTFFHNGTDFRPAIDPGKNIFS GMVAALYQTDTPTNVFPSIHVYNSIGTHIAIMKSESFKKYPWVRAGSAILMVSICLSTVF 
LKQHSVIDMVGAAIMAYVIYGIVYGYNWSAEDKKVTQKALS >gi|157101641|gb|DS480683.1| GENE 26 24341 - 25678 391 445 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 20 441 12 421 447 155 27 1e-36 MKEGKHKKESVFELNGGMPPLGQALPLAFQHIVAMIVGCVTPAIIIAGVAGLSGEDKVII VQAALVVSALATLLQVFPVGFRKGSLQFGAGLPLIVGVGFAYVTSMSTIVEGYGIATIFG AEIVGGIVAVLVGLSYNKIKSLFPPLIIGIVILCIGLSLYPIAIRYMAGGQGSESYGSWQ NWLVAALTLTVVIVCNNYGRGIIKLASVLIGIIVGYGVALGFGMVDFSAVGRAGIFALPK IMHFGIEFEVSSCVAIGLLFAINSLQAMGDVTATTMGGMDREATDKELRGGILGFGLSNI IGAFFGGLPNVTFSQNVGIVTTTKVVNRWVAALAAIILGIAGILPKFSAFLTSVPQCVLG GATLTVFAAITVTGIRMIFNTGLSVRSGFIVGIAVALGAGVTQASDALCGFPSWMTVVFA KTPVVIAAVVSILLNLLLPKDKEER >gi|157101641|gb|DS480683.1| GENE 27 25910 - 27160 805 416 aa, chain - ## HITS:1 COG:BS_ybfB KEGG:ns NR:ns ## COG: BS_ybfB COG0477 # Protein_GI_number: 16077286 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 7 398 3 399 416 113 24.0 6e-25 MEKQENRLFYGWSVVLGCAVVCGCNIGVLSNTVGVFLKPVSIELGVSRSQIALYSSIFSV IGMFSAPFQGRLMEKYSLKKLMVSGSVIAGICLFCYSFAPGLWFFYANAVLCGLVFGFTN LIPVNKILSNWFTKKKGTAVGVALAGSGLMAMVVTPVLSHMVADYGWRSGYRFIGSLYLL LMVPVILFVIKESPEDGKFGGGGTGDTETGDQNIKTGLTRAQAMGTRRFWLILAALVLAS SVAMGIQQHMIAYLTDMGYGQQYASGIYSLSMGVLMAGKVILGTLYDRLGIKGASVYICL ALAASLGFLLMADLPGIPYLFAAAFGLANAIQSIPATCLVTRFFGTREFTSIYGICNAGN MAGIALGTSMSAWIYDASGSYVAAWYFYLLLSAVIFILYISADREYERRVLQKCCQ >gi|157101641|gb|DS480683.1| GENE 28 27160 - 28251 766 363 aa, chain - ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 3 350 29 377 384 223 33.0 5e-58 
MGIADMEFATAPCVIEAVRRRMEHPLFGYFNLDNRFYEAIMNWHGRHFHNEDLRPEHILY QNGVLGGIAAALQAFTLPGDRVLCSGPVYSGFIRTVGDLGRVLCPSMLKPDREGIMRLDL EDMERKILDNKVKAFIFCSPHNPTGRVWDEDEIRDVAVLCGKYHVLLISDEIWADFTPGG KQHIPTAMVSGLAKDITISLYSPTKTFNLASLRCAYAVAYAPWLRDAMEKAAVMTHFNNP NVLSCEALIGAYQEGGEWVEQLNRYIRKNQEYVHDYVTSRFSGVRMQMPEGTYLGWLDCS APSLDFEQILKDMEHCGVLGNDGASYLAPKHVRLNLACPMSVCEEAMSRLERYVFHIRGE RTA >gi|157101641|gb|DS480683.1| GENE 29 28393 - 29160 542 255 aa, chain - ## HITS:1 COG:MK0183 KEGG:ns NR:ns ## COG: MK0183 COG1402 # Protein_GI_number: 20093623 # Func_class: R General function prediction only # Function: Uncharacterized protein, putative amidase # Organism: Methanopyrus kandleri AV19 # 14 245 2 213 224 108 33.0 8e-24 MKHPYRWDNLNREELRKLAEENTLVLIPLGATEQHGPHLPAGTDSMLAEAACDIAAKKLD EMGRHAVIAPTVTISNSLHHGSYPGTLSLDPRLYMDYLTSIARGIVSHGFRNICCVNGHG GNGTPTDMAVMDIVTRYGIHISWVPYYVGCNGDFEAILDTQSNIFHADEVETSLMLALDE SLVDPCYKEAKGGNVQKGTPHGRPGRPFTFLPFETRTETGILGNSYAATREKGEQLWEAV ASHLVQALLDPELWQ >gi|157101641|gb|DS480683.1| GENE 30 29202 - 30581 1291 459 aa, chain - ## HITS:1 COG:AGl2896 KEGG:ns NR:ns ## COG: AGl2896 COG1473 # Protein_GI_number: 15891558 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 18 451 94 544 552 223 33.0 8e-58 MNDYIQLWYDEYETEARTLMHDIWEHPEFAMEEYYTAGRLAGFMKEQGFETRTFNAKEPQ SQSAPHNTVYAKWGSGKPVIGILGELDSLKGLGQENVPYYSPVPGCGHGCGHNLIAGCGA AAASSLKFAAEREELSGTIIYVGCPAEETLDGKVWLAKWGYFDDMDVCLMWHPGGHELKF AGYTNMALTSILFEYFGKTAHGVRAWNGRSALDACELMNIGVNYLREHMTPECSIHYVYE DGGDMPNVVPEHASVFYYIRSRDEENRELVERVKAVAQGAAIMTETKLKMTLRTYCRGNF PAIALNRYVYEEVKKIPPLTYSGEDYEFARKLHRNFFEEEPPEEPDALIPVKTSPVKDWD SIPFFRGTSDVGDVSHILPTIQLSGLGEVAGTRAHHWTVTAAAGTAIGEKAAVYTSKIIS QSALDILKNPGLVEGFWKDFTESRNARKMPPYEKCSFRI >gi|157101641|gb|DS480683.1| GENE 31 30614 - 31315 748 233 aa, chain - ## HITS:1 COG:no KEGG:Pden_1201 NR:ns ## KEGG: Pden_1201 # Name: not_defined # Def: hypothetical protein # Organism: 
P.denitrificans # Pathway: not_defined # 7 180 1 175 231 127 39.0 6e-28 MEENNVMTKEQLQEDYSNVYVKKVHLIGRATMAIAFILMFAPVLYMHFVAGFTAPADAYM ACAAAACAAGVGGWIAEPISYFPILGAAGTYMSYLAGNVGNTRVPVAVAVQSATDSDNSS PRGQMATVIGVGMSIFFSLIVLTVIILVGSAMLQVVPEPVLKALGYVLPCLYGSMLTMRL MANFKGSIKYVPLAFVVFLVCRYTGFTRYGLLTDIAVTCIFAYILHQSGAGKN >gi|157101641|gb|DS480683.1| GENE 32 31327 - 32040 678 237 aa, chain - ## HITS:1 COG:no KEGG:Htur_4152 NR:ns ## KEGG: Htur_4152 # Name: not_defined # Def: hypothetical protein # Organism: H.turkmenica # Pathway: not_defined # 9 231 6 227 236 100 28.0 4e-20 MSQAMVTVNRICFLFCLLIIGLVIIQAVLFIRNALMFNKKHRVLSEDEVRSVMKIGCVSA IAPACSIIVVALGLVSLIGPVLSFMRVGVIGSAAYETQMAEIAASTLGVTLGTEGITEST LTLCLFTMTLGSAPFLINTLITVKPMDDAMVKAAKSSRSFLPAFSLAAMMALLVYLGANN ASKSSPNLVGFAASALCTFALTKYVKKSGKKSLGNFTMSIAMLAGMICATVAYYSGL >gi|157101641|gb|DS480683.1| GENE 33 32274 - 32753 470 159 aa, chain - ## HITS:1 COG:BS_yhjH KEGG:ns NR:ns ## COG: BS_yhjH COG1846 # Protein_GI_number: 16078115 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 34 138 42 145 175 60 33.0 1e-09 MTQGGNDYTFTTFEASDIMHRAASFCRASNSSHEYGNGESYTATEVHFLKYVVENPGISA IELAHSWNKTRAAATQMLTKLERSGLLYRGRERYNEKKVLYYPTEKGLELHEMHRSYDTE NFGKFISYLEQRFTKEQIESAFQILQAYSDYHMQMDEFK >gi|157101641|gb|DS480683.1| GENE 34 32972 - 33304 378 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937897|ref|ZP_02085255.1| ## NR: gi|160937897|ref|ZP_02085255.1| hypothetical protein CLOBOL_02791 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02791 [Clostridium bolteae ATCC BAA-613] # 1 110 1 110 110 207 100.0 2e-52 MVKEFLKCNSIRERIQFLANTVLKDWDASELDTVMDIMGLSKEGMEAPEDKISAIEKYLA DYKHEVEMRGGIDGSRVDGIPAKDQGSTLFEGPSEAEVVSKMLNPFSKGQ >gi|157101641|gb|DS480683.1| GENE 35 33374 - 34393 1066 339 aa, chain - ## HITS:1 COG:CAC1488 KEGG:ns NR:ns ## COG: CAC1488 COG0463 # Protein_GI_number: 15894767 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis 
# Organism: Clostridium acetobutylicum # 1 339 1 338 338 384 53.0 1e-106 MKLLTVAIPCYNSESYMEHCINTLLTGGEEVEIIIVDDGSAKDRTGEIADRYAREYPTIC RAIHQENGGHGEAVNTGLKNATGIFFKVVDSDDWVNEEAYVQVLDTLRRFVYGGETLDML VTNFVYEKEGARRKKVMKYHTAFPKDKVFGWNDVKFFMTGQYILMHSVIYRTGLLIQCGL ELPKHTFYVDNIFVYQPLPHVKHIYYLDVNFYRYYIGRQDQSVNEEVMIGRIDQQIRVTK LMLGYYDAMKISSRKLRHYMIQYLEIMMTICSVLAIKSGTEENLEKKKELWQYLKKQNLP LYLRLRMGFLGQGCNLPGKGGRELLILGYKITQKFYGFN >gi|157101641|gb|DS480683.1| GENE 36 34419 - 34868 353 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160937899|ref|ZP_02085257.1| ## NR: gi|160937899|ref|ZP_02085257.1| hypothetical protein CLOBOL_02793 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02793 [Clostridium bolteae ATCC BAA-613] # 1 149 1 149 149 234 100.0 1e-60 MRDGKQVLPPGWQKRVLAAAAAALAGNLLVNVIPWPGSLLSSYEQAMIPAIYRRAELLWM TLVLAPILEEGAFRLVLYGWLRRFMAFLPAAVISSLAFGIYHGNWIQGTYAFLLGMVLAW GYEGSEYRKYPMAVLMHGAANLAALAAFG >gi|157101641|gb|DS480683.1| GENE 37 35027 - 35956 1094 309 aa, chain + ## HITS:1 COG:yqeA KEGG:ns NR:ns ## COG: yqeA COG0549 # Protein_GI_number: 16130776 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 4 308 3 308 310 273 49.0 3e-73 MAKKRIVMAIGHKDMGTNLPEQKAAVARTAKIIADFIQEGWQVAIVHSNAPQVGMIHTAM NEFGKQHDGYSQAPMSVCSAMSQGYIGYDLQNGIRAELIKRGIYKPVSTVLTQVTVDPYD EAFYTPVKVIGRVMSKEEADAEEAKGNHVTEVEGGYRRIVASPHPVAIVEIDAVKALMDA DQIVIACGGGGIPVMEQGYNLRGASAIIEKDLATGLLAKEIDADVMMILTSVDNVTLNYG TPQEQPISHMTIDQARDYISQGQFEFASMLPKVSASIDFLESGKGRKAIITTLDKAKDSL KGHAGTVIE >gi|157101641|gb|DS480683.1| GENE 38 36056 - 36562 586 168 aa, chain - ## HITS:1 COG:no KEGG:Closa_0702 NR:ns ## KEGG: Closa_0702 # Name: not_defined # Def: CarD family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 164 1 164 170 149 43.0 4e-35 MYQKGQYIIYGIRGVCEVMDIITIDRPDGPKDRLYYVLRPYYQQDSKIVTPVDSEKTITR PLLSKEEALELIDKISDVKEMEVTNDKQREERYKEALKTCDCRVWISMIKALYLRRKDRL EQGKKMTDLDERYFKTAEDNLYSELALSLGMKKDEMVSYITERVLAEA 
>gi|157101641|gb|DS480683.1| GENE 39 36738 - 37202 533 154 aa, chain - ## HITS:1 COG:SMc02103 KEGG:ns NR:ns ## COG: SMc02103 COG0251 # Protein_GI_number: 15965246 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Sinorhizobium meliloti # 2 150 4 152 154 108 40.0 5e-24 MKVEQKLREMGLELPELPKPNGLYAPSRRVGNLIYIAGQTPDIHGVRQVVGVVGEDLTIE DGQRSAQICALNILSVLKAELGDLDKVKQFVQIISYVRCAKGFGNQPQVIDGASRLFRDL YGEAGIAARLAIGANELPGGSATEILAVVEAKED >gi|157101641|gb|DS480683.1| GENE 40 37345 - 40563 3990 1072 aa, chain - ## HITS:1 COG:BH2536 KEGG:ns NR:ns ## COG: BH2536 COG0458 # Protein_GI_number: 15615099 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Bacillus halodurans # 1 1056 1 1050 1062 1192 56.0 0 MPKNPDIKKVLVLGSGPIIIGQAAEFDYAGTQACRSLKEEGIEVVLLNSNPATIMTDKDI ADRVYIEPLTVEVVEQLILKEKPDSVLPTLGGQAGLNLAMELEDAGFFKEHNVRLIGTTA LTIKKAEDREMFKETMEKIGEPVAPSDIVEDVKHGLEIAEKIGYPVVLRPAYTLGGSGGG IAANPEQCAEILENGLRLSRVGQVLVERCIAGWKEIEYEVMRDGAGNVITVCNMENIDPV GVHTGDSIVVAPSQTLGDKEYQMLRTSALNIISELGITGGCNVQYALNPDSFEYCVIEVN PRVSRSSALASKATGYPIAKVAAKIALGYTLDEIRNAVTKKTYASFEPMLDYCVVKMPRL PFDKFISAKRTLGTQMKATGEVMSICTNFEGALMKAIRSLEQHVDCLLSYDFTDLSDETL EEELNRVDDMRIWRIAEALRRGFTYEQIHDVTKIDFWFIDKLAILVEMEDALKAVGSGEK SLTAELLAEAKRIEYPDNVIARLTGTSQEDIKKLRYDNGIRAVYKMVDTCAAEFAAETPY YYSCFGGFNEAEQTTGRRKVLVLGSGPIRIGQGIEFDFCSVHSTWAFSREGYETIIVNNN PETVSTDFDIADKLYFEPLTPEDVESIVDIEKPDGAVVQFGGQTAIKLTEALMKMGVPIL GTAAEDVDAAEDRERFDAILEQCNIPRPAGHTVFTAEEAKEAAHKLGYPVLVRPSYVLGG QGMQIAISDEDIDEFIGIINQIAQEHPILVDKYIMGKEIEVDAICDGTDILIPGIMQHIE RTGIHSGDSISVYPAQDITQHNIDTIVDYTEKLARALHVKGMINIQFIVDGDDVYIIEVN PRSSRTVPYISKVTGIPIVPLATRIICGHTIRELGYKPGLQPAADYIAIKMPVFSFEKIR GADISLGPEMKSTGECLGIAKTFNEALYKAFEGAGIRLPKYKKMIMSVRHSDQEEAVDIA RRFAAVGYQIFATRGTARTLNKNGVKAYEIRKLEQESPNILDLVLGHQIDLIIDIPAQGA ERSHDGFIIRRNAIETGVNVLTSLDTANALVTSLENRAKELTLIDIATVKNA 
>gi|157101641|gb|DS480683.1| GENE 41 40563 - 41636 980 357 aa, chain - ## HITS:1 COG:lin1950 KEGG:ns NR:ns ## COG: lin1950 COG0505 # Protein_GI_number: 16801016 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Listeria innocua # 2 357 3 357 363 363 50.0 1e-100 MKAYLILEDGTVFTGTSIGSEREVISEIVFNTSMTGYLEVLTDPSYAGQAVVMTYPLIGN YGICHKDMESAKPWPDGYIVRELSRIPSNFRSEDTIQNFLKEHDIPGICGIDTRALTKIL REKGTMNGMITTKEYTDFSEPIKRMKEYTVTGVVLKTTTKETYVLPGNGKKVALLDYGAK RNIARSLNQRGCEVTVYPADTPAEKVLAGKPDGIMLSNGPGDPKENTAIIREVKKLYESD VPIFAICLGHQLMALANGMDTHKLKYGHRGGNHPVKDLETGRVYISSQNHGYVVDTDNLD PTVAVPAFVNVNDSTNEGLKYVGKNIFTVQYHPEACPGPLDSGYLFDRFLKMMEVND >gi|157101641|gb|DS480683.1| GENE 42 41863 - 42708 479 281 aa, chain - ## HITS:1 COG:no KEGG:Olsu_0725 NR:ns ## KEGG: Olsu_0725 # Name: not_defined # Def: transcriptional regulator, MarR family # Organism: O.uli # Pathway: not_defined # 120 241 30 152 160 82 38.0 2e-14 MDENRNEMDKMENTNHAGNSSNSSNASNASNTSNAGNTSNSNNDTTQDYCPCCPNHCQAD ALECGKGTRYFQENPSEGEHFREHHHGQGHGFHGRCPGHLNREDMSLEDILLYQFRSCTH FFRYGMGGKTGQQRILAMLAERGIITQRELQDMLGVQSGSLSEILNKVETCGYIMRRQNE RDRRQMNLELTDSGMEAARNFREEHMKKARAMFDGLTEDEKKQLSSLLEKMMEHWPRMEE GGACGQGGGSRFGRRGFPGADREHGTEDRRRGGRMGGRHSL >gi|157101641|gb|DS480683.1| GENE 43 43055 - 44305 1199 416 aa, chain - ## HITS:1 COG:CAC3244 KEGG:ns NR:ns ## COG: CAC3244 COG3409 # Protein_GI_number: 15896489 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative peptidoglycan-binding domain-containing protein # Organism: Clostridium acetobutylicum # 9 415 2 407 437 350 47.0 2e-96 MATNPERTSSGLLQINVINIQNNFPIQNARVTISYKGEPDRTVEQLTTNSSGQTEQVQLP APPVEYSLEPSIIQPYSEYNLMVEAEGFEPLAISGTEILAQATAIQPAVMTPVGGETPPE DPVVIPDHTLYGNYPPKIAEPEIKPVNESGEIVLSRVVVPQTVVVHDGVPTDSTAKDYYV PYRDYIKNVASSEIYSTWPRATITANVLAIMSFTLNRVYTEWYRNQGYDFTITSSTAFDH KWIYGRNIFESISLIVDEIFDNYLSRPGVRQPILTQYCDGRQVQCPNWMTQWGSCSLGEQ 
GYSPIEILRHFYGDSIYVNTAEQISGIPASWPGSDLTIGSSGDKVRQMQEQLDEIATVYS AIPRVTPDGIYGSGTANAVREFQSIFGLPQTGVVDFATWYKISHIYVGITRIAELV >gi|157101641|gb|DS480683.1| GENE 44 44419 - 45054 873 211 aa, chain - ## HITS:1 COG:no KEGG:Closa_3203 NR:ns ## KEGG: Closa_3203 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 211 1 215 215 216 49.0 6e-55 MKEGKKLIVTIGRQYGSGGSEIGARLAKELGIHFYDKNILRMNSDQSGIKESYYYLADEK AGNKLLYKIIKSLTPEKGSPSFGSDLVSADNLFRFQSEVIRKLAAEESCVIMGRCADYVL EGAEGLVRVFLYADMEYREKRISDLGYYEPRDIRKNIKRIDRERRDYHRYYTGRDWENVE NYDLMLNTASLGTEGAMEAIRAYLRIKEYEL >gi|157101641|gb|DS480683.1| GENE 45 45588 - 46691 748 367 aa, chain + ## HITS:1 COG:L0163_2 KEGG:ns NR:ns ## COG: L0163_2 COG1985 # Protein_GI_number: 15672975 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine reductase, riboflavin biosynthesis # Organism: Lactococcus lactis # 142 361 1 220 220 237 52.0 4e-62 MNDSDYMSIALKLARKGCKKVSPNPMAGAVIVREGRIIGQGSHLYFGGPHAERNALASLT ESPRGATLYVTLEPCCHHGKTPPCTDAIIESGIRRVVIGTRDPNPMVSGKGTKILVANGI EVTEGILEEECRELNHVFFHYIKTGRPLVIMKYAMTLDGKTATFAGHSRWITGERARRRV HEDRSRYSAVMCGVETILADDSLLTCRLPDTRNPVRIICDSQLRTPPDSQVAATCSQART ILATCCPDREAWLPYEEKGCEILLLPPREGRVDLNALTERLGQMGIDSVLLEGGSTLNWS ALNQGIVNRVQAYISPGLFGGATAKTPVGGRGVEFPYQGFSLGKTKITVLDQDILIEGEL MPCSQES >gi|157101641|gb|DS480683.1| GENE 46 46673 - 47353 515 226 aa, chain + ## HITS:1 COG:SP0177 KEGG:ns NR:ns ## COG: SP0177 COG0307 # Protein_GI_number: 15900114 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Streptococcus pneumoniae TIGR4 # 1 226 1 211 211 213 49.0 3e-55 MFTGIIEETGILMSVQGGLSGSRLAVKADRVLEDIQKGDSIAVNGVCLTAYSFDSGCFYA DVMPETLKRSNLARLPAGSPLNLERAMTAGGRFGGHIVSGHIDGTGTIVSIARDGNALWY TVRSDPSILRRIIEKGSIAIDGVSLTVARVRDQDFCVSLIPHTAFCTTLGTLKPGCVVNL ENDCVGKYIEKFIKIPPPFHCTEGVDSKSTGTGSRITPELLKQYGF >gi|157101641|gb|DS480683.1| GENE 47 47445 - 48662 1185 405 aa, chain + ## HITS:1 COG:SP0176_1 KEGG:ns NR:ns ## COG: SP0176_1 COG0108 
# Protein_GI_number: 15900113 # Func_class: H Coenzyme transport and metabolism # Function: 3,4-dihydroxy-2-butanone 4-phosphate synthase # Organism: Streptococcus pneumoniae TIGR4 # 10 210 3 203 203 268 63.0 2e-71 MEGFNTIKGFSTIEEVLDELRQGKIVLVTDDENRENEGDFICAAQHATTENVNFMAVHGK GLICMPMSPELCTRLKLPQMVSHNSDNHETAFTVSIDHISTSTGISAAERGVTARQCVNA RSRPEDFRRPGHMFPLAARKNGVLERAGHTEATVDLMRLAGLSECGLCCEIMKEDGTMMR MPQLLKLAGQWNMKITTIKALQNYRKQHDKLVECAAVTHMPTRYGTFKAYGYRNLLNGEH HVALVKGEIGDGRDLLCRVHSECLTGDAFGSLRCDCGQQFAAAMARIEQEERGVLLYMRQ EGRGIGLLNKLRAYEFQDQGMDTLEANLALGFPGDLREYFIGAQILRDLGARSLRLLTNN PDKVYQLSGFGLEIKERVPIRMAPTVHDLFYLKTKEQRMGHLMHY >gi|157101641|gb|DS480683.1| GENE 48 48733 - 49260 421 175 aa, chain + ## HITS:1 COG:SP0175 KEGG:ns NR:ns ## COG: SP0175 COG0054 # Protein_GI_number: 15900112 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Streptococcus pneumoniae TIGR4 # 1 154 1 154 155 239 76.0 2e-63 MYILEGKLVPKKIRVGIVASRFNEFITSRLLSGALDCLRRHEVPEDDITVAWVPGAFEIP LIASRMAASRRYDAVICLGAVIRGSTSHYDYVCSEVSKGIAQASLAAKIPVMFGVLTTEN IEQAIERAGTKAGNKGYDCAAGAIEMVNLIHELDKRTADNSLSVTPFVQEEPCRQ >gi|157101641|gb|DS480683.1| GENE 49 49269 - 52151 2747 960 aa, chain - ## HITS:1 COG:CC0723_1 KEGG:ns NR:ns ## COG: CC0723_1 COG0642 # Protein_GI_number: 16124976 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Caulobacter vibrioides # 415 679 189 454 473 179 39.0 3e-44 MKKWSSEEVREFARRILSSYFCENDVELLISTFADDIIWLGGGKMQKAEGRDAVAAWFRE GKDELSSCNMFDEQYEVMEMGAGLWLCEGVSELESREGTGVYIRTQQRITFIFREKGEAL ETIHIHNSMPFGEIQEDELFPAESGKRTYLELEGELRRREQKYIQQARFLSQLYNSVPCG IIQFERGEGHRIVNVNRMGWELYGYSSEEEYRREVKNPFQMVLDEDKEWIEGIIGGLALN GDTASYIRHTFSRDGKQIWISVIMGRIVNADGQEVIQAVFTDVTDIKQLELEQERQQLIE NRSLRAAICTAYPLIMSLNLTKDIYNCFIEDQDACLSGRCGSYTEMIKGDIPDVYPSYRE DFETAFARENVLARFAAGERDIYMEFQKMGMDGKYHWLSVQIIYVENPFNDDVLAIELVK VLDSQRAEQARQEQLLRDALASAKAANQAKSDFLSRMSHDIRTPMNAIVGMSTIGQLKLD 
DRRSVQDCFCKIDASSRYLLSLINDILDMSKIETGKMELAHEYFDFTELMDEVNQIIYPQ AEERQIEYVIYHSEPLEQFYIGDPLRLKQILMNLLSNALKFTRAGGRIQIQIEEQKRTNG FSYLRFQVKDTGIGMSREFRSRLFMPFEQEAPETARNNIGSGLGLSIVYNLVQLMGGSIE VESEKEKGSVFTVTLPFKLAEDDQEMERQRKKRELLQGLAVLVVDDDPLVGSQATSILEK AGAQSLWVDSGFRAVEEVQRLLEENRHYDIAMIDWKMPDMDGVETARRIRALVGPDTTII MISAYDWSRIEDEARNAGVNCFISKPLFGGTIYDAFARAVNRSEEKGAEKEREDFSGCRV LLADDNELNREIARTLLEMRGMEVETAEDGQEAVNLYKSRGAGYFSAVLMDIRMPRMDGL EATRTIRAMEDENTDRIPILAMTANAFEEDKARAYEAGMNGYLVKPLDMEAVLDELKKHL >gi|157101641|gb|DS480683.1| GENE 50 52283 - 53998 1429 571 aa, chain - ## HITS:1 COG:slr1305_3 KEGG:ns NR:ns ## COG: slr1305_3 COG2199 # Protein_GI_number: 16329450 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Synechocystis # 395 568 1 175 188 115 39.0 3e-25 MNHVVVDPVKMDLYAFVLKDACDEFFEVDVAERILRPIFLTPGKGDAACGTGEMDVRAAA ELFVCEEDRAGFLALFSRQSLDEVCAHKIGKQGEFRRMQQGGRERWVRGTLFSCDACGRD KALFYTVDIESRNAARLMQMNRDLFGVFLDAYIGIAELDLDLGIGTIIQCIPMPDMALRS FQWGRMVGYLCSNVIYHGDRDIFAGTFSLKHMREVWGNGQKNMSLEVRYSADGETFRWAE VNAMLFRREDMPDKVYLAMRDTSEQHLLKGIVERYVYSHCDYFIYLNARNNSYVMFSSKG DGTLLPPDHCDDYEKAIETYVRKYIAPEDQELAICEMRISRVLEELDRTGEHCFYCGVMD SQRGYTRKRLQYLYYDKENQMVLMTRTDVTDIYLEQKRQNKLLQRALMQAQTDPLTGLYN QAVKDMIAERLEGEDGIAAVLFMDLDNFKVINDTLGHIKGDLLLQEVAGVLKKTLRASDL SGRVGGDEFLALLHPVNSPAGVRQCVQRICASVRSVMDRDFHDFCVTCSIGISMYPADGT NIAALVEKADKAAYAVKRKGKNGFAFYSDLL >gi|157101641|gb|DS480683.1| GENE 51 54245 - 54865 471 206 aa, chain - ## HITS:1 COG:FN0808 KEGG:ns NR:ns ## COG: FN0808 COG0406 # Protein_GI_number: 19704143 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Fusobacterium nucleatum # 1 196 1 196 206 107 32.0 2e-23 MKVYLVRHGETEWNRRGKIQGQADIPLNEKGEALAFLTGQKMKDIPFKRIYTSPLSRARR TAELISGQRGLPLMEDSRLLEISYGNREGQLLALIHRLPFLRLHRYFSHPSAYVPPKGGE TYDDLRKRCREFLEQELKPLEEQMDHVLVCGHGALIREMVCIIDGIAPDAFWKGPVQKNC AVTVLSLEQGVFQVMEEGTVYYNDDI >gi|157101641|gb|DS480683.1| GENE 52 54887 - 55930 1182 347 aa, chain - ## HITS:1 COG:ydjJ 
KEGG:ns NR:ns ## COG: ydjJ COG1063 # Protein_GI_number: 16129728 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli K12 # 8 347 6 345 347 279 44.0 5e-75 MEGTMKVAVMTGKKKMEWCEREIPQPGPGELQIKLEYVGVCGSDLHFYSEGRLANWVPDG PLVLGHEPGGVVSAIGEGVTGFEIGDRVALEPGVPCGHCEDCLKGHYNLCRSVRFMAIPG EKDGVFSEYCTHAASMTFKLPDNVSTMEGGLMEPLAVGMHACELSNAKLGETAVVLGAGC IGLVTLMSLKARGVSEIYVADVLDKRLEKARELGATRVFNSRNENIEEFVKTLPGGGVDQ VYECAGNRITTLQTCRLIKRAGKITLVGVSPEPVLELDIATLNAMEGTIYSVYRYRNLYP SAISAVSKGLIPLKKIVSHTFDFKDVIEGVDYSLNHKDEVIKAVIRM >gi|157101641|gb|DS480683.1| GENE 53 55996 - 57480 1596 494 aa, chain - ## HITS:1 COG:SA1141 KEGG:ns NR:ns ## COG: SA1141 COG0554 # Protein_GI_number: 15926883 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Staphylococcus aureus N315 # 1 487 1 489 498 461 45.0 1e-129 MEKQYLLSVDQSTQGTKALLFDQEGNLICRGDRPHRQIINDAGWVSHDLNEIYSNTLKVV RDVIEKAGIRREQVAGLGISNQRETSAVWDRTDGHPLADAIVWQCSRAKDICARVEESGK AEWVRRTTGINLSPYFPASKLAWFMENVKEAPQKASEGTLCLGTIDSYLVYRLTGGASFK TDYSNASRTQLFNIHSLEWDEGLCSLFGIPANALAQVCDSNAWFGDTDLEGYLNNPIPIH GVLGDSHGALFGQGCLEKGMIKTTYGTGSSIMMNIGEKPVISTHGVVTSLAWGMDGRVNY VLEGNINYTGAVITWLKDDLKLIASASETEGLARQANEDDTTYLVPAFTGIGAPYWDSEA RAAIVGITRKTRTPELVKAGLECIAYQIADVVEAMSQDAGVRIEELRVDGGPTRNRYLMQ FQSDILDIRVLVPDAEELSGIGAAYAAGLALGFFDRSIFGKMKRSVFEPAMDQDARRQKQ DGWRAAVGTVLTRS >gi|157101641|gb|DS480683.1| GENE 54 57513 - 58451 1073 312 aa, chain - ## HITS:1 COG:TM0953 KEGG:ns NR:ns ## COG: TM0953 COG3958 # Protein_GI_number: 15644625 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Thermotoga maritima # 3 311 2 309 311 286 49.0 5e-77 MNKIPNRKAICDTLMEAAKEDKDIVVLCSDSRGSASLTPFADACPDQFVEMGIAEQNLVS VSAGLAKCGKKPFAASPACFLSTRSYEQIKVDAAYSNTNVKLIGISGGVSYGALGMSHHS AQDIAALSAIPNMRVYLPSDRFQTACLVKELLKDDKPAYIRVGRNAVEDIYEEGNVPFSM 
DKATVLSRGGDVAIIACGEMVKPALDAGRLLGESGISAGVVDMYCVKPLDEEAVQKAAGA ARLVVTVEEHAPFGGLGSRVSQVVSACCPRMVVNLSLPDEPVITGTSAEVFEYYGLTGQG IARRVQELLRQL >gi|157101641|gb|DS480683.1| GENE 55 58444 - 59277 956 277 aa, chain - ## HITS:1 COG:TM0954 KEGG:ns NR:ns ## COG: TM0954 COG3959 # Protein_GI_number: 15644626 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Thermotoga maritima # 3 274 11 280 286 306 54.0 3e-83 MNELKVLSYDLRKHVVDMIIAGKGGHIGGDMSVMEILVELYMKQMNISPENKELDNRDRF VLSKGHSVESYYAVLAKKGFFPMEEVISTFSQFGSKFIGHPNNKLPGIEMNSGSLGHGLP VCVGMAMAGKMDRKDYRVYVVMGDGELAEGSVWEGAMAASHYCLDNLCAVVDRNRLQISG CTEDVMGHDDLHERFRSFGWHVIDVADGNDIDQLDAAFEEAKTVKGKPTVLIANTVKGCG SPVMENKAAWHHKVPTAQEYSEIIRDFEKRKEALLHE >gi|157101641|gb|DS480683.1| GENE 56 59292 - 60248 1190 318 aa, chain - ## HITS:1 COG:YPO2499 KEGG:ns NR:ns ## COG: YPO2499 COG1172 # Protein_GI_number: 16122720 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 21 315 36 329 330 208 45.0 1e-53 MDKAKNISLKKLLTGSLGTVLGLVILVVVFTALSNIFLTPSNIRNILIQSGTNAIIAVGM TYVIISGNIDISVGASLALSSCVGATLMVGTGNVGLGVAAVLLTGILLGLFNGSLVAYLG FPPFIVTLSTMWLFRGMAYLYTGGQAVVGLPQGMTSFAMGSILYIPNIVWLIAIVYIIGH VALQKTTAGRKVCAVGDNSESARLSGINVKRVTLAAFVLSGFCAALSGIVYMGRLNSGQP IAGQSYEMYAIAAAVIGGASLTKGGIGSMVGTLIGAIFISVLQNGLTILNVNTYWQQVCM GVVLLLAVGLDRFRKSVK >gi|157101641|gb|DS480683.1| GENE 57 60254 - 61771 1696 505 aa, chain - ## HITS:1 COG:AGc5112 KEGG:ns NR:ns ## COG: AGc5112 COG1129 # Protein_GI_number: 15890066 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 505 25 518 521 384 43.0 1e-106 MEHSKPILEFKHIVKEFPGVRALDDVSMDFYEGEVVALMGENGAGKSTLMKILSGAYTRD GGDIVLDGRPLPKQYSTLEARGFGVAIIYQELSLMSEMNVMENVYVSHEPRKVGNLIDFR KMYADTKEQLAKLNAGHIDPRGRVGDLSLPEKQMVEIAKALAVDCKVLIMDEPTTSLTHE 
ESEQLFEIIGTLKARGITTIYISHRLDEIFRICDRAIVMRDGKLVGSSDVRTSTPDDVIM MMSGKVLSQPMGGGRTRRDKAKSIPLLELEHITDGGFIQDLNLTVYQGEVLGIGGLVGSK RTELARMVCGADPLKGGVIKVSGKARRITNTRQAIDAGLGYLSENRKEDGLSLGLKIEEN TIHCDMTAVSRRGFMNWKRVREVSDNYIKMLKTKGTSATQVVNLSGGNQQKVAIAKWLHA GCNLLIFDEPTRGIDVAAKAEIHQLIRNFAAQEGKAAIVISSEVNELVSVSDRIIVMSKG QITGELEAEEIEQNNVLRHITTKGR >gi|157101641|gb|DS480683.1| GENE 58 61830 - 62894 1450 354 aa, chain - ## HITS:1 COG:PM1325 KEGG:ns NR:ns ## COG: PM1325 COG1879 # Protein_GI_number: 15603190 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 56 282 37 267 314 157 39.0 3e-38 MKKRVVSILLTGLAVLSLAACTSNGLEAASATNDNAKEAEAGTEKKEGISVYWVGKTLNN PWWISVSDFAQQTADNLGVDLTIAIPQEEVDLEKQVSMIEAAIEKKADAIVVSAASSDGV IPAIKKAREAGIKIVNFDTRISDTSVIDAFVGGDDVAGAYKAGKYICEQLGGEGEVAIIT GLMEQSTGVDRHAGFMQACAEYPGIKVVAEQGAEWSSDKAADVTTNILTANPNVKAIFAC NDQMAVGMVNAAKAAGKKADDLILVGYDGILDAVNMTMDGDLDAFVSLPNLDEGAMGVKL ATALVMNSDYHYDREILYDCTLVTGEFVDGLTDQTIYEYAAERFPLRGLTEKGY >gi|157101641|gb|DS480683.1| GENE 59 63226 - 64242 1011 338 aa, chain - ## HITS:1 COG:BH1928 KEGG:ns NR:ns ## COG: BH1928 COG1609 # Protein_GI_number: 15614491 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 7 336 6 335 335 149 31.0 1e-35 MALRAKDIAEMLGVSTATVSLVLNNKPGVGEKRRQEIIQKIKEMDCGYMLKEQRINNGSI GFVVYKREGSIVDESPFFTYILEGINNRINHYGYTLNFLYINKTMPRTEQEHQIKSGGYK GLIIFAVEMYYDDLQLFKESQIPFAILDNSFQENDVDAVSINNVQGTSKAVSYLCGMGHR EIGYIRSKVRINSFDERYNAFKHKLRSLGGIFNKEYVVDVGYSEEDVKRDVKKYLEGRKR LPTAFFAENDLIGCSAIRAMQECGYRIPEDISVVGFDNRPISTLVEPQLTTINVPKDIFG PAAVDLLISRLDQGREQSLKVEIGTSLVKRGSVKKITD >gi|157101641|gb|DS480683.1| GENE 60 64359 - 64895 551 178 aa, chain - ## HITS:1 COG:FN0312 KEGG:ns NR:ns ## COG: FN0312 COG0602 # Protein_GI_number: 19703657 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Fusobacterium nucleatum # 
1 164 1 160 168 157 43.0 8e-39 MNYGTIKKNDIANGVGVRVSLFVSGCTHHCPGCFNEEAWDFGFGKPFTGETEEEILDALA PDYIEGLTLLGGEPFEPENQKALVPFLEKVRERYPQKNLWCYTGYTLEQDLLRDSRARCG YTDRMLSMIDVLVDGRFVEALKDISLPFRGSSNQRILNLRATLESGSAKLLEMGSGRI >gi|157101641|gb|DS480683.1| GENE 61 64897 - 67056 2704 719 aa, chain - ## HITS:1 COG:STM4452 KEGG:ns NR:ns ## COG: STM4452 COG1328 # Protein_GI_number: 16767698 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Salmonella typhimurium LT2 # 3 719 5 706 712 448 36.0 1e-125 MKIIKRNGAEEDFDNNKIVVAVAKANTAAGKNKLSLEQIKDIADYVEYKCVKLNRAVSVE EIQDMVENQIMSTGAFELAKDYVRYRYQRSLIRKANTTDNRILSLIECNNEDVKQENSNK NPTVNSVQRDYMAGEVSKDLTKRMLLPQDIVEADKEGIIHFHDSDYFAQHMHNCDLVNLE DMLQNGTVISETRIEKPHSFSTACNIATQIIAQVASNQYGGQSISLAHLAPFVDVSRKKI YAEVQEEVKTFGCMPSEEAIHEVVENRLRKEVTKGVQTIQYQVITLMTTNGQAPFLTVFM YLNEAKNEREKKDLAMIIEETLKQRYQGVKNESGVWITPAFPKLIYVLEEDNVNEGTEYY YLTEMAAKCTAKRLVPDYISEKKMMELKEGNCFPVMGCRSALSPWKDENGNYKFYGRFNQ GVVTINLVDVALSSGGDMKAFWKIFDERLDLCYRALMCRHNRLRGTLSDAAPILWQNGAL ARLKKGETIDRLLYGGYSTISLGYAGLYECVKYMTGKSHTDQETTPFALEVMEYMNKACN RWKDETNIGFSIYGSPIESTTYKFAKCLQKRFGIIEGVTDRSYITNSYHVNVSEEIDAFS KLKFESQFQALSTGGAISYVEVPNMQNNIPAVLGIIRYIYDNIMYAELNTKSDFCQECGF DGEIQIVTDESGKLVWECPKCGNRDENKLNVARRTCGYIGTQFWNQGRTQEIKERVLHL >gi|157101641|gb|DS480683.1| GENE 62 67342 - 68379 1122 345 aa, chain - ## HITS:1 COG:BH2923 KEGG:ns NR:ns ## COG: BH2923 COG1609 # Protein_GI_number: 15615486 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 336 1 338 344 273 40.0 5e-73 MPVTIKDVAALAGVSPSTVSRTCKDHPSISRQTKEKVRQAMAKLGYEPNFQASSLATQNS RTVGIILPPSEWEAYQNPFYLEMIRGISQYCNQKQYINTVITGQNEDEILQAIKTMARTG LAEGFIVLYSREQDPVLDYLYEEGLLYVLIGKACRNVNQTIYVDNDNVLAGREAAEYLLN LGHVRIACLSGDHSLLYNHDRKMGYMLALAERGIPFDGQLCVEVPSVPEEQDQALAALFA REDRPTAVLVSDDILAVALEKACIGLGLSIPGDVSILSFNNSLLARLTFPPLTSVDVNAC QLGIEAAAQMISHVENPELVATKIIVPHHMVERESCAGAGGFCGD >gi|157101641|gb|DS480683.1| GENE 63 68741 - 69361 394 
206 aa, chain + ## HITS:1 COG:lin1016 KEGG:ns NR:ns ## COG: lin1016 COG2190 # Protein_GI_number: 16800085 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Listeria innocua # 13 145 14 146 163 127 42.0 1e-29 MNKNTDGCIGIRILSPLTGTAVPLEEVPDPVFSQKIIGDGVAILPRDGNLVSPIDGEVVS IAETLHAYGLRSEDGIEVMVHFGLETVALKGECFQCCVKIGDKVKAGSLLAKADLKALEE KQVNTITPVLICGGMEGRSMNAFTGPVKAGADTVITVLDHCPPDAAEEADAASPHNPDTA PAANTSSPADAADRKAGKPKKIPDQF >gi|157101641|gb|DS480683.1| GENE 64 69390 - 70982 1601 530 aa, chain + ## HITS:1 COG:SPy1986_1 KEGG:ns NR:ns ## COG: SPy1986_1 COG1263 # Protein_GI_number: 15675776 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Streptococcus pyogenes M1 GAS # 1 446 22 467 472 564 64.0 1e-160 MTVIAVMPAAGLMISVGKLIQMAGGDMGAVLTIGSTMETIGWAVINNLHILFAVAIGGSW AKERAGGAFAAVIAFILINVITGALFGVSSADLNDPAAVTRTLLGQEILVNGYFTSVLGA PALNMGVFVGIISGFAGGVIYNKYYNYRKLPDALAFFNGKRFVPMVVILWSVIISLVLAI VWPVVQTGINSFGVWIANSSSTSPVLAPFIYGTLERLLLPFGLHHMLTIPMNYTSFGGTY TILTGVNAGSQVFGQDPLWLAWATDLINLKDAGDMAGYQNLLATITPARFKVGQMIGATG LLLGIALAMYRRVDPDKRRNYRSMFISTVLAVFLTGVTEPLEFMFMFCALPLYLVYAVLQ GCAFAMAGLIHLRLHSFGNLEFLTRVPMSIRAGLGGDLVNFILCVAVFFAAGYAIAYFMI GRFHFATPGRLGNYTDDSVSENSQGRAGFASSGDSQAGRIISLLGGRDNIVLVDACMTRL RVTVKDVDKVAGLPQWKAEGAMGLIKKDGGIQAIYGPKADVLKSDINDIL >gi|157101641|gb|DS480683.1| GENE 65 71058 - 71921 557 287 aa, chain + ## HITS:1 COG:SPy1985 KEGG:ns NR:ns ## COG: SPy1985 COG3568 # Protein_GI_number: 15675775 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Streptococcus pyogenes M1 GAS # 1 283 5 265 272 119 32.0 5e-27 MTLNTHSLAEPEYESKLAAFAGSLVQIKPDIIALQEVNQSRSAPPAASGWLRESGYVPCC AKDLNTVSPCASRAVIRADNHALNLALLLARAGVSYQWTWTPAKVGYGIYDEGLAVLSRS PILGTSQFYITGIRDYENWKTRKILGINTRTEYGNEYFYSVHMGWWNDSEEPFEGQWNRI SSGLSPLAEERLWLMGDFNSPAHITGQGRDLILASGWFDAYDLAASRDSGVTVANAIDGW 
REQGEIPGMRIDYIWTSRRVPVISSRTIYDGTSFPVVSDHFGIITEY >gi|157101641|gb|DS480683.1| GENE 66 71956 - 73455 1287 499 aa, chain + ## HITS:1 COG:SP2107 KEGG:ns NR:ns ## COG: SP2107 COG1640 # Protein_GI_number: 15901922 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Streptococcus pneumoniae TIGR4 # 2 498 3 497 505 470 48.0 1e-132 MKRESGILLSITSLPSKYGIGCFSKEAYEFVDRLKEAGQSYWQILPLGPTSYGDSPYQSF STFAGNPYFISLEDLIQEGVLTEEECSRADFGSRPDFVDYAKVYEERYPLLRKAYERSSI SLDPEFRKFRDENRWWLQDYALFMAVKARFKGTAWTRWAQDIRLRWQNALDYYRRELYYD IEFHQYLQFKFSQQWKKLKSYANANGIRIIGDIPIYVAMDSADTWAHPELFELDRDNIPT AVAGCPPDGFSATGQLWGNPLYRWEYHRSTGYEWWLARLAYCYKLYDVVRIDHFRGFDQY YSIPAGSGTAVGGHWEQGPGIGFFHTVRSALGEKEIIAEDLGYVTDSVRQLVKDSGFPGM KVLEFAFDSRDSGCASDYLPHNYPENCVAYTGTHDNETIKGWYASITAQERKLARDYLCD SCTPASRLHMSFISLIMRSQARMCIIPLQDYMGLDNDCRINTPSTVGTNWRWRLLPGQMS DPLVKTISSVTLRYGRWNW >gi|157101641|gb|DS480683.1| GENE 67 73545 - 74327 892 260 aa, chain - ## HITS:1 COG:BS_cwlC KEGG:ns NR:ns ## COG: BS_cwlC COG0860 # Protein_GI_number: 16078804 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus subtilis # 6 256 3 254 255 147 34.0 1e-35 MTDVKRVIIDAGHGGEEPGAMYDGRREKDDTLRLALAIGRILENNGVDVVYTRTTDVYDT PLEKARIANQSGADYFVSIHRNAFPVPGTASGAMTLVYEDAGVPAMLADNIQRNLVETGF RDLGVQERPGLIVLRKTQMPAVLVEAGFIDNPDDNRFFDENFDAIAQAIADGVLETIRQQ EAQRPEYYQVQVGAFRTRMPADRLVNELQASGLPAFLVYDDGLYKVRVGAFLNMDYAVQM ERNLRNMGYPTVLVRERAIY >gi|157101641|gb|DS480683.1| GENE 68 74403 - 74843 574 146 aa, chain - ## HITS:1 COG:PA4784 KEGG:ns NR:ns ## COG: PA4784 COG1522 # Protein_GI_number: 15599978 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Pseudomonas aeruginosa # 1 139 1 139 155 99 38.0 2e-21 MDEIDRKILKLLQDNARVSLKTIAENTFLSSPAVSARIERLEKEGIITGYHASVDPVKLG YHILAYINLEVIPEDKEQFYAYAREVPHILECSCVTGGFSMLLKVAFKSTMELDMFIGQL QKFGRTSTQIVFSTHVGPRGVDVDEI >gi|157101641|gb|DS480683.1| GENE 69 74877 - 76097 1354 406 aa, chain - ## 
HITS:1 COG:MTH52 KEGG:ns NR:ns ## COG: MTH52 COG0436 # Protein_GI_number: 15678081 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanothermobacter thermautotrophicus # 1 404 1 408 410 511 60.0 1e-145 MFKINENYLKLPGSYLFSAIAKKVAAYEEANPDRQVIRLGIGDVTLPIAPAIVEAIHKAA DEMGQAKTFHGYAPDLGYAFLRETIVEKDYRLWGCHVEADEIFVSDGAKSDCGNIQEIFS EDSRIAVCDPVYPVYVDSNVMAGRTGSYDPDTGMWSNVIYMPCTAQNHFVPELPQETPDL IYLCVPNNPTGTTLTRDQLKVWVDYANRAGAVILYDAAYEAYISEEDVPHSIFEIEGART CAIEFRSFSKKAGFTGVRLGFTVVPKDLKCGDVMLHSLWARRHGTKFNGAPYIEQRAGEA VYSEEGSRQVMEQVAYYKRNAMVIYEGLKEAGYTVFGGINSPYIWLKVEDGMDSWEFFDY LLEQANVVGTPGSGFGPSGEGYFRLTAFGTYENTVEAVKRIKALRR >gi|157101641|gb|DS480683.1| GENE 70 76313 - 77128 831 271 aa, chain - ## HITS:1 COG:BH3412 KEGG:ns NR:ns ## COG: BH3412 COG0253 # Protein_GI_number: 15615974 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Bacillus halodurans # 1 261 6 279 286 222 44.0 6e-58 MQGTGNDYVYVDCFKETVRDPAKTAIYVSDRHFGIGSDGLILICPSRTADCRMEMYNADG SQGIMCGNGVRCVGKYVYDHGLVDRDKRTITVETLAGIKTLELEIEDGKAVSVTVDMGEA ALTSRLPEDITVDGKEYRFVGINVGNPHAIYYVDQVDCLDLERIGPAFENHERFAPDRVN TEFIHVADRKHLEMRVWERGSGETWACGTGATASVMASILMGYTEDEAEVALRGGKLVIR YERESGHLFMTGPAVEVFRGSIDIPEEIYYD >gi|157101641|gb|DS480683.1| GENE 71 77205 - 77762 495 185 aa, chain - ## HITS:1 COG:no KEGG:Closa_1273 NR:ns ## KEGG: Closa_1273 # Name: not_defined # Def: ANTAR domain protein with unknown sensor # Organism: C.saccharolyticum # Pathway: not_defined # 4 184 1 181 182 230 61.0 2e-59 MGHVTNIIVAFSKPEDGKNIKSILVRNGFQVVAVCTSGGQALSAADCLNGGIVVSGYRFE DMMYDELRQCLPGDFDMLLISSPARWGGQCPDRVICLPMPLKVHDLLNTLEMMVQAQERI RRRRRSRPKERSREEQDIITEAKTLLMERNNMTESEAHRYIQKCSMDSGTNLVETAQMVI SLIHV >gi|157101641|gb|DS480683.1| GENE 72 77799 - 79136 1353 445 aa, chain - ## HITS:1 COG:BS_glnA KEGG:ns NR:ns ## COG: BS_glnA COG0174 # Protein_GI_number: 16078809 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase 
# Organism: Bacillus subtilis # 1 445 1 444 444 558 59.0 1e-159 MSKYSKEDIMRMVEEEDVEFIRLQFTDIFGMLKNVAITASQLEKALDNRCMFDGSAIEGF VRIDESDMYLYPDLDTFEVFPWRPQQGKVARFICDVYNPDGTPFSGDPRYVLKRAVKRAQ DMGYVLNVGPECEFFLFHTDEEGRPTTSTHEMAGYFDVSPIDLAENVRRDIVLNLEEMGF QVEASHHEIAPAQHEIDFQYTDALRAADNIMTFKMAAKTIAKRHGLHATFMPKPREGMNG SGMHINMSLSDKEGRNLFADDQDALRLSRDAYYFMAGILKHMKAMTILTNPLVNSYKRLI PGFDAPIYIAWSPTSNRSSLIRIPSPRGESTRIELRCPDSAMNPYLALAACLSAGLDGIE KKTGLPACVEGNMFTMEPEELRERNVERIPETLGDAIEAYSGDVFMKEVLGEHIYTKYLE AKEQEWRDFRAQVTDWEVSQYLYKY >gi|157101641|gb|DS480683.1| GENE 73 79347 - 80207 1123 286 aa, chain - ## HITS:1 COG:alr4841 KEGG:ns NR:ns ## COG: alr4841 COG0253 # Protein_GI_number: 17232333 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Nostoc sp. PCC 7120 # 7 266 5 264 285 243 48.0 3e-64 MRIVVEKYHGLGNDYLVYDPNKNKLKLTPSNVQLMCNRNFGVGADGILEGPVLDGEQISM VVWNPDGSVAQKSGNGVRIFAKYLKDAGYVQKKDFTISSQGGEASIHYLNEEGTRLKASM GKVSFWSDDVPVEGPRREIINETMVFGRIPYPVTCLSIGNPHCVILMDEISKDLVCRIGR YSESARQFPEKINTQIMKVLDRTHIQTEVYERGAGYTLASGSGCCAAAAAAYRLGLTDPK MFVQMPGGTLELEIDGEGVVHMTGDVGYVGAVKLGSHFTEQLRALK >gi|157101641|gb|DS480683.1| GENE 74 80443 - 81312 679 289 aa, chain + ## HITS:1 COG:CAC2818 KEGG:ns NR:ns ## COG: CAC2818 COG2207 # Protein_GI_number: 15896073 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 12 282 4 273 279 205 40.0 9e-53 MPITDDAPAFERQGFLKEDFRLFHLKDKVSPCYEFHYHDFLKIMVLLEGKVNYVIEGRSY SLHPFDIVLVGRGQIHRPEVDSGQPYERQILYLSQDFLDRHRENEDSLDRCFATAAYRHS NVLRLKKEVRANMLSLLKQLERSQNWQQEEFAGPLMSRLLCLEFLVELNRASMDDKAQYL PTGVLNYRVSGLISYINEHLDEDLSIGTLSSLSCISPYHMMRIFKEETGCTIGNYITEKR LIYARDLLGEGVNATDACFRSGFSHYSTFLRAYKNHFQQLPKRNPGTKI >gi|157101641|gb|DS480683.1| GENE 75 81459 - 82622 1557 387 aa, chain + ## HITS:1 COG:TM1135 KEGG:ns NR:ns ## COG: TM1135 COG0683 # Protein_GI_number: 15643892 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino 
acid transport systems, periplasmic component # Organism: Thermotoga maritima # 33 380 21 360 370 187 38.0 3e-47 MRIRKVAAVTMAACLAASALAGCGSKKSDEKVIKIGVFEPTTGENGGGGYQEVLGIRYAN EMHPTVNIGGEEYAVRLVEVDNKSDKTEAVNAAQKLVSEKVSVVLGSYGSGVSIAAGQIF ADAKIPAIGCSCTNPQVTEGNDYYFRVCFIDPFQGTVMANYAFQNGAKSAAVITQLGDDY SSGLGSYFKEAFAKLGGNIVSEEQFQTNQTDFKAILTNIKAADPDIIFAPSSITTAPLII KQARELGITATIAAGDTWENSTIIENAGADSEGVVISTFFDEAEPANDEAAAFIKGFKEY LVKEKQEDIIPAVSALGYDSYLAAIKAIEDAGSTDTTAIRDALKNVEIDGVTGNITFNET GDANKDIAFIKTIKDGRFQFLTTTTVE >gi|157101641|gb|DS480683.1| GENE 76 82777 - 83658 1029 293 aa, chain + ## HITS:1 COG:Cj1017c KEGG:ns NR:ns ## COG: Cj1017c COG0559 # Protein_GI_number: 15792344 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Campylobacter jejuni # 1 286 1 290 298 215 45.0 9e-56 MSLTTFLQQCLTGISLGGAYALIAIGYTLVYGILRLINFAHGDIFMMAGYFMIFAMASLP WFIAIPVTLIVTILLGVTIERVAYRPLRSAPRMSVMISAIGVSYLLQNLATYLFTALPKG YPEIPFLKKIFRIGGLSASFVTFLTPVLTLVLVYLLILLINHTKIGMAMRAVSKDYETAS LMGIKINKTITFTFAVGSLLAGIGSILYFTDRMTVFPFSGALPGLKCFVAAVFGGIGSIP GAVIGGFILGLGETALVAMGQSTFSDAFTFILLIVMLLIRPTGLFGEKTTDKV >gi|157101641|gb|DS480683.1| GENE 77 83669 - 84784 1258 371 aa, chain + ## HITS:1 COG:TM1137 KEGG:ns NR:ns ## COG: TM1137 COG4177 # Protein_GI_number: 15643894 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Thermotoga maritima # 1 346 1 347 359 246 42.0 6e-65 MKNMTKNKNKLYTLIALVIIIGFLAFLQSDTAEYSYQISILERSAIYAVVAVSMNLLTGF TGLFSLGQAGFMAIGAYVVAIFTIPVGSRASVYYVSGISPVIANIQLPIPVALLLGGLAA AAMAALIGIPVLRLKSDYLAIATLGFSEIIRAFIAAPMFDTITNGSYGLKSIPGFPNIFA VFGLAALCIALMVLLINSSYGRAFKSIREDEVAAQSMGINLFHYKELSFVISSFFTGVGG GMLAMFMRSIDSKTFQVALTYDILLIVVLGGIGSVTGSVAGAFLVTAGRELLRFFDDPLT IAGVSVPLFRSGFRMVIFSILLMAVVLFYSRGIMGNHEFSWDGLFRLIRGIPGKFKRGRK KPSSAKKGGIH >gi|157101641|gb|DS480683.1| GENE 78 84784 - 85650 246 288 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 22 274 33 254 329 99 26 7e-20 MNNMENSVKNVLHMESITMQFGGVVAVNGLTLDVNEHEIVALIGPNGAGKTTAFNCVTGI YQPTFGSVSFNGEVILADTPQGKMRRLYEGDITPTDLTPVRKTPDQVTKLGIARTFQNIR LFGALTVFENVLIAKHMRAKQNVFSATLRLNHAEEKRMREESMELLEMQGLAHLKNEIAS SLPYGLQRRLEIARALATSPSLLLLDEPAAGMNPQETQDLTDFIHKIRDDYHLTIFMIEH HMDLVMQISDRIYVLDFGKLIAHGTPDQVQNNPKVIEAYLGVSEDAED >gi|157101641|gb|DS480683.1| GENE 79 85637 - 86344 241 235 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 219 1 226 245 97 26 3e-19 MLKINDLRVNYGGIEAVKGISFEVPEKSIVTLIGANGAGKSTTLRAIAGLVKASAGSIQF DGTELLGMDTPDIVSKGITLVPEGRRVFPDMTVIENIKIGAYLRTDSLDQDIEWVYSLFP RLKERSWQLAGTLSGGEQQMLAVARALMSHPKLMMMDEPSLGLAPLIVQDIFNIIKEINK QGVTILLIEQNANMALRIADQGYVLETGRISLSGTGRELLADESVKAAYLGKKKK >gi|157101641|gb|DS480683.1| GENE 80 86421 - 87179 442 252 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_1120 NR:ns ## KEGG: SpiBuddy_1120 # Name: not_defined # Def: GntR family transcriptional regulator # Organism: Spirochaeta_Buddy # Pathway: not_defined # 22 240 12 228 249 75 25.0 2e-12 MHDMKKIEKCFSARGDLLEYIIMDIILQYNDVVGSWTLKNELDILEIHIGTATIGRVLKI LDNKGFTVPVSNKGRALTKKGIDEVLKTRSRIHKSILSNDIVDSVEITHFDDLIEVLSTR LIIEREATRLAVQNATQEDFKQMSDALDQHRMLLSNELDDSRAGLDFHIYIVDAAKNKFM SAVARLLAYEEHQIEARVPSLSAKKHAMEYVNVHQELLEAIQDRDEERATRLMEEHLNQI IGLVKRDREITL >gi|157101641|gb|DS480683.1| GENE 81 87294 - 87680 78 128 aa, chain - ## HITS:1 COG:all0706 KEGG:ns NR:ns ## COG: all0706 COG2421 # Protein_GI_number: 17228201 # Func_class: C Energy production and conversion # Function: Predicted acetamidase/formamidase # Organism: Nostoc sp. 
PCC 7120 # 1 123 189 312 328 110 47.0 5e-25 MYLPIFTEGALFSVGDGHAAQGDGESGCTAIECPMKEVRIRLEIQEGSFGSPVADTPGGW VAFGFSESLTDASYNALGNMALLMERLYGYEYREALSMCSVAVNLRITQIVNGIRGVHAI LPKGSIFQ >gi|157101641|gb|DS480683.1| GENE 82 88246 - 89235 336 329 aa, chain - ## HITS:1 COG:aq_344 KEGG:ns NR:ns ## COG: aq_344 COG0451 # Protein_GI_number: 15605858 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Aquifex aeolicus # 1 319 1 303 310 90 29.0 4e-18 MKFFLTGGTGFVGINLAYHLAGLGDDVVIYANQPLLQQAEKMFGNCPGKVSWVCGDVLDK AHMESELLTSNADVFIHAAVITPGREREIKQFEQIFRVNTLGTINALEAAKNCGTTRFIY VSSVAVYGTSSQEYDPVRETVPLKPSNVYEISKFASEHIALRYRELYNMDVRAIRLGDVF GAWEFDTGVRDTMSAPCQTLKAALERRHAILPKEGMTGWVYVKDVAASITALAKAAPGQL NHVVYNSSSVYRWSIAQWCDMLAKRYYGFTYEIGTPSEANIKFHALMDNGMFDVSRLRED TGFISQYGCREAFEDYTSWADSYPDLVRQ >gi|157101641|gb|DS480683.1| GENE 83 89251 - 89952 356 233 aa, chain - ## HITS:1 COG:PAB0997 KEGG:ns NR:ns ## COG: PAB0997 COG1878 # Protein_GI_number: 14521703 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Pyrococcus abyssi # 5 231 33 209 217 95 31.0 6e-20 MSRRVIDLTLEITSNMPAQDAFPGNIYVQLVSHEESRRTESGTPEDPFTSTWNYIGMVEH IGTHVDAFFHMNPKGLSVDRMPLDMFFGKAVCFDMTHIPERGEITAEDMEKAQEATGVKV DGHIVLLATGIHDKYFPDKKILSVNPEITPEAVKWLADHNSRLHGVEGPSTDIMDLNLFP SHRACRNLGITHYEWLINLTELIGKGEFMFYGAPLKLKDGSGSPVRAYAVIEE >gi|157101641|gb|DS480683.1| GENE 84 89965 - 91029 764 354 aa, chain - ## HITS:1 COG:AGc5109 KEGG:ns NR:ns ## COG: AGc5109 COG1879 # Protein_GI_number: 15890064 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 20 347 22 356 357 156 32.0 6e-38 MKKVWGFTLAAALVISMLAGCGAQKQAEEKKDTAAQTQAQTEKESTAQKENAAPTEAKED AEKVIAVIPKSLLFDYWQYVRIGAQSAGLDEGYAIDFQGTRTDTDLEGQVKLVEDFIQRG VSAIVISPVNPDGMVPVLQQAEDAGIPVIIMDGKLNADFPRSTVSTNDEAAGKFAAEKLK 
ELAGDAGGTFAIVSAVPGAVQEGGREKGFSDELGTYPNYKIIGTYYGKGDRNQTYNITQD ILTSNPDITGFYTVNEGSSAGVTLGVREGDLKEKIFVAFDPSTEVLDAIRDGYVDGAVAQ NPYLIGRTAVLNAIKVLNGETVEKKIDVPVTWVTIDNLDDPEIQNVLKPEEVIK >gi|157101641|gb|DS480683.1| GENE 85 91125 - 92105 606 326 aa, chain - ## HITS:1 COG:BH3731 KEGG:ns NR:ns ## COG: BH3731 COG1172 # Protein_GI_number: 15616293 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus halodurans # 17 323 6 313 314 225 45.0 8e-59 MKEKTQQLSATHIGVTRLKSILGKASILLGLLLLCTFFSFTARNFLTQKNLLNIALQTSI IAIVAIGQTYTITGGGIDLSVGSVVGLSSIVVSMLLVHDVSIPIAILMAIAAGLTCGAIN GFIIAYGGIAPFIATLGMLSIARGIVYVACDGVPITGLPSEFSVLGAGRAFGTIPVPIII LSVLAVVMGIIFHKSKFGRHIFALGSNENTAYLSGVNTKKVKFMMYLCCSLMAAIAGIIL TSRVISGQPNSGEAYETDAIAAAVIGGASMSGGQGSIIGTLIGALMMGVLNNGLNLLGLS YFYQKIAIGMVIIAAVYIDMLRSRKK >gi|157101641|gb|DS480683.1| GENE 86 92118 - 93611 787 497 aa, chain - ## HITS:1 COG:AGc5112 KEGG:ns NR:ns ## COG: AGc5112 COG1129 # Protein_GI_number: 15890066 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 495 24 514 521 446 46.0 1e-125 MNDVLLEMKEINKSYSGVQVLHNVNLKVRKGEIHALMGENGAGKSTLIKIITGIVQADSG EKTYKGEPLNISHPSQIYHYGIGIVHQEFNLLPDLNVAQNIFIAREPMKKFGFVDEKKSD QDAQKLLDRLQLKIDIHKKVIQLSVAEQQLVEIAKALSYDCELLILDEPTSALADSEAEV LFGILEHLREQGVAIIYVSHRMSDIKRIVDRVTVLRDGHFIGEHLLSDITEEQLISEIVG RKLEAYFPKKPKYKRGEQTLEVKGLSVKGLLEDISFEAYKGEILGIAGLMGAGRTELAHA IFGKLRPTAGEIIVHGKKQKFDSPQKAIRAGIGYTTEDRKRDGLMLNQDIKSNILISSYD QLVNFWGLVKEGMANRRTDEYIEKMNIKTTGRNQNIGSLSGGNQQKAVLAKWLSANVDIF FFDEPTRGIDVGAKMEIYQLMYTLVENGVTVIMISSELPEILGMSDRILVMSHGKLAADL DISEATQEKIYYYASQE >gi|157101641|gb|DS480683.1| GENE 87 93645 - 94403 334 252 aa, chain - ## HITS:1 COG:CAC3574 KEGG:ns NR:ns ## COG: CAC3574 COG1028 # Protein_GI_number: 15896808 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites 
biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Clostridium acetobutylicum # 3 251 5 249 249 171 39.0 1e-42 MKLEGKVTIITGAGYGYGMSRGFPEAFAREGSHLSLNYYGCDDMKMKLWAQDMEKLGVRV ILAPGDISQEETAERLIQRTWEEFGHIDVLVNNAGITGPKSIVDMTIEEWDRMIAVNLRS FFLTCKYVVPIMMKQKRGRIINIASQIAQKGGTDHCHYAAAKAGVIGLTKSLAWDLGKFG INVNCIAPGPINTQMMESVSDEWRKDKMKDLAIPRFGEIEEVVPSAIFLASSPDGDLYTG QTLGPNCGDVML >gi|157101641|gb|DS480683.1| GENE 88 94679 - 95317 728 212 aa, chain - ## HITS:1 COG:no KEGG:bpr_III235 NR:ns ## KEGG: bpr_III235 # Name: pyrE2 # Def: orotate phosphoribosyltransferase PyrE2 (EC:2.4.2.10) # Organism: B.proteoclasticus # Pathway: Pyrimidine metabolism [PATH:bpb00240]; Metabolic pathways [PATH:bpb01100] # 5 212 4 212 212 202 44.0 6e-51 MDNKYVKIHTTGCNAPLKVTPGHFATNHAHINYYLDMTTLKTRLSEAQEIARSLSGLFLY DTVVDTIVCLEGTQVVGSFLAEELTSAGVLSMNAHKTIYIVTPEFNSNSQMIFKENILPM IRDKNVIILTASVTTGLALNKGIESIRYYGGNLRAITAIFSALEELNGFRITSIFGRKDL PDYTYYDYRDCPLCKQGRKLDALVNAYGYTSL >gi|157101641|gb|DS480683.1| GENE 89 95536 - 97689 1260 717 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 14 637 2 624 636 489 44 1e-137 MDNNNKQDPNMKSNWQTITMLVVAAILTFIVVSSMNNYLNSKKRQELSYNEFVQMVEAGE VDSIKVGSTEITVIPKKGNTKYSQNLDYYVVRLEGDYDFVNRILDNNVVTNREKSDSSTL LLVLLNYAVPFLFLLFLMNFTMKRMGGGGIMGVGKSNAKMYVQKETGITFKDVAGEDEAK ESLTEIVDFLHNPGKYTKIGAKLPKGALLVGPPGTGKTLLAKAVAGEAHVPFYSLSGSDF VEMFVGVGASRVRDLFKNATENAPCIIFIDEIDAIGRSRDSRMGGNDEREQTLNQLLSEM DGFDSTKGLLVLGATNRPEVLDPALLRPGRFDRRVVVDKPDLNGRINILKVHSKDVLLDE SVDFKEIALATSGAVGADLANMMNEAAINAVKHGRNAVSQKDLFEAVEVVLVGKEKKDRI MSKEERRIVSYHEVGHALVSALQKHSEPVQKITIVPRTMGALGYVMNVPEEEKYLNTKKE LEARLVELMAGRAAEEIVFETVTTGAANDIQQATNLARAMVTQYGMSDKFGLMGLESQEN QYLTGRAVLNCGDATAAEIDQEVMKILKDSYDEAIRLLSDNKDAMDQIAAFLIDKETITG KEFMKIFRRVKGIPEPEEPKEDDDKGTASQELHANPETQADRGERENQAEQANPQEQVNP EEQANPQEQVNPEEQANPQTQSNPGNQSGPEVSDRGNEPEKKTEDTVWSHPGNDRNL >gi|157101641|gb|DS480683.1| GENE 90 97928 - 99544 1637 538 aa, chain + ## HITS:1 COG:CAC2892 KEGG:ns NR:ns ## COG: CAC2892 COG0504 # Protein_GI_number: 15896145 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Clostridium acetobutylicum # 1 535 1 535 535 724 61.0 0 MPVKYVFVTGGVVSGLGKGITAASLGRLLKARGYKVTMQKFDPYINIDPGTMNPVQHGEV FVTDDGAETDLDLGHYERFIDESLTRNSNVTTGKVYWTVLQKERRGDFGGGTVQVIPHIT NEIKSRFHRNYTENETEIAIIEVGGTVGDIESQPFLEAIRQFQHEAGPGNTCIIHVTLIP YLKASEELKTKPTQASVKDLQGMGIQPDILVCRSELPLDDGIKTKIAQFCNVPKHHVLQN LDVDVLYEVPLVMEEEHLAQSACECLDLPCPEPELADWRAMIEAWKHPKQDVTVALVGKY IQLHDAYISVVEALKHAGVANRAAVHIKWVDSETVTPENAPEIFKDVSGILVPGGFGDRG IDGKIHAIQYAREHRIPFLGLCLGMQLSIVEFARNAAGYADAHSVELNPATTHPVIHLMP EQNGVEDIGGTLRLGSYPCVLDKTSKAYGLYGEATIYERHRHRYEVNNDYRAVLTKCGMM LSGLSPDGRIVEMIELPGHPWFIATQAHPELKSRPNRPHPLFKGFVEAALKEKTKNNV >gi|157101641|gb|DS480683.1| GENE 91 99651 - 99815 252 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937962|ref|ZP_02085320.1| ## NR: gi|160937962|ref|ZP_02085320.1| hypothetical protein CLOBOL_02856 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02856 [Clostridium bolteae ATCC BAA-613] # 1 54 1 54 54 85 100.0 1e-15 
MDHKKKDDFDVDIQACSTMDCTGLIPSLPQDEAEKEAYEDLYPYVTKAVSSDRE >gi|157101641|gb|DS480683.1| GENE 92 100003 - 100746 499 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 6 243 1 241 245 196 42 3e-49 MKGDRILEIRHLSKTFGTNPVLKDIDFSVSSGDVTCIIGASGSGKSTLLRCLNLLETPTT GEIMYHGTNIADKKVNAPQYRSKVGMVFQSFNLFNNMTVLENCMIGQVKVLKKNKTDAHK NAMYYLEKVGMAPYINAKPRQLSGGQKQRVAIARALAMEPEVLLFDEPTSALDPQMVGEV LEVMRKLAGEGLTMIIVTHEMAFARDVSSHVAFMAGGVIVEEGEPSQIFGAPKQLQTQEF LSRFMHG >gi|157101641|gb|DS480683.1| GENE 93 100743 - 101546 984 267 aa, chain - ## HITS:1 COG:SPy0277_2 KEGG:ns NR:ns ## COG: SPy0277_2 COG0765 # Protein_GI_number: 15674455 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 12 238 1 224 239 194 46.0 1e-49 MPTDFFGRVMVVFNRYGMSMLQGAGVSLKIALVGTLVGCIIGFAVGIVQTIPSERGDNPV KRAVLWAVKLILNAYVEFFRGTPMMAQAMFIYYGLLPMLGINMSMWGAAYFILSINTGAY MAETVRGGILSIDPGQTEGAKAIGMTHVQTMLYVILPQALRNIMPQIGNNLIINIKDSCV LSVIGVTELLYKTKAAAGALYMNFEAYTITMIIYFIMTFTCSRILRWWENRMDGADSYDL ATTDTLAHTSGMYRFPDEKDKKKEDAR >gi|157101641|gb|DS480683.1| GENE 94 101591 - 102535 1237 314 aa, chain - ## HITS:1 COG:SP0453_1 KEGG:ns NR:ns ## COG: SP0453_1 COG0834 # Protein_GI_number: 15900370 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Streptococcus pneumoniae TIGR4 # 69 311 28 265 325 174 42.0 2e-43 MKMSAKKVLGLVLAGVLAAGSLTACGGGGTAATTADTKAEATTAAGDAKAESDAAADTTA ADSGEKKTLRVAMECAYAPYNWTQPDDSNGAVAIADSNEFAYGYDVMMAKKICEELGYDL EIVRLDWDSLIPAVTTGQVDCVIAGQSITSERLQAVDFTEPYYYATIVTLVKEGSKYADA KSVADLAGATCTSQQSTIWYNTCLPQIQDANVLAATASAPDMLMSLNADKCDLVVTDQPT GKGALIAYPNFKMIEFGGGDADFQVTDEDINIGISLKKGNTELKDAINSVLTKMTKDDFS KMMDEAISVQPLAN >gi|157101641|gb|DS480683.1| GENE 95 102647 - 102832 130 61 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|160937966|ref|ZP_02085324.1| ## NR: gi|160937966|ref|ZP_02085324.1| hypothetical protein CLOBOL_02860 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02860 [Clostridium bolteae ATCC BAA-613] # 1 61 1 61 61 105 100.0 1e-21 MTAMNSQLKDLWLFLQLFIPFSCIKKRIWVLENRTSYVILSTLGAYMLKKVKCAMFNRAR F >gi|157101641|gb|DS480683.1| GENE 96 102849 - 103757 1006 302 aa, chain - ## HITS:1 COG:lin2285 KEGG:ns NR:ns ## COG: lin2285 COG4509 # Protein_GI_number: 16801349 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 5 245 2 229 246 108 29.0 1e-23 MDQKKGKSRIRGVIILVLLAVMLVSGAMSIHSWIEDKRTREEYERLAELARETTAQPSTE AVTEPGTEEPETEPYVSPINFEELMKENPDTIGWIRVPDTNIDYPIVQGEDNDFYLNHDF YGKENIAGAIYLDFESQGDFVGRNNVLYGHNMKNGSMFKDVNRYKDEAYFKEHQFFSIYT PDREIRLKAVAAYYGEAQPIVRKTRFKSQESFDAFVHEMVKPCSYGVEITYPARTLYTLV TCSYEINDARTFLFAVEVDEDGKQIQPDDVFLQRMDDLMGDKAQETAGETVQETADRAKE SQ >gi|157101641|gb|DS480683.1| GENE 97 103784 - 104644 676 286 aa, chain - ## HITS:1 COG:no KEGG:Closa_0110 NR:ns ## KEGG: Closa_0110 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 278 1 287 294 96 27.0 1e-18 MSIAKTGTLPDSYKYVNIAIDSFDGRLLQGRLYHDSLEEEIPFGCISEMAICLEKLFDEL RYPMKSVDHRSFVGIKHGSSSLRCGVEAGKKRLQGKLAEFCLHVKYRFHSTWQGDIMDLR HGGTYSFESFLDLMEYLDRSLGHGSDIMHGLGKRMCEVSVRNFSHFVLGGDVSHPAVADR REFASEFELRDEVERFLLPLADGGEEGTIISPRSVLVNAGGFGPMTFVIRVMFRRNATWQ GTICWKEKRRQISFRSFLEMLLLMQEAVADSEGWNEELKAASGHMG >gi|157101641|gb|DS480683.1| GENE 98 104839 - 106074 1153 411 aa, chain + ## HITS:1 COG:TM1119 KEGG:ns NR:ns ## COG: TM1119 COG3629 # Protein_GI_number: 15643876 # Func_class: T Signal transduction mechanisms # Function: DNA-binding transcriptional activator of the SARP family # Organism: Thermotoga maritima # 11 267 2 251 349 76 26.0 7e-14 MNASVMEDKPTIYIKTLGGFSLRVGDKEITDNSNQSKKPWCLLEYLVVFQKKSISPGELI NIVWADDPGVNPGGALKTLMFRSRKLLEPLGIPPQKLLVQQRGSYCWTQDYLSVLDIDQF 
ESICTRVLNHDMDEDEALNLCLEGLELYKGDFLPKSEYESWVIPISTYYHSLYQKLVYKT VELLTKKEDFSRITSICQTAVGIEPFDEEFHYHLVYSLYRDGHTSQAIEEYNHTLDLFYN EFSISPSDHFKDLYKSIRSKEQGINTNLDSIQETLKEEASGGAFYCEYPVFHDLFQLERR AIERTGDSIYLCLLTIGDLDGHAPKTTVLNKAMEHLNHAIRDSLRCSDVYTRYSISQYIV LLPTVTMEKGEMVMKRILNNFRRLYSRKDLIVDYKLQPVLPWERSPAGLGE >gi|157101641|gb|DS480683.1| GENE 99 106182 - 107567 1591 461 aa, chain - ## HITS:1 COG:L170983 KEGG:ns NR:ns ## COG: L170983 COG0534 # Protein_GI_number: 15672149 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Lactococcus lactis # 7 442 2 437 446 276 36.0 5e-74 MAANVTKDMTEGSPVKLILGFFIPMLCGLLFQQLYNMADTIIVGKCLGVKALAAVGSTGS VNFMIIGFCLGVCSGFAIPVAQKFGEKNEKALRRFVANSAWLAILFSAVMTVTVCLLCRN ILELMQTPEDIIDGAYSYIFVIFLGIPATYLYNLLSCTIRSLGDSTTPLIFLVFSSVVNV ALDFFTILVLKMGVSGAAWATITAQAVSGILCLLYMKKKFAILKMNDDEWRPDAHYMKIL CNMGIPMGLQYSITAIGSVILQTAVNSLGSMAVAAVTAGSKISMFFCCPFDALGSTMATF GGQNVGARKLDRIDQGLKAGAVMGCVYAVIAFAVLCIFGQWIALMFVDKGEGEILRNTRL FLIGNSLFYIPLVFVNAVRFMIQGLGYSRLAIIAGVCEMAARSFVGFCLVPVFGYIAVCI ASPVAWIAADLFLIPAYRHVMKNLNHLFYGRSRATATDTAE >gi|157101641|gb|DS480683.1| GENE 100 107719 - 109128 1621 469 aa, chain - ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 10 468 21 465 466 268 34.0 2e-71 MGRLAAAVMAAAVLVLAGAFVSLADESGDTTRFVSGTKVNGVGVGGLTVDEAKARIEGFY AGEYNLTIRERGGRQETIAGTDIGYKVEVPEGLKAILDAQNAAGRVSGPDADNSHTMAMT VTYSQEALGARIKALTLISGSGITVTSDARISSYEEGQPFSIIPAVQGNNVDEAKTTEVI TAAVKAGQNSVDVDSAGCYYQVNIWETDENLIALCSRMNQYRDMSVNYVFGSEKETLGGE TIATWITGSQDGVPTVDLEKITAFVTEMASRRDTAGTARVFHTATGKDVELTGPYGWKID VAGEVQALAALIQAGPAAGPVDREPVYAMTAASRTAPDWGTTYAEVDLTGQHVYMFQEGN LVWDAPCVTGNISKNYDTPPGIYSLTYKEKDRILRGAKKADGTYEYESHVDYWMPFNGGI GFHDATWRSKFGGTIFQTSGSHGCINLPPEKASVLYDLIYKGMPVLCYQ >gi|157101641|gb|DS480683.1| GENE 101 109361 - 110074 896 237 aa, chain - ## HITS:1 COG:CAC2769 KEGG:ns NR:ns ## COG: 
CAC2769 COG0652 # Protein_GI_number: 15896024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Clostridium acetobutylicum # 78 234 6 173 174 162 52.0 6e-40 MKKWMCICAAAVLAVSALTGCSTSAGAATTAAQTEAATTDAAKAETEKKETAKTAEASEK AQDTASEEIAAAAGKHHVKINVKDYGTISVELYGDEAPITVANFLKLAKDGFYDGLTFHR IISGFMIQGGDPLGTGMGGSDQEIKGEFSNNGVENPLSHTRGAISMARSQIKDSASSQFF IVHEDSTFLDGDYACFGYVTEGMEVVDAICKDTKVEDNNGSVAKDNQPVIESIEVVD >gi|157101641|gb|DS480683.1| GENE 102 110227 - 111450 1353 407 aa, chain - ## HITS:1 COG:NMA0365 KEGG:ns NR:ns ## COG: NMA0365 COG1457 # Protein_GI_number: 15793373 # Func_class: F Nucleotide transport and metabolism # Function: Purine-cytosine permease and related proteins # Organism: Neisseria meningitidis Z2491 # 19 389 16 393 437 304 48.0 2e-82 MKEKKSIKEKGTSVSANSLIWFGAGVSIAEILTGTYLAPLGFGKGLAAIVLGHVIGCALL FMAGLIGGYTRRSSMETVKMSFGQKGSLLFCGLNVLQLVGWTSIMIYDGGLAANGIFNAG LWVWCLVIGGLILVWLLIGIRNLGKINTAAMAALFVLTLILCRVIFAGSGEPSMEADSSL SFGAAVELSVAMPLSWLPLISDYTREAEKPFRATAASALVYGLVSCFMYVIGMGAAIYTG EYDISVIMVKAGLGAAGLLIIVLSTVTTTFLDAYSAGVSSVSITHRIKEKWAAVAVTLAG TAAAMVYPMDNITDFLYLIGSVFAPMIAIQIADYFIIRQDHEAEEANLPNLLIWLAGFVI YRILMRVDTPVGSTLPDMAVTVVLCVVCRRVMSACCALPGRENVTKV >gi|157101641|gb|DS480683.1| GENE 103 111463 - 112254 890 263 aa, chain - ## HITS:1 COG:CAC3095 KEGG:ns NR:ns ## COG: CAC3095 COG0351 # Protein_GI_number: 15896346 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Clostridium acetobutylicum # 1 260 4 263 265 320 61.0 1e-87 MKTALTIAGSDSSGGAGIQADIKTMTAHGVYAMSAITALTAQNTTGVTGIMEVTPSFLKE QLDDIFTDIFPDAVKIGMVSSGGLIEAIGDRLAFYRAANVVVDPVMVSTSGSRLISSEAI EALKEVLLPMANLLTPNIPEAEVLSGMEIHNPGDMETAARAIGEGYGCAVLLKGGHQLND ANDLLYRDKKLKWFKGRRIDNPNTHGTGCTLSSAIASNLAKGMDMDLAVERAKVYLSGAL EAQLDLGKGSGPMNHGFAIRGEY >gi|157101641|gb|DS480683.1| GENE 104 112592 - 113896 1137 434 aa, chain - ## HITS:1 COG:lin0802 
KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 173 429 168 432 433 73 24.0 8e-13 MNDAACTVFMNLINLAVRMMPIFVCLDPKHSYRENIAAISAYLWVIMISMESIFHISPQV FFVFRGIFSCLYFLVLLVFFKGTLLKKGFLYISAWLFAVLSASLNEFAAWILKGHVSLTY GQICVAVSLLWACGFYGFVRFWLRDKVNRLFDQLSRRSCSLLLTYPSVSLFILFIGNNTI FSSRSLAERGLEDVLFYLALCIMILVLYVLILSSTLEIMCRRRTEEELQFARQLISQQRE HYNQTLDYIEQVRIIKHDFRHHIHALLNMDRGERTNYLMNLKRELDMTAEMVFCQNQAVN GLLQEYAARARQDGVEFTARVDLSAHVPVDDLTLCIVTGNLLENALEACRRLTGPRFILV QARWLDDHLMMLVENSYNGQIKKNGSRILSSKRDGGLGILSIKRILNQPGDEFDVDYNDT TFTAMVKIVDRALG >gi|157101641|gb|DS480683.1| GENE 105 113904 - 114611 803 235 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 225 1 220 234 82 24.0 5e-16 MYNVAVCDDTEEERLQAAEYAGRFFEREGIEVRIDTYAAGRELLESGREYDLYLLDVLMP GMSGIDTAQALAEDKDHPVVVFITSSLESAVEGYRVEAAGFILKPVEEENFWSTMERVVR RRLGVKKAVLSVVHNRVNVELPLERLAWFENRLHRVFVKLTDGEVLSVNQKLSELQLVLE PHAQFLRCHQSYLVNLDYVDKLEDSCFYMRDGQMIPISRNFYKLSKNAYYHYRLK >gi|157101641|gb|DS480683.1| GENE 106 114807 - 116264 1706 485 aa, chain + ## HITS:1 COG:PA0287 KEGG:ns NR:ns ## COG: PA0287 COG0591 # Protein_GI_number: 15595484 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Pseudomonas aeruginosa # 9 472 6 452 461 168 30.0 2e-41 MGSNVIVTIIVIVYLLFMLWIGWYSSTKISTNTDFMVAGRRLGPLLMAGTLAATEIGGGS SLGVVQNGMSGFGLSASWYITTMGIAFIILSFVAPKFRAATVKTVPEYFRRRYGKSCGII TAVIMLLPLVGLTAGQFIASAVILSTMLNIDYQVAVIIVAVVVTVYSIMGGLWSVTLTDF VQVFLIVIGMIIAVPFAMNYAGGWDNVASNIPEGTLSLFQGYDLFGIISLVIMYTATFSV GQEAVSRFYAARDEKAAKGGAWLAALVNFIYAFIPTILGIITLALINMGKFSSAQFESVG 
ARYALPVLAINTMPALICGLLFAGIISATMSSSDSDLLGAGSIFANDIYKAVLKPDASSQ SVMKVTKIVMCLVGLASMFIALFNTQSIVSILMFCFTLRAAGSFFPYVMGHYWKKASTAG TIASLIAGTIVVVYLEHISKGMLFGIKFSQPIIPGLAAAFICFIIFSLAMPPEKETTELA PEEDD >gi|157101641|gb|DS480683.1| GENE 107 116357 - 117682 919 441 aa, chain - ## HITS:1 COG:no KEGG:BT9727_0681 NR:ns ## KEGG: BT9727_0681 # Name: not_defined # Def: hypothetical protein # Organism: B.thuringiensis # Pathway: not_defined # 6 432 1 414 423 215 30.0 3e-54 MQTCNMVQAGYGEADITPDNPSEMVGFYRPDNRSRGVRDSLRLQALVWEADGVMGGLIVI DSLGFTVELSNVLRDRAAAALHTGRERIMVCFSHTHSAPNAAEEPDYFEMVCRQADTAVR KAAGKMKSMEAAWGIGENTVGVNRRNEPEQMDGRLGILKLAARDGKEPEVLLIRVTAHAN VLSGDNYFISADYMGAARKRLEEAYGCPVMMVQGAAGDIRPRYHQDNMEYVEIHCWEMAR KGFSQEYRQKYVPQSRRALEQMAEDMFRSVDAVYASLVLMPLERVEIRSSFCRFAADVPD MERAEEIAEEAEREGEIDGRMWLKEVKRLLDEGIQKQYADIEIQYLFVNQGCLCGVPNEA MCRIAIDIWKEAEAPLLFFNGYTNGCSSYLPTAEEYDKGGYEVLWSNLVYFPYHGRVMPL NRDTAGKMAAQVVEDWRKSRS Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:47:22 2011 Seq name: gi|157101640|gb|DS480684.1| Clostridium bolteae ATCC BAA-613 Scfld_02_25 genomic scaffold, whole genome shotgun sequence Length of sequence - 99738 bp Number of predicted genes - 90, with homology - 88 Number of transcription units - 42, operones - 21 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1617 1725 ## COG1966 Carbon starvation protein, predicted membrane protein - Prom 1703 - 1762 3.8 2 2 Op 1 6/0.000 - CDS 2129 - 3673 1144 ## COG0007 Uroporphyrinogen-III methylase 3 2 Op 2 1/0.231 - CDS 3675 - 4583 860 ## COG0181 Porphobilinogen deaminase 4 2 Op 3 . - CDS 4576 - 5226 435 ## COG1648 Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) - Term 5273 - 5326 2.3 5 3 Tu 1 . - CDS 5361 - 6797 1418 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 6830 - 6889 7.1 + Prom 6790 - 6849 5.5 6 4 Tu 1 . + CDS 6887 - 7081 70 ## gi|160937987|ref|ZP_02085344.1| hypothetical protein CLOBOL_02880 - Term 7073 - 7139 6.2 7 5 Tu 1 . 
- CDS 7259 - 9493 2000 ## COG2199 FOG: GGDEF domain - Prom 9520 - 9579 8.2 - Term 9652 - 9690 6.6 8 6 Tu 1 . - CDS 9761 - 10633 1032 ## COG2035 Predicted membrane protein - Prom 10695 - 10754 7.3 + Prom 10750 - 10809 9.7 9 7 Tu 1 . + CDS 10839 - 11723 1096 ## COG0491 Zn-dependent hydrolases, including glyoxylases - Term 11686 - 11734 2.3 10 8 Op 1 . - CDS 11770 - 12648 831 ## COG1533 DNA repair photolyase 11 8 Op 2 1/0.231 - CDS 12671 - 13711 1149 ## COG3172 Predicted ATPase/kinase involved in NAD metabolism 12 8 Op 3 . - CDS 13701 - 14396 788 ## COG3201 Nicotinamide mononucleotide transporter 13 8 Op 4 . - CDS 14374 - 15492 643 ## COG3949 Uncharacterized membrane protein - Prom 15522 - 15581 7.8 - Term 15537 - 15597 11.3 14 9 Tu 1 . - CDS 15632 - 17329 1918 ## COG0004 Ammonia permease - Prom 17504 - 17563 2.1 - Term 17457 - 17504 9.1 15 10 Op 1 . - CDS 17565 - 17858 421 ## Closa_0987 hypothetical protein 16 10 Op 2 . - CDS 17877 - 18458 760 ## BHWA1_00709 hypothetical protein 17 10 Op 3 . - CDS 18491 - 20077 1699 ## COG0038 Chloride channel protein EriC 18 10 Op 4 . - CDS 20111 - 21343 1718 ## COG1171 Threonine dehydratase 19 10 Op 5 10/0.000 - CDS 21403 - 22356 845 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 - Prom 22390 - 22449 6.4 20 10 Op 6 6/0.000 - CDS 22455 - 26015 4269 ## COG1196 Chromosome segregation ATPases 21 10 Op 7 7/0.000 - CDS 26027 - 26731 954 ## COG0571 dsRNA-specific ribonuclease - Term 26789 - 26818 2.1 22 10 Op 8 3/0.154 - CDS 26832 - 27062 391 ## COG0236 Acyl carrier protein 23 10 Op 9 . - CDS 27112 - 28119 1154 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme - Prom 28147 - 28206 4.7 + Prom 28264 - 28323 5.4 24 11 Tu 1 . + CDS 28388 - 30085 1402 ## COG2200 FOG: EAL domain + Term 30263 - 30313 2.1 25 12 Op 1 6/0.000 - CDS 30093 - 32198 1828 ## COG2200 FOG: EAL domain 26 12 Op 2 . - CDS 32213 - 34732 1919 ## COG2199 FOG: GGDEF domain + Prom 35003 - 35062 5.0 27 13 Tu 1 . 
+ CDS 35206 - 36102 719 ## COG1032 Fe-S oxidoreductase + Prom 36155 - 36214 3.5 28 14 Tu 1 . + CDS 36272 - 37186 129 ## BCE_3152 BclA protein + Prom 37235 - 37294 7.6 29 15 Tu 1 . + CDS 37320 - 37424 93 ## gi|160938016|ref|ZP_02085373.1| hypothetical protein CLOBOL_02909 + Term 37426 - 37461 -1.0 + Prom 37437 - 37496 2.1 30 16 Op 1 . + CDS 37619 - 38968 809 ## COG0534 Na+-driven multidrug efflux pump 31 16 Op 2 . + CDS 39021 - 40646 1282 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 32 16 Op 3 . + CDS 40648 - 41274 490 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 41420 - 41460 0.3 - Term 41090 - 41155 3.9 33 17 Tu 1 . - CDS 41356 - 43596 1897 ## COG0178 Excinuclease ATPase subunit - Prom 43719 - 43778 5.3 34 18 Op 1 . - CDS 43873 - 43935 74 ## 35 18 Op 2 . - CDS 43925 - 45514 954 ## Closa_2955 hypothetical protein 36 18 Op 3 . - CDS 45477 - 45623 101 ## gi|160936491|ref|ZP_02083859.1| hypothetical protein CLOBOL_01382 - Prom 45694 - 45753 5.1 37 19 Tu 1 . + CDS 45759 - 46925 341 ## COG3547 Transposase and inactivated derivatives + Term 46971 - 47005 1.1 38 20 Op 1 . - CDS 47170 - 47775 342 ## Bacsa_0382 hypothetical protein 39 20 Op 2 . - CDS 47772 - 48539 300 ## gi|160938026|ref|ZP_02085383.1| hypothetical protein CLOBOL_02919 - Prom 48703 - 48762 5.5 - Term 48704 - 48749 9.2 40 21 Op 1 . - CDS 48767 - 49090 336 ## gi|160938027|ref|ZP_02085384.1| hypothetical protein CLOBOL_02920 - Prom 49124 - 49183 3.6 41 21 Op 2 . - CDS 49294 - 49881 426 ## COG1309 Transcriptional regulator - Prom 49941 - 50000 5.5 - Term 50046 - 50087 11.3 42 22 Op 1 . - CDS 50128 - 50310 304 ## PROTEIN SUPPORTED gi|227872300|ref|ZP_03990657.1| possible ribosomal protein L32 43 22 Op 2 1/0.231 - CDS 50314 - 50841 188 ## PROTEIN SUPPORTED gi|168184665|ref|ZP_02619329.1| conserved hypothetical protein - Prom 50862 - 50921 7.0 - Term 51027 - 51077 13.2 44 23 Op 1 21/0.000 - CDS 51098 - 52291 1458 ## COG0282 Acetate kinase 45 23 Op 2 . 
- CDS 52316 - 53311 1022 ## COG0280 Phosphotransacetylase - Prom 53348 - 53407 4.9 - Term 53501 - 53549 10.5 46 24 Op 1 . - CDS 53563 - 53895 467 ## COG3870 Uncharacterized protein conserved in bacteria - Prom 53922 - 53981 5.9 47 24 Op 2 . - CDS 54036 - 55640 1564 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 55777 - 55836 8.3 + Prom 55667 - 55726 7.2 48 25 Tu 1 . + CDS 55900 - 57216 970 ## COG1323 Predicted nucleotidyltransferase + Term 57231 - 57284 8.0 - Term 57214 - 57277 11.2 49 26 Op 1 1/0.231 - CDS 57294 - 58535 871 ## COG0500 SAM-dependent methyltransferases 50 26 Op 2 . - CDS 58545 - 59180 674 ## COG0546 Predicted phosphatases 51 26 Op 3 1/0.231 - CDS 59252 - 59893 710 ## COG0406 Fructose-2,6-bisphosphatase 52 26 Op 4 9/0.000 - CDS 59933 - 61303 1858 ## COG2848 Uncharacterized conserved protein 53 26 Op 5 . - CDS 61362 - 61634 445 ## COG3830 ACT domain-containing protein 54 26 Op 6 . - CDS 61693 - 62490 1032 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 55 26 Op 7 4/0.154 - CDS 62514 - 63140 475 ## COG0237 Dephospho-CoA kinase 56 26 Op 8 . - CDS 63209 - 66001 3136 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 57 26 Op 9 . - CDS 66027 - 66356 446 ## Closa_1882 transmembrane anti-sigma factor - Prom 66393 - 66452 6.4 58 27 Tu 1 . - CDS 66475 - 67299 781 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 67401 - 67460 4.7 + Prom 67342 - 67401 5.9 59 28 Tu 1 . + CDS 67470 - 68450 521 ## COG0673 Predicted dehydrogenases and related proteins + Term 68489 - 68542 3.4 - TRNA 68564 - 68646 58.7 # Leu CAA 0 0 - Term 68482 - 68524 9.0 60 29 Tu 1 . - CDS 68667 - 68792 85 ## gi|160938052|ref|ZP_02085409.1| hypothetical protein CLOBOL_02946 - Prom 68868 - 68927 3.2 + Prom 68683 - 68742 6.8 61 30 Tu 1 . + CDS 68791 - 69423 713 ## COG2364 Predicted membrane protein - Term 69406 - 69445 -0.5 62 31 Op 1 . 
- CDS 69467 - 70348 774 ## COG4989 Predicted oxidoreductase 63 31 Op 2 10/0.000 - CDS 70364 - 71368 1209 ## COG4211 ABC-type glucose/galactose transport system, permease component 64 31 Op 3 16/0.000 - CDS 71365 - 72900 167 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 65 31 Op 4 . - CDS 72913 - 74061 1260 ## COG1879 ABC-type sugar transport system, periplasmic component 66 31 Op 5 4/0.154 - CDS 74087 - 74944 995 ## COG0191 Fructose/tagatose bisphosphate aldolase 67 31 Op 6 . - CDS 74961 - 75908 836 ## COG0524 Sugar kinases, ribokinase family - Prom 75988 - 76047 7.5 + Prom 75951 - 76010 8.4 68 32 Tu 1 . + CDS 76114 - 76863 661 ## COG2188 Transcriptional regulators - Term 76891 - 76939 14.1 69 33 Op 1 . - CDS 76954 - 78030 1248 ## COG0258 5'-3' exonuclease (including N-terminal domain of PolI) 70 33 Op 2 . - CDS 78082 - 78171 69 ## 71 33 Op 3 . - CDS 78210 - 78740 548 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 72 33 Op 4 . - CDS 78771 - 78977 295 ## gi|160938065|ref|ZP_02085422.1| hypothetical protein CLOBOL_02959 - Prom 79127 - 79186 6.9 - Term 79230 - 79291 9.8 73 34 Op 1 . - CDS 79314 - 79808 491 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 74 34 Op 2 . - CDS 79848 - 80072 223 ## CLK_2436 DNA-binding protein - Prom 80108 - 80167 8.5 + Prom 80166 - 80225 6.4 75 35 Op 1 . + CDS 80262 - 81086 1082 ## Closa_1881 hypothetical protein 76 35 Op 2 . + CDS 81168 - 81797 814 ## COG0035 Uracil phosphoribosyltransferase + Term 81853 - 81904 16.2 - Term 81835 - 81898 11.1 77 36 Op 1 . - CDS 81946 - 82692 937 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 78 36 Op 2 . - CDS 82696 - 84798 2342 ## COG0550 Topoisomerase IA - Prom 84875 - 84934 7.4 + Prom 84829 - 84888 5.2 79 37 Tu 1 . + CDS 84999 - 85253 329 ## gi|160938073|ref|ZP_02085430.1| hypothetical protein CLOBOL_02967 + Term 85488 - 85526 2.0 80 38 Op 1 . 
- CDS 85240 - 86304 422 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase 81 38 Op 2 . - CDS 86334 - 87410 1356 ## COG0136 Aspartate-semialdehyde dehydrogenase - Prom 87490 - 87549 2.5 82 39 Op 1 . - CDS 87662 - 88387 963 ## COG0813 Purine-nucleoside phosphorylase 83 39 Op 2 24/0.000 - CDS 88405 - 90639 2842 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 84 39 Op 3 . - CDS 90658 - 92577 2230 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit + TRNA 92834 - 92904 65.6 # Gly CCC 0 0 - Term 92974 - 93025 8.5 85 40 Op 1 34/0.000 - CDS 93090 - 93821 269 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 86 40 Op 2 31/0.000 - CDS 93811 - 94491 932 ## COG0765 ABC-type amino acid transport system, permease component - Prom 94564 - 94623 3.5 - Term 94661 - 94699 1.0 87 40 Op 3 . - CDS 94713 - 95561 1141 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 95648 - 95707 6.2 - Term 95719 - 95766 9.0 88 41 Op 1 . - CDS 95810 - 97771 2178 ## COG3855 Uncharacterized protein conserved in bacteria 89 41 Op 2 . - CDS 97802 - 99049 1758 ## COG0628 Predicted permease - Prom 99101 - 99160 6.6 + Prom 99102 - 99161 5.7 90 42 Tu 1 . 
+ CDS 99253 - 99561 407 ## gi|160938090|ref|ZP_02085447.1| hypothetical protein CLOBOL_02985 Predicted protein(s) >gi|157101640|gb|DS480684.1| GENE 1 3 - 1617 1725 538 aa, chain - ## HITS:1 COG:PAE1423 KEGG:ns NR:ns ## COG: PAE1423 COG1966 # Protein_GI_number: 18312627 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Pyrobaculum aerophilum # 1 509 1 487 618 252 36.0 1e-66 MNGLVMMLLAVVVLGGAYLIYGRYLAKKWGVDPDAKTPAYEMEDGVDYVPADTNVVFGHQ FASIAGAGPINGPIQAAMFGWLPVMLWILLGGVFFGAVQDFAAMYASVKNKGRSIGYIIE VYIGKLGKRLFLLFTWLFSILVVAAFADIVAGTFNGFDAATGERIAANGAVATTSLLFIA FAVVLGCFLKYGKCSKLVNTLVAIGLLVLAVALGLAFPVYVPQSAWLIFVFLYVIVACVT PVWALLQPRDYLNSYLLLAMIIGAVLGICVYNPAINLPSFTSFVFTDAKGNISYLFPTLF VTIACGAVSGFHALVSSGTASKQIKNEKNMLPVSFGAMLLESLLAITALIAVGALATSEG LPTGTPPQVFAKAIAMFLTTLGMPDSISYTLITLSISAFALTSLDSVARVGRIAFQEFFT NDDIDPSRQSSLNKVLTNKYFSTILTLALCYALSRAGYASIWPLFGSANQLLSALALIAC AVFLKKTHRQGVMLWGPMVIMLGVTFTALSLKIKDLISALSSQFVFGNALQLVFAVLL >gi|157101640|gb|DS480684.1| GENE 2 2129 - 3673 1144 514 aa, chain - ## HITS:1 COG:BS_nasF_1 KEGG:ns NR:ns ## COG: BS_nasF_1 COG0007 # Protein_GI_number: 16077397 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III methylase # Organism: Bacillus subtilis # 1 241 3 242 250 229 51.0 1e-59 MENKGMVYLVGAGPGDPALMTLKGQALLSSCGALVYDHLASRQFLEWVPSACRKIYVGKQ AGHHSMKQDEINAVLVELALSGLTVVRLKGGDPFVFGRGGEEIQALEAHGIPYETVPGVT SAIAVPECAGIPVTHRGVSRSFHVITGHTRDGEDCLPPEFETYGSLPGTLVFLMGLGQLP VITARLIEGGMAPDTPAAVIENGTLPDQRAVRAPLSGIHAKVMEAGIGTPAIIVVGETAS FDMKCDLKCRSFGPLAGYRIGMVGTRHFTDSLGHALKQEGAVVSPILEMKVISHTKCPAM QEAYRNLAAYTWLIFTSANGVRLFFQEMQHSGMDYRLLGHIKFAVIGDGTGRELANFGFH ADYMPDSFCAEALARGLASILTDSDRILIARSKGGSPVLTSILEKAGLVYDDIVLYQVEG RLASEQRQAEEGFDYITFASASGVRAYLSSGLNINPGQRTGQTRLVCIGDITAGELKKHG IKADITAGTYHIQGLVQAIADDAISREAEGSASV >gi|157101640|gb|DS480684.1| GENE 3 3675 - 4583 860 302 aa, chain - ## HITS:1 COG:aq_263 KEGG:ns NR:ns ## COG: aq_263 COG0181 # Protein_GI_number: 15605804 # 
Func_class: H Coenzyme transport and metabolism # Function: Porphobilinogen deaminase # Organism: Aquifex aeolicus # 5 294 22 300 323 191 41.0 2e-48 MNDIIRIGTRKSALALIQTQLVADELRQVCPGIQVEIVTKDTLGDRILDKPLQEFGGKGV FVSEFEQAIQEGVIDLAVHSAKDLPMELAQGLTIAAVSGREDPRDVLVTMRGHRIRPESV VRIGTSSPRRQLQARLMCKTLWPEAAGAECSTLRGNVHTRLRKLEEENFDGIILAAAGLK RLGLLEQDTYEYTFLAPEEFIPAGGQGLMAVEAKAGTAAHSLAQAMDHAQGRLCLDLERS VLKQLDAGCHEPIGIYSVLQNGQLEVWGISSPREEVKRIHLTGGTSRRDIEKLASEAWKG LM >gi|157101640|gb|DS480684.1| GENE 4 4576 - 5226 435 216 aa, chain - ## HITS:1 COG:PA2611_1 KEGG:ns NR:ns ## COG: PA2611_1 COG1648 # Protein_GI_number: 15597807 # Func_class: H Coenzyme transport and metabolism # Function: Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) # Organism: Pseudomonas aeruginosa # 1 186 1 180 213 119 36.0 5e-27 MAYFPFFADIEGMRWLIAGGGSVALRKVHDLLPYGAVIEVVSPDMSPGLDEMAQDIAYTD TLKLTRREFEDRDLDRADFVIAATSEPGLNSRISVLCRSKRIPVNVVDVKEECSFIFPSI VRDGPVVVGISTGGMSPVIARYLKARIRSVMPDSLGELTTRLGAYRETVKDLFPDSPRVR SALFYELAEEGLSHNGCLTREQAQIIINRKLEQEHE >gi|157101640|gb|DS480684.1| GENE 5 5361 - 6797 1418 478 aa, chain - ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 1 311 1 318 425 172 32.0 2e-42 MRRISYIITSLIMVFSMATITAFAKPDWPIDTGIQSEAGIVMDMDSGAVLFAQNIHEQKI PASITKLLTALVVIENTNDLDAPVTFSHDAIYNVESGAGNKFNLEEGDVLSVRQCLYAML LQSSNQSANALAEYVAGSRQAFVDMMNQRIAELGCNESHFANPSGLNDDTQRTTAYDMAI IARACFQNPTVLEIASARTSTIPATANNPNGRTFSIEHKMLLSDDSNYYPDAIAGKTGWT SQAGQTLVTYARREGRGQIAVTLKSTQKTHYSDTKTILDFCFAKFKNVNIADNEIEYVTG DTPVTIAGETYDPSELSIESGAVITIPNDAQFTDADKALETDLPEDHPQEAVARLVYTYN ERRVGDAWIYSSRVQAVNATPSEPGEDTPQESQPEKDGQDGKTGLKLPRAVLIGGGAVLV LALIAGGGVFWFKRRQAAERERQRILREKRRQRLADIGYTEEDFERLLDERRKQHDMR >gi|157101640|gb|DS480684.1| GENE 6 6887 - 7081 70 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160937987|ref|ZP_02085344.1| ## NR: 
gi|160937987|ref|ZP_02085344.1| hypothetical protein CLOBOL_02880 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02880 [Clostridium bolteae ATCC BAA-613] # 1 64 1 64 64 115 100.0 8e-25 MYLVEKRISHAVGIAVYFRWHILEAVPLFYTSLEVKKSILLNPCIITVYLGIVRWMSGFP VPDL >gi|157101640|gb|DS480684.1| GENE 7 7259 - 9493 2000 744 aa, chain - ## HITS:1 COG:aq_035_2 KEGG:ns NR:ns ## COG: aq_035_2 COG2199 # Protein_GI_number: 15605636 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 558 718 77 244 251 105 36.0 3e-22 MGDGMKKLAAAKYRIFLIVSSFLLCLFLALPVQAAEKVRVGCVDIGDFIQIDQDGYAIGY GADYLNKIAEYTGWEYEYIQAPWGECLDMLRRGELDLLLPAEYSEERAEDFLFSSYECCF DFAALVGRKSDERLYYDDYHGFQGIRVGMIKGNFLNDLFAEYAENHGFSYEAVYYDTGTQ ILEALEREDVDAIINGNMEYNMNQKLLAKIDYMPAYFITSVDKPGLMVRLNKALKQIFID NPYYSAGLYEKYYHEMDRQFAEFTREESQFVKQTGPVSVLVARSDYPFEWYDEKEKTCKG AYVDYMRHLSKVSGLNFEFIPLEPEEASADKLFRENAQIRLSVFKQKRSGMSGSLKYTIP YYNCSFSLVGKKDEPLNLSSYQRIAMVEKVDGLQEVLEDKYPKWEIVTYDTPKDCLTAVE NGNADCAMVSSIKLSADRNLLGMNLVVVDGSTAVSPVYIGVSESANPLLAQVLSKSIAKA GEGAMDEAVYATLLSGKENKDFSYFIQTYPLYFAFGVIAVSLLGVGILFMRYDARHQKLQ NLILQKKNEELKAAIAMQTLLRRKAQTDALTGLKNKSTTEELCRACLEHAQGDICALFIL DLDDFKHINDERGHQAGDVVLRAFGDTIHRCVRQDDVAGRIGGDEFMLFMGGIKDADQLT RFADRVYRALKDNPDFNATCSMGIAVGRTGNISYEEMFGMADHALYQAKANGKNKYHIEY IPGEAAEDGNSESSEPAKSSGDIL >gi|157101640|gb|DS480684.1| GENE 8 9761 - 10633 1032 290 aa, chain - ## HITS:1 COG:SA0773 KEGG:ns NR:ns ## COG: SA0773 COG2035 # Protein_GI_number: 15926501 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Staphylococcus aureus N315 # 3 254 7 253 283 109 33.0 5e-24 MLLNGIRGFCMALADSVPGVSGGTIAFLLGFYDTFIESLNDLMGRDNGRRKAAFLFLVKL GVGWVIGFCMSVLILSSLFEVHIYRISSLFLGLTLFAIPMVIREEKEVLKDRYGNIVFTL IGVLLVFCITYFNPSGGEGISVSVEHLSLGLVVYVFFTAMIAITAMVLPGISGSTLLLIF GLYVPIIGAVKEFLHMNMDYFPILMVFGLGVVTGIVGIIKIIKMCLERYRSQTIYTIIGL MTGSLYAITMGPTTLDVPRAPMSPSTFSILFFLLGGVILAGLELLKARLD >gi|157101640|gb|DS480684.1| GENE 9 10839 - 11723 1096 294 aa, chain + ## 
HITS:1 COG:TM0607 KEGG:ns NR:ns ## COG: TM0607 COG0491 # Protein_GI_number: 15643373 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Thermotoga maritima # 13 293 3 280 282 191 37.0 1e-48 MYELIQAGPRSFYINSPVKIGVYLIDDEKVCLIDSGNDKDAGRKVRQILEREEWSLHAII NTHSNADHIGGNKYLMQQTGCRIYASSMEAAFTRHPILEPAFLYGGYPFKDLRHKFLMAQ ESDVSDVSRLSEDQLLKDLELIPLPGHFFQMIGIRTPDDTVFLADCLSSRATLEKYRVSF IYDVAAYLETLDKVASMKANLFVPAHAEVSYNVAALASYNRDCVMEIKELLLELCRVPRT FETLLKEVFDHYGLVMTFQQHVLVGSTIRSYLSWLKDMGEMEAYIEENCLYWRS >gi|157101640|gb|DS480684.1| GENE 10 11770 - 12648 831 292 aa, chain - ## HITS:1 COG:CAC3492 KEGG:ns NR:ns ## COG: CAC3492 COG1533 # Protein_GI_number: 15896729 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Clostridium acetobutylicum # 1 288 1 288 290 365 59.0 1e-101 MHEVKAKSILSARNGMNLYRGCSHGCIYCDSRSTCYGMDHAFEDIEVKINGPELLEQALR RRRKRCMIGTGAMCDPYMPAERSLGLTRKCLEIIEHYGFGVSILTKSDLILRDLDLLKRI NQKTRCVVQMTLTTCAEDQCRIIEPHVCTTSRRAQVLEIMRDEGIPAIIWMCPILPFIND TEENIQGIIEYGERARVHGILAFDIGVTLRDGDRQYFYSRLDEHYPGLKKQYMDMYGNAY EVGSPRKKELMAFFKRLCRERQMETDINKLFAYLQKFEDKGAGEQLSFMDLI >gi|157101640|gb|DS480684.1| GENE 11 12671 - 13711 1149 346 aa, chain - ## HITS:1 COG:STM4580_3 KEGG:ns NR:ns ## COG: STM4580_3 COG3172 # Protein_GI_number: 16767821 # Func_class: H Coenzyme transport and metabolism # Function: Predicted ATPase/kinase involved in NAD metabolism # Organism: Salmonella typhimurium LT2 # 157 341 2 182 185 87 30.0 5e-17 MKYKTGMFGGSFDPLHTGHIHDIIRAAAMCRELYVVISWCRGRESTSKEMRYRWILNSTR HLSNVMIRMVEDQALTKEEYDTPGYWEQGARDIKAVIGKPIDAVFCGTDYLGTGRFEALY GPESQVIYFDRSEVPVCSTDIRAWALGHWDYIPSVCRDYYARRVLVLGSESTGKSTLVRN LALAYNTNYVSEAGRDTCDYAGGEDLMIAEDLYENLLRQKINVMEAAKHSNRILFVDTDA VTTLFYSHFLLGDKQKELTVCTKLAEAIHETDRWDLVLFLEPDVEFIQDGTRNEKIRQDR EGCSLRIKELLDRYRVEYHCIGGTYLERFNRAKELIQEQIGLVTVW >gi|157101640|gb|DS480684.1| GENE 12 13701 - 14396 788 231 aa, chain - ## HITS:1 COG:PM1838 KEGG:ns NR:ns ## COG: 
PM1838 COG3201 # Protein_GI_number: 15603703 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinamide mononucleotide transporter # Organism: Pasteurella multocida # 9 231 11 233 234 115 29.0 5e-26 MGRFLKHELSGWKAWEIIWLAAACTIITALSIHWGDDVLGITMALTGVICVILTGKGKMS CYLFGLVNTVLYAWIAFGARYYGEVMLNACYYVPMQFVGWFMWKKHMDGKTKEVEKTRLT VSRQILLLVLSAVSICGYGLVLKRLGGNLPFIDSMSTCLSVLAMLLSVRRLMEQWIVWIV VDVVTVFMWFADYQKGGTDIATLLMWCIYLLNAMFMFVKWFRESGEHVNEV >gi|157101640|gb|DS480684.1| GENE 13 14374 - 15492 643 372 aa, chain - ## HITS:1 COG:SA2305 KEGG:ns NR:ns ## COG: SA2305 COG3949 # Protein_GI_number: 15928096 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Staphylococcus aureus N315 # 31 354 25 348 359 115 21.0 1e-25 MKGKGSRCSIRNVVTFAGAFIAWIIGSGFATGQEILQFFTSYGYYSYGVVLLNLLGFLFL GQVLLTTGYAHREEVNFNHFTFFCGKNVGRFYSWLIPVTLLMIMAVLISGAGATISEYYG INRYLGSMIMAVMVLCAYLTGFEKMVRIVSTIGPVIIAFTLMVGLATVIRGAANLSHIRQ YETGLGASRSAPNWLISSVLYISLNFLSGSTYFTALGMTARSRKEAKWGAVWGALTMILA IAIMNTAILLHSKDAASLAVPTLYLARHISNVLGAVFSVVLVLGMFSSCSTMMWTVCSRF SFSAPGRNRMAAAAVTVFAFLLGMLPFSRLINVFYPFVGYMGLIYIGCVACKGLRPLKTY KKGNGDGKISET >gi|157101640|gb|DS480684.1| GENE 14 15632 - 17329 1918 565 aa, chain - ## HITS:1 COG:TM0402 KEGG:ns NR:ns ## COG: TM0402 COG0004 # Protein_GI_number: 15643168 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Thermotoga maritima # 5 409 30 429 435 378 54.0 1e-104 MEFSAVNTIWVLLGAALVFFMQAGFAMVETGFTRAKNAGNIIMKNLMDFAIGTPLFWLTG FGIMFGGAGAFIGGFDPLVRGDYSGILPAGVPLPAYLIFQTVFCATAATIVSGAMAERTK FISYCIYSAVISAVVYPVSGHWIWGGGWLARMGFHDFAGSTAVHMCGGAAALIGARVLGP RMGKYTEDGKPNAILGHSLTLGALGVFILWFCWFGFNGCSTVAMDSDAAVYSAGNIFVTT NLAAATATVATMIITWLRYRKPDISMTLNGSLAGLVAITAGCDMVSPAGAFFIGLIAAFV VVFGIEFIDKVCKIDDPVGAIGVHGMCGAAGTLLTGVFAVDGGLAYGGGFSFLGIQLLGV VSVILWVSVTMIITFRVLKHTIGLRASEEEETKGLDVTEHNLASSYADFMPMVFMGKAKE GAADTGVPVEKAVPVEHYPSAKPVSASVKLSKVVVVFNQSRFTALKDALTELGVTGMTIT QVMGCGTQKGHVNYYRGIKVEEAALLPKMKLEVVVSKVPVEDVLETARKALYTGNIGDGK 
IFVYEVENVVKVRTGEQGYDALQGE >gi|157101640|gb|DS480684.1| GENE 15 17565 - 17858 421 97 aa, chain - ## HITS:1 COG:no KEGG:Closa_0987 NR:ns ## KEGG: Closa_0987 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 96 1 90 91 72 37.0 7e-12 MLELRDFMPVNFLKKEKFTGSHNGMRFRMEKLDLEGEDGPRLSVTIWPEPYGYDATPEEE KEQQILSFDADGIAKGVDWLNEQFEGQKARWKAVSRG >gi|157101640|gb|DS480684.1| GENE 16 17877 - 18458 760 193 aa, chain - ## HITS:1 COG:no KEGG:BHWA1_00709 NR:ns ## KEGG: BHWA1_00709 # Name: not_defined # Def: hypothetical protein # Organism: B.hyodysenteriae # Pathway: not_defined # 62 169 31 135 155 62 30.0 7e-09 MNRIKRFMGRKGILLAIALTALLAAGCGSKKEPAKTSEAETSAPAETATEPEAGNQELPN PMVEVEDVLAFEAIGVHMVLPEGAEDTSFFIINQEVADAQFTLDGVAYTYRASDTAEDFA GIFERFKDEAIAQNYDYGETSLEVLIKTTDSGGRLASWEWGSTKYTLYTAAQVDDETITD LTMNLVELSQYEK >gi|157101640|gb|DS480684.1| GENE 17 18491 - 20077 1699 528 aa, chain - ## HITS:1 COG:L113400 KEGG:ns NR:ns ## COG: L113400 COG0038 # Protein_GI_number: 15673646 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Lactococcus lactis # 22 516 16 511 512 342 41.0 8e-94 MQPEEHTVTHTINRYRSFRYELILEGVVVGALAGAVVVAFRYLIGSADILLNMILDYGKS HTWFVPVWILILAVAACVVSLLLKWDSLISGSGIPQIEGEIIGEIDEKWWRVLLAKLGGG IISLGCGLSLGREGPSIQLGAMTAKGFSRAAKRVKTEEKLLITCGASAGLSAAFNAPIAG ILFSLEEVHKHFSPELLLSSMAASITSDFVSRNVFGLTPVFSFHITHMMPLGTYGHVLVL GVLMGAMGVVYNTTLSKTQDLYGKIPWGTVKLLIPFLLAGVFGFTYRSVLGGGHALVEEL STGEMALGALCLLLVVKFTFSMISFGSGAPGGIFLPLLVMGAIIGSIYYNAALLVSPSLS GLVGNFIILGMAGYFSAIVRAPITGIILISEMTGSFSHLLTLSMVSLAAYLVPDMVHCAP VYDQLLHRLLAKQNPDKKVTLTGEKVLVEGMIYHGSAAEGKKVSAIAWPRTSLVVSLMRG EAEFVPRGDTALRAGDKIVVLCDETSQGQLHKALQEYCETVAPMQKGQ >gi|157101640|gb|DS480684.1| GENE 18 20111 - 21343 1718 410 aa, chain - ## HITS:1 COG:FN1411 KEGG:ns NR:ns ## COG: FN1411 COG1171 # Protein_GI_number: 19704743 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Fusobacterium nucleatum # 6 404 
4 402 404 395 55.0 1e-109 MEELTLERFEEASELVKDVTTETKLVYSEYFSAQTGNKVFFKPENMQYTGAYKVRGAYYK IHTLTEEEKSKGLITASAGNHAQGVAYAAKLAGVKATVVMPTTTPLMKVNRTKSYGADVV LEGDVFDEACDHAYKLAEEFGYTFVHPFDDLDVATGQGTIAMEIIKELPTVDYILVPIGG GGLCTGVSTLAKLLNPKIKVIGVEPAGANCMQVSLKEGKVVGLPQVNTIADGTAVKRPGE KLFPYIQENVDSIITIEDSELIVAFLDMVENHKMVVENSGLLTVAALKHLNVQKKKIVSI LSGGNMDVITMSSIVQHGLIERDRVFTVSVLLPDKPGELARVSALLAKEQGNIIKLEHNQ FISINRNAAVELRITMEAFGTEHKNQIVSALTSAGYRPKLVKFKGTYSEM >gi|157101640|gb|DS480684.1| GENE 19 21403 - 22356 845 317 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 14 314 24 329 336 330 52 2e-89 MAEGRKGFFGRLVEGLAKTRSNIVSGIDSIFSGFSAIDDDFYEEIEETLIMGDLGIQTTM SIVEDLRKKVKEQHIREPEECKELLINSIKEQMDLGENAYEFEHRKSVVLVIGVNGVGKT TSVGKLAGQLKDDGRRVILAAADTFRAAAIEQLTEWASRAGVELIAQQEGSDPAAVIYDA VAAAKSRKADVLICDTAGRLHNKKNLMEELKKINRIIDKEYPDAYRETLVVLDGTTGQNA LAQARQFMEVADITGIILTKLDGTAKGGIAVAIQSELGIPVKYIGIGEKIDDLQKFNADD FVNALFHVQKQAEENPS >gi|157101640|gb|DS480684.1| GENE 20 22455 - 26015 4269 1186 aa, chain - ## HITS:1 COG:CAC1751 KEGG:ns NR:ns ## COG: CAC1751 COG1196 # Protein_GI_number: 15895028 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Clostridium acetobutylicum # 1 1182 1 1186 1191 610 35.0 1e-174 MYLKSIEIQGFKSFANKLVFEFHNGITGIVGPNGSGKSNVADAVRWVLGEQRIKQLRGAS MQDVIFAGTEMRKPQGFAYVAITLDNSDHQLAIDYDQVTVSRRLYRSGESEYMINGSACR LKDINELFYDTGIGKEGYSIIGQGQIDRILSGKPEERRELFDEAAGIVKFKRRKAIAQKK LEDEQANLVRVSDILSELEKQVGPLERQSRAAREYLQLKDSLKICDANLFLMETEGTGTQ LEEVEKRHQVLSGDMEDTGRESERLKAGYEELEQTLADLERRMAADRDELSQGTVLKGNL EGQINVLKEQIHTEEMNQEHLEHRRDVIKEELAAKNQQLASYREERKAMGDQVRDAVKRQ EEAGARLGDQDESIRRLEEAIEGAKASIIQALNERASLTARQQRYETMLEQVNLRRSEVS QKLLRFKSDESVQDELIGRERALLDQLNEELEKKQFAAQEAEDALLKAEQESHRLNRNLN DTQQEYHMAYTKLESLKNLAERYDGYGNSIRRVMEVRDRVHGIHGVVADIITTSQKYETA IETALGGSIQNIVTDSEATAKQLIEYLKKNKYGRATFLPLTSINGKQTFSQPAALKEKGV 
LGLASDLVQVDSRYEGLARYLLGRVVVADTIDNAIALARKYKYSLRIVTLEGELLSAGGS MTGGAFKNSSNLLGRRREIEELENTCSKALVQVEKIQKELNLEESMAQEKKGELEKLRAD MQSMAIRENTIRMNISQLEDKKAEIAESSTDLVREHGQLEEQVKEINESRTALTQDSREL EQVSTQANQEIEDKTVLLETSRKERETCAAALSALQMEAANLRQKQDFIRENADRVSGEI EKLTEEFDSLAEGTGNSEQVIEGKRREIAHLGELIQNAMVHMKELEQVMAGHESEKEEMS SRQKAFLAKREELTARLAELDKDMFRIQAQREKLEEKLEASTAYMWSEYEMTYSTALELK REEYQSVPEVKKLIDELKSRIKGLGNINVNAIEDYKEVSERYEFMRTQHEDLVTAQAELE KIIEELDTGMRRQFEEKFGEIRAEFDKVFKELFGGGRGTLELMEDEDILEAGIQIIAQPP GKKLQNMMQLSGGEKALTAISLLFAIQNLKPSPFCLLDEIEAALDDSNVDRFAGYLHKLT RNTQFIVITHRRGTMVSADRLYGITMQEKGVSTLVSVNLIADDLDK >gi|157101640|gb|DS480684.1| GENE 21 26027 - 26731 954 234 aa, chain - ## HITS:1 COG:lin1919 KEGG:ns NR:ns ## COG: lin1919 COG0571 # Protein_GI_number: 16800985 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Listeria innocua # 6 225 5 226 229 195 45.0 7e-50 MNSHLKELEERIAYEFKNKNLFTQALTHSSYANEHRLDHSRCNERLEFLGDAVLEIVTSE FLYRKYETLPEGDLTKIRASIVCEPTLAYCAGDIELGQYLYLGKGEDATGGRNRNSVVSD AMEALIGAIYLDGGFANAKEFIHRFILNDIEHKQLFYDSKTILQEMVQSRQEAPLSYEII REEGPDHNKSFEVCAKIGDEEVGRGAGRTKKAAEQVAAYNGILKLKAQEAAAEE >gi|157101640|gb|DS480684.1| GENE 22 26832 - 27062 391 76 aa, chain - ## HITS:1 COG:HP0962 KEGG:ns NR:ns ## COG: HP0962 COG0236 # Protein_GI_number: 15645578 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Helicobacter pylori 26695 # 3 75 81 152 153 65 53.0 2e-11 MEFEKLQGIIAEVLNIEPEEVTMAATFVDDLGADSLDIFQIIMGIEEEFDIEIPNEAAEQ IVTVGDAVEQIKNALN >gi|157101640|gb|DS480684.1| GENE 23 27112 - 28119 1154 335 aa, chain - ## HITS:1 COG:CAC1746 KEGG:ns NR:ns ## COG: CAC1746 COG0416 # Protein_GI_number: 15895023 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Clostridium acetobutylicum # 4 333 3 331 331 306 46.0 5e-83 MIKVAVDAMGGDYAPGEMVAGAVEAVNAKPGIQVLLVGQEQAVVSELSKHTYDKDRIQVV 
NATEVIATEEPPVNAIRKKKDSSIVVGLNLVKQGEADAFVSAGSSGAILVGGQVIVGRSK GVERPPLAPLIPTEKGVSLLIDCGANVDARASHLVQFAQMGSIYMEHVMGIKNPRVAIVN IGAEEEKGNALVKETFPLLKECKGINFTGSIEAREIPHGGADVIVCEAFVGNVILKLYEG VGATLISMVKGGMMSTLRSKIGALLVKPALKETLKSFDGSQYGGAPLLGLKGLVVKTHGN SKRTEVRNSIIQCVTFKEQDINGKIQQCLNVQNEE >gi|157101640|gb|DS480684.1| GENE 24 28388 - 30085 1402 565 aa, chain + ## HITS:1 COG:alr1230_2 KEGG:ns NR:ns ## COG: alr1230_2 COG2200 # Protein_GI_number: 17228725 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Nostoc sp. PCC 7120 # 323 555 24 255 266 183 41.0 9e-46 MTNDKTDILGQFGFELDAALFYDAIVKSTDDYVYIVNMKTDTSLISENMVRDFELPGRIV PGLVPLWGSLIHERDRARYDESIEQMLSGLTDVHDVEYQVRNRKNEYVWVVCRGVLQRDQ QGSPAMFAGIVTNLSHRGKVDYVTGLFMQQECERVVEECLSEGQSGGILLLGLDDFSGIN NLKGHIYGDSVLRQFAQNVQRLLPDYASIYRHDGDQFVIVCQNTSYGEIALLYEVINSYS DCRHEVDGIQYYCTVSGGIAVFCQDGSSYTELLTCASGALETSKHKGKNKCTLYDEELIH TRLRSLEIMDRLRSSVMNHMEGFSVVYQPIVRSSDTKIAEAEALLRWSCKDMGSVSPLEF IPLLESSGLIIPVGKWVLEQSVRQCCKWLACVPDFVMNVNCSYLQMLDEDFVQSVKEIIN RYRLDPSHIVLELTESRFVTDQDRLNDTFMRLRSLNIRLAMDDFGTGYSSLSCLSCTPAD IVKIDREFIRGICDQSHSFNRSFIGSVINLCHSVGISVCAEGVEEKDELETVLRLKADSV QGFYFSKPVTPDEFEKQHIKHPDIG >gi|157101640|gb|DS480684.1| GENE 25 30093 - 32198 1828 701 aa, chain - ## HITS:1 COG:BS_ykoWm_4 KEGG:ns NR:ns ## COG: BS_ykoWm_4 COG2200 # Protein_GI_number: 16081166 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Bacillus subtilis # 459 700 11 251 259 138 31.0 5e-32 MEQDMILYETPLLDELEEIIYVSDIRDYSLLYLNRAGRSLTGLKQEDVRNSKCHRVLQGR DTPCPFCTNAKLSKDGFYVWEYENPHLNKNFIVKDKLVEWEGRPARMEIAVDVGDRVKER GSHTSKYHMETVMLETLRVLNTADYLDDAINRTLEMIADFYGGERAYIIEIDRVQGFARN TYEWCRAGIPSQKAALQQVPLDGIPYLFETFNKKQHLIISRTGELRDSYPSEYRYLTARN TDSLFAVPFEDESAFSGYIGIDNPNINQDTIKLLDSIAYNIANEIKKRRLYERLEYEAAH DSLSGLLNRESFVHFRNDLMRSGKAVSCGIVAGDINDLKQLNRDYGQSRGDMTIKEAASV MASSFAGASIFRLSGDEFIIVSLESAYEEFMERVGKMEWMLDSRTPNGVSLGCTWEEHLA DFDRLMRHAEELMLVNKQIYYKDSSQERKHYSPEYLNRLLQDMEEGCYRLCLQPKYNPLT 
GKVCSAEALVRYRAAGTESTQPLNFVPLLEKTKLIRYLDFYMLEQVFKLLSDWKERGREL IPVSVNFSRITLLERDLFRVLTDMQKRYHIPARLVMIEITESIGDIERKVIESIGSKLRE AGFRISLDDFGANYANMSILSIMQFDEVKLDKSLMDDLVENQTNQTVVKCIIDMCHSLQV ECVAEGVENQEQLELLTSFGCTAIQGYYYSRPLEIAEFERI >gi|157101640|gb|DS480684.1| GENE 26 32213 - 34732 1919 839 aa, chain - ## HITS:1 COG:DR2498 KEGG:ns NR:ns ## COG: DR2498 COG2199 # Protein_GI_number: 15807483 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Deinococcus radiodurans # 663 818 200 356 356 93 33.0 2e-18 MNQLIKKYRMVLSAIYCLILPYSFVSYGAEAESVIRVGFPVQSGLTMKDENGNYAGYTYD YLKEIGQYTGWTYEFVEPQGSMDEQLIQMMDMLERGELDLVGAMNNNKQTSSVFDFPSEN YGNAYSVIAVRDDDDRIDEYNLSDFKGLRIALLKQADYHNEKFYQYAKLNGIQYEIVWCE RDGEQEERVYSGKADALLSVDVSLSQGFRPVAKFSPTPFYFATTKGNTKIINELNRAISY ISENNPTLQMNLYNKYFSRSGSQMHLNSKEREYIQEHPVLKVLVHDGFGPIQYYDGKGQV QGVARDLLSSIAQKAGWTLEYVYADDYSEFEQALNEGRADVILSILYDYDAVQRRNVLLS NPYLETESVLVARDGFDMTNLKGRVQAVYMGMRKSDDDRTDVRFYDSLEESLNAVERGEC DYTYSNSYTASYYLSRNQYEHVAIYPQAGSDSVKYSIGILRKDDKQILAILNKGIRSIET GELEKYIYNNAQQKQEVTLRTFIRDNSVPFILFTLLAASGLLALIYAHYRSQMRMKRQIE LENTRYRYLSDILKEVTFTYDYGSDVLTLSREGVEIFGTDKSIGQYSRYQGSSGQEEGLP SLYYLLEQRQDVDTEILMVLPNGDTKWHRAVIKVIFDGNQADSAIGRLQNIHGDKLERER LEQRSKRDALTGIYNIAAAKMEITRMINLHTGALALAVIDLDGFKEINDRYGHYTGDQVL IHTAKALRESFCGEAVTARMGGDEFIVCVPYTGENHLAKCCNALFDSLQAKCRASGYPAA TVSIGISISRREDDYTTLYQRADTLLYEVKNSGKNNFRIEDEAGRTSRPDTEQEHHTVQ >gi|157101640|gb|DS480684.1| GENE 27 35206 - 36102 719 298 aa, chain + ## HITS:1 COG:AF1311 KEGG:ns NR:ns ## COG: AF1311 COG1032 # Protein_GI_number: 11498909 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Archaeoglobus fulgidus # 1 276 1 273 287 144 32.0 2e-34 MNYEGQICRGPMERSSYMLPVAVGCSYNRCKFCTLFRHLKYRELPMEQIEAELERVRSLG GNPKHVFLGDGNAFGLGIKRLLEITDLIHRYFPDCQAINMDGTITNIQAKSQEELRALRE AGIRHLYLGIESGLDDVLRYMGKDHNLDQAYRQIDRIQSAGFIFDAHIMTGIAGHGRGLE NGTATAEFFNRTQPQRIINFSLFLFHSAPLYQEAQAKIFIPATELENLQEERLLLELLNT DGPLTYDGFHDRIEFRVRGTLPDDKMKMLDRLDLAIEGYREHEPVIAAADDSPGAYSI 
>gi|157101640|gb|DS480684.1| GENE 28 36272 - 37186 129 304 aa, chain + ## HITS:1 COG:no KEGG:BCE_3152 NR:ns ## KEGG: BCE_3152 # Name: not_defined # Def: BclA protein # Organism: B.cereus_ATCC10987 # Pathway: not_defined # 49 137 95 183 347 87 62.0 5e-16 MRYDYANGMNRGEYSYDESNEYSGYDNRYAEPMGIKPNSFYHRCCQGPTGPAGPPGCPGL PGPIGPRGCPGPQGITGPTGAMGPQGYVGPTGPMGPTGPAGQAGAVGPTGPAGSTGPMGP AGTTGVTGATGPTGPAGNENSVSCRCKEQMRNIIQQIIALYPNNNLLVTLSSGDASFGRP GSLILGPDGRTGVFEVINPQNQMQYLSICSIDTIQIDNATYNDTIVYLPEPFPAPTDCCA DCQAAIRSLLPVGTAGVNIITSSQTPSVGTVIRNEYGMLVLSNEARTNITFVSSCSIDLF LINT >gi|157101640|gb|DS480684.1| GENE 29 37320 - 37424 93 34 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938016|ref|ZP_02085373.1| ## NR: gi|160938016|ref|ZP_02085373.1| hypothetical protein CLOBOL_02909 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02909 [Clostridium bolteae ATCC BAA-613] # 1 34 1 34 34 68 100.0 2e-10 MLFGKFCIDYSSGQFDILKHCTEPENTKTIAGSY >gi|157101640|gb|DS480684.1| GENE 30 37619 - 38968 809 449 aa, chain + ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 5 449 3 444 447 199 30.0 1e-50 MNDKQDLTTGTISKKLIAYALPLLAANLLQSFYSLMDMLVVGRIVGETGLAAISNASMLS FIINSVCIGITMGGTVLAAQYKGAKDEHGQIEIIGTLFTVAFIAAVLVTAAGLAAYRPLF RLLHVPAPAMEDACGYMKIICYGTIFVFGYNAVCSIMKGLGDSKSPLYFVAVATVINLVL DLLLVGPFGMGTEGAACATIFSQGISFVISVFHLKRKDSVFDFRLSHFSIKTDELAAILK VGLPTAVQMASVNLSYLLITGMLNQFGVSVAAASGVGLKINTFAGMPCWAIGQAVTAMAG QNMGAGNIKRVQKTTSTGLCLNLLITLIMVLLVQIFGKQLILLFTPASPEVLEEGILYLR ICCSVNSLVYAVMYTFDSFAIGIGSANIAMFNALLDAGIVRLPVSWLLAFPLGLGYPGIY IGQALSPFLPAAAGLLYFKSKGWKNKRIL >gi|157101640|gb|DS480684.1| GENE 31 39021 - 40646 1282 541 aa, chain + ## HITS:1 COG:AGl1953 KEGG:ns NR:ns ## COG: AGl1953 COG0488 # Protein_GI_number: 15891093 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: 
Agrobacterium tumefaciens strain C58 (Cereon) # 5 531 33 554 567 274 32.0 3e-73 MKGTNMNLSFGLEVIYEDADFHIDAHDKAGIVGVNGAGKTTLFHVLMHEQELDSGTVSTG HLRIGYLPQEIILENESCTVLEYLQSGRPIQKLENELNQIYQQLESADSSEHAQLFKQME KLQSRLESLDYYEADSILLHIIDRMQIDIDLLGMPLGNLSGGQKSQIAFARVLYAKSDIL LLDEPTNHLDAAAKEFVTGFLKNYKGTVLIISHDISFLNQIINKILFINKATHKISVYDG SYDTYKQKYAQEQRLREQAIVQEEKEILELEEFVRRANQASRTNHALKRMGQERALRLEK KRRAKQTRDRIYKRIKMDIHPLRECAAILLEAEHLSFHYPNQPPLYRDLSFQINGKERFL VVGENGVGKSTLLKLIMGILSPDAGNIHFHPKTDVAYYAQELEQLDLQKTVLENAWTDGY SVKQLRSVLSNFLFYADDIHKKAEVLSPGEKARIALCKVLLQRANFLVLDEPTNHLDPET QSIIGGNFHLFEGTIMAVSHNPWFVEQLGINRVLILPSGRIVDYSRELLEYYYGLNSEEE L >gi|157101640|gb|DS480684.1| GENE 32 40648 - 41274 490 208 aa, chain + ## HITS:1 COG:CAC0777 KEGG:ns NR:ns ## COG: CAC0777 COG0110 # Protein_GI_number: 15894064 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Clostridium acetobutylicum # 1 207 3 209 210 290 64.0 2e-78 MADTKKLYPRTGDKQTVYLKNVITDPAITVGDYTMYNDFVNDPVGFEKNNVLYHYPINRD RLIIGKFCSIACGAKFLFNSANHTLSSLSTYPFPLFFEEWGLEKRNVAESWDNKGDIVLG NDVWIGYEAVIMAGVTIGDGAIIGARAVVTKDVPPYTVAGGIPAKPIKKRYPEETIAALS ELKWWDWPENRIAQNLHAIQAGQLNELK >gi|157101640|gb|DS480684.1| GENE 33 41356 - 43596 1897 746 aa, chain - ## HITS:1 COG:lin2156 KEGG:ns NR:ns ## COG: lin2156 COG0178 # Protein_GI_number: 16801222 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Listeria innocua # 4 741 6 746 746 772 51.0 0 MADILVYGLTQNNLKHVTFRIPKEKITVFTGVSGSGKSSIVFDTIAAESQRQMNAAYPAF VRSRLPKYPKPAAERIDNLTASVVVDQSPLGGNARSTVGTISGLYASLRLLFSRIGKPYA GTASYFSFNDPNGMCRTCSGLGKITKVDVEAVLDMDKSWNEGCVKDSLYRPGSWYWKQYA QSGLFDLDKPIRAYSGEEYNLLLYGSRDGKGEPENPKVTGLYHKYTKTLLNRDISNKSRH TREKSQSLIAEMECTDCHGKRLNGAALSCKIKGYHIADLCCMELTGLREVLAGITDQRVD VLVQTIIEGLDRLIEIGLPYLHLNRETPSLSGGEAQRLKLVRYMGSSLTGLTYIFDEPSA GMHPRDVYRMNNLLRQLRDKGNTVLVVEHDKDVISIADHVIDVGPGAGQNGGEIVFAGSY PELLLADTLTGKAMRQVLPIKETPRQAAGSLPVRGACLHNLKHVDVDIPLGIMTVVTGVA 
GSGKSTLISRVFAKLYESNIVMVDQGPITATNRSTPASYLGFFDEIRRLLAGESGRPDSL FSFNSAGACPVCGGKGVIVTELAFMDPIVTECEACGSVRYNEEALACTYKGKNIVELLGL TASQALQVFEDAKIRRRLKVMEQVGLSYLTLGQPLSTLSGGERQRIKLAKNLGRKGSIIV MDEPTTGLHMSDIENLLKLFDLIVSRGNTLVVIEHNLDVMKQADWLIDIGPDGGKNGGQV VFTGTPMEMLRCAKTLTADSLRASLG >gi|157101640|gb|DS480684.1| GENE 34 43873 - 43935 74 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METDSALEKGIGMGKPAIGV >gi|157101640|gb|DS480684.1| GENE 35 43925 - 45514 954 529 aa, chain - ## HITS:1 COG:no KEGG:Closa_2955 NR:ns ## KEGG: Closa_2955 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 228 333 299 399 657 66 33.0 3e-09 MLLTTSYKKIMIKKIILVLSAVFIISTFVGAGYIFLHRGRVNLSSAGAEAQKESGEADRP SQALSLTEKSTEDSTADSAANSAEDSAINKAENRAEDGAKNNTENGAGKDVEGDYDTVLL DEARKLCRTYFKSGDFSYQIVITPGGRRNRNQLNVDVVKKDGLGYTVLNRLSMYYDPEYP FRMQSSMDQAAYQADGTIKEYQIVPSENGFIGLYGQLEAQGSYFPLDDLFMPEFFSRPLN ETDLIGMTVADMKILRNQYYAVHGRIFEKQELRDYFQEKTWYLGDKTEDEFDETVFGGLE KRNIAFLKKAEAEFDEEKSKEVKKIYDGLLMAPYASLLTAHTEMGVSLYSDIQHRADKGI YYEAEGTIYTPVILSPKQYEAVMKEGKEERICVNELTKETAAVRRTDNSDYGDCMLFYDS GTESGKTSGEDTPHYYFLSYEPYSGNYTLWANSDDTLFKNIYKGTVYVLKGAEHEWYQYF SIPDKAYLESGERVMQFDDITNQGAAGYGGGQPVFDEKGYLKAIYYYGD >gi|157101640|gb|DS480684.1| GENE 36 45477 - 45623 101 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936491|ref|ZP_02083859.1| ## NR: gi|160936491|ref|ZP_02083859.1| hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] # 1 44 1 44 454 87 100.0 3e-16 MAIHSCRVVEATKKHQKSQYLNVLNHETEKERKYVVFASNNIIQEDYD >gi|157101640|gb|DS480684.1| GENE 37 45759 - 46925 341 388 aa, chain + ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 388 1 388 391 350 45.0 3e-96 MIYVGIDIAKLNHFAAAISSDGEILIEPFKFSNNYDGFYLLLSHLAPLDQNSIIIGLEST 
AHYGDNLVRFLISKGFKVCVLNPIQTSFMRKNNVRKTKTDKVDTFVIAKTLMMQDSLRFM ALEDLDYIELKELGRFRQKLVKQRTRLKIQLTSYVDQAFPELQYFFKSGLHQNSVYAVLK EAPTPNAIASMHLTHLAHTLEVASHGHFGKDKARELRVLAQKSVGVNDSSLSIQITHTIE QIELLDSQLFSTELEMANLVTCLHSVIMTIPGIGVVNGGMILGEIGDIHRFSNPKKLLAF AGLDPTVYQSGNFQAHRTRMSKRGSKVLRYALMNAAHNVVKNNATFKAYYDAKRAEGRTH YNALGHCAGKLVRVIWKMLTDEVAFNLE >gi|157101640|gb|DS480684.1| GENE 38 47170 - 47775 342 201 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0382 NR:ns ## KEGG: Bacsa_0382 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 3 201 4 202 202 184 46.0 1e-45 MKKIIDFMEENVDGRTLFTKELVYELENGALQGVYSDQISFSNLKYSQSGFQIDMFIVSN EKIWLIDKEGQRDKLRKDFSSVSMFRFELAMRKSTNAITGCFRFISASGKNVPAEAVVSG IYDVRLENSVLKLSESQVLYRDQPIQDGRYKPVAFQAEHRFYCEDGKLHYEYDGRCFDVD ANTMQRRDSSDTFPPFISIEK >gi|157101640|gb|DS480684.1| GENE 39 47772 - 48539 300 255 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938026|ref|ZP_02085383.1| ## NR: gi|160938026|ref|ZP_02085383.1| hypothetical protein CLOBOL_02919 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02919 [Clostridium bolteae ATCC BAA-613] # 1 255 1 255 255 495 100.0 1e-138 MKNIRAVIIVLALLSTGGCSANVAQTSDEPWEESGSGLGPENSYVISIDTDSVVIPQTVS AEPAIDLSDYKNIVNHSSDIIYGEVESVENFNADGGTAWVKERVRVLETYKGRLLPGAEI TVIQQTGYISVNDYINSYAKTDRETVRNMMLQNIPVEDAGNMVLDQTQGVPLDQAGDKIV FCLWVSPQSDGEKTYYEPVGDWAGKQVDMGNDTFAQFYPSVDSDAGRAAGENYETRSIDE MKELFETEQIRSLFR >gi|157101640|gb|DS480684.1| GENE 40 48767 - 49090 336 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938027|ref|ZP_02085384.1| ## NR: gi|160938027|ref|ZP_02085384.1| hypothetical protein CLOBOL_02920 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02920 [Clostridium bolteae ATCC BAA-613] # 1 107 1 107 107 207 100.0 3e-52 MADALELHEGEVVEKEFKSDYWEKTLCFYSQKRGKYWLTNQRIVFRGGFATILEIPYADI ESVQLCNVGGLVQIIPTGIKVTLKSGKSYKLSVTKRKEILAFIQEKI >gi|157101640|gb|DS480684.1| GENE 41 49294 - 49881 426 195 aa, chain - ## HITS:1 COG:BS_yobS KEGG:ns NR:ns ## COG: BS_yobS COG1309 # 
Protein_GI_number: 16078967 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 9 189 4 183 191 105 30.0 6e-23 MMKGKVMARAGLDKNVVVEKAAQLANKMGIDQIQLKTLAESLSIQPPSLYNHIRGLDDLR RELMIYGWKQVEKRMLEAASGADGYAAWEAVCRAFYQYATENPGVFSAMLWYNKYQDKET QGVTEKLFSICFAITSSLNISEENCNHLIRTFRAFLEGFCLLVNNNAFGYSLSVEESFDL SLKVMIGGMKELEGK >gi|157101640|gb|DS480684.1| GENE 42 50128 - 50310 304 60 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227872300|ref|ZP_03990657.1| possible ribosomal protein L32 [Oribacterium sinus F0268] # 1 60 1 60 60 121 91 1e-26 MSICPKNKSSKARRDSRRANWKMSAPNLVKCSKCGALMMPHRVCKACGSYNKREIVSVED >gi|157101640|gb|DS480684.1| GENE 43 50314 - 50841 188 175 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168184665|ref|ZP_02619329.1| conserved hypothetical protein [Clostridium botulinum Bf] # 46 169 46 163 166 77 29 3e-13 MLINLSELFPVEGKSKTYTPELEMTQFRLADTVYDIAEKGPLSLVITNRGNKKLALAGTI DLVLIMPCARCLDPVRVPFHLEIDQELDMNQSDEERVEDLDEQPYVNGYKLDIDQLVGNE LTLNLPMAVLCSDDCKGICDRCGTNLNHETCDCDNRPLDPRMSVIQDIFKQSKEV >gi|157101640|gb|DS480684.1| GENE 44 51098 - 52291 1458 397 aa, chain - ## HITS:1 COG:TM0274 KEGG:ns NR:ns ## COG: TM0274 COG0282 # Protein_GI_number: 15643044 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Thermotoga maritima # 1 397 1 399 403 503 61.0 1e-142 MNILVINCGSSSLKYQLINSASEAVLAKGLCERIGIEGSQITYQPAGGEKEVTVSPMPTH TQAIQMVLDALTNDKTGVIKSLDEVGAVGHRIVHGGEAFTASTLITEEAVKAIEECSDLA PLHNPANLIGIRACQELMPNTPMVGVFDTAFHQTMPEKAYLYGLPYEYYEKYKVRRYGFH GTSHSYVSKRTAEVLGRPYDSLKTVVCHLGNGSSISAVLNGKSVDTSMGLTPLEGLVMGT RSGDVDPGALQFIMHKENMDIDQMLNVLNKKSGVYGMSGVSSDFRDVENAANDGNKKAEV ALESFAYRVAKYVGAYAAAMSGVDAIAFTAGVGENDKITRKKVCEYLGFLGIEIDDEANS KRGQEIVISTPDSKVSVLVIPTNEELAIARETLALVK >gi|157101640|gb|DS480684.1| GENE 45 52316 - 53311 1022 331 aa, chain - ## HITS:1 COG:MA3607 KEGG:ns NR:ns ## COG: MA3607 COG0280 # Protein_GI_number: 20092407 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: 
Methanosarcina acetivorans str.C2A # 3 331 4 331 333 371 59.0 1e-103 MGFIDLVKARARADKKTIVLPESMDRRTWEAAETILKEDIANLIIIGTPEDIADHSKGLD VSGATVINPQTYEKTQEYIDLFVELRKSKGMTPEKAKEIIMSDYAYYGCLMIKNGDADGL VSGACHSTADTLRPCLQIVKTKPGTKLVSAFFLMVVPDCEYGAEGTFIFADSGLNQNPNP EELAAIAKSSADSFELLVQKEARVAMLSHSTKGSAKHPDVDKVVEATRIAKELYPELKLD GELQLDAALVPEVASSKAPGSEVAGKANVLMFPDLDAGNIGYKLVQRLAKAEAYGPVTQG IAKPVNDLSRGCCADDIVGVVAITAVQAQAE >gi|157101640|gb|DS480684.1| GENE 46 53563 - 53895 467 110 aa, chain - ## HITS:1 COG:SA0441 KEGG:ns NR:ns ## COG: SA0441 COG3870 # Protein_GI_number: 15926160 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Staphylococcus aureus N315 # 1 109 1 108 109 91 39.0 3e-19 MKLVYAIVRNDNEDDVVSQLTQHHYSVTRLSTTGGFLKKGNTTLMIGAEDDKVQEVIDVI KQECGQHQKLTVNMPYISGTTMVNYATMPMTVDVGGATIFVINVDRYEKI >gi|157101640|gb|DS480684.1| GENE 47 54036 - 55640 1564 534 aa, chain - ## HITS:1 COG:CAC2433 KEGG:ns NR:ns ## COG: CAC2433 COG0265 # Protein_GI_number: 15895698 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Clostridium acetobutylicum # 119 509 39 417 433 175 33.0 3e-43 MYENENKNMNGLGNENMEHRDVPETTVNWTIPGDERNRQGQNTQGHRTMDPEARSQEAQG QYSQARYSQAQEGQDSREQHREQHTQGPDAAFGSYGSQTRRQDCYGGNTYQGYPGGGDKQ RNHSNKGRFVKRAAGITAAAVLFGAVSGGVMTGVNYLGNRLTGAYGTAGTLAGTQAQNQI AQAAPANAQASANQGATAVSAVTDVSGIVENAMPSIVAINDTMTVEQRDFFGMPQTYTAQ SSGSGIIVNQTDTELLIATNNHVVDGASDLKVTFVDNKDVSAAVKGTDSASDLAIIAVQL KDIPSDTMSKIKVATLGNSDDIKVGQQVIAIGNALGYGQSVTVGYVSALDREITDEKGIN RTFIQTDAAINPGNSGGALIDLNGNVIGINAAKTASTEVEGMGFAIPISKSQDILNSLMT KKTRVAVSEDAQGYLGIQGTNIDAATSQMYGMPVGIYVYKIVEGGAASSSDLKEKDIITK FDGQSVTSMEELKQMLTYYEGGAAVTLTVQSLVDGAYVEHDVQVTLGTKPAAQS >gi|157101640|gb|DS480684.1| GENE 48 55900 - 57216 970 438 aa, chain + ## HITS:1 COG:CAC1741 KEGG:ns NR:ns ## COG: CAC1741 COG1323 # Protein_GI_number: 15895018 # Func_class: R General function prediction 
only # Function: Predicted nucleotidyltransferase # Organism: Clostridium acetobutylicum # 13 437 3 402 402 219 34.0 6e-57 MGQPDSLEQPAKVIGIIAEYNPFHGGHKFQIEEAKKRTGTDWCVAAMSGDFVQRGEPAVY SKYLRTRMALSCGADLVVELPSAFAVSSAEDFAACGVALLTGLGAVDVLCFGSEDGDIRR IRTAAGILAQEGGDFSSLLSIGLRSGLSWPQARSQALLKMADKDKDFPLKREEMDKLLGS PNNLLGIEYCKAILRQNSPLIPFTIRRRGQGYHDNGLEGGQASASAIRRTLKAGVPSGEA GLFPYAKLTPEAMTHIPPEIRPLYGREPVLEANDLSEILNFCLLSLKREGTDYTQYGDMS AEMARRLDHCLLKQVSWEGRIEQLKTRQYTYTRLSRALLHMVLGLTDARVQSYKEAGRAP YARILGFRKESQELLALVKQKTAIPLITKTADAPRILTGTALDMFSQDIYASHIRQTLLS KKLEQPVRNEYNHPICIL >gi|157101640|gb|DS480684.1| GENE 49 57294 - 58535 871 413 aa, chain - ## HITS:1 COG:FN0778 KEGG:ns NR:ns ## COG: FN0778 COG0500 # Protein_GI_number: 19704113 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 18 396 16 396 412 252 37.0 1e-66 MNQEECKEIKQALDLFLNESLERILMSNPTDSGKISRSRIRPLLMKGRLVFQAEEQAGKQ AFHRNLDRDEAADYVTGLLDGSFRQAEIASGLGNALILVSRKGKVTVKVKQSPRPARILP AGNPASREPERAALLSHNRKKHYILEEGIPVPFLVDLGVMTKEGRVVNSRYDKYRQINRF LEFIEDILPNLDQDRESTIIDFGCGKSYLTFAMYYYLKELKGYPVRIVGLDLKEDVIEHC SRLGRQYGYEGLSFCHGDIASFEGVEKVDMVVTLHACDLATDYALEKAVNWGARVILSVP CCQHELNGQMENSLLRPVLQYGLIKERMAALYTDAIRAQVLEYRGYRTQILEFIDMEHTP KNILIRAVRQGKKRDNGLQIRELADFLHVKPAVVELLAPELWESGGKTKDSWY >gi|157101640|gb|DS480684.1| GENE 50 58545 - 59180 674 211 aa, chain - ## HITS:1 COG:CAC2496 KEGG:ns NR:ns ## COG: CAC2496 COG0546 # Protein_GI_number: 15895761 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Clostridium acetobutylicum # 5 205 6 206 208 168 41.0 7e-42 MNKGILFDVDGTLWDSAAQVAESWNEVLARYPHLGVRITARDMYDNMGKTMMDIGKTLFP GLSQEECRKVMEECMTYENHYLLSHPGVLYPETREIMACLSRQYGLYIVSNCQSGYIEVL LESCGLREYVRDIECYGNTGLPKGDNIRMVVQRNHLERCFYVGDTHMDEEAAGAAGIPFV HAAYGFGRAERPVGTIESLSQLTALAKKLLD >gi|157101640|gb|DS480684.1| GENE 51 59252 - 59893 710 213 aa, chain - ## HITS:1 
COG:DR1393 KEGG:ns NR:ns ## COG: DR1393 COG0406 # Protein_GI_number: 15806410 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Deinococcus radiodurans # 2 198 20 221 237 85 34.0 8e-17 MKIFLIRHGRQCSKLCNVDVDLSEEGYRQASLLGERLFHENIQVVYSSNLLRAVETAQAA NLYWNVEHIIRPELREISFGHMEGMEDRDIAVKYRDFKAQQALMEEDLPYPGGECAGDVV RRAEPVFREMTESGYERIAVVTHGGVIRSVTAHCLGIPMNKWRILGKNLENCSITELNWD GASGRFTLERFNDYAHLEPYPDLLRKGWVSAEN >gi|157101640|gb|DS480684.1| GENE 52 59933 - 61303 1858 456 aa, chain - ## HITS:1 COG:lin0538 KEGG:ns NR:ns ## COG: lin0538 COG2848 # Protein_GI_number: 16799613 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 8 456 5 451 451 583 69.0 1e-166 MINVNMYEVNETYNMIEHEKLDVRTITLGISLLDCVDSNLEALNEKIYKKITTLAKDLVS TGQDIEADFGIPIVNKRISVTPIALVGGSACKRPEDFVTIAKTLDLAAKEVGVNFIGGYS ALVSKGMTRADKNLILSIPQALSQTERVCSSVNVGSTKTGIDMDAVELMGRIVTETAEAT RDHDSLGCAKLVVFCNAPDDNPFMAGAFHGVTESEAIINVGVSGPGVVKVALAKVRDANF EVLCETIKKTAFKITRVGQLVAQEASKRMGIPFGIIDLSLAPTPSVGDSVAEILEEIGLE RVGAPGTTAALAMLNDQVKKGGVMASSYVGGLSGAFIPVSEDQGMIDAVSMGALTIEKLE AMTCVCSVGLDMIAIPGDTPSSTIAGIIADEAAIGMVNQKTTAVRLIPVVGKKVGETAEF GGLLGYAPVMPVNTFSCRRFVTRGGRIPAPIHSFKN >gi|157101640|gb|DS480684.1| GENE 53 61362 - 61634 445 90 aa, chain - ## HITS:1 COG:lin0537 KEGG:ns NR:ns ## COG: lin0537 COG3830 # Protein_GI_number: 16799612 # Func_class: T Signal transduction mechanisms # Function: ACT domain-containing protein # Organism: Listeria innocua # 3 90 2 89 89 87 55.0 8e-18 MNKAIITVVGKDRVGIIAGVCTYLAENQINILDITQTIVKGFFNMMMVVDVENITKSFGE VAQELEQVGEEIGVSVKIQREEIFLKMHRI >gi|157101640|gb|DS480684.1| GENE 54 61693 - 62490 1032 265 aa, chain - ## HITS:1 COG:CAC3538 KEGG:ns NR:ns ## COG: CAC3538 COG1235 # Protein_GI_number: 15896774 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Clostridium acetobutylicum # 1 263 1 259 261 223 47.0 4e-58 
MRMVSIASGSSGNCIYIGSDSTHLLVDAGISNKRIQQGLNEIGLTGNDVNGILITHEHSD HIKGLGVLSRKYEIPIYGTRETLDEIAAAGSLGKFDKGLLTPVCPDLDFMVGDLTVKPFK IDHDAANPVAYRVQNGEKSVAVATDMGHFDQYIIDHLQGLDALLLESNHDVRMLETGPYP YYLKRRILGDHGHLSNENAGRLLNCVLHDRLKKILLGHLSKENNYEELAYETVRLEIDEG DCPYGASDFSISVASRDCMSEIIYL >gi|157101640|gb|DS480684.1| GENE 55 62514 - 63140 475 208 aa, chain - ## HITS:1 COG:BH3150 KEGG:ns NR:ns ## COG: BH3150 COG0237 # Protein_GI_number: 15615712 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Bacillus halodurans # 15 202 2 192 201 96 29.0 3e-20 MENIRNAFPAGGKFILGITGGVGSGKSRVLEILKEEYGFRVIQADQVAKDLMQPGQESYR AVVDFLGPSILNEDGTINRPAMAQVIFGCPDKRVQVDRLTHPLVWNTAFGEALACPEPLV VIEAAIPSKEFRDNCGEMWYVYTSRENRMERLRESRGYTQEKTESIMDSQAPEAGFREFS DAVIDNNGSVEDTRKQIRRLLKNKIRMQ >gi|157101640|gb|DS480684.1| GENE 56 63209 - 66001 3136 930 aa, chain - ## HITS:1 COG:CAC1098_2 KEGG:ns NR:ns ## COG: CAC1098_2 COG0749 # Protein_GI_number: 15894383 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Clostridium acetobutylicum # 439 930 58 531 531 467 50.0 1e-131 MDKLVLIDGHSIMNRAFYGVPDLTNSEGLHTNAVYGFLNIMLKILEEEKADHAAVAFDLK EPTFRHEMYDAYKGTRKPMPQELHEQVPVMKDVLKAMGIPIMTLKGFEADDILGTVAKRC QAKGIQVSVVSGDRDLLQLADEHIKIRIPRTSRGVTEIKDYFPEDVLREYQVTPEEFIDV KALMGDASDNIPGVPSIGEKTATSIIAQYKSIENAYAHLEEIRPPRAKKALEEHYDMAQM SRKLAAICTDCPVEFNYEDARIEDLYTPEAYQYMKRLEFKSILARFDKTAAQEAAAPDIK THFTLVKDIKTADKIFSRAAETGLAGFQLILGRDGALKETAEQLSFSFDENGEVKAGKKA RSGLNPVTGLALCLGEEDIYCLAAGEQITGEYLLEQLRALCGRNQEQDNQAQNSQGQDNP GVRELWALDLKSMLAYLELKDTDPVYDAGVAGYLLNPLKDTYAYDDLARDYLGLTVPSRA DLLAKEDLGDALWRGEKNAVDCVCYMGYTAWKAAAPLAGQLKDTGMYSLYTDIEMPLIYS LFHMEQEGVKVERAELKEYGDRLKVGIAKLEQEIYQETGHEFNINSPKQLGEILFEQMEL PGGKKTKTGYSTAADVLEKLAPDYPVVQKILDYRQLTKLNSTYAEGLAAYIGEDGRIHGK FNQTITATGRISSTEPNLQNIPVRMALGREIRKVFVPKEGCVFVDADYSQIELRILAHMS GDERLIEAYRSAQDIHAITASQVFHVPLDEVTPLQRRNAKAVNFGIVYGISAFGLSEGLS 
ISRKEAVEYIDKYFETYPGVKTFLDGLVKQGKEQGYVTTLYGRRRPIPELKSANFMQRQF GERVAMNSPIQGTAADVMKIAMIAVDRELKKRGLKSRIVLQIHDELLIETARDEIEAVKE ILTDKMKHAADLRVSLEVEAEVGKSWFDAK >gi|157101640|gb|DS480684.1| GENE 57 66027 - 66356 446 109 aa, chain - ## HITS:1 COG:no KEGG:Closa_1882 NR:ns ## KEGG: Closa_1882 # Name: not_defined # Def: transmembrane anti-sigma factor # Organism: C.saccharolyticum # Pathway: not_defined # 1 109 1 109 115 136 65.0 2e-31 MTCREAERLVMPYINGSITDEELKEFLKHIETCEECREELEIYFTVDVGIRQLDQETGTY NIKGALETALELSRQRVHTLGILETARYAVNTLCFWAVLAVLVLQFRMW >gi|157101640|gb|DS480684.1| GENE 58 66475 - 67299 781 274 aa, chain - ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 24 273 25 286 292 99 29.0 9e-21 MSNTMILKDKSVQVSRIRRSSSFVMKRPHYHSYYEIYYLLSGKCKMFINQDIYYLEPGDM TIIPPLEVHKALYEPSWEAERFGIYFSRDTVTSFLSLCGREAFGHIFSQPKRTIPSEFRP KIEELLAQMQDEEKQGDSYCQIQLCSLLHQILVVLGRCQEARREGHALEEAEEAMVRAAR YMNQHYQEGVTLGQVAAFANMSPTYFSKKFKSSTGLCFKEYLNYIRVQKASDMLRNTDLS VTGVAMACGFSDGNYFGDVFRRLTGVSPREYRKG >gi|157101640|gb|DS480684.1| GENE 59 67470 - 68450 521 326 aa, chain + ## HITS:1 COG:CAC1231 KEGG:ns NR:ns ## COG: CAC1231 COG0673 # Protein_GI_number: 15894514 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Clostridium acetobutylicum # 7 305 5 319 331 100 25.0 4e-21 MAEMTMISMAVMAATQKSRSFLEAAAGRKDLYLEEGLLEEGTLEIGTLGIGTLEKADVIY LPDYSSLDFRQVRKLLKAGKHVLFGTMGAADAGSLKALAQTADEHGAVLLDHIHLAFNPC MASLKTALTCLGPIGRIRLHSCEYAFCCEGLRISPVRPPACVPPGGALFQRGADCLYPLV HLFGIPEQMDAQMVLTEHGMDAAGTVRGYLGTARVEIVYSKISGSHVPSLIEGERGSLLI RDLRNLKEVTYKNRSGDKDQLVSFSDEACRARELDRCMAFIKGKCSWQRQLASSIQTLQM MDEIRGHRNLSAEYTVSSHGTYASAI >gi|157101640|gb|DS480684.1| GENE 60 68667 - 68792 85 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938052|ref|ZP_02085409.1| ## NR: gi|160938052|ref|ZP_02085409.1| hypothetical protein 
CLOBOL_02946 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02946 [Clostridium bolteae ATCC BAA-613] # 1 41 1 41 41 67 100.0 4e-10 MKEYPAFSVILFLYFIICFIIVPEWKNGKGFEKTLAKICIL >gi|157101640|gb|DS480684.1| GENE 61 68791 - 69423 713 210 aa, chain + ## HITS:1 COG:HI0522 KEGG:ns NR:ns ## COG: HI0522 COG2364 # Protein_GI_number: 16272466 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Haemophilus influenzae # 12 198 24 204 218 58 26.0 6e-09 MAVNYFKKIGARRLLIMIMGNVFLGMGISIFKLSGLGNDPFSGMVMALANILGFSYASFL LVINGGLFIFELAAGRRLIGIGTLVNAFLLGYIVTFFNGLWPLLFPVPDTMLLRVVTVLV GVIVTSFGVSLYQTPNAGVTPYDSISLITAKSRPLIPYFWHRIATDAVCAVICFLAGGII GLGTLVSAFGLGPIVHFFNVQVAGRLVKDC >gi|157101640|gb|DS480684.1| GENE 62 69467 - 70348 774 293 aa, chain - ## HITS:1 COG:SA0643 KEGG:ns NR:ns ## COG: SA0643 COG4989 # Protein_GI_number: 15926365 # Func_class: R General function prediction only # Function: Predicted oxidoreductase # Organism: Staphylococcus aureus N315 # 1 286 1 295 302 223 38.0 3e-58 MKKVKLSEGLSLSAVVQGFWRLDSWKWSAEELARFMNECIDRGVTTFDTAEIYGGTLCET LMGQAFEQDRAIRNKIQLVSKTGIFKEGGFGYYDTRYDRVKQSCEESLKRLHTDHLDLYL IHREDPCIDFESTAKALTELKKEGKIGEMGVSNFDPHKFNGLNRAVNGQLVTNQIEWNPV CFEHFNSGMMDDLTASGIHPMIWSPLAGGRMFDPNDSLCAAAMKTIKRIAEKHDADPSAI IYAWIMYHPAGAVPISGSNRLDRLDTAIKALDIRLEHYEWYEIYKASGQQAIR >gi|157101640|gb|DS480684.1| GENE 63 70364 - 71368 1209 334 aa, chain - ## HITS:1 COG:VC1328 KEGG:ns NR:ns ## COG: VC1328 COG4211 # Protein_GI_number: 15641340 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type glucose/galactose transport system, permease component # Organism: Vibrio cholerae # 16 332 14 332 337 247 49.0 2e-65 MKIKELTAKDIKNIMMDKALIIVLVFMIIGIIIIEPSFLQIRVLTDILTQSAVKMICALG LMFALLLGGTDLSGGRQVGLAAVMVASLSQMADYSNKFWPGLPELPIIIPVIMTLALVAV IGAVNGFMIAKLKMAPFIATFGMSTIVYGINCLYFAKKPNNSQPIGGIRNDLTVLSSFKV FGQISFLVVIAVIAIVGVWFLLNKTVFGKNIYIVGGNPEAAKVSGINVSKVIITVFIIES MFVGLAGILEVARTGGANSAYGNAYEFDAISSCVVGGVSLSGGVGKVSGVIIGVLIFTFI 
SYGLAYIGVNSNWQLIVKGLIICSAVALDMRKNS >gi|157101640|gb|DS480684.1| GENE 64 71365 - 72900 167 511 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 275 478 18 215 245 68 25 9e-11 MQMEEYRLKIANMSKSFPGVKALDNIKLAVKPGSVHALIGENGAGKSTLMKCLFGLYHPD SGTIEIDGRVMDIKDPNDALKNGISMIHQELNPVPYRNVVDNIWAGRFQTKGIVVDENQM MEKTRNLLRDLDFDIDVTTFAKNLSVSQMQAIEIAKAVSYNAKIIIMDEPTSSLTVAETQ HLFKIIDKLRKEGRSIIYISHKLEEIFEIADEITVMRDGQYIGTWDIKDITIDQLIGQMV GRKMNERFPLCEDARTDEVMLQVKNFTSADPRSFKNVSFDLHKGEILGIGGLVGAQRTEL IEAIFGLRRIVSGELTVNGKTVVNHSPQDAIRNGFALLTEERRATGIIPMLSVWDNALTV AYKKLTGNVFGYIHTKKLFPSVDKVCTDLDVRMAGTKTLIKNLSGGNQQKVLIGRWLLAN CDVIMLDEPTRGVDVGAKYEIYRIMQDLVKKGKAIIMVSSEMPELLGMSDRIMIMCAGKQ TGILGKTEATQVEVMKYATQFEDKTKEGVSV >gi|157101640|gb|DS480684.1| GENE 65 72913 - 74061 1260 382 aa, chain - ## HITS:1 COG:YPO1507 KEGG:ns NR:ns ## COG: YPO1507 COG1879 # Protein_GI_number: 16121780 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 54 369 26 331 335 190 37.0 4e-48 MKKSMNRFVCGLTAAVMMGTLTACSGGLGETKPAVPAGSESAGNSESAAQQGEAQAEMKA TVIWRAFDDQFQSGFRIIMQNEAEKSGGITLDMQDAANDVSTANSKLEVALSKKTDVVAI CAPDRESIENMMKKCDADGVPAVFFNMEPMEDTMNRYENIWYVGAQAKQSGEMCAQALIN YWNDNKDIADKNGDGKLSLVILQGEIGQQDVTLRTDAYKETLEKNGIEYEILACDTANWD QAQALDKMNAWITAYGIDGIEGVLCNNDSMAMGAVQACINNGYNSGDAEKFVPIVGIDAN IDALEAMKAGSLLGTVLNDRKSQSDAIINVIKAVHAGTEITEDVVGVNCTVDGKYVWVPY VIVDESNLSATLEDMLAVAQSE >gi|157101640|gb|DS480684.1| GENE 66 74087 - 74944 995 285 aa, chain - ## HITS:1 COG:BH3786 KEGG:ns NR:ns ## COG: BH3786 COG0191 # Protein_GI_number: 15616348 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Bacillus halodurans # 1 282 1 282 287 211 40.0 1e-54 MKLSPMKDILEMAEERNIAYGAYVTVSYETALAAIEAGTELDIPVIFITGTDCCDLMGGF EGTVDTVKRAMKAADCKVPVALHLDHCRTYEECVQAIQAGYSSVMIDGSSLPFEENVALT 
KKVADYAHCYGITVEGELGKLVGEEGNFKVEGDPESAQTDPDQAKEFVERTGIDCIAVSI GTQHGVYVAAPHLNIERLKKIHDVVDVPIVLHGGSGTPKEQVQEAIRNGIRKINIATDVL IAVANSFEDMKQKPDFKYNTGMFVNSQACAREFIKGKMREFALMD >gi|157101640|gb|DS480684.1| GENE 67 74961 - 75908 836 315 aa, chain - ## HITS:1 COG:ydjH KEGG:ns NR:ns ## COG: ydjH COG0524 # Protein_GI_number: 16129726 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli K12 # 2 298 13 307 322 138 30.0 2e-32 MVLSFGMMVCDTLLRPVDRDALSRDCTMIEKPIQACGGDALNTAIALSRLGIPVTMAGRV GDDLNGRFVLDQCRKLGMDVSHVVTDPVHATASSFALIEENGERHFLSDVTIFDEVDDSD LPEKAVENADIVYIGSACSFNRMNQGGLARILRRAKAHGRMTVMDAAINQRSVCSCWMDM FKEVLPYTDVFFPSYEEALAITGCREIPDISSAFKPMGLAYFGIKLGSKGCYATDFIRER YIECPRGIRAVDTTGAGDSFMAGLMAGLVNGLSFFESAGLAGSVASHSVQHMGAAGGILP YEDELYIYNHHKKLK >gi|157101640|gb|DS480684.1| GENE 68 76114 - 76863 661 249 aa, chain + ## HITS:1 COG:BH0419 KEGG:ns NR:ns ## COG: BH0419 COG2188 # Protein_GI_number: 15612982 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 8 244 2 236 240 143 34.0 3e-34 MTMFTKEINKDVPIPLYYQLKNIIKTDIENGTLKTGDTIPTEMEIMSHYNISRFTVRQAI SELVNEGYLLRKTSKGTFVTEPNKKTSFIKSFEPFHQQIQELGKTPHTELLDMSVMKPDD RIRELLRLGTNDKVISLFRRRFADATPVVTIQNYLPYNLCHFVLSHDFKDVSLYNLFMSR PETCIDNTRTIVSAETATPEDVKLLEVKPGSPMLCFNNISTTSDGTIIDYAYAHYRGDMN KFEINESPK >gi|157101640|gb|DS480684.1| GENE 69 76954 - 78030 1248 358 aa, chain - ## HITS:1 COG:SA1513_1 KEGG:ns NR:ns ## COG: SA1513_1 COG0258 # Protein_GI_number: 15927268 # Func_class: L Replication, recombination and repair # Function: 5'-3' exonuclease (including N-terminal domain of PolI) # Organism: Staphylococcus aureus N315 # 5 350 3 280 324 128 31.0 2e-29 MSERKFVIVDGSSLLSTCYYAVLPREIMFAKTDEEKEKHYGKILHASDGTYTNGIMGVLK AVVSLLKKQQPAYVAFVFDKTRDTFRRELYPDYKGTRSRTPEPLKQQFVLMERILEEAGF KVLYSDRYEADDYAGSLVYKFREQVPMVVMTKDHDYLQLVSDEYNVRAWMVQARQEKAEE LCDKYYGLYGLDKASVNLPEKTFEFTAETVYSEEGVWPRQITDLKGIQGDTSDNIPGVRG 
VASAAPLLLGEYGTVEHIYEVIHEAEQDKKQLKELQDFWKNCLGISRSPYKSLTKTGEEG ELCGEAAARLSKELATIKTDIPLDFELEDFSASCCREEVLKEWCRKLDIKTASVFGKS >gi|157101640|gb|DS480684.1| GENE 70 78082 - 78171 69 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGQMERLIVLVFIVAALLFAGIWDRHHRR >gi|157101640|gb|DS480684.1| GENE 71 78210 - 78740 548 176 aa, chain - ## HITS:1 COG:FN1951 KEGG:ns NR:ns ## COG: FN1951 COG2110 # Protein_GI_number: 19705253 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Fusobacterium nucleatum # 8 176 5 172 175 176 56.0 2e-44 MKNENNTVIGTVLGDITKITGMDAIVNAANSSLLGGGGVDGAIHRAAGKELLHECRLLGG CKTGQAKITKAYNLECRYIIHTVGPVWNGGTCGEQEKLASCYRNSLLLALENGVKRIAFP SVSTGIYHFPVGLAAETAIGTAREFAAEYPGELEQILWVLFDARTKLVYDTVLRKL >gi|157101640|gb|DS480684.1| GENE 72 78771 - 78977 295 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938065|ref|ZP_02085422.1| ## NR: gi|160938065|ref|ZP_02085422.1| hypothetical protein CLOBOL_02959 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02959 [Clostridium bolteae ATCC BAA-613] # 1 68 1 68 68 112 100.0 1e-23 MTHLELVAEIGSGPMEIVKLYLKGLLSFAELENILGRDKASLVHTFAADSGAGKIGELLL SGLRGYMV >gi|157101640|gb|DS480684.1| GENE 73 79314 - 79808 491 164 aa, chain - ## HITS:1 COG:mll3540 KEGG:ns NR:ns ## COG: mll3540 COG2110 # Protein_GI_number: 13473056 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Mesorhizobium loti # 10 146 17 139 155 60 32.0 1e-09 MIQYYNGSIFDSKADILCHQVNCQGAFGYGIAGQVKKLFPEVEKTYKRITKQWIEKENGD TSKLLGRVSAQPVEKDGRWFLIGNLYGQDDYGRKKMVYTDYDALEKAMGEIRSFLEARGK NETVAFPYKIGCGSAGGDWNIVEDIIKRTFEGYSGEVQIWKYEE >gi|157101640|gb|DS480684.1| GENE 74 79848 - 80072 223 74 aa, chain - ## HITS:1 COG:no KEGG:CLK_2436 NR:ns ## KEGG: CLK_2436 # Name: not_defined # Def: DNA-binding protein # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 12 74 10 72 136 63 41.0 
3e-09 MVQERIDLKNLFAEIRRKGSTSQVSRDLDVSMGNISDWKNGRSVPNAVSLVKLADYFGCS VDYLLGRTEHRDMI >gi|157101640|gb|DS480684.1| GENE 75 80262 - 81086 1082 274 aa, chain + ## HITS:1 COG:no KEGG:Closa_1881 NR:ns ## KEGG: Closa_1881 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 33 274 28 280 280 192 42.0 1e-47 MDTNKKILIGAIGAAVVIGGIAIGLLLSKTPEKADLSTIHTEAATEAPKETLPAATQAPE TEETTEGQEDAASSVSASIETYTSGKVSIQYPAVDQMDDASKKDKINELLKTNALSFIKA NDIDEANDTLDIKCKVISVDRKRLTATYTGTLSAKGAAHPVNLFYSNTVNLLQAQNLGLD DFTDAYTMAGYVLSDDVKFSGISSDVEAQVLSYRSSMDLDSLTAVFNSADFPLSSETQWP ESFSYEKQGTIYFSLPVPHALGDYVLVAFDPTTK >gi|157101640|gb|DS480684.1| GENE 76 81168 - 81797 814 209 aa, chain + ## HITS:1 COG:SP0745 KEGG:ns NR:ns ## COG: SP0745 COG0035 # Protein_GI_number: 15900640 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 1 209 1 209 209 273 63.0 2e-73 MDNVMVFDHPLIQHKISILRNKATGTNEFRALIEEIAMLMGYEALRDLPLEDVEVETPIE TCMTPMIAGRKLAVVPILRAGLGMVNGILALVPSAKVGHIGLYRDEVTHEPHEYYCKLPN PIEERTIIVTDPMLATGGSGVSAVDFIKEHGGKKIKFMAIIAAPEGLKRLHEAHPDVQIY VGHLDRCLNENAYICPGLGDAGDRIFGTK >gi|157101640|gb|DS480684.1| GENE 77 81946 - 82692 937 248 aa, chain - ## HITS:1 COG:CT629 KEGG:ns NR:ns ## COG: CT629 COG0110 # Protein_GI_number: 15605360 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Chlamydia trachomatis # 27 203 23 204 205 124 36.0 2e-28 MKLEQMKIKELLDTSHTIAEKLFEGHEYPWEVLGDINGFIESLGASLPENEYKKIGKNIW IHKTAQMAPTTAMGGPMIIGPKVQIRNGAFLRGGVILGQHVVIGNSCEIKNSIIFDEAQV PHFNYVGDSILGYKAHMGAGAVTSNVKADKGLVKVHAEDGDVETGRKKFGAILGDHAEIG CNSVLNPGTVIGRNSNVYPLSSVRGCVPSDSIYKGQDNIVVKEAREVETEQEAPEAPEKG KGGLKVVK >gi|157101640|gb|DS480684.1| GENE 78 82696 - 84798 2342 700 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: 
Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 667 1 587 709 369 35.0 1e-101 MSKSLYIAEKPSVAQQFAEALKIRGRRGDGYIESDQAVVTWCVGHLVTMSYPEAYDMKFK RWSLQTLPFLPKEFKYEVITNVSKQFEIVKGLLNRPDIDTIYVCTDSGREGEYIYRLVAQ MAGVKDKKQKRVWIDSQTEEEILRGIREAKDETEYDNLSASAYLRAKEDYLMGINFSRVL TLKYGPALSNYLGTKFTVLSVGRVMTCVMGMVVRREREIRGFVKTPFYRVIGSFEAPGKD GQPVPFEAEWKAVEGSRYFGTPCLYKDNGFKDRQQAEELAALLTAEPPLTATVLAKEKKK ETKNPPLLYNLAELQNDCSKMFKISPDETLKVVQELYEKKMVTYPRTDARVLSTAVSKEI QKNIGGLKNYGPMAGFADEVLNMGSYKGIAKTRYVNDKQITDHYAIVPTGQGLGNLRGLP PLSEKVYQVICRRFLSIFYPPAVYQKYSLELERLKEHFFANFKVLSEPGYLKVADVNLAK KNGVEESFDEARDETKDETRDESRDEAKEGVPNAISPKVDRTVLLQMLAELKKNDILTLV ELNIKEGETSPPKRYTSGSMILAMENAGQLIEDEELRAQIKGSGIGTSATRAEILKKLFN IKYLNLNKKTQVITPSLLGEMVYDVVDQSIRQLLNPELTASWEKGLTYVAEGSITSDEYM EKLNRFVAGRTVNVIRMNNQYNMRGYFDAAAAFYKTKKEN >gi|157101640|gb|DS480684.1| GENE 79 84999 - 85253 329 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938073|ref|ZP_02085430.1| ## NR: gi|160938073|ref|ZP_02085430.1| hypothetical protein CLOBOL_02967 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02967 [Clostridium bolteae ATCC BAA-613] # 1 84 22 105 105 152 100.0 6e-36 MPGSWQYQPYLTTYDSFIFYNAIGEHPDYFYRPIAVAKQVVNGTNYRFMTIAEPEQSDLT PHFAIVEIYQPLEGRAYATKITPV >gi|157101640|gb|DS480684.1| GENE 80 85240 - 86304 422 354 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 2 352 3 345 345 167 33 3e-40 MAIEKNSFGKMPDGTEVYKYTLTNKNGVSASFITLGGVWVSMVVPDKDGSMADVVLGYDD LDSYRRNPAHFGAPIGRNANRIGGAVITIDGKDYKLEANNGPNNLHSGPDLYHDRVWDCA ASETEAGSRLDFYLESPDGDQGYPGNARITVSYTLTDEDSLVLDYHMVCDKDTVANFTNH SYFNLAGHDSGDVMKQEVWLNASRYTPADEVSIPTGEIAPVAGTPMDFTEMKAIGRDIGA DFEALVFGKGYDHNWVLDKEEGELALAAKAMDPASGRVLECYTDLPGIQFYTANFLQDEM PGKGNARYDYRHAFCFETQYFPDAVHKPQFPSPLLKAGDEYRTRTVYQFLVKQA >gi|157101640|gb|DS480684.1| GENE 81 86334 - 87410 1356 358 aa, chain - ## HITS:1 COG:CAC0022 KEGG:ns NR:ns ## COG: CAC0022 COG0136 # Protein_GI_number: 15893320 # Func_class: E Amino acid transport 
and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Clostridium acetobutylicum # 4 358 3 357 360 454 59.0 1e-127 MEKKLKVGILGATGMVGQRFIALLENHPWFQVVTVAASPRSAGRTYEEAVGDRWKMTTPM PEAVKKLVVMNVNEVEKVAAGVDFVFSAVDMTKDEIKAIEEAYAKTETPVVSNNSAHRWT PDVPMVVPEINPEHFDVIPFQKKRLGTERGFIAVKPNCSIQSYAPVLTAWKEFEPYEVVA TTYQAISGAGKTFKDWPEMEGNIIPYIGGEEEKSEQEPLRLWGTIEDGVIVKAQSPVITC QCVRVPVLNGHTAAVFVKFRKKVTKEQLIEKLREFKGVPQELKLPSAPAQFIQYLEEDNR PQVTADVDFERGMGISVGRLREDAVYDYKFIGLSHNTVRGAAGGAVLCAELLTARGYI >gi|157101640|gb|DS480684.1| GENE 82 87662 - 88387 963 241 aa, chain - ## HITS:1 COG:SA0131 KEGG:ns NR:ns ## COG: SA0131 COG0813 # Protein_GI_number: 15925840 # Func_class: F Nucleotide transport and metabolism # Function: Purine-nucleoside phosphorylase # Organism: Staphylococcus aureus N315 # 8 240 4 235 235 269 57.0 3e-72 MSLSNTPTPHIGAEKGEIAETILLPGDPLRAKYIAEHFLEDVKQFNGTRNMLGYTGIYMG RPVSVMGTGMGCPSIGIYSYELIHFYGCKNLIRVGTAGALTPDVHVRDMVFAMGACTTSN FVRLLGLPGDYSPVCSYSLLERAVAAARERGLSFHVGNVLSSDMFYAPPKQVQGGMTWAD MGVLAVEMEAAALYANAAAAGVNALAVLTISDSMVTGEATSAMERETSFTQMMEVALSLV S >gi|157101640|gb|DS480684.1| GENE 83 88405 - 90639 2842 744 aa, chain - ## HITS:1 COG:CAC0007 KEGG:ns NR:ns ## COG: CAC0007 COG0188 # Protein_GI_number: 15893305 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Clostridium acetobutylicum # 4 739 6 734 830 510 39.0 1e-144 MAEKILRTEYSEEMQRSYMNYSMSVITARAIPDARDGLKPVQRRVLYDMRELHLDHDKPH RKSARIVGDTMGKYHPHGDSSIYETLVVMSQNFKKGMALVDGHGNFGSIEGDGAAAMRYT EARLEKFAEEVYLKDLDKTVNFVPNYDETEKEPEVLPVRVPNLLINGAEGIAVGMSTSIP PHNLGEVIDAVQAYIDNPEVTTQELMEFMPGPDFPTGGVIANKSELPQIYETGMGKIKLR GKFEVELGKRKADKDKLIISEIPYTMIGAGINKFLVDVADLVESKKLTDVVDISNQSNKE GIRIVLELRKDADIDKIKNILYKKTKLEDTFGVNMLAIADGRPETLNLKGILRNFLNFQY ENTERKYNVLLQKEMDKKEVQEGLIAACDCIDLIIAILRGSKNLKDAKACLTTGDVTNIH FKVPGFEEDAKKLAFTERQASAILEMRLYKLIGLEILALEKEHRETLRKIGEYKKILGSR AQMNRVIKEDLAAIKAEFATPRRTLIEDGPEAVYDESAVAVQEMVFVMDRFGYCKLLDKS 
TYDRNQETVDTEQVHVVKCLNTDKICLFAASGSLYQIKAMDVPAGKLRDKGTPIENLSKF DGAKDTIVYLTSAEDMKGRIFVFATKMAMVKQVPSEEFETNNRVVAATKLQEGDELAAVI PVEGETEVVIQTTGGVFLRFALEEISMLKKASRGVRGIRLTKNEELEKLYLIGENPIVDY KGKEVHLNRLKLAKRDGKGSKVRL >gi|157101640|gb|DS480684.1| GENE 84 90658 - 92577 2230 639 aa, chain - ## HITS:1 COG:BS_gyrB KEGG:ns NR:ns ## COG: BS_gyrB COG0187 # Protein_GI_number: 16077074 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Bacillus subtilis # 3 632 5 630 638 604 50.0 1e-172 MAKEQYNADSITVLEGLEAVRKRPGMYIGGVGSKGLNHLIYEIIDNAVDEHLAGFCSSVW VTLEADGSCTVRDNGRGIPVGLHKKGISAARIVLSTLHAGGKFDNDAYKTSGGLHGVGSS VVNALSAHMDLKVYRDGCIYHDEYEKGRPVIKLENGLLPVIGKTRETGTCINFLPDGDIF EKTRFKAEWLKSRLHETAYLNPNLTLYYENKRPGEGEKVAFHEPEGIVAYVRELNSGKTP VHDPIYFKGTVDKVEVEAAIQFVDTFEENILGFCNNIFTQEGGTHLAGFKTKFTALINSY AREIGILKDKDTNFTGADTRNGMTAVIAVKHPDPIFEGQTKTKLASADATKAVFTVTGDE LQLYFDRNVEVLKGVIGCAEKSAKIRRAEEKAKTNMLTKSKFSFDSNGKLANCESRDASR CEIFIVEGDSAGGSAKTARNRQYQAILPIRGKILNVEKASMDKVLANAEIKTMINTFGCG FSEGYGNDFDITKLRYDKIILMTDADVDGSHIDTLLLTFLYRFMPELIYDGHVYIAMPPL FKVIPKRGEEQYLYDEKELERYRKTHTGDFTLQRYKGLGEMDAEQLWETTLDPERRVLKR VEIEDARMATEITEMLMGTEVGPRRQFIYEHADEAEIDA >gi|157101640|gb|DS480684.1| GENE 85 93090 - 93821 269 243 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 219 1 221 223 108 31 1e-22 MENKISVHNLKKNFGKLEVLKDISIEIREGEVVCMIGPSGSGKSTFLRCLNRLEKITSGH VVVDGHPISDPNTNINKVRENIGMVFQHFNLFPHLTVKENITLAPTELKLMGKEEAGRKA LELLARVGLADKAEAYPGQLSGGQKQRVAIARSLAMNPDIMLFDEPTSALDPEMVGEVLE VMKQLAADGMTMVVVTHEMGFAREVADRVVFMDDGYIVEEGTPDEVFGNPREERTRSFLD KVL >gi|157101640|gb|DS480684.1| GENE 86 93811 - 94491 932 226 aa, chain - ## HITS:1 COG:lin0840_2 KEGG:ns NR:ns ## COG: lin0840_2 COG0765 # Protein_GI_number: 16799914 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease 
component # Organism: Listeria innocua # 6 222 1 209 212 192 50.0 6e-49 MQIVNIITQYYPSFFKALGITLQMTVISLLCATVLGVIFGLFKVSDFKVLKLVADVYIDI IRGTPLLVQVMIMMYGVGSVLKPYGFQWSNIGGAFTAGCVALSLNAGAYMAEIIRGGIEA VDKGQMEAARSLGLSYGKAMRKVILPQAFRTMLPSIINQFIISLKDTSLVSVIGPRELTQ NGKIIAANSASMVMPIWICVALFYLVVCTILSRIAKYVERRVSYGK >gi|157101640|gb|DS480684.1| GENE 87 94713 - 95561 1141 282 aa, chain - ## HITS:1 COG:lin0840_1 KEGG:ns NR:ns ## COG: lin0840_1 COG0834 # Protein_GI_number: 16799914 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Listeria innocua # 55 282 27 254 268 253 58.0 2e-67 MKKFMALSMAAIMALTVAGCGSKPAETTAAGTTAAETAANADTTAAESAAEEETSAAASG KKYIINTDTTFAPFEFENEKGEMVGIDLDILQAIAQDQGFEYEVVPVGFSAAVTALEAGE CDGVIAGMSITDERVQKYDFSEPYYDSGVGMAVLSDSDITSYDQLKDQKVAAKIGTEGCT FAESIAEQYGFEVIQYESSSDMYQAVIAGEAVACFEDYPVIGYEISRGLGLTLPTPMEKG SSYGFATLKGANPELVEMFNKGLENIKASGRYDEILNTYIAK >gi|157101640|gb|DS480684.1| GENE 88 95810 - 97771 2178 653 aa, chain - ## HITS:1 COG:CAC1572 KEGG:ns NR:ns ## COG: CAC1572 COG3855 # Protein_GI_number: 15894850 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 7 648 17 661 665 772 58.0 0 MSDLQTRYLERLAELYPTIAAASTEVINLEAILNLPKGTEHFLTDIHGEYEAFSHVLRNG SGAVRRKVNDVFGNTLSSQDKQTLATLIYYPREKMAQILEHVKNREDWYKITLYRLIEVS KRAASKYTRSKVRKALPQEFAYVIEELITEKVDVRDKESYYNAIIQTIIRVGRAGEFIVA LCELIQRLIVDHLHILGDIYDRGPGPHMIMDKLMTYHSLDIQWGNHDVLWMGAAAGQKGC IANVIRICARYGNLDILEDGYGINLLPLATFALDTYGSDPCACFQLKGANACSPSETELN IRMHKAISIIQFKVEGQIIKEHPEFCLEDRNLLHRIDLDKGTILLEGGEQEMLDSHFPTM DPADPYRLTAEENAIMERLAAAFQNCEKLQQHMRFLLAKGSLYKVYNGNLLYHGCIPLNR DGSFKAVDIYGQNYKGKSLYECLDGYVRKAFMAVDPAERAKGRDICWYIWLHQDSPLFGK DKMATFERYFLADKETHREEKNPYYSMLEDEDTVGRILEEFGLPPQEGHIVNGHVPVKSK NGENPVKCGGKVLVIDGGFSRAYQKETGIAGYTLIYNSYGLILAAHEPFESTEAAIEKES 
DIHSESTIVKRVSSRKLVGDTDVGKQLKTKIKDLEMLLEAYHTGAIVERFPVR >gi|157101640|gb|DS480684.1| GENE 89 97802 - 99049 1758 415 aa, chain - ## HITS:1 COG:CAC0730 KEGG:ns NR:ns ## COG: CAC0730 COG0628 # Protein_GI_number: 15894017 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Clostridium acetobutylicum # 19 388 14 380 383 125 26.0 2e-28 MKNPRFKNYIYWGTTALAVVACSVAFGFFLSRFEIVSRAVGKIVSILMPIIYGAVLAYLM LPIYNKTRVYTTEKLKRCIKDGHTADSLAKAAGTLVSLLLLIAIVCGLFWMVIPQIYTSI IGLQESLGENINNLSLWLMKMFEDNPTIEQTIMPFYEQAATEFQNWLTTDLVPNMSKIIG ELSTGLLSVVTVVKNILIGVIVMVYFLNIKDTLSAQSKKIVYSLLKLKDANRLVSEVRFA HSVFGGFITGKLLDSLIIGIMCFFAMQFLKMPYVLLVSVIIGVTNVIPFFGPFIGAVPSA FLILLVSPMKCLYFLIFILVLQQFDGNILGPKILGDSTGLPSFWVLFSILLFGGLFGFVG MIIAVPLFAVIYRLTATYVSSALRKKDLSARTEDYLSLDYIDEENKHYVDREREE >gi|157101640|gb|DS480684.1| GENE 90 99253 - 99561 407 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938090|ref|ZP_02085447.1| ## NR: gi|160938090|ref|ZP_02085447.1| hypothetical protein CLOBOL_02985 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02985 [Clostridium bolteae ATCC BAA-613] # 1 102 1 102 102 193 100.0 3e-48 MSYIIKMALDIKAGFEPPAPMTSPLEAYCAVGTIAKAMKLGMPERKDTLFEMRDQLDGDM GGNEPEDSRIARIHAILKDFIRNEDTTDQMMEYVAYGYENER Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:49:26 2011 Seq name: gi|157101639|gb|DS480685.1| Clostridium bolteae ATCC BAA-613 Scfld_02_26 genomic scaffold, whole genome shotgun sequence Length of sequence - 87739 bp Number of predicted genes - 60, with homology - 57 Number of transcription units - 30, operones - 17 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 198 - 410 253 ## gi|160938093|ref|ZP_02085449.1| hypothetical protein CLOBOL_02987 - Prom 486 - 545 6.6 2 2 Op 1 . - CDS 569 - 1897 1509 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 3 2 Op 2 . 
- CDS 1930 - 2889 869 ## Closa_3600 GCN5-related N-acetyltransferase 4 2 Op 3 2/0.000 - CDS 2897 - 3796 1035 ## COG2103 Predicted sugar phosphate isomerase 5 2 Op 4 . - CDS 3811 - 4680 955 ## COG1737 Transcriptional regulators - Prom 4899 - 4958 4.2 - Term 4787 - 4837 8.6 6 3 Op 1 . - CDS 4987 - 6366 1267 ## COG0534 Na+-driven multidrug efflux pump 7 3 Op 2 . - CDS 6329 - 6457 89 ## - Prom 6634 - 6693 3.7 8 4 Op 1 22/0.000 - CDS 6756 - 7904 1209 ## COG0842 ABC-type multidrug transport system, permease component 9 4 Op 2 9/0.000 - CDS 7901 - 9061 1123 ## COG0842 ABC-type multidrug transport system, permease component 10 4 Op 3 . - CDS 9073 - 10137 1467 ## COG0845 Membrane-fusion protein 11 4 Op 4 . - CDS 10143 - 10559 517 ## Clole_0637 MarR family transcriptional regulator - Prom 10666 - 10725 5.8 - Term 10784 - 10844 6.2 12 5 Op 1 . - CDS 11016 - 11558 478 ## Clole_1142 hypothetical protein - Term 11592 - 11623 -0.8 13 5 Op 2 . - CDS 11635 - 12795 1090 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 12855 - 12914 6.8 - Term 13018 - 13068 13.5 14 6 Tu 1 . - CDS 13095 - 20981 5355 ## Closa_3324 LPXTG-motif cell wall anchor domain protein - Prom 21013 - 21072 4.0 - Term 21096 - 21126 5.0 15 7 Tu 1 . - CDS 21309 - 21614 231 ## COG1943 Transposase and inactivated derivatives - Prom 21687 - 21746 7.3 - Term 22291 - 22334 11.0 16 8 Op 1 . - CDS 22470 - 22928 182 ## ELI_2084 hypothetical protein 17 8 Op 2 . - CDS 22969 - 23772 731 ## EUBREC_0800 hypothetical protein 18 8 Op 3 . - CDS 23849 - 24202 223 ## DSY1378 hypothetical protein - Prom 24290 - 24349 5.7 19 9 Tu 1 . + CDS 24620 - 25021 310 ## CKR_2546 hypothetical protein + Prom 25031 - 25090 1.9 20 10 Op 1 . + CDS 25141 - 26082 660 ## ELI_1000 plasmid recombination protein 21 10 Op 2 . + CDS 26066 - 26278 193 ## Tresu_1927 hypothetical protein + Term 26305 - 26356 16.3 + Prom 26281 - 26340 1.9 22 11 Tu 1 . 
+ CDS 26372 - 27883 1312 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 28058 - 28088 4.3 - Term 28091 - 28131 11.3 23 12 Tu 1 . - CDS 28175 - 33022 2545 ## Closa_3324 LPXTG-motif cell wall anchor domain protein - Prom 33249 - 33308 9.7 24 13 Tu 1 . + CDS 33504 - 34595 456 ## COG1106 Predicted ATPases + Term 34605 - 34647 0.1 - Term 34643 - 34682 8.1 25 14 Op 1 . - CDS 34772 - 35986 896 ## COG3629 DNA-binding transcriptional activator of the SARP family 26 14 Op 2 . - CDS 36058 - 37383 1184 ## COG0534 Na+-driven multidrug efflux pump - Term 37409 - 37445 -0.5 27 15 Op 1 . - CDS 37515 - 38018 686 ## COG0782 Transcription elongation factor 28 15 Op 2 . - CDS 38110 - 39387 1072 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 39410 - 39469 4.5 + Prom 39558 - 39617 7.0 29 16 Tu 1 . + CDS 39663 - 40172 355 ## COG2032 Cu/Zn superoxide dismutase + Term 40330 - 40366 -0.7 - Term 40177 - 40226 -0.8 30 17 Op 1 . - CDS 40282 - 41265 706 ## COG0042 tRNA-dihydrouridine synthase 31 17 Op 2 . - CDS 41336 - 41845 42 ## Mpet_1676 hypothetical protein - Prom 42020 - 42079 6.5 32 18 Op 1 . - CDS 42261 - 43271 1081 ## Closa_2929 hypothetical protein 33 18 Op 2 . - CDS 43299 - 43991 609 ## COG0822 NifU homolog involved in Fe-S cluster formation - Prom 44049 - 44108 5.2 34 19 Tu 1 . - CDS 44160 - 45512 1123 ## COG0534 Na+-driven multidrug efflux pump - Prom 45566 - 45625 3.7 35 20 Op 1 . - CDS 45746 - 58642 7396 ## COG4932 Predicted outer membrane protein 36 20 Op 2 . - CDS 58698 - 59615 424 ## COG2340 Uncharacterized protein with SCP/PR1 domains 37 20 Op 3 . - CDS 59681 - 59782 81 ## 38 20 Op 4 . - CDS 59839 - 59937 58 ## - Prom 60132 - 60191 2.1 - Term 60714 - 60767 2.1 39 21 Tu 1 . - CDS 60854 - 61021 110 ## gi|160938148|ref|ZP_02085504.1| hypothetical protein CLOBOL_03042 - Term 61030 - 61099 4.1 40 22 Op 1 5/0.000 - CDS 61123 - 62511 1476 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 41 22 Op 2 . 
- CDS 62511 - 63401 1055 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 42 22 Op 3 . - CDS 63475 - 64386 579 ## COG2017 Galactose mutarotase and related enzymes - Prom 64464 - 64523 5.3 - Term 64409 - 64446 -0.7 43 23 Op 1 . - CDS 64542 - 65033 526 ## COG0262 Dihydrofolate reductase 44 23 Op 2 . - CDS 65041 - 65532 503 ## COG0262 Dihydrofolate reductase - Prom 65592 - 65651 4.6 + Prom 65593 - 65652 9.0 45 24 Tu 1 . + CDS 65677 - 66507 680 ## COG0207 Thymidylate synthase + Term 66516 - 66574 16.9 - Term 66512 - 66554 5.1 46 25 Op 1 . - CDS 66606 - 67550 1043 ## COG1893 Ketopantoate reductase 47 25 Op 2 8/0.000 - CDS 67565 - 71560 3646 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 48 25 Op 3 . - CDS 71538 - 75128 3279 ## COG3857 ATP-dependent nuclease, subunit B 49 26 Tu 1 . + CDS 75196 - 75633 452 ## Closa_1014 hypothetical protein + Term 75686 - 75721 1.3 - Term 75837 - 75876 7.3 50 27 Op 1 . - CDS 75902 - 77155 1377 ## COG3635 Predicted phosphoglycerate mutase, AP superfamily 51 27 Op 2 . - CDS 77155 - 77943 750 ## NT01CX_2399 hypothetical protein 52 27 Op 3 . - CDS 77953 - 79425 1593 ## COG0471 Di- and tricarboxylate transporters - Prom 79626 - 79685 2.3 + Prom 79613 - 79672 9.0 53 28 Op 1 3/0.000 + CDS 79702 - 80616 861 ## COG0583 Transcriptional regulator + Term 80663 - 80707 -0.4 + Prom 80660 - 80719 2.5 54 28 Op 2 . + CDS 80762 - 81841 958 ## COG1396 Predicted transcriptional regulators + Term 81860 - 81908 0.3 55 29 Op 1 1/0.143 - CDS 81917 - 82624 759 ## COG2186 Transcriptional regulators - Prom 82644 - 82703 4.5 56 29 Op 2 . - CDS 82724 - 84010 735 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 57 29 Op 3 . - CDS 84007 - 84504 424 ## SpiBuddy_2701 tripartite ATP-independent periplasmic transporter DctQ component 58 29 Op 4 . - CDS 84520 - 85614 343 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 59 29 Op 5 . 
- CDS 85629 - 86402 844 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase - Prom 86446 - 86505 6.7 - Term 86430 - 86470 -0.4 60 30 Tu 1 . - CDS 86632 - 87543 512 ## COG3344 Retron-type reverse transcriptase Predicted protein(s) >gi|157101639|gb|DS480685.1| GENE 1 198 - 410 253 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938093|ref|ZP_02085449.1| ## NR: gi|160938093|ref|ZP_02085449.1| hypothetical protein CLOBOL_02987 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_02987 [Clostridium bolteae ATCC BAA-613] # 1 70 21 90 90 135 100.0 9e-31 MQSLSEVKAGAVCTIKWMFGNPQVMEFMHQHDIREGSTINVFQHGRDSMIIGMNNMRLAI GNEVAERIKV >gi|157101639|gb|DS480685.1| GENE 2 569 - 1897 1509 442 aa, chain - ## HITS:1 COG:BH3574_2 KEGG:ns NR:ns ## COG: BH3574_2 COG1263 # Protein_GI_number: 15616136 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus halodurans # 94 439 1 348 355 266 44.0 7e-71 MTKKELAQSICNLVGESNIRTIANCYTRVRLTVKDPGNIKREELKKLEGVLGVIEEGSHV QVVLGPGTAAAVAQLMSDMTQIKSEEVDEAVLLKEENRAKNNVPYKNMLKKIANIFIPII PAFIACGLVVAVYESSYVFFPGFQDTDFGKIMSALAYSVFTILPIIVGYNTSKEFGGSPI IGAVLASILNSSSLAGVQLFHVSITPGRGGVISVLIIAYVAVLLEKQLQKVIPDFLDMFL RPLITVCVMTFVGLMVFQPIGGFISETIGAFVSFIIYKVPVLAGLASAIYLPLVMTGMHH GLIAVNTQLIADFGVTYLLPVTCMAGAGQVGAAFYVYLKTKNQRLKKTIKNALPVGMLGI GEPLMWGCTIPLGKPFIASCIGGGIGGSIMAVMKVAAKVPELSGIPLAFITTRFPLYFVG LFASYAAGFIVCRLMGFTDPED >gi|157101639|gb|DS480685.1| GENE 3 1930 - 2889 869 319 aa, chain - ## HITS:1 COG:no KEGG:Closa_3600 NR:ns ## KEGG: Closa_3600 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 54 317 62 321 326 167 37.0 7e-40 MLPQMVRLWKEEAVRIRYKAYDSPEEWKAELLDKPDFDPEGCILAFTDSGSLAGFAAAIS QKEFLPGQNRDNTPGYIFLLVVKEEYEGRGVGSSLLEKAEAFLKGQGKHEIRISHKCPIK MSWYIDREGHEHNKAPGIRTDSPGYIFFQRRGYELVQKEVSYYLELEKFYIPADVQKHIM ELESEGIRVAWFDSAGNFGYEEMFERLKDQSFLKKFRDGIREHKKILIVEDRDGRVMGTA 
GTVYPEKNGRGFFSALAVDPADGGRGIGNVLFFSLCQELKNMGAEYMTIFVTETNFARRI YEKAGFSAVQEWAILKKTV >gi|157101639|gb|DS480685.1| GENE 4 2897 - 3796 1035 299 aa, chain - ## HITS:1 COG:L144334 KEGG:ns NR:ns ## COG: L144334 COG2103 # Protein_GI_number: 15673112 # Func_class: R General function prediction only # Function: Predicted sugar phosphate isomerase # Organism: Lactococcus lactis # 1 296 1 296 297 312 56.0 4e-85 MIDLSVLVTESRNKETMGLDQMTPLEIVTVMNREDGKAVEAIGEVLPQIAQAIAWCTDSL KQKGRIIYIGAGTSGRLGVLDAVECPPTFGVSPDVVVGLMAGGTPAFVRAVEGAEDSQTM GEEDLKEIHLSPADIVIGLAASGRTPYVIYGLRYAKKIGCRTVAVSCNRDSEIGKEADLA IEPVPGPEVLTGSTRLKAGTVQKMVLNMISTGSMVGIGKVYQNLMVDVVQTNMKLITRAE NIVMTATGCTREEARDSLEEAEGSVKLAITMILLQCGAKSAKTRLNRAGGYVRNAIQDV >gi|157101639|gb|DS480685.1| GENE 5 3811 - 4680 955 289 aa, chain - ## HITS:1 COG:CAC0191 KEGG:ns NR:ns ## COG: CAC0191 COG1737 # Protein_GI_number: 15893484 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 274 1 273 283 157 32.0 2e-38 MKSVLVRIQEYSAQASGAEKGVLRFLRENPEEAAGYSIKQLADKTFSSAATIVRLCRKMG FDGYKELQKSLLYESALRRESTRPMEQEIKKDDSLEELVNKVTYKNIVSLDNTRKLVDLD ILNQCVELLDRSQTVYLFGIGSSLLVARDMYLKLLRVNKACVICDDWHAQLLQARNIRSC DLALIVSYSGLTEEMITCAREARLREAPVITISRFEQSPLVRLADYNLAVAATELIFRSG AMSSRISQLNMIDILYTAYVHKRYDECMEQFRKTHIAKSEGPEENQNAL >gi|157101639|gb|DS480685.1| GENE 6 4987 - 6366 1267 459 aa, chain - ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 13 445 7 438 445 253 34.0 6e-67 MIGEWGGVLQSANTDMIDGSIFQSIFWFSIPLLIGNFFQQLYNTVDSYVVGNFVNTNALA AVGASTPVINMLVGFFMGLSTGAGVVISQYFGARQGEAMSRAVHSAMALTGLLSIVFTGL GLLYTGPLLRAIGVPEDVLPHSSLYLMIYFCGITFSLFYNMGSGILRAVGDSRHPLLYLA VASIVNIILDFTFVCGFHMGIAGVAIATIMAQAVSSFMVMHKLMHTREDYKVEIRKIRFH KKMIRKIIAFGFPAACQQSITSFSNVVVQSYINRFGTAAMAGYSATLRIDGFLQLPLQSF NMAITTFVGQNIGARKYKRVKKGIFAAWVMSSLIILAGSVGMYFGAPLLISVFTNDPQVI 
GNGSSMLRIFSRAYIVMPVIQVLNGALRGAGLSKVPMFFMLGSFVVLRQIYLLVAVPMTH SLMVVMAGWPITWVICAAGMFLYYVKADWLPKEAGPEGE >gi|157101639|gb|DS480685.1| GENE 7 6329 - 6457 89 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYDRESGFGIDCFLFFVMVLYSKAQGGAVTNDWGMGWCLAKR >gi|157101639|gb|DS480685.1| GENE 8 6756 - 7904 1209 382 aa, chain - ## HITS:1 COG:RSc0165 KEGG:ns NR:ns ## COG: RSc0165 COG0842 # Protein_GI_number: 17544884 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Ralstonia solanacearum # 15 356 32 365 381 71 21.0 2e-12 MISRIWKEEKNTVIGLFISVLAGMLLITTIWKNDYTEDIPFGILDEDRSSLSQSIVKQFG INPSLDIVYYADSEQDLKQAILDKKIEAGVMIPENFSRDMSLKKSPQAVIFADCSNIITG GGAVGAASSVLGTMSAGMQLKMLEGNNFYPSAAQTSLGTFSYVDRTLYEPQGDYIRKMSY LLVPAITMQTFLISFFIPLMIRKRKALAAATPEKRREELKDGIIRTAVVAGGAVAAQFVV LCVVGLYKNIPLRGEIGLYLFSTVLFMLAAVAFGVMLGSFTKRLACFAQLYMMCSNLIIF TSGLIFPYYLMPKWLSIASRIFSPIANIAVELKAVNLKGIGWDAAWPQLAGTALYTVFWL AAGGAMYAWSIKKERRLAGAAE >gi|157101639|gb|DS480685.1| GENE 9 7901 - 9061 1123 386 aa, chain - ## HITS:1 COG:PA3400 KEGG:ns NR:ns ## COG: PA3400 COG0842 # Protein_GI_number: 15598596 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Pseudomonas aeruginosa # 5 356 1 358 377 66 21.0 1e-10 MREWMKQAGSLASSMKKRYITIPMLILLLVPTILSLLMGAEFVHLPFQKVPTIIVNHDHS ETVQSLIQMISDNRTFDVIACTDEEDDLKEAFYYNKALAGIVIPEHFSEDLLNGREAKIM IFNDGALSTVASGMRGTIAEALGTIKSGYMLKLAEGKGIPPQAAMNLIAPMGYVTKAISN PSKNVAYMMMEGILLTIVQIGAGCVGACVCERGSFGRLMKKAGFITGIACVSALGCIVTQ TLCFGFPYKSSPWAGILMTVFTCFGITLFGILQNLGTGGNCEEAVQKCSIISFTMLLAGY TFPVISMPWPCKFLTWFMPNTHYIVPFRDMALVERSFLSEAHHVLWLMGFCAVMVLAVAK KFRDAGAEPGKTPADGPRTENGVAGI >gi|157101639|gb|DS480685.1| GENE 10 9073 - 10137 1467 354 aa, chain - ## HITS:1 COG:alr1505 KEGG:ns NR:ns ## COG: alr1505 COG0845 # Protein_GI_number: 17228998 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Nostoc sp. 
PCC 7120 # 56 352 93 424 504 89 28.0 7e-18 MKLAKSKKKIIGAAAAVILCAIVLVLIFGHVTGDDLLFKNKKNLVVQSYVTMDESNINAL TGGQIKDVLVEEGDLVTKGQELVRLDSDTLMAQREQAQAALAQAQAAYQSLLNGATEEQM KQLQLAVQIAQANLENAQAAYTRMDADYQRTLALVPAGAVSQSSVDAQKAALASAKAALD SSRSNLEISQSKLSEAQNGATPEQIAQAQAGVDQAKAALKQIDTTLDKCVLTSPVDGLVT TVNVKSGDVVSSGLPSVVVADIYTPYITCDVDEMKLPRVKLDQEVDISLAALGKDTYKGK VVRINKEADFATKKASNEADWDILTYGIKVTFENLTGLQDELRSGMTAYVDFGK >gi|157101639|gb|DS480685.1| GENE 11 10143 - 10559 517 138 aa, chain - ## HITS:1 COG:no KEGG:Clole_0637 NR:ns ## KEGG: Clole_0637 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: C.lentocellum # Pathway: not_defined # 1 126 8 133 145 80 34.0 1e-14 MIEKSKRRFLCIKFRACGLSFSEGIVLLVIGYSRYSNQETISSLSGIDKYQTAKVLAAME ERGYIRREINPENKREKLVCLSEKGKEAANSLKDIMDQWEKIIFAGITPQEEQVLEQIMG KIVGNVREYERNIGREKL >gi|157101639|gb|DS480685.1| GENE 12 11016 - 11558 478 180 aa, chain - ## HITS:1 COG:no KEGG:Clole_1142 NR:ns ## KEGG: Clole_1142 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 172 1 175 208 122 37.0 6e-27 MGRNCCITIERQYGSGGREVARILSERLGIPCYGRELLAVTAADNGISIEELKKLDEKRT GSLLHDLVLYAFRFQNYNKMQEPFDINKEVSDTIRRLARKGPCIFIGRCADFALDGECDV LKLFIYASSMEKRKRRICEVDDIAFEKAEEEIQKRDSQRSDYYHYFTGKTGGIWKIMMPV >gi|157101639|gb|DS480685.1| GENE 13 11635 - 12795 1090 386 aa, chain - ## HITS:1 COG:RSc0194 KEGG:ns NR:ns ## COG: RSc0194 COG1063 # Protein_GI_number: 17544913 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Ralstonia solanacearum # 40 383 1 343 345 268 42.0 1e-71 MQSDIQSQMSQSQMSQSQMSQSQMPGSQAPHTRNSKPGTMKAAVYKENGIITLEHRPVPA LADPQDAIIRVTLTTICSSDIHIKHGAVPRAVPGTILGHEFVGIVEQTGEAVKKFKQGDR VAVNVETFCGECYFCRRGYVNNCTHEHGGWALGCRIDGGQAEYVRVPFADNGLTKIPDEV TDEAALFTGDILSTGYWGASLAEIKPADTVAVIGAGPTGLCTLMCARLYGPGVVIAIDTS DERLELAKEQGLADVIINPLKQDVEQEIKKLTLGRGADAVLEVAGGRDTFRMAWRIARPN 
AVVCVVALYEEPQMIPLPDMYGKNLVFKTGGVDASSCEEIMKLIKAGKLDTSCLITHRTK LEHIMEAYDVFENKKDHVIKYAIEVS >gi|157101639|gb|DS480685.1| GENE 14 13095 - 20981 5355 2628 aa, chain - ## HITS:1 COG:no KEGG:Closa_3324 NR:ns ## KEGG: Closa_3324 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 414 863 174 642 2050 186 30.0 2e-44 MKEKGRSRALHRKPYGLCKRFLAAMIAAVMVTTNIGTDLTTAYAAESSDSVTFEMSGSGL VTAIEEAIAAGSPLQEGDIDFTNGKTEQFEALFYGEGNIYEIYPEYDGGDMDAELRVFVR LPEDADDMYMVTGDEEIIFLYVNNGESSISCSTSITRMEDGKEKVKRTKKLTVKSYEAAF GDEEVNIISKPAETVPPETEPAETEPAETVPSETIPAETIPTETAPSEEIPTEAVPTEPD SSETTSSEANPAESEPAEPTPAETTEQTSQSQETQTPAGEEGQKETTASPTEGTDTGRQD TEESQKESEGSRPENTDKSEEISNNAKEEIEQKDNADKEQIHSEKQDDTPVASINRHYAP VVAVKTGDETASLPEKHDSDVRIEEGTKEKEEPKENAKADNTTKETAENQGSSADKTSGK TNVSSETGTSGSKESSSNPESSGNQESSASQESSGNPESSGNPESSGNPESSGNPESSGN QESSASPESSGNPEPSGTPEPSGPPESFGEAETPSQPAHPSGSEPSGESGDSEETEEGKT EESSPAESTGDIITDETPVITIETEPSRTPEETEAEDVHKVKAATSDDLVGIGYCSTARV YTTTVNTLKALNDFEGYKVTYAVYSQGTARIVDGPRGVEEGGSLTFGVKTQVGYDIQSVS ANGELLEAGAPGDSADPDSADPGNAAENISWFTIEDVTDELEIEVYTTETDEHPEFSDTI MVNDGMIINLYAPEGVLPKGVTASAERVDSALEDSIRENAQEAASEEGKQVSSVAAYDIN LWLGSQKLDAGIWNQEGAVTVTFSGVPVEEASQTAEEMSIVHVETEAADVKALEEVRDAV DVSGGRAVDALSFEAEHFSIYAVISKENITRTTVEVSKDETFRVKADRRYGGSWFVSDPR VVSIQNVKDINENGKWYSAADVKALKAGKKASIWYGNVVHNDYFNIIVNEAANQFHFKFN QDSSNISLWYSINGGGLLEMERGQVLQTDRTDTVSFYVKPAAGYVTPTTFEHRNGRSEAE LAKARFGEGIYVDIADAPSTISTEHTSRDFSDANAEAVSMGCTNQFHYSYRDDAYQYREF KIQATPAKIQVKYDGNGAEGMAPVDYRTYTHPAIKNGDSYIDIKGSGNLKKNHFDFIGWE LAETNVNQAGSKIYQQGDSLFVGDVWKDININPVTYVLTAKWMPKQYTLTTASDANSAID QGKQYTYDADSLIRVNFTAKQGYRISGVAVDGVHLSGKALNEAVQNGWVEVDCKQDHYVS VTSVRTEYNLITRAENADITKGTKYSYSDTDTLDVKFKADKGYSVTGITVDGKSFTKDEI QAILAEADQQGYSHITLDRKQDHTVFVKADKTLYTLTARADENSWINPQTKSYEYKEFIM YKVKFGAKKGYQIQNIIVDEKELSGFEKTAAIAAGAYLFDYKSSHTIEVTTRINSYHLTT YSDHNSWITPSMDFTYDGTEKLKVEFGAKEGCSISGIWVDLRQYRGDELKEAVSRGYIEV DRDRDHIVSVTGKINEYKLTVGADENSRITEPKQEVTTFTYDSDKCLVVKFKANTGYRVS 
KVIVDDVSLEGAELLKALASHSVSVDRKGDHSVFVETSPMEYALTTGADLHSRITYPQDR IYTYSYDPEASIEVTFEAKTGYGISGVFVDARPLKGQELQDAIEAGSVSVDRKECHSVYV TSEARRYTLITDSDEYSSISQGGSYGYDAILPMLVRFRADTGYTISSITVDGQSLQGAEF ERAAALGYVSLSRTRDHEVTVTSRIRQDLEYTVNYFFENDRGDFRIDRQYEHAQHVTNAE YGSDIVFDKTPALDKNGHHYALDHIDGDGKKVSLITLDNVVNVYYGIDEKGETNPEDSGK PTNPDGIPDRYQVTFTYRAGANGAMTGTVMEVATRRDSAGELSLYAPVRPTADGVEYLAD DGFTFNNWKSSQPGYKTLMPNYVFDDTDDIAAHKFYCDTVFTANFLDNKNIHYDVEFYMQ SQGRYQFRPTYTDARNDVPLHGMAKVTEQDLADKVESECTYVLDTSRRNITEAKVKEDGS TTLKLYFKQQFTVTYKPGSQGAFTEQSKSDYKYNDRLERFIGEKTPIDETMRFAGWMGMD GTVLQESQLPDTVTKNMIYTAVFEEDTVKSGFFVRKDKAARPDGNVPDPSYNYALFGEGF LHANPVMVPAPEGELNQSSEAVLERIAQWPENDTFTMDGKTYSMDDILWYRIQYEEGDPY EWHVDGELKHSIAVVFDLKGGTLNDSTDPLLTYSLQKYGVITDNLIPSLEGSAFSGWRVV ESQDTGFKTDQLYQAPDINNHEFTGSVTFEAVWDINMYTVNVEMADKEFPANILDRKLLE QVPFGTDPEADGRLNGLIPESIDGADGNHYIYTGTLRKSYEPVQGVRENGTITVYYMTDN DGPDKKPDGIPDAYEIQFIFESSNEQQGRIRNNGTGAYGTVTEWKSLVELDSEGRIVIGA DGEPVFWEDGVSPEGAVTAEANSGYALSGWTALMDDMNISFADLNEIRSTKYEHDVVFAA QWSAAGGNTPDTGGNSSSGSSGGSSSSGSGSGRRYVSANGGPGAVTLEETQVPLAEAPMT DMTVIDDGEIPLAPLPKTGNIMDRMSLTFLLSGIVLTLSVFDRRRHVK >gi|157101639|gb|DS480685.1| GENE 15 21309 - 21614 231 101 aa, chain - ## HITS:1 COG:CAC3531 KEGG:ns NR:ns ## COG: CAC3531 COG1943 # Protein_GI_number: 15896768 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 3 101 59 157 157 154 74.0 4e-38 MPDHIHMLVSIPLKYAVSEVVGYLKGKSSLMIFEKHANLKYKYGNRHFWCRGYYVDTVGK NTAAINAYIQNQIKEDLEYDQMSLVEYIDPFTGEPVKKNKK >gi|157101639|gb|DS480685.1| GENE 16 22470 - 22928 182 152 aa, chain - ## HITS:1 COG:no KEGG:ELI_2084 NR:ns ## KEGG: ELI_2084 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 152 1 152 152 213 68.0 2e-54 MKDFFEALKSEKKSIAIPEELNYFKKLIGSWQIDYVDSNISHVIKGEWHFSWVLEGMAIQ DVIILPDYEYGTSLRIYNPNTHTWDVAYGYTGKIIRLEAKKQDDRIVLTFIDDERRKWVF VEIEDDYFHWQNVTVKNDGEWHINAEIYAKRI >gi|157101639|gb|DS480685.1| GENE 17 22969 - 23772 731 267 aa, 
chain - ## HITS:1 COG:no KEGG:EUBREC_0800 NR:ns ## KEGG: EUBREC_0800 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 266 13 277 277 285 56.0 1e-75 MALTIQERLKDLRVERGLTLEQLAEQTQLSKSALGSYEADDFKDISHYALIKLTKFYGVT ADYLLGLTETKNHSNADLADLRLSDKMIDLLKSGRIDTALLCELAAHPDFVKLLADIQIY VEGIAATQIQNLNAWVDVARAEIVEKYQPGEHDKTAGVLKAAHVREGDYFSSRVHHDIDA IMEDIRQAHRGRSDSAPENTIVDELKQDLEEVANFKGSRAEQLLMVFCKQTKLRYSKLTE EEKQWLTRIVQKSELAKSYLPQRGKRK >gi|157101639|gb|DS480685.1| GENE 18 23849 - 24202 223 117 aa, chain - ## HITS:1 COG:no KEGG:DSY1378 NR:ns ## KEGG: DSY1378 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 113 17 126 130 92 46.0 6e-18 MVRFYRDTLGFHTEWNGGDFAEFETASGALSLFMYSRKEFVKAIGEGYIPPKGINQTFEI ALWLPSFADVDTEYERLSKLGLRFPTGEPITYPFGIRNFYVADPEGNLLEIGSTNET >gi|157101639|gb|DS480685.1| GENE 19 24620 - 25021 310 133 aa, chain + ## HITS:1 COG:no KEGG:CKR_2546 NR:ns ## KEGG: CKR_2546 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 133 1 129 129 162 66.0 4e-39 MSKKLPEIERGPGGFLPRLKPGQKKRANALIRKTCCNYENGNCLLLDDGEECVCPQTLSY SVCCKWFRWAVLPQDKALEAEIFRSASVKRCVECGAALVPRSNRAIYCEACAKRVHRRQK NASDRKRRRETDK >gi|157101639|gb|DS480685.1| GENE 20 25141 - 26082 660 313 aa, chain + ## HITS:1 COG:no KEGG:ELI_1000 NR:ns ## KEGG: ELI_1000 # Name: not_defined # Def: plasmid recombination protein # Organism: E.limosum # Pathway: not_defined # 1 313 1 313 313 502 85.0 1e-141 MPQYAILRFAKHKGNPARPLEAHHERQKEKYASNPDIDTAKSKYNFHIVKPEGRYYHFIQ SRIEQAGCRTRKDSTRFVDTIITASPEFFKGKPPKEIQAFFQRAADFLIQRVGRENIVSA VVHMDEKTPHLHLTFVPLTKDNRLCAKEILGNRASLTKWQDDFYAYMVEKYPNLERGESA SKTGRKHIPTRLFKQAVNLSKQAKAIERALDGIGPLNAGKKKEEALSMLKKWFPQMEDFS GQLKKHKVTIADLLEENKKLEARAKASEKGKLKDTMERAKLESELHNIQRVLDRIPPDVL AELKQQQQYGKDR >gi|157101639|gb|DS480685.1| GENE 21 26066 - 26278 193 70 aa, chain + ## HITS:1 COG:no KEGG:Tresu_1927 NR:ns ## KEGG: Tresu_1927 # Name: not_defined # Def: hypothetical protein # 
Organism: T.succinifaciens # Pathway: not_defined # 1 69 1 69 70 93 86.0 2e-18 MERIGDFISGIRYRKETEVVTFQGKEITLENLSPVFTPEQEAAKRRELEQQLYDVFRKYA DKRQAEEAGV >gi|157101639|gb|DS480685.1| GENE 22 26372 - 27883 1312 503 aa, chain + ## HITS:1 COG:CAC1956 KEGG:ns NR:ns ## COG: CAC1956 COG1961 # Protein_GI_number: 15895229 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Clostridium acetobutylicum # 7 478 4 492 531 199 30.0 1e-50 MNNRIDAIYARQSVDKKDSISIESQIEFCKYELKGGNCKEYTDKGYSGKNTDRPKFQELV RDIKRGLIAKVVVYKLDRISRSILDFANMMELFQQYNVEFVSSTEKFDTSTPMGRAMLNI CIVFAQLERETIQKRVTDAYYSRSQRGFKMGGKAPYGFHTEPIKMEGINTKKLVVNPEEA ANIRLMFEMYTDPATSYGDITRYFAEQGILFYGKELIRPTLAQMLRNPVYVQADLDVYEF FKSQGTVLVNDAADFTGMNGCYLYQGRDVKPDKLHNLKDQMLVLAPHEGIVPSETWLACR KKLMNNMKIQSARKATHTWLAGKIKCGNCGYALMSIVNRVGRQYLRCTKRLDNKSCVGCG KIYTAELETVVYQQMVKKLESHKTLTGRKKAAKANPKIAALQVELAHVDGEIEKLVDSLT GANNVLLSYVNVKIAELDGRKQELVKRIANLTVEALSPEQVSQISGYLDTWDRVFFDDKR RVVDLMITTVAATSEALNITWKI >gi|157101639|gb|DS480685.1| GENE 23 28175 - 33022 2545 1615 aa, chain - ## HITS:1 COG:no KEGG:Closa_3324 NR:ns ## KEGG: Closa_3324 # Name: not_defined # Def: LPXTG-motif cell wall anchor domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 252 759 166 642 2050 169 29.0 1e-39 MEGRKRSGMLKKSVYNMRKRLVAFALAAAMVCTNVGADLNAAYAATSSSESVTFEMTGSQ LVTAIEEAIENGNVISPGDLDFTNGDIAKFESLFYGEGKVLEVFPDPDGGSMDAELRVFV RLPEDADDMYMVTGDEEIIFLYVNNGEDTISCSTNITRMEDGVEKVKKTKRITVKSFEAA YGDEEINYISKPAEETTAPAPEDVNGPGTEETTAPDTTAPTDEGTVDDTTTVAPGENEST DAPTEEQTTAPEEGSTAPTEGETETSSEVAPTEEATTEEAITAPEATEEKAQEATEAETK ETEAPEPEAAVEPEEVSGEPVASIIRHYAPVVADNEEGNAEDAPKAEEAKEPEAEPEEKE EPVKETEAPTEKETEAPTPAETESTTEAQEGNTEESTTAAPDETSGETGESTDPSETTTE GTTTVPSEGDETTAPETTTEATETPAATEAVQVGTPSEVTKPEVQPEDTVNKASTTDLVG MGYCSTAKVYTTTINQLKALDDFDGYKITYSINPGASARIVDGARGVAEGESLTFGVKNQ IGFAVESVTANGEALVADSVADNEDGSQTAWYTVSDIYEEQEVEVYTVETDVHPAFDAEL VMDDGMVIHIMAEEGVLPAGVKATAVRVNNNVEAAVQEILSADTEKNVSSVMAYDINLWL 
GDKLLDAEMWSGSRLVNVTFTGKPIIEQSAEADVVEVVCVETVQDANKEEKEKAASQESE FAPEDIVKVETVSETVDVQGEQTVGEISFEAEHFTIYAVVSGNKQLKSVTIAVGGKLTLY NEDSRNRPLNADWSVTSGFDVIKFASGNARNRSSIEIVGLVPGTATVKADSGYWQEYSYT IIVAGSSGINAEYWITNQEVTGSDGRQSKNISSADLNQYGDGGVSVDEMAPATGNRGNDE LIFWHARVLPTNNHQTNGAGVNRVNDGQRIEKLRFFNENVQYYTSSGSWKNFAATDQLVF YYLQRTELAKQVEIAVSDWWANTGGNTRKIKYELIDELSGNIMGSSSTSYHDQHPGVSGI YITPNLDNAYSVSKVTVDNVEIDWENGFSVGWSNSSQTVTVRIYIKPNEANLQIQYLTED GQPLTGVMDSSTLGVLTKEGTDFAGYVRNGWLLKTQIEGYYGETYTIYTDLRSISSLKEY ALNYEYLSAEVSDNNTVLKLYYRSISHEYTVNYYKDSTSANNFLGKVEGTGAVGDAIPAD LSKFAPAGYVVPGTRSGATNISADGGDVVNVVYSKDTFSYSVHYFYDGIEDVQALESGIG EFESQVTTYKEKLRTKYALDYVENLPLTIGTDISKNVINVYYALDDNGDEIPDKYQIIFT YVSAGNGSVDGDAEEVHTFKDNDGNYIKPTPTSPKANIIVNADNKYAFDYWTDSEINDYT PDMSKMKSNTYLVNTTFIAYFDVDEIGIEVPNEPDGVPDKYQIKFQYVSEDTNRGTVSGR VTEVKTVYEIVTGEDGNDHRELRPASPDANVTVSSLGSYSFNNWTDGSTGYANADEIRAA EFTQSTTFTAQFRFNGGGGTGPGGGGGPSGNTEGNGRYNPSTVGPGTTTITPEDVPLAPL PESPVDVTLIDDGEVPLAPLPKTGQTSMRTTLTMMLSGIFVAVTALSKKRKEEDS >gi|157101639|gb|DS480685.1| GENE 24 33504 - 34595 456 363 aa, chain + ## HITS:1 COG:FN1198 KEGG:ns NR:ns ## COG: FN1198 COG1106 # Protein_GI_number: 19704533 # Func_class: R General function prediction only # Function: Predicted ATPases # Organism: Fusobacterium nucleatum # 12 361 34 413 420 140 28.0 4e-33 MDFLPASINEHKNTLLKDPADGECILPLIGIYGPNSGGKSTVLNALSCLSSLVTASVTVP TRIKDVHCLFSSDCKDIPTQFDVLFRSQGFLFRYQLQFLKGNIAEENLFYGSLGGSDTGI LFNRKESVIHPGRELISFPLKAVPPSVTLLSWLNAYTDSMHAKAAYSWFSRIQGSSPREL PSDADTRRRLCGILQDLGLDIADYSIIPDSDHKSPAAILVHSPYEGCTYELPWEEESMGT KKLLSLLPSILTSLDDGSLMMADDLDCILHPHALRYLIALYANRERNPHNAQLLFTAHNT AILQPAQMRRDEIWLCCRQEGQDTELYPLSSYKKENGLIPRNDEAYGKQYLEGRYGAMPN IHA >gi|157101639|gb|DS480685.1| GENE 25 34772 - 35986 896 404 aa, chain - ## HITS:1 COG:TM1119 KEGG:ns NR:ns ## COG: TM1119 COG3629 # Protein_GI_number: 15643876 # Func_class: T Signal transduction mechanisms # Function: DNA-binding transcriptional activator of the SARP family # Organism: Thermotoga maritima # 39 271 26 282 349 60 24.0 4e-09 
MKEEGTYYPARDREGLTVFMMGSIMLEYNRKPLQIPGGPSGKVLQLVLILLYSGEEGVHR EELLDMLYGNGECSNPSGSLRATVFRLRKLLEGSELPPREYIRTDGGIYRWDGGELPVYI DARDFENTAKVALRNHDEALLKRACSLYQGEFLAQLAGEKWVSVVGVRYQELYFSCLRVA SSLMKKNREYNGLLEICNYACELYPYEECQIMKIDCLIAMKRFKEAMQVYDTAVSQYFEE QGLPPSEKMLERFRIMSGQIRYTSDTLKDIRDSLHERDRINGPYYCAYPAFIDCYRLLAR MAERVDFSSVLLCCTLLDSKGKPLDTGGELLKTASEKLHDAIGKSLRKGDLFTCYNPSQY LVMLNGTSNEYCLRIYERIDRNLHLWDGGRKVKLNYQVMPESLI >gi|157101639|gb|DS480685.1| GENE 26 36058 - 37383 1184 441 aa, chain - ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 3 432 5 434 452 457 54.0 1e-128 MAAVNMTEGSITKHLVDYSIPLILGNMFQLTYNAVDSIIAGRFIGKEALAAEGTASPVMN IVILGISGICMGASVLMSEFYGAGQKEKLKREMSTTVIFGCYFSVIIAILGGIFSKSLLG ALGVPDEILGKAASYLSVIFLGAPFTYFYNAVSAALKSVGDSKTPLKFLAFSSILNAVLD LIFIGGLGFGIVCSAVTTVVAEAASAVLCITYVYRKIPMLQLRRGEFTMDRQLLRQTLRY GSITALQQSCQPIGKLLIQGAVNPLGVDMIAAFNAVNRIDDYAFTPEQSISHGITTFVAQ NRGAGRKERIRKGFRRGLMLEACYWVFICITITLFRRPLMGLFVTAGNEGIVALGSSYLG LMALFYVFPAFTNGIQGFFRGMGNMSVTLLGTFVQTSLRVVFVYLLTPGIGLLGVAYACA IGWSVMLLVEVPYYFWFMKDK >gi|157101639|gb|DS480685.1| GENE 27 37515 - 38018 686 167 aa, chain - ## HITS:1 COG:BS_greA KEGG:ns NR:ns ## COG: BS_greA COG0782 # Protein_GI_number: 16079786 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Bacillus subtilis # 5 139 9 143 157 93 42.0 2e-19 MYDNLTANDIKKMQEEIEYRKLVVRREALEAVKEARAHGDLSENFEYKAAKQDKNRNESR IRYLEKMIKTARIISNNSREGEVGLGDTVEVYIPDDDETERYKLVTTVRGNSLQGLISID SPLGKAIRGRKMGDKVYVKVDDHFGYDVEIRSIDKTTEDDEDKLRSY >gi|157101639|gb|DS480685.1| GENE 28 38110 - 39387 1072 425 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 79 137 562 619 621 61 44.0 4e-09 
MVRRFSRRQADMKGAEKPGMENPGMEKPGMKKPGMKKSGMMEPGMKQPGIKHSNIGRSMI RLLAAASAAAFLLTTPVYAGAWSQADGSWRYQKDDGSYARGWMQSTDGKWYYFDANGRML TDTVTPDGYQVGPDGAWTVGGWEASYDANYPLKDYLDQAGLQYVDVIMGWNAKGEPVMER RLRQDYNYGLYNYFGLSDDLCKALYGESDQIGTGEMAVDGDSLLVLGQVALYQENVGSDI YKQRRNALALVIRDYLNSYRWRDVSELERAKHAALYIASGCTYDTGLYNRFVNGEDTSGD LSFTAYGCLVNHKAVCEGISVAYQMLARSMGLNCFCAPDDNDKDHMFVYVEAGGNWYKVD LSVTGLMPQTMVDRCFKTTANQDVERIFAAYYNKANPYIAHVDYGNLAPGNVYDLSAGGQ LRKYP >gi|157101639|gb|DS480685.1| GENE 29 39663 - 40172 355 169 aa, chain + ## HITS:1 COG:CAC1363 KEGG:ns NR:ns ## COG: CAC1363 COG2032 # Protein_GI_number: 15894642 # Func_class: P Inorganic ion transport and metabolism # Function: Cu/Zn superoxide dismutase # Organism: Clostridium acetobutylicum # 14 161 23 178 182 111 38.0 8e-25 MANQSVTPADIFADLLRFRAPAAIAWVRGGDAAPNLSGLVKFYQTPYSGVLVEAEIFNLP NKQEPGSTDFYAMHIHQNGDCSDDFTKTGDHYNPSGEPHPDHAGDLLPLLGNEGYAWLSF YDKRFSTDDIIGRSIVIHSHADDFTSQPSGNSGTKIGCGVIEKADYLKV >gi|157101639|gb|DS480685.1| GENE 30 40282 - 41265 706 327 aa, chain - ## HITS:1 COG:CAC3454 KEGG:ns NR:ns ## COG: CAC3454 COG0042 # Protein_GI_number: 15896694 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Clostridium acetobutylicum # 1 319 1 311 311 353 52.0 2e-97 MKYYLAPMEGLTGYIYRNAYHACYHAMDKYFTPFLSPKANSKLSFRELNDILPEHNQGMY VVPQILTNQAGDFIQTAKELQEYGYSEINLNLGCPSGTVVSKHKGAGFLGRPDSLDRFLD EVCGKMDEMGMEVSVKTRLGLISPDEFTGLLEIYNRYPLKELILHPRVRTDYYKNTPNMP AFRGGFDGSRSPVCYNGDIFTISDYETLCGDYPELDRVMLGRGIISNPGLLDRIAAGRNS GGPGTDKETLKRFHQMIYEGYRQIMSGDRNVLFKMKELWACMIQIFSDNGKYIKRIKKSQ RLEEYETAAAAVFRDLDILEGAGFRTY >gi|157101639|gb|DS480685.1| GENE 31 41336 - 41845 42 169 aa, chain - ## HITS:1 COG:no KEGG:Mpet_1676 NR:ns ## KEGG: Mpet_1676 # Name: not_defined # Def: hypothetical protein # Organism: M.petrolearius # Pathway: not_defined # 50 168 21 146 191 62 33.0 5e-09 MCEYGEDHREIYKIDFKEKLKEEIIKCLKKAGNELYLVDSSLISDKTYLDVHAHERSICF RFGTYLNKYIESNSFLKKYDLDAEYNRDIDMIKRLPGWPNGCYPDLILHKRGNNENNILI 
IECKGWWSSEEAIQNDREKIMQFLHSERYQYLFGLQIIFCREGMELKWI >gi|157101639|gb|DS480685.1| GENE 32 42261 - 43271 1081 336 aa, chain - ## HITS:1 COG:no KEGG:Closa_2929 NR:ns ## KEGG: Closa_2929 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 336 1 336 336 621 96.0 1e-176 MALFESYERRIDKINSVLNSYGIASIEEAEKITKDAGLDVYDQVKKIQPICFENACWAYI VGAAIAIKKDCRRAADAAAAIGEGLQSFCIPGSVADQRKVGLGHGNLGKMLLEEETDCFC FLAGHESFAAAEGAIGIAEKANKVRQKPLRVILNGLGKDAAQIISRINGFTYVETEMDYY TGEVKELWRKSYSEGLRAKVNCYGANDVTEGVAIMWKEGVDVSITGNSTNPTRFQHPVAG TYKKECVEKGKKYFSVASGGGTGRTLHPDNMAAGPASYGMTDTMGRMHSDAQFAGSSSVP AHVEMMGLIGAGNNPMVGMTVAIAVSIEEASKAGKF >gi|157101639|gb|DS480685.1| GENE 33 43299 - 43991 609 230 aa, chain - ## HITS:1 COG:CAC2565 KEGG:ns NR:ns ## COG: CAC2565 COG0822 # Protein_GI_number: 15895825 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Clostridium acetobutylicum # 1 230 1 230 230 370 80.0 1e-103 MIYSHEVEEMCTVAQGVHHGAAPIPEEAKWVKSKEVKDISGLTHGVGWCAPQQGACKLTL NVKEGIIQEALVETIGCSGMTHSAAMAAEILPGRTVLEALNTDLVCDAINTAMRELFLQI VYGRTQSAFSEDGLPVGAGLEDLGKGLRSQVGTMYGTLKKGPRYLEMAEGYVTGIALDAD DQIIGYQFVSLGKMTDFIKKGDDPNTAWEKAQGQYGRVADAVKIIDPRKE >gi|157101639|gb|DS480685.1| GENE 34 44160 - 45512 1123 450 aa, chain - ## HITS:1 COG:TM0815 KEGG:ns NR:ns ## COG: TM0815 COG0534 # Protein_GI_number: 15643578 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Thermotoga maritima # 15 442 18 445 464 149 27.0 1e-35 MALVKKNIDLLNGNILTSLTELALPIMATSLVQTAYNLTDMAWIGMVGSDAVAAVGAAAM YTWLSSGVATLARMGGQVKSAHAYGEGNRREAVQYGKGALQLALVLAAVYGIITNLFAGP LIDFFHLNSNLVVEDAIVYLRIACGLILFAFIGQTLTGLYTASGNSRTPFVANCIGMGAN IVLDPLLIFGLGPIPGMGVAGAAIATVTAQFIVALVLVISMRRDPVLASQMRVWIPTPLS NIKTMVRIGFPAAIQSMLYCGISMVLTRFVTAWGDTAVAVQRVGGQIESISWMTADGFGT AINAFVGQNYGAGNLKRVKKGYMTASAIMFIWGIFTTCLLIFGAAPIFSLFIHEAEVIPA GADYLRIIGFSEMFMCVELMTVGALSGMGKTMEASVITIILTASRIPLAVILGGTAMGLN GIWWALTISSIVKGIIFFVYYLHIMKRMKA 
>gi|157101639|gb|DS480685.1| GENE 35 45746 - 58642 7396 4298 aa, chain - ## HITS:1 COG:L148778 KEGG:ns NR:ns ## COG: L148778 COG4932 # Protein_GI_number: 15672133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Lactococcus lactis # 2982 3683 1221 1933 1983 117 26.0 4e-25 MMALFLATVLVCTAIRTTSGAGDLINLNPYRQVFRGKNLNAFGRTHRDMNIYQVGEGAGA LPALCIQEGHKLPDGSPAKYEQYYVEPGKPVPVIGPFERYLSMVLAYEWLVSDNYYVPAR YGVVQVYYWGCMNGYEHNWALQKQAMEKFQAVMNGDPMVMVYYEEMKAHILEGEAQFNGT GSSSLPAWTGSVQKMTLKDGHYELTLDISSCPQLQDTSWSFPDNQWSYKLAPDGRSVTFQ YNGNQEPQGTIMSAQIQGIENRFYAYIFTPAASENLQKQLGWLDFNRPMASVSFSVGTDA VLPGSSDLELCRHSETFQSNYNIDLEKYCAETNQPLEGTVFNVWEDFDFSQVNEQGYEEG EPDGTSGQVYLNCMSPEPENDYVCDILTTDPDGYARHSDTRYYNYSKTYCMGHPAPEWTE CDHEEDEDCSCDEENDRLRGQWRAEQELCAATCDFHALNNDEDNREQDTSAMEAMLADRD ETYERFIELEYSYHLQEKTARTGYILHGLHNDDKEIETVVLSAAQAGGSARPAVYRPGSF VGKAVEPVYTYVAAMRERRAYKYPVPDALEMELGEQRSIFSLWGDEKEKKKPEIPADGAA AGGIFGDQENTSGSAGDQEAGDGTQGEEEDTSGDGGQDEDGNGNPEGEEEKDGEMTEKNP ENGNKEDVESGGTETGNSASGGAGPETGNSEGGETETENSGSGGAGPETGNSGSGGARPE STGNGGAGPETGNSGSEGAGPESTGNEGAGSETGNSGSEGAGPENTGSGGPKSDGSISEK TGDSRQKNETSYREDQDSGSQKKEEQKSGNLQVQSYSMSASPILLKSSHGKTILFSGISS EEHTVNSKHQSQNKPEDTANENIKYETDEDTKKEMNEEINKDTNGKADADTEINIEISSE NSTDNTADNTTDKEKHEIQDSTTETADIRDEAEGNSHIENQDGTSQRDGDEEDGGQSQIY EEYEYARNPISPSFGMAAYFQADTGNGDEEPEEEGNRAVRFIRSLFSGEEDDDDTISVSL PSFMDDDLGALDVSAYGEPDTILYTFKVWDHRTEGRLHINKRDLELYRADGDKSYGLTQG DATLEGAVYGLFAAQDIIHPDGKSGTIYNQNDLTAVAATDKQGNASFLAYTEKPGTRLSD DGTIIAPENATGPENLYNGSSVTSSAQGFGTVVYPDYVLSNEDQWIGRPLIMGSYYVMEL SRSEGYELSVNGISLKESNRTQDIVNRIHEAGQARISGGLSDYNSMDADGSWNDFIVESY KTENGYDIILTGYPQGAEFYEIRTGTETKTYKAVLGSSLQPKVDQQGRPVYQTAKGGEYK IGADGNPIIRLDTATDSDTGERTPYGETLPYRFRTAPYPSGTAVPADMSKWGQAIEPHYL SKQVNGMLGQLGYRPVTDISPWTQIRLSGQTNAQAAEEIMDWYTVHNFFDCGYVEDIYEI EGSFFALLRHDYSLGHAGFPAVYDFVNRKLYVRKTAEVSGGPAGKVGYWIEYQKGEYSLK SAVASIKEKREINQVIPFGSDIEAAAETVYQPLYETYSEGDIVLDREGNPIPLMERVYEY EDRTETYETDKPEPVSAVYDWATGNYAVHVENTIDWKDRTEPEYTEFRVVTREKTIDWEG 
EELPYSQYLTDIAGAGVSALAAVPLLDEGSYVVFQALNYPGQNQPVQEAGTGSSPLQVLQ RVIKQSVKVTKDISQSSYDGVNTYGSVHNDPLTVLLGLFNGGSSSQGAKLLNQFKFKAYL KSNLENIFVDSAGNIISEDIGTADFKGDVQKIFLPPRDGSGQRLLETKEDGSYDYTKFFD AMYAAVQVEKGKKPQEAIQQFAVDYYDIDAYKAEILTAEPGLNSDTAYEKALLRAADEAG SYLSVFSGLDNRLAIAWDSDAGGGADGDVTTLQCNTRNGKDDYFNHSIMLAYGTYVIVEQ VPADVDRELANRHFTRDYPREITLPFVPDIGQDEGTGETDVNYQTGSPYFRYDSTDTPDD LIRKYKIRFNEETHIIQANGQDGKFEIYKYGLDKDMRPGRSLTSHAPYEEEYMDGRNDTV KGYYAGYTSQSEDAGIMEGVIYDGYETEDGQWEVRDQVATMKGMQTAVDGKFASMLVPWT VLPPAVDRINPDTGNVETLIPSGNGRDFNFVAFAQEDFEDEYFNSRLRIEKLDSETGDSI IHDGALFKIYAAKREVEKEGMGTVTGSGNVLYGEAVDWQGNPVADADGRRILYPRVGKNN GSTDDLPVRLDKDGIPQYDESQLIRQEDQEGNETGVFRAYSTIRELVVDGQVKKVPVGYI ETYKPLGAGVYVLVEIQAPKGYGKSRPVAFEVYADNVSFYRERRNADGTTDGWEEETAVR YQYAIPVAGSTNKVRTSTVSRIKVEDYPSRMEIHKVEDGDSLVGNQNILQKTDDQGRVES SGGFETDVTVNDAGDLLVYKVSGRKEKLEERGDVRDIAYNPKTMQWDGYVTKSFDEYSEH IVEGTEKELKAMSGVKPLYRLDGTFTGRGIRFDISVSGAVLSLYHAMEIEKIGEHVYKGV SASVKDGKVTGITDTNTGTHKEIRVVGEENSPGGAVKTHTASWDVWDAVIVDNDPVNLYF HDLTQVDTREDPDTGELLVLDKKGNPLCFADSVTGMAYVYDDYGRMLAYTADDEGNKILV KSIQVLKDENGQTIYDNKTTVDDENGLPIYYTSGNVVTKDESWTTDSSMDSNGTQETSGA RHLIARLPFGSYILEEQGVPYDQGYIQARYMGLVLQDTDQVQKYFLQNEFTKTAFAKLDV RTQKEIRGAVMTLYRAGLDSDGSPHREQDGTYKKGQVYASWISGYQYDDDGNIKLDEHGE PATTTKPHWIDHIPVGYYVLEETICPYEQGYVQSEAVNIDVLETGDVQSFEMEDDFTSID ILKYDTKNGDVIYGDSEAYLTLYRPILDEKGFPVLEHGIPRYDETGRIFTFRAATYKDGQ DVAATGRVVPDAGGNHPIMKYDYDFREIPGTYQGRYYYTEQGTVRLEYLPAGNYVLAETE NPEGYATADPILINIEETGHLEEIQYFQMGDKPLKLEVSKVDITGGKEVNGAKLAVYPVD EKGNVSEIPLVLHQPSEDGQYQDIEAVWISGLDGRYTEEDGEQGLIPAGFQPGDLKPHTL EYIPEGDYILREIITPYGFLQSVDIPFTIADSQVLQKTEMTDEIPDGILRIIKSDSDKPD EKLMGAEFCLVNQTTGAICGTVTTDQLGQAQFEPQPIGYMDRDGNFKPYTYVCSEIKAAP GHMLTLEPYEFQFHYRNELTDLIVWEYNPTNDSNRVITDKLSGDTEEMLEGALLRIERRT ESGWETAEEWVSGRQGHYTKNLSEGQYRLIEVKAPEGYKLQETPIEFTISDGMTGIPHLV MRNYTTIIDVEKTSAETGKLLGGARLQLIDKSDGRVIREWTSEAGRGQQFYGLKPGTYII HELQAPSGYERTEDREIVVKEWSGTEGPEMDQKAGNELHNMVQVFRFENQTASSSGGGGG NRPRPKAEYITFKKTDVSGKVLQGAEFTFYDQTGRVIGTSVSDSTGTFRIRRPDNGTYTF 
RETKAPGGYALNPGIFSFTVNGSDVIRGAYEVVDKELEFTVTKLDGDSGLPLMGARFRIG KGENRVWDQEPENTGKGINGSAAGTITEAVTGADGTFTCRLTTPGLYVIRETEAPLGYER SDKVYEFTMDAEGRIQGEAVMQGSVTIYNWKEKPPVRKIGSITAVYQVGSRFGKGTYHFG SGPRDTVRTGDDLPVAAAAAMAVLCMAGFSMCLCMKQKREWSKGRKWVIFILVLIPVTVY LAFDVLAEETDSISVSGKIVYSSMENIEPVPQTAWVLARDEISGKERNVLLPLVSYHFSD KHWEDGFRLDLKVRDYDAGFYEVGGVRIETQKEEIRESLMDYEAEILGQAGLDSEAYRID QFQWNGGVYESDGVLCRNLTAIGRKMVADCTAVYGGEVSRNVFLNDSKGDEESGQIGDSG IDERYVIKNNVQPFFSMGVTVPAAGICFVALIAAMVLAMIKNTRPYGMAAVMFMFFAGVV FSMHFLVKMGMDYADGRRIYGMVQDEAYGRGQNEGQEREAGEEREAGEEREAGEEREAGE GREAGEERDVDEERGAGGGRDSDAEKKTDEESPDGKSPLNEEALASINPEYQFWLAVPGT NIDYPVVRHEDNEYYLNHNFYQEQHITGCVFADSSAVPLAVDNTVLYGHNMKDGSMFAGL KQYGEEAFFRENPVIQIFYRGKWVECPVFSCQIRHQSDAGAYGTNFMEEEWLPYLEKMGA ASLYETGIIPGGDEKLITLSTCYGKDQYLIVQALLYGM >gi|157101639|gb|DS480685.1| GENE 36 58698 - 59615 424 305 aa, chain - ## HITS:1 COG:BS_ykwD KEGG:ns NR:ns ## COG: BS_ykwD COG2340 # Protein_GI_number: 16078461 # Func_class: S Function unknown # Function: Uncharacterized protein with SCP/PR1 domains # Organism: Bacillus subtilis # 165 292 121 247 257 66 32.0 7e-11 MGRETTEKKTVKKKAAKRIDINRIFRQGITLPVLSVLLSIMAAMPVYAGQWMQDGNGWRW KNDIGSYPENCWQWLDGNQDGMAEHYYFGPDGYMISDTTAPDGQPVNEEGAWIMDGVIQR EEMAGADNRTWQINMSETEYIQFTDMANRINGKNRTWNGEEPEHQRTNPTSTEDISTDES AYRIIELVNAERQKRGENVLSVNQELMENAAARAEEACIYFSHTRPDKSKFDTAITVERY SSAENLAKLYSRDSMEMIAEAAVEKWMDSESHKKQLLDDRWTETGAGVCESGGYIYISQI FVKGM >gi|157101639|gb|DS480685.1| GENE 37 59681 - 59782 81 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGSEIVGSEIVGSEIAGSEIAGCEIVGSEIAGE >gi|157101639|gb|DS480685.1| GENE 38 59839 - 59937 58 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGKRLGEMIGGEIAGSEIAGSEIAGGEIAGVK >gi|157101639|gb|DS480685.1| GENE 39 60854 - 61021 110 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938148|ref|ZP_02085504.1| ## NR: gi|160938148|ref|ZP_02085504.1| hypothetical protein CLOBOL_03042 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03042 [Clostridium bolteae ATCC BAA-613] # 11 55 1 45 45 63 97.0 
7e-09 MVCVVYGVWYVWLWCWCVGVGGWVDCWWSVVMAAVAGVEDNGFYFYISPAKTEEK >gi|157101639|gb|DS480685.1| GENE 40 61123 - 62511 1476 462 aa, chain - ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 5 460 4 461 468 501 57.0 1e-141 MDVLKKIPVREQDPKERATNFKEVCLGYNQEEAQEEASRCINCKNAKCIQGCPVAINIPK FIAEVKEGKFEDAANTIAESSALPAVCGRVCPQESQCEGKCIRGIKGEPISIGKLERFVA DWSRENGFVPAKPEKTNGIKVAVIGSGPAGLTCAGDLAKMGYEVTIFEALHEPGGVLTYG IPEFRLPKSGVVQPEIDNVRKLGVKIETNVIIGKSVTIDELMDEEGFKAVFIGSGAGLPK FMGIPGENANGVFSANEYLTRSNLMKAFRDDYDTPIYLGKKVAVVGGGNVAMDAARTALR LGAEVHVVYRRSEAELPARVEEVHHAKEEGIIFNLLTNPVEILTDDNGWVKGMVVRKMEL GEPDASGRRRPVEVAGSDYTIDVDAVIMSLGTSPNPLISSTTEGLETNKWKCIIADESNG KTTKEGVYAGGDAVTGAATVILAMEAGRAGARGIDEYLNGNK >gi|157101639|gb|DS480685.1| GENE 41 62511 - 63401 1055 296 aa, chain - ## HITS:1 COG:PAB1737 KEGG:ns NR:ns ## COG: PAB1737 COG0543 # Protein_GI_number: 14521153 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Pyrococcus abyssi # 1 272 1 264 278 260 50.0 2e-69 MYKIVKSECLADKIYLMDVEAPRIARACQPGEFVIVKMDEVGERIPLTICDYDREKGLVT IVFQIVGASTERMAGLKAGDSFADFVGPLGQPSEFVKDDLEEVKKRRYLFVAGGVGSAPV YPQVKWLKERGIDVDVVEGAKTKDMLILEEEMRSVAGNLYITTDDGSYVRKGMVTDVVKD LVENQGKKYDVCVAIGPMIMMKFVCKLTKELGIPTIVSMNPIMVDGTGMCGACRVSVGGE VKFACVDGPEFDGHLVDFDQAMKRQQMYKTEEGRAILRLREGATHHGGCGNCGGEE >gi|157101639|gb|DS480685.1| GENE 42 63475 - 64386 579 303 aa, chain - ## HITS:1 COG:lin1322 KEGG:ns NR:ns ## COG: lin1322 COG2017 # Protein_GI_number: 16800390 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Listeria innocua # 7 302 1 289 290 175 33.0 1e-43 
MKQDGTLFTLENDELLVTVARRGAELTRIYDKKADREVLWCAEPSVWNRHAPVLFPFVGK CYEGAYVHDGKEYGMTPHGFARDMDFEPLLCDMDECWFRLKDTPETYEKYPFHFEVEIGH RLEGRTIEVMWKVANTDSGEMLFMMGGHPAFQVPEGKNIYDFTFEFNRRGCREGRFTDCL HYLAPNANGYEKEELQGNLKLSEGRVPLTKGFFDTALTYMFDEAQVSSVSLMVDGSPYVT LECSDFPYLGIWTMEATHPFVCLEPWYGICASDGYKGELKDRRGIISLPGWENWQKSYQI RVE >gi|157101639|gb|DS480685.1| GENE 43 64542 - 65033 526 163 aa, chain - ## HITS:1 COG:BH3450 KEGG:ns NR:ns ## COG: BH3450 COG0262 # Protein_GI_number: 15616012 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Bacillus halodurans # 1 162 2 160 163 100 34.0 8e-22 MNLCVTADRHWAIGKDGRPLVTIPADRQMFLKETAGKVVVMGRRTLEGLPGGQPFGNRVN IVLTHDMQYKVKGAVVCHSLEEALKALKDYDEGDIYIIGGESVYRQFLPYCTTAHVTSID YTYDGDAYFPDLEEEEGWYLAEEGDEQTYFDLCYTFRRFRRKK >gi|157101639|gb|DS480685.1| GENE 44 65041 - 65532 503 163 aa, chain - ## HITS:1 COG:BH3450 KEGG:ns NR:ns ## COG: BH3450 COG0262 # Protein_GI_number: 15616012 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Bacillus halodurans # 1 161 2 159 163 120 40.0 1e-27 MNLIVAVDEKWGIGKDGGLLAHLPEDMKYFRETTSGRTVVMGRRTLESFPGGKPLKNRVN MVLSRNESYSPEGVQVYHRAEDVLEALKDCNEDDVFIIGGGMIYREFLPYCNKAYVTYIH RTFEVDTDFENLDQDENWKLESVSDGREHEGISFEFRIYRRVR >gi|157101639|gb|DS480685.1| GENE 45 65677 - 66507 680 276 aa, chain + ## HITS:1 COG:BS_thyA KEGG:ns NR:ns ## COG: BS_thyA COG0207 # Protein_GI_number: 16078831 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Bacillus subtilis # 1 265 1 269 279 213 41.0 4e-55 MSYADQIFIQNCNDILEHGVWDTDYDVRPVWEDGTPAHTIKRFGIVNRYDLTREFPVITL RRTAFKSAVDELLWIWQKKSNNIHDLNSHIWDSWADENGSIGKAYGYQLGVKHHYKEGDF DQVDRILYDLKHNPLSRRIMSNIYNHHDLCEMNLYPCAYSMTFNVSGNTLNGILNQRSQD MLTANSWNVCQYAVLLHMLAQVSGLKPGELVHVIADAHIYDRHIPMVEELLKREPYPGPT LSMDTSIQNFYQFTTDSFRLEHYQYHTFSGKIPVAI >gi|157101639|gb|DS480685.1| GENE 46 66606 - 67550 1043 314 aa, chain - ## HITS:1 COG:CAC1605 KEGG:ns NR:ns ## COG: CAC1605 COG1893 # 
Protein_GI_number: 15894883 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Clostridium acetobutylicum # 5 307 2 301 301 193 34.0 3e-49 MNQKREIRTVSLIGLGAIGCFLASHLGPLMGDNLRVIAGGSRRERLEQEGVMVNGVRHHF NIVSPEETCGQDYPDLAIIITKFPALSQALEDMRNQIGPDTLIMAPLNGVEAEEKVAAVF GWDNLLYSLAKVSVVMKDGCASFNPKVARIEMGEKHNETVSPRVQAVKELFEGAGIRTVV PEDMERAIWYKYMCNVSENQSAAVLGIPFGAWSVSADANFIREKLMREVIAIAQKKGIDL SEKDMEKQASVLKDVPPKNKPSTLQDIEAGRKTEVEMFGGTIIRMGRELGVPTPYNEMFY HGIKVLEQKNEGLF >gi|157101639|gb|DS480685.1| GENE 47 67565 - 71560 3646 1331 aa, chain - ## HITS:1 COG:CAC2262 KEGG:ns NR:ns ## COG: CAC2262 COG1074 # Protein_GI_number: 15895530 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Clostridium acetobutylicum # 4 690 7 674 1252 490 40.0 1e-138 MAVNWTSKQQEVIDSRNRNLLVSAAAGSGKTAVLVERIIQMISEGDRPLDIDQLLVMTFT NAAAAEMRERIGAAVEQKLKERPEDEHLWLQAALIPQAQITTIDSFCLNIIRSHYNSLDI DPAFRIGDEGELSLLRGDCMGEMLENCYDEADEEFGRFVEHFGRGKSDRGIEDVILQAWQ FSQSHPWPQEWLAACQKELEEESIGEMEESPWMVFLMEDVARQMAELSGQLGEALNVCLE ENGPLAYEPMLISDRSKIEAIGRAASTGSFEALYNSLQNMSFDRLASIRSKDIDGDKKAF VSACRDRIKKAVAKCRELYGQQSPQEVVESMRGTRRVIRELLRLTGMFDQAYRDAKRERN VLDFNDLEHLALEVLYVREETGNGEERVSRHPSQVADELSRQYEEILVDEYQDSNYVQEA LITSISRERFGYPNVFMVGDVKQSIYRFRLARPELFMDKYEKYSRQGGPYQMIELQQNFR SRESVLTSVNDVFYQIMTKNLGGITYTPETALYPGAVFEEVNRKTVCGPGLEEGESDTQE SGPVSFLAGTPTELLMVDTGAEALKQLDEDSLDYTAKELEARLIAGRIRQLVSQGQGILV WDKSRGAYRRARYGDVVILLRSMTGWSEVFVNVLMNEGIPAFAQTRTGYFNTVEVETVLS LLSVVDNPMQDIPLAAVMRSPIVGMDDEEMAWMMAAYKRNSQKGQDRGVYGAWRLWLEEG PVPEEELEQEEGLRPGEGPGPEEEPEQEEGLGPGKGPGPEENPDREEGLITVGLCSIPAE AAHSISLKSLKLSLLMERLRGEARHLPIHELLYRVYKESGYYDYVSAMPAGETRRANLDM LVEKAAAYESTSYKGLFHFVRYIEKLKKFDTDFGEASVAGEQDNTVRIMSIHKSKGLEFP VVFLAGLGKRFNKQDAYGQILLDADLGAAADFLDLELRVKAPTLKKQALKRRLELETMGE ELRVLYVAMTRAKEKLIMTAADKSLENKLGKWKDIPLSQGQLPYTILASANSCLDWLLMA 
QPAIPAAHMEMRQIQVKDLIGEEITRQIIRKMKKEDLLNLDGERVYDAAFGTHLRKVLEY EYPYESDTGLYAMVSVSELKKQSQIGRAEDAIGTDRDNLEGAALGELTALRGKQDVAGSG FRESGEQNKPSSSGPNRGALRGTAYHAVLEHIRFHEIHSLSEVKPVVDELLENGFLDQEA RDFINPNVIWNFLSSSLGTRMAKAQAEGRIHKEQQFIIGIPAREMGLGDSDELVLIQGII DAYLEEEDGLVLIDYKTDHVPEGSPEQGEKMLAERYRVQLDYYERALTQLTEKHVKERII YSLALQMSINV >gi|157101639|gb|DS480685.1| GENE 48 71538 - 75128 3279 1196 aa, chain - ## HITS:1 COG:CAC2263 KEGG:ns NR:ns ## COG: CAC2263 COG3857 # Protein_GI_number: 15895531 # Func_class: L Replication, recombination and repair # Function: ATP-dependent nuclease, subunit B # Organism: Clostridium acetobutylicum # 11 1190 1 1148 1153 625 32.0 1e-178 MSRLLEGERKVALQFILGGAGSGKTRMLYDRVIKESMEHPEQKYLVVVPEQFTMQTQKEI IRLHPRHGIMNIDILSFKRLAYRVFEDLGVRIPVVLDDMGKSMVLRKVAGLKRGKLSLYS GHLEQTGFIGQLKSQISELYQYGITPDMLRDVRGEADSPLLAQKLEDLEVIYSGFREYIE SHYITAEEILDILCRELPKWEPLKDSIILLDGYTGFTPVQYRLVELFLMHAREVVCTVTI DSRENAYKESSIQHLFYMSRHTVCRVASMAKQHGIPKKEDIICSRRPAWRYDESPELDFL EQNLYRYSGRTWEGKEERIQVYKGRNPAGEAAYVCSRIEQMIKEEGMRYRDAAVITGDLP SYGRELAHQFDEAGIPYFLDDKKSILENPMVELIRAALEMVKDFSYESAFRYLKTGLVYD KAGSTEERAAADTAADTSGDAGEADAGSGFNGSREHAEQMTDRLENYVRALGIRGWKRWD SCWERQYRGGEKLNLKELNQYRMWILEPLRPLREAFKAEGATIASVTAALRQFLEHMELR EKLEEYRDFFLERKEPGDENLAREYGQVYDRVLELFERLEGLLGAEKADMKNYIQILDAG FQEIQVGVIPATVDQVMVGDITRSRLEAVKVLFFVGVNEGTVPQRKNGGSLLSDRDRAAL RKMDIELAPTVKEDGCIQKFYLYLMMSKPSRQLVLTYAGLTPDGKSQRPSNLMGEVGKLF PGMKTLDEHSVEWPVRTGRDARELLIQGLRGMQEQGEDDREREDAAFMQLFQHFYTSQEH RDKVRQLVDAAFYSYEERGIGRAAARALYGRELQGSVTRLEQFASCAYAHFLRYGLELME RQEYELEAVDMGNLFHQSIDRCFASMKDRGQDWRELTEEGRKSLVKECVTQVTEEYGNTI MSSSARNAYLAGRVERITDRTIWALAEQVKKGDFEPVGFEVSFSAIDNLKAMRIGLSEDE ELQLRGRIDRMDLCRDEDHVYVKIIDYKSGTTSFDLAALFYGLQLQLVVYMDAAVEMEER RHPDKEVVPAGIFYYHIGDPIADKQEGMDERDIEKQILKQLRMNGLVNSELDIIRHLDRE IEKESDVIPVAIKDGYVQESRSSVASTRRFQDLRRFVNRRLREAGQDILMGNVDLKPYKQ GNRTACDYCPYHAVCGFDTKTAGYGYRKLKGLKPEEIWAELESEGDEETDGSELDQ >gi|157101639|gb|DS480685.1| GENE 49 75196 - 75633 452 145 aa, chain + ## HITS:1 COG:no KEGG:Closa_1014 NR:ns ## KEGG: 
Closa_1014 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 144 1 144 146 167 65.0 1e-40 MSSKTKIVVLRMKEIIYTAIFVGLAILLITLCFIMFRPGKDNVSVSAEPAGYLPGVYSAA LTLGSEDVNVKVTVDKNRITSIDLVPLSEAVTTMYPLMQPTLDDLASQIISNQSTENLTY PSTSRYTSTALLNAINTALDKAKVD >gi|157101639|gb|DS480685.1| GENE 50 75902 - 77155 1377 417 aa, chain - ## HITS:1 COG:PH0037 KEGG:ns NR:ns ## COG: PH0037 COG3635 # Protein_GI_number: 14589995 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted phosphoglycerate mutase, AP superfamily # Organism: Pyrococcus horikoshii # 3 417 4 412 412 323 45.0 5e-88 MRKKQGLLIIMDGLGDRPDPVLGGMTPLESASTPNMERLLQMGMCGNVYPIAPGIPVGTD VGHLQIFGYDSSRVYRGRGPLEASSGGLELMDGDVAFRGNFATVTEDMAVVDRRAGRISQ GTEFLAQAVNGMVLSDGTRVLAKELTEHRIAVVLRGEGLSDAISCTDPGTTEEGKKVSGP RALDGSEKAQKTADALWEFTKKAYGILKEHPINKERIQKGLKPANIILTRGAGQKTEMVS LKDRYKIRAACVAGDKTVGGIARLAGMDYYIRDSFTGSFDTDLMGKARLAAELLKEKKYD WVVLHIKGTDLAGHDNRPDKKKEIVEQTDQMLGYLLNELDLEQCYISFTADHSTPCKVGD HSGDGVPTFIAGGDVRRDKVDMAGERYFMEGALEGLTANDIFRLQMNYMGFMEKVGS >gi|157101639|gb|DS480685.1| GENE 51 77155 - 77943 750 262 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_2399 NR:ns ## KEGG: NT01CX_2399 # Name: not_defined # Def: hypothetical protein # Organism: C.novyi # Pathway: not_defined # 38 250 26 238 239 121 31.0 3e-26 MKGKGYITFMVAAALSMGLAVTAMAEESGSGEPGGAQLGTMIEMADRLAPEGEDAAVISH IKGVTDSQGRLAVPMYQDAELVSVQEGAGTLKGEPKEQHSGMTSYLVADFDEADFPVELT LNWHQEGTYKMKAAKTSGTAPGNLKAVTYSMVNTAPVSIGSYSLELAVPEGYELAGIVGY DPEEEYGIFTDQGVKYGAYVFGEVPVGRECEMTVNIKKAGGNLALSMWGATILISAYFLY KNKGMLKEAKELAAKKKMGGNS >gi|157101639|gb|DS480685.1| GENE 52 77953 - 79425 1593 490 aa, chain - ## HITS:1 COG:SA0645 KEGG:ns NR:ns ## COG: SA0645 COG0471 # Protein_GI_number: 15926367 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Staphylococcus aureus N315 # 22 490 25 517 517 150 25.0 4e-36 MSSVPAKVKKQKSPFEMRMAYFGLPAALVALTLIWTMRTPAGLSYSGKMAMGIFAFALIL 
WVSNGIPNYVTSLIVIFLLPAAGAWTEKATLGVFGYEVIWLMIAAFVIASGMEKSGLAKR LALFLVTRFGKSVNTVLIVLMVTNFLIAFVVPSTTARAVMLLPIVMMIMEVFEVGNTTPQ DRNFGKLMALQGIQANNLSTGAIVTATSSQILAISFIKDLTGTEISWMKWFIASAPVAII TLAASFLIGKMLFKTGNRSASAEKMTALEEEYKGLGKMNGVEVRALIIFLITIFLWATDE YHGAMFGFDISLVVVAIVSAAIFLMPHVGILTWKEAKIPWDLMIFSCGAYAGGLALDDTG VATWVLNSVFDKLGVEHMSFMVLYAVIIFIASFSHFVFTSKTVRTIILIPTIISIAQVAG FNPVALALPASFMICDTITLPPHSKVNLTYYSTGKFTVLEEMTYGVLTLLAKWGIMVAAS FTWFKMIGIM >gi|157101639|gb|DS480685.1| GENE 53 79702 - 80616 861 304 aa, chain + ## HITS:1 COG:all3953 KEGG:ns NR:ns ## COG: all3953 COG0583 # Protein_GI_number: 17231445 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Nostoc sp. PCC 7120 # 3 274 8 283 337 119 28.0 8e-27 MDLESLESFLEVEKHKNFTKAAESLFCTQAAVSMRIKRLETALECGLFKRCGRRVELTRE GEIFLPYARQMVNTWKSASEHLLQTRLMEESEIRITSSSTPGTYIIPSLIYLFRQSYPYI TVINHIQYTRSAISDVIGGNFSLGIISQPASVGTDVLCCEPLMEDPLVLVVHPQHPWAEK RGILFKEITEETLLISNPNTSLVNYLEKEGRLSLEPSRLYVAGNIEAIKRGIRNHQGISI MSEYAVRQELELGLLKQVPLLDHPGLKRQLYTIFRKDTRFKLSTSLFLEFFRKAAEDNRL LSQL >gi|157101639|gb|DS480685.1| GENE 54 80762 - 81841 958 359 aa, chain + ## HITS:1 COG:CAC3472_1 KEGG:ns NR:ns ## COG: CAC3472_1 COG1396 # Protein_GI_number: 15896711 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 9 122 7 120 125 86 41.0 1e-16 MDIKLGPRIASLRRAKSMTQEQLALSLGVSPPAVSKWETGASCPDISLLCPLARALETNV DTLLQFEETLTEEQIDARLNEIVETARNHGYEAADGMVQSLLHTYPSSIPLKSRAVTLFD LFAMLFPAQPQEIRNGWTKQKKQLLEDVRASGAAACWQAAVSHLASIAISEGELEKAESL LKELPQHTTDSTSIWAMLYLKKGEASQALEVIQKRLYVLTRQVQNFLIQMMNPEMTPDTA RTLELCVIYRQLEELLGVGNGMSPGFFAQAYQRAGQDQQALDSMIQFVDAITGTVQKPNP ILFSPAVKIDGEHPAATKEMRELFLKSLLDDGYYKQFHDNKRFMDAVDKLRGSIQEDTA >gi|157101639|gb|DS480685.1| GENE 55 81917 - 82624 759 235 aa, chain - ## HITS:1 COG:CC2813 KEGG:ns NR:ns ## COG: CC2813 COG2186 # Protein_GI_number: 16127045 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Caulobacter vibrioides # 13 224 7 215 
249 98 33.0 1e-20 MEDLDFKVKKKNLYETVADRMEDMILSNSLRQGERLPSEQELALKFGISRNVMRESLKLL KERQLIVQKNGEGTFIAKPESGSVTEILNRMLRLDDIGYMDVYEMRKLLEPYASRVVAEK KDDLDFEKLEQYLQGMIESKDVWDKRIDYDIQFHICIAEATGNPLLICFINSMVSLWKTI LLRGIVTQQEGHQDGIEFHKRIIDSLKRGDPDEAEAVMRAHIEKSAKMCWVEDVQ >gi|157101639|gb|DS480685.1| GENE 56 82724 - 84010 735 428 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 5 426 1 429 435 287 35 1e-76 MSHAMMISLGLLVVCFVIRMPIAIGMIISSTLYLAVKGMDPSVVVEIFTSKMYNNFILLA VPLFIFAANIMNSSKVTDKIFDFSKVLVGRWKGGLGHVNVIASLIFSGMTGSAVADASGL GIMEIEAMRKEGYDDGFSAAITAGSAVIGPIFPPSIPLVIYGVIAGASIGNLFLAGMVPG IILALALCIYVCYIANKRNYPSGRKYAGMEFIKMTVSAVPALLTPIILLMGIYTGIMTAT EAGAVAALYALLISVFAYRSITLSEFIQVLKNTVKSTGQIGIMLGAAYVFSYVIAVENLP SLAGNLILSFATTKVSFLVFVNILLFILGMLLDSAVLQLVLLPILCPIAQSFGIDMVHFG VVFTLNTMIGLCTPPFGMLLFIVTGISGEKMNTIIKEVMPMVFVMVLVLILVTFVPDVIM WLPRLFGR >gi|157101639|gb|DS480685.1| GENE 57 84007 - 84504 424 165 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_2701 NR:ns ## KEGG: SpiBuddy_2701 # Name: not_defined # Def: tripartite ATP-independent periplasmic transporter DctQ component # Organism: Spirochaeta_Buddy # Pathway: not_defined # 11 144 8 141 176 77 31.0 2e-13 MDVWKKCFQAVMDFIEIYMPMCWFILLFVAFILQIISRYIFNNPLVWPYELAQISYVWII TLGCCYAQRTDDNIVFSVIYELVGEKVRRLFRFLQDILIVGLLVYLVPSALEFYRFYFTR YSTVFKVPLGFVYLGFFVFQIITIVRYCGDLAGCYKEFRKGGKAV >gi|157101639|gb|DS480685.1| GENE 58 84520 - 85614 343 364 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. 
PR1] # 75 363 44 325 328 136 28 3e-31 MKKRIISIALVLAMTAGLAACGQKTSETKSPSQAQAKTEAAPDASEQPQTDAGAELPEVK LVFSTQSSGDVNYVKAMHMIADEISEKTNGKFTMEIHENGSLMTQEGEVDAVARGSLDMM FSSPFLISDQMPYLTMFTAAYIFQNEQHMRAVMDGEIGQKVADDVAEKLNVRWLSALYCG ARHLDMRDIGREITKPEDLNGVNLRMPNSSAWLFMGKALGANPTPMAYAEIYQGLQTGAI DGQENPLPTTIAQKWYEVTSQIVLTGHVIEPMCIAINEEKWQSLPQEYRDIMTEAVKTAT AWCDEANLKQEQEAITFLKEQGLTVIEPDLNAFMEYSENYYLNDKDMVKDWDMELYHQIK DLKY >gi|157101639|gb|DS480685.1| GENE 59 85629 - 86402 844 257 aa, chain - ## HITS:1 COG:RSc1591 KEGG:ns NR:ns ## COG: RSc1591 COG3836 # Protein_GI_number: 17546310 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Ralstonia solanacearum # 4 236 8 238 272 125 30.0 9e-29 MKTLKEKLRDRDELSGMHICLTEPCISEMCAGLGYDFLWIDTEHTAIDYQVLLYHLMGAK AGGTDTLVRIPWNNQVLAKRVLEMGPTGIIIPMVNTAQELDAAMQATLYPPYGIRGFGPL RAVRYGLDDADSFIESSRQQLVRCVQIETKTAVKNLKEMVKNPYVDCFILGPCDLSGSIG ELNRVFEDHTSSLVDEAVAIINESGKSSGVSTGSDDPEIIRYWHEKGINVISAGSDYVHI MDGARKELNMLRQIQAK >gi|157101639|gb|DS480685.1| GENE 60 86632 - 87543 512 303 aa, chain - ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 2 300 175 470 470 246 46.0 5e-65 MKEYAEQGYIHGVALDLSKYFDTLNHEILLNILRRNVRDERVIQWIKRYLKSGVMENGVV METEEGSPQGGNLSPLLANIYLNEFDQEYQKRGIVFVRYADDIVLLAKSERAAKRLMETS TKYLEGPLKLKVNQEKSRVVSVFAIRTFKFLGFTLGKNGKGIYVRVHGKSWKKMKSKLKE LSSRRSVQSIRPALAKIKVYMCGWLNYYGIAEMKKRIEELNKWLYHRIRMCIWKQWKKPR TKVKNLRKMGVPEDLAWQAGNSRRGYWFTTQTVAVNMAMTKERLISSGFYDLATAYQSVH VNC Prediction of potential genes in microbial genomes Time: Thu Jun 30 17:53:07 2011 Seq name: gi|157101638|gb|DS480686.1| Clostridium bolteae ATCC BAA-613 Scfld_02_27 genomic scaffold, whole genome shotgun sequence Length of sequence - 283236 bp Number of predicted genes - 273, with homology - 264 Number of transcription units - 137, 
operones - 67 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 130 123 ## 2 1 Op 2 . - CDS 204 - 968 710 ## BHWA1_01124 hypothetical protein - Prom 1006 - 1065 3.6 - Term 1132 - 1163 2.4 3 2 Op 1 5/0.000 - CDS 1174 - 2364 1279 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 4 2 Op 2 . - CDS 2381 - 3568 1071 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 5 2 Op 3 . - CDS 3610 - 4821 939 ## COG1906 Uncharacterized conserved protein 6 3 Op 1 11/0.000 - CDS 4962 - 6236 719 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 7 3 Op 2 11/0.000 - CDS 6236 - 6823 613 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 8 3 Op 3 . - CDS 6840 - 7994 290 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 - Prom 8121 - 8180 7.8 + Prom 8062 - 8121 10.0 9 4 Tu 1 . + CDS 8230 - 9120 550 ## COG0583 Transcriptional regulator + Term 9149 - 9197 6.3 - Term 9127 - 9193 17.4 10 5 Op 1 . - CDS 9234 - 10616 1598 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 11 5 Op 2 . - CDS 10741 - 11280 576 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Term 11294 - 11332 6.9 12 6 Op 1 . - CDS 11354 - 13936 2884 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 13 6 Op 2 . - CDS 13953 - 14498 421 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB - Prom 14523 - 14582 3.4 14 7 Tu 1 . - CDS 14662 - 17646 3468 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 17794 - 17853 6.3 15 8 Tu 1 . 
- CDS 17892 - 19223 1536 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases - Prom 19382 - 19441 5.2 + Prom 19392 - 19451 7.8 16 9 Op 1 1/0.038 + CDS 19535 - 20200 727 ## COG2964 Uncharacterized protein conserved in bacteria 17 9 Op 2 4/0.000 + CDS 20226 - 21434 1296 ## COG1171 Threonine dehydratase 18 9 Op 3 . + CDS 21492 - 22808 1319 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases + Prom 23046 - 23105 5.1 19 10 Op 1 8/0.000 + CDS 23129 - 24322 1133 ## COG0078 Ornithine carbamoyltransferase + Term 24354 - 24393 7.1 20 10 Op 2 . + CDS 24452 - 25402 1100 ## COG0549 Carbamate kinase 21 10 Op 3 . + CDS 25419 - 26765 395 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Term 26785 - 26856 16.8 - Term 26776 - 26840 17.7 22 11 Tu 1 . - CDS 26866 - 29124 1939 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain - Prom 29154 - 29213 6.8 - Term 29187 - 29245 16.0 23 12 Tu 1 . - CDS 29297 - 30076 1008 ## COG2968 Uncharacterized conserved protein - Prom 30184 - 30243 4.6 - Term 31200 - 31234 4.0 24 13 Tu 1 . - CDS 31339 - 31512 114 ## - Prom 31556 - 31615 2.0 25 14 Op 1 . - CDS 32013 - 32183 62 ## gi|160938207|ref|ZP_02085562.1| hypothetical protein CLOBOL_03101 26 14 Op 2 . - CDS 32272 - 32394 59 ## gi|160938208|ref|ZP_02085563.1| hypothetical protein CLOBOL_03102 - Prom 32438 - 32497 1.7 27 15 Tu 1 . - CDS 32525 - 33409 118 ## gi|160938209|ref|ZP_02085564.1| hypothetical protein CLOBOL_03103 - Prom 33442 - 33501 5.1 - Term 33576 - 33611 4.0 28 16 Tu 1 . - CDS 33638 - 34498 352 ## Closa_1057 hypothetical protein - Prom 34529 - 34588 12.0 + Prom 34963 - 35022 7.1 29 17 Tu 1 . + CDS 35125 - 35295 171 ## gi|160938214|ref|ZP_02085569.1| hypothetical protein CLOBOL_03108 + Term 35312 - 35361 -0.2 30 18 Tu 1 . - CDS 35782 - 35937 84 ## - Prom 36049 - 36108 6.2 - Term 36204 - 36233 -0.3 31 19 Tu 1 . 
- CDS 36273 - 36647 322 ## DSY4397 hypothetical protein - Prom 36672 - 36731 9.7 + Prom 36677 - 36736 7.1 32 20 Tu 1 . + CDS 36830 - 37396 422 ## COG0681 Signal peptidase I 33 21 Tu 1 . - CDS 37430 - 38239 700 ## gi|160938218|ref|ZP_02085573.1| hypothetical protein CLOBOL_03112 - Prom 38259 - 38318 5.0 + Prom 38639 - 38698 2.8 34 22 Tu 1 . + CDS 38767 - 39132 211 ## Rumal_0129 hypothetical protein - Term 39152 - 39207 17.4 35 23 Tu 1 . - CDS 39401 - 39775 310 ## COG2033 Desulfoferrodoxin - Prom 39804 - 39863 5.8 + Prom 40064 - 40123 7.1 36 24 Tu 1 . + CDS 40152 - 40631 383 ## Bsph_4714 hypothetical protein + Term 40671 - 40720 1.6 37 25 Tu 1 . - CDS 40562 - 40708 75 ## - Prom 40933 - 40992 6.4 + TRNA 40982 - 41053 60.8 # Glu CTC 0 0 + TRNA 41216 - 41287 60.8 # Glu CTC 0 0 - Term 41283 - 41317 3.2 38 26 Op 1 . - CDS 41354 - 41938 409 ## COG1896 Predicted hydrolases of HD superfamily 39 26 Op 2 10/0.000 - CDS 41960 - 42427 621 ## COG0691 tmRNA-binding protein 40 26 Op 3 . - CDS 42476 - 44722 1735 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 - Prom 44777 - 44836 2.5 - Term 44773 - 44819 8.7 41 27 Tu 1 . - CDS 44845 - 45099 301 ## Clole_2827 preprotein translocase, SecG subunit - Prom 45133 - 45192 6.3 42 28 Tu 1 . - CDS 45228 - 46175 681 ## COG1482 Phosphomannose isomerase - Prom 46220 - 46279 5.5 - Term 46268 - 46304 5.1 43 29 Tu 1 . - CDS 46482 - 47753 1398 ## COG0148 Enolase - Prom 47787 - 47846 5.3 44 30 Tu 1 . - CDS 47856 - 48815 1145 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 48909 - 48968 5.9 + Prom 48905 - 48964 8.5 45 31 Tu 1 . + CDS 49005 - 50087 644 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 50096 - 50155 5.0 46 32 Op 1 . + CDS 50199 - 50504 127 ## gi|160938233|ref|ZP_02085588.1| hypothetical protein CLOBOL_03129 47 32 Op 2 . + CDS 50429 - 51526 684 ## Bcer98_2389 triple helix repeat-containing collagen + Term 51527 - 51577 10.0 + Prom 52078 - 52137 4.7 48 33 Tu 1 . 
+ CDS 52344 - 53237 185 ## BCE_4654 hypothetical protein + Term 53330 - 53367 -1.0 + Prom 53752 - 53811 5.5 49 34 Op 1 . + CDS 53837 - 54166 117 ## BCA_2510 hypothetical protein 50 34 Op 2 . + CDS 54179 - 54298 62 ## gi|160938240|ref|ZP_02085595.1| hypothetical protein CLOBOL_03136 + Prom 54306 - 54365 3.3 51 35 Tu 1 . + CDS 54402 - 56348 1351 ## CLJU_c18950 putative collagen triple helix repeat-containing protein + Term 56374 - 56427 2.2 - Term 56363 - 56406 12.7 52 36 Tu 1 . - CDS 56458 - 56826 523 ## COG0251 Putative translation initiation inhibitor, yjgF family - Prom 56897 - 56956 5.4 + Prom 57031 - 57090 8.4 53 37 Tu 1 . + CDS 57142 - 57738 592 ## COG3760 Uncharacterized conserved protein - Term 57738 - 57785 13.8 54 38 Op 1 . - CDS 57841 - 58164 216 ## Shel_25200 glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.4) 55 38 Op 2 . - CDS 58180 - 58311 267 ## Amet_3592 glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.3 1.21.4.4 1.21.4.2) - Prom 58347 - 58406 2.1 56 39 Tu 1 . - CDS 58420 - 58737 420 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 58844 - 58903 5.5 + Prom 58805 - 58864 6.4 57 40 Tu 1 . + CDS 59101 - 59949 691 ## COG0709 Selenophosphate synthase 58 41 Op 1 . - CDS 60048 - 60908 709 ## CLB_0653 response regulator 59 41 Op 2 . - CDS 60925 - 62061 1215 ## COG3949 Uncharacterized membrane protein 60 42 Op 1 . - CDS 62166 - 62399 161 ## Amet_3591 selenoprotein B (EC:1.21.4.2) 61 42 Op 2 . - CDS 62427 - 63476 1212 ## Amico_1394 selenoprotein B, glycine/betaine/sarcosine/D-proline reductase family 62 42 Op 3 . - CDS 63500 - 64786 1440 ## CDR20291_1637 sarcosine reductase complex component B subunit alpha - Prom 64897 - 64956 8.6 - Term 64913 - 64966 9.4 63 43 Op 1 . - CDS 65064 - 65744 788 ## COG1309 Transcriptional regulator 64 43 Op 2 . - CDS 65775 - 66161 512 ## Amet_3596 GrdX protein - Prom 66208 - 66267 6.8 - Term 66299 - 66355 -0.7 65 44 Tu 1 . 
- CDS 66397 - 67350 1069 ## Closa_3219 PpiC-type peptidyl-prolyl cis-trans isomerase 66 45 Op 1 . - CDS 67415 - 68200 1028 ## COG0561 Predicted hydrolases of the HAD superfamily 67 45 Op 2 . - CDS 68193 - 68801 655 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation - Term 68851 - 68907 4.1 68 46 Op 1 . - CDS 68918 - 69472 443 ## COG4905 Predicted membrane protein 69 46 Op 2 . - CDS 69517 - 69975 545 ## COG1963 Uncharacterized protein conserved in bacteria 70 46 Op 3 . - CDS 70036 - 70548 326 ## COG1418 Predicted HD superfamily hydrolase 71 46 Op 4 . - CDS 70607 - 72052 1498 ## COG0297 Glycogen synthase - Prom 72089 - 72148 4.9 72 47 Op 1 2/0.000 - CDS 72150 - 72968 789 ## COG0784 FOG: CheY-like receiver 73 47 Op 2 3/0.000 - CDS 73543 - 74820 1393 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 - Prom 74864 - 74923 3.0 - Term 74897 - 74938 4.0 74 47 Op 3 8/0.000 - CDS 74973 - 76631 1585 ## COG0497 ATPase involved in DNA repair 75 47 Op 4 1/0.038 - CDS 76649 - 77101 605 ## COG1438 Arginine repressor 76 47 Op 5 5/0.000 - CDS 77153 - 77995 833 ## COG0061 Predicted sugar kinase 77 47 Op 6 6/0.000 - CDS 78057 - 78989 916 ## COG1189 Predicted rRNA methylase 78 47 Op 7 13/0.000 - CDS 79073 - 80947 2056 ## COG1154 Deoxyxylulose-5-phosphate synthase 79 47 Op 8 . - CDS 80967 - 81878 1078 ## COG0142 Geranylgeranyl pyrophosphate synthase 80 47 Op 9 . - CDS 81871 - 82104 314 ## gi|160938275|ref|ZP_02085630.1| hypothetical protein CLOBOL_03171 81 47 Op 10 1/0.038 - CDS 82108 - 83394 1278 ## COG1570 Exonuclease VII, large subunit 82 47 Op 11 10/0.000 - CDS 83404 - 84039 690 ## COG0781 Transcription termination factor - Prom 84187 - 84246 8.3 - Term 84220 - 84272 11.1 83 47 Op 12 . - CDS 84295 - 84678 581 ## COG1302 Uncharacterized protein conserved in bacteria - Prom 84707 - 84766 8.7 84 48 Op 1 . - CDS 84845 - 86056 1282 ## Lebu_1366 VWA containing CoxE family protein 85 48 Op 2 . 
- CDS 86071 - 88341 2126 ## Lebu_1367 hypothetical protein 86 48 Op 3 . - CDS 88342 - 89445 1008 ## COG0714 MoxR-like ATPases 87 48 Op 4 . - CDS 89490 - 91436 1716 ## Lebu_1369 hypothetical protein 88 48 Op 5 . - CDS 91467 - 92918 1215 ## Lebu_1370 zinc finger SWIM domain protein - Prom 92970 - 93029 9.6 - Term 93018 - 93073 11.1 89 49 Op 1 . - CDS 93095 - 93952 1193 ## Closa_3245 hypothetical protein 90 49 Op 2 . - CDS 93970 - 94722 710 ## Closa_3246 hypothetical protein 91 49 Op 3 . - CDS 94676 - 95380 693 ## Closa_3247 Sporulation stage III protein AF 92 49 Op 4 . - CDS 95435 - 96718 1436 ## Closa_3248 Sporulation stage III protein AE 93 49 Op 5 . - CDS 96715 - 97101 515 ## Closa_3249 stage III sporulation protein AD 94 49 Op 6 . - CDS 97129 - 97323 271 ## Closa_3250 stage III sporulation protein AC - Prom 97346 - 97405 3.4 95 50 Op 1 . - CDS 97418 - 97981 653 ## Closa_3251 Sporulation stage III protein AB 96 50 Op 2 . - CDS 97938 - 99011 897 ## COG3854 Uncharacterized protein conserved in bacteria - Prom 99047 - 99106 3.1 - Term 99065 - 99108 5.3 97 51 Op 1 . - CDS 99168 - 100133 900 ## COG0657 Esterase/lipase 98 51 Op 2 . - CDS 100176 - 101960 1902 ## Closa_3262 hypothetical protein - Prom 102083 - 102142 4.7 + Prom 102124 - 102183 6.7 99 52 Op 1 . + CDS 102347 - 103468 1103 ## COG4927 Predicted choloylglycine hydrolase 100 52 Op 2 . + CDS 103465 - 104343 633 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 104362 - 104401 6.2 - Term 104179 - 104220 -0.6 101 53 Op 1 . - CDS 104393 - 105532 324 ## PROTEIN SUPPORTED gi|90020579|ref|YP_526406.1| ribosomal protein L22 102 53 Op 2 7/0.000 - CDS 105529 - 106755 1326 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 103 53 Op 3 . 
- CDS 106760 - 108229 1418 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 104 53 Op 4 9/0.000 - CDS 108226 - 109230 777 ## COG1638 TRAP-type C4-dicarboxylate transport system, periplasmic component - Term 109255 - 109286 2.5 105 53 Op 5 11/0.000 - CDS 109307 - 110614 1147 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 106 53 Op 6 11/0.000 - CDS 110634 - 111110 245 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 107 53 Op 7 . - CDS 111142 - 112206 494 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 - Prom 112256 - 112315 5.3 - Term 112288 - 112336 10.2 108 54 Tu 1 . - CDS 112408 - 113565 1067 ## gi|160938307|ref|ZP_02085662.1| hypothetical protein CLOBOL_03203 - Prom 113585 - 113644 6.9 109 55 Op 1 . - CDS 113708 - 114187 633 ## COG1576 Uncharacterized conserved protein 110 55 Op 2 . - CDS 114191 - 114682 442 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 111 55 Op 3 1/0.038 - CDS 114715 - 115422 1042 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Term 115439 - 115474 5.0 112 56 Op 1 . - CDS 115484 - 116365 841 ## COG1555 DNA uptake protein and related DNA-binding proteins - Prom 116399 - 116458 2.2 113 56 Op 2 . - CDS 116463 - 116888 539 ## COG3238 Uncharacterized protein conserved in bacteria - Prom 117022 - 117081 3.7 114 57 Tu 1 . + CDS 117369 - 117782 310 ## Closa_3092 hypothetical protein + Term 117870 - 117920 -0.8 - Term 117858 - 117907 -0.5 115 58 Tu 1 . - CDS 117928 - 118473 517 ## Closa_2241 hypothetical protein - Prom 118706 - 118765 5.0 + Prom 118651 - 118710 12.4 116 59 Tu 1 . + CDS 118737 - 118934 241 ## COG1278 Cold shock proteins + Term 119019 - 119065 11.1 - Term 119011 - 119050 9.3 117 60 Op 1 1/0.038 - CDS 119054 - 119932 1120 ## COG1281 Disulfide bond chaperones of the HSP33 family 118 60 Op 2 . 
- CDS 119938 - 120750 913 ## COG0500 SAM-dependent methyltransferases - Term 120763 - 120803 8.2 119 60 Op 3 . - CDS 120820 - 121329 686 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) - Prom 121439 - 121498 9.1 + Prom 121533 - 121592 5.8 120 61 Tu 1 . + CDS 121645 - 121818 245 ## Closa_0656 hypothetical protein + Term 121830 - 121885 12.3 - Term 121978 - 122037 0.7 121 62 Tu 1 . - CDS 122042 - 122236 124 ## gi|160938325|ref|ZP_02085680.1| hypothetical protein CLOBOL_03221 - Prom 122371 - 122430 75.8 + TRNA 122354 - 122426 78.4 # Ala CGC 0 0 + Prom 122356 - 122415 80.0 122 63 Tu 1 . + CDS 122576 - 123040 420 ## COG0590 Cytosine/adenosine deaminases - Term 123147 - 123195 4.8 123 64 Op 1 3/0.000 - CDS 123208 - 124644 1294 ## COG1012 NAD-dependent aldehyde dehydrogenases 124 64 Op 2 . - CDS 124657 - 125751 1150 ## COG1062 Zn-dependent alcohol dehydrogenases, class III - Prom 125839 - 125898 7.6 - Term 125988 - 126026 -0.2 125 65 Tu 1 . - CDS 126052 - 126375 202 ## gi|160938330|ref|ZP_02085685.1| hypothetical protein CLOBOL_03227 - Prom 126485 - 126544 6.2 + TRNA 126851 - 126923 80.4 # Val GAC 0 0 - Term 126981 - 127038 7.8 126 66 Tu 1 . - CDS 127149 - 127430 227 ## COG1550 Uncharacterized protein conserved in bacteria + Prom 127670 - 127729 10.1 127 67 Tu 1 . + CDS 127895 - 128044 154 ## gi|160938335|ref|ZP_02085690.1| hypothetical protein CLOBOL_03233 - Term 128183 - 128221 4.1 128 68 Tu 1 . - CDS 128326 - 128682 167 ## gi|160938337|ref|ZP_02085692.1| hypothetical protein CLOBOL_03235 129 69 Op 1 . - CDS 128793 - 129083 326 ## gi|160938338|ref|ZP_02085693.1| hypothetical protein CLOBOL_03236 - Prom 129109 - 129168 2.0 130 69 Op 2 . - CDS 129170 - 129331 151 ## gi|160938340|ref|ZP_02085695.1| hypothetical protein CLOBOL_03238 - Prom 129534 - 129593 7.7 - Term 129423 - 129467 10.3 131 70 Tu 1 . 
- CDS 129639 - 130550 873 ## COG0679 Predicted permeases - Prom 130663 - 130722 5.2 - Term 130827 - 130875 11.0 132 71 Op 1 11/0.000 - CDS 130968 - 132275 1138 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 133 71 Op 2 11/0.000 - CDS 132272 - 132775 231 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 134 71 Op 3 . - CDS 132759 - 133808 464 ## PROTEIN SUPPORTED gi|149195933|ref|ZP_01872989.1| Ribosomal protein L22 - Prom 134028 - 134087 4.4 135 72 Op 1 7/0.000 - CDS 134151 - 135002 790 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 136 72 Op 2 . - CDS 135027 - 136472 1115 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 136493 - 136552 4.8 137 73 Tu 1 . - CDS 136657 - 136875 222 ## Ethha_0641 hypothetical protein - Prom 136967 - 137026 3.8 + Prom 136823 - 136882 3.3 138 74 Tu 1 . + CDS 136994 - 137221 296 ## gi|160938350|ref|ZP_02085705.1| hypothetical protein CLOBOL_03248 139 75 Op 1 3/0.000 - CDS 137262 - 137969 866 ## COG1802 Transcriptional regulators - Term 138022 - 138064 6.1 140 75 Op 2 9/0.000 - CDS 138113 - 138919 189 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 141 75 Op 3 . - CDS 138942 - 139784 956 ## COG3717 5-keto 4-deoxyuronate isomerase - Prom 139949 - 140008 7.5 + Prom 140021 - 140080 4.1 142 76 Tu 1 . + CDS 140113 - 141330 1119 ## COG0205 6-phosphofructokinase + Term 141482 - 141524 6.9 - Term 141071 - 141130 0.0 143 77 Op 1 3/0.000 - CDS 141343 - 142185 941 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Term 142246 - 142283 3.5 144 77 Op 2 . - CDS 142346 - 143395 1224 ## COG1312 D-mannonate dehydratase - Prom 143441 - 143500 6.0 + Prom 143400 - 143459 6.3 145 78 Op 1 . + CDS 143573 - 144463 818 ## COG2207 AraC-type DNA-binding domain-containing proteins 146 78 Op 2 . 
+ CDS 144479 - 145453 841 ## COG1313 Uncharacterized Fe-S protein PflX, homolog of pyruvate formate lyase activating proteins + Term 145620 - 145682 3.2 - Term 146531 - 146572 5.1 147 79 Op 1 . - CDS 146606 - 147496 1068 ## COG0524 Sugar kinases, ribokinase family 148 79 Op 2 . - CDS 147530 - 148624 1243 ## BHWA1_01065 hypothetical protein 149 79 Op 3 . - CDS 148650 - 150062 1442 ## COG1696 Predicted membrane protein involved in D-alanine export 150 79 Op 4 . - CDS 150121 - 151479 1596 ## COG0534 Na+-driven multidrug efflux pump + Prom 151580 - 151639 6.4 151 80 Tu 1 . + CDS 151821 - 152387 683 ## COG1971 Predicted membrane protein + Term 152448 - 152502 9.1 - Term 152439 - 152486 11.2 152 81 Op 1 . - CDS 152556 - 153302 844 ## COG5263 FOG: Glucan-binding domain (YG repeat) 153 81 Op 2 . - CDS 153378 - 154604 861 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 - Prom 154627 - 154686 6.5 154 82 Tu 1 . - CDS 155145 - 156041 1077 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 156071 - 156130 4.0 + Prom 156103 - 156162 5.4 155 83 Tu 1 . + CDS 156216 - 157247 1120 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase + Term 157281 - 157336 18.0 - Term 157275 - 157317 11.6 156 84 Op 1 . - CDS 157427 - 158269 1024 ## COG1968 Uncharacterized bacitracin resistance protein 157 84 Op 2 . - CDS 158399 - 159073 675 ## COG0692 Uracil DNA glycosylase - Prom 159109 - 159168 7.0 158 85 Op 1 2/0.000 - CDS 159189 - 159950 846 ## COG0107 Imidazoleglycerol-phosphate synthase 159 85 Op 2 . - CDS 160025 - 160630 810 ## COG0118 Glutamine amidotransferase 160 85 Op 3 . - CDS 160705 - 161376 542 ## COG0846 NAD-dependent protein deacetylases, SIR2 family - Prom 161570 - 161629 4.5 + Prom 161527 - 161586 4.4 161 86 Tu 1 . + CDS 161673 - 163436 1420 ## COG3044 Predicted ATPase of the ABC class + Term 163470 - 163518 11.7 - Term 163460 - 163502 2.0 162 87 Tu 1 . 
- CDS 163551 - 164573 999 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) - Prom 164602 - 164661 5.4 + Prom 164708 - 164767 5.9 163 88 Op 1 . + CDS 164843 - 165526 473 ## COG0775 Nucleoside phosphorylase 164 88 Op 2 . + CDS 165542 - 166816 615 ## COG2015 Alkyl sulfatase and related hydrolases + Term 166848 - 166886 5.1 - Term 166836 - 166874 5.1 165 89 Tu 1 . - CDS 167106 - 167348 406 ## gi|160938388|ref|ZP_02085743.1| hypothetical protein CLOBOL_03286 - Prom 167419 - 167478 3.4 166 90 Op 1 1/0.038 + CDS 167790 - 169016 1080 ## COG0457 FOG: TPR repeat 167 90 Op 2 . + CDS 169032 - 169460 389 ## COG0824 Predicted thioesterase 168 90 Op 3 . + CDS 169550 - 170704 839 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) 169 90 Op 4 . + CDS 170731 - 170943 232 ## gi|160938396|ref|ZP_02085751.1| hypothetical protein CLOBOL_03294 + Term 170966 - 171011 8.4 170 91 Op 1 . - CDS 171008 - 171130 79 ## 171 91 Op 2 . - CDS 171138 - 171815 642 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 171963 - 172022 6.1 + Prom 171859 - 171918 4.8 172 92 Op 1 . + CDS 172024 - 173031 1186 ## COG2502 Asparagine synthetase A 173 92 Op 2 . 
+ CDS 173100 - 173711 654 ## COG2860 Predicted membrane protein + Term 173721 - 173787 24.1 - Term 173710 - 173770 4.5 174 93 Op 1 1/0.038 - CDS 173820 - 175139 1401 ## COG1653 ABC-type sugar transport system, periplasmic component 175 93 Op 2 19/0.000 - CDS 175210 - 175911 746 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 176 93 Op 3 1/0.038 - CDS 175871 - 177172 1393 ## COG4585 Signal transduction histidine kinase 177 93 Op 4 16/0.000 - CDS 177189 - 178136 1085 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 178224 - 178283 6.6 - Term 178290 - 178323 2.0 178 93 Op 5 21/0.000 - CDS 178343 - 179311 1316 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 179 93 Op 6 16/0.000 - CDS 179330 - 180817 191 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Term 180890 - 180931 6.2 180 93 Op 7 3/0.000 - CDS 180948 - 182066 1371 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 182124 - 182183 3.0 181 93 Op 8 4/0.000 - CDS 182228 - 182920 645 ## COG3822 ABC-type sugar transport system, auxiliary component 182 93 Op 9 4/0.000 - CDS 182942 - 183802 985 ## COG0191 Fructose/tagatose bisphosphate aldolase 183 93 Op 10 . - CDS 183951 - 185081 923 ## COG0524 Sugar kinases, ribokinase family - Prom 185203 - 185262 7.2 - Term 185243 - 185292 17.2 184 94 Tu 1 . - CDS 185385 - 187049 2239 ## COG1283 Na+/phosphate symporter - Term 187143 - 187196 4.4 185 95 Op 1 . - CDS 187444 - 188643 801 ## Mahau_1979 major facilitator superfamily protein 186 95 Op 2 . - CDS 188735 - 189535 954 ## Closa_2876 hypothetical protein - Prom 189629 - 189688 8.8 + Prom 189667 - 189726 6.8 187 96 Tu 1 . + CDS 189751 - 190848 927 ## COG1509 Lysine 2,3-aminomutase + Term 190912 - 190962 13.0 188 97 Tu 1 . 
- CDS 190981 - 192129 1491 ## COG1454 Alcohol dehydrogenase, class IV - Prom 192205 - 192264 9.2 - Term 192229 - 192264 4.4 189 98 Op 1 2/0.000 - CDS 192363 - 192929 561 ## COG1309 Transcriptional regulator - Term 192998 - 193038 6.4 190 98 Op 2 1/0.038 - CDS 193082 - 195718 2446 ## COG1511 Predicted membrane protein 191 98 Op 3 . - CDS 195702 - 197978 2406 ## COG1033 Predicted exporters of the RND superfamily 192 98 Op 4 . - CDS 198251 - 199129 913 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 199155 - 199214 9.8 + Prom 199181 - 199240 8.2 193 99 Tu 1 . + CDS 199382 - 199972 668 ## COG1309 Transcriptional regulator + Prom 200024 - 200083 3.3 194 100 Tu 1 . + CDS 200110 - 201342 700 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative + Prom 201541 - 201600 5.7 195 101 Op 1 . + CDS 201629 - 201778 118 ## 196 101 Op 2 . + CDS 201726 - 201929 81 ## gi|160938431|ref|ZP_02085786.1| hypothetical protein CLOBOL_03329 + Prom 202350 - 202409 5.6 197 102 Tu 1 . + CDS 202458 - 203831 715 ## COG0489 ATPases involved in chromosome partitioning 198 103 Tu 1 . - CDS 204206 - 204751 71 ## BBR47_10940 hypothetical protein - Prom 204895 - 204954 5.1 - Term 207149 - 207203 -0.8 199 104 Op 1 . - CDS 207207 - 207752 409 ## Closa_3994 hypothetical protein 200 104 Op 2 . - CDS 207778 - 208212 270 ## gi|160938440|ref|ZP_02085795.1| hypothetical protein CLOBOL_03338 - Prom 208295 - 208354 3.4 201 105 Op 1 . - CDS 208806 - 208973 110 ## gi|160938441|ref|ZP_02085796.1| hypothetical protein CLOBOL_03339 - Term 208974 - 209016 -0.4 202 105 Op 2 . - CDS 209050 - 209322 58 ## BT_3377 putative glycosyltransferase - Prom 209365 - 209424 2.6 203 106 Op 1 . - CDS 209447 - 209965 153 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 204 106 Op 2 . - CDS 210023 - 210706 289 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid + Prom 211843 - 211902 5.1 205 107 Tu 1 . 
+ CDS 211933 - 212076 59 ## - Term 212564 - 212603 0.2 206 108 Op 1 26/0.000 - CDS 212696 - 215044 777 ## COG0438 Glycosyltransferase 207 108 Op 2 . - CDS 215060 - 215983 196 ## COG0463 Glycosyltransferases involved in cell wall biogenesis - Prom 216196 - 216255 3.9 208 109 Tu 1 . - CDS 216322 - 216444 85 ## - Prom 216622 - 216681 2.9 209 110 Tu 1 . - CDS 216684 - 217214 143 ## COG0438 Glycosyltransferase - Prom 217370 - 217429 8.0 - Term 218050 - 218083 -0.3 210 111 Op 1 . - CDS 218088 - 218633 149 ## Cthe_1358 glycosyltransferase 211 111 Op 2 . - CDS 218693 - 218980 63 ## Cthe_1358 glycosyltransferase 212 111 Op 3 . - CDS 218982 - 219476 236 ## COG5017 Uncharacterized conserved protein 213 111 Op 4 . - CDS 219519 - 219875 178 ## Cphy_1202 oligosaccharide biosynthesis protein Alg14-like protein - Prom 219934 - 219993 3.2 - Term 219888 - 219918 -0.5 214 112 Op 1 8/0.000 - CDS 220007 - 221257 1008 ## COG1004 Predicted UDP-glucose 6-dehydrogenase 215 112 Op 2 8/0.000 - CDS 221308 - 222372 689 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 222487 - 222546 2.6 216 112 Op 3 . - CDS 223040 - 223267 85 ## COG1004 Predicted UDP-glucose 6-dehydrogenase - Prom 223376 - 223435 4.0 217 113 Op 1 . - CDS 223451 - 223546 72 ## 218 113 Op 2 . - CDS 223562 - 224848 677 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis - Prom 224902 - 224961 2.1 219 114 Tu 1 . - CDS 225089 - 225796 532 ## gi|160938458|ref|ZP_02085813.1| hypothetical protein CLOBOL_03356 - Prom 225841 - 225900 2.7 - Term 225832 - 225861 -0.2 220 115 Op 1 . - CDS 225916 - 226968 616 ## COG1316 Transcriptional regulator 221 115 Op 2 5/0.000 - CDS 226965 - 227696 475 ## COG0489 ATPases involved in chromosome partitioning 222 115 Op 3 2/0.000 - CDS 227693 - 228478 719 ## COG3944 Capsular polysaccharide biosynthesis protein 223 115 Op 4 . - CDS 228513 - 229142 299 ## COG4464 Capsular polysaccharide biosynthesis protein 224 115 Op 5 . 
- CDS 229220 - 229678 312 ## ELI_0681 hypothetical protein - Prom 229805 - 229864 4.4 - Term 229939 - 229987 2.5 225 116 Tu 1 . - CDS 230000 - 230206 129 ## gi|160938465|ref|ZP_02085820.1| hypothetical protein CLOBOL_03363 - Prom 230256 - 230315 4.8 - Term 230314 - 230349 4.3 226 117 Tu 1 . - CDS 230379 - 230834 358 ## BDP_1226 two-component response regulator VirR (EC:3.1.1.61) - Prom 230990 - 231049 8.7 - Term 231092 - 231136 4.1 227 118 Op 1 15/0.000 - CDS 231241 - 231726 599 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 228 118 Op 2 12/0.000 - CDS 231719 - 232606 941 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 229 118 Op 3 . - CDS 232625 - 234913 2412 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs - Term 234936 - 234964 0.6 230 119 Op 1 2/0.000 - CDS 234987 - 236381 1625 ## COG2252 Permeases - Prom 236421 - 236480 2.4 231 119 Op 2 . - CDS 236497 - 237768 1316 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases - Term 238351 - 238399 8.1 232 120 Op 1 17/0.000 - CDS 238477 - 239151 931 ## COG0569 K+ transport systems, NAD-binding component 233 120 Op 2 . - CDS 239210 - 240502 1399 ## COG0168 Trk-type K+ transport systems, membrane components - Prom 240584 - 240643 4.6 + Prom 240582 - 240641 5.5 234 121 Op 1 16/0.000 + CDS 240669 - 242201 1197 ## COG2205 Osmosensitive K+ channel histidine kinase 235 121 Op 2 . + CDS 242194 - 242892 745 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Term 242817 - 242864 -0.9 236 122 Op 1 . - CDS 242941 - 244041 1008 ## Closa_0526 hypothetical protein 237 122 Op 2 . 
- CDS 244045 - 245310 1332 ## Closa_0525 hypothetical protein 238 122 Op 3 36/0.000 - CDS 245362 - 246573 340 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 239 122 Op 4 24/0.000 - CDS 246567 - 247355 325 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 240 122 Op 5 . - CDS 247369 - 248751 1496 ## COG0845 Membrane-fusion protein - Prom 248838 - 248897 7.2 241 123 Op 1 . + CDS 249371 - 250354 486 ## BLJ_1240 hypothetical protein 242 123 Op 2 . + CDS 250386 - 250607 384 ## CD1117 hypothetical protein 243 123 Op 3 . + CDS 250649 - 251125 445 ## CD1116 hypothetical protein 244 123 Op 4 . + CDS 251122 - 252930 991 ## COG3505 Type IV secretory pathway, VirD4 components 245 123 Op 5 . + CDS 252917 - 253087 132 ## gi|160938488|ref|ZP_02085843.1| hypothetical protein CLOBOL_03386 + Term 253132 - 253166 1.0 246 124 Op 1 . + CDS 253499 - 253714 317 ## Closa_3718 conjugative transfer protein 247 124 Op 2 . + CDS 253734 - 254405 441 ## CD1112 hypothetical protein + Prom 254890 - 254949 1.9 248 125 Op 1 2/0.000 + CDS 254981 - 255793 452 ## COG3344 Retron-type reverse transcriptase 249 125 Op 2 . + CDS 255790 - 256812 608 ## COG3344 Retron-type reverse transcriptase + Prom 257080 - 257139 4.1 250 126 Op 1 . + CDS 257368 - 257769 197 ## CD1111 hypothetical protein 251 126 Op 2 . + CDS 257720 - 258724 797 ## CD1110 putative conjugal transfer protein 252 127 Tu 1 . + CDS 259447 - 261420 234 ## COG3344 Retron-type reverse transcriptase + Prom 261500 - 261559 3.3 253 128 Op 1 . + CDS 261586 - 262962 813 ## COG3451 Type IV secretory pathway, VirB4 components 254 128 Op 2 . + CDS 262959 - 263918 683 ## COG0863 DNA modification methylase 255 128 Op 3 . + CDS 263954 - 266023 1106 ## CD1108 putative DNA-repair protein 256 128 Op 4 . + CDS 266062 - 266340 320 ## CD1107A hypothetical protein 257 128 Op 5 . + CDS 266333 - 267043 632 ## CD1107 hypothetical protein 258 128 Op 6 . 
+ CDS 267040 - 267228 108 ## CD1106A hypothetical protein + Term 267271 - 267304 -1.0 - Term 267077 - 267123 7.3 259 129 Tu 1 . - CDS 267239 - 267991 195 ## CLI_1846 hypothetical protein - Prom 268016 - 268075 6.3 + Prom 267961 - 268020 6.4 260 130 Tu 1 . + CDS 268070 - 270154 1199 ## COG0550 Topoisomerase IA + Term 270194 - 270249 5.2 + Prom 270240 - 270299 1.6 261 131 Op 1 . + CDS 270394 - 273720 2667 ## CD1105 putative DNA primase 262 131 Op 2 . + CDS 273724 - 274677 608 ## EF2322 hypothetical protein 263 131 Op 3 . + CDS 274680 - 274886 278 ## CD1104B hypothetical protein 264 131 Op 4 . + CDS 274883 - 275248 168 ## CD1104A hypothetical protein + Term 275375 - 275427 18.6 - Term 275365 - 275413 13.1 265 132 Op 1 . - CDS 275426 - 275677 320 ## CD1103 hypothetical protein 266 132 Op 2 . - CDS 275741 - 277087 647 ## COG3843 Type IV secretory pathway, VirD2 components (relaxase) 267 132 Op 3 . - CDS 277048 - 277377 149 ## CD1101 putative mobilization protein - Prom 277566 - 277625 2.2 - Term 277557 - 277608 9.4 268 133 Tu 1 . - CDS 277639 - 277992 305 ## CD1100 putative conjugative transposon protein - Prom 278109 - 278168 6.8 + Prom 278735 - 278794 3.3 269 134 Tu 1 . + CDS 278968 - 279549 -13 ## COG0820 Predicted Fe-S-cluster redox enzyme + Term 279581 - 279635 13.1 270 135 Op 1 . + CDS 280086 - 280517 210 ## Tresu_1935 RNA polymerase sigma factor, sigma-70 family 271 135 Op 2 . + CDS 280498 - 280758 270 ## Tresu_1934 conjugative transposon protein + Prom 281152 - 281211 4.2 272 136 Tu 1 . + CDS 281233 - 281400 110 ## gi|163816249|ref|ZP_02207616.1| hypothetical protein COPEUT_02437 + Prom 281403 - 281462 4.7 273 137 Tu 1 . 
+ CDS 281501 - 283177 1341 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs Predicted protein(s) >gi|157101638|gb|DS480686.1| GENE 1 1 - 130 123 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGDGADRLAVIGMGWIGAYMVPCYRSLLGEGYEKRMFAVKAGS >gi|157101638|gb|DS480686.1| GENE 2 204 - 968 710 254 aa, chain - ## HITS:1 COG:no KEGG:BHWA1_01124 NR:ns ## KEGG: BHWA1_01124 # Name: not_defined # Def: hypothetical protein # Organism: B.hyodysenteriae # Pathway: not_defined # 10 226 6 227 256 171 36.0 2e-41 MNKMEHITVVYFSPTGGTRKACLNLAMEMGKKVKDVDLCSLEGEYSFGPEDTVIVGVPVF GGRIPGYAAEKLTYLKGGGAVALTAAVYGNRAFEDALLELDDCLKAQGFRIGAGTALLAE HSMVRDVAAGRPDNQDRKEMADFGARILEKLEEEGWQEPQVPGNRPYRDWKQMPVIPLTN ASCISCGLCAARCPVQAIPVSNPSSTDQARCILCMRCISVCPVKARSLPEQACAMLEQKL SPVRDIRRENELFL >gi|157101638|gb|DS480686.1| GENE 3 1174 - 2364 1279 396 aa, chain - ## HITS:1 COG:STM2273 KEGG:ns NR:ns ## COG: STM2273 COG4948 # Protein_GI_number: 16765600 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Salmonella typhimurium LT2 # 1 395 1 399 400 390 47.0 1e-108 MKITSVECYKGAFITVKVNTDEGISGFGEAGLSYGSCQNAAWGNCQDFAKMVIGMDPFDT EKIYEHLHRHTFWGMNGGVTVSAAMAAIDIACWDIKGKALGLPVYKLLGGKTHDSLRAYA SQLQMGWRTLITKLDRSELQFEPKDYYDAVKDAMADGYDAVKIDPCFAGLSREDFNQAYL KRGDNLKGAYPEALRKIAVERIAAAREAGGDELDIIIEIHSTLDANTAVILGKSLEPYRI MYYEEPTMPSNPEVFKHIKQKCDIPLATGERSYTRWGFRQFFEDRTLSIAQPDLCNTGGI TEVKKICDMANVYDIGIQLHVCGGPIATAAALHVETVIPNFVIHEEHCANFKTDYQKAGK YAWAPENGRYVMNERPGIGQEMSEEAMASYARVVVD >gi|157101638|gb|DS480686.1| GENE 4 2381 - 3568 1071 395 aa, chain - ## HITS:1 COG:STM3833 KEGG:ns NR:ns ## COG: STM3833 COG4948 # Protein_GI_number: 16767118 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Salmonella typhimurium LT2 # 1 394 2 396 397 323 44.0 5e-88 
MKITSAKLFLAEYNRYETPEFPKKSKVVGIRIYTDEGIYGDGEVAGIHATYGAFGVIKDM WPFISGKDPFDNEVLWDQLMLRTFWGQNGGAFWYSAVSAIDIALWDIKSKALNVPLYKLL GGKRRDKVRCYASQLQFGWGPIDSPAVTVQDYVDRARLALQDGYDAIKIDFLNWDEKGNI LNETNRLGLLPPELVQVFSDRVHAVREAIGPKVDLIIENHAGTNSNSAIQLSAAVQDCNI YYFEEPNTPVFYNNKYVKDNVKMPLAHGERVFGRWEYIRYFMDNSIQVIQPDIGNAGGIT ETKRICDMAYTFDVGVQIHTCASHLLTPPSVQLEACIPNFVIHEQHMRSMNPSNQELTAK VVMPKNGYLDVTDDIGIGNEWSDKALASEDQITLN >gi|157101638|gb|DS480686.1| GENE 5 3610 - 4821 939 403 aa, chain - ## HITS:1 COG:PH0014 KEGG:ns NR:ns ## COG: PH0014 COG1906 # Protein_GI_number: 14589976 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pyrococcus horikoshii # 1 394 1 390 395 90 22.0 4e-18 MDILYLSIVFLVIVVIIWLKKPLFVAMIGGIAATIILFRVNLKDAAVVLGRQTVAWDTID VLLSFYLIIFLQLMLEKKGRLTNAKDSFNRLLRNRRMNTIVSPAIMGLLPSAAVMTICAD MVDRTCGEYMDDKNKTFVSCYYRHIPEMFLPTFPAVLLGLTLSGQNAGVFVLAMIPMVVA ACLVVYITHLKGIPREMPLLEETIDKKAEIINLVKNLWTLIAVLLIIIIFNLSVCIAAPL VIASNYIFDHFSLKDLPDLMVRSAEPILLGNMYLIMLFKGILSYTGVIGLLPDFIGQFPI SMTMSFGLLFFIGTVISGSQTIIALCMPMAFLAFPQGGIPFLVMLMSVAWAAMQISPTHV CSFVAADFYHTTLGDIVVRALGPVIIFSIIAYGYGLLLGQIFY >gi|157101638|gb|DS480686.1| GENE 6 4962 - 6236 719 424 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 5 420 7 425 435 281 35 2e-74 MATGVMFIVMFGLMFLGVPIAVAMFISMFVLINVDPVTTSSFIAQTTYGGVASFTNLALP FFMISGTIMETGGLSKRLVSAANSIIGGVTGSLGMVTVIACMFFGAVSGSAPATVAAIGA IMIPMMVQEGYSKYYATALACCAGGLGVIVPPSYPLVLYGVTCNQSVGDLFIAGLLPSCV VGGVLILINYVYCRKHRLKGENHFHIKSAVSAWWDAKWALIMPVIILGGIYGGFFTATEA AVVAVVYGIFIGFVVYRELKIEKLWAVFRDNAAFIAGTMFIMCPAKATGSVFAYLNINQT IADFMFSISSNYYVVMFMIFIIMFIVGMFIQTTPAIIILAPTLLSVVTQVGCDPIHFGII LDLALAIAFVTPPVAINLFVGSSLTGISIDKITKAEMPLLFGLIAAFFVVAFIPAISLLL LGRI >gi|157101638|gb|DS480686.1| GENE 7 6236 - 6823 613 195 aa, chain - ## HITS:1 COG:BH2672 KEGG:ns NR:ns ## COG: BH2672 COG3090 # Protein_GI_number: 15615235 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type 
C4-dicarboxylate transport system, small permease component # Organism: Bacillus halodurans # 13 153 9 151 183 62 30.0 5e-10 MKDGKFDLKTFLDNIELYISAVLFIALTVLLFANVFCRYALKHSFAWVEEVATIAFVWMI WFAMSAAVTKRKHLRIDFILEMVPFKVKKAMLVISNLVFAAFDIYLLYIVMTIIRRLGNS QTTLLRLPQQMVYAIIPIGLVLSVVRIAQDTIKLMHENEANLGASKPAMDLDECERIYLE KKAAREAAAVKGEVR >gi|157101638|gb|DS480686.1| GENE 8 6840 - 7994 290 384 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. PR1] # 91 364 44 313 328 116 27 1e-24 MKKVMKKVIAAMVSTAVLLPLAACGNSDTDSAAATTAAAATTAGDSKASEGGGSVGADGV DHTQGEPFKIRFATNASAGDVAGDGISQTGKGMIYFCKQIEERSGGRITTQIFTDGQLAS STQEYIGGAQNGAYEIFMLNCGSWADYTPAYAGLNIPYLYMDYDDAYAVLDSGLRQEWDQ RAQADTQCIPLAHLDIGFRQLTANKEIHSPADLKGVKIRTMVDPIQMNCWEAFGASVTPV PYAELYTALQQKLVDAQENPPSNIVSSKIYEMQSYCMKTNHNFTTTIMAASPVFWSTLSE EDKVFIQDLWKETEMYVRSLTEDLSDGFFDEMQSKGTTVIELTTDELKVFQDVAKSVWPQ VEEQMGSEAYNKLVDFVLDYQAKK >gi|157101638|gb|DS480686.1| GENE 9 8230 - 9120 550 296 aa, chain + ## HITS:1 COG:lin2335 KEGG:ns NR:ns ## COG: lin2335 COG0583 # Protein_GI_number: 16801398 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 294 1 288 292 127 28.0 2e-29 MNYQHLEYFLKAAEYQHYTRAAEHLHITQPALTKAILGIEEEIGAPLFMKRGRNIELTKY GTIFFDYARRSLEEINHGISAVKHQVNSDLNTVDLSALCSINTTFLPKKSAQFHLAYPDC SLNITFKYTTAIIKDVSNYNSTFGLCGEFDNESEYTSLEKILLYTEPVKFVISRNHPLAH KKSVEAAELKNQPFAVYNVSTNGTNRLLYELCDGAGFRPKTLTAAYNDYGLINEVISNQC ISIVSYTFYKQFQSLGFTELHIATSIPLIQKIYLVWVRDANLSPVARGFREILMSK >gi|157101638|gb|DS480686.1| GENE 10 9234 - 10616 1598 460 aa, chain - ## HITS:1 COG:AGc4328 KEGG:ns NR:ns ## COG: AGc4328 COG0044 # Protein_GI_number: 15889657 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 458 26 477 506 331 39.0 1e-90 MKKVLKGGTVVGGSGSVRADVLIDGEKVAAVGTGEEMERLADEDTQIVDVEGCLLFPGFI DAHTHFDLHVAGTVTADDFATGTRAAVRGGTTTIVDFGTQYEGESLADGLRNWHEKADGK 
CSCDYGFHMSISDWNPSVSRELDDMMEAGITSFKLYMTYDTQVDDKTIFEILRRLKEVGG ITGVHCENSGMIAALQAEAKAAGRMGVESHPATRPAAAEAEAIDRLLRLAEVVDIPVIVV HLTCREGYDVIMEARRRGQKVYAETCPQYLLMDDSLYGLAGMEGAKYVCAPPLRKQEDSA CLWKALADGTIQTVSTDHCSFTTEQKALGKDDFTKIPGGMPGVETRGTLLYTYGVDAGRI TKEKMCQLLSENPAKLYGMYPTKGAIAPGSDADIVVMRTGVEDMVTAADQVQNVDYAPFE GRKLTARIESVFLRGTQVVKDHQVVVEKAGKFVKRGRYGL >gi|157101638|gb|DS480686.1| GENE 11 10741 - 11280 576 179 aa, chain - ## HITS:1 COG:lin2807 KEGG:ns NR:ns ## COG: lin2807 COG0454 # Protein_GI_number: 16801868 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Listeria innocua # 23 174 6 157 159 176 55.0 2e-44 MDNTLDKTGHGMEHKRADKAVEFRFAKPQDIPLILRFIRELADYEGMLDQVVATEELLNQ WLFEKEKAEVILGMKDGETVGFALFFHNFSTFLGRAGIYLEDLYVRPEYRGQGFGTAFLN RLAAIAVERGCGRLEWWCLDWNKPSIDFYLGIGARPMSDWTVYRIDGERLEAMAADARR >gi|157101638|gb|DS480686.1| GENE 12 11354 - 13936 2884 860 aa, chain - ## HITS:1 COG:BH0748 KEGG:ns NR:ns ## COG: BH0748 COG1529 # Protein_GI_number: 15613311 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Bacillus halodurans # 170 852 9 741 760 348 33.0 4e-95 MGENFCFTVNGVERCTSEDKPLLRYLRDDLHLHSVKDGCSEGACGTCTIVVDGKAVKSCV LTTRRAVGKNIITTEGLSDAEKEAFVYAFGAMGSVQCGFCTPGMVMAGKALIDRVPDPTE EEIKVALKGNICRCTGYKKIIEGIQLTAAILRGDARIEESLEKGDEYGVGKRAFRLDVRE KVLGYGEYPDDVEMEGMVYASAVRSQYPRARVLDIDCGKAAALPGVLAVLTAEDVPNNKV GHLQQDWDVMIAKGDITRCVGDAICLVVAESRDILEKAKRMVKIDYEELKPVCSIQEAMA EDAPRVHEKGNLCQSRHVTRGDAKKALENSRYVVTQTYRTPFTEHAFLEPECAVAFPYKD GVKVYSTDQGVYDTRKEISIMLGWDPDRVVVENKLVGGGFGGKEDVSAQHIAALAALKVQ RPVKVVFSRQESLSFHPKRHAMEGTFTLGCDENGIFTGLDCEIYFDTGAYASLCGPVLER ACTHSVGPYCYQNTDIRGFGYYTNNPPAGAFRGFGVCQSEFALESNINLLAEKVGISPWE IRYRNAIEPGKVLPNGQIADCSTALKETLLAVKDAYESNPGRAGIACAMKNAGVGVGLPD KGRAKLAVRDGKIELYSAASDIGQGCATVFVQMVSETTGLGKEMVRNMGANSEVAPDSGT TSGSRQTLITGEAVRMAAADLNEALQEAGGDLSQLEGREFFAEFFDPTDKLGADVPNPKS 
HVAYGFATHVVVLDDDGRVKEVYAAHDSGKVVNPISIQGQIEGGVLMGLGYALTEDFPLK NGVPQAKYGTLGLMRSTQIPDIHAIYVEKEELLSFAYGAKGIGEIATIPTAPAAQGAYYA KDHVLRTTLPMDDTYYSKKK >gi|157101638|gb|DS480686.1| GENE 13 13953 - 14498 421 181 aa, chain - ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 47 170 138 261 264 92 40.0 5e-19 MAQKRNRKVAVVQCLGGCHAEAVIPRGDLAGDCSQVLAEHPDGILKCKWGCLGMGSCVAA CKSDAIHINSFGAAEVDRGKCVGCGLCVKACPLGLIRITTPEYPVYTACMNQDGGAQTRK DCSVGCIACGICVKNCPADAVRIENNHAVIDEAKCIACGMCAVKCPRGIILDADGIFTVK A >gi|157101638|gb|DS480686.1| GENE 14 14662 - 17646 3468 994 aa, chain - ## HITS:1 COG:Z4217_2 KEGG:ns NR:ns ## COG: Z4217_2 COG0493 # Protein_GI_number: 15803415 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli O157:H7 EDL933 # 438 960 1 536 582 397 43.0 1e-110 MSDIMTPIPFGKLMNWILEENKKGAVFGIKRPFIADPDKTYEIFGRKLETPFGPAAGPHT QLTQNIVASYVAGSRFFELKTVQKLDGEDLPVSKPCIKADDECYNCEWSTELYVPQAFDE YVKAWFACKVLAREFGLGSPEGFQFNMSVGYDLDGIKTEKIDRFIEGMKDASETAIFKEC RQWLLDNLDRFEKVSREDVEAISPKVCNCATLSTLHGCPPQEIERIARYLIEEKRVHTFI KCNPTLLGYEYARKIMDDMGYDYVAFGDFHFKDDLQYSDAVPMLQRLQALADGKGLEFGV KITNTFPVDVKQNELPSEEMYMSGKSLYPLSMSVAGKLAEDFGGKLRISYSGGADFFNIA DIVDAGIWPVTMATTMLKPGGYERLEQIGQLFAAKEAAAFEGVDAARVTGLVEAVKSDKH HVKAVKPLPSRKINRPVPLTDCFIAPCQEGCPIHQDITRYLQLAGAGEFEQALEVILNKN PLPFITGTICAHNCMSKCTRNFYESPVDIRRTKLEAAEGGFEAVMAKLKAPEVSSDKKAA VVGGGPAGLAAAYFLARGGVKVTLFEKEEKMGGVVRNVIPGFRISGSAIDNDVKLVAAMG VEMVNGKEITDIEALKKDYDYVVLAVGASEQGVLKLEAGDTMNALDFLARFKATEGKVEL GKNVVVIGGGNTAMDTARAAKRNAGVEKVSLVYRRTRRYMPADEEELVMAVEDGVEFAEL LAPVKLENGVLYCKKMVLGDMDASGRRGVVATDETVEVPADTVIAAVGEKVPGAFYESFG IALDSRRRPQVNQETLETSVKGVYVAGDGLYGPATVVEGIRDGKMAAEAIIGKALAEDLF 
KLSDAETIYGRKGRLSEENDHVVDSARCLSCNSYCENCVEVCPNRANISLVVPGMEKHQI IHVDYMCNECGNCKSFCPWDSAPYLDKFTLFANEADMENSKNQGFTVLDAAAGVCRVRLQ GRIMDYTVGTVNADVPDGIQNIIRTVINDYSYML >gi|157101638|gb|DS480686.1| GENE 15 17892 - 19223 1536 443 aa, chain - ## HITS:1 COG:Z4218 KEGG:ns NR:ns ## COG: Z4218 COG0402 # Protein_GI_number: 15803416 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli O157:H7 EDL933 # 1 437 23 456 464 212 31.0 1e-54 MLVIGNGRLVTRDPEQPFFENGAVAMDGTAIKKVGTLEEIKKEFPDAEYVDAKGGVIMPA FINTHEHIYSAMARGLSIKGYDPKGFLDILDGMWWTIDRHLTNEQTRQSARATYLDSIKN GVTTVFDHHASFGEIKDSLFAIEDAAKEMGVRTCLCYEISDRDGMDKARAAVMENASWIK HALADDTDMIAGMMGMHAQFTISDETMELAAANKPAEVGYHIHVAEGIEDLHDCLKKYGK RIVDRLMDCNILGEKTLLAHCIYVNPHEMQLIKDTDTMVVHNPESNMGNACGCPPTMEIV HRGILTGLGTDGYTHDMTESFKVANVLHKHHLCDANAAWGEVPQMLFEGNAKIANRYFKK QLGVLKEGAAADVIVTDYIPLTPMDASNVNGHILFGMTGRSVVTTVCNGKVLMKNRELVG LDEEKILYEVREEAKKLAHSING >gi|157101638|gb|DS480686.1| GENE 16 19535 - 20200 727 221 aa, chain + ## HITS:1 COG:VC0355 KEGG:ns NR:ns ## COG: VC0355 COG2964 # Protein_GI_number: 15640382 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Vibrio cholerae # 11 220 37 239 240 83 31.0 2e-16 MSYTLDLLKQLADGLARQFGPDCEIVIHDLEKHDLEHSIVHINNGHVTNRQEGDGPSKVV LETLHKDPASLKDHLGYLTRTSNGKILKSSTIYIRNEEGDSIDYLLSINYDITGLMTVDR SIKALIDTEPQADVKQQPEQIVHNVNDLLDTLIEQSVALIGKPAALMNKEEKVTAIQFLN DAGAFLITKSGDKVSKYFGISKFTLYSYIDVNSRKSEDKKA >gi|157101638|gb|DS480686.1| GENE 17 20226 - 21434 1296 402 aa, chain + ## HITS:1 COG:STM1002 KEGG:ns NR:ns ## COG: STM1002 COG1171 # Protein_GI_number: 16764362 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Salmonella typhimurium LT2 # 34 399 36 401 404 393 53.0 1e-109 MDQKILWAVNQMPKTDDKNLPIMGLEEIAKARTFHESFPQYTKTPLTKLDHMAAYLGVKE IYLKDESYRFGLNAFKVLGGSFSMARYIAKETGRDVSELPYSVLTSDQLREEFGQATFFT 
ATDGNHGRGVAWAANRLGQKAVVHMPKGSTQTRLENIAKEGAAVDIQEMNYDDCVRLAAK EADETPRGVMVQDTAWDGYEEIPSWIMQGYGTMAMEAGEQLKEYGCERPTHIFVQAGVGS LAGAVVGYFSNLYADNPPVFVVVEAEAAACLYKGAAAGDGQIRIVDGDMETIMAGLACGE PNTISWDILKNHVKVFIAAPDWVAATGMRMLGAPIKGDAPVTSGESGAAPFGALAAVMSM DEYADLRKDIGLDENSRVLLFSTEGDTDPDRYKNIVWKGLDK >gi|157101638|gb|DS480686.1| GENE 18 21492 - 22808 1319 438 aa, chain + ## HITS:1 COG:ECs3745 KEGG:ns NR:ns ## COG: ECs3745 COG0624 # Protein_GI_number: 15832999 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli O157:H7 # 2 433 4 396 403 563 63.0 1e-160 MNLDYAKIKEAAKNYEKDMTKFLRDIVKFPGESCDEKAHIDRIAEEMTKLGFDKVDIDPM GNVLGYMGTGETLIGFDAHIDTVGIGNKNNWNFDPYEGYENDTEIGGRGVSDQCGGIVSA VYGAKIMKDLGMLSDKYTVLVTGTVQEEDCDGLCWQYIINEDKVRPDFVVSTEPTDGGIY RGQRGRMEIRVDVKGVSCHGSAPERGDNAIYKMADILQDVRALNENDAADDKEVKGLVKM LDEKYNPEYKEANFLGRGTVTVSEIFFTSPSRCAVADSCSVSLDRRMTAGETWESCLDEI RALPAVKKYGDDVTVSMYEYSRPSYKGLTYPIECYFPTWVIPEDHKVTKSLEEAYKNLFG DERIGVDATAEMRKARPLTDKWTFSTNGVSIMGRNGIPCIGFGPGAEAQAHAPNEITWKQ DLVTCAAVYAALPSVYCK >gi|157101638|gb|DS480686.1| GENE 19 23129 - 24322 1133 397 aa, chain + ## HITS:1 COG:ECs3743 KEGG:ns NR:ns ## COG: ECs3743 COG0078 # Protein_GI_number: 15832997 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Escherichia coli O157:H7 # 1 395 2 395 396 546 66.0 1e-155 MKTLQDYIDKLNSLNFKEMYENDFFLTWEKTDEELEAVWTVADALRFMRENNISTKVFES GLGISLFRDNSTRTRFSFASACNLLGLEVQDLDEGKSQVAHGETVRETANMISFMADVIG IRDDMYIGKGNAYMHEVVDAVTQGHKDGILEQKPTLVNLQCDIDHPTQCMADMLHIIHEF GGVENLKGKKLAMTWAYSPSYGKPLSVPQGVVGLMTRMGMEVVLAHPEGYEIMPEVEEVA RKNAEKTGGSFRVSHDMADAFKDADIVYPKSWAPFAAMEKRTNLYAEGDSEGIKTLEKEL LAQNAEHKDWCCTEELMKTTKDGKALYLHCLPADINDVSCKDGEVEATVFDRYRDPLYKE ASYKPYVIAAMIMLAKFADPAEVLKKLEEKGTPRVFK >gi|157101638|gb|DS480686.1| GENE 20 24452 - 25402 1100 316 aa, chain + ## HITS:1 COG:yqeA KEGG:ns NR:ns ## COG: yqeA COG0549 # Protein_GI_number: 
16130776 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 6 313 3 308 310 381 66.0 1e-105 MENKKKRIVIALGGNALGNTLPEQMKAVKITAKAIVDLIEEGCEVIVAHGNGPQVGMINN AMAALSREDAKQPNTPLSVCVAMSQAYIGYDLQNALREELYNREMYDIPVATMITQVRVD ADDPAFEAPSKPIGHFMTEEQAKIAEEKYGYIMKEDSGRGYRRVVASPKPAEIVEIGAIR SLVDSGQLVIACGGGGIPVTRQGNHLKGASAVIDKDFASELLAENLNADFLIILTAVEKV AVNFGKPEEKWLDDLDTEEARQFIKEGHFAPGSMLPKVQAAVKFAESKPGRTALITLLEK AKDGIQGKTGTHIHLA >gi|157101638|gb|DS480686.1| GENE 21 25419 - 26765 395 448 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 24 442 15 422 447 156 27 7e-37 MNTDNTAKQHASLFDLDGVPKMSQAIPLALQHVVAMIVGCVTPAIIISGAAGIDTADRVL LIQASLVVSALATLLQLFPIGNKNSFHLGAGLPVILGVSFAYVPSMQAIAEQSGIPAILG AQIVGGVCAIIVGLTIKKIRKFFPPLIAGTVVFTIGLSLYPTAINYMAGGAGQPTYGAWQ NWVVAIFTLVVVTVLNHFAKGFLKLASILMGMIAGYIFSMFFGMVSFGNVAESGMFQLPQ VMHFGISFEASACVALGLLFVINSVQAIGDFSATTAGGLDREPTTNELHGAILGYGFSNI IGAFLGCLPTATYSQNVGIVATTKVVNRVTLGLSALIILVAGLFPKFSALLTTIPYAVLG GATVSVFASIAMTGMKLVMTEDMNYRNTSIVGLAAALGMGVSQASASLSAFPTWVTTIFG KSPVVIATLVAILLNVTLPKDTGKVKKN >gi|157101638|gb|DS480686.1| GENE 22 26866 - 29124 1939 752 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 306 736 332 754 776 180 30.0 7e-45 MTTADGRGKSRRDTLLPIILCLCFIGSSIIFLVQMILKSQKENVAYLYDAANQTRTSILK QIEGDWQTLEGLAVSLRELVILDENQILTILNDINEENAFIRMGYADINGNARMVDMEGN VEEVNLKGMDFFERALQGEKSISNTFADQQDASGYINYFGVRINDESGNARGVLCAVHSA VVLREIIDAPVLKGAGYSNILDANGNYVLKTIKTFEGDILPDNKEKIAEVIRDTGRGDFI LTDRQGVRQMLVILPIIEGQWYQASMVPVDVLRSSYIQTAWGIMIIIVVACSLFIWLMSR QRKMAVNNQKMLMELAYSDSLTGLRNFAGFKREVEIFLNRPEISRSELTSYVLWYGDLRN FKLINDVLGYEEGDRLLRLVAEFLRTVEGPGCISCRIAADNFAGITRCEDTQVLDSGLCQ 
LKEYMRNSGMDEQPFMEIPVGVYRFRAGDGKQSLDVLVNYANMAHKIAKERPGSSYVFYD DSLRRRMLEDTVLESEAEAAMEEEEFKLYMQPKIDIQNGNQITGAEVLARWLSPRRGLIL PGNFIPLFEKSELIVKLDRYMFEHACRWYRSHLEQGGRPVSLAVNVSKAGLFQNDFVDYY TDIKAKYSIPDRVMELEFTESILAADTELFAELVVNLNARGFICSLDDFGSGYSSLNLLK NLPIDVLKLDILFFQKSRDIRRERIVVSNVINMAKELDIKTIAEGVEDMDTVEFLRKAGC NVIQGYVFAKPMPQEDFEHLLTDNQDGSFEGC >gi|157101638|gb|DS480686.1| GENE 23 29297 - 30076 1008 259 aa, chain - ## HITS:1 COG:MA0311 KEGG:ns NR:ns ## COG: MA0311 COG2968 # Protein_GI_number: 20089209 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 45 255 46 253 256 80 33.0 3e-15 MKHIMKKTVYTAGISAAVAAAALGLGGCGAGAQMPDTLKVQNVGAAENVITVTGREEVKV VPDMAQIEYAVYTREDTAAACQEKNANDLNAAIETLKGLGVEETSIQTSSYGLSPIRNWN SDKQEITGYEMTTSLMVSDIPIDNAGTIITKSVAAGVNGINSVSYFSSSYDASYQEALKG AMAVAQAKAQALAEASNKTLAGVVHVEEFGYQPETRYASYNAGGSAKMAAAVAEDAAASV MPGQVSVEAQVTVSYELND >gi|157101638|gb|DS480686.1| GENE 24 31339 - 31512 114 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEIYKIITFWLKKKIAFYYQKGNDIITYGMRLFQENVRMGLVNHWYGMRKKRKHNHI >gi|157101638|gb|DS480686.1| GENE 25 32013 - 32183 62 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938207|ref|ZP_02085562.1| ## NR: gi|160938207|ref|ZP_02085562.1| hypothetical protein CLOBOL_03101 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03101 [Clostridium bolteae ATCC BAA-613] # 1 56 1 56 56 92 100.0 7e-18 MKNKLLEKAVKDSRVKAEVLAKAAEMKLDELVSIDYSWGKLKLTTSPYGRLRMLCH >gi|157101638|gb|DS480686.1| GENE 26 32272 - 32394 59 40 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938208|ref|ZP_02085563.1| ## NR: gi|160938208|ref|ZP_02085563.1| hypothetical protein CLOBOL_03102 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03102 [Clostridium bolteae ATCC BAA-613] # 1 40 1 40 40 71 100.0 2e-11 MKKIIRVTGKGQISVPPDRICLLFELEERKDTYREAIKDN >gi|157101638|gb|DS480686.1| GENE 27 32525 - 33409 118 294 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938209|ref|ZP_02085564.1| ## NR: 
gi|160938209|ref|ZP_02085564.1| hypothetical protein CLOBOL_03103 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03103 [Clostridium bolteae ATCC BAA-613] # 1 294 34 327 327 555 100.0 1e-156 MNDIFFDKNFNRKILKKWVSSQTSMLTENNGVTYIEKIENQVNDSGQSSIVPYGSDELEI YENILTPLKVEHVHVVASKRKQGESKFVLEYIDGLTCSDAPQAIHLYKAALKIGDLYRTS RNNISCVNETTFQKYYLSKEKTLNHLYEISKAFDISGLVSFTSDSYEKYSKYPIFLNHYD MHFKNMIYSKGDIRIIDWATAQFSPFYSDLYVLLRQAEQVDADISVIIDNYKNSAGLEEL TEEEMLVGRIFWCIPAIHWLLELQGTDNVPFYEWAEDEYTSLLTAFRRYKEILA >gi|157101638|gb|DS480686.1| GENE 28 33638 - 34498 352 286 aa, chain - ## HITS:1 COG:no KEGG:Closa_1057 NR:ns ## KEGG: Closa_1057 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 24 140 22 155 336 86 36.0 1e-15 MTVKKWIAMITLSMLFSGTLIWEAYAGTWKQGIDGDVSAWWYDNGDGTFPISTWMWIDGN DDGIAECYYFDDTGRMLSDTVTPDGYTVDKNGAWKLGFVRHMYSLNSYQEEIFKDAAEGL LNLYEDQFYVPSFTYGQLRYLMYEDRQKLMNSLLAADTEKHLNNPDITTMFVPLRIEDGI GYFDRDKTLDDFYAVFNIGLNPENIPQSEQGELKGIPFEKENKYKITDCEQSSDGTFLYM KLKYEKTDVVTNKILRTGTFYMKLYRNKSDDIGYYIDIISNQESFG >gi|157101638|gb|DS480686.1| GENE 29 35125 - 35295 171 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938214|ref|ZP_02085569.1| ## NR: gi|160938214|ref|ZP_02085569.1| hypothetical protein CLOBOL_03108 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03108 [Clostridium bolteae ATCC BAA-613] # 1 56 9 64 64 93 100.0 5e-18 MTNEIAKFTLRTDSELMKKLRIVADYNGRSANRELEVLIKNHVAEFEKKHGTIDLN >gi|157101638|gb|DS480686.1| GENE 30 35782 - 35937 84 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYCKQDIDENGVGELIPEVIETTPELWPTWDSVIYNIYTISDGAFVQVLNS >gi|157101638|gb|DS480686.1| GENE 31 36273 - 36647 322 124 aa, chain - ## HITS:1 COG:no KEGG:DSY4397 NR:ns ## KEGG: DSY4397 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 121 1 121 122 107 47.0 1e-22 MKRIVHTVSIILAVLLAAAGGYYGGRRSVQPLVGTTFYAVIEEIRENRLLVQGLEINDIN GRGTFELAVGDKTELIWRGVPITVKDLRVGNTVSVTYIGEVLEISPAQVSDVLRIQLLDD ETPY 
>gi|157101638|gb|DS480686.1| GENE 32 36830 - 37396 422 188 aa, chain + ## HITS:1 COG:alr2975 KEGG:ns NR:ns ## COG: alr2975 COG0681 # Protein_GI_number: 17230467 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Nostoc sp. PCC 7120 # 21 182 23 183 190 111 37.0 9e-25 MNTSAKSEHTGNSGFKKEVLEWVKILLAAAAIAFVLNTFVIANSYIPSGSMENTIMTGDR IIGSRLSYAFGAQPQRGDIVLFDHKAEPGKDKTRLVKRIIGLPGETVDIRDNQIYINQSD TPLDEPYLPEPMDSENYHFQVPEGCYLMLGDNRNHSRDARDWSDPFVPEEAITAKALFRY FPGIHWIS >gi|157101638|gb|DS480686.1| GENE 33 37430 - 38239 700 269 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938218|ref|ZP_02085573.1| ## NR: gi|160938218|ref|ZP_02085573.1| hypothetical protein CLOBOL_03112 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03112 [Clostridium bolteae ATCC BAA-613] # 1 269 1 269 269 494 100.0 1e-138 MKKHLAVITICTLALANAVPAMPVYAASKKPAAAYSAKQRREGQKKIKDAASRIKSAFKK QDLNALADLCSFPLIISYASGELTELKSKSELLALGTGPVFTEGMKSAIASTDVSKLKEV GNAGVQMGGDAGLSLFKFGGKWKINNIYSDYNSQAGSDSGALKIGNLEEAATTVQKCFSY RDIETLSRICSYPVNVIYEDGTSAEYSDAASFISKCGGRLFTDRLCNSVTATDASGLQAV GNAGAQLGGDSGLSMYQFGGAWKVNNIYQ >gi|157101638|gb|DS480686.1| GENE 34 38767 - 39132 211 121 aa, chain + ## HITS:1 COG:no KEGG:Rumal_0129 NR:ns ## KEGG: Rumal_0129 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 2 116 3 117 120 101 40.0 9e-21 MQSFIPWLDRIMAGPLPEEIIAVNFNLYDDGDSCWSMEFTGTRSFDAYDPDWACDEVFTS RDSTLHWVQNTGWQQVLEDTVHTIRAYLANGTYSGRLKAYQGIGAGFVDGDIVILYETPK E >gi|157101638|gb|DS480686.1| GENE 35 39401 - 39775 310 124 aa, chain - ## HITS:1 COG:CAC2450 KEGG:ns NR:ns ## COG: CAC2450 COG2033 # Protein_GI_number: 15895715 # Func_class: C Energy production and conversion # Function: Desulfoferrodoxin # Organism: Clostridium acetobutylicum # 1 123 5 124 125 112 42.0 2e-25 MKFYICNHCGNIIAYVKSSGVPVVCCGEKMQELVPNTTDAAVEKHVPVIQIDGSKVTVTV GSAEHPMIPEHYIQWIALATRQGNQRKELQPGQKPQAEFMLCEGDEAEAAYAYCNLHGLW KAEP >gi|157101638|gb|DS480686.1| GENE 36 
40152 - 40631 383 159 aa, chain + ## HITS:1 COG:no KEGG:Bsph_4714 NR:ns ## KEGG: Bsph_4714 # Name: not_defined # Def: hypothetical protein # Organism: L.sphaericus # Pathway: not_defined # 1 158 1 165 165 103 38.0 3e-21 MKYENLTPVVGVITNISTQEGDCCTQVITLSVEGQPVNMILSDDTFVVDTMRLMPGMRVA AFYDNTLPVPLIYPPQYQADLIAVVRPEEDVNLSWFDETLTASDQSLKLNLYQSTVLSTL NGQPYFCFPANHFLLVYYAAATKSIPPQTAPNRVIVLCS >gi|157101638|gb|DS480686.1| GENE 37 40562 - 40708 75 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQNEKNYFMHSYNGLVKMKKNNSLQVYEHSTMTRLGAVCGGMLFVAAA >gi|157101638|gb|DS480686.1| GENE 38 41354 - 41938 409 194 aa, chain - ## HITS:1 COG:PA1878 KEGG:ns NR:ns ## COG: PA1878 COG1896 # Protein_GI_number: 15597075 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: Pseudomonas aeruginosa # 16 194 11 189 192 158 48.0 5e-39 MRIGSNKGSEVEAYTDFIKEAEGLKSTLRTAWTAEGRQESTAEHSWRLALFAGVMCREFP ELDREKVLMMCLVHDLGERYSGDISAALRPDAGDKLNQEREDVQRICGFLPKGEEGEVSG LWEEYSQGITPEARFVKALDKAETIIQHSQGRNPAGFDYGFNLEYGKEYFEQDERLEALR SLIDAETRTRMEIK >gi|157101638|gb|DS480686.1| GENE 39 41960 - 42427 621 155 aa, chain - ## HITS:1 COG:BS_yvaI KEGG:ns NR:ns ## COG: BS_yvaI COG0691 # Protein_GI_number: 16080413 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Bacillus subtilis # 1 150 1 150 156 161 56.0 5e-40 MGKESIKLVANNKKAYHDYFIDEKYEAGIELFGTEVKSIRMGKCSVKEAFVKIDRGEVYV CGMHISPYEKGNIFNKDPLRVRRLLLHKYEIMKLNGKIAEKGYTLVPLQVYFKGSLVKVE VGLARGKKLYDKRADIAKKDQRRELEKEFKVKNLY >gi|157101638|gb|DS480686.1| GENE 40 42476 - 44722 1735 748 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 15 720 9 687 730 672 49 0.0 MEKEILEQKAKMLEEVMNDKSYVPMKAKELAMLLGIPKSQRDELTQVLDYLVSEGRIGIS KKGKYGKPEVFSVNGIFCGHPKGFGFVTVEGMEQDVFIPEDRTGAALHGDRVQIVVESQD RGGGRRAEGSVLKVLEHANKEVVGYYQKSKGFGFVIPDNQKISKDIFIPQGCDMGAVTGH 
KVVARIKEFGDANHKPEGVVTEILGHVNDPGTDILSIVRAYGLPEEFPPEVMDEVEGCPD EVAVPGMTRDEETWDGPYGIGDLTSPADWTGDLAGRLDLRGLRTVTIDGEDAKDLDDAVT LCRNGQGGYILGVHIADVSHYVKEGRPLDKEALKRGTSVYLVDRVIPMLPHKLSNGICSL NAGTDRLALSCIMELDDQGNVLDHKIAETVIHVDRRMTYTAVNAIVTDGDEAVMAEYEGF VPMFMLMKEVSDILREKRKKRGAIDFDFPESKIILDAQGKPLEIKPYERNAATKIIEDFM LAANETVAEDYFWQSLPFLYRTHDNPDPEKMKQLGTFIHNFGYFIRLQQGEIHPKELQKL LDKIEGTPEEVLLSRLTLRSMKQAKYTTLCSGHFGLAARYYTHFTSPIRRYPDLQIHRII KESLKGGLGDKRAGHYEAILPGVAMQTSALERRAEEAERETDKLKKCEYMSRFIGQEFDG VVSGVTNWGLYVELPNTVEGLVRISELRDDYYIFDEQHYELVGEMTRKTFKLGQPIRVQV ASTDRLLRTVDFILPRDWDRSAGKGASV >gi|157101638|gb|DS480686.1| GENE 41 44845 - 45099 301 84 aa, chain - ## HITS:1 COG:no KEGG:Clole_2827 NR:ns ## KEGG: Clole_2827 # Name: not_defined # Def: preprotein translocase, SecG subunit # Organism: C.lentocellum # Pathway: Protein export [PATH:cle03060]; Bacterial secretion system [PATH:cle03070] # 1 79 1 79 135 70 43.0 3e-11 MSVIHVILSIIYVILGVAISVVILMQEGKSNGLGSAIGGISTDSYWSKNRGRSMEGALEH FTRYGAIAFMLITIALNVLLKVMG >gi|157101638|gb|DS480686.1| GENE 42 45228 - 46175 681 315 aa, chain - ## HITS:1 COG:BH3916 KEGG:ns NR:ns ## COG: BH3916 COG1482 # Protein_GI_number: 15616478 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Bacillus halodurans # 1 313 1 314 315 301 47.0 9e-82 MRELLFMEPVFKEAIWGGTRLKDVFGYDIPSSRTGECWAISAHKNGDCRIAGGTWKGQSL SSLWESHPEWFGAAAGCHKEFPLLVKIIDARNDLSIQVHPDNAYAGEHENGSLGKTECWY ILDCDPDATIVIGHHAGNKDEVKQMIEEKRWKDFIREIPVSKGDFFQINPGCVHAIKGGT LVLETQQSSDITYRVYDYDRLSGGKPRQLHIKESIDVIEAPFKEDQNAMPAVVKETDSGR KEHLVTCAYYTVDKIDMTGCWTENYGDSFANVSILDGEGSVNGIPVSKGQHFIVPAGFGD VIFEGSLSMICSQAV >gi|157101638|gb|DS480686.1| GENE 43 46482 - 47753 1398 423 aa, chain - ## HITS:1 COG:BS_eno KEGG:ns NR:ns ## COG: BS_eno COG0148 # Protein_GI_number: 16080443 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Bacillus subtilis # 10 421 7 413 430 513 64.0 1e-145 MVNELEIEAVTGREILDSRGNPTVEVEVCLTGGATGRAAVPSGASTGKFEAVELRDGGNR 
YNGLGVRGAVEHVNTVLSEAVLFESALDQRHIDRLLLEADGTENKGKLGANAILGVSMAV ARAAANGLRMPLYRYLGGIHASVLPVPMMNILNGGKHADNTVDLQEFMIMPFGACCFAER LRMCAEVYHTLKKLLKEEGYSTGVGDEGGFAPDLKDSEAVLEFLVRAMEKSGLKPGTDMK IAIDAASSELYDEASGMYLFPGESRMAGKEIRRSAQEMVDYYRRLVDAFPICSIEDGLQE EDWEGWKMLTKELGGRIQLVGDDLFVTNTKRLSKGIELGAGNAILVKVNQIGTLSESFEA IEMAKRAGFGTIISHRSGETEDSIIADIAVAVNAGQIKTGAPCRSDRVAKYNQLLRIEEN LEG >gi|157101638|gb|DS480686.1| GENE 44 47856 - 48815 1145 319 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 118 315 432 617 621 62 25.0 1e-09 MRKRGAYAVLAGAALFMACTLSFTAFAKEERTPVGKIKLNFTSDIQAGEIGGNVDVSLED GECSIESVDIVNEGDSWVGGDKPKVEIWLSADSDYYFKKSGKSAFSFSGDTVKYVSSSTK NDKEEMVLVVRLDKLDEDDEDLDVSGLMWDEDNGVAHWDDMSLAKNYKVRLCRRGGNSYE DGIGATYTVKENSYDFSGKFPKAGTYYFKVRAMDSRNNAGEWQESPYIEITEEDLTRVNG QWLRDDRGWWYQKGDGTYTSNGWQYINYKWYFFDQEGYMKTGWISWEDKLYYCDPSGAML VSAVTPDGFTVGADGARIN >gi|157101638|gb|DS480686.1| GENE 45 49005 - 50087 644 360 aa, chain + ## HITS:1 COG:lin0696 KEGG:ns NR:ns ## COG: lin0696 COG0463 # Protein_GI_number: 16799771 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Listeria innocua # 1 173 9 189 637 100 34.0 6e-21 MIVKNEEAVLERLLRTMSTVADQIVIMDTGSTDRTKEIAARYTDEVYDFPWTDDFAAARN AVCQKAVSDYWMWMDADDVMEPEQAARLKNLKETLDPSVDVVMMKYLTGFDENGRTSFSY YRERILKNHKGFQWQGRVHEAVPPAGNILYTDIEIQHRKEGAGDKDRNLRIYETMLSQGE PLEPRHQFYYARELYYHERYEDAVRVFRAFLQEPDGWVENKIDACLHLALCEEHLGHTEK AVEALLKSFLYDSPRGEACCELGRMKMKEEKYREAAYWYTQALSSRPAEQTGAFVRKDCY GFLPSIQLCVCYDRLGDHRRAWHYHLKSQKLKPEHPSVRQNQTYFEHKLKHPAEGQPSAL >gi|157101638|gb|DS480686.1| GENE 46 50199 - 50504 127 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938233|ref|ZP_02085588.1| ## NR: gi|160938233|ref|ZP_02085588.1| hypothetical protein CLOBOL_03129 [Clostridium bolteae ATCC BAA-613] hypothetical 
protein CLOBOL_03129 [Clostridium bolteae ATCC BAA-613] # 1 101 1 101 101 131 100.0 2e-29 MAAAPATMVSQVPEDTRKMIARTATAVAATAQPRQITDAQVTAGFQAAIFQAAGFRAAAQ DPEDHRGLPDQQALRDQWDPEDAPEHRALPDQLVPWGLRAM >gi|157101638|gb|DS480686.1| GENE 47 50429 - 51526 684 365 aa, chain + ## HITS:1 COG:no KEGG:Bcer98_2389 NR:ns ## KEGG: Bcer98_2389 # Name: not_defined # Def: triple helix repeat-containing collagen # Organism: B.cereus_NVH # Pathway: not_defined # 2 233 317 548 842 156 62.0 1e-36 MGPRGCPGAQGPTGPTGAMGPQGYVGPTGPTGPQGPRGFTGETGPQGPQGPMGLPGPQGI AGSTGATGSQGPQGPQGVTGPAGPQGIQGPQGATGPAGPQGPQGPQGPQGPQGSTGAAGP QGLTGATGPIGPTGPTGPQGIQGIQGPQGPQGATGVTGPQGIAGATGPIGVTGPTGATGA IGATGATGATGPTGITGPAGTTGATGATGLAGTTGPTGATGPTGAAGGIAAYGGMYRTTT TPISIGIASTQQVAFNNTMPLLNTTVAANAITVANTGIYEINWKLVSTASVAVALTLAIR RNSVPIASTTSTRTLSLGVESLFEGSVIFNLTAGDVIDMVLSSALAVTLTLSTGTNATLT VKQLS >gi|157101638|gb|DS480686.1| GENE 48 52344 - 53237 185 297 aa, chain + ## HITS:1 COG:no KEGG:BCE_4654 NR:ns ## KEGG: BCE_4654 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_ATCC10987 # Pathway: not_defined # 3 297 166 462 462 146 52.0 9e-34 MPGPTGPTGAMGPQGYVGPTGPTGSTGPQGIPGMTGATGAAGPTGSTGAAGAQGPAGVTG ATGAAGAQGPAGVTGATGAAGAQGPAGVTGATGAAGAQGPAGVTGATGATGAQGPAGATG STGTQGPAGAAGVTGATGPAGATGAAGIQGATGPTGATGPTGTSAISSAMSALNISASTI SVTTDGTNIPLPVQPYMDGFAVNATNTEFTVAQTGTYLISYDIKMTSGLPMSSRVTLNGA PITNSINTSSTSTNEYSVTFMQPLTAGDILALQLYGVNGPVSLQTGTGASLNIVKIA >gi|157101638|gb|DS480686.1| GENE 49 53837 - 54166 117 109 aa, chain + ## HITS:1 COG:no KEGG:BCA_2510 NR:ns ## KEGG: BCA_2510 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_03BB102 # Pathway: not_defined # 16 103 41 128 198 72 47.0 5e-12 MGAELILLFPIKSASTGPMGPAETAGAKDTAGPAGPAAATISMNAQNTAASPLAIIIGKI AIPLPNDQFLNCFTINAANTIFTFPVTGTYFITYHIKATDALLMFWHIL >gi|157101638|gb|DS480686.1| GENE 50 54179 - 54298 62 39 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938240|ref|ZP_02085595.1| ## NR: gi|160938240|ref|ZP_02085595.1| hypothetical protein CLOBOL_03136 [Clostridium 
bolteae ATCC BAA-613] hypothetical protein CLOBOL_03136 [Clostridium bolteae ATCC BAA-613] # 1 39 1 39 39 63 100.0 5e-09 MADTIDALAVAASQFSRLELELNLISYVWHNAITSGLKT >gi|157101638|gb|DS480686.1| GENE 51 54402 - 56348 1351 648 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c18950 NR:ns ## KEGG: CLJU_c18950 # Name: not_defined # Def: putative collagen triple helix repeat-containing protein # Organism: C.ljungdahlii # Pathway: not_defined # 39 458 199 615 800 196 51.0 2e-48 MYYNYSDNINPQPSNYDPSSAGCRLNNSGNDRNRCWCDPCNPCPPPGCCPQAGPTGPTGP MGPQGPMGPRGCPGAPGPAGPTGAMGPQGYVGPTGPTGPTGPQGFPGITGATGAAGPTGP QGPQGIPGPTGPRGVTGTTGAAETITIRNTVTSEPESPAAVVDITGSPDHILDFYLPRGA TGATGATGATGPAGDTGARGITGTTGATGATGAQGIAGITGPTGPQGPAGATGPTGPQGP AGATGPAGAAGAAGTTGATGATGATGATGITGATGVTGATGATGEAQTITIRNTTTSEPG AAASVTDITGGPNHVLDFTIPRGVTGPAGPQGGTGATGAAGPAGATGATGAAGPAGATGA TGATGAAGPAGPTGATGATGAIGATGATGATGNTGPEGPAGPTGATGATGAAGATGDTGA EGPAGATGATGDIGPTGATGDMGPEGPAGPTGETGPTGAAATITIGSVTTGDPGTEASVT NSGTSEDAVFDFVIPRGEPGGGGAPEVLATVDTTPQPTSANGALIFNETPLVSGTAITHT AGSPDVQINQPGIYQAFFTGTVLIDAGTTIPSTLTVQLTLNGVPITGAVARHTFTASNEE VTLSMNAPFPVTGAGTLEVVTSDAGYTFEDASLTVVRLGDSTNFRTAV >gi|157101638|gb|DS480686.1| GENE 52 56458 - 56826 523 122 aa, chain - ## HITS:1 COG:SP1567 KEGG:ns NR:ns ## COG: SP1567 COG0251 # Protein_GI_number: 15901410 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Streptococcus pneumoniae TIGR4 # 4 120 5 123 126 124 57.0 4e-29 MNILSTDKAPGAIGPYSQGYEVNGVIYTSGQIPVDPADGTVAEGIAAQAEQSCKNVGAIL EAAGTGFDKVFKTTCFLADMNDFGTFNEVYATYFVSKPARSCVAVKTLPKGVLCEIEAIA VK >gi|157101638|gb|DS480686.1| GENE 53 57142 - 57738 592 198 aa, chain + ## HITS:1 COG:AGl2275 KEGG:ns NR:ns ## COG: AGl2275 COG3760 # Protein_GI_number: 15891247 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 39 194 13 167 169 79 30.0 3e-15 
MTNLNDTGESIPAYHIDRTLYEGRPADTSGREDKEIRTYDLLDSLQVPFVRVDHDALPTI EACREVDQLLGTEICKNLFLRNAQKTDFYLLLMPGNKKFKTAVLSKQIGSARLSFGEAEF MESFLDIKPGSVSVMGLMNDREKRVRLLIDKDILEGEYFACHPCINTSSLKFTTRDLMDK ILPALGHPHTVVDLPYAE >gi|157101638|gb|DS480686.1| GENE 54 57841 - 58164 216 107 aa, chain - ## HITS:1 COG:no KEGG:Shel_25200 NR:ns ## KEGG: Shel_25200 # Name: not_defined # Def: glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.4) # Organism: S.heliotrinireducens # Pathway: not_defined # 1 107 50 156 156 142 71.0 5e-33 MDLENQKRVLNISEEHGPENVVVLLGAAEAEAAGLAAETVTAGDPTYAGPLAGVQLGLSV YHICEDEVKAETDPAVYEEQVGMMEMVMDVPAIHEEMEGIRKEYCKY >gi|157101638|gb|DS480686.1| GENE 55 58180 - 58311 267 43 aa, chain - ## HITS:1 COG:no KEGG:Amet_3592 NR:ns ## KEGG: Amet_3592 # Name: not_defined # Def: glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.3 1.21.4.4 1.21.4.2) # Organism: A.metalliredigens # Pathway: not_defined # 1 43 1 43 158 65 65.0 1e-09 MAILKDKYAIIIGDRDGVPGPAIEECAKTAGAKIAYSSTECFV >gi|157101638|gb|DS480686.1| GENE 56 58420 - 58737 420 105 aa, chain - ## HITS:1 COG:YPO3868 KEGG:ns NR:ns ## COG: YPO3868 COG0526 # Protein_GI_number: 16124003 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Yersinia pestis # 1 88 5 92 108 62 31.0 2e-10 MIDLTKENFEEEVLKAQGYVLVDFYGDGCVPCAALMPHIHKFEEVYGTTVKFCSLNTTKA RRLAIAQKVLGLPVIAIYKDGEMIDSRVKDDATPEGCEEMIRKYA >gi|157101638|gb|DS480686.1| GENE 57 59101 - 59949 691 282 aa, chain + ## HITS:1 COG:Cj1504c KEGG:ns NR:ns ## COG: Cj1504c COG0709 # Protein_GI_number: 15792818 # Func_class: E Amino acid transport and metabolism # Function: Selenophosphate synthase # Organism: Campylobacter jejuni # 1 271 32 297 308 181 40.0 1e-45 MDIFPPAVDDAYEYGQIAAANSLSDVWAMGGEARLCMNILMFPEHLPFETVQAILRGGYD KVAEAEAIVVGGHTIKDDIPKYGLCVSGVVHPDRILRNNSIQKGDVLILTKPLGTGVLTN ADRGGLLSDSEHKAMVDCMAALNKYAADAVRNLDGVHACTDVTGFSLLGHSYEMCGESGL 
TVIIDSSRVPLLPGADSYASMGLVPAVAYQNMNHLKDKVALPKDLPSHVHDLLFDPQTSG GLLYSVASQEAGRIMEALQKACQSPAVIGHVDGPSAHPVMVI >gi|157101638|gb|DS480686.1| GENE 58 60048 - 60908 709 286 aa, chain - ## HITS:1 COG:no KEGG:CLB_0653 NR:ns ## KEGG: CLB_0653 # Name: not_defined # Def: response regulator # Organism: C.botulinum_A_ATCC19397 # Pathway: Two-component system [PATH:cba02020] # 31 282 37 267 270 142 32.0 1e-32 MDCDRNSRTELKTMIEDGGMGTVAGIASRWEEAYLELQEMCPDMILADLTLSAPKTIAYI QKIREVLPQTSVVILSHVNDMDIIRMAYEKGAELLIHKPFNEIEVRNVLHNMEMSRAMQW MIVKARNDFSAGAFSWESNVRVRKMQPEELETDYNQSIRRLKSILQEIGILSEAGSKDII RIVRYLIEEDLDIRDITVRELCSRMGQNSKSVEQRIRRAASVGMANLAYRGMDDYADPVF NEYGARLYSLEQIKREMSYIRGKSDGHGNVRVRKFLSGLLVCCKDI >gi|157101638|gb|DS480686.1| GENE 59 60925 - 62061 1215 378 aa, chain - ## HITS:1 COG:Cgl1071 KEGG:ns NR:ns ## COG: Cgl1071 COG3949 # Protein_GI_number: 19552321 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Corynebacterium glutamicum # 11 366 4 340 400 102 25.0 1e-21 MDSDKKVNWGRVLILAGAVIAFTIGSGFATGQEIIQYYTAYGVQGIFTVLLFAVCFIYYN YNFAKAGAEEHFEKGNDVYKFYCGKYVGTFMDYYSTIFCYMSFFVMVGGAASTLNQQYGL PSWVGGVVLTALAILTVVGGLNSLVDKIGVVGPAIVVLCIGIGVVTLFRDAGNVGAGLEV IRTGAFAEAGSETIKNAGPNWVISALSYAGFVLLWFASFTAALGANNKKKDVEYGIYGGT IAVCVAIVLIMLAQVANINTAGLGNTGIFVWNADIPNLILAERIWKPFASFFAVVVFAGI YTTAVPLLYNPCGRFAKEGTSQFKILTVALGVIGLIVGLFLPFRVLVNIIYVLNGYVGAV LILFMLWKNIKDVFLKKK >gi|157101638|gb|DS480686.1| GENE 60 62166 - 62399 161 77 aa, chain - ## HITS:1 COG:no KEGG:Amet_3591 NR:ns ## KEGG: Amet_3591 # Name: not_defined # Def: selenoprotein B (EC:1.21.4.2) # Organism: A.metalliredigens # Pathway: not_defined # 1 76 360 435 436 103 68.0 3e-21 MVKGIEKYGIPVVHMATVVPISLTIGANRIIPGIGIPYPLGDPPQGEKDSYKLRLSMVRR ALKALQTPVDGQTVFEK >gi|157101638|gb|DS480686.1| GENE 61 62427 - 63476 1212 349 aa, chain - ## HITS:1 COG:no KEGG:Amico_1394 NR:ns ## KEGG: Amico_1394 # Name: not_defined # Def: selenoprotein B, glycine/betaine/sarcosine/D-proline reductase family # Organism: A.colombiense # Pathway: 
not_defined # 4 349 3 348 348 431 61.0 1e-119 MAKYKIVHYINQFFAGIGGEEKADYTPELREGVVGPGMGLKAALGEDYEIVSTIICGDNY FGENLDAATDTIIEMVKKCEPDVFVAGPAFNAGRYGVACGTICKAVEERLGIPVITGMYI ENPGVDMFRKDLIIVDTPNSAAGMPKVLPVMSALIKKMAAGEEILGPKEEGYIERGIRVN YFAEKRGSERALEMLVKKLKGEEYGTDLPMPKFDRVAPAEPVKDIKHAKIAVVTSGGIVP TGNPDHIESSNATKWGKYDITGMDRLSADEFTTIHGGYDRQFAMANPNVVVPLDALRELE KAGEFGELVNYFCTTTGTGTATASAAKFGSEIGQMFIDEHVDAVILVST >gi|157101638|gb|DS480686.1| GENE 62 63500 - 64786 1440 428 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1637 NR:ns ## KEGG: CDR20291_1637 # Name: grdG # Def: sarcosine reductase complex component B subunit alpha # Organism: C.difficile_R20291 # Pathway: not_defined # 1 428 1 428 428 647 70.0 0 MKLELGKIFIKDIRFDDKTHVKDGVLYVNKEEVEKLVLEDDKLTGCHVDIAKPGESVRIT PVKDVIEPRVKVSGGEIFPGVIGKVSPQVGTGRTHALEGCCVVTVGKIVGFQEGVIDMSG PAADYCPFSQTYNLCVVVEPADGLETHVYEKAARMAGLKVATYVGEAARELEPDEIVTYE TKPIFEQAAMYPDLPKVGYIHMLQSQGLLHDTYYYGVDAKQFIPTFMYPTEIMDGAIVSG NCVAPCDKVTTYHHFHNPVIEDCYKHHGKDINFMGVILTNENVFLADKERHSDMVAKLCN WMGLDGVLITEEGYGNPDTDLMMNCRKVERAGTKVVLITDEFPGKDGKSQSLADVCEEAD ALASCGQGNATLQFPAMDKVIGTQDFIEMQIGGWDGCKNPDGSFEAELQIIIASTIANGF NKLAARGY >gi|157101638|gb|DS480686.1| GENE 63 65064 - 65744 788 226 aa, chain - ## HITS:1 COG:FN1803 KEGG:ns NR:ns ## COG: FN1803 COG1309 # Protein_GI_number: 19705108 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 18 189 14 186 217 88 30.0 8e-18 MKKNVGTSYFKDLSRKDYVVQASRIIKKEGVEAISIRRIAAELGCSSASMYRYFQNLDEL LFYAQLDALNEYILDLSKCEKEWNDIWDVHFGIWRSYAKEAFKKPQAFEAVFYRNINRDL GEALKEYYEMFPDAIIQVSPFIKEMLEIPSYYQRDYYICKLLVAEGKISEENAKKLNHII CTLFLGYFKFVQEHGISQDEIPELVNQFIQESKEITQCYAYDFTPK >gi|157101638|gb|DS480686.1| GENE 64 65775 - 66161 512 128 aa, chain - ## HITS:1 COG:no KEGG:Amet_3596 NR:ns ## KEGG: Amet_3596 # Name: not_defined # Def: GrdX protein # Organism: A.metalliredigens # Pathway: not_defined # 6 128 2 124 127 91 37.0 1e-17 MFDLGKCILVTNNDRAAEKWGENVEQVVMVETYEEVLLKTRDLIHTNHKLLTHPQASSLK 
PNQTPYRTILLYSETGKSDAGDIRLIEEAIEAFNKWTAIKKVPKYDEKIAYDYKTIDLSM IENVIPKL >gi|157101638|gb|DS480686.1| GENE 65 66397 - 67350 1069 317 aa, chain - ## HITS:1 COG:no KEGG:Closa_3219 NR:ns ## KEGG: Closa_3219 # Name: not_defined # Def: PpiC-type peptidyl-prolyl cis-trans isomerase # Organism: C.saccharolyticum # Pathway: not_defined # 5 316 13 322 323 219 41.0 9e-56 MGLCLTAVLAMAVLTGCSRKAAATTGSRTVDKEYTKGQMMVIAITERNRYQNIYTSELWS VKADENGDTFEDKLMDQVEQFLIELAATNLMADEQGIELTSQEKDSLKSLAQEYYRNLSE QDRRFMDVSEDEVYDLYCQYYRADKLVAELTKNENPEVSDAEAKVIGIQQIELDSRAEAE NVLALAQAEKADFGAIAAKYSKDSRTDRTLEWKKDMDGLERAAFELEQDQVSGILEQGGR YYILKCVNAYDEEATAARKSRLAQEKKTKAFLGIYEPYVKEHIVKLKKLPGDVVEFSGGE GCTADNFFQLYHGYFSK >gi|157101638|gb|DS480686.1| GENE 66 67415 - 68200 1028 261 aa, chain - ## HITS:1 COG:CAC0522 KEGG:ns NR:ns ## COG: CAC0522 COG0561 # Protein_GI_number: 15893812 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 1 258 1 257 265 137 31.0 3e-32 MIKLIVSDVDGTLVPDGSPDLDPEVFDIILKLREKGMQFVVASGRPWASVESAFEPVKKK IFYVANNGAYVGCHGRCLYAYTMERELAHRIIRKVRMHPELEMVYAGVNGDYLDSKDDTL CDWLTNGYKFNVIRVKDVLELEEPCVKISIYKKEGIEAATRDIYDEFKDQAKMACAGDMW MDCMAKDVNKGKAVRTIQESLGIKMEETMAFGDQLNDIEMLNQAYYSFAVANAREEVRKA ARFQADSNVRGGVLKILKGLL >gi|157101638|gb|DS480686.1| GENE 67 68193 - 68801 655 202 aa, chain - ## HITS:1 COG:BH3033 KEGG:ns NR:ns ## COG: BH3033 COG0424 # Protein_GI_number: 15615595 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus halodurans # 10 193 4 182 190 140 44.0 2e-33 MRQTWGGYQVVLASASPRRKELMAQIGLEPEIRPSRMEEETREKRPDRVVMELSRQKAED VASGCPGGTMVIGADTVVSVGNEILGKPGTPMRAYEMLEKIQGRTHQVYTGVTVLLCQGE DRCHGITFAERTDVHVYPMTCGEMKEYAQCGEPLDKAGAYGIQGRFAAYIKGIDGDYANV VGLPVGRLYQEIKRLLEDREDD >gi|157101638|gb|DS480686.1| GENE 68 68918 - 69472 443 184 aa, chain - ## HITS:1 COG:lin2818 KEGG:ns NR:ns ## COG: 
lin2818 COG4905 # Protein_GI_number: 16801879 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 1 128 1 127 270 100 41.0 1e-21 MNIYDFINWFFMYSFLGYLLECIVLSAEYKRPVVDRGFGHGPFCVIYGFGALGACMLLRP VSGSPMELYTASMAMATTMELVTANIMVRLFGSFWWDYSHKPFNYKGMVCLESSLAWGLL GLFFFYFLDIRVRAAVISLPGHVGTVLAVGLTMFYLVDFIHCFRLRMGGMEEEDEPAVGR LKIY >gi|157101638|gb|DS480686.1| GENE 69 69517 - 69975 545 152 aa, chain - ## HITS:1 COG:L26878 KEGG:ns NR:ns ## COG: L26878 COG1963 # Protein_GI_number: 15672981 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 1 149 1 147 147 118 43.0 4e-27 MNVWTQMLGNQLLMSAVTGWVVAQFLKTLIDFALNKNFNAERLVGSGGMPSSHSATVCGL TTAALLKYGAGSFEFAVSFVLSMIVMYDAIGVRRETGKQAKLLNSILSENPLKLNAEVLQ EKLKEYVGHTPLQVLAGAILGIGLALALNSYY >gi|157101638|gb|DS480686.1| GENE 70 70036 - 70548 326 170 aa, chain - ## HITS:1 COG:CAC1667 KEGG:ns NR:ns ## COG: CAC1667 COG1418 # Protein_GI_number: 15894944 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 13 150 20 155 173 151 50.0 5e-37 MEPSMVSEITEDQEYMECVKDILCHPVFQSMDQYIQHGTTTCKTHCIQVSYLGYKLCKQL GGNWRSAARAGLLHDLFLYDWHTHAKETGQYFHGFTHPKAALENADRYFRLTGEERMIIL RHMWPLTLIPPTSKAGYAVTCADKYCSTVETTARFKHWIRMCLFAQPARR >gi|157101638|gb|DS480686.1| GENE 71 70607 - 72052 1498 481 aa, chain - ## HITS:1 COG:CAC2239 KEGG:ns NR:ns ## COG: CAC2239 COG0297 # Protein_GI_number: 15895507 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Clostridium acetobutylicum # 3 475 2 473 477 477 51.0 1e-134 MKKILFAASEAVPFIKTGGLADVVGSLPKCFDKEYFDVRVIIPKYLCIKQEWRDKMKYVH HFYMDYLGQSRYVGILQYVHEGITFYFIDNESYFNGAKPYGDWYWDLEKFCFFCRAALSA LPVIGFQPDVVHCHDWQTGLIPVLLKDKFREGEFFCNMKSVITIHNLKFQGVWDVKTIKR FTELPDYYFTPDKLEAYKDGNLLKGGIVFADAITTVSSTYAEEIKLPFYGEGLDGLMRSR AGSLRGIVNGIDYDEFNPETDTYIAQNYNAKNFRKEKIKNKKALQQELGLPVDEKKFMVG 
IVSRLTDQKGFDLIQCVMDELCSDDLQLVVLGTGEERYENMFRHYDWKYHDRVSAQIYYS EAMSHKIYAASDAFLMPSLFEPCGLSQLMALRYGTVPIVRETGGLKDTVEPYNEYEGKGT GFSFANYNAHEMLGSVRYAEYVYSSKRREWNKIIDRAMAKDYSWWTSAAKYQELYDWLIG Y >gi|157101638|gb|DS480686.1| GENE 72 72150 - 72968 789 272 aa, chain - ## HITS:1 COG:BH2773_1 KEGG:ns NR:ns ## COG: BH2773_1 COG0784 # Protein_GI_number: 15615336 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Bacillus halodurans # 1 121 1 120 120 104 41.0 2e-22 MEKLNVAIVDDNPLILNTLDELINAEDGLSVIGKADNGADAINMIVDTTPDIVLLDLVMP KIDGISVVEKVKSEHTFLKNPAFIILSAVGGEQMTEDAFKAGANYYLMKPFDKEILVNKI RHIGKLPNKQPAGKVITAPFEPGEESHITREEYMKEHLETDITKMLHELGIPAHIKGYQY LRDAIAMSVEDQEMMSSVTKILYPAIAKRNQTTASRVERAIRHAIEVAWGRGKMETIDEV FGYTISTGKGKPTNSEFIALISDKILLEYKKI >gi|157101638|gb|DS480686.1| GENE 73 73543 - 74820 1393 425 aa, chain - ## HITS:1 COG:CAC2072 KEGG:ns NR:ns ## COG: CAC2072 COG0750 # Protein_GI_number: 15895342 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Clostridium acetobutylicum # 102 424 65 390 395 220 41.0 4e-57 MKGKSIFYRWLGRIFWLSAVAVTGYTWYYMDQAVPDRVSIIENEEEEFSFGLPLKATISS DSQEVALSNESNIPADQITISGRDHFSMFGGEKGSYQVGLKLFGVIKLKDIQVDVVDTRY AIPCGSPIGIYLKSDGVMVIGTGRITGSDGMEVEPAYGILKSGDYIEAFNGKPMNTKEDL IRAVSESGGQDCVLQVRREGEDVDISLKPVQGADGQYKLGAWVRDDTQGIGTITYVDMNG RFGALGHGISDSDTGDLVESSDGALYSTEIMGIEKGSMGKPGLLSGVIYYGPQSHMGDIE KNTNEGIFGTVNQQFKKQISGEPMEIACRQDVKQGPAYIRSNISGELKDYAIEIQKVDYN SGHKNKSMVIKVTDPELLELTGGIVQGMSGSPIIQDGKLAGAVTHVFVQDASRGYGILIE NMMEH >gi|157101638|gb|DS480686.1| GENE 74 74973 - 76631 1585 552 aa, chain - ## HITS:1 COG:BH2776 KEGG:ns NR:ns ## COG: BH2776 COG0497 # Protein_GI_number: 15615339 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Bacillus halodurans # 1 552 1 562 565 343 37.0 7e-94 MLLELHVKNLALIEKADVEFGEGLNILTGETGAGKSIIIGSVTMALGGKAPKGSIRPGAD 
YAYIELVFSVTGEEKRKALRELDVEPTEDGLVIISRKLTSARSISRINDETVTMARLSQI TGLLLDIHGQHEHQSLLYKSKHLEILDAYVKAATQPVKQTIADRYRIYRSLEEKLRGFDL DAESRIREADFLRFEIEEIEASALKEGEEEELTSVYRRYSHSRRIAECLGAAYEAVEGDW LARALKEVEQASEYDESLGGVRDQLYDADSILRDAGREMSAYLDSMEMDEETFRKTEERL DLIHNLQAKYGPTVEAIFQKLEQKKKRLGELEDYDAHKKRMEQELEECRNGLEKLCTQLT GIRKKASRTLVKKIRQGLVDLNFLDVEFDMEFEKLDHFTPSGWDGAQFLISTNPGQPMRP LMDVASGGELSRIMLAIKTVLADSDDIPTLIFDEIDTGISGRTAQKVSEKLMLIARSHQV ICITHLPQIAAMADSHFEIAKSASQGRTITTIRLLDRQASVEELARLLGGARITEAVLKN AGEMKELADRTK >gi|157101638|gb|DS480686.1| GENE 75 76649 - 77101 605 150 aa, chain - ## HITS:1 COG:CAC2074 KEGG:ns NR:ns ## COG: CAC2074 COG1438 # Protein_GI_number: 15895344 # Func_class: K Transcription # Function: Arginine repressor # Organism: Clostridium acetobutylicum # 1 150 1 150 150 115 44.0 4e-26 MKVERHSKIVELIGKYEIETQEELAERLNEAGFNVTQATVSRDIRELKLTKMQSESGRQR YMVLESPRGTSAIKYIRILKDGYMSMDMAQNILVIKTVSGMAMAVAAALDAIQFHEIVGC IAGDDTIMCAIRSVDDTIIVMEKIKKMVED >gi|157101638|gb|DS480686.1| GENE 76 77153 - 77995 833 280 aa, chain - ## HITS:1 COG:NMA1017 KEGG:ns NR:ns ## COG: NMA1017 COG0061 # Protein_GI_number: 15793973 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Neisseria meningitidis Z2491 # 40 253 52 265 296 147 35.0 2e-35 MKHFYIIANLDKEYVQEAQVFIKAYLEAKGAECLLHHTPERTRGAAHIDGAQVPEDTECV ITIGGDGTLIQAARDLAGRNIPMLGVNRGHLGYLNQVSRQEDIAPVLESLLNERYQLERR MMIHGTAWRREETLLKDIALNEIAITRKDPLKVLRYSVYVNDEYLNEYAADGVLVATPTG STAYNLSAGGPVIAPGARMMVLTPICSHSLNARSIVLAPEDRVRIKVLNSGQVVSFDGDT SMELKAGDCIDIRCSELQTVMIKVKQISFMQNLSNHLGGI >gi|157101638|gb|DS480686.1| GENE 77 78057 - 78989 916 310 aa, chain - ## HITS:1 COG:CAC2076 KEGG:ns NR:ns ## COG: CAC2076 COG1189 # Protein_GI_number: 15895346 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase # Organism: Clostridium acetobutylicum # 3 241 5 243 267 309 62.0 4e-84 MKKERLDVLMVQRNLAESREKAKALIMSGIVYVNGQKEDKAGTSFEETVQIEVRGNTLKY 
VSRGGLKLEKAMSRFGVQLAGKVCMDVGASTGGFTDCMLQNGAVKVYAVDVGHGQLAWKL RNDDRVICMEKTNIRYVTPEDIGDRIEFASIDVSFISLTKVLGPVKQLLTDEGQVVCLIK PQFEAGREKVGKKGVVREKSVHLEVIEMVMDYARSIGFGILGLEFSPIKGPEGNIEYLLY LQNILQENGGQEDNVQKITGEKTTDQEEMNQSKTEHAEMKQTENHQAKRNQTEYELSARA IVEQAHGCLD >gi|157101638|gb|DS480686.1| GENE 78 79073 - 80947 2056 624 aa, chain - ## HITS:1 COG:CAC2077 KEGG:ns NR:ns ## COG: CAC2077 COG1154 # Protein_GI_number: 15895347 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Clostridium acetobutylicum # 2 612 4 613 619 657 52.0 0 MILEQINGPEELKALPPEDLKILAQEIRTFLIEKISHTGGHLASNLGVVELTIALFKTLN LPEDKVIWDVGHQSYTHKILSGRMAEFDELRQYGGLSGFPKRKESPYDSFDTGHSSTSIS AGLGIALGRDIRGEDYRVVSVIGDGALTGGMAYEALNNAARMKKNFIIVLNDNKMSISEN VGGMSRYLNGLRTGSGYNDLKKSVADALDRIPVVGTAMIDKIKRTKNSIKQLFIPGMLFE NMGITYLGPVDGHNIQALCKVLREAQKLDHAVLVHVMTKKGKGYRPAEKNPSYFHGVGPF DIKTGQSLSSQKNPSYTDVFSRKLCQLGQEHPELVAVTAAMPDGTGLAAFGKKFPDRFFD VGIAEAHAVTSAAGMAAAGLRPVVAVYSSFLQRAYDQVLHDVCIQNLPVLFAVDRAGLVG SDGETHQGIFDYSFLTSIPNMSVMAPKNLWELRAMLDFAMDYNSPLAVRYPRGEAYRGLK EFRQPISYGVGEILYEEEDIALLAVGSMVSTGEHVRQKLKAEGYRCSLANGRFVKPFDRK MVSRLAKNHRLIVTMEENVLQGGYGLAVTAFIHENYPEVKVLNIAIPDAYVEHGNVSILR EGLGIDSDSIIRTMKAGGYIGKKD >gi|157101638|gb|DS480686.1| GENE 79 80967 - 81878 1078 303 aa, chain - ## HITS:1 COG:CAC2080 KEGG:ns NR:ns ## COG: CAC2080 COG0142 # Protein_GI_number: 15895350 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Clostridium acetobutylicum # 17 303 9 289 289 206 39.0 3e-53 MSDKETFKQELDARTAEIEQLIGTYLPEETGHQKTIFEAMNYSMKAGGKRLRPMLMQEMC RLFTGTLLEAVIPFMAAVEMIHTSSLVHDDLPCMDDDMMRRGKASTWAEYGEDIGVLTGD ALMMYAFETAANAFETSIDPDELSRIGRAMGILARKTGVYGMIGGQTVDVELAGGPIPGD KLEFIYRLKTGALIEASMLIGAVLGGAAEEDCKIVESLAAKIGMAFQIQDDILDVTGSQD VTGKPSGSDEKNKKTTYVTLEGLDKAKKDVEQISTEAIEELNKLPGNNDFLEQLIRALVG RQK >gi|157101638|gb|DS480686.1| GENE 80 81871 - 82104 314 77 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|160938275|ref|ZP_02085630.1| ## NR: gi|160938275|ref|ZP_02085630.1| hypothetical protein CLOBOL_03171 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03171 [Clostridium bolteae ATCC BAA-613] # 1 77 1 77 77 99 100.0 1e-19 MPRKKTDTQEKEENKASIEENFAQLEEIIKKLEAGDSTLEESFTCYEAGMKLVKYLSGQI DNVEKQIQILSEGDENE >gi|157101638|gb|DS480686.1| GENE 81 82108 - 83394 1278 428 aa, chain - ## HITS:1 COG:SA1354 KEGG:ns NR:ns ## COG: SA1354 COG1570 # Protein_GI_number: 15927104 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Staphylococcus aureus N315 # 1 417 1 439 445 302 38.0 8e-82 MASVYTVSQVTAYIRNMFTQDFALNRISIRGEVSNCKYHTSGHIYFTLKDGGAQIAAVMF AGQRKGLDFELREGQEVTVSGTVDVYERDGRYQLYAKEITREGKGDLFRQFEKLRNELEE MGMFDSCYKQPIPKYARKVGIVTAGTGAAIRDIMNISARRNPYVQLILYPALVQGEQAKY SIAKGIETLDRMGLDVLIVGRGGGSIEDLWAFNEEMVARAIFNCTTPVISAVGHETDVTI ADYVADLRAPTPSAAAELAVFDYGQFVEQVNLYRQVLERSMERRLEKLHFRLDQCGMRLK LLSPEKQLNDRRQRLADMESRLERMMEENLGLSRRKLEDRRRRLEARMALGLTEGKHRLA LLSGRLDGLSPLKKLGGGFGFVTDARGRAFTSITQAEPGDIIRVSVKDGRVDARVVETES MKLPGSEG >gi|157101638|gb|DS480686.1| GENE 82 83404 - 84039 690 211 aa, chain - ## HITS:1 COG:BH2785 KEGG:ns NR:ns ## COG: BH2785 COG0781 # Protein_GI_number: 15615348 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Bacillus halodurans # 46 134 36 124 134 73 41.0 3e-13 MLFSADFYPTQEEAIAQLGQYFQSPEEDDVDESGILQVLHKVELKEADSEYLQARTANIM EKIPEIDEKLNQAAAGWKTKRMGKVELTILRLALYEMLHDDAIPEKVSINEAVELAKKFG GNDSPSFVNGILAKFVVKGARAAAEAKAPEENRVCGETQEPRETQASQEIQASQEIRASQ EPQETPVPQEPQEIQASQEPQEIPENKESEA >gi|157101638|gb|DS480686.1| GENE 83 84295 - 84678 581 127 aa, chain - ## HITS:1 COG:lin1395 KEGG:ns NR:ns ## COG: lin1395 COG1302 # Protein_GI_number: 16800463 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 17 125 15 124 135 87 43.0 7e-18 MSEENRSTHKLYEKEKIGEVQIADEVVAIIAGLAATEVEGVDSMAGNITNELVGKLGMKN 
LSKGVKVDVTEEHVSVDLSLNIRYGYSIPSVSEQVQEKVSTAIENMTGLTVLDVNVKIAG VNMDEGR >gi|157101638|gb|DS480686.1| GENE 84 84845 - 86056 1282 403 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1366 NR:ns ## KEGG: Lebu_1366 # Name: not_defined # Def: VWA containing CoxE family protein # Organism: L.buccalis # Pathway: not_defined # 1 398 1 391 393 531 63.0 1e-149 MELETRMKRWRLILGEESSPQFGKMGGVELTDEQLLMDQALASIYNQTDSGGFGDGAGKK VPGGRGAGNGPSAPVLSKWLGDVRTLFDKELVTIIQSDAMERCGLKQLLFEPELLENLEP DMNLASMILALKDQIPGRSKDQVRAFISRIVEEINRLLADDIRRAVTAAVDRRRHSPIPS AAALDYKDTIRRNLKNYNPELKRLVPEHFYFYDRTTSNAANKYTVILDVDQSGSMGESVI YSSVISCILASIASVKTRIVAFDTKITDLTEQCEDPVDLLFGFQLGGGTDIEKSVAYCQQ FMENPGKTLFFLVSDLMEGGNRAGLLRRIREMKESGVTVVCLLAIADGGKPYYDEQIAGR IASMDVPCFACNPQKMPELLERALKGQDLNAFQKELSRSSNAS >gi|157101638|gb|DS480686.1| GENE 85 86071 - 88341 2126 756 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1367 NR:ns ## KEGG: Lebu_1367 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 2 756 7 759 759 721 46.0 0 MGHPNCFGIRHLSPAGAYHLRSFLDEKQPDLILVEGPSDFNGLMDDMVREETKPPFAVMA FTKDSPIRTVLYPFAEYSPEYQAIVWAKEHGAQCRFMDLPSDVFLGIRRAGEGQASPEHT SGSASEHVYRLLDRFNGEDGHETFWERVMEHSENHEAYRDGADAFGRQLRELTAGGDSDW PEILVREAYMRRKLENALKEGFQAERIVAVTGAYHVAGLAGEEPAMTDAQLGLLPSVPCN VTLMPYSYYRLSTRSGYGAGNQAPAYYGMLWQALLEGDRDMGTYRYLTAISGYQRQHGFP VSSAGVIEAVELARSLACLKGYSVPSLRDLRDAAVTAMGHGSLPELALAIADTEIGTAVG ELPQGVSRTCLQDDFYRQLRELKLEKYKTETAFTLNLDLREKLNVKSEAAAYRELRQSFF LHRLRVLGVHFAVLRKEQQEAATWAESWDIRWTPEVEIELVESALKGDTVAGAASFAMKE QAVKAEQIGEAADIIEAACCCGIPEMVAYATDILQGLSVEAAGFMELAGTAQSISAVVRF GNIRHLDSTPLLPVLNQMFLRACLVFLNSCTCGSQAEKGVMEAMDQLNSLCLHHDFLDEE RFIRLLAETASRDDLNTGISGFAAAILLERGRMEEEELSRQVRRRLSRGIPAELGAGWFA GLSKKNRYALIARLDLWRELSAYMDTLDDQEFKRALVFLRRAFAVFSSREKLDVAENLGE IWQVNTAQAGEILNGPLKGEEMELLDSLDGFDFDDI >gi|157101638|gb|DS480686.1| GENE 86 88342 - 89445 1008 367 aa, chain - ## HITS:1 COG:ECs2927 KEGG:ns NR:ns ## COG: ECs2927 COG0714 # Protein_GI_number: 15832181 # Func_class: R General function prediction only # Function: 
MoxR-like ATPases # Organism: Escherichia coli O157:H7 # 11 360 30 378 384 246 40.0 4e-65 MGKIAGKEESVQRLPAEQLYQEEIDALIAAEQDPVPTGWRMSPRSVLTYITGGTVGGKEI TPKYIGNRRLVEIAIATLVTDRALLLIGEPGTAKSWLSEHLTAAINGDSTKVIQGTAGTT EEHIRYSWNYAMLIAQGPSREAMIKSPVYRAMETGSIARFEEISRCASEVQDALISILSE KRISIPELALELPAQKGFSVIATANTRDKGVNEMSAALKRRFNIVILPPPSDMSTEMEIV KSRVEQLAGSLELRAGIPHDEVVEKVCTIFRELRGGMTLDGRQKVKPSSGVLSTAEAISL LAGSMALAGSFGNGEITDYDLASALQGAVVKDEDKDGLAWKEYLENVMKKRGSRWLGLYK ECKELNQ >gi|157101638|gb|DS480686.1| GENE 87 89490 - 91436 1716 648 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1369 NR:ns ## KEGG: Lebu_1369 # Name: not_defined # Def: hypothetical protein # Organism: L.buccalis # Pathway: not_defined # 1 336 1 321 638 174 35.0 2e-41 MELQGLYELHERLGAAAVAGVNLIGDDFRLRRAVEQIRPLAQAVPVIKKLYTMAENVMAP DCEDRPGCLLDALALSEALLCTQAGYETGYHIGQDSGDALTRPEPEAGAEAGPRPLELIS RSYAPCLPYSQVHPLEQALTESGGGRLVPITEAMEEHPLVFEDYRLQAAVITALSDRYAE IADAAEKFLSGKDGQIVPLVKRGFWETTDNGRIHRLRVIESICGGAENEFYLGLLKRAKK ELRAEAIHALRFNTENTGVLLDLARTEKGLCLEMTEQVLGMMEQEETDAYWEEQFKKKGR EIVGYLRFSKSDLVSDRLAGIIEETLNQKETMSKKEFDGLMSQLLTALPGKGSGAMQEVY RRAARMNPAGPEFWLKFPAMLIDSIIYSQDERLISLAGELAGEQERSWLGPALAADFLTK PAAFVYESYCRKIPRDSLLGKEDKRKTRTAVLAVLGRIHCIREDGECEILCRVDEKSGTA HRIGRRLYEKPDSRWMDMLMDARIFGNDTAAGLDAAGRQRETDRDKILFSMLPAAKRAEK GSYFYKRALTVTDNRELYGILVQCGWKDFKGLISTYVKKNPNAGISLWSVQHQMNQLPMT EEERQTELREIDRILCGFPKGSSERRSWDANDYCRKAINEAAGLPEQG >gi|157101638|gb|DS480686.1| GENE 88 91467 - 92918 1215 483 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1370 NR:ns ## KEGG: Lebu_1370 # Name: not_defined # Def: zinc finger SWIM domain protein # Organism: L.buccalis # Pathway: not_defined # 1 481 1 477 479 373 44.0 1e-101 MREIPEQQIAAMAPNANAVNNARKIVKSGGFTEHARSEDDTFYMGSCKGSGSSAYRVTLD FADNDAPVCRCSCPSRQFPCKHGLALMMELSQGRHWDICDIPQDILDKRAKKDVRAGKKE SQGDRPRAPKKTNQAARTKKLKKQLEGLDLAEKVVRELVEAGLGTLNGTSVKVYQDLAKQ LGDYYLPGPQRLVQSLVLEVSRLKEDGKGDYKEAVRILVFLRALIKKSRAYLTEKLEQGQ VEDDASVLYEELGGIWNLDRLIELGLKKENVRLLQLSFHVSYDGARRELVDKGWWLDLDS 
GEIDITYNYRPVKALKYVKEEDSVTEALRVPMLVYYPGENGRRIRWEGQEFLPVTGELLA EARDKASSSLPELVKAAKNELKNTLSDGRFGCLVSYDGLGTVGTEGVKGEQREYIMYCGN STIELKDRPGDRPSVCRLDILPDRQMAGNQVMFGVLWYDRSRHRICLHPYSIITAKEVVR LLY >gi|157101638|gb|DS480686.1| GENE 89 93095 - 93952 1193 285 aa, chain - ## HITS:1 COG:no KEGG:Closa_3245 NR:ns ## KEGG: Closa_3245 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 285 5 262 262 259 62.0 1e-67 MKIKNMKQVKEHMKDMNMKRLFRRNQIIITTLAVMIAAAGYLNYAGKKDLASGNRVYEAG AMDISDEDILAENQAASLNGQAGLQEIPSLDQDPSDLDAVSDADGAQAAADTGADQGQLA QAGDGTGVEEATQAADPGQMAAADTDSSAQAGLENPGEAVLTSGMNVADYIANVQLSREQ VRAKNKETLMSLINSDSIDEAAKQQAIQDMIDLTEVSEKENAAETLLMAKGFSDPVVSIT KDKVDVVINAPSITDPQRAQIEDIVKRKAEVGADQIIITLLNMAE >gi|157101638|gb|DS480686.1| GENE 90 93970 - 94722 710 250 aa, chain - ## HITS:1 COG:no KEGG:Closa_3246 NR:ns ## KEGG: Closa_3246 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 9 250 2 201 201 179 48.0 1e-43 MGWKNRISRFNGKMTKEKWLLLLLIGVLLMILSFPLPGGSKKDKNTGGQIAGAGTEGAGA VPLWTKDNTNGSGEDSGFGQGSSQNGMAQGPDGNGAGASDGREDSYPAAARTSEDGSYEA QMENRIRNILKSVDGVGKVDVMVVLKSSSEKVLRVDRSTNTSTTQEKDSGGGTRDVTNNQ IQENTILAGSGSGSSTNAPIVEKELSPEISGIIISAEGGGSPTVKAEISEAMEALFGLPA HKIKVLKRVE >gi|157101638|gb|DS480686.1| GENE 91 94676 - 95380 693 234 aa, chain - ## HITS:1 COG:no KEGG:Closa_3247 NR:ns ## KEGG: Closa_3247 # Name: not_defined # Def: Sporulation stage III protein AF # Organism: C.saccharolyticum # Pathway: not_defined # 1 233 4 227 230 157 38.0 4e-37 MEAVYGWVKNIIYYMIFLSVVNNLLADSKYGKYIRFFSGMVLILLVVSPFTGGLHLDEQI SSMFKSISFQNDTDDLKQDLWGMEERRLDQVIREYEQAVAADVEAMARAEGLECTGASVQ IDGDRNSSRYGQIEGIGLVISGKKADTELGSWDYMGSRPVNVDAGRIDSVKVEDVELGEL PETGGVENNGDTPEQGEGEAGGMAADKKINHLTGKVAQYYGLEESDIQIQWKND >gi|157101638|gb|DS480686.1| GENE 92 95435 - 96718 1436 427 aa, chain - ## HITS:1 COG:no KEGG:Closa_3248 NR:ns ## KEGG: Closa_3248 # Name: not_defined # Def: Sporulation stage III protein AE # Organism: C.saccharolyticum # 
Pathway: not_defined # 53 425 23 389 391 327 52.0 5e-88 MKRWKKWGVRSAALWGLAVLFAGVLGWAGGPVYGASAGGQGTGIGGEQAPGRGWDTSGAQ GAETGADQALQDMGLQDEMEGLQQFLDDVMGQQDGGMEGLSFWGLMKELMKGNLKGILGQ TGMGLKNALFSQVDRGSQMLFQVAAIGLIGAVFTNVSSVFKGGQISDTGFFVTYLLLFTC LAASFSASLQVAAQVMEQILEFMKLLMPAYYMAVAFSGGSMSALALYECMLGAVTGVQWL CSTVLISMVRIYVLMVLGSHVAKEALLSKLTELLEQAVVWSLRTLTGLVVGFHLIQAMIL PYADAAGQAGMKRLMEMIPGLGQGAGAMAQMVLGSGVLIKNTMGAAAVVVLAVISVVPVM KLTVLMIMYQCVAAVMQPVCDKRVVSCVSGVSKGHKLLLQIVLYSMFLFMVAIAITCATT NVNYFAS >gi|157101638|gb|DS480686.1| GENE 93 96715 - 97101 515 128 aa, chain - ## HITS:1 COG:no KEGG:Closa_3249 NR:ns ## KEGG: Closa_3249 # Name: not_defined # Def: stage III sporulation protein AD # Organism: C.saccharolyticum # Pathway: not_defined # 1 127 1 127 128 143 70.0 2e-33 MTIITIAAAGIATVLLAVQLKGLKGEYAAYMVMAAGAFIFFYGTGKLKDILEALERIQGY IKVNSVYLVTLLKMVGITYVAEFASGICKDAGYGSLGNQIEIFGKLSILGISMPILLALF GTLETFLG >gi|157101638|gb|DS480686.1| GENE 94 97129 - 97323 271 64 aa, chain - ## HITS:1 COG:no KEGG:Closa_3250 NR:ns ## KEGG: Closa_3250 # Name: not_defined # Def: stage III sporulation protein AC # Organism: C.saccharolyticum # Pathway: not_defined # 1 64 1 64 64 96 87.0 3e-19 MGVNLIFRIAAVGILVSVICQVLKHSGRDEQAFLTSLAGLVLVLFWMIPYIYDLFETMKN LFAL >gi|157101638|gb|DS480686.1| GENE 95 97418 - 97981 653 187 aa, chain - ## HITS:1 COG:no KEGG:Closa_3251 NR:ns ## KEGG: Closa_3251 # Name: not_defined # Def: Sporulation stage III protein AB # Organism: C.saccharolyticum # Pathway: not_defined # 18 187 10 176 176 179 55.0 5e-44 MQMEAERSRKVAAFKWTGAVLVLFSAGGLGIWSAMQWKGRLRMLETLRQMIYFLKGEITY SRAPLAEALERVGKREPGPLGGLFEAAAEGIYMQEGESLQEIWKRQVMNLNTDSGPIPLE QEDLEQLAHLGEHLGYLDVDMQERTLKLYLEQLDLTIDYLRQNQREKCRLYTSLGIMGGM FLVIVMF >gi|157101638|gb|DS480686.1| GENE 96 97938 - 99011 897 357 aa, chain - ## HITS:1 COG:CAC2093 KEGG:ns NR:ns ## COG: CAC2093 COG3854 # Protein_GI_number: 15895363 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 5 
337 4 295 305 236 38.0 4e-62 MAGREEVLKIFPRDLRAVLGQVTVDFDRVQEIRMRTQKPLLLICGGREYAVKTDGSLAVG VPDPPVRDIRGSGRCQEQERYWARAGIVTVSQAQMKETVEYMCSFSVYAAEEELRQGFIT IQGGHRIGVAGRTMAFGQDIRLMKSISFINIRVAHQIQGCANQVMDYLYSDDGRFLNTLV ISPPRCGKTTLLRDMIRQVSDGPRTQRSGRRITGVSVGVVDERSELGACYQGVPQNDLGM RTDVLDCCPKSQGMMMLVRSMAPQVIAVDEIGSREDVQAIEYVRNCGCSLAATIHGSSLE DIMQKPAVGELIQQGAFERMILLDCRGTAGHVASIWDGRGNVLYADGSGKESEGGGI >gi|157101638|gb|DS480686.1| GENE 97 99168 - 100133 900 321 aa, chain - ## HITS:1 COG:mlr0240 KEGG:ns NR:ns ## COG: mlr0240 COG0657 # Protein_GI_number: 13470513 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Mesorhizobium loti # 63 300 65 300 316 159 39.0 6e-39 MRNKNVSIRGALVRDMIHDIMETSLGKPIQTGEFRKNPVEPAWVCPSGYEYEIIDREPFK MEYLKPEGVVTGRVVLQLHGGGYIGPMKNIYRKFSVRYSRLSYGGDVLTVDYRVAPEHPY PAALEDVIHAYQWLVKEKHYRPGQVVVAGDSAGGGLALALCLYLRDHGMPQPAGLVLMSP WADLTCSGDSYEFNFENDPLFGNSRESMLYNSSYISGADPRDPYMSPVFGDFRGLPPMLL QAGGHEMLLSDTLEVAQNARRAGVKRRVSVYEGMFHVFQMSMDLVPESREAWDEVARFMQ IVYKIDRRPTGQIVKKVKRRR >gi|157101638|gb|DS480686.1| GENE 98 100176 - 101960 1902 594 aa, chain - ## HITS:1 COG:no KEGG:Closa_3262 NR:ns ## KEGG: Closa_3262 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 594 22 618 618 717 58.0 0 MIANENLSNVFRISVTLKEEIDPGLLGQALEEVLPWFAGFKVKLKRGFFWYYFEENKRQP FVEQEGSWPCRYIDPKSNQLYLFRVSYYGTRINLEVFHAVTDGMGAVNFLKELTAGYLEL KRGGRRTGAGPGEGVSTQTEDSYLKNYRKMQTRRYSSRPALRLTGRYIPFGGESVVHGYA DTGELKAVSRKMGVSITKYLTACLIWSISRVYEEEGRKGRHIGVNLPINLRAFFGSDTAS NFFAVTAIDHEPGQGRDEFEEILASVCRQMDEKIVKEKLEETISYNVSNEKKWYVRILPL FVKWLALGFIFRRNDRAHTITLSNIGLISMDREYQDDIEGFRMLIGVSKRQPAKCGVCSF GHRTVITFTKVFQDSRLEECFFGRLRADGIPVELESNGVIMPEADKGNYPVIQHDESIWK KLVLVCYGVLAVVALFLGLVNIVTYDGLWWSGIAIPGIAYVGLTLRYSILRHANLGKSLL IETIGMQILLVMIDRVLGFEGWSVNYAVPGTILFADAAVVFLIVVNRLNWQSYFMYQIAI TVFSFIPLILWAAGLVTHPAMAIITVILTVGILALTIFLGDRRFKNELIRRFHL >gi|157101638|gb|DS480686.1| GENE 99 102347 - 103468 1103 373 aa, chain + ## HITS:1 COG:SA1739 KEGG:ns NR:ns 
## COG: SA1739 COG4927 # Protein_GI_number: 15927499 # Func_class: R General function prediction only # Function: Predicted choloylglycine hydrolase # Organism: Staphylococcus aureus N315 # 1 226 1 217 346 72 26.0 2e-12 MKTIHTHALELSGSSYEAGRLLGSRLASVPSLKKRLSGGFPGFGLTQFNQASQCFSRWCP GLNEELAGFADALGCAPEQVLYYGMTWLTPRCSHLALLPSMTASGHPMTARNYEFNDEAE DFTVIKTCIKGKYTHIGTSVLGIGRDDGINEMGLTVTLSSSGFPVGPLPEMRRPAVTGLQ FWAVVRTLLENCRDVKEALSMLKDMPVAYNLNLIVLDREGRCALVETLDGRMAVKAIDSD SGEQTLYATNHPVLEELIPYEPKAFVHSIRRYDNITAFLERHKKGITAGQLKDFFLTPYP EGLSCSYYSQFFGTTKTMVLDPVRGSLSLCWGGRPQNGWHDFNFSDPFPQETAAIEIHNQ TADPSIFAYRSLV >gi|157101638|gb|DS480686.1| GENE 100 103465 - 104343 633 292 aa, chain + ## HITS:1 COG:BH3506_1 KEGG:ns NR:ns ## COG: BH3506_1 COG2207 # Protein_GI_number: 15616068 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 1 107 1 107 130 100 43.0 3e-21 MNYNNAIEKTILFIESHLTASFTVEDAAREAGYSYYHLNRQFSAVLGESVGSYIKKRRLS HAARELIYTERRIIDIALDHGFESSEAFSRAFKAVYHTSPAGYRKNRLDVIISQKPKADQ HLLAHLTQSLTVRPRIVEIPDILTAGLRDRTTLRNNRIPQLWTRFLDLMPQIPDQIPHGR GFGICESCGEGNTLYTMNDDLLFSEVVSVEVSSMSVLPPPLVYKTIPGGKYAVFTHTGSL KSLPLTYQYIWGTWSLTAQEKLDGREDFELYDERFLGFNHPDSRIDIYIPVR >gi|157101638|gb|DS480686.1| GENE 101 104393 - 105532 324 379 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020579|ref|YP_526406.1| ribosomal protein L22 [Saccharophagus degradans 2-40] # 76 354 36 314 331 129 30 1e-28 MRTIYRAGILTAALMVLVLGGCSGFDRGGEEMASAAGRGADARLGLDAGQGPNAVIKEGG QGKEGLEKKPEKIYSMQIGHSQPVDNPRHQSFLLFKKLLEKKTDGGIQVDIYPAGQLGSE ASMLEQVCDGTIQGFRGGQLEIVPRMLVFSLPFLCEDRTQAERLVNSEFAREISMDSLKS GATILGIGDAGGFRQFSNSVRMIRRPEDLKGLKMRSNGMDTINKTLEALGAEVMIVPYND LYMALKSGAIDGQENPWVNSSGMRFYEVQKYFTEVNYQFHPEPFYVNTRWYESLPPEYQE ILAECTEEMMKENNRLIDENEVQAMENIRANAEIYTLSREERQAFVEATQVVYEEYMDNG LLTMDDLARMKRIIRGEEK >gi|157101638|gb|DS480686.1| GENE 102 105529 - 106755 1326 408 aa, chain - ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: 
T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 197 5 206 530 96 29.0 8e-20 MIADDEELEREALRFFINESSLDIDQVIECASGTEVIKRVMLDRPDIIVLDINMPGLNGL QALEKIKSQDYRFRAVFSTAYDYFEYAVEALRLGAVDFMVKPVKKEAFINVIMRAIDELD AEAEKEYRETKLLKTLDMMESKMVHDMVLGNMPEEVLYYLDLKNIRTECSGNCFCIRLQG SLDKCARQKLFQAVRKEFDYLDLTVMFSETNHMVTAVVFCREPDVLPGIFRRMEEFLASI LKQQQIAFMMGSGTPFVDVSQIEESYEKARERVGDMVARTRCGEEGTEQDRDMPREVEGI CSFIQEHYQKRLTLDTIASEAGFSKYYVNRLFKQYMGTTVVDYLIQIRMKKAKELLRNET YSIKQISGMVGYSEPNYFTWTFKKMEGMSPLKFRYEQAEGIREENDGE >gi|157101638|gb|DS480686.1| GENE 103 106760 - 108229 1418 489 aa, chain - ## HITS:1 COG:BH3841 KEGG:ns NR:ns ## COG: BH3841 COG2972 # Protein_GI_number: 15616403 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 22 486 13 477 481 152 27.0 1e-36 MKKLTFWWKQFCFSVGAQINAVLLLMLILFLSFSIYNHSLFNQFNARYQKEIDQYYSILE LKNHFLQSGILFEEYMKTGNRTTLAEFNDTYEKARSVIDILYTESTNEESWYLLRGIDQS FESYYSECCNASFLYNGSDYRYYDSFYYSQTIGGYLEKYTNELLQYALESSVDSNRELTY RRKVMTAFNMGAVAAACVLIAASASYIFVRVTRPLNELKKQVEEIGRGNLDAHVKESSGE NTVSMLSRAFNHMAFSLKEKIESEKQLLVEQRKNEEYEKLLSQARFLALQSQINPHFLFN TLNSINRTVMLGRREQALTMLDSLSVLLRYNLADAQMPALLGEELGITEEYLKIQKMRFS SRLNVDVRHDRKLEQAVTLPRFTLQPLVENAVIHGLEPKEEGGTLILDVRKTGNYIRIRI CDNGMGIEKERLERIRRRLSEKQPERIGVWNIWQRLSLYTGRDDSLKIMSKKGAGTIVSI YLHRGEEDV >gi|157101638|gb|DS480686.1| GENE 104 108226 - 109230 777 334 aa, chain - ## HITS:1 COG:BH0701 KEGG:ns NR:ns ## COG: BH0701 COG1638 # Protein_GI_number: 15613264 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, periplasmic component # Organism: Bacillus halodurans # 32 306 2 283 341 108 26.0 1e-23 MSAKFIFLKGRSHVTLNMHPRIPAQTSAALTKALAAVLMAASVLLTGCAFDGGPESPSLP RSQAVILRLAETMPENHPSAQAMAYFAEMVSRETQGRVTVKIYYNGSLGTPTEIIEQVKF 
GGIAMARVNVLELSEEVESIRKYFVPSNFAGGDAQTEWIHSNEETLRDECQMDRITPLVW YYPDFRCFYGTDSTFLDKKDLEGKKIESSESALMAEIFRDMGIELAGSVNTNTYKSLISG NIDGAESSFSEFICNNYDQYIHFVTKNDAWCLPDVMIINTENLTSLSKEDREAVEKCAQS TYQYQKQSMEQFHQVWVETLTEEPEVSFSEGVFR >gi|157101638|gb|DS480686.1| GENE 105 109307 - 110614 1147 435 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. PR1] # 5 433 3 431 431 446 51 1e-124 MDQAILVALVMLVCVLTMLVLGIPISISIGLASTIAMLCVLPFENAMGTAAQKVFTGVNS FSLLAIPFFILAGNIMNNGGIAIRLVNCAKILSGRMPGALAETNVVANMLFGAISGSGVA AAAAIGGTIGPLEEKEGYSKDYMAAVNIASAPTGMLIPPSNTLIVYSTVAGSVSISALFM AGYLPGLLWGIGVMVIAGIMASKRGYRNRDEITLAIVLKTVWAAIPSLLLIVIVIGGILI GAFTATEGSAIAVVYSLLLSFIYRSIKIRDLRKILLDSVKMTAIVIFMIGVSSIMSWIMS FTNIPAIIADVLLGLTNSKIAILLIMNLLLLVVGTFMDPTPAILIFTPIFLPICTSFGMH PIHFGIMIVFNLCIGTITPPVGPILFTGCKVANVKIEEVFKTLLPYFAVTAAILLLVTYV PAISMTIPNVLGLVK >gi|157101638|gb|DS480686.1| GENE 106 110634 - 111110 245 158 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 1 143 1 145 164 99 28 2e-19 MESAKRILDRIIEWVCVVLLAVMTILVTFQVVVRYFFNSPNAYTESLSKYMFVWMVMYGS AYVFGLREHMNIGFVRDHMPPKTRVVVEMLGELMVALFAAGVMIFGGYKQVTSQMIQMDA ALQIPMGIIYSAVPVSACFILFYFVYNEMKLAKKLREI >gi|157101638|gb|DS480686.1| GENE 107 111142 - 112206 494 354 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. 
PR1] # 56 352 30 327 328 194 33 2e-48 MKKALTLVLAGVTALSLTACGGGASASATTAAPAAGGETKAGEAKAPEKDVKTTTLKLAF NQSEKHPQYLALSEMSDAFYEATGGAYRIEISPNELLGSQKDAFELVQSGTIQMAMVANS IVENVNPDFAVLGLPYAYDSVEHQKKVFTSGALDDIFASTEANNFSVMAAFTAGARCIYT DKPVQTPADLKGYKIRVMESQTCIAMLDAMGGVGTPMAQGEVYTAIQQGVINGGENNEIT YADLKHYEVAPYFSYTRHLMIPDLLVMNTATLKGMSEEDQQTLKDLCKEYTEREFQLWDE NLEGAKKTAEEAGAQFIDVDIAPFQEACQPVIDNVTMKSEGAKALYEEIRSLAN >gi|157101638|gb|DS480686.1| GENE 108 112408 - 113565 1067 385 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938307|ref|ZP_02085662.1| ## NR: gi|160938307|ref|ZP_02085662.1| hypothetical protein CLOBOL_03203 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03203 [Clostridium bolteae ATCC BAA-613] # 1 385 1 385 385 729 100.0 0 MKKSFLVILSAAMILGGCGGSGTSAATNTEKAQESASETADMQKETTQKETTEAETTVPP KEVKELSVEEFDELLSGLPVSVVKSEYLVQDEQYKSLYPDMLNAVIQNNTDADIKDAVLA FVAWDSNHLPVKLVGNMDFSGGDYFAEVNYSDINLVGGSTFGEDFGFSINESCKVDTFKP IILSFETFDGDKWKNPYVDSFRQAYEGKKYSDDIKIEVEMQDVTFAKSEAPSGTRVENAS AEELEASLAGQPVFVSSTKYVVQDERYKSLYPDMLQAVIQNNSEEDIRDAVVAYVGWDAN GLPVKIKGHLSYGDSTYVTEVLFENINLVPGSSYGDSQGYAIDEDCGVDTFKAVVVSYTT FEEKTWENPDYKAFCDLYGGKRLAQ >gi|157101638|gb|DS480686.1| GENE 109 113708 - 114187 633 159 aa, chain - ## HITS:1 COG:BS_yydA KEGG:ns NR:ns ## COG: BS_yydA COG1576 # Protein_GI_number: 16081075 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 159 1 159 159 186 61.0 1e-47 MKITLVTVGKIKERYFEDAIREYAKRLSRYCKLDIIQVADEKTPDGAGEAIERQIKEKEG QRILSNIKDGAYVIALAIEGKMLTSEELAGRIQRLGVDGVGHIVFVIGGSLGLSGDVMKR ADFALSFSKMTFPHQLMRVVLLEQVYRSYRIITGEPYHK >gi|157101638|gb|DS480686.1| GENE 110 114191 - 114682 442 163 aa, chain - ## HITS:1 COG:CAP0111 KEGG:ns NR:ns ## COG: CAP0111 COG0454 # Protein_GI_number: 15004814 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 4 163 3 162 162 199 57.0 2e-51 
MDGITIRTAVPADLDAVVQVEAACFPAAEAADRKSLKERLDVFPQSFLVAEDGNRIIGFI NGAVTNERTIGDEMFEDTSLHEPQGAYQSIFGLDVIEEYRHRGVASRLMKAMIERARSQG RRGLILTCKERLIGYYETFGYVNMGISKSVHGGAVWYDMILEF >gi|157101638|gb|DS480686.1| GENE 111 114715 - 115422 1042 235 aa, chain - ## HITS:1 COG:BS_yycF KEGG:ns NR:ns ## COG: BS_yycF COG0745 # Protein_GI_number: 16081093 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 3 232 4 230 235 204 49.0 9e-53 MARVLVVDDEKLIVKGIRFSLEQDGMEVDCAYDGEEAIELAKKTEYDIVLLDVMLPKYDG YEVCQAIREFSDMPIIMLTAKGGDMDKILGLEYGADDYISKPFNILEVKARIKAIIRRSG KSKRQKREEAESGVIGAGDLKMDTESRRVFIGEREINLTAKEFDLLELLVRNPNKVYSRE ALLTYVWGNKAMDSGDVRTVDVHVRRLREKIEPSPSDPRYVHTKWGVGYYFRVKA >gi|157101638|gb|DS480686.1| GENE 112 115484 - 116365 841 293 aa, chain - ## HITS:1 COG:BS_comEA KEGG:ns NR:ns ## COG: BS_comEA COG1555 # Protein_GI_number: 16079613 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Bacillus subtilis # 116 293 45 204 205 112 41.0 7e-25 MKLTSFKVIKLAVIVICMLAAGICYSCSVKNRDIEPENMVMALESDSGFAVEERGGIEGP RYRDEPEGGAGAMSGDADGEHLRVQDSVPGQDGMRGQEGTPGQDGMLPQKGSRSGEDTLP REDTLSGEDMPLVYIHVCGLVSTPGVYGLPAGSRVYEAIEAAGGFSEAAVPDYLNLAQVL EDGMKIQVPDREQAEEWKARGLTQSGISMGGGTAGVQTSGRTGSGEGGSKARVNLNTASR EELMTLRGIGASRADDIIHYRQEFGGFKSIEDIMNVSGIKNAAFEKIKDSITV >gi|157101638|gb|DS480686.1| GENE 113 116463 - 116888 539 141 aa, chain - ## HITS:1 COG:CAC3547 KEGG:ns NR:ns ## COG: CAC3547 COG3238 # Protein_GI_number: 15896783 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 13 140 14 140 143 79 37.0 1e-15 MGFIIALLSGALMSIQGVFNTEVTKQTSVWVAAGWVQLTAFITCVVIWLFTGREPVAGLM QVTPKYVLVGGIMGALITYTVIRSMGSLGPAKAALLIVVSQIIIAYAIELFGLFGVEKAP FAWNKIIGAVIAIGGIVIFER >gi|157101638|gb|DS480686.1| GENE 114 117369 - 117782 310 137 aa, 
chain + ## HITS:1 COG:no KEGG:Closa_3092 NR:ns ## KEGG: Closa_3092 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 134 116 249 258 145 58.0 6e-34 MQDFALLGGVMALAEPSGLMHPYLLLTLHGFLWHFMLIFIGLYCAMTGLGGRTPGDFVSM LPLFLACVCIATFINVASHPYGNADMFYISPYYPNGQIVFHQIALTIGTAPGNLLYLAAM VLGGFICQLSLGRLFPR >gi|157101638|gb|DS480686.1| GENE 115 117928 - 118473 517 181 aa, chain - ## HITS:1 COG:no KEGG:Closa_2241 NR:ns ## KEGG: Closa_2241 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 181 1 181 181 216 66.0 4e-55 MKICNIWNHFSTITRHRNLVRKHCFQIGLYWQGLTHDLSKYSPEEFWTGVRYYQGNRSPN AAERETVGYSRAWLHHKGRNKHHYEYWIDISNHKEEGLVGNKMPLRYVAEMVCDRIAACE VYKGKAYTSAAPLEYYEYTKNYITIHPRTRALLEKLLHMLKDQGEEATFAYLRKLLKKGT Y >gi|157101638|gb|DS480686.1| GENE 116 118737 - 118934 241 65 aa, chain + ## HITS:1 COG:BH3610 KEGG:ns NR:ns ## COG: BH3610 COG1278 # Protein_GI_number: 15616172 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Bacillus halodurans # 1 65 1 65 65 91 69.0 4e-19 MKGTVKWFNNQKGYGFISDEQGNDVFVHYSGLNMDGFKSLDEGAEVEFDVVNGAKGPQAT NVTKL >gi|157101638|gb|DS480686.1| GENE 117 119054 - 119932 1120 292 aa, chain - ## HITS:1 COG:CAC2370 KEGG:ns NR:ns ## COG: CAC2370 COG1281 # Protein_GI_number: 15895637 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond chaperones of the HSP33 family # Organism: Clostridium acetobutylicum # 1 290 1 292 297 279 50.0 4e-75 MADYVIRATAAQGQIRAFAATTRDLVEQARISHNTSPVATAALGRLLTAGVMMGCDMKGE NDLLTLKIQGDGPIAGLTVTADSRGNVKGYAFNPMVMLPPNEKGKLDVGGALGVGVLSVI KDIGLKEPYVGQTILVTGEIAEDLTYYYATSEQTPSSVALGVLMNKDNTVRQSGGFIIQL MPGASEDIIGRLENKLKEISSITTLLDVGNTPEMILEYVLGEFGLEIGGRLPAAFYCNCT KDRVEKALISVGRKDIGEMIEDGKPIEVNCHFCNKNYTFSVDELKDMLEKAK >gi|157101638|gb|DS480686.1| GENE 118 119938 - 120750 913 270 aa, chain - ## HITS:1 COG:BS_yqeM KEGG:ns NR:ns ## COG: BS_yqeM COG0500 # Protein_GI_number: 16079615 # Func_class: Q 
Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus subtilis # 19 256 3 236 247 156 36.0 4e-38 MAAGMRRGAVGKDEDMEAYSSFAEVYDIFQDNVPYEEWCSYVTGLLREQGVERGLVLDLG CGTGSLTGLLAEAGYDMIGIDNSGEMLQMAMDKRAASGQDILYLLQDMREFELYGTVRAV VSICDSMNYMMEYRDLVQVFRLVNNYLDPRGVFIFDLNTEYKYRELLADNTFAEAREESS FIWDNFYDESSGINEYDLTLFIREGQLYRKYEETHFQKAYSLDEVKRAALEAGMEFVAAY DACTRNPVREDSERIYVIMREQGKTMRDEE >gi|157101638|gb|DS480686.1| GENE 119 120820 - 121329 686 169 aa, chain - ## HITS:1 COG:CAC3537 KEGG:ns NR:ns ## COG: CAC3537 COG0653 # Protein_GI_number: 15896773 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Clostridium acetobutylicum # 1 168 1 166 166 142 45.0 2e-34 MALLETWRNLAYGDGLDDKKKEELWAGYFQIEKGIYEQILSNPTEVITGTVKDLAEKYNT EILIMTGFLDGINESLKGYENPIDTMEEDTEVKIEIDPEKLYYNMVEAKANWLYELPQWD EILTPDKRKELYKSQKASGTVRKGKKIFPNDPCPCGSGKKYKKCCGKNA >gi|157101638|gb|DS480686.1| GENE 120 121645 - 121818 245 57 aa, chain + ## HITS:1 COG:no KEGG:Closa_0656 NR:ns ## KEGG: Closa_0656 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 56 1 56 56 70 80.0 3e-11 MKVNGKNESIKCTIKNCAYNAQSEDYCTLEAIKVGTHEMNPTKKECTDCESFVNKAQ >gi|157101638|gb|DS480686.1| GENE 121 122042 - 122236 124 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938325|ref|ZP_02085680.1| ## NR: gi|160938325|ref|ZP_02085680.1| hypothetical protein CLOBOL_03221 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03221 [Clostridium bolteae ATCC BAA-613] # 1 64 1 64 64 95 100.0 2e-18 MCAPADAGGGWETDMDRFEEEKQARKKSNEMFNSITEKVKPENQNQTHNSRREGMGPNTK RKPC >gi|157101638|gb|DS480686.1| GENE 122 122576 - 123040 420 154 aa, chain + ## HITS:1 COG:alr4634 KEGG:ns NR:ns ## COG: alr4634 COG0590 # Protein_GI_number: 17232126 # Func_class: F Nucleotide transport and metabolism; J Translation, 
ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Nostoc sp. PCC 7120 # 1 144 1 140 140 97 35.0 8e-21 MNDEIFMKEAIRLSQLAVSHGNEPFGAVLVKDGEIIFSNENQIYTGSDPTFHAEAGLLRR FCAETHITDLREYTLYSSCEPCFMCCGAMVWTKLGRLVYGASDIDLCSILHEKGAECCKI VFEHSPWKPQVTSGVLRDESLKILTAYFSKNTKG >gi|157101638|gb|DS480686.1| GENE 123 123208 - 124644 1294 478 aa, chain - ## HITS:1 COG:PAE2480 KEGG:ns NR:ns ## COG: PAE2480 COG1012 # Protein_GI_number: 18313376 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Pyrobaculum aerophilum # 6 477 7 477 478 374 42.0 1e-103 MEYQLYINGEWKDTVTGYVAEDKNPADDSVFARVHFAGGAEVEEAIAAAQRAWRPWADTP AAQKEQILLKAADWMQEHIDEVADVLMGESGSAFGKAYFEAGFAVDILRTAAGECRRVFG QVQQQEAGEISMIRRLPLGVVAGIAPFNFPLLLALKKVAFALAAGNTFVLKPASATPVSG VMIAKALDGAGIPKGVFNLVPGSGEEVGNKLIEDERIRMVAFTGSTAVGRGIASKASVLF KKYSLEMGGKNPLILLKDFDVEQAVRIAGFGAFFHQGEICMCTSRLIVEEPVYDPFCEAF AAYARTMKVGDPHQKDTIIGPLIKQEQCQIIDSQIQDAVSKGAVLMTGGTHEGNFYQPTV LKDVTPDMRIFYEESFGPVTSIIKAVDAQDAVRLCNDNQYGLSSSLLTNDLSKAMSLSLD MEAGMVHINNATVSDNSTVAFGGVKNSGVGREGGSYSIDEFTELKWITVQYTPAQLPF >gi|157101638|gb|DS480686.1| GENE 124 124657 - 125751 1150 364 aa, chain - ## HITS:1 COG:BH2011 KEGG:ns NR:ns ## COG: BH2011 COG1062 # Protein_GI_number: 15614574 # Func_class: C Energy production and conversion # Function: Zn-dependent alcohol dehydrogenases, class III # Organism: Bacillus halodurans # 1 363 1 362 366 386 50.0 1e-107 MKITAAVVREKGKPFQFEELDLQDPREGEVMVKIAASGVCHTDEVAQHQMIPVPLPAVLG HEGCGIVEKVGEGVTEFQKGDHVVFSFGYCGHCKSCLAGRPYACENFNAINFGGVMSDGT KRLSKNGQEISSFFGQSSFATYSVVHKNSIIKVDEDLELDILGPLGCGIQTGAGAVLNRL KPTAGSSLVVFGCGTVGMSAIMAAHLCPCKNIIAVGGNEESLVLAKELGATHTINRKKCD CIVSEIKKITEGGADYAIDTSGVPDFVKKALACVKFLGTAVVLGVTGELTIQVQEELMGE GKSLIGIVEGDSNPKLFIPQLISYYKQGKFPFDKLIRVFEFDEINEAFDASHSGKAIKAL LKMR >gi|157101638|gb|DS480686.1| GENE 125 126052 - 126375 202 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938330|ref|ZP_02085685.1| ## NR: gi|160938330|ref|ZP_02085685.1| 
hypothetical protein CLOBOL_03227 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03227 [Clostridium bolteae ATCC BAA-613] # 1 107 8 114 114 213 100.0 4e-54 MAAAGLFMVLLLAACYIRLPDYRIYKVFSYSGLNVRETDLYVIVYKPWRIDHEVQGIVSQ HNEMNGTPDKLTVWLYYSKYNLDRGKDFNKVIFEYSKSQVTDPPEAW >gi|157101638|gb|DS480686.1| GENE 126 127149 - 127430 227 93 aa, chain - ## HITS:1 COG:CAC0740 KEGG:ns NR:ns ## COG: CAC0740 COG1550 # Protein_GI_number: 15894027 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 91 1 91 95 97 60.0 6e-21 MLILSLKIKLHAPWVHSLKEKRMVVKSLLSKMRNKFNVAVAEVGEQDIHQTIVIGVAAIV SHSAQADSVMEEIRYFIEESTEAEITDMEREIF >gi|157101638|gb|DS480686.1| GENE 127 127895 - 128044 154 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938335|ref|ZP_02085690.1| ## NR: gi|160938335|ref|ZP_02085690.1| hypothetical protein CLOBOL_03233 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03233 [Clostridium bolteae ATCC BAA-613] # 1 49 1 49 49 65 100.0 1e-09 MKFFTKALIILAVILLILALGDSIFNGLFSNDGYRPVTQQMKEIMTGTD >gi|157101638|gb|DS480686.1| GENE 128 128326 - 128682 167 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938337|ref|ZP_02085692.1| ## NR: gi|160938337|ref|ZP_02085692.1| hypothetical protein CLOBOL_03235 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03235 [Clostridium bolteae ATCC BAA-613] # 1 118 1 118 118 233 100.0 3e-60 MTTEQYIITFFGIDTKYYTDDFVGRCWEMAADEEYGSTGVYVTGTVTTHSLICGEIRGCQ LGKVGHCVSAVRNPVEVADSGTYKKSLINVINRTRSLLDNPNMSIIINEVEYFYFCQI >gi|157101638|gb|DS480686.1| GENE 129 128793 - 129083 326 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938338|ref|ZP_02085693.1| ## NR: gi|160938338|ref|ZP_02085693.1| hypothetical protein CLOBOL_03236 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03236 [Clostridium bolteae ATCC BAA-613] # 1 96 1 96 96 167 100.0 2e-40 MQITDYVKPELIVVALVLYFVGMWIKQSEAVKDRYIPLINGGLGIVICGIYVLATSTCRS GQEIFMAIFTAITQGILLAGLSTYVNQIIKQIGREE 
>gi|157101638|gb|DS480686.1| GENE 130 129170 - 129331 151 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938340|ref|ZP_02085695.1| ## NR: gi|160938340|ref|ZP_02085695.1| hypothetical protein CLOBOL_03238 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03238 [Clostridium bolteae ATCC BAA-613] # 1 53 16 68 68 82 100.0 8e-15 MKQGEQKSTKELEEQAKNVETFNPRWHGGEYLGPNTLRQKADTEVYGKMEEEP >gi|157101638|gb|DS480686.1| GENE 131 129639 - 130550 873 303 aa, chain - ## HITS:1 COG:SA2054 KEGG:ns NR:ns ## COG: SA2054 COG0679 # Protein_GI_number: 15927838 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Staphylococcus aureus N315 # 12 245 9 242 302 91 27.0 2e-18 MAEVLMKAGSFVAVIIMGNLLRRAGFFKEEDFYVLSKIVLKITLPAAIISNFSGIDLKPS MLVVTLLGFAGGILYIAVAFFMSIGKEREERAFNILNLPGYNIGNFTMPFAQSFLGPVGV VATSLFDTGNAFICLGGAYSAAVATKEKNARFSMFPVIKTLLKSIPFDAYLIMTVLSLLH VGLPAPITSFTGVIGNANAFMAMLMIGVGFKLNGDSSQPGKIVKILAARYSLAIVLALVF YFLLPFGIEYRQALTILALSPMSSAAPAFTGNLGADVGLASAVNSISVVISTVLITGTLM MIL >gi|157101638|gb|DS480686.1| GENE 132 130968 - 132275 1138 435 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 9 429 6 426 435 442 52 1e-123 MNIAVVCGLIILVLLVVMLLAGVPIAVALAVSSICAILPVLNTGAAVLTGAQRIFSGISV FSLLAIPFFILAGNIMNKGGIAIRLINFAKLFTGSIPGALAHTNAVANMLFGAISGSGTA AASAMGSIIGPIEEEEGYDRDFSAAANIATAPTGLLIPPSNVMITYSLVSGGTSVAALFM AGYIPGILWGLACMAVIFYFAKKKGYRSTKKYSNSEKVKVFFQAIPCLLMIVIVIGGIIS GIFTATEGSVVAVVYSLILSLFFYKSIKFSELPKIFLDSAEMTGIIIFLIGVSSIMSWVM AFTGIPTAVSEAMLGISNNRYVILFIINILLLIIGTFMDMTPACLIFTPIFLPICQALGM NTVHFGIMMIFNLCIGTITPPVGTTLFVGVKVGKTKIENVIKPLLLYFAAIFVVLLLVSY VPILSMWLPGLLGYV >gi|157101638|gb|DS480686.1| GENE 133 132272 - 132775 231 167 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 22 148 19 147 164 93 33 8e-18 METMHKVRARIMQILGVFIICLFMLMTIIGTYQIVTRYFFNKPSTVSEELLTYSFTWMAL 
LASAYVFGKRDHMRMGFLADKISGSAKKYLEVIIDLLTFAFAGVVMVYGGISITKLTMIQ TTASLRVPMGYIYVIVPVTGIIIMMFSLMNAADMLHTDFGKKEDAGV >gi|157101638|gb|DS480686.1| GENE 134 132759 - 133808 464 349 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149195933|ref|ZP_01872989.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 1 334 1 338 340 183 32 7e-45 MKKVRKFIKRVAAAALAAAVAAGLTGCGSVTSGKRIIRVSHAQSETHPEHIGLLAFKEYV EERLGDKYEVQIFPNEILGSAQKAIELTQTGAIDFVVAGTANLETFAGVYEIFSMPYLFD SEDVYKSVMQDTDYMEKVYESTDEAGFRVVTWYNAGVRNFYAKTPIHTPDDLKGKKIRVQ QSPASVAMVNAFGAAAAPMGFGEVYTAIQQGVIDGAENNELALTNNKHGEVAKYYAYNKH QMVPDMLVANLKFLNGLSPEEYQVFKDAAALSTEVELEEWDKSIEEAKDIAQNKMGVEFI DVDVDAFKQKVLPLHETMLKDNPKIVDLYNHIQEINEKAKGGKQDGDNA >gi|157101638|gb|DS480686.1| GENE 135 134151 - 135002 790 283 aa, chain - ## HITS:1 COG:BH3679 KEGG:ns NR:ns ## COG: BH3679 COG4753 # Protein_GI_number: 15616241 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 4 276 6 253 257 112 26.0 8e-25 MYRVLLADDEQIERMALAKRLKRHFCGSLDISEAVNGAEALETFKREKSQIVVMDISMPE MNGVEAAERIRSLDEDCIIIFLTAYDEFSYAKRAIVIRALDYLLKPCEEDELVAVMEEAM RLTDKRLNVSGVPSPGVPSPGVPSPGVPSPGVPSPDIRREEHAEAMPRDDGDGRLAQVAE TIREYIRNNYMKEISMQDAARMMNYSDAYFCKLFKQCFDQNFTSYLTGFRVNEAKKLLKD RSISVKDVSMQVGYYDSNYFAKVFKRMTGMIPSEYRDSETAQK >gi|157101638|gb|DS480686.1| GENE 136 135027 - 136472 1115 481 aa, chain - ## HITS:1 COG:BH3841 KEGG:ns NR:ns ## COG: BH3841 COG2972 # Protein_GI_number: 15616403 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 181 477 169 471 481 162 32.0 1e-39 MEGRQGRKLHGIWMSCSLERKVRWITISTAAVVILSITAVMLIAGYGMKGFGDLLMGNSR SLAFWSAMDAESSSFQYYAGDRSPENREKYEMSCARTRSALKSLPFDYEEIGADRYGVTW SILNMYENYAAARDAFLEMEEGAPDYLDRLYTIYRVQGYLATHAGRLEQMTVQSGNERYE MQRPLFIIVPAVSILWGAAALLMVRWLNRSVQKNIVRPMVELAEDSRRIGENDFTGPDTC 
AEGSDEIASLVRAFCTMKASTRGYIEALTEKHRMEKQLDEVRLQMLKNQINPHFLFNTLN MIASTAQIEDADTTEKMIHALSRLFRYNLKSTDSVMPLERELKVVQDYMYLQQMRFGQRI RYDTDCKPDTMEVLVPSFALQPLVENAIIHGISPKGQGGRIHVRSWMEGRRLWISVADTG RGMAQERLEEIRLALARGEEKATGVGVGNIYRRVHGMYRDGEVFIYSSEGRGTVVQMAFT P >gi|157101638|gb|DS480686.1| GENE 137 136657 - 136875 222 72 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0641 NR:ns ## KEGG: Ethha_0641 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 64 17 80 83 64 40.0 1e-09 MIDYSPFWKTLEQSEENWYTLTKKHRVSDSTLHRLKHNMDISMKTVNDLCRILDCDIEDI AVYVPSEKDQLL >gi|157101638|gb|DS480686.1| GENE 138 136994 - 137221 296 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938350|ref|ZP_02085705.1| ## NR: gi|160938350|ref|ZP_02085705.1| hypothetical protein CLOBOL_03248 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03248 [Clostridium bolteae ATCC BAA-613] # 1 75 1 75 75 114 100.0 3e-24 MLLFTYLKGRAKQMDYFTKEGMKKLLEDEEVVRRLTEFMAMDGAAYFEEVRSHLSPEELE EYLDENPDERIYLKK >gi|157101638|gb|DS480686.1| GENE 139 137262 - 137969 866 235 aa, chain - ## HITS:1 COG:AGl1783 KEGG:ns NR:ns ## COG: AGl1783 COG1802 # Protein_GI_number: 15891006 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 225 70 280 280 112 33.0 6e-25 MGEMNFKNRGELVYETLKQEILDLRLKPGQMISENDICERFGVSRTPVRDALRLLQEQGF AETIPYRGTYVTLLSLDNIKQMIYMRVAVETMVLRDFLEIQTPMVMEDIRHSIAQQRAVI REPGFEPEQFYRMDAQMHSIWFAAVRRQKLWEMLQAQQLHYTRFRMLDFITETDFMRIIG EHEDLFRLIADKNEKGLEEALKEHLYYSMKRMRRSIEVDYKDYFEEEDEEGRFVI >gi|157101638|gb|DS480686.1| GENE 140 138113 - 138919 189 268 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 13 257 4 238 242 77 25 6e-13 MADYLNNFSLEGKVALITGASYGIGFAIAKAMAGAGATIVFNDIKQELVDKGLAAYKEEG IEAHGYVCDVTSEDAVNEMVGKIEKEVGVVDILVNNAGIIKRIPMCDMTAAEFRQVIDVD LNAPFIVSKAVIPSMIKKGHGKIINICSMMSELGRETVSAYAAAKGGLKMLTKNIASEYG 
EYNIQCNGIGPGYIATPQTAPLREIQPDGSRHPFDQFIVSKTPAGRWGEAEDLGGPAVFL ASDASNFVNGLVLYVDGGILAYIGKQPK >gi|157101638|gb|DS480686.1| GENE 141 138942 - 139784 956 280 aa, chain - ## HITS:1 COG:STM3018 KEGG:ns NR:ns ## COG: STM3018 COG3717 # Protein_GI_number: 16766320 # Func_class: G Carbohydrate transport and metabolism # Function: 5-keto 4-deoxyuronate isomerase # Organism: Salmonella typhimurium LT2 # 1 280 1 278 278 324 54.0 1e-88 MDIRYSANQRDVKRYTTQELRDEFLIQNLYAANEVVAVYSHVDRMVTLGCMPTTETVSID KGIDCWKNFGTHYFLERREIGIFNIGGSGRIVADGQEFQMGYKDCLYITKGTKEVYFSSD DASAPAKFYMVSAPAHTSYTTTYIPIAKAAKRPCGASETANKRVINQFIHPDVLKTCQLS MGMTVLEEGSVWNTMPSHTHERRMEVYMYFEVPGDNVVFHMMGEPTETRHIVMKNEEAVI SPSWSIHSGAGTSNYTFIWAMGGENMEFDDMDTMMPNQMR >gi|157101638|gb|DS480686.1| GENE 142 140113 - 141330 1119 405 aa, chain + ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 2 402 13 412 427 218 37.0 2e-56 MKKNILVGQSGGPTAVINASLYGVIAEGMARPEEVDHVYGMVNGIEGFLSGHVMDLSSLT HEDLILLKQTPAAYLGSCRFKLPENLDAPVYSLLFEKFKSMNIGACFYIGGNDSMDTVDK LSRCARIRRESISFIGVPKTIDNDLPVTDHTPGYGSAAKYVASTVREISLDASVYQQPAV TIVELMGRHAGWVTAAAALARKFPGDNPLLIYMPEVPFSMEQFFKDVEQALSRQPNVVIC VSEGICDKNGRFICEYTNDAQIDTFGHKMLTGCAKILESYVRNRFQVKVRSVELNVNQRC SSLLASAADVQEAESSGRTAVEKALQGETGKMVTCSRLSGPGYELEYGLTEVGNVCNREK SFPVEWITAEGSDIGPEFLDYALPLIQGQPDHIMENGLPKYLCRK >gi|157101638|gb|DS480686.1| GENE 143 141343 - 142185 941 280 aa, chain - ## HITS:1 COG:HI0048 KEGG:ns NR:ns ## COG: HI0048 COG1028 # Protein_GI_number: 16272023 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Haemophilus influenzae # 6 280 12 285 285 303 53.0 2e-82 MVLGTDLTGKMAVVTGAGGILCGMFARTLARAGAAVALLDINLEAAEETARDIREAGGTA 
AAYGVDVMDKDMLESVHARVLARLGPCDILINGAGGNQARANTSKEYFEMGDIEADVATF FDMEYKGVQGVLDLNFLGGFLTCQVFAKDMVGREGCNIINISSASAARPLTRIPVYSGAV AALSNFTQWLAVHFAKAGIRVNAIAPGFFVTRQNEGLLYEEDGKPAARTAKVLAATPMGR FGRPEELNGALMFLLNNEAAGFITGIILPVDGGFSAYSGV >gi|157101638|gb|DS480686.1| GENE 144 142346 - 143395 1224 349 aa, chain - ## HITS:1 COG:CAC1332 KEGG:ns NR:ns ## COG: CAC1332 COG1312 # Protein_GI_number: 15894611 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Clostridium acetobutylicum # 1 349 1 348 351 507 66.0 1e-143 MKLSFRWYGEDDKVTLENIRQIPGMQSIVTAVYDVPVGEVWSRESIAKLKKQVEDAGLGF DVIESIPVHEDIKLGKASRDKYIENYCENIRRVAEAGVKCICYNFMPVFDWTRTQLDHEL ADGSTSLVYYQEQVDAVNPLNSDSDLTLPGWDSSYTKDGLKAVVEEYHSLTEENLWDNLK YFLERIIPVAAECDVNMAIHEDDPCWSIFGLPRIITCEENLDKFLKLVDDRHNGITLCTG SLGCSAKNDVVRLAGKYAAMGRIHFAHLRNVAVLDNGFEERAHLSCCGSLDMFGIVKALV ENGFDGYVRPDHGRMIWGETGRAGYGLYDRALGATYLNGLFEAVEKMSR >gi|157101638|gb|DS480686.1| GENE 145 143573 - 144463 818 296 aa, chain + ## HITS:1 COG:CAC1333 KEGG:ns NR:ns ## COG: CAC1333 COG2207 # Protein_GI_number: 15894612 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 12 284 11 282 286 156 35.0 6e-38 MKKQLSMDFNTRQYMQSGDFELFYYNDTALKSVDAHEHDYYEFYFFLEGDVTYHIGDKSY QLEYGDCLLIPPDTPHYPQFHSYDKPYRRFVFWLNQDYHKKLRQTNEDLTYCFDYAQSHQ AYRFHTDTITTQEIQGKLTDLMEELAGGRSFHRLVSELMAVSFLVYINRLLYDSLHQKSA VYENVLYLNICDYINRHLEEDLSLDSLSSFFYVSKYHISHIFKDNMGIPLHQYILKKRLH ASKNAILSDQPISHVYLQYGFKDYTSFFRAFKKEYGVSPKEYREQHRLPEQQDYTI >gi|157101638|gb|DS480686.1| GENE 146 144479 - 145453 841 324 aa, chain + ## HITS:1 COG:CAC3242 KEGG:ns NR:ns ## COG: CAC3242 COG1313 # Protein_GI_number: 15896487 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S protein PflX, homolog of pyruvate formate lyase activating proteins # Organism: Clostridium acetobutylicum # 10 321 2 295 298 285 42.0 1e-76 MNTPDILSFDTYRNCILCPRMCKADRSRAAGYCGCSSVLTAARAALHHWEEPCISGSADA 
DGGLTAPGGAHSDSGGSGTVFFSGCTLGCCFCQNHKISSGHFGREMTAEELGQVFLRLQD QGAYNINLVTPTQYLPHIISALDQVRFKLTIPVVYNCGGYERAETVKALKDYVDIWLPDF KYYDNRLAQRYSRAADYFETTAAAIRQMIDQTGAPVYQSVSPGLLAKGVIIRHMVLPGQR RDSIALLEWISGNLPRGRYMISLMSQYTPYIQNQEFPELNRRITSYEYDKVVDAAIRLGL TEGFMQEKSSAREEYTPPFDLEGL >gi|157101638|gb|DS480686.1| GENE 147 146606 - 147496 1068 296 aa, chain - ## HITS:1 COG:TM0960 KEGG:ns NR:ns ## COG: TM0960 COG0524 # Protein_GI_number: 15643720 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 7 294 9 295 299 152 37.0 9e-37 MKILNYGSLNIDYTYSVDHFVRGGETMSSEEMHVFSGGKGLNQSIALSKSGAEVWHAGAI GTGDGDFLIRQLKEAGVNTEYISVLDGKTGHAIIQKAKDGGNCILLFGGANQKITREMVD GVMDHFEKGDYLVLQNEISEIGYIMERAREKGMVLVFNPSPMDDKISTYPLEYVDYFLLN EIEAGDICGEQGSGEELLQKLSDKFPASKIVLTLGGDGSLYRDGQRILTQGIYKVPVADT TAAGDTFTGFLIGGLVQGLDAGEALDLAAKAAAIAVSRPGAAPSIPSREEVEAFKG >gi|157101638|gb|DS480686.1| GENE 148 147530 - 148624 1243 364 aa, chain - ## HITS:1 COG:no KEGG:BHWA1_01065 NR:ns ## KEGG: BHWA1_01065 # Name: not_defined # Def: hypothetical protein # Organism: B.hyodysenteriae # Pathway: not_defined # 6 350 211 552 582 133 26.0 1e-29 MMKKIMIAGFWALLLIPNLLFPFVKGGSQTGTGENRNLAEFPVFSPDTYEAYPSAVNSYI NDHAAFRNLFLSLNSMINLKLFGYADSQDVIVGKDGWYFFAGGMSLYDALGTQPFYPDDA AWIGGQIIKAAGYYESRGIPFLMMIAPNKEGIYREYMPDAYKRVWDGNRPGQLEDYIREH SDVTVLDPREYFNTNRDYVWYYKTDTHWNSAGGYAAGQMLIEALGGTPLPIEDVQVDYIK GAPGDLANLFHLPEKYCDDQIALVSGYHEDLAVTVQDVNGDKNIVHTSTPGAPDKRRIAV IRDSFGTALMDFLPRYFANVDFYHWQAFDRSLLEENPPDAVIYLIVERDLTRIPQDVGKM VPEE >gi|157101638|gb|DS480686.1| GENE 149 148650 - 150062 1442 470 aa, chain - ## HITS:1 COG:CAC1564 KEGG:ns NR:ns ## COG: CAC1564 COG1696 # Protein_GI_number: 15894842 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Clostridium acetobutylicum # 1 470 1 473 473 315 39.0 1e-85 MLFSSLTFVWFFLPALALIYYISPGRKLKNGVLLMASLVFYSWGEPRYVVLVVLSILLNY 
GFGLALDGCEERKGLKKAVFILSILVNLGLLGYFKYFNFFLELAMSLLGRQGFAPRDIAL PIGISFYTFQSISYLADTFRGENPAQRNVFRLALYISLFPQILSGPIVKYHELEPQIGNR QESLSMHAYGIKRFVYGLVKKMVFANMFGQVVDRMFDLPLEQLGTVTVWLAVILYSFQIY YDFSGYSDMAIGLGRMFGFTYKENFNYPYLSASVSEFWRRWHISLSTWFRQYLYIPLGGN RKGLARTCMNLFLVFLATGLWHGASMNFICWGVYYGILIVAERLWIGRILEKNPLKFLNH LYTLGAVAVGWLLFRVDTLHNGKILLKAMLIPSKGLWNPRIFADNRILFLLVLAVVFCGP IQQLMPRLKARLFDEENVTAADVAVMAALLLLGTLLMVSSTYQAFIYFRF >gi|157101638|gb|DS480686.1| GENE 150 150121 - 151479 1596 452 aa, chain - ## HITS:1 COG:FN0667 KEGG:ns NR:ns ## COG: FN0667 COG0534 # Protein_GI_number: 19704002 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 408 6 411 426 239 34.0 6e-63 MKKSYEIDMTSGPLLGKILLFSIPLMLSGILQLLFNAADIIVVGRFAGSGALAAVGSTSS LINLLINVFVGLSVGVNVLVARYYGARKDKDVSETVHTAVTTSIVSGFILVVLGILLANP LLRLMGTPEDVLSQSVLYMRIYFLGMPVLMVYNFGAAILRAIGDTRRPLYFLFASGVVNV CLNLFFVVVLGMGVDGVAWATVISEHISAFLVLRSLMSAPGALKLDLKQLRIHPRKLKRI VKIGLPAGMQGAIFSISNVLIQSSVNSFGSIAMAGNTASSNIEGFVYTAMNAVYQTNLSF TSQNLGGRKYSRINKIMYICLGVVTAVGLILGLTAVAAGDGLLHIYSSDPEVLRYGMLRL EIICTTYFLCGIMDCMVGSLRGLGYSIIPMFVSLTGACGFRVLWVFTVFAAYRSLDVLYL SYPVSWAITAIAHMVTFHKIRRKIPKQDAVSM >gi|157101638|gb|DS480686.1| GENE 151 151821 - 152387 683 188 aa, chain + ## HITS:1 COG:Cj0167c KEGG:ns NR:ns ## COG: Cj0167c COG1971 # Protein_GI_number: 15791554 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Campylobacter jejuni # 1 183 1 184 187 152 53.0 2e-37 MSLAELFVIAVGLSMDAFAVSVCKGLAMPKMNWKGALLVGLYFGGFQAAMPLFGYFLGSS FSLAIRAYDHWVAFILLAVIGANMIKESFSKDEECPNPDLDVKNMVLLAIATSIDALAVG VTFAFLNVDILPAVSFIGSVTFFLSVAGVKAGNAFGCRYKSKAELAGGAILILMGFKILL EHLGILFG >gi|157101638|gb|DS480686.1| GENE 152 152556 - 153302 844 248 aa, chain - ## HITS:1 COG:SP1937 KEGG:ns NR:ns ## COG: SP1937 COG5263 # Protein_GI_number: 15901761 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 162 236 
210 284 318 65 42.0 9e-11 MRRYMSVKRLGALVLAAVLSVSAAFVSEAAASYKTEVPVTLHLDVNSRRTTIYDGYSSYL LGYRVSEYDTFTVTKDKSDVDLNEVTLNYYLVTYRGETQKYLECTVYGLKEGTHYPVIRP ETVEKERSAEGDYNYLDQCYMVEFCYQDKKESRYFMLLPEEDMSNYRNILLGKWNKDMRG WRYLYQDEYLTSWAMINDKWYFFNTDGYMQTGWLEYKGQWYYMDPENGVMQTNRTIDGYQ LDSSGVRI >gi|157101638|gb|DS480686.1| GENE 153 153378 - 154604 861 408 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 7 400 13 409 418 336 43 7e-91 MQIYEELVARGLIAQVTNEEEIKKMVNEGKAVFYIGFDPTADSLHVGHFMALCLMKRLQM AGNKPIALIGGGTGMIGDPSGRSDMRTMMTPEIIQHNCDCFKKQMSRFIDFSEGKAMMVN NAEWLMGLNYIEFLREVGPHFSVNNMLRAECYKQRMEKGLSFLEFNYMIMQSYDFYELFK RYGCNMQFGGDDQWSNMLGGTELIRRKLGKDAHAMTITLLLNSEGKKMGKTQSGAVWLDP EKTSPFDFYQYWRNVGDSDVLKCLRMLTFLPLEQIDEMDQWEGSRLNQAKEILAFELTKL VHGEEEAAKAQEGARALFAAGASTENMPTFELSEEDFTDGTIDILSILNKSGLAASRSEA RRNVEQGGVSVDGNPVKDIKAVFSREQFAGDGVVVKRGKKNFKKIILK >gi|157101638|gb|DS480686.1| GENE 154 155145 - 156041 1077 298 aa, chain - ## HITS:1 COG:FN2101 KEGG:ns NR:ns ## COG: FN2101 COG0697 # Protein_GI_number: 19705391 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 5 295 3 293 301 198 39.0 1e-50 MKNDKRQTAGTIMALVGGTCWGFSGCCGQYLFEQKGIEAPWLVAVRLFFAGIILVLAGFK LHGRENLRVFRKKRDTIHLLAFAIFGITFCQFTYFMAIQASNAGTATVLQYLSPILILAV VCMRELRLPKGLELAAIGLSLFGTFVIGTHGDIHSFHITGEALFWGLLAAVSSMIYTIIP GGLILKYDIYQVLGFGMFFGGIAMGAVVQPWNYGVVWDAGTLGALAGVVVVGTAIAFGLY LQGVSMIGPLKGSIMGSVEPVSAVVISVFWLGTRFTLPDFLGFALILGAVFVLTFAHR >gi|157101638|gb|DS480686.1| GENE 155 156216 - 157247 1120 343 aa, chain + ## HITS:1 COG:SP1700 KEGG:ns NR:ns ## COG: SP1700 COG0722 # Protein_GI_number: 15901534 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Streptococcus pneumoniae TIGR4 # 15 338 
15 337 343 359 52.0 4e-99 MGFTFINKLPTPAEIREQFPLAPELAEIKHARDEEIKRVFTGESGKFVVIIGPCSADNEE SVCDYAGRLVKIQEKTKDKLIIIPRVYTNKPRTTGEGYMGMIHQPDPEKRPDMFEGILAI RHMHMRVLRETGFSTADEMLYPENMNYLSDMMSYVAVGARSVENQFHRLVASGCDVPVGM KNPTSGDLTVMLNSVVAAQLPHDFIFRGFEVQCPGNPLTHTILRGAVNKRGQCNPNYHYE DLKLLFDLYNKRGLQNPACIVDTNHANSNKQYHEQIRIAKEVLHSRRHSGDIKNLVKGLM IESYIEPGSQKIGEHCYGKSITDPCLGWDDTERLLYEIADNLD >gi|157101638|gb|DS480686.1| GENE 156 157427 - 158269 1024 280 aa, chain - ## HITS:1 COG:SPy0280 KEGG:ns NR:ns ## COG: SPy0280 COG1968 # Protein_GI_number: 15674457 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Streptococcus pyogenes M1 GAS # 1 272 1 272 279 265 52.0 5e-71 MSLVEVLKIIVLGIVEGFTEWLPISSTGHMILVDEIIHLNQPDAFKDVFLVVIQLGAILA VVVMFFHKLNPFSPSKSSRQKDATWALWIKIIIACVPAAILGLLLDDWMEAHLFNAYVVA AALIVYGVLFIVLENSSVCTNPRFNKVGQISAKTAFLIGMVQLLALIPGTSRSGSTILGA MVLGCSRAAAAEFSFFLGIPVMFGASLLKIVKYFLAGNVFTGPQIFYTILGMLIAFVVSI YSIQFLMSYVRKHDFKFFGYYRIILGVIVLAYFGITALAS >gi|157101638|gb|DS480686.1| GENE 157 158399 - 159073 675 224 aa, chain - ## HITS:1 COG:BH3850 KEGG:ns NR:ns ## COG: BH3850 COG0692 # Protein_GI_number: 15616412 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Bacillus halodurans # 1 224 1 224 224 285 59.0 6e-77 MAAITNDWLGPLSSEFKKPYYKELYKTVKHEYETKRVFPEPDDIFNAFAFTPLADVKVVI LGQDPYHNYGQAHGLCFSVKPDVDIPPSLVNIYQELHDDLGCYIPNNGYLKKWADQGVML LNTVLTVRAHQANSHRDIGWEKFTDAAIGILNQQDRPMVFILWGRPAQTKKAMLNNPRHL ILEAPHPSPLSASRGFFGSRPFSKTNAFLTEHGLTPVDWQIENI >gi|157101638|gb|DS480686.1| GENE 158 159189 - 159950 846 253 aa, chain - ## HITS:1 COG:aq_181 KEGG:ns NR:ns ## COG: aq_181 COG0107 # Protein_GI_number: 15605750 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Aquifex aeolicus # 1 252 1 250 253 330 64.0 2e-90 MFTKRIIPCLDVNNGRVVKGVNFVNLRDAGDPVEIAAAYDRAGADELVFLDITASSDNRR TVVDMVRKVAETVFIPFTVGGGIRTVEDFRMLLREGADKISINSSAIDNPRLISDAADKF 
GRQCVVVAIDARRRADGKGWNIYKHGGRVDVGTDALEWAMEADRLGAGEILLTSMDCDGT RNGYDNELTSLIASHVSVPVIASGGAGNKEHFYDALTKGGADAALAASLFHYKELEIRDL KKYLWERGIAVRM >gi|157101638|gb|DS480686.1| GENE 159 160025 - 160630 810 201 aa, chain - ## HITS:1 COG:CAC0939 KEGG:ns NR:ns ## COG: CAC0939 COG0118 # Protein_GI_number: 15894226 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Clostridium acetobutylicum # 2 198 5 199 203 208 53.0 5e-54 MIAIIDYDAGNLRSVEKALLSLGETPVITRDRKTILKADKVILPGVGSFGDAMGKLEQYG LVDVIHETVDQGTPFLGICLGLQLLFAGSEECQGAAGLGILPGKILKIPEHPGLKIPHMG WNSLKIRPEAKLFRGIEDGSYVYFVHSYYLKAEDESIVAASTEYGTVIHASVERENVFAC QFHPEKSGEVGLAILKNFIGL >gi|157101638|gb|DS480686.1| GENE 160 160705 - 161376 542 223 aa, chain - ## HITS:1 COG:SA0315 KEGG:ns NR:ns ## COG: SA0315 COG0846 # Protein_GI_number: 15926028 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Staphylococcus aureus N315 # 31 217 89 289 315 70 24.0 2e-12 MDQQKKDLLEALSGAEKVLIGIGGEWRLRDDGRDVRLRSLGDPVQKELREAYAVLWSLVR DKDYYVVTTVTDGAVYDTDFDSGRVVAPCGNIHWRQCSKACTKDIWEEGEVPDDICPHCR APLTGNTVEAETYIEEGYLPRWEAYKKWQTGTLNRRLVILELGEGFKTPTVMRWPFEKIG FFNQKSVFYRIHETFFQIPKETGERAAGIQADSVAFIRGLADI >gi|157101638|gb|DS480686.1| GENE 161 161673 - 163436 1420 587 aa, chain + ## HITS:1 COG:VCA0786 KEGG:ns NR:ns ## COG: VCA0786 COG3044 # Protein_GI_number: 15601541 # Func_class: R General function prediction only # Function: Predicted ATPase of the ABC class # Organism: Vibrio cholerae # 11 586 8 543 549 365 38.0 1e-100 MKSAQDLRRLLESIDRKGYPAYKDTRGSYDFTDYVLSIDHVQGDPFASPSKLSVFIPLQR AAYPSGYFDAPHKQTALEDYLVRQFCQEIAKYNFKAKGSGKSGLIATSHPGPEILSRTAC ECSIKGITARFEAGFPANGRTINSGELIKILFDFLPRCVKTVFYYKNRPAQEVKAVSDLA EDQYFIRCELERLGLVSFVADGSILPRESGISSRPMKGSVPFQSPDSLRMELNLPHGGRI TGMGLCRGITLIVGGGYHGKSTLLKALESGVYNHVAGDGRKYVITDSTAMKLRAEDGRSV QNVDISLFINDLPNKKDTHCFSTEDASGSTSQAAAVIEGIEAGSRVFLIDEDTSATNFMV RDDLMQKIISRDQEPITPFIERARDLYEKSGISTVMVAGSSGAYFYIADTILQMDCYEPL 
DITDKTKAFCSGYGVEPITSAPGFLLPSGGRKLLSNKGGSQTAPGNYASSNRGNYRGRGP RDGGRDERIKVKVYGKDSLQIGKSPVDLRFVEQLIDSEQTNSLAQMLRYCVEHQLLERYT LKDTVAMLLKEIQKGGLAAVGDSSYAACGFSMPRIQEIFACLNRYRN >gi|157101638|gb|DS480686.1| GENE 162 163551 - 164573 999 340 aa, chain - ## HITS:1 COG:TM1297 KEGG:ns NR:ns ## COG: TM1297 COG0667 # Protein_GI_number: 15644052 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Thermotoga maritima # 1 191 1 197 285 132 39.0 1e-30 MVEVTLGKTGITVNKNGFGALPVQRIPKEDAVYLIKKAYEGGITFFDTARNYTDSEEKLG AALEGIRDKVFVATKTAAKTAEKFREDLEISLKNLRTDHVDLYQFHNPSFCPKPGDGTGL YEAMLEAKEQGKVRHIGITNHRLSVAMEAIDSGLYDTLQFPFCYLATDKDLELVKRCREA GMGFIAMKALSGGLINNSRAAYAYLAQFDNVLPIWGVQRERELDEFLSYVGNPPEMTDEI RAVIDKDREELLGEFCRGCGYCMPCPVGIEINNCARMSLMIRRAPSAAWLTPDAQEKMKK IEQCLHCNQCKGKCPYNLDTPALLEKNLKDYQEILDGKEY >gi|157101638|gb|DS480686.1| GENE 163 164843 - 165526 473 227 aa, chain + ## HITS:1 COG:BH3238 KEGG:ns NR:ns ## COG: BH3238 COG0775 # Protein_GI_number: 15615800 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Bacillus halodurans # 1 226 1 232 233 134 39.0 2e-31 MKKIGILCAGDTELAPFLEHMKGQQITEKAMLKFHTGTINHVNVSAVYSGVCKVNAAIAA QLLIDMFHVDLIINAGTAGGMKEGVQLFDTVISERVIYHDVADDILTGFHPWLKSNYFLA DQELCAIARAYSRTSKHPVLFGTMVTGEQFIEDEKREEINQKFDPLSTDMETAGVAHVCY VNRIPFLAVRTITDTVTHQGIETFDQNCEAASEISAEIVLGILGQLD >gi|157101638|gb|DS480686.1| GENE 164 165542 - 166816 615 424 aa, chain + ## HITS:1 COG:PA0740 KEGG:ns NR:ns ## COG: PA0740 COG2015 # Protein_GI_number: 15595937 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Alkyl sulfatase and related hydrolases # Organism: Pseudomonas aeruginosa # 17 421 102 516 658 181 30.0 2e-45 MLIQNGHQQLQDFTNKNYKQTITEVTKGVWFVLGLGHSNAIFIEAGSSVILIDTLDTRER GEQLLDLIRQNTKKEVGTIIYTHGHPDHRGGAGAFMDSRPEVIAFAPAAPALEKTGLLQD IQNLRGARQFGYPLTDQEAISQGIGPREGITHGEHRAFVLPTTLYEQDKVSREIDGVQLE 
MVRLPGESEEEIMIWLPQKKVLCCGDNFYGCFPNLYAIRGGQYRDLAAWIHSIDVLMSYP AECLLPGHTAAILGHKTISSTLGNFRNAFEYILTQTLEGMNAGKTADQLAADIQLPPEYA GLPYLAEHYGCVEWTIRSIYSAYLGWFDGNPTHLHPLSPEEHSQKMIALIGGVQTVLDAA KTALSHQEYQWCLELCDLLLSNGNSAKEEVLHLKASSLEKLAEYETSANGRHYYMVCAKE MNPE >gi|157101638|gb|DS480686.1| GENE 165 167106 - 167348 406 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938388|ref|ZP_02085743.1| ## NR: gi|160938388|ref|ZP_02085743.1| hypothetical protein CLOBOL_03286 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03286 [Clostridium bolteae ATCC BAA-613] # 1 80 1 80 80 118 100.0 2e-25 MARGVRKSPVEKMQGELADVQASIAQYESCLETMREKEKALMEHIQLEEFKVVNEMLKER DMTIDDLKEMLAGGDVTLSA >gi|157101638|gb|DS480686.1| GENE 166 167790 - 169016 1080 408 aa, chain + ## HITS:1 COG:alr3807 KEGG:ns NR:ns ## COG: alr3807 COG0457 # Protein_GI_number: 17231299 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Nostoc sp. PCC 7120 # 95 327 353 592 604 72 27.0 2e-12 MNETDKGPDVCQNQDEGLNETQGVPDGSRELSDSSHELPDGMEIDMSHYKEGSAAAKCVL GVILLLCFIFMSRSCNYRHSTLTFYMGPEPEVNLGSQYYEEGRYQEAADYYQEKVDEVFS NGKRYEPANGPLYFNLGMAYYKLENYDQALLYLKECADVDKRTSNTEELAWDYNTMADVC KDAGHMDEALNYYEKGLKAIKKACGKDSEYTAIFLSDLGDAYKKTEDYEKALENYQQALE IQESNGSDQTWSYIRIARIYNALKDYDAAKHFYGKASGSETADSYTKGVAFYNMGQMYHE RGEYSPALAALDKALEFVNQDGDNAYAESNVQNTLASVYAESEDSLDKAILHSVTACRLL ETSGFPTHNCREDLEQFKEQLKGYYQTDTRDMTDEGFELWYQDQMDSE >gi|157101638|gb|DS480686.1| GENE 167 169032 - 169460 389 142 aa, chain + ## HITS:1 COG:lin1320 KEGG:ns NR:ns ## COG: lin1320 COG0824 # Protein_GI_number: 16800388 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Listeria innocua # 17 137 1 122 122 108 41.0 4e-24 MFKDYEHHAQYYETDQMGCVHHSNYIRWMEEARVNLMEQMGCGYKAMEASGIMSPVLEVH CQYRSMVRFDDHVKIHVWVKEYNAIRMTLGYEMRDAATGELKTTGESRHCFLDTGGRPVS LKRSYPQWDTAFCSAVQAQDSR >gi|157101638|gb|DS480686.1| GENE 168 169550 - 170704 839 384 aa, chain + ## HITS:1 COG:CAC2622 KEGG:ns NR:ns ## COG: CAC2622 COG2333 # 
Protein_GI_number: 15895880 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # Organism: Clostridium acetobutylicum # 1 293 1 290 307 246 44.0 7e-65 MTKLKNIRILIISILCAISLLLGGCTDSSPSFSPDKGSSITAPSGYGLAVHFIDVGQGDS ILAESDGHYMLIDAGENDQAGTVISYLKAQGVTKLDYVIGTHPHSDHIGGLDKVIDTFPV DKVILPPVEHTTKTFEDVLDSIASRGLKITKPTPGDSYDLGDASFTILSPVKDYGSDLNN WSVGVRLTYGDNSFVMCGDAENQAEEDIIKNGAVLKADVLKAGHHGSSTSTSDAFLKKVS PSWVVIQCGKGNSYGHPHKETMEKLKKAGCQVLRTDEEGTITAFSDGKTITWSTGTAAVP GTSAGSGGNQTAAIPDTDNGAAAENEAGNPSTRSSYVINTNTGKFHRPDCASATQMKPEN RKDVAASRDELVQEGYEPCKQCKP >gi|157101638|gb|DS480686.1| GENE 169 170731 - 170943 232 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938396|ref|ZP_02085751.1| ## NR: gi|160938396|ref|ZP_02085751.1| hypothetical protein CLOBOL_03294 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03294 [Clostridium bolteae ATCC BAA-613] # 1 70 14 83 83 103 100.0 4e-21 MKYIIDRLEEGLAICETELRKRISVPVSHLPKEVKEGDVLREEEGRFFLDSEETDKRRRE MKKKLMDLFE >gi|157101638|gb|DS480686.1| GENE 170 171008 - 171130 79 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNGWKYDRKYGRKYDRKDSRKYGKNDRKYGRSITGNLTGA >gi|157101638|gb|DS480686.1| GENE 171 171138 - 171815 642 225 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 101 221 497 615 693 68 34.0 1e-11 MKRTVKVWVAAVMSGVLLAGGAAMTGQAPWGMEAQAHSGRTDASGGHRDNKNASGLGSYH YHCGGHPAHLHPNGVCPYAGGGESTQTQTRAQTKTQTQAQTQAQAPAQTQAQAPAQAQTQ PQEQAVVRAASGAGEASGTSGWHQNGAYWNYICEDGTRVIGCWKQIDGVWYSFDGDGNMR TGWYGENGSLYYLGTDGKMAVGTCVVDGKEYQFDQDGKMISPDNR >gi|157101638|gb|DS480686.1| GENE 172 172024 - 173031 1186 335 aa, chain + ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: 
Fusobacterium nucleatum # 9 335 3 327 327 401 59.0 1e-112 MELIIPKDYDPRLSIRETQDAIKYIRDTFQKEMGREMHLERISAPLFVEKSSGLNDNLNG TERPVSFDMLGLPGETLEVVHSLAKWKRMALKKYGFKPGEGLYTNMNAIRRDEELDNLHS CYVDQWDWERVITKEQRTLDTLKDTVREIFKIIKHMQHEVWYKYPEAVKHLPKDIYFITS QELEDLYPDNTPKERENLITREHGCVFLMQIGDKLAGGKPHDGRAPDYDDWKLNGDILFW FEHLQCALEISSMGIRVDEAALEYQLKKAGCEDRRSLPYHKMLLNGELPYTIGGGIGQSR LCMLLLDRAHIGEVQASIWPQKMRDICGENKIYLL >gi|157101638|gb|DS480686.1| GENE 173 173100 - 173711 654 203 aa, chain + ## HITS:1 COG:L169795 KEGG:ns NR:ns ## COG: L169795 COG2860 # Protein_GI_number: 15673892 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 1 195 17 215 219 115 39.0 6e-26 MAFASSGALVAIKQRLDLLGVVVLGVTTAVGGGMLRDILLGIVPPSLFMNPVYVYTAFLT AMVLFILVWLNQQILESRYISAYEKMMNLFDAIGLGAFTVTGINTAGTAGYGEYHFLSIF LGVLTGVGGGILRDMLAGQSPYILRKHVYASASIAGAVCYVCLNGWLSRDASMILSAILV VLIRLLATKYDWNLPTAYHTKNK >gi|157101638|gb|DS480686.1| GENE 174 173820 - 175139 1401 439 aa, chain - ## HITS:1 COG:CAC1456 KEGG:ns NR:ns ## COG: CAC1456 COG1653 # Protein_GI_number: 15894735 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 18 435 21 437 439 412 47.0 1e-115 MKGWKWTLGLGAAVFLLAGFLGCIYKYYEKEIVLEFGMFTGSNWDVASASSFVMIDKAIA KFEEEHPGVRIHYYSGISKEDYSEWCARKLLEGKMPDVFMVPDTDFNLYSSLGVMKKLDE LIEHDAGFNRNDFFTTALDAGKYDGHQCALPYETVPVLLFVNKSLLAKEQIEVPGEDWTW DDMYEICRKITKDVNGDGLLDQFGTYNYSWRNAVYTSGGRFYDGHQQQLPFTDSHVLDAI KYMKRLNDLNQGQSVTQEDFNKGNVAFMPLTFAEYRTYKTYPYRIKKYTKFQWDCITFPA GKEGENRSRVDSLLMGINSNTKHEKLAWEFLKQLTGNEEMQMDIFRYSQGVSVLKSVTGS AQAAAIIQEDMDEGEQVINSSLLYNVIENGVIEPKYQQYEQVMSLADGEISKILMENRNV DSSMKIFQRSITKYLQQQR >gi|157101638|gb|DS480686.1| GENE 175 175210 - 175911 746 233 aa, chain - ## HITS:1 COG:CAC1455 KEGG:ns NR:ns ## COG: CAC1455 COG2197 # Protein_GI_number: 15894734 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like 
receiver domain and an HTH DNA-binding domain # Organism: Clostridium acetobutylicum # 12 233 1 223 225 300 71.0 1e-81 MQEYQSDGGRLMIKVLIADDQELIRQSLQIVLNSRPDIMVTDVASNGQEVIRSVRKNKPD VILMDIRMPKLDGVQCTKIIKENYPQIKIIILTTFDDDEYVYNALKYGASGYLLKGVSMD ELANAITTVYNGWAMINPDIATKVLRLFSQMAQADYSILVGDKNVDELTKTEWKIIAQVE VGASNREIAEALKLSEGTVRNYLSTILNKLDLRDRTQLAIWAVQTNVRKRLGT >gi|157101638|gb|DS480686.1| GENE 176 175871 - 177172 1393 433 aa, chain - ## HITS:1 COG:CAC1454 KEGG:ns NR:ns ## COG: CAC1454 COG4585 # Protein_GI_number: 15894733 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 3 430 4 439 441 310 38.0 3e-84 MDNRRVRILQLGILTLNVVSVMGLSIFIYATIENIRRSYVAREFLSGIQAIVWYPYWNIW LCALLLALLAGSMFVRDRLFPDNSKVILFSLVADFAICFAIIILLNFNYNGILLLVFSNV ILYAKNGKSRYFLAAVAIGSFILADYELLSISYRLYSIQDYISFYNATTQQYLLSSYNIL VSLNVIMFVVYCVNIINQQQGNLDEIHALNEQLQDVNEQLQEYSVMAEKMAETRERNRLA REIHDTLGHTLTGIAAGIDACLATIGTSPQQTKDQLELISKVTRDGIKEIRRSVSQLRPD ALERFSLEYAISKMVSDMNAVSGARVYFDCQVKNLKFDEDEENAIYRVIQEGITNALRHG HASQIWITIKKEDTDILLQIRDNGIGCKEIKSGFGTKHMKERIKMLSGVVTFDGSDGFTV NARIPIRWGETYD >gi|157101638|gb|DS480686.1| GENE 177 177189 - 178136 1085 315 aa, chain - ## HITS:1 COG:CAC1453 KEGG:ns NR:ns ## COG: CAC1453 COG1879 # Protein_GI_number: 15894732 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 7 315 13 325 325 243 41.0 4e-64 MKIGTGLLILTAGAMLLLGGCGHGDASGRQHEARLFGATYMTRNNPYFDVLNEGIEEVVE ANGDILLTRDPLQDQEKQNEQIQEMIDEGIQMLFLNPVDWEKVQPALDACREAGVGIINV DTVVKDRDSVISIIETDNYQAGQLCALDMMKRKDEAKIVILDNPIQTSITNREQGFLDTI ADNHNYQVVYREAAAGEIEVSSHVMADLLRRDISFDVILGGNDPTALGALAALQQARREE GVLIYGIDGSPDFKAILDVGYVTGTSAQSPRSIGRKAAETAYRYLDGEPVEKYISMPSTM ITRDNLHEFEIDGWQ >gi|157101638|gb|DS480686.1| GENE 178 178343 - 179311 1316 322 aa, chain - ## HITS:1 COG:YPO2499 KEGG:ns NR:ns ## COG: YPO2499 COG1172 # Protein_GI_number: 16122720 # Func_class: G 
Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 22 319 31 330 330 238 48.0 1e-62 MTTRLQKNVKQYFKDNIGIIGALLILCIFLSVFPKTSGSFLTQKNIFNVLRQISSNLFLA CGMTMVIILGGIDLSVGSIIALSGCFAAGCVSRYGLPIAVGVLGGVLVGTVVGMFNGVVI SKTTIPPFIVTLATMNVAKGLAYVYTGGSPVRVVTKEWQFLGAGYVGGVPTPVILLAVVL VITAIIMNKTKLGRHIYAVGGNSQAARFSGISTAKVKFLVHTYSGIMAGLAGVVLASRMY SGQPTAGDGAEMDAIAAVVVGGTSMAGGSGKIGGTIIGGLIIGVLNNGLNLLNVNSFWQY VVKGTVILLAVFVDYLRNKKAE >gi|157101638|gb|DS480686.1| GENE 179 179330 - 180817 191 495 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 258 474 1 217 245 78 26 3e-13 MSETVLLQMKNIHKSFPGVKALQGVDLELRAGEVHALLGENGAGKSTLIKVLGGIYIAEE GEIFIDGQKVTIDGVNASHNNGISIIHQELVLVPHMTVAENIFLGREPSKGLWVSHETME KEAQQLLDTYHMSIDAGTLVKDLTIAQQQMVEIVKAISYHSRILVMDEPTSSISDKEVEF LFSTMRELTKKGVGIIYISHKMDELEQICDRVTVLRDGTYVGTERVKDTTRDHLIAMMVG RELTKYYTRDYMEPGRVVLKCENIADGKMVKGASFELREGEIIGFAGLVGAGRSETMKCI FGLTPGCTGEISVDGKKAAIKSPIDAMKFGIALVPEDRKQEGLYKVQSVRFNSTIEVLKS FIKGIWVDADKEEEITRQYIEMMDTKTPSQEQLIGNLSGGNQQKVMIGRWLATAPKILIL DEPTRGVDVGAKSEIYGIMNELVKKGVSIIMISSELPEILNMSDRVYVMCDGRITGCFSH EECVTQEQIMKLAAK >gi|157101638|gb|DS480686.1| GENE 180 180948 - 182066 1371 372 aa, chain - ## HITS:1 COG:CAC1453 KEGG:ns NR:ns ## COG: CAC1453 COG1879 # Protein_GI_number: 15894732 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 91 372 45 325 325 174 38.0 2e-43 MKKSTKKLLAVMMCAAMAGSALAGCSSGDSQTTAAATEAPTQAQTTEEAKEAAAEPEKAE ESKEATEAAAPAAADMASADVKKKDPDKHYKFGYTCMDGTNPFFVTIEKRMREMVEANGD ELIAVDPANDVTLQITQVEDLISQNIDAMFMNPAEAEGILPALDQLKEAGIPIVGFDTEV ADLSYLISYTGSDNYNAGKVCGEDLVKKCPDGGDIIVLDSPTMNSVTDRTNGFLDAIEGK GFNIVAQQDAKGNLEVSMGIAEDLLQAHGDVVAIFGGNDPTALGALAAANAAGLTDCMIY GVDGSPDIKAEMASGESLIEGSGAQSPVKIAQSAVNIMYTYLNGDKVESRYPVETFLITS ENVDKYGTDGWQ 
>gi|157101638|gb|DS480686.1| GENE 181 182228 - 182920 645 230 aa, chain - ## HITS:1 COG:YPO3961 KEGG:ns NR:ns ## COG: YPO3961 COG3822 # Protein_GI_number: 16124089 # Func_class: R General function prediction only # Function: ABC-type sugar transport system, auxiliary component # Organism: Yersinia pestis # 1 221 1 223 227 219 50.0 3e-57 MKRSEINRALRDMEKMIDRCSFKLPPFCYFTPEEWNEKGHEYDEVRDNMLGWDITDFGMG DFDKVGFSLITLRNGNVSMDKYTKPYAEKLLYMKEGQSAAMHFHWNKMEDIINRGGGNVL IGVYNAGREELADTDVRIHSDGREYTVPAGTQIRLCPGESITIQPYLYHDFHLEPGTGPV LLGEVSMCNDDNRDNRFYLPAGRFPVIEEDEPPFRLLCNEYPPAGSCAKR >gi|157101638|gb|DS480686.1| GENE 182 182942 - 183802 985 286 aa, chain - ## HITS:1 COG:PM1373 KEGG:ns NR:ns ## COG: PM1373 COG0191 # Protein_GI_number: 15603238 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Pasteurella multocida # 2 285 3 285 294 199 37.0 5e-51 MLVSMKAVLDLADAKKIAVGAFNITSLEGIQAVLGAAEELGQPVILQFAPVHESIIPLKV IGPVMVMMAEKSSVPVCVHLDHGDKLEILREALEMGFTSVMYDGSVLPFEENAANTRIAV EMAGDWGASVEAEIGAMGRQEFSSVGEGEEGEAVESCYTDPEQARAFTGLTGIDALACSF GTVHGLYLTTPKLDFDRIRRIRESIGIPIVMHGGSGVSDEDFKKCIANGVRKINYYTYLA KAGGMYVKEKCRAAEEYVFFHDVTQWAVQAMKEDVLHTIRVFSHLE >gi|157101638|gb|DS480686.1| GENE 183 183951 - 185081 923 376 aa, chain - ## HITS:1 COG:YPO1816 KEGG:ns NR:ns ## COG: YPO1816 COG0524 # Protein_GI_number: 16122068 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Yersinia pestis # 34 364 22 307 319 78 24.0 2e-14 MEEGNKKVAVAGHISLDITPVFQNSGKQKLSELFQPGKLLKVGKAMMCTGGAVSNTGLGL KRLGADVVLMAKIGDDYFGNALKDMISAHGCETCISQVPGENTSYTIVLAPKGLDRFFLH DPGCNDTFGCGDVDFEKVGEASHFHFGYPPIMRRFYLDEGEELVRLFKKVKSMGLTTSLD LVAVDPDSEAAGMDWARILERVLPYVDFFVPSIEELGYMLDRPLYCKWQEKAGGEDVTGI LSLEEDVKPLAARALSMGTRCVLLKCGAAGMYLKTAPEPVWRKFLPEFTGWNDISYFEDS YVPDCILSGTGAGDTSIAAFIKAMLDGCGPLECIRLAAATGASCVTAYDALSGLLSFEEL RQKMEAGWEKQNIIHP >gi|157101638|gb|DS480686.1| GENE 184 185385 - 187049 2239 554 aa, chain - ## HITS:1 
COG:BH1407 KEGG:ns NR:ns ## COG: BH1407 COG1283 # Protein_GI_number: 15613970 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Bacillus halodurans # 5 536 8 537 543 309 37.0 9e-84 MGVELVLSLLGGLALFMYGMQMMSTNLEAVAGNRMKQILERLTANRFLGVLVGAGITALI QSSSATTVMTVGFVNSGLMTLKQAVWIIMGANVGTTITGQLIALDIGALAPLIAFVGVMS LIFAKNKKVQHVGGIIAGIGVLFIGMGMMSDAMVPLRDSETFIHMVTKFSNPLLGILVGA VFTAIIQSSSASVGILQALAMGGVINLHSAVFVLFGQNIGTCITALLASVGTSRNAKRTT LIHLMFNVIGTALFVTLCILTPFTDFVVSLTPDNPVAQIANVHTIFNISTTLILLPFGAL LEKIAIAILPDKAVPVKDADQWFEGLMASKHHLGISTIAINQIHDEIKGMLATAAENVSQ SFKAVEEGASEGIQAIADREEEIDLSNMRLSQKISKILVLDQTPKDIDTLNRMYTILGNI ERIGDHAMNLAEYAETIEAKGLQFSNYARKEFAVMEETCREGMELLKAAAAGDTQSPLSE VAEIEQRIDDITRKFRQNQIDRMREGNCNVESSILYSEMLTDYERIGDHMLNIAQAYDAI EWSGDRGQTAAAIA >gi|157101638|gb|DS480686.1| GENE 185 187444 - 188643 801 399 aa, chain - ## HITS:1 COG:no KEGG:Mahau_1979 NR:ns ## KEGG: Mahau_1979 # Name: not_defined # Def: major facilitator superfamily protein # Organism: M.australiensis # Pathway: not_defined # 1 386 1 386 394 463 62.0 1e-129 MALKDNYNQTLNASCLGYVVQAVVNNFAPLLFLTFQKSYGISLARIAMLVTVNFGIQLAV DLLSARFVDRIGYRTCIVAAHVFAALGLAGLSFLPDMVPGHFGGLLVCVALYAVGGGIIE VLISPIVEACPTARKDAAMSLLHSFYCWGSVAVVVLSTVFFQTAGLNSWRVLALLWAAVP VMNAVLFSRVPILALTEDGQEMGIRGLLKSSLFWLFIFIMVCAGASEQAMSQWASAFAES GLNVSKTVGDLAGPCMFSILMGISRAMYARYSQRIRLTAFMTGSAVLCIFSYGLASTAAS PVLGLAGCGLCGFSVGILWPGAFSLASVSCPRGGTAMFALLALAGDLGCAGGPALVGLIA GASGGNLKAGLAAAGWIPAVLAIGLLKIMYNNRKKQYTQ >gi|157101638|gb|DS480686.1| GENE 186 188735 - 189535 954 266 aa, chain - ## HITS:1 COG:no KEGG:Closa_2876 NR:ns ## KEGG: Closa_2876 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 263 4 264 264 374 66.0 1e-102 MGMQITVKDKNDAVKAEAAGREQAVLAWKGEYEEGDRIIFSLPEKNRFYIIRVDDTMDEA FIYGAGNELVYEVPFAEGKTSYNPKSFTGERHYLTMRYARDYEIKSYRNLAKNVMDQHRG QECYPHAHANVETRGEAVFAARNAIDGVTANLSHGKWPYESWGINMQDDAQMTLEFGRPV DFDQIVLYTRADFPHDNWWEKVTFTFSDGTSETVELEKSVEPHIISLERRGITWLTFGQL 
IKADDPSPFPALTQIEVYGREHEAAL >gi|157101638|gb|DS480686.1| GENE 187 189751 - 190848 927 365 aa, chain + ## HITS:1 COG:TM0121 KEGG:ns NR:ns ## COG: TM0121 COG1509 # Protein_GI_number: 15642896 # Func_class: E Amino acid transport and metabolism # Function: Lysine 2,3-aminomutase # Organism: Thermotoga maritima # 22 363 17 356 368 287 44.0 2e-77 MNWNETLRNNVTRAQELKTYMRLTSQEEEHMTRILEQFPMTVTRYYLSLIDWNNPEQDPV FRMSIPSIRETDLSGDFDTSGEADNTVLPGLQHKYRQTALILSTHRCAMYCRHCFRKRLV GISGGETAGNVDQMAAYIVSHPEITNVLISGGDSFLNSNQIIRRYLEAFSSIKHLDLIRF GTRTPVVLPMRIYDDPELLDILARYTKIKQIYVVTQFNHSNELTPQAVKAIRCLMDAGII VKNQTVLLKGINDDAGSLGTLLKNLTRYGVIPYYIFQCRPVSGVKSQFQIPLTEGCRIVE EAKNMQNGQGKCIRYAMSHVTGKIEILGQMPDKNMLFKYHQAKYEKDQGRIFCEKLAPGQ AWLQG >gi|157101638|gb|DS480686.1| GENE 188 190981 - 192129 1491 382 aa, chain - ## HITS:1 COG:ECs3659 KEGG:ns NR:ns ## COG: ECs3659 COG1454 # Protein_GI_number: 15832913 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 1 378 2 379 383 446 59.0 1e-125 MANRIVLNETSYHGAGAIQEIANEAKARGFKKAFVCSDPDLIKFNVTCKVTKVLEDAGLA YEIYSNIKPNPTIENVQTGVAAFKASGADYIIAIGGGSSMDTSKAIGVIITNPEFEDVRS LEGVAPTKNPCVPIIAVPTTAGTAAEVTINYVITDVEKKRKFVCVDTHDIPVVAVVDPEM MSSMPKGLTAATGMDALTHAIEGYTTKGAWEMTDMFNLKAIEVIARSLRDAVENKPEGRE GMALGQYLTGMGFSNCGLGIVHSMAHGLGALYDTPHGVANAIILPTVMEYNAPCTGTKFK DIAIAMGVDGVENMTQEEYRKAAVDAVKKLSADVGIPADLKEILKPEDVDFLSQSAMDDA CRPGNPKDPSFEDIKNLYLSLM >gi|157101638|gb|DS480686.1| GENE 189 192363 - 192929 561 188 aa, chain - ## HITS:1 COG:BH3394 KEGG:ns NR:ns ## COG: BH3394 COG1309 # Protein_GI_number: 15615956 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 10 170 7 163 186 62 24.0 6e-10 MAVTGKQEKTKYKLAESVKECMKTAPVDKITVKNIVEGCGVTRQTFYRNFLDKYDLINWY FDKLVLQSFEQIGMGHTVGESLTRKFEFILNEKVFFTEAFRSDDRNSLKEHDFELILQFY TDLIARKTSQPLGRELQFLLEMYCQGSVYMTVKWVLGGMKDTPREMSDKLVEAMPPKLAE VFGKLKLL >gi|157101638|gb|DS480686.1| GENE 190 193082 - 195718 2446 878 aa, chain - ## 
HITS:1 COG:BH0721 KEGG:ns NR:ns ## COG: BH0721 COG1511 # Protein_GI_number: 15613284 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 37 406 39 401 599 102 22.0 3e-21 MKMKYKRIMAAGMVFVMLCTGTGSAPVYGAQPSVDVDETMYVNLDYYGRVDKINVVKGVG LNGQTEFMDYGTYENVINMSNSIEPVLGDGMVTWQLPEDQRGRFYFKCSVDKGMMVLPWD FDVSYKLNGVPMDGDKLAGASGLIEINVKAEPNDNAGEYYRNNMMLMVAVPVDMSKCYSV EAEGSQTQNLGETTAVVFTALPGEDGDYTVRIGTDSFETIGVIMAMVPGTVEDLEHIKDL KEAKDTWQDAGDELYDSLEQLAKSVENMRQGVNQVRQGMDSAESARQKWSNSKDSILAGN DQTLAALTSVSQQMDTMIPHLQTAKDCAEVVHSSMNDIVNTLGDMQDPLRKLQTRLKNIE NSAGGISSDLPELQKTMESIIALDTQLQASQDTLLTYLSLYKSSSTKARRLYDEELDEEE LENMEDVDYGTSNGGSHSSSGGGSSQENTQNNNQGNDQGSGNSGGSTEGKDNPDGSAGGS GSTDGNGSTSGTGSGSGNADGGATGSGSGAGSGNTTGGASTDGSTTGSGSGNTTGGGSTD GSTTGSGSGNTTGGASTDGSGNTTGGESTNGSATGSSSASAAESGNSSGNSGSNTPDTGT AISSAINTGNTGLAGLAATASIEKKNIPLVSSIDQQAAAEAAIAIEQSLYDKVAVLQDLS ARSKSLTDKMANLMDDTSDSAKYSAEIVDNMDYLIEDLKALNDSLDVYYPDLQTALDDSQ ELVRRTTDALNNGISTLTIIQNTLKDSSDDFDAAARDSLRGSMELLDKSLNILDSTTVMR QAGRTMKDTIDNELDKFDTDNRFLFMDPSADKVSFTSDRNKPPKTLQIVMRTDEISVDDD TAKTADAEVEKAKESPLRRMWNVLVQMWKAMVSIFKNR >gi|157101638|gb|DS480686.1| GENE 191 195702 - 197978 2406 758 aa, chain - ## HITS:1 COG:BH0720 KEGG:ns NR:ns ## COG: BH0720 COG1033 # Protein_GI_number: 15613283 # Func_class: R General function prediction only # Function: Predicted exporters of the RND superfamily # Organism: Bacillus halodurans # 31 709 4 677 687 444 37.0 1e-124 MGQKLFGRIKDRMHRVKGREPQEGDKFSFRMARFIIHKKGWIESVFVAGCVFSLIAMLFV NVNYDLTEYLPASARSCIGLDLMEDEFGYPGTARLMIKDVSLYEAKQYKDKLEAVDGVDQ ILWCDSTVNIYSGEDFIRQKDIEDYYKDGCAVMDITFDQGDTAKKTSQAIDEMKAITGDK GYYVGMAVQNKSLTENVESEMNLILTVAVIMIFAVLCISTTAWSEPFLFLLVMGVAILLN RGTNIFIGTVSFLTNNVAMVLQLATSMDYSIFLLDAFSREKQKGLSEEQAMINAIDAAIN SIFASSLTTVVGFLALVSMKFTIGFDMGLVLAKGIVFSLITVLFFMPAMILKFSRWNDKT RHRPFLPSFRKFSEWVYRVRYASLIIMLILAPPAYVAQGMNDFLYGNSAVGASEGTQVYA DDQVISREFGRSNMMLAMYPNTSAVTEKEMSDEIEDLPYVKSVTSMSNTLPEGIPEEFLP YSVTSELHTKDYCRMLIYIRTKTESDQAFKGADEIRDILERYYPENSYLVGETPSTQDIK 
TTITADNSRVNLLSMLGVFLVVMFSFRSFAIPMIVMVPIEAAIFLNMAMPYLAGDTMVFM GYIIVSSIQLGATVDYSILLTNNYVSCRKSLEKKEACIQALMLSCPSIFTSGTIIILAGY IIHFISTTAAIGDLGHLIGRGALFSVILVLTVLPALLVLFDRIITSNEWDRLQKYLKRRH EKRKALIKAGIGTIVNKASALKRTDLEPAEVESDENEI >gi|157101638|gb|DS480686.1| GENE 192 198251 - 199129 913 292 aa, chain - ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 43 134 516 606 693 80 39.0 4e-15 MRYRICAVILAAAGALWAFEGTGPVTALADEQAQEISVSFVPGWNQTEGIWYWLEPDGTL HRGWLQADGRSYYMDEIGAMVTGWREVDGEWYYFHEDGGMNLGELILDNGKYEFSAQGAL VSAGWVENTGGGAYDAGCYDHMAQDLFDQLNEEKKYLFFEEYPDREDEYDGDMHRVYDRY AGFQMDMTLNKAADHRLEGAMAGGYADDRIPGEGTINDYLSVINYRRSASCLELYVRDCE DASEAFDKIKEKLDKRFQSKTDRKYSLEYYRSLGMAHREKDGKQYFVVILMR >gi|157101638|gb|DS480686.1| GENE 193 199382 - 199972 668 196 aa, chain + ## HITS:1 COG:CAP0046 KEGG:ns NR:ns ## COG: CAP0046 COG1309 # Protein_GI_number: 15004750 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 7 183 7 180 188 110 31.0 2e-24 MGIEPVDRRVRKTRRQLRECLITLLKEKKVQDITVRELTDMADLNRGTFYLHYKDVFDLL EKTEAELQEDFNQLVCKHDAVDLKQRPSVIFNEIYSLVYDNADLIEILLGENGDLNFVNR LKQLIREKCLKDWMEVFRSGNAAAFDAFFSFIVSGCIGLVQYWLQTGLKETPEQMAKLTE HIITKGIGVLEIDPYV >gi|157101638|gb|DS480686.1| GENE 194 200110 - 201342 700 410 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 11 410 2 396 396 274 37 3e-72 METRIEYSPKAMVRIRKGQGRSLKAGGAWIYDNEIESVNGAFENGDMVTVEDFDGYPLGT GFINTKSKLTVRMMSRRKDTVIDSDFLEMRVRSAWDYRKAVTDTSSCRVIFGEADFLPGI VIDKFSDVLVVESLALGIDRLKPVIVDKLRKVMAEDGITIRGVYERSDAKVRLQEGMERH KGFMGDAFDTKVEILENGVRYMVDVEDGQKTGFFLDQKYNRLAVQNLCRRIQPKQVLDCF THTGSFALNAGLAGSAHVLGVDASELAVNQARENAALNGLTEQVQFECADVFDLLPRLEQ EGRKYDVVILDPPAFTKSRNSIKNAVKGYREINLRGMKLVKDGGYLATCSCSHFMDPELF 
TKTIREAAGNVHKRLRQVEYRTQAADHPILWAADESYYLKFYIFQVCEEK >gi|157101638|gb|DS480686.1| GENE 195 201629 - 201778 118 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIILIIFLILTVLSILVCCFVAGSTPYDREADDEAQLEFLRKYRESKNR >gi|157101638|gb|DS480686.1| GENE 196 201726 - 201929 81 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938431|ref|ZP_02085786.1| ## NR: gi|160938431|ref|ZP_02085786.1| hypothetical protein CLOBOL_03329 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03329 [Clostridium bolteae ATCC BAA-613] # 1 67 7 73 73 129 100.0 6e-29 MRRSWSFCVSIGSQKTGKEALKEQKYLRHAQSKAFFLHLMVEYNLALMEYSGMTQSWVRD NLPADIL >gi|157101638|gb|DS480686.1| GENE 197 202458 - 203831 715 457 aa, chain + ## HITS:1 COG:RSp1018_2 KEGG:ns NR:ns ## COG: RSp1018_2 COG0489 # Protein_GI_number: 17549239 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Ralstonia solanacearum # 233 442 6 218 252 74 24.0 5e-13 MSETNDEIEIYIPDLLHSIWELRWIVLFLITLGGIAGSILSWDSTPSYETRASMIVNARN INDLYQNGSNVPRNEDIQMARNLVKTVQLLATSNRVLEIVLDEDGYSSIPLEELKQNISV TAEDDTAFLWLTLSWENPQQAIDILNHLMEVLPEVMLEVMDIGSVSVIDTARQAQGVSSP SLRNVVIGMAVGLITGSILGIIYYLFVPKIRNNSPLETLGIDIIGEIPYIESTKHTVEEF LDNKNVPEEYQVAYGRLAAVFRYLTEKNKQQVVAVTSSIPGEGKSTVAYNLALHLTELGC KVLLLDFDFKKGVLYQFTRKIKPKDGDERNEPRIGEKLRQQVEQMYNGIYTIQGFSQKNI FQVDNQIFPAIHSMKDTFNYILIDTPPVGTLSDVQLMRGLMDSVILVVRPDWVLRGAVEE SLEFLKKSGIPVTGGIVNGKTPLIGNALKRNIQARKS >gi|157101638|gb|DS480686.1| GENE 198 204206 - 204751 71 181 aa, chain - ## HITS:1 COG:no KEGG:BBR47_10940 NR:ns ## KEGG: BBR47_10940 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 93 181 157 245 251 63 39.0 4e-09 MNLSFSKTIQFILRDILKEMQYLPLAILWGTVLYLVTILCTRRYKKPIFSTTFIIYFIML FIITIFEREPGSRTGVSLKFFETLGGPKANAYVVENVLLFIPFGILVPLKWKQLRNTFVC TFLGFCLSCVIEIIQLITERGHFQVDDILTNTLGALIGGIVFRAFSFVIIKYRYHYRHDY D >gi|157101638|gb|DS480686.1| GENE 199 207207 - 207752 409 181 aa, chain - ## HITS:1 COG:no KEGG:Closa_3994 
NR:ns ## KEGG: Closa_3994 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 176 4 181 310 216 60.0 3e-55 MNTTLLIMAAGIDSRFGAGIKQLDPVDGDIHIIMDYSIHDAIEAGFNHVVFIIRKDIEKE FKEVIGDHIASICKPNNVTVNYAFQDINDIPGELPTGRTKPWGTGQAVLAAKKVIKTPFI VINADDYYGKEGLKAVHEYLVNGGESCMAGFVLKNTLSDNGGVIRGICKMDEQNNLTEKV K >gi|157101638|gb|DS480686.1| GENE 200 207778 - 208212 270 144 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938440|ref|ZP_02085795.1| ## NR: gi|160938440|ref|ZP_02085795.1| hypothetical protein CLOBOL_03338 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03338 [Clostridium bolteae ATCC BAA-613] # 1 144 188 331 331 281 99.0 9e-75 MLESDSRKYDAVAGVTAFGLVMFSWFIYRDGNRLYYFDMKPAYYKELLLAVLIPCAFGIV FARFAYWLEKEKYFEILKRFLYLCGRATIPIMFMHVPLNHWKDCVGYGRLEYMVIGIGIP LMIIFAFNENSVMRKVLGIPKLNR >gi|157101638|gb|DS480686.1| GENE 201 208806 - 208973 110 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938441|ref|ZP_02085796.1| ## NR: gi|160938441|ref|ZP_02085796.1| hypothetical protein CLOBOL_03339 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03339 [Clostridium bolteae ATCC BAA-613] # 1 55 25 79 79 101 98.0 2e-20 MHAIMRLIDWKRGKPYTFRYSEFDELCESEMFFARKFDASVDSEIINFNKEKVLE >gi|157101638|gb|DS480686.1| GENE 202 209050 - 209322 58 90 aa, chain - ## HITS:1 COG:no KEGG:BT_3377 NR:ns ## KEGG: BT_3377 # Name: not_defined # Def: putative glycosyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 26 85 90 151 322 63 54.0 2e-09 MDKKNQVYDAGMVTSHIKESAVYHVERTCVTWGGGVNAELLLLKKAVSVGSYQHYHLLSG ADLPIKTQEQIVSFFEANKDKENIPIYILL >gi|157101638|gb|DS480686.1| GENE 203 209447 - 209965 153 172 aa, chain - ## HITS:1 COG:VC0238 KEGG:ns NR:ns ## COG: VC0238 COG0110 # Protein_GI_number: 15640268 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Vibrio cholerae # 48 170 56 182 188 80 39.0 2e-15 MVRGGGVASDIFSNNVRIKGIQPFKKTNIIIGKTGKIEFAGKAFIAGGTTIRVENGVCTF 
CANFSCNTNCFISCTEKVTFGDNCLLGCYVNVRDSDGHTIILNDVEQVSLKSVEIGKDVW IAANVDILKGVTIGDANVIGYRSCVTKSVNEEHCIIAGYPAKVVRREIDWKR >gi|157101638|gb|DS480686.1| GENE 204 210023 - 210706 289 227 aa, chain - ## HITS:1 COG:TM0620 KEGG:ns NR:ns ## COG: TM0620 COG2244 # Protein_GI_number: 15643386 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Thermotoga maritima # 3 223 258 479 479 118 33.0 9e-27 MTLIHVFSIIQTTFNTLWAPMAVEHYTNDKEDVSFYQKGNQIITVIMFFIGFTLILCKDV FSVLLGAKYREAAYILPFLIFNPIMYTISETTVNGLVFKKKSNMQVVVSLGACIVNIIGN TILVPKLGCQGAAISTGFSYIVFFTLRTILSNRYFYVDIHLKKFYFITALAIMYAFYNTF VKFNIGSIVGYIVCAVSLVLLYRATVKCGIQYVMDMATVFFRKKADK >gi|157101638|gb|DS480686.1| GENE 205 211933 - 212076 59 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MAKMNTVYRMLFTYKTINNTKKIAIADFEAVNATPITVIKITHLSCL >gi|157101638|gb|DS480686.1| GENE 206 212696 - 215044 777 782 aa, chain - ## HITS:1 COG:HI1698 KEGG:ns NR:ns ## COG: HI1698 COG0438 # Protein_GI_number: 16273585 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Haemophilus influenzae # 396 776 2 350 353 125 25.0 4e-28 MDYNTLVKQLADKELTYILEHGFAHAGFNGPYYNQDTPVRNSAHWIVTYSYLYEKTGKQE YYDAAKILSEYLLRKENYGTSGSIRCRTDARYDHTNGLIGQAWVIEGLVESARLFKDLSY YEKAVQLFKIQEYNSNNNLWEVTDCNGEKDYDLTFNHQLWFAASGAMILEFETENKINKS VFIDNQIQDFLSAAEKKYFKVKKDGCIVHVVDYQKSPDEVEHSKKLQNARKMASITENPV KVIKKKITDKLSKFSLTEGLEEGYHIFDLYGFALLKKLYGNHSIYKSDNFKAAVHYALDT NQLLKLRNSCGGEKFNKFAFGYNSPAFEYPFVAYMFNGTINQFLVDKLFQFQLDATYKED RLSQNCVDPNTLEARIYELVRYLQLNELHFTRRKKQKVCFLTNNIAEMGGRQRVNALLAS EMSKTDDLDVSIVFTSNYKTAMKRFYYLNNNVRVLWSQKLSPSKRDSLYKIARYINKKII RFRNTSFLQFVYFPPYEVRNYNQFFSDNKFDVVVGVGTRAGAMLSLLNDKSKKVVWLHNS YDVYFLKKDYFQWHQDGLYKKLLSKPDAMVVLTDGDINRYKRYIECDPIRIYNPLTLVCK DEAKLENNELVFVGRLDYDIKGLDLLADIFSIVKMKISEARLTIVGDGNGRKRLEETLAK LNLQHSVNFVGQRDNVMQYYQQGSVVLLTSRKEGFGLVTTEAMECGLPVVSFKTEGPSEI INDGRNGFLIDNYDVNAFAEKVILICKNKELRSVMGRKAKERAKDFSIDKIVNEWRELFV RI 
>gi|157101638|gb|DS480686.1| GENE 207 215060 - 215983 196 307 aa, chain - ## HITS:1 COG:PAB0772 KEGG:ns NR:ns ## COG: PAB0772 COG0463 # Protein_GI_number: 14521365 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Pyrococcus abyssi # 7 239 4 226 298 132 35.0 7e-31 MSENKEPLVTTVITTYKRKPEIVKRALDSIVGQTYDNIEIFVVNDCPTDKKLIEDLKNMI ASSSGDRTVHYIVVEKNGGACRARNIALEKAKGKYFACLDDDDEWLPKKIELQVRALEEH PDAVIAYCNAVIRYADIGKETVRFTKKQNEGNLYYELIGKNNIGSCSFPMFRVDAIKRAG GFDANMLALQDWDLYLRILKENKAVYVHEPVAVYYFYAGQRISAHPENRIIAYERLHQKL GKDLENNRKSAAAFYLMGTYFYSLAGNVKTAMHYYILGVKNDPCNFKRNIKDFFRMTGRK FVKTKNV >gi|157101638|gb|DS480686.1| GENE 208 216322 - 216444 85 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEHLNSLSRKNHDGKTHPKYQEIGVLVHILSDYFQMVKVI >gi|157101638|gb|DS480686.1| GENE 209 216684 - 217214 143 176 aa, chain - ## HITS:1 COG:SP0353 KEGG:ns NR:ns ## COG: SP0353 COG0438 # Protein_GI_number: 15900282 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 2 173 196 367 372 117 37.0 1e-26 MLIGHVGNFNRQKNQEYLINVFAELLQLKPNSWLYLMGDGKNKQKCMELADKLKISNRVI FTGSITNVSDMLQAMDVMVLPSIHEGLPLVVVEWQMAGLPCLVSDTVTNECVFTDLVYFM SLEEYLKWANRIITLGKTDRLTNSKIAIDSAIEAGFDIELDASRLQEYFIERCDRG >gi|157101638|gb|DS480686.1| GENE 210 218088 - 218633 149 181 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1358 NR:ns ## KEGG: Cthe_1358 # Name: not_defined # Def: glycosyltransferase # Organism: C.thermocellum # Pathway: not_defined # 1 179 115 296 296 226 59.0 3e-58 MIVKNKNLLDKKAAKVKIENCLKQNFYCIGREWPYKNVKPRIIAEKYMEDAETGELRDYK FFCFNGEPKAMFIASDRNSGHVKFDYYDMEFNHLDITQKYPNAEIPNKKPNCFNEMIVLA EKLSFGIPHVRVDLYEVNGNIYFGEMTFFHFSGFIPFRPNNWDKVFGDWLILPAKKVKRQ L >gi|157101638|gb|DS480686.1| GENE 211 218693 - 218980 63 95 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1358 NR:ns ## KEGG: Cthe_1358 # Name: not_defined # Def: glycosyltransferase # Organism: C.thermocellum # Pathway: not_defined # 12 78 12 78 
296 106 73.0 3e-22 MSSNRLAKKVRYALRFLPDELYIQLNYFAHFKKFANLKKPVTYNEKLNWLKLHDHNPLYT MLVDKYEVKKYVEKIIGDGGGVSYQLLESGSALKI >gi|157101638|gb|DS480686.1| GENE 212 218982 - 219476 236 164 aa, chain - ## HITS:1 COG:all2286 KEGG:ns NR:ns ## COG: all2286 COG5017 # Protein_GI_number: 17229778 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 1 112 1 112 169 81 35.0 8e-16 MIFVTVGSRKYPFDRLFKELDELYEDGTLCEPMFAQIGTSTYQPKHYKFKDFISPEEFIE KINEADIVVSHGASGSIMKALNAGKKVIAVTRLEKYGEHINDHQIQNNEAFSSNGYVLMA GLELDDLGECFKKIYDGADGVRPWENKDPMAIVNMIDKFIQENW >gi|157101638|gb|DS480686.1| GENE 213 219519 - 219875 178 118 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1202 NR:ns ## KEGG: Cphy_1202 # Name: not_defined # Def: oligosaccharide biosynthesis protein Alg14-like protein # Organism: C.phytofermentans # Pathway: not_defined # 2 118 56 172 172 186 76.0 3e-46 MITEKTQFQQNAKYFMIQTDLKDKFMPFKMAINCLRSIVIWIKERPDFVITTGTMVAYSF YLLAVLFHKKFVYIETFGRANMPTVAGKRMEKHSALFIVQWESQKKYYKKAIYGGCLY >gi|157101638|gb|DS480686.1| GENE 214 220007 - 221257 1008 416 aa, chain - ## HITS:1 COG:STM2080 KEGG:ns NR:ns ## COG: STM2080 COG1004 # Protein_GI_number: 16765410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Salmonella typhimurium LT2 # 7 416 1 388 388 486 61.0 1e-137 MREFKDLKIAVAGTGYVGLSIATLLAQHHKVIAVDIILEKVELINNRKSPIQDDYIEKYL AEKDLNLTATLDAKESYADADFVVIAAPTNYDSKKNFFDTSAVENVIGLVMEYAPDAIMV IKSTIPVGYTKSIREKTGSKNIIFSPEFLRESKALYDNLYPSRIIVGTDMEDERLAEAAH TFAELLQEGAIKENIDTLFMGFTEAEAVKLFANTYLALRVAYFNELDTYAEMKGLNTQQI IKGVCLDPRVGDFYNNPSFGYGGYCLPKDTKQLLVNYQDVPENLIEAIVESNRTRKDFIA DRVLEIAGAYGANDSWDESREKEVVVGVYRLTMKSNSDNFRQSSIQGVMKRIKAKGAIVI IYEPTMKDGDTFFGSKVVNDLDEFKRQSQAIIANRYDKCLDEVKEKVYTRDIFQRD >gi|157101638|gb|DS480686.1| GENE 215 221308 - 222372 689 354 aa, chain - ## HITS:1 COG:BH3709 KEGG:ns NR:ns ## COG: BH3709 COG0451 # Protein_GI_number: 15616271 # Func_class: M Cell wall/membrane/envelope biogenesis; G 
Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Bacillus halodurans # 11 353 3 333 343 344 51.0 1e-94 MSYVDLKNKTILVTGAAGFIGSNLVKRLYKDVEDVTVIGIDNMNDYYDVRLKEARLSELS AHPSFIFVQGSIADKELVNKVFEQYRPQIVVNLAAQAGVRYSIINPDAYIESNLIGFYNI LEACRHSFDDGHTPVEHLVYASSSSVYGSNKKVPYSTDDKVDNPVSLYAATKKSNELTAH AYAKLYNIPSTGLRFFTVYGPAGRPDMAYFGFTDKLRAGKTIQIFNYGNCKRDFTYIDDI VTGVEKVMTKAPDKAIGEDGIPVPPYALYNIGNNHPENLLDFVQILSEELVRAGVLSEDY DFDVHKELLPMQPGDVPVTYADTSALERDFGFKPSTDLRSGLRRFAEWYKEFYM >gi|157101638|gb|DS480686.1| GENE 216 223040 - 223267 85 75 aa, chain - ## HITS:1 COG:ugd KEGG:ns NR:ns ## COG: ugd COG1004 # Protein_GI_number: 16129969 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Escherichia coli K12 # 2 58 84 140 388 60 52.0 6e-10 MNYNGKKNFFDTYVVEVVIKLGGQYNPDTMLVIKSTISVGFTAFAHEFYHCDNIIFSPNS CMNLRDCMTMASLMN >gi|157101638|gb|DS480686.1| GENE 217 223451 - 223546 72 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYVITTVLKRDFEFELSTKLAGWYAEVCRKV >gi|157101638|gb|DS480686.1| GENE 218 223562 - 224848 677 428 aa, chain - ## HITS:1 COG:all4160_2 KEGG:ns NR:ns ## COG: all4160_2 COG2148 # Protein_GI_number: 17231652 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. 
PCC 7120 # 215 424 23 221 226 214 50.0 2e-55 MAFVFLFIQIFVTFFGESFKNVLKRGKFKELTETIKHVCQVILIAAFYLFATQKGEEYSR ITLVTTGILYVVISYIARIYWKKYLLTRGLEGKGKRSLLILTSKGMVEEVVQNILNKNYE RFHIIGVSLMDADWAGKTINGVTIVTDRTGIAEYVCREWVDEVFINLPKNMPLPQELMDD FIDMGVTVHLKLIEMGRLKGQIQHVERLGSYTVLTSCINMASWRQAFFKRVMDIAGGVLG CILTGILFLFVAPCIYVQSPGPIFFSQVRIGKNGKRFKLYKFRSMYMDAEERKKELMNQN RVKGGMMFKIENDPRIIGGKKGIGNFIRNYSIDEFPQFWNVLKGDMSLVGTRPPTVDEWE KYDLHHRVRLSVKPGITGMWQVSGRSEITDFEEVVKLDKEYITEWRMGLDIKILLQTVQV VLGRGGAM >gi|157101638|gb|DS480686.1| GENE 219 225089 - 225796 532 235 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938458|ref|ZP_02085813.1| ## NR: gi|160938458|ref|ZP_02085813.1| hypothetical protein CLOBOL_03356 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03356 [Clostridium bolteae ATCC BAA-613] # 1 235 1 235 235 239 100.0 1e-61 MKKKCIAALVLAAALASQPAWAFAAGSASSSGSDGSGSISSSSSSSNKGTTSSKSNTTVS VSSTGVKTTGATTSASPTGSTIGVAVDTVTTTGQQVTTNSKGEAVIGDTAVAFADSVAAT TGLPSNVVSAINDINAGKPLNEAVQGVDLTGYNALTGTHAIITKDAATGTVKTGAVEVSL YVPNLVSNVNNVQVLFYDNATGQWQLIPAVKVDPVTKTVAVNVPGSGTLSVVYKN >gi|157101638|gb|DS480686.1| GENE 220 225916 - 226968 616 350 aa, chain - ## HITS:1 COG:all0187 KEGG:ns NR:ns ## COG: all0187 COG1316 # Protein_GI_number: 17227683 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Nostoc sp. 
PCC 7120 # 84 263 152 319 503 63 27.0 6e-10 MSWIRNHKNLVHRIAVGAAALSFTCFIWTGVQLYRGRKVLEAEIAARNHGKQYEVQEDSM CSFSGDVVKYKGKQYRRNSYVKAILCIGVDRSGNMTEKTTTGFGGQADGLFLIAQDTVRN TVKILMIPRDTMTDIMLTDLSGNELGKDLQHLNLAYAYGDGRERSCEYMAEAVSELFGGL KIEWYMAADTSAIPVLNDEVGGVSVTIGTDGMESRDPALVKGETVILKGKQAEIFVRYRD IRVDHSAIYRMDQQQQYIKGFFEAVQKHSAKDSGLVVRLFDRVQEYMVTNMAKDQYLKVA MDVVGSGKLSDEDFYTVPGEGVVTPRYDEFYADKEALTPILLELFYREIE >gi|157101638|gb|DS480686.1| GENE 221 226965 - 227696 475 243 aa, chain - ## HITS:1 COG:SP0349 KEGG:ns NR:ns ## COG: SP0349 COG0489 # Protein_GI_number: 15900278 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Streptococcus pneumoniae TIGR4 # 7 234 11 226 227 179 41.0 4e-45 MSQFVTLNRIMKDDYHYNEAIKTLRTNIQFCGNGIKTIMLTSSLPDEGKSDITFAMASSL AQIGKRVVMIDADIRKSVLVSRYQLEHEVPGLSQYLSGQRPLEHVLYDTNIENLSIIFAG PYSPNPAELLEESLFSDMISKLKEKYDYVIIDTPPMANLIDGAIVARQCDGAVMVIESGS ISYRLEQRVKGQLEKSGCRILGVVLNKVSMERNGYYSKYYGKYGKYSKYGHYGKYDRYEK AEA >gi|157101638|gb|DS480686.1| GENE 222 227693 - 228478 719 261 aa, chain - ## HITS:1 COG:SP0348 KEGG:ns NR:ns ## COG: SP0348 COG3944 # Protein_GI_number: 15900277 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 6 224 3 224 230 110 34.0 2e-24 MEKTYDDDEIEIDLLELLGEFRRKIWIILGTIILFGSVAGAFSAFVLTPQYKSTAMVYIL SKETTLTSLADLQIGSQLTKDYKIIVTSRRVLNQVIEDMELSLTYKELVEKVTIDNPQDT RILSISVEDPDPAMAKLIADKIAVTSSDYIGDIMEMVPPKLIEEGEVPILKSSPSNTKNA LIGGFIGAVLVCGFITVNVILNDTIRTEEDVTKYLGLSVLASVPEREGEKPEDKEAMISS KPKNKPAVGKSRKKKRGGHAS >gi|157101638|gb|DS480686.1| GENE 223 228513 - 229142 299 209 aa, chain - ## HITS:1 COG:SP0347 KEGG:ns NR:ns ## COG: SP0347 COG4464 # Protein_GI_number: 15900276 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 1 206 
25 238 243 100 28.0 1e-21 MLEESYRQGIRYIIATPHYSRRGLNPDIYDLSEKLKEEARRIAPDFRTGLGQETYYHDGL VENLRDGKALTLEGSQYVLVEFDPQVPYMKMYQAVRKLTMARYIPIIAHVERYACLREDT NMSELFQCDCRLQMNFSSLKGNSILDKEVRWCRKQIMQGRIYCLGTDMHRMDYRKPEIDE SFQWLVNHLDGLKLHGLLRGNAKNVIRKR >gi|157101638|gb|DS480686.1| GENE 224 229220 - 229678 312 152 aa, chain - ## HITS:1 COG:no KEGG:ELI_0681 NR:ns ## KEGG: ELI_0681 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 11 140 136 266 278 99 41.0 4e-20 MADTENVNIFIKTFDGFDVYVNGKMLYFPSSKAKEMLAIMVEKRGSSVSLSQMTYLLYEN IEERTAKNNLRVVYYRLRMNLMEHGIEQILIKKRGSYAVDTELFICDFYEFIKGNSDYIT LFSGSYMPEYAWAEDMLPYLRNLYRKYNGGLI >gi|157101638|gb|DS480686.1| GENE 225 230000 - 230206 129 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938465|ref|ZP_02085820.1| ## NR: gi|160938465|ref|ZP_02085820.1| hypothetical protein CLOBOL_03363 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03363 [Clostridium bolteae ATCC BAA-613] # 1 68 1 68 68 122 100.0 6e-27 MEKMLMQEYIEDYAKFLCLEENEGTLFYFLSNRDRHIAVGRDPGVFQDKNSLDIYQQTEK NIIKYYLN >gi|157101638|gb|DS480686.1| GENE 226 230379 - 230834 358 151 aa, chain - ## HITS:1 COG:no KEGG:BDP_1226 NR:ns ## KEGG: BDP_1226 # Name: virR # Def: two-component response regulator VirR (EC:3.1.1.61) # Organism: B.dentium # Pathway: not_defined # 7 141 125 259 259 98 37.0 9e-20 MFDITSDKVYVKTFGGFDLYYQGELIYFSSSKAKELLAILVDWSGKEVTLSHLAEILNEG ERDETAAKQAVHLAWHRLKQTLKKYGIEKIVVKGRGTYAINKEAVVCDSYDMVRQVQGAS NFFVGEYMPEYSWAEVTLSNLIRDYFEEGES >gi|157101638|gb|DS480686.1| GENE 227 231241 - 231726 599 161 aa, chain - ## HITS:1 COG:mlr1925 KEGG:ns NR:ns ## COG: mlr1925 COG2080 # Protein_GI_number: 13471825 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Mesorhizobium loti # 5 154 6 154 157 150 48.0 1e-36 MLNIVHMTVNGKEVELAVDPRESLLETLRNRLGLTSVKKGCEVGECGACTVLVDGEAIDS CIYLTMWAEGRSIMTVEGLKGPNGELSTIQKAFIEEAAVQCGFCTPGLIMTAVEIVGTGK 
RYNRDELRKLISGHLCRCTGYENILNAMERIVEETYKVVGK >gi|157101638|gb|DS480686.1| GENE 228 231719 - 232606 941 295 aa, chain - ## HITS:1 COG:ECs3740 KEGG:ns NR:ns ## COG: ECs3740 COG1319 # Protein_GI_number: 15832994 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli O157:H7 # 1 292 1 290 292 261 42.0 1e-69 MYDIGKFYQAADVEDAVRALVEDPEAVVISGGSDVLIKIREGKLAGCSLVSIHGIKELEG IRMEEDGTIVIGPATTFSHITNNDIIQKHIPMLGDAVDMAGGPQLRNIGTIGGNVCNGVT SADSASSLCCLDAVLVLKGPDGVREVPISQWYTGPGRTVRNHDEVLTAIRIKKENYQGYG GQYIKYGKRNAMEIATLGCAVSVKLTEDKKHIQDLRLAYGVAAPTPIRCHTTEEAVKGME TGEALAQAVGKGALEEVNPRSSWRASREFRLQLVEELGRRAVKQAVINAGGEWDA >gi|157101638|gb|DS480686.1| GENE 229 232625 - 234913 2412 762 aa, chain - ## HITS:1 COG:ygeS KEGG:ns NR:ns ## COG: ygeS COG1529 # Protein_GI_number: 16130768 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli K12 # 9 762 2 752 752 689 46.0 0 MGIGKNMTRVDAFDKVTGSARYTADLEPQGLLVAKVIRSTIANGVVKSFDLEEARKVPGV VKIVTCFDVPDIQFPTPGHPWSVETAHQDIADRKLLNTRVRVYGDDIAAVIAEDDIAAAR AARLVKVEYEEYAPLLTVEDAMAEGASRLHDEKPGNVIAHSSFVVGEGTYGEAVKEEGLV EIEKDYRTQSVQHCHIETPISFAYMEKGRIIVTTSTQIPHIVRRVISQALGLPMGRIRVI KPYIGGGFGNKQDVLYEPLNAFLTTQVGGRGVRMEISREETLGCTRVRHAIEFKVKAAAR KDGTLVARKLEAYSNQGGYASHAHAIVANSSNEFKQIYKDEKVLESDAWTVYTNLATGGA MRGYGIPQAAFAAECMADDLALALHMDPLEFRKKNCMRPGYVDPHTHVKCNTYGLAECME KGREFIRWDEKRREYENQTGPVRKGIGMAIFCYKTGVYPISLETASCRMILNQDGSMQLQ MGATEIGQGADTVFTQMAAEVTGITEDKVNVLSTQDTDVSPFDTGAYASRQTYVSGMAVK KTGAIFKEKILDYAAYMLEKPQDTLDVRDNHIVDKETGTQLLPMEELATTAFYSLDRSVH ITAEATSHCKTNTFATGACFAEVEVDMPLGKVKVTNIVNVHDSGKLINPKLAAAQVHGGM SMGLGYGLSEELLFDAKGRPLNDNLLDYKLPTAMDTPDLNALFVELDDPTGPFGNKALGE PPAIPVAPAVRNAVLNATGVAVDSLPMDPQKMVKHFKEAGLI >gi|157101638|gb|DS480686.1| GENE 230 234987 - 236381 1625 464 aa, chain - ## HITS:1 COG:MJ0326 KEGG:ns NR:ns ## COG: MJ0326 COG2252 # 
Protein_GI_number: 15668500 # Func_class: R General function prediction only # Function: Permeases # Organism: Methanococcus jannaschii # 9 464 5 434 436 360 50.0 3e-99 MENKSNQGFLEKVFHLSENHTDVKTEVIAGITTFMTMAYILAVNPNILSATGMDRGAVFT ATALASLVATLLMAAFANYPFVLAPGMGLNAYFAYTVVLQMGYTWQMALAAVFVEGLIFI ALSLTNVREAIFNAIPMNLKHAVSAGIGLFIAFIGLQNAKIVVESATLVSVFSFKGSLDA GTFNSVGITVLLALIGVLITGILVVKNIKGNILWGILITWILGIICEVTGLYQPNAELGM FSVLPDFSSGFGIQSMAPTFFKMDFSGILSLNFVTIMFAFLFVDMFDTLGTLIGVASKAD MLDKDGKLPKIRGALLSDAIGTSLGAVFGTSTTTTFVESASGVAEGGRTGLTSVVAAIFF GLSLFLSPIFLAIPSFATAPALIIVGFLMISSILKIDFNDFTEAIPSYIAIIAMPFMYSI SEGIAMGVISYVVINVATGHAKEKKISLLMYILAILFVLKYVLI >gi|157101638|gb|DS480686.1| GENE 231 236497 - 237768 1316 423 aa, chain - ## HITS:1 COG:CAC0282 KEGG:ns NR:ns ## COG: CAC0282 COG0402 # Protein_GI_number: 15893574 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Clostridium acetobutylicum # 6 423 9 424 428 409 47.0 1e-114 MAENFVLKGNICYSQDSRTLICVEQGYLVCADGESAGVYKELPQMYAGFPLTDYGDRLIV PGLTDLHLHAPQYSFRGLGMDLELLEWLNTRTFPEEAKYSDMEYAGKAYTIFAENMKHSA TTRACIFATMHREATELLMDLMEETGLKTMVGKVNMDRNSPDILVEETAESLRETRQWLE DIKGRYRNTAPILTPRFIPTCSDTLMEELKKIQMEYDLPVQSHLSENKGEIDWVKELCPW SEFYGDAYDHFGMFGSGVKTIMAHCVWPPEEEIQRMRDNGVYVAHCPQSNTNLSSGIAPV RTYLEQGIPTGLGTDVAGGAGESVFRAMADAIQVSKLRWRLVDETCKPLTAEEAFYLGTK GGGGFFGKAGSFEEGYELDALVLNDESLKHPQPLSLKERLERFIYLSDERHIDAKYVAGR KIF >gi|157101638|gb|DS480686.1| GENE 232 238477 - 239151 931 224 aa, chain - ## HITS:1 COG:BH2663 KEGG:ns NR:ns ## COG: BH2663 COG0569 # Protein_GI_number: 15615226 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Bacillus halodurans # 11 222 5 217 220 153 40.0 3e-37 MARKQTEAMSYGIIGLGRFGSALAATLAEADKELMVLDRSEEKIRQARNYTEHAYVVKDL QKETLRETGIQNCDVVVVCIGDKVDVSILTTLNVINLGVPQVVAKANSPEQGEILEKIGA QVVYPERDMAVRLARRLISKRIFNLFELDQNVDISEIKTTRKLTGQSVRDAQLRNRYGIN 
IIAITRGGWLTTDVSPDYVFQDEDTVIVIGEREKIKRLADDLAR >gi|157101638|gb|DS480686.1| GENE 233 239210 - 240502 1399 430 aa, chain - ## HITS:1 COG:BS_yubG KEGG:ns NR:ns ## COG: BS_yubG COG0168 # Protein_GI_number: 16080162 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus subtilis # 6 430 13 445 445 236 37.0 7e-62 MKYLERQSPARIIAIGFLAVILTGSFLLMLPVCVKPGVRLHYVDALFTSTSAVCVTGLIA VDTADTYTTLGQTVVALLIQIGGLGVTSVGVGVILALGKRVNLKERLLVKEALNLDSGRG IVCLVRNVLITTLCFEGVGAVLSFLVFSRDYPPLKAVGISLFHSVAAFNNSGFDILGGLK NLTGYQDNVLLNLTTAGLIIFGGLGFLVIMDLIKNRSFKKLCLHSKAVIVTTLILLASGT VLIWLTEDITWLGAFFTSVSARTAGFSTFPLGEFTSAGLLSVSVLMFIGASPGSTGGGIK TSTLFTLFHTIKGAACNEEPHAFHYRIPQEAFYKAAIVTALSALVVCLGTFFMCILEPGI PFEDILFEITSAFGTVGLSTGITPGLSSLSKLLEVLVMYIGRLGPLTIVTIWVFKPTKSV SYAEGSISIG >gi|157101638|gb|DS480686.1| GENE 234 240669 - 242201 1197 510 aa, chain + ## HITS:1 COG:pli0050 KEGG:ns NR:ns ## COG: pli0050 COG2205 # Protein_GI_number: 18450332 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 6 510 385 888 888 423 44.0 1e-118 MARSRILDTLKTMAVLAAATAIGFLLETLGLSQANIITVYILGVLVVAAVTASRWYSISA SLASVLLFNYIFTEPMFVLKAEDAGTQLTLLITFLAAVFTSSFTVQMKEKARLSQQDAYR SGVILDTSQMLQKAAAPADILTCTAIQLNKLLKRDITCFEQDGAGLKSPLCFREYSSPAT GRGPAPDTLEELTAARKCFQEGKETGAATGLFPAAAYHFLPLQSSGAVYGVMGIHVGTTP PGDFENSLAVAIIAQCSMALEKEYISRKREEEAAMARAEQLRSNLLRSISHDLRTPLTSI SGNAGVLMSDGENLSAEHRSRIYRDIYDDSMWLINLVENLLSITRMEGQGVHLHMETELL EEVIDEAMRHINRRHEHHTIRVNQVGDYILADMDARLIIQVITNLVDNAVKYTPEGSTIM VNSRQEGASAVIDVSDNGPGIPDESKEKIFEMFYTTGHKSADGKRGMGLGLPLCRAILSA HKGTIEVLDNKPAGSIFRLTLPAKEVSLHE >gi|157101638|gb|DS480686.1| GENE 235 242194 - 242892 745 232 aa, chain + ## HITS:1 COG:pli0051 KEGG:ns NR:ns ## COG: pli0051 COG0745 # Protein_GI_number: 18450333 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a 
winged-helix DNA-binding domain # Organism: Listeria innocua # 1 231 1 230 240 273 58.0 3e-73 MNKPSILIIEDDKAVKNLIATTLEMHDYHYIWADKGEQGILSAASQRPDIILLDLGLPDM DGVDVILKIRSWTNTPIIVISARSEDSDKIAALDAGADDYLTKPFSVDELLARLRSTQRR LSYMQSLEEVQTSLFANGGLTIDYAAGCACLDGEELHLTPYEYKLLCLLAHNVGKVLTHS YITREIWGAAWEGDVASLRVFMASLRKKIEKNPSQPQYIQTHVGVGYRMMKR >gi|157101638|gb|DS480686.1| GENE 236 242941 - 244041 1008 366 aa, chain - ## HITS:1 COG:no KEGG:Closa_0526 NR:ns ## KEGG: Closa_0526 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 31 365 53 393 398 102 25.0 2e-20 MVKKYLLLCMGLFFACLYPLEAHADAMPDVIRYEELETLVKASSPQIQMEQAQYDSRLAR YENAREEIMETRRLLREEAEDLEKNGDKDSAGQYRAQAKTLEDAAKDMDKQIRSAQGSQA SMSLRRMEDTVLWTAQNLMGTYNTLKTEQEAALAQAEWKQSQYEKLLRQVQTGSASAAQA DQAGKEADAAAAKAKAVQDEMERVKKELFILTGIPEGSQAEVGPMPAPRPERVLEMGGST DKWRALGNNYSLREQRSGGASSNKELHARQRDIRQSEEAMYGQLDTLYQDVLASQTLWTS AATSMASQEAAFQAASNKLALGMLSRQEYLEAKAAYMDAAAAKERADTGFQQAMDTYDWA LKGFMT >gi|157101638|gb|DS480686.1| GENE 237 244045 - 245310 1332 421 aa, chain - ## HITS:1 COG:no KEGG:Closa_0525 NR:ns ## KEGG: Closa_0525 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 30 420 4 382 383 191 34.0 4e-47 MRLEDIKGCGRRKSRIETGGKQVRRMSGTWKRAGAMGLALMLAPVPGMLRPVTVWASPEF AYSADKWAALRDDVLEYGELADLVHEYNATVINNRLEYDDYRGKDHDEMKNAYQDIADRL YDSSDKIMDSVNEDQPGYAGTAVGAISARLQAEQNQELADSQNEDGRVKKLEYDRQEAVL VKDAQTKMISYWQKAKARPALEEDVNQARSKYEAMAVKAGQGMATQAELLGAREKMEAAQ AALETNDRERDGLRRELCVMTGWDHNAQPDIREVPVPDAGEMDQIDLEFDKERAIEQNFT QTANERRLIFTGNGTQWDVMNQKVETGRRQIEADVEARYKLLEQARADYEQAAGELELAR TGAQTAERKYSLGMISKNEYTQQQGTMASSQSACDTAGLKYRQALEDYRWAVNGLAQTEG A >gi|157101638|gb|DS480686.1| GENE 238 245362 - 246573 340 403 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 4 403 7 413 413 135 30 2e-30 MLIENMKMAFSAIRSNKMRSALTMLGIIIGIGSVIAIVSIGDTMRGLFSDLYKDVGITQA 
YISIGYWVDDVRQSDYFTLDEMERAKEAFKDQVAYIDSSAYTSSDAEYKRTKIKFDYQGI DYNYQDVQPVNVVYGRYLNEGDILGRRKNVVMDTKSAQMLFGTENAVGRSFRTTLYGSTD DFTVVGVYRKEVNAFQALMMGNSDDKGSAFIPYTLLTWPNDYFYQLHVYAKDDVNLDQFF SQFKAYAAKMHGRQPEDMYMYTAMQEMTSVDSMMGSLSMAVGGIAAISLMVGGIGIMNIM LVSVTERTREIGIRKALGARTRDVLIQFLTESAILSACGGIIGVILGVGTVSLGGFLLGF AVVIKPGVIVVAVSFSAVVGIFFGLYPASKAAKADPIDALRYE >gi|157101638|gb|DS480686.1| GENE 239 246567 - 247355 325 262 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 23 240 1 217 245 129 39 1e-28 MKKWMKFLHRDSMIQEEVYDGPLIKLENLVKIYDTGAIKVLGLKKINLTIERGEFVAIMG HSGSGKSTLMNILGCLDRPTLGHYYLDGIDVADMTPDQLSDIRNRKIGFVFQSFNLISRT SALKNVELPMTYARVPKAKRAVRAQLLLERVGLGARKEHMPNELSGGQRQRVAIARALAN EPPLILADEPTGNLDTASSVEIMELFTKLHQEGATVVLVTHEEDIAAFAHRVIRFSDGQI QSDKLNAARVTSEDGKGGRAEC >gi|157101638|gb|DS480686.1| GENE 240 247369 - 248751 1496 460 aa, chain - ## HITS:1 COG:AGl492 KEGG:ns NR:ns ## COG: AGl492 COG0845 # Protein_GI_number: 15890353 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 91 460 91 412 414 90 24.0 7e-18 MEGISESEEIDRRVEELMAPEEEGNGKKKKKKQKQKTGHRSSEGGFHPVRRFKGFSRKRK IITVLVLAAVVLFGFSRMSGGKKDLGIPVAVMPLAKQDVQEKLTVSGPVSGTDSVDVVSN IHAEITAMNVKEGDTVTKGQVLAVVDSTDLEREVQIAQNAYDLAVANKQEKDKEAALGYE KAVQDFQKASLDHSRNSQLFATGDISQMELEASANALNDARRQMEGYTVENGRGVADSSY GIQIQNAAFDLEKKKEELDNTQIKSTIDGTVVRVNSKVGQFADKVEDDKPILSIENLEQL ELEIKVSEFSIGKVKVGQMATITADILDGETAEGEVVKISPTGEEKGGGSTERVIPTTIR VDGGSSKLIAGITAKAELLIREAKDVFVVPSSALFDDGEVTYIAIVQDMKVKRIPVTMGV DGDVVVEVIPTDGTVLEEGMQVITNPGPHLMDGAAVLITQ >gi|157101638|gb|DS480686.1| GENE 241 249371 - 250354 486 327 aa, chain + ## HITS:1 COG:no KEGG:BLJ_1240 NR:ns ## KEGG: BLJ_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 305 80 374 381 423 74.0 1e-117 MAVFRVERNKGYTVMSNHHLRNKELSLKAKGLLSQMLSLPEDWDYTLKGLSLINREKIDA 
IREAIKELERAGYIVRSRERDEKGRLRGADYVIFEQPQPPTPDLPTLENPTLDNPMQEKP TLEKPTLENPTQLNKDIQRTDLPKKEKSNTDLSSTHSIPILSPNPSPCREAAAPPERKGT EAAAQSAVDIYREIIKDNIDYHILKQDMKFDSDRLDEIVDLMLETVCTARKRVRIAGDDY PAELVKSKFMKLDGEHIRFVLDCMRENTTKIRNIKQYLKAALFNAPSTIGNYYTSLVAHD MASGALSPKKPQYGDPDYYSCNEGESL >gi|157101638|gb|DS480686.1| GENE 242 250386 - 250607 384 73 aa, chain + ## HITS:1 COG:no KEGG:CD1117 NR:ns ## KEGG: CD1117 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 73 1 72 72 126 83.0 3e-28 MAQKMTGALVFDERTDRYDIRFDLNSYYGGLHCGECFDVFVRGKWKPTRIEYGDNWYLVG IRAEDLNGLRVRI >gi|157101638|gb|DS480686.1| GENE 243 250649 - 251125 445 158 aa, chain + ## HITS:1 COG:no KEGG:CD1116 NR:ns ## KEGG: CD1116 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 158 1 158 158 155 63.0 5e-37 MQDEVNEKTIALYIKTGKLTAQTLQKAMKAILSKGKKQLAKPPQGKQSLKQLMKQNAGVS NIEITEGNIKAFESTAKKYGIDFALKKDATESPPRYLVFFKGRDADVLTAAFKEFSAKKL TQEKKPSIRKLLSTLKEAAQGRNAERAKVKNKDREVSL >gi|157101638|gb|DS480686.1| GENE 244 251122 - 252930 991 602 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 139 555 157 562 591 218 33.0 4e-56 MKPEIKKLLILNLPYLLFVWLFDKVGAAVRLSPGADASAKLLHLGDGFTAAFSSIAPSFH PADLALGIAGAVIVRLIIYTKGKNAKKYRRGTEYGSARWGGADDIKPYTDPVFENNIPLT QTERLTMNSRPKQPKYARNKNILVIGGSGSGKTRFFVKPSLMQCTSKDFPTSYIVTDPKG TLILETGKMLQRYKYRIKVLNTINFKKSMKYNPFAYLRSEKDILKLVNTIIANTKGDGEK SGEDFWVKAEKLYYTALIGYIWYEAPEDEKNFTTLLEMINASEAREDDEDFQNPVDLMFE RLEEKDPEHFAVKQYKKYKLAAGKTAKSILISCGARLAPFDIKELRELMETDEMELDTIG DRKTALFVIISDTDDTFNFVVSILYTQLFNLLCDKADDEYGGRLPVHVRCLLDEFANIGQ IPKFEKLIATIRSREISASIILQSQSQLKAIYKDNADTIVGNCDTTLFLGGKEKTTLKEI SEILGKETIDSFNTSETRGRELSHGLNYQKLGKQLMTEDEIAVMDGGKCILQLRGVRPFF SDKFDITKHPKYKYLSDADPKNAFDMEKHLKRRPAIVKPDEVFDYYELDAADLQEDADHE ET 
>gi|157101638|gb|DS480686.1| GENE 245 252917 - 253087 132 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938488|ref|ZP_02085843.1| ## NR: gi|160938488|ref|ZP_02085843.1| hypothetical protein CLOBOL_03386 [Clostridium bolteae ATCC BAA-613] hypothetical protein COPEUT_02475 [Coprococcus eutactus ATCC 27759] hypothetical protein DORFOR_02120 [Dorea formicigenerans ATCC 27755] hypothetical protein ANACAC_02287 [Anaerostipes caccae DSM 14662] hypothetical protein CLOSCI_02550 [Clostridium scindens ATCC 35704] hypothetical protein ANACOL_03096 [Anaerotruncus colihominis DSM 17241] hypothetical protein CLONEX_02798 [Clostridium nexile DSM 1787] hypothetical protein ROSEINA2194_01053 [Roseburia inulinivorans DSM 16841] conserved hypothetical protein [Acidaminococcus sp. D21] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] conserved hypothetical protein [Clostridium sp. M62/1] hypothetical protein HMPREF0863_00447 [Erysipelotrichaceae bacterium 5_2_54FAA] hypothetical protein HMPREF1025_00599 [Lachnospiraceae bacterium 3_1_46FAA] hypothetical protein CLOBOL_03386 [Clostridium bolteae ATCC BAA-613] hypothetical protein COPEUT_02475 [Coprococcus eutactus ATCC 27759] hypothetical protein DORFOR_02120 [Dorea formicigenerans ATCC 27755] hypothetical protein ANACAC_02287 [Anaerostipes caccae DSM 14662] hypothetical protein CLOSCI_02550 [Clostridium scindens ATCC 35704] hypothetical protein ANACOL_03096 [Anaerotruncus colihominis DSM 17241] hypothetical protein CLONEX_02798 [Clostridium nexile DSM 1787] hypothetical protein ROSEINA2194_01053 [Roseburia inulinivorans DSM 16841] conserved hypothetical protein [Acidaminococcus sp. D21] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] conserved hypothetical protein [Clostridium sp. 
M62/1] hypothetical protein HMPREF0863_00447 [Erysipelotrichaceae bacterium 5_2_54FAA] hypothetical protein RBR_20890 [Ruminococcus bromii L2-63] hypothetical protein HMPREF1025_00599 [Lachnospiraceae bacterium 3_1_46FAA] # 1 56 1 56 56 68 100.0 1e-10 MRKRSAEEKQKQLERFLMNVAEAADAALWEYWREKEAEHRRFATEYVTRRGLIPQQ >gi|157101638|gb|DS480686.1| GENE 246 253499 - 253714 317 71 aa, chain + ## HITS:1 COG:no KEGG:Closa_3718 NR:ns ## KEGG: Closa_3718 # Name: not_defined # Def: conjugative transfer protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 71 1 71 71 95 92.0 6e-19 MAFFNSAVDVLQTLVVALGAGLGIWGVINLMEGYGNDNPGAKSQGMKQLMAGGGVALIGM TLVPLLSGLFG >gi|157101638|gb|DS480686.1| GENE 247 253734 - 254405 441 223 aa, chain + ## HITS:1 COG:no KEGG:CD1112 NR:ns ## KEGG: CD1112 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 198 1 198 289 317 90.0 2e-85 MQSILDAINEWIKEILIGAINGNLSTMFGDVNEKVGTIAAEVGQTPQGWNANIFSMIQTL SENVIVPIAGLVITYVLCYELISMVTEKNNMHDVDTSMFFKWVFKAFVAVYLVTHTFDIT MAVFDMAQHVVSGAAGVIGGSTEIDVAAALASMQSGLDAMEIPELLLLVMETSLVSLCMK IMSVLITVILYGRMIEIYRASRSAAFHPKAVRGHSGNPALASR >gi|157101638|gb|DS480686.1| GENE 248 254981 - 255793 452 270 aa, chain + ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 29 252 256 480 834 201 43.0 2e-51 MRSPENVLESLKSKACNKSYKYGRLYRNLYNPQFYLLTYQRIQAKPGNMTAGTDGKTIDG MGMARINALIEKMRDFSYQPNPARRTYIPKSNGKMRPLGIPSFDDKLIQEVVRLILESIY EPTFSDYSHGFRINKSCHTALKYVQKYFTGTKWFVEGDIKGCFDNVDHHVLIDILRKRIA DEHFIGLLWKFLKAGYMEDWNYHNTYSGTPQGSIISPILANIYMNELDSYMAEYAEKFNC GNRRKINPAFKKKLDVCRGKNSGLKEIFLK >gi|157101638|gb|DS480686.1| GENE 249 255790 - 256812 608 340 aa, chain + ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # 
Organism: Saccharomyces cerevisiae # 30 339 511 827 834 119 33.0 7e-27 MSEEEKEDLIAEIRELRRSLKSMPYSDQMDDSYKRICYVRYADDFLIGVIGSKEDAEQVK QEVGCFIREKLHLEMSGEKTLITHGHDFAKFLGYEVTITKGEYSKKTKTGATRRVNNGKV LLYVPHDKWLKRLLSYNALKIKYDKQNGNKEVWEPVRRIRLLHLDDLEILNQYNAEIRGL YNYYRLANNVSVLNNFYYVMRYSMLKTFAGKYRTRISRIIRKYRQGKDFAVEYPKKNGKV GKVLFYNNGFRRNTKVESGNPDIVARVVENYGRNSLIKRLQANKCEWCGAENVPLEVHHV RKLKDLSGRKQWEIAMIGRRRKTMALCIDCHDKLHAGKLD >gi|157101638|gb|DS480686.1| GENE 250 257368 - 257769 197 133 aa, chain + ## HITS:1 COG:no KEGG:CD1111 NR:ns ## KEGG: CD1111 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 132 1 132 134 201 81.0 5e-51 MAYVPVPKDLSKVKTKVAFNLTKRQIVCFAAALLFGLPLFFLLKDSTGTSLASMAMIAVM LPCFLFAMYEKHGQPLEVVVKNIIRTKFIAPKERPYRTDNFYSVLERQRKLEKEVSAIAK GNTKKQRGGKHKA >gi|157101638|gb|DS480686.1| GENE 251 257720 - 258724 797 334 aa, chain + ## HITS:1 COG:no KEGG:CD1110 NR:ns ## KEGG: CD1110 # Name: not_defined # Def: putative conjugal transfer protein # Organism: C.difficile # Pathway: not_defined # 4 329 5 326 809 548 82.0 1e-155 MQKETRRNSAAGSTKPNPPRKLTRAEKKQIAEIIRQAKGDGKAHTAQQTIPYIQMYPDGI CKVTGRKYSKTVAFEDINYQLAQADDKTAIFENWCDFLNYFDASVSVQLSFINQGTQRGE AEKAVNIPAQEDAFNSIRTEYRDMLKNQLAKGNNGLVKAKYITFAIEADSLGAAKSRLAR IETDVLNNFKLLGVSARPMTGYERLKMLHGIFHPEGGQFAFDFSWLAPSGLSTKDFIAPS SFRFGEGRYFRMGRKIGAVSFLEILAPELNDRILSDILDLETGVIVNLHIHSIDQTEAIK TIKRKITDLDKMKIEEQKKAVRSGYDMDISATRS >gi|157101638|gb|DS480686.1| GENE 252 259447 - 261420 234 657 aa, chain + ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 12 364 25 339 470 183 34.0 8e-46 MPNEKKPKTKYVEYLRHAEYYDMQSTFDELYARSQAGEIFDGLMDVILSRENILLAYRNI KSNTGSFTPGTDKLKISDIGKLTADEVTARVRRIVKGGKNGYTPRSVRRKEIPKPNGSTR PLGIPCIWDRLVQQCIKQVMEPICEARFSNNSYGFRPNRSVENAIAAIYRLMQRSGLYYV VEFDIKGFFDNVDHSKLIKQLWSLNIRDKELLYVIRRILKAPILMPDGHIEHPAKGTPQG 
GIISPLLANVVLNELDHWIESQWQCNPVTENYSYRENATGCPIQSHAYRAMRNTRLKEMY IVRYADDFRILCRTKEQADRTLIAVTHWLKERLRLDVSPEKTRVVDTRRSYSEFLGFKIR LHKKGKKYVVQSHMCDKAYRKVKANLTKQVGNIKFPRKDRGEAGEVRLFNSMVMGIQNYY QLATDISIDCGDIGRTVNIVLKNRLKSGKTHRLKEEGRDLTKMELQRYGKSEQLRYIAQS KEPIYPISYVQCTNPMNLRRKVCAYTATGRSAIHDDLRINTSLLLQLMRAPTYRRSAEYA DNRISLFSAQWGKCAITGKEFQCVSEIHCHHKTPKGNGGSDKYENLVLVLAPVHELIHAV NEDTICSYLSALKLDASQLMKLNRLRILANRKPIDLENLNLTNNSHNGMTKETKKTV >gi|157101638|gb|DS480686.1| GENE 253 261586 - 262962 813 458 aa, chain + ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 5 435 393 840 853 92 24.0 1e-18 MFLLTFLVVNMADTKRKLENDVFAAAGIAQKNNCALTRLDYQQEAGLMSSVPLGENLIPI QRGLTTSSTAIFIPFITQELFQTGAALYYGLNALSNNMILCDRKQLKNPNGLILGTPGSG KSFAAKREMTNAFLITDDDIIICDPEAEYFSLVQRLDGQVIRLSPTGKGIDGKPQYVNPM DINLNYSEDDSPLALKSDFILSLCELVIGGKEGLQPVDKTVIDRAVRNVYRPFLADPDPE KMPILGDLYNELLKQPEPEAARIAAALELYVSGSLNVFNHRTNVELNNRLVCFDIKQLGK QLKKLGMLIVQDQVWNRVTVNRAERKSTRYFMDEFHLLLKEEQTAAYSVEIWKRFRKWGG IPTAITQNVKDLLASREVENIFENSDFVLMLNQAAGDRAILAKQLNISPQQMKYVTHTEA GEGLIFYGNVVLPFVDRFPKDTELYRVMTTKPEEVGEA >gi|157101638|gb|DS480686.1| GENE 254 262959 - 263918 683 319 aa, chain + ## HITS:1 COG:XF0641 KEGG:ns NR:ns ## COG: XF0641 COG0863 # Protein_GI_number: 15837243 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Xylella fastidiosa 9a5c # 9 316 51 378 380 136 30.0 6e-32 MNGLKTDTIINRDALYALRELPEESVHCCVTSPPYYALRDYGLDMQIGREDTPEQYIDRL TEVFRELRRVLRSDGTLWLNIADTYCGTGNKGYHADPKNPKGRNGQQIAKNNRVSGCKQK DLIGIPWLLAFALRADGWYLRSDIIWQKENPMPESVKDRPTRCYEHIFLLTKSKKYFYDA AAIAEPLAPTTAARYRTGRSAGQKYADEVPGQGKVQGLNRARSGSYYDEALMPTMRNRRD VWLINTVPYKGGHFAAFPPKLAETCIKAGCPKGGVVLDPFFGSGTTGAAARQLDRHYIGI EINAEYCALARARIGGTEK >gi|157101638|gb|DS480686.1| GENE 255 263954 - 266023 1106 689 aa, chain + ## HITS:1 COG:no 
KEGG:CD1108 NR:ns ## KEGG: CD1108 # Name: not_defined # Def: putative DNA-repair protein # Organism: C.difficile # Pathway: not_defined # 1 688 1 646 646 957 76.0 0 MTREGAVEVNAATGKKKRISKRIRDADFAKTEAPQPPEQAAQTIPGGAAPAPTAAAPPLP HAPGAEREQDTAAAERVLERIDGARTRKASKKAARKAQAEATAKEKSSRLQFTDEERATP ELERYIRKSDKAADRLDAAKAAIPKEKKLVRERTFDEATGKGKTRLHFEEQEKPIGRNKP HNNPLSRPAQEAGIFVHNKIHSVEKDNSGVEGAHKSEELAERGAKYGTRKVKEGYRSHKL KPYRAAAKAEKAAFKANVDFQYHKSLHDNPQIAGNPLSRFMQKQQIKRQYAKSARKGGAK TAQKAAENTRKAAKKTAEETKKAIAFVGRHPAGVCIAVAALLLFIMVSAGLSSCGSMFSG LMNGILGTSYTSEDSDLVATENNYAAKENELQQQIDNIESTHPGYDEYRYDLDSIGHNPH ELASYLTALLQTYTPQSAQAELNRVFAMQYTLTLTEETEIRYRTETSTDPETGETTSEEV PYEYHILNVKLTNKPISEIAEELLTPQQLEMYRVYLETSGNKPLIFGGGSPDMGASEDLS GVQLVNGTRPGNTAVVDLAKRQVGNVGGRPFWSWYGFNSRVEWCACFVSWCYNQAGKSEP RFAGCQSQGVPWFQSRGQWGARGYENIAPGDAIFFDWDGDGSADHVGLVIGTDGERVYTV EGNSGDACKIKSYPVNYSCIKGYGLMNWN >gi|157101638|gb|DS480686.1| GENE 256 266062 - 266340 320 92 aa, chain + ## HITS:1 COG:no KEGG:CD1107A NR:ns ## KEGG: CD1107A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 68 1 68 85 96 83.0 3e-19 MAMNKIERIDKEIAKTREKITEYQNKLRGLEAQKTEAENLQIVQLVRSMRLSPHELSAML SGGGIPGMEAAPGYPAEPADHDTEEMEDTENE >gi|157101638|gb|DS480686.1| GENE 257 266333 - 267043 632 236 aa, chain + ## HITS:1 COG:no KEGG:CD1107 NR:ns ## KEGG: CD1107 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 221 3 227 244 259 67.0 8e-68 MNKKILRTLTALCAALMLSGGFSVTAFAQTPEGEDDTNDSGVVYEEPQKEEPLTPDGNAT LVDDFGGNKQLITVTTKNGNYFYILIDRDDEGENTVHFLNQVDEADLLALMEDGSTEAAP PAVCSCTEKCEAGKVNVSCPVCKDNMTACSGKEAEPETEEPPELPKEKGNAGGLVLFLVV ALLGGGGAFYYFKFMKPKQNVKGGTDLEDFDFDDYDEDEPDEGDGLSDEEQEDEEA >gi|157101638|gb|DS480686.1| GENE 258 267040 - 267228 108 62 aa, chain + ## HITS:1 COG:no KEGG:CD1106A NR:ns ## KEGG: CD1106A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 62 1 62 62 91 87.0 1e-17 
MTLFTDNPFEKMMIQRPGGRRDNAPPVPHSPACASCPYKGQLPCVGYCLKQVQEKKASEP ER >gi|157101638|gb|DS480686.1| GENE 259 267239 - 267991 195 250 aa, chain - ## HITS:1 COG:no KEGG:CLI_1846 NR:ns ## KEGG: CLI_1846 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_F # Pathway: not_defined # 33 245 43 250 263 77 27.0 4e-13 MKSIHTFDVFAPTDEQEEAMNKKRGGHYRRGKNKEMDYFIECATSLFPNNHYSRFTPIYD TTKEFYQLITSSDTTERAILNWIASNRAWNYLFGLLRHYYRTGHHDDNYIFPEFKIGSAY VADYLLIGNSSDGYQFVFVELEAPNGRITKEKGTRFGEVINKGIEQVRDWQMYIAANWNV IVAELEKHSFSNTKLPRQLYKYCPYQIYYAVIAGLRKDFENIRDRKLQLQNENNITLLHY ENLIDVANEN >gi|157101638|gb|DS480686.1| GENE 260 268070 - 270154 1199 694 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 3 632 5 655 709 439 38.0 1e-123 MKLIIAEKPSVAQTIAAALGVKEKKDGYIEGGGCLISWCVGHLVQLAEAAAYGEQYKKWS FESLPILPEEWQYAVDPDKGKQFKTLKELMHRADVSEVVNACDAGREGELIFRFVYEAAG CKKPMRRLWISSMEDGAIKAGFASLKDGQEYDALFASALCRAKADWLIGINATRLFSCLY GKTLNVGRVQTPTLKMLTDRDAAISHFQKEKYYHVRLDLSGAEAASGRISDKAEADALKG ACEAETAVCVSLTREKKTAAPPKLFDLTSLQREANRIFGYTAKQTLDLAQSLYEKRLLTY PRTDSSFLTDDMGGTAAGIIALLCEKLPFMAGADFTPEVVKVLDSKKVSDHHAIIPTMEL AKADPDALPESEKNILTLAGARLLFATAEPHIYEAVTAVFSCAGTDFTARGKTVLAEGWK ELQRRYRATLKDKPEAEDGENADVTLPKLSEGQGFPNPAAKVTEHTTTPPKPHSEASLLS AMERAGNGDTDPDAERRGLGTPATRAAVIEKLVKGGFAERKGKQLIPTKNGNSLICILPD ILTSPQLTAEWENNLTQIAKGAADPGEFLSGIEAMARELVQTHAAALDGKKDLFREEKPS VGKCPRCGSPVHEGKKNYYCSNKECAFVMWKNDRFFEERKTAFSAKIAAALLKSGKANVK KLYSPKTGKTYDGTIVLADTGGKYVNYRIEVQKN >gi|157101638|gb|DS480686.1| GENE 261 270394 - 273720 2667 1108 aa, chain + ## HITS:1 COG:no KEGG:CD1105 NR:ns ## KEGG: CD1105 # Name: not_defined # Def: putative DNA primase # Organism: C.difficile # Pathway: not_defined # 1 780 1 772 1343 974 67.0 0 MASLRDTVKDYQDELRDGIAWVAFWKQGRSWNAEYFHLDMDDTLYPEDRSRLEEIKSIDP AAVILNGYYCGHLGEDMSLDELTAGVRYHYENSMNDIDGFIGAHDDRLPPEVIEEARAAA 
HEAGLPFSEKPYRDGEDFNPYVFDGSMSIEDFELMHRMIEKERSEQMAEPILSGYLSNLG KYTEGRPAGEWVTFPTTAEHLKEVFDRIGIDFKHYEEWHFTEFQSTIPGLTEHLSEYSHP DELNYLGKLLEMQFDDDREKFIAAIEYGDHADSLQDIINLAQNLDCYWIYPSVHNEEEYG RYLVDELEEPELPEEAKKYFMYEEYGRDASINDDGMFTEKGYIYNNRNTFTEWYDGRDVP EEYRVTPQPPQPERPDPSKVEMDAAAPGQRTAQTAEQPQEPRPVIPIVLTSEKPAEKLKE ITDRLEQGIAELFDSERYREYLKVMSKFHNYSFRNTVLIAMQKPDASLVAGFSAWKNNFE RNVMKGQKGIKIIAPSPYKIKQEMQKIDPHTQKPVIGKDGKPVTEEKEVTIPAYKVVSVF DVSQTEGKELPDIAVDELTGDVDRYKDFFAALEKTSPVPIAFENIEGGSHGYYHLEDKRI AINEGMSELQTLKTAIHEIAHAKLHDIDLNAPKDEQQPHVDRRTREVEAESVAYTVCQHY GLDTSDYSFGYVAGWSSGRELSELKSSLETIRSAAAEIINSIDENLAELQKAQDKEQTAG QEQPTREGQEAAPEKPDPEAAAPGKSGAQEKAGAAPKEAFTPETIYRVRRNPYSDSRENS YLLQAYVTQENGRAKMGDVLYTGTPEKCRELMGQLKSGELTEGDVKQLYAKAQETAQTAG QDKDTFSIYQIKGGDETRDFRFEPYDRLQAAGNVVDRANYELVYSAPLAPETSLEDIYTC FNIDHPKDFKGHSLSVSDVVVLHQDGQDAAHFVDSVGFREVPEFLQEQKQLTPDDLETGE TVKTPRGTFHVTAMSREQIEAAGYGFHHQSDDGKYLIMGNGTRAFAVAAEQAQRDNPLKT AEQTTEQNGNMIDGIINNTPTVDELEAKVKAGEQISLVDLANAVKADKERGKGAKPEKKP SIRAQLRADKEKAQKKNAKQKSQDLERS >gi|157101638|gb|DS480686.1| GENE 262 273724 - 274677 608 317 aa, chain + ## HITS:1 COG:no KEGG:EF2322 NR:ns ## KEGG: EF2322 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 6 315 120 433 434 137 32.0 7e-31 MPEQSKFENMDLFASLEAIMKQNTGFYQSDLDIDKEIIAKAAASPHREDKTLLWFCRPSG THCFRERDVFLKDTAPHNTWRFYMEQTSDRVLAYAIELTGKERGKIKGNLYELDYSKHYE RVKEKELPADTVKLIYEHGERVQEAGRYFDGTPDPQLGKFERFEAVPNDPDALQSLLQEE RRSREQLSPGDFKAHIAALRDGLIETEARRIVREMKRHYEPNSPNKTHFMAELSPAFMRL AATKDTDRLFSMLPYKTLSFSKIEGRHGTYALIDKGENRDREIRKPRPSIRAQLKADKAK TAPKKAAAKTKNHDMEV >gi|157101638|gb|DS480686.1| GENE 263 274680 - 274886 278 68 aa, chain + ## HITS:1 COG:no KEGG:CD1104B NR:ns ## KEGG: CD1104B # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 68 1 68 68 94 79.0 1e-18 MIRLTVEETNLLSIYNEGGKRALIENVNAALPYMDADMRELAKRTLSKVDALTEAEFAEL PIYAADEV >gi|157101638|gb|DS480686.1| GENE 264 274883 - 275248 168 121 aa, chain + ## HITS:1 COG:no 
KEGG:CD1104A NR:ns ## KEGG: CD1104A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 121 1 121 121 186 82.0 2e-46 MNGVKRLTPPQSRKVNALVRRTCCNYDNGNCILLDDGDECVCPQLISYSLLCKWFRVAVL PADRLLYAELYQTGDKKKCTECGAFFASTSNSVKYCPVCRKRITRRQAAERMRKRRAPVT Q >gi|157101638|gb|DS480686.1| GENE 265 275426 - 275677 320 83 aa, chain - ## HITS:1 COG:no KEGG:CD1103 NR:ns ## KEGG: CD1103 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 4 76 3 75 76 87 63.0 1e-16 MENQKMPEYETIRAAVAGEKWAVEKVVECYKDEIDRLSTVAVRQPDGSTKQEINEDMRQS ITKKLIEALPQFPLEEMEKGNVR >gi|157101638|gb|DS480686.1| GENE 266 275741 - 277087 647 448 aa, chain - ## HITS:1 COG:SP1056_1 KEGG:ns NR:ns ## COG: SP1056_1 COG3843 # Protein_GI_number: 15900926 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD2 components (relaxase) # Organism: Streptococcus pneumoniae TIGR4 # 1 150 1 172 402 62 27.0 2e-09 MAVTKIKPIKSTLSKALDYIENPDKTDGKMLVSSFGCSYETADIEFEYTLSQALQKGNNL AFHLIQSFEPGEVDYQKAHEIGKQLADAVTKGQHEYVLTTHIDKGHVHNHIIFCAVNFVD HRKYNSNKRSYYGIRNMSDKLCRENGLSVVVPGKGSKGKSYAEYQAEKTGTSWKGKLKIA VDALIPQVSSFEELLQRLQAAGYEIKPGKYVSCRAPGQERFTRLKTLGADYTEEAIRERI AGRRAKAAKAPREQRGVSLLIDIENSIKAAQSKGYEQWAKIHNLKQAAKTMNFLTEHKIE QYADLVSRIEEMAAESGQAADALKDAEKRLADMAVLIKNVSTYQKTKPVYDAYRKARNRE KYRAGQEQAIILHEAAARSLKAAGIAKLPNLAALQSEYEALQAQKEALYADYGKLKKKVR EYDIIKQNIDSILQADRQPEREKGTERG >gi|157101638|gb|DS480686.1| GENE 267 277048 - 277377 149 109 aa, chain - ## HITS:1 COG:no KEGG:CD1101 NR:ns ## KEGG: CD1101 # Name: not_defined # Def: putative mobilization protein # Organism: C.difficile # Pathway: not_defined # 1 108 1 108 109 160 87.0 1e-38 MNGRKRTVQIKFRVTEAERDLILEKMKLVPTRNMAAYLRKIAIDGYIIQIDHADIKAMTA EIQKIGVNVNQIARRVNATGNAYQEDIEEIKGVLAEIWRLQRLSLLKAL >gi|157101638|gb|DS480686.1| GENE 268 277639 - 277992 305 117 aa, chain - ## HITS:1 COG:no KEGG:CD1100 NR:ns ## KEGG: CD1100 # Name: not_defined # Def: putative 
conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 117 1 117 117 187 86.0 1e-46 MAKRPVPKYDFKAFGAAIKAARTGRKESRKKVSDEMFISPRYLANIENKGQHPSLQIFFE LMLRYNISVDQFLLETPPEKNTQRRQLDALLDGMSDTGIRIVSATAKEIAEVETEGR >gi|157101638|gb|DS480686.1| GENE 269 278968 - 279549 -13 193 aa, chain + ## HITS:1 COG:TM1715 KEGG:ns NR:ns ## COG: TM1715 COG0820 # Protein_GI_number: 15644462 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Thermotoga maritima # 1 181 155 334 343 133 38.0 2e-31 MGMGEPLFNYDNLIAAIHILRDRNGLNFPTDGITVSTVGPVNQLKKLREEHLKIQLTISL HAATQAARNCIIPHMHMYAIEDVVKQALSYSQRHNRKVVFAYLLLPGINDRSSDIRQLAK WFKGKNVMINVLQYNPTSNSKIRAPQKQEMVAFKHQLEQTGLEVTMRVSHGREIKAACGQ LANTYNKAKKQQK >gi|157101638|gb|DS480686.1| GENE 270 280086 - 280517 210 143 aa, chain + ## HITS:1 COG:no KEGG:Tresu_1935 NR:ns ## KEGG: Tresu_1935 # Name: not_defined # Def: RNA polymerase sigma factor, sigma-70 family # Organism: T.succinifaciens # Pathway: not_defined # 1 143 42 184 184 253 97.0 1e-66 MEPNRREFIKQCAFQKFCNTVLHNEACDAHKELHRHKAREITFSDLTLEEARQLHTFDEY FKGEIAFERAGKKITPKLLLEAIRTLPEEKRKAVLLYYFEGMNDTEIAELFDTSRSTIQY RRTSSFELLKKYLEENADEWDEW >gi|157101638|gb|DS480686.1| GENE 271 280498 - 280758 270 86 aa, chain + ## HITS:1 COG:no KEGG:Tresu_1934 NR:ns ## KEGG: Tresu_1934 # Name: not_defined # Def: conjugative transposon protein # Organism: T.succinifaciens # Pathway: not_defined # 1 86 1 86 86 172 100.0 3e-42 MNGTNGNEPGYPEKALVPYPVILAATKGDPDAMKIVLQHFSGYIARLSMRKLYDERGNVY FGVDHDIRERLQAKLMMAVLTFKAEE >gi|157101638|gb|DS480686.1| GENE 272 281233 - 281400 110 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|163816249|ref|ZP_02207616.1| ## NR: gi|163816249|ref|ZP_02207616.1| hypothetical protein COPEUT_02437 [Coprococcus eutactus ATCC 27759] hypothetical protein DORFOR_02096 [Dorea formicigenerans ATCC 27755] hypothetical protein ANACAC_02309 [Anaerostipes caccae DSM 14662] hypothetical protein ANACOL_03155 [Anaerotruncus colihominis DSM 17241] 
hypothetical protein CLONEX_02772 [Clostridium nexile DSM 1787] conserved hypothetical protein [Acidaminococcus sp. D21] conserved hypothetical protein [Clostridium sp. M62/1] hypothetical protein COPEUT_02437 [Coprococcus eutactus ATCC 27759] hypothetical protein DORFOR_02096 [Dorea formicigenerans ATCC 27755] hypothetical protein ANACAC_02309 [Anaerostipes caccae DSM 14662] hypothetical protein ANACOL_03155 [Anaerotruncus colihominis DSM 17241] hypothetical protein CLONEX_02772 [Clostridium nexile DSM 1787] conserved hypothetical protein [Acidaminococcus sp. D21] conserved hypothetical protein [Clostridium sp. M62/1] # 1 55 21 75 75 94 98.0 2e-18 MTQNQTPVTTTEHKIGKVTYLVCSSASERATDTLDKKIKKLIRKDMELNPANARK >gi|157101638|gb|DS480686.1| GENE 273 281501 - 283177 1341 558 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 3 300 4 299 301 272 47.0 1e-72 MLQTDKITALYCRLSQEDMQAGESESIQNQKLILQKYADEHHFFNTRFFVDDGFSGVSFE REGLQAMLHEVEAGNVATVITKDLSRLGRNYLKTGELIEIVFPEYEVRYIAINDGVDTAR EDNEFTPLRNWFNEFYARDTSKKIRAVKQAKAQKGERVNGEAPYGYLIDPDNRNHLIPDP ETAHVVKQIFAMYVRGDRMCEIQNWLRDNEILTVGELRYRRTGSKRHPRPQLNAWYNWPD KTLYDILTRKEYLGHTITGKTYKVSYKSKKTKKNPEEKRYFFPNTHEPLIDEETFELAQK RIATRQRPTKVDEIDLFSGLLFCGDCGYKMYAVRGAGTLERKHAYTCGNYRNRARNDMLC TTHYIRKSVLKELVLADLQRVTSYVKEHEQEFIETANECSAKAVQKTLTQQRKELDKAQN RINELNILFRKLYEDNALGKLSDEQFAFLTSGYDEEKKTLTRRIAELSQEIDNATERSAD VKRFVALVRRYTAIEELTYENVHEFIDRILIHELDKETNTRKIEIFYSFVGRVDTGDKPT ESISYFRQIGADVKSYAI Prediction of potential genes in microbial genomes Time: Thu Jun 30 18:02:12 2011 Seq name: gi|157101637|gb|DS480687.1| Clostridium bolteae ATCC BAA-613 Scfld_02_28 genomic scaffold, whole genome shotgun sequence Length of sequence - 27247 bp Number of predicted genes - 25, with homology - 25 Number of transcription units - 12, operones - 6 average op.length - 3.2 N 
Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 280 - 339 4.3 1 1 Tu 1 . + CDS 364 - 567 111 ## gi|160938562|ref|ZP_02085915.1| hypothetical protein CLOBOL_03458 + Prom 816 - 875 8.0 2 2 Op 1 . + CDS 946 - 1284 182 ## Closa_0763 hypothetical protein 3 2 Op 2 . + CDS 1299 - 2072 677 ## Cphy_0799 hypothetical protein 4 2 Op 3 . + CDS 2126 - 2881 227 ## COG0338 Site-specific DNA methylase 5 2 Op 4 . + CDS 2967 - 3179 70 ## Cbei_1678 XRE family transcriptional regulator + Term 3188 - 3247 15.0 + Prom 3204 - 3263 5.6 6 3 Op 1 . + CDS 3408 - 6266 1353 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 7 3 Op 2 . + CDS 6324 - 7808 957 ## COG0471 Di- and tricarboxylate transporters 8 3 Op 3 . + CDS 7853 - 8482 471 ## COG0698 Ribose 5-phosphate isomerase RpiB 9 3 Op 4 2/0.333 + CDS 8495 - 9253 574 ## COG1414 Transcriptional regulator 10 3 Op 5 . + CDS 9293 - 10066 219 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Term 10314 - 10350 2.4 11 4 Tu 1 . - CDS 10053 - 10976 325 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 11000 - 11059 6.0 + Prom 11003 - 11062 7.1 12 5 Op 1 . + CDS 11144 - 12340 1017 ## Sgly_3193 major facilitator superfamily MFS_1 13 5 Op 2 . + CDS 12297 - 13205 785 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) + Term 13223 - 13286 11.2 + Prom 13328 - 13387 4.8 14 6 Tu 1 . + CDS 13444 - 16413 1789 ## Rumal_0452 hypothetical protein + Term 16495 - 16534 -0.6 + Prom 16505 - 16564 6.7 15 7 Op 1 19/0.000 + CDS 16639 - 18273 1230 ## COG4585 Signal transduction histidine kinase 16 7 Op 2 . + CDS 18332 - 18967 740 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 17 8 Tu 1 . 
- CDS 18978 - 19724 658 ## COG2188 Transcriptional regulators + Prom 19899 - 19958 12.6 18 9 Op 1 1/0.667 + CDS 20115 - 20894 979 ## COG2820 Uridine phosphorylase 19 9 Op 2 2/0.333 + CDS 20920 - 22122 1588 ## COG1972 Nucleoside permease 20 9 Op 3 2/0.333 + CDS 22135 - 23322 1537 ## COG1015 Phosphopentomutase 21 9 Op 4 . + CDS 23391 - 24026 772 ## COG0274 Deoxyribose-phosphate aldolase + Prom 24045 - 24104 3.6 22 10 Tu 1 . + CDS 24144 - 25046 748 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 25048 - 25107 5.9 23 11 Op 1 . + CDS 25169 - 25921 739 ## COG0846 NAD-dependent protein deacetylases, SIR2 family + Prom 25928 - 25987 7.2 24 11 Op 2 . + CDS 26018 - 26596 599 ## COG3034 Uncharacterized protein conserved in bacteria + Prom 26665 - 26724 2.5 25 12 Tu 1 . + CDS 26807 - 27070 269 ## MGAS2096_Spy1119 hypothetical protein + Term 27076 - 27112 7.0 Predicted protein(s) >gi|157101637|gb|DS480687.1| GENE 1 364 - 567 111 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938562|ref|ZP_02085915.1| ## NR: gi|160938562|ref|ZP_02085915.1| hypothetical protein CLOBOL_03458 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03458 [Clostridium bolteae ATCC BAA-613] # 1 67 80 146 146 129 100.0 6e-29 MIVDDVLHFIAPEAGTGTMAFMYLVKVNIVNGGFGLYVEAANRICFLESNAWLENRNWSN SFNLYIR >gi|157101637|gb|DS480687.1| GENE 2 946 - 1284 182 112 aa, chain + ## HITS:1 COG:no KEGG:Closa_0763 NR:ns ## KEGG: Closa_0763 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 106 1 106 106 103 52.0 2e-21 MDVSTIMQYVSYLLNAIGLMAFLVSVITQVIKSWPGLDKLPTQAVVIVLSLVLCPAVFIA LMAWIHHPINWYTVFACVIMAFIVALVAMDGWERVSEIWKRTKTPKTTYPGN >gi|157101637|gb|DS480687.1| GENE 3 1299 - 2072 677 257 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0799 NR:ns ## KEGG: Cphy_0799 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 149 3 165 216 88 33.0 3e-16 
MSKTSVGLVEHCKSKLRTPYVYGAKGEVLTQAILDRLARENPGTYTSTYRAKAAKYIGQR CTDCSGLISWYTGILRGSYNYHDTAVERIGVDHLDESMVGWALWKPGHIGVYIGDGWCIE AKGINYGTIKSRVAAIPWQKVLKLCDIDYTPVPVTYTQGFRPAADGQRWWYQFTDGSYAA NGWYWLREATDGTCGWYLFDSEGYMLTGYQVDPAGEAFLLCPVKGSDEGKCMITDARGAL RIAEEYDMANRRYVFNW >gi|157101637|gb|DS480687.1| GENE 4 2126 - 2881 227 251 aa, chain + ## HITS:1 COG:XF0935 KEGG:ns NR:ns ## COG: XF0935 COG0338 # Protein_GI_number: 15837537 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Xylella fastidiosa 9a5c # 6 219 43 272 311 85 28.0 8e-17 MNSFIGWIGGKRLLRKEILGCFPEDVGRYIEVFGGAGWVLFAKEKQAGQMEVYNDRDGNL VNLYRCIKYHCSALQEELQWLLPSREQFYDYRAQMDMRGLTDIQKAARFFYLLKISFGSD YRTFATSSKSIENAIDYLVKVQKRLQGVVIENKDFENLIGVYDRKDALFYLDPPYVGTET YYNVTFTMEDHQRLAEILKNIKGKFILSYNDIPMIRELYSQYPCKEVVRNSTLAGDSNKP AAYRELIITNF >gi|157101637|gb|DS480687.1| GENE 5 2967 - 3179 70 70 aa, chain + ## HITS:1 COG:no KEGG:Cbei_1678 NR:ns ## KEGG: Cbei_1678 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.beijerinckii # Pathway: not_defined # 1 70 4 73 75 79 48.0 4e-14 MVKIHLSRLLGEKRWSQAKLARITGIRASTINDIYNEFSERISLEHLNRICRALDCDISD ILEYIPDEPR >gi|157101637|gb|DS480687.1| GENE 6 3408 - 6266 1353 952 aa, chain + ## HITS:1 COG:mll4880 KEGG:ns NR:ns ## COG: mll4880 COG1529 # Protein_GI_number: 13474083 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Mesorhizobium loti # 165 946 10 757 774 398 35.0 1e-110 MNAELVKIELKVNGKKVCKYVAPSMRLADFLREELHLIGTKKGCNAGECGTCSVLINGVL KKSCMIPVIKANHCEILTIEGIGTDGLSIIQRCFIKAGAVQCGYCTPGMIMAATTILKEN RHPDKVEIRRRLGGNICRCTGYVKIVEAIELARDILNHDQSADCLDSIVPDTGKIIGSRI ERVDAKGKAAGALEYAADMTMPRMLHVHLVRSREVHAEILKIDYEKALTMPGVETVITSK DVPGEDGFGVYYHDQPVLAKGKIRFYGEPVAAIVAETLEEAEEAEKLIEITYRKLPVVSG IADAIGCKSVVHDDYPDNVVSFTSVIKGDVENGFRNSDIIVEQDYSTQCVDHAYLETEAG LAYCDPDGVLVVKSCDQDITHHRMLLAKILDMPINKIRVIMTPVGGGFGGKEDMIYQGIL 
AIAALKTKRPVKLVFSRNDSMIGSCKRHPVLVHHKIGLRKDGKIQAVEIDLKSDGGAYCF STKGTVAKSAILGAGPYEIENVRVISKGYYTNNTPSGAMRTFGILQPTFAIESTMDICSE RLGIDPIELRLINGVRDGAVTHTGQVLGSVSYCETLKQCARMAAWEPGPSNVRGRVRKDV KGSQTVQPYIPGSEFKAPAAVELCKARFKYGRGVGTGWYGIGRCATVDKAGAFVEIDDGG TAMILTGVTEIGEGILTVLTQIAADELGMYPEDITIGDNDTARSPEAAHAGASRQSYMIG NAVLNACRDVKEKFIREIAAYWNVDSSSICMRNRRIFVQGHSRYDFSLKEAVDICKKVRG YVPLGSGTYTAHHEALDPVSGEGNPWQAYVFGSQIAEVAVDTFTGEVHVLGIWASHDVGR AINPQGIEGQIEGGAVQSIGQAIMENFVLSEGVPVNRNFAKYILPTSVDVPMFCTSLVEN RDPFSPLGAKGIGEPAPLPTIPAIVNAIYDAVGVRVTSLPATPEKILNEMSE >gi|157101637|gb|DS480687.1| GENE 7 6324 - 7808 957 494 aa, chain + ## HITS:1 COG:MTH788 KEGG:ns NR:ns ## COG: MTH788 COG0471 # Protein_GI_number: 15678812 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Methanothermobacter thermautotrophicus # 51 484 13 434 443 186 30.0 7e-47 MSNINNVSGTSGGLSVNAKRWIYICISLIVLVFIKNVSFPASIATIKDATLTPEGQNALA VLIFALLLWITEAIPFHFTGLLAMALLALFGVDSFGGIVKTGFGNINVIFFIGVFILSAF INKSGLGKRLVNVCLSITGNSTKYVILGFLVVGVLLSMWISNMAVAAMLMPLAKSLLDEE GLKPLESNFGKALLISVAWGSLIGGFGTPAGNGPNPLAIGFMKDMAGIDVSFLDWMIYGV PISLIHIPIAWGLLLLAFKPEMKYLKRTNQEIRNEFKNQPRLSRDEKVTLILFVATVALW VFSSQLSDLLGVDIPIALPVILTTMLFFFPGVSKTKYKEIEKEMSWSSIILVLSGVSLGM VLYQTGVANWIALGLLGNLGGLSPILMIFVVVLSVSLMNITLSSATVSASIVIPIIIELS INLGISTLAIAFPAALASSLAFILITSTPTNVIAYSAGYFSIKDFAKAGVLMSIVSCVIV AVVMYGVGLLTGLY >gi|157101637|gb|DS480687.1| GENE 8 7853 - 8482 471 209 aa, chain + ## HITS:1 COG:CAC2606 KEGG:ns NR:ns ## COG: CAC2606 COG0698 # Protein_GI_number: 15895864 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Clostridium acetobutylicum # 1 185 1 189 213 178 46.0 7e-45 MKIAVINEVSASIRNEDIVNALHKTTNAEVLNVGMKNPEQTPSLTYIHTSYMAAVLLNTG SCDFVIGGCGTGQGFLNGVLQFPGVVCGLIVEPLDAWLFSQINGGNCVSLPLNKGYGWAG SINLEYIFEKLFADPAGAGYPKERAKSQAMSRQTLASISKLSHKDFTDILKASDPEILKA IAGSKDFMKVLADGGEKAKAVLELLGENR >gi|157101637|gb|DS480687.1| GENE 9 8495 - 9253 574 252 aa, chain + 
## HITS:1 COG:BH2137 KEGG:ns NR:ns ## COG: BH2137 COG1414 # Protein_GI_number: 15614700 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 1 247 1 248 251 146 31.0 3e-35 MVQSLARGIEILSIIQERGSATIVEVASVLGVDKSTVSRLMATLMHYDMVSIDPVSKKYR LGFRNLYLSEGVKKNFNVAIVARPYLYKICDDTKESVHLAAMGNRKMYIVDQVRSQREYN LSAQIGMIEAWHCSAVGKCVLAYKPPSFIEDIFRDYDFEHYTSNTITTYQNLEKELKKIR EQGYALDDEERTLGVRCLAVPVFNYSGNVSCCIGISAPKEQITEATIKKYTLCMKKYGSQ ISKELGYGLYRS >gi|157101637|gb|DS480687.1| GENE 10 9293 - 10066 219 257 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 13 252 4 238 242 89 27 3e-17 MSNFIDKFSLAGKKAIVTGGAKGLCNGMAQALHDAGAEVVLLDILDMVEDSAGEMAKKGP MVHAVRGDLTKTEELENIYNECLENLGGRVDILLNGAGIQFRAPAVDFPHDRWEKIIALN MNAVFYMSQLAGKTMLAQKYGKIINIASMTAFFASVLIPAYSASKGGVAQITKALSNEWA GQGVNVNAIAPGYMATELTANIKEVNPKQYEEITSRIPAGRWGRMEDLQGLAVFLASDAS AYISGAVIPVDGGFMGK >gi|157101637|gb|DS480687.1| GENE 11 10053 - 10976 325 307 aa, chain - ## HITS:1 COG:BH1906 KEGG:ns NR:ns ## COG: BH1906 COG2207 # Protein_GI_number: 15614469 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 16 302 8 291 299 128 27.0 1e-29 MALQDCTLNLNRTGRELQPHGTPDFPCAGYSSVYTDSAGDVVPWHWHEDLEIIYVESGHL KLQVPGKTLHLKQGEGAVVNSNILHFAAAEPLCELHSLVFHPLLVTGEKDSVFSRRYVAP LLQCRSFDICPFDYLAGDLRAGNPAAVFTAAFDAFCGEPLGYEFVVREKLSCICCCLYMY YSHAMDSSKPMPDTDNIRIQKMISFIHEHFQENLDLAQIARAADIGERECLRCFKRAIQT SPLQYLLKYRVTQGASMLLRSPHSSISSIAGMCGFNSPSNFSQMFVRFFKCTPREYRKAS CPPSIFP >gi|157101637|gb|DS480687.1| GENE 12 11144 - 12340 1017 398 aa, chain + ## HITS:1 COG:no KEGG:Sgly_3193 NR:ns ## KEGG: Sgly_3193 # Name: not_defined # Def: major facilitator superfamily MFS_1 # Organism: S.glycolicus # Pathway: not_defined # 1 388 1 388 392 531 69.0 1e-149 MFQLLLVIIYLAFISLGLPDALLGSAWPSMYRELGASVSYAGIISMIIAGGTIISSLFSD RLIRNFGTGKVTVVSVAMTAAALFGFSSSHSFLQLCVWAVPYGLGAGSVDAGLNNFVALH 
YKSRHMSWLHCFWGIGATAGPYIMGLCLTRGFKWNSGYMVVGVIQIALVACLVLSLPLWK VKKDGNSGQDTPHRQITLREGLRLPGAKAVLTAFFCYCALEATAGLWASSYMVLHKGIAP QTAAKWASFFYLGITLGRLASGFVTDRLGDRNMVRCGQLTAFAGTVILLLPAGDGAVLTG LIMVGLGCAPIYPSLLHETPDNFGVQYSQSMMGMQMACAYVGSTFMPPLFGVAAERISVT LYPVCLVIFVVLMIGMTERMGRVQNGKVNGLTHAQLLQ >gi|157101637|gb|DS480687.1| GENE 13 12297 - 13205 785 302 aa, chain + ## HITS:1 COG:BH2283 KEGG:ns NR:ns ## COG: BH2283 COG0613 # Protein_GI_number: 15614846 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Bacillus halodurans # 6 272 8 258 290 106 29.0 5e-23 MEKLMDLHMHSYYSDDGEFSPEELVRQCAMSGIRVMSIADHNSVRANDGGRQAARRAGIR YVSGIEIDCTFRGVNLHVLGYGFDDTSDDFADIEDNISSQAATASRQMLCATRKMGFDVT EEDMEELAGDYYWKDRWTGEMFAEVLLGREEYKDHPLLLPYREGGARGDNPYVNFYWDFY SQGKCCYVKMKYPKLEQAVDIIHRNGGYAVLAHPGVNLKDCGELLDPILEAGVDGIEAFS SYHSEEQAGVYHKAARGRFRMITCGSDYHGKTKPSISIGGHGCTVPYEEMVRQLERILGE NG >gi|157101637|gb|DS480687.1| GENE 14 13444 - 16413 1789 989 aa, chain + ## HITS:1 COG:no KEGG:Rumal_0452 NR:ns ## KEGG: Rumal_0452 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 69 310 121 368 1508 82 28.0 1e-13 MEWFIILLIWLVAPFAELAAIIVLAVANSRYKRRIWELTRGTDKADTGMQDKGPADIEIQ HEGPADIEIQRERPADTGIQYERPADIEIQREKPTDIKIQHEGPADTEIQHERTADAVTQ HEKSAAWYGPGQAIVVPEDFEKAESACRVPEIPHKQTALVRAPKIDGGFFQGTAALVIGV IFVVLAGLIFATTAWHVLPSVCKVIMVLGGSGLFFGASWAAEKYFKIERTGQAFYILGSV FLFLTILAAGYFGLLGPEFILKGENRYRVLLAGSIATEIALFSKIRRFRNKVYTQACLWG MTVSMCFLMGALKLEWPDCMNGMMYYSFLLIGGNEVYRRKSGLQGSMNAPADRFIDEFMG EFGLFAALQFWIISGMMAYKAVAGGIGFIIGILFLGIWEVTFWDTLAFGLMAAGTALVAL RRRSPEMKMLHSLTMMIFFQYAGFCIPVDFTYQVFLGAAMTAVWFLAERRFKNPLNNLAG GCVFSAVLAIDMAVLLLDMLFSRDSLGNQLAASAVVILLAAVMAEWGRRYPVLRGSVAAV LFALTLTGWEVFNRTLGMDVGYDVVVWGYVLIVSIWDMVKKDRFCIPILAIGTAAQAVAR LENQETLSFFLLLSVYLLVKSFGREGTARERFIRGSCLYSLAGVYLLAEGATANGVLRMV WTAAVYGLEYASVILHDRSKIRDRFWNCTGMTVFLMMMGAFYSDPSLALWNMILCMAVFE VIYVMLYRSGCSWLHLAAAAAVLPLPMIATARYGLNENQVYGATAALLILSGILFRRFRP 
VMVRREGESGGWDVDWFHILVIFVLIPMAWEAGRGWQCAYILLTALYVLQFAVLEHWKRA AFTLAAALAAAAFWRQPFIRWPEMLSLEIQLIPAAGFIWSLGRIWGDRKEITNLQTVLYC LCLGAMASDALSTGAVWDALILEAVNLAVFLLAHMRKCMRWIRISGIIMILVALYMTKDF WLSLSWWVYLLAAGLGLIVFAAVNEMKKR >gi|157101637|gb|DS480687.1| GENE 15 16639 - 18273 1230 544 aa, chain + ## HITS:1 COG:CAC2940 KEGG:ns NR:ns ## COG: CAC2940 COG4585 # Protein_GI_number: 15896193 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 327 541 180 390 402 94 29.0 8e-19 MLDIMLRIAYLSAIGLSLYTSGWLLMKADKTRTTGALAACQLLIIIWCMPQLFSALPMTK GMKYLAYGISYIGISFIGPAWLEFAFLYSRRKLGHGAELFLFGISAVNYSVLLTNEYHHL FYASFEVAQVVYGPVFYIHMVYTYICVLAGMAVVLAAFKKNRVALAHIAVILLAAAVPLA FNLLYMTGLVRTGFDLTPPAFSLTSVLMLLAVFRYDFLDVNTMAFDKIFDSIAEGVVVYN RRDKITYCNGAAVHWLGLQTGDDMEPLRRILGEKGAKADCADSQTPVFTLEDNGERRRLE VRQYIHRDKKGDMVAGTIMLTDVGRYYQLLEQGRELAVTNQSLAIEKERNRIAQEVHDTA GHTLTMINSLLRLIRIGYKEERGQEQDKRIEEYLIQAQELAGSGIRELRCSINNLRQSAS YGLISQGVYQLTGSVKEFEVEVEIQGEDRQEYSHLSPVVYDCLREAITNCHKYAHATHMD VILKFGADSLSLYMFDNGKGCLHIEEGNGIRGIRQRTEQAGGTVRFISESGEGFQIYICL PYGN >gi|157101637|gb|DS480687.1| GENE 16 18332 - 18967 740 211 aa, chain + ## HITS:1 COG:all4635 KEGG:ns NR:ns ## COG: all4635 COG2197 # Protein_GI_number: 17232127 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Nostoc sp. 
PCC 7120 # 1 204 1 208 219 176 44.0 3e-44 MIKAVIADDIQILRQGLRAILEQDKDIQVVGLAADGREAWQLCRKHRPDVVLMDMRMPEY DGSYGITRIKEDYPDIKVLVLTTFDDRETVDAAVGSGADGYILKEMEDDKVIQSVKAVCA GMRVFGGSVFEGMRRQMAPGKIKADMAGDLTPRERDIMRLVARGMDNREIAGALFLAEGT VRNNISRLLEKLKLKDRTQLAVFAVKHNLDE >gi|157101637|gb|DS480687.1| GENE 17 18978 - 19724 658 248 aa, chain - ## HITS:1 COG:SPy1870 KEGG:ns NR:ns ## COG: SPy1870 COG2188 # Protein_GI_number: 15675689 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 243 1 244 247 248 54.0 8e-66 MDADVLSEKAKKLKHVKVYNRLYSMIQDGVYPPGSQLPSEPELALQMDVSRMTLRRALAL LQEDNLVINIRGKGNFISERNPGASMPGLEVTQHPVRCTLSGSIDETEMEFRIEPPTESI SQNLKRKTAVVVIADRWYKSAGKACAYSLSFIPIEVISGKQIDLREKEDLFQYLEHGVYE DAVSSTCQLSYTTTGNFTAVKYMLSQHASFILVQETLYDENSRVLVSSKHYIPVESFKTQ VNAVAGRA >gi|157101637|gb|DS480687.1| GENE 18 20115 - 20894 979 259 aa, chain + ## HITS:1 COG:SPy1869 KEGG:ns NR:ns ## COG: SPy1869 COG2820 # Protein_GI_number: 15675688 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Streptococcus pyogenes M1 GAS # 1 257 1 257 259 462 87.0 1e-130 MQNYSGEEGLQYHLQIRKGDVGRYVIMPGDPKRCEKIAKHFDNAVLVADSREYVTYTGYL DGEKVSVTSTGIGGPSASIAMEELVLCGADTFIRVGTCGGMDMDVKGGDIVVATGAIRME GTSREYAPIEFPAVADLDVTNALVSSAKALGYTYHAGVVQCKDAFYGQHEPKRMPVSYEL LNKWEAWKRMGCKASEMESAALFIAASHLRVRCGSDFLVVGNQERQEAGLDNPIVHDTEA AIKVAVEAVRRLIQADKQA >gi|157101637|gb|DS480687.1| GENE 19 20920 - 22122 1588 400 aa, chain + ## HITS:1 COG:SPy1868 KEGG:ns NR:ns ## COG: SPy1868 COG1972 # Protein_GI_number: 15675687 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside permease # Organism: Streptococcus pyogenes M1 GAS # 1 400 1 400 400 493 73.0 1e-139 MKILLNLTGILLVLAIMYLISWKKKNISVKMLVKAVIAQFLIAVILVKVPAGRYVVSRVS DAVTSVINCGQDGLSFVFGSLADSTAATGSVFAIQVLGNIVFLSALVSLLYYIGILGFVV KWIGKAVGKLMGTSEVESFVAVANMFLGQTDSPILVSKYLGQMTDSEVMVVLVSGMGSMS VSILGGYTALGIPMEYLLIASTLVPVGSIMVAKMLLPQTEEVREVGSVKMDNKGNNTNVI 
EAVAEGAVTGMQMALSIGASLVAMVALVAAVNKLLGVCGISLQQVFSYVFAPFGFFMGLD PSEILLEGNLLGSKLVLNEFVAFQQLGSMISSMDYRTGMICAISLCGFANFSSLGICVSG IAVLCPEKKSTLARLVFKAMLGGVAVSLISAMVVGLVTLF >gi|157101637|gb|DS480687.1| GENE 20 22135 - 23322 1537 395 aa, chain + ## HITS:1 COG:lin2068 KEGG:ns NR:ns ## COG: lin2068 COG1015 # Protein_GI_number: 16801134 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Listeria innocua # 3 395 4 393 394 457 57.0 1e-128 MGKYKRIFVVVLDSLGIGAVEDSPEYGDVGVDTLGHIAREVPGLKIPNLKKLGMVNLHPL EGMEPAEHPLGRYMRLKERSRGKDTMTGHWEMMGLLVTTPFQTFTSHGFPKELIGELEKR TGRKIIGNKSASGTEILDELAEEEIREGHLIVYTSADSVLQICGNEETMGLDNLYHYCEI ARELTLRDEWKVGRVIARPYTGMKKGEFRRTSNRHDYALKPYGRTALNALKDAGYDVVSI GKIYDIFDGEGLTQSNHSNSSVHGMEQTIQYAKTDFNGLCFVNLVDFDALWGHRRNPEGY GRELERFDEKLGELLPLLGEDDLLILTADHGNDPTYTGTDHTREQVPFIAYSPSMEGGKD LGSADTFAVIGATVADNFGVKMPEGTIGTSVLNEL >gi|157101637|gb|DS480687.1| GENE 21 23391 - 24026 772 211 aa, chain + ## HITS:1 COG:SPy1867 KEGG:ns NR:ns ## COG: SPy1867 COG0274 # Protein_GI_number: 15675686 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Streptococcus pyogenes M1 GAS # 1 210 1 210 223 254 65.0 6e-68 MDKKDILSRVDHTLLKQTATWEQIEKLCREGLEYTAASVCIPPCYVKQAKDFVGDKLAVC TVIGFPNGNMTTAVKVFETEDAVKNGADEIDMVINIGLVKAGHYDQVLDEIRQIKAACGG RCLKVIIETCLLTEEEKKEMCRVVTESGADFIKTSTGFSTAGATPEDVALMRKYSGPEVK VKAAGGIASIEDAQRFIELGADRLGTSRLIP >gi|157101637|gb|DS480687.1| GENE 22 24144 - 25046 748 300 aa, chain + ## HITS:1 COG:BH2747 KEGG:ns NR:ns ## COG: BH2747 COG0697 # Protein_GI_number: 15615310 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus halodurans # 9 286 11 284 302 141 32.0 1e-33 MKLMEKHPLVMIIIGIAGISLSAIFVKYSQAPSVVTALYRLMWTVALMTPVVLGRKDCRR ELRETDRKTVLLCGASGVFLALHFTAWFESLNQTSVASSTAIVCTEVIWVARGYCLFMKG 
KISVPAGVSILVTVGGSLLIAFSDYSAGGNHLYGDVLALAAAVFCAVYTLIGRQARGYMS TTIYTYIVYVFCALALGLATAFSGLAFTGYGVRSVVVGLLLSVCSTLLGHSIFSWCLKFF SPSFVSASKLCEPVAAAAFALFLFREVPALLQIAGGVVTIGGVLLYSQVEKKENDVLKGK >gi|157101637|gb|DS480687.1| GENE 23 25169 - 25921 739 250 aa, chain + ## HITS:1 COG:CAC0284 KEGG:ns NR:ns ## COG: CAC0284 COG0846 # Protein_GI_number: 15893576 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Clostridium acetobutylicum # 3 238 5 241 245 134 31.0 2e-31 MNEKIAQLRKILDDSTYTVALCGSGMMEEGGFIGIKKQDKAYDIENRYGYGVEEMYTSAF YNTRPEQFFEFYKKEMLHNAPGDTASGPALAAMERAGKLQCIIDSNIYDKARRGGCRHVI NLHGSIYQNQCPRCKKKYPIGYMAGAKRVPICRDCNVPIRPMISLIGEMVDSQNMTKTTE EITKADTLLLLGTTLASEVFCQYIQYFAGRSMVIIHKQEHYLDKDANLVILDHPMNVLPQ LGYGEEKTEE >gi|157101637|gb|DS480687.1| GENE 24 26018 - 26596 599 192 aa, chain + ## HITS:1 COG:mlr7497 KEGG:ns NR:ns ## COG: mlr7497 COG3034 # Protein_GI_number: 13476231 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mesorhizobium loti # 50 190 40 161 163 61 30.0 1e-09 MKFRKCAAAGVLALCLAAAGPLSALAGQNAGPGAAYQPSVNGVNIYVSKMGNTLTLKQYA QTIGTWPVKLGRRSETGDKVQEGDEITPSGSFYVCTRNDQSICYLALGLSYPNAEDAERG LRDGLINEEQYNAIVEANKAGIQPPWNTPLGGAIEIHGDQGGGTSGCIAVTNDVMDILWE YCPLGVPVTVGP >gi|157101637|gb|DS480687.1| GENE 25 26807 - 27070 269 87 aa, chain + ## HITS:1 COG:no KEGG:MGAS2096_Spy1119 NR:ns ## KEGG: MGAS2096_Spy1119 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 87 11 97 97 107 60.0 1e-22 MKYVNAKAVLPHHLVEELQEYIQAGYIYIPAREDRHRAWGEQSGCRRELAERNAGIVDAY RQGVSLEELGDKYCLSVHAIRKIVYQK Prediction of potential genes in microbial genomes Time: Thu Jun 30 18:02:52 2011 Seq name: gi|157101636|gb|DS480688.1| Clostridium bolteae ATCC BAA-613 Scfld_02_29 genomic scaffold, whole genome shotgun sequence Length of sequence - 14263 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 4, operones - 2 average op.length - 
8.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 242 - 274 -0.8 1 1 Tu 1 . - CDS 364 - 1359 491 ## COG0582 Integrase - Prom 1389 - 1448 5.2 + Prom 2073 - 2132 4.3 2 2 Tu 1 . + CDS 2157 - 2360 111 ## gi|160938562|ref|ZP_02085915.1| hypothetical protein CLOBOL_03458 3 3 Op 1 . - CDS 2387 - 3403 651 ## gi|160938563|ref|ZP_02085916.1| hypothetical protein CLOBOL_03459 4 3 Op 2 . - CDS 3405 - 3917 555 ## gi|160938564|ref|ZP_02085917.1| hypothetical protein CLOBOL_03460 5 3 Op 3 . - CDS 3904 - 4620 531 ## CTC02115 phage-like element pbsx protein XkdT 6 3 Op 4 . - CDS 4620 - 5066 293 ## Dhaf_4811 phage protein 7 3 Op 5 . - CDS 5068 - 5295 188 ## gi|160938567|ref|ZP_02085920.1| hypothetical protein CLOBOL_03463 8 3 Op 6 . - CDS 5297 - 6286 988 ## Dhaf_4809 hypothetical protein 9 3 Op 7 . - CDS 6306 - 6965 547 ## COG1652 Uncharacterized protein containing LysM domain 10 3 Op 8 . - CDS 6983 - 9100 1656 ## COG5412 Phage-related protein 11 4 Op 1 . - CDS 9286 - 9693 491 ## gi|160938572|ref|ZP_02085925.1| hypothetical protein CLOBOL_03468 12 4 Op 2 . - CDS 9711 - 10148 384 ## CDR20291_1207 hypothetical protein 13 4 Op 3 . - CDS 10159 - 11229 1180 ## BBR47_29220 hypothetical protein 14 4 Op 4 . - CDS 11234 - 11704 434 ## gi|160938575|ref|ZP_02085928.1| hypothetical protein CLOBOL_03471 15 4 Op 5 . - CDS 11720 - 12157 493 ## Sterm_1434 hypothetical protein 16 4 Op 6 . - CDS 12157 - 12519 277 ## gi|160938577|ref|ZP_02085930.1| hypothetical protein CLOBOL_03473 17 4 Op 7 . - CDS 12531 - 12887 341 ## gi|160938578|ref|ZP_02085931.1| hypothetical protein CLOBOL_03474 18 4 Op 8 . - CDS 12904 - 13857 752 ## Cthe_2479 Lj928 prophage protein 19 4 Op 9 . 
- CDS 13859 - 14263 467 ## gi|160938580|ref|ZP_02085933.1| hypothetical protein CLOBOL_03476 Predicted protein(s) >gi|157101636|gb|DS480688.1| GENE 1 364 - 1359 491 331 aa, chain - ## HITS:1 COG:SP0890 KEGG:ns NR:ns ## COG: SP0890 COG0582 # Protein_GI_number: 15900773 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 56 328 47 319 321 155 33.0 1e-37 MDEIRLKDELMARLSNELDRPALQVIDGALSSVLRNYEVAKRETGLSTNVVSFPELDIFI GKMRFENYSASTVNQYQRFLTDLLVYVGKPVQEITGEDVVECLNYYEQVRQISTSTKDHK RRIASSFFAFLHDRGYIRKNPMATVDPIKYVAEIREALTSREVEKLRIACGGNVRDNAVL ELFLATGCRVSEVVGMHVEDIDMQVGCVKVLGKGQKERIVFFGDRAMEYLERYLDGRRAG AVILSSRAPHQGLKKNALENIIRRIANQAGLGKRVFPHLLRHTFATRALNKGMPLPTLCD LMGHASVETTRIYAKNGVGKIKYEYDMYAAG >gi|157101636|gb|DS480688.1| GENE 2 2157 - 2360 111 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938562|ref|ZP_02085915.1| ## NR: gi|160938562|ref|ZP_02085915.1| hypothetical protein CLOBOL_03458 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03458 [Clostridium bolteae ATCC BAA-613] # 1 67 80 146 146 129 100.0 6e-29 MIVDDVLHFIAPEAGTGTMAFMYLVKVNIVNGGFGLYVEAANRICFLESNAWLENRNWSN SFNLYIR >gi|157101636|gb|DS480688.1| GENE 3 2387 - 3403 651 338 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938563|ref|ZP_02085916.1| ## NR: gi|160938563|ref|ZP_02085916.1| hypothetical protein CLOBOL_03459 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03459 [Clostridium bolteae ATCC BAA-613] # 1 338 1 338 338 616 100.0 1e-175 MANEHAIQATGTFVVSKGFRLLTKLAASQGSLQFTRAAVGTGKPPEGYSPESMIGLNAYK IDAEIADYGVQDDMAYITVQVSSDNVTEGFLVTEVGVFAEDPDEGEILYGYMDISTDPTY IYANGSTNRSKFAEFTLYVLIGSVSNVIAAVTPGSIITRDTFTAANLKAIDTHGILGGEA GAGTTGQGLIDALTNKLLTEFVTNTGLMERLGTYVLKSKIVNDFLSTDEETVLSGPMGKL LKEQLNVLNTKIIKAPDYKNYKSVSLPFTAPNDGYICGGVYSLGTTACYIYVNSTQVGWN HSGGGSNTASPFMYPLSKGDIVTAIGYDQFRAYFIYTK >gi|157101636|gb|DS480688.1| GENE 4 3405 - 3917 555 170 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938564|ref|ZP_02085917.1| ## NR: 
gi|160938564|ref|ZP_02085917.1| hypothetical protein CLOBOL_03460 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07092 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07092 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03460 [Clostridium bolteae ATCC BAA-613] # 1 170 1 170 170 316 100.0 4e-85 MLLDEKMLPGIKIRIPQMEDLLQAEQAYLDVVLEVVEWLRERMILLNEEVMNIPNLKAKI KQITGWDCEILEDAEHLTLTIRYYFDAREPVLEQEERIMKYIPAHLKAVHEYLQRYAGKR KLYAGTMLGTYVRYIGRPQETNGHREGKAHVRVQGNMYIHTKMIVYPEGR >gi|157101636|gb|DS480688.1| GENE 5 3904 - 4620 531 238 aa, chain - ## HITS:1 COG:no KEGG:CTC02115 NR:ns ## KEGG: CTC02115 # Name: xkdT # Def: phage-like element pbsx protein XkdT # Organism: C.tetani # Pathway: not_defined # 49 220 181 343 359 81 29.0 4e-14 MELEWNASDILARLKAGLKNEDTRIEGSFSMDNMQAVSEELARYNSMLIKPLWDEIDLRI DEIITSGNENHYAFWARQVENAEGKRVIGSARVHGVRDGSGIVHVALLTPEAGAPTPEVV ELVRAYIETQRPVGAQPVIRAAEAVEVIINGIIELQEGADMESVRTQAGKEVKSYLAEVA LEGKKETVLNYYRIGMLIGGTPGVKEIVNYTVNNGEESITASYDRFFALKGLTLNASG >gi|157101636|gb|DS480688.1| GENE 6 4620 - 5066 293 148 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_4811 NR:ns ## KEGG: Dhaf_4811 # Name: not_defined # Def: phage protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 137 1 133 138 89 35.0 3e-17 MGVFPFINTETVQEAASKKLPLFREYAYDFEKHCLKLDDDGRTYLVQGNEALRIWIYFAL ETARYRYTAYDTTFGSEIEEQLIGQPMNDEVTQMELERYTTEALMCNPYIEELSEFDFVL QKDGIKESFRCRSVYGEETIQHDIKAVR >gi|157101636|gb|DS480688.1| GENE 7 5068 - 5295 188 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938567|ref|ZP_02085920.1| ## NR: gi|160938567|ref|ZP_02085920.1| hypothetical protein CLOBOL_03463 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03463 [Clostridium bolteae ATCC BAA-613] # 1 75 1 75 75 145 100.0 9e-34 MGGYTAKLARRIKEQGANGSSPLILAEYVSPSAIRIGGELFSHNVHGNPQCDAMAGDTVL AAQIGSSFYVICREV >gi|157101636|gb|DS480688.1| GENE 8 5297 - 6286 988 329 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_4809 NR:ns ## KEGG: Dhaf_4809 # Name: not_defined # Def: hypothetical 
protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 6 329 11 335 335 159 30.0 2e-37 MNILVGDKDLSELVETITWSGDSGQIARKLEFTIAKNTQDPNFPNVTINEGDQVLLQTDT GTFLFGGIIFDIEKMAGSNLVRYLAYDLIFYLTGSELTKDYNGAPEDIARDVCGALGITA GTMAATGITITNPCVKKTGYQVIQSSYTAAARKNGKKYQLMMTEVNRVSVIEKGQDSGVI LTGDSNLSDATYKTTLQNLVNKVLITNQKGTVVGIVEDTESQAAYGTIQKVLEQEEGKDA AAEAKNLLHGSDPSTTVTGIPDDTRAMAGYALLVQEEETGLIGQFFIESDSHKYSNGEST MSLTLAFQNLMNEVEIDKPSENKKQKRGG >gi|157101636|gb|DS480688.1| GENE 9 6306 - 6965 547 219 aa, chain - ## HITS:1 COG:BS_xkdP KEGG:ns NR:ns ## COG: BS_xkdP COG1652 # Protein_GI_number: 16078334 # Func_class: S Function unknown # Function: Uncharacterized protein containing LysM domain # Organism: Bacillus subtilis # 14 219 35 234 235 96 32.0 3e-20 MGKTRKIKLGDIVLPVNPVELEVITPQLNKRLTLLNMGTVNLKGNRDVATATISSFFPSR ESPFYRYADMSPKKYKAKIENWKENKETKRLIITDMGVNLAMLIDKCSFKVKEGGGDMYY TLELSEYRNLTVPTVSIPLQVRDNGLKQRPDEAVPAKTHTVGSGDTLWGIAKKAYGNGTQ SSRIYAANSAVIEAAAKQHGKSSSNNGWWIYPGTVLAIP >gi|157101636|gb|DS480688.1| GENE 10 6983 - 9100 1656 705 aa, chain - ## HITS:1 COG:BH3518 KEGG:ns NR:ns ## COG: BH3518 COG5412 # Protein_GI_number: 15616080 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Bacillus halodurans # 287 664 497 899 940 70 21.0 1e-11 MGGVTGSISLKDNASATLRNLRSEQSKLREDTKKTSSALKSVWGTPRKLKADVSDATRAL KRVTDAAKKTKPVTVAVKAKDTATKVIKGAGTVLKAVGRPVTAVLKAKDTAFKVVKKTAG ALDRLGRKVASPVVKVVDKASGAIKGIVGRIGKAAKAVAIPVGIAAAAGTAALGASVSAG MQLEQQQISIEHFVGATNKELSQADVKAQSQSYIQQLRENANATPFETGEVIQAGSRAIS LAGGSTTDAMNLVTLAEDMAAASGGTASIMDAIEALGDLKVGETERLKSFGFKVSAEEFK KKGFSGVSGELQDFFGGAAGKLAGSGAGLMSTITGKMKSNAADFGLGVVEQLKPVLSEVI GLMDEAQPVLKKLSSGFGQGLGKGIGLAKTAFTQIAPIISSVVSTALPIATSFLGSMSTV FGQIAPVVTTAMTGIGPVVMSLVPIIQQAASIIGAVAGGIGSAISAVAPVVTTIMSEIGD KIGGVVEFLAERSEFIHSVIETAGPAIAKVLETAWGIISPVMDILITTFELVFGVVQKVW PGIQDTISGVWSHLEPIFDTIGKGADLLAGAWSKVKDLVTGGGDTGSGSSGGGASPGKNA RGDNNWRGGPTWVGEKGPELIDLPKGTRILPSKESFAWAAMGDSNIIQLSRYPADGGDVT PSGGKTITIQIQKIADEVHVRSEDDIEEISERTAKKIIEELDNTA 
>gi|157101636|gb|DS480688.1| GENE 11 9286 - 9693 491 135 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938572|ref|ZP_02085925.1| ## NR: gi|160938572|ref|ZP_02085925.1| hypothetical protein CLOBOL_03468 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03468 [Clostridium bolteae ATCC BAA-613] # 1 135 1 135 135 216 100.0 3e-55 MEQNKEDIFKRFLAKAEKRAEEKKIHRTCLVRVPSMDERIRIRGLSKQEIAEVSEIDNTD DPYAGDKYSVYIATVDPDLKAVAKQMKEDGNIQEYTDVVDIFEIYETRQLAEKIMELSGV SGKNKIEVIEESLKN >gi|157101636|gb|DS480688.1| GENE 12 9711 - 10148 384 145 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1207 NR:ns ## KEGG: CDR20291_1207 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 3 141 2 141 142 80 32.0 2e-14 MAGNVRGNRTLTGSWGEVWVDGEKIFVLQKIEQKVEVNREDVQMGMDVDSKMTGLKGSGT LSIKKVYSRAKAVLEKLSAGQDVRCQIIAKLKDPDAVDGQIERWSTDNVWWNTIPVISWE TGGQVQEEWEFGFTPSDMKNLDEIK >gi|157101636|gb|DS480688.1| GENE 13 10159 - 11229 1180 356 aa, chain - ## HITS:1 COG:no KEGG:BBR47_29220 NR:ns ## KEGG: BBR47_29220 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 3 356 1 352 352 299 47.0 1e-79 MALGLPNITIIFKGLAASAIERSERGIVVCMIKDDTEGGQRLSVYESILDIDFEHMTEAN YGYLKLLFEGGPVRVIVLRESAESPNLAAALKELMYLRWNYLCYPDISDEDKTTLAAWIK EMRNKNHKTFKAVLAASASDHEGIINLTTDGIESSITGKTHTAKEYCARIAGVLAGLSLS RSSTYYVLGDVLKAECPSDPDERINKGEFILVFDGEKYKVGRGVNSLTTFTKEKTEDVRK IKIVEGMDLYQDDIRNTFTDSYVGKYVNDYDNKQLFVAAVRAYQAGLAGEVLDASYDNTA SVDAQAQKQYLQERGQDVSQMTETEILTANTGAKVFIASHVKFVDAMEDLQMTVNM >gi|157101636|gb|DS480688.1| GENE 14 11234 - 11704 434 156 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938575|ref|ZP_02085928.1| ## NR: gi|160938575|ref|ZP_02085928.1| hypothetical protein CLOBOL_03471 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03471 [Clostridium bolteae ATCC BAA-613] # 1 156 1 156 156 289 100.0 4e-77 MDENILQAVKDAAISMLRGHEPDVDVYAEEIMRTDRTLEPENEDSSKWYFVEVIPTSFTT ISPDQTEAALMVSVDYHEPEESIRRYGEKAMELDRVFRPVFPFAYGGERRAATVARVTTN 
ISGGMLHLTFPLTFIVSDRAEEGTEIGILEMRSERG >gi|157101636|gb|DS480688.1| GENE 15 11720 - 12157 493 145 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1434 NR:ns ## KEGG: Sterm_1434 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 3 139 2 143 149 82 35.0 4e-15 MAGTEYRMDGLDEWEQELARAIEEQYPEEFKAMVIQVAMELQGKVKEKTPKKTSWLQNNW KVGEVRKRGDEYVIEVYNNVEYAEAVEWGHRQKVGQYVPALGKRLKAKTVKGAHMMELSL AELQAVLPGYLQEWMNDFLNTHDIV >gi|157101636|gb|DS480688.1| GENE 16 12157 - 12519 277 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938577|ref|ZP_02085930.1| ## NR: gi|160938577|ref|ZP_02085930.1| hypothetical protein CLOBOL_03473 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03473 [Clostridium bolteae ATCC BAA-613] # 1 120 1 120 120 227 100.0 2e-58 MTEADILALTYEDTVTVYRPFKDRLPNGETAFHRKAEGRKVYESISCALSTHTGGTLNRE LPAGSVPTQYSLFVRPEIEIEPNDYLEIKQRGRLTKAMAGLAERQPSHNQVPLVMEQERV >gi|157101636|gb|DS480688.1| GENE 17 12531 - 12887 341 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938578|ref|ZP_02085931.1| ## NR: gi|160938578|ref|ZP_02085931.1| hypothetical protein CLOBOL_03474 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03474 [Clostridium bolteae ATCC BAA-613] # 1 118 1 118 118 202 100.0 7e-51 MTSEQLTWITGEVMASLKLSDDKKSDVERCIRRIGTMVLIRCNREDIPKMLEPVIAQMVE DTLKEEMNLSSAGAVSSVTRGDTSITYRDDTALTQASSRLLKDYEPQLRRYKKMNLPK >gi|157101636|gb|DS480688.1| GENE 18 12904 - 13857 752 317 aa, chain - ## HITS:1 COG:no KEGG:Cthe_2479 NR:ns ## KEGG: Cthe_2479 # Name: not_defined # Def: Lj928 prophage protein # Organism: C.thermocellum # Pathway: not_defined # 1 306 1 305 307 360 58.0 4e-98 MANTIEYAKIFQPELDAAAVEQATSGWMEVNRDLVRYSGGDEVKIPNVVMDGLADYDRAN GFVAGSVDLSWQTMKMTKDRGRSFQIDENDVDESGFVLAAASLMGEFQRVHVVPEIDAYR YSTIAQKCMGVDLAAYSYTPTESSILKALLDDIAAVQDVVGENIPLIVSISTLVLNLLNN SDKLSRRLDVTDFSQGEVTVKVKSLNGIPLRSVPSSRMKSKYVFQDGKTAGQEKGGFKAA EDAIDINWLITPQNAPIAVSKTDKMRIFDPETNQKARAWGWDYRRYHDLWITKEKLKTCR ANFKQAKPASVEPEEPA >gi|157101636|gb|DS480688.1| GENE 19 13859 - 
14263 467 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938580|ref|ZP_02085933.1| ## NR: gi|160938580|ref|ZP_02085933.1| hypothetical protein CLOBOL_03476 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03476 [Clostridium bolteae ATCC BAA-613] # 1 134 1 134 134 142 100.0 6e-33 AVKEAKEAEEARRAEEKRLEKLSPQEREAEEREAIKKENAELTGKLKRMELEQKASAKLA EKKLPGGLSEFLDYTDEARMAASLEKIGAMYQEQLETGIKERLKGTTPKGLGGAASLTDG MISAEIQKRIRGGL Prediction of potential genes in microbial genomes Time: Thu Jun 30 18:05:40 2011 Seq name: gi|157101635|gb|DS480689.1| Clostridium bolteae ATCC BAA-613 Scfld_02_30 genomic scaffold, whole genome shotgun sequence Length of sequence - 216676 bp Number of predicted genes - 174, with homology - 172 Number of transcription units - 81, operones - 34 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 843 526 ## COG0582 Integrase - Prom 922 - 981 6.9 2 2 Tu 1 . - CDS 1235 - 1429 76 ## - Prom 1631 - 1690 5.7 + Prom 1244 - 1303 3.7 3 3 Tu 1 . + CDS 1325 - 1705 247 ## Tmar_1157 response regulator receiver protein - Term 1712 - 1764 2.1 4 4 Tu 1 . - CDS 1857 - 2825 415 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 2855 - 2914 6.6 + Prom 3042 - 3101 11.9 5 5 Tu 1 . + CDS 3225 - 3428 269 ## COG1278 Cold shock proteins + Term 3515 - 3554 1.3 - Term 3586 - 3632 7.2 6 6 Op 1 . - CDS 3641 - 4975 1156 ## COG3048 D-serine dehydratase 7 6 Op 2 . - CDS 5021 - 6364 790 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 8 6 Op 3 . - CDS 6364 - 7674 885 ## COG1524 Uncharacterized proteins of the AP superfamily 9 6 Op 4 . - CDS 7689 - 8480 843 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 8664 - 8723 9.2 + Prom 8635 - 8694 6.0 10 7 Tu 1 . + CDS 8746 - 10023 1215 ## COG0477 Permeases of the major facilitator superfamily + Term 10048 - 10093 10.2 11 8 Op 1 . - CDS 10013 - 11278 999 ## Cphy_1046 hypothetical protein 12 8 Op 2 . 
- CDS 11275 - 11766 639 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 11876 - 11935 2.1 - Term 11882 - 11936 12.9 13 9 Op 1 2/0.143 - CDS 11981 - 13357 1205 ## COG0534 Na+-driven multidrug efflux pump 14 9 Op 2 . - CDS 13354 - 13977 499 ## COG1309 Transcriptional regulator - Prom 14017 - 14076 4.5 - Term 13986 - 14051 9.7 15 10 Tu 1 . - CDS 14247 - 17123 1980 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain - Prom 17190 - 17249 5.7 + Prom 17367 - 17426 9.2 16 11 Tu 1 . + CDS 17477 - 18424 250 ## COG0583 Transcriptional regulator + Term 18592 - 18638 -0.5 17 12 Tu 1 . - CDS 18428 - 19708 697 ## COG0471 Di- and tricarboxylate transporters - Prom 19741 - 19800 5.1 - Term 19783 - 19831 10.2 18 13 Op 1 . - CDS 19848 - 20879 755 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 19 13 Op 2 2/0.143 - CDS 20916 - 22334 1182 ## COG0471 Di- and tricarboxylate transporters 20 13 Op 3 . - CDS 22352 - 23053 536 ## COG0684 Demethylmenaquinone methyltransferase 21 13 Op 4 . - CDS 23071 - 23640 329 ## COG3481 Predicted HD-superfamily hydrolase - Prom 23692 - 23751 11.3 - Term 24084 - 24119 4.0 22 14 Tu 1 . - CDS 24246 - 26294 775 ## COG0642 Signal transduction histidine kinase - Prom 26400 - 26459 9.8 23 15 Tu 1 . - CDS 26607 - 28322 543 ## COG0840 Methyl-accepting chemotaxis protein - Prom 28421 - 28480 7.7 24 16 Tu 1 . - CDS 29498 - 29662 77 ## - Prom 29745 - 29804 8.2 + Prom 29555 - 29614 1.9 25 17 Tu 1 . + CDS 29649 - 30020 105 ## gi|160938609|ref|ZP_02085961.1| hypothetical protein CLOBOL_03504 + Term 30104 - 30150 2.5 26 18 Tu 1 . - CDS 30303 - 31910 808 ## Closa_0537 hypothetical protein - Term 31927 - 31964 6.2 27 19 Op 1 . - CDS 31984 - 32514 169 ## gi|160938612|ref|ZP_02085964.1| hypothetical protein CLOBOL_03507 28 19 Op 2 1/0.286 - CDS 32572 - 36966 3625 ## COG5263 FOG: Glucan-binding domain (YG repeat) 29 19 Op 3 . 
- CDS 36977 - 43195 5754 ## COG5263 FOG: Glucan-binding domain (YG repeat) 30 19 Op 4 . - CDS 43217 - 44857 1191 ## Closa_3788 cell wall binding repeat-containing protein 31 19 Op 5 . - CDS 44923 - 45723 464 ## Closa_0878 3D domain protein 32 19 Op 6 . - CDS 45808 - 46539 209 ## Pjdr2_5011 transglutaminase domain protein - Prom 46741 - 46800 7.3 33 20 Tu 1 . - CDS 46867 - 47682 456 ## COG0582 Integrase - Prom 47908 - 47967 6.9 - Term 48256 - 48296 5.4 34 21 Op 1 . - CDS 48364 - 51315 1507 ## gi|160938621|ref|ZP_02085973.1| hypothetical protein CLOBOL_03516 35 21 Op 2 . - CDS 50523 - 50948 188 ## PROTEIN SUPPORTED gi|167042352|gb|ABZ07080.1| putative ribosomal protein L31e - Term 51521 - 51559 -0.1 36 22 Tu 1 . - CDS 51643 - 52932 671 ## bpr_I0590 hypothetical protein - Prom 52959 - 53018 9.3 - Term 53031 - 53066 -0.8 37 23 Tu 1 . - CDS 53243 - 53758 201 ## COG3196 Uncharacterized protein conserved in bacteria 38 24 Tu 1 . - CDS 53817 - 53972 99 ## gi|160938627|ref|ZP_02085979.1| hypothetical protein CLOBOL_03522 - Term 53974 - 54036 -0.1 39 25 Op 1 11/0.000 - CDS 54076 - 55890 1822 ## COG0526 Thiol-disulfide isomerase and thioredoxins 40 25 Op 2 . - CDS 55924 - 57096 1449 ## COG0492 Thioredoxin reductase 41 25 Op 3 . - CDS 57169 - 58629 1191 ## COG1757 Na+/H+ antiporter - Prom 58659 - 58718 6.8 42 26 Tu 1 . - CDS 59195 - 59668 218 ## CD1766 putative lipoprotein - Prom 59697 - 59756 7.8 - Term 59823 - 59864 9.1 43 27 Op 1 . - CDS 59871 - 61157 1545 ## COG0151 Phosphoribosylamine-glycine ligase - Term 61172 - 61201 0.0 44 27 Op 2 21/0.000 - CDS 61218 - 61808 613 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN 45 27 Op 3 . - CDS 61802 - 62836 835 ## PROTEIN SUPPORTED gi|149378138|ref|ZP_01895857.1| Ribosomal protein S7 - Prom 62882 - 62941 2.0 46 28 Tu 1 . - CDS 62949 - 63464 730 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase - Prom 63555 - 63614 6.3 47 29 Tu 1 . 
- CDS 64001 - 64864 771 ## gi|160938640|ref|ZP_02085992.1| hypothetical protein CLOBOL_03535 - Prom 65057 - 65116 6.9 + Prom 65254 - 65313 5.5 48 30 Tu 1 . + CDS 65501 - 67045 995 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 67087 - 67143 14.5 - Term 67074 - 67130 14.2 49 31 Op 1 . - CDS 67232 - 68416 927 ## gi|160938644|ref|ZP_02085996.1| hypothetical protein CLOBOL_03539 50 31 Op 2 . - CDS 68413 - 69147 616 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 51 31 Op 3 . - CDS 69234 - 69359 83 ## gi|160938646|ref|ZP_02085998.1| hypothetical protein CLOBOL_03541 52 31 Op 4 . - CDS 69375 - 71594 2232 ## Closa_1605 hypothetical protein 53 31 Op 5 . - CDS 71611 - 71934 443 ## Closa_1604 hypothetical protein 54 31 Op 6 . - CDS 71931 - 72404 348 ## COG4767 Glycopeptide antibiotics resistance protein 55 31 Op 7 . - CDS 72420 - 72548 220 ## gi|160938650|ref|ZP_02086002.1| hypothetical protein CLOBOL_03545 56 31 Op 8 1/0.286 - CDS 72599 - 73276 938 ## COG0461 Orotate phosphoribosyltransferase 57 32 Op 1 13/0.000 - CDS 73405 - 74307 1137 ## COG0167 Dihydroorotate dehydrogenase 58 32 Op 2 1/0.286 - CDS 74344 - 75138 869 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 59 32 Op 3 . - CDS 75158 - 76105 1038 ## COG0284 Orotidine-5'-phosphate decarboxylase 60 32 Op 4 . - CDS 76056 - 76214 98 ## gi|160936491|ref|ZP_02083859.1| hypothetical protein CLOBOL_01382 - Prom 76285 - 76344 5.1 61 33 Tu 1 . + CDS 76350 - 77516 339 ## COG3547 Transposase and inactivated derivatives + Term 77562 - 77596 1.1 62 34 Op 1 . - CDS 77836 - 78954 1076 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 63 34 Op 2 1/0.286 - CDS 78944 - 80347 1224 ## COG1904 Glucuronate isomerase 64 34 Op 3 . 
- CDS 80337 - 81107 220 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 65 34 Op 4 2/0.143 - CDS 81111 - 82301 1265 ## COG2721 Altronate dehydratase 66 34 Op 5 . - CDS 82342 - 82689 214 ## COG2721 Altronate dehydratase 67 34 Op 6 . - CDS 82709 - 83374 821 ## COG0684 Demethylmenaquinone methyltransferase 68 34 Op 7 3/0.143 - CDS 83391 - 84131 223 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 69 34 Op 8 . - CDS 84131 - 85252 1010 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 70 34 Op 9 38/0.000 - CDS 85263 - 86102 974 ## COG0395 ABC-type sugar transport system, permease component 71 34 Op 10 35/0.000 - CDS 86099 - 87046 899 ## COG1175 ABC-type sugar transport systems, permease components - Term 87074 - 87116 2.5 72 34 Op 11 . - CDS 87132 - 88460 1294 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 88507 - 88566 4.8 - Term 88790 - 88825 2.4 73 35 Tu 1 . - CDS 88835 - 89368 76 ## COG1633 Uncharacterized conserved protein - Prom 89459 - 89518 2.8 74 36 Tu 1 . + CDS 89797 - 91563 1852 ## COG0840 Methyl-accepting chemotaxis protein + Term 91581 - 91631 11.3 75 37 Tu 1 . - CDS 91657 - 92739 1136 ## COG1609 Transcriptional regulators - Prom 92793 - 92852 3.7 76 38 Tu 1 . - CDS 92927 - 94234 853 ## COG4826 Serine protease inhibitor - Prom 94402 - 94461 5.7 + Prom 94357 - 94416 10.2 77 39 Tu 1 . + CDS 94510 - 94953 682 ## COG0071 Molecular chaperone (small heat shock protein) + Term 95001 - 95048 8.5 - Term 94989 - 95036 11.4 78 40 Tu 1 . - CDS 95068 - 96744 1577 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold - Prom 96886 - 96945 3.7 + Prom 96744 - 96803 5.1 79 41 Tu 1 . + CDS 96893 - 98521 1373 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes - Term 98414 - 98456 8.5 80 42 Op 1 . - CDS 98514 - 99905 1562 ## COG0534 Na+-driven multidrug efflux pump 81 42 Op 2 . 
- CDS 99919 - 101715 2131 ## COG1164 Oligoendopeptidase F 82 42 Op 3 23/0.000 - CDS 101787 - 102491 877 ## COG1346 Putative effector of murein hydrolase 83 42 Op 4 . - CDS 102493 - 102855 525 ## COG1380 Putative effector of murein hydrolase LrgA - Prom 102926 - 102985 5.0 + Prom 102906 - 102965 5.2 84 43 Tu 1 . + CDS 103073 - 104074 411 ## Closa_1593 GCN5-related N-acetyltransferase - Term 104017 - 104058 7.5 85 44 Op 1 . - CDS 104108 - 104821 825 ## COG2357 Uncharacterized protein conserved in bacteria 86 44 Op 2 . - CDS 104864 - 105700 1118 ## COG0489 ATPases involved in chromosome partitioning - Prom 105733 - 105792 5.5 87 45 Op 1 . - CDS 105800 - 106957 895 ## Closa_2166 FliB family protein 88 45 Op 2 . - CDS 106968 - 107567 569 ## EUBELI_20100 hypothetical protein - Prom 107641 - 107700 5.7 - Term 107729 - 107777 15.2 89 46 Op 1 11/0.000 - CDS 107835 - 110105 2397 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 90 46 Op 2 2/0.143 - CDS 110108 - 110587 486 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 91 46 Op 3 1/0.286 - CDS 110618 - 111934 1732 ## COG2233 Xanthine/uracil permeases 92 46 Op 4 . - CDS 111987 - 112607 725 ## COG0655 Multimeric flavodoxin WrbA 93 46 Op 5 11/0.000 - CDS 112653 - 114917 2663 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 94 46 Op 6 . - CDS 114907 - 115389 478 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 95 46 Op 7 . - CDS 115411 - 116340 1029 ## COG2355 Zn-dependent dipeptidase, microsomal dipeptidase homolog 96 46 Op 8 . - CDS 116400 - 116699 504 ## Sterm_1552 hypothetical protein 97 47 Op 1 . - CDS 116825 - 118015 1444 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 98 47 Op 2 . - CDS 118045 - 119244 1307 ## COG0167 Dihydroorotate dehydrogenase 99 47 Op 3 . 
- CDS 119265 - 120461 1641 ## COG1171 Threonine dehydratase 100 47 Op 4 15/0.000 - CDS 120528 - 121019 615 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 101 47 Op 5 12/0.000 - CDS 121022 - 121897 1033 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 102 47 Op 6 . - CDS 121914 - 124274 2381 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 103 47 Op 7 1/0.286 - CDS 124265 - 125611 1373 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 104 47 Op 8 4/0.143 - CDS 125615 - 127039 1748 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 105 47 Op 9 2/0.143 - CDS 127133 - 128572 352 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - Term 128589 - 128623 -0.5 106 48 Tu 1 . - CDS 128739 - 130112 1010 ## COG0044 Dihydroorotase and related cyclic amidohydrolases - Prom 130312 - 130371 7.3 - Term 130488 - 130529 7.1 107 49 Tu 1 . - CDS 130640 - 131386 171 ## COG1070 Sugar (pentulose and hexulose) kinases - Prom 131517 - 131576 4.0 108 50 Op 1 . - CDS 132185 - 132991 329 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 109 50 Op 2 11/0.000 - CDS 132993 - 133919 594 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components - Prom 133942 - 134001 1.5 110 50 Op 3 21/0.000 - CDS 134006 - 134950 663 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 111 50 Op 4 16/0.000 - CDS 134964 - 136466 191 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 136506 - 136565 2.9 112 50 Op 5 1/0.286 - CDS 136575 - 137651 766 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 137704 - 137763 13.8 113 51 Tu 1 . 
- CDS 137913 - 138380 268 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 138453 - 138512 4.2 114 52 Op 1 . - CDS 138516 - 139652 580 ## gi|160938716|ref|ZP_02086068.1| hypothetical protein CLOBOL_03611 115 52 Op 2 . - CDS 139649 - 140869 537 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 140923 - 140982 6.3 - Term 140947 - 140991 4.2 116 53 Op 1 . - CDS 141001 - 141543 422 ## gi|160938718|ref|ZP_02086070.1| hypothetical protein CLOBOL_03613 117 53 Op 2 . - CDS 141540 - 142514 649 ## gi|160938720|ref|ZP_02086072.1| hypothetical protein CLOBOL_03615 - Prom 142611 - 142670 8.4 + Prom 142522 - 142581 6.5 118 54 Op 1 . + CDS 142724 - 143692 619 ## COG0524 Sugar kinases, ribokinase family + Prom 143799 - 143858 4.0 119 54 Op 2 . + CDS 143878 - 145278 1196 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 145318 - 145376 13.7 120 55 Op 1 . - CDS 145343 - 146416 572 ## gi|160938723|ref|ZP_02086075.1| hypothetical protein CLOBOL_03618 121 55 Op 2 . - CDS 146413 - 148095 1790 ## gi|160938724|ref|ZP_02086076.1| hypothetical protein CLOBOL_03619 122 55 Op 3 38/0.000 - CDS 148102 - 148944 520 ## COG0395 ABC-type sugar transport system, permease component 123 55 Op 4 . - CDS 148941 - 149711 544 ## COG1175 ABC-type sugar transport systems, permease components 124 56 Tu 1 . - CDS 149815 - 152076 1804 ## Clole_2648 extracellular solute-binding protein family 1 - Prom 152108 - 152167 5.1 + Prom 152143 - 152202 9.6 125 57 Op 1 40/0.000 + CDS 152230 - 152916 795 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 126 57 Op 2 . + CDS 152909 - 154249 736 ## COG0642 Signal transduction histidine kinase - Term 154123 - 154151 2.3 127 58 Tu 1 . - CDS 154231 - 155295 958 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 155332 - 155391 2.2 + Prom 155173 - 155232 4.3 128 59 Tu 1 . 
+ CDS 155294 - 156082 319 ## gi|160938730|ref|ZP_02086082.1| hypothetical protein CLOBOL_03625 - Term 155837 - 155888 9.1 129 60 Op 1 . - CDS 156033 - 157877 1427 ## COG0366 Glycosidases 130 60 Op 2 1/0.286 - CDS 157883 - 160084 1866 ## COG3345 Alpha-galactosidase 131 60 Op 3 38/0.000 - CDS 160139 - 160972 994 ## COG0395 ABC-type sugar transport system, permease component 132 60 Op 4 35/0.000 - CDS 160987 - 161868 967 ## COG1175 ABC-type sugar transport systems, permease components 133 60 Op 5 . - CDS 161893 - 163194 1262 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 163322 - 163381 6.8 + Prom 163336 - 163395 5.9 134 61 Tu 1 . + CDS 163416 - 164273 855 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 164246 - 164282 2.0 135 62 Tu 1 . - CDS 164330 - 167026 2902 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain - Prom 167068 - 167127 8.9 + Prom 167117 - 167176 3.6 136 63 Tu 1 . + CDS 167201 - 168448 1300 ## COG2357 Uncharacterized protein conserved in bacteria + Term 168653 - 168717 7.3 137 64 Tu 1 . - CDS 168549 - 169562 801 ## COG4129 Predicted membrane protein - Term 169925 - 169951 -0.7 138 65 Op 1 . - CDS 169981 - 171672 1950 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 139 65 Op 2 . - CDS 171669 - 172091 479 ## CLL_A0378 hypothetical protein - Prom 172123 - 172182 6.2 - Term 172345 - 172380 1.4 140 66 Op 1 4/0.143 - CDS 172390 - 173862 1602 ## COG1070 Sugar (pentulose and hexulose) kinases 141 66 Op 2 . - CDS 173883 - 175367 1921 ## COG2407 L-fucose isomerase and related proteins 142 66 Op 3 . - CDS 175437 - 176060 682 ## Closa_1357 hypothetical protein - Prom 176121 - 176180 6.7 - Term 176130 - 176190 4.1 143 67 Tu 1 . - CDS 176196 - 177128 877 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 177183 - 177242 2.3 144 68 Tu 1 . 
- CDS 177245 - 178495 484 ## CLJU_c18950 putative collagen triple helix repeat-containing protein - Prom 178519 - 178578 2.7 145 69 Op 1 1/0.286 - CDS 178633 - 179460 649 ## COG1180 Pyruvate-formate lyase-activating enzyme 146 69 Op 2 . - CDS 179457 - 180863 1161 ## COG3885 Uncharacterized conserved protein 147 69 Op 3 . - CDS 180952 - 182139 1098 ## COG0471 Di- and tricarboxylate transporters - Prom 182175 - 182234 7.0 148 70 Op 1 1/0.286 - CDS 182325 - 184079 1684 ## COG1001 Adenine deaminase 149 70 Op 2 . - CDS 184106 - 185440 1253 ## COG2252 Permeases - Prom 185462 - 185521 4.6 - Term 185583 - 185633 5.4 150 71 Tu 1 . - CDS 185668 - 186420 731 ## COG2188 Transcriptional regulators - Prom 186535 - 186594 5.0 - Term 186575 - 186621 8.1 151 72 Op 1 . - CDS 186674 - 189118 1826 ## COG3875 Uncharacterized conserved protein 152 72 Op 2 30/0.000 - CDS 189108 - 189623 685 ## COG0066 3-isopropylmalate dehydratase small subunit 153 72 Op 3 . - CDS 189628 - 190926 1276 ## COG0065 3-isopropylmalate dehydratase large subunit 154 72 Op 4 1/0.286 - CDS 190965 - 192059 982 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily - Term 192068 - 192118 8.5 155 73 Op 1 14/0.000 - CDS 192139 - 193557 1598 ## COG1653 ABC-type sugar transport system, periplasmic component 156 73 Op 2 38/0.000 - CDS 193593 - 194441 783 ## COG0395 ABC-type sugar transport system, permease component 157 73 Op 3 . - CDS 194446 - 195342 936 ## COG1175 ABC-type sugar transport systems, permease components 158 73 Op 4 . - CDS 195360 - 196760 1297 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases - Prom 196899 - 196958 5.6 + Prom 196880 - 196939 6.6 159 74 Tu 1 . + CDS 197039 - 197698 663 ## COG1802 Transcriptional regulators + Term 197735 - 197775 11.5 - Term 197720 - 197766 12.9 160 75 Op 1 . - CDS 197807 - 198034 350 ## Calhy_1445 phosphotransferase system, phosphocarrier protein HPr - Prom 198155 - 198214 4.6 161 75 Op 2 . 
- CDS 198245 - 199231 1199 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 199300 - 199359 4.8 + Prom 199317 - 199376 4.1 162 76 Op 1 7/0.048 + CDS 199435 - 200991 1522 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 163 76 Op 2 . + CDS 200988 - 202865 1670 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Term 202897 - 202938 7.1 164 77 Op 1 . - CDS 202852 - 204129 1095 ## Closa_0526 hypothetical protein 165 77 Op 2 . - CDS 204206 - 205366 1360 ## Closa_0525 hypothetical protein 166 77 Op 3 27/0.000 - CDS 205385 - 208390 3394 ## COG0841 Cation/multidrug efflux pump 167 77 Op 4 1/0.286 - CDS 208394 - 209671 1238 ## COG0845 Membrane-fusion protein - Prom 209787 - 209846 4.0 - Term 209825 - 209893 28.0 168 78 Tu 1 . - CDS 209932 - 210468 303 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 210577 - 210636 9.3 + Prom 210536 - 210595 6.7 169 79 Tu 1 . + CDS 210667 - 211566 610 ## COG0583 Transcriptional regulator + Prom 211609 - 211668 10.6 170 80 Op 1 40/0.000 + CDS 211722 - 212396 873 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 171 80 Op 2 10/0.000 + CDS 212393 - 213460 1044 ## COG0642 Signal transduction histidine kinase + Prom 213887 - 213946 3.2 172 80 Op 3 . + CDS 214048 - 215250 1078 ## COG0477 Permeases of the major facilitator superfamily + Term 215415 - 215461 -0.3 173 81 Op 1 40/0.000 - CDS 215236 - 216570 1330 ## COG0642 Signal transduction histidine kinase 174 81 Op 2 . 
- CDS 216539 - 216676 107 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain Predicted protein(s) >gi|157101635|gb|DS480689.1| GENE 1 3 - 843 526 280 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 10 272 4 262 265 128 30.0 1e-29 MEMEMLLMKFESYLCDEEKSKSTINKYLHDVREMLDFMGQECISKELLIQYRGHLSRRCR ARTVNGKLSAINAYLKCMGLEAHKVKFLKVQKRIYMDEKRDLTEQDCRRLLETASRTGKT QLYFLMLVLYGTGIRISELPYVTVEAVYQGNAEINMKGKYRVIIFPRNLVRQLKEYIRDA DIKSGCIFRTKSGKCLDRSNICHSMKKICREARVDQSKVFPHNFRHLFAKSFYAIEKNLA HLADILGHTSIETTRIYVASSLKHYEKVMNRMGIAKDIKI >gi|157101635|gb|DS480689.1| GENE 2 1235 - 1429 76 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDGPDDAVDCVTKPYIGVVYTEAPKQVTDFFFVVHVNPSLVFVTNVTLYAEAWFDKGDWT TKLI >gi|157101635|gb|DS480689.1| GENE 3 1325 - 1705 247 126 aa, chain + ## HITS:1 COG:no KEGG:Tmar_1157 NR:ns ## KEGG: Tmar_1157 # Name: not_defined # Def: response regulator receiver protein # Organism: T.marianensis # Pathway: Two-component system [PATH:tmr02020] # 2 109 157 267 270 65 34.0 8e-10 MNHEKEISNLLRRLGVNNSYVGFRYTIYGVIRTVHDPTLLAYISKGLYVDIAVHYQTSIG CVERNIRTLINTIWLHGDRKLLNEIFDFELSQKPRNGAFIDALVNYIVMRGISEEITDSA CQQQAT >gi|157101635|gb|DS480689.1| GENE 4 1857 - 2825 415 322 aa, chain - ## HITS:1 COG:ECs5354 KEGG:ns NR:ns ## COG: ECs5354 COG2207 # Protein_GI_number: 15834608 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Escherichia coli O157:H7 # 24 160 19 135 289 74 31.0 3e-13 MYEWKKQIQIIVDEIDRCIKCYNDEALSLRALSCRLGYSEYYTSRKFKEISGIQFRNYLQ LRKLAFALKEVRDSQKNLLEVAFDYGFSSHEAFTRAFKETYGVTPSEYRKNPKPVVLRTK ISAFDRYFLGLGEIGMIQSADGVKIYFVTIPAHKFLYIKNRESNGYWDFWQKQNLIPGQD YETVCGLLDSIKGKLDDGGGSEVNCGSGQIMAYMNDPDGRLCDWGILRSECWGVRLPPEY KGAVPPQMLMDDIPEAEYIVFEHGPFDYEQENRSVEERIENAMAAFDYADTGYCLDTSPG RLMYFYYNPEQYFKYIRPVRRV 
>gi|157101635|gb|DS480689.1| GENE 5 3225 - 3428 269 67 aa, chain + ## HITS:1 COG:SA2494 KEGG:ns NR:ns ## COG: SA2494 COG1278 # Protein_GI_number: 15928289 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Staphylococcus aureus N315 # 1 64 1 63 66 83 73.0 8e-17 MNKGTVKWFNSQKGFGFITNEENGEDIFVHFSGIATDGFKSLEDGQSVTFDITNGNRGLQ AVNVCAA >gi|157101635|gb|DS480689.1| GENE 6 3641 - 4975 1156 444 aa, chain - ## HITS:1 COG:BH1762 KEGG:ns NR:ns ## COG: BH1762 COG3048 # Protein_GI_number: 15614325 # Func_class: E Amino acid transport and metabolism # Function: D-serine dehydratase # Organism: Bacillus halodurans # 10 434 8 434 442 479 54.0 1e-135 MDSRIEAYMKQIPVLAEAAAGKEVFWVNSRECPESGQGEKAVSKEQIQDAARRLERFAPY LAEVFPETAARNGIIESELREIRAMRQELNQEWGAGITGKLFLKMDSHLPISGSVKARGG IYEVLKHAEDLAFEAGLLNENEDYRKLNTSEMREFFGRYKVQVGSTGNLGLSIGIMSAKL GFQVTVHMSADAKQWKKDLLRSKGVTVIEYESDYCAAVEEGRKSSDQDPMSYFVDDENSV NLFLGYAVAGQRLKHQLEEQGIQVDRQHPLFVYLPCGIGGAPGGVAYGLKEIYGDCVHCF FMEPTQACCMLIGMATGEHDKVCVGDFGISGKTDADGLAVGRPSAFVGKVIEKRLAGICT IEDGKLYELMRALVKTEDIFIEPSACASFAAFLHADEMDNYIREKGLSGCMGQAVHIAWA TGGNMVPPEMVAEYLNIRTQSASL >gi|157101635|gb|DS480689.1| GENE 7 5021 - 6364 790 447 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 3 444 2 442 456 308 37 9e-83 MKTAEAVLGNLAGLLWGNWMLFVLLGLGVFYTIATGGIQFRSIPWIIKELCPKKEKTEGT RDHKGTVSSIQALYMAIASCVGSGNIVGVATALIGGGPGALFWMWAAAFLGMATKYAEIV LGLVFRQKGADGSYYGGPMYYIEKGLHARLLGVAAAVLLFFQNAGGTLIQSNTISDVAFQ VFGVPKAVTGILMAAVMVFIIGGGLKRLADVAQKIVPVMASLYIVGGIVVVFTNLDSILP MFRSILQDACSMKAGMGAAAGLTMKEAMRFGVARGLYSNEAGEGSAAVIHSAAQVDHPAR QGFYGVVEVFVDTMVICSTTGFSILASGVPLEGASAASLAAEAFGTVAPLFKYVVSVSLI LFASTSIMSQWYFGHVSLMYIKSLKGDRFYRILFPVLILLGSLSTAGIVWSVQDCMLGLL IIPNVLALLKLSPVVMRMTREFFRENQ >gi|157101635|gb|DS480689.1| GENE 8 6364 - 7674 885 436 aa, chain - ## HITS:1 COG:lin1457 KEGG:ns NR:ns ## COG: lin1457 COG1524 # Protein_GI_number: 16800525 # Func_class: R General function prediction only # 
Function: Uncharacterized proteins of the AP superfamily # Organism: Listeria innocua # 49 431 40 414 422 144 25.0 3e-34 MERNKVSVLLVSVDALKPEFVFEQERLGISLPNISKYFVENGTFAGGGVQSVFPTFTYPC HQSIITGTNPATHGIYNNGIFDPMGEHMGAWHWFASKKAENLWEAAGKHGYLSASVAFPT SVAAGGDFMAPEFWWDGTELDSRFIDAMSVPQGMILEMEKEIGRYAGGLDLTEAGDRQRY RAAMWVLDHKLAPKAGKCPFFMTAYFASFDEMAHQYGVYSKEAAHAIEAIDKMLGDLIEK VHSMTDGQAVVCVVSDHGTLDNRYNIRPNVKLREAGLITTDGNGRVTDWRAWSQRAGGMS EIRLKNGQDEDARRKLDQIMKELSSDPVFGILEVLDRAQAIERGGFPLADYVLVSRKGYE IRDDAEGDYCTETLHQKAQHGYCERFEEMRASFMIEGCGIERKCDIGSMNLIDIAPTLAS VMGFDMPQAEGRNRLK >gi|157101635|gb|DS480689.1| GENE 9 7689 - 8480 843 263 aa, chain - ## HITS:1 COG:CAC0231 KEGG:ns NR:ns ## COG: CAC0231 COG1349 # Protein_GI_number: 15893523 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Clostridium acetobutylicum # 1 258 1 251 254 130 33.0 2e-30 MLAEQRYQKIINLMQTDGSVKVADLKKKMGVSPETVRRDLENMEAQGLIRRTRGGAFLAD KEEMPGERTIQYQNFSARERENLESKVEIAEFAVRFIKEGQSIALDSGTTAYELARVIKR TFRSLTVVTNSIAILNELADAKGITLIATGGVYRPEEMAFVSDIAGMIFSKLSINTFFLT TCGISVDRGITYQRMDEIIVQEKMMEASDQTIVIADSTKLGVNSLVKMCDIDRVGMIITD SEASKEQIRPFEQAGISVEKPEN >gi|157101635|gb|DS480689.1| GENE 10 8746 - 10023 1215 425 aa, chain + ## HITS:1 COG:yihN KEGG:ns NR:ns ## COG: yihN COG0477 # Protein_GI_number: 16131714 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 11 421 5 413 421 385 50.0 1e-106 MKTIADNRQSKKWISFGTLVLGAGTIYKLGCLKDAFYVPMQEFMGLTHTQIGTAMSVAGL ISTFGFLISIYLTDRVSKKVMIPMSLMGIGLLGIGLSTFPPYWLFLLINCGLAVCADMLY WPTMLKTVRLLGSEEEQGRMFGIMEAGRGLVDTVIAFGALGVFALLGSNAASLKAAILFY SAVPMVIGIIAYFLLEPDEIKAVNAAGEKVSRNKAAMDNVILALKNKNIWLVSFNVFSVY CVYCGLTYFIPFLKDLYAIPAVLVGAYGIINQYALKMLGGPVGGYLTDKVFHSATKYLRV 
VFVITGTALTVFAFLPHSHMNVYFGMLMTLSIGACVFSMRAIFFAPMDEINVPREITGSA MSLGSFIGYLPGAFLYSIYGSILDHNPGLSGYRIVFLIMAVSAGAGFLLSSYILSVIGKD KAINN >gi|157101635|gb|DS480689.1| GENE 11 10013 - 11278 999 421 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1046 NR:ns ## KEGG: Cphy_1046 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 296 1 296 298 189 38.0 2e-46 MRQIEDARKRYEEIPIPAELSGRVKAAIVRSEEKRKSQRQRIPTVRIRRKHRIWGTCAGM AAALVLVFTTALNTNTAFAEMAAGLPVIGAAARILTFRSYERDEGDWKISVEIPSVEMIA KDTNGLDSALNQEIYDLCSQYADEAVERAREYKKAFMETGGTEEEWEAHDINIHVGYEIK AQTEAYLSFAVQGTENWSSAYSETKYYNIDLKDNKMVTLADVLGHDYARIADESILRQME NMEEKEGIAFWSKSEGGFTGVTDKTVFYMNEKENPVIVFDKYEIAPGAYGKLEFEIDKTG NMAEKTGYTDNFMVPEEEITAFAGKVKEAVADRDMDALASLASYPLYVGFKDGGVSAGSP EELAALGTDRIFTPELAAAVEAAGDETLSPSMAGFVLKKDGTPNIVFGVSEGVLAVKGIN Y >gi|157101635|gb|DS480689.1| GENE 12 11275 - 11766 639 163 aa, chain - ## HITS:1 COG:BS_sigV KEGG:ns NR:ns ## COG: BS_sigV COG1595 # Protein_GI_number: 16079766 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 9 163 11 165 166 115 43.0 4e-26 MKEDLYNRIVDYIVSNQEKFYRLAYSYVHNQEDAMDIVQNAVCKALDHYEAIRNENAVKT WFYRILVNESLLLLRKQKMEIPTEEGMGQEIPYYEEGYNHEYDVYEQLNRLEEDVQTIIK LRYFEELTLKEIAYVTRTNLNTVKAKLYRGLKLLKQNIQEADL >gi|157101635|gb|DS480689.1| GENE 13 11981 - 13357 1205 458 aa, chain - ## HITS:1 COG:MA1121 KEGG:ns NR:ns ## COG: MA1121 COG0534 # Protein_GI_number: 20089987 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 1 434 1 437 475 193 28.0 5e-49 MKKENARQESILGDMKPSRAITRLAVPATLALLAKAVYNIVDTAYIGMLGSDIALAAVGV TLPLLLIMVSVENIFAAGAAVLAGRQLGAGDKMGANRTVTSVVGLSVFIGMFLCAAGIIF MEPLLRAFGASEAVLPQAGDYAFWMFVAALANLPAQSMNCAARAESSVKISSIAVVTGAA LNVVLDPVFMFDWGFAMGVEGASLATTVSQFVTFFILGWFYLSGRSIIKIKREYFKPQWT LIKSVTLIGIPTAVIQICLAAATSLTNIAAKSMTDADLIIAAYGVVQRLVLIGCYVVMGF 
MQGYQPVAAYSFGANREERFHESVRFALRTSLILTVLVAGTYILLARPLIMLFNRNPAVI DYGVRLLISQVALYPAFGLCYMMTITYQTIGSSRYGLFLSVIRQGLFYVPFILILPGIMG VTGIYLAQPAADILTMAVCLYSVKPMKRMASEHMAAFR >gi|157101635|gb|DS480689.1| GENE 14 13354 - 13977 499 207 aa, chain - ## HITS:1 COG:lin2076 KEGG:ns NR:ns ## COG: lin2076 COG1309 # Protein_GI_number: 16801142 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 9 170 3 163 206 65 25.0 7e-11 MKREEKNQQTKHRIMESALKEFAEQGYGASSVNNICSCEGVSKGIIYHYFQTKDELYLAC VDGCFRALTEYLREKLILEGETVHRQLQLYFDARFEFFQAFPMYRRIFCEAVIMPPAHLE SAVRQRKSGFDEFNISILNRILEPLKLRGGVSRETVVDIFLQFQDYINAKYQSSGEKEID FKTHEEDCRRALDVLLHGVVERKDVDI >gi|157101635|gb|DS480689.1| GENE 15 14247 - 17123 1980 958 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 531 947 334 749 776 210 34.0 9e-54 MQTKVKGMNGILMIAVLLIVFLFPLDVSAAETASAALVGYYEDGDYMYHNAQGEYEGYNF EFLQEVSKLSGLSYEVVDSPSWQEAFQLLIDGRIDILPAVYRTEGRMDQMLFTDESMCTI YTTLNVRMDDNRYNYEDFDAFQGMRVGIIKGGEDGESFKRYCAQNGVALTIIEYDETQEL LDALGSGTLDGVAITHLGRSSVFRSVARFAPTPMYIAVSKQRPDLLAQINRSINDILLRN PDYRTDLYDKYLSPSSNQVPVLTKEEVEYIEAADEPIHISYDPSFAPFSYKDAEGELNGI MADIFARIAEKSGLDFQFEACPAATALRAVKLGETDAVSVVDGDYLWDERNHMNTTLRFL NTPSAMITQAERTEIEVLALPEGYQLSEHVAQNNPEKEIQYYDSIQACFDAVLDGKAQAT YTNTQTANYIISAPKYEKLHVTALTQYPNDLCIGVSKSADPRLFSILDKYIQYMSNEEID TLLLNNSVSVRPITMEAFVHQNIWLITGLVAAVSGSIILLLCINLFNISRSKRRIQDLLY RDELTGLDNINRFYVRAEELLATGKYAVVYCDIDRFKLINDTFGFEEGDEVLRAFGSILQ KSMEDRECCARLSADNFVMLRHYKQWETLAADLMHIQAVLNKWRGERGIIPYEIAVSFGV YQADAGETNDMKQMLDFANYAMRSAKTAAGGSCFLYDEQMRNKALFEQGLEGRLASAMEQ GEFEAYYQPKVDMDTGRIVGCEALVRWNHPEQGLLMPGSFIPFFEKKGLIVRVDLHMFEQ VCRTVRRWLDEGRPAVTVSCNFSQMHFGHDGFAGQVSEIADRFQVPHHLLEVEITESAIA DAPESVSSALTELKMRGFQIAIDDFGSGYSSLGQLQRLRADVLKLDRSFVCAGLQGPREQ 
IVIENLVNMASELGMEVVCEGVETQVQVKVLQDIGCHIAQGYYYYRPMQTAAFEQLLG >gi|157101635|gb|DS480689.1| GENE 16 17477 - 18424 250 315 aa, chain + ## HITS:1 COG:BS_ywbI KEGG:ns NR:ns ## COG: BS_ywbI COG0583 # Protein_GI_number: 16080882 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 1 307 1 292 301 111 25.0 2e-24 MDLTSLRYFVALARNLHFTKTAQELYISQQNLSQHINKLETHYRTPLFYRKPKLQLTPAG KALYDSCIKILTEEDNISRQMNDIIQNNVGSLSIGASQYRGQYWLPLILPDYLKCWPNIN VQVTMERSEKMEQLLVSGDLDFFVGIKHGDDSLLRFIPLMNDKVYMAITRELMEQYFGKE TDTIITKCSNGTTLKEFKEVPFMRLKSTYRLRMLTDKCFQAVGYKPRQILEVESIDLMIS LFPYSLGAFFCTGMRVPSLKTAHPNTYFFPVQLNGKTLLNPLGIVYHKDRFLPRYAQDFM ERLKSFFASFENSSC >gi|157101635|gb|DS480689.1| GENE 17 18428 - 19708 697 426 aa, chain - ## HITS:1 COG:STM3356 KEGG:ns NR:ns ## COG: STM3356 COG0471 # Protein_GI_number: 16766651 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Salmonella typhimurium LT2 # 1 426 1 421 422 201 30.0 2e-51 MGDIQVTLLIGFFMLISYIWAKIPYGMTAVFCCLALEITGVLSSAEAWAGFGNTTVFLFA SIFILGAGLMKTSFIIKIQKLLQKIGSSEKKMRWICMYIASFLAMLTSATAAGATMLPMV SLLSRKGNFQKRRLLKPVMDAACMGFAIMPFGMGAAFMEQGNSYLQMMGSSVRMSLFDSA IARLPVLIVTISFIGLFGYKFISNQKENGDKNQEEIILNENCNLSEFKDKMGIGIFFGSI AAMIISSLVGIPSHYAPLIGALLMIVSGVLSGKEAFGAIDLNTVCIFAGSLGFANALTKT GIADWIVEKLVFATNSSQSEYVLVAIFLLVPFAMTQFMGNIPVINLTMPIAATVACTTGI NPTAIMVATVAGATLSISTPMAAGIQALIMEAGEYRFSDYIKAGLPTAAVFAAAYIIWAP FVFPLY >gi|157101635|gb|DS480689.1| GENE 18 19848 - 20879 755 343 aa, chain - ## HITS:1 COG:MTH970 KEGG:ns NR:ns ## COG: MTH970 COG0111 # Protein_GI_number: 15678988 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Methanothermobacter thermautotrophicus # 33 323 36 321 525 189 41.0 6e-48 MGWKILLPQELKKDARDLLESHGHTLIDGRGFETEDVIADMKEHKPDAIIVRITKMPRAV FEAAAPNLKVLVRHGAGYDAVDLKAAADYGVKCLYAPVANSTSVAETALMLMLYMSRNVT 
RLRDMLTVDFYDAKLKTYMTTLNGKTVGLIGCGNIGSRTAKRCLAMEMNVLCYDPYKPAC DFPEGVEVVRDIERIFRESDYVSLHTPNTSVTRNMINKETLSMMKPTAFLINTARGAVVN EQDLYEACKNHVIAGAALDAVVHEPILPDMPLITLDNVLITPHVGGNTVEAAMRASYMAA MGIVEMYEGKEPTWPIPDIDYETAPVYTDTIKPDRKPVGMYDY >gi|157101635|gb|DS480689.1| GENE 19 20916 - 22334 1182 472 aa, chain - ## HITS:1 COG:ybhI KEGG:ns NR:ns ## COG: ybhI COG0471 # Protein_GI_number: 16128738 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Escherichia coli K12 # 2 469 3 471 477 292 37.0 8e-79 MKNRLVKMIPVVLFPLLTWLISAPAGLETETWHMFGLCLSLLCGLILKPFSEPVISLIVL GLGACFVEKPAVLYSGYAAQGVWFLLAVLLACGAFKKTGLGKRLAYILLSKFGGTKFGRS SLGLGYVMMLCDLILSPATGSNTSRSAIVFPIFRGVSESLGSYPDKDPQKLGGYLELLEH VVAMSTAVLFLTGMASNAVIASTIKDIAGVELTWMLWFKAALVPGLIVLFLCPLVVYKLY PPQMKDLGDIKPFVVSKMQEMGAMKKDEKILLALFIMAILGWMLGGKIGISMYVVAFAFL ALELLLGVMDWNDLMAEKGAWNMYIWYGAFFSISGAINEGGFYSWLSGQIQNYLDLSAVN GMAVLVILLLISFVTKYFFVSNAAYIASIYPVILTLAASTNVNIMALSLMLAFFGGYGAL LCNYGNGASIYIFGNGYVSQKDWYLKGTILLAMIFAVFLIIGLPYWKIMGIC >gi|157101635|gb|DS480689.1| GENE 20 22352 - 23053 536 233 aa, chain - ## HITS:1 COG:AGpA472 KEGG:ns NR:ns ## COG: AGpA472 COG0684 # Protein_GI_number: 16119556 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 226 1 223 227 160 41.0 2e-39 MAVGNRVFLKRNLPDNYLVEEFSKLPAANVADCMGRLCGMNPSIRLMSNPSEAIMCGVAL TVKARPGDNMMLHKAMNMSMPGDVIVVSNEGERSRAIMGEIMYNYMQGFKKIAGIVIDGP IRDIDVISKGHLPIYATGTNPAGPYKEGPGEVNVPIACGSITVNPGDIILGDKDGVIVIP RQDAAEILEKARTLSAADKQKAENSAKGTVNRQWVEDLMEKKNVEIIDDVYNC >gi|157101635|gb|DS480689.1| GENE 21 23071 - 23640 329 189 aa, chain - ## HITS:1 COG:MK0390 KEGG:ns NR:ns ## COG: MK0390 COG3481 # Protein_GI_number: 20093828 # Func_class: R General function prediction only # Function: Predicted HD-superfamily hydrolase # Organism: Methanopyrus kandleri AV19 # 37 174 32 176 246 59 31.0 4e-09 
MTKITEEMVRSYFPQLSEIKNQEYAKKAAEIWVEAFENSSWEDIADAQFATRAPGVALVK HTESVTANALMIARNTVEKYGYEIDTDILILAAVLHDVCKLEEMEMGDQGSKTSRKSSLG KIYQHGFLSGYYTQKYGLPNEVTGLVVAHSGQSKVIPRSIEGMILYYADMMDADIHFVQA GTTLCLETH >gi|157101635|gb|DS480689.1| GENE 22 24246 - 26294 775 682 aa, chain - ## HITS:1 COG:VC2453_1 KEGG:ns NR:ns ## COG: VC2453_1 COG0642 # Protein_GI_number: 15642449 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 146 393 264 509 516 195 44.0 3e-49 MDMKRFVSIFLTYDFKGEQNVRSIFQERYDLQFSDIERLYSCYSGPKEDVDTLQAAMNQL IAQQDKAVDFVVGHTENEISNYLALYVYPQYDRVSDCLTTIIDFADDKVCELTETSKHTA VLTVSVSLLLSIGIILLSIYSEYMDKKNLKLLSIQEQELKDALNNAQKAFKVKKEFLSCM SHEIRTPLSIIIGMTKIAATHLDDSNRIEDCLSKITFSSRHLMTIINDVLDMSKIEEGKL SVNHEPFQLQQLLEPLVATIYAQAEEQGLKFECGIKNVPEAVYIGDCMRINQILLNLLSN SIKFTPKGGVVHLEVNPAPVVDGKTHLQFITKDTGIGMSDEYLKRIFKPFEQADSGTSRK YGGTGLGMAITYNLVKLLGGSIQVESKLGEGTTCTVVLPVDVPNEGKHKKWEWEPLKVLV ADPDEDNCNYASMLLNRMGIHADCVSEGHEALQRVFNAHKALQDYDICMIDWKLSNVDGI EVIRCIREKIGPKKPDIIISTYNWAEIEDKARSDGANAFILKPLFESSLYDILVSLTKNI LGQEPEQISQFIPPEFPGRHFLLVEDNILNREIAMEILKVTGAEIDCAENGKEALDIFLA SPGGYYDLILMDIQMPVIDGYEATKQLRASSHPDAERVPVIAMTANAFKEDVEHALAAGM NGHLAKPINVEIFYDTLVSKLR >gi|157101635|gb|DS480689.1| GENE 23 26607 - 28322 543 571 aa, chain - ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 3 558 6 515 555 242 32.0 2e-63 MKNLKIRTRLLLSYVVIIFLCLCASIVALIMLNHIGDNLTIFYDNNYTVTVNVATARRKI QEARGDILRAIVESDQEISKELIENADNSLIMMRETFPIIRQVFRGDKSLIDEVESIQKE AIVYRNQVFDAILANKKDKAFDIMKNGYVPHLNQMADKLQAIADIAGENAKKMVQDGQKA QEMSITLVTVIIILSFILAGTIGLYISNTVRKPIKEIEMAIQKLARGELETAQVKYISKD EVGDLSNNIRYLIGIQKKIIFDIAHIMGDLSKGDFTTCSNALEAYMGNYEGILISMRNLR NNLSDMLQLINQAAFQVAAGGEQISASSQTLAQGAAEQACSVEELANSISVISEQVQETA SNVEKAREQTMQAGGSVNDCMHQMQEMVNAMKDISHKSSEIKKIIKVIEDIAFQTNILAL 
NASVEAARAGAVGKGFSVVASEVRNLAEQSARASKNTAVLIEGSLQAVETGRKIADKTAQ SLEQVAQNTQTVSCTVDDIADAASRQASSLTQVTQGIRQISYVVQTNTATSEESAGASEE LARQAQSLKNLVDGFKLENSEARLVSKSCKT >gi|157101635|gb|DS480689.1| GENE 24 29498 - 29662 77 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSCSIHIVRLDSSMEEIVNDATSRQDNAKSCKLCLQSNKRFLTIGFAKLQLSAI >gi|157101635|gb|DS480689.1| GENE 25 29649 - 30020 105 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938609|ref|ZP_02085961.1| ## NR: gi|160938609|ref|ZP_02085961.1| hypothetical protein CLOBOL_03504 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03504 [Clostridium bolteae ATCC BAA-613] # 1 123 363 485 485 240 100.0 2e-62 MEQLILFEQLDKLALIVPFSLKFMSQSAPYQTIRTFYSELYKQLFWGVPLRGMMGTQEVI NDIYRPYHQSMLNYLKSSDAEGFAAILEEASNLAVQFVSERLAKYGIQETDSKSITKYGK NSI >gi|157101635|gb|DS480689.1| GENE 26 30303 - 31910 808 535 aa, chain - ## HITS:1 COG:no KEGG:Closa_0537 NR:ns ## KEGG: Closa_0537 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 32 523 19 507 514 318 40.0 4e-85 MDSGKKTSIANIKSFYSKWRGMNSGLAGIFAIIIIGVFPLAFRDYYFDILNFKTFFYYKV VIIFAVVFLMVNVIFLILSIYLGCDRSTGRRRKICASDWSMLDFILIAGISTLQSATPAA ALLGNDGRYSGLLIWIAYAFMYFAVRRSLRLRQWYLDLFLGAGMGACIIGILQYFDIDPI GFKNLMDPGQYYMFMSTIGNINTYTSYAAMLAGASVLLFFTEEVRIRKTWYLFCMSISFA ALITGMSDNAYLAIFALIGLLPLYAFKNLRGVRTYIFILAVLSSEFWSLHRMTKPQSGAA GQMGGLYRIITGSHYLILLVAVLWALYFILCWMMKMIPDTHPLMRDGNKGRWAWLAFLSV GIILTGFALYDVNVCGNAGNYGGLAQYLTINDDWGTHRWYIWRIGMESYGKFPFKQKIFG AGPDTYGAVVMENHYDEMVMRYGEFFDSAHNEYLQYLVTVGLAGLVAYLALLVTSIRKMI RLSGKRPYVMAIAFAVICYAAQAAVNISVPIVTPVMLMLLAMGVSGEDEADKEIS >gi|157101635|gb|DS480689.1| GENE 27 31984 - 32514 169 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938612|ref|ZP_02085964.1| ## NR: gi|160938612|ref|ZP_02085964.1| hypothetical protein CLOBOL_03507 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03507 [Clostridium bolteae ATCC BAA-613] # 1 176 1 176 176 346 100.0 4e-94 MLSEERNLQIEMAGTDRACMMLVCINGHENNMPYGIIVNYHLPCPVSFRGIHGLILALDE 
VCEMVGAPMATTEPRFLQKEREQQYRALRGKRQKSSGKVLEKEEVAGNLYPYVTAAKEVI LIHVLFRQNSSMQGRLRCNYTGTHYIAFRSALELMRMLEEANTYMCQKRINKDLKM >gi|157101635|gb|DS480689.1| GENE 28 32572 - 36966 3625 1464 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1357 1460 533 620 621 73 41.0 3e-12 MRKRYNIKSKLAGMLAAAMVFSMAAPGYPAYAVNFGDPVKLKFDPQNGPGVQLSSYSSVA SGGAMEGNLFVADGQAGSPLTSSANFAGLPTDDFGSGARPKVPDFNGITWDGYTFDGWYG TSGNRVETLPYAFPYDSVTTYNAVWKGNETTPYTFRVEHYRDFNTARDEADDSSGWPGDY DDPDLRKFFESTWESTQMANNPISATYRRDIPGYRFKSVLLKNNKVRKYGEASGHGTMTE GGASINASTNAVRGSMPNDYLTAAYRYEPDPTKKFLITVRYVDMEGNQIKAPESRTLPVE SDYVIEPAAINSYLIASAELTDGGITDLEGTGVISAEDAGVSLNTSNYHFTGKMPNQAIT FTYQYELDPSFVTTLRVKRMDNHGNILREEETRAVQPGVPVEVDVDKMSGYVYPPNIVWE DSLSDVSLDQETKKLNLTPNLQGGTVTITYTEELTDETYWAKVEFYNGQNGSFNGNTAPK FVKKSENNTLAQVTEGLMPTPSPHYSFDGWYKATSNGSNKTGQKLADDTAINASIKLFAN FEETEGEWFDLKFVSGPHGTIEGSRTSHVEIGTSWSSLALPSTLPDNFYMFGGWFDENGN RMQNAQTINADQTYTARFIPIGISDDNILAMPDARGTVDNDGSGKVSVSGANENRRYALT DMDYRVVGTKLGSQLKNTGFTGLNPCTSYYVYELNQSANPDIGVILPEHLDPDTFGPPSR VTVPALGTNYDVEDDGAGMKQVVVEPANPETLYAILNMEGDVIDVPGADEDEWVMSSGAS PRAVLGGLESNQLYVVVAKQPGEDTAAADKILLGTQVAVIGNTQEQQDYTIALTNGGFIT KIMRDGAIIDFDDDKTAVVKAGDVISLDAETTNSQGQGFRQWSALIGKLTLADKTRRNPA ITMPEGNLVLQANYYAAAAATPGNASIDYTPKTGAVALDLSDDRMEDLIDELTNNSSDQD AMTAGVDVLYTVKFTQRAPKASESEAVKSEVGNDSVKVPWTLSSVLARQVGGSNKEIPSD VNKEPDIRVYAVLDRSMHGYTDYHLWNLEDLDGEYTCTEVPMEPAPDDMDSGFTGVVSFD ANVESTYVLTYLKAHEVKIIDDKRSMEHIIKVRSDTALEDAEGFMDLDIFEDYTDPITGI VWEYKGLGKTASSGNGYDTGEPVTKNLTLHVLYQTEDDAQWQEARQKLLDQISIAQVLKN NQSVSEEDREELSEAIDIALEVANRIYRPTVDELLEAYDTLKALVDSITSGGNNPDDPDN PNQPGGGGSGGGSGGGSGGGSGGGSGGGSRGGSGGGRSLGPGNTFNDYRTYTIGTEGRWE SIDTDGRQWSFILRSGRIIKDQWINIKYKDARQTCTYHFNAAGIMDYGWYMDAGGHWYYL KEDQGADFGRLVMGWYYDAKDMKWYYLNQFTGGMATGWQKLGEYWYFFSTGSQSGKPMGT LYVNEITPDGYMVDENGRWMRETP >gi|157101635|gb|DS480689.1| GENE 29 36977 - 43195 
5754 2072 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1911 2071 482 620 621 91 35.0 2e-17 MSKELYHIWKRIVSLILTVCLTAGLWSSQMAVVAYASTELPDENLGNGASEGTAYAWITP RNGTGENPTNSTYRFDLCGLQAGQTGYSQSGIKTTYSWQGYATYIQVENGTKYKANGSAN GAVEDISSMGIEFKIALSPSPDNKYIFVDYYVYDKTGRGGTEGRSVKLGTGTDVMIGGSQ EDDYATVYKNNRGFHMVNQHVKTTFDCITNDSSLGVTPPTTRWIGRYSNWGSNVFNEDGG DSLSGQDSGMAYSWHFQLHPYEIVHKRVAFAIRDTSYYVSESGVDSTAADGTYSSPFKTI EYALEKIGNKKGYIYIMDYPDITSPIEVSGSGRDITIASTDYDRNGNPTNENSDYIKTLK RAGSFNEPIFKVSGATLKLTDLVLDGNGSVSSEPLVSAGSGRVEINSGADLRNCHGDNTS KGSAVNITGSAALSMNYGTISGNVSEGKGAVYFDSTGRFNVLNDVMIESNTTPSGAKANV YLAEGRTITVTGDLNTSRIGVTTARQPDASPGGISTLAGQEIKIAVPLSGSGVDTAPSPF ADNFFADQAKTDGTGVYVTVGTKNLTGAGTGNDRNAVIKRNGLQISFTVRDADTGGSIPG VTPIPTVSKASGESVDLAPGPAITGYELAGVVIEQGTPPTLTANLTPGADFGRVTGTMPN QDVAVNYEYRKIGSQIIFNSNGGTPEPETLVGTAGNPVTSLLPTTTRYGYIFKGWSAVND WEHPQFISSLPPVYPETPVTYYAIFEADPSIKFNYTVEHSNASGDIMFASNTLEGAYSVE QPLLEEKRAVRGYSWSQEDSSTTPSEYNFSGTSVPIGQFNGTGTFSGKMPGQDATIRYGY KVRYGDTSARSLFEVLHQTNNGTTVSTAQSELYYPENEISAQPAQVYGYQCTGYRFDLGE EAGELADGLVNGVRGEFDDSFTFNGIMPNQPVRLVYLYESTIEGYGITVKYEDNGTADSS LKDIIPPEVSGPFEADSSVEGDYLEQYGYSLESHTVRPAGTQIQFDDSGWAGIMPNDDAT ITYRHDRIPSKWADITYKPGVNGTLQGGSGVSQDVQALSGGRFKASVLLDDGVSPGHENS YTLETIKEKRLMPVPQPENNYYRFGGWFVDTDGDGVLNNGETLLPQDYRFTAAAVITAHF EENPDAWIHINFEAGSHGSINAGQPLTLRTTFDKKWGDIEASLPEYTPEVNYLVDDWYVQ GELVDDDTDLVNGQTYTIQFYPDPAIFGTEVAEPEPAAGLNGQGKGRITVFGTTQGYKYI LTDLDGKVLAVNKGNILTSRTVFDNLYPGMRYVVYEATGQTGVGTGDMIGDVQGTLSDGA EVLTPVVDTNYKILYDEEDEGKTELIIRPADTASDYAVLDSDGHVVATPQTGAGGWQSVS GNPGSVSFGGLDYNREYTVVARPKGQNTITAESCEENGSIITTDPGGDLELPNYIIQTIY GEVVSAGDIEVGADRYEEAHKGDWVKIKAEPTNDEGRPFLHWKFTIGAVDGMDTKINSRE ASFAMPDTNLVLTAVYERAATPSNAVVVDEVRGGSREELALDPDEVPVLEDELTTDADRE LMDVNHADVTYKVVYRKNNVRASESNAIKMGGDYDTDHEAAYKAAWGLDVSIERYVNGRK VNRASASEASFNTYVQLGKRDVDMMDYQLYEIEEDDDTGIAVSLVPMDYEPEETGGLFTF 
TATEGSRYVMVYNRAYRLYFLNNTAPDIYRYWFKVRKEESPLDSYYETEYAGIEEQLDYF ISPAGAEYSYVGWSYRQDRYREFEPDRSITRKTYIYAYYDDNEKEVDNTRRELEEAIREA IGISDDHFLKLGESKKLKEYIEAALEVLDREEPKATIDQLEEALFELKDKTEPYKELLDN RYDHYDDQQKGGNKGGSKGGGGGGGGSKQAPFAGTAPESYLIGTNGNWVESTGPSGERQL SFVLNGGTSLSGMWARLKYPEGSRSGDSGWYYFDDKGIMQSGWICDKAGKWYYCNTEKES PYGKMVTGWRLDTRDGNWYYLDPLNGVMAQGWRKIDGKWYYFSPVGAGVYAYDPAKQQWI YGGGTGRPLGSMYKNEITPDGYQTDADGAWIQ >gi|157101635|gb|DS480689.1| GENE 30 43217 - 44857 1191 546 aa, chain - ## HITS:1 COG:no KEGG:Closa_3788 NR:ns ## KEGG: Closa_3788 # Name: not_defined # Def: cell wall binding repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 450 1 473 569 178 30.0 5e-43 MMKKRITLIASTLVVTAFALTKLHPVTAYAVEGWMSEDGEWSYLDENDEPLQNTWRQSRD SWFYLGDQGLMLRNCFIDQGNSLYYVFDDGARAENMWVLVEEGDEKGHEAGWYYFGANGK AYRDTTSSFKRSVDGRSYIFDESGRMLTGWFDEDGRPLDEDENPLEEGVYYADKDGALKT ESWLDYSQLELSGLEDLNSDISGRDYDEYDQVWLYFNNRSKKVKSNGDRLIQKNINGSTY GFDEYGIMLPWWTKVASVSNADKSNPTSDVPAKFFAGYDGGSLLRDSWLWMYPSDNLIQD DYDDGEYSWWRTDQNGKIFQNKIKNVNGRYYAFDGLGRMQTGFVLFDTRSTFVAQYDMDA WSSEDFIEGNIYGIEKADLYFFSPDELNDGSMQTGSELKVELDDGVHTFGFKSDGVAYGN RNRLKRVKDSYYINGLRLEADEEYGYGVVRDSEDTYRVVDTNGKTVTGNKKVVKDKDGGW LVIINGRFAARVDDEFKPRWYKGEEGTGFYHYDSDNRENKYAGGLIAGSNTEPLLDGLPE EECLNF >gi|157101635|gb|DS480689.1| GENE 31 44923 - 45723 464 266 aa, chain - ## HITS:1 COG:no KEGG:Closa_0878 NR:ns ## KEGG: Closa_0878 # Name: not_defined # Def: 3D domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 173 264 148 238 240 89 47.0 1e-16 MRKFLRPVIFGCGLIGLVAWILFSQLSTAGAEQMQPDLEGTGTVDNPYAVERLEIISNRE STQSERQTEHQTEFEYLGDWEVAAYSSDESDMTYSGGVPVEHHTAAGILSVFKIGDTVLI DGDTYVIEDKVDEDAKEKLRIYFDSYEEAKRFGRKKCAIYRHSELPEESPYYLGEFQVTG YCSCTICCGEKEERLTKSETVPRASHTVAADPSVIPLGTRIVIDDVVYTVEDTGKAVEGM RLDIFFDSHEEAVRYGRKEKYVYLEG >gi|157101635|gb|DS480689.1| GENE 32 45808 - 46539 209 243 aa, chain - ## HITS:1 COG:no KEGG:Pjdr2_5011 NR:ns ## KEGG: Pjdr2_5011 # Name: not_defined # Def: transglutaminase domain protein # Organism: Paenibacillus # 
Pathway: not_defined # 134 218 141 229 380 74 45.0 3e-12 MNICFAKNLFAMQQTESSAGQVDVAGQLSSQMCSQYPEFECLGEDVLDALEGGRENGSFI KTDIVFDSQEEAFSFGRYYYRYIYLGKEEVTLYSFDENGKFVIYVSCGNPARAVSEHSQV QDRLSEVVQKCSTLGDREKAEYFYDWVYDNVSYDQTLKNRTIYDAVMNGNAVCWGYVSAY LMLCRNAGLICEPVYAGDHAWNRTWLDGEWRYCDITWDKSLGGTRWKFITQKDMDMDSMH NNL >gi|157101635|gb|DS480689.1| GENE 33 46867 - 47682 456 271 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 8 252 19 262 265 103 26.0 4e-22 METVKKYRHYLKQFKEFIGEEPVTKEMVLLWKANLREHITPVTINCVIAALNGFFKYCGW DGCESKFLKISKSSFYPERKELTKDEYKRLVKTAFDNGNERLALVLETICASGIRVSELS FITVAAIHKGQAEIECKGRIRTVLLTRQLCSLLHSYAQKNNISSGMIFVTKNGKPLDRSN IWREMKALGIEANVKAEKIFPHNLRHLFARTYYEIEKNLSKLADILGHRDINTTRIYTKE SGSQHRIQLEKMKLLLYDTTEYFFCCTLGYS >gi|157101635|gb|DS480689.1| GENE 34 48364 - 51315 1507 983 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938621|ref|ZP_02085973.1| ## NR: gi|160938621|ref|ZP_02085973.1| hypothetical protein CLOBOL_03516 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03516 [Clostridium bolteae ATCC BAA-613] # 317 983 1 667 667 1102 99.0 0 MIKKNVMKKHSRHAALLTAIVLAFSNVGTCLNVSLAQEITSEGGSNSTGGSGSDSLGGSD SSGGSDSTGGSSSSGGSHSSSGSHSSSGSDSSDGSGSSDGSGSSGGSDSSGGSDSTGGSD SSSGNGSGEGTGSEGEGGETAGGESGEGTGSEGEGGENAGGESGEGTGSEGEGGETAGGE SGEGTGSEGEGGENAGGESGEGTGSEGEGGENAGGESGEGTGSEGEGGETAGGESGEGTG NKGEGAEDPGVEGEENVNGETGGVAAGGAAAGDLADKEEKEKEFTITYTVNPEGAAEVEG ETLIQEGEELSFTVKPVDGYAIQIVTANGYELAAEAEEDTEEDSGLAGMATYVVTDIEEN LDIEVELLETEETVSDTAAFMAEIDGITISVEPLDGEFDASVKSLSAIAIDDEDTSSVIK GKCAADGYGSVEVRVFDITLEDDNGQPVQPGCNVKVSFSGLDIPESVSNERSVESDKINV FHLEDQENDLKVNDIKCDVNEGEGTVAFETSHFSVFAIAYPDGNEFGHRYVPTSYLSSTS TFEYRNNSEYIVIDTTNSTLNKVQGGYDLTVDAAIAEDAEGPLTFVLSDAFMEAMRKFGA DDGVMPTYKFNLNLNLTNNSKIKYLYEANSLSVQTPESNVDTSFSGMDVTIKGFDGQIID RTYGAYRVQNNAIKELFGKTKISFEDMLTIYDVLYEKGYDGDYPLTDYYKNYYHTDDLST 
VFGNLSMESVGNNGTFDLSGTIDEVAAKYDGTYPYFDFANIFTSKSKLEARLYEVEPELG GLLYDTFYKELFGVTFGERTGYDTPLALFRDHDTSEWQSADASFADTLGDGLSSGESCDF TLGMFFHPRMGNAYQGYNCGYYAAITLVPENTTPPTPPTPPTPPTPPTPPSGGGTSGGGG GGTTPGGHRATPDTNGPGVTIEPEAVPLAPLPEGVTIEPGEVPLAPLPKTGQNPMAQWVC LMSGLMLGLYGFLGRKRGDDNEK >gi|157101635|gb|DS480689.1| GENE 35 50523 - 50948 188 141 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167042352|gb|ABZ07080.1| putative ribosomal protein L31e [uncultured marine crenarchaeote HF4000_ANIW97M7] # 8 134 99 226 236 77 39 6e-13 MVTGAEKEPEAKGKAERPPEARAEKEPEAKERAEKTPEARAEKEPEAKEKAERPPEARAE KEPEAKGKAEKTPEARAEKEPEAKGKAEKTPEARAEKEPEVKGKAERPPEARAEKEPETK GKAQKILESKEKKTSMEKQVV >gi|157101635|gb|DS480689.1| GENE 36 51643 - 52932 671 429 aa, chain - ## HITS:1 COG:no KEGG:bpr_I0590 NR:ns ## KEGG: bpr_I0590 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 15 407 9 402 411 160 27.0 9e-38 MNEINKNNLAVRNTLYARTLGTLEILYNGEPWELPMSVKGKVMQLFLLLIWAGENGISRV KLQDFLYDRRSTDAANALRITTTRLRKILSQSVLPSGEYVVVEKNTYYLKMGRVGGELEL QIDAALMEDYYNRAMETEDEAARQLLLEQACRLYGGEFLPVLSGEVWAESLRSHYQDIYF KGVREACRLMKNHRDHQKIVRLCDAATAAYPLAEWAEQKIGALLALKRYGDALKTYDYVT QTLLDETGSIPPDHVLAYFKQIGSQIEHVNGGLKEIRGNLEEREWSAGSYYCTYPGFVDC FRMVIRSVERGKRRGFLIVCTVRDGKDCPVEDGGKLQEYEDALCEVVHSTLRRGDVYTKY GPDQILILANELREDKCRIVENRIVSGMRRRYGQKVSLHIENVNLKDWCPGTEKGKEDNR REKTVRRRP >gi|157101635|gb|DS480689.1| GENE 37 53243 - 53758 201 171 aa, chain - ## HITS:1 COG:yieJ KEGG:ns NR:ns ## COG: yieJ COG3196 # Protein_GI_number: 16131585 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 7 169 11 193 195 105 36.0 5e-23 MEKKIVFKYHPNVYDNDIIVHENGICQCCGKEVNEYCSTMYCIDDIHCICLECISDGKAA EKLRGDFIQDAESGLVSDPQKTEELFKRTPGYASWQGEYWLACCNDYCAFIGDVGTEELE RMGIADEVFAEYDARDEFDNAREYLEKSGSLAGYLFRCLHCGTYHLWVDAD >gi|157101635|gb|DS480689.1| GENE 38 53817 - 53972 99 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938627|ref|ZP_02085979.1| ## NR: 
gi|160938627|ref|ZP_02085979.1| hypothetical protein CLOBOL_03522 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03522 [Clostridium bolteae ATCC BAA-613] # 17 51 1 35 35 66 100.0 8e-10 MLCGPLIPGGPHIYFSMHSIGRRMGRPAFSCMFNPLFGNGFRYVLLDCYVV >gi|157101635|gb|DS480689.1| GENE 39 54076 - 55890 1822 604 aa, chain - ## HITS:1 COG:MA3212 KEGG:ns NR:ns ## COG: MA3212 COG0526 # Protein_GI_number: 20092028 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Methanosarcina acetivorans str.C2A # 514 601 7 93 93 64 39.0 5e-10 MGQPLVFPHGVTVYNPEKCWSGYTIVPLINDGVLLFDMNGNEVRRWNMHAMPPKLLPGGY VMGLSGYRHPDYGMQDGVNLIEIDYDGNIVWEFDHFENIDDPGRDHRWMARQHHDYQREG NPVGYYVPGMDAKPLSGNTLILVHQTIHNPKISDKKLLDDAMIEVDWDGNILWKWSISEH FDELGFDEAAKNVLFRDPNLRASDGGVGDYLHVNCMSYLGPNKWYDQGDMRFKPDNIIFD CREANILAILDKETGKIVWRIGPDFNAAPELKKLGWIIGQHHFHMIPKGLPGEGNLMVFD NGGWGGYGTPNPSSPIGIKNALRDYSRVLEFNPVTLEVVWKLTPKELGHAIPTDASKFYS PYVSSAQRLPNGNTLVTEGSDGRLIEVTPDHEIVWEWISPYYTHNEKGPRNNMIYRAYRY PYSYVPQEPVPAEVPIERIDNVTYRVPGAGAFGAKTVIDVEGTLPYYQDVALCVATDDEE DLSQREKVFEVDTQVFHPAGAASWKDEVLAVEDKPVLVLFGAERCVHCKALHPVLEEALK EEYDGAYEIRYVDVDANQELVDTYNVGGIPVVVIFRNGKEVLRFNGEMDYDDLCDILDRP LEKA >gi|157101635|gb|DS480689.1| GENE 40 55924 - 57096 1449 390 aa, chain - ## HITS:1 COG:TM0869 KEGG:ns NR:ns ## COG: TM0869 COG0492 # Protein_GI_number: 15643632 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Thioredoxin reductase # Organism: Thermotoga maritima # 5 298 18 308 317 243 43.0 3e-64 MDSVYDLIIVGGGPAGLSAAIYAGRSERRTLLLEKGSYGGRINDTYEIRNYPGTKADSGQ HLMELFREHAAGHPTVELKRTTVTGVRKEDDTFIVETKRRGDFMARSVILDLGTRPRELN IKGEKEFTGHGVAYCATCDAEFFKGKEIYVLGAGDQAIEESDYLTGFASKVTIVVLHEEG HLDCNEVAAEHAYKNPKIDFVWSSTVQEIKGSDHVESLVLKNVDTGRETEVKADGLFLFV GMVPQTGLVKDMVDCDKAGYIKVNEKMETNVPGLYAAGDCTQTFLRQVVTSAADGAVAAV ASERYVKELEQIQEILGPDSGRTVFLFYNPYSNEEIEKTARLEQELSGQWRVYRQDVTRQ 
SLLYNSLKLERTVAGAFYDNGKLVEIKQAF >gi|157101635|gb|DS480689.1| GENE 41 57169 - 58629 1191 486 aa, chain - ## HITS:1 COG:YPO0624 KEGG:ns NR:ns ## COG: YPO0624 COG1757 # Protein_GI_number: 16120950 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Yersinia pestis # 8 466 6 464 483 207 28.0 3e-53 MKTEETKLTFYGNQAFAMLPLVGYILIGGTFSVGLHYYSMKGLSFAAAVSLFAGFFLCRE KDRYWDSVVHGLAQYGNSRLIFVFMVIGIFSKLMNTGGIGDGFIWLSLKLGISRSGFVVF SFLASAMISMGAGAPIAAMFAVIPIFYPPGILLGANPAMLAGALISGIFFGDAISPSSQV INTTVMIQHDGVTGQPADLLKTLRMRTPYILAAGGAAMVVYLIFGGRGGAMGDISQISAM SSPQGLWMLVPLAVLLLVCFRTGNLFMGLSFAIVSGFAVGLLTGVLAPSDIVAIDYGTLG LKGVIFEGISGMLDVVVSTILLYGLIAVAVDGGVMERCCNWLLRRNIMGTTRGSETVLTA GIVITNILLAGCVLPSILMFGGMADRIGQESGISPERRSILLTANATNFSAIIPINSAFV MGSVTIINEMVQKHSYLPTVSPFQIFSAAFYCLFLTGICVIWVVLGERAGKGGHQKAQEI SPVRGS >gi|157101635|gb|DS480689.1| GENE 42 59195 - 59668 218 157 aa, chain - ## HITS:1 COG:no KEGG:CD1766 NR:ns ## KEGG: CD1766 # Name: not_defined # Def: putative lipoprotein # Organism: C.difficile # Pathway: not_defined # 20 157 17 154 154 152 58.0 3e-36 MFSTKKLLFFLVGIISVVSLFACQKASANQSNVSDTEETQKTVLEMEMNANYSNSDPFEN GRLFCVSEDIETLDAEVYFQMDGERGIVEIKDRNADEVLWSNTWDEKVSGDTITVSLNNL QKEKEYDVRFTGTKINHAVVNVTFESELVQEKSRPSK >gi|157101635|gb|DS480689.1| GENE 43 59871 - 61157 1545 428 aa, chain - ## HITS:1 COG:BH0634 KEGG:ns NR:ns ## COG: BH0634 COG0151 # Protein_GI_number: 15613197 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Bacillus halodurans # 1 426 1 422 428 408 51.0 1e-114 MKVLIVGGGGREHAIAWKISQSPKVEKLYCAPGNAGIAKVAECVDIGVMDFEKLVAFAGE QAIDLVVVGPDDPLAAGAVDAFENAGIRVFGPRKNAAILEASKAFSKDLMKKYNIPSAAY ETFDSPEAALAYLETAPMPIVLKADGLALGKGVLICSTREEAKEGVKTLMLDKQFGSAGD RIVIEEFMTGREVSVLSFVDGKTIKIMTSAQDHKRAKDGDQGLNTGGMGTFSPSPFYTDE VDAFCRKYVYQPTVDAMKAEGREFKGIIFFGLMLTENGPKVLEYNARFGDPETQVVLPRM KNDIVDVFEACVDGTLDQIELEFEDNAAVCVVLASDGYPEHYDKGYKIDGLDRFDGQDGY YVFHAGSRFDKDGNIVTNGGRVLGVTAKGNTLKEARENAYKATEWVEFGNKYMRHDIGKA 
IDEVGVVM >gi|157101635|gb|DS480689.1| GENE 44 61218 - 61808 613 196 aa, chain - ## HITS:1 COG:CAC1394 KEGG:ns NR:ns ## COG: CAC1394 COG0299 # Protein_GI_number: 15894673 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Clostridium acetobutylicum # 1 196 1 192 204 197 51.0 8e-51 MLRVGVLVSGGGTNLQAILDAVDHGDITNAEVSVVISNNPGAYALERARKHGIRAVCISP KQFPTRDAFNQAFLAKIDEYDLDLIVLAGFLVMIPAAMTEKYKGRIINIHPSLIPSFCGV GYYGLKVHEAALARGVKVTGATVHYVDGGMDTGPIILQKAVEVEEGDTPEILQRRVMEQA EWVILPKAINMIANGQ >gi|157101635|gb|DS480689.1| GENE 45 61802 - 62836 835 344 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149378138|ref|ZP_01895857.1| Ribosomal protein S7 [Marinobacter algicola DG893] # 2 338 8 337 354 326 50 6e-88 MMDYKKAGVDIEAGYKSVELMKEYVKGTMRPEVLGGLGGFSGAFSMSAFKGMEEPTLLSG TDGVGTKLKLAFLMDKHDTVGIDCVAMCVNDVACAGGEPLFFLDYIACGKNVPEKIATIV KGVAEGCRQSGSALIGGETAEMPGFYPEDEYDLAGFSVGIVDKKDLITGENLCAGDVLIG MASSGVHSNGFSLVRKVFEKELTKEGLNTYYDELGTTLGEALIAPTRIYVKALRNVKEAG VTVKACSHITGGGFYENIPRMLKDGVKAVIRKDSYPVPPVFKLLAEKGRIEEQMMYNTYN MGLGMIVAVNPADVEKTMEAMKAAGEAPYVVGSIEKGEKGVSLC >gi|157101635|gb|DS480689.1| GENE 46 62949 - 63464 730 171 aa, chain - ## HITS:1 COG:TM0446 KEGG:ns NR:ns ## COG: TM0446 COG0041 # Protein_GI_number: 15643212 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Thermotoga maritima # 1 169 1 169 171 214 64.0 9e-56 MAKVGIVMGSDSDMPVMAKAAEILDKLGIDYEMTIISAHREPDVFFDWAKAAEDKGFKVI IAGAGKAAHLPGMCAAIFPMPVIGIPMKTSDLGGVDSLYSIVQMPSGIPVATVAINGGAN AGILAAKILATSDAELLAKLKAYSESLKNDVVKKAEKLEQVGYKEYLAQMK >gi|157101635|gb|DS480689.1| GENE 47 64001 - 64864 771 287 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938640|ref|ZP_02085992.1| ## NR: gi|160938640|ref|ZP_02085992.1| hypothetical protein CLOBOL_03535 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03535 [Clostridium bolteae ATCC BAA-613] # 1 233 1 233 287 417 100.0 1e-115 
MTAKRIKKAAILVCAGFMALSLPGVGSTALGAPSNRETRIQEETWGRVFLSAGPKISLSY DREGEVLEAKGLNQDGRKLLEGAGNFAGADCDTAVRRLVKRMDDKGWFRGDKNEKELLIE SEKGSVYPRADFMKEIEEAAWEAVNKNGIDITITLKDNGKQSEQLYIGKNEAKRIAIKEL GLKENELTYKGCDLDDGVYELRFLVNGRKLEVDVNAATGRIADVDWDNDTGAWDDDRFDD DDRFDDDDRFDDDDRFDDDDDHDDDDDHDDDDDDRHDDDHDGNDYDD >gi|157101635|gb|DS480689.1| GENE 48 65501 - 67045 995 514 aa, chain + ## HITS:1 COG:CAC3339 KEGG:ns NR:ns ## COG: CAC3339 COG0488 # Protein_GI_number: 15896582 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 499 1 501 518 413 42.0 1e-115 MSLLDITGLSHSYGDSRLYSDSALSLNRGEHMGIVGQNGTGKSTLIKICTEQVIPDSGRV AWQPGITIGYLDQYACTNHVLTMEQFLKSSFRSLYELEAKINRLYDEAALSCNMEVLNQA SLFQEQLERSGFYSIDTHIAQVVSGLGLVSIGLSRPLKDMSQGQRAKVILAKLLLVKPDV LLLDEPTNFLDKEHVEWLGGYLSAIENAFMVVSHDRTFLERICTRICDIDNGRLSKYYGS YTEFLKKKAFLREDHIRQYNAQQKEIKKTEEFIRKNIAGRKAQMARGRQKQLDRMDKLAA LEQKEIKPEFCFQCLPLTKCLHLSVKRLSVGYDYPILSHLDFSVTGGQKVVVTGFNGIGK STLLKTLTGSLNPLDGVFAFSQQVTIGYFGQDFSWENDTLTPVQIVSNAYPKLTLKKIRQ HLTRCGVLSRQAVQPMGTLSGGEQTKVKLCLLTLSPSNFLILDEPTNHLDIQAKDSLQAA LKSFPGTILLVSHEESFYRSWANRIIHIGDYSSP >gi|157101635|gb|DS480689.1| GENE 49 67232 - 68416 927 394 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938644|ref|ZP_02085996.1| ## NR: gi|160938644|ref|ZP_02085996.1| hypothetical protein CLOBOL_03539 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03539 [Clostridium bolteae ATCC BAA-613] # 1 379 1 379 394 417 100.0 1e-115 MTYLVMECHPSYAIVLDNEGRMIKAANMGYQEGQVVGEIIARQTPKAPILFRLAPLAAAA CLCLAVLGGGAYGAYGMPYGTVEVRINPDVKMSVSYMDRVVGLEGVNEDGKKLIDQVSYK GRNSEYMTGILVERAMEMGYLTEGGTIYVAASGGSERWKAKRADAIRTGLDTLLKDGMPV EVHVESIGKDSPPAAAPHHKGEIPGGTGGDKPAFIGMPPGGTGEDADGNGEDDRDEDERG DDTENDDEENDADPDKYDEDDGDDREEDAGKDKAGDKDDGRDDDGKDDTDDRDDDIKKAR DKSGRNNVVRQSSGAGDDKDLDDLDERDNGKAGNGLNDNRDRDEDDGGEHSDQDDSDEGD SDEDDIDENDSQEEDNDGHEDGEDDKDEDEDDDD >gi|157101635|gb|DS480689.1| GENE 50 68413 - 69147 616 244 aa, chain - ## 
HITS:1 COG:BS_ykoZ KEGG:ns NR:ns ## COG: BS_ykoZ COG1191 # Protein_GI_number: 16078409 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus subtilis # 1 233 11 245 251 101 30.0 1e-21 MGEEYQIVRLVYEAKKSPEAADSLISQYMNFIQAETVRFTQGAPTQVREDQLSVAMFAFY EAVQGYERGRGAFLPYAAASIRNRLIDYIRKEGRHQGLVSFDAPAGQEDDSHALLEVTPD EKNGIQLWNVRTATREELDEFRRNLSEYGISLSDVADNCPRQDRTLRACHKVLEAARSHP RLLEHLIQNRKLPVAELAELSGVSRKTMERHRTYLVAILLAYTNGYEIIRGHLCQISPGK EGTV >gi|157101635|gb|DS480689.1| GENE 51 69234 - 69359 83 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938646|ref|ZP_02085998.1| ## NR: gi|160938646|ref|ZP_02085998.1| hypothetical protein CLOBOL_03541 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03541 [Clostridium bolteae ATCC BAA-613] # 1 41 2 42 42 70 100.0 5e-11 MDEGDTALVVGEGVLYIKKSLVQSNSWMEQVIFSVILDKRK >gi|157101635|gb|DS480689.1| GENE 52 69375 - 71594 2232 739 aa, chain - ## HITS:1 COG:no KEGG:Closa_1605 NR:ns ## KEGG: Closa_1605 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 728 1 690 691 852 59.0 0 MDYRILLQEENDAVRERYELSMERIRSMDTEEMGLPYGCYFKKVSAFIRRIERLAKRVEE TGLSGEDGLEGTSLTLEALQEENYALYQDILPENYRESFANPDYAVEVLGDGYGQLLSFL YTEIRADIVYAFEMRLMNITILNELFLEVYNLFAQAWEEGKNAPLEQLIKDSLYWFVSDY AEVTVDWRIREGLDPALSFGTDIVMKRDLTDLRYLYAYGEYVSETEIKLASFMNSLPQET IDQMASTYTEGFRKGFQVMGRDLKKKRTVVVEFQLGFERMIRRAVEYFREMGLEPICYRA AVESVNRRANGRRGYYGTSPNKQYDYDHRYDSALYMGNAFKERKLAVLRSAYETYRKEAA WCAGPALVETFGEEGFAPENKKAALALNAHQEALTLAYANESRQIVNQYMPGDETSFTII AFPKPEIGPDFEAVFRETIRINTLDYEKYQKIQQCIIQALDEAGYVLITGRDGNETSMKV ALHPLRDREKETNFENCVSDVNIPLGEVFTSPILKGTEGLLHVKNVYVEDYQFRNLRMVF KDGKVTEYSCGNFEEAGAAGDAGNDKGAGAAGNDKDTGDARRQGQALVKQVIMRNHDWLP LGEFAIGTNTTAYAVARRFSIGDKLPILIAEKMGPHFAVGDTCYSFAEDSPMYNPDGREV IARDNEISLLRKTDMSKAYFSCHTDITIPYSELGDIKAVRADGSRIDIIRQGRFVLEGTE ELNEALDEADDGAAGLDCE >gi|157101635|gb|DS480689.1| GENE 53 71611 - 71934 443 107 aa, chain - ## HITS:1 COG:no KEGG:Closa_1604 NR:ns ## 
KEGG: Closa_1604 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 105 2 104 106 80 46.0 1e-14 MRQIGDKVSYVRNPFARNSYYCLTLAVLGLALGTASMYLSVARAGQGGLNTGAYGFSSLA AALMGLWYGVRSFMEKDRNYILAKIGISICVVLVIVWAVIIVTGIVR >gi|157101635|gb|DS480689.1| GENE 54 71931 - 72404 348 157 aa, chain - ## HITS:1 COG:CAC0829 KEGG:ns NR:ns ## COG: CAC0829 COG4767 # Protein_GI_number: 15894116 # Func_class: V Defense mechanisms # Function: Glycopeptide antibiotics resistance protein # Organism: Clostridium acetobutylicum # 15 137 18 143 308 88 38.0 5e-18 MIRNTTKRQKLGWVLFILYLCFLAYFMFFSESFGRTDTDRGYQYNLVPFKEITRYFRYYD VLGPTLFMINMIGNVAAFMPFGFFLPIISRRSKKWYNTVMFGFVFSLMLETLQLVFKVGS FDVDDMLLNTLGAAAGYLCYRLVQWTRCKLRKRRLSR >gi|157101635|gb|DS480689.1| GENE 55 72420 - 72548 220 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938650|ref|ZP_02086002.1| ## NR: gi|160938650|ref|ZP_02086002.1| hypothetical protein CLOBOL_03545 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03545 [Clostridium bolteae ATCC BAA-613] # 1 42 18 59 59 64 100.0 3e-09 MRNTGAGCIAIGVIIAVTGLTVGIISIVNGALLLKRKSEIEF >gi|157101635|gb|DS480689.1| GENE 56 72599 - 73276 938 225 aa, chain - ## HITS:1 COG:CAC0027 KEGG:ns NR:ns ## COG: CAC0027 COG0461 # Protein_GI_number: 15893325 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 1 224 1 224 224 297 62.0 1e-80 MENYKQEFIDFMVESSVLKFGDFTLKSGRKSPFFMNAGSYVTGTQLRKLGEYYAKAIHDN FGLDFDVLFGPAYKGIPLSVATTMAISELYGKDIRYCSNRKEVKDHGDTGILLGSPIKDG DKVVIIEDVTTSGKSIEETFPIIKAQGDVEIKGLMVSLNRMERGKGTRSALDEIKDLYGF PTAAIVSMADVVEHLYNKEINGKVVIDDQIKAAIDAYYEQYGAQA >gi|157101635|gb|DS480689.1| GENE 57 73405 - 74307 1137 300 aa, chain - ## HITS:1 COG:BS_pyrD KEGG:ns NR:ns ## COG: BS_pyrD COG0167 # Protein_GI_number: 16078618 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Bacillus subtilis # 3 297 2 297 311 311 52.0 
1e-84 MNMSVNIAGVEFKNPVMEASGTFGSGMEYSEFVDLNRLGAVVTKGVASVPWPGNPTPRIA EVYGGMLNAIGLQNPGIDVFTKRDIPFLKQYDTKIVVNVCGKTTEEYIDVVERLGDQPVD MLEINISCPNVKEGGIAFGQDPKAVEAITREVKAHARQPIIMKLSPNVTDITVMARAAEA GGADAISLINTLTGMKIDIHKRAFALANRTGGLSGPAVKPVAVRMVYQVAQAVKVPIIGM GGIRNADDALEFILAGATAVAIGTANFHNPYATVETVDGIRAYMETYGIGDIRDLIGAVR >gi|157101635|gb|DS480689.1| GENE 58 74344 - 75138 869 264 aa, chain - ## HITS:1 COG:BS_pyrDII KEGG:ns NR:ns ## COG: BS_pyrDII COG0543 # Protein_GI_number: 16078617 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Bacillus subtilis # 1 264 1 256 256 209 42.0 4e-54 MAKVKITAAVTLQEQLAADIYDMRIKAPEIAESAVPGQFVSLYSKDGARILPRPISLCGI DKEKGELRLVYRIAGEGTKEFSGLTAGDTIDVLGPLGNGFPLEAGKKAFLIGGGIGVPPM LELAKALHELNKDEEDNLVQSVLGYRDSQMFLKDEFEAYGPVYAATEDGSFGTSGNVLDA IREQGLTADVIYACGPTPMLRALKAYAAEKGLECWLSLEEKMACGVGACLACVCRSKEVD DHSQVHNKRICKDGPVFRADEIEL >gi|157101635|gb|DS480689.1| GENE 59 75158 - 76105 1038 315 aa, chain - ## HITS:1 COG:CAC2652 KEGG:ns NR:ns ## COG: CAC2652 COG0284 # Protein_GI_number: 15895910 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Clostridium acetobutylicum # 10 315 1 286 286 209 39.0 6e-54 MLLTTSYKNIMISQLIEKIKKTNAPICVGLDPMLSYVPEHVVKKSLDAYGETLEGAADAI WQFNKEIIDHTFDLIPAVKPQIAMYEQFGIEGLTVYKRTVDYCHEKGLIVIGDAKRGDIG STSAAYAAGHLGRVQVGSKSLSGFNVDILTVNPYLGTDGVKPFVDVCRSEDKGLFVLVKT SNPSSGEFQDRLIDGKPLYEWVAAKVEEWGAGCMDGEYSNVGAVVGATYPEMSEILRKLM PHTYFLVPGYGAQGGTAADLKHCFNQDGLGAVVNSSRGIIAAYKQEKYKNFGPEHFGEAS RQAVIDMVADISSVL >gi|157101635|gb|DS480689.1| GENE 60 76056 - 76214 98 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936491|ref|ZP_02083859.1| ## NR: gi|160936491|ref|ZP_02083859.1| hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] # 1 44 1 44 454 88 100.0 2e-16 
MAIHSCRVVEATKKHQKSQYLNVLNHETEKERKYVVFASNNIIQEHYDKPVN >gi|157101635|gb|DS480689.1| GENE 61 76350 - 77516 339 388 aa, chain + ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 388 1 388 391 350 45.0 3e-96 MIYVGIDIAKLNHFAAAISSDGEILIEPFKFSNNYDGFYLLLSHLAPLDQNSIIIGLEST AHYGDNLVRFLISKGFKVCVLNPIQTSFMRKNNVRKTKTDKVDTFVIAKTLMMQDSLRFM ALEDLDYIELKELGRFRQKLVKQRTRLKIQLTSYVDQAFPELQYFFKSGLHQNSVYAVLK EAPTPNAIASMHLTHLAHTLEVASHGHFGKDKARELRVLAQKSVGVNDSSLSIQITHTIE QIELLDSQLFSTELEMANLVTCLHSVIMTIPGIGVVNGGMILGEIGDIHRFSNPKKLLAF AGLDPTVYQSGNFQAHRTRMSKRGSKVLRYALMNAAHNVVKNNATFKAYYDAKRAEGRTH YNALGHCAGKLVRVIWKMLTDEVAFNLE >gi|157101635|gb|DS480689.1| GENE 62 77836 - 78954 1076 372 aa, chain - ## HITS:1 COG:SMb21107 KEGG:ns NR:ns ## COG: SMb21107 COG4948 # Protein_GI_number: 16264434 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Sinorhizobium meliloti # 47 360 34 351 370 164 31.0 3e-40 MDGENMTAGKMNVKEKIKDIHFYRAVSDISRPIADSTHTISQIGFYIVELMTAEGVRGQG YVLSFHYSPGAMEGALRDVCSFISAGEYAVYDTVLARADWDREAEYFGNTGINNCAIAAV NLAMWDAWGHTCRQPVWKMLGANRRRIPVYGSGGWISYTDQELLEEVLDYKKRGFKAVKI KVGSGSIERDAERLAKCREALGDSVKIMMDANQGFNVPDALELARRASSMGIQWFEEPVV HTDYAGYQLLRNQCGIALAMGEREYDFEALKQLSVRNALDLWQPDIVRLGGVEAWRESAA LAEAFGLPVLPHYYKDYDVPLLCTVRSPYGAESFDWIDGIIDNAMEIENGFASPREGDGW GFRFREECMTEV >gi|157101635|gb|DS480689.1| GENE 63 78944 - 80347 1224 467 aa, chain - ## HITS:1 COG:BH0705 KEGG:ns NR:ns ## COG: BH0705 COG1904 # Protein_GI_number: 15613268 # Func_class: G Carbohydrate transport and metabolism # Function: Glucuronate isomerase # Organism: Bacillus halodurans # 3 465 2 464 472 385 40.0 1e-106 MGSDCFSEEFLLTNPTGSRLYHSYGERLPIVDYHCHLEAREICENREFKDLGEMWLAHDH 
YKWRAMRTFGISEEYITGTKSYYEKFLKFAQIMPYLAGNPLYIWCALELKRYFDIEEPLG PDNAEEIYRRTNQRIRQLHMTPGWCLERSGVELLSTTEDPADSLEYHMKIKENAVLKTRI LTAFRPDRAFYCERKTFSDYMKILSQAAGQEIVDFSSMMGALEQRLAFFAEFGTTVSDNG IAHIVWEEYTEAQAEDIFQKAAAGEQLSEVEINRYKSAFLIEMAKLYKKYHFVMQLHIGT YLDANHRKVQEIGQSTGFDCVDDSTSVKSVGRILDRLTELDQLPKTILYPLNPAQMEPFA VLAAGFCDGTVRGKVQLGAPWWFNDQPFGIMRQFGGAGNLYPLSLSVGMLTDSRSFLSYP RHELYRRCLCSYLGELVERGEYFSGEKYLKEIIEGICYRNVKEYFGW >gi|157101635|gb|DS480689.1| GENE 64 80337 - 81107 220 256 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 13 250 4 238 242 89 29 1e-16 MFDTGSYFSLEGQTAIVTGGTTGLGLAITRCMVGAGAKVAVLSYESREQGREALKEFGGK AVFYQFDITDTDHTQEMADRVIAEQGPVTILVNNAGNHCKKYIEDMTVEDYVNVLNVHLV GAFALTKALFPHMKEQGKGSILFQASMTSYIGQPQVAGYATAKAGYLGLVHTLAAEGGPY GVRVNAIAPGWIDTPMFHKATDGDQVRLGKILGRIPMQKVGSVTDIGMAAVFLSSDAAGY ISGTCLPVDGGALIGF >gi|157101635|gb|DS480689.1| GENE 65 81111 - 82301 1265 396 aa, chain - ## HITS:1 COG:STM0650 KEGG:ns NR:ns ## COG: STM0650 COG2721 # Protein_GI_number: 16764027 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Salmonella typhimurium LT2 # 1 380 9 383 390 231 35.0 1e-60 MGYVRKNGAMGIRNKVLVIYTVECASFVAREIVRLAGDPETEVVGFSGCTDNEYAVRLLI ALIRHPNVGAVLTVGLGCEYVQPEWLADIAGKEGKEAAWMFIQQEGGTRKTIEKGLAVVG LMRERLKDTKRKPMGIRDLIIGAECGGSDYTSGLAGNVVVGRFFDRLTDLGGTAIFEEIV EAIGLKEMLVGRAANERARQEIGDTYDKALEYCKSVRQYSVSPGNFAGGLSTIEEKSMGA LIKSGSRPIEGVLKVGVKPPHPGLWLLDSTPDPYWMQFGITNPNDSEGLMDLAACHSHIV FLVTGRGNVVGSAVVPCIKITGNSATYRAMEEDMDFDAGPVLEGRCSQDEMAQRLLKMAA EVASGTKSKSEALGHKEYFIPYKYQDRDATKKCREA >gi|157101635|gb|DS480689.1| GENE 66 82342 - 82689 214 115 aa, chain - ## HITS:1 COG:CAC0696 KEGG:ns NR:ns ## COG: CAC0696 COG2721 # Protein_GI_number: 15893984 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Clostridium acetobutylicum # 8 96 1 86 492 65 38.0 2e-11 
MDGTMEIIQRAFQIEQKDNVATALEELIPGALRLLGDGCMREAEAVQKIPKGHKIALQNI RAGEDVIKYGVRIARASKDIRAGEWVHLHNIRSVYDERSSHLDVMTGAPKDTRYE >gi|157101635|gb|DS480689.1| GENE 67 82709 - 83374 821 221 aa, chain - ## HITS:1 COG:MTH888 KEGG:ns NR:ns ## COG: MTH888 COG0684 # Protein_GI_number: 15678908 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Methanothermobacter thermautotrophicus # 21 213 31 214 220 77 30.0 2e-14 MKLWNDENEKFALMREKLYTPVVGDILDELGYYHQFLPQDIRPLREDMKLAGKAMTVLMT DVYGPQKKPFGLLTEALDQLKENEIYIAAGGTKRCAYWGELLTAAARTRKASGAVVNGWH RDTPQVLSQNWPVFSCGCYAQDSSVRTQVVDFRCDIEIGQVRIHDGDLIFGDVDGVLVIP REAADEVLIKALKKAAGEKTVRKAIEDGMSATDAFAKFGIL >gi|157101635|gb|DS480689.1| GENE 68 83391 - 84131 223 246 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 6 239 4 238 242 90 31 5e-17 MELAGKTGIITGGSSGLGFAAANVLAEAGAVVYAVSRSGRPKLEGETSHENVIHLAADVS DYKEMGELTERIGREHGIDFLVNNAGITVKSLAQDVSDADFEKVLQVNVSSVFKLCTLCY PYLRQSPNRGRIVNITSMAAHLGFSEVVPYCTSKGAVLSMTRGLAVEWAQDGINVNSVAP GWFPSEMSRKVMDEERKQKILARMPVHCFGNPRDIGAMVKFLVSDESEYITGQDFAVDGG ALSFGY >gi|157101635|gb|DS480689.1| GENE 69 84131 - 85252 1010 373 aa, chain - ## HITS:1 COG:AGl2751 KEGG:ns NR:ns ## COG: AGl2751 COG4948 # Protein_GI_number: 15891485 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 34 365 62 406 407 195 34.0 2e-49 MKIVKFETWWVKRNKCLFDEKRKGKSSMDWDVVVLKLTTDNGYEGIATALAARSGRVTES YLHDNIAPVVLGASPYDREKIWHELWNIDRHLTFFPVYLPGPVDVALWDICAKAAGLPLY QYIGAYRRSLPVYASGLFHEDPQEYVREALYYKSRGVNCYKAHPSGPCELDMEIHENIRK AVGPDMKLMSDPVAEYTLEEAVRVGRHLEKLGYEWLEEPFRDFELNKYTQLCAALDIPIA ATETTRGCHWGVAQVINQKAADIVRADVSWKCGVTGTLKIAHLAESFGMQCEIHTTTMNY MDLVNLHVSCAIRNCRYFEYFVPEEDFMFPMKGLLPIDEKGIITVPDKPGIGGELDWELI ERNCVSHQMEVSE 
>gi|157101635|gb|DS480689.1| GENE 70 85263 - 86102 974 279 aa, chain - ## HITS:1 COG:BMEI1714 KEGG:ns NR:ns ## COG: BMEI1714 COG0395 # Protein_GI_number: 17987997 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Brucella melitensis # 3 279 15 294 294 154 31.0 2e-37 MSIKKNAGSIAVFILLFVISFIFIIPFIWTMLTSLKPDEEIYSSVLTFLPVNPYLGHYTG IFTKLGNFFKYFSNSVVVSFWSVLFNVLFAATLGYSFSKFQYWGRDLFLGFVLLIITLPY VIYLIPIYIMQSRFDLIDTRLGLILPYIATNLPMSVFIMRGQFNGVPNEMMEAARIDGAG QWQTFSRVMLPIVKPGIATVIIMTFISVWGEFTYARTLCTTAKSQTLAVGITFLRDEAAS WQYGTLTATIILSLIPVLAIFLSMQKYFIKGIMAGAVKG >gi|157101635|gb|DS480689.1| GENE 71 86099 - 87046 899 315 aa, chain - ## HITS:1 COG:lin0760 KEGG:ns NR:ns ## COG: lin0760 COG1175 # Protein_GI_number: 16799834 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 1 296 1 291 296 190 36.0 3e-48 MGNKKSVGVQLKKMIPGYAFAAPGVIFVAVYMGYPLLRSLYLSFTNYNFAFDKKPVFAGL SNFAKMFSDAYFMDSLRNTVVFSLLFFPSIMIISLIIAMLLDKGVRGSGIFRTCVFVSMV VPLSLTGIIFQWILNNQYGLLNSVLREMLHLDTWAHNWLGEGKWAMFSIVIVSLWKNMGM LVIFFMAGLAGIPSDIIESAKIDGANAFQRIFRIILPNLKESYVICGLWAIIQSVKVFEQ PFVMTNGGPGTSTLVLYQYTWMNAFKYYEMGYASAIGYFMGAVILILSCINLYINRTKDE DEVKRKKTGKRRNRA >gi|157101635|gb|DS480689.1| GENE 72 87132 - 88460 1294 442 aa, chain - ## HITS:1 COG:PM1762 KEGG:ns NR:ns ## COG: PM1762 COG1653 # Protein_GI_number: 15603627 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 1 354 1 347 451 79 23.0 2e-14 MKTKKAYSLALAAVMASSVLWGCGSSSTDPGTGAAEASDSGTKEAGGGQTLRVTVQAWMM GKYDFERIAKEFEEENPGVKVVYNQVDNVDVTSSMLEWSQGKTNCDLALGGDRSETVPYA ANDYVVEFTDENFFNGDFTKDKFIESFLESGNADGVQYMIPLLGEVLGCVVNIPMMQEAG YVAEDQTVTAPSDWNEMYEYAKNLTKDGQTGLAIDWGNNMAVKAYDACVMGANGNLYESD GNTLNFTAGPVKNMLQVWQDLVKDGYTTTDVFADADANRTNFKAGRVAMHIAPASRWIEA 
GELLGAENVGVMPIPGTDTHGSVSYIHGAVIPKASENQELAIKFIKDKLLQEQFQADSMN SYGKMSPMKAHYEKLDNPYWPTVLDFTEKATTPPLYKDYTKLDTNMQIELQKMLSGGSTV DEFSANMSKFMTTIDLSTGMNK >gi|157101635|gb|DS480689.1| GENE 73 88835 - 89368 76 177 aa, chain - ## HITS:1 COG:CAC1633 KEGG:ns NR:ns ## COG: CAC1633 COG1633 # Protein_GI_number: 15894911 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 23 175 66 229 236 65 27.0 4e-11 MNNIQYNIRANPDSFPVAHSNWYPPVAVLEKNQQYAQILMADLASTASEMTTVHQYLYQS WTINDRHQNIRRVIQRIAIVEQHHFSIIGQLIHLLGGQPECRSSRPNSYWCGNMVSYSCE LPAILSDNARSEQFAAQAYAAQSKEIKDPHVSKMLARLSLDEKLHYKIFSDYLSQIE >gi|157101635|gb|DS480689.1| GENE 74 89797 - 91563 1852 588 aa, chain + ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 1 563 1 518 555 253 34.0 6e-67 MNFMKNLKIGKKLIVTFGVIIILFLISLMVSIMSLSNSGKNFGDFYTVGYPVSNKTTDMC RATQTGLKCIAISMLTDDAAETQNYITQANEQFNTLSPGFDYLRQNFRGDKAVLDQAQKI LEEAKPYRIEILDLAGQNKNTEAADLLFTKYQPLMLEFMNLMNKADSDTTAIAAEDYSQS AKAQTSALVFLITIAALALALTVTLAIYITKALTRPISEIEEAAKKMSAGDLEVSISYVS EDELGSLSESMRTLTHNFKGIIQDMGMGLSALGNGDFSVDSKAKELYVGEFAQLATSMYQ IIDKLSSVLGQINQSADQVASGSDQVASGSQALSQGATEQASSVEELAATINEISNQVKS NAENAHNVNKLADDVGLKMTESNQQMQTMIEAMKEISSSSSEIGKIIKTIEDIAFQTNIL ALNAAVEAARAGEAGKGFAVVADEVRNLASKSAEASKNTAALIESSILAVEKGTKIADET AHTLLESVEGAQKVTRTIDQISRASEEQASSISQITQGIDQISNVVQTNSATAEESAAAS EELSGQAQILKGLISQFKLKDMGSAPMVMQDVPRQMDPIPVYPDGSKY >gi|157101635|gb|DS480689.1| GENE 75 91657 - 92739 1136 360 aa, chain - ## HITS:1 COG:TM1200 KEGG:ns NR:ns ## COG: TM1200 COG1609 # Protein_GI_number: 15643956 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 2 227 3 211 333 72 27.0 1e-12 MKSQITMKDMAQHFNVSLNTIHKAIYGKPGVSEATRKKIVSYAEKNGYRLNAMASMLKRK NMKIAVCLPKLDQDSKYFYSYIWQGYRMYLAEQSEFNLEIQELAYEKGELSRTLEALCEK 
VKAGEELDGLLTVPPGDEKGICAVRYFTDKGLPVVFVTEDQDRCGHLGTVVGDYYAAGQL MAEQACNILARGRRILLMTGDPYKDSHYLVAKGFHEYMRQEKKEYAVEDLYGYYELDHLD KNVLDILKENPPDLICCVFSRGSAVLYRALKKSGMAGRIPVIASDVFDETVRALKEGTFT NLVFKDPCKQSYLAIKMLYEYLLMDKEPAEKVRRVEIVLIFKSNVNYFWENIKDMQYTRT >gi|157101635|gb|DS480689.1| GENE 76 92927 - 94234 853 435 aa, chain - ## HITS:1 COG:all0778 KEGG:ns NR:ns ## COG: all0778 COG4826 # Protein_GI_number: 17228273 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Serine protease inhibitor # Organism: Nostoc sp. PCC 7120 # 80 433 20 372 374 192 33.0 1e-48 MKHLMQMITKRKRKYLGAGLLLTAIFLSALTSGCALNGLHKGTQDLTRNIPKEEVETVDM MEPSYESVNPRKELTEFSLELLSENLSHGNCLISPLSIVSALGMTANGAEGNTRAQMEQM MHTDTAVLNDYLKAYTDYMPNSEACQVNIANSIWFRDRDSLAINEDFLKTSRNYYDASIF EAPFDAGTRDDINSWVKKETNGMIQKLLEEAPPRDAVMYLVNALSFDGEWRDIYKKNQIH KGTFNAENGEHQSAEFMYSTEAVYLENTLGRGFLKPYGDGAYAFAAVLPDEGMTMKEFLE RLKENGITDLLEQTRNKTVCVRLPKFTVEYNVVLNESLTESGMKDAFDSQKADFSSMAHS DNDNIYISKVIHKTKIDVDEKGTKAGAVTAVEVSEGSAVQTEEPKEVFLDRPFFYMIIDT RQNFPLFMGCLMQLE >gi|157101635|gb|DS480689.1| GENE 77 94510 - 94953 682 147 aa, chain + ## HITS:1 COG:CAC3714 KEGG:ns NR:ns ## COG: CAC3714 COG0071 # Protein_GI_number: 15896945 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Clostridium acetobutylicum # 40 146 46 150 151 62 34.0 4e-10 MMLIPRRNYGLNLFDDFFNDPFFTSASEKGEEAKKLPVMRTDIMEKDGNYILEIELPGFK KEDIRAELKDGYLTVSADITKSSEGKDDKGTLIHKERYTSSCKRTFFVGEQIRQEDIKAG FENGILRLQVPKDAPKVEETPKYIDIL >gi|157101635|gb|DS480689.1| GENE 78 95068 - 96744 1577 558 aa, chain - ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 1 557 14 549 553 332 34.0 2e-90 MAQTLYYNGTILTMEDRCPQVEAVLTEDGRILETGTWEKVKERAGDKARRLDLQGRVMMP 
GFIDSHSHFTACASHTMEVDLGGAGSFEDMITCIRQYIEDKKIPEGKWVTASGYDHNRLK EHSHPRRKVLDQAAPQNPLILKHQSGHMGVFNTMALNLLGVTGQTQAPQGGVIEMEGGLP TGYMEENAFLGFQGRIPMPSAEDFLTAYGRAQELYASYGITTVQEGLMASQLVPLYQMLL KSGLLKLDLISFMDIRDSDAARTAFESHIKNYKGHMKIGGYKMFLDGSPQGRTAWMRTPY LPETPDAEEKAQADAFPRTRATKEDYCGYNTLEDSQVKENILKAELEDMQLLAHCNGDRA AEQYMDMLEAVYRELGSGNSRKPGFYRGDIRPVMIHAQLLGLDQLERVKRLEIIPSFFLA HVYHWGDIHVRNFGQERAGRISPAASALKEGICFTLHQDSPVIMPDMMETLWCAVNRRSR EGKVLGPEERIPVRDALKAVTVNGAYQYFEENEKGTVTPGKKADFVVLEQNPLETGADEI RNIRVLATIKEDRLLWKA >gi|157101635|gb|DS480689.1| GENE 79 96893 - 98521 1373 542 aa, chain + ## HITS:1 COG:VC1670 KEGG:ns NR:ns ## COG: VC1670 COG1502 # Protein_GI_number: 15641674 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Vibrio cholerae # 353 489 310 444 484 72 32.0 3e-12 MICRCITWIKKHPVLSLFFLILMYLVIGATAPFYHYKPISQETQDTVSAENFYQEGDSRD RAMILETNQSAWDERMRLMNLARERIILSTFDFRDGESPRDLMAVMLHKADQGVSIKILV DGFSGLVRMEPSKLFYALSSHPNIEIKIYNKMNPLLPWKTQGRMHDKYVIVDDYGYILGG RNTFDYFIGSYPTDSRSHDREVLVYNTAHGTDRGKDSSLYQVEDYFEQVWNLDVSSLFHD SEKTGDRTSVRNAAAMLRERYKVLAIQYPDLFDSEEDSASCTAAPGQPGCPALPMDSSIN DDDYPAAPDIPDAPDTAPPLFPALSLTANQSKALEYYGSNTVPTGKITLVSNPTGIYGKE PVVFHTMSVLMKDAKRSVLIHTPYAVFNDYMYDTMKEITARVPVTMMINSVENGDNFFAS SDYPMHRDAFGDTGMEILEYDGGLSYHGKSLVIDDELCAVGSYNFDLRSTYLDTELMLVI QSPELTAQLEEYMVSYQKDCRRLLPDGTYEIPEHLTIADVPAYKRAAWKVVGFVMQPFRF LI >gi|157101635|gb|DS480689.1| GENE 80 98514 - 99905 1562 463 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 20 463 4 444 447 273 38.0 4e-73 MNQSAAKSREQGKQIKKRIDLTEGSPGRRLLLFALPMILGNLFQQFYNMADSMIVGNFVG EDALAAVGASYALTNVFIMIAIGGGNGASVLTSQYLGARQHGKMKTSISTALITFLFVSI LLGSAGLYLNGLILESLKTPANIMGQAKLYLGIYFLGMPFLFMYNVLAAIFNAMGDSRTP LYLLIFSSILNVVLDIISVTWLGMGVDGVAIATVMAQGVSALISFGILMKKLRGYEQEDG 
DSFRLYDSSMLKAGTRIAVPSILQQSIVSIGMLLVQSVVNGFGSAALAGYSAGSRIESIC VVPMIATGNAVATFTAQNIGAGKMERVKEGYRASYGIVAGFSAIIAVIVVLLNGPIITSF LGDGMSREAYGAGTGYLSFIAYFFVFIGMKACTDGVLRGSGDVLVFTIANLVNLTIRVYA AFHFAPIWGVAAVWYAVPMGWIANYVISFSRYLTGKWRDKKVI >gi|157101635|gb|DS480689.1| GENE 81 99919 - 101715 2131 598 aa, chain - ## HITS:1 COG:BH2856 KEGG:ns NR:ns ## COG: BH2856 COG1164 # Protein_GI_number: 15615419 # Func_class: E Amino acid transport and metabolism # Function: Oligoendopeptidase F # Organism: Bacillus halodurans # 1 591 5 595 598 594 48.0 1e-169 MKKRSEADSKYTWKLEDMVAEDSQWEQMFKEASGEISEYASYKGRLAGSADTLYACLLFD DKLSQKIERLYVYARMRSDEDTTVQRYQDMFSRAQTLSYRAAENSSFLVPEILSMDRELL EQYMAADNGIGHFKRALEIILARRDHTLSGEMEELLAQSYDATQGASQIFTMFNNADVKF PVITGESGEGIQITHGNYISLMENQDRRIRKDAFEGLYSVYEQFSNTLAAAFSSNVKQAV FYAKAKKYASSREYYLADNEVPELVYDNLVKAVRENIVKLHEYTRVRKDVLGVDELHMYD LYVPMVAAADRRYTYEEAKSIVLEGLAPLGEEYLSLLKQGFDSRWIDVYENEGKRSGAYS WGTYGSHPYVLLNFHGTLNDVFTLAHEMGHSIHTWYSDRNQPFTYAGYKIFVAEVASTCN EALLIRHLLKKAGSREEKAYLLNHFLESFRGTLFRQTMFAEFEDMAHKKAARGESLTAES LCSIYRQLNADYFGPAMTVDRQIDYEWERIPHFYTPFYVYQYATGFSAAVAISSRIMSGE PGALEGYKKFLSGGCSMKPIDLLKLCGVDMSTTRPVDEALGFFGELIEEFKKCIHTNE >gi|157101635|gb|DS480689.1| GENE 82 101787 - 102491 877 234 aa, chain - ## HITS:1 COG:MA3262 KEGG:ns NR:ns ## COG: MA3262 COG1346 # Protein_GI_number: 20092078 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Methanosarcina acetivorans str.C2A # 1 229 1 229 238 168 41.0 8e-42 MREGIRLILEQTQYFGLVLSIGAYLFACWLKNKTKLAILNPLLVSAALIIACILGVGMDY ETYNKGASYLSWLLTPATVCLAIPLYKQLHLLKKHADAVAVGITSGVVTSAVSIFLMCKV LGMAHVHYVTLLPKSITTAIGMGISEEAGGIVTLTVMSIILTGVLGNMAGETVLKLLKVR HPVAKGLAMGTSAHAVGTAKALEMGEIEGAMSSLSIAVAGLMTVIVVPLAANLI >gi|157101635|gb|DS480689.1| GENE 83 102493 - 102855 525 120 aa, chain - ## HITS:1 COG:MA3263 KEGG:ns NR:ns ## COG: MA3263 COG1380 # Protein_GI_number: 20092079 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # 
Organism: Methanosarcina acetivorans str.C2A # 4 110 2 108 165 73 35.0 1e-13 MRYVKQIGIILGITLAGEILNHVVPLPVPAGVYGLFIMLAALMCGAVKLESVEGTGNFLM DTMTMMFIPATVGIVECIGEVKAVLVPFLIIIGISTLLVMAVTGCMAQWVMGRKHSGEEQ >gi|157101635|gb|DS480689.1| GENE 84 103073 - 104074 411 333 aa, chain + ## HITS:1 COG:no KEGG:Closa_1593 NR:ns ## KEGG: Closa_1593 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 22 154 2 132 255 66 30.0 1e-09 MNMMHTNTAHTNLTRTNMTHTNITHTTTLSRKEQEDIRRISALCRLTDGLSLSCPEDGDE YWLLEEDATAAAFLAVYKTEETMWECYAFTHPDFRRKGYFSALLEQVCQYSEALGEPELC LVTDNKCPAATAALRELGAELWNEEYMMEYNTAADSSKDGKSASHASLPNRASDGADGRK QDMELDMDIRPTLEGLLICARRPGDHPSPDKDGITSQDRDDITARNKAHSTARSKVHSTA QDNTSGTDSTPDACVTCRLSLNSTAAYLYSLETAPALRRRGLASCFLIQLIRYLEREGIR RICLQVSGSNEPALHLYRKTGFRITETLSYYLY >gi|157101635|gb|DS480689.1| GENE 85 104108 - 104821 825 237 aa, chain - ## HITS:1 COG:CAC3340 KEGG:ns NR:ns ## COG: CAC3340 COG2357 # Protein_GI_number: 15896583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 30 228 17 214 217 228 59.0 7e-60 MNININPNSELHQSIVNVPDMVQVPEMWVDQARQFQQAMMRYTCAIREVKTKLEVLNDEL SVKNQRNPIEMIKSRVKKPKSIVEKLQRRGFEISLESMEKNLDDVAGIRIICSFLDDIYE VADMLIRQDDVKVIAVKDYIQNPKPNGYRSYHMIIEIPVFFSDSKKPIRVEVQIRTIAMD FWASLDHQLKYKKSFIDDNGEISEELKQCAEVIAGTDVKMLEIRKKIEAQGVTVRRD >gi|157101635|gb|DS480689.1| GENE 86 104864 - 105700 1118 278 aa, chain - ## HITS:1 COG:AF2382 KEGG:ns NR:ns ## COG: AF2382 COG0489 # Protein_GI_number: 11499959 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Archaeoglobus fulgidus # 35 271 17 252 254 236 46.0 2e-62 MSECNHDCGSCSANCDSRKADKSEFLEALNPASSVKKVIGVVSGKGGVGKSLVTSMLAVS MNRKGKKTAVLDADITGPSIPMAFGIGNEGVATSPDGKLMLPAKSMEGVEVMSANLLLDK DTDPVIWRGPVIAGAVKQFWSETLWQDVDYMFVDMPPGTGDVPLTVFQSLPVDGIIIVTS PQELVGMIVAKAVNMAKKMDIPIVGVVENMSYLECPDCGKRISVFGEGHVEEIAAEHGIK 
VLAQIPIDPAIAQMVDAGRVEYLEMPWFDEAVKAVEAL >gi|157101635|gb|DS480689.1| GENE 87 105800 - 106957 895 385 aa, chain - ## HITS:1 COG:no KEGG:Closa_2166 NR:ns ## KEGG: Closa_2166 # Name: not_defined # Def: FliB family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 375 1 383 386 405 53.0 1e-111 MIYTFPHYYNHFKCTASGCPDTCCAGWGIMIDSASLKKYRNMDGPFGSRLHNSIHWKEGS FKQYHGRCAFLNEENLCDIYSEAGPEYLCRTCRTYPRHIEEFEGCREMTLCLSCIEAATI ILGCEEPVRFLTREDERQETYEDFDFFLFTKLMDARELALDILRDRTWSWEDRTSMCLGL AHDLQARIRKGRLYEADGLIERYRRAAGRRRLPSGWDRASCSRYEVMEEAFSLFSEMEVL GKDWPAFVSGMKAALYGKGRQAYETRRRAFLESEIGKRLPVLKEQLMVYFVFTYFCGAVY NGNPYGKMKMAATATLLIEELAQALWTDQGGRLTFLDFADGAHRFSREVEHSDSNKAVLE EAVVRRPGFALRRLLAAVESDGSGE >gi|157101635|gb|DS480689.1| GENE 88 106968 - 107567 569 199 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20100 NR:ns ## KEGG: EUBELI_20100 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 6 191 7 192 200 284 67.0 2e-75 MIPDRERANKELKWAADLNPGPWKEHSLYVAAACESIASRCSHMDRDRAYVLGLLHDTGR YAGRTSERHLLDGYRRCMEHGWEEAARICMSHAFMIQDIGTSIGEFDVTPQDYEFMKEFI SGAVYDDYDRLVQLCDSLALPTGFCPLETRFVDVTIRYGVHPSTIPRWKRVLEIKDYFEE QMGCTIYEALPCGGRLQHL >gi|157101635|gb|DS480689.1| GENE 89 107835 - 110105 2397 756 aa, chain - ## HITS:1 COG:SMb20132 KEGG:ns NR:ns ## COG: SMb20132 COG1529 # Protein_GI_number: 16263880 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Sinorhizobium meliloti # 10 752 18 763 772 479 37.0 1e-134 MSCEIPKNGMIGKSVPVRDAAMKVTGQMRYVADMKLPGMLYAKILFSPVPHARIKSIDTT QAQALEGVHAVVCYKDSPKNRYNGNGEDKDINPSELVFDQTVRFVGDKVAAVAADTEEIA RKALSLISVEYEELPFYLDPEEAMKEDAYPIHQGGNIIMESNNSGGDVDGAIRDADHVFR HRYEMPAVNHCSLETHVSIAEYDADGKLTVWTPTQDAFGQRINLQRIFELPMSKIRVISP VMGGGFGGKIDLVTEPVTALLAIKTGRPVKMVYNRTEEFCCTRTRHAAKVDLTMAVKKDG TITAADETMILNAGAHTSATMSVCWAAGGKFYKLLKTRNMRCRGIPVYTNTPVASAMRGF GSPQQFFPVSSMMNEIAQSLRLDPSDVFLKNLADPFEAACNDGESFGNFRLKDCVVRGRE 
LIDWYEAKKQMEASRKEKGRYLIGVGMACGSHGTSMFPFMPDITGITIKMNEDGTIVFTT GTSDMGNGSVTTQCMLISEVLGIPLDHIAYIKTDTDAAMSDLGAYASRGTYTSGHAAVKT AQIMRDKIAVLAAELLDTTTDNLDFHDDAVWCVDIHRKFVSMKEIAVHAREKHQEDLSCS YMFHSEAGPMSSGAHFAKVQVDTVTGGIKVLEYAAVHDVGRVVNPMGITGQLEGGISMGL GYALCEEMKLDSTGRNQTSSFKKYHILNVKEMPKLSVDFLDSEEETGPYGAKSIGECAVV PVAPAIANAVSNAVGRQFYKLPIRPEDVLEALSLAK >gi|157101635|gb|DS480689.1| GENE 90 110108 - 110587 486 159 aa, chain - ## HITS:1 COG:SMb20131 KEGG:ns NR:ns ## COG: SMb20131 COG2080 # Protein_GI_number: 16263879 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sinorhizobium meliloti # 3 158 4 159 161 174 55.0 9e-44 MELVALNVNGRTYNIAVEKNWTLLYTLREVMDLTGTKCGCSTGDCGSCKVIIDGEAVNSC SVRAMDMGGKAIETIEGISPGHVSLHPIQQAFIDCGAVQCGFCTPGMVMAAKALLDKNPD PTEEEIKEAMKGNLCRCTGYVKILEAVKRAAQVMRKEGR >gi|157101635|gb|DS480689.1| GENE 91 110618 - 111934 1732 438 aa, chain - ## HITS:1 COG:CAC0872 KEGG:ns NR:ns ## COG: CAC0872 COG2233 # Protein_GI_number: 15894159 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Clostridium acetobutylicum # 1 434 11 435 435 185 29.0 2e-46 MKMQYGFDDKVPAGKAVPWALQYVFTIFTGSMTGSIMLASGAGMNSADTAFLIQCGLFAC CITTLIQSLGIRIGKFQIGARLPLVSAGSWTLITPMVLFANDPNIGIPGAFGAACLGTLL LFLLGPIVIDKLYKYFTPAVTGSVVLAVGMCLILNAWNDMVDYNPTGPDAMKMFLIGIAI AAICVIIDHFAKGIVQSLAVLIAMVAGYVFCSVTGMVDFSQVGSAAWVAVPRPLNFGMTV NLGAVITVFIIHIATIMENSGNTTGVVTATDEGLPTKETLKASVRGDALGGVFSTLFNAL PMCVAAQNSGVEVMSGVASRFVTAIAGVIFGLMAIFPKFSQVLALIPNPVLGGILLVTFG NIIASGIKVIGFDENNKRNFTIIALAAAVGIGGNFAQAAGTLAFLPSTVVTLFTGISGTA ITALVLNIILPGEKKEEA >gi|157101635|gb|DS480689.1| GENE 92 111987 - 112607 725 206 aa, chain - ## HITS:1 COG:MJ0731 KEGG:ns NR:ns ## COG: MJ0731 COG0655 # Protein_GI_number: 15668912 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanococcus jannaschii # 2 205 1 196 198 137 39.0 1e-32 
MVKILGICGSPRRKSCYKALETALEAAKNAEEGVEVELVELRAKKMNFCIHCNKCLRDKS DRCTVFKDDMTELYDKFYEADGVIIASPVYEMNITAQLCTFFDRFRSAWLKGVEDPDFFA HKVGAGITVGGTRNGGQEATMNSITNFFLTQGMTVCTGGSGMYSGPMLWNPGDGSTEIED PDGFRNCEILGRKMAKMSRIMKEARF >gi|157101635|gb|DS480689.1| GENE 93 112653 - 114917 2663 754 aa, chain - ## HITS:1 COG:SMb20132 KEGG:ns NR:ns ## COG: SMb20132 COG1529 # Protein_GI_number: 16263880 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Sinorhizobium meliloti # 2 754 13 766 772 452 35.0 1e-126 MDNEYTVIGKPRPIGDAALKVTGQKIYVGDMELPGMLYAKVLLSTVPHARIKSIDTSAAE ALPGVKAVATYKNTPQVRYNSAVRFIEHKLPDTERIFDDTVRFVGDRVAAVAAESPDIAK KAVNLIKVEYEALPVITDVEEAIKPEAWPIHEGGNIVGISRAEAGNVDEAFQECDYVFED RYTTQAIHHAAIEPHVAIADFAPNGKLTVYSPCQNTFAFRVIMSRIFGLSYNRIRMVAPA IGGAFGGKLEVTVEPVAAVLSQMTGKPVKVEYNRKESILSTRVRHASVNYVKTGFMKDGT LKAVDFKVYTNTGAYASSALNVSGAMSHKVFKAYKIDHMRFQCQPVYTNTEIAGAMRGYG SPQVYFGWQRQMQKIADFLHMDMAALQMKNMVDPDSCDPIFHKPHGNSRPKDCLKRALEL IDYEACLKEQEATRNQDIRIGVGLALGVHGNNCVGAHRDVSTPMLKMNEDGSCIYYTGSH DMGTDTLGMQMQIVSEVLGISMDRIDCLAADTDVVHWHIGDYSSRGVFVAGSAAKKTAEA MKRELQVEAAKLLETEPDDIELHHDRAWSRKNEEKNASLHDVMVHCQSVSMRELMVAETY EAKRGATSYGVHIAKVEVNTLTGEVRPLEYAAVHDIGRAINPLMLKGQLAGAIQMGLGYG LCEDMAYDGDGKPLVQTLKKYKVLRASQMPKLYMDFVETPQGEPDGPYGAKALGECPVVP AAPAVVNAICNAVRGEINELPARPDRVLAALGDK >gi|157101635|gb|DS480689.1| GENE 94 114907 - 115389 478 160 aa, chain - ## HITS:1 COG:SMb20131 KEGG:ns NR:ns ## COG: SMb20131 COG2080 # Protein_GI_number: 16263879 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sinorhizobium meliloti # 4 153 5 155 161 157 54.0 8e-39 METISLNINGVCRELAVGKNWTLLKTLRDGLRLTGTKCGCETGDCGACMVIIDGVATRSC LVKAVNLEGKAIETIEGLSDGIHLHPIQQAFVDAGAVQCGFCIPGMIMSAKALLDQNPDP REEEVRKAIDPNLCRCTGYDNIVKAILLAASRMGGNRNGQ >gi|157101635|gb|DS480689.1| GENE 95 115411 - 116340 1029 309 aa, chain - ## HITS:1 COG:BH2271 KEGG:ns NR:ns ## 
COG: BH2271 COG2355 # Protein_GI_number: 15614834 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent dipeptidase, microsomal dipeptidase homolog # Organism: Bacillus halodurans # 1 265 1 266 310 131 32.0 2e-30 MRLFDLHCDTIEELKKRGEDYLNCTTQFSLRDREKFQKAVQVMAVFVPDDIRGQEAVQFV LDYHQYMDDQVRRTGGQAELVEKITDLDRILDEGKWAFIRSIESGAALNGDLDNINRFAD LNFKMLGLVWNGANELGSGPNNDTGLTAFGKEAVKRMEQVGMIVDCSHLNDAGFEDLLNI AERPFVASHSNVRACCSHPRNLTDAQFKEIVRRGGLCGLNLFSRFIDDDDNGSKDYLLRH IYHMLELGGEDVIAWGGDMDGEITCDPDLNTPYGVGRFADYLVEHGIGEAAVQKIFFDNA YRFFKQQVK >gi|157101635|gb|DS480689.1| GENE 96 116400 - 116699 504 99 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1552 NR:ns ## KEGG: Sterm_1552 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 4 88 3 88 98 73 44.0 2e-12 MAMKKWAMVIMNAGYDPEKDMARLDLEQVETHILTVRNPEEAVALAKRLGEEGFGAIEVC GAFGEELAKKMYEATGCKVPVGYVTTPQDQFEKALAFWS >gi|157101635|gb|DS480689.1| GENE 97 116825 - 118015 1444 396 aa, chain - ## HITS:1 COG:ECs3745 KEGG:ns NR:ns ## COG: ECs3745 COG0624 # Protein_GI_number: 15832999 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli O157:H7 # 19 373 28 378 403 154 30.0 4e-37 MSMLNEKRAEELLALTTKMIQAPSYSGQENLVVDVMKKFCDEHKFTDIHVDRYGNCICHI KGSKPGPKILFDGHMDTVPVPDKTKWDHDPFGAEIVDGRMYGRGTSDMKGALSAMLAAAL YYAEDADYDFPGDIYVAGVVHEECFEGVAAREISAYVKPDYVIIGEASQLNMKIGQRGRA EIVVETFGVPAHSASPHKGVNAVYKMCKVIEEINKLTPPHHDVLGDGILELVDVKSSPYP GASVVPDYCRATYDRRLLTGETKESVLAPLQELLDGMMKEDPQLKAKVSYAVGQEKCWTG ETIEAERFFPGWLFDKNEDWVQNIYKEMKEIGLNPTITNYDFCTNASHYAGEAGIRSVGV GPSLETLAHTINEYIEIDQLNKVMESYYGVMKALLK >gi|157101635|gb|DS480689.1| GENE 98 118045 - 119244 1307 399 aa, chain - ## HITS:1 COG:MTH1213 KEGG:ns NR:ns ## COG: MTH1213 COG0167 # Protein_GI_number: 15679224 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Methanothermobacter 
thermautotrophicus # 12 301 5 302 303 110 30.0 6e-24 MLDKSKLTGADISIDFCGCHLQSPYILSSGPLCYGAEGMIKGHEAGAGAVVTKTIRLGAA INPVHHMGTVNQDSLINCEKWADSDRLNWYENEIPKTVAAGAVVIGSVGHTLKEAQAIVK DVERAGAHMIELVSYTEDTLLPMLDYTKEHVDIPVICKLSGNWPDTAATARRCLEHGANG ICAIDSIGPTLKIDIEKAAPEMMSGDGFGWMTGAAMRPIAMRYNYQIAKENPQLMNLYAS GGVMKADDAIEYMMAGAMGVGVCTAGILKGVEYVEKMCYDLSKRLAELGYSSIQEVNRAA HPNFPSEEHISGLKFHFTPFKEDGTTKKCVSCKKCETVCCYDARKLTFPEMTVDMDKCRC CGLCLDVCPTGALTAELAPQTEKDLELAQKSKDFYELVK >gi|157101635|gb|DS480689.1| GENE 99 119265 - 120461 1641 398 aa, chain - ## HITS:1 COG:STM1002 KEGG:ns NR:ns ## COG: STM1002 COG1171 # Protein_GI_number: 16764362 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Salmonella typhimurium LT2 # 33 395 35 401 404 380 50.0 1e-105 MSCEIKWALNKMPKTEDKNLPIMSVVEVKKARAFHESIPQYQVTPLADLKNMAEYLGLAK ACVKDESYRFGLNAFKVLGGSYAIARYIAKETKRDISQMPYSVLTSKELRDEFGQATFFS ATDGNHGRGVAWAANRLGQKCVIRMPVGSTENRRRHIEDEGAECTIEKVNYDDCVRMVAE EAKHAERGVVIQDTAWEGYEEIPSWIMQGYSTMADEAVEQFEERPTHVFVQAGVGSLAGG VVGYFANKYKKNPPVMVVVEAEPAACLYRGAEAADGEIRIVDGEMPTIMAGLCCGEPNIT SWDILKNHVTAFLAVNDDVARRGMRMLAAPFKGDPQVVSGESGAATFGALATIMKDESYR ELRLELGLNQDSKVLMFSTEGDTDPDRYKTIVWEGGLK >gi|157101635|gb|DS480689.1| GENE 100 120528 - 121019 615 163 aa, chain - ## HITS:1 COG:SSO2433 KEGG:ns NR:ns ## COG: SSO2433 COG2080 # Protein_GI_number: 15899181 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sulfolobus solfataricus # 2 156 11 164 171 165 51.0 3e-41 MEKMKISFTLNGKPCQVEVKPHQRLLDMLRDDLRLTGTKEGCGIGECGACTVIMDGKAVT ACLVPAPAVDGRNVVTIEGVGCDGKLDPVQEAVLENHALQCGFCTPGFIMSAKALLDENP DATREEIRRAISGNLCRCTGYEQLTDAIYQAAEELKKLHMEKL >gi|157101635|gb|DS480689.1| GENE 101 121022 - 121897 1033 291 aa, chain - ## HITS:1 COG:ECs3740 KEGG:ns NR:ns ## COG: ECs3740 COG1319 # Protein_GI_number: 15832994 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, 
middle subunit CoxM/CutM homologs # Organism: Escherichia coli O157:H7 # 6 286 7 283 292 124 29.0 3e-28 MIQYKYHRAASLDDVVKVLKEYGGKARIVAGGTDIMVQIHEKDKRWKDLQCLLDITYLDK ELRYISQDEEYVYIGPLSTHTDLEQSEIIAKHIPFLGWASATVGSPQIRNRGTLGGSIGN AFPASDPLPALIAADVLIKVYGQDGERVCLLKDFYEGKGSLKLEPGEFIREFVVKKLPEG TRMGFSKLGRRKALAISRLNCAAALTMDSEGTITEARIAPGCIFILPDRVEKAEQVLIGQ KPSEELFAKAGKAVSEVMIERTGVRWSTQYKEPAVQEIVLRALCQAAGMEV >gi|157101635|gb|DS480689.1| GENE 102 121914 - 124274 2381 786 aa, chain - ## HITS:1 COG:SMb20132 KEGG:ns NR:ns ## COG: SMb20132 COG1529 # Protein_GI_number: 16263880 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Sinorhizobium meliloti # 6 780 11 760 772 385 33.0 1e-106 MSLTKVGEKHFSVVNHCVPRIDGVDKVTGRARYAADIYMEGMLYAGVLRSPYSSARVVSI DAAKAKAIPGVEAVVTCFDMPRTRSWAGYMYLTDTIRYSGDCVAMVAAQSKELVDEALEA IHAEYEELPGVYTIEEALSEGAFAVHEKYPDNIFKDSVFHIRKGDPDTAFEKADIVLERE YRTQYIEHSYIEPEACICYLNPNDGAMTVHSASQNPFFTRRYVADVLGVPMNQVRMVQET LGGTFGGKEEGAGLVAGRCAYLCSLTKKPVKFVFNREDSFLESAKRHPFRLRYKAGITKD GRIVAWQGEQVDNCGAYNNQTQFMNWRANVHSAGPYEIDNIKTDTYGVFTNNVHSGAMRG YSSPQLIWAQEQFIEELAMACGMDELEFRRKNLLHDGALTATGVPVEHCVIQDIMDATAE QTGYKEKHAAYLNQPEGSRKRRGIGLAVCHRGSGLGAESPDASGCMMICNEDGSVMINSG LAENGQGLKTAYAQIAAEALGVSYEAIRFFGTDTHSIPDCGMTVASRGTVMGAQSVRKTG EKLKGVMIQNALELHSLPLEEIEKAYGLKKGTLDYSKLTADQVELLDSQFYLKKYPDVKV PFSGVCNACLWTGKQMAAFEWFKPQDLIQDHETGQGKAFPTFAYGCIVAEVEVDMKTGYV DVLEVTASHDVGTAINPALVKGQIYGGIVMGQGFAVTEDVVTGKRGQKTKNFDSYILPTS MDAPKMNINIYENQDPAGTYGAKSIGEPATEGVGAAIANAIFNATGRRVYENPADLETVL LGHKLQ >gi|157101635|gb|DS480689.1| GENE 103 124265 - 125611 1373 448 aa, chain - ## HITS:1 COG:MTH1505 KEGG:ns NR:ns ## COG: MTH1505 COG0402 # Protein_GI_number: 15679502 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Methanothermobacter thermautotrophicus # 6 429 3 414 427 201 33.0 3e-51 
MIKNGNLAAAIIKPDYVLTPEGLKKGWGVRIEEGRIACVGPWDTLGSGSNEEVQTVCLPG QMLLPGFVNGHNHMYGVLSHGITAEAMVTDFSDYLEDFWWPYVEDRVDHDLARITTRWAC VEMIDSGVTSFVDILEGPNSIPGALAVEKEEVEKAGLRGFLSFEACERKSGENGRLGLKE NFDFAKACNEENGLVKGVMSIHTLFTCGEDFIREAKRMADEAGCLLHMHMSESDLEPAWA REHLKTTPVEAYDGMGCLDENVLASQLVQVTDKEIAILAEKDVNAISMPLSNCEVGGGIA PIEKMLEAGMTVGLGTDGYVNNFFEVMRGAFLIHKGYHKDPQAMPARKVYRMATELGAKA VGIEAGVIKEGMLADLITVDVARPTPINEYNVYDQLVLFTNPQNVINVMVGGAWLKRDGK LVTLDKEAVRREMEEKTERFWKGDSECR >gi|157101635|gb|DS480689.1| GENE 104 125615 - 127039 1748 474 aa, chain - ## HITS:1 COG:MA1276 KEGG:ns NR:ns ## COG: MA1276 COG0402 # Protein_GI_number: 20090140 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Methanosarcina acetivorans str.C2A # 1 454 11 441 442 263 35.0 5e-70 MISHLFKNGIIVTVNPDREIFFHGAVAVKDDRIVEVGPTEAMEAKYTDCERVTDLEGRVM FPGFVNTHNHLFQTLLRGLGDDMVLKDWLETMTFPAATNLTPDDCYHGAMLGLMEGIHSG ITTNVDYMYPHPREGLDDGVIKAMRELGIRGIFGRGCMDTGIQYGVHPGITQQKDDIEKG VRDIFERYHNCDNGRIKIWVAPAAMWSNTRETLQMLWKVTNEYKSGITIHISETEFDREA AKGIHGLWDIDAMIDMGICGPNVLMVHCVHLTDEDIEKARKYDLKISHNVCSNMYLSSGV APIPKLLKAGVTCSLGVDGAASNNANDMIELMKNTALLQKCATRDPLSMSAEKVVEMATI DGARAIGMEKEIGSIEAGKKADMVIFDPYECVKAVPLHNPCSTLVYSASLKNITDVYVDG RAVMEKGVILTVEDEKTELRAAQKAAEELCVRGNITNRLEGHKWNDTYRKYERQ >gi|157101635|gb|DS480689.1| GENE 105 127133 - 128572 352 479 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 19 454 6 430 447 140 25 6e-32 MNKNASKSTSTIKTGSIYELNGRPPFAQAFPLGFQQLLAMFVGNIVPMILVANASGMDTS HATLLLQCSILGAGVATLLQVFPIRFGNIQFGSGLPVMMGLTYTFLPICISVSVNYGLGV LFGAQLIGGLVSIAIGFALPKIRRFFPPVVTGTIITSIGISCFPIAAYNLAGGQGDPSMG QLHNFVIGLIVILAILILNGYGKGMVSAAAVLGGMIVGYVIAAIFGYINFSPISEAAWFA VPRPMAFGKLEFHLNFVLVFILLFFINAVEMSGDFTVSATGGLNRQPKNEELRGGIIANG IACIFSSFFNCFATGTYSQCSGIVALTKVCNRWVMGWGAITLTAAAFCPKLASVLSTIPS CVIGGATIVVFSMICMSGMSLVARARFTNRAMLICGPALALGLGISLAKDTLSGMGEYVQ 
MFFGESSIILVAGFAIILNLILPKDQTDKEVEAEYIREISQTEDDAVSAKDQELGTARA >gi|157101635|gb|DS480689.1| GENE 106 128739 - 130112 1010 457 aa, chain - ## HITS:1 COG:AGc4328 KEGG:ns NR:ns ## COG: AGc4328 COG0044 # Protein_GI_number: 15889657 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 453 26 475 506 346 41.0 6e-95 MLLKNGIVVSGSGCTRQDIRMDEGRILQTGAGLNPLEGEEVMDVSGCLVAPGGIDAHTHF DMPCGSIMTSDDFETGTRAAVAGGTTTVIDFSEPDHGASLQSGLDRWHEKADGRSFTDYS FHMTVARYDEGIEEEILSMIRQGVTSFKAYTAYKGDLGVEDRDMYRLLSLMKKHGVLLMV HCENGDILDVRREELAAGHPADISLHPMSRPNEVEHEAVSRMIDMARLLTVPVYIVHTST RQALEEIEEAKEEGVRVYCETCPQYLFLTEEKYHLPGFEGAKYVCSPPLRSSRDQEALWR GLKKGIVDTVSTDHCSFNYKGQKDLGLGDFRLIPNGLPGVENRLELMYSQAESHGLTYSD IARMTAENPAKIFGLYPRKGVIQPGSDADLVVIRPDSSHVIDAGTQKQYVDYNPYQGMVV SHKVQHVFLRGQKIIEDGAFKTDVPAGNYLLRNTLVR >gi|157101635|gb|DS480689.1| GENE 107 130640 - 131386 171 248 aa, chain - ## HITS:1 COG:TM1073 KEGG:ns NR:ns ## COG: TM1073 COG1070 # Protein_GI_number: 15643831 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Thermotoga maritima # 1 230 245 470 476 137 33.0 2e-32 MFISSGTWSLIGMETRNPVISREGWEFNASNSSMPLHSNMFKKIVTGMWIVQMCHKRWET YTFEDIVRLATQAKETNLYIDPDHIDFYRPSDMPLAISRNIERRYGIKINPYLPGPIAKI VFESLALKYCLTIRRLILLSGKNIKKVYVLGGGSKNALLNQYTANALGLPVLTGVTEASS VGNLLCQLYGEGLLKSRKDVKDILKRTYPLTTYMPEDTGKWKEKFEDFMRDAYEPLNVCS GRKFWKEF >gi|157101635|gb|DS480689.1| GENE 108 132185 - 132991 329 268 aa, chain - ## HITS:1 COG:PM1364 KEGG:ns NR:ns ## COG: PM1364 COG0235 # Protein_GI_number: 15603229 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Pasteurella multocida # 6 212 2 207 210 106 32.0 6e-23 MKNYPSEKEARELICEIGHRMWSKQMVAANDGNITVRVGENTVLITPTGVSKGELTPDML FKIDLDGNILESAEGYSPTSETAMHLMVYKHNKDAMSTCHAHCMYLSTFACAGIELDMAT 
APEPTLITGKIPVAPYACPGTSDLAQSIVPFLNDHKVILLANHGPISWGTSPKDAWYCLE SAEAFAKSSLILKYIIKDFRPLSNEQLDILEKAFHPISALNKVRTGRETVNLEKGKSLQD IEVDSVTLSDDCIEKLADMVVKRMTSIS >gi|157101635|gb|DS480689.1| GENE 109 132993 - 133919 594 308 aa, chain - ## HITS:1 COG:BH3731 KEGG:ns NR:ns ## COG: BH3731 COG1172 # Protein_GI_number: 15616293 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus halodurans # 1 286 23 308 314 188 44.0 1e-47 MIILAACINKSFLTLENLTNVLRQASMNGMIAVGMTMVIICGSIDLSVGTMIGLCGYIAL YFSNYSLILAVIVPLLTGVIVGTINGFLINKIKIAPFIATLATMMAVKGTTLVLTNENTY FASNSIEQFQYIGRGLLAPYVTVPSVMFICCAVAGAYILRHTMLGRNMYAVGGNTEAARM MGVNVEKTLMFSHIFTGGMAALAGITLVSRVGAAYPLAGEGKELDVIAACVIGGTALTGG KGHISGTFIGVLIMGMLTNIFNMQSLLNPFWEKVIIGILVLIVVLIQSASESGVTFGKNI KTLMKGKM >gi|157101635|gb|DS480689.1| GENE 110 134006 - 134950 663 314 aa, chain - ## HITS:1 COG:VCA0129 KEGG:ns NR:ns ## COG: VCA0129 COG1172 # Protein_GI_number: 15600900 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Vibrio cholerae # 2 308 21 330 332 151 38.0 1e-36 MKKNPSQNLMRQYGALIALIILVVINCFTTKNFISVMTLWNLFTQATTVMILGLGMTIVI ATGGINISVGSVLALSSMVLAKFILNGHIVMGVIASLAVAALTGAVTGIIVTKFKVQPMI TTLAMMYTLRGIAKLMNGGTRLSYKNAAFSRFAYIKIGGCIPIQFAFIVILSLAIFILMR KTYFGNYVEAYGDNALATRISGINVVAVVTMCYVLCNVLACVAAIIETSSITCADPVNMG LNKETDAIAAVVVGGTPMSGGKPNIGGMLCGALVLQLITIMINMNNIPYSYSLIIKAAII VVSLYVQNCRDKKV >gi|157101635|gb|DS480689.1| GENE 111 134964 - 136466 191 500 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 273 477 17 217 245 78 28 3e-13 MEEEYILEMHGIVKEFPGVRALKGVDFNLKPGEVHTIMGENGAGKSTLIKILTGVYEKDG GNIVFNGREVHFHNTFDAQKTGISTVYQELNMIPYLSVGENIYLGRYPRTKAGIDWKSLH KNAQKLLDDLGLDIESRLPLNQYGTASQQMVSIVRAVSLNCQVVVLDEPTSSLDSKEVRM 
LFNIIKQLKEKKLGIVFISHRLNEVYEISDRITILKDGAYEGTYFPEQLTEFQLIEKMVG RDVREIKGNNRVYDGNGEPFIVMLKDIVRHPKLSGVSIDVKKGEIVGLTGLLGSGRTETA KVLFGYDIPDSGEIVIDEKRVVLKAPKDGLKNGLAFVTENRREEGVIPNMTVRENISISS LPQICRRGFINRKRQEALAEEYIRRFKIKTPSMEQQLKNLSGGNQQKVLLARWIATKPKL IILDEPTRGIDVGAKKEVEDLIKEIADTGISVLLISSELSELVRGCDRIFVLRDGTVRGM LNRDEISEENIMKWIANSPA >gi|157101635|gb|DS480689.1| GENE 112 136575 - 137651 766 358 aa, chain - ## HITS:1 COG:SMb21345 KEGG:ns NR:ns ## COG: SMb21345 COG1879 # Protein_GI_number: 16264669 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 71 358 39 327 327 217 41.0 2e-56 MRKKGLAMLMAGIMIVSSCVMGCAKDDSNDIPAKTSAAASETEAQSEKAGTDTTQKVSGE IKGDINGDGIFKVGFSELAIDGAWRVAQVDSMQKEAESRGYEFVMSNAELDTAKQISDVE DLLTQDLDFLFIAPIDMEAIMPAIEAAKAKGVPTILLDREANGTPGVDYICTILSNYIWQ GEACADWLSENGGADSYKIVQITGKVGGSDVRDRQAGFETGVKKYENMEIVATQSGEWSR TEAQKVMQNIIQSTGGDFNVVYCHNDEMALGVVLALKAAGMNPGTDVKVIAIDGQAEAVE AIIAGEMNCIATCNPRFGPVAFDTMEKYLNEEKLEHIINNEEYIIDSTNAEEKLPDAF >gi|157101635|gb|DS480689.1| GENE 113 137913 - 138380 268 155 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 1 154 436 592 602 86 36.0 2e-17 MVTLEKELDHAKIYVDLQMIRYRDEISVEIVNDLDEQLMKTVIIPKFTLQPLIENAITHG LRDSGRKGFIRIHLMKSGDIGIIEIYDNGIGIAQENLKILRQALEKGTPVSKEIGGYGIV NVSQRIHYCYGDSYQVAIASKEGAYTSVKVIFPIK >gi|157101635|gb|DS480689.1| GENE 114 138516 - 139652 580 378 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938716|ref|ZP_02086068.1| ## NR: gi|160938716|ref|ZP_02086068.1| hypothetical protein CLOBOL_03611 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03611 [Clostridium bolteae ATCC BAA-613] # 1 378 1 378 378 704 100.0 0 MNRIKTKKKENIVQVLRRMYLLIAIIPMIVMTIITAVFYVNSLIDSTKQNLQTVLNLYAD 
NIDGTISQGIKVIDIVQNDLSVQNALRVSRFEEKEEYYIQRTTINTTMMLINQNYGNTLD GIYILLDDGRCFKSSVFPYNKEKYSQQQWYKDIRDTQGIQGNGFYKYSRVIANPGKEGYI SIGEPIINYRNGEIVGVILVEINLNTFSEMLSKTVTTQDLNVSITEEDGTPVWQYDLGEI RKKGNILKTHLDESIHRELSNGWEISMTTNWFNKIKTEVLNTLTILFILIIMVTFASIYF GRRFADTINHPLKQLQDGMNRNREIWRGEYIAVETEYHELDDLNEGVNDLISKVNQMFEE VKNKERAMRKVQFSALQA >gi|157101635|gb|DS480689.1| GENE 115 139649 - 140869 537 406 aa, chain - ## HITS:1 COG:SP0156 KEGG:ns NR:ns ## COG: SP0156 COG4753 # Protein_GI_number: 15900094 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 1 402 1 422 428 112 24.0 2e-24 MLRMVIADDETFILDCLSKYIPWRNYGIEVVATAKDGVEALNFVKEKRPDILIADIKMPL LDGISLLERVKEVKSNIEVIIISGFQEFEYARKALKHGCMGYVLKPIEPEELVTYVQKAA DKILKSREEQIMIQKVMRPEIYINGMINGTLTSEECAYYMEQYQQFSEKTLRMGILLFNG FSSETGGESKHPYKRLMDDIEDMVQKSDSFFIVEKGFDNVVLLFHAQTKEEILQTEEMVS KHVTVLNHIQNSIKVTFIRAGDFSSIDTVHEQYWRLVWDSLKEEDESEKEKVDSGTQGTK DLIQEAKLFIEKNFKDSELCLNKVAVNLGVSPGYLSSLFSTKCDCGLTSFINRVRINYAK LMLEKDDDKIEYIAESSGYENATYFSTVFKRETGLPPSEYRRNTRK >gi|157101635|gb|DS480689.1| GENE 116 141001 - 141543 422 180 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938718|ref|ZP_02086070.1| ## NR: gi|160938718|ref|ZP_02086070.1| hypothetical protein CLOBOL_03613 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03613 [Clostridium bolteae ATCC BAA-613] # 1 180 1 180 180 350 100.0 2e-95 MNEAAKWMMAGMFLLSFHIDFMGIPIVPGIVGCLVFRSGFRMLEREGDGEGFEWNRKMEQ TGRLWLGITILDLILAVFMRQSTGISIQITAVLMLAETACGCSTLNHYGELGRQYTQNCK KGPSSLGYLIWMTAGLASYEYSVVLGSSGWNTAGAVCVIIGRIMLVKELAVWAPGEKTEY >gi|157101635|gb|DS480689.1| GENE 117 141540 - 142514 649 324 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938720|ref|ZP_02086072.1| ## NR: gi|160938720|ref|ZP_02086072.1| hypothetical protein CLOBOL_03615 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03615 [Clostridium bolteae ATCC BAA-613] # 1 324 1 324 324 629 100.0 1e-178 
MDRPKDRFIHQSAFLFYFFMTAGVLIVAAGSLLQHYFALRHPVFIPMHIELEMYWDGGMD GDTMGSTDGNADVDAGSAAGEDRYVWKEFPIYYIQDSEDTSKVAEITFPQLEGIGEYWMT PYDGAGGLFQSYSGSVQDGRYQFKQLFVKMMMPARSLGARGITLDTIKVRYGDGAEEEVS IGQIILTLDKPMEYVTVTGGSSSSDGTSMTRYLVQRDCRLEGFEIQNPEQVRGEFKVTVD GTDIWEWEPAAMTEGDTFLVESTWRGTGDIREQFEPYTALIQWKCSDGSGGESWFTTNLD RQNNRNTDRFKQTYRYLKERGIKP >gi|157101635|gb|DS480689.1| GENE 118 142724 - 143692 619 322 aa, chain + ## HITS:1 COG:ECs2903 KEGG:ns NR:ns ## COG: ECs2903 COG0524 # Protein_GI_number: 15832157 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 5 310 20 305 321 150 29.0 2e-36 MKQALIIGSTVADVTVRLPHLPVTGEDVIVESQSMSLGGCAYNVSEILRQSGVPYTLFSP VGTGIYGNYVREHLKRKQIPILIPAPAADNGCCYCFVEESGERTFAALHGAEYRFERQWF SLAESLSVGSVYVCGLELEEDTGGFILDYLEEHPEYTVYFAPGPRISFIPQSSLRRMFSL SPILHLNRDEALEYTGTTGSPENKKPWKKADLPCREYRLEACLPEAASRLTALTRNHVII TLGAHGAFHMEPSGKHALIPGCPARQVDATGAGDSHMGAVIACRMKGFPMDKAIRTANRI SAAVVEHAGAGLDDEDFLRLNA >gi|157101635|gb|DS480689.1| GENE 119 143878 - 145278 1196 466 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 272 463 529 724 744 189 45.0 9e-48 MAGTTAAQAASWFRGEMFEEEKNMMTRDWNARGTKACVRPVLRCLFLLMIFTVFMGITAF AAAEKIDTVKLSFTCDPVPKAGEAPGTVTAKTGSREFTVKSTEYTNDVEVWSLGDKPRVR VVLEAADGYRFSYTTSSHFKLSGRGADFIKAKISDSGSTLNLECYLSQADGKPGKIQGLD WSGFRARWDKAEEDIKHYEVRLYRNRNLVTTVTASGTSYDFRNEITRAGDYTFRVRTIAR YADRAGDWSEYSDENTVTERQVSSNGNGSWIQNQYGWWYRYYDGGYPSNTWEKINNTWYY FNRDGYMLNGWQQISGSWYYLGANGAMTRGWQSVNGRWYYMNGDGVMLTGWQYLGGRWYF LDGSGAMLTGWQFINGRLYFLDSSGAMLTGWQFINGAWYFLDGSGAMLTGWQFINGHWYY MNDSGIMLTGWQFINGAWYYLDGSGAMYADQTTPDGYYVDGSGKRN >gi|157101635|gb|DS480689.1| GENE 120 145343 - 146416 572 357 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938723|ref|ZP_02086075.1| ## NR: gi|160938723|ref|ZP_02086075.1| hypothetical 
protein CLOBOL_03618 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03618 [Clostridium bolteae ATCC BAA-613] # 1 357 1 357 357 693 100.0 0 MKKWIILILIAALWGFGTILAWTMRDAGCHVFMYFGEGDAVEKQLITHALDSEQEQGRNM LPEITAWSRAKRLKVMNPGLGRSGEADCIAVYGLMEMASYSRLIGGTYGYRSDQDGCLIS SGLAMELFGGLDVAGKYIWCRQKPYKIRGVTDDSSHVILLPAKDKDAMRYMMFTYARSGE GRTGEEGTSHGMETGKAAAVNFLYRNEIKSGKVLVDGAYVSAAAGLGACLPLWITSVWAL SVLMGRRGAKAGAKMRAKTRAKAFILAAVFFLAGLFLAWKLELKIPPDLIPSRWSDFEFW SGKIEELRSDMAAVGEAGDVWWLAQMKSRLLLCLFCSFAGTAGGVLWGVKCTGRPGE >gi|157101635|gb|DS480689.1| GENE 121 146413 - 148095 1790 560 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938724|ref|ZP_02086076.1| ## NR: gi|160938724|ref|ZP_02086076.1| hypothetical protein CLOBOL_03619 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03619 [Clostridium bolteae ATCC BAA-613] # 1 560 1 560 560 795 100.0 0 MNVKKAAAIFLALMAALTFLSRAMDSFTVIRVNTGFGKQDVVLYTIQGEGELTAGRTVYI SLPENMQVEEIAARPGQSVKAGDTVLTLQMEGLEEERDALSLEYKKAELALKQEQMSLAS VPRVTEETLALQQLAAAQRALELGNQDFAEAKEDHEQASIELEHDYVQKKNRTREQVKED NRKAMKSARRSYESAQKSRDSAVRKAEREVEDKQKKLDRLEEQGGSDEELERAELELERA GEDLEDIREEEDLKVEEARAKMYAAEEDYEDVDYGERENQDDLLKAYEDALKAEDEKLKE AGRKVQDLEEVLYQAMEKVENARVSDAGTAAGEAAGREMSGLKQESMKLDMEEIQKKLGK IEQFIDNKGQIKAPVDGVVVDTGLQAGDRIQDGHQLRLAVGGLEMRAQIDKETAGAGLLK KGVMMQVKMAGQSKNVETEVESVDQMAEGGKIQVTAWIQEGEGRLGDLVSFTVNMESGAY PCVIPIEALREDNKGYYCLAAEPEKTILGDEQKAVRIQVDVLEKSSSAAAVSGPVTKDMK LITESSKPVSEGDRVRVVEE >gi|157101635|gb|DS480689.1| GENE 122 148102 - 148944 520 280 aa, chain - ## HITS:1 COG:CAC0666 KEGG:ns NR:ns ## COG: CAC0666 COG0395 # Protein_GI_number: 15893954 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Clostridium acetobutylicum # 8 273 13 269 275 175 37.0 7e-44 MIKRILIHAVLVLLCLFIWLPLLLMAGSALMSEGEMLERFGAVFGMGNAPIKAAFIPSYP TLSPFVELLLDSPGFYVMFWNSCLQTGLVLAGQLLIALPGAWAFAHFCFPGKRALFLFYI ILMMQPFQVTMVPSYLVLKRFHLLNTHLAVILPGIFSAFPVFIMTKFFASIPTPLIEAAR 
LDGAGDFSIFMKVGIPVGRPGIISMLVLGFLEYWNAIEQPMTFLKEQRLWPLSLYLPDIT ADKAGAAWSASVIMMVLPVLLFLMGQETLEQGIAASGIKE >gi|157101635|gb|DS480689.1| GENE 123 148941 - 149711 544 256 aa, chain - ## HITS:1 COG:CAC0665 KEGG:ns NR:ns ## COG: CAC0665 COG1175 # Protein_GI_number: 15893953 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Clostridium acetobutylicum # 2 235 32 266 289 211 45.0 7e-55 MLIPFVNVVVRSFFQAVGGQFVGMENFIQVFHSQAFRLAVKNTLHFILVCIPVLMAFSLW VSILMTGAADSRSLYKTTILLPMAIPVASVVLLWKLLFYPQGILDQMVVWAGGSSQDWLN QGSAFYILVLTYIWKNTGYDMMLWLAGLDGIPKELFEAAKVDGAGSWQSFFHIALPGLKS TAFLVLVLSVINSFKVFREAYLISGDYPHESIYMLQHLFNHWFVSLDIQKMSAASVMVEV SIMIPVLLWTWKRKKR >gi|157101635|gb|DS480689.1| GENE 124 149815 - 152076 1804 753 aa, chain - ## HITS:1 COG:no KEGG:Clole_2648 NR:ns ## KEGG: Clole_2648 # Name: not_defined # Def: extracellular solute-binding protein family 1 # Organism: C.lentocellum # Pathway: not_defined # 270 753 304 776 776 170 27.0 2e-40 MKNKLMSLCLAVSILSGCLWGCAAPSGTAGGSTGEQQTKSKTQEEKQQEGPMGRYAESSI SISLEEGETAADIIQTGDQVLELYTVKDKKASRYIWTGEKWEKQDNSLLEGLEFPYGTLH MIWGEDKNRYVLYQGGDDYKTYMMKLTEGQEPMMLLDPVFSVKNDQGYYDVRPDFAAVSE EGLIFLSHSRVTDVYTPEGELVLSLPQQWSSMEWKGTGLLKGNRYITYSESSYISYDISG MSASAKEEIPFQSPDFDMWAPMASDGSGGIYIANPRGIHHMNQGGSLWETVADGTLNSLS LPSANLRKLFAGNQNDFYVWMSQDDKEELKHYTYDPQMPSVPTQTLTVYGLNLEQTDTIH QAASMFQLEHPDVRVELIDGQITSGSTTVSDTIRALNTELLGGNGADLLVLDGLPAESYI EKGILEDMKDFLSPMIASGELTEQVSKPYTEESGSIYQIPTRMTLLAAYGDSQAAASLVS MEAMRAYQEDPSHLPLRPKTNYESLLRQILMLRYDEVVDMRTGKPYPDKIKELLETVKVL GEANGAKSAFDKSEDGGRGNVYNLMPGLDGLLGSEYSRVDQGMSAVAIDKIDGMYNILLP LAVQRKHGFTMENVNASYLPSGTMGINQACQNKEMAREFILFVLGRDVQSKDLSDGLPVN TLAVRDWVDREDNKDTSVAVSGDDGYELTGSWPTKEERQLVFDVAAKADHPIRMDRVLTD IIINETKGYFEGRLSLEQAAQNAQNKAMLYFSE >gi|157101635|gb|DS480689.1| GENE 125 152230 - 152916 795 228 aa, chain + ## HITS:1 COG:slr1584 KEGG:ns NR:ns ## COG: slr1584 COG0745 # Protein_GI_number: 16332195 # Func_class: T Signal transduction mechanisms; K 
Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Synechocystis # 1 221 1 221 234 165 43.0 6e-41 MRILLVEDDKELCDAIKLQLERAGYETDCCGQGQEAFYYASEYPYDVIILDRMLPQMDGL GVLGGLRRMNITTPVIITTALDGINDRIDGLDAGADDYLVKPYAAGELLARIRALTRRPG NFKQSPSAACSNISLDPEKRELTGPLGTLQLSKRESALMEYLIRNKGQVLPRGLILSYIW GPDSDVEEGNLDNYIYFLRRRLKSIGALPQIKTVHGIGYRLEIGGTHD >gi|157101635|gb|DS480689.1| GENE 126 152909 - 154249 736 446 aa, chain + ## HITS:1 COG:slr0640 KEGG:ns NR:ns ## COG: slr0640 COG0642 # Protein_GI_number: 16331561 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 185 437 171 436 441 136 32.0 8e-32 MIKQLRRQLTALFTITTGLILTLVVIGILVISAREFNKKTLDSFQNQILNITSRLQSGST VSCTWLSRLESDNRLIIHIEDNGRPLLYHGSWTPDTRRDTLIMRAREAAMAERIDPSSRP VSSSLLRSSVFTVTGDSNDTYLGSVLVLPTQTGFQSLTLLASKDSMGFGLFRQGILFLLL NLAGIAALMCVSWILVGKSLKPLAESQKQQNEFIAAASHELRAPLAVIRSSICAARTAPD QREKFMANIDRECSRMSCLVGDMLLLASADTGQWSLRTSSLDMDTLLISIYERFEPLYQD KGVCLKLDLPEASLPRIYGDENRLEQIFAVLLDNALRYTPKGRSVTVAASVQTDKHLLSR SRSIVCLTVSDQGSGMDDETKKHIFNRFYRGDSARSHKQNYGLGLSIAKELVQLHKGTIS VSDSPDGGACFLVRLPAVPESHASSR >gi|157101635|gb|DS480689.1| GENE 127 154231 - 155295 958 354 aa, chain - ## HITS:1 COG:VC1364 KEGG:ns NR:ns ## COG: VC1364 COG0561 # Protein_GI_number: 15641376 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Vibrio cholerae # 84 352 2 267 273 102 29.0 1e-21 MKRSLPANKGQSYFRTIILPYNQYNTISSIRLEEKLKNSYDSMRKGNGKKHEGSCEFVID TLGNGVYSLGEKNGNERGTFMSHYELIGFDMDGTILNSQKTISHRTLEAVNRAARMGKRV ILSTGRCISELEEFRDVLANVSHYVCESGALIYDAVNQSILHSETLPRDLVEQVLETAAH EDVMVYMMSGGQALANASQVADTSHYHMGVYQEMMDRVVNKTDDIGALYRKHPFPVEKLN LFSASADIRERLHSRICGLPLAIAFAEGASLELSPRNVSKASGLVWLCGHLDIPLHKTII VGDADNDAQVLRIAGLSVAMGNALPHIKELCDVVVADNDHDGCAEAVDRYLLEA >gi|157101635|gb|DS480689.1| GENE 128 155294 - 156082 319 262 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|160938730|ref|ZP_02086082.1| ## NR: gi|160938730|ref|ZP_02086082.1| hypothetical protein CLOBOL_03625 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03625 [Clostridium bolteae ATCC BAA-613] # 1 262 1 262 262 526 100.0 1e-148 MKEKHRLIPVLIFAFILTVLCTAAVSRWYLLHQKGGTYNSILLGKLVSSDGRKLVLQGNI TNPQGRNGQYAITYSPGITAADKTKDSHITPPTAIPQLLVTYGPDKTEIPSFSGGYQLYT PDDEGTAEAVIACGAEPLSYLVGKVPDIPYVKLGTDISLDFREGTVPDSVTVQDLILRGD GSPKYTEQVTEEFTLACSGSAACFTLDINSAALLSSDMSSYEEAGILRGFRVNCRWDDNS TAEYGFVIRTDASSTEIRNQAD >gi|157101635|gb|DS480689.1| GENE 129 156033 - 157877 1427 614 aa, chain - ## HITS:1 COG:DR0933 KEGG:ns NR:ns ## COG: DR0933 COG0366 # Protein_GI_number: 15805957 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Deinococcus radiodurans # 7 570 23 595 644 401 40.0 1e-111 MKRDNIYRKRLDKHMDELKWLYMELYNSQDMFDGLLKGMEGFYKERAGCLRALDAAREED PCWYRKSSMLGMMMYADHFAGSLKGVEGKLDYIAQCHVNYLHLMPLLDTPRGRSDGGYAV SDFRKVREDLGTMEDLEQLAHECHDRGISLCLDFVMNHTSEDHGWAKRARAGEKEYQDRY FFFDSWTVPSMFEETVPQVFPATAPGNFTCLPETGQYVMTTFYPYQWDLNYRNPVVFNEM MFNFLYLANRGVDVLRIDAVPYIWKELGTDCRNLPRVHTIVRMMRVIGEIVCPGILLLGE VVMEPEKVAPYFGTVEKPECHMLYNVTTMAAIWNTAATKDVRLLKSQMETVSSLPSACTF LNYLRCHDDIGWGLDYPLLERWGMRQVPHKKFLNDFFTGRIPESFSRGVLYNDDPATGDA RFCGTTASMCGVEKAGFCHDRQAMEEAVRLDTMLHAFMLFQSGIPVLYSGDEVGQVNDYT YKDNPEKAPDSRYIHRGEFQWDLVERINEPETVQNRIFHNLEQLEAIRAGEPLLDAGCPF RTLETWDNSVLGMVRQDGRGKKLIGLFNFSGERRTAWVDERDGMYGDMVTGDLMEASGVS LPGYGFLWMKRQCG >gi|157101635|gb|DS480689.1| GENE 130 157883 - 160084 1866 733 aa, chain - ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 1 733 1 743 748 589 41.0 1e-168 MAIQYQENGRIFTLHTDHSTYQMKADSYGNLLHLYYGDRTEGSMEYLLTFGDRGFSPNPY EAGDDRTYSLDALPQEYPYQGSGDFRSPALAVEQADKSYGLNLRYGGYEISKGKYSLPGL PSMYGEGLEDAESLKIHLTDQVSGLAVTLLYGVFPGYDIITRAVCVKNEGNGPAALKRVY 
SACLDFLHGEYDLIQFYGRHAMERNFQRTALCHGAQVIGSRRGTSSHQYNPFLILAGRET TENHGPCYSMSFVYSGAFKAEAETDQYGQTRMSMGLQDEMFSYELKPGEIFFGPEVIMGY SSHGLGTLSRSIHKAMRRSLCRGKYKTIPRPVLINNWEATYFDFTGEKILDIARQAKELG VEMMVLDDGWFGKRDSDCSGLGDWQVNVKKLGGSLGSLVSRINGLGMKFGIWMEPEMVSE DSSLYREHPDWAFAVPGRKPVRGRSQLVLDFSRKEVADYIFDSICRVLDSAHIEYLKWDM NRSIADVYSAAGKMAGGDSPGSILYRYMLGLYDFLERLTKRYPDLLIEGCSGGGGRFDAG MLYYTPQIWCSDNTDAVDRIRIQYGTSFGYPISAVGSHVSAVPNHQTGRITPMKTRGVVA MAGSFGYELNLSAISEEDKNCVREQIDAYHKYWHLIHDGAYYRLTNPFEDREAAAWLFVS EDKGEALLNTVALETHGNPLSLYIPMEGLDPEAMYRDEDTGTEYPGAGLMSAGIPIPSGT GEYQAFQVHLRRI >gi|157101635|gb|DS480689.1| GENE 131 160139 - 160972 994 277 aa, chain - ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 10 277 14 281 282 192 38.0 5e-49 MKSLETNRSINTALVYAALGAGALLMIFPFIWMVFTSFKSAGESVQIPPTILPKQWMKEN YLRAIASLPFMKLYVNTALLILFRVLCAVAFSSAAGFAFAKLKFRGKNLLFSLVIVQMML PSQIFIIPQYQMLARMGLTNSMFALVFPGLVSAFGTFFLRQAYMGIPDELGEAAYLDGCT KWQTFTKVMMPLTKASMAALTIFTAVFAYADLMWPLIANTDLNMMTLSAGLSTLRGQFTT NYPVLMAGSVLAMVPMVILYLVFQKQFIEGIAMTGGK >gi|157101635|gb|DS480689.1| GENE 132 160987 - 161868 967 293 aa, chain - ## HITS:1 COG:lin0218 KEGG:ns NR:ns ## COG: lin0218 COG1175 # Protein_GI_number: 16799295 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 13 291 12 292 292 182 37.0 5e-46 MGKKQYLTEDAKWGYLLIAPTIIGLIVLNVYPFLQTLVLSFSTTHPFGYYEVSGVENYVQ MFGNKEFWKATWNSIYFCILTVPLGIFLSLLTAVLLNAKIRAKAAFRAIYFLPMVVAPAA IAMVWKWIFNAEYGIINQIIGSRINWLTNPKLVLPACAVVAVWSAVGYDAVLLLSGIQNI SKSYYEAASLDGASKIQQFFKITLPMVSPTLFVVLIMRLMASLKVYDLIYMMVDQTNPAL TNAQSLMYLFYRESFVAGNRGYASAIVVWTVLLIGAVTALQFWGQKKWVNYEV >gi|157101635|gb|DS480689.1| GENE 133 161893 - 163194 1262 433 aa, chain - ## HITS:1 COG:Rv2318 KEGG:ns NR:ns ## COG: Rv2318 COG1653 # Protein_GI_number: 15609455 
# Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mycobacterium tuberculosis H37Rv # 6 355 5 363 440 103 25.0 1e-21 MLKTFRKGILTAAAAVLGAALLGGCGSGTRASDGLVHLNFQIWDNAQRGGMEAVARAYMD KNPEVEIQVQVTSWDEYWTKLDAAAESGQLPDIFWMHTNQILKYADYGKLEDVTDLYDDV EPDYYTSHFSDISLANARGSDGRMYGVPKDKDTVCLVYNKEMFDAAGVAYPDESWTWDDL TSASAQIYERTGKYGYMAYADEQLGYWNFVYQAGGYILNEDKTRAGFLDSGTRKAMEFYI GLQKEPWCPDQNYFAETAPGNAFMSEQGAMYLEGNWNLISFMENNKEMIGKWDVAVLPKC PDPVRGDGRATISNGLCYSTGAKGKKLEYARDFLKFAGSEEGQRVQGLSGAAIPAYKGLE DTWISAFDQYDYKLDVQKCVDMFPYGVQSVNNASRPNWKTQINDLLLKIYSGELSLDQGL EDMQKLVDEAKAP >gi|157101635|gb|DS480689.1| GENE 134 163416 - 164273 855 285 aa, chain + ## HITS:1 COG:BH2229 KEGG:ns NR:ns ## COG: BH2229 COG2207 # Protein_GI_number: 15614792 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 23 279 24 278 287 127 29.0 2e-29 MSNVLFSIFPNERFVDLGLYQYGWEQCEPLHSYGPYARNHYLFHYCISGTGTLISTDSKG ESHTYQIKSGEGFLIYPKQINTYFADKNHPWEYTWVEFDGLRVKEALELAGLTMDAPVYH SNARDLSLELKNEMLYIARHSDQSPFHLIGHLYLFLDYLTRSSASRRLLKTGRLQDFYIR EATSFIEQNFQNDISVEDIAAFCNLNRSYFGKIFRDAVGKSPQEFLISYRMTKAAELLKL TELSINDISNAVGYPNQLHFSRAFKKTYGIPPRQWRQENKMAPAT >gi|157101635|gb|DS480689.1| GENE 135 164330 - 167026 2902 898 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 460 892 319 754 776 191 28.0 6e-48 MKKRLIPIVLFLSLVGLGGLSLVSIHNLQGNARVINYTGVVRGATQRLVKEELKGRADDA LIARLDGIMEELATGVGENRLIRLNDQAYQELLASMEDQWQVLKEEIMLVRQGKDSEELF ELSEHYFQLADTTVTAAEQYTEKQVNRTSRGFGILILGAMGVWGILVWQERRQDKMQAVI REKENANRQQKQRLDRMWDTLRAPLNDISEMMYVSDVETHELLFLNEAGMRNFHLDSIEG KRCYEVFHGRSEPCPFCSEHFTDYDNIVTWENTNPITGRHYLLKDRLIEWNDRPARLEIA 
FDVTEAEKEKQSLKFALRAEDMLVDCVRSLYEDRGIGEMLPVILDKLGRFMNADRSYVFM LRDGKLYNEYEWCAEGVESQMDMLQGLPLEFIDRWLKIFDRQECMVLEDVQDLKEDYRNE YEILTAQGITSLVAAPLERDGHFCGAIGVDNPPLEQLVNIGPLLRTLGYFLMLSYRRDEN EKELNRLSYHDTLTSFYNRNRYTADTDNLRVTDSPVGIVYLDVNGLKDINDLRGHAFGDK VLVECARRMRETFDGADFYRIGGDEFVIICREWSKEVFEEKVELLRKNFERDSICRAAVG SRWAEGSDDLSQIIADADARMYEDKKEFYRRSPVSRRYRHQSDEVINLSNPEVLADELGK NHFVVYVQPKVSSADRSAVGAEALIRYQSKEGSMVLPGNFLPLLEEAQTISQVDFFVFEF ICSKVKEWADREKQAFPVSVNFSRFSLAQPSFVEQLVKLCDKYGISPGLLEIEITESIRN APEIDLKELIGKLRQAGFTVALDDFGTEYVNLSLLSSVEFDVLKLDKTMVDDMVNNPKAQ AIIESIAGICKKMGIRVVAEGIETEEQMSVLRTCGVETAQGYLFSRPIPVEEYEEKYL >gi|157101635|gb|DS480689.1| GENE 136 167201 - 168448 1300 415 aa, chain + ## HITS:1 COG:DR1631 KEGG:ns NR:ns ## COG: DR1631 COG2357 # Protein_GI_number: 15806636 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 7 176 26 190 394 62 28.0 2e-09 MDLINQFIENYKKKMNFYETAGRMAARQLESALQAAGIRAIVTSRAKAPGRLKSKVLIRN SRRAVPYKNMREIYEDIADLCGVRVSLYFPGDRDKADSLINDLFTLLETKQFPEQSKAPS YNKRFSGYWANHYRAHMREESLDRSQKKYTTARIEIQVASVLMHAWSEVEHDLVYKPLQG TLSDEELAILDELNGLVLAGEIALERLQNAGNERIRNKNAEFGSQYELASYLYNYLSNNF RPEDIELRMGNIELLFKLSSRLKINSVKELEPVLKSVKFEKDRRNISQQIIDQMITGSEK RYHIYQELRAGQDGIGEDERHAVEYFFSQWVPLEQLLNRVSSKNSPKVRGAFNINILKRL NLLDKECINQIVSLRKIRNVLIHDIEIPEADYINRQGDEAQSLFHKLSEQFAAPA >gi|157101635|gb|DS480689.1| GENE 137 168549 - 169562 801 337 aa, chain - ## HITS:1 COG:CAC2484 KEGG:ns NR:ns ## COG: CAC2484 COG4129 # Protein_GI_number: 15895749 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 8 321 6 313 320 207 36.0 3e-53 MHRYGRLLKILKIAVGSMLAMAAAQALGLRYSSSAGVITLLSIQDTKRETIRVTGRRFLA FLAAMVLGPVSFALAGYRPLAMGIFLLMFTPLCMKWGIQEGISVNTVLMTHILAEGSMGM ADIANEALLLFVGTGVGVVLNLYIPGSETAIRSAQGEIEETFISLLSRMASELETPGENG EQPDGRGFQALEQALGQALEQGERRAYEGMENSLLADTRYYLAYMGLRKNQFAILCRMRD CFSRMESTPDQALVVADLLQSVSGSFHERNNALGLLDELEQVKLQMKAQPLPVRRQEFES 
RALLYLGLLELEQFLVLKKEFALGLSRDEIRRFWGRG >gi|157101635|gb|DS480689.1| GENE 138 169981 - 171672 1950 563 aa, chain - ## HITS:1 COG:L194050_1 KEGG:ns NR:ns ## COG: L194050_1 COG0446 # Protein_GI_number: 15672768 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Lactococcus lactis # 2 452 4 451 456 359 41.0 9e-99 MRILIVGGVAGGASVAARARRLDARSEIIMFERGHHVSFSNCALPYYLSGTVKNSDALVM MSPEKFKKQYDIEARTDSEVLSVDREKKTIRVKNGCTGDVYEERYDILVLSPGASPIRPK SIEGVDRPNVFTVRNVTDIVNLKKFVTDCQIRDVAVVGGGFIGIEVAENLNMDGRHVSVI EAQDQIMAPFDYDMVQMLQKELLDHGVDVIVNDGVSAIGEDSITLASGKAVRAGMTVLAI GVAPETGLAKDMGLELGETGAIKVDHNYRTSDPDIYAVGDAIQVFNRLTHKPSRLALAGP AQRQARAAADHMYGIPHNNKGVIGSCAVRIFDLNAAATGLNEKAAAEAGIPHDSVYIIPT DKVGLMPGSAPLHFKLVYEYPTGKILGAQAIGRGNADKRIDVIAAMISMGGTLEDLKELE LCYSPVFGTAKDVVNQAALAALNLLYGRVRQVPVTKVRELVETGACIIDVREKNEYGLGH LKTSVNIPLSELRDRLDEIPRDRPVYLHCRSSQRSYNACMALVNRGYRNVWNISGSYLGI CLYEYFTDVTTGREKIVTEYNFK >gi|157101635|gb|DS480689.1| GENE 139 171669 - 172091 479 140 aa, chain - ## HITS:1 COG:no KEGG:CLL_A0378 NR:ns ## KEGG: CLL_A0378 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 1 130 1 133 144 83 33.0 4e-15 MTRDRIYEYFIEKDNNCAEAMLRALNDEYCLGIPDDSVKLVGGFGGGMGCGKACGALCGG CSAISYRLIHERAHATPELKPAVAAFAEEFVERFGSDACCELVKTYKKEDTRCLDMICMA ADIADACFERIISEEREQSK >gi|157101635|gb|DS480689.1| GENE 140 172390 - 173862 1602 490 aa, chain - ## HITS:1 COG:BH2756 KEGG:ns NR:ns ## COG: BH2756 COG1070 # Protein_GI_number: 15615319 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus halodurans # 4 478 5 483 498 378 41.0 1e-104 MLYIGVDLGTSAVKLLLMDGSGRIHKVVSREYPLYFPHPAWSEQNPEDWFTASMDGMKEL TSECDKSQVAGISFGGQMHGLVTLDQADEVIRPAILWNDGRSEKETDYLNQTIGKEKLSA YTANIAFTGFTAPKILWMKRNEPENFARICRIMLPKDYLAYRLSGSFCTDYSDASGMLLL DVAHKCWSEEMMELCGITRKQLPDLYESYEVVGNLKEELAKELGFSQDVKIIAGAGDNAA 
AAVGTGTVGEGCCNISLGTSGTIFISSESFKVDCNNALHSFDHADGHFHLMGCMLSAASC NKWWMEEILKTKEFSREQEGIVKLGENHVFYLPYLMGERSPHNDPRARACFIGMTMDTSR EEMTQAVLEGVVYGLRDSLEVARSLGIRIDRTKICGGGAKSPLWKKMVANIMNVKVDVVE NEEGPALGGAILAAVGCKEYPDVKTAADSIVKVTETIEPDGELTEKYEKQYSKFRQMYPA LKAVFQNLGE >gi|157101635|gb|DS480689.1| GENE 141 173883 - 175367 1921 494 aa, chain - ## HITS:1 COG:CAC2610 KEGG:ns NR:ns ## COG: CAC2610 COG2407 # Protein_GI_number: 15895868 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Clostridium acetobutylicum # 1 493 1 489 490 755 73.0 0 MNNTPEVKIGIVAVSRDCFPESLSVNRRKALVEAYKAKYDASEIYECPVCIVESEIHMVQ ALEDIRKAGCNALCVYLGNFGPEISETLLAKHFDGPKMFVAAAEESGDNLCQGRGDAYCG MLNASYNLALRNIKAYIPEYPVGDAEDCADMIHEFVPIARAVAALSELKIISFGPRPLNF LACNAPIKQLYNLGVEIEENSELDLFEAFNKHAGDERIPGIVKEMEEELGAGNKKPEILS KLAQYELTLKDWVQEHKGYRKYVAIAGKCWPAFQTQFGFVPCYVNSRLTAQGIPVSCEVD IYGALSEYIGTVISQDTVTLLDINNTVPKDMYEADIKGKFDYTLKDTFMGFHCGNTASGK LAFCEMKYQMIMARALPIEVTQGTLEGDIKPGSITFYRLQSTADNKLRAYIAQGEVLPVA TRSFGSIGVFAIPEMGRFYRHVLIEKNYPHHGAVAFGNYGKALFEVFKYIGVDVEEIGYN QPKSVRYKTENPFA >gi|157101635|gb|DS480689.1| GENE 142 175437 - 176060 682 207 aa, chain - ## HITS:1 COG:no KEGG:Closa_1357 NR:ns ## KEGG: Closa_1357 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 207 1 208 208 307 65.0 2e-82 MGMTVKKVTDPAFKAYGRVITGYDFSGLLKAMEQTPLPEDVIYIPSLPEMEALPAAKELE NGIYGQMPIQIGCCNGHNKKLNAVEYHRDSEVDIAVDDLILILGKQQDIEEDHTYDTSRM EAFLVPAGTAVEVYATTLHYAPCHVKDEGFRCVIVLPRDTNLDMEPVEVKDPEDRLLFAR NKWLIGHAQGGLPEGAFIGLKGENLSV >gi|157101635|gb|DS480689.1| GENE 143 176196 - 177128 877 310 aa, chain - ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 11 272 11 288 292 94 25.0 4e-19 MDFRYETVIPNDDLPFRMFIFEGRDGNYRVSKHWHQSVELFLVLEGTLDFFINSRQYTLK 
PNDFVIVNPNEIHSIECPDPNITIVLQIPMKSFEGYMSDESRITFADKDESQKARLVALV TAMYRTYEKGEFGYRLKVKSQFMEFLYLIVTEFRIEQIDKVRVQQKRHLDKLSQVTQYMK ENYDKELSLEMVASRFGFTPSYLSHMFREYAQTGYRTYLMDLRVKYAMRELLNSDRYVGD IALDHGFADARAFAKAFKKRYGCLPSQYRKQMNQKQVNQKQMNQKQMNQKQANQKQRNRK SLAGKSLAGE >gi|157101635|gb|DS480689.1| GENE 144 177245 - 178495 484 416 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c18950 NR:ns ## KEGG: CLJU_c18950 # Name: not_defined # Def: putative collagen triple helix repeat-containing protein # Organism: C.ljungdahlii # Pathway: not_defined # 25 283 207 465 800 154 56.0 6e-36 MWNGNEEDFRERAGCWYKECCCCPGPAGPVGPKGDTGPAGPRGPIGAAGMRGDTGPQGPQ GVPGERGVTGAQGPAGPQGAPGLQGNTGAEGPKGDTGPRGAMGPEGPRGEIGPAGVTGPT GPRGPMGIQGPQGIQGVIGPTGPQGPQGPQGNPGTVGPRGIPGAAGVPGPVGTTGPRGET GPTGATGPAGNPGPTGPTGIPGSAGATGPAGATGATGATGPTGATGTTGATGPAGVPGTT GATGPAGATGTTGATGPAGVPGTTGATGPAGATGTTGATGPAAGDVFASFITFAGVWNNA QRIPVGVGVADTTGNIVLTDSTRITLAPGYYCISYQVSSMLADAGYMQITPVYNGSSHIE YGIYFWTANARSSAFGASSLIVEVPQSTTLFLYFNSNTRAQEGTVTMVIFKLNRPV >gi|157101635|gb|DS480689.1| GENE 145 178633 - 179460 649 275 aa, chain - ## HITS:1 COG:TM1552 KEGG:ns NR:ns ## COG: TM1552 COG1180 # Protein_GI_number: 15644300 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Thermotoga maritima # 5 274 18 282 331 256 46.0 3e-68 MRAVCGGCMNRCSLEEGQTGMCRARKNTGGAVIPVNYGKITSMALDPIEKKPLRRFLPGS VILSVGSFGCNLRCPFCQNHKISMAGEGDFESVTVSPEELAQKAAELGKNGNIGVAYTYN EPLVGWEYVRDTGRLARKAGMKNVIVTNGSVSDQVLDEILPVADAMNIDLKGFTENYYRK LGGDLETVKHFITCASRRCHVELTALIVPGENDSDEEMREMAGWISSLSPEIPLHVTRFF PRWKMDDRPATDAARVYALVQTAREYLRYVWTGNC >gi|157101635|gb|DS480689.1| GENE 146 179457 - 180863 1161 468 aa, chain - ## HITS:1 COG:CAC1420_1 KEGG:ns NR:ns ## COG: CAC1420_1 COG3885 # Protein_GI_number: 15894699 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 294 1 299 299 199 34.0 1e-50 
MAVAAGIMVPHPPLIVPDVGRGQEILIKDTIKAYQEAAGMAAELKPDTIVVISPHAVMYA DYFHVSPGKEARGDFGRFMAPQIGVEAGYDEAFVRELDRLCGDDGFPMGTEGEREKELDH GTLVPLYFINQHYRDYKLVRIGGSGLSFADHYRAGRYIARAAENLGRSVFIVASGDLSHK LRGDGPYGYSVQGPEYDERIMDVMGKAEFGELLAFPEDFCEKAGECGHRSFVMMAGAMDK KAVMPRRLTYEGPFGVGYGICTFRVTGDDGRRGFLDLYESCQRETCKKRQMEEDEYVSLA RRSLEYYVHEGRMIPFGRAEEGLPEEMLQTRAGVFVSVRKGGALRGCIGTIEPVCRNIAE EIIQNAVSAGIHDPRFPSVRREELSFLEYSVDVLGETEQIQGEDQLDPLRYGVIVTKGRK RGLLLPNLEGVDTVREQLSIARQKAGIGEDETDVQLERFEVIRHGAKS >gi|157101635|gb|DS480689.1| GENE 147 180952 - 182139 1098 395 aa, chain - ## HITS:1 COG:RSp0802 KEGG:ns NR:ns ## COG: RSp0802 COG0471 # Protein_GI_number: 17549023 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Ralstonia solanacearum # 18 357 57 373 408 106 28.0 6e-23 MNKVKGFVKRETVLSAAVILAILSAMVIRPDREYGSYIDYKTIGLLFCLMTVMAGLQSLG VFKKAGERLLGYVKGPGAVSAVLVFLCFFFSMAITNDVALITFVPFAIAVLKMAGMEPLV LPVVVLQTIAANLGSMLTPVGNPQNLYLYSKAGLNAGRFMELMAPYTAVSFGMLVLCLAV VAAGNRMPGKERNGRGQGPMVSGFPGFSQEAAGMEASGKRIYLGRLGLYLFLFLLSLGCV ARILPWQLLLAVVITGVWIADRSLFLAVDYSLLLTFCAFFVFIGNMGRVPAFCRFLGRIL EGREIITCVGASQIISNVPAALLLSGFSGDYSALIVGTNLGGLGTLIASMASLISYKHIA RMYPMEKGRYLGWFTVVNLLFLAVLLVLSLWVIPF >gi|157101635|gb|DS480689.1| GENE 148 182325 - 184079 1684 584 aa, chain - ## HITS:1 COG:SMa1715 KEGG:ns NR:ns ## COG: SMa1715 COG1001 # Protein_GI_number: 16263395 # Func_class: F Nucleotide transport and metabolism # Function: Adenine deaminase # Organism: Sinorhizobium meliloti # 11 573 11 583 595 322 36.0 2e-87 MKIAVWMDRGKMVRAAMGEIPCDLTVCNIQLVNMFSGEVYPAQVDILDGIIVRVRTDNEC PALPSERIFDGGGRYLIPGFIDSHLHVESTMMIPENFGRAVLPWGTTTIVMDPHEIANVL GIEGVSFMLDNAKKTPLRQFALAPSCVPSVPGAENSGAVFGEEEIARLLELPEVLGIAEI MNYVDVCRGDERMSRIISQGLKRNLYLQGHAPRLEGDALAAYLLAGPESDHECRSARECR EKVRQGMHVNLKTSSLSNHLPDALEGIRDHRWHDSVSLCTDDVHAGVIYREGHLNRVVQK AIGYGAHPLDAIRYASYNAAREYGFDDLGAIAPGYVADMQLVDALDGRRPSHVFCRGKLV AEEGRYLGSPYDGGLKFTNTMKLSYLSGPGDFRLKAPASQGETLVIYSKYDGPFNKAFYE 
TLPVEDGYICIRQDPNLAMACVCNRHGLNQRTVVPIRNFGITEGAIATTVSHDCHNLTMI YRDPEDAWIAAETLKRSGGGIAVVLNGKVLASLALPAAGLMSQLSVDSLAPQVEQVEKAV YDLCAGKSSLLKMSTFALAALPGAIMTDKGVLDGQSQTFMPVFR >gi|157101635|gb|DS480689.1| GENE 149 184106 - 185440 1253 444 aa, chain - ## HITS:1 COG:FN1877 KEGG:ns NR:ns ## COG: FN1877 COG2252 # Protein_GI_number: 19705182 # Func_class: R General function prediction only # Function: Permeases # Organism: Fusobacterium nucleatum # 3 411 12 414 442 234 36.0 4e-61 MNTLEKHFKIRERGSSVRTEVLAGMTTFATMAYIIILQPVNMRDTGMDTVGILIATALVS AFITMLMGVAANMPIALAPGIASGIVLTYSIVVPGLADWKVGLGMSMISAILFLVLSLFK IREKIVELIPKNIKIGISAGLGIFIIRTALVNARLVNPDFRGFGDFSDPSVLLSAVSIVI CFILYFLRIKTGGREYRIRSSLLIAIVVTTVIGIPMGVVQVPSSIFTSGGVASLGNIAFK ADVLGALRPEYIAFVLAFFVSDFFGTLATALGLGQQMGMLDENGNFPIIGKIFLVDAIGS VVGTCMGVTVVTSYVESASGIEVGGRTGLASVVTGLFFLLAVLFAPLFLMIPTAATTPVL LIIGFVMMQGLKSIDFGIEEWVPVGMLIISTLFYGISQGIGIGLLTYCGVKSAYYLFTDE RGMDKLPSPFTIIFTLLTCIQFFI >gi|157101635|gb|DS480689.1| GENE 150 185668 - 186420 731 250 aa, chain - ## HITS:1 COG:BS_yvoA KEGG:ns NR:ns ## COG: BS_yvoA COG2188 # Protein_GI_number: 16080556 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 16 235 15 228 243 83 29.0 5e-16 MPSIAPIRKKTLSDSVKDSLRKLISEKAFNEAGKLPPEEELARQLAVSRITIRKALTDLE QEGLLLRIHGRGTFVNPAARQVKVNLSGMLEFGSVISGNGYRPECRLVSVDEERIPASVG EHLGTGKGGRGIRVEKLYLADDIPAIVSVGRIPLHLFSETPARTDWQEKSNFEVIQEYTG RMVVRDWIEVSSLSAQEAGEMLGHPCPLKAASVLMIQAVGYDQNSEPVIHGVALYDTSHI RFNLLRHAEG >gi|157101635|gb|DS480689.1| GENE 151 186674 - 189118 1826 814 aa, chain - ## HITS:1 COG:MA1313 KEGG:ns NR:ns ## COG: MA1313 COG3875 # Protein_GI_number: 20090175 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 387 800 54 464 468 358 43.0 3e-98 MTAERYISQYAEEFMKLDRKFWNYEDGCVLTGLEAMYKATGRKRYAEAVRVFLDRYICPD GRIRWYDREEYSLDKIPSGRGLLFLYRETGQEKYRLAAKQLMEQLRRQPRTESGSFWHKK IYPRQIWLDGLYMAAPFYLQYEMELGDKKNCADIIKQFENARRFLYDESASLYIHAYDEG 
KCQFWADPETGRSPNFWSRAEGWYLMALADCCSILPRGSEDWQYLAGLWKEAMEGMLRYQ DQESGLFFQLTALGKTPGNYLETSASAMAAYSIYKGYEMGIFNRQTVQRADLIMMALETE KLKLRNGCLHLEGTCAGAGLGPADRPERDGSVSYYLGEAVVSDEQKGAAAFMLAYSQWEV RRRSIQDTEVTGMVKLNDVYELRHRAVEEIELGYGTGTEKVKIPRDAIAHILTPHKKEMR APEEEIIERALDSPIGTERLEKMASGKKDVVIITSDITRPMPSWRVLPHVLKRLEKAGVS RSHITVVFAMGTHRRHTSEEMRHLAGDEVYNTCRCMDSSECSFIHMGETKAGTPVDIADK VAHADLRICLGNIEYHFFAGYSGGAKAIMPGVSTMQAIRKNHSRMIHPMAKAGTLEGNPV REDLEEAAGICGVDFLLNVVLDEHKNVIHAVAGELKEAHRQGCRFLDGFYRMEINELADI VIVSQGGAPKDLNLYQTQKALANAEQAVRQGGIIILAGACPEGLGGAVFEQWMLEAEDLD SILKRIQRDFQIGGHKAASFARALKRARIFLVSGIDRELVRDIFMEPFDHVQEAYDAAVK EMGPGARVIVMPYGGSTLPVLSGDGNGETDGRKD >gi|157101635|gb|DS480689.1| GENE 152 189108 - 189623 685 171 aa, chain - ## HITS:1 COG:MK0781 KEGG:ns NR:ns ## COG: MK0781 COG0066 # Protein_GI_number: 20094218 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Methanopyrus kandleri AV19 # 1 168 1 164 170 176 51.0 2e-44 MKEKFAGQALALGNNIDTDQIYPGRYLALTNPEEIGSHCLEGVDEGIARNFPKGGIIVAG TNFGCGSSREHAAIALLNMGAGAVVAESFARIFYRNAINLGLPLLVCKGIGGKVESGQNL EINMTAGVLKNLDTGEQHQCETIGEYAMSILEAGGIKPLFIKRIKEEGHDS >gi|157101635|gb|DS480689.1| GENE 153 189628 - 190926 1276 432 aa, chain - ## HITS:1 COG:MTH1631 KEGG:ns NR:ns ## COG: MTH1631 COG0065 # Protein_GI_number: 15679626 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Methanothermobacter thermautotrophicus # 1 415 4 423 428 413 50.0 1e-115 MHAIEKILAKHSGRDRVVSGEIITADIDFAEINDLYLQTIYSFREMGGEKVWDRDKAAFV FDHYAPSPTIEASRNHREMRLFRQEQGLTHHFDINAGICHQVMPEAGLVYPGMILVATDS HTTTHGAFGCLGTGIGATDMATVLITGKLWFRVPEIIRIHLEGMPGSHVLPKDVILYIIG KMKADGAVYKAIEFTGSYVEQLDVAGRMVLCNMAVEMGAKTAYMEPNQAVLDYVAGRTSR PFTVEQTDGDFEYEETYVFDISGLKPQVSMPSSVDNVGAVALAGRVRIDQAFVGACTGGR VEDIGEAARILKGRTIAPHVRMVVIPASAEVLKECIAKGFLDTLIDAGATISAPGCGPCL SAHQGVLAAGEVCVTTSNRNFPGRMGSRDSAVYLASPATVAMSALTGYLTDPSWENQGQA ERRETEATGKGE >gi|157101635|gb|DS480689.1| GENE 154 
190965 - 192059 982 364 aa, chain - ## HITS:1 COG:SSO3124 KEGG:ns NR:ns ## COG: SSO3124 COG4948 # Protein_GI_number: 15899830 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Sulfolobus solfataricus # 36 362 39 370 373 181 34.0 2e-45 MKITDVRQEYFRWEKSRPITNGMHTYTHCGLAVITVETDQEITGYGLSGPVLGINPFMMV DELKDRLIGQDPMRTEWLWDRLYVPKLTGRKGISTRTISGIDMALWDIKAKSLGLPLYRL LGEYRRSVPVYIAGGYYEPGKGIADLQREMEGYMEQGVSGVKMKVGALSIREDAARVKAV REAIGPRALLMLDANCAYSSHRAVQFAGLVEEYDIYCLEEPVAPHDYGGMKRLADKTVIP LAAGENEYTKWGFKELIDTGAVPILNPAPFLMGGVTEFLKTAALSQAHCLELAPHGDQTI NISLGAAVGNVSYIEYYPPAYDEVWQNAFNDSLEIDREGRLPAPERPGIGFEPDYSVLDR FRII >gi|157101635|gb|DS480689.1| GENE 155 192139 - 193557 1598 472 aa, chain - ## HITS:1 COG:YPO1719 KEGG:ns NR:ns ## COG: YPO1719 COG1653 # Protein_GI_number: 16121979 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 74 456 26 415 430 212 31.0 1e-54 MRKRGKQLVSLILAGALAGSLAGCMPTQSKETAAAPAKTQDTKAAENAAGADPEAAGADT EAAASADMEVNTTDPITLRFNWWGGDSRHEKTLKAIEAFEAKYPNITVEAEYEAFNGHEE KISLALNAGSAADVVQLNMDWVFAYSPNGDTFYDLNKVSNIIDLSNYDDSDKAFYTVNGA LQALPIANTGRGFIWNKTTYDKVGAKIPTTLDELYEAGEKFAAYEDGSYYPFACADYVKV HLMIYYLQCKYGRDWIKDNALQYSKEEIAEGLNFIKELEDRHIIPDSEKLAGEGTSELLE TSESWINGHYGGAMCWDTNIAKYVDAVTDGEIVVGDMINMGEYHGGPIKASQVIAITSTS EHPVEAAALIQFLFGDEEGAVILGDSRGIPCNKNAVKYVETEGSLVAEMNEKLMAWSAFK LDIFTERAALKAPDGVYTLAIQSLSYGEEDAAACADMLIDGINREIETATAQ >gi|157101635|gb|DS480689.1| GENE 156 193593 - 194441 783 282 aa, chain - ## HITS:1 COG:YPO1721 KEGG:ns NR:ns ## COG: YPO1721 COG0395 # Protein_GI_number: 16121981 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Yersinia pestis # 6 282 30 306 306 308 53.0 1e-83 MNRKEKARLNLFLRYLLLIGLGIVMIYPLCWMIGASFKTNSELFSSAGIIPMAPTLDGYR 
NGFKGYEGMNLLHFMGNTYKFVLPKVVFTVISAVITSYGFARFRFVGKNLLFVLLLSTLF LPQVVLNVPQYILFNEVGWLNSYLPIVIPSMLAGDTFFVYMLIQFLRGIPRELEEAAEID GCNVVERLWYVIVPMLKPSIVSCALFQFMWASNDFMGPLIYINSVRKYPVSIFLRMSMDT ETGFDWNRILAMSLLAIIPSLAVFFMAQDSFVDGIAAGGVKG >gi|157101635|gb|DS480689.1| GENE 157 194446 - 195342 936 298 aa, chain - ## HITS:1 COG:AGl3351 KEGG:ns NR:ns ## COG: AGl3351 COG1175 # Protein_GI_number: 15891796 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 295 4 290 293 291 55.0 1e-78 MKKKKKGFWKRNIGLAFVMPWLIGLLVFKLYPFAASFIYSFHSYNLFKTASFTGLENYKY ILGDKLIIKAFIQTFKYAFLTVPLELMFALFIAYILNNKIRGLNFFRTIYYIPSILGGSV SIAVLWKFLFKTEGLVNIMLGALGIPAFNWLGNPDGAFFVIVLLRVWQFGSPMVIFLAAL KGVQGDLYEAAAIDGAGKWKQFFQITVPLITPVIFYNFVTQLCHKFQEFNGPFIVTQGGP LRSTTLVSLLVYNEAFKRNEMGLASAIAWLLFLVIMTFTAVAFISQKYWVYYADEDGR >gi|157101635|gb|DS480689.1| GENE 158 195360 - 196760 1297 466 aa, chain - ## HITS:1 COG:BS_lplD KEGG:ns NR:ns ## COG: BS_lplD COG1486 # Protein_GI_number: 16077780 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Bacillus subtilis # 7 455 3 441 446 451 49.0 1e-126 MKWNNHHVSDIKMAYIGGGSKAWAWKLMADLALEPALDGTVWLYDIDARAAEENQIIGSQ LKERKEAAGRWDYKVAYSLKEALKGADFVVISILPGTFEEMRSDVHLPERLGIYQSVGDT AGPGGMVRALRTLPMYEEIALAIREFCPFAWVISYTNPMSLCVASLYKTFPEIKAFGCCH EVFGTQKVLAAIASQELGTGPIDRKDIHVNVLGVNHFTWFDAASYEGADLFPVYKKYIGT HFEEGYDDPDRPWEKSTFNCRHRVKFDLFNRYGLIAAAGDRHLAEFMPGNTYLNDPETVR SWKFGLTTVDFRISQMEERLARRKRLIQGEEQMEIVPSGEEGVQQIKALVGLDRMVSNVN MPNSFLQIPNLPKEAVVETNAVFSRDSIRAVAAGPLPEPIRELILPHVQNHGYILEAADT YDRNLVVKAFMNDPLVKHKCRDEGEIRKLADDMIHNTERYLPAGWK >gi|157101635|gb|DS480689.1| GENE 159 197039 - 197698 663 219 aa, chain + ## HITS:1 COG:STM3357 KEGG:ns NR:ns ## COG: STM3357 COG1802 # Protein_GI_number: 16766652 # Func_class: K Transcription # Function: Transcriptional regulators # 
Organism: Salmonella typhimurium LT2 # 3 213 5 215 221 165 43.0 7e-41 MKKLDSLEIMPTRIRIASILRKAILSGEFKEGEELSLTDIANNLGVSRTPVREAFQILSS ENLILLRMNKGAIVKGITTKTIREHFEMRSLLEGEAAVRATLRHFDVSELMESQVHIESL GENFTDDEYQSYNQNLHTSIWKAADNSKMYSTLSSLWNGSSFGKTVSAKDHQILSIREHR NILEYIQSCNPYLVRKEMEHHIERSMNNIIDSFSLKENE >gi|157101635|gb|DS480689.1| GENE 160 197807 - 198034 350 75 aa, chain - ## HITS:1 COG:no KEGG:Calhy_1445 NR:ns ## KEGG: Calhy_1445 # Name: not_defined # Def: phosphotransferase system, phosphocarrier protein HPr # Organism: C.hydrothermalis # Pathway: not_defined # 1 73 1 73 76 67 42.0 3e-10 MKRILVKFDQADQIINFVRIMNRFECDADVKCGSRMVDAKSIVGVLSLAKSKTVELILHT DDCDQLMEEIAPFAA >gi|157101635|gb|DS480689.1| GENE 161 198245 - 199231 1199 328 aa, chain - ## HITS:1 COG:AGl2685 KEGG:ns NR:ns ## COG: AGl2685 COG1879 # Protein_GI_number: 15891450 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 19 321 50 346 353 147 32.0 4e-35 MTGVLFLFSCQSSQFNRGKYAIIMKSRNNWYNELASEGFKQTVEDAGKNCIVLYPDHPSA QEQIHLIQNLIDEKVEAIAVAANDEYALTPVLTQAREKGISVITLDADVEAGSRSIYISP VDARELGKELVREVDRICGHSGQWGILSAGSRSANQNEWIYMMKQELQNLEYRDLRLVDI AYGEGEYEKAAEKANLMLETYPDLKVMCCLSTEGIKAAADVVKARGQASKVKVIGLGLPD QMEDYVGSDPEDICPVLYIWNPMDLGRVAGYVCLELSEGRIEERGDQELLLGGRTYPMDY GHDGGLEVIAGEPIKVDSENIGYWKDQI >gi|157101635|gb|DS480689.1| GENE 162 199435 - 200991 1522 518 aa, chain + ## HITS:1 COG:BH1910 KEGG:ns NR:ns ## COG: BH1910 COG4753 # Protein_GI_number: 15614473 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 3 515 4 497 506 126 25.0 1e-28 MKLLIVDDEKLTREGIRDSLGLESLGISQVLLEDDGIHGLKTALEERPDIVLTDVRMPRM NGVQMAERILKELPGTSIIFMSAYSDKEYLKAAIKLKALGYVEKPLDMEELASAVKEAVD SSRNEKISQAAARLQEKEQLGHLSLLLSQPEEESLIKAGQLAADLGLSITHTTCFCCIII DCVTPLSALPEEQMDGIRDYFMEYLTSMDISQAYVLRGDHRIIVFLHSPVRPADKTLYDC 
AGLLAQKLIKICPFFISLGPVVSGMDRAHLSYQEASEHLKEAFFHDYGFILTRNMETAVF RPPADLLLEFSMALSEKQEEEALDIVRRLYESFVPNDMIGPSQVKDIYYKYFSKLDEHGL GSYISLWQKEGLESESIWEGVMNCTILRELNDLLAAKVRLFFERLRGNSVGNPVVFQIKE YIHKNYAVLSLSVPDVSEYVRLSSSYVCTIFKNETGQTLNQYLTDYRIKMSKQFLSDPRY KIADISSKVGYSDGNYYSKTFKKIVGLSPSEYREKMLA >gi|157101635|gb|DS480689.1| GENE 163 200988 - 202865 1670 625 aa, chain + ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 375 618 349 591 602 157 35.0 5e-38 MKNPFTKLIHHYTYDMRLKTKLVISHIILVLLPTAVLSGFLYLRIYGIVMDDSIRSEQAL SAQTVTSIESLVSHVGHASDTITGSMMVQDLFRVPRSEASTRDISTSKMNSLFHLVQSIT DHSMIMDVKIYYDDSVYGDLMQYNKIGNALFNPVSSVSSSYWYGIFSTSQQSRLLCPELY LTPDETGKSGKLAYITRIPYTYEGGTVSDIENASAYVALYLSGPAFETVLRNDATVTDEA AFLVNERDVIVSASDMGLAGKYFIPRRDLEQRVGKEKTFSLVSYLDGSAYVAYFPIADTD WYMMSIIPALHIGDAGKALMTHFAMIYLLFTALALYTAFRLSGSIADRIIGVALQMETVR TGPPQPMDVADTGCDEIGVLSDTYNYMTEEIITLMDSQKKASDELRMAEFRALQAQINPH FLYNTLDMINWLSQTGQSEKVTEAVQILSRFYKLTLSRRELMNSIEKELEHVSLYVRLQN MRYDNCVAFVVDVPEELCEYTIPKLTFQPIVENAFLHGIMMKEEKKGSILLTGWPEGDDI VFIISDDGAGIPPETLDTLKDDVNAGTGSSASPRHTAFSGHIGIYNTNLRLKSLYGESYG LSITSTLGKGTEVTVRIPARHITSD >gi|157101635|gb|DS480689.1| GENE 164 202852 - 204129 1095 425 aa, chain - ## HITS:1 COG:no KEGG:Closa_0526 NR:ns ## KEGG: Closa_0526 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 90 422 53 392 398 110 25.0 1e-22 MRNRSDIKESGRNRGDIITGNTETGNTQTGNTQKVKIVLLAMVLCLWGAGGRTSYAAQDT SQDISKDISQAGAASTTPETAGNTAGAIIIEYGNLRELLKQGNLSLKESIEDYEDNINAY QEIWDTLKREQDNMEDKAEDMDDEDSQTAGIYASNAAMLKSSASRIYSQLDIMTSEKSTR SLEKSADTFTMTAQTLMNSYNQMVQNVEYQEKRGESLQAAFEAMGRKQAAGSATQAQLKE AQKNLDTAKNSLESLRLQASQLRQQLLTMLGIEDSSQVVIGTVPEPDMAAIEAVDYESDK IRAMGNDKSVQNARHTSASSTTEINIRFKLVDEAEGTKEAAFLASYQNLQASKTAYEAAL TAFQSAQLTYEGLQRKQQAGLLTGTQYLEGQASYLEKKAAKETAAMNLTAAYESYCWDVK GISQT 
>gi|157101635|gb|DS480689.1| GENE 165 204206 - 205366 1360 386 aa, chain - ## HITS:1 COG:no KEGG:Closa_0525 NR:ns ## KEGG: Closa_0525 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 384 3 379 383 214 37.0 7e-54 MKHGKIGAGCLAVVLCLGTMPVTAWAGTPEFAYTAEQWASLRDNQLEFTEIADLIHVYNN TVIQNQLEYEDFRGEDADDIADDYYDAADDIYGSLEYPDSSDSDYASRLSSYLSSQIQAD NLREQGDDNVEDGDVKKLEYDKTEAGLVKEAQELMISYWSQTYSLESLEQNKIQARSSYD QTLNRLSAGMSTQAQVLSAREAVTSADASLLSAQSSLGQTKETLCLMLGWTYGAQVDIGD VPEPDLEGMTAINLEEDVSRGVENNYSLKILEKKIANAKSGTNKTSLEQSLKSQKETAAS SIKNAYESMMISKSDYEQALNTYEIEAAAMAAAERKMAAGTTTRNDYVTQQTAFAAAQVN VRTQKLALLKAQLEYRWSVDGLASVS >gi|157101635|gb|DS480689.1| GENE 166 205385 - 208390 3394 1001 aa, chain - ## HITS:1 COG:BH3816 KEGG:ns NR:ns ## COG: BH3816 COG0841 # Protein_GI_number: 15616378 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Bacillus halodurans # 3 997 1 1012 1093 417 29.0 1e-116 MGLTRFVLKRPVATVMALLCLLVFGISSVFNATLEQMPDTDQPMLIIIASYSGAGPEDID ELVTQPIEDQVGTIEGVKSMSSTSSENRAMIMLEYDYGTDMDDAYNDLSQSLDALSRQLP DDAETSVMEMNNNAGTTMMLSISNPSQEDLYDYVDQTVVPLLEQISSVAEVEAMGGKSEY YKIELQSDEMAQYQVTMSDVSTAMSTANLSYPSGDAVSGKLELSVSTSLEHETIEALKEV PITTSSGKIVYLEDIAKVYEAEESRGGISRYNGQDTISISITKQQSSTAMDVSSEVQEVI ESLEADDENLNIRIVRDSADSILSSLKDVALTLVLAAVISMIIIFIFFGDYKASLIVGSS IPTSILVSLILMTSAGFSLNIITMSGLVLGVGMMVDNSIVVLESCFRAIETEEDKGLLGY ARASLSGTNIVLQSILGSTVTTCVVFLPLVFLQGMSGQMFGSMGYVIVFCMLASFLSAIT IVPLTYMAYKPQERMQAPMSRPMERIQNGYRRIMPGLLNHKAIIMIASVILVIAAYFLAS GMETELMTADDTGTVSVSIETRPGLLSENADAMLLQAEEIVQGHPDVESYMLRYNNDSGT ITAYLRDNRDMSTDEVVEQWETEMADLDNCTVTVEASSSMSFMSRNRGYEVILNGTDYDE LQEVSNKIVTEMTARDDVMNVHSSIENTAPVVTVKVDPVLAAAEGLTASQIGSQVKQMMD GEEVTTLDVDGREVSVMAEYPEDEYRTVSQMKDIILSKPSGGYVALTDVAEIYYKDSPAS ISKTDKAYEITITADYTGGNVQSAIDSEVINPNLSGTIKRGVNSMNRMMQEEFAALYQAI AVAVFLVFVVLSAQFESPKFSFMVMTTIPFSLIGSFGLLQITGVSISMTSILGFLILVGT VVNNGILYVDTVNQYRMTMDLKTALIEAGATRLRPIMMTSLTTILSMIPMALAIGDSGST TQGLAIVNIGGLSVGVAVALFILPIYYALMNGDKKRVVLDI 
>gi|157101635|gb|DS480689.1| GENE 167 208394 - 209671 1238 425 aa, chain - ## HITS:1 COG:ECs2882 KEGG:ns NR:ns ## COG: ECs2882 COG0845 # Protein_GI_number: 15832136 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli O157:H7 # 28 377 98 464 464 84 23.0 4e-16 MSKSRKIKIGIAAGIIVILAVAMIVMKGGKKAKGGMPMGQEASVTVVRAEQPSSGDIILT TGLTGTVEPSDVVHIYAKAAGDVTAVYVKAGDMVTQGQVLFEIDTEQVETAKNSMDAASV SLSEAQSNLRRMQILYSGGDLSEQEYEQYVNAVKSAQLQYESAKLAYERQVQYSSVTAPI NGKIESFDVEVYDRVSQSQDLCVIAGEGENIVSFYVTQRMMQNANVGDELEIQKNGTTYK AYISEINSMVDRDTGLFKVKAQIENTQEIAAGSTVKLNLVTERALDTMVVPIDAIYYSGG NAYVYLYQDGTASMAQVEVGLEDEEHAQILSGLSADDMVVSTWSSNLYEGAKIRLRDEVQ PGEEAQFREETQAGEKTQSGEEFQSREEFQSGEGNQSEEETQAGETAAPGGEDSKHAASQ AEQEA >gi|157101635|gb|DS480689.1| GENE 168 209932 - 210468 303 178 aa, chain - ## HITS:1 COG:CAC2751 KEGG:ns NR:ns ## COG: CAC2751 COG0454 # Protein_GI_number: 15896008 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 9 168 6 167 167 123 35.0 2e-28 MERFILQPANRKEAEKAMALIDEAKEFLKSQGIDQWQTGYPDMETIIGDLAHGRGYFIEE GSDAAAYLCIDFAGEASYETLQGSWKSDLPYAVVHRMAISGAYRGHGIASIAFQLIEELC IQNGIYSVRVDTDEKNAIMRHVLEKNGFDYCGTIWFDNSVKYAYEKMLKRDAGVNGAV >gi|157101635|gb|DS480689.1| GENE 169 210667 - 211566 610 299 aa, chain + ## HITS:1 COG:STM2644 KEGG:ns NR:ns ## COG: STM2644 COG0583 # Protein_GI_number: 16765964 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Salmonella typhimurium LT2 # 1 287 1 283 299 266 45.0 3e-71 MELKCLRTFRTIIDEGGFSKAAKKLNYTQSTITFQMNQLERDLSTTLFEKIGRKMVLTKA GEHLIPYVDDILQSVDKLSFLGEHLSEYQGDIQIGVGESLLCYKLPAILKEFHRQAPKAR LFLRSMNCYDIRDELFSGGLDLGVFYEDVGGFGSNLAVKPLGDYPVVLVASPEIQNRYPD FVTPDRTIPLPLIINEPNCIFRQIFEQYLRDKSIILDHTIELWSIPTIKNLVKNNVGVSF LPEFTVTEELNRGELVKIPTTISGTKITAVCAHHKNKWVSPLMQLFISLCDSVSCGRER >gi|157101635|gb|DS480689.1| GENE 170 211722 - 212396 873 224 aa, chain + 
## HITS:1 COG:lin2728 KEGG:ns NR:ns ## COG: lin2728 COG0745 # Protein_GI_number: 16801789 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 1 219 1 220 225 198 45.0 6e-51 MFQILVAEDDKNTRRLMEAVLKEHGYHPILACDGLEALKLLDTHHVDLVILDIMMPGMDG YEFTRQLRATDYTLPILMVTAKQLPEDKRKGFIVGTDDYMTKPVDEEEMILRIRALLRRA QIVNERRITIADVCLDYDSLTVSRGDESQTLPRKEFYLLYKLLSYPGKIFTRIQLMDEIW GMESQSDDNTINVHINRLRKRFEDYPEFTIETIRGLGYKAVKHL >gi|157101635|gb|DS480689.1| GENE 171 212393 - 213460 1044 355 aa, chain + ## HITS:1 COG:lin2727 KEGG:ns NR:ns ## COG: lin2727 COG0642 # Protein_GI_number: 16801788 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 59 347 167 457 459 210 35.0 4e-54 MRHRRNRKPLGTMTLILTAGIFLIMLTVMTLQGFLMYLYWRIFYQADIPIPQFWRPIPLL AVLSAVVGAGLTLLLSRIPLKPIRDLIEAINQLADGNFKVRIHLDLNREFERLSESFNRM AQELENTELLRSDFINNFSHEFKTPIVSLRGFAKILKNDRLTKEERDEYLDIIISESNRL SQLSTNVLNLSKIEKMSILSDMESFDLSEEVRQSVLLLESKWQKKDLELFIDMDELEYRG NKALLNQVWINLIDNAIKFSPQNGKIKLKLHRKKDQVVFQILDNGCGMDEETKNHIFDRF YQGDSSHTAEGNGIGLTVVEKIVHLHKGQIRVVSEAGIGTTFTVNLPIIPPPSVL >gi|157101635|gb|DS480689.1| GENE 172 214048 - 215250 1078 400 aa, chain + ## HITS:1 COG:RSp0310 KEGG:ns NR:ns ## COG: RSp0310 COG0477 # Protein_GI_number: 17548531 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Ralstonia solanacearum # 6 380 44 415 450 137 29.0 4e-32 MNDYSRWKQKFLTIAAGQTVSLIGSSAVQFSLIWWLASETGSPMMMALSGLVAFLPQIFL GPFAGVWIDRLKRKHVATAADMFIGLVAVVFAVLLWVGHPPYWSACAVLGIRALGSVFHT PAIQSIVPQLVPPDKLVRANGWNQFMQSGAFMLGPVIGAAMYAALPLPVILLTDFIGACA ASLTMAVVTIPEPEHSHQEAPHLIREMKEGLSICMSDRRLMKLTAAAAITMVFYLPLASY 
YPLMSSTYFKASAWHGSAVEFLYALGMMVSAIIFGSLGKVNNKLKASYLGLAGSGITCFI CGILPSDMWAWYVFAATCMFMGGAGSAYNIPFVAYLQETIPAKAQGRAFSLLGSIMSLAM PVGLLISSPVAQIFGVNMWFLVSGIGILAVVAVCWLSGND >gi|157101635|gb|DS480689.1| GENE 173 215236 - 216570 1330 444 aa, chain - ## HITS:1 COG:BS_yrkQ KEGG:ns NR:ns ## COG: BS_yrkQ COG0642 # Protein_GI_number: 16079695 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 18 426 8 414 432 193 30.0 7e-49 MYGERGIGLSKSRRISRLGTELTLAAVASALVSLALYMALNTLLFGILDRVLFSEDRMIL REQACVERLQAYVTENGLSTNDGKMLDKWASGEKNLLITIYKDGRFRYSNDAKVMVMVPE MNEWEDTSWLYEVKFADGTAQAGVSYYVEYGYYAAASVLGGVISMAVFALMLVMLIRGKI RYIDLMEQEIQILKGGDLDYRITVKGKDELASLAAEIDAMRCAIKERQQKEEEAGKANRE LVTAMSHDLRTPLTSLLGYVDILQMEKGEDRGQQRKYLNSIRDKAYQIKELSDKLFEYFI VYGRKREELEAEEVNGAEFLGQIVEESLFDMESEGFDIRRSSDEINCRLLADINLCRRVF GNIFSNLLKYADRNRPVTVSYQQRADCLIICFGNYVAEDAQEKESTGIGLKTCAKIIGDH GGSFYSGREAGFFLTKISLPLIVS >gi|157101635|gb|DS480689.1| GENE 174 216539 - 216676 107 45 aa, chain - ## HITS:1 COG:BS_yrkP KEGG:ns NR:ns ## COG: BS_yrkP COG0745 # Protein_GI_number: 16079696 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 2 43 186 227 231 64 64.0 4e-11 GGEAYFYSANNTVMVHIRNLRRKLEADPKNPKYIVNVWGKGYRIE Prediction of potential genes in microbial genomes Time: Thu Jun 30 18:14:43 2011 Seq name: gi|157101634|gb|DS480690.1| Clostridium bolteae ATCC BAA-613 Scfld_02_31 genomic scaffold, whole genome shotgun sequence Length of sequence - 443324 bp Number of predicted genes - 466, with homology - 449 Number of transcription units - 202, operones - 96 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 99 141 ## + Term 128 - 187 6.5 + Prom 124 - 183 2.9 2 2 Tu 1 . + CDS 280 - 1119 580 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) + Prom 1133 - 1192 3.8 3 3 Tu 1 . 
+ CDS 1237 - 2817 1060 ## gi|160938790|ref|ZP_02086141.1| hypothetical protein CLOBOL_03684 + Term 2873 - 2936 -0.6 - Term 3069 - 3121 8.3 4 4 Tu 1 . - CDS 3141 - 4481 302 ## Cphy_1803 transposase - Prom 4513 - 4572 4.3 + Prom 4600 - 4659 5.7 5 5 Op 1 . + CDS 4696 - 5580 693 ## COG2768 Uncharacterized Fe-S center protein 6 5 Op 2 . + CDS 5619 - 5993 398 ## COG0789 Predicted transcriptional regulators + Prom 6029 - 6088 3.6 7 6 Op 1 . + CDS 6176 - 6796 512 ## gi|160938794|ref|ZP_02086145.1| hypothetical protein CLOBOL_03688 8 6 Op 2 . + CDS 6882 - 7346 354 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 9 6 Op 3 . + CDS 7401 - 8783 1341 ## Closa_2969 hypothetical protein 10 6 Op 4 . + CDS 8783 - 9616 829 ## Closa_2968 hypothetical protein 11 6 Op 5 . + CDS 9601 - 12993 3674 ## Closa_2967 SMC domain protein 12 6 Op 6 . + CDS 13029 - 14207 590 ## EUBREC_2907 hypothetical protein + Prom 14215 - 14274 7.7 13 7 Tu 1 . + CDS 14338 - 15696 1385 ## COG0372 Citrate synthase + Prom 15714 - 15773 7.0 14 8 Tu 1 . + CDS 15816 - 16025 169 ## CPR_0799 hypothetical protein + Prom 16088 - 16147 8.3 15 9 Op 1 . + CDS 16170 - 17033 816 ## CDR20291_2714 hypothetical protein + Term 17040 - 17081 8.1 16 9 Op 2 8/0.000 + CDS 17105 - 17449 332 ## COG2739 Uncharacterized protein conserved in bacteria 17 9 Op 3 23/0.000 + CDS 17494 - 18870 1735 ## COG0541 Signal recognition particle GTPase + Prom 18889 - 18948 2.6 18 9 Op 4 19/0.000 + CDS 18971 - 19213 341 ## PROTEIN SUPPORTED gi|160880540|ref|YP_001559508.1| ribosomal protein S16 19 9 Op 5 12/0.000 + CDS 19233 - 19460 315 ## COG1837 Predicted RNA-binding protein (contains KH domain) + Term 19487 - 19526 8.0 + Prom 19507 - 19566 4.0 20 9 Op 6 . + CDS 19587 - 20099 728 ## COG0806 RimM protein, required for 16S rRNA processing + Term 20120 - 20152 -0.9 + Prom 20221 - 20280 4.0 21 10 Tu 1 . 
+ CDS 20340 - 20549 76 ## gi|239623996|ref|ZP_04667027.1| predicted protein + Term 20593 - 20636 6.1 22 11 Op 1 12/0.000 + CDS 20671 - 22962 2043 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) 23 11 Op 2 . + CDS 22979 - 23938 1003 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 24 11 Op 3 . + CDS 23968 - 25263 1529 ## COG4100 Cystathionine beta-lyase family protein involved in aluminum resistance 25 11 Op 4 . + CDS 25358 - 25609 407 ## Closa_2047 hypothetical protein 26 11 Op 5 . + CDS 25654 - 26349 648 ## COG2003 DNA repair proteins 27 11 Op 6 . + CDS 26364 - 27209 1269 ## COG1792 Cell shape-determining protein 28 11 Op 7 . + CDS 27206 - 27739 403 ## Closa_2043 rod shape-determining protein MreD 29 11 Op 8 1/0.261 + CDS 27732 - 30653 3506 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 30 11 Op 9 22/0.000 + CDS 30675 - 31343 728 ## COG0850 Septum formation inhibitor 31 11 Op 10 . + CDS 31421 - 32212 895 ## COG2894 Septum formation inhibitor-activating ATPase 32 11 Op 11 . + CDS 32229 - 32513 169 ## Closa_2039 cell division topological specificity factor MinE 33 11 Op 12 . + CDS 32516 - 33637 1240 ## COG0772 Bacterial cell division membrane protein 34 11 Op 13 . + CDS 33659 - 34054 437 ## COG1803 Methylglyoxal synthase 35 11 Op 14 . + CDS 34054 - 35031 1080 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Prom 35051 - 35110 7.6 36 12 Tu 1 . + CDS 35134 - 35541 257 ## Closa_2035 hypothetical protein + Prom 35555 - 35614 1.9 37 13 Op 1 . + CDS 35669 - 37069 1419 ## Closa_2034 hypothetical protein 38 13 Op 2 14/0.000 + CDS 37084 - 37764 656 ## COG0325 Predicted enzyme with a TIM-barrel fold 39 13 Op 3 . + CDS 37812 - 38321 527 ## COG1799 Uncharacterized protein conserved in bacteria + Term 38363 - 38397 3.5 40 14 Op 1 . + CDS 38417 - 39511 1237 ## COG0337 3-dehydroquinate synthetase 41 14 Op 2 15/0.000 + CDS 39511 - 40020 559 ## COG0597 Lipoprotein signal peptidase 42 14 Op 3 . 
+ CDS 40052 - 40963 298 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit + Prom 41040 - 41099 9.3 43 15 Tu 1 . + CDS 41156 - 41794 756 ## Closa_2028 cytidylate kinase + Term 41842 - 41895 16.3 - Term 41830 - 41883 16.3 44 16 Op 1 . - CDS 41892 - 42437 353 ## gi|160938835|ref|ZP_02086186.1| hypothetical protein CLOBOL_03729 - Prom 42491 - 42550 3.5 45 16 Op 2 . - CDS 42594 - 44420 1755 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains - Prom 44447 - 44506 5.1 + Prom 44873 - 44932 5.4 46 17 Op 1 . + CDS 44975 - 46870 1023 ## Closa_2016 hypothetical protein 47 17 Op 2 . + CDS 46911 - 48851 1313 ## COG1316 Transcriptional regulator + Term 48983 - 49035 13.2 - Term 48881 - 48923 2.5 48 18 Tu 1 . - CDS 48998 - 49723 568 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control - Prom 49812 - 49871 7.0 + Prom 49818 - 49877 8.4 49 19 Tu 1 . + CDS 49902 - 50159 248 ## Closa_1508 Phosphotransferase system, phosphocarrier protein HPr + Prom 50201 - 50260 4.5 50 20 Op 1 1/0.261 + CDS 50306 - 51400 1166 ## COG3589 Uncharacterized conserved protein 51 20 Op 2 2/0.130 + CDS 51425 - 53344 2266 ## COG3711 Transcriptional antiterminator + Term 53423 - 53461 1.4 + Prom 53350 - 53409 6.5 52 21 Op 1 8/0.000 + CDS 53485 - 53823 508 ## COG1447 Phosphotransferase system cellobiose-specific component IIA 53 21 Op 2 10/0.000 + CDS 53904 - 54224 461 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 54 21 Op 3 . + CDS 54293 - 55627 1381 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 55 21 Op 4 . + CDS 55700 - 56224 671 ## Closa_1514 hypothetical protein 56 21 Op 5 . + CDS 56240 - 57220 1099 ## COG1446 Asparaginase 57 21 Op 6 . + CDS 57225 - 58442 1260 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 58 21 Op 7 . 
+ CDS 58493 - 59239 793 ## COG3142 Uncharacterized protein involved in copper resistance 59 21 Op 8 . + CDS 59243 - 60115 840 ## COG1284 Uncharacterized conserved protein + Term 60163 - 60207 -0.0 + Prom 60263 - 60322 8.4 60 22 Tu 1 . + CDS 60380 - 61114 560 ## gi|160938853|ref|ZP_02086204.1| hypothetical protein CLOBOL_03747 + Prom 61149 - 61208 5.8 61 23 Tu 1 . + CDS 61236 - 61343 100 ## + Term 61358 - 61398 4.2 - Term 61339 - 61392 10.7 62 24 Op 1 . - CDS 61420 - 61653 308 ## gi|160938855|ref|ZP_02086206.1| hypothetical protein CLOBOL_03749 - Prom 61715 - 61774 6.0 - Term 61830 - 61868 1.2 63 24 Op 2 . - CDS 61872 - 61967 118 ## - Prom 62189 - 62248 5.8 + Prom 62189 - 62248 7.8 64 25 Op 1 4/0.043 + CDS 62366 - 62785 616 ## COG0071 Molecular chaperone (small heat shock protein) 65 25 Op 2 . + CDS 62872 - 63333 480 ## COG0071 Molecular chaperone (small heat shock protein) + Term 63352 - 63416 3.1 + Prom 63387 - 63446 6.1 66 26 Op 1 . + CDS 63520 - 64344 543 ## gi|160938860|ref|ZP_02086211.1| hypothetical protein CLOBOL_03754 + Prom 64349 - 64408 6.3 67 26 Op 2 . + CDS 64444 - 65646 601 ## DSY1298 hypothetical protein 68 26 Op 3 . + CDS 65703 - 66233 243 ## gi|160938862|ref|ZP_02086213.1| hypothetical protein CLOBOL_03756 69 27 Tu 1 . - CDS 66321 - 67163 954 ## COG1284 Uncharacterized conserved protein - Prom 67294 - 67353 5.1 + Prom 67138 - 67197 8.1 70 28 Tu 1 . + CDS 67352 - 68011 694 ## Clocel_1110 suppressor of fused domain + Prom 68027 - 68086 1.7 71 29 Tu 1 . + CDS 68118 - 69356 945 ## gi|160938865|ref|ZP_02086216.1| hypothetical protein CLOBOL_03759 + Term 69373 - 69410 -1.0 + Prom 69373 - 69432 4.7 72 30 Op 1 . + CDS 69483 - 71555 1644 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase 73 30 Op 2 . + CDS 71558 - 72409 474 ## COG0384 Predicted epimerase, PhzC/PhzF homolog + Term 72519 - 72562 0.5 74 31 Tu 1 . 
- CDS 72464 - 73228 505 ## Asuc_1312 MerR family transcriptional regulator - Prom 73325 - 73384 4.3 + Prom 73274 - 73333 5.2 75 32 Tu 1 . + CDS 73360 - 74679 1075 ## COG0534 Na+-driven multidrug efflux pump + Term 74839 - 74876 -1.0 76 33 Tu 1 . - CDS 74596 - 74787 189 ## - Prom 74818 - 74877 4.8 + Prom 74759 - 74818 6.0 77 34 Tu 1 . + CDS 74909 - 75532 329 ## Cphy_0970 hypothetical protein 78 35 Tu 1 . - CDS 75704 - 75862 156 ## gi|160938871|ref|ZP_02086222.1| hypothetical protein CLOBOL_03765 - Prom 76073 - 76132 6.0 79 36 Op 1 . + CDS 76197 - 76745 397 ## gi|160938872|ref|ZP_02086223.1| hypothetical protein CLOBOL_03766 80 36 Op 2 . + CDS 76757 - 77716 517 ## gi|160938873|ref|ZP_02086224.1| hypothetical protein CLOBOL_03767 + Prom 77826 - 77885 4.9 81 37 Op 1 . + CDS 77920 - 79116 500 ## gi|160938874|ref|ZP_02086225.1| hypothetical protein CLOBOL_03768 82 37 Op 2 . + CDS 79117 - 79335 258 ## gi|160938875|ref|ZP_02086226.1| hypothetical protein CLOBOL_03769 + Prom 79344 - 79403 2.3 83 38 Op 1 . + CDS 79429 - 80403 349 ## gi|160938876|ref|ZP_02086227.1| hypothetical protein CLOBOL_03770 84 38 Op 2 . + CDS 80376 - 80519 140 ## gi|160938878|ref|ZP_02086229.1| hypothetical protein CLOBOL_03772 85 38 Op 3 . + CDS 80560 - 80682 105 ## 86 38 Op 4 . + CDS 80705 - 80989 292 ## gi|160938880|ref|ZP_02086231.1| hypothetical protein CLOBOL_03774 + Prom 81049 - 81108 5.3 87 39 Op 1 . + CDS 81195 - 81980 461 ## gi|160938882|ref|ZP_02086233.1| hypothetical protein CLOBOL_03776 + Prom 81991 - 82050 4.7 88 39 Op 2 . + CDS 82074 - 82994 254 ## gi|160938884|ref|ZP_02086235.1| hypothetical protein CLOBOL_03778 89 39 Op 3 . + CDS 82987 - 83166 243 ## gi|160938885|ref|ZP_02086236.1| hypothetical protein CLOBOL_03779 90 39 Op 4 . + CDS 83168 - 83368 186 ## gi|160938886|ref|ZP_02086237.1| hypothetical protein CLOBOL_03780 + Term 83484 - 83518 5.0 91 40 Op 1 . - CDS 83513 - 83779 309 ## gi|160938888|ref|ZP_02086239.1| hypothetical protein CLOBOL_03782 92 40 Op 2 . 
- CDS 83856 - 84107 129 ## gi|160938889|ref|ZP_02086240.1| hypothetical protein CLOBOL_03783 - Prom 84134 - 84193 7.2 + Prom 84151 - 84210 6.3 93 41 Op 1 . + CDS 84314 - 84550 165 ## gi|160938890|ref|ZP_02086241.1| hypothetical protein CLOBOL_03784 94 41 Op 2 . + CDS 84523 - 84723 161 ## gi|160938891|ref|ZP_02086242.1| hypothetical protein CLOBOL_03785 + Prom 84725 - 84784 2.5 95 41 Op 3 . + CDS 84810 - 85070 183 ## gi|160938892|ref|ZP_02086243.1| hypothetical protein CLOBOL_03786 96 42 Tu 1 . + CDS 85232 - 85450 223 ## gi|160938893|ref|ZP_02086244.1| hypothetical protein CLOBOL_03787 + Term 85520 - 85561 -0.7 + Prom 85977 - 86036 3.8 97 43 Tu 1 . + CDS 86057 - 86650 268 ## Closa_0878 3D domain protein + Term 86690 - 86723 5.4 98 44 Tu 1 . - CDS 86815 - 88497 1151 ## COG0840 Methyl-accepting chemotaxis protein - Prom 88613 - 88672 7.5 - Term 88948 - 88996 3.3 99 45 Tu 1 . - CDS 89019 - 89381 250 ## COG1733 Predicted transcriptional regulators - Prom 89458 - 89517 7.2 + Prom 89477 - 89536 6.1 100 46 Tu 1 . + CDS 89556 - 90071 545 ## COG0778 Nitroreductase + Term 90091 - 90144 6.4 - Term 90079 - 90132 10.2 101 47 Op 1 . - CDS 90155 - 90910 464 ## gi|160938904|ref|ZP_02086255.1| hypothetical protein CLOBOL_03798 102 47 Op 2 . - CDS 90903 - 92123 696 ## sce3342 hypothetical protein 103 47 Op 3 . - CDS 92120 - 93133 552 ## gi|160938907|ref|ZP_02086258.1| hypothetical protein CLOBOL_03801 - Prom 93374 - 93433 2.4 + Prom 93109 - 93168 5.2 104 48 Tu 1 . + CDS 93213 - 94160 722 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase 105 49 Op 1 1/0.261 - CDS 94206 - 95147 773 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III 106 49 Op 2 1/0.261 - CDS 95144 - 96466 973 ## COG1541 Coenzyme F390 synthetase 107 49 Op 3 . - CDS 96459 - 98387 1209 ## COG0451 Nucleoside-diphosphate-sugar epimerases 108 49 Op 4 . 
- CDS 98429 - 99535 721 ## COG1819 Glycosyl transferases, related to UDP-glucuronosyltransferase - Prom 99676 - 99735 4.9 109 50 Tu 1 . - CDS 99950 - 101491 1027 ## COG1690 Uncharacterized conserved protein - Prom 101589 - 101648 10.6 + Prom 101543 - 101602 8.0 110 51 Tu 1 . + CDS 101654 - 102706 1015 ## COG0582 Integrase + Term 102751 - 102801 4.0 - Term 102622 - 102648 1.0 111 52 Op 1 . - CDS 102852 - 104204 584 ## gi|160938916|ref|ZP_02086267.1| hypothetical protein CLOBOL_03810 112 52 Op 2 . - CDS 104257 - 104991 713 ## COG1285 Uncharacterized membrane protein - Prom 105020 - 105079 2.5 113 53 Op 1 3/0.130 - CDS 105182 - 107254 1190 ## COG4219 Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component 114 53 Op 2 . - CDS 107251 - 107637 442 ## COG3682 Predicted transcriptional regulator - Prom 107673 - 107732 8.9 115 54 Tu 1 . - CDS 107742 - 109808 2034 ## COG0840 Methyl-accepting chemotaxis protein - Prom 109852 - 109911 5.5 - Term 109870 - 109901 2.5 116 55 Op 1 . - CDS 109935 - 110372 642 ## COG0716 Flavodoxins 117 55 Op 2 . - CDS 110387 - 111004 475 ## Clole_3254 hypothetical protein - Prom 111160 - 111219 5.9 + Prom 111223 - 111282 8.0 118 56 Tu 1 . + CDS 111366 - 112256 688 ## COG0583 Transcriptional regulator + Term 112435 - 112464 0.5 - Term 112236 - 112277 8.7 119 57 Op 1 42/0.000 - CDS 112348 - 112764 367 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) 120 57 Op 2 42/0.000 - CDS 112798 - 114198 1663 ## COG0055 F0F1-type ATP synthase, beta subunit 121 57 Op 3 42/0.000 - CDS 114225 - 115118 945 ## COG0224 F0F1-type ATP synthase, gamma subunit 122 57 Op 4 . - CDS 115138 - 116661 1772 ## COG0056 F0F1-type ATP synthase, alpha subunit 123 57 Op 5 . - CDS 116672 - 117187 631 ## Closa_4167 ATP synthase F1, delta subunit 124 57 Op 6 . - CDS 117180 - 117671 522 ## COG0711 F0F1-type ATP synthase, subunit b 125 57 Op 7 . 
- CDS 117696 - 117920 394 ## EUBREC_2902 hypothetical protein 126 57 Op 8 . - CDS 117974 - 118663 852 ## COG0356 F0F1-type ATP synthase, subunit a - Prom 118748 - 118807 9.7 - Term 119015 - 119056 -0.6 127 58 Tu 1 . - CDS 119069 - 119368 258 ## Closa_2009 Peptidoglycan-binding lysin domain protein - Prom 119579 - 119638 4.1 + Prom 119471 - 119530 5.4 128 59 Tu 1 . + CDS 119551 - 120165 669 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) + Term 120294 - 120334 -0.2 - Term 120274 - 120328 11.7 129 60 Tu 1 . - CDS 120377 - 121264 323 ## gi|160938935|ref|ZP_02086286.1| hypothetical protein CLOBOL_03829 - Prom 121284 - 121343 8.5 - Term 121440 - 121507 14.4 130 61 Op 1 6/0.000 - CDS 121549 - 121899 425 ## COG0799 Uncharacterized homolog of plant Iojap protein 131 61 Op 2 9/0.000 - CDS 121850 - 122488 614 ## COG1713 Predicted HD superfamily hydrolase involved in NAD metabolism 132 61 Op 3 7/0.000 - CDS 122493 - 123119 550 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase 133 61 Op 4 1/0.261 - CDS 123180 - 123479 201 ## PROTEIN SUPPORTED gi|55821596|ref|YP_140038.1| hypothetical protein stu1620 134 61 Op 5 14/0.000 - CDS 123491 - 124780 1431 ## COG0536 Predicted GTPase - Term 124796 - 124834 6.0 135 61 Op 6 14/0.000 - CDS 124842 - 125132 484 ## PROTEIN SUPPORTED gi|239623849|ref|ZP_04666880.1| ribosomal protein L27 136 61 Op 7 14/0.000 - CDS 125136 - 125474 252 ## COG2868 Predicted ribosomal protein 137 61 Op 8 . - CDS 125486 - 125794 416 ## PROTEIN SUPPORTED gi|160880683|ref|YP_001559651.1| ribosomal protein L21 - Prom 125891 - 125950 6.0 138 62 Tu 1 . - CDS 125979 - 126587 539 ## gi|160938946|ref|ZP_02086297.1| hypothetical protein CLOBOL_03840 - Prom 126648 - 126707 5.6 139 63 Op 1 17/0.000 - CDS 126721 - 128136 1409 ## COG0168 Trk-type K+ transport systems, membrane components 140 63 Op 2 . 
- CDS 128177 - 129532 1654 ## COG0569 K+ transport systems, NAD-binding component 141 63 Op 3 2/0.130 - CDS 129568 - 130725 1010 ## COG1530 Ribonucleases G and E 142 63 Op 4 1/0.261 - CDS 130725 - 131480 734 ## COG5011 Uncharacterized protein conserved in bacteria 143 63 Op 5 . - CDS 131486 - 133354 1845 ## COG1032 Fe-S oxidoreductase - Prom 133564 - 133623 8.8 144 64 Tu 1 . - CDS 133836 - 134471 554 ## COG3153 Predicted acetyltransferase - Prom 134511 - 134570 7.5 145 65 Tu 1 . - CDS 135036 - 135530 304 ## Clole_4115 hypothetical protein - Prom 135561 - 135620 1.6 146 66 Op 1 . - CDS 135677 - 135751 84 ## 147 66 Op 2 . - CDS 135827 - 136249 230 ## Clole_0240 MerR family transcriptional regulator - Prom 136352 - 136411 8.6 - Term 136357 - 136387 0.3 148 67 Tu 1 . - CDS 136486 - 137736 910 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 137784 - 137843 5.5 + Prom 137985 - 138044 6.1 149 68 Tu 1 . + CDS 138147 - 138572 299 ## gi|160938958|ref|ZP_02086309.1| hypothetical protein CLOBOL_03852 + Term 138574 - 138619 11.7 - Term 138557 - 138609 6.2 150 69 Op 1 . - CDS 138667 - 139437 755 ## gi|160938959|ref|ZP_02086310.1| hypothetical protein CLOBOL_03853 151 69 Op 2 . - CDS 139424 - 141157 1517 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain - Prom 141272 - 141331 5.6 + Prom 141228 - 141287 3.8 152 70 Tu 1 . + CDS 141330 - 141842 305 ## BDI_0509 two-component hybrid sensor kinase/response regulator - Term 141590 - 141633 -0.1 153 71 Op 1 . - CDS 141849 - 142895 380 ## PROTEIN SUPPORTED gi|148987750|ref|ZP_01819213.1| ribose-phosphate pyrophosphokinase 154 71 Op 2 . - CDS 142978 - 143511 412 ## gi|160938963|ref|ZP_02086314.1| hypothetical protein CLOBOL_03857 155 71 Op 3 . - CDS 143537 - 143752 296 ## gi|291524993|emb|CBK90580.1| hypothetical protein EUR_14900 - Prom 143787 - 143846 3.5 156 71 Op 4 . - CDS 143852 - 143992 214 ## - Prom 144094 - 144153 6.8 - Term 144146 - 144194 1.1 157 72 Op 1 . 
- CDS 144441 - 144611 133 ## gi|160938966|ref|ZP_02086317.1| hypothetical protein CLOBOL_03860 158 72 Op 2 . - CDS 144694 - 144855 137 ## gi|160938967|ref|ZP_02086318.1| hypothetical protein CLOBOL_03861 - Prom 144893 - 144952 5.7 - Term 144956 - 145003 4.3 159 73 Op 1 . - CDS 145062 - 146006 1008 ## COG1705 Muramidase (flagellum-specific) 160 73 Op 2 . - CDS 146010 - 146438 478 ## COG4824 Phage-related holin (Lysis protein) 161 73 Op 3 . - CDS 146509 - 147549 473 ## COG3344 Retron-type reverse transcriptase - Prom 147696 - 147755 4.9 162 74 Op 1 . - CDS 147907 - 148296 218 ## Dtox_4257 hypothetical protein 163 74 Op 2 . - CDS 148356 - 149450 451 ## Tgr7_1617 hypothetical protein 164 74 Op 3 . - CDS 149450 - 149944 582 ## gi|160938973|ref|ZP_02086324.1| hypothetical protein CLOBOL_03867 165 74 Op 4 . - CDS 149966 - 151006 717 ## CLM_2920 hypothetical protein 166 74 Op 5 . - CDS 151026 - 152072 637 ## Ccel_0826 hypothetical protein 167 74 Op 6 . - CDS 152077 - 153030 485 ## CLB_2961 hypothetical protein 168 74 Op 7 . - CDS 153034 - 156027 1758 ## COG5280 Phage-related minor tail protein 169 74 Op 8 . - CDS 156020 - 156307 82 ## CLL_A2250 hypothetical protein 170 74 Op 9 . - CDS 156310 - 156660 450 ## gi|160938979|ref|ZP_02086330.1| hypothetical protein CLOBOL_03873 - Term 156675 - 156715 6.1 171 75 Op 1 . - CDS 156742 - 157296 508 ## CLL_A2252 hypothetical protein 172 75 Op 2 . - CDS 157299 - 157490 141 ## gi|160938981|ref|ZP_02086332.1| hypothetical protein CLOBOL_03875 - Prom 157546 - 157605 1.5 173 76 Op 1 . - CDS 157661 - 158053 173 ## L58927 prophage pi2 protein 37 174 76 Op 2 . - CDS 158050 - 158307 102 ## gi|160938983|ref|ZP_02086334.1| hypothetical protein CLOBOL_03877 175 76 Op 3 . - CDS 158300 - 158611 214 ## gi|160938984|ref|ZP_02086335.1| hypothetical protein CLOBOL_03878 176 76 Op 4 . - CDS 158630 - 159841 1006 ## BCZK3424 group-specific protein; phage major capsid protein 177 76 Op 5 . 
- CDS 159860 - 160633 558 ## COG0740 Protease subunit of ATP-dependent Clp proteases 178 76 Op 6 . - CDS 160639 - 161793 448 ## Pjdr2_1541 portal protein - Prom 161938 - 161997 2.5 179 77 Op 1 . - CDS 162022 - 163692 750 ## COG4626 Phage terminase-like protein, large subunit 180 77 Op 2 . - CDS 163696 - 164085 223 ## gi|160938989|ref|ZP_02086340.1| hypothetical protein CLOBOL_03883 181 78 Op 1 . - CDS 164590 - 164889 76 ## gi|160938991|ref|ZP_02086342.1| hypothetical protein CLOBOL_03885 182 78 Op 2 . - CDS 164983 - 165375 309 ## gi|160938992|ref|ZP_02086343.1| hypothetical protein CLOBOL_03886 183 78 Op 3 . - CDS 165372 - 166121 385 ## Clole_0728 VRR-NUC domain-containing protein 184 78 Op 4 . - CDS 166118 - 166288 113 ## gi|160938994|ref|ZP_02086345.1| hypothetical protein CLOBOL_03888 185 78 Op 5 . - CDS 166295 - 167221 766 ## Clole_0730 ParB domain protein nuclease 186 78 Op 6 . - CDS 167223 - 168053 381 ## COG1192 ATPases involved in chromosome partitioning 187 78 Op 7 . - CDS 168050 - 168970 329 ## CLL_A2281 hypothetical protein - Prom 169041 - 169100 5.0 188 79 Tu 1 . - CDS 169102 - 169242 56 ## - Prom 169303 - 169362 2.5 189 80 Op 1 . - CDS 169388 - 169690 261 ## gi|160939000|ref|ZP_02086351.1| hypothetical protein CLOBOL_03894 190 80 Op 2 . - CDS 169690 - 170040 320 ## gi|291525022|emb|CBK90609.1| hypothetical protein EUR_15250 191 80 Op 3 . - CDS 169997 - 170503 296 ## gi|160939002|ref|ZP_02086353.1| hypothetical protein CLOBOL_03896 192 80 Op 4 . - CDS 170504 - 170794 285 ## gi|160939003|ref|ZP_02086354.1| hypothetical protein CLOBOL_03897 193 80 Op 5 . - CDS 170813 - 171172 214 ## bpr_IV157 hypothetical protein 194 80 Op 6 . - CDS 171172 - 171516 379 ## gi|160939005|ref|ZP_02086356.1| hypothetical protein CLOBOL_03899 195 80 Op 7 . - CDS 171550 - 171870 257 ## gi|160939006|ref|ZP_02086357.1| hypothetical protein CLOBOL_03900 196 81 Op 1 . - CDS 172011 - 172304 211 ## gi|160939008|ref|ZP_02086359.1| hypothetical protein CLOBOL_03902 197 81 Op 2 . 
- CDS 172314 - 172967 357 ## gi|160939009|ref|ZP_02086360.1| hypothetical protein CLOBOL_03903 198 82 Tu 1 . - CDS 173127 - 173909 176 ## COG0270 Site-specific DNA methylase - Prom 174065 - 174124 5.3 199 83 Op 1 . - CDS 174353 - 174577 187 ## gi|160939013|ref|ZP_02086364.1| hypothetical protein CLOBOL_03907 200 83 Op 2 . - CDS 174613 - 174795 97 ## gi|160939014|ref|ZP_02086365.1| hypothetical protein CLOBOL_03908 - Prom 174819 - 174878 6.0 + Prom 174824 - 174883 6.2 201 84 Tu 1 . + CDS 175034 - 176863 1558 ## COG0642 Signal transduction histidine kinase + Term 176879 - 176928 4.1 - Term 176866 - 176916 6.1 202 85 Op 1 2/0.130 - CDS 176960 - 177973 760 ## COG0500 SAM-dependent methyltransferases - Prom 177993 - 178052 1.8 203 85 Op 2 . - CDS 178056 - 180278 2165 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 - Prom 180341 - 180400 4.2 - Term 180393 - 180432 1.1 204 86 Tu 1 . - CDS 180444 - 182186 1313 ## gi|160939019|ref|ZP_02086370.1| hypothetical protein CLOBOL_03913 - Prom 182326 - 182385 5.4 205 87 Op 1 10/0.000 + CDS 182451 - 183356 706 ## COG0379 Quinolinate synthase 206 87 Op 2 13/0.000 + CDS 183421 - 184638 807 ## COG0029 Aspartate oxidase 207 87 Op 3 1/0.261 + CDS 184607 - 185464 548 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 208 87 Op 4 . + CDS 185498 - 186046 540 ## COG1827 Predicted small molecule binding protein (contains 3H domain) 209 87 Op 5 . + CDS 186060 - 186866 253 ## COG1408 Predicted phosphohydrolases + Term 186956 - 186985 -0.7 210 88 Op 1 . - CDS 186818 - 187138 133 ## gi|239625675|ref|ZP_04668706.1| predicted protein 211 88 Op 2 . - CDS 187015 - 187638 317 ## gi|160939025|ref|ZP_02086376.1| hypothetical protein CLOBOL_03919 - Prom 187659 - 187718 4.7 + Prom 187673 - 187732 5.1 212 89 Op 1 . + CDS 187761 - 188069 283 ## gi|160939026|ref|ZP_02086377.1| hypothetical protein CLOBOL_03920 213 89 Op 2 . 
+ CDS 188066 - 189517 1006 ## COG3463 Predicted membrane protein + Term 189691 - 189749 4.0 214 90 Tu 1 . - CDS 189972 - 190532 409 ## COG1859 RNA:NAD 2'-phosphotransferase 215 91 Tu 1 . - CDS 190653 - 191408 642 ## COG0084 Mg-dependent DNase - Prom 191599 - 191658 6.1 216 92 Tu 1 . - CDS 191660 - 192670 867 ## ELI_3952 hypothetical protein - Prom 192763 - 192822 8.2 + Prom 192627 - 192686 4.8 217 93 Tu 1 . + CDS 192905 - 193339 446 ## CKR_1355 hypothetical protein 218 94 Tu 1 . - CDS 193392 - 195479 1937 ## COG2200 FOG: EAL domain - Prom 195525 - 195584 9.2 + Prom 195565 - 195624 9.5 219 95 Op 1 . + CDS 195646 - 195966 301 ## COG1695 Predicted transcriptional regulators 220 95 Op 2 . + CDS 195959 - 196915 262 ## gi|160939037|ref|ZP_02086388.1| hypothetical protein CLOBOL_03931 221 95 Op 3 . + CDS 196908 - 197450 470 ## gi|160939038|ref|ZP_02086389.1| hypothetical protein CLOBOL_03932 + Term 197500 - 197543 10.1 222 96 Op 1 . - CDS 197566 - 198834 1140 ## COG2195 Di- and tripeptidases 223 96 Op 2 . - CDS 198894 - 200207 1382 ## COG0139 Phosphoribosyl-AMP cyclohydrolase 224 96 Op 3 5/0.000 - CDS 200234 - 200953 872 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase 225 96 Op 4 6/0.000 - CDS 201034 - 201621 776 ## COG0131 Imidazoleglycerol-phosphate dehydratase 226 96 Op 5 18/0.000 - CDS 201640 - 202932 1523 ## COG0141 Histidinol dehydrogenase 227 96 Op 6 11/0.000 - CDS 202972 - 203625 753 ## COG0040 ATP phosphoribosyltransferase 228 96 Op 7 . - CDS 203648 - 204913 1535 ## COG3705 ATP phosphoribosyltransferase involved in histidine biosynthesis - Prom 205086 - 205145 7.9 229 97 Tu 1 . - CDS 205295 - 205828 526 ## COG3331 Penicillin-binding protein-related factor A, putative recombinase - Prom 205898 - 205957 8.1 230 98 Op 1 . + CDS 206398 - 206892 291 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 231 98 Op 2 . 
+ CDS 206960 - 207835 560 ## Elen_0591 hypothetical protein - Term 207782 - 207825 3.1 232 99 Op 1 . - CDS 207832 - 208563 690 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 233 99 Op 2 . - CDS 208640 - 209095 440 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 234 99 Op 3 . - CDS 209135 - 210526 758 ## Clole_4025 acyl-CoA reductase 235 99 Op 4 . - CDS 210556 - 211629 583 ## Clole_4024 acyl-protein synthetase LuxE 236 99 Op 5 . - CDS 211649 - 212245 248 ## gi|160939058|ref|ZP_02086409.1| hypothetical protein CLOBOL_03952 - Prom 212462 - 212521 6.5 + Prom 212414 - 212473 8.1 237 100 Op 1 2/0.130 + CDS 212502 - 213107 480 ## COG1309 Transcriptional regulator 238 100 Op 2 . + CDS 213135 - 213797 523 ## COG0778 Nitroreductase + Term 213845 - 213899 13.1 - Term 213832 - 213885 15.1 239 101 Op 1 . - CDS 213887 - 214687 890 ## COG0789 Predicted transcriptional regulators - Prom 214730 - 214789 7.0 240 101 Op 2 . - CDS 214819 - 215613 550 ## gi|160939062|ref|ZP_02086413.1| hypothetical protein CLOBOL_03956 - Prom 215656 - 215715 1.8 241 102 Tu 1 . - CDS 215724 - 215843 97 ## gi|160939063|ref|ZP_02086414.1| hypothetical protein CLOBOL_03957 - Prom 215891 - 215950 3.0 - Term 215907 - 215944 4.0 242 103 Tu 1 . - CDS 215977 - 217071 1281 ## COG0180 Tryptophanyl-tRNA synthetase - Prom 217219 - 217278 6.1 - Term 217228 - 217261 0.2 243 104 Op 1 . - CDS 217302 - 218402 911 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 244 104 Op 2 . - CDS 218427 - 219191 839 ## COG0300 Short-chain dehydrogenases of various substrate specificities 245 104 Op 3 . - CDS 219197 - 221110 1820 ## COG1032 Fe-S oxidoreductase 246 104 Op 4 1/0.261 - CDS 221128 - 221790 801 ## COG0637 Predicted phosphatase/phosphohexomutase 247 104 Op 5 . - CDS 221865 - 222653 887 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 248 104 Op 6 . - CDS 222690 - 222959 60 ## Closa_1969 Fmu (Sun) domain protein 249 104 Op 7 . 
- CDS 222931 - 224097 932 ## COG0144 tRNA and rRNA cytosine-C5-methylases 250 104 Op 8 . - CDS 224156 - 225523 1395 ## COG0534 Na+-driven multidrug efflux pump - Prom 225554 - 225613 6.6 + Prom 225516 - 225575 7.4 251 105 Tu 1 . + CDS 225685 - 226974 970 ## Closa_2898 acyltransferase 3 + Term 227022 - 227068 7.7 - Term 227010 - 227055 11.3 252 106 Op 1 1/0.261 - CDS 227120 - 229165 1936 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 253 106 Op 2 . - CDS 229166 - 231634 1708 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases 254 106 Op 3 38/0.000 - CDS 231648 - 232490 858 ## COG0395 ABC-type sugar transport system, permease component 255 106 Op 4 35/0.000 - CDS 232503 - 233405 796 ## COG1175 ABC-type sugar transport systems, permease components - Term 233417 - 233446 1.1 256 106 Op 5 . - CDS 233453 - 234754 1445 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 234786 - 234845 6.4 + Prom 234953 - 235012 7.4 257 107 Op 1 7/0.000 + CDS 235094 - 236764 1316 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 258 107 Op 2 . + CDS 236770 - 238200 1129 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Term 238138 - 238172 -1.0 259 108 Tu 1 . - CDS 238190 - 239053 708 ## COG0739 Membrane proteins related to metalloendopeptidases - Prom 239082 - 239141 4.3 260 109 Tu 1 . + CDS 239100 - 240053 611 ## COG3314 Uncharacterized protein conserved in bacteria - Term 239954 - 239986 3.2 261 110 Op 1 . - CDS 240077 - 240715 798 ## Closa_1966 hypothetical protein 262 110 Op 2 14/0.000 - CDS 240712 - 241206 406 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 263 110 Op 3 . - CDS 241207 - 241764 266 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 - Prom 241813 - 241872 6.8 264 111 Tu 1 . - CDS 241883 - 242560 757 ## COG2013 Uncharacterized conserved protein - Prom 242591 - 242650 6.9 265 112 Tu 1 . 
- CDS 242730 - 243728 683 ## COG1073 Hydrolases of the alpha/beta superfamily - Prom 243758 - 243817 6.0 + Prom 243720 - 243779 5.8 266 113 Tu 1 . + CDS 243954 - 244157 237 ## Closa_1963 small acid-soluble spore protein alpha/beta type + Term 244219 - 244263 9.2 - Term 244200 - 244258 13.5 267 114 Tu 1 . - CDS 244335 - 244850 658 ## COG0778 Nitroreductase - Prom 244879 - 244938 3.3 268 115 Op 1 . - CDS 244955 - 247012 2275 ## COG1200 RecG-like helicase 269 115 Op 2 2/0.130 - CDS 247036 - 247908 1014 ## COG1940 Transcriptional regulator/sugar kinase - Prom 247969 - 248028 7.9 270 116 Tu 1 1/0.261 - CDS 248068 - 249036 301 ## PROTEIN SUPPORTED gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 - Prom 249058 - 249117 2.7 - Term 249065 - 249119 10.3 271 117 Op 1 . - CDS 249167 - 249871 837 ## COG0670 Integral membrane protein, interacts with FtsH 272 117 Op 2 . - CDS 249892 - 250500 657 ## Cthe_0317 hypothetical protein 273 117 Op 3 9/0.000 - CDS 250535 - 252259 1983 ## COG1461 Predicted kinase related to dihydroxyacetone kinase 274 117 Op 4 . - CDS 252273 - 252629 465 ## COG1302 Uncharacterized protein conserved in bacteria - Prom 252684 - 252743 8.2 + Prom 252683 - 252742 8.0 275 118 Tu 1 . + CDS 252799 - 252984 294 ## PROTEIN SUPPORTED gi|227872680|ref|ZP_03991010.1| possible ribosomal protein L28 + Term 253010 - 253054 2.1 - Term 253094 - 253157 8.0 276 119 Op 1 . - CDS 253208 - 253345 182 ## 277 119 Op 2 . - CDS 253415 - 253843 520 ## Closa_1955 Sporulation protein YtfJ 278 119 Op 3 . - CDS 253927 - 254982 791 ## Closa_1954 hypothetical protein 279 119 Op 4 . - CDS 254979 - 255302 292 ## Closa_1953 hypothetical protein 280 119 Op 5 . 
- CDS 255367 - 256707 1370 ## Closa_1952 hypothetical protein 281 119 Op 6 4/0.043 - CDS 256770 - 258014 321 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 282 119 Op 7 2/0.130 - CDS 258016 - 258558 379 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 283 119 Op 8 . - CDS 258542 - 259897 816 ## PROTEIN SUPPORTED gi|228000795|ref|ZP_04047796.1| SSU ribosomal protein S12P methylthiotransferase - Term 259907 - 259965 -0.7 284 119 Op 9 . - CDS 259994 - 260257 319 ## Closa_1948 DNA-directed RNA polymerase, omega subunit 285 119 Op 10 8/0.000 - CDS 260303 - 260935 679 ## COG0194 Guanylate kinase 286 119 Op 11 . - CDS 260979 - 261857 1119 ## COG1561 Uncharacterized stress-induced protein - Prom 261892 - 261951 5.5 287 120 Tu 1 . + CDS 262184 - 263929 1673 ## COG1293 Predicted RNA-binding protein homologous to eukaryotic snRNP + Term 264104 - 264151 5.1 - Term 264091 - 264139 9.1 288 121 Op 1 . - CDS 264177 - 264353 285 ## PROTEIN SUPPORTED gi|227872333|ref|ZP_03990687.1| ribosomal protein S21 - Prom 264422 - 264481 9.4 289 121 Op 2 . - CDS 264508 - 267156 2838 ## COG0013 Alanyl-tRNA synthetase - Prom 267191 - 267250 9.8 - Term 267234 - 267282 9.1 290 122 Tu 1 . - CDS 267339 - 268304 738 ## Closa_2075 hypothetical protein - Prom 268362 - 268421 5.3 - Term 268415 - 268458 -0.8 291 123 Op 1 . - CDS 268517 - 268999 456 ## COG0328 Ribonuclease HI 292 123 Op 2 . - CDS 269027 - 270628 1822 ## COG1236 Predicted exonuclease of the beta-lactamase fold involved in RNA processing - Prom 270687 - 270746 5.4 293 124 Tu 1 . 
- CDS 270795 - 272270 1683 ## Closa_2078 stage IV sporulation protein A - Prom 272394 - 272453 4.5 - Term 272426 - 272465 6.1 294 125 Op 1 18/0.000 - CDS 272526 - 273236 268 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 295 125 Op 2 19/0.000 - CDS 273251 - 274009 266 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 296 125 Op 3 24/0.000 - CDS 274011 - 275084 1503 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 297 125 Op 4 . - CDS 275093 - 275974 1212 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 298 125 Op 5 . - CDS 275977 - 276078 67 ## 299 125 Op 6 . - CDS 276138 - 277376 1509 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component - Prom 277412 - 277471 7.4 - Term 277503 - 277544 -0.5 300 126 Op 1 1/0.261 - CDS 277577 - 278605 1198 ## COG0240 Glycerol-3-phosphate dehydrogenase 301 126 Op 2 2/0.130 - CDS 278633 - 279274 538 ## COG0344 Predicted membrane protein 302 126 Op 3 . - CDS 279299 - 280627 1774 ## COG1160 Predicted GTPases 303 126 Op 4 . - CDS 280660 - 280854 254 ## Ccel_2580 hypothetical protein - Prom 280938 - 280997 4.9 + Prom 280898 - 280957 5.1 304 127 Tu 1 . + CDS 281125 - 282333 1236 ## COG0462 Phosphoribosylpyrophosphate synthetase + Term 282412 - 282448 -1.0 - Term 282364 - 282434 13.5 305 128 Op 1 . - CDS 282450 - 283496 1081 ## BBR47_51510 hypothetical protein 306 128 Op 2 . - CDS 283533 - 284387 926 ## COG1496 Uncharacterized conserved protein 307 128 Op 3 1/0.261 - CDS 284409 - 284750 226 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 308 128 Op 4 8/0.000 - CDS 284743 - 285405 835 ## COG0164 Ribonuclease HII 309 128 Op 5 2/0.130 - CDS 285408 - 286274 1112 ## COG1161 Predicted GTPases 310 128 Op 6 5/0.000 - CDS 286292 - 286843 599 ## COG0681 Signal peptidase I - Term 286898 - 286933 5.3 311 128 Op 7 . 
- CDS 286954 - 287301 468 ## PROTEIN SUPPORTED gi|160880535|ref|YP_001559503.1| ribosomal protein L19 - Prom 287400 - 287459 5.8 312 129 Tu 1 . - CDS 287475 - 288752 1611 ## COG1253 Hemolysins and related proteins containing CBS domains - Prom 288846 - 288905 5.7 - Term 289046 - 289086 0.4 313 130 Op 1 . - CDS 289088 - 290005 852 ## COG0336 tRNA-(guanine-N1)-methyltransferase 314 130 Op 2 1/0.261 - CDS 290002 - 292674 3464 ## COG0249 Mismatch repair ATPase (MutS family) 315 130 Op 3 . - CDS 292751 - 294259 655 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 - Prom 294385 - 294444 5.5 316 131 Tu 1 . - CDS 294481 - 296223 2063 ## COG1288 Predicted membrane protein - Prom 296243 - 296302 2.4 317 132 Op 1 . - CDS 296343 - 296804 471 ## Dred_0591 hypothetical protein 318 132 Op 2 . - CDS 296848 - 297042 302 ## COG1476 Predicted transcriptional regulators - Prom 297172 - 297231 13.2 319 133 Tu 1 . - CDS 297245 - 297619 373 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 297657 - 297716 10.2 + Prom 297697 - 297756 11.8 320 134 Tu 1 . + CDS 297845 - 298756 828 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 298641 - 298671 2.5 321 135 Op 1 . - CDS 298723 - 300021 911 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 322 135 Op 2 . - CDS 300051 - 300320 265 ## Cphy_3923 hypothetical protein 323 135 Op 3 . - CDS 300412 - 301065 627 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 301109 - 301168 3.6 + Prom 301060 - 301119 5.6 324 136 Tu 1 . + CDS 301192 - 302787 1584 ## COG1151 6Fe-6S prismane cluster-containing protein + Term 302800 - 302854 17.8 325 137 Op 1 . - CDS 302842 - 303612 879 ## COG3022 Uncharacterized protein conserved in bacteria 326 137 Op 2 . - CDS 303636 - 304217 711 ## COG0681 Signal peptidase I - Term 304242 - 304281 2.3 327 138 Op 1 . 
- CDS 304298 - 305539 1474 ## COG0014 Gamma-glutamyl phosphate reductase 328 138 Op 2 . - CDS 305563 - 305784 273 ## Cphy_2602 hypothetical protein 329 138 Op 3 . - CDS 305829 - 306692 1158 ## COG0263 Glutamate 5-kinase - Prom 306718 - 306777 6.1 330 139 Op 1 . - CDS 306781 - 308469 1605 ## COG2509 Uncharacterized FAD-dependent dehydrogenases 331 139 Op 2 . - CDS 308466 - 309791 1406 ## COG2081 Predicted flavoproteins - Prom 309837 - 309896 8.0 - Term 310100 - 310141 1.2 332 140 Tu 1 . - CDS 310143 - 310541 351 ## Closa_2065 hypothetical protein - Prom 310600 - 310659 2.3 333 141 Op 1 . - CDS 310693 - 312390 1671 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 334 141 Op 2 . - CDS 312396 - 313547 1258 ## Closa_2067 hypothetical protein 335 141 Op 3 17/0.000 - CDS 313565 - 314062 688 ## COG0319 Predicted metal-dependent hydrolase 336 141 Op 4 . - CDS 314083 - 315063 1266 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase 337 141 Op 5 . - CDS 315078 - 316337 958 ## Closa_2070 sporulation protein YqfD 338 141 Op 6 . - CDS 316356 - 316625 273 ## Closa_2071 hypothetical protein - Prom 316651 - 316710 4.5 + Prom 316687 - 316746 2.5 339 142 Tu 1 . + CDS 316767 - 317678 797 ## Closa_1701 hypothetical protein + Term 317744 - 317792 12.7 - Term 317726 - 317787 20.1 340 143 Tu 1 . - CDS 317796 - 319673 1367 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Prom 319879 - 319938 6.4 341 144 Tu 1 . + CDS 320009 - 321352 1057 ## COG2610 H+/gluconate symporter and related permeases - Term 321597 - 321657 3.9 342 145 Op 1 . - CDS 321677 - 322285 271 ## COG2184 Protein involved in cell division 343 145 Op 2 . - CDS 322298 - 322525 267 ## gi|160939173|ref|ZP_02086524.1| hypothetical protein CLOBOL_04067 - Prom 322549 - 322608 7.5 - Term 322627 - 322664 2.4 344 146 Op 1 . 
- CDS 322672 - 324162 351 ## COG0286 Type I restriction-modification system methyltransferase subunit 345 146 Op 2 . - CDS 324167 - 324772 83 ## gi|160939175|ref|ZP_02086526.1| hypothetical protein CLOBOL_04069 346 147 Tu 1 . - CDS 325662 - 325763 61 ## - Prom 325861 - 325920 2.9 347 148 Tu 1 . - CDS 325922 - 326317 183 ## gi|160939177|ref|ZP_02086528.1| hypothetical protein CLOBOL_04071 - Prom 326532 - 326591 6.4 - Term 326628 - 326656 -0.9 348 149 Tu 1 . - CDS 326691 - 326873 159 ## gi|160939178|ref|ZP_02086529.1| hypothetical protein CLOBOL_04072 - Prom 326896 - 326955 2.6 349 150 Op 1 . - CDS 326984 - 327340 157 ## gi|160939179|ref|ZP_02086530.1| hypothetical protein CLOBOL_04073 350 150 Op 2 . - CDS 327353 - 328924 756 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 351 150 Op 3 . - CDS 328908 - 330092 533 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 352 150 Op 4 . - CDS 330089 - 330283 111 ## Swol_1464 hypothetical protein - Prom 330347 - 330406 7.4 - Term 330396 - 330435 5.7 353 151 Op 1 . - CDS 330539 - 332158 886 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 354 151 Op 2 . - CDS 332214 - 333080 441 ## Dtox_1525 recombinase 355 151 Op 3 . - CDS 333083 - 334648 898 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 356 151 Op 4 . - CDS 334666 - 334869 148 ## gi|160939186|ref|ZP_02086537.1| hypothetical protein CLOBOL_04080 - Prom 334955 - 335014 2.3 - Term 335364 - 335404 -0.0 357 152 Tu 1 . - CDS 335521 - 335934 184 ## gi|160939188|ref|ZP_02086539.1| hypothetical protein CLOBOL_04082 + Prom 336483 - 336542 8.0 358 153 Tu 1 . + CDS 336767 - 337153 76 ## COG3682 Predicted transcriptional regulator + Term 337265 - 337311 -0.9 + Prom 337643 - 337702 5.0 359 154 Tu 1 . + CDS 337839 - 338951 99 ## COG2602 Beta-lactamase class D + Term 339117 - 339147 4.3 + Prom 340507 - 340566 7.8 360 155 Tu 1 . 
+ CDS 340586 - 341479 251 ## COG2367 Beta-lactamase class A + Term 341690 - 341729 8.6 - Term 341678 - 341717 8.6 361 156 Tu 1 . - CDS 341847 - 342293 384 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases - Prom 342351 - 342410 5.8 362 157 Tu 1 . - CDS 342690 - 342755 73 ## - Prom 342803 - 342862 6.2 - Term 343497 - 343546 5.3 363 158 Op 1 . - CDS 343789 - 344049 116 ## Ccel_2737 hypothetical protein 364 158 Op 2 . - CDS 344042 - 346018 657 ## COG3505 Type IV secretory pathway, VirD4 components 365 158 Op 3 . - CDS 346023 - 347606 579 ## Rumal_1401 hypothetical protein 366 158 Op 4 . - CDS 347575 - 348039 257 ## gi|160939202|ref|ZP_02086553.1| hypothetical protein CLOBOL_04096 367 158 Op 5 . - CDS 348052 - 348903 386 ## Closa_3128 hypothetical protein 368 158 Op 6 . - CDS 348916 - 350241 566 ## DSY0030 hypothetical protein 369 158 Op 7 . - CDS 350286 - 350849 537 ## HM1_0571 hypothetical protein 370 158 Op 8 . - CDS 350846 - 351292 159 ## DSY0031 hypothetical protein 371 158 Op 9 . - CDS 351264 - 351401 85 ## gi|160936491|ref|ZP_02083859.1| hypothetical protein CLOBOL_01382 - Prom 351472 - 351531 5.1 372 159 Tu 1 . + CDS 351537 - 352703 341 ## COG3547 Transposase and inactivated derivatives + Term 352749 - 352783 1.1 - Term 353100 - 353135 0.0 373 160 Op 1 . - CDS 353338 - 353682 245 ## ELI_1911 hypothetical protein 374 160 Op 2 . - CDS 353779 - 354498 22 ## ELI_1911 hypothetical protein - Prom 354535 - 354594 5.4 + Prom 354311 - 354370 4.1 375 161 Tu 1 . + CDS 354593 - 355210 78 ## gi|160939212|ref|ZP_02086563.1| hypothetical protein CLOBOL_04106 - Term 355188 - 355225 6.4 376 162 Op 1 . - CDS 355254 - 355940 138 ## COG3153 Predicted acetyltransferase 377 162 Op 2 . - CDS 355900 - 356979 264 ## COG1162 Predicted GTPases 378 162 Op 3 . - CDS 357023 - 357094 96 ## - Prom 357117 - 357176 1.7 - Term 357140 - 357177 8.0 379 163 Op 1 . 
- CDS 357184 - 357456 232 ## gi|160939216|ref|ZP_02086567.1| hypothetical protein CLOBOL_04110 380 163 Op 2 . - CDS 357478 - 358194 521 ## gi|160939217|ref|ZP_02086568.1| hypothetical protein CLOBOL_04111 381 163 Op 3 . - CDS 358213 - 358566 361 ## Dhaf_2445 hypothetical protein 382 163 Op 4 . - CDS 358582 - 359532 778 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 383 163 Op 5 . - CDS 359555 - 359890 342 ## Dhaf_2422 hypothetical protein 384 163 Op 6 . - CDS 359910 - 361721 770 ## COG3451 Type IV secretory pathway, VirB4 components 385 163 Op 7 . - CDS 361718 - 362311 408 ## CKR_3422 hypothetical protein 386 163 Op 8 . - CDS 362247 - 362573 212 ## LM5578_1878 hypothetical protein 387 163 Op 9 . - CDS 362591 - 362956 324 ## TepRe1_2440 hypothetical protein 388 163 Op 10 . - CDS 362971 - 363276 223 ## Tthe_0786 hypothetical protein 389 163 Op 11 . - CDS 363308 - 364168 713 ## Ccel_2764 hypothetical protein 390 163 Op 12 . - CDS 364193 - 364729 421 ## gi|160939227|ref|ZP_02086578.1| hypothetical protein CLOBOL_04121 391 163 Op 13 . - CDS 364750 - 365097 312 ## Sgly_0055 hypothetical protein - Term 365111 - 365144 6.1 392 164 Op 1 . - CDS 365158 - 370944 3177 ## COG4932 Predicted outer membrane protein - Prom 370966 - 371025 3.0 393 164 Op 2 . - CDS 371039 - 372097 485 ## COG2340 Uncharacterized protein with SCP/PR1 domains - Prom 372123 - 372182 4.1 394 165 Tu 1 . - CDS 372253 - 372978 510 ## COG3764 Sortase (surface protein transpeptidase) - Prom 373063 - 373122 3.7 395 166 Op 1 . - CDS 373133 - 374287 910 ## gi|160939233|ref|ZP_02086584.1| hypothetical protein CLOBOL_04127 396 166 Op 2 . - CDS 374284 - 374463 166 ## gi|160939234|ref|ZP_02086585.1| hypothetical protein CLOBOL_04128 397 166 Op 3 . - CDS 374480 - 374788 344 ## gi|160939235|ref|ZP_02086586.1| hypothetical protein CLOBOL_04129 398 166 Op 4 . - CDS 374804 - 375766 742 ## Dtox_1475 D12 class N6 adenine-specific DNA methyltransferase 399 166 Op 5 . 
- CDS 375817 - 376749 441 ## gi|160939237|ref|ZP_02086588.1| hypothetical protein CLOBOL_04131 400 166 Op 6 . - CDS 376785 - 377123 379 ## gi|160939238|ref|ZP_02086589.1| hypothetical protein CLOBOL_04132 - Term 377141 - 377173 -0.0 401 166 Op 7 . - CDS 377174 - 377839 441 ## gi|160939239|ref|ZP_02086590.1| hypothetical protein CLOBOL_04133 402 166 Op 8 . - CDS 377842 - 378942 141 ## COG0270 Site-specific DNA methylase - Prom 378998 - 379057 2.3 - Term 379085 - 379126 11.3 403 167 Tu 1 . - CDS 379132 - 379533 266 ## gi|160939241|ref|ZP_02086592.1| hypothetical protein CLOBOL_04135 - Prom 379567 - 379626 1.8 - Term 379678 - 379712 3.9 404 168 Op 1 . - CDS 379726 - 380463 327 ## Ccel_2964 hypothetical protein 405 168 Op 2 . - CDS 380473 - 380799 300 ## gi|160939244|ref|ZP_02086595.1| hypothetical protein CLOBOL_04138 406 168 Op 3 . - CDS 380815 - 381186 325 ## LM5578_1871 hypothetical protein - Prom 381251 - 381310 7.8 - Term 381453 - 381486 0.0 407 169 Tu 1 . - CDS 381592 - 382320 1136 ## COG0217 Uncharacterized conserved protein - Prom 382409 - 382468 6.2 408 170 Op 1 . - CDS 382496 - 383848 1586 ## COG0534 Na+-driven multidrug efflux pump 409 170 Op 2 . - CDS 383838 - 385313 1511 ## COG1253 Hemolysins and related proteins containing CBS domains - Prom 385407 - 385466 5.2 - Term 385364 - 385411 6.0 410 171 Tu 1 . - CDS 385494 - 385841 348 ## COG2337 Growth inhibitor - Prom 385867 - 385926 1.7 - Term 385893 - 385929 5.2 411 172 Tu 1 . - CDS 385990 - 387180 1360 ## COG0787 Alanine racemase - Prom 387225 - 387284 2.2 412 173 Op 1 . - CDS 387304 - 387393 63 ## 413 173 Op 2 1/0.261 - CDS 387399 - 388940 529 ## PROTEIN SUPPORTED gi|126666946|ref|ZP_01737922.1| Ribosomal protein S15 414 173 Op 3 . - CDS 388900 - 389319 362 ## COG0736 Phosphopantetheinyl transferase (holo-ACP synthase) 415 173 Op 4 . - CDS 389340 - 390002 863 ## COG2344 AT-rich DNA-binding protein - Prom 390046 - 390105 8.9 + Prom 390005 - 390064 3.9 416 174 Op 1 . 
+ CDS 390142 - 392118 2412 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 417 174 Op 2 . + CDS 392141 - 392377 337 ## gi|160939256|ref|ZP_02086607.1| hypothetical protein CLOBOL_04150 + Term 392417 - 392469 1.1 - Term 392404 - 392453 2.1 418 175 Op 1 . - CDS 392583 - 393764 1138 ## Dacet_0350 major facilitator superfamily MFS_1 419 175 Op 2 . - CDS 393836 - 395032 1105 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 420 175 Op 3 . - CDS 395071 - 396228 1200 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 421 176 Tu 1 . - CDS 396336 - 397511 1441 ## COG5505 Predicted integral membrane protein - Prom 397609 - 397668 9.3 + Prom 398003 - 398062 7.4 422 177 Tu 1 . + CDS 398138 - 398548 507 ## SSP0850 transcriptional regulator - Term 398581 - 398624 5.7 423 178 Op 1 . - CDS 398644 - 399921 1520 ## COG0104 Adenylosuccinate synthase - Prom 400007 - 400066 6.1 424 178 Op 2 . - CDS 400110 - 401030 950 ## COG0583 Transcriptional regulator - Prom 401077 - 401136 7.6 - Term 401241 - 401286 3.1 425 179 Op 1 . - CDS 401352 - 401732 248 ## gi|160939265|ref|ZP_02086616.1| hypothetical protein CLOBOL_04159 426 179 Op 2 . - CDS 401802 - 403238 1703 ## COG0015 Adenylosuccinate lyase 427 179 Op 3 2/0.130 - CDS 403313 - 404761 1309 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase - Prom 404826 - 404885 6.2 428 180 Tu 1 . - CDS 404926 - 405633 1095 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase - Prom 405715 - 405774 7.5 429 181 Tu 1 . - CDS 405794 - 406162 195 ## gi|160939269|ref|ZP_02086620.1| hypothetical protein CLOBOL_04163 - Prom 406187 - 406246 5.2 430 182 Tu 1 . + CDS 406056 - 406277 112 ## gi|160939270|ref|ZP_02086621.1| hypothetical protein CLOBOL_04164 + Term 406449 - 406479 0.0 431 183 Op 1 . 
- CDS 406312 - 406644 405 ## gi|160939272|ref|ZP_02086623.1| hypothetical protein CLOBOL_04166 432 183 Op 2 6/0.000 - CDS 406735 - 407493 940 ## COG0289 Dihydrodipicolinate reductase 433 183 Op 3 . - CDS 407523 - 408407 1193 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase - Prom 408457 - 408516 3.3 434 184 Tu 1 . - CDS 408782 - 409276 600 ## COG2109 ATP:corrinoid adenosyltransferase - Prom 409324 - 409383 7.0 435 185 Op 1 . - CDS 409430 - 410059 730 ## COG0629 Single-stranded DNA-binding protein 436 185 Op 2 . - CDS 410040 - 410117 112 ## - Prom 410166 - 410225 2.6 - Term 410180 - 410227 10.6 437 186 Tu 1 . - CDS 410256 - 412088 2335 ## COG1217 Predicted membrane GTPase involved in stress response - Prom 412128 - 412187 7.2 438 187 Op 1 . - CDS 412217 - 413350 1037 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family 439 187 Op 2 . - CDS 413391 - 414041 739 ## COG1739 Uncharacterized conserved protein - Prom 414104 - 414163 5.7 + Prom 414131 - 414190 5.3 440 188 Tu 1 . + CDS 414253 - 416946 2839 ## COG5009 Membrane carboxypeptidase/penicillin-binding protein + Term 417022 - 417085 1.6 - Term 416942 - 416977 -0.1 441 189 Op 1 . - CDS 417195 - 417548 591 ## Closa_1567 SpoVA protein 442 189 Op 2 . - CDS 417561 - 418583 988 ## Closa_1566 stage V sporulation protein AD - Prom 418635 - 418694 4.7 - Term 418609 - 418652 -0.5 443 190 Tu 1 . - CDS 418696 - 419130 601 ## Clole_4035 nitrogenase cofactor biosynthesis protein NifB - Prom 419196 - 419255 4.9 - Term 419435 - 419475 0.7 444 191 Tu 1 . - CDS 419635 - 419736 69 ## - Prom 419777 - 419836 5.4 445 192 Op 1 2/0.130 - CDS 420546 - 422453 742 ## COG0210 Superfamily I DNA and RNA helicases 446 192 Op 2 . - CDS 422458 - 424515 591 ## COG3593 Predicted ATP-dependent endonuclease of the OLD family - Prom 424553 - 424612 6.7 447 193 Tu 1 . 
- CDS 424733 - 425005 133 ## gi|160939292|ref|ZP_02086643.1| hypothetical protein CLOBOL_04186 - Prom 425040 - 425099 3.2 - Term 425023 - 425068 5.1 448 194 Op 1 . - CDS 425115 - 425570 554 ## Closa_1565 SpoVA protein - Term 425580 - 425606 -1.0 449 194 Op 2 . - CDS 425640 - 426083 536 ## Closa_1564 hypothetical protein 450 194 Op 3 . - CDS 426070 - 426696 791 ## Closa_1563 stage V sporulation protein AA 451 194 Op 4 . - CDS 426693 - 426938 230 ## gi|160939296|ref|ZP_02086647.1| hypothetical protein CLOBOL_04190 452 194 Op 5 6/0.000 - CDS 427027 - 427740 829 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 453 194 Op 6 8/0.000 - CDS 427746 - 428195 587 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 454 194 Op 7 . - CDS 428201 - 428542 422 ## COG1366 Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) - Prom 428564 - 428623 3.0 - Term 428565 - 428630 13.4 455 195 Op 1 . - CDS 428645 - 429955 1487 ## Closa_1558 TPR repeat-containing protein 456 195 Op 2 . - CDS 429952 - 430098 345 ## 457 195 Op 3 . - CDS 430128 - 431492 1222 ## COG0285 Folylpolyglutamate synthase - Prom 431537 - 431596 6.6 458 196 Op 1 . - CDS 431760 - 432677 810 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Term 432698 - 432727 -0.2 459 196 Op 2 . - CDS 432753 - 433964 1195 ## COG3629 DNA-binding transcriptional activator of the SARP family 460 197 Tu 1 . - CDS 434089 - 436728 2731 ## COG0525 Valyl-tRNA synthetase - Prom 436857 - 436916 6.1 461 198 Tu 1 . - CDS 437581 - 437913 62 ## gi|160939308|ref|ZP_02086659.1| hypothetical protein CLOBOL_04202 - Prom 438037 - 438096 3.1 462 199 Tu 1 . - CDS 438497 - 438721 98 ## EUBREC_3238 hypothetical protein 463 200 Tu 1 . - CDS 438960 - 439889 114 ## bpr_I2412 polysaccharide biosynthesis protein - Prom 440019 - 440078 3.9 - Term 440292 - 440332 4.9 464 201 Op 1 . - CDS 440536 - 441012 167 ## COG0546 Predicted phosphatases 465 201 Op 2 . 
- CDS 440966 - 441190 56 ## gi|160939313|ref|ZP_02086664.1| hypothetical protein CLOBOL_04207 - Prom 441219 - 441278 3.3 466 202 Tu 1 . - CDS 441708 - 443309 1168 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases Predicted protein(s) >gi|157101634|gb|DS480690.1| GENE 1 1 - 99 141 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no ILGVIVAFEGIRKLLDKDTEADKAAPGKGVTA >gi|157101634|gb|DS480690.1| GENE 2 280 - 1119 580 279 aa, chain + ## HITS:1 COG:mlr7748 KEGG:ns NR:ns ## COG: mlr7748 COG3757 # Protein_GI_number: 13476430 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Mesorhizobium loti # 65 264 12 204 228 172 43.0 7e-43 MIKDEPSDILLKQLQQYGLGFRKAGLHMKHRYKWLRILAALVVMTVFAAGIMAYLIWNGW ILLNNPSKTRYPVRGVDVSHYQGEIDWKVLGRQDIDFAYIKATEGSSHVDEKFYQNWSES ALSGLAVGAYHFFSFDSSGKDQLAHFIQCLPDREGMLPPAVDVEFYGDKAANPPNPADVE QELGALLEGLEAQYGIVPVIYATEESWNLYIRGRFDRYPLWIRNVKTKPRTEGEPWLLWQ YTNRQRLGGYEGEETFIDMNVYCGSREQWENWYGNRNKS >gi|157101634|gb|DS480690.1| GENE 3 1237 - 2817 1060 526 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938790|ref|ZP_02086141.1| ## NR: gi|160938790|ref|ZP_02086141.1| hypothetical protein CLOBOL_03684 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03684 [Clostridium bolteae ATCC BAA-613] # 1 526 39 564 564 1058 100.0 0 MEPTTEEKNGQNQEDRQDTEPNSNKPHKLGKGRYMIVSALSGILVLAAVLAVFMTRHRET GTVIAGMDYHEAAVDNKIYTSYSQHDTRWSSNPLGSSKDTMGSSGCLTSCIAASLTAQGV HSFSPGELNQIFNENKVYNDRGAIVWKELEQALPYARVRLDCGTTSESINRLLDEKMYPI VKVKRKSGAVHWIMLTGTEKDTLDITAMDPIDGFVHMSDYDNRIYGIRVVTGSGKSSETT GGETEASGKPHEVSGTLSGQGQDKTGKTEDRDTAAVTGRAEEGFWKTEEAFAQKEIQPVS LEELGKAPDAQGIVFLSALSQNKFRLYGYAGPAGHYDGIYITDWEGNMNSFPSIPYTSPR LILPKMGWDQENGILKASFHTMTGTSLSREQFYAFLRYDTGHLEPYEFREEDYEKQLDRR LSYKLDKEKNMVTFYDESEKIVSLPLEDLDSSRIRDVIYTDFVAFNPNQPVTMAFTPGFI IEGMTTPQYLEQDVRLEAAIDVIRDSDSSGADVVRFHIRDIEKGGS >gi|157101634|gb|DS480690.1| GENE 4 3141 - 4481 302 446 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1803 NR:ns ## KEGG: Cphy_1803 # Name: 
not_defined # Def: transposase # Organism: C.phytofermentans # Pathway: not_defined # 8 446 8 434 434 343 43.0 1e-92 MPKKKEYDHKHLSTSQRIHIEKGLNDGLSFAAIARKLDKHPSTIAKEVKKYRTLQPREKD PKKPARCALFKECTLRFLCDKKDCVKMCKSCYDIKLQVSKCSYLCSEYREPQCASISKAP FVCNHCARQRTCNKEKAYYIAQNADQSSQELLVSCRQGINQAPADIAMLDTLISPLLAQG QSLAHIYAFHGHEIPCSRKTLYNYIDQGVFTAKNIDLRRKVRYKCKPRKTGTRVSLAAKE FRIGRTYEDFQKFIQENPDIPVVELDTVEGGRDNSTQAFLTIFFRNCSLMLIFVLQEKSQ DQVIKVFDYLTEKLGIKVFQELFPVILTDNGVEFQFPERLECDKNGEIRTKIFYCNPNSS WQKGRIEKNHEYIRYVIPKSQSLDHYKQRDACVLMNHINSEARDSLNGCTPFRLSKMLLN NRLHRLLCLQEIPGDQVHLKPSLLKK >gi|157101634|gb|DS480690.1| GENE 5 4696 - 5580 693 294 aa, chain + ## HITS:1 COG:MA0367 KEGG:ns NR:ns ## COG: MA0367 COG2768 # Protein_GI_number: 20089264 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S center protein # Organism: Methanosarcina acetivorans str.C2A # 4 291 67 355 355 335 58.0 5e-92 MSIVYMTREITPESLVRIYRALGVSLPGSAAVKISTGEPGGRNFLQPELIRNLVQELMGT IVECNTAYEGRRNTSKAHWETMRDHGFTAIAPCDILDEEGHMPLPVTGGHHLTENYVGSH LQNYDSMLMLSHFKGHAMGGFGGALKNMSIGLASSYGKIWIHSSGTSTRFEDVFTADHDS FLESMADADRSVMDYMGRENIVYINVANRLSVDCDCDAHPHDPEMGDIGIFASADPVALD QACVDAVYASEDDGKAALIERIESRNGIHTVEAAHSLGLGSRRYELRCIDSYKN >gi|157101634|gb|DS480690.1| GENE 6 5619 - 5993 398 124 aa, chain + ## HITS:1 COG:CAC0766 KEGG:ns NR:ns ## COG: CAC0766 COG0789 # Protein_GI_number: 15894053 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 124 1 124 126 152 61.0 1e-37 MTIAEVSRQYDISADTLRYYERIGLLPHVGRTSGGIRNYSEDDCHWVEYIKCMRSAGVSV ETLLEYVTLFHQGASTIQARKNLLLEQREQIAARINELNQVLARLDWKLDGYEERMLSYE EHLK >gi|157101634|gb|DS480690.1| GENE 7 6176 - 6796 512 206 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938794|ref|ZP_02086145.1| ## NR: gi|160938794|ref|ZP_02086145.1| hypothetical protein CLOBOL_03688 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03688 [Clostridium bolteae ATCC BAA-613] # 1 206 25 230 230 396 100.0 1e-109 
MSQEQLAQKIHVTRQAVSNWETGRSQPDLDMLETLASAFGTDILVVIYGQMPAGADEETR SAVRKRHLKKAVFWGILTLASYLILTALGRHLDVLKTRTYNSMPYILLQTSVMPLISVGF AVSLMHVLNVAGTVRIAESGMRKKLLVVGGLTAALYMLCMLWLFVPLPLFPGAPLWIWRM SFSVIPHILFFSIGLLLYLGLDHDNE >gi|157101634|gb|DS480690.1| GENE 8 6882 - 7346 354 154 aa, chain + ## HITS:1 COG:CAC1006 KEGG:ns NR:ns ## COG: CAC1006 COG0494 # Protein_GI_number: 15894293 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Clostridium acetobutylicum # 2 143 1 141 146 150 50.0 1e-36 MISVEFHNTVDDSLLKFAVILSRYQGKWLFCQHRDRDTYECPGGRREAGEKIEHTARREL YEETGAVGFTLTPLSAYSVRETAGDPDTPGISYGMLYYGDITELGPIPEGYEIARTALFR EPPANWTYPDIQPVLLGYLKAWLRESGDGGANDF >gi|157101634|gb|DS480690.1| GENE 9 7401 - 8783 1341 460 aa, chain + ## HITS:1 COG:no KEGG:Closa_2969 NR:ns ## KEGG: Closa_2969 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 458 1 450 453 604 71.0 1e-171 MILMNEIPDRFWSLFRSVNRSTYIEALLKINEEYEYSNYFLSREVCIQLLSEYFSQKKYV IWQDETEDDWDELEPPATRVLNWLLRTGWLRKVDDYASMTVNIVIPDYAAVMIEAFSRLL GDDEDETQVYIQNVYAILFSLKNDPRSSVSLLNTALINTKKLNKSLQDMLHNMDKFFGSL LEQKNYGTLLKEHLEGYVTEIVNKKYHILKTSDNFYLYKTDIKSWIAAMREDSSWINKMC AKSAASARGQKKAPLTEYDIVEKLDQLERGFDDIEHRISNMDKEHSRYVKATVTRLNYLL NQEDNMKGLVIRLLNHLSEYGCPEEELASVSARMNLSQMTILSEKSLYKRRKTKADFTEQ LKDEEESEELSMEEVLNLNKVRNRYSQKEIEAFIEGSMTEGKMVVDEQTIHSDQDFEKLI LAYDYSTRRKSPYKVEEEDSEMVHCKGYTYPRLVFVRRAR >gi|157101634|gb|DS480690.1| GENE 10 8783 - 9616 829 277 aa, chain + ## HITS:1 COG:no KEGG:Closa_2968 NR:ns ## KEGG: Closa_2968 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 24 215 13 204 216 267 70.0 3e-70 MENMELENSMTDIAPAAGYTGNGVPYYDNLMQSEQMEVTEIIRLLWRQTFILEHKYDKRT GRFQYNRDYRVCSKHLEFMKNYFAISGVELRENSQMGVMYIQGETVVGDKLPRLATLYLL VLKLIYDEQMESVSTSVNVYTSLREIHEKLGNYRLFKKQPAPTEIRRAVTLLKKYQVIEP 
LDLLEDLKSESRMIIYPCINVVLMGDDVRCLLSSFEDDSDESGVQLKADLEEEEPDATDG MTEDGMAEDEMTQDEMTEDEMTEDEMASEEGEPWNRK >gi|157101634|gb|DS480690.1| GENE 11 9601 - 12993 3674 1130 aa, chain + ## HITS:1 COG:no KEGG:Closa_2967 NR:ns ## KEGG: Closa_2967 # Name: not_defined # Def: SMC domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 5 1130 1 1126 1127 1461 68.0 0 MEQKIANDRFLALSRICLNNWHYITKRTLSFNDEINFFTGHSGSGKSTVIDAMQIVLYAN TDGRGFFNKAAADDSDRSLIEYLRGMVNIGENNEFSYLRNQNFSSTIVLELKRSDTGQCQ CVGIVFDVETATNEISRRFFWHKGPMWDNGYRGGKRTMSIVEIEDYLQANYSREDYFLTS HNERFRRILYDSYLGGLDSEKFPLLFKRAIPFRMNIKLEDFVKEYICMEQDIHIEDMQES VMQYGRMRKKIEDTCAEIQSLKQIEDRFLTYADKEQQIEKCSYFVQKLEILQLQSEVRTL ADRAKLAGEDLKKQVQARSSVDEQIGELTRQSDELLRRIASTGYEELKSQLKSLNELLEQ LGKSESRWQQTATALKRWEEEDITSNQTLWDMEEFENRTIDLETLERLKKNMVSMRTDTE KQHQEAAAAVRELKKQEKQTRDELEKLRSGSKAYPKYLENARSYIQRRLLEKTGKSVDVH VLADLLDVKRDEWRNAVEGYLGNNKLNLVVSPKYARAALEIYGELDKKEYFNVAVLDTEK ASQNQPQVQKGALSEEVTVRESYLKPYIDLLLGKVIKCNTIDELRECRIGITSGCMLYHT FRLQHINPDNYTKFAYIGKDSVRRRIRLLEQELASLEEKKKPLEETARESLRILALPWLS AEPTEYMDWLNDINSYKAKEREKKKLIQKLELLKEQNVDQLERDRQAVITLCDGKKRERD SLNVLIRDKEREIEGCRNKSIDSQSLLTSKERELKQNPDYDNELSAYLPSRDTVRYDREK ETFIARCSKLSEAREAAFQELLAVRTSYIRTYPNRNFTVNARDNAAYDRLLSDLKCDNLE EYKQKATQQAKSAVEHFRDDFMYKIRSAIREALIRKDELNRIISRLDFGKDKYQFVIGRN KGPDGRFYDMFMDDSLDINPADLNYSYENQMDLFTIEHEKQYGDLMNELISIFIPPENAT PEQMDEARRNMDKYSDYRTYLSFDMQQLIQNEDEVIKIRLSKMIKKNSGGEGQNPLYVAL LASFAQAYRINLPSKVERNPTIRLVVLDEAFSKMDGEKVASCIELIRGLGFQAIISATND KIQNYVENVDKTFVFANPNKKSISIQEFERERFVELMQDIEDGDEFADTL >gi|157101634|gb|DS480690.1| GENE 12 13029 - 14207 590 392 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2907 NR:ns ## KEGG: EUBREC_2907 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 40 391 46 417 417 166 31.0 2e-39 MNAPLECIIQKYEKSTVDWKKGMSGKRSVIFREQDFKETGKRDFMNQLMELEMEGLVKVK WYQYGIEAEKAWYSLEQMETIYQRLGKIPKFKRINKLKEEVDQQLALIQSQWIRSYYQSF LQSLDKGDVPKALDTPKRELLFTCFKAIDCLQEPVYKRIFSKKYLGKSKAFEKHLQSKVL 
SAARAYLDTVNDDMDDRQVLDQLMINGYAQELAVKGNLIIELAGNKINLSLFPSGFVFNN QTLRSASIPPEQSISKVITVENKANYESMPYEEGTLIIYSHGYFSPDERAFLIQLERVLS GIHTSCGTAYLHTGDLDYGGVKIFQYIKKRIFPKLYPYLMDTQTYDQYMAYGEPIETSKL EKLQRTAEPLLQPLIDKICAEKQVIEQESFLF >gi|157101634|gb|DS480690.1| GENE 13 14338 - 15696 1385 452 aa, chain + ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 10 452 2 441 441 526 58.0 1e-149 MIETKEFMDNLSQWSDICLSDERIDLELYEKYDVKRGLRDKNGNGVVAGLTKVSKIEANK VVDGVKVPCEGKLFYRGYNIYDLIGGVVRENRYGFEEIAYLLLFGELPNADQLTKFKESL AFSRTLPTNFVRDVVMKAPTKDMMISLSKSILTLASYDEKAWDITVPNVLRQCMMLISVM PMLAVYGYHAYNHYERDGSMYIHRPDENLSTAENLLRLLRPDMKYSQVEAHVLDLALILH MEHGGGNNSTFTTHVVTSSGTDTYSVISAALASLKGPKHGGANIKVVEMMQDLRSEVKDV TDRDEVESYLKKLLHKEAFDRKGLIYGMGHAVYSKSDPRAEIFKRFVRQLSEEKGRMEEF ALYSMIEELGPKVITEERRIYKGVSANVDFYSGFVYSMLDIPEEMFTPIFAIARIVGWSA HRIEELINMDKIIRPAYTSIMKEENYKPLCDR >gi|157101634|gb|DS480690.1| GENE 14 15816 - 16025 169 69 aa, chain + ## HITS:1 COG:no KEGG:CPR_0799 NR:ns ## KEGG: CPR_0799 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 4 68 2 66 71 64 47.0 2e-09 MGKYEYNFVEVPFDGDLKSGNAETIEKCKDIIRKESAQGWRLVQIISPSSEKRVLTSNSN YAIVFEREN >gi|157101634|gb|DS480690.1| GENE 15 16170 - 17033 816 287 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2714 NR:ns ## KEGG: CDR20291_2714 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 107 1 107 277 84 37.0 4e-15 MKMKEAEARTGLDRKNIRYYESEELLAPRRTEGNRYRDYSEEDIQRLLEIRFLRQLGIPI RDIRGYFEGRLSVSDLMRMRLDRIDGEMSQLKKLEQICSRLEEQQTLKPSIVESCLEEFS REDGNTTWLEDIRKDWKMFQKELHSQYIYFEPEGEILTPEDFARETALYAARRNLDYETI RLEFTYALVRLEGIAYQASFTFGPCFRLGRIPLVKLVRCTPPPRSVSKGKYMFFVLLPTI LMFGSILFCLIDSQVGARFSGIYRILIIGCFIAAILLISVNKNVYYQ >gi|157101634|gb|DS480690.1| GENE 16 17105 - 17449 332 114 aa, chain + ## HITS:1 COG:BH2485 KEGG:ns NR:ns ## COG: BH2485 COG2739 
# Protein_GI_number: 15615048 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 1 106 2 108 109 62 40.0 2e-10 MEKIVEQGLLYDFYGELLTEHQRRIYEDVVFGDLSPSEIAGEQGISRQGVHDLIKRCDKI LKDYEAKLHLVAKFIKVKEIAGELERLSDECIRTRDMALIDQIRRLSAEITSTL >gi|157101634|gb|DS480690.1| GENE 17 17494 - 18870 1735 458 aa, chain + ## HITS:1 COG:BH2484 KEGG:ns NR:ns ## COG: BH2484 COG0541 # Protein_GI_number: 15615047 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Bacillus halodurans # 1 433 1 433 451 454 55.0 1e-127 MAFESLSDKLQNVFKNLRGKGRLSEADVKAALKEVKMALLEADVSFRVVKQFIGSVQERA VGEDVFGSLTPGQTVIKIVTEELVKLMGSETTEISLKPSNEISIIMMAGLQGAGKTTTTA KIAAKMKAKGRKPLLAACDIYRPAAIEQLQINGDRVGIPVFSMGNKNKPVDIAKAALEHA SKNGLNVVILDTAGRLHIDETMMDELVEIKDNLDVCQTILVVDAMTGQDAVNVAGTFNDK IGIDGVILTKLDGDTRGGAALSIRAVTGKPILYVGMGEKLSDLEQFYPDRMASRILGMGD IQSLIEKAAAEVDEEQAKELSQKLRKAEFDYNDFLTQMQQIKKMGGMGSILSMMPGMGNQ LSGVDMDEGEKSMHRVESIILSMTKEERANPNLINPSRKQRIAKGAGVEVSEVNRLVKQF DQMKKMMKQMPGLMGGGKRKGGFGGLGGLLGGKMKMPF >gi|157101634|gb|DS480690.1| GENE 18 18971 - 19213 341 80 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160880540|ref|YP_001559508.1| ribosomal protein S16 [Clostridium phytofermentans ISDg] # 1 79 1 79 81 135 81 2e-30 MAVKMRLRRMGQKKAPFYRIVVADSRSPRDGKFIAEIGTYDPTREPSVIKFDEEAAKKWL ATGAQPTETVGKLLKIAGIQ >gi|157101634|gb|DS480690.1| GENE 19 19233 - 19460 315 75 aa, chain + ## HITS:1 COG:BH2482 KEGG:ns NR:ns ## COG: BH2482 COG1837 # Protein_GI_number: 15615045 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein (contains KH domain) # Organism: Bacillus halodurans # 1 74 1 74 76 73 60.0 8e-14 MKELVEVIAKALVDNPEEVVVTESMKGEDTLIELKVSPADMGKVIGKQGRIAKAIRSVVK AAASKEDKKVIVEIQ >gi|157101634|gb|DS480690.1| GENE 20 19587 - 20099 728 170 aa, chain + ## HITS:1 COG:CAC1757 KEGG:ns NR:ns ## COG: CAC1757 COG0806 # Protein_GI_number: 15895034 # Func_class: 
J Translation, ribosomal structure and biogenesis # Function: RimM protein, required for 16S rRNA processing # Organism: Clostridium acetobutylicum # 1 165 1 161 166 104 33.0 1e-22 MENMLRVGVITSAHGIKGEVKVFPTTDDAKRFKELKEVILDTGKEHIPMEIEHVKFFKNM VILKFRGYDNINEIEKYKSRDLLITRDQAVDLEPDEYFITDLIGLTVVSDQGAELGTLKD VLETGANDVYVVAMKDGKELMLPAIGDCILNVDLEQGRMEVHVLEGLMDL >gi|157101634|gb|DS480690.1| GENE 21 20340 - 20549 76 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|239623996|ref|ZP_04667027.1| ## NR: gi|239623996|ref|ZP_04667027.1| predicted protein [Clostridiales bacterium 1_7_47_FAA] predicted protein [Clostridiales bacterium 1_7_47FAA] # 13 69 149 205 218 92 73.0 8e-18 MASLIQTVPTGCRAVHAYSQISHICETQQCLVLYTMDHGGYLLDKRQMGAVCINGLRTFL SERSGKSVK >gi|157101634|gb|DS480690.1| GENE 22 20671 - 22962 2043 763 aa, chain + ## HITS:1 COG:CAC1836 KEGG:ns NR:ns ## COG: CAC1836 COG0323 # Protein_GI_number: 15895111 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Clostridium acetobutylicum # 4 346 3 344 622 288 41.0 4e-77 MANITVLDQSTINKIAAGEVIERPASVVKELLENAVDAHATAVTVEIKDGGCSMIRVTDN GWGIPKEEIPLAFLRHATSKIKTVEDLFTISSLGFRGEALASIAAVAQVELITKTGDSLT GSRYQIEGGVEKGLEEIGAPDGTTIIARNLFYNTPARKKFLKTPMTEGAHVAALVEKIAL SHPDISIRFIQNNQNKLYTSGNHNLKDLIYTVFGREIAGNLLPVEINEDWITVTGFTGKP VIARSNRNYENYFINGRYIKSSIISKAIEEAYKPYMMQHKYPFTMLHFHIEPDTLDVNVH PTKMELRFADGEKVYHAVLRAVSNALAHKELIPQVSLEAKQEKEARQLAEKLAPRPEPFE VRRRENLSQKTSGSPVQVQNGGPSSGAGSVYTQSAPPRPSFVNELMPDWLKERRKEQEKR SVSPAAVNPHQTENTVSAGKDIGAGGPASGQTDSGQTDSGQTDSGQTDSGQTVSGQTGSG QTAPGQTASGQTGPGQTISSQTAHGQTAHGQTAPGQSDSGQTVPSHPEPGMTGNSDTTGT ASAHTGAGTAGCTAGPEQLDLFDGKLLDPKARLKHKLIGQLFDTYWMVEYNEQLFIIDQH AAHEKVLYENTIKSLKTRQYDMQMVDPPIILTLNMNEELLLEKYMDYFTGIGFEIEPFGG REYAVRGVPANLFSIAKKELLTEMIDGLSEDMSVHNPDIIYEKVASMSCKAAVKGHHTMS FQEANVLIDQLLDLENPYACPHGRPTIISMSKYELEKKFKRIV >gi|157101634|gb|DS480690.1| GENE 23 22979 - 23938 1003 319 aa, chain + ## HITS:1 COG:BH2366 KEGG:ns NR:ns ## COG: BH2366 COG0324 # 
Protein_GI_number: 15614929 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Bacillus halodurans # 1 299 1 295 314 284 47.0 2e-76 MKQPLIVLTGPTAVGKTALSIRLAKAIGGEIVSADSMQVYRHMDIGSAKVTPGEMDGVPH YLIDVLDPRDSFNVVTFQQMAKEALREIYSHGRIPVIAGGTGFYIQALLYDIDFKENEDS SPVRRELEALAEREGERAPQLLHAMLRQVDPEAAAQIHANNIKRVIRAIEYFRQTGEKIS LHNQQEREKESPYNFLYYVLNTDRALLYERIDRRVDLMMEQGLVEEVKALKDMGCRRGHT SMQGLGYKEILDYLEGRCSLDEAVYILKRDTRHFAKRQLTWFKRERDVRFLNLPDFGGDV SNVLERILEDCAKQGIVLQ >gi|157101634|gb|DS480690.1| GENE 24 23968 - 25263 1529 431 aa, chain + ## HITS:1 COG:BS_ynbB KEGG:ns NR:ns ## COG: BS_ynbB COG4100 # Protein_GI_number: 16078807 # Func_class: P Inorganic ion transport and metabolism # Function: Cystathionine beta-lyase family protein involved in aluminum resistance # Organism: Bacillus subtilis # 9 422 1 416 421 461 53.0 1e-129 MDINGMNQMYESLGISREVREFAAETEKALRERFEAIDAVAEYNQMKVIKGMQDNRVSDI HFAATTGYGYNDLGRDTLEDVYASVFHGESALVRPQLMSGTHALHVALSGNLRPGDELLS PVGKPYDTLEEVIGIRDSVGSLKEYGVVYRQVDLFEDGSFDYEGIAAAINERTKLVTIQR SKGYATRPTLSVKRIGELISFIKNIKPDVICMVDNCYGEFVETLEPTDVGADMIVGSLIK NPGGGLAPIGGYIVGRKDCIERASYRLSAPGLGKEVGASLGLNQQLYQGLFLSPVVVSGA LKGAIFAANIYERLGYGVVPDGSESRHDIIQAITFGTPEGVISFCKGIQAAAPVDSYVTP EPWDMPGYDSPVIMAAGAFVQGSSIELSADGPIKPPYAVYFQGGLTWYHAKLGILKSLQQ LLDDGVLKILR >gi|157101634|gb|DS480690.1| GENE 25 25358 - 25609 407 83 aa, chain + ## HITS:1 COG:no KEGG:Closa_2047 NR:ns ## KEGG: Closa_2047 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 83 1 83 83 97 75.0 2e-19 MHYKNFWILLIMLLAGIVLGGFMGQLAEGISWLNWLNFGQSFGLDSPLVINFGILVITFG LTIKITMASIIGVAVALIIYRYI >gi|157101634|gb|DS480690.1| GENE 26 25654 - 26349 648 231 aa, chain + ## HITS:1 COG:BS_ysxA KEGG:ns NR:ns ## COG: BS_ysxA COG2003 # Protein_GI_number: 16079856 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Bacillus subtilis # 4 231 8 231 231 187 42.0 
1e-47 MERIRIKDLPHKDRPYEKCLSLGPDSLSDTELLAVILRTGSREGNSLDLADSLLALGSPG DGLLGLLHHSLDDFTRIKGVGKVKGVQLLCIGELSKRIWKRKASEHTLYFGRPEEISDYY MEDMRHLEQEEIRVMFLNTKQAMVRDQIMARGTVNASVMTPREVLIEGLRCRAVSMVLVH NHPSGDPTPSRADILLTKRLKEAGDLVGITLIDHIVIGDRRYLSFREENLM >gi|157101634|gb|DS480690.1| GENE 27 26364 - 27209 1269 281 aa, chain + ## HITS:1 COG:CAC1243 KEGG:ns NR:ns ## COG: CAC1243 COG1792 # Protein_GI_number: 15894526 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Clostridium acetobutylicum # 19 277 20 280 283 98 28.0 2e-20 METKHSKYILAGLTVLCVLLICITSLKDGIMEPLRTGVGYFLIPIQTGVNAVGTGIYNEL KDYGSLKDALRENEELKTQIVQLTEDNNRLQAEQFELDRLRKLYELDQDYMQYDKIGARV IARDSEKWFQVFRINKGSADGVAVDMNVVADGGLVGIVTDVGANYATVRSIIDDSSRVGA MKLDSSYNCIVAGDLTLYEEGRLKLTDFSKDAVLRDGDQIVTSNISTKFLPGILVGYAAD VSIDPDHLTQSGYLIPVADFNNLQEVLIITDIKDTGEAAAQ >gi|157101634|gb|DS480690.1| GENE 28 27206 - 27739 403 177 aa, chain + ## HITS:1 COG:no KEGG:Closa_2043 NR:ns ## KEGG: Closa_2043 # Name: not_defined # Def: rod shape-determining protein MreD # Organism: C.saccharolyticum # Pathway: not_defined # 6 177 31 202 202 221 71.0 1e-56 MIQAKIKQVLINILFIVLAFTVQNCVFPLLPFLSATPNLLLILTFSFGFIHGKNAGMFYG VLSGLLLDLFYSGPFGFYTLIYIYIGYANGIFTKYYYEDYITLPLILSIFNGIWYSMYIY IFRFLIRGRLNLPYYFRNIMLPEIIFTAVTTLLIYRLFLSASRRVEDIGKRRDTKLV >gi|157101634|gb|DS480690.1| GENE 29 27732 - 30653 3506 973 aa, chain + ## HITS:1 COG:HI0032 KEGG:ns NR:ns ## COG: HI0032 COG0768 # Protein_GI_number: 16272007 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Haemophilus influenzae # 628 959 279 629 651 134 29.0 7e-31 MFDELLEIVKEFFKKLFSSRLFALSVMFTVMFAGLLGKLFQMQILDGSEYQETYMQRTEK NVTTPGSRGNIYDRNGNRLAYNELAYSVTIQDLGDYPRPSSRNQMLYRLVRILDRRGETV EGKFEIALDQNGEMFFTSSSDSARTRFFLNFYGLSSTASLNDPKGKYPTNITAREAFERK KVDYELDKMKDEKGNPLVIPDNIALGMINIIYTMKLTEYQKYETTTVATNISNETMVEIN ENAADLKGVAVEQSYVRKYEDSIYFAPIIGYTGKVQEDQLSALNEQWHQSDEAAGLPEDA 
PDKYDLNDIVGRIGIEKSMELELQGEKGYSRMYVDNMGRPREIIEKTDAKAGDDVYLTID RDLQIAIYKLIEQQLAGIISYHLQYDDLDPDKTYDPSKIPIPVKDAYYQLINNNVLSLMD MSQEGASDIEKQIYSKYETSRDQILGAIRNELLSSHALAMNDLPKDMATYMNYIYAYLSD GSVGIIQKDKIDQSSPEYQAWRDGTISLRDYIYSGIAGNWVDTTRLDVTSKYSDADDIFT QLVDHVIESLRSDNKFTKRMFRYLINDEVITGRELCLALYAQGVLAYDAQAIAALQMGDS TYTYQFIKEKISNLELTPAQLALDPCTAGVVVTDVKTGEVRALVTYPSYDNNLMSGSVDA AYFSQLQDDMSRPLYNNATQAQKAPGSTFKPITAVAALEEGVIGLQDTIECTGIYDQISK PIKCWIWPGRHNSENIEEGIQNSCNYFFAELAHRLCMKPDGTYSPDQGLATLRKYATLFG LDHTSGVEISENEPQISSEDPERSAMGQGTHSYTNVQLSRYVAALANRGNVFELSLLDKL TDSDGNLIKDYTPAISSHVDAADSTWDAVQTGMRRVITDSSAKSIFSDLPVEVAGKTGTA QEDKSRSNHAFFVSFAPYSHPEIAVTVNIPYGYAGTNAATLGKKVYEYYYKYTTLEQIES SGALGVSNVNIGD >gi|157101634|gb|DS480690.1| GENE 30 30675 - 31343 728 222 aa, chain + ## HITS:1 COG:FN0175 KEGG:ns NR:ns ## COG: FN0175 COG0850 # Protein_GI_number: 19703520 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor # Organism: Fusobacterium nucleatum # 1 178 1 177 216 108 35.0 9e-24 MRNAVVIKSSKAGMTVILDPELPFGELLDAIGKKFSESARFWGSVQMTLTLEGRDLTAAQ EFAIVDTITKNSQIEVLCLLDTDAERIERCEKALNDKLMELNSQTGQFYRGTLKRGDCLE SEASIVIIGNVDHGARVTAKGNVIVLGELKGTVTAGVSGNPQAVVMALDMAPLQIRIGDM SSRFNERNKRLGRGPMIALAEDGAIVMRSLKKSFLNNMLNFA >gi|157101634|gb|DS480690.1| GENE 31 31421 - 32212 895 263 aa, chain + ## HITS:1 COG:CAC1249 KEGG:ns NR:ns ## COG: CAC1249 COG2894 # Protein_GI_number: 15894532 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Clostridium acetobutylicum # 1 262 1 263 263 314 60.0 8e-86 MSEVIVITSGKGGVGKTTTSANVGTGLAILGKRVVLIDTDIGLRNLDVVMGLENRIVYNL VDVVEGNCRMKQALIRDKRYPNLYLLPSAQTRDKTSVNPEQMIKLVDDLREEFDYILLDC PAGIEQGFYNAIAGADRALVVTTPEVSAIRDADRIIGLLEHAEIDEIDLIVNRIRADMVR RGDMMSLSDVTDILAVNIIGAVPDDENIVISTNQGEPLVGLGSTAGQAYLDICRRILGES IPMTNPGETETFFARLRGILKRA >gi|157101634|gb|DS480690.1| GENE 32 32229 - 32513 169 94 aa, chain + ## HITS:1 COG:no KEGG:Closa_2039 NR:ns ## 
KEGG: Closa_2039 # Name: not_defined # Def: cell division topological specificity factor MinE # Organism: C.saccharolyticum # Pathway: not_defined # 1 94 1 92 92 96 51.0 3e-19 MRWFQGFQTRRSGETAKMRLKLLLVSDKAGCSPEMILMIKDDVIHAISKYMEIEKDKVQI QMDTEGSPQKGGCRTLPVLHANIPIRSISNKGLY >gi|157101634|gb|DS480690.1| GENE 33 32516 - 33637 1240 373 aa, chain + ## HITS:1 COG:BH3275 KEGG:ns NR:ns ## COG: BH3275 COG0772 # Protein_GI_number: 15615837 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus halodurans # 7 367 8 386 398 175 34.0 2e-43 MFLDYDFRHFNFRLIFYMIVLNVIGVLVIRSATNMNADAVNKQLLGVFIGLAVAIGLSLV DYHKILNFSTLIYGLCIASLVAVLIWGNVVNNARRWIEVPAIGQLQPSEFVKIGLIITFS WYFMKYQEKINQVSTVAIAAVLFAIPAALIFEQPNLSTCLVIMVMVLGIVFASGISYKWI AGTLAVTIPVMATFVYLLLHGMIPFIKDYQAGRILAWFYPDQYGEARYQQNNSIIAIGSG QLKGKGLFNTTIASVKNGNFLSEEQTDFIFAVIGEELGFIGCVVVIALFLLIIYECLMMA ARARDLSGRLICVGMATLIAFQAFANIAVATGIFPNTGLPLPFISFGSSSLISIFMGIGL VLNVGLQRETRHI >gi|157101634|gb|DS480690.1| GENE 34 33659 - 34054 437 131 aa, chain + ## HITS:1 COG:lin2020 KEGG:ns NR:ns ## COG: lin2020 COG1803 # Protein_GI_number: 16801086 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Listeria innocua # 1 131 1 131 134 150 54.0 4e-37 MTIGLIAHDAKKKLMQNFCIAYRGILSKNTLYATGTTGRLIEEVANLNIHKYLAGHLGGM QQMAAQIEHNEMDLVIFLRDPQNPKSHEPDVNDVVRLCDIYNIPLATNLASAELLIKALD RGDMEWREVVR >gi|157101634|gb|DS480690.1| GENE 35 34054 - 35031 1080 325 aa, chain + ## HITS:1 COG:BH1573 KEGG:ns NR:ns ## COG: BH1573 COG1686 # Protein_GI_number: 15614136 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 74 308 33 255 382 149 40.0 6e-36 MKAVRTRSFLCALLLGAVALSTTGCLSGIKEPYSLEERIPVMEDSLSRTVRYADGFASDL CVVEDEGSFDENYVTSEAAALFDVSDRETLYSKEAFERLYPASITKVMTALVAIKNGDLN ARVVVTDDAVITESGATLCGIEPGDTLTLEQLLYGLMLPSGNDAGAAIAVHIAGSIDKFA 
DMMNQEAASLGATGTHFVNPHGLNDPDHYTTAYDLYLIFNEALKYPVFRQIVGTTSYTAN YHDKDSQAVTKTWKGGNWFMTRERETPEGLTVFGGKTGTTNAAGYCLIMATRDQNDKEYI SVVLKSDSRPGLYDNMTNIISKIVK >gi|157101634|gb|DS480690.1| GENE 36 35134 - 35541 257 135 aa, chain + ## HITS:1 COG:no KEGG:Closa_2035 NR:ns ## KEGG: Closa_2035 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 12 135 16 139 139 182 68.0 3e-45 MQEKKGKGKTKHLLDMANAAIKGANGTVVYLDKSAQHMYELNNKIRLINVNEFPLTNSDG FLGFICGIISQDHDLETMYLDSFLKLSCLEGADISDTYMTLKNIGEKYHITFVLSISMDA AELPECAKGDVIISL >gi|157101634|gb|DS480690.1| GENE 37 35669 - 37069 1419 466 aa, chain + ## HITS:1 COG:no KEGG:Closa_2034 NR:ns ## KEGG: Closa_2034 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 466 17 479 479 565 60.0 1e-159 MAKKKNKKVVRYRRPLNINVGMIIFFIIFIYLVFSVYTYLKREKIQFYEVVDGGIVNNRI YNGLILREEQVRTADRSGYINYYLREGKRASVGSRIYSLDETGNLEALLKENAEEGSSLS DDSLSDLKKQLTSFVLSFQNQDFGTIYDARISLDASVMEYSSFSTLDQLDKLAEQSGAVF EQVRSDVSGVVSYGIDSFESMTTSDVEAASFDRSTYVKTVHKSGDLIDSGSPAYKIVTSE KWSIVFPLSEEDAALYNDKTVLRVVFRDYSLSTPAAYSTFTGKDGAAYGKLDFSKYMEQF ISDRFVDFEIKVDKADGLKIPVSAVTDKNFYLIPLEYMTQGGDSSEDGFYKEVYTENGTS IVFVPTTVYYSDEEFYYVDMGEENGFKAGDYVVRPESSDRYQIGRTASLKGVYNINKGYT IFKQIDILDSNSEYYTVRKNMKYGLSVYDHIVLDASMVEEGQLLYQ >gi|157101634|gb|DS480690.1| GENE 38 37084 - 37764 656 226 aa, chain + ## HITS:1 COG:CAC2121 KEGG:ns NR:ns ## COG: CAC2121 COG0325 # Protein_GI_number: 15895390 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Clostridium acetobutylicum # 24 226 15 216 221 199 52.0 4e-51 MIKKQLEEVRKHIEDACRRAGRNPEEVTLIAVSKTKPVPMLMEAYDAGARDFGENKVQEI LNKKPELPEDIRWHMIGHLQRNKVHQVIDKAVLIHSVDSLRLAQQIEDDAAKLGLDVDIL LEVNVAREESKYGFLMEEVEDAIMQIKDFPHVHIKGLMTIAPFVDNPEENRGIFKKLFEF AVDIGGKNIDNVTMSVLSMGMTGDYEVAIEEGATMVRVGTGIFGSR >gi|157101634|gb|DS480690.1| GENE 39 37812 - 38321 527 169 aa, chain + ## HITS:1 COG:BH2549 KEGG:ns NR:ns ## COG: BH2549 COG1799 # 
Protein_GI_number: 15615112 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 22 155 9 146 149 72 28.0 4e-13 MSLLGKLMDTMRLNPDEEEDDYYLDDDFEEEAPRKGLFSKKNTEYDEEEEQDKPRFLGRS NPKVVPMRRSMEVTMIKPTSMEDSRDICDYLLAGKAVVLNMEGIHTEVAQRIIDFTSGAT YSMNGNLQKISNYIFIATPDSVELSGDFQDILAAGSAMGMDVSGLNIHL >gi|157101634|gb|DS480690.1| GENE 40 38417 - 39511 1237 364 aa, chain + ## HITS:1 COG:RSc2969 KEGG:ns NR:ns ## COG: RSc2969 COG0337 # Protein_GI_number: 17547688 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Ralstonia solanacearum # 35 355 32 352 368 258 43.0 1e-68 MADRINVCRDGKAIYDIVLTQSFDGLKEAVSCLPVKEHKICIVTDSNVAPLYLEEVRAVL AGCCKSVSVFTFPAGEEHKHLDTVRNLYEHLILERFDRKDMLAALGGGVVGDLCGFAAAT YLRGVDFIQIPTTLLSQVDSSIGGKTGVDFDAYKNMVGAFHMPRLVYANLRTLLTLPEEQ FSSGMGEVVKHGLIKNREYYQWLKDNREGIAARNLDLCQAMVYESCMIKKRVVEEDPTEQ GERALLNFGHTLGHAVEKLENFTMHHGHCVGLGCIGAARISQLRGMLTEEEMADIRNTFL SFGIPDAAPGLVWDQVLKTTQSDKKVDAGVIRFVLLKAIGQAYVDTTVTADEMRAGFDVI GGQA >gi|157101634|gb|DS480690.1| GENE 41 39511 - 40020 559 169 aa, chain + ## HITS:1 COG:SPy0826 KEGG:ns NR:ns ## COG: SPy0826 COG0597 # Protein_GI_number: 15674864 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Streptococcus pyogenes M1 GAS # 12 160 6 149 152 95 37.0 3e-20 MTKQKQSLIASFLIGFAILVGLDQWTKGLVVQSLKGQKPFVIWNGVFEFYYSENRGAAFG MMQEKQLFFFLIAVLVLGAVAYLIYKMPSDGKYRPLAVCLMMISAGAVGNMIDRVSQGYV VDFLYFKLINFPIFNVADCYVTIGAACLVFLIMFYYKDEDMACFSLKKQ >gi|157101634|gb|DS480690.1| GENE 42 40052 - 40963 298 303 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 5 256 8 249 285 119 33 2e-25 MKQEFYPTDMEDHVRIDKYLAEACPDLSRSYIQKLLKSGQVLVNGQGIKASYIVVEDDRI ELEVPEAVEPEIDAEPMDLDILYEDQDLILINKPKGMVVHPAAGHYSHTLVNGLMYHCKD 
QLSGINGVMRPGIVHRIDMDTTGVIIACKNDMAHSSIAAQLKEHSITRRYQAIVHGVIAE DEGTVDAPIGRHPVERKKMSINYQNGKNAVTHYRVLNRFRQYTHVECRLETGRTHQIRVH MASIRHPLLGDAVYGPARCPISGLCGQTLHAGILGFIHPRTGEYMEFSAPLPDYFEKLLR TLS >gi|157101634|gb|DS480690.1| GENE 43 41156 - 41794 756 212 aa, chain + ## HITS:1 COG:no KEGG:Closa_2028 NR:ns ## KEGG: Closa_2028 # Name: not_defined # Def: cytidylate kinase # Organism: C.saccharolyticum # Pathway: not_defined # 1 210 1 210 213 321 72.0 1e-86 MTNNLIITIGRQCGSGGKMIGELLAEKMGVKCYDKELLSMAAKHSGLCEELFEKHDERPT SSFLYSLVMDSYSMGYTASGYSDMPINHKIFLAQFDTIKKLAEEASCVMVGRCADYALED YPNVVSVFITANDDDKIKRLQDLYKVDASKAKDIMIKTDKQRASYYNYYSNKKWSDPRSY DLCLNSSVTGPEGAVDVILNFAKVKQTWRKQK >gi|157101634|gb|DS480690.1| GENE 44 41892 - 42437 353 181 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938835|ref|ZP_02086186.1| ## NR: gi|160938835|ref|ZP_02086186.1| hypothetical protein CLOBOL_03729 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03729 [Clostridium bolteae ATCC BAA-613] # 1 181 1 181 181 338 100.0 9e-92 MKRKIALLLITGICLANTVPCFASPSMKVSVSSSEENQYSTLPDGDTLQKDVGFRPKAPA SLAGGYLFGSGNITESFDLDSNGAPVNKQKGISFKYIKKDNNTSKSVSLSAEPASGQSLS SNASVIKYGETDLYYSTAEANSLAWIDGDVRYILMDINKIVTRDELVAMAEDMIDLGKDL H >gi|157101634|gb|DS480690.1| GENE 45 42594 - 44420 1755 608 aa, chain - ## HITS:1 COG:CAC0158 KEGG:ns NR:ns ## COG: CAC0158 COG0449 # Protein_GI_number: 15893453 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Clostridium acetobutylicum # 1 608 1 608 608 517 45.0 1e-146 MCGIIGYTGPLEAKDVLLKGLAGLEYRGYDSAGIAYFNQDSQVRSIKKVGKVAALRECVK STCEISHCGIGHTRWATHGGVTDANAHPHSFGQVTLIHNGIIENYRQLTQEYGLEGRLAS QTDSEVAAAVLDSAYRGCGQDPLKAIAAFTGMLEGSYGFCILFADRPGEIYAVRKVSPLV ASYTHSGALVASDLTALISYTREYFVVPEGCIVRLTPYKVRVYDMDFNRVEPETLEVSWN MDAAMKNGYPHFMLKEIHEQPEALKNTILPRMLDRLPDFRDDGIPDSLFAECSQVRIIAC GTAMHAGMVAKALMEPMLRIPVTVSIASEFRYEDPLIDSSTLVIVISQSGETIDTLAAMN 
LAKSLGSVTLSIVNVKGSTIARESDYVLYTHAGPEIAVASTKAYSVQLAALYMIGCRMAL VRGAFGQDEAREFMTDLLDVIPAMETVIGQSEFIQSFVRHLIRERDTFFIGRGLDYAFSL EGALKLKEISYIHAEAYAAGELKHGTIALIYDNVPVIAIATQDHVFAKTISNIREVKARG AYVILLAKKQSVVEDGVADIELRIPDLEDRFTVFPIAVALQLIAYYASIGKHLDVDQPRN LAKSVTVE >gi|157101634|gb|DS480690.1| GENE 46 44975 - 46870 1023 631 aa, chain + ## HITS:1 COG:no KEGG:Closa_2016 NR:ns ## KEGG: Closa_2016 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 97 623 79 529 537 267 33.0 1e-69 MKKSDYIQVIKKYKREILQCSALAVTVVCACAMSYMAGRYSGRLNAPAAGRSEGGAVETT SQYIDLAPGSPSFLFPPGAACRDATPSNADMMDSQVLNLAYLYEDGADSTDPADSYAGRL YDLLALKDASWISGLYDLYNYTPQQLAALLGLPETAVTSPSGIIPEFRNISVHFVNSEQQ ETGSTSNAREIISIINTLHYFGILKDMKTMEDCADKLWNASHKYSVRIGGIYYCDGSCLD EAADGQRDAEEFGTEAAGAQTAGVEPADAVIAGSEAAGAEAVGAAVAGSEAAGGETAGAV VAGSQIAGAEAVEAAVAGSETTGAEAVEAVAGAETVGTAAAGTEAVESGAAGGEPVSSMA ASAAEAETAAVASFSDASSVGSGDSDTGPADGSQTCPGHVDCTVLAVVKGTAEADSLFQA ADTLEAVKASGWTGWNADTRNMAGALLAQDWSQEYGLYTVDYPVKGLLNSQEIELYMSLV PEDASRQRKDFVRYALTSVGKIPYYWGGKPSAPGYTGNGFGSLTVPDEDGRLLKGLDCSG WINWVYWSVTGKGLGAQSTGTLLGCGSPVARDELLPGDICIRFNPMAHVVIFLGWTAEGN MLCIQETTGNSNNVEVGTVTSNWESYLRILE >gi|157101634|gb|DS480690.1| GENE 47 46911 - 48851 1313 646 aa, chain + ## HITS:1 COG:CAC3063 KEGG:ns NR:ns ## COG: CAC3063 COG1316 # Protein_GI_number: 15896314 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 208 409 69 269 339 119 34.0 2e-26 MNNRENDDELERMRARARRGSGRQESGRQSSRAYPRPVQPAGERRSARQSYDGREEYDEE YYYGEEYDAEEDRQIDFPMEDYLDDEPPMPARRLRSSYGSSGSAEGRRQAQSAGNRNAQG RKSRQPGSASSNQDRARTAPSVSRNQGRTAKRQQDTMERGRTSAGRQQGQMAKRRKKGRW KGILLLLLVAFIGYGAWLFLHKPTGYWNIAVFGVDSRDGNTKNALADVQMICSIDRATGE IRLVSVYRDTYLKINSEGTYHKINEAYFKGGHKQAVSALEENLDIKIDDYATFNWKSVAE AINILGGIDLEITPAEFKYINGFITETVNSTGIGSYQLEQAGMNHLDGVQAVAYSRLRLM DTDFQRTERQRKVVELALAKAKQADLSTLTSLAGYMVQEISTSVGVDDVLPLAKDIGKYH IGETAGFPFSRQTMKIGKMDCVVPTTLESNVVLLHQFLYGQETSYSPSSAVRKISAHISE 
ETGLYEEGKAAPSGGGSSGGSSSGDKAPAQNVPAETPPPETTAPVETTQESTGETEETEE TTQESTEETEETEEVGPGVSGKPTKAPESTKHTDEEQGPGSSGPGGSGAAEPSGSHPTKA PETEKPGSGSSPGSESDSGSSGPGPGSQDNGPGQAQSGTDEGGPGI >gi|157101634|gb|DS480690.1| GENE 48 48998 - 49723 568 241 aa, chain - ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 12 228 40 258 277 143 34.0 3e-34 MKLQRLLSLTRQALDTYGLIQDGDRIAIGVSGGKDSLTLLHALHHLRRFYPEKYELCAIT VDLGLGNMDLEPVAAMCREFEIPYEIISTEIGAVLFDIRKETNPCSLCAKMRKGALNEKA KEMGCNKIAYAHHRDDVIETMMMSLFYEGRFHSFSPLTYLDRMDLTVIRPMIFVTEADVI GFRNKYQLPVCKNPCPMDGYTKRQYIKELIHRLQEDNPHIKDCMFRAILDGSIKGWPQAL K >gi|157101634|gb|DS480690.1| GENE 49 49902 - 50159 248 85 aa, chain + ## HITS:1 COG:no KEGG:Closa_1508 NR:ns ## KEGG: Closa_1508 # Name: not_defined # Def: Phosphotransferase system, phosphocarrier protein HPr # Organism: C.saccharolyticum # Pathway: not_defined # 1 81 1 81 85 111 66.0 1e-23 MISYKYVVSDKQGLHATNAMSLSRAAAEYESRITLKSQKGTADCKNVLALMSLGAHQGER LELTVEGPDEGQASGFLKGLMRTIL >gi|157101634|gb|DS480690.1| GENE 50 50306 - 51400 1166 364 aa, chain + ## HITS:1 COG:L176316 KEGG:ns NR:ns ## COG: L176316 COG3589 # Protein_GI_number: 15672155 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Lactococcus lactis # 1 359 1 357 364 343 46.0 3e-94 MARLGISIYPEHSTEEQDIAYIRKAGKLGYKRIFTCLLSVGEKSGEEIIGQFRRLADEAH ACGMEIIPDVSPSVFSRLGISYEDLSVFREMHVDGIRLDEGFDGMKESLMTYNPQGLKLE LNASTKLVYVENIMDHHPDREKLITCHNFYPQAYTGLSLKHFNRCNDKMKSLGLKVAAFV SSNAPGAYGPWPLNEGLCTLEMHRGLPLDFQVRHMFATGMVDDVLIANAYATDEELHACA AVNPSILTFGIELEKELTDTEKQILDYEPKHVVRGDMSEYMIRSTWPRVTFADQSIPAGN TRDLRPGDVVILNDGYLKYKGELHIVTKDMPNDGRKNVIGHLPEYEHVLLDYIEPWKVFA FRIV >gi|157101634|gb|DS480690.1| GENE 51 51425 - 53344 2266 639 aa, chain + ## HITS:1 COG:BS_licR_1 KEGG:ns NR:ns ## COG: BS_licR_1 
COG3711 # Protein_GI_number: 16080911 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 492 1 499 499 128 22.0 4e-29 MQNKRQEQLLAILSERRDWMTSRQLAGLLQVSDRTIRSDVEAINKHTDPPPIESNVRQGY RLCEEARPALASGAETREADIPQTPGARCAYMIQKLLFEVKELNLTMLQSQIYVSGYSID NDLKRIRKMLEPYSGLKLVRNKECISLKGDEASKRRFYRDLLVAEVQENFLNLNTLAHLY RSFNLIEVKDIFVDVLEEYDYSIHESMFPMLILHAGTSIERMNCANYINMEEGMQGLEDT IEYQIAQTFFDRISKRLHITVHDGEVGMFALVIMGRRASNYTSDFVNYNGKWMNTKKLVA EALEQVYALFGLDFRQDADLVAGLKMHFHGLIERVKNQVRMEDVFVEEIKRKYPLVFEMG IYVLEFLEQRLERPISDVESCYIALHLGAASERMNSVRKYRAVMILPHNQSFSDMCVKKI SDMFRERMEVVKAFGYFEEDEVSALDPDLLLSTFPLEHGLDVETVSINLFVDSETESKIL QAINRLDKKGFRLEFTSHIGNLIRKEHYHSQVDMDQPEEIIRMLCAGLEKEGIVEPEFTE VVLKREQMSPTSFVNTFAIPHAFGAFARNSTIAVAQLKNPVKWGAFEVRLVMLFAINEGD ARMIKIFFDWVSNVVNQPEELAKLVAPCSYEEFIDRIMG >gi|157101634|gb|DS480690.1| GENE 52 53485 - 53823 508 112 aa, chain + ## HITS:1 COG:SP2024 KEGG:ns NR:ns ## COG: SP2024 COG1447 # Protein_GI_number: 15901845 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIA # Organism: Streptococcus pneumoniae TIGR4 # 1 100 7 106 108 77 43.0 5e-15 MVEGLEMICFKIISNVGGARSSYIEAIQKAKQGDFEGARECIKAGQEMFLVGHEAHFELI QKEAQGEQVGGSMILVHAEDQLMSAEGFKIIAEEMIASYERIAELEKKLENR >gi|157101634|gb|DS480690.1| GENE 53 53904 - 54224 461 106 aa, chain + ## HITS:1 COG:CAC0384 KEGG:ns NR:ns ## COG: CAC0384 COG1440 # Protein_GI_number: 15893675 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIB # Organism: Clostridium acetobutylicum # 1 102 1 100 102 90 45.0 5e-19 MKRVYLFCSAGMSTSMLASKMQDVANSHDLPIEVEAFPDGKIGQIIDEKHPDVILLGPQV KYRYGEIVEKYGDKGIPIQVIDQTDYGMMNGEKVLKSAIKLMKAAK >gi|157101634|gb|DS480690.1| GENE 54 54293 - 55627 1381 444 aa, chain + ## HITS:1 COG:BS_licC KEGG:ns NR:ns ## COG: BS_licC COG1455 # Protein_GI_number: 16080909 # Func_class: G Carbohydrate transport and metabolism # Function: 
Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 5 422 8 433 452 352 43.0 7e-97 MLNKLESVLMPLAEKIGKNKYLIAIRDGFLLSMPLLIVGSFFLLVANFPIPGWTDFWARF FGDNWASYFSKPTDATFSIMAVLAVIGIGYSFAEQMNVDKLFGAAIALVSWFLIMPYEIL VDGGSSVTGIPLNWVGSKGIFVGIIVAFLSVHIYAWVNKKGWVIKMPDGVPPTVSKSFSA LIPAGVSILVFFIMNIVFAMTPYKNAFNFIFTILQTPLLKLGNTLPAMVIAYIFLHFFWF FGVNGGSVVGAVFNPILQTLSADNLAAFQAGQPLPNIISQQFQDLFATFGGCGSTLSLLI AMLLFCHSKRIKELGKLAFIPGVFGINEPLVFGLPIVLNPMILIPFMLVPTINIVISYFC MSIGLVPLCTGVAIPWTMPVVLSGFLATGWQGAVLQLLLLILGVFIYMPFIKMMDKQYLE DEAKASDKSDDDDIDFDDLSFDDL >gi|157101634|gb|DS480690.1| GENE 55 55700 - 56224 671 174 aa, chain + ## HITS:1 COG:no KEGG:Closa_1514 NR:ns ## KEGG: Closa_1514 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 174 1 174 174 304 87.0 1e-81 MKIIMIYDQIQSGLGTKDDTMVPLTGKKEPIGPAVMMEPFLNKVDGHVTACLCCGNGTFL ADPEEVSRKLCAMVNKLQPDVVMCGPAFNFADYAGMCAKVACDINATTKAKAFAAMSVEN KDTIAAYKDKVAIVETPKKGGMGLNDALKNMCTLAKALADGSDTAELEHKFCFK >gi|157101634|gb|DS480690.1| GENE 56 56240 - 57220 1099 326 aa, chain + ## HITS:1 COG:CC2359 KEGG:ns NR:ns ## COG: CC2359 COG1446 # Protein_GI_number: 16126598 # Func_class: E Amino acid transport and metabolism # Function: Asparaginase # Organism: Caulobacter vibrioides # 3 283 20 307 327 161 36.0 1e-39 MWGMIATWRMAVEGITKGAEMLKDGGDAGDAVESAIREVEDFPYYKSVGYGGLPNEEMEV EMDAAFMDGNTLDIGAVAAIRDFANPVSIARRLSREKVNSLLVAEGAEKFAHKEGFERKN MLTDRAKAHYRKRLKEMSDQAALQSASGKLKPYSGHDTVGMACLDMSGKMTAATSTSGLF MKKKGRVGDSPISGSGFYADSKKGAASATGLGEDLMKGCISYEIVRLMGEGMHPQKACET AVARLDAELKERRGEAGDLSLIAMNPKGEWGVATNIEGFSFAVVTEALEPTVYLVTRQED GLCTYEKASDEWMENYMKTRMAPIEE >gi|157101634|gb|DS480690.1| GENE 57 57225 - 58442 1260 405 aa, chain + ## HITS:1 COG:CAC2723 KEGG:ns NR:ns ## COG: CAC2723 COG0624 # Protein_GI_number: 15895980 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Clostridium 
acetobutylicum # 15 214 8 199 465 198 47.0 2e-50 MDEQNQIDRMLLDYIDENRDRLVEHVRQMVRIDSVEREARDGAPFGPGVKKALDLALSIS RDMGFETVDLDGYIGYAFYGRGEDYVCAMGHVDVVPAGEGWKEPPFKGHMENGVIYSRGV LDNKGPVMACLYGLAAVRALGLPLRHPVRIIFGCDEETGFEDLRYYLSKEKPPVYGFTPD CKYPVVYSERGRAVVRITGTRECLGTFFDFVNKYFIGAGNTGDRLGIDYYHEEYGMMEMR GYRLGLDPVWKAGQGNIRRVNAGNKNTEVSAGNGNAANTVYFEATLSYPGGITIREIMKP ITEKAESCGLKAELVQNYDPVVFAKDTPMVKAMQDSYERVTGMNGTPVTTTGGTYAKAMP GIVPFGPSFPGQKGISHNPNEWMTVDDLVTNAKIYALALYRLAQL >gi|157101634|gb|DS480690.1| GENE 58 58493 - 59239 793 248 aa, chain + ## HITS:1 COG:PM0526 KEGG:ns NR:ns ## COG: PM0526 COG3142 # Protein_GI_number: 15602391 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Pasteurella multocida # 2 204 3 203 244 143 39.0 3e-34 MLEICCGSYYDAMQAASGGAGRIELNCALHLGGLTPSLASLELVKEHCNVKVIAMVRPRG GGFCYSEEDFNVMLRECEHLVRHGADGIAFGCLREDASIDRERNARMISIIHSHGKEAVF HRAFDCTSNPYESIETLIELGADRLLTSGLKPKAVEGKALLGALQSSYGNSIEILAGSGI NASNARTLMEETGIRQVHSSCKDWMTDPTTTSHGVTYGFAEPPHENCYDVVSEALVRKLL DSLGELGR >gi|157101634|gb|DS480690.1| GENE 59 59243 - 60115 840 290 aa, chain + ## HITS:1 COG:lin2365 KEGG:ns NR:ns ## COG: lin2365 COG1284 # Protein_GI_number: 16801428 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 16 284 14 282 300 162 35.0 7e-40 MIAKYKKNILVRRLITILAVVASALLQTYVIQAFIRPANLLSSGFTGVAILIDRIASLYG HNISTSLGMLVLNIPVAILCSRSISMRFTFYSMLQVFLASFFLKICSFQPLFNDVLLDVV FGGFLYGLSIAIALRGNASTGGTDFIALYVSNKTGRSIWEYVFAGNCVILCIFGYMFGWL YAGYSILFQFVSTKAISAFHHRYERVTLQITTTKAPEIINQYVKQYRHGISCVDAVGGYS HKKMYLLHTVVSSYEVNDIVHLMREQDPHVIVNMIKTENFYGGFYQAPIE >gi|157101634|gb|DS480690.1| GENE 60 60380 - 61114 560 244 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938853|ref|ZP_02086204.1| ## NR: gi|160938853|ref|ZP_02086204.1| hypothetical protein CLOBOL_03747 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03747 [Clostridium bolteae ATCC BAA-613] # 1 244 54 297 297 383 99.0 
1e-105 MRNSGKILCIASLLGLLVLSGCGLKSEPAAGTGADGAAVGAEGTVSAGADEGRADEDKTD GNGTDQEIHIYSDTRAGDSAGDETVPAANGGSIIQDQSFDVELDSLKAAEPPAKESISET RVVSEEIIRQAAGSWTLNGPKTDAGLRQHGSLLEMFGTGLHMGNSLEISDNGEIEYYIGI GTGGEGQCTETGGSTETGSSITASITPYEDHGNSQNSLTLHLITEDDTEYLTMEYDGEIL YWNR >gi|157101634|gb|DS480690.1| GENE 61 61236 - 61343 100 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKSFKIIKEYWRLAILGLAYTAVFAAAVTVVILQL >gi|157101634|gb|DS480690.1| GENE 62 61420 - 61653 308 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938855|ref|ZP_02086206.1| ## NR: gi|160938855|ref|ZP_02086206.1| hypothetical protein CLOBOL_03749 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03749 [Clostridium bolteae ATCC BAA-613] # 4 77 1 74 74 139 100.0 5e-32 MTSMLVKFEQPEQVVDFVNTLSHYDCDADIKYGSRMVDAKSVLGVLYLAVSRTVELILHI DEDGSGDIKNRLAKFAV >gi|157101634|gb|DS480690.1| GENE 63 61872 - 61967 118 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKEFWSDMAGLGIMLLVMLGAAALRAYIWL >gi|157101634|gb|DS480690.1| GENE 64 62366 - 62785 616 139 aa, chain + ## HITS:1 COG:CAC3714 KEGG:ns NR:ns ## COG: CAC3714 COG0071 # Protein_GI_number: 15896945 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Clostridium acetobutylicum # 32 123 48 138 151 70 39.0 8e-13 MLMPSIFGENLLDDFFDYSFGSHKTFDMMNTDIRDTENGYEITMNMPGVRKEDVKAELKD GYLTIQATTDSSRDEKDSNGTYIRRERYCGSCSRSFYVGDAVTQEDIKAKFEDGTLKMLV PKKEAKPAVEEKKYIAIEG >gi|157101634|gb|DS480690.1| GENE 65 62872 - 63333 480 153 aa, chain + ## HITS:1 COG:RSc0200 KEGG:ns NR:ns ## COG: RSc0200 COG0071 # Protein_GI_number: 17544919 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Ralstonia solanacearum # 43 141 37 133 140 66 33.0 1e-11 MLMPSIFGENMFDEFFRDPFFDSKDMKKLEKKLYGRRGKNLMKTDIRETDTGYELEMDLP GFRKDEIRASLRDGYLTISAAKGLDKDEQEKTSGRYIRQERYAGACERSFYVGEDMTEDD IKGEFKHGILRLSIPKKEAKPLAEEKKYISIEG 
>gi|157101634|gb|DS480690.1| GENE 66 63520 - 64344 543 274 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938860|ref|ZP_02086211.1| ## NR: gi|160938860|ref|ZP_02086211.1| hypothetical protein CLOBOL_03754 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03754 [Clostridium bolteae ATCC BAA-613] # 1 274 23 296 296 561 100.0 1e-158 MKKDKGYGLLWMLAAQVVLLIFFLAIAPRLNAEGLLYRSAVPVTPEEVLSKDDNGSFELK YGAGYPCTEPTGYMTSVSDLQKNYIGAWEADASLFQATGIYKQISHRTRPSTQRGYYSSR SPTYGVTAPAITRSRWTTFWMRPYADYAQYYLITFKDGTRTWALVDNAITRIPAKGRVAL PVGYYWDEGAARFLTDKEKKAYRITDRDVLEDNCLGNALDLYSGWIEGDEMRQYHETCSF VQSVALVIIGILLFITLILFMLSGKTSGRHTQNR >gi|157101634|gb|DS480690.1| GENE 67 64444 - 65646 601 400 aa, chain + ## HITS:1 COG:no KEGG:DSY1298 NR:ns ## KEGG: DSY1298 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 88 237 52 193 233 86 34.0 2e-15 MIWSPIVTILAVTVLLCGCGGTISPTAASSMAREPALPEGMEYRSEAPAPETASESLPEI TVKPELLTDAGTLCPMEYGTDADVIFFADLDHDGEPERITAGLADFQASGGQYASLKVLA GESADSPVIWEHDYSVAHAGWTMLYLYRENGADYLMEYSPYMNQGYGAYSIRIFYIGPDK KTVEAHSHTADFALQQEDRTGFFGDPDALAAFMDTAEGYIKSSLLLVSTDQGRLAYSTPE RPVTLDHDDSFFTDSWNVTGKWNGRETIEYMSQKNWMAAEGIPLLCISRPSGDTVEVQFK NRNHFYTTSNPVIDQLFADMTGRNWTFLPDREPPPGPPDAVLSVYRKEGMDVFISCYQKE ETAVIWTAPFVYNIANDMRSINECYYMSASAVERLLSWAP >gi|157101634|gb|DS480690.1| GENE 68 65703 - 66233 243 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938862|ref|ZP_02086213.1| ## NR: gi|160938862|ref|ZP_02086213.1| hypothetical protein CLOBOL_03756 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03756 [Clostridium bolteae ATCC BAA-613] # 1 176 24 199 199 312 100.0 7e-84 MVSILSFIIVILIISILCFLSGRRMAENRAAKNPNAENRTNGNRTAQSINHFLMRHPVMV KGLLLFYIAFAGCFTFPIRSQPFPLDFTKAYLLICSYGIIGWGFAGLYGKAKERAVYPLT LLMTILGMVCRYVLEYGEASNVYNFTMLNILSYIILIPAFTVGAYHYIVKYLISAR >gi|157101634|gb|DS480690.1| GENE 69 66321 - 67163 954 280 aa, chain - ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # 
Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 5 275 3 273 283 174 35.0 1e-43 MKKWETIREFIMITFATVIVAAAVFFFLMPSHVSVGSISGLAIVLGNFVPLKISAITFIL NMLLLVLGFLLIGREFGAKTVYTSFLLPVVLAVFETAFPENASITNDAFLDMICYIFVVS IGLAMLFNRNASSGGLDIVAKLLNKYLHMELGRAMSMSGMCVALSAALVYDKKIVVLSVL GTYLNGIVLDHFIFGFNIKKRVCIISQREEEIREFILHHLHSGATIYEAIGAYDGQPRRE IITIVDKNEYVTLMNYVLKTDKDAFVTVYTVNEIIYRPKH >gi|157101634|gb|DS480690.1| GENE 70 67352 - 68011 694 219 aa, chain + ## HITS:1 COG:no KEGG:Clocel_1110 NR:ns ## KEGG: Clocel_1110 # Name: not_defined # Def: suppressor of fused domain # Organism: C.cellulovorans # Pathway: not_defined # 1 219 1 219 219 269 57.0 6e-71 MTKEEFLKRVKENEEWAPGWETIEQEFERLYPGQEPRHYGTNMMARAIFGGDCYLDGYSI YDSPKGYKHIVTFGMTELYADEESFGGEWNKWGYEMTIKLREKEPEDCLWAIDMLSNLAR YTYTQNQFFLPSQYIAGNGTSLHVGTDSAITALITTMDTEAVSQDSVYGKTEFIQLVGIT EQEFKNIQGKTANIELLLERMKQDNPDLVTDMTRTSSYL >gi|157101634|gb|DS480690.1| GENE 71 68118 - 69356 945 412 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938865|ref|ZP_02086216.1| ## NR: gi|160938865|ref|ZP_02086216.1| hypothetical protein CLOBOL_03759 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03759 [Clostridium bolteae ATCC BAA-613] # 1 412 1 412 412 805 100.0 0 MKYITNKHISKKLTAPLLCMALCGVLTACGHKEPSHALSPEQAESQSTEAEPEIRIDTQE GSQSEQSSEHPIGSTAAPADSAQGEAGYLLIKDQTFDVTLEPLGSVTFASYKPDSLQNPL ADVMFEIKKEKKTVCLLDGVYEGNTRSNETFNKVEAVSFPDYNSDGFNDIIIICSYSPAS GPKTGTGYPEVRIYSGKADGSFTLERSLSENANSALAEKTVQSVLGFLGAGKGRPSVSAS GWQQAYTGLLQSQDSGQWQGYNLIYINDDDIPELVEIGIDEATGCKIVSFADGAAYETQL SRLYFSYIERGNLLCNSEGNMDYYYDIVYSMESGRLLPIASGYYGAEDNSHVQFDEEGNP IYQYEWEGVMMTQEEYNQAFRAVYDTSSEKPGYEWDKWLSLEEVIQAIDGIQ >gi|157101634|gb|DS480690.1| GENE 72 69483 - 71555 1644 690 aa, chain + ## HITS:1 COG:PA1920 KEGG:ns NR:ns ## COG: PA1920 COG1328 # Protein_GI_number: 15597116 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Pseudomonas aeruginosa # 6 653 15 657 675 592 44.0 1e-169 
MLLVAKRDGETVEFQLDKITLAIKKAFKATGKSYTADIIELLSLRVTSDFQMKIKDSRIS VEDIQDSVEKVLEESGYTDVAKAYILYRKQREKIRNLDKTILDYKDVVDNYVKVEDWRVK ENSTITYSVGGLILNNSGAVTANYWLTEVYDREISDAHVNGDIHIHDLSMLTSYSCGWSL RKLLLDGLGGITGKITASPAKHLATLCNQMVNFLGIMQNEWAASQSFSSFDTYLAPFVKA DNLSYDKVKKCMEAFVFGVNTPSRWGTQAPFSHIFMDLKVPGDMADEKAIVGGRRMEFTY GDCQKEMDMVNRAFLETMLEGDANGLGFQYPIPAYGIGADYDWSDNEQNRLLFSLASHYG TPYFINYMGNGMKPEDVRTSYDGIRPDFHVLRKKAGGFFGYGEHTGSVGVVTVNLPRIAY QSKDEKEFYARLDQMMDLAARSLSIKRNVISTLLKNGLYPYTACYLTDFDNHFSSIGVIG MNEAGLNAQWLKAGLESEETKKFAVSVLNHMREKLIGYQMEYGNLYGLEATPAESATFRF ARLDRERFSGIRTAGHEGDTPYYTNSTKLPADYDGTLQGALINQDSLQPLYTTGTVFHVY MERELEDWKKARDLLSSMITEHHIPCFTLSPVYTICQTCGYLPGRQERCPKCNGAADVYS RIAGYYRPVHDWNDGKAQEFRNRIMFRIGE >gi|157101634|gb|DS480690.1| GENE 73 71558 - 72409 474 283 aa, chain + ## HITS:1 COG:lin0782 KEGG:ns NR:ns ## COG: lin0782 COG0384 # Protein_GI_number: 16799856 # Func_class: R General function prediction only # Function: Predicted epimerase, PhzC/PhzF homolog # Organism: Listeria innocua # 1 272 3 272 282 274 51.0 2e-73 MDVYVANAFSKDNRGGNKAGVVLQQARLNEAEKTEIALLMGYSETAFVADSELADFKLEY FTSVGEVPLCGHATIGTFTVLRHLGRLEKQEYTIETKSGILRIRVDKSGLIFMEQNIPEF YERLDKGDVKACFDSDGIPDALPVQIVSTGLKDIILPVTDPEILGRMTPDFSAITRLSRE KECIGIHAFSLAKEMGLTAICRNFAPLYGIDEESATGTSNCALACYLYRYVSKQDQYIFE QGHNLNSISRIYVNLDTEDNTITGVWVGGYGYLSCLSQICCQP >gi|157101634|gb|DS480690.1| GENE 74 72464 - 73228 505 254 aa, chain - ## HITS:1 COG:no KEGG:Asuc_1312 NR:ns ## KEGG: Asuc_1312 # Name: not_defined # Def: MerR family transcriptional regulator # Organism: A.succinogenes # Pathway: not_defined # 7 250 18 255 260 115 32.0 2e-24 MKHISRIKEVSELFHLPASALRYWDDEGLIRFERSKDNHYRCPTSQTMLDICDVIFYRSL SLSIKEIKSIPGMCVEDVDHTLETNARRLEDQIRQMQMTLEKLQTRRSMVQRIMDLERTS FQVLRDLLPAMKLFSPEDRESLETYVQDPYQSSILIKPQQGQEIQYGIFLACPDYDLGNS VILRDQDAESRLYLKGLLKVNAQSPDCNNAGAFLEAAQSMGYGSGQLTGRYLLTACDGYR CDYYEAWLEIWDNG >gi|157101634|gb|DS480690.1| GENE 75 73360 - 74679 1075 439 aa, chain + ## HITS:1 COG:VC0650 KEGG:ns NR:ns ## COG: VC0650 COG0534 # Protein_GI_number: 
15640670 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Vibrio cholerae # 1 423 6 431 445 158 29.0 2e-38 MLRRFIGAVLPSMLAFAFSGLYTIVDGFFIGRNIGDIGLAAVNIAYPLAALVQAVGTGIG MGGAVWISLYRGKGDREREEECLGNTLTTLFLGGLLLMAVLFALCGPVLRLSGAEGSVYG EAMVYIRILIGGSLLQLFATGFAPLIRNYEGAVTAMGSMIGGFLTNILLDFLFVAVYHKG VAGAAAATVIGQGVTVVPCILFMGARIKGIRKEHFLLKKKQMLRIAGTGVSPFGLTISPF IVIMLINRSAYVYGGEAAVAAYAVISYVVAIVQLLLQGIGDGSQPLMSFYLGIGKPKQAR TVRNMAYLFAAVTALANMGILCLLRSAVSGFFGASSAAFPIVAESMPVFTAGFLFIAFCR TTTSYFYATKQNLFSYLLVYGEPVLLFFLLTLGLPPVMKLEGVWLSVPITQSLLAVLGIA LLRMEERSIPGYQYSPKND >gi|157101634|gb|DS480690.1| GENE 76 74596 - 74787 189 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MANEMIKETEQVIDQEIVMALSIDRVMGLLLIGVTMLIIFRAVLIARNASLLHPQKSYPQ NCQ >gi|157101634|gb|DS480690.1| GENE 77 74909 - 75532 329 207 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0970 NR:ns ## KEGG: Cphy_0970 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 204 1 222 432 139 35.0 1e-31 MINREDMLELTRRMTLSRTSFTRIAGCYVDRDGDFDGSFNINFLKLSASERTKKLKLAKE IPFAATNVNLKKYEYPQGARKPGSMWQLLMAMNECGLKNDALMDTFYDVIMEHYRAEREY AILVFHDRYDIPAKGSDKERQWESEEVFEYMICAVCPLSGEYEPDKPVCGFLFPAFTDRS GDLNHIDVFQADAGKPHNEILKLLEII >gi|157101634|gb|DS480690.1| GENE 78 75704 - 75862 156 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938871|ref|ZP_02086222.1| ## NR: gi|160938871|ref|ZP_02086222.1| hypothetical protein CLOBOL_03765 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03765 [Clostridium bolteae ATCC BAA-613] # 1 52 3 54 54 95 100.0 1e-18 MDNREVICYCAGVTKAQILEALDNGARNLDDIKQMTGACTIGRCKELSPTGK >gi|157101634|gb|DS480690.1| GENE 79 76197 - 76745 397 182 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938872|ref|ZP_02086223.1| ## NR: gi|160938872|ref|ZP_02086223.1| hypothetical protein CLOBOL_03766 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03766 [Clostridium bolteae ATCC BAA-613] # 1 182 42 223 223 364 100.0 2e-99 
MRLDGYLVSDIRDCLVHTEEDTLERFRKAQNKPEFNPSAGFGGNLWFAKRGMFDISLWLI VLNMTAVPLMAAAYGWMHKGSRSLPYDTGYVFLFLLIMLALEFYPLGKIADRMFWKHTRE VLDFHGCNDRAEEENPELKKMLAEDGGLSTANSLIILGLDLLLMFFCKQVTTAIFLYFSY TR >gi|157101634|gb|DS480690.1| GENE 80 76757 - 77716 517 319 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938873|ref|ZP_02086224.1| ## NR: gi|160938873|ref|ZP_02086224.1| hypothetical protein CLOBOL_03767 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03767 [Clostridium bolteae ATCC BAA-613] # 1 319 1 319 319 649 100.0 0 MKPLKKSTGILAAVILIVMISGLLLANQMIDHLENNITTDYTKVYHDDLNRLLGEGWMVL AKESGYTVTPFDSTNIYTWEIGYKDKDGQERSILLSNFKDRSMEHPFGYCIEKNFQTMTG EVYARDVISPVIPAVCWQPEYWIRFLNYYVSGDHITDRPYKDPAADISPRAEVALKTSPL LDESDSSMDWENPESDFCLYDIDYAGIFKRKPDLWYLSVSLDFYLEDMEQAEYAGKADAL EETAKQVLARLNEYTGHTVNAEIIYPTGSDQKGTAYHHWYVINGDIVDQSVSDIQLPIFG VVDPMYNHMKKFYHPEQEY >gi|157101634|gb|DS480690.1| GENE 81 77920 - 79116 500 398 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938874|ref|ZP_02086225.1| ## NR: gi|160938874|ref|ZP_02086225.1| hypothetical protein CLOBOL_03768 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03768 [Clostridium bolteae ATCC BAA-613] # 1 398 1 398 398 705 100.0 0 MVKKLLSALLALFILSSFPLTVLAGDLPTSTNTINWSTVEVFAYGSGTYKKSLGKVQEAG GQGAFGKEFTPSDGMGIQGFVLVLPRSSFPTTGKWKAQVSIQTSSTSLLLDRMYARGRVK RPNASDLSGQLPKIASDTIPPDFYTVTYNVNSQSLTSVEFFFPYAFPLYGTQRLNFSSVI SFRDYESGNINNPVAPLPDASEQNQAIANSVQDTANNTAVLVQKQDTIIEQIVDTTQTIS NQLHSFWDQLAGEFTNMYNKMNQHHSEDLAANRRNTEDIIDSQESNTTNIINNNNANTDK LANGYDKSRLDNSNAQLNDSIHNMQENEKQLMEDVSNNINSFRYDDYFTRIRGPLSDISY FLNHIYNNMKGLNIPIGFSLTLTIAMLCIGYYRFKGGT >gi|157101634|gb|DS480690.1| GENE 82 79117 - 79335 258 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938875|ref|ZP_02086226.1| ## NR: gi|160938875|ref|ZP_02086226.1| hypothetical protein CLOBOL_03769 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03769 [Clostridium bolteae ATCC BAA-613] # 1 72 1 72 72 90 100.0 3e-17 MFQFFDAVAGFVETVVGFIVNMIATLILVIVNIVRSVGWLVMCLSYLPPWMVGFVVVPIS 
LAVVFQILNKGS >gi|157101634|gb|DS480690.1| GENE 83 79429 - 80403 349 324 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938876|ref|ZP_02086227.1| ## NR: gi|160938876|ref|ZP_02086227.1| hypothetical protein CLOBOL_03770 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03770 [Clostridium bolteae ATCC BAA-613] # 1 324 1 324 324 581 100.0 1e-164 MVIFWSLSLLFLCLLLRLLFPSVVREAVHLLWFVRLGHLNVVVVLSSFILLSCPLLAYAD TATPSNASRPRRETELVRYEDAIPVASSSNASYTDLDLLVDDRTLMDIDDFEDLKRDDLL RYILCELVAIRDSFGGPGLASSSDAALSSLEDAQDVSDLEEGPTDIIPMDAPVRSARSVS ASDDYVNVLRYDVSISGQEYTLLFPPEYADSLYVDSRGRLFNVSANAIQGRLVDGNFNPY AQTGKLVYLTPCLGNNFYSIREYGSPNYVREYYWSAGRLQYRDTYVNIQVNQYHHIFKVS DTLTYIVVVLIGGCLICLWRKSSR >gi|157101634|gb|DS480690.1| GENE 84 80376 - 80519 140 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938878|ref|ZP_02086229.1| ## NR: gi|160938878|ref|ZP_02086229.1| hypothetical protein CLOBOL_03772 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03772 [Clostridium bolteae ATCC BAA-613] # 1 47 1 47 47 71 100.0 2e-11 MFMEKILALMGDVPVQFYPVCYAFGFIAFLWIVDKFFHIFVSLIDRR >gi|157101634|gb|DS480690.1| GENE 85 80560 - 80682 105 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFHELSLLFDILWRHSYWVGMFVIGLPLLKKVVDIFRRLL >gi|157101634|gb|DS480690.1| GENE 86 80705 - 80989 292 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938880|ref|ZP_02086231.1| ## NR: gi|160938880|ref|ZP_02086231.1| hypothetical protein CLOBOL_03774 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03774 [Clostridium bolteae ATCC BAA-613] # 1 94 1 94 94 155 100.0 1e-36 MLKIKNAVKKYRCVPVATGVASVAMAFPAFAEGATPATGLDSLLATFSVVTAWLWKECGL LLTWILSQPILLAAMCLFFAGAVVAFFMRIYHQV >gi|157101634|gb|DS480690.1| GENE 87 81195 - 81980 461 261 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938882|ref|ZP_02086233.1| ## NR: gi|160938882|ref|ZP_02086233.1| hypothetical protein CLOBOL_03776 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03776 [Clostridium bolteae ATCC BAA-613] # 1 261 1 261 261 514 100.0 1e-144 
MYALSVLLFGFGLFWIVFFAYHFFKYRNPYKLFMVFGKKGSGKTTLMCKMALKYRKKGWH VYSNVHIPGTYHFDTVDIGVAHFPENSILLIDEVGLVWDNRNFKSFPEHVKVYFKYQRQY KHIVYLFSQSFDVDKKIRDLTDHLYIIHNFLNCFSIARRITKTVAVVHADKSASGESKIV DDYNIDSLLLAPFGSVRFTYIPKYVKYFNSYNPPQLPEKEFEYMPFPELVKQGKLGALRR AMAGGRVQLHEVVRKWNRKKR >gi|157101634|gb|DS480690.1| GENE 88 82074 - 82994 254 306 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938884|ref|ZP_02086235.1| ## NR: gi|160938884|ref|ZP_02086235.1| hypothetical protein CLOBOL_03778 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03778 [Clostridium bolteae ATCC BAA-613] # 75 306 1 232 232 473 100.0 1e-131 MLYYFNPLLEDGIFYSCDSLRYSFEFPDVDTVESFLSFLSHLPGCTHYHSLKDFDYRYLF VFGIKGLSFSIGLCMNGVKKETVLQGFLDFNPNKILGEIAYDDGFMRTSVSPFDQEEVEY RGLQSQIGQLLRTVLNELFPMVETIKIKRWDLAVDVPYGRDCVQLIKDNRKYSQFYKSAQ DFTEYLGCMSAPGRVKVYNKQIEAGLDYPLTRIEVTLDSLDYIDCCRCWPNVYTRKVIDL AESKVMVQLLAEQPVDRMDYYLRQMSAPTKRRYKALLLDRPFEIEGPIFNKLRSQLLRYQ EGNFYE >gi|157101634|gb|DS480690.1| GENE 89 82987 - 83166 243 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938885|ref|ZP_02086236.1| ## NR: gi|160938885|ref|ZP_02086236.1| hypothetical protein CLOBOL_03779 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03779 [Clostridium bolteae ATCC BAA-613] # 1 59 1 59 59 113 100.0 6e-24 MNKPVGKIKVNVYLTTKQYDRLCRYMEFCECYSTPAEMATRLVQVGLAEAIKRGAIEED >gi|157101634|gb|DS480690.1| GENE 90 83168 - 83368 186 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938886|ref|ZP_02086237.1| ## NR: gi|160938886|ref|ZP_02086237.1| hypothetical protein CLOBOL_03780 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03780 [Clostridium bolteae ATCC BAA-613] # 1 66 1 66 66 88 100.0 1e-16 MTSEDKIYFCSVFFVVLPLLYFAIHNYFVRKRSRGRKYTDLDMIVCYVYMIAFILFSFYF KYIANL >gi|157101634|gb|DS480690.1| GENE 91 83513 - 83779 309 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938888|ref|ZP_02086239.1| ## NR: gi|160938888|ref|ZP_02086239.1| hypothetical protein CLOBOL_03782 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03782 
[Clostridium bolteae ATCC BAA-613] # 1 88 1 88 88 119 100.0 8e-26 MENKELIEAIRGMLKEELEPIHEEISKVKDRLTVLEIKEQATHKKLDNLTLDVKIAERDT RRDIAQANDAIETLIAVLEAKGILPKAL >gi|157101634|gb|DS480690.1| GENE 92 83856 - 84107 129 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938889|ref|ZP_02086240.1| ## NR: gi|160938889|ref|ZP_02086240.1| hypothetical protein CLOBOL_03783 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03783 [Clostridium bolteae ATCC BAA-613] # 1 83 1 83 83 142 100.0 6e-33 MSSFNATKYKNEFTKEKYDRLNIQVPKGQKAVIEEYWKAKGYKSLNAYVNDLIQRDMNNT PGIQINHNKGIVAGNIHGDINMK >gi|157101634|gb|DS480690.1| GENE 93 84314 - 84550 165 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938890|ref|ZP_02086241.1| ## NR: gi|160938890|ref|ZP_02086241.1| hypothetical protein CLOBOL_03784 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03784 [Clostridium bolteae ATCC BAA-613] # 1 78 2 79 79 147 100.0 2e-34 MDKKRIRQTLNDFVERYLLGDVDGDFRFNYIWIRSQLSFAFSIDAITIDEFDALSSVVLH AYKTDRRGPECVDFPRLS >gi|157101634|gb|DS480690.1| GENE 94 84523 - 84723 161 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938891|ref|ZP_02086242.1| ## NR: gi|160938891|ref|ZP_02086242.1| hypothetical protein CLOBOL_03785 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03785 [Clostridium bolteae ATCC BAA-613] # 1 66 1 66 66 110 100.0 3e-23 MRRLSKALIEQEQNETSVAICRAMALHDQCRVDVLQYHFARLEHILAYLDEKTDSIPSIS SEVQTT >gi|157101634|gb|DS480690.1| GENE 95 84810 - 85070 183 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938892|ref|ZP_02086243.1| ## NR: gi|160938892|ref|ZP_02086243.1| hypothetical protein CLOBOL_03786 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03786 [Clostridium bolteae ATCC BAA-613] # 1 86 1 86 86 169 100.0 9e-41 MVYEILGIERFEGISKKTGKPYDFTRFYAVPSRQPDKVRGRRVEQVDVWNGEYAPDISGI HPEDFVDISYGRGGFVEECRLVDSPN >gi|157101634|gb|DS480690.1| GENE 96 85232 - 85450 223 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938893|ref|ZP_02086244.1| ## NR: gi|160938893|ref|ZP_02086244.1| 
hypothetical protein CLOBOL_03787 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03787 [Clostridium bolteae ATCC BAA-613] # 1 72 21 92 92 116 100.0 5e-25 MNIELTERELRYLNRVVNVRLDELIERCARIRRIRSLEDIITSERFSIAESEIKVMKGVH DKIADALSDCNM >gi|157101634|gb|DS480690.1| GENE 97 86057 - 86650 268 197 aa, chain + ## HITS:1 COG:no KEGG:Closa_0878 NR:ns ## KEGG: Closa_0878 # Name: not_defined # Def: 3D domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 91 195 138 240 240 85 41.0 1e-15 MCRRIFLSVLFFTAVLLLKVDFFDSYALPLKGDIEVQTADGSNTVEAPGIIVIQDTISGE GGLETKGLQDIKDDRINKAAQEKQADTDVPLQASSQGAAGEESCGMFEITGYCSCELCTG ENHFTKSGTAPRPEYTVAADPDVFPLGSRIRIGEIIYLVEDTGASIKGNVVDIYYETHEE AVANGRYETEVFLIRGM >gi|157101634|gb|DS480690.1| GENE 98 86815 - 88497 1151 560 aa, chain - ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 3 556 6 514 555 280 34.0 7e-75 MKNLKIGPKLILCFMLVTCLSSISGILGAFFLHWSDQNYSKALVENGFSQGDIGSFDTYL NKGSAVVRDIVLLTEEKEIQNSRSELEQIDQKMQQSLDALKINCQTPEEQSLIAVIDQEL PRYMEIRSQVVEMGLKNHNDEALDIFRNQARPILNNAMNAAQKLADLNKQMGNEISRSLT DNSRLIIYMIVFVIIISVSISVAMSFWVARLISRPILNVRDAAFKLAQGDLDIFISNDSN DEVGEMTRSFSEAAGMMRRYITDISRGLSEISHGNFNISPNEKYKGAFITIEESLFTIIH SLSNTMGEINEAAEQVSSGSEQVSIGAQSLSQGSTEQASSIEEVAATVNEIASRIKRNAE NALSANEKAVEMGRRITESNRQMQDMIHAMDEISSSSKEIGKIIKTIEDIAFQTNILALN AAVEAARAGEAGKGFAVVADEVRSLASKSADASNSTSALIENSLRAVLNGTRIADDTAKN LSDVVSGVKDVVATIGQISNDSTEQADAIGQVTQGIDQVSSVVQTNSATAEESAAASEEL SGQAQLLKKLISQFRLITVQ >gi|157101634|gb|DS480690.1| GENE 99 89019 - 89381 250 120 aa, chain - ## HITS:1 COG:CAC1483 KEGG:ns NR:ns ## COG: CAC1483 COG1733 # Protein_GI_number: 15894762 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 9 109 7 107 108 129 60.0 2e-30 MTQQERSMFNCPVEATLSLIGGKYKPIILWHLIKEPLHYMELQRLIPGATPKMLSQQLRG 
LETSGMISREVIPEKPPRTIYSLTEFGKSIIPVLDVMCTWGTDYLNGLNVKTPCQKQERG >gi|157101634|gb|DS480690.1| GENE 100 89556 - 90071 545 171 aa, chain + ## HITS:1 COG:CAC1484 KEGG:ns NR:ns ## COG: CAC1484 COG0778 # Protein_GI_number: 15894763 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 1 170 1 170 172 224 61.0 6e-59 MEFLSIAKQRFSVRSYTSQHVEPEKLEKILEAARVAPTAANLQPIRLIVAQSEQSLAKIG KAANIYGAPLAIIVCADHEKAWVRPFDKKQTGDIDASILTDHMMLQATELGLGSVWVCYF NPDILCREFELPEHLEPVNILAIGYSNAGTADTSRFDRQRIALSQLVTYEK >gi|157101634|gb|DS480690.1| GENE 101 90155 - 90910 464 251 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938904|ref|ZP_02086255.1| ## NR: gi|160938904|ref|ZP_02086255.1| hypothetical protein CLOBOL_03798 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03798 [Clostridium bolteae ATCC BAA-613] # 1 251 1 251 251 517 100.0 1e-145 MNDNYNTYRIWAPDNALWTQWAKPVLFARTLQQVPEKLVLPSVKWAPYGDGRTALFVDLP GKRGVLEGLALAQMGYRPVPLYNGVYGADKWSMAVDVTSVAETLYQGADYLSCQHIRPDA PPAFLLDAARMKGTARQPGRYDNRWCVFPQDAPSADFLKAQGIESIYVRTKEIQNDLAHI LLRYQKKGIRIYQVRDNGEPKKLTVVRPSHFKSFLYRFCTLLGLTRNAAGGFGGMVPEAT QSSGTRYYGIG >gi|157101634|gb|DS480690.1| GENE 102 90903 - 92123 696 406 aa, chain - ## HITS:1 COG:no KEGG:sce3342 NR:ns ## KEGG: sce3342 # Name: not_defined # Def: hypothetical protein # Organism: S.cellulosum # Pathway: not_defined # 29 398 23 413 414 66 24.0 3e-09 MSGAVLARIPGERYSDYRYEAIFRAYKWDPQVEDHNTVAEHVVLLDSQTARQLEQWAEQL SAETMEIEQAMMERSDLVKKLGLPSKVKKAIPRMGSYSPERHVRLMRFDFHPTTDGCSVS EVNSDVPGGLAEASVLPRIAQPYFPQYEPQGHVARSLLEAFQKRTGPGVRIAFVHATSYA DDRQVMQFLGDYFEENGYRSLYAAPDHLIWREQQAVSLIQGEEGPVGGIVRFYPLEWLPN LSGRTDWGGYYDTQTPSCNHPIAVFAQSKRLPLIWDELGLELPAWRALLPETRAPEAGKF GDGWIYKPALGRVGEGISTREALTPKEYRQIQRNVRWHPKDWIVQKRFESKPVSGDEGEH FHLCVGVFTVDGKAAGFYGRISPFPRIDARAKDIPLLVEKEDKPNE >gi|157101634|gb|DS480690.1| GENE 103 92120 - 93133 552 337 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938907|ref|ZP_02086258.1| ## NR: gi|160938907|ref|ZP_02086258.1| hypothetical 
protein CLOBOL_03801 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03801 [Clostridium bolteae ATCC BAA-613] # 1 337 13 349 349 634 100.0 1e-180 MNNECGNQKECRLRQRELFNLLQTVKFWCKIILYRRLIQVFGLEVLFMEMLVFGIAAAVS YWYETEVFLFLGSAYVTCRFLYKWYMPLLETWPPEHGETGRRVLGCLPPVSLAIILFVLL TMASYDVVGIWVVFYILMGYAWICGGVFMMECLDLSWPFDAVHLDNKAAVFPAAGGFVGV TLIYAGANTGDGPGWWVVLFAAGLGFITWLGLALVINLLTDISERVSVDRDMGSGIRFGA YLIASSLILAYASGGDWTSTQATLMEFSVGWPVLPLMLFSLAVEWFYIRPSRSLADNRLQ ANILPSVLWGVIYLVYAGIVLLIFHGIPERISGGVLL >gi|157101634|gb|DS480690.1| GENE 104 93213 - 94160 722 315 aa, chain + ## HITS:1 COG:SMa1894 KEGG:ns NR:ns ## COG: SMa1894 COG0229 # Protein_GI_number: 16263495 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Sinorhizobium meliloti # 182 312 12 142 147 194 70.0 2e-49 MQKIYFAGGCFWGVEEYFSRIPGVCDTECGYANGAVENPTYEEVCSDTTGFAETVCILYN PDIVALRILVRQFFKIIDPVSINRQGNDRGSQYRTGIYYISDEDLGPIQSVMFEIQKDYS QPLAVELEPLRNFYPAEDYHQNYLQNHPHGYCHIDFSSLKDLIVRPDGRTGIKQDRAELQ KQLTREQYDVTQNASTERPFSGPYWNHHEPGLYVDIVTGEPLFTSAHKFDSGCGWPSFTK PVMSCAVTELPDSSHGMHRVEVRSREGDSHLGHVFEDGPAGTGGLRYCINSAALRFIPRS RMEEEGYGAYLDLAI >gi|157101634|gb|DS480690.1| GENE 105 94206 - 95147 773 313 aa, chain - ## HITS:1 COG:FN1850 KEGG:ns NR:ns ## COG: FN1850 COG0332 # Protein_GI_number: 19705155 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Fusobacterium nucleatum # 4 313 2 309 309 315 50.0 7e-86 MRERKVKIAGWGNTLSGKKIEFDGQTRYRLDKDETLLMLAVRSARKALERAAMDISDMDC IVCAMATPLQAIPCNAALVHEQLAKGLDIPAMDINTTCTSFISALDVMSCLIEMERYKNV LIISGDTASAALNPRQKESYELFSDGASACVLTRAGEQESSKILYSCQKTWSEGAHDTEI RGGGGMMPAFSMNEENREDFYFDMKGKRILKLSAQNLPGFVEQCLKEAGVKRESIDLVVP HQASRALDVIMPRLGFRKGTYIDRVSQYGNMISASVPYALCEAIEEGRVGKGDLVLLIGT AAGLTANFLLMRI >gi|157101634|gb|DS480690.1| GENE 106 95144 - 96466 973 440 aa, chain - ## HITS:1 COG:FN1849 KEGG:ns NR:ns ## COG: 
FN1849 COG1541 # Protein_GI_number: 19705154 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Fusobacterium nucleatum # 1 433 1 424 424 353 40.0 5e-97 MADTLSVLKYYLYYKYKKRFSDRSALERWQVQKIRKHLEYVGEHSRLYKGMKKLSSYPVI DKKFMMEHFDELNTVGIGREEALEFAVLAERQRNFSPKLKGVTVGLSSGTSGKQGIFLVS DDEKNRWAGYILARFLPGSLFETYSIAFFMRADSNLYQAVKSKNIQFHFFDIYKDMEEHR KRLEEIRPRILAGQPSLLLMIAEDVRKGKTDICPRIVISIAEVLEQADEQRIKEAFGLSV IHQVYQCTEGCLAATCSMGTLHLNEDIVYIEREYLDDRRFVPVVTDLERKAQPIIRYRLN DILVERKEPCGCGSPFLALEKIEGREDDVFSFKGKDGRNVPVFPDFIRRCILFGDKGSGE NQAGSSGDYRIVQETDGRLTVYADMADPGKQRVLKEFELLSKDCGFLLPDISFQPYSCEK GRKMKRVERRGAEPVKGRYI >gi|157101634|gb|DS480690.1| GENE 107 96459 - 98387 1209 642 aa, chain - ## HITS:1 COG:FN1847 KEGG:ns NR:ns ## COG: FN1847 COG0451 # Protein_GI_number: 19705152 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Fusobacterium nucleatum # 12 328 4 317 328 269 42.0 1e-71 MGKTVENGKNTVLITGATGFLGEYLVRRLTKEYRVLAMGRNREQGRKLEGLGAVFCPGDF TDRKTCEAYFKGVRYVIHAGARSTVWGRWEDFYRTNVAGTALVAELCLENGIERLVYISS PSIYTVKCDRYDIREEQAPKYNDLNHYIRSKLSAERVVEDVHQKGLETVILRPRGMIGVG DTSLVPRLLRANMRIGIPLMREGLNTVDLTSVENVAQACQLALTARAADGMAFNITNGEP MEFKTLLEHFLAAIGEKPHYRKLPFGAVYGMAAAMEWVYRSFHFPGEPALTRYTVCTLGF SQTMDISRARTILGYEPEKTLMESIEEYGKWWKNRDEPVPDRIARVKMYHCGSCTNDLGL LFKRHPGQKREFPARAFLIQHRDLGNILYDTGYSQAVYEDGFLLKLYRRLNPVHVKPDQI IDEKLRADGINPESVRTIILSHAHPDHMGGLKHFQGYHLVATEQVHKALLRPSVRNLVFA NMLPNRSAHMHPKRWGMFSGTPREAVSELPQDMMTDIPSGKCCEVRGRTPKKRLSEHFLC RYFEQVYDLLGDGSIIGVVLDGHCRGQMGIWIPDFKLLLAADASWGGDLVRHTLEMRLFP RLIQNDFTEYKKTLKKLCELHRDHPQIRIVFTHEKGSEETYG >gi|157101634|gb|DS480690.1| GENE 108 98429 - 99535 721 368 aa, chain - ## HITS:1 COG:BS_ydhE KEGG:ns NR:ns ## COG: BS_ydhE COG1819 # Protein_GI_number: 16077639 # Func_class: G Carbohydrate transport and metabolism; C Energy production and conversion # Function: Glycosyl transferases, related to 
UDP-glucuronosyltransferase # Organism: Bacillus subtilis # 123 366 141 361 381 105 29.0 1e-22 MEPYRREIEANDCEFRAYPLDRDSIDASDGDKPLKLYRLILEYTRDMLPHLLAEAEKEKP YRVIFDSLALWGRAVGEIMKIPSCSFYSIAVVDRVGGNAFWAYASGFSASFLRYAGEFNR AMDIRRKLRQTYGIRKLGVLSVLMNRGSSNLMGYSKDFQPGGNKLGNGYVFLGPLAPFRK VIQTNDFVCPDDRLIYISLGTVFNRDEGLLREILRQFGQKHPVDGEPYKNENPWNVVMVW NMETAGRNTDTKKDEWKHRDNFIIRPFVNQGEILKHADLFITAGGMNSIHEALYYGVPCL MCPQQGEQRLNAGQFEALGFGRILRDRLDLYGEAMAAMKLKDQWDEGLRTEMCAVHMEAA LQLFDENK >gi|157101634|gb|DS480690.1| GENE 109 99950 - 101491 1027 513 aa, chain - ## HITS:1 COG:TM1357 KEGG:ns NR:ns ## COG: TM1357 COG1690 # Protein_GI_number: 15644109 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 62 513 57 474 474 182 33.0 1e-45 MELKLEFGGQSCGVSIFLPNGTKPEKKAADELKSMMELTGTVERMKESGVFSAEKEHSVI KAAVTPDFHKGAGIPIGTVLMTEGFVIPQAMGNDINCGMRLYTTDLKEEELECCLVSLLP QIRRIFFEGGRSIPMNGIQREAMLRHGLTGLLETSGEAEGRGIWSLYDRDEQEKALEYTA FKGSLYTTGTEGLGDFIGAAHTLSYDDQIGSIGGGNHFVEMQRVAEIYDGRTANAWGIRK GSILVMIHSGSLTIGHQSGRINRIITKELYPKGVPHPDNGIYLLPEREKMEINSRENVPV SDKTDSPWQRFCSTTYNAANFGFANRLFLGLMLQRAVTDKVRPYEMKLLYDSPHNFIWKE EMEGRNYFIHRKGACSARGMEGMEGTPFAYYGEPVMIPGSMGSSSYLLRGMGNPQSLWSA SHGAGRRLSRGEAIHGNDREFKEFMKKFHIVTPIDPDRSDLKGRNDILKKWEETIRSEAP YAYKDIHQVVKVHEEHGMAGLVARMEPVFTVKA >gi|157101634|gb|DS480690.1| GENE 110 101654 - 102706 1015 350 aa, chain + ## HITS:1 COG:CAP0080 KEGG:ns NR:ns ## COG: CAP0080 COG0582 # Protein_GI_number: 15004784 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Clostridium acetobutylicum # 41 344 19 307 323 127 31.0 3e-29 MSQLSYHEQVDRENILRLRNLIKDLPPFCGDFFRGIEPRTSSRTRIAYAYDLHVFFDFLH KENPALARITVQDIKLDQLDQLSVTDFEEYMEYLKYRFNDKNQEVMNKERGIMRKISSLK SFYNYYYRNERIKNNPAALVQLPKLHEKEIIRLDVDEVALLLDEVEQGDKLTDKQKNYHA KTKTRDLALLTLLLGTGIRVSECVGLNISDIDFKNGGIRIYRKGGKEVTVYFGTEVEDAL LDYLEERDRIIPEQGHEDALFLSLQRRRMAVRSVENLVKKYARTVAPLKPITPHKLRSTY GTNLYRETGDIYLVADVLGHSDVNTTKRHYAALEDERRRSARNKVRLREK >gi|157101634|gb|DS480690.1| 
GENE 111 102852 - 104204 584 450 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938916|ref|ZP_02086267.1| ## NR: gi|160938916|ref|ZP_02086267.1| hypothetical protein CLOBOL_03810 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03810 [Clostridium bolteae ATCC BAA-613] # 1 450 1 450 450 924 100.0 0 MIKETQNKPEQERKAEKAVLSSRQKLLIRAPRTIRIIGAIGIVFGGLMSFWCYRDYLNGN ETASAGMSLGFLLIFGTMGLWLWCYGCRMLQVKGEEILFRDIFFRKHRYTLDDICSVRWN HDGFLFSGSTGKLFKIYGYCPDCKRLFELLEERGAQVDLPGRMFSPEQTAAVHPEPDKRR FLVRTSPWLPKEVSKIEVRGRQLAVSKLFRKEFTCSCLDVAKIQLKENKEGRLNARIYGG EGTCLAKVRTFSADASDMRCAFAFLRHMIELGIPVEGAEKTSEYVRCLLQNRFVSFSIGQ SLFKEEYQRICPLLDRYAAVFSDLGIQMEYGPIDKEQKEELEKSLALEKITYDTFNRGFF ICLQRDGQMLFDKKASLPLAEMVMLLAEAPEMGRGRREDRYLEDQGMAAVQNLLYFRPIP EIVIQQILEYFLNKVKKKQIHISDIKRDWS >gi|157101634|gb|DS480690.1| GENE 112 104257 - 104991 713 244 aa, chain - ## HITS:1 COG:CAC3658 KEGG:ns NR:ns ## COG: CAC3658 COG1285 # Protein_GI_number: 15896891 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Clostridium acetobutylicum # 26 241 7 227 229 150 37.0 2e-36 MDWHRARNRTVLPGRRSYMNQFVTEQLIYVLRLVAAALCGGLIGCERQSKKKTAGTRTHV IIAVASAMMMIISKYGFGDVLNEYVKLDPSRVAAGVVTAIGFIGSGIIIFRNNSVNGITT SAGIWATVGVGMSMGAGMYILGAVSTAIVMLSELFLGRKGYFSRRNGEDKEVEIEYCESL DEDIFQFINRTLQDQNCKMVNIQLTEEASTYVLTAQLKTPAYYDMSGLVGELKKKGQLKR ISVQ >gi|157101634|gb|DS480690.1| GENE 113 105182 - 107254 1190 690 aa, chain - ## HITS:1 COG:CAC3437 KEGG:ns NR:ns ## COG: CAC3437 COG4219 # Protein_GI_number: 15896678 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component # Organism: Clostridium acetobutylicum # 195 364 156 328 541 74 24.0 6e-13 MNMCLKMLFTLSLSGTALILVLLGATRIFGSHMGRRWQYYIWLAAVLRLLLPIPSPAGLI FPDALSSGMSAGLLSDGGNLYAWMDGRSGSGNPAGQPESVMGVRPGTGTERLAETGDMEE MAFTVGDTAKADTAKAYTAKAPGLYLFLLTLWLSVAALLMGRRIWNYRSAARLYKRNESL AIARDGGLEAYGAPLEEACRVCGLNRRPRLVVSSTLPSPVTMGLLSPVIIMPGDFPAERA 
YYVFLHELTHVRRMDGFYKWLVEAAVCLHWFNPAVYVLQKEVSRACELSCDEEVIRRMDG RNKHLYGEILLTALRRRPMPVSADMGLPLSENAKWMKERLVSIMEFRKRSRPWSLAACIM SATIVSMAVLCGFAPLQRENGTVGAENVLTGAADRLGAQTIAGIGKADTSVRSGGSFADS ETISWNEYEDVRRSPDQKSLQTKMFWVNGYIVVLAWNVDPGQYDVVRQVDGKPLCFTGRT EQYADNSAIAEAVRQSIEKQKLSKSKYGFRPEQMALLEAAGPFEGTADELAQRFYEADNL SYFAAVSGSVSPETCASLMEQFYAENQIEYFAIISENDAPSLKARKTEMAKQAARAGNTD IFSILKDELTDADKAEIARNAYQTDHADIFYMTYESLNQEQAGEMAIQAYEEDRVEYLYV LKERLSSSQISSLKERAGRDGKQEFLYVLD >gi|157101634|gb|DS480690.1| GENE 114 107251 - 107637 442 128 aa, chain - ## HITS:1 COG:CC1640 KEGG:ns NR:ns ## COG: CC1640 COG3682 # Protein_GI_number: 16125886 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Caulobacter vibrioides # 10 126 28 142 144 63 34.0 6e-11 MEKEISLPEAQMEVMEVIWDKGGRTMFGVLGEELAARGKSWKPNTILTLLSRLTENGMLT VRKQGRLNEYVSLVSREEYRQTQARLLVDRIFGGDTKHLISALVSQEYLTEEDYEELKAF WEKGGDKG >gi|157101634|gb|DS480690.1| GENE 115 107742 - 109808 2034 688 aa, chain - ## HITS:1 COG:CAC0120 KEGG:ns NR:ns ## COG: CAC0120 COG0840 # Protein_GI_number: 15893416 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Clostridium acetobutylicum # 381 682 213 514 555 211 40.0 3e-54 MENSEGRGRINRKEGRGCRRSSISFKISLSLLLVLIPFLIILITMACVMAAGAISALNDK ILEVQTDYAVSSVDDFFSSKVTAAGMFRDNDDLQTYFASVSREEDISAYNQLDDVLRVLN SALVQMGEKEKVMEAWVADPRTDSYLLSNGSVVDAGLEDTQWYESVISGGSTIITEPFLD PATHEMIVSIVSPVVSGNGSEITGLFGFDVYLDTLKQTLSDIKVGEAGYLELISNSSDYI YSDDPTASGKNVSEIKISDDYKNKVKNNYNGFADFTYSGVKYTAMFRNCDTTDWLAVATL PLSEVNHTRNYLIETMAFVSILMLALLVLLIVFAVRRVLKPLGGISASMEAFSRGNLDVD IRAAGDDEIGRLSESVRSSVRSLKDIIEDITYILTEVSAGNLAVPVEGNYIGDFRFIREA LEQIIASLNDTLRQINISAEQVSCGSEQVSAGAQTLSRGAAEQAGSVEELATVINDLSQK ITSNADLASEGSRLAANVGEEAVKSDGRMQELQNAIHNIKTSTFKIREIIRTIEDIAFQT NILAINAAVEAARAGDAGKGFAVVAGEVRNLASKSAQASRNSADLITHSLKAVEDGTLIV DATAVSMRNALNGVQNVVKTMDDIASASREQAYSVEQVTREMEQIAGVIQENSATAEESA AASEELSAQALLLKGLLERFRFDNTDVL >gi|157101634|gb|DS480690.1| GENE 116 109935 - 
110372 642 145 aa, chain - ## HITS:1 COG:CAC0587 KEGG:ns NR:ns ## COG: CAC0587 COG0716 # Protein_GI_number: 15893876 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Clostridium acetobutylicum # 1 140 1 139 142 123 52.0 1e-28 MNQIMVVYWSQTGNTEAMANAVAEGIRKAGKDALVTEVSNISADALKDAGVFALGCPAMG AEVLEESEMDPFVAEVETFASGKHIGLFGSYGWGDGEWMRDWQARMANAGADVVGGEGVI AQESPGEDALNHCRELGEALAAAEV >gi|157101634|gb|DS480690.1| GENE 117 110387 - 111004 475 205 aa, chain - ## HITS:1 COG:no KEGG:Clole_3254 NR:ns ## KEGG: Clole_3254 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 205 1 204 204 152 38.0 6e-36 MSKETLHLMLKLNTEDLDTQVALQCAPLLTGMKISNLLTVGRSKRAAVLELFKETAVSCC ILYESSEKTTFLLYIRPALEAYLSRPEVKVLMERFGYRGWDLEEILSLVSIKYEEHMEDC AGFPHEIGLLLGYPAKDVTGFIENNGKNFLYIGYWKVYSDLSECKRIFQSYNHAREHIIH MVSRGMKIDAILRIHNLNQYKSMTI >gi|157101634|gb|DS480690.1| GENE 118 111366 - 112256 688 296 aa, chain + ## HITS:1 COG:BS_yybE KEGG:ns NR:ns ## COG: BS_yybE COG0583 # Protein_GI_number: 16081119 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 16 234 2 227 278 124 32.0 2e-28 MELLQLRYFLTAAEYQHMTKAAEHLQIAQPALSQAIHRLEAELGVPLFERKSRSVELNEA GRLLQKRLIPIVSALDRIPGELRADQYMSSHTIHLKLLAASTLITNRIIAYRALRPDVNF QLYQAENHSDYDLCVSAVRSDLKNTDYEDYMVMLEEDLYLAVPTHSPYGDKDSINLADTR NAGYICLAGSRPIRQICNSYFSEADFVPNVIFESDTTESVRNLIAAGLGIGFWPQYSWGP LGGEWGPMSPHSPVKLLPIRDQVCRRQIILMCSPLGKDNPIVMDFCNFLIQNSKWL >gi|157101634|gb|DS480690.1| GENE 119 112348 - 112764 367 138 aa, chain - ## HITS:1 COG:SPy0761 KEGG:ns NR:ns ## COG: SPy0761 COG0355 # Protein_GI_number: 15674809 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Streptococcus pyogenes M1 GAS # 1 132 1 136 138 76 33.0 1e-14 MSAFWLKIVASDHIFYNGHCSYLAVPAPDGEMGIMPHHEAMILAIQEGNLRFKVPEETEW HQAIVGRGILQVANNRVTMIVDTAERPEDIDEVRAREALERAQEQLRQQLSLREFKMSQA SMARALTRLKGSHLKELK 
>gi|157101634|gb|DS480690.1| GENE 120 112798 - 114198 1663 466 aa, chain - ## HITS:1 COG:CAC2865 KEGG:ns NR:ns ## COG: CAC2865 COG0055 # Protein_GI_number: 15896119 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Clostridium acetobutylicum # 4 461 6 463 466 623 69.0 1e-178 MSKGKIIQVMGPVVDVAFEDDGLPHIKDALVVDNQGKKCVMEVAQHVGGNVVRCIMLASS DGLCKDMEVEPTGAGIQVPVGEKTLGRLFNVLGETIDGGESLEGEDKWVIHRDPPSFEDQ SPVVEMLETGIKVIDLLAPYAKGGKIGLFGGAGVGKTVLIQELITNIATEHGGYSIFTGV GERSREGNDLWTEMRESGVLEKTALVFGQMNEPPGARMRVAETGLTMAEYFRDVEHQNVL LFIDNIFRFVQAGSEVSALLGRMPSAVGYQPTLATEVGELQERIASTKNGSVTSVQAVYV PADDLTDPAPATTFAHLDATTVLSRKIVEQGIYPAVDPLESTSRILEEDVVGKEHYEVAN KVQQMLQKYKELQDIIAILGMEELSEDDKLTVNRARKIQRFLSQPFHVAENFTGVPGKYV PVKETIRGFKAIIDGEMDEYPEAAFFNVGTIDEVAEKAKTLEHHEA >gi|157101634|gb|DS480690.1| GENE 121 114225 - 115118 945 297 aa, chain - ## HITS:1 COG:STM3866 KEGG:ns NR:ns ## COG: STM3866 COG0224 # Protein_GI_number: 16767150 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Salmonella typhimurium LT2 # 1 291 1 286 287 167 33.0 2e-41 MANAREILSRMKSVQDTRKITNAMYLISSTKLKKSKKLLEETEPYFYSLQDLIRRILRHM PDVEHFYFNPRKEIKEEDRVKGLIVITGDKGLAGAYNHNVLKMAHEWMEQSKNHRLYVVG ELGRHYFEARHVQVEEQFHYTVQNPTMHRARLISSTILEDYEEGRLDEVWIIYTQMMNSM QMEAEMTELLPLKREDFSNIVIPADIIQEEIQMVPSPSAVMDHVVPNYVTGFIFGALVES FCSEQNARMMAMQSASDNANAILNDLSIEYNRVRQAAITQEITEVISGAKAQKKKQS >gi|157101634|gb|DS480690.1| GENE 122 115138 - 116661 1772 507 aa, chain - ## HITS:1 COG:CAC2867 KEGG:ns NR:ns ## COG: CAC2867 COG0056 # Protein_GI_number: 15896121 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Clostridium acetobutylicum # 4 499 3 497 505 580 58.0 1e-165 MSTISSQEIISIIKSEIENFDMHAGQEQETGSVIWVGDGIAIVYGIEHAMYGEIVIFENG VKGMVQDIRRNEIGCILFGKDTGVGQGTKVVRTRKRAGVPVGSGFLGRVVNALGEPIDGL GSVKEEDYLPVEEEAPGIVDRKSVGVPMETGILAIDSMFPIGRGQRELIIGDRQTGKTSI 
AMDTILNQKGKDVVCIYVAIGQKASTVAKLVSTLKKNDAMDYTIVMSSTASEAAPLQYIA PYAGTAMAEYFMHQGKDVLIIYDDLSKHAVAYRALSLLLERSPGREAYPGDVFYLHSRLL ERSSRLCDEKGGGSITALPIIETQAGDVSAYIPTNVISITDGQIFLESDLFFSGMRPAVN VGLSVSRVGGAAQTKAMKKASGSIRIDLAQYREMEVFTQFSSDLDEATKEQLAYGKRLME LLKQPLGRPLSLHEQVITLCAATHKVMLSVEVSDIKEFQMNMLAWFDEKHPEIGKEIEET KVLGDDLINRIVDAANQYKTNQYKASV >gi|157101634|gb|DS480690.1| GENE 123 116672 - 117187 631 171 aa, chain - ## HITS:1 COG:no KEGG:Closa_4167 NR:ns ## KEGG: Closa_4167 # Name: not_defined # Def: ATP synthase F1, delta subunit # Organism: C.saccharolyticum # Pathway: Oxidative phosphorylation [PATH:csh00190]; Metabolic pathways [PATH:csh01100] # 1 171 1 166 167 120 39.0 3e-26 MTETARTYAIVLYELGIPEEMVREAVKLLKENPELPEVLNSPVVPLKSKHAVIEKVFREP EFSVLMVRFLKKACDAGCIGQTEEIAKARDAYARMAAGVMDAELHYVTMPGAAQITGMKQ FLCRKYNKKDVNIRLVNQPDLVGGFVLKAGDTEYDYSLKGQLVRLGRAVVR >gi|157101634|gb|DS480690.1| GENE 124 117180 - 117671 522 163 aa, chain - ## HITS:1 COG:SA1909 KEGG:ns NR:ns ## COG: SA1909 COG0711 # Protein_GI_number: 15927681 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Staphylococcus aureus N315 # 9 160 21 172 173 70 25.0 2e-12 MLNLNIWNVAFTVINLLVLYLFMKHFLIGPVRSILDERKQMIEHDLDEAKNRETEARQMK EQYQASIGNADEEASRIIEEARNRAAAEYEKVLAQARADAAKKMEEADRTIALEREKAMN DLKAGVAGLAMTAAAKLISEQSAPDGDRNLYNRFLAESGEGND >gi|157101634|gb|DS480690.1| GENE 125 117696 - 117920 394 74 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2902 NR:ns ## KEGG: EUBREC_2902 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Oxidative phosphorylation [PATH:ere00190]; Metabolic pathways [PATH:ere01100] # 2 74 4 76 76 76 82.0 3e-13 MTGLIALGAGVAVVTGIGAGIGIGLATSKAVDAIARQPEADGKITKALLLGCALAEATAI YGFVIGLLIILFLK >gi|157101634|gb|DS480690.1| GENE 126 117974 - 118663 852 229 aa, chain - ## HITS:1 COG:CAC2871 KEGG:ns NR:ns ## COG: CAC2871 COG0356 # Protein_GI_number: 15896125 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit 
a # Organism: Clostridium acetobutylicum # 11 228 2 220 221 117 35.0 2e-26 MNGLVESLMEELNCNVVFTIPLFGGIDVMESVVVTWIIMAIMMIASLILVHGLKVRNISK KQAALESGMSFIYDFFEGLLGKEGKAYIPYLITVVIYIGIANMIGLLGFKSPTKDLNVTV SLAVMSIVLVEVAGIRKKGVRGWIKSFAEPIAIIAPINVLELFIRPLSLCMRLFGNVLGA IVVMALIKHLLPLIVPLPFSFYFDIFDGVIQAYVFVFLTSLYIKEAIED >gi|157101634|gb|DS480690.1| GENE 127 119069 - 119368 258 99 aa, chain - ## HITS:1 COG:no KEGG:Closa_2009 NR:ns ## KEGG: Closa_2009 # Name: not_defined # Def: Peptidoglycan-binding lysin domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 15 98 11 95 104 85 49.0 6e-16 MRVRKRRLSATCFTVLLFVSIIMGRTMFTSMAKEEGSRTTPYYKSVRIQEGDSLWGIARQ YREGSSMSMDEYVNQLRQMNGLKKDTIHTGQYLTVVYFE >gi|157101634|gb|DS480690.1| GENE 128 119551 - 120165 669 204 aa, chain + ## HITS:1 COG:lin1340 KEGG:ns NR:ns ## COG: lin1340 COG1974 # Protein_GI_number: 16800408 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Listeria innocua # 5 202 2 201 204 219 61.0 3e-57 MAQDKITSKQQEILEYIKETILQKGYPPAVREICEAVHLKSTSSVHSHLSALEDKGYIRR DPTKPRTIEILDDTFNFNRREMVNIPLVGTVAAGEPILAEERIEDYFPFPAEILPNSETF MLKVKGESMIGAGILPGDKLIVEQCPTAANGEIVVALVDDSATVKRFYKENGHYRLQPEN DAMEPIITETVEILGKVIGLVRMM >gi|157101634|gb|DS480690.1| GENE 129 120377 - 121264 323 295 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938935|ref|ZP_02086286.1| ## NR: gi|160938935|ref|ZP_02086286.1| hypothetical protein CLOBOL_03829 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03829 [Clostridium bolteae ATCC BAA-613] # 1 295 1 295 295 565 100.0 1e-159 MRVNMLEETIIEEVTKIIKVAETKIMKEMSIQYLADSLCYSEQQVRLVFKIAAGQTIGQY LKRRYLTQIYLTIGSDAYNKLKRTAAVMGCRRYKQKMEHFFGKEIDDDKLQRPLTTEQMR INLFDLENSPYVDRWLKEKICGKRQRVDIGNDVTRLLLLDENNRMPYDYDHCYFRHKGRL FIIDSRLAVDMPYNNSLWIMGMEFCIYKTEQAENRYVFQIIKKFMENGTVKNMKKEDSGI QVQELSYFYAHKNKSFRIVNIIDGSDMVGEALTTLMIKEEYIILESGKLFFLLDK >gi|157101634|gb|DS480690.1| GENE 130 121549 - 121899 425 116 aa, chain - ## 
HITS:1 COG:BH1328 KEGG:ns NR:ns ## COG: BH1328 COG0799 # Protein_GI_number: 15613891 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Bacillus halodurans # 1 109 1 109 117 100 38.0 4e-22 MNSKEMVKLAYQALSDKKGEDIKIIDIQSVSVLADYFIIADGSNPNQVQAMADNVEEILE KAGCECKQIEGYQNANWILMDYKDVIVHIFCREDRLFYDLERIWRDGKTFVDVNEL >gi|157101634|gb|DS480690.1| GENE 131 121850 - 122488 614 212 aa, chain - ## HITS:1 COG:CAC1263 KEGG:ns NR:ns ## COG: CAC1263 COG1713 # Protein_GI_number: 15894545 # Func_class: H Coenzyme transport and metabolism # Function: Predicted HD superfamily hydrolase involved in NAD metabolism # Organism: Clostridium acetobutylicum # 15 194 11 189 189 116 38.0 4e-26 MENQCSQIMLYRKSLKRKLAPMRYEHSLSVSYTCMNLAMRYGCSLDKAELAGLMHDCGKR YSDDIILKKCLKHGLEVTDAEYKALPVLHAKYGAWLAEHKYQITDREILDAIGCHTTGRP EMTTLDKILYIADYIEPRRYKADNLPMIRRMAYEDLDETMYAILAGTLDYLEKKGGSIDP MTAQAYEYFKKVKGERDELKGNGETGISGSQR >gi|157101634|gb|DS480690.1| GENE 132 122493 - 123119 550 208 aa, chain - ## HITS:1 COG:CAC1262 KEGG:ns NR:ns ## COG: CAC1262 COG1057 # Protein_GI_number: 15894544 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Clostridium acetobutylicum # 2 194 8 199 200 134 35.0 9e-32 MGGTFDPIHNGHLRLGREAYEQFGLDAVWFMPTGNPPHKKDHKITEGEMRERMVKLAIAD TPYFLYSDFELRRKGNTYTAQTLSLLREEYREDVFYFIIGADSLYQIEQWFHPELVMKLA VLLVAGRAYHDDHQPFDRQIEYLTARYGAKIYPIRCREMDVSSEEIRASVSDGHSIHGFV PDAVEEYIKDHGLYGGGVKTPVLAGKGE >gi|157101634|gb|DS480690.1| GENE 133 123180 - 123479 201 99 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|55821596|ref|YP_140038.1| hypothetical protein stu1620 [Streptococcus thermophilus LMG 18311] # 1 94 3 96 105 82 43 3e-14 MTSKQRAYLKGLAMNIDPVFQIGKSSLTPEITQAVDEALEARELIKVTVLKNCLDDGGSI ADVLAERTHSQVVQVIGRKIVLYRPAKEEKKRKIVLPEK >gi|157101634|gb|DS480690.1| GENE 134 123491 - 124780 1431 429 aa, chain - ## HITS:1 COG:CAC1260 KEGG:ns NR:ns ## COG: CAC1260 COG0536 # Protein_GI_number: 15894542 # 
Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 1 429 1 424 424 414 52.0 1e-115 MFADSAKIFIKSGKGGDGHVSFRRELYVPNGGPDGGDGGRGGDVIFQVDKGKNTLVDFRH VRKYIAKDGQEGGKKRCHGADADDLIVKVPEGTVLKDFETGKVIADMSGDNQREVILRGG RGGLGNMHFATSTMQVPKYAQPGQPGAELWVQLELKVIADVGLVGFPNVGKSTLLSVVSN AKPEIANYHFTTLNPHLGVVDLGDGAGFVMADIPGLIEGASEGIGLGHAFLKHIERTKVL VHVVDGASVEGRDPLEDIRTINRELEAYNPELLKRPQVIAANKMDAVYAEEDTEIILDEL RNEFEPKGIRVFPISAVSRQGVKELLYHINDLLKTVDDAPVVFEKEFEVQYQGDRNLPYT VTRADDGAYVVEGPRIDKMLGYTNLDSEKGFDFFQKFLKNTGVLDDLEKAGIEEGDTVRM YGLEFDYYK >gi|157101634|gb|DS480690.1| GENE 135 124842 - 125132 484 96 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623849|ref|ZP_04666880.1| ribosomal protein L27 [Clostridiales bacterium 1_7_47_FAA] # 1 96 1 96 96 191 96 5e-47 MLKMNLQLFAHKKGVGSTKNGRDSEAKRLGAKRADGQFVLAGNILYRQRGTHIHPGNNVG RGGDDTLFALVDGVVKFERKGRDRKQVSVYPRAINE >gi|157101634|gb|DS480690.1| GENE 136 125136 - 125474 252 112 aa, chain - ## HITS:1 COG:BS_ysxB KEGG:ns NR:ns ## COG: BS_ysxB COG2868 # Protein_GI_number: 16079847 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted ribosomal protein # Organism: Bacillus subtilis # 1 100 1 101 112 57 37.0 6e-09 MIKATVFKNSTDSGNAVYTGIVMEGHAGYADEGEDIICAAVSALALNFFNSVEAFTEDGF EGGAGESGSFEFRFTSDISPESKLLMNSLILGLQNIERDYGKSYINVKFKEV >gi|157101634|gb|DS480690.1| GENE 137 125486 - 125794 416 102 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160880683|ref|YP_001559651.1| ribosomal protein L21 [Clostridium phytofermentans ISDg] # 1 102 1 102 102 164 77 4e-39 MYAIIATGGKQYKVAEGDVIKVERLGAGAGETVTFDQVLVVNNGELQIGCPTVSGATVTA TVEKEGKAKKVIVYKYKRKTGYHKKNGHRQLYTQVKIEKINA >gi|157101634|gb|DS480690.1| GENE 138 125979 - 126587 539 202 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938946|ref|ZP_02086297.1| ## NR: gi|160938946|ref|ZP_02086297.1| hypothetical protein CLOBOL_03840 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03840 [Clostridium bolteae ATCC BAA-613] # 1 202 12 
213 213 374 99.0 1e-102 MVRCTKGLKGIKGAGALILAFSLTFAYPAVSWAVETEASEGTVTYGPGSYVVGEDILPGE YAVFTEDTSEQNAYSTCTITLYKDDTDERRIGTFQFQHHGLITLYKGQHLVITKGYAVEA DRAGITLGTTGMYKAGRDMEPGTYRITPLTFGSGYYALYNDVRYYYDYIDEYAALFEPVT VNIAEGQYLELFDAGSIERVSD >gi|157101634|gb|DS480690.1| GENE 139 126721 - 128136 1409 471 aa, chain - ## HITS:1 COG:FN0993 KEGG:ns NR:ns ## COG: FN0993 COG0168 # Protein_GI_number: 19704328 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 5 470 13 481 483 385 46.0 1e-107 MLGWIMNMEAIFMTLPVVTAVVYRESIGFIYLGVAAVCGLLGFMCTRKKPCTKMFFAREG FVTVSLGWIVLSFFGCMPFVISGEIPHIIDAMFEIVSGFTTTGSSIIPKVEDMSHATLMW RSFSHWIGGMGILVFILAILPMAGDYNMHIMRAESPGPSVGKLVPKIRITAKLLYSIYFC MTIVMMILLLLGKMPLFDSICMSFGAAGTGGFACRNSGQADYTVYQQAVITIFMLLFGVN FNVYYLLLIRRPKDAGRCEELRGYLAVVAIAILLITINIRSLFPSLFMAFHQAAFQVSSI ITTTGYSTIDYNSWPEFSKTILLLIMFIGACAGSTGGGMKVSRVMIAFKEIKKEMASVIH PRSVKVLKFEGKPLEHNTLRSLNAYIIVYFMIFGISTLIVSLDNYDFMTSFSAVAANLNN IGPGMSVVGPASNYSMMSYLSKTVLIFDMLAGRLELFPMLVLLSPGTWKRS >gi|157101634|gb|DS480690.1| GENE 140 128177 - 129532 1654 451 aa, chain - ## HITS:1 COG:FN0242 KEGG:ns NR:ns ## COG: FN0242 COG0569 # Protein_GI_number: 19703587 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Fusobacterium nucleatum # 1 450 1 450 452 316 39.0 5e-86 MKIIIVGCGKVGTTLAEQLNRENHDITLIDCDSEALQSISDSTDVMSVTGNGAVYQVQME AGIKEADLLIATTNSDELNMLCCLIAKKAGNCHTIARIRNPEYSAEINYIREELNLSLAI NPELAAAREIARLLRFPNAIKIELFAKGRIELLKFMIPKDSILDRMKVMDVVSRLKSNVL ICAVERGDDVVIPDGNFEMRGGDKISFIAPHADCADFFRKAGIENNTVNSAMFVGGGKLT VYLAKALADTKIKIKIIEQDEERCRILSELLPHAMIIHGDGSDQKLLLEEGIRQTEAFAS LTGFDEENILLSLYAASQSRAKLITKVNKIAFENVINALNLGSVIYPKMLTADIILQYVR AMQNSMGSNIETLYKIVADKAEALEFRVRGDSPVLGIPLEKLRTRNNLLVACINRNGRII MPRGKDTLEAGDTVIIVTTHTGLNDLKDILM >gi|157101634|gb|DS480690.1| GENE 141 129568 - 130725 1010 385 aa, chain - ## HITS:1 COG:ML1468 KEGG:ns NR:ns ## COG: ML1468 COG1530 # 
Protein_GI_number: 15827770 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Mycobacterium leprae # 16 383 336 725 924 219 36.0 7e-57 MDKLVVTRRGDKVCTAVVSDGKVSQLMLEPDTADSLLGNIYIGKVQKVVSNINAAFIDIG PGCTGYYSLQERWKPERLKTGDELVVQVSKDAVKSKAPVVTERLSFTGRYCVLTVGKTGI GFSAKIRDASYKARLRELLECDLAGTKDLGIIVRTNAVTVEECAVREELAELMRTWQRLS AEAGCRVCYSCLYRALPGYIAAIRDSFGGTLEEIITDVPEYHRELKAYLEMCQKEDAGRL TLYEDSLLPLGKLYSLETAFEKALGKNVWLKSGGYLVIEPTEAMTVIDVNTGKYSGRKKM QDTIYKINMEAADEIGRQIRLRNLSGIIIVDFIDMEREEDRKALLAHLVEVVSRDPVKTT VVDMTALNLVELTRKKVRKPLHEQV >gi|157101634|gb|DS480690.1| GENE 142 130725 - 131480 734 251 aa, chain - ## HITS:1 COG:alr4330_2 KEGG:ns NR:ns ## COG: alr4330_2 COG5011 # Protein_GI_number: 17231822 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 2 200 1 199 225 80 24.0 3e-15 MKIRIKFRKYGTMKFIGHLDVMRYFQKAIRRADVDVCYSGGFSPHQIMSFAAPLGVGNTS NGEYVDIEVSSTKDSETMKKCLNGVMSEGFEVVSYRGLPDDAANAMSIVGAGDYTLAFRP GYEPEDMDKETWFQGLIAFFERPSVMVTKKTKRGDKEMDLKPLVYELRICPEGDGEMEGP RLFMKISAGSAANVKPEQVLDAYYGFLGKERPPFAFMVQREEVYADRASDAEKEAGIHSF ISLEQLGEEIR >gi|157101634|gb|DS480690.1| GENE 143 131486 - 133354 1845 622 aa, chain - ## HITS:1 COG:CAC1254 KEGG:ns NR:ns ## COG: CAC1254 COG1032 # Protein_GI_number: 15894536 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 8 618 6 614 622 693 52.0 0 MRKLALNDEILLSIQQPARYIGGEVNMVIKDPEKVDVRFAMCFPDVYEIGMSHLGIQILY SMFNSREDICCERVYSPWTDLDPILREKKIPLFTLETQEPVKNFDFLGITLQYEMCYTNV LQILDLSQIPLHTCDRTENDPIVIGGGPCAYNPEPLADFFDLFYIGEGETVYFELMDRYK ENKKAGGTRKQFLEMAAEIPGIYVPAFYDVTYKEDGTIECFLPNNEHAMPVVTKQVVTDM DSVGYIEKPLVPFIKVTQDRVVLEIQRGCIRGCRFCQAGDVYRPLREHSLEYLKHYASDM LKSTGHEEISLSSLSSSDYSQLEGLVNFLIDEFKGKGVNISLPSLRIDAFSLDVMSKVQD VKKSSLTFAPEAGSQRLRDVINKGLTEEVILKGATDAFESGWNRVKLYFMLGLPTETVED MEGIALLSEKIAEAYYDIPKDRRNGKVQIVASSSFFVPKPFTPFQWARMCTKEEFIERAN 
IVRGKFREMKNFKSLKYNWHEAELTVLEGVLARGDRRVGAVIEEAYRTGAIYDSWSEYFK NDVWMKAFETCGVDIGFYTTRERSLDEVFPWDFIDAGVSKEFLIREWNHAVKEEVTPNCR QRCSGCGARDFGCGVCYETVKE >gi|157101634|gb|DS480690.1| GENE 144 133836 - 134471 554 211 aa, chain - ## HITS:1 COG:MA1701 KEGG:ns NR:ns ## COG: MA1701 COG3153 # Protein_GI_number: 20090553 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 2 199 17 214 217 142 38.0 3e-34 MITIRNEEERDYKVVEEITRKAFYNLYIPGCAEHYLVHIMRGHEDFIPELDFVIELDGQV IGNIMYTKARLVDEAGTEKEILTFGPVSIAPEYQRRGYGRMLIEHSFEQAVLLGYDVVVI LGSPMNYVGCGFKSSRKLNICMENGKYPAAMMVKELVPNQLDGRKWFYYDSPVMAISEEE AQEYDNTLEKMEKKYQPSQEEFYIMSHAFIE >gi|157101634|gb|DS480690.1| GENE 145 135036 - 135530 304 164 aa, chain - ## HITS:1 COG:no KEGG:Clole_4115 NR:ns ## KEGG: Clole_4115 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 6 154 5 154 154 104 37.0 1e-21 MAKTSFNEFLLAVAPEHRVFVEKLNDKLLKQGCELVIKEVKSGYTATYQLEKKTVMNWVF RKTGIWARIYGDNAGRYEEVIAALPAHMQKKMAASRDCKRLIDPDACSDTCVKGFVYSLN GETQKKCRNDGMLFLLTEETAEYIAGLICAEAAARKPALQQLNR >gi|157101634|gb|DS480690.1| GENE 146 135677 - 135751 84 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEHFKFDLYARACDAYFIYGVILG >gi|157101634|gb|DS480690.1| GENE 147 135827 - 136249 230 140 aa, chain - ## HITS:1 COG:no KEGG:Clole_0240 NR:ns ## KEGG: Clole_0240 # Name: not_defined # Def: MerR family transcriptional regulator # Organism: C.lentocellum # Pathway: not_defined # 4 137 190 333 334 144 51.0 8e-34 MGPEATGMIEKFVYETELLKLKPDARGMGFDCSRADLRVEAGAAPTAYEAWVSIPEDMEV KPPLVKKTFGGGMYAAHVLRDWSFQDWSLLQEWARESERYEEAGGPCFEEILNYYNLMNN GAKMEDTQIDLLLPVKEMIK >gi|157101634|gb|DS480690.1| GENE 148 136486 - 137736 910 416 aa, chain - ## HITS:1 COG:BS_yjeA_2 KEGG:ns NR:ns ## COG: BS_yjeA_2 COG0726 # Protein_GI_number: 16078275 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Bacillus subtilis # 206 400 17 211 217 
165 41.0 2e-40 MSKDTKKNIKGTDNRKNTNSKIPKNLKTQDRQNAMKRISKTRQAKMREEEVKSTLKFGVK LVLVLAVVIGCAQLFFYGKNVMADRSNGPAAWIAAESSQTKDNPSDSSVLSKTGEPEKEE QEETAGPETKEGITGGLASKEIQTGEPPSGIVLEPFLPDSEKGQQQTPETIAPFAVSSDG PGNVVQAAPPLGEASLGPAGGGNELFPVSGPADVSKPMIAFTFDDGPYYSVDSRILDTLQ AYGGRATFFIVGSRVNDYKDTLKRIRDSGSEIGNHTFNHKNLEKISPEEVTSQIEMTNDA VEAVTGFRPKLVRVPYGAFKGQVPGLVSYPMIQWNIDTQDWSSKDKDAIAASVLSQARDG SIILMHDLYSATAEAFETVIPLLAAQGYQFVTVSEMYAAKGVPLEAGQVYFNIPKG >gi|157101634|gb|DS480690.1| GENE 149 138147 - 138572 299 141 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160938958|ref|ZP_02086309.1| ## NR: gi|160938958|ref|ZP_02086309.1| hypothetical protein CLOBOL_03852 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03852 [Clostridium bolteae ATCC BAA-613] # 1 141 10 150 150 255 100.0 7e-67 MNLKGMTQATLSKETHIPASTLKRIINGENQKDKMVHMESIAKALGTSVHSIFDIPLETP LAAALRPDEAGASVTKLLTSDESLLIYWFQNTYREGQAAILKRAKEEYNKTQNEILKKVT SLDKPKDNPGDFDQLSLELRT >gi|157101634|gb|DS480690.1| GENE 150 138667 - 139437 755 256 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938959|ref|ZP_02086310.1| ## NR: gi|160938959|ref|ZP_02086310.1| hypothetical protein CLOBOL_03853 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03853 [Clostridium bolteae ATCC BAA-613] # 1 256 1 256 256 481 100.0 1e-134 MKPNEAAEKQEIMYRLKNAGELYVLLSMCTGEPYVVCDQETYDDEIIVFFDSQAAVEEAK EQTEAGTPVRPMKLENKQFLIFYTSLYTLGVNALLVKDGGRNSLIQLQDFVKRNKQQGQE TGEKVWVENPSLHLTMLYYMQELRKKPGQENLPEIKEWQDEISNGFSKGSFIVPVEKEGK GLAAVKVNEQLFQAIFTDILEFQKFNREGKLRPLVVTADKIPQIMTEEAKGVILNPMGVR MPLQIKRAPSQPKKDA >gi|157101634|gb|DS480690.1| GENE 151 139424 - 141157 1517 577 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 134 543 341 753 776 201 31.0 3e-51 MFTQRKILVVEDNEINRMMLSAILSSEYEVLEAENGQAALAVLRKYRDGIALILLDIIMP 
VMDGYTFLARMQEDPALASIPVIVTTQSDDESDEVTALSHGATDFVAKPYKPQVILHRVA NIIKLRETAAMINLIQYDRLTGLYSKEFFYQRVKETLAAHPEQEYDIIYSNIENFKLVND IFGVPSGDRLLVNVAAMFSKQVGRWGICGRINADQFACLSERRLDYSEQLFAEAGNQINA MSNVKNMVMKWGVYHIDDRSISVEQMCDRALLAVRSIKGQYRRHFAAYDDALRSKLLREQ AITDSMEDALAQGQFIVYLQPKYRLKDRELSGAEALVRWIHPEWGFQSPGEFIPLFERNG FITQLDQYVWDRTCAFLRDWDSKGYPQIPVSVNVSRADIYNADLANILLRTVHKYNLPPS RLHLEITESAYTENPGQIIDTVTHLRELGFIIEMDDFGSGYSSLNMLNQMPLDILKLDMK FIQSETAKPVNQGILRFIMSLARWMNLSVVAEGVETDEQLERLREIGCDYVQGYFFGKPM PEDVFEELMKKTDREKTGGREPANYNNIKEKQTNETE >gi|157101634|gb|DS480690.1| GENE 152 141330 - 141842 305 170 aa, chain + ## HITS:1 COG:no KEGG:BDI_0509 NR:ns ## KEGG: BDI_0509 # Name: not_defined # Def: two-component hybrid sensor kinase/response regulator # Organism: P.distasonis # Pathway: not_defined # 1 165 1 165 793 111 33.0 1e-23 MKSTGLSLNDYIDATDGIIDTIWELRLKEEEVYVWRDRTLPSTTGQTLDLEQTLRELSEH NVYGPDQAIWNEFFCTESLRSFFASGRKHKRIEIRFIGAPYGFEWHEIYLSTCPHSGCSS DRLLLFARRIESEMRSHLVEMAAQDDYDYVTYIDANTNHLLFVNSFATSF >gi|157101634|gb|DS480690.1| GENE 153 141849 - 142895 380 348 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148987750|ref|ZP_01819213.1| ribose-phosphate pyrophosphokinase [Streptococcus pneumoniae SP6-BS73] # 4 337 8 312 317 150 33 6e-35 MGRKYRQLSQNDRISMETLLNKGHSVQEVADYLHVHRSTIYREMKRGEYVHRNSDYTEEV RYSSDKGQQTHDWNAQGKGRNIKIGNDIKLAEYIENKIVENKYSPEAALAAVATSGIEFS TTISVRTLYRYIDNGIFLKLTNKHLPVKGKKKKKNKKVQVQKRAAAGESIENRPDEVATR ETFGHWEMDTVKGKQGVTKSCMLVLTERKTRDEIIFKLKDQKAESVVDAMDRLERKWGDM FSKVFRSITVDNGVEFSDCKGMERSALTPGEKRTYLFYCHPYSSWERGTNENTNKLIRRH IPKGEDFDEKQDRDIEFIENWINTYPRGIFGFKTSEELFKEELEKITA >gi|157101634|gb|DS480690.1| GENE 154 142978 - 143511 412 177 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938963|ref|ZP_02086314.1| ## NR: gi|160938963|ref|ZP_02086314.1| hypothetical protein CLOBOL_03857 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03857 [Clostridium bolteae ATCC BAA-613] # 1 177 1 177 177 317 100.0 3e-85 MGLFSNLFSKKNTAAVQPQPVQMPEEKKPRYIVKSQRFILDNIKDHMEDIMDLVEKNEDY 
KLKKKDLIEDDRTDENIYEYELNEKATITPTSCEGGGVEQLQVFVCNTHIGDIKKGGVSK VKNLLKKGNIENIWSEVSGGNYKRLRYDAGKDVYYYDELEKEFSITIEITYKEEITE >gi|157101634|gb|DS480690.1| GENE 155 143537 - 143752 296 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291524993|emb|CBK90580.1| ## NR: gi|291524993|emb|CBK90580.1| hypothetical protein EUR_14900 [Eubacterium rectale DSM 17629] # 1 71 1 71 71 77 97.0 3e-13 MPKIKKEFDQTKYQNEYKKKTYDRMELLVPKGEKAVIKEKAAAAGTSVNEFVYSAVKEKM EAMEAATETEE >gi|157101634|gb|DS480690.1| GENE 156 143852 - 143992 214 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDEDMNVGELLKETAEENQTRKILEILNECKDLAEAKERVKALLNK >gi|157101634|gb|DS480690.1| GENE 157 144441 - 144611 133 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938966|ref|ZP_02086317.1| ## NR: gi|160938966|ref|ZP_02086317.1| hypothetical protein CLOBOL_03860 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] hypothetical protein CLOBOL_03860 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] # 1 56 15 70 70 64 100.0 3e-09 MIMEDVFLNPGNFKEAHKNFVEKMEEYEKKENLTEEEAQYNIEMAVEALKKYRAEQ >gi|157101634|gb|DS480690.1| GENE 158 144694 - 144855 137 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938967|ref|ZP_02086318.1| ## NR: gi|160938967|ref|ZP_02086318.1| hypothetical protein CLOBOL_03861 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] hypothetical protein CLOBOL_03861 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] # 1 53 1 53 53 92 100.0 7e-18 MWEVVKTVKGYDITRMIGSRGAYHVSVREGKGFREFHTFKTIKAAVEFIETAL >gi|157101634|gb|DS480690.1| GENE 159 145062 - 146006 1008 314 aa, chain - ## HITS:1 COG:lin2738_1 KEGG:ns NR:ns ## COG: lin2738_1 COG1705 # Protein_GI_number: 16801799 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase (flagellum-specific) # Organism: Listeria innocua 
# 9 153 44 179 180 99 44.0 9e-21 MATKDQVNAFIAKLAAIARKEYLTRDKWVLPSVCIAQAALETGWGTSGLMTKANAFFGIK AGSSWKGKVYSSKTNECYDGKTYTQITAAFRAYDSLEESVADYYNLICGSSRYAGAVNNG NAESAITAIKNGGYATSPTYIKNVMNIINSYNLTQYDTWDGENQGPANKETHGYKVGDKV RVIDNITYNGVRFATYYDEYDVIQVNGDRVVIGIGNTVTAAVNAANIAKDEEIAEAPADA ETPTDIPTQEETENAHGFKVGQKVKVINAYDYYGNMFKCWYSKYDVIEVKNDRIVIGIGS TVTAAVNAANLAAA >gi|157101634|gb|DS480690.1| GENE 160 146010 - 146438 478 142 aa, chain - ## HITS:1 COG:lin0175 KEGG:ns NR:ns ## COG: lin0175 COG4824 # Protein_GI_number: 16799252 # Func_class: R General function prediction only # Function: Phage-related holin (Lysis protein) # Organism: Listeria innocua # 22 141 19 138 140 75 36.0 2e-14 MEKMFNFISVIGGLVGGFIVSLFGGWDVMLYTILLFAILDYFTGILKAVYKKELSSAIGF KGIVKKIMVFVVIAVAYNVQRMTGDTIPLREIVIVFFICNEALSILENAAEFINIPQQLK DVLLQLRDKNAAKAEEKEESEE >gi|157101634|gb|DS480690.1| GENE 161 146509 - 147549 473 346 aa, chain - ## HITS:1 COG:alr3497 KEGG:ns NR:ns ## COG: alr3497 COG3344 # Protein_GI_number: 17230989 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Nostoc sp. 
PCC 7120 # 1 336 1 349 352 191 36.0 2e-48 MKTVKGLHEKMGTFENANTSFHQAAKCKRYTDEVLAFSMVKEEELLRATEEIQNLTYRQG EYKIFKVFEPKERLIMALPFYDRVVQHMICNAIQPVFENGFYYHSYACRTGKGMHAASDT LYQWMYETEVKQGLRMYAFKGDISKYFASIPHDKLKDENRRYIGDKKALMLMDDIIDHNG ILPDGVGIPVGNLTSQLFANVYGNKLDKFCKHVLHIPYFVRYMDDFIILSDDLEQLKEWV KRIEEFLENEMLLHINPKSTILYAGNGIDFCGYIHYADHKKVRKSSIRKLKQDVKAYELG ELPPEEFNRKYESRKGHLGHADTYHIAKAVEYELLFYEWERLEAAA >gi|157101634|gb|DS480690.1| GENE 162 147907 - 148296 218 129 aa, chain - ## HITS:1 COG:no KEGG:Dtox_4257 NR:ns ## KEGG: Dtox_4257 # Name: not_defined # Def: hypothetical protein # Organism: D.acetoxidans # Pathway: not_defined # 14 124 1 112 113 73 37.0 3e-12 MAENKKSNQADAYMESMKAYQKTYDFLLYLYPILSQFPKFEKFALQSQIKTAVFEMLKSV IRFRKTGTKSHIYNADVELQFIKTLIRLSYDLEYPAMSKHRYEVLSKKMTELGCIIGGII EAVKNGNWK >gi|157101634|gb|DS480690.1| GENE 163 148356 - 149450 451 364 aa, chain - ## HITS:1 COG:no KEGG:Tgr7_1617 NR:ns ## KEGG: Tgr7_1617 # Name: not_defined # Def: hypothetical protein # Organism: Thioalkalivibrio_HL-EbGR7 # Pathway: not_defined # 68 364 58 351 351 186 39.0 2e-45 MGRLFVYDENMTDERAKITVAKMAAVSDIVASEKAFIQYSAAGQLTVLAGAVIAVGDAIF QTEETTLSAANLDGASSFAHGKDYYIYLCNNGKDSSNEVYLISENSTFPDGVEWDDTNTR KIGGFHYGFVRNVDEYGREVNTSGSVRGSGWESNVHEDIAPNSVWTALHRPKCDPSGMAY LGNGLWADIYLASDDGANGLQSVYNATPITGTEGLNWYIANEKAARVGKRLPDLAEWLIA AEGSPQGLDGSNTNGWTATTNTARTAVGKIKNAISVKNIMDIAGNVWEWINELCLDPTAA SWNWYNVMSGYGQIYMPSQTALHALVGGGHWGSGVHCGSRAVGCSDYPWYVTASIGVRCV CDSL >gi|157101634|gb|DS480690.1| GENE 164 149450 - 149944 582 164 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938973|ref|ZP_02086324.1| ## NR: gi|160938973|ref|ZP_02086324.1| hypothetical protein CLOBOL_03867 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03867 [Clostridium bolteae ATCC BAA-613] # 1 164 1 164 164 230 100.0 2e-59 MVGFPKVIKTKADLVNTYKLVKKGRLKKKDWLAAVEKLENQNFIFCPILEKTEDRKGVTL MFVNEVTEGDKVKAGNATATVKTVEHIEVEKAAEADQEAAAVEGTAAADAETTTNNNQTK HTILTLSKAIAADAATIGIPAEVTFYDRLGITEEEVEQMKGELA >gi|157101634|gb|DS480690.1| GENE 165 149966 - 151006 
717 346 aa, chain - ## HITS:1 COG:no KEGG:CLM_2920 NR:ns ## KEGG: CLM_2920 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A2 # Pathway: not_defined # 4 247 3 245 422 123 33.0 1e-26 MAVRGFFYNSVNKDRLYNGQDMNEDKAPFYKEGVAYGHLQVTADGESMAVKVDGGNRTGY AYINLHTIHNTTVLELPVSGSNGTLPRIDRVILRNDETERKPSIFILEGAYSSNPQPPEL TNNDTIQEKCLAEIYVAAGAVAISQSDITDTRADTGLCGFIASQFEDFDFSQFTKQFNAW FALEKKSMEQDHANFIEEYAEMTQAFMTDQEAQWNKWFKEKQTELSGDIAGKLQLQIDDV KEKVYNIAFKVYIISTLEQITSPVTVKLTNKTTGTMQTVTVTKSQMGFYITEAGEYTVEA DLESVMVTPKAFKVTNDNLMTTQNITLREGSNLGYIGNYIGSFIAQ >gi|157101634|gb|DS480690.1| GENE 166 151026 - 152072 637 348 aa, chain - ## HITS:1 COG:no KEGG:Ccel_0826 NR:ns ## KEGG: Ccel_0826 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 341 1 335 344 239 38.0 2e-61 MEIIVYDRNLYRLGTIENHTSLQWHRKYYECGTFELHAPATEDNIRLLQPGNVIRPKGKD EAAVIRGDQTEEESTLVNEIVRNGYFLPIYFNDRLTGPMFTFNGTCEDAMRFMINRMAAV PLLEVAPGIGDATKITFQATYKNVLTYLSKIARYCELGFRVVPDFKGKKMTFETYKGIDR TTKQGTKPRVIFSESYNNLNRAKHTYSDETAKTKIVVGGAGDGADRIYVTVGGGTGFDLR EEFLDAKDINKDDFSTNAEYLEALRIRGEQYRAENAVIENIEAEVEAEVNFIYGTDYDLG DIVTVEKAKWNKVLNLRITELCEVYEYGGMYVVPTFGDALPTTINWDN >gi|157101634|gb|DS480690.1| GENE 167 152077 - 153030 485 317 aa, chain - ## HITS:1 COG:no KEGG:CLB_2961 NR:ns ## KEGG: CLB_2961 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 11 317 9 289 289 115 28.0 2e-24 MADITITCTNDKNVSITFRWDWDKSPFHLLDVEGIYGYSCKVTTSENTTTDGSTYQGSTA EERNIVITTEIDGDYKKNRELLYRVFQKGRTGTLEYSEDGDVKTINYKVENVLPGAITGV VRDYTISLKCPDPYFKDLSDVEVVMASWVSDWYFENGFDINGVEFGHREAELVKEIENDN GADNIGITAIFKADGIVKNPAIYHSESGKYTKVGYTGNDFELQSGQYVVIFTHTGKKNMY LLDGVSQAEIEEHKDRYGMIDWDTVVSIYGTIINQYYDEDGEYIQLQDGTNTITYNAESG INYLSVSVYYRISYLGV >gi|157101634|gb|DS480690.1| GENE 168 153034 - 156027 1758 997 aa, chain - ## HITS:1 COG:L60836_1 KEGG:ns NR:ns ## COG: L60836_1 COG5280 # Protein_GI_number: 15673031 # Func_class: S Function unknown # Function: Phage-related 
minor tail protein # Organism: Lactococcus lactis # 1 500 1 516 612 261 35.0 6e-69 MANNIKGITIEIGGDTTKLQNALKGVNGDLKSTKNELKEVEKGLKLDPKNTELLAQKQQL LTKAVGETKDKLDVLKTAEAQVEQQFKNGEASEEQYRAIKREVIATEAELKNLEEQAKAS NSTLAKVGDAFGTVGDKATKAGEKMMPVTAGITALGAAGVAASMELDNGYDTIITKTGAT GEALESLTTVADNVFSDMPTTMDDVGVAVGEVNTRFGATGTELENLSKEFIKFANINGTD LNTAIDSVDSIMTKFGVDASQTKNVLGLMTKAGQDTGISMETLETALTTNGASLKEMGLD LTSSVNLLAQMEASGVDVSTALAGMKKAVQNATADGKSADEALTETIDSIKNAKTETEAL TIASDLFGKKGAAEMTQAIREGRLSVDDLSGALSDYGDVVSDTFETTLDPWDDAAVAMNN LKLAGADLGSSILTTLQPTIDKVVNKVKEFTTWFKNLDDNTKQMIVKIGMIVAAIGPALI IFGKMSTGISGVIKTVTGLTSKIGGMSGVLSALTGPVGIVIAIIAALAAGFIALYKTNDE FKEKVDGTIGKVKDAFSQMWTTIQPLLESLKQAFINLMAALKPVFEFVLTYIASIVNGVI NAAAPIIAAIQNVIDFVTNIINAIIALLHGDFDGFFSYLQGAFNSAISFVKNIIQAVINF IIGFLEGFGVNVKTLFTNIWNSIVAVFQGVGQWFSDRFTEAWNAITTIFSAIGSWFAARW NDIKTALATVATWFLTMFTNAYTNVTNVFATIGSWFAARWNDIKTALAAVATWFLTMFTN AYTNVTNVFAAIGSWFGARWTEIKTALASVPTWFKTQFDNAWTNIKNAFANVTSFFSDLW EKIKGCFVNVGTAIGSAVSDAFKSAINSCLSTIEGVVNKFIGMINGVIGIINEIPGVSLS KIDTLSLPRLAKGGVLREGTAMVAEAGPELLSMVNGKAVVTPLTGSAVNTAADNLKGNNG GFHQEINITSPKALSPYEVARQTRIQTRAMAIAMQRG >gi|157101634|gb|DS480690.1| GENE 169 156020 - 156307 82 95 aa, chain - ## HITS:1 COG:no KEGG:CLL_A2250 NR:ns ## KEGG: CLL_A2250 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 1 88 1 86 87 74 45.0 1e-12 MPYYPRQDKKGEIPYTILTRPEKLVMDYCHINIYEVQEMEIDIYLFFLREAMIFENSQTE EGRKYLKDCFRMEQTKPDREGLRERFGKKGGKSSG >gi|157101634|gb|DS480690.1| GENE 170 156310 - 156660 450 116 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938979|ref|ZP_02086330.1| ## NR: gi|160938979|ref|ZP_02086330.1| hypothetical protein CLOBOL_03873 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] hypothetical protein CLOBOL_03873 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] hypothetical protein EUR_15050 [Eubacterium rectale DSM 17629] # 1 116 1 116 116 198 100.0 1e-49 
MAVKEFNMNKIKRTFWPFTLKDKKDENGNVVEKGKKIIVRMPQKGVFEAIKDVEANGTGE DADTSTIYNLVAAVLNNNMGNVKVSAEEVESYDIEECTAILNAYMEFVDELKANPN >gi|157101634|gb|DS480690.1| GENE 171 156742 - 157296 508 184 aa, chain - ## HITS:1 COG:no KEGG:CLL_A2252 NR:ns ## KEGG: CLL_A2252 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 3 180 6 182 193 217 65.0 2e-55 MDKESIVLGSGDLYCTDFQGTNEAIPDDAAIETEDNRLGHIKGGAEIEYAPEFYEAKDDM GKVSKVIITEEEATLKSGIMTWCGTTLEKLCQTARVTEDKAKGIRTVKIGGIGNATGKKY LLRFVHKDTKDGNIRVTIVGNNQAGFTIAFAKDSETVIDAEFKAQPMDKEGTLILYTEDI DKTE >gi|157101634|gb|DS480690.1| GENE 172 157299 - 157490 141 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938981|ref|ZP_02086332.1| ## NR: gi|160938981|ref|ZP_02086332.1| hypothetical protein CLOBOL_03875 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] hypothetical protein CLOBOL_03875 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] hypothetical protein EUR_15070 [Eubacterium rectale DSM 17629] # 1 63 57 119 119 119 100.0 6e-26 MADDWQLELYTVADDEAAEEIRTRIENEVLHDVDYVKFVAYVDSEECFQTAYEVTGLLRK ARK >gi|157101634|gb|DS480690.1| GENE 173 157661 - 158053 173 130 aa, chain - ## HITS:1 COG:no KEGG:L58927 NR:ns ## KEGG: L58927 # Name: pi237 # Def: prophage pi2 protein 37 # Organism: L.lactis # Pathway: not_defined # 2 128 4 121 123 64 35.0 1e-09 MKVSLDSLDEEIKKELENFNAEVINAANDSFQETAKEAAEMLKKGGPYQERTGAYTKDWA VDKRGSRTSVVTGLNGYSVYNKKHYQLTHLLENGHQSRKGGRVKAFSHIAPVNEQLGEMV TGKIESKLRG >gi|157101634|gb|DS480690.1| GENE 174 158050 - 158307 102 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938983|ref|ZP_02086334.1| ## NR: gi|160938983|ref|ZP_02086334.1| hypothetical protein CLOBOL_03877 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] hypothetical protein CLOBOL_03877 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 
1_7_47FAA] # 1 85 1 85 85 149 100.0 5e-35 MIKKNQTEYQETEVIASINPVGRDEFAAAGQLGYKATSQLEVWDFEYDGQTEVSIDGKRY AVYRTYGPKSNGKTELYIAERVGKG >gi|157101634|gb|DS480690.1| GENE 175 158300 - 158611 214 103 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938984|ref|ZP_02086335.1| ## NR: gi|160938984|ref|ZP_02086335.1| hypothetical protein CLOBOL_03878 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03878 [Clostridium bolteae ATCC BAA-613] # 1 103 1 103 103 199 100.0 6e-50 MTKEQLIAKAKLRIRKTSSDMLDEDVGQLVEVALADLKRIGVHSSYLDEADIKDPLIIEA VLLYCHANFGSPDNQTQLLASYNAMCTKIKGGGYHREVNKTID >gi|157101634|gb|DS480690.1| GENE 176 158630 - 159841 1006 403 aa, chain - ## HITS:1 COG:no KEGG:BCZK3424 NR:ns ## KEGG: BCZK3424 # Name: not_defined # Def: group-specific protein; phage major capsid protein # Organism: B.cereus_ZK # Pathway: not_defined # 12 402 5 385 387 301 45.0 3e-80 MNREQLMKMSKKELKARLVTVGKEAQDKSGEELTSLMDEARTIGEILDEIKAREELAKAA QAAAGEGEGPDGNAGEGAEVKDQARAKSGKALKAGAKTTYNAKKIAKPMAALSTTTGVVM PHHTSPDITPTFNNVSSLIDRVTTIPLVGGETYSRPYVKSYGDGAGSTAEGADYNTSEPE FGYAEIAREKITAYAEEPEEMQKLTDADYDGVIEDSITRAIRRYASRQILVGDGTSGKFK GIFYNPASESDDIIDRNTDITTITAVADDTLDEIIYSYGGDEEVEDVAVLILSKKDLKKF AKLRDKQGRKVYTIVNHGNTGTIDEVPYIINSACAEIGGTKDAYCMAYGPLSNYEVAVFS DIETAKSTDYKFKSGQICYKGCVFMGGNVVARNGFIRVKNPAA >gi|157101634|gb|DS480690.1| GENE 177 159860 - 160633 558 257 aa, chain - ## HITS:1 COG:CAC1893 KEGG:ns NR:ns ## COG: CAC1893 COG0740 # Protein_GI_number: 15895167 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Clostridium acetobutylicum # 29 247 24 241 256 110 35.0 3e-24 MPQIKQFIACKNAKTATVKPFCEIKNITDTTADLYFYGDIVSDWWGAWQDEDQYPDAIKN FLAEANGRDLNIYINSGGGSVFAGIAIYNMLKRYQGKKHCFVDALAGSIASLFPFVDSDK PTIPKNAYLMIHKPWCDCEGNANELRKMADTLEAIEAGIWSIYEEHLAEGVTIEQIKELM EAETWLSGEEAAKYFNVSVGEENTAVAAVQDYTKLYCKNTPEALAGGNPPDDTKDQAVAK EKEIRSKIAALTIQHME 
>gi|157101634|gb|DS480690.1| GENE 178 160639 - 161793 448 384 aa, chain - ## HITS:1 COG:no KEGG:Pjdr2_1541 NR:ns ## KEGG: Pjdr2_1541 # Name: not_defined # Def: portal protein # Organism: Paenibacillus # Pathway: not_defined # 6 342 49 385 411 188 34.0 4e-46 MGAIADAIGKNVGKLKPQVIRKDEKGMVIKNDYIARLLSLRPCPEMSTYDFLYRIAVDLV YTSNSFSVIFWNKDFTRVESIQPITTTSYRIFEDDKNNILFRFRWDYDGKTYTVPYQNVI HVKARYNKRRFLGTTPDMELKRSLDLIETSGETIKNIVNRSNSLAGYLKYNNIADDEELK EIARNFQDAYMNKDNAGGIAAIDNTVEFKEISQRTPSIPTNQITFLRDNIYRYYGVNDKI LTSTLNDTEFISFYENVIEPISVQLSYEFTFKLLTPREIGYGNRIDFVANLLQYATLQTR ETIGGGMFDRGALTINEYRELMYYGPVEDGDQRLVSLNYVKVGDQSLYQVGQQNEPPDDT GANDREKRAMQAAARAYMQIMKGG >gi|157101634|gb|DS480690.1| GENE 179 162022 - 163692 750 556 aa, chain - ## HITS:1 COG:CAC1895 KEGG:ns NR:ns ## COG: CAC1895 COG4626 # Protein_GI_number: 15895169 # Func_class: R General function prediction only # Function: Phage terminase-like protein, large subunit # Organism: Clostridium acetobutylicum # 39 556 84 587 596 194 32.0 3e-49 MDNWIFKYHEAIQKKEVIVGVWVRLCFEILTTGLLNGEWEFNEKKANKAIKFIENFCHHS EGRSDLLHLELWQKAIVSAIFGIMDKTTGYRQFREVFIIVARKNGKTLFAAAIAAYMTYV DGEYGAKVYFLAPKLDQADLVYDAFYQIVQSDDELDSITKKRRSDIYIKAFNTSVKKIAF NSKKSDGFNPQLVVNDEMEAWPGDQGLKQYEVMTSALGARKQPLIISIATAGYVNDGIFD ELFKRATAFLKGNSREKRLLPFIYMIDDIEKWDSIEELKKSNPNLGVSVSVEYYLEQIEI ARNSISKKVEFMTKFCNIKQNSAVAWLDYWDVMKCVHEEKPLSLEDFKGCYCVGGIDLSR TTDLTAASIVINRDGINHIFTRFYMPQKRYEVAINEDNTPYNIYRDRGFLFISGENQVDY KDVYNWFIELVKVYKIKPLKIGYDRYSANYLVEDLKTAGFHTDDVYQGTNLTPVLHEFEG NLKDGLFDFGDNSMLAAHFLNVAVDINLNDSRMKPVKIEKRMRIDGAMSVFDALTMVSKY HNEIGKKLLNISKETA >gi|157101634|gb|DS480690.1| GENE 180 163696 - 164085 223 129 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938989|ref|ZP_02086340.1| ## NR: gi|160938989|ref|ZP_02086340.1| hypothetical protein CLOBOL_03883 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03883 [Clostridium bolteae ATCC BAA-613] # 1 129 7 135 135 203 100.0 4e-51 MAESKNDLTENKKKRPNKLTNARIKKEIEFLEKMFVGVDDENKKTLINSLIEEAAFLKVA 
CFQAKEELKKEGLTTETVNASQKFVKAHPSTQIYEKYSRQYTAIIHSLIEYLPPKEKEKV DRLAALRDE >gi|157101634|gb|DS480690.1| GENE 181 164590 - 164889 76 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938991|ref|ZP_02086342.1| ## NR: gi|160938991|ref|ZP_02086342.1| hypothetical protein CLOBOL_03885 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03885 [Clostridium bolteae ATCC BAA-613] # 1 99 26 124 124 195 100.0 8e-49 MLNGNASAFDRMAYSVINEALNNSCHNIDSEAAREQMQKQIYKSVVHCTPYESIYDVMCG RRQFYDYRNEFITAVAEGLGMLPGSRTKRNTGCSSTTGT >gi|157101634|gb|DS480690.1| GENE 182 164983 - 165375 309 130 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938992|ref|ZP_02086343.1| ## NR: gi|160938992|ref|ZP_02086343.1| hypothetical protein CLOBOL_03886 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] hypothetical protein CLOBOL_03886 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] hypothetical protein EUR_15180 [Eubacterium rectale DSM 17629] # 1 130 1 130 130 197 100.0 3e-49 MSAKADKTGSCSFCGQTKIIQVPEEWEQGQINEAVTCECECEQAQAYAKAKERKDKAKKR VNELFGSGAEKPVAEDVVNLLIATVDAIEDKHMKGITVDVGHGVKAKVSKMAKESIKVER SENKKTTYEE >gi|157101634|gb|DS480690.1| GENE 183 165372 - 166121 385 249 aa, chain - ## HITS:1 COG:no KEGG:Clole_0728 NR:ns ## KEGG: Clole_0728 # Name: not_defined # Def: VRR-NUC domain-containing protein # Organism: C.lentocellum # Pathway: not_defined # 9 119 12 124 128 120 53.0 6e-26 MSMQNMKRSETTEQIALFNWAKRTESILPELALMYHVPNEGKRSNGGILKAAGLKSGVPD ICLPVANNGFHGLYIELKFGKNKATKAQEEYMAMLNAQGYKTAVCYGAEEAGEEILSYLT EPGRMPKKACVNAPWINGKCDGINLPSRMFSREECRGCKNFNPGREERIINEILNEHPEK REIKQAIINLSCGQTGNKKIESMEDTLEIINATLGGMVKGNELTVEQSAAVLTVAMKAYE VGKKARMKV >gi|157101634|gb|DS480690.1| GENE 184 166118 - 166288 113 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160938994|ref|ZP_02086345.1| ## NR: gi|160938994|ref|ZP_02086345.1| hypothetical protein CLOBOL_03888 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03888 
[Clostridium bolteae ATCC BAA-613] # 1 56 1 56 56 95 100.0 8e-19 MGKIILFPTHPDYCKRCIYSRDNGTCASEKYNENQYKVNCVWHYCKYRKEKAEYET >gi|157101634|gb|DS480690.1| GENE 185 166295 - 167221 766 308 aa, chain - ## HITS:1 COG:no KEGG:Clole_0730 NR:ns ## KEGG: Clole_0730 # Name: not_defined # Def: ParB domain protein nuclease # Organism: C.lentocellum # Pathway: not_defined # 3 262 4 257 318 140 38.0 9e-32 MGFNIMDLMNGATRAAVEGVDNYEAITLNLDEIKVTKHNRYSMDDLEELATSILMDGLQE PLIIGRVNGEYLLSGGHRRREALVILQNEGHTEITQNIPCRFKDMTETQFRLSLLIGNTF NRKMTDYDLMNQAADWKEVLTQARKEKLLVLEEGKRVRDYVAAVLGEKPTKIAQLEAINN NATEEVKEQFEKGNMKITSAYETSRLSEDAQKEVAAAVEAGADIKSEEIKQMSEEKKKKR KTAEDIAKEQNVSDTDTSEEEKANAKKLHAVKMLEKYYIYLSEEETGILERMLEDCKRRK REYALEED >gi|157101634|gb|DS480690.1| GENE 186 167223 - 168053 381 276 aa, chain - ## HITS:1 COG:RP058 KEGG:ns NR:ns ## COG: RP058 COG1192 # Protein_GI_number: 15603937 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Rickettsia prowazekii # 1 265 1 251 255 121 35.0 2e-27 MKIICTLNLKGGCAKTTTAVSMAELLATGFKSKRGAVEPGKVLLFDNDKQGNASRLFNQY LGDTEAPAAAVLKRATFRGNTIRHTKIENLDIVPCNYFMELAELEIKADTETPQHDRYRR AFEELKNTPPFGNYDYCIIDNAPDLGMNVINALVAADEIVIPVNLDCYSLDGLEELVDQV NNVRQLNRKAHIAGVLITDYEKSDTSEAAETWLREKSGLPVFNTIIRHSKKVKDSTFYHK TPIAYCVRSGAAQGYKNFILEYMNKPHMVAEQKERG >gi|157101634|gb|DS480690.1| GENE 187 168050 - 168970 329 306 aa, chain - ## HITS:1 COG:no KEGG:CLL_A2281 NR:ns ## KEGG: CLL_A2281 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 8 289 7 280 281 131 34.0 4e-29 MGKRYYDNYDYEEAYKEQCKKLEEAEMERWMKEGWVNCLYRTSTYKSTNTESNTTLLESM VYPSFKFKADMPKTEKKRETSPSQSNLNDKNARRYLIRLANINFGKGDIWATFGWNNGLL PETYEDAKKDVVNFIRRINRKRKKLGLENAKYIYIIAFEEYTRPHFHLLISGGIDRDELE RMWGKCDRPNTRNISPDENFLLTGLATYITQNPHGTKRWCPSKNLKKPDEPKRSYSKFRK AKVEKMAFDSSVLQAEMEKAYPGFTFLDAEVKHNGVNAAFYIYARMVKKGEKPKGKPQKR KRGNKA >gi|157101634|gb|DS480690.1| GENE 188 169102 - 169242 56 46 aa, chain - 
## HITS:0 COG:no KEGG:no NR:no MVVRIITIIAAVTATGFFTLAALWFIGFLKVRQAWAWLFEDWDRRF >gi|157101634|gb|DS480690.1| GENE 189 169388 - 169690 261 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939000|ref|ZP_02086351.1| ## NR: gi|160939000|ref|ZP_02086351.1| hypothetical protein CLOBOL_03894 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03894 [Clostridium bolteae ATCC BAA-613] # 1 100 1 100 100 184 100.0 1e-45 MKFVIQGLKYDTDKMKKIANVKKWYETNSPLVKAIYGNREVGTTYDCELWRSEKGNWLLT HTEDYNKKVGHAITEEEAKSLLMRYAPDIYETEFEEIPEA >gi|157101634|gb|DS480690.1| GENE 190 169690 - 170040 320 116 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291525022|emb|CBK90609.1| ## NR: gi|291525022|emb|CBK90609.1| hypothetical protein EUR_15250 [Eubacterium rectale DSM 17629] # 1 116 1 116 116 176 94.0 4e-43 MMQTAAVQKISEQNSPYKAMLDRAYMIGYTDAMNQERSRRRAARERRERKKYFAMQKLNG VALLIFTAVAIKILEGDATIAFITVPLGLSMLLSKEMLVINKYYWRCEEKAERGVQ >gi|157101634|gb|DS480690.1| GENE 191 169997 - 170503 296 168 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939002|ref|ZP_02086353.1| ## NR: gi|160939002|ref|ZP_02086353.1| hypothetical protein CLOBOL_03896 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03896 [Clostridium bolteae ATCC BAA-613] # 1 168 1 168 168 296 100.0 3e-79 MGFMDNFTSDSPVTVKQPDYYVMTKEAAKAELITNAVNAEVPGYYIQAMITGKKPDFLNT LDAEEEETGFRTEYEQITGAVVSIFEAWEKENGVESAADGLHRLIDDLSKNRIEELRVIR ENREAENAILSKLLILPLCGDKSRAEETESEGTQHDADGSSTEDKRTE >gi|157101634|gb|DS480690.1| GENE 192 170504 - 170794 285 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939003|ref|ZP_02086354.1| ## NR: gi|160939003|ref|ZP_02086354.1| hypothetical protein CLOBOL_03897 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03897 [Clostridium bolteae ATCC BAA-613] # 1 96 1 96 96 174 100.0 2e-42 MGQFKDYMDLDEHPFTGLRKLSMPFKARIKNTEYIRKFFDGTPNFSQVEDVTIGKVYEIH AVNGYGDVSDFIFTDDLGNENRLGSFFFEDAESEDK >gi|157101634|gb|DS480690.1| GENE 193 170813 - 171172 214 119 aa, chain - ## HITS:1 COG:no KEGG:bpr_IV157 NR:ns ## KEGG: bpr_IV157 # Name: 
not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 3 100 7 104 115 103 47.0 3e-21 MAIYAIDFDNTLAITRFPEIIAPNKKMVAFAKTVKAQGHKIILWTSRAGADLENAVEWCR LQGIVFDAVNEPLPEQIKRWGNDTRKIYADYYIDDKNMTIAQAESTMNQIKEIMEEMAE >gi|157101634|gb|DS480690.1| GENE 194 171172 - 171516 379 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939005|ref|ZP_02086356.1| ## NR: gi|160939005|ref|ZP_02086356.1| hypothetical protein CLOBOL_03899 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03899 [Clostridium bolteae ATCC BAA-613] # 1 114 1 114 114 172 100.0 6e-42 MMQKLKEEITAAANRELNRANEQFPLFTSKHEGVAVAYEELEESKEALEELEASFKCLWD DVRGKETPRYLKEEITPLKIADYAINLACEAVQTAAMLMKYEMSLNLAAEREGE >gi|157101634|gb|DS480690.1| GENE 195 171550 - 171870 257 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939006|ref|ZP_02086357.1| ## NR: gi|160939006|ref|ZP_02086357.1| hypothetical protein CLOBOL_03900 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03900 [Clostridium bolteae ATCC BAA-613] hypothetical protein EUR_15300 [Eubacterium rectale DSM 17629] # 1 106 1 106 106 208 100.0 1e-52 MEEKTMMPINNQIEPDFLEHIKSTFKRWRDLNTQGVTIGARELSNFAFTLKGASMNSHLG FKYNFNPRGTDADGNPAITLKLYTKPEQMNPAADRPVYEFAAPYMV >gi|157101634|gb|DS480690.1| GENE 196 172011 - 172304 211 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939008|ref|ZP_02086359.1| ## NR: gi|160939008|ref|ZP_02086359.1| hypothetical protein CLOBOL_03902 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03902 [Clostridium bolteae ATCC BAA-613] # 1 97 1 97 97 180 100.0 4e-44 MTLADIVGVMSGPDRVRITKGRVELFAGYLGNLVHMAEYEALMAEEVTRIKEKVDITHKR YKELGLMQPLHPEETPNYSFSDLQLTIYREIILKSEE >gi|157101634|gb|DS480690.1| GENE 197 172314 - 172967 357 217 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939009|ref|ZP_02086360.1| ## NR: gi|160939009|ref|ZP_02086360.1| hypothetical protein CLOBOL_03903 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03903 [Clostridium bolteae ATCC BAA-613] # 1 217 38 254 254 373 
99.0 1e-102 MTPKEYGILQAFPMDDWKQVVSDSQAYKQFGNAVTTTVFTAIAEEIAKSIYAAEESEEQN MEAENKNFTGMNEPEEKPTAESVLTVAEAQAAASQEETTKTAIPAEETITPDALAAGMLH FLLDSGIVASACVTDETEKMFAKHIKEELDGMTIGEAPEILRDWEAAQNAVNDMLSKYAP GGYMGKIIYPLLTPLKERLEAGERTPDLYNAIVEATR >gi|157101634|gb|DS480690.1| GENE 198 173127 - 173909 176 260 aa, chain - ## HITS:1 COG:SP1336 KEGG:ns NR:ns ## COG: SP1336 COG0270 # Protein_GI_number: 15901190 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Streptococcus pneumoniae TIGR4 # 2 153 110 271 407 75 31.0 1e-13 MPAVIIAENVRGLRPYLPVLRLEYERHGYTAHIEMFNSKYWNVPQNRDRYAVVGTRNKKN LSFTFPKEQHEFVPKLSDYLEKDVPEKYYLPDEKAQTIIAQAMEKLEKMGKCHACITPDR INKRQNGPRAKAEDEPMFTLTAQDLHGVIILEDKKKRKARGGGQQTTTDNKCCEHKPVRS WDERKRILCRGISPNAYNQQGRGDKSNDNRECPFENGIIQAQKNGCGTFVAVAPTLLSSD YKQPPLVIEKRKEQDNGKNQ >gi|157101634|gb|DS480690.1| GENE 199 174353 - 174577 187 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939013|ref|ZP_02086364.1| ## NR: gi|160939013|ref|ZP_02086364.1| hypothetical protein CLOBOL_03907 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03907 [Clostridium bolteae ATCC BAA-613] # 1 74 1 74 74 126 100.0 5e-28 MEVWILRGTDPETLEEKINKQLEEVEKVKSFFHTPTVQYQTAVVPQMRGDKVTGYKVEYS AMVAVEAKPLFREA >gi|157101634|gb|DS480690.1| GENE 200 174613 - 174795 97 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939014|ref|ZP_02086365.1| ## NR: gi|160939014|ref|ZP_02086365.1| hypothetical protein CLOBOL_03908 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] hypothetical protein CLOBOL_03908 [Clostridium bolteae ATCC BAA-613] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] # 1 60 5 64 64 95 98.0 1e-18 MKKGDKRITYADRQKIEAMERTGAKVTDIAKAVGFHRATIYNELKRGGTPYRAEVAQRSL >gi|157101634|gb|DS480690.1| GENE 201 175034 - 176863 1558 609 aa, chain + ## HITS:1 COG:MA2256_2 KEGG:ns NR:ns ## COG: MA2256_2 COG0642 # Protein_GI_number: 20091095 # Func_class: T Signal transduction mechanisms # 
Function: Signal transduction histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 70 330 36 287 345 165 38.0 3e-40 MEYNRAYLPVDEWEETTRLMSLDNVIKELEQNKEYILYSRTIQNGILRDKKLRYSYYNRQ RKIILLTRSDITEIREEKRQKELLQDALNAAQVANQAKSTFLSRMSHDIRTPMNAIIGLT AIAASCADDPDRVIDCLGKITSSSKLLLSLINEVLDMSKIESGRIVLSEEDFLLDSLFKD VISMIQPDLKRRGHTLNIHIETLEHEAVCGDMQRLQQIINNLLSNAIKYTPHGGTITLEL SETRSPQQGYSTYRLVCEDNGVGISPEFIERVFEPFERAEDERIRSVQGTGLGMAITRNI ARMMDGDIWVESTPGRGSRFTVTFNLRLRNQILPDSSVLTELPVLVADDDRIVCEETCRC LNQIGMNSSFVTSGREAVSRVVSAHRDLKGFFAVIVDLRMPDMDGIEVTRQIRAQAGCHV PIIVISAYDTAEYEAAAKEAGANGFISKPLFPSKLICMLRRFALHEEEETAAASLPGLPN ADFSRKRILLAEDNPLNQEIAMAFLQNTGAHVETAMDGREAVEMFAASFTGYYDLIFMDI QMPHMDGYQATRRIRAMERSDAASIPIVAMTANAFTEDIKLALEAGMNQHMAKPIDIRQL ETIMFRWLG >gi|157101634|gb|DS480690.1| GENE 202 176960 - 177973 760 337 aa, chain - ## HITS:1 COG:MA2773 KEGG:ns NR:ns ## COG: MA2773 COG0500 # Protein_GI_number: 20091596 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 12 337 14 342 349 181 31.0 2e-45 MPMDSFSSEGILRRLIRLPYLPLYVGVLNAAVELDVFSDLTQRKSVRELSEERGWNEDNT RYFLDALYCLGFIWKRDGMYCNCRETDRYLVKGRPEYIGGHLAFHCAEDCMGCRDIKRLV EEGPGGDSQPEGKLAFEQYVQNMRDSQLGFRQTEIQKMVKELPEYPEIRKILDLGCATGL LGLGVIGEREDVYGILYDRPAMEPAIRESIRLSGLEERAVPMTGDYLTDDIGSGYDLVLA IATLSFVEQEMAGLMKKLYKAMNPGGVLLCYSEGIERDGSGPWDMMLGWLPYNMQGYNLG VKKNEITEAAMAAGFRSAEKRTGIYSTGNVDVDIFRK >gi|157101634|gb|DS480690.1| GENE 203 178056 - 180278 2165 740 aa, chain - ## HITS:1 COG:CAC3683 KEGG:ns NR:ns ## COG: CAC3683 COG0768 # Protein_GI_number: 15896915 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 47 735 27 669 671 476 39.0 1e-134 MRSRRRRRRSRAVIIIPLVLVCLMAAAAGMAFLWFAKGQAGVRQAAPDERFMEYTGYLTE GNYEAMYRMLDSGSRMDISQEDFITRNKKIYEGIGASSIRVDITGVEEKEDQGIQTVSYE 
TSMESLAGTIHFFNQADFKLEASSGAAGTDSHDSEKAGKKRKEAKDEEYRLIWNDRVIFP NLSWNDKVRVTTDKAVRGSVLDRNGIMLAGKGSASMVGLVPGKMSREADNYSEDEINMLS ELLGVSADSIRKKLSANWVKDDSLVPIKTLKKVDELNLQSALPEDENVQNRALQDELLTI PGVMITDTPVRYYPLGEKAAHLVGYVQNVTAEDLEEHKGEGYLSDSVIGRSGMEGLFEKE LKGQNGRIISIVTSQGEEKQVLAAIPRIDGQDITLTIDSSLQELVYDAFRDSKSCTVAMN PYTGEVLALVNTPSYDNNDFILGMSEETWTGLNEDERKPMLNRFRQRFAPGSSFKPITGV IGLTAGVLTPDENFGEDAAGLSWQKDAGWGGYHVTTLHGYSPVNLKNAYIYSDNIYFAKA ALKIGYDDFMAGLDRLGFNQKLPFEISVAESQYSNADRIETEIQLADSGYGQGQVLVNPI HLASLYTMFPNQGKVLKPYLIYKDTPVPEIWIEDACTQETAETVEDAMKAVISSEHGTGH AAMRSDITLAGKTGTAEIKASKEDTSGTELGWFAVYTADKDIQKPVLMVSMVEDVKNAGG SGLVVRKSRDVLGAYIPSGQ >gi|157101634|gb|DS480690.1| GENE 204 180444 - 182186 1313 580 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939019|ref|ZP_02086370.1| ## NR: gi|160939019|ref|ZP_02086370.1| hypothetical protein CLOBOL_03913 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03913 [Clostridium bolteae ATCC BAA-613] # 1 580 1 580 580 1140 100.0 0 MKKIHEILTHRKRICKIRIHRMTAPGFAACLVFMAVSLAACQSADSGTGGQKENIAAVRE QDGAGQTDQENHAAKGENAQDTGTPRGQEVVPDTPDMEAGQETAAEQQGEDGLMTADQWQ MVPGYGICKKGEPVLYVLDMDASQMKTDEKGVSARLLSAVFQDNVISVRVKFKDDTVTLI PDEKVKEILKQEEENRKKQDQGISVQWDDSYFCIDSEKNIYGRSAFQDRIRAEKKEQKAQ GARAVTFDRIKGKGIDGLGFTRQRNSTSYKDNGKGGFYISQVSENTIGKDRFTLDEPEGV YELWLNGFKDPIKIVFKKAPAYDSLEDIKGMVNHEGFYGMAEGIVEEDGLRLTTYTYSED GYRIGFSRGKLMATLSDGRTAELVPVLGEQLTTSLDQLSGIQPRSIQETLYRRPEGTVVT EASLHAEKPVVISPEQGSVLEIPIPGERQPLDMVVEFRDCRIYLTGISPMDELYEYGTDD NGDPLTRHMVYINARAEMRDADKNLYYVNAVQPEEGQEGSSGDPIKTVYALADMAGADKY NAGELAGFKALYEEGESVIRLQLQRPTYGWNQEFVLPVQM >gi|157101634|gb|DS480690.1| GENE 205 182451 - 183356 706 301 aa, chain + ## HITS:1 COG:FN0008 KEGG:ns NR:ns ## COG: FN0008 COG0379 # Protein_GI_number: 19703360 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Fusobacterium nucleatum # 4 300 2 296 298 299 51.0 4e-81 MTKQEEIKQLKAEQDAVILAHYYVDPQVQELADYIGDSFYLSKVAAGLENRTLVFCGVSF MGESGKLLSPDKVVLMPDAQADCPMAHMVTKEEVEQYRKEYPGLAVVCYINSTAEIKSWS 
DVCVTSANAVQIVRNLPNQDILFIPDKNLGRYVASQVPEKHVMLVKGCCPVHDQMSPKEV SGLKKIHPNALVLAHPECNAQVLELADYIGSTTGILKYAAGSTSREFIICTEIGVRHELE QQNPDKQFYFPDTEPICRDMKKITLDKIIHVLRTGQNQAFVPEAYVPSAGRALEQMLKLA R >gi|157101634|gb|DS480690.1| GENE 206 183421 - 184638 807 405 aa, chain + ## HITS:1 COG:FN0009 KEGG:ns NR:ns ## COG: FN0009 COG0029 # Protein_GI_number: 19703361 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Fusobacterium nucleatum # 7 386 6 381 435 367 47.0 1e-101 MTKHIYCDVVIAGCGVAGLYTALNLPEDLHVLMICKEDMDTCDSMLAQGGICVLRGEDDY EPYFEDTMRAGHYENRKESVDLMIRTSPGIIRHLLELGVGFDKNPDGSLKYTKEGAHSRP RICFHEDITGKAITTVLQERTAALPNVDIMEYTVMTDILTANGACAGIQARTRDNSSLFI HARDTVMATGGIGGLYQHSTNFPCLTGDALTVAQKHGVKLEHLDYVQIHPTSLYTEKPGR SFLISESARGEGAVLLNGKGERFVNELLPRDAVSQSIEAEMKKEGSCHVWLSFAPIPQKT ILEHFPHIYETCLEEGYDITKEPIPVVPAQHYFMGGVWVDLDSRTSMEHLYAVGETSCNG VHGKNRLASNSLLESLVFAGKAAKRIENSQKGRIHYESNYHAAYC >gi|157101634|gb|DS480690.1| GENE 207 184607 - 185464 548 285 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 17 278 18 283 286 215 45 2e-54 MNPITMQLIADKYIRLALEEDISSEDVSTNAVMPEYKKGQVQLICKEDGIIAGLQIFKRV FTLLDPETKVVFDVRDGEQVKKGQHLATVTGDVRVLLSGERTALNYLQRLSGIATYTHTV AGMLEGTRTKLLDTRKTTPCMRVFEKYAVRIGGGQNHRYNLSDGVLLKDNHIDAAGGVRE AVAAAKAYAPFVRKIEVETENLDMVKDAVDAGADIIMLDNMTPEQMAEAIRLIDGRAKTE CSGNMTKENIRTVTELGVDYVSSGALTHSAPILDISLKHLKVLTD >gi|157101634|gb|DS480690.1| GENE 208 185498 - 186046 540 182 aa, chain + ## HITS:1 COG:BH1216 KEGG:ns NR:ns ## COG: BH1216 COG1827 # Protein_GI_number: 15613779 # Func_class: R General function prediction only # Function: Predicted small molecule binding protein (contains 3H domain) # Organism: Bacillus halodurans # 2 170 4 175 179 126 40.0 2e-29 MENAMNGTQRRRKLLDMMRLASAPLSGGALGRETGVSRQVVVQDIALLRTMGYPILSTAR GYVLNVSKHASRFFKVCHTNEQTEDELVTIVDLGGTVVDVMVNHRVYGKMSAPLNIKNRR DVQLFMNNIKTGKSTPLMNVTSGYHFHHVCAEQEEILDEIEEALRKKHYLAELLPYEMSD DE >gi|157101634|gb|DS480690.1| GENE 209 186060 - 
186866 253 268 aa, chain + ## HITS:1 COG:lin0757 KEGG:ns NR:ns ## COG: lin0757 COG1408 # Protein_GI_number: 16799831 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Listeria innocua # 15 265 26 285 293 160 36.0 3e-39 MQKKTHAVNPFYCGLTVHRYTEYSKKITKHLRAALITDLHSTIYGSAQEPLLTAVHSHSP DLILMAGDIADHKVPHKGTLLLLEGLKGQYPCYYVTGNHEHWIHQIPDIKKMFTGYGVTV LGGRTIRTAIKGQPLIIGGVDDPHAFTDSHHSVKLDSRWKEQFWRCCSRTSPDIYSILLS HRPELTKYYRDSGFDLIVAGHAHGGQFRMPGCPGGLLAPHQGFFPRYAGGQYALGSTSLI VSRGLCINRLPRIYNPPELVLVDLKPIS >gi|157101634|gb|DS480690.1| GENE 210 186818 - 187138 133 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|239625675|ref|ZP_04668706.1| ## NR: gi|239625675|ref|ZP_04668706.1| predicted protein [Clostridiales bacterium 1_7_47_FAA] predicted protein [Clostridiales bacterium 1_7_47FAA] # 10 89 523 634 829 75 41.0 1e-12 MRGRGICNWDDYCPGTLFVSGDKEEEYFAFCGSSQEAPDLVRLTKAGSSKVFMENVVQCE PYAGVTGFFQHGGERFMDAVPKEWDWSGRAFYEMGFRSTRTNSGGL >gi|157101634|gb|DS480690.1| GENE 211 187015 - 187638 317 207 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939025|ref|ZP_02086376.1| ## NR: gi|160939025|ref|ZP_02086376.1| hypothetical protein CLOBOL_03919 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03919 [Clostridium bolteae ATCC BAA-613] # 1 207 1 207 207 418 100.0 1e-115 MRKRTRIIQDKRRRTRCAQLAADAFDEKLISVREEDGRTEIIMEGDAVTDMCGSTLFIQE PDTGDIFSENLKINLYSYRPGEGKKLAVPNIYGFYPVQKGESFCFTRLGENTDGQALCRC QEYSWYMTEGIGAQYMTVAKVDDVFQPAESRMYRFALTDNKLSSLSECAAEGYAIGMTIV REPCLCRETKKRNILHFAAVLRRRLIL >gi|157101634|gb|DS480690.1| GENE 212 187761 - 188069 283 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939026|ref|ZP_02086377.1| ## NR: gi|160939026|ref|ZP_02086377.1| hypothetical protein CLOBOL_03920 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03920 [Clostridium bolteae ATCC BAA-613] # 1 102 12 113 113 169 100.0 8e-41 MIQMMKRDFQLSLFCCCLGKLTMASWLLTNTMFLLASHVPFVTLDFVRQISLESFLGIWL GIFLCMAIPYVLPCGRSRIHFSLTPSVLMYIATLMIKETFIL 
>gi|157101634|gb|DS480690.1| GENE 213 188066 - 189517 1006 483 aa, chain + ## HITS:1 COG:sll1352 KEGG:ns NR:ns ## COG: sll1352 COG3463 # Protein_GI_number: 16330152 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Synechocystis # 6 324 9 318 472 89 22.0 2e-17 MIKTIEQWLTGKRLLIILTMAAAVWFSFTAVMTVLRYRLNYASAYDFGIFSQMYYYMDKG LSPLTTCERDGLLSHFAVHLSPVFYVLLPFYKLVPRPETLLVLQSLVVISGMVPLCLLAR TFCFTRLQTLCFGLMYCFYPAFMGGCFYDFHENKFLAVIILWCMYFLETGHFKSMLASAA LLLMVKEDAPVYAACIGLYLFLWKKQYRWGAIVFLASCAYFCMAVWYINRFGDGAMINRF DNFISDKRLGLASMFKTILVNPAYVLSQIAVKDKIMFFLEMLLPLGFLPLMTKDWQKWTL LIPFVLINLMSNYIYQHSIYFQYTYGSGALLIYLALVNFRDIKNAPGRRRKGGDGYDGRL PATVCAWGLLCGLVLMGGVMYSKSKYAVLYHRHHEEAAQARTLLDQIPPDASVKSSTFFL PQLSMRDEVYLLTSSHPADYMVVDLRKGYEKDLEQILADCREQGYEPAGTVDGYVALLKQ DVE >gi|157101634|gb|DS480690.1| GENE 214 189972 - 190532 409 186 aa, chain - ## HITS:1 COG:FN1102 KEGG:ns NR:ns ## COG: FN1102 COG1859 # Protein_GI_number: 19704437 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNA:NAD 2'-phosphotransferase # Organism: Fusobacterium nucleatum # 12 183 6 178 179 177 49.0 1e-44 MGNKKAGSSEERLSVFLSLVLRHQPEAAGITLDEHGWADVEDLIQGISGTGRPMDMELLE DIVRTDEKGRYSFNEDRTLIRANQGHSIPVDVELTETAPPQVLYHGTASRFEQAILEQGL KPMSRLYVHLSGDAKTALTVGRRHGNPVVFEVDSGRMSRDGEHFYISQNGVWLTKRVKPE YLKRNI >gi|157101634|gb|DS480690.1| GENE 215 190653 - 191408 642 251 aa, chain - ## HITS:1 COG:ECs4769 KEGG:ns NR:ns ## COG: ECs4769 COG0084 # Protein_GI_number: 15834023 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Escherichia coli O157:H7 # 4 251 12 258 260 191 40.0 1e-48 MGKQFDKDREEVVRDSLKEGVGLIITGTDLKSNQAAVDYIGEKMPEKTWCTCGIHPHNAD RWNDDYRSKLEALIRRNRQSIVALGEAGLDYDRMFSARENQKKCFSDILEMAGDLDLPLF LHERAAEQDFIRLLKNRRELCRRSVVHCFTGTRETAYRYLQLGCYIGITGWICDDRRNRD VVEAVRVIPLERLMIETDAPYLTPLNIRGLSRRNVPSNIVYVAERIAEIKGVDVETVKKI ALETTRTFFSV >gi|157101634|gb|DS480690.1| GENE 216 191660 - 192670 867 336 aa, chain - ## HITS:1 COG:no KEGG:ELI_3952 
NR:ns ## KEGG: ELI_3952 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 18 313 18 317 339 234 44.0 4e-60 MAENEKKKRTLWEQAVENALHFAFNLAFVFIFQSIFGVQNVLPGVAISVGMTMFPDGYIG VRPVTMAGIIIGLYCGCVIAGQTALMLPAAALVIDFLFVLLIMALTCEPTMYKPAISFLL CFVFAQSTPVPPAAFPMRLAGAAAGAAAVAVTTVIKWYWQGHGKEGRGLKEQIRLCTVRK GYILRMSMGIAAAMFIGMVLHLRKPLWISIVVMSLTQLHYHETLERIRHRFLGNVLGILF FVVVFRMLVPESWAFGMVLFLGYISFFTSEYKYKQIVNAVSAINASLVLLDTSTAIENRL LCLAGGACIVLILYLAQKLGRGIMGHTGAAAACRIK >gi|157101634|gb|DS480690.1| GENE 217 192905 - 193339 446 144 aa, chain + ## HITS:1 COG:no KEGG:CKR_1355 NR:ns ## KEGG: CKR_1355 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 5 143 14 152 158 68 32.0 7e-11 MNNKNFTEELLRLLPYWHYRIDKPFKAFMKDKMSLETYYCLQVLLRKGPMTMTELTRHLN VSKQQATKLIEILCSHDFVRRLPTEHDRRCIVIEVTERAKDYMINTIYKDTSFADKLKQE LGSEDMERLEQAVLTLSDVLSRLD >gi|157101634|gb|DS480690.1| GENE 218 193392 - 195479 1937 695 aa, chain - ## HITS:1 COG:sll0267_6 KEGG:ns NR:ns ## COG: sll0267_6 COG2200 # Protein_GI_number: 16331091 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 446 688 5 246 253 177 36.0 7e-44 MERIRQFLNKNIYIFSVRQGLLLTIPFLILGSFSMVIMQFPLKAWQELLPGFAGGSVKLL LMAVYQATFGSLSFIFALMISYSYGEEQTVYKNTHIFFPAVALCSFIAFTYPSGGLSIWG PEWSFTAICITLISCFLLTGIYHWVEGYRHLYTMGVAYNFNASMQSMIPAVVTIGLCGLS GLVLSLAFDDVNIMNFGSYLFLKLFGQMGNGLFSILLYILLSHVLWFFGIHGTNTLEAVS RRLFESEVTANQAMAAQGLEPTHMFSKTFLDTFVFIGGSGCALCMVLALFLAARKNNNRR LAKLSLPAVVFNINELVLFGFPIIFSADMILPFILTPMVLSVISSFAMWFGLVPVASRSV EWTVPVLASGYLATGSVSGSLLQAFNLIVGTMIYIPFIKQSERVQDREFLGKIKRLEASM NEDERSVWVRDFRWNTYENQQTAKLLASDLEYALSKGALELYYQPQVKRDGTLYGCEALL RWNYMGQGFVYPPLIIALAVQGDFIGSLGMYIVEKACRDMAGAHTQAGCPVSFSVNILPL ELEDPSFADHVLRILKDTGVEGSNLTIELTEQVALNPGPNLERQLEKLRESRVRISMDDF GMGHGSLNYLNSAHYDEVKIDGSLIRRLPEHTQTCELVGNIMNMSRILGVSTVAECVETP EQIQMLEKLGCSIYQGYYYSRPLPLDAFIEYVKGL >gi|157101634|gb|DS480690.1| GENE 219 195646 - 195966 301 106 aa, chain + ## HITS:1 
COG:CAC0571 KEGG:ns NR:ns ## COG: CAC0571 COG1695 # Protein_GI_number: 15893861 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 101 1 101 107 110 58.0 7e-25 MNPQFKKGVLELVVLESVRKRDMYGYELVEEVSKVIDVNEGTIYPLLKRLTNEHYFETYL RESTEGPPRKYYHLTAAGILYRDVLEREWDEFQQKVCTFLKEHGDE >gi|157101634|gb|DS480690.1| GENE 220 195959 - 196915 262 318 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939037|ref|ZP_02086388.1| ## NR: gi|160939037|ref|ZP_02086388.1| hypothetical protein CLOBOL_03931 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03931 [Clostridium bolteae ATCC BAA-613] # 1 318 1 318 318 658 100.0 0 MNKTEFLDELKNGLAVLEEKEQQDILEEYTQHIDMKMERGLSEEEAIGDFGDLGQLAAEI LEAYHVNPDYSGNRTLSPVLRVRDLIRRARSGFLGIARAAGRFLHRIAAAAVQNTGVLIG KTRTFFSRIPARRPRFLRHLRTKMEVVNINGADFTPSSEHSQTGVPRIPPVPRTDHVHDQ STVQNTVRNAFKRAGSSICRVIKGMLCGLGVLFSSCLSLALWCMYFCLRWTWNILLFILT LFNGCLTLFSLYLLAVFLVWTVQGYPFAGPALLSLGVVLCAGSFSILCISLLRLRRNTGN TALSTDMDQNASEEVQHA >gi|157101634|gb|DS480690.1| GENE 221 196908 - 197450 470 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939038|ref|ZP_02086389.1| ## NR: gi|160939038|ref|ZP_02086389.1| hypothetical protein CLOBOL_03932 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03932 [Clostridium bolteae ATCC BAA-613] # 1 180 1 180 180 352 100.0 9e-96 MRKLQSYMIALFCTGVFAAGLGTGITFSEFSSFTYMGDADAGPVDMKTEILDYDMGQSHH SEETSELESPEDTCLDIIRTDRLPESEEDIVADNSVPRGIIRFQVTYNAAAVRPYVEYSE GSYVGIDSYDISDDFERFMRNKDLFLKDLKARQLHSYKSSGITQIKVLVHPEDRDMLRLP >gi|157101634|gb|DS480690.1| GENE 222 197566 - 198834 1140 422 aa, chain - ## HITS:1 COG:CAC0476 KEGG:ns NR:ns ## COG: CAC0476 COG2195 # Protein_GI_number: 15893767 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Clostridium acetobutylicum # 13 421 1 407 408 501 58.0 1e-141 MKIAGGNGEVAEISEVTARLLDYVLYDTQSEDDRESIPSTKKQFELAHKLAGELKELGAS QVRISEQGYVYACIPANLEEGKTCPSLGFIAHMDTAPSFSGKDVKPQFIRDYDGQDICLN 
REQDLWMRTADFPDLKAYEGKTLITTDGTTLLGADDKAGIAEIMTMAAYFLGHPEVKHGT ICIGFTPDEEVGRGADGFDVDGFGADVAYTVDGGALGELEYENFNAASGRVTVHGANIHP GTAKGRMRNALLMAMEFQSLLPANENPMYTEGYEGFYHLDRMTGCVEEARMDYIIRDHDR AKFERRKELFSQAAGFLNQKYGSGTVEAAVKDSYYNMKEKIEPHMDLIDKAKASMEKLGI QPIVVPIRGGTDGARLSYMGLPCPNLCTGGHNFHGKYEFIPVQSMEKITELLIEIARSFA EA >gi|157101634|gb|DS480690.1| GENE 223 198894 - 200207 1382 437 aa, chain - ## HITS:1 COG:CAC0942 KEGG:ns NR:ns ## COG: CAC0942 COG0139 # Protein_GI_number: 15894229 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosyl-AMP cyclohydrolase # Organism: Clostridium acetobutylicum # 230 337 3 107 115 134 61.0 3e-31 MTDCKKLIMGFGYRNGKTFSWNGKLEYEGGLKELARTACDNGADEIFICDRSFSDEDHEA VIGAIKETARTVDEPILAGGRIRRLEDVKKYLYAGASAVFLDVSYEDNVDMMKEAADRFG SEKIYAYMPDLSYLNRVEEYIQLGASVMICRASGPVLSLAELGEIGEISCRSLIFCGGHE NVQEMAGDLKLSLGCPQVEGAVLTLAEDNLDKVMELKQILKGAGIVTDTFESSLEWKNFK LGSDGLIPVIVQDYKTLEVLMMAYMNEESFQATLASGRMTYFSRSRQKLWLKGETSGHFQ YVKSLKIDCDNDTILASVKQVGAACHTGNRSCFFTTLAEKEYKETNPLKVFEEVFGVILD RKEHPKEGSYTNYLFDKGIDKILKKLGEEATEIVIAAKNPNPEEIKYEISDFLYHMMVLM ADRGITWEEITEELANR >gi|157101634|gb|DS480690.1| GENE 224 200234 - 200953 872 239 aa, chain - ## HITS:1 COG:BH3579 KEGG:ns NR:ns ## COG: BH3579 COG0106 # Protein_GI_number: 15616141 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Bacillus halodurans # 3 235 6 241 244 215 45.0 7e-56 MQLYPAIDMKNGQCVRLRQGAFKDITIYSDAPEKVAAHWQEKGASFLHLVDLDGALAGYS VNEEVIRRIADTVSIPIEIGGGIRSKEAVERMLDLGVRRVIIGTKAAEHPEFLRDMVRTF GEEAIVAGVDAKDGMVAVEGWEKVSSLTASDLCLTMKEYGVRHIVYTDISRDGMLSGPNV EATRKLTEETGLDIIASGGVSCMEDLKCLHEAGIRGAIIGKALYENRIDLAEAVRLYEA >gi|157101634|gb|DS480690.1| GENE 225 201034 - 201621 776 195 aa, chain - ## HITS:1 COG:CC3734 KEGG:ns NR:ns ## COG: CC3734 COG0131 # Protein_GI_number: 16127964 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate dehydratase # Organism: 
Caulobacter vibrioides # 1 195 1 196 196 216 53.0 2e-56 MERIASITRDTNETQISMSLNLDGSGRGNINTGIGFFDHMLNSFARHGFFDLDLAVKGDL EVDTHHTIEDTGIVLGQAIRKAVGDKKGIVRYGSQILPMDESLVLCALDLCGRPYLVWDL VLDREKVGDLETEMVREFFYAVSYGAEMNLHLKQLSGTNNHHIIEAAFKAFAKALDGAVA AEPRLSGVLSTKGSL >gi|157101634|gb|DS480690.1| GENE 226 201640 - 202932 1523 430 aa, chain - ## HITS:1 COG:CAC0937 KEGG:ns NR:ns ## COG: CAC0937 COG0141 # Protein_GI_number: 15894224 # Func_class: E Amino acid transport and metabolism # Function: Histidinol dehydrogenase # Organism: Clostridium acetobutylicum # 1 429 5 431 431 455 55.0 1e-128 MRIVTLDNKSMENILADMLKRDPNNYDSYTQTVQAIVDDVKARGDEALFEYTKRFDGASL DGSSIRVTREEIDEAMKQVEPGLLKVMERSMDNIRRYHEKQRQNSWFDAQPDGTILGQKV TALESVGVYVPGGKAAYPSSVLMNIIPAEVAGVKRIAMVTPPGKDGKVNPVTLTAAYMAG ATEVYKAGGAQAVAALAFGTASIPRVNKIVGPGNIFVALAKKAVYGHVSIDSIAGPSEIL VIADDSANPRFVAADLLSQAEHDELASSILVTTSMELARKVSDEVDGFLKVLSRSDIIKK SLDNYGYILVAESMEKAVETANSIAPEHMEIVTRNPFEVMTKIQNAGAIFIGEYSSEPLG DYFAGPNHILPTNGTAKFFSPLGVDDFIKKSSIIYYSREALEAAHKDIETFAESEHLTAH ANSVRVRFEQ >gi|157101634|gb|DS480690.1| GENE 227 202972 - 203625 753 217 aa, chain - ## HITS:1 COG:CAC0936 KEGG:ns NR:ns ## COG: CAC0936 COG0040 # Protein_GI_number: 15894223 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 2 215 5 215 215 204 51.0 1e-52 MRYLTIALAKGRLADKAMEMFEAIGISCDEMKDKASRKLIFVNEDLGVRFFLAKANDVPT YVEYGAADIGIVGRDTILEEGRKLYEVMDLGVGKCRMCVCGPESARERLEHHELIRVATK YPNIAKDYFYNQKYQTVEIIKLNGSIELAPIVGLSEVIVDIVETGSTLRENGLMVLEEVC SLSARMVVNQVSMKTENERITAIIKKFQAYLKENAGR >gi|157101634|gb|DS480690.1| GENE 228 203648 - 204913 1535 421 aa, chain - ## HITS:1 COG:CAC0935 KEGG:ns NR:ns ## COG: CAC0935 COG3705 # Protein_GI_number: 15894222 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase involved in histidine biosynthesis # Organism: Clostridium acetobutylicum # 9 326 7 326 407 245 38.0 1e-64 MAGNNRLIHTPEGVKDSYNGECRKKLAVQDKILDTFYLYGYEHIQTPSFEYFDIFSKDRG 
SVPDREMFKFFDRDNNTLVLRPDMTPAVARCVAKYFMDDPMPLRLCYLERTFKNNSSYQG RLKERAETGAELIGDDSEDADAEMIAMVIDSLRQAGLKEFQVELGQVAFYRSLLKEAGLE EEVEEELNQYIENKNYFAVEGLLKNQPVDEGLKKVFLKLPELFGSLEQMQEAKKLTANPG ALAAIERLEKVHSILESRGLEAYVSYDLGMLSRYQYYTGIIFKAYTYGTGDYIVTGGRYD KLLVQFGKDTPAVGFVIVVDQLMAALSRQQIDVPVTLVNTVILYETSARSRALWLGSYFR DKGLAVQSMKKKEQVALEDYKAMAIQRGMRNVLYLKGDGTVVTAMDTVNGNTDQIPITAY E >gi|157101634|gb|DS480690.1| GENE 229 205295 - 205828 526 177 aa, chain - ## HITS:1 COG:BH3539 KEGG:ns NR:ns ## COG: BH3539 COG3331 # Protein_GI_number: 15616101 # Func_class: R General function prediction only # Function: Penicillin-binding protein-related factor A, putative recombinase # Organism: Bacillus halodurans # 10 167 8 165 168 121 40.0 8e-28 MGTWNSRGLRGSTLEDLINHTNDLYREKKLALIQKIPTPITPIEIDKSSRHITLAYFDQK STVDYIGAVQGIPVCFDAKECAVKTFPLQNIHPHQIEFMGEFEKQGGIAFIILYFTGLDE IYYLPFEQIEGYWKRMEEGGRKSFTYDEVDKSWRVRSHAGFLVHYLEEIQKDLDRRS >gi|157101634|gb|DS480690.1| GENE 230 206398 - 206892 291 164 aa, chain + ## HITS:1 COG:lin0443 KEGG:ns NR:ns ## COG: lin0443 COG1595 # Protein_GI_number: 16799520 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Listeria innocua # 4 151 18 175 182 81 32.0 6e-16 MRQESDVEQTIGQYADMVKHICFVYMKNESDTEDVFQDVFLKYALFSEPFDSEEHKKAWL IRVTVNRCKDLLRSFFRTHTCSLEEAMSMADTKSQDLSHVLESVMNLSDKYRIIVYLHYY EGYSAVEIAGILHKNVNTVYTHLSRAKAELKKMLGGDEGETKYT >gi|157101634|gb|DS480690.1| GENE 231 206960 - 207835 560 291 aa, chain + ## HITS:1 COG:no KEGG:Elen_0591 NR:ns ## KEGG: Elen_0591 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 28 190 78 236 271 86 34.0 1e-15 MHRSRKPAPRRMKWAVSLACLLLFATSGLGGYSLYYTEAAVISIDVNPSIELDINRWGKV VDQTTYGEESETVLQSLSLKHLEYEEALALLLASDAMQQYLKKDALVSITLETKDGDPQM FSSLQECVNTALMQCHGVKAEYASVDSHMCEEAHEHGMSLGKYYAIQELLTADPQATLDE FKDKSMKEIKVHTEHCEHRQQRIRGELQGQSQAESQAESSQAESSQAESQEESSQAESQA ESSQADFQGESQADFQDKFPDNSSQPAKRSHSQTDSQETGCHGSRHHSRRN >gi|157101634|gb|DS480690.1| GENE 232 
207832 - 208563 690 243 aa, chain - ## HITS:1 COG:NMB1820_2 KEGG:ns NR:ns ## COG: NMB1820_2 COG0110 # Protein_GI_number: 15677656 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Neisseria meningitidis MC58 # 97 240 37 180 190 117 39.0 2e-26 MEDGKMGDGEMKNDEMKDSEIKGSERKVRNTGEPVKRLLIWGAGDQGIVTLDCALAMNRY SRIDFMELKEKGHRAIPEYLIRREDDGLDQILKSYDEVIVATGSNELREKKISMLVSLGI PLAVIIHPTAVISPLSRIAKGCTIHPYAVINAYASIGTGCIINTQADIEHDCVVEDFVNV CPKVSMAGHTVVGRKTFLGIGCTIIDGIRIGTEATVGAGAVVIRDVPDHAAVAGVPAKDI RKI >gi|157101634|gb|DS480690.1| GENE 233 208640 - 209095 440 151 aa, chain - ## HITS:1 COG:CAC0160 KEGG:ns NR:ns ## COG: CAC0160 COG0454 # Protein_GI_number: 15893455 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 3 146 2 145 149 60 29.0 2e-09 MELEYRFADIDELDFLVRMRIRDLRMFSECAAGTKLEDAIRRFYEIKMKENACRTLLAYA GRELAATATLYFYDVLPSNENPSGRVGQITNVWVDEAFRRQGIATRMVERLMEEARGETK MVCLNSSEQALGMYERMGFSSKKGYLVYYLS >gi|157101634|gb|DS480690.1| GENE 234 209135 - 210526 758 463 aa, chain - ## HITS:1 COG:no KEGG:Clole_4025 NR:ns ## KEGG: Clole_4025 # Name: not_defined # Def: acyl-CoA reductase # Organism: C.lentocellum # Pathway: not_defined # 2 457 13 422 436 321 41.0 4e-86 MLQGELLGQLEARINDTRLGRGLEQERVIRGLVRLGQELREELENAAPGKTAGEAGNPWI PAWLGCLTRFLSREWLEYKIETELGVKAGKGRKSEPGGSTECEPGPSHEVYHYTELQKME THILPLGTLLHITAGNMEGLPVFSIVEGLLTGNVNILKLPGNDGGLSMDIILRLIRLEPA LADYIYVFDTSSGDLPAMRRLEEMADGIVVWGGDSAVAAVRRSAPPGTRLIEWGHKLGFV YISGYECKDKELSALAEHIVSTRQLLCSSCQTIFLDTERMDDVNEFCREFLPYMERAVLS CPGHVSLFQPDILQKDSTVRPVESRFGSVNMAEAALALQQHNDRMEQMAYGKKKMRKLYP GKGCSLTACGDHELELSPMFGNCLVKRLPQGQLFGCLRRHKRHLQTAGLICREDRRPELM KTLASGGVARITRAGTMSSPFAGEAHDGEYALRRYVRFVSQEM >gi|157101634|gb|DS480690.1| GENE 235 210556 - 211629 583 357 aa, chain - ## HITS:1 COG:no KEGG:Clole_4024 NR:ns ## KEGG: Clole_4024 # Name: not_defined # Def: acyl-protein 
synthetase LuxE # Organism: C.lentocellum # Pathway: not_defined # 1 347 22 368 373 386 51.0 1e-106 MFFSAVRENCAYHYRHCREYSKILKRSGFRPGHLESSRDIGRIPVLPTLFFKHHDIHSIA PECTWLKTTSSGTSGTASQVNFDGGALLCGLGMVARTVSKRGLLSLRPARYVIFGYEPHR DNRTAVARSTFGATFFAPPLSRDYALIYRQGQYIPNLEHVMDRLCRYSHGAVPVRILGFP SYAYFALKQMEERGIHLTLPGGSKMILSGGWKQFEGQKVNKEVLYGLARRVLGIEDKDVA EFFSAAEHPVLYCDCRNHHFHVPVYSQVIIRDIKTMEPLGYGRPGLVNLITPMIKAVPVL SVMTDDVGVLHVGKECGCGIDAPYLELLGRAGVSGIKTCAAGAQDILNGTYGKEAAR >gi|157101634|gb|DS480690.1| GENE 236 211649 - 212245 248 198 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939058|ref|ZP_02086409.1| ## NR: gi|160939058|ref|ZP_02086409.1| hypothetical protein CLOBOL_03952 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03952 [Clostridium bolteae ATCC BAA-613] # 1 198 68 265 265 384 100.0 1e-105 MAGILAGIFGVFLGCGGCGQGRQNAQDVSGAVPAYQFAPGFGIIAHSPYPVYVLEDENGY AVSKDGVKVELVRGMMQDNELIAELRILDYRKNSQRDDRGADSWIYDIRCFGPGIPDTGY TAERMGTHSENRENGDGGYRETLAEFCTVSKIRERQGMLGWQQERTMYRKDIGRLYGRIR NHKRQVKQIPAFLDKEYL >gi|157101634|gb|DS480690.1| GENE 237 212502 - 213107 480 201 aa, chain + ## HITS:1 COG:CAC2773 KEGG:ns NR:ns ## COG: CAC2773 COG1309 # Protein_GI_number: 15896028 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 8 195 3 193 204 98 33.0 6e-21 MNTDIGGKKPYHFGNLKEALIEQGIALIHEEGIEKFSLRKAAKKVGVSAAACYNHFGNMD DLLREMYSYVIDRFAAALKQAVEDNPCHHVTISMGVAYVEFFAEYPHYFNFLFDSEYLGI QIKETEITWNSSFTPFEIFVTGAKRGMRELNIDEKELRDDLLVMWAAVHGLAAMANMKGV QYDSGDWGALTERLLLSKVML >gi|157101634|gb|DS480690.1| GENE 238 213135 - 213797 523 220 aa, chain + ## HITS:1 COG:CAC0748 KEGG:ns NR:ns ## COG: CAC0748 COG0778 # Protein_GI_number: 15894035 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 1 219 1 231 241 128 32.0 8e-30 MTEIEAIRARHAVRNYTAKPLSPEIIDELRQEIEQCNRQGQLHIQLVTENEEAFKTFIPL FGRFKNVKNYIALAAKKQGDFYVKCGYCGARLMVKAQQAGLNSCWVTNTYNAKKCPVTLA 
PDEELVGVIAIGYGTTDGTQHKSKSMERLCKPCSDKWFLDGMNAAVLAPTGLNRQDFFIE ANGNTVSIRTKDNHPMSQINTGIVKYHFEIGAGRENFNWE >gi|157101634|gb|DS480690.1| GENE 239 213887 - 214687 890 266 aa, chain - ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 4 95 7 98 117 70 45.0 4e-12 MLTIGQMSKVCGVSVKTLRHYDKIGLLKPQRIDEINGYRYYEDPQIGTMLLIGRLKRYGF SLTEIQALLTIPDSRELLRQLYKQRFRLERQMEHISITIREMGYHLEEFERTGDIMSYQN NYEIQIKEAEEQVLVTRRNKMSVEEFGTYYGKIYEKIAREHMTINGVVMAIYHDQEFDPA YSDIELGVGITERDKADFVMPGCLCATTIHKGAYSGLPDAYGAIVAWINANGYHMNGMPY EIYRKTQFDKLPPEEWETEIFFPVKK >gi|157101634|gb|DS480690.1| GENE 240 214819 - 215613 550 264 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939062|ref|ZP_02086413.1| ## NR: gi|160939062|ref|ZP_02086413.1| hypothetical protein CLOBOL_03956 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03956 [Clostridium bolteae ATCC BAA-613] # 19 264 1 246 246 504 100.0 1e-141 MGRPAARKNYVFLAVPKRMFPDRETMDRFLDQFTDPQAVEDDGHVAEGIFNFYFFMDQRA WSHAWIQGMQMGLRVKKLYGAKKGKVLRVTGIVAALCGMSIGMIADSAILTAVFLTMLLL SYMRGHGLSESFYKKQMMSGWMPSDGIGRWEISIGVNGIRMKRGLAFTEYSWEDYNCLAE TEDTFFFLNTETARGIECIPVPKWVFKDLGEMDAFLDFCRDKGVKWAGLDKTADLPKNDR MLYILIFMVLLAVVISGIIRAVYL >gi|157101634|gb|DS480690.1| GENE 241 215724 - 215843 97 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939063|ref|ZP_02086414.1| ## NR: gi|160939063|ref|ZP_02086414.1| hypothetical protein CLOBOL_03957 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_03957 [Clostridium bolteae ATCC BAA-613] # 1 39 1 39 39 69 100.0 9e-11 MEGEKVETGLEFQMTKGELAEYWLFRVFFCIGNAAGWDS >gi|157101634|gb|DS480690.1| GENE 242 215977 - 217071 1281 364 aa, chain - ## HITS:1 COG:L0358 KEGG:ns NR:ns ## COG: L0358 COG0180 # Protein_GI_number: 15672048 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Lactococcus lactis # 5 347 6 341 341 416 60.0 
1e-116 MGKIILTGDRPTGRLHVGHYAGSLKRRVELQNSGEFDEIYIMIADAQALTDNADNPEKVR QNIIEVALDYLACGLDPEKSVLFIQSQVPELCEMTFYYMDLVTVSRLQRNPTVKSEIQMR NFEASIPVGFFTYPISQAADITAFKATTVPVGEDQAPMIEQTREIVHKFNSVYGDTLVEP DILLPDNKACLRLPGIDGKAKMSKSLGNCIYLSDSEDEIKKKIMSMFTDPNHLRVEDPGQ VEGNPVFIYLDAFCRDEHFAAYLPDYKNLDELKAHYARGGLGDVKVKRFLNSVLQDELRP IRERRKEIARDIPAVYRILEEGSRRAEQKAAQTLAEMKRAMKINYFEDKELIAEQAERFR AEMD >gi|157101634|gb|DS480690.1| GENE 243 217302 - 218402 911 366 aa, chain - ## HITS:1 COG:CAC1015 KEGG:ns NR:ns ## COG: CAC1015 COG0564 # Protein_GI_number: 15894302 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Clostridium acetobutylicum # 4 351 3 311 318 192 32.0 8e-49 MQTLTVTQNEAGQRLDKLLTKYLNQAGKGFIYKMMRKKNITLNGKKCDGSERLEEGDQVK LFLSDETIEKFSVPDMGGYEKQPGDQSSTGASSHEKREKARGKGGEHQGSRDREGGRRLD IVYEDQHILVVNKPSGMLSQKAKDSDMSLNEYILNYLIDSGKLPISQLRTFKPSICNRLD RNTSGLVVAGKSLAGLQVMNEVFKDRSIHKYYQCLVAGEIKEKQLIAGFLKKDESTNTVS IFPLEVEDSVPIMTEYLPLSGNKTFTLLQVTLITGRSHQIRAHLASIGHPIVGDYKYGSR SLNDAVKKKYAVRSQLLHSWRLVMPETLPAPLKHLRGEEFTARLPVIFNTVMEGEGIGLP CESQGI >gi|157101634|gb|DS480690.1| GENE 244 218427 - 219191 839 254 aa, chain - ## HITS:1 COG:MT1596 KEGG:ns NR:ns ## COG: MT1596 COG0300 # Protein_GI_number: 15841011 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Mycobacterium tuberculosis CDC1551 # 5 253 13 259 267 126 37.0 5e-29 MKIAVVTGASSGMGRETIIQLWEHFKGFDEIWIIARRRERLDELDRQVGVPLRKFALDLT RERDRDVLLRALSARKPQVKFLVNAAGFGMIGQVEELGLKSETDMVALNCEALCAVTRMV LPYMECNSRIIQYASSAAFLPQPGFAIYAATKSFVLSYSRALNQELRSRRIYVTAVCPGP VKTEFFDIAESTGVIPLYKRLVMANPKRVVQKAIRDSIAGREISVYGITMKAFRLLCKMM PHRLLLAVMSFMND >gi|157101634|gb|DS480690.1| GENE 245 219197 - 221110 1820 637 aa, chain - ## HITS:1 COG:MA4618 KEGG:ns NR:ns ## COG: MA4618 COG1032 # Protein_GI_number: 20093399 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: 
Methanosarcina acetivorans str.C2A # 1 589 129 737 742 588 48.0 1e-167 MNRNDMSIRGWKQCDFVYVSGDAYVDHPSFGTAIISRLLEARGYKVGIISQPDWKDKKSI EVLGRPRLGFLVSAGNMDSMVNHYSVSKKRRKEDSYTPGGVMGKRPDYAVVVYCNLIRSA YKDVPVIIGGIEASLRRLAHYDYWSDKLKRSVLLDSQADLISYGMGERSIVEIADALDSG LDVKDITFIDGTVYKTKSLESVYDYKMLPDYEELLRDKKEYAKSFYVQYSNTDPFSGKRL VEPYGNQLFVVQNPPSKPLSQEEMDEVYALPYMRNYHPSYEELGGVPAIREIKFSLISNR GCFGACSFCALTFHQGRIIQARSHESLVEEARLLTEEPDFKGYIHDVGGPTADFRFPACE KQLTSGACPGRQCLFPEPCKNLRADHSDYIALLRKLRALPKVKKVFIRSGIRFDYVLADS NRKFLKELCEFHVSGQLKVAPEHVADKVLTRMGKPRNSVYRQFVKEYKEMNQRLGKEQYL VPYLMSSHPGSSMKEAVELAEYLRDLGYMPEQVQDFYPTPSTVSTCMYYTGYDCRTMEQV YVPVNPHEKAMQRALIQYRNPRNYDLVTEALKIAGRTDLIGYDKKCLIRPRNGNRDGGTA APAGKARGNGYKAAGQGAGEARRPKKTIRNIHKKKGT >gi|157101634|gb|DS480690.1| GENE 246 221128 - 221790 801 220 aa, chain - ## HITS:1 COG:CAC3231 KEGG:ns NR:ns ## COG: CAC3231 COG0637 # Protein_GI_number: 15896477 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Clostridium acetobutylicum # 7 215 6 214 215 187 45.0 1e-47 MKLNRKKAVIFDLDGTLVDSMWMWKAIDIEYLARFGLACPDDLQKEIEGMSFSETAVYFK ERFRLKESLDEIKNAWIQMSIEKYRKEVTLKPGARAFLEFISGKGLVAGIATSNGRAMVD AVLDSLDIRRYFKVVATACEVAAGKPAPDIYLNVAERLKVAPEDCVVFEDVPAGIQAGKN AGMTVFAVEDAFSLEMKAEKEQLADYYIRDYYELLDGAAG >gi|157101634|gb|DS480690.1| GENE 247 221865 - 222653 887 262 aa, chain - ## HITS:1 COG:lin2436 KEGG:ns NR:ns ## COG: lin2436 COG1187 # Protein_GI_number: 16801498 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Listeria innocua # 17 248 1 226 233 227 53.0 2e-59 MGGKGPMADKGRKNGPMRLDRFLAEMGYGTRTQVKDMVKKGRVRMNGEAVKDADRKVNPE SDIVEVDRSRVAYARMEYYMLNKPQGVVSATEDRKYPTVVGLIREALRKDLFPVGRLDID TEGLLLITNDGELAHNLLSPKKHVDKVYLAHVSGGLPEDAVKRFEEGIKLEDGTMTLPAE LKILKGAGPEEDAQEVLVTIREGKFHQIKRMFEALGCRVEYLKRISMGPLILDPNLEPGE YRPLTLQEEASIKDCGNRGNRV >gi|157101634|gb|DS480690.1| GENE 248 222690 - 222959 60 89 aa, 
chain - ## HITS:1 COG:no KEGG:Closa_1969 NR:ns ## KEGG: Closa_1969 # Name: not_defined # Def: Fmu (Sun) domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 89 366 454 454 135 67.0 5e-31 MGELKKGRFEPSQALAMSMKAGQFPNTVSFPAGDNRVLRYLKGETISLEGDEGPVKGWCL AAMEGFPLGWAKGTGMSLKNKYYPGWRWQ >gi|157101634|gb|DS480690.1| GENE 249 222931 - 224097 932 388 aa, chain - ## HITS:1 COG:SP1402_1 KEGG:ns NR:ns ## COG: SP1402_1 COG0144 # Protein_GI_number: 15901256 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pneumoniae TIGR4 # 6 302 2 280 280 220 41.0 3e-57 MTDVRQLPEAFLLKMQKLLGEEFGQYLESFKEEWKPGLRVNTLKLSPGELAELVPWNLEP VPWADNGFYYDGTLDGEVLRPSKHPAYYAGLYYLQEPSAMTPAAMLPVVPGDRVLDLCAA PGGKSTELASKLKGRGMLVSNDISYSRARALLKNLELAGAANICVTSEAPEKLAGVWPEF FDKILVDAPCSGEGMFRRDEDMVKDWNEKGPEYYVPIQRQILSQAAAMLRPGGYMLYSTC TFSVEEDEGNVAYVLEEFPQMQLCCLDLDKVPGACGGFGLPGCMRLFPHRLKGEGHFLAL MRKKGGDDGGKEILPPMDPGTARKRVRAVEKEKELDAFLRQSGAEWDYGRIVIHQDNVYY LPEGLAWNLPLRFLRTGLFFGRAKKRTV >gi|157101634|gb|DS480690.1| GENE 250 224156 - 225523 1395 455 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 425 1 425 447 285 38.0 1e-76 MTNNMTTGKPAKLILQFMIPTCLGNIFQQFYNLADSVVAGRFIGVDALAAIGSTTSLIFL VIGWLNGLTSGFSIMVAQHFGAGKYERMRRNVAMSIYLCLAFVLVMTIALEILNYPILRL MNAPDGIIDDTAGYMAVIYAGLFATAAYNGLAAVMRALGDSRSPLYFLIISAVINVLLDV AFIIWFGMGVEGCAYATVIAQAVSALLCFLYIVRKFDILKLSREDFRISFGIMGRLLSIG IPMGLQFSITAIGTIIVQGAVNVFGSVYIAGFSAAGKIQNIVSTVFVAFGATAATYVGQN RGAGRMDRVHQGVKSIQVMILVWSAVMILVIHLFGDMLIHIFIDASETEVMSAARTYFRA VAWCYPFLGSIFLYRNALQGLGYGLMPMMGGVFELIARAAIVAAVAGSFGYVGVCLADPA AWLAALVPIVPYYFYKISRISRKETHGGRITADVI >gi|157101634|gb|DS480690.1| GENE 251 225685 - 226974 970 429 aa, chain + ## HITS:1 COG:no KEGG:Closa_2898 NR:ns ## KEGG: Closa_2898 # Name: not_defined # Def: acyltransferase 3 # 
Organism: C.saccharolyticum # Pathway: not_defined # 77 402 6 334 356 122 30.0 5e-26 MNGKKLRYGLVFIVLTTLIYMLMELRYGLPIRPLRPKHYAAAAVLAACSCLIHSRRAAAS PGYAGELLLGCRNQPGRIVYMDYLRVLASFLVILVHVLEPAYALLPPHTFSRNVMAAAAG LGLSCNLLFMMLSGALLLGGREESVLQFYSRRFVRVLIPCFAYYLLYFFYVEGIFALSPG NWGSLIQSFLSNDSGQTPHFWLVYIILMFYVAAPFFRIMLRHMTEPMLEALTAVIFILHF IYTYGPLVHIQFAASSFLASWDSIFLLGYYCTTRSAMKHYRLFMTAGLLSGLAIAGSIMA PESLGPLVYNNAPPQMLFTCAVFLFFRKHGDRLFARIPTLLSAIGRYSFSILLIHWLVLH RIVGDVFGINGLSFGIAGGILVSFLLTLVISLALSFLYDNTVVLCMDRACEIFFGVLGRV GKRLGRFAK >gi|157101634|gb|DS480690.1| GENE 252 227120 - 229165 1936 681 aa, chain - ## HITS:1 COG:BH2055 KEGG:ns NR:ns ## COG: BH2055 COG1501 # Protein_GI_number: 15614618 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Bacillus halodurans # 12 669 3 654 657 678 50.0 0 MAYYTIPRYEGKTEGNRIIWETAGELVYIEPYGRDAIRFRSSKSLRIDEALNWTLEEPAS PEGVIIEADDEKACMTNGKIQVTITGDGTVTYRNTRTGKVLLEEYWIDGRVHTAPLRRAR EYRVTSGNQFKISLYFKAEPGEHFYGMGQDANDCFDLKGSTVELLQKNGKCTIPYTYSSR GYGFIWNNPAIGRAEFVNNHTMWHVQCAKQIDYVIIAGDTPGEINEKFTAITGRAPMLPE WAAGFWQCKLRYETQEELLQVAREYKRRGLPISVIVIDYFHWTMQGEWKFDPEKWPDPKA MVSELESMGIKLMVSVWPTIDPRSENYAYMREHNYILRGERGVDVVFMFFGPQTYVDTTH PGAREFFWSRAKKNYYDYGIRTFWLDEAEPEMRPYDYDNVRMYLGNGEEVSNIYCVGVAK AFYDGLKAQGEEVCNLVRCAWLGSQRYGVVLWSGDIASTFDSLRKQLKAGLNVAMCGIPW WTTDIGGFINGDPESEEFRELMIRWFEFGVFCPIFRLHGFRLPYPVRDILNPDGYCGSGG PNEVWSFGEEAYEIIRRYMYVREELKPYIMGQMKLASEDGTPVMRPLFYDFCGDKNVYDI GDEYMFGPDLLVAPVVELGARKRMVYLPEGCRWKDAGTGMVYDGGTRIEADAPLDTIPLF LKEDARLSLQMKPSTAETMHR >gi|157101634|gb|DS480690.1| GENE 253 229166 - 231634 1708 822 aa, chain - ## HITS:1 COG:BH1905 KEGG:ns NR:ns ## COG: BH1905 COG1501 # Protein_GI_number: 15614468 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Bacillus halodurans # 207 805 157 772 773 416 36.0 1e-115 MEKSLGEIMHGLSEENIAREEELFENMPVFPRISRSKNLTMLLVHRVKSCTPVPKGCDIL 
AEGLYVDENYTYRYLDYRHVFTEKPNRSTLSIRLRFVNENTLRVTMAEGFCVPENRTEMI GDMEETECSVTCSESEGQIVMATGCMTAVIQKSPWNLSVKNRDGEVIYRQFGRDCHSFMP YEICPMGFLFDRNTKEQYACEAAWSDPYEEFYGLGENFTSVSRKGRLFDLWNTNSLGVNT ERGYKYIPFYMSSRGYGIFYNTSRKIRCDLGATLSKASYTMVEGALLDYFLMTGDTIKDI IPAYHRLTGLPAVPPKWSFGIWMSKISYGTREEVERVAGQMRSSRLPCDVIHIDTDWFTE NWVCDWKFDEKKFPKVEEMIEKLHQDGYKLSLWQLPYIERGKTSYEVYDEAAANGYFAGN PDGDLQFPHGLIDFTNPEAVHWYKEKLIKPLLRKGVDVIKVDFGESAPAFFKYAGADGRD VHNLYALLYNKAAYEAAKEEFGEDKALIWARSAWAGSQKYPVHWGGDAGTDFGSLASSVK GCLNLSLSGIPFWSSDIGGFWFESSPELYIRWLEFGMFCSHARFHGFYSREPWDFGQQAV DVYRKYAGLRYRLIPYIYNQALAVREEDTLIHRPVFYDYSSDLSAAGLDTQYLFGKELMV IPVLNEDNSVRIYLPRGQWTDFYDDTVEEGGKWMEQVVPLDKIPVYVRENAIIPIGPEME YIGQKEDDQWEIHCYPMEGKGRLCSYEHGFTVEMYASADRVKIDCTRTDMNLTFVLHNIR PAGVQADGQDIGFRMDHKRTYIHMGNRMQDHDIHMVIEKGEQ >gi|157101634|gb|DS480690.1| GENE 254 231648 - 232490 858 280 aa, chain - ## HITS:1 COG:SMb21593 KEGG:ns NR:ns ## COG: SMb21593 COG0395 # Protein_GI_number: 16264781 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 11 279 6 275 275 172 37.0 1e-42 MKDRKRMEGRRKAALTAAGILVTCVFLFPLYWMVATSLRPPGESFSNPSFFPSRIVFDSY ILKNEHGISLFTYLKNSVVVSFGATALTLVLGVPASYGLARFKSRMISVLLFIFLVAQML PSSLILTPLFVNFNKLGLINNHLGVILSDATITIPFVVIILRTYFKDIPKELEEAAIIDG CGPLGTFIRIMVPISYPGIVTATSMALFMSWGDMVFSLTFLNQEKLKTLPLILYKAMGEL GVRWEILMAYSTVVVLPIVIVFICLQKYIVSGLTAGSVKG >gi|157101634|gb|DS480690.1| GENE 255 232503 - 233405 796 300 aa, chain - ## HITS:1 COG:AGl1012 KEGG:ns NR:ns ## COG: AGl1012 COG1175 # Protein_GI_number: 15890623 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 24 297 24 297 300 190 37.0 3e-48 MGSKKGYFFKKKLTLFAFVLPGAAFMTIMIIFPIISNIIMSFKNVDLMNFATGDSHFVGL KIYQKICRTEVTWIAVKNTLVFTLWCLVFQFPIGFLMALFFGQDFKGAGPLRAVNVVAWM IPMVAVAGVFKYMFNSDIGIMNRILMGLHLIQKPVEWLAHGNTAMAAIIIANIWKGVPFN 
MILLATALTTVPQDIYESASIDGATRIQRFFYITVPLLRPAIISVLTLGFIYTFKVFDLV YVMTGGGPGSSTEMLSTLAYRYSFSEYNFSQGAAVANILFVMLMAVGMIYIHIVTREDEN >gi|157101634|gb|DS480690.1| GENE 256 233453 - 234754 1445 433 aa, chain - ## HITS:1 COG:SMb21595 KEGG:ns NR:ns ## COG: SMb21595 COG1653 # Protein_GI_number: 16264783 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 48 431 23 409 410 187 32.0 3e-47 MKKRLFTRGIALGLAAVMLAGCSGSGTSTSQGSTDAGTTGQTAETQASQGQEQTTIEMWY HISPDQATLLLQMIDEFQELHPGIKVETQSVAFAEIKKQLSIGVAADQLPDVTLCDTVDN VSFAAMGAAMDITSEVEAWGELENYFEAPVNSARYEDKYYGVPYYSNCLAIMYNKDIFDE MGLEYPDSGWTWDDFKDMTAKTTTADHYGLTMSLIKSEEGTFDVLPFIWQAGADYDSLDS EGAGEALTMINDFYQNGYMSKELISMTQADMCASLFATGKSAMMVAGSWLNTNIQNENPD LNYGVVTFPNYKNAASPIGGGNIMMMKDDNREASWELMKFLSSKENSRKFCEDAGYISPR EDAVAESTLWLDDPILSVYTEQLKLAKARGPHPKWPEISSAIQFAYQDVLSGAKSVDDAL KQAAAEVAEIISE >gi|157101634|gb|DS480690.1| GENE 257 235094 - 236764 1316 556 aa, chain + ## HITS:1 COG:BH1122 KEGG:ns NR:ns ## COG: BH1122 COG2972 # Protein_GI_number: 15613685 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 14 551 52 568 586 153 25.0 8e-37 MLFTKNHSLLADNLQLTQNSINNVLTDYKRILFQISTDVTCMDNILELSKVSPDSAEYRR ISESLETYIKANILMYPEIQALCVISTTGTPYIYVQKRQKTQPIIDYFEANKEELNQLLL SSTRSIIGSVHDGSPYYDPEHPVFFLGSRTIHYEKMKITGSILLFISPDKMNSAINNPDS QVYPFTDKLLVDSSGRLICSKDRNTGKILGSLENYKAIDWDSLSSDSGDMQGDYLVSQIP LEHFDLHLINIVDYRQMTKDLEALWTTITLAISAILLVSMLVAYVLCHKFIFSIENVAAK MNLFDEHHLDVEIRNMSRNELRIIESSFNRMLGQIRGLLDENKQQYKKICEAELKSLELQ INPHFLFNTLDSICWTAAQENCLTVSEQLNKLAAILRHTVYNMNCVVPLQDEICWMQNYL DLQKSRFHGRFSYDIHDLTKGRRIYIHKLLLQPFLENALIHGFENLPRKGRLEIACRITG QSHLLLQVSDNGCGIPEDKVREINLLFTTGRSAFTGIGLTNIAYRTKGYYPHSRIFVSSS PMGTCFKIFIPMNEME >gi|157101634|gb|DS480690.1| GENE 258 236770 - 238200 1129 476 aa, chain + ## HITS:1 COG:BH0793 KEGG:ns NR:ns ## COG: BH0793 COG4753 # Protein_GI_number: 
15613356 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 470 1 502 508 105 22.0 3e-22 MFRLLIVEDEENSRKGLKATLAPIPNLEIDTACDGREGCLKAIRWNPDIIISDIHMPVWD GLVMTEKLCKKGFAGTIYLLTGYAEFEYARKALQYHVEDYILKPVVPSKLRNLIEVKLKD LNKKRNSQNPQLCHLLSDEDSVLLTERLAPFCYTDYFLAVVYMEREHHLPLEVKEAVIEE RGHYILTLPDKHYRGILIGFTNHMINHGKIDRISSLLEPYEHLTCIYETRQAGFISNWLL AFEQLRSAIPWTITCRSRFLSYDTYMEQEAGELNEDVFFKKDLQRMLCGGDFDGCRKILL KKISQMQANACHPRHILAAAVSGLVKFDSKQAYLEAVNQMAAARTMHEIRTCIDSYYETC KGRPASAGYSPLIQKALHVIDKSYKEPISLNSVAEQLNITPQYLSRLFMKEMSRSFVDYL TSYRMERAKSLLQDTNMKINMICVQVGYQDAKYFCTLFKKITGVTPNQYRASSPNM >gi|157101634|gb|DS480690.1| GENE 259 238190 - 239053 708 287 aa, chain - ## HITS:1 COG:BH3436 KEGG:ns NR:ns ## COG: BH3436 COG0739 # Protein_GI_number: 15615998 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Bacillus halodurans # 159 281 212 328 337 90 43.0 3e-18 MPLILILLLAVFQCSITNYLIANPDYYQLGPYSWESDDFRAMSLGDAVAGLDNLDPDMVA TLMVEHDYDLTGIKDTRYNNRLLAAARPADYRKLKHAYETVLGDLEYFPIPESRNKDTPE VAFENGWMDGRGYVSGGENGGGDNSRENGGRKNSPKRRHEGCDIMGSKMPPGYYPVVSMT DGIIEKIGWLEMGGWRIGVRAPGGAYLYYAHLYSYAGDFKEGDRVKAGELIGYMGDTGYG KTEGTRGNFDVHLHVGIYIKTDHNEEMSVNPYWILKWLEKKRLVFTY >gi|157101634|gb|DS480690.1| GENE 260 239100 - 240053 611 317 aa, chain + ## HITS:1 COG:BH2588 KEGG:ns NR:ns ## COG: BH2588 COG3314 # Protein_GI_number: 15615151 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 7 311 12 341 411 92 24.0 1e-18 MRMKKSIGGVLSVLCLLMLLFNPGLALEGARQGLVLWGNVVLPTLLPFMICSGVIVAMDA IHILTGPFKPILSGILSLSDQGSFCLMSGLLCGYPMGAKTTSDFLDQGRLSVQEGKYLLA ISNHPSPMFVLGYVMAQIALIPSMASPCPLWAVLAALYLPILPISILARCCYHYNRQTRE EACPLAAQAAPEKTACFSFDTHMMSCFETMVKIGGYIMLFSILALYLTVFPFQMPPLLRP ALLGSVEITTGIQMIASSVPGSMGALLIIGSAAFGGFSGIFQTKSVLKNAGLSIRHYMLW KMLHSALSCLIFYVSLC 
>gi|157101634|gb|DS480690.1| GENE 261 240077 - 240715 798 212 aa, chain - ## HITS:1 COG:no KEGG:Closa_1966 NR:ns ## KEGG: Closa_1966 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 192 1 181 193 189 71.0 5e-47 MMSRIEQLISEIEEYIDSCKFQALSNSKIIVNKEELEELLVELRLRIPDEIKKYQKIISQ QETILNEAQAQADAMLDDAKKQADSMVAQASEQTSEMINEHEIMQRAYAHANEVVEQASI QAQAIVDSAVNDANGIRQSSIQYTDDMLRSLQTIINHTMEGARGRFDAFMSSMQSSYDIV SSNRNELSGGIVRPEEGEGQAPQEMDDAQQQA >gi|157101634|gb|DS480690.1| GENE 262 240712 - 241206 406 164 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 2 159 4 161 164 160 46 6e-38 MRTAVYPGSFDPVTLGHYDIIERTAKMVDKLIIGVLNNKAKCPLFSAQERVNMLKEVTSS LPNVEIQSFEGLLIDFVRRSHADIVVRGLRAITDFEYELQMAQTNRVIAPEIDTIFLTTN LKYSYLSSSIVKEIAEYEGDISEFLHPVIAARVREKLEERRRLT >gi|157101634|gb|DS480690.1| GENE 263 241207 - 241764 266 185 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 150 13 162 199 107 40 1e-21 MRVIAGSARRLLLKTLDGLDTRPTTDRIKETLFNMLQPELADCMFLDLFSGSGAIGIEAL SRGAGLAVMIENNPKALECIRENLSRTKLEERAMVMGCDVITGLKRLEGKNYKFDIVFMD PPYNHEYERLVLDYLNHSPMVTEETLIVIEASRETEFGWLEESGWHMIKSKEYKTNKHVF VGRGE >gi|157101634|gb|DS480690.1| GENE 264 241883 - 242560 757 225 aa, chain - ## HITS:1 COG:CAC1537 KEGG:ns NR:ns ## COG: CAC1537 COG2013 # Protein_GI_number: 15894815 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 217 5 236 260 162 40.0 3e-40 MDYRMLGDILPAVEIRLQAGEAMYTQSGGMAWMSDGFTLDSNVKGGLMKGLGRMFSGESL FMATYTASRPDSTIAFASTVPGKILAIDTAKTSLICQKGAFLCAQTTVEINTVLTKKFTA GFFGGEGFILQQIQGSGMAFLEVDGDVVERVLAPGEVIKVDTGNVFAFEPGISYEIETVK GVKNILFGGEGLFLTKLTGPGKVYMQTMNIAEFTGRIAQGLPASK >gi|157101634|gb|DS480690.1| GENE 265 242730 - 243728 683 332 aa, chain - ## HITS:1 COG:L15267 KEGG:ns NR:ns ## COG: L15267 COG1073 # Protein_GI_number: 15673556 # 
Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Lactococcus lactis # 38 330 27 319 320 253 44.0 4e-67 MKKSMTGAEVGTAAGITAGIAAGITAGISTAIAGLLAGEYFFHIALDASTDKSALLKPAP GTEKEEKVPGEERNAPEAVTMLKAQYEEREIVSFDQLRLHGYLCRQKKPNKKWAVLIHGY DDSGMWFGREALAFYQAGFNLLLPDARGHGKSRGTYVGMGWHDRLDIKEWICWLVRQYPD SEIVLYGVSMGAAAVMMAAGEKLPSNVKAAVEDCGYTSAWSVLSYQMKSQFHLPAFPFLY CADFVTRIRAGYGMKEADALKCVAGTRLPMFFIHGTEDRFVPFEMMKELYDACRSEKECL AVAGAAHVQAALVGGAEYWDRVFRFVGRYLER >gi|157101634|gb|DS480690.1| GENE 266 243954 - 244157 237 67 aa, chain + ## HITS:1 COG:no KEGG:Closa_1963 NR:ns ## KEGG: Closa_1963 # Name: not_defined # Def: small acid-soluble spore protein alpha/beta type # Organism: C.saccharolyticum # Pathway: not_defined # 1 67 1 67 67 98 95.0 1e-19 MSNNSNRTNVPEAKEAMDRFKMEVANEIGVPLTNGYNGNLTSAQNGSVGGYMVKKMIEAQ ERQMAGK >gi|157101634|gb|DS480690.1| GENE 267 244335 - 244850 658 171 aa, chain - ## HITS:1 COG:AF2267 KEGG:ns NR:ns ## COG: AF2267 COG0778 # Protein_GI_number: 11499848 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Archaeoglobus fulgidus # 1 159 1 156 174 89 34.0 3e-18 MNETLNTILTRRSTRKFSDKPIPEDEMELIVQAGLHAPSGMGLQTWQFTVVRNREKIQML AAAVEKTLDRQGYDMYQPQVLVIPSNEKESRFGREDDACAMENMFLAAHSLGIGSVWINQ LQGICDEPAIRRILTSFGVPADHVVYGMAALGYAADTKEDKERTGKVVYVD >gi|157101634|gb|DS480690.1| GENE 268 244955 - 247012 2275 685 aa, chain - ## HITS:1 COG:CAC1736 KEGG:ns NR:ns ## COG: CAC1736 COG1200 # Protein_GI_number: 15895013 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Clostridium acetobutylicum # 7 654 7 650 678 501 42.0 1e-141 MLNRQPVNSLKGIGEKTGKLFEKLGVVTIDDLLSYYPRAYDAYEPPVSIGQLKEQAVMAV ESALVRGADLLRLGHMQIVSVQLKDLTGSLQVSWYNMPYMRANLKTGVTYVFRGRVVKKR GRMVMEQPEVFTPDSYQALAGSMQPVYGQTRGLSNKTIVRAQQMALEMRKMEREYMPPDL RRRYELAEINYAMEHIHFPADQTELLFARKRLVFDEFFMFLVGVRRLKEHREDRHSAFMI KESEEVAAFQSSLPYALTGAQERALREVYGDMGSGLVMNRLIQGDVGSGKTIIAILALLE 
AAYNGYQGALMVPTEVLARQHFESMTGLFEKQGIEKVPVLVTGSMTAKEKRLAYAKIASH EADIIIGTHALIQEKVVYDNLALVITDEQHRFGVGQRELLSSKGQEPHVLVMSATPIPRT LAIILYGDLDISVIDELPAGRQTIKNCVVDPGYRPKAYAFIERQVAEGHQAYVICPMVEE SEMIEAENVLDYTKALRKALPPSVTVEYLHGKLKGKEKNAIMERFAAGEIHVLVSTTVIE VGVNVPNATVMMIENAERFGLAQLHQLRGRVGRGKDQSYCIMVNCSRDQGAGERLDILNR SNDGFYIASEDLKLRGPGDIFGLRQSGDMEFKLADIFTDANILKKVSEEVNRLLDEDPQL EKDEHRELKRKVEDYLGTNYEKLNL >gi|157101634|gb|DS480690.1| GENE 269 247036 - 247908 1014 290 aa, chain - ## HITS:1 COG:BS_ydhR KEGG:ns NR:ns ## COG: BS_ydhR COG1940 # Protein_GI_number: 16077653 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Bacillus subtilis # 2 290 1 290 299 337 54.0 1e-92 MLIGALEAGGTKMVCAVGKEDGTILEQVSIPTTTPEETIPKLIEYFKDKEIEALGVAAFG PVDVKPESETYGYILDTPKLAWRHKDLLGRLKAELKVPMGLDTDVNGSCLGEVTYGCARG LDSVIYITIGTGVGVGVCINGKLLHGMLHPEGGHILLARHEDDSKGGICPYHKNCLEGFA SGPSIEARWGKKAVELVDRPEVWEMESYYIAQALVDYIMLLSPRKIILGGGVMHQEQLFP MIRQKVREMLNGYIKTKELEDMDSYIVPASLHDDQGIMGCIKLGLNALQA >gi|157101634|gb|DS480690.1| GENE 270 248068 - 249036 301 322 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762640|ref|ZP_02169704.1| ribosomal protein L33 [Bacillus selenitireducens MLS10] # 6 316 7 313 323 120 29 8e-26 MKYVLGFDIGGTKSAILLARPGKENVEFLERKAIPTHGTWKEVLGCLADKGKAFLESHQI SGKECCIGISCGGPLDSERGVILSPPNLPGWDQVPIVSYLEQRLDMDARLKNDADACALA EWRYGAGRGCRHMIFLTFGTGLGAGLVLNGRLYEGACGMAGEAGHVRLAEEGPVGYEKAG SFEGFCSGGGIARLARSMAEDALKAGRPVSYMKEGGLDDITARDVAEAAEMGCADAGEVL AVSGRYFGRGLAMLVDILNPERIVAGGIYARAGRFLKDEMEKELRREALPASVNACCILP CGLGEQIGDYGAVMAALEMWEE >gi|157101634|gb|DS480690.1| GENE 271 249167 - 249871 837 234 aa, chain - ## HITS:1 COG:CAC2831 KEGG:ns NR:ns ## COG: CAC2831 COG0670 # Protein_GI_number: 15896086 # Func_class: R General function prediction only # Function: Integral membrane protein, interacts with FtsH # Organism: Clostridium acetobutylicum # 22 234 16 231 231 108 37.0 1e-23 MNENMMNRSYEVQVEGDSLGRYTAKTFGWMFAGLLVTFIVAVGFAMTGTIFYIYSFRYWP 
LALLAAELGVVIYLSARIEKMSVGMARSLFFLYAALNGIVFSAYLFAFSLEILWMVFGIT ALFFGIMALIGWFGNINFTKLRPFMLGGLVFLFFYWIFAMFIDLSAFDTIVSTVGIFLFL VLTAYDVKKIQAFHAYYGQAPEMAAKASIFSALQLYLDFINLFVYIMRILGRRK >gi|157101634|gb|DS480690.1| GENE 272 249892 - 250500 657 202 aa, chain - ## HITS:1 COG:no KEGG:Cthe_0317 NR:ns ## KEGG: Cthe_0317 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 1 195 1 215 223 107 32.0 3e-22 MARWTRDEELNQPADFVQFMMEDFLTKNGFKKKTHKGQEVWQEGDGFLTMARFIRYEYND GSFHLEAWIGKKRENPIKGFVGAMPKKMFRESLEELIRLLHQPLPQGQEDTGNKVITIQV PDHGKCAVPAMILSILGIVVGLLLSPLGGIVLGGLGITFGQKARSSSKNNIGRAAVILGT IAMVAAIIGYIINLIFAFTFIF >gi|157101634|gb|DS480690.1| GENE 273 250535 - 252259 1983 574 aa, chain - ## HITS:1 COG:CAC1735 KEGG:ns NR:ns ## COG: CAC1735 COG1461 # Protein_GI_number: 15895012 # Func_class: R General function prediction only # Function: Predicted kinase related to dihydroxyacetone kinase # Organism: Clostridium acetobutylicum # 5 574 5 547 547 443 43.0 1e-124 MGISTIDAGMLKNAFLAGAKGLEAKKDWINELNVFPVPDGDTGTNMTLTIMAAAKEVAEL ENPTMDQLAKAISSGSLRGARGNSGVILSQLLRGFTKEIKAVEEIDTTILANAMVRGTET AYKAVMKPKEGTILTVAKAMADKGLEMASQTDDIEEFVKQVIEYGDYVLSQTPEMLPVLK QAGVVDSGGQGLMQVVKGAVDGLLGRTVDFSLDTVPDSGNRPAAGEKAARGAARTDIDTA DIKFGYCTEFIINLEKVYTDKDETELKSYLESIGDSLVVVSDDEIVKVHVHTNHPGLAFE KALAYGSLSRMKIDNMREEHQERVIQDSERLAREQAAREQEDKDKATGEEAPSERREYGF IAVSSGEGLSDIFRGIGADCLIEGGQTMNPSTEDMLKAIERVNAENIFILPNNKNIIMAA QQARDLTEDKNIIVIPSRTVPQGITALVNFMPDLGPEENTRTMTEEMGNVRTAQITYAVR TTNIDGIDIEEGDIMALGDHGILAVGKSVDGVALEAMKEMLDDESELVTIYYGADVSEHE AKVLEEQAQEQYPDKEIELQYGGQPIYYYLISAE >gi|157101634|gb|DS480690.1| GENE 274 252273 - 252629 465 118 aa, chain - ## HITS:1 COG:BS_yloU KEGG:ns NR:ns ## COG: BS_yloU COG1302 # Protein_GI_number: 16078646 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 118 1 119 120 93 44.0 8e-20 MKGRISNKMGEIQINPDVIALYAGTIAVECFGIVGMAAVSMKDGLVKLLKRESLTHGINV 
DINDNKISIDFHVIVSYGVSISAVSDNLIESVKYRVEEFTGMEVEKINIYVEGVRVID >gi|157101634|gb|DS480690.1| GENE 275 252799 - 252984 294 61 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227872680|ref|ZP_03991010.1| possible ribosomal protein L28 [Oribacterium sinus F0268] # 1 61 11 71 77 117 90 5e-25 MAKCAICEKAAHFGIQVSHSHRRSNKMWKANVKSVRVKVNGGTQRMYVCTSCLRSGLVER A >gi|157101634|gb|DS480690.1| GENE 276 253208 - 253345 182 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRRICGLMLFCFGLGMAVLLFIPETIFTFLFVIGCLVLGYNLFCC >gi|157101634|gb|DS480690.1| GENE 277 253415 - 253843 520 142 aa, chain - ## HITS:1 COG:no KEGG:Closa_1955 NR:ns ## KEGG: Closa_1955 # Name: not_defined # Def: Sporulation protein YtfJ # Organism: C.saccharolyticum # Pathway: not_defined # 1 137 1 135 136 150 64.0 1e-35 MAENQFTSTIDTLFRGMDQFVNTKTVVGEPVQVGDTIIVPLVDVTCGMAAGSFAEGVKQK QRGGGGMSAKMSPSAVLVIQGGVTKLVNVKNQDAITKVIDMAPDFINKFLGGKPAVSEET VQTAKDTAATYDTGKAEHMDVK >gi|157101634|gb|DS480690.1| GENE 278 253927 - 254982 791 351 aa, chain - ## HITS:1 COG:no KEGG:Closa_1954 NR:ns ## KEGG: Closa_1954 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 36 351 36 329 329 174 33.0 4e-42 MIHILLVILKIIGIVLLVILGLLLGILLAVLFIPVRYKAEGSYHGQLLLSVCATWFFHIL AFRAVYEGEGLVCSLKVLGFRLWHNREKDPARDLEDGLETILGDEEQSLYEELQQDEEHY RKSQEEQAHGGQSVGEQTHGGQSGGEQTRGGRSVEEQAHGSQLHGERAGEQDANRKDSRE EEDETGPESGKKGPGIMGKIKGFPGRISQSIRHILEKLKFSFAKICDKLKGIREFVQEKK MWLEDEKNQASLKLLYRQTKRLLVHLWPRKGRCTVTFGFDDPYTTGQVLQAASLIYPFYH RQLFLYPVFDEKILDAEGSLKGRIRLSVILWLVLQVLFDGHTRRMLKGFLK >gi|157101634|gb|DS480690.1| GENE 279 254979 - 255302 292 107 aa, chain - ## HITS:1 COG:no KEGG:Closa_1953 NR:ns ## KEGG: Closa_1953 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 101 5 103 109 97 45.0 1e-19 MRWWEHLYMGDRAGRHRARVLMGIRERKLMPDTYVITLPQSGNHILDIRPVMLLTEEERK AEDFLILGVAEGYGEACEVVRTMVDDMYRHTDGFNWKSYMAYLGEAR >gi|157101634|gb|DS480690.1| GENE 280 255367 - 256707 1370 446 aa, chain - ## HITS:1 
COG:no KEGG:Closa_1952 NR:ns ## KEGG: Closa_1952 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 439 1 436 440 285 37.0 3e-75 MESRRRKKQISAACHYAYTRLALNYYMYYEVILEGDLLSSRLERLSKRYHGLLKDFLEGS LKPEELNGLRDETASAMEANTAYTDAFQAYEYVLNRLEGRFEPQLMGNRNVQPDEEEWTA EIMAYITDTDEAMVVNERIQSVVGQLPIRLTRNKFFAMVEEGLSVYKGGPVSSLDDMLYI LRSEAMLARPEDMETGYEDLHSILREFERADFKAVTAEEYRHLTAEMERAGQILLDTTGE LMLFMDLINDLYVLMLSRRWAMMEVSEESLLQELLSKVQSLFEEGAVTAIPQELTDKLSM LEGKQETYFEQWMRLATEVKSAADSADEDGEILRKIELLMSDSSFMSLDRPEDKADQKVD EELMAGKLERFFGELSAAWEGKPKVLVRAVMSKILSRLPVFFDSLEEVRLYVEGSLRSCS DVKEKEISMGLIRMLMEQDKFDQEDY >gi|157101634|gb|DS480690.1| GENE 281 256770 - 258014 321 414 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 254 405 746 897 904 128 44 4e-28 MTVELISVGTEILMGNIVNSNTQFLAEKCAMLGFDLFFQVTVGDNYDRMMSVVRQALSRS DILILTGGLGPTEDDLTKEVCADAFGMPLVEDAHTREHLKEFFRNNIYRHIPDNNWKMAT VPQGALVLDNANGMAPGLILEKEGKTAILLPGPPNELYPMFSRQVFPYLQQKQNAVLVSS MVKICGYGESQVEDMILDLIDAQTNPTIATYAKTAEVHLRITARAGSREEGMALIAPVEK EIKNRFKDAVYTTEEQVSLEMAVADLLKQNHLTMCTAESCTGGMIAARMVNVPGVSGVFM EGMVTYSNEAKMRLLGVREDTLKAYGAVSEETAREMAEGGVNTSGTDLCVATTGIAGPDG GTDEKPVGLVYMACCLRGKTCVRRYQFKGNREKVREQAMMKALDLARLCILANK >gi|157101634|gb|DS480690.1| GENE 282 258016 - 258558 379 180 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 5 175 484 669 904 150 45 1e-111 MNLPNKLTIIRVCLIPFFVAALLFDHGNNYTMRIVANVLFIVASLTDLFDGKIARKYNLV TNFGKFMDPLADKLLVCSALICLIELGQLPAWVVIIIISREFIISGFRLVAADNGVVIAA SYWGKFKTTFQMAAVILMIFNIPALTLVTNIVVVIAVALTIISLVDYVAKNIKILTEGGM >gi|157101634|gb|DS480690.1| GENE 283 258542 - 259897 816 451 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|228000795|ref|ZP_04047796.1| SSU ribosomal protein S12P methylthiotransferase [Brachyspira murdochii DSM 12563] # 1 445 1 438 440 318 39 2e-85 
MENIKLFCVSLGCDKNLVDTEKMLGLLNGRGITFTDDETEADVILVNTCCFIGDAKEESV NTILDMARYKEEGNLKALLVAGCLAQRYKQEILDEIPEVDAILGTTSYEEVAKVLDEVLA GRTGTHLSCFHDLSELPQTEVRRVVTTGGHYAFLKIAEGCDKRCTYCIIPYLRGPYRSVP IEQLVKEARQLAEAGVKELILVAQETTLYGRDLYGEKCLPRLLRELAKIPGIYWIRIQYC YPEEITDELIETIRTEEKVCNYLDIPIQHASDRILKRMGRRTNKEELMERIGKLRQEIPD IAIRTTLISGFPGETESDHEELMDFVNEMEFERLGVFAYSAEEDTPAFSYPDQVPEEVKE ERRDEIMELQQDIAFEHCENMVGRVLTVLIEGKVVDEPAYVGRTYMDAPNVDGLIFVNGD VELMSGDFVRVKVTGAAEYDLIGEVYDESAQ >gi|157101634|gb|DS480690.1| GENE 284 259994 - 260257 319 87 aa, chain - ## HITS:1 COG:no KEGG:Closa_1948 NR:ns ## KEGG: Closa_1948 # Name: not_defined # Def: DNA-directed RNA polymerase, omega subunit # Organism: C.saccharolyticum # Pathway: Purine metabolism [PATH:csh00230]; Pyrimidine metabolism [PATH:csh00240]; Metabolic pathways [PATH:csh01100]; RNA polymerase [PATH:csh03020] # 1 84 1 83 84 104 70.0 1e-21 MLHPSYTDLINVVNSDIEPGEQPVVQSRYSIVIAASRRARQLIAGEDPMVAGAAGKKPLS IAIDELYHQKVKILPEEETTEEEEQNA >gi|157101634|gb|DS480690.1| GENE 285 260303 - 260935 679 210 aa, chain - ## HITS:1 COG:BH2512 KEGG:ns NR:ns ## COG: BH2512 COG0194 # Protein_GI_number: 15615075 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Bacillus halodurans # 3 188 4 189 204 195 53.0 6e-50 MNQQGILVVVSGFSGAGKGTLMKELLKRYDNYALSVSATTRQPREGEKDGEDYFFVNREY FQQMIEEGRLVEYAQYVNHYYGTPRDYVEKKMAEGKDVILEIEIQGALKVKKRFPDALLI FVTPPSAGELRRRLVGRGTETIEVINARLRRAAEEASGMEAYDYLLINDEIDACVEQMHQ LITLQHSKTCYHLDFLSRMREELYHLDDRQ >gi|157101634|gb|DS480690.1| GENE 286 260979 - 261857 1119 292 aa, chain - ## HITS:1 COG:CAC1716 KEGG:ns NR:ns ## COG: CAC1716 COG1561 # Protein_GI_number: 15894993 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Clostridium acetobutylicum # 1 292 1 292 292 236 45.0 4e-62 MLKSMTGFGRCENVTDEYKISVEMKAVNHRYLDLSIKMPKKFNYFEASIRTLLKKYIQRG KVDLFINYEDYTEGNMCLKYNRALAAEYMEYFNRMSDEFGIQNDVKVSALAKFPEVLTME QVPDDEEHLWEILSEALKEAAVKFVESREIEGEHLKKDLLGKLDYMETLVDFIEERSPQI 
LTEYRNRLEDKVKELLSNTAMEESRIAAEVTIYADKICVDEETVRLRSHIHNTRNELLSG ESVGRKLDFIAQEMNREANTILSKSTDLSISDKAIALKTEIEKVREQIQNIE >gi|157101634|gb|DS480690.1| GENE 287 262184 - 263929 1673 581 aa, chain + ## HITS:1 COG:BH2516 KEGG:ns NR:ns ## COG: BH2516 COG1293 # Protein_GI_number: 15615079 # Func_class: K Transcription # Function: Predicted RNA-binding protein homologous to eukaryotic snRNP # Organism: Bacillus halodurans # 1 574 1 559 570 363 37.0 1e-100 MAFDGITIANLVHEFKEALGGGRISKIAQPEKDELLIAIKNNRENYRLQISASASLPLIY LTDKNKPSPMTAPNFCMLLRKHIGSARIVDISQPGLERIIQFELEHLDEMGDLCRKKLIV EIMGKHSNIIFCKEDGTIIDSIKHVSAQVSSVREVLPGRTYFIPQTVAKENPLSVTEETF RQTVGSSSMSIQKALYNHLTGISPIMAEEICHLASIDSDYGASELSETELLHLYHTFSLV MEDVKEGRFAPAIVFDEEGPAEYAALPLTCYEGGVYRSQSFRSMSHLLEEYYASRDTLTR IRQKSSDLRRIVQTALERNNKKYDLQLKQLKDTEKREKYRIYGELLNTYGYELTGGEKSF TCLNYYTNEEITIPLDTQLSAKDNAKKHFDKYNKLKRTYEALTDLTKETKAEIDHLESIS SALDIALAENDLVQIKEELMEYGYIKKRRSSDKKPKITSKPFHYISSDGFHIYVGKNNYQ NEELTFKLATGNDWWFHAKGIPGSHVIVKTEGRADLPDRLFEEAGSLAAYYSKGRDNDKV EIDYIQKKNIKKVAGAAPGFVIYHTNYSLVASPECGLNEVE >gi|157101634|gb|DS480690.1| GENE 288 264177 - 264353 285 58 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227872333|ref|ZP_03990687.1| ribosomal protein S21 [Oribacterium sinus F0268] # 1 58 1 58 58 114 96 6e-24 MSNVIVKDNESLDSALRRFKRNCAKAGIQQEIRKREHYEKPSVRRKKKSEAARKRKYN >gi|157101634|gb|DS480690.1| GENE 289 264508 - 267156 2838 882 aa, chain - ## HITS:1 COG:CAC1678 KEGG:ns NR:ns ## COG: CAC1678 COG0013 # Protein_GI_number: 15894955 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 5 882 2 881 881 934 53.0 0 MKEPQYRGVNELRRMFLEFFESKGHLAMKSFSLVPHNDNSLLLINSGMAPLKPYFTGQEI PPRNRVTTCQKCIRTGDIENIGKTARHGTFFEMLGNFSFGDYFKTEAIHWTWEFLTDVVG LDPDRLYPSVYLDDDEAFGIWNKEIGIPAERIFKFGKEDNFWEHGSGPCGPCSEVYYDRG EAFGCGKPGCTVGCDCDRYIEVWNNVFTQFDNDGHGHYTELEHKNIDTGMGLERLAVVVQ GVDSLFTVDTNKALLDRVCQLAHTEYQKDHETDVSLRIVTDHVKSCTFMISDGIMPSNEG 
RGYVLRRLLRRAARHGRKLGIEGRFLAGLSETVIELSKDGYPELEEKQAMIFKVLSEEEE KFNKTIDQGLSILGEMQDEMQKTGNKCLSGANAFKLYDTYGFPLDLTKEILEERGCTVDE DGFAAAMKEQKEKARKARKTTNYMGADVTVYQSIDPSVTTEFIGYDRLTADSRVTVLTTE EELVQALTDGQKGTIIVEQTPFYGTMGGQQGDTGVISNENGAFKVEDTIHLQGGKVGHVG VMTKGMFQVGENVTLSVCAHNRALTCKNHSATHLLQKALRKVLGDHVEQAGSYVDAGRLR FDFTHFSAMTPDEIKKVEELVNQEIQASLPVVTQVMTLEEAKKTGAMALFGEKYGDKVRV VRMGDFSTELCGGTHVPNTGTIAYFKIISEAGIAAGVRRIEALTSEGLMKHYQEVEEELQ EAARTAKTTPSALTSKIQSLLEEIKTLHSENEKLKSKLANDSLGDVMSQVKEVNGLKVLA ARVAGVDMNGMRNLGDQLKDKLGEGVIVLACEQDGKVNLMATATDSAQKKGAHAGNLIKA IAGLVGGGGGGRPNMAQAGGKNPAGIDDALKKAVETVEEQTK >gi|157101634|gb|DS480690.1| GENE 290 267339 - 268304 738 321 aa, chain - ## HITS:1 COG:no KEGG:Closa_2075 NR:ns ## KEGG: Closa_2075 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 302 3 277 287 102 33.0 2e-20 MKDKRIWIVIGCILFIGTGVTHYTKYYVNSQGAAMMAEITAGAEDTAAAPSFAPALGQQD TEAGGTDTGMAAAGRAAPATPAERRMTGDAGPGNTASGNTASGDAGLASAAPKGADSGDA AYAKAPAAKAAAEPFSSGQAAPAMADAAGETMASDTAGGAGEGAAASEENMPISPLTGSR TREYKVSLIVDYRKRLEDLDTQILKMREEETDSNVYSIKTSAETELKLWEGELNSIYNAL MEMLSQEDAAKLASEQQEWLKNRDARAAESSGRNSGSMEGISYAATLASLTRDRAYELAG RYEEANGVFYEEEDSSQPASP >gi|157101634|gb|DS480690.1| GENE 291 268517 - 268999 456 160 aa, chain - ## HITS:1 COG:STM0263 KEGG:ns NR:ns ## COG: STM0263 COG0328 # Protein_GI_number: 16763646 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HI # Organism: Salmonella typhimurium LT2 # 1 147 2 141 155 164 57.0 4e-41 MTKVQIYTDGSARGNPDGPGGYGTVLHFTDSSGQLHEKTMSGGYVRTTNNRMELMAAIVG LEALNRPCQVELYSDSKYLTDAFNQHWIDSWVAKGWNRGKSGPVKNIDLWKRLLKAKEPH QVKFIWVKGHAGHPENEKCDALATSAADGGNLMTDQGLEA >gi|157101634|gb|DS480690.1| GENE 292 269027 - 270628 1822 533 aa, chain - ## HITS:1 COG:PA3614 KEGG:ns NR:ns ## COG: PA3614 COG1236 # Protein_GI_number: 15598810 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted exonuclease of the beta-lactamase fold involved in RNA processing # Organism: 
Pseudomonas aeruginosa # 3 464 4 467 467 359 40.0 6e-99 MRLTFIGADHEVTGSCHFLEVGNTKVLIDCGMEQGNNIYQNAELPVGYHEIDYVLLTHAH IDHAGMLPWIHARGFRGQVITTYATADLCSIMLKDSAHIQEMEAEWKNRKARRAGRGEVE ALYTMQDAEDVLKHFEGVAYGQVFKLNDNLTLRFTDVGHLLGSASIEIWATEGASKKKIV FSGDIGNKNKPLIRDPQYITEADYVVMESTYGNRYHKKDVNHVDSLARIIQETLDRQGNV VIPAFAVGRTQELLYLIRHIKEERLVAGHDSFEVYVDSPLAVEATHVFKDNMLECYDEET KALVERGINPIDFPGLKLSITSEESKNINFDMTPKVIISAAGMCDAGRIRHHLKHNLWRP ECSVVFAGYQAEGTLGRILQDGAQTVKIFGEEIDVRAHIETMDGISGHADRDGLISWISA FEKKPDYVFVVHGSEESCVSFSDVLNTQLKLTARAPFSGSQFDLIKGFWIKETEGVLIEK ETAAKRRASGVYTRLVDAGKMLLDVIHRNEGGANKDLTRFTEQILALCDKWDR >gi|157101634|gb|DS480690.1| GENE 293 270795 - 272270 1683 491 aa, chain - ## HITS:1 COG:no KEGG:Closa_2078 NR:ns ## KEGG: Closa_2078 # Name: not_defined # Def: stage IV sporulation protein A # Organism: C.saccharolyticum # Pathway: not_defined # 1 491 1 491 491 803 80.0 0 MDNFNVYKDIQARTGGEIYIGVVGPVRTGKSTFIKRFMELLVLPAMEDENLRNLSRDELP QSAAGKTIMTTEPKFIPKEAASINLADGIEAKVRVIDCVGFMVDGAAGHVENGEERLVKT PWFDYDIPFTQAAEIGTRKVINDHSTIGIVVTTDGTIGEIKRPGYIAAEKQTIDELKKLG KPFVVLLNSTKPYSDETARLAREMSESYGVSVLPVNCEQLKKEDVFHILERVLKEFPVTE MDFHIPKWLEILPSTHWLKAQVIQAARNVIQKVTHMKDVSGELEQQHTDTIRSMNVKNMQ MADGRVGVQVDMDDSYYYQILSDYVGLPIEGEYQLMQTLSSLANMQKEYEKVQNALTQVR LKGYGVVTPERSEIVLDEPQVIKHGNKYGVKMKAEAPSINLIKAHIETEIAPIVGSEQQA QDLIAYIKENARESDDGIWNTNIFGKSIEQIVEDGIQAKVSQMTEDCQLKLQDTLQKIIN DSNGGMICIII >gi|157101634|gb|DS480690.1| GENE 294 272526 - 273236 268 236 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 228 1 231 245 107 26 6e-22 MAMLEVRDLEVYYGVIQAIKGISFDVNQGEIVALIGANGAGKTTTLHTITGLISAKTGKI VYEGTDITRVPGYKLVGMGIAHVPEGRRVFATLTVLQNLKMGAYTRKDKEEIEATLKMIY QRFPRLEERKNQLAGTLSGGEQQMLAMGRALMSHPRMIVMDEPSMGLSPIYVNEIFDIIQ KINGDGTTVLLVEQNAKKALSIAHKAYVLETGNVALSGDAKELMNNDQVKKAYLSE >gi|157101634|gb|DS480690.1| GENE 295 273251 - 274009 266 252 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus 
selenitireducens MLS10] # 14 244 33 254 329 107 29 1e-21 MALLEVKNLSISFGGLKAVDNFQISIEKGQLYGLIGPNGAGKTTIFNLLTGVYKPDAGSI ELAGVNITGRKTTEINQAGIARTFQNIRLFKDLSVLDNVKAGLHNHYRYSTTAGIFRFPN YFKVEKQMDERAMELLKVFGLDEECDYKASNLPYGKQRKLEIARALATEPKLLLLDEPAA GMNPNETAELMDTIRFVRDNFDMTILLIEHDMKLVSGICEELTVLNFGQVLCQGETGAVL HNPEVVKAYLGE >gi|157101634|gb|DS480690.1| GENE 296 274011 - 275084 1503 357 aa, chain - ## HITS:1 COG:FN1430 KEGG:ns NR:ns ## COG: FN1430 COG4177 # Protein_GI_number: 19704762 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 49 341 4 264 285 188 45.0 2e-47 MKSNAKRKTFINNMITYGMVVAAYIIMQVLVGTGHISSLMKGLLVPFCIYVIMAVSLNLT VGILGELSLGHAGFMCVGAFAGALFSKCMKGAIDPTLGMVIAILIGAVTAALFGVIIGIP VLRLRGDYLAIVTLAFGEIIKNLVNAVYLGRDASGFHISFKDSMSLKLGEGGEILIKGAQ GITGTPQAATFTIGVILVLITLFIVMNLVNSRTGRAIMAIRDNRIAAESIGINITKYKLL AFSISAGLAGIAGVLYAHNLTTLTALPKNFGYNMSIMILVYVVLGGIGNIRGSVIATVIL YLLPEMLRGLSNYRMLMYAIVLILAMLFNSAPQFVVMRERLAEKFSRKDKKPAKEAA >gi|157101634|gb|DS480690.1| GENE 297 275093 - 275974 1212 293 aa, chain - ## HITS:1 COG:FN1431 KEGG:ns NR:ns ## COG: FN1431 COG0559 # Protein_GI_number: 19704763 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Fusobacterium nucleatum # 1 293 14 308 308 256 51.0 4e-68 MGFINYLINGISLGSVYAIIALGYTMVYGIAKMLNFAHGDVIMIGSYVVFVTVSTMGLPP MAGVLLAVAVCTLLGMTIERIAYKPLRGASPLAVLITAIGVSYLLQNVALLIFGADTKSF TSVVTLPAIKLAGGEMTITGETIVTILSCIVIMIGLTAFINKSKAGQAMLAVSEDRGAAT LMGINVNGTIALTFAIGSALAAIAGVLLCSAYPSLTPYTGSMPGIKAFVAAVFGGIGSIP GAFIGGILLGVIEILSKAYISSQMSDAIVFSVLIIVLLVKPTGILGKKINEKV >gi|157101634|gb|DS480690.1| GENE 298 275977 - 276078 67 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTTLLFIVTIYDIILNNLSLYSDKLKKTSDEEV >gi|157101634|gb|DS480690.1| GENE 299 276138 - 277376 1509 412 aa, chain - ## HITS:1 COG:FN1432 KEGG:ns NR:ns ## COG: FN1432 COG0683 # 
Protein_GI_number: 19704764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Fusobacterium nucleatum # 1 410 1 376 383 234 36.0 2e-61 MKKRFLSLGLAAVMAMSLTACGGSNTAETTKAAETTAADAKAEGSEAADTAATDAGAAGG VFKIGGIGPVTGGAAVYGVAVQHGAELAVKEINEAGGINGAQIEFNFQDDEHDAEKSVNA YNTLKDWGMQVLMGTVTSAPCIAVADKTAADNMFQITPSGSAVECAANPNVFRICFSDPD QGAASAKYIGENKLASKVAVIYDSSDVYSSGIYEKFAEESANQGIEIVAAEAFTADSNKD FSTQLQKAKDAGAELVFLPIYYTEASLILQQASTMGFAPQFFGCDGMDGILQVQNFDTKL AEGLMLLTPFAADATDDLTVKFVESYKAAYNEVPVQFAADAYDAIYAIKAAIEDANVTPD ADASTICDALKASMLNIKLDGLTGEGMTWTEDGEPHKAPKAVKVVDGAYSAM >gi|157101634|gb|DS480690.1| GENE 300 277577 - 278605 1198 342 aa, chain - ## HITS:1 COG:FN0906 KEGG:ns NR:ns ## COG: FN0906 COG0240 # Protein_GI_number: 19704241 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Fusobacterium nucleatum # 6 333 1 331 335 394 61.0 1e-109 MDGEVMASIGVIGSGTWGTALSVLLNGNGHTVQIWSAIPQEIAELKETRIHRNLPEVKIP EAIGLTDDLEEAMRDKDLLVLAVPSVFVRQTSQRMKPYIKQGQVITNVAKGIEEKSLMTL SQIIEEELPEAEVTVLSGPSHAEEVSRGLPTTCVAGAHKRSVAEFVQGIFMSEVFRVYTS PDVLGIELGGALKNVIALAAGMADGLGYGDNTKAALITRGIAEISRLGIDMGGRMETFAG LTGIGDLIVTCASMHSRNRRAGILIGRGYTMEEAMAEVKMVVEGVYSAKAALALSQQYKV SMPIVEQVNAILFGGMPAKEAVLELMLRDKRIENSDLEWSEE >gi|157101634|gb|DS480690.1| GENE 301 278633 - 279274 538 213 aa, chain - ## HITS:1 COG:FN0537 KEGG:ns NR:ns ## COG: FN0537 COG0344 # Protein_GI_number: 19703872 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 7 205 7 191 194 107 35.0 2e-23 MEQIICLAMGYVFGLFQTGFIYGRSKGIDIRQHGSGNSGSTNALRVMGVKAGLIVFMGDF LKTVIPCFLVRILFRSQPGYIYVLILCAGFGVILGHNFPFYLKFKGGKGIAATAGIMFSL DWRLTAVCLLAFVLIVAVTRYVSLGSLVVSAIFLCWNIYFGHLGAYGLAPEARPQFYAVS AVIAAMAFWRHRANIVRLVQGRENKVGAKKKTE >gi|157101634|gb|DS480690.1| GENE 302 279299 - 280627 1774 442 aa, chain - ## HITS:1 COG:CAC1711 KEGG:ns NR:ns ## COG: CAC1711 COG1160 # 
Protein_GI_number: 15894988 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 1 438 1 437 438 506 59.0 1e-143 MSKPIVAIVGRPNVGKSTLFNVIAGDTISIVKDTPGVTRDRIYADCDWLNMNFTLIDTGG IEPDSKDIILSQMREQAEIAISTADVIIFIVDVRQGLVDSDSKVADLLRKSHKPVVLAVN KVDSVAKYGNDVYEFYNLGIGEPVAVSAASRLGIGDLLDEVVKHFDSEQMEEEEDERPRI AVVGKPNVGKSSIINKLVGENRVIVSDIAGTTRDAVDTEVIHEGTPYVFIDTAGLRRKSK IKEELERYSIIRTVSAVERADVVVVVIDATEGVTEQDAKIAGIAHERGKGIIVAVNKWDA IEKTDKTIYEYTRKIKEVLSFIPYAEYLFISAATGQRLTKLFEMIDVVRQNQNLRVATGV LNEIMTEAVAMQQPPSDKGKRLKIYYMTQVAVKPPTFVIFVNDKELMHFSYTRYLENQIR NAFGFKGTSLKFLVRERKGKEQ >gi|157101634|gb|DS480690.1| GENE 303 280660 - 280854 254 64 aa, chain - ## HITS:1 COG:no KEGG:Ccel_2580 NR:ns ## KEGG: Ccel_2580 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 8 64 4 60 60 62 54.0 5e-09 MKPGKNKKTSCECCGNYVYDEENDYYICEVNLDEDEMGRFLRGTVSACPYYQSDDEYRIV RRQM >gi|157101634|gb|DS480690.1| GENE 304 281125 - 282333 1236 402 aa, chain + ## HITS:1 COG:CAC0819 KEGG:ns NR:ns ## COG: CAC0819 COG0462 # Protein_GI_number: 15894106 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Clostridium acetobutylicum # 16 391 13 370 371 369 47.0 1e-102 MANNFVFKDSLPVAPLKIAALESCKDLAAKVNDHIVQFRRNDIEELMRRKEDLHYRGYDV DSYLLHCSCPRFGTGEGKAVINESVRGTDVFVMVDVMNYSIPYTVSGYTNHMSPDDHFQD LKRIIGACSATAHRVNVVMPFLYESRQHKRNKRESLDCAMALEELAALGMENIITFDAHD PRVQNAIPLIGFDNFMPTYQFVKALFSHDSNIRIDKEHLMVVSPDEGAMNRAVYLANNLG VDMGMFYKRRDYSQVIGGRNPIVAHEFLGSSVEGKTVLIIDDMISSGESMLDTCKELKEH KADKVIVCCTFGLFTNGLEKFDEYYHQGYLDYVITTNLNYRTQELLNREWYLEADMSKYL AAVINSLNHDRSISASLSPTEKIQKLVQKHQNGGYEYFQKLV >gi|157101634|gb|DS480690.1| GENE 305 282450 - 283496 1081 348 aa, chain - ## HITS:1 COG:no KEGG:BBR47_51510 NR:ns ## KEGG: BBR47_51510 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 2 345 3 352 364 239 
39.0 2e-61 MDVLLEEWSPVCSIQAVVEDNGQAVYFYLWVNPGSENAEIRNCWICNRVPAMEDMDYEAM KMGMAPRMPRAYVTHDSAGICLDREKLSLVWFEEGDGAALYEDGRMIAVIPGWADREFPG YSLYAKGMAPFAWELQEALPVLEQRMERSRMYWDIMRGGYFETMQQEQLAAMEAFFGPHK KYYAIDGGGFPPKALVTGERKGSQYAFTLGNGALAQPKVEQYFQDETWKYRRMELGVAFP KGCSQEGIMAMLNYTAAQASLPWKELGWLGHGHTIPCSGVEGYEGVLLLSPGLCGVQEGP AYPMFMDEPVNLLWMMLLTGEEYELVKEQGTDPILEKKGADPAAWNLV >gi|157101634|gb|DS480690.1| GENE 306 283533 - 284387 926 284 aa, chain - ## HITS:1 COG:BS_ylmD KEGG:ns NR:ns ## COG: BS_ylmD COG1496 # Protein_GI_number: 16078601 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 37 284 31 276 278 148 32.0 1e-35 MDINWSYKNKSRVFDTKESAGVPYLSFKALDDTGLAVNGFSTRMGGASRGKFSTMNFSYS RGDAAEDVLENFTRMAAALGVERDRMVVSHQTHTVNLRRVTLEDAGKGVVRERDYRDIDG LITDVPGLTLVTFYADCVPLYLLDPVNRAIGLSHSGWRGTVKRMGQVTVDAMKEAFGTKP EDLIACVGPSICRDCFEVGGEVVEEFREAFGPEHREALYCQGKRPGKYQLDLWKANEIIF REAGIRRENIHITNICTMCNSDYLFSHRKVGEERGNLAAFLCLK >gi|157101634|gb|DS480690.1| GENE 307 284409 - 284750 226 113 aa, chain - ## HITS:1 COG:yraN KEGG:ns NR:ns ## COG: yraN COG0792 # Protein_GI_number: 16131040 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Escherichia coli K12 # 1 113 14 127 131 84 41.0 6e-17 MNKRQTGSRYEETAAAFLTSKGYRVLERNFRCRQGEIDLICRHGRYLVFAEVKYRSGLSM GSPAEAVDARKQERIRRAAAFYLYSHGMGGDVPCRFDVVGILGSDIELIQDAF >gi|157101634|gb|DS480690.1| GENE 308 284743 - 285405 835 220 aa, chain - ## HITS:1 COG:SP1156 KEGG:ns NR:ns ## COG: SP1156 COG0164 # Protein_GI_number: 15901021 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Streptococcus pneumoniae TIGR4 # 14 209 50 246 259 190 53.0 2e-48 MAARVLKGEKLEQEMARLEAMAEYEKEYDACTYICGIDEAGRGPLAGPVAAGAVVLPKGC RILYLNDSKKLSEKRREELFLEIKEKAVAWNVGLASPARIDEINILQATYEAMREAVKGL SVKPEVLLNDAVTIPGLEGVQVPIIKGDAKSLSIAAASILAKVTRDHMMAEYEELFPGYG FAKHKGYGTAAHIAAIRELGPCPIHRRSFIRNIVQADTYE 
>gi|157101634|gb|DS480690.1| GENE 309 285408 - 286274 1112 288 aa, chain - ## HITS:1 COG:BS_ylqF KEGG:ns NR:ns ## COG: BS_ylqF COG1161 # Protein_GI_number: 16078668 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Bacillus subtilis # 1 280 1 280 282 270 46.0 2e-72 MNIQWYPGHMTKAKRAMKEDMKLIDLVIELVDARVPLSSRNPDIDDLGAGKARMVLLNKS DLADERQNARWAAWFEDKGIHVVKVDARNKGTLKQVQSVIQEACKEKIERDRRRGILNRP IRTMVVGIPNVGKSTFINSFAGKACAKTGNKPGVTKGNQWIRLNKTLELLDTPGILWPRF EDQQVGLKLALIGSINDQILNKDELACELIRFLKKRYPQVLAERFGLETEGKEAAVILEE IARVRACLLKGGDLDVSRAAGLLLDDFRAGKLGRITLEEPENQKDKVE >gi|157101634|gb|DS480690.1| GENE 310 286292 - 286843 599 183 aa, chain - ## HITS:1 COG:slr1377 KEGG:ns NR:ns ## COG: slr1377 COG0681 # Protein_GI_number: 16329775 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Synechocystis # 8 180 18 188 218 86 30.0 3e-17 MDFYVNEDTAWLRKAVRWTVNVVLVLASAWFLVYGFCTQVPVSGGSMQPVLDADDVVLVN RLIYDVGKPERFDIVVFEREDHKKNVKRIIGLPGETVQIKGGYIFIDGELLNAEDGLEQV SLAGRADTPIKLEDNEYFLLGDNRDSSEDSRFPNIGNVKREQIQGKVWFRIFPLLKLDFI HSR >gi|157101634|gb|DS480690.1| GENE 311 286954 - 287301 468 115 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160880535|ref|YP_001559503.1| ribosomal protein L19 [Clostridium phytofermentans ISDg] # 1 115 1 115 115 184 79 4e-45 MNDIIKNIEAAQLKESVPSFNVGDTVKVYNKIKEGNRERIQIFEGTVIKRQNGGARETFT VRKNSNGIGVEKTWPLHSPSVDNVEVVRKGKVRRAKLNYLRDRVGKAAKVKELVK >gi|157101634|gb|DS480690.1| GENE 312 287475 - 288752 1611 425 aa, chain - ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 22 425 17 426 426 258 38.0 2e-68 MDPSGDSIAIRVVIIIILLGLSAFFSSSETALTTVNKIRIRTLAEGGSKSAQWVMKLSEN QGKMLSAILIGNNVVNLSASSMLTVLVTEIFGNKAAGAATGVLTLLILIFGEITPKTMAT LDAERYSLRVGHMIHILMTVLTPLIVLINWMSLIVLKVLHVDPDKRNDDITEDELRTIVD 
VGHEKGVIESEEREMINNVFDLGDSVAKDVMVPRIDMVFVDIEADYDDLIQIFREEHYTR LPVYKETTDNVVGIINIKDLLLVEDKASFCVSDYLRQPFYTFESKKLSELMMEIKKSPNN IIIVLDEYGATAGLITLEDILEEIVGDIRDEYDEDEEEELMDLGDGQYLVEGSMKLDDLN DILDLELSSEDYDSVGGLVIDRLEHLPSQGEEVVCGNVRLVVEQVEKNRIDKVHLYILPG EKEEE >gi|157101634|gb|DS480690.1| GENE 313 289088 - 290005 852 305 aa, chain - ## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 244 1 244 246 284 56.0 1e-76 MNFHILTLFPDMVMGGLGTSITGRAMESKTISVEAIDIRDYSKDKHRHVDDAPYGGGAGM VMQPGPVCEAYEALCGRIGRKPRLIYMTPQGRVFNQTIAEELAKEEDLVFLCGHYEGIDE RALELIATDYLSVGDYVLTGGELPAMVMIDCISRLVPGVLNNDASAEEESFHDSLLEYPQ YTRPEVFRGMEVPEVLLSGHHKNIEEWRRQQSIKRTLERRPDLLEHAALTMKEVKYLDSL RREKGDLEILEELIDQYVKSLNDEASAGRTKRKAMAAAKKLLAEKTCTVGELQGYFKVMG MLAGG >gi|157101634|gb|DS480690.1| GENE 314 290002 - 292674 3464 890 aa, chain - ## HITS:1 COG:CAC1837 KEGG:ns NR:ns ## COG: CAC1837 COG0249 # Protein_GI_number: 15895112 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 7 883 3 867 869 874 51.0 0 MIDLSQLSPMMQHYMETKKEYPDCVLFYRLGDFYEMFFDDALTVSKELEITLTGKDCGLS ERAPMCGVPFHALDSYLYRLVQKGYKVAIAEQMEDPRQAKGLVKREVIRVVTPGTITSSQ VLDETKNNYLMGIVYMDGIYGISTADISTGDFMVTEVDSDRELFDEINKFSPSEIICNNA FYMSGVDMDELKNRYQVVISALDSRFFGEESCRRILMEHFKVGALVGLGLEDYATGIIAA GAVMQYIYETQKSTLEHITTITPYSTGQYMVIDTSTRRNLELVETMREKQKRGTLLWVLD KTKTAMGARLLRACIEQPLIHRDEIIKRQNAVEELNMNYISREEICEYLNPIYDLERLIG RISYKTANPRDLIAFRSSLEMLPYIKRILGEFNSELLAELGRELDPLQDIFQLIGDAIVE EPPITVREGGIIKDGYNQEADKLRHAKTEGKNWLAELEAKEKEKTGIKTLKVKFNKVFGY YFEVTNSFKDQVPDYYIRKQTLTNAERFTTDELKQLEDIIMGAEEKLVSLEYDLFCEVRD KIGAEVIRIQKTAKSIAGIDVFCSLSVVATRRNYVKPSINDKGVIQIKNGRHPVVEQMMR DDMFVANDTFLDNGKNRLSVITGPNMAGKSTYMRQVALIVLMAQLGSFVPAQEADIGICD RIFTRVGASDDLASGQSTFMVEMTEVANILRNATRNSLLVLDEIGRGTSTFDGLSIAWAV 
IEHISNSKLLGAKTLFATHYHELTELEGTIAGVKNYCIAVKEQGDDIVFLRKIVRGGADK SYGIQVAKLAGVPDSVIARAKEIAEELSDADITARAKEIAEISSNITQHKAVPKPDEVDL QQLSFFDTVKDDDIIRELDSLELSTMTPLDAMNTLYRLQTKLKNRWKETG >gi|157101634|gb|DS480690.1| GENE 315 292751 - 294259 655 502 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 168] # 53 492 3 430 451 256 34 7e-67 MEESRDTIDNIKNAETAINQDPPAAEPERQQYFMERAKGQLAELSRRLGRPLTCCINTFG CQMNARDSEKLLGILETIGYQAVESEKADFVLFNTCTVRENANLRVYGRLGQLGAYKKTH PDMMIALCGCMMQEEEVVEKIRKSYRYVDLIFGTHNIFKLAELVSLCLERRTQGGQKQGK TKMVVDVWKDTDQIVEDLPVERKFPFKSGVNIMFGCNNFCSYCIVPYVRGRERSRRPEEI IKEIQKLAADGVVEIMLLGQNVNSYGKNLDNPMTFAQLLEEVEKIDGIERIRFMTSHPKD LSDELIEVMARSEKICRHLHLPLQSGSSRILKVMNRRYTKEQYLDLAERIKKAVPGISLT TDIIVGFPGETDEDFEETMDVVRRVGFDSAFTFIYSKRTGTPAAAMEDQVPEAIVKERFD RLLKEVQDISAEVCGRDVRTVQEVLVEEVNDHAPGLMTGRLSNNTVVHFPGDASMIGRLV PVYLQESKGFYYMGHIADKENE >gi|157101634|gb|DS480690.1| GENE 316 294481 - 296223 2063 580 aa, chain - ## HITS:1 COG:FN0023 KEGG:ns NR:ns ## COG: FN0023 COG1288 # Protein_GI_number: 19703375 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 135 579 55 498 499 468 53.0 1e-131 MPFNKKKNTGTPKRDYTKMKTPHTYAIIFAAIFLCWLLTFLVPAGKFSTHKVEYTDSNGA VKTKTVLMADTFRYSYRLDREAVKDNLTAMSGDEALMESLGVDPAELDAFLETDSSEWTQ EDLDELGLTDDVLYGQYGESIYDTSQKLHKTAKVWGTEDFGGFGFLNYVFEGLVTGDRSG SAVGICALILVVGGSFGIIMKTGAIDAGIYAFIKKTKGLERLALPLLFILFSLGGATFGM AEECIPFSMIMVPFVIALGYDSIVAITVTYVASQVGNAVSWMSPFSVAVAQGIAGIPVLS GAGFRLAMWVIVTLLSAGYMMIYAEKIRKNPAHSVVYESDAYFRGRMDSVSEEREFSLGH KLILLEMLAVLVWIIWGVTQEGYYIPEIASQFFVMGFAAGVTAVIFKLNDMTVNGMASAF QSGVADLAGTAVVVGMAKGILLVLGGSDANVASTLNTILYTIGNALAGVPSFIGALFMYL FQSCFNLIVTSNSGQAALTMPIMAPLADLVGVTRQVAVLAFQMGAGFVDAFTPVSASLIG VLGVARIDWGKWAKFQIKMQAFFFLMGTVAIAIAIAVNLQ >gi|157101634|gb|DS480690.1| GENE 317 296343 - 296804 471 153 aa, chain - ## HITS:1 COG:no KEGG:Dred_0591 NR:ns ## KEGG: Dred_0591 # Name: not_defined # Def: hypothetical protein 
# Organism: D.reducens # Pathway: not_defined # 1 151 6 144 144 80 31.0 2e-14 MMDERTTQVRNKIMSEMAMLMYLFVAAAFAVKVLLMGKGIRDCVVEYVILILAPLYQYVR ARQLKFSLYRPGYGSQQHRVRRNLVSVAVGALVFILVFWRTSKTGTLDAKEAVPGVLTFV VVFLLTRVVIVRIEKKRADRLEQEYGDDGEEQE >gi|157101634|gb|DS480690.1| GENE 318 296848 - 297042 302 64 aa, chain - ## HITS:1 COG:SPy1934 KEGG:ns NR:ns ## COG: SPy1934 COG1476 # Protein_GI_number: 15675737 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 63 1 63 68 99 73.0 1e-21 MKNLKMKAARAGMDLSQEELAELVGVTRQTIGMIEAGKYNPTLNLCLAICKALHKTLDEL FWEE >gi|157101634|gb|DS480690.1| GENE 319 297245 - 297619 373 124 aa, chain - ## HITS:1 COG:PA3571 KEGG:ns NR:ns ## COG: PA3571 COG2207 # Protein_GI_number: 15598767 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 13 114 200 301 307 82 37.0 2e-16 MERRCRTAKSQYLNRVITYMRDHLQENLTLTRIAREAGLSESYLNAVFKECVKCAPMDYY INMKMEQACYLLCNTDMHIYQVAQYLGYDNQYYFSRAFKKVLGVPPKKYKEMPRIPSAPM QVSC >gi|157101634|gb|DS480690.1| GENE 320 297845 - 298756 828 303 aa, chain + ## HITS:1 COG:BS_ydeC KEGG:ns NR:ns ## COG: BS_ydeC COG2207 # Protein_GI_number: 16077582 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 8 299 5 291 291 151 30.0 2e-36 MPSKIQIITNEHQEELKEHGTYEFPVLVSDEALSRFYTSSFQWHWHTEIELTLITEGTMV YQINDQIFYPRAGQALFGNSNTMHTGRMENGRDCRYISITFHPRLLYGYQGSRIASRYTD PLMENAALPAVCFDLSQEWHGEAVGLLEDIIRIDSLRYDTYEMDIQMDLFRFWKLLYLHC RPDREPAHTAGRKNQERIRQMLSFIHENYQSDITLDDISRHIHICKSECCRVFKGYMKES LFEYLLKYRIEKSIPDVLEGKLTMTETAIRAGFTDPNYFSKVFHKIKGCSPRSYRKTRSS PLP >gi|157101634|gb|DS480690.1| GENE 321 298723 - 300021 911 432 aa, chain - ## HITS:1 COG:CAC0285 KEGG:ns NR:ns ## COG: CAC0285 COG0389 # Protein_GI_number: 15893577 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: 
Clostridium acetobutylicum # 1 386 8 391 396 330 43.0 2e-90 MDVNSAYLSWTAAYRVCILGEDLDLRDLPSVIGGSEAERHGIVLAKSSSAKGYGIRTGET LAEARCKCPGLVIVPPDYSLYVSCSSALVELLKKYFPYVHQYSIDEAFCQVEGTMGLWGT HTAFAHQLKEEIRRELGFTVNIGVSDSKLLAKMASEFQKPDQVHTLFTWEIERKMWPLPV NELFWVGPATVRKLRALGIRTIGALAASDLELLKRHLKKQGEFIWNYANGRDVSPFLREL PQNKGYGNSMTTPSDVTDRETAEQVILSLTETVCARLRADSMEAACVGISITDREFRHGS GQTTLISATDTTLEIYEAARSLFDRLWNHSPIRQMGVHTSKVSPKGHYQYQLFDRGLHDK YARLDTAVDEIRKRYGEDCVQRAVFVGNRIPHMSGGIDKVKRTGVTKAVQRPGPGSGAGD RNHGSGEDRVFR >gi|157101634|gb|DS480690.1| GENE 322 300051 - 300320 265 89 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3923 NR:ns ## KEGG: Cphy_3923 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 5 89 4 88 88 64 38.0 1e-09 MAALNIPIQMFGACSTLGDMTPLWFRYENKEHGIITVKIESIVSSREEKFCGMEHISFVC WAVAEGQRRLIELRYRISTHKWSFFRTLS >gi|157101634|gb|DS480690.1| GENE 323 300412 - 301065 627 217 aa, chain - ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 5 216 8 215 229 83 26.0 2e-16 MDIHLLAGTMLFQGIREHEIEAMLTCLSAEERTYGKDAYIYRAGDVTGRLGVVMEGAVNI IKDDVWGNRKIIENIGGGQIFGETYACLKGEPLMVDVQASERSRILFMDVNRILTTCSSS CDFHNRLIRSLMYVLAGKNLMLTKKMDIITPKSLRERVMVYLSQESVKQGCRTVTVPFNR QQMADYLSVDRSALSAELSRMQRDGVISYEKNRFTIQ >gi|157101634|gb|DS480690.1| GENE 324 301192 - 302787 1584 531 aa, chain + ## HITS:1 COG:YPO1360 KEGG:ns NR:ns ## COG: YPO1360 COG1151 # Protein_GI_number: 16121640 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Yersinia pestis # 5 531 1 549 550 507 46.0 1e-143 MEHNMFCFQCEQTAGCSGCMGKAGVCGKTAATARIQDELTGALIGLARASEGKDRSPEAD RLIVRGLFATLTNVSFDDSALNELLAEAGRLKHSLSPMCGSCTADCSRTEDFDMNRLWSE DEDIRSLKSLILFGMRGMAAYAYHAMVLGHEDRGVMDYFYKGLSAVGYDSYTVQDLLPIV 
MELGAVNLDCMALLDRANRQAFGIPAPAQVSLTVEQGPFIVISGHDLHDLKLLLEQTQGR GINIYTHGEMLPAHAYPELRKYSHLKGNFGTAWQNQQKEFSDLPAPVLFTTNCLMPPKPS YSDRVFTTALVSYPGIVHIDGRKDFSPVIQKALELGGFKEDVTIPGINGGTKVTTGFGRE AILAAAGTVAQAVRQNRLRHIFLVGGCDGARPGRNYYTDFVKQAPEDTLILTLACGKYRF NDLELGTLAGIPRLLDMGQCNDAFGAIQVAAALAEAFGCQVNDLPLTLVLSWYEQKAVCI LLSLLHLGIRNIYLGPTLPAFLSPAILEYLTEQFEIHPISTPEHDLAAILA >gi|157101634|gb|DS480690.1| GENE 325 302842 - 303612 879 256 aa, chain - ## HITS:1 COG:RSc2009 KEGG:ns NR:ns ## COG: RSc2009 COG3022 # Protein_GI_number: 17546728 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 1 250 1 252 257 137 36.0 2e-32 MKIIISPAKKMNIREDELEWANLPCFLSRAEELKRYIQGLDLDQARKLWQCNEKIALLNY GRFKEMDLERRLTPALLSYEGIQYQYMAPGVFEQGQWDYVQEHLRILSGFYGILKPLDGI VPYRLEMQAKVELPSGIKSLYDYWGSSICRELAADETLIVNLASKEYSRAVEPYLEAHID YVTCVFGTLAEDGTGGLKVKVKATEAKMARGEMVRFMAERGVRDAEGLKGFDRLGYRYCQ EKSGDKEYVFIKIREQ >gi|157101634|gb|DS480690.1| GENE 326 303636 - 304217 711 193 aa, chain - ## HITS:1 COG:slr1377 KEGG:ns NR:ns ## COG: slr1377 COG0681 # Protein_GI_number: 16329775 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Synechocystis # 5 187 3 181 218 118 35.0 9e-27 MLEEENSRKRSRKQQEEEPFSWKKEIISWIQIIVAAVIIALVLNNFIIANSRVPTGSMEN TIMSKSRVIGSRLSYLTSDPERGDVVIFHFPDDPTGKIYYVKRVIGLPGETVNVVDGKVY INDSDTPLDEPYLPEPMEGSYGPYTVPEGCYFMMGDNRNNSLDARFWKNQFVEKDKIIAK VLFTYFPKIEKVE >gi|157101634|gb|DS480690.1| GENE 327 304298 - 305539 1474 413 aa, chain - ## HITS:1 COG:CAC3254 KEGG:ns NR:ns ## COG: CAC3254 COG0014 # Protein_GI_number: 15896499 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyl phosphate reductase # Organism: Clostridium acetobutylicum # 6 413 10 417 418 482 60.0 1e-136 MTLNEIGSRAKEVSRVLGTLGSREKNMGLEEAARALLEGEEEILEANSMDYEKARSGGMS QGLLDRLKLTPARVQAMADGLLQVASLEDPVGEVLSMKLRPNGLQIGQKRVPLGVIGMIY EARPNVTADAFGLCFKSGNAVILKGGSDALHSNMAITRCLRAGLASAGLPEDSVQLIEDT 
SRDTTRELMRLNRYIDVLIPRGGAGLIKTVVENSTIPVIETGTGNCHVYVDASADQNMAL DIIFNAKTQRIGVCNACESLLVHRSIAKEFLPLLRKKLEEKQVEIRGDEDACAIEPSFVR ATEEDWGREYLDYILSLKLVDSVDEAIRHINTYNTGHSETIVTSDYFNAQTFLNEVDAAA VYVNASTRFTDGEEFGFGAEIGISTQKLHARGPMGLKELTTTKYIIYGDGQIR >gi|157101634|gb|DS480690.1| GENE 328 305563 - 305784 273 73 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2602 NR:ns ## KEGG: Cphy_2602 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 66 1 66 69 80 71.0 3e-14 MDASRIDRINELYHKSQSVGLTEEEKEEQARLRQEYVAAIRGSLRNNLNNISIKEPDGSI TDLGKKHGGIKEV >gi|157101634|gb|DS480690.1| GENE 329 305829 - 306692 1158 287 aa, chain - ## HITS:1 COG:lin1228 KEGG:ns NR:ns ## COG: lin1228 COG0263 # Protein_GI_number: 16800297 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Listeria innocua # 6 269 2 263 276 238 50.0 1e-62 METQERQNLADKQRIVIKIGSSSLTHAQTGEVNLMKIEKLVRVVSDLRGQGKDVVLVSSG AIAAGRQALGRHRKPDTLAEKQAFAAVGQARLMMIYQKLFAEYNQTAAQVLLTKDTMVND SSRYNAQNTFDELLNLGTIPIVNENDTVSTSEIPYVDSFGDNDRLSAIVAALIGADLLIL LSDIDGLYTDDPRENPEAGFISLVPEITPEFLRMGKDTSGSDVGTGGMSAKLAAARIATD SGADMVIANGDQVDVILDIMSGKEKGTLFLAHTNLDFDLMHYLNNEY >gi|157101634|gb|DS480690.1| GENE 330 306781 - 308469 1605 562 aa, chain - ## HITS:1 COG:CAC3077 KEGG:ns NR:ns ## COG: CAC3077 COG2509 # Protein_GI_number: 15896328 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Clostridium acetobutylicum # 1 550 1 538 540 465 46.0 1e-131 MIRINQLKMPLGHDRAGLLEKAARVLRVPSGEIEKLTIVKQSVDARKKPDIWYSYVVDIG IRQAGLQKEEKLVRRLKDRNVAVHKEAPYRLPEPGTECMAGRPVIIGTGPAGLFCGLMLA RKGYMPILLERGEDVDARTDRVARFWETGQLDPSSNVQFGEGGAGTFSDGKLNTLVKDTF GRNREVLRILTEFGGDPSILYVNKPHIGTDVLSRIVKSIRTEIEKLGGQVLFQSQVTDFV TGGEPGESRRIKALVVNGSQVLEAETVVLAIGHSARDTFETLLARGIPMEPKAFAVGLRV QHPQTLINESQYGMKECGELGPASYKLTWKASDERGVYSFCMCPGGYVVNASSEPGRLAV NGMSYHDRAGENANSAIIVTVTPEDFCGMETGAGTDQGCEVPGDAMAGIRFQRRLEETAF 
CLGKGNIPVQLYGDFKEGRVSEGFGGVNPAFRGGYAFANLRELFPEPLSRAFMEGMEGFG TMIRGFDRPDAILAGIESRTSSPVRIPRDQGMESPVKGIFPCGEGAGYAGGITSAAMDGI KTAEEIIRRYNPLRPSRTVAKT >gi|157101634|gb|DS480690.1| GENE 331 308466 - 309791 1406 441 aa, chain - ## HITS:1 COG:CAC3590 KEGG:ns NR:ns ## COG: CAC3590 COG2081 # Protein_GI_number: 15896824 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 4 405 2 402 405 298 40.0 2e-80 MRKRIVIIGGGASGLTAAIGGARNGAHVTIVEHMDRVGKKILSTGNGRCNLTNLRMEADC YRCGRKEFPMEVIRGFGVDETLAFFKGLGIEPKDRNGYIYPNSDQASAVLDVLRSEVERL GVAVLLSCRVEKIEPAAGGGHVCYKVYTDQGILDADAVILAAGSKAAPSTGSDGSGYELA GRLGHRVIKPLPALVQLRCQGNLYRQMAGIRTEAGVKLMAAGELLARDRGELQLTDYGLS GIPVFQVSRFAARALDQGKRVTALVDFMPSWDDGEAFGLLKKRASLLGHKTVEELFTGLL NKKLALVLIKLAGIKPSQKSGDLSPKQLKLLLGQIKSYEAIVMSVNPFANAQVCCGGVDT GEVDAATMESRLHANLYLAGELLDVDGICGGYNLQFAWSSGMIAGTHAAGGHSVGKHEAG ANKAAAHGAYAPAPDGKRNRK >gi|157101634|gb|DS480690.1| GENE 332 310143 - 310541 351 132 aa, chain - ## HITS:1 COG:no KEGG:Closa_2065 NR:ns ## KEGG: Closa_2065 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 132 1 130 130 144 53.0 1e-33 MKKLLYCFALFAVTSIVCLIVAYGITRYSVRQEQAVPNTVIETETVNDVDDRAAMNQEQV KPILESVSPAEEYYLVSESGFLLVFCSDRSTICLYTHIPVTDFPEKEREKLMEGIWFPSM MDVYHYLESYTS >gi|157101634|gb|DS480690.1| GENE 333 310693 - 312390 1671 565 aa, chain - ## HITS:1 COG:CAC1016 KEGG:ns NR:ns ## COG: CAC1016 COG2244 # Protein_GI_number: 15894303 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Clostridium acetobutylicum # 23 497 11 470 539 183 26.0 9e-46 MNPRTSRKKTRKQKAANFVIQGSILAMAGIVVRIIGMLYRMPLNDIIGKQGNGYYTSAFN VYNILLILSSYSMPVAVSKMISARLARGEHRNCSRILKAALIYATAVGGIAAVVLWFGAD LFAQLIKTPFSRYALKTLAPTIWIMAYLGVLRGYFQGTGTMVPTAVSQILEQIVNAVVSV VAAGILFGVGTSINAAQGAKDYSYALGAAGGTIGTGAGALTAFLFFICLTASKGRERRQM VREDVSGRTESYRRIFYVLTITVLPIVISSGIYNCSNVVDNYLFGQGMDKLGYMEDSIAT 
YWGVLGQYQLLFNIPVAVSNALSSSLIPSLTRAVANRNRKEKLERIATSIRFSLLIAIPA AVGITVLAKPVCNLLFISEDNTMLIRLSMAGSLAVVFYSLSTVTNAVLQGLNHMDVPIRH AVIALVIHVAVLEVFLMVFKMGIYSVVFANIIFALVMCLLNGHAIARFARYRQEYKRTLI LPTICAGFMGAAAYGVYRGIYALLPDQLMRGRMGMAIVVFPSVAVAILVYAVLLVRFRAV EEEELKGMPGGRRLVRLLNRFRLLG >gi|157101634|gb|DS480690.1| GENE 334 312396 - 313547 1258 383 aa, chain - ## HITS:1 COG:no KEGG:Closa_2067 NR:ns ## KEGG: Closa_2067 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 380 8 383 383 461 59.0 1e-128 MAIGLTAGLTAAVLLTGCAREASVEATTVQEPPATTEAAEPDIIIRSDPETKKEDTGNLP AFYLPEERTEENGMIRSYLTGQMVETAKGNRRPAAVMMSNDKEARPQYGINRAGVVYEAP VEGGMNRYMALIEDYDGLERIGSVRSCRTYYTYFAREFDAVYVHFGQSTFAKPYLKNVDH INGLDAIGDLAFYRTKDKKSPHNAYTSGDRITASIEKLGYTQSYAPSYKGHYLFARDGRE ASLTERPGVMDAGTAKTGYIMNQAYFVYDPSDGLYHRYQYGGIHQGDEGPVAVKNVIFQY CQTGHYATTDYLDINVHTTECGYFMTDGKAIPINWEKDGEFGPTRYFDANHDEVVLNQGK TWVCIIPTRDFAKSEIIGNEDTQ >gi|157101634|gb|DS480690.1| GENE 335 313565 - 314062 688 165 aa, chain - ## HITS:1 COG:CAC1293 KEGG:ns NR:ns ## COG: CAC1293 COG0319 # Protein_GI_number: 15894575 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Clostridium acetobutylicum # 13 158 13 159 166 110 44.0 1e-24 MTISIEYEAEKVLDLPYREIIEDTVLAALDYEGCPYEAEVNVILTDNDAIQEINREHRQI DAPTDVLSFPMVDYESPSDFDHVEEAVEDYFNPETGELMLGDIVISVDKVEEQAEKYGHS QTRELAFLTAHSMLHLFGYDHMEDGERLVMEEKQKEILETRGYTR >gi|157101634|gb|DS480690.1| GENE 336 314083 - 315063 1266 326 aa, chain - ## HITS:1 COG:BH1361 KEGG:ns NR:ns ## COG: BH1361 COG1702 # Protein_GI_number: 15613924 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Bacillus halodurans # 14 311 22 319 320 327 54.0 2e-89 MIDIPAEHERNVCGQFDTYLKKIERTLHVTMIARDGEIKIIGPEKTIGQAKSVFNNLIEL SKRGNTITEQNVDYALSLSFTENDGQILEIDKDIICRTISGKPVKPKTLGQKHYVDAIRK KMIVFGIGPAGTGKTYLAMAMAIQAFKNGEVGRIILTRPAIEAGEKLGFLPGDLQSKIDP 
YLRPLYDALYQIMGAESYLHNSEKGLIEVAPLAYMRGRTLDNAYIILDEAQNTTPAQMKM FLTRIGFGSKVIITGDQTQKDLPGDAVSGLDTALKVLKRIDDIGFCYLTSSDVVRHPLVQ KIVQAYETYEQKNTPPAGRRRIRERN >gi|157101634|gb|DS480690.1| GENE 337 315078 - 316337 958 419 aa, chain - ## HITS:1 COG:no KEGG:Closa_2070 NR:ns ## KEGG: Closa_2070 # Name: not_defined # Def: sporulation protein YqfD # Organism: C.saccharolyticum # Pathway: not_defined # 1 401 1 405 414 448 53.0 1e-124 MIEWLLHNWEGYVRLRLHGYSPERFLNLCNARGLEVWGLICNRDGDYEFFMTVEGYRRVK PLVRKAQVRLHITGRFGLPFFIYRHRKRWYYGAGIMAFFLVLYVMSLFIWDIQFEGNYRY TYDTLVRFLDTQDIDYGMLKARINCEDLEASMRTSFPEITWVSARVSGTRLLIHIKENEV LSTIPEKDETPCDIVASQPGIITSMVVRQGVAQVCVGDQVEEGQVLVSGRVPIIGDNEEE INAYMVHADADIVARTVKQYEKTFPLLHQERVHTGRRRRGWYMKAGAWSFTFLTPVRGQG EWDYSMEEKQLRLFSNFYLPVYIGAIQGREMAAYERNYREDELKELSKAINYQFVKNLEE KGVQILENNDRIETSVSECRLTGELVTEESIVKTQPAAEPDMNSGGEPGEKPEETITAE >gi|157101634|gb|DS480690.1| GENE 338 316356 - 316625 273 89 aa, chain - ## HITS:1 COG:no KEGG:Closa_2071 NR:ns ## KEGG: Closa_2071 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 88 1 88 88 114 63.0 1e-24 MSWEPKKTAVSALGLPRDVILGDVLISFVGRSQVNVENYRSILIYTDTLIKLQARNCKVV IHGARLKIDYYTAEEMKITGQVGSLEFET >gi|157101634|gb|DS480690.1| GENE 339 316767 - 317678 797 303 aa, chain + ## HITS:1 COG:no KEGG:Closa_1701 NR:ns ## KEGG: Closa_1701 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 79 303 66 288 291 153 38.0 7e-36 MNNQSYHTDRMNEYPDTLLTTAELGEADVPVTPLPNPGEGGPVDSGGNQNNGANVPVIPL PNPGEGGPVDSGNRPGFPPIFRPIFPGNSGSVTVIPGVTFPCYNCTATSFGRVRILNAST GYQPFYVYLGNWMVASGLGNGDITGYVQSASGTQTVTVSGANGYVYIQKTVQVRASASMT IAIINTASGLDIMEISDISCNAPSGTSCLRTCNLSLNLGPFDVSIGNQNSSYTAFSNVRY REVTPYTSFYPGWYQLYISRTGSYPNTSVATTSANMSANTSYTLYIFNAPNATDGLKTLV VSN >gi|157101634|gb|DS480690.1| GENE 340 317796 - 319673 1367 625 aa, chain - ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction 
mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 54 617 49 626 627 330 34.0 6e-90 MIKIALAVPYKGFIENAYQIFRQHNEFYRAVDQENYEMEEVIVTSENVNQIKIEADIVLT RGLLAEILASLSKNIPVVEITVPAADILRTIRNCVEHYGKQKIGIIAAGNMLDGITELSD LSEAPIRTFVLTPDWNNEMLVKRAVSEGCKVILGGVNTCRYAAENHVSFMMIQTSQEAFW NALSDVKRTWSIFRKEQEKALRLQALLDISHDGILLIDPDFKLTAMNRKAQSVLGIDGEH LPVDALSLGGELKDVLAGDEEYVNRLVQYGSNMLNVSKNVLKIQSSICGYVVNVQAVKEI QSMEHDIRKRICKRGHIARYRFDDITGESPVMISTVETARRYSRTSSNILLIGQSGTGKE MFAQSIHNDSQRRRMPFVAVNCAAIPEQLLESELFGYVPGAFTGAHRDGKAGLFELAHKG TLFLDEIGEISLPLQAKLLRVLQEREIMRVGGDSVIPVDIRILAATNQNLEQLAAERQFR EDLLYRLDVLRINIPSLNQRREDISLLADSYMKRAFPDIRITDEAKAYLEQMDWPGNVRQ LFNFCERLAVLCNGFAVDADLVAAVSLQGGIPLGTGSVLSDEEDEQRIITKTLASCHYHK GNTAKALGISRSTLWRKLREYGIRT >gi|157101634|gb|DS480690.1| GENE 341 320009 - 321352 1057 447 aa, chain + ## HITS:1 COG:Cgl2846 KEGG:ns NR:ns ## COG: Cgl2846 COG2610 # Protein_GI_number: 19554096 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Corynebacterium glutamicum # 10 438 20 452 463 271 41.0 2e-72 MVMVLIVVLGIAFLLLLVLKCKLHAFIALLIASIAVGVAAGMPLTDITASIQNGMGGTLG TVAIIVGLGAVFGQMLEASGGAQALAKSMIRFFGEKHAPLAFAVTGFIIAIPVFFDVGFI ILLPLIYAVAYKTKRSVATFAFPLIISLCLTNAFVPPTPGPVATCSALDANVGWVIFFGI IISIPLTLVGYLYSIKYIDKHLFIPVPDYMIPEESNDENLPSFSSVILIIAVPILLILVN TGVDALVNEGILADSFFTQAAKFLGTSYIALLIAVLLSFYILGKRRGFHKEDLYEISSKA FLPVGLIILVTGAGGVFKQILLDSGVGDVIAQSVETIGLPPIILGYLVAVFIRVSQGSGM VAMITAAGIVSPLLAISPVSEMQRALVVLAIAAGSVAASHVNDSGFWLVCKYLNMTEGQT LKTWTFITAFLSLCSLVLVILLSLIVP >gi|157101634|gb|DS480690.1| GENE 342 321677 - 322285 271 202 aa, chain - ## HITS:1 COG:SP0571 KEGG:ns NR:ns ## COG: SP0571 COG2184 # Protein_GI_number: 15900482 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Protein involved in cell division # Organism: Streptococcus pneumoniae TIGR4 # 57 178 100 
224 267 90 37.0 2e-18 MKDFYLYDDVPVLRNLLNIKDEKLLEEAESNITYVKLLDIDDKICGDAFDYQRIKDIHTY IFGDLYEWAGKERGINIVKGERVLGGDTVRYADTNSIEPEMEAALKELNQVKWEVLDISE TAELFSKLIAKIWQIHPFREGNTRTIITFATQFSEKHGFRMNKALLKDNSEYVRDALVKA SDGMYSEYEYLIRIIQDAILKG >gi|157101634|gb|DS480690.1| GENE 343 322298 - 322525 267 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939173|ref|ZP_02086524.1| ## NR: gi|160939173|ref|ZP_02086524.1| hypothetical protein CLOBOL_04067 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04067 [Clostridium bolteae ATCC BAA-613] # 1 75 1 75 75 117 100.0 3e-25 MAYTIEKKYSIKETTKVGREKIVNDALAIATLDADAPTERTMNLVQEYIDGKKEISEILK ETIAYYQEQAKHCNN >gi|157101634|gb|DS480690.1| GENE 344 322672 - 324162 351 496 aa, chain - ## HITS:1 COG:MA2116 KEGG:ns NR:ns ## COG: MA2116 COG0286 # Protein_GI_number: 20090959 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Methanosarcina acetivorans str.C2A # 1 494 1 492 498 723 69.0 0 MLEQTTTIISKVWGMCNPLRDDGVSYGDYLEQLTYLIFLKMSDEYAKPPYKRDTGIPSGY TWSDMNTLKGAELEDQYKATLEKLGEQAGILGQIFKGAVNKISNAAILYRVVQMINNEKW VAMSSDVKGEIYEGLLQKNAEDIKSGAGQYFTPRPLIRAMVRCLRPEPMKTIADPCCGSG GFFLAAQSFLADPNNYALDREQKGFLKNETFYGNEIVPATYKTALMNLYLHNIGDIYGNV PITLGDALLTDPGYRVDYVMTNPPFGKKSSITFTNEEGEQEDEDLVYNRQDFWTTSSNKQ LNFVQHINTILKATGKAAVVVPDNVLFEGGAGEVVRKKLLETTDLHTILRLPTGIFYKPG VKANVLFFDKRPASAQRQTKEVWIYDLRTNIHFTLKQHPMTDADLEDFVCCYHPENRYER TETYSADNPDGRFRKFSIEEIMERDKTSLDIFWIKDKSLADLDNLPSPDELANDIIENLQ SALDSFTALQAQLNKG >gi|157101634|gb|DS480690.1| GENE 345 324167 - 324772 83 201 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939175|ref|ZP_02086526.1| ## NR: gi|160939175|ref|ZP_02086526.1| hypothetical protein CLOBOL_04069 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04069 [Clostridium bolteae ATCC BAA-613] # 1 201 1 201 201 405 100.0 1e-112 MQIIDWLNKNSGAMTFLITVVYVVATIAICQANIKSAKATREQLEVSKTQFEETKRLEFL PYLQFQQISSSQNEYKLKLVLCDKDIDGGTYVCCFNVQNIGRGTVKDIIYTYEWDDFSKK 
YDRGPFPIGGLMPSGNSNIRIEFALPKEQRNSVKCAFQLHFKDLLENEYNQRIEFHFESK NTAPMVLENYIVQSPIVINKE >gi|157101634|gb|DS480690.1| GENE 346 325662 - 325763 61 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLLQKSACFVMMVIYLSYAMALDLVLLDEPSKE >gi|157101634|gb|DS480690.1| GENE 347 325922 - 326317 183 131 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939177|ref|ZP_02086528.1| ## NR: gi|160939177|ref|ZP_02086528.1| hypothetical protein CLOBOL_04071 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04071 [Clostridium bolteae ATCC BAA-613] # 1 131 129 259 259 231 100.0 1e-59 MCAFIEFADVFDSFGKTFMQETKPLFENIVRIRKDYADYAEALSEYQSLEEYKKAKSIIE KEFFWNIGQYRIEIETFYKNESVKYSFTFSISESDYRQLKSNIDESLVSPLKDRYAIMRN YQWADIELKEA >gi|157101634|gb|DS480690.1| GENE 348 326691 - 326873 159 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939178|ref|ZP_02086529.1| ## NR: gi|160939178|ref|ZP_02086529.1| hypothetical protein CLOBOL_04072 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04072 [Clostridium bolteae ATCC BAA-613] # 1 60 6 65 65 113 98.0 5e-24 MIKDHIASSLSILPEDLDLTPFDRKGGLFGFCEAFGEQYKEILEEMKYGIGSIKEKYYGN >gi|157101634|gb|DS480690.1| GENE 349 326984 - 327340 157 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939179|ref|ZP_02086530.1| ## NR: gi|160939179|ref|ZP_02086530.1| hypothetical protein CLOBOL_04073 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04073 [Clostridium bolteae ATCC BAA-613] # 1 118 1 118 118 220 100.0 3e-56 MGFYDAFKDVLNMAQKVDNIDLYRQLLDLSAQALEMQEEIIKLKAENKELKSQKDIEDDI EYHVDPFITRKSDKKPIKYCAACWADKKKLLPLQDFGGNDYRCPFCKFKIMDTSNWTR >gi|157101634|gb|DS480690.1| GENE 350 327353 - 328924 756 523 aa, chain - ## HITS:1 COG:MA2122 KEGG:ns NR:ns ## COG: MA2122 COG4096 # Protein_GI_number: 20090965 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Methanosarcina acetivorans str.C2A # 2 523 393 917 917 477 50.0 1e-134 
MKQVIEHRNRLSREKRWMQMDEDVDYVPPQLDRDIVNPSQIRTVIRTFKENLFTTLFPRR KEVPKTLIFAKTDSHADDIVQMVRDEFGEGNDFCRKITYSAENPESVLSSFRNDYNPRIA VTVDMIATGTDVKPIECLIFMRDVRSKNYFEQMKGRGTRTLGKDDLQKVTPSATENKDHF VIVDAVGVTKSKKCDTRPLERQPYVSMKELMMNVALGSKDDDTLTSLANRIIRLNSQMNN SERKQFKEVVGESAEKVAENLLNAFDEDVITEKAKADTKTETPSDEALKQAQKDLIAQAV APFMNPDTRDFIENVRRSHDQIIDNVNLDSVLFAGYDVQKEETADRVIQTFRTFIEENRD EIIALRIIYSESYANRAMAVEQLKALYAKLKSKGITVERLWDCYAIKKPDKVKKGVMAQL TDLISIIRFEMGYTDQLSPFADKVNYNFMKWTLRRNAGAVHFTEVQMEWLRLIKDHIASS LSILPEDLDLTPFDKKGGLLGFYEAFGEQYEEILEEMNIELVA >gi|157101634|gb|DS480690.1| GENE 351 328908 - 330092 533 394 aa, chain - ## HITS:1 COG:MA2122 KEGG:ns NR:ns ## COG: MA2122 COG4096 # Protein_GI_number: 20090965 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Methanosarcina acetivorans str.C2A # 2 384 6 385 917 424 59.0 1e-118 MTPEQKARLSIDKKLEQSGWTVQDLKQLNPMASLGIAVREFPTNTGPVDYALFVNGKPVG VIEAKPSNAGENITTVEEQSARYANSTFKWIKTEYSIRFAYEATDKLIRFTDYKDIKFRS RTVFSFHRPETLELLLASQDTIRNNMKHFPPLDETGFRKCQINAIQNLDNSFANNRPKAL VQMATGAGKTFTAITAAYRLLKYGKMNRILFLVDTKSLGEQAEREFLAYTPNDDPRSFSQ LYGVRRLKSSYIPSDVQICISTIQRMYSILKGEELDESAEETSFAEYATANTKAPKEVVY NAKYPPEFFDCIIVDECHRSIYNVWSQVLSYFDSFIIGLTATPDKRTFAFFDENIVSEYP REQAIIDGVNVGEDIFLIETEITKKRGTSDEAGD >gi|157101634|gb|DS480690.1| GENE 352 330089 - 330283 111 64 aa, chain - ## HITS:1 COG:no KEGG:Swol_1464 NR:ns ## KEGG: Swol_1464 # Name: not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 1 56 31 86 95 69 55.0 3e-11 MIERDMSNAQLMEQAGFSANIITRMKRGSYISLESVESICRAMCCGVDDILEFIPDEHEE VNKQ >gi|157101634|gb|DS480690.1| GENE 353 330539 - 332158 886 539 aa, chain - ## HITS:1 COG:SP1040 KEGG:ns NR:ns ## COG: SP1040 COG1961 # Protein_GI_number: 15900911 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Streptococcus pneumoniae TIGR4 # 28 437 4 413 559 110 27.0 
6e-24 MNSTATDRKIKIIPPKANMNTEKKKKVKKINVAAYCRVSTAQEDQETSYEAQVAYFTKLI TENPSWNLAGIYADDGISGTDMKKRDNFNAMMERCLQKDGDIDLILTKSISRFARNTVDC LSCIRKLKERNIAIYFEKEHINTLESTGELLITILSSQAQEESRNISENVKWGLKRKYEK GEMLVRRMFGYGKGTDGQLYIIPEEAEVVRLIYGKYLEGESLNSIARLLKEKGIKTIRGN IEWNVNSIRTILINEKYIGDAMAQKTFTTDYLTKTRKENQGELQKYYVENAHEAIIPREV FYKVQEELRQRANIYKKSSKKETESKGKHTGKYALSKITVCKECGCEYRRQIWSKYGEKK AVWRCENRLRNGTRYCKDSPTIEENVLHRAVLHAINQVLENKGDFVQTFRKNVVTALTHG TEDSEYAREKKKLQKEMAELIQQQAKQNGDKTAFEEHCQAITAQIEALEMKQIKAASRGE KGRKMEDIENFLDKTNCILTEYNDKLVRQLIENINVVNARKVEVVFKSGITVEEMLPEY >gi|157101634|gb|DS480690.1| GENE 354 332214 - 333080 441 288 aa, chain - ## HITS:1 COG:no KEGG:Dtox_1525 NR:ns ## KEGG: Dtox_1525 # Name: not_defined # Def: recombinase # Organism: D.acetoxidans # Pathway: not_defined # 9 281 9 293 299 135 29.0 2e-30 MGRKVTVYGYQFKNGILQADKEQSRFVQEIFNVYNSGISVSRLKDHIEGLEINRVKLNDM LSDKRYMGDENFPKIIEPELFEAVQQMKKERRKAIGKEQSYIYYKEYFLLGDKMKCGECG SEYHCYKHGDKQIWDCSKRIVKGRVHCRNQHIQEAQIKELFMQAVTKMKNHPEKIRKITI HKFKRNIRLQAVEHEIKLLKNDSSHNIDELLELIYKRASLQYEDSDDGGAEYYTNKIVDL LQQHKEQTEEKTFDKDLFESITQTITIYKDGRVMFTLKNGVNVTEMLP >gi|157101634|gb|DS480690.1| GENE 355 333083 - 334648 898 521 aa, chain - ## HITS:1 COG:SP1040 KEGG:ns NR:ns ## COG: SP1040 COG1961 # Protein_GI_number: 15900911 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Streptococcus pneumoniae TIGR4 # 22 473 11 477 559 107 25.0 8e-23 MNDTAKTTDINEEPMMKNVCAYCRVSTNIPHQKGSYESQLEYYENMIRSKADWNFVGIYA DYGKSGTKEKGRTDFMRMLEDCEAGEIDIICTKSISRFARNTVECIQVVRKLKEMHIDVY FEKERIHTLPEKSELLLSIYSSVAQAESESISSNQKWSIQKRFLNGTYILPNPAYGYCTN ANGLLELDQKQAAVVRFIFDEYLEGNGIWRIAKSLNEQKIPTKTGKAAWLGGAIYIILKN SVYTGDLLLQKTYSEDIVPFVRRKNNGEYRQVLIENDHEPIVTHEEYEAVQRMLKQKAKV KKAGKEEQLEEHSEFKGKVICGICGSSYNRQVKKGRTGKSNITWSCARRIQTKDLCENDI IKESQLEQLFITMWNKLSNHCDEILVALMKELEHLKATPMIQKQLEQLDKRIQEQKKQRE ILNHLASGEMIDSAFYMEQQNMISKNLKEYQREKEQYLQKSRQRKELIQTKELIKQFKKG PQYLEVYDKTLFQAIVKKIIVNPNELVFELTNGLKLTERRK 
>gi|157101634|gb|DS480690.1| GENE 356 334666 - 334869 148 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939186|ref|ZP_02086537.1| ## NR: gi|160939186|ref|ZP_02086537.1| hypothetical protein CLOBOL_04080 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04080 [Clostridium bolteae ATCC BAA-613] # 1 67 1 67 67 93 100.0 4e-18 MSKDQMQNEIRYQLSKELLTRMLFRNLITEEEYNRMNSLNLQTFQPAEAKLYEKNSRCVT EKQVSFA >gi|157101634|gb|DS480690.1| GENE 357 335521 - 335934 184 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939188|ref|ZP_02086539.1| ## NR: gi|160939188|ref|ZP_02086539.1| hypothetical protein CLOBOL_04082 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04082 [Clostridium bolteae ATCC BAA-613] # 1 137 3 139 139 217 100.0 3e-55 MCKEDKKQYFVQIGNKQVEVNQEVYLELYKDQLYEKNQQRKRRRNNILSFDAMNEENNDV YDFVPDMNSNAEEEAIHRATMHKIKEILKEIDPDNIISLNYFLEYSEHEISAILGIPKHT VNRRKRKILNSIKKLME >gi|157101634|gb|DS480690.1| GENE 358 336767 - 337153 76 128 aa, chain + ## HITS:1 COG:CAC3438 KEGG:ns NR:ns ## COG: CAC3438 COG3682 # Protein_GI_number: 15896679 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 1 119 1 119 124 87 38.0 9e-18 MKELPQISEAEFEVIKIVWKYAPINTNEVTEKLLEITNWSPQTIHTMLKRLVTKKVLTYE KQSRVFVYSPLINESEYIQQKSHSFLNRYYNGKITAMISSYLEDDTLPTTEIEELRSILT KMSQKGGK >gi|157101634|gb|DS480690.1| GENE 359 337839 - 338951 99 370 aa, chain + ## HITS:1 COG:SA0039_2 KEGG:ns NR:ns ## COG: SA0039_2 COG2602 # Protein_GI_number: 15925746 # Func_class: V Defense mechanisms # Function: Beta-lactamase class D # Organism: Staphylococcus aureus N315 # 114 367 1 251 252 194 41.0 3e-49 MILIRIIYWFNPIVFVALKEMCHDREIACDSSVLKMLEYKDYINYGNTLINFAEKISTTP FPFVAGLGGNMKQIKRRILNIASYENPTYWKRIKGLIAFFMTAILLFGCSPMLSTYASEE CYTWDTSSKKITLVDLSSYFDGYKGSFVLYDLQKDNWNIYDIEQATIRISPNSTYKIYDA LFALEENIITSENSFISCPQQNYPFESWNEDQTLFSAMNSSVNWYFQALDAKLGKSNLQS YIEQIGYGNQNINGELSSYWMESSLKISPIEQVELLKSLYFNDFGFTPENIQTTKESIQL 
FSDVNCTIYGKTGTGCIEEKNVNGWFIGFVESKNHTYFFATNIQAIDNATGSIASEITLS ILSDMNIYKL >gi|157101634|gb|DS480690.1| GENE 360 340586 - 341479 251 297 aa, chain + ## HITS:1 COG:BS_penP KEGG:ns NR:ns ## COG: BS_penP COG2367 # Protein_GI_number: 16078940 # Func_class: V Defense mechanisms # Function: Beta-lactamase class A # Organism: Bacillus subtilis # 44 294 53 303 306 294 56.0 2e-79 MKHNKIIISILTIFIILLSVGCSNKQENIPDSNENVNADYDSLFQELEEKYDATLGIYAL DTETNKEISYNADERFAYCSTYKALAAGAILEKYSIEELDNVIYFEEEDVLSYAPVAKDK VDTGMTIREICDAAVRQSDNTAGNLQFTLLDGPNGFKQSLSKIGDTVSEPSRIETELNDA VPGDIRDTSTPKQLAFNLKEYVTGDILSDDKKEIFIDWMSNNATGDELIRAGVPSDWIVA DKSGAGSYGTRNDIAIVTPPNKKPIFVAVLSKKAEQDAEYDNKLIADATKIIFDLIS >gi|157101634|gb|DS480690.1| GENE 361 341847 - 342293 384 148 aa, chain - ## HITS:1 COG:XF2725 KEGG:ns NR:ns ## COG: XF2725 COG0610 # Protein_GI_number: 15839314 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Xylella fastidiosa 9a5c # 1 136 868 1006 1007 130 48.0 1e-30 MLVEQYHEGNCEDKEILTSIRKAVDASMQLRSKKELIEAFINRVNVDTQVTTDWRRFVLT QEENDLAKIIMAEKLKPEETRKFVSNAFRDGVLKTTGTEINKLMPPVSRFGGSGRAKKKQ GVIEKLKAFFEKYFGLGITEMQSEKEEN >gi|157101634|gb|DS480690.1| GENE 362 342690 - 342755 73 21 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFVISTAVYVENTAEGFDIVL >gi|157101634|gb|DS480690.1| GENE 363 343789 - 344049 116 86 aa, chain - ## HITS:1 COG:no KEGG:Ccel_2737 NR:ns ## KEGG: Ccel_2737 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 11 86 10 84 86 79 53.0 4e-14 MIDKMKVYQDTELSSKAVSVYLYLCERASKETKSCYPSMKTIAKDLHLSLASVKRSIQEL ERSAYIKKENRFRDNGGKSSNLYFIE >gi|157101634|gb|DS480690.1| GENE 364 344042 - 346018 657 658 aa, chain - ## HITS:1 COG:AGl58 KEGG:ns NR:ns ## COG: AGl58 COG3505 # Protein_GI_number: 15890132 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Agrobacterium 
tumefaciens strain C58 (Cereon) # 337 585 291 542 554 92 28.0 3e-18 MENLWKLMPLLLIFGAFIVCMLLIDRKSLNIKSKTVGDGQHGSASFATPKEIDAAFQRVR YEPQVWRRKPPCDLPQGIIIGCNMMKFGKKKTVTAIVEKGDVHALMIGAAGVGKTAKFLY PNLEYCCASGMSFVTTDTKGDLLRSYAPIAKKYYGYDISVLDLRNPTQSDGFNMLHMVNH YMDEYLQTDSLVAKAKAEKYAKIISKTVMNSGGFDSANAGQNAFFYDSAEGLITAVILVI AEFCSPYALALEQAKRKQKALEQRKKEIAQMRKACIDKDIDFLPKAQTAILQKDVEAEIL FQPEEQKEEKGEERHIISVFKLIQDLLGPSEVKGKSQFKLLMERLPEEHKARWLSGAALN TSDQAMASVLSTALSRLNAFLDSEIEQLLCFETKINTEKFCRNKSAVFLIMPEEDDSKYF LISLIVQQLYREMLSIADEMGGKLPNRVMFFLDEFGTLPAIQSAEMMFSASRSRRISFVP IIQSLAQLEKNYGKEGADIIIDNCQVCIYGGFAPNSEAANVLSKTLGDRTVMTGSISQGR DKSKSLQMTGRPLMTPDELKIMPKDTFIVTRTGVKPMKTKLKLFFEWGIELNETYPMRQA VVRKVHYANKKTIEEAIAKKYGLPQNPLQQPVKRPMQEQDKPLCNISKTMKGVEKSYD >gi|157101634|gb|DS480690.1| GENE 365 346023 - 347606 579 527 aa, chain - ## HITS:1 COG:no KEGG:Rumal_1401 NR:ns ## KEGG: Rumal_1401 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 4 500 6 451 466 267 34.0 1e-69 MALYTQEQIDKANQTNLEEFLQQKGEHLRKTGSESVLIYKDSTGEHDSISVRRNRWYDHK NMRGGYPLKFMQEFYGMNFRTAMKELLNGEEPGLGRKRNKENEKNVRQTEKAVVQENQER ITCPSEKSEFILPEKDNNMKRLFAYLLQTRFLSKDVVKSFVEQKILYQEKEHGNVVFVGT EKDGVPKSACKKSTAEQTKSFRMTVAGSDCRYGFCWRGESSKLYVFEAAIDLMSFITLKN YDWKTDSFLALDGLSPKPLLQFLEEQKNIHEIFLCLDYDAAGIEACDKLNDILIEKGHDG EKIKREYPLYKDWNEQLKSEHGVEAILPQSHPKKAAYHVAVRKLGIMNANTESPYMKWRK QEYEKSGIYFYLEQIKRDFKQAEKLVKKGEADTKPLLASILRMADLSVCLMCEMKRDENL PELTMYQTTIQKLEKAYKPYLDKSRMDRRIQEMKEEMEHLKTLAVQDQPSLFVWAKNMAD MAMRAIIYVETEYPQELERKNSFVEQSKEEQKTEPDTTEQMELEMGG >gi|157101634|gb|DS480690.1| GENE 366 347575 - 348039 257 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939202|ref|ZP_02086553.1| ## NR: gi|160939202|ref|ZP_02086553.1| hypothetical protein CLOBOL_04096 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04096 [Clostridium bolteae ATCC BAA-613] # 1 154 1 154 154 285 100.0 9e-76 MIYLLTYDGYMKEEKIQEICPSAKLFMQCGLKDCTLEFHRFYETAMPTLEQGEEIVPAVI WKLEESDLKNMEAIYPNEIYQKTNWNLKTEKGVLDVMVFLMRETPFAFPKEKEVEAMEEA 
YEEHSFDYSSVENALDRAKDREEAYGTLYPGTDR >gi|157101634|gb|DS480690.1| GENE 367 348052 - 348903 386 283 aa, chain - ## HITS:1 COG:no KEGG:Closa_3128 NR:ns ## KEGG: Closa_3128 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 273 3 321 344 119 26.0 2e-25 MRWKKYGVEMDMTGITRKAVAEILAEQFDTKVVFCLSDNGYNVPDQNQRLWRILPSDSIK AEKYNGEKVVGANYMYQVKLLSPFLYENEFPMLEKALEQLELRGAIVNDSTKINLLLDVS CIENWEKYQINLENLYESKGELFQKALEIPFSQVADTDQDNGIISFPYFKSTINKKELLS DIQFAQIVSSFAENNRTVSQKKSENQNDKFMMRTWLVRAGMVGEEYKFARKMLTKNLEGN SAWQKMMEPTEIESKEVCNQAQSEEMMDNHVEEQVVSDLELEV >gi|157101634|gb|DS480690.1| GENE 368 348916 - 350241 566 441 aa, chain - ## HITS:1 COG:no KEGG:DSY0030 NR:ns ## KEGG: DSY0030 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 2 372 17 392 959 258 40.0 3e-67 MKKPSQVKNYVKYVATRPNAEKFEVNQSQREATEHQKRWIEKELKANKELRESCEQEYED YLQNPTRENAMELINKIAEEGLAREDSMENYVGYLAKRPRAERGKQGHALWNGSDKEIDL NQVAKEAAEHNGNIWTFVLSLKREDAVRLGYDNGNMWRELVRGKAPEIAKAMGIPVEHFV WYGAFHNEGHHPHIHLMCYSKKPKEGYLGSKNLMKLKSSFANEIFHDELYYIYEQKDEVR DELKNYFNRKLKQAMTMEYTDHPKVEALLWELSQKLKTAKHKKVYGRLEKENKILVDEIV KEISKDKKISQAYESWLGLKDDILSTYQNQERARRTLAEEPEFRNIKNMVLKSALILLEM NEVIAAEPEERNHEIVNPQESWQERTKKYHQMQAQKQFLLLMKHISHMIEEDYDRKKQQI HVDKKLMRKIMEKKEAHGQKM >gi|157101634|gb|DS480690.1| GENE 369 350286 - 350849 537 187 aa, chain - ## HITS:1 COG:no KEGG:HM1_0571 NR:ns ## KEGG: HM1_0571 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 43 185 36 167 175 73 40.0 3e-12 MRYKFYSPVQGIIDYDFNKDMEYDAYFDEYCIEDLEKKDFDYLTGEDITVYEEYINQMIE KDLKKELDEGMGLMQYFNYGSRENYKELLEKVKSAYPRIETVRNKAYGVMVCDIQKPLNE KEIEILKDYFAGQYSDGWGEGFAQRGIETLCGVLYLDFYSDDFYIETEDELKNSLESKQD NEMQFEM >gi|157101634|gb|DS480690.1| GENE 370 350846 - 351292 159 148 aa, chain - ## HITS:1 COG:no KEGG:DSY0031 NR:ns ## KEGG: DSY0031 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 17 148 78 210 210 95 35.0 
7e-19 MLLTTSYKNKMSEEYKKQRYSVWLSKDAVQKSDAAVVMEGLTNRSDFIEKAIHFYSGYLY QEKHMDFLSDVMMETVSGIMKTSENRLSKMLFKIAVEMAKLESMLAAINDMDEATMRRLH IHCVNEVKKINGILTMEEAVRYQRSDEE >gi|157101634|gb|DS480690.1| GENE 371 351264 - 351401 85 45 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160936491|ref|ZP_02083859.1| ## NR: gi|160936491|ref|ZP_02083859.1| hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] # 1 44 1 44 454 88 100.0 1e-16 MAIHSCRVVEATKKHQKSQYLNVLNHETEKERKYVVFASNNIIQE >gi|157101634|gb|DS480690.1| GENE 372 351537 - 352703 341 388 aa, chain + ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 388 1 388 391 350 45.0 3e-96 MIYVGIDIAKLNHFAAAISSDGEILIEPFKFSNNYDGFYLLLSHLAPLDQNSIIIGLEST AHYGDNLVRFLISKGFKVCVLNPIQTSFMRKNNVRKTKTDKVDTFVIAKTLMMQDSLRFM ALEDLDYIELKELGRFRQKLVKQRTRLKIQLTSYVDQAFPELQYFFKSGLHQNSVYAVLK EAPTPNAIASMHLTHLAHTLEVASHGHFGKDKARELRVLAQKSVGVNDSSLSIQITHTIE QIELLDSQLFSTELEMANLVTCLHSVIMTIPGIGVVNGGMILGEIGDIHRFSNPKKLLAF AGLDPTVYQSGNFQAHRTRMSKRGSKVLRYALMNAAHNVVKNNATFKAYYDAKRAEGRTH YNALGHCAGKLVRVIWKMLTDEVAFNLE >gi|157101634|gb|DS480690.1| GENE 373 353338 - 353682 245 114 aa, chain - ## HITS:1 COG:no KEGG:ELI_1911 NR:ns ## KEGG: ELI_1911 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 114 263 374 416 123 52.0 3e-27 MIVLDFLIGNTDRHFNKFGLIRNAVTLEWIGVAPIFDCGTSLWYNTQERLIKPLSPNLPA KPFKKTHREQIKLVKDFSWLDMKKLKGMEEEMEEILSQSPYISRERRAVLCDAF >gi|157101634|gb|DS480690.1| GENE 374 353779 - 354498 22 239 aa, chain - ## HITS:1 COG:no KEGG:ELI_1911 NR:ns ## KEGG: ELI_1911 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 21 239 9 229 416 205 46.0 1e-51 MTDTELMGMILKKPEVRIVEFTLMHREIKVVDIEMDSFYHIKSIKNIYAAAHMPVGTMQK 
QDADQQALAKWWSRRTIPKGRTRLQEVLDIRNILTSKELLKDSFGLSLSDQYWLKPKDSS LSWEQIQFFDNDFSEQFGEMMLGNLEITECFDTMTPDVVLEGRLEKAWKIRDGKRVLIKG GSNPYQQEPLCEVIASGIAERLCIPHTKYTLLWEHEKPFSVCQDFITSETELVSAYHIM >gi|157101634|gb|DS480690.1| GENE 375 354593 - 355210 78 205 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939212|ref|ZP_02086563.1| ## NR: gi|160939212|ref|ZP_02086563.1| hypothetical protein CLOBOL_04106 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04106 [Clostridium bolteae ATCC BAA-613] # 1 205 1 205 205 357 100.0 3e-97 MDIFQYFLSYQSENTSAFSDIFRTAKEFHQLLGQKSYLLNHYLSMFFRLIVEMDFCALED EIDQSISDLQKKIQDDLENNDSSIPRFDCQETSTEMELCWTALADSLLEQALKEFYHEYM HSYHLSVDLVDFRQTEEKLIEYLGKSAWEKFQQQLINCFLHCSLMQLFRQGIVIEITKRF LSRDLETDQEVFRLFITHFIHNKSD >gi|157101634|gb|DS480690.1| GENE 376 355254 - 355940 138 228 aa, chain - ## HITS:1 COG:MA1701 KEGG:ns NR:ns ## COG: MA1701 COG3153 # Protein_GI_number: 20090553 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 9 216 6 214 217 148 35.0 7e-36 MSDAMPKSSVKENEVGYMNIVVRNEETKDYRRTEEVAREAFWNLYFPGAAEHYVVHQMRS HTDFIPELAFVIEVDGIVEGAIFYTHSKIVTEQGEFPTISFGPVFISPKYHRQGLGRNLI THSIQKAKEMGYSAILILGYPYHYEPYGFCGGKKFGIAMADGHFYKGLLVLPLKENALHG KSGYVVFSDALEAEEKDILNFDATFPQKEKKVLPCQQEYETACAMLDE >gi|157101634|gb|DS480690.1| GENE 377 355900 - 356979 264 359 aa, chain - ## HITS:1 COG:alr8077 KEGG:ns NR:ns ## COG: alr8077 COG1162 # Protein_GI_number: 17227451 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Nostoc sp. 
PCC 7120 # 28 351 24 353 353 279 45.0 5e-75 MEDNELKQLQKYGATERFYTESKLYPDYQLARVIAQYRGKYKVITDQQEFFAEISGKFRY TTEELVQFPTVGDFVMVTVEKEFASIHQVLTRKSLFLRKAVGVSNQSQTVAANIDTVFLC MSLNKNFNLNRMERYLSIAWDSGATPVIVLTKSDLCEDLSERISQIESISCFSDIIVTSM FEDDISAKFLSYFSKNQTCAFIGSSGVGKSTLINRLLGVNTIATQEIAKGDKGRHTTTGR EMFLCPLGGVVIDTPGMREMGADSTDLSKTFSELESLAYQCKFRNCTHTNEPGCAVLAAI ERGELDIRRLENYRKLQHESSYDGLNSKEIEIKKCERMFKDVGGMKNVRRYAKEQRKRK >gi|157101634|gb|DS480690.1| GENE 378 357023 - 357094 96 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MESVFMMNTNKKDIVKMEGGKVQ >gi|157101634|gb|DS480690.1| GENE 379 357184 - 357456 232 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939216|ref|ZP_02086567.1| ## NR: gi|160939216|ref|ZP_02086567.1| hypothetical protein CLOBOL_04110 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04110 [Clostridium bolteae ATCC BAA-613] # 1 90 1 90 90 96 100.0 8e-19 MKLSAIQVKFETSKYLALKRYAAKKEVSLEQELEETLNRLYRKLVPPDVREYIEEGEKME KVSKPKAKNEEKESGEEQNQAEQPQDIQSV >gi|157101634|gb|DS480690.1| GENE 380 357478 - 358194 521 238 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939217|ref|ZP_02086568.1| ## NR: gi|160939217|ref|ZP_02086568.1| hypothetical protein CLOBOL_04111 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04111 [Clostridium bolteae ATCC BAA-613] # 1 238 1 238 238 432 100.0 1e-119 MNQERKPHFESLMAKLENFREEEIRVLQGYLEPVLEVREKILSSFSNEKASSRFSVGEIS DELMYVNLLEDLLQTDERISECRMDFDACDMILYHKQPEHSYDSMKTTEQKYEGVAAMNL FYRELGDAMFYYNPDEPNKGCVVIEKIISLSDEDFWFFGENIKQEASFITDNEELQYFDQ QMTLHCLFIQKEDAEFGVLISHDQKSGEVYSGYLPNLDQFQEIGCEISEKEDYVEPQM >gi|157101634|gb|DS480690.1| GENE 381 358213 - 358566 361 117 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2445 NR:ns ## KEGG: Dhaf_2445 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 2 103 1 102 192 144 66.0 1e-33 MMKEITVGELKKMTDKEGLILQGCGGDLKEWEDGVNELLTESGILLEGDTFKNVYVFENE GLTNLLFDMDDVKLDVGKLAMWRINTHQQFGGTWLSDYLANKFEMGEELKSSMEPEL >gi|157101634|gb|DS480690.1| GENE 382 358582 - 359532 778 
316 aa, chain - ## HITS:1 COG:BS_yddH_2 KEGG:ns NR:ns ## COG: BS_yddH_2 COG0791 # Protein_GI_number: 16077564 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus subtilis # 195 316 4 123 124 117 47.0 3e-26 MASPMAVAVMTKKALELAEDKRVRTLLASIVAGIVIVMLIPLLVMVSIFNTQAGFSQEVA RIVFDGGPIPTDIDAELSKAMEEMIDAFEELNQTIESLEEDGFDDIKVKSFFYILYFTKD LTDFDEEFYVGFVDCFVGEKEEDEIYEELEDYLSYEFTETEKVEIRNLYLFIKYGYSATD KITGIPGEAFNDATFAQLMQEATKYIGFPYQWGGSTPETSFDCSGFVCWVYTHSGVHNLP RTTAQQIYNQCTPVSKDEVKPGDLVFFTGTYQSSNPVTHIGIYVGDNQMLHCGDPIGYAN LGNSYWVKHFYGFGRL >gi|157101634|gb|DS480690.1| GENE 383 359555 - 359890 342 111 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2422 NR:ns ## KEGG: Dhaf_2422 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 105 1 105 267 70 36.0 2e-11 MMIDARDEDFKLAEIDNVVVIFTNARIDRDTVPFDLYCYDVRESECFSGDPVTLEKVVSI NHWGTILSKKPFPLEDDAYYPLKDGINYLEETCTMDEFMEMNPEDEMDVMS >gi|157101634|gb|DS480690.1| GENE 384 359910 - 361721 770 603 aa, chain - ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 201 543 449 793 853 110 25.0 9e-24 MKISRTKKTKAYIPKTEEEKEQVRIKEFLDMCAPSVLKFYPDYYICGSSFRSTWVVREYP TDTEEQALLRHLGDRSGVTLRVYTRHVTPAEERKIISNAASRNRMNRSSGNIQKEIIAES NLNDVMNLIAQMHRDREPLVHCAVFIEISTPTLEKLKEMQAEVEAELTRSKISVDKLFLR QKEGFESTQPCGNNAFHEQFERVIPASSAANLYPFNYSGKADPQGFYIGHDKYGSNLFID FQRRTDDKTNANILILGNSGQGKSYLMKLLLCNMRESGMDIVCLDPELEYKDLAVNMDGC YIDLMEGEYIINILEPKRWEEDSGSVLPRHISFLRDFFRAYKDFTSAEIDVLEIFLEKLY ESKGIKEDTDFSGLKHTDYPVLSDLYHLAEQELNQYEERKGNIYTRETVRDLCLKLKSIC IGADSKFFNGHTNITSNRFICFGTKGITDADTAIRNAMLFNVLSFMSDALLSKGNTAAFI DELYLFLTNMTAIEYIRNAMKRVRKKGSAVIIASQNIEDFNRADVREMTKPMFAIPTHQF LFNPGNISKKEYMEMLRLDDCLFDLISSPLRGVCVFRHGSEVYHLVVKAPQYKEDLFGSA GGK 
>gi|157101634|gb|DS480690.1| GENE 385 361718 - 362311 408 197 aa, chain - ## HITS:1 COG:no KEGG:CKR_3422 NR:ns ## KEGG: CKR_3422 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 8 195 18 203 203 176 50.0 7e-43 MTRMRIGKKKSTREEIGARNIDGVGVETYGNEYLVYFLVQPDNIAVLSETAVKTKIIAMS SVIKGLDSIEFSCINSRENFDQNKMFYKERLEEEESPKVREILEKDLIHLDRVQIQTASA REFLFILRFRNYNPETNEVQTGISRMTKLLKDAGFSYCQASKEDIKRLYAVYFVQNITQV YFDDYDGERFVKEDEYR >gi|157101634|gb|DS480690.1| GENE 386 362247 - 362573 212 108 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1878 NR:ns ## KEGG: LM5578_1878 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 88 1 87 87 79 45.0 3e-14 MYIYPENLKGKSTMFLWTLRDMLIIIIGAVLSAVFAAALDWIVPGVLIGVFAFLTLRIDN ELNISDYIRFAFRFFVTTQQIFYWRFADDKNENWKEEIYQRRNRRQKY >gi|157101634|gb|DS480690.1| GENE 387 362591 - 362956 324 121 aa, chain - ## HITS:1 COG:no KEGG:TepRe1_2440 NR:ns ## KEGG: TepRe1_2440 # Name: not_defined # Def: hypothetical protein # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 3 94 71 162 184 89 48.0 3e-17 MKLVYICSPLRGEPEKNIAKANEYAREEAMKGNCAIAPHCVFTQFLNDEVPDERWLGQEM GKALLCRCDELLVCGSIISEGMREEIKIAYENGITVLGRDLSIADIEDAVFGETEQCCEM V >gi|157101634|gb|DS480690.1| GENE 388 362971 - 363276 223 101 aa, chain - ## HITS:1 COG:no KEGG:Tthe_0786 NR:ns ## KEGG: Tthe_0786 # Name: not_defined # Def: hypothetical protein # Organism: T.thermosaccharolyticum # Pathway: not_defined # 8 75 9 76 76 81 54.0 1e-14 MKWVDNGRRMAERAKELFPPGTRIQLIHMDDPYNPIPDGTRGTVKFVDDMGTVFPDWDNG RGLGVVYGEDSFRKLTPEELLEEQQKEDINQDTDMGMNMGM >gi|157101634|gb|DS480690.1| GENE 389 363308 - 364168 713 286 aa, chain - ## HITS:1 COG:no KEGG:Ccel_2764 NR:ns ## KEGG: Ccel_2764 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 286 1 286 286 279 53.0 1e-73 MFGLDFAVDGVLDQFCDWIYGKLITFFGEFFSMINMMGAELFELDWIKAILLFFYYFAWA LYIVGIVVAVFDTAIDARRGKGSFQDLALNIIKGFFAVSLFTVVPIDLYVFCINLSNELI 
GAIAGMSDSPGKLGAIATMVLGSFETPGANVVVSIVFVILIGYAVIKVFFANLKRGGILL VMMATGSLYLFSVPRGYSDGFVSWCKQVAALCLTAFLQTIVLVAGLVTYNANMLLGIGLM LSSTEVPRIAQNFGLDTSMRFNAMSTVYSVNSVVNMARNVGRIASR >gi|157101634|gb|DS480690.1| GENE 390 364193 - 364729 421 178 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939227|ref|ZP_02086578.1| ## NR: gi|160939227|ref|ZP_02086578.1| hypothetical protein CLOBOL_04121 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04121 [Clostridium bolteae ATCC BAA-613] # 1 178 9 186 186 292 100.0 8e-78 MGCWGVKAFESDEGLDALEWIRNHIPEDGCLRLKELLEQLKLDEWCRPPAAENGESHSST MLIAELMESFQNGTIEEWEYLPNNPFEKVVSFLVEKESVKEMCEYLSKTLESARKNTQDN QWNGWFEETNWNKWQEHMESLIETMRKILEQDGEVLDLIPQTKQEISEEHIEGGMNME >gi|157101634|gb|DS480690.1| GENE 391 364750 - 365097 312 115 aa, chain - ## HITS:1 COG:no KEGG:Sgly_0055 NR:ns ## KEGG: Sgly_0055 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 4 115 3 112 112 109 54.0 3e-23 MAVRWKRVFLLTVVVLMAVIIFAPPVFAAPDSGQVSTAIESTWKTAATQIKTVTNNVIFP VVDCVLAILLFVKLAMAYFDYKKHGEFDFAPVAILFFGLVFAITAPTYIWNILGI >gi|157101634|gb|DS480690.1| GENE 392 365158 - 370944 3177 1928 aa, chain - ## HITS:1 COG:lin2282 KEGG:ns NR:ns ## COG: lin2282 COG4932 # Protein_GI_number: 16801346 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Listeria innocua # 1004 1624 1013 1639 1806 218 31.0 1e-55 MCKTRRIIAFLLAMLIAIPTISLSTVHAAEEGSDITKITVVKGKDINDIKYDGRNVLHSS AWWYDEEGNKNPAFCVDPNKAGPGEYFAGKYDLNVAGAETNEKIAAIINNSIPYKTYQEL GVNSEEEAYAATKAAIWCVIGVSQYTDRGKWNAPDKPQVTALFNKLVDLALNNPEPVKAA VYGTIKVDEEPVEEGNYFVQRFQVKETSNSGKKITKYKVELTGDYPVGTIITGEDGIEKT QFKGDETFAIRIPKTSVPVGGSVEANVKISANLYSNVILIGKPLNGLDGKVQDMEIGMPN QPITINAKMVIGKSTETDESGDGKLKVIKLDATDNATPLAGVTFDCYNAQGHLIDTGTTD ESGIWEPNITESGTYTVVERSTNNRYQLTEPTTLVITVEPDQTATATFRDYPPHIVVTEK EDAETGEPIPGVQYEIMQIDGKGAWRATGKTDAEGKITWEDVPDGTYLVREVSTVEGYIL DQTPQYVTVRNGQAAGLKFLNSKFPGLTIIKIDQQTGEVITDPATFKVEQVDGSYTTTVT 
TENGVATLKNVPVGTYKITEVTAPEGYALGDCPQTVYLGKDKNVQVVMKNLKKPVLTIEK IDGQTGDPVPGTKFEIKKSDGTLIGTVTTGADGKVTVGMKGGELGYLDPDVYTVTEVFVP EPYVLTGEHQDIRLNAGDSKSLLFANLEMPSITVEKYDEETGEKLPGAQFAIYEQADTAR PVVEGMTDENGKFTSGYIKPGTYVVKELNPPPGYMFSDKTSPDRVIVAKPGDGEIIVKVD NIKLPELTIKKIDSVTKEPIPGVVYEVKEVDDTSVQPTTATTDENGMIVIPGLKAGTYEI TEISTPKPYILNDTPQRVKLEAGDHKTLMFENIKYPTLIIEKTDYTTNKGIPNTTFKIEH EEDDGAITTVGTFKTDENGRIKLPYVEPGWYIITETIPAQGYQKPTNPVTRIYLNPGDNS YLKDDNIAGGGGTGGIGGGTTTDGIEITSGNDYEVVDGIVNYPLNSLVIKKADANTGEML DGATFEVFRITGETSGQNGKLICTATTDHSGVIVITGLEAGAYAVREAKAPNNYIIAETD MQTVNMKADGTSVVEVVFRNYPYGSLLITKVDALTNKPLSNATFQVTTGDGTVVGNSNGM YTTNSDGEILIPNLKPGSYVVTERIAPEGYACDTKPQTIEIGTDGATYKVHFQNQPMCSL VILKKDADNGNPLSGAQFKVTTSKGDVIGRNNGIFTTDSNGSITISYLAKDSYIVEEVKA PDGYVLEEQSKTIALDYGKTYTLEFTNKKMTSLVVKKIDAVTGEPLPGAKFFVEKQNGEH VGEYTTDNTGTILLPTLDPDWYVVRETKAPEGYILDETPKTVEVKTNVPTVVTFDNKPLS GIKIVKTDSETGEPLEGVSFSVSKMNGEKIGTFKTDKEGMVYISDLEDGYYTVTETEGLE GYHWDKEPKTVEVKSGKQTILEVENQPYSGLVIEKTNSRTGDPIEGVEFLVTKFNGEQIG YYETDESGLIVIEGLEEGTYLVKETKEAKGYKLDSEAKEVEIKDGKRTTLKVENDPMSSV LIHKIDSVTGEGIPGVKFLLYDSDNEPIGQFETDDEGYIWIRKELPEGKYKLRELEPAEG YISDNSEKTFYVQKGRTTEIEWENTPEVGQIVITKRSSEFNELTGLPAGSPLSGAVFEIY NTTGNLVDKISSGSNGVAASKGLPVGVYTIKEVSAPRYYALNDKTLLAEIRHNGDIVRFE VLNSSISLNLTVQKKGPNAASPGQTIQYDIYEVQNGSTGTLENFYIHDRIPTDATRALKI VTGTYSERMYYKITYKTNYRDYRVLAENLLTKNSYEYSLHPNVLGLANGEYVTDVRLEFP KASPGFKMLENMSVFCQVMPNMPKDYRIVNRADVGGRYGNEWESAKTSWNTTVWTVNTPP VTLPKTGY >gi|157101634|gb|DS480690.1| GENE 393 371039 - 372097 485 352 aa, chain - ## HITS:1 COG:BH1803_2 KEGG:ns NR:ns ## COG: BH1803_2 COG2340 # Protein_GI_number: 15614366 # Func_class: S Function unknown # Function: Uncharacterized protein with SCP/PR1 domains # Organism: Bacillus halodurans # 229 350 8 129 132 80 37.0 5e-15 MKKRLAAILAATMVLGTVPCAFAADQITVTVDGEKVNFEDQQPVNIDGRIMVPIRDVAEK MGWEVEWFTYYGDTVVDGTFQIEHSAIFTKPIASTDRYMAGYHSSLNIEDKTKTKSVWGA THILTQGEDLLVSAPAKVINDRTLVGIRDLADCMYADAQWDAETKTVAIKTTPTEQLPQY NEILANVASVKNQEQQEMPETKPTLTIEEEQRQRTEKYAAEQNAKRDEFAEEVIDRINQE 
REKNGLNKLQMNDALMEAADVRVKEIVTNFSHTRPDGRKAKTAANEAGFEGNYIGENITG FANTPKWAVDNWTESDGHRENYLNPEYTYTGVAYLYDKDSEYGSYWVQIFAR >gi|157101634|gb|DS480690.1| GENE 394 372253 - 372978 510 241 aa, chain - ## HITS:1 COG:CAC0204 KEGG:ns NR:ns ## COG: CAC0204 COG3764 # Protein_GI_number: 15893497 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase (surface protein transpeptidase) # Organism: Clostridium acetobutylicum # 117 238 66 191 194 71 37.0 2e-12 MGKSNINIQGEIPDYLRDIYDVSPQTVYDSEYGANVVEPNAPSINHTTYAPVNAPINPYT QSPSSLISGAAYGGGTSGSYTANISSDGYVTNEIPEQVYDEVLQYPLTTIEQVRNSDGSI GVLKIPEIGLTVTAYDGDTFTAMKKGVGHLASTSCWNGNIGLVGHNRGTNDYFGKLKKLD IGDEMTYTTKLGTRTYVVKSITKIADTDWSKLQYTSDNRLTLITCVEDVGDQRLCVQAVQ K >gi|157101634|gb|DS480690.1| GENE 395 373133 - 374287 910 384 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939233|ref|ZP_02086584.1| ## NR: gi|160939233|ref|ZP_02086584.1| hypothetical protein CLOBOL_04127 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04127 [Clostridium bolteae ATCC BAA-613] # 1 384 1 384 384 716 100.0 0 MKKAILCLIAAVSLLLPVSAVASPVSVETPIQEENATENTLPVPISMERKNIDGVEYLVK VYDLPANTPQEQLVEEDFVLEDFLFSYIAADKQLNENKDTKEVTEDAKAEGGSKNLEDVI KLFPATKTYDKDGYQGTLTLDTGSIVTEVAGYTTKNYTVSATKEYPGLMYADPSYVAQST VKDGYTLPLTNVSWTVMGTSLAGDTLVPTEYKAVATYSKTFSSQVPTGYVSTARYKGNVT KTTADTATFTITYAGKLIENGMPPVLKAVTGIIAVLLLGCAIAMLLLYLKNRQGADVYNL IDKEYICIGRQNVDPKQPVIDLNEFEDLIQSNVFQFVLDKKTTKALFGRNISVTYKDMTV KHRVNEKKGEYRFELDLGGVLDAE >gi|157101634|gb|DS480690.1| GENE 396 374284 - 374463 166 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939234|ref|ZP_02086585.1| ## NR: gi|160939234|ref|ZP_02086585.1| hypothetical protein CLOBOL_04128 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04128 [Clostridium bolteae ATCC BAA-613] # 1 59 1 59 59 68 100.0 1e-10 MDIEKALRQYQKTKDFSDDCQRKKTELFADFHVLPDDLTIERMVLEEKRKELEERRESV >gi|157101634|gb|DS480690.1| GENE 397 374480 - 374788 344 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939235|ref|ZP_02086586.1| ## NR: 
gi|160939235|ref|ZP_02086586.1| hypothetical protein CLOBOL_04129 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04129 [Clostridium bolteae ATCC BAA-613] # 1 102 1 102 102 182 100.0 5e-45 MTINCFKTVSAKGLIYLPVALRRFVGIHEDDVVQITGRGGIILINKAEVTAENPVEVLQA DFQKEKEDAEKERDQIKNRIKQLLSEGKSLDEVLDEVLRFLF >gi|157101634|gb|DS480690.1| GENE 398 374804 - 375766 742 320 aa, chain - ## HITS:1 COG:no KEGG:Dtox_1475 NR:ns ## KEGG: Dtox_1475 # Name: not_defined # Def: D12 class N6 adenine-specific DNA methyltransferase # Organism: D.acetoxidans # Pathway: Mismatch repair [PATH:dae03430] # 2 320 1 301 310 296 45.0 7e-79 MVNSMIPWIGGKRLMREFLIARFPPHYDKYVEVFGGAGWVLFAKKPERFEVYNDANSNLT NMFHVVKHKPMSFVKELGFLPLNSRAEFDLMLDWHRKQDFSLPYQTEEMALAKIYLSPID FREYKELITTQAELGDVRRAATFYKLIRYSYAAGGNSFNGQPVNIAQTYRTIWLANRRLN ENGVKSDSEILMAGGNPGKGVIIQNKSFEVIIALYDSPMTFFYLDPPYYGTEKQYEELFT LELHYLLREVLGKIEGFFMLSYNDCAFIRELYKDFYITPFERLNSIAQKYTPGGMFKELV ITNYDPNLRLNSQPKQLTLL >gi|157101634|gb|DS480690.1| GENE 399 375817 - 376749 441 310 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939237|ref|ZP_02086588.1| ## NR: gi|160939237|ref|ZP_02086588.1| hypothetical protein CLOBOL_04131 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04131 [Clostridium bolteae ATCC BAA-613] # 8 310 1 303 303 575 100.0 1e-162 MNEVIKQMQKRIVEGIQLPDKKQQNQLRKDVEQALQESIKDLAEHGGGPQAEDRFWKRLE HIAAKEPKNRIIYQHYQDISYFWQYTENWYSCYPPNLSEEERDKALGYAEFKRKNVEDRE RYMTVYFPCGRIGLKGFFVDNKLPESCGHTVYGVYCENEKTYTIGEMLEKLPKEEMDVFQ LNELFCRFDKLPEEMQRVYDLTLQLKEPRNVKGMIDLMEHLSEVCVVNGIQNERQLGEFL VENELFDVSFPDEVLPYLDYAKIGREHMQTHQGKLIQGAYVEDTSTDNVNQFSQESNEEN DQSDDYEMKL >gi|157101634|gb|DS480690.1| GENE 400 376785 - 377123 379 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939238|ref|ZP_02086589.1| ## NR: gi|160939238|ref|ZP_02086589.1| hypothetical protein CLOBOL_04132 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04132 [Clostridium bolteae ATCC BAA-613] # 1 112 1 112 112 168 100.0 1e-40 
MIRLYVASEKLVKEEKDICVRLVLPVEENEIWIALQKAEMESLDDCEISDVECDVEEAQE FLCSLEISKANIFELNVFAGLLSALPEDELMLYRKKLKDQQPKSLEEAIYEI >gi|157101634|gb|DS480690.1| GENE 401 377174 - 377839 441 221 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939239|ref|ZP_02086590.1| ## NR: gi|160939239|ref|ZP_02086590.1| hypothetical protein CLOBOL_04133 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04133 [Clostridium bolteae ATCC BAA-613] # 1 221 1 221 221 412 100.0 1e-114 MIQVTFCSKQLFEKKNEVDVSLRLPADAETMKRYLDAAGVRDMDDLLISDFSIHELYAWS CGEYQEWDKADLNELNYLATRLEELFEEDQDELYCDLLEMHQPQSAADMINLSYGFDKVS YDPAIRSNQELGEHLVKNNLFVVPEDMKSFVDYNKVGQLYLYTMKHSKDAGMMQVDENGC ICDSHKEQERYGYYDGVNVPEEYMILTEEVFEDQAQQHIQS >gi|157101634|gb|DS480690.1| GENE 402 377842 - 378942 141 366 aa, chain - ## HITS:1 COG:SP1336 KEGG:ns NR:ns ## COG: SP1336 COG0270 # Protein_GI_number: 15901190 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Streptococcus pneumoniae TIGR4 # 1 303 2 268 407 169 36.0 8e-42 MYFLDFFAGIGLFRLGLEQAGWTCKGHCEIDKYANMSYQAMHHIKEGEWFEEDITKVSAQ SLPDVDLWTGGFPCQDVSMAGERRGLYGERTGLFFEFVRLLRERGYHKPRWLILENVKGI FSSGGGWDFAIVLCELAALGYGVEYALVNSKDYGVPQSRERVYIIGDLTGRSTGKIFPLR SPGKTAPAQIIGGPQGSRIYDTGGVSPTITSGTGGLGAKTGLYFVAKDEQSGLVTKPYSG TIDASYGKGLGCRQHRTGILESRLPRALLNPGKEKTRQNGRRIKEEDEEMFTLTASDIHG ILLDSRIRRLTPLETFRLQGVPDAYFERAASVCSDAQLYKQIGNAVTVPVVRAIGEKIAE YWKGEK >gi|157101634|gb|DS480690.1| GENE 403 379132 - 379533 266 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939241|ref|ZP_02086592.1| ## NR: gi|160939241|ref|ZP_02086592.1| hypothetical protein CLOBOL_04135 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04135 [Clostridium bolteae ATCC BAA-613] # 1 133 1 133 133 237 100.0 2e-61 MEKASHKANVQCAEDIGKAINSYFDGFHYHTGFEKELIETYGMEQVKYVVAYNIQQKLND GRISKENKTWALQSQMNQGEGNPKPEYTIHTHSGLLDLFANSIREQQLIHEAEENMKGGD ADEIEETDLDMQI >gi|157101634|gb|DS480690.1| GENE 404 379726 - 380463 327 245 aa, chain - ## HITS:1 COG:no KEGG:Ccel_2964 NR:ns ## KEGG: 
Ccel_2964 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 61 233 13 183 190 64 27.0 3e-09 MRKITDLRGIKDTAKVFLHMNIEKTKFSPLVIKHPFTDSAMVCLSQTDGEIAFANIMEDT KAFTLWKEQVEKQIDTAEDVFGVYHLMTKSYLLAFLKYTESYLSREDFSKMLADIWIRTE APNLDPNFKQKELLDLFRDSKQEEMMTEDEIETLRSLPETVSVYRGVTSYNAGKVKALSW TLDQKVAQWFANRFGENGTVYEAEISKEHILALFKGRNEWEVIVEPDHLLQLSEAMEENM EEPQL >gi|157101634|gb|DS480690.1| GENE 405 380473 - 380799 300 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939244|ref|ZP_02086595.1| ## NR: gi|160939244|ref|ZP_02086595.1| hypothetical protein CLOBOL_04138 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04138 [Clostridium bolteae ATCC BAA-613] # 1 108 1 108 108 176 100.0 6e-43 MKNMKTEPSEKTIIYRTPGDPIEITDEMLENAEINPNELVDIILQKGCIIIKPTSVLGRL PEDLLLLYEELGFSREMVECVFTKYAEEAGGFDALVEQIKKEKNVALW >gi|157101634|gb|DS480690.1| GENE 406 380815 - 381186 325 123 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1871 NR:ns ## KEGG: LM5578_1871 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 122 1 122 122 128 56.0 9e-29 MKEKEIKVLRVKPHEHPEVYMLKNTLEAMQEAVGGYIDIVGLDDNVCILLNDEGKLIGLE GNRRIGSDIIVGDFFVCGSDEEGNLTSLSEEALDTYTKIFYEPQEFTKEEIEETTVIEFY TFE >gi|157101634|gb|DS480690.1| GENE 407 381592 - 382320 1136 242 aa, chain - ## HITS:1 COG:CAC2295 KEGG:ns NR:ns ## COG: CAC2295 COG0217 # Protein_GI_number: 15895562 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 239 1 237 246 244 56.0 1e-64 MSGHSKFANIKHKKERNDAAKGKIFTVIGREIAVAVKEGGADPANNSKLRDVIAKAKANN MPNDTIDRGIKKAAGDANSVNYETLTYEGYGPNGVAIIVDTLTDNKNRTAANVRSAFTKG GGNVGTPGSVSYMFDKKGQIIIDKEECDMDPDELMMLALDAGAEDFSEEEDSYEVVTAPE DFSAVREALEAQSIPMMQADVTMIPQTWVELTDEDDIKKLNRTLDLLDEDDDVQAVYHNW DE >gi|157101634|gb|DS480690.1| GENE 408 382496 - 383848 1586 450 aa, chain - ## HITS:1 COG:CAC0847 KEGG:ns NR:ns ## COG: CAC0847 COG0534 # Protein_GI_number: 15894134 # Func_class: V 
Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 447 6 452 459 302 39.0 8e-82 MFRDSRFLRKALMIACPVALQGMLNTVVNLVDTLMIGTMGATAIAAVGLANKFFFVFSLL VFGIVSGSGVLAAQFWGNGDLRSIRKVLGLSILLALTGALFFVVPALVSPQSVMRIFTTS HDSVELGSKYLKIAVLTYPFLAMTNVYVAILRAVNKVIFPVISSCIAIVINICLNYVLIF GKLGMPAMGVSGAAVATLIARIIEIVLILGYVYGKRLPVACGLQDLFGWSRLFVSRFFGT SAPVIANEFMWGLGTTMYSLAYGRMGDEAVAAITIATTIQDILVVLFQGLSAATAVILGN ELGAGHLKRAEKYAVHFFILQFIATVGVAVVLIGIRWPIIGLYNITPEVARDVSLCILIF AAYMPAKMFNYVNVVGVLRSGGDTKMCLFLDTSGVWFIGVPLAFLGALVWRLPIYTVYAL VMTEEVYKAVLGYIRYRQKKWLRNLASDVG >gi|157101634|gb|DS480690.1| GENE 409 383838 - 385313 1511 491 aa, chain - ## HITS:1 COG:TM0845 KEGG:ns NR:ns ## COG: TM0845 COG1253 # Protein_GI_number: 15643608 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Thermotoga maritima # 127 477 101 452 455 210 35.0 5e-54 MDDGYPPVSILILIGFILLEAVFYGFGSAIQNVNEGKLEEEAAGGNKKAARLLEIVNRPV RFVHTIQVTTHLMGMIIGAAILPAMIGMVERRFLLGKVLAWVDPAQVMAAAGTPWYRNPS WWQWAGILALVTIFVLVLVISFGIIIPKRLAAKEPEKWGYHMLPVVLFFAGLFLPLTRLI TLISSLVLKLFGVDLADDDGNVTEEDIMSMVNEGHEQGVLEADETEMITNIFELGDKEAA DIMTHRTNMTVLDGSMSLKEAVDFILNEGVNTRYPVYGEDIDDIIGILHMRDAMTFAEKE ENKDRMVMDIPGLLREANFIPETRNIDTLFKEMQSRKIHMEIVVDEYGQTAGLLTMEDIL EEIVGNIMDEYDEEEDFIQAMEDGTFVMSGLTPLEDVMETLDIELPEEDSDTYDTLNGYL VSRLDRIPQEGENPQVEFGGWLFEVERAGNKMIESVRVVPVPESGKAADRGKDSVEEKYD AVNREYEGDVS >gi|157101634|gb|DS480690.1| GENE 410 385494 - 385841 348 115 aa, chain - ## HITS:1 COG:BS_ydcE KEGG:ns NR:ns ## COG: BS_ydcE COG2337 # Protein_GI_number: 16077533 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Bacillus subtilis # 1 113 1 113 116 156 69.0 8e-39 MIIRRGDIYYADLRPVVGSEQGGVRPVLIIQNDVGNKHSPTVICAAITSRMNKAKLPTHV ELPTRRCAMIKDSVILLEQLRTIDKQRLKEKICHIDEELQQKVDEALMISLELRT >gi|157101634|gb|DS480690.1| GENE 411 385990 - 387180 1360 396 aa, chain - ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # 
Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 7 385 8 385 386 319 43.0 6e-87 MNSYSRVYASIDLDAVESNMRAMKDSLPPSASMIGVVKTDGYGHGAVPVARAIDPYVKGY AVATIDEAIILRRHGIQKMILVLGVTHESRFEDLVRYDIRPAMFRYEEAGKLSETAVKNG ARANIHLAVDTGMSRIGMTPDEASADEAARISRLPGICIEGMFTHFARADEADKAFYEAQ YKKYREFCEMLCSRGVDIPIRHCSNSAGIVEGLDSNGLDMVRAGISIYGLYPSDEVARDR VKLVPAMELKSFITYIKTIGPGTAVSYGGTFVADRPMRVATIPVGYGDGYLRSLSNKGAV LIRGKRAEILGRICMDQFMADVTDIPEAEEGDQVTLIGRDGEECITVEELAALSGGFHYE IICQIGKRVPRVYLRDGKAIGKKDYFNDIYEGFGYM >gi|157101634|gb|DS480690.1| GENE 412 387304 - 387393 63 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVCGWGVTVMVCGWGVMAMVCGWGVTVMV >gi|157101634|gb|DS480690.1| GENE 413 387399 - 388940 529 513 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126666946|ref|ZP_01737922.1| Ribosomal protein S15 [Marinobacter sp. ELB17] # 10 510 7 498 503 208 30 3e-52 MRMLVTGKQMKAIDAYTIHTIGIPSLVLMERAALKVALAAERLWEKGDVIWAVCGTGNNG ADGVAAARMLHLKGYPVRVFLVGNPDKGTEEFRTQLKIAGNVGLFTLPWQEGEKEECGLL IDAVFGVGLSRPVEGDYRQCIEMLASKSIRSTVAVDVPSGIHGDTGTVMGIALWADLTVT FGWEKTGTALYPGREYAGRVEVADIGFPGQALEAVKEAEGTAAGDFAVTYSDDDLKRIPR RPAYSNKGTFGKVLIVAGSKNMCGAAYLSALSAYRTGAGLVKLLTVEENRQILQERLPEA IIATYTPDQLMEGRDEFRKMIEAQMEWADVVVLGPGLGNGPYVEYLVEDILTSAFVPVII DADGLNAIAGHPYLTSYYTENIIVTPHLGEMARLTGEGIDQIKENLAATALEYAGRYGLT CVLKDAATVTAGRDGNLYINSSGNSAMAKAGSGDVLTGIIAGLIAIGMEEEEAACLGVYL HGRAGDAAASKSGAHSLLASELADAVGSVMTMV >gi|157101634|gb|DS480690.1| GENE 414 388900 - 389319 362 139 aa, chain - ## HITS:1 COG:FN1342 KEGG:ns NR:ns ## COG: FN1342 COG0736 # Protein_GI_number: 19704677 # Func_class: I Lipid transport and metabolism # Function: Phosphopantetheinyl transferase (holo-ACP synthase) # Organism: Fusobacterium nucleatum # 1 116 1 120 122 87 40.0 6e-18 MILGVGTDLIEIRRMEKACKKDYFVVRTFTDMESRQAKGSASKLAGSFAVKEAVAKALGT GFRTFMPIDVEVLRDDMGKPYVRLYRGALKRFQEMGMERLEVSITNTREYAMAFAVGEGR IKEVQPNEDAGNGKTDEGH >gi|157101634|gb|DS480690.1| GENE 415 389340 - 390002 863 220 
aa, chain - ## HITS:1 COG:CAC2713 KEGG:ns NR:ns ## COG: CAC2713 COG2344 # Protein_GI_number: 15895970 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Clostridium acetobutylicum # 2 202 3 203 214 238 57.0 6e-63 MERKEISKAVISRLPRYYRYLGELIEEGVERISSNELSARMKVTASQIRQDLNNFGGFGQ QGYGYNVKYLYSEIAKILGIDRQHNVIIIGAGNLGQAIANYTNFERRGFVIRGMFDINPK LIGLVIRGIEIRSVDDLETFIRENEIQIAALTIPKTKAPEIADRLVNAGIRAIWNFAHTD LVVPEDVVVENVHLSESLMRLSYRVSSMYDLQEEKKNALK >gi|157101634|gb|DS480690.1| GENE 416 390142 - 392118 2412 658 aa, chain + ## HITS:1 COG:CAC2714 KEGG:ns NR:ns ## COG: CAC2714 COG0488 # Protein_GI_number: 15895971 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 9 655 2 638 643 532 45.0 1e-150 MSFLKEFTMILSCSNISKSFGEKHILKQVSFHLEDHEKAAIVGINGAGKSTLLKIIIGEL ACDEGCVALSKGASIGYLAQHQDLDSESTIHDALLEVKRPILQMEERIRSLELDMKSASG EGLENMLEEYSRLTHQFELEGGYACRSEITGVLKGLGFAEEEFSKPINALSGGQKTRVSL GKLLLTKPDILLLDEPTNHLDMESIAWLETYLKTYSGSVIIVAHDRYFLDRVVTKIVELD NGTGTVFSGNYTAYSDKKAMLRDARIRAYLNQQQEIRHQEAVIAKLKSFNREKSIRRAES REKMLDKIDRLEKPIDIDDSMDIRLEPDVESGKDVLTVTGLSKSFDSQTLFTDVNFEIKR GERVAIIGNNGTGKTTLLKIINQLLPADAGEIRLGSKVHIGYYDQEHQVLHMDKTLFDEI QDTYPSMNNTQIRNTLASFLFTGDDVFKLIRDLSGGERGRVSLAKLMLSDANFLLLDEPT NHLDITSKEILESALCRYTGTVLYVSHDRYFINRTATRILDLTGQSLINYIGSYDYYLEK KDVVEAAFAARNSRTSAVSGSSQSPDRPSQGSPNDLKLEWKAQKEEQARIRRIQNELRKT EESIHALETRDSEIDALLTLEEVYTDVPRLMELNKEKEEIAGQLEKLYQSWEELAEEA >gi|157101634|gb|DS480690.1| GENE 417 392141 - 392377 337 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939256|ref|ZP_02086607.1| ## NR: gi|160939256|ref|ZP_02086607.1| hypothetical protein CLOBOL_04150 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04150 [Clostridium bolteae ATCC BAA-613] # 1 78 1 78 78 107 100.0 4e-22 MKLKRIFALLGAVLLAGLFILAIVLAVMGAPKNYLMAVIFSLVFIPIVMYAMGLMTRVLK PSDPPEGMDEDGDAGDEL >gi|157101634|gb|DS480690.1| GENE 418 392583 - 393764 
1138 393 aa, chain - ## HITS:1 COG:no KEGG:Dacet_0350 NR:ns ## KEGG: Dacet_0350 # Name: not_defined # Def: major facilitator superfamily MFS_1 # Organism: D.acetiphilus # Pathway: not_defined # 1 377 19 388 396 138 28.0 4e-31 MVLWAILSLSMANALVGTGISPALGVIRQSFPDAPGVLVQMIVSLPSLMVIAVAIPFAWL SRRYSVRKLCAAGLVLFLAGGLMGGLASGIYTLVLTRILIGAGYGLMMPLSVGLLSYFFT REEQHRLNGGIVIWSSVSSIICMVLAGYLAAISWRAVFLVYLFGIPCLWLCLKHIPDVTV TGAGGAAGDDRARRARGNLGVIRKTWVYGAGTFVVFVGYFAILNNCSSIILSEGLISGEK VGIIMSFQTVSSLVTGLYIGKLKKLLGRYMGTFIWLCSIGGLLSLYAKNSLALTMLGLVL FGIALASAVGTFNAEACIACERDESLAAGSVIMFMRSLGQFSSPLVLGFMARTAGASDIR FPYEGAALLSVVMMGVFWVNGLVRSSLKKQAAC >gi|157101634|gb|DS480690.1| GENE 419 393836 - 395032 1105 398 aa, chain - ## HITS:1 COG:PH1043 KEGG:ns NR:ns ## COG: PH1043 COG1473 # Protein_GI_number: 14590880 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Pyrococcus horikoshii # 4 393 5 385 387 307 43.0 2e-83 MQGLLNEVQEYRDRMVEWRRNIHSCPETGFHLPKTSAMVAGVLEKLGFQVHTGFAESAVL GILETGRPGNTIAVRADMDALGMQDEKDVPYRSARDGVCHACGHDVHTALLLGLASYLSE HRDQIPGGCVKLIFQPAEEGPAPGGAKLIVDSGVLDGVDAIFGAHCQPQYPAGVMGCRYG SFFASGDFFEVKLKGTGCHAASPHKGKDLISIAMEMIQAIQNIGSREIPPMKTAVISVCS INAGVLSTKNVLPSELTFGGTYRAHDGQVRDYIAGRLEQIVRDLSRMNEITCEFEDSFAF PSFANDRDITDTIYQSAAEVLGQDKVMKRPEPEMGSEDFAWYTRKYKGAFFFFGVKNEEK GLTASLHNPRFDIDEDAMVPALAVYINTLYHLLERQDV >gi|157101634|gb|DS480690.1| GENE 420 395071 - 396228 1200 385 aa, chain - ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 1 382 1 379 384 358 46.0 8e-99 MYDFDRTFDRHNTEVIKWDRLEKDFGRKDLIPFGIADMDFETLPEITRAMTERAKHPTYG YTFPSEGYYESFIEWNRTRNHFDIKREELITIPGVVCACSFIIYALTEKGDKVMVNTPVY DPFFKVIEQQERTLVTSSLKLEHGRYEMDFEDMEEKFKEGVKLFILCSPHNPVGRVWTME 
ELTRLNDLCSRYHVLVFSDEIHSDLVFPGHKHIPYQTVNQDAAGRSVTAMAPSKTFNIAG LKSSVLIIKNPELYSRVNKVVTAFHVGVDLFGLKAFEVAYRHGAAYVDELNQYLYENAQF VAEYVAENLPAAKTYVQDGTYLMWLGFTALGLSQKELMEKVVGAGVAPNDGSHYGIEGNG FLRINIGTQRAMLKEGLERLKTIFG >gi|157101634|gb|DS480690.1| GENE 421 396336 - 397511 1441 391 aa, chain - ## HITS:1 COG:BH3005 KEGG:ns NR:ns ## COG: BH3005 COG5505 # Protein_GI_number: 15615567 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Bacillus halodurans # 2 387 1 385 388 171 29.0 3e-42 MLVTSGLSFVAALLAINAFVFWIVKKYPLKIYKFLPPVIIVFLIVVCCNTFGVWSFSNKA VSSMRSNILEYMVPFMVFCIAVQCDFKKMVKIGPKLLAVFLCTTLSICIGMVVVYKCFAG PLGLQQIPQSFGTWTASFTGGIENLYAVAGAVGLSDENLANVLLLINLIFRPWMTILIVM VPFAARFNKWTGGKPEEIDVIASRLDETKREKQIPTSLDLFMIMGVGLVIVAFGFHMGDF LGALIPAVPAQVWLYLMVTAIGVIIGTTTEFGYINGLELIGGTLAIFALSVSSSNVDLRS FANAGVFFLSGVTVLLIHVIIMFLVAKLMKVDLCTLGIASIANIGGVSSAPVVAAAYGKS YQSISVIMAAIGSMIGTFVGLGMCNLLLMMG >gi|157101634|gb|DS480690.1| GENE 422 398138 - 398548 507 136 aa, chain + ## HITS:1 COG:no KEGG:SSP0850 NR:ns ## KEGG: SSP0850 # Name: not_defined # Def: transcriptional regulator # Organism: S.saprophyticus # Pathway: not_defined # 15 133 29 146 157 67 34.0 2e-10 MKYNRQMYEYGKDFRSYGTDKMLRVDQMSIINLIGDNPKCNLKLIAQSTDSNISTVSLQV ARLVKLGLLEKHRSSMNQREIVLSLSGEGMKAYEFHKKLDQNWSDTVESLLSGYTKEELA VINAFLHQLLTENPPM >gi|157101634|gb|DS480690.1| GENE 423 398644 - 399921 1520 425 aa, chain - ## HITS:1 COG:MT0373 KEGG:ns NR:ns ## COG: MT0373 COG0104 # Protein_GI_number: 15839743 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 5 425 6 425 432 374 44.0 1e-103 MVRAIVGANWGDEGKGKITDMLAKESDIIIRFQGGSNAGHTIINDYGKFALHLLPSGVFY QHTTSIIGNGVALNIPYLVQEIESLVERGVPKPRILVSDRAQILMPYHVLFDTYEEARLA GKSFGSTKSGIAPFYSDKYAKIGFQVSELFDQELLREKAYRVCELKNVILEHLYHKPLLD PEEIIKELLSYRDMVEPYVCDTSAYLHEAIKEGKKILLEGQLGSLKDPDHGIYPMVTSSS TLAAYGAIGAGIPPYEIKDITTVVKAYSSAVGAGAFVSEIFGDEADELRNRGGDGGEYGA 
TTGRPRRVGWFDAVATRYGCRIQGTTEVAFTVLDVLGYLDEIPVCVGYEIDGQVTREFPT TAKLEKAKPVLTRLPGWKCDIRGIRKYGDLPENCRKYIEFVEKEIEAPISMVSNGPGRDD IIMRK >gi|157101634|gb|DS480690.1| GENE 424 400110 - 401030 950 306 aa, chain - ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 3 290 1 287 296 165 30.0 1e-40 MNVNFEYYKVFYHVCAHGSLTAAAQELCISQPAVSQAVRQLEKEAGTKLFFRTSKGVQLT REGELLYHYVKPGVEELLEGGKMLERMLNMDVGEVRIGASDMTLQFYLLPYLEQFHREYP KIKVNVTNAPTPETIKSLEEGRIDFGVVTSPFTSRGTVRQFQVKAIRNVFIAGSSFRELE GRKLEYKELEAFPCIALEKNTSTRTFMDAFLAQQGTLLKPEFELAISDMIVQFARRNMGI GCVMEGFAEDAIMRGEVFRLKFKQEMPLRHMCVVTGESSLISVPGRRLLDMMACDQAAAE QPAGGR >gi|157101634|gb|DS480690.1| GENE 425 401352 - 401732 248 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939265|ref|ZP_02086616.1| ## NR: gi|160939265|ref|ZP_02086616.1| hypothetical protein CLOBOL_04159 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04159 [Clostridium bolteae ATCC BAA-613] # 1 126 1 126 126 143 100.0 4e-33 MALINKLDNTASVTYGGNPINSNTVSTVLLLAPTLLKAVDKLTASIGDTLTYTVTVTNVG LNALTNLPFTDTIPAGATFVAGSFTVNGAAATPTVTSNTLTYTIPNIASLGTASIQFQVK VVGGTT >gi|157101634|gb|DS480690.1| GENE 426 401802 - 403238 1703 478 aa, chain - ## HITS:1 COG:CAC1821 KEGG:ns NR:ns ## COG: CAC1821 COG0015 # Protein_GI_number: 15895097 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Clostridium acetobutylicum # 5 478 3 476 476 634 64.0 0 MSDHDRYVSPLSERYASREMQYIFSPDMKFCTWRKLWIALAETEKELGLSITDEQIEELK AHSEDINYDVAKAREKEVRHDVMSHVYAYGVQCPKAKGIIHLGATSCYVGDNTDIIIMTE ALKLVRKKLLNVLAELSGFAEKYKDLPTLAFTHFQPAQPTTVGKRATLWMMELKLDLDDL DYLIGSMRLLGSKGTTGTQASFLELFDGDHEKCRRLDARIAEKMGFEGCYPVSGQTYSRK VDSRVISVLAGIAQSAHKFSNDIRLLQHLKEVEEPFEKKQIGSSAMAYKRNPMRSERIAS LANYVMSDMMNPMLVASTQWFERTLDDSANKRLSVPEGFLAVDGILDLYLNVVDGLVVYP KVIEKHLMAELPFMATENIMMDAVKAGGDRQELHERIRELSMEAGRNVKEKGLDNNLLEL 
IAADPAFNLSLDELKKTMDPSRYVGRSPKQVEEFLEEVIKPLLEENEGELGLTAQINV >gi|157101634|gb|DS480690.1| GENE 427 403313 - 404761 1309 482 aa, chain - ## HITS:1 COG:CAC1392 KEGG:ns NR:ns ## COG: CAC1392 COG0034 # Protein_GI_number: 15894671 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Clostridium acetobutylicum # 13 472 14 465 475 484 52.0 1e-136 MNQNLMDFDIGADKLREECGVFGMYDFDGNDVVRTIYYGLFALQHRGQESCGIAVSDTEG PKGKAAVHKGMGLCNEVFTPEVLEGLRGNIGVGHVRYSTAGSSTRENAQPLVLNYVKGIL GLAHNGNLVNAPELRHELEYTGAIFQTTIDSEVIAYHIARARIHTHNVESAVAAAMKKLK GAYSLVIMSPRKLIGARDPMGFKPLCIGKRDNAYILASETCALETIGAEFVRDVDPGEIV TITKDGISSDKGMCLSDPSGEARCIFEYIYFARPDSVFDGVSVYKARIQAGRFLAADSPV EADLVVGVPESGNAAALGYAMESGIPYGTAFVKNSYVGRTFIKPKQSSRESAVRIKLNVL KEAVSGKRVIMIDDSIVRGTTSALIVKMLRDAGAREVHVRISAPPFLHPCYFGTDIPSED QLIAHGRTVDEVRQIIGADTLSFLRQERLSQMASERPVCTACFTGDYPMEPPSGDIRGSY EK >gi|157101634|gb|DS480690.1| GENE 428 404926 - 405633 1095 235 aa, chain - ## HITS:1 COG:CAC1391 KEGG:ns NR:ns ## COG: CAC1391 COG0152 # Protein_GI_number: 15894670 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Clostridium acetobutylicum # 1 232 1 232 235 276 62.0 3e-74 MEKKEMLYEGKAKKVYTTEDPDVLIVSYKDDATAFNGLKKGTIVGKGAINNRMTNFIFKK LEEKGVPTHLVEELNDRETAVKKVEIVPLEVIIRNYSAGSFAKKMGMEEGIKFKCPTLEF SYKNDDLGDPFINSYYALALGLATQEEIDDITEYAFKVNEVLQEYFGGLNIDLIDFKIEF GRYHGKVILADEISPDTCRLWDKDTHEKLDKDRFRRDLGNVEDAYEEVFRRLGIQ >gi|157101634|gb|DS480690.1| GENE 429 405794 - 406162 195 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939269|ref|ZP_02086620.1| ## NR: gi|160939269|ref|ZP_02086620.1| hypothetical protein CLOBOL_04163 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04163 [Clostridium bolteae ATCC BAA-613] # 48 122 1 75 75 138 100.0 1e-31 MEYISRRMRRLRKMRRRAVRFWRESCPSVRRRKRLIRGVLFLLAAGTMAWAKTGRGMISW VREQIPYMERIEESGPVRAADQDGMTSVKRGFTIRLDEKKIHIFQVEEGYEDRETAIDSG SD 
>gi|157101634|gb|DS480690.1| GENE 430 406056 - 406277 112 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939270|ref|ZP_02086621.1| ## NR: gi|160939270|ref|ZP_02086621.1| hypothetical protein CLOBOL_04164 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04164 [Clostridium bolteae ATCC BAA-613] # 33 73 1 41 41 70 100.0 4e-11 MRRFLRLTDGQLSRQNLTALRRIFLRRRIRLLMYSNQSPSPSVHELIIILYRVLFKIHLL VFTQMDPLQPSLA >gi|157101634|gb|DS480690.1| GENE 431 406312 - 406644 405 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939272|ref|ZP_02086623.1| ## NR: gi|160939272|ref|ZP_02086623.1| hypothetical protein CLOBOL_04166 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04166 [Clostridium bolteae ATCC BAA-613] # 1 110 1 110 110 120 100.0 4e-26 MKKIRPAIFAMLLLIMVMSVTACGSNNSNTQSSTGASQNSSSAATSTSTSSQETTGMETS SMETEGDGIIDGLMNDVEKGANDVKDGVDDITGESNTVNESGTNNTTEAR >gi|157101634|gb|DS480690.1| GENE 432 406735 - 407493 940 252 aa, chain - ## HITS:1 COG:CAC2379 KEGG:ns NR:ns ## COG: CAC2379 COG0289 # Protein_GI_number: 15895645 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Clostridium acetobutylicum # 1 252 1 250 250 232 45.0 5e-61 MVKMIMHGCCGAMGHVITGLAAEDGEIRIVAGIDVREGTDLGYPVFPSLDQCSVEADVIV DFASPKAVDGLLAYSCREQVPVVLCTTGLSGEQLAAVEKASQVTAILRSANMSLGVNLLL KLVSDAARILAGSGFDMEIVEKHHNQKVDAPSGTALALADSMNQAMDGQYAYTCDRSTRR EKRNPKEIGISSVRGGSIVGEHDVIFAGRDEVLTLSHTAYSKAIFAKGALEAAKFLAGKK PGMYSMTDVVDL >gi|157101634|gb|DS480690.1| GENE 433 407523 - 408407 1193 294 aa, chain - ## HITS:1 COG:CAC2378 KEGG:ns NR:ns ## COG: CAC2378 COG0329 # Protein_GI_number: 15895644 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Clostridium acetobutylicum # 1 294 1 292 293 309 52.0 5e-84 MALFEGAGVALVTPFKENGEVNYEKLEEIVEEQIAGGTDAIIACGTTGEASTMTHEEHLD VIEYICRVTKKRIPVVAGTGSNCTETAVYLSAEAEKRGADGLLLVSPYYNKATQKGLMAH 
FTAVADAVKIPVILYNIPGRTGVTIKPETIAALCRDVDNIVGVKEASGNFSAIATLMSLS DGKVDLYSGNDDQIVPLLSLGGKGVISVLSNVAPRQTHDICASYFAGDVKTSAALQLKAI PLITELFSEVNPIPVKAAMNMMGKGVGPLRMPLTEMEPQNQEKLKKAMTAYGIL >gi|157101634|gb|DS480690.1| GENE 434 408782 - 409276 600 164 aa, chain - ## HITS:1 COG:TM1465 KEGG:ns NR:ns ## COG: TM1465 COG2109 # Protein_GI_number: 15644214 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Thermotoga maritima # 6 160 5 152 170 63 30.0 1e-10 MKESMIQVICGPGRGKTTSAIGRGLTALTKGKCVYMVQFLKGALDVDNMEIIQRLEPEFK IFRFEKTPVFFDRLTEEEKAEARICILNGLNFARKVLVTGECDVLILDEILGILDEGVIS GEELCAIIAQARNAQIQLILTGTIYPDCLNGHVDEVTRIQTRYE >gi|157101634|gb|DS480690.1| GENE 435 409430 - 410059 730 209 aa, chain - ## HITS:1 COG:CAC2382 KEGG:ns NR:ns ## COG: CAC2382 COG0629 # Protein_GI_number: 15895648 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Clostridium acetobutylicum # 3 205 2 202 229 207 51.0 1e-53 MSEKMIENNRVSVIGEIVSGFTFSHEVFGEGFYMVDVAVNRLSEQADIIPLMISERLIDV HKDYSGSTVECIGQFRSYNRHEGIKNRLMLSIFVREIHFIEEFTDYTKTNQIFLDGYICK PPIYRKTPLGREIADILLAVNRPYGKSDYIPCISWGRNARFASSFEVGTRVRVWGRVQSR EYTKKLSETECEKRIAYEVSISKLECDEA >gi|157101634|gb|DS480690.1| GENE 436 410040 - 410117 112 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKQVKRQNEKYFRSEEEMYYVRKND >gi|157101634|gb|DS480690.1| GENE 437 410256 - 412088 2335 610 aa, chain - ## HITS:1 COG:CAC1684 KEGG:ns NR:ns ## COG: CAC1684 COG1217 # Protein_GI_number: 15894961 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Clostridium acetobutylicum # 2 604 3 602 605 733 62.0 0 MKMKREDVRNIAIIAHVDHGKTTLVDQLLRQSGVFRVNQEVQERVMDSNDLERERGITIL SKNTAVFYKDTKINIIDTPGHADFGGEVERVLKMVDGVVLVVDAYEGPMPQTKFVLQKAL DLNLSVIVCINKIDRPEARPGDVIDETLELMMDLDATDEQLDCPFIYASARGGFAKYKIE DPEADMSPLFETIINHIPAPEGDPDAETQVLISTIDYNEFVGRIGIGRVDNGGLKVNQEC VLVNHHEPDKFKKIKIGKLYEFDGLKRVEVQEAGIGSIVAISGIADIHIGDTICTPNNPE 
SIPFQKISEPTIAMYFMVNDSPLAGQEGKYITSRHLRDRLFRELNTDVSLRVEETDSADC FKVSGRGELHLSVLIENMRREGFEFAVSKAEVLYKYDERNRKLEPMELAYVDVPEEYTGV VIQKLTSRKGALQGMSQAAGGYSRLEFSIPSRGLIGYRGDFMTDTKGNGILNTIFDGYSE YKGDLFYRQTGSLIAFEAGEAITYGLFNAQERGTLFIGPGVKVYSGMVVGQSPKAEDIEI NVCKTKKLTNTRSSSADEALKLTPPKIMSLEQALDFIDTDELLEVTPESLRIRKKILDPT LRKRASFKNK >gi|157101634|gb|DS480690.1| GENE 438 412217 - 413350 1037 377 aa, chain - ## HITS:1 COG:CAC0767 KEGG:ns NR:ns ## COG: CAC0767 COG1453 # Protein_GI_number: 15894054 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Clostridium acetobutylicum # 5 374 1 372 376 457 57.0 1e-128 MEMEIKMKKLGFGTMRLPVLNQEDPASVDLEQVCKMVDTFMERGFTYFDTAYMYHKYESE RAVKKALVDRWPRDRYVLADKLPLSHLKEEADMERFFQEQLEKCGVSYFDYYMLHNMSRS YYETAERLGAFAFVRRKKEEGKARRIGFSFHADAELLEEILSKHPELDFVQLQLNYIDWD SPNIQSRQCYEVCRRYGKDVIVMEPVKGGTLADVPLEAEKLMEAHAPGMTPASWAVRFAA SREGVIMVLSGMSDYSQLLDNTAYMQDFVPLTEEEEGIVARAAEIIQSATAIACTSCQYC VEGCPKQIPIPKYFSLYNQYSLFGEKSNSRGYYQNYGGRYGKAGDCIGCRRCEAICPQHL PIVQHMKEIAEVFEPAK >gi|157101634|gb|DS480690.1| GENE 439 413391 - 414041 739 216 aa, chain - ## HITS:1 COG:BS_yvyE KEGG:ns NR:ns ## COG: BS_yvyE COG1739 # Protein_GI_number: 16080604 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 206 1 205 217 167 42.0 2e-41 MVDSYKILYKGGSGELVEKKSRFIADLCPVSSEEEALDFIEEIRKKYWDARHHCFAYIIG DRGQTARCSDDGEPSQTAGKPMMDVLAGAELHDVCAVVTRYFGGTLLGTGGLVRAYSGAV KEGLKNCVILEKRLAVRLNITTDYNGVGKIQYIAAQGGIDTLDTQYTDKAVFTFLVPMAE EGRFTAQIIEGTAGKAVIRREGEVYYGLGDSGLVVF >gi|157101634|gb|DS480690.1| GENE 440 414253 - 416946 2839 897 aa, chain + ## HITS:1 COG:aq_624 KEGG:ns NR:ns ## COG: aq_624 COG5009 # Protein_GI_number: 15606057 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein # Organism: Aquifex aeolicus # 27 754 8 679 726 249 31.0 3e-65 MNYGKYEAERRLKSLHSKSGKYASRLLLNIFKALFVLFLFLAVVGASIGFGMVKGIIDNA 
PSVDILNIQPSRFATAVYDSAGNLTDTLVTSGSNREEATYEELPKDLINAFVSIEDSRFW SHNGIDIRSIARAVKGIVSDDYAGGGSTITQQLIKNNVFSGGMESGWGARIERKLQEQYL AVQLEKNSGMSKEDTKKLIITNYLNTINLGNNTLGVKVAAKRYFNKDVSDLTLSECTVLA SITKNPSRLNPISGRENNAERRRIVLQYMYEQGYITKAQQEEALADDVYDRIQNVDTAAK GTNSHYSYFTDELIEQVISALMEKLDYTESQASNLLYSGGLQIYTTQDPALQAIVDEEIN NPDNYSVAKYSVEYRLSITHADGTTEHYSEENLRTFRKSVLGDSSFEGLYASKEAVQDDI DQYKAWLLKDGDEIIGERQNLILQPQASFVLLDQHTGEVKALCGGRGEKTASLTLNRASN VYRQPGSAFKVITAFAPALDACGATLGTVYYDAPYTIGTKTFRNWWNSGYGFTGYSSIRD GIIYSMNIVAVRCLMETVTPQLGVEYAENMGITSLTKDDLGAATALGGITKGVSNLELTT AYASIANGGVYTKPRFFTKILDHNGKVLIDNEPETKQVLKDSTAFLLTDAMSESMKSNRK FTRPGVSINSTSTRAALTGMTAAGKSGTTTSNNDVWFVGYTPYYTAGIWGGCDNNQKLKH GGVNNGGTTFHKDIWRNIMNRVHEGLSDPGFAVPDSIETAEICRKSGKRAVSGVCNHDPR GNAVYTEYFAKGTAPTEVCDKHVEVTVCAESGMRPTPYCPTKTTRVCMTLPEGEEGATDD SVFAIPGYCTIHSDVSTIIPPNTGDSDNTGPTVTPIGPGYQSPQESTEGFGPGNRPY >gi|157101634|gb|DS480690.1| GENE 441 417195 - 417548 591 117 aa, chain - ## HITS:1 COG:no KEGG:Closa_1567 NR:ns ## KEGG: Closa_1567 # Name: not_defined # Def: SpoVA protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 117 1 117 118 160 77.0 2e-38 MDYLKAFIVGGIICALVQILLEKTKMMPGRVMVTLVVSGAILGAIGLYEPFAKWAGAGAT VPLLGFGNTLWKGVKEGIGEDGLLGIFKGGLTASSAGICGALVFGYLASLVFKPKMK >gi|157101634|gb|DS480690.1| GENE 442 417561 - 418583 988 340 aa, chain - ## HITS:1 COG:no KEGG:Closa_1566 NR:ns ## KEGG: Closa_1566 # Name: not_defined # Def: stage V sporulation protein AD # Organism: C.saccharolyticum # Pathway: not_defined # 6 338 4 336 340 554 80.0 1e-156 MGQMKGKASIEFEHPPVIISAGSVVGKKEGEGPLGSLFDEVELDDMFGMDNWEQAESTMQ KKTADLVIEKGGIRKGDLRYLFAGDLLGQLIATSFGTVDLEIPLFGLYGACSTMGEALGL GAMAVNAGYADQVMSMASSHFATAEKQFRFPLAYGNQRPPSATWTVTGCGAVVLAGNRKD GMARIAGITTGRMVDMGAKDSMNMGAAMAPAAFHTIEQNLEDFQVNETWYDKIITGDLGE VGRTILLEFMKNKGRDLSQLHMDCGMEIYDKEQQDTHSGGSGCGCSAVTLCSYILPKVQD GTWKRVLFVPTGALLSSVSFNEGQSIPGIAHAVVIEHIDS >gi|157101634|gb|DS480690.1| GENE 443 418696 - 419130 601 144 aa, chain - ## HITS:1 COG:no KEGG:Clole_4035 NR:ns ## KEGG: Clole_4035 # Name: 
not_defined # Def: nitrogenase cofactor biosynthesis protein NifB # Organism: C.lentocellum # Pathway: ABC transporters [PATH:cle02010] # 1 144 203 346 347 155 54.0 4e-37 MSIADAKAALDGGSVDAALIAGAAAYQAKQQGYHMVADGEGLIKAIIAVAVREDFYEKNR DVIDRFMEAQKRVAAFMKDNQDEVLQIVAEELDLDEEAVREMYACYDFSMDVTDEDKAGF QKTADFMFKSGMIEEEMDVNSLFL >gi|157101634|gb|DS480690.1| GENE 444 419635 - 419736 69 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKYIWINPVAEDMYLGGIDVFPEKVWTHIGAVQ >gi|157101634|gb|DS480690.1| GENE 445 420546 - 422453 742 635 aa, chain - ## HITS:1 COG:MA1865 KEGG:ns NR:ns ## COG: MA1865 COG0210 # Protein_GI_number: 20090715 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Methanosarcina acetivorans str.C2A # 28 119 17 108 157 66 42.0 2e-10 MSIVNFIDEDIHYAEKILLGNKTFDINEKLPIIKRMDKSISVMACPGSGKTTALMGKIIA LVNHLPLNDGKGICIITHTNVAINEIKSRLGSRGDILFQYPNFIGTIQAFTDKFLSIPFL KQAYRQGITAINDEYYYQKMFQRKKEITILKKYAYGKCGRDYSKIDNYIKSIVLRRKDGE FDFCTGDKSLNLKNKSSDTYKALYSLLVDSLFSQGILRYDDTYLLAQMYLEEHPELSYYF CSRFQYVFMDEVQDNRGVQNEILDKIFLPERVVIQKFGDMDQAIYDDKEEVERNVLEDRY EISKSMRFPQNIADIIENLRVEPHKKTLIGNGNADAFIYILVFEHDKICNVKDKFVDLIL EHNLKKEKSVFKCCGWVGYKEKEEQLSVKSYYPMYISNKDVTIKQVGMSFQSILEENAKH NVTVGLIYKSVIVCIVRYLNMNDITFEEKQITYSILEKYIKNNMTIEWINLRSLIVREAT KIKNYDETAFKIIGTEALHIVFHFFEECNEKQFLKLFERQTNSKKDVLYNCYTRDGVSVL FDTVHGVKGETHTGTLYVETYFNKKTDMQRILKFMSGSNIKKMGADEKKALKVGYVGMSR ATDMLCIAIGEDTFKMYEKEIEELEQSRKLQICHI >gi|157101634|gb|DS480690.1| GENE 446 422458 - 424515 591 685 aa, chain - ## HITS:1 COG:MA1866 KEGG:ns NR:ns ## COG: MA1866 COG3593 # Protein_GI_number: 20090716 # Func_class: L Replication, recombination and repair # Function: Predicted ATP-dependent endonuclease of the OLD family # Organism: Methanosarcina acetivorans str.C2A # 18 461 5 500 613 174 31.0 5e-43 MYLSCVRIWNFRKYGYANLDATPALEVYLKGGLNVLIGENDTGKTAIIDAIKFVLGTRSH DTLQIKECDFYEDLKGERSELLKIECVFQELSDDEAGNFLEWLTFDGDKAELSVRMIARR RDNRIMNSITAGMPELDTRFDAIELLRVTYLKPLRDAENELKHGYHSRLAQILLNHPLFT 
KEKDLHVLERYFGIANQKIEEYFKKEKLEKDEIFGIEDGEKGAKEITGKLEATLREFMGN NYDSHGYEPIISASRNELTSILRRLSLIIAQNQVGLGSLNQLFIALELLLFDIENRFNIA LIEEIEAHLHPQAQLRLIAYLQNKNDNDNSNKKLQCIITTHSITLASKIKISNLIYCKNN KAYMLDSAHTALESGDYKYLERFLDSTKANLFFAKGVIFVEGDAENILMPVIAKIIGMPL EKYGVSIVNVGSLAFLRYANIFKRKNGECIDIPIAIVTDLDVRPDYYYEIKKEENKNIVY SIEDLKGAEEICELKCSDIEGKYYIKKEDIYEKIKEQNKVKKLSKKVKDSIEIYIKKEVS TELYRSFILEKKKDKYCTDKARLFTNNWTLEYDIAFSTLRVYLYASVMAAKKVKKQDSIE INISEEFLEAKRIIDEWSATGDSQEIIAYKIYEDLLLKNASKAVTAQYFSEILEQNVITV NAIIAKDESLSYIKQAIMYACGKEY >gi|157101634|gb|DS480690.1| GENE 447 424733 - 425005 133 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939292|ref|ZP_02086643.1| ## NR: gi|160939292|ref|ZP_02086643.1| hypothetical protein CLOBOL_04186 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04186 [Clostridium bolteae ATCC BAA-613] # 1 90 18 107 107 159 100.0 8e-38 MDEKIPEKIPYEFKKYEKSDVLIEDNEKMDIVRLMFCLCFDELNLGKIESVIESQGGEAL AKYKKTKIVGWSLTEIRKILSDSIFGDERI >gi|157101634|gb|DS480690.1| GENE 448 425115 - 425570 554 151 aa, chain - ## HITS:1 COG:no KEGG:Closa_1565 NR:ns ## KEGG: Closa_1565 # Name: not_defined # Def: SpoVA protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 144 1 143 150 184 69.0 1e-45 MEIDKKKYEEYVKQVTPTHSLAKNMAAAFLVGGIICVLGQFALNFMMNSMGMDQETAAAW TLLELILLSILLTGFNIYPKIVKFGGAGALVPITGFANSVVAPAIEFHAEGEVFGVGCKI FTIAGPVILYGVLTSWVLGLIYWVGTMMGIF >gi|157101634|gb|DS480690.1| GENE 449 425640 - 426083 536 147 aa, chain - ## HITS:1 COG:no KEGG:Closa_1564 NR:ns ## KEGG: Closa_1564 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 139 1 135 141 142 71.0 4e-33 MILLKKCLLAFFGLCAGGIIAAGVYAFLAIIGVFVRLMGKTGTRKHIFLYETVIILGGVL GNILDIYEFPIYMGPYIGTLFICVFGLSVGIFVGCLVMSLAETLKALPVISRRIHLAVGL QYLIFALAAGKMAGSLVYFWNHMASLG >gi|157101634|gb|DS480690.1| GENE 450 426070 - 426696 791 208 aa, chain - ## HITS:1 COG:no KEGG:Closa_1563 NR:ns ## KEGG: Closa_1563 # Name: not_defined # Def: stage V sporulation protein AA # Organism: 
C.saccharolyticum # Pathway: not_defined # 1 208 1 208 208 302 65.0 7e-81 MNHTVYLKLSQITELTHKDVVLKDVAQVYCDDQNVMNKCNSMKVMTVKVDAKRRYIMSAL DVINKLKQLDSTIDVNNVGETDFIIAYKPPSPPAYIWQWTKTVFVCLVCFFGAAFAIMTF NNDVSVTDVFSEVFSLVMGYESGGFTVLEISYSVGLAVGIIGFFNHFAAIKLNTDPTPLE VEMRLYEDNISKTLIANDGRKESKIDIT >gi|157101634|gb|DS480690.1| GENE 451 426693 - 426938 230 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939296|ref|ZP_02086647.1| ## NR: gi|160939296|ref|ZP_02086647.1| hypothetical protein CLOBOL_04190 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04190 [Clostridium bolteae ATCC BAA-613] # 1 81 21 101 101 159 100.0 6e-38 MGFGKRVCELLAVIASLLLIGALVWYLSSGFGDRGYEKEGTLVWEEMDGPSVEEEVCPAF VTDPVSMMYGQQEIQRSGRGL >gi|157101634|gb|DS480690.1| GENE 452 427027 - 427740 829 237 aa, chain - ## HITS:1 COG:BH1538 KEGG:ns NR:ns ## COG: BH1538 COG1191 # Protein_GI_number: 15614101 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus halodurans # 2 237 17 250 252 215 50.0 6e-56 MDETMKLINMAHEGDKAARDQLVMDNVGLIWSIVRRFSGRGYEMEDLFQIGSIGLIKAID KFDTGFDVKFSTYAVPMITGEIKRFLRDDGMIKVSRSIKELGVKVRAAREEMTYALGREP TIEEIAGRLGTSREEVAASLEAAAEVESLYRPADSGDENSTYLMDRLAQDNNDHEDLLNR MVLKDLMEGLEEEQREIILRRYFYNETQTQIAGELGISQVQVSRLEKRILKEMRKKL >gi|157101634|gb|DS480690.1| GENE 453 427746 - 428195 587 149 aa, chain - ## HITS:1 COG:CAC2307 KEGG:ns NR:ns ## COG: CAC2307 COG2172 # Protein_GI_number: 15895574 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Clostridium acetobutylicum # 13 145 6 136 143 131 55.0 4e-31 MEDKKMETNREHMRLEMESLSRNEEFARVVTAVFMSRLDPTLEEVDDVKTAVSEAVTNAV IHGYRGDRGTIYLDLTADMEERTLTVAVKDCGVGIADVKQAMEPMFTTDPEGERSGMGFS FMEAFMDQVEVESEPNHGTLVTMKKSIGR >gi|157101634|gb|DS480690.1| GENE 454 428201 - 428542 422 113 aa, chain - ## HITS:1 COG:BH1536 KEGG:ns NR:ns ## COG: BH1536 COG1366 # Protein_GI_number: 15614099 # Func_class: T Signal transduction mechanisms # 
Function: Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) # Organism: Bacillus halodurans # 9 112 8 111 116 84 37.0 6e-17 MEQTTFTYEAREQMLVIHLPGELDHHNCRNLKRDTDLLLAENYINRIVFDFTHTSFMDSS GIGVLLNRYKQMAASRGTVAYYGAGPQVRRILEMGGVSRLIKGYEDRESAIKG >gi|157101634|gb|DS480690.1| GENE 455 428645 - 429955 1487 436 aa, chain - ## HITS:1 COG:no KEGG:Closa_1558 NR:ns ## KEGG: Closa_1558 # Name: not_defined # Def: TPR repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 436 2 440 440 410 49.0 1e-113 MNTDYARKNQQIANSFYNLGLEKARIRDLSGAAQCLKKSLHFNKYQTDARNLLGLIYYEN GEVADALVQWVISLNLQPEDNLADHYLDEIQRKPGQLEIESQNVKTFNQALWHAQNGSDD LAILQLARVVESNPHFVKAHLLLALLYMAREDFNKAGRCLYKILQIDKSNQKALYYMSIV KQNTGRADAEKRKMVKAFSHRKMEDDDVIIPNTYKENTGISTVLHIIIGLVLGIMAFYFL ILPARTRDLNSIHDNNLKSYMQKLNNANQQYDILKADYDELDAHTKEIQARLDELTTGNT SVIAQYQGLIWILQDYRSGDLVAAAKAFAGAGFDLIEDENIQAIVENIRQDMTANIYQSL VDRGLQLWNAGNKTEAMDYFQASLTIKPDNPEALFYVGRLYQDAGDTDNANSMFDKVVNE FPDSEYVDRAKNARGY >gi|157101634|gb|DS480690.1| GENE 456 429952 - 430098 345 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIDFEEEIERFKPSLDVEEVEDAIVKSDLTDMTDIMMELIKERGERDL >gi|157101634|gb|DS480690.1| GENE 457 430128 - 431492 1222 454 aa, chain - ## HITS:1 COG:CAC2398 KEGG:ns NR:ns ## COG: CAC2398 COG0285 # Protein_GI_number: 15895664 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Clostridium acetobutylicum # 21 452 22 431 431 201 32.0 3e-51 MGQAEEYLDRIPMWTRKKNSLEDVRRFLDYLGNPDRDRPAIHVAGTNGKGSVCAFLTSIL GEAGYSTGTFISPHLVEVRERFLINGEMVDQESFEGAFETVLEASRKLADQGLCHPTFFE FLFYMAMVLFRMHDVDVMILETGMGGRLDTTNVLERPAACVITSISMDHTRYLGDTLAKI AGEKAGIIKTGVPLVFDDGQPEVSTILEGFAFRKGAARYPVGRDDFYVEEIGEKGLSINA RMKDGRCLELEIPFEAFYQAENAMLAVRTLDVLRYPGECPRTGCRMPGHGPCICPDGKEW GISDQHIIKGLKNTFWPGRMEQVRPGIYLDGAHNPGGIEAFIQTAGEISDRQGKKPWLLF AAVSDKDYERMAEQLCKEMIWEAVGVVHMNSDRGLAAEELARVFRTYTSRPVYSYGDTKS AVEDMAEKSREGLLFCAGSLYLIGEIKAVLEQKP >gi|157101634|gb|DS480690.1| GENE 458 431760 - 432677 810 305 aa, chain - ## 
HITS:1 COG:CAC0076 KEGG:ns NR:ns ## COG: CAC0076 COG0697 # Protein_GI_number: 15893372 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Clostridium acetobutylicum # 3 300 6 297 303 210 41.0 2e-54 MKIKNALLLLLTASIWGVAFVAQSVGMDYVGPLTFNCVRCLMGGVVLLPCIWFFDRKNKR KEQVPVIPGARKTLIIGGICCGVALCLASNFQQFGIQYTTVGKAGFITACYIVIVPVLGL LLGKKCSPVVAGAVVLSLAGLYMLCMKGGELSVNKGDLLMLVCAFLFAVHIMIIDFFSPV VDGVKMSCIQFFVSGILSGAAMLIYETPEWSQIIAAWAPVLYAGIMSCGVAYTLQIVGQK GMNPTVASLILSLESSISVLAGWVILGQRLSSKEVLGCALMFGAIILAQIPVGRRRQASL NTEGN >gi|157101634|gb|DS480690.1| GENE 459 432753 - 433964 1195 403 aa, chain - ## HITS:1 COG:TM1119 KEGG:ns NR:ns ## COG: TM1119 COG3629 # Protein_GI_number: 15643876 # Func_class: T Signal transduction mechanisms # Function: DNA-binding transcriptional activator of the SARP family # Organism: Thermotoga maritima # 11 307 3 275 349 72 24.0 1e-12 MNQSSGNKPVIHVRMFGGFTITMDEKAMCDSDNRSRKAWSLLSYLIIKRRGDVSINELFH AIWQDGVQDNPYGALKTLVFRVRKMLEAAGFPSQDLILSQKGAYMWNPAWETEVDTDRFE SLCRRILDPELDGKNMWEACLEAFELYRGIFLPRSAEESWVGPMAAYYHTLYQKLVVYMA EELIRDKEYAQVEEICQRAIGIDRFAEDFHYYWIYAAYGQGEQETALKRYKDTTDMFYRE RLMTPSEHFKELYKVISNSEQEVVTDLDVIQENLGTGSAGEGHGAYQCEYAVFKRLVQLE RRGVERSGDSVYLCLLTVGDRKGRTLKPEIQARAMERLRASIQKSLRSSDVYARYSVSQF ILLLSSATFENSEMVMDRILSAFNKSYVRKDVAVNYSLDVLKP >gi|157101634|gb|DS480690.1| GENE 460 434089 - 436728 2731 879 aa, chain - ## HITS:1 COG:CAC2399 KEGG:ns NR:ns ## COG: CAC2399 COG0525 # Protein_GI_number: 15895665 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 3 879 6 881 881 1102 59.0 0 MKELEKTYNPADIEDRLYQKWLDGKYFHAEVNRSKKPFTIVMPPPNITGQLHMGHALDNT MQDILIRYKRMQGYEALWQPGTDHAAIATEVKVIDKLKKEGVDKADLGRDGFLKECWKWR DEYGTRIVNQLHKLGSSADWDRERFTMDHGCSDAVLEVFVKLYEKGYIYKGSRIINWCPV 
CQTSISDAEVEHVEQNGFFWHINYPVVGEPGRFVEIATTRPETMLGDTAVAVNPEDERYQ DIVGKMLKLPLTDREIPVIADPYVDKEFGTGCVKITPAHDPNDFEVGKRHNLEEIIILND DATVNVPGPYFGMDRYEARKAIVADLEAQGLLVKVVPHSHNVGTHDRCKTTVEPMVKQQW FVKMEEMAKPAIEALKNGSLTFVPESFGKTYLHWLEGIRDWCISRQLWWGHRIPAYYCQE CGEIVVSKEAPHACPKCGCTSLKQDEDTLDTWFSSALWPFSTLGWPEKTEDLDYFYPTDV LVTGYDIIFFWVIRMVFSGLEQTGKLPFHTVLIHGLVRDSEGRKMSKSLGNGIDPLEVID KYGADALRMTLITGNAPGNDMRFYWERVENSRNFANKVWNASRFIMMNIEKAAQSGEVAL DRLTMADKWIVSKVNTLTREVTENLDKYELGIALQKVYDFIWEEFCDWYIEMVKPRLYNE EDDTKAAAIWTLKHVLIQALKLLHPFMPFISEEIFCNLQEEEETIMISQWPVYRDDWNFA KEEQSTETIKEAVRAIRGVRSSMNVPPSKKATVYVVSEDAGLLQIFEHSKSFFAALGYAG EVILQENKEGIADDAVSAVIHKAVIYMPFADLVDIEKEIERLRGEEKRLAGELARSRGML GNEKFVNRAPEAKIAEERAKLEKYEQMMEQVKIRLAQLQ >gi|157101634|gb|DS480690.1| GENE 461 437581 - 437913 62 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939308|ref|ZP_02086659.1| ## NR: gi|160939308|ref|ZP_02086659.1| hypothetical protein CLOBOL_04202 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04202 [Clostridium bolteae ATCC BAA-613] # 1 110 65 174 174 214 100.0 2e-54 MPFGFFFTIILRKPDWKKAAIIGFVFSSVIEVTQVLTDRGLGELDDVLHNTWGSLMGYCA AVILGYLFGRHSNRYKGKVKGASLFFGITILAFAILIFYNQPDWNVRLRR >gi|157101634|gb|DS480690.1| GENE 462 438497 - 438721 98 74 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3238 NR:ns ## KEGG: EUBREC_3238 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 63 102 162 309 80 54.0 3e-14 MCDGSSGYNKVPNAKRTACWAHIRRYLIDAIPKGKQLDYTQASVQGVMYVNRLFELEDKI RRKYAGNYEAIRQA >gi|157101634|gb|DS480690.1| GENE 463 438960 - 439889 114 309 aa, chain - ## HITS:1 COG:no KEGG:bpr_I2412 NR:ns ## KEGG: bpr_I2412 # Name: not_defined # Def: polysaccharide biosynthesis protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 301 206 507 516 228 41.0 2e-58 MANRLYPDLQPKGKLDTEVVKNINQKVKDLFTAKIGGVVVNSADTIVISAFLGLTALAVY QNYYYILSAVMGINQVIYKACLASIGNSMVLESKEKNYCDFKMISMLFMWLIGFCTTCFA CLFQPFMEIWVGKDLMLSYLMVILFCIYFYLCEIMSLFSLYKDAAGIWHQDRFRPLIEAG 
ANLVVNLILVKPMGLYGILISTIISMCCISIPWLYRNLFKYVFDRSVSEYTKILLGGTAI MIFVTTGVALLTRFIPLSGNIGLFVRLLTCIIVSNTLYYLLFKKFETFAKILLLINRITG KRIFKVSIQ >gi|157101634|gb|DS480690.1| GENE 464 440536 - 441012 167 158 aa, chain - ## HITS:1 COG:BH3587 KEGG:ns NR:ns ## COG: BH3587 COG0546 # Protein_GI_number: 15616149 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Bacillus halodurans # 7 119 65 178 215 84 40.0 7e-17 MGLPAAMAASFRRISSESVDKILVNWEAMKLVRKLKKYGYKTALCTGKDHYRTVEILKYF QCDDLFDVLICGDDVPEPKPSAMPILKAMEALGADLENTVMIGDGYNDIKSAKNAGVKSI LTLWYGDAGVPKVADYYSETVEEMWATIDEIASFTANT >gi|157101634|gb|DS480690.1| GENE 465 440966 - 441190 56 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939313|ref|ZP_02086664.1| ## NR: gi|160939313|ref|ZP_02086664.1| hypothetical protein CLOBOL_04207 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04207 [Clostridium bolteae ATCC BAA-613] # 1 74 1 74 74 137 100.0 3e-31 MNKSLYLIFDLDGCLIDSSEVQKAAFFGSYKEVVGDDHCPSYEEYIKHTGDSVNGMLKKN GASSCDGSIVSENQ >gi|157101634|gb|DS480690.1| GENE 466 441708 - 443309 1168 533 aa, chain - ## HITS:1 COG:MT3575 KEGG:ns NR:ns ## COG: MT3575 COG0119 # Protein_GI_number: 15843081 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Mycobacterium tuberculosis CDC1551 # 6 257 10 257 334 68 25.0 2e-11 MGRLQLLDCTLREAPIDNLMWGEMNIHKMIHGLEMANVDIIECGFLKNADYIKGSTSFQR VEDIIPFLKNKKKGKTYVALVDYGRYDLKYLSEYDRRSIDAIRVCFKKNEIGLVLDYAAQ IRAKGYQVCVQHVDTMGFFDEEIIDFIEKVNEFKPLAYSVVDTFGAMYEEDMLHYTGLAD QYLSKDILLGFHAHNNLMLADANAQSYINKMSAHRDVVVDASLFGCGRSAGNAHTELIAQ FINKKHEEKYNIDELLDLIDTVIASAQEVTSWGYSIPYFIAGIYNAHTFNVKQLLKRHNI KSKDLRGIIELLDDVQKKKYDYALLEKLYVEYFDNPVADALALEKLSNTFKNRNILLLAP GKSVAAEKEHIQKFIDEKKTMVIAVNNVISGYQLDYIFYSSTNRYNLRKFQECAPETKVI VTSNIQGAIEEDTTVVNYSSLIKFGWVNLDSSAILLLRLLIRCGVEGVYVAGLDGYRSAG DTFYKNDLETGVEDKDRQELTRENTEMIADIVQTNPEFEIHFITQSEYSKALE Prediction of potential genes in microbial genomes Time: Thu Jun 30 18:37:51 
2011 Seq name: gi|157101633|gb|DS480691.1| Clostridium bolteae ATCC BAA-613 Scfld_02_32 genomic scaffold, whole genome shotgun sequence Length of sequence - 182531 bp Number of predicted genes - 188, with homology - 183 Number of transcription units - 75, operones - 40 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 98 - 157 7.2 1 1 Tu 1 . + CDS 230 - 1393 1128 ## COG5505 Predicted integral membrane protein + Term 1420 - 1471 9.2 + Prom 1458 - 1517 3.2 2 2 Tu 1 . + CDS 1565 - 2263 718 ## COG1802 Transcriptional regulators + Term 2310 - 2363 8.5 + Prom 2502 - 2561 6.0 3 3 Op 1 13/0.000 + CDS 2622 - 3098 525 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 4 3 Op 2 10/0.000 + CDS 3103 - 3384 429 ## COG3414 Phosphotransferase system, galactitol-specific IIB component 5 3 Op 3 . + CDS 3406 - 4785 1311 ## COG3775 Phosphotransferase system, galactitol-specific IIC component 6 3 Op 4 1/0.167 + CDS 4811 - 6736 1674 ## COG3711 Transcriptional antiterminator 7 3 Op 5 . + CDS 6782 - 7462 634 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 8 3 Op 6 . + CDS 7478 - 8140 610 ## Rfer_0436 Asp/Glu racemase 9 3 Op 7 . + CDS 8154 - 9386 516 ## COG3395 Uncharacterized protein conserved in bacteria 10 4 Op 1 . - CDS 9477 - 10880 1289 ## COG0534 Na+-driven multidrug efflux pump - Prom 10906 - 10965 1.6 - Term 10899 - 10943 8.2 11 4 Op 2 . - CDS 10968 - 12209 1181 ## COG0733 Na+-dependent transporters of the SNF family - Prom 12243 - 12302 1.8 12 5 Op 1 . - CDS 12389 - 13006 805 ## COG3546 Mn-containing catalase 13 5 Op 2 . - CDS 13026 - 13322 373 ## Closa_1284 spore coat protein CotJB 14 5 Op 3 . - CDS 13315 - 13533 129 ## Closa_1285 hypothetical protein - Prom 13617 - 13676 4.1 - Term 13854 - 13906 1.5 15 6 Op 1 . - CDS 13976 - 15865 2012 ## mru_2083 phosphoenolpyruvate synthase/pyruvate phosphate dikinase (EC:2.7.9.2) 16 6 Op 2 . 
- CDS 15934 - 16899 942 ## COG0583 Transcriptional regulator - Prom 16953 - 17012 9.1 + Prom 16928 - 16987 11.3 17 7 Op 1 . + CDS 17072 - 18316 989 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs 18 7 Op 2 . + CDS 18368 - 19630 1268 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + Term 19683 - 19719 5.2 + Prom 19833 - 19892 8.0 19 8 Tu 1 . + CDS 19966 - 20796 623 ## COG2207 AraC-type DNA-binding domain-containing proteins + Prom 20824 - 20883 6.8 20 9 Op 1 . + CDS 20926 - 22233 1265 ## COG2873 O-acetylhomoserine sulfhydrylase 21 9 Op 2 . + CDS 22396 - 23646 1288 ## COG0133 Tryptophan synthase beta chain + Term 23655 - 23684 2.8 - Term 23750 - 23803 4.4 22 10 Tu 1 . - CDS 23824 - 25020 1526 ## COG0538 Isocitrate dehydrogenases - Prom 25047 - 25106 4.7 23 11 Tu 1 . - CDS 25166 - 26152 819 ## COG0673 Predicted dehydrogenases and related proteins - Prom 26185 - 26244 3.2 - Term 26252 - 26296 7.1 24 12 Tu 1 . - CDS 26308 - 27156 536 ## Closa_1701 hypothetical protein - Prom 27257 - 27316 5.1 25 13 Tu 1 . + CDS 27688 - 29007 685 ## COG0534 Na+-driven multidrug efflux pump + Prom 29328 - 29387 4.7 26 14 Op 1 25/0.000 + CDS 29527 - 30348 372 ## COG1192 ATPases involved in chromosome partitioning 27 14 Op 2 . + CDS 30305 - 31234 449 ## COG1475 Predicted transcriptional regulators 28 14 Op 3 . + CDS 31246 - 31518 117 ## gi|160939352|ref|ZP_02086702.1| hypothetical protein CLOBOL_04245 + Term 31741 - 31779 1.1 + Prom 31723 - 31782 2.1 29 15 Op 1 . + CDS 31803 - 32681 361 ## BLJ_1240 hypothetical protein 30 15 Op 2 . + CDS 32718 - 33464 330 ## COG3617 Prophage antirepressor 31 15 Op 3 . + CDS 33554 - 34054 427 ## Ethha_1891 hypothetical protein 32 15 Op 4 . + CDS 34051 - 34557 353 ## Closa_3720 TRAG family protein + Term 34786 - 34824 -0.6 33 16 Tu 1 . + CDS 35117 - 36934 974 ## COG3344 Retron-type reverse transcriptase 34 17 Op 1 . 
+ CDS 37043 - 37591 474 ## COG3505 Type IV secretory pathway, VirD4 components 35 17 Op 2 . + CDS 37664 - 39544 977 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 36 17 Op 3 . + CDS 39571 - 39924 334 ## CDR20291_1745 hypothetical protein 37 17 Op 4 . + CDS 39941 - 40240 231 ## CDR20291_1746 hypothetical protein + Term 40257 - 40292 1.8 - Term 40183 - 40210 -0.8 38 18 Tu 1 . - CDS 40268 - 40636 371 ## CDR20291_1747 hypothetical protein - Prom 40746 - 40805 4.3 + Prom 40708 - 40767 7.5 39 19 Op 1 40/0.000 + CDS 40962 - 41654 427 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 40 19 Op 2 3/0.000 + CDS 41660 - 42577 698 ## COG0642 Signal transduction histidine kinase + Term 42581 - 42617 6.3 + Prom 42634 - 42693 3.5 41 20 Op 1 . + CDS 42713 - 43639 277 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 42 20 Op 2 . + CDS 43680 - 44426 460 ## CKR_3399 hypothetical protein 43 20 Op 3 . + CDS 44439 - 45254 341 ## Closa_3153 hypothetical protein 44 20 Op 4 . + CDS 45260 - 45721 245 ## CDR20291_1753 hypothetical protein + Term 45788 - 45834 12.1 45 21 Tu 1 . + CDS 46175 - 46588 376 ## CDR20291_1754 rna polymerase, sigma-24 subunit, ecf subfamily (ecf subfamily rna polymerase sigma-70 factor) + Term 46637 - 46684 9.1 46 22 Op 1 . + CDS 46692 - 47144 262 ## CDR20291_1755 sigma-24 (FecI) 47 22 Op 2 . + CDS 47155 - 47616 116 ## CDR20291_1756 rna polymerase, sigma-24 subunit, ecf subfamily 48 22 Op 3 . + CDS 47645 - 47935 225 ## CDR20291_1757 hypothetical protein + Term 48085 - 48112 -0.8 + Prom 48110 - 48169 2.9 49 23 Tu 1 . + CDS 48242 - 49567 680 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Term 49627 - 49666 3.5 50 24 Op 1 9/0.000 - CDS 49675 - 49965 105 ## COG3041 Uncharacterized protein conserved in bacteria 51 24 Op 2 . 
- CDS 49962 - 50240 355 ## COG3077 DNA-damage-inducible protein J - Prom 50263 - 50322 5.2 52 25 Tu 1 . + CDS 50367 - 50765 240 ## CDR20291_1761 hypothetical protein + Prom 50776 - 50835 5.3 53 26 Op 1 . + CDS 50879 - 51628 471 ## CDR20291_1762 phage protein 54 26 Op 2 . + CDS 51625 - 52488 631 ## COG1484 DNA replication protein 55 26 Op 3 . + CDS 52553 - 52726 192 ## gi|210612983|ref|ZP_03289548.1| hypothetical protein CLONEX_01750 56 26 Op 4 . + CDS 52804 - 53169 302 ## CDR20291_1764 hypothetical protein + Term 53182 - 53225 7.4 + Prom 53193 - 53252 1.8 57 27 Tu 1 . + CDS 53407 - 54078 431 ## COG3505 Type IV secretory pathway, VirD4 components + Prom 54217 - 54276 2.8 58 28 Tu 1 . + CDS 54297 - 54437 59 ## EUBREC_3592 hypothetical protein + Prom 54501 - 54560 7.1 59 29 Op 1 . + CDS 54601 - 54906 160 ## gi|225375430|ref|ZP_03752651.1| hypothetical protein ROSEINA2194_01055 + Term 54949 - 55004 1.1 + Prom 54917 - 54976 4.0 60 29 Op 2 . + CDS 55021 - 55509 167 ## gi|160939391|ref|ZP_02086741.1| hypothetical protein CLOBOL_04284 + Term 55590 - 55634 6.8 61 30 Op 1 . + CDS 56137 - 56532 356 ## EUBREC_3584 hypothetical protein 62 30 Op 2 . + CDS 56504 - 58129 1042 ## EUBREC_3583 hypothetical protein 63 30 Op 3 . + CDS 58119 - 58505 187 ## EUBREC_3582 hypothetical protein + Prom 58515 - 58574 4.5 64 31 Op 1 . + CDS 58615 - 59376 372 ## EUBREC_3581 hypothetical protein 65 31 Op 2 . + CDS 59373 - 60227 604 ## COG1484 DNA replication protein 66 31 Op 3 . + CDS 60224 - 60418 174 ## Tresu_1913 hypothetical protein 67 32 Op 1 . + CDS 60796 - 61146 305 ## CDR20291_1745 hypothetical protein 68 32 Op 2 . + CDS 61160 - 61435 262 ## gi|160939402|ref|ZP_02086752.1| hypothetical protein CLOBOL_04295 + Term 61451 - 61497 10.0 - Term 61438 - 61483 6.0 69 33 Op 1 . - CDS 61499 - 62527 1076 ## COG3943 Virulence protein 70 33 Op 2 . - CDS 62524 - 63441 681 ## CDR20291_0776 hypothetical protein 71 33 Op 3 . 
- CDS 63476 - 64312 685 ## gi|160939405|ref|ZP_02086755.1| hypothetical protein CLOBOL_04298 - Prom 64338 - 64397 5.3 + Prom 64305 - 64364 3.7 72 34 Op 1 . + CDS 64575 - 64778 122 ## gi|160939406|ref|ZP_02086756.1| hypothetical protein CLOBOL_04299 73 34 Op 2 . + CDS 64780 - 66699 1500 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 66705 - 66748 10.0 74 35 Op 1 . + CDS 66767 - 67387 509 ## COG0358 DNA primase (bacterial type) 75 35 Op 2 . + CDS 67365 - 68744 1037 ## COG3378 Predicted ATPase 76 36 Op 1 . + CDS 69137 - 69346 364 ## ELI_1141 hypothetical protein 77 36 Op 2 . + CDS 69351 - 69557 225 ## gi|160939412|ref|ZP_02086762.1| hypothetical protein CLOBOL_04305 + Term 69566 - 69607 10.0 + Prom 69758 - 69817 1.6 78 37 Tu 1 . + CDS 69986 - 71680 1103 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Term 71650 - 71700 18.3 79 38 Tu 1 . - CDS 71710 - 72657 442 ## COG0582 Integrase + Prom 73279 - 73338 3.3 80 39 Tu 1 . + CDS 73510 - 73641 75 ## + Term 73884 - 73943 2.2 81 40 Tu 1 . - CDS 73903 - 74427 -37 ## Rvan_3142 restriction modification system DNA specificity domain protein - Prom 74496 - 74555 2.8 + Prom 75329 - 75388 4.3 82 41 Tu 1 . + CDS 75470 - 75850 125 ## BF1842 putative type IC restriction-modification system specificity subunit, partial + Term 75936 - 75974 0.8 83 42 Tu 1 . - CDS 75843 - 75941 87 ## gi|160939419|ref|ZP_02086769.1| hypothetical protein CLOBOL_04312 - Prom 76053 - 76112 2.3 84 43 Op 1 4/0.000 - CDS 77138 - 78706 939 ## COG0286 Type I restriction-modification system methyltransferase subunit 85 43 Op 2 . - CDS 78719 - 81484 1427 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 86 43 Op 3 . - CDS 81563 - 81778 91 ## COG3655 Predicted transcriptional regulator - Prom 81810 - 81869 8.6 87 44 Tu 1 . 
+ CDS 82067 - 83485 1133 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Prom 83490 - 83549 3.4 88 45 Tu 1 . + CDS 83739 - 84410 649 ## Ethha_1894 hypothetical protein + Prom 84895 - 84954 1.9 89 46 Op 1 . + CDS 84986 - 86818 979 ## COG3344 Retron-type reverse transcriptase 90 46 Op 2 . + CDS 86894 - 87196 179 ## Ethha_1894 hypothetical protein 91 46 Op 3 . + CDS 87211 - 87756 23 ## COG4725 Transcriptional activator, adenine-specific DNA methyltransferase 92 46 Op 4 . + CDS 87771 - 88226 271 ## Ethha_1896 hypothetical protein 93 46 Op 5 . + CDS 88123 - 89157 921 ## Ethha_1897 hypothetical protein 94 47 Tu 1 . + CDS 89880 - 91853 234 ## COG3344 Retron-type reverse transcriptase 95 48 Op 1 . + CDS 92019 - 93386 1061 ## COG3451 Type IV secretory pathway, VirB4 components 96 48 Op 2 . + CDS 93389 - 93763 290 ## gi|160939432|ref|ZP_02086782.1| hypothetical protein CLOBOL_04325 97 48 Op 3 . + CDS 93810 - 95465 808 ## CD1108 putative DNA-repair protein 98 48 Op 4 . + CDS 95495 - 95746 407 ## CD1107A hypothetical protein 99 48 Op 5 . + CDS 95736 - 96488 710 ## CD1107 hypothetical protein 100 49 Op 1 . + CDS 96636 - 97343 367 ## gi|160939437|ref|ZP_02086787.1| hypothetical protein CLOBOL_04330 101 49 Op 2 . + CDS 97324 - 99423 1414 ## COG0550 Topoisomerase IA 102 49 Op 3 . + CDS 99420 - 99698 261 ## Ethha_1904 hypothetical protein 103 49 Op 4 . + CDS 99705 - 103106 2393 ## COG0358 DNA primase (bacterial type) 104 49 Op 5 . + CDS 103108 - 103323 301 ## Ethha_1906 hypothetical protein + Term 103342 - 103378 8.0 105 50 Op 1 . + CDS 103402 - 110757 5275 ## COG4646 DNA methylase 106 50 Op 2 . + CDS 110738 - 111112 239 ## SSUBM407_0953 hypothetical protein + Term 111124 - 111158 3.4 + Prom 111195 - 111254 9.7 107 51 Tu 1 . + CDS 111479 - 113398 177 ## COG0480 Translation elongation factors (GTPases) + Term 113540 - 113581 -0.9 + Prom 113711 - 113770 2.6 108 52 Op 1 . + CDS 113791 - 113919 77 ## + Term 113966 - 114004 4.0 109 52 Op 2 . 
+ CDS 114113 - 114622 330 ## Clocel_3955 sigma-70 region 4 domain-containing protein + Term 114638 - 114687 12.0 + Prom 114627 - 114686 1.7 110 53 Op 1 . + CDS 114718 - 115143 311 ## MGAS2096_Spy1146 putative cytoplasmic protein + Term 115177 - 115217 5.1 111 53 Op 2 . + CDS 115227 - 115700 290 ## MGAS2096_Spy1145 RNA polymerase ECF-type sigma factor + Term 115759 - 115797 2.2 + Prom 115724 - 115783 6.9 112 54 Op 1 . + CDS 115811 - 115897 70 ## 113 54 Op 2 . + CDS 115854 - 116894 337 ## MGAS2096_Spy1144 hypothetical protein 114 55 Op 1 . + CDS 117508 - 118473 421 ## MGAS2096_Spy1143 plasmid recombination protein Mob family 115 55 Op 2 . + CDS 118481 - 119335 530 ## COG1475 Predicted transcriptional regulators 116 55 Op 3 . + CDS 119381 - 119611 210 ## COG4443 Uncharacterized protein conserved in bacteria 117 55 Op 4 . + CDS 119682 - 119873 241 ## gi|160939459|ref|ZP_02086809.1| hypothetical protein CLOBOL_04352 118 55 Op 5 . + CDS 119928 - 121859 798 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 119 55 Op 6 . + CDS 121844 - 122776 557 ## COG4646 DNA methylase 120 55 Op 7 . + CDS 122778 - 123584 503 ## Ethha_1910 hypothetical protein 121 55 Op 8 . + CDS 123587 - 123925 179 ## Ethha_1911 hypothetical protein + Term 123946 - 123999 12.0 + Prom 123987 - 124046 3.3 122 56 Tu 1 . + CDS 124078 - 124377 284 ## Ethha_1912 membrane protein + Prom 124416 - 124475 6.2 123 57 Op 1 . + CDS 124503 - 125096 148 ## EUBREC_3502 hypothetical protein + Prom 125098 - 125157 2.6 124 57 Op 2 . + CDS 125186 - 125947 494 ## EUBREC_3503 hypothetical protein 125 57 Op 3 . + CDS 125954 - 126076 68 ## gi|160939469|ref|ZP_02086819.1| hypothetical protein CLOBOL_04362 + Term 126272 - 126312 5.3 + Prom 126233 - 126292 3.7 126 58 Tu 1 . + CDS 126325 - 127737 878 ## COG3843 Type IV secretory pathway, VirD2 components (relaxase) + Term 127830 - 127870 2.8 - Term 127708 - 127750 2.4 127 59 Tu 1 . 
- CDS 127927 - 128241 251 ## Sgly_1172 helix-turn-helix domain protein - Prom 128311 - 128370 5.9 + Prom 128260 - 128319 10.6 128 60 Op 1 . + CDS 128399 - 128941 163 ## COG4474 Uncharacterized protein conserved in bacteria 129 60 Op 2 . + CDS 129025 - 129219 74 ## gi|332655013|ref|ZP_08420754.1| hypothetical protein HMPREF0866_02741 130 60 Op 3 . + CDS 129235 - 130350 708 ## Bmur_0750 hypothetical protein 131 60 Op 4 . + CDS 130411 - 130644 203 ## gi|160939475|ref|ZP_02086825.1| hypothetical protein CLOBOL_04368 132 60 Op 5 . + CDS 130658 - 131200 565 ## Sterm_0689 hypothetical protein + Prom 131202 - 131261 2.8 133 61 Op 1 . + CDS 131370 - 131819 352 ## LMHCC_2038 hypothetical protein 134 61 Op 2 . + CDS 131834 - 132175 264 ## Vpar_0189 hypothetical protein 135 61 Op 3 . + CDS 132183 - 133262 912 ## COG0666 FOG: Ankyrin repeat 136 61 Op 4 . + CDS 133267 - 133887 537 ## TDE0558 hypothetical protein 137 61 Op 5 . + CDS 133900 - 134490 510 ## Vpar_1736 hypothetical protein 138 61 Op 6 . + CDS 134504 - 135343 950 ## TDE0503 hypothetical protein 139 61 Op 7 . + CDS 135357 - 135860 627 ## gi|160939483|ref|ZP_02086833.1| hypothetical protein CLOBOL_04376 140 61 Op 8 . + CDS 135850 - 136449 125 ## gi|160939484|ref|ZP_02086834.1| hypothetical protein CLOBOL_04377 141 61 Op 9 . + CDS 136472 - 137107 458 ## FN0653 hypothetical protein + Term 137289 - 137315 -0.6 142 62 Op 1 . + CDS 137316 - 137687 309 ## gi|167772517|ref|ZP_02444570.1| hypothetical protein ANACOL_03895 143 62 Op 2 . + CDS 137720 - 138688 362 ## Sterm_1475 hypothetical protein 144 62 Op 3 . + CDS 138705 - 139250 446 ## CLL_A2815 hypothetical protein 145 62 Op 4 . + CDS 139267 - 139593 586 ## Sterm_1192 hypothetical protein 146 62 Op 5 . + CDS 139599 - 140546 837 ## BP951000_1074 hypothetical protein 147 62 Op 6 . + CDS 140574 - 142055 1137 ## TDE0253 hypothetical protein 148 62 Op 7 . + CDS 142069 - 142668 470 ## gi|160939493|ref|ZP_02086843.1| hypothetical protein CLOBOL_04386 149 62 Op 8 . 
+ CDS 142700 - 143299 368 ## PPE_04519 hypothetical protein 150 62 Op 9 . + CDS 143302 - 143547 164 ## FN1193 hypothetical protein 151 62 Op 10 . + CDS 143558 - 144292 447 ## gi|160939496|ref|ZP_02086846.1| hypothetical protein CLOBOL_04389 152 62 Op 11 . + CDS 144289 - 144699 277 ## gi|160939497|ref|ZP_02086847.1| hypothetical protein CLOBOL_04390 153 62 Op 12 . + CDS 144711 - 145466 400 ## Vpar_0727 hypothetical protein 154 62 Op 13 . + CDS 145476 - 147113 1078 ## EFER_3822 hypothetical protein 155 62 Op 14 . + CDS 147156 - 147626 570 ## Bsph_3254 hypothetical protein 156 62 Op 15 . + CDS 147630 - 153146 4549 ## COG4859 Uncharacterized protein conserved in bacteria 157 62 Op 16 . + CDS 153162 - 154025 774 ## COG3196 Uncharacterized protein conserved in bacteria 158 62 Op 17 . + CDS 154042 - 154818 724 ## gi|160939503|ref|ZP_02086853.1| hypothetical protein CLOBOL_04396 159 62 Op 18 . + CDS 154845 - 155720 607 ## gi|160939504|ref|ZP_02086854.1| hypothetical protein CLOBOL_04397 160 62 Op 19 . + CDS 155717 - 156595 714 ## TDE0553 hypothetical protein 161 62 Op 20 . + CDS 156596 - 156736 139 ## 162 62 Op 21 . + CDS 156765 - 157727 537 ## TepRe1_0261 hypothetical protein + Term 157735 - 157772 3.2 + Prom 157759 - 157818 5.0 163 63 Op 1 . + CDS 157859 - 158227 231 ## Cthe_1745 XRE family transcriptional regulator 164 63 Op 2 . + CDS 158296 - 158853 133 ## Cthe_1746 hypothetical protein + Term 158860 - 158912 2.1 + Prom 158856 - 158915 2.2 165 64 Tu 1 . + CDS 158958 - 159494 117 ## Closa_2767 hypothetical protein + Term 159600 - 159629 1.1 166 65 Tu 1 . + CDS 159921 - 160343 77 ## gi|160939512|ref|ZP_02086862.1| hypothetical protein CLOBOL_04405 + Term 160441 - 160469 -1.0 + Prom 160753 - 160812 3.6 167 66 Op 1 . + CDS 160832 - 161212 174 ## COG2337 Growth inhibitor 168 66 Op 2 . + CDS 161243 - 161470 215 ## Ethha_1353 DNA binding domain protein, excisionase family + Term 161474 - 161521 3.4 + Prom 161481 - 161540 3.5 169 67 Op 1 . 
+ CDS 161575 - 162012 147 ## gi|160939516|ref|ZP_02086866.1| hypothetical protein CLOBOL_04409 170 67 Op 2 . + CDS 161990 - 163372 620 ## COG0582 Integrase + Term 163389 - 163429 5.2 171 68 Tu 1 . + CDS 163717 - 163839 75 ## + Term 164080 - 164129 11.7 - TRNA 163721 - 163797 79.7 # Arg CCT 0 0 172 69 Tu 1 . + CDS 164599 - 165879 728 ## COG3344 Retron-type reverse transcriptase + Term 165922 - 165974 3.1 + Prom 166016 - 166075 5.9 173 70 Tu 1 . + CDS 166109 - 167197 409 ## CPF_0975 hypothetical protein + Prom 167212 - 167271 5.4 174 71 Tu 1 . + CDS 167500 - 168216 21 ## CTC01115 hypothetical protein + Term 168240 - 168275 4.4 + Prom 168218 - 168277 5.2 175 72 Op 1 . + CDS 168400 - 168714 248 ## CD3346 hypothetical protein 176 72 Op 2 . + CDS 168736 - 169113 418 ## CD3345 hypothetical protein 177 72 Op 3 3/0.000 + CDS 169150 - 170289 917 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 178 72 Op 4 1/0.167 + CDS 170270 - 170548 165 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 179 72 Op 5 . + CDS 170518 - 171924 875 ## COG2946 Putative phage replication protein RstA 180 72 Op 6 . + CDS 171976 - 172197 280 ## CD3342A hypothetical protein 181 73 Op 1 . + CDS 172319 - 172816 423 ## CD3340 putative conjugative transposon antirestriction protein + Term 172819 - 172862 5.4 182 73 Op 2 . + CDS 172909 - 173301 358 ## CD3339 conjugative transposon membrane protein 183 73 Op 3 . + CDS 173285 - 175738 2244 ## CD3338 hypothetical protein 184 73 Op 4 . + CDS 175735 - 178116 1054 ## CD3337 conjugative transposon membrane protein 185 73 Op 5 . + CDS 178113 - 179117 753 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 186 73 Op 6 . + CDS 179133 - 180041 758 ## CD3335 hypothetical protein + Term 180129 - 180160 -0.5 187 74 Tu 1 . + CDS 180400 - 181263 -87 ## DSY1388 hypothetical protein 188 75 Tu 1 . 
- CDS 181923 - 182531 358 ## COG3547 Transposase and inactivated derivatives Predicted protein(s) >gi|157101633|gb|DS480691.1| GENE 1 230 - 1393 1128 387 aa, chain + ## HITS:1 COG:BS_yjcL KEGG:ns NR:ns ## COG: BS_yjcL COG5505 # Protein_GI_number: 16078255 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Bacillus subtilis # 26 377 29 388 396 102 26.0 1e-21 MEALITRTEGLIFVILIMVAFALWLQRFKAFKSLGPVLTVVVLGIILSNTHVVPISHDFY SALSTYCVTPAISICLLSMNMRELKKLNREPIIALVSAIFSVCFIAIILGLFFAPRITEG WKCAGMFVGTYTGGTPNLTAIATGLDCSRETLAAANAADYVVSTPLMVFLFAAPAILKAS KRWNKLWPYQFTKEELDDGEDEPLMSDKRWSIRDIAWLLTIGFGVSFVCTVIAQSIFPDT FWKAGRLLILTTVSIGLAQLKPVQKLRGNLDLGLFISLTFLATIGFAVDLQQFIGSALMM TLYVLFMLIGCIVLHLIICRIFKIKYEYVILSMVGCIVDGPTSSLTAAGADWKSLINVGL IMGVIAGACGNYVGIFVSYVIKGICGL >gi|157101633|gb|DS480691.1| GENE 2 1565 - 2263 718 232 aa, chain + ## HITS:1 COG:mll6988 KEGG:ns NR:ns ## COG: mll6988 COG1802 # Protein_GI_number: 13475819 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 28 148 31 151 220 77 34.0 2e-14 MGKLTYQTLRENVVDVIRMKIFNHELAPGMRIIEQDISDELGVSRGPIREALRQLEQEGL VEYVRNVGCSVKNITLEDIYEIYLLRATYEILAVKLCRGVFSETAFAEMDAALESMKDLK ETDYNKSIACDNMLHEAIIRSTGLPRLIKGWTDLNYGNAINYYAGNPDSRAMIERQYPIH KELVDVCRTGNTEEICRAISNHYMKTIRRRLKEQGMPEDHFKFSIDVIDGWV >gi|157101633|gb|DS480691.1| GENE 3 2622 - 3098 525 158 aa, chain + ## HITS:1 COG:lin2816 KEGG:ns NR:ns ## COG: lin2816 COG1762 # Protein_GI_number: 16801877 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Listeria innocua # 8 143 9 141 154 83 32.0 1e-16 MTGDIVYKDLIQLDLDVRNTDELFEVMSGRLMELGYVTGDFLESIKEREKEYPTALPIEP YAVAIPHTDPECIKNAFIACIRLKEPIPWREMAANEVVHMVRFVFLLGFKQEAVGDAHVE LLQILVNNCQRPDFMERLRQAANIEAYFNLVLSMEGIG >gi|157101633|gb|DS480691.1| GENE 4 3103 - 3384 429 93 aa, chain + ## HITS:1 COG:lin2201 KEGG:ns NR:ns ## COG: lin2201 
COG3414 # Protein_GI_number: 16801266 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIB component # Organism: Listeria innocua # 1 90 1 90 91 92 57.0 1e-19 MKRIIVACANGVATSQAVASKVNRLLKERKVDAVVDAVNIKSLDRDIKDCVAYITITKTV KEYPVPVINGIAFLTGVRQEDELKKLIDAVNKS >gi|157101633|gb|DS480691.1| GENE 5 3406 - 4785 1311 459 aa, chain + ## HITS:1 COG:lin2200 KEGG:ns NR:ns ## COG: lin2200 COG3775 # Protein_GI_number: 16801265 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Listeria innocua # 1 446 1 443 449 412 50.0 1e-115 MQVLTNVVNFFLGLGAPIFVPIIMIVGSIMVGMKIKDAISSGITLGVAFTGMNLLISFMK DSITPAAQAIMTSTGISLPIVDGGWSTMATISWAWPYAFLMFPLLIAVNVVMILLNKTNT INADLWNVWLKIFTAVAVVSITHSVILAFVIAAVQIILELKSSDAHQHRIEKLTGIPGVS VTHATLFFGAILYPIALVLRKIPGLNREFDTDMLKDKVGIFAENHVMGFILGCLFGILAK YSLAQILTTGIKAAAAMTLFPVVAKYFMQALSPISEAVSAFMNKKFEGKTLIVGLDWPIL GGCNEIWLAILWSIPFMLIYSMVLPGNEILPFAGLMSASLALPAFLVTRGNLLQMSILCV IGAPVFLWVGTAFAPFMTELAIATGSIDLQAGELISNSLINGPVFTYSISHLFMFLKGNF MPLIIFAVWLAGFILYYRDLMREAREDAALEAGNAKAGQ >gi|157101633|gb|DS480691.1| GENE 6 4811 - 6736 1674 641 aa, chain + ## HITS:1 COG:BH0193_1 KEGG:ns NR:ns ## COG: BH0193_1 COG3711 # Protein_GI_number: 15612756 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus halodurans # 1 485 1 483 537 261 34.0 4e-69 MYLDSRSYMIFQEIVDNSSATGKGLEEKFHLTRKQLSYSFDKINDYLRDNGHPEIKRLKT GYFMVPGPVAEHYSAKEAAITKSSYTYSDEERAYWIELRLLCHEDELSTYHFTEELQISK NTLLSDFKKVQERVNRFSLELAYNRKEGYRLYGSEYHKRELLTHALRKLLNMPGGEENLN RIYGIRKEGVKELRANIEEIENKLQLQFTDERLKELLYIFYFTLIRIKNEKTVEAMDRYR LVTTTKEYSVVAAFADKYRIKDENEIVNFAVQIQISKVHNRVHEDIGNLELLRDAGMHML ERFESISCVRLDGRAEFLEILYQHMKPAVYRIWYGYHVEPDITDMILPKYQYLHEIVRKA VRPYEVYLTCTFPDPELVYITVLFASWLHKEGKLIQVQEKERAVVVCTNGLTVSQYLFVS LSELLPEIEFLECLSLRDFYEYDKDFQIVFSTVRLETDKKQFVVFPFANDIAKKSFREKV LDNIRVSEGEVKLQSRLPFLLPDERIQITREMPDWQNAIRMASAPLLDNGFINRNYIEKA 
IDMVETDKRFIMIADGVIIAHAGVDDGVYSMGMSFLKLPEKLSFNGYMDADIIVVLATPD KTRHLPALYQLFDLLEDEGNISAMRRAQDAHEIARLIRKYI >gi|157101633|gb|DS480691.1| GENE 7 6782 - 7462 634 226 aa, chain + ## HITS:1 COG:TVN1450 KEGG:ns NR:ns ## COG: TVN1450 COG0235 # Protein_GI_number: 13542281 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Thermoplasma volcanium # 11 209 3 197 218 118 33.0 1e-26 MSQETIQRAIQKAKEQVMWGCRTMVDKGYALGTAGNISARVEGEPLFVITPSARPYLTMK PEDLAVGDMDGNVVSGPYKPSIEFSMHLGIYKERPEVNCIVHTHSKFATAVSAMDGVEKV PVIDIEGVVYLGGEILVAPFAPPGSGELAAHVRNTIGTNAGLLMASHGAIGTGLTMEDAM ISSDNVERACEMYLAILGAGRQVRKLPEDYMEMAKKKSLVRRNVVL >gi|157101633|gb|DS480691.1| GENE 8 7478 - 8140 610 220 aa, chain + ## HITS:1 COG:no KEGG:Rfer_0436 NR:ns ## KEGG: Rfer_0436 # Name: not_defined # Def: Asp/Glu racemase # Organism: R.ferrireducens # Pathway: not_defined # 3 217 2 215 219 160 46.0 5e-38 MERKVAVIHTTVVTCDGINSRLKALVPDAEVMNIVDDSLLNDVKKEGMLTKEVTRRLLTY ALEAQDWGAELILNACSSVGEGVDVIRPLLKIPYLKIDEPMARSAVEKGEKIAVYGTVKT TLEPSARLISHIAEAEKKKVVVDSYLASDAFEALTVEKNQEKHNQILEALIRDTGKQYDV LVLAQASMSVLIPCLADITKPILSSMDSGVEAAAKALKGL >gi|157101633|gb|DS480691.1| GENE 9 8154 - 9386 516 410 aa, chain + ## HITS:1 COG:HI1011 KEGG:ns NR:ns ## COG: HI1011 COG3395 # Protein_GI_number: 16272946 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Haemophilus influenzae # 2 403 1 405 413 269 38.0 8e-72 MILGCVADDFTGASDAASFLQAGGANVILSNGIPEREDEDILEAQAVVIALKTRTCPPGE AVDRSMAAFSWLKAHGARMLYFKYCSTFDSTENGNIGPVLDAALERWDIPYTLLCPALLE NGRSVKDGKLYVNGVLLEDSPMKDHPLNPMRDSRLKYLMERQSRYPCFNLGAGELERGFR PYEKPGHYYLIPDYYEEKHGRMIAKQYGNLPLLSGGSGLLRMIARTWEPDEERSVLTPSR CAAPVILAGSCSEATRTQIRYYESKGGASLRIHPLRFVQGIQTMEHIRAFLEENKDQAAL VYTSAPPDEVNRIKAEWRESYPPLEHYNEELLAGTAAYALESGRKRIIAAGGETSGAVMQ KLEMHTFHIGRSIAPGVPVMYPAEQEGMELILKSGNFGQEDFFIRAARDI >gi|157101633|gb|DS480691.1| GENE 10 9477 - 10880 1289 467 aa, chain - ## HITS:1 COG:FN0944 
KEGG:ns NR:ns ## COG: FN0944 COG0534 # Protein_GI_number: 19704279 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 452 1 452 455 419 50.0 1e-117 MKSSSTQENPLGILPVNQLLARFAIPSIIAMLVSALYNIVDQFYIGRSVGMLGNAATNVA FPLTITCTAISLMCGIGGAANFNLSLGSGEKEEATRYAGNAVFMLAAIGILLCLTVRLFL KPLMIVFGATPDVLDYSLTYTGITSLGFPFLILTTGGSNLIRADGSPRFSMMCTLTGAVL NTILDPLFIFGFHMGMAGAALATIIGQIISGIMVIVYLTRFKTVRLSLGSLVPVPSHMKA IAALGMAPFFNQMAMMVVQIVMNNILRYYGAQSHYGSEIPLACAGIITKVNMIFFSLVIG LSQGLQPIVSFNYGARKFRRVREAYLKAVLIATMISTLSFLCFQLIPRQIIGIFGSGSEE YFHFAEQYFRIFLFFTFLNGLQPITANYFTSIGKAAKGIFISLTRQIIFLLPLIVVLPMF LGIEGVMYSAPVADLMAALLALVFVTRELRRMRGMEEEMNCGAETAD >gi|157101633|gb|DS480691.1| GENE 11 10968 - 12209 1181 413 aa, chain - ## HITS:1 COG:FN1944 KEGG:ns NR:ns ## COG: FN1944 COG0733 # Protein_GI_number: 19705249 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Fusobacterium nucleatum # 2 413 48 459 459 405 53.0 1e-112 MFPARVSKYGGGTFLLPYFLFVIMIGLSGVIGEMAFGRAARSGPIGAFGQAMASRGKGRR LGEAIGYIPVLGSLSLAIGYSVIVGWILRYSAGSLTGTTLAPGDVDGFALAFSSMAGSFG NNFWQTVGLVITFIIMVLGISGGIEKINKIIMPLFFLLFLGLAVYMAFQPSAADGYRYIF RIDLKGLADPMTWVFALGQAFFSLSLAGNGTLIYGSYLDDREDVTNAAWKVALFDTLAAL LAALVIIPAMATAGSGLDEGGPGLIFIYLPNLFKSMPGSRVLVIVFFTAVLFAGISSLIN LFEAPIAALQQQFKLSRRTSVSVITAAALAIGLCIQGIVSGWMDFVSIYICPLGAGLAGI MFFWVFGSDYAKKEVEKGRLRPIGPWFSFMTSYLFCGLTAAVFVLGIVFGGIG >gi|157101633|gb|DS480691.1| GENE 12 12389 - 13006 805 205 aa, chain - ## HITS:1 COG:CAC1338 KEGG:ns NR:ns ## COG: CAC1338 COG3546 # Protein_GI_number: 15894617 # Func_class: P Inorganic ion transport and metabolism # Function: Mn-containing catalase # Organism: Clostridium acetobutylicum # 1 186 1 186 200 229 59.0 3e-60 MWTYEKRLQFPVKITQTCPKTASLIISQFGGPDGELAASMRYLSQRYTMPCKEVGGLLTD IGTEELAHLEMICAIIYQLTKNLTPEQAKTAGFDAYYIDHTTALWPTAAAGVPFNACEFQ SKGDAITDLHEDLAAEQKARTTYDNLIRIIENPEVREPLKFLRAREVVHFQRFGEALEKI KSKLDEKNFYYFNPEFDKQFVQNQK 
>gi|157101633|gb|DS480691.1| GENE 13 13026 - 13322 373 98 aa, chain - ## HITS:1 COG:no KEGG:Closa_1284 NR:ns ## KEGG: Closa_1284 # Name: not_defined # Def: spore coat protein CotJB # Organism: C.saccharolyticum # Pathway: not_defined # 1 94 1 94 98 140 70.0 2e-32 MPDRCQLLQQINEISFVVNDLNLYLDTHPTDGQALDAFSQAMAQRKQLLDSFAKEYEPLT LNCVCPETNNKSESHTKYPGQKHFTWSDGPLPWDCPEH >gi|157101633|gb|DS480691.1| GENE 14 13315 - 13533 129 72 aa, chain - ## HITS:1 COG:no KEGG:Closa_1285 NR:ns ## KEGG: Closa_1285 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 24 68 14 58 60 66 71.0 4e-10 MNLCNCDAKVPASQCGCSKELPAPGPGVQLAIATVPMQPWETPYEPSKALKQGTIFPCLD LPFFIVGGEKNA >gi|157101633|gb|DS480691.1| GENE 15 13976 - 15865 2012 629 aa, chain - ## HITS:1 COG:no KEGG:mru_2083 NR:ns ## KEGG: mru_2083 # Name: not_defined # Def: phosphoenolpyruvate synthase/pyruvate phosphate dikinase (EC:2.7.9.2) # Organism: M.ruminantium # Pathway: not_defined # 16 618 275 877 880 573 45.0 1e-162 MVGWPEILKKENLTREEKKCICNRLMTTESHMEQLILKHFTEEDFRRVWMRKIGGGVIGG KACGLLVARKLIELNMPEYAGHVEPHNSFFIGTDVFYRYLVYNRCAELKARHRLEKEHFK ETEELTKRLRGGSLPEDIREELSDMLDHYGTTPIIVRSSSIMEDGYGNAFSGKYESIFCM NQGTKEERMEELEEAIRRVYASTMNEQAIEYRRKRHLLDVDEQMALLIQQVAGQQYGDLY MPVAAGMGCSYNPYKWMEHLNPEAGMLRMVMGLGTRAVERTPGDYPRLIGLDRAQANLRT TLAERHKFSQRKVDVLDFGTKSLCTKSLEKILDLFPKWQKKMVLSRDTDAEDMLAERHIY RTIYFGDCQGLVDNLEFIRMMRTLMKMLEKEYERPVDVEFAVTSPEEGMWRLNLLQCRPL QTAKSEQVHIPDGVDHEFLFDVRRTSMRRSKEEPIDYIVWVDPQKYYEYEYAKKPDVARL ISRINQHFEDTDKKLMLLVPGRIGTSSPELGVPVVYAEISQFSAICEVAYGKAGYHPDLS YGSHMFQDMVEADVYYGAINDNSKTRLYRPELLTRYPEVLKDILPGESQELADIVKIHDV SRSGATLTLDAQEGRAVCRIRGEQQTAER >gi|157101633|gb|DS480691.1| GENE 16 15934 - 16899 942 321 aa, chain - ## HITS:1 COG:FN0603 KEGG:ns NR:ns ## COG: FN0603 COG0583 # Protein_GI_number: 19703938 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 306 1 306 314 220 38.0 3e-57 
MFSGMNYVYEVYKEQSFSKAAENLYISQPALSSMIKKIETKIGMPLFDRSTSPIQLTECG KKYIKTAEKIMDLENEFAYYVGNLQELKTGRLSVGGTYLFSSFIFPPIIDKFRRAYPHVK LNLFEGHTPLLEQKLFAGELDIIIDNYLLDAGIYEKERFMEERLLLAVPSSFDSNRRAVK YQLKASDIKRCLHQNDTFPAVSLKKFKDEPFVMLRSHNDTRERVDAILGRAGIQLNYTLK LNQLLTTYHLTEYGMGASFVSDTVVKCLPENPDIIYYKIDDPEAVRDVYLYYKKNKYLTR SMIEFIKMAIPGIEKNSEKYV >gi|157101633|gb|DS480691.1| GENE 17 17072 - 18316 989 414 aa, chain + ## HITS:1 COG:Cgl0241 KEGG:ns NR:ns ## COG: Cgl0241 COG1167 # Protein_GI_number: 19551491 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Corynebacterium glutamicum # 3 411 19 426 426 390 46.0 1e-108 MLKELTIQFQEAKKLNLKLNMARGKPAAAQLDMNMGLLEFPEGCTLEDGTDARNYGVLEG IPECRRLLGQLLGLKPDQIIIGGNSSLNLMFDTMAFHCLFGTQGEKPWSFYAYEGSPVKF LCPVPGYDRHFQICEELGIQMIPVPLLEDGPDMEMVEAMVREDASIKGIWCVPLHSNPQG VCYSDRVVDRLASMETAAPDFRIFWDNAYGVHHIYQEVPLKNILDACEAGGHPNRAYYFF STSKITFPGAGISLIASGPDNMKEFKGHMSSQTIGHDKLNQLRHVQYFKTPENIRAHMAD LAEVLRPKFDMVLNRLEEELEGSGLAVWSRPKGGYFISLDTLAGCAKRTVQLAKEAGVTL TGAGATYPYKRDPQDRNIRIAPTYPTPEELKSAMDIFILCVKIAGIESVVNQGN >gi|157101633|gb|DS480691.1| GENE 18 18368 - 19630 1268 420 aa, chain + ## HITS:1 COG:BS_yweB KEGG:ns NR:ns ## COG: BS_yweB COG0334 # Protein_GI_number: 16080831 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Bacillus subtilis # 14 417 24 422 424 415 50.0 1e-116 MEKLYNPYDNVLKVVKEAADILGYTDSDIEAIKYPERELKVAIPVRMDDGTTKVFEGYRV QHSTSRGPAKGGVRFHPAVNPDEVRALAAWMTFKCAVVNIPYGGGKGGVVCDPNELSENE IRAITRRYTAAIAPLIGPEQDIPAPDVGTNAAVMGWMMDTYSMLKGHCIHGVVTGKPICL GGALGRNEATGRGVMYTTKNILNKMGIPVQGTTVAIQGMGNVGSITAKLLHREGMKIIAV SDVSGGICNPEGLNVPAILEYLSLNRKNLLKDYNEEGMSRITNEELLEMDARVLVPAALE NQINASNAHKIRAEIIVEAANGPVAADADGILQERGITVVPDILANAGGVVVSYFEWVQN IQSVSWTEEEVNEKLKDIMDPAFEAVWDIAKRQNATLRTGAYLIAVKRVVEAKAARAIWP >gi|157101633|gb|DS480691.1| 
GENE 19 19966 - 20796 623 276 aa, chain + ## HITS:1 COG:CAC2608 KEGG:ns NR:ns ## COG: CAC2608 COG2207 # Protein_GI_number: 15895866 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 4 263 12 279 284 125 27.0 7e-29 MYQEESFKTQRVVYAHTRSDEGDMEYKMHCHDVYEVYYMISGNAEYLLEGRTYTPRPGSL IIIPPGCFHGLRVLDSDEYNRIRLHFAAELLQEEEQALLLEPFGTGWCCYEEQFQLEWYF HSLEECGNYGKDLQDIAIRTRVLSLLTKIFAIYREASGTSRGKEGQVQEIIRYINQNLSA PLSLEGLAQTFYMSKNHLTAVFKRATGTTVARYILYKRMAIVRKELSMGVPAAEAASRAG FGDYSSFFRAYKKMFGCAPSDKSAPGMPEMDKGSMV >gi|157101633|gb|DS480691.1| GENE 20 20926 - 22233 1265 435 aa, chain + ## HITS:1 COG:L75975 KEGG:ns NR:ns ## COG: L75975 COG2873 # Protein_GI_number: 15672055 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Lactococcus lactis # 8 435 2 426 426 494 56.0 1e-139 MGEKRKRSDRHFKFETTQLHGGQEQPDPSTGARSVPIYQTSSYVFDNCDHAAARFNLSDP GNIYSRLTNPTLDVFEQRMAQLEGGVAALATASGAAALSYTFQNLTRAGDHIVSSGHIYG GTYNYLAHTFPEYGVETTFVDPADLEAVEAAIRDNTKAVFIETLGNPNSDIADIEKIASI AHRHQIPLVIDNTFATPYLVRPIEYGADIVVHSATKFIGGHGTAVGGVIVDGGRFDWAAS KKFPGLSEPNASYHGIRFAQDVGAAAFVTRIRAILLRDTGATLSPFHGFLFLQGLETLSL RVERHVENALKVVEYLNAHPQVEAVHHPSVSRDPIQQQLYKTYFPRGGGSIFTFEIRGGA ENARKFIDNLELFSLLANVADVKSLVIHPASTTHSQMHEKELLEQGIRPGTIRLSIGTEH IDDILEDLEEAFRSV >gi|157101633|gb|DS480691.1| GENE 21 22396 - 23646 1288 416 aa, chain + ## HITS:1 COG:BH1663 KEGG:ns NR:ns ## COG: BH1663 COG0133 # Protein_GI_number: 15614226 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Bacillus halodurans # 21 408 5 393 399 430 54.0 1e-120 MDRKYTAKEKDMDYRTYLKNYPDKDGRFGAYGGAYLTEELIPAFEEIADAYQTICHSSQF INELRRIRKEFQGRPTPVYHCERLSGKIGNCQIYLKREDLNHTGAHKLNHCMGEGLLAKF MGKKRLIAETGAGQHGVALATAAAFFGLECEIHMGEVDIAKQAPNVTRMKILGAKVVPVT HGLKTLKEAVDSAFESYARNYKDSIYCIGSALGPHPFPLMVRDFQSVVGYEAREQFLEMT GMMPDEVCACVGGGSNSIGMFIPFLDDPVDITGVEPLGRGETLGDHAASMKFGEKGIMHG 
FESIMLKDEKGEPAPVYSIASGLDYPSVGPEHAFLHDLGRVKYDTVSDEEAMEAFFKLSR YEGIIPAVESSHAVAYAMRRAKEMRQGSILVCLSGRGDKDIDYVVEHYGYGDQFMD >gi|157101633|gb|DS480691.1| GENE 22 23824 - 25020 1526 398 aa, chain - ## HITS:1 COG:TM1148 KEGG:ns NR:ns ## COG: TM1148 COG0538 # Protein_GI_number: 15643905 # Func_class: C Energy production and conversion # Function: Isocitrate dehydrogenases # Organism: Thermotoga maritima # 1 398 1 395 399 505 61.0 1e-143 MDKIKMTTPIVEMDGDEMTRILWQMIKDDLLLPYIDLKTEYYDLGLEYRDETNDQVTVDS ANATKKYGVAVKCATITPNAARVEEYHLKEMWKSPNGTIRAILDGTVFRAPIVVKGIEPC VKNWKKPITIARHAYGDVYKGSEMKIPGPGKVELVYTAGDGSESRELVHDFTCPGIVQGM HNINQSIESFARSCFNYALDTKQDLWFATKDTISKKYDHTFKDIFQEIFDSEYAEKFKAA GITYFYTLIDDAVARVMKSEGGYIWACKNYDGDVMSDMISSAFGSLAMMTSVLVSPDGNY EYEAAHGTVQRHYYKHLKGEETSTNSVATIFAWTGALRKRGELDGNNELMEFADRLEKAT IDTIENGEMTKDLALITTIPNPVVLNSGDFIKAVAKRL >gi|157101633|gb|DS480691.1| GENE 23 25166 - 26152 819 328 aa, chain - ## HITS:1 COG:BS_yulF KEGG:ns NR:ns ## COG: BS_yulF COG0673 # Protein_GI_number: 16080169 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus subtilis # 1 325 1 327 328 315 48.0 8e-86 MIRYGTIGTNFIVDRFQEAAMENPSLHYSAVYSRNQETAGNFAAKYGVETTYTDLNAFAC ADDLDAVYIASPNSFHYEQAALLLSHGKHVLCEKPITSNARELEHLIQLAGQNNVILLEA IRSIFTPGYQALAKHLPKLGTIRRASFHFCQYSSRYDKFKSGIVENAFNPIFSNGALMDI GVYCVHPLVRLFGMPEKIKTDAVLLENGIDGSGTILGVYKDMQAELVYSKISSSRIPSQI QGENGTITTEGIDHPHQLIFYDRAGGRQVLYQGPAKADMSGEIAEWLRLISAGPGSNIHN RYSLMALQVMDEARKQMGIVFPADETFA >gi|157101633|gb|DS480691.1| GENE 24 26308 - 27156 536 282 aa, chain - ## HITS:1 COG:no KEGG:Closa_1701 NR:ns ## KEGG: Closa_1701 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 59 281 63 290 291 147 38.0 4e-34 MNNNNPYSVPSLKMTQVEDPIDPGFTYPDTDVVIPSPDDEIDPDFSVMPPYYPDYPVRPD YPVRPNYPVRPNYPVRPNYPIPCVFCNNNQWIRGGIRLLNAATGYNPFTVYIDNQAVFSG LDFSEVTQYRQLSQGYHTFSIMGSNGYVYLRKSMYVGDGMATIAIVNSSTGLDLVSIRDT 
ACPTDWTSSCFRVCNLAYYSGPVNASLGNIYFNSVNFADAASFSRLSSGNYTLRVARSAR PENTLVTTPVTLNPSRIYTLYVLNWNPSADTIQTLLVEDRRS >gi|157101633|gb|DS480691.1| GENE 25 27688 - 29007 685 439 aa, chain + ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 5 432 10 439 457 159 26.0 2e-38 MNLLTGKIKPLYLKYLGAAFGSACISSVYGLVDMAMVGQYHGPSGTAAMSVVMPIFNIIY SLGLFMGIGGSVLYSSEKGRRENGCYNEFFSTALLGTAALAAIAWTGIVFFGEPLLRLFG AEQTLLPLTKAYLFPIKIVIPVFVFNQMLAAFLRNDNAPGLATAAVLGGGIFNVFGDYFF VFAMDMGILGAGLATAAGACITLLIMLTHFFSKRCTLKFVSIHYLFIKVKGILTTGFSTF FVDLAVGIMTMLFNRQIVRYLGTDALSVYGLIVNISMFVQCCAYSVGQASQPIFSINYGA EHWGRMKETLKYALGTVAFFGIFWTTLVVLVPNGFVRIFMKPTEAVLQIAPSIMRCYGIS FLLLPLNIFSTYYFQSLMKPAASFVVSVGRGLVISGALIMLLPTVAKADFIWFSMPVTEI VIAVFVIYMMVRYTKRLAK >gi|157101633|gb|DS480691.1| GENE 26 29527 - 30348 372 273 aa, chain + ## HITS:1 COG:BS_soj KEGG:ns NR:ns ## COG: BS_soj COG1192 # Protein_GI_number: 16081149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus subtilis # 4 257 3 249 253 198 44.0 9e-51 MNTQIIAIANQKGGVGKTTTCANLGIGLAQSGKKVLLIDGDPQGSLTISLGNPQPDKLPF TLSDAMGRILMDEPIRPGEGILHHPEGVDLMPADIQLSGMEVSLVNAMSRETILRQYLDT LKGQYSHILIDCQPSLGMLTVNALAAANRVIIPVQAEYLPAKGLEQLLQTINKVRRQINP KLQIDGILLTMVDSRTNFAKEISALLRETYGSKIKVFTSEIPHSVRAKEISAEGKSIYAH DPNGKVAEGYKNLTKEVLKLEKQREKNRAGLSR >gi|157101633|gb|DS480691.1| GENE 27 30305 - 31234 449 309 aa, chain + ## HITS:1 COG:BH4057 KEGG:ns NR:ns ## COG: BH4057 COG1475 # Protein_GI_number: 15616619 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 25 137 18 130 288 84 44.0 4e-16 MKSSAKKIELASVDDLFSTEESRQDEQLEKIQEIPLSELHPFKDHPFKVKDDDAMIETAD SIKKYGVLVPAIARSLPDGGYELVAGHRRRRASELAGKETMPVIVRDLDDDAATIIMVDS NLQRENLLPSERAFAYKMKLEAIKHQGARTDLTSVQVEQKLSARDQVAKEAGERSGIQVM 
RYVRLTELIPELLDMVDEKKIAFNPAYELSFLKPDEQQMLVETMDYEQATPSLSQAQRMK KFSQDGKLSEDVMLAIMSEEKKSDLDKVTLSSDTLRKYFPKSYTPAKMQETIIKLLEQWQ KKRQRDQER >gi|157101633|gb|DS480691.1| GENE 28 31246 - 31518 117 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939352|ref|ZP_02086702.1| ## NR: gi|160939352|ref|ZP_02086702.1| hypothetical protein CLOBOL_04245 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04245 [Clostridium bolteae ATCC BAA-613] # 1 90 6 95 95 179 100.0 4e-44 MKDIAARELKGHNILAVERFWDSTCWMIEFTVLRPTAYGEPGDEMRLFLTEDGYQSALQS QQRREIKIKRYAHVIEGHIIDFKPKKHRRS >gi|157101633|gb|DS480691.1| GENE 29 31803 - 32681 361 292 aa, chain + ## HITS:1 COG:no KEGG:BLJ_1240 NR:ns ## KEGG: BLJ_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 292 80 375 381 332 58.0 1e-89 MAVFRVERTGDYTVMSNFHLKDKRLSLKAKGLLSQMLSLPDDWDYTLSGLSYINRESKDA IRSAVNELETAGYIRRRQTTDASGKFAANEYTIFERPIEGEPMLDKPSSENPITVNPSAV NPLPENPTQLNTKKSNTQKQNTHGSNTDSIPFRETAAAIPPERKGRDAMSVTEIENYREL ILENIEYDCLKQRYPLYLDDLNEIVELLVETVCAKRKTTRISGADFPHEIVRSRFLKLDS SHIEFVMDCLQKNTTQVRNIKQYLLAVLFNAPTTMNNHYTSLVNHDMHAGGW >gi|157101633|gb|DS480691.1| GENE 30 32718 - 33464 330 248 aa, chain + ## HITS:1 COG:lin2418_1 KEGG:ns NR:ns ## COG: lin2418_1 COG3617 # Protein_GI_number: 16801480 # Func_class: K Transcription # Function: Prophage antirepressor # Organism: Listeria innocua # 1 131 1 128 128 109 39.0 6e-24 MNQIEIFNSPEFGSIRIVEENGKYLFCGADVAKSLGYKDTVNALKTHCREDGVAFYHLTD NLGREQKAKFISEGNLYRLIVHSKLPSAERFEQWVFDEVLPTIRKHGAYLTKEKLWEVAT SPEALLKLCSDLLAEREENVSLRIANAQLEGKAAFYDLFIDLEHSTNLRTTAKELDVPER RFVRFLLEKRFVYRTASGNVLPYAKPANEGLFCVKDYCNHGHTGSYTLITPQGKLYFAEL RDSILMVI >gi|157101633|gb|DS480691.1| GENE 31 33554 - 34054 427 166 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1891 NR:ns ## KEGG: Ethha_1891 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 166 1 161 163 195 67.0 7e-49 MQEEVENRTLTLVVSGTKFTGRLLKAAISKYMAHCKEKKLQKKRSRDAPVTPHGKQTVKQ 
LIGQNQGISNIEITDPSIKEFEKIARKYGVDYAVKKDRSSSPPKYLIFFKGRDADALTAA FTEYTSKKVKKAEKTERPSVLAKLSQFKEMVKNAVVDRTKRKELER >gi|157101633|gb|DS480691.1| GENE 32 34051 - 34557 353 168 aa, chain + ## HITS:1 COG:no KEGG:Closa_3720 NR:ns ## KEGG: Closa_3720 # Name: not_defined # Def: TRAG family protein # Organism: C.saccharolyticum # Pathway: Bacterial secretion system [PATH:csh03070] # 7 162 5 160 585 249 71.0 3e-65 MKKQFDTKKLVLLNLPYLLMGLFATNFGEAWRLAQGANASEKFLSLFAVLPGALQSFWPS LHPLDLLVGLCCGAGLRLAVYLKSKNAKKYRHGMEYGSARWGTREDIAPYIDPVFQNNVI LTKTESLTMNSRPKDPKTARNKNVLVIGGSGSGKTRFWLKPSAPVRAV >gi|157101633|gb|DS480691.1| GENE 33 35117 - 36934 974 605 aa, chain + ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 29 602 256 826 834 261 31.0 4e-69 MRNPIHVLKSLEEKASVSNNKYERLYRNLYNPEFYLLAYANIAKSQGSMTQGVDGQTLDN MSLPRINRIIESIRNRTYQPKPAKRKYIPKKNGKLRPLGITSTDDKLVQEVVRMILEAIY EPTFSNNSHGFRPKRSCHTALTQVKKNFTGVTWIVEGDIKACFDNFDHHVLVELLRKRIS DEAFIGLIWKFLKAGYMEQWQYNCTYSGVPQGSGISPICANIYLSELDNYMQEYKEKYDC EPERRRTTREYERASRRYRKARKALMGAEKSTPELVKEFKDSRRKKMDQHYYNPFEEGFK KIQYNRYADDFVIGVIGSKKDAEKIKEDVKIFLQEKLHLEMSEEKTKVTHSSKPVRYLGY DFKVIHSKNMKRCKNGDMKRVWYGKVFLYMPKEKWIKKAMERGAIQVKRNNDTGKEMWRP MPRKDLMNRSDAEIVSTFNSEIRGLYNFYRIAENVGALHKYYYMVRYSMLKTLAGKHRTN VSVIKKRHMVNGVLRIPYDTTKGRKYCEFYHDGFRKHSDGYDNVADVMPSYRKYDSRHTI VNRIKAGVCEICGEHADYLCMHHVRTLKSLKGRDIFEQKMLKIRRKSLALCPDCFELLHE TKESR >gi|157101633|gb|DS480691.1| GENE 34 37043 - 37591 474 182 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 2 146 184 307 591 71 31.0 7e-13 MQMHSSYVVTDPKGTILVECGKMLQRGAPKLGKDGKPMKDKHGKVIYEPYRIKVLNTINF 
RKSMHYNPFAYIHSEKDILKLVTTLIANTKGEGKAGDDFWVKAETLLYCALIGYIHYEAP VEEQNFSTLIEFINAMEVREDDEEFKNPVDLMFDALEAEKPNHFAVRQYKKYKLAAGDIC SK >gi|157101633|gb|DS480691.1| GENE 35 37664 - 39544 977 626 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 4 297 6 301 301 284 51.0 3e-76 MRNEKITPLYERLSRDDELQGESNSISNQKQMLEDFARRNGLPNPTHFTDDGISGTRFDR PGFLAMMEEVEAGRVEAIVIKDMSRLGRDYLKVGQVMEVLRQRGVRLIAINDGVDSLKGD DDFTPFRNIMNEFYARDTSRKIRSVFKAKGMSGKHLTGTVIYGYLWDEKREHWLVDEEAA EVVRRIFSLTLEGYGPYQIACKLSADRIEIPVVHLARFNEGVNRSKPVKDPYGWGSSTIV NILKKREYLGHTINFKTRKHFKDKKSHYVSEDEWTIFENTHEAIIDQQTFDLVQKIRSNV RRYPNGWGEAAPLTGLLYCADCGGKMYVHRTNNGKRISQYTCSNYTKVPCGTLCLTQHRI NESAVLTLVSDTLRAIAEYSRNDRTEFIHTVQETQVAQQSADISKKRRHLATAQKRAGEL EKLICKIYEDNALGKLPDTRYKALDAQYAKEQDALEIEIAELEKAVTGYEQSQKSAEKFI ALIDKYENFDTLTNTMLNEFVEKILVHERSRKGSQDTTQEIEIYFNFLGRYIPPSLQPVP LTPEEQEELRKREERKDRLHQNYLKRKASGAQKRYEDKIKAKKKAEMDAKKALIRAEDMK KGVFSTIGQLPKEEPRKGSIAASAAV >gi|157101633|gb|DS480691.1| GENE 36 39571 - 39924 334 117 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1745 NR:ns ## KEGG: CDR20291_1745 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 117 1 117 117 212 94.0 3e-54 MSKLTYIRCGDYDIPNLKLSEQPETSIGKYGRMRKSYLKEHRPILYNQLLMSEKLYPHLL EIDWTAQERVDTMLPHMMEAAGVTEELKACDPMRWVGLMNTLTAQIEEILIRELICS >gi|157101633|gb|DS480691.1| GENE 37 39941 - 40240 231 99 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1746 NR:ns ## KEGG: CDR20291_1746 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 98 4 101 102 185 96.0 5e-46 MLTFEKVLKVFQAYLDDDPLYEVVQTSHGYTLMAWEPHRNDWYSAEIQKTPEDLRNALLD TYANFLEDKITGNDRDLTVTETGEIQQRCRELWEKCRET >gi|157101633|gb|DS480691.1| GENE 38 40268 - 40636 371 122 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1747 NR:ns ## KEGG: 
CDR20291_1747 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 122 1 122 122 209 99.0 2e-53 MAKVEDCPGFETFGADVKAAREAKRLARKTLAEMVGIEWRYLANIENQGAIPSLPVMIQL IKVCGLPVERYFNPEIMREESEQRQRVSHKLKLCPEEYLPIIEGAIDGALKMEQTAKQKE DA >gi|157101633|gb|DS480691.1| GENE 39 40962 - 41654 427 230 aa, chain + ## HITS:1 COG:CAC0321 KEGG:ns NR:ns ## COG: CAC0321 COG0745 # Protein_GI_number: 15893613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 226 1 228 230 171 38.0 8e-43 MNKILIIDDDRELCALIKRSVQSEHIEADFCNTGKEGLQKLKEQEYQLVVLDVMMPGMDG FETLEEIRKENSLPILMFTSKNDSISKVRGLRAGADDYLTKPFDMDELIARIASLIRRYT RFNHQAGAVQKLDFDGLQIDLENRSVTTENGTFELPPKEFDLLLYCAKHQGKILTKQQIY EEVWGEEYFYDDSNIMAIISRLRKKLEVNPSSPKYIQTVKGIGYRFNKEV >gi|157101633|gb|DS480691.1| GENE 40 41660 - 42577 698 305 aa, chain + ## HITS:1 COG:BH0819 KEGG:ns NR:ns ## COG: BH0819 COG0642 # Protein_GI_number: 15613382 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 28 299 28 302 309 128 29.0 1e-29 MEIIVFLSIVIAVVAVLTSIVLVRRVKKQIAEMTDVLVDVKNGNGNRRILSATNELTAPL AYEINEIVVSYESRLSTVRQTEETNRQLMTSLSHDVRTPLTTLIGYLDAAHKGLVTGKDR DDYIETARRKAHDLKEYIDVLFDWFKLNSNEFALEIQSVEAAELTRNILIDWIPIFEDKQ VDYDIDIPEQPVRVRLDMDSYMRIVNNLIQNVIAHSHADKIKIVLSKKENNMELLLADNG VGIEKDDLKHIFERLYKCDKGRSEKGSGLGLSIVHQLVEKMGGSITVESFPGEGTEFMLL FPLEI >gi|157101633|gb|DS480691.1| GENE 41 42713 - 43639 277 308 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 10 301 8 301 311 111 30 3e-23 MDTNYIIETKNLTKQYGSQKSVADLNIHVKRGRIYGLLGRNGAGKTTTMKMLLGLTKPTS GEVKIWGKPLQGNEKKLLPRIGSLIESPGFYPNLTGTENLRIFATLRGVPNNHAIKDALD LVGLPYKDKKLFSQYSLGMKQRLAIALAVMHDPELLILDEPINGLDPIGIAEVRSFIREL 
CDARGKTILISSHILSEISLLADDIGIIDHGALLEEESLAELEQKSSKHIRFTLSDTAQA ARILERNFHENHFSIQDDHNLRLHNLDLPVGKIVTAFVENGLEVSEAHTCEESLEDYFKR VTGGEGIA >gi|157101633|gb|DS480691.1| GENE 42 43680 - 44426 460 248 aa, chain + ## HITS:1 COG:no KEGG:CKR_3399 NR:ns ## KEGG: CKR_3399 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 248 20 264 264 131 36.0 3e-29 MVFISLLLSVFMPLAYAFFLADVNTDVDAVNGVMSSLFQLSAYLLLMPLLVILASNLLFE ELDNDTLKNLVTIPINRTKLVLSKMLVLLLFAVGFMAVGGLVNLAVLLFQGWEPVGFWTL FGVGIEEGLIMWVGALPCILLVVLLNKNYIVSVVITFFYTTANYILSMSDAFLTQPFGLN IGTLFPGPLAFRWTFQFYDQSQTSAELADLLERISPYFLNGVQVFGVIVGEAVVFLALIA LVYRRQEI >gi|157101633|gb|DS480691.1| GENE 43 44439 - 45254 341 271 aa, chain + ## HITS:1 COG:no KEGG:Closa_3153 NR:ns ## KEGG: Closa_3153 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 241 2 241 269 153 34.0 8e-36 MWNLLKAELLKLRRCQILWVGLVALALCPLVQYGSQLIVEAEYRNPNYDFSTLFENVVWG NTQMFFPISLVMIGSWLIDRESTHDTLKNIMTIPVSMPKMLGAKLFWVGIFAVLLGIYSV GVTLITGLTVGLSGLTVEVFFHGGTQIVLAALTTYLVCMPLILIFGQIRGAYLGGSILAF FLGYSMLFFKGGILASIYPFSAALILVGFDMSGYAGTTTAPNPLLAVIGVGIMVLWAVLL LLMSSNKKEIKSRKQANSKGKGKRAVRRKGR >gi|157101633|gb|DS480691.1| GENE 44 45260 - 45721 245 153 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1753 NR:ns ## KEGG: CDR20291_1753 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 153 1 153 153 228 79.0 6e-59 MKKIVLSLCLILLGASVLSGCTKDEVLDQYNNIVQSAGSIALTGNSSLQGTKEKGIDDYT GSYTANYEDFSDTEYLFGGTSIKREAGKDLSIDCALEVTDGTAKVFWISGADEEVTLLET TGTYSDTITLPEGGNYIGIECENFTGSIELNME >gi|157101633|gb|DS480691.1| GENE 45 46175 - 46588 376 137 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1754 NR:ns ## KEGG: CDR20291_1754 # Name: not_defined # Def: rna polymerase, sigma-24 subunit, ecf subfamily (ecf subfamily rna polymerase sigma-70 factor) # Organism: C.difficile_R20291 # Pathway: not_defined # 1 137 1 137 137 251 95.0 8e-66 
MEVTINYNGQAVAVEVTLEVYEFLDRADHKTENLFHEQRRHWDGREFDEYIITTEGVGAY GETPEEYLCRMETLHELMAVLDTCTEAQRRRFLLYALDGLSLAEIGVLCGCSKVAVYQSV EAVRKKFINFFESRLNE >gi|157101633|gb|DS480691.1| GENE 46 46692 - 47144 262 150 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1755 NR:ns ## KEGG: CDR20291_1755 # Name: not_defined # Def: sigma-24 (FecI) # Organism: C.difficile_R20291 # Pathway: not_defined # 1 150 1 150 150 261 98.0 7e-69 MKTINLRWMYPHYRHDEFVDVTDEVWAAMYQAQREMENYERRKVYHRAYYSLEAYSWLEN YALEHSRSPEDILLEREEMTTRLHLIAALPVALAHATPTQARRVHAYYIAGIKQPEISRR EGVHSSKVSVSIRRGLRNMRRCYDDLFQTE >gi|157101633|gb|DS480691.1| GENE 47 47155 - 47616 116 153 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1756 NR:ns ## KEGG: CDR20291_1756 # Name: not_defined # Def: rna polymerase, sigma-24 subunit, ecf subfamily # Organism: C.difficile_R20291 # Pathway: not_defined # 1 153 13 165 165 275 98.0 5e-73 MQTINLKQYYPFCKEDIFVEVSDEIVEAFLLDKRAEAARERKMFRYKAFYSLDCNDGIEN AAIGWAQPSPEDHLIEKEELAEYEELIRRLYEAISSLPPMQARRVHARYMLGMKVKDIAA MEGITPSQAGKSIHAALRRLRRYFARQKWTVNL >gi|157101633|gb|DS480691.1| GENE 48 47645 - 47935 225 96 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1757 NR:ns ## KEGG: CDR20291_1757 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 96 4 99 99 137 90.0 1e-31 MNEKITAQPPKEERQEVLKEIRQLENRKKILENKQRNEERRVRTRRLIERGAILEGIFPL ASSLSGAEVKAFLIALSYLPGAAELTANLPKSGDMP >gi|157101633|gb|DS480691.1| GENE 49 48242 - 49567 680 441 aa, chain + ## HITS:1 COG:mll0964 KEGG:ns NR:ns ## COG: mll0964 COG0507 # Protein_GI_number: 13471082 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Mesorhizobium loti # 3 221 40 239 1015 141 39.0 3e-33 MIHDYTRKGGIVHAEIMLPAHAPPEFADRSILWNSVEQIEKARDSQLAREIEAALPRELS GEQQLALVRAYVKDNFVDKGMCADFAIHDKGTGNPHVHIMLTLRPLKENGQWGAKCRKAY DLDENGQRIPDGKGGWKNHREDTTDWNDKGNVEIWRAAWAAYTNRALESAGRPERIDHRS 
YKRQGIDKIPSVHLGPAASQMEKRGIRTDKGEVNRQIVADNKLLKEIKARITRLYRWSKA EAEKPQTQQSSLTALWEAQQQLNAPRTRTGKIRALQESAALFSFLQVNGIQSMQQLHEKI ADMNSCYYDLRGKIVKAERRIAILTERGEMWEQYNQYKSIHKQLAKVKPEKREQFEQRHS RELILYDAAARYLKELKDSGEGITPKAWQREIDQLAAGKQTDTLAMKSMREELKAVERLR KTAEQLSRQERDKSHDRGPER >gi|157101633|gb|DS480691.1| GENE 50 49675 - 49965 105 96 aa, chain - ## HITS:1 COG:SP0276 KEGG:ns NR:ns ## COG: SP0276 COG3041 # Protein_GI_number: 15900210 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 8 94 4 92 92 94 52.0 5e-20 MKETKYTVKYTTSFKKDYKRAIKRGLKIELLEQVVALLAMGEPLPDKNRDHDLSGDWAGH RECHILPDWLLVYRIEDDVLVLTLARTGTHSDLFGK >gi|157101633|gb|DS480691.1| GENE 51 49962 - 50240 355 92 aa, chain - ## HITS:1 COG:SP0275 KEGG:ns NR:ns ## COG: SP0275 COG3077 # Protein_GI_number: 15900209 # Func_class: L Replication, recombination and repair # Function: DNA-damage-inducible protein J # Organism: Streptococcus pneumoniae TIGR4 # 7 59 5 57 87 58 50.0 4e-09 MAGNTTNISIRMDADLKAQADALFTELGMNLTTAFNIFVRQSLREGGIPFEVRLEQPNKE TVAAMLEAERIAKDPSVKGYNDLDELFADLKR >gi|157101633|gb|DS480691.1| GENE 52 50367 - 50765 240 132 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1761 NR:ns ## KEGG: CDR20291_1761 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 132 1 132 132 233 94.0 2e-60 MKNNKTEPIPVMDYRQYRRARKLVHECCNYIAGNCIALDDGEECICVQSISYSLLCRWFR AAVLPQDKELETALFHRLNAKKCAVCGALFTPGSNRAKYCPECAARMKRINAAKRKRKQR EKCHALGAEKPL >gi|157101633|gb|DS480691.1| GENE 53 50879 - 51628 471 249 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1762 NR:ns ## KEGG: CDR20291_1762 # Name: not_defined # Def: phage protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 249 1 244 244 386 93.0 1e-106 MSDNRKYYYLKLKENYFDDDSIVLLESMQDGVLYSNILLKLYLKSLKHGGRLQLDEDIPY TAQMIATITRQQIGTVERALQIFLKLGLVEVLDSGTFYMSNIELLIGQSSTEAERKRAAR LQNKALSALRTSGGHLSDIRPPEIEIELEKEIEIKREIEKVRPETGHPSHTYGRYQNVFL 
TDEELADLQVSFPAVWEQYIEKLSEYMASTGKRYQSHAATIRRWAGEDAKKTVTPSRNRD YSVKEDETV >gi|157101633|gb|DS480691.1| GENE 54 51625 - 52488 631 287 aa, chain + ## HITS:1 COG:CAC1933 KEGG:ns NR:ns ## COG: CAC1933 COG1484 # Protein_GI_number: 15895206 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 36 271 27 270 282 139 32.0 9e-33 MIEITAEIRAFIDKAAAGVELAEDEYIDPSDGLIHCKKCGGQRQTVVPCFGKSGYFMPHC ICQCQREAEEQRKAAEERQRRMERIKRRKAQGLQDRYLYDYTFSNDNRQNPLMDKAHAYV ENWKEAYKSNIGLLLFGDVGTGKSFFAGCIANALLDQDVPVLMTNFPTILNRLTGMFSED RSEFIASFDEYDLLIIDDLGVERSTEYAMEQMFFVIDSRYRSRRPMIITTNLKLSELKNP PDLAHARIYDRILERCAPILFDGKNFREENAGVTRQAAKDIVNSKHD >gi|157101633|gb|DS480691.1| GENE 55 52553 - 52726 192 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210612983|ref|ZP_03289548.1| ## NR: gi|210612983|ref|ZP_03289548.1| hypothetical protein CLONEX_01750 [Clostridium nexile DSM 1787] hypothetical protein CLONEX_01750 [Clostridium nexile DSM 1787] # 1 57 25 81 81 85 92.0 1e-15 MQMADAAAAARAPENTPAMVKKIGKTTYKVHVHFSNTSTETMSDKIKRMLKNEIQQM >gi|157101633|gb|DS480691.1| GENE 56 52804 - 53169 302 121 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1764 NR:ns ## KEGG: CDR20291_1764 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 121 14 134 134 235 95.0 3e-61 MLYTEKEKHEIERVKEVFAEHLRQSPDFELLWSDKVGYVWLTIGVNPVYVDTGIRIESAA DLCGKCLDDVATDVLYMTGNDHALEAADPLELAEIKRLWEPYINQLPDYAYLCKDLLNGK M >gi|157101633|gb|DS480691.1| GENE 57 53407 - 54078 431 223 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 2 180 375 556 591 90 32.0 2e-18 MSDTDDSFNFLISMCYTQLFNLLCEKADDVYGGRLPVHVRCLIDEAANIGQIPRLEKLVA TIRSREISACLVLQAQSQLKALYKDNADTIIGNMDTSIFLGGKEPTTLKELAAVLGKETI 
DTYNTGESRGRETSHSLNYQKLGKELMSQDELATMDGNKCILQLRGVRPFLSDKYDITKH PNFKYTADADDKNAFDIEAFLSARLKLKPNEVCDVYEVDTKGA >gi|157101633|gb|DS480691.1| GENE 58 54297 - 54437 59 46 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3592 NR:ns ## KEGG: EUBREC_3592 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 45 3 47 48 71 82.0 9e-12 MAFFEQAITVLQTLVIALGAGLGIWGVINLLEGYGNDNPGANAHVR >gi|157101633|gb|DS480691.1| GENE 59 54601 - 54906 160 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225375430|ref|ZP_03752651.1| ## NR: gi|225375430|ref|ZP_03752651.1| hypothetical protein ROSEINA2194_01055 [Roseburia inulinivorans DSM 16841] putative desmoglein-4 [Roseburia intestinalis L1-82] hypothetical protein ROSEINA2194_01055 [Roseburia inulinivorans DSM 16841] putative desmoglein-4 [Roseburia intestinalis L1-82] hypothetical protein [Ruminococcus obeum A2-162] # 1 101 1 101 101 202 100.0 9e-51 MNLRKSILALSVVAGIVAAPMTVFAAHTHSWGSPQYYGYEDEMPGIYDDWDKCATRHVYN YKQCLICGEVSIYEVETIEMSHKWVNGACVYCNKGYAKEIN >gi|157101633|gb|DS480691.1| GENE 60 55021 - 55509 167 162 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939391|ref|ZP_02086741.1| ## NR: gi|160939391|ref|ZP_02086741.1| hypothetical protein CLOBOL_04284 [Clostridium bolteae ATCC BAA-613] hypothetical protein ROSEINA2194_01056 [Roseburia inulinivorans DSM 16841] membrane-bound lytic murein transglycosylase A-like protein [Roseburia intestinalis L1-82] membrane-bound lytic murein transglycosylase A-like protein [Clostridium sp. M62/1] hypothetical protein CLOBOL_04284 [Clostridium bolteae ATCC BAA-613] hypothetical protein ROSEINA2194_01056 [Roseburia inulinivorans DSM 16841] membrane-bound lytic murein transglycosylase A-like protein [Roseburia intestinalis L1-82] membrane-bound lytic murein transglycosylase A-like protein [Clostridium sp. 
M62/1] hypothetical protein CK3_12760 [butyrate-producing bacterium SS3/4] hypothetical protein [Ruminococcus obeum A2-162] # 1 162 1 162 162 283 100.0 3e-75 MQKKQLSVLVLIILCFLLGCNRETPKDESQQNYEDESNISKTNGEELVNINSELLGSTPF DIQIDKTALQTRKETETEVTIKTNLTDWGYTVSSEKGKISDINKNSFIYIAPKDERDDTI KIQLSDYENGISYEYTIPLIFAGNNEHSLDKFKENLSRLPNS >gi|157101633|gb|DS480691.1| GENE 61 56137 - 56532 356 131 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3584 NR:ns ## KEGG: EUBREC_3584 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 17 131 1 115 115 211 96.0 7e-54 MKRYNTPHRSRVVKTRMTEEEYAEFAQRLSAYHMSQAEFIRQAITGAAIRPIITVSPIND ELLAAVGKLTAEYGRIGGNLNQIARTLNEWHSPYPQLAGEVRAAVSDLAALKFEVLQKVG DAVGNIQTYQL >gi|157101633|gb|DS480691.1| GENE 62 56504 - 58129 1042 541 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3583 NR:ns ## KEGG: EUBREC_3583 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 541 3 543 543 959 95.0 0 MATFKHISSKNADYGAAEAYLTFEHDEFTMKATLDENGRLIPREDYRISSLNCGGEDFAV ACMRANLRYEKNQKREDVKSHHYIISFDPRDGTDNGLTVDRAQELGEQFCKEHFPGHQAL ICTHPDGHNHSGNIHVHIVINSLRIYEVPLLPYMDRPADTREGCKHRCTNAAMEYFKSEV MEMCHREGLYQIDLLNGSKERITEREYWAAKKGQLALDKENAAREAAGQPTKPTKFETDK AKLRRTIRQALSQAGSFDEFSSLLLREGVTVKESRGRLSYLTPDRTKPITARKLGDDFDK AAVLALLTQNAHRAAEQTKAIPEYPAAVKKPLQGEKAAKTTPADNTLQRMVDREAKRAEG KGVGYDRWAAKHNLKQMAATVTAYQQYGFSSPEELDEACSAAYAAMQESLAWLKQVEKTL NGKKELQRQVLAYSKTRPVRDGLKQQKNAKAKAAYRQKHESDFIIADAAARYFRENGISK LPSYKSLQAEIESLIKEKNSGYNDYRAKWEEYRRLQTVKGNIDQILRRSEPQRRKEQSHE R >gi|157101633|gb|DS480691.1| GENE 63 58119 - 58505 187 128 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3582 NR:ns ## KEGG: EUBREC_3582 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 128 1 128 128 235 99.0 6e-61 MNGNIPRMDYRQYRAARRLVHECCNYDSGNCLLLEDGEPCVCVQSISYSLLCRWFTAAVL PLDEALEAALLRRGSRKRCAVCGAFFFPKSNRGKYCPDCAGRMKRINAAKRKRKQREKCH ALGHFKPA >gi|157101633|gb|DS480691.1| GENE 64 58615 - 59376 372 253 aa, chain + ## HITS:1 COG:no 
KEGG:EUBREC_3581 NR:ns ## KEGG: EUBREC_3581 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 8 253 1 248 248 446 92.0 1e-124 MNPIADYMTAGTELPAYLPYPRFLLETDLSHTARELYALLLDRSTLSQKNGWQDSEGRTY IVYPIAEIAEMLDKGCTTIKGALNELDAAGLLERRRTGFSAANRLYVKVPPIPVVQFSDQ LTDGKPPLIRAGNRPTDSRKTDLMTVGKPSPNQTNINNLIESQTKGASEGQPPAAYGRYK NVFLSDTELLELEQDFPGKWEYYLDRLSCHIASTGKQYQSHAATIYKWAQEDAAKEKPKK GIPDYSFKEGESL >gi|157101633|gb|DS480691.1| GENE 65 59373 - 60227 604 284 aa, chain + ## HITS:1 COG:CAC1933 KEGG:ns NR:ns ## COG: CAC1933 COG1484 # Protein_GI_number: 15895206 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 34 277 20 281 282 122 29.0 9e-28 MTDTIHNTILPMTDTTAEPEDYTGEDGLLYCGKCRKPKEAYFPEGKTFFGRDRHPSECDC QRAARKEREAAEKRRSHLETVERLKRQGFTDKTMQDWTFANDNGSCPQMKNAAGYVARWE QIKDGNYGLLLWGRVGTGKSYFAGCIANALMEQEVPVRMTNFAAILNDLAASFAGRNEYI SRLCSFPLLIIDDFGMERGTEYGLEQVYNVIDSRYRSRKPLIVTTNLTLEELQHPEDTAH ARIYDRLLEMCSPLCFTGENLRKAAAQGKMEQLKRLLAGKEICL >gi|157101633|gb|DS480691.1| GENE 66 60224 - 60418 174 64 aa, chain + ## HITS:1 COG:no KEGG:Tresu_1913 NR:ns ## KEGG: Tresu_1913 # Name: not_defined # Def: hypothetical protein # Organism: T.succinifaciens # Pathway: not_defined # 5 55 6 56 62 67 72.0 1e-10 MTDTQRNKRPARRPDCVTETRIGNTILVVSGFFKEGATDTAADKMMKVLEAEAAAGYLTC DKPD >gi|157101633|gb|DS480691.1| GENE 67 60796 - 61146 305 116 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1745 NR:ns ## KEGG: CDR20291_1745 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 2 114 3 115 117 119 53.0 6e-26 MELTYTKCGDYLIPNLVLSDTKEYHIGKYGRLRRAYLKEHRPILYTDLIVTEKLFPHLEE IDTACRERLEIIEKAMMQQEGFTEALKSADQMAWVRSMNSIHNRAEEIVLAELVYC >gi|157101633|gb|DS480691.1| GENE 68 61160 - 61435 262 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939402|ref|ZP_02086752.1| ## NR: gi|160939402|ref|ZP_02086752.1| hypothetical protein CLOBOL_04295 [Clostridium bolteae ATCC BAA-613] 
hypothetical protein CLOBOL_04295 [Clostridium bolteae ATCC BAA-613] # 1 91 1 91 91 149 100.0 7e-35 MILEAMYNGEFYPCETVVPTSPEYRKAIQTCAALMEQLSQRLSKEDYALVEELRAQNAIA QCEESESHFKYGFSAGLIVQQEAHEQLQNKK >gi|157101633|gb|DS480691.1| GENE 69 61499 - 62527 1076 342 aa, chain - ## HITS:1 COG:STM3755 KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 7 334 6 334 345 266 43.0 4e-71 MSEEQGLTPYETREILFYKTENGEVRVEILLFQENLWLTQAKMAELFEVQKAAISKHLKN IFESGELSEDSVVSKMETTAADGKRYQTNYYNLDAIIAVGYRVNSKKATMFRIWANRVLK EFIIKGYVMDDARLREPENFFGKDYFEEQLERIRDIRASERRFYQKITDIYSQCSADYDV ESPITKEFFATVQNKLHYAVTHHTAAEIVYGRADSTKPNMGLTTWKNAPKGRIRKSDVTV AKNYLNETEMRNLNEIVTMYLDYAERQARRGNVMYMADWVKRLDAFLQFNEEDILHDKGK VNAAIAKAFAEKEFEKFRVLQDRTYQSDFDRLVAETSDDLTE >gi|157101633|gb|DS480691.1| GENE 70 62524 - 63441 681 305 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_0776 NR:ns ## KEGG: CDR20291_0776 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 10 300 16 305 306 132 28.0 2e-29 MKNLFEQSRSHWVRYDHYELKTAEDGKRYITPGKSAKPDVYNPLKEVPNIVLDALNVGML MMGRKPEAEVEKAIMEFITRYGLLGLMTALPTTPSFMDYEAVYLPKNHFIKEESMATDKY LSLFYPFDQLDLVKKGIESTWNVSGDRTMIALTMTFMDEPMAKNMSFQREYAEPYEWVAQ QFKDWAFTLTTAFFYYNDYDFMGEDERGLHRKAMAAFGGIAPSYHIELLDKPTIYWDFHS LLLGIQMMFSFMLVDSDQPLRLCKHCQKVFLGNRSNAAFCSPRCKNQYNVYKSRGKNRTD GGDEE >gi|157101633|gb|DS480691.1| GENE 71 63476 - 64312 685 278 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939405|ref|ZP_02086755.1| ## NR: gi|160939405|ref|ZP_02086755.1| hypothetical protein CLOBOL_04298 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04298 [Clostridium bolteae ATCC BAA-613] # 1 263 1 263 278 411 100.0 1e-113 MAYLSDDELLDTEGGFTGWDAGAEETLEQALAEERLAELESELEQDEHPINYEAELEDEE EEAAPVSDKRLKREMRAEAIRRLEEAARTEKDFRAVVEEWNKLDRNRERRERDHENLRGD VPLEYQAVPEPKLIPRWMNNPAYRQLMAGNFLDILFDCPYEMHNLTADAFVSRMVEELSE 
EHKEVLYFLSLRLYSTTRLAAVRGQSDRNIRKLRKTIHKKLQRQMYDHMCGKQEHGGSLT LRERQFLEEYSKIARKQGKDAVIRRENKSKRRKKKNRP >gi|157101633|gb|DS480691.1| GENE 72 64575 - 64778 122 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939406|ref|ZP_02086756.1| ## NR: gi|160939406|ref|ZP_02086756.1| hypothetical protein CLOBOL_04299 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04299 [Clostridium bolteae ATCC BAA-613] # 1 67 1 67 67 125 100.0 8e-28 MADKNAVFLTRHIGKTTYKVRVYLSETAEETMEDKILRLIRNDGLANQPECGIMELPQMS RPSERSA >gi|157101633|gb|DS480691.1| GENE 73 64780 - 66699 1500 639 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 5 305 3 301 301 275 45.0 3e-73 MANRQTEEKITALYERLSRDDDLTGDSNSILNQKRYLESYAVQRGYTNIVHYTDDGWSGG NFDRPAWKRLVADIEAGKVAHLLCKDLSRIGRNYLQTGFYTEVMFRQNGVHFVAVANNID SEEQDSGEFAPFLNIMNEWYLRDQSKKVSAAYRVKGKAGKPTTNNAIYGYKKDPEDKDHW LVDEEAAAVVRRIFRLAVEGHGPHEIAKILTQEKVECPAYYLARNGRGCRKNTVDTSRPY DWYGFTVNSMLTKPEYMGHTVNFRSSKKSYRDKRVKNDPSDWLIFENTHEAIVDPETWQL AQQVKRTVRRTDTTGVANPLTGLVFCADCGAKMYNHRGKRKKDRREYGTDFYNCSTYTLT FERETQMCFSHTVSTKALNALILETIRTTASYAIQNKEEFIQKVRSISQVRQQEAAKELK RKVAKERRRSAELDVLIKKLYETYAMGKLEEKRFELLCAEYEKEQAELEQLLASEQAQLD QFHEDTDRASHFLALAQKYTDFTELTAPMLHEFVEKILVHAPDRSTGERVQEIEIYLNFI GKFEVPMPGPTEEELAAEEKRRQKRIRDHEKYLRQKERKQKIAEGLIVPGEPYQLVCQCC GEPFQSVRPNAKFCKPACREKFYRQEKRKAKETETSQTA >gi|157101633|gb|DS480691.1| GENE 74 66767 - 67387 509 206 aa, chain + ## HITS:1 COG:RP859 KEGG:ns NR:ns ## COG: RP859 COG0358 # Protein_GI_number: 15604689 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Rickettsia prowazekii # 3 89 30 120 616 72 39.0 4e-13 MTLFELVKQNICVPDAAKHYGLQVNRNGMCSCPFHEDQHPSMKLNERYFYCFGCGATGDV IDFVARLFGLNSYEAAQKLAQDFGIDPDKPPAAIALPKPERPLLKAYRQEEVRCLQVLCD 
YLHLLESWKVQYAPKTPEDVLDDRFVEACQMLDYVEYLADLLIAAELEHRVKIVEMLNKD GLIAGLEERLDRLKKEDGAYEKQNAA >gi|157101633|gb|DS480691.1| GENE 75 67365 - 68744 1037 459 aa, chain + ## HITS:1 COG:VNG0215C_2 KEGG:ns NR:ns ## COG: VNG0215C_2 COG3378 # Protein_GI_number: 15789518 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Halobacterium sp. NRC-1 # 94 400 1 315 365 85 27.0 2e-16 MKNKTQLDGMPVWFDGKSINEALFCEEFLQTHKIIFTNGAFFTPEGRVTDELPLRGEIFE ELKCCAVSNIPRKISNIVELMKLAALVEDFPPEPDRIHLSNGTLFLDGTFAKGKPKIVRN RFPVSYKPNAPKPVLWLQFLDGLLYPEDIPTLQEYIGYCLIPSNKGQRMMVIKGSGGEGK SQIGAVLGTLFGFNMKDGSIGKISENRFARADLEHILLCVDDDMRMEALRQTNYVKSIVT AQGKMDLERKGKQSYQGWMCARLLAFSNGDLQALFDRSDGFYRRQLVLTTKEKPAGRVDD PDLAEKMKAEVEGILLWAFEGLQRLAANNFKFTESQRTKDNREAVKRDNNNVYDFLDSDG YVRRKADLSASSKELYEAYQIYCTENNLPALKPRSFSEALIACQSRYNLEYCNNVTNAAG RRVRGFLGIEVLVRNHISVFSGDSMRTYVPEDVPEEWRR >gi|157101633|gb|DS480691.1| GENE 76 69137 - 69346 364 69 aa, chain + ## HITS:1 COG:no KEGG:ELI_1141 NR:ns ## KEGG: ELI_1141 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 3 69 5 71 73 93 76.0 3e-18 MTVRNEINAQIVRAGYTMQEVVDRLHEEYGWSDSVSNLSAKLQRESIRYKEVVELADVLG YDLIWQKRR >gi|157101633|gb|DS480691.1| GENE 77 69351 - 69557 225 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939412|ref|ZP_02086762.1| ## NR: gi|160939412|ref|ZP_02086762.1| hypothetical protein CLOBOL_04305 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04305 [Clostridium bolteae ATCC BAA-613] # 1 68 1 68 68 80 100.0 3e-14 MTSEEKKLLQAKHRLEEAQARDQVKERKARTRRLIQEGAVLEKVLPEVQAVGLDNLEEYL RRKLAGHD >gi|157101633|gb|DS480691.1| GENE 78 69986 - 71680 1103 564 aa, chain + ## HITS:1 COG:AGpT237 KEGG:ns NR:ns ## COG: AGpT237 COG0507 # Protein_GI_number: 16119945 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 305 18 283 1117 128 33.0 
4e-29 MAIYHLEAKIVSRGAGRSAVAAAAYLSCSRMLNEYDGVQHDYTRKQGLGWRQVFLPATAP VKWQDREILWNAVEETETAKDSRLAREFVAALPIELSREKQIQLLQDFIKEQFVADGMCA DAAIHDPYPPGHNPHAHILLTVRPLDEKGKWQYKTEKEYLCVKDGEERGFTAAEFKQAQA DGWEKQYQYKVGKKKVYMAPSAAQAQGYERVSKYPKSTKYGRQNPITERWNSDEQLVLWR AAWADVANHYLERTGHEERIDHRSHAERGLLERPTVHEGVVARAMEKKGIISDRCELNRQ IKADNALLRELRGQVKKLAQAVKSTLPALAEAMENLRKNLLLFCYQLGYLRKGKERLNTS LNTLRPALAQYNQLAKDIRDKTKERRSLLSEKKALSAVHVFRHRELAAKIAALTEDLEEL RSEKNLLLASLAYTEEDAADKFPKDIVAMEQSLERLEEQEQKYSAELDAALNEYAGLREQ AQSVDPVQLYEARQAIRPGKEQEAESRAQQVYGEKYNPLLMFDSKKAVSRMLHEDMERQA VRRMMRQAQKEQQTLQKKKSEQER >gi|157101633|gb|DS480691.1| GENE 79 71710 - 72657 442 315 aa, chain - ## HITS:1 COG:lin0071 KEGG:ns NR:ns ## COG: lin0071 COG0582 # Protein_GI_number: 16799149 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Listeria innocua # 40 297 110 382 400 76 26.0 5e-14 MTNNTQFKTLLNTWLNQKKPMITPSTHASFTLIAENHLIPHFGKRKIGSITEPDIQSYIS YLYESGRLDKSGGLTVKTIRDVILVLRLAMEYAYKERAIPLLNWDLIEYPKELGIKKVVS LSKDQEQALIQCIYMNLNRKTAGILIALFTGVRIGELCGLQMKDISLTDKTISINKTVQR IYDKKKGESYLHIGPPKTKTSARTIPVPSLLMNIIKKFYTENPNHYFLTGKTKPTEPRTY RQFFSRFLKRNGLQKVKFHEIRHTFAVRAIEIPEFDVKSLSEILGHKNVSFTLNVYGSAN LQQKVKCMNLLNDLL >gi|157101633|gb|DS480691.1| GENE 80 73510 - 73641 75 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKSIEFEKTTSYLLVQGHLWVKAIATSVVTVIFIMLAFLFAFM >gi|157101633|gb|DS480691.1| GENE 81 73903 - 74427 -37 174 aa, chain - ## HITS:1 COG:no KEGG:Rvan_3142 NR:ns ## KEGG: Rvan_3142 # Name: not_defined # Def: restriction modification system DNA specificity domain protein # Organism: R.vannielii # Pathway: not_defined # 1 140 10 149 367 96 39.0 4e-19 MGDLISLQSGQDFAPAEYNDQEIGVPYMTGASCIVKGETVVSRWTQIPRCYAYKGDTLLV CKGSGSGAVVRLTQEKAHIARQFMSLRANEKMTSDFCYYLTGFLSDRIKRNATGLIEGID RGTVLNQTVFLPPLHEQKKIARFFSKLDFTITAHENMLDTLINERTGLMQRLFI >gi|157101633|gb|DS480691.1| GENE 82 75470 - 75850 125 126 aa, chain + ## HITS:1 COG:no KEGG:BF1842 NR:ns ## KEGG: BF1842 # Name: not_defined # Def: putative type IC 
restriction-modification system specificity subunit, partial # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 4 125 75 196 376 96 41.0 4e-19 MVGDVLIPTSGETSEEISTASCVMLPGVILAGDLNIFRSTKIDGRIMSYILNHIVNGNIA RVAQGKSVVHVQASEISKIKISYPDPETQIRIIKILEAISNRIESCENELNHLTKMRSSL LQQLFI >gi|157101633|gb|DS480691.1| GENE 83 75843 - 75941 87 32 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939419|ref|ZP_02086769.1| ## NR: gi|160939419|ref|ZP_02086769.1| hypothetical protein CLOBOL_04312 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04312 [Clostridium bolteae ATCC BAA-613] # 1 32 399 430 430 63 100.0 4e-09 MLYLLDLRITFEEKTVNALTNTRTALLQQLFI >gi|157101633|gb|DS480691.1| GENE 84 77138 - 78706 939 522 aa, chain - ## HITS:1 COG:SA0391 KEGG:ns NR:ns ## COG: SA0391 COG0286 # Protein_GI_number: 15926109 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Staphylococcus aureus N315 # 6 513 8 514 518 561 54.0 1e-160 MDNSIQAHQKELCNKLWAMANALRGNMEAYEFKNYILGMIFYYYLSDRTEKYMTNLLKDD NISYEDAWTDEEYKTAVVEEALRDLGFIIEPQFLFRKMVKMVENRSFDIEFLQKAINSLM ESTLGNDSQEDFDGLFSDMQLDSTKLGHTVKDRSAVMAKIIASLDEINFSVEDTKIDVLG NAYEYLIGQFAATAGKKAGEFYTPSGPAELLCRLACLGLTDVKDAADPTCGSGSLLLRLK SYANVRNYYGQELTSTTYNLARMNMILRGIPYRNFNIYNGDTLEHDYFGDMKFRVQVANP PYSAKWSGDLSFMEDPRFNEYGKLAPKSKADFAFVQHMVHHMDEDGRAVVLLPHGVLFRG AAEEVIRKHLIQKLNVLDAVIGLPANLFFGTGIPVCVLVLKRERNDNADNILFIDASGDF EAGKNQNILRECDIDKIVETYERREDVDKYAHVATMQEIAENGFNLNIPRYVDTFEPEEE IDLNQVAAEIRQLQNEIKGIDAELKPFFDELGLDFPFEVEGE >gi|157101633|gb|DS480691.1| GENE 85 78719 - 81484 1427 921 aa, chain - ## HITS:1 COG:SA0189 KEGG:ns NR:ns ## COG: SA0189 COG0610 # Protein_GI_number: 15925899 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Staphylococcus aureus N315 # 1 920 1 927 929 838 50.0 0 MAYQSEAALEQQFIDQLNKQGYTSVSIPDYDALVENFKAQFETFNSAKLDRPLTDKEWER 
VMNIMLGKSVFQSAKILRDKFVLEREDGTKVYLSFFDADHTKNIFQVTNQTTVVGKYVNR YDVTILVNGLPLIQVELKRRGIDIREAVNQVMRYKKHSYNGLYHFIQLFVVSNGVDTKYF ANSDRDMLHSLAFFWTDFNNVRITNLKEFSISFLARDHIIKMLTRYTILNDTDKLLMVMR PYQVYAVEALVRQATLTNRNAYVWHTTGAGKTLTSFKTAQILAANPNIKKVIFLVDRKDL DSQTTEEFNKFENGSVDATDRTDVLVKQMQDKNRQLIVTTMQKMANAVKRPQYSKIMDTY KNEKVVFIIDECHRSQFGDMHKDIVRHFQKAQFFGFTGTPRFEVNGKTEGKITQTTEMLF GECVHNYLIKDAIFDNNVLGFHIEYIKTMEGDFDWDDPTMADAINVGELYMSEERMSLIA NHIVQNHKAKTRNGQYTAIFAVASIEALIKYYDIFKKIKHDLNISGIFSYGQNEEAEGKD EHSRDALERIIKDYNEKYSTNFSTDTFAAYHKDISDRVKGKKTKPLDILLVVNMFLTGFD SKQLSVLYVDKDLKYHDLLQAYSRTNRVEKETKPFGIIICYRNLKKRTDDALTLFSKSHD TSGIVVPEYPYFVEKFNEMVTRLKQLAQTPADIDTMQSEDDQKLFVETFRELTKYLQSLQ TFIEFSFDQDSLIMTEQEYQDYKSKYLMLYAKHKKDREVVSVLNDVDFCIELMESDRINV AYIMNLIRNIHFDDAKQKDYDIKHIKEELGRTDNPQLLRKVEILQAFLDRVVVGLESADE IDAAYNDFENEAKREEIVAFAQTEEIDPAMLTDFISEYEFSGTMDAGNIRDRIEKPMPLL KKRSLVNRIVDFIRQHTEKYQ >gi|157101633|gb|DS480691.1| GENE 86 81563 - 81778 91 71 aa, chain - ## HITS:1 COG:SPy0544 KEGG:ns NR:ns ## COG: SPy0544 COG3655 # Protein_GI_number: 15674643 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 1 68 1 68 69 70 57.0 7e-13 MKVSYKKLWKLLIDKDMKKRDLERCAGISHYTINKLNHGENVTTDILGKICKALGCTMND IMEFIDDDSAQ >gi|157101633|gb|DS480691.1| GENE 87 82067 - 83485 1133 472 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 1 232 68 301 301 255 55.0 2e-67 MLADIEAGKVGTVIVKDMSRLGRNYLQVGMYTEMIFPQKGVRFIAINDGVDSAQGENDFA PLRNIFNEWLVRDTSKKIKAVKRSKGMSGKPITSKPVYGYLMDEDENFIIDEEAAPVVRQ IYSLCLAGNGPTKIARMLTEQQIPTPGTLEYRRTGSTRRYHPGYECKWATNTVVHLLENR EYTGCLVNFKTEKPSYKLKHSIENPPEKQAVFENHHEPIIDRETWERVQELRKQRKRPNR YDEVGLFSGILFCADCGSVMYQQRYQTDKRKQDCYICGSYKKRTADCTAHFIRTDLLTAG VLSNLRKVTSYAAKHEARFMKLLIEQNEDGDRRRNAAKKKELEAAEKRIAELSAIFKRLY 
EDSVTGRISDERFTELSADYEAEQKELKERAARLREELSKAQEATANAEKFMNVVRRHTT IEELTPTLLREFVEKIVVHESVALDGKRRGKLRRQEIEIYYSFVGKVELPDT >gi|157101633|gb|DS480691.1| GENE 88 83739 - 84410 649 223 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1894 NR:ns ## KEGG: Ethha_1894 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 204 1 204 289 272 69.0 7e-72 MDFLLEALTNWLKEMLVGGIMSNLSGMFDSVNQQVADISVQVGQTPQGWNGSIFSMIENL SNSIMVPIAGVILAIVMTVDLIQMIADKNNLHDVDTWMIFKWVFKSAAAILIVTNTWNIV MGVFDMAQSVVAQAAGIIGSDASIDISSVMTDLEPRLMEMDLGPLFGLWFQSLFIGITMW ALYICIFIVIYGRMIEIYRASRSAAFHPKAVRGHFGNPALASR >gi|157101633|gb|DS480691.1| GENE 89 84986 - 86818 979 610 aa, chain + ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 29 609 256 827 834 305 35.0 2e-82 MRSPENVLESLKSKACNKSYKYGRLYRNLYNPQFYLLTYQRIQAKPGNMTAGTDGKTIDG MGMARINALIEKMRDFSYQPNPARRTYIPKSNGKMRPLGIPSFDDKLIQEVVRLILESIY EPTFSDYSHGFRINKSCHTALKYVQKYFTGTKWFVEGDIKGCFDNVDHHVLIDILRKRIA DEHFIGLLWKFLKAGYMEDWNYHNTYSGTPQGSIISPILANIYMNELDSYMAEYAEKFNC GNRRKINPAFKKKLDVCRGKEQRLKRNLSKMSEEEKEDLIAEIRELRRSLKSMPYSDQMD DSYKRICYVRYADDFLIGVIGSKEDAEQVKQEVGCFIREKLHLEMSGEKTLITHGHDFAK FLGYEVTITKGEYSKKTKTGATRRVNNGKVLLYVPHDKWLKRLLSYNALKIKYDKQNGNK EVWEPVRRIRLLHLDDLEILNQYNAEIRGLYNYYRLANNVSVLNNFYYVMRYSMLKTFAG KYRTRISRIIRKYRQGKDFAVEYPKKNGKVGKVLFYNNGFRRNTKVESGNPDIVARVVEN YGRNSLIKRLQANKCEWCGAENVPLEVHHVRKLKDLSGRKQWEIAMIGRRRKTMALCIDC HDKLHAGKLD >gi|157101633|gb|DS480691.1| GENE 90 86894 - 87196 179 100 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1894 NR:ns ## KEGG: Ethha_1894 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 4 100 193 289 289 135 72.0 4e-31 MARRRVLTLLVTSVAPVPMAAMMGKEWGGMGQNYLRSLLALGFQAFLIIVCVAIYAVLVQ NIALEDDIIMAIWSCVGYTVLLCFTLFKTGSLAKSVFQAH >gi|157101633|gb|DS480691.1| GENE 91 87211 - 87756 23 181 aa, chain + ## HITS:1 
COG:all7280 KEGG:ns NR:ns ## COG: all7280 COG4725 # Protein_GI_number: 17233296 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Transcriptional activator, adenine-specific DNA methyltransferase # Organism: Nostoc sp. PCC 7120 # 3 175 9 192 210 117 36.0 2e-26 MKKYQIIYADPPWSYKVYSKKGLGRSAESHYPTMRIEDICALPVGDLADKDCALFLWVTI PCLLEGLSVLRAWGFTYKTIGFVWVKQNKKSDSLFWGMGYWTRSNVELCILATKGHPKRV NAAVHQVIVSHIEEHSKKPQEARERIVSLMGDLPRIELFARQSTPGWDVWGNEVDSSISF P >gi|157101633|gb|DS480691.1| GENE 92 87771 - 88226 271 151 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1896 NR:ns ## KEGG: Ethha_1896 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 124 1 126 152 169 70.0 5e-41 MAYVPVPKDLTKVKTKVMFNLTRRQLVCFTAGALVGVPLFFLLREPAGNSMAAMCMMLVM MPFFLLAMYEKHGQGLEKIVGNILKVAVIRPKQRPYQTNNFYAVLKRQEMLDKEVYDIVH RNEKLAAPSAGKTGRKDRAAGKDKEKAVPRR >gi|157101633|gb|DS480691.1| GENE 93 88123 - 89157 921 344 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1897 NR:ns ## KEGG: Ethha_1897 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 30 339 1 310 794 499 76.0 1e-140 MFTAMKSWLRRLLGKPEEKTVQPVKTKKKLSRADKKQIEAAIARANRTDKKGKSAQDSIP YERMWPDGICRVSDSHYTKTIQFQDINYQLSQNEDKTAIFEGWCDFLNYFDSSIHFQLSF LNLAASEETFANSISIPPQRDAFDSIREEYTTMLQNQLARGNNGLIKTKYLTFGIDADSI KAAKPRLERIETDILNNFKRLGVAARTLDGKERLSQLHAVFHMDEQLPFQFEWDWLAPSG LSTKDFIAPSSFEFRTGKQFRMGKKYGAVSFLQILAPELNDRLLADFLDMESSLIVSMHI QSVDQVKAIKTVKRKITDLDRSKIEEQKKAVRAGYDMDISATRS >gi|157101633|gb|DS480691.1| GENE 94 89880 - 91853 234 657 aa, chain + ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 12 364 25 339 470 183 34.0 8e-46 MPNEKKPKTKYVEYLRHAEYYDMQSTFDELYARSQAGEIFDGLMDVILSRENILLAYRNI KSNTGSFTPGTDKLKISDIGKLTADEVTARVRRIVKGGKNGYTPRSVRRKEIPKPNGSTR 
PLGIPCIWDRLVQQCIKQVMEPICEARFSNNSYGFRPNRSVENAIAAIYRLMQRSGLYYV VEFDIKGFFDNVDHSKLIKQLWSLNIRDKELLYVIRRILKAPILMPDGHIEHPAKGTPQG GIISPLLANVVLNELDHWIESQWQCNPVTENYSYRENATGCPIQSHAYRAMRNTRLKEMY IVRYADDFRILCRTKEQADRTLIAVTHWLKERLRLDVSPEKTRVVDTRRSYSEFLGFKIR LHKKGKKYVVQSHMCDKAYRKVKANLTKQVGNIKFPRKDRGEAGEVRLFNSMVMGIQNYY QLATDISIDCGDIGRTVNIVLKNRLKSGKTHRLKEEGRDLTKMELQRYGKSEQLRYIAQS KEPIYPISYVQCTNPMNLRRKVCAYTATGRSAIHDDLRINTSLLLQLMRAPTYRRSAEYA DNRISLFSAQWGKCAITGKEFQCVSEIHCHHKTPKGNGGSDKYENLVLVLAPVHELIHAV NEDTICSYLSALKLDASQLMKLNRLRILANRKPIDLENLNLTNNSHNGMTKETKKTV >gi|157101633|gb|DS480691.1| GENE 95 92019 - 93386 1061 455 aa, chain + ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 74 420 462 830 853 95 25.0 1e-19 MFLVSFLVLNTADNPRQLGNNIFQAGSIAQKYNCQLTRMDFQQEEGLMSCLPLGLNQIEI QRGLTTSSTAIFVPFTTQELFQNGKEALYYGINALSNNLIMVDRKLLKNPNGLILGTPGS GKSFSAKREIANCFLLTSDDVIICDPEAEYAPLVERLHGQVIKISPTSTNYINPMDLNLD YSDDESPLSLKSDFILSLCELIVGGKDGLQPVQKTIIDRCVRLVYQTYLNDPRPENMPIL EDLYNLLRSQEEKEAQYIATALEIYVTGSLNVFNHQSNVDINNRIVCYDIKELGKQLKKI GMLVVQDQVWNRVTINRAAHKSTRYYIDEMHLLLKEEQTAAYTVEIWKRFRKWGGIPTGI TQNVKDLLSSREVENIFENSDFVYMLNQAGGDRQILAKQLGISTHQLSYVTHSGEGEGLL FYGSTILPFVDHFPKNTELYRIMTTKPQELKKEDE >gi|157101633|gb|DS480691.1| GENE 96 93389 - 93763 290 124 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939432|ref|ZP_02086782.1| ## NR: gi|160939432|ref|ZP_02086782.1| hypothetical protein CLOBOL_04325 [Clostridium bolteae ATCC BAA-613] hypothetical protein HOLDEFILI_02406 [Holdemania filiformis DSM 12042] conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] putative malonyl CoA-acyl carrier protein transacylase [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF0866_01339 [Ruminococcaceae bacterium D16] hypothetical protein CLOBOL_04325 [Clostridium bolteae ATCC BAA-613] hypothetical 
protein HOLDEFILI_02406 [Holdemania filiformis DSM 12042] conserved hypothetical protein [Clostridiales bacterium 1_7_47FAA] putative malonyl CoA-acyl carrier protein transacylase [Clostridium hathewayi DSM 13479] hypothetical protein HMPREF0866_01339 [Ruminococcaceae bacterium D16] # 1 124 1 124 124 241 100.0 2e-62 MKQNICELDTMIFFREALEAHEFMLLPVMASAVVECRTADKELKTLNEDGEIGLARLFSI WANMMCAPGAATIVGCRPITMLSEILAQVHAYLTVHPLYDPEGLALYVELHHMMDAILMG DWFE >gi|157101633|gb|DS480691.1| GENE 97 93810 - 95465 808 551 aa, chain + ## HITS:1 COG:no KEGG:CD1108 NR:ns ## KEGG: CD1108 # Name: not_defined # Def: putative DNA-repair protein # Organism: C.difficile # Pathway: not_defined # 1 551 84 646 646 615 58.0 1e-174 MEKPIRKAEKAAARADKAQANIPKKKVRQTVIDPDTGKKTSKLTFEDKKKPPSKVSQGVK EAPVQLIAGKLHKEIRETEQDNVGVESAHKSEEAVETGAYLVREGYRSHKLKPCRKAAQA EQKLEKANVNALYQKSLRENPQFTSNPLSRWQQKQSIKKQYAAAKRAGQTAGNTAQAASK TGKAARTVKEKAQQAGSFVMRHKKGFLVAGVLFLLICMLMNTMSSCSMMAQSIGSVISGT TYPSDDPEMLAVEADYADREAQLQEKIDNIESSHSGYDEYRYNLDMIGHDPHELAAFLSA VLQGYTRHSAQAELDRVFDAQYQLTLREEIQIRTYTDEDGDEHEYEYRILHVTLTSRSIA SLAPELLTPEQMEMYQVYRQTMGNKPLLFGGGSPDTGVSEDLTGVEFINGTRPGNPQLVE LAKSQVGNVGGQPYWSWYGFNSRVEWCACFVSWCYGQSGRTEPRFAGCQSQGVPWFQSHG QWGARGYENIAPGDAIFFDWDLDGSADHVGIVVGTDGSRVYTVEGNSGDACKIKSYDLNY QSIKGYGLMNW >gi|157101633|gb|DS480691.1| GENE 98 95495 - 95746 407 83 aa, chain + ## HITS:1 COG:no KEGG:CD1107A NR:ns ## KEGG: CD1107A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 67 1 67 85 76 70.0 3e-13 MAKNKIERIDQEITQVREKIAEYQEKLKALEVQKTEAENLEIVQMVRALRMTPAQLSAML SGGTVPGSLADDNNEQEENSYEE >gi|157101633|gb|DS480691.1| GENE 99 95736 - 96488 710 250 aa, chain + ## HITS:1 COG:no KEGG:CD1107 NR:ns ## KEGG: CD1107 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 228 3 229 244 228 60.0 2e-58 MRNKRLLRTLSALCLTLVLASGFTVPAFAQGAAPPPSEDTTNDSNVVVEEPEKAPPLTPD GNAALVDDFGGNKQLITVTTKAGNYFYILIDRANEDKKTAVHFLNQVDEADLMALMEDGN 
AKEEPPAVCSCTKKCEAGAVNTACPVCVTDKSKCTGKAPEPPAETPEPEKEKPAGLNPAA LILLLALLGGGGVFAYLKLVKNKPKTKGNDSLDDYDYGEEDSEEWETEDEEVLEDGFEGS DPIEESDRAD >gi|157101633|gb|DS480691.1| GENE 100 96636 - 97343 367 235 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939437|ref|ZP_02086787.1| ## NR: gi|160939437|ref|ZP_02086787.1| hypothetical protein CLOBOL_04330 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04330 [Clostridium bolteae ATCC BAA-613] # 1 235 1 235 235 429 100.0 1e-119 MNTWNWLLHFLIIFDLVAFGIYFGGDILFTAISKVQKKSKKDDDLDLYDFEIETAPPKRS LCVDSIRMLYSGEIDEFVEKELLPDAAETMKALSEAEKTAVSLLLKAYIGFLSEEATVDE QSFPMVKELLSYTQGSKEDGEKDAIDCLMEDMVSRTHRHREYYNNYQRYQLIQVDKERVI MACNIIINDLIGRLYRYDYRFGYDITLASEHSIAKKLSDDWQDEWEVEEYEAGDC >gi|157101633|gb|DS480691.1| GENE 101 97324 - 99423 1414 699 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 3 634 5 656 709 459 39.0 1e-129 MKLVIAEKPSVAQSLAAVIGATVRKDGYLEGNGWRVSWCVGHLAGLADADSYDPKYAKWR YDDLPILPEHWQMVVGKDKKKQFDILKKLMNAPDVTEVVNACDAGREGELIFRSVYELAG CKKPMKRLWISSMEDSAIREGFANLRLGADYDGLRDAALCRAKADWLVGINATRLFSVLY HRTLNIGRVMSPTLALIVQREAEIDTFKPVPFYTVALELPGLTVSGERMADKAAAEQLKE ACQGAAVTIKKVECKEKSEKPPALYDLTTLQRDANRLLGFTAQQTLDYLQSLYEKKLCTY PRTDSRYLTGDMADSLPVLVNLVANAMPFRKGIAITCDPQTVINDKKVTDHHAVIPTRNL KDADLSALPAGEKAVLELVALRLLCAVAQPHIYSETVVIAACAGGEFIAKGKTVKHPGWK ALEDAYRAKMKDTESEKEGAEKALPELTEGQTLSVAAAIVKEGKSSPPQHFTEDTLLSAM ETAGKEDMPEDAERKGLGTPATRAGILEKLVSAGFLERKKSRKTVQLLPSHDAVSLITVL PEQLQSPLLTAEWEYRLGEIERGQLAPEEFLDGISTMLKDLVGTYQVIKGTEYLFAPPRE VVGKCPRCGGEVAELQKGFFCQNDSCKFAIWKNNKWWTAKKKQPTKAVVSALLNDGRVRV TGLYSEKTGKTYDATVILEDNGQYANFKLEFDQRKGGSR >gi|157101633|gb|DS480691.1| GENE 102 99420 - 99698 261 92 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1904 NR:ns ## KEGG: Ethha_1904 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 88 1 93 100 69 50.0 
3e-11 MKLSLVERETILLYNQAEPMAEIYTHDPRLMEKLELLAKKHSDQITRKDAHNFTVPKRCV SVREPYSAERRKAASERAKAAGYQPPVRKSGS >gi|157101633|gb|DS480691.1| GENE 103 99705 - 103106 2393 1133 aa, chain + ## HITS:1 COG:RP859 KEGG:ns NR:ns ## COG: RP859 COG0358 # Protein_GI_number: 15604689 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Rickettsia prowazekii # 4 86 39 116 616 67 40.0 1e-10 MAENVFEAVKQSVSTREAAAFYGIEVKRNGMACCPFHDDKNPSMKVDQRFHCFGCGADGD VIDFTARLFDLSPKEAAEKLAQDFGLIYDSQAPPRRRYVRQKTEAQKFREDRRRCYRVLS DYYYLLKKWEIDNSPRTPEEEPHPRFVEAIQKKTYVEYLLDLFLYESEEEQKAWIAEHTA EITHLERRLKIMAENKPTNRERLREITDGIEQGIKELFESEKYMRYLSVMSRFHRYSVNN TMLIYMQKPDATLVAGYNKWKDQFERHVKKGEHGITIIAPTPYKKKIEEQKLDPDTKAPI LDKDGKIVTEEKEIEIPMFRPVKVFDVSQTDGKPLPELASSLSGNVPNYEAFMEALRRSA PVPITFEAMAADTDGYFSADHQKIAIRQGMSEVQTVSATVHEIAHSKLHNQKKIQIANDE QYQEIELFDKPGLFSNGRIVRDNLPEGVYCYDLRGSDYDPGEPIYVENRVGVNHAGAVIL AEPLELPKEGYLRLTEEEGLNFVGGFSTLAQFLQEQRKDRHTEEVEAESISYAVCKYFGI ETGENSFGYIASWSQGKELKELRASLETINKTSGTLISDIERHYKEICKERGIDPHAKAE PETAPIEQPADAAQVSDTEPTGNLTYYVAECMEFPNLGEYHDNLSLEEAVRIYQEIPAER MNGIKGIGFELKDGSDYEGPFPILTGQTIDLDTIQAIDYYRDNPLVQKAAKELAAAMPEM EVLGADANQQEALFLIDDATYLHIQPCDSGWDYTLYDAASMKELDGGQLDAPELSRMKAV LQICNDNDLDSTSLRHAPLSIVETLQEAAYQQMQAEASQMTASSQLPEAQEQALDEYPMP DEQVSTPDMLKYGYSYDGMLPVTRERALELDAAGLTVYVLHEDNTESMVFDPQEIMEHGG LCGVDREEWEKSPQFHEKVMERQEHQQEREQAFLSQNRNCFAIYQVSCDDPQNMRFMNLD WLKSHDISIDRSNYDLIYTAPLSESGTVPEQLEKLYQQFNLEKPVDYHSPSMSVSDIVAI KQDGKVSCHYCDSVGFTQIPGFLPQNPLKNAEMAVEDDYGMIDGIINNGTKEPTVAELEQ QARSGQPISLMDLADAVHREEREKKKSVVDQLKSQPKAEHKKTAPKKSAEREI >gi|157101633|gb|DS480691.1| GENE 104 103108 - 103323 301 71 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1906 NR:ns ## KEGG: Ethha_1906 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 4 71 2 69 72 67 58.0 2e-10 MGNFTFEEMNLMCIYNTGSRTGLINSLREMRGELSPEETELRELTDSALTKLCAMTDEDF AQLELYPDFDQ >gi|157101633|gb|DS480691.1| GENE 105 103402 - 110757 5275 2451 aa, chain + ## HITS:1 
COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1414 2433 1 1027 1315 535 34.0 1e-151 MPTKAELYAQMADKVATQLTGSWQEWAGFLTTASRLYKYPFHEQLMIYAQRPDATACAEY DLWNEKMGRYVRRGSKGIALVDDSGDRPRLRYVFDISDTGTREHSRTPWLWQLEERHLDS VQAMLERTYDVSGDDLAGQLTEAAGKLAEEYWTEHQQDFFYIVDGSFLEEYDEYNIGVQF KAAATVSITYALMSRCGLEPERYFDHEDFMAIFDFNTPSTIGALGTAVSQINQQVLRQIG VTVRNAEREANQERSKQDEQSHDLYPERRLSDSRPEAEPAAGETPGQVRKDEENLPEGTP SHPLQPDATEREAVPAPSGDRRDRPEQTGADDTPTGEGSGSHRGTESQRSHEVGGADEHL QSPGRGNPDGGAYQQLTLNLFLSEAEQIQSIDEAENVAHTSSAFSFAQNDIDHVLRLGGN TDRQRERVVAAFEKQKTTAEIAEILKKLYHGGNGLGSVSAWYAEDGIHLSHGKSVRYDRS AQVISWESAAERIGELLESGQFASNVELAEAAGYERSLLSEKLWHLYHDFSEEAREAGYL SCLSEIKGNGFPEETRRLTEQLSAPAFRQTLKEEYAAFWTAYQQDRDLLRFHYHRPREIW ENLKDLDLPRRTFSSDLSQVPTVQRFITEDEIDAAITGGSSFAGGKGRIYAFFMENHTDK EKVRFLKDEYGIGGRSHALSGATHSGEDHDGKGLHYKKQGCPDVHLNWEKVAKRITSLVQ KGRYLTEQEQAQYDKIQAEKELAEEDAIQAQQPEVEEETPKPTLREQFEQYKPVVTASIS EDAAYRNACGHSDRENAVIEGNAAVRRAVLGSKDMELIRLYSDIPEFRQRLHREVIDETY PKLHELLRPLSQEDIDTALCAWNGNIESKHAVVRYMKDHAREKDTAAWLAQEYGGNNSLF VVRAGSPEETQLPWPKVQRRIAQLIQEDRFYTEEEQDRFDNIDPIAIREALEERGSVNGQ VADPEKQENDQKVMSEVEQEANEETEHKPEYTASDREYAREHLIPGETIFEMDGHSFLVD WINYDAGTVSFQDITFTLAPGFPLFQMEPISVVRQWLEQEPEPPVPVTEPEKIFEEVLDE HPVSIQVNGQWQTFPNAKAAEEASYEEYKANLRRNAQNFRITDEHLGEGGPKAKFQANIN AIRLLKELEVAGQQASPEQQEVLSRYVGWGGLADAFDPEKPAWASEYAQLKELLTPEEYV AARSSTLNAHYTSPTVIQAIYEAVGRMGFETGNILEPSMGVGNFFGMLPEEMRNSRLYGV ELDPVSGRIAKQLYPKADITVGGFETTDRRDFFDLAIGNVPFGQYQVNDKAYNKLNFSIH NYFFAKALDQVRPGGVVAFVTSRYTMDAKDSTVRRYLAQRAELLGAIRLPNDAFKKNAGA EVVSDIIFLQKRDRPLDIVPEWTQTGQTEDGFAINRYFIDHPEMVLGRQEPVSTAHGMDY TVNPIEGLELSDQLHDAVKYIHGTYQEAELPELGEGEAIDTSIPADPNVKNYSYAIVDGQ VYYRENSRMVRPDLNATAEARVKGLVGLRDCVQELIDLQMDAAVPDSAIQEKQAELNRLY DSFSAKYGLINDRANRLAYADDSSYYLLCALEVIDEDGKLERKADMFTKRTIKPHQAVAV VDTASEALAVSISEKACVDMGYMSQLTGKTKEELAGELQGVIFRVPGQLEQDGSPHYVTA 
DEYLSGNVRRKLRQAQRAAQQDPVYAVNVEALTAAQPKDLDASEIEVRLGATWIDKEYIQ QFMYETFRTPYYLQRNIEVKYSSFTAEWQITGKTSVPYSDVAANTTYGTSRANAYKILED SLNLRDVRIYDTIEDADGKERRVLNAKETTLAAQKQQAIREAFKDWIWRDPERRQTLVRQ YNEEMNSTRPREYDGSHITFGGMNPEITLREHQLNAIAHVLYGGNTLLAHEVGAGKTFEM VASAMEAKRLGLCQKSLFVVPNHLTEQWASEFLRLYPSANILVTTKKDFETHNRKKFCAR IATGNYDAIIMGHSQFERIPISRERQERLLYEQIDEITEGIAEVQASGGERFTVKQLERT RKSLEARLEKLQAEGRKDDVVTFEQLGVDRLFVDEAHNYKNLFLYTKMRNVAGLSTSDAQ KSSDMFAKCRYMDEITGNRGVIFATGTPVSNSMTELYTMQRYLQYERLQELNMTHFDCWA SRFGETVTALELAPEGTGYRARTRFSKFFNLPELMNLFKEVADIKTADQLNLPTPEVEYH NIVAQPTEHQQEMVKALSERASLVHSGTVDPSQDNMLKITSDGRKLGLDQRIVNQMLPDE PGTKVNQCVENIMQIWRDGEADKLTQLVFCDISTPQAKAPASKAAKTLDNPLLHALESAV PLPEQEPVFTVYDDIRQKLIAQGMPADQIAFIHEANTEVRKKELFSKVRTGQVRVLMGST AKMGAGTNVQDRLVALHDLDCPWRPGDVGRILRTFKIKKNVEVTDNGKDNF >gi|157101633|gb|DS480691.1| GENE 106 110738 - 111112 239 124 aa, chain + ## HITS:1 COG:no KEGG:SSUBM407_0953 NR:ns ## KEGG: SSUBM407_0953 # Name: not_defined # Def: hypothetical protein # Organism: S.suis_BM407 # Pathway: not_defined # 1 124 1 124 124 236 97.0 2e-61 MAKTIFEEMGGKYERQGDYLIPCLTVSAEEEQPIGIWGQRHLDYLKHHCKVTYTNLLTSG RLNAYLADIDRQAQEGFERLIEGMKQAQGITEQLKAENALEWTGYLNNIRACAREIVEKE IIFA >gi|157101633|gb|DS480691.1| GENE 107 111479 - 113398 177 639 aa, chain + ## HITS:1 COG:CAC1448 KEGG:ns NR:ns ## COG: CAC1448 COG0480 # Protein_GI_number: 15894727 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 2 638 3 647 652 454 40.0 1e-127 MKIINLGILAHVDAGKTTLTESLLYTSGAIAELGSVDEGTTRTDTMNLERQRGITIQTAV TSFQWEDVKVNIIDTPGHMDFLAEVYRSLSVLDGAVLLVSAKDGIQAQTRILFHALQIMK IPTIFFINKIDQEGIDLPMVYREMKAKLSSEIIVKQKVGQHPHINVTDNDDMEQWDAVIM GNDELLEKYMSGKPFKMSELEQEENRRFQNGTLFPVYHGSAKNNLGIRQLIEVIASKFYS STPEGQSELCGQVFKIEYSEKRRRFVYVRIYSGTLHLRDVIRISEKEKIKITEMCVPTNG ELYSSDTACSGDIVILPNDVLQLNSILGNEMLLPQRKFIENPLPMLQTTIAVKKSEQREI LLGALTEISDGDPLLKYYVDTTTHEIILSFLGNVQMEVICAILEEKYHVEAEIKEPTVIY 
MERPLRKAEYTIHIEVPPNPFWASVGLSIEPLPIGSGVQYESRVSLGYLNQSFQNAVMEG VLYGCEQGLYGWKVTDCKICFEYGLYYSPVSTPADFRLLSPIVLEQALKKAGTELLEPYL HFEIYAPQEYLSRAYHDAPRYCADIVSTQIKNDEVILKGEIPARCIQEYRNDLTYFTNGQ GVCLTELKGYQPAIGKFICQPRRPNSRIDKVRHMFHKLA >gi|157101633|gb|DS480691.1| GENE 108 113791 - 113919 77 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFFFRAVIHLLIRSKIYQYIAYQRTAGNQLKKFLPMQRYFLP >gi|157101633|gb|DS480691.1| GENE 109 114113 - 114622 330 169 aa, chain + ## HITS:1 COG:no KEGG:Clocel_3955 NR:ns ## KEGG: Clocel_3955 # Name: not_defined # Def: sigma-70 region 4 domain-containing protein # Organism: C.cellulovorans # Pathway: not_defined # 1 169 2 170 201 299 91.0 3e-80 MKYAPRKVYIKERGRYVELSYTDFCCRRESDQTYMDKLFIPVQGCLLEVVREQYTDFYRD KERWRYLQKLDTKNSLLSLDGFTDSEGNPLDFIADEAADIAETVVNAVMVDRLKAALPLL SDSEQALIHAIFFDGLSEREVGARFGITQSVVNKRKARILIKLRKIIES >gi|157101633|gb|DS480691.1| GENE 110 114718 - 115143 311 141 aa, chain + ## HITS:1 COG:no KEGG:MGAS2096_Spy1146 NR:ns ## KEGG: MGAS2096_Spy1146 # Name: not_defined # Def: putative cytoplasmic protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 139 1 139 143 245 96.0 5e-64 MAYNHGREDRKWRIWKEAEEKLMRECGVDEATIEQIRIADRADFNSNRRFYRWTNDVAEY LEDMADRERQAEVGTVAELLDEIESENLYQVLVAVDGRTLKIVLLKMQGYSTKEIAPLVH LTTGAIYARLDHLRKKLRKIL >gi|157101633|gb|DS480691.1| GENE 111 115227 - 115700 290 157 aa, chain + ## HITS:1 COG:no KEGG:MGAS2096_Spy1145 NR:ns ## KEGG: MGAS2096_Spy1145 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 157 17 173 173 254 96.0 9e-67 MAYRVKAYTLREESTESGTRYFISFKDGQGKSHELEVSEQFFMEFRQMERRNRNLLQWDE RHREFNEVWDETLYRRALRVPKSLDERMIEEERNETLCKAVGSLPEIQRRRFLLYYEYEF NFYQIAAMEHCTASAIQKSVAIAKEKVKAEMKKYLQP >gi|157101633|gb|DS480691.1| GENE 112 115811 - 115897 70 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCNENKDTARKEESHAEVTDSQRRNAPL >gi|157101633|gb|DS480691.1| GENE 113 115854 - 116894 337 346 aa, chain + ## HITS:1 COG:no KEGG:MGAS2096_Spy1144 NR:ns ## KEGG: MGAS2096_Spy1144 
# Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 328 1 328 331 600 89.0 1e-170 MQKLQTVNAETLLYEPLEKPSFVVDSLIPTGLSLFCGSQKIGKSWLMLKLCLCVSQGLPL WDMPTMEGDVLYLCLEDTFCRIQDRLFRLTDEASGRLHFAVASCKLSDGLIVQLEDYLKD YSDSRLIVIDTLQKVRTASKDNAYASDYGDISLIKDFADRHSLAVIVVHHIRKQNDSDVF NKVSGTTGLTGSADATFVLEKEKRASDTAKLYVTGRDTPYQEYTLRFRDCSWELVERKTQ EQLAKEAIPDILFRLVDFMRDKEEWTGTATELLDALGETETAANVLTKWMNEYRLDFLQE NHIRYGFSRRSSGRIISLVMQEGTDSDGHVDDDGVSGIPPAAVTDM >gi|157101633|gb|DS480691.1| GENE 114 117508 - 118473 421 321 aa, chain + ## HITS:1 COG:no KEGG:MGAS2096_Spy1143 NR:ns ## KEGG: MGAS2096_Spy1143 # Name: not_defined # Def: plasmid recombination protein Mob family # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 1 321 1 321 321 565 94.0 1e-160 MPYAILRFQKRKAGGVAACERHNERKKEAYKSNPDIDMERSKNNYHLVKPPRYTYKKEIN HMVAEAGCRTRKDSVMMVETLITASPEFMNSLPPEEQKAYFTMALDFISERVGEKNILSA VVHMDEKTPHMHLCFVPITPDNKLSAKSILGNQKSLSEWQTAYHERMSSRWNQLERGQSS METKRKHIPTWLYKLGGRLDKQYAEIVSALSDINAFNAGKKRDKALDLLSAWLPDVEKFS KEIGKQQTYIDSLKERIGQESDYAGRMRDEKYEQELKVQKANQKIFELQRTNEQMGRLLS KIPPEVLEELQRTGRNKSRER >gi|157101633|gb|DS480691.1| GENE 115 118481 - 119335 530 284 aa, chain + ## HITS:1 COG:DR0012 KEGG:ns NR:ns ## COG: DR0012 COG1475 # Protein_GI_number: 15805053 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Deinococcus radiodurans # 7 215 32 218 288 94 33.0 2e-19 MKKQDFKVLKTKDLYPFPDNPFHVAEDETLSELAESIKEFGIVTPIITRPKEDGNGYEVI AGQRRVRASELAGINTIPAFVLPLDRDRAIITLVDSNLQRENILPSERAFAYKMKSEAMK RQGFRTDLTSSQVVTKLRTDDKVAQGFGVGRMTVQRFIRLTELIPPILQMVDEGRIALTP AVELSFLKKDEQEKLFATMESEEATPSLSQAQRMKSLSQSGRLDMDTIFAIMTEEKGNQK ETVKIGMEKLKKYFPKGTTPKQMADTIIKLLERELQRKRNRDSR >gi|157101633|gb|DS480691.1| GENE 116 119381 - 119611 210 76 aa, chain + ## HITS:1 COG:L114363 KEGG:ns NR:ns ## COG: L114363 COG4443 # Protein_GI_number: 15672295 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus 
lactis # 3 68 4 69 72 70 53.0 6e-13 MKEIQYEIVKEIAVLSTGDSGYTKEINLISWNGKEPKYDIRSFSPNREKCGKGITLNAAE AAALLKALQKELNSGD >gi|157101633|gb|DS480691.1| GENE 117 119682 - 119873 241 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939459|ref|ZP_02086809.1| ## NR: gi|160939459|ref|ZP_02086809.1| hypothetical protein CLOBOL_04352 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04352 [Clostridium bolteae ATCC BAA-613] # 1 63 1 63 63 103 100.0 5e-21 MGEDKKADKKSKRIVPKAPVQMIISREYVGTQTVTEAFIPIISEDIRKKIAEGDTFDNEG LSA >gi|157101633|gb|DS480691.1| GENE 118 119928 - 121859 798 643 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 14 309 10 300 301 228 40.0 3e-59 MAGIRTENKVYEVGMYCRLSKDDGTDNESASIATQKSILTDYVKKQGWHLAKTYVDDGYS GTNFQRPSFQNMIKDIESGVINCVITKDLSRLGRNYLDCGLYLEVFFPEHNVRYIAVNDG VDTLNKSAMDITPFRNILNEMYAADISVKIKSAFRARFQQGKYMATSAPYGYIKDPADHN HLLIDDKVAHIVREIFDLALQGYGIPKIRKHINKQHILRPAAYAAERGETGFERYFEDNE ENRYIWGENSIRNILRSPIYAGNLAGYKRISPSMKSRKRLSKLPEEWEVIPDTHEGIVTQ EEFDTVQRLMTSRRREENAGGFKNIFAGIIKCADCGYALRATSVHRRKRPDIIDCVQYSC NNYARNGNGVCTAHNIEARDLFNAVLADINRFADMAVNDEKAVRAIEKRLTETDHSRAKA LEKEQRKLNKRLAELDRLFSSLYEDKVMERITERNFEMMSGKYQKEQLEIEARLKEVSET LTESYEKSQGVRDFLSLIRNYQGIKELDATIINALIDKIIVSERETIADGTVRQEIKIYY KFIGFVGELHIIPTKRWAALKPKNCTVCGIEYVPNSGISKYCPECRERIRKVRGTEAKIR SRERNRQACIELSAKNDRLTSPSAKAVSSARATRIPLSMCTAM >gi|157101633|gb|DS480691.1| GENE 119 121844 - 122776 557 310 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 265 1037 1308 1315 80 26.0 3e-15 MYRYVTERTFDAYLWQTVEKKQKFISQIMTSKSPVRSCDDVDETALSFAEIKALCAGDPR 
IKERMDLDVEVSRLKLMKADHQSKQYRLEDQLLKYFPEEIEKHKGFIKGFESDLEVLAAH PHPEDGFAGMEIRGDLLTDKENAGAALLDACKEVKTSDPVQIGSYRGYAMSVEFSAWKQE YTLLLKGQMTHRATLGTDPRGNLTRIDNALAQMPQRLEAAKAQLDNLCQQQAAAKEEVGK PFLYEEELRSKNARLVELDTLLNIDGKGQGQAHTESAVAKSTRPSVLDHLKRPVPPRSTD KKPKQHEEVR >gi|157101633|gb|DS480691.1| GENE 120 122778 - 123584 503 268 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1910 NR:ns ## KEGG: Ethha_1910 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 106 260 1 166 254 103 37.0 7e-21 MNTNDLNTALYEKMAAEQDKYRDWLKSQPPEEILHHTYEYTIREDIVMAMEELELTDAQA EALLESPSPLADVYRYFDKLETGYMDVIRDSIESRANEVCREPEELNPMVVYLHSASYAT KHGETDAYWLSDQANFSCKVAIEQAISAHYRDNRLDTASAVQEILEEFGAERLNFILANT IQHKDADGRISRDNKAWAKTIPMPEDSATSQQCADLIVDRVNPGLVDLFTRQARKTVQEK EKGSVLQKLKQELPVHKPAAPKKREPER >gi|157101633|gb|DS480691.1| GENE 121 123587 - 123925 179 112 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1911 NR:ns ## KEGG: Ethha_1911 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 111 1 111 112 126 65.0 2e-28 MAKRKRDMQLNFRVSADELAVIEQKMSQFGTSNREAYLRKMALDGYVVKLDLPELKELVS LMRRSSNNLNQLTRKVHETGRVYDADLEDISQRQEQLWEGVKEILTQLSKLS >gi|157101633|gb|DS480691.1| GENE 122 124078 - 124377 284 99 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1912 NR:ns ## KEGG: Ethha_1912 # Name: not_defined # Def: membrane protein # Organism: E.harbinense # Pathway: not_defined # 4 99 1 96 96 77 53.0 2e-13 MKILKGLLMIITAPIILLLTLFVWLCTGLIYISGLVLGLFSTVIALLGVAVLITYSPQNG VILLVMAFLISPMGLPLVAIWLLGKVQSLKFAIQELVYG >gi|157101633|gb|DS480691.1| GENE 123 124503 - 125096 148 197 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3502 NR:ns ## KEGG: EUBREC_3502 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 197 1 197 197 274 65.0 2e-72 MGQFEQLDQMLTAQEGMLRTSQVVSSGISKPVFYDYVRSRDLDRVAHGIYLSKDSWVDAM YLLHLRFEQAVFSHETALFFHDLTDREPIEYTVTVKTGCNPSKMKAEGIQVFTIKADLHD VGLTTAKTPFGHTVPVYDMERTICDLLRSRSRIEIQTFQGALKAYARRKDKNLRTLMQYA GMFKVEKILRQYLEVLL 
>gi|157101633|gb|DS480691.1| GENE 124 125186 - 125947 494 253 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3503 NR:ns ## KEGG: EUBREC_3503 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 244 32 277 287 328 66.0 2e-88 MERFLERISLSEYHDKFILKGGMLVAAMVGLDARSTMDLDATIKGANVNVEDIENLISSI ITVPIDDGVKFQLKSISEIMDEAEYPGIRVSMSTTFDGVVTPLKIDISTGDAITPREVQY SFKLMLEDRSIDIWAYNLETVLAEKLETIITRTTTNTRMRDFYDIFILEQLHGTTLNPKI LHDALLATAHKRGSEKYLNQAEEVFDEVENDSVMQKLWEAYRKKFSYASDLEWDVIMKAI RRLYDLCEKGIGL >gi|157101633|gb|DS480691.1| GENE 125 125954 - 126076 68 40 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939469|ref|ZP_02086819.1| ## NR: gi|160939469|ref|ZP_02086819.1| hypothetical protein CLOBOL_04362 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04362 [Clostridium bolteae ATCC BAA-613] # 1 40 1 40 40 63 100.0 7e-09 MDTIVKFVANTKPTKNFQAKVMHPTFVNPVLRYQQKSRLN >gi|157101633|gb|DS480691.1| GENE 126 126325 - 127737 878 470 aa, chain + ## HITS:1 COG:SP1056_1 KEGG:ns NR:ns ## COG: SP1056_1 COG3843 # Protein_GI_number: 15900926 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD2 components (relaxase) # Organism: Streptococcus pneumoniae TIGR4 # 24 173 18 180 402 80 33.0 5e-15 MATTRIMPLHVGKGRTESRAISDIIDYVANPKKTDNGRLITGYACDSRTADAEFLLAKRL YITATGRVRGADDVIAYHVRQSFRPGEITPEEANQLGVEFAKRFTKGNHAFVVCTHIDKS HIHNHIIWSSVSLEYDRKFRNFWGSTKAVRQLSDTICIENGLSIVENPKPHGKSYNKWLG DQAKPSHRELLRVAIDNALSQSPADFEELLRLLQESGCEVSKRGKSYRLKLPGWEKAARM DSLGEGYGLEDLQAVLSGKKTHTPRKKTVTQAEPPKVNLLVDIQAKLQAGKGAGYARWAK VFNLKQMAQTMNYLSENNLLEYAVLEEKAAAATAHHNELSAQIKAAEKRMAEIAVLRTHI VNYAKTREVYVAYRKAGYSKKFREEHEEEILLHQAAKNAFDEMGVKKLPKVKELQTEYAK LLEEKKKTYAEYRRSREEMRELLTAKANVDRVLKMEVEQDVEKEKDHGQR >gi|157101633|gb|DS480691.1| GENE 127 127927 - 128241 251 104 aa, chain - ## HITS:1 COG:no KEGG:Sgly_1172 NR:ns ## KEGG: Sgly_1172 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: S.glycolicus # Pathway: 
not_defined # 1 100 1 100 110 125 56.0 5e-28 MDEEFIRNRITELRLKKGVSEYQMSMELGQNRSYIQAISSGRSMPSMKQFLNICEYFEIT PLQFFDAQENNPQLIKKALDGMRKMSDDDLIMLIGFINRLNTEN >gi|157101633|gb|DS480691.1| GENE 128 128399 - 128941 163 180 aa, chain + ## HITS:1 COG:BS_yoqJ KEGG:ns NR:ns ## COG: BS_yoqJ COG4474 # Protein_GI_number: 16079120 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 23 171 2 159 171 62 32.0 3e-10 MNQYACAITGHRPTRFCFGYNEEDPLCQKLKECLLEQFRILHDEKFVRTFFVGGALGVDM WAGEQLLTLRAQTGYEDIKIIVVIPFIGYDSKWPDQSRHRLKKLIQNANDSIVISHSADV SSYKKRNYYMVDHAEYIIGVFDNQKKLRSGTAQTVNYALRNGKAITLIHPDTMEITTPTT >gi|157101633|gb|DS480691.1| GENE 129 129025 - 129219 74 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|332655013|ref|ZP_08420754.1| ## NR: gi|332655013|ref|ZP_08420754.1| hypothetical protein HMPREF0866_02741 [Ruminococcaceae bacterium D16] hypothetical protein HMPREF0866_02741 [Ruminococcaceae bacterium D16] # 2 64 14 76 76 110 98.0 3e-23 MLRTNGAAMNKYIEQQFAALRYEKNILVVEETTSAGGCGSFNAASSCPASACLYDSAAWG FFIA >gi|157101633|gb|DS480691.1| GENE 130 129235 - 130350 708 371 aa, chain + ## HITS:1 COG:no KEGG:Bmur_0750 NR:ns ## KEGG: Bmur_0750 # Name: not_defined # Def: hypothetical protein # Organism: B.murdochii # Pathway: not_defined # 3 371 2 374 374 348 45.0 3e-94 MSLKYTCPSCGTPLGYEGLCWKCKCEQERQAALAWTPEQIVEKQRNLIQNIQRLADMEDP EFADFWQLLGYHDVITPEIQRMALAAEVFWPCEIYYHAPADVRDGLIHALLSAEYSSAAS NLMSCLAMQGDDKAMETLLELERNPRPWRKGLYVDPSSYAQIGGWTFDKEGQKIQLNFDT CYPMVKGTTGEKSPVRIGREREDTCPHCGGRVVDILVLDGRDERLKFLGLDGILTATCCP SCVGFLKGPAFNRFTLDGGVEVFPSELFDGAEKTDCYVSPEEYKALTENPFVLGEAPVSL FYGAACQDVNTIGGFANWVQDAEYTTCPHCGKPMKYLAQIQWDMVFDCAEGTLYVEFCPD CHIVSMQHQQT >gi|157101633|gb|DS480691.1| GENE 131 130411 - 130644 203 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939475|ref|ZP_02086825.1| ## NR: gi|160939475|ref|ZP_02086825.1| hypothetical protein CLOBOL_04368 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04368 [Clostridium bolteae ATCC 
BAA-613] # 1 77 1 77 77 144 100.0 3e-33 MNSEQTTAFLQEHWHIASVLIGAVILIGAIRNWNWLCDPTGTRDAHRHSRGYRRVVFFLL GVLLIVVSIWGFVLKLK >gi|157101633|gb|DS480691.1| GENE 132 130658 - 131200 565 180 aa, chain + ## HITS:1 COG:no KEGG:Sterm_0689 NR:ns ## KEGG: Sterm_0689 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 5 180 2 177 177 174 51.0 1e-42 MMTEKYPTWLTGHVKEWAEKRLPTVTLCSTAGNELLEVWYYGDLLTVKGEPQSYIVDSDE APGLVAARDPESGEEFVIFDGGRHGYDHLFCDEHDPSELEHRPLKQYEIPASKLVLELGY NIDYEDEKEDFEPDESDTVELINGERMPWEQVKRDGIDYIALYYVNEKGKPVQILDAELA >gi|157101633|gb|DS480691.1| GENE 133 131370 - 131819 352 149 aa, chain + ## HITS:1 COG:no KEGG:LMHCC_2038 NR:ns ## KEGG: LMHCC_2038 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_HCC23 # Pathway: not_defined # 1 149 44 188 188 124 44.0 2e-27 MWADESVYGPPAPGSKVGDFIQTWLIGSWDCGVFTHREIAQRLMDTFTGLKLKELAWFEN PREKTLKNGKPRARRAKWLPDREVDLVYLYSDRYLDARNPSPAETDFFTVTRMDAWGVST WFMCTPQAAEKLIAMDYDNLAIQETTIVG >gi|157101633|gb|DS480691.1| GENE 134 131834 - 132175 264 113 aa, chain + ## HITS:1 COG:no KEGG:Vpar_0189 NR:ns ## KEGG: Vpar_0189 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 15 99 12 95 107 89 51.0 4e-17 MIGTDENRAALHVEVIFWSGKRKIPPSLVSGKYCPHFVVIGTTEYLGVCFLDGTECTFDT PALGNAQPLYPDTIDYAPLENNAEFLIYEGANAVGKGRVLGRTVPYQVKQQRK >gi|157101633|gb|DS480691.1| GENE 135 132183 - 133262 912 359 aa, chain + ## HITS:1 COG:all2748 KEGG:ns NR:ns ## COG: all2748 COG0666 # Protein_GI_number: 17230240 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Nostoc sp. 
PCC 7120 # 53 232 108 290 426 62 29.0 1e-09 MYQIAYIGRWETLPETAAAICDHDTPKLEALLQGGLDLDVPIQLSEYIKLMPLEIAVFRN DVPMIHFLLEHGADSSLAEEQPLLLTAARCCGPEVVALFAGQAAKLSSKQKERAFQEVRW GKRPENIQVLEQAGITVEKFGGEAFRAAVSEGNTKLARLLLEKGADINYHKPDMVFPYAS TPVTEAARSNNFPMVRWLIEQGADITIADKYGDRPYTVAVQNKNQELADYLKALEPEEWH NEQEKIRQLMPYKLPAKLVEYLKTGPLRLEFPDQEWVKWAELYSFMDVQEMTWKRKKLLS LMVQMDNYSDYLLLWSPRDKKLWYLDIEHEEFHPLAKWDDFIADPGRYLNGMIEGEFEE >gi|157101633|gb|DS480691.1| GENE 136 133267 - 133887 537 206 aa, chain + ## HITS:1 COG:no KEGG:TDE0558 NR:ns ## KEGG: TDE0558 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 29 197 27 209 210 72 30.0 1e-11 MTSIDFLNKVHKSLDSQEYSLSYSPAKSKNYMLYCNGNFIGGLFDEELCFVYADSVSELL GQPEPVYRGYSSTAQHRMLVIPEEHWAKALKLLYAEKFDWSRLVYDITYTSIGAAVVEDF YDENVVFLRFCFEKELLKKNPLDRQGRILRMVYLNQDLTEAGKYLFPRLLQKFLVFTDRK GKTSLETMLQRWYTALEKEYRSQITG >gi|157101633|gb|DS480691.1| GENE 137 133900 - 134490 510 196 aa, chain + ## HITS:1 COG:no KEGG:Vpar_1736 NR:ns ## KEGG: Vpar_1736 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 1 180 1 180 197 201 56.0 2e-50 MTEENRTYITHLKVTDVPWHRLTTAYGRGTDFPAHLTVLEQMRDLASVKESLYELTTNME HQSTLWHATPFGMVFLSRILEKALADSGQNPAAHFLAGELLDFFSCILQCFHDGDEMEHA EALPLFSDLLKEEYLWSEEYDEEEDEMRYEEDEVFPDDLFYSFYYYSWQAVLAYRDILEQ VSEEFAGPAAAVLKLL >gi|157101633|gb|DS480691.1| GENE 138 134504 - 135343 950 279 aa, chain + ## HITS:1 COG:no KEGG:TDE0503 NR:ns ## KEGG: TDE0503 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 3 279 16 283 283 180 37.0 6e-44 MFEEFYEMYEPEEQEVVALINRCIGGGFNWKGNFWEMTVVTLGIVFCDTGKVSTKEERLD WPVTEEERNGEKGWGRFGKEQICRLKIRRMKEEWAKDLVVQPWCISQVVKAHEDCPELQA VLDEYHKPVVIQDQVLGELTLDKDYDAFEGEIQWCGKDVSLSLEVNAESKPSWTRARSAA KKLLADCETWDKAMRDLAAKNLTELANNWLSQDEENPRNPETDPITEEELARRISMTSLS VTSGGSFTAWFDCDEIFTDHAVTVYGSLKKGLKTANIEG >gi|157101633|gb|DS480691.1| GENE 139 135357 - 135860 627 167 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|160939483|ref|ZP_02086833.1| ## NR: gi|160939483|ref|ZP_02086833.1| hypothetical protein CLOBOL_04376 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04376 [Clostridium bolteae ATCC BAA-613] # 1 167 9 175 175 303 100.0 3e-81 MKLNCKRIDFTYIPGEEIFNFPEESGLPFLFDVEEELTGDPAAMDAVGKMLDEAEKLAEK AKTAIKAALADEDSRYHSVVTFFMEFHRDDVGPDIAAELFPGTDPAKLSFAEMVDFLKLK RFGSLVDDEMDQQVFIMDLSFNPEITDELMVIYFDLNQEIFCITHES >gi|157101633|gb|DS480691.1| GENE 140 135850 - 136449 125 199 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939484|ref|ZP_02086834.1| ## NR: gi|160939484|ref|ZP_02086834.1| hypothetical protein CLOBOL_04377 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04377 [Clostridium bolteae ATCC BAA-613] # 1 199 1 199 199 406 100.0 1e-112 MKAEEQKREKYLPTLLIQQIRLQWYKDCRGGNAAAQRNQYPRAMRLPKDFFSYYPFGLPT HFASIIQRPDGFQIDRDCRRLMEWKPNGTMRLHPFELIQQESGIHVRYRNDWHIGAMPER YTYDKTGQKQPLNELALDLAPGEYGRAVCNGRFRDWDTGIWYYVLDILNVMPLTEPTDSL TSFTDREPNKIYTQIDRLW >gi|157101633|gb|DS480691.1| GENE 141 136472 - 137107 458 211 aa, chain + ## HITS:1 COG:no KEGG:FN0653 NR:ns ## KEGG: FN0653 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 1 211 1 212 212 152 36.0 1e-35 MISTKDLSGLPNAERLKNFCKGLATLDIIMVEEEWSFIRHYTYNPAWRKGKEAFFATDGS DQSMIVMFAPEGCVINGVDSELYDWEEKLPRIEDLTDGMPPALQKLMGSREVKKMKSTFC VWTEDGTTWHCNPMDGEDASKDLLSTIDGNPQSYVDYGKWFYPADLPLEAVRQLADGVPV TKELVAALNPKRSEWEEIKTGLDKIGYPNEL >gi|157101633|gb|DS480691.1| GENE 142 137316 - 137687 309 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167772517|ref|ZP_02444570.1| ## NR: gi|167772517|ref|ZP_02444570.1| hypothetical protein ANACOL_03895 [Anaerotruncus colihominis DSM 17241] hypothetical protein ANACOL_03895 [Anaerotruncus colihominis DSM 17241] # 36 123 97 184 184 174 98.0 2e-42 MLRVVLGTEKDELGSGLSICFVSLPGQRDQKPFGGAYFRDMSGCELAEIEQAAGQKCQPI GHIGYYYPAEVWISEYGKLYAKYEYQDEIECFPDVFALIERELRQYKFDSAAMKTVEALD GKL >gi|157101633|gb|DS480691.1| GENE 143 137720 - 138688 362 322 aa, chain + ## HITS:1 
COG:no KEGG:Sterm_1475 NR:ns ## KEGG: Sterm_1475 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 135 306 106 258 317 84 32.0 9e-15 MQNPWDISPLKFGMSQDEIMEVFGKPDAVSTMRSGGKPLILKYCDIELHFDRKAPHGLYL VYSDDEIELSITAEHEETLQPITNTEPVDNEFFFQDGAVYFSGLYENGLLKGVAPKDFCC WHYWGKSSTACFLGGIRLRGADPASFRVLNYAYAMDKTAVYTTSGRIPDAELAAFQVLDK GQNDSGAPQGYAKDSRQVYFHNGDGKVKIIKGAEVSSFRSLGDTYFARDEKRIYAYGKQL SKADLTAWELLSHWYSRDARRVYYLNREIKGADRDSFTVCTPVDAALLADHLARDKDHFY QNDEIMEETQWLEQLRKMTQEP >gi|157101633|gb|DS480691.1| GENE 144 138705 - 139250 446 181 aa, chain + ## HITS:1 COG:no KEGG:CLL_A2815 NR:ns ## KEGG: CLL_A2815 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 1 170 1 178 193 110 37.0 2e-23 MDYLKILHEKPDLADEFDSLFDFFLLDELSPRDDAEGRCTFSLPGMAFARDGSGGEYHLL EDGSIGYYSSEGEAGRLAESMDDLFSLLVSCICWHDCCDTKQYVDSKTLEEYGQRQRNCN LEDMDMDSLQRVSDALGIPTGEPLAPVLERFRKATQREPVYQCIFHEDDGSLTESYGLMF E >gi|157101633|gb|DS480691.1| GENE 145 139267 - 139593 586 108 aa, chain + ## HITS:1 COG:no KEGG:Sterm_1192 NR:ns ## KEGG: Sterm_1192 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 1 108 1 108 108 101 57.0 8e-21 MKAFDPNYKLLDEMYQDDYYPAFLVNKVKDELQKVIDLLESGETDTDAVQETLDEAVCGI NDLQEEFDENDSEIETVARECIAATVAYILEWFGIPIDTEEAIRERDW >gi|157101633|gb|DS480691.1| GENE 146 139599 - 140546 837 315 aa, chain + ## HITS:1 COG:no KEGG:BP951000_1074 NR:ns ## KEGG: BP951000_1074 # Name: not_defined # Def: hypothetical protein # Organism: B.pilosicoli # Pathway: not_defined # 69 309 4 242 244 196 43.0 2e-48 MAEKELALCDECGSLFFKGSSQMMGLCPECAHILYGYPNCDHHFQDGRCVNCYWDGSESV YIKKLNQQEETDMPTTEWLNKYESIKDKLTCKVDLEAHFTEKVIGNMAVDVLDIGTVHFP TGQIFACDPLVELEDTLPFLQTIPAGTYPVKICVVPSEQYGDRYACVKVEVSREKPVRYE LGMVGNENLDAALGDDDYFGFGVDAGMGCIADIQTQAAFKTYWAKRLEEDPDIDPYNDLF CDLLEENAQAHPKYQGDCGDWLNWTVPDTDCNLPIFSSGWGDGYYPVYFGYDAKGEVCAV YVRFIDIEASYREQD >gi|157101633|gb|DS480691.1| GENE 147 140574 - 142055 1137 493 aa, chain + 
## HITS:1 COG:no KEGG:TDE0253 NR:ns ## KEGG: TDE0253 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 1 493 1 497 497 465 49.0 1e-129 MKANAGEVYTVYNQYLKRYTACQVAYIAPPDTVSKESWAVVLSLDWVGDAPLTAEELPHL RPLYKDFMYWSRDLHLLRVPLEVPPQYTLVGTLPPFTDQPCRSYGGWSDGYDVYLQIRWQ AIPEERRRAFKEAMESEEKTEIGGIPVKVSSHRVMDQYAPFDSALELKALPCLSELICQR WHPDLLEFLRGNPFISELTLLNHGQRTLDLRGTSIRKLMLDMTGLEELWLCEGTEQLLFQ NKGLDACTIHAPEDGSGLTLQFIGEYCPHTELPNLRGLHGIELKDFDLTGLAAVHPHLKE LRLWGAPGNLQNFSAVGAFRELTNLSTFDLFGFGADDIPTPEQMPELRWFWMTSLPEDAA KAAKQLWKSKPGMDLRITKPRKPEWLAQNLDNPFRGWDGAEHIPAAAAKKAANQYRKTRS LLMKLATEPGEDAQAQAMDAVTAYTQTFNKMGFIETEERDEIYMALRGILDALPDGTLQK DALMEKFEQLRDF >gi|157101633|gb|DS480691.1| GENE 148 142069 - 142668 470 199 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939493|ref|ZP_02086843.1| ## NR: gi|160939493|ref|ZP_02086843.1| hypothetical protein CLOBOL_04386 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04386 [Clostridium bolteae ATCC BAA-613] # 1 199 1 199 199 402 100.0 1e-111 MSYTKFSKEVTKWLKDNGLPCYGTANDSPEETKARLDAWMRGIKEILRQWITEKRYRELI SCAHGGWYQDDVIFEPLAEHFVANHLFDELRFLCERGIRFSAEDMLSTIQSEKEEHGSLD IETIRNIDVPSYVAGRSYSHLGEIAKYRKRALDQIIRYIGYLEQIHAPAEYLEQVKFLQK IVADLTIKAKDLKPFRFRL >gi|157101633|gb|DS480691.1| GENE 149 142700 - 143299 368 199 aa, chain + ## HITS:1 COG:no KEGG:PPE_04519 NR:ns ## KEGG: PPE_04519 # Name: not_defined # Def: hypothetical protein # Organism: P.polymyxa # Pathway: not_defined # 3 199 5 204 219 132 33.0 6e-30 MSRIVDEPYFTYSAYDALTSGIQVKCPKCHGTGVVTADEDNAYFRCLSCGHQVARDRTIY RYDVHNQCKNCGRYYRVDIEDEERQHFSVLHVACPYCGTTMPGEVHKTAEAFSYIGEIRD GREPYFGLELWFLTSFQGKPVWALNREHLAYLIGYLSADLREKPPGRAKMTQADHLPTFM KTAKNRERIVKLLRQMQEK >gi|157101633|gb|DS480691.1| GENE 150 143302 - 143547 164 81 aa, chain + ## HITS:1 COG:no KEGG:FN1193 NR:ns ## KEGG: FN1193 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 5 79 9 83 89 92 54.0 4e-18 
MEQKVIYNGQILTLTRFWATGEPCLWITDPQQIGMPKMEFVGGHPDEYCIFLKNLTETEL SQITSLDGAPLDVKEERNDIE >gi|157101633|gb|DS480691.1| GENE 151 143558 - 144292 447 244 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939496|ref|ZP_02086846.1| ## NR: gi|160939496|ref|ZP_02086846.1| hypothetical protein CLOBOL_04389 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04389 [Clostridium bolteae ATCC BAA-613] # 1 244 1 244 244 494 100.0 1e-138 MKDLCQYGNRPEDEWEILPWIPDPRPPFKIWAKPEQIAPFFLIPHHPYAISLLLKISDGF RTEEFRRLGLIGSSEDWERLVRGVIQEFEENNSGVDLFHFDSDEDVFCVYSQYIDDLMLL AKMIRVACADEKAMRTYLGKIEYIKLFWEGAPEGEPAVILYEVDTENERLAHRSIDIFAD GRTHNNPDLYDGAIEITPIPTVEELNAHVWGEEFHACIIEQAEFEAIWESHAYNEELKGT GEIG >gi|157101633|gb|DS480691.1| GENE 152 144289 - 144699 277 136 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939497|ref|ZP_02086847.1| ## NR: gi|160939497|ref|ZP_02086847.1| hypothetical protein CLOBOL_04390 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04390 [Clostridium bolteae ATCC BAA-613] # 1 136 1 136 136 277 100.0 2e-73 MKQTDIYTEALTCLRSILLADHPEFQNWIDWLERDIEDWTQRREVAHHLRAYGGMGSFND LPSMRGNHDYIFGFLKSVCYAFGHLYGKREGISPEALMEECLHDVEQAAYHSYKALNQAI AQHLMQGDLQENLDRL >gi|157101633|gb|DS480691.1| GENE 153 144711 - 145466 400 251 aa, chain + ## HITS:1 COG:no KEGG:Vpar_0727 NR:ns ## KEGG: Vpar_0727 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 111 249 11 156 156 114 44.0 3e-24 MVQAWRKVDFGTGTVTFLDDTQKEDMLQVEYPNGFLLDMGWYQDRYIISIIHNFDWTHPV KRYETADGNQLPALLTEAVRFVEKKSRTAEREKSEVNMMQVRKLEGYRDKIKAAYEASAF TKGTRPEPEDAVKRFESRYTPIPAEYRWLLLNFGGCYLAEPWIFTLKELEEAYPIFQEAY EDYMSEYDHGPTFPIGGLGDGSIVFIDLESGRVRGYNRDYADLEEIAENFSSLLLGLVEQ ALELGNLCKEL >gi|157101633|gb|DS480691.1| GENE 154 145476 - 147113 1078 545 aa, chain + ## HITS:1 COG:no KEGG:EFER_3822 NR:ns ## KEGG: EFER_3822 # Name: not_defined # Def: hypothetical protein # Organism: E.fergusonii # Pathway: not_defined # 227 541 5 324 324 207 36.0 1e-51 
MAKDIRECLLEQVGKFHQWQEITYPGKTTEEIGGAWEVDYPAWNDIFDAFCHVLTQMDAE TADSVLLDEMVYLIARDNEAEGFIQETTSHPQWFERLCRRAAASNESEAKWQFAAYLSEC PCSQEVKDMILDFAKDPNEYVSRRALLAMPALRPDCVEQFAPLFWERNCYSPELPEYQRI AVLVSLDAIHSDLLPQYLERAKQDGRSYLLEHAKRIEGGLAMNEKLSRPQFNQMDTTEKQ TLMESLADRYDMTFLGLHTFDRWGQSYTTGIFEKDGREFVFVPGDTVTLGWEQFAVGLNQ ESREELEYLFREWEMERDPEEMIRESMAPVRQAAIGPMLVGRELEEINWEPVKMDDPRLT VHPDWLKEFRDFAWSDSSSLTLHQSARIERTEKGFQICIYNHTDYDALLAMLENRGFSLP TADEWAYLCGGGCRTLFPWGDGLDYSMRLHWFENMDEDENRPYDMEEPNFFGLSIAYDPY MREVVQADRLTTCGGDGGCNICGGLGPFLGFLPCSPHCKPEVQEDNELNGDYDFYRPIIR LENYD >gi|157101633|gb|DS480691.1| GENE 155 147156 - 147626 570 156 aa, chain + ## HITS:1 COG:no KEGG:Bsph_3254 NR:ns ## KEGG: Bsph_3254 # Name: not_defined # Def: hypothetical protein # Organism: L.sphaericus # Pathway: not_defined # 1 150 1 146 157 97 39.0 2e-19 MGAWGIKALERDEGLDVLDILKNEYVPEHLVMDLGEMIELMKEEGMLGEDLSDIDFLYDN TAMALAELYFQWKDNSKLDYDHEEAIWDKVTGFTASKEALAFLLRQLTDIKNEVPDEDGI REIVDLWKNEDSGEIAPAWLEHLGWLIKRLISEQEV >gi|157101633|gb|DS480691.1| GENE 156 147630 - 153146 4549 1838 aa, chain + ## HITS:1 COG:NMB0685 KEGG:ns NR:ns ## COG: NMB0685 COG4859 # Protein_GI_number: 15676583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis MC58 # 186 280 13 107 107 89 45.0 5e-17 MYIKKYWGNFIGGSDDSLNLVAFLEDQKKEEIPLSEIFTKIGLDKQNWGFRQTVEYLEFT HSSGVEMDFHFAIDVVTDLAAILLECSVSGSVNLQDLDEYNTPSRRIRITATPEEHDAMN KSLADFAQNPLEYDLSEMMDDEEIQEMARDVEALRKDLYEAAGRNRDYHVKAEDVKSLLS DWKGADGCIATNRITVEGCKVGYCYREKPDGDWDSGWRFTAGDESEEYMDDPNNAGIYKL NTICNDDPDIIPLLNTPAPCAFERDENGVFQQIKDWKPDEDEEDPDMDILKQCQKWHEED KHQKIVDALEAIPTEERTPEMDMELARAYNNLADSSEPEGRKQLHRALELMQCHEEELGD TYSWNFRMGYAYYYLDQEGRALRYFEKALEQHPGDDPKLNTRQDIEDLIDWCTKGISLPQ FSECFRERTENWWETFAEMEAELRQMMDEDKDHTRGAELVAQMEDTLNLVFDEISFELGF NGEKHELILTPEGNKVKLFELVYFQKHAPKEVLEHWNILVGRQPSQNIGLRTDDSWDISG EDVQIWLEEQGENSFNISAYCEKLLPMLREAEGRVWWMLTTLTDQILGEIPHMRYIDSFD VLEEPKAEPSFLLSQLPDKLREQGLELSTDPEAYLESYLGYEMKPNEDPNADWRLDVMAG 
STCCVPLINGYLNADNDFMDDLHADGAVAGFFCYPLDTLREEEGSEKIFDFRDKLEELFT TVDGSEMLALIGGATGLYCGYVDFIAWDIREALNMAKEFFEGTDIPWAIFHTFRREAGSV PLKQQDDGTETENQDDELDETLTGMDYIPYTQQDAEAFFAQLEQWNDEDEYTRCIQAINA IPEDWRNYRTAYALARALENYAIIGDHDEGTLKFKGDKALQRAIEVLESVREEGQDKAEW NMRMAYGYQYLYGQEEKAIPYAQRWAELDPEDENAPAVIRECKAEIRKRQRSRKKKAKFV PGDTPFEGFDLTNFWDDNWYALKEYVSDPPSDELIASVEEELGYKLPAAYIWLMKQHNGG IPMNTCYPCDEPTCWADDHVAITGIFGIGREKNCSLCGEMGSQFMIDEWEYPAIGVAICD CPSAGHDMIFLDYRACGPQGEPAVVHVDQENDYKITHLADSFEEFIRGLEHESLYDPDED VEDLNEDVDADEEETDHKGSFAGSVLLSKVEWDKEQLIRDLREEWGIVDEEPDEGDEDDE NSDDAVVMRVGGMMLIVTLFHGHIPDNEAEINAENNYMWPEAVEVTKAHKAHIMVAVLGE EEKLLERGKLFTKAMAVCCKQKYATGVYTSGVVFEPRFYEGLADMIKEDELPIFNWVWFG LYRSEGGLNGYTYGMDVFGKEEMEVLNTDAEPEDLRDFLASLASYVLACDVTLQDGETIG FSADDKHTITRSPGVSLPEEQMTLKISYEPTEVEPETDDDSIGMDDVSYHIESIEEKELP IDPINAYNHMAIYLRWCMEHDLMGEKFLEEHGDVVNQVKADPGSTDLRTFIREELFGCLF SALFNQKGRAFAHYYYGENDAPYYPADIDDYALKYFGPSRYHSNEFQQETYLFIPFDEKY YQAMAKVIEKRFVNWQGQDFDEDTLEPSEVAQAIMEYLDCECTYFPSMADDDPIMSAYSY ARRLGVREDFIPVLIKPDETLLECLVMNADPENDADCYEFNPKAVEEYRKKMLSAPVKDG KAVLEELTGQRKEEAEEDDMDWEEEIIGEIDGGINNDRFASYWDSDTNMTVPLILAKIPV KNPWEIFAYLPFGNWNECPDTLELMAVAKYWFEQYDAVPAAMSHDELEFLLPAPVPKEKA IDVAVEQYGFCPDLDQNASIGTLADTIHQSTVWYFWWD >gi|157101633|gb|DS480691.1| GENE 157 153162 - 154025 774 287 aa, chain + ## HITS:1 COG:yieJ KEGG:ns NR:ns ## COG: yieJ COG3196 # Protein_GI_number: 16131585 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli K12 # 116 286 8 194 195 176 48.0 6e-44 MTEYQKTYIELKKQFVATNEGPDSVRALYTFKEELEQSEDQQAKEVLVDVYDLLDFKEDA YELLCQIGNRSDKKTLKRLGVLKDYAENWGNHYALPKPKTPEEKQKEKERQAQLGLPAFR YHPDPLDTGAFEESKEGVICGCCGKTTHIYYTGPFYSVDEIAYLCPECIASGEAARKYDG SFQDDFSVDDGVDDPEKLDELIHRTPGYSGWQQEYWRAHCGDYCAFLGYVGARELRALDV LEEVLGDPMWNEEQKDMIRESVNGGHLQCYLFQCLHCGKHLVWMDFD >gi|157101633|gb|DS480691.1| GENE 158 154042 - 154818 724 258 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939503|ref|ZP_02086853.1| ## NR: gi|160939503|ref|ZP_02086853.1| hypothetical 
protein CLOBOL_04396 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04396 [Clostridium bolteae ATCC BAA-613] # 1 258 9 266 266 490 100.0 1e-137 MGLDIYAGTLTRYYSHNWKTVVQQWAEENGYSFNRITPDGEPADDEELSPTNVQAAVENW RDQILAAISQPDQPPYAPWPEDNEKPYYTDKPDWDAFGAMLLVAACRTYEEPVPPTVEKD WIFGEHPLIARLASDEERVWSLFRGVTWWLPLTDSFLFQGPLPTDDTVAIATLGGLRKEL EKLNQLAWQADEDTILGWADTEGYPVDGTVDSDGQYSKADIPEHTQYDTQSLAKFAFSMF WRAMRFAEEQQVPILLDY >gi|157101633|gb|DS480691.1| GENE 159 154845 - 155720 607 291 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939504|ref|ZP_02086854.1| ## NR: gi|160939504|ref|ZP_02086854.1| hypothetical protein CLOBOL_04397 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04397 [Clostridium bolteae ATCC BAA-613] # 1 291 4 294 294 558 99.0 1e-157 MEQKRPADIFQEALDYLWNGLGLEEKGWKRLKKGDFKKKMKNGLTYQIWFDRRRYNYIDY EIGHGNVEVGFTCIIKQGDDRLYSFKIEPTTGVSFFRMLTEDLRLNTGLLDTFLPLIKAH YLDFIDRFETDPAEALHPVCAPFIQPEDYSWCIHVDEQMVEQYGTAEQMEEYRRQAELRG TPECKAKNWMGSMLFHLSHANDVDHAWASSRTKEELDQVVEPFVQAKRQTGQWTQEDEAG YQLYQQETDPKKRTFRVWYLIANPRGLPKEFVQKELEFRWKLFPEKKEEAK >gi|157101633|gb|DS480691.1| GENE 160 155717 - 156595 714 292 aa, chain + ## HITS:1 COG:no KEGG:TDE0553 NR:ns ## KEGG: TDE0553 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 3 292 10 299 299 298 47.0 1e-79 MSNQLFQQNLDDKKGPQPGGPYLIQMLFKEPVEMPDKEKMTAVMEKHIGSTECFCYDKKM AGFAAQEHIAEFKDGKCPVQLMVMKCDKFKGKGFDAFLMSQMWDCQEDRERIFRECKYQV VATDMLAAALPALERANLDADFLEALAELYPTCEAFYFQNCGKLFLAEDVRSHQIEGSDR FIRFGVNVRFFNIEGTEDMLIDTVGMSTLFLPDLQYHFHDMDPNWVVNHAYNVASYILEH DNPIQDGETIDGVADGRMCREIQWKCQYEDALIQPPRGVLDIHMGEYAAGGR >gi|157101633|gb|DS480691.1| GENE 161 156596 - 156736 139 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNGVVLLVGGMLLVAVIKLIVDRNWMVLLLCAVALFLVFGAGHHTK >gi|157101633|gb|DS480691.1| GENE 162 156765 - 157727 537 320 aa, chain + ## HITS:1 COG:no KEGG:TepRe1_0261 NR:ns ## KEGG: TepRe1_0261 # Name: not_defined # Def: hypothetical protein # Organism: Tepidanaerobacter_Re1 # Pathway: 
not_defined # 17 318 19 322 327 234 40.0 3e-60 MQNSTNMRILELLRFLYEQTDENHPATVSDIIAHLNGKGIQAVRQTVYADTNALIDAGID IVVVKSTQNQYFMGSRLFEYPELKMLTDAVASSKIISAKKSEELVQKLCRLTSTHQAEQL QKFAALSSRVKPHNEKVYYIIDNIQTAIGNHQQIRFQYYEYTQEKKKILKHDGYYYVVNP YALEWKNDHYYLIGFSLKHQKIAHFRVDRLTSIENLETCFMPIEGFDVASYTNKMVDMFT SESSKEVTLLCENELMRVIIDHYGEDAAVDRYDDTHFTAKIEVNPSGTFYGWIFKFKGKI KILSPKECITEMQQIAQEFI >gi|157101633|gb|DS480691.1| GENE 163 157859 - 158227 231 122 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1745 NR:ns ## KEGG: Cthe_1745 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.thermocellum # Pathway: not_defined # 6 121 5 126 126 87 40.0 2e-16 MGNITFGSFIAEKRKAHKFNLRDTAKHLNIAYGYLCDIEQSRRPAPNGDFVERISAFLNL DKSEHELLLDLAAKSRNTVSADLPDYIMEKDIVRAALRVAKEVDATDEEWQTFMKMLKER KR >gi|157101633|gb|DS480691.1| GENE 164 158296 - 158853 133 185 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1746 NR:ns ## KEGG: Cthe_1746 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 2 185 31 250 262 98 33.0 1e-19 MEYDPSLLSGQPCPVPIETIIETKFDLILEFHTLRKNPKILGETIFDDGAVVLYDQIQRQ YRVIAVRAGTILIDERLCDPSKLGRLRFTCAHELAHWVLHKKLYSGTGDVAAYNGNVSSD ESHGIIERQADTLASALLMPLPQIKKCFYRLRIGRTDEQLIAEMANIFEVSKQAMQIRLK SRNLI >gi|157101633|gb|DS480691.1| GENE 165 158958 - 159494 117 178 aa, chain + ## HITS:1 COG:no KEGG:Closa_2767 NR:ns ## KEGG: Closa_2767 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 178 11 188 206 154 43.0 2e-36 MRKESTTIQKCNSRLTALGLNSDAAFHRSKLILKIYRDVVWVLSERAEELQETAWIMGEQ DIESGLCYLENFAPDIELQAFEEKVCCLVQNQMLVNIIDRALLRLKRYPDRGELYYEILT KQFIYRFNSTEKELLEELNIERSVFYDRKREAIYLFSVCLFGYSIPEVLEELPRLNSD >gi|157101633|gb|DS480691.1| GENE 166 159921 - 160343 77 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939512|ref|ZP_02086862.1| ## NR: gi|160939512|ref|ZP_02086862.1| hypothetical protein CLOBOL_04405 [Clostridium bolteae ATCC BAA-613] HTH DNA binding domain protein [Ruminococcaceae bacterium D16] hypothetical protein 
CLOBOL_04405 [Clostridium bolteae ATCC BAA-613] HTH DNA binding domain protein [Ruminococcaceae bacterium D16] # 1 140 43 182 182 233 100.0 5e-60 MFRDEDDRIIMECSRKDFEKYQQEDRHSRYLQEHEKSRSIFPASHVGDRDGTEEGYQDTD LFVDESVDTAEQAICNLLMADLHRALQQLSQKERSFILDYYSMEKPSTLQLAKRYGISQP AAHKRLKKIEEKIKKLVIDF >gi|157101633|gb|DS480691.1| GENE 167 160832 - 161212 174 126 aa, chain + ## HITS:1 COG:BS_ydcE KEGG:ns NR:ns ## COG: BS_ydcE COG2337 # Protein_GI_number: 16077533 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Bacillus subtilis # 6 115 2 111 116 112 52.0 1e-25 MKEDWIYKRGDLYYANLNPYFGSEQGGTRPVLVLQNNVGNFFCPTLIVAPLTSKWIKKKE LPTHYALESVPELGLKSVVLLEQIKTIDKRRVLSYIGRVSREEMRAIDDALQVSLDIHIP EEMEAP >gi|157101633|gb|DS480691.1| GENE 168 161243 - 161470 215 75 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1353 NR:ns ## KEGG: Ethha_1353 # Name: not_defined # Def: DNA binding domain protein, excisionase family # Organism: E.harbinense # Pathway: not_defined # 15 71 10 66 70 68 56.0 1e-10 MFEARIAELNRFNEQNPVSYDKRTYTVDEIQDILGISRPTAYNLVKQGVFHSVRVGGHIR ISKKSFDDWLDHADE >gi|157101633|gb|DS480691.1| GENE 169 161575 - 162012 147 145 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939516|ref|ZP_02086866.1| ## NR: gi|160939516|ref|ZP_02086866.1| hypothetical protein CLOBOL_04409 [Clostridium bolteae ATCC BAA-613] hypothetical protein HOLDEFILI_02459 [Holdemania filiformis DSM 12042] conserved domain protein [Clostridium hathewayi DSM 13479] hypothetical protein CLOBOL_04409 [Clostridium bolteae ATCC BAA-613] hypothetical protein HOLDEFILI_02459 [Holdemania filiformis DSM 12042] conserved domain protein [Clostridium hathewayi DSM 13479] hypothetical protein FP2_00500 [Faecalibacterium prausnitzii L2-6] # 1 145 1 145 145 280 100.0 3e-74 MQKQRVKTSMSVPEMGKMLGLGKVESYWLVKKNYFKTIQVAGRMRVMLDSFEDWYAGQFH YKKVDGTPPGEKWRHTTMSVPEMADLLGLKSGTAYDLVKRGYFETTLIDRRIRIITSSFE AWYQKQTHYVKISERSNENGIYREA >gi|157101633|gb|DS480691.1| GENE 170 161990 - 163372 620 460 aa, chain + ## HITS:1 COG:mlr0475 
KEGG:ns NR:ns ## COG: mlr0475 COG0582 # Protein_GI_number: 13470699 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 20 414 25 391 399 123 23.0 1e-27 MASIVKRKSKYSVVYDYTDENGKRRQRWETFSTNAEAKKRKKQIEYEQDSGTFFIPTAKT LNDLLDEYMSIYGVNTWAMSTYESRRGLARNYITPIIGDMLLSDITPRMMDKYYRDLLSV KTVSVNNRKPTSEYLTPHTVREIHKLLRSAFNQAVRWELISRNPVLNATLPKEEHKERDI WTAETLSKAMEVCDDPILSLALNLAFSCSLRIGEMLGLTWDCIDIAPQSIENGSAYIFVN KELQRVTRGALDDLSDKGVIKKFPPCIASTHTALVLKEPKTKTSIRRVYLPKTVAYMLVE RKKEIDELMDLFGDEYIDNNLVFCSSNGRPMESQVINRAFNKLIKENGLPHVVFHSLRHS SITYKLKLNGGDMKSVQGDSGHAQVKMVADVYSHIIDEDRCINAQRLEEAFYSSKTPDPV EDTEPKTADTAVTESDAAKILELLKNPETAALLKQLAKAL >gi|157101633|gb|DS480691.1| GENE 171 163717 - 163839 75 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLASLAGFEPTAFRLGGERSILLSYRDIYSIKNGNNHFSS >gi|157101633|gb|DS480691.1| GENE 172 164599 - 165879 728 426 aa, chain + ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 4 421 52 468 470 383 50.0 1e-106 MDTSSLMEQILSRDNLNAAYLQVVRNKGAAGVDGMTVEELGAYLSENGENIKEQLRTRKY KPKPVRRVEIPKPDGGTRNLGVPTAVDRFVQQAVAQVLTPIFEEQFHDHSYGFRPKRCAQ QAVLKALEMMNDGHNWIVDIDLAKFFDTVDHDKLMTIFGRTIKDGDVISVVRKILVSGVM IDDEYEDTVVGTPQGGNISPLLANIMLNELDKELEARRLDFVRYADDLIIMVGSRQAAER VMKSVARFIEEKLGLKVNAEKSRVDKPKGIKYLGFGFYYDSFAKGYKARPHPKAAAKFKA QMKKYTSRSWGVGNGYKIGKLNRLIRGWINYFKIGSMKRLCAKMDGQIRYRLRMCIWKHW KTPKNREKNLIKLGLPPNAAHGISYAKGYARVCRSWNLHICISKERLAKFGLVSMEDYYA EKAVTC >gi|157101633|gb|DS480691.1| GENE 173 166109 - 167197 409 362 aa, chain + ## HITS:1 COG:no KEGG:CPF_0975 NR:ns ## KEGG: CPF_0975 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 7 350 3 345 362 79 24.0 2e-13 MQLNNYLEYLEKTVMAYSDKVKRHLVKQFIQHGIATKQTDTSGQLYIEGLETVDISEKTL SEQLVTCIHQSKRAGYALEKWEFLKDYKYSVIFTCDDFSSTLKKAKEIIPLENTYENDYL 
YTSFVKPVAYSMDGYYFLKFNLAYAAIHPLTQEEFLTKYPFLVVFHEKGELIEFRFDVLK RVFLSDKKEPTIYSNLIAEMSDYFKEHFECDLIPLNLDFMVNVCKRDENVKLIAQSMKLP NGGNAQLDVGNNQEYILPFIGELRSLLNDNQAELEKVPDFREALEQFMFEMEEMSDYPWI ELLWENEIKTRSNRVKFVFNYMNKSYCLIQYYYSNVLIGMERMNYVIEYIVNHRNDDTTQ NE >gi|157101633|gb|DS480691.1| GENE 174 167500 - 168216 21 238 aa, chain + ## HITS:1 COG:no KEGG:CTC01115 NR:ns ## KEGG: CTC01115 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 68 234 69 241 268 90 38.0 5e-17 MIDIMPDDKEISKALNVLNSISDKLRYEKICEISEAQKNIYKELLEKFKQLHDSSPTDNA PKNLHNLKGEALENLVSYLLTISGNIFNVDRNLRTSTNEIDQIVTLTPQGKVLLTYHLID PKLDTFLGECKNYDKAVSVTYIGKFCSLLLTTNIKIGILFSYYGTSGTGWSNGAGLIKKF YLHKEKLEDKYCIIDFSIKEFEAILNDKNLLQIIDEQLKSLQFDTDYSRYLSKHPAED >gi|157101633|gb|DS480691.1| GENE 175 168400 - 168714 248 104 aa, chain + ## HITS:1 COG:no KEGG:CD3346 NR:ns ## KEGG: CD3346 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 104 1 104 104 202 100.0 4e-51 MELKFVIPNMEKTFGNLEFAGEDKTEQRRINGRMAVLSRSFNLYSDVQRADDIVVILPAE AGEKHFDFEERVKLVNPRITAEGYKIGTRGFTNYILHADDMVKA >gi|157101633|gb|DS480691.1| GENE 176 168736 - 169113 418 125 aa, chain + ## HITS:1 COG:no KEGG:CD3345 NR:ns ## KEGG: CD3345 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 125 1 125 125 226 94.0 3e-58 MRLANGIVIDKEATFGALKFSALRREVHLQNEDGSVSEEIKERTYDLKSRGQGRMIQVSI PASVPLKEFDYNAEVEIIHPVADTVATATFQGAEVDWYIKADDIVLKKGAAMNPQLPKKD APPVK >gi|157101633|gb|DS480691.1| GENE 177 169150 - 170289 917 379 aa, chain + ## HITS:1 COG:BS_ydcQ KEGG:ns NR:ns ## COG: BS_ydcQ COG1674 # Protein_GI_number: 16077553 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Bacillus subtilis # 7 375 13 377 480 338 46.0 9e-93 MKQLFLRGKRIRPTDKDFVFHTALAALFPVFLLVVLLFHIRQIAGTNWQEVSLSQIAQDV NIPYLLFSMGVAGLVCLAVILLFWRYRRDEVKQLIHRQKLARMVLENKWYESEQRKDDAF 
FKDLSSSRSKETITYFPKIYYRMKQGLLHIRVEITLGKYQEQLLNLEKKLESGLYCELTD KELKDSYVEYTLLYDTIASRISIEDVQAKDGRLRLMENVWWDYDKLPHMLIAGGTGGGKT YFILTLIEALLRTNAVLSVLDPKNADLADLQAVMPDVYYKKEDMLACIDRFYEEMMKRSE DMKLMDNYRTGENYAYLGLPANFLIFDEYVAFMEMLGTKENAAVLNKLKQIVMLGRQAGF FLILACQRPDAKYLGGRNP >gi|157101633|gb|DS480691.1| GENE 178 170270 - 170548 165 92 aa, chain + ## HITS:1 COG:BS_ydcQ KEGG:ns NR:ns ## COG: BS_ydcQ COG1674 # Protein_GI_number: 16077553 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Bacillus subtilis # 1 92 376 467 480 102 51.0 2e-22 MGDGIRDQFNFRVALGRMSEMGYGMMFGETTKDFFLKQIKGRGYVDVGTSVISEFYTPLV PKGHDFLKEIKKLIDSRQGVQAACEAKAAETD >gi|157101633|gb|DS480691.1| GENE 179 170518 - 171924 875 468 aa, chain + ## HITS:1 COG:BS_ydcR KEGG:ns NR:ns ## COG: BS_ydcR COG2946 # Protein_GI_number: 16077554 # Func_class: L Replication, recombination and repair # Function: Putative phage replication protein RstA # Organism: Bacillus subtilis # 138 468 25 352 352 248 40.0 2e-65 MRSESRRNGLTCCGWCAARHASRNPSYLTEGYKSTGNNQTTGQNPTVHKGFSPYPAYVNK ALKVGIRFFRQEGFSLNEETWVRDIREKRAAYGISQQKLALAAGITRPYLSDIETGKAHP SKALQEAITEALERFNPDAPLEMLFDYVRIRFPTTDVKHIVEDVLRLKLPYFIHEDYGFY SYTEHYYLGDIFVLVSPELEKGVLLELKGRGCRQFESYLLAQERSWYEFFMDVLMEDGVM KRLDLAINDKTGILNIPHLTEKCRNEECISVFRSFKSYRSGELVRREEKECMGNTLYIGS LQSEVYFCIYEKDYEQYKKHDIPIAYAEVKNRFEIRLKNERAFYAIRDLLEHDNPERTAF QIINRYVRFVDRDDTKPRSDWRISEEWAWFIGEHRGSLKLTTKPEPYSFERTLHWLSHQV APTLKLALRLDKMNHTQIVHDIITHAKLTEKHEKILKQQAAAAKEVVL >gi|157101633|gb|DS480691.1| GENE 180 171976 - 172197 280 73 aa, chain + ## HITS:1 COG:no KEGG:CD3342A NR:ns ## KEGG: CD3342A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 73 1 73 73 107 98.0 2e-22 MNFGQNLYNWFLSNAQSLVLLAIVVIGLYLGFKREFSKLIGFLVVSLIAVGLVFNADGVK DILLELFNKIIGA >gi|157101633|gb|DS480691.1| GENE 181 172319 - 172816 423 165 aa, chain + ## HITS:1 COG:no KEGG:CD3340 NR:ns ## KEGG: CD3340 # Name: 
not_defined # Def: putative conjugative transposon antirestriction protein # Organism: C.difficile # Pathway: not_defined # 1 165 1 165 165 282 96.0 4e-75 MEEMRIYIANLGKYNEGELVGAWFTPPVDFEEVKERIGLNDEYEEYAIHDYELPFEIDEY TPIEEINRLCEMVEDLPEYIQEELSELQSYFGSIEELCEHEDDIICHSGCDDMADVARYY LEESGQLGELPAHLQNYIDYEAYGRDMELDGTFIVTNHGVYEILR >gi|157101633|gb|DS480691.1| GENE 182 172909 - 173301 358 130 aa, chain + ## HITS:1 COG:no KEGG:CD3339 NR:ns ## KEGG: CD3339 # Name: not_defined # Def: conjugative transposon membrane protein # Organism: C.difficile # Pathway: not_defined # 1 130 1 130 130 233 97.0 2e-60 MKKIRSYTSIWSVEKVLYSINDFKLPFPITFTQMAWFVVSMFAVMLFGNFPPLSFIDGAF LKYFGVPFALTWFMCQKTFDGKKPYGFLKSVLAYLVRPKLTYAGKPVKLEKEYPAQPITA VRSDIYGISD >gi|157101633|gb|DS480691.1| GENE 183 173285 - 175738 2244 817 aa, chain + ## HITS:1 COG:no KEGG:CD3338 NR:ns ## KEGG: CD3338 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 817 1 817 817 1582 98.0 0 MAYPIKYIENNLVFNHDGECFAYYELLPYNYSFLSPEQKYQVHDSFRQLIAQNRDGKIHA LQISTESSIRAAQERSKQEVTGKLKDVACAKIDAQTEALISMIGENQVDYRFFIGFKLLV NEQEVTIKQFRREAKTAVSDFLHEVNHKLMEDFVSMSNEEIWRFQKMEKLLESKISRRFK VRRLDKDDFGYLIEHLYGQTGTAYEDYEYYLPKKRFQEETLVKYYDLIKPTRCLIEENQR YLKIEQEDGTVYAAYFTINSIVGELDFPSSEIFYYQQQQFTFPIDTSMNVEIVTNRKALS TVRNKKKELKDLDNHAWQNDSETSTNVVDALDSVNELESTLDQSKESMYKLSYVVRVTAP DLEELKRRCNEVKDFYDDLNVKLVRPFGDMLGLHGEFLPASKRYMNDYIQYVTSDFLAGL GFGATQMLGEPEGIYIGYSLDTGRNVYLKPALASQGVKGSVTNALAAAFVGSLGGGKSFS NNMIVYYCVLFGAQALIVDPKAERGRWKETLPEIAHEINIVNLTSEEQNRGLLDPYVIME NPKDSESLAIDILTFLTGISSRDGEKFPVLRKAIRAVTNSEERGLLKVIEELRAEGTTIS TSIADHIESFTDYDFAHLLFSDEDVTQSISLEKQLNIIQVADLVLPDKETTFEEYTTMEL LSVAMLIVISTFALDFIHTDRSVFKIVDLDEAWSFLQVAQGKTLSMKLVRAGRAMNAGVY FVTQNTDDLLDEKLKNNLGLKYAFRSTDINEIKKTLTFFGVDSEDENNQKRLRDLENGQC LISDLYGRVGVIQFHPIFEDLFHAFDTRPPVRKEVEE >gi|157101633|gb|DS480691.1| GENE 184 175735 - 178116 1054 793 aa, chain + ## HITS:1 COG:no KEGG:CD3337 NR:ns ## KEGG: CD3337 # Name: not_defined # Def: conjugative transposon membrane 
protein # Organism: C.difficile # Pathway: not_defined # 1 793 1 793 793 1406 94.0 0 MKIRKPIQQGTALVPCRIKGQNSLKKIGSIAGKVLLALLLILLLLAVFGTAAHAAGLVDD TVDAANEYSKYPLDNYQLDFYVDSGWDWLPWNWLDGIGKQVMYGLYAITNFIWTISLYLS NATGYLIQEAYSLDFISSTADSIGKNMQTLAGVTTGGLSSEGFYIGFLLILILVVGIYVA YTGLIKRETTKAIHAVVNFVVVFVLSAAFIAYAPDYIGKINEFSADISNASLTLGTKIVL PNSESQGKDSVDLIRDSLFSIQVKQPWLLLQYGNSDVESIGTDRVESLLSTSPDENNGQD REEIVVEEIEDRENTNLTITKTINRLGTVFFLFMFNIGISVFVFLLTGIMIFSQVLFIIY AMFLPVSFLLSMVPSFEGMSKRAITKLFNTILTRAGITLIITVAFSISTMLYNLSGEYPF FLTAFLQIVTFAGIYFKLGDLMGMFSLQSGDSQSMGSRIMRRPRMLMHAHMHRLQHKLGR SVTALGAGTAAYHAGKQAGSDQKNASNSGSSKRTQADHSRPDGQTAPEKESAWKRAGSAV GAVADTKDKIADTAGQLREQAKDLPVNAKYALYHGKTQVSEGVQDFTSSVTQTRTARAEQ RNAQAESRRQTIAERRAELEQAKQPQKTASEAPKGAAPVHERPVTTKQPENFRNHADTAH AGKPAIQPAPPSIRERGQVSYEGTVAERASVPVVKAASIHYEHTPPVRAERQIVPPASPV QPDERQKTAPTTTPAAPRPVRPIQNDTAPVIPEGKRAAPAVKESNFTIRRTTARKEWTKT VKAAAKQKKGEKP >gi|157101633|gb|DS480691.1| GENE 185 178113 - 179117 753 334 aa, chain + ## HITS:1 COG:BS_yddH_2 KEGG:ns NR:ns ## COG: BS_yddH_2 COG0791 # Protein_GI_number: 16077564 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus subtilis # 214 333 5 124 124 140 58.0 4e-33 MKLRHLFFACSGVFVMMFSLLLLVVIVFSDEEDGGSGGNLIYGGMSVSQEVLAHKPMLEK YAREYGIEEYLNVLLAIIQVESGGTLEDVMQSSESLGLPPNSLSTEESIKQGCKYFSELL AAAETKGCDLNSVIQSYNYGGGFLDYVAGHGKKYTFELAESFARDKSGGKKVTYTNPVAV EKNGGWRYSYGNMFYVLLVSQYLTVAQFDDETVQAIMEEALKYEGWTYVYGGDSPSTSFD CSGLVQWCYGKAGIALPRTAQEQYNVTQHIPLSEAKAGDLVFFHSTYNAGTYITHVGLYV GNNRMYHAGNPIGYADLTGSYWQQHLAGAGRIKQ >gi|157101633|gb|DS480691.1| GENE 186 179133 - 180041 758 302 aa, chain + ## HITS:1 COG:no KEGG:CD3335 NR:ns ## KEGG: CD3335 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 302 1 302 302 491 97.0 1e-137 MIQIRKEENQKKQKEKKLKVYKVNTHKKTVIALWVLLAVSFLFAVYKNFTAIDIHTVHET KVIEEKIIDTHKIENFVENFAEVYYSWEQSAASIDNRTNALKGYLTGELQALNVDTVRKD 
IPVSSALTDFQIWEIIEEKEQHYQVTYTVEQRITEGESSKTVRSAYQVTVYVDGSGNLTI VQNPTITSVPVKSGYTPKAVQSDGTVDSITTEEINEFLTTFFKLYPTATAKELTYYVNEG VLKPVGKEYIFSELVNPVYNRKGNQVTASLAVKYLDNQTMTTQISQFDLVLEKNGENWKI VK >gi|157101633|gb|DS480691.1| GENE 187 180400 - 181263 -87 287 aa, chain + ## HITS:1 COG:no KEGG:DSY1388 NR:ns ## KEGG: DSY1388 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 153 88 234 234 81 32.0 4e-14 MPVLDFADDNIIIMHDYFGLFIYNLTTGEIEDSIDLQALGYDFSNDSSCRISVSPNADTI WIWSTSSELPYEYDRISRNLNLTNDISEKDIFEDFVLTKDIPPEQLSVKPYRCSKKSVLF ADGSYGILNVRNEKITGISYIRDDKEWVLFSDKNCTMPELSRQDDFFYEQFVNEGAESAS SLIFSYCTMINYGEYAGICALSEGIEYSEELQKEWNALQLTASSEEIKTTNDKACFKVYI ISSDSSPNSGLLQGINEKYIYLKKDLTNSWYVEGFWNDTVPNEEWWS >gi|157101633|gb|DS480691.1| GENE 188 181923 - 182531 358 202 aa, chain - ## HITS:1 COG:FN0509 KEGG:ns NR:ns ## COG: FN0509 COG3547 # Protein_GI_number: 19703844 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 12 198 205 384 391 71 27.0 9e-13 VDHYQKWCKRRKYNFSKDKAEEIYGAAKELVPILPKDDLTKLIVKQSIEQLNTASKTVEE LRTLMNDTAAKLPEYPVVMGMKGVGPSLGPQLMAEIGDVTRFTHKGAITAFAGVDPGVNE SGTYEQKSVPTSKRGSSSLRKTLFQVMDCLIKTKPQDDPVYAFIDKKRAQGKPYYVYMTA GANKFLRIYYGRVKEYLMSLPE Prediction of potential genes in microbial genomes Time: Thu Jun 30 18:48:06 2011 Seq name: gi|157101632|gb|DS480692.1| Clostridium bolteae ATCC BAA-613 Scfld_02_33 genomic scaffold, whole genome shotgun sequence Length of sequence - 201090 bp Number of predicted genes - 191, with homology - 190 Number of transcription units - 78, operones - 44 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 202 - 366 114 ## gi|160939700|ref|ZP_02087047.1| hypothetical protein CLOBOL_04591 - Prom 563 - 622 4.6 + Prom 372 - 431 5.1 2 2 Op 1 . + CDS 502 - 909 208 ## COG4335 DNA alkylation repair enzyme 3 2 Op 2 . 
+ CDS 972 - 1481 190 ## PROTEIN SUPPORTED gi|229200610|ref|ZP_04327171.1| acetyltransferase, ribosomal protein N-acetylase 4 2 Op 3 . + CDS 1516 - 1734 291 ## Clole_2891 hypothetical protein + Prom 1765 - 1824 5.6 5 3 Tu 1 . + CDS 1868 - 2401 192 ## COG0655 Multimeric flavodoxin WrbA 6 4 Tu 1 . + CDS 2536 - 2994 -71 ## gi|160939706|ref|ZP_02087053.1| hypothetical protein CLOBOL_04597 7 5 Tu 1 . + CDS 3102 - 3773 137 ## FMG_1483 hypothetical protein + Prom 3829 - 3888 4.2 8 6 Tu 1 . + CDS 4001 - 4348 154 ## gi|160939710|ref|ZP_02087057.1| hypothetical protein CLOBOL_04601 + Prom 4408 - 4467 3.2 9 7 Tu 1 . + CDS 4493 - 5050 179 ## gi|160939711|ref|ZP_02087058.1| hypothetical protein CLOBOL_04602 - Term 5461 - 5497 1.5 10 8 Op 1 . - CDS 5584 - 5739 186 ## gi|160939713|ref|ZP_02087060.1| hypothetical protein CLOBOL_04604 11 8 Op 2 . - CDS 5815 - 6003 176 ## Clos_1311 hypothetical protein 12 8 Op 3 . - CDS 6050 - 6394 227 ## gi|160939715|ref|ZP_02087062.1| hypothetical protein CLOBOL_04606 13 9 Op 1 . - CDS 6541 - 6660 119 ## gi|160939716|ref|ZP_02087063.1| hypothetical protein CLOBOL_04607 14 9 Op 2 . - CDS 6724 - 7407 527 ## Sterm_1897 hypothetical protein 15 9 Op 3 . - CDS 7425 - 8054 209 ## bpr_III108 hypothetical protein - Prom 8078 - 8137 8.2 16 10 Tu 1 . + CDS 8447 - 8884 107 ## Mlab_1680 SufBD protein + Prom 9226 - 9285 3.6 17 11 Op 1 . + CDS 9325 - 9573 134 ## ELI_1490 hypothetical protein + Term 9596 - 9646 4.4 18 11 Op 2 . + CDS 9662 - 10486 659 ## gi|160939724|ref|ZP_02087071.1| hypothetical protein CLOBOL_04615 19 11 Op 3 . + CDS 10525 - 10635 61 ## gi|160939725|ref|ZP_02087072.1| hypothetical protein CLOBOL_04616 20 11 Op 4 . + CDS 10642 - 11013 185 ## gi|160939726|ref|ZP_02087073.1| hypothetical protein CLOBOL_04617 + Prom 11016 - 11075 5.7 21 12 Op 1 . + CDS 11243 - 11764 183 ## PROTEIN SUPPORTED gi|228000081|ref|ZP_04047083.1| acetyltransferase, ribosomal protein N-acetylase + Term 11777 - 11829 4.9 22 12 Op 2 . 
+ CDS 11839 - 11991 146 ## gi|160939729|ref|ZP_02087076.1| hypothetical protein CLOBOL_04620 23 12 Op 3 . + CDS 12014 - 12526 277 ## gi|160939730|ref|ZP_02087077.1| hypothetical protein CLOBOL_04621 + Prom 12788 - 12847 3.6 24 13 Tu 1 . + CDS 13013 - 13588 409 ## gi|160939732|ref|ZP_02087079.1| hypothetical protein CLOBOL_04623 + Prom 13630 - 13689 5.5 25 14 Op 1 . + CDS 13782 - 14339 -95 ## COG1396 Predicted transcriptional regulators 26 14 Op 2 . + CDS 14430 - 15059 324 ## COG4845 Chloramphenicol O-acetyltransferase + Term 15236 - 15276 0.0 27 15 Tu 1 . + CDS 15408 - 15896 251 ## gi|160939737|ref|ZP_02087084.1| hypothetical protein CLOBOL_04628 + Prom 15975 - 16034 7.8 28 16 Tu 1 . + CDS 16054 - 16305 96 ## COG4443 Uncharacterized protein conserved in bacteria + Term 16351 - 16418 13.3 + Prom 16424 - 16483 3.4 29 17 Op 1 . + CDS 16514 - 16774 100 ## gi|160939740|ref|ZP_02087087.1| hypothetical protein CLOBOL_04631 30 17 Op 2 . + CDS 16843 - 17091 194 ## gi|160939741|ref|ZP_02087088.1| hypothetical protein CLOBOL_04632 31 17 Op 3 . + CDS 17170 - 17637 317 ## gi|160939742|ref|ZP_02087089.1| hypothetical protein CLOBOL_04633 32 17 Op 4 . + CDS 17665 - 18057 192 ## TDE1820 hypothetical protein 33 17 Op 5 . + CDS 18078 - 18455 117 ## Snas_4048 hypothetical protein 34 17 Op 6 . + CDS 18548 - 18928 73 ## CD1006 acetyltransferase 35 17 Op 7 . + CDS 18925 - 19197 102 ## TDE0254 hypothetical protein 36 17 Op 8 . + CDS 19248 - 19595 135 ## gi|160939748|ref|ZP_02087095.1| hypothetical protein CLOBOL_04639 + Prom 19960 - 20019 2.2 37 18 Tu 1 . + CDS 20066 - 20272 227 ## Clole_2891 hypothetical protein 38 19 Op 1 . - CDS 20215 - 21186 405 ## Nther_0246 integrase catalytic region 39 19 Op 2 . - CDS 21188 - 21532 319 ## gi|160939751|ref|ZP_02087098.1| hypothetical protein CLOBOL_04642 - Prom 21554 - 21613 1.8 40 19 Op 3 . - CDS 21620 - 22159 218 ## gi|160939752|ref|ZP_02087099.1| hypothetical protein CLOBOL_04643 - Prom 22267 - 22326 4.0 - Term 22658 - 22704 9.5 41 20 Tu 1 . 
- CDS 22867 - 23115 268 ## gi|160939755|ref|ZP_02087102.1| hypothetical protein CLOBOL_04646 - Prom 23282 - 23341 8.1 + Prom 23194 - 23253 4.0 42 21 Tu 1 . + CDS 23309 - 23545 105 ## gi|160939756|ref|ZP_02087103.1| hypothetical protein CLOBOL_04647 + Term 23609 - 23645 2.2 43 22 Op 1 . - CDS 23734 - 23928 99 ## gi|160939757|ref|ZP_02087104.1| hypothetical protein CLOBOL_04648 44 22 Op 2 . - CDS 24004 - 24297 153 ## gi|160939758|ref|ZP_02087105.1| hypothetical protein CLOBOL_04649 45 22 Op 3 . - CDS 24354 - 24686 276 ## gi|160939759|ref|ZP_02087106.1| hypothetical protein CLOBOL_04650 46 22 Op 4 . - CDS 24748 - 24948 209 ## gi|160939760|ref|ZP_02087107.1| hypothetical protein CLOBOL_04651 47 22 Op 5 . - CDS 25010 - 25210 158 ## gi|160939761|ref|ZP_02087108.1| hypothetical protein CLOBOL_04652 - Prom 25417 - 25476 8.1 - Term 25392 - 25451 7.7 48 23 Op 1 38/0.000 - CDS 25534 - 26469 500 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts 49 23 Op 2 2/0.000 - CDS 26499 - 27251 1088 ## PROTEIN SUPPORTED gi|240145469|ref|ZP_04744070.1| ribosomal protein S2 - Prom 27398 - 27457 5.5 - Term 27409 - 27465 1.3 50 23 Op 3 1/0.000 - CDS 27532 - 28320 874 ## COG4465 Pleiotropic transcriptional repressor - Prom 28373 - 28432 6.1 51 23 Op 4 13/0.000 - CDS 28476 - 30563 2079 ## COG0550 Topoisomerase IA - Term 30594 - 30637 4.3 52 23 Op 5 2/0.000 - CDS 30652 - 31824 919 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake 53 23 Op 6 . - CDS 31815 - 33350 885 ## COG0606 Predicted ATPase with chaperone activity - Prom 33480 - 33539 6.4 - Term 33533 - 33569 1.2 54 24 Op 1 . - CDS 33591 - 34376 669 ## Closa_1841 hypothetical protein 55 24 Op 2 . - CDS 34397 - 35944 1349 ## COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif - Prom 35972 - 36031 7.4 - Term 35991 - 36039 9.3 56 25 Tu 1 . 
- CDS 36268 - 37284 1295 ## COG1087 UDP-glucose 4-epimerase - Prom 37360 - 37419 7.9 - Term 37311 - 37356 -0.9 57 26 Op 1 . - CDS 37442 - 38071 698 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 58 26 Op 2 . - CDS 38142 - 40073 1739 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily - Prom 40098 - 40157 8.8 - Term 40146 - 40181 -0.5 59 27 Op 1 . - CDS 40264 - 42726 1469 ## gi|160939775|ref|ZP_02087122.1| hypothetical protein CLOBOL_04666 60 27 Op 2 . - CDS 42773 - 44110 1494 ## COG0534 Na+-driven multidrug efflux pump - Prom 44140 - 44199 5.3 - Term 44262 - 44314 5.2 61 28 Op 1 . - CDS 44338 - 45516 1426 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) 62 28 Op 2 . - CDS 45533 - 46246 867 ## EUBREC_1566 hypothetical protein - Prom 46403 - 46462 3.5 - Term 46454 - 46496 -1.0 63 29 Op 1 . - CDS 46506 - 47894 178 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 64 29 Op 2 . - CDS 47881 - 49719 2070 ## Closa_2183 hypothetical protein 65 29 Op 3 . - CDS 49706 - 50485 379 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 50688 - 50747 3.8 - Term 50716 - 50763 9.0 66 30 Tu 1 . - CDS 50818 - 52218 687 ## COG1486 Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases - Prom 52246 - 52305 2.3 67 31 Op 1 . - CDS 52327 - 53358 61 ## gi|160939784|ref|ZP_02087131.1| hypothetical protein CLOBOL_04675 68 31 Op 2 38/0.000 - CDS 53365 - 54213 409 ## COG0395 ABC-type sugar transport system, permease component 69 31 Op 3 35/0.000 - CDS 54217 - 55125 426 ## COG1175 ABC-type sugar transport systems, permease components 70 31 Op 4 . 
- CDS 55159 - 56508 732 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 56645 - 56704 8.9 + Prom 56609 - 56668 11.4 71 32 Op 1 7/0.000 + CDS 56693 - 58435 830 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Prom 58455 - 58514 5.6 72 32 Op 2 . + CDS 58681 - 60201 701 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 60384 - 60415 -0.2 + Prom 60217 - 60276 10.6 73 33 Op 1 19/0.000 + CDS 60454 - 61377 805 ## COG0540 Aspartate carbamoyltransferase, catalytic chain 74 33 Op 2 . + CDS 61371 - 61802 594 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit + Term 61872 - 61923 12.1 75 34 Tu 1 . - CDS 62348 - 62560 86 ## - Prom 62762 - 62821 4.3 + Prom 62620 - 62679 5.0 76 35 Tu 1 . + CDS 62748 - 63173 324 ## COG3464 Transposase and inactivated derivatives - Term 63253 - 63296 11.7 77 36 Op 1 . - CDS 63402 - 63839 486 ## ELI_3274 hypothetical protein 78 36 Op 2 . - CDS 63893 - 64261 183 ## PROTEIN SUPPORTED gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 79 37 Tu 1 . - CDS 64380 - 65669 1548 ## COG0172 Seryl-tRNA synthetase - Prom 65732 - 65791 7.6 - Term 65687 - 65721 1.2 80 38 Op 1 . - CDS 65828 - 67465 1538 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold 81 38 Op 2 . - CDS 67479 - 68945 670 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 - Prom 68991 - 69050 4.4 - Term 69283 - 69321 -0.4 82 39 Op 1 . - CDS 69432 - 70337 847 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase 83 39 Op 2 . 
- CDS 70351 - 70971 771 ## Closa_2116 hypothetical protein - Prom 71027 - 71086 2.4 84 40 Op 1 6/0.000 - CDS 71127 - 71777 653 ## COG1564 Thiamine pyrophosphokinase 85 40 Op 2 10/0.000 - CDS 71774 - 72439 805 ## COG0036 Pentose-5-phosphate-3-epimerase 86 40 Op 3 7/0.000 - CDS 72442 - 73320 876 ## COG1162 Predicted GTPases 87 40 Op 4 17/0.000 - CDS 73330 - 75501 2586 ## COG0515 Serine/threonine protein kinase 88 40 Op 5 5/0.000 - CDS 75495 - 76241 957 ## COG0631 Serine/threonine protein phosphatase 89 40 Op 6 4/0.000 - CDS 76244 - 77287 1044 ## COG0820 Predicted Fe-S-cluster redox enzyme 90 40 Op 7 2/0.000 - CDS 77356 - 78723 1344 ## COG0144 tRNA and rRNA cytosine-C5-methylases 91 40 Op 8 1/0.000 - CDS 78790 - 79491 883 ## COG2738 Predicted Zn-dependent protease 92 40 Op 9 26/0.000 - CDS 79494 - 80480 960 ## COG0223 Methionyl-tRNA formyltransferase 93 40 Op 10 4/0.000 - CDS 80492 - 80977 581 ## COG0242 N-formylmethionyl-tRNA deformylase 94 40 Op 11 . - CDS 80997 - 83306 1946 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase - Prom 83338 - 83397 8.1 - Term 83498 - 83534 4.9 95 41 Op 1 5/0.000 - CDS 83731 - 84567 610 ## COG0500 SAM-dependent methyltransferases 96 41 Op 2 2/0.000 - CDS 84582 - 85190 382 ## COG1309 Transcriptional regulator - Prom 85360 - 85419 8.4 - Term 85409 - 85458 4.2 97 41 Op 3 . - CDS 85675 - 86997 1430 ## COG0534 Na+-driven multidrug efflux pump - Prom 87036 - 87095 3.0 - Term 87037 - 87073 -0.4 98 42 Op 1 . - CDS 87135 - 88457 1399 ## COG1362 Aspartyl aminopeptidase 99 42 Op 2 . - CDS 88529 - 89935 1557 ## COG0531 Amino acid transporters - Prom 89981 - 90040 2.1 100 43 Op 1 26/0.000 - CDS 90042 - 91475 1360 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 101 43 Op 2 . - CDS 91456 - 92991 1415 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase - Prom 93018 - 93077 2.4 - Term 93029 - 93080 17.1 102 44 Op 1 . 
- CDS 93089 - 94759 2158 ## COG2759 Formyltetrahydrofolate synthetase - Term 94821 - 94868 1.2 103 44 Op 2 1/0.000 - CDS 94938 - 98006 3270 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins - Prom 98093 - 98152 3.5 104 44 Op 3 . - CDS 98154 - 98930 812 ## COG0740 Protease subunit of ATP-dependent Clp proteases - Prom 98959 - 99018 3.0 - Term 99018 - 99056 9.1 105 45 Tu 1 . - CDS 99085 - 100707 183 ## PROTEIN SUPPORTED gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 - Prom 100752 - 100811 8.2 106 46 Op 1 13/0.000 - CDS 100872 - 101669 273 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 107 46 Op 2 9/0.000 - CDS 101669 - 102604 1393 ## COG4120 ABC-type uncharacterized transport system, permease component - Prom 102727 - 102786 2.0 - Term 102800 - 102860 7.1 108 46 Op 3 2/0.000 - CDS 102890 - 103954 1374 ## COG2984 ABC-type uncharacterized transport system, periplasmic component - Prom 104001 - 104060 5.9 109 46 Op 4 40/0.000 - CDS 104112 - 105509 1561 ## COG0642 Signal transduction histidine kinase 110 46 Op 5 . - CDS 105536 - 106213 879 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 106248 - 106307 4.2 111 47 Tu 1 . - CDS 106319 - 107314 1023 ## COG2355 Zn-dependent dipeptidase, microsomal dipeptidase homolog - Prom 107351 - 107410 1.8 112 48 Tu 1 . - CDS 107424 - 109088 1893 ## COG0659 Sulfate permease and related transporters (MFS superfamily) - Prom 109127 - 109186 4.5 113 49 Op 1 36/0.000 - CDS 109215 - 110393 383 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 114 49 Op 2 24/0.000 - CDS 110377 - 111099 321 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 115 49 Op 3 . - CDS 111149 - 112939 1852 ## COG0845 Membrane-fusion protein 116 49 Op 4 . - CDS 112954 - 114120 1439 ## Closa_2156 S-layer domain protein - Prom 114161 - 114220 5.9 - Term 114211 - 114263 7.1 117 50 Tu 1 . 
- CDS 114292 - 115842 1528 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes - Prom 115889 - 115948 2.0 - Term 115912 - 115965 12.2 118 51 Tu 1 . - CDS 115980 - 116417 472 ## COG1959 Predicted transcriptional regulator - Prom 116463 - 116522 7.1 - Term 116517 - 116546 0.3 119 52 Op 1 . - CDS 116547 - 118064 1424 ## COG2720 Uncharacterized vancomycin resistance protein 120 52 Op 2 . - CDS 118069 - 119529 1294 ## COG0785 Cytochrome c biogenesis protein - Prom 119583 - 119642 5.3 121 53 Tu 1 . - CDS 119647 - 121035 464 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 - Prom 121080 - 121139 7.8 - Term 121051 - 121097 2.1 122 54 Op 1 . - CDS 121314 - 122591 1328 ## COG2873 O-acetylhomoserine sulfhydrylase 123 54 Op 2 . - CDS 122621 - 123724 1142 ## COG2768 Uncharacterized Fe-S center protein - Prom 123749 - 123808 5.3 - Term 123879 - 123918 0.4 124 55 Op 1 2/0.000 - CDS 123954 - 124610 739 ## COG5658 Predicted integral membrane protein 125 55 Op 2 . - CDS 124597 - 124878 359 ## COG0640 Predicted transcriptional regulators - Prom 125006 - 125065 6.2 - Term 125095 - 125123 -0.1 126 56 Op 1 . - CDS 125129 - 126346 877 ## gi|160939855|ref|ZP_02087202.1| hypothetical protein CLOBOL_04746 127 56 Op 2 . - CDS 126343 - 128295 1812 ## COG2199 FOG: GGDEF domain - Prom 128441 - 128500 5.0 - Term 128617 - 128664 13.1 128 57 Tu 1 . - CDS 128696 - 130150 1389 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific - Prom 130228 - 130287 5.2 - Term 130220 - 130262 2.1 129 58 Op 1 . - CDS 130319 - 130576 285 ## Hore_14490 phosphotransferase system, phosphocarrier protein HPr 130 58 Op 2 7/0.000 - CDS 130597 - 131091 672 ## COG2190 Phosphotransferase system IIA components 131 58 Op 3 . - CDS 131138 - 131935 878 ## COG3711 Transcriptional antiterminator - Prom 132095 - 132154 8.9 - Term 132098 - 132154 -0.8 132 59 Op 1 . 
- CDS 132231 - 133538 1033 ## Closa_1252 hypothetical protein 133 59 Op 2 25/0.000 - CDS 133538 - 135259 2513 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 134 59 Op 3 . - CDS 135319 - 135582 279 ## COG1925 Phosphotransferase system, HPr-related proteins 135 59 Op 4 7/0.000 - CDS 135606 - 136949 1593 ## COG4856 Uncharacterized protein conserved in bacteria 136 59 Op 5 . - CDS 136930 - 137799 1013 ## COG1624 Uncharacterized conserved protein 137 59 Op 6 . - CDS 137831 - 138790 1250 ## Closa_1247 hypothetical protein 138 59 Op 7 . - CDS 138787 - 139194 444 ## Closa_1246 hypothetical protein - Prom 139253 - 139312 8.1 139 60 Tu 1 . - CDS 139348 - 141489 1944 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases - Prom 141540 - 141599 4.4 - Term 141567 - 141622 20.4 140 61 Op 1 . - CDS 141664 - 142185 542 ## COG0778 Nitroreductase 141 61 Op 2 29/0.000 - CDS 142227 - 143210 863 ## COG2025 Electron transfer flavoprotein, alpha subunit 142 61 Op 3 . - CDS 143211 - 144005 885 ## COG2086 Electron transfer flavoprotein, beta subunit 143 61 Op 4 . - CDS 144029 - 145189 1171 ## COG0665 Glycine/D-amino acid oxidases (deaminating) 144 61 Op 5 . - CDS 145232 - 146467 1388 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Prom 146542 - 146601 5.0 - Term 146587 - 146635 10.3 145 62 Tu 1 . - CDS 146682 - 148352 1915 ## COG2508 Regulator of polyketide synthase expression - Prom 148391 - 148450 4.6 - Term 148399 - 148447 8.2 146 63 Op 1 . - CDS 148465 - 149706 1387 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 147 63 Op 2 . - CDS 149754 - 149951 356 ## gi|160939880|ref|ZP_02087227.1| hypothetical protein CLOBOL_04771 148 63 Op 3 . - CDS 149987 - 151600 1565 ## COG1620 L-lactate permease 149 63 Op 4 . - CDS 151615 - 152550 1169 ## COG0549 Carbamate kinase 150 63 Op 5 . 
- CDS 152577 - 155585 3242 ## COG0074 Succinyl-CoA synthetase, alpha subunit 151 63 Op 6 . - CDS 155601 - 156653 1028 ## COG2055 Malate/L-lactate dehydrogenases 152 63 Op 7 . - CDS 156681 - 157469 856 ## COG3257 Uncharacterized protein, possibly involved in glyoxylate utilization - Prom 157525 - 157584 7.9 + Prom 157483 - 157542 7.7 153 64 Tu 1 . + CDS 157702 - 158571 643 ## COG0657 Esterase/lipase + Term 158606 - 158636 -0.5 154 65 Op 1 2/0.000 - CDS 158575 - 159939 1608 ## COG1953 Cytosine/uracil/thiamine/allantoin permeases 155 65 Op 2 2/0.000 - CDS 159990 - 161363 1422 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 156 65 Op 3 2/0.000 - CDS 161380 - 162741 1508 ## COG1953 Cytosine/uracil/thiamine/allantoin permeases 157 65 Op 4 . - CDS 162772 - 164181 1513 ## COG0044 Dihydroorotase and related cyclic amidohydrolases - Prom 164277 - 164336 5.0 + Prom 164313 - 164372 8.5 158 66 Tu 1 . + CDS 164423 - 165298 402 ## gi|160939892|ref|ZP_02087239.1| hypothetical protein CLOBOL_04783 159 67 Op 1 . - CDS 165405 - 165995 493 ## COG0740 Protease subunit of ATP-dependent Clp proteases 160 67 Op 2 . - CDS 165992 - 166414 223 ## COG2207 AraC-type DNA-binding domain-containing proteins 161 67 Op 3 . - CDS 166444 - 166584 59 ## gi|160939895|ref|ZP_02087242.1| hypothetical protein CLOBOL_04786 - Prom 166657 - 166716 7.3 - Term 166877 - 166919 6.2 162 68 Op 1 . - CDS 166957 - 167361 291 ## gi|160939896|ref|ZP_02087243.1| hypothetical protein CLOBOL_04787 163 68 Op 2 . - CDS 167361 - 168122 429 ## BALH_0828 hypothetical protein 164 68 Op 3 . - CDS 168119 - 170593 382 ## COG0433 Predicted ATPase - Prom 170725 - 170784 2.8 - Term 171438 - 171475 -0.4 165 69 Op 1 . - CDS 171484 - 172050 238 ## gi|160939899|ref|ZP_02087246.1| hypothetical protein CLOBOL_04790 166 69 Op 2 . - CDS 172064 - 174355 596 ## Pjdr2_1837 hypothetical protein 167 69 Op 3 . - CDS 174352 - 175764 292 ## EAT1b_1094 hypothetical protein 168 69 Op 4 . 
- CDS 175810 - 176760 244 ## EUBELI_20463 hypothetical protein - Prom 176789 - 176848 6.2 - Term 177213 - 177277 14.2 169 70 Op 1 . - CDS 177285 - 177599 247 ## ELI_0209 hypothetical protein 170 70 Op 2 . - CDS 177611 - 177838 231 ## gi|160939905|ref|ZP_02087252.1| hypothetical protein CLOBOL_04796 171 70 Op 3 . - CDS 177828 - 177974 154 ## gi|160939907|ref|ZP_02087254.1| hypothetical protein CLOBOL_04798 - Prom 178099 - 178158 4.6 + Prom 177734 - 177793 3.4 172 71 Tu 1 . + CDS 177993 - 178196 221 ## gi|160939906|ref|ZP_02087253.1| hypothetical protein CLOBOL_04797 + Term 178209 - 178261 2.7 - Term 178197 - 178246 10.6 173 72 Op 1 2/0.000 - CDS 178404 - 180368 2338 ## COG1190 Lysyl-tRNA synthetase (class II) 174 72 Op 2 1/0.000 - CDS 180393 - 180875 658 ## COG0782 Transcription elongation factor - Prom 180896 - 180955 1.6 175 72 Op 3 . - CDS 180981 - 181946 596 ## PROTEIN SUPPORTED gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase 176 72 Op 4 . - CDS 181947 - 182348 484 ## Closa_1095 hypothetical protein 177 72 Op 5 3/0.000 - CDS 182380 - 183354 1376 ## COG0205 6-phosphofructokinase - Prom 183379 - 183438 2.3 178 72 Op 6 . - CDS 183447 - 186932 4259 ## COG0587 DNA polymerase III, alpha subunit - Prom 186972 - 187031 4.8 - Term 186991 - 187031 8.1 179 73 Op 1 2/0.000 - CDS 187065 - 187319 383 ## COG1925 Phosphotransferase system, HPr-related proteins 180 73 Op 2 . - CDS 187441 - 188418 827 ## COG1481 Uncharacterized protein conserved in bacteria 181 73 Op 3 1/0.000 - CDS 188424 - 189293 940 ## COG1660 Predicted P-loop-containing kinase 182 73 Op 4 . - CDS 189330 - 190241 942 ## COG0812 UDP-N-acetylmuramate dehydrogenase 183 73 Op 5 . - CDS 190262 - 191194 1227 ## COG1493 Serine kinase of the HPr protein, regulates carbohydrate metabolism 184 73 Op 6 . - CDS 191279 - 193258 1801 ## COG0322 Nuclease subunit of the excinuclease complex 185 73 Op 7 . 
- CDS 193325 - 194722 1398 ## COG1306 Uncharacterized conserved protein - Prom 194845 - 194904 8.4 - Term 194949 - 194995 7.1 186 74 Tu 1 . - CDS 195075 - 196694 1632 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases - Prom 196725 - 196784 3.9 - Term 196839 - 196874 3.3 187 75 Tu 1 . - CDS 196920 - 197069 215 ## gi|160939925|ref|ZP_02087272.1| hypothetical protein CLOBOL_04816 - Prom 197133 - 197192 9.1 + Prom 197419 - 197478 5.5 188 76 Tu 1 . + CDS 197498 - 197716 155 ## Ethha_0641 hypothetical protein - Term 197602 - 197641 -0.4 189 77 Tu 1 . - CDS 197775 - 198473 306 ## gi|160939929|ref|ZP_02087276.1| hypothetical protein CLOBOL_04820 - Prom 198583 - 198642 5.9 - Term 198646 - 198684 7.0 190 78 Op 1 . - CDS 198712 - 200160 938 ## COG1621 Beta-fructosidases (levanase/invertase) 191 78 Op 2 . - CDS 200252 - 201088 889 ## COG2190 Phosphotransferase system IIA components Predicted protein(s) >gi|157101632|gb|DS480692.1| GENE 1 202 - 366 114 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939700|ref|ZP_02087047.1| ## NR: gi|160939700|ref|ZP_02087047.1| hypothetical protein CLOBOL_04591 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04591 [Clostridium bolteae ATCC BAA-613] # 1 54 8 61 61 100 100.0 3e-20 MIYVGIDIAKLNHFAAAISSDGEILIEPFKFSNNYDGFYLLLSHLAPLTRTVSS >gi|157101632|gb|DS480692.1| GENE 2 502 - 909 208 135 aa, chain + ## HITS:1 COG:BS_yhaZ KEGG:ns NR:ns ## COG: BS_yhaZ COG4335 # Protein_GI_number: 16078046 # Func_class: L Replication, recombination and repair # Function: DNA alkylation repair enzyme # Organism: Bacillus subtilis # 42 133 265 356 357 87 42.0 5e-18 MAIHSCRVVEATKKHQKSQYLNVLNHETEKERKYVVFASNNIIQENMTFSFTVSAKETTK VRLEYAMGYVKADGRLSRKIFQISEILLKANQKKTYIKKHSFSDLRIRKHYPGTHSVTLI VNGAEQDTLDFELTI >gi|157101632|gb|DS480692.1| GENE 3 972 - 1481 190 169 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229200610|ref|ZP_04327171.1| acetyltransferase, ribosomal protein N-acetylase [Pedobacter heparinus DSM 2366] # 1 149 9 162 188 77 28 3e-13 
MILETERLYLREMNQSDFDSLCSILQDEDTMYAYEGAFSDDEVQEWLDKQIFRYQKWNFG LWAVILKETNEMIGQCGLTMQPWKDTEVLEIGYLLARPYWHKGYAIEAAKACKTYAFEIL KADEVCSIIRDTNIASQNVAIRNGMTMTGTWTKHYRGVDMPHYRYVVSR >gi|157101632|gb|DS480692.1| GENE 4 1516 - 1734 291 72 aa, chain + ## HITS:1 COG:no KEGG:Clole_2891 NR:ns ## KEGG: Clole_2891 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 61 1 61 68 97 83.0 1e-19 MSKYNALWEYVQKNGSQSFRLTFEEIQDFAGISIDHSFLNYKKELTEYGYQVGKISLKEQ TWQGYLKLNFPG >gi|157101632|gb|DS480692.1| GENE 5 1868 - 2401 192 177 aa, chain + ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 2 174 3 176 179 142 40.0 5e-34 MKNVLIISGSPRKKGNSQRLCEEFKKGAEDKGNSVELIRLAEKKIGFCKACDVCMKNNGT CVQQDDMAEVLESFKKADVLVLATPVYFYGICAQMKTFIDRTYPIWQHLGEKEVYYIISA GLGEDIVKRSLGDLDGFVEHLEKHEIKGRIYATNVMDAGKVCGTPYINIAYQMGCHI >gi|157101632|gb|DS480692.1| GENE 6 2536 - 2994 -71 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939706|ref|ZP_02087053.1| ## NR: gi|160939706|ref|ZP_02087053.1| hypothetical protein CLOBOL_04597 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04597 [Clostridium bolteae ATCC BAA-613] # 4 152 1 149 149 290 100.0 3e-77 MNDMGIFKKTFRMIFKKNNGSEKKHNNTLEEDIHLASDWVVKALNSSGYKADYSLESMKE IDRFFDEQSSETGLLSKNRGEILFGLASYIGESAIKLYGGEWNTDDSDPQGEIHISVKLA DGTVIWPVIRCMKRYEHGSEESIYAYLFVLNS >gi|157101632|gb|DS480692.1| GENE 7 3102 - 3773 137 223 aa, chain + ## HITS:1 COG:no KEGG:FMG_1483 NR:ns ## KEGG: FMG_1483 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 1 223 1 224 225 102 28.0 1e-20 MKRMGKIIIAGLCLGLPLLFVKIVFQIPDDLFWKYYLICGGIAVVGTAAFNLLYNRRYLK KMQAAVHLLESNRAEEYVTEVESMRRLAKGRFANCMLTINLSAGYCKLKQYDKAAELLES ISDVKLSGDLELVHRLNLCLCYFYQKQTGRAMALYESSQRILNPYRSSKLYGGNIAVLDI YAAIGHKDYARTAKLLQTARSTWDNPRFLDDYCYLEENIHQTQ >gi|157101632|gb|DS480692.1| 
GENE 8 4001 - 4348 154 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939710|ref|ZP_02087057.1| ## NR: gi|160939710|ref|ZP_02087057.1| hypothetical protein CLOBOL_04601 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04601 [Clostridium bolteae ATCC BAA-613] # 1 115 1 115 115 225 100.0 1e-57 MGTFRKIIVDNSEYMWLFRYDDYDYQRIPYLLITTKAFPKATLHINFPIKDHFLLNSGLP AVFQRQRVSINLNQPLFVSQIIQQCSKQIDFSNLTGYHYLNGLDILKTIGYELFW >gi|157101632|gb|DS480692.1| GENE 9 4493 - 5050 179 185 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939711|ref|ZP_02087058.1| ## NR: gi|160939711|ref|ZP_02087058.1| hypothetical protein CLOBOL_04602 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04602 [Clostridium bolteae ATCC BAA-613] # 1 185 1 185 185 369 100.0 1e-101 MKELVGDKFYTTLSEKYNELLLDYVLLSWDEDYHGAESHKKAVIEAILSILNTRHRVGNN IKHPLFYVHESDMRCIECDAGAFFYEDNRGRFNKGVSETPEKMKMNYWLAFSGPPYGIPY SKEDFRKINDSLFPILFRKDLEIFSWNDDFSNYFDDGKEWWGTALWSIYDKWMNRFVIIG ASLTD >gi|157101632|gb|DS480692.1| GENE 10 5584 - 5739 186 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939713|ref|ZP_02087060.1| ## NR: gi|160939713|ref|ZP_02087060.1| hypothetical protein CLOBOL_04604 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04604 [Clostridium bolteae ATCC BAA-613] # 1 51 1 51 51 64 100.0 2e-09 MVEKQRIQAIFTAKDYMELYRTQKPTVDLMLGIKEQWEFEDFLEEEDWIYM >gi|157101632|gb|DS480692.1| GENE 11 5815 - 6003 176 62 aa, chain - ## HITS:1 COG:no KEGG:Clos_1311 NR:ns ## KEGG: Clos_1311 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 2 62 56 116 116 66 50.0 3e-10 MARAYFGLARWGITLILPESLPAFQEIVIADRRMNRGEQLAALAVKICEAIFREKYMIHF GV >gi|157101632|gb|DS480692.1| GENE 12 6050 - 6394 227 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939715|ref|ZP_02087062.1| ## NR: gi|160939715|ref|ZP_02087062.1| hypothetical protein CLOBOL_04606 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04606 [Clostridium bolteae ATCC BAA-613] # 1 114 28 141 141 239 100.0 
6e-62 MKVTDNLGETIVLPEAVFLSCFENLGEAVTSFSPIHSDTTVYHLYMVLRGDDCSLEEIKA YAKVMNVNYLQAKRALMQKRNLIAAGSAYDIWKMLGRLEPFNVHHEIEPEYPYG >gi|157101632|gb|DS480692.1| GENE 13 6541 - 6660 119 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939716|ref|ZP_02087063.1| ## NR: gi|160939716|ref|ZP_02087063.1| hypothetical protein CLOBOL_04607 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04607 [Clostridium bolteae ATCC BAA-613] # 1 39 1 39 39 62 100.0 8e-09 MNLEKEPSNEEGGITEDVNYYDGSKPKNTMELVAWFDKD >gi|157101632|gb|DS480692.1| GENE 14 6724 - 7407 527 227 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1897 NR:ns ## KEGG: Sterm_1897 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 5 227 3 213 213 150 36.0 4e-35 MISAFWEMFKPLYAVDTLEGYTENEIVYLKELFGALPQVLEDYYRAAGRTKAFHCVQDIW ILPEHFQKWEWLWEPEYLILLNENQGVCRAGIRREDLMLPDPPVYVTEDDKNWTLCAPTT SEFLSAALAYECVFTFECNPEEFYWLTEEELGLIQSKLTKLPFEITNWLCGMRITLYSNE PDNMVAVMDCGDWELHGDGELQMLYGAASEASYASLMSVLEGVGEAI >gi|157101632|gb|DS480692.1| GENE 15 7425 - 8054 209 209 aa, chain - ## HITS:1 COG:no KEGG:bpr_III108 NR:ns ## KEGG: bpr_III108 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 4 202 76 275 278 110 35.0 3e-23 MPAYFNLSVQFRRDKLYPTFVKDFYALLDEAAMKFQSGYWGFEEDSLEETTEWNQRKLEE DFNLGFTEHHSHDYKQVIYKFGGYSEVRGFWMNNYPEDGEFTHEIIIPESDVLAEGYPVR FKEERIEELLGFSKRIWQFPPVRAIQTGLEIEDDSAGLAELAQGGFPNAWPFAIVEDSRA CYEDSIYDIQPITEGKKGLLFRRTGADNE >gi|157101632|gb|DS480692.1| GENE 16 8447 - 8884 107 145 aa, chain + ## HITS:1 COG:no KEGG:Mlab_1680 NR:ns ## KEGG: Mlab_1680 # Name: not_defined # Def: SufBD protein # Organism: M.labreanum # Pathway: not_defined # 9 143 13 147 160 116 45.0 2e-25 MKENITATLTGKDDKYACALADKIISESLETDEWYEYFDDFASLLNYPKSLVRNRALYIL AANAQWDDENRFDLIISDFLAHITDEKPITARQCIKALAQVGSAKPQYIPVILSCLRSAD LSKYKDSMRPLIEKDIAETEKKLTT >gi|157101632|gb|DS480692.1| GENE 17 9325 - 9573 134 82 aa, chain + ## HITS:1 COG:no KEGG:ELI_1490 NR:ns ## KEGG: ELI_1490 # Name: not_defined 
# Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 7 82 4 79 80 86 53.0 4e-16 MSDRVILIDNIDEIHTTEMGADRIRRNLKLENVDVVEYCKDKIMDENCHIYRQGKNWYCE IDSVKITVNYYSYTIITAHIIK >gi|157101632|gb|DS480692.1| GENE 18 9662 - 10486 659 274 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939724|ref|ZP_02087071.1| ## NR: gi|160939724|ref|ZP_02087071.1| hypothetical protein CLOBOL_04615 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04615 [Clostridium bolteae ATCC BAA-613] # 1 274 1 274 274 515 100.0 1e-144 MLVTIQQTKSNTENLFEVTSNGQLLFQAKAPWMKISLPFNAEDLRELTFSNPAGEIVYTT RYRFIDNLVEELIPFKYLLTKGQRFGQFEIIGRNGSEGSFYVMQNGLFDSKFCIECMGNA YLGYSLDRGRNNYVSIYDDEKQIAQITKPLTVTDNLDVYFLHIKDEYASIIPVLSFFTVY YDYRKYNHSGELTKNTVQISNSYTYGKNNDKYNPNWIAKEFGQQAADELEQKLRKIREQG SAQAKKIVKLVGLAYLVLILLAIVLFVVFKSILG >gi|157101632|gb|DS480692.1| GENE 19 10525 - 10635 61 36 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939725|ref|ZP_02087072.1| ## NR: gi|160939725|ref|ZP_02087072.1| hypothetical protein CLOBOL_04616 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04616 [Clostridium bolteae ATCC BAA-613] # 1 36 1 36 36 65 100.0 1e-09 MTIPNVFFKNGIWIGEIGHNLWDDFYENTILLEFRG >gi|157101632|gb|DS480692.1| GENE 20 10642 - 11013 185 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939726|ref|ZP_02087073.1| ## NR: gi|160939726|ref|ZP_02087073.1| hypothetical protein CLOBOL_04617 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04617 [Clostridium bolteae ATCC BAA-613] # 1 123 1 123 123 226 100.0 4e-58 MVKPISYISYGENEKKIIKAGIVEIRKVLMGNDKNKKRSLLFALDWFMDPYFKQDISDIH NELVELLQTVVISSTDDDVSEDALQLLCDYEWPPFEILEKNINRVSQRLKPDVLYAVNMD KEI >gi|157101632|gb|DS480692.1| GENE 21 11243 - 11764 183 173 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|228000081|ref|ZP_04047083.1| acetyltransferase, ribosomal protein N-acetylase [Brachyspira murdochii DSM 12563] # 17 146 4 140 166 75 30 2e-12 MKYSKTIILKNGVECCLRNGIESDGQAVWDCFNLTHGQTDYLLSYPDENSFDVMQEGQFL 
KKKSESSNEIEIVAVVGNVVVGTAGIEAIGSKYKVRHRAEFGISVAKDFWGLGIGQALRA AGYIQLELSVVAENERALSMYEKAGFVKYGINPKGFNSRVTGFQEVIYMRLEL >gi|157101632|gb|DS480692.1| GENE 22 11839 - 11991 146 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939729|ref|ZP_02087076.1| ## NR: gi|160939729|ref|ZP_02087076.1| hypothetical protein CLOBOL_04620 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04620 [Clostridium bolteae ATCC BAA-613] # 1 50 1 50 50 76 100.0 6e-13 MDNLAIHFLNDDEQLVTQEAVNKLVFNNTGANDEAQLEALAEHLGAMEWK >gi|157101632|gb|DS480692.1| GENE 23 12014 - 12526 277 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939730|ref|ZP_02087077.1| ## NR: gi|160939730|ref|ZP_02087077.1| hypothetical protein CLOBOL_04621 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04621 [Clostridium bolteae ATCC BAA-613] # 1 170 1 170 170 338 100.0 6e-92 MPIQNTKMSAKERRIFRYTRADAWETTISQVDVMEGVEKGNVRCLYFVTPRLHANTIVDS CTITISQNHIDMIRDAMLTRLEVCKYKEIEFPVVLDGFINTFEFAPEESFSNIITVFNIS AFRDNVDVAIIGNPPYRGKEVLKLYDDISKILSDNGVSQKYLALDSSIFP >gi|157101632|gb|DS480692.1| GENE 24 13013 - 13588 409 191 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939732|ref|ZP_02087079.1| ## NR: gi|160939732|ref|ZP_02087079.1| hypothetical protein CLOBOL_04623 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04623 [Clostridium bolteae ATCC BAA-613] # 1 191 9 199 199 370 100.0 1e-101 MDYDNFLLVAQMAVDLGCTVVKEDLKLGKVIESKDAGIITPYGTSNYPGYYFHLPEAGAI EVLTVNGKECLDYGYTATGNALIEAGYSSIVNEPAGVAGSRQKKEIHRARIYCISGYYDE NEKYIQRPDCLTKVYHSLVRYVKKIAPYTEFTVILISMKDENYGEEYEYRHKEYITKNCL DLINNKGYKLC >gi|157101632|gb|DS480692.1| GENE 25 13782 - 14339 -95 185 aa, chain + ## HITS:1 COG:SA2495 KEGG:ns NR:ns ## COG: SA2495 COG1396 # Protein_GI_number: 15928290 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Staphylococcus aureus N315 # 1 72 1 72 189 62 41.0 6e-10 MTFGEKLKFLRKQKGFSQERLSQQLTVSRQAISKWELGESLPDTENVIQLSKLFSVSIDF LLNDNICSEADIPAVQHNTTVLQDKFRSKILLVIGFICSAMGCAGSITLIVLSTMIKVHV 
EKKRILPDGSVQYYGGGDILGYDFWKFISEYRLQALLFICIILIIAGIAIIWVSKKKAKT KSLLF >gi|157101632|gb|DS480692.1| GENE 26 14430 - 15059 324 209 aa, chain + ## HITS:1 COG:MA1703 KEGG:ns NR:ns ## COG: MA1703 COG4845 # Protein_GI_number: 20090555 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 1 207 1 207 209 271 56.0 6e-73 MANKYKVIDEKTWNRAMHCMIFRNSVEPAFCVTFEADITRFKRVVKEQGVSFTLAMVYAV CKCANDIDAFRYRFVDGQIVLFDKIDTAFTYLNQETELFKVVNVPMIDDLKEYCEFAETI AKEQKEYFTGPLGNDVFQCSPMPWVTYTHISHTNSGKKDNATPLFDWGKFYEKNGKIVIP ISVQAHHSFVDGIHIGKFVDKLQRFFDRC >gi|157101632|gb|DS480692.1| GENE 27 15408 - 15896 251 162 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939737|ref|ZP_02087084.1| ## NR: gi|160939737|ref|ZP_02087084.1| hypothetical protein CLOBOL_04628 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04628 [Clostridium bolteae ATCC BAA-613] # 1 162 1 162 162 284 100.0 1e-75 MKLKIIIGLYAGILLLSGCASNTSVVAGQSAAVNAESATKGINNEEPNVNDAHLDDTIYI GEYLDSDVNEPNLEIARGDDNKYIVQIGIYRLTSLSDGIGELTADGMNFTATDAAGNPIR GIITADGQAATVTFTDSTWDYLQNGSAFQYTKSSDTSNIWNE >gi|157101632|gb|DS480692.1| GENE 28 16054 - 16305 96 83 aa, chain + ## HITS:1 COG:CAC0545 KEGG:ns NR:ns ## COG: CAC0545 COG4443 # Protein_GI_number: 15893835 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 14 82 6 74 74 72 47.0 3e-13 MKIVIYLGVLHMKRYEIYKHIGNLSPLNNGWIKELNLISWDNKEPVYDIRTWNSEHTEYG KGVTITAGQMVVLKNLLNEMSIF >gi|157101632|gb|DS480692.1| GENE 29 16514 - 16774 100 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939740|ref|ZP_02087087.1| ## NR: gi|160939740|ref|ZP_02087087.1| hypothetical protein CLOBOL_04631 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04631 [Clostridium bolteae ATCC BAA-613] # 1 86 1 86 86 145 100.0 1e-33 MDEFQSMVEETKALVQKEIKNKDNRPDFILEKQLYLILEELDKMERIRDIHLFHPYYPKG IADSWDYSNPLAIRLLELLESYRELQ >gi|157101632|gb|DS480692.1| GENE 30 16843 - 17091 194 82 
aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939741|ref|ZP_02087088.1| ## NR: gi|160939741|ref|ZP_02087088.1| hypothetical protein CLOBOL_04632 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04632 [Clostridium bolteae ATCC BAA-613] # 1 82 23 104 104 157 100.0 2e-37 MLIYIEMYPKDRLLNGPKCSVSELKKRLAKILAEAETKDFISIFCARYNFEEMPLDNVPI NENIEVDYYMDIDAGLIHKPSR >gi|157101632|gb|DS480692.1| GENE 31 17170 - 17637 317 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939742|ref|ZP_02087089.1| ## NR: gi|160939742|ref|ZP_02087089.1| hypothetical protein CLOBOL_04633 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04633 [Clostridium bolteae ATCC BAA-613] # 1 155 1 155 155 302 100.0 6e-81 MTERQKLIAYKQTVQAAKNKLKQMSGISILELLEQNENPGLNEMFSAYTTTFGNDIEPYS KLSYQSPAETVYSWLIGTMDLKENDEYYFYCGIWCRIKLLDLRLAIPSLWLHNGNTKGFL LAETDLNRMLEAGSDSRDEYHYLIDIWNYSEQRNY >gi|157101632|gb|DS480692.1| GENE 32 17665 - 18057 192 130 aa, chain + ## HITS:1 COG:no KEGG:TDE1820 NR:ns ## KEGG: TDE1820 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 1 130 1 130 131 230 79.0 2e-59 MKYKVKITVIDKKLYPELQQQYCHDPNSGICPCYNVGDEFVFYRDDERDDFWHMGLNTLI KTSGNADTVAGGPKMPFCSEAWDAISHYIYTGLQGGSIMKGWMREENTMITCCSDGTRPV IFKIERIDYE >gi|157101632|gb|DS480692.1| GENE 33 18078 - 18455 117 125 aa, chain + ## HITS:1 COG:no KEGG:Snas_4048 NR:ns ## KEGG: Snas_4048 # Name: not_defined # Def: hypothetical protein # Organism: S.nassauensis # Pathway: not_defined # 10 123 148 259 259 129 51.0 5e-29 MQPEHRPLGGKMILYRPVGSKELELIKKSNYRRFPPRLVEQPIFYPVLNEQYATEIASSW NVKYNEDHRGYVTKFEVDDQYCRQFEVHQVGGPHHKELWVPAEKLDEFNEHIIGEIHIIS EFSDR >gi|157101632|gb|DS480692.1| GENE 34 18548 - 18928 73 126 aa, chain + ## HITS:1 COG:no KEGG:CD1006 NR:ns ## KEGG: CD1006 # Name: not_defined # Def: acetyltransferase # Organism: C.difficile # Pathway: not_defined # 1 123 2 124 129 219 83.0 2e-56 MEYKVNDQGLNASVFIPFVNKVWPGDYDEEKTQSALSKTLNISAYENNVLVGCLRILSDG 
YFFGTITELLVLPEYQKRGIGSKLLQLAKDNTPTMLYFGAQPGVEPFYERNGCQRSLQSY TIRGTK >gi|157101632|gb|DS480692.1| GENE 35 18925 - 19197 102 90 aa, chain + ## HITS:1 COG:no KEGG:TDE0254 NR:ns ## KEGG: TDE0254 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 4 87 94 177 192 115 71.0 8e-25 MTNYTFEEIKGLLLKSIQEHDFESELRLCFHDNPNEYMIIIYDDHCSFQRCGNPKEASGE YNYKSLDELYNAQQVDGIVLERDWEKIKEL >gi|157101632|gb|DS480692.1| GENE 36 19248 - 19595 135 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939748|ref|ZP_02087095.1| ## NR: gi|160939748|ref|ZP_02087095.1| hypothetical protein CLOBOL_04639 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04639 [Clostridium bolteae ATCC BAA-613] # 1 115 9 123 123 227 100.0 2e-58 MTAEEVFCELSDKYGDDFNWHMLPLSNHTFTAELKKEIGKNHFLYHKQIWAVAKCDSNDD VLYLADNGGGADIYYIFHLTYSECNTDGFPRYKKFEGINAVKEYIEQSLNQDSSF >gi|157101632|gb|DS480692.1| GENE 37 20066 - 20272 227 68 aa, chain + ## HITS:1 COG:no KEGG:Clole_2891 NR:ns ## KEGG: Clole_2891 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 68 1 68 68 112 85.0 6e-24 MSKYNTLWEYVQKNGSPSFKLTFEEIQEIAGIPIDHSFLNYKKELTDYGYQVGKISMKEQ TVIFNRMD >gi|157101632|gb|DS480692.1| GENE 38 20215 - 21186 405 323 aa, chain - ## HITS:1 COG:no KEGG:Nther_0246 NR:ns ## KEGG: Nther_0246 # Name: not_defined # Def: integrase catalytic region # Organism: N.thermophilus # Pathway: not_defined # 41 263 59 325 717 87 26.0 6e-16 MNGELFKRLIALDHGAWVISYDEPGAPQYITAAFLEACEKVEMPEGYRVALEQAKHLTEA EMKRLALIEPLLEDSIYIVDGKSRLAMAKRIAEENGTTRKRILGLYYKYLARLVLMEKGG RERGKDRDVRNFDWAIRKFYFSAKKMSLRDTYDHMLASRYMTPDGKLMEVVPSWYSFEHY YYRHGYSKSIKCSVSRGGLSNYQRNKRPLYGSTMRWKDRVGAFQMDATEADIYLVSRFDR SVVVGRPSIYMAVDTATQLVAGISSVSPYLCSPTIFPTRHHCHSCPAGFSSVDSISIYHL EHPVYQSILLKMTVCSFMEILPT >gi|157101632|gb|DS480692.1| GENE 39 21188 - 21532 319 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939751|ref|ZP_02087098.1| ## NR: gi|160939751|ref|ZP_02087098.1| hypothetical protein CLOBOL_04642 
[Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04642 [Clostridium bolteae ATCC BAA-613] # 1 114 22 135 135 228 100.0 1e-58 MSVHSKDAKAYADLLEASPHVVKYKCILPLDPGFMEEVSRLGIRPECFGIAWASDFWIEN ADGTTGIREVVSVAQLVKKSAIQKLELSRRYWKQLDVDDWKIVVIDRGGDEVVF >gi|157101632|gb|DS480692.1| GENE 40 21620 - 22159 218 179 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939752|ref|ZP_02087099.1| ## NR: gi|160939752|ref|ZP_02087099.1| hypothetical protein CLOBOL_04643 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04643 [Clostridium bolteae ATCC BAA-613] # 1 179 14 192 192 354 100.0 1e-96 MERRQINSLSGIDIPSLGIEDIRFMYGTGISEPDGFGRQKKRSLRNGVQTQIGDIREDLW YQLAEQVIGLHGETWLLEALEVWCREEMHYPRYGFDLHKEALELHSHRIFDSEGWIDYIA FNRRFRPEVVEGRRFPRIRVGCCDREISEATEEMLERRDVAPCPKCGAFRSYEVIERPE >gi|157101632|gb|DS480692.1| GENE 41 22867 - 23115 268 82 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939755|ref|ZP_02087102.1| ## NR: gi|160939755|ref|ZP_02087102.1| hypothetical protein CLOBOL_04646 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04646 [Clostridium bolteae ATCC BAA-613] # 1 82 1 82 82 154 100.0 3e-36 MEWDSLSSEAKRILSLIKDPVTGDYHLDANIFVQLGAMFENGEVKEMITPEVFDEIVSFT RDDDKIMALGNCDGELLISIKK >gi|157101632|gb|DS480692.1| GENE 42 23309 - 23545 105 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939756|ref|ZP_02087103.1| ## NR: gi|160939756|ref|ZP_02087103.1| hypothetical protein CLOBOL_04647 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04647 [Clostridium bolteae ATCC BAA-613] # 1 78 1 78 78 122 100.0 7e-27 MIKHLQDFFETSNTISTIWLGILILIDSLAILTVLISVIMFHGDIPNLIQVYFRMSVYGV TVIAPFHLLYEHKLQKRI >gi|157101632|gb|DS480692.1| GENE 43 23734 - 23928 99 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939757|ref|ZP_02087104.1| ## NR: gi|160939757|ref|ZP_02087104.1| hypothetical protein CLOBOL_04648 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04648 [Clostridium bolteae ATCC BAA-613] # 1 64 6 69 69 112 100.0 6e-24 
MSEYKVKNSKGEYFVSWVEKTPVFNRNSMFGERFSETGASNLIKLLSEENIRGCESVPVS ASGI >gi|157101632|gb|DS480692.1| GENE 44 24004 - 24297 153 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939758|ref|ZP_02087105.1| ## NR: gi|160939758|ref|ZP_02087105.1| hypothetical protein CLOBOL_04649 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04649 [Clostridium bolteae ATCC BAA-613] # 1 97 1 97 97 173 100.0 3e-42 MEAGTFKDLIVEAYKKSKEGNLVGTLYGAISTSSFSDIPDIEEFLKVGLTDMLHLQSTVT GMEEDIYERTLENYKVKASERTIYIKLKDKPEQPFMY >gi|157101632|gb|DS480692.1| GENE 45 24354 - 24686 276 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939759|ref|ZP_02087106.1| ## NR: gi|160939759|ref|ZP_02087106.1| hypothetical protein CLOBOL_04650 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04650 [Clostridium bolteae ATCC BAA-613] # 1 110 1 110 110 196 100.0 4e-49 MNKWKYKLESQGRKLRELLDKDDTITTIVEIYNQMEVCLKSLLKMLDPRDLEEWKYDIES MIEDIQMACPDIEDSELIYNDEEAILNRHMKDFYDLCDSMRVWIGLGIHP >gi|157101632|gb|DS480692.1| GENE 46 24748 - 24948 209 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939760|ref|ZP_02087107.1| ## NR: gi|160939760|ref|ZP_02087107.1| hypothetical protein CLOBOL_04651 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04651 [Clostridium bolteae ATCC BAA-613] # 1 66 1 66 66 108 100.0 9e-23 MELDKKTYIGIVTYTLKSMLELSKKEKEYSLGLDLIHYYKTEIEPKNIITEAEFKSLCGD VGINLV >gi|157101632|gb|DS480692.1| GENE 47 25010 - 25210 158 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939761|ref|ZP_02087108.1| ## NR: gi|160939761|ref|ZP_02087108.1| hypothetical protein CLOBOL_04652 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04652 [Clostridium bolteae ATCC BAA-613] # 1 66 1 66 66 119 100.0 8e-26 MASFSGNIFGTEHMNDEEEMKKCSVYLQECLSKEELEQLRAEYKHSQPNVPWWKYVFSNV NVSYRK >gi|157101632|gb|DS480692.1| GENE 48 25534 - 26469 500 311 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 1 310 1 280 283 197 43 
4e-49 MAAVTAAMVKELREMTGAGMMDCKKALAATDGDMEKAVEFLREKGLAGAAKKAGRIAAEG IVATAVAADEKKAVIVEVNAETDFVAKNEKFQTYVADVAAQALTTSAKDLDAFMEERWAK DETLSVKEALASQIAIIGENMNIRRFEQVEEANGFVASYIHAGGKIGVLVDVETDVVNDD IKDMAKNVAMQAAALKPMFTSRDEVSADYIAKETEILTAAAKNEKPDANDKIIEGMVKGR INKELKETCLLDQVYVKAEDGKQSVSQYVAAVAKANGASIKVKKFIRFETGEGLEKKSED FAAEVAKQMGM >gi|157101632|gb|DS480692.1| GENE 49 26499 - 27251 1088 250 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145469|ref|ZP_04744070.1| ribosomal protein S2 [Roseburia intestinalis L1-82] # 1 249 1 245 247 423 85 1e-117 MSVISMKQLLEAGVHFGHQTRRWNPKMAPYIYTERNGIYIIDLQKSVGKVDEAYKAVSDI AADGGTILFVGTKKQAQEAIKAEAERCGMYFVNERWLGGMLTNFKTIQSRIDRLKEIETM SQDGTFDVLPKKEVIALKKEWDKLERNLGGIKDMKRLPDAIFIVDPKKEHICVQEAHTLG IPLIGIVDTNCDPEELDYVIPGNDDAIRAVKLIVSKMADAVIEAKQGEVLEGAMEIEIPA QAFETEAVEA >gi|157101632|gb|DS480692.1| GENE 50 27532 - 28320 874 262 aa, chain - ## HITS:1 COG:CAC1786 KEGG:ns NR:ns ## COG: CAC1786 COG4465 # Protein_GI_number: 15895062 # Func_class: K Transcription # Function: Pleiotropic transcriptional repressor # Organism: Clostridium acetobutylicum # 5 255 4 258 258 212 51.0 6e-55 MSVQLLDKTRKINKLLHNNNSHKVVFNDICKVLSEILLSNILVISKKGKVLGVSICSGVD EIEELIEDQVGGYVDKMLNERLLSVLSTKENVNLATLGFAEENVRKYQAIITPIDIAGER LGTLFIYKSDSQYDIDDIILSEYGTTVVGLEMMRSVNEENAEETRKVQIVKSAISTLSFS ELEAIIHIFEELDGNEGILVASKIADRVGITRSVIVNALRKFESAGVIESRSSGMKGTYI KVLNDVVFDELKTIKASNSNLK >gi|157101632|gb|DS480692.1| GENE 51 28476 - 30563 2079 695 aa, chain - ## HITS:1 COG:CAC1785_1 KEGG:ns NR:ns ## COG: CAC1785_1 COG0550 # Protein_GI_number: 15895061 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 583 1 577 578 607 55.0 1e-173 MANYLVIVESPAKVKTIKKFLGANYEVDASGGHVRDLPKSQLGFDAEHDYEPKYITIRGK GDVLAKLRKEVKKADKIYLATDPDREGEAISWHLMKALKLDELKSKDVYRISFNEITKNA VKASFKTPRELDMNLVNAQQTRRMLDRMVGYRISPLLWAKVKRGLSAGRVQSVALRMICD REDEINAFIPEEYWNLEADLLLKGNKKPLVAKYYGNRDSKGGKAVIRNKEELDQILSQLE 
GCKYEVEEVKRGERVKKAPIPFTTSTLQQEASKVLNFSTQKTMRLAQQLYEGVEVKGKGT LGLITYLRTDSTRIADEADTAARSFVKEHYGEKYVAQGSQVKKEGVKIQDAHEAIRPTDI TLTPTIVKESLQRDLFRLYQLIWKRFTASRMAPAVYETTSVKIGAGSHLFTTSASKLDFE GFMSVYVEADDEEEKNQVLGNLEKGTVLKLEQLDPSQHFTQPPAHYTEASLVKALEEQGI GRPSTYAPTITTIIARRYVVKESKNLYVTELGDVVNRIMKNSFPSIVDPNFTANMESLLD KVADGTVAWKTVVSNFYPDLDEAVKTAEKELESVKIADEVSDVVCDLCGRQMVIKYGPHG KFLACPGFPECKNTKPYLEKIGIPCPKCGKELVRRKTKKGRLYYGCEASPDCDFMSWQKP STQKCPVCGSYMVEKGNKLLCAGETCGYRMDKSEK >gi|157101632|gb|DS480692.1| GENE 52 30652 - 31824 919 390 aa, chain - ## HITS:1 COG:all1325 KEGG:ns NR:ns ## COG: all1325 COG0758 # Protein_GI_number: 17228820 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Nostoc sp. PCC 7120 # 29 387 3 371 372 189 34.0 5e-48 MGIGNSGEQESSRNAVFGPEMDLVKRHEKEYLYWLTRIPGFGAVTIRRIWEMAGSFERAY YIEGMELKKLGILQSEEKCRTYDRWKERFSGMVREYNCLQEKGIRFVTPLDREYPKRLLH IYDYPMGLYVKGELPDEKCPTAAIIGARNCSEYGRQAAGFMGKELARAGIQIVSGLALGI DGAGHEGALNGKGRTYGVLGCGVNICYPRSNYYLYEAIPFQGGILSEFGPDEEPMARNFP MRNRIISGLSDVILIMEARKKSGSLITASLGLEQGKEIFALPGRITDDLSAGCNELIQSG AGILTSPEDVLDYMGIFHEKKCINREKDQKGLAKIEKMVYSCLDSEPRHLEQIMVQTGLS AGRCMSALLELELEGFAVRTSGQNYMKTIT >gi|157101632|gb|DS480692.1| GENE 53 31815 - 33350 885 511 aa, chain - ## HITS:1 COG:aq_291 KEGG:ns NR:ns ## COG: aq_291 COG0606 # Protein_GI_number: 15605825 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Aquifex aeolicus # 1 502 1 490 492 381 42.0 1e-105 MFTRINSGGIWAVEGVLVSVEADVSDGLPGFSISGQLASEVRESQDRVRTALKNSGFRLP AKKITVNLSPAGIRKGGTAYDLPIAVAILGAFQVVGTDTLEDSMVIGELGLDGRVKPVSG VLSLAAMAREKGIKRCFLPIENVAEGLVMEGVDMVGVESLGHMAGILKATASMEPCHAPL YRPESGCQKKDGLDYREVNGQAILRRAAEVAAAGMHGLLLTGSAGTGKTMIAKRIPTILP GLTREEDIEISRIYSICGLLPAGRPLLSERPFRSPHHTITSRALAGGGVPPRPGELSLAS 
GGVLFLDELPHFSPFAVEILRQPLEERKITVTRVSGNYEFPADFMMVAAMNLCPCGFYPD RNRCNCSENQIRRYLGHISRPILERFDICAEASPVTFGELNPAQGENEDSASIRKRVEHA RKRQERRFLGSKIRFNSRMGMKEVELHCSLGPEEKDFAQRIYESGGFSARRYHKALKVAR TIADLEDSRDIKKEHLAEALGYGGLEEKIWG >gi|157101632|gb|DS480692.1| GENE 54 33591 - 34376 669 261 aa, chain - ## HITS:1 COG:no KEGG:Closa_1841 NR:ns ## KEGG: Closa_1841 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 259 3 252 259 274 50.0 3e-72 MTVYMCEPWLEGILCGVYDAWSDPAGHENVRLAVKGEYEELELFAEYRDVQICPEKAEKV IRSVCAKLSQEIYRKVYTASLSQDRERADKIYRFLIQAFRHGPSVLDMLQLPACYDIFGL CRNVYNENHLLTEFLRFSETPGGILVSRIGPKNDIMTLLAPHFADRLPEENWVIYDENHK RAAVHPAGRSWFLVNQPLEPEQENSDRGWSRWLRERTDEETYEDMWRMFCDTIAIRERRN PVCQRTHLPLRFRPYMTEFQK >gi|157101632|gb|DS480692.1| GENE 55 34397 - 35944 1349 515 aa, chain - ## HITS:1 COG:FN0954 KEGG:ns NR:ns ## COG: FN0954 COG4277 # Protein_GI_number: 19704289 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with the Helix-hairpin-helix motif # Organism: Fusobacterium nucleatum # 8 494 4 414 415 392 44.0 1e-109 MLIQENLSVQEKLKILTDAAKYDVACTSSGVRRKGKAGSIGNTSEAGICHSFSADGRCIS LLKILFTNQCIYDCKYCVNRCSNDVVRTAFTPDEVCKLTIEFYRRNYIEGLFLSSGILYN ANHTMELIYETLHKLRNQWHFNGYIHVKTIPGADSELVERMGFLADRMSINLELPTAEGL KNLAPGKTRDKILKPMRQVQQGIAVSSHLLGYDREFKKPYATERAGFAGGAALIRPELAQ PGALLPGNHGLSGGGQDGLDDGQKAVQSAGGETRYLPDSRYARTPFVPAGQSTQMIVGAT PENDYQMMAVTQALYENYGLKRVFYSAYVPVNDDSCLPSVTARPPLLREHRLYQADWLLR FYGFKAEELLSEEQPNFNVFLDPKCDWALRHLELFPVEINRASYGELLRVPGMGVKSVQR IVNARKQGGLRFEDVKKMGVVLKRARYFITCNGRMMEGARLDQDYITSCLVGDERRKAWD IENRDSFRQLTLFDDMHLEMPVTGEDHYASVTGSL >gi|157101632|gb|DS480692.1| GENE 56 36268 - 37284 1295 338 aa, chain - ## HITS:1 COG:BS_galE KEGG:ns NR:ns ## COG: BS_galE COG1087 # Protein_GI_number: 16080937 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Bacillus subtilis # 1 337 1 336 339 491 69.0 1e-139 
MAILVTGGAGYIGSHTCVELLTAGYDVVVVDNLYNSSEKALERVEKITGRKVKFYEADLL DQPALKEVFDKEDIDSVIHFAGLKAVGESVRKPLEYYHNNITGTLILCDEMRSHGVKNIV FSSSATVYGDPAEIPITENCPKGEITNPYGRTKGMLEQILTDLHTADPEWNVVLLRYFNP IGAHESGLIGEDPKGIPNNLVPYIAQVAVGKLDHLNVFGDDYDTPDGTGVRDYIHVVDLA KGHVKAVQKLQDKEGVSIYNLGTGVGYSVLDVLHAYEKACGKTLKYEIQPRREGDVATCY SDSAKAKRELGWVAEKGIEEMCADSWKWQSMNPNGYRD >gi|157101632|gb|DS480692.1| GENE 57 37442 - 38071 698 209 aa, chain - ## HITS:1 COG:CAC1689 KEGG:ns NR:ns ## COG: CAC1689 COG1191 # Protein_GI_number: 15894966 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Clostridium acetobutylicum # 3 201 26 224 234 212 54.0 3e-55 MKTFPKPLSAGEEKEYLERCKEGDQSARDMLIEHNMRLVAHVVKKYQGSDYDMEDLLSVG TIGLIKAVNTFNVDKGSRLATYAARCVENEILMLLRASRKYSKEVSLFEPIGVDKDGETV SLVDVIEMDNKETLDTMILKQDVKELYEAFESCLTETEKTVLGMRYGLYRGKEHTQREIA GKLGISRSYVSRIEKKAIEKIRTEFAKHG >gi|157101632|gb|DS480692.1| GENE 58 38142 - 40073 1739 643 aa, chain - ## HITS:1 COG:mll7848 KEGG:ns NR:ns ## COG: mll7848 COG1368 # Protein_GI_number: 13476512 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Mesorhizobium loti # 184 583 228 593 653 154 30.0 5e-37 MVTKENKIKIIKYIWYTVGTVCLLMAPVASYFMFEYVTGNLDTVPYYMAALNIGWIYVLY LALFAVTGRTRIAVPAASCILYVISLAETFVVAFRGRPIMLWDVMAFRTAMTVSGNYEFF ITRQMKAAFVLLLLMNVFLWFIPRRVKGWKLHLAVGGGITASIAAYGAWFFAVLVPSWGL GINMWAINGTYQEYGYVLSTAVSLQYVVKKPPQGYSNARLKEIYSRLEENIYAEAGNGQA YPEKSRAYPENPENPENPLEDGQPVTPVNLICIMNESLAELKTAGDFTTNTEYFPFMDSL EENTVRGSLCVPVFGSMTSNTEFEFLTGDSMALLPANSIAYQFNVKPGTYSMVSTLKDQG YYSVAMHPYPGENWNRVECYQNMGFDAFLDQEFYEGSEELRNYVSDEADYQKLIQVVEAK ENPEDKLFIFNVTMQNHGGYEGTYDNFEQEVWLTGEYEGKYPKTDQYLSLMKRSDQAFQY LVEYFSLTDQPTMIVMFGDHQPSVEDEFYDDIAGMPSSEVPAQEHLMWYETPFIIWTNYS MPSGNMGRLGAVYLSSEVLWRANLEMAPYNRFLLAMREELPVVHPLGCYDREGTYYYWAK AESERCPYQDTVLDYEALVYNHSLDRKKVKELFVIDSENSISQ >gi|157101632|gb|DS480692.1| GENE 59 40264 - 42726 1469 820 aa, chain - ## 
HITS:1 COG:no KEGG:no NR:gi|160939775|ref|ZP_02087122.1| ## NR: gi|160939775|ref|ZP_02087122.1| hypothetical protein CLOBOL_04666 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04666 [Clostridium bolteae ATCC BAA-613] # 1 820 1 820 820 1630 100.0 0 MGYRLIISVSILSLLLTACFGAADGADAAAGADAAAGQPVKEQPELSVPPLTDSGLSMLV EDNTCFPGLPGDLEPADDCYGFPVELVSDLKYRMNQEDADACIEYLDGLNQQDFRIANHE FLSLLDMTDEMTNEMAITDFNENEAEYSCYRCDTDGDGTDELIAFEKDGRGTISYVHMMV EGENGYVFFNSQYMDDVRFFTLFQYESAFYYGASCFDDSTGTVNLQIYMLDRTIFHLANI TDSICFNRTFRISEPHLLYRNEQHPDTHIVQDYLDDVIYDVIYANRAKNTFYGDEWEYRN DERPFLYDDTAYWEDAEDADAINWSHMYIMDINNDGKDEIFTKTPPGYRQPFEHSIKWYD KDFKQCPAAVEAWRDNHYRLVDMWFKHLNGKAAAFTLYQEIQSGQRYLVDARIQEDGQTT ILMDYMAVLEPDDIRVTEFGVGTDVSIYKPLTYQDPARDSAFPYNLPDVAEVFLLKVQGS SAAIQNNCADIPDDFTEILAGLLADKALDRLERGLGMEMYQTDQGDFERKYGTYLSQARR TLNGGTQYVCQISLGRTDYFLLLQYSVTGKIGDLHIYRGTAGGLAYVSTYQPEYLGGKII CHSGGLYIVERPFPGGERPRYLDNIMIHRLIPEGGQDAVIEAVPESFMWQKIYDNRAAYA GDVTGYLEEIKTDLADLSYRSNDEDVYTGDEDPEIELNQFLRLGSVVSPYETFYKIDFNN DRKDEYISKEKSYNHIYAGIYQFTGRGIIGLEYDETAEEAWGHPVQIWFKEIQGKVFTFR LFYHEDSYYVLNVSLIEGTRITQVQTHIIVPKVNYIMNEY >gi|157101632|gb|DS480692.1| GENE 60 42773 - 44110 1494 445 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 434 1 434 447 245 36.0 2e-64 MITDLTAEHPKKTLWRFALPMFISVMFQQFYNIADSSIAGRFAGEDALAAVGASYPITVI FMAFAVGSNLGASVVVSRLFGAKDIGRMKTAVYTAFIACGAMSLILTVLGYAFCGDMMRL IHTPDNIFADGELYLKIYVYGLTFLFLYNVCTGIFTALGDSRTPLYFLIGSSLGNIALDY WFVAWLGLGVAGVAWATFIAQGVSAILALGTLARRLGALKTDRRTVLFSAVLLGQIAAIA VPSIMQQSVLSVGNMFVQEIVNRYGSAVIAGYSGAVKLNTFAINAFMSLGGCLSSYTAQN LGAGKRERIPLGFKTGVRLSFYATIPFFLLYFVFSRQMMGLFLGAGSTRAIESGMEFLRI VSPMYFMISIKLMTDGIIRGSGAMTYFVLATVPDLILRIIVANILTGRFGSTGIWMAWPF GWIAATLLTVIFYRRIVSGKFRIRI >gi|157101632|gb|DS480692.1| GENE 61 44338 - 45516 1426 392 aa, chain - ## HITS:1 COG:CAC2445 KEGG:ns NR:ns ## COG: CAC2445 COG0138 # 
Protein_GI_number: 15895710 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Clostridium acetobutylicum # 3 392 5 391 391 523 63.0 1e-148 MKELALKYGCNPNQKPSRIFMEGDRELPITVLSGRPGYINFMDALNGWQLVKELRAATGL PAAASFKHVSPAGAAVGLPLDETLRKIYWVDDMGELSPLASAYARARGADRMSSFGDFIS LSDVCDADTARIIKREVSDGVIAPGYEPEALEILKSKKNGNYNVIQIDPDYEPEALERKQ VFGITFEQGHNNLEINRDLLEDIVTENRELPDSAKIDLMISLITLKYTQSNSVCYVKGGQ AIGIGAGQQSRVHCTRLAGSKADNWYLRQAPQVMNLPFVDSIKRADRDNAIDVYMGDDYM DVLADGRWEKTFKVKPPVFTAEEKRAWLDTMTDVALGSDAFFPFGDNIDRASKSGVKYVA QPGGSVRDDQVIETCNQYGMVMAFTGIRLFHH >gi|157101632|gb|DS480692.1| GENE 62 45533 - 46246 867 237 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1566 NR:ns ## KEGG: EUBREC_1566 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 237 50 286 286 398 81.0 1e-110 MNQMSLEQELKSNAYPGRGIVIGRSEDGTKAVTAYFIMGRSANSRNRVFAEDGEGIRTQA FDPAKLEDPSLIIYAPVRVLGNKTIVTNGDQTDTIYDGMDHQMTFEQSLRCREFEPDAPN YTPRISGIMHIENGNYNYAMSILKSDNGNPDSCLRFTYAYQSPAAGEGRFIHTYMHDGNP LPSFEGEPKKVGISGDIDAFTDLVWNSLNEDNKVSLFVRFIDIAAGTYETRIVNKNK >gi|157101632|gb|DS480692.1| GENE 63 46506 - 47894 178 462 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 308 462 268 413 413 73 28 8e-12 MRFPDLVIMSINNLRRRKLRTVLTVLGVIIGTASIVVMVSLGIGLDQMFMEQISSWGSLT TINVYSGSSYGNAVMIGGGGSSKSSDPTYITDDVIEQFGRLPHVSGVSPVLEINVVMRQG AYEAQYLSLMGVSQSFLEQIPLGEGRLPEPGEMGMVVGNTVQQNFYNSKTGRGYWDTGEL PEVDFMNKPLFVIFDVDAYFQTRNGSVSSSEDGQTPVKPPKKYMMNATGMVEGGPDDWNS YSYNVYTDIDGLKAQLRKVFKKGAVIPGQPTNKKGKPLNYLTYNRAQIFVDDMEYVPEVQ KQIADMGFQVSSQADWMESTKQQSSMIQAVLGGIGAVSLFVAAIGIANTMMMSIYERTKE IGVMKVLGCDMGNIRNMFLIESGFIGFMGGIVGILLSYGISVVINRFVNLEEMNGLTGNL SRIPPWLSVAAVVFAIFVGMAAGFMPAMRAMKLSPLAAIRNE >gi|157101632|gb|DS480692.1| GENE 64 47881 - 49719 2070 612 aa, chain - ## HITS:1 COG:no KEGG:Closa_2183 NR:ns ## KEGG: Closa_2183 # Name: not_defined # 
Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 8 604 6 601 611 545 50.0 1e-153 MRANKWCRGLTAFLAASMLAGMYPSSAAASDYKLGYNTNASENDYVFTTKSPKGTIGKSM SIPFRIRATDEDMENLRISLLETNEFQQIEERGNGDYSVDYYPFEIMETTFVAKNVGTIK KGNVKSVSLSARVRRDALQGYYSIPVLLEWDGGSDTDYINVWISTSSTSGADEEEDKKEG NYFVVGENQPTPRGVYPNVMDYTVNFRNKRETTAQDVTVSMQLSEDDAKFPFEINDGNYD RTFERVQPGETVSAPYSMAIRKDSYTGYYPIKYTITFRLSSEGDLHTEEGTFYVHITSKD KEDDLGDFNANDRTRARLIVESYHTVPEEIYAGDEFELILNMKNASTSVAASNILFNLES EKVSDSAVFTTESGTSSLVVDNMAPGQTTEVRARFTARAGVDQRSYAITVKEKYDSPEFK NAEESIVVDIPVKQYARLSTSTIDVMPDSLTVGSESNVMFGINNTGKVILYNVTVTFEAD SIKTTDAYVGNIKPGETGNVDTMLTGVAPTLDEGTVRIRIDYEDENGVPAEPVEKELTLM VMEEMEQNWDDMGAAMDAGSMEAEAAPSFWGKYKFLVIAGAAAAAGIAAAVVIRIRKKRK AAREEDVDDEIS >gi|157101632|gb|DS480692.1| GENE 65 49706 - 50485 379 259 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 14 245 1 228 245 150 37 4e-35 MSSVPTAKENRPVVIRVKNLYKIYRVGDIKVRALDGLDFDIYKGEFVAIVGASGSGKSTL LNMLAGLEKPTKGEIEIGRVHIEKMTERQLVSFRREKVGFIFQAYNLLNTMNALENVALP LSFRGVSRRQRLKEARKYMDLVGVGGQARHMPNQMSGGQQQRVGIARALVVNPQIIFADE PTGNLDSKTTMEVLRLMQKIVREQDQTLVMVTHDNNLASYADRRIRIMDGRIVGIETGGR EALYEGNAEEKECNENESK >gi|157101632|gb|DS480692.1| GENE 66 50818 - 52218 687 466 aa, chain - ## HITS:1 COG:BS_lplD KEGG:ns NR:ns ## COG: BS_lplD COG1486 # Protein_GI_number: 16077780 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidases/6-phospho-beta-glucosidases, family 4 of glycosyl hydrolases # Organism: Bacillus subtilis # 8 455 7 441 446 486 52.0 1e-137 MRYAKDKVSDIKIAYIGGGSRGWAWTFMTDLALEEQLSGTINLYDIDFKAAKNNEIIGNL ISKREDTLGKWNYRTSTSLEDALTDCDFVVISILPGTFDEMESDVHLPERLGIYQSVGDT AGPGGIIRGLRTIPMFAEIAEAIKAYSPEAWVINYTNPMSLCVKTLYHVFPEIKAFGCCH EVFGTQKVLKGICEACCDEKNIDWHDICVNVLGINHFTWFSSASYKGIDLFPVYDAYIAE HFEDGYHDPDRNWMNSSFNCAHRVKFDLFKRYGLIAAAGDRHLAEFMPGNEYLNDPDTVK KWKFGLTTVNWRKEDLKRRLERSRRLADNEEAVELKPTGEEGILLIKALCGLGRLVSNVN 
IPNTYGQIPNLSRNAVVETNAVFERDAIRPMIAGDIPESIKELILPHLDNHEYTLQAALK YDEDLVVKAFLNDPLVKGKRCTEEDVKMLAHDMIQNTLKYLPDGWK >gi|157101632|gb|DS480692.1| GENE 67 52327 - 53358 61 343 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939784|ref|ZP_02087131.1| ## NR: gi|160939784|ref|ZP_02087131.1| hypothetical protein CLOBOL_04675 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04675 [Clostridium bolteae ATCC BAA-613] # 1 343 1 343 343 730 100.0 0 MKNIEGIIEGARKVIQSHCVDGGSGYSRWIWQNATNDRELVINPYGCADAANIRYMIGEF PATSAERENWIINLQGFQNPISGLFEEATHHPYHTTAHCVAALELFEEKAVHPLFDMQPY LQQHNLEQFLDGLCWSEDPWLASHQGAGIYAAMVLMEEADMQWEDRYFRWLSIEADPETG FWRKGQVKPVLKADSFAGVRAKPSVFPHLAASFHYLFNHEYAHRPLPYPDKIVDTCLAIY KKNDWETLGKKVSFAEIDWVYCLNRGLRHSGHRFDEAKTALHDFALTYIDYLEHLDYETD DGFNDMHSLFGCICALAELQNALPGELVTNKPLKLVLDRRPFI >gi|157101632|gb|DS480692.1| GENE 68 53365 - 54213 409 282 aa, chain - ## HITS:1 COG:BS_yesQ KEGG:ns NR:ns ## COG: BS_yesQ COG0395 # Protein_GI_number: 16077766 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 13 281 28 296 296 301 54.0 1e-81 MGMKNKRTIGITIYHILVCIGGLIMVYPLIWMLMSSFKETNTIFATAKELLPEKATLVNY VNGWKGFAGSTFGRFFANSAIISVLSTVGAVASSAIVGYGFARCRFKGKKLLFTCMMVSM MLPFQVMMIPQFIWFKKLGWVGTYLPLIAPYFFGQGFFIFLIMQFIEGIPRELDEAAKID GCSYYGVFRQIILPLIIPALITSGIFSFIWRWDDFMSPLLYINKTTMYPISYALKLFCDP SSTSDYGAMFAMATLSLLPAVIIFITLQKYLVEGIATSGIKG >gi|157101632|gb|DS480692.1| GENE 69 54217 - 55125 426 302 aa, chain - ## HITS:1 COG:BS_yesP KEGG:ns NR:ns ## COG: BS_yesP COG1175 # Protein_GI_number: 16077765 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 14 302 17 307 309 301 56.0 7e-82 MKENKLTLRQWFNKESVAGSICTLPFAVGFLCFTLVPICMSLYYSLTDYNILSQPKFIGL ANYIEMFTTDKTYIKSLFVTIFYTFISVPMRLVFALAVAVILVRNTRMTPIYRAVYYLPS IIGGSVAVSILWKRIFATDGVINALLAAVGIDTQFAWLGNTKTAIWTLILLAVWQFGSSM 
LIFLSALKQIPNSLYESARVDGASRPYIFWKITLPMLTPTIFFNLIMQMISGFLAFSQCY IITQGKPMNSTLFYTVYMYQQSFEFYRMGYGAAMAWVMLFIVGTVTVILFKTKKKWVFSE GE >gi|157101632|gb|DS480692.1| GENE 70 55159 - 56508 732 449 aa, chain - ## HITS:1 COG:BS_yesO KEGG:ns NR:ns ## COG: BS_yesO COG1653 # Protein_GI_number: 16077764 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 42 442 9 408 412 239 34.0 1e-62 MLKRNLAIGMAIVLSAGSLFGCGSSESNGGNTSAADTPAVTENAADASGDMVLRFSWWGN QVRNEKTDKVLKMYSEDNPGISFQEEINEFSSYWDRLAAQSAGNDLPDIIQMDYRYIDQY VSNGLLLDLTPYVENGLLDLSNLADSTVDSGKIGGKLYGICNGLNAPMMIYNKTITDNAG IEIGDEITLDHYIEYSKLIYEKTGTPTNIAYGAGTKYLQYYMRDRGEKLYNNQGGLGGQE STFADFFRLYENAIKEGWHTDPAAFVEISTTSIEEDAIHSGKSWSTIVWSNQYAAAMNSK PEEYEYAVCTWPANNIKQADYTKPSQFFSVSSNSKNPEEAVKVINYLTNSKECNDVLLGE RGVPASSVIAEAITPKLGQDEQLVIEFVTKVSENCSALDPADPVGSGEITNLIDSLTEEV CYGSKTADEAAAEFYEKGNAILLKGYSGQ >gi|157101632|gb|DS480692.1| GENE 71 56693 - 58435 830 580 aa, chain + ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 1 573 1 571 577 232 30.0 1e-60 MYKKLHNWLIDTSLKTKLLLFNCLITLFIGISSFLAVNIFMKYNEELLYNTTASVLTFFS SEIGNSFDSMTTSSSVLIADNTIQGSLYDLKYSPSSLKTAVSYRSLYNSILSYSYTIPHV ESICLYTPTSTIWSNNHQSSLSVSQLEYIKEYTDEQKGSPCFLVLDKYNNQLVLARQIRR IAGTNLEPLGTLVIFINTESLIKASTTHNTFDDCSYILSADAKTVYRSDAPTPENTFDIL NDSYKVIPFKGHFYFVLKSMIPGYGWEFECWMNYDKTHGSILMGHMLFLLALACCIILSL LASGVLSQNISKHIFALVKKMKTFDSESDVIPQTNYDYSTRKDEIGIVHRNFDAMAREQQ RLIRDNYQAEMLMKNAQLKALEQQINPHFLYNTLESINWLAKSKHETEISKIVESLGNLL HATLDSRSRMIPLEHELELIHNYITIQEFRFGDRLKFHLSIEQELSHIPIPKLSIQPLVE NAIKYALEEIADECNIYITILHTEQLIYIYVKNDGSQFELGLLEKLRANPKNSRGFGIGL INIDTRIKLTYGENYGLDVYNENNKAVARISIPFIQDEMN >gi|157101632|gb|DS480692.1| GENE 72 58681 - 60201 701 506 aa, chain + ## HITS:1 COG:BS_yesN KEGG:ns NR:ns ## COG: BS_yesN 
COG4753 # Protein_GI_number: 16077763 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus subtilis # 1 146 1 149 368 149 45.0 2e-35 MLKLVIADDEKIIRETIRNFIDWQSIGIEVVGVCSNGVEAYETILDYYPDIVLTDIKMPG FSGLELVKKIKEIDNNIQFIILSGYDSFEYAKEAMQFGIHHYLLKPCNEHQIIDAINAVK EDCYKQRSFLAMQKQQKDNEKNLAQNVMKNLIFECISSESSMLSLTKNYEQLLDFSYTEY KLYYIFYLEKKHLGDCLEQLSAYFKSNFPSLPVHMIYVKNTLIFFFEDCAPTYTDLNEYI HGFSFKEQSVSIEFKSYSFLNLGALLNSLMPKLKRYEQISLIEHFHSIQFVNHTALFEKT EYFISRIFQDNPDETAAQDLYDLFSSIQDIAFLKALITDIFFRLLTTSHQTFVSSEQINS YIENVNHYKNPQELIHYFSGQLDGIMQSGKKGASSSRDFIFKVISLVDEYLSNPNLSLKW IAENHLYMNVDYLSKQFVKQTGVKFSSYLASKRMEKAKELLLNGNNEKISEIAEQVGCGD NPQYFSQLFKKYAGMTPSAYLKHHAL >gi|157101632|gb|DS480692.1| GENE 73 60454 - 61377 805 307 aa, chain + ## HITS:1 COG:CAC2654 KEGG:ns NR:ns ## COG: CAC2654 COG0540 # Protein_GI_number: 15895912 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Clostridium acetobutylicum # 2 303 5 306 307 414 65.0 1e-115 MRHLLNPLDFSVEEIDDLLALARDIEQNLPGYAHVCDGKKLATLFYEPSTRTRLSFEAAM LNLGGNVLGFSSANSSSAAKGESVADTIRMISCYADICAMRHPKEGAAFVAAQKAQIPVI NAGDGGHQHPTQTLTDLMTIRSMKGRLDNLTVGFCGDLKFGRTVHSLINAMVRYPNVKFI LISPPELRIPDNIRDDVLIANNVPFEEVGNLDDALGQLDILYMTRVQKERFFNEEDYIRL KDCYILDKKKMKLAKEDMYVLHPLPRVNEISVEVDDDPRAAYFKQAQYGVYVRMALIMRL LEVQKPC >gi|157101632|gb|DS480692.1| GENE 74 61371 - 61802 594 143 aa, chain + ## HITS:1 COG:CAC2653 KEGG:ns NR:ns ## COG: CAC2653 COG1781 # Protein_GI_number: 15895911 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: Clostridium acetobutylicum # 1 139 1 138 146 134 51.0 5e-32 MLNISGLQEGIVLDHIEAGKSLDIYYHLGLDKLECQVAIIKNARSNKMGRKDIIKIEGGL DTVDLKVLGYIDHNITVNIIRGDRIAEKRALKLPKKITNVIQCKNPRCITSIEQELPHIF YLADEKAEVYRCQYCEEKYSEFK >gi|157101632|gb|DS480692.1| GENE 75 62348 - 62560 86 70 aa, chain - ## HITS:0 
COG:no KEGG:no NR:no MALGPIRENVEDPALFRINQKALIFTGRSIAFEFVNGKRLWKLVRRRIANRVQKAADRLD GNTGKLCDLF >gi|157101632|gb|DS480692.1| GENE 76 62748 - 63173 324 141 aa, chain + ## HITS:1 COG:BH0379 KEGG:ns NR:ns ## COG: BH0379 COG3464 # Protein_GI_number: 15612942 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 6 137 254 392 405 79 35.0 2e-15 MPVSLRKYYKRSRKLILTRYKKLKDENKQACDLMLHYSEDLRLAHRMKEWFYDICQMEAY RQQQREFDDWIANAQSCGIKEFEACAKTYRAWRKEILNAFKYGLTNGPTEGFNNKIKVLK RSSYGIRNFKRFRTRILHCTS >gi|157101632|gb|DS480692.1| GENE 77 63402 - 63839 486 145 aa, chain - ## HITS:1 COG:no KEGG:ELI_3274 NR:ns ## KEGG: ELI_3274 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 16 132 23 140 145 74 39.0 1e-12 MGMSLIFMFVIAVVLVLYILAYWMIFRKAGEAGWKALIPFYGTYIEYKLFWNNRMFVLWL IMALIAASLRFVSGPGAMAAQLAYFGVSVMHILISVKMSYSFGHGVGYALGLTFLTPIFL LILAFDESVYIGPGGKKETDREMYL >gi|157101632|gb|DS480692.1| GENE 78 63893 - 64261 183 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 [Streptococcus pneumoniae SP3-BS71] # 6 118 7 123 126 75 35 2e-12 MITFNHFNFNVLDLEKSLAFYKEALNLEPVREKNASDGSFKLVYLGDGVTDFTLELTWLR DRKEAYDLGECEFHLAFHVDDFDGTHKHHEEMGCICYENPGMGIYFIKDPDGYWLEIVPA DK >gi|157101632|gb|DS480692.1| GENE 79 64380 - 65669 1548 429 aa, chain - ## HITS:1 COG:PH0710 KEGG:ns NR:ns ## COG: PH0710 COG0172 # Protein_GI_number: 14590588 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Pyrococcus horikoshii # 1 425 6 449 460 326 40.0 6e-89 MLDLRFVRENPDIVKQNIKNKFQDSKLPLVDEVISLDTEARATQKEADDLRASRNKLSKQ IGMLMGQGKKDEAEAVKQEVSANSARLAELEEKEGQLSEKITKIMMTIPNIIDPTVPIGK DDSENVEVERFGDPVVPDFEIPYHTDIMERFNGIDLDAARRVAGNGFYYLMGDIARLHSA VISYARDFMINRGFTYCVPPFMIRSNVVTGVMSFAEMDAMMYKIEGEDLYLIGTSEHSMI GKFIDTITPEQQLPLTLTSYSPCFRKEKGAHGIEERGVYRIHQFEKQEMIVVCRPEESPM 
WYEKLWQNTVDLFRSLDVPVRTLECCSGDLADLKVKSVDVEAWSPRQKKYFEVGSCSNLG DAQARRLRIRVQGEDGGKYLAHTLNNTVVAPPRMLIAFLENNLQADGSVKIPEVLRPYMG GMEMMVPKN >gi|157101632|gb|DS480692.1| GENE 80 65828 - 67465 1538 545 aa, chain - ## HITS:1 COG:FN0649 KEGG:ns NR:ns ## COG: FN0649 COG1574 # Protein_GI_number: 19703984 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Fusobacterium nucleatum # 4 539 5 542 542 270 32.0 4e-72 MRTIYRNGAVYTGELPLCRAFVVEDGRFICTGSDEDALAMKEPGDEVTDLKGRFVCAGFN DSHMHLLNYGNALGAADLSQHTQSLSGMKEYMKEFIREQSPKPGTWVRGRGWNHDYFEDE RRFPNRRDLDMISTEHPICLTRTCGHACVVNTMALRLAGITGNTPQVEGGLYEVDESGEP NGIFRENAMDLIYGCLPEPAKEDIKEMILAASKALNRYGVTSSQTDDLLAFNNVPYERVL EAYRELDAEGRMTVRVYEQSQFTTLDGLKAFVEKGYNTGWGNHWFKIGPLKMLGDGSLGA RSAYLSQPYTDDGSTRGIPIFTRQQFEDMVEYADSQGMQVAIHAIGDGILDDILAAYEKA LARHPGKDHRHGIVHCQITRPDQLDKFAKLSLHAYFQSIFLDYDIHIVEERIGKERAASS YHFRTLYETTHASNGSDCPVELPDVMKGIQCAVTRTTVKDHVGPYLPEQALDIKQALDSF TAEGAYASFEENVKGRIAPGMLADFVILGANPFETSPEELARIPVEATYVDGVCRYSIET EGKQK >gi|157101632|gb|DS480692.1| GENE 81 67479 - 68945 670 488 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 13 462 14 443 456 262 36 8e-69 MTEFLNWLNGEIIWGPPLIALIIITGIYVTVRSGFFQVCHLGYIFKRTLGCILRKDHVTD ENDKGLLSPYEAIATAVGGSVGFGNIAGVATAVTTGGPGAVLWMWLTAFMGMLIKQVEVT LAVYYRRTDEEGNPYGGPTYYMEYGLGEERHWGKKWLPLAFVFGLGIFSTFFISASNFTT SEVLAQSFGIPQELASLLLVVFIYALIWRGVKNLSKIFVKLVPVMSLGYLIFGLVIIVLN IGNLIPTIVSIFTQAFTGTAAMGGFAGVTVSKAIATGMQRSVYSNEAGWGTSPMVHASSK TRHPVEQGLWGSFEVFVDTFIVCTITALSVLITGEWTSGATNATLALNVFQNNLGFFGTV FMTATMVIFSVTTSGGWYVYYEATLRHLAMNHPTVKKWLLRMFKYFYALPPFLLTLYLIH VGDLSIWTLVDITSGIPTFINVFVVLILSGTYFRLLKDYKARYMNIGTLEDKDMPLFYED KKKLEKAE >gi|157101632|gb|DS480692.1| GENE 82 69432 - 70337 847 301 aa, chain - ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: 
Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 7 274 6 278 290 199 37.0 6e-51 MSAYRQKKIAMVNDLSGYGRCSLTVAIPILSAMKVQCCPIPTSILSNHTGFPVYFFDDYT EKMGEFIHKWKELELTFDGIVSGFLGSEAQIEIVMDVIRQFGQEDTKVIIDPIMGDHGET YATYTPAMCSRMKELVSMGDIVTPNLTEACILTGRTYRKDGWSRKELGQLAGEIQAMGPK CVVITGVNQGGYIMNVVAEGERTAFPRTRRVGHERPGTGDVFSSVVSAAAVRGWSLDSAV RLAASFVKACIARSEELDIPIANGVCFEELMDVLVRAVDRRHEAGKLVRAVDSRCEAGQQ A >gi|157101632|gb|DS480692.1| GENE 83 70351 - 70971 771 206 aa, chain - ## HITS:1 COG:no KEGG:Closa_2116 NR:ns ## KEGG: Closa_2116 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 16 204 5 193 195 260 79.0 3e-68 MKTSSAVTAGHRGLEDSRIYKLALTGLFAALSYVVFTFLQIKITLPGGDATSIHLGNAVC VLGALILGGFYGGLGGALGMTIGDLFDPIYVVYAPKTFLLKLCIGLITGLVAHKIGRINE SSDKRHVFRFVVAASVCGLLFNVIFDPLVGYFYKLAILGKPAADLVLAWNIASTSINAVT SAAAAVVIYMPLRSTLIRSGLFDRFR >gi|157101632|gb|DS480692.1| GENE 84 71127 - 71777 653 216 aa, chain - ## HITS:1 COG:CAC1731 KEGG:ns NR:ns ## COG: CAC1731 COG1564 # Protein_GI_number: 15895008 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Clostridium acetobutylicum # 3 216 2 211 211 136 35.0 3e-32 MKRCLIITGGKLDLSFAGSFLGQETFDKIIAVDAGLEAVKTLGLEPDMIVGDFDTVKPEV LAYYRRMEHIVWDTHQPEKDETDTELALLKAQATGCTEVVVLGATGGRMDHMLGNIHLLF PCLQKGMEAYILDSQNRIYLIDKERTFNRREIWGKYISFLPLTEEVRGITLTGFKYPLHE KDIEIGTSLCISNELVGEEGAITFTDGVLIVVESHD >gi|157101632|gb|DS480692.1| GENE 85 71774 - 72439 805 221 aa, chain - ## HITS:1 COG:CAC1730 KEGG:ns NR:ns ## COG: CAC1730 COG0036 # Protein_GI_number: 15895007 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Clostridium acetobutylicum # 4 201 3 200 216 237 56.0 1e-62 MYYKLAPSILAADFTRLGEQVKEVYGAGAEYLHLDVMDGAFVPSISFGMPVIKSLRPCTR AVFDVHMMVEEPGRYVEDMKNCGADIITVQAEACTHLDRVVNQIKEAGLKAGVALNPATP VRSLECILGQLDMVLIMTVNPGFGGQKFIPYTLDKVRELKNMLNEQGLKTDIQVDGGVNA GNVREIIEAGANIFVAGSAVFGRDPSARTREFMDILKEYEK 
>gi|157101632|gb|DS480692.1| GENE 86 72442 - 73320 876 292 aa, chain - ## HITS:1 COG:CAC1729 KEGG:ns NR:ns ## COG: CAC1729 COG1162 # Protein_GI_number: 15895006 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 1 292 1 287 288 240 43.0 3e-63 MTGKIIKGIAGFYYINDGRNHVYQCKAKGVFRNRKIKPLVGDNVEFSVLDEEEKEGNIDV ILPRSNVLIRPAVANVDQALVVFAITHPEPNLNLLDRFLVMMGIQGVPVTICFNKVDLGE AGLVDKYRVIYERAGYEVHFVSTYEDTGLEEMKKFLRGKTSVLAGPSGVGKSSLTNCIQP QAAMETGDISRKIERGRHTTRHSELFYVEENTYMMDTPGFSSMFIDCLEPGELKDYFPEF GPYEEECKFLGCVHVGERVCGVKEALKRGELSRSRYDNYILMYEDLKNKRRY >gi|157101632|gb|DS480692.1| GENE 87 73330 - 75501 2586 723 aa, chain - ## HITS:1 COG:BH2504_1 KEGG:ns NR:ns ## COG: BH2504_1 COG0515 # Protein_GI_number: 15615067 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Bacillus halodurans # 5 318 3 322 343 269 47.0 1e-71 MLAPGTYLQNRYEILERIGSGGMSVVYKARCHTLDRLVAIKVLKEEFAFDENFVSKFKME AQSAARLSHPNIVSVYDVVDEDVYHYIVMELIEGITLKNYIENKGYLESKEAIGIALQVG QGIAAAHEQHIIHRDIKPQNMIISRDGKVKVADFGIARAVSSQTMNATAVGSVHYISPEQ ARGGYCDERSDIYSFGITMYEMVTGRVPFEGDNTVTVALAHLEEPVTPPSQYVPDMSRAM EQIILKCTQKRPDRRYACVTDVIEDLRAALMDPDADIYARKDGEEDEFGKTRPITRDELN SIQEGRKSYPEEALEEPEEEQPGHQDRNRAREPRKKEGRIHTRKKNEGDDVSTQFERIIT AFGIVAAIIIVAVVLFVFSRLTGIFRQGSNKPTTETVSMTTEDSTTVVDINTEDQINMPS VLGLPQDMAREKLKESSLVMEITGSEYSDNYTEGQVMGQDILAGTVVKKWSTVGVTISKG SDKVDLNALDLTQMTGEDARLLLEQKGLVVEVREENSDTAIKGNVIRFSPQVVKVGESVS LYVSKGSQNVEGRVPNLQGQTPTSADALLAGAGLLTGNVTVEASDTVEEGLIISQSELPG TILPPGTAVDYVVSGEPAQTQAADERYYIGSIDEACSLSNYIGPASQTSSVRVLIRLNQT NDEGASVYTTLMSPRLVVGAQTVPVVFPRIKGAYGVDSGMVEVVDADNLTVIKSYPVAFF PVG >gi|157101632|gb|DS480692.1| GENE 88 75495 - 76241 957 248 aa, chain - ## HITS:1 COG:BH2505 KEGG:ns NR:ns ## COG: BH2505 COG0631 # Protein_GI_number: 15615068 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine 
protein phosphatase # Organism: Bacillus halodurans # 6 245 6 246 249 157 40.0 1e-38 MEAYALTDIGRVRTMNQDYIYSSPEKVGSLPNLFLVADGMGGHRAGDYASRFAVENLVIY INRAGDGSPVMQLKKGIEAVNGMLYEESLKREELKGMGCTLVAGVVEDNILYVANVGDSR LYLIHGDSIRQVTRDHSYVEEMVAIGQMRRGSEDYNRRKNIITRALGIGREVEPDFFEVD LEGGDYILLCSDGLSNMLEDQSMYEIITGEGSLKEKAALLIEEANRQGGIDNIAVVLVSP SGREAGVC >gi|157101632|gb|DS480692.1| GENE 89 76244 - 77287 1044 347 aa, chain - ## HITS:1 COG:BS_yloN KEGG:ns NR:ns ## COG: BS_yloN COG0820 # Protein_GI_number: 16078638 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Bacillus subtilis # 3 335 25 359 363 352 49.0 7e-97 MTLEEVTAQMAALGEKSFRAKQLYDWMHVKLAEGFDDMSSLSIPLRQKLKENYSLTCLKM VDERVSQVDGTRKYLFGLEDGHVIESVWMQYHHGNSVCISSQVGCRMGCRFCASTLDGLE RNLRPSEMLEQIYRIQSITGERVSNVVVMGSGEPMDNYDNVIRFLRLVSHEKGLNISQRS LTISTCGIVPGIRKFAEEGLAVTLALSLHAPNDEVRKTLMPVANSYKLQDVLEACHYYYE KTGRRLTFEYSLVRGVNDNLDEARALAKLIKDQHGHVNLIPVNPIKERDYVQSGQKAIQD FKNLLEKNGINVTIRREMGRDIGGACGQLRRSYKEASAVPQCTEGEG >gi|157101632|gb|DS480692.1| GENE 90 77356 - 78723 1344 455 aa, chain - ## HITS:1 COG:BH2507 KEGG:ns NR:ns ## COG: BH2507 COG0144 # Protein_GI_number: 15615070 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Bacillus halodurans # 12 446 7 449 450 249 33.0 1e-65 MAKTLDRGLENREIVLDILTEVLEQGAFVHLVLNQALQKYQYLGKADRSFITRAVEGTLE YLIQIDEVINRYSKTKINKMKPFIRSLLRMSVYQILYMDRVPDSAVCNEAVKMAVKRHFS GLKGFVNGVLRTISREKENLTFDTPSLRYSIPEWMYAMWEKEFGKEKAEKIAASFLEDKP TWVRCNLHRADRESILASLKAQGTSVRELPFMESMIAIDGYDYLEGLDAFRKGWIQVQDV TSAFVGEIASPEKDSYIIDVCAAPGGKSLHLADKLEGTGMVEARDLTYQKTALIEENMAR CGVTNMKTLVWDALETDLSAMAKADIVIADLPCSGLGIIGRKPDIKYNMTPEKLEELAGL QREILSVVWQYVKPGGLLCYSTCTIDRLENQDNAAWLRDRFPLVPVDLTDRFGGLPQEPS LKDGWIQFLPGVHPYDGFFISVFRRAEDETGPEGI >gi|157101632|gb|DS480692.1| GENE 91 78790 - 79491 883 233 aa, chain - ## HITS:1 COG:TM1511 KEGG:ns NR:ns ## COG: TM1511 COG2738 # Protein_GI_number: 15644259 # Func_class: R General 
function prediction only # Function: Predicted Zn-dependent protease # Organism: Thermotoga maritima # 6 231 3 230 230 211 48.0 7e-55 MFYPMFYFDPTYVLILIGVVISMAASAKLNSTYQRYSAVRSMCGMTGADAAKRLLANQGI YDVTVRRVPGNLTDHYDPRSKTVNLSDAVYNSTSIAAIGVAAHECGHAMQDANDYSPLRI RAALVPAANLGSQLCWPLIIIGLLLGGSSVLLNLGILLFCLAVLFQLVTLPVEYDASHRA VTLLDSTGILAGQEVGQTRKVLNAAALTYVAAAAASILQLLRLLILFGNRRND >gi|157101632|gb|DS480692.1| GENE 92 79494 - 80480 960 328 aa, chain - ## HITS:1 COG:BH2508 KEGG:ns NR:ns ## COG: BH2508 COG0223 # Protein_GI_number: 15615071 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Bacillus halodurans # 1 313 1 300 317 285 48.0 1e-76 MRIVFMGTPDFSVPALKALVEAGHQVIAVVTQPDKPKGRGKEVQMTPVKIQAMEYGIPVY QPAKVREASFVEVLKGLEADAYVVIAFGQILPKAVLELPKYGCINIHASLLPKYRGAAPI QWCVIDGERETGITTMMMDVGLDTGDMLEKAVIPIEEKETGGSLHDKLSMAGGDLILSTL KKLEEGTLVRTPQTDEGTCYAKMLTKSLGDIDWNQGAVSIERLIRGLNPWPSAYTMWNGK TIKIWAADVIAGREAAEFLSESGVPAETGTAPGTVVCSDKRGLVVCTGGGLLSIRELQME GKKRMDTPAFLRGYPIPAGDMFVKKESD >gi|157101632|gb|DS480692.1| GENE 93 80492 - 80977 581 161 aa, chain - ## HITS:1 COG:CAC1722 KEGG:ns NR:ns ## COG: CAC1722 COG0242 # Protein_GI_number: 15894999 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Clostridium acetobutylicum # 1 148 1 148 150 152 56.0 2e-37 MAVRQIRIMGDEILTKKCKPVKAMNDRTMELIEDMFETMYQANGCGLAAPQVGVLKQIVT IDVDDGNQYVLINPEITATDGSQTGYEGCLSLPGKSGIVTRPNYVKVKALNENMEPYELE GEGLLARAICHECAHLEGQMYVELVEGELVDTAADEEEIVE >gi|157101632|gb|DS480692.1| GENE 94 80997 - 83306 1946 769 aa, chain - ## HITS:1 COG:CAC1721 KEGG:ns NR:ns ## COG: CAC1721 COG1198 # Protein_GI_number: 15894998 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Clostridium acetobutylicum # 3 754 2 715 733 556 39.0 1e-158 MAHRYASIIIDISHENVDRTFQYRIPEELQGKIQVGQQVRIPFGQGNRQRRGYVVDLTDQ 
AQIDVHKLKDVAGIVQGSVAAESQLIWLAWWMKERYGSTMNQALKTVLPVKQKVREAPKR RIRTLVDRGVLEELAEEAGRKKHKAKLRLLEALLLTSAIPYEEAMNRLNLTPSAIKPLLE AGVIAVEVVERYRNPLEEMKLLMGSRERNQAVSDDRESGLWGAPPELNPFQREIADAVIG EYDRGIRRTYLLHGVTGSGKTEVYMELIAHVLSAGRQVIVLIPEISLTWQTVMRFYSRFG NRVSVMNSRMSAGERYDQYERARTGDIDIVIGPRSALFAPFENLGLIIIDEEHENAYKSE LSPRYDAREAAAKRAQMNEASLVLGSATPSLESYTRAIRGEYGLFTLKERAKEDACLPLV EIVDLREELKEGNRSIFSRRLKALIEERLEKREQVMLFINRRGYANFVSCRSCGEAVRCP HCDVTLTLHRDGRMMCHYCGYSIPQPKKCPVCGSPYIAPFGTGTQKIEAMAAELFPKARI LRMDLDTTSKKGGHQEILSSFARGEADILIGTQMIVKGHDFSKVTLVGALAADLSLYASD YRCGEQTFALLTQAAGRAGRGQLAGNVVIQTYQPDHYSIRTAATQDYESFYSQEMAYRRL LGYPPAVALLTVQMACPAEGTLAQTADLAAGWVQQWADRMEREKVQGYKSFRLIGPVNAA VYKVNDIYRKILYIKHENYDILIQIRKMTEEGLKGLEGRDGLTVQYDLT >gi|157101632|gb|DS480692.1| GENE 95 83731 - 84567 610 278 aa, chain - ## HITS:1 COG:CAC0728 KEGG:ns NR:ns ## COG: CAC0728 COG0500 # Protein_GI_number: 15894015 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 10 264 22 272 272 225 46.0 7e-59 MKTNPYIIDYYNNYDEDSRLSPRHGTVEFLTTMRYIEKYIKPGSRVLEIGAGTGRYSHAL ARLGYTVDAVELVPHNIEVFCSHMLPAEHITITQGNALDLSAFSDNQYDITLLLGPLYHL YTKKDKRQALGEAIRVTKQGGIIFAAYVISDGCLLDEGFNRGNINAAEYIKNGLLDPETF AAKSEPKDLFELVRKEDIDDLMSIFPTTRLHYAASDGCALLIREAIDKMDDETFQLYLKY HYATCERKDLTGITSHAIDIFQKQTTHPKTSPKTSQSQ >gi|157101632|gb|DS480692.1| GENE 96 84582 - 85190 382 202 aa, chain - ## HITS:1 COG:FN1004 KEGG:ns NR:ns ## COG: FN1004 COG1309 # Protein_GI_number: 19704339 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 161 1 155 188 58 26.0 9e-09 MPPKAKITKEMILNTVLEITREAGFESVNARSISGRLQCSTRPIFTCYENMEELRKEFLD FAYEFYSQYVESYSSTEHINPCLVLPLSYIEFAREETNLFRLLFINHMDLSMAEPKDFYM EAGNGERERVFSDTIGIEPERAKAIFIDLFLYSHGIAVLAAAQKITLDRSHIEKMVANML TALIRQEKPDWTLSVSGITESI >gi|157101632|gb|DS480692.1| GENE 97 85675 - 86997 1430 440 aa, chain - ## HITS:1 COG:yeeO KEGG:ns 
NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 5 428 92 521 547 181 28.0 2e-45 MFSRKQMIKLLAPLIVEQIMAVLVGMVDVVMVAAVGESAVSGVSLVDSISVLIIQIMGAL ATGGAVISAQYLGKEQPRDARKAAGQLVLVTLLLSVIIAGTALAGGRHLLGGIFGKVEQD VMDNAQTYFWITALSYPFIGLYNACAALFRSMGNSKVSMMTSFVMNMINIIGNAICVFGL KMGVAGVAWPTLISRVTAAFMMFVLIQNRNNTIRLNSWKFLIPDRHMIKNILSIGIPNGL ENGMFQFGKIFLQSLVSSLGTVSIASFAVASNLVTVLYLPGNAIGLGLVTIVGQCVGAGR LKEAKSDTKSLIMVVYGILAVFSTAMVIWSGPLVGIYHLSPQAALMARELILIHSIAMVI WPLAFTIPHALRASLDAKFTMAVSVFSMWVFRIGFAYLFVYIFDLGLSGVWYGMFIDWIF RALMFSGRFKGFERRAGVVE >gi|157101632|gb|DS480692.1| GENE 98 87135 - 88457 1399 440 aa, chain - ## HITS:1 COG:CAC0607 KEGG:ns NR:ns ## COG: CAC0607 COG1362 # Protein_GI_number: 15893896 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 9 426 5 432 433 396 46.0 1e-110 MNIDREDCLSTAKELIGFLEKSPTCFHAVQSIADCLEEAGFTQLHEGEKWELTEGGSYYV TRNGSSIVSFKVPGKAFSGFQIMASHSDSPSFKIKENPEMEAENHYVKLNVEKYGGMLCA PWFDRPLSVAGRLAVKEGNRIATKLVKVDRDLLMIPNLAIHFNREVNEGYQYNAQVDMLP LYGGADAKGTFMETVAESAGVKKEDILGHDLYLYNRVPGSIWGAGGEFLSCGHLDDLQCA FSTLKGFLEGGHPECVSVHAVFDNEEIGSLTKQGADSTFLEDVLRRINSAMGRSEEEYLM AIASSFMVSADNAHAAHPNHGDKSDPVNRPYMNEGIVIKYSTKYATDGLSAAVFRALCQE EGVPCQAFTNRSDKAGGSTLGNISNSHVAVNTVDIGLPQLSMHSPYETVGIKDTCYLIRA ARRFYSTSMKAMGSGSYEVR >gi|157101632|gb|DS480692.1| GENE 99 88529 - 89935 1557 468 aa, chain - ## HITS:1 COG:STM0969 KEGG:ns NR:ns ## COG: STM0969 COG0531 # Protein_GI_number: 16764329 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Salmonella typhimurium LT2 # 1 464 1 471 473 481 52.0 1e-135 MQQKDDKKILWHTLAFMAFSTVWGFGNVINGFSEYGGLKAIVSWILIFAIYFVPYALMVG ELGSTFKHAGGGVSSWIHETIGPKMAYYAGWTYWVVHMPYISQKPNSTVIAASWALFQDK RASGMNTTVMQILCLVIFLFAMYLSGKGMGVLKRLATLAGSAMFIMSMLFIVMMVAAPAL TDGAFLAIDWSPSSFIPDFNANFFLSLSILVFAVGGCEKISPYVNKMKDPSRDFAKGMVA 
LAAMVAVTAILGTIALGMMFDSNHIPQDLMTNGAYYAFQKLGEYYHAGNFFVVVYALTNL IGQFSVMVLSVDAPLRMLLDSADSRFIPGKMFEKNEHGTYVNGHRLVTLIVCVLIVVPAF GIRNVDALVRWLVKVNSVCMPLRYLWVFVAYIALKRAGDAFCGEYRFVKGKTAGILVGGW CFLFTAFACLTGIYSEDPFQLALNIVTPFVLVGLGFIMPWMAAKEKKS >gi|157101632|gb|DS480692.1| GENE 100 90042 - 91475 1360 477 aa, chain - ## HITS:1 COG:BS_murF KEGG:ns NR:ns ## COG: BS_murF COG0770 # Protein_GI_number: 16077524 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Bacillus subtilis # 1 475 1 454 457 262 37.0 9e-70 MTGITVKELLEATGGNLLHGQEDQHVKHISLDSRTMEGDDLFVPIVGERVDAHRFLCQAI ASGAAVVFTSEHHCGEDVKACVRQQCGENREQERKALQAAWIEVRDTKKALQDLGSFCRK SLSLPLVGITGSVGKTTTREMIAEALSAGFKVYKTPGNSNSQVGVPITIAEIPQSAEIGV IELGMSEPGEMERIARVARVDCAVMTNIGIAHIEQLGSQECILEEKLHIQEGMPPEGILF LNGDDPLLASVVPKEGRKKVLYGLGRDCDYRAEDLHLEEGYPVFTAVHGDCRVRVRLKVM GSHMVSNAMAALAVADTYGLSMEKAALALGQFKGYKGRQQIFEWGGVTVIDDSYNASPVS MKAGLEVLDSVKGERKIAVLADMKELGPDTERFHAEIGAYIGEHPLDMVLLLGELAACIG SGMDAARAATPHIEIDSLAQAEKWLDENIREGDCILFKGSNSMKLSEAVKHLKETRS >gi|157101632|gb|DS480692.1| GENE 101 91456 - 92991 1415 511 aa, chain - ## HITS:1 COG:CAC2129 KEGG:ns NR:ns ## COG: CAC2129 COG0769 # Protein_GI_number: 15895398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Clostridium acetobutylicum # 1 500 1 481 482 317 36.0 3e-86 MILRTWLNELDYELVRGSLDVEVTEVVYDSRKAVPGAVFVCMAGTRVDSHRFVPDVLRAG VRVLVTERDIEEELAASGLAEHEMGRVTILKTAEARRSLALLSAARFGYPASKMITIGVT GTKGKTTTTHMIKAILEAAGKKVGMIGTTGTVINGQVTPTMNTTPESYELHQSFAKMAEA GCQYMIMEVSSQGIKMHRVDGLAFDYGIFTNISPDHIGPDEHADFAEYLCCKSRLLSMCR IGLVNLNDGHFKEIVKDASCRLYTYCVKEDGETKADFEASNVRYVSRPDFVGTEFDVTGN LDMDVRLGIPGLFNVDNALAALCVCSFLGLPKDRIVHALEHIRVNGRMEIVYTSSRCTVL VDYAHNAVSMESLLTTLRQYHPRRLVVVFGCGGNRSKDRRYSMGEIGGRLADLSIITADN SRYEKVEDIMADIRGSIEKTGGDFVEIPDRRDAIRSSILHAQPGDMIAVIGKGHEDYQEI NGVRHHFLDREVIEETIKELGDETQYDRNNG >gi|157101632|gb|DS480692.1| GENE 102 93089 - 94759 2158 556 aa, chain - ## HITS:1 
COG:CAC3201 KEGG:ns NR:ns ## COG: CAC3201 COG2759 # Protein_GI_number: 15896448 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Clostridium acetobutylicum # 1 556 1 556 556 709 64.0 0 MKSDIEIAQEAVMQPIKEVAAAYGIGEDDLELYGKYKAKLTDELWEQVKDRPNGKLVLVT AINPTPAGEGKTTTTVGLGEAFGKMGKKAIIALREPSLGPCFGVKGGAAGGGYAQVVPME DLNLHFTGDFHAITSANNLLAALLDNHIQQGNALGIDPRQIQWKRCVDMNDRVLRNIVVG LGAKGDGMVREDHFVITVASEIMAILCLADNMEDLKNRLGKIIVAYNFAGEPVTAEQLHA VGSMAALLKEALKPNLIQTLEHTGALVHGGPFANIAHGCNSVRATKTALKLADVVVTEAG FGADLGAEKFLDIKCRKAGLKPDAVVLVATVRALKYNGGVPKDQLSAENLEALEKGIVNL EKHIENLQKFGVPVVVTLNSFISDTEAEYAYIKKFCEDRGCEFALSEVWAKGGEGGIALA EKVMETLENKPTQYHVLYPDEMSLKDKINTIAKEIYGADGASFAPAAAKALKRIEDMGFG NLPVCMAKTQYSLSDDQTKLGRPAGFTINVRDAYVSAGAGFVVALTGSIMTMPGLPKKPA ADSIDVDENGKITGLF >gi|157101632|gb|DS480692.1| GENE 103 94938 - 98006 3270 1022 aa, chain - ## HITS:1 COG:CAC1812 KEGG:ns NR:ns ## COG: CAC1812 COG1674 # Protein_GI_number: 15895088 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Clostridium acetobutylicum # 498 1014 272 765 765 560 56.0 1e-159 MAETTKRQQSGNRKPGRPKGSTSKKTGTSSKSRGTSGKKAYEQDNTEFMRAEVVIICSFA VAILLFLSNFRLCGVVGDVLRGVQLGIFGMVGYLFPILIFVGTCFHLSNQGNIHAAMKLA AVAGAVITVCGLLQLAFGTVPAGAKWMEYYKQSTLTGTGGGWLGGVLTSFLTIGLGKPGT FLVLVVLFIICMVCITERSFVSAVKRGGDKAYQYAREDMDRRRELHAIREEERRRIREEQ RVRGVNLNATRLMTPEEEDYDEEAFDREFGADLEEMPEMPEMPDTYEKDWGPAQTAGAKD SLPPDEFVGRFDPPLEPGTDAGDSYDTDELGALARSIEGHSRLGRREEAPVVTGILVEDE YGTHDGEDYDTIHVKKDLKTLEEMEFHVRSSADRDENPVNGYGHDRGREDTVPWEEEYGT SGDMAEGAALSAESAGTAGLHGTGLHGAADHEAGIPETDVPETGIYETDPSEQGINYDSI FVPEEPKRVVTASGKVIETETELLQKKIEKKREEAGQTDSNMAVAQEIKEKEEAVKKEYV FPPTTLLKKGAKNAGSFSGDEYKATAIKLQQTLHNFGVGVTVTNISCGPAVTRYELLPEQ GVKVSKIVGLTDDIKLSLAAADIRIEAPIPGKSAVGIEVPNKENNMVYLRDLLEAESFKN HKSRLAFAVGKDIGGQVVVTDIGKMPHLLIAGATGSGKSVCINTLIMSIIFKSKPEDVKM IMVDPKVVELSVYNGIPHLLIPVVTDPKKASGALNWAVAEMTDRYKKFAECNVRDLKGYN 
ERVEKIKDIEDDKKPVKMPQIVIIIDELADLMMVAPGEVEDAICRLAQLARAAGIHLVIA TQRPSVNVITGLIKANVPSRIAFAVSSGVDSRTIIDMNGAEKLLGKGDMLFYPAGFPKPQ RVQGAFVSDEEVGRVVEFLTEQGMVAEYNPEVESRVSSPSMDGGSGASERDEYFVQAGRF IIEKEKASIGMLQRMFKIGFNRAARIMDQLAEAGVVGEEEGTKPRKVLMSMEEFEELLEQ GY >gi|157101632|gb|DS480692.1| GENE 104 98154 - 98930 812 258 aa, chain - ## HITS:1 COG:BS_ymfB KEGG:ns NR:ns ## COG: BS_ymfB COG0740 # Protein_GI_number: 16078742 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Bacillus subtilis # 33 256 5 234 241 223 49.0 2e-58 MGQGKNNRKKRDSQAENGNGQVRQQDDEAKKLEKEIEEKEDNKLEEYGQVILEDNRRQRK IHLITIIGEVEGHENSSGSSKTTKYDHILPKLAEIEDDDSVDGLLVLLNTSGGDVDAGLA IAEMIASLSLPTVSLVLGGSHSIGVPLAVSTNYSFIVPSGTMMIHPVRMTGMVIGTSQTY EYFEMIQDRILTFVSNHADIAYDQLRELMHNTKMLTRDLGTVLVGTQAVEAGLINQVGGI KEALGKLYAMIDDRERRR >gi|157101632|gb|DS480692.1| GENE 105 99085 - 100707 183 540 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167856514|ref|ZP_02479226.1| 50S ribosomal protein L1 [Haemophilus parasuis 29755] # 407 529 41 162 175 75 35 2e-12 MLEHKDYLSSLNKARKKRNQRLDPRYLVLLLAAIAAVGIIVLGVKSVKKPVSGKAAGPAP TGQVGNATPAQADPAGVSPEEEAAQKEAQEIDNVISSYSNLGIVQVSGYLNIRETPSTDG KIIGKLSGDGACEILATEGDWSHITSGGVEGYISNQYLLTGDEAREKAKELVKLRATVTA DNLNIRQAPALDPGNVIGQALKGERYVVKGLEDGWIQIEEGYISSEYAEVAYSLNEGRKM DMKAMAINQYDNLVISKVNNYLNVRQEPKSDGKIIGKMTSKAAGEILETLDGWYKIKSGP IIGYISADPQYTATGQEAKDIAMQNATLKAVINTDVLNVRTEPNTEAKIWTQIVKDERYP VVAQLDGWVEIDLDSVDEEDGSKVDKAFISTRDNNVEVRYALNEAIKFSPAEVAANNAAS RRSKVVNYALQFVGNPYVWGGTSLTKGVDCSGFTMQVMKQFGVSLPHYSGSQAKMGKAVT SGNMRPGDLIFYAGSGGQVNHVAIYIGNGQVVHAASRKSGIKISTWNYRSPVAIRNVLGD >gi|157101632|gb|DS480692.1| GENE 106 100872 - 101669 273 265 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 253 1 243 245 109 31 8e-23 MLIMTNVRKTFNKGTINEKKALNGIDLTLNDGDFVTVIGGNGAGKSTMLNMIAGVYPIDS 
GKIEIDGVNISRQPEYKRAKYIGRVFQDPMKGTAAGMEIQENMALAFRRGQKRGLGWGIR ANEKDYYHDLLTRLGLGLQNRMSSKVGLLSGGQRQALTLLMATLQKPKLLLLDEHTAALD PQTARKVLDLTNEMVTEQNLTALMVTHNMKDAIQIGNRLIMMNDGRIIYDVSGREKQNLT VEDLLAKFAEASGGQFANDRMLLSK >gi|157101632|gb|DS480692.1| GENE 107 101669 - 102604 1393 311 aa, chain - ## HITS:1 COG:SPy1018 KEGG:ns NR:ns ## COG: SPy1018 COG4120 # Protein_GI_number: 15675019 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 14 299 2 281 289 187 40.0 2e-47 MNFANLMQGMGPNLLDAVGQGVLWGIMVLGVYITFKLLDIADLTVDGSFALGGCVCATLI VGRGMNPVAALFIATLAGMAAGFVTGILHTVFEIPSILAGILTQISLWSINLWIMGGSSN IPLLKSKSIVTMVSSATGFNQAQASQVIGIAAAILMVVVMYWFFGTEIGSALRATGNNED MIRALGVNTKWTKLLALMLSNGLVGMSGALVCQGQKYADIQMGTGAIVIGLAAIVIGEVL FGWFRNFALKLSAAVIGSVIYFVIRAFVIKMGMNPNYMKLLSAIVVTLALCIPVAANKWR TYKEYSEGGNV >gi|157101632|gb|DS480692.1| GENE 108 102890 - 103954 1374 354 aa, chain - ## HITS:1 COG:Cgl2198 KEGG:ns NR:ns ## COG: Cgl2198 COG2984 # Protein_GI_number: 19553448 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Corynebacterium glutamicum # 67 350 39 319 330 201 46.0 1e-51 MKKTSKIMSLVLAGAMVLSMTACAGNTAETTSAAAKTQETAADTASKAEETTGAEASESE TAAKAEGTTYKIGVLQLVQHTALDAANDGFIAALDDAGIAYEVDQQNASGDQSACQTVAS KFANEKKDLILAIATPAAQAMVGAVEDTPVLITAVTDPAESGLVDTNEKPGANVTGTSDL TPVKEQIDLLKQLVPEAKTVGVLYCSAESNSKIQADMAKEAIAAAGMESKEYTVSSSNEI QTVVQSMVGAVDAVYAPTDNVIAAGMATVGMVAGDNNLPVICGEAGMVQNGGLATYGIDY YQLGYMTGQQAVKILTEGASPADMPIEYLPAEKCELTVNEETAQTLGIDVSGLK >gi|157101632|gb|DS480692.1| GENE 109 104112 - 105509 1561 465 aa, chain - ## HITS:1 COG:BS_phoR_3 KEGG:ns NR:ns ## COG: BS_phoR_3 COG0642 # Protein_GI_number: 16079962 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 231 459 41 271 279 177 41.0 3e-44 
MRKALLERFILVLLAALCINSVIFYIASSKMMLKNSKNDMTYTIKAVDSILDYDRDLMEQ MERLEAFTDQDSSRVTLIKSDGTVVTDSDADAEYLDNHQGREEIQAAMEHGEGYARRYSK TLGKTMLYVACRSEYSDMVLRLAVPYSGIQEYLPMLFPAAILSFLMAMVCSVVTTRRFVS SVTRPLRDISKAVMKVKGDYTELQFESCQYPEINVIAGTIMNMSKDVKEYLSQIEKEKQI RQEFFSNASHELKTPITSIQGYAELLESGMIQDDGLKLDFAKRIKKEAAGMTGLINDILM ISRLESMDAEVMFSDVRISVLLQEIMDSLKPLAASCQVFLHVDCKPLCIRANLQQMKELF TNLATNAIKYNRPGGQVWITVGEQGDDMLVRVKDNGVGIPKESLDRIFERFYRVDKGRSR KQGGTGLGLSIVKHIVNFYHGTIHVTSELDKGTEFTVTIPLDMHK >gi|157101632|gb|DS480692.1| GENE 110 105536 - 106213 879 225 aa, chain - ## HITS:1 COG:CAC1700 KEGG:ns NR:ns ## COG: CAC1700 COG0745 # Protein_GI_number: 15894977 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 222 4 226 232 194 46.0 1e-49 MERIYIIEDDENIMNLLKIALEGFGYAAEGFETAEEGLARMREEKPDLVIFDWMLPGMDG TEAIKEVRNTEQLKYVPIMLLTAKEKELDKVVGLDCGADDYMVKPFGVLELSARIRSLLR RTKREEDRDVLFYQDIQVNRKTREVSSGGRLVELTLKEFELLVYLLENQSRVVTRDELLN RIWGYEYDGETRTLDMHIRTLRQKLGEEGGACIKTVRGVGYRMVR >gi|157101632|gb|DS480692.1| GENE 111 106319 - 107314 1023 331 aa, chain - ## HITS:1 COG:AGl875 KEGG:ns NR:ns ## COG: AGl875 COG2355 # Protein_GI_number: 15890553 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent dipeptidase, microsomal dipeptidase homolog # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 85 329 132 369 375 147 34.0 3e-35 MKVVDMHCDTIAEIYKDHKKGGRQSILENQMMLDLKKMQAGDYGLQNFALYVNLEQAKGR PFEFCMELLDTFYQEMEAHEDIIGIVRSYEDISRNWEQGRMSALLTVEEGGTCQGKTAFL RDFYRLGVRMMTLTWNFPNELAFPNARIAEEDGTFRMAPDTEHGLTDTGIAFVEEMERLG MIIDISHLNDAGIWDVFRHTRNPFVASHSNARAMASHPRNLTDDMIRALAERGGVMGINY CTAFLRDFGPGEEQLSRISDMVEHMKHIRKIGGIGCIGLGSDFDGISGNLEMGDAGKLPM LAHAMEREGFTSSEIEAVFCKNVLRVYKEVL >gi|157101632|gb|DS480692.1| GENE 112 107424 - 109088 1893 554 aa, chain - ## HITS:1 COG:CT856 KEGG:ns NR:ns ## COG: CT856 COG0659 # Protein_GI_number: 15605592 # 
Func_class: P Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS superfamily) # Organism: Chlamydia trachomatis # 2 547 9 559 567 376 39.0 1e-104 MKTLTPQLFLSMKQYTKEQFTKDVISGIIVAIIALPLSIALALASGVTPEQGLYTAIIAG FVISFLGGSKVQIAGPTAAFATIVAGIVMKNGMEGLATATILAGLILVIMGFLRLGSLIR FIPYTITTGFTTGIAVTIFIGQIKDFLGLTFRESPVETMEKLGQVISCMDTLNLQALAVG AVSLAILILWPRFFKKVPPSLIAVAAAAVLVKGMGLRVNTIGDLYTISSGLPAFHLPNIS FSLVQKVMPDAITIAVLAAIESLLSCVVADGMIGGKHNSNAELVAQGVGNAASALFGGIP ATGAIARTAANIKNGGRSPVAGMVHAAVLLLILVFLMPYAALIPMPAIAAILFMVAYNMS EWRSFADVVKTAPKSDTAVLVLTFFLTVVFDLVMAIGVGLALACLLFMKRMADVSDIYGW KYAEDYEESTDRERIDLKPVPRQVMVFEVNGPMFFGAADKIGQIPLDSDKKVLILRMRSV PAMDATALNSLKKLNARCRKARITMILSHVNEQPLSVMEKAGFDKEIGRENIAPRIDDAI CRAGEILSEANAAD >gi|157101632|gb|DS480692.1| GENE 113 109215 - 110393 383 392 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 3 392 7 413 413 152 28 1e-35 MLQSFKLALKSIWGNKMRSFLTMLGIIIGVAAVIILVSLVNGYMGSVVESFASMGVNQIN VNVTNLASRSLDVDQMYAFYEEHTDLFGELSPNVTLSSTIKYGDDSMTSTSVAGRSEQYL EIKSYELETGRNIAYSDIVSRQKVCVIGAYVANELYGSSQNAVGRTLKIDGYAFKIIGVA EAQDADNMEDGGTDDFVWIPYSVAVKMSRNANITNYTFTALDTSKADECTAVIKNFLYET FKDEDLYRVTAMSEMLDSLNEQIAMMSGMLGGIAGISLLVAGVGVMNIMLVSVTERTREI GIRKSLGADKGTIMRQFVIEAAVTSSLGGIIGILVGCVATTVVGAAVGVSATPTLSAVVI SFSVSVGIGLLFGYMPASRAANLNPIDALRSE >gi|157101632|gb|DS480692.1| GENE 114 110377 - 111099 321 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 10 235 1 225 245 128 35 2e-28 MDNTSQRETLIELRHIHKIYYLGGEEVRANDDISLKIDRGEFVAIVGKSGSGKSTLMNII GALDVPTDGEYLLGGEDVGIMDDKQLARIRNKMIGFIFQQYNLLPKLNILENVELPLLYA GVGNAERRERAMASLEKVGLSEKWKNMPNQLSGGQQQRVSIARALAGSPSLILADEPTGA LDSKTSRDVLGFLKQLNDEGNTIVMITHDNAIAMEARRVVRIKDGQINFDGDVKDYAAII >gi|157101632|gb|DS480692.1| GENE 115 111149 - 112939 1852 596 aa, chain - ## HITS:1 COG:CAC0318 KEGG:ns NR:ns ## COG: CAC0318 COG0845 # Protein_GI_number: 
15893610 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Clostridium acetobutylicum # 311 563 76 364 392 104 32.0 4e-22 MEEKKKGVQRFFKLSAGRHKREAAPGTPAQAVKGRKGRKKWVKPLVILLIIGAAAAAAMT FRGMKAKAATAQAQSHNTASAERRDISSELSSSGTLKAKDTYSITSMVEGEVLSAGFEEG DQVEKDQVLYEIDKSSMESQLTSSVNSLSRSQSSYEDALEDYNDALGDYSGNTYKATDTG YIKELYISAGDKVSGNTKLADVYSDDVMEIRIPFLSGEAAMIGAGMPAAVTLTDTGEQVA GTVKAVANQETVLTGGRLVRVVTIQVQNPGGLTTSLRATAQIGEFIGSEDGAFEASVDTT MNADLSTSVEVESLLVNVGDYVTKGTPVFKMTDKSADKLIQSYKDALDKAEESVESAQNK LDSTQDSYDNYTIKAPISGQVITKNFKVGDNITKNSSSTTVLATIYDLSSLTFEMSIDEL DIKKVKVGQKVEVSADAFEGQTFSGTVTNVSLESTYSNGVSTYPVTVTLDDMGDLLPGMN VDGVITLEEANDVLAIPVDALMRGNQVYVKDDTVTEQQGPVPAGFRSVKVETGLASDTYV EITSGLSEGDVVYVAESSKNSSSFMMMMDGGMGGPPGGGGNMGGQNRGGSGGRQNR >gi|157101632|gb|DS480692.1| GENE 116 112954 - 114120 1439 388 aa, chain - ## HITS:1 COG:no KEGG:Closa_2156 NR:ns ## KEGG: Closa_2156 # Name: not_defined # Def: S-layer domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 9 387 4 362 363 290 45.0 6e-77 MKRPLAACRYVCGSLAAITAVTSALPVTAAANTSSANFDMRRQVVNLTGILNVTDYTGQV TRGDFARMLVNASSYRENLPSSSVSVFADVPGTHPDAVYIRIAASQGWMTGFLGGLFKPE EYITYKDAVKAALTMLGYSDEDFTGDLASSRISKFNYLELNEEMNRQPEEVLNQTDCMNL FYNLLKTKKKDSSEIYGAVLDCELNSDGEINPITILDDERKGPILVRKGFSVIQSVPFGS ENANVFLNGTASTLEAVKASQQDAGFAVVYYNVKSKTIWAYTTRGWDDDDLEGNNAYVLL KGEVKNIYYKSTDVMTPTSIRLEIDDDNSDGDFGEDGIDEDVYLTINLNSSELQYLFSIY GGIEVGDEVVMVCNKSGSSYTAVDAIEY >gi|157101632|gb|DS480692.1| GENE 117 114292 - 115842 1528 516 aa, chain - ## HITS:1 COG:SPy1212 KEGG:ns NR:ns ## COG: SPy1212 COG1502 # Protein_GI_number: 15675176 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Streptococcus pyogenes M1 GAS # 2 516 15 525 525 484 45.0 1e-136 MKKLRGLFRIIFGRTAFVVLFMAIQVGILFGAFRWLEEYMHIVYAISILLSAAVIIYIFN EPINSSFKMAWIVPVLVIPVFGVLLYIFVQMQFQTKLMARRLKKNIGDTAAYLKQDREVQ 
ERIREASRRSSHLIDYMKDHAGYPAYGNTHVEYFPLGEDKFEALMRELKAARHFIFMEYF IVERGYMWDSILKVLEEKVKEGVEVRVMYDGMCCLMLLPYHYPRVLEKKGIRCKMFSPIK PTLSTYQNNRDHRKIVVIDGHTAFTGGVNLADEYINRKVRFGHWKDTAIMLKGDAVQSFT MMFLQMWNVTERQPEDYGKYAVPKDYEHPQSSDHSGFVLPYGDSPMDREQVGERVYLDIL NQARSYVHIMTPYLILDDEMVNALSYAAKRGIDVKLIMPHIPDKKYAYMLARTFYPELTR AGVKIYEYTPGFVHAKVFVSDDEKAVVGTINLDFRSLYLHFECAAYIYRNPVVFDVERDF EETLKKCSLITMEDCKNYSWMGKKLGRLMRLVAPLM >gi|157101632|gb|DS480692.1| GENE 118 115980 - 116417 472 145 aa, chain - ## HITS:1 COG:SMc01160 KEGG:ns NR:ns ## COG: SMc01160 COG1959 # Protein_GI_number: 15964112 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Sinorhizobium meliloti # 3 132 5 132 152 61 32.0 5e-10 MTSEFTIAVHALVFLNHKGTVFSSEGLAANVCTNAARIRKVMAKLKKADLVKTKEGVDGG YLFHRDSRDVNLSMVAEALDTPFVAASWKSGNPHMACLISSGMAGIMDELYGELNQCCQA YLEKVTIADLDSRIFTGAPKAVTAI >gi|157101632|gb|DS480692.1| GENE 119 116547 - 118064 1424 505 aa, chain - ## HITS:1 COG:CAC0691 KEGG:ns NR:ns ## COG: CAC0691 COG2720 # Protein_GI_number: 15893979 # Func_class: V Defense mechanisms # Function: Uncharacterized vancomycin resistance protein # Organism: Clostridium acetobutylicum # 138 415 119 394 411 114 30.0 4e-25 MNRSRARAIRRKYQRRTRVTVQKAWILSALAAAVIVYGAAAVIREKGREAKGTQVSIETP DSTQMPEALETEFLKDVRVGTLNLSGLTMEQAVSRVRETYRWDMRIRNGEESVSLEDLAG PQLEEILETIEHGDKDKPQSYELDYGRMEQAFAQQAAELAQEWDRGAVDSQMESFDKETK TYRYTEERYGRSLKQEQLVEELMEAVRKGEFTTETEAPFDSIAPKRTAAQARDQYKVIGT FSTTTTDNKNRNQNIRLAADAIDGVVLKPGEEFSFNLATGNRTSEKGYQPAGAYRNGVLI EEPGGGVCQVSTTLYHAIINSGYKTTERNFHSFAPGYIDKGQDAMVSFDGYAGPDLKFVN TQNTSIGLRASFDGKQLKLSIVGLPLLEENERISMRSEKIRELEPPAPVYEENPELAYGE EKIVEQAQPGSVWKSYRIRTKNGQVEEETFLYTSTYKAKPAKIQRNSAALPPSQELPQPD GGENLQEGQTEEAAAEGEVIMPFGS >gi|157101632|gb|DS480692.1| GENE 120 118069 - 119529 1294 486 aa, chain - ## HITS:1 COG:BS_ccdA KEGG:ns NR:ns ## COG: BS_ccdA COG0785 # Protein_GI_number: 16078856 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis protein # 
Organism: Bacillus subtilis # 15 243 10 228 235 135 34.0 2e-31 MGFSLDISVPALTVFIQGLVSFFSPCVLPLIPLYIGYLSGGTGKRGEDGRMHYERSKVML HTLCFVIGVSFAFFLLGLGFSTLGTFFKSNQLLFARVGGILVVLFGFYQLGFFGKSGAMG KEHRLPFKLDMLAMSPITALIMGFTFSFAWTPCVGPALASVLLMAASASTKALAFGLIGV YTLGFVLPFLAVGLFTTSLLEFFKTHGNVVKYTVKVGGILMIFMGLLMFTGRMNAVTGYL SSFQTQSSVQGSKEPGEGGDKTPAMTAGTEPESADSESGAETEATEAETGKTDTADAANP AEAGNTSAASDADKANAPEETAGDGEEGADQAEAGAQEVLPAIDFTLKDQYGNTHSLSDY KGKTIFLNFWATWCPPCRAEMPDIQKIYETYDTEGDDALIVLGVAGPGYGNEKSEEGIKE FLDENGYTYPVLMDTTGELFSAYGIYSYPTTFMIDRDGNVFGYASGQLNEDMMKNIIEQT MEGKRR >gi|157101632|gb|DS480692.1| GENE 121 119647 - 121035 464 462 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 10 461 2 449 458 183 29 6e-45 MAENMMENRTKYDAVIIGFGKGGKTMAGALGAAGKKVALIEKSDRMYGGTCINVGCIPTK SLVYRAGLAAAKGGSFEEKAAAYKAAMEQKEDLTARLRGKNYQKLDSNPNITVIDGTASF QSPHVVEVEKDGRTFQVEGEQIFINTGSSAFIPPIEGLKGNPYVYTSEGLLNLTELPSRL VIIGGGYIGVEFSSIYASFGSKVTILQDGDIFLPREDEEIAGAVRESLESRGIRVMTGVK VKALEQAGGKALVAVDNGKEVQKLEAEAVLVATGRRPNTAGLNLEAAGVEIGPRGGIVTD DSLTTTAPHIYAMGDVRGGLQFTYISLDDFRIVKSKVLGDGSYTLKERGAVPYSVFLIPP FSRVGLSEKEAVEKGYKVKVARLAAAAIPKAQVLEQPAGLLKAVIDEETGLILGAHLFCQ ESYEMINMIKLAMDAKVPYQVLRDTIYTHPTMSEAFNDLFAV >gi|157101632|gb|DS480692.1| GENE 122 121314 - 122591 1328 425 aa, chain - ## HITS:1 COG:PM0738 KEGG:ns NR:ns ## COG: PM0738 COG2873 # Protein_GI_number: 15602603 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pasteurella multocida # 8 423 5 418 422 497 57.0 1e-140 MSKQSLDTICIQGGWEPKNGEPRQLPIYQSTTFKYTTSEQMGRLFDLEENGYFYTRLANP TNDAVASKICQLEGGVGAMLTSSGQAATFYAILNICQAGDHVISSSTIYGGSTNLFTVTF PKMGVECTLVNPDADEAELAKAFRPNTKVVFAESIANPALTVLDFEKFARLAHSHGVPLI VDNTFATPINCRPFEWGADIVTHSTTKYMDGHAATVGGCIVDSGNFDWEAHHDKFKGLTE PDESYHGIVYTQKFGRLAYIQKATAQLMRDLGSIQSPQNAFLINIGLETLHLRVPRHCES AFKVASYLQSREDVAWVKYPGLPGDKYYELAQKYMPKGTCGVISFGLKGGRQAASVFMDK 
LKLCSIETHVADSRTAVLHPASHTHRQLTDEQLVEAGVDPSMIRFSVGLESAADVIEDIR QALED >gi|157101632|gb|DS480692.1| GENE 123 122621 - 123724 1142 367 aa, chain - ## HITS:1 COG:TM0034 KEGG:ns NR:ns ## COG: TM0034 COG2768 # Protein_GI_number: 15642809 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S center protein # Organism: Thermotoga maritima # 4 367 3 352 357 350 49.0 2e-96 MAVSNVYYTNMRTTLTENLLQKLERLVKKAGMMDIDFNNKYTAIKIHFGEPGNLAYLRAN YSKVLVDLIRSQSGKVFLTDCNTLYVGRRKNALDHLDAAYENGYNPFTTGCHMIIADGLK GTDEALVPIDGEYVKEAKIGRAIMDADIVISLTHFKGHEATGFGGTLKNLGMGSGSRAGK MEMHNAGKPFVHTDKCIGCGACQRNCAHSAITVLERKASIDTSKCVGCGRCIGACPVDAV DSMCDEANDILNRKIAEYTLAVLKGRPNFHVSLVVDVSPNCDCHSENDAPIVPNVGMFAS FDPVALDMACVDAVNSQPVLAGSFLAEQDHHSHDHFINTHPDTNWETCIDHAVKLGLGNK EYNLITI >gi|157101632|gb|DS480692.1| GENE 124 123954 - 124610 739 218 aa, chain - ## HITS:1 COG:MA3135 KEGG:ns NR:ns ## COG: MA3135 COG5658 # Protein_GI_number: 20091953 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Methanosarcina acetivorans str.C2A # 20 216 20 218 227 103 31.0 3e-22 MKKNNRKISKWDIMYWLTALVPFIISVCFYNRLPDLVPTHWGTDNVADGYSSRNMAAFGI PAFMFLMAVMVNVIYRIDPKRENISRSRELKQITRWFVVLLAVMVQFVIVLSGIGVDINV GSMVSIPIALMFVAAGNYLPKCRQNYTMGIKLPWTLADEDNWNRTHRMAGYVWTAGGILM LIMGFFHLASLFFLVFMAMVLIPSVYSYLIYRKKLKGI >gi|157101632|gb|DS480692.1| GENE 125 124597 - 124878 359 93 aa, chain - ## HITS:1 COG:BS_yvbA KEGG:ns NR:ns ## COG: BS_yvbA COG0640 # Protein_GI_number: 16080432 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 5 81 3 79 90 103 61.0 6e-23 MSFGDTFKALSDPTRREILNLLKQGSMTAGQIVEHFDTTGATISHHLNILRQAGLIEDRK SGKYIYYELNTTVFQEVLSWMHSLMEEDKHEKK >gi|157101632|gb|DS480692.1| GENE 126 125129 - 126346 877 405 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939855|ref|ZP_02087202.1| ## NR: gi|160939855|ref|ZP_02087202.1| hypothetical protein CLOBOL_04746 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04746 [Clostridium 
bolteae ATCC BAA-613] # 1 405 1 405 405 813 100.0 0 MKWIKKQRIITICLKGVLAMVILSAMLGLAACRADTGSHISNTGAGAENHGMQDNASAPE SLGSQEGKAEPDSFGSQEGNGEPDSSSVPDHDADQPWADEPRRVLTQEELRGFEEYLNRI DNYGFLMSDYKSPEYINLNEVFYSGAGLEQSPLTEEEREMYLKTVGQPELYTDLVRLGTE QMNGFLTEKTGLTLEQMKTGIGWIYLAPYDRYYAEHGDTNIRSFFCPSGEVDGGIYRIRC LSDGYSQFNLESIVTLKQVSKGYQFLSNEIIWDYHGSYKEDAKAPTAEALHYVISAYKME AEEGGTGIYPAGAVPDDYGSPVFDSDARYYSMEEMEQISWRPELTAVFRNEIYARHGYIF KSDFWNDFFSTFTWYSGEYPADSFDTGVFNEYEKANLKLAVEMEK >gi|157101632|gb|DS480692.1| GENE 127 126343 - 128295 1812 650 aa, chain - ## HITS:1 COG:aq_035_2 KEGG:ns NR:ns ## COG: aq_035_2 COG2199 # Protein_GI_number: 15605636 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 446 631 51 239 251 88 34.0 4e-17 MKQIIRKRLCIVILTAMLTTLLINYCIEIHNAQNEMYTSAREMFWQVGQILDQNEKEAEK ERENLKEQCFIRAKAIAYILQDRPQVTGDQQEMEKIARLLQVDEFHLFDTKGNLYAGSEP KYFGLNFNSGQQMQFFLPMLENYDLQMCQDITPNTAEGKLMQYAAVWREDRKGIIQVGLE PTTVLDSMKLTELSYIFSLVTSEKDSTVYAIDPESETILGSTDETLVGKQIQEIGLHPDQ LSVTNRAVRAVLNQKDSYCVFGKSNSVILGMSATSASLYREVNRSTLMVAVYLIGLSLLM ITFISKDIDRYILEGISSIHNNLVKITKGNLDTRVEVDSTPEFRELSFQINKMVESLLDT TNKISKILEMVRVPIGVYEYNRDMKRVMATSRLAGILRLKDQEAEELLADHCLFEQKMDD LRSRPVEREERIYQLEGSDLRYIRLESFERENSVLGMIADVTDDVMEKKQIERERDIDLL TGMYNRRAFYRHMEQLLREPEQLDHGMMLMADADNLKQVNDKYGHENGDRYLTAIAGLLK GCGWSNCIAARLSGDEFALFLYKCSSRNQLMEYEAGLRKTMETCTAELDNGERIQVRFSD GYAFYPEDGTDTACLLKHADESMYEAKRAYKAASVGKKDESRGSKEGNVS >gi|157101632|gb|DS480692.1| GENE 128 128696 - 130150 1389 484 aa, chain - ## HITS:1 COG:YPO2628_1 KEGG:ns NR:ns ## COG: YPO2628_1 COG1263 # Protein_GI_number: 16122841 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Yersinia pestis # 1 390 3 393 410 407 55.0 1e-113 MMKYLQKLGKAIMLPVAVLPVAAIMMGIGYAMCPATMQGGDVTGIIQIIGFALVKIGSAV IDKLPMLFAVGVAIGLSKDQHGAAGLAGLVAFLTVTTVLNSGVVATMMGAEEAPLAFSKI DNAFIGIISGITASVCYNKFSSAQLPTALAFFSGKRCVPIVTSFVMVAESAILYFVWPML 
FNALVWVGEHIMGLGPVGAGLYAFANRLLIPTGLHHALNAVFWFDIAGIADINKFWGFAE GGIKGVTGMYQAGFFPVMMFGLPAAAFAMYQTAKPEKKQYAAGILIAGAVASFVSGVTEP LEFSFMFLAPGLYLVHAVLTGISVAVAAAFHWTAGFCFSAGVTDFIFSCRLPMANQPFML LVQGVVFAVIYYVVFRAVITKFDLPTPGREKEDLDAEMTAVLANDDFTKVAAIILEGLGG KGNVVTLDNCITRLRLEIKDYTLVDEKKIKSAGVAGVMRPSKTSVQVIVGTKVQFVADEM QKML >gi|157101632|gb|DS480692.1| GENE 129 130319 - 130576 285 85 aa, chain - ## HITS:1 COG:no KEGG:Hore_14490 NR:ns ## KEGG: Hore_14490 # Name: not_defined # Def: phosphotransferase system, phosphocarrier protein HPr # Organism: H.orenii # Pathway: not_defined # 8 84 8 84 88 68 42.0 1e-10 MTERSFILKDEMGLHARPAAMLMVRMLHVSSTVEIICGDRRADGKDVMSIMAMNTTAGDE VIFRIDGPDEEAAMKEVEAVLETGA >gi|157101632|gb|DS480692.1| GENE 130 130597 - 131091 672 164 aa, chain - ## HITS:1 COG:BH0595_3 KEGG:ns NR:ns ## COG: BH0595_3 COG2190 # Protein_GI_number: 15613158 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Bacillus halodurans # 13 158 2 145 152 145 49.0 2e-35 MFGSLKKLFGGEEKEEKIIAAPVNGTAIPMSQVSDPTFSQEILGKGTAVIPSDGRVVAPA DGLVTMVFDTKHAISMQTDNGAELIIHIGLDTVQLKGQYFDAHVAAGDKVKQGDLLLDFD IDKIKEAGYDVTTPVIICNTPQFPKMECVNGMEVRAGETAIIKL >gi|157101632|gb|DS480692.1| GENE 131 131138 - 131935 878 265 aa, chain - ## HITS:1 COG:BS_licT KEGG:ns NR:ns ## COG: BS_licT COG3711 # Protein_GI_number: 16080959 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 262 12 273 277 228 42.0 8e-60 MSTCDEEGREIIVMGRGVGFKAKEGNTIDEEKVEKIFRLESQNTMDKFKELLENLPMEHV QISAEIIGYAKEVLNRPINPNVYITLTDHINFALERYKQKMMFSNPLIREVRSFYHAEYL IGEYAIAMIERDLGVKLPVDEAASIALHIVNAEYDAPMGDTIKITNLIQQVLEVVREYFS IELDEQSLSYERFITHLRFLAQRVFTGEHMNLDNLEFQEVIDRLYPEEYACSQKIQALIK LQYRHQVTEEEVAYLALHIKRIRTK >gi|157101632|gb|DS480692.1| GENE 132 132231 - 133538 1033 435 aa, chain - ## HITS:1 COG:no KEGG:Closa_1252 NR:ns ## KEGG: Closa_1252 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 
432 1 420 430 427 56.0 1e-118 MAVYLVCYIISFILARQHFYMLSGLVLITAALWLYIKDYRETGNLIHLRGLFCLFFVGGQ GISCFKLSRLQTDWDLRTWVCLGLAVLTFWIVFEVLHRIYDGWSAADMQSVYRFYASAES PLQAKRLLHSMAALVVISFLAFFFEAWKLGFVPLFSYGVPHAYSYFHVSGVHYFTVSCVL VPSLFVVYSLMISRRGRGLTRDQGFWLGLLMVVLSLAVPVLCVSRFQLILAVGMAAFTYI SMSGTFSLVYVGVLMACMIPAYLLLTVMRSHSVEYLNGIFEMKNSHMPIFITQPYMYIAN NYDNFNCLVEQLESHSLGLRMLFPVWALTGLKFLFPVLVSFPLFVTKKELTTLTLFYDAY YDFGIFGMVLLGALLGAACYFLVRRRRKMSCPAGHVIYAQIAMYMMLSFFTTWFSNPATW FYLAVTGIIYMYVKS >gi|157101632|gb|DS480692.1| GENE 133 133538 - 135259 2513 573 aa, chain - ## HITS:1 COG:BH3073 KEGG:ns NR:ns ## COG: BH3073 COG1080 # Protein_GI_number: 15615635 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Bacillus halodurans # 4 565 7 567 572 531 48.0 1e-150 MYKGTSASAGIGIGKAVIVEEVQLVITRQAVNDVEEEVQRFKGALDETIKATEKLAEDLA TRVGEKEAEIMQGHMMLLADPMLTGEIENSVRNDGVNSEYAIENVCNMYADMFASMGDEL MQQRATDMRDIKTRMQQMLMGIQSVDISSLPEGSIIVAKDLTPSMTAGINPANVTGIVTE LGGKTSHSAILARALEIPAVVAVTGFMDGVKDGDQVVLDGSTGEVYVNPDQAVTADYEEK RKQFLVEKKELEKYIGKPSVTKDGVQVEIVANIGKPEDVDKVLQYDGEGIGLFRTEFLFM DRNAMPTEEEQFEAYKKVAAAMNGKPVIIRTLDIGGDKEIPYMGLEKDENPFLGYRAIRF CLDRREDVYKPQLRALLRASAFGNIRIMVPLVTCIEEYRQAKALVEEIKKELDEKGIAYN KDIQVGIMVETAAASLIADIFAKEADFFSIGTNDLTQYTMSVDRGNKKVSYLYSTFNPAV LRSIKHIIACGREAGIMVGMCGEAASDPMMIPLLLAFGLNEFSMSASAILKARKMVTQYS VEELQAVADKAMSFATAAETEQYMRDFVEKVTD >gi|157101632|gb|DS480692.1| GENE 134 135319 - 135582 279 87 aa, chain - ## HITS:1 COG:MYPU_6030 KEGG:ns NR:ns ## COG: MYPU_6030 COG1925 # Protein_GI_number: 15829074 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Mycoplasma pulmonis # 8 81 8 81 87 63 44.0 9e-11 MVSKTTVIKNPSGLHLRPAGVLCKEALRYKSKISFKVRTTTANAKSVLSVLGACVKSGDE IEFVCEGPDEEEALEAMVGIVEAGLGE >gi|157101632|gb|DS480692.1| GENE 135 135606 - 136949 1593 447 aa, chain - ## HITS:1 COG:BS_ybbR KEGG:ns NR:ns ## COG: BS_ybbR COG4856 # 
Protein_GI_number: 16077244 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 77 435 87 438 483 71 23.0 4e-12 MEKKLSNNIGLKVLSVFLAFFVWLAVVNINNPEDSDSKEVPLDIVNEGILTSNGKTYELL TDKDTVTVSYRVRTLDKAGISASDFRAYIDLADIYEPTGAVPVKIEVKNNRRLLDTPVAK PGVIRVKTEDLQRKSFEIRAKVIGEAAEGYDKGTPSVSPNYINVSGPMSVVGQISAVGIV INVDGADSNLSDSAPVVCFDANENEIPTDERITFSRSSVDYTMPILKRKSLVLNFEPEGR VADGYRYTGIESSRPSVAVNGLKSDLANINSITIPSSELNMDGATGDKEVTLDISQYLPE GVYLSDESSKEVKVVIKVEALESRTFELKTSQIKRVGYHSDYEYQYDRDSVNVVIRGLKE DLDQLSAAALGAEMDVEDMVPGENSGEITFQLGDAYELVSYDRIQVTVTEKGPSADSTAA STDESQEGTESETSEGPASTSKAEPGQ >gi|157101632|gb|DS480692.1| GENE 136 136930 - 137799 1013 289 aa, chain - ## HITS:1 COG:BS_ybbP KEGG:ns NR:ns ## COG: BS_ybbP COG1624 # Protein_GI_number: 16077243 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 20 285 12 273 273 216 46.0 5e-56 MAEILVKVTSWLSLLPRISVGNILEIVIISFLVYELLVWIKNTRAWNLLKGLLVILVFVL FAAVLHLTTILWILENTTTIAVTALLVIFQPELRKALEQLGSRNYMSYLFSFDEGKDNQG FNDKTINELVRATFEMAKVKTGALMVIERGSSLKEIERTGIEINGLVTSQLLINIFEHNT PLHDGAVVIRGNRITAATCYLPLTDNMSISKDLGTRHRAAVGVSESTDSITIVVSEETGR VSVAEGGVLSRIPDAESLKKVLSNVKEETVSLNRLKIWKGRLKNGKKAV >gi|157101632|gb|DS480692.1| GENE 137 137831 - 138790 1250 319 aa, chain - ## HITS:1 COG:no KEGG:Closa_1247 NR:ns ## KEGG: Closa_1247 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 316 1 315 315 306 55.0 1e-81 MTSLLLWKEHLKRFYNRYSYGIQPVVRFLFALCTMVSLNTNIGYMAKLKSPLVILPVCLV SAFLPYGAIVFLAGCLMIAHVYAVSLEMALIAIVLILTIAILYYGFKPGDSYLMVLTPLA FLFKIPYVVPLLVGLGGSLASVIPVGCGVFLYYLLIYVKQNAGILAASDGSVDIVQKYSQ ILKSVLFNQTMMVMIATCAVGIIVVYLIRRLSMDYAWVVAIVAGAVAELLAIFVGDFVFG VSVPVGQMILGLVLSAAIAAVFNFFVLSVDYTRTEYVQFEDDDYYYYVKAVPKMTVSTPD VKVQKISTHKRGTRERNSY >gi|157101632|gb|DS480692.1| GENE 138 138787 - 139194 444 135 aa, chain - ## HITS:1 COG:no KEGG:Closa_1246 NR:ns ## KEGG: Closa_1246 # Name: not_defined # Def: 
hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 133 10 140 142 112 47.0 3e-24 MTGLAIFEKNEGRKIFPINRYFKKDYVGGQMFRSFFGYTFCYLLMLLLWILYKLDVLLGG MSIEELLECGRNWGILYVGGLAFYLFVTWIVYSKRYDYASRSQTMYASRLRHLVKKYDNK DRTDRTKDNRGGRVR >gi|157101632|gb|DS480692.1| GENE 139 139348 - 141489 1944 713 aa, chain - ## HITS:1 COG:slr1857 KEGG:ns NR:ns ## COG: slr1857 COG1523 # Protein_GI_number: 16330244 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Synechocystis # 33 713 4 707 707 700 50.0 0 MEDKGKKAAGPADGITWDKKTNCIDHLALPGTIHLVPMDYVGGYAVRPGFYEINGATAIP GGVNFTVYSHGATDIELLLFRRTEEEPYAVLPFPKHYRIGNVYSMIVFRLDIGEFEYAYR VDGPYEPEKGLIFDKTRYLLDPYAKAVTGQSRWGEPLPGCQHYKARVVRDDFDWADMAQP LTPMEDLIIYELHVRGFTMHGSSAVLHPGTFEGLVEKLPYLLELGVNVVELMPIFEFDEM QDYREVDGRKLYNYWGYNTVSFFSPNTSYAAGTEYNREGNELKNLIQVFNQHGIEVYLDV VFNHTAEGNENGPFFSFKGFDNNIYYLLTPEGYYYNFSGCGNTLNCNHPIVQQMILNCLR YWVTAYRIDGFRFDLASIMGRNEDGTPMSKPPLLQSLAFDPILGDVKLIAEAWDADGLYQ VGTFPSWNRWAEWNGRYRDDMRRHIKGDQGMAQAAALRIAGSRDIYADHDRKNASVNFIT CHDGFTLYDLFSYNVKHNESNGWNNTDGANDNNSWNCGTEGETDDPQVEALRRRMVRNAC ALLMCSRGIPMFLAGDEFCNTQFGNNNAYCQDNEISWLDWGRLGKYRDIFSFFQYMIRFR KTHRLVRANVSGGACGFPDVSFHGVKPWCSSFAEYERYVGVMFAGREAGKGPQTVYIASN AYWEELDVELPVLPDGMKWELAADTWEDTPCPGPLGADGFRIRPRTVVVLVGK >gi|157101632|gb|DS480692.1| GENE 140 141664 - 142185 542 173 aa, chain - ## HITS:1 COG:CAC3555 KEGG:ns NR:ns ## COG: CAC3555 COG0778 # Protein_GI_number: 15896791 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 3 173 1 172 174 127 36.0 1e-29 MDLLEIMRRRRSIRNYSGEEIPEESLTKVLQAGLLSASGKAKRPWEFIVVRQRKTLDDLS GCRAGGVKMLKEAQCAIVVIGDEKEQDVWIEDCSVAMANMHLMASSLGIGSCWVQGRLRN ADEITTEEYVRERLGFPENYKLEAILTLGMPDAEAPAHQLDELPMEKIKWERY >gi|157101632|gb|DS480692.1| GENE 141 142227 - 143210 863 327 aa, chain - ## HITS:1 COG:CAC2543_2 KEGG:ns NR:ns ## COG: CAC2543_2 COG2025 # Protein_GI_number: 15895805 # Func_class: C 
Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Clostridium acetobutylicum # 4 326 12 327 332 182 32.0 8e-46 MANGICVFAENYNGNLEPVVAELVRAAHCIKETTGEKIQAAVLAEDCKALVAQLEQLGFD EIYAVETDRNCLFQDDAVSQAMAEMLKRIDPSSVLIPATPTGRSVFSRVAARLGCGMTAD CTELAVDTREDGTFFIKQNKPSFGENVFVTIVTKEGVYPQMMTVRPGVYTPCEAQEGRRA DVTYYKDITLPESGIQVMSSAPAMNDTDSILAAEVVVVGGRGAESEENFTLLKAFAEKLE AAIGGTRPLADTGFIPFEHQIGQTGFTIRPKVCISLGVSGAIQHTEGIKDTKLFVAINQD ESAPIFNIADYGMTCDMKDVLEAYLKL >gi|157101632|gb|DS480692.1| GENE 142 143211 - 144005 885 264 aa, chain - ## HITS:1 COG:SSO2817_1 KEGG:ns NR:ns ## COG: SSO2817_1 COG2086 # Protein_GI_number: 15899533 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Sulfolobus solfataricus # 1 252 4 254 286 167 38.0 2e-41 MDIAVLIKQVPVSNDVSVDPETHALIRSSSEGMINPADLNAIEAAMVLKEQTGGKVAVFT MGPTDAEKALRDAMALGCDESCLITDRCLAGGDTIATAKVLARSIEKYGKFDLILSGALS ADGATGQVGAMVAEYMGIPHVAEIQGLSYDEDDKDCITAVKKYQGMEWKIKAALPALVSV CFGSNEPRLATLRSKRAAKNKPLEVYTNADLKMAADEIGLPGSPTMVTDSFQPESTRRAQ MLSGTPEELAAKLKELIEIEKGKE >gi|157101632|gb|DS480692.1| GENE 143 144029 - 145189 1171 386 aa, chain - ## HITS:1 COG:mlr1283_1 KEGG:ns NR:ns ## COG: mlr1283_1 COG0665 # Protein_GI_number: 13471340 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Mesorhizobium loti # 3 371 8 374 411 124 28.0 4e-28 MEKKRIGIIGGGISGTCLGYHLSLYDNAEVIVFEKDTIGGASTAKSAGTVCLFDDSLRNR YWDVRLYGFESYLKMEAEEKGSAGFDKTGTLVVATDEKVEATIKTGIALAKAAGYTGEYI TDKDRIKEILPDISTDNILGAGYTQDDGYFDGTMISNTYAKKMQKNGGVIKTGVKVTEVL KEGNKIKGLKTDDGQDYEFDYVVDCTGPWSKFTGEMVGMDVPIWHTKAEAFFLCPPGKKL GYTFPVLKYPAFYALRAGDNIFICKSHLSMDLSNPMHAGQWDPDKLPATGGTDDYFIDFL LDQLECRVPGIVDSGLVSSWLSYRAEPKDFLPILGETPVEGYILATGYGGNGVIEAPAAS RDLAKFIMRGEKTPLLEDWAFERLLK >gi|157101632|gb|DS480692.1| GENE 144 145232 - 146467 1388 411 aa, chain - ## HITS:1 COG:BH0761 KEGG:ns NR:ns ## COG: BH0761 COG0624 # Protein_GI_number: 
15613324 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Bacillus halodurans # 1 408 5 412 414 265 36.0 1e-70 MDINLDRVLGRLEELYQCGAQEDGTFTRMAYSPEDRKGREIFMDYFRKLGIEPEMDAAGN LIARLEGEDKTLPAIMTGSHLDTVPDGGRYDGALGCVAGLEVCQTLIESGRKLRHPLEIV VFTDEEGFRFGSGMLGSGAMCGQDLHVSEEDQDMYGQARGDVLKACGLKVADLGNARRSK DSVHCFLELHVEQGASLHKKQIPVGVVTSIAGVSRFEIKITGEANHAGSTVMEDRKDALV AAARFVARVPEIVAAYGNPFTVATVGTMKVVPNSVNVIPGEAFFHLEIRDQDEKVMETIE QKLRECLGEICEAMGEEYSFNRFSYHEPAPMTEWVKDAIEASVKELDIPYTKVPSGAFHD SLLMTTVFPTGMIFVPSVGGISHSRYEFTEGRDIGQGCRVLLETVLRVDDI >gi|157101632|gb|DS480692.1| GENE 145 146682 - 148352 1915 556 aa, chain - ## HITS:1 COG:BS_yunI KEGG:ns NR:ns ## COG: BS_yunI COG2508 # Protein_GI_number: 16080295 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Bacillus subtilis # 3 548 1 530 531 120 20.0 7e-27 MGLTVRELLLAEEMKDIKVLAGDRGLDKEIKGVTIIEAPDIVKFIDGGEVLLTGLYAFRS CTVDEFRTYINELSRKSVSALVLKRGRKVENADTKIELLFAFAQEHNIPVLEVPFEVSFR DVMSLIMERLFNEEVTRLKYFKTTHDNFAALAIAPDSTGKGVDEILDVLSKLIRNPVAVY NQNLTCLAATDQAPREFAVSRDAVPFDPGTYSNYTYMRQKGEVPQCLIQVKMSFREKIYL VVTELNQPFGTMDCIAAESAITALRFEFSRQYAVTELEKKFQNDIIHNILNGKIHSMGEL QKNTSLLGVDINGSYRVIVFGMTNESMARGDFKAKVKDIDVLSDLIRSRISKAKIHNDLD KIVVIQEVEREQTQEEYRKSIKKIAEQVQADVARYNKYLKIKAGAGRVVDGIINLPESFK EANEAFMFVDVAGETPEDGGPQVMLFADLGIFKLLCQLSDPSMLLEYVPEGLQKLYNYKK PQRDDLITTLRAYLDRNLNLSKTAQDLYVHYKTAVYRIEKIEKLTGIDFDNANEVLAVRI GLVVYKMIENYNKDFI >gi|157101632|gb|DS480692.1| GENE 146 148465 - 149706 1387 413 aa, chain - ## HITS:1 COG:lin0541 KEGG:ns NR:ns ## COG: lin0541 COG0624 # Protein_GI_number: 16799616 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Listeria innocua # 9 408 7 406 414 295 38.0 1e-79 
MVIEEITGRIARDLEHLKQFTATPGNGCTRLPFTKEARKAVEYLKELMTESGLDVREDAA GNVIGVLKGEDSELPCVMMGSHYDSVYNGGDYDGIAGVICAIEVARLLLEEGIRPKRDFV VVGFCDEEGTRFGTGYFGSGAILGHRDVDYMKKFRDKDGISIYEAMKEYGMDPERIKEAV WEDGKIGCFLEAHIEQGPVLDTEGTELGLVDCIVGIQRYMVTVHGRADHAGTTPMDMRMD PVDAAAKVISRIADWAREKADGTVATVGYVNTIPGGMNIVADTAEFTVDIRSRNNDNIND ITNRIRKALERAVGEYGGSFDMENKLTITPVELSGEMLDIMEKECGERGYSCKRMLSGAG HDALEIGQVLPTVMLFVPSKDGRSHCPVEFTKYSDLAKAAVVMEKLAKEKVMK >gi|157101632|gb|DS480692.1| GENE 147 149754 - 149951 356 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939880|ref|ZP_02087227.1| ## NR: gi|160939880|ref|ZP_02087227.1| hypothetical protein CLOBOL_04771 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04771 [Clostridium bolteae ATCC BAA-613] # 1 65 1 65 65 87 100.0 2e-16 METEKFFKTKGGRLTVAFAAILAAFALILTGLKMQGDGACLLGFILLVAAMLYSPFKVYV LDRKK >gi|157101632|gb|DS480692.1| GENE 148 149987 - 151600 1565 537 aa, chain - ## HITS:1 COG:AF0806 KEGG:ns NR:ns ## COG: AF0806 COG1620 # Protein_GI_number: 11498412 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Archaeoglobus fulgidus # 11 521 4 523 544 146 27.0 9e-35 MNVPTNLFMWIMACLPIIVLLLLMIKFQWGATEAAPVGLAITIITGIVFYKADIRLLAAE SAKGIWSALIILLIVWTAILLYQVADEARAFLVIRNGMRKLLPNELLMVLALGWILESFL QGITGFGVPVAVGAPLLMGIGVVPVFAVIIPLLGQAWGNTFGTLAAAWDALAMSTGLVPG TPDYLAAAFWAGVFIWMWNVVIGLVICWFYGKGKAVRKGLPALLILSLIQGGGELLLTRV NTTIACFLPACLSLVALILIGRMKMYRQEWSVEDSRIMDRGAASGTSEETPDGMTLVQAF VPYILLTVVTMVVLVVPPVNRFLNQVSIGFSFPETSTGYGFVNQATELFSPLRPFTHASM FLFLSSIAGLVYFGRHGWIRPGGVKRVFVRSVTMSMPSGIAIIGLVIMSKIMGGTGQTSV LADGIARVLGKTYVILAPLVGMVGSFMTGSNMSSNILFGEFQVTTANLLHLNQAPILGAQ TAGGSIGSAISPSNIILGTTTAGILGSEGQVLKRIIPFTTIMTLAIGIIVFLSVAVS >gi|157101632|gb|DS480692.1| GENE 149 151615 - 152550 1169 311 aa, chain - ## HITS:1 COG:yqeA KEGG:ns NR:ns ## COG: yqeA COG0549 # Protein_GI_number: 16130776 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 4 309 3 308 310 324 57.0 1e-88 
MDKKRVVIALGGNALGKDIEEQKQAVAKTAEVIVDLVQQGMDLIITHGNGPQVGMIQNAM DQLACSYENYKETPLPTCVAMSQGYIGIDLQNAIKYELYKRNMDVKVSTILSQVEVDPED EAFKNPTKPIGRFLTEEEARKNMENGIPCMEDAGRGYRIVVASPMPMKIRELKTIETLVD AGHIVITCGGGGIPVVNDNGRLSGVNAVIDKDNASSLLAAELEADYLIILTAVEKVAINF GRENQEWLSDLTVDKAKEYIAQEQFAKGSMLPKIEAAIRFAQSGEGRRTLITLLDKAAEG IAGSTGTVIHK >gi|157101632|gb|DS480692.1| GENE 150 152577 - 155585 3242 1002 aa, chain - ## HITS:1 COG:ECs0580 KEGG:ns NR:ns ## COG: ECs0580 COG0074 # Protein_GI_number: 15829834 # Func_class: C Energy production and conversion # Function: Succinyl-CoA synthetase, alpha subunit # Organism: Escherichia coli O157:H7 # 1 580 1 555 555 403 42.0 1e-111 MLKTIVKKGSYHDSVVLMLLTNQISSIEGVKKVSIMMATPANKDIYRQSGLSTSELEEAS ANDMVVVADVDDDGLLDVIMEEVEAFFKKQSTSESDKKGAEAVRSWDKALGKLADANLAV ISIPGAYAALEADRALDEGMNVFMFSDNVTLEDEIKLKKKAHEKGLAVMGPDCGTGIIQG VPIAFTNHVAPGPIGIIGASGTGIQELTTIIDRLGEGVKNAIGTGGRDLSTEVGGITMMD MIEAMEKDDTVKVLIIISKPPAKAVRDRISDRLSNFKKPVVTLFLGEKPEYHEENFYHAY TLDEAARLAVGLVRGQDIAEGSVEVDSSSFFAAEEKKTIKAYYSGGTLAGEAAMLIKDAL NLKVPPQKAEGFMLKTDGHIVVDLGDDVYTQGKPHPMIDPAKRIECMQEAIDDGSTGVIL LDIMLGYGSHEDMAGALLPAIIRLRDKAREEGRKLFFVATVCGTRRDFQGYDKAVETLKE AGVIVCENNKLAVHTAIRAIGLDFEEPVKEIRPKTTARIQKTEASEKLLELLSQKPRVIN VGLKSFAEVIEAFGCDVVQYDWAPPAGGNVELIKVLNFLRSCREMNIDEANRSVIAKVVA SQPVIRDVVPAKSVIKELNEGKVILHAGPPIRFENMPDPVQGSCVGAALFEGWAATEEEA RKILASGEVTFIPCHHVKAVGPMGGITSANMPVFVVENQTDGNEAYCTMNEGIGKVLRFG AYSKEVVDRLLWMKDVLGPTLGRAIRSIGGLNVNPLVAKAIAMGDEFHQRNIAASLAFLK EVSPVITKMDMDEKDRYDVIKFLADTDQFFLNIMMATGKAVMDAARQVTDGTIVTAMCRN GVEFGIRISGMGDEWFTAPVNTPQGLYFTGYDGEDACPDMGDSAITETFGVGGMAMIAAP AVTRFVGAGGYEDALRTSTEMTEITIDRNPNFIIPNWNFQGTCLGIDARLVVEKGITPVI NTGIAHKVAGYGQIGAGTVHPPIACFEKAIAAYAKKLGFSLD >gi|157101632|gb|DS480692.1| GENE 151 155601 - 156653 1028 350 aa, chain - ## HITS:1 COG:ylbC KEGG:ns NR:ns ## COG: ylbC COG2055 # Protein_GI_number: 16128501 # Func_class: C Energy production and conversion # Function: Malate/L-lactate dehydrogenases # Organism: Escherichia coli K12 # 1 350 1 349 349 409 55.0 1e-114 
MKLSHDELKGLMKNKLMQAGLKADHADMTADVLTWSDERGYHSHGAVRVEYYSERIAKGG ITVDPKFEWRETGPCSAVFEGDNGCGYVASVLAMEKAIEMAKKSGIAVVGIRNISHSGSI GYYTEMAAKADLVAIACCQSDPMAVPYGGIEPYYGTNPISFAAPTADERTVVFDMATTVQ AWGKILDKRSRHESIPDTWAVDAKGRPVTDSRLVNALVPIEGAKGYGLMMMVDIFSGVLL GVPFGKHVSSMYHDLSAGRELGQMFIVMDPSRFVGLDAFKASMSQVLDELGEMPAAEGFD KVYYPGERAVMRRDKAYATGGVEIVDEIYEYLKSDKVHFDKYDHKNRFAE >gi|157101632|gb|DS480692.1| GENE 152 156681 - 157469 856 262 aa, chain - ## HITS:1 COG:STM0526 KEGG:ns NR:ns ## COG: STM0526 COG3257 # Protein_GI_number: 16763906 # Func_class: R General function prediction only # Function: Uncharacterized protein, possibly involved in glyoxylate utilization # Organism: Salmonella typhimurium LT2 # 1 262 1 261 261 294 55.0 1e-79 MSYLNDQVGYREELLATRSVIKKENYVLLEPDGLVKNSIPGYENCDVTILGSPAMGATFA DYLVTVKEGGKNSGIGGEGLETFIYVLAGEVVVKNADKEESLTEGGYIFSPESVKVSFEN KSGKEARLYVYKRRYEKIEGYSAFTVVGNANDIPWTEYEGMDNCHIKDFLPAAGNFGFDM NMHILKFKLGASHGYVETHIQEHGMYFLSGKGMYRADNDWVPVKAGDYMFLDAYCPQACY AVGREEDFAYIYSKDCNRDVQL >gi|157101632|gb|DS480692.1| GENE 153 157702 - 158571 643 289 aa, chain + ## HITS:1 COG:SA0310 KEGG:ns NR:ns ## COG: SA0310 COG0657 # Protein_GI_number: 15926023 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Staphylococcus aureus N315 # 1 272 1 267 275 98 28.0 2e-20 MTEKREINLEHPAYPKYATVYSEKDTDAKACILYFHGGGLLYGSREDLPRRHIDTLTRAG YIIVSYDYPLAPAARLDTIMGDVCSSITHYINHPEIYCGRKLPFFLWGRSAGAYLCLLAG AKGNLPAVPAGILSYYGYGFLCDHWYETPSGFYCSLPAVDASCLEQAGADLHTAGSLDTH YSIYVYARQTGSWRSLLYEGRDKYFYLDYTLRACTALPCPLFCVHGTRDNDVPYGEFLEL SNRFRPWQFIASCDGHDFDRDEENPFTEKLLDETLAFLEENLKLSPAQC >gi|157101632|gb|DS480692.1| GENE 154 158575 - 159939 1608 454 aa, chain - ## HITS:1 COG:RSc1233 KEGG:ns NR:ns ## COG: RSc1233 COG1953 # Protein_GI_number: 17545952 # Func_class: F Nucleotide transport and metabolism; H Coenzyme transport and metabolism # Function: Cytosine/uracil/thiamine/allantoin permeases # Organism: Ralstonia solanacearum # 18 435 24 490 502 145 27.0 1e-34 
MVENLNLNPELVKNVDPDLQPTKKRIMGPLSYAASFMGGCVSIGTFSMGAGLIGALTVGQ AILAMVIGCLVIAVALVIIGNCGHKYGIPYTVQLRSSFGTTGVKVPGLLRGIPAIIWFGF QSWVGAGAINSCLSILFGFSNLPLVYALFTMLQVALAIKGFEGIKWLENISCVFIIAILA YMLYVVKTQFATEIDDVFSGIKGTWGMPFWAATTSFLGIYSTMIINASDYSRNLEGKVGP TFTGSIYTVAILPVTLFMGLIGLLVTAATGNSDPVVVFSTTMGSKFLTVVTLLFIAFAQV TTNVLNNIVPPSYVLMESFHMKWSHATILVGILSACCMPWKLVTDDSAAGLSLFTQFYSA FLGPIFAVMAVDYYILRKKKLNINHMYDKQGVFKGINWAAIIAIVIGSLCSLVIVQLSWY VSLIPTGLVYYFLMKNMKSAKSFRTGTIFQEQEG >gi|157101632|gb|DS480692.1| GENE 155 159990 - 161363 1422 457 aa, chain - ## HITS:1 COG:ybbX KEGG:ns NR:ns ## COG: ybbX COG0044 # Protein_GI_number: 16128496 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Escherichia coli K12 # 2 448 3 445 453 288 36.0 1e-77 MYDLLVKNGRIVTAQAVTEGNIGIKDGRIAALLEAGQEPEAAKVIDAKGKYVFPGAIDTH AHLNDPGYEWREDYEHGTAAAAMGGYTTIIDMPLQNEPAMTNAAIFDRKVEKVDCNAYTD YCFWGGLVPDNFDELEGLHEKGCVAFKSFIGPVSPDYSSLNYGQAYEAMERIKEFGGRAG FHCEDFSMIKWQEARMKREGRMDWQGFLDSRPVIAEMVATVDMIEIAKATGCKVHICHVS SPEVAQKIKEAQQAGYDVTAETCSHYLSLTDQDVLANGPLFKCAPPLRSKEDVDRLWAYV EDGTFSGIASDHSPCSYDEKFKEILGNKIENVFDVWGGISGIQSGFQVAFNEGCVKRGVD PRVLARAMALNPAKAFGIYGKKGDILPGFDADLVIIDPEKEWEITSESLLYVNKISAFVG MKGRGLPVCTIIRGNVVAEDGKITAEKGVGTLVRRLS >gi|157101632|gb|DS480692.1| GENE 156 161380 - 162741 1508 453 aa, chain - ## HITS:1 COG:RSc1233 KEGG:ns NR:ns ## COG: RSc1233 COG1953 # Protein_GI_number: 17545952 # Func_class: F Nucleotide transport and metabolism; H Coenzyme transport and metabolism # Function: Cytosine/uracil/thiamine/allantoin permeases # Organism: Ralstonia solanacearum # 13 427 23 460 502 132 26.0 1e-30 MSDVKNEVNKQTDSLAPIQDKDRIMGWGSYLMLWLGGCISIGTLTMGSAQLDKGLNLMQV FLAVLIGSTILVIGICANDRFSYKSGAPYAVQLKSAYGTKGNIVPVMIRGLPAIVWYGFQ TWLGGSAINQISIAIFGYDNVVLFFLLFQVVQIILSVKGFHGIKWVENIGGVVIVSAMLY MLFVCVTQYWGVIGDKLISRKGSWGMPFIAAIIAFFGNSTTVMLNAGDYSRELKGGFSVG KRGVAYFIAMVPTTVLLGLIGAMASTATGIANPINAFSQMVHNKVLLVATLGFILFAQMT 
TNLASNVVPPAYAFMDAFKMKHRTAVILVGVLAVCTCPWILTNDSSAAGLDLFVKIYTAF FGPIFAVLITDYYILHRGKMEGAALDDLYDDNGTHSGINWAAIIATAVGAVIGLVNVDIS FFTATIPTGLVYYFCMKHMPSCQRFRKGTSLEK >gi|157101632|gb|DS480692.1| GENE 157 162772 - 164181 1513 469 aa, chain - ## HITS:1 COG:STM0523 KEGG:ns NR:ns ## COG: STM0523 COG0044 # Protein_GI_number: 16763903 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Salmonella typhimurium LT2 # 2 446 3 432 453 249 34.0 1e-65 MYDLVITNGKLVTPDRIYEADIAVKDGVYAAFLAKGSQAAAKEVIDAKGCYIYPGIIDCH AHLNEPGFEYREDFETGSRAAAAAGCTCLIDMPLNNDPSLMNKEIFDLKMGRISKHSYVD FALWGGIVGDYDNQPGSVKNNMADLVDLHKCGVAAFKGFTCPNGDLFPTVNMGNVRKALE ILKPYGALCGFHCEEFGQVLEREKEAKAKTGRTSEEKIRDFLDSHDVWTEYVATKNVIDM ARATGGRVHICHVSHPMVAQVVKDAIHEGLPITAETCPHYLGFTEDFVFEKGAPAKCTPP MRRKEDMERLWDYVLDGTLSCVGSDHSPAADEEKDNSTKDIWHAWGGLNSIQFFLPMMFD MVVNKKGLSPTLIARVMDYNPARVFGLYGRKGAFELGFDGDIVIVDPEKAWKCSQNELLT KGHVSCFDGLEGKGAPVCTIIRGKVVASGGTYREDALGYGEYVTPVHDC >gi|157101632|gb|DS480692.1| GENE 158 164423 - 165298 402 291 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939892|ref|ZP_02087239.1| ## NR: gi|160939892|ref|ZP_02087239.1| hypothetical protein CLOBOL_04783 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04783 [Clostridium bolteae ATCC BAA-613] # 1 291 1 291 291 548 100.0 1e-154 MYPTITAIAAYAAQQLEECRLWTVHSVYKKTVNLQAGERLLALQAASSPMSPISIITDMD VRAFEVLPVKPGQQVSIAGSRIIISSPEHTVSFGCLPKKVCCTRLTCPPGPFPCKELLSQ IRQAVASSQAGIFRLLFCDSTEGQAITGKEEYSLILAAARSRILQCSECLEQHLYGDAAK HLAGLIGLGIGLTPSGDDFLCGVLAGLILRGEESHPFAHALCQEITSRRFDTNDISRAFL DCSVQHHFSLAVNSLTDLPSADRILNVFSAIGHSSGIDTLCGVAYGLSAEV >gi|157101632|gb|DS480692.1| GENE 159 165405 - 165995 493 196 aa, chain - ## HITS:1 COG:CAC2640 KEGG:ns NR:ns ## COG: CAC2640 COG0740 # Protein_GI_number: 15895898 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # 
Organism: Clostridium acetobutylicum # 1 195 1 192 193 263 66.0 2e-70 MSTIPYVIEQTGRGERSYDIYSRLLSDRIIFLGEEVSDTSASLIIAQLLFLESTDPDKDI QLYINSPGGSVTAGWAVYDTMQYIKCDVSTICLGLAASFGAFLLAGGAHGKRMALPNSEI MIHQPAIHGNGIQGPASDIKIMSDYMQKSRQRLNRVLAVNTGRSEEEIARDTERDNFMSA EEALEYGLIDSILTRR >gi|157101632|gb|DS480692.1| GENE 160 165992 - 166414 223 140 aa, chain - ## HITS:1 COG:BH3634_1 KEGG:ns NR:ns ## COG: BH3634_1 COG2207 # Protein_GI_number: 15616196 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 6 106 5 105 132 101 47.0 4e-22 MKNQKHTVRKVIDYIEEHLEEKLNLEQIAEHAGYSRFHLNRIFMEETGGTIHKYVQERRL TMAAEKLVDTDMDITQIAYDAGYQSQQAFSLAFKQVYLCPPGAYRKRGIYLPAKKRIQMH QASCFTIYGVCDRGMGGIAA >gi|157101632|gb|DS480692.1| GENE 161 166444 - 166584 59 46 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939895|ref|ZP_02087242.1| ## NR: gi|160939895|ref|ZP_02087242.1| hypothetical protein CLOBOL_04786 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04786 [Clostridium bolteae ATCC BAA-613] # 1 46 57 102 102 91 100.0 2e-17 MPDHIHMLVSIPPKYAVSEVVGYLKGKSSLRYKTHFLLRQSGSLLL >gi|157101632|gb|DS480692.1| GENE 162 166957 - 167361 291 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939896|ref|ZP_02087243.1| ## NR: gi|160939896|ref|ZP_02087243.1| hypothetical protein CLOBOL_04787 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04787 [Clostridium bolteae ATCC BAA-613] # 1 134 1 134 134 256 100.0 4e-67 MKSITNHCDKVTDLTLFAEVEAKYSIRIRDDIKEFLTANSGGYPVKDLVKVDGEEYEIRV FLSLDCANKNYYIEKPLTFFLDNTKGKIVPIGLDSGDNYYCVNNETGKVYYWSASDNQYY CIGKSIAEFTEYFE >gi|157101632|gb|DS480692.1| GENE 163 167361 - 168122 429 253 aa, chain - ## HITS:1 COG:no KEGG:BALH_0828 NR:ns ## KEGG: BALH_0828 # Name: not_defined # Def: hypothetical protein # Organism: B.thuringiensis_AlHakam # Pathway: not_defined # 126 247 77 197 197 101 47.0 4e-20 MNEFKEVSETLKTKLSESTLGKQLSGIKIFDLGDEMDRPLQEAPQEDAEVESPENNRSEA 
DTPESDAQTDNAEKKTRELTEEERQELKDKLGWSDEKLNKCTIDEDGVIHYKTDRSDLEG KTSENGVPYERKIIEINGVKIEGVFPKFDSAFDTALDPDNLKTKAYAKECNAKLKEAIEN DPELRSKFTEEQLKDIEEGRTPTGYVWHHNEEPGKMQLVKREDHDRAIGGAAHTGGNSLW GADSVDRKGGESF >gi|157101632|gb|DS480692.1| GENE 164 168119 - 170593 382 824 aa, chain - ## HITS:1 COG:SSO0283 KEGG:ns NR:ns ## COG: SSO0283 COG0433 # Protein_GI_number: 15897226 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Sulfolobus solfataricus # 270 617 222 525 542 69 25.0 3e-11 MDTAGVLIKTGGAIALGALGVVTAPLTGGASVVAAGAIMAGQIGLSAFNPSTKSKGTSTS ETDTKNESETKGVAFGNNESQSENETHTRGITSGTSDNMQLTMQNKTLLNTLERIDLQLK RIDECESLGMWECAAYFLSSSQETAEMAAGTYKALMKGSKSGLETSAVNYWGRQDEAKLP ILRDYITNFIHPVFAYRSTENTEYLPVTPSSLLSSNELAIQMGLPRKSVCGFPVIEHADF GKEVVSYGKAKSMRTICLGKVFSMGTEMPTEVNLDVDSFTMHTFVTGSTGSGKSNTVYEL LNQLHSIYGTHFLVVEPAKGEYKNVFGQYADVTVYGTNPKKTSLLKLNPFRFPADVHILE HLDRLVEIFNVCWPMYAAMPAILKEAMEKAYVSIGWDLVSSENSKGAKYPNFADLLEQIE VVIEESKYSADSKGDYSGALLTRVRSLTTGLNGMIFCNDDLTDEQLFDRSVVVDLSRVGS TETKSLIMGLLVMKLNEHRMATGKANSPLTHITVLEEAHNLLKRTSTEQSAEGSNLLGKS VELLANSIAEMRTYGEGFVIADQSPGLLDMSVIRNTNTKIILRLPDKGDRELVGYSAGLN DEQIDELSKLKRGVAAVYQNDWVEPILVQVNRCNLSEREYSYEADTQAMDKNALKEQLVN FLIQGRLSEKLKFSLKEIADNLDKIGLSSTNADFVDELIDEYNKTGTLSIWEDENFRKLA RRVTDILGVRTRVENCVLSAADDSELTSMLERIVKQFFPDASNSKVLALSQAFMKDMSVQ QEESEIRRKLYVKWVSFIKEGGVRCLQKTATSNQCCLLRKEEIK >gi|157101632|gb|DS480692.1| GENE 165 171484 - 172050 238 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939899|ref|ZP_02087246.1| ## NR: gi|160939899|ref|ZP_02087246.1| hypothetical protein CLOBOL_04790 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04790 [Clostridium bolteae ATCC BAA-613] # 1 188 9 196 196 281 100.0 1e-74 MALTNAFYEAVKDDNVRRVRIMMKDSLLVDPTFTDFNNMVKAASSMTGLYDAHDGRAFIS DEQEWNDEYMDKLMVQVVGNFSHERIDHLKDVVRYLRPVHKSGSTSSHHTEYNSEHRTSG NSYQEQKYRDLQNGDYRGAKIAAGAVAGAVVGGVVASVAGVTVVGGVAVGALVGGAAVTV TVNGGKRG >gi|157101632|gb|DS480692.1| GENE 166 172064 - 174355 596 763 aa, chain - ## HITS:1 COG:no KEGG:Pjdr2_1837 NR:ns 
## KEGG: Pjdr2_1837 # Name: not_defined # Def: hypothetical protein # Organism: Paenibacillus # Pathway: not_defined # 1 763 1 752 752 461 37.0 1e-128 MKKVFIKYNPYNLETEITVDGKKLAQNSKIGERIVPGTRLQEWIEDLPRILVDEYNDKDF DITFHGTLLDFEDLTEVFTQAYERGELTAKLDRIPAKETSDKEALIDDVFKAIQEGPFDE LRDVEILSAFKHARSKDFEVCVVATMSAGKSTLINSMLGTKLMPSKQEACTAIITRIKDT EGTSFWQAEVYNKDNMLVETHENLTYETMDRLNGDETVSVIKAAGDIPFVSAEDVSLVLI DTPGPNNSRDPEHKKVQSEFLNKSSKSLVLYIMEGTFGSDDDNALLQRVADSMKVGGKQS KDRFIFVVNKMDDRRREDGDTEQTLERIRAYLRNHGISNPNLFPAAALPALNIRLQQSGI EMDEDTIDETEMKVRKLNRNASLHFETYASLPASIRGTINKQLAQAAEKSDAEAEALIHT GVVSIEAAIRQYVQKYAKTAKVKNIVDTFMHKLDEVGCFEKTKQELARNRKDGERIAKQI ETIRKKVDDAKSAQKFKDAVDDAVVRVNDSARDAIEAIIQKCQAKIRKKIDDLRGEELDI DEVSDEVERLSRFAKKLEPDFQSELDELIRESLLKTSNALLAEYKKKLASLTEEIDPSTL AGISIDPLKLMGGSVVSADDFSVKKLVQSKKVEDGEEYVENTEKKWYKPWTWFQEKGYYR TKYKTVKYVAADALAQEFLTPINEVVFDNGELARKHALKQSNHIASAFNKEFQRLDVILK KKLSELESYATDQEKAAERVKETERRLAWLEEIKARVASILEI >gi|157101632|gb|DS480692.1| GENE 167 174352 - 175764 292 470 aa, chain - ## HITS:1 COG:no KEGG:EAT1b_1094 NR:ns ## KEGG: EAT1b_1094 # Name: not_defined # Def: hypothetical protein # Organism: Exiguobacterium_AT1b # Pathway: not_defined # 200 465 5 262 269 153 33.0 2e-35 MIETTLYFSEAIDFSGHPLSNAKSKKRVQYYTALSLVSRYIADSLKNNVMDGNTNSGEKD GEQIVVLGTSISLTEFKMYLEQRLQQYSDFLCEGKENFIHDGLVDKKEVIHAVSGNVSLP WKRKTRFWLLADLALILLDDGLIAKASSLIKECLGNGAKTSIDDFLNVLLDNQNGAELFP FADNLIKQYHENLLFSKKSVKKLVVTANMSAGKSTLINALVGKSVARTSQEVCTGNVCYI SNKAFEDNCISLATPELSLCATQEKLNDFSWQGSIAIAAYFTGTEAINNSWCIIDTPGVN AALYKNHSKIARQVIKKQSYDRLLYVISPTNLGTDAEIRHLKWVSENVDQNKVVFILNKL DDYRSESDNISESMKALRNDLTKLGFTAPVICPISAYFGYLLKLKLTHQPFTEDEQDEYD LLSKKFSKQQFDLSVYYDNSNENDSDTTEIKLCKKSGLYGLEKIIYGGTV >gi|157101632|gb|DS480692.1| GENE 168 175810 - 176760 244 316 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20463 NR:ns ## KEGG: EUBELI_20463 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 316 14 329 329 400 62.0 1e-110 
MENFKRDKIERVLGIYTKLLNGHVIDKSAESVNYGVNERSIQRDIDDIRNYLEEDGEKVG CINSVVYDRGEKGYRLEQIYKMKFTNSEVLALSKILLDSRAFTKKEMVQLLDKLISCCVP ETNQKLVTDLISNEEFHYVEPRHKTEFLDTMWDIGQAIRHCQCIEVDYLRTKDKAVVKRK LLPVAIMFSEYYFYLTAFIDDENVKKDFDVLNDSFPTIYRIDRIKGMKLLDEKFHIPYSS RFEEGEFRKRIQFMYGGKLQKVKFQYSGIDVDAVLDRLPTAQILSDDNGSYILSAEVFGK GIDMWLKSQGDNITLL >gi|157101632|gb|DS480692.1| GENE 169 177285 - 177599 247 104 aa, chain - ## HITS:1 COG:no KEGG:ELI_0209 NR:ns ## KEGG: ELI_0209 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 102 1 126 127 119 50.0 4e-26 MNKLKSRIHDDSNGLDYELMGDYYIPALEVPEEHRPIGRWGRLSKPDLNEQAESRLETII RQMKEAEGMRDELKTSEQLVWVQRMNNIRNRAEEIVLTELIYTI >gi|157101632|gb|DS480692.1| GENE 170 177611 - 177838 231 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939905|ref|ZP_02087252.1| ## NR: gi|160939905|ref|ZP_02087252.1| hypothetical protein CLOBOL_04796 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04796 [Clostridium bolteae ATCC BAA-613] # 1 75 1 75 75 157 100.0 3e-37 MSGNAWVFSDQIDDEDMEFMTHEFVTYNMACDYYGLGLKPVTRLSHECGAVYKIGKKVLI RRSIFEAYLREQRKG >gi|157101632|gb|DS480692.1| GENE 171 177828 - 177974 154 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939907|ref|ZP_02087254.1| ## NR: gi|160939907|ref|ZP_02087254.1| hypothetical protein CLOBOL_04798 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04798 [Clostridium bolteae ATCC BAA-613] # 1 48 1 48 48 84 100.0 3e-15 MFELASDAGAIYRFDSTVLINKEIFDEYLERFHEPATKDYKGGDVDER >gi|157101632|gb|DS480692.1| GENE 172 177993 - 178196 221 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939906|ref|ZP_02087253.1| ## NR: gi|160939906|ref|ZP_02087253.1| hypothetical protein CLOBOL_04797 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04797 [Clostridium bolteae ATCC BAA-613] # 1 67 40 106 106 119 100.0 6e-26 MLNQYEDIITVEELCEILKTGKNRVYELLQTNQKRAFQLWRNWKIPKISVKHYLEIQSGI SKSQIKK >gi|157101632|gb|DS480692.1| GENE 173 178404 - 180368 2338 654 aa, chain - ## HITS:1 COG:CAC3197 
KEGG:ns NR:ns ## COG: CAC3197 COG1190 # Protein_GI_number: 15896444 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 9 494 21 509 515 557 57.0 1e-158 MGEQQKVNQADTDLNHVLKARRDKLAELQTAGKDPFQITKYDVTAHSMDIKDSYEQWEGK EAAIAGRMMFKRVMGKASFCNVQDLKGSIQVYVARDSVGEESYKDFKKMDIGDIVGVKGT AFTTKTGEISIHATEVTLLSKSLQVLPEKFHGLTDTDMRYRQRYVDLIMNAEVKDTFIKR SRILAAIRKYLGGEGFMEVETPMLVSNAGGAAARPFETHFNALDEDLKLRISLELYLKRL IVGGLERVYEIGRVFRNEGLDTRHNPEFTLMELYQAYTDYNGMMDLTEKLYRFVAQEVLG TTTITYKGVEMDLGKPFARITMVDAVKKYAGVDFDQIHTLEEARAVAKEKGVEFEGRHKK GDILNLFFEEFAEEHLVQPTFVLDHPVEISPLTKKKPGNPDYVERFEFFMNGWEMANAYS EINDPIDQRERFRAQEELLAQGDEEANHTDEDFLNALEVGMPPTGGIGFGIDRMVMLLTD SPAIRDVLLFPTMKSLGGSGDSKKADMASARTAAEAGDTAEKVQVIQEKIDFSKVKIEPL FEEMVDFDTFAKSDYRAVKIKDCVAVPKSKKLLQFTLDDGSGTDRTILSGIREYYEPEEL IGKTAIAIVNLPPRKMMGIDSCGMLISAVHEEDGKEGLNLLMVDGRIPAGAKLY >gi|157101632|gb|DS480692.1| GENE 174 180393 - 180875 658 160 aa, chain - ## HITS:1 COG:CAC3198 KEGG:ns NR:ns ## COG: CAC3198 COG0782 # Protein_GI_number: 15896445 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Clostridium acetobutylicum # 3 157 4 158 158 127 52.0 6e-30 MADKKHILTYAGLKQYEDELQNLKVVKRKEVAQKIKEAREQGDLSENAEYDAAKDEQRDI ELRIEELEKLLKNAEVVVEDEIDLDKINIGCKVKLLDVEYDEEMEFYIVGSTEANSLQNK ISNESPVGRALIGKVVGECVDVETQAGDIQYKVLEIQRVS >gi|157101632|gb|DS480692.1| GENE 175 180981 - 181946 596 321 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase [Haemophilus influenzae 3655] # 2 315 27 347 353 234 40 3e-60 MKLRIGNTVLENNVILAPMAGVTDLPFRVLCREQGAGCVVTEMVSAKAVLYNNKNTRELL QIDPAERPAAVQLFGSEPDIMAEIAARLEEGPYDYIDVNMGCPVPKIVNNGEGSALMKNP ERAKEVLTAMVKAVKKPVTVKFRKGFNDLSVNAVEFAKMAESCGVAAVAVHGRTREQYYS GKADWDIIRQVKEAVRIPVIGNGDIFTPEDAGRMLKETGCDGIMVARGAKGNPWLFGRIN HYLDTGEVLPGPSMAEIKAMILRHGRMLVQFKGEGVAMREMRGHMAWYTKGMPHSATLRN EINQVETLEGFVELLDRKIQS >gi|157101632|gb|DS480692.1| GENE 176 181947 - 
182348 484 133 aa, chain - ## HITS:1 COG:no KEGG:Closa_1095 NR:ns ## KEGG: Closa_1095 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 20 132 6 115 118 153 72.0 2e-36 MESPPPVTGSGLLWRDEMKDTTVLCGANSYEQKYYFNQDFKSLPQSIRDELQIMCVMYTE DVGGVLTMEFDEKGNLEFRVTSEETDYLFDEIGSVLKIKQYQEEKKEMLEALELYYRTFF LGEDIDIEDDGED >gi|157101632|gb|DS480692.1| GENE 177 182380 - 183354 1376 324 aa, chain - ## HITS:1 COG:CAC0517 KEGG:ns NR:ns ## COG: CAC0517 COG0205 # Protein_GI_number: 15893808 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Clostridium acetobutylicum # 5 322 1 317 319 359 61.0 4e-99 MAKAVKTIGVLTSGGDAPGMNAAIRAVVRTAINKGLKVKGIMRGYAGLLEEEIVDMESTS VSDIINRGGTILYTARCKEFTTAEGQQKGADICRKHGIDGMVVIGGDGSFRGAGKLSSLG INTIGLPGTIDLDIACTDYTIGFDTAVNTAMEAIDKIRDTSTSHERCSIVEVMGRNAGYI ALWCGIGNGAEDILLPERYDGNEQALINRIIDNRRRGKKHNIIINAEGIGHSGSMAKRIE AATGIETRATILGHMQRGGTPTCKDRVYASIMGAKAAELLAAGKSNRLVAYKHGEFVDFD IQEALNMTKDIPEEQYEIAKMLIR >gi|157101632|gb|DS480692.1| GENE 178 183447 - 186932 4259 1161 aa, chain - ## HITS:1 COG:CAC0516 KEGG:ns NR:ns ## COG: CAC0516 COG0587 # Protein_GI_number: 15893807 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Clostridium acetobutylicum # 3 1150 11 1167 1167 1153 52.0 0 MAFTHLHVHTEYSLLDGSCKIRELTARAKELGMDSMAVTDHGVMYGVIDFYKAAREAGIK PILGCEVYVAPGSRFDRETVSGEDRYYHLVLLAENNKGYENLMKIVSKGFVDGFYYKPRV DYEVLRTYHEGVIALSACLAGEVQRCLTRGMYEEAVRASEKYVDIFGKDNFFLELQDHGL PDQKLVNQGLMRMSRDTGLKLVATNDIHYTFAEDEKAHDILLCIQTGKKVADEDRMRYEG GQYYCKSEEEMRALFPYAPEAIENTHMIAERCNVEIEFGVTKLPKYEVPEGFDSWSYLNH LCREGFKTRYPDDDGTLSSRLDYELDVIRSMGYVDYFLIVWDFINYARSQDIIVGPGRGS AAGSIVSYTLGITNIDPVRYNLLFERFLNPERVSMPDIDIDFCYERRQEVIDYVVRKYGK DQVVQIVTFGTLAAKGVVRDVGRVLDLPYSLCDTIAKMIPGDLGMTLDKALTMNPDLKKA YNEDEQVHYLIDMSRRLEGLPRHTSMHAAGVVIGSRSIDEFVPLSRASDGTITTQFTMTT IEELGLLKMDFLGLRTLTVIQNAVRLAEKDYGITIDIDHIDFDDKKVLESIGTGRTEGVF 
QLESAGMKSFMKELKAENLEDIIAGISLYRPGPMDFIPKYLKGKNDRASITYDCPQLEPI LSPTYGCIVYQEQVMQIVRDLAGYTMGRSDLVRRAMSKKKTSVMEKERQNFVYGSQSEGV KGCVNNGIDEKTANHIYDEMIDFAKYAFNKSHAAAYAVVSYQTAYLKYYYPREFMAALMT SVMDNVSKFSEYVLTCRRMMGIPVLPPDINEGESGFSVSGDAIRYGLSAIKSVGKPVTEV ILAERERNGPFRSMEDFVGRMSMKEVNKRTLENFIKSGALDTLPGTRRQKMAVAPTMLEE KAREKKSMFEGQLSLFDIAGEEDRNEFQITFPDVGEYSKDELLAFEKEILGVYISGHPLD DYEDTWRRNITATSAQFIVDEETDAAGVADGSRAVIGGLVAGKTVKTTRTNQLMAFITLE DLLGPVEVIVFPRDYEANRDLFTEDSKLFIRGRVSIGDDPVGKLVCEQVIPFDSIPKELW LQFPDMETYQSGEQRLLTSLRSSEGKDRVVIYLQKERAKKVLPANWNVQVTPGLLEELGR GLGEKNVKVVEKGLEKLGKMN >gi|157101632|gb|DS480692.1| GENE 179 187065 - 187319 383 84 aa, chain - ## HITS:1 COG:BS_crh KEGG:ns NR:ns ## COG: BS_crh COG1925 # Protein_GI_number: 16080527 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Bacillus subtilis # 1 84 1 84 85 63 41.0 6e-11 MVKKLITVELASGLEARPAAMLVQVASQYESAIYLEAEAKRVNAKSIMGMMTLCLTAGES ILVEADGKDEEAAVTEIEKYLKNQ >gi|157101632|gb|DS480692.1| GENE 180 187441 - 188418 827 325 aa, chain - ## HITS:1 COG:CAC0513 KEGG:ns NR:ns ## COG: CAC0513 COG1481 # Protein_GI_number: 15893804 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 315 1 315 317 269 46.0 6e-72 MSFSSSVKDELSRQMPGARHCQIAETAAILSLCGRVKISASDHFWIEIHTENVAVARKYF TLLKKTFNIRTDVSIRSGINPGRSRTYIVAVREHEEALKVLQAVKLINSQGEIGENLSLI RNVVLQNACCRRAFIRGAFLAAGSISAPEKFYHFEIVCPTEPKAEQLKNIIATFDIEAKI VPRKKYYVVYIKEGSQIVDILNVMEAPVSLMELENIRIVKEMRGSVNRQVNCETANINKT VSAAVKQIEDIRFIQSVAGLSGLPESLQEMARIRLERPEATLKELGEALEPPVGKSGVNH RLRKLSLVAEELRQQFPDRNTSSLQ >gi|157101632|gb|DS480692.1| GENE 181 188424 - 189293 940 289 aa, chain - ## HITS:1 COG:CAC0511 KEGG:ns NR:ns ## COG: CAC0511 COG1660 # Protein_GI_number: 15893802 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Clostridium acetobutylicum # 1 285 1 286 294 295 52.0 8e-80 
MKLVIVTGMSGAGKTIALKMLEDLGFYCVDNLPISLVDKFAQLAAAGSDISQAALGIDIR SGEELSQLDAVLRGWRELGMDYQILFLDANDGVLIKRYKETRRSHPLAGTGRLDKGIEKE RVKLAFLKKDADFIIDTSKLLTRELRQELEKIFLKHQEYCNMFVTVLSFGFKYGIPEDAD LVFDVRFLPNPYYSEDLRLLTGEDQSVRDYVMQGGTCAQFLDKLYDMMDFLLPNYINEGK NQLVVAVGCTGGKHRSVTVARALCAHLTARAEYGIKIEHRDIDKDNKRK >gi|157101632|gb|DS480692.1| GENE 182 189330 - 190241 942 303 aa, chain - ## HITS:1 COG:CAC0510 KEGG:ns NR:ns ## COG: CAC0510 COG0812 # Protein_GI_number: 15893801 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Clostridium acetobutylicum # 23 302 25 303 305 285 50.0 7e-77 MNLEHMLKEVLVSEGAVCRFMEPMSNHTTFRIGGPAEAYVCPGNEDELGKVICMCREHDI PWNVLGNGSNLLVSDRGIKGVVIAMEGNWCYAGAEGSVIRAGAGELLGRVARVALDHCLT GMEFAAGIPGSVGGALVMNAGAYGSEIKNILKSARVMTNEGEVLELSVDELELGYRTSCI PHRGYTVLEASFQLEPGDQEAIEARMKELAARRREKQPLEYPSAGSTFKRPQGYFAGKLI EDAGLRGYGMGGARVSEKHCGFVINGGNATASDVMALCGHIRKTVMEQSGVELEMEVKRW GEF >gi|157101632|gb|DS480692.1| GENE 183 190262 - 191194 1227 310 aa, chain - ## HITS:1 COG:BS_yvoB KEGG:ns NR:ns ## COG: BS_yvoB COG1493 # Protein_GI_number: 16080553 # Func_class: T Signal transduction mechanisms # Function: Serine kinase of the HPr protein, regulates carbohydrate metabolism # Organism: Bacillus subtilis # 29 309 28 308 310 267 44.0 2e-71 MHGVTISELIKKMKLKNTTPEIDTDKIVLTHPDVNRPALQLTGFFDHFDRERVQIIGYVE QAYINTMDQEQKRRMYDKLTSSQIPCLVFSRDQEPDEDLLEFCNYYGVPCLVSDKTTSAL MAEIIRWLNVKLAPCISIHGVLIDVFGEGVLIMGESGIGKSEAALELIKRGHRLVTDDVV EIRRVSDDTLIGSAPDITKHFIELRGIGIIDVKTLFGVESVKDTQSIDMVIKLEDWSKDK EYDRLGLEDQYTEFLGNQVVCHTIPIRPGRNLAIIVESAAVNYRQKKMGYNAAQELYNRV QANLGKGQEN >gi|157101632|gb|DS480692.1| GENE 184 191279 - 193258 1801 659 aa, chain - ## HITS:1 COG:CAC0508 KEGG:ns NR:ns ## COG: CAC0508 COG0322 # Protein_GI_number: 15893799 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Clostridium acetobutylicum # 1 622 1 621 623 662 50.0 0 
MFDLEEELKKLPASPGVYLMHNDRDEIIYVGKAISLKNRVRQYFQSSRNKTAKIEQMVSH IAWFEYILTDSELEALVLECNLIKEHRPRYNTMLKDDKSYPYIKATMGEDFPRLLFARDM KKDGRSRYFGPYTSAGAVKDTLDLIHKLYRIRTCNRNLPKDTGKERPCLNYHIKQCDAPC QGYISREEYQSNFNQALDFLNGRFDPLIKNLQEKMQTASDNLEFERAIEYRELLNSVKQV AQKQKITSSGMEDRDIVAMARDERDAVVQVFFIREGKLIGRDHFHISAATAENEGEILDS FVKQFYAGTPFIPRELWLQHPLGDEEVISQWLSAKRGQKVRLVVPKKGEKERLVELAARN AALVLSQDKERIKKEELRTIGAINQIGELIGIGGIRRMEAYDISNTSGVESVGSMVVYED GRPKRSDYRKFRIKTVQGPNDYASMEEVLTRRFSHGMRETEELKEKGADLSMGSFTRFPD LIMMDGGRGQVNIALEVLDKLGLSIPVCGMVKDDNHRTRGLYYNNVEIPIDRHSEGFRLI TRIQDEAHRFAIEYHRSLRGKGQVKSILDDIPGIGPARRKALMRHFKDIEAVRSASVEEL EQTPQMNRRSAESVYRFFHGDKESDRTGGQAALQDNGKPGTSADPDGGAKTDFQIMETQ >gi|157101632|gb|DS480692.1| GENE 185 193325 - 194722 1398 465 aa, chain - ## HITS:1 COG:FN0456 KEGG:ns NR:ns ## COG: FN0456 COG1306 # Protein_GI_number: 19703791 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 63 393 47 376 380 237 37.0 4e-62 MKRWIAAGILILAMTGCSRYEPAKEMTRAQEESVQSEASGAQNGQETQEAEAADSQVITI STYPPRNPVKVKGIYVSAYVAGTGDMMDKIIEEIDRTELNAVVIDVKDDQGRITYAMDSP TVNEIGACQVFIQDMPALMAKLKEHGIYTIARVVAFRDPYLAEQKPEWSLHVADGKIYRD NKGLAWVNPYKKEVWDYLIEVGKKAGEAGFDEIQFDYIRFAVDKTMNDVVFDDADTQGRD KTQAITEFIGYAHDELAKEGLFVSADVFGTIMRSEEDAAAVGQEYEDMAEQLDYICPMIY PSHYGPGNFGIEYPDTQPYDTILNALNGSRELLAASAKEDAPQAVVRPWLQDFTASYLEH YIKYGDEQVRQQIQAVYDAGYDQWILWDAGVSYHYGGLLTPEEAEEEDQRISQSRAALET ERDAAENVPAETVPAENAPPETVPPEAVPSENVPPETVPPAGAAR >gi|157101632|gb|DS480692.1| GENE 186 195075 - 196694 1632 539 aa, chain - ## HITS:1 COG:TM0068 KEGG:ns NR:ns ## COG: TM0068 COG0246 # Protein_GI_number: 15642843 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Thermotoga maritima # 1 534 1 532 539 562 51.0 1e-160 MHLLDSELRNKEEWEAAEIELPKFDREAMKAATRECPQWVHFGAGNIFRAFPAACQQKLL NDGVEKTGIIVAEGYDYEIIKKAYRPQDDLSLLVTLKANGTIEKTVIGSLAQSLTVDRQD AEDWNCLKDIFASKTLQMVSFTITEKGYSITSADGSFRPDVKADFEAGPEQADSYIGKLS 
ALCYWRYTHGAAPLALVSMDNCSHNGDKLYAAVDAYAKAWTENGKADSGFLEYIENPAHV SFPWSMIDKITPRPDDSVKKMLEEAGFTDTESIVTSKNTYVAPFVNAEEAEYLVIEDAFP NGRPRLEHAGVMFTTRDTVEKVERMKVCTCLNPLHTTLAVYGCLLGYTKISEEMKDADLV KLVEIIGYKEGLPVVTDPGILSPKKFIDEVLQIRVPNPFMPDTPQRIACDTSQKLSIRFG ETIKEYMARPELDVKSLKFIPMVQAGWCRYLLGVDDEGNTFAVSPDPMYEGLAEKLQGVK VGEPCNVHEVLQPVLSDAKIFGVDLYEAGLGEYVESIFSELIAGKGAVRAALQKHLAEA >gi|157101632|gb|DS480692.1| GENE 187 196920 - 197069 215 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939925|ref|ZP_02087272.1| ## NR: gi|160939925|ref|ZP_02087272.1| hypothetical protein CLOBOL_04816 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04816 [Clostridium bolteae ATCC BAA-613] # 1 49 1 49 49 81 100.0 2e-14 MKVRKVLGGFAWAMLAAMLILVNAMAANGDDLFDQTENSNVSAVIQYQR >gi|157101632|gb|DS480692.1| GENE 188 197498 - 197716 155 72 aa, chain + ## HITS:1 COG:no KEGG:Ethha_0641 NR:ns ## KEGG: Ethha_0641 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 63 17 79 83 69 44.0 4e-11 MITYEPFWNTLKNSDESTYTLIYNHNLSSSTIDRLRKNKPLSTTTLNDLCRILDCDINDI ISYQRSESDQFL >gi|157101632|gb|DS480692.1| GENE 189 197775 - 198473 306 232 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939929|ref|ZP_02087276.1| ## NR: gi|160939929|ref|ZP_02087276.1| hypothetical protein CLOBOL_04820 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04820 [Clostridium bolteae ATCC BAA-613] # 1 232 14 245 245 358 100.0 2e-97 MRKKKYILYFAAAAAAAGLGLGACSPGTGAQKETASSVSAGQQTAVSGGTESGTESETGY ENGHETGHETRHETELETEEGTASEHTTEHIPEQEAAKAPVLRLKLPDDVTVCSSPVIIQ KDEHTVQIQFHDENTQSDAMAWASLEAGGGPSEYYVLDEEGTLQQEVWAGEKLVKMTIQK TVENSDIHGILVSWEHEGVFYELWEDDARDSMDAVIEMASAIAVRSAAVQGN >gi|157101632|gb|DS480692.1| GENE 190 198712 - 200160 938 482 aa, chain - ## HITS:1 COG:BH1858 KEGG:ns NR:ns ## COG: BH1858 COG1621 # Protein_GI_number: 15614421 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Bacillus halodurans # 20 466 17 466 487 322 40.0 
1e-87 MKKYEKVRREDYEAWYRDNREYRERVRADESRLAYHLMPETGWMNDPNGLCQFQGLYHIY YQYTPFEPTGEIKLWGHYTTRDFIHYKQEEPALFPDQDLDAHGVYSGSAYVEDKTIHYFY TGNVKLFDRDDYDYIMSGRVSNTIHVTSRDGYHFSAKELVMNHKDYPAGMSCHIRDPKVF KKGEYYYMVLGARDALSKGMVLVYRSMDLTNWEYFNRITTEQTFGYMWECPDLFCLDGQL CLVCCPQGVKPMGTEYQNVHQCTVFCLDYDFRENTIRLNSQFPGMVDRGFDFYAPQTFLD EQGRRILIGWMGIPDAEYTNPTVKAGWQHALTMPRLLHMREGRLLQQPLEEMKELRRDRR DYDFWELGKGVETGLVFEACLSMESCSDMCLTIRDDVTLSYQNGLLTLDMGNSGSGRTRR SVLLEQLTDLRIFSDTTSLEIFVNNGLEVFTTRVYGDRPGMRIKGKCCGTVSVYELSGFI YE >gi|157101632|gb|DS480692.1| GENE 191 200252 - 201088 889 278 aa, chain - ## HITS:1 COG:BH0844_3 KEGG:ns NR:ns ## COG: BH0844_3 COG2190 # Protein_GI_number: 15613407 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Bacillus halodurans # 130 273 6 150 151 126 46.0 4e-29 TEPAMFGVTLKLKYPFYAAMAGSAAGSAYLAATKTLAQALGAAGLPGFISMKPDHYMNFA IGIILSMGVSFALTMVFWKKFALGREDGPEKKDAGRAEGTETASDMEISRKIEMPQEIEQ SREIITELYVPMKGQVLDVGQSADEVFASKALGDGVAINPAGGMVCAPCDGTISLLFPTK HAVGITSETGVEVLIHIGINTVQLDGQGFEAFVAQGDKVKKGHKLIAADLDLIREKGMNP QTMMILPEGGNLDVTVYPGGQADSGDLAVKAVKTVKTA Prediction of potential genes in microbial genomes Time: Thu Jun 30 18:55:56 2011 Seq name: gi|157101631|gb|DS480693.1| Clostridium bolteae ATCC BAA-613 Scfld_02_34 genomic scaffold, whole genome shotgun sequence Length of sequence - 50237 bp Number of predicted genes - 44, with homology - 41 Number of transcription units - 24, operones - 8 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 43 - 95 11.0 1 1 Tu 1 . - CDS 110 - 1042 757 ## Closa_4234 cell wall binding repeat-containing protein - Prom 1095 - 1154 6.5 2 2 Op 1 . + CDS 993 - 1118 89 ## 3 2 Op 2 . + CDS 1120 - 1236 91 ## + Prom 1254 - 1313 2.3 4 3 Op 1 . + CDS 1347 - 1520 232 ## Closa_3397 hypothetical protein 5 3 Op 2 . + CDS 1535 - 1780 367 ## Closa_3398 Coat F domain protein + Term 1820 - 1869 14.5 + Prom 2288 - 2347 2.6 6 4 Tu 1 . 
+ CDS 2394 - 2906 319 ## gi|160939940|ref|ZP_02087286.1| hypothetical protein CLOBOL_04830 + Prom 2922 - 2981 1.9 7 5 Tu 1 . + CDS 3174 - 4127 495 ## Fjoh_2572 HSP90 family molecular chaperone-like protein + Prom 4207 - 4266 1.6 8 6 Tu 1 . + CDS 4292 - 4840 591 ## CDR20291_1495 hypothetical protein + Prom 4868 - 4927 1.5 9 7 Op 1 . + CDS 4971 - 8132 2992 ## gi|160939945|ref|ZP_02087291.1| hypothetical protein CLOBOL_04835 10 7 Op 2 . + CDS 8215 - 10326 1815 ## Sgly_1347 peptidase C11 clostripain 11 7 Op 3 . + CDS 10348 - 11790 1224 ## COG3772 Phage-related lysozyme (muraminidase) + Term 11865 - 11900 3.3 - Term 11853 - 11888 3.3 12 8 Tu 1 . - CDS 11974 - 12753 395 ## gi|160939949|ref|ZP_02087295.1| hypothetical protein CLOBOL_04839 - Prom 12827 - 12886 2.4 13 9 Tu 1 . - CDS 12954 - 13358 392 ## Dhaf_2435 RNA polymerase, sigma-24 subunit, ECF subfamily - Prom 13397 - 13456 4.0 + Prom 13507 - 13566 4.0 14 10 Tu 1 . + CDS 13694 - 14977 1210 ## COG1145 Ferredoxin + Term 15025 - 15073 8.1 + Prom 15085 - 15144 6.2 15 11 Op 1 2/0.000 + CDS 15191 - 15949 166 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 16 11 Op 2 11/0.000 + CDS 15998 - 17119 517 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 17 11 Op 3 11/0.000 + CDS 17145 - 17681 492 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 18 11 Op 4 1/0.000 + CDS 17678 - 18970 1027 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 19 11 Op 5 . + CDS 18984 - 19748 243 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 20 11 Op 6 . + CDS 19765 - 21747 1221 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 21 11 Op 7 . + CDS 21771 - 22838 988 ## COG1609 Transcriptional regulators 22 11 Op 8 . + CDS 22826 - 23203 300 ## Spico_1571 Cupin 2 barrel domain-containing protein + Term 23237 - 23291 13.1 + Prom 23303 - 23362 6.2 23 12 Tu 1 . 
+ CDS 23401 - 24381 859 ## CLJU_c19440 hypothetical protein + Prom 24388 - 24447 4.3 24 13 Op 1 . + CDS 24508 - 25302 735 ## COG0584 Glycerophosphoryl diester phosphodiesterase 25 13 Op 2 . + CDS 25382 - 26005 567 ## COG0020 Undecaprenyl pyrophosphate synthase + Prom 26028 - 26087 4.8 26 14 Tu 1 . + CDS 26116 - 26862 492 ## COG3279 Response regulator of the LytR/AlgR family + Term 26895 - 26943 8.4 - Term 26883 - 26931 12.2 27 15 Op 1 . - CDS 26956 - 27561 642 ## Rumal_2967 cytidylate kinase - Prom 27616 - 27675 5.8 28 15 Op 2 . - CDS 27722 - 28291 416 ## Closa_0569 hypothetical protein - Prom 28312 - 28371 5.5 - Term 28387 - 28424 8.0 29 16 Tu 1 . - CDS 28482 - 29858 1574 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 30016 - 30075 4.8 + Prom 29809 - 29868 4.5 30 17 Op 1 1/0.000 + CDS 30108 - 32114 2002 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 31 17 Op 2 6/0.000 + CDS 32179 - 33222 860 ## COG0451 Nucleoside-diphosphate-sugar epimerases 32 17 Op 3 6/0.000 + CDS 33219 - 34352 1324 ## COG0500 SAM-dependent methyltransferases 33 17 Op 4 12/0.000 + CDS 34366 - 35313 910 ## COG0451 Nucleoside-diphosphate-sugar epimerases 34 17 Op 5 11/0.000 + CDS 35320 - 36285 1056 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 35 17 Op 6 8/0.000 + CDS 36282 - 38510 2360 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 36 17 Op 7 . + CDS 38452 - 39618 1032 ## COG1216 Predicted glycosyltransferases + Prom 39622 - 39681 4.6 37 18 Op 1 . + CDS 39729 - 41147 1420 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 38 18 Op 2 . + CDS 41278 - 43935 1966 ## COG1316 Transcriptional regulator + Term 43972 - 44036 24.8 + Prom 43981 - 44040 6.5 39 19 Tu 1 . 
+ CDS 44123 - 45547 1628 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain + Term 45575 - 45621 6.2 + Prom 45565 - 45624 7.3 40 20 Tu 1 . + CDS 45681 - 47114 241 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 + Term 47191 - 47250 14.1 41 21 Tu 1 . - CDS 47157 - 48803 1235 ## COG1404 Subtilisin-like serine proteases - Prom 48979 - 49038 4.5 + Prom 48943 - 49002 5.8 42 22 Tu 1 . + CDS 49027 - 49515 636 ## COG0597 Lipoprotein signal peptidase + Term 49570 - 49607 -0.3 - Term 49553 - 49596 0.5 43 23 Tu 1 . - CDS 49615 - 49974 378 ## Closa_0598 hypothetical protein - Prom 50059 - 50118 4.1 + Prom 50037 - 50096 10.5 44 24 Tu 1 . + CDS 50145 - 50235 118 ## Predicted protein(s) >gi|157101631|gb|DS480693.1| GENE 1 110 - 1042 757 310 aa, chain - ## HITS:1 COG:no KEGG:Closa_4234 NR:ns ## KEGG: Closa_4234 # Name: not_defined # Def: cell wall binding repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 6 272 10 278 317 98 29.0 4e-19 MAKKSLIIAALAAFLSMGTLFSAYGQEKAITSIPLTFSWDTAPKGGELVGNVYAVSSNGE FKVEGAVYDKKDDQWDYGERPIVEVELSAKDGYYFSSTKRSIFSLTGCGAQFKTAEKDSD GSALILQVYLPAIDGNLPATTSVSWNLTTAVWDKVNGAKNYEVRLYRDKILVTTQKTTES TYDFSSYINVEGNYTFTVRALGTYSSQAGPWTDNSEPLTIRTEDTWFITNGTWDKTSSGW RYVYPNNVYPVNSWRCISDNWYYFGNNGYMESDCYVKSSDQDLYYWLGSDGIWNTEKDTA APESGARVVK >gi|157101631|gb|DS480693.1| GENE 2 993 - 1118 89 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MERKAAKAAMMSDFFAIEYPSLWFRKIDCRYTCYYYYCTVF >gi|157101631|gb|DS480693.1| GENE 3 1120 - 1236 91 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKLQVLFDNNIGAVSIDLLVLDLLVLGLLVLGLLVLDF >gi|157101631|gb|DS480693.1| GENE 4 1347 - 1520 232 57 aa, chain + ## HITS:1 COG:no KEGG:Closa_3397 NR:ns ## KEGG: Closa_3397 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 57 1 57 57 70 78.0 2e-11 MSPKELNYIEDALGHEQFLMNQCQEAIQNLQDPALKNQAQQMEQKHKQIFDSFYNLV >gi|157101631|gb|DS480693.1| GENE 5 1535 - 1780 367 81 aa, 
chain + ## HITS:1 COG:no KEGG:Closa_3398 NR:ns ## KEGG: Closa_3398 # Name: not_defined # Def: Coat F domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 81 1 81 81 126 83.0 4e-28 MDDKCIMENLLLTEKGVCDLYVHGTIESSTANVHQTFNQALNDSLCMQDDIYKQMSAKGW YQMEQAEQQKIMKVKNQFAGM >gi|157101631|gb|DS480693.1| GENE 6 2394 - 2906 319 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939940|ref|ZP_02087286.1| ## NR: gi|160939940|ref|ZP_02087286.1| hypothetical protein CLOBOL_04830 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04830 [Clostridium bolteae ATCC BAA-613] # 1 170 11 180 180 335 100.0 7e-91 MENDRTFLKSIEEHVKDPYIRERLINRLEWYMVMANRCQCFYYMSACMGIVMPTLILLLN SIPGNMLSDGMRQLCITALSGLSGIVGGLSGIFGWHENMIRYRSNAERLKSETSFYMLGV GNYRDRKQRDQMFLNALEDLSLKENQMWQKEESSHSSYRQEQGGRRRPNG >gi|157101631|gb|DS480693.1| GENE 7 3174 - 4127 495 317 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_2572 NR:ns ## KEGG: Fjoh_2572 # Name: not_defined # Def: HSP90 family molecular chaperone-like protein # Organism: F.johnsoniae # Pathway: not_defined # 27 249 4 233 881 93 27.0 1e-17 MNHKEQDIKHTDTYINHTATEKWEFALQKRLKQCSPSLHRNYLEASTGACMMLDRYRNVF PQYTDHSSLHCLNIMNLAGQLVGDCLLQLNGSELYVLLTGILFHDLGMGISKADFEGLAP VLCGELYPAKGREAELIREYHQEFSALLVEKYSDIFEIRKEYVFPVMQAVRGHRKTDLWD ENAFPAAFQIGTESVCLPYIAGIIRLADEMDLCRDRNSTLGGEEQMISDSYSRLVWRTHD TVTGMELTEKECILYAEGGDRDTWEELGKWTRKLQKTMDEVEQVITCRTPFFLPKRMVRI DKITPWLEQCQDITSET >gi|157101631|gb|DS480693.1| GENE 8 4292 - 4840 591 182 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1495 NR:ns ## KEGG: CDR20291_1495 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 167 18 184 416 66 25.0 6e-10 MENNTLKASIGIDFSLIGTQLHAMYEKNEGGYAVLLIPSEQTPDSGISVKDLIDDIKKMA KGVSGSDTAVDTTQLETAMTEAAKEGGGSTSMKLEDIVIRLQMAYLYICKKGENSVLEYA FQLQVVTEGLVPETIKPIVDVTNLSISVWNTDRKKVVDKMALITVDEYLGVAALPDKSQA AK >gi|157101631|gb|DS480693.1| GENE 9 4971 - 8132 2992 1053 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939945|ref|ZP_02087291.1| ## NR: 
gi|160939945|ref|ZP_02087291.1| hypothetical protein CLOBOL_04835 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04835 [Clostridium bolteae ATCC BAA-613] # 1 1053 16 1068 1068 2068 100.0 0 MNQRQMSCSIAVEGAVKGCPAKLAVWYDKNEMSFYGELQMDQRGPAGLLEAVNSEMASKF EAYASGLVHASSQRATLSCIHGITALGYGEPGLYFKAAAGRGSAAILFSFSADSNGGNAQ VSSLMDAVKKAADFFGMEEFYFFGRVGKGLSPRDLMAEAPGAEAFPSRIDRCSLLTYSHL KLNGDSVFEKALRELFGLKEAQLFLGAGEDGFLCMAAIPASKNSFMESRNMCLAVELRSR ALSLALQGSFRFAFVPDMEFHVNCGISRTRFMVEAFGEVREPIPLAGPFSIGDTCLAVGF DRGPSFAMFANLYIREIHLFGAVMLTVAGSVPKLDLLSAAVSDLSIPILVRNLLGCSFDG LDSIDFIRILGLPFQDMQAFDREMVKRADGARLAEAFNQRIHDNSLVLERDQVKASPFGD GVDLVDQRRMRHYYIDGSGGIKLTAQFYYADVNTRLGSYTVERGIFVCGVIEIFKVRFEV LFSFREGDGVLAYGRIPEVNLGFLKLGPSQAGQESGEALPVPADSVLRQFMGKEQRGIVF FLSAGKKDVTFYLDGRIELLSILYFETRIIFCKGLISIDVRFNWLAIFDVSIHLKVNYQD FNRGGFEFRLVIDTSKLAEKLKAVQESINRAIDRLRDKIKNATREIDRAQAHVNELYGQI DSLNRKIEDCKRSIRNAKWWKKAFVAIAKGIEIGAYEVAKAGIYVAIGVATAALQVAKAA VQLAGKVGEGVMKAVNAVIQGALSFFYINRLELYGKANMSEQEFRASIEFVALGKTYRYE TKMGHSALKSDAAGAVSGNMNGQMKNDLDHIEDGAFRSNFRRYSHETYTIEEQCRRLAGA REQMKSSTRLMRSMQEAYIREFSEPMSDFDQMNQEYMNALDCVGGILSTGEQAGDVKALS GSMGGLKRKITLKEKEGVFRDEELSELKDVIRNYDQAKLLYDEVKKSMSAVESQRKQMLS YAELQKQKTGEQKQKKALSAAEGSMERVLNKTEEAMYRSFPVDRSGKNYINLSREELIHQ YMDEARQEFGARTSENVRVMRNRSRKGRYDSRL >gi|157101631|gb|DS480693.1| GENE 10 8215 - 10326 1815 703 aa, chain + ## HITS:1 COG:no KEGG:Sgly_1347 NR:ns ## KEGG: Sgly_1347 # Name: not_defined # Def: peptidase C11 clostripain # Organism: S.glycolicus # Pathway: not_defined # 44 585 122 659 715 194 29.0 9e-48 MDEKRKRVTAPRIAAALSALLAGAALYTVSGSERQGIQVKEYTEAAEAADSTIMVYMNGS DLEGDYGAATADLREMMDALRTAGQEENFPSLHVVVEAGGSTRWELDEMDGVPYARFSLT EDGISSMEPMEIRNMGDADTLTDFVNYGVQSYPANHYGLILWNHGGGPVGGYGSDSHFDG DGLSLEEIREALDHSVMADKAFDFVAFDACLMGSVEIADCLEGRAGYVIASPELEPQDGY DYSWMTALGDSLPSDMEWGEAVGRSMVDAYDAYYASGTAPVAMSLLDMKEYPAFHEVFHQ YVDGIPQELREELYRELGKDRMKMLAFGSRQAGGSPELVDVLEFLDACQSVYPDESAFQT LKERMGKLVTDQWAKGYPGNPSGLTIYLPSGSNPYLSEDLETYDTTGFCSAYRQLTDGYA 
AYLARESGVEWGDINAHKDGTVEISIAPEDVSDVTGAYLAVFCPVGDDGNYYLLCTDSDV DIGVDGTLRAAPENSYMGMKGQVLCLIETMNLDAYTEYMAPVLYNGELCTMRIGFDEEHE DGQVLSVTPAGQTSEAAKQIYELKEGDRVTPLYLVEHMEDVEEEPVDEAKDEAKDETIDG AKDETKDEAGDETKDEIRDETGNTAGSSHGETSHNPDQITEDRYYTDSYYMGTEFYIEEP DDMLLETVPVSGDGYFYGFMLMDVRQNIYYTDFTGIETEDTGS >gi|157101631|gb|DS480693.1| GENE 11 10348 - 11790 1224 480 aa, chain + ## HITS:1 COG:STM1028 KEGG:ns NR:ns ## COG: STM1028 COG3772 # Protein_GI_number: 16764388 # Func_class: R General function prediction only # Function: Phage-related lysozyme (muraminidase) # Organism: Salmonella typhimurium LT2 # 341 480 3 150 150 145 57.0 2e-34 MSKTSRELIEFCKGKLGTPYVFGMKGQVLTEALLKQLAAENPKVYTPQYITKARSFMGKH CTDCSGLITWCTGILRGSANYKETAREVQPIGALNEAMAGWALWKPGHIGVYIGDGTCIE AKGINYGTVETRAQDTPWVSVLKLKDIDYSEQVCESAVPSAHPTGWFKEDGGVRFYLEPE RYVANDWYQDKGQWFWFDGSGHAVRNNWYLYKDNWYYFGADFAMVKGLQLIKGKPYYFNR DGQMAKLGTRFMVQVDKNGVLHFPEEESAEIPISTGGISVENGAGREPEHTSPGKTEGKN PALAAYEGSEYTAGAVQPPVTKAGTASEADRKAGYAQGERRISDAGICLIKQFEGCRLEA YRCAAGVPTIGYGHTAGVAMGMKITQAQAEAYLREDLRAFEKAVNKVLECSVTQNQFDAL VSFAYNLGAGALRNSTLLKRLHAGDVKGAADEFPKWNKAAGKVLEGLTRRRMMERQLFLS >gi|157101631|gb|DS480693.1| GENE 12 11974 - 12753 395 259 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160939949|ref|ZP_02087295.1| ## NR: gi|160939949|ref|ZP_02087295.1| hypothetical protein CLOBOL_04839 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04839 [Clostridium bolteae ATCC BAA-613] # 1 259 1 259 259 524 100.0 1e-147 MNIVISTEHNQLFRQLYAAADAAISNRHLRGMELYAFRSLTPEELIKEAAGSPADLWLMF FDSDSRPDWPDVLNRLTSFSSQIMVCIVSSDYSTAARLINSHSVRGVAGYIHPVHDNLER SCLNILTYLNRRVTSADGFLIVHGSSRDTRLSFSSIYYIETVKGTHMCNIHCSDGVYTFR SGIKELSEKLKPPFRQVRASTIANLSNVRSFEPTQGILSFTGGISCCCSRSYRHDVSDYF RRHPRHVHAEGDMAERGPG >gi|157101631|gb|DS480693.1| GENE 13 12954 - 13358 392 134 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2435 NR:ns ## KEGG: Dhaf_2435 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 2 133 6 137 139 81 37.0 
1e-14 MQYYIRIDNRNVYVSEEIYKLYCKGERKERYFRESDIHNKTFYYNALDTEDFNGCDLFQD TRAKPVDEQAEEALERSLLFEAMKDLDPREKDIIRRIYCYDQSLRQISRETGVPVTTLHY RHGKILRKLRSLMA >gi|157101631|gb|DS480693.1| GENE 14 13694 - 14977 1210 427 aa, chain + ## HITS:1 COG:AF1263 KEGG:ns NR:ns ## COG: AF1263 COG1145 # Protein_GI_number: 11498862 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Archaeoglobus fulgidus # 1 351 1 343 369 147 29.0 5e-35 MGHLTTRDAYRNLGERINWFTQGAPASETLYKILQVLYSEKEAKWVALLPVRPFTIKKAA KIWGTSEYKAEKLLDHLCGKALLVDSWDKGVRKFVMPPPMAGFIEFALMRTRGDIDQKYL SELYYQYMNVEEDFIKDLFYATETKLGRVFVQEPVLTNDKMNHILDYERATHIVEEAEYI GLGLCYCRHKMSHAGHPCEINAPWDVCLTFDNVARSLAQHGDYCRLISREEALDALERSY ASNLVQIGENVREHPAFICNCCGCCCEALQAARHFSPMQPVATTNYIPDITGDQCVGCGK CAKTCPVLAISMEEGENGRKRAVVDKDICLGCGVCDRNCGVKAIHMERRTEQIITPVYST HRFVLQAIEKGTLANLVFDNQAFSNHRAMAALFSAILELPPLKQAMASKQFKSVYLDRLL AAKREKG >gi|157101631|gb|DS480693.1| GENE 15 15191 - 15949 166 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 62 247 51 238 242 68 28 7e-11 MDSLFSLAGKKAVITGCATGIGKGMAAGLARAGADIIALDIGDLEDTRRTVENLGRECRC YHVDLSVSSDVDEVWIRLLEEAGHVDILCNNAGMQFRKNAFEYPEEMFDKVIAVNLKAPY RLSQKAAIHFRDQGLKGKIINTASLFTTFGGVEVSGYTCSKHGVMGMTRALSNEFAPYGI CVNAIAPGYIATELTKAIWSDSKRRQPMDERIPAGRWGTPDDFEGPAVFLASRASDYITG IMIPIDGGYSAR >gi|157101631|gb|DS480693.1| GENE 16 15998 - 17119 517 373 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 69 370 48 344 346 203 35 1e-51 MKKRSVMMALGLVMTLTACGGGNAGTTAAPSSGSETQTEAKAEEKTEGDTTAQEENKEES KTEASGDTVTLKLSNNQPDGNCTTIAVEWFADQVRERTDGRVNIEVYNNGTLGDAISCLE QLQYGGMDIVKADLSTMTNFVDEYNALMMPYIYKNTEHFWKVHGGDIGMGILRGDKMKEK KLYGLTYYDGGTRCFYNTKKEIHSPADMKGMLVRVQQSELMMSMVEALGGQSVATAYSEV YSALQTGVVDAAENSIVNYLDQSHFEVAPYFVEDHHTRSADILVMGEKSREKLSAEDLKI IDDTALESWEYQKELWAEAEEESKQKLLEQDVTITELTEEELKAFQDACEPIWYSYEGGA YKDIIDQIVEAGK >gi|157101631|gb|DS480693.1| GENE 17 17145 
- 17681 492 178 aa, chain + ## HITS:1 COG:Cgl2288 KEGG:ns NR:ns ## COG: Cgl2288 COG3090 # Protein_GI_number: 19553538 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Corynebacterium glutamicum # 31 156 6 132 173 62 26.0 4e-10 MLKGIKKVLIDIFNLFYKVIYGYARWVLAVVIVIICAQVFCRNILRTNIRWNQEVALLLT IWMAFLGIAIGAEKNLHIGVELFYDKFPVKIRNVIDVLNRIITLLVGAFFTVYGVRMVLS TMDSRLPVTKWPSSLMYIMIPVSGVAIIYFTVLDIMGLKKYKKTDDAGKFNREEEEKE >gi|157101631|gb|DS480693.1| GENE 18 17678 - 18970 1027 430 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 5 425 3 426 435 400 45 1e-110 MNPNIAIAILLITFVLLIMIKCPITFSMIISTAFTMLYIQVPVMTLVQQMSKQLNSFSLL AIPFFILMGEIMAAGGISSRLLAFANVCVGQITGGLAHVNVLASMLFGGISGSAIADVSS LGALEIPMMEEAGYEKDFSIAVTVASSCQGLIIPPSHNMVLYSMVAGGVSVGSLFCAGYI PGIMLGVFIMGMCYIISKKKHYPKGKKLPFKEALRVTKDTFFAMLAMVIIIGGVSAGFFT ATESAAVACIYCLFVGFFIYRELRFKDIPVVLGNTIKTLSMVYALIAAAGAFGWVMAYLN VPTLATNLLLGISDNKIVVLLLINLILLLLGCVMDLAPLVLIMAPILLPVVTKFGMSPIQ FGVMMMLNLGIGLCTPPVGTALFTGCAIGNMKIEKVSKAMLPLYIPMLITLLLVTFIPAL TMTLPELIMK >gi|157101631|gb|DS480693.1| GENE 19 18984 - 19748 243 254 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 2 251 4 242 242 98 30 8e-20 MKTALVTGASRGIGLAIARELGVNGYQVAVMATRGESFYPESTRTLRDAGISYAWYAGDI GKSEDRIRIVREAAEHFGEIHVLVNNAGVAPKVRADLLDMTEESFDYVIGTNTKGNMFMT QTVARQMMKQPVYGGKRGTIINISSCSSQVSSVNRGEYCVSKAGISMLTKLFADRLAPEG IFVYEIRPGVIRTDMTSQVEEKYTELIKQGMFPIARWGVPEDVAKAVRAFCSDDFCYTTG NYIDVDGGFHIRRL >gi|157101631|gb|DS480693.1| GENE 20 19765 - 21747 1221 660 aa, chain + ## HITS:1 COG:TM0427 KEGG:ns NR:ns ## COG: TM0427 COG1053 # Protein_GI_number: 15643193 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Thermotoga maritima # 9 660 16 664 664 647 50.0 0 
MDINRIPFYQYNTIIVGTGAAGLNAAVTLHKLGQKNIALITEGRYMGTSRNTGSDKQTYY KMTQCGSEPDSVRKMAKTLFEGGCVDGDIAMAEAAGSLRAFYHLVDIGVPFPFNKSGEYV GYKTDHDPLKRGISAGPLTSKYMTECLERQAEERGIPFFDGYQAVRLLTEEKDGEKRAAG IIALNKQKAASPDERYAVFSAVNIIWATGGEAGMYQASVYPVSQTGGMGVLLEAGAVAKN LTESQFGIASVKHRWNLSGTFQQCLPRYLSAEQDGSHEREFLNDYFDTPRQLLTAIFLKG YQWPFDPRKVEGQGSSLIDLLVYQETVLKGRRVFLDFMHNPSPLTEGGNVSFRYLAGEAR EYLENSRALLDTPYERLKHMNPSAIEVYSSHHIDLSGEYLEIAVCAQHNNGGIGGNQWWE SNIRHLFAVGEVNGSHGIYRPGGSALNAGQVGAIRASQYIVKRYGGEPCSREQFLSRHMK EVEEETAFGERVLRGSESCMTDVAGQRRLLGIRMTKYGACIRSEEGIRTALEENLRQREQ LEQKVMIKGPEVLKDLYKLKHLLISQFVYLEALRDYDARVGISRGSYLVYNAGGQVPNPH LDERFRNRTEETDTTVLQEIYYEADRKECRVSWRPVRPLPDEEIWFENVWKDYREDGIIR >gi|157101631|gb|DS480693.1| GENE 21 21771 - 22838 988 355 aa, chain + ## HITS:1 COG:VC1721 KEGG:ns NR:ns ## COG: VC1721 COG1609 # Protein_GI_number: 15641725 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 1 331 1 333 336 194 34.0 2e-49 MVGIRDVAKYAGVSPSTVSRALSGVAFVEPETKEKVMKAVSDLNYKPNLAARSLKKGGSK LIGLIIPDIMNPYYPEVVKYMEACATKAGYSLILCDALGDVKKEKEYFKTLKYLFVDGIL YIASTEDVEHVKPYIGEIPMVIVNRTFDVDAPCINIDNADAAYQAVKCLTDNGHRKIALY INDKDRQYNRERLEGALEALGECGIEDYEKYIVRDVESEDDAYYKTLELMRHEERPTGVF MFNDFMAYGVYRGITKSGLRIPDDISVVGFDDIPQVKYLDPPLTTLRHSLADTAGIIFEQ LMEQIKTQTCAHKSQTYFKGRLIERESVKNLISRSVSERDAKAHSGQSGGMKWKK >gi|157101631|gb|DS480693.1| GENE 22 22826 - 23203 300 125 aa, chain + ## HITS:1 COG:no KEGG:Spico_1571 NR:ns ## KEGG: Spico_1571 # Name: not_defined # Def: Cupin 2 barrel domain-containing protein # Organism: S.coccoides # Pathway: not_defined # 1 124 1 124 124 160 60.0 2e-38 MEKVNIWDIEGTEFPAGRRTRVILGQNGAMEGSKFCQGYVVVYKGGGIPEHDHETVESYT IIKGHGEMEVDGEKQPVKEGDCIFIPGNLKHALYNTGEEELHMMFVYAPSIIVDHWAKEQ SGEIK >gi|157101631|gb|DS480693.1| GENE 23 23401 - 24381 859 326 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c19440 NR:ns ## KEGG: CLJU_c19440 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 1 303 1 301 312 348 51.0 3e-94 
MIISASRRTDIPAFYSDWFFKRLEGGYLYVKNPMNPGQVSKILLNQDTVDCFVFWTKDPG PMMEKLSVLDRLGYPYYFQFTLTPYGADVEPGLPHKNELVKTFRELSARLGPERVIWRYD PVLLSPSYTREYHFQWFSRLCSSLEGYTHTCVISFLDMYRKIKGRMDRIQIQAMTGEDMV SLASFMGPEAERHGMTVRTCSEAADLSGFHIQKGKCIDDQLIAGLLGRPLDVKKDSTQRE ECGCVKSVDIGAYNTCSHLCRYCYANFSPDQVEVKRRLHDPDSPLLCGRLTGEERITVRE MKSVVCPAGGGQMRLAFDVSEDGEPE >gi|157101631|gb|DS480693.1| GENE 24 24508 - 25302 735 264 aa, chain + ## HITS:1 COG:CAP0015 KEGG:ns NR:ns ## COG: CAP0015 COG0584 # Protein_GI_number: 15004719 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Clostridium acetobutylicum # 31 262 39 268 281 135 31.0 1e-31 MAAALLGAVLIVSMIPVRVLAKDRTGEGWRDKVEITAHRGDNTQAPENTMPAFKAAVENG ADWIELDVTQTKDGVLVIFHDADLMRMTGKPDKIWDMSYEELSQINTASHWGPFFKDTRI PTLEEVLDYCKDKIKLNIEVKVNNHQTQDFIARLVGLIQDKGMAGQCMITSFDYHSLQAV KQLAPALETGFISSKAIEQPEAFTSADNFVLSIDLINPETVSRIHSLGKEVIAWTVNDAY SVDKCRKAGADNLITDKPRKIMED >gi|157101631|gb|DS480693.1| GENE 25 25382 - 26005 567 207 aa, chain + ## HITS:1 COG:CAC1432 KEGG:ns NR:ns ## COG: CAC1432 COG0020 # Protein_GI_number: 15894711 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Clostridium acetobutylicum # 1 207 1 213 213 305 66.0 4e-83 MRLPKHIGIIPDGNRRWAKGQGMDKSEGYGHGISPGMSLYRMCRELGIGEMSFYGFTTDN TKRPSAQRQAFTRACVAAVEELAKEDASLLVLGNTKSPMFPHELLPFTARHDFGKGGMKI NFLVNYSWEWDLGYKDETIGPGLKTAEVSRIDLIIRWGGRRRLSGFLPVQSVYSDFYVVD DYWPDFKREHVEDALKWYAAQDITLGG >gi|157101631|gb|DS480693.1| GENE 26 26116 - 26862 492 248 aa, chain + ## HITS:1 COG:lin0801 KEGG:ns NR:ns ## COG: lin0801 COG3279 # Protein_GI_number: 16799875 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Listeria innocua # 2 235 3 240 240 108 30.0 8e-24 MIQIAICDDDFQVCSQLEDYIRRWKEGRRLEVEIFNTGKEFEEALKERHYDLIFLDIVMR DRNGIELGRFIREELGNDTSMIVYMSAYGDEYALDLFQFQPFHFIKKPIRPEDFQNIFYK 
ACEKIEGDAGIFEFKSSRTVYRIPLKDILYFEGDNRRVKIVTAGHTYHCYSTLENLFSKI NMDQAGFIRLHKSYAVNLRYVATFSVRMVTLFDGSEFTVSAPFKENALRQYEGYISKTGM GRGDTSGE >gi|157101631|gb|DS480693.1| GENE 27 26956 - 27561 642 201 aa, chain - ## HITS:1 COG:no KEGG:Rumal_2967 NR:ns ## KEGG: Rumal_2967 # Name: not_defined # Def: cytidylate kinase # Organism: R.albus # Pathway: not_defined # 5 195 4 197 209 160 40.0 3e-38 MSHKIITIARQFGSGGHEVAQRLSDQLGIPLYDRNLVEMAAERMGHSPVSIEKADESALS SFLAGYQIPKEPNSVTGYGLSLNDSTYVTQTIIIEALAQKGPCVIVGRCGDYVLRSHPDL IDVFICADMEDRVRRIMDRYSFSERDAVNAIKSTDRRRKNYYENYTKRKWGSIESHQMLL NISKLGMDKTVRIIKGIYETP >gi|157101631|gb|DS480693.1| GENE 28 27722 - 28291 416 189 aa, chain - ## HITS:1 COG:no KEGG:Closa_0569 NR:ns ## KEGG: Closa_0569 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 121 1 119 135 125 49.0 7e-28 MRELAVFYCPKCGHYAYYQTSRHPQCPKCGTQEAMHMVRMHYSEFMRMSCDERDEFLSRE ILKTNPSLITRLTEPHKRYNSREIIAEMNNVIMNLDTENKILSDTVKWMHDTIWELMHEK REYGQSNGTAPGDCGGYRDFRDCGNCRDREPSEECMEVTAHTGQTEAAYSEAAAADMSQT EEAEKGYKR >gi|157101631|gb|DS480693.1| GENE 29 28482 - 29858 1574 458 aa, chain - ## HITS:1 COG:CAC1079_2 KEGG:ns NR:ns ## COG: CAC1079_2 COG5263 # Protein_GI_number: 15894364 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Clostridium acetobutylicum # 34 453 305 742 2566 89 25.0 1e-17 MKKQIKVLAVLSTAMFMAAVTPNFIGTASTAYAKTVGWTEENGNWYYYDSYGEAITDTWK KNGDDWYYLDSDGIRAADRQIDEYYVGEDGKRVSMKWVSVENEDFWDEDDAPEFLYYYYG RDGKALTSRWASINGSWYYFNEDGIMQTGSITVDGYNYYLGEDGSRKTGWILLADETDDP EYVESWYYFDNTGKRIENEVDKKIEGQYYTFVDGRMQTGWYKLPVQAAAESGEAQTATPS EAAPVSEQTAAGYQYYDEDGKRASGWRTIEGIEGLSEEAELYRFYFKNGKPYHAEKGLEL FTIESRKYAFNTKGEMQTGKKVVNLEDGNVANFYFDEEGVMKTGKQVIFDEDLGETQNWY FHTDGSRKGQGFHGIKDNVLYVYGLRQEADKDLRFAPVELNGNQYLVNSNGAVQKATSSS KSNAMPELGSGYKDFKDENDKVWTVNTEGVIQTQSTAQ >gi|157101631|gb|DS480693.1| GENE 30 30108 - 32114 2002 668 aa, chain + ## HITS:1 COG:all4613 KEGG:ns NR:ns ## COG: all4613 COG0028 # 
Protein_GI_number: 17232105 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Nostoc sp. PCC 7120 # 13 650 43 608 632 228 29.0 3e-59 MKMKISNYIAGKLVEEGISQAFMVTGGGAMHLDDGLGHQEGLHCIYNHHEQACAMAAEAY ARIHNRIAALCVTTGPGGTNAITGVVGGWLDSIPMLVLSGQVRYDTTARWSGVGIRAMGD QEFDICKAIDCMTKYSEMVIDPMRIRYCLEKALYLARSGRPGPCWLDIPLNVQGAFVETE ELPGFDAADYEAGGNGWAGKDGGHGIPEDEAGKGEHRQVLPGKVDVSAARTIIDKIRAAK RPVINAGNGIRIAGAHDLFMDVAEKLGVPVITGWDSEDIMYDTHPLYVGRAGNMGDRPGN FAIQNSDLVLSIGSRLSIRQVGYNFETWAREAYVIVNDIDEEELKKPSVHSDMRVHADAK DLLEQLDLVLDRVLAEEEGNLPESVSRPGSRQVFDGGKGLRDMTWSRTCAMWKEKYPVIL PKHFAHTDEEPANVYAFIHAVSSRLREGQVTVVGNGSACVVGGHAYIIKKGQRFITNSAI ASMGYDLPAAIGAWTASRDKGCYSQGGDRSGEDLILVTGDGSMQMNLQELQTIIHHRMGI KIFVINNGGYHSIRQTQKNFFGEPLIGIGVDSGDLSFPALDKLAAAYGYPYVSARHNSEL GEAVEKTLAAEGPAICEVFVSMDQNFEPKSSAKRLPDGTMVSPPLEDLSPFLPDEEMDEN MIIPRIKG >gi|157101631|gb|DS480693.1| GENE 31 32179 - 33222 860 347 aa, chain + ## HITS:1 COG:Cj1319 KEGG:ns NR:ns ## COG: Cj1319 COG0451 # Protein_GI_number: 15792642 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Campylobacter jejuni # 135 345 98 312 323 67 25.0 4e-11 MRVIVTGATSFIGKAVTKELLAGGHHVFAVVRPDSPGRGRLEQPGAEAGEKTEEETGEET GNKAGKEAGKEGGRITVLPFALKDIGKLDACPELVSCPVLKSRSDLKTKADLWLHLGWEG AGSANRQDPHVQARNIGYALDSLHTAARLGCARFLFCGSQAEYGIADGVMGEEILCHPVS EYGKDKLEVCRRAGEEAKALGIDYIHARIFSVYGPGDHPWSLVSTCLNTFLKGGRMELGE CTQQWNFLHVRDAARALTGLLLAKVPAGVYNVAGEDTRPLKSYIEQMHRLCGEKGTYEYG KRPPNAEGVVSLVPDIRKLKEAAGFAQEISFEDGIREMIGLYKENCL >gi|157101631|gb|DS480693.1| GENE 32 33219 - 34352 1324 377 aa, chain + ## HITS:1 COG:SMb21062 KEGG:ns NR:ns ## COG: SMb21062 COG0500 # Protein_GI_number: 16264389 # Func_class: Q Secondary metabolites biosynthesis, transport and 
catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Sinorhizobium meliloti # 5 366 22 396 399 95 26.0 1e-19 MKRCIACGAPLWETPLLTLDNMPASAQHMPDAQGVKEDRGLTLDLCQCMGCGLVQFDCEP VDYYRDVIRAGGFSKTMVELRRYQYKHLIDSYHLEGKKFIEVGCGQGEFLKVLSEFPVEV HGIEHDPHLVELARAQGLDVTEGFTETEDTRFAGGLYDVFLSFNFLEHQPDPSTMLQAIY RNLEDDAMGLITVPSFEYIMDHNSYYELIRDHLAYYTFETLTPLLERNGFQVEECEVINR DTLSVIVRKRPQMDTENLLECYVNLKREMETYMKYLDAWDKKVAIWGASHQGFTLAATTK LGEKARYIIDSAPFKQGKFAPASHLPIVGPDHFHEHPVDAIIITAPGYTDEIAASIRQKF GTAVEIRAMRSNHLEMV >gi|157101631|gb|DS480693.1| GENE 33 34366 - 35313 910 315 aa, chain + ## HITS:1 COG:VNG0063G KEGG:ns NR:ns ## COG: VNG0063G COG0451 # Protein_GI_number: 15789397 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Halobacterium sp. NRC-1 # 1 311 1 312 328 78 27.0 2e-14 MRVVITGATGFIGSNLARVFLEQGAHVYALVRPGSLHLDALPVHERLHTVACDLAHVLEC VPEVGHADGFFHMAWGGVNREEIDSPEVQARNVAGSLDCVEAAGRLGCRVFMDAGSRVEY GAVDGIMEEEAPCRPINEYGKAKWEFYQKAAPLCSRLGLHYYHLRFFSVYGCGDHPWSII STLVRDLRQDKKVSLSACRHMWNFMYIEDAVQAVYELYRQAVNEEPGLKKDSSPISAIVN IASQDTRPLRSFVEEIHDIVGFKGTLEYGTFVQAKEGALSIRPDITRLKRLTGGRWQERY TFRQGIEETIEKEEA >gi|157101631|gb|DS480693.1| GENE 34 35320 - 36285 1056 321 aa, chain + ## HITS:1 COG:STM0558 KEGG:ns NR:ns ## COG: STM0558 COG0463 # Protein_GI_number: 16763936 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 3 309 2 307 308 164 32.0 2e-40 MKKISVLIPCYNEAENVGPISRAVTEILEKELPQYDYELVFIDNDSTDGTRDIIRGLCAD NPRIKAILNARNFGQFNSPYYGMLQVTGDCVIEMVADFQDPVEMIPKYIHEWEKGYKIVI GIKTSSKENRLMYWLRSCYYKTIKKLSDVEQIEHFTGSGLYDREFIEVLRNLDDPTPFLR GIVAELGYRRKEIPYEQPRRRAGKTHNNFYRLYDAAMLSVTSYTKAGLRLATIFGSICAV VSMLIAMVYLVMKLIWWDRFPAGMAPMLIGMLFLGSVQLFFIGFLGEYIMSINQRVMKRP LVIEEERINFNEEEKKGGNEA >gi|157101631|gb|DS480693.1| GENE 35 36282 - 38510 2360 742 aa, chain + 
## HITS:1 COG:alr4487_2 KEGG:ns NR:ns ## COG: alr4487_2 COG0463 # Protein_GI_number: 17231979 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. PCC 7120 # 182 574 1 392 519 225 33.0 3e-58 MKYKIDVVRIRENSITLNGWAIGKSPDSKATFRVEDEKRQPVKFKHVNTRRDDVSQIYFK KVYDREFGFDIQFPYERGKDYYLLIRCEGRQAKIKYNEELIARRASVAHKRMEKLKDLMN METVHVAMDFWKEHGLKALVVKSKHKLQGIDNDYDYSEWYELTKPTDEELAEQRKHLFDF EPMLSVVIPAYKTPERYLREMLDSIMEQTYTNWEICVADGSPRGEGLERVLKKYADRDRR VRYEILGSNRGISGNTNAALDMARGDFVILADHDDTLPPNAFYEVVKAINENPDCQVIYS DEDKLDMDGKALFDPHFKPDFNPDLLTSVNYICHLFVIRQDLLKQVGGFRQEFDGAQDYD FIFRCTEAAKRVYHIPKVLYHWRCHQNSTASNPESKMYAFEAGSRAIMAHYERMGIPAVK VEKGVDYGIYHTTFEIQGEPLVSVIIPNKDHSQDLDVCVRSLMEKSSYRNLEFIIVENNS SQKETFAYYDKMQAEHTNFRVVTWKEGFNYSAINNYGASFAKGEYLLLLNNDTELIEEDS INEMLGFCQREDVGIAGARLLYGDDTIQHAGVVIGFGGIAGHTFIGLHKAENSYFHRAMC AQDYSAVTAACLMTKKSVFDQVGGLSPELAVAFNDIDYCMKVRALGKMVVYAPYSCFYHY ESKSRGLEDTPEKIARFNREVAIFIRKWPDIIQNGDPCYNPNLTLRKSNFALRDLLKEKI GEPYDLKVYEQFAPEDDTKAHE >gi|157101631|gb|DS480693.1| GENE 36 38452 - 39618 1032 388 aa, chain + ## HITS:1 COG:MTH172 KEGG:ns NR:ns ## COG: MTH172 COG1216 # Protein_GI_number: 15678200 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Methanothermobacter thermautotrophicus # 69 369 6 324 332 164 34.0 2e-40 MTLRFTNSLRRRTIRKHMSDHRNDYEAECDYKAGRGNEAGCDNKAGCGNEAGRGNETGRR TGRRRKVTIVIPNYNGLKFMEPCFKALSMQICRDFDILVVDNGSTDGSVEWLKEQEIPAI FLPENTGFSGAVNVGLKAAATPYVILLNNDTEPDFHYVGEMIKAIERSPKIFSVSCKMIQ LCRKELMDDAGDMYSLLGWAYQRGVGRSSAKYNRACRIFSACAGAAIYRREVFDEIGYFD EMHFAYLEDLDVGYRARIAGYDNIYCPTAMVYHVGSGTSGSKYNPFKVKLAARNNIYLNY KNMPLLQLAVNLVPILLGMGLKYMFFKKKGFGRDYAAGVKEGLATAKRCRKVPYNRERLK NYLAIEWELITGTLIYVYEFAARQIAKL >gi|157101631|gb|DS480693.1| GENE 37 39729 - 41147 1420 472 aa, chain + ## HITS:1 COG:wcaJ KEGG:ns NR:ns ## COG: wcaJ COG2148 # Protein_GI_number: 16129987 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar 
transferases involved in lipopolysaccharide synthesis # Organism: Escherichia coli K12 # 117 472 110 464 464 260 38.0 5e-69 MIKDNQKRLNRMHVLLDILVTVVAYALAWFIVISGKVLPLDEGVLKPQVYFMALIFIVPI YLILYASFHLYVPKRIQGRRSELANICKANVIGLMLFTFVLFGLRRFVSHLSYFSTKMIL AFFAANIILLEAERISIRIFLRSLRTNGYNQKHVLLIGYSRAAEGFIDRVSVNPEWGYHV QGILDDHRPAGFAYKKVQVLGPTNHLEDFLASNTLDEIAITLSIKEYSNLEQIVAACEKS GVHTKFIPDYNNIIPTIPYMEDLQGLPVIHIRHVPLTGVFNATMKRIVDLAGALFGLIVF SPLMLVTALLIKITSPGPVLFSQERIGLHNRPFKMYKFRSMEVQDPGRERSQWTTPHDPR VTPVGRFIRKTSIDEMPQFFNVLIGDMSLVGPRPERPLFVEKFKEEIPRYMIKHQVRPGL TGWAQVNGYRGDTSITKRIEHDLYYIENWSLGFDFKIMFLTVFKGFINKNAY >gi|157101631|gb|DS480693.1| GENE 38 41278 - 43935 1966 885 aa, chain + ## HITS:1 COG:CAC3046 KEGG:ns NR:ns ## COG: CAC3046 COG1316 # Protein_GI_number: 15896297 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 371 629 30 282 341 126 32.0 2e-28 MNREFDNGFDEEEERLRSRSRKRDLGAGAGADSDTMPRGRTGGSGQEIRGAGQAGRAGSG GGSRQGTPVSAGRASVRTASDGRGSGSAAGPSSGSGARPRYAGVGDAERTPGIPGKNGGM RSTYAGERMPERPVSGEGKTGQRPRFAGEKPAGRPGGTAGGAGERPGYTGTGGKGGARPG YTGTGSGSGERPVYTGTGSKTGEHPGYTGTGSKTGERPGYTGGGSKTGEHPGYTGTGSKT GERPGYTGAGGRSGERPRFTGDRPDGRPGVSGSRPGERPRYTGEGKGTRTAAGNLGAGET GSRTYSRSYAGAEDLGAGKGRDSSFGRNGGGAGGRRAAREAAAADPAAAEIKRRRKLIIV FIILEILFLLGTGLYRYAQNKMSLIQPSQFKPAQVTNPNIPQGKVEEMEGYWTIALFGVD SRNNSVGKGNNADVIIICNIDQGTGEIKLVSVFRDTYLSVSDNGLYNKINQAYFLGGPEQ AVEALNRNLDLQIDDFATFNWKAVVDAVNILGGVDVELSKAEFYYINAYITETVEATGVG SYQLKQAGLNHLDGVQAVAYARLRKMDTDFARTERQREIIDLCFQKLKKSDFAVVNNVME AVFPQILSSVTIDDIIPAARNLTKYTIADTMGFPAARSDANMGKKGACVIPQTLESNVTL LHQFLFGDENYQPSDMVKKISAKISADTGMYNEAKPIDHVGTDGGYIPKPTQATKATEET KENESESSTSGTDESIIDGETDLEIETDEFGNELDPPEDDIFGRPGESSGSGTVHPGRPG ESSSGSIFPGAETSEGDQTTGPGAITYPGQDPTRGTSAAYPGASKGTTAAYPGSQGTSPA YPGSTKGSTAAYPGSTKGSTAAYPGASSEEYVPEGPGSVIIGPGN >gi|157101631|gb|DS480693.1| GENE 39 44123 - 45547 1628 474 aa, chain + ## HITS:1 COG:TM0571 KEGG:ns NR:ns ## COG: TM0571 COG0265 # Protein_GI_number: 15643337 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Thermotoga maritima # 134 447 29 339 459 166 35.0 1e-40 MYDNDQNYGWKSYDNQSYDNRQYGGDPYGRRPQPEDMPPKKKKGGILKKAVLITAGALLF GTVSGATMVGVNVAASRFMGNSAAASAEAEKKEEISKAQTESQGSGQDNGQGSAQGNGGY QQGQRNNGKAVADVSDIVEQAMPTVVAITSTAVYQSNNYGYGWFFRGGPQTYEVPSSGSG IIVGENDKELLIVTNNHVVEDSTSLKVAFIDSEVVDAAIKGTDAETDLAVIAVPLEQIKD DTKSKIKVARLGNSDELKVGQGVVAIGNALGYGQSVTVGYVSALNREVRVSNTSTRELLQ TDAAINPGNSGGALLNMKGEVIGINAAKYSSTEVEGIGYAIPISKAEDIMNQLMNRKTMN PVEEAKRGYLGIQGTSVDEESAAAFGMPRGVYVYKILEEGAAAQSDLREKDIITRVDGQS VRNMTDLQELLACYEMGEQIDLTVQSQKDGEYQERTVTITLKAMPQENTETQAQ >gi|157101631|gb|DS480693.1| GENE 40 45681 - 47114 241 477 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 122 474 10 309 466 97 26 1e-19 MDEKEYTQDVETSGHGEDKDKQKEEDRYEKVCYMCRRPESKAGPMISMPGGMNLCHDCMQ KAFDSVTKGGMDFSKLPNMPYMNMNLNDLNMTPPAVEIPKKQKIKKKAKEQPVLTMKDIP APHVIKAELDEYVIGQEKAKKVMAVAVYNHYKRAFLDQPPQDGAGEDAGTDKNIVIEKSN ILMIGPTGSGKTYLVKTLARLLDVPLAIADATSLTEAGYIGDDIESVVSKLLAAADNDVE RAQRGIIFIDEIDKIAKKKQTNARDVSGESVQQELLKLLEGSTVEVPVGSNQKNAMTPMA TVNTDNILFICGGAFPDLEEIIKERLMKKITMGFGSILKDTYDKDPDILGQVTNEDLRTF GMIPEFLGRLPVTVTLQGLTEDMMVRILKEPKNAITKQYERLLEMDEVRLVFEDEALKWI AGEAIKRGTGARALRAILEEFMLDIMYEIPKDPNIGSVVVTRPYLEKSGGPLIQMRG >gi|157101631|gb|DS480693.1| GENE 41 47157 - 48803 1235 548 aa, chain - ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 28 547 597 1098 1118 287 34.0 4e-77 MDYVNDEYNIIYTPLDTVTPISLARYSYYTVPGLFSLLDSSSMEASGILTTFAAPALGNR GRGTLIGIVDTGIDYTNPLFRYQDGSTRIAGLWDQSIPTGEDVIPPGVPAYYELSGASYG TEFTREQINQALASDSPLDLVPSTDTNGHGTFLAGIAAGGSLPEQDFTGAAPECEIIVVK 
LKQAKRYLREFYLVSQDADAYQENDIMMGIKYLRVTAFRLGRPLIILIALGSNLGSHEGT SPLSSVVQDASRFLGRAAVIAAGNETGRAHHHFGTIPSGQEWDDVEIRVGPEESARGFSL ELWASTADTYSVGFVSPSGEIISRIPIIARNETSIPFLLEPTVITVNYQLIESGAGKQLI FMRFRNPVAGIWKVRVYNTQYFTGEFHMWLPSEGLVSDETVFLRPTPDTTITLPGNTAAP ITVGAYNHLNNSIYIHSSRGFTPSGIVKPELAAPGVNVMGPSVGRRAGGSVPMTTRSGTS VAAAHVAGAVASLFGWGIVESNQITMSQASVKSYLIRGARRNPALRYPNEEWGYGALDLY ETFRRIRE >gi|157101631|gb|DS480693.1| GENE 42 49027 - 49515 636 162 aa, chain + ## HITS:1 COG:BMEI1799 KEGG:ns NR:ns ## COG: BMEI1799 COG0597 # Protein_GI_number: 17988082 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Brucella melitensis # 13 150 30 163 171 65 30.0 3e-11 MIYGWIIGGLAALDLGVKSVVEGQEDDTFPRDLPSAKGFIKLHKNHNSGFPFGFMKERPE LVKGIPLMVISAMAGALAAMMQDKGKTGEKLGLSLVMGGALSNLYDRVVRGYVVDYFTIE WKSLKKVVFNLGDMFVFLGSAVFVLAQAVGSLEDAGVKKKKK >gi|157101631|gb|DS480693.1| GENE 43 49615 - 49974 378 119 aa, chain - ## HITS:1 COG:no KEGG:Closa_0598 NR:ns ## KEGG: Closa_0598 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 119 1 119 119 92 60.0 4e-18 MENGIVKPMLRSLLISYVLSGILLAALAFALYKLRLKEGQVNLMVYAVYLLTCLCGGFLA GKRIRQRRFFWGLLSGLLYFLVLFAVSWAMNMGSAIDMERSVTVMGICALGGTIGGMLS >gi|157101631|gb|DS480693.1| GENE 44 50145 - 50235 118 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKHVKTLNTNTLKDTMKKGGCGECQTSCQS Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:00:39 2011 Seq name: gi|157101630|gb|DS480694.1| Clostridium bolteae ATCC BAA-613 Scfld_02_35 genomic scaffold, whole genome shotgun sequence Length of sequence - 460839 bp Number of predicted genes - 424, with homology - 418 Number of transcription units - 177, operones - 104 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 4.4 1 1 Op 1 . + CDS 85 - 624 458 ## COG0655 Multimeric flavodoxin WrbA 2 1 Op 2 . 
+ CDS 637 - 789 70 ## gi|160939991|ref|ZP_02087336.1| hypothetical protein CLOBOL_04880 + Term 813 - 857 2.5 + Prom 796 - 855 2.9 3 2 Op 1 . + CDS 879 - 1103 166 ## Pjdr2_1921 transcriptional regulator, XRE family 4 2 Op 2 . + CDS 1165 - 1566 274 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 5 2 Op 3 . + CDS 1601 - 3547 1516 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Term 3608 - 3655 0.7 + Prom 4109 - 4168 5.5 6 3 Op 1 . + CDS 4394 - 5431 1076 ## COG2008 Threonine aldolase 7 3 Op 2 . + CDS 5447 - 6679 1631 ## COG1301 Na+/H+-dicarboxylate symporters + Term 6757 - 6803 12.7 - Term 6745 - 6789 5.1 8 4 Op 1 . - CDS 6833 - 7513 523 ## COG0120 Ribose 5-phosphate isomerase 9 4 Op 2 . - CDS 7589 - 9001 1153 ## COG0364 Glucose-6-phosphate 1-dehydrogenase 10 4 Op 3 . - CDS 8967 - 10391 1291 ## COG0362 6-phosphogluconate dehydrogenase 11 4 Op 4 1/0.163 - CDS 10420 - 11514 571 ## COG2706 3-carboxymuconate cyclase - Prom 11562 - 11621 8.9 - Term 11707 - 11771 22.2 12 5 Tu 1 . - CDS 11823 - 13310 1530 ## COG1012 NAD-dependent aldehyde dehydrogenases - Prom 13492 - 13551 10.8 + Prom 13378 - 13437 14.0 13 6 Tu 1 . + CDS 13540 - 13719 181 ## gi|160940005|ref|ZP_02087350.1| hypothetical protein CLOBOL_04894 + Prom 13777 - 13836 11.1 14 7 Op 1 . + CDS 13860 - 16037 1357 ## COG1203 Predicted helicases 15 7 Op 2 . + CDS 16081 - 16740 480 ## EUBREC_2494 hypothetical protein 16 7 Op 3 . + CDS 16737 - 18512 1507 ## EUBREC_2493 hypothetical protein 17 7 Op 4 2/0.093 + CDS 18514 - 19404 988 ## COG3649 Uncharacterized protein predicted to be involved in DNA repair 18 7 Op 5 12/0.000 + CDS 19431 - 20066 368 ## COG1468 RecB family exonuclease 19 7 Op 6 13/0.000 + CDS 20063 - 21094 695 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 20 7 Op 7 . + CDS 21133 - 21423 266 ## COG1343 Uncharacterized protein predicted to be involved in DNA repair - Term 22613 - 22651 6.5 21 8 Op 1 . 
- CDS 22691 - 23494 786 ## COG1402 Uncharacterized protein, putative amidase 22 8 Op 2 . - CDS 23533 - 24141 514 ## COG3601 Predicted membrane protein - Prom 24304 - 24363 2.6 + Prom 24485 - 24544 6.7 23 9 Op 1 . + CDS 24695 - 26101 1327 ## COG0232 dGTP triphosphohydrolase 24 9 Op 2 . + CDS 26113 - 26499 256 ## BT_4485 hypothetical protein 25 9 Op 3 . + CDS 26496 - 28268 1409 ## Closa_0369 plasmid pRiA4b ORF-3 family protein 26 9 Op 4 . + CDS 28291 - 28464 290 ## gi|160940019|ref|ZP_02087364.1| hypothetical protein CLOBOL_04908 27 9 Op 5 2/0.093 + CDS 28504 - 29433 899 ## COG2390 Transcriptional regulator, contains sigma factor-related N-terminal domain + Term 29460 - 29499 5.1 + Prom 29509 - 29568 4.7 28 10 Op 1 10/0.000 + CDS 29607 - 30227 806 ## COG2376 Dihydroxyacetone kinase 29 10 Op 2 . + CDS 30358 - 31356 1300 ## COG2376 Dihydroxyacetone kinase 30 10 Op 3 . + CDS 31429 - 31749 449 ## Elen_1231 hypothetical protein 31 10 Op 4 3/0.047 + CDS 31762 - 32604 1016 ## COG0191 Fructose/tagatose bisphosphate aldolase + Prom 32716 - 32775 7.2 32 11 Tu 1 . + CDS 32810 - 33805 884 ## COG2222 Predicted phosphosugar isomerases + Prom 33861 - 33920 9.1 33 12 Op 1 . + CDS 33977 - 34699 594 ## COG2188 Transcriptional regulators 34 12 Op 2 5/0.023 + CDS 34717 - 35553 626 ## COG1082 Sugar phosphate isomerases/epimerases + Term 35586 - 35631 -0.0 + Prom 35675 - 35734 9.2 35 13 Op 1 1/0.163 + CDS 35770 - 36669 689 ## COG0524 Sugar kinases, ribokinase family 36 13 Op 2 35/0.000 + CDS 36714 - 38087 1013 ## COG1653 ABC-type sugar transport system, periplasmic component 37 13 Op 3 38/0.000 + CDS 38149 - 39024 720 ## COG1175 ABC-type sugar transport systems, permease components 38 13 Op 4 . + CDS 39038 - 39862 737 ## COG0395 ABC-type sugar transport system, permease component + Term 39886 - 39923 6.3 + Prom 39889 - 39948 6.2 39 14 Tu 1 . + CDS 39975 - 40784 691 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family 40 15 Tu 1 . 
- CDS 40781 - 41623 560 ## COG1284 Uncharacterized conserved protein - Prom 41643 - 41702 1.6 - Term 41667 - 41712 6.3 41 16 Tu 1 . - CDS 41732 - 42700 832 ## COG2222 Predicted phosphosugar isomerases - Prom 42803 - 42862 9.9 - Term 42854 - 42908 13.3 42 17 Tu 1 . - CDS 42935 - 43186 202 ## Amet_1294 GntR family transcriptional regulator - Prom 43276 - 43335 5.2 + Prom 43215 - 43274 10.2 43 18 Tu 1 . + CDS 43377 - 44048 891 ## TepRe1_1961 Asp/Glu/hydantoin racemase + Prom 44059 - 44118 3.3 44 19 Op 1 . + CDS 44175 - 45221 265 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 45 19 Op 2 . + CDS 45232 - 45714 450 ## Amico_1587 tripartite ATP-independent periplasmic transporter DctQ component 46 19 Op 3 1/0.163 + CDS 45716 - 46987 722 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 47 20 Tu 1 . + CDS 47152 - 47889 816 ## COG2186 Transcriptional regulators + Prom 47896 - 47955 2.0 48 21 Op 1 . + CDS 47990 - 48343 341 ## COG0662 Mannose-6-phosphate isomerase 49 21 Op 2 . + CDS 48376 - 49188 1028 ## COG3246 Uncharacterized conserved protein 50 21 Op 3 . + CDS 49207 - 50625 1430 ## COG3395 Uncharacterized protein conserved in bacteria + Term 50735 - 50781 8.0 51 22 Tu 1 . + CDS 50976 - 51884 808 ## COG1737 Transcriptional regulators + Term 51924 - 51964 9.2 + Prom 51987 - 52046 2.8 52 23 Op 1 17/0.000 + CDS 52114 - 52899 366 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 53 23 Op 2 . + CDS 52922 - 55549 2277 ## COG1178 ABC-type Fe3+ transport system, permease component 54 23 Op 3 . + CDS 55648 - 57525 1670 ## COG0075 Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase 55 23 Op 4 . + CDS 57671 - 57946 260 ## Closa_0373 sporulation transcriptional regulator SpoIIID + Term 58017 - 58070 12.3 - Term 58005 - 58058 3.4 56 24 Tu 1 . - CDS 58079 - 59614 1758 ## COG0591 Na+/proline symporter - Prom 59710 - 59769 8.8 57 25 Tu 1 . 
- CDS 59848 - 60441 714 ## COG1309 Transcriptional regulator - Prom 60614 - 60673 4.3 + Prom 60576 - 60635 8.9 58 26 Tu 1 . + CDS 60675 - 61700 1230 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase + Prom 61716 - 61775 3.0 59 27 Tu 1 . + CDS 61827 - 62252 487 ## Closa_0380 hypothetical protein + Term 62355 - 62395 -0.0 + Prom 62299 - 62358 6.8 60 28 Op 1 . + CDS 62432 - 63082 719 ## COG1045 Serine acetyltransferase 61 28 Op 2 . + CDS 63109 - 63981 878 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 62 28 Op 3 . + CDS 64081 - 65667 1967 ## COG1388 FOG: LysM repeat + Term 65727 - 65770 3.6 + Prom 65756 - 65815 5.5 63 29 Op 1 . + CDS 65842 - 66753 1068 ## COG4866 Uncharacterized conserved protein 64 29 Op 2 . + CDS 66750 - 67877 808 ## COG4552 Predicted acetyltransferase involved in intracellular survival and related acetyltransferases 65 29 Op 3 . + CDS 67966 - 68688 764 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 66 29 Op 4 . + CDS 68707 - 69699 1017 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 67 29 Op 5 1/0.163 + CDS 69696 - 69944 229 ## COG4481 Uncharacterized protein conserved in bacteria + Prom 69984 - 70043 9.0 68 30 Op 1 24/0.000 + CDS 70113 - 70400 346 ## PROTEIN SUPPORTED gi|160881892|ref|YP_001560860.1| ribosomal protein S6 69 30 Op 2 21/0.000 + CDS 70450 - 70914 382 ## COG0629 Single-stranded DNA-binding protein 70 30 Op 3 . + CDS 70952 - 71221 478 ## PROTEIN SUPPORTED gi|160940069|ref|ZP_02087414.1| hypothetical protein CLOBOL_04958 + Term 71253 - 71309 11.1 - Term 71247 - 71289 4.0 71 31 Op 1 . - CDS 71332 - 71679 442 ## gi|160940070|ref|ZP_02087415.1| hypothetical protein CLOBOL_04959 72 31 Op 2 . 
- CDS 71755 - 72819 592 ## COG3053 Citrate lyase synthetase - Prom 72866 - 72925 3.9 + Prom 72876 - 72935 8.3 73 32 Tu 1 5/0.023 + CDS 73020 - 74705 1751 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 74768 - 74808 7.3 + Prom 74746 - 74805 1.9 74 33 Op 1 44/0.000 + CDS 74845 - 75825 990 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 75 33 Op 2 6/0.023 + CDS 75818 - 76837 938 ## COG4608 ABC-type oligopeptide transport system, ATPase component 76 33 Op 3 49/0.000 + CDS 76834 - 77781 260 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 77 33 Op 4 . + CDS 77823 - 78776 905 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 78 33 Op 5 2/0.093 + CDS 78810 - 79703 1021 ## COG0583 Transcriptional regulator + Term 79806 - 79853 5.2 + Prom 79716 - 79775 3.5 79 34 Op 1 27/0.000 + CDS 79935 - 81113 1392 ## COG0845 Membrane-fusion protein 80 34 Op 2 . + CDS 81128 - 84181 3463 ## COG0841 Cation/multidrug efflux pump + Prom 84437 - 84496 9.1 81 35 Op 1 . + CDS 84662 - 84970 412 ## gi|160940083|ref|ZP_02087428.1| hypothetical protein CLOBOL_04972 82 35 Op 2 16/0.000 + CDS 84989 - 87004 2206 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I 83 35 Op 3 . + CDS 87007 - 87489 603 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 84 35 Op 4 . + CDS 87532 - 88125 605 ## TepRe1_2081 V-type proton ATPase subunit E 85 35 Op 5 13/0.000 + CDS 88147 - 89115 995 ## COG1527 Archaeal/vacuolar-type H+-ATPase subunit C 86 35 Op 6 12/0.000 + CDS 89108 - 89428 392 ## COG1436 Archaeal/vacuolar-type H+-ATPase subunit F 87 35 Op 7 16/0.000 + CDS 89444 - 91219 2078 ## COG1155 Archaeal/vacuolar-type H+-ATPase subunit A 88 35 Op 8 16/0.000 + CDS 91219 - 92592 1609 ## COG1156 Archaeal/vacuolar-type H+-ATPase subunit B 89 35 Op 9 . 
+ CDS 92701 - 93384 1032 ## COG1394 Archaeal/vacuolar-type H+-ATPase subunit D + Term 93463 - 93524 11.6 + Prom 93416 - 93475 5.5 90 36 Op 1 . + CDS 93548 - 94354 702 ## COG0526 Thiol-disulfide isomerase and thioredoxins 91 36 Op 2 . + CDS 94373 - 94522 222 ## gi|160940095|ref|ZP_02087440.1| hypothetical protein CLOBOL_04984 92 36 Op 3 . + CDS 94566 - 95399 645 ## COG0348 Polyferredoxin + Term 95413 - 95470 2.3 + Prom 95494 - 95553 3.9 93 37 Op 1 44/0.000 + CDS 95631 - 96629 600 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 94 37 Op 2 7/0.023 + CDS 96633 - 97640 890 ## COG4608 ABC-type oligopeptide transport system, ATPase component 95 38 Op 1 38/0.000 + CDS 97768 - 99402 1970 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Prom 99559 - 99618 2.8 96 38 Op 2 49/0.000 + CDS 99645 - 100664 1320 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 97 38 Op 3 . + CDS 100664 - 101500 1043 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 98 38 Op 4 . + CDS 101518 - 102609 1286 ## COG0006 Xaa-Pro aminopeptidase 99 38 Op 5 . + CDS 102632 - 103204 619 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family + Term 103298 - 103328 -1.0 + Prom 103239 - 103298 5.5 100 39 Tu 1 . + CDS 103493 - 103699 324 ## PROTEIN SUPPORTED gi|238917368|ref|YP_002930885.1| large subunit ribosomal protein L31 + Term 103715 - 103754 8.2 101 40 Op 1 1/0.163 + CDS 103808 - 104893 1078 ## COG3872 Predicted metal-dependent enzyme 102 40 Op 2 32/0.000 + CDS 104847 - 105710 383 ## PROTEIN SUPPORTED gi|223485211|ref|YP_002587537.1| ribosomal protein L11 methyltransferase 103 40 Op 3 . + CDS 105853 - 106929 1439 ## COG0216 Protein chain release factor A + Prom 107245 - 107304 3.8 104 41 Op 1 . + CDS 107359 - 107838 498 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 105 41 Op 2 . 
+ CDS 107835 - 108821 1085 ## gi|160940113|ref|ZP_02087458.1| hypothetical protein CLOBOL_05002 + Prom 108908 - 108967 3.8 106 42 Op 1 . + CDS 109065 - 109943 1040 ## COG2207 AraC-type DNA-binding domain-containing proteins 107 42 Op 2 . + CDS 109980 - 110537 551 ## Clos_0809 hypothetical protein 108 42 Op 3 . + CDS 110513 - 111103 635 ## Clos_0810 hypothetical protein + Prom 111269 - 111328 5.4 109 43 Op 1 35/0.000 + CDS 111378 - 112778 1531 ## COG1653 ABC-type sugar transport system, periplasmic component 110 43 Op 2 38/0.000 + CDS 112854 - 113735 972 ## COG1175 ABC-type sugar transport systems, permease components 111 43 Op 3 3/0.047 + CDS 113732 - 114562 1052 ## COG0395 ABC-type sugar transport system, permease component 112 43 Op 4 9/0.000 + CDS 114598 - 115719 1266 ## COG0673 Predicted dehydrogenases and related proteins 113 43 Op 5 . + CDS 115748 - 116863 1123 ## COG0673 Predicted dehydrogenases and related proteins 114 43 Op 6 . + CDS 116866 - 117789 1021 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 115 43 Op 7 . + CDS 117791 - 118255 404 ## COG2731 Beta-galactosidase, beta subunit 116 43 Op 8 . + CDS 118283 - 118987 671 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase + Term 119018 - 119066 8.6 + Prom 119061 - 119120 5.2 117 44 Tu 1 . + CDS 119188 - 121167 1613 ## COG4886 Leucine-rich repeat (LRR) protein + Prom 121194 - 121253 5.3 118 45 Op 1 16/0.000 + CDS 121320 - 122219 1074 ## COG1209 dTDP-glucose pyrophosphorylase 119 45 Op 2 11/0.000 + CDS 122235 - 123257 1354 ## COG1088 dTDP-D-glucose 4,6-dehydratase + Term 123290 - 123338 -0.6 120 45 Op 3 . + CDS 123373 - 124233 932 ## COG1091 dTDP-4-dehydrorhamnose reductase 121 45 Op 4 . + CDS 124307 - 125659 1748 ## COG1109 Phosphomannomutase + Term 125835 - 125882 16.1 122 46 Tu 1 . + CDS 126173 - 127840 1781 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases + Term 127904 - 127939 7.2 + Prom 128146 - 128205 7.2 123 47 Op 1 . 
+ CDS 128255 - 129400 775 ## COG0006 Xaa-Pro aminopeptidase 124 47 Op 2 11/0.000 + CDS 129418 - 130536 418 ## PROTEIN SUPPORTED gi|149195933|ref|ZP_01872989.1| Ribosomal protein L22 125 47 Op 3 11/0.000 + CDS 130553 - 131026 198 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 126 47 Op 4 . + CDS 131056 - 132330 804 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 127 47 Op 5 . + CDS 132359 - 133045 720 ## COG1414 Transcriptional regulator + Term 133074 - 133128 4.2 - Term 133060 - 133116 7.0 128 48 Tu 1 . - CDS 133187 - 134818 1613 ## CLK_0336 hypothetical protein - Prom 134879 - 134938 5.6 + Prom 134936 - 134995 5.1 129 49 Op 1 . + CDS 135025 - 135813 508 ## Closa_0158 hypothetical protein 130 49 Op 2 . + CDS 135912 - 136637 684 ## COG2188 Transcriptional regulators + Term 136778 - 136812 1.5 + Prom 136715 - 136774 6.2 131 50 Op 1 9/0.000 + CDS 136834 - 138777 1979 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 132 50 Op 2 . + CDS 138843 - 140489 1458 ## COG0366 Glycosidases 133 50 Op 3 . + CDS 140589 - 143423 2607 ## COG0642 Signal transduction histidine kinase + Prom 143547 - 143606 4.7 134 51 Op 1 . + CDS 143643 - 144743 1353 ## COG0562 UDP-galactopyranose mutase 135 51 Op 2 . + CDS 144777 - 145874 982 ## gi|160940149|ref|ZP_02087494.1| hypothetical protein CLOBOL_05038 136 51 Op 3 8/0.023 + CDS 145871 - 146908 880 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 137 51 Op 4 . + CDS 146987 - 148438 1434 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 138 51 Op 5 . + CDS 148515 - 149480 1029 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) 139 51 Op 6 . + CDS 149500 - 150507 1036 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 150705 - 150731 -0.6 - Term 150479 - 150519 9.1 140 52 Tu 1 . 
- CDS 150584 - 152143 1281 ## Closa_3823 ErfK/YbiS/YcfS/YnhG family protein - Prom 152163 - 152222 7.9 + Prom 152194 - 152253 10.3 141 53 Op 1 . + CDS 152309 - 153328 926 ## COG1442 Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases 142 53 Op 2 . + CDS 153345 - 154226 892 ## COG1216 Predicted glycosyltransferases 143 54 Op 1 5/0.023 + CDS 154383 - 155498 1251 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 144 54 Op 2 8/0.023 + CDS 155508 - 156194 694 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 145 54 Op 3 8/0.023 + CDS 156208 - 157566 1258 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 146 54 Op 4 . + CDS 157547 - 158530 1081 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 147 54 Op 5 . + CDS 158556 - 159005 470 ## Closa_3819 hypothetical protein 148 54 Op 6 . + CDS 158995 - 160677 1513 ## COG4713 Predicted membrane protein + Prom 160694 - 160753 5.0 149 55 Op 1 . + CDS 160831 - 161385 645 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 150 55 Op 2 . + CDS 161405 - 162217 877 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 151 55 Op 3 . + CDS 162236 - 162835 662 ## COG1451 Predicted metal-dependent hydrolase - Term 162726 - 162757 1.5 152 56 Tu 1 . - CDS 162877 - 163326 547 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 163365 - 163424 5.0 + Prom 163408 - 163467 5.5 153 57 Op 1 . + CDS 163487 - 163873 396 ## CA_C2946 hypothetical protein 154 57 Op 2 . + CDS 163892 - 164842 823 ## Amet_3885 hypothetical protein 155 58 Tu 1 . + CDS 164960 - 165340 361 ## COG0346 Lactoylglutathione lyase and related lyases + Term 165509 - 165559 4.0 - Term 165134 - 165178 0.2 156 59 Tu 1 . - CDS 165329 - 166507 968 ## COG1757 Na+/H+ antiporter - Prom 166730 - 166789 4.3 + Prom 166653 - 166712 4.8 157 60 Tu 1 . 
+ CDS 166737 - 168206 877 ## COG3666 Transposase and inactivated derivatives + Term 168208 - 168239 4.1 - Term 168142 - 168196 11.1 158 61 Op 1 . - CDS 168250 - 169608 1422 ## COG2610 H+/gluconate symporter and related permeases - Prom 169629 - 169688 3.4 159 61 Op 2 . - CDS 169690 - 170988 1212 ## Spirs_0433 hypothetical protein 160 61 Op 3 1/0.163 - CDS 171007 - 172500 1129 ## COG2721 Altronate dehydratase - Prom 172520 - 172579 9.3 - Term 172556 - 172599 0.2 161 62 Tu 1 . - CDS 172630 - 173265 568 ## COG1802 Transcriptional regulators - Prom 173340 - 173399 6.3 + Prom 173375 - 173434 10.5 162 63 Op 1 2/0.093 + CDS 173512 - 174510 757 ## COG3734 2-keto-3-deoxy-galactonokinase 163 63 Op 2 . + CDS 174524 - 175189 478 ## COG0800 2-keto-3-deoxy-6-phosphogluconate aldolase + Term 175217 - 175251 6.0 + Prom 175304 - 175363 8.7 164 64 Tu 1 . + CDS 175517 - 178243 1273 ## COG2200 FOG: EAL domain 165 65 Tu 1 . - CDS 178383 - 180941 1148 ## COG0642 Signal transduction histidine kinase - Prom 181130 - 181189 8.1 + Prom 180918 - 180977 5.8 166 66 Tu 1 . + CDS 181118 - 182581 474 ## Elen_3094 regulatory protein GntR HTH + Prom 182610 - 182669 6.4 167 67 Op 1 . + CDS 182890 - 183534 676 ## COG0036 Pentose-5-phosphate-3-epimerase 168 67 Op 2 . + CDS 183599 - 184351 210 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 169 67 Op 3 . + CDS 184394 - 185659 1477 ## COG3775 Phosphotransferase system, galactitol-specific IIC component 170 67 Op 4 1/0.163 + CDS 185675 - 186115 422 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 171 67 Op 5 . + CDS 186161 - 186700 699 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation 172 67 Op 6 . + CDS 186784 - 187083 464 ## PPA0023 PTS system, galactitol-specific IIB component (EC:2.7.1.69) + Prom 187087 - 187146 8.3 173 68 Tu 1 . 
+ CDS 187295 - 189373 1846 ## COG3711 Transcriptional antiterminator + Term 189581 - 189611 -0.6 174 69 Op 1 40/0.000 - CDS 189397 - 190422 492 ## COG0642 Signal transduction histidine kinase 175 69 Op 2 . - CDS 190419 - 191096 694 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 191144 - 191203 7.3 + Prom 191218 - 191277 6.1 176 70 Op 1 . + CDS 191333 - 191950 450 ## gi|160940197|ref|ZP_02087542.1| hypothetical protein CLOBOL_05086 177 70 Op 2 27/0.000 + CDS 191972 - 193111 1141 ## COG0845 Membrane-fusion protein 178 70 Op 3 . + CDS 193127 - 196261 2858 ## COG0841 Cation/multidrug efflux pump 179 70 Op 4 . + CDS 196258 - 197433 1189 ## Closa_0525 hypothetical protein 180 71 Op 1 . + CDS 197566 - 198732 1150 ## Closa_0526 hypothetical protein 181 71 Op 2 . + CDS 198758 - 199219 332 ## gi|160940202|ref|ZP_02087547.1| hypothetical protein CLOBOL_05091 182 71 Op 3 . + CDS 199224 - 199754 249 ## gi|160940203|ref|ZP_02087548.1| hypothetical protein CLOBOL_05092 183 71 Op 4 . + CDS 199807 - 200232 428 ## EUBELI_00061 hypothetical protein + Term 200234 - 200267 5.4 + Prom 200239 - 200298 3.5 184 72 Op 1 5/0.023 + CDS 200370 - 200777 200 ## COG0534 Na+-driven multidrug efflux pump 185 72 Op 2 . + CDS 200722 - 201747 761 ## COG0534 Na+-driven multidrug efflux pump + Term 201815 - 201858 8.4 + Prom 202111 - 202170 7.9 186 73 Tu 1 . + CDS 202318 - 202752 626 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA + Prom 202791 - 202850 2.3 187 74 Op 1 . + CDS 202886 - 204076 956 ## CPF_0400 putative glucuronyl hydrolase 188 74 Op 2 13/0.000 + CDS 204136 - 204627 545 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 189 74 Op 3 13/0.000 + CDS 204662 - 205450 1080 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 190 74 Op 4 . 
+ CDS 205434 - 206249 945 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID + Term 206259 - 206322 16.8 + Prom 206272 - 206331 2.5 191 75 Op 1 . + CDS 206400 - 208505 1291 ## CPF_0406 heparinase II/III-like protein 192 75 Op 2 . + CDS 208553 - 209557 1105 ## COG1609 Transcriptional regulators + Prom 209611 - 209670 8.7 193 76 Tu 1 . + CDS 209707 - 210474 743 ## COG1349 Transcriptional regulators of sugar metabolism + Term 210495 - 210561 0.6 + Prom 210617 - 210676 10.7 194 77 Op 1 3/0.047 + CDS 210730 - 211779 436 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 195 77 Op 2 . + CDS 211801 - 213162 974 ## COG3395 Uncharacterized protein conserved in bacteria 196 77 Op 3 . + CDS 213213 - 214085 737 ## COG3246 Uncharacterized conserved protein 197 77 Op 4 . + CDS 214085 - 215413 874 ## COG2610 H+/gluconate symporter and related permeases + Prom 215430 - 215489 5.7 198 78 Tu 1 . + CDS 215516 - 216142 255 ## Closa_1315 hypothetical protein + Term 216216 - 216264 9.0 + Prom 216147 - 216206 1.6 199 79 Op 1 . + CDS 216276 - 217760 1505 ## COG0466 ATP-dependent Lon protease, bacterial type 200 79 Op 2 . + CDS 217843 - 218382 549 ## COG0350 Methylated DNA-protein cysteine methyltransferase + Term 218479 - 218516 8.5 - Term 218469 - 218501 5.0 201 80 Tu 1 . - CDS 218702 - 219046 78 ## COG1733 Predicted transcriptional regulators - Prom 219128 - 219187 6.5 + Prom 219163 - 219222 6.0 202 81 Tu 1 . + CDS 219247 - 220050 341 ## COG1145 Ferredoxin + Prom 220090 - 220149 6.3 203 82 Tu 1 . + CDS 220180 - 220458 375 ## gi|160940227|ref|ZP_02087572.1| hypothetical protein CLOBOL_05116 + Prom 220795 - 220854 2.3 204 83 Op 1 . + CDS 221008 - 222066 1287 ## COG4822 Cobalamin biosynthesis protein CbiK, Co2+ chelatase + Term 222082 - 222123 5.0 205 83 Op 2 . 
+ CDS 222143 - 223111 948 ## Shel_24810 hypothetical protein 206 83 Op 3 33/0.000 + CDS 223132 - 224307 942 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 207 83 Op 4 35/0.000 + CDS 224324 - 225343 902 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component + Prom 225371 - 225430 3.1 208 83 Op 5 . + CDS 225505 - 226608 955 ## COG1120 ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components 209 84 Tu 1 . - CDS 226610 - 227401 715 ## COG0789 Predicted transcriptional regulators - Prom 227438 - 227497 6.3 + Prom 227397 - 227456 3.6 210 85 Tu 1 . + CDS 227510 - 228844 1122 ## COG0534 Na+-driven multidrug efflux pump + Prom 228865 - 228924 4.2 211 86 Op 1 . + CDS 228971 - 229681 763 ## COG2243 Precorrin-2 methylase 212 86 Op 2 45/0.000 + CDS 229681 - 230526 813 ## COG1131 ABC-type multidrug transport system, ATPase component 213 86 Op 3 . + CDS 230592 - 231254 611 ## COG0842 ABC-type multidrug transport system, permease component 214 86 Op 4 . + CDS 231298 - 232716 1485 ## COG1625 Fe-S oxidoreductase, related to NifB/MoaA family + Prom 232734 - 232793 4.4 215 87 Tu 1 . + CDS 232858 - 233721 877 ## COG0685 5,10-methylenetetrahydrofolate reductase 216 88 Op 1 . - CDS 233735 - 234643 1081 ## COG0583 Transcriptional regulator 217 88 Op 2 . - CDS 234713 - 235627 411 ## Geob_3648 sugar transferase 218 88 Op 3 . - CDS 235631 - 237682 1743 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain - Prom 237839 - 237898 5.2 + Prom 237810 - 237869 7.6 219 89 Op 1 . + CDS 237985 - 239166 1153 ## COG0025 NhaP-type Na+/H+ and K+/H+ antiporters + Prom 239173 - 239232 4.6 220 89 Op 2 . + CDS 239260 - 240183 724 ## COG0388 Predicted amidohydrolase 221 89 Op 3 . + CDS 240204 - 243560 2995 ## COG0642 Signal transduction histidine kinase + Term 243736 - 243794 12.2 222 90 Tu 1 . 
- CDS 243645 - 243821 63 ## gi|160940249|ref|ZP_02087594.1| hypothetical protein CLOBOL_05138 - Prom 243844 - 243903 3.4 223 91 Tu 1 . + CDS 243896 - 244975 1152 ## COG0628 Predicted permease + Prom 245007 - 245066 6.1 224 92 Op 1 . + CDS 245086 - 245712 424 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 225 92 Op 2 . + CDS 245717 - 246781 1085 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 226 92 Op 3 . + CDS 246832 - 247845 992 ## COG0451 Nucleoside-diphosphate-sugar epimerases 227 92 Op 4 . + CDS 247914 - 248510 401 ## PROTEIN SUPPORTED gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) + Term 248566 - 248609 3.5 + Prom 248671 - 248730 1.6 228 93 Op 1 . + CDS 248789 - 249427 631 ## Rumal_3740 lipoprotein 229 93 Op 2 . + CDS 249424 - 250350 717 ## gi|160940257|ref|ZP_02087602.1| hypothetical protein CLOBOL_05146 230 93 Op 3 . + CDS 250350 - 251483 905 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) 231 93 Op 4 . + CDS 251480 - 251839 498 ## COG1733 Predicted transcriptional regulators + Prom 251863 - 251922 4.9 232 94 Tu 1 . + CDS 252003 - 252419 451 ## COG3871 Uncharacterized stress protein (general stress protein 26) + Term 252493 - 252535 2.4 + Prom 252708 - 252767 8.0 233 95 Op 1 . + CDS 252928 - 253494 629 ## COG3236 Uncharacterized protein conserved in bacteria 234 95 Op 2 . + CDS 253496 - 254431 693 ## COG1051 ADP-ribose pyrophosphatase 235 95 Op 3 . 
+ CDS 254435 - 255250 834 ## COG4295 Uncharacterized protein conserved in bacteria + Term 255441 - 255478 5.1 + Prom 255258 - 255317 5.8 236 96 Op 1 3/0.047 + CDS 255507 - 255803 447 ## COG0011 Uncharacterized conserved protein 237 96 Op 2 21/0.000 + CDS 255772 - 256545 801 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 238 96 Op 3 17/0.000 + CDS 256592 - 257617 1146 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 239 96 Op 4 . + CDS 257653 - 258399 206 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 + Term 258404 - 258442 5.0 + Prom 258447 - 258506 8.2 240 97 Op 1 6/0.023 + CDS 258545 - 259705 755 ## COG2199 FOG: GGDEF domain 241 97 Op 2 2/0.093 + CDS 259684 - 260361 714 ## COG2200 FOG: EAL domain 242 97 Op 3 . + CDS 260366 - 261898 1569 ## COG2200 FOG: EAL domain + Term 261906 - 261950 3.7 + Prom 261969 - 262028 8.4 243 98 Tu 1 . + CDS 262157 - 262909 430 ## COG1145 Ferredoxin 244 99 Tu 1 . - CDS 263151 - 264173 1391 ## COG3641 Predicted membrane protein, putative toxin regulator - Prom 264241 - 264300 7.2 245 100 Tu 1 . + CDS 264124 - 264267 95 ## + Prom 264295 - 264354 6.6 246 101 Op 1 . + CDS 264381 - 264923 169 ## COG0671 Membrane-associated phospholipid phosphatase 247 101 Op 2 . + CDS 264988 - 266013 1258 ## COG0491 Zn-dependent hydrolases, including glyoxylases + Term 266026 - 266069 1.6 248 102 Tu 1 . + CDS 266441 - 267013 684 ## EUBREC_0096 hypothetical protein + Term 267097 - 267137 6.5 + Prom 267259 - 267318 5.3 249 103 Op 1 . + CDS 267360 - 268628 1519 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 250 103 Op 2 . + CDS 268675 - 269526 869 ## COG1082 Sugar phosphate isomerases/epimerases + Term 269531 - 269581 9.2 - Term 269525 - 269564 4.2 251 104 Tu 1 . 
- CDS 269614 - 270993 1566 ## COG0534 Na+-driven multidrug efflux pump - Prom 271208 - 271267 8.9 + Prom 271268 - 271327 12.5 252 105 Op 1 5/0.023 + CDS 271382 - 272377 1047 ## COG1397 ADP-ribosylglycohydrolase 253 105 Op 2 . + CDS 272374 - 273327 1018 ## COG0524 Sugar kinases, ribokinase family 254 105 Op 3 . + CDS 273333 - 274019 794 ## COG3201 Nicotinamide mononucleotide transporter + Prom 274022 - 274081 4.3 255 105 Op 4 . + CDS 274105 - 274830 781 ## COG2188 Transcriptional regulators + Term 274872 - 274910 8.2 256 106 Tu 1 . - CDS 274883 - 276385 1308 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid - Prom 276455 - 276514 1.7 257 107 Op 1 . - CDS 276516 - 277880 1307 ## COG0534 Na+-driven multidrug efflux pump 258 107 Op 2 . - CDS 277877 - 278692 425 ## Asuc_1312 MerR family transcriptional regulator - Prom 278789 - 278848 11.9 + Prom 278728 - 278787 8.1 259 108 Op 1 . + CDS 278845 - 280071 948 ## OB0304 hypothetical protein 260 108 Op 2 . + CDS 280116 - 280919 692 ## COG2362 D-aminopeptidase + Term 281132 - 281171 2.3 - Term 280952 - 281009 2.0 261 109 Tu 1 . - CDS 281105 - 281470 367 ## COG1733 Predicted transcriptional regulators - Prom 281503 - 281562 6.4 - Term 281672 - 281713 2.3 262 110 Op 1 1/0.163 - CDS 281760 - 282578 711 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 263 110 Op 2 . - CDS 282597 - 283802 997 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family - Prom 283862 - 283921 5.0 + Prom 283813 - 283872 4.5 264 111 Tu 1 . 
+ CDS 283990 - 284358 213 ## COG3250 Beta-galactosidase/beta-glucuronidase 265 112 Op 1 7/0.023 + CDS 284620 - 286425 2140 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 266 112 Op 2 2/0.093 + CDS 286448 - 288094 1845 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 288161 - 288193 -0.8 + Prom 288141 - 288200 4.0 267 113 Op 1 35/0.000 + CDS 288236 - 289600 1667 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 289651 - 289702 2.1 + Prom 289757 - 289816 5.6 268 113 Op 2 38/0.000 + CDS 289850 - 290617 943 ## COG1175 ABC-type sugar transport systems, permease components 269 113 Op 3 . + CDS 290630 - 291457 982 ## COG0395 ABC-type sugar transport system, permease component + Prom 291485 - 291544 1.6 270 114 Op 1 . + CDS 291571 - 293658 1430 ## Closa_4172 Heparinase II/III family protein 271 114 Op 2 . + CDS 293739 - 294917 991 ## Cphy_0934 glycosy hydrolase family protein + Prom 295017 - 295076 7.9 272 115 Op 1 20/0.000 + CDS 295186 - 296379 1365 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 273 115 Op 2 1/0.163 + CDS 296437 - 296874 585 ## COG0822 NifU homolog involved in Fe-S cluster formation 274 115 Op 3 . + CDS 296895 - 297950 959 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain 275 115 Op 4 . + CDS 297947 - 299308 1053 ## EUBREC_0966 N-acetylmuramoyl-L-alanine amidase domain protein + Prom 299326 - 299385 4.9 276 116 Tu 1 . + CDS 299421 - 300224 1105 ## Closa_0058 hypothetical protein + Term 300287 - 300350 3.2 + TRNA 300345 - 300433 51.8 # Ser CGA 0 0 - Term 300473 - 300537 10.9 277 117 Op 1 . - CDS 300623 - 300802 87 ## gi|160940318|ref|ZP_02087663.1| hypothetical protein CLOBOL_05208 278 117 Op 2 . - CDS 300809 - 300991 237 ## Closa_1280 hypothetical protein 279 117 Op 3 . 
- CDS 301073 - 301921 1152 ## Closa_2348 hypothetical protein 280 117 Op 4 . - CDS 301914 - 302465 538 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 302534 - 302593 7.1 + Prom 302512 - 302571 4.7 281 118 Tu 1 . + CDS 302656 - 303645 819 ## COG5632 N-acetylmuramoyl-L-alanine amidase + Term 303688 - 303732 9.1 - Term 303676 - 303720 5.3 282 119 Tu 1 . - CDS 303749 - 304558 1023 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family - Prom 304721 - 304780 9.2 + Prom 304578 - 304637 7.8 283 120 Op 1 26/0.000 + CDS 304836 - 305867 1005 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase + Term 305908 - 305944 6.1 284 120 Op 2 13/0.000 + CDS 305977 - 307194 1568 ## COG0126 3-phosphoglycerate kinase 285 120 Op 3 8/0.023 + CDS 307245 - 307997 1062 ## COG0149 Triosephosphate isomerase 286 120 Op 4 . + CDS 308096 - 309637 1821 ## COG0696 Phosphoglyceromutase + Term 309670 - 309708 7.0 - Term 309657 - 309695 4.0 287 121 Tu 1 . - CDS 309745 - 311247 1210 ## COG0475 Kef-type K+ transport systems, membrane components - Prom 311307 - 311366 9.8 + Prom 311258 - 311317 6.5 288 122 Tu 1 . + CDS 311352 - 312263 1025 ## COG0583 Transcriptional regulator + Term 312313 - 312346 2.0 + Prom 312401 - 312460 6.6 289 123 Op 1 . + CDS 312489 - 314264 226 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 290 123 Op 2 . + CDS 314280 - 315149 960 ## COG1307 Uncharacterized protein conserved in bacteria 291 123 Op 3 . + CDS 315150 - 316619 1747 ## COG0642 Signal transduction histidine kinase 292 124 Op 1 . + CDS 316730 - 317632 1149 ## Closa_0837 Lipoprotein LpqB, GerMN domain protein 293 124 Op 2 . + CDS 317655 - 320207 1400 ## COG0658 Predicted membrane metal-binding protein 294 124 Op 3 8/0.023 + CDS 320281 - 321717 1082 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 295 125 Tu 1 . 
+ CDS 321833 - 323158 844 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta-galactosidase - Term 322920 - 322967 -0.2 296 126 Op 1 . - CDS 323187 - 323645 408 ## COG0394 Protein-tyrosine-phosphatase 297 126 Op 2 . - CDS 323648 - 326542 764 ## COG5635 Predicted NTPase (NACHT family) - Prom 326570 - 326629 6.0 298 127 Tu 1 . + CDS 326897 - 326974 77 ## 299 128 Op 1 . - CDS 327404 - 327535 62 ## 300 128 Op 2 . - CDS 327628 - 328656 896 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 - Prom 328901 - 328960 6.4 + Prom 328941 - 329000 2.9 301 129 Tu 1 . + CDS 329046 - 329123 126 ## + Term 329314 - 329365 -0.8 + Prom 329362 - 329421 7.7 302 130 Tu 1 . + CDS 329540 - 329617 124 ## - Term 329794 - 329827 -0.4 303 131 Op 1 . - CDS 329842 - 330870 779 ## EUBREC_0167 hypothetical protein 304 131 Op 2 . - CDS 330890 - 331528 375 ## EUBREC_0166 hypothetical protein - Prom 331596 - 331655 4.0 + Prom 331690 - 331749 5.0 305 132 Op 1 . + CDS 331794 - 332774 1029 ## COG1466 DNA polymerase III, delta subunit 306 132 Op 2 . + CDS 332795 - 333367 844 ## COG4475 Uncharacterized protein conserved in bacteria 307 132 Op 3 . + CDS 333425 - 334009 659 ## Closa_0782 hypothetical protein + Term 334089 - 334123 -0.1 308 133 Op 1 . - CDS 334132 - 334275 118 ## gi|160940352|ref|ZP_02087697.1| hypothetical protein CLOBOL_05242 309 133 Op 2 . - CDS 334299 - 334448 232 ## gi|160940354|ref|ZP_02087699.1| hypothetical protein CLOBOL_05244 - Prom 334479 - 334538 2.9 - Term 334527 - 334565 3.5 310 134 Tu 1 . - CDS 334606 - 334869 279 ## PROTEIN SUPPORTED gi|160880450|ref|YP_001559418.1| ribosomal protein S20 - Prom 334951 - 335010 7.6 + Prom 334910 - 334969 4.7 311 135 Op 1 . + CDS 335154 - 336164 985 ## Closa_0874 spore protease (EC:3.4.24.78) 312 135 Op 2 . + CDS 336238 - 337605 1141 ## Closa_0875 Stage II sporulation P family protein 313 135 Op 3 4/0.047 + CDS 337679 - 339493 2110 ## COG0481 Membrane GTPase LepA 314 135 Op 4 .
+ CDS 339515 - 340792 1193 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 315 135 Op 5 . + CDS 340865 - 341602 580 ## COG4912 Predicted DNA alkylation repair enzyme + Prom 341616 - 341675 2.5 316 136 Op 1 3/0.047 + CDS 341720 - 342286 668 ## COG0450 Peroxiredoxin 317 136 Op 2 . + CDS 342291 - 342956 625 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Prom 342968 - 343027 5.4 318 137 Op 1 2/0.093 + CDS 343213 - 343782 571 ## COG1954 Glycerol-3-phosphate responsive antiterminator (mRNA-binding) + Term 343794 - 343840 4.9 319 137 Op 2 18/0.000 + CDS 343882 - 345378 1565 ## COG0554 Glycerol kinase 320 137 Op 3 . + CDS 345483 - 346196 778 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) + Term 346234 - 346290 14.5 + Prom 346296 - 346355 4.0 321 138 Op 1 6/0.023 + CDS 346436 - 347875 1756 ## COG0579 Predicted dehydrogenase 322 138 Op 2 4/0.047 + CDS 347893 - 349158 1387 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 323 138 Op 3 . + CDS 349162 - 349521 518 ## COG3862 Uncharacterized protein with conserved CXXC pairs + Term 349584 - 349638 9.8 324 139 Tu 1 . + CDS 349688 - 349879 351 ## gi|160940372|ref|ZP_02087717.1| hypothetical protein CLOBOL_05262 + Term 349995 - 350027 0.4 325 140 Op 1 10/0.000 + CDS 350044 - 351030 1479 ## COG2376 Dihydroxyacetone kinase 326 140 Op 2 9/0.000 + CDS 351081 - 351704 820 ## COG2376 Dihydroxyacetone kinase 327 140 Op 3 . + CDS 351785 - 352165 634 ## COG3412 Uncharacterized protein conserved in bacteria + Term 352204 - 352241 8.5 328 141 Op 1 . - CDS 352216 - 352566 379 ## COG5341 Uncharacterized protein conserved in bacteria 329 141 Op 2 . - CDS 352563 - 353060 414 ## Cphy_3035 hypothetical protein + Prom 353242 - 353301 6.2 330 142 Tu 1 . + CDS 353360 - 353950 602 ## COG4769 Predicted membrane protein + Prom 353958 - 354017 5.6 331 143 Tu 1 . 
+ CDS 354072 - 355592 1728 ## COG4468 Galactose-1-phosphate uridyltransferase + Term 355649 - 355701 -0.1 + Prom 355756 - 355815 5.8 332 144 Op 1 7/0.023 + CDS 355973 - 356422 547 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 333 144 Op 2 7/0.023 + CDS 356427 - 357491 1468 ## COG1299 Phosphotransferase system, fructose-specific IIC component 334 144 Op 3 2/0.093 + CDS 357546 - 357854 493 ## COG1445 Phosphotransferase system fructose-specific component IIB + Term 357936 - 357968 2.3 + Prom 357874 - 357933 2.9 335 145 Tu 1 . + CDS 357989 - 360025 2063 ## COG3711 Transcriptional antiterminator + Prom 360032 - 360091 4.4 336 146 Tu 1 . + CDS 360135 - 360476 369 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain + Term 360620 - 360682 11.5 337 147 Op 1 . + CDS 360955 - 361626 633 ## COG2364 Predicted membrane protein 338 147 Op 2 . + CDS 361623 - 362972 1555 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 339 147 Op 3 . + CDS 363062 - 363706 635 ## COG0655 Multimeric flavodoxin WrbA 340 147 Op 4 6/0.023 + CDS 363693 - 365027 1473 ## COG0043 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases 341 147 Op 5 . + CDS 365056 - 365676 403 ## COG0163 3-polyprenyl-4-hydroxybenzoate decarboxylase + Term 365702 - 365737 0.2 - Term 365428 - 365474 6.2 342 148 Tu 1 . - CDS 365693 - 367051 356 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - Prom 367140 - 367199 9.0 343 149 Op 1 11/0.000 + CDS 367532 - 368008 471 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 344 149 Op 2 . + CDS 368012 - 370270 2572 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs + Term 370341 - 370384 6.4 345 150 Tu 1 . 
- CDS 370393 - 371292 780 ## COG0583 Transcriptional regulator - Prom 371381 - 371440 6.2 + Prom 371390 - 371449 7.1 346 151 Op 1 5/0.023 + CDS 371479 - 372657 1251 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 347 151 Op 2 1/0.163 + CDS 372685 - 373968 1127 ## COG0477 Permeases of the major facilitator superfamily + Prom 374359 - 374418 2.4 348 152 Op 1 . + CDS 374445 - 375692 1517 ## COG0112 Glycine/serine hydroxymethyltransferase 349 152 Op 2 18/0.000 + CDS 375778 - 376866 1319 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) 350 152 Op 3 16/0.000 + CDS 376908 - 377285 660 ## COG0509 Glycine cleavage system H protein (lipoate-binding) 351 152 Op 4 12/0.000 + CDS 377310 - 378629 1441 ## COG0403 Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain 352 152 Op 5 . + CDS 378626 - 380023 1460 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain + Term 380057 - 380108 0.2 353 152 Op 6 . + CDS 380122 - 381558 792 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 354 152 Op 7 9/0.000 + CDS 381592 - 382260 733 ## COG1760 L-serine deaminase 355 152 Op 8 . + CDS 382273 - 383148 910 ## COG1760 L-serine deaminase 356 152 Op 9 2/0.093 + CDS 383178 - 384560 1058 ## COG0285 Folylpolyglutamate synthase 357 152 Op 10 4/0.047 + CDS 384544 - 385101 454 ## COG0302 GTP cyclohydrolase I 358 152 Op 11 5/0.023 + CDS 385106 - 385924 612 ## COG0294 Dihydropteroate synthase and related enzymes 359 152 Op 12 . + CDS 385993 - 386805 505 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 360 153 Tu 1 . - CDS 386856 - 387377 606 ## COG4720 Predicted membrane protein - Prom 387413 - 387472 4.7 361 154 Op 1 . - CDS 387521 - 388153 629 ## COG0546 Predicted phosphatases 362 154 Op 2 . 
- CDS 388153 - 389154 892 ## COG1940 Transcriptional regulator/sugar kinase - Prom 389195 - 389254 3.0 + Prom 389148 - 389207 7.6 363 155 Op 1 1/0.163 + CDS 389349 - 390293 878 ## COG0679 Predicted permeases 364 155 Op 2 . + CDS 390324 - 391331 795 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 391352 - 391390 6.6 + Prom 391533 - 391592 8.4 365 156 Op 1 . + CDS 391742 - 392950 1299 ## CD3114 hypothetical protein 366 156 Op 2 3/0.047 + CDS 392971 - 394353 1583 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 367 156 Op 3 . + CDS 394368 - 395549 1209 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase + Term 395595 - 395642 1.2 - Term 395582 - 395629 1.2 368 157 Op 1 . - CDS 395778 - 396788 963 ## CD3111 hypothetical protein 369 157 Op 2 . - CDS 396834 - 397847 1098 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D - Prom 397957 - 398016 9.1 + Prom 397976 - 398035 8.2 370 158 Op 1 26/0.000 + CDS 398082 - 398534 539 ## COG1585 Membrane protein implicated in regulation of membrane protease activity 371 158 Op 2 . + CDS 398556 - 399506 1237 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs + Term 399560 - 399624 10.6 + Prom 399509 - 399568 2.3 372 159 Op 1 . + CDS 399655 - 401049 1440 ## COG0534 Na+-driven multidrug efflux pump 373 159 Op 2 . + CDS 401091 - 401273 299 ## gi|160940433|ref|ZP_02087778.1| hypothetical protein CLOBOL_05323 + Term 401284 - 401322 1.0 374 160 Op 1 . + CDS 401382 - 401456 67 ## 375 160 Op 2 . + CDS 401475 - 402308 957 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding 376 160 Op 3 . + CDS 402340 - 403008 805 ## COG0546 Predicted phosphatases + Prom 403018 - 403077 5.4 377 161 Op 1 15/0.000 + CDS 403134 - 403637 697 ## COG0440 Acetolactate synthase, small (regulatory) subunit + Prom 403663 - 403722 4.1 378 161 Op 2 . 
+ CDS 403750 - 404760 1165 ## COG0059 Ketol-acid reductoisomerase + Term 404822 - 404877 18.3 - Term 404813 - 404862 15.3 379 162 Tu 1 . - CDS 404932 - 405855 1199 ## COG0583 Transcriptional regulator - Prom 405880 - 405939 7.6 + Prom 405825 - 405884 9.5 380 163 Op 1 30/0.000 + CDS 405963 - 407225 1601 ## COG0065 3-isopropylmalate dehydratase large subunit 381 163 Op 2 . + CDS 407270 - 407755 553 ## COG0066 3-isopropylmalate dehydratase small subunit + Term 407854 - 407891 1.0 382 164 Tu 1 . - CDS 407961 - 408833 1015 ## COG0583 Transcriptional regulator - Prom 408872 - 408931 7.4 + Prom 408846 - 408905 9.3 383 165 Op 1 . + CDS 409007 - 410284 1550 ## COG3681 Uncharacterized conserved protein 384 165 Op 2 . + CDS 410287 - 411480 1275 ## COG1301 Na+/H+-dicarboxylate symporters + Term 411487 - 411536 -0.9 + Prom 411483 - 411542 4.9 385 166 Op 1 . + CDS 411661 - 413091 1629 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 386 166 Op 2 . + CDS 413129 - 413884 849 ## COG0789 Predicted transcriptional regulators 387 166 Op 3 15/0.000 + CDS 413957 - 415075 1529 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 388 166 Op 4 24/0.000 + CDS 415166 - 416695 1654 ## COG3845 ABC-type uncharacterized transport systems, ATPase components 389 166 Op 5 26/0.000 + CDS 416688 - 417716 1191 ## COG4603 ABC-type uncharacterized transport system, permease component 390 166 Op 6 . + CDS 417710 - 418651 1143 ## COG1079 Uncharacterized ABC-type transport system, permease component + Term 418652 - 418684 -1.0 + Prom 418669 - 418728 6.1 391 167 Op 1 . + CDS 418954 - 419205 283 ## EUBREC_0233 hypothetical protein 392 167 Op 2 . + CDS 419296 - 420717 1923 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 393 168 Tu 1 . + CDS 420931 - 422166 1158 ## COG1160 Predicted GTPases + Term 422275 - 422332 4.1 + Prom 422384 - 422443 13.3 394 169 Tu 1 . 
+ CDS 422475 - 423632 1259 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase + Term 423717 - 423760 8.0 + Prom 423674 - 423733 5.6 395 170 Tu 1 . + CDS 423920 - 425932 1786 ## COG0840 Methyl-accepting chemotaxis protein + Term 426043 - 426074 3.4 + Prom 425957 - 426016 9.4 396 171 Op 1 32/0.000 + CDS 426098 - 426565 532 ## COG0779 Uncharacterized protein conserved in bacteria 397 171 Op 2 22/0.000 + CDS 426607 - 427773 1580 ## COG0195 Transcription elongation factor 398 171 Op 3 8/0.023 + CDS 427808 - 428080 252 ## COG2740 Predicted nucleic-acid-binding protein implicated in transcription termination 399 171 Op 4 10/0.000 + CDS 428070 - 428405 431 ## PROTEIN SUPPORTED gi|239626258|ref|ZP_04669289.1| ribosomal protein L7Ae/L30e/S12e/Gadd45 400 171 Op 5 . + CDS 428342 - 428386 61 ## PROTEIN SUPPORTED gi|239626258|ref|ZP_04669289.1| ribosomal protein L7Ae/L30e/S12e/Gadd45 401 171 Op 6 32/0.000 + CDS 428392 - 431700 3502 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 402 171 Op 7 4/0.047 + CDS 431726 - 432127 516 ## COG0858 Ribosome-binding factor A 403 171 Op 8 1/0.163 + CDS 432127 - 433095 222 ## PROTEIN SUPPORTED gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B 404 171 Op 9 12/0.000 + CDS 433106 - 434032 755 ## COG0130 Pseudouridine synthase 405 171 Op 10 9/0.000 + CDS 434092 - 435042 345 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 + Prom 435073 - 435132 7.8 406 171 Op 11 26/0.000 + CDS 435222 - 435488 416 ## PROTEIN SUPPORTED gi|227872009|ref|ZP_03990394.1| ribosomal protein S15 + Term 435499 - 435536 7.1 + Prom 435648 - 435707 7.3 407 172 Op 1 . + CDS 435727 - 437850 1393 ## PROTEIN SUPPORTED gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase 408 172 Op 2 . + CDS 437891 - 438595 1041 ## gi|160940475|ref|ZP_02087820.1| hypothetical protein CLOBOL_05365 + Term 438607 - 438650 9.2 + Prom 438608 - 438667 5.4 409 173 Op 1 . 
+ CDS 438705 - 440156 1495 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 410 173 Op 2 . + CDS 440168 - 441994 1878 ## COG0210 Superfamily I DNA and RNA helicases 411 173 Op 3 . + CDS 441991 - 442533 494 ## COG2109 ATP:corrinoid adenosyltransferase + Prom 442548 - 442607 3.1 412 173 Op 4 . + CDS 442627 - 442950 620 ## Closa_1312 hypothetical protein + Term 442973 - 443018 8.1 + Prom 443080 - 443139 7.2 413 174 Op 1 40/0.000 + CDS 443186 - 443866 945 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 414 174 Op 2 . + CDS 443863 - 445386 1854 ## COG0642 Signal transduction histidine kinase + Term 445391 - 445428 2.1 + Prom 445454 - 445513 5.4 415 175 Op 1 . + CDS 445554 - 446867 1252 ## SpiBuddy_2811 hypothetical protein + Term 446903 - 446935 3.4 416 175 Op 2 21/0.000 + CDS 446950 - 448536 193 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 417 175 Op 3 11/0.000 + CDS 448550 - 449629 863 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 418 175 Op 4 . + CDS 449626 - 450753 1141 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 419 175 Op 5 . + CDS 450750 - 451238 390 ## Spico_1669 hypothetical protein + Term 451240 - 451275 0.1 + SSU_RRNA 451742 - 453236 99.0 # EF402206 [D:1..1495] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. + 5S_RRNA 453335 - 453451 94.0 # CP000885 [D:447851..447967] # 5S ribosomal RNA # Clostridium phytofermentans ISDg # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. + TRNA 453457 - 453529 80.2 # Ala TGC 0 0 + Prom 453457 - 453516 76.8 420 176 Tu 1 . + CDS 453560 - 453730 175 ## gi|160940488|ref|ZP_02087833.1| hypothetical protein CLOBOL_05381 + 5S_RRNA 456277 - 456682 86.0 # AM773717 [D:3446..3948] # 5S ribosomal RNA # Leuconostoc gasicomitatum # Bacteria; Firmicutes; Lactobacillales; Leuconostoc. 
+ TRNA 456774 - 456845 75.4 # Gly GCC 0 0 + Prom 456805 - 456864 47.6 421 177 Op 1 . + CDS 456948 - 458138 797 ## COG0303 Molybdopterin biosynthesis enzyme + Term 458161 - 458204 13.1 + Prom 458319 - 458378 5.8 422 177 Op 2 . + CDS 458457 - 459209 647 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump 423 177 Op 3 . + CDS 459292 - 460473 1252 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 424 177 Op 4 . + CDS 460474 - 460837 167 ## Closa_0784 VanZ family protein Predicted protein(s) >gi|157101630|gb|DS480694.1| GENE 1 85 - 624 458 179 aa, chain + ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 179 1 179 179 257 70.0 6e-69 MSKHVLVISASPRKGGNSDTLCDEFIRGVQESGNQAEKIFLASRKIGYCIGCGVCNTTHK CVQKDEMAEILDKMVDADVIVLATPVYFYTMDAQMKTLIDRTVPRYTEIQNKDFYFIVAA ADTEQKMMDRTIEGFRGFTQDCLTGAREKGIIYGTGAWQAGEIKGTPAMKQAYEMGRNV >gi|157101630|gb|DS480694.1| GENE 2 637 - 789 70 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160939991|ref|ZP_02087336.1| ## NR: gi|160939991|ref|ZP_02087336.1| hypothetical protein CLOBOL_04880 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04880 [Clostridium bolteae ATCC BAA-613] # 1 50 1 50 50 99 100.0 1e-19 MHDGPEGVSCIFFWIAAIAGNECKKGLTQECVDVNAAGELIVNDFSFRVV >gi|157101630|gb|DS480694.1| GENE 3 879 - 1103 166 74 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_1921 NR:ns ## KEGG: Pjdr2_1921 # Name: not_defined # Def: transcriptional regulator, XRE family # Organism: Paenibacillus # Pathway: not_defined # 6 70 2 66 231 72 53.0 5e-12 MEFSGDIRIKVGNRIRQLRKELLLSQESLAFKAGLDRTYIASVENGKRNLSIMSLEKIIV ALDCSMAEFFETFE >gi|157101630|gb|DS480694.1| GENE 4 1165 - 1566 274 133 aa, chain + ## HITS:1 COG:Cgl1127 KEGG:ns NR:ns ## COG: Cgl1127 COG0494 # Protein_GI_number: 19552377 # Func_class: L Replication, recombination and repair; R General function prediction only # 
Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Corynebacterium glutamicum # 2 122 3 124 131 104 45.0 5e-23 MKKIEVAAAVLHKDGTFLGTQRGYGEFEGGWEFPGGKIEEGESPQAALLRELKEELGIDA IVEQFLMTVECNYPQFHLMMHCYLCSIAEGKIQLKEHKSARWMNREQFDDVEWLPADLDV VKRIRDMDIDGLA >gi|157101630|gb|DS480694.1| GENE 5 1601 - 3547 1516 648 aa, chain + ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 42 640 46 625 627 355 36.0 1e-97 MSQIWYAAFSEDFVEQAQKIFKKLDKDVIVTVWDPELVPEMLRRGVSVILGRGATALRIR KVVDLPVVEIPIPFEDMADTLIEASSYGRNIGVIGYNNLLSGLERLNPILNVSIRQIFAV DEDDTYHQIQKLKNEGVDVIVGGLIQTRYARELGLPAVRIELTDKSLSYACQEAEKLIAT VKAATRKAEELKTILNTTNEKYVAVDIKGNITWMNRVAKPYLPNPGGLVYDTPITEVLPA FEAVHDVLATGEEIIQETGSINGADILYDMIPLTYKNGEILGAVITFNDAGTITRGEHKI RDKGIKGFQATYSFKDICGSSQQMNQCIQHAKRYAHTDLTVLLLGETGSGKEMFAQSMHN ASSRRNGPFVAVNCAALPEGILESELFGYDDGAFTGARRSGKMGLFELAHNGTIFLDEIG EMPMSLQSRLLRVLQERKVMRLGGDRVFPVNIRIFAATNKNLMELVGEHKFREDLFYRLN VLTLKIPPLRERVEDIPDLANLFLRESGGKCCLTPAAEKVLTSYGWPGNARQLRHFMEKV RIICDSSVISGEAAGYVIQNYEPPCEMEQRGNSSGFPGSNGIGNYKATKGRVNRSQEEPG QENISREITEECLAQAMEQAGGNKTKAARILGIHRSTIWRYIKKFGME >gi|157101630|gb|DS480694.1| GENE 6 4394 - 5431 1076 345 aa, chain + ## HITS:1 COG:TM1744 KEGG:ns NR:ns ## COG: TM1744 COG2008 # Protein_GI_number: 15644490 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Thermotoga maritima # 1 341 1 338 343 254 42.0 2e-67 MVDLRSDTISMPTREMLETILEAKLGDDGRTDSKGRGEDLAVNELEDLAAEMTGKEAAVL FPSGTQGNTSAILAYCRPGQKVMVDEMQHIYLSEKVVFDPSIGQLEPVTYKLDQDNLPDL DDMRRILESQEIALLCIENTHNFSGGTCVPLERMKAIHELAKEFGVPVHMDGARMFNAVV ALGVEAREMCKYVDSVMFCISKGLGAPIGSLVCCSEAFSMKIRDKRKLLGGAMRQAGVIA APGMYALTHNIERLAEDNANAAYAAEQLKDLKNTKVYGKVMSNIIVLDANGLGLTPGEYC GKAAEKGLLIKPVLKDKVRLVFYRDISREDTEKAVAIIRELDSLR 
>gi|157101630|gb|DS480694.1| GENE 7 5447 - 6679 1631 410 aa, chain + ## HITS:1 COG:VCA0088 KEGG:ns NR:ns ## COG: VCA0088 COG1301 # Protein_GI_number: 15600859 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Vibrio cholerae # 6 401 8 411 424 195 31.0 2e-49 MKIAGKKIGLSTQIFAAMLLGAVLGIVIGEPMTKVGFIGTIWLNCIKMIVVPMVLVTIVT GVVSQKDMKTLGRISFRIMAYYVITTLVACAIGILVTSIVKPGTIANFTGLASKEVTGSA DITVADFFTGMFSSNIIQTFADGNILQTVVIAILLGVAILRMKNPEHKEKAIKGMNVLND MVFSLINMIMLVSPIGVFFLMGDSFGKYGAGIFTSMATLVGTYYLACLVHVVLVYCTVIW VTAGINPLKFLKDSAELWVYTISTCSSVAAIPINIKVAKEKFNVPETISGFTVPLGSQMN YDGSVILYGCVILFISQTIGVPVTLGTMVKIILMSAILSTGGGGIPGSGIVKLLVMVTAF GLPTEIVGIIAAFYRLFDMGTTTNNCLGDLAGTVFVSKLEDKAAARAKSA >gi|157101630|gb|DS480694.1| GENE 8 6833 - 7513 523 226 aa, chain - ## HITS:1 COG:lin0974 KEGG:ns NR:ns ## COG: lin0974 COG0120 # Protein_GI_number: 16800043 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase # Organism: Listeria innocua # 7 222 4 220 224 158 42.0 8e-39 MYSKEEQKKAAAWQAAMEVEDNMVLGLGTGSTVYYLMEKLSERIRGGLYVHGAATSLETE GLARRLGIPLISLEDARTIHLAIDGVDAIDPDFYSIKGGGGALFREKIIACKARRVLWIM DQSKLVHSLSGCVLPVEVPAFAVPYVEEQVREAGFIPKLRIKNGTVFVTDNGSHILDLTG GVEMDYPMAAVRLKSMTGVLETGLFGDICEKIIVGTENGTQERFHN >gi|157101630|gb|DS480694.1| GENE 9 7589 - 9001 1153 470 aa, chain - ## HITS:1 COG:BS_yqjJ KEGG:ns NR:ns ## COG: BS_yqjJ COG0364 # Protein_GI_number: 16079442 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate 1-dehydrogenase # Organism: Bacillus subtilis # 13 469 11 486 489 347 37.0 3e-95 MNGGGLMTISTNLTIFGGTGDLTFRKLLPALYTMDITGKLPEDSRITIIGRRDYDSSHYR QIARDWVEKFTRLSFEEKAFNRFSEKIDYYRMDFSRLEAYHGLNHYYRENDIHSHIFYFA VAPRFFSIITEGLKLVHGSSSGKVILEKPFGENLAAAAALNKELEAFFGPEHIYRIDHYL GKEMVRNIQAIRFSNPIFTDVWNSRYIESVQISALEDMGVETRGGYYDASGALKDMVQNH LFQILSIMAMEQPEEFSPKGLHDAQLDVFRCLQPADESSIESTLVLGQYEGYRGEPLVKP DSQTETYGALRLFIDNQRWKGTPFYIRTGKKTGKREIELAVIFRRPYQEVEPNILIIKIQ 
PTEGVYLQFNIKRPGDSEDIIPAKMDFCQNCSLIHQLNTPEAYERLITACMAGERSWFSQ WDQIEFSWNYISHLKELHAAKRLPVYPYRPGTRGPAEADRMLEQYGHSWL >gi|157101630|gb|DS480694.1| GENE 10 8967 - 10391 1291 474 aa, chain - ## HITS:1 COG:L0046 KEGG:ns NR:ns ## COG: L0046 COG0362 # Protein_GI_number: 15672604 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconate dehydrogenase # Organism: Lactococcus lactis # 1 466 1 468 472 501 50.0 1e-141 MIKLDIGIVGLSVMGRSLALNMADHGFKVGGYNRSAAVTEQVMRDHPHENLIPFYDLKDM TDALARPRKVMLMIQAGKPVDAVIEQLVPLLEKGDMILDGGNSFFEDTRRRAALLAEKGI HYLGVGISGGEEGARFGPSVMPGGNADAYESVRPILEAIAARAMDEPCCAYIGPDGAGHY VKMVHNGIEYADMQLIAESYLLLKYVGGFNNQELAHIFKSWNDGELHSYLIGITAGIFRE ADDLGDGELIDRIKDSAKQKGTGRWASIEALKQGVDISMITSACTARIMSNHLDEREKAW QLIHSPAVGLSPDKESFAAMVREALYTGKIIAYAQGFSLMQDASRLYGWNLDLGKIASIF RAGCIIQAVFLNDITHAFEAEPKPENLIFDGFFLSRINEHQDSLRQAVSTGVMNGLPIPA LCNAVSYLDEFRAKAAGANLIQAQRDCFGAHTYERTDREGVFHHEWRWANDNQH >gi|157101630|gb|DS480694.1| GENE 11 10420 - 11514 571 364 aa, chain - ## HITS:1 COG:SP1506 KEGG:ns NR:ns ## COG: SP1506 COG2706 # Protein_GI_number: 15901353 # Func_class: G Carbohydrate transport and metabolism # Function: 3-carboxymuconate cyclase # Organism: Streptococcus pneumoniae TIGR4 # 6 357 4 333 337 117 27.0 5e-26 MNNSLTGFLGTYASPESLGIYRFTIDLRNGAMSYPELYYEAPDCKYLSLRDSMLASPLKR EGRSGICLLDTAHNKEEALAQPAAECFGESSPACYAAQDDRYLYTANYHEGTILIYEITY EEGSPGRKKRPSLPHLELAKRIPIAPKAGCHQILFHSHYMMVPCLLLDKMMVFDCQKDFT LVSELAFDKGTGPRHGIFDRDHKQFFLVSELSNQVFVYSLTEDGAAGTDSSHNSISLPDF SMKLLQICPILPLDAVYEEPPASAAIRLSPDQRFLYVSTRFAEVITVFKISHGRLEQIQQ TGCGGIHPRDMVLTPDGRYLLVANRTQGGLVSFQLNPETGELLDICSRVPAPEAVSIVLS QHTI >gi|157101630|gb|DS480694.1| GENE 12 11823 - 13310 1530 495 aa, chain - ## HITS:1 COG:FN0454 KEGG:ns NR:ns ## COG: FN0454 COG1012 # Protein_GI_number: 19703789 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Fusobacterium nucleatum # 6 492 5 491 491 659 63.0 0 MNKPELQNRYQLFIGGQWRDASDGEFFTTKCPANGEKLAECAQATKEDVDDAVREAWKAF 
ETWKKVPTSERAAILNKIADIIDANTEHLAMVESLDNGKPIRETMAIDIPLSAKHFRYFA GCIMAEEGSANILDEQFLSLILREPIGVVGQIVPWNFPFLMAAWKLAPVLAAGCCTVFKP SSDTSLSVLEFARLVQDVIPKGVFNVITGSGSKSGQYMLDHKGFRKLAFTGSTEVGRQVA LAAADRLIPATLELGGKSANIFFPDCNWEQAIDGLQLGILFNQGQVCCAGSRVFVHEDIY DKFLEDAVKAFNNVKVGVSWDPETQMGSQINERQLEKILSYVEIGKQEGARLICGGERIT DGELAKGCFMRPTLLADVTNDMRVAQEEIFGPVACILKFRDEDEVIRMANDNAYGLGGAV WTRDLNRAIRVSRGIETGRMWVNTYNQIPEGSPFGGYKESGIGRETHKVILEHYTQMKNI MINLSEAPSGFYPAK >gi|157101630|gb|DS480694.1| GENE 13 13540 - 13719 181 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940005|ref|ZP_02087350.1| ## NR: gi|160940005|ref|ZP_02087350.1| hypothetical protein CLOBOL_04894 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04894 [Clostridium bolteae ATCC BAA-613] # 1 59 1 59 59 97 100.0 2e-19 MEQKSPDKVCYQFICPRYPKCARARGRGCCIEYPEDEAQMVQEGDCRAEDGFPLFVEEQ >gi|157101630|gb|DS480694.1| GENE 14 13860 - 16037 1357 725 aa, chain + ## HITS:1 COG:BH0336 KEGG:ns NR:ns ## COG: BH0336 COG1203 # Protein_GI_number: 15612899 # Func_class: R General function prediction only # Function: Predicted helicases # Organism: Bacillus halodurans # 3 713 2 787 800 311 31.0 2e-84 MQYYAHISEDKTRKQTVKDHLTGTARRSELFAEAFRCGAWGYGCGVLHDIGKYSQGFRKR LEGGSITDHATAGAQEMMRLYAGSNPLAAYCAAYCISGHHSGLLDGGRTGDTAGEATLQG RLKKQLEDYSPYKNEVEMPDFPGLPLRQIGKGGFGLSFFIRMLFSCLVDGDYLDTEQFML GQDAGRGDYDCMDVLLRRLESHVESWLSNDDLTSVNGRRTAILKACLQAGRRRQGLYQLT VPTGGGKTVSSLGFALRHAAEHGLERIIYVIPYTSIIEQNAQVFKEILGKKNVLENHCNV TYEDKESGKELKMEQLAAENWDKPVVVTTNVRFFESLYACRSSECRKLHNIANSVIIFDE AQMLPVKYLMPCIRAISELICNYHCTAVICTATQPSLQPFFPAKMQELKAEELCPDVKGQ YDFFKRTKIQLAGRISQEHLVKELEGRQQVLCILNSRKRVQRVYDSIDSRGTYHLSTLMY PKHRKEILKGIRSRLSSGEPCRLISTSLVEAGVDFDFPAVYRELAGIDSVIQAAGRCNRE GKRDPEECMTQVFTLEEEEDIHIPRELKLPISVAGQIAQKYEDISSPEAIGEYFTRLYRY KGEGLDAKDVVEQFEQGSRSFMFPFASAASGFRLIESNTRTILIDTEPEAAQIAMQIRQG GHSRELIRQAGQYCVNVYEHDFEALSGAGRLEQIDREFYVLRNKEQYTRERGLVIDVSRG DALMF >gi|157101630|gb|DS480694.1| GENE 15 16081 - 16740 480 219 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2494 NR:ns ## 
KEGG: EUBREC_2494 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 219 1 222 222 342 72.0 5e-93 MSRGVRVRLWGDYALFSRPEMKVERCSYDVMTPSAARGMLEAIYWHPGMRWVIDKIYVRK PIQFTSIRRNEVKSKVLAGNALTAVNGGGKPLYISSKEEIVQRASILLRDVEYVVEAHFE MTPKAVPGDNEGKFKDIIMRRLRRGECYHQPYFGCREFPARFALYEEEEVDTAYDGTERD LGYMLYDLDYSDTEDIKPMFFRAVMKDGVLDVRNCEVVR >gi|157101630|gb|DS480694.1| GENE 16 16737 - 18512 1507 591 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2493 NR:ns ## KEGG: EUBREC_2493 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 586 1 575 578 623 52.0 1e-177 MILQALVEYYEALERKEKITSPGWCRARVAYGLDISEQGELMGVIPLKKEQQNGKKTVWV PQDLTVPQMVSRSSGVSANFLCDHSGYMLGIDNKGKPERSRECFECAKKKHKDILGNISS PAARAVCRFFDCWAPDRAAENPYLVPELEEIISGSNLVFMIDGEYVHEDPDIRQCWEEYS CQSGTGPEGVCLVTGRREEIARIHSTIKGVQGAQSSGAALVSFNAPAFESYGKEQSFNAP VGTYAAYAYTTALNYLLSDRSHGTTIGDTAVVYWSEEGDERYQNIFACVSEPTMENQEIV DGVFKNLAEGKAVVAQDVKDSLDMEKRFYILGLAPNAARLAVRFFYQDSFGNILKHIKEH YDRMEIVRPAADAVEYLGIWRMLQETVNKKSRDKKPVPSMSGAVYRSIISGSRYPASLYQ AVLGRIRAEQDDGDSRIYKITRGRAAIIKAYLLKNGNIREEITMALNEDSNNTAYILGRE FAVLEAIQEDANPGINATIKDKYFNSACATPAAIFPILFKLKNSHIKKMNNGAKETYYEK MLCDLQGRLTVAEGQRAACPRRLTLEEQGMFILGYYHQTQKRYEKKIKEEA >gi|157101630|gb|DS480694.1| GENE 17 18514 - 19404 988 296 aa, chain + ## HITS:1 COG:SPy1564 KEGG:ns NR:ns ## COG: SPy1564 COG3649 # Protein_GI_number: 15675456 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Streptococcus pyogenes M1 GAS # 4 286 1 266 282 139 31.0 7e-33 MAEVIKNRYEFVVLFDVENGNPNGDPDAGNLPRIDPESGYGIVTDVCLKRKIRNYVETLK EDEPGYKIYIREDVPLNRSDNTAYIELKTDEKGIKELKRKDPQVDQKIRDFMCRNFFDIR TFGAVMTTFVKAALNCGQVRGPVQLGFARSIDPIVSQEVTITRVAITTEKDAENKSTEMG RKNIVPYALYRAEGYISANLARKSTGFSEEDLELLWDAIINMFEIDHSAARGKMAVRRLI VFKHSKELGDCPAYKLFDAVEVKRNEDVEYPRKYGDYTVAVHRDQIPETVEVSEKI >gi|157101630|gb|DS480694.1| GENE 18 19431 - 20066 368 211 aa, chain + ## HITS:1 COG:SPy1563 KEGG:ns 
NR:ns ## COG: SPy1563 COG1468 # Protein_GI_number: 15675455 # Func_class: L Replication, recombination and repair # Function: RecB family exonuclease # Organism: Streptococcus pyogenes M1 GAS # 1 203 10 213 224 212 50.0 4e-55 MLSGIQHFIFCRRQWALIHIEQQWKENEHTIVGELLHKKAHDPYLAEKRGDVMISRALPV YSRSMGVSGECDIVEFHRAEDGIGLHGHRGLFRVFPVEYKKGSPKESEADILQLTAQAMC LEEMLSARIEAGALYYGEIHRRNMVEITDELRKRVRDIFQEMHELYDKGYTPRVRWSKSC NACSMKDICLPKLGKASSVRDYIRGKIEEDV >gi|157101630|gb|DS480694.1| GENE 19 20063 - 21094 695 343 aa, chain + ## HITS:1 COG:BH0341 KEGG:ns NR:ns ## COG: BH0341 COG1518 # Protein_GI_number: 15612904 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Bacillus halodurans # 1 343 1 343 343 455 63.0 1e-128 MRKLLNTLYVTSPDSYLSLDGENVVIYDNDTELGRIPFHNLEAIVSFGYRGMSPALMGGC AERDISLCVLSPQGRFLARVTGRVRGNVLLRRKQYQVSMNQDASLTIARNCILGKVYNAR WVLERAVRDHGLQIDTEKVKEAAGFLKQSLRHIEESRDMAQLRGYEGEDAGIYFGVFDQL ILQQKKDFYFKGRNRRPPLDNVNAMLSFVYTLLANTVASALETVGLDPYVGFMHTDRPGR LSLALDLMEELRPVLADRFVLTLINRKMVNKKDFSRKEDGAVIMRDEARKLLLSEWQNKK KEMITHPYLNEKVEWGMIPFVQAMLLSRYLRGDIDEYPPFFWK >gi|157101630|gb|DS480694.1| GENE 20 21133 - 21423 266 96 aa, chain + ## HITS:1 COG:BH0342 KEGG:ns NR:ns ## COG: BH0342 COG1343 # Protein_GI_number: 15612905 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Bacillus halodurans # 1 96 1 96 96 121 62.0 4e-28 MLILITYDVNTETTAGRTRLRKVAKQCVNYGTRVQNSVFECILDNTQMIKLKAILTDIID EQTDSLRFYNLGNKHTTKVDHVGVSKGIKVEEPLIF >gi|157101630|gb|DS480694.1| GENE 21 22691 - 23494 786 267 aa, chain - ## HITS:1 COG:alr1374 KEGG:ns NR:ns ## COG: alr1374 COG1402 # Protein_GI_number: 17228869 # Func_class: R General function prediction only # Function: Uncharacterized protein, putative amidase # Organism: Nostoc sp. 
PCC 7120 # 3 266 9 262 262 117 30.0 2e-26 MIRNYRDLTRLEMEQVNKDETIVLIPLGALEQHGNQAPLGTDDIIAEAMTDYIRRELEGE PSSEDSDFPMLVFPVIPVGLSTEHRNFCGSITLKPDTYYHMLYDISTSLVHHGFKKLAFL VCHGGNAPIVQVLSRELRSEFGISPFILSSGAFSHPDVTATISEGNIWDFHGGEMETSMV MAVDPSLVKLDTSEAGIPIAFKDNQALRPYGNVSIGWVSEDWKTADGKPIGIGGDPSGAT AEKGRIILETSAKVLVPGLREIRAWKG >gi|157101630|gb|DS480694.1| GENE 22 23533 - 24141 514 202 aa, chain - ## HITS:1 COG:CAC2841 KEGG:ns NR:ns ## COG: CAC2841 COG3601 # Protein_GI_number: 15896096 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 7 192 6 188 209 99 34.0 3e-21 MKTKWSIQKLIYTGMLAAVAGALMSLEFSVPMMPPFYKIDFSDVPSIIALFLMGPASAAW VEIIKIIIKLITVGTNSMYVGEFANLIGVALFVIPMWAVYKKGGKTRRAAIASLAVSVPI RTLFACFCNAFITLPLYAAAMGLPLNQVITMVAAVNPAISDLTTFIFLATIPFNLIKLGL NCFVGYLLYTRLLAVYPAVRTA >gi|157101630|gb|DS480694.1| GENE 23 24695 - 26101 1327 468 aa, chain + ## HITS:1 COG:lin2806 KEGG:ns NR:ns ## COG: lin2806 COG0232 # Protein_GI_number: 16801867 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Listeria innocua # 1 467 1 464 465 430 46.0 1e-120 MDWQTLLCDDRIRSYKKQSSTDLRTEFEKDYHRIIGSASFRRLQDKTQVFPLDRSDFIRT RLTHSLEVSSLAKSLGQNISESIRTIIKDETFTPEHKAAVCDILQCAGLIHDIGNPPFGH FGETAIQDWFKKNLERLTFKGRTLAEILEPQMVQDFCHFEGNTQAFRVVTRLHFLVDEHG MNLTKALLGTIIKYPVSSLEIDKDSGDIRTKKMGYFHGDRENFQDVQESTGTLGKRHPLA FILEAADDIAYKTADIEDAVKKGCISYERLLAELKAYKAGSQTNHYVQIVSWLEEKYGKA VGKGYDRPDQYAVQNWIISVQGQMISGVTECFAGNYESIMEGTYTRDLFAGTDVELLMEA LGDIALRYAFSTRPILKLEIAAQTIFDFLLDRFVDAVIPYDTDLPMTQVQKKLVSLISDN YKMIYSICARDRDEEERLYLRLLLVTDYICGMTDTFAKDMYQELNGIR >gi|157101630|gb|DS480694.1| GENE 24 26113 - 26499 256 128 aa, chain + ## HITS:1 COG:no KEGG:BT_4485 NR:ns ## KEGG: BT_4485 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 126 2 121 121 124 61.0 2e-27 MDSGKKYRARWGYLAASIIVFIIELIIALYVHDRIIRPYIGDMLVVVLVYCFVRVFVPRG MKRLPLYVFLFAVCVEVLQYFRLAELLGLQGNTAARIILGSVFDWKDIACYGTGCLLIQI 
FEIRRRAL >gi|157101630|gb|DS480694.1| GENE 25 26496 - 28268 1409 590 aa, chain + ## HITS:1 COG:no KEGG:Closa_0369 NR:ns ## KEGG: Closa_0369 # Name: not_defined # Def: plasmid pRiA4b ORF-3 family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 579 1 510 522 213 29.0 2e-53 MKGYQLKITIKGSKPPIWRRVVVPEQFTFCQLHQVIQEAFGWYDYHLHEFEFKKLGLLIR DPGEEDDLMESCSCDVLEEGTQIGTLITENPRFIYTYDFGDAWEHQILMEKEVEYEYSYP QVLKYKGDNIPEDCGGIGGYYDLLDQLADPEAEEHDLMEEWASRQGMEEYDLDKVNANLK AHPAFGQGGAGDGQENAAQERGVQEDAAQEHGAQEHLVQAHASREHAAHKQVTGLRDTEK PETQRPAPRVIHRLEDLFSRFEKTELLRIAKVHHIGGCDELGKNKLAPALAAALLDSRRM GQFFGTISDETAAEFEKAAAAGGMEYQGDLEALRFLHYGGYCGFTSMGTAVVPSDVAEAY DELNTEKFRSERHYRSQVWACCKAAVYLYGAADAGQVETICKTVGCEADAGEITRLCREM KGVWTDFAYWDGRLIDWDLVQADAYKDVLNYQKGKEFYIPSAAQVKEIARTGAVEIKRHI RPLKAFFLAMVGCDEETAQEAASSIHHHIRLGTTPKAVEDIMEHFGLNLDSQEKMNGFDE IMEQVWKETRTVGSCGYNRVELDAKTARIDPDAPCPCGSGKRYRQCCGRK >gi|157101630|gb|DS480694.1| GENE 26 28291 - 28464 290 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940019|ref|ZP_02087364.1| ## NR: gi|160940019|ref|ZP_02087364.1| hypothetical protein CLOBOL_04908 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04908 [Clostridium bolteae ATCC BAA-613] # 1 57 1 57 57 95 100.0 9e-19 MFERVDYRVERIDGDYAYLKRTDVLDDDEKCVARALLPGDINEGSRLAYEMLEYTLL >gi|157101630|gb|DS480694.1| GENE 27 28504 - 29433 899 309 aa, chain + ## HITS:1 COG:ECs5004 KEGG:ns NR:ns ## COG: ECs5004 COG2390 # Protein_GI_number: 15834258 # Func_class: K Transcription # Function: Transcriptional regulator, contains sigma factor-related N-terminal domain # Organism: Escherichia coli O157:H7 # 8 294 10 300 315 108 32.0 2e-23 MDKKTERLVNVARMYYEQDRTQSEIADRYGISRPMVSKLLKEARDRGIVTIRINAPKGES GGAPSLMELVGRCFGIYDGVAVPGGPNDQTTNEAVAEAAISYLSELGGASLGIGWGHIIG DVVKHMEQKAKLVPIGTFVCPLIGNGGVGLKNYHSNELVRSIAEHSGAQPRFIYSPACVL SEQELKLTRELDSYHEIYHVWEKLDVALVNIGNFPSVPDFASEARYGDLLIRQKAVGRIL NYFMDSQGHIIRSDTDYAIQIPLELLTRTRHVVGICSANTSPKAFRAALKTGYLKHFIAP EHVVREALE >gi|157101630|gb|DS480694.1| GENE 28 29607 - 30227 
806 206 aa, chain + ## HITS:1 COG:SMb20313 KEGG:ns NR:ns ## COG: SMb20313 COG2376 # Protein_GI_number: 16264047 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Sinorhizobium meliloti # 21 206 22 209 213 108 39.0 7e-24 MLLSKPEITKMFRKAAQVWNENKDYLSEIDSRFGDGDHGVTIGKIAGLIEKSLDGWDDDD VETFLEDLGDNTMEIGGGSAGPLYGTMIGGLSGPLEGNKPIDAGTLKEMFTECLSAMEDI TNAGVGDKTMMDALIPAVEAAQKAKDDVMAVLEAAKGAAARGAKESEQYVSKYGRARSYK EKTIGTPDAGAVSTSLFFAGLCDGLK >gi|157101630|gb|DS480694.1| GENE 29 30358 - 31356 1300 332 aa, chain + ## HITS:1 COG:SMb20312_2 KEGG:ns NR:ns ## COG: SMb20312_2 COG2376 # Protein_GI_number: 16264046 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Sinorhizobium meliloti # 3 331 1 331 333 215 34.0 9e-56 MKMKKFINNPENLTPELLEGFAEAHKELVTLGENRMIINNKLAEADRVTIVTQGGSGHEP AISGFVGEGMVDISVVGDVFAAPGPQACLDAIKLADKGKGVLYIVLNHAGDMLTGNLTMK KCAKEGLNVVKVVTQEDIANAPRSNADDRRGLVGCIPTYKIAGAAAAEGKSLEEVAAIAQ RFADNMATLAVAVSGATHPATGSLLADLGEDEMEIGMGQHGEGGGGRQTMKSADETAKIM LDGLLADLDIREGEKIMLILNGTGATTLMELFIIYRRCVSYLKEKNIEIVSNYVGELLTV QEQAGFQMFMARMDDELLHYWNAPCNTPYMKK >gi|157101630|gb|DS480694.1| GENE 30 31429 - 31749 449 106 aa, chain + ## HITS:1 COG:no KEGG:Elen_1231 NR:ns ## KEGG: Elen_1231 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 1 106 1 106 107 144 68.0 8e-34 MAAEYMHIGIPVLNKKEGMTYNEDMKFWVSNVDDYDFKIEYLKFEEGTPFPEILSKQPHV AYRVDDLDHYASQAQRIIFGPADCGPGVRLAFVIWDDAIVELYEEK >gi|157101630|gb|DS480694.1| GENE 31 31762 - 32604 1016 280 aa, chain + ## HITS:1 COG:YPO3960 KEGG:ns NR:ns ## COG: YPO3960 COG0191 # Protein_GI_number: 16124088 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Yersinia pestis # 2 278 3 281 286 194 40.0 1e-49 MLVNMNHVLRYAEEKQCCIGAFDTPNLEILMAVIRAAEKREEPVIIQHAQLHECEMPIHI IGPIMVRMAKEARVPVCVMLDHGEDLDYVRKALDLGFSAVMYDGSSLAYEENVAMTREVV 
AMAKACNADVEAEIGIVTGHEGKTFAINDVSDAYTDPELAARYVKDTGIDALAASVGTVH GFYATKPKLDFDRIVKIKELTGLPLVMHGGSGISVADTQRAIRCGIRKINYFSYMSNAGV KAVKKLLEEKEVKYFHDLANAAVDAMEQDVLNAMSMFALE >gi|157101630|gb|DS480694.1| GENE 32 32810 - 33805 884 331 aa, chain + ## HITS:1 COG:SMc03139 KEGG:ns NR:ns ## COG: SMc03139 COG2222 # Protein_GI_number: 15966707 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Sinorhizobium meliloti # 1 331 1 337 337 253 37.0 3e-67 MFNFDEAKVLQEHKNGLSIVGATEKAVDEVVKRGYKNIFYIGIGGTVLYANQMAHIVREE GSTIPLFIENAADFCVVDNPHFSKDSVVVIESISGDTKEVVAAVDKAHEIGAAVIGYVEK EGSPLYEKSDYLITTTGGGYYFWYTVTLRFMYHAGQFPRYEKFFEELKNMPENVVEIYKK ADEKAAEYARAYQDEPIQYLVGSGNLEDWAVCYGMCIMEEMQWMRTRPISASNFFHGTLE VIDRDTSVILIKGEDKTRPLMDRVENFVHKISAKVTVFDSKEFELKGISDEFRGMLCPIM MRSAFQRVSTHLEYNRRHPLAIRRYYRRLDY >gi|157101630|gb|DS480694.1| GENE 33 33977 - 34699 594 240 aa, chain + ## HITS:1 COG:CAC3502 KEGG:ns NR:ns ## COG: CAC3502 COG2188 # Protein_GI_number: 15896739 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 3 236 2 237 237 148 35.0 8e-36 MGKLNKESSVPLYQQLMEVIQNQILNGELKENDRIPTEIELSREYDVSRITVRKAVELLV EEEILVKRQGIGTFVSQKKLCRNINGFMGFTQSCLAEGNTAGAQLVSAELAEATMVDAEE LKIQEHEKIIKIVRVRTCDGIPVMLEENHFPARFAYLLGHDLTGSIYQILAENGTIMDNG IKRIGICQANEIEHKHLGVDLDKPLLYVKDVSYDREGNPVHNCKSVINPDRYKMTVMVNA >gi|157101630|gb|DS480694.1| GENE 34 34717 - 35553 626 278 aa, chain + ## HITS:1 COG:CAC3499 KEGG:ns NR:ns ## COG: CAC3499 COG1082 # Protein_GI_number: 15896736 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Clostridium acetobutylicum # 12 266 6 256 271 159 32.0 5e-39 MITISKSQISVMNIQYGYYPLEKFLDDAARAGVEHVELWGAAPHFHLEDMTYTDVCRVRK QIEERGLSLVCYTPEQCIYPVNLAADSTVQRRRSLKYFEDNLRAAAELGTDKMLVTTGWG YLDNSNINEAWKYAREGLMSLGDMARDYGIKLALEVLRRDESNLVYNLPTLKRMMEDLSH PAIGAMIDTIPMALAGERPEDYLKVFGEQLVHVHFIDGAPRGHLAWGDGVLDMKGYLEEF SRYSYKGYLSLEITDGRYRVDPTSSVLQSVERLYDVLT 
>gi|157101630|gb|DS480694.1| GENE 35 35770 - 36669 689 299 aa, chain + ## HITS:1 COG:STM3600 KEGG:ns NR:ns ## COG: STM3600 COG0524 # Protein_GI_number: 16766886 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Salmonella typhimurium LT2 # 2 296 3 279 281 231 41.0 9e-61 MLRVLGLGDNVVDKYMHIRTMYPGGNALNFAVYAKMFGIEAGYLGVFGDDEAAAHVYDTI RGLGLELSHCRFYPGENGYAEVRLDNGDRVFIGSNKGGVSREHPLELTAMDRAYIAGYDI VHTSIFSYIEDELPLLRQASSFVSMDFSDRADEEYLRQCAPYVDCASISCGDMPRSEIEK QMTLIRKFGCRHIVIATRGAKGALVMVDGRLYEQSPCLVEAVDTMGAGDSFITCFLINYV DAMKDSRDFPAASGTKGTVTAADYQDLAVRTSLYRAAVYSAGNCQKDGSFGFGKVFQNE >gi|157101630|gb|DS480694.1| GENE 36 36714 - 38087 1013 457 aa, chain + ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 49 450 20 419 422 342 42.0 7e-94 MRKFVKKAAAALLAGTMAMSLTACAGFTPSGGNDKKDTTASSEAGADTGVTQAAQTSGAK KTIKFFHRFPDEPFNGFIESKIAEYEEAHPDIDIVVESAQNDPYKEKIKVVVGGNDAPDI FFSWCGEFSERFLRENLIMDLTPYLEADKEWKDSLMESQLVNYTKDGVTYGIPFRLDCKL FFYNTQIFEENGLKAPATWDELIEVCDKLKAAGVTPISYGNQEKWPAAHYIGSLNQMLVS DEVREKDFDPAQGEFTDPGYEEALKCYQQLIPYCNDAVNGTAADMARTNFAMGKSAMYYA ELIEIPYITDLNADLDYGMFKFPVVEGPGNPDILTGVPEGFVVSSKTKYPDECIEFLKWF LGPEVGKEQAQTIGWFNASKNVTEGVNDTKLLDGYKVVNEAKIMGPWFDNALYSTTCDEY LTAASDLTNGDITPAEAMARVQKVAKEAQSLVSGQTK >gi|157101630|gb|DS480694.1| GENE 37 38149 - 39024 720 291 aa, chain + ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 5 290 6 290 292 237 41.0 2e-62 MRTKKSTPYLYMIPGLIMVLVFVYIPVITNIVYSFFSLSSYSSSAKFVGISNYIRFFTKD TLPIMLKNNGLYCIISLVVQVGFGTLLALLLESKATGRFRNVYRNIYYIPALISLTAVGI LFTFIYQPDLGMLNSFLRAIGHGDMARSWLGDSKIAIYCIIAMSQWQFTGYITLLMVVAI 
QNVPNDYIEAASIDGAGPVKRALHIILPLAKEQLLVCSIITVIGAFKLFTEVYSTTGGGP GNSTQVLGVFLYQNAFLHDDLGMAAVTGVFIFMVTMVLSLIQMKITRSGEV >gi|157101630|gb|DS480694.1| GENE 38 39038 - 39862 737 274 aa, chain + ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 13 274 39 300 300 247 46.0 2e-65 MKKRTAVTILLNIFFILLVLVIILPAAWLIMNSLKTNVELFTDSLALPAKVQFANYISAW NKGLYKYFGNSVIVSSISLISILFLSSLLAYGITRFNSRFGNLIFLITLGGMALSEQVAL VPLYKILQATHLYNTYWAVILPYVAFRIPFTVFLMRAYFMCIPKELEEAAVIDGYNSFQI FRKIIVPISKPIFASCAIVNLNFVWNEFLFANVFLESKSIMTIPIGLMTFKGDMRSDYVV MLAGITIASLPMIILYLCMQKQFVRGLTSGAVKG >gi|157101630|gb|DS480694.1| GENE 39 39975 - 40784 691 269 aa, chain + ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 4 259 2 255 269 99 27.0 8e-21 MEYKTADYHVHSTISPDAKGTMEEMCNAAMAAGIEEIAFTDHYEFYAGGITGKYFNRGYL IRYFEELDKCRQMYEGRLIIRSAMEFGQLHLDPEKASGIIKSWPFDYLIGSVHKIDNIDL SQMEFRKDTGQQIADAYYSHLLKLARTGDFDCMGHLDLYKRHCRKAGLPDDYDKYEDRIV QVLTALIERGKGIEVNTSGLRQGAGETMPGIRCLKLYKELGGTVITVGSDAHLPEDVGSG FIEALGLVKKAGFDRIARYEKRMCLFQPL >gi|157101630|gb|DS480694.1| GENE 40 40781 - 41623 560 280 aa, chain - ## HITS:1 COG:SPy2155 KEGG:ns NR:ns ## COG: SPy2155 COG1284 # Protein_GI_number: 15675897 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pyogenes M1 GAS # 18 276 57 309 313 130 31.0 2e-30 MAGATAGVFLFAFSFCVFLRPLGLYSGGFTGIAQIIHLVFQDILKIPLPGGIDWTGIIFW MLNIPLFLLSYVLISHSFFYKTIVCVCLQSLFIAFIPSPEIPVFTDPLASVLVGGAISGF GVGLTLTCGGSGGGVDIVGIYCTKRFPRFSVGRISILINVLIYLFCAFRYNLATAAYSAL FAIIAGVATDKIHYQNIKACVTIISRHPDIQNIILNNLHRGVTIWNASGGYTGQSTYVYM TVISKNEIEPLRQLVLTADHNAFIAIQNNIDVTGHFEKRF 
>gi|157101630|gb|DS480694.1| GENE 41 41732 - 42700 832 322 aa, chain - ## HITS:1 COG:STM0572 KEGG:ns NR:ns ## COG: STM0572 COG2222 # Protein_GI_number: 16763949 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Salmonella typhimurium LT2 # 4 322 9 328 328 279 43.0 4e-75 MELKKIIEEIKAERPEVTSVIFVGCGASQAELYPAKYFLEGNAKKLRTSLYTANEFVHAT PACVGKESIIITCSLGGNTPETVEASKKAMNMGAKVIAITHTPGSPLANSAHYVVLHGFE KNYAAKLEKMTFCLQLAAEILNQYEGYEHYQDMMTGFAKIFDLIEDAVSFVVPSAKKFAE NYKDAPVIYVMSSGATHMVAYSFSICLLMEMQWINSGSFHDGEFFHGPFEIVDKDVPFLL LMNDGSTRALDSRALDFLNRFQAKTTLIDAKDFGLGSVIPSTVKDYFNPMLITAVLRVYA EQLAILRNHPLTQRRYMWKLEY >gi|157101630|gb|DS480694.1| GENE 42 42935 - 43186 202 83 aa, chain - ## HITS:1 COG:no KEGG:Amet_1294 NR:ns ## KEGG: Amet_1294 # Name: not_defined # Def: GntR family transcriptional regulator # Organism: A.metalliredigens # Pathway: not_defined # 7 80 46 119 223 80 56.0 1e-14 MFFILHLSRTPVREALIELSKVGLVEIQPQRGSCIAKIVYELIGESRFMRLMLENAVLKL ACESISQEYMDKLKEYLRMETIS >gi|157101630|gb|DS480694.1| GENE 43 43377 - 44048 891 223 aa, chain + ## HITS:1 COG:no KEGG:TepRe1_1961 NR:ns ## KEGG: TepRe1_1961 # Name: not_defined # Def: Asp/Glu/hydantoin racemase # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 1 216 1 217 222 203 50.0 5e-51 MKVALVYTSTTPELIELVEKEVGDVLPRETEVASYQDPSILAEVRDAGYVTAPPAARLVG MYMTAVSEGADAILNLCSSVGEVADAAQDIARYTGIPIVRVDEEMCREAVRQGKRIGVMA TLPTTLNPTKNTILRVAREMNRQVELVDALVDGGFGLDQEQFKALMSEYAGTIADKVDVI LFAQGSMAYCEEYIHEKYGKVVLSSPRFGAAALKDALVKKGLM >gi|157101630|gb|DS480694.1| GENE 44 44175 - 45221 265 348 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 9 311 24 314 346 106 26 1e-21 MHATKGEGSMFKRLKKAAAVICAAAMVLSMTACGSAGDKKVVLHFTHTQSPGSISDLTAQ EFKKLVEERSDGRIEVSIYSNCGLSGGDLTKAIELVQAGNIDIHSCAPPNIANYDKKFYS FWLPFLFPNSDDLLAFCHSDKVHEVVNGWCNDLEMEMLGINNAGSRQISNSKKEITKPED LKGMNIRVPGANIFIDLYRNYFGANPTAMDFSEVYTSLQQKTIDGQENPIAVFDSSKFAE 
VQQYVTLWDGVRDTTIWVMSRKTMNKLSPEDQELVKECAAEALDWGNDYLADNEAVIIQK LKDGGTVITELTEEQKAEFQKACAGIYDDYAKSVGQDVIDLFTGGYKN >gi|157101630|gb|DS480694.1| GENE 45 45232 - 45714 450 160 aa, chain + ## HITS:1 COG:no KEGG:Amico_1587 NR:ns ## KEGG: Amico_1587 # Name: not_defined # Def: tripartite ATP-independent periplasmic transporter DctQ component # Organism: A.colombiense # Pathway: not_defined # 6 158 1 153 156 73 26.0 3e-12 MKEKGLLKDFWENFEEYLCSAALLVMTFVTFMNVFSRKITWFNMSFTQELVTTMFVWVCC LAAASVFKTDSHMGFNYLTDKFTGTKRKVYRICRLLLCSANYGIWIIWGAQMVYRQYHYH LLTGVLEMPQWTIGIAIPLSAVFSIARMVQYEIRLAKEGN >gi|157101630|gb|DS480694.1| GENE 46 45716 - 46987 722 423 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. PR1] # 1 420 4 425 431 282 36 1e-74 MSAAILFGSMAVMIGFGIPIAIVLGVSSALTLIFSGAPLGVIPSMMQATVQKFSLLTIPL FVLAGAIMDKGGISKRLINLADSIIGPVHGGLGYVAVVAALFFAAISGSGTATVAAMGSI LIPAMVAQGYDAGFSSALSAISGSLGTVIPPSITFIIYGMITGESIGDLFLSGIIPGLIF GFMLCIMVFIKSKKNGWKGETQRASLKEIGRLFLDTIWGLLSPVIVLGGIYSGLFTPTEA AAVAVVYSLIVGIFIYKELRLAELKETLFSTAKTTGMILLIIMNAGIFSWVLTQQGIAAS LTEMALALTDNKYIMLVIINIVFLCAGCVMDNTSALYILVPIIMPIAKALDINMIQLGVI LVLNLSIGQVTPPVGPNLYVAADIGKVKFEEICRKMVPLLLMSLLALLCITYIPALSLCL IPS >gi|157101630|gb|DS480694.1| GENE 47 47152 - 47889 816 245 aa, chain + ## HITS:1 COG:CAC2546 KEGG:ns NR:ns ## COG: CAC2546 COG2186 # Protein_GI_number: 15895808 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 10 241 4 230 231 90 28.0 2e-18 MEKTSFLKEPVGGQSVVNKIVDNITNAIINGELNPGDKIPTEAELSESMGAGRNSVREAI KVLEAYGVVHIKRAEGTFVSQEYDSRMIYPVLYGIILQKDSTSQIVELRKVIDVGLLQLA VDKLRSKSLEQTQMEAIEKAMEELEYQAHMDKPQARSLYEADVQFHMAIVGITENVMLQS ICNYVDKITRRSRMVTIDRIFLDGEVENFLDLHRRIVKLLQDRDSAGIYAIVEEHYQYWA KVKDE >gi|157101630|gb|DS480694.1| GENE 48 47990 - 48343 341 117 aa, chain + ## HITS:1 COG:TM1287 KEGG:ns NR:ns ## COG: TM1287 COG0662 # Protein_GI_number: 15644042 # Func_class: G Carbohydrate transport and metabolism # Function: 
Mannose-6-phosphate isomerase # Organism: Thermotoga maritima # 7 116 12 120 121 70 37.0 8e-13 MIYRMKNELERTPIEGCMGGNGTVWMEKLLTGQEEMMGKGRAYVRHTLNPGVSIGIHTHE GEMETMVIVRGKAVHTINGQDQYLEEGDIIAAQPGDSHGIAQTGDEPLVLIAQVLFA >gi|157101630|gb|DS480694.1| GENE 49 48376 - 49188 1028 270 aa, chain + ## HITS:1 COG:FN1868 KEGG:ns NR:ns ## COG: FN1868 COG3246 # Protein_GI_number: 19705173 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 13 270 14 272 272 124 32.0 1e-28 MTDMDNKVLVNVAPVSAEDKVIIPEKIAEDVYKCYKAGAAMVHLHVRDREGRLTPDMSLL EETLAMIRRDSDIIIEVSTGGVSDLNIRERCAPLYSELVEACSLNVGSTNLGKAVYCNPI DDVEYCIREILKQKKTPEVETFEIGHTWTMARLMEKYDFADPVLFSVVLGHEGEAPATPQ ALAAMIQMIPSNAVWGITQAHRKDFSLLAGALGMGARTVRIGFEDSNYLDAQTQVTSNAP LVEKTVKLLRAMDKEPMLPDEARELFRIGR >gi|157101630|gb|DS480694.1| GENE 50 49207 - 50625 1430 472 aa, chain + ## HITS:1 COG:slr1342 KEGG:ns NR:ns ## COG: slr1342 COG3395 # Protein_GI_number: 16330749 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 29 470 3 439 445 250 35.0 5e-66 MLNASVLDQYPGVDTDAVQALLEEKCRQDRHKIIVLDDDPTGVQTVHDVSVYTDWSYDSI KKGFEEDGKLFYILTNSRGFTVEQTTRAHLEIGETAAKVSEETGIDYVIVSRGDSTLRGH YPLETELLARAEEKHRGRAVDGEIICPYFKEGGRFTIGNVHYVKYGNELIPAGETEFAED KTFGYHCSNLKEYVEEKTGGRYPAREVLDVSLEELRSLDYASITDKLLALHDFGKIVVNA VDACDLKVFCIALYDAMNQGRRFMFRTAAGFVKEFGAIRERPLLSREEMVQENCGTGGII VVGSHTKKTTSQLEALKTVEGIRFIEFNSDLVLDEEKFQEEISSVISQEEELIGRGVTVA VYTRRKLLSLEHDSPEEALVRSVRISDAVQSLVGRLKVRPAFVVAKGGITSSDVGTKALQ VKRAAVLGQIRPGIPVWRTGPESRFPGTPYIIFPGNVGEVETLKEAVEILLG >gi|157101630|gb|DS480694.1| GENE 51 50976 - 51884 808 302 aa, chain + ## HITS:1 COG:NMA1605 KEGG:ns NR:ns ## COG: NMA1605 COG1737 # Protein_GI_number: 15794500 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Neisseria meningitidis Z2491 # 9 300 5 275 282 111 30.0 2e-24 MAKPFEFNIKKLYADLRPSEQKVADYYLGYAGQLEELSLVPMAKAAGVSQPTVMRFVKAL 
GFGGFREFKYAVLKAASGRGNMEADAKADSRADGSGGAVPPPLVCGYGISEEDRPEEVPG QVITRTISYLDDALKHISPSEIIKAAGMICSAGQVAVYYVENSATVANDLVTKLLYLGIN CVTYNDVYLQQISAGNLGEQDVAIGISYSGTSKSTVDVMKLAKRKGASTIVLTNYDDVLI GKYADIMLCTGNRQQLYGNAIFSRTSQIAVVDMIYTGIILRDYPKYTKKLDESSRIVRNQ TY >gi|157101630|gb|DS480694.1| GENE 52 52114 - 52899 366 261 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 237 1 245 245 145 35 2e-33 MLELNKIRKSYDGVTILDNISLSIESGEILSILGPSGCGKTTLLNLILGITDADSGEIWF EGRNITRVPMEERGFNIVFQDYALFPNLNVYKNITYGLRNKPDISTGQEVQELIDLLGLG EHLAKGIDQLSGGQKQRVALARTMVMKPRILLLDEPLSALDGVIKESIKEKIKEIAREFK LTTIIVTHDPEEALTLSDKVLIVNEGNIAQFGRPEEIIGQPADSFVKNFILNQLEVKRNN IFTLFSREMTRSSSQRISCTA >gi|157101630|gb|DS480694.1| GENE 53 52922 - 55549 2277 875 aa, chain + ## HITS:1 COG:SMb21542 KEGG:ns NR:ns ## COG: SMb21542 COG1178 # Protein_GI_number: 16264731 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Sinorhizobium meliloti # 41 535 161 657 680 218 28.0 4e-56 MIKTKERELKQIYAMVVIVFLVFLLIPIVLLLGKSFEGGGSVSLEHYADVAGGKGFAAAF GRSIFVSCVSAVLTTALAFVLAYTIHYTNVPGGLKKFIRLAAVVPMLLPTITYGFAIIYS FGRQGFITGMIGHQMFEIYGFCGLLLGYVIYTLPISFMLILNTMSFIDKKFMIVSRIMGD SPLKTFLATVIRPLLGTLAASFVQCFFLCFTDFGIPASVGGQYDVVATVLYNEMLGSVPD FNRGAVVAMFMLVPSVVSISLLHFLERFNIRYSKVSGIELRKGRIRDICCGAASVLILAS VLCIFAVIFVVPMVEEWPYRRQFTTEHIKTVFTDSRLLSVYKNSILVSIFTALTGSLVAY AAALATARSQLSSRLKSVIESIALVTNTIPGMVIGIAFLLTFSGTPIQNTFFIIVICNVV HFFSTPYLMMKSSLAKMNASWETTALLMGDSWLKTIVRVVTPNAVSTLLEVFGYYFVNAM VTVSAVIFIAGARTMVITTKIKELQYYTKFNEIFVLSLLILITNLAAKGLFGYLAGGRCK NKRRKTIMKNWKKAAVMGCLAAVLAAAAAVTGCQKKEAQVIIYSNADDEAVEAMKHALDG NGYGGKYLFQTFGTSELGGKLLAEGTDIEADLVTMSSFYLESAQEQNHMFLDLTFDAKPL EKYPAYYSPVTRQEGAIILNTEMMKEHQLPTPSSIKDLVNPAYADLISVTDIKSSSTAWL LMQAIVSEYGEDGAKEVLSGIYQNAGPHIESSGSAPLKKVRAGEVAVGFGLRHQAVADKA EGLPVDYVDPTEGNFSLTESAAVIDKGDKTNKLAMEMAECIIKNGRQELLTTYPLPIYEG ETSNSKNQSAYPRVFPEKLTVDLLKKHQELSESCK >gi|157101630|gb|DS480694.1| 
GENE 54 55648 - 57525 1670 625 aa, chain + ## HITS:1 COG:STM0431 KEGG:ns NR:ns ## COG: STM0431 COG0075 # Protein_GI_number: 16763811 # Func_class: E Amino acid transport and metabolism # Function: Serine-pyruvate aminotransferase/archaeal aspartate aminotransferase # Organism: Salmonella typhimurium LT2 # 3 356 5 359 367 397 53.0 1e-110 MPNYKLLTPGPLTTTDTVKKEMMFDHCTWDEDYKKITQEIRSGLLKLARVSEDTYTCVLM QGSGTFGVESVLTSSVGADDRLLIASNGAYGERMADIAEHAGLSYILYSETYDQVPSADK IQKLLNEHPEITHVSMVHSETTSGILNDIASVAGVVKASGRTFIVDAMSSFGGVDIPAEE LGIDFIISSANKCIQGVPGFSFIICRRDCLAACEGKAKSLSLDLYDQWKTMEKDGKWRFT SPTHVVLAFAQAMREMEAEGGIEARHQRYTSNNRLLIELMAEMGIRPYIDSRHQGPIITT FFYPENHAFSFDEMYHYIKERGYAIYPGKVTDADTFRIGNIGEIYEEDIRRLCAIIGEFL NKKIQKREKSHTAFDAVIFDWAGTTVDYGCFAPVQAFMDAFEEFGIVPTMDEVRAPMGML KIDHVRTMLQMERLTAEWERIYGRPFTEEDVYQVYQLSEKKILENVPRFTDVKPYVTETV ETLRGMGIKIGSTTGYTDEMMEIVAAAAATKGYKPDCWFSPDSVDRKGRPNPYMIFRNMQ FLGLTDVRRVMKVGDTVSDIREGKNAGLFTVGVIEGSSAMGLSREEFKALSPKEKEERSR KVRQMYLEAGADAVVTDIRGVLEYI >gi|157101630|gb|DS480694.1| GENE 55 57671 - 57946 260 91 aa, chain + ## HITS:1 COG:no KEGG:Closa_0373 NR:ns ## KEGG: Closa_0373 # Name: not_defined # Def: sporulation transcriptional regulator SpoIIID # Organism: C.saccharolyticum # Pathway: not_defined # 1 89 1 89 90 139 87.0 3e-32 MKEYIEERAVAIANYIIDHNATVRQTAKKFGISKSTVHKDVTDRLEHINPTLAAQARIVL DVNKSERHIRGGLATREKYQHRLAICSKQED >gi|157101630|gb|DS480694.1| GENE 56 58079 - 59614 1758 511 aa, chain - ## HITS:1 COG:MA0003 KEGG:ns NR:ns ## COG: MA0003 COG0591 # Protein_GI_number: 20088902 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Methanosarcina acetivorans str.C2A # 1 509 1 489 514 363 44.0 1e-100 MSTGQIGILAAIIVYLAAMVYVGFYFSKKGGGDSADEFYLGGRKLGALVTAMSAEASDMS SWLLMGLPGVAYLTGIADAGWTAIGLAVGTYLNWLIVAKRLRRYSVACGAITIPDFFSRR YRDDRNILMCIAAIVILIFFIPYTASGFKAVGTLFNSLFGVNYHAAMIAGAVVIIGYTVM GGFMAVSTTDLIQSIVMSIALVIIVFFGIHVAGGWDAVTDNARSLAGYLSMTHVHNMADN 
TASPYGFITILSTLAWGLGYFGMPHILLRFMAISHEDKLKTSRRIASIWVVISMFVAILI GIIGYGASVAGGIPMFSSSSESETVIIRLADLLGRHGILLSVVAGVVLAGILACTMSTAD SQLLTAASGVSQNLLQDFLKIKMDTKASMTAARLTVVGIALVAILLAWNPDSSVFTIVSF AWAGFGASFGPVMLFALFWKRSNLNGALAGMICGGVMVFVWKYGVRPLGGAWNIYELLPA FIVACIAIVAVSLMTEAPSKEICDEYDSVGR >gi|157101630|gb|DS480694.1| GENE 57 59848 - 60441 714 197 aa, chain - ## HITS:1 COG:CAC2605 KEGG:ns NR:ns ## COG: CAC2605 COG1309 # Protein_GI_number: 15895863 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 3 177 8 176 194 83 29.0 2e-16 MAKKNARNTRGKIVNAAWQLFYEQGYEDTTVEEIIELSHTSKGSFYHYFDGKDALLSTLS LLFDEKYEELMDTLTPEMSAMDKLLFLNSELFGMIENSISLDLLARLLSTQLVTSGEKHL LDHNRTYYKLLRRIIAQGQESGELNAETSVNEMVKLYALSERALLYDWCLCAGEYSLRRY SASVMPAFLSHFLPWRG >gi|157101630|gb|DS480694.1| GENE 58 60675 - 61700 1230 341 aa, chain + ## HITS:1 COG:CAC1479 KEGG:ns NR:ns ## COG: CAC1479 COG0115 # Protein_GI_number: 15894758 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Clostridium acetobutylicum # 1 340 1 340 341 550 73.0 1e-156 MEKKDLDWGNIGFSYMPTDKRYVANYKDGAWDEGGMTSDPNVVINECAGILQYCQEIFEG LKAYTTEDGSIVTFRPDLNAERMMDSARRLEMPPFPKERFLEAVDQVVKANAAWVPPFGS GATLYLRPYMFATGPVIGVKPSQEYQFRLFGTPVGPYFKGGAKPLTLCVSDFDRAAPNGT GHVKAGLNYAMSLHASVTAHAAGYDENVFLDPGTRTYIEETGGANFLFVTKDGEVVTPKS DSILPSITRRSLVYVAEHYLGLKVTQRPVKLSELDEFAECGLCGTAAVISPVGRIVDHGK EICFPSGMDAMGPVTKKLYDTLTGIQMGTIEAPEGWIRKIV >gi|157101630|gb|DS480694.1| GENE 59 61827 - 62252 487 141 aa, chain + ## HITS:1 COG:no KEGG:Closa_0380 NR:ns ## KEGG: Closa_0380 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 6 141 8 143 143 178 63.0 6e-44 MKGTIKGKVYLGLMMLFAMTLWLAGCSASKDSGDKVRDLDFTVAGDMDVPDELKNLIAEK QQQPFKLTYSDEQNLYIAVGYGVQPTGGYSISVNELYLTDNSIVVNTELKGPEKGENAGA EQSFPYIVIRTEYLENPVVFQ >gi|157101630|gb|DS480694.1| GENE 60 62432 - 63082 719 
216 aa, chain + ## HITS:1 COG:CAC0687 KEGG:ns NR:ns ## COG: CAC0687 COG1045 # Protein_GI_number: 15893975 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Clostridium acetobutylicum # 2 186 1 179 186 234 60.0 1e-61 MIFKDIWLDIKAVQERDPAARNALEVFLLYQGIHALIWHRFAHWFYNHRMFFVARLISQI ARFFTLIEIHPGARLGRGILIDHGCGVVIGETAVVGDNCTIYQGVTLGGVGTKKGKRHPT LGNNVMVGAGAKILGAFEVGDNCSIAANAVLLKPLEDNVTAVGIPARPIKKDGVTIPKEK KSLTLEEYETMKRRMEQMEKEIDFLKGALKEARSEK >gi|157101630|gb|DS480694.1| GENE 61 63109 - 63981 878 290 aa, chain + ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 10 274 22 269 277 225 40.0 9e-59 MKQEAPHVNIERSIIKQFRKEIWRPFVKGLQDYEMIKDGDKVAVCISGGKDSMLLAKCLQ QLKRQSRTDFELEFIVMDPGYNPDNRNLIEDNAALMNIPVRIFDSDIFDIVVDVEQSPCY LCARMRRGYLYAHARELGCNKIALGHHFDDVIETILMGILYGGQINTMMPKLHSTNFQGM ELIRPLYYVKEEDILDWKDYNGLHFLQCACRFTERIAKEQAARGMAGEAADIVHTSKRQE MKELIRSLRKTSPFIDANIMKSVENINLDACLGYVKDGVRRHFMDEYDER >gi|157101630|gb|DS480694.1| GENE 62 64081 - 65667 1967 528 aa, chain + ## HITS:1 COG:CAC2903 KEGG:ns NR:ns ## COG: CAC2903 COG1388 # Protein_GI_number: 15896156 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: FOG: LysM repeat # Organism: Clostridium acetobutylicum # 6 520 17 514 520 137 22.0 6e-32 MNQWKGNVTTQITLDDDFIVPDTLDDMEQVMLDTGEIQIETVKNQGDKVAVKGKLEFQVL YRKESGGLQTLGGNIPFEEVVNVQGLEEKDYVGLNWMLEDLNTDMINSRKLGVQAIITLQ VRVETLRDVEAAVDVDMENNSRPAGAGMSDTDGAGQVQVETLKRTANAAAIAVRRKDTYR IKEELSLTGGKPNIGQLLWREMKLRDVTAKPLDGRLHLDGELSVFVIYGAEDENMPVQWL EETIPFSGDMEMGDVKEDQIPMVTVRLAHKDMEAKPDYDGEMREMDVDAVLELDIKLYEE EEVELLSDMYSNNREIELNRSDACFDQILTRNVCKSKVAEKLSLPQAERILQICHNEGTI KLDEVEIGDDCLEIDGVLEVTLLYLTSDDTSPVQSSVEQVPFHCVAEARGIREDSVYQLD AGLEQLSAVMMGGDMVEVKAVIALDFLVLQPVCEPVITGAAIHPMDLQKLQELPGIVGYI 
VQPGDSLWEIAKKFHTTVGNIISTNELADDQVKPGQRLLLVKEIAQGL >gi|157101630|gb|DS480694.1| GENE 63 65842 - 66753 1068 303 aa, chain + ## HITS:1 COG:HP0292 KEGG:ns NR:ns ## COG: HP0292 COG4866 # Protein_GI_number: 15644920 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori 26695 # 92 298 59 284 290 117 35.0 4e-26 MELEFKPVVAGDIDRLRPFYGLRPNKTCDSVFLDSFLWKDYYHVQCAVSEGRAVLWLMEK DGQVSTAMPLCREEDLPYFFQQMVDYFSGVLHKPFYISLADEEAVQCLNLDPALFEVEEQ VDLKDYLYSGEELRKLEGKKFVKKRNHLNGFKRTYEGRYEYRRLCCSDRHEVWQFMERWR QHKEEVGAMELSLDYEVAGIHDILKNCSSLYVRMAGVYIDGQLEAFTIGSLNVRENMAVI HIEKANPEIRGLYQFINQQFLINEFPEAELVNREDDVGMEGLRKAKMSYCPIGFARKYMV RER >gi|157101630|gb|DS480694.1| GENE 64 66750 - 67877 808 375 aa, chain + ## HITS:1 COG:FN1041 KEGG:ns NR:ns ## COG: FN1041 COG4552 # Protein_GI_number: 19704376 # Func_class: R General function prediction only # Function: Predicted acetyltransferase involved in intracellular survival and related acetyltransferases # Organism: Fusobacterium nucleatum # 2 138 3 137 391 88 34.0 2e-17 MIRYLEKSEFGACRPLWQEAFPEDSREFADYYFDKKILQSSVLVKEDGTGRIVTMAHMNP YRINMGKKMWKLDYIVGVATAADSRHRGHMRDVLMKMLGDMHQDRKPFCYLMPASPDIYR PFGFRYIFDQPQYRLGSEAKECLQRRELRLDGNLCSSLAGWMNHWLSSRYQVYALRDRDY MEMLQSELDSEAGSVYGWYDVSGSLQAFQAFWGREKQEQRFLYAGRDRWLEPDSGQHAPR PAIMARITDVRSFMEAIVLDEGCPCPAMDVMVNIHDPLIPGNSGLWRWKIDGNGSSLTKK GVSLPKIHTGECAEEPDYPGTLVSTEVLDITIEQLASWLFGYSALEELAEGEQNGVPFWC AYVQVLDGVYLDEVV >gi|157101630|gb|DS480694.1| GENE 65 67966 - 68688 764 240 aa, chain + ## HITS:1 COG:CAC0284 KEGG:ns NR:ns ## COG: CAC0284 COG0846 # Protein_GI_number: 15893576 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Clostridium acetobutylicum # 3 239 5 245 245 324 61.0 1e-88 MNEKWQQLKEWIGGSDNIVFFGGAGVSTESGIPDFRSVDGLYNQQYKYPPETIISHSFYM RYPEEFYRFYKDRMLFTDAVPNQAHKALARLEERGKLKAVITQNIDGLHQMAGSREVLEL HGSVHRNYCTRCGQFYDLDYVVKSDGVPHCSCGGVIKPDVVLYEEGLDDRTLQKSVDYIR 
HADILIIGGTSLVVYPAAGLIDYYRGHKLVLINKAATSRDSQADLVISDPIGEVLGTVVD >gi|157101630|gb|DS480694.1| GENE 66 68707 - 69699 1017 330 aa, chain + ## HITS:1 COG:STM2406 KEGG:ns NR:ns ## COG: STM2406 COG0667 # Protein_GI_number: 16765732 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Salmonella typhimurium LT2 # 1 330 2 330 332 416 63.0 1e-116 MYQAAENRYGSMEYKRSGRSGVLLPRISLGLWQNFGLEKSLEEQEAVLFRAFDMGITHFD LANNYGFPAVGKAEENFGSALRDGLGRYRDELFISTKAGFDFWPGPYGNWGSRKYLMASL DSSLKRMGLEYVDVFYHHRPDPETPLEETMGALSDMVRQGKALYVGISNYQAEEAEAAIR ILRENKTPCLLHQPRYNMFERWAEDGLFELLDREGVGCICYSPLAQGALTGRYLDGIPEG SRASKEGTTVGGRYLTEDKLVKIRALNKMAGERGQSLAQMALAWVLRRKEVTSVLIGASS IAQLEDNAKALDSPAFSCEELADIEGILKG >gi|157101630|gb|DS480694.1| GENE 67 69696 - 69944 229 82 aa, chain + ## HITS:1 COG:lin2921 KEGG:ns NR:ns ## COG: lin2921 COG4481 # Protein_GI_number: 16801980 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 18 80 2 65 65 68 46.0 4e-12 MTGSGQTHIQKWRYGMNEKLNYEVGDIVKLKKTHPCGSSQWEILRVGADFRLRCMGCGHQ IMIARRLVEKNTRGLTKAKEDS >gi|157101630|gb|DS480694.1| GENE 68 70113 - 70400 346 95 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881892|ref|YP_001560860.1| ribosomal protein S6 [Clostridium phytofermentans ISDg] # 1 95 1 95 95 137 66 5e-31 MNKYELTVVLNVKLEDEQRAAAIEKVKGYITRFGGTVTNVDEWGKKRLAYEIQKMHEAYY YFIQFESDSVCPNEVEAHIRIMEPVIRYLCVKAEA >gi|157101630|gb|DS480694.1| GENE 69 70450 - 70914 382 154 aa, chain + ## HITS:1 COG:FN1304 KEGG:ns NR:ns ## COG: FN1304 COG0629 # Protein_GI_number: 19704639 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Fusobacterium nucleatum # 1 111 1 106 154 96 47.0 2e-20 MNRVILMGRLTRDPEVRYSQGERSMAIARYTLAVDRRGRRGQDSSAEQQTADFINCVAFD RAAEFAEKYFRQGMRVLVSGRIQTGSYVNKEGQKVYTTEVILDDQEFADSKGAASEMGGY AQAAPSQRPAPTSAIGDGFMNIPDGVEDEGLPFN >gi|157101630|gb|DS480694.1| GENE 70 70952 - 
71221 478 89 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160940069|ref|ZP_02087414.1| hypothetical protein CLOBOL_04958 [Clostridium bolteae ATCC BAA-613] # 1 89 1 89 89 188 100 2e-46 MAFNKTDRPDGAKMRRPGGMRRRKKVCVFCGDKNGTIDYKDVNKLKRYVSERGKILPRRI TGNCAKHQRALTVAIKRARHIALMPYTCD >gi|157101630|gb|DS480694.1| GENE 71 71332 - 71679 442 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940070|ref|ZP_02087415.1| ## NR: gi|160940070|ref|ZP_02087415.1| hypothetical protein CLOBOL_04959 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04959 [Clostridium bolteae ATCC BAA-613] # 1 115 1 115 115 210 100.0 3e-53 MLMDMNLMLIIVQENDAESLASAFKRSQIQATRIDAKGLVSSRKLNVFLVGTERTDEVLQ LVSSNCKERAMEIEDQEYNGHMFVDVQKTIVIGGATIFILGEARLMKVKGGCCGD >gi|157101630|gb|DS480694.1| GENE 72 71755 - 72819 592 354 aa, chain - ## HITS:1 COG:FN0319 KEGG:ns NR:ns ## COG: FN0319 COG3053 # Protein_GI_number: 19703664 # Func_class: C Energy production and conversion # Function: Citrate lyase synthetase # Organism: Fusobacterium nucleatum # 5 354 2 345 345 431 59.0 1e-120 MSDYYISQIRPDDKPANQQLDLLLTSEGIRRDANLDYTCGMFDEEMNLVAAGSCFGNTLR CLAVGRCHQGEGLMNQIMTHLMEIQVSRGNTHLFLYTKCSSARFFADLGFYEIARTDGQV VFMENRKNGFSSYLQRLSSESGESGRASSQERCAALVMNANPFTLGHLYLTERASAENDT VHLFMVSEDASLIPFKVRRRLVTEGTSHLDNIIYHDSGPYIISSATFPSYFQKDMDSVIE SHAKLDLVIFTKIAHALGIGRRYVGEEPHSRVTGIYNQIMEKELPKAGIQCSVIPRRQQD GAAISASTVRRAIKENHMELLPSLVPPSTLRYLLSREAAPVLCKIREAKDVIHY >gi|157101630|gb|DS480694.1| GENE 73 73020 - 74705 1751 561 aa, chain + ## HITS:1 COG:BH0567 KEGG:ns NR:ns ## COG: BH0567 COG0747 # Protein_GI_number: 15613130 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 72 533 52 512 539 134 23.0 6e-31 MKHTKGTAGRRAGRFLALALSAAMVLSSAGCAGKGGSQPSGQAESGQAAAGQTQSAAAGS GKIVNIGVTSSLNTLNPLLMDGVEMNKYATGLMFLPLVELDQNMEFEGMLADSVTAEDER NFLVHIDDKAVWSDGTPVTADDVVYTALRLCSPVIGNPAMMYYVFEGVGDNGFVEEGADH 
VEGITRVDDKTVRFTTKAPMSLITFNSSYARYLMPLPKHVIGDIDEKALTSAPWFSRPDV VSGPYKVTEFDRDHYVSYEANKDYWKGAPKIERLNIKIVEGSQLYAGLKSGEIDITQNTM STIPLEDYESVAALENVEVSYGDPITNQSVFINTARITDTKVRQAMLYAVDRNQLLEQLL KGNGEVADGFLSSASPYFDSTLVPVSYDPEKAKSLLAQSGWDQSKSIRFCIDSGDSTFVN AASVIAAQWAAVGIKADIQTMDINTLMSTASSGDFDVLSVQYTYPPVDPYADVAWLLGGQ GSWTGYTRPEVEEALSNVALTNDKDKLKELYGIVDRSVQEDVPMFSAYIIKTMAAANKRL TGVKPSVYGFFNHVEQWDVSQ >gi|157101630|gb|DS480694.1| GENE 74 74845 - 75825 990 326 aa, chain + ## HITS:1 COG:CAC3182 KEGG:ns NR:ns ## COG: CAC3182 COG0444 # Protein_GI_number: 15896430 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Clostridium acetobutylicum # 4 316 5 316 324 333 51.0 3e-91 MSHILEVTNLEVEFSGDGGKNISVDHVGFHVDSGEIVCIVGESGCGKSVTALSVMGLLGK GGKVTAGRVLFEGRDLLSMTERELDGVRGDRLTMIFQDPLTSLNPVFTIGSQMTESIRAH MGLGKDEARERALSLLKKVGLPDTLSVMKKYPHLLSGGMRQRVMIAMALSCNPSLLIADE PTTALDVTIQAQIMDLILELKEEMGMAVLLITHDMGLVAQMADRVLVMYAGQLIEQARVL ELFDHPAHPYTRALLRSVPGIRDEEDRRLESIEGIVPEHYDRIRGCRFARRCPFRTEICM KPQQEMSVGDSHIVRCCRAKEVLADG >gi|157101630|gb|DS480694.1| GENE 75 75818 - 76837 938 339 aa, chain + ## HITS:1 COG:BH0351 KEGG:ns NR:ns ## COG: BH0351 COG4608 # Protein_GI_number: 15612914 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Bacillus halodurans # 9 326 5 320 332 342 53.0 5e-94 MADTSRVPLLSAEGLKKYYTAKSGMFSRSAGVVRAVDGVSFQLFEGETLGLVGESGCGKS TLGRQLVGLEHPTGGRVVYDGHDLGHMSQAQMKPFRTQLQMIFQDSYSSLNPRKHVFEIL AEPMLYHGLAKRGDVYSHVERLLDMVGLPKNCLGRYPHEFSGGQRQRIGIARALSLNPRL LVCDEPVSALDVSIQAQILNLLRDLQKELGLTCIFIGHGLGAVNYVSDRIAVMYLGKMVE IADRKELFDHPLHPYSRALFEAVPIADPRKRRKRSEGIVGGETASNVNTPSGCPFHPRCP HCKGECMSREMELRPAVPGSSHLTACIRFKELEKEDGKV >gi|157101630|gb|DS480694.1| GENE 76 76834 - 77781 260 315 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus 
parasuis 29755] # 64 315 43 315 320 104 25 5e-21 MTKYILKRILIAIPTLLGITMIDYAIMCFAGSPLEMLQGARISQEAIAAKEAAMGLDKPF YIQYFIWLGQLLSGNMGYSVKSYEAVSAMIGSHLGPTLLLMGVSLIVSLIMSVPAGIYSA IHQYSKGDYAVVTASFFGSSIPGFFLSLLLVYLFTVNLGILPSGGMTTLGTAGGAADVAA HMVMPVLVLSVSMAGTNIRYIRSAMLEILQQDYLRTARAKGIGRRMVIYKHALRNALVPI VTVIGMQIPVLFGGAVIVEQLFSWPGLGLMTMSAILNRDYPVIMGVCLLSAVVVLAANLV TDILYAIVNPAIQYD >gi|157101630|gb|DS480694.1| GENE 77 77823 - 78776 905 317 aa, chain + ## HITS:1 COG:BS_appC KEGG:ns NR:ns ## COG: BS_appC COG1173 # Protein_GI_number: 16078205 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 17 293 29 302 303 223 44.0 4e-58 MKMNRKNPYGDIGRENYWQTVRRRFLHHRLACISLVVLAVICTAAILAPVIAPYDPDAIA GPFGAPPGPGFLLGTDQIGRDMFSRLLYATRISLLVGVLATAISTAIGVSLGLLGGYFGG WLDIAIMRFTDMVMSFPYILLVLVAAAIFEPGLWSIILILGFVDWPGIARLVRGNVLSLR ETNFVKSSIVAGMPARHILFSEILPNTVAPILVYATSVMALSMLDEASLSFLGMGVQPPT ASLGNMLNSAQSLTVLTKQPWLWIPPGLLIVVLVVAINFVGDALRDALDPSAVLNLGGAK QDTPEQDAAKQDTPDSL >gi|157101630|gb|DS480694.1| GENE 78 78810 - 79703 1021 297 aa, chain + ## HITS:1 COG:PA5218 KEGG:ns NR:ns ## COG: PA5218 COG0583 # Protein_GI_number: 15600411 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Pseudomonas aeruginosa # 1 242 2 243 304 76 23.0 7e-14 MTIHQIECFLEAARTLNFTEAANHLYISQQGLSRQIASLEKELELRLFDRTTRDVRLTRS GELLLWRWKDIPKEIYDSVDMAREEGERAKRRINLSVVGMSGIIEMAGNILADYMAFDPE VEFEINEFTNIKDITNGNPDLMMTVSFTPSYEQLKEKCGLMVIKKLPLYYVMSKKNPLAQ KEDVVMEDFKGETMLCLFKNFFAGAELRLFELIAKQEHVLQKARYYENVNSLELAIIANE GIHIGFKEFYHNYGERLVMHPMPNSRNQAYASVIIVWRKENEKRLESFIKFLKNNYN >gi|157101630|gb|DS480694.1| GENE 79 79935 - 81113 1392 392 aa, chain + ## HITS:1 COG:yegM KEGG:ns NR:ns ## COG: yegM COG0845 # Protein_GI_number: 16130014 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Escherichia coli K12 # 57 364 128 456 
464 84 24.0 5e-16 MKKARTRIIWGIVIVIIAVIVIARVFKKEEPVVPTPDPVVTVQTPHMGDILLTTDLVGTI EPADIVYIYPKAGGDVTAVNVKAGDVVAAGQVLLEIDTKQVDTSKNSMDAAKVAWDDAQS TLARMAPLHASGFVSDKEFEGYQTAAEAKRLQYESAKIAYENQMEFSHVTSPISGRVEQL NVEVHDTVGNSNQLCVIAGDSGNNVTFYVTEKVKNHLNPGDPITVIKEGTEYGGNVIEVS NMAEASVGLFKIKASVDDNETMAAGTSVELRVPSASARNAILIPNNCVYYKSGDAYVYTY DNGIIHEVPVEVGIYDSENIAVLSGITLDDQILTTWTSELKEGTKVTLEIEAPATIEMNG TETDAQIKMQVIQETDQTDKAESPAADQSQAQ >gi|157101630|gb|DS480694.1| GENE 80 81128 - 84181 3463 1017 aa, chain + ## HITS:1 COG:BH3816 KEGG:ns NR:ns ## COG: BH3816 COG0841 # Protein_GI_number: 15616378 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Bacillus halodurans # 3 1006 1 1006 1093 400 29.0 1e-111 MGLTKLVLRRPVSTVLAILCLIVFGLSSVMKSPLELMPDMNMSMMIVMTVYSGASPDDVS ELVTKPIEDRASTLSGLDTISSQSKENMSIVMLKYKYGTDMNDAYDDLKKQMDLAKVELP EDADDPIMLELNTSLKPNVIMAVSHGEDDDLYNYVNNEIVPEFEKLSTAAEVSITGGLQK YIKVELMPEKLKQYGVSMSSIASDIAGSDITYPAGDTQVGDQKLSVSTEQPFETMDSLND IPLTVSGGQTVYLSDVAKVYMGADDVESIARYKAENGLPEDIVALTVSKQQDAATLNVSK DVKRVVNELTAKDPSLQITIVDDDKDSIMSSLSSVIETMVLAIIISMVIIWLFFGDLKAS MIVGSSIPVSILSSLILMQLMGFSLNVITLSALVLGVGMMVDNSTVVLESCFRATDDTGF REFSKAALNGTGVVYQSVIGSTLTTCVVFLPLAMLGGMTGQMFRPLGFTIVFCMTASLIS AITVVPLCYMIYKPVEKKTAPLSRPVEKMQEAYRGIMTKLLNKRGLVMLASVALLLFSFF IAKFLRVEMMAEDDQGQISITAEVKPGMTIDKVDHIMKQIEDIISQQGDVESYITLAGSS GLSMSSDPNITVYLKKDRKMETDEVVRLWKQQLGGISDTNITVEAYSQVSAMMGSDDEYS VDLQSTNYDDLKAVSDQLAENLSKRPELTKVHSDLENAASVVKVTVNPIKARAAGLTPAQ IGGTLNNMLSGTTPTSLNIDGNDIDVKVEYPKERYKTLDQIETITLQTPKGSSVALTDVA DIHFADSPASITRQDKQYRATISGIYTENSDKNTKALLLSEVVQPAVSDNANVTIAQNQM DESMTEEFTALFQAIALAIFLVFVVMAAQFESPKFSIMVMTTIPFCLIGAFGLLWLADSA ISMTSLLGFLMLVGTVVNNGILYVDTVNQYRREMDLNTALVEAGATRLRPILMTTLTTVV SMVPMALALGDSGSTTQGLALVNIGGLTASTILSLLMLPAYYSLMNGGGKKRMIIAD >gi|157101630|gb|DS480694.1| GENE 81 84662 - 84970 412 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940083|ref|ZP_02087428.1| ## NR: gi|160940083|ref|ZP_02087428.1| hypothetical protein CLOBOL_04972 [Clostridium bolteae ATCC BAA-613] 
hypothetical protein CLOBOL_04972 [Clostridium bolteae ATCC BAA-613] # 1 102 41 142 142 81 99.0 2e-14 MAQETIDAIRQAETAAEAAEKEAVKEAEAIVAEAKAQAAQMKAEMTKSAREGAMGAEEDA KAQSERMMQAAGAEEGKDLEALQKAVAEKQQKAVEVILSELL >gi|157101630|gb|DS480694.1| GENE 82 84989 - 87004 2206 671 aa, chain + ## HITS:1 COG:FN1741 KEGG:ns NR:ns ## COG: FN1741 COG1269 # Protein_GI_number: 19705062 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit I # Organism: Fusobacterium nucleatum # 1 660 1 638 638 237 28.0 5e-62 MAVVPMKKVLICGLKKDRKGTLELLQRQGVLEISNVLQEDDMLGRMDVTSSKTVFERNAN IAEQAINILDRYAPEEKGMLSSFEGREVLSLDEYEANAGKHDEVMKKAYRLQELAKQIGE HSAAVPKLEQQMEALVPWRSFDLPLDFKGTKKTAAFIGSIQDEITLEQVTEQLGELAPQA ETIDVTIVSASKEQTCLFIVCAKTDAEAVEDALKKMNFVKPPLSSAVPAKRQQQLEEELA KEKAEIEKAEKAVAEMAPDRELIKFVMDYYTMRAEKYGVLNGLAQSRRVFFITGYVPENA APKLEKLLQEKYEVVVEYTEPGDEEDVPILLHNNKFAEPVEGVIESYSVPSKGEIDPSMI VALFYYVQFGLMLSDAAYGLIMVAGTAYCLTKFKNMEAGMKKFMKMFMYCGISTTFWGFM FGSFFGDAVNVIATTFFNRPDIRLAPLWFEPVSLPMKLLVFAFGLGILHLFIGLGIKFYS CVKNGSLADGIYDAIFWYMLVGGGIVYLLTMSMFTEMLGLTFTLPAVAGTVAAYAAAIGF VGIVLTSGRESKNWVKRILKGLYGAYGVSSYLSDILSYSRLLALGLATSVISTVFNKMGS MMGGSIPGAIIFILVFLIGHSLNLAINALGAYVHTNRLQYVEFFGKFYEGGGRKFEPFAV HTKYYKVKEDI >gi|157101630|gb|DS480694.1| GENE 83 87007 - 87489 603 160 aa, chain + ## HITS:1 COG:SP1321 KEGG:ns NR:ns ## COG: SP1321 COG0636 # Protein_GI_number: 15901175 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Streptococcus pneumoniae TIGR4 # 5 150 13 153 158 102 46.0 4e-22 MGNMGVALALLGAAIAALFAGVGSAIGVGIAGQAAAGVVTEDPNKFSKVLVLQLLPGTQG IYGLLIAFITLTQIGIMGGSADLSLVKGALYLAACLPMGIVGWISGASQGKAAAASIGLV AKRPEQFGKAMIFPAMVETYAILALLISILSIFGIAGLNI >gi|157101630|gb|DS480694.1| GENE 84 87532 - 88125 605 197 aa, chain + ## HITS:1 COG:no KEGG:TepRe1_2081 NR:ns ## KEGG: TepRe1_2081 # Name: not_defined # Def: V-type proton ATPase subunit E # Organism: Tepidanaerobacter_Re1 # Pathway: 
not_defined # 1 196 1 198 200 82 32.0 8e-15 MAGLDKIISQIKEESQKAAERTRAEARSKADEILAQARADAERECVDIERRSKQAVANIL ERGRTAAELKKRGAVLAEKQRLIGATIGMAKAELKGLETGAYIDMILKLAVKSAQTGEGE LLFSKKDLERLPEGFEDRLNAALKDKGAVLRISGDTRDIDGGFVLTYGGIEENCSIDALF DAAHEVLQDKVQEILFS >gi|157101630|gb|DS480694.1| GENE 85 88147 - 89115 995 322 aa, chain + ## HITS:1 COG:FN1738 KEGG:ns NR:ns ## COG: FN1738 COG1527 # Protein_GI_number: 19705059 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit C # Organism: Fusobacterium nucleatum # 1 322 1 334 334 85 25.0 2e-16 MADKQYVYAVARIRSKELSLLSGAFLEQLTAAKDYDECIQLLMEKGWGEDGMTNAADILA IEKRKTWELINELVEDMSVFDVFLYANDYHNLKAAIKEVRMGDEYPGIFMEQGTVDVKLI REAVQTREFQNLPAAMRTPAEEAYKALLHTQDGQLCDIIIDKAALDAICAAGKSSGNEFL ELYAELTVAAADIKTAVRASRTGKDRAFLEQALAPCGSINVARLAQAAIEGVDSIGSYLE TTAYADAVEELRRSPSAFERWCDNLLIRKIKPQQYSAFGLGPLAAYILARENEIKSVRIV LSGKLNHLPEESIRERIREMYV >gi|157101630|gb|DS480694.1| GENE 86 89108 - 89428 392 106 aa, chain + ## HITS:1 COG:FN1737 KEGG:ns NR:ns ## COG: FN1737 COG1436 # Protein_GI_number: 19705058 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit F # Organism: Fusobacterium nucleatum # 1 102 4 105 105 84 43.0 4e-17 MYKIAVMGDRDSIYGFASLGLEPFPLTDPAEAGKKVKDLAESGYAVIYITEALAAQIEPE INRYREAGLPAIILIPGISGNTGKGILAVKKSVEQAVGSDIIFNGQ >gi|157101630|gb|DS480694.1| GENE 87 89444 - 91219 2078 591 aa, chain + ## HITS:1 COG:MJ0217 KEGG:ns NR:ns ## COG: MJ0217 COG1155 # Protein_GI_number: 15668390 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit A # Organism: Methanococcus jannaschii # 4 583 12 592 594 738 61.0 0 MSKGVIKKVAGPLVIAEGMRDANMFDVVRVSEQRLIGEIIEIHGDKASVQVYEETSGLGP GEPVESTGVPMSVELGPGLIGSIYDGIQRPLDAIMEKTGGNLLNRGVEVPSLKREKKWTF VPAANAGDKVGGGDIIGTVQETDVVQQKIMIPVGVSGTLTYISGGEYTVTDTVAVVTDEK GQEHPVSLMQKWPVRKGRPYQRKLSPDMPLVTGQRVIDCLFPIAKGGVAAVPGPFGSGKT VVQHQLAKWADADIVVYIGCGERGNEMTDVLNEFPELKDPKTGKSLMERTVLIANTSDMP 
VAAREASIYTGITIAEYFRDMGYSVALMADSTSRWAEALREMSGRLEEMPGEEGYPAYLG SRLAQFYERAGRVISLGQDNREGALSVIGAVSPAGGDISEPVTQATLRIVKVFWSLDADL AYKRHFPAINWLTSYSLYVDTMEKWFNAEVESDWTELRARLMRLLQEESELNEIVQLVGM DALSAPDRLKLEAARSIREDFLHQNAFHEIDTFTSLKKQHMMMMLMLAYYDKAGEALAQG VNIERLVGLPVREAIGRFKYTEEADLDKVYEDVIKTLESEINSELSRKEDF >gi|157101630|gb|DS480694.1| GENE 88 91219 - 92592 1609 457 aa, chain + ## HITS:1 COG:FN1734 KEGG:ns NR:ns ## COG: FN1734 COG1156 # Protein_GI_number: 19705055 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit B # Organism: Fusobacterium nucleatum # 1 457 1 457 458 675 70.0 0 MPKEYRTIEEVAGPIMLVRGVQGVTYDEMGEIELVNGERRRCKVLEIDGGNAMVQLYEAS TGINLDSSKVRFLGRTMELGVSEDMLSRMFDGMGKPIDGGPDILPEKRMDINGLPMNPAA RSYPEEFIQTGVSAIDGLNTLVRGQKLPIFSASGLPHANLAAQIARQAKVRGTDESFAVV FAAMGITFEEANFFMESFRETGAIDRSVMFINLANDPAVERIATPRMALTAAEYLAFEKN MHVLVILTDITNYADSLREISAARKEVPGRRGYPGYMYTDLASLYERAGRQKGKNGSITM IPILTMPEDDKTHPIPDLTGYITEGQIILSRELYRKNVTPPIDVLPSLSRLKDKGIGEGK TRADHADVMNQLFAAYARGKDAKELMIILGEAALTDMDKLYAKFADEFEKEYVSQGYYAD RDIEETMEIGWKLLRILPRSELKRIKDKFLDLYYEAK >gi|157101630|gb|DS480694.1| GENE 89 92701 - 93384 1032 227 aa, chain + ## HITS:1 COG:FN1733 KEGG:ns NR:ns ## COG: FN1733 COG1394 # Protein_GI_number: 19705054 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit D # Organism: Fusobacterium nucleatum # 1 212 1 211 211 183 52.0 2e-46 MGAKHVNPTRMELTRLKKKLATAIRGHKLLKDKRDELMRQFLDLVRENKALREKVEAGIA AANQNFVLARSGMTDEALNVALMAPKQEVYLETETKNVMSVEIPVFKYKTRTSDPNDIYS YGFAFTSSDLDDAVKSLADLLPDMLRLAECEKSCQLMAAEIEKTRRRVNALEHVMIPDTQ SNIRYITMKLDENERSSQTRLMKVKDMMLEEAHHYSEREVVPVVDEV >gi|157101630|gb|DS480694.1| GENE 90 93548 - 94354 702 268 aa, chain + ## HITS:1 COG:BS_yneN KEGG:ns NR:ns ## COG: BS_yneN COG0526 # Protein_GI_number: 16078864 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: 
Bacillus subtilis # 130 261 33 166 170 103 36.0 5e-22 MRREHFYRNKKVVKRSAAGAALAIFLALQITGCSQTADQNKADRKTEQAQGTEKEGTAAE ESSAETEKNGTGTDKPKDTGNNESGPEKTREGGEAGEAGREPSPDGSQADAEAAADQGDG VIEFKPGMPIKEGVQAPDFTGELIDGTSITLSELQGKPVIINFWATWCGPCVKEMPAFER LKDDFGDKIGIIAVNCGDDAGTVKDFVEENGYTFPVVLDEEYSISMLYPTNSIPYTVVVD AEGRVTHISTGALDADTMYERYKEALGV >gi|157101630|gb|DS480694.1| GENE 91 94373 - 94522 222 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940095|ref|ZP_02087440.1| ## NR: gi|160940095|ref|ZP_02087440.1| hypothetical protein CLOBOL_04984 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_04984 [Clostridium bolteae ATCC BAA-613] # 1 49 1 49 49 84 100.0 2e-15 MMDWCRQNKVAVILLVLGIVFLCTGAAQGGYQDAFRKAVRVCLECIGIG >gi|157101630|gb|DS480694.1| GENE 92 94566 - 95399 645 277 aa, chain + ## HITS:1 COG:MJ0750 KEGG:ns NR:ns ## COG: MJ0750 COG0348 # Protein_GI_number: 15668931 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Methanococcus jannaschii # 72 275 49 238 238 79 28.0 1e-14 MKRRLIQLCAALLYNLNLKGFAEASIYKGRSKGLCVPGLNCYSCPGAVGSCPLGTLQNAL SAMDRKLPFYILGVLLLFGVTMGRTICGFLCPFGLVQELLYRIPVPKIKKNRVTQELSKV KYVILAVFVIILPVVFKAASGLPVPAFCKYICPAGTLEGGIPLVLLDDSLSGLAGALFHW KVLVLAVILVSACFLYRSFCRILCPLGAIYSLFAPVALLGVRIDRDTCVDCGRCVRTCKM DVRQVGDRECIQCGECMKECNVDAIYWKKPWDVKSKK >gi|157101630|gb|DS480694.1| GENE 93 95631 - 96629 600 332 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 17 324 28 328 329 235 40 2e-60 MTQTDIILEVKDLCVEFRTAEGTVQAVDHLSYVLHKGEKLGIVGESGSGKSVSSLGMMQL IPNPPGQITGGEILYLGRDLVKTSERDMQKIRGNEISMIFQEPMTSLNPIIKCGKQIAES LRLHRGMKKKEAMEEAIRMMKAVGIANPEVRAHEYPHQMSGGMRQRVMIAMALACQPQIL IADEPTTALDVTIQAQILDLIREMNEELHSSVLFITHDLGVVSELCDTVIVMYTGHIVEQ APAGELFRDPKHPYTIGLLNAIPVITKDRKPLSTIEGMVPNPTERIKGCSFWPRCPHATE RCRTVSPPMKQLSEERKVRCWLFEDQAAGKEA >gi|157101630|gb|DS480694.1| GENE 94 96633 - 97640 890 335 aa, chain + ## HITS:1 COG:BS_appF KEGG:ns NR:ns ## COG: BS_appF COG4608 # 
Protein_GI_number: 16078202 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Bacillus subtilis # 19 335 5 328 329 360 54.0 2e-99 MAQNPNRDINESINEDKNNREVLVKADHVKVYFKGKDKKSGTVKAVDDVSFHIMKGETFG VVGESGCGKSTLGRALVRLLKPTEGHIYMGGTDIAGLKGADLKKMRRKVQIIFQDPSACL NPRRTVRQILMEPFEIHGMKGKMDVEARIMKLLNLVGLDLYCLSRYPHELSGGQKQRIGI ARALALEPDIIICDEAVSALDVSVQAQVLNLLQELKEKLGLTYFFISHNLNVVYQVSDRV GVMYLGNMVEIADYDKLYEKRYHPYTEALLSAIPQVDQGVKTERIHLEGEVPSPSDPPSG CRFHTRCPKACDRCRQEVPRLKEVAEGHFVACHLY >gi|157101630|gb|DS480694.1| GENE 95 97768 - 99402 1970 544 aa, chain + ## HITS:1 COG:BMEII0217 KEGG:ns NR:ns ## COG: BMEII0217 COG0747 # Protein_GI_number: 17988561 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Brucella melitensis # 36 543 27 537 539 189 28.0 9e-48 MKKKLLTAALCLAVSFALSACGGNGADGTTAGKTDTATEGASQAAGSNTVVVAMGSGFST LDPGYVYEKYPPLIVNACYENLFKFYSNDSAPEPCLADTYEFSEDGLTLTVKLKQDVTFA SGNSMTSADVLFSINRCKNLQGNPSFICDTIESMEAPDDYTVVFHLTQPDSAILSKLTYS STAVLDSAVVKEHGGTDAEDASSADTAQSYLDTASAGSGMYVMTSYIPDQEVVLEKNPNY WGEATNVDKFIIKIQPDANTQMMTLSGGDIDVAMNMTDDTMSELEGAENISIINGATKTV GFVMMNMDEAYGGPVSNPQVQNAIRKALDYTGIQTICGSGTITPYDIIQVGFMGSKGERP ADYTNLEEAKALLAEAGYPDGFDVDLTVTDLDMEGILLTDLAQKIKDDLSQVGINVNIVS QPWAAGYGDAYRDGTLGFTVMYWGTDYNDPNVQLEFLPGGVVGKRAGWSADMDPELAAMY EKTMSATDNDARIAVLEEIQDAMYEHGPFIMVAQAPSHIGYNTRLEGVAISDPYALDLTL INVK >gi|157101630|gb|DS480694.1| GENE 96 99645 - 100664 1320 339 aa, chain + ## HITS:1 COG:BMEII0220 KEGG:ns NR:ns ## COG: BMEII0220 COG0601 # Protein_GI_number: 17988564 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Brucella melitensis # 9 337 24 352 354 268 42.0 1e-71 MEQLKFIGKRLVYLVVMLFGVATLVFILTKMIPGDPTVANLSQRALNDPEIVAAYRAKYG 
LDQPLPVQYILYMKNLLQFDLGTSMRTNKPVLSELARCYPATIELALFAIVIAAILGVLF GIISAIRRNSILDQTVRAISVTGVSIPSFWFALLVLYFFYYKLKLFPGPGRLSNAFTAPA TVTGMYVIDSLLEGNIPKALDAASHLILPGTVLAAFTMGLITRTARSNLLDVMSTDYIRT AKAKGLSRPGLIIRHALGNALIPVLTVIGLGLGNLLGGMVLVETIFNWPGVGQFAYESVL SVDFPSIIGVALLIALNYMVINTVVDILYGIIDPRVRCS >gi|157101630|gb|DS480694.1| GENE 97 100664 - 101500 1043 278 aa, chain + ## HITS:1 COG:PAB0093 KEGG:ns NR:ns ## COG: PAB0093 COG1173 # Protein_GI_number: 14520362 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Pyrococcus abyssi # 9 273 19 284 287 248 46.0 1e-65 MVQSIKRLFKSNYLFTLGVIICLAWIAAAILAPALAPYDPIVQDLGQRLKAPSSEHWFGT DNFGRDIFSRVLYGGRYSLLAGCLTVVIAGFIGTFYGAVAGYVGGWVDNVMMRFSEMILS FPSLILAMIINAVMGSNLFNTMFALIVVAWPTYARMMRSVVLSVKENEYVAASEVLGASR IRILMKEVIPNSISSVLIMATTDIGNQILMFSTLSFLGLGSAPPTPEWGMMVSDGADYFN KFWVAGFPGLAIFTMAVGANFIGDGLRDLLDPKLRKQF >gi|157101630|gb|DS480694.1| GENE 98 101518 - 102609 1286 363 aa, chain + ## HITS:1 COG:CAC2788 KEGG:ns NR:ns ## COG: CAC2788 COG0006 # Protein_GI_number: 15896043 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Clostridium acetobutylicum # 1 357 1 357 358 313 43.0 2e-85 MKQKRAERIMEALKEMGLRQMLIVDPMSIYYLTGVYVEPFERFYALYLREDGKHVYFLNK LFTVPEDVGVEKVWYSDTDPAAEIVAGYLDKESPLGVDKDLKARFLLPLMEMEAAAGFVN SSIAVDRTRGVKDEEEQDKMRTASDINDKAMAVFKTLIHEGVTERQVADQMLKIYMDLGA DGFSFEPLVAFGANAADPHHGPDGTVIKPGDSVLFDVGCIKDGYCSDMTRTFYFRKASDE HRRIYEIVRSANETAISKIRPGVPLCELDGAARDLIAEQGYGPFFTHRLGHFIGLGEHEF GDVSSVNTQKAEPGMIFSIEPGIYLPGDTGVRVEDLVLVTEDGCEVLNHYSKEFEIIGSS VTG >gi|157101630|gb|DS480694.1| GENE 99 102632 - 103204 619 190 aa, chain + ## HITS:1 COG:CAC2769 KEGG:ns NR:ns ## COG: CAC2769 COG0652 # Protein_GI_number: 15896024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # 
Organism: Clostridium acetobutylicum # 2 176 3 168 174 84 34.0 9e-17 MNPIATLHMANGRNIVIELLPESAPNTVNSFIYTASRGYLDHHAIERIVPGNWVDVSYTA FGKKECRYLIPNEFELNPDVVPLDSHPGAVCMGGYGEAGLAGCEFFFPLRDCPDHRGIYP VFGRVLEGMDELYRLEKVETVPVTDFPIEGVEVNRPVKPEIIERVELELHGEIYPEPVRV REPELPECWK >gi|157101630|gb|DS480694.1| GENE 100 103493 - 103699 324 68 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238917368|ref|YP_002930885.1| large subunit ribosomal protein L31 [Eubacterium eligens ATCC 27750] # 1 68 1 68 68 129 80 2e-28 MKEGIHPSYYQAKVVCNCGNEFVTGSTKQDIHVEVCSKCHSFYTGQQKAAQARGRIDKFN RKYGMNAN >gi|157101630|gb|DS480694.1| GENE 101 103808 - 104893 1078 361 aa, chain + ## HITS:1 COG:CAC2886 KEGG:ns NR:ns ## COG: CAC2886 COG3872 # Protein_GI_number: 15896140 # Func_class: R General function prediction only # Function: Predicted metal-dependent enzyme # Organism: Clostridium acetobutylicum # 2 294 3 296 317 245 41.0 8e-65 MKYSGIGGQAVIEGIMMQNGNDYAIAVRKPDGDIEVKKDTYLSMTKKHKVLGLPFVRGIF SFIDSMVVGMRALTWSCSFFEDDEEAEPGKFEAWLDKVFGEKLESALMTVVMVFSFVMAI GIFMVLPLFIANICRGFIHSDTVMAILEGVIRIVIFIAYIKLVSRMEDIRRTFMYHGSEH KCINCIEHGLELTVDNVRASSKEHKRCGTSFIMIVMVISILFFMVIRVDTVWLRVVSRIV LIPVIAGVSYEFLRFAGRHDSRLVNVLSRPGMWMQGLTTTEPDDSMIEVAIAAVETVFDW RAYLDENFPGWQKSRHYDGGNGNDESSENDKVSRNGKGNDSEGAVKHDHAAAVVAGCSGT E >gi|157101630|gb|DS480694.1| GENE 102 104847 - 105710 383 287 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|223485211|ref|YP_002587537.1| ribosomal protein L11 methyltransferase [Brevundimonas sp. 
BAL3] # 7 283 20 290 304 152 36 3e-35 MTMQQLLWQGVQELNKAGVPDPQLDARYLLLEVFHLNLASFLALKARELGKDEETEGKCR EFMRLIEARAGRTPLQHLTGTQEFMGFEFLVNEHVLIPRQDTETLVELVLEEQNDREKRV LDMCTGSGCIAISLALMGRYRHVAALDVSAEALKVAAGNRDRLLGGYEGGFELFESNMFS ALETDRTFDVIVSNPPYIPSRVIEGLAPEVRDHEPRIALDGSDDGLTFYRILAEEARNHL AEGGSIYMEIGYDQSEAVEGLFRSGGYRDVRTFQDLAGQDRVVRARR >gi|157101630|gb|DS480694.1| GENE 103 105853 - 106929 1439 358 aa, chain + ## HITS:1 COG:CAC2884 KEGG:ns NR:ns ## COG: CAC2884 COG0216 # Protein_GI_number: 15896138 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Clostridium acetobutylicum # 1 354 1 354 359 390 55.0 1e-108 MFDKLDDMLIHYEELMLMLGDPDVTQDTKRFTRLMKEQADLAPIVEAYKQYKQAKQDVED SLALLEEESDEEMREMAKEELSDARKRIEDLEHELKILLLPKDPNDDKNIILEIRAGAGG DEAALFAAELYRMYSNYADSQRWKVEIVSLNENGIGGFKEVVAMVTGKGAYSKLKYESGV HRVQRVPETESGGRIHTSTATVAVMPEAEEVDVQIDMNDCRIDVMRASGNGGQCVNTTDS AVRLTHMPTGIVIYSQTEKSQLQNKEKAFRLLRSKLYDIELEKRQSSEAEERRSQIGTGD RSEKIRTYNFPQGRVTDHRIKLTLYKIDSIMNGDIQELLDNLIAADQAAKLARMNEAV >gi|157101630|gb|DS480694.1| GENE 104 107359 - 107838 498 159 aa, chain + ## HITS:1 COG:BS_yhdM KEGG:ns NR:ns ## COG: BS_yhdM COG1595 # Protein_GI_number: 16078017 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 3 152 2 152 163 69 31.0 2e-12 MDSMEEIYMKHGKMIYGFLLTRTRDPGLAEELTQETFYQAVKHIGGYKGESSISTWLCAI AKNLWRDYLRRQKDHVPLDEAEQVSVESAESRALCLWDNVQILKLVHGLEDPMREVMYLR LVGNLTFGQIGEIMGRSENWARVTYYRGKERVVKEAGKL >gi|157101630|gb|DS480694.1| GENE 105 107835 - 108821 1085 328 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940113|ref|ZP_02087458.1| ## NR: gi|160940113|ref|ZP_02087458.1| hypothetical protein CLOBOL_05002 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05002 [Clostridium bolteae ATCC BAA-613] # 1 328 1 328 328 670 100.0 0 MNHRIPCEIIQDLMPMYADGLTSETTDREIRSHLEECETCREMYERMKAEMEGDASQTAG 
EPPEIDYLKKVRRKNVRNVVLAAAGVFLFMAAALFMKLFVIGYPTESYVLTYTDVNGEQV NVGGVMADSAAVYRGYKLVQEDGAKRLVIYSCLPSFWNRSGTFNLELALPGGGMDLNIQG ITIKSNGDLISSLANELYRARNPYIGDASADGRLSGTLGISRELGSFKNELQTSGEPYGW TLNFEESTSNSAVFEERMKAYACVLIALTDNLGQVSWNYTVELEQGPVLRHGTITVEECG EMVGAPVKTFADSPEGIEQLIKQLGIGR >gi|157101630|gb|DS480694.1| GENE 106 109065 - 109943 1040 292 aa, chain + ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 179 288 187 297 313 79 38.0 9e-15 MQSLSSFMNTIRFNFLYIDKYSFGRTWTYPESAIPYNMLRYIIDGSAEFVVDGETVIVRK GQVSYIPEGCWLSCKALEDTFAFYSIRFTTSVFYEGANFLREYYNFPLVMDVGEGIEPYF SDIYKWVRTDKKSKSFHVRGALDTLIASLIDILNSDEPDDVKAEINGLEYNLEQIRKRVK KSTVQTDPRIQTVIDYIMLNPTEEYTSDKLSGMAEVAETTFRRLFKEATGKTATEFIRQV RLTTAARLLLVSNDPVNCIAHDVGFEDANHFTRVFRQAFGMTPGRYRKMSQE >gi|157101630|gb|DS480694.1| GENE 107 109980 - 110537 551 185 aa, chain + ## HITS:1 COG:no KEGG:Clos_0809 NR:ns ## KEGG: Clos_0809 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 1 185 1 186 186 249 65.0 3e-65 MEQIKVWTKQHENVLKELNENGRYIARREYIRSDLQEHAGLVLEVYDWLVRHSPDAARKP GDVEYPVWVSFTSEAVMLPSPGAVILELTLEPDRITSVNIEKWGSILNYSYIPKDKADAR RHQDMMEQYGVSDAKAYMSQFYPHLKREIIASWDRLFDDSVILGNDSKYGNIWEIRKEWV TQVIR >gi|157101630|gb|DS480694.1| GENE 108 110513 - 111103 635 196 aa, chain + ## HITS:1 COG:no KEGG:Clos_0810 NR:ns ## KEGG: Clos_0810 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 1 190 1 191 191 224 53.0 1e-57 MGDTGDKVILYAAQADAVLKAIERDGRCFSREEYVRRKYGESGPIFLTVYRWFVKEAAKL VPKPEGAEFPYWSFMNLYSLDQSAGTRTLTLCVPRNEAVFFDMYDWNKILCLKYLGEDEA DELAFQAYLKQRGVREMDAVLTGFYPELKQKIMGSWPRLFRHHEQIRAGEESGAKSVQAA LWQIKKEWIVTETAGE >gi|157101630|gb|DS480694.1| GENE 109 111378 - 112778 1531 466 aa, chain + ## HITS:1 COG:PM1762 KEGG:ns NR:ns ## COG: PM1762 COG1653 # Protein_GI_number: 15603627 # 
Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 3 463 4 446 451 615 63.0 1e-176 MKKSHVSFILAAALAAALLAGCGSSGSSTEASGSKEAGSQSQAQDTGSQDGSGSGTEGAL SGEITFWHSFTQGPRLETIQKAADKFMEENPDVKINIETFSWNDFYTKWTTGLASGNVPD MSTALPAQVAEMINADALTPINDLIDDIGRDKFAEAAISEGTMDGNNYSVPLYRHAHVMW IRKDLLEKNGLEVPKTWDELYDTAKALTKDGVYGLSFPCGTNDFQATFFLDFYVRSGGGS LLTDDLKANLTSDLAIDGINYWLKVYNDCSPKDSINFDVLDQATLYYQGKTAFDFNSGFQ ISGVAANSPDLLQYVDCYPIPKINADDPDEGIMTTNTPMVVWKASKHPEICKAFMKTLYD EDTYVEFLHATPVGMLPAIKGIADTDAYKDDDTIRQFTHAEEVLTAAADIGTAFGYEHGP NVQAGIMQNNHVIEDMFQDIITNGTDVKTAAKTAEDKLNALFETAE >gi|157101630|gb|DS480694.1| GENE 110 112854 - 113735 972 293 aa, chain + ## HITS:1 COG:PM1761 KEGG:ns NR:ns ## COG: PM1761 COG1175 # Protein_GI_number: 15603626 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Pasteurella multocida # 5 290 7 292 294 375 72.0 1e-104 MGKKSYTRWLFVLPAIVIVIALFIYPICSSIVYSFTNKNLIKPAYKFIGFKNYAEVLRDS GFWLSFFNSLKWTVLSLVGQIAVGFTAALALKRIKVCKGVYKTLLIIPWAFPSIVIAFSW KWILNGVYGFLPNILVKLGICEAAPQFLTSGTLAFVSLVLINIWFGAPMIMVNVLSALET VPKDQYEAAQIDGARPWQSFYYVTVPHIKMVVGLLVVLRTVWIFNNFDLIYLITGGGPAG TTQTIPLYAYNMGWSTKLIGRSSAVTVLLFIFLMIICGIYFAVINKWEKEDSK >gi|157101630|gb|DS480694.1| GENE 111 113732 - 114562 1052 276 aa, chain + ## HITS:1 COG:PM1760 KEGG:ns NR:ns ## COG: PM1760 COG0395 # Protein_GI_number: 15603625 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Pasteurella multocida # 7 275 9 277 277 369 74.0 1e-102 MKKFKAGNVISHIYLIILTLISIFPLVWIIISSLKGKGELTGNPTAFWPETWTFDYYAHV INDLKFFVNIRNSVIISLVTTLIAITVAALAAYGIVRFFKRLGSAMSKVLVTTYIFPPIL LAIPYSVVMGKLGLINTRIGLVIVYLSFSVPYAVWLLTGFFRTVPLEIEEAARIDGANKM QTFVQVVLPLVAPGVVATAIYTFINTWNEFLYALILMNNTSKMTVAVALRSLDGAEILDW GDMMAASALVVLPSVIFFCLIQNKIAGGMSEGAVKS >gi|157101630|gb|DS480694.1| GENE 112 114598 - 115719 1266 
373 aa, chain + ## HITS:1 COG:SP1686 KEGG:ns NR:ns ## COG: SP1686 COG0673 # Protein_GI_number: 15901521 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 8 373 2 367 367 509 63.0 1e-144 MEAITGTIGYGVVGSGYFGAELARIMNTKPGAQIRAVYDPENGGPVAAELGCDAEDSLES LVSRKDIQAVIVATPNYLHKEPVLAAAEHGLHVFCEKPIALSFEDCREMVDACRENHVVF MAGHVMNFFRGVRKAKELIGQGVIGDILYCHGARNAWEGTGASETWKKTRSKSGGHLYHH IHELDCVQFLMGGCPETVTMAGGNAAHRDDRFGDEDDMLFITMEYPDRRYAVLEYGSAFR WQEHYLLIQGTRGAIKIDMCSCGMTLKTGEKEEHFLVHRTKEEDDDRTRIYRETERDNGN QFGRPGRKPFLWLQGIMDEEMEFLNNVLHGAPVTDEFLPLLTGEAARNSIATADACTLSL KEDRKVKLSEIMK >gi|157101630|gb|DS480694.1| GENE 113 115748 - 116863 1123 371 aa, chain + ## HITS:1 COG:SP1686 KEGG:ns NR:ns ## COG: SP1686 COG0673 # Protein_GI_number: 15901521 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 4 367 2 365 367 541 70.0 1e-154 MKTIGYAVVGTGYFGAELARIMKEQEGARVVAVYDPENAGTVAAELGCDTETDLDRLYSR EDVDAVIVASPNYLHKEPVIKAAEHGVHVFCEKPIALSYQDCVEMVEACVAHNVTFMAGH VMNFFKGVRTAKKMINDGVIGKVLYCHSARNGWEDIQPSVSWKKIRAKSGGHLYHHIHEL DCIQFLMGGCPEEVTMAGGNVAHCGEQFGDEDDMLFITMEYGDNRYAVLEYGSAFHWPEH YVLIQGTEGAIRLDMFNCGGTLKKGDKEEPFLMHKTQEEDDDRTRIYHGTEMDGAIMYGK PGKKPPMWLHSIMYDEMEYFNNIMHGAQPDEEFKALLTGEAARNAIATADACTRSRFENR KVKLSEITGVK >gi|157101630|gb|DS480694.1| GENE 114 116866 - 117789 1021 307 aa, chain + ## HITS:1 COG:SP1329 KEGG:ns NR:ns ## COG: SP1329 COG0329 # Protein_GI_number: 15901183 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Streptococcus pneumoniae TIGR4 # 3 306 4 304 305 391 63.0 1e-109 MKLDKYRGIIPAFYACYDRAGEVSGDGVRALTRYFVDQGVKGVYVGGSSGECIYQSVEDR KKTLEAVMDEAAGHLTVIAHVACNNTRDSRILAAHAQSLGVDGIAAIPPIYFKLPEYSAA GYWNDISEAAPDTDFIIYNIPQLAGTALTMSLLRTMLLNGRVAGVKNSSMPVQDIQMFKA 
EAMKTREDFIVFNGPDEQLVSGLAMGADGGIGGTYGAMPELFLKIYDLVSENRMAEARAI QYDADEIIYKLCSGTCNMYAMIKEVLRVSRGIDIGGVRKPLTDLCDGDRAIAGEAAAMIE AAKEKYC >gi|157101630|gb|DS480694.1| GENE 115 117791 - 118255 404 154 aa, chain + ## HITS:1 COG:SP1680 KEGG:ns NR:ns ## COG: SP1680 COG2731 # Protein_GI_number: 15901515 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Streptococcus pneumoniae TIGR4 # 1 149 1 149 150 78 30.0 6e-15 MIYDLLANIGNYRGMNRNLDKAIDYLMTVDINTLPLGKTEIDGDRVFLQVMEARTHELTD ESYEVHRDYMDIQIDIDGCEVIETALNGVTPIGEYRPDFQKAAACAGAGGRCVMGPGRFI LCMAGEAHGPGGCLEQPEDIKKCVVKVAVADERP >gi|157101630|gb|DS480694.1| GENE 116 118283 - 118987 671 234 aa, chain + ## HITS:1 COG:SP1330 KEGG:ns NR:ns ## COG: SP1330 COG3010 # Protein_GI_number: 15901184 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate epimerase # Organism: Streptococcus pneumoniae TIGR4 # 3 229 7 232 233 236 51.0 4e-62 MDKTELFNRIKGKLIISCQALPGEPLYDPERTLMPYMARAAREAGSPMIRANTVRDVVGI KRETGLPVIGLIKQVYEGYDGYITPTMREVDALAEAGADIIAVDCTDRKRGDGLSPWEYI GTIKERYPDQILMADISDYQEGIQAWKSGIDLVSTTMSGYTDYTPRLEGPDFELVRRLSE ELPIPVIGEGRVHTPEEAVKMLDMGAWAVIVGGAITRPLEIAGRFMKAVNGRTA >gi|157101630|gb|DS480694.1| GENE 117 119188 - 121167 1613 659 aa, chain + ## HITS:1 COG:CAC3280_2 KEGG:ns NR:ns ## COG: CAC3280_2 COG4886 # Protein_GI_number: 15896525 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Clostridium acetobutylicum # 303 653 26 334 367 72 23.0 3e-12 MKMVEIKCPSCGGKLKVEDSQTKLITCEYCGSQFLLDDEKVQHITNYNIYQTTPSGKDQG NGAKIALGAGLACAGVLLAAFILSSGSGSSSGPRTAPPSIPAYTRPAGYLGKEGEAGEEV REEETAPKADSPLFHAMTWEMFQKEPSEISPEDLAGVRYLKVETSLETDRVWYSFDDPYS EKEPVVQTASFGAMDWEPQDAAAFTGLVKLDLGSGLSRVRTLKGLKELKGITAGNVEISG IKDMLDDPAQITELQLKGITSMEGIASFENLERLTVRDYPDTNLKQLVPLKSLNYLSLED TVDSDSIISTDKDKIRVKDYSAISVMCGLKSLYLNSDIIKDIGFIKGLPALEDLTLEDTA VISLASLSEVPALRSLTLLGNNKVQDFGPVGTAPGLTSLSIDKMTSQPDPDLSSLSGLER 
LEIKGFMSISSIRGLSSLKELSIHNCNVDQADALSSLTGVERLTFYSVWNSQGNLRNLDF LKGMTGLKYADFTGNLDGTGWSGYNYLVEVYGDVSSIFNHQGLEELYLDNGKFEINFDKI KENPSLRVLGLRNMELHKNYYVESYGGMTSVWYDDVKLGEHMDILSRFPNLEELYLDSNE LADLNFAAGLKNLKRLSLKDNYITDLAPLKQAEFLEYLDIRDNPVGEVGDVGNGVEILQ >gi|157101630|gb|DS480694.1| GENE 118 121320 - 122219 1074 299 aa, chain + ## HITS:1 COG:STM2095 KEGG:ns NR:ns ## COG: STM2095 COG1209 # Protein_GI_number: 16765425 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Salmonella typhimurium LT2 # 2 288 5 291 292 421 69.0 1e-117 MKGIILAGGSGTRLYPLTKVTSKQLLPIYDKPMIYYPLSVLMNAGIRDILIISTPDDTPR FEALLGDGHQFGIELSYAVQPSPDGLAQAFIIGEEFIGSDSVAMVLGDNIFHGQGLAKRL RAAANKEKGATVFGYYVDDPERFGIVEFDGDGKAISIEEKPEKPKSNYCVTGLYFYDNRV VEYAKNLKPSARGELEITDLNRIYLEKGELDVILLGQGFTWLDTGTHESLVEATNFVKTV ETHQHRKIACLEEIAYLRGWITREQVMETYEILKKNQYGKYLKDVLDGKYVDKIHPQDF >gi|157101630|gb|DS480694.1| GENE 119 122235 - 123257 1354 340 aa, chain + ## HITS:1 COG:MTH1789 KEGG:ns NR:ns ## COG: MTH1789 COG1088 # Protein_GI_number: 15679777 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Methanothermobacter thermautotrophicus # 2 340 3 334 336 471 64.0 1e-133 MKIIVTGGAGFIGSNFVHHMVNKYPDYEIINLDLLTYAGNLENLKPVEDKPNYKFVKGDI ADRKFIFELFEKEKPDVVVNFAAESHVDRSITDPEAFVRTNVMGTTTLLDACRTYGIKRY HQVSTDEVYGDLPLDRPDLFFTEETPLHTSSPYSSSKASADLFVLAYHRTYGLPVTVSRC SNNYGPYHFPEKLIPLIISRALADEELPVYGTGENVRDWLHVADHCQAIDLVIHKGREGE VYNIGGHNERTNLEVVKTILKALDKPESLIKFVTDRPGHDMRYAIDPTKIETELGWKPTY NFDTGIEQTIRWYLDNQDWWKNILSGEYSNYFDKMYGSRL >gi|157101630|gb|DS480694.1| GENE 120 123373 - 124233 932 286 aa, chain + ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 1 281 1 279 280 236 42.0 5e-62 MKVLVTGVKGQLGHDVMDELALRGIEGFGVDVEEMDITDRTACETVISQEKPDAVIHCAA 
YTAVDAAEDNLELCRKINAEGTRNIARVCKAMDIKMMYISTDYVFNGGGERPWEPDDHRE PLNVYGLTKYEGEIAVEQNVQKYFIVRIAWVFGVNGKNFIKTMLRLGKEKGAVSVVNDQI GSPTYTYDLARLLVDMIQTDKYGRYHATNEGLCSWYEFACEIFRQAGMDEVKVTPVDSDG FPAKAKRPSNSRMSKEKLTENGFERLPSWQNALERYLKALKDNGMA >gi|157101630|gb|DS480694.1| GENE 121 124307 - 125659 1748 450 aa, chain + ## HITS:1 COG:SP1559 KEGG:ns NR:ns ## COG: SP1559 COG1109 # Protein_GI_number: 15901402 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Streptococcus pneumoniae TIGR4 # 1 444 1 444 450 416 49.0 1e-116 MGKYFGTDGFRGEANVDLTVEHAYKVGRFLGWYYGQKTPDERCRVVIGKDTRRSSYMFEY SLVAGLTASGADVYLLHVTTTPSVSYVVRTEGFNCGIMISASHNPFYDNGIKVINEKGEK LEESVILEVEKYLDGEMGEIPLAKREHIGRTVDFAAGRNRYIGYLISIATRSFKNMKVAL DCSNGSASAIAKNVFDALGAETHVMNNEPNGLNINTDCGSTHIEHLQKFVADEKCDIGFA YDGDADRCIAVDKNGRVVDGDSIMYICGKYMKEQGSLFNNTVVTTIMSNFGLYKAFDREG IAYVKTAVGDKYVYENMAATGHCLGGEQSGHIIFSKHATTGDGILTSLKVMEVVLEKKQP LDKLASEVEIYPQVLKNVRVKDKKTAQDDEAVQAEVARVTRSLGSDGRILLRQSGTEPVV RVMVEAPDIQTCETYVDQVIQVMKDRGHCI >gi|157101630|gb|DS480694.1| GENE 122 126173 - 127840 1781 555 aa, chain + ## HITS:1 COG:CAC0273 KEGG:ns NR:ns ## COG: CAC0273 COG0119 # Protein_GI_number: 15893565 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Clostridium acetobutylicum # 4 552 3 554 558 641 56.0 0 MLNYQRYKRVPVVDFPERTWPNKEIMKAPIWCSVDLRDGNQALVEPMVVEEKVEMFNLLV KLGFKEIEVGFPAASQIEYDFLRTLVERKLIPDDVEVQVLVQCREHLIKRTFEALEGVKK AIVHIYNSTSTLQRDVVFHKDREDIKDIAVIGTKMVQRYMADCDTQIRLEYSPESFTGTE LDFALDICTAVQETWGATRENPIIINLPSTVEMTTPNVYADQIEWMNTHFKNRESIILSI HPHNDRGEGVASTELALLAGADRVEGTLFGNGERTGNVDILTVAYNMFSQGINPELNIEN IREIADVCERCTKMDIPPRHPYAGKLVFTAFSGSHQDAINKGMQALHERNGQYWEVPYLP IDPSDIGREYEPIVRINSQSGKGGVAFVMDTFYGFKLPKGMHKEFADVIQKISEQQGEVA PEQIMEEFKKNYLERKEPMHFRKCRITDTETGDGEFATLAKVTYSDHGIEKTFEGVGNGP IDAVQRGMEDALGIQIKVLDYSEHALASGSGAEAASYIHLVDQVTGKATYGVGISSNITR ASLRGIFSAVNRLFY >gi|157101630|gb|DS480694.1| GENE 123 128255 - 129400 775 381 aa, chain + 
## HITS:1 COG:PH1149 KEGG:ns NR:ns ## COG: PH1149 COG0006 # Protein_GI_number: 14590977 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Pyrococcus horikoshii # 11 376 3 345 351 111 23.0 2e-24 MGGYTGIPVPIFEKRIQIIAQYVRGNHLDGAFIFSDEYRAGFTTYFSDYKPINVIEESPQ GVFVTPEGKVTLFLGAINSQSARRVSWIADIRNIETLSDYFKGLRDKKGGPLLIGLSGQD LLPVKYFKRLNGWLVGNDYVDMDDKLTEMRAVKEPEEIELARYAAHIGDMSLKACVEKMR ESEVTEIDLCATAEYVMKSNGCDLSCATVLSSGMNTEVPTWRPTHKRIEDGEAVIIDVAP AYKGYATDVAVTVINGKATDGQKAILKASRDAVVQSIECLRPGEPASRIYEMFLHYARKN HVEEYFEPYAKGLRAVGHSIGLDCVERPNLDDDASFIMVPGMTLATKFDLHGMEWGGLRI ETVMLVEENACVSLNRYLYDL >gi|157101630|gb|DS480694.1| GENE 124 129418 - 130536 418 372 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149195933|ref|ZP_01872989.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 53 358 27 327 340 165 32 2e-39 MTRRFLWGLLAVVMAFGTAGCGTKETPAPATVSGQPAGAEKTGGEKNAGGEAVTETSASQ SAKDDGKTYTIRIAIVSSDSHLHNTVLNEWAQEAKEKTNGRLILKVMGSSQLGGERDYVE GMQLGNIEMAQVSSAAVNGFLNDFSLLSFPYFFKDYAEMEKVFNGPMGDELFAELEGIGI KGLTWFSNGFRNVYTKNTPVTTPADMKGLKIRVMESDVMIATLNAMGASGTPMAYSELYS AIQQGVMDGAENALGNIYSDGYYEICKNVSLTEHFAPPGVVAISQKSYDSLPDDLKEYLT ESAIRFGKMEREMDEKLQEEMKKKLEEKGVQINEVDKQSFIDATASVYTEYAGGISDTVK VLAEEELGKSFN >gi|157101630|gb|DS480694.1| GENE 125 130553 - 131026 198 157 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 1 147 5 150 164 80 31 7e-14 MKILIKAADFVVAFLLTIAASSMIISMFLQVVFRFVFNSPLYWTEELSRYSFIYIVFIGA AWAGKQGMHLGVDYFTSKLPEQAVRRLAVIIDLLVLVFSAVIVIVGAQVIPINFKQFSPA LNVPMGAVYAAIPMGFLLLFIYYLDHLMEDLGMRRQL >gi|157101630|gb|DS480694.1| GENE 126 131056 - 132330 804 424 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 2 419 5 426 435 314 38 4e-84 MLVILASIFIILLVLGMPIAFTMGIACLGTVVYSQMPLNMLITRMFSATDSFSLMAVPFF ILAGELMNEADLTDRILNLARALVGFLRGGLAIVNILASVLFAGLSGSATADTAALGSLE 
IPMMVKDGYSKEFSVAVTVASSTIGPIIPPSVMLVMYAVIASVNITKILIAGIVPGILMA LAMSVAVYFISLKRGYGTAGTFSLKNIGKAAKQAVIPMLMPIIIMGGILSGIFTATEAGV VAVVYAFIIGIFVYKTIKLKDIPRILVKGAATTAVSLFIIAMASIFSWFLAWESFPETVV NVMQALTSNGTVALCMVILFLFVLGLFVEGIPVLIVFAPILVPAMEAYGIDTLYFGIVLV LTVLVGSITPPVGSLLYLGSSIAKTTVSKAGKEVWIFVAMIMSVIGLLVIFPQIVLFLPD LLYN >gi|157101630|gb|DS480694.1| GENE 127 132359 - 133045 720 228 aa, chain + ## HITS:1 COG:BH1819 KEGG:ns NR:ns ## COG: BH1819 COG1414 # Protein_GI_number: 15614382 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 2 228 5 250 252 73 26.0 2e-13 MNRTVLRTVEILEIISKHGEISLADLVKITGYPKTSVYDILHALEERAMIYRCTDKVCYG IGFRAYAIGRSYSKNSDLLSNSYQCMKALAEDIGKAVLLGKIDGEKILYIAKCEPKHPVV MTPSIGDEEPIKNTAFAKIAEVFTHHEKKICQLSKEEKNRIRTEKLASYGLDGPGPFYNI ASPIYNFETRLSGVVGVFGVKEVGVDYSEELKKVRECAGKISERLGFI >gi|157101630|gb|DS480694.1| GENE 128 133187 - 134818 1613 543 aa, chain - ## HITS:1 COG:no KEGG:CLK_0336 NR:ns ## KEGG: CLK_0336 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 37 542 34 533 535 405 42.0 1e-111 MLNLALTAIFFILIIFQGPNLNPLYPDGAFSWCFIISCYVVLNFFVALGSVRVGANEYGR PQISIDKSVRGYKRGFVLLAILWVLYFGVSILSMPLFNYKAYRDQLGVPAVSEFTDTVQP LALDQLPIVDKSLARELADKKVGENPGLGSQVILGTPVIQKVDGKLVWVVPLEHSGFFKW LKNMDGSAGYIVVSASDVSDVTLVTDYKIKYQANAYLLDDLNRHVRFFGGGLFKGLTDYS FELDDDGVPYWVITSYQNRWLFNLPEADGVLTVNATTGETTHYTIATLPDWVDRVQPENF VMQQIQNQGVYVHGIFNFSNQDKFRPSQGDIIIYNGARCYLFTGLTSVGSDESAIGFILV DMVTKESNIYQMSGATEMAAQSSAEGKVQQFGYRASFPMIINLDGQPTYFMTLKDNAGLI KQYAFVSVSNYTNVGTGETIDSALRNFRQVRGNAGVNITTGQTISKEEGTVLRIASETID NTVTYKLILEEKQDYIFTVSYDLSNELALTQPGDKVALEYMDDPSGICSASSFDNLEFDQ SGQ >gi|157101630|gb|DS480694.1| GENE 129 135025 - 135813 508 262 aa, chain + ## HITS:1 COG:no KEGG:Closa_0158 NR:ns ## KEGG: Closa_0158 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 13 257 11 250 251 180 46.0 4e-44 MPFSRKKPEIQVGLSGTALKYIAMATMLADHIGAVLLERIVYYRGNLEQVAMLMTSQWGD 
TLYWLWRMLRTVGRIAFPIYCFLLVEGFLHTRDWRRYWLRMAAFALISEIPFRLAVWNTW AGGSSNVYVELAIGLLVLRGLKQSEGLTQPRRAAGMAAVIGGGCLAAVFLKADYDMDGIL IISLFYLLREKRIVQVMSGGILSLLESWNICYGAGILSAVPLYFYNGKKGQAPWKYAFYW FYPLHLMVLFFIRLYVVGIPLG >gi|157101630|gb|DS480694.1| GENE 130 135912 - 136637 684 241 aa, chain + ## HITS:1 COG:BH0873 KEGG:ns NR:ns ## COG: BH0873 COG2188 # Protein_GI_number: 15613436 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 5 241 4 237 237 157 37.0 2e-38 MAKTKYNEIYMALREKMEDGTYGFQELLPSENTLVKEFDCSRNTIRRAIGQLASEGYVQS IHGKGVRIIYQPGQQQSEFILGGIESLKEAVARNRKNYTTRVVCFAELTVDETIQRRTTF PVGAQIYYIQRVRCIEGEALIIDHNYFLKDVVRDLTPEIAERSIYEYMEKDLGESIVTTK RKMTVERVNQIDETYLDLKDYNCMAVVSSMTYNKEGVMFEFTQSRHRPDRFVFYDLAHRT K >gi|157101630|gb|DS480694.1| GENE 131 136834 - 138777 1979 647 aa, chain + ## HITS:1 COG:lin1223_2 KEGG:ns NR:ns ## COG: lin1223_2 COG1263 # Protein_GI_number: 16800292 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Listeria innocua # 92 461 1 370 403 489 66.0 1e-138 MGKFTGDSKKLLEYVGGRENIAAVSHCVTRMRFVLNDPSKADTEAIESLSSAKGTFTQAG QFQVIIGNEVSTFYNDFAEVAGIEGVSKEAVKKAAMQNQTLIQRIMSSLAEIFAPLIPAI IVGGLILGFRNIIGEIALFDGGTKTLTQISVFWAGINSFLWLIGEAVFHFLPVCIVWSIT KKMGTTQALGIVLGITLVSPQLLNAYSVATTAAAEIPKWNFGLFQVDMIGYQAQVLPAVL AGFTLVYLERFFRKICPAMISMIVVPFCSLLISVAVAHGILGPIGWKIGAVISDCVNAGL TSPFKGVFAGIFGFFYAPLVITGLHHMSNAIDLQLIADFGGTTLWPMIALSNIAQGSAVL AMILLQKNDAKAKEVSVPACISCYLGVTEPAMFGVNLKYGFPFICGMAGSAVGAVLSVVS GIKANAIGVGGIPGILSIQPQYMGLFAAAMGLVILIPFCLTYAIGRKKGIGGGRENQNES GHEASWEERKDSASQPIRELKAYVSGEVIPMEDVPDQVFASRALGDGVAIKPEDGVLRAP ADAMVAVVMEESLHACGLVLDNGMELLLHIGLDTVDMKGDGFRAFVKPGDRVRAGEELIA FDREKIAGAGHSDMVIMVLTNPGNSGAISWQTGRRVRALADAVARID >gi|157101630|gb|DS480694.1| GENE 132 138843 - 140489 1458 548 aa, chain + ## HITS:1 COG:lin1222 KEGG:ns NR:ns ## COG: lin1222 COG0366 # Protein_GI_number: 16800291 # Func_class: G Carbohydrate transport and 
metabolism # Function: Glycosidases # Organism: Listeria innocua # 1 546 1 548 548 603 51.0 1e-172 MGHFKDSTVYQIYTKSFQDTTGNGLGDIRGVISRLDYLKELGIDYIWLTPFFSSPLNDNG YDVADYRAVDPVFGTMKDVEELIREADIRGMGCMFDMVFNHTSTEHPWFKKALEGNPEYM DYYIFREGSPDAPPTNWQSKFGGNAWEYVPRLKKWYLHLFDVTQADLNWDNPRVRQELKE VILFWKAKGVKGFRFDVVNLISKPDIFEDDHKGDGRRFYTDGPRVHEYLQELVRDTGIED MVTVGEMSSTTLDNCIRYSRPEDRELSMCFNFHHLKVDYRDGDKWKLTEPDLVRLKQLFM EWQEGMERENGWNAVFWCNHDQPRAISRFGDDGRYWKESAKMLATVIHMFRGTPYVYQGE ELGMTNPHYGDISQYRDVESINYYRILTERGITGEEALKVLAARSRDNGRTPMQWDGSRY AGFSQGEPWIGIPENHSYINAQAEAADSGSILNYYKRLIALRKEYKVISEGEIEFIYREH PQVLAYRRTYQGQELVVLANMKASAADLGEKPELEGCRQLIGNYGEADCGQTLEQLKPYE CRVYIRTL >gi|157101630|gb|DS480694.1| GENE 133 140589 - 143423 2607 944 aa, chain + ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 517 937 260 663 676 172 32.0 3e-42 MGRTEGGRKFRPEKGMASMQDYFTDKGKTDILRSTRTGLWEIILCQGREPVMHADSVMLE LLGLDAAPAPEECYQVWYNRIQDEYKGAVQETVSKAIASTRAEVQEVQYKWNHPLWGPIS VRCGGMKDQEWEDGIRLRGYHQNITDTIMFQKEYDAVIQTLSGRYRGILLCDLRDGSYKI IKAAEDLKDNLTKYTDFREFLRGYSRSCIRPEFRNRIERFAEYEYIQEQFLSGEKQIEEL YRIRSGSWQRVMAIPFAPESDLKARAILAFDIQDGEVEKRMDEVTARVAVSTIYTLVLSL DPTSQEYSCLHYMGDRLKIAPKGMLSELLSQVMPSLPREDKEKLEQICSPDSYRDTESLE GILRIRDRDRKLHYYRYYGVPIHMELGDRILITGRNIDARQEVELRESVLTNLCQCYYSI YLFDLEHDIEEAIWQEEIIRRRQEFPKGSMAVYYEKFVRCHVYEEDQEKMRRVGSPEFLR TNLTPEQPVYDVDFRRVYPDGIRWVRSRFSIAETADGVVTKVIFANMGIDEQKRKELEEE AENKKSLFAAYESATMANEAKSNFLARMSHDIRTPMNAIIGMTALAASHTDQPERVRDCL EKISISSSHLLSLINEILDMSRIEKGKLELLEEPFHLETMLDNIYSIIKPTAMEKNHDIT FSRKGVIHESLIGDANRVKQVLLNLITNAVKYTPDNGIIRVTTEEVPMGKAGWACYRFVV EDNGIGMSEEYMKQVFEPFSRAVVPVVQEQQGTGLGMSIAHGIVSTMQGDIHVESREGEG SSFTVTLNFRIQGKEQERAGGMRDCIRDDDMDCASLAGRRILLVEDNALNMEIARAILCE HGLEVDGAENGQEAYERFTSSAPGTYEAVLMDLQMPVMDGCTAARMIRASSHPQARTIPI IALTANAFAEDVAKALTSGMNYHIAKPIDFHQLFHALKQFMTTS >gi|157101630|gb|DS480694.1| GENE 134 143643 - 
144743 1353 366 aa, chain + ## HITS:1 COG:Cj1439c KEGG:ns NR:ns ## COG: Cj1439c COG0562 # Protein_GI_number: 15792757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-galactopyranose mutase # Organism: Campylobacter jejuni # 4 358 2 356 368 456 60.0 1e-128 MKKYDYILVGSGLYAGVFAWYARKNKKRCLVVEKRNHIGGNVYCEDVEGIHVHKYGAHIF HTGNRKVWEFVNSLAEFNRYTNSPVANYKGQMYNMPFNMNTFSRMWGISTPDEAKAIIEK QRAEITGEPKNLEEQAIRLVGRELYEKLIKGYTEKQWGRDCKELPAFIIKRIPVRYIYDN NYFNDPYQGIPIGGYNVIVEKLFEGCDIETSADYLENREHYDSLGETVVYTGTIDAFYGY RFGKLEYRSLRFESQVLDRENHQGVAVVNYTDRDTPYTRVIEHKHFEFGTQPKTVITREY PVSWQEGMEPYYPVNDQKNQELYQRYEELARAESHVLFGGRLGEYKYYDMDKVIESAMNR AEEIFG >gi|157101630|gb|DS480694.1| GENE 135 144777 - 145874 982 365 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940149|ref|ZP_02087494.1| ## NR: gi|160940149|ref|ZP_02087494.1| hypothetical protein CLOBOL_05038 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05038 [Clostridium bolteae ATCC BAA-613] # 1 365 1 365 365 688 100.0 0 MTKTQDRQVYLDVAKGMGMLGVVASHCLVETSLGILSDWYGFFMLAVFYVYTGWRYALRY GKERTGISSREMAGKRLISLGIPYVCYSVLFIFSRIFLVWPQQYTFLVLMSDIYYTGTLV GLETLWFLPSMFIAELLFNMIYGSRRKMGIAACLAGAASVCLILYINGSRQDTTLWRVIH LPIMVYIKGLVGFLLALGGTFACDAWQKASMYLKGRAGLAAAFLLMCLGIVLTVLIPGCD FNFLTMENPVSWILTAWFSSVTILMFGERVSACGGSWKEWRIFSRVLVPFFTYYGTHSLT VMCTHLVPVIAFFKVAAGRMMYPGVLGSTPWDILLFALVLAVDPGVVRLIETRLPWMNGR KKRSK >gi|157101630|gb|DS480694.1| GENE 136 145871 - 146908 880 345 aa, chain + ## HITS:1 COG:DRA0037 KEGG:ns NR:ns ## COG: DRA0037 COG0463 # Protein_GI_number: 15807707 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Deinococcus radiodurans # 4 331 10 302 328 135 32.0 1e-31 MITVLMASYNGSRYIRQQLDSILAQDEEDVRILVSDDCSSDGSRELLKEYEASRGDRVLV LLRKEPSGGAAVHFLKLLKLMADAAHGPEEQPQHGSLLPEYEMTKSQLAHLGRLAGADYF MLSDQDDVWMPQKARMLSDKMKEMERKPGRGNVPLLVHSDLTVANEALRPIADSFFRYQK ISPERTSLPQLLVQNNVTGGAVMVSRSMLPYLKEIPGVCLMHDAWLALIASCFGEIGWVD 
RPLYYYRQHGGNTLGAEKGDSLDSVRTRVKDGSSARENYRRMFGQADSFYHMFYSRLNRE QQETLEAFIRLPRCGRLGKMGLIVKYGFTKNTLVRTLGQMLFIGD >gi|157101630|gb|DS480694.1| GENE 137 146987 - 148438 1434 483 aa, chain + ## HITS:1 COG:XF2362 KEGG:ns NR:ns ## COG: XF2362 COG2244 # Protein_GI_number: 15838953 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Xylella fastidiosa 9a5c # 8 334 18 344 510 156 31.0 9e-38 MNEKQTTTKVLNGLFWKLMENGGAQGVQFLVSIILARLLSPEEYGVVGVILIFVTIANVL VQNGFSTALIQKRKVDDTDFSSVFFFNMAVSAVIYLVLFLAAPGIAYFYRNQEMTALVRV LAVVLFPGGVISIQNAYVSRNMEFKGLFISSFVASMISGAISIFLACSGLGVWALVWQQI AYYFFYMLILFMSISWKPRLLFSILRIKTMFAFGWKLLCASLLDTVYNNIYGLVIGRIYN ESMLGNYNRGEQFPKLIVSNLGAAIQSVMLPVLSASQDEPERVKSMLRRAITVSSYLVLP MMAGLIAVARPMVLLLLGEKWLACVPFLQIMCVAYSFWPIHIANLQALNAMGRSDIFLKL EIVKKMVGLAVLAVGIRYNPLVLVALKAAADFLCTFINAWPNKRLLNYSIIEQWKDIIPS VAVSILMAAAVMAAGRYVPGGWLGLGMQILFGAVVYMLASWVLGLEVFRYIRGLAVDRLP GRK >gi|157101630|gb|DS480694.1| GENE 138 148515 - 149480 1029 321 aa, chain + ## HITS:1 COG:CAC2189 KEGG:ns NR:ns ## COG: CAC2189 COG0458 # Protein_GI_number: 15895458 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Clostridium acetobutylicum # 4 284 3 281 315 103 28.0 6e-22 MRTVVVTAIGSFSADIVIKKCRENGMRVIGCDVYPGEWIADAGNVDAFYQVPYASDTDHY IETMLSICRKEGAKAVIPLTDAEIDVFNLHRREFGEINAALCMSDKACIGLCRDKMELYR YLEEHMEGTVIPTVRLEDADLDEISYPAVCKPCNGRSSQGLRFVGSRREMEGFLGETDPA GYIVQPMIKGNVVTVDVVRSPEQGTCATVCRRELLRTLNGAGTSVLVFRNRQLEERCRAI AGLLGVRGCVNMEFIEDRDGVYHMLECNPRFSGGVEFSCLAGYDCVTNHLRCFEGREIER DDRIKSMYIARKYEEYITKVE >gi|157101630|gb|DS480694.1| GENE 139 149500 - 150507 1036 335 aa, chain + ## HITS:1 COG:MT1566 KEGG:ns NR:ns ## COG: MT1566 COG0463 # Protein_GI_number: 15840982 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: 
Mycobacterium tuberculosis CDC1551 # 7 235 32 254 373 128 36.0 1e-29 MGDRVEGGTGPQTADTAIMASINCVTFNHADYIRTALDSFLMQKTDFAFEILVHDDASTD GTSSIIREYAARYPDQVKPLIQTENQYSQGIDNISGAFNFPRARGKYIFMCDGDDYFLSP DKLQKQVDYMEAHPDCTLCIHSAKIDLVGRALTEGQMRPYRESRVLPPEEIIDKPSGYAM SSMAFPARLVRELPDYYVDCPVGDTPIQLIAAANGYGYYFDDAMSAYRVGVAGSWTVEGK NGDYQGKQRAYCDRMKRTYEQFDRATGGRFKEAAESAARRTYYQTMVNTKQFEEIFNPQY RRYYKELTPRMRFFLRFEHTMPVVYEMARKTFTGK >gi|157101630|gb|DS480694.1| GENE 140 150584 - 152143 1281 519 aa, chain - ## HITS:1 COG:no KEGG:Closa_3823 NR:ns ## KEGG: Closa_3823 # Name: not_defined # Def: ErfK/YbiS/YcfS/YnhG family protein # Organism: C.saccharolyticum # Pathway: not_defined # 125 517 94 483 483 484 60.0 1e-135 MNKLMKRAAAALLSGLLVLGQAGPALAAQDDSNYGPAFSTKVEQTSPGSLDGQSGDGQGS GPQAVAPANAAPAEQADGAGAAAGSTEAAAGTAGTDGVTSAATDQNAAQAADDSAQADAE AAAAAQFAAEEAARQASANTPFLQIQVLRPDTTWTDPVIGDTPVVSEQGFRSMSVYLNNI VGDILYRTYTSAHGWSDWAMNGGHTTVWEDGALVEAIQMRFNGFVGNTFDIYYCTTLNDG TELNWARNSATAGTMGTGKVLSSFRVSLWGKGVEGASYNMEKPLEAAFPDGIQVVDGAVA YSSGNGVPFTGWAWNDRDRYYFVNNAPVTGWQYIDGFKYYFDETGKLLTDLEPIVGNSGP FLISINKQMNCMTIFAQDGANGFIIPVKTYLTSTGPDTPIGTFQTPAKYRWRDMNHGIFT QYATRIYKGFLIHSILYSRPDPMTLDPLTYNYLGIAESAGCVRLLSGDAKWVYDNCALGT TVTIYNSPKPGPYDRPAIEWVIPGDQHWDPTDPLFAQQQ >gi|157101630|gb|DS480694.1| GENE 141 152309 - 153328 926 339 aa, chain + ## HITS:1 COG:BS_gspA KEGG:ns NR:ns ## COG: BS_gspA COG1442 # Protein_GI_number: 16080894 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases # Organism: Bacillus subtilis # 6 288 5 282 286 115 25.0 2e-25 MDWNTETVNIIYASNDGYAGHLAASLYSLLDHNRNIPRMDIYILSVGMSEAYLERLAGIA GKFCRSLHVAELGNLRERFDYAVDTRGFDISAMGRLFAPKVLPDSVRKALYLDCDTIVNG SIRPLYETELGNHLAGMVMEPTVYREMKESIGMEKDEPYYNSGVLLIDLEAWRNQDVLGQ LLEFYRVHQGSLFACDQDTINGALRGRIMTLPVRYNYFTNYRYFRYRTLVSMCGAYRAVG EDGYRQAGKAPVIIHYLGDERPWIAGNHNHFRRLYEYYLDKTPWKGTPKQEGKRLYMHMW WVFNHASLICPAFRLWISRRLGMRLVDSRRKQKAFGSGK >gi|157101630|gb|DS480694.1| GENE 142 153345 - 154226 892 293 aa, 
chain + ## HITS:1 COG:CAC2321 KEGG:ns NR:ns ## COG: CAC2321 COG1216 # Protein_GI_number: 15895588 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Clostridium acetobutylicum # 1 216 8 219 298 62 26.0 1e-09 MIILNYNDSYTTLSLVDEVKDYECLDSVVVVDNHSSDDSWKRLQTLNGSGKVHALRLEQN GGYGMGNQEGINYAVSCLEADYVIIANPDIHVTPRCIRRVKDALDKTQGAVAASARVKDP MGGELFSYWTLLPLWKDLLDTGLVTRRLFKAMLNTPSYRLAYAGDEDCRLVDAVPGSFFM LKTGILTPGEIKEVFDKHIFLYYEEKVLGQKFRKMGLKTVLVTDESYVHAHSVSIDKSFK RIVDKQRLLHRSKLYYYKEYLGTGPAGMAAARAVLGLVLAEVWFLTVVCRMRW >gi|157101630|gb|DS480694.1| GENE 143 154383 - 155498 1251 371 aa, chain + ## HITS:1 COG:CAC2350 KEGG:ns NR:ns ## COG: CAC2350 COG0399 # Protein_GI_number: 15895617 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Clostridium acetobutylicum # 11 368 2 353 364 298 41.0 8e-81 MNGQEDKEKEILVTRSSLPSYEEYIEMIKPIWDSAWLTNMGDFHKQLEKELKEYMGIRNM VLFVNGHMALEMAIQALELTGEVITTPFSFASTTHAIVRNNLTPVFCDIRPEDYTMDPEK IEALITEKTSAIMPVHVYGNLCDVERIEAIARKHGLRVIYDAAHTFGERYKGKSVAEFGD ASIFSFHATKVFNSIEGGAVTFREDWMEHRLNCLKNFGIVDQEHVVWVGGNAKMNEFQAA MGLCNLHHLDQEIDRRRLVVERYLSGLAGVPGIRLPSFREGLTPNYAYFPVLFEDFKADR DQVYDCLAGHRIYPRKYFYPLINDFQCYKGRFSSKDTPVAAYVADRVLTLPCYADLELED VDRICGIIRGM >gi|157101630|gb|DS480694.1| GENE 144 155508 - 156194 694 228 aa, chain + ## HITS:1 COG:SPy0794 KEGG:ns NR:ns ## COG: SPy0794 COG0463 # Protein_GI_number: 15674837 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pyogenes M1 GAS # 3 223 2 226 231 174 40.0 1e-43 MDKVLLVIPAFNEEKNIERVVEDLVRGFPQLDYVVVNDGSRDCTAKICREKGYNLLDLPV NLGLAGCFQAGMKYAYRKGYRYAIQFDGDGQHRPEYIEAMRQKMEEGYDIVIGSRFVTEK KGWSARMIGSRVIGSAIRLTTGTRVTDPTSGMRMFNRKMIEEFALNLNYGPEPDTVSFLI KQGARVAELQVTIDERTEGESYLKPLTAVHYMARMLISILMIQNFRKR >gi|157101630|gb|DS480694.1| GENE 145 156208 - 157566 1258 
452 aa, chain + ## HITS:1 COG:SPy0797 KEGG:ns NR:ns ## COG: SPy0797 COG2244 # Protein_GI_number: 15674839 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Streptococcus pyogenes M1 GAS # 33 423 13 400 428 124 25.0 3e-28 MLSGYPPELGEYMKNLTKHLLYKGTDIGRQNVVWNIAGSFVYALASMVLSFLVIRVVGDG QGGIFSFGFSTLGQQMFIVAYFGIRPFQITDGTGEYSFGDYLEHRNITCIMALAAGAVFL TFMHGVGRYPADKCMILILLVIYKVIDGYADVYESEFQRQGSLYLTGKSNFFRTLFSVSV FLVTLAAFEHLLFSCLAAVAAQAAGIALFNLDVIHALPSVDWNKGERKTGRLFKSTLFLF ISAFLDFYVFSAAKYAIDARMNDAASGYFNLIFMPTSVIYMVANFVIRPFLTRLTDLWTG RDYDCFKKELMRIGAIILGLTVLAVGATAVLGKWVLSVMEMILGSGYEGRLVSYYGAFII IVLGGGFYALANLMYYALVIMRRQKAIFTVYLAAAAAAAVSSGFLVSKFGINGAAGCYLL LMIGLVIGFGLYTAGAYQSEKKENAGNGGSNT >gi|157101630|gb|DS480694.1| GENE 146 157547 - 158530 1081 327 aa, chain + ## HITS:1 COG:STM2085 KEGG:ns NR:ns ## COG: STM2085 COG0463 # Protein_GI_number: 16765415 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 20 326 1 302 314 177 34.0 3e-44 MEEVTRNTEIADGRERKRVLRVDVIIPTYKPGKKFSRLLKMLSMQTYPIGKIIVMNTEKA YWNDKGFEGIGNLEVHHLSKEEFDHGATRNRGARCSRADIMVFMTDDAVPADERLIEHLA AAFDKRGPGGESVIMAYARQLPDKDCAMAERFTRSFNYPEDSCIKTKEDLGRMGIKTFFA SNVCCAYDREKFWFQGGFIQKTIFNEDMIFAGKAVLQDDYAIAYVAEARVIHSHNYGCAA QFHRNFDLAVSQADHPEVFAGIHSEGEGIRLVRQTARYLVTHRRPWLVPGLIVKSGFKYA GYRLGRCYRFLPVKLAAACSMNKEYWK >gi|157101630|gb|DS480694.1| GENE 147 158556 - 159005 470 149 aa, chain + ## HITS:1 COG:no KEGG:Closa_3819 NR:ns ## KEGG: Closa_3819 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 115 1 114 123 102 48.0 4e-21 MMTVIFRVILVIVSILTMAFMMRKIRQAKVQIEAALFWVIMALILVVFSLFPAVADACAH LLGIYSTPNFLFLFMIFLLIVKVFGMTLQMSQMESRQKELVQRIALDQKDREELELKIQR FLEKEESAGKDSREVMSTKKMAGEETDGR >gi|157101630|gb|DS480694.1| GENE 148 158995 - 160677 1513 560 aa, chain + ## HITS:1 COG:CAC0024 KEGG:ns NR:ns ## 
COG: CAC0024 COG4713 # Protein_GI_number: 15893322 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 98 518 11 411 456 126 24.0 9e-29 MAGKKAGRKQESGRPGIRHDAGSGRMLSGHSRCVWLIGSFSVLALGLWHMTDVRAGVLET GDRFLYGWYGGLMVFAVLVLAAVGYLFLVRRDYGLARIFPAAAMGIGLMYMVVLPPLSAP DEVSHYISAYQLSSRLMGRTARAEDERVFIRAQDVFIEDLTGVMDYEAVENGGKTAVVLG QKLEEGTYRTIKERGLGGTGEEGTVMSYQMPVRTTPLAYVPQAMGIALARMLGLGGLGLL FLGRLFNLMFFTAMGCLTIRRMPFGKEVAAGAALLPMTLHLAASFSYDVMIIALSGYFSA VCLDLAYKAERVEVRDVAVLALVMAVMGPCKMVYGVIAGFCLLIPVRKFGGWGKWSLSAA SVLGAFAAAMAVVNHRTVSLYTQADQGYVAWAGETGYTFGQLLHSPLLVLKMCYNTLAWQ GEQLYSGMIGGALGNMDAVLNTPYAVILGLTAILVMLALRKPGESIFIQWKGRLWIWFLC LVCLGALMFSMLLAWTPVSSNVINGVQGRYLLPLLPMFLLSLKNDKVVRTDWDDRGLLFA MAAMDIYVVLRLFSLVCLRV >gi|157101630|gb|DS480694.1| GENE 149 160831 - 161385 645 184 aa, chain + ## HITS:1 COG:CAC2331 KEGG:ns NR:ns ## COG: CAC2331 COG1898 # Protein_GI_number: 15895598 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Clostridium acetobutylicum # 1 180 1 180 185 259 70.0 2e-69 MGKIKVTPCDIKGLYVIEPAVFKDERGYFMETYNQNDFKEAGLGMVFVQDNQSMSVKGVL RGLHFQKQYPQGKLVRVVRGTVYDVAVDLRSDSETYGKWFGVILSAENKKQFYIPEGFAH GFLVLSDEAEFAYKCTDFYHPGDEGGLLWSDPEIGVDWPIEPDMQLIISDKDRKWSGLRD TFKF >gi|157101630|gb|DS480694.1| GENE 150 161405 - 162217 877 270 aa, chain + ## HITS:1 COG:lin1062 KEGG:ns NR:ns ## COG: lin1062 COG1682 # Protein_GI_number: 16800131 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Listeria innocua # 1 270 1 267 267 130 29.0 2e-30 MNYVLSLMKEIIGKRKLIWDLGKADFRKRFVGSYFGVVWMFIQPIVTVSIYAVVFGMGFK SPPPIEGFTYVEWLVPGIVPWFFFSESVNSITNCLQEYNYLVKKVVFKVEILPIIKLISC LLVHAFFVVIMFGMYLIIGRMPSVIWFQLVYYSLAASLLALGIGFFTSAVNVFFKDMAQI VSICLQFGIWMTPIMYHESMFTGAGKWVEVLFKLNPFYYVVAGYRDTMLTGNWFWERPTL TLYYWGFTAVVLLLGLKMFKRLRPHFSDVL 
>gi|157101630|gb|DS480694.1| GENE 151 162236 - 162835 662 199 aa, chain + ## HITS:1 COG:VNG0110C KEGG:ns NR:ns ## COG: VNG0110C COG1451 # Protein_GI_number: 15789432 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Halobacterium sp. NRC-1 # 92 192 127 227 241 105 48.0 5e-23 MIQKVEISWRHREQEGVIPVEVVYSRRRTIGLEIKADGHVYARVPERIPNQYVMDFIKER QEWIVQKWFMIMERRRQEQSRPVKDYEKDPELEKLYRKKAKMQLENRCAYFAKRMAVDYN RIAVRAAKTRWGSCSAQGNLNFHWKLVLMPPEILDYVVVHELAHRKEMNHSQRFWAEVER ILPDYKARRKWLKEFGSQV >gi|157101630|gb|DS480694.1| GENE 152 162877 - 163326 547 149 aa, chain - ## HITS:1 COG:CAC3445 KEGG:ns NR:ns ## COG: CAC3445 COG0454 # Protein_GI_number: 15896686 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 143 1 144 147 167 56.0 8e-42 MEIIQVKKDKKKYLDLLLLADEQEDMVDRYLERGDMFVLTDGTVRAECVVTKEGPGIYEL KNIATAPQFHRQGYGRRLIEYVFTYYPDLRTLYVGTGDSPLSLGFYHACGFIDSHRVERF FTDNYDHPIIEDGIQLVDMVYLKRERDMA >gi|157101630|gb|DS480694.1| GENE 153 163487 - 163873 396 128 aa, chain + ## HITS:1 COG:no KEGG:CA_C2946 NR:ns ## KEGG: CA_C2946 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 10 125 10 125 130 119 47.0 4e-26 MRPISFSGMEDAITVQKPNGTHVSYHIFEESEIHLNRITPGSVQEWHYHTRIDECLLITK GVLTCRWLEDGAERTRKVREQEIVRVGSSVHTFANESEEDVEFVVFRFVPDGRDKRETIK SDKVVVDR >gi|157101630|gb|DS480694.1| GENE 154 163892 - 164842 823 316 aa, chain + ## HITS:1 COG:no KEGG:Amet_3885 NR:ns ## KEGG: Amet_3885 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 47 310 39 304 305 241 43.0 2e-62 MVKWIRGAAGMVFLAAFLWGCSPGQADEQTAEEQQQAVAGLEGRDVQRVTQTASPEPEGT GIAVNPSGAVLSERIPVPSGFQRTDAGEGSFQDYIRSYPLKPDGSPVMLYDGSEKGNQTA HQAVFALPVFDSDLQQCADSLMRMYGEYLWAHGAGDEIAFHLTNGFLMDYPSWREGNRLS VDGNQVSWVKKASYDDSYETFLLYMKYVMMYAGTLSLDNECTAIDISLIKAGDMFIKGGS 
PGHCVMVADVAVNGDGDICFLLAQGYMPAQDFHILNNPLHMDDPWYYASELSYPLKTPEY VFQEGSLKRWYLSDFL >gi|157101630|gb|DS480694.1| GENE 155 164960 - 165340 361 126 aa, chain + ## HITS:1 COG:CAC0249 KEGG:ns NR:ns ## COG: CAC0249 COG0346 # Protein_GI_number: 15893541 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Clostridium acetobutylicum # 1 126 1 126 126 166 61.0 1e-41 MDLKKIHHVAIIVSDYEVSRRFYVELLGFKVIRENYRPEKDDYKLDLELDGCELELFSGK HNPPRPSYPEALGLRHLAFRVEDMDAAVRELNEKGVDTEPVRMDQFTGRRMTFFHDPDGL PLELHE >gi|157101630|gb|DS480694.1| GENE 156 165329 - 166507 968 392 aa, chain - ## HITS:1 COG:FN0978 KEGG:ns NR:ns ## COG: FN0978 COG1757 # Protein_GI_number: 19704313 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 3 380 54 430 431 276 43.0 4e-74 MLWEGISQVRNILIIFVFIGGLTAVWRISGTIPYILYYAVGFIHPRYFVLCTFLLCSSMS FLTGTSFGTASTMGVICMLISNAAGLSPFLTGGAILSGSFFGDRCSPMSSSAQLICSLTR TDIYLNIKLMCRSSAVPLAATCILYVVLASGSSAPADRELLDLFRTNFSLHWAAMLPAVL ILVLSLLRVDVKYAMAVSIAAGAAVALFVQGASPLVLLRCFLYGYEAQDGTRLAQLLNGG GIRSMVKVGIIVLISASYSGIFSHTCLLSGVKHALLRSARRLTPFGTVLVTSVLSCAVSC NQSLATILTCQMCDGLYPKKEKLALALEDTAILIAALIPWGIAGTVPVAAIGAPMGCMFY AFYLYLVPLWNLITALYHDRIRAGRTAFTSFM >gi|157101630|gb|DS480694.1| GENE 157 166737 - 168206 877 489 aa, chain + ## HITS:1 COG:SPy2013 KEGG:ns NR:ns ## COG: SPy2013 COG3666 # Protein_GI_number: 15675797 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 4 386 5 376 401 442 53.0 1e-124 MMTQNADKKREQIQLFCMDDMVPQDHLLRIIDKAIDWSFIYELVEDKYSQDNGRPSMDPV MLIKIPFIQYLYGIKSMRQTVKEIEVNVAYRWFLGLDMLDKVPHFSTFGKNYTRRFKDTD LFEQIFAHILSECYKFKLVDPNEVFVDATHVKARANNKKMQKRIAHDEALFFEDLLKQEI NEDREAHGKRPLKEKEDDNNTPSGGAGGKEEKTVKASTSDPESGWFRKGEHKHVFAYAVQ TACDKNGWILSYSVHPGNNHDSRTFKSLYDKIKGIGIETLIADAGYKTPGIAKLLIDQGV KPLLPYKRPLTKEGFFKKYEYVYDEYYDCYICPNNQVLTYRTTNREGYREYKSCGSACAS 
CAYLAKCTQSKDHVKTVMRHIWEPYMEMCEEIRQTLGMKELYSQRKETIERIFGSAKENH GFRYTQMFGKARMEMKVGLTFACMNLKKLARMKAKWGAAHFTNFILNAIWLIKENWLWDT KPKTSLSTV >gi|157101630|gb|DS480694.1| GENE 158 168250 - 169608 1422 452 aa, chain - ## HITS:1 COG:BH0805 KEGG:ns NR:ns ## COG: BH0805 COG2610 # Protein_GI_number: 15613368 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Bacillus halodurans # 1 415 3 422 456 179 33.0 7e-45 MSGTLLVVNLIVAIALMIALILIPKLNPAISLVIASIYMGVSCGLGFVTSIDTIASGFGN MMASIGLPIGFGVILGQLMSDTGAAKVIAKTIVHSVPNKFVLYAVAIAGFILAIPVFFDV TFIILIPIGIAVAKEVNKPLAYVVGALTIGDGIAQTLVPPTPNPLAAADLLHFNLGTMVI MGLIIGAIATVVSVFIYTKIHDTGFFKPEKDMNPNAELFSEEKELESDKQPSFLVSFLPI LIPVVCILTGTGADAIYDETPMVAQFLGNKIIAILLGTLCSYLIGYKYLGRDKVEASAGE ALKQAGIVLLITGAGGSFGSIISATDIGNALVTTLSINSGSVIKVLFLAYFIGCIFRIAQ GSGTVAGITAMTIMATIAPSVSIHPIYIALAALAGGISVGHVNDSGFWVVSNLSGFTVTG GLKTYTVAQIIMSISIMILTLLFALVLPGAIG >gi|157101630|gb|DS480694.1| GENE 159 169690 - 170988 1212 432 aa, chain - ## HITS:1 COG:no KEGG:Spirs_0433 NR:ns ## KEGG: Spirs_0433 # Name: not_defined # Def: hypothetical protein # Organism: S.smaragdinae # Pathway: not_defined # 13 432 11 430 430 411 47.0 1e-113 MALETKKEGNIITELASSVRLPKMVKVKQLLDHSHIDVNEIPGIVRSELDREELRARIKP GASVAITCGSRGVANIAVIIRAIVDFVKECGGSPFIFPAMGSHGGATAQGQREILASYHV TEDTMGCPIKATMEVVQVGETPDGMPVYVDRYAYEADAIILCGRVKAHTAFRGPYESGIM KMAVIGLGKQKGAETVHRNGFCELGTMLPVIGRVVLEHAPVIGALALVENAFDQTCIIKG MLKEEIYEEEPKLLIASKQRLGKIYFDNIDVLVVDRIGKDISGDGMDPNITGRFAVPYIN EGIKVQHIAVLDLTDETHGNCNGLGLADVTTKRLVDKIDVDCTYPNVVTSTVLCTPKIPL FTHSDKTCIQIALRTCNYIDREHPRVVRIKDTMNLEEIYISESMLKEAEESNHVIVSGPP KDWVFNEEGNLW >gi|157101630|gb|DS480694.1| GENE 160 171007 - 172500 1129 497 aa, chain - ## HITS:1 COG:BH0490 KEGG:ns NR:ns ## COG: BH0490 COG2721 # Protein_GI_number: 15613053 # Func_class: G Carbohydrate transport and metabolism # Function: Altronate dehydratase # Organism: Bacillus halodurans # 1 497 1 497 497 550 54.0 1e-156 
MERKLIKIEEHDNVAVAVESIKRGQTVIAGQTEVTAQEDIPFGHKIALRDIEAEEAVIKY GYAIGHASCPIKKGSWVHSHNLATNLKGMLTYTYEPDIPRPANVSGISRTFRGYVRKDGN VGIRNEIWIIPTVSCVNTTVRMLADMAAREYGELCDGIYAYPHNAGCSQLGDDFETTQKI LASIVHHPNAGGVLLVSLGCENNDLEHFLPVLGEIDESRVKMMVTQDVEGDELEYGMELI RELAQELSQDQREDVPVSKLKIAFKCGGSDAFSGVTANPLCGRIADCITALDGSAVLTEV PEMFGAETILMNRSDSDKTFHQVVELINGFKQYYLDYGQPVYENPSPGNKRGGITTLEEK SLGCIQKGGKAMVTGTLQYGERCIKPGLNLMTGPGNDSVSITDLLSCGAQILFFTTGRGN PLGAAIPTIKIASNNVLYERKERWIDYNAGAILDGKTFDDAAHELWDLMIQTASGEHTKN EIYGYREIMIFKNGVLL >gi|157101630|gb|DS480694.1| GENE 161 172630 - 173265 568 211 aa, chain - ## HITS:1 COG:BH1062 KEGG:ns NR:ns ## COG: BH1062 COG1802 # Protein_GI_number: 15613625 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 6 199 14 206 226 90 32.0 3e-18 MENKSDYAYQELKNRIISGQMPPLSDVSEEQLQKELGVSRTPIREAIQKLEKEHFVMIYP RRGTLVSDITLDLIYSIYEVRLLNEPFIARSACRYIVNEWIDHMYTSFSTTFHEKEGEQQ RDYYIELDRELHNTLTSHTNNFMLKDTFRVVNDHNHRIRILTSQRNMNYQRSINEHLAIL EALRLRDESQLENAVREHILTAKQEAFEYYY >gi|157101630|gb|DS480694.1| GENE 162 173512 - 174510 757 332 aa, chain + ## HITS:1 COG:RSc2759 KEGG:ns NR:ns ## COG: RSc2759 COG3734 # Protein_GI_number: 17547478 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-galactonokinase # Organism: Ralstonia solanacearum # 5 330 3 329 331 242 43.0 8e-64 MEHYLITIDSGTTNTRVALWNQKKACVDIYKSEIGVRVTAVEGNNKRLAGAIREGIQTLL NRNNLTIHQIASVYASGMITSNVGLTEVPHIEAPAGLEDFAAAVRAVLIPEVFELPIHFI PGLKNSNAVDMDTLESMDIMRGEETEALALLSLITVKGAVLLALPGSHNKFVTTDRSGRL TGCLTTLSGELLSAITNHTIIADAVNHSFASAHYNWEMVLKGFKTARDTSLTRAVFTTRI LNQCLSASPKDCAGYLLGAVLSSDISAVKKSSALMLSPDMQVVVAGKEPLRSALTRLFEE DGSFSDVSGYDSGELLLSGYGALLIAVYNGDF >gi|157101630|gb|DS480694.1| GENE 163 174524 - 175189 478 221 aa, chain + ## HITS:1 COG:BH3723 KEGG:ns NR:ns ## COG: BH3723 COG0800 # Protein_GI_number: 15616285 # Func_class: G Carbohydrate transport and metabolism # Function: 2-keto-3-deoxy-6-phosphogluconate aldolase # Organism: Bacillus halodurans # 7 215 2 
204 214 125 34.0 6e-29 MDERNRQQMILQMIRKEKIVAIVRGIPSGLILDTGRALADGGICMMEVTFDHSGPEGIRE TLHSIALLKEHLSERMHIGAGTVLTAEEAEEAYRRGAEYIISPNVDRSVIEKTKELGLIS MPGAFTPSEIVDAYNYGGDIIKLFPAGLLGVSYIKAVRGPLSHIPMAAVGGVTPENICQF TAAGVACFGIGSNLVNQSLTVSHDFENLTERALAFRNALGE >gi|157101630|gb|DS480694.1| GENE 164 175517 - 178243 1273 908 aa, chain + ## HITS:1 COG:BS_ykoWm_4 KEGG:ns NR:ns ## COG: BS_ykoWm_4 COG2200 # Protein_GI_number: 16081166 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Bacillus subtilis # 660 908 10 257 259 160 34.0 8e-39 MGRIIRKLGGVFVPIIFCTALIFTAHMSLDLTDDLQGNARVINYIGIVRGATQRLIKKEL NHVPDDELIYFLDNILSGLSSGSDELNLIKLDSEEFQTMLIEMQNDWEDIKAQIYNYRKD PSNQMLYELSEDYFRLANDTVFTAEEYTEHTVQNARKSLVITNIIFGLMAFGCSVFTICQ EKRRKKLIEAEKENIKKSEQLSRRAQELMAPMNEISELMYVSDINTYELLFVNEAGKRMF QIDDSQPNIKCYKVLQGFEEPCPFCTNKLLKADETYSWEYTNPIIKKHYLLKDRLIEWDG RTVRMEIAFDITEANNEKNELKRRLKRDDIRLACVRELYNNRNIQSAITKVLQYIGTLFS AERSYVFRFHDQYYSNIAEWCKEGVLPQIDNLQNIPIGDYQVWLDELDKHKKLVINDIEE IKDTFPKGYELLVQQGIRNIIWVPMMKNGKVNGSIGLDNQDLGLAEVAVPFLQTIQYFLS LTMQRNENETMLFEMSQIDRLTSFYNRNRFIQDVSELKECKESVGVVYLDINGLKEINDS FGHDTGDKMIKECADIVKNSVDSKYLYRIGGDEFVIIYVDITEESFCDNVQLLKNAFEKS KCQAAIGCRWNEECTHIQDIIKEADELMYDDKKRFYQGHHATGRYRHNNDILRSLAEPDV LNEKIENHNFKVYLQPKINVEDCRMVGAEALIRYRDENGAIIAPDKFIPVLEDTYLISKI DYYVFEEVCKTLSIWARQGKSAVTISSNFSKLTFRDDRFIKRLEEISDKYHVQRNYLEIE ITESANFTNLDTLVTRINQIRDSGFRVAMDDFGVESSNLALLSLVKFDILKIDKGFVKDI ISNKRAQIIIGMMTRMCSEMGIQLVAEGIEDGQQLEVLREYGVKTVQGYLFSRPISISEY EEKYMQGP >gi|157101630|gb|DS480694.1| GENE 165 178383 - 180941 1148 852 aa, chain - ## HITS:1 COG:VC1349_3 KEGG:ns NR:ns ## COG: VC1349_3 COG0642 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 330 571 2 239 260 168 40.0 3e-41 MAGRQKNDTTIQFLIYSFIGLLIFSIVIFSILGIYMSQKSRKAVYKVGEIYMSGMNEQMS RHFETVIKLRFDQVSGIVSVVSTDNNDKEALYEELIYRAQVRDFDYLALCSAEGNFETLY GHPIQPLNPEPFVEALLQSEQRVAVGVDSDENEVVLFGVDAAYPMQNGDKSTGLIAAVPL 
EYITDFLSLEDKGQLMYYHIIRPDGSFVIQNPNTELWPFFEELQKHLALEENDSSEENSI DEFGIALKDHKKYAATFEVNGEERQIYGIPLPYSEWYLVAVMPYGILDDTINSLSSQRMV MTLLSCASVLFILTLIFLRYYSMTRSQLHELEKARQTALESSKAKSEFLANMSHDIRTPM NAIVGMTAIATAHMDDREQVQSCLRKITLSSKHLLGLINDVLDMSKIESGKLTLTTEQIS LQEVTEGIVNIMQPQVKAKKQTFDIHVDNILTENVWCDGVRLNQVLLNLLSNATKYTPEG GAIRLSLSEEKSPKGENYVRIHINVKDNGIGMSQDFLKKIYESYSRADSARVHKTEGAGL GMAITKYIVDAMEGGIEVQSEIGKGTEFRITFDFEKAAAMEVDMVLPSWNMLVVDDDELL CRTAINALKSIGIKAEWTLSGEKAIDLVIQHHSRRDDYQIILLDWKLPGMNGIQVARELR RNLGDEVPILLISAYDWSEFEAEAREAGISGFISKPLFKSTLFYALRQYMGVDTPNSRTL NQGMELSGRRILLAEDNELNWEIANELLSDLGVVLDWAEDGQICLDKFQKSPEGYYDAIL MDIRMPHMTGYEATKAIRGLDHPDALSVPIIAMSADAFSDDIQHCLECGMNAHIAKPIDV MELTRMLKRYLI >gi|157101630|gb|DS480694.1| GENE 166 181118 - 182581 474 487 aa, chain + ## HITS:1 COG:no KEGG:Elen_3094 NR:ns ## KEGG: Elen_3094 # Name: not_defined # Def: regulatory protein GntR HTH # Organism: E.lenta # Pathway: not_defined # 1 462 1 465 486 131 24.0 7e-29 MKPQETKFTFVYNEIKQRILEGQILPGNSLLSSRMYCEQFHVSRYTINHVFEALREEGLI EIQPRLAPIVVSGKDTPDSLNTVFEILKQKQTILQVYQTFTLILPSLFVFALQGCDVEIM PHYKQAMKVLRLGRLTGGWRIPSNLGYDILRIGGNPLFGELYSAFGLYNKFSFFIEECPC FRKHFLQKPAPANSVIIDILKGKDPSAKHRQLLNMYQALTDSIESTLKNLRDTTPECPVQ TGVPFAWNPMRGQDYYYSKIVDDLNLKIGLGEYSIGMYLPYEKQLAGQYNVSLSTVRKAL SELEQRGFVKTLNGKGTIVIEPDDTKIHRLALNSGYVEKALRYLHALQLMALLIRPAALE AALRFTRDELDELADRFTSPGSAYLADMLEAILKHTTLEPLYVILSETNHLLEWGHHFAY YPSKRRTLSHLNKQVIAAFRQLNQGNADSFADGIADCYCYILVCMKKHMTEKYKFSNASN IRIPEKY >gi|157101630|gb|DS480694.1| GENE 167 182890 - 183534 676 214 aa, chain + ## HITS:1 COG:lin0505 KEGG:ns NR:ns ## COG: lin0505 COG0036 # Protein_GI_number: 16799580 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Listeria innocua # 1 213 1 213 216 246 57.0 3e-65 MAQILPSIFGADILRLKDEIEFLERENTAILHVDMMDGNFVSNIAFGPNQIGAMKKASKM MFDVHMMIDHPKDHLDDVIATGAEMISVHYESTPHIHNMIQRIKKAGRKAGVVLNPGTPE SVLEYLLDDIDYVLIMTINPGTPGQTFIEKSLEKIANLKKMIGSRPIQIEVDGGVNDVIA KQVAGAGAELIVVGGYLFGGNPDERYSILREAVK 
>gi|157101630|gb|DS480694.1| GENE 168 183599 - 184351 210 250 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 249 4 242 242 85 29 3e-15 MKKFEGKVVLVTGAGHGIGKQIAVQFAEQGANTVIVDYNVEFGTETAEEITRDYVKSLFV QADVSKPEDMDHVREKVFEEFGRIDTLVLNAGVAFKKKVNEFTFEEWNRTISINLNGLFN TVKAFYNDFLTNHGSIVYISSGSALSGTGGAVAYPASKAGGEGLMRGLAKELGPKGVNVN IIAPRLIDTGDMMRVNYPTQAELDAVLEKIPVRRYGTVRDVANLAVFLADKDNSYIQGQT ILLDGGRTIA >gi|157101630|gb|DS480694.1| GENE 169 184394 - 185659 1477 421 aa, chain + ## HITS:1 COG:SA0238 KEGG:ns NR:ns ## COG: SA0238 COG3775 # Protein_GI_number: 15925950 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, galactitol-specific IIC component # Organism: Staphylococcus aureus N315 # 1 420 1 418 419 244 35.0 2e-64 MEAILGFFQEFINLGAAILLPVVIAILGKFFGMKLGHAIKSGLLVGIGFQGLVLAVNLLI TTIQPVMDYYKALGSGYDALEVGFAALGAASWTVPFAVFVIPAIIIVNLILVRLKITKVL NVDIWNFMHFLVPGALAYALSGNAVIGFLVAVGCGIVALFFSQWLAPKWGEFFGLEGTTC TTLAFCAWVYPVTYGLNKIIDLIPGLRDVDVNVDKLGKKLGIFGDPAVIGLLVGVFLALL TRQDLGALLTIGMGVAASMVLIPRMVSVMMEGLTPMGNAANDYMHRKIGDDADIYIGMDI ALGLGDPACITCSAIMIPVTILLAFLIPDMRFFPLGILAEVCYLAPMCVLTSKGNIFRTL ICMTVMMFITLFFANMFIPEATQMLSVTGVKFEGLVTASHFGWNPGNLLVSLLHRLFGML G >gi|157101630|gb|DS480694.1| GENE 170 185675 - 186115 422 146 aa, chain + ## HITS:1 COG:BH0192 KEGG:ns NR:ns ## COG: BH0192 COG1762 # Protein_GI_number: 15612755 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Bacillus halodurans # 17 127 22 133 160 71 32.0 5e-13 MKNRYLVIHGFARDRYEAITMCGEALYKEGLVSEQFGTLCVEREKNYPTGLPTEIPTAIP HAKDESITQNSVCFLKLDRPVSFKRMDDDTQDVSTDMIFNLAIKDPNEHLQALQNMMGFL NDSEALLKCKSLSDEELIEYLQEKIG >gi|157101630|gb|DS480694.1| GENE 171 186161 - 186700 699 179 aa, chain + ## HITS:1 COG:BS_yckF KEGG:ns NR:ns ## COG: BS_yckF COG0794 # Protein_GI_number: 16077414 # Func_class: 
M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Bacillus subtilis # 5 179 9 185 185 125 40.0 3e-29 MELMDIVAELKNASVQMPEEKIDQFIEAVDTHERIFVYGTGRSGLMLKALAMRLMQMGYQ SYVVGETTTPSVGKGDLLIVASASGETQSVCSAADDGAKQGTDVLVITGSKESTLSRNHE PLIRIEAATKFSESKASIQPLGSLFEQMLLLIFDAVILKMSSKNADTNKKMAKRHASIE >gi|157101630|gb|DS480694.1| GENE 172 186784 - 187083 464 99 aa, chain + ## HITS:1 COG:no KEGG:PPA0023 NR:ns ## KEGG: PPA0023 # Name: not_defined # Def: PTS system, galactitol-specific IIB component (EC:2.7.1.69) # Organism: P.acnes # Pathway: Galactose metabolism [PATH:pac00052]; Phosphotransferase system (PTS) [PATH:pac02060] # 1 82 1 82 100 87 52.0 1e-16 MTKEIKILVACGSGVATSTIAQEAVKEIAQRAGVNARVFKATIAEVPERQHHVDIVLTTA NYRQPLEKPYMSVFGLISGVNKANTEKKLEELMKKVAAE >gi|157101630|gb|DS480694.1| GENE 173 187295 - 189373 1846 692 aa, chain + ## HITS:1 COG:BH0220_1 KEGG:ns NR:ns ## COG: BH0220_1 COG3711 # Protein_GI_number: 15612783 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus halodurans # 1 474 3 488 556 90 21.0 1e-17 MKKRSTEILQRLLKNPNEELSLKKLTDDYNITEKTLKGDVQELVEFARESGFGHSVYWDN HILRLDDRKHISEFMDAVYSMDPYQYKMSLEERKVYIIIELLCHDGYYSMQQLADELYVT RNTIINDCRLVDEYVKKYGIIFVAKSKKGILLQTEEDKTRGLLIDIFKGLIPSIKCEKDF FVRFIIRRAGFICPLTDVIYYMNCFTKDNNIIFAKDVFFEIAICIFVLLNRLEQTGFAEN STEPGLPQLQLDTIGNMIHYVAEELGYSAIGHNGILVMERQILLRNLHPQVQSINDFELY GVICHFLLEISREIDVDIQSDNLLVESLISHVKGMNNWNDADYDWDIGYETSGEFPRIRE LAEDKFCILEKYLQYSLTPKMKDSIIIHICAALLRGRKNSSPLGVIISCPGSMATSKYLE AQIKNYFNFYVVDTMTTRQVEASKGCFDQVDFVISTVPIQDCVLPVAVVSPLITVEDINK IQNLAFKQKKTVLPDARERFPVLSKIYAIYDSGDRRKIEYLDRELKQILEDAFYVESKIG KEFALLNMLKIKYIKARDGKMAWREAMKAASEDLIRDGYFDERYVREAIGNVEEYGSYII VNKGIALAHARKESGVYEDGLSLLVSKDGILFDEGETVHLLFFFSQKGETDYLDLFKEII KLGKDQNDVDRIRNLTDSMEIYRTMWEILSRN >gi|157101630|gb|DS480694.1| GENE 174 189397 - 190422 492 341 aa, chain - ## HITS:1 COG:lin2727 KEGG:ns NR:ns ## COG: lin2727 COG0642 # Protein_GI_number: 16801788 # 
Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 44 338 162 457 459 186 35.0 4e-47 MKNGIRLWLSLSIHVIIIMFVMVFITGTLSFRFFSSDIHSVTARLFPLISLLIAAILVSV SLILIVSKRLLVPIQNLIEALQRVAEGDFTVRLPENHEDAYAYNMNANFNKMVKELNSME MLQSDFIQNVSHEFKTPLTSIEGYATLLNGVSLPDELHVYSGHILESARQLSVLTGNILK ISKLENQGIVSQKESFFLDEQIRQALLSLAPMWEPKNLDIDMDLPETLYYGNENLMFQVW TNLFSNAVKFTPPGGSISVRCCQTPEVIQVCIKDTGVGMTPEVQSHIFDKFYQGEQNRNI EGNGLGLALVKKIVTLCNGTVAVESRPDEGSVFTVRLPMDK >gi|157101630|gb|DS480694.1| GENE 175 190419 - 191096 694 225 aa, chain - ## HITS:1 COG:lin2728 KEGG:ns NR:ns ## COG: lin2728 COG0745 # Protein_GI_number: 16801789 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 3 221 1 220 225 188 43.0 5e-48 MLMLKILIVEDDEKLSQLYKIVLNKAGYETVLASNGQEAWDIFDQEHIDLVITDVMMPVL NGYDLVKILRNENPLIPVLMITAKDDYPSKSKGFTLGIDDYMTKPIDVNEMVLRVKALMR RANINQERRLMVGATCLDYDSLTVSCGEDSIMLPQKEFLLLYKLVSYPNKIFTRIQLMDE IWGPGTASDVQTIDVHINRLRRHFEHNHDFKIMTVRGLGYKAVVQ >gi|157101630|gb|DS480694.1| GENE 176 191333 - 191950 450 205 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940197|ref|ZP_02087542.1| ## NR: gi|160940197|ref|ZP_02087542.1| hypothetical protein CLOBOL_05086 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05086 [Clostridium bolteae ATCC BAA-613] # 1 205 1 205 205 393 100.0 1e-108 MPTERFDKLPELKKNRISDAISEAFMHRGTGDIHITQIARSARVSRGSLYAYFLSKEDML FFSLCQTQRFIWENNKKILLEYGGDYWKMLETSLRYHFSICRTNRLYRLLYLPIEGNESF SLLSRSLLHGKEYMEYKAWIYRHLMPAYRERFSEDEFGVYQDTCNDILTVSIQEYISGAG KKEDIMACFHNKIIYIRPEVKQQTV >gi|157101630|gb|DS480694.1| GENE 177 191972 - 193111 1141 379 aa, chain + ## HITS:1 COG:PA2528 KEGG:ns NR:ns ## COG: PA2528 COG0845 # Protein_GI_number: 15597724 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 44 361 70 405 
426 84 24.0 5e-16 MKKGVIVGVAAAVVICAAVIPRFMNNKQFAEAVALPVMETAYPQTGDIRLTTSLIGKVEP SDVVYIYPKAAGDVTSVNVKAGDVVTEGQVICTIDTKQVETAKSSMEAAALTLKEAQAEL GRQQILYNSGGISEQAYEQYRNKVDSARIQYEQAEFTYKTQLEYSQITAPISGLVEICDM EVFDTVSQNNLICVISGQGARVVSFSATERIRGYLHEGDEIEVEKEGIKGSGAIYEVSTM ADSTTGLYKVKARLEDGDTFPTGSEVKLYVVSEKTEDAMTIPVDSVYYDNGNPYVYTCEN GTVHKVFIETGIYDSETIEVLSGLTMEDQVITTWSSELYEGAQVRVLGEEQEPQQVPGQI QTEVQTQTTEQTTEQTTEQ >gi|157101630|gb|DS480694.1| GENE 178 193127 - 196261 2858 1044 aa, chain + ## HITS:1 COG:BH3816 KEGG:ns NR:ns ## COG: BH3816 COG0841 # Protein_GI_number: 15616378 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Bacillus halodurans # 4 1008 1 1010 1093 407 27.0 1e-113 MINLTKYVLKRPVTTIMCILCLIFFGYTSVTGATMELTPEMDMPMLLVLTNYSGANPEDV NDLITKEIEDQVGSLSGLKSITSASSEGMSMVMLEYEYGTDTDEAYDDLKKKIDLVTTQL PNDADTPVIMELNMDSAADITLAIDNASQNNLYSYVNNELVPEFEKIAVAADVSIRGGSD EYVKIEVIPEKVSQYKVSISSIASDIQAADLSYPAGDTQVGSQELSVSTRMTYNTVENLK KIPLTVPSGDTVYLEDVANVYTTTDSSSSIARYDGNDTISVSISKQQSATAMELSNQVKK TISRLMAEDPDLNIILVDDSSDSIMESLMSVVETLIMAVVISMIIIWLFFGDIKASLIVG SSIPVSILAAFILMDLMGFSLNVITMSALTLGVGMMVDNSIVVLESCFRATASSEKKAGL VEFMDNALEGTRVVGASVLGGTATTCVVFIPLAFLEGMTGQLFKPLGFTIVFCLTASLIS AVTVVPLCYMIYRPRERQKAPLSKPVYRLQDTYRSAMRVILPKKKTVMFISIGLLMFSLF LATKLDMELMGSDDNGEITISVDTRPGLMTEKADEILREIEGIIVQDSNLESYMTSYGGG GIRADDSATINAYLKDDRDMDTVDVVKMWKKQLADVRNCDISVEMSSSMSMMSSALESYE VILKGADYDEVKSISNKIVTQLMERDDVTKIHSDLENSSPVVEIRVDALKAQAAGLSASK IGSAVHNAISGVESTEIEIDGNDIDVKVEYSDEDYKTIDQVKGLVLTTGSGGSVALTDVA DVQFVDSPASISREDKEYKVTISGEYTELATRETKMRINQEVVTPNLTPSVSIGLNSIDS SMNEEFSALYKAIGTAVFLIFVVMAAQFESPRYSFMVMTTIPFSLIGAFGLLFLTNCKIS MVSLIGFLMLIGTVVNNGILFVDTANQYRETMDMDTALIEAGATRIRPILMTTLTTIISM IPMAMAIGNSGNMTQGLAIVNIGGLTASTVMCLLMLPGYYKLMSGNRKIIKEKRGGRRKP WPVIAGSFKKNLKRNRKDKGRSEE >gi|157101630|gb|DS480694.1| GENE 179 196258 - 197433 1189 391 aa, chain + ## HITS:1 COG:no KEGG:Closa_0525 NR:ns ## KEGG: Closa_0525 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: 
not_defined # 11 390 8 380 383 265 43.0 4e-69 MKPQMTFMANGRCILVLALTAVMGGLPPMAALGASPEFARTEQEWARLQDNVIEYDEIPD LIHEYNATVQNNQYDYQKFREDYGDTNSDVADAYNDLAQDFYDDMSGETDAGSMMSDLQL DIQARNMLKQADNTLEDSKIYLLTYEMAEDNLAATAQSNMISYHKKQLELEQKQTDLELA REKYSLEQVKQAAGTVTAVDVLTAKESLQSSENNIKELESGIENLKEELYISLGWKHNDS PDIKELPQMDVSRIDGMDPDRDLETALENNYTLKINKRKLENARSKTTRESLETKIRNNE KQIGASLSSAYKNVLSARLSYEQAVAEAQLEETNTNIAAGKLQAGMMTSLEYKEQEYKME SSRLNAEMAAVSLFQAMETYDWSVKGLASAE >gi|157101630|gb|DS480694.1| GENE 180 197566 - 198732 1150 388 aa, chain + ## HITS:1 COG:no KEGG:Closa_0526 NR:ns ## KEGG: Closa_0526 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 384 7 392 398 168 32.0 4e-40 MLSLSLAAGLSEVTALTAAAAAQNGPGDTDRQEEYLGQERNEKLKDDVLEYSELQDMIRT YNPTVLEVADSYNKTIMDYENAWAELKLYQGRADSDKDDAKDAGNAEQYTYYTAQEQTYK AAASSYYKMLDNMKKTASSTSTQRQTERQMTVAAQSLMVSYETLRQQKYTLEKMEELYKT QYELGTVKEQAGTAAAAEVLSFKNQVLAAQASLAEVEANMESIYNSLCLMVGREADGSLQ VASIPAADPSRIGTMNLEEDTAKAIGNNYTLISDRHSLKVDSTSSSNYKLRTMEDGEQKL TSKMKRLYEDAAIKRDALEQARTGYEKARINKQQADTKYAIGMLSKDEYLMEELDYVQKE ADYKAADLALQQAMDTYDWAVLGIADIE >gi|157101630|gb|DS480694.1| GENE 181 198758 - 199219 332 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940202|ref|ZP_02087547.1| ## NR: gi|160940202|ref|ZP_02087547.1| hypothetical protein CLOBOL_05091 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05091 [Clostridium bolteae ATCC BAA-613] # 1 153 1 153 153 250 100.0 3e-65 MKKIRVAGWMTGMGAAAAVLALSAQTAWAGSWIEEPMGWFYEKDDGETVRGWNQINGIWY YMDTETGVWQKEPAINRANAPYLMENMMAEAGLYQDEEKEMEYRVQYETKDTIEVLAGWE EKPGVFHSVNTFEINKKTKTAISSVTKEEYGVY >gi|157101630|gb|DS480694.1| GENE 182 199224 - 199754 249 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940203|ref|ZP_02087548.1| ## NR: gi|160940203|ref|ZP_02087548.1| hypothetical protein CLOBOL_05092 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05092 [Clostridium bolteae ATCC BAA-613] # 1 176 7 182 182 360 100.0 2e-98 
MVGICVTEKGENVRLDIYCNQGRYMERKTLRYRKEESALCDAALYGLLKRRTGDAPQDCV KGGIEPSPSLKEKSELCRCLSKTYMVGKRIGPFWCFKGMISGRLGQRRPLSALLYDTEYG QIVIAACGPEVKTWYLFEGEAGADIGKNGMDNGYILSIWGMGKVLQKFFFKTIVSY >gi|157101630|gb|DS480694.1| GENE 183 199807 - 200232 428 141 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00061 NR:ns ## KEGG: EUBELI_00061 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 7 140 8 141 151 102 44.0 3e-21 MIAQNSCGACIKQIHDALEKNANNALRHQDLTMAQVGVLLALSGAPDYQLTLKELEHALH VAQSTAAGIVARLEQKGFLEGFGDPSDRRVKMVRLTPAGIECCCEAEDNMKRAERMLLSG LTETEQGILGSLLRKVRDSLI >gi|157101630|gb|DS480694.1| GENE 184 200370 - 200777 200 135 aa, chain + ## HITS:1 COG:CAC2485 KEGG:ns NR:ns ## COG: CAC2485 COG0534 # Protein_GI_number: 15895750 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 12 118 11 117 442 76 35.0 1e-14 MKKSNAADIFGKAPVFQAVFKNALPAMAAMLMVLIYNLADTFFIGQTHDALQVAAVSLAT PVFLIFMAAGTIFGIGGTSVISRAIGEGRHEYARSVCAFCMWSCVAVGIVMAALFLIYES DTGMDGRQRGYLGAC >gi|157101630|gb|DS480694.1| GENE 185 200722 - 201747 761 341 aa, chain + ## HITS:1 COG:MA1121 KEGG:ns NR:ns ## COG: MA1121 COG0534 # Protein_GI_number: 20089987 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 1 329 127 456 475 145 30.0 9e-35 MNQILGWMGASADTWELAEIYLTIVSLSGPFVLISNCYSNIIRAEGQSGKAMMGQLLGNL LNVILDPILILGFGWNIAGAAIATVIGNIVGAGYYILYFKGGKSSLSIQVKDFTTRNRVC SSVLAIGIPASLGSLLMSVSQIIVNSLIAEFGDMALAGMGVAMKITMITGMICIGFGQGI QPLLGYCVGAGMWRRFKQVMKCSILCSFGLSAVMTIICYFFVNQIVSVFLTEASAFNYAA RFARILLCTSFLFGMFYVLSNALQAMGAATEALIINLSRQGIIYIPSLFLLKMAIGLDGL AWAQPVADILSTGLVAILYIRTVRKMEYNCSILSGGCQMES >gi|157101630|gb|DS480694.1| GENE 186 202318 - 202752 626 144 aa, chain + ## HITS:1 COG:SPy0634 KEGG:ns NR:ns ## COG: SPy0634 COG2893 # Protein_GI_number: 15674706 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific 
component IIA # Organism: Streptococcus pyogenes M1 GAS # 1 137 1 136 145 91 36.0 5e-19 MTGIIVTGHGTFPEGILSAVSLVAGKPDNTEAVNFEMGQSSEDLKDSMAKAMESLEGDEI LILADLVGGTPFNTAAALRKARTDKRIKVIAGVNMAAMVEAVFSRPMYGLDELAVALLAA GREGLRDLDALDSEEEAPEFEDGL >gi|157101630|gb|DS480694.1| GENE 187 202886 - 204076 956 396 aa, chain + ## HITS:1 COG:no KEGG:CPF_0400 NR:ns ## KEGG: CPF_0400 # Name: not_defined # Def: putative glucuronyl hydrolase # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 1 396 1 396 396 604 71.0 1e-171 MIKKIHVEPINRKAEYEAAGFLTREEVTAAMDRMADQVRCNMEYFGTRFPSSATRNQTYG VIDNIEWTDGFWTGLLWLCYEYTGDDAFKNLALKNVDSFLNRVEKRIELDHHDLGFLYSL SCVAGYKLTGSAEGRKAGLLAADKLMERFQEKGGFIQAWGELGARDNYRLIIDCLLNIPL LHWAFLETGNPVYRNAAVRHYEAACNNVIRDDASAYHTFYFDPGTGEPLKGVTRQGYSDD SAWARGQAWGIYGIPLNYRYVKDDSAFNLFKGMTNYFLNRLPEDEVCYWDLIFTDGSNQS RDSSAAAIGVCGIHEMLKYLPEVESDKNTYRHAMHCILRSLMERYTAPEIKPGNPVLLHG VYSWHSGKGVDEGNIWGDYYYMEALMRFYKDWNLYW >gi|157101630|gb|DS480694.1| GENE 188 204136 - 204627 545 163 aa, chain + ## HITS:1 COG:SP0323 KEGG:ns NR:ns ## COG: SP0323 COG3444 # Protein_GI_number: 15900255 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Streptococcus pneumoniae TIGR4 # 1 163 1 163 163 212 63.0 2e-55 MAVPNIVAFRIDERLIHGQGQLWIKALGVNTVIVANDSASEDSMQQTLMKAVVPKTIAMR FFSIQHTCDVIMKASPKQTIFIVCKTPEDALKLVAGGVPVKEINVGNIHHAPGKEEVSKY IALGGEDKAALRQLRDTYGVVFNTRTTASGDDGAARVDLNKYL >gi|157101630|gb|DS480694.1| GENE 189 204662 - 205450 1080 262 aa, chain + ## HITS:1 COG:SP0324 KEGG:ns NR:ns ## COG: SP0324 COG3715 # Protein_GI_number: 15900256 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Streptococcus pneumoniae TIGR4 # 1 253 1 252 259 253 57.0 2e-67 MTISMFQCLLIGLWTAFCLAGMLFGIYTNRCLVMAAGVGLILGDLPTGLAMGAVGELAFM GFGVSQGGSVPPNPMGPGIVGTIIAITMKDSGIDVGSALALSFPFAVAFQFVITATYTFA 
TTLTSYAYKALDKKNFRGFRIAANATVCVFAVVGFIIGFGGAFSSEGLQKVISLIPAWLS AGLGVAGKMLPAIGFAMILNVMAKKELIPFVLFGYIAIAYLNLPVMGVAVIGTAIALLVF FHADKGNGESVEEVEVEFEDGI >gi|157101630|gb|DS480694.1| GENE 190 205434 - 206249 945 271 aa, chain + ## HITS:1 COG:SP0325 KEGG:ns NR:ns ## COG: SP0325 COG3716 # Protein_GI_number: 15900257 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Streptococcus pneumoniae TIGR4 # 6 271 5 272 272 342 66.0 4e-94 MKMAFNQLEKKDYMKTSLRAYFLQNGFNYGNYQGLGYANVLYPALRKMYKDDDDKLQEAL KENIEFFNTNIHFLPFITSLHLVMLENKTPSNEIRNIKMALMGPLAGIGDSLAQFCLAPL FATIGASLAQDGLIMGPILFFVAMNVILLAMKLLTGMWGYKLGTNIIATLSTKMEQISNI ASMIGVTVISGLAVNFVKISTPIQYVANMPGDQQKVVAIQEMIDAIAPKLLPVLFTGLIF YLIKVKKWTTYKLVILTIILGVVLSVLHLIA >gi|157101630|gb|DS480694.1| GENE 191 206400 - 208505 1291 701 aa, chain + ## HITS:1 COG:no KEGG:CPF_0406 NR:ns ## KEGG: CPF_0406 # Name: not_defined # Def: heparinase II/III-like protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 9 700 11 671 672 512 37.0 1e-143 MGDERYVQVRERAVAYAAWYEKERVTAHITENCREEAKQILAQADRLMTHTFSFEDRWDM EPCREPFTLKEMIWDRSPNGDPEWIFMLNRHEYMNKLLIAGWLTGDTSYAEKLKWFLLHW IQANPILPEGTVTTRTIDTGIRCMSWQYLLLHLLGEGLIEEREAGEILESMKDQFASLRK RYIGKYTLSNWGVLQTASICSGYLWFREYLPSDGTEDWAWKELERQIGLQVLDDGAHWEQ SMMYHVEVLLACMKLLASCRAIGRIGSRTDKAGAGYGWGEPGGDGDSRKEFWRSTDWLEK AVDRMSRYVLFATGPDHVQIAQCDSDVTDVRDVMVKAAVLTGDGRYKYAGYETADLDSAW LLGSAGIAGYRAMEGRKPESRSMEAADAGHIFFRSSWEEDSHFTYLKCGPLGSGHGHADL THISLYYRGRPVLADSGRYSYVEEEPLRPFLKSAQAHNVCVIDGESHGRPRGSWGYDSFG QSFKNYYREQGPVHYGEMAYHGCLMSGEHYLVIRKVMAVDQGIWMIVNDIRCDGSHEVKE YYHLDSAVQAARTGPGKDGTGEYWRLCCGGDVSMTVLGSRPFESEPCIISKQYNQKEKST CLVRKTGFTDRITDWTCLLGEGTEAKKTPVFQYGSSLPEPEEQVTAMSFCISPDESWDFL VWNQETWQGGKIHYCKGVPVYAKAAAIHTINGNTTLYRLRI >gi|157101630|gb|DS480694.1| GENE 192 208553 - 209557 1105 334 aa, chain + ## HITS:1 COG:SP0330 KEGG:ns NR:ns ## COG: SP0330 COG1609 # Protein_GI_number: 15900262 # Func_class: K 
Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 2 334 3 333 333 249 38.0 6e-66 MKKLTINEIAERAGVSKTTVSFYLNGKINKMSEETKQRIQHIIDETGYEPSAAARAMKAK SSGVAGVILADTSEGYCARALKGIEEAASALEYQIVVGNSGLAFQHEKDYVERMLRLGVD GFIVQSTYRFGMLASDLEKKKKPVVYLDAKPYDFKGRYVKSNNYDCVYQVISECIKKGYE NFLMISDGDSNISAGFENTQGYKDAIQDAWKEGATQYLQEGVKSDQVYEMLKDQIDLEKK TLIYVASPGLLQVVYQAIRRYPDYMRLFPDTLGLIGFDAEGWTRMTTPTISAIITPAYQE GVRAMEELADILDGKKTDGEVVFKNIVKWRETTL >gi|157101630|gb|DS480694.1| GENE 193 209707 - 210474 743 255 aa, chain + ## HITS:1 COG:BH1553 KEGG:ns NR:ns ## COG: BH1553 COG1349 # Protein_GI_number: 15614116 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Bacillus halodurans # 1 255 1 249 251 129 32.0 4e-30 MKASERLEIIIDEVQTNGIVRVAEISGRLGCSEVTIRNDIRKLDQQGVLKKTYGGAVRKE NGLSVEFIPGEYFLNSDKKYRIAKRAYEYIEGGDSIIIDDSTTGYYLAKYIKEHAEKHLA VVTNSVLSGAELTSARHVDLFIVGGHVIGNPPAALDNITVGAMGQFHVDKAFVGVNGINL KTGLTSMGTPQMDVKREMIRVADEVYVIADSSKFGSRNLFTVCPMSEVDRVITDTEIKKE YVQTARNLGIELDLI >gi|157101630|gb|DS480694.1| GENE 194 210730 - 211779 436 349 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 1 336 1 332 346 172 33 2e-41 MKEKYKPIVGITIGDPAGIGPEIILKALGEKHIYDECRPLVIGCVNVLERAKAVLPQCSL EFNRIESVSDACFRFGSIDVLETGHYDISKLVWGKEQAAAGQIAMDSIRTSIELGLEGEI DAVTTAPINKVAIKMVGVKQAGHTEIYMDGTKAPYVLTMFDCFKMRVFHLSRHISLMNAI RYATKEHVLNDICRIDKELKRLGMETPYIAVAGINPHSGEGGLFGDEEMKEIIPAIEEAR KRGINAIGPVPPDTLFSRGKNGEFDAILAMYHDQGHMPCKTLDLERTVSVTLGLPFLRCS VDHGTAFDIAGKGIATNVSMTAAIDSTVKYAKALHDAKQDENMENHDAK >gi|157101630|gb|DS480694.1| GENE 195 211801 - 213162 974 453 aa, chain + ## HITS:1 COG:FN0227 KEGG:ns NR:ns ## COG: FN0227 COG3395 # Protein_GI_number: 19703572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 5 427 7 409 425 149 26.0 1e-35 
MIGAVTDDFTGAASAGALIARSQARTGLFLDAEALEDSEEARCLDAVFVSSNSRHLKPED AYREVCKATKALKSTGVTCFSKKIDTTLRGRIGNEIDAMLDTLGGDMVAVMVPAMPQSRR ICLNGISIIDGTVLTETPIAQDVKTPVKDAFIPRLIQGQSRRKVELISIENVDRGKGTLK CAMIDARERGGQILVMDAVTLEHIDLIAQTCVELGWNVLAVDPGAFTMKLNYRRGMIKEE VSTGAEGSTGPEEKVALFVVGSANPLTKAQMKYLCSSEANVPVHVSAYMLISGQVQFEEE VNRAVGIAVNLFRQKPRPQSIIIGTALQDCVVDLNDEDLRRGYDSGTCSRLINEGLAEIT GRVMELAGREQVAGLLLTGGDTMESVCRRLHVSYIEAIDHIVPQVDVGRIVGNYTGLPVV VKGGFCGGREIGMAVVSRLLSESAGCREGNGRV >gi|157101630|gb|DS480694.1| GENE 196 213213 - 214085 737 290 aa, chain + ## HITS:1 COG:FN1868 KEGG:ns NR:ns ## COG: FN1868 COG3246 # Protein_GI_number: 19705173 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 289 2 270 272 115 28.0 1e-25 MKKVIIEARVNEYAMRDMNPNVPWTVQELVDEAQRVREAGASIMHFHARTAAGEPNNHAE VYAEIIEKVREKTDILLLPTLGFNSNDKDSSRIKIISELAKKESTKPDIIPLDTGSVNLD QYDEEKGCFTESDSIYTNSTSTLEFCAREMLNYKIKVKMTCWDVGFVRRGRKLIDMGLIT KPGYFLFHLTEGKYITGHPCTVSGVDAMRSVLPDEPCYWSANCLGGNLLNVAPHVLKNGG GLAIGIGDYHYKELGAPKNAELIKRAADMAREVDREIASPDEVKAFFNMN >gi|157101630|gb|DS480694.1| GENE 197 214085 - 215413 874 442 aa, chain + ## HITS:1 COG:BH3897 KEGG:ns NR:ns ## COG: BH3897 COG2610 # Protein_GI_number: 15616459 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Bacillus halodurans # 12 437 1 424 427 223 34.0 6e-58 MAEGKMETYISLVGIVIAFCTLIFLMMRGVNLFITVMIASLIAIVSGNMNIYKTLTENYM TGFADYYRSYYLVFFCGTLLGTTMELSGAATSIAKWVTGKFKSMAYLAIPLATGIICYGG VTAIISIFATFPIALAIFRENNIPRRLMPAALYFGSCTFAMIAPGAPQVQNIIPTQGFGV DLMAGTVVGFISSILMLLIGSFWLSVMIKNAKKAGEVFTSRDDDKDEKSGSLPSPLISFI PMLVTVIAINLKNKAGENIIPIEYALLMGIATTIILMYRRFDRAELTEAVFQSVPKAAVA IFNICTIVGFGTVIKSTPAFELLVDAVIHIPGNYLVGVSVGTALLAGFCGSASGGLGIIT PIFYDIFGAMQGVSYAAIARVMAIASSSLDSLPHSGSVNTSIGLCGESHKSAYIPIFCLS VVTPAICTAVAVILFTLFPGWP >gi|157101630|gb|DS480694.1| GENE 198 215516 - 216142 255 208 aa, chain + ## HITS:1 COG:no 
KEGG:Closa_1315 NR:ns ## KEGG: Closa_1315 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 5 175 8 178 187 126 43.0 5e-28 MKFWAYFLIPVYTVLFTRGYNWFTTNFSVIGNFFDRKKSFFLWGVLVGSYFYMIHRNIKS RAALHPICARLIPAALVLLFCAITTPYLPEELPLKSVLHIVFAFLSAVLVLLYLFWIIRT RYQTEPGAYRPFLYGWGAIVGVSVILLAVAGIISSALEIYVTLTSVAMAQRLACRVAVQE EMPDSGGDTSEKASRRKSLGKRRNHLLT >gi|157101630|gb|DS480694.1| GENE 199 216276 - 217760 1505 494 aa, chain + ## HITS:1 COG:CAC0456 KEGG:ns NR:ns ## COG: CAC0456 COG0466 # Protein_GI_number: 15893747 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Clostridium acetobutylicum # 146 494 210 573 786 165 29.0 2e-40 MPVFDFSDTPKKEEEPQCTYTVYYSNEPAASPEKPMLVLDNSSRAWRHHSCGVFTNQAKR TAFEFKEEDGTVSADILQVDARFTSLLRWLGENHISVRLSGQNRGDGYAVYKIREVAFGG GTKLSAEDGFLQFMIERLLAGSAPEQELAEEDQEENGDDMKLTSIQSLTDFMNCAGRTLP DNIRIWARRNLAVARSHEVTPEERRHAQRALSIMMNIQWKNHYFESIDPVEARRILDEEL YGMERVKQRIIETIIQINRTHTLPAYGMLLVGPAGTGKSQIAYAVARILKLPWTTLDMSS INDSEQLTGSSRIYSNAKPGIIMEAFSMAGESNLVFIINELDKASSGKGNGNPADVLLTL LDNLGFTDNYIECMIPTGGVYPIATANDKSQISAPLMSRFAVIDIPDYTPEEKKIIFSKY AMPKVLKRMGLHENECVVTEDALDAVIEKYADTTGIRDLEQAAEHMAANALYQIEVDNVA AVVFDGDMVRELLG >gi|157101630|gb|DS480694.1| GENE 200 217843 - 218382 549 179 aa, chain + ## HITS:1 COG:SA2335 KEGG:ns NR:ns ## COG: SA2335 COG0350 # Protein_GI_number: 15928126 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Staphylococcus aureus N315 # 1 177 1 171 173 169 50.0 2e-42 MYYTTDYISPVGRIKLAADGERLVGLWLEGQKYFAGTVKEEMTEAPELGIFKDTKDWLDR YFAGKRPESSELLLAPLGGEFRQGVWEILCQISYGQLTTYGDIAKKIAEKMNRETMSAQA VGGAVGHNPISIIIPCHRVVGAAGSLTGYAGGIDKKIWLLKHEGVDMEGLFIPEKGTAL >gi|157101630|gb|DS480694.1| GENE 201 218702 - 219046 78 114 aa, chain - ## HITS:1 COG:CAC2568 KEGG:ns NR:ns ## COG: CAC2568 COG1733 # Protein_GI_number: 15895828 # Func_class: K Transcription # 
Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 18 107 20 108 117 95 50.0 2e-20 MKTEKSVPERCEIIQKVQRVLGGKWKIEILYYIGFQDIHRFGALRRHIGGISESSLINQL RSLEEDGFISRHDYKELPPRVEYSLTDLGQSFMPIMEHVKDWGETHLFPLSSSR >gi|157101630|gb|DS480694.1| GENE 202 219247 - 220050 341 267 aa, chain + ## HITS:1 COG:MA4170 KEGG:ns NR:ns ## COG: MA4170 COG1145 # Protein_GI_number: 20092963 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanosarcina acetivorans str.C2A # 5 227 12 244 294 60 25.0 5e-09 MESVIYYFSATGNSLRIANIMAERMDGLLLPMSRYKGTKCSSRQIGLVFPTYFWGVPRTV AEFVDEMKVEADNPYVFAAATCGGLAGGVLGHLDALLKKKGLHLDYGITIPSVANFIEEY NPKTRSADRKLREADEMAERASAEVIAARRNGPFGFHVWDRIFYKLYTDYKLNRDTGFHV DDTCVRCGICQRICPSRNIVLKNEKPEFQHRCEHCVACINCCPQQAIQWKHATQKRVRYR NPGVSVHDIIEGMGSARDSFISKDRDI >gi|157101630|gb|DS480694.1| GENE 203 220180 - 220458 375 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940227|ref|ZP_02087572.1| ## NR: gi|160940227|ref|ZP_02087572.1| hypothetical protein CLOBOL_05116 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05116 [Clostridium bolteae ATCC BAA-613] # 1 92 43 134 134 186 100.0 5e-46 MTFTYPAVFTPHKNDKGYHVTFPDLQCCEADGPDLEDAVEHAREAAYNWLYLEIEEKTFE FPPQTHMEDIRLEEGEFLKHIMVTVKLLPDND >gi|157101630|gb|DS480694.1| GENE 204 221008 - 222066 1287 352 aa, chain + ## HITS:1 COG:CAC1373 KEGG:ns NR:ns ## COG: CAC1373 COG4822 # Protein_GI_number: 15894652 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiK, Co2+ chelatase # Organism: Clostridium acetobutylicum # 94 348 5 262 278 190 36.0 4e-48 MKMKKLAVFFAAAALTTVTAAGCSGGQKETAADAAKDSEAVTTAEQTTGAKTEVPETTAE AAEAAAEADEENYDTGDASKDNARNQDGIGENELLVVSFGTSYNDSRRLTIGAIEDAVEK AFPDFAVRRGFTSQIIIDHVKSRDNVAIDNVGEALDRAEKNGVRNLVIQPTHLMNGLEYT DLVNEAAEYSDAFDKVAIGKPLLTTEDDFRAVMKAVTEATAQYDDGETAICFMGHGTEAE SNQVYAKMQDMLTEAGYKNYYVGTVEAAPSLDDVLAAVEEGSYKKVVLEPLMIVAGDHAN NDMAGDEEGSWKTAFEDAGYEVTCLVNGLGQLEAIQQLFVEHAQAAVDSLTK >gi|157101630|gb|DS480694.1| 
GENE 205 222143 - 223111 948 322 aa, chain + ## HITS:1 COG:no KEGG:Shel_24810 NR:ns ## KEGG: Shel_24810 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 81 321 60 347 391 151 34.0 3e-35 MKMKLCALLAAIGLCAGNLSGCGGKTPAVADTSAAAKSSVVELSEAELAQEELAEGAYDD APAKNTQNQESGSLGDAVKDGMEPVSADALKDGGYEVRVDSSSGMFRITECELTVQDGAM SAVMTMSGTGYLKLYMGTGAEAEQASEADFIPFVENADGKHTFKVPVEALDKEINCSAFS RKKETWYDRVLVFCSGSLPAEAFADGEAATAKSLKLEDGSYTVAVRLEGGSGRASVETPA ALRVEDGNAFAVITWGSSNYDYMKVDGERLDLISTGGNSSFEIPVRVFDRKMPVIADTIA MSEPHEVEYTLVFDSTTIKKAE >gi|157101630|gb|DS480694.1| GENE 206 223132 - 224307 942 391 aa, chain + ## HITS:1 COG:CAC2441 KEGG:ns NR:ns ## COG: CAC2441 COG0614 # Protein_GI_number: 15895706 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Clostridium acetobutylicum # 55 388 140 467 471 147 28.0 2e-35 MAIAAAGILMTAVCTGCSSAGRTGADPAVANNRSLGSSPAPVPEGWSGIKRTGSMELLYA DQFSVDYYDGGYAMITIKDSGRYLVVPEGKLQPSGLPGDVAVIKQPLENIYLAATSAMDL FCSLDGTDRITLSGTDASGWYIEEARSAMEDGRMLYAGKYNAPDYERILSGSCDLAVEST MIYHSPEVKEQLERLGIPVLVERSSYERHPLGRMEWLKLYGVLLDREEQANSCFAGQVKQ LEPVMAQESTGKTVSFFYISSNGYVNVRKSGDYVAEMIDLAGGTYVPQGLTENENALSTM NMQMESFYASARDADYIIYNSTIDGELDNLSQLLEKSSLLADFKAVKEGNVWCTEKSLFQ ETMGLGDMILDIHRILTEEEPEGLRYMHRLR >gi|157101630|gb|DS480694.1| GENE 207 224324 - 225343 902 339 aa, chain + ## HITS:1 COG:CAC2442 KEGG:ns NR:ns ## COG: CAC2442 COG0609 # Protein_GI_number: 15895707 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Clostridium acetobutylicum # 24 332 26 336 342 235 45.0 1e-61 MIKGTYIKNSISFALLAVMMCFLFIWNINSGSVHLSVEEIGEILFRRTGEAASYRIVWDI RLPRILSAVILGGALSVSGFLLQTFFANPIAGPFVLGISSGAKLVVSLVMIVLLGRGISI GSAGMILAAFAGSMISMGFVLMISGKVKKMSMLVICGVMISYICSAITDFVVTFAEDANI VNLHNWSMGSFSGTSWENVSVMTAVVFLSLILVFLMAKPIGAYQMGEVYAQNMGVNILRF 
RVALILLSSILSACVTAFAGPISFVGIAVPHLVKSLFKTAKPILMIPGCFLGGAVFCLFC DLAARTVFAPTELSISSVTAVFGAPVVIYMMVHNRKGMQ >gi|157101630|gb|DS480694.1| GENE 208 225505 - 226608 955 367 aa, chain + ## HITS:1 COG:CAC2443 KEGG:ns NR:ns ## COG: CAC2443 COG1120 # Protein_GI_number: 15895708 # Func_class: P Inorganic ion transport and metabolism; H Coenzyme transport and metabolism # Function: ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components # Organism: Clostridium acetobutylicum # 9 360 7 357 387 272 38.0 9e-73 MAEDYIWTENMTVGYGKTPLIRQIGIHVRAGEIVTLIGPNGAGKSTILRSVIRRLGLLEG TVYLDGMPMKGMGEREIAKRMSILMTERIHPELMNCEDVVGTGRYPYTGRMGILTAEDRG KVREAMELVHAWDLASRDFSQISDGQKQRILLARAICQDPSVIVLDEPTSFLDIRHKLEL LTILKDLVRRKKVAVLMSLHELDLAQKLSDYIVCVKGEYIERCGTPEEIFTSSYITGLYG ITKGSYYAEFGCLEMEPVKGKPQVFVIGGNGSGIPVYRRLQRMGIPFAAGILHENDVDYP IARALASQVISEMPFEPIREETYDRAAEVLASCGQVICCLKEFGTLNDKNRKLAELGRDK QGADLLV >gi|157101630|gb|DS480694.1| GENE 209 226610 - 227401 715 263 aa, chain - ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 1 114 1 116 117 82 40.0 7e-16 MSTFYKIGEIASLYNISTDILRYYEELGILVPRRAPNGYRIYRTEDLWCLNVIRDLRELG FSMEQIKAYIENRSIDSSLELFQKEMDVIDRHIERLKSLRDNIITRRETISQARGLPLGT ITLKEMPPRACHRIMQSYETDEEMDILIKQLVSFGPDRQYIIGNNQMGSFISLSDAMDGR CQQYDAVFILHPDGEHCIGKGTYLSVCYSGNFRQTRTYVPRLLDYAREHSMNIRGPLLEL LWIDVHTTKHVEEQVTELQLKVD >gi|157101630|gb|DS480694.1| GENE 210 227510 - 228844 1122 444 aa, chain + ## HITS:1 COG:FN1469 KEGG:ns NR:ns ## COG: FN1469 COG0534 # Protein_GI_number: 19704801 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 15 437 13 436 440 229 33.0 1e-59 MNLEGMSLKKKFWRFVWPSVVAQWIFALYTMVDGMFVARGVSEVALSAVNIASPFVNFLF SVSILFAVGTSTVVAIFLGQGRQREANQANTQNMVVTGVLSLFIMLMVWLCLDPIVSFLG ATQSTQEYVKHYILSLLPFTWWFIISYTFETLVKTDGFPRFAAAAVASGALVNCVLDYLF 
VMVFRWGIGGAGAATGLSQMVPVFLYLKHFLGPKATIRFARPVWDLGQFIRVVKIGLSSG FTELSAGFTVFMFNHAILRYIGEQGVVSYTIIAYVNTIVVMSMAGIAQGIQPLISFHYGR QERDVCSRLLKYSLTASVGVAGIAFISSMAGADWLVGIFISEKLEGLRQYSASVFRIFSL SFLVVGFNITGAGYFTAIERPKESLIISLGRGMVIIAASLAVCIGVGGGEGIWWAPAVSE LMCLGVTLILVYLYTRRETFKCIS >gi|157101630|gb|DS480694.1| GENE 211 228971 - 229681 763 236 aa, chain + ## HITS:1 COG:FN0959 KEGG:ns NR:ns ## COG: FN0959 COG2243 # Protein_GI_number: 19704294 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-2 methylase # Organism: Fusobacterium nucleatum # 3 234 12 239 248 131 33.0 1e-30 MGKLYGIGVGPGDPELLTLKAHRILNAADVIFCPEKEAGAGSFAFDIIQGLLEDTKAEIV NLVYPMHYHGTQLREMWKQNAGCIAQYLNGDRTGAFITLGDPAVYSTFMYTLPYIEEAGV EVEVVPGVTSFCAVADSMKIPLVAWNEDLVVAPVRKNSSEDLGRVLREHDNVVLMKPSTD PQALLKALKENHLEDRFVLITKTGTGEERLVTDFEELERYDIPYLSTVIIKKQGRQ >gi|157101630|gb|DS480694.1| GENE 212 229681 - 230526 813 281 aa, chain + ## HITS:1 COG:PH0913 KEGG:ns NR:ns ## COG: PH0913 COG1131 # Protein_GI_number: 14590767 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Pyrococcus horikoshii # 2 242 7 250 324 190 40.0 3e-48 MILCNDLHKQFGDVEAVKGVSMEIPDGMFLGLLGPNGAGKTTLMRIMTGLLAPDTGTVAF DGQVMDRNAVAVKQAIGVTSQHVNLDKELTVEENMEFAGRLYRMGKADIAVSTDRLLAFL GLDSVRKRAAQNLSGGMKRKLMIAKSLIHNPKYLFLDEPTVGIDPNARRDIWDFLHIQHK EGKTILLTTHYIEEAQHLCDYVMLIDGGKIFKEDTPAGLIDEIGQFKVEYEDGDRIKAEF FKDLPAAKSRAAGIDALCSVLPSTLEDVFFHYTSKGVGGWK >gi|157101630|gb|DS480694.1| GENE 213 230592 - 231254 611 220 aa, chain + ## HITS:1 COG:all4219 KEGG:ns NR:ns ## COG: all4219 COG0842 # Protein_GI_number: 17231711 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 1 183 48 237 275 96 32.0 4e-20 MVSPLLYFITFGMGLGRVMEMDGRPYLYYLIPGLLAMTTMRNSYTSVSTRISVMRLHEKS FECYLYSPTRMHLLAMGHILAGALRGMYSGVFIVLLGIVSGARLAMNGWLLTAVILNSLI FSALGFLAAMMMESHYDMNNFTNIVITPMSFLCGTFFSLDGIPEALKWVVNMMPLTHTTR LIREISFGGSVSWPSMASAVLFAAAFTGGCVWTCYREAKA >gi|157101630|gb|DS480694.1| GENE 214 231298 - 232716 1485 472 aa, chain + ## HITS:1 COG:CAC1710 KEGG:ns NR:ns ## COG: CAC1710 COG1625 # Protein_GI_number: 15894987 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase, related to NifB/MoaA family # Organism: Clostridium acetobutylicum # 7 449 2 430 437 397 46.0 1e-110 MKTKKNEHVIKEVYPGSIAQEMEIEPGDILLSINHEVIEDVFDYRYLIKDEYIEVVIRKP DGEEWLLEIDKEYDDDLGVEFENGLMSDYRSCSNKCIFCFIDQMPPGMRETLYFKDDDSR LSFLQGNYITLTNMKERDIERIIRMQLAPINISVQTTNPQLRCKMLNNRFAGDKLKYLQM LYDGHVEMNGQVVCCKNVNDGAELERTIRDLSRYLPFLRSVSVVPAGITKFREGLFPIEL YTKEEAGAVIDMVESRQQEFYEQYGLHFIHASDEWYIIAGRDFPEEERYDGYIQLENGVG MMRMFINEFNEALEDVISKKEYEELAGTVSRTLTIATGKLTYPTICGFAGKLMDAFPHLT VHVYAIRNDFFGETITVSGLITGQDLIGQLKERQAAGEDLGEVLQIPSNMLRVGEQVFLD DLTVRDVERELGMKVVAVESGGREFIDAILDPEYRMERRNDNFVYIQAYDRK >gi|157101630|gb|DS480694.1| GENE 215 232858 - 233721 877 287 aa, chain + ## HITS:1 COG:Cj1202 KEGG:ns NR:ns ## COG: Cj1202 COG0685 # Protein_GI_number: 15792526 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Campylobacter jejuni # 13 285 4 276 282 298 49.0 1e-80 MKTSSLFNHKTVLSLEVFPPKRTAPVDTIYNTLDELKGITPDFISVTYGAGGSENSQTTL KIASAIKHKYGIESVAHIPCLNLTRDDVLLILEQLKEQKIENILALRGDRTADREPAGDF RYAADLVEFIKNHGDFNIIGACYPEGHQESGGLVRDMKHLKDKVDAGVSQLITQLFFDNE YFYRFRERMDLVGIHVPVEAGIMPVVNKKQIERMVSLCGVKLPKKFISMMERYGDNPIAM RDAGIAYAVDQIVDLAAGGVDGIHLYTMNNPYIANRIYESVFRLMAS >gi|157101630|gb|DS480694.1| GENE 216 233735 - 234643 1081 302 aa, chain - ## HITS:1 COG:SPy0898 KEGG:ns NR:ns ## COG: SPy0898 COG0583 # Protein_GI_number: 15674920 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 1 
294 1 294 301 346 56.0 3e-95 MTLQQLRYIVTIVNCGSISEAAKHLFITQPSLSNSVRELEKEMGISIFNRSSKGIALSSQ GMEFLSYARQVLEQAELLEQHYTNKKPVKRICSISTQHYAFAVNAFANLVSCQNTDEYEF TLRETRTYEIIDDVKNYRSELGILYINDFNRKVLEKLFKESGLLFHPLFKASPHVFISVT HPLSARNSVAIEDLEDYPFLAFEQGEKNSFYFSEEILSTIPRKKVIYVSDRATLFNLLIG LNGYTICSGILNRNLNGDSIMAVPLETEENMVIGWIGDPRIHLSEFAGKYLEELHRLISS QS >gi|157101630|gb|DS480694.1| GENE 217 234713 - 235627 411 304 aa, chain - ## HITS:1 COG:no KEGG:Geob_3648 NR:ns ## KEGG: Geob_3648 # Name: not_defined # Def: sugar transferase # Organism: Geobacter_FRC-32 # Pathway: not_defined # 7 288 195 444 456 111 26.0 4e-23 MDIPSAKSLIRRNIPFYLAGAFLLLGMKYYYSRGGPDSLSWILAPTARWVSALSGIPFIK VPQTGYVSHSCRFIIAASCSGLQFLMISMTALVFSYIHRMRTIKGKIGWMALSALASYLL TIFVNGFRILFSIFIPIYLGMSGTAWTDVSGSAWAETAGSSGPAPARAWSIWLTPKQLHT IIGTAVYFTALFAVCQLGEYVSRKCSAAPGTSHRGNSRARAGFYPIRALGRWAAPAFWYF SIVLGIPFLNRAYRNRPQSFTDYALLLTAVCLTVITFYCICSELHRRISRLTSGRYRKNH KNME >gi|157101630|gb|DS480694.1| GENE 218 235631 - 237682 1743 683 aa, chain - ## HITS:1 COG:alr4412 KEGG:ns NR:ns ## COG: alr4412 COG2304 # Protein_GI_number: 17231904 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Nostoc sp. 
PCC 7120 # 42 673 12 653 820 306 29.0 1e-82 MKRKKTWLYVCLTLLMLALTPAAVYGEEPDGDTADKTLAPYFFVQDAAQAADQFPLKDTS VSSTINGVIAETYVTQTYANEGTIPINASYVFPASARVTVHGMKMEIGNEVITAKIKERE EAKTEFEEAKSEGKSASLLEQQRPNVFTMNVANVMPGDTVRIELHYTELISPRDGVYQFV FPTVTGPRYAGSAPETGGGDEWTASPYLEEGTAPQGTYEIAVSLSAGVPISSLTSSSHKI KVEKESESIAHVTLSDSGEFAGDRDFILDYRLTGQEINCGLMLNTGEKENFFMLMVQPPE RVRAEEIPPREYIFVLDVSGSMFGYPLDTAKELIGNLVGNLRDSDQFNLILFSDTAVSMA PKSVPATAENIRQAIDLIERQDGGGGTELAPALEQAVSLPRDPRMARSIVTITDGYMSDE SSIFSLINRNLKTADFFSFGIGTSVNRYLIDGIAKAGSGEAFVVTEPSQASDTACDFSTY IQSPVMTGINVSFDGFDVYDVEPDILPTLFAQRPIVLFGKWRGQPSGTIRITGKTGNQDY MQEIPVGRAEAPADNTAIPYLWARTKVERLTDYGTREDVQDAVRQAMIKKTVTQLGLDYS MMTPYTSFIAVTETVRSQDGSAADVKQPLPLPQHVSNFAVGAGYTIGSEPGTLILMCAAG VIMAAGCIRRSRADRKKRKTEEA >gi|157101630|gb|DS480694.1| GENE 219 237985 - 239166 1153 393 aa, chain + ## HITS:1 COG:MTH760 KEGG:ns NR:ns ## COG: MTH760 COG0025 # Protein_GI_number: 15678785 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters # Organism: Methanothermobacter thermautotrophicus # 4 390 7 394 399 285 47.0 8e-77 MLISIAFMLMLGMFLGWVCRKLRLPGLMGMIFTGVLLGPYALNLIDGSILNISSELRRIA LIIILMRAGLSLDLNDLKKVGRPAVLMCFLPACFEILGMVLLAPRLLGISVLDAAIMGAV VGAVSPAVIVPKMLKLIEDGYGTDKGIPQLLLAGASVDDVFVIVMFTAFTGLAQGGSVSP ISFVKIPVSILIGSFIGLAAGWGLAVYFKRVHIRDTVKVIILLCVSFILVTLEDRYSDIV PFSSLISVMGIGIALQKKREEAARRLSVKFNKVWVCAEIMLFVLVGATVNIHYALSAGVW AVILIFGVLVFRMAGVFCCLAGTRLNMKERVFCMIGYMPKATVQAAIGGVPLAMGLACGD IVLTVAVLAILITAPLGAFLIDLTYKRFLQERL >gi|157101630|gb|DS480694.1| GENE 220 239260 - 240183 724 307 aa, chain + ## HITS:1 COG:sll0784 KEGG:ns NR:ns ## COG: sll0784 COG0388 # Protein_GI_number: 16331918 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Synechocystis # 1 307 6 308 346 181 34.0 1e-45 MKDVKKICKIAVIQAAPVMFDKDACTQKAVDLIQEASRRGAQLMVFPELFIPGYPYGMTF GFRVGSRSGEGRQDWKLYLDNSILVPGKETEKIGMAAREAGAYVSIGVSERDGVTGTLYN TNLFFSPRGSLVCVHRKLKPTGAERVVWGDADRGYFPVEDTPWGVMGSLICWESYMPLAR 
AALYEKGVALYISPNTNDNPEWQSTVQHIALEGRCYFINCNMYITRDMYPENLCCRDEID GLPDTVCRGGSCIVDPYGHYVTEPVWDKEAVIYADLEMDRVSASRMEFDVCGHYSRPDVL RLQIDDR >gi|157101630|gb|DS480694.1| GENE 221 240204 - 243560 2995 1118 aa, chain + ## HITS:1 COG:alr3761_5 KEGG:ns NR:ns ## COG: alr3761_5 COG0642 # Protein_GI_number: 17231253 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. PCC 7120 # 726 949 1 221 245 171 42.0 1e-41 MGVKTESDTGQHQRLEDVKKLASRIMHMHYCEHDIEGITSCFSEQFSWMGAGEQEYLSGR KACADHFRKFRESIPDCNIWDEEYDVVCPMENVYVVMGRMWIATDPGTRMYLKVHQRVTF VFRDTEDGLTCSHIHCSNPYQEMLDGESFPEKIGRQSYEYVQERLNALEEETKQQNRQLE VIMSSIAGGLKISNDDETYSFAFVSKEAAALFGYTVEEFMEVTGGTAVGNVYPPDLPGAL ADCAEAFRDGNLTYSTRYRVRCKDGSLKWIIDSGKKAQDADGNWMVNSLYLDVTRSEEDA QRLREQTQLLTSIYDTVPCGIIRFVRDIEGGYRLVSINRAVLNLMGYKDMEEGLNDWHDG VLGAVLEEDRLILQEAYRKLREIGDRQDQEYRVRWKDGSIHWLEGTSMVVGVTPQGEDIL QRTAVDITQRKILQEQLEREQEMYRVSMEVSSAVMFEYRMDEDVFISYEPKTGQGVLRNE LRDYSRTLLKRQVVHPDDVPTVIDNICNGRSEVFEVRVSVPGGEPGTYIWHRVNSRLMME NGKPSRVVGALHNIHSMKSKLSENSERLYMSQSALQAINGVYMSIFYVNLPEDSYYAVRL PEVRGGAVLPRNGCYSTELCSYILSDVDQADRKRVMSICEREWLLGELAGGNEHIEVEFR HGFSPLWLRLEVHMVASKEGRPRTAIIALRNISAEKQRELEYYDEEKKAKHALEEAYDSL NRANQAKSDFLSRMSHDIRTPMNAIMGMTAIAQSNLNNRDKIEDCLSKISLSGSHLLDLI NKVLDMSKIESGNVGLSEDAFCLEELVEEVSLIVKPDMDSKGQELSISLKEIDHHAVYGD AVRVKQILINLLSNAVKYTSDRGHIAVSLEEKLSSESGVGCFEFVVEDDGIGMAPEFLEK LFMPFERAEDSRVSQVQGTGLGLAITRNLVQMMNGTIRVESQLNRGTRFIATIYLKLAGE EDTGERSQNGNTPRTPASFPPGTCVLLAEDNELNREIVVELLSMFNITAVCAVNGREAVE RFETDPPGTYALILMDIQMPVMDGYTAASAIRSLGKTGRRPDGTGIPIIALTANAFADDV YRAKQAGMDEHVTKPLEISRLLEIMHRYLDVPARNRTE >gi|157101630|gb|DS480694.1| GENE 222 243645 - 243821 63 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940249|ref|ZP_02087594.1| ## NR: gi|160940249|ref|ZP_02087594.1| hypothetical protein CLOBOL_05138 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05138 [Clostridium bolteae ATCC BAA-613] # 1 58 1 58 58 110 100.0 5e-23 MDGAPFMGKPVKNKKTRPAPNCAQAMPYPTYVVNCIEISMLRGSAAEFVIHGIRLLLM 
>gi|157101630|gb|DS480694.1| GENE 223 243896 - 244975 1152 359 aa, chain + ## HITS:1 COG:BS_ytvI KEGG:ns NR:ns ## COG: BS_ytvI COG0628 # Protein_GI_number: 16079968 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus subtilis # 16 341 28 360 371 114 26.0 3e-25 MYFTILLLAAFVLLKYGLPMVAPFVTAFVIAYLLRRPICFASVRLHMNRKITAILMVLLF YSVAGLLFTLLCVKMFSNGKSLVASLPSVYANYAEPVFTGIFGGIEQYLLRMDPSLLGAL EELEGHFVQSAGQMVSGISMSVMGWLSGFASSLPGMFINLLLMVITTFFIAADYELLTGF CLRQLGERPKEVFMEIQSYVVGTLFVCIRSYALIMTITFVELSIGLTLIGVENSLVIAFL IAVFDILPVLGTGGIMIPWTVITALMGSHGLALGLLVVYLVITVVRNIIEPKIVGSQIGL HPVVTLVSMFVGAQLLGVLGLFGFPIGLSLLRYLNETGSIRLFKTAGEESQQLFHKEQI >gi|157101630|gb|DS480694.1| GENE 224 245086 - 245712 424 208 aa, chain + ## HITS:1 COG:BH3001 KEGG:ns NR:ns ## COG: BH3001 COG0110 # Protein_GI_number: 15615563 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Bacillus halodurans # 6 187 3 184 186 210 56.0 1e-54 MQKIMTEKERMLSEKLYIPMDEALAADCARSRRLVRLINGTTEEQAEYRVQLFKELFGRT GENLWVEPPFHCDYGCHISVGENFYANFDCIILDVCDVTIGDNVFLAPRVCIYTAGHPID AGVRRRQLEYGKKVVIGNDVWVGGNTVINPGVTIGDNVVIGSGSVVTKDIPSGVIAAGNP CRVLRPVTEEDTRYWEELEREYRRDRGE >gi|157101630|gb|DS480694.1| GENE 225 245717 - 246781 1085 354 aa, chain + ## HITS:1 COG:FN1711 KEGG:ns NR:ns ## COG: FN1711 COG0275 # Protein_GI_number: 19705032 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Fusobacterium nucleatum # 52 344 9 307 314 174 37.0 2e-43 MEQEGQKHQRRVRYKGTHPKRYEEKYKELQPEKYPETVAKVIQKGSTPVGMHIPIMVNEI LEFLQIQPGQTGLDATLGYGGHTSRMLERLESKGHIYALDVDSIEMEKTRQRLEHMGYGP EILTIRKLNFANIDQIAGEFGPLDFVLADLGVSSMQIDNPSRGFSFKKEGPLDLRLDPLK GEPASERLKGLTQEELTGMLMENSDEPYAEEIARTVMGEIKRGREVATTTRLYEMVDRAL AFIPEEERKEAVKKSCQRTFQALRIDVNSEFEVLYEFLDKLPGVLKPGGRAAILTFHSGE DRIVKKSFKEMYRAGLYSQVAGDVIRPSAEECRMNSRAHSTKMRWAIKAEDSNS 
>gi|157101630|gb|DS480694.1| GENE 226 246832 - 247845 992 337 aa, chain + ## HITS:1 COG:alr4831 KEGG:ns NR:ns ## COG: alr4831 COG0451 # Protein_GI_number: 17232323 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Nostoc sp. PCC 7120 # 1 262 1 247 311 100 27.0 4e-21 MKALFIGGTGTISTAVAALAMDKGWEVTLLNRGSKPVPEGMESMVADIHDEAAVARIMEG RTYDVVAQFIAYGAEDVQRDIRLFQEKTRQYIFISSASAYQKPAAGCPITESTPLINPYW EYSRKKIDAEEVLTAAYRKNGFPVTIVRPSHTYDGKKPPVAIHGHKGNWQILKRILEGKP VIIPGDGTSLWTLTHSADFARGYVGLMGNPHAIGNAYHITSDESMTWNQIYETLAEALDR PLNALHVASDFLAEHGKEYDFAGQLLGDKACTVLFDNTKIKRAVPDFVCTVSMAEGIRSS VRYMMDHPESQTPDPQFDAWCDRIAGAWKAADEAFGK >gi|157101630|gb|DS480694.1| GENE 227 247914 - 248510 401 198 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164512|ref|YP_001467500.1| 50S ribosomal protein L24 (BL23; 12 kDa DNA-binding protein; HPB12) [Campylobacter concisus 13826] # 1 180 1 180 185 159 42 2e-37 MDGCTWCLCNEKMKKYHDEEWGVPLHDDHRQFEFLMMEVMQCGLNWNMMIQKREIFRQCF DGFDYDKIASYEEEDIERILSCEGMIKSRRKVEAVIHNARCFQKVRQEYGSFSSYIWGFS GGKTILYAGHQKGKIPAQNALSQQLSQDLRKRGFKYLGPVTVYSHLQACGVINDHVEGCF RYHDLIEHYPTVKKRDKK >gi|157101630|gb|DS480694.1| GENE 228 248789 - 249427 631 212 aa, chain + ## HITS:1 COG:no KEGG:Rumal_3740 NR:ns ## KEGG: Rumal_3740 # Name: not_defined # Def: lipoprotein # Organism: R.albus # Pathway: not_defined # 28 209 521 704 705 214 55.0 2e-54 MKMRWRVAAVIVCVAALSYCADNRPDLTLAEKPVIYLYPEQAQEVSVRLDYDGRLTDTIP AYGDGWKVTAWPDGRLVDHNDGKEYPYLFWEGEDRTAYDMTKGFVVLGSGTEAFLREKLE HLGLKEKEYEAFIEYWLPRMEGNPYNLITFQKEAYEETAGLLISPKPDSILRVFMVYMPL EQPVQAEEPELAPFERRGFTVVEWGGKEILAE >gi|157101630|gb|DS480694.1| GENE 229 249424 - 250350 717 308 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940257|ref|ZP_02087602.1| ## NR: gi|160940257|ref|ZP_02087602.1| hypothetical protein CLOBOL_05146 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05146 [Clostridium bolteae ATCC BAA-613] # 1 308 1 308 308 595 100.0 1e-168 
MKVTPVVYYKKPGYPTGADMKSRPELLKKVPHRWSGNGVVLTALGMSLALCACAARGDME PGGVKPGNTEPGVGEPEDTGAGGAGQDYRGKTEDGEAHKGLVAPVFIHGEGTGAYGCIAV TAPFFLSEEEAWEIIKSEAESYKGITFSRNTPPVLEDVKLPESRLFSGKTEKKQSLRDGD LCLDGADADNKIAFEFISQDDVYEWRKENEEAAATFTLYDTKGTAEQLREELVSYGPDIA LGIFYDPCNDVDMEGAYDAMAKAEEQGMTFDWEGAKLEQEARARAKSEEELRAQVRDFLD WLKGQGVI >gi|157101630|gb|DS480694.1| GENE 230 250350 - 251483 905 377 aa, chain + ## HITS:1 COG:CAC2279 KEGG:ns NR:ns ## COG: CAC2279 COG0641 # Protein_GI_number: 15895547 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Clostridium acetobutylicum # 5 343 99 452 454 107 25.0 3e-23 MHVTLHLTNACNLACGYCYECHQKDYMTVETAKKAALLAAGKGQSSCGIVFFGGEPLLCR DVIYRTAAACRDMENALNTRFHFKITTNGTLLDREFLDYSRKNRILIALSHDGTKRAHDF FRRHKDGRGSYDELESVTEELLAAHPYAPVMMTVCPETVGEYARGIKELWNKGFRYFICS LNYAGAWDNGSVSQLKSQYSELAEFYYEMTCREEKFYFSPFEVKIASWIKGDAYCHERCE LGLKQISVAPDGVLYPCTQFAGHRGYAIGTVDEGIDQAKRQRLYAQSRGDKPECSGCAVK QRCNCTCGCLNYQVTGSVRQVAPMLCTHERLILPIADKLAARLYRQRNGMFIQKHYNDMF PLISLVEDKTGKGGKEK >gi|157101630|gb|DS480694.1| GENE 231 251480 - 251839 498 119 aa, chain + ## HITS:1 COG:CAC3399 KEGG:ns NR:ns ## COG: CAC3399 COG1733 # Protein_GI_number: 15896640 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 115 1 116 116 121 56.0 2e-28 MKTRTEYTCPLELTHDITKGKWKPIILWQLGQGPHSLAGLKREIKGISQKMLIEQLRELC DYGMVSKTSYEGYPLKVEYALTSRGTKMLEAVVIMQGIGIEMMMEDGKEEFLRKKGLLD >gi|157101630|gb|DS480694.1| GENE 232 252003 - 252419 451 138 aa, chain + ## HITS:1 COG:CAC3491 KEGG:ns NR:ns ## COG: CAC3491 COG3871 # Protein_GI_number: 15896728 # Func_class: R General function prediction only # Function: Uncharacterized stress protein (general stress protein 26) # Organism: Clostridium acetobutylicum # 11 138 13 139 145 114 41.0 4e-26 MRDAETTIGNLADKQTVAFIGAADEEGFPVVKAMLAPRKREGIKVFYFTTNTSSRHTAQY RENPKACIYFCDRRFFRGAMLKGTMEILEDSESKEMIWREGDTMYYPGGVTDPDYCVMRF TAVSGRFYSNFHSEDFEV 
>gi|157101630|gb|DS480694.1| GENE 233 252928 - 253494 629 188 aa, chain + ## HITS:1 COG:PA4580 KEGG:ns NR:ns ## COG: PA4580 COG3236 # Protein_GI_number: 15599776 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 6 182 8 184 184 169 48.0 2e-42 MVYTWEALKKAYIAGQHFKFLFFWGHTPSADGLITEACLSQWWKCSFTADGTVYSCAEQF MMAEKARMFGDRDMLGNIMEAREPKAMKAYGRAVKNFDKETWDAACYELVREGNMAKFSQ NPELWEFLKSTRKRILVEASPRDRIWGIGMGKNNPDAECPLKWKGSNLLGFALTEARDLL LEKEEGMD >gi|157101630|gb|DS480694.1| GENE 234 253496 - 254431 693 311 aa, chain + ## HITS:1 COG:CAC1777 KEGG:ns NR:ns ## COG: CAC1777 COG1051 # Protein_GI_number: 15895053 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Clostridium acetobutylicum # 12 304 7 294 307 211 43.0 1e-54 MGEKKENTPCLRDRNGLTEEEFLKSYNPGHYARPSVAADMAIFTVTDQEESNYRKLPEKN LSILLIQRGGHPYLGCWALPGGFVRPEETTEQAARRELREETGLDYGYMEQLYTFSEPGR DPRTWVMSCSYMALVDCSRLTIQAGDDADNARWFRISYRLTDEHRDYRRSQDTGLVESVE HIQHYELKLWTDDTVLCSGIEKITRKNRQAETVEYKITDNRGLAFDHARIISCALERLRG KVEYTGLALHLMPQEFTLTQLQQVYEVILDKCLLKPAFRRKIASLVEETDSFTEREGHRP SRLYRRKWEEF >gi|157101630|gb|DS480694.1| GENE 235 254435 - 255250 834 271 aa, chain + ## HITS:1 COG:DRB0099 KEGG:ns NR:ns ## COG: DRB0099 COG4295 # Protein_GI_number: 10957515 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 2 269 5 272 285 163 39.0 3e-40 MDRKAIARQTLDIMEKGWYETEGTVVEIRARQQESVKKSVLFTPEQGERLLEQYETVTKK TAKYKCCTWNCSTVDAILKLAGENQCRCAVLNFASAKNPGGGFINGAMAQEESLAASSCL YKTLTAHETYYRMNRACSTMIYTDHAIFSPDVVFFRDGRFGLLKEPVEASVLTLPAVNMG QVILKGEDRALAEQSMKRRMKLALAIFASRGCRNLILGAYGCGVFRNDPVKVAGWWKELL EQYFPGDFDTIVYAVLDRSATQACYRAFRDI >gi|157101630|gb|DS480694.1| GENE 236 255507 - 255803 447 98 aa, chain + ## HITS:1 COG:CAC1398 KEGG:ns NR:ns ## COG: CAC1398 COG0011 # Protein_GI_number: 15894677 # Func_class: S Function unknown # Function: Uncharacterized conserved protein 
# Organism: Clostridium acetobutylicum # 4 98 6 97 97 76 43.0 1e-14 MNASVAIQSLPKAANDEELIRIVDQVIDYIKSTGLNYYVGPFETTIEGDYDQLMDIVKEC QHIAVRAGAPSVAAYIKVSYHPEGDVLTIEKKVTKHHQ >gi|157101630|gb|DS480694.1| GENE 237 255772 - 256545 801 257 aa, chain + ## HITS:1 COG:FN0237 KEGG:ns NR:ns ## COG: FN0237 COG0600 # Protein_GI_number: 19703582 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Fusobacterium nucleatum # 21 257 1 237 241 222 46.0 5e-58 MKRKSPSITSKLWSVSAILAILILWQLSVSLGLVEGFMLPSPVQVAEAFIKEFPALMENA RITLTEAGMGLVLGIATGFGAAVLMDRFDKAYKAFYPIIVLTQTVPAVAIAPLLVLWFGY EMTPKVILIVITTFFPITVGLLTGFRAADPDAVNLLKAMGAGRFQIFCYIKLPGAMGQFF SSLKISASYAVVGAVISEWLGGFGGLGVYMTRVKKAFSFDKMFAVIFLISAISLILMWAV ELLQKKCMPWETSQKQK >gi|157101630|gb|DS480694.1| GENE 238 256592 - 257617 1146 341 aa, chain + ## HITS:1 COG:FN0236 KEGG:ns NR:ns ## COG: FN0236 COG0715 # Protein_GI_number: 19703581 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Fusobacterium nucleatum # 9 340 7 333 334 283 43.0 4e-76 MKNKCASMLLASLCLGVLLTGCGKAPDSGENTTPVQAAKQDVRKLTFVLDWTPNTNHTGI YVAREKGYFKDAGLEVEIVQPPENGAEVLAASGKAQFAMSFQDSLAPAFSGDNPLPVTAV AGVIQHNTSGIISRRGEGMDRPKGLEGKKYATWDLPVEKATIKDVMEADGGDFDKVELIP STVTDEVTALKTKSVDAIWIFYAWAGVKTELEGLETDYFAFADLDPVFDFYTPVIIANNT FLAEEPETARAFLDAVSRGYEFAIEHPREAGEILCKAAPELDPDLVKASQEYLADKYQAD APQWGYIDQERWDAYYAWLNEKGLTETPIEAGTGCSNEYLP >gi|157101630|gb|DS480694.1| GENE 239 257653 - 258399 206 248 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 1 219 11 245 563 84 26 8e-15 MEQLKVEGISKSFDGEPVLADISITLNKGELVSLLGISGGGKTTLFNIIAGLSRPDTGHV LLDGEDVTGCPGRVSYMLQKDLLLPYRTIEDNVALPLIVKGMKKRKAREAAFPYFRQFGL EGTQKKYPAQLSGGMRQRAALLRTYLFSGSVALLDEPFSALDMLTKKSVHNWYLKVMDEI HLSTLFITHDIDEAILLSDRIYLLTGRPGRITEELIIKETKPRDRDFGLTEEFLEYKRHI LRHLETTA >gi|157101630|gb|DS480694.1| GENE 240 258545 - 259705 755 386 aa, chain + ## HITS:1 COG:ydaM_3 KEGG:ns NR:ns ## COG: ydaM_3 COG2199 # Protein_GI_number: 16129302 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Escherichia coli K12 # 232 340 1 104 170 64 35.0 3e-10 MSTKGSSEKTSIYIIDRDYRLVHFNDALKRRFKDVRCGDVCYEKLCGEDRPCQGCPLNRD SDGAVMFYNKAIQQWVEVSSGNMDWPGLGPCRVILVREINEGNKNLFYNLTSISAYDELF ELNFTRDYYKILYHQEGKYVIPAPEGTLSSMVQEVSEHMIHPEDARRFLEFWEPGRIASC LSDQHASRILKGEFRKRKVDGTWCWVAQIVVPILSGAEDEDIVMCFIQDIDEQKRREESL ALGNQLTEEGLDPITGLCRRPVFFRQSAAFLKHVKNPEDYCLMAIDIEHFKLFNDWYGQE AGDRLLAMIGELLNKTQSEEGGLAGYMGGDDFVILLPDSPAVLRRLQEEITRYMKEAVQP DSFRPLASMVLTAGALLSAICTTWLL >gi|157101630|gb|DS480694.1| GENE 241 259684 - 260361 714 225 aa, chain + ## HITS:1 COG:RSc0510_3 KEGG:ns NR:ns ## COG: RSc0510_3 COG2200 # Protein_GI_number: 17545229 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Ralstonia solanacearum # 23 223 4 205 282 148 35.0 1e-35 MYDLAVIALSSVKGNYAQRVSRYDSRMKQKLEEDHRLLSEIQRALENGEITFYAQPKCNM RTGKIVGLESLVRWNHPERGLIAPGVFIPLLENNGLVTKLDLYVWEEVCRNVKKWIDSGR KPVPISVNVSRIDIYTLNVTRVFQELISRYCLDPRLIEIEITESAYVEEYKVITAVVEEL RSAGFTVLMDDFGSGYSSLNMLKDVNVDVLKIDMKFLDMDHESVG >gi|157101630|gb|DS480694.1| GENE 242 260366 - 261898 1569 510 aa, chain + ## HITS:1 COG:aq_1442_6 KEGG:ns NR:ns ## COG: aq_1442_6 COG2200 # Protein_GI_number: 15606615 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Aquifex aeolicus # 1 62 191 252 254 66 43.0 1e-10 MGILEAITRMANIVGIRMIAEGVESKEQMELLQDMGCTYGQGYYFYHPMPIEVFEQILSD EANIDFRGIKARQIERIRLQELMNGDMVSDAMMNNILGAVAFYDLYDGRLELLRVNEQYC 
SVTRTTGMDLEEVRKTILGTVFEDDRDQVMEIFSRARQNPIKGAEGELRSRRGDGTSMWM HLRVFFLREQDGHWLYYGAISDISQQKQREQQLEASQRALSSAVHISEKDESFMTLTEEN RRAAAAIFAQMTPGGMIGGYCEEGFPLYFANHEMVKLLGYETFEELSEAIRGKVVNTIHP DDRQRVAADIGPRYYTGLEYTTTYRMPKKDGTWFWTLDKGRVIEAEDGRLAIISACTDIS ESMEAQRLLRERNSALARENAELNILNNDMPGGYHRCARTKDFDFTYISSRFLEIFGYSR QQIKDLFDNKFINMIHPEDRRRVTGEVDCLACQVSCGIMVYRMQSSRGYIWIMDQTNYLI YEDREYLQGVVTDLSELQRIMDSTKSIKLP >gi|157101630|gb|DS480694.1| GENE 243 262157 - 262909 430 250 aa, chain + ## HITS:1 COG:MA4170 KEGG:ns NR:ns ## COG: MA4170 COG1145 # Protein_GI_number: 20092963 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanosarcina acetivorans str.C2A # 2 235 12 247 294 70 24.0 2e-12 MILYFSGTGNSEYTAKRIREETDDNIMNLGEKIRSRDFSSIHSDTPWVVAAPTYAWRIPR ILQEWLEKTELTGNRTIYFVMTCGGNIGNAGRYLKKLTDAKGLNYAGCAQIIMPENYIAM FATPTQEEAAGIIRGSLRAIDEAARQIKDSRPFPQLPITRKDKINSSIVNILFYPMFVHA KKFQATSACTSCGKCAVICPLNNIRLTNGTPVWGKNCTHCMACINRCPSQAIEYGKHTQT LPRYTCPKSI >gi|157101630|gb|DS480694.1| GENE 244 263151 - 264173 1391 340 aa, chain - ## HITS:1 COG:BH3254 KEGG:ns NR:ns ## COG: BH3254 COG3641 # Protein_GI_number: 15615816 # Func_class: R General function prediction only # Function: Predicted membrane protein, putative toxin regulator # Organism: Bacillus halodurans # 8 339 3 334 336 262 51.0 8e-70 MAKSSVKEFLKRKNVNITVQTYLIDALGAMAFGLFASLLIGTIFGTLGQQFQIEIFNVIA DYAKSATGAALGIAIAHALKAPQLVLFSAATVGIAGNALGGPVGALVATIVAAELGKIVS KETRLDIIVTPGITIISGVLVAQFVGPGVAGFMNWFGNLVKTATEMQPFIMGILVSALIG IALTLPISSAAICIALSLDGLAGGAATAGCCAQMIGFAVMSYRENKIGGLMAQGLGTSML QMGNIVLNPRIWIPPTLASMITGPIATMVFRLENIPAGSGMGTCGLVGPIGVYTAMGGGT SMWVGILLVCFLLPAVLTFVFGEVLRRMGWIKDGDLKLDL >gi|157101630|gb|DS480694.1| GENE 245 264124 - 264267 95 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFTFFRFRNSFTLDFAILIFSLYGAEECIGIYTLRSLYFVRDFITYL >gi|157101630|gb|DS480694.1| GENE 246 264381 - 264923 169 180 aa, chain + ## HITS:1 COG:CAC2438 KEGG:ns NR:ns ## COG: CAC2438 COG0671 # Protein_GI_number: 15895703 # Func_class: I Lipid 
transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Clostridium acetobutylicum # 19 155 24 160 180 95 35.0 3e-20 MDVIEFQILYTLQELRTPLVDGLMVFITSLGDHGWFWILMGVLLFSFPRTRILGGCMLIS IAAGFLLGNVMLKNIAARQRPCWLDPSVELLVAVPKDFSFPSGHSLVSFEGAVCIFLFNR KWGIPALMLAVLTAFSRLYLFVHFPTDVLAGIVMGTVIAWSVVRTAKRRMKKTDRMSGKP >gi|157101630|gb|DS480694.1| GENE 247 264988 - 266013 1258 341 aa, chain + ## HITS:1 COG:FN1279 KEGG:ns NR:ns ## COG: FN1279 COG0491 # Protein_GI_number: 19704614 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Fusobacterium nucleatum # 3 278 4 263 326 100 30.0 3e-21 MLERAPGIFEVELHSNQKSIDEIKIFLIPGKDGERSLMVDAGFRSRTCLHVMEEALGKLG ITYDRLDIFLTHKHHDHCGLASEYAARGARLFMNPQEDRHCYDCLYYNHSHGSTEDQPQV LQSVGVTEDGTPEIWNMFMEVNQRIRENRGWEFEIPGYPYTPVSEGRILSYGDYDLETVG LKGHTYGQMGLYDRNHKILFCADQVIDGIVPIVGTTYPDEHLLKGYFDSLEKLKHQYVDC LILPAHKEPIRDVKRVVDRIVFAYLDKTDLIKHILDHGHHRMTTKEVACLAYGIDHVPRD QSEFIKLKMVISKTFSCLEYLYDEDFAIRTLENGTYYWEAP >gi|157101630|gb|DS480694.1| GENE 248 266441 - 267013 684 190 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0096 NR:ns ## KEGG: EUBREC_0096 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 182 1 182 190 211 64.0 1e-53 MKKLAELFGSSSRELKSVRTITVCAMLAAAAIILVNFRIELTNTQRIGFSGIPNQLVAYL FGPAVSCLFAGALDIIKYLIKPVGAFFPGLTLVTMLAGLIYGFFFYKKPLKLTRVLAAHF LVCLVCNVILNTLCLSILYGQAFMILLPPRVIQNIVKWPIDSFIFFNVAKMLEMSGVFRV VRGSRAAVQR >gi|157101630|gb|DS480694.1| GENE 249 267360 - 268628 1519 422 aa, chain + ## HITS:1 COG:CAC0764 KEGG:ns NR:ns ## COG: CAC0764 COG0493 # Protein_GI_number: 15894051 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Clostridium acetobutylicum # 7 403 9 409 411 394 49.0 1e-109 MAVHVIDEANRCLQCKKPMCQQGCPIHTPIPAMIKALKEGGLEEAGAMLFSNNPMSLVCS 
LVCNHERQCEGHCVLGRKGQPVHVSSIENYISDTCLERIDVQCQPKNGKKVAVIGAGPAG ITIAILLTKKGYSVTIFDSRDKIGGVLQYGIPEFRLPKAILERYKKKLLSMGIRIRPNTT IGGALEIKDLFRDGYESIFIGTGVWRPKKLGIKGESLGNVHFAIDYLANPDAYDLGDRVA VIGVGNSAMDVARTVIRKGSHHVTLYARGSHSDASLHETQYAMLDGAKFEFSRNIVEIND NGPVFQQILYDEEGNVTGYSGEKEQVYADSVIISISQGPKSKLVNTTEGLKATRNGLLQT DEQGETTVEGIFASGDVVLGAKTVVEAVAYSKTVADAMDAYMKEKAGKEKNEKDGKGLAG RE >gi|157101630|gb|DS480694.1| GENE 250 268675 - 269526 869 283 aa, chain + ## HITS:1 COG:SMc03811 KEGG:ns NR:ns ## COG: SMc03811 COG1082 # Protein_GI_number: 15966947 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Sinorhizobium meliloti # 54 251 49 243 276 70 25.0 4e-12 MKLNTGMRCHDLCPKMEMEKLFAQVREHDIRQIQLAFGKSISDYDFTAGHYSSGFARTIA RELEKNQIHVAVLGCYINPVNPIESIRQAEVARFIEHMKYARIIGADMVGTETGRLDPDF RVTEESYTEDAYQLLLRSMREIVAAAEKLGVIVGVEGVFNHTLYSPARMKRFLEDIDSPN VEVILDAVNLIHPEQAEPEAQKEVIDRAFAYYGDRIGALHVKDFVFDGQEQLFRHVGDGL FQYEPLMRHVKERKPHIAMLLENSSRERYHEDVKFLQTIYDQV >gi|157101630|gb|DS480694.1| GENE 251 269614 - 270993 1566 459 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 11 440 6 432 448 279 37.0 8e-75 MEKNKEAKEVDLGTGSVGKLLFQLALPAITAQIINVLYNMVDRMYIGHLPGEGANALTGV GVTFPVIMAISAFAALVSMGGAPRASIMLGKGRKDEAENILGNCTTALITVAVVLTAFFL IFGRRILLMFGASGNTIEYGWAYMQIYSLGTIFVQLALGLNAFINAQGYAKTGMYTVLIG AICNIILDPILMFVFHMGVRGAALATIISQGISAAWVVLFLISNKSYLKIRTCFMRPKRE ILMPAIALGAAPFVMQFTESILNICFNTSLLKYGGDVAVGAMTIMGSVMQFSLLPLQGLT QGAQPIISFNYGAKKLDRVSRTFRLLLTSCMTYSTVLWAVAMFVPMVYIRIFTQDAALTA FSEWSIRIYMAASLLFGAQLACQQTFIAIGDSKTSLFLALLRKVILLIPLIYVLPNLFEN KVFAVFVAEPIADTIAVCTTVTLFLLSFRKMKRELGANA >gi|157101630|gb|DS480694.1| GENE 252 271382 - 272377 1047 331 aa, chain + ## HITS:1 COG:ECs2902 KEGG:ns NR:ns ## COG: ECs2902 COG1397 # Protein_GI_number: 15832156 # Func_class: O Posttranslational modification, 
protein turnover, chaperones # Function: ADP-ribosylglycohydrolase # Organism: Escherichia coli O157:H7 # 3 329 4 330 334 258 43.0 1e-68 MRDKIAGALYGMALGDAMGMPAELWGRKRVKAFFGRIEGFLDGPAENDVAFNYKKGQFTD DTGQALVLLDSIESTDYEPDTRDIALRMLAWAEKENAFENNILGPTSKVALANFRDNIQD SRITDKALSNGSAMRIAPIGCLFRPEQKEDIARYVYLVSRATHTSDVTIAGAAMIAEAVS SAIVKDDFEKVMEDVFGIEEIGYGMGCETFSPRLGERLRIGLDIAGSYKGDDEGFIHKLY DVVGTGVGIIESVPAALSVAYYAQDPNLSCLLCANMAGDTDTIGAMATAVCGAFTGMGRI KPEYIRTLRQQNDADFDHYIRILEKGRERFA >gi|157101630|gb|DS480694.1| GENE 253 272374 - 273327 1018 317 aa, chain + ## HITS:1 COG:ECs2903 KEGG:ns NR:ns ## COG: ECs2903 COG0524 # Protein_GI_number: 15832157 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 4 288 20 304 321 175 34.0 1e-43 MKTLVIGAAIIDIIMKIKRLPKSGEDILCSETVSAVGGCAYNVAGTLRGFGVDHDLFVPV GRGMYGDMIAGDLEKLGYQILIREEESDNGYCLCLVEEDGERTFITVKGAEGRFRPSWFE QLSQDAYDSIYVAGYQVCGTSGGVISDWMAGAKGRMKEKRVFFAPGPVITDIDQAVMERI LSVGPILHLNEKEAFDYAKQPSVEDCLKYLYGLNHNLVVVTMGASGTMYYDGSVMRQVPA YKTQVKDTIGAGDSHVAAMIAGYSKGLDTEQCVRLANRVASAIVSIQGPVMTGEMFEQQD FAPYILNGCSKHTNVEG >gi|157101630|gb|DS480694.1| GENE 254 273333 - 274019 794 228 aa, chain + ## HITS:1 COG:PM1838 KEGG:ns NR:ns ## COG: PM1838 COG3201 # Protein_GI_number: 15603703 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinamide mononucleotide transporter # Organism: Pasteurella multocida # 6 224 9 227 234 108 33.0 9e-24 MNKIKEFFRDWSIFEKVWLIFVCSLMTVIWFINGDTPFMLVLSLTGSLNLVLGAKGKVAG LYFAIINSAMYAYQCLGIPLYGEVMYNVLYSIPVSATAIYLWKKNATSDGEVRFRTMTLK LVAGVAIATAVGVWGYSELLKYMGGSFAFMDSLTTVVSVIASMLYLLRYSEQWLMWVIVN ALSIIMWIMVFVSGDTTALLIIVMKTVNLLNSLYGYMNWRKISQKTAE >gi|157101630|gb|DS480694.1| GENE 255 274105 - 274830 781 241 aa, chain + ## HITS:1 COG:STM3602 KEGG:ns NR:ns ## COG: STM3602 COG2188 # Protein_GI_number: 16766888 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 8 207 14 213 239 123 36.0 
3e-28 MREKLAKYEQIRQDIIHKIESMEYRPNQVIPSENELCASYGVSRITVRKAIDELVHEGLL YRIKGKGSFVRDHGSEGLSRIYSFTEAILHQGKTPSKKLLSLKVEEADADIRRRMGLEEG EEVYVIKSIYYADGRPYCINTSILPRKLFPKLELFDFNHNSLYEVLKSFYQLSFTKARQI LNATVGSSEIYGYLETEQNQPLLRINAASFCLYHDNETVFEIYESYILTDILSYYVEKYN T >gi|157101630|gb|DS480694.1| GENE 256 274883 - 276385 1308 500 aa, chain - ## HITS:1 COG:BH1233 KEGG:ns NR:ns ## COG: BH1233 COG2244 # Protein_GI_number: 15613796 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Bacillus halodurans # 3 486 6 491 522 149 26.0 2e-35 MAFITGTLLLTSTGFICRILGFFYRIFMSRTIGAEGLGIYNMVHPIFSICFAVCAGSIQT ALSQYVAANQTRGRAVFRTGLVIAMGMSFLLAWVIYGNAGFLAEKLLLEPRCAPYLPVMA VSVPFAALHACINGYYYGMQKARVPAFSQVAEQVIRMGAVFLIASIWLESGRQITVSLAV YGHLIGEMASAVFTVFCLGFFPPCKDGDSRRAPAIPLSFGATAAPLMALALPLMGNRLIL NVLGSAEAIWIPNRLMSSGLTNSEALSVYGVLTSMALPFILFPSAITNSMAVLLLPTVAE AQADGNEARISSMISMSLRYSCYMGVLCIGIFTIFGNQLGVSVFHDQNAGTYITILSWLC PFMYLATTMGSILNGLGRTSSTFFQNVFAMVIRLAFVLFAIPRYGILGYLWGMLVSELAL ALMSFLAVKRLVPFCWDTVNMIVKPVLLLLISIGMYLAFSQSCFWLKELPLFIMTAVQIL FLSFSYMGLLLLFHKKRPLP >gi|157101630|gb|DS480694.1| GENE 257 276516 - 277880 1307 454 aa, chain - ## HITS:1 COG:MA2050 KEGG:ns NR:ns ## COG: MA2050 COG0534 # Protein_GI_number: 20090897 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 9 450 3 448 468 217 33.0 4e-56 MSSNESENERYEMLVNGDIHKVIPRLAVPTIISMMVSAIYNMADTFFVSQLGTSASGAVG IIFSAMAIIQALSFTIGMGSGNYMARCLGAGRHEDGERAVSVAFFTGFLLGTCLGVFCLF HIDQVVMLLGATETIKPYAVEYARYIFLATPFMMCSFIMNNLLRLQGLAAFAMVGITAGG ILNIALDPIFIFGLGLGTSGAAIATGLSQFISFCILLAQCNLRRECIHIRIGNFRPSLFL YGKILSGGLPSLTRQGMASIATIVLNTTAHPFGDAAIAAMAIVNRLIYFVNSAVIGFGQG FQPVCGFSYGARKYSRVRQAYWFCVKFTTCILVVVCAGGFLVSGHVISLFRKNDPEVIAI GTLALRLQIATMWMNGFLTMSNMMSQSIGYGFRATLLAISRQGLFLIPVLLIAAHFLGLL GIQLAQPVADIATLVLTLVIIRGIFREFNKLEGL >gi|157101630|gb|DS480694.1| GENE 258 277877 - 278692 425 271 aa, chain - ## HITS:1 COG:no 
KEGG:Asuc_1312 NR:ns ## KEGG: Asuc_1312 # Name: not_defined # Def: MerR family transcriptional regulator # Organism: A.succinogenes # Pathway: not_defined # 19 257 21 256 260 84 26.0 5e-15 MFINTKDGVMPKKNYIGKIADLFSVSIATLRYWGKEGLVKFDHSEENGYRTWSMHTLRTL CDITLYRQLSIPIQDLKHIHELNAEEIYSLLQCRREYICDMIEDLQDKLKHIDIKSDQVN TVRTMLYSEPSFLSLSLPAICAYELLNKENLSHLYDHEKELVILIQPQTPSRYDYGRFTA VPANPGTLLRPKDDAPRVYLHVLLKSSYEDPADNDLAIYYDYLKMHNYRPGPAIGRVLVS ACEGRLMNYYDTYIEAIPFKTSTIFQKDGIS >gi|157101630|gb|DS480694.1| GENE 259 278845 - 280071 948 408 aa, chain + ## HITS:1 COG:no KEGG:OB0304 NR:ns ## KEGG: OB0304 # Name: not_defined # Def: hypothetical protein # Organism: O.iheyensis # Pathway: not_defined # 10 405 5 388 388 118 25.0 5e-25 MKGSDVKVAPVIAIVICFAFVSLGDIVSLKTKAKFPSLAVAIVAYLIAIWCGMPKTYPDV SGLAALGDILFPVFVAGLATSILPISIVKNWKFVIIGCIAVLADFLCTVVVGGLFYDPRE MFAAAMTTCGSGLTGGLIVLDRLKGTGLTEMVTIPLLLACTIDAIGQPVGSLITRKYAGK LIASDAYLTDKIVDHSDGGKLNKWGAPFNSSENPSPRFCAWIPPKYETEAVAFLQLIVVT ALAMWLGGITGLGWSIVLILLGFGGTFIGLFRMNMLDRTQTGGFVMAAIYALLFQMLNDM TLQEIMAKIVPLLLVILLSGVGLVLGGMLGAKVFGFDPWLGASATIGLFYLFPGVRNVIN EVARSLSRNEEERLYLVEKISPSCIIAASMGSKFCLLAGVLLMPLIIR >gi|157101630|gb|DS480694.1| GENE 260 280116 - 280919 692 267 aa, chain + ## HITS:1 COG:mll6661 KEGG:ns NR:ns ## COG: mll6661 COG2362 # Protein_GI_number: 13475561 # Func_class: E Amino acid transport and metabolism # Function: D-aminopeptidase # Organism: Mesorhizobium loti # 1 266 1 265 265 185 38.0 8e-47 MKVYISVDIEGTANAVGWDSTAPGGLDYERNRLEMTKEAVAAARGAHAAGADEIVIKDAH GHGNNILPEYMPEYVELIRSYTYAPDVMVEGIDESFDAAFYVGYHSAAGCEGNNLAHTIS HSKVHSIKVNGEIASEFMIYSYMAAYRNVPSVLLTGDRSLCQTGKKYHPCLVTVPVKDDI GGRNRGISGELACKQIEKAAKHALEQDLAEARIVLPGHFDVELCFKDHTLAKSGSYYPGA TQKDPYTITFQSDDWYDVGRFLIFTIL >gi|157101630|gb|DS480694.1| GENE 261 281105 - 281470 367 121 aa, chain - ## HITS:1 COG:FN1904 KEGG:ns NR:ns ## COG: FN1904 COG1733 # Protein_GI_number: 19705209 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 12 
100 41 129 148 102 49.0 3e-22 MPDIKIVSSEIEVTLKVIGGKWKPLILHYLQYGGVKRYNEILRYLGTAPKKTLTAQLREL EEDGIIDRTVIPTLPVQVEYSITEHGKTLFPILSLMCDWGYKNMGDRYQLTHPTCGCEDE E >gi|157101630|gb|DS480694.1| GENE 262 281760 - 282578 711 272 aa, chain - ## HITS:1 COG:lin0819 KEGG:ns NR:ns ## COG: lin0819 COG0656 # Protein_GI_number: 16799893 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Listeria innocua # 4 272 7 274 274 284 50.0 1e-76 MHYVQLNQGNKIPQLGLGVYQTADGQQTMDAVCWALEAGYRHIDTAKIYGNEKSVGDAIR ESKIPRQQIFLTTKLWNDDIRAGRTEEAFNESLAALQTDYIDLYLIHWPAEGFTEAWTAM EKLYKQGRIRAIGVSNFHKCHLDELEKRADVMPAVNQVESHPYFNNNELIDYCFSRDIAV EVWSPLGGTGGNLLQDETLVRLAQKYGRTPAQIVLRWDIQRNVVVIPKSTHRERIVSNQS IFDFELSDQDMEAVTRLNRAARVGADPEHFNF >gi|157101630|gb|DS480694.1| GENE 263 282597 - 283802 997 401 aa, chain - ## HITS:1 COG:FN1415 KEGG:ns NR:ns ## COG: FN1415 COG1979 # Protein_GI_number: 19704747 # Func_class: C Energy production and conversion # Function: Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Fusobacterium nucleatum # 1 391 1 384 385 318 42.0 1e-86 MHNFQYYTPTKVVFGKDTEKQTGSLVKEQGGKKVLIHYGGGSVIRSGLLDRVKASLAAEH MDFVELGGAVPNPRLGLVYEGIELCKKEGVDFILAVGGGSAIDSAKAIGYGAVNEGDVWD FYDYKRQPSHCLPIGVVLTIAATGSEMSDSSVITKEDGWIKRGYSNDLSRPRFAVMNPDL TKTLPDYQTACGCTDILMHTMERYFTNGGNMEITDSMAEALMRTVIKNAKILAEDPLNYD ARAEILWAGSLSHNGLTGCGNAGGDFASHALEHEIGGLFDVAHGAGLAAIWGSWARYVYK NCLPRFHRFAVNVMGIEDLGAPDDTALKGICAMEDFYRAIHMPTSLQELGIAPTDGDLKT MARKCAVASGGQKGSAMVLHEDDMLAIYRMADHGPEKTAAH >gi|157101630|gb|DS480694.1| GENE 264 283990 - 284358 213 122 aa, chain + ## HITS:1 COG:BH2723 KEGG:ns NR:ns ## COG: BH2723 COG3250 # Protein_GI_number: 15615286 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Bacillus halodurans # 2 118 914 1013 1014 65 37.0 2e-11 MPQEYGCHADTHWVKLYGPADGAGTAFAGESGAAGQRSRCLEISMEEKPFYFSAIPYTPQ 
ELESALHREELPAPRRTVVSILGAMRGVGGIDSWGSDVEPAYHVPADEDIEYGFVIRRGQ DV >gi|157101630|gb|DS480694.1| GENE 265 284620 - 286425 2140 601 aa, chain + ## HITS:1 COG:BS_yesM KEGG:ns NR:ns ## COG: BS_yesM COG2972 # Protein_GI_number: 16077762 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 193 595 203 575 577 169 32.0 2e-41 MGGIMRELIWGKTLRSRLIRYNFFIIWVIAIFVSVCTYITASRKTLEVAKNSLQHHVESI SYRYQLAYEEMINIVLNCTERKTFNLGEIGRGLTPGARKKGLDYAELIKDYCAISEYGDY IVKLSVFDDRGGMVQTGSSFGSTDDAKGILDGGWLDRELDKTMDSYELDLVDSPFFQHEG QVLPIARTLTGSRSVPGGWVVLCLSTDLFQDVLKDTSGGQEAVVVTSEGKLIASLYEPED HREENDRVIRQLQQSGQDKGLLEMKIHGRNSLVAFEQYDRSGVMVYETITLDSLSNDRLM IIQTVIIMFAACLIFGFILSVLITNQIKKPIDRLVRHIGHVAAGNFTRDPDIEGEDEIGT VGKVVNDMSERIDTLMAERLEHEKEKGDLELKMLQAQINPHFLYNTLDSIRWIAVIQKNS GIVKMVTALSGLLKNMAKGFNEKVTLQKELDFLGDYVTIEKVKYVELFDLEVRVDEPELY EALVIKLTLQPLVENAIFNGIEPGGKHGTILIHAYREGDVFIIKVRDNGVGIAREKMASI LNHTEKVKGSSMSGIGLPNVDRRIKLNYGEEYGLTIESQEGEFTEITIKMPLEYECEGES R >gi|157101630|gb|DS480694.1| GENE 266 286448 - 288094 1845 548 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 544 1 525 525 153 25.0 1e-36 MYRVMIVDDEPLILAGIASLLDWKEYGCEISGKAANGQQALKLMEEQKPDIVITDIKMPG MDGIGFMKAVKERGWDDVIFILLTNLEEFSLARQALSLGAVEYLVKMELTEEKLADSLKL AMERREMKRKAEAAGTAVTVSREEAVRGYIEKLLTDGGTFSGGASSGNAGEASAQNQGEG YDSCLRRPVLAIISFNYGYEGFSSDFTREDQKKVISFAENIIEQMVKGYFDHSCLVRREL NSLVLVMSTDGIEDYREQIRSLGEKIISVVKDYFEVSVSVAVSSRKESLGEFGALLYEAM SATNHYFYHSLDPVVFYSEECETSARHTGSFHIGFLKKDLSQAVALNDSGRLEEILNQVA CLLREHNPSRQQAVNACANLYFFLSSFFEDGEEPDFPYEVNIMEKLGRLGTLGQIIQWIN WFKEAVSRILERRRDTRVDKIAEMVREYVMEHYKERITLGQAAEALNISQGYLSTAFKKQ SGESFTNYVSAIKIEKAKELIASHQYMMYEVSDLLGFDTPFYFSKVFKKVTGMSPKEYEA QCLKQKKL >gi|157101630|gb|DS480694.1| 
GENE 267 288236 - 289600 1667 454 aa, chain + ## HITS:1 COG:BH1864 KEGG:ns NR:ns ## COG: BH1864 COG1653 # Protein_GI_number: 15614427 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 2 398 3 382 461 114 27.0 6e-25 MRLKKTVALALVTAMAATMAAGCGSSSSDKGSAAPESTAAGSEAAGGEADSKETEKGAAS GEKQKLTVSVWDNDATPQFTSVTKAFMEKYPDVEVEYIDAPSDEYANKVTVMLAGGDADP DVIFIKDTENQVTMKDKGQILNLDEYIAKDGIDLTMYNGVADQLKMDGSSYTLPFRKDWY MLYFNKDLFDKAGVPYPSADMTWDEYEELAKKMTSGEGSAKVYGTHNHTWQALVSNWAVQ DGKNTLMSEDYSFLKPYYEQALRMQDEGVVQSYANLKTGSIHYISVFEQQQCAMIPMGSW FIGTLTQDKDAGKFDFNWGVTAIPHPADTKAGATVGSTTPVAINAKSDVPDLAWEYVKFV TGEEGAQVLADNGIFPAAESQAISDKLAAIPGFPEDGQDALKTTSWVLDRPLDAKMAAVR KVLEEEHDLIMIGEEDVDTGIANMNQRAQEARED >gi|157101630|gb|DS480694.1| GENE 268 289850 - 290617 943 255 aa, chain + ## HITS:1 COG:lin0760 KEGG:ns NR:ns ## COG: lin0760 COG1175 # Protein_GI_number: 16799834 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 12 250 55 292 296 192 44.0 4e-49 MKWDGSSTPMEFVGLKNFVKLFHDSGFRISVTNTLYYAVFTVPLTMVAALLIAVLLNSKI KGIVVYRTAFFFPYVASLVAVGAVWNMLFQPEFGPVNEFLKFIGIANPPKWCASTDWAMT VIIIVSVWKYMGYYMVVYLAALQGISQDLYEAASIDGATGFKKFRYITIPMLKPTTFFVI IMMTIQCFKVFDLIYVMTGGGPGRSTNVLVNHIYNAAFVDFKFGYASASALVLFAIVLVV TLVQFQGEKKFTDYV >gi|157101630|gb|DS480694.1| GENE 269 290630 - 291457 982 275 aa, chain + ## HITS:1 COG:lin0219 KEGG:ns NR:ns ## COG: lin0219 COG0395 # Protein_GI_number: 16799296 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 6 274 13 281 282 194 39.0 1e-49 METRTLKSKIGIYMLLSVLAVLVVIPFGWMLLASVKTSNEVFSIPMKLFPKEFQWVNYQT IWDKIPLLVFFKNTFKIAVCTTLLQVLTSCFAAYGFSKVEFKGRDFLFLVSVTTFAVPWQ AYMVPQFALISRMGLTDSHLGLILMQAFSGFGVFLIRQFMIGVPNELLEAARIDGLSEYG IFAKIALPLVKPGIATLVIFTFVNIWNDFMGPLIYLNSTKLKTIQLGIRMFISQYGADYA 
LIMAASVCSLVPVLIVFMVCQKWFVEGITAGGVKG >gi|157101630|gb|DS480694.1| GENE 270 291571 - 293658 1430 695 aa, chain + ## HITS:1 COG:no KEGG:Closa_4172 NR:ns ## KEGG: Closa_4172 # Name: not_defined # Def: Heparinase II/III family protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 695 18 672 672 577 42.0 1e-163 MENKKIDMKWCSGYCRQYWPEECAHILRIADDAVAQRFLFDLPWDMEQTVKAVTFEGKID WRHMPEGDPEFIYQFNRHRYWICMGQAYALTGDGKYARCFVNQLYSWLEDNPINQETKNT TWRTIEAGIRGENWVKAMEYFEDCPVVTRAVRERFLEGLRVHGEYLMDCRVPFSDKSNWG VLESHGLFAIGAYLMKCRGNASGEDVAGAAAHDTAAHDTAATDAAAGLARRGEAYVAEAV LRLVRELDIQIMDDGVQWEQSPMYHNEVLRCLLEVLRVAGQQGISLPRSLPEKARAMALA DLAWMKPDRTQPLNGDSDRTDLRDVFTPAAWILKDNRLRYAGYDRLDFESVWDLGADAAL EYESMRSETPECLFHALEQSGNWCLRSGWDRNADYLHFKCGSLGGGHGHFDKLHVDLSIA GEDVLIDSGRYTYVDGSLRRQLKSSCAHNVPVVDWQEYTQCTGSWDVKGRTAPAGQDWSQ KGPYTFLQGTHLGYLPAGVLVNRRIISIGTRIHMIADEFYGTGTHVLSQVFHLNPAGSVE MREDVFLYHGIRAEAAFYRVAPWSGGGGGKMELEETPVSFHYNQLEYAPCVSLWRESALP CSMLTVAAGYETGSGNPYTVSRVSVTSPVTGRCLKDEEAEAVRIQDGKHSWLVVLNHLET GADGEYIGAEGRYGLGRVMVCDEADREGRMTVLQW >gi|157101630|gb|DS480694.1| GENE 271 293739 - 294917 991 392 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0934 NR:ns ## KEGG: Cphy_0934 # Name: not_defined # Def: glycosy hydrolase family protein # Organism: C.phytofermentans # Pathway: not_defined # 1 391 1 391 391 588 69.0 1e-166 MQTKDSLQTYTIMEKSEAEACLDLAVSLVRENLKTFTRCFPDSNSRNQFYPQSSNREWTT GFWTGEIWLAYERTGEEVFKEAGTIQAESFLERIKERVDVDNHDMGFLYTPSCVAAYRLT GNETARKAALMAADNLIGRFQEKGQFFQAWGELGAKDNYRLIIDCLLNMPLLFWASETTG DQTYRKKAEAHIRTAMDCVIRPDHSTYHTYFFDPETGAPVKGVTHQGNRDGSAWSRGQAW GIYGSALSYRIERKPEYADVFRKVTDYFLKHLPSDLIPYWDFDFDDGSSEPRDSSSAAIA ACGMLEMVKYMEGGEAAYYGDMARRLVKALSDRCAVRCHEESNGLLLHGTYARASAGNPC ANRGVDECNTWGDYFYMEALTRLAGDWEPYWY >gi|157101630|gb|DS480694.1| GENE 272 295186 - 296379 1365 397 aa, chain + ## HITS:1 COG:MA2718 KEGG:ns NR:ns ## COG: MA2718 COG1104 # Protein_GI_number: 20091542 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related 
enzymes # Organism: Methanosarcina acetivorans str.C2A # 5 384 6 384 392 462 60.0 1e-130 MKQLIYLDNAATTKTRPEVVEAMLPYFTEYYGNPSSVYDFSTESKKAVNHARETIAQSLG AKTNEIYFTAGGSEADNWALVATAEAYGNKGKHIITTKIEHHAILHTCQYLEKRGYEVTY LDVDENGVVKLDELKKAIRPDTILISVMFANNEIGTIEPIKEIGAIARENGILFHTDAVQ AFAHVPIHVDEYNIDMLSSSGHKLNGPKGIGFLYIRTGVKIRSFIHGGAQERKRRAGTEN VPGIVGYGKAVELAMATMEERAARETQLRDYLMKRVMDEVPFTRINGSRTSRLPNNVNFA FQFIEGESLLIKLDMAGICGSSGSACTSGSLDPSHVLLAIGLPHEIAHGSLRLTLSEENT EEEMDYVVEQIKDIVGYLRSMSPLYEDYMKHNGSEKK >gi|157101630|gb|DS480694.1| GENE 273 296437 - 296874 585 145 aa, chain + ## HITS:1 COG:MA2717 KEGG:ns NR:ns ## COG: MA2717 COG0822 # Protein_GI_number: 20091541 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Methanosarcina acetivorans str.C2A # 2 124 3 124 128 168 63.0 3e-42 MYTEKVMDHFEHPRNVGEIENPSGMGTVGNPKCGDIMRIYLDIDENQVIRDVKFKTFGCG AAVATSSMATELVKGKTVREAMAVTNKAVMEALDGLPPVKVHCSLLAEEAIHAALWDYAQ KNGITIEGLEKPKDNIEEEEEEENY >gi|157101630|gb|DS480694.1| GENE 274 296895 - 297950 959 351 aa, chain + ## HITS:1 COG:CAC2233 KEGG:ns NR:ns ## COG: CAC2233 COG0482 # Protein_GI_number: 15895501 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Clostridium acetobutylicum # 1 349 9 354 355 389 55.0 1e-108 MSGGVDSSVAAWLLKEQGYDVIGVTMQIWQDEDTEVQEAEGGCCGLSAVDDARRVAMDLG IPYYVMNFKEEFRKNVMDYFVGEYVEGRTPNPCIACNRHVKWESLLRRSMAIGADYIATG HYAQIDRLPGGRYSLKTSVTAAKDQTYALYNLTQEQLSHTLMPVGSYHKEEIRDMAKRLG LPVAHKPDSQEICFIPDHDYASFIEEYTGRELPPGNFVDLDGRVLGRHRGITHYTVGQRK GLNLSMGRPVFVVEIRPETNEVVIGNNEDVFTNVLRCDKLNWMAVDGLHGKPMDVMAKIR YSHRGSPCTIREIGDDMVECCFHEPVRAVTPGQAVVFYDGDYVAGGGTIIR >gi|157101630|gb|DS480694.1| GENE 275 297947 - 299308 1053 453 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0966 NR:ns ## KEGG: EUBREC_0966 # Name: not_defined # Def: N-acetylmuramoyl-L-alanine amidase domain protein # Organism: 
E.rectale # Pathway: not_defined # 199 451 95 347 347 292 67.0 2e-77 MKRKIRERSAGCKSSGAPTDRIEHNGTARSKRRANGLAAIGLAGILAAGMGLTGCQNYES SSEMKARIQSEGGEASWSRESPAAVGASAVQGEGSADNGNKAETIKLDTERTTAPMPQFE DVSDTVYVKSSDGGSVRIRSACDTGDDSNILTYASPGTALKRTGKGEQWTRIDYNGTAGY ISSGYITTQVPETAALPPQACVDSGEVTLNPSWKYAEFSKINSGAAVLYRSEAASPKGIT VCVNAGHGTKGGASVKTQCHPDGTPKVTGGTTGAGATSAAAVSGGMTFSDGTPESKVTLS MAKILKDKLLAAGYDVLMIRESDDVQLDNIARTVIANNASDCHIALHWDSTTNNKGAFYM SVPNVESYRSMEPVKSHWQQHNALGESLVAGLKGAGVKIFSGGSMAMDLTQTSFSTVPSV DIELGDKASDHSSATLATLADGLVAGVNQYFGQ >gi|157101630|gb|DS480694.1| GENE 276 299421 - 300224 1105 267 aa, chain + ## HITS:1 COG:no KEGG:Closa_0058 NR:ns ## KEGG: Closa_0058 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 33 266 29 252 256 106 29.0 1e-21 MKLMKRLAAVVLGVALLCPMTAYAAQSQEALELYKAVEARNQNMTDMNAFYDFKMKMSGS LFENEGIDPVDMRLEMNMKMNHIQDPSQMRYMAYCRMTVPESEPITYSMYYLDGYIYMDM LGQKVKYPVAMGDMMNQALASSKAFDVPEDLVGDFSLWDEGENKVIGYTINDAKMNEYMQ MVLGSTGLTGMLDGLDMKLHNIRGEYVVNPAGDCIKMRLKMDMDMTMQGETFSVNLDGDV GIADPGQPVDVPVPNPAEYTEMQAAAS >gi|157101630|gb|DS480694.1| GENE 277 300623 - 300802 87 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940318|ref|ZP_02087663.1| ## NR: gi|160940318|ref|ZP_02087663.1| hypothetical protein CLOBOL_05208 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05208 [Clostridium bolteae ATCC BAA-613] # 1 59 1 59 59 89 100.0 1e-16 MDRIQTPGKQDPGRQEPLELPGRNSSTYGDGDIPEIDLPGADNATHDFPQEFPENRMQS >gi|157101630|gb|DS480694.1| GENE 278 300809 - 300991 237 60 aa, chain - ## HITS:1 COG:no KEGG:Closa_1280 NR:ns ## KEGG: Closa_1280 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 59 1 59 65 61 54.0 9e-09 MIERDELPIGFTMELAMNPEAMSRFSGLTEPEQKQVVNRARNIMSHEEMRNYVENMFTEG >gi|157101630|gb|DS480694.1| GENE 279 301073 - 301921 1152 282 aa, chain - ## HITS:1 COG:no KEGG:Closa_2348 NR:ns ## KEGG: Closa_2348 # Name: not_defined # Def: hypothetical protein # Organism: 
C.saccharolyticum # Pathway: not_defined # 3 280 11 288 289 271 55.0 2e-71 MDKLKSDYESIPVPTEAKERMLAGIAQAKKEQKGVIIMKFAKKTGGAAAAAMIAITVLAN ATPALANAMEQIPVIGSIAKVVTFRTYEDKKDNFEADIKVPQVTIEGTEGAQVPANKSIE DYGNELIAMYEEELSRDNGEGHYGLDSQYKVVTDNSKYLSIRIDTTLTMASGTQFVKVFT IDKATGSIISLKDLFHDRPEMLEAVSDNIKSQMAQQMAKDDSVVYFYQSDMPDEDFKGLT GDESYYFNENGELVITFNEYDVAPGYMGAVQFTIPGSVTGTF >gi|157101630|gb|DS480694.1| GENE 280 301914 - 302465 538 183 aa, chain - ## HITS:1 COG:BS_sigV KEGG:ns NR:ns ## COG: BS_sigV COG1595 # Protein_GI_number: 16079766 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 14 155 15 160 166 129 48.0 4e-30 MGLYNNTAETERLLTENYERYYRLAFKLMRNQDDALDVVQESAYRAIKDCKGVRNKDYLS TWIYRIVVNTSLDMLRRKKKETPTEEMPETPTEDHYQDLDLKQMIKRLDDKSRTVIILRY FEDLKLEDIADIVGENLNTVKARLYRTLKKLRLELEPMPLGANQNHFSERSHAHETGRKQ TDG >gi|157101630|gb|DS480694.1| GENE 281 302656 - 303645 819 329 aa, chain + ## HITS:1 COG:lin2374 KEGG:ns NR:ns ## COG: lin2374 COG5632 # Protein_GI_number: 16801437 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Listeria innocua # 161 327 4 180 316 103 36.0 6e-22 MKEERTNDIDRMREYEDIRQVPLEPNRPASGRQNPPVGDSYIGGRPRRERISRRKPGIGD GAPEARTSAARTSAARKSADRTAQEVQNTGKKADRSQRSKKAGNSGSGEHRNRSGKGVSF LALLLVLILGAGAGLGGGYYLWGWERPYTVDLKAVEVPDYVEQDFIRKNIFSRPDVGRQK VDKIVIHYVANPGSTAKNNRDYFDSLADQDPQKSGSSASSHFVVGLEGEVIQCIPVNEIA YANAPLNNTTVAIEVCHPDDSGKFNDATYESLVDLTAFLCRQLKLTPGDVIRHYDVNEKL CPKYYVEHEDAWEQFLKDVKTAMKTESAG >gi|157101630|gb|DS480694.1| GENE 282 303749 - 304558 1023 269 aa, chain - ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 4 263 5 259 269 130 29.0 3e-30 
MIYDCHLHTEFSGDSNTPISLQIEKAIELGMKEMCITDHHDYDSGFCDCDFILDIPTYLT SLRMLQAEYRDRIRINIGIELGLQDHLKEYLDHFVKRYGDSFDFIIGSSHFVKSMDPYDP EYWNKRGEVPGFEDFFEASLVRVRDLCRTFDSFGHLDYAVRYAPHQNDFYDYRHLSQWID PLLKILIENGKSLECNTGGFKYGLGQPNPCRKILMRYRELGGEMITIGSDAHTPEYVGYA FDTCRELLSDCGYRYFTVYHGRKPEFIPL >gi|157101630|gb|DS480694.1| GENE 283 304836 - 305867 1005 343 aa, chain + ## HITS:1 COG:NMA0246 KEGG:ns NR:ns ## COG: NMA0246 COG0057 # Protein_GI_number: 15793264 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Neisseria meningitidis Z2491 # 1 340 1 331 334 429 65.0 1e-120 MAVKVAINGFGRIGRLAFRQMFGAEGYDVVAINDLTDPKMLAHLLKYDTAQGGYAGVYGQ GLHTVEAGEDSITVDGKKITIYKEPKAADLPWGEIGVDVVLECTGFYCSKDKAQAHIEAG AKKVVISAPAGNDLPTIVYSVNENVLTADDKIISAASCTTNCLAPMAKALNDYAPIQSGI MSTIHAYTGDQMILDGPHRKGDLRRARAGAANIVPNSTGAAKAIGLVIPELNGKLIGSAQ RVPVPTGSTTILTAVVKKAGVTKEDINAAMKAAASESFGYNEDAIVSSDVIGMRYGSLFD ATQTMVAQIADDLYEVQVVSWYDNENSYTSQMVRTIKYFAELG >gi|157101630|gb|DS480694.1| GENE 284 305977 - 307194 1568 405 aa, chain + ## HITS:1 COG:CAC0710 KEGG:ns NR:ns ## COG: CAC0710 COG0126 # Protein_GI_number: 15893998 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Clostridium acetobutylicum # 2 404 3 397 397 497 67.0 1e-140 MLNKKTVDDLKDLQGKKVLVRCDFNVPLKEGVIQNYNRIDGAIPTIKKLLDQGAKVILCS HLGKPKGEPVPEMSLAPVAPALSERLGVEVKFVDDPKVTGPETQKVAAELKDGEVLLLQN TRYRVEETKFKDGDAASEAYAKELADLCDGIFVNDAFGTAHRAHCSNVGVTRFTKENVVG YLMEKEIRFLGQAVENPVRPFVAILGGAKVSDKINVINNLLDKVDTLIIGGGMAYTFAKA QGQEIGNSLCEADKLDYALEMIKKAEEKGVKLLLPVDHVEGKEFSNDTEHKVVDVIDAGW SGFDIGPKTIALYKEALQGAKTVVWNGPMGVFEFSNFAEGTLEVCRAVAELADATTVIGG GDSVNAVKRLGFADKMTHISTGGGASLEFLEGKELPGVAAADNKA >gi|157101630|gb|DS480694.1| GENE 285 307245 - 307997 1062 250 aa, chain + ## HITS:1 COG:BH3558 KEGG:ns NR:ns ## COG: BH3558 COG0149 # Protein_GI_number: 15616120 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate 
isomerase # Organism: Bacillus halodurans # 3 240 2 240 251 248 53.0 9e-66 MSRKKIIAGNWKMNMTPSEAVALVNELKPLVANEDVDVVFCVPAIDIIPAMEAAKGSNIC IGAENMYYEEKGAYTGEISPNMLVDAGVKYVIIGHSERREYFAETDETVNKKVLKAFEHG ITPIVCCGETLTQREQGITIDWIRQQIKIAFLNVTADQAKTAVIAYEPIWAIGTGKVATT EQAEEVCAAIRVCIGEIYDEATAEAIRIQYGGSVSASSAPELFAQPDMDGGLVGGASLKP DFGKIVNYNK >gi|157101630|gb|DS480694.1| GENE 286 308096 - 309637 1821 513 aa, chain + ## HITS:1 COG:CAC0712 KEGG:ns NR:ns ## COG: CAC0712 COG0696 # Protein_GI_number: 15894000 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Clostridium acetobutylicum # 1 511 1 509 510 614 59.0 1e-175 MSKKPTVLMILDGYGLNDNCQHNAVCEGKTPVMDQLMSQCPFVKGEASGMAVGLPEGQMG NSEVGHLNMGAGRIVYQELTRITKSIQDGDFFQIPEFLTAIENCKKHDSALHMFGLVSDG GVHSHNTHIYGLLELAKKNGLKKVYVHCFLDGRDTPPESGKGFVEQLEAKMKEIGVGEVA SVMGRYYAMDRDKRWDRVELAYNALTKGEGNQAESAAAGIQASYDSGKADEFVIPFVVMK DGKPAATIQDKDSVIFYNFRPDRAREITRAFCDDSFDGFPREKRLDLVYVCFTDYDETIQ NKLVAFKKESITNTFGEFLAAHKKTQVRIAETEKYAHVTFFFNGGVEEPNEGEDRILVPS PKEVATYDLKPEMSANGVCDKLVEAIKSDKYDVIIINFANPDMVGHTGVEDAAIKAIETV DACVGRAVEAIREVDGVMFICADHGNAEQLVDYETGEPFTAHTTNPVPFILVNADPSCKL REGGALCDIVPTLIELMGMEQPKEMTGKSLLVK >gi|157101630|gb|DS480694.1| GENE 287 309745 - 311247 1210 500 aa, chain - ## HITS:1 COG:PAB1247 KEGG:ns NR:ns ## COG: PAB1247 COG0475 # Protein_GI_number: 14521863 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Pyrococcus abyssi # 13 354 3 344 380 69 22.0 1e-11 MALFLSNLGAAGIILTIAVMLASGFLLTRITKRLHLPNVTGYIIAGVIIGPWCLNLVPSQ YIGQMDFITDLALAFIAFGVGKYFKLSSLKANGKKMVILTLFESLTAGIFITIVMLIMGL SLSFSLLLGAIGCATAPASTIMTIRQYKAKGPFVDTILQVVALDDAVSLIAFSVCAAFVQ ASSSNGSVTVSQIAFPVLFNLLALALGGLSGVVLSRLIHKRRSKDHSLVLACVTIMGVAG ICTSLNVSPLLACMASGAAYINASGNKHLFKQLNQFTPPLLVMFFVLSGMRLSIPSLAAA GVIGIVYFLVRIAGKYTGSALGAAVTHASPEIRRYFGLALIPQAGVSIGLAVLGQRMLPA ESGQLLSTIILSSGLLYEMVGPACAKASIKLSGSVPGKAAVDKAAAGKAAAGKAAAEKAA 
AGKAAAEKAAAEKAAAEKVIAGSSSASGPDIAKADGVEQPVHKAGAAALEEQTRKKAPDT GNDTNDTVSAQQVRPRHAVV >gi|157101630|gb|DS480694.1| GENE 288 311352 - 312263 1025 303 aa, chain + ## HITS:1 COG:BS_ywfK KEGG:ns NR:ns ## COG: BS_ywfK COG0583 # Protein_GI_number: 16080817 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 12 294 6 280 299 127 26.0 2e-29 MEGDWPVFDMKVKTLLAVIESGSYTKAAQKLNLTQPAVSHQIRLLENEFDIKIFYRSKNK LKLTPQGKILVQYARRAAAVYSNACQAIEDSKTETSHLNVGITPTAGETIIPQVLATLCN ENPDIHINISVNTIQKIYSRLKAYELDFAVVEGGVPDAALVSTTLDTDYLCLAVSPMHRL ARAKTVSVEEIRHEKLILRSRSAGTRQLFEKHLGARDMSLEEFNVMMELDNVSMIKELVT MDLGITIIAKSACREELASGRLAIIPIENSSMVRQIDMVYQKDFLHPEFIQNIRGIYEQL KAK >gi|157101630|gb|DS480694.1| GENE 289 312489 - 314264 226 591 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 361 569 35 249 329 91 32 4e-17 MSGKKGMKTKKLLESFLPYYGRYKKILIFDLFCAALTTLGELILPLMLRYITNQGMQNLA SMTAGIVIRIGGLYLLLRIIDSLAAYYMAYTGHVMGVYIETDMRQDAFEHLQMLSDSYYS NTKVGQIMSRITSDLFDVTEFAHHCPEEFFIAALKTVVSFIILSGINLELTLLIFVLVPV MIISCTWFNMKVKEAFRRQRNQIGELNAQIEDTLLGNRVVRAFANEPVEIAKFGKGNRDF MDIKKYTYRYMAAFQITTRSFDGLMYVVVIVAGGIFMIQGKVEPGDLVAYTMYVTTLLTT IRRIIEFAEQFQRGITGIERFQEIMDVVPDIKDSKNAKKMEHVKGAITFDHVSFEYPDDH VPVLEDINITVRPGERIALVGPSGGGKTTMCNLLPRFYDTTSGSIAIDGQDIKNVTLQSL RSNIGVVQQDVYLFSGSVYENISYGRPGATREEVMEAAKLAGAHEFITELQDGYDTYVGE RGVKLSGGQKQRISIARVFLKNPAILILDEATSALDNESEYLVSQSLEKLAVGRTTLTIA HRLTTIQGADRILVLAGNKIVEEGNHEQLLEKKGMYYQLYTTANRLNQGIA >gi|157101630|gb|DS480694.1| GENE 290 314280 - 315149 960 289 aa, chain + ## HITS:1 COG:lin2658 KEGG:ns NR:ns ## COG: lin2658 COG1307 # Protein_GI_number: 16801719 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 2 284 4 283 283 124 31.0 2e-28 MKIAIVADSNSGVTQDEAKELGIYVLPMPFMIDGQVFYEGIDLSHEDFYARMEEGAEITT SQPSPKDVTEQWEQLLEDHDQVVYIPMSSGLSGSCQTALMLAEDYEGRVQVVNNQRISVT 
QRQSALDARDLAERGWSGEQIKEKLEATRFDSSIYITVDTLKYLKKGGRITPAAAALGTL LKIKPVLTIQGEKLDAFAKARTMKQAKTIMLTALAHDLEDRLADPKAEHTYLQIAHTCSE QAAQELEETVRELYPGVPVFGAPLSLSIGCHIGPGALAVACTRKLKELE >gi|157101630|gb|DS480694.1| GENE 291 315150 - 316619 1747 489 aa, chain + ## HITS:1 COG:CAC1701 KEGG:ns NR:ns ## COG: CAC1701 COG0642 # Protein_GI_number: 15894978 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 242 478 332 564 566 154 41.0 5e-37 MSEKKFVKKLRTFGMSMKVPLTAIIIGAALIPLFLEAGIMLGTFRQSQLDARFIEIQNQC VILSNKMTRSGYMTAEKKNNASLDSQMQTLSDVYNGRIVIVGSNFRVVTDTFNLATGKYY LSEEVIKCFKGENSSHYNKDMQYFAQTIPVYDTADAKAVSGVIVVTASTENILSLTDKVM GKSHLFLVCMVLLISVLGVAAAHLLLRPFKKLQMSFDRVAQGDLDADITEETYRETRLLS QSVQKSLSKLKAVDQSRQEFVSNVSHELKTPITSIRVLADSLMGMEDVPVELYREFMTDI SDEIDRENQIIEDLLMLVKMDKTAEDQMNIEQVNINGELELILKRLRPIAKRGNVELILE SIREVTADVDKVKISLAITNLVENAIKYNRDSGMVRVTLDADHKYFYIKVADTGIGIPED ALEHIFERFFRVDKARSREVGGTGLGLAIAKNVIQMHHGIIDVESTVGEGTTFSVRIPLN YVPRQEAKP >gi|157101630|gb|DS480694.1| GENE 292 316730 - 317632 1149 300 aa, chain + ## HITS:1 COG:no KEGG:Closa_0837 NR:ns ## KEGG: Closa_0837 # Name: not_defined # Def: Lipoprotein LpqB, GerMN domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 25 300 38 319 330 330 59.0 4e-89 MAVVFLLSGCGEGESPVSNPAEGRYLIYYLNASITKLVPQEYETETKDKGELVNELMDQF LNVPKDLDCQPGLTDRVTYQGSRQEEQVLYLYFDMNYTAMKAEREILCRAALTRTLTQIE GIDYINIYCGDQPLMDRQGNPVGMLSATDFIMNTSNVNAYEKTELTLYFADETGNRLVPE KREVVHNINTSLEQLVVEQLIAGPGQEGHNPTLPSDCKILSLSVTDNVCYINFDSAFANT TLAVNEYIPIYSIVDSLSEMTTVTKVQIMINGSQDVMFRDVVSLNTTFERSQNYIDGKNE >gi|157101630|gb|DS480694.1| GENE 293 317655 - 320207 1400 850 aa, chain + ## HITS:1 COG:lin1517_1 KEGG:ns NR:ns ## COG: lin1517_1 COG0658 # Protein_GI_number: 16800585 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Listeria innocua # 157 502 106 449 487 141 27.0 6e-33 MKRPLVVLAAGYVLGEVLALQVKTAVDLGVLAWLCAAAAGILWLLDTGRGVLRPSAPRAK 
MQGRSNRKHRVLLLFLVCLLSGSSVGMTRGQQEKGILDHEESVARTMTGARILVRGVIKK AEQKENTITLMLEEVTAEAGKRSVKFRRMVVYAENRTTEKDDSDLGEELAVGVKVQVRGK LAPVEGPGNPGEFDFRTYYRTKGTACRLYGENLAVAGGEAIPYYKGIAEFRMQCAGVLEK ICMPEDVAVFKAVLLGDTSDMNPEMRDMYQRNGISHLLAVSGQHLAIIGGGIYLMIRRAG AGYGKAGMISAVLVISYGIMTGSSGSAIRAVIMIVCLWLAAVKGRSYDTLSALGAAALIL LYRQPYLLYHSGFELSFGAVLAISGLGGWVQSFLELERAWEKTLLISLSVQIVLTPIVLY HYFQHPLYGIFLNLLVIPLVSILMYSGILGIVLGSFWIQGGVAAVGAGHYILRFYELLCK FAEGLPGYSLVLGRPSPGSLVMYFGIHGMGTAVVLFCIRGRSAFIRAKSPARCNAYLAER DLQTRISSISCKPMMLMYFLGIYALSFAALVPRPVKGLEVVCLDVGQGDGLLLRSGKRAV LIDGGSSSQKKLGNIVLEPYLKSQGISWIDYAVVSHGDSDHINGLIYLLEESEDIRIGTL VLPVMGKGEEVYENLAALARREGAAVVYMKTGDWVETGELTLTCLYAGEDFGRKDRNSHS LVLCGDYKGFHMLFTGDMGEEQESSLVRLAEQEGALQDIHLNHVQILKTAHHGSRTSSSE VFLDRLRIQLAVVSYGKENSYGHPSPETMERFRRQGIAVLETGKKGAITLKTNGTSLRVH AFLEEKSRQN >gi|157101630|gb|DS480694.1| GENE 294 320281 - 321717 1082 478 aa, chain + ## HITS:1 COG:BH0296_2 KEGG:ns NR:ns ## COG: BH0296_2 COG1263 # Protein_GI_number: 15612859 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Bacillus halodurans # 112 465 2 350 368 268 41.0 2e-71 MNYQETANEILSHIGGAENIREMTHCFTRLRFELRDPEKAERGKIERIEGVIAVVESSGQ FQVVMGTKVGRIYDLLKEMTGTERKAGEKREKVYHRDSKDREKNTPGSHLGNRMIQSVSS MFTPMVPAIAASGLLKGFLTIARMLASNQGIDITGNHTYVILLAATDAIFYFMPIILAYT SARVFGANEFVAMALGGTMCYPSVLSLMTGEEHISMLGVALTRANYTSSVIPIIIGVFIL AYVQRFLERVIPEVLKIILVPGLSLLVMVPAVFMVFGPVGIYLGNIIHFIYTGLMGISPV LCGGFIGGMWCVFVIFGSHRALVPIGIQDVALNGRQNLLAFAGAANFSQGGAALGVMMRT RSRELKAVAASASVAASVCGITEPAIYGCNLRLKRPMIYAVICGAAGGAVMGAGGVYGDA FANNGILTLATYAAFGMKKFLFYLAGIGISFFGSAILTMLLGFEDMDQEPEETSWERN >gi|157101630|gb|DS480694.1| GENE 295 321833 - 323158 844 441 aa, chain + ## HITS:1 COG:CAP0010 KEGG:ns NR:ns ## COG: CAP0010 COG2723 # Protein_GI_number: 15004715 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium 
acetobutylicum # 16 438 49 466 469 383 44.0 1e-106 MTVADFNSFKRSHIQADTKVASDFYHHWKEDIDYMAELGLKMYRFSIAWARIIPDGDGAV HEEGIKFYDRIIDCLLSRGIQPFVTLYHFDLPYALAEKYNGWESRRCVLAFERYARICFR AFGDRVKYWQVHNEQNLMVRVDERMNIDAADKWEADRLRAQMDYHMFLAHALAVKACHEL VSGGKIGPAVSSCCTYPETPRPEDVWAAANNDWFKAEYCLDMHQYGRYPGYYRRYLKERN IMPSAEDGDEEILKWGASDYIAVNYYRTLCVRYLPADEIHPVGERCFPGNEVDFDQYGYF SHVKNPHLVSTEYGAQIDPMGLRIVLNRYYQKYRLPLLITENGLGAADILTENGKVHDSY RIDYLRRHIQACRQALDDGVELMGYSPWSFMDLLSSHEGFRKRYGFLYVDRNEHETGSLK RIKKDSFHWYRRVIETNGATI >gi|157101630|gb|DS480694.1| GENE 296 323187 - 323645 408 152 aa, chain - ## HITS:1 COG:BH2238 KEGG:ns NR:ns ## COG: BH2238 COG0394 # Protein_GI_number: 15614801 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Bacillus halodurans # 1 151 1 153 160 143 45.0 1e-34 MIKILFVCHGNICRSPMALYYFRSLLKERGLSGIIQADSAATSTEEIGNPVHHGTRAKLK QAGIPCEGHRAKQMTKADYKSFDYLIGMDSWNIRNMQRIAGGDPDHKIFKLMEFTGSGRD VADPWYTGDFDTTWSDVTEGCAALLDYISERL >gi|157101630|gb|DS480694.1| GENE 297 323648 - 326542 764 964 aa, chain - ## HITS:1 COG:all3465_1 KEGG:ns NR:ns ## COG: all3465_1 COG5635 # Protein_GI_number: 17230957 # Func_class: T Signal transduction mechanisms # Function: Predicted NTPase (NACHT family) # Organism: Nostoc sp. 
PCC 7120 # 216 385 226 413 662 64 30.0 8e-10 MNISVMINDFIRQNIQSEAEVRSKLIVPLLELLEYPKDLRAEEFPVYGYEGSKALKTKAA DFLQFTSNQFDKYRGKTDLEVDWVYKHSLLVFEAKKPTEKVLVKGQPVFYSAWTKSVAYM ISNGINIEGYIVNANYSDTCVFSCRVEEIPEKWEEINLLNYDRILELKKSSDEKDKWTNR DIYEKYKNAMRVHCTEELCICVDRSLKEFAYDLNILKNGENKDFGDILDNTSKIITSEPG GGKSYLMWMLMREYLTKCNGDEDKIPVILEGRYYGKVFNSIVDGIYQEVNLLLPFLTKEL IEKRLREGGFVILFDAIDEVEQDYDVLVYSLHQLRRNTDNTIIITSRMQNYKGDFCSEFA HYSLEPLDDHRIVDLLKQYSQGEMQIQIHQIPKRLMEILRIPLFLKMFVSISKKEDKYRI PSNHAALFQEYINEKMNVLSCSLYDKTIIKSVLGNYAMYSFENGDSTEHFFEIMDNACTD LNKTKIYEKIWKTGLMSEGLQGIKYYHKAIQEFFAAVELSTWDRDKLTAWLDSNALKEKY NEILCYLTGIISNQQKQNYVLDYLEIHNLKLFVKALKSRRNFDVVEMDLNFEYAQSYYAQ ILKTYDSIVQSYFYKICHVFDGYSIKGTGKICIRGCMNFTKKSISMIIYNGTSDAKSLDI TVSDENGVYMAVADGTEIPINSSVFTAGHLHERYYNLELLSYGFDSSREIAIDIIKSQIT EMLEQKSVFDIEIDVLLAERIEKELKKLRNRRESQNNRQDLSLYSNDINYVIDKVNELGI YNQDVDMITTFCKVLGLRMDNAESLLDIKEDLTLAPGRHSYWFDELYSDEQLVKKVERIL SLSNEAIQTITTNIIPVLSTVKPITRKIGIVHRKGKFSGVSYMRVEVRENEDTTPIIEFG EDNIDIYPKLDPYYVNKLKEIGKSESNVLGSSSAVLDLYFREDVFHDQIYGEIKDLFKEL LGDI >gi|157101630|gb|DS480694.1| GENE 298 326897 - 326974 77 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTYETLSLMIAFGVLVVMIIGTKK >gi|157101630|gb|DS480694.1| GENE 299 327404 - 327535 62 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVACILKAIEIEPTMYQKWISNSIGENYSTIKYYMESMKNHEL >gi|157101630|gb|DS480694.1| GENE 300 327628 - 328656 896 342 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 6 340 7 339 339 349 50 8e-95 MTPYQSVISEIKSIISSGQEAAYSASSKAMLLTYWNIGKRIVEQELGGSERAEYGRGLIS ALADDLTKEFGKNYSKRNLHYYIKFYQCFPDEQIVNACVHNLNWTHIRSLLRVTDENARY WYMKEAVDESWSSRTLDRNISTQYYHRLLQTPKKDAVIEEMKRKTAEFQKNQFELIKSPV IAEFLGFKNEDTYLEGELESAILSHIRDFLMELGRGFAFVARQQHIVTEASDYYLDLVFY NIELKCYVLIDLKMGRITHQDVGQIDMYVRMYDDLKRGHGDNPTIGILLCSETDEDIARY SVLHDNDRLFMSKYLTYLPTKEQLKAEIERQKEIFSMQHSDF >gi|157101630|gb|DS480694.1| GENE 301 329046 - 329123 126 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTYETLSLMIAFGVLVVMIIGTKK 
>gi|157101630|gb|DS480694.1| GENE 302 329540 - 329617 124 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTYEALSLMIAFGVLIVMIIGTKK >gi|157101630|gb|DS480694.1| GENE 303 329842 - 330870 779 342 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0167 NR:ns ## KEGG: EUBREC_0167 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 35 332 21 309 317 112 25.0 2e-23 MENKREHSTNHATPKTPKEIIVNSLTELFADDEFGFDMFIMMKTGDPPMKHFAFYEGPTG REHDFKLKVQNSIVKSIRDLFLDEEAEYAMAEQAGGNQNIFYIIKQNDDYKPFELLNTPE KQMGPFFMDDRDKADAILFRFRRGVHSIWAYQYILPATIPNKKNQHFFARILDTEQTDCF VEMPEQLFVITRKVDLLVMDDKIITKDIGLMQRHFGFNDFIKASAKKAVSDISDIRLVTN CEKLTAYIERNQTKYSRKMMRIKHYHVVEKTAEELIQIINTVPRWQGVFDIRDGQIYLRT FRDVENLIDLFDERYTVSLVTGDEFDTDVKRLAAPGGMGGED >gi|157101630|gb|DS480694.1| GENE 304 330890 - 331528 375 212 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0166 NR:ns ## KEGG: EUBREC_0166 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 8 208 1 188 192 86 28.0 6e-16 MGEQNLLLKRKLVIQSFLPLFIFIFIRYFDYRMISSICHFIRELMQRNFSVMNRIWDHPY LGPFIAAFISLACSLYGITAMWQFKSMQMSGFVDAGEEIVIEEEITDSGITFFMTFVLPL LLDDVETLRGFIIFTGILGLVIRLMWTTHLYYQNPILTLLGYRIYKFRFLNPVMDGCRDK TMIAVCRTGIAEKKIVKWKYISDDVCLMYNKN >gi|157101630|gb|DS480694.1| GENE 305 331794 - 332774 1029 326 aa, chain + ## HITS:1 COG:BH1337 KEGG:ns NR:ns ## COG: BH1337 COG1466 # Protein_GI_number: 15613900 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Bacillus halodurans # 4 321 6 332 342 130 27.0 5e-30 MQTLNQDIKDRTFKPVYLLFGEEAFLKKSYKSRLREALTDGDTMNYNYFEGKGMNVNEII GLADTMPFFAEKRLILMEDSGFFKGGAGADELTEYMGGIPESTCLVFVESEVDKRSRLYK AVKKYGYAAELSHQEPAQLARWAAGILSKNGRKITGRTMEFFLSKAGDDMENISSELEKL ISYTLGREVITDEDVETICTTQVTNKIFDMITAIAARQTRKAMDLYEDLLTLKEPPMRIL FLIARQFNQILQVKDLMGKGMDKSTIASRLKLQPFVVGKIMLQAKTFTREQILSYVNLCV DAEEGVKTGKLQDRLAVELLIANKYE >gi|157101630|gb|DS480694.1| GENE 306 332795 - 333367 844 190 aa, chain + ## HITS:1 COG:BS_ywlG KEGG:ns NR:ns ## COG: 
BS_ywlG COG4475 # Protein_GI_number: 16080744 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 9 188 1 180 180 179 50.0 4e-45 MDMMETIELEDLKAQARQVLEELMEAARLKPGQILVVGCSSSEIDSFKIGSHSSAEIGMA VYTALYQELKPKGIYLAAQCCEHLNRALILEAEAAQAYGYEQVNVVPQLKAGGSFATAAY ATLEHPVAVEHIKAHAGIDIGDTLIGMHMKDVAVPVRIKTKEIGSAHVVCARTRPKFIGG IRARYDEENM >gi|157101630|gb|DS480694.1| GENE 307 333425 - 334009 659 194 aa, chain + ## HITS:1 COG:no KEGG:Closa_0782 NR:ns ## KEGG: Closa_0782 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 191 2 196 199 192 46.0 7e-48 MNRAITISREYGSGGREIGEKLARRLGIPFYDKELIAMIAKEGNIETSVLEANDEVDPDF DNYSPRQVPPDYQIAMTQRIYAAQATVIRNLNDKGPCVIVGRCADAILPDAVNIFVFSDM ENRINRIMSLNPGLTREEAKASILAVDKRRKAYHEYYSTTQWGDMDAYDICLNSGLAGVE GCLEAALTYIKYVK >gi|157101630|gb|DS480694.1| GENE 308 334132 - 334275 118 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940352|ref|ZP_02087697.1| ## NR: gi|160940352|ref|ZP_02087697.1| hypothetical protein CLOBOL_05242 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05242 [Clostridium bolteae ATCC BAA-613] # 1 47 1 47 47 68 100.0 2e-10 MNRTMNEDTAENRDCSNDTDHVKKDKGAYAEPLPESFRPRRDGPGGE >gi|157101630|gb|DS480694.1| GENE 309 334299 - 334448 232 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940354|ref|ZP_02087699.1| ## NR: gi|160940354|ref|ZP_02087699.1| hypothetical protein CLOBOL_05244 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05244 [Clostridium bolteae ATCC BAA-613] # 1 49 14 62 62 85 100.0 1e-15 MASYVSPKIRDKFETLSVDLKNCILERNVHLETLQDLIKVLDEIVKEGS >gi|157101630|gb|DS480694.1| GENE 310 334606 - 334869 279 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160880450|ref|YP_001559418.1| ribosomal protein S20 [Clostridium phytofermentans ISDg] # 1 87 1 87 87 112 68 3e-23 MANIKSAKKRILVNETKAARNKAIRSKVKTAIKKVDAAVAAGDKAAAQAALLAATTEIDK AATKGVYHKNNASRKVSRLSKAVNSIA >gi|157101630|gb|DS480694.1| GENE 311 335154 - 336164 
985 336 aa, chain + ## HITS:1 COG:no KEGG:Closa_0874 NR:ns ## KEGG: Closa_0874 # Name: not_defined # Def: spore protease (EC:3.4.24.78) # Organism: C.saccharolyticum # Pathway: not_defined # 17 333 4 318 325 443 73.0 1e-123 MEENAAVNAVNEAAAGTFQVRTDLALEQRESSMGEGGDVSGVRLKEWHHQKSGLKLTEVK ILNEQGAKSMGKPRGTYLTLEADRLSKADDDYHSEISEELAHQIRRLMAEMIGYREEDLP SVLVVGLGNDSVTPDSLGPRVLDNLQVTRHLDGQYGNTFLKDRKLPAISGIAPGVMAQTG METAEILKGIIHETHPDLIIAIDALAARSVRRLGTTIQLTDTGIHPGSGVGNHRHSLTKE SLGVPVMAIGVPTVVGAAAIVHDTVSALVGVLAENEGTRGTGSWIEEMDPDGQYQLIREL LEPEFGPLYVTPPDIDQSVKQLSFTISEGIHQAVFP >gi|157101630|gb|DS480694.1| GENE 312 336238 - 337605 1141 455 aa, chain + ## HITS:1 COG:no KEGG:Closa_0875 NR:ns ## KEGG: Closa_0875 # Name: not_defined # Def: Stage II sporulation P family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 448 1 468 468 432 51.0 1e-119 MRIRRGGVDRLMRRFLTGTVLILLMLLCGRYISRAAGSITKSDYKDMICAGGQWLVEAIW NQAHPAENAGMSGSWPDAWGSTGAINDPDPAYRKYNAVKTFYEEHQYLAWYGNEESGQQT LEEPGVLASGEAAAGAGGTAQGGETVRGGGAGGAGGAKTESSQDLAAGAGGTASGYQGPN QSLETITGTLERPITGNTYVLEQLMDYDFLIKHFYSVHTSTTAGRDLMNAKDLLSRDMTM KGDDSKPQILIYHTHSQEAYKDSGPGQTVVGVGDYLTRLLEAKGYNVYHDTSVYDLKNGQ LDRSKAYNYALDGITNILQQNPSIEVVLDIHRDGVGENLHLVTQVDGRDTAQIMFFNGLS QTPEGPIEYLQNPYREDNLAFSLQMQLGAAAYYPGFTRKIYLKGLRYNEHLRPKSSLIEV GAQTNTYEEALNAMEPLSELLDMVLQGNRKDSIIQ >gi|157101630|gb|DS480694.1| GENE 313 337679 - 339493 2110 604 aa, chain + ## HITS:1 COG:CAC1278 KEGG:ns NR:ns ## COG: CAC1278 COG0481 # Protein_GI_number: 15894560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Clostridium acetobutylicum # 1 602 1 602 602 865 69.0 0 MPTVDQSKIRNFCIIAHIDHGKSTLADRIIEKTGLLTSREMQEQVLDNMDLERERGITIK SQAVRTVYKAQDGNEYIFNLIDTPGHVDFNYEVSRSLAACDGAILVVDASQGIEAQTLAN VYLALDHDLDVFPVINKIDLPSAEPERVAAEIEDVIGLDASDAPHISAKTGVNIEEVLEA IVHKIPAPKGDPQAPLKALIFDSTYDSYKGVIIFCRIMEGTVKRGTPIRMMATGSEEDVV EVGYFGAGQFIPCEELRAGMVGYITASIKNVRDTRVGDTVTSRDNPCASPLPGYKKVTPM 
VYCGLYPADGAKYNDLRDALEKLQLNDASLFYEPETSIALGFGFRCGFLGLLHLEIIEER LEREYNLDLVTTAPGVVYRVHKTTGEMMELTNPSNLPDPTEIEYMEEPIVSAEIMVTTEF VGSIMTLCQERRGNYLGMEYIETTRALLKYELPLNEIIYDFFDALKSRSRGYASFDYELK GYERSSLVKLDILVNREEVDALSFIVHADSAYERGRKMCEKLKDEIPRHLFEIPIQAAIG GKIIARETVKAVRKDVLAKCYGGDISRKRKLLEKQKEGKKRMRQIGNVEIPQKAFMSVLK LDDE >gi|157101630|gb|DS480694.1| GENE 314 339515 - 340792 1193 425 aa, chain + ## HITS:1 COG:CAC1279 KEGG:ns NR:ns ## COG: CAC1279 COG0635 # Protein_GI_number: 15894561 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 9 425 4 374 374 289 40.0 9e-78 MTNLQRKPLELYIHVPFCARKCLYCDFLSFRALASVHEAYTEQLIREIEVQGACCREYQV KTVFIGGGTPSVMEPCLIRDIMQALNRNFDIAGDAEITIEVNPGTLLQNKLHIYRAAGIN RLSIGLQSADNQELKDLGRIHTFEEFLKSYQCARMAGFTNVNVDLMSSIPGQTLESWKNT LKKVTMLKPEHISAYSLIVEEGTPFWDRYGKREGEETGLKTDDVCFLHPRGGGQTEQAPL PRKRAALYPALPDEDTENRIYHFTRTFLAEQGYGRYEISNYAKPGRECLHNTGYWREVPY LGLGLGASSCINGTRFSNERDLDTYLHLDFSEEGGSSALALLRGPVEELTREAQMEEFMF LGLRMTKGISEIDFVSMFGVKIEGVYGPVIERLIGDGLLKREGVWISLTEWGMDVSNFVL SEFLL >gi|157101630|gb|DS480694.1| GENE 315 340865 - 341602 580 245 aa, chain + ## HITS:1 COG:FN0805 KEGG:ns NR:ns ## COG: FN0805 COG4912 # Protein_GI_number: 19704140 # Func_class: L Replication, recombination and repair # Function: Predicted DNA alkylation repair enzyme # Organism: Fusobacterium nucleatum # 15 243 25 250 251 132 32.0 6e-31 MPDKPTIRETVRKKLEELSDPEYREFHSRLLPGITGIMGVRTPELRGIAKDLKKSGWQEY IKEVSCAWKEKGQGADGVLYDEMIIWGLCICGGCRDWDTAGEYVTAFVPAINNWAVCDIF CGSLKITGRYREEVWQFIQPYFQSGREYDLRFGTVMLLSHYTDQDYLDRALKLLDRVRHP GYYAKMAVAWAVSVYFVKFPDQVMEYLKQSSLDDWTYNKALQKITESFRVDKETKKLVRQ MRRGR >gi|157101630|gb|DS480694.1| GENE 316 341720 - 342286 668 188 aa, chain + ## HITS:1 COG:FN1983 KEGG:ns NR:ns ## COG: FN1983 COG0450 # Protein_GI_number: 19705279 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Fusobacterium nucleatum # 3 
188 2 188 188 251 65.0 8e-67 MGNLINRTIGEFKVQAFYQGKFMDVTREDTLGHWSVFFFYPADFTFVCPTELGDLADFYE EFKEAGCEIYSVSEDTHFVHKAWADASDTIKKIQYPMLGDPAGKLAREFGVLDEEAGQAF RGTFILNPEGKIKAYEIHDMGIGRNAQELLRKVQAARFVEEHGDQVCPAKWKPGEETLTP SLDLVGML >gi|157101630|gb|DS480694.1| GENE 317 342291 - 342956 625 221 aa, chain + ## HITS:1 COG:FN1984_2 KEGG:ns NR:ns ## COG: FN1984_2 COG0526 # Protein_GI_number: 19705280 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Fusobacterium nucleatum # 16 220 4 211 213 137 35.0 1e-32 MDVTIDLNTIPEESSLVDQGLKTQLAGIFGRMEKRVTIRAVVDLSGEKDREMASFLRAVV SVSPNLELELYGPDEADQVPELNTAWLPVTGLYKDGAYGRAAFHGVPGGREINSFVLAVY NLAGPGQEVAKGTRKKIQRLDKKTNIKICVSLACHHCPLVVTACQQIAFLNPNIEAEMID AALYEDLVAQYDIKRVPMMILNDSRIVMGSKTIDEIVTLLK >gi|157101630|gb|DS480694.1| GENE 318 343213 - 343782 571 189 aa, chain + ## HITS:1 COG:ECs3623 KEGG:ns NR:ns ## COG: ECs3623 COG1954 # Protein_GI_number: 15832877 # Func_class: K Transcription # Function: Glycerol-3-phosphate responsive antiterminator (mRNA-binding) # Organism: Escherichia coli O157:H7 # 5 189 3 186 191 167 46.0 1e-41 MNQKLYDLIEANPVIAAVKDMEGLDACCQRDEIKVVFILFGDICNISAIVEQIKASDKVA MVHIDLITGLSSKEVAVDFIRNNTSADGIISTKPALIKRARELSLYTTLRVFVLDSMAFE NIEKQMSVARPDIIEILPGLMPKVIRRVCRLVKVPVIAGGLISDKEDVMAALSAGAISVS TTNQKVWLM >gi|157101630|gb|DS480694.1| GENE 319 343882 - 345378 1565 498 aa, chain + ## HITS:1 COG:BS_glpK KEGG:ns NR:ns ## COG: BS_glpK COG0554 # Protein_GI_number: 16077994 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Bacillus subtilis # 1 494 1 495 496 674 64.0 0 MAKYVMALDSGTTSNRCILFNEKGEMCSVAQKEFTQYFPKPGWVEHDANEIWSTQLGVAV EAMSKIGATAEDIAAIGITNQRETTIVWDKETGEPVYHAIVWQCRRTSEYCDTLKEKGLV DTFRAKTGLVIDAYFSGTKLRWILENVEGVRERAEKGELLFGTVETWLIWKLTKGRVHVT DYSNASRTMLFNINTLEWDDEILAELNIPKCMLPEAKPSSFVYGESDPQFFGGPIPIGGA AGDQQAALFGQTCFNAGEAKNTYGTGCFMLMNTGEKPVFSKNGLVTTIAWGLDGKVNYAL 
EGSIFVAGAAIQWLRDEMRIIDSSPDSEYMAKKVKDTNGCYVVPAFTGLGAPHWDQYARG TIVGITRGVNKYHIIRATLDSLCYQTNDVLQAMKADSGIELAALKVDGGASANNYLMQTQ ADIINAPVNRPQCVETTAMGAAYLAGLAVGYWANKEEVIKNWAIDRTFTPEISEEKRTEM VTGWNKAVKCSYGWAKED >gi|157101630|gb|DS480694.1| GENE 320 345483 - 346196 778 237 aa, chain + ## HITS:1 COG:SA1140 KEGG:ns NR:ns ## COG: SA1140 COG0580 # Protein_GI_number: 15926882 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Staphylococcus aureus N315 # 4 231 4 230 272 215 57.0 5e-56 MLPYIAEFLGTMILIILGDGVVANVTLNKSGMKGAGSIQITFAWGLAVMLPAFIFGAASG AHFNPALTIALAVDGSMSWGLVPGYIVAQFAGAFVGAVIVYLLFKDQYDATESAATKLGT FCTGPSVPNMGRNILSEAVGTFVLVFAIKGIGQVSGIAPGVDKLLVFGIIVSIGMSLGGL TGYAINPARDLGPRLAHAVLPIKGKGDSNWGYAPVVIIGPVVGAVVAALLYQAIPWM >gi|157101630|gb|DS480694.1| GENE 321 346436 - 347875 1756 479 aa, chain + ## HITS:1 COG:FN0183 KEGG:ns NR:ns ## COG: FN0183 COG0579 # Protein_GI_number: 19703528 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Fusobacterium nucleatum # 1 476 23 498 498 455 50.0 1e-128 MYDVIIIGAGVSGAASARELSRYKVNACVLEREEDVCCGTSKANSAIVHAGYDAAEGSLM ARLNVEGNQIMPELAKELDFPFNPCGSFVVCLDEESLPDLRALYERGVKNGVKDLEIITD KARIKEMEPNLADEAAGVLYAPTAGIVCPFNLNIALAENAYTNGVDFKFNTEVTDIRRIE GGWALETNQGVYETRCVVNAAGVHADKFHNMVSGTKIHITPRRGDYCLLDKSAGNHVSHT IFALPGKYGKGVLVSPTVHGNLIVGPTAIDIEDKEATATTREGLDELIAKAGMNVKDLPM RQVITSFAGLRAHEDHHEFIIKELEDAPGFVDCAGIESPGLTSCPAIGRMVAGILKEKLG LEPNPQFDGSRKGILDPDTLTKEEQAELIRQNPAYGNIICRCEMVTEGEILDAIHRPLGA RSLDGIKRRTRAGMGRCQAGFCTPRSMELLHRELGLPMTEITKAGGDSKLVVGTNKDRI >gi|157101630|gb|DS480694.1| GENE 322 347893 - 349158 1387 421 aa, chain + ## HITS:1 COG:CAC1323 KEGG:ns NR:ns ## COG: CAC1323 COG0446 # Protein_GI_number: 15894603 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Clostridium acetobutylicum # 4 421 3 416 417 460 59.0 1e-129 
MREYDIVIIGGGPAGLAAAVAARDNGIESILILERDKELGGILNQCIHNGFGLHTFKEEL TGPEYAARFEEQVYERKIEYKLNTMVMDISHDKVVTAMNREEGLFEIQARAVILAMGCRE RSRGALNIPGYRPAGIYCAGTAQRLVNMEGFMPGREVVILGSGDIGLIMARRMTLEGARV KVVAELMPYSGGLKRNIVQCLDDYGIPLKLSHTVVDIRGKERLEGITLAAVDEKGKPIPG TEEDYTCDTLLLSVGLIPENELSNGMGVKMNRVTSGPVVNESLETNIEGVFACGNVLHVH DLVDFVSEEAAAAGKNAARYVKDGRCSGTGQEIELSAVDGVRYTVPCTIHPDRMEETQIV RFRVGNVYKNCYIGVYFDDEQVMHRKRPVMAPGEMEEIKLQKEKLMLHPGLKKITIKVEE A >gi|157101630|gb|DS480694.1| GENE 323 349162 - 349521 518 119 aa, chain + ## HITS:1 COG:TM1434 KEGG:ns NR:ns ## COG: TM1434 COG3862 # Protein_GI_number: 15644185 # Func_class: S Function unknown # Function: Uncharacterized protein with conserved CXXC pairs # Organism: Thermotoga maritima # 4 118 3 118 138 95 42.0 3e-20 MEKRELICIGCPMGCPLTVELENGEIKTITGYTCKKGETYARKEVTNPTRIVTSTVRVEG GRADMVSVKTREDIPKDKIFQCVKALKGVTVKAPIRIGDVVVADVAGTGVDIVATKEVL >gi|157101630|gb|DS480694.1| GENE 324 349688 - 349879 351 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940372|ref|ZP_02087717.1| ## NR: gi|160940372|ref|ZP_02087717.1| hypothetical protein CLOBOL_05262 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05262 [Clostridium bolteae ATCC BAA-613] # 1 63 1 63 63 75 100.0 1e-12 MEKKLDINNMNGITEEEDKTCGLNDMNCLEGEEDKTCGLNDMNCVEEEEDKTCGLNDMNC LGE >gi|157101630|gb|DS480694.1| GENE 325 350044 - 351030 1479 328 aa, chain + ## HITS:1 COG:FN1840 KEGG:ns NR:ns ## COG: FN1840 COG2376 # Protein_GI_number: 19705145 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Fusobacterium nucleatum # 1 327 5 332 332 370 56.0 1e-102 MKKFINDVALVEDQMIQGMVKAYPGYLRKLDCGNVVVRANKKEGKVALISGGGSGHEPAH GGYVGCGMLDAAVAGAVFTSPTPDQIYEGIKAIATDAGVLMVVKNYTGDVMNFEMAAEMA EMEGITVKYVVTNDDVAVKDSLYTVGRRGVAGTVFVHKIAGAMAETGASLDEVHAVAQKV IDNVRTMGAAIAPCTVPAAGKPGFELSDDEMEVGIGIHGEPGTHRESMKTADQVADMLLA QILGDIDYEGREVAVMINGAGSTPLMELFIINNRVSDVLAEKGIRVYKTFVGEYMTSIEM QGFSISLLRLDDQLKELLDAPADTPAWK >gi|157101630|gb|DS480694.1| GENE 326 351081 - 351704 820 207 aa, 
chain + ## HITS:1 COG:FN1841 KEGG:ns NR:ns ## COG: FN1841 COG2376 # Protein_GI_number: 19705146 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Fusobacterium nucleatum # 8 203 3 198 202 165 41.0 6e-41 MADSKKVLEIIRAIGLKMEAEKEYLTELDQPIGDSDHGINMARGFAAVEGKLPDLEGKDI GTILKTVGMTLVSTVGGASGPLYGSAYMKAGMALAGKEEMDMDDFLSMMDTAVQAVEQRG KATVEEATMLDAMVPSLKTMKDAAAEGKSVREALEAGVRAAWAGAEHTKDLVATKGRASY VGERGLGHQDPGATSYSYMLEVIAGLV >gi|157101630|gb|DS480694.1| GENE 327 351785 - 352165 634 126 aa, chain + ## HITS:1 COG:BH3395 KEGG:ns NR:ns ## COG: BH3395 COG3412 # Protein_GI_number: 15615957 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 3 117 7 119 128 70 44.0 6e-13 MVGFVIVSHSENLAKSVVELTSIMAPNARIAPAGGMDGGGFGTSFEKIQAAIESVYSDDG VLVLVDLGSAVMTTEMVIEMFEGKKVEMVDCPLVEGAVVATIDSVGGMSFEDIRTALAGV GRAKKF >gi|157101630|gb|DS480694.1| GENE 328 352216 - 352566 379 116 aa, chain - ## HITS:1 COG:L178600 KEGG:ns NR:ns ## COG: L178600 COG5341 # Protein_GI_number: 15673322 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 15 116 24 126 135 64 32.0 5e-11 MNILQKRDVILAAAILLLAAAAFGISHYIRRTPAAIAQVSVDGTVVETLDLSKDTEITVT SSNGGTNHLIVKDGEIWCSEASCPDKVCVHQGKKHLNSDTIVCLPNQMIVTITGGE >gi|157101630|gb|DS480694.1| GENE 329 352563 - 353060 414 165 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3035 NR:ns ## KEGG: Cphy_3035 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 50 162 154 268 687 97 43.0 3e-19 MKRSTISSNARSLIGIAVMAVLSLAVIAVSDPLYKALRGPVTTASPEAPLADGIYTHEAL EPDANGFRDRTTLTVSDGIIVSCVWDSFNSDGESKQKLSMEGQYIMTEDGPLWKAQSDSV CRYLIEHQRLAGDDGYTTDAVASVSINVYPFMNGVEECLRQAEIK >gi|157101630|gb|DS480694.1| GENE 330 353360 - 353950 602 196 aa, chain + ## HITS:1 COG:L179010 KEGG:ns NR:ns ## COG: L179010 COG4769 # Protein_GI_number: 15673323 # Func_class: S Function unknown # Function: 
Predicted membrane protein # Organism: Lactococcus lactis # 41 189 11 161 176 89 40.0 4e-18 MPNQKKTVTESYKKSAAAVRKGRGTKQSRRTSAQNAALFGILTALALVLGYVESLVPVYL GAPGVKLGLANLVTMVGLYCMGTKETALISVVRVLLSGILFGPPSAILFGLAGTALSLTV MALCKRFRLFGMVGVSILGGVAHNIGQFLMAAFVVNTFGVFSYLPVLLTAGCAAGALIGL LGGILVGRVNRLFRTI >gi|157101630|gb|DS480694.1| GENE 331 354072 - 355592 1728 506 aa, chain + ## HITS:1 COG:CAC2961 KEGG:ns NR:ns ## COG: CAC2961 COG4468 # Protein_GI_number: 15896214 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose-1-phosphate uridyltransferase # Organism: Clostridium acetobutylicum # 1 499 1 495 497 550 51.0 1e-156 MVYEAVEGLVQYGLDKGLISEADAVYARNQILDVMGMDEYEEPQGPVESGDLEAILKELL DCAAGTGVLKEDSVVYRDLLDTKLMNCLMPRPGEVVKEFWKRYEESPEKATDWYYGFSQD SDYIRRYRIARDMKWTTDTRYGTLDITVNLSKPEKDPKAIAAAKLARQSGYPKCQLCMEN VGYAGRTNHPARNNHRIIPITINDSQWGFQYSPYVYYNEHCIVFNGQHVPMKIDKAAFRK LFDFIKQFPHYFLGSNADLPIVGGSILSHDHFQGGHYTFAMAKAPIEKHFSIKGYEDVEA GIVFWPMSVLRLRAADPDRLIELGDVVLKAWRGYTDEDAFIFAETDGEPHNTITPIARKV GDTFELDLVLRNNITTEEFPLGVYHPHQELHHIKKENIGLIEVMGLAVLPSRLKTELAML GEYMVDGKDIRKDEVLLKHADWVEEFMPGYQEKGILVTRDNVWSILQEEVGKVFARVLED AGVYKCDERGRRAFAGFLHSVGFEEV >gi|157101630|gb|DS480694.1| GENE 332 355973 - 356422 547 149 aa, chain + ## HITS:1 COG:STM4110_3 KEGG:ns NR:ns ## COG: STM4110_3 COG1762 # Protein_GI_number: 16767376 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Salmonella typhimurium LT2 # 4 148 1 144 145 90 31.0 8e-19 MSKLINENCIVFDMEGSKNDIIRTLSGELYRAQKITDTEEFYQDVLAREAITPTFVGFDM GLPHGKTDHVLEASVCFGRTKEPVVWNEESGETADLIILIAVPLSEAGDTHLKILANLSR RLMHEEFRESLRSSTREQVYTILEEVLEG >gi|157101630|gb|DS480694.1| GENE 333 356427 - 357491 1468 354 aa, chain + ## HITS:1 COG:VC1822_2 KEGG:ns NR:ns ## COG: VC1822_2 COG1299 # Protein_GI_number: 15641824 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC 
component # Organism: Vibrio cholerae # 1 335 4 338 358 358 61.0 1e-98 MKEQLKILKKHILTGTSHMIPFIVAGGILFSLAVMLNPAGAATPETGWLAGLAQIGLGGL TLFVPVLGGYIAYSIADKPGLAPGMIGAYLAKEMGAGFLGGMAAGLIAGVVVKELKRIKL PISLKTLGSIFIYPLVGTLVTGGIIVWGIGGPIAAIMNGLTNWLSSLGDVGKVPLASILG AMTAFDMGGPVNKVATLFAQTQVDTLPYLMGGVGVAICTPPIGMGIATLLAPKKYNVEEK EAGKAAILMGCVGITEGAIPFAANDPLRVIPSLIVGAVVGNIIPFLAGVLNHAPWGGLIV LPVVEGRIWYFIAVLAGGFVTALMVNLLKKNYVEESAGNDTDLDDLDEITFDEL >gi|157101630|gb|DS480694.1| GENE 334 357546 - 357854 493 102 aa, chain + ## HITS:1 COG:ECs4879 KEGG:ns NR:ns ## COG: ECs4879 COG1445 # Protein_GI_number: 15834133 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Escherichia coli O157:H7 # 2 97 3 98 106 93 46.0 9e-20 MKVVGITSCPSGVAHTYMAAEALKLSGEKLGMEVLIETQGGAGVENQLKQKDIDEAVCVV LVNDVALEGLDRFKGKKVLKMGVSDLIKKSDAVMKKIHDSFQ >gi|157101630|gb|DS480694.1| GENE 335 357989 - 360025 2063 678 aa, chain + ## HITS:1 COG:SPy1325_1 KEGG:ns NR:ns ## COG: SPy1325_1 COG3711 # Protein_GI_number: 15675268 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Streptococcus pyogenes M1 GAS # 1 473 1 483 514 96 22.0 2e-19 MFNGRMLNIIRYLKEHGEATYKEMAKALGISERSIRYDVDRINDILSLERLPEIEKHSKG LLRYPQSLDLKGLEDGNEVVYTGKERMSILLLILLMRNEDLKINQLSGQFRVSRSTVKND MAALDERLKKDGMGIGYSGHFYLSGPKLKRTTLMNQEFRKYIEYLINPFTEYNSYEFYCI HIIHKAFEGISIANVVMAVDGLLEELKCTLTSSSYLWYMSNVVVLVWFILHDKDYPLDIT VVPDYDREVYQRFGEKLGAIIGKPVSEEHMAMMAKMFDFTNKMAGGTAEVDPVHTQEVTF SLISAMSKRMNMPFDKDTNLVEGLLNHMIPLIQRINNHVDIHDNVSSLMRPEEQQFLSIV KQCCKETDTLEKLENEDELVYLTICFMASIRRMKGAPYKKVLLVCGHGYGTTTMLKESLL SEYQIHIVDTLPIYKLSSYPDWAAIDYVLSTIRINNSLPKPCLTVNPILQSEDRAAMDRL GIPRKSLLSSYYSIEEKLGFLDENTRARVMDVVKKELGYQTVQVFHSPKSFSSFLKFDCI MRADLVSDWKDAVRMSAGLLVKRGFVEESYVSNMLGFIEKQGFYAVSDDSFALLHGKGAE GISKTSLSLLVTREPVVFGEKKARVIFCLASRDGKEHIPAVVTLMRMVKTTALIHDLEES RSEEELYQTVLNCEFEVL >gi|157101630|gb|DS480694.1| GENE 336 360135 - 360476 369 113 aa, chain + ## HITS:1 COG:CAC3376 KEGG:ns 
NR:ns ## COG: CAC3376 COG1917 # Protein_GI_number: 15896618 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Clostridium acetobutylicum # 2 110 1 109 114 130 55.0 6e-31 MLDERWVFHENAEPVQAGPGVVRRVLAYSKDLMCVENTFEEGAVGSLHHHPHTQITYVVS GEFEFNIDGEKKTVRAGDTMLKLDGVEHGCVCRKAGILLDIFNPMREDFVSVL >gi|157101630|gb|DS480694.1| GENE 337 360955 - 361626 633 223 aa, chain + ## HITS:1 COG:CAC0198 KEGG:ns NR:ns ## COG: CAC0198 COG2364 # Protein_GI_number: 15893491 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 12 209 5 201 227 129 37.0 5e-30 MSNGTNYAKDSQMTRRYIMLLIGIFFMGTGISLIIKSGTGTSPMSSLTNVMTQICPLLTL GMYTFLLNVLFFIGEFIVDFKSFSASKFIQLIPTFFLSVAVDFNMFLVRGLHPETYILKL AVLIIGCMFFGLSIAFMVSADVILMPGEALIKIISVKYQKEYGNVKTAVDVSLVVLAVIV SLIFLNTIYGVREGTLIAAFTIGNFSRFFRGYTNKLVREVALQ >gi|157101630|gb|DS480694.1| GENE 338 361623 - 362972 1555 449 aa, chain + ## HITS:1 COG:MK0366 KEGG:ns NR:ns ## COG: MK0366 COG0402 # Protein_GI_number: 20093804 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Methanopyrus kandleri AV19 # 17 441 14 428 431 228 35.0 3e-59 MREGYNNLHIVTVDGLDRIYEDGGILYEDGVITHVGDRAYIEEKAGELKIELKDGKGRYL FPGLINTHTHLYQDIMKGMGSDLSLEDWFPKSMAPAGAVLRERHVAAGVKLGLAEAIRCG VTTVADYMQLQPVKGLGKLELDIAKDMGVRMVYGRGYRDIGKKELVEKAEDVFADVTALK EEFEGDGMYRVWLAPAAGWGASLELLKATREYADRNATPIMMHMFETGTDDKISWERNGK SAIRHYEESGLLGADLLAVHSVAIGEEEISTYARNHVSVSYNPIANMYLASGVAPVGEML KAGITVAIGTDGAGSNNDNDMLEAMKFGALLQKTFHKDPLAMTAGGMLRMATIEGARALG LDQLVGSIEVGKKADFFLFDPAKSVKSCPVHDIVATLIYSGDHKAVDTVVINGKTVMEEG RFLLADEQEILNTAQLMAEDLVRCIQENQ >gi|157101630|gb|DS480694.1| GENE 339 363062 - 363706 635 214 aa, chain + ## HITS:1 COG:AF1519 KEGG:ns NR:ns ## COG: AF1519 COG0655 # Protein_GI_number: 11499114 # Func_class: R General function prediction only # Function: Multimeric 
flavodoxin WrbA # Organism: Archaeoglobus fulgidus # 4 192 2 188 195 132 42.0 6e-31 MAKILGISGSPRNKSTEYALGEALKSIESRDGIETSMITLRGKKIAPCNGCGYCKKNKTW CCLKDDFEPLLKEFMEADAYLFGSPVYAYGPTPQLSAFFSRMRPLFHVYPELMRDKIGSA FAVGGTRNGGEEATITAMHNMMMARGINIVCNEVYGYAGGYIWSKDQGQEGVDADEIGMG GLLKLANKLADMALIREYGKKALEEREAAVNERD >gi|157101630|gb|DS480694.1| GENE 340 363693 - 365027 1473 444 aa, chain + ## HITS:1 COG:MA0246 KEGG:ns NR:ns ## COG: MA0246 COG0043 # Protein_GI_number: 20089144 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases # Organism: Methanosarcina acetivorans str.C2A # 8 422 4 405 422 221 36.0 3e-57 MSGIKDLRDFLAVLKDNKLLFETERAVDWKYEIGSLLATLEAEGQRTGLFHNIKDKEFSV CGNVLASMDGIALALGCRKDEMTDFLADCLEHPVKPQVVADAPCHENIRMGDEIDLERIP VPIHAPKDGGPIITGGVIVSKEIDGTRQNLSFQRMHVKGKDKFSIMINEWRHLKDFYDQA ESQGRALPIAVVIGADPVIYVGAGLRYDGDETEIAGAIRKSPIEVVKCITSDIYVPATAE FVIEGEILPDYREKEGPLGEFTGHYSEPWNSPMVQVKAITHRNGAIYQTINGASFEHINL GNVLPREPLLKTHTQYVSKGVTNVHIPPYGSGFLAIVQMKKKNPGEPKNVALAAMTAYVN IKNVIVVDEDVDIYDPADVMWAVSNRVIPERDIFYIHNAQGHELDPCSDERGVQTKMGID ATLNEESRCLERVRYPKVDLKDYQ >gi|157101630|gb|DS480694.1| GENE 341 365056 - 365676 403 206 aa, chain + ## HITS:1 COG:BH1651 KEGG:ns NR:ns ## COG: BH1651 COG0163 # Protein_GI_number: 15614214 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase # Organism: Bacillus halodurans # 2 198 6 197 206 182 49.0 4e-46 MGKYIVGITGASGSIYAKRVIERLIQKGHHVCICMTQAGKLVVESELGWQIGQDTPSGEV EQYLQEIFGSKELIHHYDVHAIGAPIASGSSGMDAMIVVPCSMGTLSAICHGSSHNLLER AADVCLKERRPLVIVPREAPYNQIHLENMTKLSGYGAVIMPASPGFYSRPRTIEEMVDFF VTRILDQMGIHEPSESRWTGMDVCVK >gi|157101630|gb|DS480694.1| GENE 342 365693 - 367051 356 452 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 16 452 8 437 447 141 25 3e-32 MNARNQEVSHSLYVFDSKPPVRKALPISLQHIMAMFLGTVTVPIVIAGAAGADAATRTIM 
IQYSLMMSALATMIQVCPLGPVGSRLPVIFCAGFTCVPVFTPIAAQYGLPGVFGAQLLCS VITIALGFCVGRLHKFFPTVVTGTIILSIGLSLYPIALKYMAGNASSPTYGQLNNWIVAF VTLAAVLFFNLMCKGVVKMASILLGAVVGYVLALCMGMVHFDGVVSASWLAVPKPFFYGM PSFDLSMVLPILLITIVNIMQSVGDITGTTVGGFDREPRSEELTGGVAASGIATLVGTIF GVPVVSSFSQNVGIVSMNKVVSRRVITIACAIMLALGIVPKFSALVSTLPAPVIGGGTLI VFGMITLTGLKLVSSEPLTARNSTIVGVSIALAMGLSTLEGTAALENFPPLAQALLSKPV VVAGVMSFLLNLLAPGKTVDEEARERAALDAE >gi|157101630|gb|DS480694.1| GENE 343 367532 - 368008 471 158 aa, chain + ## HITS:1 COG:SSO2433 KEGG:ns NR:ns ## COG: SSO2433 COG2080 # Protein_GI_number: 15899181 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sulfolobus solfataricus # 3 156 14 166 171 162 52.0 2e-40 MEQICLNINKKNYNVAIEKNWTLLYVLREELELTAVKCGCNTGDCGACKVIIDGEAVNSC LVLARNAVGKTIETVEGLSDGIHLHPIQQAFIDVGAVQCGYCTPGMIMSAKALLDKNPDP TEAEIRAGISNNLCRCTGYVKIVRAIQLAASRMQKEGK >gi|157101630|gb|DS480694.1| GENE 344 368012 - 370270 2572 752 aa, chain + ## HITS:1 COG:SMb20132 KEGG:ns NR:ns ## COG: SMb20132 COG1529 # Protein_GI_number: 16263880 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Sinorhizobium meliloti # 10 752 18 765 772 451 35.0 1e-126 MELQTPKNGVIGTDMPVRDAALKVTGQFKYVGDMTLPHMLHAKVLFSPVAHARIKSIDTS QAEQLEGVRAVVCWKNAPDALFNSCGEEIDGEKTERVFDSTVRYVGDKVAAVAAETAKIA EQALKLIRVEYEELPYYLEPEEALKEGAYPIHKDSNVIEEVVQEAGDVEKGMAEADYIYE DDFETPAIHHGAIETHTSLAVYESSGKLTVYTPSQDVFGHRTNLSRIFGLPMSRIRVVNP GIGGGFGGKIDMVTEPVTALLAMKTGRPVRLVYTRREDIPSSRNRHSMKLHLKTGMKKDG TIVAQEMDVIVNAGAYAGGTMSIVWAMSGKYFKNHKTPNLRFHAVPVYTNTPVAGAMRGF GSPQEFFAQQCQLNRIAKDLGMDIIQLQLKNLVEPDGFDQRDGLPHGNPRPIDCVLKGME LSGYEEAVREQEDAADSRYRIGVGMAVAAHGNGVFGVRPDTTGVIIKMNEDGTAVMFTGV SDMGNGSVTTQTQVVSEILGIPMPHIECVQADTDATLWDMGNYSSRGTFVSCSAALKVAG QVKTELLKEASGLLEVDESELDLKDQRVYCISRPDTSASLAEVIKYAKQAHGRDICCADT FASCAMAVSYGAHFAKVQVDMENGSVKVLDYTAVHDIGKALNPMGVEGQIEGAVQMGIGY 
ALTEGFILDDKGKVKNTTLKQYHMLNAQEMPPIKVGLVEQIEQSGPFGAKSIGECSVVPV AAAVANAVSNAVGKQVKRIPVRPADVMKLIES >gi|157101630|gb|DS480694.1| GENE 345 370393 - 371292 780 299 aa, chain - ## HITS:1 COG:SPy0898 KEGG:ns NR:ns ## COG: SPy0898 COG0583 # Protein_GI_number: 15674920 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 1 295 1 301 301 98 23.0 2e-20 MNFSHLEYAMTVAECRSINKAAQRLLVSQPYLSGVLKNLEEELGCQLFKRSHNGILLTEK GIKFMDSARIILYEYEKLKQLSWDEAEQPMVISSYFVSNIMRLFLEFKKQSAKQFPDRLT EMGNQEVLESVAAGRTRLGFILCAVEKKEKYLRLARDFHCSCEELMTSIPLYVMVSREHP LYKKPSVSIKELFDYPYVHFNDVSAISYLKLVGLSDHPNRLEVDNRGQFFDAIREGQYIS LSVRGKTSDGRGFCFVPISDKNLYLNLYFVTQTDYRPNKREREFIRFLRDFAALDGQTP >gi|157101630|gb|DS480694.1| GENE 346 371479 - 372657 1251 392 aa, chain + ## HITS:1 COG:FN1063 KEGG:ns NR:ns ## COG: FN1063 COG1473 # Protein_GI_number: 19704398 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Fusobacterium nucleatum # 5 386 3 393 394 250 36.0 3e-66 MYRELLEEAKTMEADLIAWRRKLHTMPELGLELPDTSTFVREKLEEMGIPYEIKVNGSCV VGILGKGSRCFMLRSDMDGLPFQEESGEEFASRNGRMHACGHDLHATVLLGAARLLKQHE SELKGQVKLLFQPGEETFQGARAAVEEGVLDHPKVDSAFAMHVAAQMSPEYVAYGSLPMA GVYGFRITLTGQGGHGSRPEKCIDPINTGVHVYLALQELIARECPAISETALTIGQFCAG SASNVIPETAVLQGTMRSFDEKTMSHLIARLNEIVPSVAGAYRTKAEIEVISDIPIVRCN EELNQEIVEGLKELEPELKAVCAYHVMGSEDFAYISQKIPASYMCIGAGIEDVSKRYVEH NPKVRFHESALVKGAAIYAGTAMRWLDLHGNQ >gi|157101630|gb|DS480694.1| GENE 347 372685 - 373968 1127 427 aa, chain + ## HITS:1 COG:BH2694 KEGG:ns NR:ns ## COG: BH2694 COG0477 # Protein_GI_number: 15615257 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus halodurans # 5 396 3 394 418 105 26.0 2e-22 MAENKRFFYGWLILLTCILIMGLGYAPLVSCASLFTKPVTEDLGFTRSGYALVSTIAALM 
MAFMSPFIGKIMSKPYMHKALVVCMIGNCLAYYCYSFASTLPQFYVTAACIGFFECGSTM IPVSVLITNWFRKKRGLVMSIAMVGSSIGGTILSPIIGNLIASQGWRFTYKAVGITRLVI LVPLVLLVIRRTPADMGTKAYGADEEGETGKARKTEREWNVSLKEVKKKPMFWCFVLGCT LISATAAVITQIPASVMDAGYATTTAASIASLYLFIAIPGKLVLGHIYDRYGVKAGILFG NTMFFLSVICLIFIKHQPFLYLMAILFGFGTCIGTVSAPVITSGIFGTKHYAEIYGFLSL FPSVGYALGGPLIASAYDLTGSYNIAWIIVAALSIVMTAFLLYSYRVSRVCIKEQEKTAD RAKNTMQ >gi|157101630|gb|DS480694.1| GENE 348 374445 - 375692 1517 415 aa, chain + ## HITS:1 COG:SA1915 KEGG:ns NR:ns ## COG: SA1915 COG0112 # Protein_GI_number: 15927687 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Staphylococcus aureus N315 # 6 415 1 412 412 516 63.0 1e-146 MVEQVMDFIEGYDKEIGEAIKAECGRQRRNLELIASENIVSEPVMAAMGTVLTNKYAEGY SGKRYYGGCEFVDVVETIAIERAKKLFGCDYVNVQPHSGAQANMAVFVAMLKPGDTVMGM NLDHGGHLTHGSPVNFSGLYFNIVPYGVNEDGYIDYDKLEETAVASKPKLIIAGASAYCR TIDFKRFREVADKVGAYLMVDMAHIAGLVAAGVHPSPIPYADVVTTTTHKTLRGPRGGMI LANQAVADKFNFNKAIFPGIQGGPLEHVIAAKAVCFGEALRPEFKAYQEQVVKNAAALAA ALKRQGFNILTGGTDNHLMLVDLRGMDVSGKELQNRCDQVYITLNKNTVPNDPRSPFVTS GVRIGTPAVTSRGLKEEDMEKIAECIWLAATDFEAKADYIRGEVDKICGKYPLYQ >gi|157101630|gb|DS480694.1| GENE 349 375778 - 376866 1319 362 aa, chain + ## HITS:1 COG:lin1385 KEGG:ns NR:ns ## COG: lin1385 COG0404 # Protein_GI_number: 16800453 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Listeria innocua # 3 362 5 362 362 348 50.0 7e-96 MELKTPLYDMHVKYKGKIVPFAGYLLPVQYEKGVIAEHMAVREQCGLFDVSHMGEILLSG PDALKNVNMLLTNDYTVMEDGTARYSPMCNEAGGVVDDLIVYKIKDNSYFIVVNASNKDK DYQWMKDHVSGDVELKDISGQVGQLALQGPKALDVLKKVADPDAIPDKYYTFKKDCCIDG IPCIISKTGYTGEDGVEIYMAGNDAPRLWELLLEAGREEGLIPCGLGARDTLRLEASMPL YGHEMDDTITPKEAGLGIFVKMDKEDFIGKKALQEKGPLTRKRVGLKVTGRGIIREHEPV FAGEQQIGTTTSGTHCPYLGYPAAMALVDIAYKEPGTQVEVDVRGRRVGAEVVKLPFYKR EK >gi|157101630|gb|DS480694.1| GENE 350 376908 - 377285 660 125 aa, chain + ## HITS:1 COG:PH1317 KEGG:ns NR:ns ## COG: PH1317 COG0509 # 
Protein_GI_number: 14591129 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Pyrococcus horikoshii # 3 125 13 138 138 120 54.0 7e-28 MNVPKELMYTKSHEWVKTVDDTTVTVGLTDYAQDALGDLVFVNLPEPGDDVTAGEAFSDV ESVKAVSDVLSPVTGQVEEINEELLDSPESVNQSPYDAWFIKVKDVSGKEELMDGEAYEA YLETL >gi|157101630|gb|DS480694.1| GENE 351 377310 - 378629 1441 439 aa, chain + ## HITS:1 COG:BH2815 KEGG:ns NR:ns ## COG: BH2815 COG0403 # Protein_GI_number: 15615378 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain # Organism: Bacillus halodurans # 4 428 5 438 447 374 45.0 1e-103 MGSYVPNTSEEQQEMLREIGFDSFEALFSHIPREVRVKGGLKLPEGLGEMEVIGKMKEIA KQNVVFKHIFRGAGAYSHYIPSIVKSVVSKEEFLTSYTPYQAEISQGILQSIFEYQTMLC QLTGMDVSNASVYDGATAAAEAAAMCRDRKRSRVYLSETTHPLVTEVVKTYCFGSGAQVV MVPEKDGVTDADALRELMDGTAACFYVQQPNYYGNLEDCSLLGEITHEKGAKYIMGCNPI SLALLPTPAECGADIAVGDGQPLGMPLSFGGPYLGFMTATSAMGRKLPGRIVGETKDVDG RRAYVLTLQAREQHIRREKASSNVCSNQALCALTASVYMAAMGADGLRKTAVLCTSKAHY LKEQLEQAGLTARYDREFFHEFVTESDTDSGLILKALEEKGILGGLPLSHREILWCATEQ NRKEDMDQAVQIVKEVCGA >gi|157101630|gb|DS480694.1| GENE 352 378626 - 380023 1460 465 aa, chain + ## HITS:1 COG:lin1387 KEGG:ns NR:ns ## COG: lin1387 COG1003 # Protein_GI_number: 16800455 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Listeria innocua # 1 465 7 487 488 514 57.0 1e-145 MKLIFEKGTPGRHCHLLPDCDVPVHQIGVGRKEKLNLPELSENEISRHYTGLAGRAHGVN NGFYPLGSCTMKYNPKVNEEAASLEGFAGIHPLQPEHTVRGALRVFDLAEQYLCEITGMD AMTFQPAAGAHGEFTGLLLIKAYHHSRGDEKRTKIIVPDSAHGTNPASATMAGYSVVSIP SREDGCVDLDKLKEAVGEDTAGLMLTNPNTVGLFDKNILEITRIVHDGGGLCYYDGANLN AVMGIVRPGDMGFDVIHLNLHKTFSTPHGGGGPGSGPVGCKEFLRPFLPGFMVKGPQSIG SVKMFYGNFGVVVKALAYIMTLGREGIPEASSNAVLNANYMMNKLSDLYQMAYNETCMHE FVMTLEDLKHDINVSAMDIAKALLDYGIHPPTMYFPLIVHEALMVEPTETESKETLDEAI EVFRSIYEKAKADPQSLHQAPVKTPVRRLDEVTAARKPVLRYIGE 
>gi|157101630|gb|DS480694.1| GENE 353 380122 - 381558 792 478 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 5 464 4 457 458 309 36 9e-83 MADIFDLVIIGAGPGGYVAAIKGAKLGLSVAVVENREVGGTCLNRGCIPAKAMIHASRLY REMKEGGQFGIFAENVRYEYDKILEYKEGTSGSLRQGVEQLFKANNVTLVKGTGTLQADK TVLVTGCEGEEESRVLKGTHVLLASGSKPMNLPIEGLELPGVLTSDGLLGLQHAPESLLI IGGGVIGAEFASVFSSLGIRVTIVEALPRLLANLDKDISQNLKMILKKRGIKIYTGAMVK RIEKTEHGLACVFEEKGEEKREEADYVLSAAGRVPETEKLLGPGTELAMERGRITVDSNF ETSMEGVYAIGDVIKGIQLAHVASAQGVWVAEHLAGTGHSIDLSVVPSCVYTSPEIASVG LTEDEAAQAGIPVSVGKFLMSANGKSQISREERGFIKIIAEADTKVVLGAQMMCARATDM IGEMATAAANKLTVPQLLLGMRAHPTYNEAVGEALEELEGGSIHTVPRQAPGKLKSRQ >gi|157101630|gb|DS480694.1| GENE 354 381592 - 382260 733 222 aa, chain + ## HITS:1 COG:lin1927 KEGG:ns NR:ns ## COG: lin1927 COG1760 # Protein_GI_number: 16800993 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Listeria innocua # 1 222 1 220 220 185 40.0 6e-47 MAVISAFDVLGPNMIGPSSSHTAGAVAIALLAQKMIAGQIIEAEFTLYGSFSRTYRGHGT DRALLGGIMGFKTDDRRIRDSFSIAQQRGLRYSFLTNEKDKDVHPNTVDIRMVNDRNQEM TVRGESLGGGKVRIVKINQVDVDFTGEYSSLIVIHQDKPGVVAHISKCLSDCNVNIAFMK LFREEKGAQAYCIVESDERLPLEITDWINSNPHVYHTMLVQV >gi|157101630|gb|DS480694.1| GENE 355 382273 - 383148 910 291 aa, chain + ## HITS:1 COG:CAC0674 KEGG:ns NR:ns ## COG: CAC0674 COG1760 # Protein_GI_number: 15893962 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Clostridium acetobutylicum # 1 281 1 281 290 224 46.0 2e-58 MDFKSGAELLEQCAVRGMTISEVMKQRECDLTGVSMEDMEARMKDVLAVMRDSATLPLRE PVTSIGGLIGGESARMEERRRKGTSILSGVLGRAVIYAMAVLEVNTSMGLIVAAPTAGSS GIVPGLLLALQEEYGLEDDLVADGLFAASAVGYLAMRNATVSGAVGGCQAEVGVACAMAA AAAVELMGGTPAQSLDAASAALMNLLGLVCDPVGGLVECPCQNRNGAGAANALVCAETAL SGIRQLIPFDEMLEAMYAVGRGLPLELRETALGGCAATPSGCRMGCCVQKA >gi|157101630|gb|DS480694.1| GENE 356 383178 - 384560 1058 460 aa, chain + ## HITS:1 COG:CAC2398 KEGG:ns NR:ns ## COG: CAC2398 COG0285 # 
Protein_GI_number: 15895664 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Clostridium acetobutylicum # 19 452 1 431 431 258 35.0 2e-68 MDFNEAVDFGVTADSGERMDFDGALDFLEQRSGLGSVMGLDSIRNLLRELSDPQKDLEFV HIAGTNGKGSVSACLSSILKEAGCRTGTYTSPAVISVRERYQVDGSWITEREFALLADRV KAAAGRMEEKGRGIPTVFEIETAMAFLYFKEKGCRVVVLETGLGGEQDATNVVENTLAAV FTSISMDHMGVLGNTLGQIAACKAGIIKPGCEVVSAHQPLEVLSVLKDRAGEKGCVFTQV DMSSLSALSCPEREENRGEYAENVFTYKAMEGIRISLPGTYQLENGALALEAASALSRKG IHIPESAMRKGLSNVSWPGRFQMLKKSPMVIVDGAHNRDAALRLRECVETCLNGRRLIFI MGVFGDKEYGLMTQIMAPLAERIYTVWLPDRGRSLNPEILAREAGKYCAGTQAVDSVETA LDMALKDAGEQGAVLVFGSLSYLGQIMKETERRNRDDRQE >gi|157101630|gb|DS480694.1| GENE 357 384544 - 385101 454 185 aa, chain + ## HITS:1 COG:SP0291 KEGG:ns NR:ns ## COG: SP0291 COG0302 # Protein_GI_number: 15900225 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Streptococcus pneumoniae TIGR4 # 2 183 1 182 184 211 52.0 9e-55 MIDRNKVEQAVRLLLEGMGEDPSREGLIDTPERVARMYEELYSGMDEHPGMYLEKTFRAE NNDLIVEKDITFYSVCEHHLLPFYGKVHVAYVPDKKVAGLSKLARTVEVFSRRLQIQEQL TAQIADALMDGLAPRGVMVMMEAEHMCMTMRGIKKPGSRTVTVVKRGCFCQRENLVNLFF QMVRG >gi|157101630|gb|DS480694.1| GENE 358 385106 - 385924 612 272 aa, chain + ## HITS:1 COG:BH0093 KEGG:ns NR:ns ## COG: BH0093 COG0294 # Protein_GI_number: 15612656 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Bacillus halodurans # 12 267 18 273 280 279 53.0 4e-75 MIIGQKEFDVWNRTYVMGILNTTPDSFSDGGRYCDREMALRQAEKLVFDGADIIDVGGES TRPGCRPVGEEEELERVLPVIRLIHENFDIPISVDTYKSGVARQAVEAGAAMVNDIWGLK KDKDMAGTIAELGVCCCLMHNRQNPVGHHFMNTALEEMNETILLAKQAGIKDDRIVLDPG VGFGKTYEMNLEVIRNLSAFKRFGYPVLLGASRKSVIGLALDLPVDKREEGTLVTTAAAV FSGCSFVRVHDVEGNKRAVRMARSIMGAPEFK >gi|157101630|gb|DS480694.1| GENE 359 385993 - 386805 505 270 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 1 270 1 268 278 199 
40 2e-49 MDQIRIEQLEVFAKHGAIPEENVLGQKFLVSAVLFTDTRKAGRNDQLDQTISYGRASKFM ERYLREHTFQLIESAAEKLAEELLLMYPDSLKSVELEIRKPWAPIGLPLQYVSVRITRGW HQAYIALGSNMGDRLAYLTGAVEALGKLRECRVGRVSGFIETPPYGVTDQADFLNGCMEI KTLYTPEELLDALHEIEAAAGRERIIHWGPRTLDLDVIFYDDLVYESERLSIPHTQMHKR DFVLKPLAEIAPYKRHPIYGETVAEMLDTV >gi|157101630|gb|DS480694.1| GENE 360 386856 - 387377 606 173 aa, chain - ## HITS:1 COG:PH1832 KEGG:ns NR:ns ## COG: PH1832 COG4720 # Protein_GI_number: 14591582 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pyrococcus horikoshii # 3 171 36 199 202 70 33.0 1e-12 MTERFSTKKLVLSALFAALACVATMSIRIPTPGTGGYIHPGDAIVILCGVFLDPVSAFLA AGIGSCMADLLGGYFIYVPITFVIKGLVAFSASHVCGRLRARGMNPYAAVAGCGVIDIIL VAGGYCLCEIFLYGLGAALASVPSNIIQGISGLIISSALYPVLQKPLSMALKA >gi|157101630|gb|DS480694.1| GENE 361 387521 - 388153 629 210 aa, chain - ## HITS:1 COG:lin2878 KEGG:ns NR:ns ## COG: lin2878 COG0546 # Protein_GI_number: 16801938 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Listeria innocua # 3 209 2 201 203 139 36.0 3e-33 MRYTHFVFDIDNTLINTTGAVLHGLQRALRDVTGEHWDLSRLTPVLGIPGLDAFERLGIH SPEQIFQIYPRWEQYEQEYQYTAYLYEGIVPLLDSLKKKGCVLGIITSKTMPQYTSSFLP FQISEYFQTVITADDTVRHKPDPEPMLAYMERTGVCPRQILYIGDSIYDMQCAAQAGVDS CLALWGCHCPDGILSTHRFEEPAGMMRWLE >gi|157101630|gb|DS480694.1| GENE 362 388153 - 389154 892 333 aa, chain - ## HITS:1 COG:lin0520 KEGG:ns NR:ns ## COG: lin0520 COG1940 # Protein_GI_number: 16799595 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Listeria innocua # 6 318 4 318 334 80 23.0 5e-15 MAQEEPVTVKQLNKEHIRAYMGKHTSASKSQIARELGLSFPTVGRLIDELCSDGELTEQG AGSSTGGRCACIYELNPLFSLYLLVQVESDRVCWNLKDLRENTVEQGCLPFEILSLEQLD ALILDIHHRYPRLKAAAAGIAALVNHGVVEETNQYIGLKGLNLEEHFRRLVPIPMTVGND MNFLTMGCWAQKHPDAGSLVTLFMGGNGIGGGMVINGQLWTGASGYCSEVGFLPVYERLN KGSLGLPPAEHISELYARLIQIYAVTVNPHMMVLYRHPLLEGRLDEIRRLCASYLPSKAI PSIELSYDYQQDYEKGLFTVAKSMGQAVSEGGL >gi|157101630|gb|DS480694.1| 
GENE 363 389349 - 390293 878 314 aa, chain + ## HITS:1 COG:FN0623 KEGG:ns NR:ns ## COG: FN0623 COG0679 # Protein_GI_number: 19703958 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 314 4 318 318 121 27.0 1e-27 MEHLIFSLNATMPIFFLMVLGACFKKAGIMEGVFADKANQFVFKVALPVLLFEDLSNSDF LKVWDTRFVMFCFASTLGGILLAVLLSMALKDRRLRGEFIQASYRSSAALLGIAFIKNIY GDVGMAPLMIIGSVPLYNVMAVVILSFTNPEGAVLDRRMLGKTAACILKNPIILGILTGM AWSLLGLKQPQIMEKTVSSLAGVATPLGLMAMGASFDFRQASQRVGAAAGASAIKLVLLC AVFLPAAVYLGFREQQLVAILVMLGSAATVSCFVMARNMGHEGVLSSSVVAITTCGSAFT LTLWLDLIRTLGLI >gi|157101630|gb|DS480694.1| GENE 364 390324 - 391331 795 335 aa, chain + ## HITS:1 COG:BS_yxxF KEGG:ns NR:ns ## COG: BS_yxxF COG0697 # Protein_GI_number: 16080975 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 3 282 15 293 311 134 32.0 2e-31 MKAKLEIIGSMAAFGTIGAFVRHIPLPSGELALYRALIAAGAIFLWQLCSGQGTGIKDAK AELHLLILSGAAIGINWILFFEAYRYTTVAMATLSYYFAPVIVTVASPFLFKEKLTARQV FCFIMSTLGLVLVIGVSRGGSSSDLIGILFGLGAAVFYASVVLLNKSIRHVSGINRTFFQ FLAAILVLTPYVCLTGGSHLLDMDALGVINLLVVGVFHTGICYCMYFSSLRYLKGQEASI LSYIDPLVAILVSVTFLHETITVFQVVGGGMILGFTVVNEVRMRAGGGDPDPREAAVPAA HPPFTRCISNASAVSSLPGKEKNIPCQSVVKITPY >gi|157101630|gb|DS480694.1| GENE 365 391742 - 392950 1299 402 aa, chain + ## HITS:1 COG:no KEGG:CD3114 NR:ns ## KEGG: CD3114 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 2 402 3 401 402 493 67.0 1e-138 MSFWNPLTCITGCLVVYSVGEFLSKKTKGAVSSLLFACVLFLIGFWSGLLPKDITTQSGL VAVMANFGTAFMITNIGTLINLEDLIREWKTVVISLFAIAAIALICFTVGSLLFGREYAL IAAPPVAGSTVAGIIVTSAAEAANRPELAAFAVLVLSVQKFFGIPISTFCIRKELRIKRG SGFFKSNTVEEKSSLKLPSMRIFKETPKNLRSNTIYICKVALVACIADFVGKATLIPGSS PANYILNPNIAYLLFGLIFARIGFLEKDIFAKANSSGIITFGLLLMLPGSLATLSPSGLL SMIVPVFGILLICSIGIIVICGIVGKVLGCSPYTSAAVGVTCMLAYPATQIITTEGVDSF 
EWEGDERQKAMDYILPKMIIGGFVTVTIASVAFASIIGPIIF >gi|157101630|gb|DS480694.1| GENE 366 392971 - 394353 1583 460 aa, chain + ## HITS:1 COG:AGl2896 KEGG:ns NR:ns ## COG: AGl2896 COG1473 # Protein_GI_number: 15891558 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 9 453 84 543 552 294 40.0 3e-79 MTGQIAIEEIERKRQKIDSLSQYIWNHPEAGFKEYKAQEKVADLLREEGFQVETGAGGVP TAIKAVYGSGHPVMGFLGEFDALPGLSQKVSVNKEPVEGQPYGHGCGHNLLCSAHVAGVI GLKEEMIQRNLPGTIVFYACPGEELLTGKPLMAKGGAFEGLDVAVNFHPNKINEATVGVS TAVNSAKFHFYGKASHAANAPENGRSALDAVELTDIGANYLREHVPSDVRIHYTIVDGGV APNIVPDKAVVWYYMRAFSREVVENVYERLVKVAKGAAMMTETELEIEFLGGCYNTQNNH VLAGVVAEAMNEIPQEPWTQEELDFAAALDEQTADAARATTKKYGLSADTHLYTGPGQVT CFNSYGSTDVGDVMHLVPTAYFFTACTNMGAPAHSWQFASCAGSSIGEKGMIYAAKVMAL YGLKLIEKPELIAQAKEEFDRQMEGRSYKCPIPDGMTMPW >gi|157101630|gb|DS480694.1| GENE 367 394368 - 395549 1209 393 aa, chain + ## HITS:1 COG:FN1063 KEGG:ns NR:ns ## COG: FN1063 COG1473 # Protein_GI_number: 19704398 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Fusobacterium nucleatum # 1 390 1 393 394 289 39.0 8e-78 MDFLKEAEGIEQEVIGWRRHVHQYPELLMDLPETAAFVERELKNMGYEPEYICQSGIVAV LDSQKPGKTLLLRADMDALPIGEESGLEYASKHPGISHACGHDTHIAMLLGAAKLLMKHK DLLRGKVKFMFQPGEEGGGGARTMVEAGVLRSPDVDACMAFHQVVARDHVPTGIIGYTRG PMMASADMFRITVRGKSAHGASPESGVNPVQILCQIYNGLQSIECSEKPRGSALALTIGQ ISAGRAGNVIPEQGFMCGSIRAYEENVRSLAKRRLKEISEQTAAMYGGRAEVEFPSELPA TVNDLQVGDEMFGYVKELVGEEGTMMLPQIMGGEDFAEVLKEAPGVLFRVSLGDKQEGYP YISHNSKVIFNESGMKHAVAAFAHCAVRWLNAH >gi|157101630|gb|DS480694.1| GENE 368 395778 - 396788 963 336 aa, chain - ## HITS:1 COG:no KEGG:CD3111 NR:ns ## KEGG: CD3111 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 220 2 221 442 199 42.0 2e-49 MKIAVILTRFLKDYVESYFNSLSLDCELSYYIYNDFAHAGELYLELEKSYDGFLVSGPVP 
RAAICRRVPVLTKPIVSFGSNALCYYETFFQVQFKEQDFYLERGYFDLLEWLPETVPLHQ YLERGQFNQLIYKVYQNTSTYSLEQLCEMEKKIKLKHIRLWNEKKIQYSVTRFSNIIQDL LDAGVNTYFVYPKYEILQESVTLLLQEIHLKTIVQNQNAITALNQQFSDGASVMAPQLSK DPDTSAPLTSRSRQLSQKAGISVSYADRLLTALSAMDRDCITSQALAAALHITPRSANRL LGRLSATGIFLELERQPSFTRGRPEKIYQFHPELIS >gi|157101630|gb|DS480694.1| GENE 369 396834 - 397847 1098 337 aa, chain - ## HITS:1 COG:BS_ansA KEGG:ns NR:ns ## COG: BS_ansA COG0252 # Protein_GI_number: 16079415 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Bacillus subtilis # 1 331 1 328 329 289 44.0 5e-78 MKRILLLGTGGTIACKRTDKGLKPVITSDEILSYVPDSQSYCEIHSIQLMNIDSTNIQPC HWLAVEKAIEENYSQYDGFVITHGTDTMAYTAAALSYLIQHSPKPIIITGAQKPIDMENT DARTNLADSLLFASCPGAHGVNIVFDGKVIAGTRGKKEWTKSYHAFSSINFPYVAIIQDG QIIFYLDDKDRDKEPVCFYRQMDGDVGLMKLIPSMDASLLDYMAEHYDAVIIESFGVGGL PSYQDGDYHGAIARWTGLGKTVIMTTQVTNEGSDMSVYEVGRSIKQEFGLLEAYDMTLEA VVTKIMWILGQTKDPARIRRMFYETVNRDILWKWSEN >gi|157101630|gb|DS480694.1| GENE 370 398082 - 398534 539 150 aa, chain + ## HITS:1 COG:TM0865 KEGG:ns NR:ns ## COG: TM0865 COG1585 # Protein_GI_number: 15643628 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Thermotoga maritima # 3 138 5 137 140 67 33.0 7e-12 MSLIWLIAFVVFLVVEGVTTSLTSIWFAGGSLAALVVQVCGGGLYPQLAVFVAVSFMLFL MVRPFAYKYLYGKRTKTNVDSLTGRKAVVKDRIDNVAGTGTAILAGETWLARAAEEGDTF EAGDVVVISAVSGAKLLVAAAKQEHEVEKA >gi|157101630|gb|DS480694.1| GENE 371 398556 - 399506 1237 316 aa, chain + ## HITS:1 COG:FN1549 KEGG:ns NR:ns ## COG: FN1549 COG0330 # Protein_GI_number: 19704881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Fusobacterium nucleatum # 11 308 7 293 294 
310 58.0 3e-84 MFGAIGGFITFIIIAVIILFVLSTCIRVVPQAQALVVERLGAYLGTYSVGIHFLVPFIDR VAKKVNLKEQVEDFPPQPVITKDNVTMQIDTVVFFYITDPKLYAYGVERPLLAIENLTAT TLRNIIGDLELDETLTSRETINAKMQESLDIATDPWGIKVTRVELKNIIPPAAIQEAMEK QMKAERERRESILRAEGEKKSMVLVAEGHKESAVLNAEGEKEAAILAAEAEKEKKIREAE GQAEAIRSVQKATADGIRFIKEAGADNAVLQLKSLEAFQAAANGKANKIIIPSDIQGIAG LVKSIAEVASKDEAAE >gi|157101630|gb|DS480694.1| GENE 372 399655 - 401049 1440 464 aa, chain + ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 6 442 5 440 445 308 37.0 2e-83 MEGKVVKGNRITEGSIFGQLLLFFFPILFGTFFQQLYNTADAMVVGRFVGKQALAAVGGS TSTLINLLVGFFVGLSSGATVVISQFYGARKADKVHWAVHTSIAFSVIGGIIFMIVGLVG SPWALEAMKTPEDVMGHAVVYIRIYFLGIIVNLVYNMGAGILRAVGDSRRPLYFLIASCF TNIILDVLLVAVLGMGVAGAALATITSQLLSAVLVVLALMKTDDMYKLEWKKVRIDQRML QRIVRIGIPAGMQSVMYNISNVIIQAGVNTLGTDNVTAWATYGKVDGLYWMMINALGISA TTFVGQNFGAGRLDRVRKGAGACMVIGVVLTASVGVVLYNGGHLLVELFTTDQQVQAISM DLLHFMVPTFITYIAIEILSGTLRGVGDAWMPLIITGIGVCAVRVLWIMFVLPHYHTIIG AAFCYPLTWSLTTVAFVIYYYFFSSLRRWKLKPLKKRFGVYKPF >gi|157101630|gb|DS480694.1| GENE 373 401091 - 401273 299 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940433|ref|ZP_02087778.1| ## NR: gi|160940433|ref|ZP_02087778.1| hypothetical protein CLOBOL_05323 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05323 [Clostridium bolteae ATCC BAA-613] # 1 60 7 66 66 109 100.0 6e-23 MTTKEVMFDSVEDVKRFVQQSEKQPEDIDVCCGSCMVDGKSILGILSLGIHKKLNVVIHD >gi|157101630|gb|DS480694.1| GENE 374 401382 - 401456 67 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNRSCDTIDTKYIFTDWTFRINRP >gi|157101630|gb|DS480694.1| GENE 375 401475 - 402308 957 277 aa, chain + ## HITS:1 COG:CAC3396 KEGG:ns NR:ns ## COG: CAC3396 COG2816 # Protein_GI_number: 15896637 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Clostridium acetobutylicum 
# 108 255 103 249 271 153 50.0 4e-37 MIQDIAPHTYDPVFRRKKPEDRDYLLHYEYNKVMLIKSGRGSVIPTFGELGAEGDFSAQA EYLFSIDDRAYYCVTDMKVPEFGGFYMEPLTVFRNFEPLHQAFAGITGSQLYRWMQSRRF CGGCGAKTEPSLKERALVCPSCGQTEYPKISPAVIVAITNGDKLLMSRYARGAYRNYALI AGFVEIGETFEDCVRREVMEEVGLRVKNIRYYKSQPWAFSDTEMVGFTAELDGDDTICLE EEELCEAGWFTRDEIVEYGPYISVGHEMMKAFKDGKI >gi|157101630|gb|DS480694.1| GENE 376 402340 - 403008 805 222 aa, chain + ## HITS:1 COG:CAC0418 KEGG:ns NR:ns ## COG: CAC0418 COG0546 # Protein_GI_number: 15893709 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Clostridium acetobutylicum # 5 222 3 215 216 202 46.0 4e-52 MNRKDYILFDLDGTLTDPKEGITKSVQHALEHFGIQTDDLDSLTPFIGPPLRDSFKRYYG FSDEQAWEGVQAYREYFSVRGWVQNKEYPGIKEMLEALKEAGRVLLVATSKPEEFAKKIL DHFDMAEYFDFIGGADMGETRVRKADVIRYVLEQCGLESDDETIGRCVMVGDREHDVLGA RECGMECVGVLYGYGDRQEMDACRPAWIAETVKDLRDLLLSL >gi|157101630|gb|DS480694.1| GENE 377 403134 - 403637 697 167 aa, chain + ## HITS:1 COG:MTH1443 KEGG:ns NR:ns ## COG: MTH1443 COG0440 # Protein_GI_number: 15679440 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanothermobacter thermautotrophicus # 3 160 7 165 168 146 51.0 2e-35 MDDRIVLSLLVDNTAGVLARVAGLFSRRGYNIESLTVGVTADPRYSRMTVVSLGDQTVLE QIKNQLNKLEDVRDIKELQPDRSVYRELMMVKVRANASDRQSVSAISSIFRATIVDVGKD SLTVMLTGDQSKLDALINLLEDYEILELARTGLTGLERGAEDIRMLP >gi|157101630|gb|DS480694.1| GENE 378 403750 - 404760 1165 336 aa, chain + ## HITS:1 COG:TM0550 KEGG:ns NR:ns ## COG: TM0550 COG0059 # Protein_GI_number: 15643316 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Thermotoga maritima # 1 327 1 327 336 441 68.0 1e-123 MPKIYYQEDCNLSLLEGKTIAVIGYGSQGHAHALNLKESGCNVIVGLYEGSRSWKKAQEQ GFEVYTAAEASKKADVIMILINDEKQAAMYKESIAPNLRPGMMLMFAHGFAIHFGQIVPP KDVDVTMIAPKAPGHTVRSEYQRGRGTPCLVAVYQDATGKAMDMALAYGQGIGGARAGIL ETTFRVETETDLFGEQAVLCGGVCALMKAGFETLVEAGYAPENAYFECVHEMKLIVDLIY 
ESGFAGMRYSISNTAEYGDYITGPKIVTDETKKAMKKILSDIQDGSFAKEWLLENQVGCP HFNAMRKNEAEQPLEKVGAELRKLYSWNDTDKLINN >gi|157101630|gb|DS480694.1| GENE 379 404932 - 405855 1199 307 aa, chain - ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 18 307 2 290 296 154 32.0 2e-37 MGYMKKCYIAGGDSMEQNLSQYKIFYEVAKAGNISKAAKELYISQPAISKAISKLEDSLG LSLFTRSSRGVQLTSEGEILFEHTREAFDALDRGEQELKRIQEFDIGHLRIGVSNTLCKY ILLPYLKTFIDQYPHMKVTIESQATAQTLARLEQQKIDLGLVAEPSVRRDLAFIPVMDIQ DTFVTTPNYLENLYLREGQDTSLFETGNIMLLDTSNMTRHHVDEYMAENNIFPHQILEVT TMDLLIEFAKIGLGIACVIKELVQKELDSGMLVEIPLDIPIHRRTIGFAYHPANQAMALK TFLEFLY >gi|157101630|gb|DS480694.1| GENE 380 405963 - 407225 1601 420 aa, chain + ## HITS:1 COG:CAC3173 KEGG:ns NR:ns ## COG: CAC3173 COG0065 # Protein_GI_number: 15896421 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Clostridium acetobutylicum # 3 416 5 418 422 605 75.0 1e-173 MGMTMSQKILAAHAGLEEVKAGQLIEADLDLVLGNDITSPVAINEMKKMDRKTVFDKDKI ALVMDHFIPNKDIKSAENCKCCREFACRHEITNYFDVGDMGIEHALLPEKGLVVAGDAVI GADSHTCTYGALGAFSTGVGSTDMAAGMVTGKAWFKVPSAIKFQLVGKPAKWVSGKDVIL HIIGMIGVDGALYKSMEFVGEGIRNLSMDDRFTICNMAIEAGGKNGIFPVDELAVEYMKE HSKREFTVYEADEDAEYDETYVIDLSELKPTVSFPHLPSNTRTIDQVGEVKVDQAVIGSC TNGRIEDMRIAAEVLKGRKIAKGVRCIVIPATQSIYLQAMKEGLLEIFIEAGAVVSTPTC GPCLGGYMGILAAGERCISTTNRNFVGRMGHVDSEVYLASPAVAAASAVAGKIICPCQLD >gi|157101630|gb|DS480694.1| GENE 381 407270 - 407755 553 161 aa, chain + ## HITS:1 COG:PAB0892 KEGG:ns NR:ns ## COG: PAB0892 COG0066 # Protein_GI_number: 14521550 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Pyrococcus abyssi # 1 157 1 157 164 221 66.0 4e-58 MKACGHVFKYGDNVDTDVIIPARYLNATQGDELAKHCMEDIDKEFVNQVQKGDIIVANKN FGCGSSREHAPLAIKCAGVSCVIAETFARIFYRNSINIGLPIIECGSAARSIEAGDEVEV DFDSGIITNKTKGETYKGQSFPPFMQKIISAGGLVNYINGQ 
>gi|157101630|gb|DS480694.1| GENE 382 407961 - 408833 1015 290 aa, chain - ## HITS:1 COG:CAC0023 KEGG:ns NR:ns ## COG: CAC0023 COG0583 # Protein_GI_number: 15893321 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 289 1 289 299 179 32.0 6e-45 MELREIAAFIQVAQLKSFSKAAKQLGYSQAAVTIQIKQLEQELNVHLFDRIGKQTTLTHQ GALFYEHAVSVMKELEHARDAVTHVSELTGSLCLGTIESICATIFPELLREYHRLYPKVN VRLITDSPETLLEMMNSNAIDIVYFLDKRMYDSKWIKVLEEPEDIVFVTASGHPFVGRDS LDLDLVIDQPFILTERNASYRFILDQYLAAQGKEIHPFLEIGNTEFIISLLRKNLGVSFL PEFTIHQDIINGELAVLDVRDFYMRTWRQIVYHKDKWVTREMDAFLRLAT >gi|157101630|gb|DS480694.1| GENE 383 409007 - 410284 1550 425 aa, chain + ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 # Protein_GI_number: 19704482 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 20 422 1 409 411 337 45.0 3e-92 MEKTNVKYGAYVQILKEELLPAMGCTEPIALAYAAATARKVLGELPDRVVVGASGSIIKN VKSVIVPNTNHLKGIPAAAAAGIVAGDSDKELEVIAEVTPDETRQMKEFLDDTEIRVEHV DNGITFDIIVTLYKGDSYAKVRIANYHTNIVLIEKNGEILKQVEVAGEEEEGLTDRSLLN MEDIWDFINTVDIEDIREVLDRQIKYNWAIAQEGLKGDYGANIGSVLLEMSGNDVRTRAK AMAAAGSDARMNGCELPVIINSGSGNQGITASVPVLVYAEEFKVDEDKKYRALALSNLTA IHQKTPIGRLSAYCGAVSAGAGAGAGIAYLCGGGFKEVVHTVVNALAIVSGIVCDGAKAS CAAKIASAVDAGILGYNMYIRGQQFYGGDGIVTKGVEATLKNVGRLGKEGMKETNEEIIK IMIGE >gi|157101630|gb|DS480694.1| GENE 384 410287 - 411480 1275 397 aa, chain + ## HITS:1 COG:FN1148 KEGG:ns NR:ns ## COG: FN1148 COG1301 # Protein_GI_number: 19704483 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Fusobacterium nucleatum # 7 385 9 386 390 394 61.0 1e-109 MKKITGSLPFRLLLGVILGIIVGQLAGSGLMHVVVTIKYILNQVIVFCVPLIIIGFIAPS ITRLGNNASKMLGVAIIIAYVSSIGAALFSMTAGFLMIPHLSIATEVEGLKDLPAIVFQL DIPQIMPVMSALVFSLLLGLAATWTRAAVVTQVLEEFQQIVLSIVTKVVIPILPVFIAFT FCALSYEGTITKQLPVFLQVVIIVMIGHYIWLALLYAIGGVYSGKNPFDVIKNYGPAYIT AVGTMSSAATLAVALRCAKKSQPPLRDDMVDFGIPLFANIHLCGSVLTEVFFVMTISKIL 
YGTIPSLGTMILFCALLGVFAIGAPGVPGGTVMASLGLITGVLGFDETGTALMLTIFALQ DSFGTACNVTGDGALTMILTGFAEKHGIKRQKIESGL >gi|157101630|gb|DS480694.1| GENE 385 411661 - 413091 1629 476 aa, chain + ## HITS:1 COG:RSc2119 KEGG:ns NR:ns ## COG: RSc2119 COG0402 # Protein_GI_number: 17546838 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Ralstonia solanacearum # 2 453 9 444 474 241 33.0 2e-63 MDKKILIKNARAIVSCDGEDRVYRDADMLIEGPRILAIGRDLMNQRDPVSGREWKQEEVQ VIQAEGKFVYPGLINTHHHFFQTFVRNLITIDYPNMMVMDWIDKIYRIFQNIDSDVIYYS TLTSFADLIKHGCTCAFDHQYCYTRKTGKSPVDRQMEAAELLGIRYHAGRGTNTLPRSEG SSIPDNMLETTDEFLKDCDRLIGLYHDPRPFSMKQIVMAPCQPINCRRDTFAETVAMARE KGVRMHTHLGEGENEGMIARWGKRTMDWCEEMDFIGEDVWYAHDWEVTKEEYKVLAATGT GVSHCPAPAVLGGFPILNIKEMQEAGILVSLGCDGSATNDSSNLLDALRMAYLMQAYHTK ERGGCTSAYDMLKVATVNGAKTLGRTDLGTLEAGKAADLFMIDTETLELAGALHDPKNLL ARVGLTGPVWMTMINGNIVYKDGILKGVDERKLAMDGEAVCTRVIREPHSAYRDFI >gi|157101630|gb|DS480694.1| GENE 386 413129 - 413884 849 251 aa, chain + ## HITS:1 COG:BS_ydfL_1 KEGG:ns NR:ns ## COG: BS_ydfL_1 COG0789 # Protein_GI_number: 16077613 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 1 117 1 117 117 77 34.0 3e-14 MKEYFSIGELAELFGLNIQTLYYYDSVGIFSPRERNEKNGRRKYEFDQIYELATVSYMRR LGYSLEEIKESRTSLNSNRAVDIMKQRSQELRRQWQELLNIDEAIQRKIRFMEQEMDGIC LDEIAVKHFEDRKFISIGDEELLYRENSFYFYPTIAFYEGDRKYFGAYLYPDDEELPKTI PQSRIQIIPEGDFLCGYHLGGYEGVPDTIRRLREARPDLELGAQAINFNILDQFVESDNE NFITAMQIRVL >gi|157101630|gb|DS480694.1| GENE 387 413957 - 415075 1529 372 aa, chain + ## HITS:1 COG:alr5361 KEGG:ns NR:ns ## COG: alr5361 COG1744 # Protein_GI_number: 17232853 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Nostoc sp. 
PCC 7120 # 67 325 37 295 348 154 33.0 3e-37 MKNRMRLAAAALALSMAGFGLTGCSGQAKETEAKTEAADKQAGSAQTGDTQEKGDQADDG QAKEGQTADGKTYKVALCLSGAANDMGWCQSAYDGLKLLEADYGCEVTYTENLTPDDIEA AFADYAASGYDVVIGHGYEFGDPAVDVAEQYPDTKFIVTEGEVSADNVASYVSKCEEGGY IMGMLAAGMSESGKVGFIGPIQGASLVKIMNGFEDGAKEVNPDIQVQTAWTGSFTDTALG KEAAQAMIDNGADVIGHCANESGTGAINAAKEAKVYATGDSYDQNDLAPDTILSSSVYHI PHVIEVAFKTVADGTFEGGIYQLGMADGAVSVAPYHNLDSAVPDELKQKISDKIAAIESG EFEVPCDTKPRA >gi|157101630|gb|DS480694.1| GENE 388 415166 - 416695 1654 509 aa, chain + ## HITS:1 COG:CAC0703 KEGG:ns NR:ns ## COG: CAC0703 COG3845 # Protein_GI_number: 15893991 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Clostridium acetobutylicum # 3 501 4 499 514 426 44.0 1e-119 MPLLEMKHITKSFAGVYANEDVDLSVEQGEIHALLGENGAGKTTLMNILFGIYQADKGEI CFKGEHVEFKSPGEAIARGIGMVHQHFSLVKKMTVLDNVMLGLKGQGIVADRKLAREKLT GLAATYGLLVNPEAPVHTLSVGEQQRVEILKALYRDVSLLILDEPTGVLTPQETENFFGV LKRLRDEGYGIIIITHRLGEIMDISDRVTVLRDGRKVSSLVTEETTPEELSACMIGRELK AGREIRESAAGDTALSLKDVCLHKKHSTKMSLDHVSLTIRRGEILGIAGVEGNGQKELAE VITGIRRINSGQMEYQGKDIRRDSVKERFRAGISYISDDRHGDSLVMDMTVEDNLILRDF DRQPFSRNMVLDYRQAEKRAREALDEYSIKTSGKSGSKTPVKLMSGGNQQKVILSREISE EAGLIVASQPTRGLDIGATRFVHETLLRQRDAGRCVLLISADLDEILGVSDRVAVMYEGR IIGILNRDEADVHKIGLLMGGIAKEAADE >gi|157101630|gb|DS480694.1| GENE 389 416688 - 417716 1191 342 aa, chain + ## HITS:1 COG:AF0888 KEGG:ns NR:ns ## COG: AF0888 COG4603 # Protein_GI_number: 11498493 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Archaeoglobus fulgidus # 16 335 11 331 332 190 41.0 3e-48 MNRKSKIVEIAGIGLIAAAVWTVLIMLSDASPAEAYAMFFKGIFGNLNGFMEVFVKATPL IFTGLGCAVAFRTGFFNIGAEGQFYIGALASAWAALTWTGVPGGLRILCAILMGFVFGGL WALIAAMFKAKLGISEIIVTIMLNYIAINFLGIAVRTFLMDPAGSIPQSPKLDPSVTLYR FLKPTRLHAGFIIAVLMVFLVWFILEKTTVGYEIKVVGFNKRAAACNGISVVRNIIISAF LSGGLAGIAGAVEVMGVQRKLLEGISGECGYTAVLIALLASNHPVGVLFAAIGFAALEVG 
ANSMQRQMGVPSAIVNILVGLIVLLILGRELFNRKRKGEKPC >gi|157101630|gb|DS480694.1| GENE 390 417710 - 418651 1143 313 aa, chain + ## HITS:1 COG:alr5368 KEGG:ns NR:ns ## COG: alr5368 COG1079 # Protein_GI_number: 17232860 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Nostoc sp. PCC 7120 # 8 308 4 304 312 238 45.0 1e-62 MLEQIMTVGFVTAFLTASVRMAVPLIYAGLGETISEKAGILNIGMEGVMLGGAFFSFAGT FYSGSLAVGLLCGMAGGAMVSAIHGLLSIRLAQDQSVSGIAINIFMAGVTSFLYKLMSSG QSYQQIEPFAKLKIPFLGDIPLIGNALFNQDVLTYGVYVLVILVTLFYSRTSKGMAFAAI GENPRAADSAGIPVHRYQWAAVLLNGMLGGLGGAYLVLVQVGNFSENMTSGRGYIALAAV ILGRYVPFGMMGAAFIFGAANALQIRLQAIGVPLPTQALAMLPYIITLVALLGAIGKNSA PEALGKPYIRGAR >gi|157101630|gb|DS480694.1| GENE 391 418954 - 419205 283 83 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0233 NR:ns ## KEGG: EUBREC_0233 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 80 1 80 81 111 72.0 9e-24 METRIAVIGIIVENKDSVESVNELLHQYSQYIIGRMGLPYAKKKVNIISVVVDAPQDIIS ALSGKLGRLQGINSKALYSNIGS >gi|157101630|gb|DS480694.1| GENE 392 419296 - 420717 1923 473 aa, chain + ## HITS:1 COG:CAC1356 KEGG:ns NR:ns ## COG: CAC1356 COG1060 # Protein_GI_number: 15894635 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 3 473 2 472 472 713 74.0 0 MNYDPKSSHAEEFIHHGEIEETLAYGEANRTNRELIDEILAKARERRGLTHREAMVLLDC GLEDKNQEIYELAEQIKKDFYGNRIVLFAPLYLSNYCINGCVYCPYHAKNKHIPRKKLTQ DEIRREVMALQDMGHKRLALETGEDPVHNPIEYMLESIDTIYSIKHKNGAIRRVNVNIAA TTVENYRKLKDAGIGTYILFQETYHKESYEKLHPTGPKHDYCYHTEAMDRAQLGGIDDVG CGVLFGLELYRYEFAGLLMHAEHLEAVFGVGPHTISVPRIRRADDIDPDSFDNGIDDDTF AKLVACIRIAVPYTGMIVSTRESQKTRERVLHLGISQISGGSKTSVGGYFEPEPEEECSA QFDVSDNRTLDQVVNWLMGMGYIPSFCTACYREGRTGDRFMSLCKSGQIQNCCHPNALMT LKEYLMDYSSEETRRIGEALIEKEIGSIPKEKVQQIVRDNLAKIEQGIRDFRF >gi|157101630|gb|DS480694.1| GENE 393 420931 - 422166 1158 
411 aa, chain + ## HITS:1 COG:CAC1651 KEGG:ns NR:ns ## COG: CAC1651 COG1160 # Protein_GI_number: 15894928 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 3 399 4 399 411 452 56.0 1e-127 MGMNNTPASDRIHIGFFGRRNAGKSSVLNAVTGQDLAVVSDVKGTTTDPVYKAMELLPLG PVMMIDTPGIDDEGQLGSLRVRKSYQVLNKTDVAVLVMDACHGPAPEDYTLLKKIGEKQI PCVAVCNKADVVGAAGGRPEGIPESVPVLSVSAHTGQGITALKECLANLGKTQETSRKIV GDLLSPSDLAVLVVPIDKAAPKGRLILPQQQTIRDILEADAVSVVVRENGLRDTLASLGK KPRMVITDSQVFAKVSADTPEDILLTSFSILFARYKGSLEPMVRGVRALDRIGKGDRILI CEGCTHHRQCEDIGTVKLPGWIREYTGTEPEFVFTSGTEFPDDLAGFRLIVHCGGCMLNE REMKYRMKCAHDQGIPITNYGITIAYVQGILRRSVEPFPEIAALLCGSREG >gi|157101630|gb|DS480694.1| GENE 394 422475 - 423632 1259 385 aa, chain + ## HITS:1 COG:BS_nagA KEGG:ns NR:ns ## COG: BS_nagA COG1820 # Protein_GI_number: 16080554 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Bacillus subtilis # 27 381 26 384 396 241 41.0 1e-63 MEGEEIMRIQGKRVWIAGQFMAAQMEIEDGVISRIHEYGAGPADEDYGDKRIVPGFIDIH THGAYDFDTNDGQPEGLRNWMKRIPEEGVTAILPTTVTQMPDVLTRAVANVAAVVKEGYE GAEILGIHFEGPFLDMEYKGAQPPEAIASASVEQFKKYQEAAEGLIKYITLAPERDPDHA LTRYCSTHGVVVSMGHSAATYEQALLGIANGALSMTHVYNGMTPYHHRKPGLVGTAFRVR DIYGEIICDGCHSHLAALNNFFTAKGRDYSIMISDSLRAKHCPPGGQYQLGGHDIEIGED GLARLKGTETIAGSTLNMNRGLKILVEQALVPFDAALNSCTINPARCLGVDHRKGRIAAG CDGDLVVLDDDYSVLQTWCRGRAML >gi|157101630|gb|DS480694.1| GENE 395 423920 - 425932 1786 670 aa, chain + ## HITS:1 COG:AGc3943 KEGG:ns NR:ns ## COG: AGc3943 COG0840 # Protein_GI_number: 15889452 # Func_class: N Cell motility; T Signal transduction mechanisms # Function: Methyl-accepting chemotaxis protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 136 665 142 702 717 238 33.0 2e-62 MKIGIRIKMFLGIMVPALMVFLLSGVLISVSVGKYAKVQAREKLTSDSESAAHEVDTFFT GHLKAVEQSAQEALVIDFMHELRGSQRASAAPRYDEVMAAMAVRQALDSSTILGVWLADT DSSQLFQPDGFISDPGWDVTSRPWFECTKTKKPVMTEPYIDVNTGLPIVTAAAPVIDRQS 
GEVYGASGTDINLDTLQQRINEVKLGQTGFLVLMSGSGNILCYPDTEQVGKSISEINISD EVKQAVQNGLTGDYIYEIDGTRYYGNLTRIDSAGWYVLSAMPETEALQGFYTVMRMSAVT FAGSMVLMAAAVLLISNGITRPMKKLADAAHRIADGELNVETGVHSKDEIGQVASAMERT VSKLKDYIEYINETTRVLDEIADGNLQFELELQYEGDFAKIRDALMRIRTTLNDTLGSIL RTAEQVTASSGQVADSTSSLADGNARQASSVEELAATINEIFTHVDSNARRTEEVSGRSA EMDHRVALSNTQMQELVKAVRDISNKSGEIGKIIKTIEDIAFQTNILALNAAVEAARAGE AGKGFAVVADEVRNLANKSGEAAKNTTSLIEESLRSIDNGQKIANETAQSLGQVVTSAQQ IAEAVEDISKASTEQAKSLDQVRIGIEQISGVVQTNAAMVEENAATGGELSEEAKKLFDL ISRFRTDRKL >gi|157101630|gb|DS480694.1| GENE 396 426098 - 426565 532 155 aa, chain + ## HITS:1 COG:lin1358 KEGG:ns NR:ns ## COG: lin1358 COG0779 # Protein_GI_number: 16800426 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 10 155 7 155 155 113 40.0 1e-25 MGKKESYESRVEKHLLPLMEEHGFELVDVEYVKEAGTWYLRAYIDKEGGIAVDDCEVISR ALSDWLDKEDFIDDSYILEVSSPGLGRPLKKERDFVRSMGKDVDVRLYRQLNKQKEFTGA LSAYDENTVTLTMEDGSQMVFEKADIALIRLALDF >gi|157101630|gb|DS480694.1| GENE 397 426607 - 427773 1580 388 aa, chain + ## HITS:1 COG:BH2416 KEGG:ns NR:ns ## COG: BH2416 COG0195 # Protein_GI_number: 15614979 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Bacillus halodurans # 1 384 1 382 382 358 52.0 1e-98 MNKELLEAMEVLEKEKNISKDTLIEAIENSLLTACKNHFGKADNVKVTVNPNTCDFAVYA EKAVVENVEDDCLEMSLADAKMLNPKYEIGDTVQIPLDSKKFGRIATQNAKNVILQKIRE EERKALYNEYYMKEKDVMTGVVQRYLGRNVSINLGRVDAILNESEQVKGETFRPTERVKV YVIEVKDTPKGPRVSVSRTHPDLVKRLFESEVAEVRDGTVEIKAIAREAGSRTKIAVKSN DANVDPVGACVGLNGSRVNSIVSELKGEKIDIINWDDNPAYLIENALSPAKVICVVADEE EREAQVIVPDYQLSLAIGKEGQNARLAARLTGFKIDIKSETQAREMGLFEQMGLQYGDTS SENYEEEQEEPESYQEYEENYQEDGSEQ >gi|157101630|gb|DS480694.1| GENE 398 427808 - 428080 252 90 aa, chain + ## HITS:1 COG:CAC1800 KEGG:ns NR:ns ## COG: CAC1800 COG2740 # Protein_GI_number: 15895076 # Func_class: K Transcription # Function: Predicted nucleic-acid-binding protein implicated in transcription termination # Organism: Clostridium acetobutylicum # 1 
87 1 87 88 75 51.0 2e-14 MSMKKVPLRQCIGCQEMKSKKEMIRVIKTAEDEIMLDATGRKNGRGAYLCPSMECLKKAV KGKGLERSFKMAIPKEVYETLEKEMEELGR >gi|157101630|gb|DS480694.1| GENE 399 428070 - 428405 431 111 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239626258|ref|ZP_04669289.1| ribosomal protein L7Ae/L30e/S12e/Gadd45 [Clostridiales bacterium 1_7_47_FAA] # 1 94 1 94 105 170 89 7e-43 MADNKRLLSLVGLATRARKVVSGEFSTEKSVKSGKSHLVIVSEEASDNTKKKFTNMCTYY KVPIYLFGTKDELGHAMGQEFRASLSVEDAGFAKSMVERMNINGGSLNESK >gi|157101630|gb|DS480694.1| GENE 400 428342 - 428386 61 14 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239626258|ref|ZP_04669289.1| ribosomal protein L7Ae/L30e/S12e/Gadd45 [Clostridiales bacterium 1_7_47_FAA] # 1 14 92 105 105 28 85 7e-43 FRKVNGRTNEYKRR >gi|157101630|gb|DS480694.1| GENE 401 428392 - 431700 3502 1102 aa, chain + ## HITS:1 COG:CAC1802 KEGG:ns NR:ns ## COG: CAC1802 COG0532 # Protein_GI_number: 15895078 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Clostridium acetobutylicum # 523 1100 107 689 693 667 59.0 0 MRVSELANELGKTSKEILEIIKTKDKAAVLYAASNVTKDQENIVRSFANPKRDAKPAAPE HRSEAPKTEAPRAEAPRTEAPRTEAPKTEAPKAEAVRSQDGASKGEPAKKKLTAVFRPQN AQQQIKRPVPPKSQTANQGHTAPAPAAAPQPAQAAASAVPAEAAKPQVQAQSADTVSEKA AAPQPSAAPQAGAHTGTQGTAQDSRDTARSQGGYQGSRDNNRGQGGYQGNRDNNRGQGGY QGSQGGYQGNRDNNRGQGGYQGNRDNNRSQGGYQGNRDNNRGQGGYQGGQGGYQGNRDNN RSQGGYQGGYQGNRDNNRPQGGGYQGNRPQGGYQGGQGGYQGNRDNNRPQGGGYQGNRPQ GGYQGQGGYQGNRDNRPQGGGYQGNRPQGGYQGGQGGYQGNRDNNRPQGGGYQGNRPQGG YGARPQGGYQGRPGDDKDARDNRGRNDSRRPAGARDGKSSFDTPIQTKPTSNRQNKNSHK NDRFDKRDRLEDGKKKVPKTGKGAFIMPQPKKEESKADEVKTITLPDVLTIKELAEKMKL QPSVIVKKLFLKGQVVTLNQEIDYEQAEEIAMEFDVLCEREVKVDVIAELLKEDEESTED MVPRSPVVCVMGHVDHGKTSLLDAIRETNVTAREAGGITQHIGASVIEINDRKITFLDTP GHEAFTAMRMRGAQSTDIAILVVAADDGVMPQTVEAINHAKAANVEIIVAINKIDKPSAN VDKVKQELAEYELIPEDWGGSTIFVPVSAHTKEGIKDLLEMVLLTADVLELKANPNRKGR GLVIEAELDKGKGPVATVLVQKGTLRVGETIAAGACFGKIRAMMDDRGRRVKEAGPSTPV 
EILGLNDVPNAGEVFVATENEKEARNFAETFISEGKSKLIEDTKAKLSLDDLFSQIQAGN VKELPIIVKADVQGSVEAVKQSLTKLSNEEVMVKVIHGGVGAINESDVSLASASNAIIIG FNIRPDATAKSIAEREKVDIRLYKVIYQAIEDVEAAMKGMLDPVFEEKVIGHAVIRQTFK ASGIGTIAGSYVLDGKFQRNCSCRVKREGEQIFEGPLASLKRFKDDVKEVAAGYECGLVF EKFNDLQEDDEIEAYIMVEVPR >gi|157101630|gb|DS480694.1| GENE 402 431726 - 432127 516 133 aa, chain + ## HITS:1 COG:BH2411 KEGG:ns NR:ns ## COG: BH2411 COG0858 # Protein_GI_number: 15614974 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Bacillus halodurans # 4 117 2 114 116 90 41.0 1e-18 MRKNSIKNTRINMEVQRELSEIIRGGIKDPRIHPMTSVVSVEVTPDLKFCKAYISVLGDE EAGKSTIQGLKSAEGYVRRELARRVNLRNTPELKFILDQSIEYGVNMSRLIDEVTKDLHQ EETQDEEGAKEEF >gi|157101630|gb|DS480694.1| GENE 403 432127 - 433095 222 322 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B [Streptococcus pneumoniae SP18-BS74] # 1 294 1 291 311 90 28 1e-16 MTILEQMLEDTKSVAILGHIRPDGDCLGSTLGLYNYLRLNYPDIRAAVYLEESSPKFSYL KGYDTIRHQVDGECYELCVCLDSGDIQRLGDFKHYLDQADKSLCLDHHVTNTRYGGTNVV PDEASSTCEVLFDQLDEGKLDRYTAECLYTGIIHDTGVFKFSCTSAHTMEIAGKLMGKGI DFGAIIDNSFYRKTYIQNQILGRALLESITFFDGRCIFSAIRQSEMEFYGVDGKDMDGII DQLRLTEGVEVAIFLYETGPQEFKVSMRSQYLVDVSRIAAFFGGGGHVRAAGCSMSGSIH DVINNLSVHIARQLDAASGSQE >gi|157101630|gb|DS480694.1| GENE 404 433106 - 434032 755 308 aa, chain + ## HITS:1 COG:CAC1805 KEGG:ns NR:ns ## COG: CAC1805 COG0130 # Protein_GI_number: 15895081 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Clostridium acetobutylicum # 4 290 3 274 289 218 42.0 2e-56 MYHGIINVYKEPGYTSHDVVARLRGILKQKKIGHTGTLDPAAEGVLPVCLGAGTRLCDML TDRSKTYQAVMLLGCETDTQDTTGTILAEDREGALSLAEEQVNEAVMGFKGDYAQIPPMY SALKVDGKKLYELARAGKEVERRPRAVQILDIKVSRIELPRVWMEVTCSKGTYIRTLCHD IGRLLGCGGCMEHLTRTRVDRFSIEDSLTLDQLEQLRDQAAVESCILPVEEALQSYPPLG CLPEADSLLHNGNPCFARHLDWSQSDQWKAGAEDGQMFRMYDSNHRFTGVYQYQKDRHWW KPWKMFLM >gi|157101630|gb|DS480694.1| GENE 405 434092 - 435042 345 
316 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 18 308 20 314 317 137 31 6e-31 MRYIADTTEFQIEEPTIVTLGKFDGRHRGHQKLLRTMMELKEATGYPTAIFTFGTAPGTA VTGKPQTVITTNLERRANMEKVGIDYLVEYPFTEETRHMEPEAFVKDILAGRMHAREIVV GPDCSFGYKGAGSAELLSAMARDLGYHLHVIEKEKDHRRDISSTYIREELEKGNVEKANQ LLGQPYAIHGEVVHGNHIGGSLLGFPTANILPPPIKRLPRYGVYVSRVLVDGVYYKGVTN IGKKPTVGGEYPAGVETYIFGLEGDIYGKNIEVQLLAFDRPEQKFTSFEELKERIEKDKE FANAYYENHPEMIAQG >gi|157101630|gb|DS480694.1| GENE 406 435222 - 435488 416 88 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227872009|ref|ZP_03990394.1| ribosomal protein S15 [Oribacterium sinus F0268] # 1 88 1 88 88 164 88 4e-39 MISKEKKAQIIAEYGRKPGDTGSPEVQIAILTERITELTEHLKQNPKDHHSRRGLLMMVG QRRGLLDYLKKTDLEGYRALIEKLGIRK >gi|157101630|gb|DS480694.1| GENE 407 435727 - 437850 1393 707 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase [Brucella abortus bv. 1 str. 9-941] # 1 707 1 708 714 541 42 1e-152 MFKSFSMDLAGRKLTVEVGRVAAQASGAAFMHYGDTVVLSTATASSKPREGIDFFPLSVE YEEKLYSVGKIPGGFNKREGKASENAVLTCRVIDRPMRPLFPKDYRNDVTLDNIVMCVDP NCSPELTAMLGSAIATSISDIPFAGPCATTQVGLIDGEFIINPTADQKQVSDMALTVAST REKVIMIEAGANEVPEQIMIDAIFRAHEVNQEIIKFIDTIVAECGKEKHTYESCAVPEEL FAAIREIVTPQEMEEAVFTDDKQTREKNIAEITAKLEEAFAGNEEWLAVLGEAVYQYQKK TVRKMILKDHKRPDGRAITEIRHLAAEVDLLPRVHGSGMFTRGQTQILNACTLAPLSEAQ KLDGLDENETSKRYMHHYNFPSFSVGETKPSRGPGRREIGHGALAERALIPVLPSAEEFP YAIRTVSETMESNGSTSQASICASTLSLMAAGVPLKEMVAGISCGLVTGDTDDDYLVLTD IQGLEDFFGDMDFKVGGTKNGITAIQMDIKIHGLTRPIIEEAIARTREARMYILDNVMRP VISEPRKHLSPYAPKIKQITIDPQKIGDVVGKQGKVINKIIEETGVKIDITDDGAVNVCG TDEAMIDKAIQIITGIVTDIEAGMIFNGKVVRIMNFGAFVELAPNKDGMIHISKLSDKRV GKVEDVVNIGDEVTVKVTEVDKMGRINLTMRPSDLAENKTAARTEEK >gi|157101630|gb|DS480694.1| GENE 408 437891 - 438595 1041 234 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940475|ref|ZP_02087820.1| ## NR: gi|160940475|ref|ZP_02087820.1| hypothetical protein CLOBOL_05365 [Clostridium bolteae ATCC BAA-613] 
hypothetical protein CLOBOL_05365 [Clostridium bolteae ATCC BAA-613] # 1 234 1 234 234 372 100.0 1e-101 MRNMKRNFAVLAATVMAVCSIALTGCQQKLEPADQVVGALYELSVKENATPMKDLLGFAS EEDVKTALMDDSAETDLVSMFKSEFEAAGIEFSDEEVEEMSNALQGLIDKLDYSTEITDQ SKDETTVLLKVKSYSMTDMQNIMVDVMTDMQNNIDEDTAAAIMAGDEDALQKLMQDAVKQ YMTKIGEMEPSEEMTELTIKCQRVKVDVSGKEKIAWMPQDLSKFSDEVNNATFK >gi|157101630|gb|DS480694.1| GENE 409 438705 - 440156 1495 483 aa, chain + ## HITS:1 COG:CAC0990 KEGG:ns NR:ns ## COG: CAC0990 COG0008 # Protein_GI_number: 15894277 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 2 482 4 484 485 652 63.0 0 MSRVRTRYAPSPTGRMHVGNLRTALYAYLIAKHEGGDFLLRIEDTDQERFVEGAVEIIYQ TLKDTGLIHDEGPDKDGGCGPYVQSERQASGIYLEYAKKLVEKGEAYYCFCTPERLASLK TTVNGEEIMTYDKHCLHLSKEEVEANLAAGMPYVIRQNNPATGTTTFQDEIYGDITVDNS ELDDMVLIKSDGYPTYNFANVVDDHLMGITHVVRGNEYLSSSPKYNRLYAAFGWDVPVYV HCPLITDESHHKLSKRSGHSSFEDLLEQGFISQAVVNYVALLGWSPEDNREIFSLEEMVK EFDYHRMSKSPAVFDMTKLKWMNGEYMKAMDFDTFYGMAEPYLKAAVSRDLDLRKIAAMV KTRIEVFPDIADHVDFFEALPEYDTAMYTHKKMKTNAETSLKVLTDVLPLLEAQDDFTND ALYGVLSKYVEDTGVKTGFVMWPIRTAVSGKQMTPAGATEIMEVLGKEESLARIRKGIQM LEA >gi|157101630|gb|DS480694.1| GENE 410 440168 - 441994 1878 608 aa, chain + ## HITS:1 COG:BS_yjcD KEGG:ns NR:ns ## COG: BS_yjcD COG0210 # Protein_GI_number: 16078247 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Bacillus subtilis # 1 602 134 745 759 338 34.0 2e-92 MAFNDAQKTAIRHREGPMLVLAGPGSGKTTVITNRVRYLTEKAGVDPSHILVITFTRAAA REMKERYEQIAQAGCSRVSFGTFHSVFFLILKLAYRYKAADIVREEQRIQYMKEMLVKCD LEVEDEGEFISSVLSEISMVKGELMDMDHYYAKNCSEDMFRQLYRGYEARLRQNRLLDFD DMLVMCYELFKERKDILSAWQDKYRYILIDEFQDINRIQYEIVKMLALPGNNLFIVGDDD QSIYRFRGAKPELMLGFERDYPDAKKILLDTNYRCSRQIVEAAGRVISHNRTRFPKEIKA ARGNGHPVIIKAWQEPMDETLGIVTEIRDYASMGISYNDMAVLYRTNVGPRLLISKLMEY NIPFYMRDAVPNLYEHWIAGNVISYIRAALGDLSRSNVLQIINRPKRYVSRDALEGQQVS 
WESVKSFYQDKNWMMDRIEQLEYDLAMLRNMAPAAAVNYIRKAVEYDEYIREYAQSRRMK PEELFEVLDQLQESAAGFKTYEHWFIHMEEYKEQLKKQAADRDAQRQGVSLMTMHSAKGL EFRVVYILDANEGVTPHHKAVLDPDVEEERRMFYVAMTRAKERLHIYHVKERYRKKQAIS RFAEEAEG >gi|157101630|gb|DS480694.1| GENE 411 441991 - 442533 494 180 aa, chain + ## HITS:1 COG:VNG1574G KEGG:ns NR:ns ## COG: VNG1574G COG2109 # Protein_GI_number: 15790547 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Halobacterium sp. NRC-1 # 3 180 31 225 225 104 38.0 7e-23 MNETGLVQIYCGDGKGKTTAAIGAAVRAAGRGYRVLVARLLKTDDSGEVNGLVHIPGITV LPCDQNFGFSWNMTQSQRKAAALYYDHCLLTAWNMALGAGGEEPYDMLVLDEAIGACNLG FVDESGLIKALKEKPASLEVILTGRCPSEALQEQADYITEMVMRRHPYERGIGAREGIEY >gi|157101630|gb|DS480694.1| GENE 412 442627 - 442950 620 107 aa, chain + ## HITS:1 COG:no KEGG:Closa_1312 NR:ns ## KEGG: Closa_1312 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 106 8 110 114 95 74.0 8e-19 MADEKELNEEEQEMTVTLTLDDDTEVECVVLTVFNAGEHEYIALLPMEGADSEEGEVYLY RYSETEDGTPNLDNIEDDDEYEIVAEAFDELLDAQEYDELVGEDEEE >gi|157101630|gb|DS480694.1| GENE 413 443186 - 443866 945 226 aa, chain + ## HITS:1 COG:CAC2435 KEGG:ns NR:ns ## COG: CAC2435 COG0745 # Protein_GI_number: 15895700 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 6 225 5 224 224 228 51.0 6e-60 MDTLKILVVDDEARMRKLVKDFLTNKGFAVIEAGDGEEAVDVFFAQKDIALVLLDVMMPK MDGWEVLRTIRKYSQVPVIMLTARSEERDELQGFSLGVDEYISKPFSPKILVARVEAILR RTNVASTDSVNVGGICIDKAAHQVTIDGQEIDLSFKEFELLTYFVENQGIALSREKILNN VWNYDYFGDARTIDTHVKKLRSKMGEKGEYIKTIWGMGYKFEVSEG >gi|157101630|gb|DS480694.1| GENE 414 443863 - 445386 1854 507 aa, chain + ## HITS:1 COG:CAC2434 KEGG:ns NR:ns ## COG: CAC2434 COG0642 # Protein_GI_number: 15895699 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: 
Clostridium acetobutylicum # 1 503 3 489 492 252 32.0 1e-66 MKHSIRVKFTMIFIGLMAAVLVSMWAVNSFFLESFYTKQKLQLLENAYENLNAIAVDKEF LGENITEDLQSLYSNDGEQTEASRLLRLMNDKYNTTIVIQDSITGELIPLAKDGKFLADA LHRYILGIIDPRTETLISQDDHQVQKIYDRRSQGYYLQSWGFFDDNRTMFIMSMPLASIH DSVSISNEFLAYVGLAALVLGSALMYYATKKVVSPIRSLAALSARMSELDFEARYTGDSE DEIGVLGHSMNTLSERLKDTIGELKTANNELQKDIEEKIKIDETRKEFIANVSHELKTPI ALIQGYAEGLTEGMAEDEDSRNYYCEVIMDEAGKMNKMVKQLLTLTALEFGNDMPVMERF DITALIRGILASAGILLQQKEARVVFEQKEPVWVWADEFKIEEVITNYLNNAMNHLDGER QITISIFREGDQVRITVFNTGQHIPEEDLDNLWTKFYKVDKARTREYGGSGIGLSIVKAI MDSHNKSCGVENVDGGVEFWFTVDSGV >gi|157101630|gb|DS480694.1| GENE 415 445554 - 446867 1252 437 aa, chain + ## HITS:1 COG:no KEGG:SpiBuddy_2811 NR:ns ## KEGG: SpiBuddy_2811 # Name: not_defined # Def: hypothetical protein # Organism: Spirochaeta_Buddy # Pathway: not_defined # 61 437 27 405 405 541 69.0 1e-152 MRKKLIGVMLAAAMVTGLAGCGSSSGTGTSAAAADTQATEAQKEEAKDDAKKEDEKEDSE ASDTEGASDFHIGIVTGSVSQSEDDRRGAEAFQAMYGEDMVKLAIYPDNFTEELETTIQT IVNLSDDPKMKAIIMNQSVPGTTEAFRKIKETRPDILCIAGEGHEDLPEIGSAADLVCNN DFVARGYLIIRTAHELGCDTFVHISFPRHMAYETMSRRVAVMKEACKEFGMEFVLETAPD PTSDVGVAGAQAYILEKVPEWVEKYGENAAYFCTNDAHTEPLLKQLMEYGGYFIEADLPS PLMGYPGALGIDLTAEAGDFDKILAKVESTIVEKGGADHFGTWAYSYGYTTSAGLAQHAL NVLNGESELDDIDDIAKAYQVFSPKAEWNGSNYTNVETGVKLDNVFLVYQDTYIMGDPGH FMGSTKIEVPEKYFTIK >gi|157101630|gb|DS480694.1| GENE 416 446950 - 448536 193 528 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 291 514 36 248 329 79 26 3e-13 MNNNTPLLEMRNIKKDFFGNQVLTDINFTLNAGEVLGLVGENGAGKSTLMKLLFGMDVIR ETGGYGGDVLINGEKVSFSTPFDALEAGIGMVHQEFSLIPGFTATDNILLNREPKKKNII SEIFGDRLDTLDYKEMAERSEQAIDKMGVKIDAKMVISDMPVGHKQFTEIARELSKDNLK LLILDEPTAVLTEKEAEALLDSIRGMAAKGIAVIFITHRLHEILTVCDKVVIMRDGYVVK DVPAKETDVADITKWMVGRNIQTTARADIRINPDAKNIMSIRNLWVDMSGETVRNVDLDI KEGEILGIGGLAGQGKLGIPNGIMGLCEAGGKVEFDGKPIPLNNPRKCLDASLAFVSEDR RDVGLLLDETLEWNIAFSAMQIHDKFLKKYLGGMITWRDEKAIHEVTQKYIDELKIKCTS 
SKQKAKELSGGNQQKVCLAKAFALEPKFLFVSEPTRGIDVGAKSLVLDALKQFNKEHGVT VVMISSELEELRTTCDRIAIVSGGKIAGILPATESSEEFGMLMVSQVK >gi|157101630|gb|DS480694.1| GENE 417 448550 - 449629 863 359 aa, chain + ## HITS:1 COG:FN1897 KEGG:ns NR:ns ## COG: FN1897 COG1172 # Protein_GI_number: 19705202 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Fusobacterium nucleatum # 20 359 1 338 339 363 59.0 1e-100 MGKIQDSTINPGKKTKIRELIDNFGLPRLIIAGFLLALFILAPIVGADLPTQITNTINRF SWNAVLVLAMVPMVHSGCGLNFGLPLGIISGLLGATMSIQFGFTGPVSFFMAILIATPFA LIFGGGYGWLLNKIKGGEMMIATYVGFSSVSFMCMMWLLLPYSSPTMVWGLSGKGLRTTI SLEGFYDKALANFLQINIGKISIPTGSLLFFAFLAFLMWAFLHTKTGTAMTAVGSNPNFA RASGINIDKTRMLSVIMSTWLAAVGILMYEQGFGFIQLYMAPFYMALPAVSAILIGGASV NKASITNVIIGTFLFQGIVTMTPTVMNSMIHMDMSEVIRIVVSNGMILYALTRKTEGSK >gi|157101630|gb|DS480694.1| GENE 418 449626 - 450753 1141 375 aa, chain + ## HITS:1 COG:FN1896 KEGG:ns NR:ns ## COG: FN1896 COG1172 # Protein_GI_number: 19705201 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Fusobacterium nucleatum # 29 351 1 323 340 347 59.0 3e-95 MSKKQKNNGSNSILELVRNNSVPLMFIIICAVCIPISGFSPGYLLNEIVTRLGRNAFLIL SLLIPIMAGMGLNFGMTLGAMAGEIGLIFVADWQIWGIPGIILAMIISIPISILLGLFCG KMLNMAKGREMVTSYIISFFMNGLYQLVVLYMMGSIIPIIHSSIKLPRGYGVRNTVSLLH MRQYLDNLLAIHIGGVKIPVLTLIIIALLCLFIIWFRKTKLGQDMRAVGQDMDVAGDAGI KVERTRIIAIIMSTVFAGLGMVIYLQNMGNISTYSSHTQIGMFCIAALLVGGASVDRASI GNVFLGVILFHLMFIVAPKAGATITGDSMIGEYFRVFISYGVITVALIMYETKKRRAKGK AGQMLQLAQSEEGEN >gi|157101630|gb|DS480694.1| GENE 419 450750 - 451238 390 162 aa, chain + ## HITS:1 COG:no KEGG:Spico_1669 NR:ns ## KEGG: Spico_1669 # Name: not_defined # Def: hypothetical protein # Organism: S.coccoides # Pathway: not_defined # 4 131 5 133 158 71 33.0 1e-11 MRAKRTILFRIGAILLLLIIAGIMMVIGRGHTVYIDNKSIDYNGQTYTTPYKVVVYVDGE 
QVAKLRDKERGMATCIGQTFKMTLEITQEKGGSEEMVDVTVSLPHHMDGIAINLPAYMAG LPEEAYLSEFQVAPEVVEEESESSSDEIPVDGIPAEGIPSDI >gi|157101630|gb|DS480694.1| GENE 420 453560 - 453730 175 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940488|ref|ZP_02087833.1| ## NR: gi|160940488|ref|ZP_02087833.1| hypothetical protein CLOBOL_05381 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07099 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07099 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05381 [Clostridium bolteae ATCC BAA-613] # 1 56 1 56 56 77 100.0 4e-13 MLKTSEETEAYHMYLENRILNKNELYILNQNIKTSEAIQEIVWKNKLVNAKAFMKT >gi|157101630|gb|DS480694.1| GENE 421 456948 - 458138 797 396 aa, chain + ## HITS:1 COG:BH3021 KEGG:ns NR:ns ## COG: BH3021 COG0303 # Protein_GI_number: 15615583 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Bacillus halodurans # 2 396 10 411 423 265 39.0 9e-71 MLEEAQERLMKIKMKEETETVRVTEALGRVCAENIYAVMDQPPFPRSPLDGYAVRSQDTT GASKENPVRLNVIEHICAGMYPKLKIGPGEAARIMTGAPIPEGADGVIMQERTDEGSESV EIYQSVASYGNYCKEGEDTYRGALLMEKGRSIHSSHIGILSSQGIETIPVYSVPVVAIMA TGDELVPVGTPLEPGKIYDSNGPLLAARIRELGMKPFLLQRGGDDTEKLAEEIRKQLTEC DALITSGGVSVGVRDCMPYVAEALGADVLFHGINVKPGSPMLAMIVEGKPVFCLSGNPFA AAATFEVLVRPVLERMRGKAQWKPEILLGILKSPFPKASPCRRLVRGKIEDGKVWLPEGR HSSGMLSSMAGCNCLVDIPAGSSGLKAGEQVKVMLM >gi|157101630|gb|DS480694.1| GENE 422 458457 - 459209 647 250 aa, chain + ## HITS:1 COG:BH4055 KEGG:ns NR:ns ## COG: BH4055 COG1811 # Protein_GI_number: 15616617 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Bacillus halodurans # 11 250 1 229 236 139 38.0 7e-33 MGKEQAGISKMAGLGTIVNVAGILAGCGVGLLLKGGLPKRLQDTISSAVGLCVVFVGITG ALKGLMTINGGSFETQDTLVSVVCMVAGAALGEWINIEYKLQQLGEWCKSRIPAGQVSGT FVEAFVSSSLLFCVGAMAIVGSLEDGLSHNYNTLFTKAVMDGVLSVVFTAALGIGTAFSV FPVAIYQGSITLLAGLVRPFLTDIMITRVSCIGSILIFGLGLNLVLGSKIKIGNLLPAVF LPIIACLLGF 
>gi|157101630|gb|DS480694.1| GENE 423 459292 - 460473 1252 393 aa, chain + ## HITS:1 COG:BH0936 KEGG:ns NR:ns ## COG: BH0936 COG0436 # Protein_GI_number: 15613499 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Bacillus halodurans # 1 377 2 377 385 342 43.0 1e-93 MKFAKRMDRFGEGIFSKLAEIKKEKTAKGEEIIDLSIGAPNIPPAPHIMEALCRAAAEPS NYVYAISDRREMLEAAGGWYQTRYGVELDPETEICSLLGSQEGLAHIALSIVDEGDVVLV PDPCYPVFGDGPQIAGARLHYMPQRKENGYIIRLDEIPEDVARKAKLMVVSYPNNPTTAM APDSFYEELIAFARKYDIIVLHDNAYSDLIFDGKACGSFLRFPGAKEIGVEFNSLSKTYG LAGARIGFCLGNSQVVSRLKLLKSNMDYGMFLPVQAAAIAAITGDQSCVQATRKAYETRR NIMCDGFTSIGWKMERPEATMFVWAAIPEHYTSSEAFVMELVERAGVMVTPGSAFGPSGE GYVRLALVQDEEMLNKAIKAVRESGVLTAEGKR >gi|157101630|gb|DS480694.1| GENE 424 460474 - 460837 167 121 aa, chain + ## HITS:1 COG:no KEGG:Closa_0784 NR:ns ## KEGG: Closa_0784 # Name: not_defined # Def: VanZ family protein # Organism: C.saccharolyticum # Pathway: not_defined # 28 121 7 101 158 77 42.0 1e-13 MPDERNRPVLKRPALKRSVMNRPALNRWHVVILVYICFIYGNSLTPATISSQESGFLLDK FRGAMISLGWEHLWLTEHIVRKTAHFAEYAVLGGLMVKACGGNGRYRIFNRDVLMMIFMV P Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:08:52 2011 Seq name: gi|157101629|gb|DS480695.1| Clostridium bolteae ATCC BAA-613 Scfld_02_36 genomic scaffold, whole genome shotgun sequence Length of sequence - 103695 bp Number of predicted genes - 93, with homology - 92 Number of transcription units - 38, operones - 24 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 - CDS 1 - 1066 1339 ## COG4214 ABC-type xylose transport system, permease component 2 1 Op 2 11/0.000 - CDS 1069 - 2616 197 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 - Term 2644 - 2694 8.1 3 1 Op 3 . 
- CDS 2698 - 3882 1527 ## COG4213 ABC-type xylose transport system, periplasmic component - Prom 4064 - 4123 5.6 - Term 4221 - 4263 10.1 4 2 Op 1 11/0.000 - CDS 4334 - 5359 1368 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 5 2 Op 2 21/0.000 - CDS 5359 - 6411 1314 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 6 2 Op 3 16/0.000 - CDS 6411 - 7925 197 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) - Prom 7968 - 8027 2.4 - Term 7994 - 8033 10.2 7 3 Op 1 1/0.000 - CDS 8067 - 9170 1423 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 9212 - 9271 5.2 - Term 9227 - 9270 6.1 8 3 Op 2 7/0.000 - CDS 9342 - 10949 1706 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 9 3 Op 3 . - CDS 10946 - 12574 1737 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Prom 12624 - 12683 7.9 - Term 12670 - 12706 1.5 10 4 Op 1 . - CDS 12727 - 13692 824 ## COG0657 Esterase/lipase 11 4 Op 2 . - CDS 13762 - 17526 3558 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain 12 4 Op 3 . - CDS 17568 - 17921 367 ## ELI_3756 hypothetical protein - Prom 17996 - 18055 11.0 - TRNA 18126 - 18209 56.4 # Leu AAG 0 0 + TRNA 18388 - 18471 64.6 # Leu CAG 0 0 - Term 18542 - 18568 0.1 13 5 Tu 1 . 
- CDS 18605 - 19645 1105 ## COG1609 Transcriptional regulators - Prom 19710 - 19769 8.3 - Term 19751 - 19801 6.6 14 6 Op 1 1/0.000 - CDS 19811 - 21193 1508 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component 15 6 Op 2 18/0.000 - CDS 21291 - 22010 221 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 16 6 Op 3 19/0.000 - CDS 22012 - 22743 254 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 17 6 Op 4 24/0.000 - CDS 22766 - 23761 1147 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 18 6 Op 5 . - CDS 23768 - 24631 1054 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 19 6 Op 6 . - CDS 24694 - 25452 241 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 25684 - 25743 8.3 + Prom 25803 - 25862 9.5 20 7 Op 1 1/0.000 + CDS 25964 - 27574 1373 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II 21 7 Op 2 . + CDS 27590 - 29173 1704 ## COG4670 Acyl CoA:acetate/3-ketoacid CoA transferase 22 7 Op 3 10/0.000 + CDS 29255 - 29992 229 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 23 7 Op 4 . + CDS 30019 - 31206 1340 ## COG0183 Acetyl-CoA acetyltransferase + Term 31212 - 31279 15.9 - Term 31210 - 31259 12.1 24 8 Op 1 1/0.000 - CDS 31294 - 32769 802 ## COG1070 Sugar (pentulose and hexulose) kinases 25 8 Op 2 . - CDS 32814 - 33689 727 ## COG0191 Fructose/tagatose bisphosphate aldolase 26 8 Op 3 21/0.000 - CDS 33710 - 34705 1021 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 27 8 Op 4 16/0.000 - CDS 34702 - 36204 213 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Term 36246 - 36299 16.6 28 9 Op 1 . - CDS 36324 - 37367 1044 ## COG1879 ABC-type sugar transport system, periplasmic component 29 9 Op 2 . 
- CDS 37403 - 38287 582 ## COG1737 Transcriptional regulators - Prom 38423 - 38482 5.7 - Term 38474 - 38523 10.2 30 10 Op 1 . - CDS 38561 - 39661 822 ## EUBELI_01688 hypothetical protein 31 10 Op 2 . - CDS 39658 - 40845 1227 ## EUBELI_01689 hypothetical protein 32 10 Op 3 . - CDS 40869 - 42854 2514 ## gi|160940534|ref|ZP_02087878.1| hypothetical protein CLOBOL_05429 33 10 Op 4 . - CDS 42930 - 43616 247 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 - Prom 43641 - 43700 6.1 + Prom 43662 - 43721 6.2 34 11 Tu 1 . + CDS 43922 - 44125 182 ## Closa_1532 hypothetical protein - Term 44125 - 44165 1.5 35 12 Op 1 . - CDS 44230 - 44811 620 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 36 12 Op 2 . - CDS 44865 - 45053 279 ## gi|160940539|ref|ZP_02087883.1| hypothetical protein CLOBOL_05434 37 12 Op 3 . - CDS 45108 - 46049 756 ## CLJU_c23130 hypothetical protein - Term 46089 - 46132 9.1 38 13 Op 1 . - CDS 46157 - 46513 574 ## PROTEIN SUPPORTED gi|239625929|ref|ZP_04668960.1| ribosomal protein L20 39 13 Op 2 . - CDS 46542 - 46739 242 ## PROTEIN SUPPORTED gi|228581564|ref|YP_002852265.1| ribosomal protein L35 40 13 Op 3 . - CDS 46769 - 47263 404 ## PROTEIN SUPPORTED gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 - Term 47627 - 47680 18.4 41 14 Op 1 . - CDS 47701 - 48015 454 ## Closa_2298 hypothetical protein 42 14 Op 2 . - CDS 48077 - 49696 1611 ## HRM2_29940 hypothetical protein 43 14 Op 3 . - CDS 49711 - 51099 1612 ## COG0534 Na+-driven multidrug efflux pump 44 15 Op 1 . - CDS 51201 - 51791 438 ## COG2068 Uncharacterized MobA-related protein 45 15 Op 2 . - CDS 51808 - 52677 652 ## Cphy_1491 hypothetical protein - Prom 52725 - 52784 5.3 - Term 52801 - 52848 15.5 46 16 Op 1 . - CDS 52977 - 53060 158 ## - Prom 53101 - 53160 5.4 47 16 Op 2 . - CDS 53226 - 54257 1060 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family 48 16 Op 3 . 
- CDS 54254 - 55195 1046 ## Cbei_1987 regulatory protein, LysR
49 16 Op 4 6/0.000 - CDS 55243 - 56391 1054 ## COG1118 ABC-type sulfate/molybdate transport systems, ATPase component
50 16 Op 5 23/0.000 - CDS 56405 - 57076 781 ## COG4149 ABC-type molybdate transport system, permease component
51 16 Op 6 . - CDS 57098 - 57982 1361 ## COG0725 ABC-type molybdate transport system, periplasmic component
    - Prom 58006 - 58065 3.9
    - Term 58060 - 58094 1.2
52 17 Op 1 6/0.000 - CDS 58229 - 59197 255 ## PROTEIN SUPPORTED gi|134277849|ref|ZP_01764564.1| ribosomal protein S16
53 17 Op 2 6/0.000 - CDS 59203 - 60261 1004 ## COG2896 Molybdenum cofactor biosynthesis enzyme
54 17 Op 3 5/0.000 - CDS 60267 - 60761 604 ## COG0315 Molybdenum cofactor biosynthesis enzyme
55 17 Op 4 . - CDS 60786 - 61844 1287 ## COG0303 Molybdopterin biosynthesis enzyme
    - Prom 61889 - 61948 4.9
    - Term 61965 - 62022 9.3
56 18 Op 1 . - CDS 62029 - 62679 497 ## COG2068 Uncharacterized MobA-related protein
57 18 Op 2 . - CDS 62739 - 63149 483 ## Acear_0330 thioesterase superfamily protein
58 18 Op 3 1/0.000 - CDS 63168 - 64766 1612 ## COG4670 Acyl CoA:acetate/3-ketoacid CoA transferase
59 18 Op 4 5/0.000 - CDS 64751 - 66820 1631 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II
60 18 Op 5 . - CDS 66832 - 68079 1484 ## COG0477 Permeases of the major facilitator superfamily
    - Prom 68112 - 68171 4.8
    - Term 68143 - 68215 19.3
61 19 Op 1 . - CDS 68238 - 70610 2660 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases
62 19 Op 2 11/0.000 - CDS 70627 - 72948 2725 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs
63 19 Op 3 . - CDS 72965 - 73576 553 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs
    - Prom 73658 - 73717 8.4
    - Term 73729 - 73791 12.5
64 20 Op 1 . - CDS 73817 - 74485 568 ## Meso_4073 TetR family transcriptional regulator
65 20 Op 2 . - CDS 74506 - 75567 897 ## COG0406 Fructose-2,6-bisphosphatase
    - Prom 75736 - 75795 4.7
66 21 Tu 1 . - CDS 75881 - 76501 549 ## COG2068 Uncharacterized MobA-related protein
    - Prom 76547 - 76606 4.4
    - Term 76605 - 76659 12.0
67 22 Tu 1 . - CDS 76671 - 77384 711 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases
    - Prom 77446 - 77505 8.8
    + Prom 77528 - 77587 5.4
68 23 Tu 1 . + CDS 77618 - 78505 716 ## CLJU_c15750 hypothetical protein
    + Prom 78512 - 78571 4.9
69 24 Tu 1 . + CDS 78618 - 79505 789 ## CLJU_c15750 hypothetical protein
    + Term 79612 - 79662 -0.9
70 25 Tu 1 . + CDS 79922 - 80452 471 ## gi|160940581|ref|ZP_02087925.1| hypothetical protein CLOBOL_05476
    - Term 80458 - 80508 1.3
71 26 Tu 1 . - CDS 80552 - 81580 1030 ## COG4859 Uncharacterized protein conserved in bacteria
    - Prom 81624 - 81683 4.2
    - Term 81682 - 81726 8.0
72 27 Tu 1 . - CDS 81756 - 82040 252 ## gi|160940583|ref|ZP_02087927.1| hypothetical protein CLOBOL_05478
    - Prom 82219 - 82278 4.3
    + Prom 82142 - 82201 4.8
73 28 Tu 1 . + CDS 82226 - 83695 884 ## COG3666 Transposase and inactivated derivatives
    + Term 83696 - 83733 3.5
    - Term 83631 - 83685 11.1
74 29 Op 1 11/0.000 - CDS 83751 - 86114 2802 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs
75 29 Op 2 15/0.000 - CDS 86107 - 86598 519 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs
76 29 Op 3 . - CDS 86612 - 87481 860 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs
    - Prom 87538 - 87597 4.7
    - Term 87671 - 87727 22.5
77 30 Op 1 . - CDS 87779 - 89251 1291 ## COG0513 Superfamily II DNA and RNA helicases
78 30 Op 2 . - CDS 89280 - 90026 689 ## COG3619 Predicted membrane protein
79 30 Op 3 . - CDS 90080 - 90577 455 ## COG1329 Transcriptional regulators, similar to M. xanthus CarD
    - Prom 90823 - 90882 7.2
    + Prom 90608 - 90667 5.3
80 31 Tu 1 . + CDS 90910 - 91200 280 ## Cbei_3087 hypothetical protein
    - Term 90981 - 91028 6.0
81 32 Op 1 . - CDS 91210 - 92058 853 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family
82 32 Op 2 . - CDS 92065 - 92454 413 ## Shel_01590 molybdenum-binding protein
    - Prom 92485 - 92544 3.1
    - Term 92503 - 92532 -0.9
83 33 Tu 1 . - CDS 92585 - 93508 1154 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase
    - Prom 93539 - 93598 6.0
    + Prom 93545 - 93604 3.6
84 34 Op 1 . + CDS 93633 - 94481 398 ## Closa_0221 dipicolinate synthase subunit A
85 34 Op 2 . + CDS 94499 - 95083 469 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase
    + Prom 95188 - 95247 2.1
86 35 Tu 1 . + CDS 95386 - 96129 527 ## bpr_I1000 hypothetical protein
    - Term 96005 - 96034 -0.9
87 36 Tu 1 . - CDS 96141 - 97148 788 ## COG3344 Retron-type reverse transcriptase
88 37 Op 1 38/0.000 - CDS 97580 - 98413 923 ## COG0395 ABC-type sugar transport system, permease component
89 37 Op 2 35/0.000 - CDS 98428 - 99306 966 ## COG1175 ABC-type sugar transport systems, permease components
    - Term 99324 - 99359 3.4
90 37 Op 3 . - CDS 99369 - 100808 1979 ## COG1653 ABC-type sugar transport system, periplasmic component
91 37 Op 4 . - CDS 100842 - 101807 1199 ## Closa_1289 Uroporphyrinogen-III decarboxylase-like protein
    - Prom 101857 - 101916 8.0
    - Term 101971 - 102015 10.2
92 38 Op 1 . - CDS 102055 - 102702 679 ## COG1802 Transcriptional regulators
93 38 Op 2 .
- CDS 102699 - 103694 626 ## gi|160940609|ref|ZP_02087953.1| hypothetical protein CLOBOL_05504 Predicted protein(s) >gi|157101629|gb|DS480695.1| GENE 1 1 - 1066 1339 355 aa, chain - ## HITS:1 COG:BH3440 KEGG:ns NR:ns ## COG: BH3440 COG4214 # Protein_GI_number: 15616002 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, permease component # Organism: Bacillus halodurans # 10 355 10 354 388 252 46.0 8e-67 MDKKKTVNIDMKQYGMVLALIAVFLIFAVMTGGKNMSPANINNLIMQNGYVVILAVGMLL CCLTGNIDLGVGSIVALCGATVGILMIDYHTNMWVAILAALAVGALAGMFVGLFVSKLSI PPFIVTLATMLMGRGLTYTLLKAQTKGPLPTSYTYIGAGFLPTVKIPFGNGTLDLVSIIV AGIATVLVIMAELKSINTKKKYGFSTNPLWQVIIKEAVILMIVWFFFYKLSRYNGIPFVL VIMGVLVGLYHFITSKTVAGRQIYALGGNAKAAKLSGINTEKVFFWVYTNMGILSAIAGI VLSARNASATPKAGDGFEMDAIASCYIGGAATSGGIGTIIGAVVGAFIMGILNNG >gi|157101629|gb|DS480695.1| GENE 2 1069 - 2616 197 515 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 261 514 6 252 563 80 26 3e-14 MSDYILEMNHITKEFSGVKALDDVNIKVKRGEIHALCGENGAGKSTLMNVLSGVYPFGSY SGDIVYNGNVCQFHSIKQSEAKGIVIIHQELALSPYMSVAENVFLGNEQKAAKGIIDWTL THKKAQEMLEKVGLGGENLNAPISSLGVGKQQLIEIAKAMAKKVELLILDEPTAALNDEE SRRLLDIMLDLKSHGITSIIISHKLNEISYVADSITVIRDGKTIETLEKGRDEFTEDRII KGMVGRELTNRYPERHCEIGDTIFEIKDWNVYHSDDAQRQVIKDVSLHVRKGEVVGLAGL MGAGRTELAMSIFGRSYGQKISGTIKINGREVHIKDVKDAINNKLAYVSEDRKNYGLVLI KDIKWNMTLSAMRNFFSKNGVLNENDEILAAEDYKKKINIKSNSINQTVGSLSGGNQQKV VLAKWMLTQPDVLILDEPTRGIDVGAKYEIYCVINELAKSGKAVIVISSEMQEVIGTCDR VYVINEGRIAGELTKEEVTQERIMKCIMADNGKEA >gi|157101629|gb|DS480695.1| GENE 3 2698 - 3882 1527 394 aa, chain - ## HITS:1 COG:BH3442 KEGG:ns NR:ns ## COG: BH3442 COG4213 # Protein_GI_number: 15616004 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type xylose transport system, periplasmic component # Organism: Bacillus halodurans # 56 394 22 359 359 342 51.0 8e-94 MKRKILSMILGTAIAASLLTGCGGSQEAAATTAAAAADTTAAAADSSAAADAATEAAKAA 
TGEGKKVGVAMPTQSSERWINDGANMKKQLEALGYEVDLQYAEDDVQMQVSQIENMIASG VNCLVIASIDSSALVNVEEQAKNAGIPIIAYDRLLMDTDAVSYYATFDNKGVGTAIGNYI KEAKDLDAAKAAGESYTIEFFMGSPDDNNALFLYNGLMEVLQPYLDDGTLVCKSGRTSFE DTCILRWSQETAQQNCENYLTGFYADEKLDICASAFDGFAYGCKAALEGAGYKVGEDWPL ITGQDAELMAVKNIISGYQTATIYKDTRLLAEKCVTMVQAVLEGAEPEINDTEQYNNGKV VVPAYLCTPVAVDKDNYKEIIVDGGYYTEEQLAQ >gi|157101629|gb|DS480695.1| GENE 4 4334 - 5359 1368 341 aa, chain - ## HITS:1 COG:YPO3905 KEGG:ns NR:ns ## COG: YPO3905 COG1172 # Protein_GI_number: 16124037 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 17 336 8 314 330 206 42.0 5e-53 MKNRSEKKRLDGNSFLLAVTIILFFVMYIAGIIMFGGRGFAKVQNFLNLFISNAGLIVVA TGMTIVMITGGIDISVGSVVAMVCMMLAWMMERGNIGAFPAILIVLVTGCIFGLVQGFLV AYLKIQPFIVTLAGMFFARGMTSIISQEMISITNKTFTGIATFKMYLPFGGYLNKKGVMI YPYVYPSVIIALVVLAIVFVVLKYTKFGRSIYAIGGNEQSALLLGLNVRRIKLQAYVLDG FLAGLGGLLFCMNTCSGFVEQAKGFEMDAIASSVIGGTLLTGGVGNVIGSLFGVLIKGTI ESFITFQGTLSSWWVRITIAALLCFFIVLQSLIAALKRKNK >gi|157101629|gb|DS480695.1| GENE 5 5359 - 6411 1314 350 aa, chain - ## HITS:1 COG:YPO3906 KEGG:ns NR:ns ## COG: YPO3906 COG1172 # Protein_GI_number: 16124038 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 19 298 29 298 339 201 45.0 1e-51 MGKLKKLTGYRLFLPLSCLIIVLLINLITTPTFFRITINNGVLYGYIIDVINRASELVVL AVGMTLVVSASAGTDISVGAVMAVAAAVCTRVLAGGEVSVNEYANPYVLAVLAALAAAVV CGGFNGLLVAKLKIQPMVATLILFTAGRGISQLVTNGQITYIRVDGYKLLGNNIPGIPIP TPIFVAVIVVALTYILLKKTAMGMYIQSVGINERASRLVGLKSVKIIWMAYAFCGLCAGI AGLVASSRIYSADANNIGLNMELDAIAAVALGGNSLGGGRFSLLGSVIGAYTIQALTTTL YAMSVPADQIPVYKAIVVILIVAVQSEELKKFRKRLASRHSSKNVEGGVA >gi|157101629|gb|DS480695.1| GENE 6 6411 - 7925 197 504 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 
13826] # 264 479 4 218 223 80 31 3e-14 MEQNVVLEMRGINKNFPGVKALQDVDFTLRKGEIHALMGENGAGKSTLIKVLTGVEEFET GTIRMEGSSNTIINRSPQEAQENGISTVYQEVNLCPNLSVAENLYIGREPKKGPMIDWRT MKEDARKLLEGLDVHIDVTMAVENYSIAIQQMIAIARAVDMSAKVLILDEPTSSLDDGEV EKLFVLMRQLKDKGIGIIFVTHFLEQVYAVCDRITVLRNGTLVGEFKVEELPRVQLVAKM MGKDFDDLAAIKKEGVSTVKEDVIISARGLGHKGTIKPFSLDIHKGEVIGLTGLLGSGRS ELVRAIYGADKPDSGELAVKGKKLKVNAPIDAMMEGMAYLPENRKEEGIIADLSVRENII IALQAKKGMFRLMSRKEQEEFTDKYIDILQIKTADRETPIKQLSGGNQQKVILGRWLLTN PDFLILDEPTRGIDIGTKTEIQKLVLKLAEDGMSVVFISSEIEEMLRTCSRMAVMRDGGK VGELKEDELSQDSIMKAIAGGGEE >gi|157101629|gb|DS480695.1| GENE 7 8067 - 9170 1423 367 aa, chain - ## HITS:1 COG:SMb21587 KEGG:ns NR:ns ## COG: SMb21587 COG1879 # Protein_GI_number: 16264775 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 66 343 20 293 320 216 48.0 7e-56 MRKKGLSVMLCAAMAASLLAGCGGGNKPAETTAAPAAAETTAAAADTTAAPEKSEAETEA EAKDDAASGDLIVVGYAQVGAESDWRTANTESFKSTFTEENGYKLIFDDAQQKQENQIKA IRSFIQQDVDYIVVAPVVETGWEAVLQEAQEAGIPVILSDRQMDVDESLYECWVGGNFIK EGETAGNWLADYLKAQGRDGEDINIVTLQGTIGASAQVGRTEGFGNILKQHDNWKMLDMQ TGEFTQAKGQEVMESFLKSYDDIDVVVAENDNMAFGAIDAIKAAGKTCGPDGDIIIFSFD AVKAAFDAMIAGDLNAAFECNPLHGPRVDEIIKKLEKGETVEKIQYVDEAYFDTSMDLES IKAERAY >gi|157101629|gb|DS480695.1| GENE 8 9342 - 10949 1706 535 aa, chain - ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 165 528 224 595 602 197 33.0 6e-50 MTGKWRTGSGKKGQDGANEKEQGIGKDWDTNHILDRLPLEKRLNLMTFIIIIPLAVLVIY LMATVAKFCNAYTQSIANITQANSVTANLREDVDYSMYRIVIGMRTYQSIHELMEEDRPY GWEQIKDPHQTISDARKTYRQLLKRTPEGANRTRINWLLHCLDQLEQRVDEIEDNLPHGM YDKNMEILDYGVRVLTADIEKQGREYVYYETLHVQEIQKELESQEHTAIVVSLTLLVSIL IISLLLSRRITKSVTVPIQKLCNETERVAKGDFTSGPKIEAGDELAILTGSFDHMKEEIG RLIEDIRQEQNQRRVMELQLLQEQINPHFLYNTLDTIVWLAEGGQNRAVVDMVTSLSEFF 
RTTLSGGKDFITMREEIGHIKSYLQIQKIRYQDIMDYEVTLEKSLEERRILKLTLQPLVE NALYHGIKNKRGRGRIWVRGYAKEDMAVLEVEDDGAGMTEEEMEAVRRKLRGEKDLAFQE GPAPKGGFGLFNVAERLRLNYGSRCSLEFKSVLGLGTRAVVNIPLESGYDREGED >gi|157101629|gb|DS480695.1| GENE 9 10946 - 12574 1737 542 aa, chain - ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 535 1 520 525 165 25.0 3e-40 MIKVFLVEDEIVMRNGIKNNIPWEKEGLEFAGEASDGELAYPLIRKVQPDILITDIRMPF MDGLELSELVKKEFPRIKIIILSGYNEFDYAKQAIHIGVTDYLLKPITAAKLLEAVKKVA DVIEKEREDARLMDKYRLEMAENTVLERQRLLRDLVTGRVNFKEALERGEQVGMDLSASF YQLMLFKLMPMGSAVAYSDRVVSCQEAIEERMEGRQHILVFDRGDEGWAFVLTGESEQEV ERRMSECAASLGKMAGACKDIQYFGGIGSCVNRLGDLQTSYLQAGRAFAARFFTEMNRII SYSQMDGMVHGTGETIDINSVDASKVNRKTLENFLKQGTVGEASGFVEEYFQNVGEKNCQ SFMFRQYIVMDCYLCVSTFLEQLGLDMGRLPRELSDMEKVLKDGCALDMLKEKLVRLFEE VMTLRDSQTASKYSQVLDEAKAFIYDNYAKEEISLNTVAARVNISPSYFSSIFSQEMGVT FVEFLTGVRMEKAKELLMCSNLKTSEIGYEVGYKDSHYFGYLFKKTVGCTPKEYRAGSRE GS >gi|157101629|gb|DS480695.1| GENE 10 12727 - 13692 824 321 aa, chain - ## HITS:1 COG:RSp1108 KEGG:ns NR:ns ## COG: RSp1108 COG0657 # Protein_GI_number: 17549329 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Ralstonia solanacearum # 27 311 50 332 339 223 44.0 5e-58 MKRCVPLEPAAEEVCQANSKPPLIFELPPAEGRMVLEKAQDTPVYKYPARVGKVRVNTGR WGTIPVWLVDPGVSCQPANVIFYIHGAGWVFGSFHTHEKLVRELAARTGCLLIFPEYSRS PEVRCPTAIEQCYSVMCTAPDLVKSMGYAMNPGTFTVAGDSVGGNMAAAMTLMSKYRKGP FIQKQLLYYPVTNACFDTCSYNEFAAGYYLYRAGMQWFWNQYAPCQKDRAQITVSPLRAS AEQLRGLPDAMILNGEADVLRDEGEAYAGKLREAGVDVTALRFQAIIHDFVMLNSLDQTR ACRAAMDVSTEWINRKNREKQ >gi|157101629|gb|DS480695.1| GENE 11 13762 - 17526 3558 1254 aa, chain - ## HITS:1 COG:slr2100 KEGG:ns NR:ns ## COG: slr2100 COG3437 # Protein_GI_number: 16330586 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator 
containing a CheY-like receiver domain and an HD-GYP domain # Organism: Synechocystis # 4 351 10 366 368 238 37.0 7e-62 MRQRDTILIVDDMEINRVILDGLFQEEYHLLEAENGEQALLLLRQYHENIAIMLLDVVMP VMDGYKVLQEMGANGLLDEVPVVVITADNSLEGELRAFDLGASDIIVKPFEPHVVKRRVH NIIELYLHKHNLEDMVEQQAQKLRESNDVLVDALSSVIEYRSLESGQHIKRIRLLTQVLL EEVCRNCPEYGLDKKKIEVIASASALHDIGKIAIEDKILNKPGRLTAEEFEIMKTHTLRG CQILEGLERMSDKGYLEYAYNICRYHHERWDGRGYPDGLYGDAIPICAQAVSIVDAYDAL TSDRVYKKAIPHEQAFNMILNGECGCFSSRLLDCFKNVEPQLADLCKRYRDGNVPAPAKL SHHGSVLGVGDGSLSALQLEQMKYFAMLRYVDTTVAEVDFSTGSYHVVYAPDERIKKLFQ GECLEAVAECCINDLVVPEEQDNVRRIWKTEIPRFFEEGMLKQSWKYRLKDEHRSEEWYE LTLLRVDIEHPRRRKALLLWRLVDRAHQEADLDENRLREDRRVLHNILKGILRCRNDRWF TMMDVDDGFLGYTREEIKERFNNCYIEMIRPDDRERMRRELTEQINRGGEYEVEYRVEDR SGRSIWMLDKGRLMYDNEGREYFYSVIIDIDQTKHAQEELRLGLERYKIVMEQTNDIFFE WDIGSDTVVYSINWKAKFGYDPISEQASVRIPKISHLYPEDMAPFGHLIEEIRDGRRYGE LEFRVSDSKGRYQWCKLRATTQFDDQGQPCKAVGIVIDINDEKLAAQELKARAERDTLTR LYNKESARQRIEHLLVNREEGEGAALFVIDIDNFKMVNDQYGHMFGDAVLTKIAGQLSRL FRSSDIVSRIGGDEFMALLQGVVNEQRVKSVAGRLIECFERVLEELPKDCCITCSIGIAV CPRDGENFQTLFRRADVALYQAKACGKKQYQIYDQSMEDKAFGQGAGRKATVNTTIESEQ NGDMMLNDLLPRAFNILSKSEKMDKAVESVLELLGERLQVSRAYVFENSEDGTCYSNTFE WCARGIRSQKDLLQGCSYAEFSTDYQSCFNESGILYCDDIGQLPGDIQKRLKQRGVKSML QCAIRDNGVFKGWVGFDDCTSHCLWTISQIEVLSFVSELLSLFLLKQRAQDSAVDLAEDL RTILDHQNSWIYVVDMESHELLFINDKTYRLAPDSRLGMSCHEAFFKNDKPCQRCPMKQV EDKINYTMEVYNPVLKVWSSADASRIHWGGKDACLLCCHDITCYKKAEENSANQ >gi|157101629|gb|DS480695.1| GENE 12 17568 - 17921 367 117 aa, chain - ## HITS:1 COG:no KEGG:ELI_3756 NR:ns ## KEGG: ELI_3756 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 6 114 3 112 116 66 35.0 4e-10 MDLNTILELKGAGVDIDGALRRFSGNSALYEKFLKKFLTDSTFSQITKAFEGENREDALM ATHTFKGVTANLGMDKLFNISSFMVDHIRADRFDEAAGAYPELEEAYKEICRILAND >gi|157101629|gb|DS480695.1| GENE 13 18605 - 19645 1105 346 aa, chain - ## HITS:1 COG:TM1200 KEGG:ns NR:ns ## COG: TM1200 COG1609 # Protein_GI_number: 15643956 # Func_class: K Transcription # Function: Transcriptional regulators # 
Organism: Thermotoga maritima # 5 335 2 332 333 196 36.0 5e-50 MSIQKKDKYTISDVAQMLGVSRSTISRAMNNSPGVGEELRKKVLDFVEEIGYQPNTIARS LSKGRQSIIALIVSDIRNPFYADLTFYIQKILHNNGYMLMVLNSEYDIRREKEFIHMAIQ FNFSGVFLLTAQSEEIEKELDDIEIPVVLVNRILGSYEGDSVLSDNFKAGYIAAMHLIEL GYPEIAFVKGPDVSSASEQRFRGYRQALENYRLPFKEQNVFKGDLKLDTGSELAKIYISD LKNRPKGIVISNDMMAIGFVEHCRESGVKIPEQISVVSFDNIVFSSLYDISLTTVSQHVR EMSEHAARLMLKQLKNPQEKPERVILDPTLIVRRTTCPYVPEQDGE >gi|157101629|gb|DS480695.1| GENE 14 19811 - 21193 1508 460 aa, chain - ## HITS:1 COG:APE2521 KEGG:ns NR:ns ## COG: APE2521 COG0683 # Protein_GI_number: 14602118 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Aeropyrum pernix # 71 458 39 424 430 243 36.0 4e-64 MRKQLVALTLSAAMAAGMLSGCGGYGSGSGGQKDAAEAAGTAAGAAATTAAAGSDGTAEP AKASTADGEPILIGALYPMTGALADSGQNMKDGIDLAVEEINAAGGISGRPIQIVYGDTQ GANATGMTEMERLITQDKVMAVMGAYQSGVTEVVSQVAENYQVPMITANATSDSLTSHGY EYFFRLAPTNMMFIRDMEQYLLDLSAKDPADDIDVKSVAVCADNTELGQQTATWAKYFAE ENGLEFKGEVLYSQGAADLTSEVLQLKSLNPDALIVDNYVSDAILLTKTMNEQGYKPNIM IAKANGYTESSYLPSVGGLANGILTATEFLPGDKGTKVSDTFKAKYGVDMNGHSAEAYTV VWIFRTALQNLADAGKEITSEALKDELSTLEIKDAFPGGEEIILPYDTIKFSESEFNGIP YKNTNMDGKLTIAQFQDGSLVTVWPFDIAKHDTIYPAPFN >gi|157101629|gb|DS480695.1| GENE 15 21291 - 22010 221 239 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 7 231 3 231 245 89 26 5e-17 MADRYFEVNKINVRYGDIQVLWDIEFTADKGEIVALVGSNGAGKSTTVKSCAGLIRPFQG NIRLDGHDLTGAACRTFIDHGVILVPEGRQLFPSMTIYENLEMGASSKAAKAAKKDTIER IYNWFPKLKDRKNQLAGTLSGGEQQMLAFSRGMMGLPKLLIMDEPSLGLAPNIVDNIFSI AKEVAQESGLTIILVEQDVRKALRIANRGYVIENGQITVSGTAEELLNNDEVKKAYLGF >gi|157101629|gb|DS480695.1| GENE 16 22012 - 22743 254 243 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 5 223 12 225 312 102 30 8e-21 MGNILEIKDLSKNFKGLKAVDSVSFAVEEGGITGLIGSNGAGKTTVFNMISGVFPPTAGH 
IIYKNKEITGMKPYHYTSMGIARTFQIMKPLASMSVLDNVASGALYGRKKYKSVKEAKEY AEEILHFTGLYERKDWLPGEMGTPFKKRLELARALATDPDMLLLDEVMAGLNPTETDEAV ELIRKINENGTTILLIEHVMRAVANLCQKVVVMHHGEKITEGTPEMVMNDPYVIEIYLGK EEH >gi|157101629|gb|DS480695.1| GENE 17 22766 - 23761 1147 331 aa, chain - ## HITS:1 COG:BMEII0874_2 KEGG:ns NR:ns ## COG: BMEII0874_2 COG4177 # Protein_GI_number: 17989219 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Brucella melitensis # 26 312 15 301 315 169 35.0 8e-42 MKKKIGSRQILLLILLLIAAVIPVFVRTPYLLSICVMTFYMASASLAWSILGGLTGQISL GHASFMGLGAYMSTLLLVKLNVSPWISIPLVFLFVGALTALLLSPCFVLRGPYFSLVTIA FGEAFRNLMTNWDFAGKGQGLLLPFGDNSLALLRFKSKVPYFYLSLLMVIAVYLIVRAID RSKFGYALKTVREDEDTANAIGINPLKYKVLATFVSCGLIAVCGVFYANYIRFINPDIMI QAQSVEFVLPAVIGGIGSVTGPLIGAIILTPLAQYLNSTLSSIAPGANLLVYAIILIVVI LFQPRGIMGWYNGSKFKVKVNDMFDRMDGKK >gi|157101629|gb|DS480695.1| GENE 18 23768 - 24631 1054 287 aa, chain - ## HITS:1 COG:BMEII0102 KEGG:ns NR:ns ## COG: BMEII0102 COG0559 # Protein_GI_number: 17988446 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Brucella melitensis # 1 279 1 278 287 158 33.0 9e-39 MNIFLQAAANGIMVGGIYALIGMSLTLIFGVMKIINFCQGELLMLGMYISFVLFDQCGLD PFVAIPIVAVVMFAFGALLQSTLITRSLHGNDDTNVLFLTVGLGILFQNVALLYFKSDYR TAQSMFSEKIVQLGAINLSLPKVLSFVILLAVTILLFAFLKYTNIGKQIRATSQNSTGAQ VCGIKTKLVYATTYGLGAAIAGITGACLMSFYYVFPTVGAVYGTRSFIVVTMGGLGSVIG AFVSGIVLGLMETVGAVVVGSSFKDTIVFLAFILILVVKQTVKTRRG >gi|157101629|gb|DS480695.1| GENE 19 24694 - 25452 241 252 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 11 247 7 238 242 97 30 2e-19 MILDLEGRRAVVTGGSRGLGYGIAQALHDSGAEVIITGRTGKVWEAAAEIGSSGPPAYGV TGDLSRQQQRETVCEQILEIYNGRVDILVNAAGTLNRCAAFEVTQEDWNEVVELNLNAVF FMSQRIGRSMAARRYGKIINIASMDSFFGSVLVPAYSASKGGVAQLTKALSNEWAAEGIN 
VNAIAPGYMATALTDTMKVKNPAQYEETTRRIPMGRWGTAEDLKGLAVFLASDASAYISG AVIPVDGGFLGR >gi|157101629|gb|DS480695.1| GENE 20 25964 - 27574 1373 536 aa, chain + ## HITS:1 COG:BH2006 KEGG:ns NR:ns ## COG: BH2006 COG0318 # Protein_GI_number: 15614569 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Bacillus halodurans # 35 533 7 509 513 209 29.0 2e-53 MINGTTLWNRSYTADMIEKTFSGKTYLTYPNLPANLYDALSLSAARHPDKTAVVDDSGAA YTYTELRDMADHFSSYLYYISDIKPGTKTGIMMFNSVEFCVTFLALTKLGAVVIPLPSKY SRNEVLSLTAKADLQYILCDEKFYDWFVPLETSGVHLMKPGRSQDGFGFSHLCASYLSPV PSLGREEDDALIMFTSGTTSQSKGVIIKNYSIMHAIVSYQRIFQITDRDTTLIPIPIYLV TGLVALLGLTLYAGGTVYLHKFFDAKRVLRDVNDKEITFLHAAPAVFSLLLREKDSFPSL PSLRLLACGSSNMSKEKLTEIHRWLPCAVFHTVYGLTETCSPATIFPGDASTSSYIGSSG LPIPGTCFCILDDSGSQMQTGQVGEIAVRGTVLLDRYYQKGTGELDEDGWLRTGDLGYFN GEGYLFIVDRKKDMINRGGEKIWSFDVENELYRLDGIDEAAVVGIPHDIYGEVPVAAVKL SPESILTEQQIQDLLKCRIAKYMIPSRILFLNELPLTPNNKVNKSAIRKLFMQPEQ >gi|157101629|gb|DS480695.1| GENE 21 27590 - 29173 1704 527 aa, chain + ## HITS:1 COG:BH3898 KEGG:ns NR:ns ## COG: BH3898 COG4670 # Protein_GI_number: 15616460 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase # Organism: Bacillus halodurans # 3 516 2 511 525 435 46.0 1e-121 MRKPVFITGEEAAAMIADNATVATIGMTLVSAAETILKAIEQRFLETGSPRGLTLVHSCG QSDRDRGIQHFAHETMLSRIIGGHWGLQPRMMQLIADNKILAYCIPQGQFAQLYRSMAGG EPGKITKVGLGTFIDPRIDGGKMNEITMSAPDINEVVTIGGEEYMRYKPIPLDYCIIRGT YVDEYGNLTTDEEAMQLEVFSAVMACKKFGGKVLAQAKYKVRAGSLHCKRVIVPGVFIDA VVICPNPEEDHRQTHSFYMDPAYCGDTKAPVSSDDVLPISLRKVIGRRALMELAPDDVLN VGTGIPNDVVGPIITEEGMSEDVTITVESGIYGGIPMGGIDFGIARNQFALVRHDDQFDY YNGPGVDVTYMGAGEVDAEGNVNATRLGPKPTGAGGFIDITTNAKHVVFCSSFTAKGLDC SFEGGRLHINQEGSLIKFVSRIKQVSYNGRIAREKGQKMHYVTERAVFELRPEGLTLTEI APGIDIQTQVLDLMEFTPLISPDLKEMDTAIFMEGAPFGLRKYIFSN >gi|157101629|gb|DS480695.1| GENE 22 29255 - 29992 229 245 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 3 241 1 238 242 92 28 6e-18 MRLENKTAIVTGAGRGLGKGIALKLAAEGAKVIVADMSMETASATAEEINASSGTAKAFA VNIAKQDEVKAMMDFTVESFGTLDVIVNNAGINRDAMLHKMTVEQWDQVIAVNLTGTFYC VQYAAQIMREKGSGAIINISSASWLGNIGQANYAASKAGVVGLTKTACRELARKGVTCNA ICPGFIDTDMTRGVPEKVWDTMISKIPMGRAGSPADVANMVAFLASDEANYITGEVINVG GGMIL >gi|157101629|gb|DS480695.1| GENE 23 30019 - 31206 1340 395 aa, chain + ## HITS:1 COG:CAC2873 KEGG:ns NR:ns ## COG: CAC2873 COG0183 # Protein_GI_number: 15896127 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Clostridium acetobutylicum # 4 394 1 391 392 366 50.0 1e-101 MENLKEVVIVSGVRLPVGSFGGSLKDISAIDMGAMVVKEAVKRAGIEPSDVDETIVGQVG QIAECGFVARAVSLKAGLPETTCAYSVNRQCGSGLQAIADAVMEIQTGFADVVVAAGTEN ISQLPYYVKDARWGARMGHKVFEDGVIDILTWPLGPYHNGVTAENVAEKYNVTRQEQDQY ALESHLRAAAAIEAGKFKDEILPVELKDRKGNITVFDTDEGPRASQTIEKLERLKPCFVK DGTVTAGNSSSLNDGAAAVVVMSREKADTLGVKPLLTIKGCTVAGNDGAVMGYAPKLSSE KLAAKLGLDLKAIDMFEINEAFASQAFAVRRDLGLDPEKVNIYGGGISIGHPIGATGVIL AVKVLYELQRTDKKDAMVSMCIGGGQGISMYFTKD >gi|157101629|gb|DS480695.1| GENE 24 31294 - 32769 802 491 aa, chain - ## HITS:1 COG:AF1752 KEGG:ns NR:ns ## COG: AF1752 COG1070 # Protein_GI_number: 11499341 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Archaeoglobus fulgidus # 2 484 10 481 493 186 26.0 1e-46 MYMGLDIGTSGVKAALVDGQGNIIRQHQVSYGFSNTSGGWRELNPEEVRSGARACLAAAG GGYPVRTITVSALGEAVILCGGSGNCLCPGITGTDIRGAGLFPAFLEKVGEKEFIEITGL NPSTIYSVNKLMWLKQNCQDLYGRAVMVFTFQDFLIHTLAGVRAIDFSMASRTGLFDCNE NSWSDRLLETSGIREGLLSEPVRGGTVVGKILPNEAFELNLPQDTLIVAGTHDHICNAIG CGAVKEGLCANTAGTTEGLTAILNEKNMIPGRAQRHHIACEPFAADGFFNTVAWENTSGV LLKWFASEFVREKGMDLKQVFSWLNGHMEPGPTGILVLPHFSGAATPHMDEHSRGAVLGL SLDTRRADLYKAMMEGINYELALIMDALLDAGIEVGQIISTGGSLSPQLLQIKADILGRN IHTVRNQQTGSLGGAMLGAVAFGEYDNLETAVAGMVRSGITYKPDRTAHELYAEYMNQYR RVYSAVRSVFE >gi|157101629|gb|DS480695.1| GENE 25 32814 - 33689 727 291 aa, chain 
- ## HITS:1 COG:ydjI KEGG:ns NR:ns ## COG: ydjI COG0191 # Protein_GI_number: 16129727 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Escherichia coli K12 # 1 290 1 276 278 150 35.0 2e-36 MLVSLQEMLQEAEKKNFAIASINTPNQISLRAVIQAAEDTNMPVTINHAQTEEDVVPIEI AVPLMLEYARKSTAPISVHVDHGYDFDHCMKAVRLGCTSIMHDNSRFSLRENVERTRRFV DIVHPLGIGVEAELGAMPNNMPSEVHGQETSDLSDLSVYFTNPDEAEIFCRESGCDVLTV SFGTVHGMYAGEPNLDIELVKKIRAKAGNVALGMHGGSGTPFNQIQAAIDAGIRKINYFT AIDTAPAPYLAKTISEATNPVNFCHLANQAMEIIYEKTVEILTVFKNEKKF >gi|157101629|gb|DS480695.1| GENE 26 33710 - 34705 1021 331 aa, chain - ## HITS:1 COG:BH3731 KEGG:ns NR:ns ## COG: BH3731 COG1172 # Protein_GI_number: 15616293 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus halodurans # 28 322 14 309 314 238 47.0 8e-63 MSGQEKQKREAASIPVIGSIYRYCRENIGIIMGCLLLGILLTFASENFMKVSNWTNVLRQ ISTNFYLSVGIMLAIILGGIDLSVGSVLAVSGVISASLITNNGMPVGAAILLGCLAGTAI GLLNGLIIAFTDMPPFVVTIATMNIGRGVAYLYADGNPIRVSNDTFEAIGLGYLGPIPLP VIYMAVILVFMYFLMNKSKMGSHIYAVGGNRQAAVYSGINVRKVIIFVYTLSGLAASWAG IVLASRMASGQPATGVGYEADAVAAAVLGGTSMTGGAGKVGGMVLGALLIGMLNNGLNLL GVNSFWQYVVKGTVILVAVYIDIFRKRKERF >gi|157101629|gb|DS480695.1| GENE 27 34702 - 36204 213 500 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 258 474 1 217 245 86 26 4e-16 MNQVLLKMKNIRKTFPGVLALDNIDFELESGEIHGMLGENGAGKSTLINVLGGIYIPDRG EIIINGQKTQINGVKDAQEAGIAVIHQELVLVPHMSIAENVFLGREIKSPAGLVDKKRME RESVEILKRVGLEISPSVPVRSLSVAQQQMVEIAKALSLNARILVMDEPSSSLTDSEVER MFAIMRKLKREGVGIIYISHKMKELFEITDRITVIRDGSYIGTIQTSEADMDQLVHMMVG RELTSYYTRTEHEIGDVMLELKDINRSGVLKNIDFSVRAGEIVGFAGLVGAGRSELFKAV LGIEPMDSGTVLIEEKNAGRLSPKKAQEMGMVLVPENRKKEGLVLINTVAFNLTLPVLDQ FVKGIFVNKKRQLEIENTYIDSMNIKTPSRLQKVGNLSGGNQQKVVIGKWLAARPKILIL DEPTRGVDVGAKAEIYKIIDRLAEQGIAIVIISSEMAELINMCDRMVVMNQGSIAGELKK SEMGFSQESIMRLATGGGTL 
>gi|157101629|gb|DS480695.1| GENE 28 36324 - 37367 1044 347 aa, chain - ## HITS:1 COG:CAC1453 KEGG:ns NR:ns ## COG: CAC1453 COG1879 # Protein_GI_number: 15894732 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 69 347 45 325 325 175 36.0 1e-43 MKRKVLAALLTLTIAGASLAGCSGGKTGETTTAAQTQAETKAETKAEAKETAKESIAEVK ENGGKEHYKFGFAATTMNNPFFHAIQEAIEEVVDENGDELIVIDAQNDAQKQISMVEDLL TQGIDLLFLCPIDSASIKSSLEQCGKAGVPVVNFDTDVYDVDLVNTIIVSDNYYAGELVA EDMMKKLPEGSKVCILTSPSAEACIKRQNGFKDKADGYFKIVSEIDGKGDTATSLGIAED VLQGNPDLGAFYAINDPSAIGCVQALESQKKTDVLVYGVDGQPMGKQAISEGTMEATAAQ SPINIGKESVAAAYKILSGESVEKNILVPTFLIDKNNVTEYGLDGWQ >gi|157101629|gb|DS480695.1| GENE 29 37403 - 38287 582 294 aa, chain - ## HITS:1 COG:BH2675 KEGG:ns NR:ns ## COG: BH2675 COG1737 # Protein_GI_number: 15615238 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 18 293 15 286 287 143 32.0 4e-34 MGMNMNFEQVTPTIVLDIRSKMQDFNPAQKAVANYILNNFYDMEGMPIRDLAVESEVSEA TVSRFVKAMGYENYRAFQLEVTKSKAERRKGSLKGYAEVTEEDNAPGICRKIFESNIQAL QDTLTVIDYDALERAADLIARARRLCIFAQGRSKVTANSIRLRLHRLGIEGTLYTDPHEQ AIASSLMGPEDVAVGISTFGRSRSILVSIRRAAARGSSIIGVTSYQNTPLEKEADILLRT VNNDEADFGSEPSCASVTQMVMLDCLYMLVAKRMEGKAEERFKITRDAIQQEKE >gi|157101629|gb|DS480695.1| GENE 30 38561 - 39661 822 366 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01688 NR:ns ## KEGG: EUBELI_01688 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 363 1 349 349 149 26.0 1e-34 MKRVAMWFFLSCKRYMRKWSFLFILLLLPAGVLAAGQAQKKGSQDISIAISVEESGENSL GDALADSLVNRPRGEDAGMFRFYLCRDEEEVKAEVASRRAECGYVIYAGLEEKLDTKSYK RSIGVYSAPSTVASSLSTETVFAALMKIYDGKLLADYVAENELFDPLGAPGGEAREEAAV QAKELYTKWLESGSTFRFEYQFQGRDGEEIDVASSRTDVFPVRGLVAVYVFITGLYGAVV MCGDEERGLFLPLSYGYRIPCRVASMAAPAVMVSISGLLALWAGGVMTSFPREAAAMAGY CCVVIASAWILRLVCRRPQVLCCIIPFLVIGSLVFCPVFVDAGRFFPGLDQVGRLFPPWY YLQMFR >gi|157101629|gb|DS480695.1| GENE 31 39658 - 40845 1227 395 
aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01689 NR:ns ## KEGG: EUBELI_01689 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 353 1 382 426 131 27.0 6e-29 MSARLVYLRLELKRACKKLPHIFAGAIVLLLLAGTIALLSARMLYGETSAGRITVGVVLP EGDAVAKKALAMVSSLESVKSICDFEYMNKEECLDKLKAGKLYAVMEVPEGFVQDIMNGT NTPVKILLPKGAGLESRIFKELTDAGASTLGASQAGIYAGDELLYLYGMADSVSSLEADL NRIYMGYSLPRMDYFKNVRVSATGDVDTIHFYGISTAVLFLLLCAIPVSAYLASGSASMK GQLALIGVGKGTLVAARILGVGMLFLSVALCTALGASFAGLIEFSLPGLAALVMVCLGAA SLVVFLYQAAGSLMGGVMLLFLAVTAMHFLSGGFLPLVFLPTTFRAAAPFLPTYVLMEGM KLVVTSSFSLAVFIKLAVLAMAGFLLSVGAEVVRE >gi|157101629|gb|DS480695.1| GENE 32 40869 - 42854 2514 661 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940534|ref|ZP_02087878.1| ## NR: gi|160940534|ref|ZP_02087878.1| hypothetical protein CLOBOL_05429 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05429 [Clostridium bolteae ATCC BAA-613] # 1 661 1 661 661 1119 100.0 0 MKCTKCGAEQEADVKFCTQCGAPMEHEDQTPENNGPEPADKEPASGAEAVKSEEPASGAE AVKTEEPASGAETVKAEEPADEAEAVKAEEPAAEAENVKTEEPAPAAVTVPEEPQPNKKF IKILAAVIAFAAVIAVAALAYVKMTTKDPKQVVIDAFENVFPEGQVYPSEELFGLKDFAE TVKTADSQGGLTLKMDSCSDATVNAYAGSGLRFEAKDDKTNGRSFFNMGVIYSGMDLANL NAYYGDDTLMLAVPELSAKVFTVDLGEGLADRIKNSPTVGPLLEQNGVDVEGMAVYFTEL MDEAEKAQTEGRQPFDVEALINRYKEGCKAQENFKAALTVEKAAKGTYTIDGAQVSCKGY NVTVSKDSMIEFLRQSSDFFLQDETLKADFMRQLETTVKMSELMGGTMSGTGTMSAEEMQ QQSYEEAKKMVDQMIEYLDKALTDVNMTVYVDKKGRLAAVEGSTNLYVEDTDVSEEGYIA LTFSCQLQGGAYLTQNALASITLEDATDTVSIDMVKQGSYDKSVLTSDLSLDLTVPGDET YNFTYTSTYDSKDGSYHLSGEVGGNGSQLVKISAEGAVDQLEKGKSVHVNIDSLETSIMD DSVNVVLSGEYYYQPLEGEISPLEGETMDVLSATEEDWTNVGMEMLFGVMGLGSQMGVSM Y >gi|157101629|gb|DS480695.1| GENE 33 42930 - 43616 247 228 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 226 7 233 318 99 33 5e-20 MIAAMLEISGIYKSYHRQNILAGADLTVAPGECVGIVGYNGCGKTTLLSILAGAQKADRG SILYNGREASGHPRVFAEEAAYVPQENPLMEELTVRDNLLLWYRGSRKQMESDLKDGAAS 
MLGVDRMLDRTAGKLSGGMKKRLSIACALSNHAPVLIMDEPGAALDLECKEIIRNYLREY MASGGAVILTSHELAELALCTRMCILKGGVLREIECGLSAKELISQFR >gi|157101629|gb|DS480695.1| GENE 34 43922 - 44125 182 67 aa, chain + ## HITS:1 COG:no KEGG:Closa_1532 NR:ns ## KEGG: Closa_1532 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 65 1 65 67 80 87.0 1e-14 MAKAGMRRPSPYDADNHGTENHHKMNHPKNEQAQVPEIQGAAKNGNAKAGPINAPNWARD DYKTGEH >gi|157101629|gb|DS480695.1| GENE 35 44230 - 44811 620 193 aa, chain - ## HITS:1 COG:CC3650 KEGG:ns NR:ns ## COG: CC3650 COG0494 # Protein_GI_number: 16127880 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Caulobacter vibrioides # 17 186 6 175 187 110 36.0 2e-24 MKKADNKDEKLNRNLSWKILDETTVIKDSWIDMRASTCQLPGGMVIAPFYVNHLPDFAVV VAVTPEHRVIMVRQYRHGVEKVLLELPAGCIEAGEDPKDGAARELLEETGYKAGSLEFLF KIAPNASNCTSYAQCYLARNVFPAAPQNLDETEALEVVELEGEEVKRLLREGGVEQAVHV AALYRAAELGALG >gi|157101629|gb|DS480695.1| GENE 36 44865 - 45053 279 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940539|ref|ZP_02087883.1| ## NR: gi|160940539|ref|ZP_02087883.1| hypothetical protein CLOBOL_05434 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05434 [Clostridium bolteae ATCC BAA-613] # 1 62 1 62 62 104 100.0 2e-21 MIKDAYVQYQSRKAAKDLFDAMELLPGRVKMERDVHYIDDKTAAMNLHLVLMMAALEDGL WQ >gi|157101629|gb|DS480695.1| GENE 37 45108 - 46049 756 313 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c23130 NR:ns ## KEGG: CLJU_c23130 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 1 307 1 320 324 293 50.0 6e-78 MKKALIYGILASFFFAFTFILNRSMNLAGGYWMWAASLRYLFTFPILWIMLAGSPSAGRV WSAVKEAPVSWLVWSTVGFGLFYAPLTFGSVLGESWLAAATWQLTIVAGVLLTPLFGKRI PIRNLGWSVLVLAGVFMMQVPHLKRMEVRTVLFTLAPILVAAFSYPLGNRKMMALCPAGM TTLERVYGMTLCSMPFWIFMSGWAWVKAGPAGRGQVFQSLCVALFSGVIATILFFKATDL 
VKENPRHLAVIEATQCGEVVFTLLGGILLLKDRVPEPAGLAGIAVIVAGMVGNSLSAGKQ CGESAVYPADSRQ >gi|157101629|gb|DS480695.1| GENE 38 46157 - 46513 574 118 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239625929|ref|ZP_04668960.1| ribosomal protein L20 [Clostridiales bacterium 1_7_47_FAA] # 1 118 1 118 118 225 96 6e-58 MARIKGGLNAKKKHNRVLKMAKGYRGARSKQYRIAKQSVMRALTSSYAGRKERKRQFRQL WIARINAAARINGLSYSKFMYGLKLAEVDLNRKVLSEMAINDAEGFAKLAELAKSKIA >gi|157101629|gb|DS480695.1| GENE 39 46542 - 46739 242 65 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|228581564|ref|YP_002852265.1| ribosomal protein L35 [Clostridium sp. 7_2_43FAA] # 1 65 1 65 65 97 73 2e-19 MPKLKTSRAAAKRFKKTGTGKLVRNKAYKSHILTKKSTKRKRNLRKDIVTDATNSKVMKK ILPYL >gi|157101629|gb|DS480695.1| GENE 40 46769 - 47263 404 164 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 [Haemophilus parasuis 29755] # 9 164 2 158 159 160 50 3e-38 MINEQIRDKEVRLIGEDGEQLGIVPGKEALRMAQEAELDLVKIAPTAKPPVCKIIDYGKY RYELARKEKEAKKKQKVIEIKEVRLSPNIDTNDLNTKVGAARKFLEKGDKVKVTLRFRGR EMAHMFKSKYILDDFAESLKDIAVIDKPSKVEGRSMVMFLTAKR >gi|157101629|gb|DS480695.1| GENE 41 47701 - 48015 454 104 aa, chain - ## HITS:1 COG:no KEGG:Closa_2298 NR:ns ## KEGG: Closa_2298 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 104 1 104 104 111 64.0 1e-23 MTKLECSVKNCLHNADNCCCKAAIAVDGNKAKTAEQTCCASFDENKGGKFTNLFKTPETR LEIACDAVKCVYNEEHRCSAERIDISGDGACECTETRCSTFKAR >gi|157101629|gb|DS480695.1| GENE 42 48077 - 49696 1611 539 aa, chain - ## HITS:1 COG:no KEGG:HRM2_29940 NR:ns ## KEGG: HRM2_29940 # Name: not_defined # Def: hypothetical protein # Organism: D.autotrophicum # Pathway: not_defined # 36 418 34 411 678 199 30.0 4e-49 MKRPCMKAAVILGAAAFLFTSGFARAEDHAAGTAGYLKSATYYSDDWVINFWNSESGNMD QELARIAQDGFNNIILVVPWREFQPGMTPCTYNQYAWDKLDRVMDAAAAQGLSVMLRVGY TWDYCGGGNVMDRYQGLMQEGTERSAWLEYVGRLYQAASSHENFCGGFLTWEDFWNFTDS SASLGNGVKGRSMAKAFGYVDYAMENYELEELEEMYGHELASYDDLYFPARDSQARSVFY 
GFYDQFLNELLADSQTVFPGLSMEVRLDVDPVEHKDGSQEGFMHSSTFPSGSAAYTSAMY GIPMGFMNEHERVTAAEALEKIPVFLNRLHVYSGGKPVYLDQFLFTDNTVGYEHNAQLRD DEKSLYLDGVAPILKNSTMGYGIWTYRDYGDNKLFNAQFALGMDGWRFSGGSYIAEDETG KSAMIPSGGNIYQNLGGRMTGNTGKDTYVKLRTGAEEKCRLTIRMGSQTRTVDVKEERKV ELKFSNCQPSDVTISVSGGKGAYVDDIQVYTFVTRGELYNMDGSEGKCIEALRLMNQKL >gi|157101629|gb|DS480695.1| GENE 43 49711 - 51099 1612 462 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 22 439 16 430 448 263 34.0 6e-70 MDKQSDNQKKNDFSRGSVVGNIMKLAVPMTLAQLINVLYNIVDRIYIGMIPENATLSLTG LGLCLPIISIVIAFANLFGMGGAPLCSIERGRGNEKEAEAIMGNSFVMMVAVGILLTVLG LMFKRPMLYLFGASDATIPYADAYVTIYLMGNIFVMTGLGMNSFINSQGFGTVGMMTVLL GAVTNIVLDPVFIFVFHMGVRGAALATILSQMLSALWILAFLTGRRTILKLRPSAMKLDA GRLLRIVGLGMSGFTMSITNSLVQIMYNSMLQKFGGDLYVGIMTVINSVREVISMPVNGL TNSAQPVLGYNYGAREYKRVQKAIVFMSVVCILYTVVAWGFVDWFPAFFIRIFNREGDMV EAGIPAIRMYFFGFFMMSLQFAGQSVFVGLGKSRNAVFFSIFRKVIIVIPLIIILPTVFG MGTQGILMAEPISNFIGGAACFGTMLLTVWPELKRGGEKNHS >gi|157101629|gb|DS480695.1| GENE 44 51201 - 51791 438 196 aa, chain - ## HITS:1 COG:SSO2432 KEGG:ns NR:ns ## COG: SSO2432 COG2068 # Protein_GI_number: 15899180 # Func_class: R General function prediction only # Function: Uncharacterized MobA-related protein # Organism: Sulfolobus solfataricus # 1 193 1 181 189 78 30.0 1e-14 MRISFIYMASGFGSRFGSNKLLVPFKGKELYRHGLDCICQAAGELEKEGHQTEVLVVSQY EAILKQAESRGLLAVPNRFSGEGITASLRLGTSSASPETEALLFSVADQPYMSTPTLKEF IHGFRLSGKGMGCVCHQGRRGNPAVFSSFYRKELMALRGDRGGSVIMKAHPGDVWTMEVP EEELKDIDRTEDLGAV >gi|157101629|gb|DS480695.1| GENE 45 51808 - 52677 652 289 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1491 NR:ns ## KEGG: Cphy_1491 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 40 182 31 173 247 121 42.0 4e-26 MRAYFYQCGIWTGTDSLWHGLGLNGRIKQARSGKGPGPFVVSLAGAGGKTSTIRRLAWEA 
VGRGLKVLVVTTTHMARPGAFGVFDGNAEEIRTVLERRSLAVAGRPAEKGKITFTGWELY KEACSLADLVLVEADGSRRLPLKVPRAGEPVIPDNTDMILCLNGLTSLGKRAEECCLRIE EAGALMKRYGRKMYEDSREQRGSGTDALELNAKHKTDWIIQKEDMMTLMKHGYLLPLRAA HPGTEVLPVFNQADTPQEAALAGEMLDCMGETSGIACGHLDRDGSARLF >gi|157101629|gb|DS480695.1| GENE 46 52977 - 53060 158 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAVIGVILVFGILSFDLYGVIRNNIQK >gi|157101629|gb|DS480695.1| GENE 47 53226 - 54257 1060 343 aa, chain - ## HITS:1 COG:Rv0376c KEGG:ns NR:ns ## COG: Rv0376c COG1975 # Protein_GI_number: 15607517 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Mycobacterium tuberculosis H37Rv # 15 335 7 348 380 92 26.0 9e-19 MKTLFTELRQLLEQGEEAVLVTIIASSGSTPRGTGSRMLVRKDGSIEGTVGGGAVEYQAI QAALKAMEDKVSFAKGFTLTRNQVADIGMVCGGNVVVYFQYIRPGDQVLTGFCGQVLDAL SKDEDSWLILDITDETCWQMGLYSPSLGLSGIKGLGQLLEGELFGSNAFQKEVDGRTFYI EPLVQAGTVYIFGGGHVAQELVPVLAHVGFRCVVMDDREAFANPQVFLRAERTVVGDMEH ISDYVDIRPRDYVCVMTRGHQFDYYVQKQAMALHPYYIGIMGSRNKIRVVTDKLLADGFS LEEIQKCHMPIGTDIGAETPAEIAISIAGELIARRAERMGKKK >gi|157101629|gb|DS480695.1| GENE 48 54254 - 55195 1046 313 aa, chain - ## HITS:1 COG:no KEGG:Cbei_1987 NR:ns ## KEGG: Cbei_1987 # Name: not_defined # Def: regulatory protein, LysR # Organism: C.beijerinckii # Pathway: not_defined # 3 305 9 309 309 222 35.0 2e-56 MKTGAVIIAAGHKSTISRFQPMLPVGDSTVIRRIIITLKRAGIDPVVVVTGQKADEVEKH IAGLRVICLRNQDYEQTQMYYSICMGLNYIEDLCDRVFVLPAKFPMFLPDTIQRMMGSRA MAACPVYDGKRGHPVLVSKAAIPSLLIYHGERGLRGALRQPEINGHLEEIPVEDEGIIMA VESDEDCALGSLGREKLAVYPQVQLTLERNEGFFGPQAAQFLSLIDHTGSMQTACRQMHM SYTKGWKILKEAERQLGYPLLITQSGGAEGGFSQLTPKSKDFLDRYLRMERELRMEGERL YKKYFTGEEETES >gi|157101629|gb|DS480695.1| GENE 49 55243 - 56391 1054 382 aa, chain - ## HITS:1 COG:sll0739_2 KEGG:ns NR:ns ## COG: sll0739_2 COG1118 # Protein_GI_number: 16331977 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate/molybdate transport systems, ATPase component # Organism: 
Synechocystis # 9 303 38 311 395 249 45.0 8e-66 MNSEYNLHLEVDIRKKLKDFSLDISFEAGRGCLGILGPSGCGKSMTLKSIAGIIRPDQGR IALRYAQGEAAGGRVLYDSALKINEKPQVRRVGYLFQNYALFPGMTVEQNIMVGLKGGRG NVGLRRRRPMSGEARQQKVSEMVERFRLQGLEKRFPGQLSGGQQQRVALARSLAYEPEAL LLDEPFSAMDTYLREGLRMELADALKDYDGVTIMVTHDRDEAYQLCDNLLLMDRGCVLAA GRTRDIFQNPVTCRAARLTGCKNISRIERLGERRIRAVDWDGLELATDRPVGDAITGVGI RAHDFEPLSESSARAYMEREDANLIKVCAPCISEMPFEWYITLQNGLWWKKEKDIHTHDT AGVVPRWLRVEPSALLLLTGEL >gi|157101629|gb|DS480695.1| GENE 50 56405 - 57076 781 223 aa, chain - ## HITS:1 COG:alr2433_1 KEGG:ns NR:ns ## COG: alr2433_1 COG4149 # Protein_GI_number: 17229925 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, permease component # Organism: Nostoc sp. PCC 7120 # 1 217 3 218 223 179 48.0 4e-45 MDWSPILISMKTASLSIFITFFLGIFAAWAVVSMKNETWKLIWDGVLTLPLVLPPTVAGF FLLYMFGVKRPVGQFFIEYFSVKIAFSWTATVIAAVVMSFPLMYRSARGAFEQVDQNLVA AARTLGMSEWGIFWKVLMTGGLPGVVSGGVLAFARGLGEFGATAMIAGNIAGKTRTLPMA VYSEVAAGNMDTAYRYVAVIVVIAFLSIILMNWAALRPVNRKG >gi|157101629|gb|DS480695.1| GENE 51 57098 - 57982 1361 294 aa, chain - ## HITS:1 COG:sll0738 KEGG:ns NR:ns ## COG: sll0738 COG0725 # Protein_GI_number: 16331978 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, periplasmic component # Organism: Synechocystis # 52 294 29 270 270 174 42.0 2e-43 MKKQICMMMAAMMAAAALTGCSGGAKETAAATEAETTQATATAAETEAETSARETTAAQA GESTEILVAAAASLKNAYEDELIPMFQEQYPGVTVKGTYDSSGKLQTQIEEGLEADVFMS AAAKQMTALDEEGMIESDTITSLLENKIVLIVPTGSSAGIEKFEDIEKAETIALGDPASV PAGQYAEEALTSLGIWDKIQDKVSFGTNVTEVLNQVAASSADAGIVYATDAASMADKVEV VAEAPEGSLAKKVIYPVAVVKNTAHPEEAKNFVEFLKTDEAMKVFEEYGFTKGE >gi|157101629|gb|DS480695.1| GENE 52 58229 - 59197 255 322 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|134277849|ref|ZP_01764564.1| ribosomal protein S16 [Burkholderia pseudomallei 305] # 165 304 4 149 194 102 42 6e-21 MSEQIKQGVCPTGVVKAICISDKRGIEKRAIEEGHFLVDFGIEGDAHAGHWHRQVSLLSY 
DKVMAFNERGANVIDGAFGENLVVEGIDFRSLPVGTRLYAGDVQLEMTQIGKECHSHCAI YKRMGECIMPKEGVFARVIREGIIRPGDVMRVEPPAEERPFTAAVITLSDKGARGERKDE SGPAAKEMLEAAGYEVVELLLLPDEPGQLKTQLIRLADSRQVDLVLTSGGTGFSLRDQTP EATMAVAERNAPGIAEFIRMKSMEVTDRAMLSRGVSVIRGTTLIVNLPGSPKAVKESLGF ILGSLDHGLKILRGSASECARR >gi|157101629|gb|DS480695.1| GENE 53 59203 - 60261 1004 352 aa, chain - ## HITS:1 COG:lin1039 KEGG:ns NR:ns ## COG: lin1039 COG2896 # Protein_GI_number: 16800108 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Listeria innocua # 1 285 4 284 333 196 37.0 6e-50 MKDALGRTIDYMRISITDRCNLRCKYCMPYGVECVPRWDILSLEEIQAVGICAAGLGIRH IKVTGGEPLVRRDCCQLVKQLKSTPGIEKVTITTNGVLLERYLDDLAEAGIDGINISLDT LDRDLYQRITGTDALGTVMEAVRRACAMPVSVKINAVSIDFSQMDEKQREAGQDGWRQMV ELAREYPVDVRFIEMMPIGYGKNFKTINHQELLEEMMRAYPDMEEDDRVHGFGPAVYYRI PGFKGSIGLISAIHGKFCSDCNRVRLTSQGYLKPCLCYEDGVDLRRILRKGQERPEEEGH YRWPYAGCPSDEGLQRELRKAMEQAVLCKPAAHCFERPGQITEAHNMIAIGG >gi|157101629|gb|DS480695.1| GENE 54 60267 - 60761 604 164 aa, chain - ## HITS:1 COG:RSc0560 KEGG:ns NR:ns ## COG: RSc0560 COG0315 # Protein_GI_number: 17545279 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Ralstonia solanacearum # 3 154 4 155 158 171 61.0 7e-43 MEFTHFDDQGNALMVDVGDKAETKREAVARGSIFMSPECLEKVARGTMAKGDVLGVARVA GIMGAKRTSDLIPLCHILNLTKLTVDFTIKKESNEIEAVCTARTTGKTGVEMEALTGVSV ALLTVYDMCKAVDKSMEIGDIYLEHKSGGKSGEFHNPRCARNRG >gi|157101629|gb|DS480695.1| GENE 55 60786 - 61844 1287 352 aa, chain - ## HITS:1 COG:mlr0093_1 KEGG:ns NR:ns ## COG: mlr0093_1 COG0303 # Protein_GI_number: 13470396 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Mesorhizobium loti # 4 339 6 324 330 100 27.0 4e-21 MKLIQTVDAEGTVLCHDITQIIKGVTKDAVFRKGHVVTKEDIPVLLSVGKDQLYVWEKED GMLHENDAAEILCDMCKGQHMERSQAKEGKIELTAACDGLLKVDNRGLKAVNGFGQMMIA TRHGNFAVKKGDKLAGTRIIPLVIEEEKMEAAKKAAAEATGGCPILELKPFLHKKVGIVT 
TGNEVFYGRIKDTFTPVIRGKLSEFDTEVIDHVTWNDDDTKVTASILDMIQKGADVVVCT GGMSVDPDDKTPLAIRNTGADIVSYGAPVLPGAMFLLAYYQVKDGENPRTVAIMGLPGCV MYARRTIFDLVLPRIMADDQVTADDLAALGQGGLCLNCPECTFPNCGFGKGM >gi|157101629|gb|DS480695.1| GENE 56 62029 - 62679 497 216 aa, chain - ## HITS:1 COG:BH0751 KEGG:ns NR:ns ## COG: BH0751 COG2068 # Protein_GI_number: 15613314 # Func_class: R General function prediction only # Function: Uncharacterized MobA-related protein # Organism: Bacillus halodurans # 6 186 8 194 209 73 30.0 2e-13 MERTGAVILAAGLSSRMHEFKPLLELGDSTIIVRAIENLKTAGASPVIIVAGYKADQLMD YLWPLDIMFVRNENYASSQMFDSVKLGISRAAGLCSRILITPADVPLIEQDTFKQVMECP GALVRPVCGGRPGHPVRMDSRLVPDICAYKGAGGLKGAMEHLGVPITEPEVEDPGIYLDA DTPQDYMNLRYWNDYSQIQKQGPKMIHFRISIGITS >gi|157101629|gb|DS480695.1| GENE 57 62739 - 63149 483 136 aa, chain - ## HITS:1 COG:no KEGG:Acear_0330 NR:ns ## KEGG: Acear_0330 # Name: not_defined # Def: thioesterase superfamily protein # Organism: A.arabaticum # Pathway: not_defined # 1 89 1 89 131 79 38.0 6e-14 MRRMESDNCFVCGSLNPIGLHLDITEGEGWARALWTVEKPYVGYEGMLHGGIMASIMDDL MAHALYYTDLDVVTAHLELDYKAPVHVGERIECEAQVTEFGTGRSIRAQGTIKREGAVAA RAKGVMVIVKAPGQEV >gi|157101629|gb|DS480695.1| GENE 58 63168 - 64766 1612 532 aa, chain - ## HITS:1 COG:lin2275 KEGG:ns NR:ns ## COG: lin2275 COG4670 # Protein_GI_number: 16801339 # Func_class: I Lipid transport and metabolism # Function: Acyl CoA:acetate/3-ketoacid CoA transferase # Organism: Listeria innocua # 1 528 1 513 527 422 42.0 1e-117 MAKIITVKEAAELVQDGAVIGSAVQGMTGWPEEIGLAIENRFMETGHPSGITHIHGAGQG DFGRMSEDGKTCRGECALAHDGLLSCSIHGHVGCSFKVTKQIVDNKILAYNIPLGVVGQI WREMGRGFPGLLTKVGLGTFMDPRYDGAMINEKTKKEGKQIVTYIPDFLGEEYLFYTLPK LTVALLRATTADEDGNLTYEKECMPCEPLDLAMAAKAAGAVVIAQVERVAATGTLDPRNV KVPGILVDYICVAEHPDRMMQTHITHFNPAFTGEIRIPMKNSAAPLPLDDKKVFTRRTAM EIHGGDKCNLGIGMPGLVPNVLMEEGVDSQVTLISESGLIGGIPAPGGDFGAHYNPVAMY PQTDHFSFFDQGGLDVAVFGLSEVDRDGNVNTTFLNGRIAGVGGFPNISANARHSIFVGS FTAGGLKCHIEDGKMCIDQEGRFDKFVESCAQLSFNAQQCLAKGNRVTFITERCVIKRTA QGMVLAEVAPGIDIQTQILDHMGFKPIIPEGGPALMDAGLFSETWGRLGDHF 
>gi|157101629|gb|DS480695.1| GENE 59 64751 - 66820 1631 689 aa, chain - ## HITS:1 COG:MTH657 KEGG:ns NR:ns ## COG: MTH657 COG0318 # Protein_GI_number: 15678684 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Methanothermobacter thermautotrophicus # 88 629 7 545 548 409 40.0 1e-114 MEDLIDGKRRIEAVVCRDENGPGEASFLFDRNGCLLDFRGAVPCSEAFADMLSGICRNKP LVSGREYVEAMWEGRPYHVWLNRYIHLTLGELADIQARRFPDREAVVDSLAGVRLTYREV KRRSDGLAKGLMHIGILKGDHVAVIMDNCWENVVTKIAIEKTGAVIVNLNIHEKKDMLEC LLHRADVKAVILKQGIKNREHMDMFYQISPELKEAVPGRIYAPRLPLLRHVIVTDQERPR SCAWQFEKLMELGMSMGDSLLKERMKAVRPFDDATIIHTSGTSGVPKGVMLNHCQILENA WIHVQYLGLEKEDRLCMTPPMFHSFGCVGSVLSSMMAGAALVCYEKTDRICLLEMLRKER CTVLCSVPTVYIRLIREMREGKAGGEDLCLRLCVTAGAPCPEHTLRDMKRVMGAEAAVVM YGMTEAGPGISSTSMDDSLETAVSTVGRLWPGVTGRIQDLTTGRVLGPGQAGELCIKSYG VMKGYYNNPEETEKAVDREGWLHTGDIASLSEDGLLTLKGRCKDLIIRGGENISPREIED FIRNYEPVEDVAVVGAPDEQYGELVYAFIRPKEGAVVTKEGLRNWCRGKIATIKIPQEIE LTDHFPISATGKISKGQLRSLAREHLEGRGPRQTEPETGEGGLRQTEAEELRQAETESGG LRRSDRETAVSGQRAGETGKDGMDTWQKS >gi|157101629|gb|DS480695.1| GENE 60 66832 - 68079 1484 415 aa, chain - ## HITS:1 COG:lin2274 KEGG:ns NR:ns ## COG: lin2274 COG0477 # Protein_GI_number: 16801338 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Listeria innocua # 7 412 7 400 407 207 37.0 3e-53 MDLMKKRWAVLVASVIVNICIGTGFAWSVYQTGLFNEGAVIFGTEVQKSQLALAFTICSG VAPIPMIAGNGLQKKLKGPRNVVWLGGILFSAGLICTSFIHSLTMLYVTYGLLCGFGIAF AYGITIGNTVRFFPDKRGLIAGISTAAYGAGSIIFPPIMQSLIGSGGVMFTFRVLGILFG VMIIIAACFITEPPEGWLPTGWTPPAVKNSSGASAEGKNWKLMLADTRFYLMIIAFTIFA TGGLMVVSQGSPMAQAIGGVDAAVAATAVSIIGLANTGGRVLWGWVSDKVGRYPALSIMA VIVAVSGFALSAFTESGSLALFLVFAMLIAMCYGGSMGVYPALTADAFGIKYNGVNYGIM 
FIGFALGGYIGPILANSLYDNTGSYSVPLMAVGAMGVVALIILFVLTAMKKQAQR >gi|157101629|gb|DS480695.1| GENE 61 68238 - 70610 2660 790 aa, chain - ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 162 560 40 456 468 234 35.0 7e-61 MKYFEHESAATFDEAVSLLKESPKGKTVVMAGGSDLIGVLKEQILEDYPEKVVDLKTVRG GEYIKQDGDTIEIGALTKLCDIVKSDLLNEKAPVLSQAARSVATPLIRNVATMGGNICQD VRCWFYRYPHGIGGRMDCMRKGGKECYAVMGDNRYHSIFGGMKVHTTPCSVQCPANTDIP AYMERLRQGDVEGAAHILMEANPIPMITSRVCAHTCQEQCNRCGSDESVSIHGVERYVGD YILEHPDTFYRAPETETGHKVALVGAGPAGLSAAYYLRKAGHDVTVFDKMEEPGGMLTYA IPNYRLPKSYVKQVAAAYEKMGIRFRLGCCLGEDIQAEDLEKEYDNVFYATGAWKRPVLG FDGEEFTEFGLQFLMEVNQWMNKKDRRHVLVVGGGNVAMDVAITARRLGAESVTLACLES EPEMPASREEIARAREEGIEIMPSYGVSKAIYEGSQVTGMELMRCTSVKDENGRFNPRYD REETLRVSADSILMAAGQKVDLSFLGDKYGLALERGLIQVDKDTQATSKSGIYAGGDATT GPATVIQGVRSGRNAAEAINRGYAVMPERRREDKFIHFDTAGVKEEHAVKDKELSAAERA LDKEDSFTLTGEEAAREAGRCMNCGCYSVNASDISPVLILLDARIVTTKKTVRAADFFTT RLKAADMLDTDELVTAVRFRVPEGYTTAYDKFRVREAVDFAIVSLAYAYRMKDGLIEDAR IVLGGVAPVPMERKKVEAFLAGRKPDEALAEAAAELAVEGTAAMANNSYKIQEVRALIKK MILDMGAVQA >gi|157101629|gb|DS480695.1| GENE 62 70627 - 72948 2725 773 aa, chain - ## HITS:1 COG:SMb20132 KEGG:ns NR:ns ## COG: SMb20132 COG1529 # Protein_GI_number: 16263880 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Sinorhizobium meliloti # 7 770 9 763 772 288 30.0 3e-77 MSKLEYRDIKQPNYRHVGKYTPRKDAKDIVTGKAIFLDDFSVPHMLYGKTKRSPYPHARI LSIDTSRAEALPGVRAVITHKNMPEQWGLGLPVHRLLMEDKVYYVGDLVALIAADTVESA EEAIDLIDVEYEQLEAVYTAEDALKEGAVEIYSRFKGNHVDNGIKYFQPDGPWWQIIRGD VEKGFEEAAYVAEDKVAFDKMPSPLAPETPSCIAKYEGGNEYTIWASSQSSHILKLMGEG RIPNSNLSVKTFNVGGSYGNKQSLMTTVLSAAMLSLVTKQPVKIVLTKAEQLMAYEVRLG STITAKVGMDKEGYVKAVEGDWLVDTGAIADSIQGQVGVGLGEAQLVCAKCPNWYMDSHI 
AVTNKQAAGIVRGYGGQELNSCLERLMCAVMKKGNFDPLDVFKKNYIAPGDRFIWRDGRW WQSRSSLFFPEAMQNAADRFGWAGKWKGWNVPTCVNGTKARGVGMGVIGNADISEDNTEA IVRIVPDLVGSRRASTAIIECDITESGMGTRSNACKIVAEVLNVPVEKVSITEPGSKFNP SNYGLCGSRGTITTGKAVSLAAMEAKKKALELGALYFKRSVDQLDTKDFMVYVRDNPQLV VPMFKLAPKELSIVGYGKHMEMFNIPSCMAIFVEAEVDLETGNTKLVKVAGGTDIGQIID SKAVEMQLHGGFGSACIDTAIFEECILDPSTGRLLTSSLIDYKWRTFNEFPPYDAYIMES QIDSFMFKALGIGEISGAAGASAVLMAISNAIGVDIKDYPATPAVILKALGKA >gi|157101629|gb|DS480695.1| GENE 63 72965 - 73576 553 203 aa, chain - ## HITS:1 COG:BS_yurB KEGG:ns NR:ns ## COG: BS_yurB COG2080 # Protein_GI_number: 16080300 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Bacillus subtilis # 1 159 11 160 173 140 44.0 2e-33 MKKQYVTLTVNGREYRFSIGDGFGQVPPSETLSQTLRKRLQLTGSKESCSEGACGCCTVL MDGKGVTSCMLLTVDCDGKDIVTIEGLEDPEKGLDPLQQAFIDEYAFQCGYCTPGIIMAA KALLLENPHPTGDEIKEAMSGNYCRCISHYTVLRAINRVAGNEEDHLAVMHRANDDVENP IPVRQELFLSPYANGTNSDHSLD >gi|157101629|gb|DS480695.1| GENE 64 73817 - 74485 568 222 aa, chain - ## HITS:1 COG:no KEGG:Meso_4073 NR:ns ## KEGG: Meso_4073 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: Mesorhizobium_BNC1 # Pathway: not_defined # 1 212 15 231 245 70 27.0 5e-11 MEGNNSTKTKLIKAGIKLFSQYGYAATSTRMIASEAEVNLSAIAFHYSNKERLYVACLEY MLEKVKGYYAASYMEIEGTFKQDAMTPQKAYEFLEKLIDLQIEVAFAPQYKTTLALIYWE NNGPGDMRPLSAAAFDRQERVMAELIQTVAPVSQSQAIIASRHINGSIISFGEHRGFIED LIPKPLEGESVPLWIREEIKGNCLAIVQRLMKPELFPPYPAQ >gi|157101629|gb|DS480695.1| GENE 65 74506 - 75567 897 353 aa, chain - ## HITS:1 COG:aq_1990 KEGG:ns NR:ns ## COG: aq_1990 COG0406 # Protein_GI_number: 15606984 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Aquifex aeolicus # 3 150 39 188 212 65 28.0 1e-10 MELASWFKSRCVEAVYSSPLSRCMETAEYIAGEQQRVHVEEGLVELDAGEWENMEFTEIR SRYPQLYEERGRSIGTVPPPGGESFARGAERLQDALDRILGNTRGDVAVVTHCGVSRGLM CGILGWDINRVLEVPQPYGGISRILADDDGGFRGFYEQTGVKPLLYPDHGICRRLSDRYS 
VPEHIRLHEAAVARLAHQWAVRLKRLGYDIDPELLRTAGLLHDIARLKPDHARAGARILR MEGYPVMAGIIQCHHRLEGGEESGLTERTLLFLADKMTLEDRMVDIDERFRQSAPKCTTP QARENHRLQYNQAQAVRHCLESAMGSLEAADAFGTSPQASVDRKARIREWAAG >gi|157101629|gb|DS480695.1| GENE 66 75881 - 76501 549 206 aa, chain - ## HITS:1 COG:AGpA742 KEGG:ns NR:ns ## COG: AGpA742 COG2068 # Protein_GI_number: 16119729 # Func_class: R General function prediction only # Function: Uncharacterized MobA-related protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 184 33 219 230 73 33.0 2e-13 MKKTGAVLAAAGLSSRMGDFKPLLPFGDTTIAFHVTSMLRRLGADPVLVVTGYRAEELED HLKGTGACFVRNTRYRHTQMFDSVKLGIQAALEGCERILVMPMDLPAITDDIIKQVMEAP GQIVRTVHDGEPGHPICMEGEIARRICTYTGDQGLKGAIEASGVPVEDLEAGNESIYLDV DTREEYRELLRWVCRQESSQRKCKMH >gi|157101629|gb|DS480695.1| GENE 67 76671 - 77384 711 237 aa, chain - ## HITS:1 COG:alr4392 KEGG:ns NR:ns ## COG: alr4392 COG0664 # Protein_GI_number: 17231884 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Nostoc sp. 
PCC 7120 # 38 213 26 203 223 66 26.0 5e-11 MVGQECNTAVQMPHANFVSRLLYLPRGISRLEKLGQRKVFPKNHMLVQAGTKPNYCYIVK TGRVAAFETTVTGEERIYNFNESNSVLLEGNLLFDQESAVNFKTVQPSELVCITKEMLLK GIAHDPQLSMDIMESLSTKFLSAMEQVRHTNFHNAGWKICDLLLIFAEHYGVPYDGKILI KEKISQQLLSNLLGINRVTAVRAIKDLKEMALIEQINGYYCIRSLEKLKRHQERMCQ >gi|157101629|gb|DS480695.1| GENE 68 77618 - 78505 716 295 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c15750 NR:ns ## KEGG: CLJU_c15750 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 1 292 1 280 281 232 42.0 1e-59 METVIILGAGQFGRGAARLLNQENMALLAFGDNNPALHQLTESERTEKGFPANVPVLPVD EALSLKPDYIITGVTDPLRSGQLKDQALELGYEGRFLMLSQLYRYFDIRNATLKRLAERI HGQGLQESIAELGVFKGDTAWKLNALFPQQRLYLFDTFQGFDPRDIKEEKSKGCSFAREG EFSDTSEQAVLGRLPFPGQAVIRRGYFPDTAAGLEQERFCLVSLDADLYAPILSGLIFFY PRLVPGGMILLHDYNNERFRGARQAVEEFEKQYGRLCLVPLCDLHGSAVIVKPCD >gi|157101629|gb|DS480695.1| GENE 69 78618 - 79505 789 295 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c15750 NR:ns ## KEGG: CLJU_c15750 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 1 293 1 279 281 173 32.0 7e-42 MKTVVILSTDNLGMTVADMLNPREMKLVGMGDTRQETWNVFSDLEKGELKEEIEGMPIMP ADLAVALQPDVLVIAATDSEKSHALEYMAIRAGFLNDIVFIRDLHEQFSIRCSVLRRLCR RLTGLGVEGNVAELGCYRGDTSWQLNVLMPDRKLYLFDTFEGFDERDVDMERKLGCSDAQ TGLYSGTDKEKLMERMPLPGQVVIRKGWFPETAFDIEDETFCLVCMDACLYQPTLAGLEF FFPRMGRGGVILLSGCSGTRYRGVAKAVEDLETRYGALLMLPVGDLDGTVMIVHP >gi|157101629|gb|DS480695.1| GENE 70 79922 - 80452 471 176 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940581|ref|ZP_02087925.1| ## NR: gi|160940581|ref|ZP_02087925.1| hypothetical protein CLOBOL_05476 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05476 [Clostridium bolteae ATCC BAA-613] # 1 176 1 176 176 313 100.0 3e-84 MNFYKNHFGMIISSVVAICISLIMATSAIFVDKLTFTLPLLIKNWGTAFLVISLTGMAFP LTDWSFALCRKMGLRPETLPHVLVENFVATLFFNTTATIVLTAVNVFHNPEIEAAVAAGF LPNTLTAFVQGVLHDWPIMFIISYVFAFFVTKAAIRIAKQAVGELKSPHSPQNQFQ >gi|157101629|gb|DS480695.1| GENE 71 80552 - 81580 1030 342 aa, chain 
- ## HITS:1 COG:NMB0685 KEGG:ns NR:ns ## COG: NMB0685 COG4859 # Protein_GI_number: 15676583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis MC58 # 255 339 15 98 107 67 39.0 3e-11 MDRVSQAVKDLVEEIQRMPEAPDPAETDRHAFTLLLSGIAACRRAPGIPCHMGYRTLYRC GDEPASEELKAHLFRLYGIYDRESLEKVCMEQFTSGREYEQFMTFWCDAPLFDLEELEEK GRRAFETRFKRASLFRPYVGERGFYAWDINERIGLGRLACACGIIDRETFDELTDYQVRK AQVFYHTFKDYAVSCICGAVYDVPGGDEEDMLSFLDLNRKLALHLLEEGGAWYRNAWYAP EKREWVSLLPHNGGCIVSKQIEEGRAIGYMYRDSRPSEQWADTGWRFFAGDESDEYSRNP DNFTIWSLNDICNLDATILGYAEAKEGSAFGRNAKGEWQRER >gi|157101629|gb|DS480695.1| GENE 72 81756 - 82040 252 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940583|ref|ZP_02087927.1| ## NR: gi|160940583|ref|ZP_02087927.1| hypothetical protein CLOBOL_05478 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05478 [Clostridium bolteae ATCC BAA-613] # 1 94 27 120 120 168 100.0 2e-40 MGRQLKKMRLPNGVVAEERACGDYVCKYCGGKATACNFVADMCLQVYARHQGMCGAYIET EKYVGYVTDCSTCHYQGISGECRYEGRAAMGQGV >gi|157101629|gb|DS480695.1| GENE 73 82226 - 83695 884 489 aa, chain + ## HITS:1 COG:SPy2013 KEGG:ns NR:ns ## COG: SPy2013 COG3666 # Protein_GI_number: 15675797 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 4 386 5 376 401 442 53.0 1e-124 MMTQNADKKREQIQLFCMDDMVPQDHLLRIIDKAIDWSFIYELVEDKYSQDNGRPSMDPV MLIKIPFIQYLYGIKSMRQTVKEIEVNVAYRWFLGLDMLDKVPHFSTFGKNYTRRFKDTD LFEQIFAHILSECYKFKLVDPNEVFVDATHVKARANNKKMQKRIAHDEALFFEDLLKQEI NEDREAHGKRPLKEKEDDNNTPSGGAGGKEEKTVKASTSDPESGWFRKGEHKHVFAYAVQ TACDKNGWILSYSVHPGNNHDSRTFKSLYDKIKGIGIETLIADAGYKTPGIAKLLIDQGV KPLLPYKRPLTKEGFFKKYEYVYDEYYDCYICPNNQVLTYRTTNREGYREYKSCGSACAS CAYLAKCTQSKDHVKTVMRHIWEPYMEMCEEIRQTLRMKELYSQRKETIERIFGSAKENH GFRYTQMFGKARMEMKVGLTFACMNLKKLARMKAKWGAAHFTNFILNAIWLIKENWLWDT KPKTSLSTV >gi|157101629|gb|DS480695.1| GENE 74 83751 - 86114 2802 787 aa, chain - ## HITS:1 COG:SMb20132 KEGG:ns NR:ns ## COG: 
SMb20132 COG1529 # Protein_GI_number: 16263880 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Sinorhizobium meliloti # 26 777 25 763 772 441 36.0 1e-123 MHKDCDKHYFKKPEFYRLTGENNYVRIDAEDKVTGHGQYVGDIMFPDMLTGKMVRSPYAS AKILSIDTSEAEKLPGVKCILTARDFEWKSLVGNGEFAAEFADKEVLCSEKVRQVGDDVA AVAAVDEETAQRAADLIKVEYQVLPGVFDPFEAMEENAPEVNWEGKGIHNIGMQSVMKAG TDIDEEFDRASYVQHRDYKTHRMVHAAMEPHGAVATYRNGTYTIWMSTQMSFVDQFWYAR CLGVGENQVRVIKPLVGGGFGGKLDSYSFGLCAAKMAEMTGRPVRMILSREEVFQTTRNR HPIYMHIDTAFGTDGKLLAKKCYHVLDGGPYGGSGVAACAQSTLWANFPYKMNSVDFLAR RVYTNNPSAGAMRGYTACQVHFAHDLNMQFAADQMGIDPVEFRKISAADPGYVAPAGLAI TSCAYKETLDTAAKEIGWYEKKDKLKKGEGIGFAGTGFVSGTGFAVLEAPNQSSACVTLR MNKRGMATLYIGSHDIGQGSDTVMTAIVAEELGLPMDMVKTFMSDTFLTPWDSGSYGSRV TFLAGNAARRAAVDAKRQLFEVIAPMWGVMPETLECLDGKVISKEKAEYQMSIGDAMFKY MTVKGGDELIGVGSYYHRTDNSQYNGNNTTNYAPAYSFSTGAAHLTVDEETGVLDIDEFV FAHDCGRALNKRAVEGQLEGSIGMGLGYAVYEHNVTREGKILNPNFRDYRLPTALDMPKM RTFYDFTPDEEGPLGAKEAGEGSAAPVAPAIANAVNMATGVYFTELPLDPEHIWRALHGM KDDRNSK >gi|157101629|gb|DS480695.1| GENE 75 86107 - 86598 519 163 aa, chain - ## HITS:1 COG:SSO2637 KEGG:ns NR:ns ## COG: SSO2637 COG2080 # Protein_GI_number: 15899363 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sulfolobus solfataricus # 2 154 9 161 163 161 50.0 6e-40 MKHSISLIINGDPVDAIVKDNLTLLDFLRDQLFLTGTKKGCEEGECGACTVMLDGKPVNS CCTLAVECDGHEIITVEGIAREGMLHPIQKQFIEKWAMQCGYCTPGMIMSAKALLDVNKH PTELEIREAIEGNLCRCTGYAKIVEAIQAAAAQMNWEEEAKNA >gi|157101629|gb|DS480695.1| GENE 76 86612 - 87481 860 289 aa, chain - ## HITS:1 COG:SMb20130 KEGG:ns NR:ns ## COG: SMb20130 COG1319 # Protein_GI_number: 16263878 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Sinorhizobium meliloti # 6 285 4 283 287 232 46.0 6e-61 
MVLPQFEYLAPKTIGEACNLFLELGSTARVMAGATDLIPPMKDKVISPEYIIDLKKIPGL DYLEYDDREGLKIGALTTLRTIETSPLVKEKNPAVAHAAKVVASTQIRTKGTMAGNICNA SPSCDTAPNLLAQGAKILVQGPNKDRVIQIEDFFLGVKKTSLEPGEIVTGIVIPPLAENE RAAYIKHAVRKAMDLAIIGVAVKIKVEDGVCTDARIALGAVAATPVRAPGAEEALIGKEL TDEVIVKASEEAMNSCHPISDIRASAEYRKDMIRVFTKRAIKQAMECYN >gi|157101629|gb|DS480695.1| GENE 77 87779 - 89251 1291 490 aa, chain - ## HITS:1 COG:BS_deaD KEGG:ns NR:ns ## COG: BS_deaD COG0513 # Protein_GI_number: 16080962 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Bacillus subtilis # 13 481 4 474 479 430 47.0 1e-120 MENNQKTEPDRCFGGYGLSGEITETLALLGYEHPTRVQEEVLPHVLAGRNVVVRSQTGTG KTAAFAIPVCEQIVWEENLPQALVLEPSRELAVQVSQEIFRIGRKKRLKVPAVFGGFPID KQIQTLRQKTHIVVGTPGRVMDHVRRGSLKLTGVRYLVIDEADLMLDMGFMDEVRQIVGL LPEKRQTLLFSATLDEDVKNLAEETVKDAVSVFLEQEGETPLPIRQSVYGVAQDQKYTAF KRVLMKENPERAMIFCGTREMVQVLFQKLRRDRIFCGMIHGDMEQKERLKNVDAFRRGGF RYLIATDVAARGIDFEDVGLVVNYDFPMGRETYVHRIGRTGRNGKAGRAASLVTEDEIRT LHKAEAYVGVPLPVLPCPETEEEEEKAFWALQRKKPDPGPHKGEKLDRSITRLSIGGGRK SKMRSGDIVGAVCSIEGVDAEDIGIIDIRESLTFVEILNRKGSQVLAGLQDRTIKGKIRK VRITSQNNGV >gi|157101629|gb|DS480695.1| GENE 78 89280 - 90026 689 248 aa, chain - ## HITS:1 COG:lin0467 KEGG:ns NR:ns ## COG: lin0467 COG3619 # Protein_GI_number: 16799543 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 29 242 6 217 221 176 46.0 3e-44 MSWKNQIYQIPNLGKAAGEMEKSMGTRGQMSEAVSTGIFLTLSGGFQDAYTYYCRGNVFA NAQTGNIVLMGSHLAAREWNLAVRYLAPVLAFAGGVYMAEHVKRLHRQDRRFHWRQLIVA IEIILLFVVGFLPQDMNMAANIIVSFVCALQVNAFRKIKGSPYATTMCIGNLRSATENLY WYRHTGKKDHRDKCIRYYGVIGIFAVGAVMGSVLTAYLRERTIWVSCGLLLVSFLIMFVK EDIEGGRR >gi|157101629|gb|DS480695.1| GENE 79 90080 - 90577 455 165 aa, chain - ## HITS:1 COG:TP0511 KEGG:ns NR:ns ## COG: TP0511 COG1329 # Protein_GI_number: 15639502 # Func_class: K Transcription # Function: Transcriptional regulators, similar to M. 
xanthus CarD # Organism: Treponema pallidum # 2 165 36 194 208 58 28.0 7e-09 MFKKGEYILYGTVGVCQVEGISKPDFSNNDKVYYSLVPKFDQDTTIYIPVDSPKVKMREI MTRQEAEQFILALPSVEGKQYANDKERPQAYRQILESGDCTQLASMIKEISEMEQNRRGK GKPLSIRERDGVKSARKLLFGELATALDICPEEIPDYITSQIGEA >gi|157101629|gb|DS480695.1| GENE 80 90910 - 91200 280 96 aa, chain + ## HITS:1 COG:no KEGG:Cbei_3087 NR:ns ## KEGG: Cbei_3087 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 92 5 97 103 129 72.0 4e-29 MSVVIIGGHDRMVSQYKKICRNYKCKAKVFTQMSASLDKQIGSPDLLVLFTNTVSHKMIR CALDEVGNGTEIVRCHTSSGTALTEILEEKCAVCAL >gi|157101629|gb|DS480695.1| GENE 81 91210 - 92058 853 282 aa, chain - ## HITS:1 COG:yqeB KEGG:ns NR:ns ## COG: yqeB COG1975 # Protein_GI_number: 16130777 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Escherichia coli K12 # 14 276 269 526 541 249 51.0 4e-66 MEGNGCKMTERAKQLVIVRGGGDIATGTIHRLYRCGYPVLVLESGYPSAIRRCVAFSEAV YDGTSEVEGITCIKASGYEEACAIMEQGAVPVMIDENCAVLEKVRPWALVDAILAKKNLG TTRDMADKTIGLGPGFTAGQDVDLVIETMRGHNLGRIIGNGSAAPNTGIPGVIGGYGAER VIHAPAKGIFFGKALIGDMVEQGQAIGVIVTGQGEVTVEASLTGLLRGIIKDGYPVVKGF KIADIDPRKSEYENCFTISDKARCIAGSVVEGLQMLEVKQGY >gi|157101629|gb|DS480695.1| GENE 82 92065 - 92454 413 129 aa, chain - ## HITS:1 COG:no KEGG:Shel_01590 NR:ns ## KEGG: Shel_01590 # Name: not_defined # Def: molybdenum-binding protein # Organism: S.heliotrinireducens # Pathway: not_defined # 10 111 13 114 114 89 49.0 5e-17 MDQNQLHYGISLKLYFEKKAFGPGMSVLLRGVESTGSLQGAAQAMNMAYSKAWKMLKESE KAWGFMLTERETGGRDGGGSTLTPQAVRLLEAYDAFMAETRQELDRLFEQHFSQEWVEEL KQSAEQADE >gi|157101629|gb|DS480695.1| GENE 83 92585 - 93508 1154 307 aa, chain - ## HITS:1 COG:CAC2378 KEGG:ns NR:ns ## COG: CAC2378 COG0329 # Protein_GI_number: 15895644 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # 
Organism: Clostridium acetobutylicum # 4 294 2 291 293 283 49.0 2e-76 MSHSIFEGSAAAVVTPMNGYGNLDFDAFGRLLEWMIREGTDAVVVNGTTGESATLDDKEK MAVIRYAVRQVDGRIPVIAGTGSNCTEHAVELSRKAQELGADALLQVTPYYNKASQEGLV RHFTQVADHVSIPILLYNVPSRTGVTIQPETYRILSEHPNIRGTKEASGNLSLIARTAAL CGQGFDIYSGNDDQTVPIMSLGGKGVVSVLANIMPKETHKMCRLYLDGDVKESRRMQLQL LEIMEALFWDVNPIPVKTALGLMGVCGDDLRLPLTPMEPEQRERLAEVMKRQGISLKKEL HNQTNLL >gi|157101629|gb|DS480695.1| GENE 84 93633 - 94481 398 282 aa, chain + ## HITS:1 COG:no KEGG:Closa_0221 NR:ns ## KEGG: Closa_0221 # Name: not_defined # Def: dipicolinate synthase subunit A # Organism: C.saccharolyticum # Pathway: not_defined # 1 276 1 276 277 287 51.0 3e-76 MHQFLILGGDSRQVCLNWLLSQSGSSTCFYYDRPSSPFSLREAMEDSHIILCPVPFTKDG KTIFSENSLPGLEIQTFTACLKNTHILFGGNIPSAVRERCDLLSIPCHDFMKMEDVAWKN AVATAEGAVAEAISLSPVNLYRSRCLVTGYGRCAAILADRLKGMGARVTVAGRNASRLVP AHCLGYDTCLLKELGAVIGESDFIFNTIPSLVLDAALANQVQENAAVIDIASAPGGVDFQ ALERRNIRARLCPGLPGRCSPLSSALILYEAVMEYIRDSQTQ >gi|157101629|gb|DS480695.1| GENE 85 94499 - 95083 469 194 aa, chain + ## HITS:1 COG:BS_spoVFB KEGG:ns NR:ns ## COG: BS_spoVFB COG0452 # Protein_GI_number: 16078737 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus subtilis # 3 191 4 192 200 193 49.0 2e-49 MELKGKHVGLAVTGSFCTFAKIEQEIRHLKETGAILHPVFSGSVQSTDSRFGDTGQFMDR ITSITEMKPILKIEEAEPIGPKGYLDVLLIAPCTGNTLAKLANGITDTPVLMAAKAHLRN GRPLVISVSTNDALGINLKNIGLLFNMKNIYFVPFGQDDPVKKPTSMIAHTGLIEDTLKE ALEGRQIQPVIQGL >gi|157101629|gb|DS480695.1| GENE 86 95386 - 96129 527 247 aa, chain + ## HITS:1 COG:no KEGG:bpr_I1000 NR:ns ## KEGG: bpr_I1000 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 246 1 246 248 327 63.0 3e-88 MELIKITHGNLEQEHICCAISNNKDIQVMTKKAWLKERLDEGLIFLKCNVRGKCFIEYIP AEYAWAPIEADGYMYINCLWVSGQFKGLGYSTLLLDECIRDSKEKGKKGLVILSSKKKMG YLSDPEYLKYKGFETADAAGPYFELMYLAFDRNADKPCFRHTAKNREDMPKGLVLYYTSQ CPFTAKYVPLLEKTAKERNVELQTVHIQTREDAQNAPAPFTTFSLFRDGELLTHEILSEK KFDKLIS 
>gi|157101629|gb|DS480695.1| GENE 87 96141 - 97148 788 335 aa, chain - ## HITS:1 COG:FN0161 KEGG:ns NR:ns ## COG: FN0161 COG3344 # Protein_GI_number: 19703506 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Fusobacterium nucleatum # 39 281 59 303 349 171 38.0 2e-42 MARTDLWKYCSAEQCREMMISFRLIGAGDWPPEKIMGCLYALSNHAEKHYQRVNIPKRSG GFRTLLVPDPLLKNVQRNLLHHVLDGFTVSDSAAAYRKGASVAANAGKHQGRKIVMKMDI EDFFGSITFPMVLHHAFPTAYFPPAVGVMLASLCCCHDRLPQGAPTSPAISNLVMKPFDE YMEAWCGEREIVYSRYCDDMTFSGVFDGSEVKGKVYGFLRSMGFEPNLRKTRILTRGTRQ VVTGLVVNERVRVPAPYRRRLRQEIYYCMKYGAKEHLTRTGRQAYLNDGDGGTRRYLESL LGKTAYVLLASGDGDAWFREAQVKLREMLGQDMCL >gi|157101629|gb|DS480695.1| GENE 88 97580 - 98413 923 277 aa, chain - ## HITS:1 COG:lin0761 KEGG:ns NR:ns ## COG: lin0761 COG0395 # Protein_GI_number: 16799835 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 4 276 7 291 291 173 35.0 3e-43 MNRKKNASTVLVYGVLIFITLICLFPIVWMCLISIKSTSESITGFNSLMVSNPTMANFKR LFEMIPVGQNAFNSVFTTIMGTITSLFFCALAGFAFAKYRFPGRKFLFYFVIATMLVPPE VGSVPLFIIMKKVGLINSLWSLVIPRIATAVGIFYMNQYITDVPDELVEAARIDGCSDFG IFTKIILPVIKPAMASWGAITLIARWNDFFWPLLYLRKQAKYTLMVTISLLPVSEGLSTP WPVILAGTTLVIIPIIILYLILEKFQKAGMMEGAVKG >gi|157101629|gb|DS480695.1| GENE 89 98428 - 99306 966 292 aa, chain - ## HITS:1 COG:lin0760 KEGG:ns NR:ns ## COG: lin0760 COG1175 # Protein_GI_number: 16799834 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 5 285 11 289 296 150 32.0 2e-36 MLKALKKYRAPYLFIMPFFILFLVFQLVPTIWTFYISLTNWKGIGTPDFCGFDNYRKMLI DDMFWESLGNTVVYWLTGLVFIICCALLIASLLNSRYLKSNSARAFFKTATFLPNICAAI AMGLIFRMLFDENVGLVNEVVTIFGGSKIPWLTSTTFSKIPVIILNVWRYTPWFTMILLS GLLNISKDYYEAATVDGANGWQQFWFITLPSLKNILFFCSVTLTVDMWKLFNESYILPGP GTSNSSLFQYMYESGFNVFNMGYASAIGVILILILIVISIIQFVVRHKQGEI >gi|157101629|gb|DS480695.1| GENE 90 99369 - 100808 1979 479 aa, 
chain - ## HITS:1 COG:BH0905 KEGG:ns NR:ns ## COG: BH0905 COG1653 # Protein_GI_number: 15613468 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 139 399 117 387 445 63 22.0 7e-10 MMKRKLAMALAVSMVVGSLAACGGKTDQAGTTAAPKAENAQGDQSGAADGGAEAADSEKT VYPAGTLVVYAMGNPQYRQQWFETWLENHKDIAPDVKIEFVQTEGTADIREKITMTALSG ATEDLPDAAMLDPVTIMDLAGAGLLKDETEYLTPLLDKMVDGATTDATIGGRIYALPDSV RPQVLFYNQDIFDKYGVDAEQMNTMEGYIEAGRQLKEKSNGEVYLSYIDPTSKTWRYWGR RGLMPSAGAKIWDENGEVVIGSDEGTQKALGALDTMNSEGLLLKTTIMEPALYDAINKQQ VATFCIGAFWDEFMRKNCEATKGQWRVMSAPAFEGVDKAGAPVSQYMAIIEKGDNPYSEL FRQMWYDFTFDTQAKEVWVNSMEEQNAPYSNPVSKEMLEDPFWKEPSDFYGGQSFREMEG KCLENGAANLVVTPQDAEADEIISAELEKYVAGNQSMSDAIANMDKNLKAKIGKAEISQ >gi|157101629|gb|DS480695.1| GENE 91 100842 - 101807 1199 321 aa, chain - ## HITS:1 COG:no KEGG:Closa_1289 NR:ns ## KEGG: Closa_1289 # Name: not_defined # Def: Uroporphyrinogen-III decarboxylase-like protein # Organism: C.saccharolyticum # Pathway: Porphyrin and chlorophyll metabolism [PATH:csh00860]; Metabolic pathways [PATH:csh01100]; Biosynthesis of secondary metabolites [PATH:csh01110] # 1 321 1 318 319 378 56.0 1e-103 MTKKERVTAAIRGEEVDKIPSGFSLHFPKESAFGDAAVKAHLKFFEESDTDILKIMNENL VPYMGEINNGADYSMVKEMTMEDGFMQDQVELVKKILAGCDRDAFILGTLHGITASSIHP LEKMDPNYTYDQVREKLCRLLREDEDTVLAGMKRIADVMCELARTYIELGVDGVYYAALG GETRFFTDEEFEKWIKPFDLQIMKAIKDAGGYCFLHICKDQLNMDRYKDYGPYADVVNWG VYEAPFSLEDGRKMFAGKTIMGGLPNRHGVLVDGPAEAVKEETRKVIREFGRTHFILGAD CTLATEQDMDLLKAAVEAARE >gi|157101629|gb|DS480695.1| GENE 92 102055 - 102702 679 215 aa, chain - ## HITS:1 COG:mlr7177 KEGG:ns NR:ns ## COG: mlr7177 COG1802 # Protein_GI_number: 13475978 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 8 202 42 244 264 77 27.0 1e-14 MKTKKTLKETVVEGIYREIEEGIYKPNDIIHEGEIMEKYAMSKSPVREALIELCKDNVLK SIPRVGYQVVPVTLKEILDLLEFRIDVETGNLRRLCQRITPEQIEDLRQLDTITKDNHER 
MVAIHWNRNTDFHLKLCEMGGNGYICDVIAAALQKSCSYISQYFQTAWKKNAESNSFYHR EIIEALVKHDTERAVEMLQKDILNVKEQIQENYSL >gi|157101629|gb|DS480695.1| GENE 93 102699 - 103694 626 331 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940609|ref|ZP_02087953.1| ## NR: gi|160940609|ref|ZP_02087953.1| hypothetical protein CLOBOL_05504 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05504 [Clostridium bolteae ATCC BAA-613] # 1 331 1 331 331 626 100.0 1e-178 YKPGILVLSVRPHEVAGVAEDTIKPYVKYRREQGGGGLLLLSFAPSPRPEWYTRILGDGV RTVCILPAMETVIGGLDVYSLAFNKITYDPGCPPGREQLDAVRQFLKPLGQVFECTPDES LRLLSMSITYHMLYLLCFGLEPHYGDESGQGGCARAAGILYALNRRRLGLPGLITPAAAP DGRQMDGRQMDGQQMDRLAWLEKGWYEGALAFARGAGGEAGNKAGPYDGREGGNGNGNGT RAGTAGNADIIAPRSFELHLLSLQLETREFLEAKLKDHATPGGMTEEAVSGFRQLWTGKD WMDGDVSMAYDISYRLAELVYKKGKALEKRR Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:11:48 2011 Seq name: gi|157101628|gb|DS480696.1| Clostridium bolteae ATCC BAA-613 Scfld_02_37 genomic scaffold, whole genome shotgun sequence Length of sequence - 90795 bp Number of predicted genes - 100, with homology - 97 Number of transcription units - 51, operones - 21 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1035 508 ## COG3464 Transposase and inactivated derivatives 2 2 Op 1 . + CDS 1440 - 2648 215 ## Closa_1149 hypothetical protein 3 2 Op 2 . + CDS 2629 - 3162 372 ## gi|160940613|ref|ZP_02087956.1| hypothetical protein CLOBOL_05507 + Term 3332 - 3365 1.4 + Prom 3228 - 3287 6.7 4 3 Op 1 . + CDS 3430 - 3837 248 ## gi|160940614|ref|ZP_02087957.1| hypothetical protein CLOBOL_05508 5 3 Op 2 . + CDS 3862 - 4572 633 ## gi|160940615|ref|ZP_02087958.1| hypothetical protein CLOBOL_05509 6 3 Op 3 . + CDS 4582 - 5496 776 ## gi|160940616|ref|ZP_02087959.1| hypothetical protein CLOBOL_05510 7 3 Op 4 . + CDS 5493 - 6629 767 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 8 3 Op 5 . 
+ CDS 6683 - 9649 1971 ## COG3451 Type IV secretory pathway, VirB4 components
+ Prom 9704 - 9763 4.1
9 4 Op 1 . + CDS 9968 - 10309 79 ## gi|160940619|ref|ZP_02087962.1| hypothetical protein CLOBOL_05513
10 4 Op 2 . + CDS 10278 - 10955 467 ## gi|160940620|ref|ZP_02087963.1| hypothetical protein CLOBOL_05514
11 5 Tu 1 . + CDS 11080 - 12183 355 ## gi|160940621|ref|ZP_02087964.1| hypothetical protein CLOBOL_05515
+ Prom 12186 - 12245 4.2
12 6 Op 1 . + CDS 12438 - 14120 927 ## COG3505 Type IV secretory pathway, VirD4 components
13 6 Op 2 . + CDS 14153 - 15145 305 ## PROTEIN SUPPORTED gi|20808441|ref|NP_623612.1| ribosomal protein S1
+ Term 15190 - 15233 -0.6
+ Prom 15161 - 15220 3.9
14 7 Op 1 . + CDS 15244 - 15897 419 ## Rumal_0715 hypothetical protein
15 7 Op 2 . + CDS 15881 - 16795 523 ## Rumal_0716 hypothetical protein
- Term 16799 - 16839 9.1
16 8 Tu 1 . - CDS 16861 - 17619 680 ## gi|160940627|ref|ZP_02087970.1| hypothetical protein CLOBOL_05521
- Prom 17661 - 17720 9.5
+ Prom 17640 - 17699 4.9
17 9 Tu 1 . + CDS 17750 - 18058 264 ## DSY2751 hypothetical protein
+ Term 18179 - 18207 -1.0
+ Prom 18182 - 18241 4.6
18 10 Op 1 . + CDS 18320 - 18724 348 ## Closa_2836 hypothetical protein
19 10 Op 2 . + CDS 18735 - 19226 288 ## gi|160940630|ref|ZP_02087973.1| hypothetical protein CLOBOL_05524
20 10 Op 3 . + CDS 19294 - 19866 280 ## gi|160940631|ref|ZP_02087974.1| hypothetical protein CLOBOL_05525
+ Term 19873 - 19916 6.0
21 11 Op 1 . + CDS 19956 - 21647 879 ## DSY4654 hypothetical protein
22 11 Op 2 . + CDS 21622 - 22092 172 ## gi|160940633|ref|ZP_02087976.1| hypothetical protein CLOBOL_05527
23 11 Op 3 . + CDS 22104 - 22211 92 ##
24 11 Op 4 . + CDS 22227 - 23093 745 ## HM1_0585 hypothetical protein
25 11 Op 5 . + CDS 23098 - 23883 665 ## LM5578_1890 hypothetical protein
26 11 Op 6 . + CDS 23876 - 25240 1032 ## COG4962 Flp pilus assembly protein, ATPase CpaF
27 11 Op 7 . + CDS 25240 - 26160 794 ## CKL_3887 hypothetical protein
28 11 Op 8 .
+ CDS 26170 - 27048 726 ## Ccel_2774 hypothetical protein + Prom 27149 - 27208 2.6 29 12 Op 1 . + CDS 27235 - 27456 272 ## gi|160940640|ref|ZP_02087983.1| hypothetical protein CLOBOL_05534 + Term 27464 - 27511 8.6 30 12 Op 2 . + CDS 27538 - 27885 154 ## gi|160940642|ref|ZP_02087985.1| hypothetical protein CLOBOL_05536 + Term 27904 - 27935 -1.0 + Prom 27917 - 27976 3.0 31 13 Op 1 . + CDS 28027 - 28431 251 ## TK90_2760 hypothetical protein 32 13 Op 2 . + CDS 28455 - 29012 205 ## gi|160940646|ref|ZP_02087989.1| hypothetical protein CLOBOL_05540 33 13 Op 3 . + CDS 29036 - 29626 305 ## gi|160940647|ref|ZP_02087990.1| hypothetical protein CLOBOL_05541 + Term 29643 - 29693 0.2 - Term 29634 - 29678 -0.2 34 14 Tu 1 . - CDS 29687 - 30733 206 ## PCC7424_4909 Miro domain protein - Prom 30923 - 30982 4.2 + Prom 30952 - 31011 7.2 35 15 Tu 1 . + CDS 31044 - 31466 367 ## gi|160940649|ref|ZP_02087992.1| hypothetical protein CLOBOL_05543 36 16 Op 1 . + CDS 31575 - 32426 430 ## gi|160940650|ref|ZP_02087993.1| hypothetical protein CLOBOL_05544 37 16 Op 2 . + CDS 32453 - 34249 1085 ## COG0714 MoxR-like ATPases 38 16 Op 3 . + CDS 34269 - 34532 168 ## gi|160940652|ref|ZP_02087995.1| hypothetical protein CLOBOL_05546 39 16 Op 4 . + CDS 34529 - 36607 1379 ## COG4548 Nitric oxide reductase activation protein 40 16 Op 5 . + CDS 36645 - 37307 315 ## gi|160940654|ref|ZP_02087997.1| hypothetical protein CLOBOL_05548 + Term 37317 - 37353 5.1 41 17 Op 1 . + CDS 37396 - 37905 321 ## COG2002 Regulators of stationary/sporulation gene expression 42 17 Op 2 . + CDS 37953 - 38219 70 ## gi|160940656|ref|ZP_02087999.1| hypothetical protein CLOBOL_05550 43 17 Op 3 . + CDS 38291 - 38758 348 ## gi|160940657|ref|ZP_02088000.1| hypothetical protein CLOBOL_05551 + Term 38903 - 38959 0.4 + Prom 38786 - 38845 6.4 44 18 Tu 1 . + CDS 39092 - 41119 1044 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) + Term 41142 - 41186 4.2 + Prom 41183 - 41242 4.9 45 19 Tu 1 . 
+ CDS 41266 - 42627 1019 ## COG1373 Predicted ATPase (AAA+ superfamily) + Term 42644 - 42704 5.1 + Prom 42647 - 42706 3.2 46 20 Tu 1 . + CDS 42735 - 43070 317 ## gi|160940660|ref|ZP_02088003.1| hypothetical protein CLOBOL_05554 + Term 43077 - 43130 6.7 - Term 43067 - 43116 12.6 47 21 Op 1 . - CDS 43120 - 43344 265 ## BLJ_1241 hypothetical protein 48 21 Op 2 . - CDS 43369 - 44001 554 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 44061 - 44120 2.2 49 22 Op 1 . - CDS 44188 - 44373 128 ## gi|160940663|ref|ZP_02088006.1| hypothetical protein CLOBOL_05557 50 22 Op 2 . - CDS 44366 - 44818 394 ## gi|160940664|ref|ZP_02088007.1| hypothetical protein CLOBOL_05558 - Prom 44890 - 44949 6.0 + Prom 45137 - 45196 5.3 51 23 Tu 1 . + CDS 45256 - 45606 162 ## EF2291 Cro/CI family transcriptional regulator 52 24 Tu 1 . - CDS 45981 - 46559 239 ## COG3344 Retron-type reverse transcriptase - Prom 46617 - 46676 5.4 53 25 Tu 1 . - CDS 46678 - 47445 534 ## COG0428 Predicted divalent heavy-metal cations transporter - Prom 47483 - 47542 5.8 - Term 47513 - 47554 1.1 54 26 Tu 1 . - CDS 47559 - 47765 157 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Prom 47979 - 48038 3.5 + Prom 47759 - 47818 1.9 55 27 Tu 1 . + CDS 47992 - 48144 63 ## - Term 47981 - 48023 13.3 56 28 Op 1 . - CDS 48088 - 48624 217 ## gi|160940670|ref|ZP_02088013.1| hypothetical protein CLOBOL_05564 57 28 Op 2 . - CDS 48663 - 48857 145 ## gi|160940671|ref|ZP_02088014.1| hypothetical protein CLOBOL_05565 58 28 Op 3 . - CDS 48872 - 49750 743 ## CKR_2158 hypothetical protein 59 28 Op 4 . - CDS 49761 - 50705 978 ## gi|160940673|ref|ZP_02088016.1| hypothetical protein CLOBOL_05567 60 28 Op 5 . - CDS 50721 - 52043 1292 ## gi|160940674|ref|ZP_02088017.1| hypothetical protein CLOBOL_05568 61 28 Op 6 . - CDS 52062 - 52868 987 ## gi|160940675|ref|ZP_02088018.1| hypothetical protein CLOBOL_05569 62 28 Op 7 . 
- CDS 52884 - 53819 814 ## gi|160940676|ref|ZP_02088019.1| hypothetical protein CLOBOL_05570 63 28 Op 8 . - CDS 53848 - 54768 785 ## gi|160940678|ref|ZP_02088021.1| hypothetical protein CLOBOL_05572 64 28 Op 9 . - CDS 54797 - 55567 1008 ## gi|160940680|ref|ZP_02088023.1| hypothetical protein CLOBOL_05574 65 28 Op 10 . - CDS 55599 - 55838 212 ## gi|160940681|ref|ZP_02088024.1| hypothetical protein CLOBOL_05575 - Prom 55871 - 55930 2.4 - Term 55856 - 55898 11.2 66 29 Tu 1 . - CDS 55933 - 58404 1571 ## COG2909 ATP-dependent transcriptional regulator - Prom 58435 - 58494 5.9 67 30 Tu 1 . - CDS 58649 - 59551 428 ## COG5433 Transposase - Prom 59653 - 59712 3.5 68 31 Tu 1 . - CDS 59934 - 60515 176 ## gi|160940684|ref|ZP_02088027.1| hypothetical protein CLOBOL_05578 - Prom 60556 - 60615 6.0 + Prom 60697 - 60756 6.3 69 32 Tu 1 . + CDS 60785 - 61591 403 ## ELI_4256 hypothetical protein + Term 61737 - 61777 1.9 - Term 61581 - 61624 9.1 70 33 Op 1 . - CDS 61635 - 62651 689 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 71 33 Op 2 . - CDS 62707 - 62886 102 ## ELI_0217 hypothetical protein - Prom 62922 - 62981 2.3 - Term 62938 - 62974 0.8 72 34 Tu 1 . - CDS 63027 - 63248 67 ## ELI_0217 hypothetical protein - Prom 63385 - 63444 4.8 - Term 63465 - 63507 3.4 73 35 Op 1 35/0.000 - CDS 63516 - 65258 256 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 74 35 Op 2 . - CDS 65259 - 67061 249 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P - Prom 67088 - 67147 5.8 75 36 Tu 1 . - CDS 67197 - 67766 197 ## COG1309 Transcriptional regulator - Prom 67844 - 67903 6.8 + Prom 67848 - 67907 8.1 76 37 Tu 1 . + CDS 68112 - 68723 249 ## COG0500 SAM-dependent methyltransferases - Term 68963 - 69022 13.1 77 38 Op 1 . - CDS 69194 - 69400 183 ## gi|160940694|ref|ZP_02088037.1| hypothetical protein CLOBOL_05588 - Prom 69476 - 69535 3.4 78 38 Op 2 . 
- CDS 69575 - 69736 143 ## EUBREC_0146 putative permease - Prom 69800 - 69859 3.1 79 39 Tu 1 . - CDS 69962 - 71989 1153 ## COG2217 Cation transport ATPase - Term 72056 - 72094 5.3 80 40 Tu 1 . - CDS 72112 - 72444 319 ## EUBELI_00577 hypothetical protein - Prom 72561 - 72620 5.9 81 41 Tu 1 . - CDS 72719 - 73063 213 ## EUBELI_00576 hypothetical protein - Prom 73146 - 73205 3.2 - Term 73198 - 73246 14.7 82 42 Op 1 . - CDS 73262 - 73417 173 ## 83 42 Op 2 . - CDS 73436 - 75424 1433 ## COG0370 Fe2+ transport system protein B - Prom 75482 - 75541 3.9 - Term 75485 - 75534 1.1 84 43 Tu 1 . - CDS 75545 - 75757 197 ## bpr_I2636 ferrous iron transport protein A FeoA1 - Prom 75837 - 75896 5.5 + Prom 75886 - 75945 9.8 85 44 Tu 1 . + CDS 75989 - 76354 241 ## COG1321 Mn-dependent transcriptional regulator + Term 76356 - 76399 9.6 - Term 76339 - 76391 13.2 86 45 Op 1 . - CDS 76393 - 76764 202 ## Closa_0593 hypothetical protein - Prom 76888 - 76947 3.2 87 45 Op 2 . - CDS 76967 - 77398 356 ## COG0716 Flavodoxins - Prom 77441 - 77500 8.0 + Prom 77871 - 77930 4.7 88 46 Tu 1 . + CDS 77980 - 78207 116 ## Rumal_0348 transposase IS116/IS110/IS902 family protein + Prom 78230 - 78289 2.1 89 47 Tu 1 . + CDS 78337 - 78495 73 ## gi|160940707|ref|ZP_02088050.1| hypothetical protein CLOBOL_05601 - Term 78463 - 78514 6.2 90 48 Op 1 . - CDS 78677 - 79045 361 ## gi|160940708|ref|ZP_02088051.1| hypothetical protein CLOBOL_05602 91 48 Op 2 . - CDS 79051 - 79968 734 ## Closa_1117 hypothetical protein 92 48 Op 3 . - CDS 80054 - 81301 615 ## Closa_1104 hypothetical protein 93 48 Op 4 . - CDS 81369 - 82388 865 ## Closa_1103 hypothetical protein 94 48 Op 5 . - CDS 82462 - 84723 1445 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 95 48 Op 6 . - CDS 84707 - 85099 270 ## Mahau_2838 hypothetical protein 96 48 Op 7 . - CDS 85100 - 86137 875 ## COG5377 Phage-related protein, predicted endonuclease - Term 86156 - 86186 1.1 97 49 Op 1 . 
- CDS 86223 - 87152 581 ## Closa_0805 hypothetical protein 98 49 Op 2 . - CDS 87166 - 88323 835 ## gi|160940717|ref|ZP_02088060.1| hypothetical protein CLOBOL_05611 + Prom 88587 - 88646 4.0 99 50 Tu 1 . + CDS 88885 - 89148 157 ## gi|160940720|ref|ZP_02088063.1| hypothetical protein CLOBOL_05614 + Prom 89171 - 89230 4.0 100 51 Tu 1 . + CDS 89396 - 90421 494 ## Rumal_3273 integrase family protein - TRNA 90607 - 90678 65.4 # Cys GCA 0 0 Predicted protein(s) >gi|157101628|gb|DS480696.1| GENE 1 1 - 1035 508 344 aa, chain + ## HITS:1 COG:BH2774 KEGG:ns NR:ns ## COG: BH2774 COG3464 # Protein_GI_number: 15615337 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 1 340 102 446 459 189 35.0 8e-48 IHDYRLQEVQDIPLLGKQVILLLRKRRYLCPYCRKRFTEPYSFLPSYHRRTRRLAFYIVS LLRQTFSLKQIAELTGVSVQTVCRLLDTICYPPPDQLPQALSIDEFKGNASTGKYQCILV DPKKRRILDILPDRTQSHLADYWRNIPRKERLKVKFFVCDMWRPYTELAQTFFPNATIIV DKYHFIRQMTWAIENVRKRLQRSMPVSLRKYYKRSRKLILTRYKKLKDENKQACDLMLHY SEDLRLAHRMKEWFYDICQMEAYRQQQREFDDWIANAQSCGIKEFEACAKTYRAWRKEIL NAFKYGLTNGPTEGFNNKIKVLKRSSYGIRNFKRFRTRILHCTS >gi|157101628|gb|DS480696.1| GENE 2 1440 - 2648 215 402 aa, chain + ## HITS:1 COG:no KEGG:Closa_1149 NR:ns ## KEGG: Closa_1149 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 331 12 332 338 390 54.0 1e-107 MDYPRCRIYREFVRRLMNDPNFRTHGSSYLYYYIVLCSYANFRSSYRKMDGITYLVSPGE WICTPQELSSWFRTRFQHQAISILDFLQSQNYITYTRLDRNRLIKFHISDWKANNTSLEY NYPCQKDVGFFFFPVSVVHELVSMGKCSEMDIILDMWIHAIYNDPQVQGSDVGPVVYFRD HTGRPLITCRELALRWGLSKSSISRLLKKLESHEYLTVVSFTGTHGSVLYLQGYLSTMFS ISDVMVDKEEVSLTFQLPVTLDDEASRCLSEADIGKSLADILPETEQIYVSDNTDSVPET HILKIIQKSAKVLAAQGVFCCECSKSKYKLYRLSACRRDIYRYSLQIGCSESNIVYHFEL SIQPMLQSDPSIKFDDAVPGASPVQKQTTNSHEEEHYETDKQ >gi|157101628|gb|DS480696.1| GENE 3 2629 - 3162 372 177 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940613|ref|ZP_02087956.1| ## NR: gi|160940613|ref|ZP_02087956.1| hypothetical protein 
CLOBOL_05507 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05507 [Clostridium bolteae ATCC BAA-613] # 1 177 1 177 177 331 100.0 1e-89 MKQTSSKNHSVDQSNPVYHDTYRLLKSYRDAVWNLELEVQHVKREFEIEYESSIDDFLES IYLAGADLNGSKLENYAKSIERSNQMLSLLNSAVEILRNKHKFGEQYYWILYYTYLSPQQ LQNADEIIDKLRPHIANISYRTYYRKRPQAIEALSGILWGYTARNCLHIVNQFFPDN >gi|157101628|gb|DS480696.1| GENE 4 3430 - 3837 248 135 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940614|ref|ZP_02087957.1| ## NR: gi|160940614|ref|ZP_02087957.1| hypothetical protein CLOBOL_05508 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05508 [Clostridium bolteae ATCC BAA-613] # 1 135 1 135 135 251 100.0 1e-65 MKNEKKTNSPIAGITPAGSACRKKKPVLDYALYGAAAGLCILLLDTQTVLAADMWTQAET IMKDVYSKILGISTIAAIVTASVALLLMNFSKSGRTVDESRAWLKRIVISWAILNGLGFI MAYITPFFSGGQWNG >gi|157101628|gb|DS480696.1| GENE 5 3862 - 4572 633 236 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940615|ref|ZP_02087958.1| ## NR: gi|160940615|ref|ZP_02087958.1| hypothetical protein CLOBOL_05509 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05509 [Clostridium bolteae ATCC BAA-613] # 1 236 1 236 236 404 100.0 1e-111 MGILDGIVEWIAEQVMNALDLITTSVLGALGCNMDTFLRYFPAAETMYRIFVATAIGLIL LNWIWQLFKNFGLIAGVEAEDPFKLSIRSFLFIGLVYYCKDITDLILGVGGTPYNWILSS DLPALNFADFNSVLLTVLGVCANGSVALIALTLLLILAWNYIKLLFEAAERYILLGVLIF TAPMAFATGSAQSTSNIFKSWCRMFGGQLFLLIMNAWCLRLFTSMVGSFISNPLSI >gi|157101628|gb|DS480696.1| GENE 6 4582 - 5496 776 304 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940616|ref|ZP_02087959.1| ## NR: gi|160940616|ref|ZP_02087959.1| hypothetical protein CLOBOL_05510 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05510 [Clostridium bolteae ATCC BAA-613] # 1 304 1 304 304 531 100.0 1e-149 MKRTIQKIATGTALTLSLTLMQIMPVLAVTEDEVAAHGKEATAGSIFIWFLCAIAFLKIS QKIDSFMSSLGINVGHTGGSMMAEFMIAAKGLTSAKNISTGGTFRGGSYRGGSTSNRYGM NPASFLSGGLAGAVSRQFSQSVMNSVTGQSGNPISRKAFESSLKKGGEFANKVTGAVAQG SISYTGSMKGAQAAQALTSYMGQTGIASAPSYSDVEIGGGRITGTETSIEHPNGTSFAMY 
HTDQYMAPEGNYETVTAADQTTWYRQYAADTVERIPYMTEKGQIAYHENIVQKLPRMPQR KDRV >gi|157101628|gb|DS480696.1| GENE 7 5493 - 6629 767 378 aa, chain + ## HITS:1 COG:Cgl2139 KEGG:ns NR:ns ## COG: Cgl2139 COG0791 # Protein_GI_number: 19553389 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Corynebacterium glutamicum # 247 351 229 317 339 63 39.0 8e-10 MSPGKAFAVIAKRTGGPAGMAVSAVWKGRDTLAKILIVIIALLLLPVMFILMLPSLIFGN EGLDAVPDNVLNDNSVIMANISQTETSIETILREKHNLLLDRILAEAANLGTNCEYSITD DFSDQIIYESALIISQFCASQQDFKEVQLKKLEQILRAETDGIFTYSVNITEREETIEGT DQTETIFHYEYIIEYAGDEYFSNHVFHLTEEQAETAGFYASNLHLFLYDTIYSVEINPDL VPGETGNAAVDLGLTKLGTPYSQELRNQPGYFDCSSFTYWVYNQLGVSLQYDGSNTAASQ GRFIVENNLAVSYDSLAPGDLIFYSFKVNNRYMNIGHVAIYAGDGYVVDASFSKKKVVYR PIYSVNNIVLCGRPYASP >gi|157101628|gb|DS480696.1| GENE 8 6683 - 9649 1971 988 aa, chain + ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 415 966 56 616 617 111 23.0 6e-24 MSLRQEETDIYIIPPNFIESGTIMGGTLKLRNAVEALCISLLVGIPEFYLPFSLTTKIII ACVTVLPLALFAIIGINGESLTSFVINFFLYLKHRRIVGICPEAAEEESASEKQTDTVSG KKAKKKSKQKKKASEDFPDEFGSYKERKAAKRKEKGHPDRSGNRQEQTIPLLNPAAEYLP IQKIENGIIYTRDHRFVKIVEIVPINFLLRSAREQKNIIYSFVSYLKISPVKLQFKVLTR RADINRHLEAIQGEMAQETDERCLALLKDYESLIRRIGSKEAITRRFFIIFEYEPFNGSR NTDEGDARSALATAVQTAKTYLQQCGNEILMPDNEDELATDVLYNLLNRRTSAIKPLPTR INEVIGQYINAGRKNDLDHIPAGEFFAPESIDFTHGNYCVMDGVYHTYLLIPSSGYRTQV CAGWLSLLVNAGEGIDVDIFFDRQPKDKVQQKLGQQLRINRSKIKDTSDTNSDYDDLDSA IRSGYFLKDGLANNEDFYYMNILITVTADSLEELEWRTNEMKKLLLSQDMEAVTCHFREE QAMLSALPLVSLERKLFARSKRNVLTTSVASCYPFTSFEMCDDNGILLGVNKHNNSLIIV DIFNSRVYKNANMALLGTSGAGKTFTMQLMALRMRRKNIQVFIIAPLKGHEFHRACNNIG GEFIQISPASKNCINIMEIRKVDSSANELLDGPQADKSELAAKIQRLHIFFSLLIPDMNH EERQLLDEALIQTYRKKGITHNNGSLYDPENLGHYRKMPVLGDVYNILKENPDTRRLANI 
MNRLVNGSASTFNQQTNVNLDNKYTVLDISELTGDLLTVGMFVALDFVWDKAKEDRTVEK AIFIDETWQLIGASSNRLAAEFVLEIFKIIRGYGGSAICATQDLNDFFALDDGKYGKGII NNSKTKIILNLEDEEAMRVQSILHLSEAETMAITHFERGNALISTNNNNVTVEFKASELE KELITTDREELKRLLERERQHQVMSEAV >gi|157101628|gb|DS480696.1| GENE 9 9968 - 10309 79 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940619|ref|ZP_02087962.1| ## NR: gi|160940619|ref|ZP_02087962.1| hypothetical protein CLOBOL_05513 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05513 [Clostridium bolteae ATCC BAA-613] # 1 113 81 193 193 226 100.0 6e-58 MEAAFDVLLDFLPEVTYHAPGEFPVTLTFFAREECYEVIWLPEGKELLVSHALSLQLRPP DNYQRLIVLSSETQLELAGRLPDVTAYCLPRNGKIHYYKKGLTYGPNHPEYPA >gi|157101628|gb|DS480696.1| GENE 10 10278 - 10955 467 225 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940620|ref|ZP_02087963.1| ## NR: gi|160940620|ref|ZP_02087963.1| hypothetical protein CLOBOL_05514 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05514 [Clostridium bolteae ATCC BAA-613] # 73 225 1 153 153 319 100.0 1e-85 MDRTTLNTRLERIQNEISKLVNTLYALNLTDSDTYPQNYEELSTDAALRAEKIACTLRNL IYSANLVPKPILMQKAADIQGITVTQIDCQVVITLPGLMPKKKKGSNPAFITEPLYQCLA DYAGTHPLRRFDQCVVCFSHQYDQNQSARRIRDYDNLECKQILDTVAAFLMKDDSGLLCD VYHCTQFGTEDCTILTIMEKSYFSKWLKAHENAAKYLSDFPSESG >gi|157101628|gb|DS480696.1| GENE 11 11080 - 12183 355 367 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940621|ref|ZP_02087964.1| ## NR: gi|160940621|ref|ZP_02087964.1| hypothetical protein CLOBOL_05515 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05515 [Clostridium bolteae ATCC BAA-613] # 1 367 4 370 370 728 99.0 0 MDTSTAAYRILEITAISGELPCDTLSRIHIGESYGEKLITRLKEDRLIKTHYKDKLRGYR LTSRGKKLLLAQNPERFSFYLTGSSDTNQPRSDYPRRLRLHQTSRTYCLMLGAGITIFRD KKPALFSGVPPDDTHILPFPVFYQSRELKELGMETTKINNSRAIGILLTESLIFPVFYTG STVMKWEYKTEIRLKALLSHHITQGVLSDHYCHKIPIHALFIGSNMETAVKLMTSTGGVR HSLFTLDTSFDYFHFIPDSHAGEILLKILCSPTIGRQLTVLLLSDLETVNPSISMEHDAV RMGQPVLLAYDFDMLRISRFITALKLQRQTGHLICFDFQKEALLLYCGDAVTISTIDLAK FERRFFS >gi|157101628|gb|DS480696.1| GENE 12 12438 - 
14120 927 560 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 23 509 107 583 591 429 45.0 1e-119 MVMRMGWGSKEVHDTERNLIYSDKGTYGTSGFMTGSELHQVLMLVSDIRTCGGTILGRMG NKAVCLPEDTRMNRNVAVYGASGSMKSRAYARNVIFQCVSRKYKNGQGESLIINDPKSEL YEDMASYLEENGYVVRVFNLVNPEHSDSWNCLKEIEGQEIMAQLTADVIIKNTGSGKGDH FWDNAELNLLKALILYVERGFPEEGKNMGQVYQLLTMCSEKELNGLFDLLPVTHPAKAPY NIFRQANDTVRTGVIIGLGSRLQIFQNKLIRQITSYDEINLTRPGYEKCAYFCVSSDQDS TFDFLSSLFMSFVFIKLVRYADKYGDSGRLPVPVHILADELANGGVILDLTKKISTIRSR ALSISVMFQNLPQMQNRYPDNQWQEIIGNCDTQLFLGCTDELTARFISDRSGEVSISVNS QAKQLNTWRVSNYTPEYRETSSVGKRKLLTPDEVLRLPLDQALVILRGQKILKVEKYDYT LHPESKKLKPRKAVEHIPAWQSNSTEDDKDFIRSLKPPPSPSRKREKSTAPDQSGKKHSS PKPTDSDSFFVATDKESIMS >gi|157101628|gb|DS480696.1| GENE 13 14153 - 15145 305 330 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|20808441|ref|NP_623612.1| ribosomal protein S1 [Thermoanaerobacter tengcongensis MB4] # 54 328 5 255 257 122 28 9e-27 MANKEMTAEQTTAEVKKDTIDAPALEKRSLTASILTIDSDRAVETQENVEDTIWHDLQNA YRTKKILTGQLGGIERMEGGGTIAIVYYKKFRVAIPLTEMMIHLPEDESHNYGELSQRQN KILGNMLGCEIDFIIKGLDNNSRSLVASRKDAMLKKRQIFYLPDANGNSRVCVDRVVQAR VVAVAEKVVRVEIFGVETSILARDLSWDWLGNANEMFHVGDQILVRILDITADDLEHISV RADVKSVNGDTSKANLSKCKIQGKYAGTITDIHKGTVFVRLSIGVNAIAHSCYDSRTPGK KDEVSFVVTRIDEDRNVAVGLISRIIRQNI >gi|157101628|gb|DS480696.1| GENE 14 15244 - 15897 419 217 aa, chain + ## HITS:1 COG:no KEGG:Rumal_0715 NR:ns ## KEGG: Rumal_0715 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 10 216 1 204 204 122 32.0 1e-26 MEVKGVDECMLKEYLEKTYGRNEPIFINEIRLEGVNDNSLRQSFTRMVKAGELAHFDTGV YYLPNSSRLLKKSYLDPMKVIMRKYIKNSDETYGYFSGAAFANQLGLTSQMPAVLEIVSN RETTKGRILTVGTQKLRLKCSTIPITDENAGLLQFLNAISQAEKYSELTDEETGRVLRKY ARQQNYSRKLLSLALPALTGNTAKKLIEWGIIYEFAS >gi|157101628|gb|DS480696.1| GENE 15 15881 - 16795 523 304 aa, chain + 
## HITS:1 COG:no KEGG:Rumal_0716 NR:ns ## KEGG: Rumal_0716 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 3 303 2 300 301 224 39.0 3e-57 MNLHRDPEAFAELVTAASSELHIPVGIIEKDYYVTLALRELNSRIKGMVFKGGTSLTKCY QVLNRFSEDIDISYAASEGVPGESRKRQLKKAVVSAMEALQFPIINLDDTRSRRSYNCYR ATYPTRYAPIAELKPELMIETYIALLPFPTVTRMADNYIHRFLKETDQEHLAEEFDLMPF SITTQAIERTLIDKVFAICDYYLADKVERHSRHLYDIYQILDHTSLDESLIPLIQEVRQL RSPLAICPSAKEDVCIHDILTEIIEKDIYKADYENITRNLLFTDLPYETAMKGLKNFLNQ GYFK >gi|157101628|gb|DS480696.1| GENE 16 16861 - 17619 680 252 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940627|ref|ZP_02087970.1| ## NR: gi|160940627|ref|ZP_02087970.1| hypothetical protein CLOBOL_05521 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05521 [Clostridium bolteae ATCC BAA-613] # 1 252 1 252 252 437 100.0 1e-121 MEYRRMSEEMSSQEVHLLMKVLIPEAEVKSYDRCRDTNCIDVTYCLYGQERVITFYEDDI ENALDDVEMEGEKYYRIYMIANGYSCLWKTESQMELGAWIMKAGRDIEERLKKQEEKDKE LQALQKRADTILEEVLEDADAEVVRDKLLEFQKVELQVRGRMKHTGYVLGASDAAEMIHK TEGAIEVLDLVTELEILVREYRDIQSLLSVFGDGIKESLGEYDVTNGVYAIAKAMEARNT AFQELFVKIVGE >gi|157101628|gb|DS480696.1| GENE 17 17750 - 18058 264 102 aa, chain + ## HITS:1 COG:no KEGG:DSY2751 NR:ns ## KEGG: DSY2751 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 3 102 8 106 110 80 45.0 2e-14 MFDINERIIELRMERHWSEYQLAEKSGIGQSTISSWTRTKSMPTVPNLEKICNAFGITLS QFFADKEERTVSLSPVQQEMLHNFDRLTPEQQENLIRFLSSF >gi|157101628|gb|DS480696.1| GENE 18 18320 - 18724 348 134 aa, chain + ## HITS:1 COG:no KEGG:Closa_2836 NR:ns ## KEGG: Closa_2836 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 134 4 131 131 74 35.0 1e-12 MLKNEKGEASYISVFIYILVAVILIAFIINVFRIISAKQQMDHAADQLVKQIQLGGSVNG ETDALFAFLSGEISGADQLDYHVECSDSTDKIQIGSPFYVTVTGRCSLGGFWNFRLINIT IRATGAGVSEHYWK >gi|157101628|gb|DS480696.1| GENE 19 18735 - 19226 288 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940630|ref|ZP_02087973.1| ## NR: 
gi|160940630|ref|ZP_02087973.1| hypothetical protein CLOBOL_05524 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05524 [Clostridium bolteae ATCC BAA-613] # 1 163 1 163 163 311 100.0 1e-83 MIKRLFWDERGDLSFFTIFVILSINMVMAFLLLFATIRIECINIRNAAKMELNNVSARIY ADTFHSQREANLDSYMGDLYFSSAYQDALRTDFIQGLTDRIVLENENYTLSNIRLDFSQH GNDSGKIEYIFSCDAQFRFQMFGEAFPPFSRHITLTGSHNTKY >gi|157101628|gb|DS480696.1| GENE 20 19294 - 19866 280 190 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940631|ref|ZP_02087974.1| ## NR: gi|160940631|ref|ZP_02087974.1| hypothetical protein CLOBOL_05525 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05525 [Clostridium bolteae ATCC BAA-613] # 1 190 1 190 190 305 100.0 9e-82 MNHKKGLAIFLSIACAAILGSVFLLGKSPEHTFTPEVTESVSSNSWEERNSQTDSGHVMS KESADKTLPEDSSAEDPVQTVISEDTEQVITELTPIEDKAQVQANTKPTAPPPDEEKQGR LPKCDMLPNAQTSVPESTPEESPQKSSENVSQGQDGQVYDPVFGWIAPSTSQGQVIDNDG DINKQVGTMN >gi|157101628|gb|DS480696.1| GENE 21 19956 - 21647 879 563 aa, chain + ## HITS:1 COG:no KEGG:DSY4654 NR:ns ## KEGG: DSY4654 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 20 559 20 558 564 511 51.0 1e-143 MKRFLQKAVCLILLFLLALPAKAYASNGSGNMDGGGSGMGQGTSTNYWTPGYDGVRITVV DASSGSAVSSPVDFSNKTFNKPLVHFAKKNKFQYLAGAALEPAGGSYVYQIPAVAMPYIM TSNSKPANIENTRKYFCSEYAAQMVADATGVSYDSLISGSYKLVVEPIGYFTFNGTFYAM TATEGALYNRLSGGALASKLPSTVHRNLPLALFLETPDLGIPAWTGSSTAYVNDDQILSS LGIGIVSYEGLPEIGLEAPDVTYRVNTDVITSITVNTNREINPDNPANVTFHINGRSYTV NNIVIPEGGSQIVWVKWHTPSEPCRITISASLHRATTAKTSFIADIVDLNENIPPDPTAT DTNPRFSVPSIPSNPQNTSAAWSVWYAYWFPDWDWCDHGDEGGHWVDNGWWEFDTHNYSA SLSGTMNIAPDDIVPTAQGDTMKSGYGIKEIASARLTINAPASHYAPAQTAVSYFPEFQY QNYWRLLTCSGGMAATFRFQPNEFSTYNRQVHFSPIWFPDHTSYTVYTYVIDAWTPAGML SLNLNDSVQIDGSLYDDWYSKRE >gi|157101628|gb|DS480696.1| GENE 22 21622 - 22092 172 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940633|ref|ZP_02087976.1| ## NR: gi|160940633|ref|ZP_02087976.1| hypothetical protein CLOBOL_05527 [Clostridium bolteae ATCC BAA-613] hypothetical protein 
CLOBOL_05527 [Clostridium bolteae ATCC BAA-613] # 1 156 2 157 157 229 100.0 7e-59 MTGIQNVNSLEAGYYAATLIFIGWFALYDFRRHLVRNQALAIFFFWCLLSLPLTASTSLL PVSLLILQAVMGFLNGGLILLAAAWITHGGIGGGDIKLAALLGLLFGTRGVCLILFIAAI SALAFVLLISGRNRSSPLRLPFVPFLFLGSVLWVFL >gi|157101628|gb|DS480696.1| GENE 23 22104 - 22211 92 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLLFISSGLLLALFLAFLLGSALIEQARAFLDQNR >gi|157101628|gb|DS480696.1| GENE 24 22227 - 23093 745 288 aa, chain + ## HITS:1 COG:no KEGG:HM1_0585 NR:ns ## KEGG: HM1_0585 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 3 269 4 268 270 161 35.0 2e-38 MRINNRFIYGIISILLAAVIAFVAIPVVTKKTSSTLEIVRVVNPLERGQQITAQDLELIE VGSYNLPASVAIRLEDVTGTYAAASLMPGDYILPNKVSSSPLSSDPALYSIPDGMVAISV TTQTLATALSDKLQSGDIIRFYHYNDEHEQLNPVEDIPELRYVKVLSVTDSQGLDIDYTQ PLEEDEEKLQTATITVLASPEQAMLLTRYENEGILHVALISRGNEALVTELLQRQTDILN ELYLVPEESLFETNAEDYMESEESPSEEETALPEESQETTSEADEKEE >gi|157101628|gb|DS480696.1| GENE 25 23098 - 23883 665 261 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1890 NR:ns ## KEGG: LM5578_1890 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 2 229 27 255 271 131 30.0 3e-29 MAKIITVWGNPGCGKSMFCCILAKVLTAGKQKAIIINADPATPMLPVWMPDRILEQDASV GGVLSSLDINSALIAERVAVLKDYPFIGVMGYAAGETPFSYPELKYDRIKLLISEASRLV DYIILDCSSSVVNFFTPTAIESADVVVRILTPDLRGINYLKAHKPLLTDAKFRYDEHLTF AGLARPFHALDEMGHLIGGFDGILPYAKEIERCGTSGEMFRALAYCNQRYINSMKLVRQR LEIGEASQEADAAQEEEILNE >gi|157101628|gb|DS480696.1| GENE 26 23876 - 25240 1032 454 aa, chain + ## HITS:1 COG:RSp1085 KEGG:ns NR:ns ## COG: RSp1085 COG4962 # Protein_GI_number: 17549306 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Ralstonia solanacearum # 73 402 80 399 450 115 29.0 1e-25 MNSLNELFFNHSAAEGMTYEDILDDVQKYCTANHADTLSGETDPEQAKKLLKEFILQYLI KCSYGVEGMSSAELCENLYEDMAGFSFLKKWIYRKGVEEVNINAYNDIEVITSGGRSMKI 
PEKFQSPQHAIDVIRRLLSTCGMVIDDTMPSVIGYLDKNIRISVDKTPIVDADVGINASI RIVNQQSISREKLIDSGSATEEMLDFLVCCIRYGVSVCIAGSTGSGKTTVMSWLLSMVPD NRRLITIEEGSREFDLVKRDADGKILNSVVHLLTRPHENPALDIDQDILLERVLRKHPDV IGVGEMRSAKEALSVAESSRTGHTVVTTIHSNSAEATYRRMMTLAKRKYNMADDILMQIM VEAYPVVIFTKQLEDGSRKIMQIIEGEDYTDGKLQYRTLYQYNVEDNRTIAGKTQVFGEH KKLNPFTDNLKKRLLDNGISQLELKKLMDERREP >gi|157101628|gb|DS480696.1| GENE 27 25240 - 26160 794 306 aa, chain + ## HITS:1 COG:no KEGG:CKL_3887 NR:ns ## KEGG: CKL_3887 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri # Pathway: not_defined # 1 305 1 308 309 259 43.0 7e-68 MQWIQIIIFVLLLSGFFTLFDIRFTDLLNLLAVGKKSTLNDDIDVLLGKPAKGFFNRESV EIEQLLKSTGRENRFTMVKRISVLLFSLGAVVSLLLNNPFLIPILGAGFALAPVWYLRAT ASVYKKHLNEELETAISVVTTSYLRSGDLMKAVKENIPYLNPPVKGNFEAFITEAEMLNA NMISAINSLKMKIPNRIFHEWCNTLIQCQSDRNMMNTLTFTVQKFSDMRIVQTELEAIIN EPKREAFAMMFLVLCNIPLLYVLNQNWFETLIFTLPGKITLAICAAVILFSFTRIMQLSK PIEYKG >gi|157101628|gb|DS480696.1| GENE 28 26170 - 27048 726 292 aa, chain + ## HITS:1 COG:no KEGG:Ccel_2774 NR:ns ## KEGG: Ccel_2774 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 5 292 1 290 290 197 38.0 4e-49 MTGLLILLAAIFTGCSCYYLSCAFADIPTVRTSKAMMTTRKQTGSGSEKLFDVYLTRIAR LLAPVIRLDPVKRGKLQTTLEIAELSLTPEAYTAKAFLSALAVAFLSLPLFLIMPLLGFL VLGLSILMWFSTYYGAFDYVKKRKVLIEQELPRLAVSIQQSLANDRDVLKLLVSYRRIAG PELGKELDTTIADMKTGNYENALLRFQNRIGSTMLSDIVRGLIGTLRGDDQQMYFRMLAF DMRQMEQSNLKKEAAKRPKQMQKYSMMMLFCILLIYVVVLTVEVISSLDAFF >gi|157101628|gb|DS480696.1| GENE 29 27235 - 27456 272 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940640|ref|ZP_02087983.1| ## NR: gi|160940640|ref|ZP_02087983.1| hypothetical protein CLOBOL_05534 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05534 [Clostridium bolteae ATCC BAA-613] # 1 73 47 119 119 124 100.0 2e-27 MKKAIYRFTSPARNFCTKAGQYLRNQRGDFYISDGVKIIIAVVLGALLLAALTTIFNDTV IPRITQEIEGLFS >gi|157101628|gb|DS480696.1| GENE 30 27538 - 27885 154 115 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|160940642|ref|ZP_02087985.1| ## NR: gi|160940642|ref|ZP_02087985.1| hypothetical protein CLOBOL_05536 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05536 [Clostridium bolteae ATCC BAA-613] # 1 115 1 115 115 227 100.0 2e-58 MKSHHEISIPYETLVRFLESAEGNYRNALYIRCSACRYHREEHCGDFLFIPGHDCVPILF PLADAMDVIGHIDPSDCAGILSSTAFLHLYHQWLILLTNTDEICPFHQILHLTNS >gi|157101628|gb|DS480696.1| GENE 31 28027 - 28431 251 134 aa, chain + ## HITS:1 COG:no KEGG:TK90_2760 NR:ns ## KEGG: TK90_2760 # Name: not_defined # Def: hypothetical protein # Organism: Thioalkalivibrio_K90mix # Pathway: not_defined # 5 134 4 132 134 92 37.0 5e-18 MKRKSQTIFCPYCGKPAILRKASDIYKTHTLEEYLYACSDYPACDAYVGVHRGTLLPKGS LANKELRRKRIVAHRYFDSIWKHGILNRKNAYFWLQDIFGLTSDQAHIGQFSDYMCDQVI SASKRVLENNKIAC >gi|157101628|gb|DS480696.1| GENE 32 28455 - 29012 205 185 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940646|ref|ZP_02087989.1| ## NR: gi|160940646|ref|ZP_02087989.1| hypothetical protein CLOBOL_05540 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05540 [Clostridium bolteae ATCC BAA-613] # 1 185 3 187 187 342 100.0 5e-93 MTSTVSNTVHPEEYLHLVDTVISLYITQRIDIADMEYADLYQTGCLALCHAAATYDPERE ASFSTYASVVIRNRLYDYCRHVNHVHSQLLYLDAPLADDSETAYVDQLQAPSESPAELFP LLQIIQEEYSGIAAKGVNALLLQASGYSSKDIAAMYHVKPNHISAWISRARSRLQKDPRV LAFAQ >gi|157101628|gb|DS480696.1| GENE 33 29036 - 29626 305 196 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940647|ref|ZP_02087990.1| ## NR: gi|160940647|ref|ZP_02087990.1| hypothetical protein CLOBOL_05541 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05541 [Clostridium bolteae ATCC BAA-613] # 1 196 1 196 196 379 100.0 1e-104 MLTLYTAVGKLQIRKNSAGKEYPLVMVGEKERTLTPYEMLLWSSLSWNILTFDETRALFY EREQEMHILSDLDFEYYLKRLIFRGLVVSGTGYSGISALHQLLAPLYVHSCADRLLEKLA ALADMVCLQHYPLRTAVRIFRKEKLTVEEKLLIGRCGPKRIAASEILEYGEAPDQHRMAA VLAGVYLKRQVIFETF >gi|157101628|gb|DS480696.1| GENE 34 29687 - 30733 206 348 aa, chain - ## HITS:1 COG:no KEGG:PCC7424_4909 NR:ns ## KEGG: PCC7424_4909 # Name: not_defined # 
Def: Miro domain protein # Organism: Cyanothece_PCC7424 # Pathway: not_defined # 5 164 834 995 1015 100 34.0 7e-20 MGRVEIFLSYCWADEEIASDIEMHLAKDPEINLHRDKLDIRKWGSIKQYMQSIPKMDYMI LLISDAYLKSANCMYEVLEVMRDRQYQDKIFPAVVHTGIYKPAIRASYVKHWQAEYEELK HDLEGIGIQNIGRLGEDLKRRQDISANIADFLDLVSDMNNPSISDVCTAIEEHLADKNII RKSQRQSGAEHVSERIFESMGIFAGADNAEPTDLEINQFMSKAFHEINEIMVDLCKQLEE KYGCFQIMAEMIDSRNYYYQFYKNGRTIKSLKIFLDNSFGAWNIGFSNDSFSMVGGNHSW NGMYSAKCINGNLCLSSMMSLGGRQNGMTTEDVVKDIWKNHITPYLER >gi|157101628|gb|DS480696.1| GENE 35 31044 - 31466 367 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940649|ref|ZP_02087992.1| ## NR: gi|160940649|ref|ZP_02087992.1| hypothetical protein CLOBOL_05543 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05543 [Clostridium bolteae ATCC BAA-613] # 1 140 21 160 160 280 100.0 4e-74 MNTPVTIITTGRVTADLELKTSQNGRNTKFVQFSLAVNKGYGENNRPNYYKCLLFGEEAQ RIIDAKVKKGSLIQIVGDLDLEEFKKRDGSAGWSARVTLLSWNYAPTNRPKEDSLPEENN ARDNQFIPTEDCGENGLPFN >gi|157101628|gb|DS480696.1| GENE 36 31575 - 32426 430 283 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940650|ref|ZP_02087993.1| ## NR: gi|160940650|ref|ZP_02087993.1| hypothetical protein CLOBOL_05544 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05544 [Clostridium bolteae ATCC BAA-613] # 1 283 1 283 283 551 100.0 1e-155 MKSSLFTKGFKILIVTLLLLAVLFNRSLAQKLMLATILIWMFMVIVRFLKPTVRKLIENW YADRQYRRNITPLVLSAPMPDEAISFSPVPEPEPAFSEAEQKRMLQHISLRITEKIKSAY PDATWRWADSPDLTAILDGKTFRIQVDDMAAFTHADIRFDRYARIRITPLSVGEFASSAD SGKKTPSEEDTKEPAVVDVISWYELIGRQMLESVITDLNANGHSRLSIKENGDIIITRNK KEALKGTLDQFPPKNYWNEFLNLLEEQELHGKVEKDRIIISWV >gi|157101628|gb|DS480696.1| GENE 37 32453 - 34249 1085 598 aa, chain + ## HITS:1 COG:SA1241 KEGG:ns NR:ns ## COG: SA1241 COG0714 # Protein_GI_number: 15926989 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Staphylococcus aureus N315 # 400 553 70 223 263 63 28.0 1e-09 MAKTPLFQNWSWKRTLPEPFDKEDAVSTASYSKYCIGSTKKATLHSAALSAILAYMDMDT 
AVTPGSTSECAYGTQGEKLVAEYQSRTGEIHAVIFNKTNRKFVAGRFDIADPSANTKVYS IKDSGNTGTALFFALLPLALKDLEFKEQYDKLASCKQAGYPDMETADEAAYILCDNLYRR IENADNLGNDGIKIVVPPTGNIQQITPLNLTKGTYSPTSVLFGTFRIFKPGSSVAASIHI DKDHFSGKYSFSSRTFSPKEQMQIPKLEDWYIIPPEVDSICKHIQLTTAGTQPMRNFLLR GPAGTGKTEAAKAIAAGLGLPYLFYTCSANTEIFDLLGQMLPDTNSSACSLPEEYPSLMD IQMDPPSAFYKLTGQYQEGITDGEVYDKLVETITLRAKQKETDSNQNGQHFRYVETPLIE AIRKGYVIEIQEPTVIANPGVLVGLNGLLDRCASVVLPTGEVVRRHPDTVVIVTTNINYA GCRDLNQSIISRMNLVIDLDEPDMDTLVNRVTKITGCTDTSVVAEMAKAVTEISEHCRET MISDGSCGVRELISWVQSYMVSGNVLESASYTVLSSASSDPENRAEIFSTCLEGKFAA >gi|157101628|gb|DS480696.1| GENE 38 34269 - 34532 168 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940652|ref|ZP_02087995.1| ## NR: gi|160940652|ref|ZP_02087995.1| hypothetical protein CLOBOL_05546 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05546 [Clostridium bolteae ATCC BAA-613] # 1 87 11 97 97 165 100.0 8e-40 MNDITEQEMTRYFSNNLRLLRNGRTPPISQQQLANQLHVDRGLIAKYECGLLLPPVHLAF LISRYFKIPMEALLAKPLRTTRKEICT >gi|157101628|gb|DS480696.1| GENE 39 34529 - 36607 1379 692 aa, chain + ## HITS:1 COG:BS_yojO KEGG:ns NR:ns ## COG: BS_yojO COG4548 # Protein_GI_number: 16078998 # Func_class: P Inorganic ion transport and metabolism # Function: Nitric oxide reductase activation protein # Organism: Bacillus subtilis # 438 675 395 645 661 71 25.0 5e-12 MRTSHRTIRKLIREEQSKITDEQLFSSQQFSAYMTDIVETATKRYRRRSKVKMHWDPFSN SDVAHTDNRLIHINAANFLTRSFPTRSLQADSIIGLAGHEAGHILFTDFTMLQIHMQALE AGSFYPAEPGNLTMLQSLSLQQIKNYFEEKDEAVIQAIRLVTHSLVNTMEDLYIEARMMD AFPGSFKTGILLNNLRFSEDIPSVTDEIAEGHHEVFIIINLLIQYAKSGDLNNADGYKGK YLDTLYECIPFLDDAAYDEDAKARYQAANQMLVLLWPYMPSLIEQVREDLKNKTHHAAQA AEEQLATPPKTLSGTSKPVPSKTPYNHTPSSEDEERERLQSALDYETGRIALEKTDEISE GDSGGTSYDRNFSGSGYVSQAAEDMQRIVSQLAEESAAIRYEEELSEELQDESNRISYGN IHKGIHIHINRMGYVPEEYRISYQKVFPALHPISKRLQKQVSQILIDSKTGGKLDGLPFG RRINARNAVRNDGKLFYKIRLPHEQGDIAVCLLIDESGSMSSRDRITKARSAALIIHDFC VALGIPIAIYGHTEDYDVEIYSYAEYDSQDGKDAYRLMDMSSRCGNRDGAALRFAAERLM TRAEDLKLLFLISDGQPAGDGYYGTAAEADLRGIKQEYSRKGIQLFAAAIGDDKPNIKRI 
YGDGFLDISDLNRLPVHLGNLIIQYIKQKHTA >gi|157101628|gb|DS480696.1| GENE 40 36645 - 37307 315 220 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940654|ref|ZP_02087997.1| ## NR: gi|160940654|ref|ZP_02087997.1| hypothetical protein CLOBOL_05548 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05548 [Clostridium bolteae ATCC BAA-613] # 1 220 1 220 220 416 100.0 1e-115 MSGHISDQIVVYKNDKVLVELRDKLKVASVANYAHIHATGEQTGNGTKRTSLIGILMKDY SKGTKDQAVTVTANVSPEEIRFLFSRLNAGFPAVDFKQEKIFGDPDNTGHSQVTKLRIIR AEKDSRGNQRNNPWYIEIENGKGIAVKNSKGGTYMKSNSFSGNGKASANLNDLEMFRLLC KTVSYIDAWEKAIGPSLIIQAKNIIKKNQEDAQRHNTSAA >gi|157101628|gb|DS480696.1| GENE 41 37396 - 37905 321 169 aa, chain + ## HITS:1 COG:BH0070 KEGG:ns NR:ns ## COG: BH0070 COG2002 # Protein_GI_number: 15612633 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Bacillus halodurans # 1 163 6 173 179 67 31.0 9e-12 MIRRIDDLGRIVIPKEIRKNLHIQDGEPLEINVMDRSIMLTPYSPLPALSIQSDLFLRVM AKELKTGTAICSSTAILSYRNVSLYRDCSLSDEIRSRIQSQQQYEYDPEKPIYLETAHIC PVNALYLIGTPAKPIGAVVLIRKNDIDLLPEQKSIARIAAQILTELTKD >gi|157101628|gb|DS480696.1| GENE 42 37953 - 38219 70 88 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940656|ref|ZP_02087999.1| ## NR: gi|160940656|ref|ZP_02087999.1| hypothetical protein CLOBOL_05550 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05550 [Clostridium bolteae ATCC BAA-613] # 24 88 1 65 65 119 98.0 6e-26 MFRINKSRQLDRRPLIPSKGHFYVTGYLSSPVSKGQPAEYIYNGNRIRTSTVLQILKASD EFITFETLNSIYTISYLKVSAENRVLCA >gi|157101628|gb|DS480696.1| GENE 43 38291 - 38758 348 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940657|ref|ZP_02088000.1| ## NR: gi|160940657|ref|ZP_02088000.1| hypothetical protein CLOBOL_05551 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05551 [Clostridium bolteae ATCC BAA-613] # 1 155 12 166 166 283 99.0 2e-75 METEKQLKDQMATLKKNRDNVPLDVLKTKYKKGYTALCEDLRLLTSDFIKSIVLKDIAVM PKYMPDVVQIIETTVKDSGLLKECSKAVYRQQDFEELKSLAEQLRELALKALDEFYMKHI 
GLYIAPECLKEPYPPPYYLNLVTNQYYDGTRWAKI >gi|157101628|gb|DS480696.1| GENE 44 39092 - 41119 1044 675 aa, chain + ## HITS:1 COG:BS_yerG KEGG:ns NR:ns ## COG: BS_yerG COG0272 # Protein_GI_number: 16077730 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Bacillus subtilis # 15 671 9 663 668 328 32.0 2e-89 MKQNQNEHLSQEAIRIHHLVKQLNQYRHEYYNMNAPSVSDAVYDCLFDELDALEKRTGII LSNSPTQTVGYQAVSQLDKVHHPIPLLSLDKTKQITEIHDFMKKQKSLLMLKLDGLTIKL TYENGRLIEASTRGDGDEGEIVTHNASAFRNIPLSIPYQKRLVISGEAFIHIHDFERLQN TLLDSNGNPYRNARNLAAGSARCLDANICKERSVSFYAFHVLEGLDESSELSKSREARLT ALHPLGFDICPFLVLSSEASLETLNEAMATLRDLASELDIPIDGLVLRYDDLPYSVACGR TGRFYKDGKAFKFEDDTFKTIFRSIEWQTGRTGEIAPVALFDPIEIDGSTVSRASLHNLT FIQDLELYPGCRILVSKRNMIIPHIEENLDRGHYQDMTPRTCPCCGSPTRIYSRSSGSER IVQTLHCDNPNCSQQILQRFVHFCEKKAMNIIGISEATLKQFIDLGFLKCFQDIYHLDRF SDQIADMDGFGKKSYERLWNSINESRNTTFVRYLVAMDIPMIGRTASRKLEQYFHGNLLE LELAAMSRFDFTCLEDFGETMSSNIIEWFHDRENLTLWRNLQKEMHFKEREETIMTETKN NPFAGCTIVATGKLENFTRDSINSKIISLGATAGSSVTKKTDYLICGEKAGSKLAKAQQL GVKVLSEQEFLNMIA >gi|157101628|gb|DS480696.1| GENE 45 41266 - 42627 1019 453 aa, chain + ## HITS:1 COG:FN1101 KEGG:ns NR:ns ## COG: FN1101 COG1373 # Protein_GI_number: 19704436 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Fusobacterium nucleatum # 2 448 22 451 470 93 24.0 8e-19 MYLKRIIYDQLLDWKNDTSHSTLEVSGARQVGKTYIINKFADENFRHKIYINLFEQSGQQ FMECYKQATSWTPGTKRPEHPLHDAFRLFDIEFTDSDDTVIIIDEIQESAEIYNRIREFT RQFKSHFIITGSYLGKIYESDFRYSSGDVTTIPIYTLSFEEFLLAYDSALYEQYQNLSLD TPDSGNIYTLLKEAYDIYCQVGGYPKVVETYLETRSFEKAQGVLIKIIDTFTNESIRYFT DILDTRVFTQIFLSICRILNRESRGLKEDSISEELQKLVTRDYSSNISKAVCNRAISWLY FSGIIGFCAKITEMDVLDFKPASRCYFMDLGLANYYLTLTGTDAATLSGTLNENFVYINL KKRQDFPAEIAFETPAFATYKGGEIDFVAQTLKNRVRYLIEVKAGKGTATTSLKALEHGK ADHLLYLKGDTKGGIDKKVRTLPIYLLERFNFQ >gi|157101628|gb|DS480696.1| GENE 46 42735 - 43070 317 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940660|ref|ZP_02088003.1| ## NR: 
gi|160940660|ref|ZP_02088003.1| hypothetical protein CLOBOL_05554 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05554 [Clostridium bolteae ATCC BAA-613] # 1 111 1 111 111 213 100.0 3e-54 MTESMNQLIPIFEDIIKANNFITFAYSSLDDRWFYLSMKAGDDYDSTKILGNLNQAYAYM LEECQYYWLKRNRLLQPDLTVDAIIASLSPEIHGLLDAFLEPYVKAAEKFL >gi|157101628|gb|DS480696.1| GENE 47 43120 - 43344 265 74 aa, chain - ## HITS:1 COG:no KEGG:BLJ_1241 NR:ns ## KEGG: BLJ_1241 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 72 13 81 82 70 45.0 2e-11 MKQGTLFYDKESGRYDFCYDCDGETVNYGGIHCGEVFEFCLNDVWVPARVEMYEDWYLVG LPGLKMEGLEVRTR >gi|157101628|gb|DS480696.1| GENE 48 43369 - 44001 554 210 aa, chain - ## HITS:1 COG:pli0059 KEGG:ns NR:ns ## COG: pli0059 COG1961 # Protein_GI_number: 18450341 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 5 164 2 158 199 97 39.0 2e-20 MAVKKFGYARVSTKDQNLARQIEILKAEGIEERDIFVEKESGRSFERKVYRSLVDTVLRE GDLLVVTELKRFGRNYREIYQEWHHITKEIGADIKVTSMPILDTTQSKELVGQLITDVVL AVFAYVADEDWAERHELQRQGIEVAKAEGRHLGRPKVQFPENWDEWYPKWTAGEVTASEA MRQMGIKKDSFYRLAKRYEENSSNLEDGGT >gi|157101628|gb|DS480696.1| GENE 49 44188 - 44373 128 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940663|ref|ZP_02088006.1| ## NR: gi|160940663|ref|ZP_02088006.1| hypothetical protein CLOBOL_05557 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05557 [Clostridium bolteae ATCC BAA-613] # 1 61 1 61 61 115 100.0 2e-24 MSDFENLLRRAKSGDSAAVETIYQMFRPLLVKNALDFGVFDEDLYQELCHTLLLCILKFR I >gi|157101628|gb|DS480696.1| GENE 50 44366 - 44818 394 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940664|ref|ZP_02088007.1| ## NR: gi|160940664|ref|ZP_02088007.1| hypothetical protein CLOBOL_05558 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05558 [Clostridium bolteae ATCC BAA-613] # 1 150 30 179 179 224 100.0 1e-57 
MKEYGSEQGSSVQNRFTAYLVTAVKNKRISYMERKHQQTEFEAERPGVLERSYTDFEQEY RQYKLEHLPWKYNETPSIEQLIRVLEEGKLIGAIIKLKKREQEILIARVFEELDFKEIGT RMEMSEKQAEMAYYYVIRKIRKELEGKRYE >gi|157101628|gb|DS480696.1| GENE 51 45256 - 45606 162 116 aa, chain + ## HITS:1 COG:no KEGG:EF2291 NR:ns ## KEGG: EF2291 # Name: not_defined # Def: Cro/CI family transcriptional regulator # Organism: E.faecalis # Pathway: not_defined # 12 112 9 109 121 63 33.0 3e-09 MQVYNSDLCLKLGRRIAAARKNCHMTQEQLSAKAHVTSRHISDIERGLISPSYDVLRSLI IALNISADTLFFPDYEETDPIFRELTACYRKCPESKRTALVQTVDFLIETILIDEK >gi|157101628|gb|DS480696.1| GENE 52 45981 - 46559 239 192 aa, chain - ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 9 173 67 231 470 182 55.0 4e-46 MEGKGQYRAYKSVKTNKGAPGVDEMTIEAALTWWREHNHKLVERIRRGKYTPSPVRRVEI PKPDGEVRKLGIPTDIDRTIQQAMTQQLTPIYEPRFSENSVGYRLGRGAKDVIQKIKKYA EQGYTQAVILDLSKYFDTLNHTTLLNLLRKEIKDERVIQMIKNYLKSGVMENGGKFSSGG QYFSAFRECVSK >gi|157101628|gb|DS480696.1| GENE 53 46678 - 47445 534 255 aa, chain - ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 2 255 15 269 269 200 48.0 2e-51 MLIGILLPFAGTVLGSACVFFLKSLKKKIDKLLTGFAAGVMVAASVWSLLIPAMEQSENM GRLAFVPASVGFLLGIGFLLCIDEVVPHLHMYSDKPEGIKSGLSKHAMMLLAVTIHNIPE GMAVGVVLAAMQAEQSGITIGAALALAIGIAIQNFPEGAIIAIPLHNSGLEKKKAFVLGM LSGVVEPIGTILVLLMASTFTPILPYFLSFAAGAMIYVVVEELIPEMSEGKHSNIGVIAF SIGFVLMMILDIFFG >gi|157101628|gb|DS480696.1| GENE 54 47559 - 47765 157 68 aa, chain - ## HITS:1 COG:SA2216 KEGG:ns NR:ns ## COG: SA2216 COG1132 # Protein_GI_number: 15928006 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Staphylococcus aureus N315 # 3 68 512 577 
577 76 46.0 1e-14 MEAIDELMKAKTIIMIAHRTKTVQNADQILVVDEGKVVQRGKHEQLKKEEGIYRRFVCQR EKTVSWKL >gi|157101628|gb|DS480696.1| GENE 55 47992 - 48144 63 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRLTAKRSQAQKHNVGTPCIHLQIKGDIPCLISGNGDISSECHDPEKQQD >gi|157101628|gb|DS480696.1| GENE 56 48088 - 48624 217 178 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940670|ref|ZP_02088013.1| ## NR: gi|160940670|ref|ZP_02088013.1| hypothetical protein CLOBOL_05564 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05564 [Clostridium bolteae ATCC BAA-613] # 1 178 1 178 178 347 100.0 2e-94 MKSIKVIVIGFIFAVALVAWPAGYMQYITAGIGFLLAAVLMAWLGVRRRKSAQKEYEDDV RRYTGTTMMKVVWIEESADERWAHQKDGSDRLRRETYYLPTYEYTVNGTTYRYSSRQSLS GKRDLGRQVMGYYDPAKPERITENRPRKPVLGGFLFFFGALILLFFGIMTFTGNITIS >gi|157101628|gb|DS480696.1| GENE 57 48663 - 48857 145 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940671|ref|ZP_02088014.1| ## NR: gi|160940671|ref|ZP_02088014.1| hypothetical protein CLOBOL_05565 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05565 [Clostridium bolteae ATCC BAA-613] # 1 64 1 64 64 66 100.0 8e-10 MPLWLAVVLLLLSLLGIALSRRYLRKPVQSVCVVLCVLLAVACAAYIGLTILFVDAVRNQ PPVS >gi|157101628|gb|DS480696.1| GENE 58 48872 - 49750 743 292 aa, chain - ## HITS:1 COG:no KEGG:CKR_2158 NR:ns ## KEGG: CKR_2158 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 117 287 130 303 315 73 29.0 1e-11 MEEKSRNMTDEHLSRQLRKSKSGKLGGKLLSLVGGVVAIAGIILGGNIVLVVIGGVILGL GQMVQGKSKDKASQQAFDAIAPDIVSAALENVQMSPTPHLLDAEDTNIPLPRHTHCSGSG YVRGTYRGLTTELCTVRLTEVNEVQREETGLWEKNEQPVYTGQWMLCELGQEFPTWLTIW PRERLDKLFSAKTIKTGSEVFDKRFNLSSDDEAAALRILTLGRMERILALADSSIGKFAV NLNPDGRVYIAVHSGRGFFDIGKGYENPAQLRQRFAHELRWFTDMIDAFRGV >gi|157101628|gb|DS480696.1| GENE 59 49761 - 50705 978 314 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940673|ref|ZP_02088016.1| ## NR: gi|160940673|ref|ZP_02088016.1| hypothetical protein CLOBOL_05567 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05567 
[Clostridium bolteae ATCC BAA-613] # 1 314 1 314 314 635 100.0 1e-180 MRKKRNDFDAVGAERMTDTELSKKLGSYQRTESIGILLGVILVITGCIMTFVYRNIIVIC ILVFAGVALFLLVAQPAQKKKKALMQQQLGGFFRTELTKAFGPEPEPATLPIDWAYLEAA KLSVVSFTECTVTDFHEGEHKGLRFSAANVELRRTVEEKSGPNNDNWITRTETLFRGIVV RCKDICDPALDIALNDMFQERKKDDITEPAAFRKHFSAHTADDREADDQVTPQLRDLVQK LEASSRTSKLCGLILRDGDLALALNTRYVFAGIPEELDLRDIDGIRKWFIASLTGMGNLL DLITESPALTGTTE >gi|157101628|gb|DS480696.1| GENE 60 50721 - 52043 1292 440 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940674|ref|ZP_02088017.1| ## NR: gi|160940674|ref|ZP_02088017.1| hypothetical protein CLOBOL_05568 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05568 [Clostridium bolteae ATCC BAA-613] # 1 440 1 440 440 874 100.0 0 MKNRWKHIFIPVPAMALCLSLAACGNSDDVAGDDWRTTGVVVGSGTIAHDGESVDVLVTV CESSAAFYRDLPEQVLFDSVSFPMNIPDAEQSFNAISFDDMDGDGESDVLVSFIHENGDA TELVWIWDPVERYVFREDLSTVALSGGDLGEYVGLWEYQGKNLWLRICDDATWEFVNDQD EVIEYGTLWVDENGVTLHFDGSGDTLQLDRTVSGDLIDSVNNGTLLPVEGIQSQEPYFAR NGLEINAAVEMGTFLLGNGVCSYSGLGDGYNTADCYWEITKNGDYTHDGIRELHFDAICY IPEDSIPNFDQQYTTVTSSELYDFYTGMWLTTSTAYGNSQRGDNYYLHTVSWQGNSYLIE FAYSTNWQYNVGDWAQVLTKSYSVYLPEEYNGLVFAAEAQPGSYKDTAKRMQLDSISPEA AIMDIDTVDPYSSLYFSLCY >gi|157101628|gb|DS480696.1| GENE 61 52062 - 52868 987 268 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940675|ref|ZP_02088018.1| ## NR: gi|160940675|ref|ZP_02088018.1| hypothetical protein CLOBOL_05569 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05569 [Clostridium bolteae ATCC BAA-613] # 1 268 1 268 268 436 100.0 1e-121 MKRKKPYLLAGLVLLLSLLCTACGNTEEAISPQDDPNEGLGGGVSEDTVNDFSGFEGIWL GEANNDYDSMEFDAEGNWTLYLSGEVMDDGYLRYEPEWEAIYAYSNLDDSGSRIDMEDGQ LYSAAYGYFNYGEGMEYLWHEDDEPDPSGDDVSDWNGRLDGNPDSYWSWDSDLCQRNVSE FKGIWYYDGDLSAETYIVIDGNGNWSYYQRAPGAEAAEMDYGTFSYSTDEVSTYYADSTL YDGVSYRVFEFDEDALVWGDEGVYYLME >gi|157101628|gb|DS480696.1| GENE 62 52884 - 53819 814 311 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940676|ref|ZP_02088019.1| ## NR: gi|160940676|ref|ZP_02088019.1| hypothetical protein CLOBOL_05570 
[Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05570 [Clostridium bolteae ATCC BAA-613] # 9 311 1 303 303 582 100.0 1e-164 MRQRRHLLMFALALLCLSFAACGAKQYPMQITVVNRSTYPITDLRISLISEEDWGENRLE TTLEEGESAGIDLGEYTEEQLNAGFHLQFYGEDGQPVNPDYDPSSPTFFENGDFLIFSPP DLSIALFLDTAYDPETYDQKIVELYAADGDGRGDVISQNGIPVLMGGALPFTNMQNLLSE NHDNGTYRYEDITEDGQLLVVNAAEQSCFVPDVQELDDYLTACALSLSDTDTYELLSAEE NEEYSKNLSYPVYIVTYTAGENEDTREWTVFAMDTDVCTYLYGFCALPDAAADMSEIYQN IFSQLSLSDEE >gi|157101628|gb|DS480696.1| GENE 63 53848 - 54768 785 306 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940678|ref|ZP_02088021.1| ## NR: gi|160940678|ref|ZP_02088021.1| hypothetical protein CLOBOL_05572 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05572 [Clostridium bolteae ATCC BAA-613] # 1 306 1 306 306 583 100.0 1e-165 MKKLFALVWTLFMCLSLAACGNTESDDTQSGYETVMDMEFLDGMWSIDGTAKLYFDSTEG FYAYRTRWGLGGRGEFELSEASGRPMIFFNGLLYNFLLRDDGVLLPNRNGEGSGLTIHRN TFLHDDEAEIVEWDISNWDGMWQNALGETIVINSSLMQYIACSPDYSMSGTMNDEGEGMG LYLYDNGTRAYLCPGDDGNSFTLSAERFGRYGDDQHFDGVFYRNADFYAYTDMENAEFYE DEYSTFYIWYYDGVNRYLLGNDYTIGEDGLAYHDEDGLIYPAGWIPEEPYDPAVDWGANW MDSWDI >gi|157101628|gb|DS480696.1| GENE 64 54797 - 55567 1008 256 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940680|ref|ZP_02088023.1| ## NR: gi|160940680|ref|ZP_02088023.1| hypothetical protein CLOBOL_05574 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05574 [Clostridium bolteae ATCC BAA-613] # 1 256 24 279 279 461 100.0 1e-128 MKYLKKHLSLALALVLCLSLAACGKSSEDYVESNEDYGDDGSDYNLYGVPDSFEGLVKDS EQMANLYLYGGAWTGEDGGTLVAANNDEGDEVRFALYDADEDMTASGFIQFVPDYDYDYF YNEHDGMAYRSWFDEDGALHVAELGTFTQVTGDAPGENIGDGDYGFFAGVWYQDGDPGAA SIIEFDASGSWELLERPGDDGDPTMVDCGTIEVDQDGAGQYFAISTIHTDVVYDFTLVGG DVIYWGGEYDYYQKIE >gi|157101628|gb|DS480696.1| GENE 65 55599 - 55838 212 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940681|ref|ZP_02088024.1| ## NR: gi|160940681|ref|ZP_02088024.1| hypothetical protein CLOBOL_05575 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05575 
[Clostridium bolteae ATCC BAA-613] # 1 79 1 79 79 158 100.0 1e-37 MNRRYLTSVTPLPNHVLQVDFVSGSRLLLDMKPYLDKIRFRPLADARVWNSAVTNGIFVR FGDVELSHDEILAMAEQEH >gi|157101628|gb|DS480696.1| GENE 66 55933 - 58404 1571 823 aa, chain - ## HITS:1 COG:PA3921 KEGG:ns NR:ns ## COG: PA3921 COG2909 # Protein_GI_number: 15599116 # Func_class: K Transcription # Function: ATP-dependent transcriptional regulator # Organism: Pseudomonas aeruginosa # 103 309 133 345 906 71 28.0 7e-12 MPKPKWNLNTIYISERLQESLRPISRCAMTTVVAPMGYGKTTAVNWYLGERAKAETLKII RISIYSGNLAIFWKSVQEAFAHAGFPFLREYPCPTDAAGGGLLVDDLCHVLAGETSCYIF IDDFHLLTDRRASLFLCMLANRLPINVHLIVASRDRFLPAAETVRLGAKVYQIGTEQLRL NHTELAVYAHRCGTALSEAQIESLLYSSEGWFSAVYLNLRTLSERGVLPSRDSDIYATFT AAMIDPLPEKQREFLAVMGLADEFTVEMAQCVTGNGDAGQILSVLTEQNAFVTHLPDGVT YRFHHMMKECAECCFLAMEAETQRLYWERFGLWYEQHRQYLHAIAAYRKSENYDALLRVI RSDAGILLASLKPEDVLDALEKCPAETLKAYPFAILVLMRRMFTWRQIPKMLELKALLLT AIEEHPELSEEERGNLLGECDLILSFLCYNDISAMSRLHRSASVQMSRPAISIQNSGGWT FGSPSVLMMFYRAPGELESELAEMDECMPHYYKVTNHHGQGAETIMRAEALFCQGRFTDA HIELERAYAQVKDNGQINMALCCDFLSWRLSLHTDVEQRYTFEERYAALLRYHDASWINL WNATSAYYHALRGETDSIPEIFSQHRLSDINMLAPGKPMMEMIENQVYLAQGAYVRVVGY SAELLPVCEAMHYALVALHLRIQTAVAYEKLGKSEEAHAWLRKALSDAEPDGLVMPFVEN YDILKPLLEREVKNDLLSKIIELGEAATARNAAGTRPAAFDVLTPREFEIVELMAQRLSN REIAEQLFLSEGSVKQYVNQIYSKLHIEGDTRTKRKLLAELLR >gi|157101628|gb|DS480696.1| GENE 67 58649 - 59551 428 300 aa, chain - ## HITS:1 COG:ECs0602 KEGG:ns NR:ns ## COG: ECs0602 COG5433 # Protein_GI_number: 15829856 # Func_class: L Replication, recombination and repair # Function: Transposase # Organism: Escherichia coli O157:H7 # 2 261 128 344 378 89 27.0 1e-17 MLLNVVETVRGLMLAQLPVDSKTNEITVIPELLKLLDISGSIVTIDAVGTQTAIMEQIHE QGGHFALTVKKNQPEAYEEIHTFMDKLEAADVQRKKGEVLDSGMREYLEKYEEIIRIEKN RDRNEYRTCQICKDASNLTKSQKEWPHVQSIGRIKQVRIPSEKDSHGNDVTPSKEEFLEK GSRRVPAPSAEEGTGKDVQCTALISDLILTAEELGSIKRMHWSIENRLHHVLDDTFREDR SPAKKSRNNLSLIRKYAYNILRLAMYETGLADIMTEMMDCFCDNAALRERYVFQGIASLY >gi|157101628|gb|DS480696.1| GENE 68 59934 - 60515 176 193 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|160940684|ref|ZP_02088027.1| ## NR: gi|160940684|ref|ZP_02088027.1| hypothetical protein CLOBOL_05578 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05578 [Clostridium bolteae ATCC BAA-613] # 1 193 36 228 228 370 100.0 1e-101 MPVTEKKYPDWVQKHRTKGTTVKKRGDSYYLYKRTSRRVKGKKYPQPVDTYIGVITPEGV IQSNKRKVDLTDAEVWEYGFSKAVWELCPDDWKKPLGDDWQDVLSIILLKQSPTSYIQKT RMMKKESDFRYQFAAQTASLSRRIYKKRGISLEELRQLGTIYLVCLDKTEIISKINEEQQ GLLEKIQVALEMC >gi|157101628|gb|DS480696.1| GENE 69 60785 - 61591 403 268 aa, chain + ## HITS:1 COG:no KEGG:ELI_4256 NR:ns ## KEGG: ELI_4256 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 8 250 1 252 267 73 23.0 1e-11 MAVKKIELEEVAALTKELFHAHYAGDLEQWFSYLCPDSVYLGTGEPLLFGGDAIREHFKG FSGKAINIIQEEYFPLSLSDNAVQVCGQIFLESLEKSFRIINRFTISYRIIGGELKMVHQ HNTYEYMQPSESRILNLDMNTMQFVRSLLLDRPSGRRMPVRSGTQTIFVNPNTVLYVQSQ RRKTEFVCIDRVISCNSSIGEIGMELPDFFYPLRRGYLVNTLFIVAIRRFEVELISGICI PIPALTYQQVKQDLLRKQSLPPLNLSDK >gi|157101628|gb|DS480696.1| GENE 70 61635 - 62651 689 338 aa, chain - ## HITS:1 COG:CAC1228 KEGG:ns NR:ns ## COG: CAC1228 COG1961 # Protein_GI_number: 15894511 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Clostridium acetobutylicum # 23 314 239 517 544 73 25.0 7e-13 MDDKVPIIRVKSNTECDVNYYSWGSARISYILRNPFYKGAHLVCRTHQKGIRSNTYDIIP REDWEIIEDCHEAIISPEEWEQVQEIIDRRPTIMKGNSCPFYNLFHGIIYCATCGKSMQV RYEKVGRTGKNRFTGEQREPIDKAYYICQTYNRLGKNACTSHKIEARDLYNLVLNDIQEL AKTALKDADAFYQRLSSRMERRYLLDASQTQKECQRLESRNREIDEMFLNLYTDKAKGIL TEQRFMKLTATLEQEQEANQNRPQDLMLMMRHSDEQENEVRTFIKEIRRYAAIEELDEAV LNRLIGKILVGEVEKIDGQKVQEVRIVYNFVGEIPVEG >gi|157101628|gb|DS480696.1| GENE 71 62707 - 62886 102 59 aa, chain - ## HITS:1 COG:no KEGG:ELI_0217 NR:ns ## KEGG: ELI_0217 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 53 121 173 228 106 92.0 3e-22 MRYRQAEQKCFRVLCDYRHLLERWKEEYAPKQPEDGWSPLFVEALHRQPYIEYLLDLLL 
>gi|157101628|gb|DS480696.1| GENE 72 63027 - 63248 67 73 aa, chain - ## HITS:1 COG:no KEGG:ELI_0217 NR:ns ## KEGG: ELI_0217 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 65 1 65 228 130 93.0 2e-29 METGCLLHCLEVEETNVFEAVEQSVTTRQAALVYGISVGRNGMACCPFHDDRHPSMKVDR RLPLFLPVRQMEM >gi|157101628|gb|DS480696.1| GENE 73 63516 - 65258 256 580 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 339 559 132 353 398 103 35 4e-21 MFGEKFQRKYALTDQGVRNTKKGTFWTVIVNLVVMGGVSILYLVMSGFMGILTEGSPLPG SVLVIGALVLFILLSFVTHLQQYKATYGLVYNEIKTTRLSLAERLRKLPLGYFGKRDLAD LTETIMGDVNRMEHVWSHVLGYLYGAYISTAIIAVCLFVYDWRLAIACLWGVPVAFGLLF GSRKIAACNAERTKKAAVRVSDGIQEALENVREIRATNQEERYLNGLNQKIDEHERVTIQ GELGTGLFVNAASVIMRLGVATTILVGANLILSGSIDFMLLFLFLLVITRVYAPFDQSLA LIAEMFVSQVSADRMNEIYDTPTAEGAEKFEPKGYDIVFEHVGFSYDEKEVLHDVSFTAK EGEVTALVGSSGSGKSTCARLAARLWDISKGVIRVGGVDISTVDPEVLLRDYSMVFQDVV LFDDTVMENIRLGKRGATDEEVRAAAKAANCDEFVHRLPQGYNTPIGENGAKLSGGERQR ISIARALLKDAPIVLLDEATASLDVENETKVQGALSRLLVGKTVLVIAHRMRTVEAADKI IVLADGRVAEEGTPAELMNKNGLYHRMVDLQRQSAGWRLN >gi|157101628|gb|DS480696.1| GENE 74 65259 - 67061 249 600 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 351 574 132 354 398 100 30 3e-20 MKQEKKNDVAVLLDYAGSRRGLTFLGLALSALSMLFSMAPYICIWLTARNLIAAAPEWTQ AQNISKYGWIAFAFAVGGIILYFAGLMCTHLAAFRTASNIRKQGVAHIMKAPLGFFDSNA SGLIRGRLDAAAADTETLLAHNLADIVGTIVLFLSMLVLLFLFDWRMGAACFLSAVISVI AMFSMMGGKNAKLMAEYQAAQDRMTKAGTEYVRGIPVVKIFQQTVYSFKAFQEAIEDYSN KAEHYQADVCRVPQSVNLTFTEGAFVFLVPVALFLAPSALANGSFAGFVSDFVFYAVFSA IISTALAKIMFASSGIMLASTALGRINQVMEAPTLKAPSHPQTPRSNKVEFKDVSFTYNG AESPALSHVSFTAQPGQTVALVGPSGGGKTTAASLIPRFWDATSGVVEVGEVNVAQIDPH VLMDQVAFVFQNNRLFKASILENVRVSRPKATREQVMAALMAAQCKDILDKLPDGLDTQI GTEGTYLSGGEQQRIALARAILKDAPIVVLDEATAFADPENEVLIQKAFATLTKGRTVIM IAHRLSTVVGADKIIVLEEGHIAEQGTHAELTASGGLYARMWTDYNKAVQWKITSEKEAE 
>gi|157101628|gb|DS480696.1| GENE 75 67197 - 67766 197 189 aa, chain - ## HITS:1 COG:CAC0821 KEGG:ns NR:ns ## COG: CAC0821 COG1309 # Protein_GI_number: 15894108 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 120 17 136 200 64 33.0 8e-11 MDEFLEKGFLGASLRQIVKHAGVTTGAFYGYFSSKEALFASIVEPHAAILMSKYVNAHIS FSELPAEEQPEHMGVESGAYIHWMVEYVCQHREPVKLLFCGAEGTSYENFIHNMVELEVE STFRYMETLCRLGYSIPELSPSLCHIIASGMLGGLVEIVVHDIPQEQALRDVEQLREFYT AGWLKLMSP >gi|157101628|gb|DS480696.1| GENE 76 68112 - 68723 249 203 aa, chain + ## HITS:1 COG:CAC0567 KEGG:ns NR:ns ## COG: CAC0567 COG0500 # Protein_GI_number: 15893857 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 8 203 12 209 209 136 35.0 2e-32 MNFFENTRKPQGFGGKLMAKMMNSGHARVSQWGFSNISAEPDAKVLDVGCGGGANIAIWL DKCRNGHVTGLDYSEISVAESQKLNIAAIKQGKCRVLQGDVSSIPFSDEVFDYVSAFETV YFWPGLKKCFSEVNRVLKSGGTFLICNESDGTNASDEKWTKIIGGMKIYNRDQLVAALKE AGFTEIKTYINAKKHWMCIAATK >gi|157101628|gb|DS480696.1| GENE 77 69194 - 69400 183 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940694|ref|ZP_02088037.1| ## NR: gi|160940694|ref|ZP_02088037.1| hypothetical protein CLOBOL_05588 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05588 [Clostridium bolteae ATCC BAA-613] # 1 68 47 114 114 122 98.0 7e-27 MIQKKTFKVEGMHCEHCSNRVMEAVNSIPELSAKVKLKQGLVIISYAEPVEDNLIKEAIE RIGYKVVD >gi|157101628|gb|DS480696.1| GENE 78 69575 - 69736 143 53 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0146 NR:ns ## KEGG: EUBREC_0146 # Name: not_defined # Def: putative permease # Organism: E.rectale # Pathway: not_defined # 1 53 737 789 789 68 58.0 7e-11 MMQGYSGWFDFVMSASRYIEMFAFILMGYLFVVVLSFKRIKKIPMDEALKSIE >gi|157101628|gb|DS480696.1| GENE 79 69962 - 71989 1153 675 aa, chain - ## HITS:1 COG:SP2101 KEGG:ns NR:ns ## COG: SP2101 COG2217 # Protein_GI_number: 15901916 # Func_class: P Inorganic 
ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pneumoniae TIGR4 # 83 670 97 682 687 376 35.0 1e-103 MIGERISCEQADILLYYLHNIKEIQAVKVYDRTADVTISYIGERADIIHILRRFQYENVK VPNGLLENSGRELNNTYQEKLIGKVISHYARKLLLPYPLQACYTTVCAAKYIWKGIGNLV KGKIEVPVLDATAIGVSMLRNDMNTASSIMFLLGVGELLEEWTHKKSVDDLARTMSLNIS KVWLVREEQQVLVSVDEIVAGDRVVVHMGNVIPFDGLVVSGEAMVNQAALTGESAAVRKS QDSYVYAGTVVEEGEVTVLVKQVGGTGRYDKIVTMIEASEKLKSGVESKAEHLADRLVPY TLAGTAFTYLLTRNTTKALSVLMVDFSCALKLAMPISVLSAIREASLYNITVKGGKYLEA IAEADTIVFDKTGTLTKAKPSVVDTVSFNELSSDEILRMAACMEEHFPHSMAKAVVDEAS KRNLEHAEMHSKVEYIVAHGIATTIGDKRAIIGSRHFVFEDEMCRVPVGKEAIFEQLPKE YSHLYLAVENELAGVILIEDPLREEAAELVNALRKAGLSQIVMMTGDSERTAAAIAERVG VDNYYSEVLPEDKANFIEEAKANGHKVIMIGDGINDSPALSAADVGIAISDGAEIAREIA DITVGSDDLLQIVTLKMLSDSLMKRIHKNYRTIVGFNTLLILLGVGGVLQPTTSALLHNS STLLIALKNMRNLLS >gi|157101628|gb|DS480696.1| GENE 80 72112 - 72444 319 110 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00577 NR:ns ## KEGG: EUBELI_00577 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 9 95 28 113 114 103 71.0 3e-21 MLKCLECINWKKTGIFAAGVAFGTAGIKILSSKDAKKVYTNCTAAVLRAKECVMKTVSTV QENAEDIYADAKAINEERAAAEEAAEFDSLDDVTEETLHETLNETEAAAE >gi|157101628|gb|DS480696.1| GENE 81 72719 - 73063 213 114 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00576 NR:ns ## KEGG: EUBELI_00576 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 106 28 131 142 94 49.0 1e-18 MLLLDIILNGQYSSCKEIYCEVHKKDSNIGIATVYRMVNTLEEIGAISWKKLSQIHCENW DGKEKICRIVLDDETEIELSQQKWKRVVECGLKCVGFVENQELKSLELKDENTE >gi|157101628|gb|DS480696.1| GENE 82 73262 - 73417 173 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MADVIIVLIILAILAFSICYIVKEKKKGYACIGCPNASNCKKHCHCGELKE >gi|157101628|gb|DS480696.1| GENE 83 73436 - 75424 1433 662 aa, chain - ## HITS:1 COG:L190009 KEGG:ns NR:ns ## COG: L190009 COG0370 # Protein_GI_number: 15672169 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # 
Organism: Lactococcus lactis # 3 644 4 697 709 453 37.0 1e-127 MRIAMTGNPNSGKTTMYNALTGRSEHVGNWAGVTVDKKEHPVKKNYYEGEEELVAVDLPG AYSMSPFTSEESITSSYVKNEHPDVIVNIVDATNLSRSLFFTTQLLELGIPVVVALNKTD INAKKDTKIDEKALSDKLGCPVIKTISTTGNGLREVMQAATSVAGKGQKAPYIQGNIDLT DKKEVEEADRKRFSFVNEIVSKVEVRKTLTKEKSKDDAIDIIVTHPVWGLITFAAVMWLV FYISQSTLGTWLADILTGWIETFQNFVASKVENASPILQAILVDGIIGGVGAVVGFLPLV MVMYFLIALLEDCGYMARATVVLDPIFKRVGLSGKSVIPMVIGTGCGVPAIMACRTIKNE RERRATAMLCTFMPCGAKLPVIALFAGAFFPDSKWVGPVMYFVGIALILLGALLIKQITG MKYRKSFFIMELPEYKVPSVVFACKSMLERGKAYIIKAGTIILVCNMVVQIMATFTPGFV VAAEDSSDSILAMIASPVAVLLIPVVGFASWQLAAAAITGFIAKENVVGTLATVFCITNF IDTDELALVSGGSEVAAIMGLTKVAALAYLMFNLFTPPCFAALGAMNSEMKSGKWLAGAI GFQLATGYVIGFVVYQAGTLITTGSLGAGFIGGLIAVLIIAAFITYLCIHANKKLKAEYA LD >gi|157101628|gb|DS480696.1| GENE 84 75545 - 75757 197 70 aa, chain - ## HITS:1 COG:no KEGG:bpr_I2636 NR:ns ## KEGG: bpr_I2636 # Name: feoA1 # Def: ferrous iron transport protein A FeoA1 # Organism: B.proteoclasticus # Pathway: not_defined # 1 70 1 70 70 78 54.0 1e-13 MKLTDATEGKEYIVNSINTKDEELNAFLFSLGCYSGEPITVISRKHGGMVVSIKDGRYNI DSQLSDAIEV >gi|157101628|gb|DS480696.1| GENE 85 75989 - 76354 241 121 aa, chain + ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 5 118 3 117 122 108 55.0 2e-24 MKRTESQENYLETILLLSKKKPVVRSVDIANELGFSKPSISIAVKNLKEHEYITVTPEGY IYLTQTGFQIAQNVYERHQVLTDALMKLGVPAETAEEDACRIEHIISDATFTAIKEHLHT K >gi|157101628|gb|DS480696.1| GENE 86 76393 - 76764 202 123 aa, chain - ## HITS:1 COG:no KEGG:Closa_0593 NR:ns ## KEGG: Closa_0593 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 123 62 183 183 135 51.0 4e-31 MLIYVYRPNCLQVDFTNEDLLEMLRNLKYPVSNPERCIVCLAQRLRECEVFPHEIGLFLG YPPEDVKGFIENKAQNYKIVGTWKVYGDEREAEKRFNRFRKCTADYCMRMEKGSTLDKLA VAL >gi|157101628|gb|DS480696.1| GENE 87 76967 - 77398 356 143 aa, chain - ## HITS:1 COG:CAC0587 
KEGG:ns NR:ns ## COG: CAC0587 COG0716 # Protein_GI_number: 15893876 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Clostridium acetobutylicum # 1 142 1 141 142 126 50.0 1e-29 MSNVYVVYWSGTGNTQEMAMAIADGAKENGAEVTVWNVDEADIEELKKAKALALGCPSMG AEQLEEASMEPFMCDLDSQINGKTIALFGSYGWGDGQWMKDWEDRIASDGATVLNGEGII ANNEPSQEIIDECREAGKALANM >gi|157101628|gb|DS480696.1| GENE 88 77980 - 78207 116 75 aa, chain + ## HITS:1 COG:no KEGG:Rumal_0348 NR:ns ## KEGG: Rumal_0348 # Name: not_defined # Def: transposase IS116/IS110/IS902 family protein # Organism: R.albus # Pathway: not_defined # 1 67 1 67 391 86 62.0 3e-16 MIYVGIDVAKDKHDCFITNSDGEVLFKSFTISNNREGFKTLFQRIGSVSDDLTKAKVGLE ATGHYDYRLYSRSKK >gi|157101628|gb|DS480696.1| GENE 89 78337 - 78495 73 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940707|ref|ZP_02088050.1| ## NR: gi|160940707|ref|ZP_02088050.1| hypothetical protein CLOBOL_05601 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05601 [Clostridium bolteae ATCC BAA-613] # 1 52 1 52 52 75 100.0 1e-12 MRLRKIWAGGTFAVLKREHKLNKIQKRGLQKAIEEYLLSATALNLKRLVKAV >gi|157101628|gb|DS480696.1| GENE 90 78677 - 79045 361 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940708|ref|ZP_02088051.1| ## NR: gi|160940708|ref|ZP_02088051.1| hypothetical protein CLOBOL_05602 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05602 [Clostridium bolteae ATCC BAA-613] # 1 122 1 122 122 206 100.0 3e-52 MAKKNEFIFNIGFNRENPDHVKVAGILNRLGHGKAEYLARAVLAYESRGLHRGNTSVGID YRELEQFVRKILEEQNQNRFEEEYGNQNEGKRYKSDEMGVDEKEMKEDDIAGIFMALQDF RK >gi|157101628|gb|DS480696.1| GENE 91 79051 - 79968 734 305 aa, chain - ## HITS:1 COG:no KEGG:Closa_1117 NR:ns ## KEGG: Closa_1117 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 302 4 301 305 278 48.0 2e-73 MYQEKNIAIDHGNRNMKTENLIFTSGLTVSEKQPPFGDYMSYQGKYYALTGQRIPYQRDK TQDERFFYLTLFSIVKELERENLSDRLIQVNLPVGLPPKHYGGLYRKFEKYLSRDEIQNV 
TYRGKRYSFCINDVVAFPQDYAAAMTIYGKLKEYVKVIVIDLGGFTLDYLMLRNGQPDLG VCDSLDNGVIKLYNSISSRINAEYDVLLEETDMDQMILGKDTDYEPGIVRIVQEMAETFT EDFLGSLRERNMDLKSGCVVFVGGGASLLERYIRSSGKVGKSIFIPELQANAKGYGILYR LQKGR >gi|157101628|gb|DS480696.1| GENE 92 80054 - 81301 615 415 aa, chain - ## HITS:1 COG:no KEGG:Closa_1104 NR:ns ## KEGG: Closa_1104 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 410 6 406 595 286 38.0 2e-75 MREDLPFNMADVAGLVGLQIKRRGSISWEAYCPFCQDEKGKMNLNLKKNVFRCNRCGESG GMLDLYGKLYHVDHGTACSEIKELLGRETEKKPYKIQAKKVEQKIPEIPQAEAAGIEACN KTYTRMLEMLPLAESHRTNLLERGFSEEKIEENGYKSTLVFGYRKLAERLLKAGCTLQGV PGFYQDKEGRWTVNFSAKNAGFLVPVRNMDGLITAMQIRLDHPYDGRKYVWFSSANYQQG ITSGSPVHFVGQEGQACVFVTEGPLKADLSHYLSGRTFAAVAGVNLYGNLSPVLERLKQT GTKKIYEAYDMDKHLPVLCRRDYKEEVCSKCELRENGFGSIDCPRKKIKLDNIQRGCRNL YRLCEENGLESSSLTWDMNEKGIWKEKRKGVDDYLYALKKKEESEKQRKAGTGLL >gi|157101628|gb|DS480696.1| GENE 93 81369 - 82388 865 339 aa, chain - ## HITS:1 COG:no KEGG:Closa_1103 NR:ns ## KEGG: Closa_1103 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 231 1 223 427 97 34.0 7e-19 MSEQILNQGSMYESIQGIQELNQVEGFDPRKYMRQLAGDGQGVKYYLDVVYRKLWFRLKY PNGKIVKKLIKLTDQVAIVEAKVYLDRNDPEESFIANALAQKYLTGDDQFGNKYVELAET AAVGRALADAGFGLQFADLEGETDPAIVDAGIEHFADAGTENVQDAGTFQEMKTGAVPDE EVQLPGQYDIRDYMDQNENQQYADTTAVNTAGSIPPAMQMAGGGSAGSVQKPVQSQTQNR PQNPLQSSQNPVRDRQAMVQGALSNIQREQPVEQIYKMLNQEMAVAVVISAGFYKGKTLG QLALEKPQSLNWYVNDYKGPDNLLRAAAKYLLDAAQNAA >gi|157101628|gb|DS480696.1| GENE 94 82462 - 84723 1445 753 aa, chain - ## HITS:1 COG:CAC2854 KEGG:ns NR:ns ## COG: CAC2854 COG0507 # Protein_GI_number: 15896108 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Clostridium acetobutylicum # 1 725 4 735 739 361 35.0 3e-99 MRCRYEETRFRNENDGYCVLCYSTEDPEVPEGARNKYRGEDRFIFTAVGTELPAKKDQEV ELNGRWVKSKYGLQLSVESFTEIRPQTEEGIRSYLSSGMIRGIGPKMAEQIVGRFGIRTF 
EILDYYPDSLLEIKGITPKKLEGILTSYQGEHMLRDLAAFLSPFQITPKKIRKIYETFGN EALEIVQNQPFSLCQISGFGFLTVDEIARKTGCSPKDPMRIEGCIEYCMEQENQEGHLFQ EKLLFQEKVYEKLNYGYEKGTVTELEIAQSVYKMVEAGKLHFESGVLYAEKYYRYEHEAA KALAALLMQKEKEPEQLEEFLSKAQEELGICLAAKQKEAVKNAFSFLFTIITGGPGTGKT TVEKVILYIHEKLRGGSVLLMAPTGRASRRMAECTGCTDASTMHSALGLVSEEMESESCD FLEADLILVDEMSMVDMRLAYEFFTRIKRGTRVVLIGDVNQLSSVGPGNVFRELIQCGAV PVTVLDQIFRQGKGSLIAANAYKMLNNSAALEYGEDFVFLPADNAECAAEIVEREYRRMT AELGIDQVQVLTPYRKSGAVSVNALNERLWNLVNPKMPGKAEMKVRGRIFREKDKVIHNK NKNEISNGDTGFITGIYLDEDNLEVSRIEFPDNRCVEYSSEDMEMIEHAYATTVHKSQGS EYSVVILPWMPMFYKMLRRNILYTAITRAKVKLILVGSKQAVYRAVHNTESDKRNTRLGE MILKERYALESREKIVPLPDADRTKYEQIAMNF >gi|157101628|gb|DS480696.1| GENE 95 84707 - 85099 270 130 aa, chain - ## HITS:1 COG:no KEGG:Mahau_2838 NR:ns ## KEGG: Mahau_2838 # Name: not_defined # Def: hypothetical protein # Organism: M.australiensis # Pathway: not_defined # 3 110 32 134 141 76 37.0 4e-13 MLKKGMYYICSPLSASTKEEIQKNMLQARHYMESISQSFSCRAIAPHAYLPELLDDQSQR ERSLGLAIGQVLLSWCDGLIVCGDVISSGMKKEIEAAEKKQIPIYFYRENRRGFYISTYR KQERKNEMQI >gi|157101628|gb|DS480696.1| GENE 96 85100 - 86137 875 345 aa, chain - ## HITS:1 COG:BS_yqaJ KEGG:ns NR:ns ## COG: BS_yqaJ COG5377 # Protein_GI_number: 16079682 # Func_class: L Replication, recombination and repair # Function: Phage-related protein, predicted endonuclease # Organism: Bacillus subtilis # 8 341 5 316 319 145 32.0 1e-34 MSLNLNYQPEVLVSTEGLTEEEWLNYRRKGIGGSDAAAIMGQSPFCTKRDLYYDKLGIKS VLEEEESNWVAKEVGHRLEDLVAQIFSAKTGLTVFPIRKMFRHPLYPFMVADVDFFIEFP DGTFGILECKTTNYHCQDKWANNSVPVNYEYQGRHYMSVVNLNVVYFACLYGNNEDEFII RRMDRDLDSEADLIAEEENFWMENILKRTEPPYIEKPDLVLESIRKFCGPAEPDNDAIKL GSSYLSFVERYLELKDKKSEIDAEAKRLETEMKTAYCKVVEAMGVNCKAHLQGTAESYII SYNPVYRTGIDKKNLERLKLHHPDIYDDYVTVSESRRFSVKRGAA >gi|157101628|gb|DS480696.1| GENE 97 86223 - 87152 581 309 aa, chain - ## HITS:1 COG:no KEGG:Closa_0805 NR:ns ## KEGG: Closa_0805 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 20 298 18 299 321 150 31.0 6e-35 
MNENMDCRVMKTCQEVQKLLGEDYEVQIQKTMKNNHCERLGITVRERNEKIGRTFYLEEY FQRNIDVETVAHQIAESMLESIPMPLMKWAESFNLHNLGNLKNLIRFKLVNYDANKEELA ERPYIRYLDLAIVFYIVIDSNPDGVMTTKIFNYNLADWKITEEELFKIARQNMERQFPVC VHTLKNVLLEGILEHGEEIDEEDVFLIKEMAKDAPALYVMTNRSQTFGASCMLYDSALED FAKSQGCNVVILPSSIHEVLLLKHEKRMDFKDMREMVMEINRNEVPEEDVLSDSVYLYDW KKKEIQRVE >gi|157101628|gb|DS480696.1| GENE 98 87166 - 88323 835 385 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940717|ref|ZP_02088060.1| ## NR: gi|160940717|ref|ZP_02088060.1| hypothetical protein CLOBOL_05611 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05611 [Clostridium bolteae ATCC BAA-613] # 1 385 1 385 385 786 100.0 0 MNEARMYADGFETMFAKQEDFMAFLKQIGRVSSWDRRRSKDLRLIAFEADSKIAGELQEQ YDRNGQDPGILEDTLQNTGMLLKVKRQYYPVRSCAIQSILNRAGISGPALRKVDKNVYAR ILNDCLKVAKGDALLRFSAGKVSAVLGGDGHDYAVLDMEQIFAMSVEYLQKTFPGCSYLG GFYDHTRASAIWEIGGEEKLLSVYKAELAAYGLNPTDMELKPVIRVTTSNTGDSGVNIYP MLMFGAGKKSIALGDPLRLEHKNGATLQKFEEQLKLTYGKYQLAVGKLSRLLMIPIYHPI NCMVGVMKRLDVPKRYAMEAADMFKSQYGEDPCTAHELYYGISEVIFMLETEGESGSRIT KMEEKIARALGINWKDYDLAQEVKW >gi|157101628|gb|DS480696.1| GENE 99 88885 - 89148 157 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940720|ref|ZP_02088063.1| ## NR: gi|160940720|ref|ZP_02088063.1| hypothetical protein CLOBOL_05614 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05614 [Clostridium bolteae ATCC BAA-613] # 1 87 1 87 87 160 100.0 2e-38 MWFFHTIITAEVILIITKKQLDLCQYDDFRKYSIEEISQMLGIGKTKTRELLQSRLLPVT KVGRDYFTSPQAIQEFLKKNIGKELYF >gi|157101628|gb|DS480696.1| GENE 100 89396 - 90421 494 341 aa, chain + ## HITS:1 COG:no KEGG:Rumal_3273 NR:ns ## KEGG: Rumal_3273 # Name: not_defined # Def: integrase family protein # Organism: R.albus # Pathway: not_defined # 13 335 10 315 324 149 37.0 1e-34 MILTGADASAKLTVAEYGKKYLYYKEQQVKRKKLKNTTYDRLERTFTNQILSSEIANTLM SNLEGTAIQDAIDDMSEELSFSTIKKAYLFLQAMIGYGKNIKDFPDDYDPLSMVELPDET ALNVKTKEIEIIPDESLDILKKVAMELKPDGSLAYRYGPLIIFGLNTGLREGELLALSKK EINMLNGRRCYHVSETVSTVNNRDKDPKTKTKRILTPPKYPRSVRNVPLNKEADACLQIM 
LDTYGDHKFRNDLIVATQNGKLPTSRNIQTSFDRILKKAGLPHYGTHALRHTFATRLLRK TQSHQEIKAVAELLGDDYHVVVKTYLHTEEEGKSTLVDLIA Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:20:36 2011 Seq name: gi|157101627|gb|DS480697.1| Clostridium bolteae ATCC BAA-613 Scfld_02_38 genomic scaffold, whole genome shotgun sequence Length of sequence - 66617 bp Number of predicted genes - 93, with homology - 88 Number of transcription units - 38, operones - 17 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1026 - 1385 226 ## PROTEIN SUPPORTED gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 + Term 1521 - 1590 19.5 + Prom 1593 - 1652 6.7 2 2 Op 1 . + CDS 1675 - 2322 266 ## Closa_3160 transposase 3 2 Op 2 . + CDS 2223 - 2414 88 ## Rumal_1979 putative transposase 4 3 Tu 1 . + CDS 2551 - 3027 238 ## COG3436 Transposase and inactivated derivatives + Term 3058 - 3102 -0.4 5 4 Tu 1 . - CDS 3487 - 3621 68 ## - Prom 3717 - 3776 7.6 6 5 Op 1 . - CDS 3858 - 4364 79 ## COG3981 Predicted acetyltransferase 7 5 Op 2 . - CDS 4448 - 5506 390 ## COG3436 Transposase and inactivated derivatives 8 5 Op 3 . - CDS 5460 - 6035 146 ## Closa_3160 transposase 9 5 Op 4 . - CDS 6060 - 6344 308 ## - Prom 6375 - 6434 7.0 10 6 Op 1 . - CDS 6437 - 6796 232 ## PROTEIN SUPPORTED gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 11 6 Op 2 . - CDS 6786 - 7127 125 ## gi|160940736|ref|ZP_02088078.1| hypothetical protein CLOBOL_05630 - Prom 7153 - 7212 3.2 - Term 7239 - 7287 -1.0 12 7 Tu 1 . - CDS 7402 - 8853 1350 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain - Prom 8964 - 9023 5.5 13 8 Op 1 . - CDS 9037 - 10215 1415 ## Closa_3400 protein serine/threonine phosphatase 14 8 Op 2 1/0.000 - CDS 10215 - 12248 1927 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 15 8 Op 3 . 
- CDS 12251 - 12490 315 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit - Prom 12539 - 12598 5.1 + Prom 12572 - 12631 5.7 16 9 Op 1 21/0.000 + CDS 12656 - 13837 1234 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 17 9 Op 2 . + CDS 13988 - 14611 565 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 18 9 Op 3 . + CDS 14626 - 15234 220 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein + Term 15269 - 15319 3.3 - Term 15257 - 15307 3.3 19 10 Tu 1 . - CDS 15384 - 15836 383 ## COG3546 Mn-containing catalase - Prom 15889 - 15948 5.5 + Prom 15845 - 15904 7.1 20 11 Tu 1 . + CDS 15924 - 16379 233 ## gi|160940745|ref|ZP_02088087.1| hypothetical protein CLOBOL_05639 + Term 16387 - 16423 6.5 - Term 17135 - 17179 4.5 21 12 Op 1 . - CDS 17182 - 18234 50 ## COG3093 Plasmid maintenance system antidote protein 22 12 Op 2 . - CDS 18266 - 18454 69 ## Swol_0482 hypothetical protein - Prom 18592 - 18651 10.1 23 13 Tu 1 . - CDS 18879 - 19652 650 ## Cphy_0799 hypothetical protein - Term 19665 - 19699 4.2 24 14 Op 1 . - CDS 19719 - 19922 181 ## gi|160940749|ref|ZP_02088091.1| hypothetical protein CLOBOL_05643 25 14 Op 2 . - CDS 20008 - 20499 505 ## Cphy_0798 toxin secretion/phage lysis holin 26 14 Op 3 . - CDS 20519 - 20839 83 ## Clole_2737 hypothetical protein 27 14 Op 4 . - CDS 20919 - 21665 485 ## gi|160940752|ref|ZP_02088094.1| hypothetical protein CLOBOL_05646 28 15 Op 1 . - CDS 22035 - 23054 645 ## gi|160940753|ref|ZP_02088095.1| hypothetical protein CLOBOL_05647 29 15 Op 2 . - CDS 23066 - 23749 314 ## lin1709 hypothetical protein 30 15 Op 3 . - CDS 23760 - 24944 572 ## COG3299 Uncharacterized homolog of phage Mu protein gp47 31 15 Op 4 . - CDS 24937 - 25305 245 ## gi|160940756|ref|ZP_02088098.1| hypothetical protein CLOBOL_05650 32 15 Op 5 . - CDS 25310 - 25708 169 ## gi|160940757|ref|ZP_02088099.1| hypothetical protein CLOBOL_05651 33 15 Op 6 . 
- CDS 25718 - 26569 264 ## Amet_0433 hypothetical protein 34 15 Op 7 . - CDS 26569 - 26931 186 ## gi|160940759|ref|ZP_02088101.1| hypothetical protein CLOBOL_05653 35 15 Op 8 . - CDS 26952 - 27545 406 ## gi|160940760|ref|ZP_02088102.1| hypothetical protein CLOBOL_05654 36 15 Op 9 . - CDS 27557 - 29950 1181 ## COG5412 Phage-related protein 37 15 Op 10 . - CDS 29992 - 30189 100 ## gi|160940762|ref|ZP_02088104.1| hypothetical protein CLOBOL_05656 38 15 Op 11 . - CDS 30167 - 30433 175 ## gi|160940763|ref|ZP_02088105.1| hypothetical protein CLOBOL_05657 39 15 Op 12 . - CDS 30496 - 30897 321 ## Apre_0847 hypothetical protein 40 15 Op 13 . - CDS 30914 - 31975 527 ## BSn5_14805 hypothetical protein 41 15 Op 14 . - CDS 31972 - 32493 269 ## gi|160940766|ref|ZP_02088108.1| hypothetical protein CLOBOL_05660 42 15 Op 15 . - CDS 32490 - 32864 264 ## gi|160940767|ref|ZP_02088109.1| hypothetical protein CLOBOL_05661 43 15 Op 16 . - CDS 32866 - 33585 384 ## Amet_0423 hypothetical protein 44 15 Op 17 . - CDS 33615 - 34274 277 ## gi|160940769|ref|ZP_02088111.1| hypothetical protein CLOBOL_05663 45 16 Op 1 . - CDS 34378 - 34788 425 ## gi|160940771|ref|ZP_02088113.1| hypothetical protein CLOBOL_05665 46 16 Op 2 . - CDS 34802 - 36469 1253 ## TM1040_1622 hypothetical protein 47 16 Op 3 . - CDS 36481 - 37185 425 ## COG3740 Phage head maturation protease - Prom 37208 - 37267 3.2 - Term 37212 - 37247 1.3 48 17 Tu 1 . - CDS 37285 - 39306 515 ## MXAN_1213 hypothetical protein - Prom 39357 - 39416 1.5 49 18 Op 1 . - CDS 39453 - 40907 407 ## Mahau_0567 hypothetical protein 50 18 Op 2 . - CDS 40894 - 41742 357 ## COG3728 Phage terminase, small subunit - Prom 41770 - 41829 2.6 - Term 41792 - 41834 4.0 51 19 Tu 1 . - CDS 41846 - 42595 235 ## COG0338 Site-specific DNA methylase - Prom 42841 - 42900 4.3 - TRNA 42692 - 42765 59.6 # Met CAT 0 0 - Term 42651 - 42687 1.1 52 20 Op 1 . - CDS 42904 - 43329 245 ## CLM_2534 hypothetical protein 53 20 Op 2 . 
- CDS 43335 - 43574 214 ## gi|160940779|ref|ZP_02088121.1| hypothetical protein CLOBOL_05673 54 20 Op 3 . - CDS 43590 - 44582 701 ## COG0582 Integrase 55 21 Tu 1 . - CDS 44689 - 45027 252 ## gi|160940781|ref|ZP_02088123.1| hypothetical protein CLOBOL_05675 56 22 Op 1 . - CDS 45441 - 45701 139 ## Cphy_2966 hypothetical protein 57 22 Op 2 . - CDS 45759 - 45980 254 ## gi|160940784|ref|ZP_02088126.1| hypothetical protein CLOBOL_05678 58 22 Op 3 . - CDS 46020 - 46415 142 ## gi|160940785|ref|ZP_02088127.1| hypothetical protein CLOBOL_05679 59 22 Op 4 . - CDS 46448 - 46801 247 ## Hore_13270 hypothetical protein - Prom 46842 - 46901 2.6 60 23 Tu 1 . - CDS 47014 - 47205 250 ## gi|160940788|ref|ZP_02088130.1| hypothetical protein CLOBOL_05682 - Prom 47226 - 47285 1.7 61 24 Tu 1 . - CDS 47820 - 47999 149 ## gi|160940792|ref|ZP_02088134.1| hypothetical protein CLOBOL_05686 - Prom 48029 - 48088 2.2 62 25 Op 1 . - CDS 48101 - 48466 294 ## gi|160940793|ref|ZP_02088135.1| hypothetical protein CLOBOL_05687 63 25 Op 2 . - CDS 48463 - 48819 224 ## Closa_0722 hypothetical protein 64 25 Op 3 . - CDS 48816 - 49124 325 ## gi|160940795|ref|ZP_02088137.1| hypothetical protein CLOBOL_05689 65 25 Op 4 . - CDS 49121 - 49603 351 ## gi|160940796|ref|ZP_02088138.1| hypothetical protein CLOBOL_05690 66 25 Op 5 1/0.000 - CDS 49587 - 50231 274 ## COG0629 Single-stranded DNA-binding protein 67 25 Op 6 . - CDS 50260 - 50703 403 ## COG0629 Single-stranded DNA-binding protein 68 25 Op 7 . - CDS 50700 - 50939 193 ## gi|160940799|ref|ZP_02088141.1| hypothetical protein CLOBOL_05693 69 25 Op 8 . - CDS 50923 - 51354 220 ## gi|160940800|ref|ZP_02088142.1| hypothetical protein CLOBOL_05694 - Prom 51443 - 51502 1.9 70 26 Op 1 . - CDS 51516 - 52907 1132 ## gi|160940802|ref|ZP_02088144.1| hypothetical protein CLOBOL_05696 71 26 Op 2 . - CDS 52909 - 55050 1276 ## Closa_1108 hypothetical protein 72 26 Op 3 . - CDS 55047 - 55625 436 ## Closa_1107 hypothetical protein 73 26 Op 4 . 
- CDS 55627 - 56115 657 ## gi|160940805|ref|ZP_02088147.1| hypothetical protein CLOBOL_05699 74 26 Op 5 . - CDS 56108 - 56659 248 ## gi|160940806|ref|ZP_02088148.1| hypothetical protein CLOBOL_05700 75 26 Op 6 . - CDS 56656 - 57090 141 ## COG0328 Ribonuclease HI 76 26 Op 7 . - CDS 57078 - 57662 120 ## Clole_0725 hypothetical protein - Prom 57850 - 57909 3.2 - Term 57874 - 57900 -0.6 77 27 Op 1 . - CDS 58013 - 58516 508 ## gi|160940809|ref|ZP_02088151.1| hypothetical protein CLOBOL_05703 78 27 Op 2 . - CDS 58538 - 58789 231 ## gi|160940810|ref|ZP_02088152.1| hypothetical protein CLOBOL_05704 79 27 Op 3 . - CDS 58837 - 58992 122 ## gi|160940811|ref|ZP_02088153.1| hypothetical protein CLOBOL_05705 - Prom 59130 - 59189 5.5 80 28 Op 1 . - CDS 59356 - 59496 100 ## 81 28 Op 2 . - CDS 59596 - 59832 197 ## gi|160940812|ref|ZP_02088154.1| hypothetical protein CLOBOL_05706 - Prom 59880 - 59939 3.4 + Prom 59833 - 59892 4.1 82 29 Tu 1 . + CDS 59935 - 60549 289 ## Clole_0788 hypothetical protein + Term 60607 - 60654 9.0 - Term 60214 - 60259 -0.9 83 30 Tu 1 . - CDS 60457 - 60684 194 ## gi|160940814|ref|ZP_02088156.1| hypothetical protein CLOBOL_05708 - Prom 60927 - 60986 8.0 + Prom 60898 - 60957 4.8 84 31 Tu 1 . + CDS 61037 - 61423 232 ## Closa_0705 XRE family transcriptional regulator - Term 61198 - 61235 -0.0 85 32 Tu 1 . - CDS 61420 - 61548 140 ## - Prom 61581 - 61640 3.9 86 33 Tu 1 . - CDS 61669 - 61881 136 ## TepRe1_0949 helix-turn-helix domain-containing protein - Prom 61941 - 62000 6.5 + Prom 61960 - 62019 3.8 87 34 Tu 1 . + CDS 62039 - 62701 529 ## CE0309 hypothetical protein + Term 62737 - 62776 1.3 + Prom 62756 - 62815 7.4 88 35 Tu 1 . + CDS 62906 - 64483 424 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Term 64259 - 64303 2.1 89 36 Op 1 . - CDS 64493 - 64711 197 ## COG3546 Mn-containing catalase 90 36 Op 2 . - CDS 64711 - 64983 235 ## Closa_2886 hypothetical protein 91 36 Op 3 . 
- CDS 64980 - 65351 230 ## gi|160940823|ref|ZP_02088165.1| hypothetical protein CLOBOL_05717 - Prom 65464 - 65523 3.0 + Prom 65336 - 65395 3.5 92 37 Tu 1 . + CDS 65429 - 66028 163 ## + Term 66054 - 66091 4.7 - Term 65937 - 65966 2.8 93 38 Tu 1 . - CDS 65969 - 66616 364 ## COG3666 Transposase and inactivated derivatives Predicted protein(s) >gi|157101627|gb|DS480697.1| GENE 1 1026 - 1385 226 119 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP3-BS71] # 1 107 1 106 107 91 42 1e-17 MLNNASGFRKVYIAAGYTDLRRGIDGLASIVKFNFQLEPYEKDILFLFCGRRSDRIKGLI WEGDRFLLLYKRLELGSFSWPRTKEEALEITLEQYQALMQGLEIVSRHPIQEVQPSDLL >gi|157101627|gb|DS480697.1| GENE 2 1675 - 2322 266 215 aa, chain + ## HITS:1 COG:no KEGG:Closa_3160 NR:ns ## KEGG: Closa_3160 # Name: not_defined # Def: transposase # Organism: C.saccharolyticum # Pathway: not_defined # 8 160 9 179 555 89 35.0 1e-16 MQKIYLPEELNSFSRETLVAVLLSMQDQLTQLNTNMERLIEQIASANNHRYGRSCEKLDV IAGQLELEIIFNEAEALTETLYVVEPAEEDVIQVTRRKKKGKREANLKELPVEVISHTLP EEKLQEIFGSVGWKQLPDEVYKRVRVQPAVYTVEEHTVSMCCRPMRLWSGYRRMGARQTA RVTCGFTGRERDMGIHPLSCMNTRRPGKQTIRGNS >gi|157101627|gb|DS480697.1| GENE 3 2223 - 2414 88 63 aa, chain + ## HITS:1 COG:no KEGG:Rumal_1979 NR:ns ## KEGG: Rumal_1979 # Name: not_defined # Def: putative transposase # Organism: R.albus # Pathway: not_defined # 1 52 268 318 500 62 53.0 5e-09 MWVYRAGKGYGDTPIILYEYQKTRKADHPREFLKGFTGTVVCGGYSAYRKLDRESETIVF AGC >gi|157101627|gb|DS480697.1| GENE 4 2551 - 3027 238 158 aa, chain + ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 1 150 304 450 450 84 29.0 8e-17 MKPDDRKKQRQINLKPLVEAFFAWAKEIQTSSRLTKGKTLEGSNYCINQEEALKVFLDDG EVPLDNNATEGSLRSFCLHKHAWKLIDSIDGAQSSAIIYSITETAKANNLNPFRYLEYIL TVMKDHQEDTDYRFMEELLPWSEQLPEICKSKTKTTNV >gi|157101627|gb|DS480697.1| 
GENE 5 3487 - 3621 68 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKIWMQNWMQNWDAISSIFKFSTAIKKVIHTTNAIESQMPLTTS >gi|157101627|gb|DS480697.1| GENE 6 3858 - 4364 79 168 aa, chain - ## HITS:1 COG:FN0657 KEGG:ns NR:ns ## COG: FN0657 COG3981 # Protein_GI_number: 19703992 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Fusobacterium nucleatum # 15 142 16 144 173 60 32.0 1e-09 MLYLKKLNTYDSELEYLYLKDLPLNENGFINNLHNISKEEFMEVSLPKLINYSNGIDLPS GYVPETNYFLWDEKTIVGLFRLRHFLNDSLRNGSGHIGYQIHRNFRRMGFASSGLALAID EAKKIIPDDEVYMSTRLDNIASLKVQLHNNAYIHHTDNEHYYTRIRIK >gi|157101627|gb|DS480697.1| GENE 7 4448 - 5506 390 352 aa, chain - ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 2 229 208 424 450 125 32.0 1e-28 MYRTGSLEAHPFVLYEYQKGRKADYPRQFLKDCSGICVTDGYEVYHSIARERNDLTIAGC WSHARRDFADVVKSAGKEKGSVQESICYKAFQIIQTMFRYEQGYADMPPEESLEQRKRNV APLVAAFFAYLKYEQGHVAPKMKTGRAISYCINQESFLRVFLTDGAVPMTNNAAEQSIRP FTLGRKNWYLIDTSGGAKSSAIAYSIAETAKANNLKPYEYFKYLLEELPKHGAESPYMLL VANVLKPNEIPAVTHVDKTARVQTITSTQNAVLYNILDEFEKLSGVPILLNTSFNIAGEP IVETPMDALKCFEKTNLDALSIGPYLITKDVKKGKIEIDNYQRFVVSLGAAL >gi|157101627|gb|DS480697.1| GENE 8 5460 - 6035 146 191 aa, chain - ## HITS:1 COG:no KEGG:Closa_3160 NR:ns ## KEGG: Closa_3160 # Name: not_defined # Def: transposase # Organism: C.saccharolyticum # Pathway: not_defined # 3 159 121 277 555 129 44.0 9e-29 MIGKRGEELKDLPVSTVSHPVCEAQLQRAFPDGKYKQLPDEVYKRLAFHPASFEVIEHHV EVFVSTDGRTFVRGERPADLLRNSIVTPSLAAGIYNAKYVNAQPIQRLVKEFERCNVFLP EPTMCRWANVCSDCYLKRIYDRLREKLNDYHVLHADETPVEVRKDGRPARSRSYIVGVPD RITGSTSFCPL >gi|157101627|gb|DS480697.1| GENE 9 6060 - 6344 308 94 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAVRLDEQELSNLDKQTLIKMYLGLQDSMEKLSQSIDILTEEVIHLRQHRFGRSSEKGLT EAEGHFQLGFMFNETEMTVDLNPDILEPTFEEIH >gi|157101627|gb|DS480697.1| GENE 10 6437 - 6796 232 119 
aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP3-BS71] # 1 104 1 103 107 94 42 2e-18 MLSKMNGFEAIYLYAGKSDLRKGIDGLAALVKEQFNLNPFQKNVLFLFCGTRSDRFKGLV WEGDGFCLVYKRIEAGRLRWLRSQQEAVQLSQAELQRLLDGMTILERPTIKKVNCTQVF >gi|157101627|gb|DS480697.1| GENE 11 6786 - 7127 125 113 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940736|ref|ZP_02088078.1| ## NR: gi|160940736|ref|ZP_02088078.1| hypothetical protein CLOBOL_05630 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05630 [Clostridium bolteae ATCC BAA-613] # 1 113 5 117 117 211 100.0 1e-53 MNEVVEVRMASWKAMVQERSASGLSIEEWCAANNVSESQYYYRLRQLRNTALAAAGQTSM EHTGGFMRVPFCPADPSSAVALRIRRADTVIEVSGDAPDNILAFLKAVMTDVV >gi|157101627|gb|DS480697.1| GENE 12 7402 - 8853 1350 483 aa, chain - ## HITS:1 COG:CAC3230 KEGG:ns NR:ns ## COG: CAC3230 COG4624 # Protein_GI_number: 15896476 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 136 429 82 379 450 166 34.0 9e-41 MVTNDATILRLKHDVLYEVAKAAWEGNLDEKREQIPYSMIPGPQATYRCCIYKEREIIRQ RVRLAEGKCPTGKDTSNVVQVINAACEECPIASYIVTDNCRKCMGKACQNSCNFGAISMG RERAYIEPGKCKECGKCSQACPYNAIAHLERPCKKICPVDAITYDEYGICVIDEKKCIQC GACIHSCPFGAIGSKTFMVDVIRLIKAGKRVVAMVAPATEGQFGPDITMASWKIALKQLG FADMIEVGLGGDMTAAYEAEEWAQAHKEGKKMTTSCCPAFVNMIKQHFPMLLENMSTTVS PMCAVSRMIKSGDPDTVTVFIGPCIAKKSESLDLNVKGNADYAMTFGEIRAMMRAKGVEL QPAENDYQESSVFGKRFGNGGGVTAAVLECLKEMDENTDVKVARCNGAAECKKALLLMKV GKLPEDFVEGMACVGGCVGGPSRHKTETEAKKAREQLIGSAEERHVLENLKKYPMDEFSM HRH >gi|157101627|gb|DS480697.1| GENE 13 9037 - 10215 1415 392 aa, chain - ## HITS:1 COG:no KEGG:Closa_3400 NR:ns ## KEGG: Closa_3400 # Name: not_defined # Def: protein serine/threonine phosphatase # Organism: C.saccharolyticum # Pathway: not_defined # 1 392 1 393 393 668 81.0 0 MGITVDVAYKSLNKYTEILCGDKVELLQTEDSNIMILADGMGSGVKANILSTLTSKILGT MFLNGATLEECVDTISETLPICQVRQVAYSTFSILQVYHNGDAYLVEYDNPGCIFIRDGK 
LMQIPKNVRVMQGKEINEYRFKVKKGDALILMSDGTIHAGVGELLNFGWLWQDIADYAVR QYPLTVSAMRLAASISRACDELYQFRPGDDTTVACMRIIDSKPVHLMTGPARNPEDDVRM VQDFMDDPNARTIVCGGTSATIVSRVLGKKLDVSLDYVDPEIPPIAYMEGVDLVTEGVLT LNRVLQLLRRYVKNETVSEEFFHELDKPNGGSMVAKVLIEDCTEVHLYVGKAINSAYQNP GLPFDLGIRQNLVEQLRHVIEEMGKTVVVTYY >gi|157101627|gb|DS480697.1| GENE 14 10215 - 12248 1927 677 aa, chain - ## HITS:1 COG:TM1421 KEGG:ns NR:ns ## COG: TM1421 COG4624 # Protein_GI_number: 15644172 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 4 327 3 300 301 167 34.0 5e-41 MGMEIIDFKATKCKHCYKCVRYCGVKAIQVKDERAVIMPDKCILCGHCLKICPQSAKTLK SDLNMVRGFLREGMRVVVSIAPSYMGLLKYKTIGQVRGALLRLGFEDVRETSEGAAFVTA EYTKLLAEHKMENIITTCCPSANDLVEIYYPQLIPCLAPVVSPMIAHGKLLKEELGCDVK VVFLGPCIAKKKEAMDLRHEGYIDAVLNFNDINKWLEEEDIVIGDCEDRPFTAFDPKVNR LYPVTNGVVNSVLATEEKGDGYRKFYVHGEDNCIDLCKSMSRGEIKGCFIEMNMCSGGCI KGPTVDEEFISRFKVKLDMEESICREPAEETQMEPVWENVNFRKQFEDHSPRDPQPTEEQ IREILRMTNKLKPEDELNCGACGYPTCREKAIAVFQHKAEVSMCIPFMHEKAESMANLVM ETSPNIVLIVGDDMRIMEYSDVGERYFGRTRSQALQMYLYELIDPVNFQWVFDTHQNIHG KRVNYPEYNLSTLQNIVYIEKENAVLATFIDITREEELAQEEYERKLETIDLAQKVIHKQ MMVAQEIAGLLGETTAETKTTLTKLCHSLLEDGSDAGYAGGEREREIPGVQLGSGAVPLK GVMTAGGSGGAEAGQISGTGPASEPASGAGNPADNAGEQKKPTGYVHIGSANPGGGKSGY VHLSNVDLKKPGGNGGK >gi|157101627|gb|DS480697.1| GENE 15 12251 - 12490 315 79 aa, chain - ## HITS:1 COG:TM1420 KEGG:ns NR:ns ## COG: TM1420 COG1905 # Protein_GI_number: 15644171 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 1 76 3 75 77 61 40.0 3e-10 MKVTICIGSACHLKGSREIISKLQKLVDENGLSDKVDLNGAFCSGNCDHGVCVTVEGELY SLKPEDTEEFFENEIKGRL >gi|157101627|gb|DS480697.1| GENE 16 12656 - 13837 1234 393 aa, chain + ## HITS:1 COG:TM0202 KEGG:ns NR:ns ## COG: TM0202 COG0715 # Protein_GI_number: 15642975 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, 
periplasmic components # Organism: Thermotoga maritima # 106 391 26 299 300 80 26.0 4e-15 MKKHVSLLLIAFMGISLVLGGCGKSTSGTPATEARTEAQAGTQTEAQADAQTINQNGEKA SQDQASESASGAVSSSESGSGDELNTGFDENKLTCQAAIRVGSLKGPTSMGLVSLMDKAS RGETSNVYEFTMAGKADELVGKIANGDLDIALLPANVASVLYAKTQGNITVLDINTLGVL YVVASDDSISSMADLKGRTIYMTGKGTTPEYVMNYLLKENGLSTSDVDLQFKSEATEVAS LLKQDSSAIGVLPQPFATAACIQNPDLKTVLDLTEQWNMLNKDTGSMLVTGVTLVRSDFL RENRSPVADFIKDHEASTLFATEHAEDASRLIAEQGIVEKAPIAQKALPYCNIVCLTGQE MKDALSGYLSTLHEQDPKSIGGQMPGDDFYYMP >gi|157101627|gb|DS480697.1| GENE 17 13988 - 14611 565 207 aa, chain + ## HITS:1 COG:AGl927 KEGG:ns NR:ns ## COG: AGl927 COG0600 # Protein_GI_number: 15890580 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 204 53 253 269 84 30.0 1e-16 MVKALAEQAADPAFWATILNSFARISLGFLGAFALSILLGSLAYLFPLVKELLDPVMLLI KSVPVASFVILALIWIGSRNLAVFTSFLVVVPMVYVSTLSGLKHTDKKLLEMARVFRMPM WKRIHYIYVPALLPYLVNGCRTALGMSWKSGVAAEVIGIPEGSIGEQLYYSKLYLDTAGL FAWTFVIIIISALFERFFLYLLKKIKH >gi|157101627|gb|DS480697.1| GENE 18 14626 - 15234 220 202 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 3 189 5 202 311 89 30 5e-17 MTLELQHISKSYDGLPVLKNLNLSFQQGQCYCLMSPSGSGKTTLLRILMGLEQADHGLIL ADGTPQDMSSLPASAVFQEDRLCESFSPIENVAMCAGRSLKAPRIKWELARLLPEECLNR PVSTLSGGMKRRVAVARALLIPSHILLMDEPFTGMDDELKCNVISYIREKQDGRLLILST HQEEDVELIGGELVRLENCMDV >gi|157101627|gb|DS480697.1| GENE 19 15384 - 15836 383 150 aa, chain - ## HITS:1 COG:CAC1338 KEGG:ns NR:ns ## COG: CAC1338 COG3546 # Protein_GI_number: 15894617 # Func_class: P Inorganic ion transport and metabolism # Function: Mn-containing catalase # Organism: Clostridium acetobutylicum # 5 125 65 185 200 152 58.0 2e-37 MCAYELAHLEIVAAIIHQLTKGLTADQLVEQGFGPYYIDHTTGIWPQAAGGIPFNACEFQ SKGDVITDLHEDMAAEQKARSTYDNILRVVKDYDVCEPIKFLREREVVHYQRFGELLRIT 
QEKLDSRNFYAFNPAFDMQLPSCQDQMRQM >gi|157101627|gb|DS480697.1| GENE 20 15924 - 16379 233 151 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940745|ref|ZP_02088087.1| ## NR: gi|160940745|ref|ZP_02088087.1| hypothetical protein CLOBOL_05639 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05639 [Clostridium bolteae ATCC BAA-613] # 1 151 1 151 151 314 100.0 1e-84 MNVIPGMRFGALTTKWSWKNHNFRRIWKCTCECGGYCYVKEDALIKGIVMDCGECEENIA TYEPKLLDHPDFSVIEEFVQAGGKLADDNDKDIIRLRNLFGVITPILSELKWGEQRIAEC PDCGGKMIVALSSVNGHAHVVCEKCGVKLMQ >gi|157101627|gb|DS480697.1| GENE 21 17182 - 18234 50 350 aa, chain - ## HITS:1 COG:XF1708 KEGG:ns NR:ns ## COG: XF1708 COG3093 # Protein_GI_number: 15838309 # Func_class: R General function prediction only # Function: Plasmid maintenance system antidote protein # Organism: Xylella fastidiosa 9a5c # 11 330 9 337 371 129 27.0 9e-30 MLHSEHVIAVPPGETIREQLENRGMSQKEFALRMGMSEKHMSHLINGKVELTPEVSLRLE SVLGIPAKFWTNLESVYRERLARVNEELEFERDREIAQKFPYAKLATLGWVPSTRKMEEK VKNLRTYFGVARLEILDTLRVPGIAYRKMGENETSDYALAAWAQRARIEARDIAVSPINI KHLKECIPEIRMLTIREPHVFCRELHRILAECGIIIIFLPHISGSFLHGASFVDGNHIVL GLTVRGKDADKFWFSLFHELYHIIEGHIYDSCETSEEQEVLADEFARDTLIPMKQYQKFV SQGDMSAQAICNFAEQVGVAPCIILGRLQKENIVPYNWHQGLKIHYKLVQ >gi|157101627|gb|DS480697.1| GENE 22 18266 - 18454 69 62 aa, chain - ## HITS:1 COG:no KEGG:Swol_0482 NR:ns ## KEGG: Swol_0482 # Name: not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 2 62 38 98 98 74 59.0 1e-12 MELQAADSVEMLVRLSIGRCHPLKGNRLGQYAMDLVQPYRLVFEQNENKLELVNIISIED YH >gi|157101627|gb|DS480697.1| GENE 23 18879 - 19652 650 257 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0799 NR:ns ## KEGG: Cphy_0799 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 149 3 165 216 87 31.0 4e-16 MSKTAEGLIQHCKDKLGTPYVYGAKGEILTQAILDRLARENPSTYTSTYKAKAAKYIGQR CTDCSGLISWYTGRIRGSYNYHDTAVERAGIDHLNESMTGWALWKPGHIGVYIGDGYCIE AKGINYGTIKSRVTATPWQKALKLCDIDYTPVPVTYTQGFQPAADGQRWWYQFADGNYAA 
NGWYWLQEMERRTWGWYLFDSEGYMLTGYQVDPAGEAFLLCPVIGSNEGKCMITDARGAL QIADEYDMIKRRYMFEW >gi|157101627|gb|DS480697.1| GENE 24 19719 - 19922 181 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940749|ref|ZP_02088091.1| ## NR: gi|160940749|ref|ZP_02088091.1| hypothetical protein CLOBOL_05643 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05643 [Clostridium bolteae ATCC BAA-613] # 1 67 17 83 83 108 100.0 1e-22 MAKLTGKHAARIPGNGGYLAEGPDLQEKQPTPYLYDGPTDTPHPGKHQSGVGGPSDSNNN GVDDEEE >gi|157101627|gb|DS480697.1| GENE 25 20008 - 20499 505 163 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0798 NR:ns ## KEGG: Cphy_0798 # Name: not_defined # Def: toxin secretion/phage lysis holin # Organism: C.phytofermentans # Pathway: not_defined # 1 154 1 154 160 143 47.0 3e-33 MKREYVITVQGALAAAGAFLSAKLGILYPVLCILMGTMVLDYITGMLASKNEAIDHPGDA SYGWSSRKGAKGIIKKVGYLCVIAAAMVVDYVIVFVSAELGMQISVKAFFGLLVAVWYLL NELLSIIENAGRMGANVPEWLRKYIAVLKDKIDNTDYQGGSRT >gi|157101627|gb|DS480697.1| GENE 26 20519 - 20839 83 106 aa, chain - ## HITS:1 COG:no KEGG:Clole_2737 NR:ns ## KEGG: Clole_2737 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 106 36 141 141 110 54.0 2e-23 MFALGGLCFIGLGLINEVLPWDMPLWQQVVIGAGIITVLEFLTGCVVNLWLGWGVWDYSN KPGNILGQICPQFFLLWLPVSLAGIVLDDWLRYWWWGEERPCYKIL >gi|157101627|gb|DS480697.1| GENE 27 20919 - 21665 485 248 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940752|ref|ZP_02088094.1| ## NR: gi|160940752|ref|ZP_02088094.1| hypothetical protein CLOBOL_05646 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05646 [Clostridium bolteae ATCC BAA-613] # 1 248 1 248 248 462 100.0 1e-128 MEKIRIGKEERRYEISSIRPESANVLEIVFSGAIPAIWGDITIYTDDGTEATTLHGYETV WKQGGNRVWLSNDGSVYTPPVPPEPAVPPEPYVPTLAEVQDNKKAEVNAACEAMIVSGVN VTLADGTVEHFDLKERDQLNLFGKQIQVNAGLESIEYHTDTTPTTNCKYYSNADMQSIIQ AAMWHVSYHQTYCISLKVWIDACEAKEEVAEIFYGADIPEQYQSEVLKAYLIQIAAEMGV DNGTPASL >gi|157101627|gb|DS480697.1| GENE 28 22035 - 23054 645 339 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|160940753|ref|ZP_02088095.1| ## NR: gi|160940753|ref|ZP_02088095.1| hypothetical protein CLOBOL_05647 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05647 [Clostridium bolteae ATCC BAA-613] # 1 339 1 339 339 617 100.0 1e-175 MAGTVLTNKGLALITKLMAAQATLSFSRVAVGTGRVPSGYDAQNMTGLNEYKMDATIESC GVSTEQPDVAYIVTQISSVGVSTGFAITEAGVFATDPDDGEILYAYLDLTQDPQYIYAST DAISKFAEITFNVLVGSVTSVTAIVSPGALVKKSEFDNLKTRVEDVETPEFDASGTAEGI TDKTSLLASFVTKMPLVKFMRNVVAGFKLVLYSGQIVNNCVTDRPDLPGSAAQLKVLMDL YTVLNTKIGSKVAVVSFIKSSIPIAAGADFFAYIPIDIDTTDAIWVFPVLTDAPGDNCTK VNISTCVWQDASKQIYVNGCNLASGAANISLAGVLIAIR >gi|157101627|gb|DS480697.1| GENE 29 23066 - 23749 314 227 aa, chain - ## HITS:1 COG:no KEGG:lin1709 NR:ns ## KEGG: lin1709 # Name: not_defined # Def: hypothetical protein # Organism: L.innocua # Pathway: not_defined # 25 166 16 164 213 70 30.0 6e-11 MSFATRMLDMLTSAYIRTDIQKLEKSEPPETNIGKLFYLAGWGFDILQEQTEKVRLWDNI DKMQGAVLDSFGNNYGVARGTATDEIFRIMIRVKILAMLASGNLDTLILSAASLFGIKAE DVKCEEVYPAKVYIYINEDELDEEHKNVAGIIAELMNRIKSGGIGIRIFYRTYSAHKTTV YVGASACLATFMNIPPTAINKKTEKQVKLKIGVGTLLCVTASYPASR >gi|157101627|gb|DS480697.1| GENE 30 23760 - 24944 572 394 aa, chain - ## HITS:1 COG:lin1710 KEGG:ns NR:ns ## COG: lin1710 COG3299 # Protein_GI_number: 16800778 # Func_class: S Function unknown # Function: Uncharacterized homolog of phage Mu protein gp47 # Organism: Listeria innocua # 7 393 2 380 383 164 32.0 3e-40 MDDEWGLTEKGFYRPTYTVLLNALEYKARELYGDGIILTVRSPLGLFLRILAWVWNILFA CLEDVYNSRFVETSVGNSLYNLGKNIGMHMLTEGKASGYITVTGTSGSTIPAGFLVATNG GLQYTVVDAITLSESGTGLALIRAVETGPEYNTAAGTVKVIVNPSSVNGVESITNKAEIS GGRIKETDAEFRARYNKSVDYAGGVNADAVRAALMNDVEGVSSAYVYENDTDESDTTYNL PPHSLEAVVYGGLDEEIAKAIYSRKAGGIQTVGNKAVNVLTASGQQLEVRFSRPTTKKIY VKVTELQTGEGFPGEDKVRQALIDYIGGTTVGGLETGMDVIYIKIPGILTAIPGVEDFEL QIGTSMTSYAKENIAIGYREKAITDSTAISITMK >gi|157101627|gb|DS480697.1| GENE 31 24937 - 25305 245 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940756|ref|ZP_02088098.1| ## NR: gi|160940756|ref|ZP_02088098.1| hypothetical protein CLOBOL_05650 
[Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05650 [Clostridium bolteae ATCC BAA-613] # 1 122 1 122 122 201 100.0 1e-50 MSKTAWKIDPETKDLSFTDEGMLEVLEDDAAIAQGVALTLGAWKGDFELVPSHGTDYEQI LGTVVDEETTDEVLREAIFQEEQITTVESLGVTQGEGRKMTVTWSGRLSDGKTISMEVNT GG >gi|157101627|gb|DS480697.1| GENE 32 25310 - 25708 169 132 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940757|ref|ZP_02088099.1| ## NR: gi|160940757|ref|ZP_02088099.1| hypothetical protein CLOBOL_05651 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05651 [Clostridium bolteae ATCC BAA-613] # 1 132 1 132 132 243 100.0 3e-63 MARKTNELYNLLANEKKIMQRIRCHDIVRVEDFDAVRMTVTVKPLVQREMAGAYVSPPPI LAVKVAYIPLEIKVDGKTADVNVQIKKGDIGVVAYLDMDSDNSIQTGADSKPNTDRIHSG DDAVFVGVIRKG >gi|157101627|gb|DS480697.1| GENE 33 25718 - 26569 264 283 aa, chain - ## HITS:1 COG:no KEGG:Amet_0433 NR:ns ## KEGG: Amet_0433 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 3 278 2 254 258 145 32.0 1e-33 MAFFIRSASLQIGPLKYSMNDGFYFEFEVPFHDSEQLTTASFTVNNLNATSRAGIQKNQV VILNAGYEDDMGALFVGQVTSCNHRQNGVEWQTKITATAALDQWLNSVINKTYTEGSRAE DIVRDLLNIFGLEVGAFQLVKNTLYPRGRVCSGKLKDILTEIVINECKSRFLIRSNQVII NNPADGVNKGYLLTPDAGLLFRSEDSDVTMIEAPETKEVSKEKKAFEEKTWKRQCLLNYR MGPGDVIQIQSRDLNGKFMIVSGTHKGSPTGTWITEIEFKVAG >gi|157101627|gb|DS480697.1| GENE 34 26569 - 26931 186 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940759|ref|ZP_02088101.1| ## NR: gi|160940759|ref|ZP_02088101.1| hypothetical protein CLOBOL_05653 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05653 [Clostridium bolteae ATCC BAA-613] # 1 120 1 120 120 232 100.0 6e-60 MTNELQTMGLTSEVEYIPVDASMVPYTFSVKLDDRTYFMTIKYNDPGSFFTIDLALMATG EVLCYGDPVRYGRPMFNSIEDARYPIPVIVPYCLTGEVNEVTYDNFGNEVKLYLHERRTD >gi|157101627|gb|DS480697.1| GENE 35 26952 - 27545 406 197 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940760|ref|ZP_02088102.1| ## NR: gi|160940760|ref|ZP_02088102.1| hypothetical protein CLOBOL_05654 [Clostridium bolteae ATCC BAA-613] 
hypothetical protein CLOBOL_05654 [Clostridium bolteae ATCC BAA-613] # 1 197 1 197 197 335 100.0 1e-90 MAYRITGRKTGTVKFQPATGTITQETMTKSSKMTSNAIEGGSTIEDHVSLNPEQFQIAGV VVRNCGSYKSQLDYMWRSRDLVTYVGKFRVTDYIIINLQMKNLPSNKKGFAFTATLQKAN IVSGQYVELGQALLMSQQDKGEKSKGTGQAEAIKAAGLKTKVSEQISQSSYASYVNSYNG KSSAGPNQRKTTSFNGV >gi|157101627|gb|DS480697.1| GENE 36 27557 - 29950 1181 797 aa, chain - ## HITS:1 COG:BH3518 KEGG:ns NR:ns ## COG: BH3518 COG5412 # Protein_GI_number: 15616080 # Func_class: S Function unknown # Function: Phage-related protein # Organism: Bacillus halodurans # 474 730 600 870 940 69 24.0 2e-11 MEDKRNLTMGIRVSATDTIGQLTKIMESIDEIKEGFQKAEDKAEKFGSEAAESAGTLAHG MQDARQETSKVATSMDKLSDSGDKVESSAKGARRALENMGSAASKSATEARKDLNNASDA ADEFRKEINKTSEAGEDASNTYRSMGAEADGFGSAAARSAAKALKETNSLGKAVKAGFQG AYGFAGKQAEQFANKARDGLEELETALKNPIQTIKNKLVDALEKAGKKIKDTGDKSDETG DDLKHMGNDGESAGTRIKDAVGGAVKSFFAISAAIEIVKAGIEAAKQLGSAIKDAGIASE QTGAKFGAMFSADSGVQEWADNFSDAIHRSNTEVQGFLVSNKAMYQEMGITGTAAEDLSK ITTSLAYDFGTAFSLDDAEALSVVQDYLKGNTEAMAEFGMQIDDAALKQSAMEMGLGKNI DALDEAAMAQVRMNALLQNSTEIQQAATKKQEGYANNIKSLKGIWSDFLSGAASKFAPVF TELTNTIMASWPQIEPALLGMIDMLSNGFAAGVPVIMDLATGALPGLISTFGELMSAAAP IGGVLLDMATTALPPLMAAVTPLIQTFSTLAQTVLPPVSRIIGSIATTVVPPLVNILKSL SDNVIAPLMPHIESIANAILPALSAGLKIIPPILSAISPVLSGISGVLSNVVGFLAKIME WAANGLASLLEKIAGLFGGGSKAAASAGANIPHNADGDNYFQGGWTHINERGGEIAYLPS GSTIVPADKSEQIIAGGRQQSIKNDVNFNPTIRIEILGGSNEGTPGMMEDLKQTVKELYQ EMQEEHFANLAIQQGNA >gi|157101627|gb|DS480697.1| GENE 37 29992 - 30189 100 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940762|ref|ZP_02088104.1| ## NR: gi|160940762|ref|ZP_02088104.1| hypothetical protein CLOBOL_05656 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05656 [Clostridium bolteae ATCC BAA-613] # 1 65 1 65 65 119 100.0 7e-26 MNPFLEDEKDIGKAQIRAIRNQEFWSLVFSTGKIAYTEWCNMCLSEYYEAREAYINYNES LKSNQ >gi|157101627|gb|DS480697.1| GENE 38 30167 - 30433 175 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940763|ref|ZP_02088105.1| ## NR: gi|160940763|ref|ZP_02088105.1| hypothetical 
protein CLOBOL_05657 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05657 [Clostridium bolteae ATCC BAA-613] # 1 88 21 108 108 168 100.0 1e-40 MAKQKKVTVNGTEYTLQSVSPTWYFNHNDECGMTGGKRKTVNYIDGLIKNIVISPKEVST QGISYFEEAEDISTPEKLMTEIESFLRG >gi|157101627|gb|DS480697.1| GENE 39 30496 - 30897 321 133 aa, chain - ## HITS:1 COG:no KEGG:Apre_0847 NR:ns ## KEGG: Apre_0847 # Name: not_defined # Def: hypothetical protein # Organism: A.prevotii # Pathway: not_defined # 1 127 1 127 135 90 44.0 2e-17 MVTSYDPKKVNVNVDGTTLTGFASDGIITVSKSEDAVTPNVGVQGDVVYEENANESGTVA VTLQQTSSSLQKLRNLASNRKRFALTISDANDDAPANVSGSECRILKEPDVVRGKNTSTV TVNIYVPNLKIRQ >gi|157101627|gb|DS480697.1| GENE 40 30914 - 31975 527 353 aa, chain - ## HITS:1 COG:no KEGG:BSn5_14805 NR:ns ## KEGG: BSn5_14805 # Name: not_defined # Def: hypothetical protein # Organism: B.subtilis_BSn5 # Pathway: not_defined # 4 350 5 343 343 117 30.0 7e-25 MSKDVVVVVTLEDVPQSMDTLDILLISTAGEKAGKTYTDLEEIKKDWTDSSIVYKKAAAL FNQGNAIPAPEKLIRKVTIVGVAQPETASALVEAIKAYQEENDDWYIFLTDQTDDTYLEA LGNFAAGSEPTEAELSSGVEDHRKLYIAQTNNKEYAVSTARTVIVYTKDTNEHADAAWLG AVGPWYPQSVTWKFKMPAGISVPTLTTSEVTALETNHVNFVTSEYKKNYIKNGICMDGEW IDAVLGADWIAKRMREKLYDIFMGNPNIPYTDAGFTTVSAGVFETLEEATGYAIIAENPE SGAGIYNVSVPKRSEATDQQAASRQMPDISWEAQLGGAVHGVKVKGTLKVSLT >gi|157101627|gb|DS480697.1| GENE 41 31972 - 32493 269 173 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940766|ref|ZP_02088108.1| ## NR: gi|160940766|ref|ZP_02088108.1| hypothetical protein CLOBOL_05660 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05660 [Clostridium bolteae ATCC BAA-613] # 1 173 1 173 173 349 100.0 4e-95 MKFKYIRNMVTNGLSNYLKVPVVLASQVEPEQDYPFIIYSVTAPYIPENTLGEYRQGGDG EEIRREQPTCTWSFTACSANRRTPSGFILGEDEAMELANQALGWFLHAGYEYTSGNGITI VNTGNVQERSVLEVDEAVRRYGFDVTIRYIREDNRMIATIRNAATIEQKGDTH >gi|157101627|gb|DS480697.1| GENE 42 32490 - 32864 264 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940767|ref|ZP_02088109.1| ## NR: gi|160940767|ref|ZP_02088109.1| hypothetical protein CLOBOL_05661 [Clostridium 
bolteae ATCC BAA-613] hypothetical protein CLOBOL_05661 [Clostridium bolteae ATCC BAA-613] # 12 124 1 113 113 226 100.0 3e-58 MNSYCFGSAQPMIPEGLMHEMYEIKSGAVFSQEDGGQWVPGEDERIPFKGVLLPVSNKDL VRDIAGVFTQYSEKIYTNGHTLKVGAQVEDFGGMRYTVTQELGYNSLHPLKRYLVERKGK AAKK >gi|157101627|gb|DS480697.1| GENE 43 32866 - 33585 384 239 aa, chain - ## HITS:1 COG:no KEGG:Amet_0423 NR:ns ## KEGG: Amet_0423 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 4 235 12 197 200 86 30.0 1e-15 MERIIRELKRLEEMSIHIGIQGQPGRDESGMEREGAPADILTIANVNEFGATIKAKNVKN LAIPIAKKAIGKSPLDFPGLFFLRSRNGYLFGCISKKRKGTSPKKKSSPTDSKPNKHGPS KKQIPKKTDDIEFLFILMESTSIPERSFIRAGYDNNRRTIEDITTAAIQNIIFSGWDAEK AANNIGMGVVGIIQMYMNQPFNFKKKSSITKAVSNWPDNPLIETGRLRNSITYSIEGGT >gi|157101627|gb|DS480697.1| GENE 44 33615 - 34274 277 219 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940769|ref|ZP_02088111.1| ## NR: gi|160940769|ref|ZP_02088111.1| hypothetical protein CLOBOL_05663 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05663 [Clostridium bolteae ATCC BAA-613] # 1 219 3 221 221 448 99.0 1e-124 MEERRKLLADNAMTTLEDLMEFMGMDPKDIGIPDSVKNNLERLINAASGYIERMTDRRFG RKEYVEGHHGSGWQELCLNQYPIVDVKSVIDVESGQIIPSESYSFSDTGEIGVLYRDGGW ADRTFLGGLANDRVAPKRYLKVTYTAGYILPKDGADHSASDLPFDLQYAVWQMVQQQWNL SHNGANGLSAFTISDVSWTFDKELNTQVQNVIEQYRRWA >gi|157101627|gb|DS480697.1| GENE 45 34378 - 34788 425 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940771|ref|ZP_02088113.1| ## NR: gi|160940771|ref|ZP_02088113.1| hypothetical protein CLOBOL_05665 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05665 [Clostridium bolteae ATCC BAA-613] # 1 136 1 136 136 221 100.0 1e-56 MKRELFDNVVVRVGATGIAVDRKGFLSAVVAADIGEITGSPTETKLSVKVEHSDTADGTF TNVEDTMINPEHASQEGILSVATVESEAVLQINMDLLGCKRYIKVTPTITFTGGTSPSAA SAAYALVLGDPIDSPV >gi|157101627|gb|DS480697.1| GENE 46 34802 - 36469 1253 555 aa, chain - ## HITS:1 COG:no KEGG:TM1040_1622 NR:ns ## KEGG: TM1040_1622 # Name: not_defined # Def: hypothetical protein # Organism: 
Silicibacter_TM1040 # Pathway: not_defined # 210 542 94 423 439 141 29.0 8e-32 MSRYKMSRKTARARKSMKMGADDLQEMVKAAVKEALDEQKAEDGSEDDPEESGGDLGEIL DAAIEAVNAKRKSAKSDELNPDDTEELVGAILEEAGAAEDGKSDDEATDLEEVIKAACEA VNEKRKSAKADEIGDDVVDEILDAVAEVMSDDTADEEGKGRKEAYFRQRQTKSYGSRGKQ KKAVQRKYSDIFLKGGNGGGMQKKKEEVPPLITFARAVKCLDVYGRQDPERAAYYARKKY DDTEMEREFKALSATNPTDGGYLIPEVYSDQVIELLYPKTVIVELGAQTVPLTTGNLNLP KMTAGARAQWGGEQRKIKTSQTKFGNIKLSAKRLEAIIPQSRELLMLSTFSADSMFANDL TRRMQLGLDYGGLYGAGAEFQPLGIANNKEVENIDATKIGNDDLADTNGKITPDLPIYVR SKAMSKNIDDIHAGWAMNSMLEGIFLNMKTQMGTYIYREEMATGKLCGFPYKVSNQIPTD NGKTDLFFGNWSDLLIGDQMGLETYTTLDGTWTDEDGVQHNAFEENLAATRALMYDDIGV RHAESFIYCKNIKVM >gi|157101627|gb|DS480697.1| GENE 47 36481 - 37185 425 234 aa, chain - ## HITS:1 COG:AGc1747 KEGG:ns NR:ns ## COG: AGc1747 COG3740 # Protein_GI_number: 15888296 # Func_class: R General function prediction only # Function: Phage head maturation protease # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 19 152 25 160 190 83 37.0 4e-16 MEHEFKKMRFKMDAYNEEEGIFSGYGAVFENIDSGGDIIEPGAFTKTLAEGWERVKVLAL HNDCWLPIGRPVELREEPNGLFISGKISDTTMGRDVKVLLKDGVLNELSIGYDPIIFDYD ENGIRHLREIKLWEVSVVTWAMNPEAVITSYKSMQETAAQALAIKKDLLQELKEGRKISN SRLKSLRDVSKSMKDSARTIDAVIREASNEAAKKSGVLTLGSSKSVQKQIEIIF >gi|157101627|gb|DS480697.1| GENE 48 37285 - 39306 515 673 aa, chain - ## HITS:1 COG:no KEGG:MXAN_1213 NR:ns ## KEGG: MXAN_1213 # Name: not_defined # Def: hypothetical protein # Organism: M.xanthus # Pathway: not_defined # 17 360 99 442 452 136 31.0 4e-30 MAVIDKISSDLAYVKGKLYRIDQNGDKKELTVHPFLDFWARPNPLYEFTASALWKLQSDY LLLKGEGYFIIERYDNGYPAELWPVPTHWVQMTPYQGFPYYRIRAADGGVMDVPIDDVFV MKDLNPLDPYRRGLGQAEPLADEVEIDEYAAKFQKKFFFNDAMPGAIVVMPGADDKQQER FLAKWKERFRGHQNSHGIATIGGPKDTQASVVKLSDNMKDLDMINGRAFTRDAVLEHFGV PREIMGITQNSNRATAEAARYIYATNVLTPRLANRQDAINLQLLSAYGDDLVWEYDDIIP KDKEFEKMVAFDGWNNGVITKNEAREKLDMEKTEKGDIFKMNFADLYIGEDEDPVELSSA AENLQYSDASDPIEADRNDIEVSLGDELIGSTQDEEKILKRAAEIKEQRIKAAGMGLNHA RQTQTRKFEIATTKYFRSQADQIQGALIGNKKAEGSVWDAIGMTQEEFLRLSEQQQSQLT 
MQFVNGLLDWKTQEGILESILTPLWAETYDKGVKEVVSSYRLNAIQQPSLTSVARLRGGQ RVTRVTQTTKDNIRRIVADGLTQGKGKQELTEDIMIEMNTSAARARIIAAQECNTSLLAG NFDMARQGGFSTKTWHVTNLGKARDTHQALNGKTVPITEPFVTIKGNKLMMPCDPDCSVA EETVNCHCFLTYS >gi|157101627|gb|DS480697.1| GENE 49 39453 - 40907 407 484 aa, chain - ## HITS:1 COG:no KEGG:Mahau_0567 NR:ns ## KEGG: Mahau_0567 # Name: not_defined # Def: hypothetical protein # Organism: M.australiensis # Pathway: not_defined # 10 472 11 437 461 366 41.0 1e-99 MLGSDAVLFYADNPIYFVEDVIRAKPDEKQRDILRSLRDYPMTSVRSGHGIGKSAVEAWS VIWYMCTRPFPKIPCTAPTEHQLMDVLWAEISKWMRNNPALRDDLIWTKEKLYMQGHPEE WFAVPRTATNPEALQGFHAEHVLYIIDEASGVSDKVFEPVLGAMTGEDAKLLMMGNPTRL AGFFYDSHHRNREQYSAIHVDGRDSQHVSRTFVQKIIDMFGEDSDVFRVRVAGQFPKSTP DSLIAMEWCEEAANLQVYAPGGQIDIGVDVARYGDDSSALYPLIDKKQSLPYELYHHNRT TEIAGYVVIMIKQFAMDYPDAAIRVKVDCDGLGVGVYDNLYDQRDQIIDAIWYDRCRRAG INPEDGNQWNECQNVPKLDLEIIECHFGGSGGKVDDNDPVEYSNSTGLMWGKVRKYLQEG KLQLPDDDTLVSQLCNRRYLVNKDGKLELERKESMKKRGLTSPDIADALALALYEPNNEW TVNW >gi|157101627|gb|DS480697.1| GENE 50 40894 - 41742 357 282 aa, chain - ## HITS:1 COG:BS_xtmA KEGG:ns NR:ns ## COG: BS_xtmA COG3728 # Protein_GI_number: 16078322 # Func_class: L Replication, recombination and repair # Function: Phage terminase, small subunit # Organism: Bacillus subtilis # 9 248 6 223 265 135 37.0 1e-31 MPRPRDPNRDRAFELYRKSGGSLDLVEIASQLNLPPGTIRGWKSKDNWENRLNGTLQKNT ERSKHDSRKNKAEKIKAAEAAEQMSVNTELNSNQQLFCLYCAYGDNATAAYQKAYDCSYQ TAMVNASRLLRNAKIKAEVDRIKKERLESLFFDEHDIFQWHLDVARANITDYVTFGREEI QAIGAFGPITDKETGEPITKEVNYVKFKESSEVNGHVIKKVKLGKDGASIELYDAMAAMK WLAEHMSLGTAGQQKLAQSIVDAYEWRRSQEKERKGEADAGE >gi|157101627|gb|DS480697.1| GENE 51 41846 - 42595 235 249 aa, chain - ## HITS:1 COG:lin0088 KEGG:ns NR:ns ## COG: lin0088 COG0338 # Protein_GI_number: 16799166 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Listeria innocua # 1 207 1 215 270 87 31.0 2e-17 MDSFISWIGGKKLLRRAILEQFPETGTFDRYIEVFGGAAWLLFSKESHATMEVYNDINGE LVNLFRIVKYHPDALQKELDGTLMSREMFLDAIQPARGLTDIQRAARFWIAIKESFGTNL 
HSFICKGYNIQNAVELIKVASVRLNKVIIENNDFGKLIKTYDREKALFYLDPPYYDAEKY YPDRFQPEDHIRLKESLEHIKGRFILSYNDCPEIRKLYDRFVIIEVERQDNLVSKNGSRK YKELIIKNY >gi|157101627|gb|DS480697.1| GENE 52 42904 - 43329 245 141 aa, chain - ## HITS:1 COG:no KEGG:CLM_2534 NR:ns ## KEGG: CLM_2534 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A2 # Pathway: not_defined # 5 121 7 125 145 62 35.0 4e-09 MEKHILSQYIDACELIKETEEDITRLKKRRKLPVQDSVKGSMHEFPYAPQNFHIEGLSYS FIEHPSYLEKEEELLKFRKEDAEKIKLQVEAWMNTIPQRMQRIIRMKFFEEKTWGEVAAR LGRKATADSVRMEFNNFMSAA >gi|157101627|gb|DS480697.1| GENE 53 43335 - 43574 214 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940779|ref|ZP_02088121.1| ## NR: gi|160940779|ref|ZP_02088121.1| hypothetical protein CLOBOL_05673 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05673 [Clostridium bolteae ATCC BAA-613] # 1 79 1 79 79 133 100.0 5e-30 MGEVVMYDLYDEGRYVGRYPIAALAVMLSIQYPRLIANYAREGRTYRKRYQFERVDEPIG AELAEEWDKARQRFLRLGR >gi|157101627|gb|DS480697.1| GENE 54 43590 - 44582 701 330 aa, chain - ## HITS:1 COG:SP0890 KEGG:ns NR:ns ## COG: SP0890 COG0582 # Protein_GI_number: 15900773 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 56 330 47 321 321 156 32.0 6e-38 MDENKLKDELIAKLSSNVDTTVLQMVDTALASVLSDYEVAKRNTQLSTGVLRFPELEIYI AKMRFDNKAKSTIDQYSKFLGDMLCYLGKPVDKIQDFDIMNFLNFYAETNGISDSTKNHK RLIASSFFTFLHKRGYIIKNPMATVDTIKYTAQVREALTQKEVERMRVACGENLRDNVVL ELFLASGCRVSEVAGMRVENIDMKQKTVIVLGKGKKERPVFFSDRLLVYLEKYLDGRREG PVVISVRAPYQGIKKNAMENIVREISKKAGIEKRVFPHLLRHTFATHALNKGMPLESLSD LMGHACIETTRIYAKNHMSKIRYEYDMYAS >gi|157101627|gb|DS480697.1| GENE 55 44689 - 45027 252 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940781|ref|ZP_02088123.1| ## NR: gi|160940781|ref|ZP_02088123.1| hypothetical protein CLOBOL_05675 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05675 [Clostridium bolteae ATCC BAA-613] # 1 112 2 113 113 207 100.0 3e-52 
MGTILDAFAKEDRVEVTFSDFYRLMRESTKAEIVMNAVNCNVPHKYIREMTTGKPENIDL KANIKCKDNGNVDEIADAIHKKFEEHISGAGRDERVDGGAEKAASRDIKLTY >gi|157101627|gb|DS480697.1| GENE 56 45441 - 45701 139 86 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2966 NR:ns ## KEGG: Cphy_2966 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 4 80 6 82 86 69 53.0 3e-11 MKHFESVCKRKLVDWYNCDVETKGTRIDLDDVYIVWACKTLQNYKCLASTSVSGDGIYAE YTYNGDKQELYEDVYKKLTNTCHREE >gi|157101627|gb|DS480697.1| GENE 57 45759 - 45980 254 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940784|ref|ZP_02088126.1| ## NR: gi|160940784|ref|ZP_02088126.1| hypothetical protein CLOBOL_05678 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05678 [Clostridium bolteae ATCC BAA-613] # 1 73 1 73 73 135 100.0 1e-30 MGIDLSRFKVIHGDKVLNAVALMEVRMPEGTWENRETIVKPKILEILAINEDGNIVSIMD EAWMFQFLPIVSN >gi|157101627|gb|DS480697.1| GENE 58 46020 - 46415 142 131 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940785|ref|ZP_02088127.1| ## NR: gi|160940785|ref|ZP_02088127.1| hypothetical protein CLOBOL_05679 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05679 [Clostridium bolteae ATCC BAA-613] # 1 131 1 131 131 251 100.0 1e-65 MRLIDADKVDFNEVFMGISDFAKNIREAAQSLIDNQPDIERWIPVEERTPEKPKENPLYD NKPLEIYLVSVKNTDCVIRAFWNGASFTDGWEKLDVLAWMPLPEPYRPETLRGPGAKAGQ DAAEPVFQSAT >gi|157101627|gb|DS480697.1| GENE 59 46448 - 46801 247 117 aa, chain - ## HITS:1 COG:no KEGG:Hore_13270 NR:ns ## KEGG: Hore_13270 # Name: not_defined # Def: hypothetical protein # Organism: H.orenii # Pathway: not_defined # 3 117 4 118 120 116 50.0 3e-25 MDKKTIVFDFDGVIHSYTSGWQGISVIPDPVVPEIQAAINYLRMEGYEVIVVSTRCARPE GMGAVRRYLRDNHIVVDDVVAHKPPAICYIDDRAICFDGDALGLIGKIRAFKPWNQN >gi|157101627|gb|DS480697.1| GENE 60 47014 - 47205 250 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940788|ref|ZP_02088130.1| ## NR: gi|160940788|ref|ZP_02088130.1| hypothetical protein CLOBOL_05682 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05682 
[Clostridium bolteae ATCC BAA-613] # 1 63 1 63 63 113 100.0 5e-24 MNVKIYIRNFDGTWDLEKEFDEKVISITVANGNYVLVLEQEDYFGKLVFLYDMSKYKIEA LPQ >gi|157101627|gb|DS480697.1| GENE 61 47820 - 47999 149 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940792|ref|ZP_02088134.1| ## NR: gi|160940792|ref|ZP_02088134.1| hypothetical protein CLOBOL_05686 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05686 [Clostridium bolteae ATCC BAA-613] # 1 59 30 88 88 118 100.0 2e-25 MVRTGIVGNTIPAKIDFIAEAEKQIQAYEMAIKALESGEAETFVDRCYLGSPCPYQMPV >gi|157101627|gb|DS480697.1| GENE 62 48101 - 48466 294 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940793|ref|ZP_02088135.1| ## NR: gi|160940793|ref|ZP_02088135.1| hypothetical protein CLOBOL_05687 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05687 [Clostridium bolteae ATCC BAA-613] # 1 121 1 121 121 217 100.0 3e-55 MKDKWCYSWNEENFRGSFDSEEEAIKAARKEGPNAPYVYIGTCTECTLGWPGISASDVIE AISENLYEQCGEAAESFDVSSEDEAALDNVLNEVIEKWIEDRNIKAGCYCVLDAEKVWLK D >gi|157101627|gb|DS480697.1| GENE 63 48463 - 48819 224 118 aa, chain - ## HITS:1 COG:no KEGG:Closa_0722 NR:ns ## KEGG: Closa_0722 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 94 110 202 225 95 54.0 8e-19 MIKIAKLATAHMFDENNPEDSDYELWKGIAEYMDGEDAYVVESTAMEGKYVFLGLCDGNK DKELAYMMEQDGMTGVFIDDREGFEAAWELKEYEHEGCFCIEPQWIESCKGLKEENDS >gi|157101627|gb|DS480697.1| GENE 64 48816 - 49124 325 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940795|ref|ZP_02088137.1| ## NR: gi|160940795|ref|ZP_02088137.1| hypothetical protein CLOBOL_05689 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05689 [Clostridium bolteae ATCC BAA-613] # 1 102 1 102 102 177 100.0 2e-43 MTIEKKIELIAERYGYEPQSRQCIEEMAELTQAINKLWRKRNFGGNDRQIAEAEDAVLDE MADTLIMLWQIKYLLGFGEGPLAKRIDEKLNRQLERMGVTEE >gi|157101627|gb|DS480697.1| GENE 65 49121 - 49603 351 160 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940796|ref|ZP_02088138.1| ## NR: gi|160940796|ref|ZP_02088138.1| 
hypothetical protein CLOBOL_05690 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05690 [Clostridium bolteae ATCC BAA-613] # 1 160 1 160 160 311 100.0 1e-83 MKEYINDVPVFSLEEMNTSDGAYGVKESGIVRAYLKNLHKLGYLIPGDPILPIIFQAAGG EYRAAYDGFCYIRFEEKGRAIMFVELDRIKTEYDKIYPPKKTKPVDTYDKSILKTMISLE EDKLKMFGRHWLENEPYKYTSHTMALEAYRMLMEAHEVEG >gi|157101627|gb|DS480697.1| GENE 66 49587 - 50231 274 214 aa, chain - ## HITS:1 COG:CAC2382 KEGG:ns NR:ns ## COG: CAC2382 COG0629 # Protein_GI_number: 15895648 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Clostridium acetobutylicum # 1 211 1 208 229 182 49.0 3e-46 MLNNTENNKVSMMGEIASGFTFSHEVFGEGFYMADVAVGRLSGQVDIIPLMVSESLIDIH KDYTGHAMECTGQFRSHNQHEGVRNRLKLSVFVREIHLTQPVPMDCTKNNQIFLDGYICK PPVYRKTPLEKEIADILLAVNRPYGKSDYIPCICWGRNARRASEFEVGTRIETWGRVQSR GYIKRLSKTETELRTAYEVSILKLKRREENEGIY >gi|157101627|gb|DS480697.1| GENE 67 50260 - 50703 403 147 aa, chain - ## HITS:1 COG:FN1304 KEGG:ns NR:ns ## COG: FN1304 COG0629 # Protein_GI_number: 19704639 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Fusobacterium nucleatum # 1 111 1 106 154 92 44.0 2e-19 MNRVILMGRLTKDPDIRYTQGERSMAIARYTLAVDRRGRRGQDSSAEQQTADFINCVAFD RAAEFAEKYFRQGMRVLVSGRIQTGSYVNQEGRKVYTTDIVLDDQEFADSKGASGRGPEQ RQVQGADIGEGFMSIPDGIEDEGLPFS >gi|157101627|gb|DS480697.1| GENE 68 50700 - 50939 193 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940799|ref|ZP_02088141.1| ## NR: gi|160940799|ref|ZP_02088141.1| hypothetical protein CLOBOL_05693 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05693 [Clostridium bolteae ATCC BAA-613] # 1 79 1 79 79 128 100.0 1e-28 MEKQDETAEAMKTAMSIESMPEEAGPAEYIGKKQSGDRTYLLYRDKQGLYWYKTVFKTAA GYISEYEYIFGPKKNRRRR >gi|157101627|gb|DS480697.1| GENE 69 50923 - 51354 220 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940800|ref|ZP_02088142.1| ## NR: gi|160940800|ref|ZP_02088142.1| hypothetical protein CLOBOL_05694 
[Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05694 [Clostridium bolteae ATCC BAA-613] # 58 143 1 86 86 177 100.0 2e-43 MRIRQRNCTKLDRQHLSGHTQDRNFLIYLGAIILTRRMSRRKRYPSRKKMDSSSFRTMTC EDLIGRTIRFRDRIQCYECDEPDPYEMCPDIGNPDNCPKKDYEVKVQKVSVSRVKGETIY TINDDYEFPAGDFWKVVRIGKAG >gi|157101627|gb|DS480697.1| GENE 70 51516 - 52907 1132 463 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940802|ref|ZP_02088144.1| ## NR: gi|160940802|ref|ZP_02088144.1| hypothetical protein CLOBOL_05696 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05696 [Clostridium bolteae ATCC BAA-613] # 1 463 1 463 463 729 100.0 0 MDEVNKSGTYASYETYKEAFDAEVKRTELGFVRIGYMLRVATDTDILKESGYATMEEFAW NEYRLDRSQTSRFMNINKRFSVDGYSDQLPERFQGYGVAKLGEMLTLPAEVIEILSPKLT RPDIQKIKKEIQEEKNVTDLEVMMEEPDEAQEAMESNLGRFMHQYIHDNPKSLQGFCEAV KMDGEHMTGYLMDLLAPSGYAVQMARIPGTGKMMLTISGKSQPIILMNVRTNDKEEFAWN DLQGILHRLITEETAEDCWNALYEEPYPPAEEEKEAQRQRKAEAERKAEEKRKEQEAKER EKARQAEAQRIQKDKEHQEREAAAQKQAVAPVQPETQAAPEPKEEPQEEPKEAPKEELAQ EPQEPGTFPMPEPQEDTPNSPQEAVQMEVEDYPGVVPEKYITCHDGTQVIKPMESLRDEG CRLADEVTQWMRTGTMEYARETQRNIARLGEIIKEMLQDEECD >gi|157101627|gb|DS480697.1| GENE 71 52909 - 55050 1276 713 aa, chain - ## HITS:1 COG:no KEGG:Closa_1108 NR:ns ## KEGG: Closa_1108 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 408 695 419 695 734 201 37.0 1e-49 MKRSSVLKTEPVRPDQDGAVLTAQAVEKILILNYWKDRELIGRYCMNTITGEYETYYIKT GIWRQTVLQTLCEYDATHYWYYDLDGNIPINSEADKKLIKELLPERFSTMEPLTRIKHKE NDYNSNQRESKESRRLERIRNLMDKIPPLPKDFNEWIVKTASGDRDYAFYDKDSERWSCT ACGKDGNAGKFKRPDGKKVRHNDHITCPYCGKTIQAKTRTHAIQLKTNAAVMQDIDSDRS VVRHLDIRITWERRGHKIEISEAVRVFPLRNHKKLACDIYYNQFDRGFEWNSPWDNRGCF DIKNPVNRRMNEEYLYPEGIEAALEGTCYEAWKRTFRQLAEAGKKLEYNRLMWAQSNKGL INMVEYLFKGRFNRLLQETTRKVSCWTCEYYGPLYKDGETIEEVFNLSDRQIINRIRECD GGENAVQWMRWAEETGKKIDQETLGWLTSENIEMANLSFVQAQMSPRQVMNYVKRQQAEG YAGRRARAVLEQWADYLDMCRQKKKDTTDEMVYRPRELKRRHNELVEEMRKERMLEQLKR DEKANEEMARKMAEKYPGAEKILEEIRDKYDYQNEEYMMIVPRSLVDIAIDGSALHHCTG 
WSERYYERIMQRETYICFLRRKAEPDIPYYTIEVEPGGTIRQHRSYLDEEPGVEEIRGFL REWQREIKNRLTKEDHRLASVSVIKRQKNIDELKEKNNTRVLKCLEEDFMEAI >gi|157101627|gb|DS480697.1| GENE 72 55047 - 55625 436 192 aa, chain - ## HITS:1 COG:no KEGG:Closa_1107 NR:ns ## KEGG: Closa_1107 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 186 1 176 182 110 33.0 4e-23 MLAYKGFNADLTCTCGRGTFQYEIGKTIKESKSKCRNSGAHCAEYPLECLRWYPLGCGNR YFLVEASGSLDELGGTDTQLACTEITLLKELSLRELVGHAMMYMVNHPLREWEMNMQCCS VRKDKAEAWSEGSMAIARGPHPKVKGAAGSVLGLIREVNGEIEDARLFKVNGTIKPNTWY TLEGREPKEAEG >gi|157101627|gb|DS480697.1| GENE 73 55627 - 56115 657 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940805|ref|ZP_02088147.1| ## NR: gi|160940805|ref|ZP_02088147.1| hypothetical protein CLOBOL_05699 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05699 [Clostridium bolteae ATCC BAA-613] # 1 162 1 162 162 302 100.0 6e-81 MFDKFGEFDSADEMNRAAAAQRKEGDNEAILAIAEENGIDKEDAMDFIDGYVAAFVTPLM AAYGKLDIEAKELKPYEIMEDWLQYIKLRCAEEPEMAVAVRRKGKSLKGCIAALLEWSMK NQHPVDSDILKAVKINYKVTLGIPGMGRAKKIITEYYMGKEQ >gi|157101627|gb|DS480697.1| GENE 74 56108 - 56659 248 183 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940806|ref|ZP_02088148.1| ## NR: gi|160940806|ref|ZP_02088148.1| hypothetical protein CLOBOL_05700 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05700 [Clostridium bolteae ATCC BAA-613] # 1 183 1 183 183 340 100.0 4e-92 MKSHQEESRTDRVNTGGKPGNDKVKGGIEPLEVEAFKRRIKEGDRLYCYRPRRRGDDEVI LEKMSVIKPHGHIVTMAYHGVRGNTYETSMTWGEAIQLNRMTQKQLARELARMRKEMGIA CSSDSCPEEVKAPGRRRINRKEYAGRIMTFRNMGMTYAQISCLVGISASTVREIYIEEAK QNV >gi|157101627|gb|DS480697.1| GENE 75 56656 - 57090 141 144 aa, chain - ## HITS:1 COG:DR0899 KEGG:ns NR:ns ## COG: DR0899 COG0328 # Protein_GI_number: 15805924 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HI # Organism: Deinococcus radiodurans # 41 123 69 151 179 65 36.0 4e-11 MQDVNIYIETSFHGPARKDGEYLYLLECIRNGAPATREGRGAMERATENQLALTALAEAL 
GRLNCPCELRIYTTCQHIINAMQNRWARQWQKNEWRNAKGTPVKNADLWEKILQELDPHR YLFTDEHHEYRQWMQSEFRKGKIT >gi|157101627|gb|DS480697.1| GENE 76 57078 - 57662 120 194 aa, chain - ## HITS:1 COG:no KEGG:Clole_0725 NR:ns ## KEGG: Clole_0725 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 182 58 240 241 99 34.0 6e-20 MNENFGPDCWYITWNYAIENRPKNKKELISQITKVLRKLRNIYHRNGKVLKYVWVPEVGP RGGSHIHIVVSPIDVRLIKDVWPYGGLHFEPMRKDRNYRKLAGYFIEYSELTQKTYGGKQ AGRYNPSKNLVHAEMKKHRKRKKTFSAGEITVPEGWYLDKGSVEEWVNDFGYKYLYYLLV KLPEPERSKGKCRT >gi|157101627|gb|DS480697.1| GENE 77 58013 - 58516 508 167 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940809|ref|ZP_02088151.1| ## NR: gi|160940809|ref|ZP_02088151.1| hypothetical protein CLOBOL_05703 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05703 [Clostridium bolteae ATCC BAA-613] # 1 167 8 174 174 284 100.0 2e-75 MRKKRTLTIELCDEDMERLCKKAGSVSMTAGELLENFIADLICGERTNGSDERECADRWF DRCGFAILKDKTFLVWLIGTDMLDDVAWIWEEIQYINRMGIKDQDDQEELDEYWDELKTW FNEYKDAGGEYDELESEMEKVMAWKQEYDRLMEGEGDEEAGPEPDEH >gi|157101627|gb|DS480697.1| GENE 78 58538 - 58789 231 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940810|ref|ZP_02088152.1| ## NR: gi|160940810|ref|ZP_02088152.1| hypothetical protein CLOBOL_05704 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05704 [Clostridium bolteae ATCC BAA-613] # 1 83 1 83 83 130 100.0 3e-29 MKRKIELIIPLEYWAGESDWRLDHVGPDTVALTNAKGMRWILMIIGNRRPGLKWRWVILL LVWSIGMTIVAIVVMALAMGVRL >gi|157101627|gb|DS480697.1| GENE 79 58837 - 58992 122 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940811|ref|ZP_02088153.1| ## NR: gi|160940811|ref|ZP_02088153.1| hypothetical protein CLOBOL_05705 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05705 [Clostridium bolteae ATCC BAA-613] # 1 51 1 51 51 74 100.0 3e-12 MKVAEYKIGNGTVEIYDDNIAKTAEEREKILDRVGKIYSAYFSDKEKEQTA >gi|157101627|gb|DS480697.1| GENE 80 59356 - 59496 100 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MGKFILVFETSKEELPDVLDWYKKIKRDYPYAIVNIRVRESHSNAR >gi|157101627|gb|DS480697.1| GENE 81 59596 - 59832 197 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940812|ref|ZP_02088154.1| ## NR: gi|160940812|ref|ZP_02088154.1| hypothetical protein CLOBOL_05706 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05706 [Clostridium bolteae ATCC BAA-613] # 1 78 11 88 88 155 100.0 1e-36 MDKSGTECIRKASQIMNFNDSQKVKGVEILKILEETTVEEGIEILEKCIFQHSYFGWERK YMAEPLPFGAGPSKPTHC >gi|157101627|gb|DS480697.1| GENE 82 59935 - 60549 289 204 aa, chain + ## HITS:1 COG:no KEGG:Clole_0788 NR:ns ## KEGG: Clole_0788 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 21 193 22 188 198 103 34.0 3e-21 MSSFNPFSIPSSLLNEATIKFDVERDNSVICSSRGFFCGKNYPSTIQLVENVDIKNGDWL IDTTTNQRYYVLDAHPIIVGGQPVDWMVKYQTELEYKQQLNVNNNSTTINIHSVNGNSAI GSQANVVFNIGSNLSDIEAIIEKLSPTEQTEANELLTILKDTTESNHPILVEGALSKFSN LIKKHSDLLIAIGGWAVQLLIGTK >gi|157101627|gb|DS480697.1| GENE 83 60457 - 60684 194 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940814|ref|ZP_02088156.1| ## NR: gi|160940814|ref|ZP_02088156.1| hypothetical protein CLOBOL_05708 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05708 [Clostridium bolteae ATCC BAA-613] # 1 75 4 78 78 131 100.0 2e-29 MWISRKKWMVMEKKISDLELALQGQICMTNLNHEFCKSVANKNKMSFSSYQQLYSPSSDS DEKVRMLFDKIRELG >gi|157101627|gb|DS480697.1| GENE 84 61037 - 61423 232 128 aa, chain + ## HITS:1 COG:no KEGG:Closa_0705 NR:ns ## KEGG: Closa_0705 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 71 1 71 174 93 61.0 3e-18 MNERLRELRKKCGLSQEEFGKKLGVTKTAVSKMELGTYQITDTMLKLICSEFNVNEKWLR SGEGGEGDMFIKPQKNDLIARAAQLLGEKDPVFEAFVATYSKLSPSNRKVLLEFGLEFLN NLDIPDDN >gi|157101627|gb|DS480697.1| GENE 85 61420 - 61548 140 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIIDDYKQLIFDLIEKSEDIDYVIAVYSFADSYPDKSKNEET >gi|157101627|gb|DS480697.1| GENE 86 61669 - 61881 136 70 aa, chain - ## HITS:1 COG:no 
KEGG:TepRe1_0949 NR:ns ## KEGG: TepRe1_0949 # Name: not_defined # Def: helix-turn-helix domain-containing protein # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 12 67 13 68 184 63 51.0 3e-09 MDVEIIFHIKEFREEKKITLRELSEKSGISKSEISFIENGQRDPTLHTMCLLSLALGVPP AKLFTVNVKR >gi|157101627|gb|DS480697.1| GENE 87 62039 - 62701 529 220 aa, chain + ## HITS:1 COG:no KEGG:CE0309 NR:ns ## KEGG: CE0309 # Name: not_defined # Def: hypothetical protein # Organism: C.efficiens # Pathway: not_defined # 8 219 75 286 286 195 52.0 1e-48 MRKTKLIMVAALSSVAITACGGSATPSETTAAVTATETTAAAETTAAAEGEAAPETEATK EEEPVEDESIPTEYKSALKKAGSYSDMMHMSKIGIFKQLTSEYGDKFSEEAAQYAVDNMT ADWNDNALKKAQDYSETMHMSKQGVYDQLTSEYGEQFTPEEAQYAIDNVTADWKANALAK AKDYQDTMSMSPEAIRDQLSSEFGEKFTQEEAAYAIENLD >gi|157101627|gb|DS480697.1| GENE 88 62906 - 64483 424 525 aa, chain + ## HITS:1 COG:BS_yokA KEGG:ns NR:ns ## COG: BS_yokA COG1961 # Protein_GI_number: 16079225 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Bacillus subtilis # 9 440 16 451 545 145 29.0 1e-34 MLDPNGIYFAYLRKSREDREAELHGYGETLKRHQRMLEDLAKYHKIKIAKFYSEVVSGET IASRPQMLAMLDDIESLSPDGVLVVEIERLARGDTRDQGLVLETFKYSGTRIITPMKIYD PTDERDEEYAEFGLFMSRREYKTINRRLRDGKIASFNEGKWTGNKAPFGYERVKLAGEKG FTLKIVPEDADVVRLIFQLFIYGSNASNNIPVGTSKICRILDNLHISPPIGETWQPCTIR QILSNPTYIGKVHMGRRSTQKKTLNGVVKISRPINKNAKIADGLHPAIIDDALFLKAQEK RKANYQKPFFEGIKNPLAGLIYCSVCGKSMYRRPAGKRCKEDMIQCTTHGCPTSASYYHY IEEKLLVSLNKWLEDYRIQVKDISPENYAQQLNLLQSQLNAAQNESETSKKQLSKAMDLL EKDIYSVEMFQQRASELNEKIALSSRSISDLSVKIDSIRTKAVQKEEVIARWERILSLYP QVDDPAVKNALMKELLTRIEYSKPHKGNRQNGGMDQFTLKLFPRL >gi|157101627|gb|DS480697.1| GENE 89 64493 - 64711 197 72 aa, chain - ## HITS:1 COG:CAC1338 KEGG:ns NR:ns ## COG: CAC1338 COG3546 # Protein_GI_number: 15894617 # Func_class: P Inorganic ion transport and metabolism # Function: Mn-containing catalase # Organism: Clostridium acetobutylicum # 1 68 1 68 200 100 64.0 5e-22 
MWIYEKRLEFPVNIKEPNAQMAQFIMSQYGGPDGEMGASMRYLSQRYAMPYKECKGTLTD IGTEECVHNLYT >gi|157101627|gb|DS480697.1| GENE 90 64711 - 64983 235 90 aa, chain - ## HITS:1 COG:no KEGG:Closa_2886 NR:ns ## KEGG: Closa_2886 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 89 5 87 88 81 50.0 9e-15 MMNPDFNASREAMLREVDQAGFAVVDANLYLDTHPCDAAAIDYYNQMANAYRNAAAAFEA QFGPLTASANTDAAYWSWINDPWPWEGGCY >gi|157101627|gb|DS480697.1| GENE 91 64980 - 65351 230 123 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940823|ref|ZP_02088165.1| ## NR: gi|160940823|ref|ZP_02088165.1| hypothetical protein CLOBOL_05717 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05717 [Clostridium bolteae ATCC BAA-613] # 38 123 1 86 86 160 100.0 3e-38 MDSYTDVSTQCCGRGRMSCGTSFGNGCGSYTSPDYPDMNAFPWDSWGSNPALSYPTPAAP VTQPANMSCPSTEGGVLEQQFPVAMAYVPWQQWQTTYAPERGLVQGTIFPDLDLQFNYGR CGR >gi|157101627|gb|DS480697.1| GENE 92 65429 - 66028 163 199 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRQSLTLCLGVPYDIVQTNKKGEYNICVALDFVDVDGIVVIVDVAVAVTTVVVDAAATVA ALAAAVTADAAAAMTVAVAVDAAVTADAAVVALAVEETTAVAAAMISVQAPDARHTGKAS AMAIGKASTMQPGAIQAAAAPPPARPEDLVQLSPKTAAAVTTNRNWQTTGIRNRASYEAL SDCRQTGFGLCIPKPVFFY >gi|157101627|gb|DS480697.1| GENE 93 65969 - 66616 364 215 aa, chain - ## HITS:1 COG:SPy2013 KEGG:ns NR:ns ## COG: SPy2013 COG3666 # Protein_GI_number: 15675797 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 6 112 270 376 401 136 55.0 3e-32 NGIETLIADAGYKTPGIAKLLIDQGVKPLLPYKRPLTKEGFFKKYEYVYDEYYDCYICPN NQVLTYRTTNREGYREYKSCGSACASCAYLAKCTQSKDHVKTVMRHIWEPYMDMCEEIRQ TLGMKELYSQRKETIERIFGSAKENHGFRYTQMFGKARMEMKVGLTFACMNLKKLARMKA KWGEAHFTNFILNAIWLIKENWLWDTKPKTSLSTV Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:27:59 2011 Seq name: gi|157101626|gb|DS480698.1| Clostridium bolteae ATCC BAA-613 Scfld_02_39 genomic scaffold, whole genome shotgun sequence Length of sequence - 43567 
bp Number of predicted genes - 36, with homology - 31 Number of transcription units - 20, operones - 8 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 278 - 314 -0.2 1 1 Tu 1 . - CDS 341 - 928 -37 ## COG0582 Integrase - Prom 1076 - 1135 8.4 + Prom 945 - 1004 6.9 2 2 Op 1 . + CDS 1131 - 1622 233 ## gi|160940828|ref|ZP_02088169.1| hypothetical protein CLOBOL_05721 3 2 Op 2 . + CDS 1615 - 1932 165 ## gi|160940829|ref|ZP_02088170.1| hypothetical protein CLOBOL_05722 + Prom 2284 - 2343 8.9 4 3 Op 1 . + CDS 2366 - 2722 229 ## ELI_1867 hypothetical protein 5 3 Op 2 . + CDS 2725 - 2931 106 ## gi|160940831|ref|ZP_02088172.1| hypothetical protein CLOBOL_05724 6 3 Op 3 . + CDS 2979 - 3209 111 ## + Prom 3342 - 3401 2.3 7 4 Op 1 . + CDS 3554 - 4336 271 ## SPH_0097 hypothetical protein 8 4 Op 2 . + CDS 4365 - 8036 1004 ## COG4886 Leucine-rich repeat (LRR) protein + Prom 8136 - 8195 4.9 9 5 Tu 1 . + CDS 8257 - 8343 87 ## + Prom 8473 - 8532 6.6 10 6 Tu 1 . + CDS 8655 - 9095 107 ## GY4MC1_2110 membrane-bound metal-dependent hydrolase + Prom 9323 - 9382 3.5 11 7 Tu 1 . + CDS 9517 - 9960 188 ## COG2340 Uncharacterized protein with SCP/PR1 domains + Prom 10059 - 10118 6.3 12 8 Op 1 . + CDS 10228 - 21831 5130 ## COG4932 Predicted outer membrane protein 13 8 Op 2 . + CDS 21828 - 22385 432 ## gi|160940840|ref|ZP_02088181.1| hypothetical protein CLOBOL_05733 14 8 Op 3 . + CDS 22388 - 22690 214 ## gi|160940841|ref|ZP_02088182.1| hypothetical protein CLOBOL_05734 15 8 Op 4 . + CDS 22694 - 23395 379 ## COG4509 Uncharacterized protein conserved in bacteria 16 8 Op 5 . + CDS 23399 - 24901 401 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 17 8 Op 6 . + CDS 24898 - 25944 368 ## gi|160940844|ref|ZP_02088185.1| hypothetical protein CLOBOL_05737 + Prom 25955 - 26014 3.0 18 9 Op 1 . + CDS 26054 - 26329 145 ## gi|160940845|ref|ZP_02088186.1| hypothetical protein CLOBOL_05738 19 9 Op 2 . 
+ CDS 26326 - 26739 116 ## gi|160940846|ref|ZP_02088187.1| hypothetical protein CLOBOL_05739 20 9 Op 3 . + CDS 26744 - 29392 1015 ## pE33L466_0231 hypothetical protein 21 9 Op 4 . + CDS 29419 - 30543 502 ## MPTP_0049 hypothetical protein 22 9 Op 5 . + CDS 30555 - 31088 60 ## gi|160940849|ref|ZP_02088190.1| hypothetical protein CLOBOL_05742 + Prom 32581 - 32640 2.4 23 10 Tu 1 . + CDS 32717 - 33475 32 ## gi|160940850|ref|ZP_02088191.1| hypothetical protein CLOBOL_05743 + Term 33600 - 33647 2.7 + Prom 33921 - 33980 4.5 24 11 Tu 1 . + CDS 34061 - 34831 99 ## gi|160940851|ref|ZP_02088192.1| hypothetical protein CLOBOL_05744 + Term 34832 - 34865 0.6 - Term 36017 - 36063 8.0 25 12 Tu 1 . - CDS 36066 - 36383 95 ## Calow_0808 XRE family transcriptional regulator - Prom 36492 - 36551 4.0 - Term 37159 - 37204 3.1 26 13 Tu 1 . - CDS 37258 - 37533 200 ## COG3077 DNA-damage-inducible protein J - Prom 37721 - 37780 10.2 27 14 Tu 1 . + CDS 37964 - 38221 116 ## gi|160940858|ref|ZP_02088199.1| hypothetical protein CLOBOL_05751 + Prom 38374 - 38433 9.6 28 15 Op 1 . + CDS 38491 - 39234 344 ## COG1192 ATPases involved in chromosome partitioning + Prom 39305 - 39364 6.9 29 15 Op 2 . + CDS 39393 - 39518 210 ## + Prom 39686 - 39745 4.4 30 16 Op 1 . + CDS 39921 - 40265 169 ## gi|160940863|ref|ZP_02088204.1| hypothetical protein CLOBOL_05756 31 16 Op 2 . + CDS 40322 - 40579 113 ## gi|160940864|ref|ZP_02088205.1| hypothetical protein CLOBOL_05757 + Prom 40581 - 40640 4.4 32 17 Tu 1 . + CDS 40691 - 41488 400 ## gi|160940865|ref|ZP_02088206.1| hypothetical protein CLOBOL_05758 33 18 Op 1 . + CDS 41853 - 42938 322 ## gi|160940866|ref|ZP_02088207.1| hypothetical protein CLOBOL_05759 34 18 Op 2 . + CDS 42919 - 43068 131 ## gi|160940867|ref|ZP_02088208.1| hypothetical protein CLOBOL_05760 35 19 Tu 1 . - CDS 43015 - 43155 69 ## - Prom 43309 - 43368 8.3 + Prom 43139 - 43198 4.0 36 20 Tu 1 . 
+ CDS 43232 - 43381 89 ## Predicted protein(s) >gi|157101626|gb|DS480698.1| GENE 1 341 - 928 -37 195 aa, chain - ## HITS:1 COG:CAC1595 KEGG:ns NR:ns ## COG: CAC1595 COG0582 # Protein_GI_number: 15894873 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Clostridium acetobutylicum # 3 195 1 186 186 122 40.0 4e-28 MNITTEPIRNKKQLQAILGYLKENSSCRGKVRLRNYVIAKTQLNTSLRISDVLPLKVSDI MHLSGNFRRYINLKEDKTGHRQRIAINDPLKVTFRMYIKEMGLEYDDFLFPGQSKSKPVT TTQIHRVFQDTALALRIDNFNTHSLRKTWGYYAYKQTKNIALIMEVYGHTTVRQTMKYIG ITQSDKDRLYNAIEF >gi|157101626|gb|DS480698.1| GENE 2 1131 - 1622 233 163 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940828|ref|ZP_02088169.1| ## NR: gi|160940828|ref|ZP_02088169.1| hypothetical protein CLOBOL_05721 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05721 [Clostridium bolteae ATCC BAA-613] # 1 163 53 215 215 296 100.0 3e-79 MLTGMEVQRESGQSKDENSEVVRLKKKINALIDLLEKTQKEKQYLNTFVDGYIRNNAPVE LRIKEVIHVLNILTKEAKWLDSSYQTSSSRNYYRIKTEDFEDVLDRTLVNIPRKKMIKIM ANIGVLKCDDGHYTYSATIQRTMYRVYMLKKSAVNTLIGEQDE >gi|157101626|gb|DS480698.1| GENE 3 1615 - 1932 165 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940829|ref|ZP_02088170.1| ## NR: gi|160940829|ref|ZP_02088170.1| hypothetical protein CLOBOL_05722 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05722 [Clostridium bolteae ATCC BAA-613] # 1 105 1 105 105 197 100.0 2e-49 MNKGDVCVQEFVRIADFLLKSGKVTIQRGYILAPRNVIDRLLARNQYETNETKLQYWKKL HWIDADRDRFTKQVSIGGQRFRMVKIDIQVFQTLGILFEEILVEK >gi|157101626|gb|DS480698.1| GENE 4 2366 - 2722 229 118 aa, chain + ## HITS:1 COG:no KEGG:ELI_1867 NR:ns ## KEGG: ELI_1867 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 42 118 60 136 147 70 46.0 3e-11 MRIRGFNEKKQEWVIIEITDGGAMVGLENWYDIASCSVGVSTEKADMRGKEVFQGDIILS YSTDIEMEIKYGCYQSYCPEDKCVMNNIGFYAVGETLKDMPIGPLDEYAIVIANTYGK >gi|157101626|gb|DS480698.1| GENE 5 2725 - 2931 106 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940831|ref|ZP_02088172.1| 
## NR: gi|160940831|ref|ZP_02088172.1| hypothetical protein CLOBOL_05724 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05724 [Clostridium bolteae ATCC BAA-613] # 1 68 1 68 68 125 100.0 1e-27 MVLHIKKRIVVSIRKSTIRLILVLYFILLFSFVAYLGDWGNWKQDKYSETAYMEELEGED RTFVLPKD >gi|157101626|gb|DS480698.1| GENE 6 2979 - 3209 111 76 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTFHRYSLFHIEAGMNGLDSIMQKVVKREVWKKGYIFFWMITSLNAYGQILTGTSHYFIG LAMLCLQIFQPTQTIL >gi|157101626|gb|DS480698.1| GENE 7 3554 - 4336 271 260 aa, chain + ## HITS:1 COG:no KEGG:SPH_0097 NR:ns ## KEGG: SPH_0097 # Name: not_defined # Def: hypothetical protein # Organism: S.pneumoniae_Hungary19A_6 # Pathway: not_defined # 23 250 24 247 248 104 35.0 3e-21 MQKSVFLHEFRQSFGLRLTDDYILFRKPLELYNLKTGDSKLFKSMEEVYSQSINEITVGD MISKMTLETFYMRLDGGRGASSGLGQDKSFKFSHADDKGARTDTYDFPSRVNINIRNPDK TLEVFRNLHVNDKKEHAFTVDEQGFVTSYVHGMAHSVMVAGRRGEMVYHNHPSGGAFSDA DLISTALTAAKGIVASGKSGDYILQKGNHFKANAFVKAVNNAHLKGKDYDDAVDRWLKAN AKKYDYKYEFRKCQDELKTL >gi|157101626|gb|DS480698.1| GENE 8 4365 - 8036 1004 1223 aa, chain + ## HITS:1 COG:lin0372 KEGG:ns NR:ns ## COG: lin0372 COG4886 # Protein_GI_number: 16799449 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Listeria innocua # 542 792 413 638 656 63 25.0 3e-09 MKIRKLGIILLATIITVTNMVVYANSADDSIIHSSDDVIKEYTKNKEMEKNKEEYPKSTP SEAKPKPSEEEKFVSTPTVSIPSEATPSEATPSEATPSEATPSEAVPVEVIEFESIEPDE VPSWYTKVKKRLFGGKKHYQFTDRYGEEHHRIYGYYEEEAEAAWYECQPDGSVTNESFWI DLEWEDLNLAPLRWEAAEPQADDIRDLSERIFGDAEMLGIDIDLAKSNHEAENYDGTMYD IWSLDEEELAELYFFYGHAVNGILPGENTWFESTEDGQILDEVSDLVLSVMSADARAIDA TATEFEFGQTYFGVKDHPTGTLYIPCELGEKYNWQITVNVKRVSSTGGSSLKYYQCFKKD GNGPWYLADTASGHTGYFPVTVPYGADSKYYYTGSSRYLTAEEVNQGWYKTESSSEGDRI NYIDPEGDSFRMAAQHQSFGDWYITKEATCTSKGTKQRTCMGCGAAETAYIDALGHNWEA DYYTGANNGTYYKRCTRCSEKTDVRNNPYTITYHANGGKGTMAVQNCTYTAKLNLLSNTF QRDYYTFKNWTTEPDGGGRIYSDTQPVSNLTAVYGGNIDLYAQWQPKSYNITFDDGLDGT QNITHSYVYTHPFGELPVFKRLGYTLSGFFSEPEGGEKITAESKAPHSDTIYYAHWEANS 
YDIFFHTGKSYCGSEKKRVTYDEKVGILPAASMEDFEFLGWYTEPYSQTYVEGIMCGDAN PPDDKQIKSEDVYRIAGHTQAYAYLVLQYEELENGVNRRPGPDGYMNTKDDNYYLNGEDG IAGTHDDEKLYPGEDGKYGTKDDYYLDKEGHKIYPGNDSIFRTEDDYRDNGDGTNTRPGP DCNFKTEDDVEASNGLDGLPGTMDDWIDNNENYPDTNLRPGPDGVFGTGDDEVYWNGPDG IPGTEDDELVHPGLDGKLETGDDWVENGHNYLETNLRPGPDGVFGTEDDEVYWNGPDGIP GTEDDELVHPGLDGKLETGDDWVENGHNYPETNLRPGPDGVFGTEDDEVYWNGPDRIPGT EDDKKILSGPDGQYGTEDDCYDNKDKQDGTNMRPGSDGIFGTEDDELWLNGPDEMPGTED DIKYVHRNSSGGGGGYGDRLVGRGAYKPVIEIMDAALEAMEPQTVGLVSNFYDTYQSFKK GQEMSRYIVNTGMQEIQMEAATGSQVEKETQESRWSDESYSDKDVGPVVPVKTYRNMKVI WMILLLILLYVLGYEIYKAKKKD >gi|157101626|gb|DS480698.1| GENE 9 8257 - 8343 87 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKHFDFKDLMAFGMFILALLTFVFTYIR >gi|157101626|gb|DS480698.1| GENE 10 8655 - 9095 107 146 aa, chain + ## HITS:1 COG:no KEGG:GY4MC1_2110 NR:ns ## KEGG: GY4MC1_2110 # Name: not_defined # Def: membrane-bound metal-dependent hydrolase # Organism: Geobacillus_Y4.1MC1 # Pathway: not_defined # 28 140 34 148 168 70 33.0 2e-11 MGVVLGGMIISVVEPTLPVAGLILGMSVLGELAPDIDTVNSTISLRTPIIPKLINRKFGH RGLLHSPLFLVLLWLVIGRQNLIGNVFCIGYLGHLVQDLFTCAGLPLLFPFDKKKISLFP YHSGGIMDYVMTSVLIILLLTVNQLI >gi|157101626|gb|DS480698.1| GENE 11 9517 - 9960 188 147 aa, chain + ## HITS:1 COG:CAC2230 KEGG:ns NR:ns ## COG: CAC2230 COG2340 # Protein_GI_number: 15895498 # Func_class: S Function unknown # Function: Uncharacterized protein with SCP/PR1 domains # Organism: Clostridium acetobutylicum # 13 146 44 174 175 60 29.0 8e-10 MVSNNNTGITTYEHIDMKALSYYMIDLINEEREQRGKDSLVINEELMDDAALRAEESSIK FSHTRPNGKDYSTAINVEYSKVGENLVYSTFAPNTEMEEIAQQTISQWLNSEEHKRNMLK SQWDETGLAAYAGEDGCVYFAQIFIQN >gi|157101626|gb|DS480698.1| GENE 12 10228 - 21831 5130 3867 aa, chain + ## HITS:1 COG:lin2281 KEGG:ns NR:ns ## COG: lin2281 COG4932 # Protein_GI_number: 16801345 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Listeria innocua # 2985 3480 1009 1483 1622 125 26.0 2e-27 
MNYYQVGLVDGAVPAFCIEEGKKLPDNTVMTYEKYEAKPGQTVPVIGSFDRYLPMTVAYE WMLREGNYHDTDRYAVTQVYMWGCMSGYKDNWPAMEQAMQKLAEVLKRPYVMNYYEELKE YVIEGVAGYEAANNASLPSWNGTQQKMTLKDGRYELTLDISTCPQLKDTTWTFPDSNWNY QLSSDGNNITFIYNGAQEPIGTISSADIEGIEATYYAYIFSPPPGEQVQLGRLDAGFQPA NVTFSLNQGTLATPDVSNWEVYRHSETFESNYNIDLEKYCAETNQPLEGTTFNVWEDFDF SQINEGGYTEGEPDGTTGEVYLNCMTPEPESDYVCDTITTDTNGQASHSDARFYNYSKTY CMGHPAPEWIECDHEGGEGEDSEDCNCDEENERLREQWMAEQELCASTCDFHVQNDDEEN HGQDTAAMEAMLADRDETYENFINLEYSYQLEEKTARTGYVLHGLHNDDPEIETVILTSA QAGGDVRSGTYKASNMVGRVLNPIYNMAISRMEGLRAYTYPVPDGQDLELDEQRSIISIL KEEEEKPLIEIETPVDNMVDGTGNQENGGSGSSGQGGGSEDNSEESGESGSGSQGGGSED NSEESGESGSGSQGGGSEDNSEESGESGSGSQGGGSEDNSEESGESGSGSQGGGSEDNSE ESGESGSGSQGGGSEDNSEESGESGSGSQGGGSEDNSEESGESGSGSQGGGSEDNSEESG ESGSGSQGGGSEDNGEESGDSEGTDTGTPMASISLHGKSLLLSSIATQSNAKKESEEAIN PPDTDEEEVEEYEAYEYVRNPLEVSYEMDISMTQTDDPEEEGNGITRFLHSLFSGDDDGD SIIAALPSFVDDDLEPIDVSGYGASGTILYTFKVWDHRTEGRIHINKRDLELYNADQDGS YGLTQGDGTLEGAVYGLFAGQDLSHPDGKSGIVYNQNDLVAVATTDKNGDASFLAYTEKP GTRLDDDGNIKPLEGVTGPENLYDGSSITSSAEGFGTITYPDYVAANGGQWIGRPLLMGN YYIMELSRSEGYELSVNGISMSETNRTQVGTTIREAGQAQVVGGLSDYNDMSADGSWNDF IVENYKTEEGYDITITGYPEGAKIYRVDIENRTETVKVVTGSSLQPKTDDHGNIVYQTAK GGEYKIGPDGNPIIKTGTATDSNPDEQIPYGETLYYRFRTAPYPSGSATPEDMSKWGQAV DGNYLTDQVNDMLGQIGYKAAADTSPWADIELTGATNALAAQEIMDWFTAHNFFDCGAVE SIYQKDGGYHARLLYDYSAASDTYPAVYDSVNQKLYVRKVAEVDGGPAGEVGYWIEYQKG EYSLKSKTVSIKEKRQVTNSIPYGDDIAAYIDVVYQPVYETYQAGEVLLDRSGNPIPVME RVYTYEDQEQTVEREKLVPLDAVYDTQAGTYTIHIANDMDWSGATDAVRTIYRIVTTEKT IEHDGVEMPYNQYMTDVVGAGVSAYASMPELDEGSYIKTQALVYPGQNEPTQDGGTGDRP VQVLQRAIKQSIKVTKDISQASYDGVNTYGAVHNDPLTVLLGLFNGGSSSQGTKILNQFK FKAYLKSNLEDIFVDDAGTIVSEYIGTDGFTEEVQKVYLPPKDGNGNRLLETKEDGTYNY TKFFDALYAADRKAGGYPVEVVRQFAIDYYDIDSYKKEILAAEPGLNSDVAYDQALQRAS EAAAAYLDIFVGLDDRLAIAWDKDAGGGADGDRTTLQCNTKNGKDDYYNNSIMLPYGTYV VAEQTPADVGKELANRHFNKDYPKEVTLPFVPDISQDGNTGETDINYQTGSPYYRYDSTD TPEELIRKYKIRFNEETHIIQAHGQDGDYEVFKYGLDKEVRPGHSLTSTEPYEAAYMDGR NETVKSYYAGYTSQSEDASTMDDVIYDGYETDSGQMEVRDGVATMEGMQLAIDGKFAPML 
VPWTVLAPAVDRVNPDTGNVETLIPSGSGADFNFVAFAQEDFEDTYYNTKLRIEKLDAET GDNIIHDGALFKIYAAKRDVEKNGTNTVTGTGDVLYGEAVDWEGNPVLDADGNKILYPRV GESNGSMDDLPVRLDKEGIPQYDESQLIRQEDHDGNETGIFRAYSTIREVVVDGQVQKVP VGYIETYKPLGAGAYVLVEVQAPEGYTKSRPVAFEIYADDVTYYHDQRNPDGTTDGWEPE TAAKYQYAVPVAGDTNKFQTEIVSQIVVEDYPSRMEIHKVEDGDSMVGNQNVLQKTDAQG QTEASGGFDGDIMVNDEGDILMYQVHGRKEKLEERGDVREITYDPDTKDWYGYVTKPFDE YSEHIVEGTEKALKAMPGVKLLYELDGTYTGKGIRFDIPVSGARLSLYHAVELEKTGENQ YEGVTVEYEDGKVTRIIDTNTGTHKEIRRTGQDSGPAGLNVWDTIIVDNNPVDLYFYDME QVDTMEDPDTGEIWVLDDRGNQLCYADARSGMAYVYDDYGRMLAYTADEEGNKELVKSIQ VMDDGTGNGQTIYEDKSTVDDENGLPIYYTDGKVVTKDETWITDDSTDPYGNPESTGAVH TITRLPFGAYILQEELVPYEQGYIQAKHMGLVLEDTDEVQKYFMQNVFTKTAFAKIDVHT QKEIQGATMTLYCAQLDSEGNPLKEEDGTYKKGDAYTTWLSGYEYDDNGNLKLDAQGHPI PTTEPHWIDHIPVGFYVLEETICPYEQGYVQSVSVNIDVLETGNVQSFEMEDDFTALEIR KYDTKNEDVIYEDSEAYLTLYHAKLDAKGYPVIQDGIPQYDEFGKIFTFRAATYKDGQDV ASTGREVPDAGGNHPIMKYDYDFQPIPNTYQGRYYYTENATVRIEYLPVGSYVLVETDNP DGYATADPILIEIEDTGHLEEIQYAEMGDKPLSLEVSKVNITGGKEVNGAVLTIYPVDER GNVSDTPLILHQPTVDGNYQDITATWVSGLDGKYTDADRVAGLIPDGFEVGDLKPHLVEY IPEGDYILREETTPYGFLQSVDVPFTITDTQVIQKAEMIDEIPDGILKITKSDTDRPDER LQGAQFQLENMTTGTLCETVTTDEQGMAQFQPQPIGYMDRDGNFKPYTYKCSETKAAPGH MLTLSPYEFQFEYVNELTDKIVLDYNPTNDSNRVVTDKRIGNTDELLDGVTLRIERKVDN GWETIDEWMTGKQGHYTKDLQAGDYRLVEIKAAEGFKLLAEPIEFTISDGMTEVPHFVMR NYSTIVDITKVQSGTDTLLAGARLQLRKDTGEVIREWTTQEDGGQKFYGLEPGTYVIHEL QAPAGYVMAGDQEIVVTENNDTTQVFQYENRLKSSSGGGGGGGDKPKPKVDYISFKKIDS GGMPVAGAEFTFYRGDGSVLDTAVSDANGKITIKRPESGTYTIRETKAPDGFYISDKTYT VTIGQGGVQGDYEIVNVPNTTVTINKLDAETQEPLSDVKLQILDGSNHVVAEGWTDDNGQ FGFVAPYAGTYHIKELEALEGYRKLPSTYECGVQEDGSITGTTTLYNSKTQKIGKVMASY TPHLTGKGVASFGVPGIRVPGAKTGDDTPIMLYVVMLILSVLILAGFVITYWMRKGKKKK NLMMLVLIIGLSAGMHCEGKAAPMDTLEAATSSNASEVELQQKEGINTLIVVSAPNIDAE SIPRPEKTYQYEGKTYELQDYKVIETSIPEREEIARDTITYNEVEQADTIPATAVIEVED TITGTVTKVTVPLKDYEYTDYRWVDGFEFLITVEAADADSYALGDILIPRQEENPFAGYL NNLLELIQVNPAYYQIHTVEWTGEPWVAEDGVTYRQAIARGLKQVATVNATYEGIVLLDS VLANAVEAVYEEDMSQVETEPEKPVETVEVEVAKKNLWDFILELLGRLLKHPIIAMVTCI ALIILICLVVTILQVLSAKKRKKGDEE 
>gi|157101626|gb|DS480698.1| GENE 13 21828 - 22385 432 185 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940840|ref|ZP_02088181.1| ## NR: gi|160940840|ref|ZP_02088181.1| hypothetical protein CLOBOL_05733 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05733 [Clostridium bolteae ATCC BAA-613] # 1 185 1 185 185 357 100.0 2e-97 MKTVGIVGCQPRIGTTTQALQLTLCLMHMGYNAAYVEMGERDYIEKLDALYQGITIDKND IIYCQSIPLYTGSRIALANRGRYDYVIKDYGYIGNPGFEKISFLEQHIKIVVGGAKANEV DYVEQVIEDECYEDVNYIFSFIGLSEQRDVKEMMKSKKERTYFAVYTPDPFTYVENDIYE YILGD >gi|157101626|gb|DS480698.1| GENE 14 22388 - 22690 214 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940841|ref|ZP_02088182.1| ## NR: gi|160940841|ref|ZP_02088182.1| hypothetical protein CLOBOL_05734 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05734 [Clostridium bolteae ATCC BAA-613] # 1 100 1 100 100 156 100.0 5e-37 MEYKKRLGEKVEEKRAFEQEQQKLRRKYKIHEDGTILVKKKRLIEILLNTGAATIRIGAT IILCSLAAIGLISLLYVGPRTELLIIMQEVVEQLHSMLGV >gi|157101626|gb|DS480698.1| GENE 15 22694 - 23395 379 233 aa, chain + ## HITS:1 COG:lin2285 KEGG:ns NR:ns ## COG: lin2285 COG4509 # Protein_GI_number: 16801349 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 61 218 67 226 246 94 36.0 2e-19 MGKLKTGAYAAVFVLALVGLGISSTRLLDISAEYQRSNIAYRQLREIKGIKGPREVVEDK LKDINPDYIAWLQIENTPIDYPVVMESDRDYLITGFYGENSIAGTPFIRRAQNAFQDKNT VIFGHNMRDGTMFAALKNYLKSEYCKQYPTITILNKGREYKYKLFSVQLLNEDDGSVFAY TFANDESYREYLDLMIQKSLVQLTEPDTDRNIITLSTCYGKDKRLVVQAFGEG >gi|157101626|gb|DS480698.1| GENE 16 23399 - 24901 401 500 aa, chain + ## HITS:1 COG:Cj0886c_2 KEGG:ns NR:ns ## COG: Cj0886c_2 COG1674 # Protein_GI_number: 15792216 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Campylobacter jejuni # 215 409 253 448 573 103 34.0 1e-21 MAQSSLDRQLDMAMGNLVVIIGKVLKSVWYGCKKLRGRKIIPFMLCIILAVIGWTRWGHI 
ASVLDVLEMPGWMQKILYILILLSPAIDLALLGSVITRKQMEFYRKFEEIGFKEKTGKYP LYLGEVEEPDGRICYTFRSNISIGEWKKRAVQIETVLDCTILRIENKGSKRIIQMLAVSS EYKIPDKLLWEEGYLSNEDGVCVIGQGMLSQISFNLNRTPHVLAAGETGSGKSVILRCCL WQLISQNARVYMIDFKGGVEFGLDYERYGEVITDRERAAEVLEMLVKENTARLALFRKLR VKNLPEYNKKTGKNLCRIGVFCDEIAEMLDKKGVPTKEREIYERLEGYISTLARLSRATG INLFLGVQRPDANVLTGQIKNNIPIRICGRFADKSASEIVLNSTAAINLPDIKGRFLYLQ GNELIEFQAYYFDDETMLDESVEVSPGNMLTESYTVNHHEEEIKVPSKSKSVTEEIKQVN YEEYDFNFGEDEEIEWSVDK >gi|157101626|gb|DS480698.1| GENE 17 24898 - 25944 368 348 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940844|ref|ZP_02088185.1| ## NR: gi|160940844|ref|ZP_02088185.1| hypothetical protein CLOBOL_05737 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05737 [Clostridium bolteae ATCC BAA-613] # 1 348 1 348 348 617 100.0 1e-175 MSLLPIHKTGEATETKKKLKKKGENQEAANVQKNTKDTVAVKKKKKEKQPRDTTYVLKKN FTMKIFRWVFWFMLVCVFARGAYQIIKPQAADELRQMIKDFEQTQQSQGDAAEEVMDFAQ DFAKEYLTYEKGGEQDFRTRINPYISQRLNNAPGIYAFRNNAKAAYVNAYRKEQEGEVYN VYVNAEIQYDKGEDGIEYADCTLKIPVVATESGYSITALPMYIQDKRNSDDYTLPQIPLG QEIDTALVSPSIENFLSAYYSQEQSMINYLLTTDADRSKFVAIGGRYEFKKVESVKAYQR PEAADILCSLTVRIADTVNQEEISQEYMITLVQESDKYYVKDIDTKIY >gi|157101626|gb|DS480698.1| GENE 18 26054 - 26329 145 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940845|ref|ZP_02088186.1| ## NR: gi|160940845|ref|ZP_02088186.1| hypothetical protein CLOBOL_05738 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05738 [Clostridium bolteae ATCC BAA-613] # 1 91 16 106 106 149 98.0 6e-35 MLSIQLTLSMAMVSFASGANYAENAAKWGFEQAFWVVLFVTLVVAAGAWMKHALSTAIVT VIIGGILAYLCKQPEVISTIGNAVGSRVFGG >gi|157101626|gb|DS480698.1| GENE 19 26326 - 26739 116 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940846|ref|ZP_02088187.1| ## NR: gi|160940846|ref|ZP_02088187.1| hypothetical protein CLOBOL_05739 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05739 [Clostridium bolteae ATCC BAA-613] # 1 137 1 137 137 263 100.0 2e-69 MIMDNSEQKKVTIYSYSKVWKVEKRIYAIQNLVLPVPIDPWQLLYFGITWAAINLIFGAL 
PGFSSIPVVIRSLLIPYLISKFLLTKKLDGKNPIRYMFGIVVFLFSEQDQALEHFKLQPV KRQAIKLSWNCSQGDRR >gi|157101626|gb|DS480698.1| GENE 20 26744 - 29392 1015 882 aa, chain + ## HITS:1 COG:no KEGG:pE33L466_0231 NR:ns ## KEGG: pE33L466_0231 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_ZK # Pathway: not_defined # 5 824 7 812 846 396 30.0 1e-108 MYQCPIDYYIDNLVFNADKSCWSIYRLKGFNYDYLSTQGKVNKLVQLARVYSGIMSYAQI LVIPVEKDVQEHFRNLRERLHRNDILYDSAMQQIELTESYLREKTEQLGRVNDYATYFIV KLSEASEYEAVEKISEFLLYFIKDPVNAINVQMNLDTKDILKSKIKAYEKMANKWLDENK YKMSMEAVSTEETQWLFRRIAYRGTGIPAKLFYKTVGREKWEPDVDEVVINQERIVRPYH RDTVSLFEGSIEAQNRSLKVSTGFSTSYQTFLPIVWLPEQNEFPGKEWLYELQVRNLQAE VCIHIKTIPHKSALHQLELKKREINSQIEHIEEANADIPEDLYTSKEYADCMEQELKDDR SPLLESNITVCLSDSSLDGLEKKCARVKELYDDMNIIIERPLGDQLKLYMSFIPSVRILI KDFVMRLTPVTLASGIVGVTRELGDNRGGYIGSTGKEGKPVYYAPELACLQNMSPAATFF GDLGTGKSFNANILIYQMVLYGGYGLIIDPKGERSHWEKQLIVLRGLISTVTLGAAASDR GKLDPYNIYPDDIREAHELTLNVLSDLFGLDPKSDEYIAILEAQKRMEKSHGAHCMLKLA KMLEAIPEEDNLHEAANNLARRIILYRDNGMAGLLIGDGAEHAITLDNRLNIIQLQNLKM PSPETPKQDYTRDEVLSVVIFGVVSAFIKKFALVKRPVPKGILVDESWAISASKEGRNME EFISRMGRSLYTCIIYNGHSTKDLPTEGIKNSITYKFVFRSRNNREEAARLLDYLGLEVT PENMLVIQNLGAGQCLFKDLYNRVGVLQFDPVFQDLFDIFSTTPTEDADEEPEPEAPEPE TLEPKAPEPEIPELPEDKIISKPEPLLDFEFTEDDLFTKEDI >gi|157101626|gb|DS480698.1| GENE 21 29419 - 30543 502 374 aa, chain + ## HITS:1 COG:no KEGG:MPTP_0049 NR:ns ## KEGG: MPTP_0049 # Name: not_defined # Def: hypothetical protein # Organism: M.plutonius # Pathway: not_defined # 212 371 82 246 250 134 37.0 8e-30 MAIVVPFLLICIVLGGLVGGNTEVVPADEETAMKYQMCASQLGVDWSWVMLIDMYMADQE HTDITSQNMVYTALNCLKVTIEVYTEEEDEDGETHWEYDHTDYAYGADAIMEYFGLPKEC RDVQLVVRTIQGKNSSQFHISTAPYDDLKDVLDTYYGCFDDEAKTEMLALNEQKYLVELY EDIIGEWSGTGSGDGLFGDYEIGDLVYPETGMEIPLYYQYQQPWGDVKFGGNTIRTSGCS VTSIAMVFSYLRDSTITPPDIVAWTGNRYYVGNAGQSWDIFPATAQHWGIKCYGLGTSLQ GMIAELAAGHPVIASMRPGTFTRAGHFIVLRGITEDGRILVNDPNDNATKRFFYTSFAPA LIQRESKQYWSFSN >gi|157101626|gb|DS480698.1| GENE 22 30555 - 31088 60 177 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|160940849|ref|ZP_02088190.1| ## NR: gi|160940849|ref|ZP_02088190.1| hypothetical protein CLOBOL_05742 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05742 [Clostridium bolteae ATCC BAA-613] # 1 177 3 179 179 328 100.0 1e-88 MKYGNIYKVLIMAVLAAIILSLSMRLFKNPEESKEPIVQPETAAELTEEDITAETERNQT TMATYPDEVEAEGGNTQPEIYSPWEEEDYTACYFTNTENTIDVDSTLPVGAQGRLTDDAQ RYLDSEGIKAIELRCIDGTVITEGPETSFQVQCDDVGKTIIVMTYDRNLHTWTFKRE >gi|157101626|gb|DS480698.1| GENE 23 32717 - 33475 32 252 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940850|ref|ZP_02088191.1| ## NR: gi|160940850|ref|ZP_02088191.1| hypothetical protein CLOBOL_05743 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05743 [Clostridium bolteae ATCC BAA-613] # 1 252 545 796 796 481 100.0 1e-134 MEENIRRKKNESSEPMSRPDSLNGSNPEMREEAGASKASVNRPVSYQPESREQPGEKRDV VVEKHTKVQERPVSYHRVQEGGERTEPTGESNQRPSEQTERTSTRPISHQVISKDVQKDL PAADENKSLKDVPERPISHSSISQVKEKGEAIEGAEHETVVQQRPISQQTTTTEKRKHLE GEHTRKQDKVVHSRHPDKLQGKLGNMGKNKVTADAMESTDNPSSRPMSQPVRTEEGKIED FKAGEMEPGHNT >gi|157101626|gb|DS480698.1| GENE 24 34061 - 34831 99 256 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940851|ref|ZP_02088192.1| ## NR: gi|160940851|ref|ZP_02088192.1| hypothetical protein CLOBOL_05744 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05744 [Clostridium bolteae ATCC BAA-613] # 1 256 23 278 278 477 99.0 1e-133 MYVARYQDDYGCIRGMYYRDVCENADMCKQTFYDTLRSLQAQGIITYSRVNQDYDITILD NDFSYPGAYHEGYINVSRQVFHTRRFHELKAKEKLLLLHFMKITHSASGSYQIGTGKLYT KYMQLLGVTKRVLRGYLHSLKKFFAIGIKDGKYFISYLRTVFNDRVEISETDQYMRYLVG VSCRRAKIKNCAPAAVKDVVTIMKQYRKEAQESIGRSIFEIVDDCICQARELNSKYIHKL VRRTLGLIWSGQEMEF >gi|157101626|gb|DS480698.1| GENE 25 36066 - 36383 95 105 aa, chain - ## HITS:1 COG:no KEGG:Calow_0808 NR:ns ## KEGG: Calow_0808 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.owensensis # Pathway: not_defined # 12 94 14 89 230 72 43.0 8e-12 MERGTVMDAYDLGLRLKGLREKYKLSQTQVATRLNLSRSAIANYESNTSFPSTDVVTKFA 
LLYHTSTDYILGLENRTTITLDGLTPSQEDDLLKVIDIFLKQFKA >gi|157101626|gb|DS480698.1| GENE 26 37258 - 37533 200 91 aa, chain - ## HITS:1 COG:SP0275 KEGG:ns NR:ns ## COG: SP0275 COG3077 # Protein_GI_number: 15900209 # Func_class: L Replication, recombination and repair # Function: DNA-damage-inducible protein J # Organism: Streptococcus pneumoniae TIGR4 # 7 91 5 87 87 80 52.0 1e-15 MSEKNTSMNIRMNKDVKLQAQKVFSDLGIDLTTAVNVFLRQSIRYQGFPFDVTLRQEPNA TTMAAIENADKGIDMHGPFESVEALMEDLNA >gi|157101626|gb|DS480698.1| GENE 27 37964 - 38221 116 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940858|ref|ZP_02088199.1| ## NR: gi|160940858|ref|ZP_02088199.1| hypothetical protein CLOBOL_05751 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05751 [Clostridium bolteae ATCC BAA-613] # 1 85 1 85 85 149 100.0 7e-35 MIIRYYNVLFILADHQEGDKFKLEWLKERIENGDRMNMRELYQWCQAHEINYVTKFRYRM DYPVKANIWNLYSYLRAKVDYELFR >gi|157101626|gb|DS480698.1| GENE 28 38491 - 39234 344 247 aa, chain + ## HITS:1 COG:Rv1708 KEGG:ns NR:ns ## COG: Rv1708 COG1192 # Protein_GI_number: 15608846 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Mycobacterium tuberculosis H37Rv # 2 202 66 267 318 166 40.0 6e-41 MIIAISNQKGGVGKTTTTHNLGVELAANNKRVLEVDADGQSSLTISFGKEPFDFEHSICD ILKRDPIGIEECIYNIKDNLDIIPSNLFLASMELELTGRTAREQVLARALKKVEANYDYI LIDCPPQLSILTLNALAAADKVLIPCQPTYLSYRGLEQLENTINDIRELVNPELEIMGVI ATLYKVRVKDQNEILGLLQEKYNVIGIIRETSEAVKGIYDGLAVVERNPKLPISQEYKKI AEYIMSM >gi|157101626|gb|DS480698.1| GENE 29 39393 - 39518 210 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDEKKIEKLYELMEKAKKEHDIEAVAALRWAIFKLENYPEI >gi|157101626|gb|DS480698.1| GENE 30 39921 - 40265 169 114 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940863|ref|ZP_02088204.1| ## NR: gi|160940863|ref|ZP_02088204.1| hypothetical protein CLOBOL_05756 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05756 [Clostridium bolteae ATCC BAA-613] # 1 114 1 114 114 206 100.0 4e-52 
MEKSNIKKEWDNETLLEYYAVLFFNLQEGKDVNDEWMNVKKEILQRMETNIYSTAMSALS HSVSLMEAQKDKDMIRASFHNILGKVDILHICGIISDEEKQFWEDRVYKAYGFE >gi|157101626|gb|DS480698.1| GENE 31 40322 - 40579 113 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940864|ref|ZP_02088205.1| ## NR: gi|160940864|ref|ZP_02088205.1| hypothetical protein CLOBOL_05757 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05757 [Clostridium bolteae ATCC BAA-613] # 1 85 1 85 85 115 100.0 9e-25 MNLNEREQKRANSPVSAMIGVNKPEKAEKEKKQKKGEKEGLIQRGYYLTPGVAAQVKINA AMTGRRDYQVVQEALELYFKNNPTK >gi|157101626|gb|DS480698.1| GENE 32 40691 - 41488 400 265 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940865|ref|ZP_02088206.1| ## NR: gi|160940865|ref|ZP_02088206.1| hypothetical protein CLOBOL_05758 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05758 [Clostridium bolteae ATCC BAA-613] # 1 265 1 265 265 525 100.0 1e-147 MGRAIEEMKYMQTNTYCLHISRDELEALYFKASCVNVTIEQLLKDIIADLVGGTFTHGSD ERMYADFWYTRWGAEIYPHNFPDAGKSIKARELKGAGEIHIPLQLTNADTMRLQDKANCV GCNESQLLEAFVMDLISGRFTHGSDERMYADQWFRRCYMFPDKTFIRYLIIRDKVDELLE DYEKLREYQNDYVQMQKSPDVYELDEVMATQKEIEVLSAYIKEIYESYLKSLDDELPEGS LEEVISNVQVWNDKRLKLLKGEYEI >gi|157101626|gb|DS480698.1| GENE 33 41853 - 42938 322 361 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940866|ref|ZP_02088207.1| ## NR: gi|160940866|ref|ZP_02088207.1| hypothetical protein CLOBOL_05759 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05759 [Clostridium bolteae ATCC BAA-613] # 1 349 126 474 486 637 99.0 0 MAAVSGLRVSELAALDPGDIIFQSDGRLKIHVRHGKGGKEGWVTCLKDDYLYEHLKHHIE TLQPNERLFYDESYMRKYAWEHGMEMHDFRRAFAVLQKRELLQDGCNAKDADKIVQEQLR HSRFSNTKRYLYGRKIIAKKKKPKEKTEIEDVEPITDVNLEDYSIQEFYDLASELDSRDL TDQEKRSLYNYMGSEYRDMNKVLNNKDFNVPDYILQDIDTITECLERKKIPREEIVYRGM DNLGVLFGDDAKKLSSEELNQKYSGTLFISDGFSSTSMDKQIATAYAGFEEGVLMRIKVP EKMRGMYLGAVNRYREMELLLQRSSIFKLDAIEKKGDITYVDASLIYQVKKKGKMYGKKH K >gi|157101626|gb|DS480698.1| GENE 34 42919 - 43068 131 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160940867|ref|ZP_02088208.1| ## NR: 
gi|160940867|ref|ZP_02088208.1| hypothetical protein CLOBOL_05760 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05760 [Clostridium bolteae ATCC BAA-613] # 1 49 1 49 49 63 100.0 4e-09 MEKNINSQELNHERFKLQPGDMRPISLDEFEKLVKKDESADGDDIDGDE >gi|157101626|gb|DS480698.1| GENE 35 43015 - 43155 69 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLQQSCIGSGLERFRSYLQIFFYTICIFSLFITIYIITICRLILLH >gi|157101626|gb|DS480698.1| GENE 36 43232 - 43381 89 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKSVNIVSVIIAGNIMNVLSAIVAEIMQILQNQIKQNVRTDSANESIHC Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:31:47 2011 Seq name: gi|157101625|gb|DS480699.1| Clostridium bolteae ATCC BAA-613 Scfld_02_40 genomic scaffold, whole genome shotgun sequence Length of sequence - 73670 bp Number of predicted genes - 75, with homology - 75 Number of transcription units - 31, operones - 19 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 247 - 306 9.1 1 1 Op 1 14/0.000 + CDS 358 - 972 722 ## COG1183 Phosphatidylserine synthase 2 1 Op 2 . + CDS 995 - 1933 751 ## COG0688 Phosphatidylserine decarboxylase + Term 1976 - 2020 8.2 - Term 1964 - 2006 11.6 3 2 Op 1 . - CDS 2045 - 3433 1458 ## COG0423 Glycyl-tRNA synthetase (class II) 4 2 Op 2 . - CDS 3490 - 4143 814 ## Closa_0907 hypothetical protein 5 2 Op 3 . - CDS 4238 - 4720 401 ## Closa_0905 DNA repair protein RecO 6 2 Op 4 . - CDS 4707 - 4889 98 ## Closa_0905 DNA repair protein RecO 7 3 Op 1 . - CDS 5007 - 5918 1158 ## COG1159 GTPase 8 3 Op 2 . - CDS 5940 - 8909 3130 ## COG1026 Predicted Zn-dependent peptidases, insulinase-like 9 4 Tu 1 . - CDS 9021 - 9848 957 ## COG1434 Uncharacterized conserved protein 10 5 Op 1 4/0.000 - CDS 9955 - 11190 1295 ## COG0826 Collagenase and related proteases 11 5 Op 2 . - CDS 11187 - 11846 693 ## COG4122 Predicted O-methyltransferase 12 5 Op 3 . - CDS 11843 - 12553 766 ## Closa_0897 aminodeoxychorismate lyase 13 5 Op 4 . 
- CDS 12647 - 14854 2160 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily 14 5 Op 5 . - CDS 14882 - 15142 422 ## Closa_0894 hypothetical protein 15 5 Op 6 6/0.000 - CDS 15237 - 15662 563 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) - Prom 15693 - 15752 1.7 - Term 15758 - 15785 1.5 16 5 Op 7 . - CDS 15883 - 16143 301 ## COG4472 Uncharacterized protein conserved in bacteria - Prom 16166 - 16225 10.2 17 6 Tu 1 . - CDS 16293 - 17654 870 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 - Prom 17786 - 17845 3.0 + Prom 17742 - 17801 4.7 18 7 Tu 1 . + CDS 17860 - 18093 298 ## Closa_0890 Phosphotransferase system, phosphocarrier protein HPr + Term 18159 - 18190 3.4 19 8 Op 1 7/0.000 - CDS 18266 - 19450 1673 ## COG0301 Thiamine biosynthesis ATP pyrophosphatase 20 8 Op 2 . - CDS 19518 - 20672 1279 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 21 8 Op 3 9/0.000 - CDS 20686 - 21435 783 ## COG1385 Uncharacterized protein conserved in bacteria 22 8 Op 4 . - CDS 21466 - 22422 966 ## PROTEIN SUPPORTED gi|238917093|ref|YP_002930610.1| ribosomal protein L11 methyltransferase 23 8 Op 5 . - CDS 22463 - 23218 936 ## COG0730 Predicted permeases - Prom 23256 - 23315 6.7 - Term 23601 - 23653 20.2 24 9 Tu 1 . - CDS 23656 - 24378 692 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 24460 - 24519 6.6 + Prom 24460 - 24519 4.4 25 10 Op 1 35/0.000 + CDS 24593 - 26317 202 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 26 10 Op 2 . + CDS 26317 - 28212 201 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 27 10 Op 3 . + CDS 28231 - 28689 558 ## EUBELI_20547 hypothetical protein + Term 28691 - 28752 14.2 - Term 28685 - 28732 6.3 28 11 Op 1 . - CDS 28790 - 29689 1121 ## COG1052 Lactate dehydrogenase and related dehydrogenases 29 11 Op 2 . 
- CDS 29702 - 30487 841 ## COG1712 Predicted dinucleotide-utilizing enzyme - Term 30493 - 30536 5.7 30 11 Op 3 . - CDS 30559 - 31749 1461 ## Cphy_2845 hypothetical protein - Prom 31783 - 31842 5.1 31 12 Op 1 . - CDS 31969 - 32331 259 ## COG3070 Regulator of competence-specific genes 32 12 Op 2 . - CDS 32363 - 32653 224 ## ELI_0344 hypothetical protein - Prom 32728 - 32787 4.2 33 13 Op 1 . - CDS 32790 - 33695 857 ## COG2378 Predicted transcriptional regulator 34 13 Op 2 . - CDS 33743 - 34159 450 ## Closa_3192 hypothetical protein - Prom 34247 - 34306 2.8 - Term 34252 - 34291 -0.3 35 14 Op 1 . - CDS 34373 - 35236 878 ## EUBELI_01767 type II restriction enzyme 36 14 Op 2 . - CDS 35230 - 36027 515 ## COG0863 DNA modification methylase 37 14 Op 3 . - CDS 36038 - 36865 565 ## COG0338 Site-specific DNA methylase - Prom 36900 - 36959 4.8 + Prom 36853 - 36912 5.6 38 15 Tu 1 . + CDS 37071 - 37892 881 ## COG3544 Uncharacterized protein conserved in bacteria + Term 37995 - 38045 0.9 + Prom 37903 - 37962 9.3 39 16 Tu 1 . + CDS 38121 - 39500 1129 ## COG0534 Na+-driven multidrug efflux pump + Term 39581 - 39614 -0.7 40 17 Op 1 . - CDS 39470 - 41020 1658 ## COG4262 Predicted spermidine synthase with an N-terminal membrane domain 41 17 Op 2 . - CDS 41020 - 41442 534 ## Vpar_0586 protein of unknown function DUF350 42 17 Op 3 . - CDS 41520 - 42629 1063 ## Vpar_0386 hypothetical protein 43 17 Op 4 . - CDS 42622 - 43008 342 ## COG1586 S-adenosylmethionine decarboxylase - Prom 43078 - 43137 6.8 - Term 43143 - 43198 1.0 44 18 Op 1 . - CDS 43224 - 44219 978 ## COG1533 DNA repair photolyase - Prom 44280 - 44339 3.2 - Term 44315 - 44356 -0.8 45 18 Op 2 . - CDS 44417 - 46882 2123 ## COG0514 Superfamily II DNA helicase - Prom 47053 - 47112 9.1 - Term 47227 - 47262 6.5 46 19 Op 1 . 
- CDS 47373 - 48041 654 ## COG0546 Predicted phosphatases 47 19 Op 2 13/0.000 - CDS 48054 - 48872 847 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID 48 19 Op 3 13/0.000 - CDS 48873 - 49619 878 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 49 19 Op 4 9/0.000 - CDS 49660 - 50130 578 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 50 19 Op 5 2/0.000 - CDS 50127 - 50555 687 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA 51 19 Op 6 . - CDS 50579 - 51592 993 ## COG2222 Predicted phosphosugar isomerases 52 19 Op 7 . - CDS 51573 - 51707 96 ## gi|160940931|ref|ZP_02088271.1| hypothetical protein CLOBOL_05823 - Prom 51731 - 51790 5.0 + Prom 51719 - 51778 7.3 53 20 Op 1 . + CDS 51894 - 52715 716 ## COG2188 Transcriptional regulators 54 20 Op 2 . + CDS 52790 - 53470 607 ## Closa_4277 transcriptional regulator, GntR family 55 20 Op 3 . + CDS 53551 - 54120 248 ## Closa_3061 GCN5-related N-acetyltransferase 56 20 Op 4 . + CDS 54148 - 54984 890 ## COG0726 Predicted xylanase/chitin deacetylase - Term 55035 - 55087 10.4 57 21 Op 1 . - CDS 55131 - 55487 435 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family 58 21 Op 2 . - CDS 55515 - 56474 676 ## gi|160940938|ref|ZP_02088278.1| hypothetical protein CLOBOL_05830 - Prom 56499 - 56558 5.3 + Prom 56463 - 56522 5.3 59 22 Tu 1 . + CDS 56611 - 57246 586 ## COG2755 Lysophospholipase L1 and related esterases + Term 57259 - 57305 -0.9 - Term 57246 - 57292 6.2 60 23 Op 1 . - CDS 57332 - 58201 449 ## GC56T3_2842 hypothetical protein 61 23 Op 2 11/0.000 - CDS 58217 - 59497 717 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 62 23 Op 3 11/0.000 - CDS 59498 - 59995 188 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 63 23 Op 4 . 
- CDS 60034 - 61113 426 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 - Prom 61239 - 61298 5.3 - Term 61333 - 61383 8.4 64 24 Tu 1 . - CDS 61468 - 62325 613 ## COG1737 Transcriptional regulators - Prom 62515 - 62574 10.8 + Prom 62423 - 62482 5.5 65 25 Op 1 . + CDS 62555 - 63478 862 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 66 25 Op 2 . + CDS 63498 - 64193 644 ## COG0684 Demethylmenaquinone methyltransferase + Term 64241 - 64292 11.6 - Term 64233 - 64275 11.7 67 26 Op 1 . - CDS 64287 - 65213 744 ## COG1893 Ketopantoate reductase 68 26 Op 2 . - CDS 65232 - 65771 577 ## COG0778 Nitroreductase 69 26 Op 3 . - CDS 65816 - 66502 508 ## Closa_1597 hypothetical protein - Prom 66546 - 66605 6.5 70 27 Tu 1 . - CDS 66631 - 67473 597 ## COG0789 Predicted transcriptional regulators - Prom 67539 - 67598 5.7 - Term 67563 - 67616 8.2 71 28 Tu 1 . - CDS 67635 - 67976 372 ## EUBREC_1585 hypothetical protein - Prom 68106 - 68165 8.8 - Term 68017 - 68059 3.6 72 29 Op 1 . - CDS 68183 - 70384 2236 ## COG0550 Topoisomerase IA 73 29 Op 2 . - CDS 70434 - 71168 586 ## PROTEIN SUPPORTED gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase - Prom 71250 - 71309 5.3 + Prom 71122 - 71181 7.8 74 30 Tu 1 . + CDS 71297 - 72313 931 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase) - Term 72314 - 72363 4.2 75 31 Tu 1 . 
- CDS 72382 - 73668 797 ## EUBREC_2277 hypothetical protein Predicted protein(s) >gi|157101625|gb|DS480699.1| GENE 1 358 - 972 722 204 aa, chain + ## HITS:1 COG:CAC0798 KEGG:ns NR:ns ## COG: CAC0798 COG1183 # Protein_GI_number: 15894085 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Clostridium acetobutylicum # 2 201 3 203 205 137 42.0 2e-32 MIGFYSYTVVLTYLGLASAAMGMILTFQGFAKYALFCLAFSGLCDMFDGKVARLKKDRTE DEKRFGIQIDSLCDVVCFGAFPMILCYSIGMRGPAGISILVFYLIAGVIRLAFFNVMEEK RQDETDEARKYYQGLPITSIAIILPLFCTLRPLLGHRFLSELHICILTVGLLFIINFPLR KPGWKMLTLLVAIVSCALIKIYFF >gi|157101625|gb|DS480699.1| GENE 2 995 - 1933 751 312 aa, chain + ## HITS:1 COG:CAC0799 KEGG:ns NR:ns ## COG: CAC0799 COG0688 # Protein_GI_number: 15894086 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Clostridium acetobutylicum # 21 291 24 288 291 219 43.0 8e-57 MQYADRTGRSLGGSTLQDRFLENLYASFLGRMLIKPLVHPAFSKICGVFLDSPLSAPIIP GFMKSAGICAEDCETPAGGRYRSFNAFFTRRMLPEARPFDARDHILCSPCDGFASVYPIH NNMRITIKHTQYTLEQLLRDSGLAARYAGGTALLLRLTVSDYHRYAYVDRGRRSSYRRIP GVLHTVNPAAASRRPVYKENSREYSLLRTGSFGTVLMMEIGALMVGKIVNHHKAYTSIDV FRGQEKGYFAFGGSSILLLFQPGTVAIDRDIMRNTALDVETRVRMGEAIGQAMTANEMNP NEMNPNEMNSIK >gi|157101625|gb|DS480699.1| GENE 3 2045 - 3433 1458 462 aa, chain - ## HITS:1 COG:CAC3195 KEGG:ns NR:ns ## COG: CAC3195 COG0423 # Protein_GI_number: 15896443 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 2 449 4 450 462 703 72.0 0 MEKTMEKIVALAKNRGFVYPGSEIYGGLANTWDYGNLGVELKNNVKRAWWQKFIMESPYN VGVDCAILMNSQTWIASGHLGGFSDPLMDCKQCKERFRADKLIEDYNDAHGIQMEGSVDG WSQEQMKQYIDDNNICCPSCGAHDFTDIRQFNLMFKTFQGVTEDAKSTVYLRPETAQGIF VNFKNVQRTSRKKVPFGIGQIGKSFRNEITPGNFTFRTREFEQMELEFFCKPDTDLEWFA YWKEFCINWLRSLGIKDDELRARDHSPEELSFYSKATTDLEYLFPFGWGELWGIADRTDY DLSQHQKVSGQDMSYFDDETKERYIPYVIEPSLGADRVTLAFLCAAYDEEEIGEGDVRTV 
LHFHPAIAPVKVGVLPLSKKLNEGAEKIYTELSKYFNCEFDDRGNIGKRYRRQDEIGTPF CITYDFDSEEDHAVTVRDRDTMEQVRVPIADLKDYFAEKFMF >gi|157101625|gb|DS480699.1| GENE 4 3490 - 4143 814 217 aa, chain - ## HITS:1 COG:no KEGG:Closa_0907 NR:ns ## KEGG: Closa_0907 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 217 1 209 209 65 30.0 1e-09 MKKIVKGILLLAAAAAVVTGCSRIGMGGAEAPTENSVYVAGDGSVRWASVETYQQGSYTE DELKNSAGQKIIDFNSGLGKAASYENAEGSEKLPVAIVSASMGNGTATLVTEYDAPGRLI EFAQEIGDYNVPFTQLDTGRVAAMSGELAEVSFQDEKGKAVDQETALKDGQSMVVKAAGQ GVIVTEKKIQYVSEGCVLKDSNRVQTSGEGTSYIILK >gi|157101625|gb|DS480699.1| GENE 5 4238 - 4720 401 160 aa, chain - ## HITS:1 COG:no KEGG:Closa_0905 NR:ns ## KEGG: Closa_0905 # Name: not_defined # Def: DNA repair protein RecO # Organism: C.saccharolyticum # Pathway: Homologous recombination [PATH:csh03440] # 4 160 60 216 221 214 65.0 1e-54 MAIFNLYEGRNAYNLQSAQITNYFDGLSTDMEAACYGSYFLETAAYFAQENLDGTELLKL LYQSLRALLRPALPNRLVRRVFELKSMVINGEYTQDPPVPVSDSCHYAWEYVICSPSESL YQFALKEEVLKEFESVVEKSRRYFVRHEFRSLEILETLTA >gi|157101625|gb|DS480699.1| GENE 6 4707 - 4889 98 60 aa, chain - ## HITS:1 COG:no KEGG:Closa_0905 NR:ns ## KEGG: Closa_0905 # Name: not_defined # Def: DNA repair protein RecO # Organism: C.saccharolyticum # Pathway: Homologous recombination [PATH:csh03440] # 1 60 1 60 221 91 66.0 1e-17 MRETVTLTGMVLLSAPSGDFDRRLVLLTRERGKITAFSHGARKPGNPLMAASRPFCFGNF >gi|157101625|gb|DS480699.1| GENE 7 5007 - 5918 1158 303 aa, chain - ## HITS:1 COG:BH1367 KEGG:ns NR:ns ## COG: BH1367 COG1159 # Protein_GI_number: 15613930 # Func_class: R General function prediction only # Function: GTPase # Organism: Bacillus halodurans # 8 302 9 303 304 317 56.0 2e-86 MEENIQKKSGFVTLIGRPNVGKSTLMNHLIGQKIAITSDKPQTTRNRIQTVYTDDRGQII FLDTPGIHKAKNKLGQYMVNVAEHTLKEVDVILWLVEPATFIGAGERHIAEQLKNVKTPI ILVINKIDTVKNQDEILTFIAAYKDVCDFAEIVPLSALKDKNTDLLTELIFKYLPYGPQF YDEDTVTDQPMRQIAAELIREKALRLLDDEIPHGIAVTIEKMKERKGGLIDIEASIVCER ESHKGIIIGKGGSMLKRIGIEARKEIESMMDTQVNLQLWVKVRKEWRDSELYMKNYGYNQ KEI 
>gi|157101625|gb|DS480699.1| GENE 8 5940 - 8909 3130 989 aa, chain - ## HITS:1 COG:CAC3006 KEGG:ns NR:ns ## COG: CAC3006 COG1026 # Protein_GI_number: 15896258 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases, insulinase-like # Organism: Clostridium acetobutylicum # 18 988 16 975 976 832 44.0 0 MADGRNLEKVKELAAYRVSEEMYVEEMDSRAMVLEHIKSGARIFLMSNEDENKVFYIGFR TPPDDSTGLPHILEHSVLEGSDKFPVKDPFVELVKGSLNTFLNAMTYPDKTVYPVASCND KDFQNLMDVYLDGVLHPAIYREPKIFLQEGWHYELESPEDELAINGVVYNEMKGAFSSPE SVLDRFTRNVLFPDTTYSNESGGDPAVIPELTYEKFIAFHRNYYHPANSYIYLYGDMDMA QKLTWLDQEYLGQYDREDCHADSAIAVQQPFEQPVQREITYSVTEEEGTKDRTYLSINTV VGTDLDPVLYVAFQILEYTLINAPGAPLKQALIDAGIGQDILGGYDCGILQPYFSVIAKN ANPEQKGEFLAVVKGTLRRLADQGIDRKSLLAGLNLYEFRYREADYGSAPKGLMYGLWSL DSWLYDGKPTLHLEYQKTFDYLKKAVEEGYFEQLIHRYLLDNPHEAVITVRPRVNQTAEE DRNLAERLKTYKESLGREELEALTARTRQLKEYQEEPSQQEDLEKIPMLQREDIEREGGR FSYEVKMEDGVNVIHSNLFTSGIGYLKVLFDTSRVPVEDLPYVGLLKAVLGYVDTEHYSY GDLTSEIYLNSGGVSFAVSSYPDAAHPGQFTGAFVASAKVLYHKLDFAFSILAEILTRSR LDDEKRLGEILDETRSRARMKMEDASHGAAVGRASSYFSASAAFNDMTGGVGYYQFLEDV SRRFAEDASGRGQLIARLKDVCARLFTSDNLLVAYTADTEGYSRLPAELKTFRSVLGKGD GRTYEFVFRPDNRNEGFKTASQVNYVARCGSFAGKEAGGRKLEYTGALRVLKVIMNYEYL WMNLRVKGGAYGCMSSFSRTGDGCLVSYRDPNLEATNQVYEGIPDYLRSFSIDERDMTKY VIGTMSDVDTPLTPSLRGARNLSAYLSGVTDEMVQKEREQILDVTQEDIKALADIVQAVL DTRALCVIGNDQQIRAQESMFGEVKNLYH >gi|157101625|gb|DS480699.1| GENE 9 9021 - 9848 957 275 aa, chain - ## HITS:1 COG:CAC0441 KEGG:ns NR:ns ## COG: CAC0441 COG1434 # Protein_GI_number: 15893732 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 5 270 12 258 259 103 29.0 5e-22 MLDGVLIGLAVLCVVYFIIIVVYAGISTSFAFIWLFFAALLVFLVYGKWYYARNMERIPR WVPVSVVTTCIAGVAVLGILCILVFLGAATPGKANLDYVIVLGARVKEHTVSNSLKKRLD RAIVYAEENPYTILVLSGGKGPGEADSEAQVMYDYLVYNGVSPRQLLMESDSTSTVENIA YSKIVIEQDRMKDKKEIIPMPGKTGSVPYAIAPDKPLEIGVLTSNFHIYRARLTAEKWGF DNVYGISAESDPVLFIHLCVRECASILKDRLMGNM >gi|157101625|gb|DS480699.1| GENE 10 9955 - 11190 1295 411 aa, 
chain - ## HITS:1 COG:CAC1687 KEGG:ns NR:ns ## COG: CAC1687 COG0826 # Protein_GI_number: 15894964 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Clostridium acetobutylicum # 1 392 1 390 406 451 53.0 1e-126 MRKTELLIPAGSLDVLKTAVVYGADAVYIGGEAFGLRAKAHNFSNEEMKQGIAFAHARGV KVYVTANILAHNGDLPGVEAYFEEMRDVAPDALIISDPGVFAIARRVLPDMEIHISTQAN NTNYGTYLFWHGLGARRVVSARELSLAEIREIRDHIPQDMQIESFIHGAMCISYSGRCLL SNYFVGRDANQGACTHPCRWKYSLVEETRPGEYMPVYENERGTYIFNSKDLCMVEYIPEM MDAGIDSFKIEGRMKTALYVATVTRAYRRAIDDYLKDPSVYRKNLAWYREEIGKCTNRRF TTGFYFGKPTSDDHIYDSSTYVKSYTYLGTVEETDSRGRCRIEQKNKFSAGETIEVMKPD GQNLEVTVESITDHEGNEQESAPHPKQILWVKLSGDVSVFDILRRKEEAVD >gi|157101625|gb|DS480699.1| GENE 11 11187 - 11846 693 219 aa, chain - ## HITS:1 COG:CAC1686 KEGG:ns NR:ns ## COG: CAC1686 COG4122 # Protein_GI_number: 15894963 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Clostridium acetobutylicum # 2 213 4 211 216 147 39.0 2e-35 MIVNDRITDYIRSLESSQGSLLDSIEQEAERDRVPIIRRETAALLRTLVAALRPARILEI GTAVGYSSLLMCRVMPENCRITTIEKYEKRIPAAREHFKIAGEEGRITLLEGDADDLLAG LKGQTFDLVFMDAAKGQYLHWLPMILALMPEGGVLISDNVLQDGDIVESRFAVERRNRTI HSRMRRYLYELKHRKGLETAIIPIGDGVAVSTKYTERVE >gi|157101625|gb|DS480699.1| GENE 12 11843 - 12553 766 236 aa, chain - ## HITS:1 COG:no KEGG:Closa_0897 NR:ns ## KEGG: Closa_0897 # Name: not_defined # Def: aminodeoxychorismate lyase # Organism: C.saccharolyticum # Pathway: not_defined # 1 132 1 131 131 109 51.0 1e-22 MSNRTREINKITTTIIGISVKLMVYALIILLLYEAVARGYAFGHEIFFAEAVDEAPGQDM VVQIDSKESVSDAAQFLAHKGLIKSEFAFIFQSKFYDYDTIYPGTYTLNTSMTSKEILQL LNEKPETEDSAKPAQSGGAKAAEAKTSQGRTSEKENGNSGASDQETSPAAAAASQDGAAA AAETDGEIPADAAPGSRSGEDALDSQEAQANEQTYEGEDQEMEGGWIEDAAEDGTQ >gi|157101625|gb|DS480699.1| GENE 13 12647 - 14854 2160 735 aa, chain - ## HITS:1 COG:CAC1683 KEGG:ns NR:ns ## COG: CAC1683 COG0595 # Protein_GI_number: 15894960 # Func_class: R General function prediction only # Function: 
Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Clostridium acetobutylicum # 187 735 5 555 555 654 58.0 0 MNLEEKENTAEVAPEAAAQATNQAVQENNAQDAGVQETAGAAAPQEQPVKKSQNRRPAQK KTASKEQGGNGEAQKKKNTGSRQGQAKSRSGQKGENQTGTAARESKDSAGARSSKGTGSA KASRNSGESKASRGSGESRNSGESRNSGESKASRNSGESKNSRNSGESKALPAPARQKAR DGKKDGKGKLKIIPLGGLEQIGMNITAFEYEDSIIVVDCGLAFPGDDMLGIDLVIPDVTY LKQNIDKVKGFVITHGHEDHIGALPYILQQVNVPVYGTKLTIALIEHKLEEHRLLKNTKR KVMKHGQSVNLGCFRVEFVKTNHSIQDASALAIFTPVGTVLHTGDFKIDYTPVFGDPIDL QRFAELGKKGVLALMADSTNAIRPGFTMSERTVGKTFDAIFAEHQNRRIIVATFASNVDR VQQVVNTAYKYGRKVVVEGRSMVTIMDIASKLGYINVPEGTLIDIEHLRNYPPESTVLIT TGSQGESMAALSRMAASIHKKVSIVPGDVVVLSSTPIPGNEKAVANVVNELSMKGAEVIC QDTHVSGHACQEDLKLIYSLVHPKFSIPIHGEYRHRMAQRELAQFMGVQKENAVMVNSGD VVALDEDSCEVIDHVTCGGIFVDGLGVGDVGNIVLRDRQNLAQNGIIVVVLTLEKHSNQL LAGPDIVSRGFVYVRESEDLLEEAHTIVYDAVQDCLDRHVSDWGKIKNIIKDSLSDFLWK RMKRNPMILPIIMEV >gi|157101625|gb|DS480699.1| GENE 14 14882 - 15142 422 86 aa, chain - ## HITS:1 COG:no KEGG:Closa_0894 NR:ns ## KEGG: Closa_0894 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 85 8 92 93 90 69.0 2e-17 MITMVTDSGESVDFYVLEETRINARSYLLVTDAPEGEDGECYILKDMSGQQDAEAVYEFV EDDSELDALMKVFEELLSDADVDLEK >gi|157101625|gb|DS480699.1| GENE 15 15237 - 15662 563 141 aa, chain - ## HITS:1 COG:lin1537 KEGG:ns NR:ns ## COG: lin1537 COG0816 # Protein_GI_number: 16800605 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. 
subtilis) # Organism: Listeria innocua # 1 137 1 136 138 149 58.0 2e-36 MRIMGLDFGSKTVGVAVCDPLGITAQTVETITRASENKMRQTLARIEQLIGEYEIERIVL GYPKNMNNTVGERGEKTQEFKAALERRTGLEVILWDERLTTVAAERVLIESGVRRENRKK SVDQIAAAMILQGYLDSLQYS >gi|157101625|gb|DS480699.1| GENE 16 15883 - 16143 301 86 aa, chain - ## HITS:1 COG:lin1538 KEGG:ns NR:ns ## COG: lin1538 COG4472 # Protein_GI_number: 16800606 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 5 81 4 82 90 86 58.0 1e-17 MGNIKDTQYFQAVQDKKIEVSDVLEQVYIALTEKGYNPVNQIVGYIMSGDPTYITSHKGA RSLIMKVERDEILEELMAVYIDARLK >gi|157101625|gb|DS480699.1| GENE 17 16293 - 17654 870 453 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 168] # 1 416 1 405 451 339 44 2e-92 MPKAALHNLGCKVNAYETEAMQQQLEERGYEIVPFDQKADVYIINTCSVTNIADRKSRQM LHRAKKLNPEAVVVAAGCYVQVASDALKEDDSVDIIVGNNNKARLADILEEYMKDRQGDE GGYVLDIARAREYEELHVSRLGEHTRAFIKVQDGCNQFCSYCIIPYARGRVRSRKPEDVE AEVKGLVARGYREVVLTGIHLSSYGTEHMEGSPVKGGDWDSGPLWDLIERIHRVEGLERI RLGSLEPRIITREFAEKLAGLPEFCPHFHLSLQSGCDATLKRMNRHYTTEDYLRRCGILR EIFDHPAITTDVIAGFPGETEEEFEETRRFLETVRFYEMHVFKYSKRQGTRAAVMEDQVS EQVKARRSDVLLELEKTMSREYRERFAGSRVSVLFEEAAEIGGKWYMMGHTPQYVRAALP LEDGMNREEFGGRIMELFASGLLNDEMLKVEFS >gi|157101625|gb|DS480699.1| GENE 18 17860 - 18093 298 77 aa, chain + ## HITS:1 COG:no KEGG:Closa_0890 NR:ns ## KEGG: Closa_0890 # Name: not_defined # Def: Phosphotransferase system, phosphocarrier protein HPr # Organism: C.saccharolyticum # Pathway: not_defined # 1 76 1 76 77 108 86.0 5e-23 MKTVRISLNSIDKVKSFVNDLSKFDVDFDLVSGRYVIDAKSIMGIFSLDLSKPIDLNIHA ESCVDDILAILSPYIIA >gi|157101625|gb|DS480699.1| GENE 19 18266 - 19450 1673 394 aa, chain - ## HITS:1 COG:CAC2971 KEGG:ns NR:ns ## COG: CAC2971 COG0301 # Protein_GI_number: 15896224 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis ATP pyrophosphatase # Organism: Clostridium acetobutylicum # 6 390 5 379 
384 351 48.0 1e-96 MQYQSFLIKYAEIGTKGKNRYMFEDALIKQIRYALKSVEGQFDVTKESGRIYVKAETDYD YDDAVEALKRVFGIADICPMVQIEDKDYENLKQHVVEYMDQVYPDKNITFKVNARRGDKQ YPVTSEQINRDMGEVILEAFPQMRVDVHHPDVILHVEVRQRINLFSLMIPGPGGMPVGTN GRAMLLLSGGIDSPVAGYMIAKRGVKIDAVYFHAPPYTSERAKQKVVDLANLVARYAGPI NLHVVNFTDIQLYIYDKCPHEELTIIMRRYMMRIAQTIAERTGSIGLITGESIGQVASQT LQSLAATNEVCTMPVFRPVIGFDKQEIVDVSEKIGTYETSIQPYEDCCTIFVAKHPVTKP NINVIHSSERRLEEKIDQLVETALETTEQILCQG >gi|157101625|gb|DS480699.1| GENE 20 19518 - 20672 1279 384 aa, chain - ## HITS:1 COG:CAC2972 KEGG:ns NR:ns ## COG: CAC2972 COG1104 # Protein_GI_number: 15896225 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Clostridium acetobutylicum # 1 380 1 376 379 334 47.0 2e-91 MEAYFDNSATTRVFDSVRDIVVKTMTEDYGNPSAKHRKGMEAEQYVRQAAADIAKTLKVR DKEILFTSGGTESNNMALIGTAMANQRAGKHIISTRIEHASVYNPLAYLEQQGFEVTYLS VDSQGHISLEELEAAIRPDTILVSIMYVNNEVGAIEPVEQIVSLIHGKNPAILFHVDAIQ AYGKLTIRPKKQGIDLLSVSAHKIHGPKGVGFLYIDEKVKIRPLLYGGGQQKDMRSGTEN VPGIAGMGRAAREIYTDHQQKVEYITGLKDYMTDRMASLPGVTVNSRKGMESAPQIVSAS FQGVRSEVLLHALEDKGIYVSSGSACSSNHPAISGTLKAIGVRQELLDSTLRFSFGVFNT REEIDYCMEVLEELLPVLRRYHRG >gi|157101625|gb|DS480699.1| GENE 21 20686 - 21435 783 249 aa, chain - ## HITS:1 COG:CAC1285 KEGG:ns NR:ns ## COG: CAC1285 COG1385 # Protein_GI_number: 15894567 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 245 1 243 250 175 41.0 8e-44 MHHFFVNPEQVEDGLIRITGSDVNHIKNVLRIRQGEEMLVSDGTGRDYLCQAEEIAGQEV TVRILETEEEGRELPSRIWLFQGLPKSDKMEFIIQKAVELGAAGIVPVSTRNTVVKLDPK KEEAKVKRWQAIAESAAKQSKRSLVPRVSGIMTLKEAFDYVESQGFSVRLIPYEHEAGMD GTKTELDAAGPGQDIAVFIGPEGGFDEREIELALSKGVRPISLGRRILRTETAGLALLSV LMMRLEGAL >gi|157101625|gb|DS480699.1| GENE 22 21466 - 22422 966 318 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238917093|ref|YP_002930610.1| ribosomal protein L11 methyltransferase [Eubacterium eligens ATCC 27750] # 1 318 1 318 318 376 58 1e-103 
MKWKKFTLTTTTQAVDLISSMFDDIGIEGIEIEDNVPLTEKETKGMFIDILPELPPDEGV ARISFYLDDDADVADYLRRVEEGLDELSPFADLGARTITASETEDKDWINNWKQYFKPFT VDDILIKPTWETIPEEHKDKLLVQIDPGTAFGTGMHETTQLCIRQLKKYVNRDTLVLDVG TGSGILGITALKLGAEEVWGTDLDENAINAVRENLEANSIPEDRFHVLQGNIIDDVSVKE WAGYGKYDVAVANILADVIILLVDEIPAHLKKGGIFITSGIIDMKEEAVKEAFGRCPELE VVEITYQGEWVSVTARRK >gi|157101625|gb|DS480699.1| GENE 23 22463 - 23218 936 251 aa, chain - ## HITS:1 COG:FN1706 KEGG:ns NR:ns ## COG: FN1706 COG0730 # Protein_GI_number: 19705027 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 1 247 5 248 254 135 36.0 9e-32 MLQYLIVCPLVFLAGLVDSIAGGGGLIALPAYLIAGVPAHVALGTNKLSSAMGTTVSTAR LAKHGFLKGKVLMAVCSSAAALMGSALGAHLALMVPESVIKHMMIVVLPVVAFYVLRNKD MGARDRTEDISRNKVFAISMAAAFFVGGYDGFYGPGTGTFLILILTGAAGMDARSASAQT KVINLSSNIAALVTFIATGNVYYPLGLTAGVCSIAGHYIGAGMVAHDGQKVVRPVVLAVL GVLFIKIISNT >gi|157101625|gb|DS480699.1| GENE 24 23656 - 24378 692 240 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 114 239 468 620 621 64 33.0 2e-10 MRRMISVPLILAAAAALSFLPVQDCLAYGPGDSWYYDHYRRGGPGVEYDYGTAPSGTYDY GGPSAPQGNAQVTTDNTVRAYPGFNTNVPDTSTAQFDPGYWWDPNDDDWDDWDDDDWWYY NGGPGVQYGQGWRYSPGGWWFQLCGGSWLSGGWQLIHNRWYRFDQNGYMLTGWFTDSDGN RYYLNPLDDGTLGMMRTGWQMIDGQLYYFNTQSDGTMGRLFVNTTTPDGYVVGADGALVR >gi|157101625|gb|DS480699.1| GENE 25 24593 - 26317 202 574 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 331 545 279 498 563 82 28 6e-15 MKEFLSYLKPYKRDAVLAVFCIEAETVFELIIPLVMASIVDVGVATGDRHYILMKGLQMV LFALISLVLGQGSAMFSARCGQGLGAEIRKAEFAKLQQFSFANTDHFSSSSLVTRLTSDV TTIQNSVATGMRPAFRSPVMMLTAMAASFYINPQLALVFLVAAPVLGVLLFFIISHVRPL YSVMQGAIDMVNRIIQENLTAIRVVKSYVRGDYEIQKFEEVNYNLQFTAEKAFRLAVLNM PAMQLVMYSTILCILWFGGRLVTIGGVKVGELTGFLSYVMQVLNSLMMFSNVFLMTTRAL ASWKRISQVMDEEIDIREDASSTLEVKHGDIRFENVYFKYNQEAAEYVLSDISFHIKPGQ TVGIIGQTGAAKSTLVQLIPRLYDVTKGTVYIDGRPVREYSLKGLRDSIAMVLQKNTLFS GTVKDNLRWGRETATEQEIEEACRIACVDEFIDRLERGYETELGQGGVNVSGGQKQRLCI ARAILKSPRVLILDDSTSAVDTATEAKIRDGLAKSMPDTTKIIIAQRISSVAHADQIIIL EDGRVNAVGSHETLLASNQIYQDIYHSQQEGANL >gi|157101625|gb|DS480699.1| GENE 26 26317 - 28212 201 631 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 407 615 16 228 245 82 28 8e-15 MARIANYNRPKDTGRTLKQMFGYLGRHKWYMLVIALLVAVSALSSISGTYMLKPVINRFI LPGDIPGLVNMLLVMGGLYLCGALSCYAYNQMMVHISQKVVSEIRSDLFRHTQRLPLTYF DAHTHGELMSRFTNDVDTISEALNNSFAMMIQSFITITGTITMLLVLDWRLSLIVMVFLL LMVLFIRYNGKRSRKYFIQQQKYLGSINGFVEEMVAGQKVEKVFNHEAQDYEEFCRRNEA FRTAATRALTYSGMTVPTIVALSYVNYALSACVGGLFVLSGLTDLGTLASYLVYVRQSAM PMNQFTQQINFLLAALSGAERIFDMMNEIQELDEGTVTLCNVRRNDQGGWEECREFTHCF AWKVPHEGPADMAEPHTGSQSGFRLIPLKGDVRFDQVVFGYTPEKTILNGISLYAKPGQK IAFVGSTGAGKTTIINLVNRFYEIQQGTITYDGIDIRTIKKDDLRRSLSMVIQDTHLFTG TIADNIRYGRLDATDEDVVTAAKVANAHSFIRRLPQGYDTPLHSDGANLSQGQRQLLAIA RAAISRPPVLILDEATSSIDTRTEKLIEKGMDALMDGRTVFVIAHRLSTVRNSKAIMVLE KGEIIERGSHEELIDQRGRYYKLYTGQFELE >gi|157101625|gb|DS480699.1| GENE 27 28231 - 28689 558 152 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20547 NR:ns ## KEGG: EUBELI_20547 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 5 151 3 148 148 107 40.0 2e-22 MMKQEQVRRLLVRAERARKHLMQPHFTRIGLTFGQGHARILDVLLARDHITQKELSDLCH MDVTTMSRSLDRLEESGYLVREKDPGCRRSYLICLTGDGAAEARRVRKVLEMVDEVIWNG LDQDEMEAFCGTMEKICRNLEECDFKFDEQET >gi|157101625|gb|DS480699.1| GENE 28 28790 - 29689 1121 299 aa, chain - ## HITS:1 COG:PH0520 KEGG:ns 
NR:ns ## COG: PH0520 COG1052 # Protein_GI_number: 14590422 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Pyrococcus horikoshii # 16 242 14 252 333 80 31.0 5e-15 MFKKLVAIEPVNLVSQAEEELRGYAGEITLYEDVPSGDEEIIRRIGDADAVLVSYTSRIS RYVIESCPSIRYIGMCCSLYSEESANVDIACARERGITVLGIRDYGDRGVVEYVLCELVR FLHGFDRPMFRDMPVEITDLKVGIIGMGVSGGMIADAMKFMGADVSYYSRSRKQDREDQG LVYRPLKELLARSEVVFACLNKNVILLHEEEFTQLGQGKILFNTSIGPAFDLPALKTWLD QGNNYFICDTEGALGDSTGELIRHPGVFCAHISAGRTKQAFDLLSKKVLDNIRTYLNQQ >gi|157101625|gb|DS480699.1| GENE 29 29702 - 30487 841 261 aa, chain - ## HITS:1 COG:MJ0915 KEGG:ns NR:ns ## COG: MJ0915 COG1712 # Protein_GI_number: 15669105 # Func_class: R General function prediction only # Function: Predicted dinucleotide-utilizing enzyme # Organism: Methanococcus jannaschii # 4 259 2 263 267 110 28.0 2e-24 MGKLKLGILGNGYLADIIVEAGLKGMLDEYELVGVLGRTREKTELLAKKGGCRACSTIDE LLALAPDYVAEAASVQSVKDCGVKILASGASMIVLSIGAFADKEFYGQVKETAAVHGTRV YIASGAVGGFDVLRTISLMGQAEAGIRTKKGPASLKGTPLFEERLMEDTQESHVFHGNAK EAISLLPTKVNVAVASSLATVGPEDTRVDIYSVPGMVGDDHKITSEIEGVKAVVDIYSST SAIAGWSVVAVLQNIVSPIVF >gi|157101625|gb|DS480699.1| GENE 30 30559 - 31749 1461 396 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2845 NR:ns ## KEGG: Cphy_2845 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 396 1 396 396 648 79.0 0 MERKVGTISRGIRCPIIREGDNLAEIVVDSVLDAAESEGFEIRDKDVISVTESVVARAQG NYASIDAIAEDVRAKLGGETIGVIFPILSRNRFSICLKGIARGAKKVVLMLSYPSDEVGN SLVSLDQVDEAGINPYSDVLDLKRYRELFGENKHPFTGVDYVSYYCQLIEEQGAEAQVVF ANNPREILNHTRKVLTCDIHSRMRTKRILKEAGAEVVCGMDDILNAPVDGSGCNEKYGLL GSNKSTEDTIKLFPRDCMDLVNDIQKKVLDRTGKHVEVMVYGDGAFKDPVGKIWELADPV VSVANTEGLEGTPNEVKLKYLADNDFKDLSGEALKNAVSQRIKEKKSDLVGDMASQGTTP RRLTDLIGSLCDLTSGSGDKGTPVILVQGYFDNYTN >gi|157101625|gb|DS480699.1| GENE 31 31969 - 32331 259 120 aa, chain - ## HITS:1 COG:CAC2476 KEGG:ns NR:ns ## COG: CAC2476 
COG3070 # Protein_GI_number: 15895741 # Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Clostridium acetobutylicum # 1 98 1 98 114 99 49.0 1e-21 MASSLEYVQYVAAQLSGAGAISYKKLFGEYGLWCGGTFFGTVENNQFYIKVTEAGHKHLP EAEPVAPHGGRPGMYLVEELDDREFLTALVLDTCGQLPEPHKPKQSVSKPSQPKSGKRKS >gi|157101625|gb|DS480699.1| GENE 32 32363 - 32653 224 96 aa, chain - ## HITS:1 COG:no KEGG:ELI_0344 NR:ns ## KEGG: ELI_0344 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 96 1 97 97 98 53.0 7e-20 MENIVACCGCICNECPYYQKECGGCPKIQGKPFWLEYTGEERCGIYRCCVEEKKLPHCGR CSELPCSRYDQQDPARTPEENAAGLKKMLEVLRSLD >gi|157101625|gb|DS480699.1| GENE 33 32790 - 33695 857 301 aa, chain - ## HITS:1 COG:CAC3494 KEGG:ns NR:ns ## COG: CAC3494 COG2378 # Protein_GI_number: 15896731 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 1 289 1 288 300 171 36.0 1e-42 MKIDRLIGILSILLQREKVTAPYLAEKFEVSRRTINRDIEDICKAGIPLVTSRGPGGGIS IMEGYRMDRTLITSGEMDAILAGLRSLDSVSGSARYRLLMDKLALGKQDLISSDSSILIN LSSWYRQSLAPKIHLIQEAAETRRHIRFTYYSPMGETSRTIEPYLLVFQWSSWYVWGFCL MRQDYRLFKLNRMVDLECCQEVFDKRILPPYENEADRPFPIRFLVKARCKPSIKWRIVDE FGPETLEELPDGDFMFTAGYSDKENLFCWILSLGDQIELLEPCEFRGELAELGRRVYEIY R >gi|157101625|gb|DS480699.1| GENE 34 33743 - 34159 450 138 aa, chain - ## HITS:1 COG:no KEGG:Closa_3192 NR:ns ## KEGG: Closa_3192 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 137 1 138 139 127 54.0 1e-28 MDKVSVVLRVFFEDPFWIGILERVERGRMTVCKITFGPEPKDYEVHGFLAKEYYGLRFSP AVAAAVSEHLKNPKRVKRQVREELRESGMGTKSQQALKLQHEQIKTGRKAVSREKKEAEA RRQFELKQQKKKEKHRGR >gi|157101625|gb|DS480699.1| GENE 35 34373 - 35236 878 287 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01767 NR:ns ## KEGG: EUBELI_01767 # Name: not_defined # Def: type II restriction enzyme # Organism: E.eligens # Pathway: not_defined # 1 286 1 286 287 459 80.0 1e-128 
MVKRDFDAWLSKFRASISGYRYYIDFDKVVRNVEDIKIELNILNSLIGSVNIEDDFEKII TKYPETLSCIPLLLAVRSNEIYAQDENGAFKYNFKTMNYDMDQYKIFMRKTGLFDLLQKH LVNNLVDYALGIETGLDSNGRKNRGGHQMEDLVEQYIIAAGFTKDENYFKEMYLKDIENK WDIDLSALSNEGKAAKRFDFVVKTESMIYAVETNFYSSGGSKLNETARSYKMLSQEADTI EGFSFVWFTDGIGWKSARGNLRETFEVMEHVYNIDDMEHSVMTALLV >gi|157101625|gb|DS480699.1| GENE 36 35230 - 36027 515 265 aa, chain - ## HITS:1 COG:HP0092 KEGG:ns NR:ns ## COG: HP0092 COG0863 # Protein_GI_number: 15644722 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Helicobacter pylori 26695 # 20 255 23 256 277 234 48.0 2e-61 MEELMRGKIQAYYGTDTKKLYLGDCLELLRKMKPESVDMIFADPPYFLSNNGITCQGGRM VSVNKASWDEGGDFKENHAFNRRWIRMCRRVLKPGGTIWISGTLHNIYSIGMALQQERYK IINNITWKKTNPPPNLACRCFTHSTETILWARKDEKKARHLFNYEQMKQMNGGKQMKDVW EGNLTRPSEKWAGRHPTQKPEYLLERIILASTKKGDVVLDPFCGSGTTGVVSGKYGRQFI GIDNNEEYLDIAKRRLDQIQEALEW >gi|157101625|gb|DS480699.1| GENE 37 36038 - 36865 565 275 aa, chain - ## HITS:1 COG:TVN0849 KEGG:ns NR:ns ## COG: TVN0849 COG0338 # Protein_GI_number: 13541680 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Thermoplasma volcanium # 7 275 9 277 277 189 40.0 6e-48 MTEISAQPFVKWAGGKRQLLEQIKARLPEHFNGYMEPFVGGGAVYFDLQPEKSIINDINE SLINAYLQIKISPEEFADAISEYDKRIADGGKEYYYKVRENYNDKMMRAEYDMELAVLFV FLNKHCFNGLYRVNGRGLFNVPYNNSVRESCSKESIMAVSQSLQKVKIMKGDFQNACDLA EKGDFVFIDSPYAPLNPTSFESYTKEGFDIESHRRLADVFRELTERGCYCMLTNHNTDFI NQLYADFNKVVVDVKRMINSDAKNRVGQEIIITNY >gi|157101625|gb|DS480699.1| GENE 38 37071 - 37892 881 273 aa, chain + ## HITS:1 COG:all7633 KEGG:ns NR:ns ## COG: all7633 COG3544 # Protein_GI_number: 17158769 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 56 262 33 234 242 72 29.0 1e-12 MNKNMRTWVIGICVALVIIVAVAAAFANNRSGSDDGQTSQSQTESTSASAASQTETETGS REQASSNSGNAADHSGHNGHMNASGDDLSRYLTEQDSIMMNMMEDMVIREKSGNASLDFL RGMIPHHEAAIKMSESYLNYQGKSDELKTIAQDIITAQKDELKQMNELVKTYETDGKKDQ TKEDAYLEQYSKMFADDSMSHHMDTSDVDSLDQAFAEGMIMHHQMAVDMARDILEYTDYE EIRTMAQNIIDVQEKEIAQMEKILNDQQESQPE >gi|157101625|gb|DS480699.1| GENE 39 38121 - 39500 1129 459 aa, chain + ## HITS:1 COG:lin2192 KEGG:ns NR:ns ## COG: lin2192 COG0534 # Protein_GI_number: 16801257 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 23 442 9 430 443 178 29.0 2e-44 MFRKMPEGMGTVQEVTMQNNYAEQTSCLKEFARYTSLNVLGMIGLSCYILADTFFVAKGL GTSGLAALNLAIPVYSFIHGSGLMFGIGGATKFSISRKSGLSDLIFTNTLFFAGITALLF FLAGLFLSPQITALLGADSQVAAMTQTYLKMILLFAPAFILNDILVCFVRNDGNPRLSML AMLGGSMANILLDYVFIFPLGMGIFGAVLATGLAPLISIAIMSGHWMGKNKGFHLAKTEI QGQTALLILSLGFPSLITEVSSGIVIIVFNVIILGLKGNVGVAAYGVIANLSLVVISIFT GIAQGMQPLISSAYGRGDSARIRGILRYGLATAVFISLTLYAGICWGAAPITAVFNSENN LKLQEIAEYGLRLYFTGIPAAGLNIILSIFFTSTEQAVPAHIISLLRGMVLIIPIAFFLS GLWGMTGVWLAFPATELVCALTGAAICKRFYSQSFSASR >gi|157101625|gb|DS480699.1| GENE 40 39470 - 41020 1658 516 aa, chain - ## HITS:1 COG:RSp1306 KEGG:ns NR:ns ## COG: RSp1306 COG4262 # Protein_GI_number: 17549525 # Func_class: R General function prediction only # Function: Predicted spermidine synthase with an N-terminal membrane domain # Organism: Ralstonia solanacearum # 11 515 23 520 525 370 40.0 1e-102 MKEGTFSYRRLMFTTFIISGCSIIYELLISSVSSYLLGDSIAQFSITIGLYMCAMGMGSY LSKYVRTELFDWFVFVEIGVGILGGTSSLLLFLANIYVQSYQLVMYLEIILIGMLVGLEI PLLTRIIEENAGNLRITLSSIFSFDYIGGLAGSIAFPLLLLPQLGYFSTAFLVGSLNLGI SLFILVSYRGYIRMFSMWRLVSGLACAFMVLGMLFSENVASGIEQGLYRDKVILREQTPY QKLVVTKHKDDIRLFINGNIQFSSKDEYRYHEALVHIPMSAAKSRRNVLVLGGGDGLAVR EILKYPEVEKIVLVDLDARVVDLCRTNQDIAALNQGSLDSPKLEIRSEDAYGYLENNYRN EGGEGYDGEFDVIITDLPDPNSESLNKLYTDLFYRLCSNNLTEHGVMSVQSTSPYYAADA YWCINKTVGQEFPYVYPYHVQVPSFGDWGFQLASKDKVSVDGLELQAEGKFLNQSGLRGL FLFAEDEKPRRDRLTANTLSEPRLLHYYLEAEKDWE 
>gi|157101625|gb|DS480699.1| GENE 41 41020 - 41442 534 140 aa, chain - ## HITS:1 COG:no KEGG:Vpar_0586 NR:ns ## KEGG: Vpar_0586 # Name: not_defined # Def: protein of unknown function DUF350 # Organism: V.parvula # Pathway: not_defined # 1 139 4 142 143 98 43.0 8e-20 MAEFITIMVYCVLGTALMLFGTFVVDLVIPCEFPAEIKKGNVAVGSVMAGISVGIGIIIR AAVVSPAGTAVHESLLTGIASTIYYYVVGLVFCVLGYLALNLINRKYDLNKEISGGNPAA GIMVAGMFIGLSIIISGVIM >gi|157101625|gb|DS480699.1| GENE 42 41520 - 42629 1063 369 aa, chain - ## HITS:1 COG:no KEGG:Vpar_0386 NR:ns ## KEGG: Vpar_0386 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 7 141 3 128 213 66 31.0 2e-09 MGKTLQFNKDQNLSIEGAEYRVTGGIEFHNRSDGSRWWEYCLLETRTRGIKWLSIDNIYE EYAIYTQCPYGSEFDEMNIFRDGYRQADAGQAVVTSCFGQVDTSPGDTVRYTEYEDGTEE LIIAVEQWEDETEYSKGCYLDMDEIVLLDSGCSGQVESNRTLGFVNMKNLAVAAVILAVL GVLSYTYIQSNKKTIHKYLEGNINFSYQTSITSDLNEKERADVYSTDLSVDDAAKAIIQA IDGGTEDVQKNGEDDSVAILTKSEYCLVYTSTDQTTMVQISSRAYVYQSTNTPYHATGHT HSYYRGFYYSRGFFGDRDRYRQRTSGYENYSGETVDTNPVDPYKSYSDSVRQSSINSRRS SGGGISSGK >gi|157101625|gb|DS480699.1| GENE 43 42622 - 43008 342 128 aa, chain - ## HITS:1 COG:PA4773 KEGG:ns NR:ns ## COG: PA4773 COG1586 # Protein_GI_number: 15599967 # Func_class: E Amino acid transport and metabolism # Function: S-adenosylmethionine decarboxylase # Organism: Pseudomonas aeruginosa # 11 109 15 114 160 93 42.0 8e-20 MTGKETDIAEARQLIVDLYGCRTSYIDDVDSIKNVIHMVCQCIGAGIVEECYHKFSPMGI SAVAVITTSHISIHTWPEYGYAAVDIFSCQEDIPGEICRLLTESLKAEQAETRLIKRALN RAKEDVHG >gi|157101625|gb|DS480699.1| GENE 44 43224 - 44219 978 331 aa, chain - ## HITS:1 COG:BS_splB KEGG:ns NR:ns ## COG: BS_splB COG1533 # Protein_GI_number: 16078457 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Bacillus subtilis # 6 331 10 342 342 181 33.0 2e-45 MKFDAVYYEPAIFDYPLGRQLREEYGDLPWIPIESHNSIREMQEKPNDQFGHMKRNLIAG IRKTHKYVENHKVSDYLVPYTSSGCTAMCLYCYLVCNYNKCAYLRLFVNREQMLDRIIKK GKNSGTPLTFEIGSNSDLVLENTITGNLLYTIPRFAEEGEGKLTFPTKFHMVEPLLDLDH 
RGKVIFRMSVNPQPIIQKIELGTSGLRQRIEAVNQMCEAGYPCGLLIAPVILVEGWKEMY TGLLEELRDGLSDKMKSQMFLEIILMTYSYVHRAINSEAFPNAPDLYDREQMTGRGRGRY CYRAESREEAQLYLRQEIGRVLGNVPILYIS >gi|157101625|gb|DS480699.1| GENE 45 44417 - 46882 2123 821 aa, chain - ## HITS:1 COG:CAC2687 KEGG:ns NR:ns ## COG: CAC2687 COG0514 # Protein_GI_number: 15895945 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Clostridium acetobutylicum # 4 618 6 581 714 564 46.0 1e-160 MTQYEILKHYFGYDTFRDGQDVLIQSILEGRDVLGVMPTGAGKSLCYQIPALMMDGITLV ISPLISLMKDQVSNLNQVGILAAYINSSLTAAQYYKVLDLARAGRYPIIYVAPERLMSED FLRFALSSQVKISMVAVDEAHCVSQWGQDFRPSYLKIVDFINQLPERPVVSAFTATATAE VRDDIIDILMLRNPQVMTTGFNRPNLYFGVQSPKDKYATMVNYLERHKGESGIIYCLTRK VVEEVCGQLIREGFSVTRYHAGLSDSERRHNQEDFIYDRAQIMVATNAFGMGIDKSNVRF VVHYNMPKNMESYYQEAGRAGRDGEPSECILLYGGQDVVTNQFFIDHNQDNEALDPITRE IVMERDRERLRKMTFYCFTNECLRDYILRYFGEYGSNYCGNCSNCLSQFETVDVTDISRI LIGCVESSRQRYGTNVMIDTVHGANTAKIRNYRMDENPHYGELAKVPAYKLRQVMNHLML NGYLAVTNDEYAIVKLTGKSKGILEEGEPVTMKMAREAEHPAKASGASAGKGRKGRKGLA AGLSGPGGAEFTEADETLFEKLRAVRTEIAKEEKVPPYMVFSDKTLTHMCIVKPVTKGEM LNVSGVGEFKYEKYGERFLACVQAEMNDSPAEISFEGDDLYFTSDNDAFDDWSLETALTA WEYGGPENERQENGSLGNGSQGNDSREDERTREAPAGLNTDRKKKTGKGKTDFVMTGELA ARVRYSEQSSLSDFVSQINSLRDEDTMKRLTIKSVEQKLMEDGNFEEYYVNGIPRKRLTD KGREFGIEAEKRLSEKGNEYEVFFYGERAQRGIVEWLWEKI >gi|157101625|gb|DS480699.1| GENE 46 47373 - 48041 654 222 aa, chain - ## HITS:1 COG:MT2292 KEGG:ns NR:ns ## COG: MT2292 COG0546 # Protein_GI_number: 15841725 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Mycobacterium tuberculosis CDC1551 # 3 187 80 266 291 70 31.0 2e-12 MTKLVAFDLDGTLNMTHTYAVAAYQEALKELGAGPVAESVIRSRFGAPFRDDSLFFLGDD SEEAFRRFYRAVERHWAPLMRERAQTYPGVPEMLGELKEQGYVLAVCSNAKEEELEDVLG VLSIREYFDYIQGLTSRQDKADSLGVLIERAGADWCCMVGDRFYDREAAKANETAFIGCR YGYGGADEFGDGDLLAEDSVEIPGLLKRLVFSTPWLKPWQRP >gi|157101625|gb|DS480699.1| GENE 47 48054 - 48872 847 272 aa, chain - ## HITS:1 COG:FN0629 KEGG:ns NR:ns ## COG: FN0629 COG3716 # 
Protein_GI_number: 19703964 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Fusobacterium nucleatum # 13 272 19 279 279 238 44.0 8e-63 MQETKETSLITKKDLRSVFWRSFTMQSSWSFDKMMAYGFMYSIEKPLRKIYPKDDDYFEA LHRHTETFNITPHVSPFVIDLSVAMEQECAKADNFNKESINNLKVGLMGPLSGIGDSFFW GTFRVITAGIGIAFAQQGSILGALLYFFLYTAIHFLTKYITGNAGYKMGTRFLENSAENH LIEKMSYGASILGLTVIGAMIGTMVTLKTSLTFQFSGTETTLQSIFDQIFPGLLPLGATF LCVWLFNRNVKTIAIILGIFVICIAGCWIGIF >gi|157101625|gb|DS480699.1| GENE 48 48873 - 49619 878 248 aa, chain - ## HITS:1 COG:lin2109 KEGG:ns NR:ns ## COG: lin2109 COG3715 # Protein_GI_number: 16801175 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Listeria innocua # 18 238 20 240 272 141 38.0 9e-34 MLQAILVSIVAFLAAIDEYTFGASMMGRPLFTAPIVGLILGDFQAGIMIGATLELMFMGS IMVGSATPPEVYASSVLGTAIAIQSGEGVGTAVALALPLSIFLQMWRNGCYAIGASWGGK QIEKALAQRNLRRANMWHLITLPLMVGVPSMLLVFFAMYFGAGSINAVIKLIPQVVLDGF DVAAGVLSCVGLALLIKMMGNKKLMPYLFLGFIAVMYINMDVIGVAVAGLCMALIAVGNM KFEEEEDF >gi|157101625|gb|DS480699.1| GENE 49 49660 - 50130 578 156 aa, chain - ## HITS:1 COG:STM4536 KEGG:ns NR:ns ## COG: STM4536 COG3444 # Protein_GI_number: 16767780 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Salmonella typhimurium LT2 # 1 149 1 149 153 122 41.0 2e-28 MIKITRVDERLVHGQVAFAWTNSLGADCILVVNDEAAADKIRATTLKLAAPAGIKFSIKS VADAIVLLNGTKTDKYKVFVIVGNTDDALKLVEHVKGVDHINLGNMKKTEGRRIITNSLA VDDHDVRNIRAMAQAGAVVECRAVPTDKETDAMTLI >gi|157101625|gb|DS480699.1| GENE 50 50127 - 50555 687 142 aa, chain - ## HITS:1 COG:STM4535 KEGG:ns NR:ns ## COG: STM4535 COG2893 # Protein_GI_number: 16767779 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific 
component IIA # Organism: Salmonella typhimurium LT2 # 2 141 3 140 140 74 27.0 6e-14 MRKILIATHGEFAGGLKQTMDFVLGGNEKVGVLSAYTTPDFDMDREAAAVVDELEDGDEL IVMTDVLGGSVANAFSGHISHPGVYVLAGVNAPMLLAMVPMLESAMDTQELITQGIQAAR EGCVFINGLMKQAFEESEEDFV >gi|157101625|gb|DS480699.1| GENE 51 50579 - 51592 993 337 aa, chain - ## HITS:1 COG:SMc03139 KEGG:ns NR:ns ## COG: SMc03139 COG2222 # Protein_GI_number: 15966707 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted phosphosugar isomerases # Organism: Sinorhizobium meliloti # 1 337 1 337 337 268 38.0 1e-71 MYGFQNEDFLKTLNGAVAQIGEINRIADKVSDLGYKNIYLVGTGGTYAIISPLAYMLKTN STLVWYHEIAAELLAAKPRSLTRDSLFITASLSGTTKETIAAAEYARSMGATVISFVGDR TAPLAAVSDYVVENAACNDNLVEELHIQFFALGARLMLKNGEFPQYERFVETLRKMPEVL LKVREQNDERALAFAEKHKDTGFHMCVGAGNTWGETYCFAMCVLEEMLWIPTKSIHAAEF FHGTVEMTDKNMSFMLFKGEDETRPLVDRVENFVRKYSDVVQVWDTRDYPLEGIDADMRK YVSPMVMSTQLERVSAHFEHVKDHSLDIRRYYRTVEY >gi|157101625|gb|DS480699.1| GENE 52 51573 - 51707 96 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940931|ref|ZP_02088271.1| ## NR: gi|160940931|ref|ZP_02088271.1| hypothetical protein CLOBOL_05823 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05823 [Clostridium bolteae ATCC BAA-613] # 6 44 1 39 39 66 97.0 5e-10 MVLTKVDDVGILKTQDGLYKRYPVKYIKFNLSNQEEDYDVRFSE >gi|157101625|gb|DS480699.1| GENE 53 51894 - 52715 716 273 aa, chain + ## HITS:1 COG:BS_yvoA KEGG:ns NR:ns ## COG: BS_yvoA COG2188 # Protein_GI_number: 16080556 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 37 247 17 227 243 113 32.0 4e-25 MQRKQIQGEQIQDKQIQGKQRQSTQTQRLTKSEIFTQTLKRNIQEGTYPYGTALPAEREL AESFGLNRSTVRAAIQTLVEEGLLKKVQGKGTFVMRHARDNAYTHFRGMSELLKNHGYVP SSRILYTGPTPAGYKLSHLLQVGEDTVLYRIMRLRLGNGLPVSIENTFVPYELIPDIEDI DFQIYSLYDAFAMNSLRIAGIQQVFSTTRVRNSEAKYLNAENGSPVVSIAITSASEENGI IEYTNALVLPEFCKFYSDGVIQDSKLNVYAQSI >gi|157101625|gb|DS480699.1| GENE 54 52790 - 53470 607 226 aa, chain + ## HITS:1 COG:no KEGG:Closa_4277 NR:ns ## KEGG: Closa_4277 # Name: 
not_defined # Def: transcriptional regulator, GntR family # Organism: C.saccharolyticum # Pathway: not_defined # 1 226 1 232 233 168 37.0 2e-40 MPKVFNLLSYDIAEYIEKYIHEHGLQPGDRLPTERELAAQLDITRVSVRQGYKRLLDQGV IRSENGYYVNPPRTDRSLTAYCFPYADSYLHPDTFSVKSTEYMPEAIRNICNNMLNADHS SHLNMNRFIEYLSDIPVGITYTAQTSSSMPSFPGLFYLRSLPEGLKQTQYMRVHPPSREE YELLELSEHDSMLMLVNGFQYEGTPIAVSVSLCVGTRMNLIAPITI >gi|157101625|gb|DS480699.1| GENE 55 53551 - 54120 248 189 aa, chain + ## HITS:1 COG:no KEGG:Closa_3061 NR:ns ## KEGG: Closa_3061 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 30 168 32 171 172 118 41.0 1e-25 MKILPACAADTAKIYRIMTEARSLAPDPGWYCTDSEDYIRSHIVNPDAGIVFKAVEHDCL AAFFIIHFPGNTPDSLGHYMNLEPEALDRVAHMDSLAVRPSFRGRGLQYQLMASAEQYLM TTSYCHLMGTVHPDNRYSLGNFLKLGYRIVTTTKKYGGLPRHVLYKSISNPPSFTGSGNN TASVSIYLK >gi|157101625|gb|DS480699.1| GENE 56 54148 - 54984 890 278 aa, chain + ## HITS:1 COG:CAC3009 KEGG:ns NR:ns ## COG: CAC3009 COG0726 # Protein_GI_number: 15896261 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Clostridium acetobutylicum # 67 272 85 290 295 226 53.0 5e-59 MKNRLRRLPMKKLLQTTLLFALAFCAGRLAAQVVEYNRPPEAVTTSADGNWGLSFPGEGQ MPVGNATADYLKQFNAYYAQNTQDKVIYLTFDAGYENGNTAAILDALKKHNVHATFFLVG NYLQTSPDLVKRMVAEGHNVGNHTFHHPDMSKISTKEAFEKELNDLETLYQQTTGQPMKK YYRPPQGKYSESNLKMANDMGYKTFFWSLAYVDWYEDKQPTKEEAFKKLLTRIHPGAIVL LHSTSKTNGQILDELLTKWEEMGYHFGTLDDLVAADQT >gi|157101625|gb|DS480699.1| GENE 57 55131 - 55487 435 118 aa, chain - ## HITS:1 COG:FN0052 KEGG:ns NR:ns ## COG: FN0052 COG1393 # Protein_GI_number: 19703404 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Fusobacterium nucleatum # 2 117 3 118 120 119 56.0 1e-27 MNLTVLCYKKCSTCQKALKWLDANGISYVERPVREENPTLEELKEWYGMSGLSLKRFFNT SGNIYKEMSLKDKLPSMSEDEQLALLATDGMLVKRPLVVGDGFVLTGFKEAEWKEKLL >gi|157101625|gb|DS480699.1| GENE 58 55515 - 56474 676 
319 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940938|ref|ZP_02088278.1| ## NR: gi|160940938|ref|ZP_02088278.1| hypothetical protein CLOBOL_05830 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05830 [Clostridium bolteae ATCC BAA-613] # 1 319 1 319 319 574 100.0 1e-162 MDKISKGFVGRTVRAVLFIILIIAFCTAISSAGIYLASRYGLTQSREQVPLTALYLGNEG KVELPGRGKMAEGNGWRLRNSRDSYTLTLDNAVIENTVMENAESSRPAIDITGDLTIELK EGSGSWIRSEGAGIRIHSGTLVIRGKGSLDIEAKTAGILGNRTENGQAALRLEGGQLDIK GTYAGISCPEVDIAGGTGTVSALGQCGIGIYSCRIKVEESEGEMTAEGSGAAVLAGGSGE MKPSIKSDGKVTAGPGDVRVVEFDAARPEHMAGHEEEIGGWMGGTHSALLTYSTEDMVTY NEDTSCFDGSARTLRFAAE >gi|157101625|gb|DS480699.1| GENE 59 56611 - 57246 586 211 aa, chain + ## HITS:1 COG:mlr0387 KEGG:ns NR:ns ## COG: mlr0387 COG2755 # Protein_GI_number: 13470623 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Mesorhizobium loti # 1 210 1 210 211 149 38.0 3e-36 MINILCFGDSNTFGTNPKGGRHSWNTRWPGRLQVLLGPEYYVIEEGMGGRTTVWDDPLEP GRCGIQALPMALQSHKPLDLVILFLGTNDCKAHFHASPRVITKGMENLCHTVQGFDYGEG YKKPKILVISPIHIGNDMEHCPYVSFDETAPGKSMALAPLYQELAKAEGCLFLDAASLAG PSSIDQLHMDAKNHGALAEGIAEVVKGYFEP >gi|157101625|gb|DS480699.1| GENE 60 57332 - 58201 449 289 aa, chain - ## HITS:1 COG:no KEGG:GC56T3_2842 NR:ns ## KEGG: GC56T3_2842 # Name: not_defined # Def: hypothetical protein # Organism: Geobacillus_C56-T3 # Pathway: not_defined # 3 289 4 295 301 261 43.0 3e-68 MTDLLKHAYDLHIHCGPDIIPRSVTALEMAERAVKRGMKGFAIKSHYAPTCLQAATVRAC YPGCNAVGTITLNASVGGLNPLAVETAARMGAKIVWCPTFDSASQQDYYLKHLPQYIDMQ KKMLDAGRNVPSYCLIDESGELCCEMQEVLDVVQSHDLVLATGHITHKETFALARVASKR GFRKLLITHADWEFTHYTVEEQQELVKLGAWIEHSYTSPEEGCVAWEKVAGEIRTIGAQR CIITTDLGKANGKYPDDGLWDYAQRLLEMGFTAGEISAMICQNPGFLVE >gi|157101625|gb|DS480699.1| GENE 61 58217 - 59497 717 426 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. 
PR1] # 5 421 6 426 431 280 35 1e-74 MGALLFILLLVLIFFGLPIGFCILLSSTVFLQVTSLKPLLVVAQKLALGLDSSSLLAIPL FTLMGYIMESCGLSKRLIDWVYSLFGGFKGAMGLVTIICCTIFAALSGSGPATVAAIGSI LIPAMIENGYPPKTAAGLTAMAGALGPIIPPSIVMIVYGTTMGVSITKMFMGGVVPGILI AICLAVANTIKTRKLPLKVNLERHSLKEQLLVTWKALPTLLLPVIILGGIYGGIFTPTES ATVGVVYSFVLALVYHKVTVKSFIDSVKKTIETSAMICFLIAGASVFCWILSTTQIPTKI ANFVVPMLGGSQALYWILLLIVLLFIGCLMESLASVVMLAPVLAPIGLAMGINEIHLGVV FCISLIVGFVTPPFGANLFTVVGITKQPFGEVVIGVLPFLAASFIALILCIIFPPIVTFL PSLLSA >gi|157101625|gb|DS480699.1| GENE 62 59498 - 59995 188 165 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 6 152 2 151 164 77 26 3e-13 MLRAYRKICTWIYRIMIAVCIVLVILMIVASGLQVFSRYVLNSTFSWTDECARYCFLWFN LIGSGCLVKSKGHAIVDLFSGKLKGNTKRIYNLAIHLLIFYMGIILAKYGLALCKVTMRQ TSTALKLPIGLVYGALPVLGMLILIYEVEAIWCILAGYSDGKEEI >gi|157101625|gb|DS480699.1| GENE 63 60034 - 61113 426 359 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 41 327 36 318 346 168 35 7e-41 MRRRNMIMAATAVAISFSLCACGGSQTSTSAGTETQKSEKNEKTETKAETAEKLVIQVGH TEADSEDSVHHKMCTLFKQYVAELSDGSIDIEIIPSASLGGERDMVEGMQLGTIEMASTA NMVISNFDPAFSILDLPYVYNDYESAYAVLESDELQQLMDKFAEESGVRILAYGQGGFRD VISNKAVISLNDFSGMKIRVPESDIYLSTFKAMNANPTPLAFNDVFTSLQQGTIDGFEIV PAVVLANGYYEVCSHVSLTRHLYSPNPLMISESLWSSLTEEQQKVIQEAADKAAGEQRKW EEDQEQNVFDQLREKGMTIDEDVDVEAMKKACADAGIYDTYRDTIGADFYDKVMNLIQK >gi|157101625|gb|DS480699.1| GENE 64 61468 - 62325 613 285 aa, chain - ## HITS:1 COG:BH2940 KEGG:ns NR:ns ## COG: BH2940 COG1737 # Protein_GI_number: 15615502 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 10 284 14 281 284 103 26.0 4e-22 MTILAERLKESKLTNQQHRIADYFLKNEERIGNMSSMDVAREIGVSDASIIRFSRAIGYK GFTDLKNDIYNSLAGEASAAVRNMQLSDRFDVLSGKFNDQNIPDTFVELMNYNITKTFHQ NPVESYDALTDMILMSKRKYIIGLRGCMGVAGHFARLLGFAVNQVVTLPHNETEAMGILQ DIGPEDIFILFSFARYYKIDMFLINVAKSHKARICVVTDSPLAPMIPCSDISFIAETSHM 
SFFNSAIGVEMIGEYIVTLVCRKNSEEYRKRAKERDALTTEFRMP >gi|157101625|gb|DS480699.1| GENE 65 62555 - 63478 862 307 aa, chain + ## HITS:1 COG:MJ1018 KEGG:ns NR:ns ## COG: MJ1018 COG0111 # Protein_GI_number: 15669207 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Methanococcus jannaschii # 3 303 4 308 524 200 37.0 3e-51 MLVYLSEYIHPDAVKLLKQHGEITDTFDRIREIDAIILRTANITRDIISQASSLKVIVKH GVGCNTIDLEAAKEYKIPVINTPKANTNSVAELIIGLILNVCRGIAFCDAKSRHEGFGRI APPEMTGIELTGKTIGLVGMGNIALRTGEILKNGFDTVLVGYDPYCTAEQAAIFGIKKYN DLNKMLEVSDIVNISVPLTPSTKNLIAGDSFLHFKKNAVLINAARGGIVNEDDLYTALKT RQLRAAACDAFVKEPPTGENKLTKLNNFCATPHIGANTEEALYRMGMEAVKAVIGAIEGN AAKYRVV >gi|157101625|gb|DS480699.1| GENE 66 63498 - 64193 644 231 aa, chain + ## HITS:1 COG:AGpA472 KEGG:ns NR:ns ## COG: AGpA472 COG0684 # Protein_GI_number: 16119556 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 16 218 15 216 227 171 46.0 9e-43 MSKVGCVIIEDFKRPDKALVEKFRNVAVANLDDCMNRISAVHEDIRPVNKGKLLGPAFTV KVPSGDNLMFHAAMDLAKPGDVIVIDAGGDTRRSIFGELMVTYCRKRGIAGIVVDGSLRD VDAMSQLTDFPIYARGITPDGPYKNGPGEINTPIVCGGRLVNPGFIVIGDEDGVIFINPS DAPELLEKVNKLHENEIRIINTIEKDGTYIRPWVNEKLEGLEVERINYVEY >gi|157101625|gb|DS480699.1| GENE 67 64287 - 65213 744 308 aa, chain - ## HITS:1 COG:CAC1605 KEGG:ns NR:ns ## COG: CAC1605 COG1893 # Protein_GI_number: 15894883 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Clostridium acetobutylicum # 1 306 1 301 301 179 33.0 5e-45 MDRITRVSIVGMGALGVLYGDFFARTLGLEQVTFLADRDRVERYRDTTVYCNGRGWKFAV CPGDEADPGGPAQLLIFAVKATALEASVPMVRNQVGKDTVILSVLNGITSEAVIGNILGM EHVLYCVAQGMDAVKMGNELTYSHMGQICIGIPQHEPQKEAMLEAVTELFERTGLPYTRE PDILRRLWCKWMLNVGVNQAVMVEEGTYGTVQRPGPARQMMKAAMAEVVELAGREGIRVT MEDLDAYVDLIDSLNPDGMPSMRQDGLARRPSEVEFFAGTVIRRAEAAGLDVPVNRELYR RIKGMERR >gi|157101625|gb|DS480699.1| 
GENE 68 65232 - 65771 577 179 aa, chain - ## HITS:1 COG:TM0386_2 KEGG:ns NR:ns ## COG: TM0386_2 COG0778 # Protein_GI_number: 15643152 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Thermotoga maritima # 1 177 1 165 181 73 32.0 2e-13 MELDAVLRERRSMRKYQPDRKVSREQVEEILKAATLAPSWKNSQTARYYVVMSDDMLKRV KETCLPAFNQTNSKDAPVLIVAAFVKNRSGYENDGTPSNELGNGWGCYDLGMHNQNLLLK AKDLGLDTLVMGIRDAGKLRELLSIGEDQDVAAVIALGYGAADPQMPKRCTVREIARFY >gi|157101625|gb|DS480699.1| GENE 69 65816 - 66502 508 228 aa, chain - ## HITS:1 COG:no KEGG:Closa_1597 NR:ns ## KEGG: Closa_1597 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 218 1 217 218 298 63.0 1e-79 MEREVSLEDISDGKLYGINDMVRADSGGCEGCSACCRGMGSSIILDPMDMFRLTGNLGMT PEQLLSGPLEVNMVDGIILPNIRMTGRDEACSFLNGEGRCGIHSYRPGFCRLFPLGRLYE NRSFSYFLQVHECPKDNRSKVKVRKWIDTPDVKLYEKFVNDWHYFLKDLEHGLEENGSLD IRRTVVMYVLNQFYITPYNSGEEFYPQFYRRITEASEFAARSGIKTSL >gi|157101625|gb|DS480699.1| GENE 70 66631 - 67473 597 280 aa, chain - ## HITS:1 COG:SA2178 KEGG:ns NR:ns ## COG: SA2178 COG0789 # Protein_GI_number: 15927968 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Staphylococcus aureus N315 # 2 97 6 103 246 67 31.0 2e-11 MTIKEVETLTGITKANIRFYEKEGLLAPGRSSNNYREYSEDDVEHLRKIRIFRIMGFSVA QIRQLADQPQRAAVLMEARAGQIQKEVKELERVRQVCLEVRDSGWRFDTLDPQLLEMKLE WEKKGKDIMKKDRIDRLCKARDMAYLFCFLCIVSMIMFPINQVLGIRLPEKFIQIWRLVI TISPFPALLLGAAAAGKARWQVPDLNRILPDSGKPFMKKERDRGKLYERVCFLNQIGLTS LVLIPLNKALGIQLPLWVTCGWMAVIAGLAIAVMVLKNRG >gi|157101625|gb|DS480699.1| GENE 71 67635 - 67976 372 113 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1585 NR:ns ## KEGG: EUBREC_1585 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 24 98 55 129 132 87 57.0 2e-16 MPDSKVKELSDMPWKQREESGTWQKERVIQELRSQGKRMTKQRMILLDVILSGKWNCCKE IYYEAAKRDAAIGMATVYRMVSTLEEIGVFSRCYRYSLPDRPCVEEQAGDWMD >gi|157101625|gb|DS480699.1| GENE 72 68183 - 70384 2236 733 aa, chain - ## HITS:1 
COG:BS_topB_1 KEGG:ns NR:ns ## COG: BS_topB_1 COG0550 # Protein_GI_number: 16077493 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Bacillus subtilis # 2 576 3 574 575 710 61.0 0 MKSLVIAEKPSVARDIARVLKCGKNINGAIEGDRYVVTWGLGHLVTLADPEDYDKKYKEW KMEDLPMMPEVFKLEVIKQTSKQYQAVKTQIYRGDVGDIIIATDAGREGELVARLILKKT GCNKPIRRLWISSVTDKAIREGFANLKDGREYNNLYDAAMCRAEADWLVGINATRALTCK YNAQLSCGRVQTPTLAMIAKREAQIRAFVPQPYFGLQARKDGLVLTWQDAGTGSHRSFDR GRMEGLAQELKGETAVVAEVKRTPKKTQPPLLYDLTELQRDANKRFNYSAKDTLNIMQRL YENHKVLTYPRTDSRYLSADIVPTLKERLKACSVGPYKMLAGRLVNQPLPAKPFFVDNSK VSDHHAIIPTEQYVQMDHMTIDERRIYDLVVRRFLAVLYPPFEYDETSLEANIRGQRFIA KGRIVKGQGWKAAYDTAVDDGEEEDEPEVKDQQLPDLKKGDVLHGLSLKMTEGKTKPPAP FNEATLLSAMENPVAYMETKDKAMAKTLGETGGLGTVATRADIIEKLFSSFLLEKRGKDI CLTSKARQLLELVPEDLKKPELTADWEMKLSGIAKGSLKRGAFMKDIRGYSQELIRQIKT GEGSFRHDNLTNTKCPVCGKRMLAVKGKNTEMLVCQDRECGHREVISRTSNARCPVCHKK MELKGKGDAQIFVCRCGHKEKLKAFEERRKKEGAGVTKKDVARYLNQQKKEAEEPVNNAF AKALAGLNLKDKG >gi|157101625|gb|DS480699.1| GENE 73 70434 - 71168 586 244 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase [Ochrobactrum intermedium LMG 3301] # 1 244 1 244 245 230 46 2e-59 MKENKYDNDIFFEKYSQMERSKKGLSGAGEWETLKALLPDFQGQRLLDLGCGYGWHCIYA MEHGAVSAVGVDISGKMLEVAREKTEDSRVRYVECAMEDIDFPDESFDVVLSSLAFHYVE SFSQVVDKVRDSLAPGGYFVFSVEHPVFTAYGSQDWYYDENGNILHFPVDNYYYEGKRKA VFLGEEVVKYHRTLTTYLNTLLTHGFELSQVVEPQPPDSMLHIPGMLDEMRRPMMLIVSA RKRS >gi|157101625|gb|DS480699.1| GENE 74 71297 - 72313 931 338 aa, chain + ## HITS:1 COG:CAC0554_1 KEGG:ns NR:ns ## COG: CAC0554_1 COG3757 # Protein_GI_number: 15893844 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Clostridium acetobutylicum # 33 224 2 186 187 119 36.0 6e-27 MKHLRGAAGIILSLIFAVCFMCSPAFAASRLYQGMDVSSWQGDIDFQAVRAAGIQVVYLR AGVGLEYVDPYFQSNYEKALDAGLNIGYYHYVTAADTFQARQQAEFFYSLIRDKQIDCCP AMDFESFPGLSTQEINAIGLSFMETLGSLLGYDPALYTDSYNAAYLWSSDFSSYPLWVAD 
YDVNQPESTGPWDSWDGFQYSDRGRVPGVNGDVDMDFFKDSMFIISRTPQPEPNPGDIGV LLTYKVQGGDTLWSISKRFGTTVDRLAALNHISNPNIIYRGQILEIPDIPGAIIYRVKSG DTLWAIADRFGTSVSDLVFTNAIANPNLIYVGQVLVIP >gi|157101625|gb|DS480699.1| GENE 75 72382 - 73668 797 428 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2277 NR:ns ## KEGG: EUBREC_2277 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 99 424 248 582 583 249 41.0 1e-64 MERVSGILWEYFRFDCRSEQKRKENRKRIKSIIRPFRFVRPAHGVIVRNTLNEHGQDSTE SGDDQCEKSGKAHKKPSDAAKGFCFWLEASSKRREEAVRSYIDACYGISMYAEQQNQAIC HMLCTGCHSTCSLHFTRGEKFLQVPDRGESARQLWEARQQRKKNREQFEANRPEYENSIR RLMERIQNTLLVDLQPVPEKSRQGKMMGEAVWRAVYLEDDRVFLKNSEEERQNVSVDLML DASASRMGHEAVIAAQGYVIAESLTRCGIPVQVYSFSTIQNYTVMRLFRGYEEKEKNRGI LDYVAAGWNRDGLALRAAGHLAGQSSCEKRLLIVLTDASPNDEQRMAPVSGAVRGKEYSG DVGIEDAAMEVRQLKKQGIRVMAVFYGLDSDLEGARKIYGSGFVRIKEMGQLADTVGNLL TSQLRSGR Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:33:38 2011 Seq name: gi|157101624|gb|DS480700.1| Clostridium bolteae ATCC BAA-613 Scfld_02_41 genomic scaffold, whole genome shotgun sequence Length of sequence - 61826 bp Number of predicted genes - 55, with homology - 54 Number of transcription units - 26, operones - 14 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 133 - 170 4.9 1 1 Tu 1 . - CDS 237 - 944 588 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 1177 - 1236 8.0 - TRNA 1324 - 1396 87.0 # Thr CGT 0 0 - TRNA 1464 - 1535 60.8 # Glu CTC 0 0 - TRNA 1546 - 1619 71.8 # Pro CGG 0 0 - Term 1536 - 1601 2.1 2 2 Tu 1 . - CDS 1728 - 1922 403 ## gi|160940960|ref|ZP_02088299.1| hypothetical protein CLOBOL_05854 - Prom 1942 - 2001 2.7 3 3 Op 1 . - CDS 2035 - 3132 1135 ## bpr_I2529 hypothetical protein 4 3 Op 2 . - CDS 3172 - 4662 1150 ## COG1696 Predicted membrane protein involved in D-alanine export 5 3 Op 3 4/0.000 - CDS 4723 - 5028 456 ## COG1937 Uncharacterized protein conserved in bacteria 6 3 Op 4 . 
- CDS 5118 - 7502 2680 ## COG2217 Cation transport ATPase - Prom 7537 - 7596 5.9 7 4 Op 1 9/0.000 - CDS 7748 - 9721 2074 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific 8 4 Op 2 . - CDS 9740 - 11512 1340 ## COG0366 Glycosidases - Prom 11539 - 11598 2.3 9 5 Op 1 1/0.000 - CDS 11629 - 12483 641 ## COG0561 Predicted hydrolases of the HAD superfamily - Prom 12507 - 12566 5.5 - Term 12519 - 12554 0.9 10 5 Op 2 . - CDS 12638 - 13408 756 ## COG1737 Transcriptional regulators 11 5 Op 3 . - CDS 13383 - 14156 854 ## COG1737 Transcriptional regulators - Prom 14202 - 14261 4.7 12 6 Tu 1 . - CDS 14317 - 14832 263 ## Clocel_4317 hypothetical protein - Prom 14877 - 14936 3.3 - Term 14893 - 14936 -0.6 13 7 Tu 1 . - CDS 14964 - 15824 379 ## COG2207 AraC-type DNA-binding domain-containing proteins 14 8 Op 1 . - CDS 15879 - 16787 77 ## COG2378 Predicted transcriptional regulator 15 8 Op 2 . - CDS 16780 - 17259 215 ## CLJU_c27480 hypothetical protein - Prom 17350 - 17409 7.6 - Term 17387 - 17425 -0.5 16 9 Op 1 . - CDS 17517 - 18005 534 ## gi|160940975|ref|ZP_02088314.1| hypothetical protein CLOBOL_05869 17 9 Op 2 . - CDS 18008 - 18208 233 ## COG1476 Predicted transcriptional regulators - Prom 18333 - 18392 9.0 - Term 18499 - 18558 11.8 18 10 Tu 1 . - CDS 18604 - 19977 1463 ## COG0733 Na+-dependent transporters of the SNF family - Term 20034 - 20067 3.1 19 11 Op 1 26/0.000 - CDS 20104 - 21057 1037 ## COG1079 Uncharacterized ABC-type transport system, permease component 20 11 Op 2 24/0.000 - CDS 21054 - 22154 1298 ## COG4603 ABC-type uncharacterized transport system, permease component 21 11 Op 3 15/0.000 - CDS 22151 - 23707 187 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 23776 - 23835 1.7 - Term 23780 - 23819 8.0 22 11 Op 4 . 
- CDS 23884 - 25086 1441 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein - Prom 25152 - 25211 5.1 - Term 25199 - 25246 2.3 23 12 Op 1 . - CDS 25268 - 29965 4933 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) 24 12 Op 2 6/0.000 - CDS 29981 - 31051 1481 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis 25 12 Op 3 17/0.000 - CDS 31087 - 32136 1232 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 26 12 Op 4 15/0.000 - CDS 32222 - 33370 1440 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase 27 12 Op 5 32/0.000 - CDS 33417 - 34217 945 ## COG0575 CDP-diglyceride synthetase 28 12 Op 6 19/0.000 - CDS 34239 - 34958 922 ## COG0020 Undecaprenyl pyrophosphate synthase - Prom 34994 - 35053 3.9 - Term 35027 - 35076 11.2 29 13 Op 1 33/0.000 - CDS 35089 - 35640 735 ## COG0233 Ribosome recycling factor 30 13 Op 2 . - CDS 35693 - 36394 874 ## COG0528 Uridylate kinase - Prom 36423 - 36482 11.0 - Term 36811 - 36856 10.1 31 14 Op 1 . - CDS 36881 - 37246 456 ## gi|160940995|ref|ZP_02088334.1| hypothetical protein CLOBOL_05889 32 14 Op 2 . - CDS 37282 - 37671 347 ## gi|160940996|ref|ZP_02088335.1| hypothetical protein CLOBOL_05890 - Prom 37774 - 37833 1.5 + Prom 37707 - 37766 7.3 33 15 Tu 1 . + CDS 37801 - 38475 806 ## COG1802 Transcriptional regulators 34 16 Op 1 . - CDS 38453 - 38860 262 ## COG0822 NifU homolog involved in Fe-S cluster formation 35 16 Op 2 . - CDS 38853 - 41240 2390 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 36 16 Op 3 11/0.000 - CDS 41240 - 43540 2472 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 37 16 Op 4 . - CDS 43537 - 44052 549 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs - Prom 44078 - 44137 5.5 + Prom 44403 - 44462 12.5 38 17 Tu 1 . 
+ CDS 44513 - 45892 1581 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain + Term 45920 - 45971 5.1 - Term 45906 - 45959 5.5 39 18 Op 1 . - CDS 45991 - 46065 84 ## 40 18 Op 2 . - CDS 46090 - 47061 1071 ## COG0673 Predicted dehydrogenases and related proteins 41 18 Op 3 . - CDS 47096 - 47935 877 ## COG0668 Small-conductance mechanosensitive channel - Prom 48041 - 48100 5.6 42 19 Op 1 . - CDS 48183 - 48434 235 ## gi|160941007|ref|ZP_02088346.1| hypothetical protein CLOBOL_05901 43 19 Op 2 . - CDS 48492 - 50096 1085 ## Ethha_1257 hypothetical protein 44 19 Op 3 . - CDS 50149 - 52035 1768 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 52112 - 52171 5.6 - Term 52226 - 52266 1.3 45 20 Tu 1 . - CDS 52286 - 52642 426 ## gi|160941011|ref|ZP_02088350.1| hypothetical protein CLOBOL_05905 - Prom 52675 - 52734 6.6 46 21 Tu 1 . - CDS 52847 - 53605 227 ## COG1145 Ferredoxin + Prom 53730 - 53789 9.2 47 22 Tu 1 . + CDS 53842 - 53985 163 ## gi|160941014|ref|ZP_02088353.1| hypothetical protein CLOBOL_05908 48 23 Tu 1 . - CDS 54073 - 54699 419 ## COG4832 Uncharacterized conserved protein - Prom 54792 - 54851 6.5 + Prom 54791 - 54850 3.8 49 24 Op 1 40/0.000 + CDS 54994 - 55680 449 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 55820 - 55858 -0.5 + Prom 55711 - 55770 4.6 50 24 Op 2 4/0.000 + CDS 55891 - 56727 504 ## COG0642 Signal transduction histidine kinase + Term 56746 - 56780 -0.2 + Prom 56731 - 56790 4.1 51 24 Op 3 36/0.000 + CDS 56817 - 57500 263 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 52 24 Op 4 . + CDS 57497 - 59299 843 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 53 25 Op 1 . - CDS 59493 - 59990 232 ## DSY1739 hypothetical protein 54 25 Op 2 . 
- CDS 60058 - 60618 272 ## Halhy_2853 hypothetical protein - Prom 60848 - 60907 6.3 + Prom 60907 - 60966 5.9 55 26 Tu 1 . + CDS 61003 - 61825 217 ## COG4335 DNA alkylation repair enzyme Predicted protein(s) >gi|157101624|gb|DS480700.1| GENE 1 237 - 944 588 235 aa, chain - ## HITS:1 COG:DR0781 KEGG:ns NR:ns ## COG: DR0781 COG0745 # Protein_GI_number: 15805807 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Deinococcus radiodurans # 6 226 5 219 222 129 37.0 6e-30 MKGYHILLGEDEPDILEYNREQLEQRGYQVTAVSTLGQAESCVLRDNPDLLVLDVMMPDG SGVDLCRRLREQFQGPILFLTSLGESSQIVQGLRAGGDDYITKPYDIEELAARIEAHLRR LERRGGEDIHQGGSRLYLNVKSQRAYLDGRDMLLKPKEYRLLAALMRNRGRYMESGELYR EIWDMAPNRDIRTVWVHISNLRKKLQDADGELIADIECKRALGYKLVMFEDENED >gi|157101624|gb|DS480700.1| GENE 2 1728 - 1922 403 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940960|ref|ZP_02088299.1| ## NR: gi|160940960|ref|ZP_02088299.1| hypothetical protein CLOBOL_05854 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05854 [Clostridium bolteae ATCC BAA-613] # 1 64 3 66 66 105 100.0 1e-21 MKTYKCEEMMCEGCVNRIKKGLDGAGIKNTVDLSVKTVTIDGDKDCEAKAVEILDDLGFT AVEV >gi|157101624|gb|DS480700.1| GENE 3 2035 - 3132 1135 365 aa, chain - ## HITS:1 COG:no KEGG:bpr_I2529 NR:ns ## KEGG: bpr_I2529 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 19 365 13 362 363 233 36.0 1e-59 MDTQKQENGTEDMRYGRWLWFTVGAVLAMMVMLGGLVVWVDPFFHYHKPLEHLAYPIDSE RYQNDGISRNFTYDAVLTGTSMMENFKASRFDSLFGVSSVKIPYAGGYYKEVDQAVRRAL SYNPQVKVVCRSLDRSFLFYQKDQQNPAAPSPDYLTDDNPFNDVNYIFNKEVIFGTIQGV FARTKAGGQTTTFDEYMHWAPERDWGRKAVLKTYEREPQKNETVPFTEEDRRTVMENLEQ NVLATARANPQVTFYYFIPPYSISWWDSELMVKGEFERQMEGYRLMARMLLECRNIRLFA FDDQFDITCNLDHYMDVIHYSEDIGDQLLEWMAAGEHMLTNDNVDRYFDRITDFYANYDY DSIYE >gi|157101624|gb|DS480700.1| GENE 4 3172 - 4662 1150 496 aa, chain - ## HITS:1 COG:HP0855 KEGG:ns NR:ns ## COG: HP0855 
COG1696 # Protein_GI_number: 15645474 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Helicobacter pylori 26695 # 3 417 16 440 527 300 40.0 6e-81 MQFNSYLFLLLFLPAVLAGYFGLCRMKRHTWARGFLAAMSLVFFAYANPWYLGLLVGSAV FNWAVSRVLAGEGNKETAKGNGKTPGRTDGKKSGRPGSRMVLVFGICANLALLFYFKYTN FFIENLNAFLGKDIAFTRLLLPVGISFFTFQQIAWLVDSFRMETGDYSFWDYFLFTVYFP KIAMGPILLHGEFIPQLKDPSRLKADSRNMAEGLMILAVGLFKKVILAEFFAGPVNWGYA QVEILSSTDAFLVMLAYTFQLYFDFSGYCDMAMGISRMFNLELPLNFNSPYKALSPVEFW KRWHMTLTRFLRTYIYFPLGGSRKGSLRTYMNIMIVFLVSGFWHGAAWTFILWGALHGAA QALNRVFKRQWENLHTAFRWMATFLFVNLTWVIFRSESISQAKLFLKRLLDFGNMQINPS FMDSFKMVELPLWLTGHRVFTVLGLYGIVLCLVMNARNMGETELRPTFLRGAGTALLLVW SVLSLAGISGFIYFQF >gi|157101624|gb|DS480700.1| GENE 5 4723 - 5028 456 101 aa, chain - ## HITS:1 COG:BH0558 KEGG:ns NR:ns ## COG: BH0558 COG1937 # Protein_GI_number: 15613121 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 14 101 12 99 100 81 46.0 3e-16 MNQEEGCTCRHKHREVTEERGLMNRLNRIEGQIRGIKSMVEDERYCVDIITQVAAVQAAL NSFNKVLLERHIKSCVVDDIRRGDSDATVEELCRTLQKMMK >gi|157101624|gb|DS480700.1| GENE 6 5118 - 7502 2680 794 aa, chain - ## HITS:1 COG:CAC3655 KEGG:ns NR:ns ## COG: CAC3655 COG2217 # Protein_GI_number: 15896888 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 2 793 72 813 818 714 50.0 0 MTTQQYHIDGMSCAACSSAVERVTRKLNGVESSNVNLTTNRMVITYDESQVTPEMICEKV SKAGFSASLIVEEEQNKKKEEEEWQQQEEQLEAVKRRVVTAICFAVPLLYISMGHMLPFT LPLPAFMQMDKNPLNFALAQLILTVPVLICGRKFYLVGIRSLLKGNPNMDSLVAIGTGSA FIYSLVMTFGIPGDHMKAHQLYYESAAVVVTLVMLGKFMESRSKGKTSEAIRKLMELAPD TAILYENGMEREVETSLVSVGQHILIKPGSRIPLDGILVQGSSSVDESMLTGESIPVEKQ PGDSVIGGSMNYNGAMEVEVTHVGSDTTLSKIIKMIEDAQGKKAPISKLADKVAGYFVPA VMGIALVAALLWWILGGKELSFVLTIFVAVLVIACPCALGLATPTAIMVGTGVGAGHGIL IKSGEALEICHKVDAVILDKTGTITEGKPKVTDVNVISGAVVEQVWKLESSSVPGAVLPA 
AGENREGSASKDSVREPQASDDEKKEHLLGIAASCEQMSEHPLGQAIVNAAREKQMDLAM PEAFESITGAGIITTWKGWKVAVGNRRLLDHLHVPVSQDTEKTASEYANTGKTPMYVVID GRLAGIVCVADTIKETSVEAVEKIKGLGVTVYMVTGDNEKTAQYIGKLAHVDQVVAEVLP EDKAQVVNRLQNEGKTVMMVGDGINDAPALVQADVGCAVGNGSDIALESGDVVLMKSDLM DVYRAVKLSKATIRNIKQNLFWAFFYNSLGIPVAAGVLYLLGGPLLSPMLGGFAMSLSSV CVVGNALRLRNLKL >gi|157101624|gb|DS480700.1| GENE 7 7748 - 9721 2074 657 aa, chain - ## HITS:1 COG:SPy1815_2 KEGG:ns NR:ns ## COG: SPy1815_2 COG1263 # Protein_GI_number: 15675645 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Streptococcus pyogenes M1 GAS # 104 462 13 365 385 248 39.0 4e-65 MEGNQYRDAAGRIVEAVGKDNILSATHCATRLRLIVKDKERIDEKKLQEIGMVKGTFFNA GQYQIILGTGIVNKVYGEMEGLGLNTLSKQEQDELVKNQQKGVKRMMRMLGDIFIPIIPV IAATGLFLGLKGCVFNDNVLGLLGLSASIIPDYVVTLVNVLTETAFAFLPALICWSAFKV FGGTPVIGLVIGLMLVSPSLPNAYAVATPGSGVEAVMAFSRIPIVGCQGSVLTAILAAFL GASMEKKLRKIMPNALDLIMTPFLVMLGTFLVVMLGIGPVMHVVELKLVTIVEMLVHLPF GIGGFLIGATYPLLVITGLHHTYTMIETSLLANTGFNPVITLCAMYGFANVGTCLAFFVK SRKQSVKSTSIGAMLSQLFGISEPVLFGIQLRYNLRPLVIMLFTSGLGAALLSVLGIQSN SYGLAVLPSYLMYIYKGNQLMWYFIISLCSVACCFFLTCLFGIPQEVLEADGEPHDPEFR ENQLTGKNAGEMPDEITGEAVSDMVSPADGHILPLDQVKDEVFAGRVLGDGFAVELTGGR ILSPADGTIEAAFDTGHAVGIRTENGMEILIHIGINTVELGGKGFRLHVSQGQKVRRGDC LVDVDLEEVRKTGKDLTTMVIFTTGESVAVKSGEAVTAGAGFEIRIGQKGTRGSELS >gi|157101624|gb|DS480700.1| GENE 8 9740 - 11512 1340 590 aa, chain - ## HITS:1 COG:BS_yugT KEGG:ns NR:ns ## COG: BS_yugT COG0366 # Protein_GI_number: 16080181 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus subtilis # 12 532 2 518 554 437 43.0 1e-122 MRAEEQLRIRPEEPWWKEEVVYQIYPRSFKDSNGDGIGDIRGILEKLDYLKALGITMLWL CPIYQSPMDDNGYDVSDYCALAPEFGTMEELDELIERAGDMGIKIILDLVINHTSDEHRW FRQAMEDPEGEEHSYYIFKKGDREPNNWRSVFGGSVWEKVEGRSEYYFHAFGKKQPDLNW ENETLRKKLYEMVNWWLEKGIAGFRIDAITFIKKDLTFADQEPDGADGRAKCTKASRNQP GIGEFLNELKRETFDRHGCVTVAEAPGVPYEGLGDFIGRNGYFSMIFDFRYADLDIASGS 
EWFKRREWTVKELKDKIMASQMAVQTYGWGANFIENHDQPRAATKYLKEAAENPSAVKML AAMYFFLRGTPFLYQGQELGMVNFERSSLDAFNDISSIDQYHRSLEEGFSEEEALQIINL RSRDNARTPFPWTGGRYGGFSETAPWLPMSREYPENSAKTQEGDRDSILEFYRSMIRFRQ WGPWRDCLVYGAIRPLACSEHVIAYERRTEDTVIWCYFNFSGTGTVERLPGISAGCIWAN DREAMDGVMFNGKTADKKTAGKETAGNRAGIEDGTLYLEPYQALLLKGSC >gi|157101624|gb|DS480700.1| GENE 9 11629 - 12483 641 284 aa, chain - ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 277 1 250 256 88 27.0 2e-17 MVKAIFLDIDGTLRDECSGVPQSAYTAVRMCRERGIRILICTGRNLASIQEEILSMDTDG IIAGGGCLIAENGYVQKESYFQKPETERIQEYLLTRELPFAMESQERIFMNRAAAVLLRQ DFIGKLKGLGAEEIRKREKENSIPYEDTLKEYMHSPDHIHKFCLWCRPEDWEAVCATVPG WGAVVQQSAPTDGKIRKWGYLELQPAGCTKGQAIRWWCRRNGIRLEDTMSFGDGKNDVDM IRTTGIGVAMEDGDEELKQWADSLCRPAARDGIYRELVRRGIIE >gi|157101624|gb|DS480700.1| GENE 10 12638 - 13408 756 256 aa, chain - ## HITS:1 COG:CAC0531 KEGG:ns NR:ns ## COG: CAC0531 COG1737 # Protein_GI_number: 15893821 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 249 1 247 257 150 36.0 3e-36 MRLEEAFNKNYNRFTENEQYICRYLLEHKKECAVMSITEFAGSCHVSESMLVRFAKKNGL AGYGELKARLRLEEQETEEALGPGEELLNTVTDSYHKMMDELIGRGLDSLFEKLYRARRV FLYGSGSAQTRAASEMKRIFLPVKEMFHIHGHDMRGALSRVMTERDMAVIISLTGESEAT VSLARELRLRQVPTLSITRLGNNTLASVCDENLYIHSITLPARYGVEYEISTPYFILIEY LYLAYQEFLKRRLDSL >gi|157101624|gb|DS480700.1| GENE 11 13383 - 14156 854 257 aa, chain - ## HITS:1 COG:BS_yfiA KEGG:ns NR:ns ## COG: BS_yfiA COG1737 # Protein_GI_number: 16077886 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 1 246 1 246 254 131 31.0 1e-30 MNLEYRINESYDRLTANDREVLNRILRNKAQSAGMNSTELAAFCHVSRTTLIRLFKKLGI GGFAEFRLLLKERAGMQELARLDMEEIISNYHRMVEGLREYDYSVILEAIHEAGTIYLYG TGNEQKAIGEEFKRIFLMFGKCCIDLFDYGEVEYASRYFKSGDLFVAISLSGESENGIQV 
LRFVQCREIKTLSITRWTNNTMARMCRYNLYAGTRMISGGGEPGYEMVATFYILLDMLSV NYLEYERKGHETGRSVQ >gi|157101624|gb|DS480700.1| GENE 12 14317 - 14832 263 171 aa, chain - ## HITS:1 COG:no KEGG:Clocel_4317 NR:ns ## KEGG: Clocel_4317 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulovorans # Pathway: not_defined # 1 171 12 182 182 227 64.0 1e-58 MYEAVRAKSNADVAGEVVYGCEKAAHTENNSDWVKSVMRRLENKFESEDVKEIRMDCQCG YGMNEKLVLVKELMAGAASIEEFTNSEKAKAAGLFCKGGELFLQFYFCPCPMLAEVERLE TAAWCQCTTGYSKVLFEQAFACKVDVELLKSIKMGDSVCLMKITPHNPVWQ >gi|157101624|gb|DS480700.1| GENE 13 14964 - 15824 379 286 aa, chain - ## HITS:1 COG:STM4586 KEGG:ns NR:ns ## COG: STM4586 COG2207 # Protein_GI_number: 16767827 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella typhimurium LT2 # 10 115 1 106 289 72 34.0 7e-13 MLLSIREWYMDRKKLINQSIDYIMKHLDENLSLDTVAAHFFISKYHFSRIFKEETGESVY AFIKRCKIDQSAIDMKLNPAKAITDIGLDYGYSASNYSSVFRQHHDTPPSMFRQSIPTRS MSLPFSSERIVHFKTAEEYTAQIEVQKQEDLFVLYERFIGTYADLEKNWYQFLDKYKDFL NEKTLLVERFFNDPAITSPLQCICDICMTVEPDCGLSNVMRISGGQWIVYHFDGKIKDIF ETLEGIFSVWFPQSGYKMTRRYGLNIYRRVDRDNHSVIMDLYIPIS >gi|157101624|gb|DS480700.1| GENE 14 15879 - 16787 77 302 aa, chain - ## HITS:1 COG:CAC3494 KEGG:ns NR:ns ## COG: CAC3494 COG2378 # Protein_GI_number: 15896731 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 1 299 1 296 300 225 39.0 1e-58 MHEGRHFRILYILLNNERVTAPELAEEFGVSVRTIYRDIDTMSAAGIPIYMQGGRKGGIY LLEGYRLDKVTLSKTEQAEILLALQGLGAVQYPDIDPVLLKLGALFMSQDRDWIDVDLTR WGTAETVHRDKELFSILRKAITGLRRIHFEYYSAKGEMTVRTVEPAKLIYRCRAWYLAGY CLMRQEPRVFRLSRMKKTGLTEELFEYSPEKEAPYFTLDGNYGPLLKVRLMFNKSVAYRL YDAFEDDTMIRQDDNIIVETQLPDSEWLYSFLLSFGGNVTILEPENLRQAVKERILQALS II >gi|157101624|gb|DS480700.1| GENE 15 16780 - 17259 215 159 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c27480 NR:ns ## KEGG: CLJU_c27480 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 7 153 2 147 148 
125 44.0 4e-28 MAEKIRYEKKKTRLTEKYQQPPESLIRIFMGIEAWKRLMRFEEMLRERYDVSREIRFPFG NEYGWSFRYSHKKSLLLYVFFEEGGFCCTISINDKGAQEVDSIFGELLPEIQASWLNRYA CGADGGWLNRSVISDEELPDLIRLVGVKVKPKKIGGKNA >gi|157101624|gb|DS480700.1| GENE 16 17517 - 18005 534 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940975|ref|ZP_02088314.1| ## NR: gi|160940975|ref|ZP_02088314.1| hypothetical protein CLOBOL_05869 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05869 [Clostridium bolteae ATCC BAA-613] # 1 162 3 164 164 288 100.0 9e-77 MWLNQLFIENPIDFEKRCRNRIVYGICFILLGAAAIGLSFAVGNHAMVMYLEQGYKDFMP GFYGGTGFGLAAAGVITIMRNLKYLRNPELKEKRRIYETDERNRMLGLRCWAYTGYTMMI SLYIGILISGFISMTVTKTLISVAVFFAVLLLVFRRLLQKVM >gi|157101624|gb|DS480700.1| GENE 17 18008 - 18208 233 66 aa, chain - ## HITS:1 COG:PAB7155 KEGG:ns NR:ns ## COG: PAB7155 COG1476 # Protein_GI_number: 14520844 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Pyrococcus abyssi # 1 65 1 65 73 77 66.0 4e-15 MKTRIQELRKQNKVTQEELALALGVTRQTIISLENGKYNASLQLAFKISRFFGKSIEDIF LFEEEE >gi|157101624|gb|DS480700.1| GENE 18 18604 - 19977 1463 457 aa, chain - ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 1 450 7 452 459 464 57.0 1e-130 MEREKFSSRLGFILISAGCAIGLGNVWRFPYIVGEYGGAAFVLVYLVFLLILGLPIVVME FAVGRASRKSAALSFDLLEPKGSRWHLTKYIAMAGNYILMMFYTTVGGWMMLYFFKTLKG DFGGMDAEAVAGEFVSMLANPALMGGFMVIVVLLCFEVCAMGLQKGVERITKVMMVCLLV LMVVLAVHSMVLPGGGPGLEFYLKPDFGKMVESGIGEAVFAALGQSFFTLSIGIGALAIF GSYIGKERTLTGEAVSVTLLDTLVAFMAGLIIFPACFAYNIEAKSGPSLIFITLPNVFNH MAGGRIWGTLFFLFMSFAAFSTIIAVFQNIISFATDLTGCTLKKAVLCNIAAIILLSLPC VLGFNLWSSFAPLGEGSTVLDLEDFILSNNLLPIGSMLYLLFCTSRYGWGFKKFMAEANE GEGIHFPAWTRVYVSYILPLIVLIIFIQGYASKFLGN >gi|157101624|gb|DS480700.1| GENE 19 20104 - 21057 1037 317 aa, chain - ## HITS:1 COG:SP0848 KEGG:ns NR:ns ## COG: SP0848 COG1079 # 
Protein_GI_number: 15900735 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 1 316 4 318 318 227 41.0 3e-59 MSVIYFIFQQTMLFTIPLMIVALGGMFSERSGVVNIALEGIMTMGAFTGILFLNMTGGRM SGQMQLLLAILISTATGAVFAFFHAYASINMKANQTISGTALNMFAPAFAIFVARVIQGV QQIQFNNTFRIESVPLLGSIPFFGPLLFQNTYITTYLGVAILILSTLVLYRTRFGLRLRS CGEHPQAADAAGINVCRMQYAGVLISGVLGGLGGLVFVVPTSTNFNADVAGYGFLALAVL IFGQWKPVKIMWASLFFGLMKAVAAAYSGIPFLAATGIPSYVYKMIPYVATLIVLVFTSR NSQAPKASGVPYDKGQR >gi|157101624|gb|DS480700.1| GENE 20 21054 - 22154 1298 366 aa, chain - ## HITS:1 COG:lin1427 KEGG:ns NR:ns ## COG: lin1427 COG4603 # Protein_GI_number: 16800495 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Listeria innocua # 20 354 12 342 350 198 38.0 1e-50 MSTNNRTGRKRDYVGVLSSVFAIVIGLLVGFVILLICNPSQALPGFVTILTGAFTHGLKG VGQVFYYATPIILTGLSVGFAFKTGLFNIGTPGQFIVGAFAAVYVGIMWTGLGRIQWAVA LLAAILGGALWGMVPGFLKACFNVNEVIASIMMNYIGMYMVNWIVKSYKPLFNNLRNESR NVAATAQIPKMGMDKLFPESSVGGGILLAIAAVLIIWVILNKTTFGYELKAVGFNRDASR YAGINEKRNIIMSMVIAGAIAGMAGGLLYLAGTGKHIEIKDVLASEGFTGISVALLGLSH PIGVLFSGLFIAYLTAGGFYLQLFEFSTEIIDIIVAVIIYFSAFALIVKTILTRMNHKRE KGGGRT >gi|157101624|gb|DS480700.1| GENE 21 22151 - 23707 187 518 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 276 493 16 226 245 76 25 3e-13 MDDYIIEMLGITKEFPGIIANDNITLQLRKGEIHALLGENGAGKSTLMSVLFGLYQPEKG IIKVRGREVKIKGPLDANALGIGMVHQHFKLVHNFTVLQNIVLGMETVSHGFLKMDNARK KVVELSERYGLYIDPDALIQDITVGMQQRVEILKMLYRDNEIMIFDEPTAVLTPQEIDEL MEIMRSLVAEGKSILFITHKLDEIKAVADRCSVLRRGKYIGTVDVASITKEEMSEMMVGR KVSLSVEKTPAAPGRTVLEAEHVSVKSHMEGHTKLVVNDVSFQVRSGEIVCIAGIDGNGQ TELVHAVTGLGETAGGRILINGRDVTKKSIRARNTGGLSHIPEDRHKHGLVLDYNLAYNL VLQQYFEPGFQRWGFLKDEEIYNYADKLIEDFDIRSGEGADTITRSMSGGNQQKAIVARE ISRDHDILLAVQPTRGLDVGAIEYIHRELVKQRDAGTAILLISLELDEVMNLSDRILVMF 
EGTIVADLNPQEVTVQELGLYMAGSKRQEPEGKAGERA >gi|157101624|gb|DS480700.1| GENE 22 23884 - 25086 1441 400 aa, chain - ## HITS:1 COG:TM0102 KEGG:ns NR:ns ## COG: TM0102 COG1744 # Protein_GI_number: 15642877 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Thermotoga maritima # 70 351 18 315 359 131 33.0 2e-30 MKKRILALAMAAVMAGSLTACSGSSGTDTKAAPAETEAPASESKSEAKEESKAGEGDAAN TASSKGGSEGYELALVTDLGTIDDKSFNQGAWEGLKKYAEENSISYKYYQPQEGTTDSYL ETIGLAIEGGAKLVVCPGFLFEEPIYLAQDKYPDVHFILLDGEPHDADYNYKTNDNVMPI LFQEDQAGFLAGYAVVKDGYTKLGFMGGMAVPAVIRFGYGFVQGAEFAAEEDGITGLEIM YNYTGAFSATPEAQSMAASWYQNGTEVIFGCGGAVGNSVMAAAQEKNAKVIGVDVDQSAE SETVITSAMKLLSNSVYDGVKDFYDGSFPGGKTSVFTVANNGVGLPMDTSKFEKFSQEQY DAVYKKLADGEIGLVQFSADNNDPTKDLTDLTNTKVTFVE >gi|157101624|gb|DS480700.1| GENE 23 25268 - 29965 4933 1565 aa, chain - ## HITS:1 COG:CAC3442 KEGG:ns NR:ns ## COG: CAC3442 COG2176 # Protein_GI_number: 15896683 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Clostridium acetobutylicum # 321 1564 204 1450 1452 1369 54.0 0 MEKNFLEVFPTLNIAESLKELLGLAQVEKVTTTRDRSSIRIYLKSARLIHKQNIYDLERG IRDQLFPDKQVAIKIQEKYRLSDQYTPKKLLDVYKDSLLLELKNYSIIEYTMFRKAGITF EKEDIMVLTVEDTMVNRDRTAELKRVIEKVFAERCGLPVEVRFAFVEPQGSDRRKQIELK MEREAQAIYWQNHREELEAMGASFSTQGLEGQELLSGNGGRSGQDSYLDAGQGPEGAPWD ALMAQAASGDVGDGSAPGDNGGRSGAPGSSQSQGSPARTGQSGQGQSGQGGGQKQGFKGD KPQLNKGRKGWGRDKDSGFGDDKKYSVKRSNNPDVLYGRDFEDEPMEIEKIDGEIGEVTL RGKVLTCENRELRSGKFILTFDVSDFTDTITVKMFIRPEIFDEVKSAINPGMFIKVKGVT TIDKFDGELTLGSIVGIKKADDFTSKRMDSSLEKRVELHCHTKMSDMDGVSEVKSIIKRA KQWGMPAVAVTDHGCVQAFPDANHALDKGDTFKILYGVEGYLVDDTKQLVENSRGQDFGD TFVVFDIETTGFSPKKNKIIEIGAVKVVDGRITDKFSTFVNPDVPIPFDIEKLTGINDSM VLPYPKIDVILPQFLEFVGDAVLVAHNASFDVGFIGHFAEKLELPFDPTVLDTVAIARLL LPNLNRFKLDTVAKALNISLQNHHRAVDDAGATAEIFAAFVKMLRDRDVNDLNQLNALST MDTDTIRKLPTHHVIILAKNDIGRVNMYRLVSWSHLEYYARRPRIPKSLLEKYREGLIIG 
SACEAGELYQALLRGAPDTEIARLVSFYDYLEIQPLGNNAFMLRDEKSTVKSQEDLIEIN KKIVELGMQFNKLVCATCDVHFLDPQDEVYRRIIMAGQGFKDSDDQAPLYLRTTEEMLEE FSYLGPDKAHEVVITNTRKIADMCEKIAPVRPDKCPPVIEDSDKTLTKICYDRAHEMYGE DLPKIVVDRLERELHSIITNGFAVMYIIAQKLVWKSNEDGYLVGSRGSVGSSFVATMSGI TEVNPLSPHYHCPKCRYYDFDSEEVKQFSGMAGCDMPDKECPVCGTPLEKDGFDIPFETF LGFKGDKEPDIDLNFSGEYQSKAHDYTEVIFGKGQTFRAGTIGTLADKTAFGYVKKYYEE RGTRKRNCEINRIVQGCVGVRRTTGQHPGGIIVLPLGEEIYSFTPVQHPANDMTTSTVTT HFDYHSIDHNLLKLDILGHDDPTMIRMLQDLIGIDPVKDIPLDSREVMSLFQDTSALGIM PDDIGGCPLGALGVPEFGTDFAMQMLLDTKPKYFSDLVRIAGLAHGTDVWLGNAQTLIQE GKATIQTAICTRDDIMVYLIAQGLEQGLSFTIMESVRKGKGLKPEWEEEMKAHGVPEWYI WSCKKIKYMFPKAHAAAYVMMAWRIAYCKIFYPLAYYAAFFSIRASAFSYEIMCQGRDKL EYYLADYKKRADTLSKKEQDTLRDMRIVQEMYARGFDFTPIDIYRAKARHFQIIDGKLMP SLSSIDGLGEKAADAVVDAVKDGAFLSIEDFSNRSKVNGTLCSLMSDLGLLKDLPKSNQL SLFDL >gi|157101624|gb|DS480700.1| GENE 24 29981 - 31051 1481 356 aa, chain - ## HITS:1 COG:BH1401 KEGG:ns NR:ns ## COG: BH1401 COG0821 # Protein_GI_number: 15613964 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Bacillus halodurans # 3 356 7 359 367 396 56.0 1e-110 MTRKDTKVIQIGDRVIGGGNPVLIQSMCNTKTEDVHATVEQIQRLEQAGCDIIRVAVPTM EAASALGDIKKEIHIPLVADIHFDYRLAIAAMENGADKIRINPGNIGDRHKVQAVVDRAR EYGVPIRVGVNSGSLEKPLLEKYGGVTAEGIVESALDKVKLIEDMGYDNLVISIKSSDVL MCVKAHELIAMRTHYPLHVGITESGTLMSGNIKSSVGLGIILHEGIGDTIRVSLTGDPVE EIKSAKLILRTLGLRKGGIEVVSCPTCGRTRIDLIGLANQVENMVTEFDDLDVKVAVMGC VVNGPGEAREADIGIAGGVGEGLLIRRGEIIRKVPESELLAVLREELLAMSREHRA >gi|157101624|gb|DS480700.1| GENE 25 31087 - 32136 1232 349 aa, chain - ## HITS:1 COG:CAC1796 KEGG:ns NR:ns ## COG: CAC1796 COG0750 # Protein_GI_number: 15895072 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Clostridium acetobutylicum # 2 345 5 333 339 202 35.0 6e-52 MSLIIAMLMLGIIIMIHEFGHFLFAKLNGIGVIEFSLGMGPRLFSFEKGGTRYSFKILPF GGSCMMLGEDEGITDESAFNNKSVWARISVVAAGPVFNFILAFGLSMVLIGITGYDTTRL 
AGVVDGYPAQAAGMEAGDVIKSINGRKVHSYRDINWYLFTHPQKSLKVTWERTEEGGGTE RFSTELEPVFSAENNQYMMGVQFNPVPSTVENIGQLLVHSAYEVQYWIHYVFDTFYMMFH GMVSVNDISGPVGIVNAIDTTVDETAPYGLSAVVLMLINFTILLSANLGVMNLLPIPALD GGRLVFLIIEAVRGKPIDKEKEGMVHMAGMMVLLALMVLILFNDVRKLF >gi|157101624|gb|DS480700.1| GENE 26 32222 - 33370 1440 382 aa, chain - ## HITS:1 COG:alr4351 KEGG:ns NR:ns ## COG: alr4351 COG0743 # Protein_GI_number: 17231843 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Nostoc sp. PCC 7120 # 1 371 1 376 399 384 52.0 1e-106 MMKKIAILGSTGSIGTQTLEVVRYNKDIQVTALAAGSNVKLLEEQIREFKPRLACLWDEE KAKELRAAVADMDVKVASGMEGLMEAAVTPEADLVVTAVVGMIGIRPTIAAMEAGKDIAL ANKETLVTAGHIIMPLAREKGVKLLPVDSEHSAIFQCLQGAAGNPLHKILLTASGGPFRG FTREQLEKVRLEDALKHPNWSMGHKITIDSSTMVNKGLEVMEARWLFGAEMDQVQVVVQP ASIIHSMVEFEDGAVIAQLGTPDMKLPIQYALYYPERRFLPGDRLDFEKLTKISFEVPDM ETFRGLKLAYEAGRRGGTLPTVFNAANEMAVAMFLERKINYLAIVDMIEGAMEHHCVKPS PDVAQILEAEQETYDYLSSKWN >gi|157101624|gb|DS480700.1| GENE 27 33417 - 34217 945 266 aa, chain - ## HITS:1 COG:CAC1792 KEGG:ns NR:ns ## COG: CAC1792 COG0575 # Protein_GI_number: 15895068 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Clostridium acetobutylicum # 5 241 6 240 245 127 35.0 3e-29 MFTTRLISGIILVLLSIVIVGQGGILLYAVTALISIIGLFELYRALGMQRRSLAVVGYVT ACSYYGLLWFEGQRYVTLMIIACLMAMMAFYVLTFPEYKTEEVTGAFFGVCYVPIMLSFL YQTRDMSDGAYLVWLIFLSSWGCDTFAYCTGMLLGRHKLAPVLSPKKSIEGAVGGVAGAA LLGFIYASLFGASMAELDNPQAACTIACAIAAVISQIGDLAASAIKRNHNIKDYGHLIPG HGGILDRFDSMIFTAPAIYFALTFLK >gi|157101624|gb|DS480700.1| GENE 28 34239 - 34958 922 239 aa, chain - ## HITS:1 COG:AGc2550 KEGG:ns NR:ns ## COG: AGc2550 COG0020 # Protein_GI_number: 15888704 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 9 238 9 238 247 244 53.0 1e-64 MALNKDMIIPEHVALILDGNGRWAKKRGLPRTLGHKEGCVTVEKTVEIAARMGIRYLTVY 
GFSTENWKRSAEEVGALMQLFRYYMVRLLKIASANNVRVKMIGDRSRFDRDIIDGINRLE SETQDNTGLTFVIAVNYGGRDEIRRAAAGLARDCALGKADPDQVDEALFASYLDTAGMPD PDLLIRTSGELRLSNYLLWQLAYTEIVVTDCLWPDFNQEELEKAIVQYNKRERRFGAVK >gi|157101624|gb|DS480700.1| GENE 29 35089 - 35640 735 183 aa, chain - ## HITS:1 COG:CAC1790 KEGG:ns NR:ns ## COG: CAC1790 COG0233 # Protein_GI_number: 15895066 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Clostridium acetobutylicum # 4 183 6 185 185 169 53.0 3e-42 MEELKVYEDKMNKTLDVLLSDYGTIRAGRANPHVLDKIKVDYYGTPTPLQQVGNVSVPEA RMIVIQPWEKSLLKPIEKAILTSELGINPTNDGNCIRLVFPEMTEERRKEVSKDVKKKGD NAKVALRNIRRDANEAFKKAEKAGDISEDDLTEMTEKMQKLTDKMVEKVDKAVEEKTKEI MTV >gi|157101624|gb|DS480700.1| GENE 30 35693 - 36394 874 233 aa, chain - ## HITS:1 COG:CAC1789 KEGG:ns NR:ns ## COG: CAC1789 COG0528 # Protein_GI_number: 15895065 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Clostridium acetobutylicum # 1 232 4 235 236 214 47.0 1e-55 MKMKRVFLKLSGEALAGPKKTGFDEDTVKEVARQVKLSVDAGIQVGIVIGGGNFWRGRTS EAIDRTKADQIGMLATVMNCIYVSEIFRNAGMMTQILTPFECGSMTKLFSKDRANKYFEK GMVVFFAGGTGHPYFSTDTGIALRAIEMEADCILLAKAIDGVYDSDPKLNPEAKKYDTIS IQEVIDRQLAVVDLTASIMCMEHKMPMAVFSLNEKDGIANAMQGRINGTIVTA >gi|157101624|gb|DS480700.1| GENE 31 36881 - 37246 456 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940995|ref|ZP_02088334.1| ## NR: gi|160940995|ref|ZP_02088334.1| hypothetical protein CLOBOL_05889 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05889 [Clostridium bolteae ATCC BAA-613] # 1 121 6 126 126 187 100.0 3e-46 MELVKIFIAGGVFCMFGEVLWAKTGLGVVKTLMVCIAAGVALDVAGILGPVIGFGQEGFT VTIYGAGASIFDGVMGSVRDHGLSGIIRLTNFYFLRFSGLIVCTMAMAAVLGFLFPGTRG K >gi|157101624|gb|DS480700.1| GENE 32 37282 - 37671 347 129 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160940996|ref|ZP_02088335.1| ## NR: gi|160940996|ref|ZP_02088335.1| hypothetical protein CLOBOL_05890 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05890 
[Clostridium bolteae ATCC BAA-613] # 1 129 1 129 129 203 100.0 3e-51 MKINRYAGAFLTGGAICFLCHFLTMAYRMAGVPELFAVSVTVFTLAAIGSVTTITGHYQK LTDFGGMGAMITMGGFSSAITDAVHGARKEGMGAGRALWRGVKDGLFILFAGYLAAVVSV VVMYAAGRI >gi|157101624|gb|DS480700.1| GENE 33 37801 - 38475 806 224 aa, chain + ## HITS:1 COG:CAC0379 KEGG:ns NR:ns ## COG: CAC0379 COG1802 # Protein_GI_number: 15893670 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 11 216 11 216 216 94 29.0 1e-19 MEDNTQAVQTLSKKDEAYVRIKDRIISGEYKPGSLLVERTLSDSLDMSRTPIRAALMDLC HTGLATFVPGKGIYVSVIGRRDIEEIYEIRALLDPAVLLSFMKQATRDQIAMLRTFADNM RDAVRMENTGMLIENDVNFHQYYTDYCGNARLADIICSTVDPFHRFRYITSSSMKMNRVF VEEHLLIMKAIEDNDQAEAELQMKAHYSHLKRYLIQKLTSEGQL >gi|157101624|gb|DS480700.1| GENE 34 38453 - 38860 262 135 aa, chain - ## HITS:1 COG:AF0185 KEGG:ns NR:ns ## COG: AF0185 COG0822 # Protein_GI_number: 11497802 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Archaeoglobus fulgidus # 17 127 17 123 153 65 34.0 3e-11 MNKEYITETALKGEYRGKPQTCMGTGACQDPSCRDVLTYYFSADRELITSISYTITETSC FPSKACAETAAALVRGRPVLEAYTIDSSAIATALGGLEPEYMHCAMMAELALKRAVLDYA GKRKEGQPVKAAPPM >gi|157101624|gb|DS480700.1| GENE 35 38853 - 41240 2390 795 aa, chain - ## HITS:1 COG:TM1217_2 KEGG:ns NR:ns ## COG: TM1217_2 COG0493 # Protein_GI_number: 15643973 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 166 557 60 461 472 268 40.0 3e-71 MKNFEYIKPSTVTLASELLKGEHTAACAGETDLIDLMRNRILPEYPKRVVSLKGIEGLDY IREEHGSLLIGAMTTLNTIARSALVKEKNPMIAQAAESVASPNLRNTATIGGNLCQDVRC WYYRYPDSLGGRVNCARKEGHLCSAMMGENRYHSIFGAAKVCMTPCTQGCPAHTDISAYM EKLREGDVDEAARIILRANPMPAITSRVCAHFCQEKCNREQYDERVNVGAVERYVGDYIL EHHERFMKAPKQENGKRAAIVGSGPAGLAAAYYLRESGYGVTVYERMKEAGGCLAYAIPA YRLPKEIVQKFVSVLEGMGVCFVFNTKVGEDIELETIRSENDSVLLDTGAWKRPLIGISG 
EELTRFGLDFLVEVNSYIHERPGSEVVVVGGGNVAVDVAVTAKRLGVPKVTMICLESREQ MPAGREEIERVLEEGIEIKNGWGPMEILKESVSPEEDRITGVRFKACVSTLDGEGRFAPV YDEARQMEEKADVVFMAVGQRSDLSFLDSVCRVETDRGRIKASEDHSTSQEGIFAAGDVT TGPATVVRALAGGREAAVMMNRHMEGALLSVETGECRQGLAGFLPECGNISCSNEPDLLP AGERGVNREDRKGYSYNMLNSEAGRCMNCGCLAVNPSDMATALLASDAVIRTNLRTLGAQ ELLTSHTRISDTLLKGELILEIQIPWDAAGTISGYEKFRLRKSIDFAVVALAYRYVLDDG VITDARMMLGAVAPVPMRRKKAEAYLIGRKPSGETALGAAKLALEGALPLSGNAYKIDVA ETLVRRSLDWSGKDE >gi|157101624|gb|DS480700.1| GENE 36 41240 - 43540 2472 766 aa, chain - ## HITS:1 COG:SMb20132 KEGG:ns NR:ns ## COG: SMb20132 COG1529 # Protein_GI_number: 16263880 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Sinorhizobium meliloti # 2 764 13 768 772 319 30.0 1e-86 MNQDYKVVGREIKRIIADDIVTGKAVFTDDLRVTGMVYGKVLRSPYPHARITRIDVDRAK ACDGVHAVVTYKDIPDNIYITNQMTPAPHYRPLNETVRFVGDPVALVVADTEDIALDAME LIEVEYEELPPVFTIDEALAPDAPQLYDVFPGNIAPPIFPGLDDLDFKVGDPEAGFARAD IIVEMDGEIKSGQNAMPAEAPVCIAQWNDDMLTIYGSIAAPSNCQTYVAASLDIPYERVR IVAPCVGGSFGSKLFVGNVVPPVLTAIMAKASGRPVMFSYTKEEHFACHQTRMCTKSHVR LGVNREGLATAIEMTQYADAGVCASTQENMLSVGTASLPLLCKTDNKRYKGTVVVTNKVP SGSFRGYGYMESSALVNRAIMRACIGLGLDPVVYYEKNVLRHGERYYNALNQKEPWQYNA SPDWKDTIHAAAEGFRWKDRFKGWGIPSRADGSRRYGVGMGLSGHSDIGGMASNTNVTMT SAGGVMIQTVMTEFGSGVRDVYRKIVAEELCIPVDRVRVSISDTSAAPLDFGSIASRSTY SGGISAQRAARDLKKNLFQLAEERLGIPASDWDFKDGMLKRLSNPEEVHDLHEILIYPDS LSGTGHWPGIDNATIMHVQFVEVAVDTETGLIEITDHFGGSDAGTIMNPRAAYNQMTSFF AGLDVAIREETVWDKWDNKVLNPNLIEYKARTFNEAPPHDHVCLESTKGRESDFPFGAMG IGEPIISPSGPAITMAVYNACGIELEEYPYLPAKVLAALKEKEDRA >gi|157101624|gb|DS480700.1| GENE 37 43537 - 44052 549 171 aa, chain - ## HITS:1 COG:SSO2433 KEGG:ns NR:ns ## COG: SSO2433 COG2080 # Protein_GI_number: 15899181 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sulfolobus solfataricus # 9 170 15 166 171 135 45.0 5e-32 
MKEKLQQFITLTVNRREYEIPVGNNRGDMPQSETLAHTLRDRLHLMGLKLGCGQGACGCC TVIMDGRAVTSCMVLTMDCDGSSITTIEGLADPETGDLTGLQQAFVDNCGFQCGFCTSGI IMTARALLDEKPCPTEEEVRDALAGNYCRCGTHYTAVESIMSYVEKRRSEE >gi|157101624|gb|DS480700.1| GENE 38 44513 - 45892 1581 459 aa, chain + ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 40 440 359 763 776 161 28.0 2e-39 MGLFQHKKEKGTAELPPTPPVLTPTEPDAASEGRAACMNRLESVLGAPSSKGVVLKLYLE NFKSLNKTFGYDYCEELLSQIKSYLKEVAEGNIYRYIGVEFIIILEQLSEGKACDLADEI LERFGNVWKIRGVDCLCSAQIGLCSYPGHATSTDELLKRLDLAVSAASDCGPNQMAVYDS NMHTQFVRRQAVAMYLQTALEKNELEVRYRPTYNLKEGRFTRADSYMRIFIQGIGLVGAA EFMPIAEDSGQIRAIGYYALDHAGRCISELMNTGREFDSICIPVSPVLLVQEDFLEQVTK VLETYQIPKGKLALEVDDSALSNVYLNVNITMQELSDMGVELVLNNFGSGCAPVSSILDL PINTVKLERMFVWQLETNPKSASIAGGLIQIARELGIHIIAEGVETENQLNALNNYQCEY QQGFYYAPTMEQDVLIKVMGTTLDESRVTIEEEKLKMKM >gi|157101624|gb|DS480700.1| GENE 39 45991 - 46065 84 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKDGHSELTIHSGINESIYAQWKP >gi|157101624|gb|DS480700.1| GENE 40 46090 - 47061 1071 323 aa, chain - ## HITS:1 COG:CAC1480 KEGG:ns NR:ns ## COG: CAC1480 COG0673 # Protein_GI_number: 15894759 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Clostridium acetobutylicum # 1 315 4 320 320 211 36.0 2e-54 MKVGIIGAGSIAGTMAGTIREMDCAQNYAVAARDYGRAAEFAQTYGFEKAYGSYEELVED AGVELVYIATPHSHHYEHIKLCLNHGKHVLCEKSFTVNESQAREVLALAREKKLLLTEAI WTRYMPMRKTLDSVLSSGVIGRPYMLTANLGYIISGKERIMRPELAGGALLDVGIYPLNF ASMVFGDEVDSITGTAVMTETGVDAQNSITITYKDGKMAALCSSAMGLSDRRGIIYGDNG FIEVENINNCEGIRVYDRSRNMIASYDTPKQISGYEYEVEACLRAIKEGALECPEMPHEE TLRMMGWMDELRRQWGLVYPMER >gi|157101624|gb|DS480700.1| GENE 41 47096 - 47935 877 279 aa, chain - ## HITS:1 COG:MA1724 KEGG:ns NR:ns ## COG: MA1724 COG0668 # Protein_GI_number: 20090576 # Func_class: M 
Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Methanosarcina acetivorans str.C2A # 32 262 77 336 379 121 30.0 1e-27 MQILLIKIAVIVVFWSGGMFALNKIFGMILKRKEQIHIKFLRSMSKVVLTIIAFICISGL FNTTKALSATLLTSSSLLVAIVGFAAQQVLADVISGVMLSWSRPFNLGEKVNISSLGISG IVEDMTVRHTVIRTYHNSRMIIPNSVINKAIVENSNYNNDYIGNYMEVSVSYESNLEQAI EVMRETIASYPLVVDIRPDPSEGNKVNVAVKELGDDGIILKSTVWTKNIDDNFTACSDIR RLIKKNFDAVGISIPYRHVHVVTGSSEKELEQNGITEKQ >gi|157101624|gb|DS480700.1| GENE 42 48183 - 48434 235 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941007|ref|ZP_02088346.1| ## NR: gi|160941007|ref|ZP_02088346.1| hypothetical protein CLOBOL_05901 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05901 [Clostridium bolteae ATCC BAA-613] # 1 83 1 83 83 142 100.0 1e-32 MTGGIIEIDVHGKNTEEAKKAINMQIHSAGKSIYRIRVIHGYNGGTRIRSMLREEYGYGR EPAVKRIEMGDNQGITELVIREF >gi|157101624|gb|DS480700.1| GENE 43 48492 - 50096 1085 534 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1257 NR:ns ## KEGG: Ethha_1257 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 15 417 10 407 512 137 28.0 1e-30 MLWTKVEKERLKRAYHARMSQAISGLEAMDISGLSQVYCAVATEDRELIRSGGRAIGMVM EGMTMRQVIRLSEHFRQYTSMEWDIDWKKVDIRQKKDWFRSDRDYFWILALGSFHPNGYY RQACLEEMAGYPGALPFLVLRLNDWVGEVRLAAARAAAKRLETCPLDELFAAMMALDKVK RSGRKDGRTVEHIGTVMSERLDQEAGSLSVPSVLSMDYEVRKSIYRFLFSGRRLDQETAE LFLNQEKHSYCQSVIWTGLLTYYPCSIEMADCYLLHRSSFVRRKAMEYKYHVLKCAWPGV EDLLLDVNCGIRDLAAYILKKHSQLDILAFYEKHLKEPVPVPAILGIGEQGRVLDDRHGL ADLVLPYLEHESEKVVRGALEAVGMLMGAEGEEIYWKYLLDTRPCISKAAYQCIRKNGIH PGAALLYGEQEKWRKSDETAGKSPDSVCHIRRYLILLLIQENSWDRLPYLIFLLKDESLK EYRDKLLGALQIRSPYARVDRKQAAFIKQVLAKEAEAVPEELARAIVFDLKFMA >gi|157101624|gb|DS480700.1| GENE 44 50149 - 52035 1768 628 aa, chain - ## HITS:1 COG:BS_ydiF KEGG:ns NR:ns ## COG: BS_ydiF COG0488 # Protein_GI_number: 16077662 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: 
Bacillus subtilis # 1 544 2 584 642 363 36.0 1e-100 MLYQISNGTVSVGGELILSHIDFEIRGNEKIAVVGKNGAGKTTLLNLVAGKLSLDRDDRR EGAGIRCSRRLTVGMLSQQAFKDGNRTVEEELLEACPCRDTFERERFEYEREYDTLFTGF GFPKEDKKKRISQFSGGEQTKIALIRLLLEKPDILLLDEPTNHLDMETAGWLEQYMKHYK NAVVMVSHDRFFLDQTADVVYELTGKGLRRYPGNYTHYREEKLKQIRLQKKAYERQQEEL ARLENLVERFKHKPNKAAFARAKRKTMEHMERVEKPQEDDAHIFTGEINPLLPGSKWVFT SEHLKIGYEKPLLEITMRIRRGQKIALLGPNGAGKTTFLKTIAGFLPSLEGSYSLGNQTT IGYFDQHSGAIQSEQTVLEHFTAQFPALTEKEARSILGAYLFGGRDAQKKVSSLSGGEKA RLVLAELLQSRPNFLVLDEPTNHMDIQAKETLESAFRAYKGTILFVSHDRYLIRQVADAV MIFENQTVMYYPFGYEHYLERKARENQGGSMAAQIRAEEQALIAGMRAVPKAERHRLREI PEDEAYREWNMGLAAQHLEPAGNAVEYLTEQVRELRAMWHGSEEYWNGGAWERMNEYEDR CLRLEQAWEEWHRACMEWLEKAQEMEVL >gi|157101624|gb|DS480700.1| GENE 45 52286 - 52642 426 118 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941011|ref|ZP_02088350.1| ## NR: gi|160941011|ref|ZP_02088350.1| hypothetical protein CLOBOL_05905 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05905 [Clostridium bolteae ATCC BAA-613] # 1 118 1 118 118 211 100.0 2e-53 MDYLYCMPDLNNTRENCEKIHNILARMSDKYKLNIVPEPVKAKYFGGLDYYKKYRIYKEI REIGGNSAEAYLQADEKEMILSVCKNQQEQELMKSCIYAYCYPAQMVLKSFNDRDKKK >gi|157101624|gb|DS480700.1| GENE 46 52847 - 53605 227 252 aa, chain - ## HITS:1 COG:MA4170 KEGG:ns NR:ns ## COG: MA4170 COG1145 # Protein_GI_number: 20092963 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Methanosarcina acetivorans str.C2A # 5 236 14 245 294 62 28.0 7e-10 MIGIYLSGTGNTRHCVEKLTGLLDSSAESMPLESRGTVEAIRNNDTIILGYPTQFSNAPY MVRDFIHRNAGLWKGKKLLCVSTMGAFSGDGAGCTARLLKKYGAVILGGLHIRMPDSVCD SKLLKKTLKQNREIIILADKKIESAALLIKQGIYPKDGITFISHLAGLLGQRLWFYGKTA DYTDKLKISKSCIGCGLCVSLCPMENISMKNGKAAAGKRCTMCYRCISRCPEKAITLLGK EVKEQCGYEKYR >gi|157101624|gb|DS480700.1| GENE 47 53842 - 53985 163 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941014|ref|ZP_02088353.1| ## NR: gi|160941014|ref|ZP_02088353.1| hypothetical protein CLOBOL_05908 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05908 
[Clostridium bolteae ATCC BAA-613] # 1 47 1 47 47 85 100.0 1e-15 MLTFTDASGEILYTSHYKIIDNMIEEAIPFKYLVTKEQRWAQAKQIV >gi|157101624|gb|DS480700.1| GENE 48 54073 - 54699 419 208 aa, chain - ## HITS:1 COG:lin2189 KEGG:ns NR:ns ## COG: lin2189 COG4832 # Protein_GI_number: 16801254 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 4 207 7 207 208 195 49.0 6e-50 MAFDFKKEYKEFYMPKHKPEIITVPPVNYIAVRGKGNPNEEGGAYQQALGILYAVAYTLK MSYKTDYKMEGFFEYVVPPLEGFWWQDNVEGVDYSDKSSFQWISAIRLPDFVTKKDLDWA KETAEKKKKLECSSAEFLTIEEGLCVQMMHLGAFDDEPASVAVMDKFIQEKGYVNDMNSK RLHHEIYMTDARKTAPEKWKTVIRHPIK >gi|157101624|gb|DS480700.1| GENE 49 54994 - 55680 449 228 aa, chain + ## HITS:1 COG:CAC0450 KEGG:ns NR:ns ## COG: CAC0450 COG0745 # Protein_GI_number: 15893741 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 225 1 223 227 198 43.0 8e-51 MKRLFLLEDDLSLISGLSFAVKKQGYEIDAARTILEADTLWANGSYDLAILDVSLPDGSG FDFCRKIRRMSKVPVMFLTAADEETDIIMGLDMGGDDYITKPFKLAVFLSRINALLRRSE NFNTAAAEISSNGIHIQLLKGTVYKNGKRLDLTASEYKLLCLFMEHPDQILSPDQILSKL WDCSEQYIDNSSLTVYIRRLRIKIEDNPGAPERIITVRGMGYKWNTAD >gi|157101624|gb|DS480700.1| GENE 50 55891 - 56727 504 278 aa, chain + ## HITS:1 COG:CAC0451 KEGG:ns NR:ns ## COG: CAC0451 COG0642 # Protein_GI_number: 15893742 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 12 273 149 411 416 150 35.0 3e-36 MENAISQIKESISGNKDVTIECNDEGELYRLFHEVNSLVSILNAHAVNETNSKKFLQNTI SDISHQLKTPLAALNIYNGIIQDEGKNSPVIQEFSLLSEQELDRIETLVQNLLKITKLDA GTIILEKSLEHVSELAENVKQHFLFRAQREGKEICLSGSGEITLLCDRTWIIEAVSNLVK NALDHTEKGSFVRIEWQAFASVVQITIKDNGCGIHPEDLHHIFKRFYRSRFSKDTQGIGL GLPLAKAIVEAHNGTIEADSTLGIGTSITINFLIPSKF >gi|157101624|gb|DS480700.1| GENE 51 56817 - 57500 263 227 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 20 211 35 232 329 105 35 5e-22 MNLLEINHICKTYGNGDAAVHALRNVSFSVPKGEHVAIVGESGSGKSTLLNMIGALDTPT SGKVMICGNDIFSMSDRKRTIFRRRNIGFIFQAFNLIPELTVEQNIIFPVLLDYQKPDRQ YLEELLTILNLKERRNHLPSQLSGGQQQRAAIGRALITRPALILADEPTGNLDTQNSSEV IALLKEASKKYEQTIIMITHSRSISQTADRVLQVSDGVLTDFGRCRQ >gi|157101624|gb|DS480700.1| GENE 52 57497 - 59299 843 600 aa, chain + ## HITS:1 COG:CAC0454 KEGG:ns NR:ns ## COG: CAC0454 COG0577 # Protein_GI_number: 15893745 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 78 595 260 826 832 107 23.0 5e-23 MKSYLSLIPISAKVRKRQNSMTLLCIVFAVFLVTAVFSMADMGVKMEQSRLLEKHGNLTL QDLLTNTMTQTLFLTAGVLFVLILTAAVLMISSSMNSNITQRTKFFGMMRCLGMSRQQII RFVRLEALNWCKTAIPAGVALGIAVTWCLCAALRFIVGEEFSGIPLFGISMPGIVSGIVV GIVTVLLAAGAPAKRAAKVSPVTAVSGNSQNDGNAHHTVNTHLFKVETALGIHHALSAAK NLILMTGSFALSIILFLSFSVLIDFVGYIMPQSAGTSDINITSIDGSNTVSSDLPDTIIG MHGVKRVYGRRSSFDVSAGLHSGGAKIDIVSYDDFDLDCLVKDGMLKKGSNISKVYGNSK YVLAVWDKDSTLDIGDKIDVGNEELEIAGLLKYNPFSSNGLTNGEITLVTSGAVFTRLTG ITDYSLIMIQTAKDVTDEDVTAIRSILDQNCVLKDLRGQHNSGTYFAFVFCIYAFLAIIT LVTVLNIVNSISMSVSSRIKSYGAMRAVGMDQYQITKMIAAEAFTYALTGGLVGCAAGLL IYKLLYDILIISHFSYAALSLPAAPLLIILFFVCMAAIAAVYAPAKRMRDMEITEIINEL >gi|157101624|gb|DS480700.1| GENE 53 59493 - 59990 232 165 aa, chain - ## HITS:1 COG:no KEGG:DSY1739 NR:ns ## KEGG: DSY1739 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 71 152 3 84 84 110 57.0 2e-23 MNDLQKKMDVLQKIAAEFNKDNLLWAIGASLLLYFKGVAEVFHDIDIMVDEKDVERCKNI LLNMGTMGAASPDPQYKTRTFIEFTVDEVEIDVIAGFVIVDNGMEHDCSLKPDQITEFIT IGKEHVPLQGLALWRQYYEWMGRTSKVELIDKFAEHKTGTGKGER >gi|157101624|gb|DS480700.1| GENE 54 60058 - 60618 272 186 aa, chain - ## HITS:1 COG:no KEGG:Halhy_2853 NR:ns ## KEGG: Halhy_2853 # Name: not_defined # Def: hypothetical protein # Organism: H.hydrossis # Pathway: not_defined # 39 179 46 185 188 96 41.0 5e-19 
MDKDNEVKKERKYYTSGPGKPPIFIQAILYEKKMSAIDFFHLTDACLDWEKQGDDIAVIE PLVTLLAAWGDDLIFAFHDTMAELLYSLDTQKIAGDIYKDRSFSADEFLYIRCAALINSK RYYNDIVSGRRKLKGSLSFESILSVPAFAWARVHGKPENEYPHTTKYCYETMSNIEGWKG KTERED >gi|157101624|gb|DS480700.1| GENE 55 61003 - 61825 217 274 aa, chain + ## HITS:1 COG:BS_yhaZ KEGG:ns NR:ns ## COG: BS_yhaZ COG4335 # Protein_GI_number: 16078046 # Func_class: L Replication, recombination and repair # Function: DNA alkylation repair enzyme # Organism: Bacillus subtilis # 10 266 8 258 357 270 50.0 3e-72 MPVLMKDKYYNYDSLHDLASRIKAVFPSFQENDFVDHIMNETWDALELKARMRQITINLG KYLPADYKQALGIIDQVIAGYPSGYNDNALIYFPDFVEVYGQDECHWDLSMAAIERYTPL STAEFAVRPFIIKQEARMMQQMAVWARHDNEHVRRLASEGCRPALPWGQALTSFKKDPSP VLPILEQLNTDPSLYVRKSVANHLNDISKTHPGLVTKIARDWYGRHEYTDWIVKHGCRTL LKKGNQEVLDIFGYHNAGSVDIADFTLGSASISL Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:37:08 2011 Seq name: gi|157101623|gb|DS480701.1| Clostridium bolteae ATCC BAA-613 Scfld_02_42 genomic scaffold, whole genome shotgun sequence Length of sequence - 57518 bp Number of predicted genes - 69, with homology - 69 Number of transcription units - 16, operones - 12 average op.length - 5.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.333 - CDS 42 - 1526 1672 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 2 1 Op 2 . - CDS 1543 - 2466 476 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 - Prom 2625 - 2684 15.9 + Prom 2647 - 2706 10.8 3 2 Op 1 . + CDS 2735 - 3466 210 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Term 3498 - 3539 -0.9 + Prom 3563 - 3622 5.4 4 2 Op 2 . 
+ CDS 3667 - 4122 401 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 4151 - 4198 11.0 - Term 4209 - 4250 8.0 5 3 Op 1 59/0.000 - CDS 4304 - 4696 596 ## PROTEIN SUPPORTED gi|240144422|ref|ZP_04743023.1| 30S ribosomal protein S9 6 3 Op 2 7/0.000 - CDS 4721 - 5149 721 ## PROTEIN SUPPORTED gi|239623392|ref|ZP_04666423.1| ribosomal protein L13 - Prom 5290 - 5349 4.7 - Term 5337 - 5375 -0.5 7 3 Op 3 8/0.000 - CDS 5377 - 6120 671 ## COG0101 Pseudouridylate synthase 8 3 Op 4 34/0.000 - CDS 6159 - 6959 905 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 9 3 Op 5 15/0.000 - CDS 6952 - 7830 290 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 10 3 Op 6 . - CDS 7848 - 8702 971 ## COG1122 ABC-type cobalt transport system, ATPase component - Prom 8889 - 8948 4.2 - Term 9087 - 9135 15.1 11 4 Op 1 1/0.333 - CDS 9179 - 10165 998 ## COG5263 FOG: Glucan-binding domain (YG repeat) 12 4 Op 2 1/0.333 - CDS 10211 - 11206 865 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 11232 - 11291 1.6 13 4 Op 3 . - CDS 11295 - 12545 1159 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 12605 - 12664 4.9 - Term 12675 - 12723 11.5 14 5 Op 1 50/0.000 - CDS 12741 - 13277 718 ## PROTEIN SUPPORTED gi|240145848|ref|ZP_04744449.1| LSU ribosomal protein L17P 15 5 Op 2 26/0.000 - CDS 13495 - 14457 1236 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit - Prom 14556 - 14615 3.2 - Term 14494 - 14522 2.3 16 5 Op 3 36/0.000 - CDS 14631 - 15224 811 ## PROTEIN SUPPORTED gi|240145850|ref|ZP_04744451.1| 30S ribosomal protein S4 17 5 Op 4 48/0.000 - CDS 15251 - 15655 661 ## PROTEIN SUPPORTED gi|239623373|ref|ZP_04666404.1| ribosomal protein S11 18 5 Op 5 2/0.167 - CDS 15751 - 16119 533 ## PROTEIN SUPPORTED gi|227871777|ref|ZP_03990182.1| ribosomal protein S13 19 5 Op 6 . 
- CDS 16146 - 16259 189 ## PROTEIN SUPPORTED gi|160881761|ref|YP_001560729.1| ribosomal protein L36 - Prom 16293 - 16352 7.6 20 6 Op 1 . - CDS 16457 - 16675 270 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 21 6 Op 2 . - CDS 16689 - 16952 168 ## gi|160941047|ref|ZP_02088385.1| hypothetical protein CLOBOL_05940 22 6 Op 3 12/0.000 - CDS 16967 - 17713 590 ## COG0024 Methionine aminopeptidase 23 6 Op 4 28/0.000 - CDS 17717 - 18361 770 ## COG0563 Adenylate kinase and related kinases - Term 18667 - 18723 13.0 24 7 Op 1 53/0.000 - CDS 18739 - 20055 800 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 25 7 Op 2 48/0.000 - CDS 20055 - 20495 637 ## PROTEIN SUPPORTED gi|240145859|ref|ZP_04744460.1| 50S ribosomal protein L15 26 7 Op 3 50/0.000 - CDS 20521 - 20703 223 ## PROTEIN SUPPORTED gi|227871783|ref|ZP_03990188.1| ribosomal protein L30 27 7 Op 4 56/0.000 - CDS 20719 - 21228 685 ## PROTEIN SUPPORTED gi|227871784|ref|ZP_03990189.1| ribosomal protein S5 28 7 Op 5 46/0.000 - CDS 21246 - 21614 600 ## PROTEIN SUPPORTED gi|239623363|ref|ZP_04666394.1| ribosomal protein L18 29 7 Op 6 55/0.000 - CDS 21632 - 22174 757 ## PROTEIN SUPPORTED gi|160881770|ref|YP_001560738.1| ribosomal protein L6 - Term 22185 - 22223 1.5 30 7 Op 7 50/0.000 - CDS 22248 - 22649 594 ## PROTEIN SUPPORTED gi|240145864|ref|ZP_04744465.1| ribosomal protein S8 31 7 Op 8 50/0.000 - CDS 22682 - 22867 310 ## PROTEIN SUPPORTED gi|168334339|ref|ZP_02692526.1| ribosomal protein S14 32 7 Op 9 48/0.000 - CDS 22884 - 23423 829 ## PROTEIN SUPPORTED gi|240145866|ref|ZP_04744467.1| 50S ribosomal protein L5 33 7 Op 10 57/0.000 - CDS 23447 - 23752 358 ## PROTEIN SUPPORTED gi|227871791|ref|ZP_03990195.1| ribosomal protein L24 34 7 Op 11 50/0.000 - CDS 23765 - 24133 576 ## PROTEIN SUPPORTED gi|160881775|ref|YP_001560743.1| ribosomal protein L14 35 7 Op 12 . 
- CDS 24158 - 24433 370 ## PROTEIN SUPPORTED gi|160881776|ref|YP_001560744.1| ribosomal protein S17 36 7 Op 13 . - CDS 24433 - 24645 332 ## PROTEIN SUPPORTED gi|239623355|ref|ZP_04666386.1| 30S ribosomal protein S17 37 7 Op 14 50/0.000 - CDS 24626 - 25063 670 ## PROTEIN SUPPORTED gi|160881778|ref|YP_001560746.1| 50S ribosomal protein L16 38 7 Op 15 61/0.000 - CDS 25063 - 25719 988 ## PROTEIN SUPPORTED gi|240145872|ref|ZP_04744473.1| SSU ribosomal protein S3P 39 7 Op 16 59/0.000 - CDS 25729 - 26115 578 ## PROTEIN SUPPORTED gi|238916270|ref|YP_002929787.1| large subunit ribosomal protein L22 40 7 Op 17 60/0.000 - CDS 26146 - 26430 461 ## PROTEIN SUPPORTED gi|240145874|ref|ZP_04744475.1| 30S ribosomal protein S19 41 7 Op 18 61/0.000 - CDS 26451 - 27296 1320 ## PROTEIN SUPPORTED gi|240145875|ref|ZP_04744476.1| 50S ribosomal protein L2 42 7 Op 19 61/0.000 - CDS 27416 - 27715 404 ## PROTEIN SUPPORTED gi|238922832|ref|YP_002936345.1| 50S ribosomal protein L23 43 7 Op 20 58/0.000 - CDS 27715 - 28335 835 ## PROTEIN SUPPORTED gi|238922831|ref|YP_002936344.1| ribosomal protein L4/L1e 44 7 Op 21 40/0.000 - CDS 28364 - 28999 962 ## PROTEIN SUPPORTED gi|238916265|ref|YP_002929782.1| large subunit ribosomal protein L3 - Prom 29153 - 29212 1.6 45 7 Op 22 . - CDS 29248 - 29565 522 ## PROTEIN SUPPORTED gi|160941071|ref|ZP_02088409.1| hypothetical protein CLOBOL_05964 - Prom 29798 - 29857 5.1 - Term 29956 - 30001 6.4 46 8 Tu 1 . - CDS 30048 - 31403 1529 ## COG5263 FOG: Glucan-binding domain (YG repeat) - Prom 31492 - 31551 6.0 + Prom 31556 - 31615 4.4 47 9 Tu 1 . + CDS 31656 - 32516 985 ## COG1092 Predicted SAM-dependent methyltransferases + Term 32526 - 32567 10.5 - Term 32516 - 32553 8.1 48 10 Op 1 . - CDS 32621 - 33310 611 ## COG0684 Demethylmenaquinone methyltransferase 49 10 Op 2 . - CDS 33360 - 34442 975 ## COG0371 Glycerol dehydrogenase and related enzymes 50 10 Op 3 . - CDS 34478 - 35260 721 ## COG1414 Transcriptional regulator 51 10 Op 4 . 
- CDS 35334 - 36620 819 ## PROTEIN SUPPORTED gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 52 10 Op 5 . - CDS 36614 - 37015 326 ## VVA1577 TRAP-type C4-dicarboxylate transport system, small permease component - Prom 37035 - 37094 2.9 53 11 Op 1 . - CDS 37123 - 38193 356 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 54 11 Op 2 1/0.333 - CDS 38222 - 39880 1667 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase - Prom 40044 - 40103 9.1 - Term 40172 - 40210 -0.6 55 12 Op 1 1/0.333 - CDS 40237 - 41379 1110 ## COG0673 Predicted dehydrogenases and related proteins 56 12 Op 2 . - CDS 41391 - 42773 1197 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase 57 12 Op 3 . - CDS 42788 - 43261 600 ## Mahau_2039 methylmalonyl-CoA epimerase (EC:5.1.99.1) 58 12 Op 4 1/0.333 - CDS 43320 - 44330 933 ## COG1312 D-mannonate dehydratase 59 12 Op 5 . - CDS 44337 - 45746 1264 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 60 12 Op 6 38/0.000 - CDS 45793 - 46647 689 ## COG0395 ABC-type sugar transport system, permease component 61 12 Op 7 35/0.000 - CDS 46649 - 47527 867 ## COG1175 ABC-type sugar transport systems, permease components - Term 47542 - 47566 -1.0 62 12 Op 8 . - CDS 47625 - 48935 1328 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 49056 - 49115 7.8 + Prom 49102 - 49161 8.2 63 13 Tu 1 . + CDS 49192 - 50205 884 ## COG1609 Transcriptional regulators 64 14 Op 1 . - CDS 50260 - 51105 773 ## COG1404 Subtilisin-like serine proteases - Prom 51173 - 51232 3.5 65 14 Op 2 . - CDS 51283 - 53103 1724 ## COG1001 Adenine deaminase - Prom 53329 - 53388 77.0 + TRNA 53312 - 53385 74.1 # Val CAC 0 0 - Term 53646 - 53701 10.2 66 15 Tu 1 . - CDS 53753 - 54370 585 ## COG1011 Predicted hydrolase (HAD superfamily) - Term 54432 - 54462 4.1 67 16 Op 1 15/0.000 - CDS 54474 - 54752 270 ## COG1862 Preprotein translocase subunit YajC 68 16 Op 2 . 
- CDS 54800 - 55960 1153 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 69 16 Op 3 . - CDS 55975 - 57375 1562 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) - Prom 57429 - 57488 2.7 Predicted protein(s) >gi|157101623|gb|DS480701.1| GENE 1 42 - 1526 1672 494 aa, chain - ## HITS:1 COG:lin0198 KEGG:ns NR:ns ## COG: lin0198 COG0791 # Protein_GI_number: 16799275 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Listeria innocua # 305 479 99 274 292 87 33.0 6e-17 MNNWKKVIVLSGLCSCALSVNAMASTNLETGSSLAGISVALNNYYAGNTEPEKQLASSYS NIQNKSAQSNTAAGSASGQGSGNKASGQSGSASGDAGKAKTAQETASKPKSSAYDNIAVS KINGTVNIRTEANTSSGVTGKINNDCAATILDTVDGEGGKWYKIQSGSVTGYIKADYFVT GQQAESRAKQVGTTYGTIVGTPSLRLRQSPDMEGKTLTLLSEGAHYVVTGEEGDFLKVQV DSDLEGYVFKEYMKTSVEFNKAVSVEEEKAKAAEEAKRKEDAEKALQALEDAKKADAAKK TTAATTKAEKTTEAPATTAPKKEETKGTDAVTTIAANPENGKDSTVAAPTTAKAPETVDS KEPGSPGGTSTEVASATRNAVVAYAKQFLGNPYVYGGTSLTSGADCSGFTQSVFAHFGIS TGRSSRDQAARGKSIPVSDAKPGDLLFYASGDYINHVAIYIGGGQVIHASNPTTGICITP ANYRTPCKAVTFLD >gi|157101623|gb|DS480701.1| GENE 2 1543 - 2466 476 307 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 4 304 1 303 306 187 37 9e-47 MSHIYDLIIIGSGPAGLAAAVYAQRAKLDTLVVEKAMVSGGQVLTTYEVDNYPGLPGIGG YDLGIKFREHADRLGARFVEDEVLNIQDGGKGAIKGVVCQGNTYEARSLILATGAVHRKL GVPGEEELAGAGVSYCATCDGAFFRNKVTAVIGGGDVAVEDAIFLARMCSKVYLIHRRNE LRAAKSLQENLLSLDNVEVIWDTVADSINGDGMVKSLSLTNVKNGQKRELDVQGVFIAVG ITPESRAFEGLVDMDHGYIRAGEDTVTSAPGIFAAGDVRTKPLRQIITAAADGANAITSV ERYLVEN >gi|157101623|gb|DS480701.1| GENE 3 2735 - 3466 210 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 1 240 1 238 242 85 29 6e-16 MPRKTVLVTGASRGIGKAVAVKFAKKGYNVAISCVHREEQLMQTRKEIESFQVPCLAYKG DMGDMACCEELFLKIKKMFGGVDVLVNNAGISYIGLLQDMTPADWERMLRTNLTSVFNCC 
KLAVPYMISQKQGKIVNISSVWGVVGASCETAYSATKGGINALTKALAKELAPSNIQVNA IACGAIDTEMNQWMDEDDLIALVDEIPSGRLGRAEEVADLAYHLGYKESYLTGQIIGLDG GWI >gi|157101623|gb|DS480701.1| GENE 4 3667 - 4122 401 151 aa, chain + ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 19 145 559 688 744 65 29.0 4e-11 MKKKYLAIAALSMALAAGSAMTSFAAGFVHTSQGTKYQWGSGDYCTNNWVNYKNHWFFFG EDQIMRTGWIQKDGTWYYAADTGELQGGIMKINGNVYYFDSNSCKLQTGERSFNGQTHTF TENGTTDGGPYVYTEWNSNGTLRRGTKFGVR >gi|157101623|gb|DS480701.1| GENE 5 4304 - 4696 596 130 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240144422|ref|ZP_04743023.1| 30S ribosomal protein S9 [Roseburia intestinalis L1-82] # 1 130 1 130 130 234 87 1e-60 MANAKFYGTGRRKKSIARVYLVPGTGNITINKRDINEYLGLETLKVIVRQPLVATETTDK FDVMVNVRGGGTTGQAGAIRHGISRALLQVDADYRPALKKAGFLTRDPRMKERKKYGLKA ARRAPQFSKR >gi|157101623|gb|DS480701.1| GENE 6 4721 - 5149 721 142 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623392|ref|ZP_04666423.1| ribosomal protein L13 [Clostridiales bacterium 1_7_47_FAA] # 1 142 1 142 142 282 96 3e-75 MKSFMASPATIERKWYVVDATGYTLGRLSSEIAKVLRGKNKPIFTPHMDCGDYVIVVNAE KIKVTGKKLDQKIYYNHSDYVGGMRETTLRELMAKKPEKVIELAVKGMLPKGPLGRSMFG KLHVYAGPDHEQAAQKPEVLTF >gi|157101623|gb|DS480701.1| GENE 7 5377 - 6120 671 247 aa, chain - ## HITS:1 COG:BH0167 KEGG:ns NR:ns ## COG: BH0167 COG0101 # Protein_GI_number: 15612730 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Bacillus halodurans # 1 238 1 238 263 208 46.0 8e-54 MKRIRLVVAYDGTQYHGWQIQPGAVTIESVLNEALTQLMREPIQVIGASRTDSGVHARGN VAVFDTESQMPPDKICMALNQRLPEDIRVQVSEEVPSDWHPRKCTCIKTYEYRILNRRIN MPMERLYSHFCYYKLDVDRMQAAANMLAGEHDFKSFCSVRTQVTDTVRTVYRIDVTRNAD DIITIRVTGNGFLYNMVRIIAGTLMAVGTGHIQAEDMPSILEAKDRRAAGPTAPARGLTL IEMKYEL >gi|157101623|gb|DS480701.1| GENE 8 6159 - 6959 905 266 aa, chain - ## HITS:1 COG:CAC3100 KEGG:ns 
NR:ns ## COG: CAC3100 COG0619 # Protein_GI_number: 15896351 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Clostridium acetobutylicum # 1 245 1 247 267 241 51.0 7e-64 MFREITLGQYYPVDSAVHRLDPRTKLFGTMVFIASLFVADNIWGYVIATVVLAAVIKLTR VPVRFILRGLKAIMILLLISVSFNLFLTDGRILVKFWIFKITFEGVRMAFFMGLRLIYLV IGSSIMTLTTTPNQLTDGLEKGLGFLKRFRVPVHEVAMMMSIALRFIPILVEETDKIMKA QMARGADFETGNLIQKAKAMVPLLVPLFISAFRRATDLAMAMEARCYRGGDGRTKMKPLQ YRQADHDAYMIYALYFVVIIASRVYL >gi|157101623|gb|DS480701.1| GENE 9 6952 - 7830 290 292 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 1 261 279 541 563 116 29 3e-25 MSIKAVDLNFVYGGGTAFEQHALFDVNLEIEDGEFVGLIGHTGSGKSTLIQHLNGLIKAS AGELYYNGENIYSQGYDMKQLRSKVGLVFQYPEHQLFEVDVLTDVCFGPKNQGLSREEAE ARAKKALEQVGLDPSYYKQSPFELSGGQKRRVAIAGVLAMEPEVLILDEPTAGLDPRGRD EILDQIDRLHRERHMTIILVSHSMEDVARYADRLIVMNHGQKVFDGAPKEVFCHYRELET MGLAAPQITYLVHDLKENGIDIDDDITTVAEAREAILALRNKLTESSRDKNV >gi|157101623|gb|DS480701.1| GENE 10 7848 - 8702 971 284 aa, chain - ## HITS:1 COG:CAC3102 KEGG:ns NR:ns ## COG: CAC3102 COG1122 # Protein_GI_number: 15896353 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, ATPase component # Organism: Clostridium acetobutylicum # 3 282 5 280 281 312 56.0 4e-85 MGIVKAAKLVYEYIRRDEEENIEEVNRAIDGVDVDIKKGDFVAVLGHNGSGKSTLAKHVN GLLLPTEGTVWVGDMDTRDEEHIWDVRKTAGMVFQNPDNQIIGNIVEEDVGFGPENIGVP TEEIWKRVEESLKAVGMTAYRLQSPNKLSGGQKQRVAIAGVMAMKPECIILDEPTAMLDP NGRKEVIRTIHELNRAEGITVLLITHYMEEAIEADRIIVMDDGRIVMDGQPREIFSRVKE LKSHGLDVPQVTELAWELKEAGMPLTDGILSREELVEQLVPLLR >gi|157101623|gb|DS480701.1| GENE 11 9179 - 10165 998 328 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae 
TIGR4 # 240 327 534 620 621 73 39.0 5e-13 MGRLKQQAVIWMSGALLMLTCGTAMASTRTEIDSISLDVESNIEAGDSSGDVDVTCDSGD YYVDDIEITNEPKNGWDDGDKPKLKVTVEAEDDYYFSSGLSKSDVDLRGADGKVTSVTRK SSTLIVYITLDSLDGSDSGYDLDVYGLEWDESDGMASWEDSGDARKYEVRLYRNDNSVTS VLTTSDTSYDFSGYITKSGDYLFKVRAVYSSSDKGSWEESDSWYVSSEEADEISDGRRTY GSSSSSYKGAWLRDSVGWWYCNADKSYTVNNWQYIDDRWYFFNAAGYMVTGWIDWGGRWY YCGDDGAMYYNTTTPDGYYVGDDGAWVQ >gi|157101623|gb|DS480701.1| GENE 12 10211 - 11206 865 331 aa, chain - ## HITS:1 COG:SP0069 KEGG:ns NR:ns ## COG: SP0069 COG5263 # Protein_GI_number: 15900014 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 254 331 135 211 211 61 40.0 2e-09 MKRGRFCFTAAAVCLMVFGLSISAQAKTTYRLEITNQYDTDGTSRDDNDKDRVHPLEPDV EVSEGSSDGMELASDVEWSKAPLNWSAGNEVTGTLYLGCTGSLGRDSLELRVTNGRNEAK VSSVKKYTGEDYESDLGNVYEVKFKYQVAAQLGETAWAGWDSANPSLARWNAVKYANTYR VVLYDDQGSVVSQLVEGSTEYDFSPFMTKEGVQYYFEVQAIARNGNQQDYLEDGEAVSSL TSGANTPGITDGTWGDYQEGRRFTYEDGTAAADRWERIMGKWYYFNPEGYAVTGWNQIQD TWYYMYEDGAMAAGVTTPDGFQVDDSGAWIH >gi|157101623|gb|DS480701.1| GENE 13 11295 - 12545 1159 416 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 225 414 516 743 744 127 33.0 5e-29 MRSMKRLLCLALATGTLITAMPIQSMASSDYKYITNISLKVDIDLDSGDEINDGDSLGTD SSDSGNKVYTSSNKYRVQSAEWSNDKEVTIGDTPKITVWLEPESSSNSNYDYRFRSSYSS GSVSVSGGTFVSASKSGSDLKVVIRANGIKGTYDPPEDAEWGSTRGRARWEEPENGGSGY YDVYLYRGSSVVKKLEEYKGTSYNFYPYMTKEGDYSFKVRTVPHTEEEKKRGKKSEWTEA GDLYIAEDEVSDGTGQEDGNGSTPSGGNTDVGWRKEGDTWYFKYPDGNYQKNGWLRWNDK WYLFDETGKMVTGWKQTNSGWYYLGESGDMKTGWVKSNNIWYYMNPNQDGPEGAMVKNSW LTIDGKTYFMNESGAMVEGWYKVQDNWYYFYPGQGQKAVNTSISGFQLDANGVWQH >gi|157101623|gb|DS480701.1| GENE 14 12741 - 13277 718 178 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145848|ref|ZP_04744449.1| LSU ribosomal protein L17P [Roseburia intestinalis 
L1-82] # 1 178 1 178 178 281 79 8e-75 MAGYRKLGRTSDQRKALIRSQVTALLYHGHIRTTETRAKEIRKVAEGLIAMAVKEKDNFE TVKVTAKVARKDKDGKRVKEVVDGKKVTVYDEVEKEIKKDMPSRLHARRQMLKVLYGVTE VPAAAAGKKRNTKSVDLVAKIFDDVAPKYVGRNGGYTRIVKIGQRKGDGAMEVLIELV >gi|157101623|gb|DS480701.1| GENE 15 13495 - 14457 1236 320 aa, chain - ## HITS:1 COG:BS_rpoA KEGG:ns NR:ns ## COG: BS_rpoA COG0202 # Protein_GI_number: 16077211 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Bacillus subtilis # 1 318 1 314 314 386 66.0 1e-107 MFDFEKPNIEIAEISEDKRYGKFVVEPLERGYGTTLGNSLRRIMLSSLPGAAVSQVKIDG VLHEFSSIPGVKEDVTEIIMNIKSLAIKNNSDTDEPKVAYIEFEGEGVITAADIQADADI EIMNPDQIIATLSGGTDSKFYMELTITKGRGYISADKNKNDDLPIGVIAVDSIYTPVERV NMAVANTRVGQQTDYDKLTLEIYTNGTLAPDEAVSLAAKVLSEHLNLFIDLSENAKTAEI MVEKEDNEKEKVLEMNIDELELSVRSYNCLKRAGINTVEELCNRTSEDMMKVRNLGRKSL EEVLAKLKELGLQLNPSDDN >gi|157101623|gb|DS480701.1| GENE 16 14631 - 15224 811 197 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145850|ref|ZP_04744451.1| 30S ribosomal protein S4 [Roseburia intestinalis L1-82] # 1 197 1 197 197 317 78 1e-85 MAVDRVPVLKRCRSLGLDPIYLGIDKKSNRESKRTNKKLSEYGMQLREKQKAKFIYGVLE KPFRNYYAKASKMNGMVGTNLMILLECRLDNVLFRMGLGRTRKEARQIVDHKHILVNGKP VNIPSYRVKAGDVIEVKEKYKSAQRYKDVLEVTGGRMVPAWLDVDQENLRGTVKEMPTRD EIDVPVNEMLIVELYSK >gi|157101623|gb|DS480701.1| GENE 17 15251 - 15655 661 134 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623373|ref|ZP_04666404.1| ribosomal protein S11 [Clostridiales bacterium 1_7_47_FAA] # 1 134 1 133 133 259 99 3e-68 MAKKVSTTAKKVTKKRVKKNVERGQAHIQSSFNNTIVTLTDAQGNALSWASAGGLGFRGS RKSTPYAAQMAAETATKAALVHGLKSVDVMVKGPGSGREAAIRALQACGLEVTSIKDVTP VPHNGCRPPKRRRV >gi|157101623|gb|DS480701.1| GENE 18 15751 - 16119 533 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227871777|ref|ZP_03990182.1| ribosomal protein S13 [Oribacterium sinus F0268] # 1 122 1 122 122 209 86 2e-53 MARISGVDLPREKRVEIGLTYIYGIGRASSNRILTAAGVNPDTRVKDLTDDEVKKLANTI AETQTVEGDLRREIAMNIKRLQEIGCYRGIRHRKSLPVRGQKTKTNARTCKGPRKTVANK KK 
>gi|157101623|gb|DS480701.1| GENE 19 16146 - 16259 189 37 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881761|ref|YP_001560729.1| ribosomal protein L36 [Clostridium phytofermentans ISDg] # 1 37 1 37 37 77 94 2e-13 MKVRSSVKPICEKCKIIKRKGSIRVICENPKHKQRQG >gi|157101623|gb|DS480701.1| GENE 20 16457 - 16675 270 72 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 [Streptococcus pneumoniae TIGR4] # 1 72 1 72 72 108 68 7e-23 MSKADVIEIEGTVIEKLPNAMFQVELENGHQVLAHISGKLRMNYIRILPGDKVTIELSPY DLSKGRIIWRDK >gi|157101623|gb|DS480701.1| GENE 21 16689 - 16952 168 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941047|ref|ZP_02088385.1| ## NR: gi|160941047|ref|ZP_02088385.1| hypothetical protein CLOBOL_05940 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_05940 [Clostridium bolteae ATCC BAA-613] # 1 87 1 87 87 144 100.0 2e-33 MVEFKEAGLVKSLAGHDKNELYIIISVQGEYVYLSDGRKHPLSKQKRKNRKHLQLIHEHD ETLRKKLTEGAAVRDEEIAGFIRSRRQ >gi|157101623|gb|DS480701.1| GENE 22 16967 - 17713 590 248 aa, chain - ## HITS:1 COG:BH0156 KEGG:ns NR:ns ## COG: BH0156 COG0024 # Protein_GI_number: 15612719 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus halodurans # 1 247 1 246 248 253 50.0 3e-67 MVTIKSEREIELMREAGRILAKVHEELGRTLVPGMSTKEIDRMCEDMIRSHGCVPSFLNY QGFPASVCISINDEVVHGIPDKHRYLEEGDIVSLDTGVIWKGYQSDAARTHMIGEVSGEA RKLVEVTQQSFFEGIKYAKAGNHLNDISKAIQEYAESFGFGVVRDLVGHGIGTEMHEAPE IPNFAQRRKGIRLAAGMTLAIEPMITAGRYDVVWEDDGWTVVTEDGSLASHYENTILITD GEPEILSL >gi|157101623|gb|DS480701.1| GENE 23 17717 - 18361 770 214 aa, chain - ## HITS:1 COG:CAC3112 KEGG:ns NR:ns ## COG: CAC3112 COG0563 # Protein_GI_number: 15896362 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Clostridium acetobutylicum # 1 213 1 213 215 272 62.0 3e-73 MKIIMLGAPGAGKGTQAKKIAAKYQIPHISTGDIFRANIKNGTELGMKAKSYMDAGGLVP 
DEITIGMLLDRIHEEDCKNGYVLDGFPRTIPQAESLTKALGDMGEAIDYAINVDVPDENI INRMSGRRACLSCGATYHIVYNPPKKEGICDVCGQQLVLRDDDKPETVKKRLDVYHDQTQ PLIEYYKKAGVLAEVDGTLDMEEVFQAIVRILGA >gi|157101623|gb|DS480701.1| GENE 24 18739 - 20055 800 438 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 10 436 19 437 447 312 37 2e-84 MFKTIRNAFKVKELRTKILYTFMMLVVIRFGSQLPIPGIETSFFANWFAKQTTDVFGFFN AMTGGSFSSMSIFALSITPYITSSIIIQLLTIAIPKLEELQKDGEDGRKKIQEYTRYTTV GLALIESSAMAIGFGRQGLLIDYNAWNIIIAIVTMTTGSALLMWIGEQITEKGVGNGISI VLLFNILSSVPDDMKTLYYRFIFGQTVTKMIFSVVVIALIILAMVVFVIVLNDAERRIPV QYSKKMVGRKMVGGQASNIPLKINTAGVMPVIFASSIMSFPVVISQFFTIDPNSIGSKIL MVLNSGSWCRPEYPIYSIGLVIYIALLIMFAYFYTSITFNPLEVANNMKKQGGFIPGIRP GKPTSDYLNKILNYIVFIGACGLIVIAIVPILASGLLNVSRISFSGTSLIIIVGVVLETI KAVESQMLVRYYKGFLND >gi|157101623|gb|DS480701.1| GENE 25 20055 - 20495 637 146 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145859|ref|ZP_04744460.1| 50S ribosomal protein L15 [Roseburia intestinalis L1-82] # 1 146 1 146 146 249 85 2e-65 MELSNLQPAAGSKHSDNFRKGRGHGSGNGKTAGKGHKGQKARSGATRPGFEGGQMPLYRR LPKRGFTNMNSKTIIGINVSALERFDNDAEVTVETLIEAGIVKNPRDGVKILGNGELTKK LTVKVNAFSEGAKSKIEALGGTCEVI >gi|157101623|gb|DS480701.1| GENE 26 20521 - 20703 223 60 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227871783|ref|ZP_03990188.1| ribosomal protein L30 [Oribacterium sinus F0268] # 1 60 1 60 60 90 70 2e-17 MAEKLKITLVKSPIGAVPKQKATVKALGLTKMHKTVEMPDNGAVRGMVAAVRHLVKVEEV >gi|157101623|gb|DS480701.1| GENE 27 20719 - 21228 685 169 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227871784|ref|ZP_03990189.1| ribosomal protein S5 [Oribacterium sinus F0268] # 1 169 1 169 169 268 77 5e-71 MKRTIIDASQMELNDKVVAIKRVSKTVKGGRTMRFSALVVVGDGNGHVGAGLGKAGEVPE AIRKGKEAAIKNLVEIPVDENKSIPHDFIGKFGSAAVLLKRAPEGTGVIAGGPARSVLEM AGIKNIRTKSLGSNNKTNVVLATLAGLTSLKTPEEFARLRGKSVEEIVG >gi|157101623|gb|DS480701.1| GENE 28 21246 - 21614 600 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623363|ref|ZP_04666394.1| ribosomal protein L18 
[Clostridiales bacterium 1_7_47_FAA] # 1 122 1 122 122 235 98 4e-61 MVNKQSRSEIRVKKHNRMRNRFAGTAERPRLAVFRSNNHMYAQIIDDTVGNTLVAASTVE KEVNAELEKTNDKAAAAYVGTVIAKRALEKGIKEVVFDRGGFIYQGKVQALADAAREAGL DF >gi|157101623|gb|DS480701.1| GENE 29 21632 - 22174 757 180 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881770|ref|YP_001560738.1| ribosomal protein L6 [Clostridium phytofermentans ISDg] # 1 180 1 180 180 296 78 2e-79 MSRIGRMPVAVPAGVTVTIAEGNKVTVKGPKGTLERELPVEMTIEQKDGQIIVSRPNDLK KMKSLHGLTRTLVNNMIHGVTEGYEKVLEVNGVGYRAAKQGKKLTLNLGYSHPVEMEDPE GIETVMEGQNKIIVKGISKEKVGQYAAEIRDKRRPEPYKGKGIKYADEVIRRKVGKTGKK >gi|157101623|gb|DS480701.1| GENE 30 22248 - 22649 594 133 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145864|ref|ZP_04744465.1| ribosomal protein S8 [Roseburia intestinalis L1-82] # 1 133 1 133 133 233 85 2e-60 MTMSDPIADMLTRIRNANTAKHDTVDVPSSKMKIAIADILVKEGYVAKYDMVEDGGFQTI RITLKYGKDKNEKIISGIKRISKPGLRVYANKEELPKVLGGLGTAIISTNQGVITDKEAR QLGVGGEVLAFVW >gi|157101623|gb|DS480701.1| GENE 31 22682 - 22867 310 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168334339|ref|ZP_02692526.1| ribosomal protein S14 [Epulopiscium sp. 'N.t. 
morphotype B'] # 1 61 1 61 61 124 88 2e-27 MAKTSMKIKQQRPAKFSTREYNRCRICGRPHAYLRKYGICRICFRELAYKGQIPGVKKAS W >gi|157101623|gb|DS480701.1| GENE 32 22884 - 23423 829 179 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145866|ref|ZP_04744467.1| 50S ribosomal protein L5 [Roseburia intestinalis L1-82] # 1 179 1 179 179 323 86 1e-87 MARLKEQYQNEMVDALMKKFGYKNVMQVPKLDKIVINMGVGEAKENAKVLDSAVRDLEII SGQKAITTKAKKSIANFKLREGMAIGCKVTLRGERMYEFLDRLVNLALPRVRDFRGVNPN AFDGRGNYALGIKEQLIFPEIEYDKVDKVRGMDIIFVTTAKTDEEARELLTLFNMPFAK >gi|157101623|gb|DS480701.1| GENE 33 23447 - 23752 358 101 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227871791|ref|ZP_03990195.1| ribosomal protein L24 [Oribacterium sinus F0268] # 1 101 1 101 101 142 68 4e-33 MNKIKKDDLVVVIAGKDKDKQGKVISVDTKKNKVLVQGCNMVTKHTKQAPGNPQGGIVNQ EAPIDISNVMLVVDGKATRVGFEEKDGKKVRVAKTTGKVID >gi|157101623|gb|DS480701.1| GENE 34 23765 - 24133 576 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881775|ref|YP_001560743.1| ribosomal protein L14 [Clostridium phytofermentans ISDg] # 1 122 1 122 122 226 93 2e-58 MVQQETRLKVADNTGAKELLCIRVLGGSTRRYANIGDIIVASVKDATPGGVVKKGDVVKA VVVRTVKGARRKDGSYIRFDENAAVIIKDDKTPRGTRIFGPVARELRDKQFMKIVSLAPE VL >gi|157101623|gb|DS480701.1| GENE 35 24158 - 24433 370 91 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881776|ref|YP_001560744.1| ribosomal protein S17 [Clostridium phytofermentans ISDg] # 8 91 2 85 85 147 85 2e-34 MRKETFTVERNLRKTRTGKVVSNKMDKTIVVAIEDHVKHPVIGKIVKKTVKLKAHDEKNE CTIGDTVKVMETRPLSKDKRWRLVEIIEKAR >gi|157101623|gb|DS480701.1| GENE 36 24433 - 24645 332 70 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623355|ref|ZP_04666386.1| 30S ribosomal protein S17 [Clostridiales bacterium 1_7_47_FAA] # 1 70 1 70 70 132 97 4e-30 MITVKTNKFVEELKNKSANELNEELVAAKKELFNLRFQNATNQLDNTSRIKEVRKNIARI QTVITEKANA >gi|157101623|gb|DS480701.1| GENE 37 24626 - 25063 670 145 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881778|ref|YP_001560746.1| 50S ribosomal protein L16 [Clostridium phytofermentans ISDg] # 1 145 1 145 145 262 88 3e-69 
MLMPKRVKRRKQFRGSMAGKALRGNTISNGEYGLVSTEPCWIKSNQIEAARVAMTRYIKR GGKVWIKIFPDKPVTAKPAETRMGSGKGALEYWVAVVKPGRVLFEIAGVPEETAREALRL AMHKLPCKCKIVSKADLEGGDNSEN >gi|157101623|gb|DS480701.1| GENE 38 25063 - 25719 988 218 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145872|ref|ZP_04744473.1| SSU ribosomal protein S3P [Roseburia intestinalis L1-82] # 1 215 1 215 218 385 87 1e-106 MGQKVNPHGLRVGVIKDWDSKWYAEADFADCLVEDYNMRKFLKKRLFSAGISKIEIERAS DRVKIIIYTAKPGVVIGKGGSEIEKLKKEVQKLTDKKLFIDIKEIKRPDRDAQLVAESIA QQLENRVSFRRAMKSTMGRTMKAGVKGIKTAVAGRLGGADMARTEFYSEGTIPLQTLRAD IDYGFAEADTTYGKIGVKVWIYKGEVLPTKGNKEGSDK >gi|157101623|gb|DS480701.1| GENE 39 25729 - 26115 578 128 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916270|ref|YP_002929787.1| large subunit ribosomal protein L22 [Eubacterium eligens ATCC 27750] # 1 128 1 128 128 227 88 1e-58 MAKGHRSQVKRERNAQKDTRPSATLSYARVSVQKACFVLDAIRGKDLQTALGIVTYNPRY ASSLIKKLLESAAANAENNNGMDPAKLYVEECYANQGPTMKRVRPRAQGRAYRIEKRMSH ITVVLNER >gi|157101623|gb|DS480701.1| GENE 40 26146 - 26430 461 94 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145874|ref|ZP_04744475.1| 30S ribosomal protein S19 [Roseburia intestinalis L1-82] # 1 93 1 93 93 182 94 5e-45 MARSLKKGPFADAHLLKKVDAMNAAGQKQVIKTWSRRSTIFPQMVGHTIAVHDGRKHVPV YVTEDMVGHKLGEFVATRTYRGHGKDEKKSGVRK >gi|157101623|gb|DS480701.1| GENE 41 26451 - 27296 1320 281 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145875|ref|ZP_04744476.1| 50S ribosomal protein L2 [Roseburia intestinalis L1-82] # 1 281 1 281 281 513 86 1e-144 MGIKKYNAYTPSRRHMTGSDFKEITKDTPEKSLVVSLNKNAGRNNQGKITVRHRGGGSRR KYRIIDFKRNSKDGIPATVIGIEYDPNRTANIALICYADGEKAYILAPQGLTDGMKVMSG DTAEAKLGNCMPLYHIPVGTQVHNIELYPGKGGQLVRSAGNSAQLMAKEGKYATLRLPSG EMRMVPIICRATIGVVGNGEHSLINIGKAGRKRNMGIRPTVRGSVMNPNDHPHGGGEGKC GIGRPGPCTPWGKPALGLKTRKKNKQSNKLIVRRRDGRTIK >gi|157101623|gb|DS480701.1| GENE 42 27416 - 27715 404 99 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238922832|ref|YP_002936345.1| 50S ribosomal protein L23 [Eubacterium rectale ATCC 33656] # 1 99 1 99 99 160 77 2e-38 
MADIKYYDVILKPVITEKSMNAMGDRKYTFMVHVDANKSMIKEAVEKMFPGTKVASVNTM NCEGKTKRRGMTFGKTAASKKAIVKLTEDSKEIEIFQGL >gi|157101623|gb|DS480701.1| GENE 43 27715 - 28335 835 206 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238922831|ref|YP_002936344.1| ribosomal protein L4/L1e [Eubacterium rectale ATCC 33656] # 1 206 1 206 206 326 77 2e-88 MANVSVYNMEGKEVGALELNDAVFGVEVNEHLVHLAVVAQLANKRQGTQKAKTRSEVSGG GRKPWRQKGTGHARQGSTRSPQWKGGGVVFAPTPRDYTIRLNKKEKRAALRSALTSRVQD NKFIVVDELKFDEIKTKKFQNVMDNLKVSKALVVLADNDQNTVLSARNIAGVKTSQVGSI NVYDILKYNTVVATKAAVASIEEVYA >gi|157101623|gb|DS480701.1| GENE 44 28364 - 28999 962 211 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916265|ref|YP_002929782.1| large subunit ribosomal protein L3 [Eubacterium eligens ATCC 27750] # 1 211 1 211 211 375 87 1e-103 MKKAILATKVGMTQIFNENGALVPVTVLQAGPCVVTQVKTAENDGYKAVQVGFVDKRDKL VSKPQKGHFDKAGVSYKRYVREFKFENAEEYSVKDEIKADIFAAGDKIDATAISKGKGFQ GAIKRYGQHRGPMAHGSKFHRHQGSNGSATTPGRVFKGKGMPGHMGSKQITVQNLEVVKV DVDNNLILVKGAVPGPKKSLVTIKETVKVER >gi|157101623|gb|DS480701.1| GENE 45 29248 - 29565 522 105 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160941071|ref|ZP_02088409.1| hypothetical protein CLOBOL_05964 [Clostridium bolteae ATCC BAA-613] # 1 105 1 105 105 205 100 4e-52 MASQVMRITLKAYDHQLVDQSAGKIIETVKKTGSQVSGPVPLPTKKEVVTILRAVHKYKD SREQFEQRTHKRLIDITAPSQKTVDALSRLEMPAGVYIDIKMKTK >gi|157101623|gb|DS480701.1| GENE 46 30048 - 31403 1529 451 aa, chain - ## HITS:1 COG:SP0117 KEGG:ns NR:ns ## COG: SP0117 COG5263 # Protein_GI_number: 15900059 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 24 227 522 689 744 86 28.0 1e-16 MKKKMVAVWALAAVLCFGTATISALAAEGWAQSGSTWVYYDSNGNKIYNTWKKGADNQWR YLNGEGVMATNAWVESDYYMDSNGIMLTDKWLKLTGDQGEYEWYYFSSSGKVITDAWKKI DNKWYHFNGDGRMEIGWILDDMYYTGSDGVMRTGWQKLYPPDDDDDNADKVTPGIDSVTD DDGRYWYYFSSNGKKYAPDSSDGDYTARKIDGTYYCFDENGAMQTGWKNVKSTTNDSIQD YMYFGSDGKAKIGWYSIEPPEDLNGYEEAVEWFYFNNSGKPRAADSERLTTSDIVKLNGK 
SYLFNHLGNPVYGLKKVYTGSGEDDWTAFYFGDKNKSCVQKGKMKVTEDDGNKSDFYFLD NGRGVNGVKDNYLYYKGKLQKATDGQKYVCYRVEGRNYVVNASGKVMKGSNVKNSDGVKF TTNSSGELTKADDDSDVDSYAQEPTEPYCTE >gi|157101623|gb|DS480701.1| GENE 47 31656 - 32516 985 286 aa, chain + ## HITS:1 COG:mlr3209 KEGG:ns NR:ns ## COG: mlr3209 COG1092 # Protein_GI_number: 13472800 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferases # Organism: Mesorhizobium loti # 9 286 58 338 338 229 42.0 4e-60 MWLADEWKDYEVIDCSKGEKLERWGDYLLVRPDPQVIWDTPRNHKGWKKMNGHYHRSSKG GGEWEFFHLPEQWSIGYKGLTFNLKPFSFKHTGLFPEQAANWDWFGARIREAGRPVKVLN LFAYTGGATLAAAKAGASVTHVDASKGMVAWAKENAQSSGLADAPIRWIVDDCVKFVERE IRRGNRYDAVIMDPPSYGRGPKGEIWKIEDAIHPLVKLCVQLLSDKPLFFLINSYTTGLA PAVLSYMIGVEIVPRFGGTVSAEEVGLPVTQTGLVLPCGASGRWEA >gi|157101623|gb|DS480701.1| GENE 48 32621 - 33310 611 229 aa, chain - ## HITS:1 COG:BH1938 KEGG:ns NR:ns ## COG: BH1938 COG0684 # Protein_GI_number: 15614501 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Bacillus halodurans # 11 225 8 205 210 81 32.0 1e-15 MVVWKNDDELFALMRETLYSSVIGDILDKMNYLHQFLPWRIQPMRENMVVAGRAMTVLEA DALETVSEGQNPIMRQSFGLMLEALDDLKKNEVYICTGSSPTYALVGELMCTRMKILGAA GAVVNGFHRDTNGILDLDFPCFSYGRYAQDQGPRGKVIDYRVPIDMEGVKINPGDIVFGD LDGVLVIPKEIEEEVIVRAVEKATGEKMVAEAIKNGMAAKASFDKYGIM >gi|157101623|gb|DS480701.1| GENE 49 33360 - 34442 975 360 aa, chain - ## HITS:1 COG:lin1848 KEGG:ns NR:ns ## COG: lin1848 COG0371 # Protein_GI_number: 16800915 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Listeria innocua # 16 360 19 362 368 155 32.0 9e-38 MKGQESVYLPHFTIGLNAFDAFRDVIGRYGNKIAAVHGEKAWNAAGRYVTQAVEKAGMTT TGEILYGHEATWSNVERLVADQRVRQSDVLLAVGGGKCVDTVKLAADLLRKPVFTIPTIA SNCAPVTKLSIMYGEDGTFEKVKRLNSVPVHCFIHPGIILDAPIRYLRAGIGDAMAKHVE SEWSAKAGEKLSFGSEFGILAGRLCFYPMLEFGKKAVEDAKAGIVSEELEKILLNIIITP GIVSVSVHSDYVGGIAHALFYGLTRRRQIEENYLHGDVVAYGTLVNLMVDRDWDKLEKTY 
RFNRSLGLPSCLKDLGLTLEDDLEDVLAAAVENQELSHTPYPVTADLIQEAMRELESYRA >gi|157101623|gb|DS480701.1| GENE 50 34478 - 35260 721 260 aa, chain - ## HITS:1 COG:BH2137 KEGG:ns NR:ns ## COG: BH2137 COG1414 # Protein_GI_number: 15614700 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 9 228 2 218 251 128 33.0 9e-30 MERESQYTISSVKKALKVLKLFDSAHKSMTLTEISERADMRKGTMVRVLTTLEEEGFVRY DEISKRYSLGMSIYVLSSNAFQFISVVEAARPYLEKVTTELNLIAHLSVLEGEKIVLIDR VVPNANAVVFDLNSKIGGEVEAHNTGAGKVLTAYSAAEVREKIISHCSFESYSERTITDK DEYRRVLDAVRKNGYATNEGESEPYLKCITYPIFGYGGKIEAALSLSGIIQYFDTGLEQR CHEALKEICEKISAAMGQSL >gi|157101623|gb|DS480701.1| GENE 51 35334 - 36620 819 428 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149195935|ref|ZP_01872991.1| Ribosomal protein L16 [Lentisphaera araneosa HTCC2155] # 1 423 1 425 432 320 37 2e-86 IMSIFLMMMAVFLVFVFLGIPVSFSIGLSTVAAVLAKGMPLAFVSQMAFTGLDSYTYLAI PMFILAGYLMETGGLSKRLVNFAASLVGNIHGGLGIITVLACAFFAAISGSSPGTVAAVG AMMIPPMVEKGYDRDFSAALTASAGSLGVLIPPSIPMVLYCIAGEVSIGEMFMAGFIPGF LMTAALALTTYVISKKNGYKSDEGEFEWKKVAASFKEAIWPLLAPVIILGGIYSGIFTPT EAAVVVVVYSMFVGLFITKELKVSDLPGILYKGAVTAGTTVFILAFAMAFSRYLTLNQIP QLIGSVILGITGNVHVILILFVLLCFVTGTFLETASQVLIYTPLFLPTLKGLGVSPLHFG ILLTVGTMLGMMTPPVGINLFVAQGISGAKISKMTKAILPFLFVMLFVQILFVFFPGFST FLPGLFSK >gi|157101623|gb|DS480701.1| GENE 52 36614 - 37015 326 133 aa, chain - ## HITS:1 COG:no KEGG:VVA1577 NR:ns ## KEGG: VVA1577 # Name: not_defined # Def: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: V.vulnificus_YJ016 # Pathway: not_defined # 1 130 32 161 181 88 40.0 1e-16 MLFRFVFNLPLAWTEELSRYVFIILIYCGASAAVLDNAHVRVELIDNVLSERGRFALDIA VKLLCAAVSLVIAYNSRQIIYNASLSKQLSASLRIPMAGLYALVAVMFCLIAFRFLQAVF KMIVRKKEVETKS >gi|157101623|gb|DS480701.1| GENE 53 37123 - 38193 356 356 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. 
PR1] # 53 353 30 326 328 141 28 7e-33 MKKRMLAALLAGSMLLTACGGTGGQTVSQAPKGEASQSAPADTGTQSGQKVTMKVSVGVA DNHFEAIAVNKMKEYIEEQTGGNFTVEIFTGAQIGNDQEVFEGLKLGVADMLPCGTDIIG NFSKDFGLLSLPYLFDNEKQVEAVVEGEFGQSLLKELEDIGYVGLGFGNFGFRHTTNSKH PINSVEDMKGLKIRTMTTPIHLEVFEALGANPTPMAFSELFSALQQGVVDGQENPLMNIY ANKLHEVQKYLTLDGHVFTFVTFVVSKDWYDKLDPSYQQILNDGIKIATEYMKESCESED ALALEKMKEAGVEVVELTPEAKDEFREAVKGVSEKYGNEINPDRYKEMLDIIAAVQ >gi|157101623|gb|DS480701.1| GENE 54 38222 - 39880 1667 552 aa, chain - ## HITS:1 COG:PAB0895 KEGG:ns NR:ns ## COG: PAB0895 COG0129 # Protein_GI_number: 14521553 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Pyrococcus abyssi # 1 550 1 548 551 505 49.0 1e-143 MRENDILKGEQGALKRALYKSMGFTGEQLGKPIIAIANSYTNATPGHFVLKQLCESVKEG IIKAGGTPMEFSTVAPCDGIAEGHEGMRYILPTRDLIASSIECMVRAHRFDGLVLLGSCD KIVPGMLMAAARLEMPAIFLNGGPMYPASYCGRHYDGNIVTEAIGWKQQGRIDEEEFRHI EDIAEPCPGSCAMLGTANTMSALSESLGMSLMGSSTIPAVLAARMAKGVETGEKIVELVQ KGVTTRDILTEEAFENAVMHLLTMGGSTNGILHLQAIYHEAGLGELPLSVFDEFSRKIPQ VASIYPASEFDMVDFYEGGGVPAVLKEIEEYLHKDTLTVSGLTMGEALSRFSYTNNRNMI KTAEEPFAPTGGVAILKGNIAPLGCVIKPAAVPKHMFRYSGRAQVFTTEEESYQAVLEGR VKPGTCMVLMYEGPKGGPGMPEMYKTMKYLEGMGLSDTCALITDGRFSGSNRGLFVGHIS PEAYEQGDFALIQDGDVIEIDVDARSIELKVPEEILEERRKRFVPVEKEVKRGYLRTYRR ISASASKGAVVE >gi|157101623|gb|DS480701.1| GENE 55 40237 - 41379 1110 380 aa, chain - ## HITS:1 COG:mll3311 KEGG:ns NR:ns ## COG: mll3311 COG0673 # Protein_GI_number: 13472879 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Mesorhizobium loti # 5 360 3 367 379 208 35.0 2e-53 MKGIKKVKTAVIGCGAISRIYFENMTNAFEILEVAGCCDLNRQLAEDTAAAYDISVLTME EILEDSGIEIVVNLTNPAAHYGVIRNLLEHGKHVYTEKVLAATFGQAQKLAALAHEKGRL LCSAPDTFLGAAVQTARYLVESGVIGTVTSCVAVLQRDARLLAEKFPYTSRPGGGIGIDV GIYYTTAMVNILGEVKEVCGMSGVCMPEQSHYFVKNESFGETYVQESETYLAGSLQFKNG CVGTLHFNSRSIRTEKPYVAFYGTEGILFLEDPNFFGGDVKVILKGQTEPVVFPHTHGYD 
GDDRGLGVAEMAWALRKGRVPRTRADMALHSLEVLTGVIESGHTKHFYEMQTGFERQPML PRGYLGGTYAMNQPEGALAL >gi|157101623|gb|DS480701.1| GENE 56 41391 - 42773 1197 460 aa, chain - ## HITS:1 COG:BH1923 KEGG:ns NR:ns ## COG: BH1923 COG2723 # Protein_GI_number: 15614486 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Bacillus halodurans # 17 455 5 440 447 386 44.0 1e-107 MQRYEGNSGKAEAGKHHFPEGFVWGVATASYQIEGAWDEDGRGETIWDRYCSIPGNILDG DDGKTACDHYHRYKEDVALMKQMGIRAYRFSIAWSRILPKGYGEVNQKGLDFYSCLIDEL LDAGIEPYITLYHWDLPQALQDMGGWTNPDMPRYFMEYARIVMDAFHDRVKKWITLNEPY CAAFLGNYEGRQAPGLRDFSAAVQVSYHLYVGHGLAVEYFRKQGYEGEIGITLNLMGRLP LTDSEEDRAAAVRADGYLNRWFAEPIVFGRYPEDMVELYRSKGVRLPEFKEEHMKLIGQK LDFIGLNYYNDFYVKADEHVWPLGFKIENPKHIPINDRNWPVTEQGFTNMLLRMKNEYGI ETIYITENGTSSHDVVSMEGRVEDGPRKDYLHRHLLALWEAVSQGVNVKGYFQWSLYDNF EWSFGYESRFGIVFVDFHTQERIIKESGRWYSGVIRDNAV >gi|157101623|gb|DS480701.1| GENE 57 42788 - 43261 600 157 aa, chain - ## HITS:1 COG:no KEGG:Mahau_2039 NR:ns ## KEGG: Mahau_2039 # Name: not_defined # Def: methylmalonyl-CoA epimerase (EC:5.1.99.1) # Organism: M.australiensis # Pathway: not_defined # 14 156 11 151 151 140 47.0 2e-32 MDKNMDRPIEDMRLAQIGFLVNDIEKTKKEFARFFGVEEPETVNSGEYEITHTEYRGEPA PKAKCYMTFFYFGDLQMELIQPNEEPSIWREHLEQFGEGIHHISFNVKGMQKTISVCEDW GMKLLQRGEYRGANGRYAFMDALDSLKVVVELLERDE >gi|157101623|gb|DS480701.1| GENE 58 43320 - 44330 933 336 aa, chain - ## HITS:1 COG:TM0069 KEGG:ns NR:ns ## COG: TM0069 COG1312 # Protein_GI_number: 15642844 # Func_class: G Carbohydrate transport and metabolism # Function: D-mannonate dehydratase # Organism: Thermotoga maritima # 34 331 35 356 360 150 32.0 4e-36 MRLAYGQIRSVDKEILTNARQMGIDRVQFNLPHDLPADGLWKYEDLARFRDACDRYGVIV EAMENMPISFYDKAMLGLEGRDRQIENVCESIRSLGRAGIPVLGYHFSPSFVWRTDNHAP VGRYGATVQAFDLEMQKQGVDDMDDFGQRRDVAVPDVELLWENFEYFMKRVIPVAEEYNV TMALHPDDPPVKSLSGIARMFIDLDSYKRAEAMIDSPNWGLLFCIGTFSQMEGGARNIFD AIKYFGPRKKLVYAHMRDVRGTVPKFQECFLGEGNFDPFEVVYKLMHSGFDGFLVSDHVP 
GIEGDREWGHKVRYADTAYIKGLMEAIEKMEAAKQV >gi|157101623|gb|DS480701.1| GENE 59 44337 - 45746 1264 469 aa, chain - ## HITS:1 COG:mll2322 KEGG:ns NR:ns ## COG: mll2322 COG4948 # Protein_GI_number: 13472125 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Mesorhizobium loti # 14 469 9 452 452 445 51.0 1e-125 MEYKDRQTNVNYEECLDYVRRSSNPSKLRITDVKFTKVDLPPWGCYLVRIDTNQGISGYG EMRDGASATYLKYLKSRIVGENPCEVDRIFRKIKQFGGPARQAAGVCAIELALWDLAGKA YGVPVYQLLGGRFRDKVRLYADTHIEEGRATGILMEPEKVGRILKGYMDQGFTVVKILSV ELLMAQEGNYCGPLDWVNELRAVEEKVRQVTRTGTRAESSAANAQLYDFNRIPHPFTNMH VTEQGLEQLDEYIGRVRSVIGTKVPLAIDHFGHFPLPDMIKIARRLEKYHLAWLEDMLPW YLTGQYGELKHATTAPIATGEDMYLAESFEPLLQAGALDVVHPDLLTSGGILETKKLGDL AARYGASMALHMCESPVSALAGAHMATASENFFAQEHDAFDSEWWQDLIIGPVKPIVKDG FTVVTDAPGLGIEGLNEDLIREHGPMKNKDVWVSTDEWNQETSLDRIWS >gi|157101623|gb|DS480701.1| GENE 60 45793 - 46647 689 284 aa, chain - ## HITS:1 COG:mlr7227 KEGG:ns NR:ns ## COG: mlr7227 COG0395 # Protein_GI_number: 13476021 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Mesorhizobium loti # 4 283 1 280 280 195 37.0 8e-50 MAKMKKSVTGTSPVWQVCTYAVLLLVVLVAVFPAIWMLSTSIKLPTEQYDIPPQIIPDTP TISNYVNVLTNSKMYLAFINSVIITACVVAVTLFISILAGYGLSRYKFKGHGVLKIALLF GQMIPSVVIIIPLYFLVAKTGLLDTHFSLILADLALTIPMGVIMLSSFFETVPKELEEAA KIDGCTGMGALFRVVLPIAKPGLISVAIYTYIHAWEEFLFALNLSTSTRTRTLPIAIHMF AGEFSVDWGSTMAASAVVAFPVLLIFLSCNKYFVKGMADGAVKG >gi|157101623|gb|DS480701.1| GENE 61 46649 - 47527 867 292 aa, chain - ## HITS:1 COG:mlr2327 KEGG:ns NR:ns ## COG: mlr2327 COG1175 # Protein_GI_number: 13472129 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mesorhizobium loti # 1 283 21 306 319 184 37.0 1e-46 MLRKIKRSGAGYVLPAVLVVAVVMLYPLFYTLIMGFFNNTLFMQTPVFCGISQYKKLFGD 
KVFIGSIGNTLLWTFGSVFFQFGLGFIMALVLHQPFVKGKTFIRILLMIPWVTPSIIGSV VWKWMYNADYGIINFIFCSLGIIETNQTWLSNPKVAMWSVIAVNTWKMFPFILLMIEAAL QSVSKDLKEAAVIDGANILDIFKNVTWPSIAATCYSTILLMIIWTLNAFTFIYAMTEGGP AHKTEVMAMYIYKRAFMDYDFGMASAASAVLFVLSMSVSLVYLYLTREKEGK >gi|157101623|gb|DS480701.1| GENE 62 47625 - 48935 1328 436 aa, chain - ## HITS:1 COG:BH3690 KEGG:ns NR:ns ## COG: BH3690 COG1653 # Protein_GI_number: 15616252 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 10 398 7 388 420 131 28.0 3e-30 MKKAVRRTLSVTVAAAMAMGLMTGCSSGKGEQKPAGEASGPQGQAADSGKQAGAAGEQTT IKFWDGNWQEAVWPEVEAIWNQEHPDIKIEAEFQADLANDKYMLALTNGTAPDVMACALD WVTTFGSANLLEPLDEYVAKDSFDVSQFVQGANDASTVDGKLYGLPFRSETYVMYYNKDI LSAAGYDTPPATWEEVKEVAAACTNGDVYGYGLCGTNYSNFSFQYITMLRSSGSDILNAD NSASELGNSVAVETAQLYKDLAAYSPASLLENDNIANRTLFASGKIAMYLSGIYDLEEIV KTNPDLNFACAMVPTANGAERNTILGGWSVAVAECSKEKEAAWEFVKFLTRPDIAAIYTN TFTGTAEVAARYADYPEDIIKPNAEALQYASALPAVKNIVGIRQAIMDNLQLMLSEDMSA EEASRLLDQAVNGLLE >gi|157101623|gb|DS480701.1| GENE 63 49192 - 50205 884 337 aa, chain + ## HITS:1 COG:VC2677 KEGG:ns NR:ns ## COG: VC2677 COG1609 # Protein_GI_number: 15642672 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 1 314 1 309 335 181 34.0 2e-45 MKTIKDVAAYSGVSLTTVSRVVNGSDKVSPKTKAKVDQAIKELGYFPNNAARSLVKKQTD SIAILLRNIHDPFFTDLIRGIEDASATLGRNIIFCSPGKDEKVRDRYIEYLTNGISDAII LYGSLFTDKAMVEHLQEVGFPFLLIENNFQMMPVNQFLIDNVAGARDAVNYLIRHNHKRI AHVMGNPNKKVILERLNGYIETMQDNGLLIQDYYIQNTLSDSTLSLPKFREMMNRPKGER PTAIFCSNDKIAIMVINELTESGFKVPEDVSVVGFDNLNTYGFSYRGPRITSISQPLYQI GYDSIISIDGVLQGSIETPINKFYPTTLEEHETVCSL >gi|157101623|gb|DS480701.1| GENE 64 50260 - 51105 773 281 aa, chain - ## HITS:1 COG:BS_aprX KEGG:ns NR:ns ## COG: BS_aprX COG1404 # Protein_GI_number: 16078789 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Bacillus subtilis # 3 252 144 410 442 214 
45.0 1e-55 MGITGRGVGVAVLDTGIYLHEDFKDRVTAFADFVNHRTSPYDDNGHGTHIAAMIGGSGIS SDGKYRGVAPGCSLISVKVLDQKGNGYATDVLTGLKWIRENRDRYNIRIVNISVGSLSRR DMTENSILVRGVNAAWDDGLVVVVAAGNHGPGRMTITTPGISRKVITVGCSDDHKEVDVM GSRMVDYSGRGPTLACVCKPDLVAPGSGIISCCNEPGKYFSKSGTSMSTPLVSGAIALLL EKYPDMSNKDVKLRIRERALDLGLPHNQQGWGKLDVGRLLE >gi|157101623|gb|DS480701.1| GENE 65 51283 - 53103 1724 606 aa, chain - ## HITS:1 COG:CAC0887 KEGG:ns NR:ns ## COG: CAC0887 COG1001 # Protein_GI_number: 15894174 # Func_class: F Nucleotide transport and metabolism # Function: Adenine deaminase # Organism: Clostridium acetobutylicum # 7 596 4 564 570 464 42.0 1e-130 MRQYNDLKKMMDVAAGRRKASLVLKGGTIVNVFTERTEIGDIAIEDGCIAGIGEYDGLVN VDMTGRYICPGFIDGHIHIESSMVSPPEFEKAVLPHGTTTVITDPHEIGNVAGCQGVDYM LKATEGLSLDTFFVMPSCVPSTGLDESGAVLGPEDIKPYYENPRVLGLAEVMNSVGVVAG QEDLMGKLTEAGRRGKVIDGHAPFLRGNELNAYVCSGVWSDHECSDAGEALEKLGRGQWI MIREGTAARNLEALMPLFEAPYYERCMLVTDDKHPGDLISMGHIDYIIRRAVSLGADPIR AIKMGTFNAARYFGLKDRGAVMPGLRADLAVLEDLKDIRVAAVYKDGVLTAKEGVCLGAG KEKKRNYAAGEEPRSGSREDTESAEKTAGGLETAFPRVFNSFHMDEVTLEDLVLEQKGAM ERVIQFKPHELLTEERLVPWQNTPGLAPGVSLEQDIVKAAVFERHLHTGHKGLGFVGGYG LKKGAVATSVAHDSHNLIVVGTNDRDMVLAANAVRKNRGGLAVAAEGQVLGELALPIGGV MSRLSVEEVEEQLQALKVLTRQLGISSDIDAFMTLAFVSLPVIPKLRINTYGVIDVDRQK QVPPSF >gi|157101623|gb|DS480701.1| GENE 66 53753 - 54370 585 205 aa, chain - ## HITS:1 COG:CAC3581 KEGG:ns NR:ns ## COG: CAC3581 COG1011 # Protein_GI_number: 15896815 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Clostridium acetobutylicum # 1 190 1 187 201 99 33.0 5e-21 MIKNIVFDMGNVLVDYIADGVCRHFIEDQETRKKVSTSIFVSPEWILLDLGVMPEDMALK KMQARLDTEEERSLAALCFNRWHEFNMYKKEGMEDVVRWLKSMGYGIYLCSNASVRLLSC YKDVLPAVDCFDGILFSADVLCLKPQKEIYGHLFERFHLKPEECFFVDDLELNIEGARRC GMDGYCFDDGNVEKLREVLAGLNRR >gi|157101623|gb|DS480701.1| GENE 67 54474 - 54752 270 92 aa, chain - ## HITS:1 COG:aq_1254 KEGG:ns NR:ns ## COG: aq_1254 COG1862 # Protein_GI_number: 15606479 # Func_class: U Intracellular trafficking, secretion, and vesicular 
transport # Function: Preprotein translocase subunit YajC # Organism: Aquifex aeolicus # 6 85 11 88 102 67 37.0 6e-12 MTSGPMFWVLYIVLLGGMLYFMAIRPQKKEKQRQKELLESVAVGDTILTSSGFYGVIIDM TDDTVIVEFGNNKNCRIPMQKAAIIQVEKPEA >gi|157101623|gb|DS480701.1| GENE 68 54800 - 55960 1153 386 aa, chain - ## HITS:1 COG:CAC2282 KEGG:ns NR:ns ## COG: CAC2282 COG0343 # Protein_GI_number: 15895550 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Clostridium acetobutylicum # 3 375 2 374 376 597 71.0 1e-170 MKYQVITKDGRAKRARMETVHGTVETPVFMNVGTVGAIKGAVSTDDLREIGTQVELSNTY HLHVRTGDKLIREFGGLHRFMNWDRPILTDSGGFQVFSLAGLRKIKEEGVYFNSHIDGHK IFMGPEESMQIQSNLGSTIAMAFDECPPSHASDDYIRHSVERTTRWLARCKTEMERLNSL PDTVNRQQLLFGINQGGVVDEIRMEHARTIREMDLDGYAVGGLAVGESHEEMYHILDVTV PCLPEDKPVYLMGVGTPANILEAVDRGVDFFDCVYPSRNGRHGHVYTNQGKLNLFNQKYE LDSRPIEEGCGCPACRSYSRAYIRHLLKAKEMLGMRLCTLHNLYFYNTMMKEIRDAIEEH RYAEYKAAKLAGMEGSGRAISTASDE >gi|157101623|gb|DS480701.1| GENE 69 55975 - 57375 1562 466 aa, chain - ## HITS:1 COG:CAC2279 KEGG:ns NR:ns ## COG: CAC2279 COG0641 # Protein_GI_number: 15895547 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Clostridium acetobutylicum # 1 463 3 454 454 480 49.0 1e-135 MIHQYINNGYHIVLDVNSGSVHVVDPVVYDAVQLVSQRIPEMEAPAALPSYVKEEVRVCL KDRYSQEDIEEALEEIGQLIEAEQLLTKDIYRDFVVDFKKRKTVVKALCLHIAHDCNLAC RYCFAEEGEYHGRRALMSYEVGKKALDFLIANSGNREHLEVDFFGGEPLMNWDVVKRLVE YGRSQEEAHHKKFRFTLTTNGVLLNDEVMEFCNREMSNVVLSLDGRKDVNDKMRPFRNGS GSYDLIVPKFQKFADSRKQMNYYVRGTFTRNNLDFADDVLHYADLGFEQMSMEPVVADPS EDYAIKEEDIPAILKEYDRLALEYIKRKKEGRGFNFFHFMLDLKAGPCVAKRMAGCGSGT EYLAVTPWGDLYPCHQFVGNEKFLMGNVDTGVVNTDIRDEFKTCNVYAKPGCKDCFARFY CSGGCSANAYNFSGSINGAYDIGCEMQKKRIECAIMIKAALADDEE Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:38:54 2011 Seq name: gi|157101622|gb|DS480702.1| Clostridium bolteae ATCC BAA-613 Scfld_02_43 genomic scaffold, whole genome shotgun sequence Length of 
sequence - 288863 bp Number of predicted genes - 259, with homology - 257 Number of transcription units - 134, operones - 57 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 838 753 ## COG0523 Putative GTPases (G3E family) - Prom 1052 - 1111 6.9 - Term 1101 - 1144 11.4 2 2 Op 1 . - CDS 1271 - 2311 1400 ## COG3804 Uncharacterized conserved protein related to dihydrodipicolinate reductase - Prom 2331 - 2390 2.1 3 2 Op 2 . - CDS 2393 - 3235 1056 ## COG0789 Predicted transcriptional regulators - Prom 3298 - 3357 6.0 - Term 3431 - 3488 17.1 4 3 Op 1 17/0.000 - CDS 3518 - 4183 928 ## COG0569 K+ transport systems, NAD-binding component 5 3 Op 2 . - CDS 4199 - 5452 1301 ## COG0168 Trk-type K+ transport systems, membrane components - Prom 5625 - 5684 5.5 - Term 5683 - 5708 -0.5 6 4 Op 1 42/0.000 - CDS 5714 - 6535 1017 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 7 4 Op 2 25/0.000 - CDS 6532 - 7308 642 ## COG1121 ABC-type Mn/Zn transport systems, ATPase component 8 4 Op 3 . - CDS 7361 - 8374 1089 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 9 4 Op 4 . - CDS 8379 - 9056 814 ## COG0546 Predicted phosphatases - Prom 9105 - 9164 5.4 + Prom 9123 - 9182 7.3 10 5 Tu 1 . + CDS 9225 - 10115 707 ## COG1307 Uncharacterized protein conserved in bacteria + Term 10149 - 10175 -0.7 - Term 9988 - 10022 0.2 11 6 Tu 1 . - CDS 10166 - 10984 832 ## COG2035 Predicted membrane protein - Prom 11047 - 11106 9.5 - Term 11128 - 11183 6.1 12 7 Tu 1 . - CDS 11263 - 11478 196 ## gi|160941114|ref|ZP_02088451.1| hypothetical protein CLOBOL_06007 - Prom 11499 - 11558 7.2 + Prom 11650 - 11709 4.3 13 8 Tu 1 . + CDS 11811 - 13814 1394 ## COG3706 Response regulator containing a CheY-like receiver domain and a GGDEF domain 14 9 Op 1 . - CDS 13838 - 16303 3136 ## COG1410 Methionine synthase I, cobalamin-binding domain 15 9 Op 2 . - CDS 16319 - 17353 1234 ## COG2008 Threonine aldolase 16 9 Op 3 . 
- CDS 17412 - 18062 570 ## Closa_2453 Vitamin B12 dependent methionine synthase activation region 17 10 Tu 1 . - CDS 18221 - 19078 1038 ## COG0685 5,10-methylenetetrahydrofolate reductase - Term 19101 - 19146 11.3 18 11 Tu 1 . - CDS 19153 - 21606 3089 ## COG0058 Glucan phosphorylase - Prom 21646 - 21705 3.2 19 12 Tu 1 . - CDS 21724 - 24858 3639 ## COG0060 Isoleucyl-tRNA synthetase - Prom 24999 - 25058 2.8 - Term 24929 - 24968 8.2 20 13 Tu 1 . - CDS 25071 - 25172 149 ## - Prom 25192 - 25251 6.7 + Prom 25277 - 25336 8.6 21 14 Tu 1 . + CDS 25420 - 26877 1682 ## COG1488 Nicotinic acid phosphoribosyltransferase + Term 26898 - 26929 2.4 - Term 26886 - 26917 3.2 22 15 Tu 1 . - CDS 26982 - 27728 594 ## COG0846 NAD-dependent protein deacetylases, SIR2 family - Prom 27789 - 27848 4.9 23 16 Tu 1 . - CDS 27898 - 29190 1495 ## COG0153 Galactokinase - Prom 29222 - 29281 3.6 + Prom 29276 - 29335 9.3 24 17 Tu 1 . + CDS 29368 - 30369 692 ## COG1242 Predicted Fe-S oxidoreductase - Term 30037 - 30084 4.5 25 18 Op 1 . - CDS 30306 - 31592 1164 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) 26 18 Op 2 . - CDS 31634 - 32494 1063 ## COG1307 Uncharacterized protein conserved in bacteria - Prom 32548 - 32607 10.9 27 19 Tu 1 . - CDS 32878 - 34290 1820 ## CLL_A2949 putative phage lysozyme (EC:3.2.1.17) - Prom 34317 - 34376 5.0 - Term 34348 - 34401 12.6 28 20 Tu 1 . - CDS 34447 - 36471 2561 ## COG0326 Molecular chaperone, HSP90 family - Prom 36532 - 36591 7.6 - Term 36561 - 36601 5.1 29 21 Op 1 . - CDS 36657 - 37403 681 ## BT_1585 hypothetical protein 30 21 Op 2 . - CDS 37431 - 37697 309 ## gi|160941139|ref|ZP_02088476.1| hypothetical protein CLOBOL_06032 - Prom 37857 - 37916 7.3 - Term 37899 - 37936 5.3 31 22 Tu 1 . - CDS 37944 - 38564 573 ## gi|160941140|ref|ZP_02088477.1| hypothetical protein CLOBOL_06033 - Prom 38745 - 38804 4.5 - Term 38662 - 38708 2.1 32 23 Op 1 . 
- CDS 38927 - 39361 153 ## gi|160941143|ref|ZP_02088480.1| hypothetical protein CLOBOL_06036 33 23 Op 2 . - CDS 39373 - 39564 156 ## gi|160941144|ref|ZP_02088481.1| hypothetical protein CLOBOL_06037 - Prom 39641 - 39700 6.2 34 24 Op 1 . - CDS 39734 - 40219 294 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 35 24 Op 2 . - CDS 40281 - 40901 -3 ## CbC4_0465 hypothetical protein - Prom 41047 - 41106 6.8 + Prom 41566 - 41625 4.2 36 25 Tu 1 . + CDS 41648 - 42436 216 ## Closa_1057 hypothetical protein + Term 42614 - 42668 11.1 - Term 42599 - 42659 13.0 37 26 Op 1 . - CDS 42669 - 43889 1411 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) - Prom 43909 - 43968 3.6 38 26 Op 2 . - CDS 44006 - 44488 370 ## Closa_3216 hypothetical protein - Prom 44540 - 44599 7.0 39 27 Tu 1 . - CDS 44751 - 45149 367 ## COG4405 Uncharacterized protein conserved in bacteria 40 28 Tu 1 . - CDS 45263 - 45655 136 ## gi|160941153|ref|ZP_02088490.1| hypothetical protein CLOBOL_06046 - Prom 45820 - 45879 3.8 - Term 45982 - 46039 21.0 41 29 Op 1 21/0.000 - CDS 46064 - 47551 1350 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 42 29 Op 2 . - CDS 47567 - 52108 5049 ## COG0069 Glutamate synthase domain 2 - Prom 52136 - 52195 4.0 - Term 52160 - 52197 6.1 43 30 Tu 1 . - CDS 52248 - 52514 370 ## COG1925 Phosphotransferase system, HPr-related proteins - Prom 52574 - 52633 4.3 44 31 Tu 1 . - CDS 52672 - 53373 468 ## COG1636 Uncharacterized protein conserved in bacteria - Prom 53398 - 53457 3.1 - Term 53471 - 53508 3.0 45 32 Tu 1 . - CDS 53522 - 54667 1395 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) - Prom 54705 - 54764 1.9 - Term 54716 - 54760 2.2 46 33 Op 1 . - CDS 54768 - 55694 1170 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase 47 33 Op 2 . 
- CDS 55735 - 56847 525 ## PROTEIN SUPPORTED gi|46129221|ref|ZP_00155777.2| COG1194: A/G-specific DNA glycosylase 48 33 Op 3 . - CDS 56840 - 57799 796 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 49 33 Op 4 . - CDS 57832 - 58134 362 ## Closa_2492 hypothetical protein 50 33 Op 5 . - CDS 58138 - 59130 997 ## COG0078 Ornithine carbamoyltransferase - Prom 59182 - 59241 2.8 - Term 59182 - 59227 8.4 51 34 Op 1 . - CDS 59259 - 60491 1250 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily 52 34 Op 2 . - CDS 60478 - 60720 255 ## Closa_2496 hypothetical protein 53 34 Op 3 . - CDS 60768 - 61580 238 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 54 34 Op 4 . - CDS 61605 - 63461 1590 ## COG0171 NAD synthase + Prom 63687 - 63746 9.5 55 35 Tu 1 . + CDS 63769 - 64512 664 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 + Term 64625 - 64666 -0.8 - Term 64478 - 64525 12.3 56 36 Op 1 . - CDS 64555 - 65454 852 ## COG1284 Uncharacterized conserved protein 57 36 Op 2 . - CDS 65485 - 65919 380 ## COG1959 Predicted transcriptional regulator - Prom 65958 - 66017 4.7 + Prom 65901 - 65960 4.8 58 37 Tu 1 . + CDS 66141 - 66533 297 ## COG2033 Desulfoferrodoxin + Prom 66790 - 66849 7.5 59 38 Tu 1 . + CDS 66964 - 68373 1140 ## Closa_1011 GerA spore germination protein + Term 68412 - 68445 6.1 - Term 68390 - 68439 10.1 60 39 Tu 1 . - CDS 68440 - 69465 771 ## BCAH820_4940 hypothetical protein - Prom 69600 - 69659 7.2 - Term 69676 - 69734 9.3 61 40 Tu 1 . - CDS 69762 - 71429 1621 ## COG4166 ABC-type oligopeptide transport system, periplasmic component - Prom 71478 - 71537 5.6 62 41 Op 1 35/0.000 - CDS 71603 - 73489 223 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 63 41 Op 2 4/0.000 - CDS 73479 - 75239 219 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P - Prom 75266 - 75325 1.6 - Term 75279 - 75329 -0.7 64 41 Op 3 . 
- CDS 75337 - 75762 549 ## COG1846 Transcriptional regulators - Prom 75811 - 75870 4.5 + Prom 75914 - 75973 6.5 65 42 Op 1 . + CDS 76010 - 76729 634 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) 66 42 Op 2 . + CDS 76719 - 77024 400 ## Closa_1065 branched-chain amino acid transport 67 43 Tu 1 . - CDS 77052 - 78350 1339 ## Closa_3037 hypothetical protein - Prom 78373 - 78432 5.8 - Term 78473 - 78515 9.1 68 44 Op 1 . - CDS 78543 - 79781 1239 ## COG1301 Na+/H+-dicarboxylate symporters 69 44 Op 2 . - CDS 79809 - 80987 841 ## COG0787 Alanine racemase 70 44 Op 3 . - CDS 80991 - 83240 1348 ## COG1048 Aconitase A - Prom 83359 - 83418 6.5 + Prom 83267 - 83326 5.7 71 45 Op 1 . + CDS 83391 - 83483 71 ## 72 45 Op 2 . + CDS 83498 - 85393 1379 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Term 85301 - 85325 -1.0 73 46 Tu 1 . - CDS 85398 - 87620 1521 ## COG1048 Aconitase A 74 47 Op 1 . - CDS 87698 - 87817 58 ## gi|160941193|ref|ZP_02088530.1| hypothetical protein CLOBOL_06086 75 47 Op 2 . - CDS 87831 - 88181 512 ## EUBELI_20051 hypothetical protein - Term 88205 - 88255 14.0 76 48 Tu 1 . - CDS 88284 - 90230 2315 ## COG1297 Predicted membrane protein - Prom 90404 - 90463 8.2 - Term 90425 - 90493 16.0 77 49 Op 1 . - CDS 90514 - 91494 949 ## COG0673 Predicted dehydrogenases and related proteins 78 49 Op 2 . - CDS 91570 - 93519 1871 ## COG2199 FOG: GGDEF domain 79 49 Op 3 . - CDS 93497 - 94936 1258 ## CLOST_0658 conserved exported protein of unknown function - Prom 95004 - 95063 8.7 - Term 95275 - 95309 3.4 80 50 Tu 1 . - CDS 95319 - 95525 221 ## COG2155 Uncharacterized conserved protein - Prom 95579 - 95638 2.5 81 51 Op 1 . - CDS 95684 - 98416 2663 ## COG0480 Translation elongation factors (GTPases) 82 51 Op 2 . - CDS 98449 - 100461 1581 ## COG0243 Anaerobic dehydrogenases, typically selenocysteine-containing 83 52 Tu 1 . 
- CDS 100636 - 101580 795 ## COG0053 Predicted Co/Zn/Cd cation transporters 84 53 Tu 1 . - CDS 102020 - 103006 1020 ## Closa_0123 diaminopimelate dehydrogenase (EC:1.4.1.16) - Prom 103050 - 103109 5.6 85 54 Op 1 . - CDS 103130 - 104677 1025 ## gi|160941210|ref|ZP_02088547.1| hypothetical protein CLOBOL_06103 86 54 Op 2 . - CDS 104771 - 105547 706 ## COG0483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family 87 54 Op 3 . - CDS 105633 - 106802 1013 ## COG3858 Predicted glycosyl hydrolase 88 54 Op 4 7/0.000 - CDS 106855 - 108210 1534 ## COG0534 Na+-driven multidrug efflux pump 89 54 Op 5 . - CDS 108200 - 108637 433 ## COG1846 Transcriptional regulators - Prom 108684 - 108743 6.4 + Prom 108746 - 108805 8.4 90 55 Tu 1 . + CDS 108840 - 109751 683 ## Clole_2712 hypothetical protein + Prom 109780 - 109839 7.2 91 56 Tu 1 . + CDS 109869 - 110228 346 ## Closa_2012 hypothetical protein + Term 110293 - 110332 2.4 + Prom 110330 - 110389 5.2 92 57 Op 1 . + CDS 110524 - 111015 472 ## bpr_I0833 TetR family transcriptional regulator 93 57 Op 2 . + CDS 111041 - 111865 713 ## CD0692 hypothetical protein + Term 111986 - 112025 0.6 94 58 Op 1 . - CDS 111895 - 112659 403 ## CPF_0998 CAAX amino terminal protease family protein 95 58 Op 2 . - CDS 112698 - 113855 1390 ## Thit_0785 isoaspartyl dipeptidase - Prom 113909 - 113968 5.3 - Term 113929 - 113972 8.5 96 59 Op 1 . - CDS 113991 - 115130 1270 ## COG4225 Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins 97 59 Op 2 11/0.000 - CDS 115148 - 116455 1035 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 98 59 Op 3 11/0.000 - CDS 116457 - 116927 264 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 99 59 Op 4 . - CDS 116944 - 117966 433 ## PROTEIN SUPPORTED gi|149195933|ref|ZP_01872989.1| Ribosomal protein L22 - Prom 118134 - 118193 4.5 + Prom 118040 - 118099 4.8 100 60 Tu 1 . 
+ CDS 118183 - 119112 951 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 119139 - 119188 5.2 101 61 Op 1 5/0.000 - CDS 119242 - 120069 972 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 102 61 Op 2 1/0.219 - CDS 120158 - 121408 1226 ## COG4806 L-rhamnose isomerase 103 61 Op 3 4/0.000 - CDS 121457 - 121822 377 ## COG3254 Uncharacterized conserved protein 104 61 Op 4 . - CDS 121829 - 123250 1360 ## COG1070 Sugar (pentulose and hexulose) kinases - Prom 123478 - 123537 7.1 - Term 123477 - 123521 5.1 105 62 Tu 1 . - CDS 123574 - 124542 1137 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 124620 - 124679 10.0 106 63 Op 1 . - CDS 124811 - 125779 707 ## COG1052 Lactate dehydrogenase and related dehydrogenases 107 63 Op 2 1/0.219 - CDS 125798 - 126433 480 ## COG1802 Transcriptional regulators - Prom 126471 - 126530 4.4 108 64 Op 1 . - CDS 126538 - 127602 972 ## COG0473 Isocitrate/isopropylmalate dehydrogenase 109 64 Op 2 . - CDS 127625 - 128569 366 ## COG0679 Predicted permeases 110 64 Op 3 6/0.000 - CDS 128571 - 129977 775 ## COG3051 Citrate lyase, alpha subunit 111 64 Op 4 6/0.000 - CDS 129970 - 130806 758 ## COG2301 Citrate lyase beta subunit 112 64 Op 5 . - CDS 130845 - 131120 186 ## COG3052 Citrate lyase, gamma subunit 113 64 Op 6 . - CDS 131191 - 132090 595 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 114 64 Op 7 11/0.000 - CDS 132110 - 132730 245 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase 115 64 Op 8 11/0.000 - CDS 132721 - 133620 754 ## COG1951 Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain 116 64 Op 9 11/0.000 - CDS 133638 - 134258 187 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase 117 64 Op 10 . 
- CDS 134255 - 135154 243 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase 118 64 Op 11 11/0.000 - CDS 135176 - 136453 1239 ## COG1593 TRAP-type C4-dicarboxylate transport system, large permease component 119 64 Op 12 11/0.000 - CDS 136450 - 136944 473 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 120 64 Op 13 1/0.219 - CDS 136960 - 138048 450 ## PROTEIN SUPPORTED gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 - Prom 138140 - 138199 7.0 121 65 Tu 1 . - CDS 138218 - 138838 455 ## COG1802 Transcriptional regulators - Prom 138898 - 138957 5.6 - Term 138856 - 138891 5.1 122 66 Tu 1 . - CDS 138978 - 140894 1264 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain - Prom 141015 - 141074 4.3 - Term 141109 - 141166 10.2 123 67 Tu 1 . - CDS 141214 - 141612 408 ## COG5015 Uncharacterized conserved protein - Prom 141755 - 141814 10.0 + Prom 141714 - 141773 6.7 124 68 Tu 1 . + CDS 141861 - 143834 1886 ## COG3808 Inorganic pyrophosphatase + Term 143859 - 143895 7.1 - Term 143847 - 143882 5.5 125 69 Op 1 . - CDS 143907 - 144719 677 ## COG1609 Transcriptional regulators 126 69 Op 2 . - CDS 144737 - 144943 128 ## Closa_3272 LacI family transcriptional regulator 127 69 Op 3 . - CDS 144979 - 146130 704 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Term 146147 - 146187 6.2 128 70 Op 1 . - CDS 146195 - 146938 799 ## COG5426 Uncharacterized membrane protein 129 70 Op 2 11/0.000 - CDS 146959 - 147936 1018 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 130 70 Op 3 . - CDS 147933 - 148910 1036 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 131 70 Op 4 . 
- CDS 148968 - 150266 1119 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 132 70 Op 5 16/0.000 - CDS 150270 - 151802 207 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 - Term 151816 - 151850 -0.5 133 70 Op 6 . - CDS 151912 - 153018 967 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 153086 - 153145 12.2 - Term 153366 - 153414 12.1 134 71 Op 1 . - CDS 153503 - 153901 390 ## Dred_2181 hypothetical protein 135 71 Op 2 6/0.000 - CDS 153886 - 154479 463 ## COG0163 3-polyprenyl-4-hydroxybenzoate decarboxylase 136 71 Op 3 . - CDS 154476 - 155963 1153 ## COG0043 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases - Prom 156092 - 156151 6.5 + Prom 156093 - 156152 7.2 137 72 Tu 1 . + CDS 156253 - 157131 195 ## PROTEIN SUPPORTED gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 138 73 Tu 1 . - CDS 157119 - 157886 703 ## COG1878 Predicted metal-dependent hydrolase - Prom 157934 - 157993 8.6 139 74 Op 1 . - CDS 158005 - 158595 396 ## CPE0456 hypothetical protein 140 74 Op 2 . - CDS 158534 - 159802 817 ## COG0438 Glycosyltransferase 141 74 Op 3 . - CDS 159814 - 160323 171 ## PROTEIN SUPPORTED gi|223039927|ref|ZP_03610210.1| ribosomal protein L22 + Prom 160422 - 160481 7.8 142 75 Tu 1 . + CDS 160611 - 161903 1299 ## COG0477 Permeases of the major facilitator superfamily - Term 161895 - 161920 -0.8 143 76 Tu 1 . - CDS 161937 - 163496 1400 ## gi|160941275|ref|ZP_02088612.1| hypothetical protein CLOBOL_06168 + Prom 163723 - 163782 7.7 144 77 Op 1 . + CDS 163923 - 164657 848 ## COG1335 Amidases related to nicotinamidase 145 77 Op 2 . + CDS 164781 - 166136 1191 ## COG0044 Dihydroorotase and related cyclic amidohydrolases + Term 166167 - 166221 12.2 - Term 166154 - 166209 12.2 146 78 Tu 1 . 
- CDS 166263 - 166814 359 ## ELI_2227 hypothetical protein - Prom 166975 - 167034 6.6 + Prom 166920 - 166979 5.8 147 79 Op 1 11/0.000 + CDS 167067 - 168092 492 ## PROTEIN SUPPORTED gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 148 79 Op 2 11/0.000 + CDS 168096 - 168599 254 ## PROTEIN SUPPORTED gi|90020580|ref|YP_526407.1| ribosomal protein S3 149 79 Op 3 . + CDS 168596 - 169903 1124 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16 + Term 169960 - 170017 15.1 - Term 169948 - 170004 18.7 150 80 Op 1 . - CDS 170028 - 171236 1403 ## COG1301 Na+/H+-dicarboxylate symporters - Prom 171276 - 171335 8.6 - Term 171281 - 171325 -0.8 151 80 Op 2 . - CDS 171392 - 172813 968 ## Daud_0731 WD40 domain-containing protein - Prom 172838 - 172897 4.8 152 81 Tu 1 . - CDS 172937 - 173221 302 ## Cphy_2002 YCII-related - Prom 173245 - 173304 7.3 + Prom 173270 - 173329 7.3 153 82 Tu 1 . + CDS 173489 - 173764 261 ## ELI_0302 hypothetical protein + Term 173775 - 173822 4.2 + Prom 173838 - 173897 5.0 154 83 Tu 1 . + CDS 173946 - 174539 213 ## COG4332 Uncharacterized protein conserved in bacteria - Term 174347 - 174385 2.7 155 84 Tu 1 . - CDS 174517 - 174999 468 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit - Prom 175036 - 175095 2.8 156 85 Op 1 . - CDS 175100 - 176647 842 ## Corgl_1526 hypothetical protein 157 85 Op 2 . - CDS 176706 - 179123 2268 ## COG0642 Signal transduction histidine kinase 158 85 Op 3 . - CDS 179143 - 179511 486 ## ELI_3416 hypothetical protein 159 85 Op 4 5/0.000 - CDS 179516 - 182743 3255 ## COG0642 Signal transduction histidine kinase 160 85 Op 5 . 
- CDS 182747 - 184882 1562 ## COG2200 FOG: EAL domain + Prom 185141 - 185200 9.5 161 86 Op 1 12/0.000 + CDS 185376 - 187802 2272 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 162 86 Op 2 15/0.000 + CDS 187813 - 188703 787 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 163 86 Op 3 2/0.031 + CDS 188696 - 189196 638 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 164 86 Op 4 . + CDS 189223 - 190563 400 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Term 190576 - 190635 24.3 + Prom 190726 - 190785 10.7 165 87 Op 1 7/0.000 + CDS 190907 - 192115 1211 ## COG0448 ADP-glucose pyrophosphorylase 166 87 Op 2 . + CDS 192109 - 193254 1299 ## COG0448 ADP-glucose pyrophosphorylase + Term 193273 - 193329 12.5 - Term 193267 - 193310 2.2 167 88 Op 1 . - CDS 193342 - 193998 725 ## COG0546 Predicted phosphatases 168 88 Op 2 . - CDS 194014 - 194712 537 ## COG1768 Predicted phosphohydrolase 169 88 Op 3 . - CDS 194775 - 195446 568 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Term 195458 - 195495 1.1 170 89 Op 1 . - CDS 195566 - 195727 100 ## gi|160941307|ref|ZP_02088644.1| hypothetical protein CLOBOL_06200 171 89 Op 2 12/0.000 - CDS 195795 - 196238 493 ## COG3610 Uncharacterized conserved protein 172 89 Op 3 . - CDS 196235 - 196990 744 ## COG2966 Uncharacterized conserved protein - Prom 197158 - 197217 5.2 173 90 Tu 1 . - CDS 197252 - 197662 441 ## Clole_2877 hypothetical protein - Prom 197695 - 197754 9.8 174 91 Tu 1 . - CDS 197765 - 199126 498 ## COG0534 Na+-driven multidrug efflux pump - Prom 199176 - 199235 5.6 + Prom 199127 - 199186 8.8 175 92 Tu 1 . + CDS 199289 - 199912 222 ## COG1309 Transcriptional regulator 176 93 Op 1 . 
- CDS 200094 - 202694 2025 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain 177 93 Op 2 . - CDS 202709 - 204292 1265 ## COG2199 FOG: GGDEF domain - Prom 204336 - 204395 5.8 + Prom 204465 - 204524 4.8 178 94 Op 1 4/0.000 + CDS 204555 - 205328 553 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 179 94 Op 2 . + CDS 205330 - 206478 1259 ## COG1775 Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB 180 95 Tu 1 . - CDS 206473 - 206694 180 ## gi|160941319|ref|ZP_02088656.1| hypothetical protein CLOBOL_06212 - Prom 206825 - 206884 3.5 + Prom 206784 - 206843 7.4 181 96 Tu 1 . + CDS 206877 - 207659 651 ## COG0778 Nitroreductase + Term 207783 - 207842 1.4 182 97 Tu 1 . - CDS 207675 - 208487 820 ## Clole_1220 xylose isomerase domain-containing protein TIM barrel - Prom 208593 - 208652 4.4 183 98 Op 1 . - CDS 208688 - 209839 1029 ## COG1408 Predicted phosphohydrolases 184 98 Op 2 . - CDS 209841 - 210740 733 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 210810 - 210869 7.8 + Prom 210772 - 210831 4.4 185 99 Tu 1 . + CDS 210877 - 211362 304 ## lwe0370 glyoxalase family protein (EC:4.4.1.5) - Term 211358 - 211401 -0.8 186 100 Tu 1 . - CDS 211429 - 212619 280 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase - Prom 212644 - 212703 6.0 - Term 212679 - 212722 8.3 187 101 Op 1 . - CDS 212734 - 214263 1151 ## COG0165 Argininosuccinate lyase 188 101 Op 2 . - CDS 214303 - 215445 1099 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 189 101 Op 3 38/0.000 - CDS 215457 - 216317 762 ## COG0395 ABC-type sugar transport system, permease component 190 101 Op 4 . - CDS 216310 - 217212 608 ## COG1175 ABC-type sugar transport systems, permease components - Term 217227 - 217273 4.4 191 102 Op 1 . 
- CDS 217285 - 217602 289 ## gi|160941331|ref|ZP_02088668.1| hypothetical protein CLOBOL_06224 192 102 Op 2 1/0.219 - CDS 217653 - 218684 807 ## COG1653 ABC-type sugar transport system, periplasmic component 193 102 Op 3 . - CDS 218709 - 219812 907 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily - Prom 219929 - 219988 9.8 - Term 220032 - 220073 9.1 194 103 Op 1 . - CDS 220091 - 221167 984 ## COG0371 Glycerol dehydrogenase and related enzymes 195 103 Op 2 . - CDS 221204 - 222349 1119 ## COG3964 Predicted amidohydrolase 196 103 Op 3 . - CDS 222369 - 223781 1571 ## COG0471 Di- and tricarboxylate transporters 197 103 Op 4 . - CDS 223818 - 225134 1177 ## COG0786 Na+/glutamate symporter 198 103 Op 5 . - CDS 225216 - 226244 1088 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases - Prom 226268 - 226327 6.1 199 103 Op 6 . - CDS 226336 - 227037 856 ## COG0684 Demethylmenaquinone methyltransferase - Prom 227122 - 227181 5.8 + Prom 227086 - 227145 8.6 200 104 Op 1 . + CDS 227208 - 227789 701 ## COG0583 Transcriptional regulator 201 104 Op 2 . + CDS 227777 - 228187 314 ## Acfer_2032 transcriptional regulator, LysR family + Term 228191 - 228248 9.9 - Term 228179 - 228235 13.5 202 105 Tu 1 . - CDS 228250 - 228990 242 ## COG5632 N-acetylmuramoyl-L-alanine amidase - Prom 229191 - 229250 6.4 + Prom 229243 - 229302 5.2 203 106 Tu 1 . + CDS 229513 - 231234 1530 ## COG1283 Na+/phosphate symporter + Term 231469 - 231511 3.1 + Prom 231330 - 231389 6.4 204 107 Tu 1 . + CDS 231531 - 232199 314 ## COG4300 Predicted permease, cadmium resistance protein + Prom 232307 - 232366 4.9 205 108 Tu 1 . + CDS 232386 - 233369 298 ## gi|160941350|ref|ZP_02088687.1| hypothetical protein CLOBOL_06243 - Term 233405 - 233441 3.9 206 109 Op 1 . 
- CDS 233520 - 234839 1294 ## COG0427 Acetyl-CoA hydrolase 207 109 Op 2 22/0.000 - CDS 234836 - 235399 595 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 208 109 Op 3 23/0.000 - CDS 235396 - 236148 731 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 209 109 Op 4 . - CDS 236154 - 237215 1125 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 210 109 Op 5 . - CDS 237272 - 237475 313 ## gi|160941355|ref|ZP_02088692.1| hypothetical protein CLOBOL_06248 211 109 Op 6 . - CDS 237553 - 238218 688 ## Daud_1610 hypothetical protein - Prom 238380 - 238439 7.3 + Prom 238324 - 238383 6.6 212 110 Tu 1 . + CDS 238601 - 239998 1162 ## COG0471 Di- and tricarboxylate transporters + Term 240027 - 240054 -0.8 + Prom 240031 - 240090 7.2 213 111 Tu 1 . + CDS 240127 - 240858 599 ## COG1802 Transcriptional regulators + Prom 240899 - 240958 8.8 214 112 Tu 1 . + CDS 241032 - 241658 716 ## TepRe1_0427 selenium metabolism protein YedF - Term 241684 - 241738 3.1 215 113 Tu 1 . - CDS 241784 - 242773 766 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase 216 114 Op 1 8/0.000 - CDS 242914 - 243954 858 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 217 114 Op 2 36/0.000 - CDS 243944 - 244759 725 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 218 114 Op 3 . - CDS 244778 - 245644 638 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 219 115 Tu 1 . - CDS 245778 - 247118 1483 ## COG4134 ABC-type uncharacterized transport system, periplasmic component - Prom 247140 - 247199 4.6 220 116 Op 1 . - CDS 247237 - 248367 1048 ## Closa_1178 Rhodanese domain protein 221 116 Op 2 . - CDS 248416 - 249417 889 ## Closa_1177 aminoglycoside phosphotransferase 222 116 Op 3 . 
- CDS 249424 - 250755 1066 ## COG3222 Uncharacterized protein conserved in bacteria 223 117 Op 1 . - CDS 250928 - 251374 278 ## Closa_1171 hypothetical protein - Term 251390 - 251427 -0.8 224 117 Op 2 . - CDS 251444 - 252586 875 ## COG2897 Rhodanese-related sulfurtransferase - Prom 252648 - 252707 7.4 - Term 252830 - 252888 14.6 225 118 Op 1 3/0.000 - CDS 252936 - 253847 742 ## COG0280 Phosphotransacetylase 226 118 Op 2 . - CDS 253840 - 254943 794 ## COG3426 Butyrate kinase 227 118 Op 3 23/0.000 - CDS 254936 - 255817 652 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 228 118 Op 4 14/0.000 - CDS 255810 - 257009 1098 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 229 118 Op 5 14/0.000 - CDS 257002 - 257313 272 ## COG1144 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, delta subunit 230 118 Op 6 . - CDS 257270 - 257818 547 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 231 118 Op 7 . - CDS 257838 - 258746 905 ## COG1893 Ketopantoate reductase 232 119 Op 1 2/0.031 - CDS 258871 - 259779 881 ## COG0388 Predicted amidohydrolase 233 119 Op 2 2/0.031 - CDS 259795 - 260991 1012 ## COG0477 Permeases of the major facilitator superfamily - Term 261336 - 261385 6.1 234 120 Tu 1 . - CDS 261414 - 262178 539 ## COG1414 Transcriptional regulator - Prom 262207 - 262266 7.6 235 121 Tu 1 . - CDS 262287 - 263693 1499 ## COG0534 Na+-driven multidrug efflux pump - Prom 263738 - 263797 4.9 + Prom 264219 - 264278 6.3 236 122 Tu 1 . + CDS 264329 - 265237 757 ## COG0789 Predicted transcriptional regulators + Term 265256 - 265287 -0.7 + Prom 265293 - 265352 10.7 237 123 Op 1 . 
+ CDS 265442 - 266590 1043 ## COG3199 Uncharacterized conserved protein 238 123 Op 2 12/0.000 + CDS 266626 - 268032 1579 ## COG0403 Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain 239 123 Op 3 . + CDS 268049 - 269641 1421 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain 240 123 Op 4 . + CDS 269674 - 271086 1728 ## COG2211 Na+/melibiose symporter and related transporters - Term 271136 - 271183 4.5 241 124 Op 1 . - CDS 271191 - 272435 1309 ## COG0281 Malic enzyme 242 124 Op 2 . - CDS 272469 - 273911 1633 ## COG1027 Aspartate ammonia-lyase - Prom 274002 - 274061 7.6 + Prom 273928 - 273987 7.3 243 125 Tu 1 . + CDS 274055 - 275005 853 ## COG0583 Transcriptional regulator + Term 275184 - 275221 0.0 - Term 274974 - 275021 10.1 244 126 Op 1 2/0.031 - CDS 275038 - 275904 911 ## COG0191 Fructose/tagatose bisphosphate aldolase 245 126 Op 2 16/0.000 - CDS 275924 - 276913 1166 ## COG1082 Sugar phosphate isomerases/epimerases 246 126 Op 3 2/0.031 - CDS 276916 - 278064 1181 ## COG0673 Predicted dehydrogenases and related proteins 247 126 Op 4 . - CDS 278116 - 279282 961 ## COG0524 Sugar kinases, ribokinase family 248 126 Op 5 . - CDS 279379 - 280413 1157 ## COG2115 Xylose isomerase 249 126 Op 6 . - CDS 280503 - 280670 72 ## gi|160941400|ref|ZP_02088737.1| hypothetical protein CLOBOL_06293 + Prom 280489 - 280548 2.5 250 127 Tu 1 . + CDS 280581 - 281531 520 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 281250 - 281288 5.0 251 128 Tu 1 . - CDS 281537 - 282589 738 ## gi|160941402|ref|ZP_02088739.1| hypothetical protein CLOBOL_06295 - Prom 282631 - 282690 10.1 + Prom 282715 - 282774 7.8 252 129 Tu 1 . + CDS 282811 - 283371 445 ## COG4430 Uncharacterized protein conserved in bacteria - Term 283227 - 283269 1.4 253 130 Tu 1 . - CDS 283395 - 284621 770 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 254 131 Tu 1 . 
- CDS 284955 - 285566 546 ## EUBREC_1092 hypothetical protein - Prom 285800 - 285859 7.2 + Prom 285698 - 285757 5.2 255 132 Tu 1 . + CDS 285829 - 286671 649 ## COG1307 Uncharacterized protein conserved in bacteria 256 133 Op 1 . - CDS 286783 - 286917 175 ## gi|160941409|ref|ZP_02088746.1| hypothetical protein CLOBOL_06302 257 133 Op 2 . - CDS 286950 - 287774 670 ## COG0566 rRNA methylases - Prom 287858 - 287917 9.3 - Term 287933 - 287962 -0.3 258 134 Op 1 . - CDS 287974 - 288450 401 ## gi|160941412|ref|ZP_02088749.1| hypothetical protein CLOBOL_06305 259 134 Op 2 . - CDS 288490 - 288852 257 ## gi|160941413|ref|ZP_02088750.1| hypothetical protein CLOBOL_06306 Predicted protein(s) >gi|157101622|gb|DS480702.1| GENE 1 1 - 838 753 279 aa, chain - ## HITS:1 COG:FN0779 KEGG:ns NR:ns ## COG: FN0779 COG0523 # Protein_GI_number: 19704114 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Fusobacterium nucleatum # 3 197 2 193 294 145 36.0 8e-35 MTKIDIISGFLGAGKTTFIKKLLEEAIAGEQVVLIENEFGEIGIDGGFLKDSGIEIREMN SGCICCSLVGDFGTSLAEVLTQYKPERIIIEPSGVGKLSDVMKAVIDVSADMDVELNSAV TIVDAAKCKMYMKNFGEFFNNQIENAGTIVLSRTDITDASKIQKDVDMIREKNANAVIIT TPLDQLGGSQLLEIIEKKDTMLDDLLAEVRESRHHDGHGEECCGHHHGHDEECHEHHHDH DEECHEHHHDHDEECHEHHHHGHDEECHEHHHGHDEECH >gi|157101622|gb|DS480702.1| GENE 2 1271 - 2311 1400 346 aa, chain - ## HITS:1 COG:mll2179 KEGG:ns NR:ns ## COG: mll2179 COG3804 # Protein_GI_number: 13472019 # Func_class: S Function unknown # Function: Uncharacterized conserved protein related to dihydrodipicolinate reductase # Organism: Mesorhizobium loti # 12 323 3 314 329 124 26.0 3e-28 MMNCEKIRVVQYGCGKMAKYILRYLYEKGAEIVGAIDVNPAVVGMDVGDFAGLGTKLGVV IREDADAVLDECDADIAVVTLFSFMSDVYPYFEKCVSRGINVVTTCEEAIYPWTTSSAVT NKLDKLAKETGCTIVGAGMQDIFWINMIGCVAGGVHRINRIEGATSYNVEDYGLALAKAH GVGLTPEEFEAQIAHPETLEPCYVWNSNEALCNKMGWTIKSQSQKCVPYFYPTDLYSETM GMTIPKGNCIGMSAVVTTETFQGPVIETQCIGKVYGPDDGDLCDWKIIGEPDTTFFVQKP ATVEHTCATIVNRIPTILNAPAGYITVEKLEDVEYLSYPMHMYLDM 
>gi|157101622|gb|DS480702.1| GENE 3 2393 - 3235 1056 280 aa, chain - ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 3 113 4 114 117 80 38.0 3e-15 MKELSIGQMARLNGLSEQTLRLYDKAGLFSPMYRDAENGYRYYDIRQSAQLDMIQHMKAL GMSLKDIREQMTHFELDMFKTILHRNLEELELRSRELMYQRRAIERTLESYEWYENAPPD GTIVLEYIPRRLTYMTDSGVNFYDYDIDVYEKILRDLKENLIANQLSPIYFYNAGTVMRR EHLMEGRYYSTEIFVLVDRGFVKDSLITEIPASTYLCIYCNGFEREKEYMGRLVEEIRTK EYQVTGDYICEVVAEVPMDMQERGMFLRLQVPVSFRKNNP >gi|157101622|gb|DS480702.1| GENE 4 3518 - 4183 928 221 aa, chain - ## HITS:1 COG:lin1022 KEGG:ns NR:ns ## COG: lin1022 COG0569 # Protein_GI_number: 16800091 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Listeria innocua # 6 213 7 214 219 148 37.0 8e-36 MKSILVIGIGRFGQHLCENLAKHDNQIMAVDISEEKLEPVLPYVVSAKIGDCTNEAVLKS LGIGNFDICFICIGNNFQNSLEVTSLIKEIGAKYVVSKANRNIHARLLLKNGADEVVYPD RDVAEKVAVRYSANNVFDYTELADGISIYEIKPLPQWVGKSIKDSDIRRLYGISVIAVKN PDGHMRFMPGADHVIAGDNHLMVIGKQEDAEKIVKHLTKNV >gi|157101622|gb|DS480702.1| GENE 5 4199 - 5452 1301 417 aa, chain - ## HITS:1 COG:BS_yubG KEGG:ns NR:ns ## COG: BS_yubG COG0168 # Protein_GI_number: 16080162 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus subtilis # 1 403 36 432 445 248 38.0 2e-65 MLPIASRDGQSEPFLNCLFTATSASCVTGLVVADTWTQWSLFGQMVLLVLIQLGGLGFIS IGIFLSIVLRRKIGLKERGLMQESVNTLQIGGMVKLAKKIIIGTAVFEGAGAVILACRFI PEYGLVHGTWYGIFHSVSAFCNAGFDLMGHTSQYSSLCNYEGDWVVIGTISFLIITGGIG FIVWDDLSRKKLDFRHYMLHTKIVLVTTAVLLITSTILFYLMERNNILVGMNGSESFLAC FFSAVTPRTAGFNNVDTAALTDGSKFLSAILMFIGGSPGSTAGGIKTSTLAVLLLYVHSN IRQTYGVEIFGRRLEDESIRQSACILTINLGLMLAATIAIMVSQNLPMSDVFFETCSAIG TSGMSTGVTRSLNSFSRIVIILLMYCGRIGSLSFALAFTRSNRKPHVQLPAERITIG >gi|157101622|gb|DS480702.1| GENE 6 5714 - 6535 1017 273 aa, chain - ## HITS:1 
COG:lin0193 KEGG:ns NR:ns ## COG: lin0193 COG1108 # Protein_GI_number: 16799270 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Listeria innocua # 6 261 2 258 268 121 35.0 2e-27 MNLIREMLSYPFLVRALAGGMMVSLCASLLGVSLVLRRYSMIGDGLSHVSFGALSIALAM GWSPLKISIPVVVLAAFFLLRITENSRIKSDAAIALISASALALGIIVTSLTTGMTTDVS SYMFGSILAMSREDVWLSVVLSLVVLGLFVICYNRIFAVTFDENFAKATGVNVGRYNIII AVLTAVTIVLGMRMMGAMLISSLVIFPCLSSMRVFKSFGGVMVSSGLLSLACFFAGMVAS YQFSIPAGASVVVVNLCAFLLFMAWQSLGRLRG >gi|157101622|gb|DS480702.1| GENE 7 6532 - 7308 642 258 aa, chain - ## HITS:1 COG:lin1485 KEGG:ns NR:ns ## COG: lin1485 COG1121 # Protein_GI_number: 16800553 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn/Zn transport systems, ATPase component # Organism: Listeria innocua # 1 235 1 235 257 164 35.0 1e-40 MSPLIQCQHVDFGYDNQDAVTDVNMEVDPGDYLCIVGENGSGKSTLMKGLLGLIKPTGGT LVTADELKRTGIGYLPQQTAAQKDFPATVSEVVLSGCLNRRGMLPFYSKKEKEIADRNME KLGITPIRKQCYRELSGGQQQRVLIARALCATSSLLILDEPITGLDPTAIQEFYTMIRRL NREDKVAILMVSHDLRNAVEEANKILHLQKRVLFYGPAHDYMNSQAAGHFFHEKEQGRCK PTAGCHTVKLHMHKEDKS >gi|157101622|gb|DS480702.1| GENE 8 7361 - 8374 1089 337 aa, chain - ## HITS:1 COG:lin0191 KEGG:ns NR:ns ## COG: lin0191 COG0803 # Protein_GI_number: 16799268 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Listeria innocua # 10 333 4 308 312 162 34.0 9e-40 MPLRYRGITWLMALLGFAVMVISIWGCAAGQRPLESGKASAAGQEGKDDRILVVTTIFPY YDFVRQIAGDRVKLKLVVPAGMDSHSFEPTPADMIAMQEADVLICNGGEMEQWVEKVLGS LDTSHMKVLTMMDYVDVVEEEHVEGMEEEEEHRHEDGFEMHIEYDEHIWTSPVNARAIVK IISQTLSEAAPREKSRFEDNTQAYLKELKELDSQFRQVVNQGRLHMIVVADKFPFRYFAD EYGLSYRAAFSGCSGDTEPSARTIAYLIDKVKEDRIPAVYYLELSSHRTAEIIQEETGAM PLLLHSCHNVTRKQFDEGVTYLQLMKQNVENLRYGLE >gi|157101622|gb|DS480702.1| GENE 9 8379 - 9056 814 225 aa, chain - ## HITS:1 COG:TP0554 KEGG:ns NR:ns ## COG: TP0554 COG0546 # Protein_GI_number: 
15639543 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Treponema pallidum # 1 218 4 220 222 138 35.0 8e-33 MYKCCIFDLDGTLVNSIYAIQKSVNDTLSYWNMREISVEESRLYVGDGYKKLLERSLIAC GDKELTHYEEAVERYQDIFRECCMYRVEAYEGIGELLEFLKSQGIFIAVLSNKPHPRTLD NVQGVFGKGYFDLVYGEREDKGIKKKPCPDGVWAIAEELGLSKSEILYLGDTNTDMETGD NAKVDTVGVTWGFRTREELMAFHPALIADHPSQVVQYIKDVNGIE >gi|157101622|gb|DS480702.1| GENE 10 9225 - 10115 707 296 aa, chain + ## HITS:1 COG:BS_yitS KEGG:ns NR:ns ## COG: BS_yitS COG1307 # Protein_GI_number: 16078175 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 11 289 9 280 283 184 36.0 2e-46 MTNSYMITCCSTADLPASYLTERGIPYSCFHFRMNGMEYDDDLGKSIPIEDFYENIRQGA MPTTSQVNPEQYEAMFEPILKEGKDIIHMTLSSGISGTYNSAVIARDEMLERYPDRQIAV IDSCCASAGYGLLVDTAWEMMQAGAEYRELVSWIEENKKRVHHWFFTSDLTHLRRGGRIS ASAAFFGNMLNICPLLEVNAEGKLIPRDKYRGKKRVIREMVERMKEHAEQGDAYNGKCFL SQSSCPEDAEAVASLVEETFPHLNGKVFIVDIGTVIGSHTGPGTVCLVFWGDGRTV >gi|157101622|gb|DS480702.1| GENE 11 10166 - 10984 832 272 aa, chain - ## HITS:1 COG:TM0164 KEGG:ns NR:ns ## COG: TM0164 COG2035 # Protein_GI_number: 15642938 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Thermotoga maritima # 6 240 3 226 264 94 31.0 3e-19 MNRDNVKIILKGMWIGGTMTVPGVSGGSMAMIMGVYDRLISSISSFFKEPAGSMAFLLKF VLGAGTGMVLFSRFISYLFTTRADVPLRFFFLGAVAGGVPMIYREAGVKKLDLGAVAYPA IGILGVVLLALIPSGLFTPDGSFGPAALLLQLAGGFIIAVGLVLPGISVSQMLYMLGIYE TIIGNISSFHILPLIPLGAGVLGGIFLTTKVLERLMSRHPQPTYLIILGFMFGSLPELFP GIPTGADLAAALIAGAAGFAALYVMSMKEQRT >gi|157101622|gb|DS480702.1| GENE 12 11263 - 11478 196 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941114|ref|ZP_02088451.1| ## NR: gi|160941114|ref|ZP_02088451.1| hypothetical protein CLOBOL_06007 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06007 [Clostridium bolteae ATCC BAA-613] # 7 71 1 65 65 89 100.0 8e-17 MLSVSSMALAAMLGSACGTNMANINDCNAALYNTGNCPQAVQYQKGGQDLNSMLSQMGVG MMPGNSQCGVN 
>gi|157101622|gb|DS480702.1| GENE 13 11811 - 13814 1394 667 aa, chain + ## HITS:1 COG:slr1760 KEGG:ns NR:ns ## COG: slr1760 COG3706 # Protein_GI_number: 16329649 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and a GGDEF domain # Organism: Synechocystis # 218 384 145 311 315 135 42.0 2e-31 MLEYVKKNLSCIFPAQDHGKGCLYADEIYDTVQTMNISQTLNYSILLIIMEAFSLIVYLA TDFTGRAASMLSFTMLLFLLFIAFNIVYCKYLLKQQRTCSTRRRLKHHTYFFTAIFALYC IAVNHICLQSRLTAESVLMFYIYIAAGPIYSLKEALASVLATALAAIPAFRAHHVPPALY SNLFLYSFISLFLSQMRCHIIVSNLQMVRKARDEQVCLQERADKDPLTQLFNRNGYSLRL EELIPYSIRLHVPVAVIMIDIDYFKQYNDTYGHVQGDECLKKVAAALSGSIHRENDLICR FGGEEFQILLCDVHPSDAIKVGDRLRKSVADLKIPAANRSASPFVTISAGVVSTVLTSMD DYHKLVRAADDELYYAKSHGKNMISFRELNPAPSSDLSLEDKLEHAKLIYDSIPFPFAII QIQAKDGQACDFSYVYVNEACARLECRSRNILYRKSFLSHYPDADTRRLPAYYETAMHGG RQMFYDFSPERSKYLKIECFQFHEGYCGCLIEDVTDQHYFELYGNNELGLLNQVVNGGIL LTSYQAESPEVIYMNSRLLHALGYCSLSEYKLLSGGSLSFLRQLHPDDTHTFQETMTVFS TDDTVRSCIIRIRRKDGAWLWFMLRGKLIIDEHKNPLNLFTVYHITRQLKQLGQIVPQVD GDSRISL >gi|157101622|gb|DS480702.1| GENE 14 13838 - 16303 3136 821 aa, chain - ## HITS:1 COG:TM0268_2 KEGG:ns NR:ns ## COG: TM0268_2 COG1410 # Protein_GI_number: 15643038 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Thermotoga maritima # 300 812 11 475 483 255 33.0 2e-67 MMMTILEEMKVRRLFCDGGMGSLLQAQGLKPGELPETWNLTRRDVLISIHRSYLEAGADI MTTNTFGANRLKFKDDLESIVTAAVENARTAVREAGHGYVALDLGPTGRLLKPLGDLDFE DAVKLYKEVVSIGARAGADLVLIETMSDSYELKAAVLAAREAGFRPDTGERLPVFATVIF DEKGKLLTGGNVESTVALLEGLRVDALGINCGLGPVQMKGILEEILRVSSLPVLVNPNAG LPRSENGKTVYDIDAEGFALVMEEIAAMGAVVVGGCCGTTPEHIRLMTQRCKDMPVVWPE KKHRTVVSSYAKAVTAGRKTIIIGERINPTGKSKFKQALRDHNLEYILKEGVSQQDNGAD VLDVNVGLPEIHEPSMMEEAVRELQAIIDLPLQIDTSDMEAMERAMRIYNGKPLINSVNG KEESMSRIFPLMAKYGGVAVGLCLDESGIPDTADGRLAVGRKIIERAAVYGIGPEDIILD GLCMTVSSDSKGALTTLETLRRIRDELGVGTVLGVSNISFGLPQREIINAAFFTMAMECG LGAAIINPNSEAMMRAYYSFNALMDRDPQCGQYISVYSGQSAGLGQTIGKSGSQNGIKAD 
NHFGSGEDQGGGSQGPALTAAIERGLKEAAHNAVTELLKEREPLDIINSEMIPALDRVGK GFEKGTVFLPQLLMSAEAAKAAFEVIKAAMDGSGQAQEKKGTIILATVKGDIHDIGKNIV KVLLENYSYEVIDLGRDVPSEVIVKEAVERQVALVGLSALMTTTVPSMEETIRQLRAASV TVKVMVGGAVLTEDYARTIGADAYCRDAMASVNYAEQVFGA >gi|157101622|gb|DS480702.1| GENE 15 16319 - 17353 1234 344 aa, chain - ## HITS:1 COG:CAC3420 KEGG:ns NR:ns ## COG: CAC3420 COG2008 # Protein_GI_number: 15896661 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Clostridium acetobutylicum # 1 341 1 341 344 456 62.0 1e-128 MIRFNCDYSEGAHERILKKLAETNLEQTPGYGEDHYCAEAAGIIRSLCGREDAAVHFLVG GTQANLTVIASALRSHQGAVGAVTAHINVHETGSIEATGHKVLALPSEDGKITAEQVEEL YQAHIRDESFEHTVQPKMVYISNPTELGTIYTRAEMENLYGVCRKYGLYLFVDGARLAYG LAAEGNDLDLKSLAASCDVFYIGGTKVGALFGEAVVILNDELKADFRYHIKQRGGMLAKG RLLGIQFGELLRDGLYFELGAHADRLADKIRCACVKKGYPFLVENTTNQVFPIMPDALLE SWKDKYSYTNQGRVDESHTAIRLCTSWITSGEQVDILVNDILNS >gi|157101622|gb|DS480702.1| GENE 16 17412 - 18062 570 216 aa, chain - ## HITS:1 COG:no KEGG:Closa_2453 NR:ns ## KEGG: Closa_2453 # Name: not_defined # Def: Vitamin B12 dependent methionine synthase activation region # Organism: C.saccharolyticum # Pathway: not_defined # 5 216 2 213 214 242 60.0 8e-63 MDIQDVDRREVLRYLGYRGQEADGAVAAMVEQSMAELWEAATPRHLYREYPLSLGEDYRI DGGCFNARSRNLWGNLKDCDQIIVFAATLGCGADHLIQKYSRLQMSRAVVMQAAAAAMIE EYCDQVCTVIKSEYEAKGRYLRPRFSPGYGDFPLECQGMLLEALEAGKRIGIKLTDSLLM MPSKSVSAVMGASRKPYRCDVKGCEACAKTDCPYRR >gi|157101622|gb|DS480702.1| GENE 17 18221 - 19078 1038 285 aa, chain - ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 1 283 1 289 296 251 44.0 1e-66 MKIRDILAEGKPTLSFEVFPPKTEDAYDSVEKAAAEIAKLKPSFMSVTYGAGGGTSEYTV GIASAIKEEYGVTPLAHLTCVSSTREKVHHVLGELRERGIENVLALRGDIPQDGSTPKEY HYASELIREIKKAGDFCIGAACYPEGHVESANKSVDIDYLKQKVEAGCDFVTTQMFFDNS ILYSYLYRIREKGIQVPVIAGIMPVTNAKQIRRITQMSGTYLPSRFMSIVDRFGDNPAAM 
KQAGIAYATDQIIDLIANGVNGIHVYSMNKPDVAMKIKENLSEIL >gi|157101622|gb|DS480702.1| GENE 18 19153 - 21606 3089 817 aa, chain - ## HITS:1 COG:BH1084 KEGG:ns NR:ns ## COG: BH1084 COG0058 # Protein_GI_number: 15613647 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Bacillus halodurans # 8 815 6 808 815 828 50.0 0 MNKGFDKETFKRSVVDNVKNMFRRTIDEATPQQVFQAVAYAVKDVIIDEWIATHKEYEKK DVKTVYYLSMEFLMGRALGNNIINICARDEIKEALDEMGFDLNVIEDQEPDAALGNGGLG RLAACFLDSLATLGYPAYGCGIRYRYGMFKQKIENGYQVEVPDNWLKDGNPFEIRRPEYA SEIKFGGYVRIENQGGVNHFVQDGYQSVRAVPYDLPIIGYGNNVVNTLRIWDAEPINTFN LDSFDRGDYQKAVEQENLAKTIVEVLYPNDNHYAGKELRLKQQYFFISASVQRAVKKYKE KHDDIRKFYEKVVFQLNDTHPTVAIPELMRILLDEEGLTWDEAWEVTTRTCAYTNHTIMS EALEKWPIELFSRLLPRIYQIVEEINRRFQNQIQTMYPGNQEKLRKMSIIYDGQVKMAYM AIAASFSVNGVARLHTEILKHQELKDFYEMMPEKFNNKTNGITQRRFLLHGNPLLADWVT SKIGDEWITDLPHIKKLELFAGDEKCQFEFMNIKYQNKLRLARYIKENNGIDVDPRSIFD VQVKRLHEYKRQLMNILHVMYLYNQLKDNPDMDMIPRTFIFGAKAAAGYKRAKLTIKLIN SVADVINNDKSINGKIKVVFIEDYRVSNAELIFAAADVSEQISTASKEASGTSNMKFMLN GALTLGTMDGANVEIVEEVGADNAFIFGMSSDEVINYENNGGYYPMDIFNNDQEIRRVLM QLINGYYAPDNPELFRDIYNSLLNTKSSDRADTYFILKDFRSYAEAHQRVDKAYRDQAWW AKAAILNTANCGKFTSDRTIEEYVKDIWHLKKVTVEM >gi|157101622|gb|DS480702.1| GENE 19 21724 - 24858 3639 1044 aa, chain - ## HITS:1 COG:CAC3038 KEGG:ns NR:ns ## COG: CAC3038 COG0060 # Protein_GI_number: 15896289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 1041 1 1033 1035 1206 55.0 0 MYDKVPTDLKFVEREKEVEKFWEAEHIFEKSIKMREGCKPYVFYDGPPTANGKPHIGHVE TRVIKDMIPRYRAMKGYMVPRKAGWDTHGLPVELEVEKKLGLDGKEQIEQYGLEPFIKEC KESVWKYKGMWEDFSNTVGFWADMDNPYVTYDNSFIESEWWALKQIWDKGLLYKGFKVVP YCPRCGTPLSSHEVAQGYKDVKERSAIARFKAKGEDAYILAWTTTPWTLPSNLALCVNPK EDYVKVKTGDGYTYYIAQALADTVLGEDYEVLETYKGKDLEFKEYEPLYDCSVPLCEKQH KKAYYVVCADYVTLTDGTGIVHIAPAFGEDDANVGRKYDLPFVQLVDAKGDMTADTPFAG TFVKDADPMVLKDLESRGLLFSAPSFEHSYPHCWRCGTPLIYYARESWFIKMTAVKDDLI 
RNNNTINWVPDSIGKGRFGDWLENIQDWGISRNRYWGTPLNIWECECGHQHSIGSIEELK SLSDNCPDEIELHRPYIDAVNIKCPKCNKPMKRVPEVIDCWFDSGAMPFAQHHYPFENKE LFESQFPAQFISEAVDQTRGWFYSLLAESTLLFNKAPYENVVVMGLVLDENGQKMSKSKG NAVDPFDTLAAHGADAIRWFFYTCSAPWIPKRYQDKAVTEGQRKFMGTLWNTYAFWVLYA NIDNFDPTRYTLEYDKLPVMDRWMLSKMNSMVKTVDNSLMNYQIPEAARSLQEFVDDLSN WYVRRSRERFWAKGMEQDKINAFMTLYTALVTVAKAAAPMIPFMTEQIYQNIVRKVDGNA PESVHLCDFPEVNESWIDAELESDMDEVLKVVVMGRAARNAANIKNRQPIARMYVKADHE LSRFYVQIIEEELNVKQVIFSDDVREFTSYTFKPQLKTVGPKYGKLLNRIRQTLTDIDGS AAMDTLNEKGQLTFDYDGQEVVLTKDDLLIDVSQKDGYVTEEDNYVTVVLDTNLTPELIE EGFVRELISKIQTMRKEAGFEVMDHISVFQDGNDRIAELIKAHADEIKTEVMADHISIGA MSGFVKEWNINGENVMLGVEKTMK >gi|157101622|gb|DS480702.1| GENE 20 25071 - 25172 149 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METVGLQERLREPEMVETGASNAELKITPELSG >gi|157101622|gb|DS480702.1| GENE 21 25420 - 26877 1682 485 aa, chain + ## HITS:1 COG:CAC1780 KEGG:ns NR:ns ## COG: CAC1780 COG1488 # Protein_GI_number: 15895056 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 2 485 11 489 489 547 55.0 1e-155 MEQRNLTLLTDLYELTMMQGYFKEKDANETVVFDMFYRTNPHGNGFAICAGLQQVIEYIE DLHFDDSDIEYLRSLDMFEEDFLSYLNEFRFTGNIYAIPEGTVVFPREPLVKVIAPIMQA QLIETALLNIINHQSLIATKTERIVYAAKGDGVMEFGLRRAQGPDAGTYGARAAMIAGCI GTSNVLCGKMFDVPVKGTHAHSWIMSFPDELTAFRTYAKLYPSACILLVDTYDTLGSGVP NAIQVFKEMREAGIPLTFYGIRLDSGDLAYLSKKAKKMLNEAGFPDAVISASNDLDENLI NSLKMQGSTINSWGVGTNLITAKDCPSFGGVYKLAAVMDKATGQFMPKIKLSENAEKITN PGNKTIQRIYDKATGKIIADLICLVGEEFDTRNSLLLFDPLETWKKTMLAPDSYTMRELM IPIFLDGRCVYQSPKVMDIQAYCKKELDTLWDESKRLINPHTVHVDLSNELWHTKNQLLD SFHRK >gi|157101622|gb|DS480702.1| GENE 22 26982 - 27728 594 248 aa, chain - ## HITS:1 COG:CAC0284 KEGG:ns NR:ns ## COG: CAC0284 COG0846 # Protein_GI_number: 15893576 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Clostridium acetobutylicum # 10 244 10 245 245 155 35.0 6e-38 MIGEKTLELLKQVLAESSYTVAICGSGMMEEGGILGLKQEGRAYEIEQKYSESPEELFHI 
SCLSRRPERFYEFYREEILKKIPDMTPSVRALARMEEDDKLQCIVTTNIFDLPEQVGCRN VIYLHGSIYKNRCPHCGRLYSMEEIRDSRHVPHCKNCKTMIRPGTSLYGEMVDSSIMSRT TEEIARADVLLVLGTSLRSEVYSNYIRYFQGRKMVIIHRIPRPLDEKADMVIYDLPQNVL PLLVDGGI >gi|157101622|gb|DS480702.1| GENE 23 27898 - 29190 1495 430 aa, chain - ## HITS:1 COG:FN2107 KEGG:ns NR:ns ## COG: FN2107 COG0153 # Protein_GI_number: 19705397 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Fusobacterium nucleatum # 36 418 3 381 389 107 25.0 3e-23 MKVNETVKLLESGAARTLMAQLYGEDGVEANVKRYEDLLAGYEKMFGGEGDVKLFSSPGR TEISGNHTDHNHGKVLAGSINLDCVGAAGTNESNQVHIVSETYNQDFTIDLNHLEPSEKK AGTVDLVKGLLQGFKDSGYRVGGFNAYITSNVISAAGVSSSASFEMLLCSMLNTFFNEGR MSTVAYAHIGKFAENNYWDKASGLLDQMACAVGGLITIDFMEPLVPKVEKIDFDFGSQDH SLIIVQTGKGHADLSADYSSVPAEMKKVARYFGKEVLSQISEEQVIDNLAEVRQFAGDRS VLRALHFFEENKRVEAEVLALKENRFKDFLNNITASGNSSWKWLQNCFTNSDYQEQGITV TLALTELFIAEKQKGACRVHGGGFAGVIMAVLPNELVDEFIHYIEKCTGEGSAYRMSIRP YGAICFNDLI >gi|157101622|gb|DS480702.1| GENE 24 29368 - 30369 692 333 aa, chain + ## HITS:1 COG:CAC3238 KEGG:ns NR:ns ## COG: CAC3238 COG1242 # Protein_GI_number: 15896484 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 3 305 16 315 324 306 46.0 4e-83 MEWNGKPYHSLDYELRQQFGRKIYKLSLDGGMSCPNRDGTLGTGGCIFCSQGGSGDFAAS RQTAVSRQIEQAMAQVKRKMPDGSQGSYIAYFQSYTNTYAPLPYLKQLFGEAVSHPAVAA LSIGTRPDCLEPEVIGLLKDLNRVKPVWVELGLQTIHDSTAAFIRRGYGLPVFEDALRRL KAAGLTVIVHVILGLPGETRLMMAETVHYLAQSPIDGIKLQLLHILKGTGLADYCRTNPF PMFTMEEYIDFVIDCVEILPPELTIHRLTGDGPKSLLLAPLWSGNKRLVLNTLHRRFKER NTWQGKRYVPDSSYSAALSSTGAAIPSSPTVTP >gi|157101622|gb|DS480702.1| GENE 25 30306 - 31592 1164 428 aa, chain - ## HITS:1 COG:BS_ywtB KEGG:ns NR:ns ## COG: BS_ywtB COG2843 # Protein_GI_number: 16080641 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Bacillus subtilis # 98 362 44 305 380 143 34.0 7e-34 
MAAALLLGLLLAGITGYLIWHSAQDKYVRAEMVGVSGGTGVAGVEAPEPEEGDGAETGAA ADSDGAQGTDGLQGMDGAQGTDGPDAADGGLSAHSDNAEAQGPGTGTSDEDSPLKLVFAG DILLSDHVLGAYRKAGNIGGVVDSGFRQVIDGSDIFMANEEFPFSRRGTAAADKQFTFRL PPEHVSMFQELGIDIVTLANNHALDFGTDALLDTCSTLDDAGILRVGAGANLEEAKKPVF MEAKGRRIGFLGASRVIPEGSWNATSKGPGMLTTYDPSLLLEEIKKAREVCDYLVVYVHW GIERDERPQEYQRTLGQQYIDAGADLVIGSHPHVLQGLEYYKGKPIVYSLGNFVFGSSIP KTALLTVEWDGEDALLRLVPGTSSGGYTRMLEDEGEKAGFYQYITSISYGVTVGEDGIAA PVEERAAE >gi|157101622|gb|DS480702.1| GENE 26 31634 - 32494 1063 286 aa, chain - ## HITS:1 COG:SP0742 KEGG:ns NR:ns ## COG: SP0742 COG1307 # Protein_GI_number: 15900637 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 1 285 1 277 281 147 35.0 3e-35 MTYKIIGDSCLDLTEDMKKDPRFQMIPLTLQVGSAQVVDDETFDQKRFIEMVKACPECPK TACPSPETFKAAFEEAEAEAVFVITLSSHLSGSYNSAAVAKKLYEEEMAEKGMAEQAKKV AVIDSLSASSGELNIALFIQNLCDSGLEFDEVVEKTLAYRDGMNTYFVLESLDTLRKNGR LSGLQAFFATALNIKPVMGADTGTIIKLDQARGMNKALQRMCDIAVKEVADAASKIAVVC HVNNPERAEYVKDELAKRVGFKKIVVTNAAGVATVYANDGGIVLAV >gi|157101622|gb|DS480702.1| GENE 27 32878 - 34290 1820 470 aa, chain - ## HITS:1 COG:no KEGG:CLL_A2949 NR:ns ## KEGG: CLL_A2949 # Name: not_defined # Def: putative phage lysozyme (EC:3.2.1.17) # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 224 310 18 106 260 67 44.0 1e-09 MRGKRFTAVLLSAVLTAAGAMTAWALPTGLSTIAGSTVRSSMAGFEATFPAGFEVGRYGG NLGLDYLSYMEEDGVKIDFIAAAIDEENESYLAMAGIVADLEGESLEGVMQEMLQMLGQD ASYGSYVAMGTAGIAGAGYYSVKINYGALMAQYMSAWMGSYDMTPEEKQEYDAYMEELSN RMFLDVYMREIGGNCYMLVQIYSGDQAGNAALLLSQMQPYAGGGWSYTEAGGWQYLHGDG TFGTNEWALDENGLTYRLDGNGQIMYKAWIEENGRWKYVDEFGHMVTNLTKTIDGSQYTF DAQGYMVEGSQRPAQAYETGTISGKTYSNRWANLFMKFPEAAELMLGDGSYYSYPLEAGE NMYYWYDGGTGYLLTIDYTDSTQQLDRYLEWLIDYAGYYDYTVDSTGTVNMGGYEYKYIK TSGQNSDGTTEHEDTYFRQIDGKLMEISIEYTGDRQATIDQILSAIEQAR >gi|157101622|gb|DS480702.1| GENE 28 34447 - 36471 2561 674 aa, chain - ## HITS:1 COG:alr2323 KEGG:ns NR:ns ## COG: alr2323 COG0326 # Protein_GI_number: 17229815 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Nostoc sp. PCC 7120 # 1 660 1 649 658 422 37.0 1e-117 MAKKGSLSITSENIFPVIKKWLYSDHDIFYRELISNGSDAITKLKKLELMGEYERPEDLE YKIQVSVNPNDKTIKITDNGLGMTEDEIDKYINQIAFSGVQDFMEKYKDKANEDQIIGHF GLGFYSAFMVSDKVSIDSLSYQKDAKPVHWESEGGIDFEMEEGDKAEVGTTITLYLNEDS TEFCNEYRAREVIEKYCAFMPVDIFLDNETAEPQYETIEKDELTDKDTVVETVIEPARTE EKEKEDGTKETVEIEPSREKYKINKRPVALNDTNPLWNKHPNECSEEDYKSFYRKVFRDY KEPLFWIHLNMDYPFNLKGILYFPKINMEYDSLEGTIKLYNSQVFIADNIKEVIPEFLML LKGVIDCPDLPLNVSRSALQNDGFVKKISDYITKKVADKLSGMCKTDRENYEKYWDDIAP FIKFGYIKDEKFAEKMGDFILYKNLEGKYLTLQDCLDENKEKHENTIFYVTNEKEQSQYI NMFKEEGIDAVIMPAAIDSPFISHVEQKKEGLKFLRIDTDLNAAFKEEVKEDDEEFKKTS EELTEYFKKALNNDKLDIKVEKMKNAGVASMITISEDTRRMQDMMKMYSMGGMDMGMFGG TGETLVLNANHPLVQYVLGHKEDANTSKICEQLYDLASLSHGPLAPERMTAFVSRSNEIM MIMAGEKTDSEKTE >gi|157101622|gb|DS480702.1| GENE 29 36657 - 37403 681 248 aa, chain - ## HITS:1 COG:no KEGG:BT_1585 NR:ns ## KEGG: BT_1585 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 246 1 242 244 237 47.0 4e-61 MGLFGFDPVKEAGGWEAAMTEEEIAEMEKKGYDMSSVRGRQTEMAAQEEADKAAFADRRK AAAVPTDLNKLTPYRSTPRSAESSFFRDVAGKAPLFGREKWREKYANAPMVYGAVVQANS DLWLPGTGEYLPAVFVFALDSPHIYDVEWLRDTAEKISEMKASSAVPADCQEFIHILRDD QSEFCFPLGASLADGADAWCVTFKFDKQAILPGNRLPEDGIVPFLLEARPKKQMPIQLTP IPGKYYQA >gi|157101622|gb|DS480702.1| GENE 30 37431 - 37697 309 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941139|ref|ZP_02088476.1| ## NR: gi|160941139|ref|ZP_02088476.1| hypothetical protein CLOBOL_06032 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06032 [Clostridium bolteae ATCC BAA-613] # 1 88 34 121 121 162 100.0 9e-39 MSGLYTRFMDWIMPFLEANWQLFLIMAGALFLLGAIFRWKWVCDPQGEDGLGFRAFVYRN FGEKGYRILQGIGGAVIILCSAVLWVLM >gi|157101622|gb|DS480702.1| GENE 31 37944 - 38564 573 206 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941140|ref|ZP_02088477.1| ## NR: gi|160941140|ref|ZP_02088477.1| hypothetical protein 
CLOBOL_06033 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06033 [Clostridium bolteae ATCC BAA-613] # 1 206 42 247 247 400 100.0 1e-110 MSKKGYTKVEREQVGRDLLTVGLEMLSQRGLKGTTLQDILQAVGISKPFFYGNYYTSLAE LVIHIIDYEISLLLREVRNTVGNKGMSLEETIYHFLDMVVHSRQHHFFVMTQEEEMWVYK HLSPVEFEVYQQGQARFYEQVLALWQISQEKCSPKELGNLILSVVLIYNSAARSLPFFFP EELEQTAKAQATALSRYLASLADADK >gi|157101622|gb|DS480702.1| GENE 32 38927 - 39361 153 144 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941143|ref|ZP_02088480.1| ## NR: gi|160941143|ref|ZP_02088480.1| hypothetical protein CLOBOL_06036 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06036 [Clostridium bolteae ATCC BAA-613] # 1 144 12 155 155 245 100.0 7e-64 MKKVTWLRTLVTVVLSVTVVSMVYVFTKGWPLMRPPRMEDIKEVTMTDTESGVKKEFVDE ENKELAVKLINFLNYVPFSTASDTYEPLIIITYVLDDGTEIKISANNTEVFFNGNGHQLK DAEIFGNLTKAVFFSEETARESAQ >gi|157101622|gb|DS480702.1| GENE 33 39373 - 39564 156 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941144|ref|ZP_02088481.1| ## NR: gi|160941144|ref|ZP_02088481.1| hypothetical protein CLOBOL_06037 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06037 [Clostridium bolteae ATCC BAA-613] # 1 63 21 83 83 112 100.0 9e-24 MKRYAQLAFLKALVITVGFDLICIIYGLISGNPYRISLLGGVLLFAVLFSIGLIEYLWKN RKN >gi|157101622|gb|DS480702.1| GENE 34 39734 - 40219 294 161 aa, chain - ## HITS:1 COG:BS_bltD KEGG:ns NR:ns ## COG: BS_bltD COG0454 # Protein_GI_number: 16079713 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Bacillus subtilis # 9 151 3 149 152 124 44.0 1e-28 MNLKNNNQLHFEPVNSENRMEAENLSVFSEQSGFIESVGECLQEADNLNLWRPVCIYDGD ILVGFTMYGYFPSPFPGRLWLDRLLIDKRYQGKGYGRQAVLALLDRLHAEYSDSTVYLSV YENNTRAIRLYEQIGFCFNGEYDTKGEHVMVYLWEKGREGK >gi|157101622|gb|DS480702.1| GENE 35 40281 - 40901 -3 206 aa, chain - ## HITS:1 COG:no KEGG:CbC4_0465 NR:ns ## KEGG: CbC4_0465 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_BKT015925 # 
Pathway: not_defined # 1 203 1 207 211 106 31.0 5e-22 MNAVIYRPYKRQDFKAVSSIINIIWKHESYYSPKTAVRLSEAYLRLCLTEQTFTQVALAD GKPIGIIMGNHIRRHRCPLVLRLQAGWSVLVLSMTAEGRRSWRFLEEIDRIYAALLSGQP QEYKGELSFFAIHPDYHGQGIGRELFSRFCMYMERENIQHFYVFTDTSCNYGFYENMNMF RRGTKRVKLKVNHRLEEFSFFLYENG >gi|157101622|gb|DS480702.1| GENE 36 41648 - 42436 216 262 aa, chain + ## HITS:1 COG:no KEGG:Closa_1057 NR:ns ## KEGG: Closa_1057 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 17 97 18 98 336 84 43.0 7e-15 MKKLYFLLSTTAALLSISFSAYAGQWQQEGEHWKYLNNDGTYSAHTWQWIDGNGDGISES YYFDDNGFLYLDTITPDGYSVNQDGAWVVNGVVQTRSETSGEGEYISLEVGYNAKIPVGM ERVHNNILDYYKRTSARFEDAQGTRLISFSIDDYRNQADYFRTVEGTTDASFQERKADAV GTTLKATVVLRDTKEYHSGTWSHVQLTDYVGDYQKYWSNVHYFFQSNDFVLTTVRIWSTG ADMDVDGFMNSLVKNSYYDSVR >gi|157101622|gb|DS480702.1| GENE 37 42669 - 43889 1411 406 aa, chain - ## HITS:1 COG:lin1491 KEGG:ns NR:ns ## COG: lin1491 COG0568 # Protein_GI_number: 16800559 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Listeria innocua # 114 405 83 373 374 303 57.0 3e-82 MDKMELSKRLEQLVGEAAAQEGRLEMTQVLDHFRDVQMTPEEMEDVYLKLERKGVHIQAP EEPEEELPEGDFLLDDDAELDQPSLAKEEFRWGRDRLDPDEMLDGGSMVSGGDEEEEEYD GYGQERLLEGVSTADPIREYLKEIGSIALLTPEEESDLARRKSEGDVEAGRKLVEANLRL VVSIAKRYTGRGMSFLDLVQEGNLGLMKAVEKFDYAKGYRLSTYATWWVKQSITRSLADQ SRTIRLPVHMVEAVNKIRRAQRSLSVKLGREPSMEEVAEEVNMSGKRVAELIQASGDTVS LETPVGDEEGSNLGDFVADDANASTEDKAESFLLREEIDSMLQGLNPREREVIILRFGLE TGHPLTLEEVGKRFNVTRERIRQIETAALRKLRNPSKSKKIRDFLP >gi|157101622|gb|DS480702.1| GENE 38 44006 - 44488 370 160 aa, chain - ## HITS:1 COG:no KEGG:Closa_3216 NR:ns ## KEGG: Closa_3216 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 157 4 160 165 173 50.0 2e-42 MKVEDVKTFTSKLFLKEDFDSFLVKEVNIVTYNNFSIDGHIRQGYYTDEELEENQIEFFS SWKVLRPVCFSLIKGKKLPGSFHIVLLLPPSDTEKFASTSGSGISSEQIQGLYLNIRYED GALYCVTGTSLHLFTMDKTLENEWDKAVAKFMRSHEIVCT >gi|157101622|gb|DS480702.1| GENE 
39 44751 - 45149 367 132 aa, chain - ## HITS:1 COG:STM4186 KEGG:ns NR:ns ## COG: STM4186 COG4405 # Protein_GI_number: 16767436 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 12 131 15 133 135 75 34.0 2e-14 MKNTDKGYFERFSFGDSPEMADELLALVLAGKKTATVSVILEDEKAPSVGDLSLVLDGRS TPACVIKTVHVETVRFCGLTWDMVKLEGEDENFEQWKSGNIRYWTRDAAKRGYTFTDQTL ITFERFEVVEVL >gi|157101622|gb|DS480702.1| GENE 40 45263 - 45655 136 130 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941153|ref|ZP_02088490.1| ## NR: gi|160941153|ref|ZP_02088490.1| hypothetical protein CLOBOL_06046 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06046 [Clostridium bolteae ATCC BAA-613] # 1 130 1 130 130 228 100.0 1e-58 MKKILSFLIAGIMACSAAGCSQKNANPISLPEESTIQSIDVTVGEETEKYSDSEWISQCI SSMNNAQATAKESVQDIPQADEYIKIDINTEDAKSTLFVYLEKNDYYIEQPYQGIYKTDS AFYQTITGNH >gi|157101622|gb|DS480702.1| GENE 41 46064 - 47551 1350 495 aa, chain - ## HITS:1 COG:sll1027 KEGG:ns NR:ns ## COG: sll1027 COG0493 # Protein_GI_number: 16329369 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Synechocystis # 1 494 1 493 494 568 55.0 1e-162 MGKSTGFLEYEREVGRGKTPEDRIKNWNEFHASLSEEEQRRQGARCMDCGVPFCQSGMMI GGMVSGCPLHNLVPEWNDLVYTGNWKQAYNRLKKTNSFPEFTSRVCPAPCEAACTCSLYG DAVTIKENERAIIEKAYSSGYALPNPPSIRTGKKVAVIGSGPAGLAAADQLNKRGHSVTV FERNDRVGGLLMYGIPNMKLEKWVIERKVDVMKKEGITFVTGADVGRNYRAGKIRKEFDR IILACGASNPRDIDVPGRDSAGIYFAVDFLKATTKSLLDSGLEDGSFISAKDKHVLVIGG GDTGNDCAGTSIRHGCVSVTQLEMMPKPPAERAENNTWPEWPRILKTDYGQEEAISIFGS DPRQYQTTVKEFIKDGEGKVCKAIITNLVPKKDEETGRTIMAAVEGSEREIPADLVLIAA GFTGAQAYVADAFGIDLDARTNAATEPGTYRTSADQIFTAGDMHRGQSLVVWAIQEGREA AKAVDRDMMGYTNLA >gi|157101622|gb|DS480702.1| GENE 42 47567 - 52108 5049 1513 aa, chain - ## HITS:1 COG:sll1502_2 KEGG:ns NR:ns ## COG: sll1502_2 COG0069 # Protein_GI_number: 16329610 # Func_class: E Amino acid 
transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Synechocystis # 387 1185 1 801 801 952 58.0 0 MESKQKQPGLYEPRFEHDNCGIGAVANIKGIRSHRTVDQALHIVENLEHRAGKDAEGKTG DGVGIMLQISHGFFKKVTEHLGFQIGDERAYGVGMFFFPQDELIRSQAMKMFEIIVRKEG MTFLGWREVPTAPEILGRKAVDVMPYIAQGFVGKPAGMKAGLDFDRKLYIARRVFEQSNE DTYVVSLSSRTIVYKGMFLVGELRLFFSDLQDEDYKSAIAIVHSRFSTNTNPSWMRAHPY RLIVHNGEINTIRGNVDKMVAREENMESEYFRYDMHKVLPIINQEGSDSAMLDNALEFMV MSGMDLPLAVMVTIPAPWENDKYMDSEIKDFYRYYATMMEPWDGPASILFTDGDVVGAVL DRNGLRPSRYYITEDGYMILASEVGAIDIDQTKVVKKDRLRPGKMLLVDTKQGRLIDDDE LKHYYASRNPYGEWLDSNLWSIKDLPVPNKCIPAMKDEERLRLQKTYGYTYEQYKTMILP MALNGAEAVSAMGVDSPLAVLSKKHQPLFNYFKQLFAQVTNPPIDAIREEIVTSTTVYVG EEGNILEEAAKNCKILKVENPILTELDLLKIKNMKKPGFKVEVIPITYYKNTSLEKAIDR LFLEADRAHKDGANIIILSDRDIDENHVPIPSLLAVSALQQYLVRTKKRTSVALILESGE PREVHHFAALLGYGACAVNPYLAHESIRSLIDSGMLDKDYYAAVCDYDLAVLHGIVKIAS KMGISTIQSYQGAQIFEAIGISEDVVDKYFTDTVSRVGGITLKDIAGDVDVLHSAAFDPL GLDVDLTLESAGSHKSRSGEEDHLYNPLTIHLLQDAARSGNYDIFKQYSAQVDREDKMYH LRSLMDFKFPEDGGIPLEQVESVESIVRRFKTGAMSYGSISREAHECLAIAMNRIHGKSN TGEGGESLDRLIPGPAGNNRCSAIKQVASGRFGVTSRYLVSAQEIQIKMAQGAKPGEGGQ LPGGKVYPWVAKTRHSTTGVSLISPPPHHDIYSIEDLAQLIYDLKNSNTRARISVKLVSE AGVGTVAAGVAKAGAQVILISAYDGGTGAAPRNSIYNAGLPWELGVAEAHQTLIMNGLRD KVILETDGKLMTGRDVAMACMLGAEEFGFATAPLVTLGCVMMRVCNLDTCPMGIATQNPE LRKRFRGKPEYVVNFMLFIAQELREYMARLGVRTVDDLVGRTDLLKRRENLPLGRANQVD LRRILDNPYEGQNVAGYSGKQVFDFRLEDTMDEQVFLRKFKSALGNGQKKRIQVEVTNVN RALGTIFGSEITRKYPEGLSDDTFTVSCSGSGGQSFGAFIPRGLTLELAGDSNDYFGKGL SGGKLVVFPPKGVLFQAEENIIIGNVALYGATSGNAFINGIAGERFCVRNSGATAVVEGM GDHGCEYMTGGCVVVLGATGKNFGAGMSGGIAYVLDEDRSFYKRLNKELVSFEEVTGKYD VLELKGLIEEHVACTNSLKGRRILEHFSEYLPMFKKVVPHDYRRMMNAIVQMEEKGLNSE QAQIEAFYANLRG >gi|157101622|gb|DS480702.1| GENE 43 52248 - 52514 370 88 aa, chain - ## HITS:1 COG:STM3779 KEGG:ns NR:ns ## COG: STM3779 COG1925 # Protein_GI_number: 16767063 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Salmonella typhimurium LT2 # 1 87 1 87 89 70 40.0 
1e-12 MISKTLTVVNPSGLHLRPAGVLSQTAMKFKSDITIECGEKKIVAKSVLNVMAAGIKCGTE INLICDGEDEEEAMKTLTEAIESGLGEM >gi|157101622|gb|DS480702.1| GENE 44 52672 - 53373 468 233 aa, chain - ## HITS:1 COG:CAC1577 KEGG:ns NR:ns ## COG: CAC1577 COG1636 # Protein_GI_number: 15894855 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 7 212 3 208 208 247 61.0 1e-65 MTQPGQKRNYQKELDRLIEGLREEGRVPKLLLHSCCAPCSSYVLEYLSQYFDVTVYYYNP NIFPETEYEARVKEQERLIREMETERPVRFLAGSYEPEEFFSVARGHEKDPEGGERCFRC YELRLDRAARAAVSGDFDYFTTTLSISPLKNAQWLNDIGCRLAGQYGIPYLVSDFKKKNG YKRSVELSALYGLYRQDYCGCVYSRAEAIERQKEPAVQRLQSSFVRMNTRGQE >gi|157101622|gb|DS480702.1| GENE 45 53522 - 54667 1395 381 aa, chain - ## HITS:1 COG:BH1376 KEGG:ns NR:ns ## COG: BH1376 COG0568 # Protein_GI_number: 15613939 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Bacillus halodurans # 11 381 21 371 372 355 56.0 8e-98 MEKDDFLKKLEKLVEHAKTKHNTLDAGEINDFFTGDNLSPEQMDQIYSYLEGRNIDVVPV LDEAMLTEDTVLLDDIDLDLDLDDDSFIKDAEEEEIDLDAVDLLEGIGTEDPVRMYLKEI GTVPLLTADEELDLAQRKADGDEFAKERLIEANLRLVVSIAKRYTGRGMSFLDLVQEGNL GLIKGVEKFDYTKGYKLSTYATWWIRQSVTRALADQARTIRVPVHMVETINKMSKMQRKL TLELGYEPSVTELAEALDMTEDKVMEIMQIAREPASLETPIGEEDDSNLGDFVADNNVVT PEGNVESVMLREHIDALLGDLKERERQVIVLRFGLEDGHPRTLEEVGKEFNVTRERIRQI EAKALRKLRNPVRSKRIRDFL >gi|157101622|gb|DS480702.1| GENE 46 54768 - 55694 1170 308 aa, chain - ## HITS:1 COG:BS_yerQ KEGG:ns NR:ns ## COG: BS_yerQ COG1597 # Protein_GI_number: 16077740 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Bacillus subtilis # 1 276 1 276 303 171 35.0 2e-42 MKKMLFVFNPRSGKEQIKGQLMEILDIFTKAGYELRVHVTQKQKDAMEVVARLGKKVDVV VCSGGDGTLNETISGMMKLKKMPLLGYIPAGSTNDFATSLGIPKRMPLAAWDIVEGTPFA IDTGTFCEDMNFMYIAAFGAFTEVSYLTSQDRKNLLGHQAYMLEGVKSLAGLKPYHMKVE 
WDGQVLEEDFAFGMVTNTISVGGFKGLVNQSVALNDGLFEVLLIRMPRTPVDLSNIISYM FLREEPNEYVYKFKTSSIRLTSEQEVDWVLDGEYGGSRTEVTIGNIRENVQILLRNHPEE GSGRKQIE >gi|157101622|gb|DS480702.1| GENE 47 55735 - 56847 525 370 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46129221|ref|ZP_00155777.2| COG1194: A/G-specific DNA glycosylase [Haemophilus influenzae R2846] # 32 323 13 299 378 206 38 6e-52 MYDTGHLYDRLQVLEREEMPLGRRERLSAVERPLLAWYSSRARSLPWRDDPKPYRVWISE IMLQQTRVEAVKPYFERFMEAFPTVSHLAQAEDDHLMKMWEGLGYYNRARNLKAAAQMIM SEYGGCLPASFDELIRLPGIGSYTAGAIASIAYGIPLPAVDGNVLRVISRLLGDREDIKK ASVKTGIEAELKAVMPQDEASHYNQGLIEIGALVCIPGGEPRCSQCPLASICLTRKNGWW KEIPYKSPAKARKIEERTVFIIEYQDKVAIRKRPPKGLLASLYELPNIEGKTSGETVPQV LGLDREQVASVELLPEAKHVFSHVEWHMTGYRVVLSQEEEPLSCFMVSREELEHTYALPN AFNAYTKLIG >gi|157101622|gb|DS480702.1| GENE 48 56840 - 57799 796 319 aa, chain - ## HITS:1 COG:MTH1916 KEGG:ns NR:ns ## COG: MTH1916 COG0340 # Protein_GI_number: 15679898 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Methanothermobacter thermautotrophicus # 67 309 14 257 261 202 43.0 5e-52 MLKSTEDYLSGQQLCGMLGVSRTAVWKAVGELREEGYVIEAVRNRGYRLVEGADVITQAE LASMLHTQWIGTRLEYFDETDSTNIRARKLAEEGAPHGTLVVADRQTAGKGRRGKSWVSP AGTGIWMSMVLRPVMSPMSASMLTLIAGLSVVRGVKESTGLEAMIKWPNDAVLNGKKICG ILTEMSTEVECIRYVIPGIGINVNIDDFPEEIRATATSLKLEAGRSIKRSLVIAAVADSF EYYYDIFMKTCDMSGLRDDYNKALVNLNKEVLVLDPRGQYKGKALGIDNEGSLLVRREDG NISAVISGEVSVRGIYGYV >gi|157101622|gb|DS480702.1| GENE 49 57832 - 58134 362 100 aa, chain - ## HITS:1 COG:no KEGG:Closa_2492 NR:ns ## KEGG: Closa_2492 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 99 5 100 102 81 45.0 8e-15 MQGQRGRSERHSGVFLDMVHLILGIAIVILAVLSFINPEGNHMLFPLVFLLAALLNAVNG IFELKTRGRDKKKFGASILQLALAAGLMVLGIVSAISIWA >gi|157101622|gb|DS480702.1| GENE 50 58138 - 59130 997 330 aa, chain - ## HITS:1 COG:STM4465 KEGG:ns NR:ns ## COG: STM4465 COG0078 # Protein_GI_number: 16767710 # Func_class: E Amino acid transport and metabolism # 
Function: Ornithine carbamoyltransferase # Organism: Salmonella typhimurium LT2 # 1 329 3 334 334 411 61.0 1e-115 MDLKGRHFLTLKDYTPEEITYLLDLAADLKDKKKQGVLVDSLRGKNVALIFEKTSTRTRC AFEVAAHDLGMGTTYLDPSGSQIGKKESIADTARVLGRMYEGIEYRGFGQEIVEELAKYA GVPVWNGLTNEDHPTQMLADLLTIREHLGCLKGLKLVYMGDARYNMGNSLMIACTKMGMH FVACAPAKYFPDPSLVKECEAYAAASGGSVTLTEHVAEAVKGADIIDTDVWVSMGEPDEV WEERIEELTPYKVTKAVMDLAGPNAIFLHCLPSFHDLKTKIGKEMGERFNVSELEVTDEV FESGQSVVFDEAENRMHTIKAVMLATLGES >gi|157101622|gb|DS480702.1| GENE 51 59259 - 60491 1250 410 aa, chain - ## HITS:1 COG:mlr3635 KEGG:ns NR:ns ## COG: mlr3635 COG1752 # Protein_GI_number: 13473136 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Mesorhizobium loti # 16 211 1 208 361 71 26.0 3e-12 MDIIKLQLDLSKEYGIVLEGGGAKGAYQIGAWRALREAGIRIKGAAGTSVGALNGALICM DDFEKAERIWENISYSRVMDVDDELMERLTKFSLKNISSLAPLNVGELIAGAIRVLRDGG FDIAPLRALIEDVVDEEKIRNSDRELYVVTYSISDRHPLIADVKEAPEGGIADLLMASAY LLGFRQERLGGKYYMDGGGINNVPVDVLLDRGYRDVIVLRIYGYGVDTEKKLEVPEGATL YHVAPRRDLGGLLEFDRRRARRNMLLGYFDGKRMLYGLAGRVYYIDAPEGEPYYFDKMIS EIQSLLSYMEPELQEAGEAKHGREEQKDPLAGYRVYTEQVFPRLAKELKLKEGWDYKELY LSILEDLAKQYRISRFKIYTVEELLRLIHKKAGAAALKAMIPILDSSTQV >gi|157101622|gb|DS480702.1| GENE 52 60478 - 60720 255 80 aa, chain - ## HITS:1 COG:no KEGG:Closa_2496 NR:ns ## KEGG: Closa_2496 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 68 1 68 146 63 48.0 3e-09 MLSVGWLVSFLIFLGVELAFGCLTSLWFAAGAIGGFAAALAGLPMEAQLAVFVAVSFLTL ILIRPLAFILGRRERAGGHH >gi|157101622|gb|DS480702.1| GENE 53 60768 - 61580 238 270 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 70 268 54 254 255 96 30 1e-18 MINSTSNKQVKRVVSLRSKAKARREEGMYVVEGLRMCRELNPEDVDALYVTENFAGDADN RKWMGRFRPETVSDVVMNVMADTQTPQGVLAVVRQKRYTMDDIVKPAPDGRSLAIMVLET IQDPGNLGTIIRAGEGAGITGIVMNHDTADIYNPKVIRSTMGSIFRMPFVYADDLQDACL 
LMKNRGVRLFAAHLKGMNNYDQEDYTDNVGFLIGNEANGLTEETTSLADCLVKIPMEGKV ESLNAAIAASVLMFETARQRRSCSVPAETA >gi|157101622|gb|DS480702.1| GENE 54 61605 - 63461 1590 618 aa, chain - ## HITS:1 COG:CAC1050_2 KEGG:ns NR:ns ## COG: CAC1050_2 COG0171 # Protein_GI_number: 15894337 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Clostridium acetobutylicum # 304 610 2 309 310 426 64.0 1e-119 MVFPELCLTAYTCADLFGQKALLRRAKEELGRIVRFTDGKDILVFLGLPWERDGKLYNAA AAIQKGRLIGVVPKRNLPNYSEFYEARNFCPGNERPVMTCWNGEKVPMGTNLLFRCNTMP ELTVAAEICEDVWVPCPPSIRHALAGATVVVNCSASDETTGKDMYRHDLICSQSARLVCG YVYANAGEGESTQDLVFGGQNIIAENGTCLVESRRFINESICADMDLERLDSERRRMSTF PDPAAAREEGGYLTVEFSFDAVSEDSAGSQDTQTHGSTQPGGDVLRYVDPAPFVPRDERQ RNRRCEEILSIQAMGLKKRLEHTGCHEAVIGLSGGLDSTLALLVTVRAFDSLRIPRSGIH CITMPCFGTTDRTYNNACTLAGKVGAKLREINIREAVTRHFEDIGHDMDKHDVTYENSQA RERTQVLMDIANEVGGLVIGTGDMSELALGWATYNGDHMSMYGVNGSVPKTLVRHLVRYY ADTCNEKELADVLLDVLDTPVSPELLPPEDGQISQKTEDLVGPYELHDFYLYYILRYGYA PSKIYRLAIQAFKSQYDRETILKWLNVFYRRFFSQQFKRSCLPDGPKVGSVAVSPRGDLR MPSDASGRVWLEELEGIK >gi|157101622|gb|DS480702.1| GENE 55 63769 - 64512 664 247 aa, chain + ## HITS:1 COG:CAC0908 KEGG:ns NR:ns ## COG: CAC0908 COG1179 # Protein_GI_number: 15894195 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Clostridium acetobutylicum # 6 243 6 248 251 230 49.0 2e-60 MMINEFSRTEMLIGTEGMNKLKNSTVAVFGVGGVGSHAAEALARCGVGRLILVDNDTVSL TNINRQSIALHSTMGQFKTRVMKDKIADICPGTQVITFEEFVLPDNIESLFERMKHQMGQ GSSVDYILDAIDTVTAKLALAGFAASHAVPVIASMGTGNKLHPELFRISDISQTSVCPLC KVMRRELKARNIRELKVCWSPEQPLTPGQTAEDTGCRRATPGSISFVPPVAGLVIAGEII RDICGLA >gi|157101622|gb|DS480702.1| GENE 56 64555 - 65454 852 299 aa, chain - ## HITS:1 COG:L22691 KEGG:ns NR:ns ## COG: L22691 COG1284 # Protein_GI_number: 15673946 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Lactococcus lactis # 27 293 6 275 280 150 30.0 3e-36 
MGQILRISAFINQWKKDRRVRTGITVAAVFVSALIQSYALQVFVRPAGIISGGFTGMAML VERLGELNGLNIPMQVTMIMLNIPVAVLCCKGISFRFTIFSLVQVFLSSIFLQIFNFSPI FTDEVLNVIFGGVIFGSAIVIALRGNASTGGTDFIALFVSNRTGHSIWSYVFFGNAVMYC FYGAIFGWKHAGYSIIFQFISTRMISAFHHRYELVTLQATTMKGQQVVDAYISHFRHGMS CVEAMGGYSKKKMYLLNTVISAYEVNNAIHIMQEADEHIIINVLRTQQFVGRFYRAPLE >gi|157101622|gb|DS480702.1| GENE 57 65485 - 65919 380 144 aa, chain - ## HITS:1 COG:DR2094 KEGG:ns NR:ns ## COG: DR2094 COG1959 # Protein_GI_number: 15807088 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Deinococcus radiodurans # 1 132 46 175 197 115 46.0 2e-26 MLVSTKGRYALRTMVDLAIHGDGEPVKIKDIANRQGISGKYLEQIISILSRAGFVRSIRG NQGGYYLARPSSDYTVGSILRITEGSLAPVDCLSGDKNPCTRQMDCVTLRLWRELDEAIS GVVDKYTLEDLVQWQKSMKDNYVI >gi|157101622|gb|DS480702.1| GENE 58 66141 - 66533 297 130 aa, chain + ## HITS:1 COG:AF0833 KEGG:ns NR:ns ## COG: AF0833 COG2033 # Protein_GI_number: 11498439 # Func_class: C Energy production and conversion # Function: Desulfoferrodoxin # Organism: Archaeoglobus fulgidus # 33 127 32 123 125 82 42.0 1e-16 MKNEPVFLTDKNHNIVLEALSPAPNAALPDSCKPFEILDPAATEGAAEKHLPVVEQDGLH VTVKVGNIFHPMGQEHSIGWVCLVTKAGCVMRVPLTPDCEPVASFTLEEGDAPAAAYAYC NLHGLWKKSI >gi|157101622|gb|DS480702.1| GENE 59 66964 - 68373 1140 469 aa, chain + ## HITS:1 COG:no KEGG:Closa_1011 NR:ns ## KEGG: Closa_1011 # Name: not_defined # Def: GerA spore germination protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 469 1 469 469 753 75.0 0 MEFSQNLKDNITYLHKKLNVETNFDVVYRVVHIGGREACLYFIDGFTKDESLLKILQVFS TIKPEDMPKDAHGFSKQYVPYGEIGLLSNDQDMTVQLLSGVSCLFIDGYDKCITIDCRTY PSRGVSEPEKDKVMRGSRDGFVETLIFNTALIRRRIRDPRLTVEITSAGESSHTDIAICY MENRVDKQLLDKIKKRIQNLKVDALTMNQESLAECLFPYKWFNPFPKFKFSERPDTAAAS ILEGNIIILVDNSPSAMILPSSVFDIIEEADDYYFPPITGTYLRLSRMTISLLSLLLTPT WLLFMQNTELIPDWLAFIRLSDPLNVPLIWQLLILEFAIDGLRLAAVNTPNMLTTPLSVI AGIVLGEYAVKSGWFNSETMLYMAFVTIANYSQASFELGYAMKFMRIIILILTSILNIWG FVIGIILSACAIIFNRTIAGKSYIYPLIPFSLSELKKRFLRGRLPHTEK >gi|157101622|gb|DS480702.1| GENE 60 68440 - 69465 
771 341 aa, chain - ## HITS:1 COG:no KEGG:BCAH820_4940 NR:ns ## KEGG: BCAH820_4940 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_AH820 # Pathway: not_defined # 36 310 7 292 327 124 31.0 5e-27 MGKWHKGRIRRFFREGKKKVDLQLKQLADSGRPLDISLTNDFAFKKTFRNKTALTGLLSS LLDIPAGEITSLEFPDTFLHGEYADDREGILDVKVRLNHCKKVNIEIQLLSHPFWEERSL FYISRMYTEDFLKGQNYDALEECIHISILGFPRKGTEHFYSVIRLMDDKTGCVYSGKISL RVLYLSQLVSCSEEEKQTEIYHWAKLISARDWEVLKDMAKRNRYMKEAVEELERLNADKE LRYLYLERLKAASDEATMQSYYKRLAEDAQKNGLERGLSQGLEQGLSQGLEQGLSQGIKA LIKDNLEEGKDKKVILAKLEKHFSLTEKEAEAYFDMYNNMS >gi|157101622|gb|DS480702.1| GENE 61 69762 - 71429 1621 555 aa, chain - ## HITS:1 COG:CAC3634 KEGG:ns NR:ns ## COG: CAC3634 COG4166 # Protein_GI_number: 15896868 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Clostridium acetobutylicum # 49 551 38 545 550 313 34.0 5e-85 MKKKLSLLLCAALTATTLAGCGQAQSSPAGTDAGTEASAAGTAGSKEGEKILTYAVMEEP ETLDPTLNNYSTSSTFLQNMFCGLFQLEADGSLSNAMCDTYEVSEDGLTYTFTLKDGLKW SDGSDLTAGDFEYSWKRVLNPDTASPAAWELHYLKGGEEYNTQGGSAQDVGVKAVDDKTL EVTLKAPTPYFLYLTASSNFFPVKQEVVEGQDPWTKSADTYVCNGAFTLEKINPQSSYVL KKNPNYYDAGSVKLDGVEIVIIQSPESALSAYNAGEIDAMGDNLVTSQAIDQYGNTEELK GYDKIGTRYYDFNCSKEYLSNPDVRRALAMAIDRRTICESIVPSKPEPAYGFVPYGIPYE GSSEDFRTVSGDLIQEDVEAAKKLLADAGYPNGEGLPVLTFIVTNTKENKDIAQVIQSMW KDNLGVQADIVTFESKVYWDEQKAGNFDVCFDGWTGDYLDPDTNLNCFTQARAYNQNRWS GDNAMKYDSMIEECRNLADNSKRMEIFKEAEAILMDEMPIIPLYYLNAIVLAKPDVSGLV KNANGHTLFRNADKL >gi|157101622|gb|DS480702.1| GENE 62 71603 - 73489 223 628 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 367 605 257 502 563 90 27 7e-17 MGNKKNAAVNHHYAKTAKRLMGYVAGTYRLQFVCVLVTIVISALANVAGPLFLMVLIDDF ITPLLGQQHPDFTVLNHVMAGLAAFYVVGVICTYTYNRLMINIGQGVQKRIRDEMFEHMQ TLPVRYFDRHPHGELMSRYTNDIDTLRQMINQALPMVLSSFVTIIAISAVMIGISVPLSV LAAAMIVLMLFVSRTLAGRCGRYFVAQQLAIGKVTGYIEERLSGQKVIKVFNHEEETREE FDKLNDRLYDSAFHANKYANVIMPVITNIGNLQFVLTAIAGGILSLSGLAGLTLGQIASY LQFTKSFNQPFAQISQQSNSIIMALAGAERIFNMMDEEPETDRGDVTLVNVRMDESGKMT EVPERTSLWAWKCPCEDGTAAYTRLTGDVRFQDMDFGYEPGHAVLHDISLYAKPGQKLAF VGATGAGKTTITNLINRFYDIQKGSITYDGIDIRHIKKDDLRRSLGIVLQDTHLFTGTIR ENIRYGKLDATDREICDAAKLANADQFIEMLPEGYDTMLTGDGEELSQGQRQLLAIARAA IADPPVLILDEATSSIDTRTEAMVQRGMDELMKGRTVFVIAHRLSTIRNSDAIMVLDHGR IAERGDHESLIRERGMYYQLYTGGLELE >gi|157101622|gb|DS480702.1| GENE 63 73479 - 75239 219 586 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 341 567 132 356 398 89 29 2e-16 MKGQHQKTVRALLGSVREYKKPSLLCPVFVVLEVLMEVLVPGVMALMIDKGMSGGSIGYI LKIGAVLLVMSMMALVFGILAGTNAARASAGMARNLRKDMFHHIQEFSFSNIDRFSPSSL VTRLTTDITNVQNAYQMVIRIAIRGPIMLCFALIMAMGISRELSTIFFIVLPILAAGLFY IVIKAHPYFEAVFKRYDVLNRVVQENLNAIRVVKAYVREPYEIEKFRHISQEVFYQFKYA EKIVAWNSPLMQFCMYNCILLVSGLGARLIVGSRMTTGELTSMIIYAVQILSSLMMLSMV FVMVIMARSSAERIVEVLDEKSSLSDPDDPVQEVADGSIDFEHVSFSYIGDKEKLALKDV DIHIKSGETIGILGGTGSSKTTLVQLIPRLYDVTGGCLKVGGRDVKEYGLKSLRNQVAVV LQKNILFSGTIRENLRWGNENATDGEIERACCLAQADEFIRQFPDQYETYIDQGGTNVSG GQKQRLCIARALLKKPKILILDDSTSAVDTKTDAMIRRAFREEIPDTTKIIIAQRISSVE DADKVIVLNHGAVDDFGTPQELFARNKIYREVFESQKKGGDDHDGK >gi|157101622|gb|DS480702.1| GENE 64 75337 - 75762 549 141 aa, chain - ## HITS:1 COG:CAC3413 KEGG:ns NR:ns ## COG: CAC3413 COG1846 # Protein_GI_number: 15896654 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 8 137 9 142 143 95 42.0 3e-20 MAVNPTVRLIATLSNLIRRKIDSAEGVGGLTPAQNGILHFILGRCREQDLFQKDIEEEFN LRRPTATEILKLMERKGLIYREEASYDARCKKIIPTEKGWELEGRVLKDILEMEQFITRD IPKEELDTFIRIGMKMMKNLK >gi|157101622|gb|DS480702.1| GENE 65 
76010 - 76729 634 239 aa, chain + ## HITS:1 COG:BH2910 KEGG:ns NR:ns ## COG: BH2910 COG1296 # Protein_GI_number: 15615473 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine resistance) # Organism: Bacillus halodurans # 11 220 18 221 237 103 33.0 3e-22 MKHNSHQFHCGLKDGIPIGLGYLAVSFTFGIMARGAGLTTLQSVVMSFTNLTSAGQFAAL GIIQAGAPFMEMAVAQLIINLRYCLMSCSLSQKLGADMPFFHRFFMSYGVTDEIFGVSVC RPGFLSPFYSYGLICVAVPGWTLGTLLGAVSGELLPARLLSALNVALYGMFLAVVIPPAK TSRILTGVILASMALSLVFAHVPLLAGLSSGFKIILLTILIAGAAAILFPVKEENEHES >gi|157101622|gb|DS480702.1| GENE 66 76719 - 77024 400 101 aa, chain + ## HITS:1 COG:no KEGG:Closa_1065 NR:ns ## KEGG: Closa_1065 # Name: not_defined # Def: branched-chain amino acid transport # Organism: C.saccharolyticum # Pathway: not_defined # 1 101 1 101 101 114 73.0 1e-24 MNHNVYLYILVMAGVTYLIRLLPLTLIKKEIKNVYVKSFLYYVPYVTLSVMTFPAILHAT ASVWSGAAALAVAVLLAWKGKSLFQVSLAACAMVFLLELFL >gi|157101622|gb|DS480702.1| GENE 67 77052 - 78350 1339 432 aa, chain - ## HITS:1 COG:no KEGG:Closa_3037 NR:ns ## KEGG: Closa_3037 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 13 427 15 428 433 471 63.0 1e-131 MKHLEHFHELKSLTRKQKFFLLSLLPVYFMVIGLFLQPIQEIGPGIVRLIKEPDFLITDY FVVGGVGAALINAGALTLMSIGIIYFMGMDMDGHTITSSCLMFGFSLFGKNLLNIWAILA GVYLYARYHKTSMRRYIYIGFYGTSLSPIITQMMQVGNLPVVMRMAIALAVGLIIGFVLP PLSTHVHFAHKGYSLYNVGFAAGIIATVIVSVLKSLGVEIESRLIWSTGNNEIFGIVLSV LFGGMIVFGVAVRGRSIWESYRRIIKSYGIGGTDYLRDEGGASTVFNMGVNGLFATYFVL AVGGELNGPTICGIFTIVGFGATGKHLRNIAPVMMGVYLASFTKTWNIYQPSPMLALLFS TTLAPVAGEFGVVAGIVAGYLHSSVALNVGIVYGGMNLYNNGYAGGIVAIFMVPVIHSIM DRRARARGGLSL >gi|157101622|gb|DS480702.1| GENE 68 78543 - 79781 1239 412 aa, chain - ## HITS:1 COG:VC1168 KEGG:ns NR:ns ## COG: VC1168 COG1301 # Protein_GI_number: 15641181 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Vibrio cholerae # 1 398 43 444 458 192 32.0 1e-48 
MLKKYNKMNDITKILIAMLLGSLLGILLGEKATCIGFIGTIWLNLMKMFLVPIVICMMVK GITSMDNPKTLGRIGIRIVVFYMFTTVAASVLGLVITGVFHPGVGFQFTEGSAEAVEIAE LPTVGKFFTDMFSSNIFATFNNANMMQVLIIAVIMGVAIVMLPEEKRTPVRNWFVSMADL VMSIISIALKLAPIGVFCLMANALGKYGINLLLTMSKLLGTFYLCCILHLIVVYCLILWL FTGITPLNFLKRSFPTIAAAASTCSSAAVIPVSMNVAKENFEVEDSVAGLGITLGGTINK DGVAVLCSVVILFSAQAMGIALTPGQIFNTIFVTALVTSTGSGVPGGGLMNLMIVAAAVG IPLEIVIMVGGFYRFFDMGTTTMNCLGDMSATVIIDRLEKKRTQRLGKTANS >gi|157101622|gb|DS480702.1| GENE 69 79809 - 80987 841 392 aa, chain - ## HITS:1 COG:alr2458 KEGG:ns NR:ns ## COG: alr2458 COG0787 # Protein_GI_number: 17229950 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Nostoc sp. PCC 7120 # 6 353 28 378 401 182 30.0 1e-45 MDGYNLKRSCWLEISLDNVRDNFFAFKKMVGPDVKVMPAVKANAYGHGIVPCGKELEACG ADYLGMGSISEAIKLRENGVKMPLLVFASNTIEEVAELYIQYHLIPTILSEREAKAISDA ASSPVPVFVKIDTGRGRLGVNAEEFPDFYRKITALPNLYVEGVYSHMAAVNWPDVSAEYG LWQYERFSAAIEVLGEEGKKIPFCQLANTPGGIALPEIRMTGVCPGRAIWGYSPLDRREG HPQLKHPITAWKSRLIHINQVCGGKFGENYKAVKLDSPKRIGIMAGGLGDGISPKLAGGY VLLHGRKAKVASTISLEHTIIDLADFPDAKVGDEIVILGRQGDEEITMEQRMEEWGRGVP SFWIEISSHTARCYYKEGKLLAVSESGVLQYI >gi|157101622|gb|DS480702.1| GENE 70 80991 - 83240 1348 749 aa, chain - ## HITS:1 COG:ECs0799 KEGG:ns NR:ns ## COG: ECs0799 COG1048 # Protein_GI_number: 15830053 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli O157:H7 # 1 746 9 761 761 758 51.0 0 MFDIQMDGIYLDNGCLVKEAGIQPDDARENTIAYQILRAHSISEEPDRMQIKFDALISQD LTYVGIIQTARSSGMTEFPVPYAMTNCHNSLCAVGGTINEDDHVFGLTAAKKYGGNYVPA NLAVMHAYAREELSACGGMILGSDSHSRYGALGTMGFGEGGPELVKQLLGRTYDIPSPEV VLCYVTGNTKRGVGPEDVALAMVAALFKPGTVKNKILEFTGPGLKNLTADYRCNIDTMMT ETSCLSSIWETDEITQAYYRNHGRPGDYRKLKARSPAYYDGVIYLDLGRIESMIALPFHP SNAVTIHEFNERAKELLEEVDQEAERLFGKNKLQLSHKIKDGRVIADQGSIAGCAGGLFE NLQEAAEILKDATLGKDGFSMTVYPASMPVNLALIRSGAAEKLLSSGAIMKPCICGSCSG YGDIPATHTFSIRHATRNFPNREGSRPDEGQYCAVALMDARSIAATAANQGVITAATDLD YQIPDLKYCFDGTVYKKRVYYGYGKADASVRLVTGPNIVDWPPMDELHDNILMRFSAVIH 
DPVTTTDELIPSGDTASYRSNPIRLAEYALCRRVPGYAGYCRSIQAVEEERKQGKMPEEL VQVMNRFGIDEDMIAGTGYGSCLFANKPGDGSAREQAASCQKVLGGIANLCFEYATKRYR SNCINWGIIPFTLDGREKFDCVQGDYIFIEGIRKALLEGKKCIAARTLGSDGNVRGITLA LEGLTETERDILLAGCLINYYRKQKIGEC >gi|157101622|gb|DS480702.1| GENE 71 83391 - 83483 71 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFNFKTQYDKLDVDYAAFVFYIFVIPGGSL >gi|157101622|gb|DS480702.1| GENE 72 83498 - 85393 1379 631 aa, chain + ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 51 626 51 626 627 313 36.0 5e-85 MLISVYPEMLEDIQAVCQTMHIEPTILQWEIAQQGLLDKLNQMFLELPKPDVIISRGATA GMIEQYFPEIVSIRAEPDNLEILEILNKAREYGSRIGLILFDEYVDHYKIKTVEHLLDVE EVRIYPFRSRADIESQVLRGKRDGMDVMVGGGTLALRMGKQCGMNVCFVESGRLSLKKAI RQAESIIYARQREKLQLKCFSSAASSIQEGILSTENGTIVVANQEMGHILNINERLILNK KAEALPSSLMAPCILTFMFHDTADDGILKINGQNYYVKKSTARSKTNQRIIAVFQGTHNI QYQEQRVRSELRNNGFMAKYTFEDIVAESALMKSLMDKARLYASTNAGILINGNSGTGKE LLAQSIHNASPRHNQPFVAVNCAAIPASLLESELFGYEEGAFSGAKKGGKRGLFELAHKG TIFLDEINSMPIELQSVLLRTIQEKEIRRVGAQKNIYVDIRIISACNKNMEELIADGKFR ADLYYRLNTLKLTMPDLETRKEDIPPLCGYFLKRYSAKYSVPIPALSPEDSEILLKRSWP GNVRELENTMHRYVILSSLQHCSVNDCLDGTPPEQAASSGASQPPLEGTLAQMELALIRK CLEENHGSREKTAKQLGISRSTLWKKLSQQA >gi|157101622|gb|DS480702.1| GENE 73 85398 - 87620 1521 740 aa, chain - ## HITS:1 COG:ECs0799 KEGG:ns NR:ns ## COG: ECs0799 COG1048 # Protein_GI_number: 15830053 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli O157:H7 # 1 738 9 758 761 665 46.0 0 MIQLKKEPFVRLADGTFVRACDYCADIETARSHTMAWRILEAHNEGPDMENLYLKFDMLV SPDDNYTSILQELCAVGERPLAIPWILTNCHNTLAATGGTINNDDHAYGLGCMKKLGGIF VPPYTAVIHQYMRECAAGGGRMILGSDSHTRYGCLGTMGIGEGGTEIARQALGSTYDIRR PPVIAVKLSGAPVPGVGPMDVALTLIGAVFDCGFCKNKILEVVGDGIANLSMEYRMGIDV MTTETAALSSIWMTDETVREWLAIHGREDAYEKLEPTGDALYDGLIEIDLSSMEPMIALP 
FHPSNVYSIRELCRDRAYLDGVLESVEKDAYMRSGLTYTLRDKIKNKRLMVQQASIAGCA GGTFDNIAAVADILDGYKISGDGISLGIYPASQPVFLETMRQGVAERLLLSGVTIRPCIC GPCFGTVDVPANNTLGIRHVSRNYYSREGSKTDSGQLSATALMDARSIASTVRNGGLLTA ATEMDTVYRKFDYHFENTIYRGGVYNGFGCPNPCIHVPKGPNIKDWPDFSEMRKHVLLKA AACFEGSITTDELCPSGEASSLRSNPEKIAAYTLISKDQEYVEYAGNIRDALENENTGTR SVLEQVCTMLGSEMKDISYGSLLISDKIGDGSSREQAASNQKVLGGWANLAIEYSTKRYR SNCISWGLIPLSCKERPDLKKGDYLLLPDASGKIRMGEEQLEGIILKTGEKVRLSIGEMT EEERRMLVLGGMINYYRSIR >gi|157101622|gb|DS480702.1| GENE 74 87698 - 87817 58 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941193|ref|ZP_02088530.1| ## NR: gi|160941193|ref|ZP_02088530.1| hypothetical protein CLOBOL_06086 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06086 [Clostridium bolteae ATCC BAA-613] # 1 39 1 39 39 71 100.0 2e-11 MGCPGDRALFLFWGQLKEEFFVFYCFTREIGHVLLWNLA >gi|157101622|gb|DS480702.1| GENE 75 87831 - 88181 512 116 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20051 NR:ns ## KEGG: EUBELI_20051 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 6 114 3 111 121 112 48.0 6e-24 MGFNKKKDQNYLEKVPVRNPEFSWKEDQQGIVTVDMVHKGIFDKLAQKLWVTPRVSHVKL DRFGSFVWKQMDGNRNIIAIGALVREEFGDQAEPLYERLAKFVKMLRDNRFVIFGK >gi|157101622|gb|DS480702.1| GENE 76 88284 - 90230 2315 648 aa, chain - ## HITS:1 COG:VNG6268C KEGG:ns NR:ns ## COG: VNG6268C COG1297 # Protein_GI_number: 16120189 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Halobacterium sp. 
NRC-1 # 6 597 25 618 655 366 40.0 1e-101 MKDENGFKPFIPADKVVPEVTAVSVILGILLAVLFGAANAYLGLRVGMTVSASIPAAVIS MGVIRVILRRDSILENNMVQTIGSAGESVAAGAIFTLPALFMWMSEWGEGAPSLVEIALI ALCGGVLGVLFMIPLRQALIVKEHGVLPYPEGQACAEVLIAGEQGGSKAGVVFAGMGIAA LYKFVADGLKLFPSDVDYDIQKYGGGVGISVLPALAGVGYICGPKISSYMFSGGIISWLI LMPIIKLFGSGLTAPFAPADVLISEMSSQAMWGSYIRYIGAGAVAAAGIISLIKSLPMIV NTFRDAMRDLKGGRQASTLRTDRDIPMNVVLIGVLIAVVIIWLVPAVPVNFVGAMLIAVF GFLFAAVSSRLVGLVGSSNNPVSGMAIATLLVSALVLKAVSGATHESMMGVISIGGVICV IAAIAGDTSQDLKTGYLVGSTPRNQQMGELIGVVSSAIAIGGVLYLLNQAWGYGSAELPA PQAMIMKTVVEGVMDGNLPWNMILAGAAIAVVIEILGIPVMPVAVGLYLPLRTTAAIMVG GLIRHWYEKRKYAKEEDKKEAIDRGVLYTSGMIAGEGLVGILLAVFAIIPLASARGGYLG DFINLSALDNGLGAFLVSPAAKIVSLIVFGLLMLTMCKFTIWHKENRK >gi|157101622|gb|DS480702.1| GENE 77 90514 - 91494 949 326 aa, chain - ## HITS:1 COG:CAC1480 KEGG:ns NR:ns ## COG: CAC1480 COG0673 # Protein_GI_number: 15894759 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Clostridium acetobutylicum # 1 318 4 320 320 256 40.0 4e-68 MNWAILATGTIAAKFAETVNGMKGEDRLAACGSRSREKAEAFGKKYGIPRTYGSYEELLR DEEVDAVYVATPNNMHYENCMACLEAGKHVLCEKPFTTNVKEGERLFAAAEERGLFIMEA FWIRHLPALKKMQELINGGVIGQVGFARCDFGFVAEGARKARKFDSGLGGGALLDVGIYN LGFLRMVMGDKQPEHVTSEYHLNEFGTDDYSTVLLRYPGGFRAAATASIGMDMPREAAVF GSGGSIYLPDYQKAERLIIRPMKGEEYVMDFPFEVNGFEYQIREVNRCVRLGMSSSDVLK KEDTLDILRLMDDIRASWGLVFDCEK >gi|157101622|gb|DS480702.1| GENE 78 91570 - 93519 1871 649 aa, chain - ## HITS:1 COG:CC0655_2 KEGG:ns NR:ns ## COG: CC0655_2 COG2199 # Protein_GI_number: 16124908 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Caulobacter vibrioides # 481 644 3 170 205 81 33.0 5e-15 MDRKFRSKSIAFKLTIAFVVSLVLQSVLLVSFMVAGGVMEQAEANQYRIFAEKVKGRRNN LENQMKNVWTNFSHHTDTISRYFDSPDGLEYQGQPDRMLEALAPEVLDALYDTKTTGVFL ILPDEGETDGSLAALYFRNNNPDRNSSQNENLYMLIGPWNVAEKMKISTTANWNFRLEVN EENRDFFQKPWSAAAENGRSKWLGYWSPPFQVNEGDEDVITYSVPLFGQDGRVLAVFGVE ISVSYLYRFLPASELQTGDSYGYVIGYRSEREKKETAVTYGALQRQLAGQDRKFEMELKD 
KENSIYTLLNTDRTGRKINACVNQMGMYYHNTPFEGDEWYLIGLMEEPVLLQFPRRIGQI LEYSLVLSLAAGLIIAVFTSQWFARHAKLIELSELPVGAFEISGRSSKVLMTSQIPRLLR LTKEQEHLFSRDKDEFIRFLKSLASHRDGGPGVIKMDTAEGTRWIKITCKTAVHPIRCVV EDVTDEILETKALKVERDRDGLTGVGNRLAYEHMLQHKNSHPQEIPGSGFLMCDLNDLKG VNDRFGHDKGDEYIRRSADIIREAFPGAPLFRIGGDEFVVQLDGLGQEQVTRGILAMERA VKAYSDGNVFPAGIAAGYAFFDPGKDVSLESTLARADTYMYRKKHKMKR >gi|157101622|gb|DS480702.1| GENE 79 93497 - 94936 1258 479 aa, chain - ## HITS:1 COG:no KEGG:CLOST_0658 NR:ns ## KEGG: CLOST_0658 # Name: not_defined # Def: conserved exported protein of unknown function # Organism: C.sticklandii # Pathway: ABC transporters [PATH:cst02010] # 7 465 7 470 479 334 39.0 4e-90 MNRRLYVFMLAAVMAAALGLGGCMAGDGKKEENEKPVVVVLWHSYNAVAKGAFDDLVMEF NETVGMEQGIIVEPVGYGSSNELDDVLYASASHVIGSDPLPDIFSSYPDSAYRLDGIVPL VHLDDYFTEEELDAYRSEFLKEGVWEGDGIHRMIPVAKSTEILYLNETDWETFAGETGAD KEMLKTWEGLAQAAGMYYEWSGGSPFLGMNAFNDFAALSAAQLGEGIRWTQDGPEFNYSR DTARRVWDAYYVPHIKGWYESRTYNQDGIKSGRLMAYIGSSAGAGFFPQLVIEDEKQSHP ISCQSYAYPVFQDGTPYMGQRGANMAVFASDESHQQAAVRFLKWFTQPEQNIRFAVATGY LPVQEKALESVSDLVSHVESRDNVQAVEQSIRTSLNALKNHNVYVRETFLGSYDMDQIFS SSLENCVNVDLETLKQRMEKGESRESVERELLDEEHFDRWYETLLKEMAGKTDGQKIQK >gi|157101622|gb|DS480702.1| GENE 80 95319 - 95525 221 68 aa, chain - ## HITS:1 COG:CAC0976 KEGG:ns NR:ns ## COG: CAC0976 COG2155 # Protein_GI_number: 15894263 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 4 60 2 58 69 69 61.0 1e-12 MGNKALDYTALTIALIGAVNWGLVGFFNFNLVSWIFGTASWVTRIIYALVGLCGLYLITF YSHSGREA >gi|157101622|gb|DS480702.1| GENE 81 95684 - 98416 2663 910 aa, chain - ## HITS:1 COG:CAC0854 KEGG:ns NR:ns ## COG: CAC0854 COG0480 # Protein_GI_number: 15894141 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 6 643 5 640 644 538 42.0 1e-152 MKSIVLGILAHVDAGKTTLSEALLYLAGSIRKMGRVDNRDAFLDTYALERARGITIFSKQ 
AALTWEGMPMTLLDTPGHVDFSAEMERTLQVLDYAVLVISGADGVQSHTITLWRLLAKYR IPVFLFVNKMDQPGTDREQRMRELQKRLDDGCADFAGAGTEEFTEHAAMCDEKLLERYLE TGDVDDDEIRRLIKERRLFPCFFGSALKLTGVEELLGGIRKWALAPEYPQEFEAKVYKIS RDDQGNRLTHLKITGGSLKVKGIISGGNTEKPSETWQEKVNQIRVYSGDRYETVAEAEAG TICAVSGLTRTYPGQGLGAGQDSMVPVLEPVLNYQVKLPEGCDGAVMLPKFRQLEEEDPL LRVVWNEELKEISMQLMGEVQIEVLKSLIEERFGIQVEFGTGNIVYKETIAGAVEGVGHF EPLRHYAEVHLLMEPGERGSGLQFETRCSEDDLDRNWQRLVLTHLEEKVHRGVLTGAAIT DMKITLVAGRAHNKHTEGGDFRQATYRALRQGLMEASCILLEPWYTFRLEVPEASIGRAM TDIEKRCGTCVIEENRQGQAVLTGQAPVAAMRGYQAEVMSYTRGQGRLACALKGYEPCHN SREIIGQTGYDPERDTENPTGSVFCAHGAGFVVSWDQVKEYMHVDSGLVIESPDGEEMEG ERNPFSGRHPGQSSAGQEESSEVWLGTDEIDAILERTFYANSRDKSARKGYPGKSRGSRS AAAYSGPVTRTYQKQEARQEYLLVDGYNIIFAWEELRELAQDNMDGARGRLMDLLCNYQA IRKCCLMVVFDAYRVTGHATEVSEYHNIQVVYTKEAETADQYIEKFAHENARRFDVSVAT SDGVEQVIILGQGCRLISARELKEELDRVNGMLREEYLEQPGLKRNRLYDILPEEVIRQM KEAAAEDKQD >gi|157101622|gb|DS480702.1| GENE 82 98449 - 100461 1581 670 aa, chain - ## HITS:1 COG:BS_yoaE KEGG:ns NR:ns ## COG: BS_yoaE COG0243 # Protein_GI_number: 16078918 # Func_class: C Energy production and conversion # Function: Anaerobic dehydrogenases, typically selenocysteine-containing # Organism: Bacillus subtilis # 5 632 11 643 677 414 36.0 1e-115 MASYIRKTICPYDCPTSCGLLAETDGTRILSVKGDPDHPAAKGLICRKMQRYEQSIHSSE RILTPMKRVGAKGEGEFEPITWDAAVKEIAGRWNRILEEDGADAILPMYYSGVMSVLQRK CGDAFFNRMGACDMVRTLCSSAKGAGYEAVMGQTGCLDPRELSDSSFYLVWGSNMKATRL QSMPALIRARSQGKRVVLIESCAADMDVYCDQTILIRPGTDGALALAMMHVMEEENLADR DFLAHQTEGYEAFRKTLSPYTPAWAEKETGIPAEVIADLAKEYASAAAPAIILGSGPSRY GNGGMTTRLITILSAYTGAWGRPGGGLCGCNPGAGPYVDNRLVTRPDFRKKPGRRVNINE IASALTGNRGEKPIRSFYVYGGNPVASVCCQKGIMEGLLRPDLFTVVHERFMTDTAMYAD ILLPAAFSVEQTDVYTAYGYCTFGTARKIIEPAGQSKSNWNTFCLLAQAMGYEEAYFKKT EEEMFEELLAHPMEGLSCISEEEWRILREGGAVSTAFADHGRFRTPSGKMMIYNENMEES MPRYVKCHGGTYPLRLVAVPSAYTLNSVFMDREDLKSGRGPMALMLHPEDAAARNIGDGS LVTAFNDLAEVEFTAKITPLVAEGTAAAEGVYDRTFTKDGLLVNALHHERLSDIGAATTL NDNTVDVKPC >gi|157101622|gb|DS480702.1| GENE 83 100636 - 101580 795 314 aa, chain - ## HITS:1 COG:MA0617 KEGG:ns 
NR:ns ## COG: MA0617 COG0053 # Protein_GI_number: 20089506 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Methanosarcina acetivorans str.C2A # 2 303 6 311 331 188 36.0 1e-47 MGEGNEQKVSGKKIRSHGAGLAMRVSCVSIAINVVLSVFKVGAGILAHSGAMISDGVHSA SDVFSTLIVMAGITMASRKSDKEHPYGHERMECVAALLLSAVLFATGIAIGVSAVETIGS GPEGSRNVPGMLALGAAVISIVVKEWMFWYTRAAARKLKSGALMADAWHHRSDALSSVGA FIGILGARMGVPVMDPLASFVICIFIVKAALDVFRDSMDKMVDKACDEETVRSIEQAALD TRGVERVGSMKTRLFGSRIYVDLEIEADKSLMLEQAFTIAKEVHDTIEARFPQVKHCSVQ VSPEGSTGLEGKEN >gi|157101622|gb|DS480702.1| GENE 84 102020 - 103006 1020 328 aa, chain - ## HITS:1 COG:no KEGG:Closa_0123 NR:ns ## KEGG: Closa_0123 # Name: not_defined # Def: diaminopimelate dehydrogenase (EC:1.4.1.16) # Organism: C.saccharolyticum # Pathway: Lysine biosynthesis [PATH:csh00300] # 1 328 1 328 328 527 75.0 1e-148 MAIRVGILGYGNLGRGVECAVKHNPDMELKAVFTRRNPDSLSILTEGAKVCRAEDVLSMK DQIDVMILCGGSAMDLPGQTPEMAAHFNVIDSFDTHANIPRHFEAVDRAAKESGHVGIIS VGWDPGMFSLNRLYANAILPGGSDYTFWGKGVSQGHSDAIRRIKGVKDARQYTIPVEAAL TAVRSKKAPKLTTRDKHTRECFVVAEEGADLKAIEEAIVTMPNYFADYDTTVHFISQEEL MRDHAGIPHGGFVIRTGSTGWNDENGHVIEYSLKLDSNPEFTASVIAAYARAAYRLSREG QSGCKTVFDIAPAYLSAADGAELRKHLL >gi|157101622|gb|DS480702.1| GENE 85 103130 - 104677 1025 515 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941210|ref|ZP_02088547.1| ## NR: gi|160941210|ref|ZP_02088547.1| hypothetical protein CLOBOL_06103 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06103 [Clostridium bolteae ATCC BAA-613] # 1 515 24 538 538 888 100.0 0 MAALVWLLCGGCAAAVACGGPGAAGLTAFTAFADVHDPTADTSTGTGEVLVDIAGEDQLK QWYDQRLMGGNSIAVLPKNLVITKPLTLGLPDGQTSTSPIEIRIPEGPIKILAPKNLTQT GVVIDNPNLLITGTKSLISVEGEGCLTLKRGQIQRDASDEPAIILRNQGMLVWEEGKDFV GLKKTDIRDERIDPPEPVPPGESQTPEYGGTQPQLTEAVLLNLGADGSGSARLEFKNLPS DINALYILRSESGHSWKKEKNKVTAASSQSGQETVEYENFLKETGNTLNTSIVEDGYLIY RFQSGDSSFYVKASIEWPGGSCETDKVKIAIPETVGQGLTFSYGGSYSYAGGYTGSYGSS SGGYGGGVTSGGNSSAQPGTQEESDAPEAPVRGRGGRRRESYTPYSAPGYKDGGAAAGQT 
AEEMLKEATPSQVRGRGEKNRDSDPQIQPADEVEDTAGLPEGENEVTDTEQEPSKEEVQD GEDSGKSGLKWYAGAVLAAAVIGGTAVCFRRRKRK >gi|157101622|gb|DS480702.1| GENE 86 104771 - 105547 706 258 aa, chain - ## HITS:1 COG:STM2546 KEGG:ns NR:ns ## COG: STM2546 COG0483 # Protein_GI_number: 16765866 # Func_class: G Carbohydrate transport and metabolism # Function: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family # Organism: Salmonella typhimurium LT2 # 25 229 28 232 267 159 41.0 4e-39 MEINAERVIELVKSVRPLFMDHERASRITVKGAADFVTQVDFLVQERMRSGLFDMYPQVQ FMGEEKDNRDIDFSGAVWILDPVDGTTNLIHDFRCSALSLALCDRGQMELGLIYQPYEDE LFLAQRGKGAFLNGNPIHVSSVKDLGQSLISIGTSPYNHDLADRNFEDIKRVFLKCSDIR RTGSAAVDLAYVACGRLDAYFERCLNPWDFAAGMLLIEEAGGRVTTFAGRAVDVSGPSDI LSSNGRIGILLEDELLPL >gi|157101622|gb|DS480702.1| GENE 87 105633 - 106802 1013 389 aa, chain - ## HITS:1 COG:BH2292 KEGG:ns NR:ns ## COG: BH2292 COG3858 # Protein_GI_number: 15614855 # Func_class: R General function prediction only # Function: Predicted glycosyl hydrolase # Organism: Bacillus halodurans # 2 378 51 425 426 272 41.0 9e-73 MKIYVVKQGDSVDSIAESQGIPVETLVWANQIEYPYRLAVGQALYISDGEIGEGRRPLFS SGYAYPFIDSGVLEDTLPYLSAINVFSYGFTEDGNLVLPMADDAWMIERALQWGVRPVLT LTPLGEDGRFNNNLVSALVRSQENQQRLIWELGRTMQEKGYEGLDIDFEYVLAEDRVEYA DFVRRATQVLNIFGYTVTVALAPKTSAQQRGLLYEGIDYRLLGEAANHVMLMTYEWGYSQ GPPMAVAPINMVRKVVEYAVSEIPPEKIILGIPNYGYDWPLPFERGVTRARSLGTLEAVK LAVDFGVDIRFDETAMSPYFRYWQYGIQHEVWYEDARSIRAKFDLIKEFNLYGAGYWQLM RFFRANWLLMDDMFYIERDWPLMDYEDMS >gi|157101622|gb|DS480702.1| GENE 88 106855 - 108210 1534 451 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 450 1 444 447 266 38.0 6e-71 MVNNLTEGKPLKLLFFFAMPMVVGNLFQQLYNMVDSMVVGRFVGEDALAAVGSSFPVVFL SVAIAAGLSMGCTVVISQFFGAGKILEMKITVSTALISLGVIGLIIMGIGEIIAGPLLTL LGTDPDIMADSLAYLRIYFGGAVFLFLYNSLNGIYNALGDSQTPLKFLMVSALTNIVLDL 
LFVIQFNMGVAGVAWATLIAQGMCAVFSFFVLIARLRKMENEEAAKGKSFTFFDMSSAER IAKVGVPSMLQQSIVSISMMLMQGLVNSYGKVFVAGYTAATKIDTLAMMPNMNFSNAMSS YAAQNIGARKMDRVKQGYKASMFMVVVFSLIITSIIFLFGPQLLGLFLKQGAEGSAMSYG LSYMKTVSVFYILMGALFVSNGLLRGAGDMGAFMLSSVVNLFSRVAIAYLLAHFIGASAI WWSIPTGWAVGALFSFLRVRSGKWMARRLVD >gi|157101622|gb|DS480702.1| GENE 89 108200 - 108637 433 145 aa, chain - ## HITS:1 COG:MTH313 KEGG:ns NR:ns ## COG: MTH313 COG1846 # Protein_GI_number: 15678341 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Methanothermobacter thermautotrophicus # 15 135 21 142 146 66 31.0 2e-11 MNQYTGIYRRCELWFIRRELEQFGLQPLEGKIIMFLQDNQCTQEDIGAHFDLDKGRIARN LSELEEKGLVRRLINEKNKRQKFVSLTVRGNQVLEEIHRISGRWDEICFAGFTEEERGQQ QDYLRRIAENAIEYKHRVGECTYGK >gi|157101622|gb|DS480702.1| GENE 90 108840 - 109751 683 303 aa, chain + ## HITS:1 COG:no KEGG:Clole_2712 NR:ns ## KEGG: Clole_2712 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 6 296 6 257 264 212 42.0 1e-53 MIRPLKKSPRQMEWKALEKKEGRFITAARHQKEPLLNRKLERFVPEKLEDTLNLAFYKAF ELIFNRGTSIIEKTYRKEDMENTYKVNAYAAGLKESRKTMKAFSREAGKNKVRNLAAAGA GGIGLGALGIGLPDIPLFTGMVLKSIYETAISYGFSYDTAEEQCYILKLISTALSRGDAA ESGNRSLDAMGRLIRTGTPLSETALSGTPLSGTPLSGTFSGQDGLAALPGGPESHDTASL CTSLMQQASHALSSELLYMKFLQGIPIAGIIGGMYDAVYLKRIADYADMKYKRRFLEKSN PDI >gi|157101622|gb|DS480702.1| GENE 91 109869 - 110228 346 119 aa, chain + ## HITS:1 COG:no KEGG:Closa_2012 NR:ns ## KEGG: Closa_2012 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 119 1 119 119 147 73.0 2e-34 MDKEVLSFVVEKTHELMNAASCSKEAKASAQTWLDAVGTENEAAATKTYIAELEADIMPV DGLIGFAESEMGAQVFGADKAKEVAAHAKEIKAAGAKYCDCPACAAAEAILEKKDDILK >gi|157101622|gb|DS480702.1| GENE 92 110524 - 111015 472 163 aa, chain + ## HITS:1 COG:no KEGG:bpr_I0833 NR:ns ## KEGG: bpr_I0833 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: B.proteoclasticus # Pathway: not_defined # 1 149 35 185 200 64 29.0 2e-09 
MKDICEAAGLSRGGLYTHYGSTGQVFADIIEELMSGLESQVAGKMERGLPASLILDELLE RYQSEMLDRSGSLGLAFYEYYSGLPLTEDNAMLKQYYSSKTMLCSLIEYGIGKGEFRQAH ADAVADLLLFSYQGVRMLSSIMPLDDDNIPEGMIREIRSMLVK >gi|157101622|gb|DS480702.1| GENE 93 111041 - 111865 713 274 aa, chain + ## HITS:1 COG:no KEGG:CD0692 NR:ns ## KEGG: CD0692 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 274 1 273 273 369 63.0 1e-101 MKLKIAGSGGMFLIPNPFCKCPVCQEARTKRGRYERLGPSLYIEDIGMLIDTPEDIAVAC ERQGISRIDYLSISHKDPDHTRGMRIVEPLGYDCITGKGTPIQFIALPEVIDDINAWNGD GLYYYQNILNCISINRTNYAKIGSIELHLVNNKTHRGNMTFYVISDGSKKAVYACCDVKP FVPNELYYDTDVLIIGLVSDDGILKDGSSLDSAPFRDDMFTLDEMLELKRTYRIKRIVIT HIDEYWGKSYAYYREFEKKLDNVSFAYDGMEIVL >gi|157101622|gb|DS480702.1| GENE 94 111895 - 112659 403 254 aa, chain - ## HITS:1 COG:no KEGG:CPF_0998 NR:ns ## KEGG: CPF_0998 # Name: not_defined # Def: CAAX amino terminal protease family protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 109 213 136 239 267 75 36.0 2e-12 MVFLVFTAWLILSQVMISLVQAFFAPAGYGFLAGITYAAQAGAQVVFIWIFRSHLPHAAG KKEAGLRNGLQMVLFSAMTGIGLCMVHRVLFFLFPQIAQSTLGQMTAEEQSRILNSARTP AYFLYVGLIGPVLEELVFRGILFNCCSGKYGRWYPVFITALLFAVSHRNPVQFVSAICMG LVLGSLLTLTGDIRITMMIHISNNVFSVMQSPFVKSGSEFSLSPAGIIASVLIGFLFLVW GFVQIKKRWSRGNP >gi|157101622|gb|DS480702.1| GENE 95 112698 - 113855 1390 385 aa, chain - ## HITS:1 COG:no KEGG:Thit_0785 NR:ns ## KEGG: Thit_0785 # Name: not_defined # Def: isoaspartyl dipeptidase # Organism: T.italicus # Pathway: not_defined # 3 385 4 391 391 318 44.0 2e-85 MKVLKQGDIYAPEHLGRKDILIEGSRIARIADHIDEYDQIEEVEKVDLGGHILTPGYMDI HVHITGGGGESGPATRVPEASLSVLVRNGITTVVGLLGTDGITRSLENLLAKARAFNEEG ITCRILTGAYGYPSPTITGSVERDIALIDLMVGVKIAMSDHRSSNLTGEQLISLATQARR AGMLSGCCGYVTIHMGSGKAGLDPLFYAIDNSDVPVQKFLPTHMGRTQELFEQGLEFVKR GGTIDMTAGLTREELEETADQILSYLKQDPEGLNMTMSSDAYGSAPRFNDKMECIGLTYA SPKSLHQQLKVLVCERGASLEKVLNLLTKNPARVLGLTGVKGTVAEGADADFVVYDDSMD ILHVMAGGRKAVWDKEVIMKGTFEE >gi|157101622|gb|DS480702.1| GENE 96 113991 - 115130 1270 379 aa, chain - ## HITS:1 COG:AGl618 
KEGG:ns NR:ns ## COG: AGl618 COG4225 # Protein_GI_number: 15890425 # Func_class: R General function prediction only # Function: Predicted unsaturated glucuronyl hydrolase involved in regulation of bacterial surface properties, and related proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 55 376 78 391 397 279 44.0 8e-75 MRSVKLRTSDKKKVQEKLDLVIEKLMNLGGPENGEELAEGGEAIGFFKRDFGIREWDWPQ GVGLYGLLRIMQARKNEDYKEFLYQWFKDNIREGLPSKNINTTTPLLTLAELNEYYHDQE FETLCLKWADWLMNCLPRTREGGFQHVTSANGDRQGVRLNENEMWIDTIFMTVLFLNRMG QKYQNQAWIDESVHQVLMHIKYLCDKETGLFYHGWTFNERNNFGGIFWCRGNSWFTLGIL DYIDMFRGTMNAGVKEFVVDTYKAQAEALKGLQGKSGLWHTVLTDADSYEEVSGSAAITA GILKGIRYGILDDSYLACAWKAVEAILDNIDRDGTVLNVSGGTGMGYDADHYKNILIAPM AYGQSLTILALAEALLHLD >gi|157101622|gb|DS480702.1| GENE 97 115148 - 116455 1035 435 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. PR1] # 1 433 1 431 431 403 46 1e-111 MEIQAAIVLVLVFFLMLAMSVPVSFSIISASLLTIIMFLTPGFGMFVSAQKIVSGIDSFT LLAVPFFVLAGLLMSSGGIAKRLINLAMLFLGRVPGSLAMTNIAGNAMFGSISGSGIAAA TAIGGVMLPLEEEQGYDKGFSAAVNIATAPVGQLIPPTASFIVYSAASGGVSVAALLMAG WVPGLLWAALCMVVACVYGAKHGYVMKRDKMTAAIILKTIWEAIPSLFLIVIIIGGILSG YFTPTEASGVAVVYAFLLSVFVYKSIQIKDISKILVDTAVMTTIVMLIIGASSVLSFVLS FTGLPQAISSLLLGISDNRIVILLIINITLLIVGTFMDMAPALLIFTPIFLPIVKSLGMD PIQFGVMIVMNLSIGTITPPVGNVLFVGCSVARLQVEDVMKKLLPFFLAIVAALMFITFV PGFSMWLPGVLGLLK >gi|157101622|gb|DS480702.1| GENE 98 116457 - 116927 264 156 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 1 145 4 150 164 106 36 1e-21 MMKILNRFLETVLAILVALMVVGCFWQVITRFLLHNPSKYTEELLRYMLIWLTMLGVPYA YGKESHLSINLITRTFKPKNLTRTKIGIEFLVLFISVFVMIAGGVMVTLNSAGQISPAME LPMQVYYICVPISGVLMVLYCLQRLITFAKELKEEK >gi|157101622|gb|DS480702.1| GENE 99 116944 - 117966 433 340 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149195933|ref|ZP_01872989.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 1 308 1 317 340 171 31 3e-41 
MKSCRKTIGIAAALLSIAVICVTFFQVTSAKASDKLVIRLAHNQAAGSEIADSIALISDI VAEDASMNMEVQIYPSGVLGTEKDIIEMVKAGILDMAKVSSNTLGQFKEEYSIFAVPYLF NGQQHYYDAMQKSEKVKELFMSTEDEGYIAIGYYANGARNFYLKEDKPCVEPSALKGKKI RSMPNTTSMDMIEAMGGTPVPMAAGETYASLQQGIVDGAENTELALTVDGHQDLVKSYTY TEHQYSPDIYIISTKTWNRMTREQQDYLVKGFEKTNENFKQLYNGMMDQAMEEAESNGVH VYRDIDKTAFIQAVQPIHNKFCAKGESYKALYDDIQKYAE >gi|157101622|gb|DS480702.1| GENE 100 118183 - 119112 951 309 aa, chain + ## HITS:1 COG:CAC2608 KEGG:ns NR:ns ## COG: CAC2608 COG2207 # Protein_GI_number: 15895866 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 30 279 35 278 284 90 29.0 3e-18 MEKVTFYGEIEGISLEQMVRYGKFDMRVKHFHNQYEIFYIIEGERLFFFNNREYVARSGD LILVDTNLIHMTKSVTAEDTGHNRVILYVSYDRMKAFDGQYPSLQLVRFFHEHYGVYHLD KEQQALFLNFYRNLRIAMTEKAHNYKVGIDLETLTWLFKITSYVSKQEGASPHSSDHPKY RTAYAIADYLSENCEQNISLEELAAHFYLSKYYVCRTFKEITGYTVTEYTNIHRIRKAKR LLEETDMSISEIAHELGYESLTYFERMFKSFMTISPLKYRKTLNTVTYTNEQTTEFEEDT EQLSSHRQP >gi|157101622|gb|DS480702.1| GENE 101 119242 - 120069 972 275 aa, chain - ## HITS:1 COG:YPO0328 KEGG:ns NR:ns ## COG: YPO0328 COG0235 # Protein_GI_number: 16120665 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Yersinia pestis # 3 265 4 265 274 234 44.0 1e-61 MRVLDTAFAKGFIRVCDDGFNQNWHERNGGNLSYRIKHEEVESVKEELSYDGKWQPIGTS VPALAGEYFMVSGSGRFMRNMILEPESNCCIIEVDQAGENYRICWGLAEGGRPTSELPTH LMNHEVKMKATEGRYRVIYHAHPANIIALTFVLPLTDEVFTRELWEMATECPVVFPSGIG VVGWMVPGGRDIAVATSRLMEEYDAAVWAHHGLFCAGEDFDLTFGLMHTIEKSAEILVKV LSIRPDKLQTITPQDFRNLAADFHVNLPEKFLYEK >gi|157101622|gb|DS480702.1| GENE 102 120158 - 121408 1226 416 aa, chain - ## HITS:1 COG:BS_yulE KEGG:ns NR:ns ## COG: BS_yulE COG4806 # Protein_GI_number: 16080170 # Func_class: G Carbohydrate transport and metabolism # Function: L-rhamnose isomerase # Organism: Bacillus subtilis # 1 415 1 417 424 548 60.0 1e-156 
MNIQERYESAKEIYAGLGVDVDAAVERLNHIPVSMHCWQGDDVMGFEGADSLSGGIQATG NYPGKARNPQELMADIDKALNLIPGRHRINLHASYAIFQDGEKVDRDALEPRHFADWVKF AKERGLGLDFNPTMFSHPKAENATLSSEDPQIRKFWIDHCKACVRISQYFAEELGSPCAM NIWIPDGFKDIPADRMSPRARLKDSLDQILAMDYDKSKVLIAVESKVFGIGMESCTVGSH EFYMNYAASRGIMCLLDSGHFHPTEMISDKISSMLLFSDKIALHVSRPVRWDSDHVVLFD DETREIAKEIVRNDPDRVLLALDFFDASINRICAWVVGMRNMQKALLYALLMPNEKLAKL QEERRFTELMMEQEELKTYPFGDVWDYFCQMNQVPVKEQWFKEVQAYEKEVLLKRS >gi|157101622|gb|DS480702.1| GENE 103 121457 - 121822 377 121 aa, chain - ## HITS:1 COG:lin2978 KEGG:ns NR:ns ## COG: lin2978 COG3254 # Protein_GI_number: 16802036 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 18 120 1 103 104 114 52.0 4e-26 MTVEAVTRNDLTKGGYRMVRKGFKMFLNPGMAEEYEKRHNALWPEMKEMIRQYGGHNYSI FLDRETNVLYGYLEVEDEALWAESADTPINRRWWDYMADIMETNADNSPVCVDFVPVFHL D >gi|157101622|gb|DS480702.1| GENE 104 121829 - 123250 1360 473 aa, chain - ## HITS:1 COG:BH1551 KEGG:ns NR:ns ## COG: BH1551 COG1070 # Protein_GI_number: 15614114 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Bacillus halodurans # 4 472 3 466 467 459 48.0 1e-129 MNRYYLAVDIGASSGRHILGWMENGKMNLREIYRFSNGMEMVDGTLCWNLVGLFEEIKAG MRECKKLGINPESMGIDTWAVDFVLLDKQGHVLGKTAGYRDRRTEGMDLKVYERISREEL YRRTGIQKQIFNTVYQLMAVKEKQPDILSQAESFLMIPDYLNFLLTGVQKQEYTNATTTQ LVNPSERTWDRELIEALGYPQKLFGGLSMPGTVVGRLSEDVQKETGLDCRVVLPATHDTG SAVAAVPSNRDHVLYISSGTWSLMGTELKDADCREMSMEANFANEGGYEYRYRFLKNIMG LWMIQSVKKEWKESGEDYSFGEICKRASRETISSIVDCNDSRFLAPESMCKAVKTYCEES GQQVPETKWEMAAVIYNSLADCYRICCREIEALTGVRYDCIHIVGGGANADYLNRLTAEA TGKTVYAGPTEATAIGNLAVQMIEGKEFKSLKEARDSIFCSFEIKTYEPSEKV >gi|157101622|gb|DS480702.1| GENE 105 123574 - 124542 1137 322 aa, chain - ## HITS:1 COG:lin2983 KEGG:ns NR:ns ## COG: lin2983 COG2207 # Protein_GI_number: 16802041 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 51 319 53 323 326 110 26.0 4e-24 
MNQELIQSLSVITEEEQKILAGSADIDRTLYMDREDMVIDSDKLLEAGQLITIRPHTRFV HFPEHTHNYVEVIYMCRGETTHVVDGRELKLKEGELLFLNQHAVQEIQPAGQDDIAVNFI ILPEFFDTAFRMMGEEENLLRDFIVGCLCDDTRYGRYLHFQIADVLPVQNLVENLVWSLM HEREGMQAMNQTTMGLLLRHLMHYTGRIRVNRDSEQSFEQQLTLQVLRYIDEHYREGSLT ELAALMGYDVYWLSRAINRLLGRNYKELLQIKRLNQAAFLLHSTRMPVADVSFAVGYDNT SYFYRIFRTYYGMSPKEYRKNG >gi|157101622|gb|DS480702.1| GENE 106 124811 - 125779 707 322 aa, chain - ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 320 1 323 324 357 52.0 2e-98 MKIVVLDGYTENPGDLTWNEISAMGDLTVYDRTPVDDKEEIIRRIGDAQVVYTNKVPLDR AVFNACPSIAFVCLLATGYNVVDVGCAREKGIPVSNIPTYGTDSVGQFAISLLLEICHHI GHHDQAVKSGRWEKNADWCFWDYPLIELAGKTMGIIGFGRIGQKTGTIAKALGMNILAYD QYPNDAGRAIAAYTDLDTLLGNSDVIALHCPLFPETERMINRKSIEKMKDGVIILNNSRG PLIHQQDLAEALNSGKVFAAGLDVVDTEPIRGDNPLLKAKNCIITPHISWAPREARQRIM DMAAGNLKAFLAGKPVNVVNGG >gi|157101622|gb|DS480702.1| GENE 107 125798 - 126433 480 211 aa, chain - ## HITS:1 COG:STM3358 KEGG:ns NR:ns ## COG: STM3358 COG1802 # Protein_GI_number: 16766653 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 202 1 195 209 120 38.0 2e-27 MKSIQNIQTKDLVVHILREQILSGDLKPGEELAQEDVAEKLGVSRMPVREALQALTQEGF LTRMRNRHVCVSRMKRIQVLESFRIMAVVEAEMLGMLDERAVQVLSAHIDETVKILEEGS EDKGRQMELEYHWLIGELLDNPYLQQMLKKLQEGYLSYVLLKLPFIHSDRVAHLRRLGDA AKEIGGPGIKDALLGYFEHLADSLLDRTAAL >gi|157101622|gb|DS480702.1| GENE 108 126538 - 127602 972 354 aa, chain - ## HITS:1 COG:ECs2509 KEGG:ns NR:ns ## COG: ECs2509 COG0473 # Protein_GI_number: 15831763 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Escherichia coli O157:H7 # 1 354 2 361 361 355 50.0 7e-98 
MKVHKIAVIPGDGIGPEVLEEGIKILKKTACLDGGFSFEFTHFPWGCEYYLKHGRMMDEN GIEILKDFDAIYLGAVGAPGVPDHISLWELLLNIRKSFDQYINLRPVVLLRGAPCPLKDV KCEDINMIFVRENSEGEYAGSGSWLYKGKTNEVVIQNSVFSRIGCERVIRYAYNLAREKG KTLTSISKGNALNYSMVFWDQIFKEVGLEYPDVKTYSYLVDAASMFMVKDPKRFEVVVTS NLFGDILTDLGAAIAGGLGLAAGANLNPEHKFPSMFEPIHGSAPDIAGHNIADPLAAIWS ASQMLEFFGYNKWAQRILDAVETIMWEGKRLTPDMGGSSTTSQVGDEVAAILER >gi|157101622|gb|DS480702.1| GENE 109 127625 - 128569 366 314 aa, chain - ## HITS:1 COG:YPO2874 KEGG:ns NR:ns ## COG: YPO2874 COG0679 # Protein_GI_number: 16123066 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Yersinia pestis # 6 314 8 318 318 105 29.0 1e-22 MEHLIFALNATVPIFCLVLLGMAFRMTGILNRDFADRINHFVFNISLPAVLWRDLSGADF LKVWDGKFIGFCFMATTASILIAAAASMFLKKKEIRGEFIQASYRSSAALLGIAFVQNIY GNSSMAPLMIIGCVPLYNIMAVVILEMMRPGRGRINKGLLIKTLAGILHNPILWGIICGM AWSASGIPRPAVLVKVIGDVAVLATPLGLLAMGALFDLGRAGKSIGPALSASFIKLAGLE FIIIPAAIAAGFRREALTAIFIMLGSATTFGAFVMAKNMGYDGDLTSNTVMITTCGCAVT LTIGLYILKSLNLI >gi|157101622|gb|DS480702.1| GENE 110 128571 - 129977 775 468 aa, chain - ## HITS:1 COG:SPy1189 KEGG:ns NR:ns ## COG: SPy1189 COG3051 # Protein_GI_number: 15675158 # Func_class: C Energy production and conversion # Function: Citrate lyase, alpha subunit # Organism: Streptococcus pyogenes M1 GAS # 2 467 46 510 510 456 51.0 1e-128 MSKLVSDLREAIERCGLRDGMCISFHHHFRGGDFVLNMVMDEIADMGFKNLKINASSIHD SHAPLIRHMENGVVTALETDYIGPAVGKAVSQGVLKTPVIFRTHGSRPSAIESGQSHIDI AFLGAPASDDRGNCTGTIGRSACGSLGYAFADAACADKVVIITDYLVPYPLTRRSISEEY VDYVVKVDAIGDPAGIVSGTTRLPRDPIALKIADYAAKAIEASGLLKNGFSFQTGAGGAS LAVAGFLKDIMLKQGIKGSYCLGGITGYVVDMLEAGCFEAIQDVQCFDLRAVESIRDNRN HVEITASQYASPTAKSTAASSLDVVVLGATQIDTQFNVNVHTDSNGYIIGGSGGHTDVAE CAKMTIIVAPLSRARMSIVVDKVDCISTPGSSVDVLVTQYGICVNPCRRDLLERFTEAGI PVADIHDWKAMAEKMNGIPRGIEHKTSRIVARVMGRNGTQMDTIYQVE >gi|157101622|gb|DS480702.1| GENE 111 129970 - 130806 758 278 aa, chain - ## HITS:1 COG:HI0023 KEGG:ns NR:ns ## COG: HI0023 COG2301 # Protein_GI_number: 16271998 # Func_class: G Carbohydrate transport and 
metabolism # Function: Citrate lyase beta subunit # Organism: Haemophilus influenzae # 1 268 17 284 291 206 41.0 5e-53 MIMNGGLLGADSIIFDLEDAVAPDQKDAARILVKSALRSLDFGGCEIIIRMNALDTPYWE EDIEEMVPLGPSAIMPTKVSDGDYIRKLEQKITETEEKNGMDVGKIKLIPLLETAMGIEH AYDIAIASPRMEALYLGAEDLTADLRCERTKEGAEILYARGRLVCAARAAGIEAYDTPFT DVEDMEGLRRDARFAKGLGYTGKAVINPRHVEDVNCIFSPSDKEIRYAREVFGAIREAKR QGKGAISLRGKMIDAPIAQRAKLVLEAASELKGVNYFE >gi|157101622|gb|DS480702.1| GENE 112 130845 - 131120 186 91 aa, chain - ## HITS:1 COG:STM0059 KEGG:ns NR:ns ## COG: STM0059 COG3052 # Protein_GI_number: 16763449 # Func_class: C Energy production and conversion # Function: Citrate lyase, gamma subunit # Organism: Salmonella typhimurium LT2 # 1 91 1 92 97 72 47.0 2e-13 MIIHKNAVAGTLESSDILVVVQPAEHGIHIELQSTVYQQFGDDIIKTICEVTESLGVEAA SIQANDHGALDCTIRARVETALRRASKEEAI >gi|157101622|gb|DS480702.1| GENE 113 131191 - 132090 595 299 aa, chain - ## HITS:1 COG:CC1195 KEGG:ns NR:ns ## COG: CC1195 COG0329 # Protein_GI_number: 16125446 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Caulobacter vibrioides # 1 237 1 236 294 137 32.0 3e-32 MVFTQWKGIFPYLVSPVDEYGKVKEQVLRNLVEHLVGCGVHGLTPLGSTGEFFYLDWEQR REIVRIVVEAAAGRVPVVAGVAASTNRDAVFQAAEFERLGVDGILGILNVYFPLNQNGIY DYFASIADAVSCQVVVYNNPKFTGFEIEIPTLKRLSQISNINYYKDASGNIGRLLQLSNV VGNDLKIFSASAHVPVFVMMLGGAGWMAGPACLLPRESVMLYELCEKKHWDEAFRLQKIM WEVNRVFQKYNLAACVKAGLQFQGFSVGNPIPPNQPLATDAQKEVAQVIGMIQKEFHTS >gi|157101622|gb|DS480702.1| GENE 114 132110 - 132730 245 206 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 1 179 309 485 508 99 34 7e-38 DSVMAKKILTTPIKAEDLEDIHIGDIIYLSGHITTCRDVAHRRLIEGGRKLPVDLEGGAI LHAGPIIRPLEDGRYEMVSVGPTTSMRMEKFEREFIRETGVRLVVGKGGMGMGTEEGCKE YKALHCVFPAGCAVVAADSVEEIEQAQWLDLGMPETLWTCRVKEFGPLIVSIDTYGRNLF EQNKVKFNEKKDKVYERICREVHFIK >gi|157101622|gb|DS480702.1| GENE 115 132721 - 133620 754 299 
aa, chain - ## HITS:1 COG:STM3355 KEGG:ns NR:ns ## COG: STM3355 COG1951 # Protein_GI_number: 16766650 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain # Organism: Salmonella typhimurium LT2 # 1 299 1 299 299 495 77.0 1e-140 MTKTEAVEYMTDIMAKFVGYSGKVLPDDVTEKLKELRVLETDELPKTIYDTMFRNQELAA QLDRPCCQDTGVLQYLVKCGTRFPLIDFVESLLKEATVRATFDAPLRHNSVETFDEYNTG KNVGKGTPTVFWEIVPECDTCEIYSYMAGGGCSLPGKAMVLMPGQGYEGVTKFVLDVMTS YGLNACPPLLVGVGVATSVETAALLSKKALFRTLGSKNSNERAAKLEKLLEDGINEIGLG PQGVSGKMSVMGVHIENTARHPSTIGVAVNVGCWSHRKGHIIFDKDLKYTIISHKGVTL >gi|157101622|gb|DS480702.1| GENE 116 133638 - 134258 187 206 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 2 196 314 505 508 76 26 7e-36 MKKILTTPVTTEDIKDLRVGDIIYLSGELVTGRDDVHHRVVHEGLTCPYDFAGGAVMHAG PIIREEPGKNTMISIGPTSSIRMEADAADFIRLTGVKIQVGKGGMGEKTSAACKEYGAIH CVYPGGCAVSAAAHVEEIKNVYWRELGMPECLWVMKIKEFGPLIVSIDTEGNNMFVDNKK YYASRKEECMAPIIDSVKDYMKVEQA >gi|157101622|gb|DS480702.1| GENE 117 134255 - 135154 243 299 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 26 277 26 281 508 98 27 7e-36 MTHEDAKQRFTDIMEKFIGMSSKRLPDDVYAKLKECGEKEDSDIQKVIYNAYFENLDMAG KLSRPCCQDTGLLHFYIKMGTGFAYQGMVADCLREATHNATYSAPLRQNTVNYFEERNTN DNTGERMPWINWDIVPDNDDLEIITYFAGGGCCLPGRSQVFKPSDGYAAIIRYVFDAVSD LGINACPPLIVGVGLGHNAENAALLSKKAYLRPLGTSHPHPKGAQLEQDLLEGLNKLGIG AQGLRGNCAAMEVHIESSARHTATIAIGVNVACYAHRRGVIRFHNDLSYEMPTYKGVTL >gi|157101622|gb|DS480702.1| GENE 118 135176 - 136453 1239 425 aa, chain - ## HITS:1 COG:BH2671 KEGG:ns NR:ns ## COG: BH2671 COG1593 # Protein_GI_number: 15615234 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, large permease component # Organism: Bacillus halodurans # 3 425 4 426 426 307 44.0 2e-83 MTGLLFGSFFVLMFLGVPIAVALGLAAVIAIVAGLNSSLIVTAQSMFNGINSFPLMAIPF 
FILAGNMMGEGGISKKLVTFVNLLFSRITGGLAMVAIGASMFFAAISGSCPATTAAIGGI MVPEMKDSGYEKTFAAATVAAAGTVGQVIPPSIPMVTYCVLANTSVSTLFLAGVGPGLLM GLTMMVVAYLYAKKHNVPVIKEKKSPKEVLHICLDSMWALIMPVIILGGIYSGIFTPTEA GAVAAVYGMVVGLFVYKELKWRDIPKVVCNSAVAASVIMLIMATVSTFGYVLTMARIPQV IASSLLSFTTNKYVLLFLFNIVVLIAGCFLNSSAAIALLTPILLPVLTSVGVSPYMVGIV FIVNMAIGMITPPVGNCLYVACNIADIKFEALVKAIMPYLIALIISLLAITYIEPISMTL VNLLG >gi|157101622|gb|DS480702.1| GENE 119 136450 - 136944 473 164 aa, chain - ## HITS:1 COG:VC1928 KEGG:ns NR:ns ## COG: VC1928 COG3090 # Protein_GI_number: 15641930 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Vibrio cholerae # 7 114 20 127 232 68 32.0 5e-12 MGNCVLRVIHIVDKFMEVFSALILGGMTLFIFAQVLARYVFKSPLAWSEEGARFMFIWMT FIAGYVGARKGQHIGVELIQNLFPESIKAGMKVLCDLISIGFFILVLFYCCAQWDKLSAQ TSPALKIPMAFVYLGMMLGCFGMAASYFLHIFEVLKKKRGEEAS >gi|157101622|gb|DS480702.1| GENE 120 136960 - 138048 450 362 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646731|ref|ZP_01719241.1| Ribosomal protein L22 [Algoriphagus sp. 
PR1] # 65 359 30 323 328 177 33 3e-43 MKKGKHLVSLILAAAMAAGLTACGGQTAPQTTAAASEGTKQAEAAAPAESQEKEEASAGK AEVTLSLNHVGATTHPYQYGSERFAELVSEKTGGRIAVEVYPASQIASGAKAIEFVQMGT LDIALESTMAAENFIPEIGVLNLPFLFENADQAFSVLDGDVGNELRAAAEAKGFKILCWM YNGFRDISNSVKPITAPEDLKGLKIRVPESQVFLKTFETLGGVPTPMAISEVFTAMQLKT VDGQENPSAIFVNNKYNEVNDYYSVTHHIFTAEPLIMSLDKFNSFSEEDQTTLLEAAQEA AEYQRQLAIDSADQELQQIKDAGVNVNVVEDMSDFKAAVQPVYETFKDQYGALIEKIQEA TK >gi|157101622|gb|DS480702.1| GENE 121 138218 - 138838 455 206 aa, chain - ## HITS:1 COG:STM3357 KEGG:ns NR:ns ## COG: STM3357 COG1802 # Protein_GI_number: 16766652 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 204 11 215 221 125 35.0 4e-29 MWSAREQVVNSLREAILTSELKAGERLVLSDLSARLGVSVTPIREALQSLEREGLVQLMP HKEAVVVGIDRKYVTDYYVTRAILESEAAAMACDAKDMTVLLGHLKRMEQVIEEERYQDY ADANYDLHEAIWILADNERMEGILKTLFISNSLARNTSLKDNALVSFGEHKEIVKAIQMR NKQDARKKMHAHMMRSLRDTLSKYEE >gi|157101622|gb|DS480702.1| GENE 122 138978 - 140894 1264 638 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 189 621 320 754 776 184 29.0 4e-46 MVILGIIWVYARKGSHLPSLKNRMFQECLTVTFAAMLSNILSTFMLSGYILVPVWVTWAI TTIYFILTPLMGMVYYLYAVSVIYEERPQLNRMILLGGIPGAVYTLLVLSNFFTKCLFDI TANQGYEQGSLIFITYLIFYAYCACCIVIAVRNRRSIDRHIYHILATFPVLAVLVIFFQQ MYPNIILSGSAATCALLIIYLHLQNRQISLDYLTNVPNRQELLNMLDLLLRRYPDKDFTL LVVSIRDFRQINNICGQHRGDAFLKLVCAFLCSVGPKENVYRYSGDEFALLFINDRGERV RQCVLDIEARMRQPWQNQDYQFKLPVVMGIIRRSEYVKTLEEAVSYIEYAVSQAKTGRYG QVCYCDQEMLEKLERRKKIIRILKAQLSNHSVEMYYQPIYSVETGQFLYAESLMRMNHTT IGPVYPNEFIPIAEETGLIIELTYVILDEVCKYINQLIEKGLPMEAIHVNFSAIQFSQPD LSRKVLEIIQRNGTPMSAVKIEFTESTLAENPQVVTDFALEMRKHGIEMGLDDFGTGYSN IATVIRIPFGTIKLDKSLIWASIDNPTSALTIRHLVHAFKDLGMTVIAEGVETESQRQLV EDIGVEQIQGYYYSKPLSMEEMEEFLKINRYGSAGGGE >gi|157101622|gb|DS480702.1| GENE 123 
141214 - 141612 408 132 aa, chain - ## HITS:1 COG:CAC2569 KEGG:ns NR:ns ## COG: CAC2569 COG5015 # Protein_GI_number: 15895829 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 2 126 1 124 131 80 38.0 7e-16 MMNKVVDFLQANPVQYLGTVGRDGKAKCRPFMFCFEKDGKLWFCTNNQKDVYKDMQANPE VEVSVSDPAYAWLRLHGRAVFEDNMEVKEACMANPIVKGQYNTADNPIFVVFYLENPHGV IADFSGNPPYEF >gi|157101622|gb|DS480702.1| GENE 124 141861 - 143834 1886 657 aa, chain + ## HITS:1 COG:MA3879 KEGG:ns NR:ns ## COG: MA3879 COG3808 # Protein_GI_number: 20092675 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Methanosarcina acetivorans str.C2A # 1 651 13 685 685 568 55.0 1e-162 MLFLVPVVGLLGLLFAVILRQQVVKEDPGTDKMREIADAIAEGANAFLASEYRILVVFVA VLFFVIGFGTRNWITAGCFLVGSGFSTMAGYLGMNAAIRANSRTANAARTSGMHRALALA FSGGSVMGMAVVGLGLLGVGVLYIITRDVSVLSGFSLGASSIALFARVGGGIYTKAADVG ADLVGKVEAGIPEDDPRNPAVIADNVGDNVGDVAGMGADLFESYVGSLISALTLGLVFYQ EAGIVFPLVLSACGIIAAIIGSLLVKSIGNSDPHKALKTGEYSATALVVVCSLILSRIFF GNFMAAFTIITGLLVGVLIGAVTEIYTSGDYRFVKKIAKQSETGSATTIISGLAVGMQST AVPILLVCVGVLISNKLMGLYGIALAAVGMLSTTGITVAIDAYGPIADNAGGIAEMAGLD KNVRDITDKLDSVGNTTAAIGKGFAIGSAALTALALFVSYAEAVKLTTIDILNAHVIVGL FIGGMLTFLFSAMTMESVSKAAHQMIEEVRRQFREKPGILKGTDRPDYASCVSISTKAAL REMFLPGLMAVLAPLATGLILGPSALGGLLTGALVTGVLMAIFMSNSGGAWDNAKKYIEE GNHGGKGSDSHKAAVVGDTVGDPFKDTSGPSINILIKLMTIVSLVFAPLFLQFGGLI >gi|157101622|gb|DS480702.1| GENE 125 143907 - 144719 677 270 aa, chain - ## HITS:1 COG:VC1721 KEGG:ns NR:ns ## COG: VC1721 COG1609 # Protein_GI_number: 15641725 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 2 266 70 332 336 161 33.0 1e-39 MSNVFFSNLAKGVEDECRRQDWNLILCNTNDKHSRDISYIQVLADKGVDGILFCMALDSS KKKGMESIGLMEKLKIPFVMIDRFLEGARCCFVSLNHELGGYMATEHLLKLGHRRIGCVT GPDVLEDSMARLTGYKKALAEYEIDFEPSLLYHGNYDRESGINGADYLVDKDVSAIFAFN DMCAYGVYNRLKRLGKSVPEDISLVGYDNIFFSEILDVPLTSVEQPVYEMGVEAVKQLID GINSGNCAEKSILFQPRLIVRESTSEYKER 
>gi|157101622|gb|DS480702.1| GENE 126 144737 - 144943 128 68 aa, chain - ## HITS:1 COG:no KEGG:Closa_3272 NR:ns ## KEGG: Closa_3272 # Name: not_defined # Def: LacI family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 6 64 1 59 343 83 71.0 2e-15 MRHGTLRTTIKDIARYTGFSVTTISLVLNGKAKKIPKQTKDVILEAADRLNYHPNQVAVG LVKRGPIQ >gi|157101622|gb|DS480702.1| GENE 127 144979 - 146130 704 383 aa, chain - ## HITS:1 COG:lin0289 KEGG:ns NR:ns ## COG: lin0289 COG0624 # Protein_GI_number: 16799366 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Listeria innocua # 10 375 10 373 378 146 27.0 9e-35 MKEYDEYGFLQAVLEIPSVNGADDEGAVARFLCDFLRDCGVDSQVIPIDDSHADVIANIK GESEDLVVLNGHLDTVPYGKREEWDTPPERCVRRNNRFYGRGASDMKSGLAAMVYVLGTM VKAGCSPRMNLCFMGTCDEEKGGLGARKILQENRMPQPSLLLIGEPTGLKPGVAQKGCMW IELTVDGVTSHGAYPDEGINAVEYGMDIAREFKEIIGSHSHGILGTATVQVTKIQGGIAP NMTPDRAEIFMDVRTVPGMEPEDVVKMLEDVCSRRFKRCPGRGGYHYNVVNQRRAIEIEG SHPWVAEFDEVLKQMGLSTQRVGINYFTDASILTEYNHSMPVLLLGPGEPSLCHKPNEYV ELEKYSKYVYIMKKVFFTSFEND >gi|157101622|gb|DS480702.1| GENE 128 146195 - 146938 799 247 aa, chain - ## HITS:1 COG:mll7663 KEGG:ns NR:ns ## COG: mll7663 COG5426 # Protein_GI_number: 13476365 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Mesorhizobium loti # 2 247 4 255 256 236 47.0 3e-62 MKRVLLAGESWMSYTTHVKGFDSFYTSVYETGEKWLKAALEKGGYQVDFLPNHLASEEFP FTMEEIKKYDCVILSDIGANTLLLPVPTFTRSQKMPNRAKLIRDYVLEGGSLIMVGGYLT FSGVDAKGKWHDTAVQEVLPVEVLTVDDRMEHCEGVKPQVVREHEALSGIPEDWPEVLGY NRTIPREDAVVAVEVEGDPLVAFGSYGKGRSAVFTTDCAPHWAPPEFCEWEYYDKIWQGI ADWLTTK >gi|157101622|gb|DS480702.1| GENE 129 146959 - 147936 1018 325 aa, chain - ## HITS:1 COG:BMEII0433 KEGG:ns NR:ns ## COG: BMEII0433 COG1172 # Protein_GI_number: 17988778 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 
# Organism: Brucella melitensis # 24 312 39 341 346 151 35.0 1e-36 MSKLAFSIAKSRTKFLLLLLALEIVVLSCISPYFLSLGNLLQITQFGAGLTLLSLGEALV MVGGKDGIDISIGSTMSLSGVIFGLAVMGGASIPVAIVISLAAGAALGAVNGVLIAMAGV PALIATLGTQYVYSSLALYLTGGIPISGFPESFEWLSLKSTFGIPNQILFVVIPVTILVF ILIYKMKFGRRAYLMGTNPEAAKFTGIREVRIRMGIYILAGVLAAISAVINNSWLMTARA DAGTGMEMQAITVAVLGGISVAGGSGHLGGVLIGVIIITMMNSGLQIANINSVWQLAVLG LILILAIILNHMMNKFVIRVEKNRE >gi|157101622|gb|DS480702.1| GENE 130 147933 - 148910 1036 325 aa, chain - ## HITS:1 COG:YPO2499 KEGG:ns NR:ns ## COG: YPO2499 COG1172 # Protein_GI_number: 16122720 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 2 314 23 330 330 152 34.0 7e-37 MFKKYVPKPGRREYITFLILFLEIVLFGIFAPNFLTVSNLVRVVQNNAEIAIVSIGMTIV MLLGGIDLSVGSVMGVIAIVIGYMLQADINILLIAVVAICIGIGLGLVNGFLVAKFNIPD MIVTLATMNIWKAAIFALLGGKWLTGLNPSYGVITRARVLGIPVLLLFIIGVYAVFYYFL MYRRFGRHIYATGCNPQSANLSGINIAGVKMVSYCISGGIVAIAAMLYIARMGSVEMTIG SDLPIACIAAVTIGGTGSKGGGKRGSVIGTLAGVLFIAFLKNGIVILGIPSLLENCFIGL LIIISVLFDALTSRYGLLKGKGEAA >gi|157101622|gb|DS480702.1| GENE 131 148968 - 150266 1119 432 aa, chain - ## HITS:1 COG:BS_argE KEGG:ns NR:ns ## COG: BS_argE COG0624 # Protein_GI_number: 16079029 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Bacillus subtilis # 1 431 6 422 436 229 32.0 1e-59 MKEKVYRWIDENEGEVVKLLQELIRIPSVNPYFDEDKQYMQEGKAQAYLKEYLEDMGMDT ELTYPDAGKLVAYKDKAGYYADHTFEDRPNLLGTLKGEGGGRSILLSGHMDVVQRGNKWV HDSFGAEIVDGKIYGRGALDMKGGIAAMTVALKAISRSGIKLKGDAMIGTVVDEEAGGMG TLALVAEGYRADGCLITEPTHLKIAPLCRGILWGKLIIEGRSGHIELKQGDYRTGGAVDA IDKASLYLEHFRRLNKEWSVTKEHKYLPIPCQLHVAQFNAGEYPTTFANHAELVFDAQYL PAEKDENGLGSRVKKELEDFVQAVAMTDPWLRENPPRIEWLIDADCGETLDDQPFFQTVR DSAREINPGSEVEGICCHTDMGWFCNVGIPTMNIGPGDSRLAHQADEYVEVDELVTCTKM IASILMDWCGVA >gi|157101622|gb|DS480702.1| GENE 132 150270 - 151802 207 510 aa, chain 
- ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 264 486 30 249 329 84 29 5e-15 MAEEIILKAESISKSFGGVKALQNVSFELKKGEILSVIGENGAGKSTLMKIVAGALKNDK GEIYFEGKKLSLNSPLDAVKTGISIVYQEPNIFADMSVLENIFMGNEVVNLNKTIQWTKM YEEAVEALKLVGLNGNILHLTMSELSIGNQQLVLIARGIYKKCKILILDEPTSILSHGES EKLFEIIADLKSKGVSIMYISHRIPEILRISDNIIVLRDGCLTGTLSPREADEESIITAM SGRKINMSVYQTREIKDDTILEVRNLGYKNQYQNITFSLKPGEILGMYGLVGSGRSEIAR AIFGETKAETGNIIFEGKDITGTDINGAIEKNIFYVPEDRGAQGLFDIHPIRDNMSVSFL DGFSNKFSFIRKKEERKAVQDNIEKYSIKTPSQDLSVNSLSGGGQQKVLLCRWLMRKPKV LILDEPTRGIDVATKAEIHKYIMELAAEGVAILVISSDLPEIMGVSDRILTLHKGTITAE FDRETVTEEKILKNALNLSDQSLSVHSEEV >gi|157101622|gb|DS480702.1| GENE 133 151912 - 153018 967 368 aa, chain - ## HITS:1 COG:mll5706 KEGG:ns NR:ns ## COG: mll5706 COG1879 # Protein_GI_number: 13474749 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 64 366 24 327 331 146 34.0 5e-35 MKKKMSKHVALLLAAAMTAGSLAGCSGGQSAPTGTTAAEAKAEDTKAEEPAGSTEAAKTE EGTGDKVKIAFIPQLIGIPYFSAMESGGKKAAEDLGVEFLYTGSTQASAAEQVKIMDSLI KQGVNAISLSVLDSSSTNPYIKKAQEAGIKVYTSDSDAPDSTRDFYVAQALDKDLGTTLM DCLGEQMGGSGKVGIVSGESTATNLNTWIDYIKQRQEEKYPDIEIVDIRYTQGGSSEQAL KQAQELMTRYPDLKGLVAVASSTIPGVAQAVQQEGKAGEVAVIGYGSPATVKPFIESGVM KQSVLWNAYDLGYLTVWAGKMAAEGKEFEATNQIECIADPVTWDAENKILLLGQPLIIDK DNVNDFDF >gi|157101622|gb|DS480702.1| GENE 134 153503 - 153901 390 132 aa, chain - ## HITS:1 COG:no KEGG:Dred_2181 NR:ns ## KEGG: Dred_2181 # Name: not_defined # Def: hypothetical protein # Organism: D.reducens # Pathway: not_defined # 9 123 3 123 131 75 36.0 5e-13 MDWIIRGRKELSCSYMEAGTAFLGEDILAFVQGGDKPHIGCTVQSVPRPSLTGNGTISVT SSILNLTGHKDEALCRRLAEKLCRETGRVVVCTGGFHIDNMKPEQIDEVVKALDGLADEI VSGIAGAGYSPL >gi|157101622|gb|DS480702.1| GENE 135 153886 - 154479 463 197 aa, chain - ## HITS:1 COG:ECs3593 KEGG:ns NR:ns ## COG: ECs3593 COG0163 # Protein_GI_number: 15832847 # Func_class: H Coenzyme transport and 
metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase # Organism: Escherichia coli O157:H7 # 12 197 2 186 197 175 48.0 5e-44 MRSTGANRGGKRIIVGATGASGLPILIKCLELIGEQPEYESYLIMSHSAVLTLGQETDLS AEQVEGLADHVLGPDEIGAGPASGSFAAEGMLIVPCSMKTIAGIHGGYAENLILRAADVT IKEQRTLVLAARETPLSPIHLRNMYELSMMPGVRIIPPMMTFYHRPENMEEMIYHIASRL LEPFGIEGKEYRRWTGL >gi|157101622|gb|DS480702.1| GENE 136 154476 - 155963 1153 495 aa, chain - ## HITS:1 COG:MA0246 KEGG:ns NR:ns ## COG: MA0246 COG0043 # Protein_GI_number: 20089144 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases # Organism: Methanosarcina acetivorans str.C2A # 20 475 14 417 422 162 30.0 1e-39 MAKKVRDLRSALELLAELPGQLLETDVEVEPMAQLSGVYRYVGAGGTVKRPTKEGPAMIF HNVKGHPDASVAIGVLASRKRVAALLDCKPEELGKMLCRSVEKPVLPVMCQGEPVCQQVV HRADEPDFDLYKLVPAPTNTPYDAGPYITLGMCYAAHPDTGAHDVTIHRLCIQGRDELSI FFTPGARHIGAMAERAEQLGRRLPISISIGVDPAIEIGSCFEPPTTPLGYDELSVAGALR GEPVKLCRCLTVDELAIANAEYVIEGEVVPNVRVKEDQNTHTGFAMPEFPGYTGPASDQC WLIKVKAVTHRVHPIMQTCIGPSEEHVSMAGIPTEASIYGMVEKAMPGRLQNVYCCSAGG GKYMAVLQFKKSVASDEGRQRQAALLAFSAFSELKHIFLVDEDVDCFDMNDVLWAMNTRF QGDLDTITIPGVRCHPLDPSNDPAFSPSIRDHGIACKTIFDCTVPFDQKDRFVRARFMEL DPSEWVKQKNEGKIS >gi|157101622|gb|DS480702.1| GENE 137 156253 - 157131 195 292 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149913192|ref|ZP_01901726.1| 50S ribosomal protein L35 [Roseobacter sp. 
AzwK-3b] # 1 246 1 245 305 79 27 1e-13 MTFEQLDYFIAAVQCDTFLDAAETLHISQSALSKQFMKLEKELDIQLWDRSRRSAVLTEA GHMFYEEALNLSKEYRRTLFRISEFKEKAGQRLSIGTLPILSQYHLTAMLKDFADSHPLI HVFLEEVEEHDLLQGFSNNAYDLIITRSNMLDSKTHTFLPLAEDRLVAILPENHRLADRS CISLNDIAGEGFILMHSYTSIYHFCMDLFQKAGIRLHLLRTARMESIISAVAVHEGISLL PEENFRLFQYKDIVSVPIDPAGKLSIGIAGKKQGQVSSAMAEFLRYAKGYTR >gi|157101622|gb|DS480702.1| GENE 138 157119 - 157886 703 255 aa, chain - ## HITS:1 COG:SA0343 KEGG:ns NR:ns ## COG: SA0343 COG1878 # Protein_GI_number: 15926056 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Staphylococcus aureus N315 # 2 249 4 247 250 208 41.0 9e-54 MYELWEDLKRLKRCRWVELSHPLNNDSPYWGGIPEGAVELSKTVYDWGNELECLIQTFKF PGQFGTHIDFPGHFIKGEALSEQYGVKDMVFPLVVIDISEKVKEDVEYAVTVEDIREFEA AYGDIPDGAFVALYTGWSRRWPDMDAISNFDQEGGEHFPGWSMEALRYIYEVRNAAANGH ETLDTDASVLAARSGDLACERYLLDKGKLQVEVMCNLDQVPAAGAIVFAVFPPIEGATGL PVRMWAVVPEQVQRV >gi|157101622|gb|DS480702.1| GENE 139 158005 - 158595 396 196 aa, chain - ## HITS:1 COG:no KEGG:CPE0456 NR:ns ## KEGG: CPE0456 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens # Pathway: not_defined # 6 180 2 182 210 119 35.0 5e-26 MKNMAKACLCRLYYGYLRLLEKTVRLEVEMPDIWTKDMPDRGAVIGFWHEDSFLMNLLLR RLSEGRDIAVIVTSDGRGDYIENIIKRCNGKAIRVPDGCRSRDFLRNLIEEAGCPGKTLA AAMDGPLGPRRVPKTLGFYLAEKGKKNFIEVRAGYSRAVCLNWRWDHYRIPLPFTNIKIS LKDCGITGDRNAVLPG >gi|157101622|gb|DS480702.1| GENE 140 158534 - 159802 817 422 aa, chain - ## HITS:1 COG:TM0744 KEGG:ns NR:ns ## COG: TM0744 COG0438 # Protein_GI_number: 15643507 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 1 360 1 334 406 174 33.0 2e-43 MNIAMFTNNYKPFIGGVPISIERLSEGLRSLGHRVTVFAPEPEGKLSAEQEKDVVRFKVF YRKADRGMVLGNCFDRKIEAFFREGNFDVIHVHHPVLVGQTALYLGKKYNIPVAYTYHTR YEEYLHHFKLYEAMASGEYPVSRIARYGKEILVPALITAFSNQCDLVFAPTSTMKDHLLE QGTKTRISVLPTGLDRVSFQYDEMLSSEIRNMYGNGCPYLFATTARLEKEKNLDFLFRGT 
ARLRDILGQDFRLMVIGDGTERRHLEELARMLGISRQIVFTGKIENTRVRHYLNAADAFL FASKSETQGIVLLEAMAAGCPVAAVRASGVVDVVRQGINGYMTEEAPEAFAAAAARLVLE RGVWSGMSEEARKTAELYESGRIAALAAEQYGLMTGQKEDAGYEEHGKGMLVPSILRLFK AA >gi|157101622|gb|DS480702.1| GENE 141 159814 - 160323 171 169 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223039927|ref|ZP_03610210.1| ribosomal protein L22 [Campylobacter rectus RM3267] # 3 161 6 163 208 70 28 7e-11 MDVQMLTQYFLRYGAVFIFVIILLEYMNLPGFPAGIIMPLAGIWAARGQISFFMVMALGL AAGLLGSWILYGIGRMGGDLFLDRYLKRFPKHEPVIQRNFELLRTRGAAGIFISKLIPMI RTIISIPAGVIRMNFVKYTISSALGIFVWNFFLIGAGYVMGDHVFQLLS >gi|157101622|gb|DS480702.1| GENE 142 160611 - 161903 1299 430 aa, chain + ## HITS:1 COG:BH2694 KEGG:ns NR:ns ## COG: BH2694 COG0477 # Protein_GI_number: 15615257 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus halodurans # 6 409 2 402 418 81 23.0 3e-15 MEVKKQNKSMYHWVILVCCILTLMFAYSTRFGLAQLFTTQIIKETGFATSAYFLSTTIAS IICVFTGPIAGKLLRGRYMRPTFVVCVIGTMGFYSCYGFCHELWQFYLVGALQGFFAMGA CTIPVTVLITNWFEKNRGLMISIAMMGISLGGTFLSPFLSWTIGMFGWRRAYFILGAVSL CVLIPISAFIVRRSPEDVGLFPYGHGEESSGKHGKNKSVPANSWNATLSEARKTPILWMF AAGGFLIYFTSCIMGHMSYYLQTTGFAPSSIAAYISVYSIVALAGKLVLGHMFDRFGPKG GILFGCGTFFIFLICFIMVQGGPLMLYLSAVFYGFGTCTATVAIPIMTTSIFGSKNYSEI YGFISAFTMTGGAIGSSGIGLAYDITGSYKPALTILALLTVLTIVIMFICINLSQKRVEK PADKCVQMPA >gi|157101622|gb|DS480702.1| GENE 143 161937 - 163496 1400 519 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941275|ref|ZP_02088612.1| ## NR: gi|160941275|ref|ZP_02088612.1| hypothetical protein CLOBOL_06168 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06168 [Clostridium bolteae ATCC BAA-613] # 1 519 1 519 519 1009 100.0 0 MTERKLSTLGYLMDRLGLSTAALARRLHVDASLVSKWRSGSRRLSAKSVYFDEVCRMLLE QNPMALTDALRSLVPLEEPKREEGERGKRGDGSLSEAEGIWGTEGISGTGGSSGTEELLY RVLNDRHFTVPKAFTIRAEALCTAEIAIYTSGEGRRQAISDLLEIAEAMETPGELFYVDS 
EQTGWLLEDTAYAREWVVRMVRLLDRGFVFKAALHFSVSVDKFVAFFQLCSPLIFHRNAR WYYHQYYDENIYWFSFFILEHAMSIAGMSMSPEQTSATVYTDAYSILQHRNVVEMVLTSC RPMFSDLSMGLGAEAARMLREQGRAGETLFSYLPAPAFMSAGEQLFDDILRDNEITGAAA SRLREINRQLRGLVENQLTGGQGEFIQILQLGEMEHRADTCFISTSLSLLGKRKVVVSPA HYARGLRELALRLEAFPAYRLFLAADADDIRIPAMNCWCRGTSWMVQMDCEGFRLCREPT LCGAAYTTLEQGWRRVPPGRKDPAAVRAVLEWLIERLEK >gi|157101622|gb|DS480702.1| GENE 144 163923 - 164657 848 244 aa, chain + ## HITS:1 COG:TVN0771 KEGG:ns NR:ns ## COG: TVN0771 COG1335 # Protein_GI_number: 13541602 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Thermoplasma volcanium # 76 228 31 175 175 63 31.0 3e-10 MAENKDYKYVDHYDFNEELLKQAFAEARRIYKERGFMRKMGFGKAPAITTVDMARAWMSE GHPFTCDHSEEVCANALKVLEAGRKSGVPIFHTTTGYIGKQQWDLPRWDEKIPMSALDIN SEWLEIDPKLKPRPEEPVIHKKYASNFFGTHLAQTLNFLGVDTVIVMGATSCACVRHTVM DSTGYGFKTIVPEGTVGDRVPGVIEWNLFDMEAKFADVVPVEEVVKYLEGIDSNVYTKHE RSLD >gi|157101622|gb|DS480702.1| GENE 145 164781 - 166136 1191 451 aa, chain + ## HITS:1 COG:BMEI1644 KEGG:ns NR:ns ## COG: BMEI1644 COG0044 # Protein_GI_number: 17987927 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Brucella melitensis # 1 443 16 460 489 301 39.0 2e-81 MTAASEFTGDIYIEGETIRQIGTNLSVNASRTIDASGKYVMPGGVDEHVHYGSFGGRLFE TAEAAAAGGTTTIVDFAPQEKDVPLLDAIRRQAAKAEHTSSVDFAFHAVIMDPKESVFEE VRHLPEVGVATLKLFMAYKGTAFYCDDEAILSAMMNAKDAGVTMMVHAENADIITILTRY YLSQGKTAPVYHYYARPPIAEEEATGRAIALAKAADCPLFVVHVSIREAMEAIRNAHNDG WPIFGETCTHYLTLTTDCLDKPGFEGAKYICSPPLRPQEHVDALWQAVREGWLTAVGSDH CALDGGYEGAKKKGTNFSNIPNGCPGVQDRLALLWTYGVCTGKITRRKFVELFAANPAKV VGLPNKGDLRIGADADVVIFDPEWKGIITNKDSLHGIDYEPFEGFEIQGRPQQVFLRGRL VAENGRFVGERGMGRWQKCKPYGLCYDYYHK >gi|157101622|gb|DS480702.1| GENE 146 166263 - 166814 359 183 aa, chain - ## HITS:1 COG:no KEGG:ELI_2227 NR:ns ## KEGG: ELI_2227 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 11 180 11 181 187 
160 49.0 2e-38 MKGFKRENRLLSLCGLNCGLCTMYLGKYCPGCGGGEGNQPCAMARCSLQHPGVEYCWQCR SFPCDIYEGIDEYDSFITHRNRFKDMKRAEEIGVERYNEEQMEKSRHLEMLLENYNDGRK KTFYCQAVNLLTLQDIRDVMKVLAAEAGDGLSLKEKAKRAETLFLERAGEQGIELKLRRK PKK >gi|157101622|gb|DS480702.1| GENE 147 167067 - 168092 492 341 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149199369|ref|ZP_01876406.1| Ribosomal protein L22 [Lentisphaera araneosa HTCC2155] # 29 334 44 346 346 194 34 4e-48 MVSVLMKKLAIAAVAGAVMVSASGCSLVSSGKWIIRVSHAQSETHPEHLGMLAFKEYVEE HLGDKYEVQIFPNEILGSAQKAIELTQTGAIDFVIAGTANLETFADVYEIFSMPYLFDSE EVYRSVMEDTDYMENIYKSTDEAGFRVVTWYNAGTRNFYAKTPINTPDDLKGKKIRVQQS PASVDMVNAFGAAAAPMGFGEVYTAIQQGVIDGAENNELALTNNKHGEVAKYYTYNKHQM VPDMLVANLKFLNSLRPEELQVFKEAAALSTEVELVEWDKSIKEAKQIAAHDMGVTFIET DVEAFKAKVLPLHQKMLENNPKIRDFYQYIQTVNDSRKGDH >gi|157101622|gb|DS480702.1| GENE 148 168096 - 168599 254 167 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020580|ref|YP_526407.1| ribosomal protein S3 [Saccharophagus degradans 2-40] # 4 148 1 147 164 102 36 2e-20 MDTMHKLRAAIMKILGVVITLLFILMTLVGTYQIVTRYFFNRPSTISEELLTYSFTWMSL LASAYVFGKRDHMRMGFMADKLTGPARRYLEVFIDALSFFFAGVVMVYGGISITKLTMIQ ITASLRISMGWIYIIVPIAGLLIMVFSVMNAADMLHKDFSEPGEAKV >gi|157101622|gb|DS480702.1| GENE 149 168596 - 169903 1124 435 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 9 429 6 426 435 437 51 1e-121 MSIAVFCGLVIFVLLAVMLLAGVPIAIALAVSSICAILPVLNLGASVLTGAQRIFSGISV FSLLAIPFFILAGNIMNKGGIAIRLINFAKLFTGSIPGALAHTNAVANMLFGAISGSGTA AASAMGSIIGPIEEEEGYDRDFSAAANIATAPTGLLIPPSNVMITFSLVSGGTSVAALFM AGYIPGILWGLACMVVIYFFARKKGYRSTKRYTGSEKVKVFFQAIPCLFMIIVVIGGIIS GIFTATEGSVVAVVYSMVLSIFFYKSIRLRELPKIFLDSAEMTGIIIFLIGVSSIMSWVM AFTGIPTAVSEAMLGISTNRYVILFIINILLLIVGTFMDMTPACLIFTPIFLPICQALGM NTIHFGIMMIFNLCIGTITPPVGTTLFVGVKVGKTKIEQVIRPLLYYFGAIFIVLMLVSY IPQLSLWLPGLMGYV >gi|157101622|gb|DS480702.1| GENE 150 170028 - 171236 1403 402 aa, chain - ## HITS:1 COG:VCA0088 KEGG:ns NR:ns ## COG: VCA0088 COG1301 # Protein_GI_number: 15600859 # Func_class: C 
Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Vibrio cholerae # 3 396 7 411 424 267 42.0 3e-71 MKKKPLALWIFTGLALGIVAGMLLMGRPDIAEIYIKPFGTLFLNLLKFIVVPIVLFSIMN GVISMKDIKKVGSIGGKTVIYYMCTTFFAVTLGLLIANLFKKGFPVLSTSSLEYTAAETP SFIQTLIGIFPSNIIQPMAEASMLQVIVVALLFGFGTIVAGKKGEVFGNFVESANDVSIA VMSFIIKLSPIGVFCLITPVVAANGPQILGSLAFVLLVAYIGYILHASLVYSLTVKSLAG IGPLKFFKGMAPAMIMAFSSASSVGALPLNLECAEKLGARKEVASFVLPLGATINMDGTA IYQGVCAVFIASCYGINLTLSQQIVIILTAVLASVGTAGVPGSGMIMLAMVLQSVGLPVE GIALVAGIDRLFDMGRTTVNITGDAACAVIVSHLEDKRLKKA >gi|157101622|gb|DS480702.1| GENE 151 171392 - 172813 968 473 aa, chain - ## HITS:1 COG:no KEGG:Daud_0731 NR:ns ## KEGG: Daud_0731 # Name: not_defined # Def: WD40 domain-containing protein # Organism: D.audaxviator # Pathway: not_defined # 73 470 66 450 570 81 25.0 9e-14 MKRTAACLAGLLMLVLLAGCKGGADESGARENKGDIETSGTVSVLYAANDGLYLIRVLRN GAEDANAAVSTERLAEGKDISSPLFSSDGRMAGYLRQDGFYGCSIETGKEELLLNGALSF VPDGKEGFYASSRETGIVRVHMGREQEEVWKPEPVEGGWIQCEKLTLSPDGKKLAFARRR YVDRRKDPGLSLFEQNQGIWMIQMNLEKEKKLSVLIQIIGEEPPFFERMEETDGEPYPYL WPAKWSPDSSRLFIWQDVLSGSMRSDGIGTAVYDTVSGTMIDPLGEQEEVVLPYNENVVF GEDNSLFLLAGAGREMALDKELVKIPPEKGAVHEKLKTPGLVPQSPQVSYDGKSIFFAAS KELETGEQVEYPIWRQLYRMENGHVAELTNDSEYSSEFPVLTGDGSALVFGRVDKEGGMS IWSVGTNGSGLQKLADLEENISTDETNEHYPAGYEDFYGRGSWSSIMSVYASK >gi|157101622|gb|DS480702.1| GENE 152 172937 - 173221 302 94 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2002 NR:ns ## KEGG: Cphy_2002 # Name: not_defined # Def: YCII-related # Organism: C.phytofermentans # Pathway: not_defined # 1 89 1 89 98 108 55.0 7e-23 MKYFLVEGSIKDAGRMTDEIMNEHMACTRAAMERGMIFMSGLKADMSGGLSVMRAEHAGQ LQDYLDNEPLFVHGIQEYRVVEFDAHYVNKDPEG >gi|157101622|gb|DS480702.1| GENE 153 173489 - 173764 261 91 aa, chain + ## HITS:1 COG:no KEGG:ELI_0302 NR:ns ## KEGG: ELI_0302 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 89 1 89 93 111 65.0 1e-23 MSYQKADRILPRELLEQVQEYVDGALIYIPKTAGHKKCWGEGTSTRQELRLRNQQIYSDY LSGKSTDELSSLYCLSLKSIQRIIGQEKKNK 
>gi|157101622|gb|DS480702.1| GENE 154 173946 - 174539 213 197 aa, chain + ## HITS:1 COG:CAC0055 KEGG:ns NR:ns ## COG: CAC0055 COG4332 # Protein_GI_number: 15893352 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 182 1 173 196 157 42.0 8e-39 MKDIKWTIQYDRPPAPLRHCKKCNIRQEFVSSGQFRVNAQGKYLDIWLIYKCSCCSSTWN AAIYSRINPGSLPPGMLSRFHANDTSLAMQYAMDTRFLRLNGAQIQLPSYHVEGQVFQLA DSPASTDPDSSMVLQIDNPYPLPLRVHAVLKSKLKLSQKRLDQLTQSGRIQCEDGRELKK SRTGNHIRLYITGVLIP >gi|157101622|gb|DS480702.1| GENE 155 174517 - 174999 468 160 aa, chain - ## HITS:1 COG:MTH1548 KEGG:ns NR:ns ## COG: MTH1548 COG1905 # Protein_GI_number: 15679544 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Methanothermobacter thermautotrophicus # 10 146 2 134 149 62 26.0 4e-10 MQINENEARDDQTREILDYYRGLPERSSQEAIVEMLRELQDIHGCISPYMLEQAAEAAGV RDSMVQAICKRYPSLKTAPYNHEIILCTGRNCASKGSITVMDELKKRLGVGKNGISEDGT VCLKTRNCLKNCRKAPNVMVDGRLCSGLDAEGILRELKRR >gi|157101622|gb|DS480702.1| GENE 156 175100 - 176647 842 515 aa, chain - ## HITS:1 COG:no KEGG:Corgl_1526 NR:ns ## KEGG: Corgl_1526 # Name: not_defined # Def: hypothetical protein # Organism: C.glomerans # Pathway: not_defined # 212 313 399 493 493 73 39.0 2e-11 MVLLVLTSCGMKDRKQTYPQDEGVTQQQEQAGGQSEKHEASGTEPGLSRKPVAGGGGNAV GAGNASAAGNAGTAGGTGAAGNAAPAGGAEPERPGTAVRAVQNLEACNQLITGSFFKAGD IRCGLSFELTGEGWESDRLKCVFTLPDEMHTSYYDSMRTAEILTEKGKKAYKITPETMEE VPRVPALTYTIKFAMDDESDPHISVKGEGGLAGDYYLFEDSLTFPDVFSRYLSRADLCLW PTENLWLLRNEIYAANGRQFKSDVLSRYFSEKRWYRGIIEPDSFSDSILSDVESGNISLI QKMENDTDRDKLDGRNQYGLEDLPPAPYLQYLGRYDETGLSGDLSQARDMGAYYAVPGEI SVPASITREQLQTVLEGGQVRVTLNELTGESRMLSLNPGQEDTLYGFLLYEAGEEPAGQG YETGIRPDYNTDEYHLWQTSWDTVMKTVYKGDIYIMKGAVSGADTGLLRASQYQHEIFPD PADPETGLVFCGQVTGNRLYYNSRGHFTAIYYLGD >gi|157101622|gb|DS480702.1| GENE 157 176706 - 179123 2268 805 aa, chain - ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 
16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 393 643 27 274 280 192 42.0 2e-48 MTAENLMNLLDSLTETSVYVIEEESHRLLYFNERCRNTGRGRAALGIRCDEVWPELCANC PLKMLGDRVSNHMVCYDPILKLTVDVTANRINWEGRVPAVVITAAPHRLNFEEEQGFQKI RQMYASSLVTVFDECIIANLTRDYYVSCQKDVVWDDIPEQGNFGSENRKYAQKALHPDDL ECFNENFSRESMLCMFTEGKKQITRRLRRRADNGSYRTVEFTAARIGNQEDECWCVLVFR DVQDELLLEQERNVEISQLATAAKAAYQMLIAVNLTQNTYHMVEYQRFPVKAPDPEGRFD ELIEFEMEAVHPDYREEFRSKFTRKALTEAFQRGECILSMDVPHIGEDGIYHWNSTQVVK VESPYTHDLIEITLSRNIDEERRVQQETLEKERQAKLLLEDALKKAEKANKAKSDFLSRM SHDIRTPMNAIIGMTELAQLHIGDEEKQRDYLNKIASSGAHLLGLINEILDVSKIESGVM ELSESPLNLRALAGEAAEMVRISMENSQQEFQVDIDESFDPWVMGDARRIRQVLVNILEN ASKYTGQRGKITFSVCEFKKEEQRTGTYRFIIEDTGIGMKPEYMEHIFEPFSRADDSRTS KVPGTGLGMTIVKNLISMMDGDIRVESEYGKGSRFTVTLCLDKCGNTGEAIQPAVSGPEA AYRGLRVLLVEDNELNRQIASEMLKLLGVRVEMAENGREAVEAVCSHPALYYDAVFMDVQ MPVMNGYEATREIRGSGMERIGELPIIAMTADAFAEDVKRARLSGMNGHLAKPVSIELLR GALSGCLDCKRKNRWDEMMDISGPN >gi|157101622|gb|DS480702.1| GENE 158 179143 - 179511 486 122 aa, chain - ## HITS:1 COG:no KEGG:ELI_3416 NR:ns ## KEGG: ELI_3416 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 113 1 112 117 123 53.0 2e-27 MDDFERIFEAYGADYQVTLGRFMGNRKMYMKFLGMLFQDENLSKLGNALEQGDMTSAFEA AHTLKGVTGNMGLTPLYDAVCTIVEPLRTREDREDYTKMYEDVCRQFKRAEKLRADLAEI SD >gi|157101622|gb|DS480702.1| GENE 159 179516 - 182743 3255 1075 aa, chain - ## HITS:1 COG:VC1349_3 KEGG:ns NR:ns ## COG: VC1349_3 COG0642 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 542 794 1 250 260 186 40.0 2e-46 MKRQMTPGETLREFCRAWFVQRDAERTLAFLTEDVGFVGTGTDEMASGRQQMAGYLAQDI REIPEPFECELSPIYEQPVADGIYNMSADLTLKNSKYTWYLRAFFTLVLSGEEWLVKSFH VAEPASSQKDAEHYPGTLVMEHTSRLRQELLNNSLPGGMMGGYIEDGFPFYFINRRMLEY LGYENETEFVSDIEGMITNCMHPDDRNRVDQLVACQLQRSEEYVVEYRMRKKDGSYIWVH DLGRRVTSEDNRPAIMSVCMDITEEKQMRDRIKEMYEEELSYFAEQSSADGSIQGSLNIT 
TGCLESYLSTADTSIAKVGDAYEDTIENLAASAVDPVYGDGIRHCLNRERVLADYAVGKV DYRFEFLRRNNSGIIFWESTIMHSCQNPETGHVILFFYTQDVTEKKMQEQLFKRIAELDY ESITEVDILRDMAYRTILVDESNMDTMLPDHSRFQSEIRAISGRYMDEAAREEYLRKLDY DYMKSQLAHNDAYTFIVEMRDMTGATRVKRFRVFYISRELERVCVARTDVTDVVLKEQRQ KEELAAALVAAEQANAAKSDFLSRMSHEIRTPMNAIIGMSTIAAQSIGDDEQVEDCISKI GISSRFLLSLINDILDMSRIESGKMLLKSEKIPTEEFINGINSICYSQADVKGVEYECIV DPVLDDYYIGDAMKLQQVLINILSNAIKFTQEGGKVTFSASQRRKTKNDASLRFIVNDTG VGMNEEFLPHLFEPFSQESTGTTSLYGGTGLGLAISKNIVDMMDGKITVRSIKGIGTEFT VDVKLGITEEEKLRHNQKKQDYNFSHLKTLVVDDDVAVCESAVVTLHEMGIKAEWVDSGR KAIDRVKKLWDEGRYFDMILIDWKMPGMDGIETARRIRGIVGAEVTIIIMTAYDWISIEH EAKLAGVNLLMSKPMFKSSLVSAFSRALGEKEQQAQQPEVNDYDFTGKHVLLVEDNQINT EVAMMLLESKGFKVDTAENGLRALELFSKSDKGYYDAILMDIRMPLMDGLTAATNIRHLS NADSGTVPIIAMTANAFDDDIEKSKAAGMNAHLAKPIEPERMYQTLYDFIYGKEA >gi|157101622|gb|DS480702.1| GENE 160 182747 - 184882 1562 711 aa, chain - ## HITS:1 COG:all4225_2 KEGG:ns NR:ns ## COG: all4225_2 COG2200 # Protein_GI_number: 17231717 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Nostoc sp. 
PCC 7120 # 301 550 4 253 260 203 39.0 1e-51 MISQKKILVVEDNEINRMILREILSPQYKVLEAGNGAEALSVLREYGEMVSLILLDIVMP VMDGYTFLSHIKADSSFSSIPVIVTTQSDSESDEVAALSHGAADFVAKPYKSQVILHRVA SIIHLRETAAMINLVQYDRLTGLYSKEFFYQRVRETLMQHPDREYDIICSDIVNFKLIND IFGIPAGDHLLNEIGRLYKKVAGKNGICGHFHADQFACLLERRWEYTDELFIRCNARVNT LPNARNVVMKWGIYQVADRNVSVEQMCDRALLAARSIKGRYGTYFAAYDDQLRNRLLREQ AITDSMEPALAKKQFEVYLQPKYRIKDHRLSGAEALVRWRHPVWGFQSPGEFIPLFEQNG FITKLDQYVWERAASVLREWDNRGYPPIPVSVNVSRADIYHKDLADMLLGIVRRYRLQPS RLHLEITESAYTENPEQVIETVGYLRELGFVIEMDDFGSGYSSLNMLGKMPVDILKLDMK FIQDEAAGPDDGGILHFIMELARWMDLSVVAEGVETRQQLERLKKADCDYVQGYYFARPM PINEFEELLRDGHTAMEQEGMGDENLRESQPFPVLLVADEDACYRRNVRETFEGSFQVME AGDGKAALRCMADSKSRIAAVILSLTLSGTDGFSVLEVLQREKMDPDIPVIATGPQNEAM EDRAMELGADDFAGRPHSQKSLRRRVLRAIHAKAPRERAARPGQILTLGED >gi|157101622|gb|DS480702.1| GENE 161 185376 - 187802 2272 808 aa, chain + ## HITS:1 COG:ygeS KEGG:ns NR:ns ## COG: ygeS COG1529 # Protein_GI_number: 16130768 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli K12 # 10 808 2 752 752 661 44.0 0 MNIVGQQVKRVDAYGKVTGEAKYTADLEPRDILHGKVVHSTIANGLVKSFDLTEAYQVPG VVKIVTCFDVPDCQFPTAGHPWSVEKKHQDICDRKILNQRVRLYGDDIAAVIAEDEVAAA RAARLIKVEYKEYTPIVTVEAAMAEDATPLHPDLRKDNVIVHSHMTMGDSEFTYERGLEK AGSGYGPEEIISLEKEYDTPRISHCHIELPVSWAYVDTNGKITVVSSTQIPHIVRRCTAQ ALGVPIGRVRIIKPYIGGGFGNKQDVLYEPLNAFLTMSVGGRPVRLEISREETISGTRTR HAIKGKCRGLVTKDGRILARKLEAYANNGAYASHGHAICANCGNVFKDLYRDELGTEVDC FTVYTSSPTAGAMRAYGIPQAAWFAECLTDDLADAIGMDPCEFRLKNCMEDGFVDPANGI TFHSYGLKKCIEEGRRHIRWDEKWSAYRNQTGPVRKGVGMAIFCYKTGVHPISLETASAR MILNQDGSMQLCMGATEIGQGADTVFTQMASQTTGIAFDKVYIVSTQDTDVTPFDTGAYA SRQTYVSGMACRKCGEEFRLKILEYAAYMLSHDVSDLSKTVYADQVKAAAALLRETLGLK EGELVSPDMLDIENSKIVVSGGEPVLFDLSVVADTAFYSLDRSVHITAEVTNQCKDNTFS SGCCFVEIEVDIPLGLVTVKDIINVHDSGILINPLTARAQVHGGMSMGLGYGLSEELLVD EKTGRPLNNNLLDYKIPTAMDTPDLNVAFIELEDPTGPYGNKSLGEPPAIPVAPAIRNAI LNATGVQMDVTPMTAQRLIEKFKEKGLI >gi|157101622|gb|DS480702.1| GENE 162 187813 - 
188703 787 296 aa, chain + ## HITS:1 COG:ECs3740 KEGG:ns NR:ns ## COG: ECs3740 COG1319 # Protein_GI_number: 15832994 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Escherichia coli O157:H7 # 1 294 1 291 292 261 42.0 1e-69 MFDIKSFYEAKDVADAINALVKNPDAEIISGGTDVLIRVREGKDAGRSVVSIHNIEALKG VRLLDDGDLWIGAGTAFSHITNDALIQKYIPMLGDAVDMVGGPQIRNTGTIGGNICNGAT SADSASTMWTLEADVLLEGPEGKRRVPVCEFYTGPGRTVRDRTEVCTGFLVHRDSFEGWS GHYIKYGKRKAMEIATLGCAVRVKLSQDKKRIEDVRLGYGVAGPTPLRCRKAEDGLKGRA VNDSEAVLAFGKAALEEVNPRSSWRASKEFRLQLIEELSKRALAEAIKKAGGAINA >gi|157101622|gb|DS480702.1| GENE 163 188696 - 189196 638 166 aa, chain + ## HITS:1 COG:mlr1925 KEGG:ns NR:ns ## COG: mlr1925 COG2080 # Protein_GI_number: 13471825 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Mesorhizobium loti # 1 154 1 154 157 149 48.0 2e-36 MLKIVHMKVNGKEVELAVDERESLLDTLRQRLGLTSVKKGCEVGECGACTVLVNGEAIDS CIYMTLWAEGKSIMTVEGLKGPNGELSPIQKAFIEEAAVQCGFCTPGLIMSAVEIVGTGK KYNREELKKLISGHLCRCTGYENILNAMERIVEEVYQVSHKTGGTD >gi|157101622|gb|DS480702.1| GENE 164 189223 - 190563 400 446 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 18 438 10 421 447 158 28 2e-37 MNSNTNCSINNIYKLEGRVPIAKAIPFGLQHILAMFVSNLAPITIIAGAAQPALSQAQTA ILLQNAMFVAGIATMIQLYPIWRIGSKLPVVMGVSFTFVTVLSTIAANYGYPAVVGAVLI GGIFEGTLGLLAKYWRRIITPVVAASVVTAIGFSLFTVGARSFGGGYADDFGSAQNLILG TITLLTCLLWNIFAKGYLKQLSVLAGLIVGYVIAIFLGKVDLSLIMSGGIISFPHLLPFM PEFHAGAIISACIIFLVSAAETIGDTSALVSGGLNREITGKEISGSLACDGYASALSTVF GCPPVTSFSQNVGLVAMTKVVNRFTIMTGAACMILAGLLPPVGNFFASLPQAVLGGCTIM MFGTILTSGVQMIAKCGFSQRNIVIVSLSLAVGIGFTTASEIGIWDIFPELVQSVFSANV VAVVFVVSVFLSLVLPKDMDIKTLEG >gi|157101622|gb|DS480702.1| GENE 165 190907 - 192115 1211 402 aa, chain + ## HITS:1 COG:CAC2237 KEGG:ns NR:ns ## COG: CAC2237 COG0448 # Protein_GI_number: 
15895505 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Clostridium acetobutylicum # 3 379 4 374 380 445 56.0 1e-125 MAKEMIAMLLAGGQGSRLYALTQKLAKPAVPFGGKYRIIDFPLSNCVNSGIDTVGILTQY QPLVLNEYIGNGQPWDLDRLYGGVHILPPYQKASGSDWYKGTANAIYQNISFIERYDPQY VIILSGDQICKQDYSDFLKFHKEKGAEFSVAVMEVPWEDASRFGLMVADGDDRITEFQEK PKNPKSNLASMGIYIFNWDILRQYLIEDEADPDSENDFGNNIIPNLLRDGRRMYAYHFNG YWKDVGTISSLWEANMEVLDPEHSGINLFDENWKIYSRNSGMTGHKISGDALVENSMITD GCRINGQVRHSILFSGVKVEAGAQVEDAVVMGSTTIKSGAVVKHCIVAENVVIGENAVVG AMPSDSEKGVATIGPGIEIGPGAKIGPSAMVKNNVKGGEEQW >gi|157101622|gb|DS480702.1| GENE 166 192109 - 193254 1299 381 aa, chain + ## HITS:1 COG:BH1086 KEGG:ns NR:ns ## COG: BH1086 COG0448 # Protein_GI_number: 15613649 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Bacillus halodurans # 9 377 5 364 368 243 37.0 5e-64 MVNSNANALGILFPNSYDSLVPDLVSERLMASIPFASRYRLVDFILSSMANCGIDNISLI VRRNYHSLMDHLGSGREWDLTRKNGGLNLVPPYAEKTVSIYNGRVEALAGILDFLKEQKE KYVVMADTNLAVNFDFNALIEAHMASHADVTVAYKEEPLPIDLIEHRDIGKSLYYTLAID NGRVTKMYMNSSEPGMQNISMNIYIIDRELLISQINTAYVRGQVYFERDILAPQLERLNV QAFRYDGYVARISSMKSYFNENMKMLDDVNVDALFSAGNPIYTKIRDDNPARYINGSKVS NIMAADGCIIEGEVENSILFRGVKVGKGAWVKNCVLMQDTVIEADAGVEYVVTDKNATIT RGKIIKGTVTFPVYVAKYQIV >gi|157101622|gb|DS480702.1| GENE 167 193342 - 193998 725 218 aa, chain - ## HITS:1 COG:MA2967 KEGG:ns NR:ns ## COG: MA2967 COG0546 # Protein_GI_number: 20091785 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Methanosarcina acetivorans str.C2A # 1 210 63 272 279 218 51.0 7e-57 MKYKAVIFDFDYTLGDCTEGIAASVNHGLKKLGYEAGELEDIRKTIGLSLKETFACLTGS RDQEEAQRFSIYFKEKSDQVMVEHTRLYPAVGPAFEELRRQGCRIGIVTTKYHYRIDQIL DKFQIAGLVDVIVGAEDVKAEKPSPEGLLYAMEQLGADRKEVLYTGDSVVDAKTAAGAGV DFAGVLTGTTSFRDFEPFNHICIVRDLAELMNALTQEK >gi|157101622|gb|DS480702.1| GENE 168 194014 - 194712 537 232 aa, chain - ## HITS:1 COG:CAC0640 KEGG:ns NR:ns ## COG: CAC0640 COG1768 # 
Protein_GI_number: 15893928 # Func_class: R General function prediction only # Function: Predicted phosphohydrolase # Organism: Clostridium acetobutylicum # 1 230 1 228 231 213 47.0 3e-55 MSLYAIGDLHLSFTANKPMDVFGREWKDHVRKIEKNWRGRIGAEDTVVITGDHSWGRNLE EALADLDFIQSLPGRKILLRGNHDMFWDAKKTYRLNEMFQGRLSFLQNNYYAYKEYALVG TKGYCYEGKDSPEHFEKLVKRELERLRISFELAAADGCSQFIMFLHYPPTSIGEMESGFT EMAKEYGAEQVVYSHCHGEARYQDSFLGEVDGIEYKLVSSDYLRFRPEKILR >gi|157101622|gb|DS480702.1| GENE 169 194775 - 195446 568 223 aa, chain - ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 33 223 35 225 229 65 24.0 1e-10 MYEHFSIPGESCRIPFADLCRRNSEFRTYFIRKTYDASQCIHPQGTKITSFGIVETGILK AVDHTRKDAEMCHAYFEARDIFPEFLYFSGEKEYSYTLAAEKRSTVLWVPVQVMEEMLEK DRQLLYALLLYVSQRGLKNQLYLNCLNYQTIRERIAYWIVGMHNIAPAETIRMPGSQLML ANMLHVSRSSLNQELKLMEKEGYFRIRGHEMQDWDKKKLEELV >gi|157101622|gb|DS480702.1| GENE 170 195566 - 195727 100 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941307|ref|ZP_02088644.1| ## NR: gi|160941307|ref|ZP_02088644.1| hypothetical protein CLOBOL_06200 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06200 [Clostridium bolteae ATCC BAA-613] # 3 53 1 51 51 92 98.0 8e-18 MFVWEFERLEYIDGPSEKKVVDVAGTLPYYDGIALCVAADDETDAVNREEPTV >gi|157101622|gb|DS480702.1| GENE 171 195795 - 196238 493 147 aa, chain - ## HITS:1 COG:SA0699 KEGG:ns NR:ns ## COG: SA0699 COG3610 # Protein_GI_number: 15926421 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 8 126 11 130 164 66 32.0 2e-11 MIPQVCAAMMGTIAFSLLFGVPRKYYIWCGLIGGAGWAVYVLASGYVSGAGASFAATVVV IFLSRAAAVVNRCPVTIFLISGIFPLVPGAGIYWTVYYLVTEQLGLAVYTGYEAVKAAVA IVLGIVFVFELPQGVFRWMAKAFVKIP >gi|157101622|gb|DS480702.1| GENE 172 196235 - 196990 744 251 aa, chain - ## HITS:1 
COG:BH0081 KEGG:ns NR:ns ## COG: BH0081 COG2966 # Protein_GI_number: 15612644 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 1 241 8 249 251 129 31.0 7e-30 MEVSLQAGHLLLENGAEISRVEETIERICRYYGVCSGNAFVLTNGIFVTVGSRNEPDFAK VEHIPVSGTHLDRVAAVNQLSREIEAGMYTVREVAERLECIKSMPGKPKYMQVLASGCGS AAFCYLFGGSIWDSSVALMTGIILYIYVLYVSAPHLSKIVGNISGGALVTILCSTMYLAG MGQHLNYMVIGSIMPLIPGVPFTNAIREIADSDYISGSVRMMDALLIFFCIAIGVGVGFS LAGWVTGGTLL >gi|157101622|gb|DS480702.1| GENE 173 197252 - 197662 441 136 aa, chain - ## HITS:1 COG:no KEGG:Clole_2877 NR:ns ## KEGG: Clole_2877 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 14 130 16 123 124 130 55.0 2e-29 MYMDSFFWGGGFEIMFTLVFVLIIGIFVVTAVKGLSQWNKNNHCPRLSVTASVVSKRTNV SRHSHPNAGDATGAHGFHSTTSTSYYATFQVESGDRMELSVTGTEYGLLAEGDRGKLTFQ GTRYLGFERLGNVEPV >gi|157101622|gb|DS480702.1| GENE 174 197765 - 199126 498 453 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 431 1 432 448 270 37.0 5e-72 MEVAEKIKTQPIPKLIWALSAPAVLSLLLNALNTAIDGIFVAKSAGITALSAVTVSFGVI LIIQALSLLIAAGASASISLKMGKSDKQGAEKIIGSACMLSILFSAAITIIGLLTIKPLL SLYGANADNMVYAKEYITVILSGSFFFVTAQSMNNCVKGMGYAKRAFLNSLSSVVTNTIL DAVFIFVFKWGVFGAALSTVIGNCVCMLLAMQFLCSKKSAGNLKPSNINLSASKKIMSTG APASITQFALSLVSLTFNHVAAFYGGNVGVAAYGIMYNTTMLVYMPIIGLGQGIQPIFGF NYSAGNYTRVRSTLKYSITCATIFAAGMFLVIELFSSQIISTFGGAGNKALMDMAVPGIR IFTLLLPAVGFQMISANYFQYIGKVKQSVVLSALRQLLLLIPFAILLPTVFEVTGIWIAT PTADFISLIVTAIFVRQEVCVIHSQELGVNLVN >gi|157101622|gb|DS480702.1| GENE 175 199289 - 199912 222 207 aa, chain + ## HITS:1 COG:CAC0019 KEGG:ns NR:ns ## COG: CAC0019 COG1309 # Protein_GI_number: 15893317 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 5 183 3 176 201 76 30.0 3e-14 
MENKKSVRRPQQKRSIQMKETILEVSKSLFCENGYYNTTTNEIAKTAGISIGSLYSYFPD KDAILTELLERSNQYHFSNVFEKLRPESSAQLYLKEPKKWLYDLVNTLIQLHEAEKDFLR ELNVLYFAKPEVKAIKDSQSEKVQMATYEYIRRYQSELPYEDLEAVSVVIVDFITALVDR IIFKEARLEKERLLNAGIEALYRIISK >gi|157101622|gb|DS480702.1| GENE 176 200094 - 202694 2025 866 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 # Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 137 544 344 754 776 199 31.0 2e-50 MISQKQILIVEDNEINRMLLGEILSADYQVVEAENGLEALSVLKERGDVISLILLDITMP VMDGYTFLSIVKKDPVFSSIPVIVTTQSDSESDEVAALSHGASDFVAKPYKPQIILHRVA SIINLRENAAMVNQLKYDRLTGLYSKEFFFQKVKEELARHPEREYDIICSDIENFKLVND VFGLPAGDRLLREVADFYQSQVGDKGICGHFNGDQFAILMERRCKYTDEMFMEATARVNA LPNAKNVLVKWGIYSIEDRTIPVDQMCDRALLAACGIKGQYGKYFAIYDDRLRSRLLREQ AITECMESALREGQFEIYLQPKYSVKEERLSGAESLIRWNHPEWGLQSPGQFIPLFEQNG FITQLDQYVWDQTCAILQRWEQMGYPCIPVSVNVSRADIYNVDLPEILTETVQKYGLPPS RLHLEITESAYTENPSQIIEVAGRLRELGFIIEMDDFGSGYSSLNMLNKMPIDILKLDMK FIQNETEKQASQGILRFTMGLARWMDLSVVAEGVETREQFEQLREIGCDYVQGYYFAKPM PCGEFEKLIREQPEEAAMESSMEDVSFHGKQRPVLLIADEDDAYCRQVRRIFESRYQIVE ASDGEKALACIASYENKIGAAIISLTLSEPDGFQILEMLRRERAVWNIPVIATSWDGSQE EQALDMEADEFLRKPHTAAVLERRAVRAMRSAASRERERMLEGEACQDYLTGLLNRRGLE AAADALDGKDMPLAVYLFDLDNLKHINDTFGHMRGDQTISAFAELLREQTRESDILSRFG GDEFVVIMKQMKSGESAVKKGEDICRSICEYPFADNVRACCSAGVVIWDTRKPLPAILEY ADQALYRAKAENKGGCCMWEDGYEYP >gi|157101622|gb|DS480702.1| GENE 177 202709 - 204292 1265 527 aa, chain - ## HITS:1 COG:mlr2027_3 KEGG:ns NR:ns ## COG: mlr2027_3 COG2199 # Protein_GI_number: 13471907 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Mesorhizobium loti # 367 522 1 155 155 110 41.0 5e-24 MYRCRMKIAIFSKDTMLSELVRSLEPLEHFTHEITVAPEANQEIMRESRLLIWNLDDSVP PSKLRSQCAAEASLIFCGSRQRIGALPPEEFGAADEFWETPLVRDYIKVRLKRMMEQIKL GYDYYMTRTYLDTAIDSIPDMLWFKTLDGIHVKVNKAFCSVVGKTREDVTGRNHCYIWGV 
SPDDYENGEASCRESEDAVIKERRTLQFTESVKASHGMRQLRTYKSPIVDRDGVTVLGTV GIGHDVTDLVNMSAEIEILLQSMPYAILLWDNNGKILNANRKFEEYFKLPKEAVIGQDYD AWITGAFEEQRTINSEGYMEARVSRRDGSGKMLEIHENSVYDVFNHVAGKLCIFRDVTVE RDLEKQIRHSSNTDFLTGLYNRRCFYQYIHNNRGGRTVSLMYIDLDRFKEVNDTYGHKVG DAVLVHTADVLRRLFRDDFVARLGGDEFLVVRLGKCSMDQMEQEAEAFLKEMRTAFLAAE QTVSLSASVGIAQSSDSMVDIDALLQRSDQALYQAKKAGRSRYCVYR >gi|157101622|gb|DS480702.1| GENE 178 204555 - 205328 553 257 aa, chain + ## HITS:1 COG:yjiL KEGG:ns NR:ns ## COG: yjiL COG1924 # Protein_GI_number: 16132155 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Escherichia coli K12 # 2 247 5 250 257 209 44.0 5e-54 MFSLGIDSGSTMTKGVLFDGINVVGCHMLPTSMRPGQILRDIYDRLYTEETGFVVSTGYG RELLKEADAKITEITCHGAGASYLAPGCDTVIDIGGQDCKVITLDSHGQVNDFLMNDKCA AGTGRFMEMIMARVGSDISHLDDFVQGCRPVPINSMCAVFAESEIVGLMAQETPPGDIVL GCVHSICHRTAIFAQRLTGGHPHIFFSGGLAQSEIMRSVLGEYMNTSHITTHPLSQYTGA VGAAVLGYGKLKKRSRN >gi|157101622|gb|DS480702.1| GENE 179 205330 - 206478 1259 382 aa, chain + ## HITS:1 COG:yjiM KEGG:ns NR:ns ## COG: yjiM COG1775 # Protein_GI_number: 16132156 # Func_class: E Amino acid transport and metabolism # Function: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB # Organism: Escherichia coli K12 # 1 382 8 390 390 452 54.0 1e-127 MELKKELPEIFEEFADARKNSFLAIKELKEEGIPVVGVFCTYLPREIPHAMGASVVGLCS VTDETIPDAEKDLPRNLCPLIKSSYGFAKTDKCPFFYFSDLIVGETTCDGKKKMYEMLAE FKPVHVIELPNCQTEAGIGLYRQELIRFKEVLEKKFGTVITEDAIRCQIHNRNQILAALN RLQYVMALDPAPALGLDIVNIVYGTGFKMKIDTLAEEVNAITDKIEAEYAQGRNIGKRAR ILVTGSPSGGAALKVVRAIEDNGGVVVCFENCSGMKPLDPIDEDNPDVYDALARKYLNIG CSCMSPNPRRMELLDRLIDDFHVDGVVDLVLQACHTYNVETSLVRRLVKEKKGLPYTVVE TDYSQADVGQLNTRMAAFIEML >gi|157101622|gb|DS480702.1| GENE 180 206473 - 206694 180 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941319|ref|ZP_02088656.1| ## NR: gi|160941319|ref|ZP_02088656.1| hypothetical protein CLOBOL_06212 [Clostridium bolteae ATCC BAA-613] hypothetical protein 
CLOBOL_06212 [Clostridium bolteae ATCC BAA-613] # 1 73 14 86 86 139 100.0 8e-32 MKCPYCNQEMEAGFLTSDARCIAWRRERHEPGLVSRNDRNSGVQLARKTLGAAVVENAYC CAACRKIVIDYSL >gi|157101622|gb|DS480702.1| GENE 181 206877 - 207659 651 260 aa, chain + ## HITS:1 COG:CAC3359_2 KEGG:ns NR:ns ## COG: CAC3359_2 COG0778 # Protein_GI_number: 15896602 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 82 260 1 191 191 123 35.0 4e-28 MVEINRNACTGCGQCMSDCIANNLFLREGKAEVSGNCILCGHCVAVCPLNAVSIPEYDMG DVEELSQEQAGLDSDRLLKAIKYRRSVRRFKQKPVSSGDLDMLLQAGRYTATAKNMQDCR FIFVQKELEILKTLIWDGIGRILDSPAPGPAQAYRGFYEAHMADSKQDYLFRNAPAVLFI ASEANIDAGLAAQNMELMGVSLGLGFLYNGYLRRAAEMNPQVLDWLGVGEKKIEACALLG YPDITYKRTAPRRAADVIFR >gi|157101622|gb|DS480702.1| GENE 182 207675 - 208487 820 270 aa, chain - ## HITS:1 COG:no KEGG:Clole_1220 NR:ns ## KEGG: Clole_1220 # Name: not_defined # Def: xylose isomerase domain-containing protein TIM barrel # Organism: C.lentocellum # Pathway: not_defined # 1 270 1 271 273 159 34.0 1e-37 MKIGVRAHDYGKHSIEGLASLLREEGYDGAQLALPKVFEEIDSYEDIRLSHIERIRRAFE KNRVEIPVMGCYMDLGNPDRSVREYAVETLKNCLLYAKEMGAGVVGTETAYPRLSREERA AWCPYMMDSLMRVMEEAQRIDMKLAIEPVYWHPLADLETTLKTIRMVDDPEHLRLIFDAS NLLEFPETTDQDAYWNQWLASVGTYIDVMHIKDYSLGKDRAYQPKQLGEGILRYAEISRW LHENKPDMYLLREEMNPASAGADIAFMRRM >gi|157101622|gb|DS480702.1| GENE 183 208688 - 209839 1029 383 aa, chain - ## HITS:1 COG:CAC3027 KEGG:ns NR:ns ## COG: CAC3027 COG1408 # Protein_GI_number: 15896279 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 40 377 53 388 392 159 30.0 7e-39 MKLISLFAALAAVIGLFANAGRKLFQWLGPLFPCIGPLPYGIAYGLAVTGILGAFIVSRV PGNGIFAPVFYVCHYLLGFIVYMVMLVNLAGLFLFLAGLLRLLPAPLSCRAGVAAGAIPL LLSAALSVYGAVHGAVIQIKPYEIQIGGQTQEKAPLRIALISDLHLGYVIGEHHLEKVIN AVNSTKPDLVCIAGDIFDGDATALADAGTLKELFLKIESVYGVYACLGNHDAGPSYDRMT EFLSEAGVRVLQDEAVVIDSRFVLAGRKDSFPIGGQGERRGSLELPERTEDLPVIVMDHQ 
PGNIRDYGEETDLILCGHTHKGQMFPFNLITDAVFDVDYGYYRASVDSPQVIVTSGAGTW GPPQRVATDNEVAEILVMLPVRQ >gi|157101622|gb|DS480702.1| GENE 184 209841 - 210740 733 299 aa, chain - ## HITS:1 COG:BS_ydeC KEGG:ns NR:ns ## COG: BS_ydeC COG2207 # Protein_GI_number: 16077582 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 8 295 6 288 291 133 28.0 5e-31 MQLNYLEVNGNLEEVTCHGTPAFPMEVYLDDLDLYPNRSIPWHWHKELELVYVLRGAIEF QTGVRSYVLSSGQGGLLPPNMLHMVAPYHHQRGSAYCAVLADPYLLHGLSGGLVEAQCAS FITGADVDFFLLDNTCPWHKEALSCIHDMYLCDCRRPPGYLWRLQILLQQTGLTLYENMD LKREKHTSYSDVPYRRVRSCLTFIHQEYARKIRLEDIARAASVSTSECCRDFKEILGQSP VDYLITYRVRMGEYLLTHTGKSILEIALAAGFSGSSHFSNTFTRYMGCTPLHYRKRDGV >gi|157101622|gb|DS480702.1| GENE 185 210877 - 211362 304 161 aa, chain + ## HITS:1 COG:no KEGG:lwe0370 NR:ns ## KEGG: lwe0370 # Name: not_defined # Def: glyoxalase family protein (EC:4.4.1.5) # Organism: L.welshimeri # Pathway: Pyruvate metabolism [PATH:lwe00620] # 1 159 1 124 126 67 31.0 2e-10 MKLTYIAIHTAKPERMKDFYVKYLGAVADEPRPAGKDGQEADTGLADSTFGGTVYNLSFE GGVKIRLIPERPVICDALSSVSFASALISSAPSSISPAMQARPSYAVCLTFTVGSRKRVN ELTGQLIMEGYDVICEPGFYGVEHYSSSVFDPEGNVIELVA >gi|157101622|gb|DS480702.1| GENE 186 211429 - 212619 280 396 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 91 386 6 318 319 112 26 2e-23 MGRYRRGNMKGNNLSDVKKANRQVVYECIWSHKAVTMTDIAYMTHLSRPTVTSLVQEMTA DNLVVKSGYGTSNGGRNPVLYHANAGAAYAMGIDLEFPKVRMAISNLECENICFSVRVYP RDSDKDQMIRLLKQQMDDLIQDSGIDRSRLLGVGMGIPGVIDYKQNHSVLFERIEGWENV PIGDIIQRYVGKPVYICNDVNLLSWAERKAAHLEDVQNMLYIVIRSGIGMAIWTDGSLMQ GEMGNSGRIGHMTVDKNGLECKCGSRGCLGLYTSEKAMMRIYSELTGKEIKQAGDLVPLA ESGDKAALQVFETCGRYLGIGIVNVANLFDISEVVVSASFDVSYILRNAQYALDVRKMNT IRREVKIREGLLAESGFGLGGCLLVLDKEHMSLLEG >gi|157101622|gb|DS480702.1| GENE 187 212734 - 214263 1151 509 aa, chain - ## HITS:1 COG:AGl463 KEGG:ns NR:ns ## COG: AGl463 COG0165 # Protein_GI_number: 15890339 # Func_class: E Amino acid transport and metabolism # 
Function: Argininosuccinate lyase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 16 467 8 462 504 261 33.0 2e-69 MRLPETYTELRKQTLEQEGMKYPGKALVAMELEPGYERAKNRMLPYIQAINEAHLIMLME QGIVDENSGNVILKALGELDQEFYKNTVYDGRYEDLYFYMESKLIEATDGIAGNLHLARS RNDMCVTWSHMAVRDGLLQAMEQFLYLKNAIVLFAEEHKDTLYVIHTHTQHAQPGLLGHY FLGAADMLERDFKRLCRAYDAVNQSPMGAAAVTTTGFPVSRERVAELAGFSGMIENAYDA IGNSDYLTQTASALGLCALDMGRIVTDLLLWATQEMNMIHVADGYISISSIMPQKRNPIA LEHLRSSLSVLKGMADTVLTGFLKSPYGDISDYEDIEDSVFGCLELFQKNVQLFRAVIST LEVNKELLESRAYESFSVVTDIADQMYRSYKIPFRKAHSFVSFLVKKAGTMDYNLKNISE EFFSSAYHEFFGEPFAFDFTPISQCINPWHFVKCREVRGGTGDKAMKSMIDRAKKEVEAN LLWIQEHKKQLREAAETRKQVIQKKLDKF >gi|157101622|gb|DS480702.1| GENE 188 214303 - 215445 1099 380 aa, chain - ## HITS:1 COG:SMb20510 KEGG:ns NR:ns ## COG: SMb20510 COG4948 # Protein_GI_number: 16264240 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Sinorhizobium meliloti # 1 355 1 355 382 245 37.0 1e-64 MKITDVSTRLIDQYMFVRVDTDEGISGVGESGAWAFLEASEAVVETFKRYLIGQNPLLIE HHWQYMYRCYHFRGAAIMGALSAIDIALWDIKGKYYGVPCYELLGGKVRHKARVYYHVFG STGEELIRGCRNAKARGFTAVGHLTPFVDSDRREIMPLESFAAKISNASERVYQYREAVG NEVDLCIEIHRQLNVPEAIALGREIEQARPMFFEDPVRPDNFDDMEKVADKIHIPIATGE RLHTLEEFGMLLSRNACQYIRPDVCLCGGITQAKKIAALAEARGVLVVPHNPLSPVSTAA CIQLAAAIPNFAIQEYPLGEDSAPKNRIVKSTLTLKDGYLTVPEGPGIGIELNESAMNDY PYKPRLYVTKLREDGAVADQ >gi|157101622|gb|DS480702.1| GENE 189 215457 - 216317 762 286 aa, chain - ## HITS:1 COG:PM1760 KEGG:ns NR:ns ## COG: PM1760 COG0395 # Protein_GI_number: 15603625 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Pasteurella multocida # 10 285 2 277 277 168 34.0 1e-41 MANGGGHVNRKNSKLLKLIGIHILAVAVVIPIIFPVYWLVVSALQSPQTVTKLPPSILPQ TYSLYFVKKVFTEFGIARFLWNSIFITVVSTILTLAVACLAAFSYKAYEYRMKETFSKMV LFVYMFPQILIIIPIYLMMSKMGLINTYKGLILCYIAFELPICIWTMQSYFETIPSDLIE 
AAEIDGLAKIQTLWHIFLPVALPGLAASGIMTFIGIWNNFLLANTLLIDESKKTLPVVIA DFSSRDSMLQGDVLAASMIVCIPSFLFALFAQKYLVGGLTSGSVKG >gi|157101622|gb|DS480702.1| GENE 190 216310 - 217212 608 300 aa, chain - ## HITS:1 COG:BH1245 KEGG:ns NR:ns ## COG: BH1245 COG1175 # Protein_GI_number: 15613808 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus halodurans # 53 290 195 433 445 192 41.0 6e-49 MSGNRKKKNRTSTQGTLLYMPAILIIFGIVVYPMLYALVMSFTGYSVRKPVMNFVGIANY IKILQDASFWQAVGRSLIFTFGSLIPQVVLGLAIAILLNHPDLRFKGLFRGLVIMPWLVP TVAVAMIFRWMFHDIYGIMNYILIDLHVLKEPVAWIANEHTAMFILILANVWRGVPMLIT MFLAGLQGIPSDLYEAGQIDGANGWNRFCKITLPLLMPVVMVSGILRFIWTFNFYDLPWV MTGGGPAEATQTTPIYAYRRAFSSYRMGEGSAITMILFVILIIFAAIYFILKKRQDKLYG >gi|157101622|gb|DS480702.1| GENE 191 217285 - 217602 289 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941331|ref|ZP_02088668.1| ## NR: gi|160941331|ref|ZP_02088668.1| hypothetical protein CLOBOL_06224 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06224 [Clostridium bolteae ATCC BAA-613] # 1 105 19 123 123 199 100.0 5e-50 MFTGDNMVDFYLSYPYAMFPAKEDLFQNTKYQENLPDNLKEMVPDMALDILSTSNGLGMV NGPFPGAGEFESQCILGNGLIKMLSDGYSADQAVDYIVGELEKLL >gi|157101622|gb|DS480702.1| GENE 192 217653 - 218684 807 343 aa, chain - ## HITS:1 COG:SP1690 KEGG:ns NR:ns ## COG: SP1690 COG1653 # Protein_GI_number: 15901525 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 50 339 30 309 445 102 26.0 9e-22 MTKKGKRILGVIMGAAVMMAAAGCGQSADSGTKDAQEVKVSENNAENADSAQAESAEEAT GTIEMWSMLTQQERADELQKLADSYMEQNPGTQINITVMPWSGAMDKIVSAIMAGNAPDI MVTGTGYPQSLAGTGGLLELSDLVEEIGGKDAFLSTSLSVQGAYEDGLYSVPLYITPYVA YYRDSWLKEAGIETLPTTWEEYYDMCKAVTDPSKDRYGFALPLGDLHAWKTIWSFLQAND VDLLNVDENGEWYIDLDDESRAAMVETYDYLYKLVKDCAPEGTVGYTQTNVREMVAGGTV MSRIDTPEIYYTLQSVAPDDMDDVSYFKIPGRKRQVPDRDGLD >gi|157101622|gb|DS480702.1| GENE 193 218709 - 219812 907 367 aa, chain - 
## HITS:1 COG:SSO3124 KEGG:ns NR:ns ## COG: SSO3124 COG4948 # Protein_GI_number: 15899830 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Sulfolobus solfataricus # 30 351 33 361 373 202 37.0 1e-51 MIITDIRIHAYSAAYKNPIRNGKYTYPATEIVICEVVTDEGVNGVGWVHGSDMVIKAMLS LKDRVIGQDPFNVERIWERMYLPKVYGRKGLATRAISGIDIALWDIKGKVAGRSVCQMMG GYADRIPAYIAGGYYEEGKGWDMLQSEMKENLKRGAKAVKMKIGGVSIKEDLERVDAVRD AVGPDIHLLVDANNAYNRIDALKMGRELERRNIYWFEEPLSPDDIEGCAELQRKLDIPIA IGENEYTRWGFKQLIEANAAQVLNADAQVLGGITEWKKVADLAMSSHVLLAPHGDQEIHA HLVASVPNGLIAEYYDNNTNALLKDMFPEPIRLNELGQIMVPQAPGLGVEIEYERIRPYC TYSSDDK >gi|157101622|gb|DS480702.1| GENE 194 220091 - 221167 984 358 aa, chain - ## HITS:1 COG:VNG6270G KEGG:ns NR:ns ## COG: VNG6270G COG0371 # Protein_GI_number: 16120191 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Halobacterium sp. 
NRC-1 # 12 323 11 318 365 130 27.0 5e-30 MDSAVKFGAGRYRQGRGVLEHCGQEIRRFGKKAYIVSGPRAFDAVKDRLLPGLTEAGVEF VAEIYDGVCSYEAAEALGKKCLDAGCDEVVGIGGGVIMDLSKAVAGKAGVGVINIPTSIA TCAAFTAMSVMYTPQGAMKDNWRYEYETDGVYVDLDVISGCPCRYAAAGILDAMAKKIEM LNGRPAMEPDTAAFDLYTAYRMSEFAYDLLERYGHQAIEDIRRGTVTKAVEYVTYINIAV TGIISNITRSFNQSALAHMIYYGIRTCFTREARQALHGEIVAVGLFVQLHYNGLGGEMEN LKEFMRHLDMPLSLRELGVEATEANLLILEQYLADSPYVERTEKGLGMLHESIRKLSE >gi|157101622|gb|DS480702.1| GENE 195 221204 - 222349 1119 381 aa, chain - ## HITS:1 COG:STM4413 KEGG:ns NR:ns ## COG: STM4413 COG3964 # Protein_GI_number: 16767659 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Salmonella typhimurium LT2 # 3 375 5 378 387 266 38.0 3e-71 MLDLLIKNGHVIDPANRTDQVGNIAIYNGRITRYEEGEPARHVVDAAGRYVFPGLIDAHA HMFQEGTEIGIYPDVAYLPTGVTSAIDFSAGVANYPIFRSAVIARSRVTIKSFLQVCSAG LTTTSYHECINPKFFNPEKIVRMYHDNQDNILGLKIRQSEELAEGLGIEPLKETVRIAGM AGCPVEVHCTNIPVPTSEVLKVLRPGDIFEHVYQGVKNTIIDENGKLYDCVREARKRGII FDTAEGRKHGDFDVMLKAKEQGFIADMCSTDLVLGSMFRRPIFSLPNLMSRYICMGIPMS QVVSMATERPAGLMKMQGEIGCLSEGARADVAIFRWMDFKQTYKDWKGTAFTADKLLKPE MTVKDGQIMYCAPDYIYEDWV >gi|157101622|gb|DS480702.1| GENE 196 222369 - 223781 1571 470 aa, chain - ## HITS:1 COG:HI0020 KEGG:ns NR:ns ## COG: HI0020 COG0471 # Protein_GI_number: 16271995 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Haemophilus influenzae # 8 467 8 473 479 297 38.0 3e-80 MEKKRFLKLFLIAAVTIAIWNIPAPAGLEVETWHIVALYLGLLLGLVIKPFSEPIITLII VGIAAMFMDTSILLKGYGNNMAWFIAMVTIVCTAFVKTGLGRRIAYNLLLKSGKNTLSLG YIMVFTDLVLSPATGSNSSRTSIIYPIFQNIAEGTGSSAKNAPRKLGAYLTVLMYASSQG TSALFLTGMATNAITVSLITEMLGITLTWGTWFLASIVPVGLFLLLAPYVIYKIYPPELK SLDDIKPLAQKGLKELGAVTSAEKKLFVLFLLAILGWMFGPKLPVVNLSMQVVGFVFLAL VLLLDVLDWNDVMAAKGAWSIFIWYGAFYGIASALSSAGFYTWLADQLGVVLDLSQMNGM AVTVILVLVSLAVRYFFVSNSAFVASFYPVLFSLAVNTKANPMVVGLLLAFFASFGALLT HYGNGAGLITFASDYVPQKDFWRVGTIMVAMALVIFFLIGLPYWKLIGLW >gi|157101622|gb|DS480702.1| GENE 197 223818 - 225134 1177 438 aa, 
chain - ## HITS:1 COG:Cgl2722 KEGG:ns NR:ns ## COG: Cgl2722 COG0786 # Protein_GI_number: 19553972 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Corynebacterium glutamicum # 5 425 7 418 449 120 26.0 6e-27 MTVSSLCNDMLILAVFMLAGFFIREIFKPVQKLFLPSSLIGGLLLLLLGHQAMGIIAVPE SFHELPGVLIDIVMASLVFGVNFNREKISSYLDYSCVTMTSYGMQMCLGVGLGWVLSKIW TGLPQGWGVMGVFSFHGGHGTAAAAASAFQKLGVDGNMAVGMVLSTFGLIFAMVVGMAMV NYGIRKGWGTYVKEPKKQPSYFYGGTLPQEKRAASGHTVTTAISINHLALQISWLLAALF LGRTIFHFLGTCIPFFTKLPGVLHGVFGGAILWKILKMTKLDRYVDIKTVKMLSGFFLEI VVFTAMATLNLEFVSTYIAPVLIYTVTICGLTVPLIFYLSYRFCKDEWFEKACMAFGAAT GNTSTGLALVRAIDPDSQSGAGDTHGVYSTIMSWKDIYVGLTPVWLMNGIALTFGVGFVM MIGFAVTGFVFFNRRKRR >gi|157101622|gb|DS480702.1| GENE 198 225216 - 226244 1088 342 aa, chain - ## HITS:1 COG:MTH970 KEGG:ns NR:ns ## COG: MTH970 COG0111 # Protein_GI_number: 15678988 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Methanothermobacter thermautotrophicus # 35 311 35 308 525 202 41.0 9e-52 MGWKILLPQEIMDEGEQLLVDAGHTLIRGRGFETEDVLADMKEHRPDAMIVRITPITREV IEANPNLKVVVRHGAGFDALDVKACHDNGVQTLYAPVANSTSVAETAMLLMLECSRNVIE LRKIWVQDYYKAKLKTRKTTVNGKTVGIIGCGNIGSRLAVRALGMEMNVLAYDPYKKADE FPEGVEVVRDLDRIFKESDYVSLHVPNTPITRGMVNKERLEMMKPTAFLINTARGGCVVE QDLYEACKNKTIAGAGLDAIQKEPVDPANPLLTLDNVIIYPHIGGNTMEAAHRASYFAAM GVQEVYEGKEPTWPINDIDYVTAPTYTDTKKPNKKSSGMFDF >gi|157101622|gb|DS480702.1| GENE 199 226336 - 227037 856 233 aa, chain - ## HITS:1 COG:AGpA472 KEGG:ns NR:ns ## COG: AGpA472 COG0684 # Protein_GI_number: 16119556 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 226 1 223 227 147 40.0 2e-35 MSVGKRIYLKREMPDMEVMNQFKSIPASNTADVMGRSCAMNPRIRLVSSPKAQMMVGPAY TVKCRAGDNLALHAALSMCNEGDVIVVSNEEDSTRALIGEVMMAYLRYTKKVAGIVLDGP IRDIDEIGKWDFPVYCTGTTPGGPYKEGPGEVNVPIACGGISVNPGDIILADPDGVIVIP 
RKDAAVILEEARKFQAADEAKLEASKNGTAKREWVDKALEAKGFEIIDDVYRP >gi|157101622|gb|DS480702.1| GENE 200 227208 - 227789 701 193 aa, chain + ## HITS:1 COG:CAC3361 KEGG:ns NR:ns ## COG: CAC3361 COG0583 # Protein_GI_number: 15896604 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 168 1 169 312 101 34.0 7e-22 MDTLSLYYFSELAKDLHITRTANRLFISQQTLSNHIMRLEEYYGVKLLNRKPSLSLTYAG EYVLSFAETMNRENANLMDILADIQKQKRGLILFGASTLRMSASLPDILPEFSSRYPNVE IRITDMNSKRLEQLILSGDLDLAIVISGGEHPSIEEKPLMSDQIYLCAADTLLRKYYGDG TEDLKRKPETWRM >gi|157101622|gb|DS480702.1| GENE 201 227777 - 228187 314 136 aa, chain + ## HITS:1 COG:no KEGG:Acfer_2032 NR:ns ## KEGG: Acfer_2032 # Name: not_defined # Def: transcriptional regulator, LysR family # Organism: A.fermentans # Pathway: not_defined # 2 127 192 319 321 94 30.0 2e-18 MADVKDFSRLPFCILNNRMGQNIQKCFDEAGFVPNIYTTSAYVQISTSIGLKGLAACFAT RNSILNQKGEISEDINVFPLYCKGEPLLQQISVIHHKDRYLSTYTQYFQELILKYFHEVE QIPIEELVNSSSISLQ >gi|157101622|gb|DS480702.1| GENE 202 228250 - 228990 242 246 aa, chain - ## HITS:1 COG:lin0128 KEGG:ns NR:ns ## COG: lin0128 COG5632 # Protein_GI_number: 16799205 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Listeria innocua # 18 123 32 138 289 82 42.0 6e-16 MRDITFCHPRLQHLAGQLIEECGRQGLSIKIGETFRTVAEQDALFAQGRTKPGNIVTNAP GSNYSSYHQWGTAFDIYRNDGAGAYNDSALFFQRVGAIGMSLGLEWGGNWKSIMDNPHFQ LPDWGSSTSGIKKVYATPEAFMKTWIPEERTGWIKDNNGWWYRRPDGSYPANAWHTINHH WYLFNKDGYACTSWHRWNGSVCDPEDGSGDWYYFDSTAGGPLEGACWHSRNNGAQGIWFV DTTDTI >gi|157101622|gb|DS480702.1| GENE 203 229513 - 231234 1530 573 aa, chain + ## HITS:1 COG:SP0496 KEGG:ns NR:ns ## COG: SP0496 COG1283 # Protein_GI_number: 15900410 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Streptococcus pneumoniae TIGR4 # 6 567 9 538 543 297 35.0 4e-80 MSIHSFFSLLGGLALFLYGMQMMSTNLEAAAGSRMKQILERLTANRFLGVFVGAGITAII 
QSSSATTVMVVGFVNSQLMTLKQAVWIIMGANIGTTVTGQLIALDIGAIAPLIAFAGVAL ILFVKQKKVQFAGGIIAGLGILFLGMGMMSAAMIPLRDSQHFVNLMTKFSNPFLGILAGA AFTAIIQSSSASVGILQALAVSGLIGLDSAVFVLFGQNIGTCITAVLASIGANRDAKRTT LIHLIFNVIGTTVFTLICLVSPFTQWMASFTPDNPAAQIANVHTLFNIVTTLLLLPFGTQ LARISEKLLPDKPQKTTDEERWFEDLLASEHVLGVSVIARKQLNEDISRMLALAASNVET SFLAFEHRDEYELEQIQKREEEIDLYNAQLSRRISRVLAVENSPSEVNALNRMFSIIGNV ERIGDHATNIAGYARTMMERRLELSSQASVELADMRTSCMRAVNLICSACCVDFTFAPEQ ESPDKRGSLTEHDAPAEHISLPQHASLLEQALLLEQDIDSRTARYRSNQIERMRKGKCHV ETSILYSEILTDYERIGDHAFNIAQALDKLDEN >gi|157101622|gb|DS480702.1| GENE 204 231531 - 232199 314 222 aa, chain + ## HITS:1 COG:SP1625 KEGG:ns NR:ns ## COG: SP1625 COG4300 # Protein_GI_number: 15901461 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted permease, cadmium resistance protein # Organism: Streptococcus pneumoniae TIGR4 # 1 221 1 198 204 98 31.0 1e-20 MSSIVTTSIVSFISTNIDNIFVMMVLYAKVDETFKKKYVVIGQYLAIGILTAISLSAAFG LNFVPQKYVGLLGFIPIALGVKEWISYKRPAQNKSNRKEFPLETEKTEAAHPNKLQNILR NVTSMASKIVKPEILSVALVGVANGADNIGVYIPLFTGYSSIQVIITIIVFVLMMAVWCF LGEKITDFPKIKAIIQTYKYFAVPAVFVGLGIYIIIKSGLIT >gi|157101622|gb|DS480702.1| GENE 205 232386 - 233369 298 327 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941350|ref|ZP_02088687.1| ## NR: gi|160941350|ref|ZP_02088687.1| hypothetical protein CLOBOL_06243 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06243 [Clostridium bolteae ATCC BAA-613] # 1 327 1 327 327 681 100.0 0 MLYHKDVDGDREYVHIFDEDKLKICFEALADTTNEIHDLPAQDGIKQKILGCVNPAQENR LTIPGIPPVVVLRKALDLQLSKHEHDVNVLNTYVSIYGSRFQRILLPAPDVLFSSKMGQS QSSFKAQDTASFLTSNLLYRKMTFAEYTYLCNRKELPCNGNYFSPDGLWQYSPDDPTGLS QNHEVLVKFTLAHSLGSILDNVLSSNIPVYREGGNLSKAASENIDMFARHLCAFPKKGKT EFKAGIKIESDCISFRMGNGLKTGVCYIPLIQGELENIEVEAYNIVISDEKYDALQKGDQ EPMGIKFHEIADYYLPLLKSNFTFNSR >gi|157101622|gb|DS480702.1| GENE 206 233520 - 234839 1294 439 aa, chain - ## HITS:1 COG:FN0621 KEGG:ns NR:ns ## COG: FN0621 COG0427 # Protein_GI_number: 19703956 # Func_class: C Energy production and conversion # Function: 
Acetyl-CoA hydrolase # Organism: Fusobacterium nucleatum # 1 432 1 427 434 286 39.0 5e-77 MNRYLEQYKAKQCTVDDALAAIRDGDFVATSGGVNAPLLFYKNLHRIAPRLREGFDFFYG TTYKKGDFEFLDSVECKEKIHTLSGFVLSGDEREYIRQGHVDYVPTHYHSQGSKMIQARG GLDVYVAAVCPMDERTGYFRTSLSNVNETDFRNAAKKIYLEVVPSLPVIYGNNEIHISEV EGIYEYDHPLETMDPLPFGEVEKQIGEYVAELVEDGSTIQLGIGAIPDAVAHAFLDKKDL GVHTEMITNSILELVEAGAVNGRKKSINRGIIVGAFSRGSQRLYDFMDHNPCVAMNPCSY VNNPHVIARNYKMVSVNTAMGIDLLGQVSSESIGPMQYSGSGGQVDTVTGAIHATGGKSI IALASTAKNGTLSKICLSHAPGTAVTLSRNDVDYVATEYGIVRLSGRTVRERAKLLISIA HPRLRDELTAQAKSMGFLL >gi|157101622|gb|DS480702.1| GENE 207 234836 - 235399 595 187 aa, chain - ## HITS:1 COG:MA2909_2 KEGG:ns NR:ns ## COG: MA2909_2 COG1014 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Methanosarcina acetivorans str.C2A # 7 173 12 177 186 111 36.0 8e-25 MINRLMIAGFGGQGVMMIGKLIGECAMEQGLYATFFPSYGAEQRGGTANCTVIQSDRMIG SPTSNKLDVLCALNQPSMVDFLPRLRPGGTLLVNSSIVDTSGVSRDDIQIVKVDIDNMAY ELGNRKVANVIMFAAYMALTNTIPADKAEEIAMHKLGRKPELIELNRRAFRAGMDVIEKL KEAGECR >gi|157101622|gb|DS480702.1| GENE 208 235396 - 236148 731 250 aa, chain - ## HITS:1 COG:MA2909_1 KEGG:ns NR:ns ## COG: MA2909_1 COG1013 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Methanosarcina acetivorans str.C2A # 2 248 11 267 296 238 44.0 8e-63 MEKVYSRPESLDITNNKFCPGCMHSTFIKLCMQVIDELGIADRAVVVQPIGCANNSSPCI KIDQGCALHGRAPAYATGLKRCNPDSIVFTYQGDGDLSSIGFAEVMHAANRGENICVFYV NNCNYGMTGGQMSPTTLIGQKTTTCVEGRRADLTGYPMHMAEMIAALRGVVYSGRFALDT PANIRKAKEALRKAFQNQMDNKGFSFVEMLSNCPTNWGLSPLESVKMLQEKVMKEFPLGI YKDSEEEAAE >gi|157101622|gb|DS480702.1| GENE 209 236154 - 237215 1125 353 aa, chain - ## HITS:1 COG:TM1759 KEGG:ns NR:ns ## COG: TM1759 COG0674 # Protein_GI_number: 15644505 # Func_class: C Energy production and 
conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Thermotoga maritima # 1 342 4 345 356 306 46.0 5e-83 MERKFMKGNEAIAEAAVRAGCRFYAGYPITPQSEVPEYLSRALPKAGGIFVQGESEVASI NMVYGAAAGGTLAMTSTSGPGISLMAEGISSLAAARVPSVIVDIMRAGPGTGVMKPAQSD YFQINRASGHGGYQTINYAPATIQEAVDLVTKAFDVTQKYRMPVFVFADSLVGGLMEAVT LPEAKEPPRTPEWAVVGKGYHGGRPNIVLNASFVEEQQMKKALEWAEIYEGWKKDEVMVE EYMTEDAEFILTAYGSCGRICHTCVDELREAGYRAGLIRPITTNPFPSDAYRKIADKARF ILDVEMAIPALMAQDVELAVGGRLPVYTCLTTGGVIVDADDVAERAIRLVKGE >gi|157101622|gb|DS480702.1| GENE 210 237272 - 237475 313 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941355|ref|ZP_02088692.1| ## NR: gi|160941355|ref|ZP_02088692.1| hypothetical protein CLOBOL_06248 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06248 [Clostridium bolteae ATCC BAA-613] # 1 67 3 69 69 96 100.0 7e-19 MAKLEISRESCKSCELCVSVCPKKVLEIGGEINKKGYRYVVDAHPEACIACCMCAQVCPD CVIEVFK >gi|157101622|gb|DS480702.1| GENE 211 237553 - 238218 688 221 aa, chain - ## HITS:1 COG:no KEGG:Daud_1610 NR:ns ## KEGG: Daud_1610 # Name: not_defined # Def: hypothetical protein # Organism: D.audaxviator # Pathway: not_defined # 7 216 17 226 231 94 30.0 5e-18 MSTHNFVLIGEAGSGKTEIAANLALRLAEDMPVPVCLIDMDQTKCMFRARDFSSLLEKGH VHMAVNPELWDSPLMPQGVTSLLKDDSICCVFDAGGNAVGAAMLGQFAGLMAEKNTSYYY VINPCRAFSGSMEDILESLSAILISARIPAEGLKFIANPFMGEYTTPELIMEYVRHLEAE LGKAGLKLHGLAVSERFFGEVRKLTGLPILELNTFVDKLYV >gi|157101622|gb|DS480702.1| GENE 212 238601 - 239998 1162 465 aa, chain + ## HITS:1 COG:CAC1590 KEGG:ns NR:ns ## COG: CAC1590 COG0471 # Protein_GI_number: 15894868 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Clostridium acetobutylicum # 9 400 14 410 476 96 23.0 1e-19 MKSVNTRKYCLSILCILIGLVIGLIRPPGELTVESMRYLGIFMTAVLLLILQLYPAPLVA LFACIAFVAFHVCTFPEAFSAWSGDTMWTVILMMIFATGIAKTGLIDRIAYNIIKFFPAT YTGMVVAIMATSTILSPIVPNSYSKCAVLGPFAASVAKANNIKKGSRAAAGLFCAFFVPA 
AIHTMAFLTGSAMTFVLIGMMGDGYAFSWTEWFVCCSPWFIVNLVLTFLFIQVYYKPKEK LNISKELIHQKIAELKPMSSDEKAAAGILAVSILLWITGSYTGLKAYTVAALAFCAMFLK GIFTMKDFQDMVPWGGLITLVASLLSISALLNVVGVNNWLASVAAPVIIRFIPNVYVFII LLCISTYLLRYLECTGLATLAIIAAIFLPIGAPLGIHPFITLFADYLAMLVWNLSFHNPY YLQAEAVVDGLITHKNVVSMSHAYMVIHILGLLASVPLWRYLGMC >gi|157101622|gb|DS480702.1| GENE 213 240127 - 240858 599 243 aa, chain + ## HITS:1 COG:SSO1255 KEGG:ns NR:ns ## COG: SSO1255 COG1802 # Protein_GI_number: 15898099 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sulfolobus solfataricus # 12 204 3 196 212 80 32.0 2e-15 MTDITNTAGNDLKNYAYNILKERLINCTYAPGSILNEAHLTSEMKLSRTPVREAVSRLET EGFVKVIPKKGIHVTDILLSDVLQIFQIRTEFEPTILRLSASRLPKDELLYFIQRFENPD PDDTVMDRIRLDTAMHLFIIEHCGNRYMVDIMHRVYDISSRIIFFIHQNHAAVHDDSGEN LEILRLLLDRDLETATEKMRVHMEHCRMAALNYFYTVPDGEAPVKTYRDKLYKKQETAPH PYR >gi|157101622|gb|DS480702.1| GENE 214 241032 - 241658 716 208 aa, chain + ## HITS:1 COG:no KEGG:TepRe1_0427 NR:ns ## KEGG: TepRe1_0427 # Name: not_defined # Def: selenium metabolism protein YedF # Organism: Tepidanaerobacter_Re1 # Pathway: not_defined # 3 207 5 205 206 218 58.0 2e-55 MKLDERGKQCPKPVIDTKKALESCNPGETVEVLVDNEIAVQNLSKMAASKGLSAVSEKIS DNEFSVKIPVGQGVTPASGAPDDSDEEPAACLPDSRKKGMAVVLSSNLMGHGDPELGKAL MKGFVYALTQQDVLPETILLYNSGAYLSCEGSDNLEDLKSMESQGVEILTCGTCLNFYGL SEKLQVGSVTNMYEIVERMTAAKLLVRP >gi|157101622|gb|DS480702.1| GENE 215 241784 - 242773 766 329 aa, chain - ## HITS:1 COG:MT3500 KEGG:ns NR:ns ## COG: MT3500 COG1957 # Protein_GI_number: 15842986 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Mycobacterium tuberculosis CDC1551 # 12 235 5 255 308 84 28.0 2e-16 MKKIIFDCDNTFGVKDCDVDDGLALMYLLGNRETQLLGITSTYGNSSLDVVQAVNLRMLE ELGRRDIPVKRGGEKRGCYQSEAAVFLAEMADRHPGELSILATGSLTNLKGAYERDSHFF EKVKEIVLMGGITSPLVFEKKVMEELNFSCDPPAARTVLTKGRNVSVITGNNCLKVLFTK QEYRERLSGTENRAAAYIMEKTDDWFRYNDEGYGIHGFYNWDVTAAAYLMHPELFADNVK GLCVSDQDLETGYLRMEEDEMNGGWKAGIKAGMEAGMRGAMRGAMRDGTKSRTADGRCIC 
NLPLIRDGAVFKDNIYRSWLSVTGILADV >gi|157101622|gb|DS480702.1| GENE 216 242914 - 243954 858 346 aa, chain - ## HITS:1 COG:VCA0687 KEGG:ns NR:ns ## COG: VCA0687 COG3842 # Protein_GI_number: 15601444 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Vibrio cholerae # 5 298 9 312 351 224 37.0 3e-58 MSLNIENLRVSLQKKKILHGIDLEIQEGEFVSLLGKSGCGKTTLLKCIAGLLEQESGDIR IQGASVMDQQPKERGTVIVFQDLRLFPHMTAEKNIAFPMELRKVPKKEQKERVEALLRAV QLPGFEKRRMKEMSGGQMQRVALARALAAEPRVLLLDEPFSGLDEGLRTEMARLVRKLHE AWKITTILVTHDKAEALRLSDRIALMEDGKILQYGTPEQMFCHPGTKKVAEYFGKVNYIS GSVAGGWFVSPLFEKRTDLPDGVYDAMIRPNSVQLKEEGDYTVEEVTFMGEMTEIRVRVP EKMAPGGSILCQRMEGPGRPAKLRAGMKAGLAVETEGAVLFRHDME >gi|157101622|gb|DS480702.1| GENE 217 243944 - 244759 725 271 aa, chain - ## HITS:1 COG:YPO2030 KEGG:ns NR:ns ## COG: YPO2030 COG1177 # Protein_GI_number: 16122271 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Yersinia pestis # 22 256 28 265 278 92 27.0 6e-19 MKKRKSFITNMILALLCASILIPASSLVVWVFTERWAWPDLFPQVLSDRAVMEIVRRKGE LVSLMASSVMISSVVAVLSAVIGILTSRALVLYRFRGKDLMSFFTVLPLMVPGTVFAMGI QITFIRLGLNNTVAGVILAHLIYSLPYAVCLIMDGTRAVGIALEEQARVLGAGSFQAFRR TTLPMLVPVILSAVSMSYIVSFSQYFLTLLLGGGRVKTFAVVMVPYLQSGSRNIACIYSI LFMGITLLVFAVFERIAGYWTGQIVGGYYES >gi|157101622|gb|DS480702.1| GENE 218 244778 - 245644 638 288 aa, chain - ## HITS:1 COG:AGl602 KEGG:ns NR:ns ## COG: AGl602 COG1176 # Protein_GI_number: 15890418 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 45 276 48 270 285 96 28.0 5e-20 MRRNLIPYLLLVPQIILTLFFIIGLTTGITQSLGVIPAFGLREPTLKYYREVLTRPDMLK SVLYSLKVAFLSAGVATAAGVGLSAVCVMHKRTKGSMMRVIQLPIIVPHVVVAIFMVNIF SQNGLLARIGYALGVIQEQQQFPMLIYDTKGVGVILAYLWKEIPFIIYFVIALMANINEK 
LGEAAINLGAGRWTAFCKITLPLCKDAVISGFLIIFVFALGAYELPFLLGATSPKALPVL AYQQYIHPDLRNRPYSMALNGIIIVLSVLSAWMYYFIMRKNMKALTEQ >gi|157101622|gb|DS480702.1| GENE 219 245778 - 247118 1483 446 aa, chain - ## HITS:1 COG:SMc02589 KEGG:ns NR:ns ## COG: SMc02589 COG4134 # Protein_GI_number: 15963823 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Sinorhizobium meliloti # 65 419 21 381 408 209 36.0 9e-54 MKKRIMAAVLAAAMTLSLGACSGSGQTQGAGSNQAEDAGSSQAEDAGSSQAGDTGSSQAG DTDSRRSGRENMTFDQMKEEAKGTTVTFYGWGGDEKLNRWLDDEFARVMKEKYDITMERV PMDIDQVLSQLTGEIQAGEKDGSIDMIWINGENFQSAKENNMLYGPFTDKLPNFSDYVDG ESEDVTMDFAYPIEGYEAPYGKAQLVMVADLAVTPDVPKSAGELKEFVEKYPGKVTYPAL PDFTGSAFVRNIIYEICGYEQFMDMAADKETVRAAVAPAMEYLRELNPYLWNQGAAFPDS STTADNMFADGELVMSMTYGAYDVALNIEDGKYTDTTAAFQFEKGTIGNTNFMAIAANSG NKAGAMVAINEMISPEIQADRYEKLRVIPVLDNEKLSGEQKAAFDRVDLGKGTIPQDELL SKRLPEMPAQLVPIIEEIWAEEVVGK >gi|157101622|gb|DS480702.1| GENE 220 247237 - 248367 1048 376 aa, chain - ## HITS:1 COG:no KEGG:Closa_1178 NR:ns ## KEGG: Closa_1178 # Name: not_defined # Def: Rhodanese domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 62 374 1 299 300 306 59.0 1e-81 MVAACIAASILSYMCIPSVKNMMNKIVAMFAAGLMILFALSAFVVMVRKIYRKTRKNKEE EMKKKFVTLGIMMALSMTVAAGCGQKAGNETTAQETAESTAADTAAADTTAADTAAADTT GADGAGSETAAALQSDGSYQYVSPKEAVAAAKDKSAHVLDVREWDNYVKGRVADSMWCPI FPLEDDSLAEAMGTYAKENLSDGQKIYIICNSGKRGAEKATGVLKEAGIDGSLIYTVEGG AKALESEKGALTTNRAEEDIDWKTVAAADALKAVGGSDVQILDVRDNDTYAEGHLKGSLQ SSLKEIEDPAAQTAMYKMAKEEMDPSKPVYLLCYSGNKCAKTGISVMKDAGFDVDNLFII ENGAKDKDIQAAFVTE >gi|157101622|gb|DS480702.1| GENE 221 248416 - 249417 889 333 aa, chain - ## HITS:1 COG:no KEGG:Closa_1177 NR:ns ## KEGG: Closa_1177 # Name: not_defined # Def: aminoglycoside phosphotransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 330 1 329 334 478 66.0 1e-133 MERIAGLKEYVAWKQYRSALGLPDQVTEEYRMLAQGEYNRNYLFTHPVTGRKLVLRVNCG SQMHLEHQIQYEYETLKLLWQSGRTPRPFYADGSLEKIPHGIMVMEFLPGHAMDYRTELS 
FGAECLADIHSVRVGSGSHLVRPEDPLEAILTECEAMVRTYMDSDLGDQGIKMKIRQMLD RGWKRCREAAACKGYECCINTELNSTNFLINGEGKGNYLVDWEKPVYGEPAQDLGHFLAP TTTFWKTDVILTPEEMEDFIHLYIQKVDGRFDTKGLMERTLAYIPITCLRGITWCAMAWV QYQQPDKLLFNQSTFQKLGQYLDMEFLEKMDRL >gi|157101622|gb|DS480702.1| GENE 222 249424 - 250755 1066 443 aa, chain - ## HITS:1 COG:sll1095 KEGG:ns NR:ns ## COG: sll1095 COG3222 # Protein_GI_number: 16330917 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 2 193 3 199 211 108 33.0 2e-23 MKRAIIIFTRVPEPGQTKTRMMPALSAKGCARLHTCFLEDIKRECGKVEGQLFVCFTPDD GRERLYPVFGRGEHYISQRGSGLGERMYQAIREVLGRGYEACILMGTDVPEVRSEYLERA FGLLEQNDVVLGPTRDGGYYLVGMKKPQRDVFDVEGYGQGSVLRDTMYRLKTAGCRVGLT ETLWDMDVYQDLQGYRQRMREDKRLQETSTGRYLAKTSRISIIVPVYNEEKTIVQMQDQL EPLRGKCEILFVDGGSTDRTMERIRPWVKVIHSEKGRARQMNTGAMESHGDILFFLHCDS ELPPRPLAEIRRVMKDHSAGCFGIAFHSSNFFMFTCRVISNHRIKDRRIMFGDQGIFVDR SLFFDAGMFPEIPVMEDYQFSLTLKERGVKLGMTGRRIYTSDRRFPKGTLPKLKLMWKMN RLRKMYRDRVPIEQIDRMYRDVR >gi|157101622|gb|DS480702.1| GENE 223 250928 - 251374 278 148 aa, chain - ## HITS:1 COG:no KEGG:Closa_1171 NR:ns ## KEGG: Closa_1171 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 10 137 2 129 136 104 45.0 1e-21 MKKVAYISATILEILCLAGAWAVQFFTRTKMGMARYVVYKNRGWESRYPMELLAKGAALV MVLLTAMLLLCFLRRRRNLTKALWVMSAGMVLLTSGYAVFTWMNSINEFRAYYFISAILG AVSLIQCIKTWIGMLGCRQEGSGALFLP >gi|157101622|gb|DS480702.1| GENE 224 251444 - 252586 875 380 aa, chain - ## HITS:1 COG:ynjE KEGG:ns NR:ns ## COG: ynjE COG2897 # Protein_GI_number: 16129711 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Escherichia coli K12 # 61 366 136 440 440 193 36.0 4e-49 MKRRWLGIAICAAALLAAGCSSGTDVSRAADESDTGTAAGKASETTADTAAAKSEAGGTA AEASAVKADPVEKKKVYVSPDWVQSVLDGNQEESEDYMVLECAWGTVQDAAAYKEGHIKG AYHMNTDDIESEEYWNIRTPEEIKDLMAEYGITKDTTVICYSDKGTNSADDRVAFTLLWA GVENVKCLDGGYEAWLKSGYGTEKTVNTPQPSDREFGADIPVHPEYILSIDEVKDKLAGD 
DNFKLVSIRSRDEFLGGTSGYGYIDRAGEPEGAVWGHDTDDGSYHNEDGTTAGLDVLKGY LEESGASLDNELSFYCGTGWRATIPFLICYENGMTNMTVYDGGWYQWQMDDSLPVQVGDP AESSCQFTTVGELSTDKAKK >gi|157101622|gb|DS480702.1| GENE 225 252936 - 253847 742 303 aa, chain - ## HITS:1 COG:CAC3076 KEGG:ns NR:ns ## COG: CAC3076 COG0280 # Protein_GI_number: 15896327 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 1 298 1 297 301 197 40.0 3e-50 MFNSFKDIESAVRLSGRRITIAVAGSHDSHVLEAVAKACERGLSKAVLIGHQAETERLLK DMGYTGQNMDFTFADMDGDEAIAAYACDMVVQKKADIPMKGIISTASFMRAILNKERGFV PANALISQSTLFEWQGRFMILTDCAINIAPDYDDKIKIIKNAAALAKKAGIDCPKVAVLA PMEVVNPKIESSVHAAMLTMANKRGQIKECIIDGPLALDNAVSEDAARRKGIDSEVAGHA DILVVPDLDAGNMFTKSLTFFAGLKTAGTVNGTKIPVIMCSRTDSTDDKYHTVLAALMQL ICD >gi|157101622|gb|DS480702.1| GENE 226 253840 - 254943 794 367 aa, chain - ## HITS:1 COG:CAC1660 KEGG:ns NR:ns ## COG: CAC1660 COG3426 # Protein_GI_number: 15894937 # Func_class: C Energy production and conversion # Function: Butyrate kinase # Organism: Clostridium acetobutylicum # 8 355 2 350 356 327 46.0 2e-89 MLEVPMERYEILVINLGSSSTKVAYYINDTCMVKANLAHLSEELKQYDSIWKQADMRMDA IEKFLDENGIDKGKLAAVVSRGGHTRPLTGGTYLINQVMLEESASEVYGNHACDLGLVLA SRLARYGARPMTANTPVTDEFEPLARYSGLPEIERRSSFHALNHKGAARHYAKETGRDYG SLNLIVVHMGGGISVAAHKRGKMVDANNGLAGDGPFSTNRSGGLPVGSLISECFSGKYTQ KQMMRRVNGEGGMLAYVGESDTLTVERRAESGDESCAMVLDAMAYQVSKEIGACAAVLAG QVDAVILTGGMAHSERLTGFIEERVGFIAPIVRYPGEYEMQSLAENAYEVLCKKQPLLIM EQGGEHV >gi|157101622|gb|DS480702.1| GENE 227 254936 - 255817 652 293 aa, chain - ## HITS:1 COG:MTH1738 KEGG:ns NR:ns ## COG: MTH1738 COG1013 # Protein_GI_number: 15679730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Methanothermobacter thermautotrophicus # 19 271 16 273 288 263 50.0 3e-70 MYKLEHESKGAVSHGISTCAGCGLEIVMRNVLSVVGDDIILIIPPGCAALFSGCGIETSV KVAGYQCNLENVASTAAGVKASLLAQGNDHTTVIGFAGDGATVDIGIQALSGAMERRDKI 
LYICYDNEAYMNTGIQGSSSTPMFASTTTTPAGKRTVRKDLVGIAMAHDIPYAATASISN LTDLRRKVQKAKDTQGPSLVHVHAPCPTGWRYASSKTVEVARKAVQTGCWALYEYEDGRT TVNYRPKELKPIEEYISLQGRFKNMTEDEKALMSANTENHFRRLIKRLDDMNA >gi|157101622|gb|DS480702.1| GENE 228 255810 - 257009 1098 399 aa, chain - ## HITS:1 COG:MTH1739 KEGG:ns NR:ns ## COG: MTH1739 COG0674 # Protein_GI_number: 15679731 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Methanothermobacter thermautotrophicus # 3 390 4 382 383 352 46.0 7e-97 MYKMLDGNYAAVEAMKLAKVKVIAAYPITPQSSISEKLSDLVADGTLDAQYIRVESEHSA MSATVAAQMAGVRAATATASNGLALMHEVLTMASGNRQPIVMPVVNRAVASPWSLWCEHG DTMSQRDLGWMQVYCQNCQEVLDFMLMAYKVAEDSRVLLPVMVCLDGFFLSHSMQKIDVP DQNEVDAYIGEYTPKNLYLNPSDPMFICDLTTSDEYTEMKFQQKVAMLDSVSVMEDAMDE FEHKFGRKLSIAEPYRLDDAEVVLVALGSMCGTAKCVVDELRDRGEKAGLLKLSVFRPFP TELIGKYLAGKKVIGVFDRSSGLGSQGGPLYNEIRSALCTQQSQIVSFIGGLGGRDVSDM TLHRLYSQMIDISKGSTGTYTQWIDVRDDAMQLREVQHV >gi|157101622|gb|DS480702.1| GENE 229 257002 - 257313 272 103 aa, chain - ## HITS:1 COG:SSO11071 KEGG:ns NR:ns ## COG: SSO11071 COG1144 # Protein_GI_number: 15899473 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, delta subunit # Organism: Sulfolobus solfataricus # 26 97 25 93 94 78 51.0 3e-15 MLRQQEKHMKKRNAEHLMGPVATVFAAGKTGSWRLERPVVDYTSCIKCGTCERYCPSDII TIDKDAEECVRIDFDYCKGCGVCANECPKDCIAMVEEGGGENV >gi|157101622|gb|DS480702.1| GENE 230 257270 - 257818 547 182 aa, chain - ## HITS:1 COG:PAB1470 KEGG:ns NR:ns ## COG: PAB1470 COG1014 # Protein_GI_number: 14521567 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Pyrococcus abyssi # 1 181 1 181 185 145 46.0 5e-35 MIKLKFFGLGGQGVVTAAKMFSEAYSLYEGNFAITIPAYGHERRGAPVNTSIIADDKPVM QNCFVYEPDVVVVMDASVIDKGVDIAAGIHRDSILVLNTHKKSDVERYSALGFAKVYHTD 
GTGIAQTNIGMGIPNGSMLGALAKTGIVGIDAIEHAIMDTFGKKAGEKNAKAAREAYEKT EC >gi|157101622|gb|DS480702.1| GENE 231 257838 - 258746 905 302 aa, chain - ## HITS:1 COG:STM2573 KEGG:ns NR:ns ## COG: STM2573 COG1893 # Protein_GI_number: 16765893 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Salmonella typhimurium LT2 # 1 294 11 303 305 161 32.0 1e-39 MGCRFGVAFAEAGAQVWLYDVWQEHVNQMNANGLVVEDGVGGERTLEVNAVSDISRMPVS DVVVIFTKSMYTMDAITKAVPIIGRDTAVLTLQNGLGNIEAIREATGNNPVIAGVTNYAS DLLKPGRVELKGSGVTKMMALDEGAREKAARLVGLLCAVGHNAMLSDDVLIDIWEKVAFN AALNTTTAITGLTVGGVGSLSESRQLLFDISSEVVMVAKAEGINASESHVHGIIESVFDP EMSGDHKTSMLQDRILKRNTEIEAVCGRAVQIGKKHGLKTPKLECVYALVRVIECNYNKI CL >gi|157101622|gb|DS480702.1| GENE 232 258871 - 259779 881 302 aa, chain - ## HITS:1 COG:AGpA799 KEGG:ns NR:ns ## COG: AGpA799 COG0388 # Protein_GI_number: 16119766 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 233 29 243 600 102 31.0 7e-22 MKDIITIAAVNFDPVWGDSEDNLKRMLEHIEAQAKQGCDLIVFPETALTGYDDETGKLLE EKMHRRLAQTVPGPSTDAVCELTKKYGIYVVYGLAERDAQDTSKVYNAAAVCGPDGVIGC CRKIHLPFAEQNWAVRGDTPMLFDSPWGTIGVGICYDTYAYPEITRYARAMGARLFINCT AIGTSESGGAGGYTGNCSLEYHAHTNDMFIVSSNMYGKDVTTYFMGGSSIIGPGSKPPHI FYYAGTPFEEEGADEGTIAKATIDLSIVKKSFLNGIWENPDWRPDLYARWFDKVTETGFL KK >gi|157101622|gb|DS480702.1| GENE 233 259795 - 260991 1012 398 aa, chain - ## HITS:1 COG:SMa1937 KEGG:ns NR:ns ## COG: SMa1937 COG0477 # Protein_GI_number: 16263515 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Sinorhizobium meliloti # 25 264 24 257 412 75 26.0 2e-13 MGTNAADSRKRSPGMVLVVMLISMVAGSICMNKVAPVLTNIVSDLAIANSTLSGMLMSIF VISGIFLSIPMGMLTTKYGTFKTGLFSLAAIIIGSSMGAVAGNYTLLLVSRMVEGIGLMF LATIGPAAVASSFSDEKRGTAMGLLMCFMSFGQIIALNLAPVMAASGTWRNFWWFSAGFG 
AVGMVLWVIFIKGIDDGADGAAAQESSVGATLGDVLANKGVWLVCITFFAYMITHMGVFN YLPTYLTEVGGISATLAGTLTSVASLIGIPVGIVGGMIADKWGSRKKPLAITMILFAVLI GTIPMFNSSNFIILLVLYGIVAMAEAGLCFTSVTEVVKANQGSSASAVLNTAQWLGAFLS TMIFGTLLDSFGWSTSFYVMVPIALVGAAAALCNKDLK >gi|157101622|gb|DS480702.1| GENE 234 261414 - 262178 539 254 aa, chain - ## HITS:1 COG:TM0065 KEGG:ns NR:ns ## COG: TM0065 COG1414 # Protein_GI_number: 15642840 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Thermotoga maritima # 29 254 20 245 246 140 36.0 2e-33 MQNTNVNQYILQSVDNSLSLIELLCDHEELSLADMVSMTDYGKSSIFRMLNTLEKHKMVT KTENNRYRLGYRLATIGSIVTSRMEITGIAHPYLVDLSARTKETTHLAVLVDDTKVQFFD KVRGDSSIWMESSVGMTRKAHLTGTGKAQLAFCDEAQIESYIKNTEFSQMTRYTIKDEKE LREELDRIRQQGYACDKEESEEGLVCFAAPVVNFKGKVVAAVSISGFRDKMYARQDEFIR EIKETARQISAELA >gi|157101622|gb|DS480702.1| GENE 235 262287 - 263693 1499 468 aa, chain - ## HITS:1 COG:SA0323 KEGG:ns NR:ns ## COG: SA0323 COG0534 # Protein_GI_number: 15926036 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Staphylococcus aureus N315 # 1 458 1 451 451 196 32.0 9e-50 MMDDKTILFEKTPVPKAVMQLAVPTILSSLVMVIYNLADTYFVGMMNNPVQNAAVTLAAP VLLAFNAVNNLFGVGSSSMMSRALGRRDYESVRRSSAFGFYCALFFGLLFSVLCTVGLNP LLILLGADANTMDATAGYLKWTVLFGAAPAILNVVMAYMVRAEGSSLHASIGTMSGCLLN IILDPIFILPWGFDMGAAGAGLATFLSNCVACVYFFVLLRVRRKTTFVCINPEKFSLQKY IVLGVCAVGIPAAIQNLLNVTGMTILNNFTSSYGAQAVAAMGITQKINMVPLQVAMGFSQ GIMPLISYNYASGNHRRMKETIFFTIKTLLPFLVVVSLGYFAGAGLLTRAFMDNEAIVAY GTRFLRGFCLGLPFLCMDFLAVGVFQAVGMGKAALSFAILRKIVLEIPALYLLNLLFPLY GLSYAQLVAELVLSIASVIMLARIFRRMQREDGALREGAEGGSSSYGK >gi|157101622|gb|DS480702.1| GENE 236 264329 - 265237 757 302 aa, chain + ## HITS:1 COG:BH3496_1 KEGG:ns NR:ns ## COG: BH3496_1 COG0789 # Protein_GI_number: 15616058 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 24 136 1 111 117 72 38.0 1e-12 MDYTYFGTSWVTSYETGEGVVSEIKDKYSVGEAARLSNVSTKTLRYYDEIQLIIPDIDEN 
NNYRYYTKDQIQDIITTKKLKNAGLKLNEIRDYLQKNQRQDKVKLLNRKIEDCSRELTRL RKSILRLSDFRDRLSDQFYDNLSSTAGQVKREIYSPGLIAFSNHYSSAYVSNLFLEYEIE LENYMDTRHLEIAGKLTAFFYDHFSHQFFDIPTHFQIFYPIKETQIRDSQIKVMPEYPCL TTIHLGSYKEIMEAYHRTLAYAKKNGIILTGQSFESYFIGPNLIKNADEFVTKIHMVICQ NP >gi|157101622|gb|DS480702.1| GENE 237 265442 - 266590 1043 382 aa, chain + ## HITS:1 COG:PA1572 KEGG:ns NR:ns ## COG: PA1572 COG3199 # Protein_GI_number: 15596769 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pseudomonas aeruginosa # 1 377 3 366 381 253 41.0 6e-67 MKKIGFIVNPVAGIGGKVGLKGSDGTQTLKKALELGAVPEAGAKALGTMKLLKPMEQQLE IHTCPGNMGEDICKQAGLNYRLAEGTKAWSQCSPSSRNTTPKDTMDAAGMLCNEGVDLIL FAGGDGTARNILDAVGTSIPVLGIPAGCKIHSAVYARTPRAAGELMLRYAEGRVMEYREA EVMDIDENLFRQGICDARLYGYLKVPNEKKFVQNLKSGRGCSEDASIQSMCMYVSRLIEQ EPEYLYIVGTGSTTARLMNYMGKPNTLLGVDLLFQGQVIGSDCTEQDILAALKKYPRAKI IVTVIGGQGYIFGRGNQQISAEVIRRVGTKNIMVIASRDKIFSLGGNPLLIDTGDEEVNR MLTGYIPVTTGYKDVIMANVIY >gi|157101622|gb|DS480702.1| GENE 238 266626 - 268032 1579 468 aa, chain + ## HITS:1 COG:PH1995 KEGG:ns NR:ns ## COG: PH1995 COG0403 # Protein_GI_number: 14591729 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), N-terminal domain # Organism: Pyrococcus horikoshii # 10 467 5 449 449 249 35.0 6e-66 MDSKKTVYPYIPNSAPDVQAEMMAEVGISDVMELYEEIPEAIRYQGALNLPEAIPDEQGI KRHIEGILSKNKNCNEFSNFMGAGCAQHYVPAVCDEIASRGELLTTYGAETWADHGKYQI FFEYQSLMAELLDMSFLTVPCHCGGQASATGLRMACRITGRNKVLLPHTMNRQNLALVRN YLKMIDADQEIKIELIACRSDTGMLDLEDLKSRLDQDTAAVYIENPTFLGIIETQGEEIG KLAKAAGAEFIVYTDPISLGVLEAPANYGATICVGDLHGLGLHLNFGGGQAGFIASHDDM RYITEFKELVDGMVETTVPGEQGYSVVLIERTHYAMRENGKEFTGTQNNLWTAPVAVYLS LMGPKGMEEIGHTIMTNAKYAAKKLAGIPGVSLRFPSVFFKEFVVNFDKTGKTVEEINQL LLSRQIFGGVDLSRDYPELGQSALYCVTETADKESIDRLAGALQAILN >gi|157101622|gb|DS480702.1| GENE 239 268049 - 269641 1421 530 aa, chain + ## HITS:1 COG:PH1994 KEGG:ns NR:ns ## COG: PH1994 COG1003 # Protein_GI_number: 14591728 # Func_class: E Amino acid transport and 
metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Pyrococcus horikoshii # 12 523 2 500 502 385 42.0 1e-106 MKRDYSKHLRRFHEAKWDEELIFDLSVPGQRGVNVTKPSPQITEEVGDGADDIPPCMRRK TPPFLPEVHQARVNRHYLRLTQEILGNDITPDLGEGTCTIKYSPKVQEHIANTHPNIIDV HPLQDASTIQGILEIYKKTEEMICEISGMDAACCHMGGGAAACFAGASVIRAYHEAKGNT QKDEIITTIFSHPCDAGAPATAGYKVITLMSDPETGMPSLEAFRSALSERTAAIFITNPE DTGIFNPQIKEIVDAAHEAGALCYYDQANVNGIMTITRAKEIGADLIHYNLHKTFSSPHG GMGPGVGALCVREPLKAFLPVPRVEYDGKQYYMDCNCPESIGKVRMFMGNANVILRVYMW IRQLGADGLREAAVCAVLNNQYMMKKVGQIKGVKIFYAEGKRRIEQCRYSWQPLKEDTGF GTVDVTKRLVDFGMQHYWQSHHPWIVPEPFTLEPTESFSKDDLDEFAAILKEISRECYEE PETIRSAPHNAPIHNTLLDEVMDFDKIAVTYRQLRKRMEAGTISKDILGQ >gi|157101622|gb|DS480702.1| GENE 240 269674 - 271086 1728 470 aa, chain + ## HITS:1 COG:CAC3451 KEGG:ns NR:ns ## COG: CAC3451 COG2211 # Protein_GI_number: 15896692 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Clostridium acetobutylicum # 11 445 7 437 458 138 26.0 2e-32 MEHKDNYDRLLPMPVKLGFGIANLGDTVITEFVGAFFIFFLTNIAGVRPALAGTIVFLGV MWDAISDPIIGTMSDRCTLEAGRRRPFLLISTVPLILLTTMMFTKVNFSANLKFVYFVVV SVFYWTAYTLFNIPYLSLGSELTTNNDEKTRASSIRQVFGTCGLFFANALPMILVSLFKG QGISEERAWTFAALTLGAIAGTAIFITWRATRGWEIKYPPQSKDEPIFKSLGKVLTYKPY ILVIVASFLFYFAFNTCNASVIYNTIVVVGATEADTAVVYLAGTIVGVILSVLIGKLAVV FDKKWVFITFMAIAGFALVIFKFVGFHSITDQVIQFCMAHFAIIGFLVLSYNLLYDVCEV YEFKTGEMLTGVMISYFSFFIKLGKATALQAVGILLDLGGYDASLTIQPDSAKAAVTNMA SIIPGCLMLLCALVVCFYPINRVRFRAMQNARKLKDEGKAYSTSEFDRIL >gi|157101622|gb|DS480702.1| GENE 241 271191 - 272435 1309 414 aa, chain - ## HITS:1 COG:SA1524 KEGG:ns NR:ns ## COG: SA1524 COG0281 # Protein_GI_number: 15927279 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Staphylococcus aureus N315 # 1 389 1 390 409 431 57.0 1e-121 MDYAEAALKMHRENHGKLEMASKIPLTTRDELSTAYTPGVAAPCLKIKEDKSEAYTYTAK GNLVAVVTDGTAVLGLGDIGPEAAMPVMEGKAVLFKKFGGVDAVPICLDTRDTEEIIETV 
KRIAPTFGGINLEDISAPRCFEIEQRLERELDIPVFHDDQHGTAIVATAALINALKLTGK QMDGIKVIVNGPGSAGTAITEMLLGAGVKDLIVCDEHGALCPGREKMDGHKEKLSLMTNV RKEKGRLEDVIAGADVFIGVSAAGVVSKDMVRTMKKDAIVFAMANPVPEIMYEEAKEAGA RVMATGRSDCPNQINNVLVFPGLFRGALDCRARDITCGMKLAAAYGIAGLVSGEELGEEY IIPSAFDERVAKAVAEAVKRAAVKRDAAKRDAAKWDAVKPAAVKAAGEVKQGRE >gi|157101622|gb|DS480702.1| GENE 242 272469 - 273911 1633 480 aa, chain - ## HITS:1 COG:CAC0274 KEGG:ns NR:ns ## COG: CAC0274 COG1027 # Protein_GI_number: 15893566 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Clostridium acetobutylicum # 6 469 2 465 465 599 64.0 1e-171 MDGRKDYRVEMDSVGAKDVPENVYYGVQSMRAAENFQITGLNMHPEIINSLAYIKKAAAI TNCEIGLLEKKTAEAIVQACDEILEGRFHEDFIVDPIQGGAGTSLNMNANEVIANRAIEI LGGQKGDYFIVSPNDHVNCGQSTNDVIPTAGKMTSLRLLKNLKSELMRLHREFSHKAEEF DHVIKMGRTQMQDAVPIRLGQEFKAYSVAVMRDIRRMDKAMEEMCTLNMGGTAIGTGINA DEAYLRRIVPNLAEVSNMEFVQAFDLIDATQNLDPFVAVSGAIKACAVTLSKIANDLRLM SSGPKAGFNEINLPVRQNGSSIMPGKVNPVIPEVVNQVAFNIIGNDLTITMAAEAGQLEL NAFEPIIFYCMFQSIDTLAYAVRTFVDNCVSGITANEERCRSLVENSIGVITALSPHVGY EKAADISKKALQTGTSVRSIILQEGLLDEEELDHILDPVRMTEPGISGKELLMKNNIDEK >gi|157101622|gb|DS480702.1| GENE 243 274055 - 275005 853 316 aa, chain + ## HITS:1 COG:FN0603 KEGG:ns NR:ns ## COG: FN0603 COG0583 # Protein_GI_number: 19703938 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 1 307 1 307 314 210 36.0 3e-54 MFQGMEYIYTVYQEKSFSKAARKLFISQPSLSATVKRVEEHIGYPIFDRSTKPLTLTEFG KRYISSVEQIISVEHEFSSFMNDWGGLKTGKLVLGGSSLFSSWVLPPLIGSFTRQFPMVK VELMEENTSELKLLLQNGGIDLMIDNCQLDKTAFDSRVYRTEYLLLAVPAALDVNREAAR YQIPLEMIHDASFLDSSIAPVPLERFSGEPYIMLKPDNDTRKRAVKILAGHNITPDIRFE LDQQLTSYNITCSGMGISFISDTLIKQVPFHPGVVYYKLDGSLCQRNLYFYWKNGRYFSR AMEEFLNIAKGDAKPA >gi|157101622|gb|DS480702.1| GENE 244 275038 - 275904 911 288 aa, chain - ## HITS:1 COG:ECs5069 KEGG:ns NR:ns ## COG: ECs5069 COG0191 # Protein_GI_number: 15834323 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate 
aldolase # Organism: Escherichia coli O157:H7 # 2 285 3 286 286 224 43.0 1e-58 MLVNLNDVLVPARKGKYAVGLFNTVNLELARGVMEAAEELDSPVIIGTAEVLLPYGPLED LANLLIPMAERASVPVVVHLDHGLREETCRLALELGFSSIMYDCSTMDYGDNMEHVKKMA ETAHSFGASIEAELGHVGDNEGGSAEGSGPGAEPSQFYTDPVQARDFIDRTGADALAIAV GTAHGAYKLPPKLDFDRISEIAETISTPLVLHGGSGLSDSDFRTAVERGISKVNIFTDIN QAGARAAGSGYREGIGLTDMILPEIEAVKRSVMEKMRLFGSAGQGSGR >gi|157101622|gb|DS480702.1| GENE 245 275924 - 276913 1166 329 aa, chain - ## HITS:1 COG:CC1631 KEGG:ns NR:ns ## COG: CC1631 COG1082 # Protein_GI_number: 16125877 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Caulobacter vibrioides # 17 324 24 334 351 181 35.0 1e-45 MARPVTIFTGQWADLGLEEMCRTAKDMGYDGLEIASWGQIDLQKAAEDPEYVKNLKGTLE KYGLGCWAIGAHLPGQCVGDVWDPRLDGFAPPELAGKPEEIRAWGIQQLKYAAMAAKAMG VKVVTGFMGSPIWKMWYSFPQTTEEMVEEGFKQIKELWTPIFDVFDQCGVKFALEVHPTE IAFDYYSTQKLLEVFEYRETLGINFDPSHLIWQGMDPAMFLYDFASRVYHVHIKDAAVNL NGRNGILGSHITFGDPRRGWNFVSPGHGDVDFDKIIRILNVKGYEGPLSIEWEDSGMDRI FGGTEACAFTKKINFSPSDIAFDDALKTK >gi|157101622|gb|DS480702.1| GENE 246 276916 - 278064 1181 382 aa, chain - ## HITS:1 COG:mll3361 KEGG:ns NR:ns ## COG: mll3361 COG0673 # Protein_GI_number: 13472914 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Mesorhizobium loti # 5 379 14 395 395 314 42.0 2e-85 MAKRLTYAMVGGGPDGFIGDAHRRAIGLDGTAAIAAGVFSRNREKSLQMAESLGISPDRC YEDYQTMARAEGEREDGIDFVTVVTPNQSHYEICRAFLEAGIHVVCDKPVTTTYRQAKEL EALAEDKGLLFMVTYTYMGYVTAKYARELVASGAIGEVRTVMAEYPQGWLAFEDDFGGKQ GEWRCDPGQSGGVNCLGDLGTHVENAVAAMTGLKIKRLLAKMDVIVPGRVLDDNDSILVE YDNGATGIYWTSQVAVGHDNSLRIRIYGSKGTIQWFQETPDKITVIDADNTIREIHRGYG AIGERAGKYARIPAGHPEGWFEAMGNLYRSFAQCVSAKKDGTFTAEMVDYPTVSQGAEGL AFVEACLESNQNGNTWVEFRRK >gi|157101622|gb|DS480702.1| GENE 247 278116 - 279282 961 388 aa, chain - ## HITS:1 COG:YPO1816 KEGG:ns NR:ns ## COG: YPO1816 COG0524 # Protein_GI_number: 16122068 # Func_class: G Carbohydrate transport and metabolism # Function: 
Sugar kinases, ribokinase family # Organism: Yersinia pestis # 47 371 26 301 319 89 26.0 1e-17 MADERAAQNKRAAKGKRAVAAGHICIDITPLFPKGSTGEAGRILAPGQLVQMEGVNIHPG GAVANTGLAMKHLGVDVRLMGKIGKDELGTLILNCLEKYGAAEDMILADQEKTSYSVVLA IPGIDRVFLHDPGANVNFSGSDLDFEAIKEADLFHFGYPTLMQNMYRNQGEEMARMFRQI KEGGTAISLDMAAIDPASKAAGADWKGILGNILPYVDFFVPSAEELCFMLDGERYREWTK RADGRDVTEVIRVRDVKPLGQMALDMGARVVLIKCGALGIYYRTACKNGIEDMCRQLNLS MEAWKGKEGFEPSFEPDTVVSATGAGDTSIAAFLASVLGEADLKEAVEMAAAEGACCVAA YDALSGLRSLEEMKERIRKGWSKREYRP >gi|157101622|gb|DS480702.1| GENE 248 279379 - 280413 1157 344 aa, chain - ## HITS:1 COG:TM1667 KEGG:ns NR:ns ## COG: TM1667 COG2115 # Protein_GI_number: 15644415 # Func_class: G Carbohydrate transport and metabolism # Function: Xylose isomerase # Organism: Thermotoga maritima # 4 338 42 414 444 65 25.0 2e-10 MTKFKFSVGPWNVHSGADSYGPATRDEIAFEEKIRTFAELGFSAIQFHDDDAVPNINDYT EEEIKEKARTLKKMLDKYGLAAEFVAPRLWMDGRTADGGFMSTSEEDREYAMWRAYRSID IAKELGCNMVVLWLAREGTLCAESKSPVWATKMLIQAINKMLEYDSDILICIEPKPNEPI DRSICGTMGHVLAVSAATIDPGRVGGNLESAHAILAGLDPAHEIGFAMAMGKLMTVHLND QNGIKYDQDKAFGVENLRAAFNQVKVLKENGYGENGEYVGLDVKAMRTTRDQDSYKHLEN SLNIFKALEEKADRFDYEFQKECVRNRDFEGLEMYVMNLLMGLV >gi|157101622|gb|DS480702.1| GENE 249 280503 - 280670 72 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941400|ref|ZP_02088737.1| ## NR: gi|160941400|ref|ZP_02088737.1| hypothetical protein CLOBOL_06293 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06293 [Clostridium bolteae ATCC BAA-613] # 1 55 39 93 93 99 100.0 8e-20 MDMGGAVKHIYKKKLKVHLAVGYDFKLLVHGISSLSDSIRPQGRACKVKSKFVMV >gi|157101622|gb|DS480702.1| GENE 250 280581 - 281531 520 316 aa, chain + ## HITS:1 COG:L0229 KEGG:ns NR:ns ## COG: L0229 COG2207 # Protein_GI_number: 15673492 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Lactococcus lactis # 1 272 3 272 327 149 35.0 5e-36 MNQEFEIISHSQMNFKLFLVNMLYRTPHIHKDFELCLLLDGSLFLQAQNETHSLETGDFF VVNPFQSHELKADSPALLVSLQVHPSFFSSYYPRIANLEFTSLILQRDGTDVCKKLYDMM 
IELACLYLKKADKFELKCVGLLNLLFAGLLDTLPNRLLSDKEKNASQSKASRMRKITHYI DEHYSEKLLLSDIARKEDLSLTYLSHFFKDYLGLPFQEYLAKIRCEKARQLLLLTDFPLL DICMSCGFSDSKYFNSGFRRQYGCSPKEYRRNFRHDELKQQQKSMLTTQEFLSASSSIVL LERCLLPQPAGPAACS >gi|157101622|gb|DS480702.1| GENE 251 281537 - 282589 738 350 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941402|ref|ZP_02088739.1| ## NR: gi|160941402|ref|ZP_02088739.1| hypothetical protein CLOBOL_06295 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06295 [Clostridium bolteae ATCC BAA-613] # 1 350 1 350 350 688 100.0 0 MERRKVMTIAAVLISVMVAVGACGKMMGDAPVPAEKAGVSSGDRKAPADSTEKAEAGTAD TEKNEKERTKKQSPTETQAESEQAALNVQEVPVLKGEAEPAKLDSSGLLAKASPETSAMT LFIYDGTSVRLYYMFDSEKEREILDDLSSVKAEPAEDWSAKDASLPVYGIEIGREDGLSL FAAWTNGRWIAQDGSVYQFDYDFEALKQRKEWESAGELPSFTNFPCARVFTQEKEQWNTR FLVPAAPLNPPRGITMTLDSWDNDIAAVTINNESGGEWSCGEFYELQVLIDGVWYEIPAM PGNWGFNCMGFYIPDGGKQSMINHLTMYGELPSGTYRLVIKELSAEHVIP >gi|157101622|gb|DS480702.1| GENE 252 282811 - 283371 445 186 aa, chain + ## HITS:1 COG:alr0739 KEGG:ns NR:ns ## COG: alr0739 COG4430 # Protein_GI_number: 17228234 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. 
PCC 7120 # 10 185 16 191 193 86 30.0 2e-17 MNEPLHFKTRDEFRIWLINHCMSGGGIWLKFEKSKNTAFKAGDALEEALCFGWIDGVMKR IDDESYIKYFSARRKDSKWSEKNKALAEDLEKRGLMTDFGREKIQEARKNGQWDKPSGPP AVTAEQIGTVAGLLKEHEPAYTNFQKMPPSVQKTYTRAYYDARTDDGRARRFAWMVERLE KNLKPM >gi|157101622|gb|DS480702.1| GENE 253 283395 - 284621 770 408 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 10 400 16 409 418 301 40 3e-80 MGIYEELEARGLIAQVTDEAEIRELVNAGKAKFYIGFDPTADSLHVGHFMALCLMKRLQM AGNRPVVLLGGGTGYIGDPSGRTDMRSMMGPDVIRRNCECFKRQMERFIDFDGENGAIVV NNADWLLNLNYIELLRDVGACFSVNNMLRAECYKQRMEKGLSFLEFNYMIMQSYDFYHLY KELGCNMQFGGDDQWSNMLGGTELIRRKLGKDAYAMTITLLMNSEGNKMGKTARGAVWLD PEKTSPFEFYQYWRNIADADVMKCIRMLTFIPVEEIDAMDGWEGAQLNRAKEILAYDLTA LVHGEEEARRAKETSVSLFTGAMDDDNMPFTEVGEEKLVQGSINIMDLLVLCRLAASKSE ARRLITQGGIQVSQKRVESIDFTVTRNMLMDGIIIQKGKKVFRKAGLK >gi|157101622|gb|DS480702.1| GENE 254 284955 - 285566 546 203 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1092 NR:ns ## KEGG: EUBREC_1092 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 23 192 4 173 183 108 31.0 2e-22 MYQHKRCKNSTRRVRVALDEIADSRKKLSESIEYLLSRKSLDDIPVSEIVKMAGVSRRTF YRHFCDKYELVNWYYDEFYEESFGSIVLGAGLEEALTQCLKIYRQKRSVLKHAYESRDIN GLKNHDIEITRRIYETFLKQRGADIRSEIMWFSIEIAVRGGSDMVIQWMLGNIDLPAEKM ARLICETLPEEIKKYQMSPVTHV >gi|157101622|gb|DS480702.1| GENE 255 285829 - 286671 649 280 aa, chain + ## HITS:1 COG:CAC1624 KEGG:ns NR:ns ## COG: CAC1624 COG1307 # Protein_GI_number: 15894902 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 2 279 3 279 280 214 39.0 2e-55 MKTAILIDSGCDVSHELIQRFHMKVMHLHIIYPERDYVDGVDILPETIYQRFPGEIPATS TPSPQDVKDMAEEIKAEGYTHVLAFCISSGLSGTYNTVGSILGDEKELTSFVLDTRSISF GAGILAVWAAMQLEEGRSFDELKEMLPKKVGDSKVFYYMDTLTYLRKGGRIGLVTSVVGS ILNIKPIISCNEDGVYYTAAKIRGARQGLSRLLTEARSFAGGRPCLTALLNGQGQDASDE LRPRLIAGIPEGRLIMEKAITASLAVHTGPGLVGIGVLRL >gi|157101622|gb|DS480702.1| GENE 256 
286783 - 286917 175 44 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941409|ref|ZP_02088746.1| ## NR: gi|160941409|ref|ZP_02088746.1| hypothetical protein CLOBOL_06302 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06302 [Clostridium bolteae ATCC BAA-613] # 1 44 1 44 44 70 100.0 4e-11 MADNTQEKYILLAVNGTLMRGLELEGNLREAGAEFGFEARTEVS >gi|157101622|gb|DS480702.1| GENE 257 286950 - 287774 670 274 aa, chain - ## HITS:1 COG:Rv0881 KEGG:ns NR:ns ## COG: Rv0881 COG0566 # Protein_GI_number: 15608021 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Mycobacterium tuberculosis H37Rv # 2 273 10 279 288 161 36.0 1e-39 MPNIIEITDFAAPELDIYARLTEGQLLNRHEPDKGVFIAESPKVIERALDAGCVPISMLM EKKHVESQAREIIQRCGEIPVYAAEFDVLTQLTGFHLTRGMLCAMYRPPLPGPEEICEGA RRIAVLENVMNPTNVGAIFRSAAALNMDGVLLTSACSNPLYRRAVRVSMGTVFQIPWTFL DSRLSWPQEGVGFLRELGFATAAMALNDDSVSIDDSGLMSEEKLAIILGTEGDGLAAGTI AGCDYTVRIPMSHGVDSLNVAAASAVAFWQLGRR >gi|157101622|gb|DS480702.1| GENE 258 287974 - 288450 401 158 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941412|ref|ZP_02088749.1| ## NR: gi|160941412|ref|ZP_02088749.1| hypothetical protein CLOBOL_06305 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06305 [Clostridium bolteae ATCC BAA-613] # 1 158 10 167 167 298 100.0 1e-79 MEDEIARDGGRYPEYTYTFLIGLQSFSEGIARGYVKSFHVEKLLYFQGLDQMIFIMEDIM DAVGAPMRGRVPRSLTGKRNKDAVTVPFVDLRGKIKKEYGFEREVLREFPVYPAVSVRVV GRQDAGIQGIFRSKYGDVGFRSGVELMRLMYEFFEKMC >gi|157101622|gb|DS480702.1| GENE 259 288490 - 288852 257 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941413|ref|ZP_02088750.1| ## NR: gi|160941413|ref|ZP_02088750.1| hypothetical protein CLOBOL_06306 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06306 [Clostridium bolteae ATCC BAA-613] # 1 120 4 123 123 232 100.0 8e-60 MKTPEFTCITGLVLLKEGDTEMVSCSLLEYLALKAKCEYMSDLRFMHVEERWLREFILRN RDDFTQEEWKEACIYLTGQKASTEDEAIGTILHRWTDTKDRPGMNYRTDMNCRTDRQDKI Prediction of potential genes in microbial genomes Time: Thu Jun 
30 19:45:02 2011 Seq name: gi|157101621|gb|DS480703.1| Clostridium bolteae ATCC BAA-613 Scfld_02_44 genomic scaffold, whole genome shotgun sequence Length of sequence - 27951 bp Number of predicted genes - 33, with homology - 30 Number of transcription units - 11, operones - 8 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 270 235 ## 2 1 Op 2 . - CDS 283 - 408 191 ## 3 1 Op 3 . - CDS 405 - 773 342 ## gi|160941416|ref|ZP_02088752.1| hypothetical protein CLOBOL_06308 - Term 802 - 835 2.2 4 2 Op 1 . - CDS 847 - 1047 128 ## gi|160941417|ref|ZP_02088753.1| hypothetical protein CLOBOL_06309 5 2 Op 2 . - CDS 1037 - 2758 1322 ## COG5585 NAD+--asparagine ADP-ribosyltransferase 6 2 Op 3 . - CDS 2755 - 4119 1226 ## SPSINT_0578 phage portal protein 7 2 Op 4 . - CDS 4143 - 5897 1498 ## Dtox_4228 phage protein - Prom 6012 - 6071 2.3 - Term 6038 - 6069 2.5 8 3 Op 1 . - CDS 6192 - 6674 441 ## Ethha_0502 NGN domain-containing protein 9 3 Op 2 . - CDS 6677 - 7249 692 ## Ethha_0503 hypothetical protein 10 3 Op 3 . - CDS 7234 - 7572 281 ## gi|160941423|ref|ZP_02088759.1| hypothetical protein CLOBOL_06315 11 3 Op 4 . - CDS 7578 - 7943 409 ## Ethha_0085 hypothetical protein - Prom 7970 - 8029 3.4 - Term 7952 - 7997 7.4 12 4 Op 1 . - CDS 8035 - 8358 354 ## Ethha_0084 Mor transcription activator domain protein 13 4 Op 2 . - CDS 8342 - 8836 398 ## Ethha_0083 hypothetical protein 14 4 Op 3 . - CDS 8885 - 9595 840 ## Ethha_0082 hypothetical protein 15 4 Op 4 . - CDS 9538 - 9777 198 ## Ethha_0081 SpoVT/AbrB domain-containing protein - Term 9844 - 9869 -0.5 16 4 Op 5 . - CDS 9870 - 10142 207 ## gi|160941429|ref|ZP_02088765.1| hypothetical protein CLOBOL_06321 17 4 Op 6 . - CDS 10123 - 10608 378 ## Clocel_3545 hypothetical protein 18 4 Op 7 . - CDS 10626 - 10727 80 ## - Prom 10952 - 11011 3.4 - Term 10937 - 10976 -1.0 19 5 Op 1 . - CDS 11028 - 11255 104 ## gi|160941433|ref|ZP_02088769.1| hypothetical protein CLOBOL_06325 20 5 Op 2 . 
- CDS 11276 - 11452 96 ## gi|160941434|ref|ZP_02088770.1| hypothetical protein CLOBOL_06326 21 5 Op 3 . - CDS 11436 - 11849 345 ## gi|160941435|ref|ZP_02088771.1| hypothetical protein CLOBOL_06327 22 6 Op 1 . - CDS 11958 - 12227 254 ## gi|160941436|ref|ZP_02088772.1| hypothetical protein CLOBOL_06328 23 6 Op 2 6/0.000 - CDS 12241 - 13203 833 ## COG2842 Uncharacterized ATPase, putative transposase 24 6 Op 3 . - CDS 13228 - 15183 1284 ## COG2801 Transposase and inactivated derivatives 25 6 Op 4 . - CDS 15268 - 15462 80 ## gi|160941439|ref|ZP_02088775.1| hypothetical protein CLOBOL_06331 - Prom 15517 - 15576 8.5 + Prom 15456 - 15515 10.5 26 7 Tu 1 . + CDS 15578 - 16159 193 ## ELI_3146 transcriptional regulator + Term 16188 - 16235 3.0 27 8 Op 1 . - CDS 16743 - 17621 350 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 28 8 Op 2 . - CDS 17634 - 18740 836 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases - Prom 18761 - 18820 3.0 29 8 Op 3 . - CDS 18839 - 20719 828 ## COG3962 Acetolactate synthase - Prom 20848 - 20907 6.2 + Prom 20956 - 21015 7.1 30 9 Tu 1 . + CDS 21128 - 21937 261 ## COG1414 Transcriptional regulator + Term 22103 - 22147 5.3 - Term 22091 - 22135 5.3 31 10 Tu 1 . - CDS 22233 - 24863 2704 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase - Prom 24964 - 25023 5.5 - Term 25727 - 25771 11.2 32 11 Op 1 40/0.000 - CDS 25846 - 26535 976 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 33 11 Op 2 . 
- CDS 26507 - 27949 1498 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|157101621|gb|DS480703.1| GENE 1 3 - 270 235 89 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEEGTKAAVSETVLEQQGEKQEEGGLLDKIRSLLGGSQKKEESPGTQTPAKDGKTAGETK EEGTDGKAASHEKQAEKTYTQAELAAEIE >gi|157101621|gb|DS480703.1| GENE 2 283 - 408 191 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIQTLTRIYRETGNEQYLTNAVKKKWIKEEEKAQIMAEVNG >gi|157101621|gb|DS480703.1| GENE 3 405 - 773 342 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941416|ref|ZP_02088752.1| ## NR: gi|160941416|ref|ZP_02088752.1| hypothetical protein CLOBOL_06308 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06308 [Clostridium bolteae ATCC BAA-613] # 1 122 1 122 122 206 100.0 4e-52 MEKRKLLLKDGTAIILEAGSCLGQMEAAYEGREALMTDWEKMTKENLSRVQIKNGDTVTG TYEHLILGNPVLVVRGKEDGTLLASWGIRERTELEKLADRVGAVEETTDVLTMDALTGGE GA >gi|157101621|gb|DS480703.1| GENE 4 847 - 1047 128 66 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941417|ref|ZP_02088753.1| ## NR: gi|160941417|ref|ZP_02088753.1| hypothetical protein CLOBOL_06309 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06309 [Clostridium bolteae ATCC BAA-613] # 1 66 1 66 66 118 100.0 2e-25 MKHENCENCLVYDELMEGILAPCDEEEAWKLHYCLSYEKGIPKEIWSGKNACPHRIEPES SERTIQ >gi|157101621|gb|DS480703.1| GENE 5 1037 - 2758 1322 573 aa, chain - ## HITS:1 COG:BH3531 KEGG:ns NR:ns ## COG: BH3531 COG5585 # Protein_GI_number: 15616093 # Func_class: T Signal transduction mechanisms # Function: NAD+--asparagine ADP-ribosyltransferase # Organism: Bacillus halodurans # 3 328 4 321 490 105 24.0 3e-22 MNQRQRNAWIEAAKEQVLRNAEASDGCEADLMELYDECVNGLENEIRAFYSRYARDNQLT EAQASKLLTGKEYSTWKKSLEGYLEELEGQGKDSRLALELNTLAAKSQVSRKEQLLANVY HNMSRLAGRSETELTGLLEGLVQTNYERKMFDIQSIGGVAWDVSKVDERLLKQILSYPWS GRQYSKALWDNTDQLAALTRRELTLGFMSGASVDKIAKEIDDVMGKGRYAAQRLVRTEAS YFANQGQLLAYQDAGVTKYRFLGGGCEICQRLNGQMFELSDARAGENLPPIHPNCKCTTV AAYDIPVFKQRQGNPLESNPKFEEWKKKHMEEADQVESGPGQEPVKKGIFASLKERREAT 
GLLQPYASKVKVTGEVNQANYKKAAAELERQLAKAPFQKLDEITIFDSRDAPGKIGSAVG KQLRMGTALMNAPEKYYEGSVLNWTKRIETSMGKLRTRLNEGASEAVREAYGRQMELKRY SRGNVLYNGREIACVIQHEMMHMIVNETGMRDDRELKECYNRAMKSGDIYGISYRASENE REFICEAAVMYENGEPMPEYIKRLVKECKSHEA >gi|157101621|gb|DS480703.1| GENE 6 2755 - 4119 1226 454 aa, chain - ## HITS:1 COG:no KEGG:SPSINT_0578 NR:ns ## KEGG: SPSINT_0578 # Name: not_defined # Def: phage portal protein # Organism: S.pseudintermedius # Pathway: not_defined # 10 425 27 456 481 227 34.0 7e-58 MIVYMDRASIESLTVKDIREIVVKQSYQMKYRKLERYYVGDHDILHSERTDKTGGDNRIV NNMARYITDTATGYFLGQPVVYSSENEEYLQTIQDIFDYNDEQDHNTELGKQCSIKGDCF EMVYLDEDGRIRLGLVFPENLILFYETESEFTSPLAAIRMVRGMDKNGNILLRVEFWTWT QVIYLQSFNGGVLEVTGWKEHYWNDVPFCEYVNNRERIGDFEGVLSEIDAYNRVQSNTAN YFQYNDDAILKVTRLGDVDSKDIAQMKRERAVILEDGGDVGWILKTVDDTALENYKNRLR EDIHLGANVPNMTDEAFGGNLSGVAVSYKLWGLEQICAIKERKFKRALQRRIELITHVLN LLGHNYDYRDLDMQFRRNKPQNILEQSQIIGNLSSMLSKETLLQLLPFVDNPKEELEKLE EEKQEGMESFGMYQNLARAFQTEEADQEGEAEKG >gi|157101621|gb|DS480703.1| GENE 7 4143 - 5897 1498 584 aa, chain - ## HITS:1 COG:no KEGG:Dtox_4228 NR:ns ## KEGG: Dtox_4228 # Name: not_defined # Def: phage protein # Organism: D.acetoxidans # Pathway: not_defined # 1 570 1 574 594 754 65.0 0 MRKGKQKSIDTLMGALAEAESRAFYEEEDNVLNDLDELLNVFLRKGEEPERRQLLKEYDS GLPLTGPEGIRRKLGAIDMEFFGRAYFPHYFSRPSPDFHRDLDAIWQDGVLKGRFPITRK AAREINRLEGCRRAVAAPRGHAKSTTLTFKGSMHAILYQYKHYPIILSDSSDQAEGFLEN IRVEFEENGLIREDFGNLQGKVWRNNVILTTTNIKVEAIGSGKKIRGRKHRNWRPDLLVL DDIENDENVRTAEQRSKLSNWFNKAVSKAGDSYTDIVYIGTLLHYDSLLAHTLTNTGYKS IKYKAVLSFSQADDLWKEWEDIYTDLSNDSHAEDAKAFFEAHKAEMLKGTEVLWEEKLSY YDLMKMRIDEGEASFNSEEQNEPINPDDCLFQEEWLDYYNEAEVDFKDRSFVFYGFVDPS LGKTKHSDFSAIITLAKHKGTGYMYVFDADIERRHPDRIISDILEKERRIRRDYGRGYKK FGCETVQFQWFLKEELVKASARAGLYLPVEEVPQTADKTLRIQTMQPDIKNKYIKFNRRH KRLLEQLVQFPMGAHDDGPDALEGCRTLAKKVRKFKVIPRESLI >gi|157101621|gb|DS480703.1| GENE 8 6192 - 6674 441 160 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0502 NR:ns ## KEGG: Ethha_0502 # Name: not_defined # Def: NGN domain-containing protein # Organism: E.harbinense # 
Pathway: not_defined # 1 139 1 144 164 82 35.0 5e-15 MLWYVVQVRTGEEKDIAAKLTDMGFQTLAPVENRPVRSGGAWGTKEYVLFPGYVFLQMDY NAGNYYRLKAVPGIVKLLSGTLTYLEAEWIRLLAGQGGRPLEPTLMRETEEGLEIETGIL QNFKSRIIRMDKRSLRATIELSICGEKKEVQLGIRLPEEV >gi|157101621|gb|DS480703.1| GENE 9 6677 - 7249 692 190 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0503 NR:ns ## KEGG: Ethha_0503 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 12 187 4 180 190 171 49.0 2e-41 MCEGMSGENRRRSIGKVDRLPPELKDTVEQMLLTGSTYKEIVTYLKENGEEMSQMAICTY AKKYLATVEMINVAQNNFSMLMDEMNRYPDLDTSEALIRLASHHVMNALTNVDEEQMKEV PVDKLIKETNGLIRAAAYKKRIEVQTRENYEAGLEAVKGLVFEAMAKEQPELYQQVSAYL NKKKQEGMEG >gi|157101621|gb|DS480703.1| GENE 10 7234 - 7572 281 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941423|ref|ZP_02088759.1| ## NR: gi|160941423|ref|ZP_02088759.1| hypothetical protein CLOBOL_06315 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06315 [Clostridium bolteae ATCC BAA-613] # 1 112 1 112 112 231 100.0 2e-59 MNRDEMMKRIRADAFPANNGSVLTCINLLNRSGFSPLEQVRVGVKNWGVEKPEFMDSIHF LKLAGYIETRTIEGHISGADLADYDYDELEARASEKGIRLMQGSLTDECVKV >gi|157101621|gb|DS480703.1| GENE 11 7578 - 7943 409 121 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0085 NR:ns ## KEGG: Ethha_0085 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 9 116 14 104 112 61 30.0 9e-09 MNEWVITTVITLGIGVVSYFLKRTMSQVDRHDTEIREGREQSASKEDLKEQTKELKDDIR KIREDYTPRDTHQKDFDECRKDIKEIRENYLTKDDFIREISKMDRKLDRMMEMMIEAAKK G >gi|157101621|gb|DS480703.1| GENE 12 8035 - 8358 354 107 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0084 NR:ns ## KEGG: Ethha_0084 # Name: not_defined # Def: Mor transcription activator domain protein # Organism: E.harbinense # Pathway: not_defined # 4 102 3 101 107 92 50.0 4e-18 MDLLDKVQMEDLDEEQRTLAGLIGIEAFRALVRNYNGTPIYIPKIESLEKPVRDELIREE FDGKNYRELALKYGLTETWIRNIVIEKAREIRARPMDGQISLKGILY >gi|157101621|gb|DS480703.1| GENE 13 8342 - 8836 398 164 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0083 NR:ns ## 
KEGG: Ethha_0083 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 3 154 2 155 166 145 49.0 4e-34 MRSIDSYQMKRIYAVGHQLSLVGHDHEDELHALVATITGKESVKALSYREAESVIARLTQ LQGGQAPPRPRREREHPSRPGGVTSGQQKKIWALMYELAKYDRKPESVSLGDRLCAIIKK ELGMDAIARTPFAWIDFDSGNKLIEVLKGYVKSAVRGCKDGPVG >gi|157101621|gb|DS480703.1| GENE 14 8885 - 9595 840 236 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0082 NR:ns ## KEGG: Ethha_0082 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 23 234 10 221 221 139 39.0 9e-32 MRDGSLSDLCREDCGGVSVTEKTKEVKAKADRLVELTAQKERVQAEMEEIKAWFENLAVN DLKDTKKKTVEYWGSSNARVVVGNSETVKPVSMAMVKRLLNTVYPDFVTEKTSYSLAAPA KRLFTIAYLGAYTEGTLDDTIQAITKDEKLQRTLRKKLKGKYEKDTESLMKSAGLDAKEA SDWAYLVSEVVNWEWMLQVLKAAEWSGTPQEAVDIINAAVIVDESLKVTVGAEEKS >gi|157101621|gb|DS480703.1| GENE 15 9538 - 9777 198 79 aa, chain - ## HITS:1 COG:no KEGG:Ethha_0081 NR:ns ## KEGG: Ethha_0081 # Name: not_defined # Def: SpoVT/AbrB domain-containing protein # Organism: E.harbinense # Pathway: not_defined # 1 67 1 67 90 63 44.0 3e-09 MQISKKVTKGGGITIPRMLRQETGILPGVPVDVTADAAGIHIVKHVPACRFCGAVEDVAA VCGMEVCRTCAGKIAEVFQ >gi|157101621|gb|DS480703.1| GENE 16 9870 - 10142 207 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941429|ref|ZP_02088765.1| ## NR: gi|160941429|ref|ZP_02088765.1| hypothetical protein CLOBOL_06321 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06321 [Clostridium bolteae ATCC BAA-613] # 1 90 1 90 90 171 100.0 1e-41 MKALRDEFYFEPRVIDSSGKLRWYGEVYTGNMLLLHTEETVYIRDNGSKLFIYTLDSDQM KQEQRIEAVFTLVCQVQKYSNKWRYGKRNR >gi|157101621|gb|DS480703.1| GENE 17 10123 - 10608 378 161 aa, chain - ## HITS:1 COG:no KEGG:Clocel_3545 NR:ns ## KEGG: Clocel_3545 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulovorans # Pathway: not_defined # 1 154 1 154 154 199 59.0 2e-50 MKKYIGTKLVQARPMTRGAYNRYRGWEIPADENPEDEGYLIQYPDGYVSWSPKGMFDHSY LEVDDNPQLPSGVSIGPGMVEAFIDRIEVMKLGERTTVVRCILKNGFELVESSACVDPRN 
YSVEIGQEACMEKIRDRIWNLLGFLLQTAWMGVRKDEGTKG >gi|157101621|gb|DS480703.1| GENE 18 10626 - 10727 80 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLKVISCLLAMIIFAAVIYLRLNILAYFLKTI >gi|157101621|gb|DS480703.1| GENE 19 11028 - 11255 104 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941433|ref|ZP_02088769.1| ## NR: gi|160941433|ref|ZP_02088769.1| hypothetical protein CLOBOL_06325 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06325 [Clostridium bolteae ATCC BAA-613] # 1 75 1 75 75 139 100.0 6e-32 MRVIEGDLKACPFCGSKNIVIDTCHDLGDCENFERCEDTGYYSAVCNRNDGGCGASTGYK PTIKAAVEAWNRREN >gi|157101621|gb|DS480703.1| GENE 20 11276 - 11452 96 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941434|ref|ZP_02088770.1| ## NR: gi|160941434|ref|ZP_02088770.1| hypothetical protein CLOBOL_06326 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06326 [Clostridium bolteae ATCC BAA-613] # 1 58 1 58 58 97 100.0 3e-19 MEKERRPTETGKRCLSCRHRRVIEDSRGESISLCVCAVGRKYLNPVCPLGTCGRYLGY >gi|157101621|gb|DS480703.1| GENE 21 11436 - 11849 345 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941435|ref|ZP_02088771.1| ## NR: gi|160941435|ref|ZP_02088771.1| hypothetical protein CLOBOL_06327 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06327 [Clostridium bolteae ATCC BAA-613] # 1 137 13 149 149 183 100.0 3e-45 MRREAKLKSQVNQSARLEEFLAWVKECEEQYRTASEAVALEDRRLQDLLHEMEFAATSKE RGRVATKLYRSRKMRREQKDIMKRNEQVVEFFREQPARAMLKRINQLVGRQKTEEQYLDG KRTYKPRVEGGGDGKGA >gi|157101621|gb|DS480703.1| GENE 22 11958 - 12227 254 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941436|ref|ZP_02088772.1| ## NR: gi|160941436|ref|ZP_02088772.1| hypothetical protein CLOBOL_06328 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06328 [Clostridium bolteae ATCC BAA-613] # 1 89 1 89 89 150 100.0 2e-35 MKRLNEKVIMMVRGLMIGSSMAAVLTVLVMGGKLGIWGFISVLGLIVLAGLGGWLIGFQT RTRLDADQVWLKGYKEGRQDGCLIPSHRE >gi|157101621|gb|DS480703.1| GENE 23 12241 - 13203 833 320 aa, chain - ## HITS:1 COG:HI1481 KEGG:ns 
NR:ns ## COG: HI1481 COG2842 # Protein_GI_number: 16273385 # Func_class: R General function prediction only # Function: Uncharacterized ATPase, putative transposase # Organism: Haemophilus influenzae # 9 294 3 274 287 97 28.0 2e-20 MSKEYNVELQKKLEDYLEAEGLSQAKAAPILGISAAVLSQYRRSVYDKGDIGEVEKKLEE FFRIKEEQGQNSRKAEPFRANLQGYIPTSISEAAYKLIRYCQLEKGIVVIDGDAGIGKTK AAAKFLQDNPSTTVYLKAAPSTGTLRSLLKMIGRALKLPENQRTEDLSLAIQDKLKETDK IIIIDEAQNLKFMALEEIRGWVDEDPLTGKPGIGIVLIGNVEVYNKMLGRQEAIFSQQFN RTRLHGRYRTTDIKREDVVKLLPALEERRMTKEIDYLHSISRSKWGIRGMVNVFRNAVND EDVSMAGLERVAGTMGIRFI >gi|157101621|gb|DS480703.1| GENE 24 13228 - 15183 1284 651 aa, chain - ## HITS:1 COG:pli0058 KEGG:ns NR:ns ## COG: pli0058 COG2801 # Protein_GI_number: 18450340 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Listeria innocua # 141 532 50 404 478 84 24.0 1e-15 MGQMLTVKQVAEVKGCRVQYIQRMAKEGKLPSVKTVNNRNQKVYQIPLESLPDELQQRWY QMQVEQMREENGIEEENRGGTDQFSAEERLEIDFWLELVRKWQEYRNLAPKRSKGETDQK FLIWCSLEYPDRTISMDILYRKWKAVRENNMAGLTDKRGKWRKGTSDIHETVWQAFLYYY LDEGQHTIQKCLEYTKMWIREKQPELYTDIPSYSSFYRKLNRDIPEGVKVLGREGHKAYN DRCAPYIRRIYEDIASNEWWIADNHTFDVIVVDKNGKQHRPYLTAFMDARSGILTGYYIT YNPSSEATLIALRKGILEYGIPDNIYVDNGREFLTFDIGGLGHRRKKPKNGEERFEPPGV FKRLGINMTNAIVRNAKAKIIERRFRDVKDSLSRLFDTYTGGSVVEKPERLKGVLKKGEI YSDDEFQEYVEAVIDYYFNLQPYHGAVPADHGKLKMDVFNEHLIKKRTATAEALNLMLMR SSRAQTVGRRGVHLDIAGGRIDYWNDDFVHLMLGKKVYFRYDPDNLSEVRIYDLEDRYIM TVPADNEAVLSYNASREDVKAAMAKTRRLEKVAKEYIEHAVLADCDKVTAMELVLKEAQY NKENYQGKANPKVLEVQRADEEPAFKKVVGGIDLDRMIRNAEIRHEQEKQR >gi|157101621|gb|DS480703.1| GENE 25 15268 - 15462 80 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941439|ref|ZP_02088775.1| ## NR: gi|160941439|ref|ZP_02088775.1| hypothetical protein CLOBOL_06331 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06331 [Clostridium bolteae ATCC BAA-613] # 1 64 4 67 67 119 98.0 8e-26 MDAGKIIKNIRKEKGLRQKDIAHAMTVEQSYISQVENGKTVPTPMFIKLFCLTFSVNEEQ FHEK >gi|157101621|gb|DS480703.1| GENE 26 15578 - 
16159 193 193 aa, chain + ## HITS:1 COG:no KEGG:ELI_3146 NR:ns ## KEGG: ELI_3146 # Name: not_defined # Def: transcriptional regulator # Organism: E.limosum # Pathway: not_defined # 11 95 2 88 124 68 39.0 1e-10 MLIFIGGKMNNISDRIRVLRKSEHLTQKEFAKRLLISQSYLSGLENGNEIPTNKLLKLIC LEFGVSESWLLNDNGEMYTEVYENDKASLTEVSNGALLKIMTLLSTKSNVEYGFYANSLS LFSNMLDCSHSLGEDRKLIYLELLTTLVMDLDRMVYVSFNNKDSKNIDKHKQVIRNDLDA FFNYIVKTDIPSL >gi|157101621|gb|DS480703.1| GENE 27 16743 - 17621 350 292 aa, chain - ## HITS:1 COG:SSO2636 KEGG:ns NR:ns ## COG: SSO2636 COG1319 # Protein_GI_number: 15899362 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Sulfolobus solfataricus # 7 277 8 277 282 108 28.0 9e-24 MLQYKKYYMPKSKEELFRLMEHNAHSFDIISGGTDLFAEERTPFNGQDSAIDISSIEDFS IIESKCGFITIGANTRIQQFLEEPELIDTVPVLRHAASYFADQQIREIATVGGNLANASP CADLIPPLLAMDATVHTIRQNDNDICMSDVPLSDFIKGVGKTSLSEGEVIQSVTCPILKD YGCAFKKVGLRRSLCISTVNSAFLVKADETNQYFKDVRIAFGGIGPAPVRLNEIEDNLKG NRISKDMILKMAECIPGDIVKSRSRREYRKTVIRNFLLAGCNLTYRIQKETN >gi|157101621|gb|DS480703.1| GENE 28 17634 - 18740 836 368 aa, chain - ## HITS:1 COG:PH0655 KEGG:ns NR:ns ## COG: PH0655 COG1063 # Protein_GI_number: 14590542 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Pyrococcus horikoshii # 5 312 1 293 348 182 38.0 9e-46 MNYEIPETMKAWVLGDPGVLTLEEKPVPRPKKSEVLVKIDAVAICATDLEIIENGPPAMI QGGLPFNKGFTPGHEYMGTVVKLGDGVDEYKVGDRVAVEIHAGCSSCERCREGMYTSCLN YGKNYEGNDKGHRANGFTTDGGFAEYAINNVNTLCHVGDNVTDEEATLIVTAGTAMYGLD TLGGLIAGQSVVVTGPGPIGLMAVAVAKALGANPVILTGTRDNRLELGKKLGADYTVNVR NQDVVKAVKEITNGEGVHYVMECSGAPNAVNEAANMVKRGGRICLAAFPSKPVEVDVAAL VRNNISLFGIRGEGRSAVHRAASLISEKRFDASLIHTHTFGFGDLLTALKYSKERIDDAI KVVVKIPG >gi|157101621|gb|DS480703.1| GENE 29 18839 - 20719 828 626 aa, chain - ## HITS:1 COG:lin0404 KEGG:ns NR:ns ## COG: lin0404 COG3962 # Protein_GI_number: 16799481 
# Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase # Organism: Listeria innocua # 57 611 39 611 638 115 25.0 2e-25 MNEMMEQRRKRAKEIKTHGTIRKAVEAGSLEQFQDISLSEAVVLGLINQGVKTFVGIFGH GMTDVGEVLRIYEEESAVKTVNVRNEVEASHVAAMLRWKYNEVPAVFTSIGPGALQAAAG SLVPLANGLGVYYLMGDETSHSEGPNMQQIPKREQELFLRLLSTFGPAYSLHTPEAVFTA LKRGDAAVHGRAKRTPFYMLLPMNIQPKLMESCNLLEFTEKSEESKVISTDMSVFNAALE AILDSSRITVKVGGGAADVTAEVLEEFLELTDAVYVHGPQVPGLYPYSKKRNMSVGGSKG SICGNYAMNECDLMIAIGARGVCQWDSSGTSFRKAKKIININCDYDDLAQYNNSVRIQGD AEEVLLKIIELLKKTENKTSDPEWLKECTDKRMEWETYKQRRYDNPVLVDEKRGESILTE PAAIKKAVDFADMHGCVKIFDAGDVQANGFQIVTDEKPGQTITDTGSSYMGFAVSSMLAF ALAGGKDYPVAFTGDGSFMMNPQILIDGVQHGLRGMIVLFDNRRMGAITSLQHSQYDVEY KTDDSVIVDYVQMAEAVKGVKGFYGGATAEELEKALSQAYEHDGLSLVHVPVYYGRDELS GLGSFGDWNVGNWCERVQKLKHTIGF >gi|157101621|gb|DS480703.1| GENE 30 21128 - 21937 261 269 aa, chain + ## HITS:1 COG:BH2137 KEGG:ns NR:ns ## COG: BH2137 COG1414 # Protein_GI_number: 15614700 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 1 248 1 249 251 161 34.0 1e-39 MIQSLDRAMTILSILEQKNTASVTEIAEELGVNKSTVSRLMETLKKHDMVQADPATKKYG LGFKILYLGEGVKRNINIITIARPFLTKLYDELDESVHLCAFNNDAAYVVDQVQSNKVYN LSATVGMAEPLHSSSVGKCILAFRAPFAAVRLLKDYPLTAYTPKTITDMDILMEQLAIIR SQGYAVDDEEIALGVRCIAAPIYDYRGEVNYSLGVSGTTNNIKPAAMDRYLSALLSASSQ ISSALGYKCGRQRPEVQPPVPYGDSGREE >gi|157101621|gb|DS480703.1| GENE 31 22233 - 24863 2704 876 aa, chain - ## HITS:1 COG:lin1981 KEGG:ns NR:ns ## COG: lin1981 COG0574 # Protein_GI_number: 16801047 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Listeria innocua # 1 863 1 860 879 1031 57.0 0 MAKWVYKFHEGSAAMRNLLGGKGCNLAEMTNLGMPIPQGFTVTTEACTEYYNCGKQISQE IQDQIFEAITWLEGINGKKFGDTEDPLLVSVRSGARASMPGMMDTILNLGLNDVAVEGFA KKTGNPRFAYDSYRRFIQMYSDVVMEVPKSYFEKIIDEMKEAKGVHFDTDLTADDLKELA AKFKAVYKEAMNGEDFPQDPTEQLMGAVKAVFRSWDNPRAIVYRRMNDIPGDWGTAVNVQ 
TMVFGNKGETSGTGVAFTRNPSTGAKGIYGEYLINAQGEDVVAGVRTPQPISKLAEDLPE CYKEFMDLAMKLENHFRDMQDMEFTIEEGKLYFLQTRNGKRTAPAAIQIACDLVDEGMIT PEEAVCRIEAKSLDQLLHPTFVPEALKAGEVIGSALPASPGAAAGKVYFTADEAKDAGKG GRGERVILVRLETSPEDIEGMHAAQGILTVRGGMTSHAAVVARGMGTCCVSGCGEIKIDE EAKVFELGGHTFHEGDYISLDGSTGKIYKGDIATQEATVSGNFERIMEWADQFRTLRVRT NADTPADTLNAVKLGAEGIGLCRTEHMFFDAERIPKIRKMILSETVAQREEALNELIPFQ KGDFKAMYKALEGRPMTVRYLDPPLHEFVPTDPDDIKALADDMGMTVEDVKAKCAELHEF NPMMGHRGCRLAVTYPEIAKMQTRAVMEAAIEIKEECGYDIVPEIMIPLVGEKKELKYVK DVVVEIAELVKKEKNSDIQYHIGTMIEIPRAALTADKVAEEADFFSFGTNDLTQMTFGFS RDDAGKFLDSYYKAKIYESDPFARLDQEGVGQLVKMAVEKGRSTKADLKCGICGEHGGDP SSVEFCHKIGLNYVSCSPFRVPIARLAAAQAALNNK >gi|157101621|gb|DS480703.1| GENE 32 25846 - 26535 976 229 aa, chain - ## HITS:1 COG:CAC0321 KEGG:ns NR:ns ## COG: CAC0321 COG0745 # Protein_GI_number: 15893613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 228 1 229 230 318 71.0 5e-87 MSKILIVEDEESIAELEKDYLELSGFEVEIENDGSSGLDKALKEDYDLLILDLMLPGTDG FEICKRVREVKNTPIIMVSAKKEDIDKIRGLGLGADDYITKPFSPSEMVARVKAHMARYE RLIGSGQPANELIEIRGLKIDKTARRVWVNGEEKTFTTKEFDLLTFLAQNPNHVFTKEEL FSKIWDMESIGDIATVTVHIKKIREKIEFNTAKPQYIETIWGVGYRFKV >gi|157101621|gb|DS480703.1| GENE 33 26507 - 27949 1498 480 aa, chain - ## HITS:1 COG:CAC0317 KEGG:ns NR:ns ## COG: CAC0317 COG0642 # Protein_GI_number: 15893609 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 1 464 27 492 498 321 39.0 2e-87 LALSNYQTRSFTKEYGLTEQVDLFSGNSMQIFNRITKRSQEDIRNTLDNDPGLYNDLSYL DKVNAELKSHYAYLIVRKGDTILYCGDSNKAAGEAICAQLPRFDMMKGDLQGGIYLDGES QHLIKQMDFTFPDGQEGSAFIISNVDDLIPEVKSMIREMLILGVTILVFAGIVLTYWVYR SILSPLNKLQEATKKIRDGNLEFTLDVDADDEIGQLCQDFEEMRIRLKENAEDKIQYDKD NKELISNISHDLKTPITAIKGYVEGILDGVASSPEKLDKYIRTIYNKSNDMDRLIDELTF YSKIDTNKIPYTFSKINVAQYFRDCVEEVGLDMEARGIELGYFNYVDEDVVIIADAEQMK 
RVINNIISNSVKYLDKKKGIINIRIKDVGDFIQVEIEDNGKGIAAKDLPNIFDRFYRTDS SRNSAQGGSGIGLSIVRKIVEDHGGRIWATSKEGIGTEIHFVLRKYQEVLQDEQDSNRGR Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:47:03 2011 Seq name: gi|157101620|gb|DS480704.1| Clostridium bolteae ATCC BAA-613 Scfld_02_45 genomic scaffold, whole genome shotgun sequence Length of sequence - 28063 bp Number of predicted genes - 25, with homology - 24 Number of transcription units - 10, operones - 4 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 197 - 284 56.4 # Ser GCT 0 0 - Term 133 - 175 4.7 1 1 Op 1 . - CDS 362 - 1075 411 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 2 1 Op 2 11/0.000 - CDS 1103 - 1654 493 ## COG1838 Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain 3 1 Op 3 . - CDS 1685 - 2527 842 ## COG1951 Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain 4 1 Op 4 . - CDS 2524 - 4458 1628 ## COG5001 Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain - Prom 4516 - 4575 3.0 5 2 Tu 1 . + CDS 4415 - 4495 58 ## + Term 4544 - 4575 2.5 - Term 4532 - 4563 1.7 6 3 Op 1 24/0.000 - CDS 4582 - 7128 2970 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 7 3 Op 2 9/0.000 - CDS 7160 - 9073 1981 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 8 3 Op 3 . - CDS 9107 - 10192 952 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 9 3 Op 4 . - CDS 10212 - 10421 228 ## Closa_0003 hypothetical protein 10 3 Op 5 16/0.000 - CDS 10434 - 11552 1272 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) - Prom 11740 - 11799 3.4 11 3 Op 6 . - CDS 11833 - 13203 1334 ## COG0593 ATPase involved in DNA replication initiation - Prom 13243 - 13302 7.4 + Prom 13543 - 13602 7.5 12 4 Tu 1 . 
+ CDS 13651 - 14292 715 ## COG4684 Predicted membrane protein + Term 14364 - 14396 4.0 + Prom 14401 - 14460 7.8 13 5 Tu 1 . + CDS 14531 - 14665 188 ## PROTEIN SUPPORTED gi|160882064|ref|YP_001561032.1| ribosomal protein L34 + Term 14695 - 14728 2.1 14 6 Op 1 22/0.000 + CDS 14781 - 15125 228 ## COG0594 RNase P protein component + Prom 15142 - 15201 2.5 15 6 Op 2 16/0.000 + CDS 15364 - 16680 1385 ## COG0706 Preprotein translocase subunit YidC 16 6 Op 3 4/0.000 + CDS 16692 - 17483 899 ## COG1847 Predicted RNA-binding protein + Term 17525 - 17573 7.1 17 6 Op 4 11/0.000 + CDS 17580 - 18959 1375 ## COG0486 Predicted GTPase 18 6 Op 5 24/0.000 + CDS 18971 - 20866 1691 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 19 6 Op 6 . + CDS 20878 - 21657 516 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division + Term 21666 - 21706 8.4 - Term 22675 - 22720 5.0 20 7 Tu 1 . - CDS 22782 - 23612 408 ## COG0648 Endonuclease IV - Prom 23643 - 23702 6.4 - Term 23667 - 23705 5.2 21 8 Tu 1 . - CDS 23733 - 24683 442 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) - Prom 24730 - 24789 7.2 + Prom 24872 - 24931 6.1 22 9 Op 1 25/0.000 + CDS 24977 - 25747 592 ## COG1192 ATPases involved in chromosome partitioning 23 9 Op 2 . + CDS 25747 - 26667 839 ## COG1475 Predicted transcriptional regulators 24 9 Op 3 . + CDS 26692 - 27204 283 ## Closa_4289 hypothetical protein + Term 27219 - 27259 5.2 + Prom 27209 - 27268 5.0 25 10 Tu 1 . 
+ CDS 27338 - 28061 76 ## Closa_3993 hypothetical protein Predicted protein(s) >gi|157101620|gb|DS480704.1| GENE 1 362 - 1075 411 237 aa, chain - ## HITS:1 COG:CAC0965 KEGG:ns NR:ns ## COG: CAC0965 COG0204 # Protein_GI_number: 15894252 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Clostridium acetobutylicum # 16 226 27 235 241 85 28.0 9e-17 MVLMNLLHAPIWFFTISRMAREEDTRTAAEKYGYISEMVKRINRRGRVTVTATGTEHIPS QDGFILFPNHQGLFDMLALFSTCPRPLSVVIKKEASNWILVKQVLGATKSLAMDRDDIKD QVRIITEVTKRVKSGRNFVIFAEGHRSRNGNEILDFKSGTFKSAVKAGCPIVPVALVNSF RPFDMNSLKREEVQVHYMEPILPDQYMGMKTSEIAHLVHDRIQEEINKNLTKNGLTS >gi|157101620|gb|DS480704.1| GENE 2 1103 - 1654 493 183 aa, chain - ## HITS:1 COG:CAC3090 KEGG:ns NR:ns ## COG: CAC3090 COG1838 # Protein_GI_number: 15896341 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain # Organism: Clostridium acetobutylicum # 1 180 1 180 187 214 56.0 9e-56 MNRHMNVPMTKEEAASLKAGDYVYLTGTIYTARDAAHKRMDEALDRGESLPFDIEGSIIY YMGPSPAREGRAIGSAGPTTASRMDKYTPRLLDLGMGAMIGKGKRSKAVMDAIVRNGAVY FAAVGGAGAILSKCILSSEIVAYEDLGTEAVRRLAIQDFPVVVVMDALGNNLYETAVKEF CTL >gi|157101620|gb|DS480704.1| GENE 3 1685 - 2527 842 280 aa, chain - ## HITS:1 COG:CAC3091 KEGG:ns NR:ns ## COG: CAC3091 COG1951 # Protein_GI_number: 15896342 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain # Organism: Clostridium acetobutylicum # 1 275 3 277 282 347 60.0 1e-95 MREIAAATITEAIRDMCIEANYGLAPDMRKVFEKAVEEEESPLGRQVLGQLKENLKIAGE DKIPICQDTGMAVVFMKIGQDVHITGGRLADAVNEGVRQGYEQGYLRKSVVGDPIERINT KDNTPAVIHCEIVDGEQIDITVAPKGFGSENMSRVFMLKPADGLEGVKESILQAVREAGP NACPPMVVGVGIGGTFEKCTQMAKHALTRDIEDKPGKQWVRDLEEDMLNRINQSGIGPGG LGGRVTALAVNIETFATHIAGLPLAVNICCHVNRHARRVL >gi|157101620|gb|DS480704.1| GENE 4 2524 - 4458 1628 644 aa, chain - ## HITS:1 COG:RSc1545 KEGG:ns NR:ns ## COG: RSc1545 COG5001 
# Protein_GI_number: 17546264 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein containing a membrane domain, an EAL and a GGDEF domain # Organism: Ralstonia solanacearum # 226 635 345 759 776 193 31.0 9e-49 MVMWSLAAELLGLIIHIILILYYHERRLVSNSRRKIYQVCLWISVLTILLNIVCVHMVTK PAVVSHGVNMLLNSLYFILCVLTCSVMAVYMFALTLEHVYDKRCLRITGRIVLILNIIFW GIVIWNLRSGVLFYFDENQIYIRGPLNRIGYLVMAIEMLMLVLCYMRNRRSVSRPVVRLI RTMPVIAAICIVFQHIYKDLQLNGMFMAIVNMVIFISFQTRRSEVDSLTFIGNRNSFFEE LSLRIASRQYFQVVLVCLKQFSLVNEKFSYKKGDEFLYNIARELDHILPGAKAFRFGNVE FAVLLPYATEEESTGSLKRIQERFEERWELGAVGSYIQAYFTDVMYRGQEWNATQIIEYL ESGIHYAKKEPGGLRRFDIKLLEQLNRRKRILDIMETSIRQRRFKVWYQPVFNLKTNRFS SAEALLRLRDYDGEPVSPSEFIPLAEETGLIDDLSWIVLEEVCTLLGQMRDKIDSISINL SMQQFEDRSLCARIHECLNRCGLNPDQLKIEVTERVLLQDMDYMKRMMEEMTGEGFGFYL DDFGTGYSNISCALSLPFEYIKLDRSLLVRLPGDSKVQVFVRSMVETFHAMGQKIVAEGV EEEEQIELLRQFGVDCVQGYYYGKPMPEDEFRAAVAKGWEEQRR >gi|157101620|gb|DS480704.1| GENE 5 4415 - 4495 58 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MISPNNSAANDHITILFGSPLLSVTR >gi|157101620|gb|DS480704.1| GENE 6 4582 - 7128 2970 848 aa, chain - ## HITS:1 COG:CAC0007 KEGG:ns NR:ns ## COG: CAC0007 COG0188 # Protein_GI_number: 15893305 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Clostridium acetobutylicum # 8 835 6 830 830 920 57.0 0 MEPNVFDKVHEVDLKKTMEQSYIDYAMSVISARALPDVRDGLKPVQRRVLYSMIELNNGP DKPHRKCARIVGDTMGKYHPHGDSSIYGALVNMAQEWSTRYPLVDGHGNFGSVDGDGAAA MRYTEARLSKISMEMLADINKDTVDFVPNFDETEREPLVLPSRYPNLLVNGTSGIAVGMA TNIPPHNLREIIGAVVKMIDNRVEEDRETSLEEIMEIVKGPDFPTGATILGRRGIEEAYR TGRAKIKVRAVTNIETMANGKSRIIVTELPYMVNKARLIEKIADLVKEKKIDGITYIGDE SNREGMRINIELRRDVNANVILNQLLKHTQLQDTFGVIMLALINNQPKVLNLHQMLEEYL KHQEEVVTRRTQYDLNKAEERAHILKGLLIALDNIDEVIRIIRGAANVADAKNQLMERFG LSDAQSQAIVDMRLRALTGLEREKLENEYRELQAKIEELKAILADEKKLLTVIRDEIMII SDKYGDDRRTAIGYDDDMSMEDLIPDEDTIIAMTHLGYIKRMDVDNFRSQNRGGKGIKGM 
QTIEEDYIEDLLMTTNHHYMMFFTNTGRVYRLKAYEIPEAGRTARGTAIINLLQLMPEEK ITAIIPMREFNDDKYLFMATRNGMVKKTPMVEYEHVRKNGLQAIVLRDGDELIEVKATDN SQDILLVTRKGMCIRFNETDVRVTGRVSMGVIGMRFEDDDEVIGMQMQSQGDSLLVVSEN GMGKRTMITEFSAQNRGGKGVICYKCTDKTGYLVGAKLVNDGREIMIITTEGIIIRMTVD DISVIGRNTSGVKLMSIDQDSDIKVASIAKVRESVTKDSNVYDEEYYEEENGDSEEESEA DVDSDETV >gi|157101620|gb|DS480704.1| GENE 7 7160 - 9073 1981 637 aa, chain - ## HITS:1 COG:CAC0006 KEGG:ns NR:ns ## COG: CAC0006 COG0187 # Protein_GI_number: 15893304 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Clostridium acetobutylicum # 3 637 6 637 637 806 63.0 0 MGEQYGADQIQILEGLEAVRKRPGMYIGSTSARGLHHLVYEIVDNAVDEALAGYCDSIDV TINEDNSITVMDDGRGIPVDIQKKAGLPAVEVVFTILHAGGKFGNGGYKVSGGLHGVGAS VVNALSEWLEVQVYKDGNIYQQRYERGKVTYPLKIVGKCGPEQHGTRVTFLPDKEIFEET VYDYDTLKIRLRETAFLTKNLKIILKDDRDEKKEKVFHYEGGIKEFVTYLNKGKTPIYDQ VIYCEGTREGVYVEVSMQHNDSYTENIYTFVNNINTPEGGTHLTGFKNALTKTFNDYARE KKLLKDSEDNLSGEDIREGLTAIISVKIEDPQFEGQTKQKLGNSEARAAVDNIVSEQLTY FLEQNPAVAKSMCEKSILAQRARAAARKARDLTRRKSALDGMALPGKLADCSDKNPENCE IYIVEGDSAGGSAKTARSRATQAILPLRGKILNVEKARMDRIYGNAEIKAMITAFGTGIH DDFDISKLRYHKIIIMTDADVDGAHISTLLLTFIYRFMPELIKQGYVYLAQPPLYKVEKN KKVWYAYSDDELNSILKEIGRDNSNKIQRYKGLGEMDAEQLWETTMDPERRILLRVTMDE ETTSEVDLTFTTLMGDQVEPRREFIEANAKKVKNLDI >gi|157101620|gb|DS480704.1| GENE 8 9107 - 10192 952 361 aa, chain - ## HITS:1 COG:CAC0004 KEGG:ns NR:ns ## COG: CAC0004 COG1195 # Protein_GI_number: 15893302 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Clostridium acetobutylicum # 1 357 1 360 363 292 42.0 6e-79 MIIESIELKNYRNYKELHMEFNQGTNILYGDNAQGKTNILEAVYVCCTSKSHKSAKDRDI IRFNQDESHIKLQIRKNNVPYRIDMHLKKNKPKGIAINGVPIRKASELFGIANVVFFSPE DLNIIKNGPSERRRFIDMELCQLNKLYVHSLVQYNKVLLQRNKLLKELFFRPEYEETLDV WDMQLVNYGREVIKFRREFIKQLNEIIHAIHLSLTGGREDISISYEPFTREDQMEDILKK NRAQDMKQKTTLSGPHRDDISFIVNGIDIRRFGSQGQQRTAALSLKLSELQLVKQLSHDD 
PILLLDDVLSELDSSRQNHLLSAIKHIQTMITCTGLDDFVNNRFQIDKVFKVIDGTVINE N >gi|157101620|gb|DS480704.1| GENE 9 10212 - 10421 228 69 aa, chain - ## HITS:1 COG:no KEGG:Closa_0003 NR:ns ## KEGG: Closa_0003 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 67 1 67 101 83 70.0 3e-15 MEIIKLREDYIKLGQALKAAGLAESGVEAKYAAQDGLVSVNGETVYQRGKKLVDGDVVTY KGETIKITK >gi|157101620|gb|DS480704.1| GENE 10 10434 - 11552 1272 372 aa, chain - ## HITS:1 COG:CAC0002 KEGG:ns NR:ns ## COG: CAC0002 COG0592 # Protein_GI_number: 15893300 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Clostridium acetobutylicum # 1 365 1 361 366 225 36.0 1e-58 MKLRFQKDAIVNGINIVMKAVPSKTTMSILECILIDASTNEIKLTGNDMELGIETRVEGD ILEHGKIALDAKLFSEITRRLSSENASVTIESDDKFNTTISCENSVFNIQGRDGEEFAYL PYIEKDKYICLSQFTLKEVIQQTIFSISPNDSNKMMAGELFEVNENQLKVVSLDGHRISI RKVRLKDHYEDTKVIVPGKTLSEVSKILGGDNEKEVLIYFSTNHILFEFDNTIVVSRLIE GEYFRISQMLSSDYETKVSVNKKEFLDCIERATILIRENDKKPLIINIGDNSMELKLNSS FGSMNADLMIHKTGKDIMIGFNPKFLIDALRVIDDEEINIYMMNPKSPCFIKDEEENYIY LILPVNFNAATV >gi|157101620|gb|DS480704.1| GENE 11 11833 - 13203 1334 456 aa, chain - ## HITS:1 COG:CAC0001 KEGG:ns NR:ns ## COG: CAC0001 COG0593 # Protein_GI_number: 15893299 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Clostridium acetobutylicum # 1 456 1 446 446 434 51.0 1e-121 MIEKIKENWDKILLTLKEEHEISNISYSTWLEPLKPHSFKDNTVTIIVPEQTFLQYVNKK YGFLLKVTISEFLEKECEVDFKVKEQIVEAEEPAAPQLIKNNKASVSPDVIQSANLNPKY TFDTFVVGGNNNLAHAAALAVAESPGEIYNPLFIYGGVGLGKTHLMHAIAHFILKNNAKS KILYVTSETFTNELIDAIRNKNNITTTEFREKYRNNDVLLIDDIQFIIGKESTQEEFFHT FNTLYESKKQIIISSDKPPKEIETLEERLRSRFEWGLTVDIQSPDYETRMAILRKKEEME GYNIDNEVIKYIATNIKSNIRELEGALTKIVALSRLNKCDITLELAEEALKDIISPNAQR EVTPDLIIQVVSDHFGLTPLDISSQKRNKEIVYPRQIVMYLCRDMTATPLQTIGRYLGGR DHTTIIHGAEKITGDMAKDDTLRNTIEILKKKINPQ >gi|157101620|gb|DS480704.1| GENE 12 13651 - 14292 715 213 aa, 
chain + ## HITS:1 COG:CAC0331 KEGG:ns NR:ns ## COG: CAC0331 COG4684 # Protein_GI_number: 15893623 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 6 200 8 182 192 91 34.0 9e-19 MNTLSKTRDLVLAAVIAAIIVIMAFVPFLGYIPLGFMNATTVHIPVIIGAIILGPKYGAF LGLTFGLTSLWKNTFMPNPTSFVFSPFVTMGQFHGNLGSLVICLVPRILIGVVAYYVYRL FMRTLKNRRNKQGTALFAAGVAGSLTNTLLVMNLIYFLFGDQYASAASWTAKWVYGVILG IIGMQGVPEALVAGIIVTAVAGILLKLTPDIRG >gi|157101620|gb|DS480704.1| GENE 13 14531 - 14665 188 44 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160882064|ref|YP_001561032.1| ribosomal protein L34 [Clostridium phytofermentans ISDg] # 1 44 1 44 44 77 84 1e-13 MKMTFQPKKRQRAKVHGFRARMKTAGGRKVIAARRAKGRAKLSA >gi|157101620|gb|DS480704.1| GENE 14 14781 - 15125 228 114 aa, chain + ## HITS:1 COG:CAC3738 KEGG:ns NR:ns ## COG: CAC3738 COG0594 # Protein_GI_number: 15896969 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase P protein component # Organism: Clostridium acetobutylicum # 7 103 6 104 119 79 47.0 1e-15 MKRFPSVKKNSEFQVIYRNGTSYANRLLVMYVMKTGEDENRIGISVSKKVGNSVVRHHIT RLLREIFRLNNNRIKTGLNIILVARGAARQSDYKHLEGAYLHLCGLHHILKESK >gi|157101620|gb|DS480704.1| GENE 15 15364 - 16680 1385 438 aa, chain + ## HITS:1 COG:BH1169 KEGG:ns NR:ns ## COG: BH1169 COG0706 # Protein_GI_number: 15613732 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Bacillus halodurans # 35 135 50 153 280 90 47.0 8e-18 MNSIAGAVLLTKTGGILGPIADILGWIMDLLFRFTSTFGILNIGLCIILFTLVTKVLMIP LTIKQQKSSKLMAVMQPEITAIQKKYKGKESDQKYAMMMQTEIKAVYEKYGTSMTGGCLP LLIQMPIIFALYRVIYNIPAYVQSVRVYYEAVVSKLPSGYEASQAFTDLAQAHKMVGERF DYTNVNTVIDLLYKLTGQQWDSLVNAFSSVGQAVTETGVRVVDAIENMQNFFGLNIAYTP LQVIMNYFNKVDNTTLMTAFAALCIPLLAGLSQWYSTKLMTANQPQQSEDAPGANMMKSM NVTMPLMSVFFCFSFASGIGLYWVAQSVFTIIQQVGINSYLNKVDIDDLVQKNIEKTNKK RAKKGLPPTKVGNVDEMLKKIEAKEEKAEQAQMAKIAKTKSIVEESTKYYNDNAKPGSLA AKANMVSKYNEKHEKGKK >gi|157101620|gb|DS480704.1| GENE 
16 16692 - 17483 899 263 aa, chain + ## HITS:1 COG:CAC3735 KEGG:ns NR:ns ## COG: CAC3735 COG1847 # Protein_GI_number: 15896966 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein # Organism: Clostridium acetobutylicum # 1 207 1 208 209 179 52.0 4e-45 MDMITVTAKTVDEAVTKALIELETTSDKLEYEVVDKGSTGFLGIGAKPAIIRAKKKESIE DKAMDFLSQIFGAMNMQVNITAAYNQEEQELSLNLEGEDMGILIGKRGQTLDSLQYLVSL IVNKGTEGYLRVKLDTENYRERRKETLETLAKNIAYKVKRTKRPVSLEPMNPYERRIIHA ALQNDKYVTTRSEGEEPFRHVVIALKKEAASGDRKGRYDRNKGGRFERSDRSGGYRNNRR TNGNSASQTGSDASAETAAGEEE >gi|157101620|gb|DS480704.1| GENE 17 17580 - 18959 1375 459 aa, chain + ## HITS:1 COG:CAC3734 KEGG:ns NR:ns ## COG: CAC3734 COG0486 # Protein_GI_number: 15896965 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 4 459 5 459 459 393 46.0 1e-109 MKTDTIAAIATGMSNSGIGIVRISGEQAFSVIDSIFRNRKGERVRLSLSQSHTVHYGYIY DGDEKVDEALVIIMKGPHSYTAEDTVEIDCHGGVLMVRRILETVLHAGARTAEPGEFTKR AFLNGRLDLSQAEAVADVIHATNEYALKSSVSQLSGSVSAKIKELRSRILYQIAFIESAL DDPEHISLDGYGSRLMDDLVPMIGEVRKLLLSADDGRVMSEGVKTVILGKPNAGKSSLMN VLLGEERAIVTEIAGTTRDTLEEHIYLQGISLNVVDTAGIRDTEDVVEKIGVDRAMKAAR DADLIIYVVDGSRPLDESDREIIEFIRDRKTIVLLNKSDLDLEVGTAELEQICGHKVLPV SAKEEQGIELLEQEIKKLFYHGDLSFNDQVYITNARHKEALEQCLESLLMVKGSVEDSMP EDFYSIDLMNAYEQLGFIIGEAVDDDVVNEIFAKFCMGK >gi|157101620|gb|DS480704.1| GENE 18 18971 - 20866 1691 631 aa, chain + ## HITS:1 COG:CAC3733 KEGG:ns NR:ns ## COG: CAC3733 COG0445 # Protein_GI_number: 15896964 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Clostridium acetobutylicum # 7 627 1 619 626 785 61.0 0 MEEERGMIPYLEETYDVVIVGAGHAGCEAALASARLGMETIMFTVSVDSIALMPCNPNIG GSSKGHLVRELDALGGEMGKNIDKTFIQSKMLNESKGPAVHSLRAQADKQDYSRHMRRTL ENTEHLTIRQAEVSEIMAEDGKIRGVKTFSGAVYYAKAVILCTGTYLKARCIYGDVSNPT GPNGLQAANHLTDSLREHGIEMFRFKTGTPARVDKKSIDFSKMEEQFGDLRIVPFSFSTN 
PDEIQKDQVSCWLTYTNENTHRIIKDNLDRSPLFSGAIEGTGPRYCPSIEDKVVKFPDKN RHQVFVEPEGLYTNEMYLGGMSSSLPEDVQYAMYRTVPGLEKVKIVRNAYAIEYDCINAL QLKPTLEFKKISGLFAGGQFNGSSGYEEAAVQGFMAGVNAVMKIRGQEAVVLDRSQAYIG VLIDDLVTKENHEPYRMMTSRAEYRLLLRQDNADIRLRKIGHEIGLVCDEEYEHLLRKMD DIQSEIKRLEKTVIGVSDRVQMFLENYGSTLLKSGITLAELVKRPELDYVKLAELDEGRP ELPDDVREQVNIEIKYEGYIKRQMQQVAQFKKLEDKKLPEDFDYSEVNSLRREAVQKLNK VQPATIGQASRISGVSPADISVLLVHFTRKQ >gi|157101620|gb|DS480704.1| GENE 19 20878 - 21657 516 259 aa, chain + ## HITS:1 COG:BS_gidB KEGG:ns NR:ns ## COG: BS_gidB COG0357 # Protein_GI_number: 16081152 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Bacillus subtilis # 17 258 14 237 239 218 45.0 1e-56 MTDTFRQQMDRELGLLGIKLSEKQLEQFFTYYEMLVEKNKVMNLTAITDETDVVSKHFSD SLSLFRVLKRVMDSGDGCDGSRVYEEELLEGKSVIDVGTGAGFPGIPLKIAFPGIKLTLL DSLNKRVKFLEEVCDALELKNVEFIHGRAEDIGRQKEHREMYDLCVSRAVANLASLSEYC MPFVKIGGFFVPYKSGEIEDELKSAGSAVGILGGELEFVDKFMLPGTDASRSLVVIKKVK GISKKYPRKAGMPTKEPLG >gi|157101620|gb|DS480704.1| GENE 20 22782 - 23612 408 276 aa, chain - ## HITS:1 COG:lin1487 KEGG:ns NR:ns ## COG: lin1487 COG0648 # Protein_GI_number: 16800555 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Listeria innocua # 1 272 1 282 297 201 40.0 2e-51 MLRIGCHLSSSKGYRAMAKDAKKINANTFQFFTRNPRGGNAKAIDENDVKVFLTEVQSLD ITPILAHAPYTLNACSADPKLRNFAKRTMADDLVRMEYTPGNMYNFHPGSHVKQGTEIGI QFIAEMLNEILSPAQTTTVLLETMAGKGSEVGGAFEELRQILDLVELKSHMGVCLDTCHI WDGGYDIVNHLDDILTEFDNKIGLKCLKAIHLNDSQNPIGAHKDRHAKLGEGYIGLDTFK RIVTHPELRHLPFFLETPNDLDGYAHEIQMMREAAD >gi|157101620|gb|DS480704.1| GENE 21 23733 - 24683 442 316 aa, chain - ## HITS:1 COG:alr5028 KEGG:ns NR:ns ## COG: alr5028 COG0596 # Protein_GI_number: 17232520 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Nostoc sp. 
PCC 7120 # 63 315 41 294 296 96 25.0 8e-20 MKTRNKLLTLLILSSGAAAATALINKAIKLSATSRNVLEEPEALCYKWRLGNIHYTKTGT GKPILLVHDLAPASSGYEWKNLVGKLAETYTVYTIDLLGFGRSEKPNLTYTNYLYVQLLS DFIKSEIGHRTDIIASGSSAALGIMACSNSPELFNQLLFINPESLLSCSQVPGKNAKLYK IILDLPIAGTLIYNIACSKQFITKEFLTNYYYNPYSVKTRVIDAYHESAHLGESPKSVYA SLKCNYVKCNIAAALKKIDNSIYLLGGGAIDDIGECMEEYKEYNPAVEYADVPNTKLLPH LEKPAEVFDMIQTYLS >gi|157101620|gb|DS480704.1| GENE 22 24977 - 25747 592 256 aa, chain + ## HITS:1 COG:BH4058 KEGG:ns NR:ns ## COG: BH4058 COG1192 # Protein_GI_number: 15616620 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus halodurans # 1 253 1 253 253 294 58.0 1e-79 MGRIITIANQKGGVGKTTTAINLSACLAEAGQHVMLVDFDPQGNASSGLGLEQEDFDKTV YDMMIEEASVNECIIKEIQPNLDVLPSDMNLAGAEIEFQEVEDKEKLLSSCLNQVRDTYD FIIIDCPPSLNILTINALTAADTVLVPIQCEYYALEGLNQVLKTVDLVKRKLNPELELEG VVFTMYDARTNLSLEVVESVKSSLNRTIYKTIIPRNVRLAEAPSHGMSINLYDSRSTGAE SYRMLAAEVMSRGEEL >gi|157101620|gb|DS480704.1| GENE 23 25747 - 26667 839 306 aa, chain + ## HITS:1 COG:CAC3729 KEGG:ns NR:ns ## COG: CAC3729 COG1475 # Protein_GI_number: 15896960 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 3 303 4 282 283 216 46.0 6e-56 MAKKGLGKGLGAIFGEDVIKESEEEIAKAKAAVTDNGDGKSGELMVKMALIEPNREQPRK DFNEEQLAELADSIKRYGILQPLLVQKKGTFYEIIAGERRWRAAKIAGLKEIPVVLREYN KQESMEIALIENVQRSDLNPIEEALAYQRLVTEFKLTQEEIAARVSKNRATITNSMRLLK LDGQIQEMLIQNLISSGHARALLSLEDKGLQLKAAKMILDESLSVRETERLVKRLAKEAE NGEEKKDKNKDEALALIYQSLEERMKSVMGTKVSIHNKDKNKGRIEIEYYSEAELERIVE MIESIR >gi|157101620|gb|DS480704.1| GENE 24 26692 - 27204 283 170 aa, chain + ## HITS:1 COG:no KEGG:Closa_4289 NR:ns ## KEGG: Closa_4289 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 166 1 166 168 219 66.0 4e-56 MENSILNQIPFDPAYIIAGLLILVVLLLILTITTMTKVRRLDRQYDYFMRGKDAETLEDV ILDQIEEIRDLRAEDRSNKDSIRTLNRNQRASFQKYGLVKYNAFKGMGGNLSFAFAMLDY 
TNSGFVLNCVHSREGCYLYIKVVDMGQTDIVLGNEEQEALEQALGYIKKN >gi|157101620|gb|DS480704.1| GENE 25 27338 - 28061 76 241 aa, chain + ## HITS:1 COG:no KEGG:Closa_3993 NR:ns ## KEGG: Closa_3993 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 57 39 95 255 78 64.0 2e-13 MQDIEYLKSMYPSGIRILQGYVAEACDRLDYKNSPMYDEYPDHIMINRLCDTICDTVIAS EGAEKVRGMWNISEHDKAVMEMAELKKDMKRLDVDFFTEDDDPVMEQEQNIENVPAENIG NAEVLTTEESLEEQSVSNSQNEFLGYRESVVNPSGIIMETQEMRGREPGFSGNGRMGGRP NMGPGPGRPPMGNNPGRPPMGPGPERPPMGPGPERPPMGPGPVRPPSGPNQDRPPVGGPG P Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:47:29 2011 Seq name: gi|157101619|gb|DS480705.1| Clostridium bolteae ATCC BAA-613 Scfld_02_46 genomic scaffold, whole genome shotgun sequence Length of sequence - 28339 bp Number of predicted genes - 27, with homology - 27 Number of transcription units - 6, operones - 4 average op.length - 6.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 8 - 340 196 ## Clole_0583 transposase IS116/IS110/IS902 family protein - Term 599 - 644 13.2 2 2 Op 1 1/0.000 - CDS 766 - 1632 765 ## COG1606 ATP-utilizing enzymes of the PP-loop superfamily 3 2 Op 2 2/0.000 - CDS 1665 - 3077 866 ## COG1641 Uncharacterized conserved protein 4 2 Op 3 . - CDS 3079 - 3822 797 ## COG1691 NCAIR mutase (PurE)-related proteins 5 2 Op 4 17/0.000 - CDS 3908 - 4516 252 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 6 2 Op 5 44/0.000 - CDS 4513 - 5445 652 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 7 2 Op 6 49/0.000 - CDS 5490 - 6326 855 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 8 2 Op 7 38/0.000 - CDS 6323 - 7303 213 ## PROTEIN SUPPORTED gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 9 2 Op 8 . - CDS 7343 - 9067 1688 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Term 9465 - 9507 6.5 10 3 Op 1 . 
- CDS 9549 - 10799 1389 ## Apre_0526 ABC-type nitrate/sulfonate/bicarbonate transport systems periplasmic components-like protein 11 3 Op 2 24/0.000 - CDS 10888 - 11670 239 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 12 3 Op 3 . - CDS 11670 - 12746 730 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component - Prom 12794 - 12853 9.1 - Term 12890 - 12937 0.8 13 4 Op 1 . - CDS 13016 - 14518 1454 ## COG3875 Uncharacterized conserved protein 14 4 Op 2 . - CDS 14559 - 15398 1084 ## COG0149 Triosephosphate isomerase 15 4 Op 3 . - CDS 15395 - 16588 1270 ## COG0205 6-phosphofructokinase 16 4 Op 4 1/0.000 - CDS 16629 - 17786 1409 ## COG0191 Fructose/tagatose bisphosphate aldolase 17 4 Op 5 3/0.000 - CDS 17826 - 18680 955 ## COG0191 Fructose/tagatose bisphosphate aldolase 18 4 Op 6 . - CDS 18694 - 19710 1258 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 19 4 Op 7 2/0.000 - CDS 19738 - 20496 200 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 20 4 Op 8 24/0.000 - CDS 20510 - 21271 265 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 21 4 Op 9 2/0.000 - CDS 21259 - 22044 981 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 22 4 Op 10 21/0.000 - CDS 22060 - 22893 933 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 23 4 Op 11 . - CDS 22914 - 24077 1547 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 24272 - 24331 6.5 + Prom 24386 - 24445 7.7 24 5 Op 1 2/0.000 + CDS 24519 - 25205 687 ## COG1609 Transcriptional regulators 25 5 Op 2 1/0.000 + CDS 25293 - 25589 262 ## COG1609 Transcriptional regulators 26 5 Op 3 . + CDS 25593 - 26474 526 ## COG2017 Galactose mutarotase and related enzymes + Prom 26681 - 26740 5.0 27 6 Tu 1 . 
+ CDS 26907 - 27248 262 ## gi|160941510|ref|ZP_02088844.1| hypothetical protein CLOBOL_06400 + Term 27329 - 27364 4.9 - TRNA 27422 - 27501 58.4 # Leu TAG 0 0 - TRNA 27515 - 27598 64.1 # Leu TAA 0 0 - TRNA 27640 - 27712 86.7 # Val TAC 0 0 - 5S_RRNA 27655 - 27706 92.0 # AF302131 [D:490..741] # 5S ribosomal RNA # Streptococcus agalactiae # Bacteria; Firmicutes; Lactobacillales; Streptococcaceae; Streptococcus. - TRNA 27749 - 27822 83.3 # Asp GTC 0 0 - TRNA 27868 - 27941 85.6 # Met CAT 0 0 - TRNA 27984 - 28056 82.3 # Thr TGT 0 0 - TRNA 28076 - 28147 62.5 # Glu TTC 0 0 - TRNA 28185 - 28257 76.0 # Asn GTT 0 0 Predicted protein(s) >gi|157101619|gb|DS480705.1| GENE 1 8 - 340 196 110 aa, chain + ## HITS:1 COG:no KEGG:Clole_0583 NR:ns ## KEGG: Clole_0583 # Name: not_defined # Def: transposase IS116/IS110/IS902 family protein # Organism: C.lentocellum # Pathway: not_defined # 1 103 292 394 397 107 53.0 2e-22 MAEIGDVTRFTHKGAITAFAGVDPGVNESGTYEQKSVPTSKRGSSSLRKTLFQVMDCLIK TKPQDDPVYAFIDKKRAQGKPYYVYMTAGANKFLRIYYGRVKEYLMSLPE >gi|157101619|gb|DS480705.1| GENE 2 766 - 1632 765 288 aa, chain - ## HITS:1 COG:alr1286 KEGG:ns NR:ns ## COG: alr1286 COG1606 # Protein_GI_number: 17228781 # Func_class: R General function prediction only # Function: ATP-utilizing enzymes of the PP-loop superfamily # Organism: Nostoc sp. 
PCC 7120 # 36 283 21 267 275 194 40.0 2e-49 MNTVNNSFSDEMSSPDQMKKCLEARMEQLAKEDICLAFSGGVDSSLLLKTAADAAAHTGR KVYAVTFDSRLHPSCDLEIARRVAGELGGIHEVITVDELEQESIKNNPVNRCYLCKRHLF SRLAELAEARGIRYILDGTNEDDMHVYRPGIRALKELGIISPLAELHITKAQVKALASEY GISVASRPSTPCMATRLPYGAALDYEVLRRIGEGEAYVRTMVPGNVRLRLHGDIVRLELD PEAFEVFMKGRKEIVSRLKKMGFVYITLDAEGFRSGSMDAGLNGPEVQ >gi|157101619|gb|DS480705.1| GENE 3 1665 - 3077 866 470 aa, chain - ## HITS:1 COG:CAC0774 KEGG:ns NR:ns ## COG: CAC0774 COG1641 # Protein_GI_number: 15894061 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 3 457 2 411 420 237 36.0 4e-62 MGKILYLECNSGISGDMTVGALLDLGADRQVLENALESLGVDGYHLHFGRKVVSGLDAFD FDVHLEEHEHGHEEGHGHEEGHWHEEGHRHEDGHGHEHEHGHEDGHRHEAGHGHEVGHGH EHEHSHEADHGHEHEHSHEAGHGHPHPHIHRNLHDIYHIIDRLDSNERVKEMARTMFRIV AEAESKAHGLPVEQVHFHEVGAIDSIVDIISAAVCIDNLGVEDVVVSALSEGHGHVYCQH GVLPVPVPATANIASSYGLKLHFTDNDGEMVTPTGAAIAAALRTKDRLPSSCRLLKVGMG AGNKVFKQANVLRAMLLETSQEEDRTMWVLETNLDDCTGEMLGLAMEMLLDAGAADVWYT PIHMKKNRPAYMLSVLCRESSIEAMEEIILTQTTTIGIRRYPTERTILDRSEIQVETSYG PADVKVCAYKGRTFFYPEYESIRRICREQGVDFQTAYHQARMKAEESRQD >gi|157101619|gb|DS480705.1| GENE 4 3079 - 3822 797 247 aa, chain - ## HITS:1 COG:CAC0776 KEGG:ns NR:ns ## COG: CAC0776 COG1691 # Protein_GI_number: 15894063 # Func_class: R General function prediction only # Function: NCAIR mutase (PurE)-related proteins # Organism: Clostridium acetobutylicum # 2 244 5 248 248 233 45.0 3e-61 MDVRELLEQVKSGGVNIEEAEKQLKNLPYEDLGYAKLDHHRKLRSGFGETVFCQGKPDAY LLEIYKKFYERDGEVLGTRASEKQAELVRTAVPEVVYDPISRILKVEKPGKERKGYVAVC TGGTADIPVAEEAAQTAEYFGCRVDRIFDVGVAGIHRLLAQRERLDKASCIVAVAGMEGA LGTVIAGLVECPVVAVPTSVGYGASFHGLSALLTMLNSCANGISVVNIDNGYGAGYLATQ INRMAVR >gi|157101619|gb|DS480705.1| GENE 5 3908 - 4516 252 202 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 10 196 28 232 329 101 30 5e-21 MKLEAMNISFQYDNGNRKILNNVSISLESGERVGLTAPSGFGKTTFCKILAGYEKPDHGT 
VVLDGKNISSCAGYNPVQMIWQHPELSVNPRLKMREVLREGDFVEERVVRGLGIEPDWLN RYPGELSGGELQRFCIARALGKRTRFILADEISTMLDLITQSQIWGFLLEEVKTRDLGLL VVSHSEELLERVCGRVVDLRTD >gi|157101619|gb|DS480705.1| GENE 6 4513 - 5445 652 310 aa, chain - ## HITS:1 COG:MA1911 KEGG:ns NR:ns ## COG: MA1911 COG0444 # Protein_GI_number: 20090760 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Methanosarcina acetivorans str.C2A # 2 305 13 321 325 288 49.0 8e-78 MEEKQVILSVEHLMISFSQYVGGWRQRELPVIRDLNIKVREHEVVAVAGSSGSGKSLLAH AVMGLLPQNAACQGTVSFEGEILTQKKKEQLRGSRMVLVPQSVSYLDPLMKVGEQVRRGQ KDRESVKRCRQSMGRYGLGEDTEELYPFELSGGMTRRVLISTAVMEHPRLVIADEPTPGL HISAARRVLSHFREIADQGAGVLLITHDLELALEVADRIVVFYAGTNVEEALAEDFEDEQ RLRHPYTKALFRAMPRHGFKAASGTQPYAADMPAGCPYGPRCPDMDEGCLQEISYRSFRG GMVKCRKAGI >gi|157101619|gb|DS480705.1| GENE 7 5490 - 6326 855 278 aa, chain - ## HITS:1 COG:MA1912 KEGG:ns NR:ns ## COG: MA1912 COG1173 # Protein_GI_number: 20090761 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 9 277 17 284 285 261 52.0 1e-69 MSAQRHRWNRRKAMAALLAVSVLLLAAIAIAGQVLREQALATDFTRKNLPPSLAYPFGTD WMGRDMFVRSLTGLSISIRIGLLTACISAVVAFILGTMAACLGRVTDAVIGGIIDLVMGI PHILLLILISFAVGKGFWGVLIGISLTHWTSLARLLRGEVIQLRESQYIQIAKKLGKGRF YIAFKHMTPHLLPQLFVGMVLLFPHAILHEASITFLGFGLPPEQPAVGVILSESMKYLVM GKWWLALFPGLLLVFVVVLFHFIGDTLSRLLDPAQAHL >gi|157101619|gb|DS480705.1| GENE 8 6323 - 7303 213 326 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855436|ref|ZP_02478201.1| 30S ribosomal protein S21 [Haemophilus parasuis 29755] # 47 319 20 310 320 86 25 2e-16 MSWKQAGVNFIRMAVLLVLVSMAAFFLVSVSPLDPLTTNVGQAALGSMSSEQIARLQEYW GVNTPPVTRYLAWAGDFLKGDMGISLLYRRPVAQVIGEKLANSLWIMAAAWILSGLIGFL MGIIAGAKKGKTADRIISSYALVTASTPAFWVALVLLVVFAVWLKLLPIGLSVPIGVEAS 
AVTMKDRLIHGILPAAALSITGISNIALHTREKMAQVMESDYVLFARARGESEWSIIRRH GIRNILLPAMTLQFASVSEIFGGSVLVEQVFSYPGLGQAAVTAGLGGDVPLLMGITIISA AIVFLGNFTANLLYGTVDPRIRRSRA >gi|157101619|gb|DS480705.1| GENE 9 7343 - 9067 1688 574 aa, chain - ## HITS:1 COG:MA1915 KEGG:ns NR:ns ## COG: MA1915 COG0747 # Protein_GI_number: 20090764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 32 571 9 548 553 394 42.0 1e-109 MKYAVAAMAAVSMAVTGCAGSGTGNSAKDSGISAAADSAGSSAGNRAGDIAGDSAGNRAE NSAGKGAGETGNNGTRDDVVVVMGPTSEPEAGFDPAYGWGAGEHVHEPLIQSTLTVTTAD LKIGYDLATSMEVSQDGLTWTVNIRDDVNFTDGEKLTARDVAFTYNTLRDTSSVNDFTML ESAEALDDTTVVFHMKRPYSIWPYTMAIVGIVPEHAYGADYGSHPIGSGRYIMKQWDKGQ QVIFEANPDYYGTEPKMKKVTILFMEEDAAYAAVMSGQVDLAYTAASYSDQTLPGYELLS FETVDNRGFNLPAVKSGTVTDGKTVVGNDFTSDVQVRRAVNIGIDRNEMIDHVLNGYGSP AYSVCDKMPWYNESAKTEYDPEKAAELLEEAGWKAGADGIREKDGVRAGFTLMYPASDSV RQALAADTANQLKEVGIEVKIEGVGWDDAYDRAQTEPLMWGWGAHTPMELYNIYHTMKDT GLAEYSPYANDSVDRYMDEALASGNLEDSYELWKKAQWDGTAGVTQDGDIPWIWLVNIDH LYWSRDGLKVAEQKIHPHGHGWSIVNNVDQWSWE >gi|157101619|gb|DS480705.1| GENE 10 9549 - 10799 1389 416 aa, chain - ## HITS:1 COG:no KEGG:Apre_0526 NR:ns ## KEGG: Apre_0526 # Name: not_defined # Def: ABC-type nitrate/sulfonate/bicarbonate transport systems periplasmic components-like protein # Organism: A.prevotii # Pathway: ABC transporters [PATH:apr02010] # 88 384 51 345 357 167 32.0 1e-39 MKKTNAKRILAAALAGMMALSLSACGEQGGGSPTEGATTESASTAAVAAETKAEAGSAAE PQDSGKTGEEATGGAARPEAVSQEDWEAMQKEPAFGTTLNYLFNGGACVSAVYLAEALGY YEDYGINAEYIEGESVVITVGTGKCLWGTDHIATMLVPVTNGVDMTFVAGAHMGCKSIYV FNDSEIKTAEDLKGKTIAIHDGIGNSDQNIIYRMLDGEGIDPTSEVEYLDIADSAASVAA MESGEIDASIFSDYFVIANYRDKMRKVCSITPGDEFEGEVCCATAMNNDFLAKNPVHAKY IVMAIKRAGQYARLHSEDAVQLMFDTNKMTGEFENQLEFWDSLDFGLSDAITEEALGNIA ADYIRLGVIQKKELTADDVMKLAWTNACPDEEVPGLTVGDPKDVEGRTVEVQQKGE >gi|157101619|gb|DS480705.1| GENE 11 10888 - 11670 239 260 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| 
ribosomal protein L9 [Bacillus selenitireducens MLS10] # 28 238 37 258 329 96 31 2e-19 MAENTVILRLDNVSKSFAKVEHDEVTHALNEVNLSMKSGEFISLVGPSGCGKSTILRLVA GLIPPTTGYLTVNGAQITGPSPERGMMFQKPTLFPWLTVEKNISFSLKLQGKLKGNEEKV ERMLKIIGLESFRNDYPGQLSGGMAQRVSLVRSLINEPDILLLDEPLGALDAFTRMNMQD EILKVWQEKKQLALMVTHDVDEAIYMGTRVIVMDAHPGRVVSDIKIDQAYPRERSSQTFV ACRNEILNCLHFGGKNRGQR >gi|157101619|gb|DS480705.1| GENE 12 11670 - 12746 730 358 aa, chain - ## HITS:1 COG:YPO2287 KEGG:ns NR:ns ## COG: YPO2287 COG0600 # Protein_GI_number: 16122511 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Yersinia pestis # 150 356 65 271 273 112 30.0 1e-24 MSTTMEEKPVSGSSQINPLGHLTRQQLKELEQKEKTRGQKILDLVIMLLPVICGFIAVIE YWEVPNGSPNSHPYTYVWAVAAFMTAYALYALAAGIKYRKGDRRTAEDLRYRAPLFSAFF LLLTFYDYLTLKTGILSQPFVPCMNSILNIAWEDRAYLLECTLHTLRLLFLGYFIGIALG LVTGITCGYSEKARYWINPVIKFLGPIPTATWIPIIMVVAASLFRGAVFIIALGSWFAVT VASMSGIQNVDKDFFEAARTLGASEGQLVFRVAIPHAMPSILQGCTQAMSSSCVAIMIAE MMGVKAGLGWYMNWAKSWAAYDKMFAALFVICFIFTIVTKVLDLIKRRVLRWQNGVVK >gi|157101619|gb|DS480705.1| GENE 13 13016 - 14518 1454 500 aa, chain - ## HITS:1 COG:MA1313 KEGG:ns NR:ns ## COG: MA1313 COG3875 # Protein_GI_number: 20090175 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 2 259 56 290 468 92 29.0 1e-18 MKLSFEYGAGLMAAELPDNTDVFIPGETVADPPCIPEDQLVEKTLESIRNPMGMEPLSKL AHKGSKVTIIFPDRVKGGEQPTSHRKISIKLILKELYGAGVEKKDILLICSNGLHRKNTE TEIHNILGDELFHEFWHTHQIINHDSEDYEHLVDLGTTDRGDPVLMNKYVYDSDVAILIG HTQGNPYGGYSGGYKHCATGITHWRSIASHHVPEVMHRKDFTPVSGTSLMRTKFDEIGQY MEKCMGKKFFCCDAVLDTRSRQIEINSGYAKVMQPHSWITADKRTYVPWAEKKYDVMIFG MPQFFHYGEGMGTNPIMLMQAISAQVIRHKRIMSDNCVIICSSLCNGYFHDELWPYTREM YEMFQHDFMNTLPDMNRYGEYFATNEEYIRKYRYCNAFHPFHGFSMISCGHIAEMNTSAI YLCGAQEPGYARGMGLKTRATIEEALADARKKFVGQNPNILALSQTFKLGAVHLMMKDEA YEGKGQEDCGCACHMHGKLM >gi|157101619|gb|DS480705.1| GENE 14 14559 - 15398 1084 279 aa, chain - ## HITS:1 
COG:CT328 KEGG:ns NR:ns ## COG: CT328 COG0149 # Protein_GI_number: 15605051 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Chlamydia trachomatis # 56 273 65 269 274 102 32.0 8e-22 MKHIFLNLKRFDIPREYGGVNGVAPMEEWGSYIVQNTQEKLKAYSAEDVEFVMYFPEAHL IPAVKALCEDSPVKIGCQGVYRDDTAENGNFGAFTTNRTANAAKAMGCSSVIIGHCEERR DKAGILEEAGVTDEDAIGRLLNQEIKAAIQAGLTVLYCIGETAGEQEHWQEVLKSQLETG LKDVDKEKVVIAYEPIWAIGPGKTPPDEAYITKIGTYIKEMTGGMDVVYGGGLKTDNARM LASVPVMDGGLIALTRFQGQIGFYPEEYLEIVRTYLERD >gi|157101619|gb|DS480705.1| GENE 15 15395 - 16588 1270 397 aa, chain - ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 5 379 16 398 427 194 34.0 2e-49 MSANVLVVHGGGPTAVINASLYGVVEEAKSSGSIGRVYGAIGGSEGILKESFLDLMQYPE EKLSLLLQTPATAIGSSRYALKQEDYNAMAGIFRKYGIRYVLLNGGNGTMDTCGRIFEAC RGEDIYVVGIPKTIDNDIAITDHTPGYGSAARFIAASAAEVGADVKALPIHVCVMEAMGR NAGWITAASALARKKPGDAPHLIYLPERPFHEEEFLEDVKRLYDEKGGVVVVASEGLKNE KGEPIVPPIFKVGRATYYGDVSAYLANLVIQKLGIKARSEKPGLCGRASIAWQSPVDRDE AVLAGRQALRTAMAGRSGVMVGLIRDEKEDSGYHVHTSAIPIKEVMLHERVLPDEYINDR GNDVTDAFLKWCRPLIGPELRDFVDFKEEYEKMGVGK >gi|157101619|gb|DS480705.1| GENE 16 16629 - 17786 1409 385 aa, chain - ## HITS:1 COG:all4563 KEGG:ns NR:ns ## COG: all4563 COG0191 # Protein_GI_number: 17232055 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Nostoc sp. 
PCC 7120 # 1 385 1 349 359 196 35.0 7e-50 MTLIPLRPLMEASVKHGFAQGAFNVNAVAQAKAAIQVHEMFRSAAILQGADLANGFMGGR CDFMNATLEDKKKGAENIAGAVKKYGEDSPIPIVLHLDHGRDSDSCAAAIAGGYTSVMID GSSLAFDENVELTREVVKYAHARGVSVEGELGVLAGVEDHVFSEGSTYTNPLKAVEFFRK TGVDALAISYGTMHGASKGKNVKLRKEIAVAIRECLNHEGIFGALVSHGSSTVPAYIVDE INGLGGTLTGTYGIAVSELQEAARCGINKINVDTDIRLAVTRNMKEFFAGNPEKRNSSSI GAIYELLESKREQFDPRVFLTPIMDTVMTGVIPDEDTAAITDCIERGVKEVVGTLIVQFG SYGKAPLVEQVSLEEMAERYKKMQI >gi|157101619|gb|DS480705.1| GENE 17 17826 - 18680 955 284 aa, chain - ## HITS:1 COG:TM0273 KEGG:ns NR:ns ## COG: TM0273 COG0191 # Protein_GI_number: 15643043 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Thermotoga maritima # 2 284 6 315 315 157 32.0 3e-38 MLTTTYEMLQKAYEGHYAVPAINTQGGTYDIIRAVCMAAEELRSPVILAHYVNTGAYSGH DWFYETAKWMAGKVSVPVAIHLDHGDSFERCMEMLKLGFTSIMFDGSALPVEENAAATEA VARVCRSFGVPLEAEIGELCRLDDRGNKIGASNIADPDVVRQYLKLCHPDSLAIGIGNAH GFYSGPVDIRVEVLEECRKFTDIPFVLHGCTGMDEELVKRSIDSGVAKINFGTQVRCQYV NYLKEGLAEGKDQGHAWKLSQYAELRLREDIKDIIRLAGSKEKA >gi|157101619|gb|DS480705.1| GENE 18 18694 - 19710 1258 338 aa, chain - ## HITS:1 COG:HI0053 KEGG:ns NR:ns ## COG: HI0053 COG1063 # Protein_GI_number: 16272027 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Haemophilus influenzae # 1 334 5 338 342 283 43.0 4e-76 MKVLCLPEPGNLVIKDLPMPELKEGQAIVKMEMCGICGSDVTAYRGVNPTMRYPINGLGH EGVGIIQEIGENDKGLKPGDRVALEPYVPCNKCHMCAAGRFNNCADLHVRGVHKDGMMSE YFLHPVQLLYKLPDELTFTHAALVEPLTIGLHGATRARVSRGEHCVVFGAGIIGLMAAFA CINYGATPILVDVLQKRLDYAKELGVPCTFNSKDGNVEEYLREVTGGKLPEAMIDCTGAP VILENMHNYVCHGGRIALVGWPHDPVLINTVRLMQKEIDVCPSRNSNGKFPEAIGLVNEG KVPTDAIITKMIELDQVEDTIKDMIQSPSDYLKVIVNI >gi|157101619|gb|DS480705.1| GENE 19 19738 - 20496 200 252 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 1 214 1 218 312 81 27 
5e-15 MPEREDTRPIKVEVRNLTKRFGDLLVLDDMSFNIRKGEFVCVVGPTGCGKTTFLNCLTRI HMPSEGDLYIDGVPADPRKHNISFVFQEPSALPWLTVEDNLAYGLKIKRIPKAEIDRRVN QILDLMGLQEFRKAYPGELSVSAEQRIIIGRSFAMQPDLLLMDEPYGQMDVKMRFYLEDE VIRLWKELGSTVVFITHNIEEAVYLAERVLILSNKPAKIKEEVRIDLPRPRDITSSQFIQ YRNYITDKIKWW >gi|157101619|gb|DS480705.1| GENE 20 20510 - 21271 265 253 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 8 214 2 214 245 106 31 1e-22 MEQLNKRMEINGISKTFFSDKGYFTAIKDVSFDVNDGEFLVILGPGRCGKTVLLNIIAGL EQQTEGKVVYNGREWKGVNPEISMVFQKLALMPFKTVMENVELGLKFRGMSKGQRREIAQ HYIELVGLKGFEKSYPTQLSGGMKQRVGIARAYAADPKLLIMDEPFGQLDAQTRYQMQEE ILRIWEKEKRTVIFVTNNIEEACYLGDRIILLSDCPATVKEVYPISIPRPRDMVSGEFLK LRTVISDNTDLAI >gi|157101619|gb|DS480705.1| GENE 21 21259 - 22044 981 261 aa, chain - ## HITS:1 COG:mlr4519 KEGG:ns NR:ns ## COG: mlr4519 COG0600 # Protein_GI_number: 13473801 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Mesorhizobium loti # 19 259 39 282 286 150 31.0 3e-36 MDVASKQQKTLKNVLYYGLPVLAILMIAVGWVLFSGSHPELMPAPADVWARFVKTFTKPI AKTSLAGHAWASLRRVLMALLIAWAFGISFGILIGWNKKCKAFFGSIFEVIRPIPPIAWI PIVIMWFGIGEFPKVLLVFIGTFVPLVINTSTGIEMVDKINLDVGHVFGGNNRQILTDIV IPTALPSIFAGIRTSVSSGWTTVLAAEMLGAQKGLGALVTRGWQGSDMALVLVSVITIAI IGALLSLGLQKLEKVVCPWNN >gi|157101619|gb|DS480705.1| GENE 22 22060 - 22893 933 277 aa, chain - ## HITS:1 COG:mlr4519 KEGG:ns NR:ns ## COG: mlr4519 COG0600 # Protein_GI_number: 13473801 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Mesorhizobium loti # 21 273 31 282 286 160 35.0 3e-39 MATGNEKLEAGLLAWKKQKSKEGLNYALLSAAGIITLLAVWQLVVSLGLVNERMLPAPTQ IFETLMYKFANKAPDGNVLWVNILASLQVALSGFLAAIIIGVPLELVMGWWTYADRFIRP IFELVRPVPPIAWIPLVVVWMGIGLKAKALIIFFTAFVPCVINSYTGIKLTNKTLINVSR TFGASNAEIFWKVGVPSSLPMVFAGIRVALGNSWSTLVAAEMLAASAGLGYMIQIGRTVA 
RPDIVIVGMVVIGAIGAILSGFLSRAEKYFLRWKVNR >gi|157101619|gb|DS480705.1| GENE 23 22914 - 24077 1547 387 aa, chain - ## HITS:1 COG:BS_ssuA KEGG:ns NR:ns ## COG: BS_ssuA COG0715 # Protein_GI_number: 16077949 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 1 302 1 263 332 70 26.0 6e-12 MRKSLVKTLAVTMAAVMCAGMVSGCGAGSESKKADTQAEASGTGDSKTADKSYELTVSGI GGSLNYLPIYIAEQEGWFKEKGLDIEEVLFTNGPVQMESLSSDGWDIGCTGVGGVFAGVL GYDALVLGSSNTDDGTQYVFARNDSDIVKAGKGHNSLSDEIYGDADSWKGKSILCNTGSV TQYVLIKVLEGFGLTVDDVNFIAMDPATAYSAFLAGEGDVCVLTGAGGTFKMLAESDKYI PVASGPMADSGLMCSFVANKNSYADPDKYEAMKVFMEVYFKTMDWMKENKEKAIDYCVDM NDENGSSMDRETTAKYLEADTYYTLQEACGMLNDKAEGSDHSVMDQRLLDVLDFFISVGN YKEGDQERFVGHTDGKLLNEVLTETAN >gi|157101619|gb|DS480705.1| GENE 24 24519 - 25205 687 228 aa, chain + ## HITS:1 COG:BH3241 KEGG:ns NR:ns ## COG: BH3241 COG1609 # Protein_GI_number: 15615803 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 2 221 4 217 333 78 31.0 8e-15 MTLKEIAREAGVSISTVSRVINKNSTNVASKEVQDRIWEIVRRTGYTPNATARSLKTGAA KAADSPTRSIACLYARAEDSNSDLFFSTLARSIEKEAFKHNYVLKYSFTGIDIHHPNTFR LITDNHVDGVVVLGRCDKQTLSFLKKYFNCVAYTGLNLLEAKYDQVICDGYQASLAAMNT LLGLGHTRIGFIGETQFEDRYTGYCAALSAHNLRPAKSYIVNVPLSSE >gi|157101619|gb|DS480705.1| GENE 25 25293 - 25589 262 98 aa, chain + ## HITS:1 COG:BH2227 KEGG:ns NR:ns ## COG: BH2227 COG1609 # Protein_GI_number: 15614790 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 86 259 343 347 69 38.0 2e-12 MRAIKEAGLVIPDDLSVISIDDIDTAQYLSPMLTTIHIPVEEMGQMTAKILIDRIEEGHK LPIKMNLPFYLANRESCAPYAEKTQGPEHEHTMPRKED >gi|157101619|gb|DS480705.1| GENE 26 25593 - 26474 526 293 aa, chain + ## HITS:1 COG:CAC3032 KEGG:ns NR:ns ## COG: CAC3032 COG2017 # Protein_GI_number: 15896283 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: 
Clostridium acetobutylicum # 1 289 1 296 298 186 33.0 3e-47 MLITLTDSKTTAVIDSTGAQLISLKDASGCEYIWQRDAKYWKKCSPLLFPVVGNCRNDRT ILEDRIYAIEKHGFCRERDFDVSQKSPAKAVFSMDDTPDTHRAYPYAFCLSLAYELKDGI LFMEYQVENRDQRDMWYAIGAHPGFNCPMEEGFAFEDYQLVFEKEENTVSIPYDLDQLHF SPSKPGTRLRGRTLSLKREMFRNDAVFFDKLNSRAVSILNPATGHGVEVGFPGFETVAFW TLYPEPAPYLCVEPWNGSGIYENEDDQLSHRHHIQHLCPGDSCSYIMTIRILG >gi|157101619|gb|DS480705.1| GENE 27 26907 - 27248 262 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941510|ref|ZP_02088844.1| ## NR: gi|160941510|ref|ZP_02088844.1| hypothetical protein CLOBOL_06400 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06400 [Clostridium bolteae ATCC BAA-613] # 1 113 1 113 113 219 100.0 4e-56 MKYERETCNLLRRLGVNNSYVGFRYTVYGVIRAIANPELLIYISKGLYVEISAEYHTSIG CVERNIRTIISTIWLHGDRTLLNQVFGFELEQKPRNGAFIDALSHYVVEHYYD Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:47:55 2011 Seq name: gi|157101618|gb|DS480706.1| Clostridium bolteae ATCC BAA-613 Scfld_02_47 genomic scaffold, whole genome shotgun sequence Length of sequence - 25402 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 5, operones - 5 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 34 - 107 67.3 # His GTG 0 0 - TRNA 144 - 217 84.4 # Arg TCT 0 0 - Term 377 - 414 -0.9 1 1 Op 1 1/0.000 - CDS 468 - 1403 915 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF - Prom 1430 - 1489 1.8 - Term 1469 - 1509 6.6 2 1 Op 2 2/0.000 - CDS 1581 - 3263 2060 ## COG4166 ABC-type oligopeptide transport system, periplasmic component 3 1 Op 3 44/0.000 - CDS 3461 - 4447 896 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 4 1 Op 4 44/0.000 - CDS 4447 - 5451 441 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 5 1 Op 5 49/0.000 - CDS 5456 - 6400 1277 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 6 1 Op 6 . 
- CDS 6410 - 7336 975 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Prom 7450 - 7509 9.8 + Prom 7483 - 7542 8.0 7 2 Op 1 1/0.000 + CDS 7620 - 8795 1280 ## COG0787 Alanine racemase 8 2 Op 2 . + CDS 8792 - 9982 952 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 9 2 Op 3 . + CDS 10012 - 10806 952 ## COG2362 D-aminopeptidase + Term 10831 - 10881 10.4 - Term 10819 - 10869 9.6 10 3 Op 1 . - CDS 10887 - 11933 408 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 11 3 Op 2 . - CDS 11923 - 12876 1180 ## CDR20291_2766 2-keto-3-deoxygluconate permease 12 3 Op 3 . - CDS 12879 - 14021 1309 ## COG1929 Glycerate kinase 13 3 Op 4 . - CDS 14037 - 15314 1468 ## COG3875 Uncharacterized conserved protein 14 3 Op 5 . - CDS 15318 - 16547 1220 ## COG3395 Uncharacterized protein conserved in bacteria - Prom 16609 - 16668 5.4 + Prom 16610 - 16669 6.6 15 4 Op 1 . + CDS 16769 - 18574 1553 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 16 4 Op 2 . + CDS 18654 - 19238 537 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 19277 - 19324 12.1 - Term 19262 - 19313 12.1 17 5 Op 1 . - CDS 19369 - 20589 1003 ## COG0006 Xaa-Pro aminopeptidase 18 5 Op 2 1/0.000 - CDS 20605 - 22212 1818 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID 19 5 Op 3 9/0.000 - CDS 22250 - 22726 533 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 20 5 Op 4 . - CDS 22807 - 23211 375 ## COG2893 Phosphotransferase system, mannose/fructose-specific component IIA 21 5 Op 5 . 
- CDS 23213 - 25102 1374 ## COG3711 Transcriptional antiterminator - Prom 25272 - 25331 5.5 Predicted protein(s) >gi|157101618|gb|DS480706.1| GENE 1 468 - 1403 915 311 aa, chain - ## HITS:1 COG:CAC0293 KEGG:ns NR:ns ## COG: CAC0293 COG1619 # Protein_GI_number: 15893585 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Clostridium acetobutylicum # 5 305 4 298 306 219 38.0 6e-57 MVFPERLREGDTVGLVSPAFPVKEEERDGCVKLLEGMGYRVKLGTCLENMYNFHNYLAGD AGARGEDINRMFADPQVKAVFCVRGGYGSSQIMKYLDFDLVKQNPKIFVGYSDITNLHSA FQMFGNLVTFHGPMVCSNMLKDFDAYTRSGLFAALNMEEELEFRNPPGEEGFKTIRGGEA EGILAGGNISVLARACGTFYQLDTRDKILFLEDVEEGIASLDMYITQMEYAGMFEHAAGI LLGDFTDCTNDRYDGTYQIEEFLHDRFGTFDLPVMYHIRSGHDKPMGTLPFGTMCRMDGD NKRLRFYRQLR >gi|157101618|gb|DS480706.1| GENE 2 1581 - 3263 2060 560 aa, chain - ## HITS:1 COG:CAC3634 KEGG:ns NR:ns ## COG: CAC3634 COG4166 # Protein_GI_number: 15896868 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Clostridium acetobutylicum # 57 554 41 543 550 327 36.0 3e-89 MRKNVKVLLSLVMAAAMLAGCGGQSSEKPAENGASGQTTQASAGKEDGAGSEGGKVMTYA MQKEPETLDPTMNNYATSSIVLQNLFTGLMQIGPDGGLINGCAEKYEVSGDGLEYTFTLR DGLKWSDGSPLTAGDFEYAWKRTLAQDTASPGAWYLFYLKNGEAYNEGKASAEDVGVKAE DDKTLKVTLENPTAYFIDLTAVTAYFPVKKDAVEGPEAWTKSADTYVSNGAFRLKEINPQ ASYVLEKNPEYIDADTVKLDGVNIVFIESAEAALSAYNAGEVDVVDNTIIQTQAQSQYGG SDELKSYDLIGTCYYDFNCEKDYLSDARVRKALAMSLNRDVINQSIVASRPESSYAFVPH GIPYEGSSEDYRTTVGNLFTEDVEAAKALMEEAGYPGGEGYPTLTLITQNDQEKKDVAQA MQAMWKENLGVNVEIVTFEPKVYWDEQTAGNFDICYDGWTGDYPDPSTNLDCFLLQRNET QCRWLNDQAKEYDSMMKEARSLTDNKKRMELFEAGEKLLMEEMPILPLYYRNAQLLVKPG CEGVIKTYIGHTIFKYADKQ >gi|157101618|gb|DS480706.1| GENE 3 3461 - 4447 896 328 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 328 1 328 329 349 52 9e-96 MAQCTDTNNCIEVNDLKMHFMLKSGLFSKPKVLKAVDGVTFSIEKGRTMGLVGESGCGKT 
TVGRTILKLYQATAGRILYQGTDITGLSDSQMVPYRKKMQMIFQDPYTSLDPRKNIGDII AEPIWAMKLHTGKDKNDRVRELIEMVGLKPDHINRYPHEFSGGQRQRIGIARALAAEPEF IVCDEPISALDVSIQAQVINILEELQDKFHFTYLFISHDLSMVRHISSEVGVMYLGNLIE YAQVDELYDNMLHPYTQSLISAVPVADPKLAKENQRIILEGDVPSPINPAPGCPFRSRCR YAKDICGQVKPELKAVNDKHKVACHLFD >gi|157101618|gb|DS480706.1| GENE 4 4447 - 5451 441 334 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 1 277 1 277 563 174 33 5e-43 MNMMEDTILEVENLRTSFATDAGSVQSVRGITFHVGKGESLGIVGESGCGKSVTMLSIMG LLEDNASRQANALLFDGEDLLKKSPREMRKIQGNRIGMIFQDPMTSLNPLFTVGEQIRGP LMRHQKLSRKEAEKKALVMLEAVGLPSPERRLKQYPHELSGGMRQRVMIAIAMCCKPELL IADEPTTALDVTIQAQILELMAHMKNEFNTSVILITHDLGVISSLCTRVIVMYGGLIMEE GKIEDIFYRTGHPYTAGLLASIPKRTKEKLVPIFGTPPDLLNPPKGCPFAARCSRAMKLC AVHQPPFCDLGNGHVSACWLHNESVKARMGEVKL >gi|157101618|gb|DS480706.1| GENE 5 5456 - 6400 1277 314 aa, chain - ## HITS:1 COG:CAC3637 KEGG:ns NR:ns ## COG: CAC3637 COG1173 # Protein_GI_number: 15896871 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Clostridium acetobutylicum # 13 314 4 305 305 294 50.0 1e-79 MEERVEFTMSDVIPEELLAGLSDTDREMESINRPSISYWRNCWIRLKKDKLAMLGIAIVV LMTLAAIFVPMFSPYTYDQTDFGNALQWPNSAHLFGTDKMGRDIFVRTMYGARISLSIGF AAAAINMVIGVLYGGISGYVGGTTDIIMMRVVDILTGIPSLIYMILIMMFLGNTIQSILI AMCLTYWITTARMVRAQILTLREQDFALAAKVCGLSKWQILIHHLIPNSMGSIIVTVTFL IPSAIFQEAFLSFLGIGIQVPKASWGTLANDAIEYLFSYPYQMLFPALAISITIFALNFI GDGLRDALDPRLKK >gi|157101618|gb|DS480706.1| GENE 6 6410 - 7336 975 308 aa, chain - ## HITS:1 COG:CAC3638 KEGG:ns NR:ns ## COG: CAC3638 COG0601 # Protein_GI_number: 15896872 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Clostridium acetobutylicum # 1 306 1 304 306 282 44.0 7e-76 
MAKYIIKRLVAGVLSLFILITITFFLMHVIPGGPFSPSEQRNVPEKILEQISEKYGLNDP LPAQYVRYLGNLLHGDMGTSFKKQDTTVNELIANGFPVSAKVGALGIAVALAAGIPLGIV AAVKRGKLADGASMVLATIGVSVPSFVLCVLMMYVFCEKWKIFPSYGLTSWKHYVLPVFC MAFSQVAYITRLMRSSMLETMRQDYIRTERSKGVPEWEVISKYALKNSILPVVTYVGPLV ATLLTGTFIIEKLFSIPGLGRYFISAITDRDYSVTLGLTVFLGVMIIGCNLIVDIMYAVI DPRVKITE >gi|157101618|gb|DS480706.1| GENE 7 7620 - 8795 1280 391 aa, chain + ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 10 378 5 378 386 276 38.0 4e-74 MEIPVLNEIMRDTVVEVDLDRIAFNVRQIKAMAGPGTQVAAVVKADGYGHGALGIAPAIM ENGASLLAVATLSEAVELKQAYPGYPVLIMGLTPDRLLPYVLEYGIIQTIDTLAQARLLN RLASEKKQQAVIHIKYDTGFHRIGFPDCPESLDEIRQICSMPWLKPEGIYSHLALKDDQS NQVQFRCFMDAVNELESEGCHFRYKHIADSIAAVDFPEYRLDMIRAGAIVYGLKGFHKGQ LDIRQALTFKTRISHITPVLKGQGVSYDYLWTAPQDTRVAALPFGYADGYPRNLRGKAMV TLRGKQVPVIGVICMDQCMIDLTDVPEAQIGDQVIIYGDGSGNTLDIDAVSRLAGTNKNE IVARLTRRPPRIYVRGNDSGTGYSSAERMTP >gi|157101618|gb|DS480706.1| GENE 8 8792 - 9982 952 396 aa, chain + ## HITS:1 COG:CAC1014 KEGG:ns NR:ns ## COG: CAC1014 COG1473 # Protein_GI_number: 15894301 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Clostridium acetobutylicum # 1 395 1 394 396 326 44.0 4e-89 MNIKQRTEELYPQLVRIRRELHQHPEPGFKEHWTSAYICGLLDEWGISYEFPVAGTGIVA MIQGEKPGSGNTVALRADMDALPLTEDIARPFCSLHPGHMHACGHDAHVAIALGTARILM ESRSEWSGCVKFFFQPAEETTGGALPMVEAGCMESPHVDYVTGLHVMPTHETGEIEIRYG DLNASSDEIHIKVHGKSCHGAYPDTGVDAIVMAAAVINSLQTLVSRCISPLDSAVLTLGT IHGGTAGNIVAAETEMTGTLRTTDPAVREKIIDAMTRQVSCTCQAMGGSGEVVIVPGYAA LINSDEIVDVLVETAEEVLGSQHIHPKKFPSLGVEDFSFFLEKAKGVFYHLGCANREKGI TAPLHSQDFDIDEECLKLGVQLQAQLALKLLKRDIA >gi|157101618|gb|DS480706.1| GENE 9 10012 - 10806 952 264 aa, chain + ## HITS:1 COG:mll6661 KEGG:ns NR:ns ## COG: mll6661 COG2362 # Protein_GI_number: 13475561 # Func_class: E Amino acid transport and metabolism # Function: 
D-aminopeptidase # Organism: Mesorhizobium loti # 1 264 1 265 265 186 37.0 4e-47 MKVYISVDFEGVAGSTSWSSTNLGDLEHGPMAREMTLEAAAACRGALAAGATEIYIKDAH ESGRNMDISLLPKEAKIILGWKYSPDSMVCGLDQSFDALMFVGYHSPAGTNGSPLAHTMN RKTNYIKFNGKLASEFLMHAYVGASLGVPSVFISGDKNLCSHVHEYDPGITAVAVKEGMG AATVNLAPEMAQELIEEGAKNALLTKDTCHLEVPETIVMEINFKDHFQALRASYYPGMVM TDDFTVSYTARSINELMTARMFVL >gi|157101618|gb|DS480706.1| GENE 10 10887 - 11933 408 348 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 2 344 4 331 346 161 28 3e-39 MEDRKIIGITMGDPASIGPEITVKALADQSIYEACRPIVVGDACVLEAALKIAGHEEMKV RAVKDVRDAVFTHGTIDVLDMGLVSMDELVRGQVSAMCGDAAFRYVEKVIELAMDGQVDA TVTNAFNKEAVNLAGHHYSGHTEIYADLTGTKNYTMMLAHENMRVVHVSTHVSLREACDR VKKQRVLDVIRIGDKACREMGIENPRIAVAGLNPHSGEHGLFGQEEIQEIIPAIEAARAE GIQAEGPVPPDTVFSKARGGWYDIVVAMYHDQGHIPLKVVGFVYDQEKQAWDAVAGVNIT LGLPIIRVSVDHGTAFDQAGTGTASELSLKNSLEYAVMFARNRKQKSV >gi|157101618|gb|DS480706.1| GENE 11 11923 - 12876 1180 317 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2766 NR:ns ## KEGG: CDR20291_2766 # Name: kdgT # Def: 2-keto-3-deoxygluconate permease # Organism: C.difficile_R20291 # Pathway: not_defined # 1 315 1 312 324 313 57.0 5e-84 MKILQAVKKIPGGLMVVPLLLGAIVNTFFGNVLWSAFDGTFTTYLWKSGAMPILAVFIFC NATTINFKKAGVTIYKGVVITAVKVGIGIIIGLICGTLFGEKGFLGLSTLAIVGALANSN GGLYAALAGEYGDATDVGAVSILSINDGPFFTMVALGAAGVAKIPMSILIGCIVPVIVGC ILGNLDEDIRKFCEPGATMLIPFFAFPLGAGLNIMNLLKAGGPGILLGVACTLITGLGGY LIYKLLRMEYPEVGGAIGTTAGNAAATPASVAAVDATLLAVAEAATAQITAAIIVTAILC PIFVSYLHKVEVRKRGR >gi|157101618|gb|DS480706.1| GENE 12 12879 - 14021 1309 380 aa, chain - ## HITS:1 COG:L79277 KEGG:ns NR:ns ## COG: L79277 COG1929 # Protein_GI_number: 15672838 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Lactococcus lactis # 2 373 4 376 385 268 43.0 2e-71 MKFILIPDSFKGTLSSSQICAVMDEKIKKHFPDSLTVSIPVADGGEGSVDAFITAVGGVK EYVTVKNPYFEDMESYFGLIDAGQTAVIEMASCAGLPLVEDRKDPRLTTTYGVGQLILAA 
ARKGVKKIIVGLGGSSTNDGGCGAAAAVGIRFYDRDGRQFIPTGGTTADIDRIDLSGRDS LLEQVEIVTMCDIDNPMYGPVGASFIFGPQKGADEVMVLQLDEGIRNLSRVIAQATGTDI SQVPGTGAAGAMGAGMIAFFGSRLQMGIQTVLDTVRFDEIIGDADYILTGEGKLDSQSLR GKVVIGIAERAKKQAKSVIAVVGGADDDEIGKAYDMGVTAVFPINRLPQDFSVSRHRSQE NLAYTVDNIIRLIKAGQGEH >gi|157101618|gb|DS480706.1| GENE 13 14037 - 15314 1468 425 aa, chain - ## HITS:1 COG:CAC0769 KEGG:ns NR:ns ## COG: CAC0769 COG3875 # Protein_GI_number: 15894056 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 6 423 8 425 426 425 52.0 1e-119 MREFLLPYGKETLKAEIEEEHLAGVLVSELHDYKAPMEGTQLVQEALEHPIGTPRLCDMA IDKKKIVVISSDHTRPVPSHIIMPLILKEIRRGNPDADITILISTGLHRETTREELESKF GPEITEHETIIVHDCDDTDNMVYLGKLPSGGNMYINRLAVEADLLVAEGFIEPHFFAGFS GGRKSVLPGVASRETVMYNHNSAFIDDLHSRTGIIEGNPIHNDMLYAARVAGLDFIVNVV IDSAHDPIFAVAGDCDLAHRAGRDFLASKCQVDAVPADIVISTNGGYPLDQNIYQSVKGM TAAEVTVKEGGVIIMLSKAADGHGGKYFHETFRDEKDLNRMMKTFMDRKPEETIIDQWQS QIFARVMLKARIVFVSSCDDKLVEELHMIPAHTMEEALNKAKAMVNKEDYKVTVIPDGVS VIVRG >gi|157101618|gb|DS480706.1| GENE 14 15318 - 16547 1220 409 aa, chain - ## HITS:1 COG:STM0162 KEGG:ns NR:ns ## COG: STM0162 COG3395 # Protein_GI_number: 16763552 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 4 405 2 418 423 172 32.0 1e-42 MEERYLIIADDFTGANDTGVQLKRRGYPTEVLFAGKPVDSQKSIVIDTESRNALPGHAYE IVRHSLEQVDFGQFRYIIKKVDSTMRGNIAQEIKAVDDAFCPELVVFAPALPALGRTTMD GVQCLNGVEICKTELSKDPKNPVMEDNLVKLLGQVYEEPVLLKYLPDVRSQSFSLDGGRI FVCDAESDDDLKRIIAAVRQDGRRALYVGTAGMADNMMELEKPTLPSFGVVASISSVANE QMHYCERAGYAMVKVPVAQLMTGTARPEEYRDMAVEYLKSGRDTILLTDTAYDRELIELS QEVGEKSGMDLTEVGDYVRSLIGRMAKEVLDSVEVSGVYLTGGDTALGMLMNIGADGSEI LAEILVGVPLVRVKGGACEGLKLVTKAGAFGTEDAAAFAMRKIKERGGI >gi|157101618|gb|DS480706.1| GENE 15 16769 - 18574 1553 601 aa, chain + ## HITS:1 COG:CC3315 KEGG:ns NR:ns ## COG: CC3315 COG2204 # Protein_GI_number: 16127545 # Func_class: T Signal transduction mechanisms # Function: Response regulator 
containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Caulobacter vibrioides # 252 597 107 473 488 83 22.0 1e-15 MKRYAGQRDDIQLTAMLGNVETGLSLAKEHYRNYDIIISRANTASRIAKGVPIPVIDIGI DYYDVLLCLKTAENTKTKFAVLGFRSLTTIAKSICNVLKIPTDIFSINTTSEASSLLDKL KEQGYKTVICDTVPYNYAKLIGITPILLTSSMESLKAAVDQAVYTWQTHRDLYHSLSLMH QLLDSSANRFLVLSRTGECIYTTLEDEIAVPIQAKLQNELEQCCTGRKRSFFITVNNQMY SITSHLVDETLSPYIIFHIMRSEIPLAYSKYGVTIMDKQAAENSFMDSFYSNTELAREIL TNTEDAAPSPSPLSLMITGEIGTGKDRVAHIYYAKSPLCDNPLYVINCSLISDKSWDFIT KHYNSPFTDNGNTIYISNLDSLQKPKQKQLLSIILDTNMHVRNHLIFSCTQTAGSATPHV VFEYTNALGCILVPIKPLREQKEDIISSAGLYINTLNQELGRQVVGVDDEAAALLEQYEY PYNRTQFKRILKEAVIRTDGPYICAKTIQEVIQRENVLFSGFPFPVHACSGSTKEDADAV PGNSRLAVSHPSPFCLDTSQSLDEMNRDIVHYVLNACKGNQTAAAKKLGISRTTLWRYLN R >gi|157101618|gb|DS480706.1| GENE 16 18654 - 19238 537 194 aa, chain + ## HITS:1 COG:all1011 KEGG:ns NR:ns ## COG: all1011 COG0110 # Protein_GI_number: 17228506 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Nostoc sp. 
PCC 7120 # 2 191 10 189 192 196 52.0 3e-50 MTEREKMLAGQLYDCGDAELLTQWHRAKDLVRDYNSTDSADTLKKNQILNELLGGRGEAL WITAPFFADYGNNIYFGNNCEVNMNCTFLDDNVIRIGNNALIAPNVQIYTAFHPTNAMDR FGVPQQDGSFAFCRTQTAPVVIGDNVWIGGGAIIMPGVTIGDNVVIGAGSVVTKDIPSNT VAYGNPCRVRRENR >gi|157101618|gb|DS480706.1| GENE 17 19369 - 20589 1003 406 aa, chain - ## HITS:1 COG:SA1530 KEGG:ns NR:ns ## COG: SA1530 COG0006 # Protein_GI_number: 15927285 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Staphylococcus aureus N315 # 19 393 13 354 358 130 26.0 4e-30 MTQEYKSFSPEEFRLHNEKLQAEMEKQDIDMLMLSTPENIYYSTGYRSWYTSSLFRPVYV LVPRKGDPAIILRILEKTTVQYTSWTSRIYCWGTASRNLGPLEGEEPISIIDRIIKEIQP DTGTIGLEAGDGMQYFWSMELLKKIMDSQPDIRFTDGSLAIQRARMVKTPWEIERIRHVC RITEQAILETGKTIVAGETTEKDISKGIAMRMARGGVDKISYLTVTSGIDKYCTFNTYAT DRVVQKGEYVLVDISGHIDGYASDLTRVFYLGTVPREEREMATTASGCVAAAKEAMKPGV SIAEINRICEGYIRDSRFGKFLLHSSGHCIGLNVVEYPTIHDEANEKLKPGMVFAVENGV YPYDLEKGVESIYLSFRMEDEVLVTEDGAEWITGPGEPVIEVGANH >gi|157101618|gb|DS480706.1| GENE 18 20605 - 22212 1818 535 aa, chain - ## HITS:1 COG:ECs2529 KEGG:ns NR:ns ## COG: ECs2529 COG3716 # Protein_GI_number: 15831783 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Escherichia coli O157:H7 # 267 533 16 286 286 167 39.0 4e-41 MSILTATLLSLLYFWGNSAFVLGVNWWTVMRPLVSGFLAGVILGDPVKGAMVGAQINILY LGFIGAGGALPGDICLAGVVGTTIAITGNLPVETAMALAVPVGLLGTIIWVVKMTVNTAW VRVAEKMSAKGDTRYYWIPNIVLPQLLLFLMSFIPCFLMVYFGTDYLKSVIQFLGENIVG VLTTIGGMLPAVGIALTLKSIFKGESVVFFFFGFLLVQYFGLDMISLGFSAVVFTLIYMQ LKGHKLSAMGGSLFGAEGNNENKYVLLDKKTIRKSWLRWIMFNQANYNYERMQGTGFCHA MVPVINKLYPDNQGKRAELMQNHMQFFNTEPQWGACIIGLTAALEEKRAQGSEEITGDTI TSIKSGLMGPLAGIGDTIDGGVVTPLLLTLFIGITNTGNIMGVIGYIIVEALFMWTIYWQ SYKLGYEKGSDAIVTIMESGLINQLILGASIMGCLVLGGLVGNYVTLGLKLMVPVGGGVM FNIQEQLFDVILPGALPLLLTLGTYKLVKKGWSSVNIIILVAVVGLAGGLLGIFA >gi|157101618|gb|DS480706.1| GENE 19 22250 - 22726 533 158 aa, chain - ## HITS:1 
COG:lin0021 KEGG:ns NR:ns ## COG: lin0021 COG3444 # Protein_GI_number: 16799100 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Listeria innocua # 8 158 10 160 162 98 33.0 4e-21 MKNIVLTRIDDRLIHGQVVTAWIKQYPINKILIIDDELSQNRLMERIYKAAAPMGVEVLI QSVSEAREFLKEEPVKGENFLILVKVPEIIESLLKEGIEIKKVILGGMGAKNGRKTFNRN VSASGEEVECFKRIVEGGVEIFYQLVPNDKAVNIRSLF >gi|157101618|gb|DS480706.1| GENE 20 22807 - 23211 375 134 aa, chain - ## HITS:1 COG:CAP0066_1 KEGG:ns NR:ns ## COG: CAP0066_1 COG2893 # Protein_GI_number: 15004770 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose-specific component IIA # Organism: Clostridium acetobutylicum # 1 124 1 126 157 70 33.0 1e-12 MYKIIVFTHGSLAESLVRTSRLILGNQPDIETYCVEPGCNLEEMRNGVEQSIRRSNDSGQ EVLVLTDLMYGTPFNTMIQLEEECSFTHITGTNLPLLIEAINRRLLDGNSRSFAGLVDTA KEGIVDSHVLLQMS >gi|157101618|gb|DS480706.1| GENE 21 23213 - 25102 1374 629 aa, chain - ## HITS:1 COG:BS_yjdC_1 KEGG:ns NR:ns ## COG: BS_yjdC_1 COG3711 # Protein_GI_number: 16078265 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 519 4 506 510 96 23.0 2e-19 MNSRQIAFLRLLLEHEEYLPVGFYAGRMDVSDKTLRRMIHGVNEILASYNGVIESRPGTG IRLEINEEERERLMNSAYMMELMDSGALSRSWNQLSRRMDIALNLLLYSDEATSLSGLAY KYYVSKSSIAGDLKALVPFAGKHELRIIQGHGGTSVEGTESCLRKALAELLLYILDNNIN ANTRGNPAATGMFESETLMTILDIFTEEDLNFVEGLIKHIETSAGYRFDEREYMEFSVNL LVMIYRVRNGFLMEPVLKSNYRKQERDPLDAIAFELASRLSGSYRYTLSVSETSHIYNVL AATHLGNFLMQKELPEEESRKTAVAFGEDFIDAFSVITGINLRTKSTFYVNVISHITLML NRAASSTPARNPIIDILLENYKGTINVCQIICRILTEKFRLPEISFDEICYLMLYIQGEL LADEEKMDVILVSNMSNSITNILKHKLSQNYPQWTVVSSDYNHFLEMSQKQYDLILSTVP LGSREHVIPYALISPLLDEKDCSAINNLLKSCRHREDLYLRELMRARNDLYDIGCAVEVR NRKPMEIPVTGFLKVTALKEVEFVYVHNEAGINRCQFVTDLLQKKLDKVIMDMSNWDFML FASKMVYLMDNCPDWAMTEFIQNIITEGR Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:48:12 2011 Seq name: 
gi|157101617|gb|DS480707.1| Clostridium bolteae ATCC BAA-613 Scfld_02_48 genomic scaffold, whole genome shotgun sequence Length of sequence - 31016 bp Number of predicted genes - 31, with homology - 30 Number of transcription units - 13, operones - 5 average op.length - 4.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 41 - 811 513 ## BVU_3509 putative arginase - Prom 944 - 1003 7.7 + Prom 946 - 1005 7.8 2 2 Tu 1 . + CDS 1085 - 1714 691 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Term 1751 - 1810 15.6 - Term 1739 - 1798 15.6 3 3 Op 1 . - CDS 1822 - 2538 848 ## COG2949 Uncharacterized membrane protein 4 3 Op 2 . - CDS 2643 - 3011 411 ## COG0251 Putative translation initiation inhibitor, yjgF family + Prom 3274 - 3333 11.4 5 4 Op 1 . + CDS 3389 - 4549 724 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 6 4 Op 2 31/0.000 + CDS 4563 - 5486 925 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 7 4 Op 3 34/0.000 + CDS 5508 - 6212 569 ## COG0765 ABC-type amino acid transport system, permease component 8 4 Op 4 . + CDS 6224 - 6955 553 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 9 4 Op 5 . + CDS 6969 - 7892 717 ## COG0524 Sugar kinases, ribokinase family 10 4 Op 6 . + CDS 7971 - 8444 390 ## COG1846 Transcriptional regulators 11 4 Op 7 . + CDS 8452 - 9276 803 ## COG0434 Predicted TIM-barrel enzyme + Term 9313 - 9351 7.2 - Term 9296 - 9343 10.3 12 5 Op 1 . - CDS 9387 - 10697 1300 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase 13 5 Op 2 . - CDS 10791 - 10946 61 ## gi|160941554|ref|ZP_02088886.1| hypothetical protein CLOBOL_06452 14 5 Op 3 . - CDS 10889 - 12328 1584 ## COG0442 Prolyl-tRNA synthetase - Prom 12361 - 12420 2.8 15 6 Tu 1 . - CDS 12783 - 13019 287 ## gi|160941556|ref|ZP_02088888.1| hypothetical protein CLOBOL_06454 16 7 Tu 1 . 
- CDS 13153 - 14232 1131 ## COG0628 Predicted permease 17 8 Op 1 16/0.000 - CDS 14333 - 15373 1332 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 15394 - 15453 5.3 18 8 Op 2 21/0.000 - CDS 15538 - 16524 1204 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 19 8 Op 3 9/0.000 - CDS 16493 - 17995 1772 ## COG1129 ABC-type sugar transport system, ATPase component 20 8 Op 4 . - CDS 18052 - 18450 518 ## COG1869 ABC-type ribose transport system, auxiliary component - Prom 18485 - 18544 3.1 + Prom 18383 - 18442 2.5 21 9 Tu 1 . + CDS 18489 - 18677 105 ## 22 10 Op 1 1/0.000 - CDS 18715 - 19665 1146 ## COG0524 Sugar kinases, ribokinase family 23 10 Op 2 1/0.000 - CDS 19812 - 20447 753 ## COG0274 Deoxyribose-phosphate aldolase - Prom 20481 - 20540 7.1 24 10 Op 3 2/0.000 - CDS 20603 - 21625 1059 ## COG1609 Transcriptional regulators - Term 21880 - 21917 7.8 25 10 Op 4 2/0.000 - CDS 22007 - 23368 1667 ## COG2610 H+/gluconate symporter and related permeases - Prom 23442 - 23501 2.0 26 10 Op 5 3/0.000 - CDS 23503 - 25224 1751 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase - Prom 25470 - 25529 8.7 27 10 Op 6 1/0.000 - CDS 25745 - 26467 845 ## COG2186 Transcriptional regulators - Prom 26512 - 26571 7.2 - Term 26516 - 26550 1.7 28 10 Op 7 . - CDS 26652 - 27041 388 ## COG0346 Lactoylglutathione lyase and related lyases - Prom 27063 - 27122 4.4 - Term 27118 - 27171 15.0 29 11 Tu 1 . - CDS 27204 - 28565 858 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 + Prom 29185 - 29244 4.1 30 12 Tu 1 . + CDS 29313 - 29693 259 ## ELI_0507 hypothetical protein - Term 29740 - 29772 3.0 31 13 Tu 1 . 
- CDS 29790 - 31016 691 ## COG3436 Transposase and inactivated derivatives Predicted protein(s) >gi|157101617|gb|DS480707.1| GENE 1 41 - 811 513 256 aa, chain - ## HITS:1 COG:no KEGG:BVU_3509 NR:ns ## KEGG: BVU_3509 # Name: not_defined # Def: putative arginase # Organism: B.vulgatus # Pathway: not_defined # 12 243 21 262 269 178 40.0 2e-43 MLNNKKLKNNNIVIMNFTGVYENQEFYQDLGAVWLELGDIQGTNCYCDEEAEAAIKERIS LMEPSGIHFLDSGNYHYVSKIWLDKIEEEFELLVFDHHTDMQMPMFGNILSCGGWIQAAL DTNTRLKRVYLAGPPSMEAEADRERVVGINEEELKTPGCISRHLKNSGLPLYISLDKDIL TRSCAITNWDQGEAQLEDVLACIKEAASCRSIIGVDVCGENPDDQEQEGNRASQTNQNTN RKIICKLLEICDVLCP >gi|157101617|gb|DS480707.1| GENE 2 1085 - 1714 691 209 aa, chain + ## HITS:1 COG:AF0830 KEGG:ns NR:ns ## COG: AF0830 COG1853 # Protein_GI_number: 11498436 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Archaeoglobus fulgidus # 1 166 1 165 169 131 39.0 8e-31 MDSKAFFKLSYGVYIISTSADNKEGGCVINTLTQVTSSPARLSIALNKENYTLKLIETSG TFSAAVLSDDVEMDLIRRFGFQCGKDVKKYDGIPQGRDSLNNPYPTEGVCARFTCRVVSS MDVGSHMIIVGEVVEAEVLDSQTPALTYSNYHLKKNGTTPPKAPSYQADTKEVTGWRCSV CGYILESETLPPDFICPVCGKDASYFVKL >gi|157101617|gb|DS480707.1| GENE 3 1822 - 2538 848 238 aa, chain - ## HITS:1 COG:all7616 KEGG:ns NR:ns ## COG: all7616 COG2949 # Protein_GI_number: 17158752 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Nostoc sp. 
PCC 7120 # 56 217 63 228 228 126 40.0 3e-29 MDKKAILKKAVKILLVLGLAGCLFLFLINLYMIRKEKPNIISSDDAAALGDVDCIMVLGC SVRPDGTPSGMLRDRLDKGIELYEDGVSDRLLMSGDHGRKNYDEVNRMKQYAIDEGIPSG DIFMDHAGFSTYESMYRARDIFQVKKIIIVTQRYHMYRALYVAQAMGMEAYGVESDPRQY GGQKMRDLRELLARPKDLIYTIVMPKPTYLGDAIPVSGDGNVTNDKEENRARNKSDQK >gi|157101617|gb|DS480707.1| GENE 4 2643 - 3011 411 122 aa, chain - ## HITS:1 COG:L52644 KEGG:ns NR:ns ## COG: L52644 COG0251 # Protein_GI_number: 15673211 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Lactococcus lactis # 1 120 1 122 126 126 55.0 1e-29 MNVISTEKAPGAIGPYSQGFEVNGVIYTSGQIPVNPADGSVAEDIAGQAEQSCKNVGAIL EAAGSGFDKVFKTTCFLADMGDFATFNEVYAKYFTSKPARSCVAVKTLPKGVLCEIEAIA VK >gi|157101617|gb|DS480707.1| GENE 5 3389 - 4549 724 386 aa, chain + ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 2 382 1 378 384 297 38.0 2e-80 MLYDFDTVINRWHTDSEKYDGLKKFAPKAPDNSIPLWIADMDFKTAPEIIAAMHRRVEEG IFGYTDIYDASYYKAVCRWMEKRHRWRITPDEIVVDSGVVPALSHALSLIARPGDGVIIH TPAYKPFFNSIRNTGMIPVYSRLCYSHGRFTIDFEDLEQKAEDENNKVLILCSPHNPTGR VWSQDELRHIVDICQKHGLFIICDEIHNDILRPGICHTPIASLYPDTKRLITCTAPSKTF NLAGNHLANIIIPDPDIRERWNSLYHYLPNPISVAAATAAYEEGEPWLDELNQYLTETFR HMETFFKANLPVIDFQTPEATYLAWINIDSLGSNHSEIENRFIENGLIIEGGNQFVENGA GFIRLNAAVPHSTIDTLLNKINHIFN >gi|157101617|gb|DS480707.1| GENE 6 4563 - 5486 925 307 aa, chain + ## HITS:1 COG:mll3861 KEGG:ns NR:ns ## COG: mll3861 COG0834 # Protein_GI_number: 13473306 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Mesorhizobium loti # 69 298 38 267 267 220 48.0 3e-57 
MKKRNLILTAALFAMTASLAACGGQKPAQTQPAADTAAESKGGDKKEEAKEEPKSEETAA ASAEKTRYEKILESGVLKVGTEGTYKPFTYHDDNNELCGYDVEVARAIGEKMGVKTEFSE ITWEGLLTSLDTGTVDLVLNQVGVTDERKEKYDFSDPYLYSYIALITKTDNNDITDWDSA NGKKTSLNVSSNYALIAEKYNMDITASDTFSKDIELLLAGRTDCVINNTIAFNDFLTQKP DTPIKIAAVQEQADTVAVPIPKGNDDLVEAVNKAIAELQADGTLTELSNKYLGKDFSKEL TLEEIQQ >gi|157101617|gb|DS480707.1| GENE 7 5508 - 6212 569 234 aa, chain + ## HITS:1 COG:CAC3326 KEGG:ns NR:ns ## COG: CAC3326 COG0765 # Protein_GI_number: 15896569 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Clostridium acetobutylicum # 4 216 9 221 227 232 63.0 4e-61 MDARSIEILTSSFWPILKAGISFTIPLTLISFALGTTLAVFVAIIRISKVPVLNKICELY VWIIRGTPLLVQLFLVFFGLPKVGVVINPMPSGILVFTLSIGAYSSEIIRAAILSIPKGQ WEAAMSLGMSYPQQLLRIILPQAFKISVPPLFNSFIALVKDTSLAANITIPEMFLTTQRI TARNYEPLLMYIEVGFIYLIFCTVLNYVQNRIENKLMLTPTDKKRIKEITDTQA >gi|157101617|gb|DS480707.1| GENE 8 6224 - 6955 553 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 240 1 240 245 217 46 6e-56 MLQIKHLNKYFGENHVLKDISLEIESHKTMAIIGSSGSGKSTLLRCINLLETPDSGTISL DGKALDFSKKLPKHQKAAFTKKTGMVFQSFNLFPHKTALENIMEGPVTVLKKSKAEACRD AIELLKRVGLEDKKDSYPHQLSGGQQQRIAIARALAMKPEILLFDEPTSALDPELGAGVL SLIKELSREDYTIIVVTHNMSFAREVSEEVVFVEKGEILAKGSYDELVNLHNDRITQFLS YLD >gi|157101617|gb|DS480707.1| GENE 9 6969 - 7892 717 307 aa, chain + ## HITS:1 COG:YPO0008 KEGG:ns NR:ns ## COG: YPO0008 COG0524 # Protein_GI_number: 16120361 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Yersinia pestis # 9 306 6 300 308 182 36.0 6e-46 MNRKTSKNIICFGSLIMDISIRCDKIPAMGETVYTYEDYNVNPGGKGGNQSVAAARSGGS VTLIGRIADDDYAHQLIHSFQKDHIDTSQLMVDTDSKTGVAFVWVDMEGHNKCVCSLGVN ARVTSQDVQSYAHLFTDRSIVMTTLEHSSQLLDEISLMARQNNCILIVDPSSKDYSKLTP EIAGRIDILKPNEVETEMLTGIRVETEEDAVQAIHVLKDKGIRMPVISLGSRGVIYEYKG KATSVKGLRVNAVDTTAAGDTFIGSMAAKLSQGYSFQESIDYANRAAAYCVQHRGAQISI PFEENVL 
>gi|157101617|gb|DS480707.1| GENE 10 7971 - 8444 390 157 aa, chain + ## HITS:1 COG:CAC0763 KEGG:ns NR:ns ## COG: CAC0763 COG1846 # Protein_GI_number: 15894050 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 29 142 36 149 157 57 32.0 8e-09 MTELKLTNQLMNEYDSLPHDYGNAVLYQSEAHLIQCIGQNDGITVTEASKLLKKTKSACS QMVKKLVGKNWVKQTRNQKNNREYKLHLTEEGLIIYDHHENFDRMSYEHNSKELYHFTDE ELQLYIKIQKRMNELIGLDVERSYNYFKADISCINER >gi|157101617|gb|DS480707.1| GENE 11 8452 - 9276 803 274 aa, chain + ## HITS:1 COG:AGl1705GM KEGG:ns NR:ns ## COG: AGl1705GM COG0434 # Protein_GI_number: 15891857 # Func_class: R General function prediction only # Function: Predicted TIM-barrel enzyme # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 265 12 271 280 190 38.0 3e-48 MGKLKDVFKVDKPIIGMVHLRPLPGSPKYDPVNMGMDKIISIALEEAAMLEQAGVDGVQV ENMWDIPYLRSEDIGYETAAALAVGIHAVRNKVSIPVGAECHMNGADCAMACAVAAGASW IRVFEWCNAFVSQSGFINAMGANVSRMRSRLKADQILALCDVNVKHGSHYIIHDRSVAEQ AMDIESQDGDAVIVTGFDTGTPPSVENISKCKKSTSLPILIGSGLNSSNVNELLTAADGA IIGSWFKEGNNWKNPVSYDRTKEFMDKVIALRQA >gi|157101617|gb|DS480707.1| GENE 12 9387 - 10697 1300 436 aa, chain - ## HITS:1 COG:L0324 KEGG:ns NR:ns ## COG: L0324 COG0617 # Protein_GI_number: 15673541 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Lactococcus lactis # 9 289 26 306 411 206 41.0 8e-53 MEIRIPAPAEEIINKLNEHGYEAYVVGGCVRDMLLGREPGDWDITTSALPEQVKEVFRRT VDTGIQHGTVTVMMGKEGYEVTTYRIDGEYSDGRHPNSVEFTPDLVEDLKRRDFTINAMA YNSHTGFVDKFGGVEDLEKGIIRCVGEPMDRFTEDALRILRAIRFSAQLGFSIEERTYEA IKVIAPNMVHVSKERIQVELTKLLLSSHPDYISLVYETGISPFVSEKFHGAYAADSGDWA VPSIPAGVPAVKHMRWAAFLRECSASQAAGILKDLKLDNDTSYRVRTLVEWQGKKVGAED GEKDCLKHGKEEGMKEIAKQPEPSRASIRRAMSQMEPELFDDLLTLKMCLSEDASDDRES QWLCQVRDLTEEIRSNRDCISLKTLAVSGHDIMGAGVKPGREVGNTLARLLDMVLEEPQR NTKEYLLAHLGQSAEK >gi|157101617|gb|DS480707.1| GENE 13 10791 - 10946 61 51 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|160941554|ref|ZP_02088886.1| ## NR: gi|160941554|ref|ZP_02088886.1| hypothetical protein CLOBOL_06452 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06452 [Clostridium bolteae ATCC BAA-613] # 14 51 1 38 38 69 97.0 9e-11 MCLLRETGEEDGLLGQGVLKMCIETADAGLDLYLPFFIYEEYMVNMVIIRG >gi|157101617|gb|DS480707.1| GENE 14 10889 - 12328 1584 479 aa, chain - ## HITS:1 COG:MYPU_1830 KEGG:ns NR:ns ## COG: MYPU_1830 COG0442 # Protein_GI_number: 15828654 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Mycoplasma pulmonis # 6 479 27 501 501 484 46.0 1e-136 MADNKKMVEAITSMDEDFAQWYTDVVKKAELCDYASVKGCMVIKPAGYAIWENIQKEMDR RFKETGVQNVYMPIFIPESLLEKEKDHVEGFAPEVAWVTHGGLEELQERMCVRPTSETLF CDFYAKDIHSYRDLPRVYNQWCSVVRWEKTTRPFLRSREFLWQEGHTAHASAEEAEARTI QMLNLYADFCEEVLAIPVIKGQKTDKEKFAGAEATYTIESLMHDGKALQSGTSHNFGDGF AKAFDIQFTDKDNTLKYVHQTSWGVSTRIIGAIIMVHGDDNGLVLPPRIAPVQIMVVPVQ QQKEGVLDKAYELKERLTKAGYAVKVDDTDKSPGWKFADCEMRGVPLRVEIGPKDIEKNQ AVLVRRDNHEKSFVSLDELEARAGEMLDTIHDAMLEKARKHRDAHTYTAAGWDEFVDTVN NKPGFVKAMWCGCLECEEKIKEVAGATSRCIPFEQEKLSDQCVCCGKPAKKMVYWGKAY >gi|157101617|gb|DS480707.1| GENE 15 12783 - 13019 287 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941556|ref|ZP_02088888.1| ## NR: gi|160941556|ref|ZP_02088888.1| hypothetical protein CLOBOL_06454 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06454 [Clostridium bolteae ATCC BAA-613] # 1 78 1 78 78 150 100.0 3e-35 MIQAQAHINQLPNGDFELTACSYLSTTGANVDLMRGASRIKLDIANNSTMAMAEDFVREM VKKSLEGPDKGQPMLIEF >gi|157101617|gb|DS480707.1| GENE 16 13153 - 14232 1131 359 aa, chain - ## HITS:1 COG:BS_ytvI KEGG:ns NR:ns ## COG: BS_ytvI COG0628 # Protein_GI_number: 16079968 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus subtilis # 8 348 10 354 371 90 23.0 3e-18 MEKPGKKLKKLLIITGTTGAVYAGFKYLLPLVAPFFVGYILALLLRPSARFLSYRMRVTV KGKRYHMPIGVIGGVEFLAVLSVLGTVLYRGALKLCAEGKLLMVQLPVWLEQFDLWLTGN 
CHRVEHAMGLRPGCVADMAGDMLYHMVDTCKTAAMPFLMANSMSVISCLAKAAIVSLVVF LAAILSLQEMEDLRDRRDQSAFRKEFNMIGSRLVQTGKAWFRSQGIILFITTGLCIGGMF LIRNPYYIMAGIGIGILDALPIFGTGTVLIPWAVFRLAVGDWKQALVLTAIYLVCYFVRQ ILEVRMMGGQVGLSPLETLASVYVGLELFGFFGFILGPLGLLLIEDMVEAWTKEEETET >gi|157101617|gb|DS480707.1| GENE 17 14333 - 15373 1332 346 aa, chain - ## HITS:1 COG:BH3732 KEGG:ns NR:ns ## COG: BH3732 COG1879 # Protein_GI_number: 15616294 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 56 337 33 311 311 220 45.0 3e-57 MRKQIGATVLALAMACSLMAGCSSGSGETTAAATTEAAKEESKEEAKDTSAEAENAGSEE GSGKKYVIGLAMNTQTNPFFVDVKDGVQKAADEHGIELYITDAQDDPTIQMKDVENLITK KPDAIIIDTCDSDAIVSSIEACNEAGIPVFTMDREANGGEVISHIGYDAIKSGRMAGQYL VDTLGGKGKIVEIQGIMGTNVAQNRSQGFNEVMKDNPDMEIVACQVADFDRAKGMSVMEN ILQANPEIDGLYAANDEMLLGALEAMEAAGRTDEIVKIGCDAIDDTLDAMKAGKVDATIA EPPFFLGKAILNTAYDYLEGKQVEPYVILDNQLVTQDTVDSLVTKE >gi|157101617|gb|DS480707.1| GENE 18 15538 - 16524 1204 328 aa, chain - ## HITS:1 COG:YPO2499 KEGG:ns NR:ns ## COG: YPO2499 COG1172 # Protein_GI_number: 16122720 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 30 314 38 322 330 217 46.0 3e-56 MHLEVSVKMNSVNKKKLIGQLNIYRSVLILLVICVFATILSPSFLSVANLFNVFKQITVA GIVGCGMTFVILTGGIDLSVGSILGLAGVLASGILESTGNAGAAILVALAVGIGCGAVNG FFVAVCEIPPFISTLGMMTLLRGCILVYTKGSPIPIKVDAYKFFGKGTVAGVPVPVIILI ILFLIAHYVLTQTSYGRSIYAFGGNREAARLSGISVRFTEWMAYTLNGLMCGVAGVILTA RLGSAQSTSGTGIEMDAIAAVILGGTSLSGGVGFVLPTVVGAMIMGIIDNILTLMNVNPH ATNIVKGAVILIAVLVDKKVKDLSAKAE >gi|157101617|gb|DS480707.1| GENE 19 16493 - 17995 1772 500 aa, chain - ## HITS:1 COG:PM0155 KEGG:ns NR:ns ## COG: PM0155 COG1129 # Protein_GI_number: 15602020 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Pasteurella multocida # 1 494 1 494 498 446 47.0 1e-125 
MEEKLVLQMDSIVKTFPGVKALDGVHLDVRRGEVHALCGENGAGKSTLMKIIAGAQGYTS GHMYVNGEEVVFHSTKDAERKGIAMIYQEFNMVRDLSVAENMYLGRLPKTPYGAVDWKKL YKDAQEELDRLGLKFDCRTKVRNLSVAESQMTEIAKCLTIGAKVIIMDEPTAALTDEEIR ILFKIIAELKAEGISILYISHRMDEIFQISDRLTVFRDGKYIATKDIGDTNYDEVVSMMV GRSVTNLYPERDYTQDETVLELKDICGKGVHNVNIRLHKGEILGVAGLLGSGTIELSKIV YGALPMDSGEIYINGEKKDCSSPKKALKAGIGFVSDDRKQEGLVLIRNIRENVSLSSLDK ITHFIHLDKKLETKQINVQVKRLNIKISSLQQLAGKLSGGNQQKVVFAKVLEANPSILIL DEPTRGVDVGAKAEIYQIMDELTRAGKSIILISTDLPEVIGMSDRVVIMREGHTVLEIPK SEMNQEIILAHASGGVSENE >gi|157101617|gb|DS480707.1| GENE 20 18052 - 18450 518 132 aa, chain - ## HITS:1 COG:BH3729 KEGG:ns NR:ns ## COG: BH3729 COG1869 # Protein_GI_number: 15616291 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type ribose transport system, auxiliary component # Organism: Bacillus halodurans # 1 132 1 129 129 105 41.0 2e-23 MLKTGILHPQLARVLAELRHKDTIIIGDAGLPIPKGVERVDLGWRPGDPAYLDVLEEILK YIVVEGAVFADEALTVTPDFHKKALDLLPQGLPVEYIPHTELKERSKDAKAIVLTGEFTG YTNVILSAGCAY >gi|157101617|gb|DS480707.1| GENE 21 18489 - 18677 105 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MILPIMIFLMMLLTMALLIAALLSAAPLTVVHLTVFFLTVVHLTVILLTLVLLSVVPLVA EL >gi|157101617|gb|DS480707.1| GENE 22 18715 - 19665 1146 316 aa, chain - ## HITS:1 COG:HI0505 KEGG:ns NR:ns ## COG: HI0505 COG0524 # Protein_GI_number: 16272449 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Haemophilus influenzae # 4 307 1 303 306 218 43.0 1e-56 MEHMGKKVTVFGSFVVDLMGRTPHLPAAGETVKGTVFKMGPGGKGFNQGVAAHKAGADVT MVTKLGRDAFADIALNTMNELHMDTGRVLYSETTETGCALIMVDENSSQNEIVVILGACN TITDQEVDSLEDLVGRSEYLLTQLETNVSSVERIVDIAYRKGVKVILNTAPVQPVSDELL GKIDLITPNEVEAEILTGIKVDSEEAADKAADWFFSKGVKNVLITLGHRGVYINTGEKKG IIPAYKVEAVDTTGAGDAFNGGLLAALAEGKNLWEAADFANALAALSVQKIGTTPSMPVR EEIDAFMAANRQTADC >gi|157101617|gb|DS480707.1| GENE 23 19812 - 20447 753 211 aa, chain - ## HITS:1 COG:lin2103 KEGG:ns NR:ns ## COG: lin2103 COG0274 # Protein_GI_number: 16801169 # Func_class: F Nucleotide transport 
and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Listeria innocua # 1 211 1 212 223 236 65.0 3e-62 MGIEHMIDHTMLKADASKNTIIRYCSEAREHKFASVCVNTCFVPLVAEQLKGSGVKTCCV VGFPLGAMLTSAKAFEASEAVKAGADEVDMVINISALKDGDDSFVEEDIRAVVKASAGAV VKVIIETCLLTDEEKVRACQLAVKAGADFVKTSTGFSTGGATAADVALMRRTVGDSARVK ASGGIRTPEDAAAMIEAGADRIGAGNGIVLI >gi|157101617|gb|DS480707.1| GENE 24 20603 - 21625 1059 340 aa, chain - ## HITS:1 COG:RSc1014 KEGG:ns NR:ns ## COG: RSc1014 COG1609 # Protein_GI_number: 17545733 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 3 334 4 336 347 192 32.0 6e-49 MNIKEIARRAGVSSATISRVLNNSGYVKEETRQKVLRAVEEYNYVPSAIARSLSIQDSLS VGTIIPDIENEFFSKVISGISEVAESYHYNIVFLGTNETLDKEHDFLKIVESQRLKGVII TPISETDTVTRDYLLRLEESGIPVVLVDRDVKGAQFDGVFVDNQKGSYDGVMELIKAGHE RIAIITGPETSKPGKDRCQGYLQAMEDSGLAVPEEYVACGDFKIAKAYECTGRLLGLAQA PTAIFTSNNLSTLGCLKYLTEHKVKIGRDISLMGFDDIDALRMIDYRISVVDRDAREQGR EAMRLLQECFEDSGKSRQRGKRITIPYKVILRGSEHRKTT >gi|157101617|gb|DS480707.1| GENE 25 22007 - 23368 1667 453 aa, chain - ## HITS:1 COG:STM3541 KEGG:ns NR:ns ## COG: STM3541 COG2610 # Protein_GI_number: 16766827 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Salmonella typhimurium LT2 # 14 448 6 440 446 272 39.0 9e-73 MTTTTAAALDPTRLIIAALAGLVILLLLIIKFKVQAMISILVGAVAIGLIAGMPFTDIIS SVNEGIGSTLKGIALLVGLGSMFGSILEISGGAQTLAVTMVKKFGDKKAAWALGITGLVI AMPVFFDAGLIILIPLAFSLAKRTKRSSLFYAIPLLAGLAVGHAFIPPTPGPVLVATMLG VDLGWVIMVGIVCGIFAMIVAGPIWGTICGNKYDIPVPEHIANQEDFDESKLPKFGTIVG IILIPLVLIILNSIAKVVPALASVQPVLGFLGEPFVALTIATVVAMFLLGYRHGYSNEEL EKVLTKSLEPTGLILLVTACGGVLRFMLQNSGLGDVIGNAVSSASLPIVVVAFVVAALVR ISVGSATVAMTMAAGIIAAMPEIATLSPLYLACVTAAVAGGATVCSHFNDSGFWLVKSLV GLDEKTTLKTWTVMETLVGGTGFVVALIISFFA >gi|157101617|gb|DS480707.1| GENE 26 23503 - 25224 1751 573 aa, chain - ## HITS:1 COG:CAC3604 KEGG:ns NR:ns ## COG: CAC3604 COG0129 # Protein_GI_number: 15896838 # 
Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Clostridium acetobutylicum # 3 572 2 571 572 756 63.0 0 MELNSQRVRALAPENDPLKIGMGWKVEDLDKPQIMVESTFGDSHPGSAHLDQFVNEAMRG IADAGGKGARYFTTDICDGIAQGHDGINYSLAHRDMIANMIEIHGNSTGFDGGVFIASCD KSVPACLMGLARLDMPSIVVTGGVMDAGPDLLTLEQIGAYSAMCQRGEITEEKLTYYKHN ACPSCGACSFMGTASTMQIMAEALGLMLPGSALMPATCEDLKEMAYKAGLQVMELARKGL KVSDIVTMKSFENAIMVHAAISGSTNSLLHIPAIAHELGLEIDADTFDRMHRGAHYLLDI RPAGKWPAQFFYYAGGVPAVMEEIKSMLHLDVMTVTGKTLGENLEELKAGGFYDKCDGYL KKWGLKRTDVIRTFEEPIGTDGTVAILRGNLAPEGSVVKHSAVPEEMFKAVLKAKPFDCE EDAIEAVISRKIQPGDAVFIRYEGPKGSGMPEMFYTTEAISSDPELGKTIALITDGRFSG ASKGPAIGHVSPEAADGGPIALVEEDDLIRIDIPERVLEIIGVKGEELPKEEVERILAER RKAWKPREAKYKKGVLKIYSEHAVSPMKGGYMV >gi|157101617|gb|DS480707.1| GENE 27 25745 - 26467 845 240 aa, chain - ## HITS:1 COG:RSc1078 KEGG:ns NR:ns ## COG: RSc1078 COG2186 # Protein_GI_number: 17545797 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 13 232 13 237 255 98 31.0 8e-21 MQKANKGELSNLKNKLLAEQVQEQIYQYILETPIAVGAKLPNEFELGDRFSVGRSTIREA VKLLISRGILEVRRGSGTYVVSTTPVDMDPLGLGAVEDKMALALDLVNVRIILEPGIAEM AALNATQEDVLKLRNLCDSVERKIKIGDSYIEDDIAFHTCVAECSKNKVVEQLIPIIDTA VLMFVNVTHKKLTDETIMTHRAVTEAIAEHDPIGAKTAMMMHMTFNRNMIRQLMKDGREA >gi|157101617|gb|DS480707.1| GENE 28 26652 - 27041 388 129 aa, chain - ## HITS:1 COG:lin1982 KEGG:ns NR:ns ## COG: lin1982 COG0346 # Protein_GI_number: 16801048 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Listeria innocua # 1 125 1 125 125 108 38.0 3e-24 MIDSIGKITLYVNNQNEARDFWTEKMGFIVRLEQQMGADQKWLEVGPEYGGGTSFVLYDK EKMKTQNPDVNVGHPSVILCTGDLENAHREMKERGVRTGDIMKMPYGSMFQFYDMDGNVF LLREEARVH >gi|157101617|gb|DS480707.1| GENE 29 27204 - 28565 858 453 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 
22.4-21] # 4 450 3 440 456 335 41 3e-91 MDVITNLVAKGNSFLWSFLLIVLLCGTGIYYTIRLRFIQVRKFGEGWKLVFGHLSLNGEK HEKGEMTPFQSIATAIAAQVGTGNLAGAATALLGGGPGAIFWMWVSAFFGMSTIYAEATL AQNFKTEVNGEVTGGPVYYIKAAFKGTLGKVLAGLFAIFIVLALGFMGNMVQANSIGAAF TEAFGAFNITISPVVIGVIVAAVAAFVFLGGTQRLASVVEKVVPIMAGVYIVGSLILIIM NIANLPAAIKMIFVGAFDPQAVLGAGAGIAVKEAIRFGVARGLFSNEAGMGSTPHAHARA TAENPHKQGLCAMISVFIDTFVILNLTVFSVLTTGALESGKNGTALTQAAFMRGFGTFGI VFVAICLLFFAFSTILGWHFFGLINAKYLFGDGAAKIYSLLVVVCIIIGSALKLELVWDL ADFFNGLMVIPNAMALLALSGLVVKICNKYSDK >gi|157101617|gb|DS480707.1| GENE 30 29313 - 29693 259 126 aa, chain + ## HITS:1 COG:no KEGG:ELI_0507 NR:ns ## KEGG: ELI_0507 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 124 12 136 137 138 56.0 7e-32 MEKYIYDNSNGLWYELHGDYYLPCLVIPEEEIHTIGIWGRKHRQYLKEYRPMLYNDLVLS GKLYSYLSDIETQALNKLDLLVIQLAEKEGINDQLKEQNQLAWVRAMNNIRNRAEEIVLK ELIFAE >gi|157101617|gb|DS480707.1| GENE 31 29790 - 31016 691 408 aa, chain - ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 14 400 71 449 450 167 29.0 4e-41 ADGRYKKLPDEVYKRLEFHPASFEVIEHHVEVYVSADGGNFARAERPADLFRNSLATASL VAGIYNLKYVNAQPIERLSKEFERSDVFLPTQTLCRWAIMGAERYLSRVYARMKQKLPEY HVMHADETVVEVRKDGRPAGAESRMWVYHSGELESKPVILYEFQKTRKKEHAREFLKDFS GICVTDGYQVYHSIADEREDLTISGCWSHARRGFADVVKAAGKKDLNIRESVAYKSLQLI QTMSRCEEKFAKLEPAERLEARIHHILPLADAFFAYLKSKEGSVVPKSATGKAISYCLNQ EAFLRVFLTDGYVPMTNNAAERSIRPFTVGRNNWFQIDTVSGAKASAIAYSIAETAKANQ LKPYEYFRYLLEELPKHGELEELSYVEELLPWSETLPKCCYQKKETES Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:48:48 2011 Seq name: gi|157101616|gb|DS480708.1| Clostridium bolteae ATCC BAA-613 Scfld_02_49 genomic scaffold, whole genome shotgun sequence Length of sequence - 30871 bp Number of predicted genes - 29, with homology - 29 Number of transcription units - 13, operones - 8 average 
op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - SSU_RRNA 1 - 518 99.0 # EU463219 [D:1..1391] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. - TRNA 712 - 800 58.8 # Ser GGA 0 0 - TRNA 865 - 950 65.8 # Ser TGA 0 0 1 1 Op 1 . - CDS 1201 - 1431 292 ## gi|160941577|ref|ZP_02088908.1| hypothetical protein CLOBOL_06477 2 1 Op 2 . - CDS 1492 - 2769 952 ## COG1850 Ribulose 1,5-bisphosphate carboxylase, large subunit - Term 2802 - 2847 5.2 3 2 Op 1 . - CDS 2889 - 3614 460 ## COG1802 Transcriptional regulators 4 2 Op 2 . - CDS 3637 - 4242 161 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 5 2 Op 3 . - CDS 4265 - 5515 1109 ## COG5441 Uncharacterized conserved protein 6 3 Op 1 . - CDS 5624 - 6628 738 ## COG0491 Zn-dependent hydrolases, including glyoxylases - Prom 6658 - 6717 3.2 7 3 Op 2 11/0.000 - CDS 6726 - 7685 1074 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 8 3 Op 3 21/0.000 - CDS 7699 - 8649 834 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 9 3 Op 4 16/0.000 - CDS 8636 - 10291 200 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 10 3 Op 5 . - CDS 10270 - 11391 1009 ## COG1879 ABC-type sugar transport system, periplasmic component - Prom 11618 - 11677 5.6 - Term 11657 - 11703 8.1 11 4 Tu 1 . - CDS 11722 - 12465 348 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 12 5 Op 1 . - CDS 12567 - 13601 809 ## PROTEIN SUPPORTED gi|229879751|ref|ZP_04499249.1| (SSU ribosomal protein S18P)-alanine acetyltransferase 13 5 Op 2 . - CDS 13650 - 14561 1010 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III - Prom 14593 - 14652 6.0 - Term 14680 - 14711 4.1 14 6 Tu 1 . 
- CDS 14779 - 15540 758 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 15758 - 15817 8.5 + Prom 15698 - 15757 9.2 15 7 Op 1 4/0.000 + CDS 15809 - 16267 430 ## COG0698 Ribose 5-phosphate isomerase RpiB 16 7 Op 2 . + CDS 16281 - 17072 641 ## COG0149 Triosephosphate isomerase 17 7 Op 3 . + CDS 17090 - 18322 1291 ## COG0205 6-phosphofructokinase + Term 18403 - 18438 2.0 - Term 18387 - 18429 9.9 18 8 Tu 1 . - CDS 18437 - 18679 273 ## Closa_0626 hypothetical protein - Prom 18749 - 18808 3.3 - Term 18737 - 18773 4.1 19 9 Op 1 20/0.000 - CDS 18815 - 19264 293 ## PROTEIN SUPPORTED gi|125974278|ref|YP_001038188.1| SSU ribosomal protein S18P alanine acetyltransferase 20 9 Op 2 12/0.000 - CDS 19342 - 20094 207 ## PROTEIN SUPPORTED gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase 21 9 Op 3 . - CDS 20091 - 20519 485 ## COG0802 Predicted ATPase or kinase 22 9 Op 4 . - CDS 20538 - 22685 198 ## PROTEIN SUPPORTED gi|225302983|ref|ZP_03739507.1| ribosomal protein S1 - Prom 22717 - 22776 5.7 + Prom 22674 - 22733 3.8 23 10 Tu 1 . + CDS 22899 - 23354 545 ## gi|160941602|ref|ZP_02088933.1| hypothetical protein CLOBOL_06502 + Term 23538 - 23570 4.0 24 11 Op 1 . - CDS 23425 - 23790 271 ## Closa_0619 hypothetical protein 25 11 Op 2 . - CDS 23794 - 25035 1608 ## COG0460 Homoserine dehydrogenase 26 11 Op 3 . - CDS 25093 - 25536 502 ## COG4492 ACT domain-containing protein - Prom 25595 - 25654 7.5 - Term 25665 - 25704 8.2 27 12 Op 1 6/0.000 - CDS 25721 - 26740 1219 ## COG1186 Protein chain release factor B - Prom 26845 - 26904 5.5 28 12 Op 2 . - CDS 26987 - 29560 2830 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) - Prom 29746 - 29805 7.5 + Prom 30173 - 30232 12.5 29 13 Tu 1 . 
+ CDS 30285 - 30815 476 ## PROTEIN SUPPORTED gi|28210085|ref|NP_781029.1| SSU ribosomal protein S30P Predicted protein(s) >gi|157101616|gb|DS480708.1| GENE 1 1201 - 1431 292 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941577|ref|ZP_02088908.1| ## NR: gi|160941577|ref|ZP_02088908.1| hypothetical protein CLOBOL_06477 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06477 [Clostridium bolteae ATCC BAA-613] # 1 76 1 76 76 142 100.0 1e-32 MVREIMLYTSNDLRDFMHLACASPLDIGVHTMDNQIADGKSVLGLMAIDYKKPVKVVSED RRFLKQLDRWAVDKDL >gi|157101616|gb|DS480708.1| GENE 2 1492 - 2769 952 425 aa, chain - ## HITS:1 COG:AF1587 KEGG:ns NR:ns ## COG: AF1587 COG1850 # Protein_GI_number: 11499182 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose 1,5-bisphosphate carboxylase, large subunit # Organism: Archaeoglobus fulgidus # 8 417 22 432 437 346 42.0 6e-95 MDSLLYTLSETVHKQNYAVATYHISLPKEVDVLKKAAALAVGQTIGTWIPIPGITEEIRE KYMGKVVNVFDVPSLDLATQITGDQREYLIQIAYPAVNFGTDLPLLITALLGNDASTSAQ VKLLDIEFPEEFVRKFRGPRYGIEGIQKFAGISNRPILLNMIKPCTGLTPKEGARIFYET ALGGVDFIKDDELFGNPVYSKPEERVRAYREAAEAAYEKTGERVKYFVNITSGAGEIMEN VKRAEDAGADGLMINFAAMGYSVLKHVAEHTVLPILGHSAGTGMCFEGAMNGMASPLAVG KLARLAGADIVMINTPYGGYPLLHQKYMQTVAQLTLPFYDIKPSMPSIGGGVHPGMVEKY IREVGKDVVLAAGGAVQGHPGGAAAGARAMRQAIEIVMSGQPFEKAAAEKDELRTALTQW NYIKS >gi|157101616|gb|DS480708.1| GENE 3 2889 - 3614 460 241 aa, chain - ## HITS:1 COG:SMb20773 KEGG:ns NR:ns ## COG: SMb20773 COG1802 # Protein_GI_number: 16265213 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 15 229 16 224 228 100 32.0 3e-21 MKSNMRITMNDLDRGMLQHNKLPDLALDKLMEWIMSGKLSMGDKLNTEELASQLGVSRMP IREALANLEKKGLAESVPYAGTRLVTLTKDDVRQIYIARTALEPVAAKYACEKITDEDIQ NLKEIHAEYIRIVRQEELDAIDVYQQNRLYHFSIYKISQLDRICAMIESLWDNLSFFKLI YGQKLLEGRESRERMIEEHQSYLDALKQRDSERISHLLAQNLKKRTENIPYSLDVYFHEN A >gi|157101616|gb|DS480708.1| GENE 4 3637 - 4242 161 201 aa, chain - ## HITS:1 COG:ECs5174 KEGG:ns NR:ns ## COG: ECs5174 COG0235 # 
Protein_GI_number: 15834428 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli O157:H7 # 23 197 19 200 228 89 33.0 4e-18 MSEDVKKIAEELVIACKRAYTRGIQTGSGGNVSARVPGKDLMIVKASGSSFIDSTPDGFV ITDFDGNLVEGEGKPTREALLHGLLYRICPKVNAVVHTHSPYSIAWASTGKALKRTTWHA QLKMSADLPSLDVPAAMVKKEYFPMVEDIYRENPDLPGFLLVDHGLVAVGKDAINAEHTA ELIEETAQVAILKVTVSKLGL >gi|157101616|gb|DS480708.1| GENE 5 4265 - 5515 1109 416 aa, chain - ## HITS:1 COG:SMa1334 KEGG:ns NR:ns ## COG: SMa1334 COG5441 # Protein_GI_number: 16263182 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sinorhizobium meliloti # 1 347 1 337 398 162 33.0 1e-39 MNTIAIIGSCDTKYREIAYMREQVESQGMKAMVINVATGPNPSYGYDVSREDVTKAAGTE WAELEPRTKGEKITFMMEAVASYVEKLYAEGKIDGILSAGGLQNTVMATNAMKRLPIGFP KVMATTVASGRKTFESVVGAKDIVTIPSICDFTGLNIVTRQIMANACACCAGMVKHAGQV LKKGDKPVVGVTLMGITNTGACAAIDELERLGIEPIGFHSTGAGGAIMEQMAADGLIDGI LDLTTHEITQEYFKGGFSYGEDAKYRLVRGVEKKVPLVVSVGGLDFIDFQAGEFPPRMDE RIYMMHNANTAHIKLLPDEAEITTARFAARIEKIDYPVKLLIPTDGMRHNTRKGEALYYK EVDDIIIRQLKEIRNPNVEIITIPGNLDTEEWGIQAAHHMVDELKEHAVIGDEIQY >gi|157101616|gb|DS480708.1| GENE 6 5624 - 6628 738 334 aa, chain - ## HITS:1 COG:AF1509 KEGG:ns NR:ns ## COG: AF1509 COG0491 # Protein_GI_number: 11499104 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Archaeoglobus fulgidus # 43 268 32 250 262 90 28.0 5e-18 MPLTIRPLNSGYIPTKPLEYWYHFSCKKFVEKLGITNDSDQLPDFTFLIEGGDQLILVDT GMSWTEHADKYHHGPSVQRPGIDDIESRLAQVGFKPSDVDIVLFTHLHWDHVFYLEKFTK ARFICHETEWDYAHNPIPLHYKSYCRPIIAKDGDVTCNDEFIAPYDQPGVKERFETVKGE AQIAPGVSVYESFGHCPGHMTVVVETEDGPYYCVGDSVFVMGNIDAPQTMQDELHYDICP PGRYVDIVAAWQTIRDTVRRCHEAGVDPHKHLLLAHDIILSAAVEKYEDTHENRLPVIGL KDTDFVFDEYKGAIIDKDAKKAAAKAKTKYFSQK >gi|157101616|gb|DS480708.1| GENE 7 6726 - 7685 1074 319 aa, chain - ## HITS:1 COG:BS_rbsC KEGG:ns NR:ns ## COG: BS_rbsC COG1172 # Protein_GI_number: 
16080648 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus subtilis # 19 307 25 317 322 163 40.0 4e-40 MKAKKIFEEIRNSKVFYSLIGLIILVIFSAIAAPNFRTVDNIITVFRQASVLLVLSAGLT AVLLTGGMDLSVGATAGFIGCICAQCIKNGVPIPAAFILGILIGLVVGVINGLLAGILPS FIATYGTNWVMSGLAVLVMQGAVIYDLPKGFTQIGVGYTGPVPNLIIIAAVVVILIYVLL QKTTYGRQVYAYGSNPVAAKYAAVPVKKVMLSAFMMCSMCAGLAGMLMTARLNAADAAMG DSYGLQTVAAVVVGGTSMLGGEGGVLGTVVGALILTIIVNVMNLKGISSYAQGLVIGIVI IAMVLFDTYSKRRQESKAA >gi|157101616|gb|DS480708.1| GENE 8 7699 - 8649 834 316 aa, chain - ## HITS:1 COG:BH3731 KEGG:ns NR:ns ## COG: BH3731 COG1172 # Protein_GI_number: 15616293 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus halodurans # 1 307 1 309 314 176 35.0 7e-44 MQRNKIVQRMKSLPPVAYMLVVLIVIFSVMSPNYLTLMNFKNVLIQATPLMIVAFGQTCI VLTQGTDLSLGAQVSFVTVFTVWMALRGIPLPAAMLLAVLSTTLIGVANGVIVAKGSIPP FIATYGMQNILNSLSLLLAAGSSIYFPSFTYRIITETMIMGIPLMVWAAVLVFIMVWVVL NKTRFGTNIHGLGGNREALQLAGINPIICLIKTYAFAGFIAGIAGIITLSRIEAGMPTAA TGWEFQAVAVTLLGGTSLREGKGGVTGTIFGVLLIQVIKNGLNIVGVQSIKQNAIIGSIV LMAIIIDAVVRTRSKE >gi|157101616|gb|DS480708.1| GENE 9 8636 - 10291 200 551 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 315 545 147 362 398 81 28 5e-15 MVEGHPVECGKIETELFHLFVPAGNCGDEYRGNKFREKYWGETMGKTILTAKHITKLFPG MKALDDVDFDLQEGEVHILVGENGAGKSTLAKCLLGAYKPEEGEVRLDGQVVKFNGTKEA LARGIVAVYQEFTLVPYISVAQNIFLNREYKTKLGLIDHKKMEEEAAKLLKALNCEYMNP KDYVKNLSIAEQQMVEIAKALSFNPRIMVFDEPTATLSEREVDSLFAQIHRLKAEGIGII YVSHRMQEFPLIGDRITILRDGVKINTVGINDCTNEELVNMMVGRSVEQVYARTENEHEG ITLKTENLCDYKGRVRNVSFTVRKGEIVGLAGLVGAGRTETAELLYGIEKIKSGKVYVRG SQVFPKSPVQAVRIGIGLVSEDRKKQGLALDESIALNMVAVSLKKIFPNFFISKKKITEV ALGYKKQLRIATTSVNKACKYLSGGNQQKVVLAKWLSFDPEILIFDEPTRGIDVAAKLEI 
YHLMDRLAAEGKAIIMISSELPEVIGMSDRIYIMHEGEIVDEVIRGTDGFNSEVIGARMM LGAGGAEHAEE >gi|157101616|gb|DS480708.1| GENE 10 10270 - 11391 1009 373 aa, chain - ## HITS:1 COG:PM1325 KEGG:ns NR:ns ## COG: PM1325 COG1879 # Protein_GI_number: 15603190 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 81 355 31 306 314 131 33.0 2e-30 MRKRVVSILLTAAMSAMLFTGCSTPTEAPASSAKEETKAEEDTKSSSSGLSAATVLDVSQ LPVTGIGAQIATSVKAGGDYKIGYIAKNTTNPYMVAQSAGVEAAGKAMGFTAITQAPTTA DSVEEQVQLMENMITQDVDAIIVHCADSNGIMTGVRKAQDAGVLVLTIGTPAAEDTFLRT GVDYYESGYTMAKAVADKLGGKGKFIILEGPAGASNAIERLNGINTGLSEYEGIEIVASQ TANFKRTEGMSVTENLIQQYTDIDAVIACNDESALGAVQALTAANMSDVLVCGFDGSVDA TNAVKEGTMFATYNTDPYGSGFVACAYAVKYLNDKTEPEGKFIPFPTAANDPLVIADTVQ NYIDNIAWWKVIQ >gi|157101616|gb|DS480708.1| GENE 11 11722 - 12465 348 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 8 245 6 225 234 138 40 4e-32 MEKKYCTAVVLAAGKGSRMGGNTAKQYMEIGGKPLVAYALEVFEASPVIDEIILMTDAGH MEYVRTEIVEAYGLKKVSTIGAGGGERYESVWKALCTLMDREEGEKESVARDRQDGYVFL HDGARPFVTGEIIERAYRAVCKCNACVVGMPVKDTIKLVNREGMIESSPDRSMVWQAQTP QVFSVPLIVEAFTRQMKEDCTGITDDAMVVEAQMGVKAHMVMGSYANIKITTPEDLLMAE VLRNSIV >gi|157101616|gb|DS480708.1| GENE 12 12567 - 13601 809 344 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229879751|ref|ZP_04499249.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Slackia heliotrinireducens DSM 20476] # 10 343 439 779 781 316 48 1e-85 MIGQASEDVLILAIESSCDETAAAVVKNGRTVLSNVISSQIATHTVYGGVVPEIASREHI KAVNYVIERALAEAQVALEDITAIGVTYGPGLVGALLVGVAEAKAIAYAAGKPLIGVHHI EGHVSANFIENQDLEPPFVCLIVSGGHTHLVIVKDYGEFEIIGRTRDDAAGEAFDKVARS VGLGYPGGPKIDKAAKEGNPHAIQFPRGKVEGAPYDFSFSGLKSAVLNHINHARMTGEEI NVPDLVASFQNSVVESLVSRAILAAHEFGYKKLAIAGGVASNSALRKAMAEACEKDGIQF YYPSPIFCTDNAAMIGTAAYYEYLKGNISGWDLNAIPNLKLGER >gi|157101616|gb|DS480708.1| GENE 13 13650 - 14561 1010 303 aa, chain - ## HITS:1 COG:CAC1584 KEGG:ns NR:ns ## COG: 
CAC1584 COG1234 # Protein_GI_number: 15894862 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 1 292 5 302 313 331 50.0 1e-90 MLDICLLGTGGMMPLPYRWLTSMMARCSGSNLLIDCGEGTQVALKEKGWSPKPIDIICFT HYHADHISGLPGLLLTMGNAERVEPLTLIGPKGLERVVGALRTIAPELPFELRFVELTEN RERIQMGPYVIDAYRVNHNVLCYGYSITIPRTGRFDVERARSKEIPQKFWSRLQKGEEIE HEGRLLTPDMVLGPDRRGLKVTYCTDTRPVPVIAEYAAGADLFICEGMYGEKEKAAKARE YKHMTFYEAAALAKQANPKQMWLTHYSPSLTKPEEYMEEVRTIFPRAKAARDGWTAELEF DED >gi|157101616|gb|DS480708.1| GENE 14 14779 - 15540 758 253 aa, chain - ## HITS:1 COG:AGl88 KEGG:ns NR:ns ## COG: AGl88 COG1349 # Protein_GI_number: 15890151 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 235 10 239 243 139 36.0 4e-33 MGKRDLRIYALLDTLKESPVLSIKELADRFQVSEMTIRRDIDYLKENRLFYENKASEPAA REHDEYLYSSEQIRNFDKKDRIARFAAGMIEEGDILILDSGTTTGVLSKYIPEQTQLTVL CYNYHILSQLQGNEKLSLIFAGGYFHRNDLMFESVEGIGLIRRTRASKMFVSASGVHEKL GMTCAHNYEVVTKKAALESSLQKILVADSSKFGQVRPGYFAELEEIDEIVTDDGLSREWQ ELIAEKDIPLHLV >gi|157101616|gb|DS480708.1| GENE 15 15809 - 16267 430 152 aa, chain + ## HITS:1 COG:YPO3353 KEGG:ns NR:ns ## COG: YPO3353 COG0698 # Protein_GI_number: 16123503 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Yersinia pestis # 1 147 1 146 151 145 47.0 2e-35 MRTIVIGCDNAAVHLKNELMGFMEKKGYTVENMGCDSTEDSTYYPYVAEKVCQEIINSGY QKHGVLICGTGLGMAMTANKFKGIRAGVCHDVFSAERLKLSNDGNVICMGERVIGAELAK RILEKWLELEFVDCASTSKVEAIKEIEGENLK >gi|157101616|gb|DS480708.1| GENE 16 16281 - 17072 641 263 aa, chain + ## HITS:1 COG:PM1640_1 KEGG:ns NR:ns ## COG: PM1640_1 COG0149 # Protein_GI_number: 15603505 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Pasteurella multocida # 6 253 4 254 270 266 53.0 3e-71 
MKKLNRLYLGTNTKMYKTIAETTSFLTQLRRLTEDLADSPLTLFVIPSFTSLESANRITS GSHIRLGAQNMCWEDQGQFTGEISPVMLKEVGVSVVEIGHSERRHVFRENDFDQEKKTAK AAQSGFTPLLCIGETLTQREYGLSGETLSTQLKVGLHSITTEQAEQLWIAYEPVWAIGVN GIPADSDYVAERHAGIRRILCARFGEEQGSRIPILYGGSVNPQNAQELIALPDVDGLFIG RSAWDASQFNRIIRQVMPLYMNK >gi|157101616|gb|DS480708.1| GENE 17 17090 - 18322 1291 410 aa, chain + ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 4 404 16 411 427 234 37.0 2e-61 MHNLLVAQSGGPTSAINATLSGVITEAMIQGGIDRIYGGLNGIEGILQEKIIDLLDHINN TMDLDKLAQTPAAALGSCRFKLAGPEEDTAQYRVLIDIFRKYDISYFIYIGGNDSMDTVD KLARYCRSQQIEDIKIIGAPKTIDNDLMEIDHCPGFASAAKYIATTFAELERDVAVYDSF GVTIVEIMGRNAGWLTAASSLSRVNGSRGPDFIYLCEVPFSISRFLDDIRSRQSENRNLL IAVSEGIRDENGTYLSEQTPGNRPDRFGHRDIAGTGGVLAQIVRNELGCKTRSLELNLMQ RCAAHLASRTDLNESRLLGAKAVQCAVQGQTGKMATLLRLPSKDRYRIQFASADVSLVAN REKTVPRQWINREGNGITREMTDYLTPLMEGEVPVLYRNRFPEYFTISLE >gi|157101616|gb|DS480708.1| GENE 18 18437 - 18679 273 80 aa, chain - ## HITS:1 COG:no KEGG:Closa_0626 NR:ns ## KEGG: Closa_0626 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 80 46 125 125 124 65.0 1e-27 MKRYNKNGQLETIMCNCCGKKLVVQHGIIREGCISIDHAWDYFSEKDGQVHHFDLCEECY DELVSGFKIPVDREEQIEFL >gi|157101616|gb|DS480708.1| GENE 19 18815 - 19264 293 149 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|125974278|ref|YP_001038188.1| SSU ribosomal protein S18P alanine acetyltransferase [Clostridium thermocellum ATCC 27405] # 10 149 11 151 152 117 43 9e-26 MELIRGVRPMTGQDIEQVADLEQVCFSESWSENLIRMGLDSRLDTYFVYADHGTILGYAV LRILADEGEIQRIGVYPHYRRQGIARKLMDAMVTFARARGVRAIALEVRESNLGARNLYD SYGFRQEAVRKGYYHNPAEDAIIMWNRAI >gi|157101616|gb|DS480708.1| GENE 20 19342 - 20094 207 250 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase [Lactobacillus jensenii 269-3] # 42 230 1 180 380 84 31 8e-16 
MRVLGIESSSLVASVALVTDDILTAEYTVDFKKTHSQTLLPMLDEIVKLLELDMDTIDAI AVAGGPGSFTGLRIGAATAKGLGLALKKPLVHVPTVDAMAYNMWGAEGLICPIMDAKRSQ VYTGLYHTRDGLEVVMEQQPMDMRELAGLLNKKGERTIFLGDGVPVYKDIIREMLTVPHA FAPAQMNRQRAASVAALGMNALLDAEGCPGVRVVTAAQFTPDYLRKPQAERQREAEQSAL GAAAKLPGDY >gi|157101616|gb|DS480708.1| GENE 21 20091 - 20519 485 142 aa, chain - ## HITS:1 COG:CAC2838 KEGG:ns NR:ns ## COG: CAC2838 COG0802 # Protein_GI_number: 15896093 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Clostridium acetobutylicum # 9 142 9 141 152 129 47.0 1e-30 MVIETRKPEETYELGRKMGREAEPGQIVCLSGDLGVGKTVFTQGFAAGLGIEGPVNSPTF TILQQYEDGRLPLYHFDVYRIGDVSEMDEIGYEDCFFGDGVCLIEWPGLIEEILPEKVTW VTIEKDLEKGFDYRRISVEDRG >gi|157101616|gb|DS480708.1| GENE 22 20538 - 22685 198 715 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225302983|ref|ZP_03739507.1| ribosomal protein S1 [Halothiobacillus neapolitanus c2] # 637 712 189 263 560 80 51 9e-15 MDIIKKLAEELNIGRHQAEAAVKLIDEGNTIPFIARYRKEATGSLDDEVLRNLDERLKYL RNLEDRKSQVIASIAEQEKLTPELELKIKEAQTLVAVEDLYRPYKPKRRTRAMIAKEKGL EPLADLISMQMTAEPVEKAAAGFVSEEKGVASVQEAIQGARDILAERISDNAAFRTYIRN ITMNKGVIQSSAKDEKAQSVYEMYYNYEEPVKKAAGHRILALNRGENEKMLTVKILAPEE QILGYLEKQTIVRDNPYTTPILKEAAADSYSRLIGPSIEREIRSDLTEKAEEGAIKVFGK NLEQLLMQPPIVGRVVLGWDPAFRTGCKLAVVDSTGKVLDTVVIYPTAPQNKVEESRKIL KDFIKKYNISLISVGNGTASRESEQIIVDLLKEIREQVQYVIVNEAGASVYSASKLATEE FPAFDVGQRSAVSIARRLQDPLSELVKIDPKSIGVGQYQHDMNQKRLGEALEGVVEDCVN KVGVDLNTASASLLEYVSGINKTLAKNIVAYREENGAFKSRKQLLKVAKLGPKAFEQCAG FMRITGGENPLDGTSVHPESYDAAGKLLEKLGYKPEELSGGGLLGISRKIKDYKRTAQDL GIGEITLRDIAGELEKPARDPRDEMPRPILRSDILEMKDLEPGMVLKGTVRNVIDFGAFV DIGVHQDGLVHISQMTDKYIKHPLEAVSVGDIVDVKILSVDLAKKRISLTMKGVK >gi|157101616|gb|DS480708.1| GENE 23 22899 - 23354 545 151 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941602|ref|ZP_02088933.1| ## NR: gi|160941602|ref|ZP_02088933.1| hypothetical protein CLOBOL_06502 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06502 [Clostridium bolteae ATCC BAA-613] # 1 151 1 151 151 295 
100.0 8e-79 MMNDTHNNQKPPSNKQGKNFTSSERLGTCSMIAGIIGIVCSLFYLPMSLAYNQTAMPTGL VCGVIGILLALMARNADMSPRKAFNARAIAGLVLSVIAITLTFFFFYALASYYEALSDPV TGPQINEFINRLQEQLNQQMQLPKTTGFIWF >gi|157101616|gb|DS480708.1| GENE 24 23425 - 23790 271 121 aa, chain - ## HITS:1 COG:no KEGG:Closa_0619 NR:ns ## KEGG: Closa_0619 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 16 118 11 113 115 105 51.0 8e-22 MASVISRIIGVITRQGFPCLFHLLTGLYCPGCGGTRAFRALLAGNLLLSIRYHPLVAYMA VVLTTELVSFGVSRAAKNPRYYLGHEKFLVYTGAAIALVNWVYKNYMLVVRGVDLLPLWP R >gi|157101616|gb|DS480708.1| GENE 25 23794 - 25035 1608 413 aa, chain - ## HITS:1 COG:MTH1232 KEGG:ns NR:ns ## COG: MTH1232 COG0460 # Protein_GI_number: 15679243 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Methanothermobacter thermautotrophicus # 2 345 1 349 423 273 43.0 6e-73 MIKTAVMGYGTIGSGVAEILDKNRDVIAGQAGQEVELKYVLDLREFQDSPVADKIVHDFK VIQQDEEVKIVVETMGGLNPAYPFVKACLMAGKHVVTSNKALVAAHGTELLAIARRKNVN FFFEASVGGGIPIIRPLYRCLMGERIEEITGILNGTTNFILTKMDKEGESFENALKEAQN LGYAERNPEADVEGHDTCRKIAILTAMATGKEVDYEDIYTEGITRITDIDFKYAEKMGTS VKLFGTSRIKDGRVLAYVAPVMINKTSPLYSVNDVFNGILVKGNMLGTSMFYGSGAGKLP TGSAVVADIIEAAQNLKRNVPMGWTGEKQAIESMDQAQFRYFVRVAGSYRNKEQDVKSAF GQVEVIELYGMDEFAFLTAQMTEGEFKKGAVSYDARERERSEYGIKQMIRAVL >gi|157101616|gb|DS480708.1| GENE 26 25093 - 25536 502 147 aa, chain - ## HITS:1 COG:BH1214 KEGG:ns NR:ns ## COG: BH1214 COG4492 # Protein_GI_number: 15613777 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Bacillus halodurans # 4 145 3 144 147 107 38.0 7e-24 MEDKSKYFVVKQKAVPEVLLKVVEAKKLLETERAITVQEATDKVGISRSSFYKYKDDIFP FYDNTKGKTITLVVQMDDEQGLLSDLLHVVAVYRANILTIHQSIPVNGVATLTLSVEVRE NTGNVSSMIEELEVLDGIHYVKILARE >gi|157101616|gb|DS480708.1| GENE 27 25721 - 26740 1219 339 aa, chain - ## HITS:1 COG:BS_prfB KEGG:ns NR:ns ## COG: BS_prfB COG1186 # Protein_GI_number: 16080582 # Func_class: J Translation, ribosomal structure and 
biogenesis # Function: Protein chain release factor B # Organism: Bacillus subtilis # 1 323 40 362 366 379 57.0 1e-105 MEEPGFWDEPEQSTRMVRESKNLKDEVDTYRALEQQYEDIQVMIQMGYEENDSSLIPEIE EMLEQFKETLENMRMKLLLSGEYDSNNAILRLNAGAGGTESCDWCSILYRMYCRWAESKG FKAHVLDFLDGEEAGIKSVTVQIDGENAYGYLRSEHGVHRLVRISPFNSAGKRQTSFVSC DVMPNIEEDIDIEVNPDDIRVDTYRSSGAGGQHINKTSSAIRITHFPTGIVVTCQNERSQ FQNKEKAMQMLKAKLLMVKQEEQAAKAAGIRGDVKDIGWGSQIRSYVLQPYTMVKDHRTG EESGNVDAVLDGAIDNFISAYLRWMSLGCPDKNVREDEA >gi|157101616|gb|DS480708.1| GENE 28 26987 - 29560 2830 857 aa, chain - ## HITS:1 COG:CAC2846 KEGG:ns NR:ns ## COG: CAC2846 COG0653 # Protein_GI_number: 15896101 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Clostridium acetobutylicum # 1 855 1 837 839 910 56.0 0 MNLIEKVFGTHSERELKMIRPIVTKIESLRPEMMAMSDEELRDQTRIFRERLADGATLDD VLPEAFATVREAARRTLNMEHFPVQLIGGIVLHQGRIAEMRTGEGKTLVSTCPAYLNALK GKGVQIVTVNDYLAKRDAEWMGQVHRFLGMTVGVVLNDMTSEQRKEAYACDITYVTNNEL GFDYLRDNMSIYKEQLVLRDLDYCIIDEVDSVLIDEARTPLIISGQSGKSTKLYEVCDVL ARQLERGTVSKEFSKIDAIMGEEIEETGDFVVDEKDKVVNLTEQGVKKVEEYFHIENLAD PENLEIQHNIILALRANNLMFRDKDYVVKDDEVLIVDEFTGRIMPGRRYSDGLHQAIEAK EHVNVRRESRTLATVTFQNFFNKYTKKAGMTGTAQTEEKEFRNIYAMDVIVIPTNRPMIR KDLEDAVYKTKKEKYKAVVDEVEKAHEKGQPVLVGTIAIETSELLSKMLTKKGIPHKVLN AKFHELEAEIVADAGIHGSVTIATNMAGRGTDIKLDEETKALGGLKIIGTERHESRRIDN QLRGRSGRQGDPGESRFYLSLEDDLLRLFGSDRLMAMFEAMGVPEGEQIEHKMLSNAIEK AQMKIESNNYGIREQLLKFDEVNNEQREVIYAERRKVLDGDNMRDLVLKMITDTVENAVD ISVSDDQTPDKWDLQELNNLLLPVIPLKPVTLSDEQKKSMKKNELKHNLKEEAIKLYETK EAEFPEPEQIREIERVVLLKVIDNKWMAHLDDMDALREGIGLQAYGQRDPVVEYKMQGYE MYESMMASIQEETIRILFHIRVEQKVEREAAAKVTGTNKDASAPSAPKKRAEQKIYPNDP CPCGSGKKYKQCHGRIK >gi|157101616|gb|DS480708.1| GENE 29 30285 - 30815 476 176 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|28210085|ref|NP_781029.1| SSU ribosomal protein S30P [Clostridium tetani E88] # 1 176 1 176 176 187 56 5e-47 MRYTITGRNIEVTPGLKAAVEKKIGKLEHFFTPDTEVIVALSAQKDRQKIEVTIPVKGNT 
IRAEESSTDMYVSIDLVEEIIERQIRRYRKKLIDKKQAAISFSQAFIEEEDEVQDDEIQI VKTKKFAIKPTIPEEACLQMEMLGHNFYVFLNADTDQVNVVYKRKNGTYGLIEPEF Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:49:13 2011 Seq name: gi|157101615|gb|DS480709.1| Clostridium bolteae ATCC BAA-613 Scfld_02_50 genomic scaffold, whole genome shotgun sequence Length of sequence - 16474 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 8, operones - 5 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 5/0.000 - CDS 2 - 524 515 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 598 - 657 2.3 2 1 Op 2 . - CDS 691 - 2061 170 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 3 1 Op 3 . - CDS 2058 - 4676 1845 ## Dhaf_0491 hypothetical protein 4 1 Op 4 . - CDS 4679 - 5380 222 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 - Prom 5402 - 5461 4.6 + Prom 5454 - 5513 4.9 5 2 Tu 1 . + CDS 5551 - 6084 375 ## CPR_0873 hypothetical protein + Term 6139 - 6188 8.9 + Prom 6110 - 6169 1.6 6 3 Tu 1 . + CDS 6201 - 6566 202 ## CLJU_c05760 hypothetical protein 7 4 Op 1 . - CDS 6692 - 6796 68 ## 8 4 Op 2 . - CDS 6867 - 8249 1230 ## COG0534 Na+-driven multidrug efflux pump + Prom 8285 - 8344 5.5 9 5 Tu 1 . + CDS 8373 - 8939 506 ## COG1695 Predicted transcriptional regulators 10 6 Op 1 . - CDS 8901 - 10118 837 ## COG0477 Permeases of the major facilitator superfamily 11 6 Op 2 . - CDS 10284 - 10637 417 ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase - Prom 10743 - 10802 11.3 - Term 10777 - 10830 8.3 12 7 Op 1 3/0.000 - CDS 10862 - 11554 731 ## COG0235 Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases 13 7 Op 2 . - CDS 11644 - 13254 1859 ## COG1070 Sugar (pentulose and hexulose) kinases 14 8 Op 1 . - CDS 13403 - 14899 1697 ## COG2160 L-arabinose isomerase 15 8 Op 2 . 
- CDS 14966 - 15259 356 ## gi|160941627|ref|ZP_02088957.1| hypothetical protein CLOBOL_06526 16 8 Op 3 . - CDS 15259 - 16365 1387 ## COG1609 Transcriptional regulators - Prom 16394 - 16453 5.6 Predicted protein(s) >gi|157101615|gb|DS480709.1| GENE 1 2 - 524 515 174 aa, chain - ## HITS:1 COG:BS_yrkP KEGG:ns NR:ns ## COG: BS_yrkP COG0745 # Protein_GI_number: 16079696 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 5 174 4 173 231 155 47.0 3e-38 MDKRKILFADDDPEIREVVRILLESEGYQVVEAENGEQAASAADDTFDLIILDVMMPGRN GFSACAEIRRKLNTPILFLTARTQDSDKTMGFGAGGDDYLAKPFSYAELAARVKAMIRRY HVYKGKEEDGEESITVKDLVIQKSFNEVMKNSREILLTDLEYRILLLLASNRGK >gi|157101615|gb|DS480709.1| GENE 2 691 - 2061 170 456 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 317 456 294 413 413 70 29 1e-11 MKTKDLLTICLQNLTRHKSRTFLTVLGVIIGCCSVVIMISIGIGMKEAQKNMLAQMGDLT IINVYSAGKGSRSAKLNNQAIRRLKEMKSVEAVTPKLTAENIPITLYAGRNRRYKSAYTT IVGIDVKAAEAMGYKLTDGTWDKGGRDGVFVGENFAYMFEDTKRPSGRNTVDMYSGYDNL DESGMPVKPQPYFDSMKTAYTLDISSDKEDDKKITRQLEAAGRMKEDYGKGEETSMGLVM DLEMLKTLLDQQAKLSGRKPEWKKGYSGALVKVKDMNQVAEVETEIKRMGFRTSSMETIR KPMEREARQKQMMLGGLGAISLFVAALGITNTMIMSISERTREIGVMKSLGCFVRDIRRI FLLEAGCIGLLGGVTGTVFSYAISFVMNMTSEGMSSSSMAGAMEADMAGLPSRLSVIPWW LSLFAVLFSIAVGVGAGYYPAGKAVKISALEAIKHD >gi|157101615|gb|DS480709.1| GENE 3 2058 - 4676 1845 872 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_0491 NR:ns ## KEGG: Dhaf_0491 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 254 827 276 835 857 215 28.0 6e-54 MEITGRLKRAAAVMISTAMIIGMCGNALAEETDSSGGRETIAAEGINERNEGGMAGEGTG GMEAGAAGANGAGANGAGDSTTESAGTETGVAGTSQPGSGTGREMGSETGGSGADSVSGA QDGGSQEGSGGRVRGKEDSNNPDDGAPGSGSLWVGEYEIFAPGGSKDTLDVLRKGRQAKV VVNVRSNGIKTSEVGKRGVTVTKLSDSFRNGENPKVKITSDKEDDLEFTVTFTRLTYTGK 
GDVLRLKVGFKSSGIPSEQLEVNISECEESAPRENGDTGSTTGQPIIRVRRISPQNPVGP GDHFTLEVELENTSKDADIEDMVVNVSPGSSLFIGGDTNTRIVSRLDTGRAELVKFNLIA GQDISGPSQLIDLELKYNYYSGGQLTSAASVQKVLLPVKGGTATGQPVLLIDRGPMGPVS SGQPFQITLKLENTDTVKGIRNLTATFEPNDQISLLEATDTRQIGDIGPGQSVDVSVNLK AGSELSSAASQLLGITLKFDYDTDKGTVQGTYSQRIVVPTNGKTATPGAPTPNIILTNYT YGDKVSAGQVFRLNMEFMNTSQVSPIENVVISLETGEGLSINSSSNTFYVPKMGPGEKKA QQVDVQALFQTKDSKVQSPKITISCKYEYIDKTERKQSTAAETIAVPVYQPDRFQVSPPS FVEEIRQNEETTISLPYVNKGRGQVYNVEASLEGDIQVIDRSLNLGNFDAGKSGTIDFIA TPKKAGTFEGRVKVTYEDESMEIRTMEIPVTFEVKEGAAEETDGADMMDGEMDGGRKMNW KMMAGILTAALVSGILWIKRKKAKGLKDRSRCQEQNGWDELEDGQNLEGWDELEDSQDLE GWDELEDSQVLDGRNELEDSRDQPSENEEDKT >gi|157101615|gb|DS480709.1| GENE 4 4679 - 5380 222 233 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 1 223 1 235 563 90 27 9e-18 MDHGKPIIYVKNVRKVYRMGDEEVVALKRINLRIYKGEVCCIFGTSGSGKSTLLNQLAGM EKPTKGQVFIRGKNISDMNEEELAAFRQEHMSFIFQSYNLLPSMTAVENVAMPLMFKGMD RKRREAMAEEMLKRVGLSHRLHHYPSQMSGGQQQRAGIARAFVSRPEVVFADEPTGNLDT KTTAEIMDMVMGFARRFNQTIILVTHDPGMSRYADRIVTLVDGIITGDERKGQ >gi|157101615|gb|DS480709.1| GENE 5 5551 - 6084 375 177 aa, chain + ## HITS:1 COG:no KEGG:CPR_0873 NR:ns ## KEGG: CPR_0873 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 1 159 1 157 162 144 42.0 2e-33 MKGAILEKGETGYTYLKKLFLSIRHAQKDFNWLITDCQCCTQSEAFYQRIFQYDRYGWLS GEELTRMVETEDFQWIGAVLSGFPRHISLDQVLEYDLPYADMYTGFWTNPVSIQHPLAQV EIVPWDSSCTLVISRNHRIVSDFMDGFPLSRDLSLYNEQKGIFDESDELERWLSQHK >gi|157101615|gb|DS480709.1| GENE 6 6201 - 6566 202 121 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c05760 NR:ns ## KEGG: CLJU_c05760 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: Oxidative phosphorylation [PATH:clj00190] # 5 113 7 115 116 160 66.0 1e-38 MEHTFWTALDKLVEQSEIIIDRPKGSVHPVHPDFIYQVDYGFLRNTSSMDREGIDIWAGS DHTAGIDAILCTVDLLKRDSEIKILLDCTEEEKMLIYKAHNDTSCMKGILIRRTAGISYK P >gi|157101615|gb|DS480709.1| GENE 7 6692 - 6796 68 34 aa, 
chain - ## HITS:0 COG:no KEGG:no NR:no MGSRRLGMKKLYIVLIRMFYLSMTKVYSSDMSKV >gi|157101615|gb|DS480709.1| GENE 8 6867 - 8249 1230 460 aa, chain - ## HITS:1 COG:MA2050 KEGG:ns NR:ns ## COG: MA2050 COG0534 # Protein_GI_number: 20090897 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 6 458 3 458 468 249 35.0 7e-66 MEGRNKKMELLGNAPIPKALLAMGLPTMIGMMINALYNLVDAYFVGGLGTSQMGAISVAF PLGQVVVGLGLLFGNGGASYISRLLGRGDRETADRAASTALYSSIFVGAAVILCAVIFLK PLLKCLGATDSILPYAVTYAGIYVVSSIFNVFNVTMNNIVTSEGAARTTMCVLLTGAAFN MVLDPVFIYVLKLGVAGAAIATAISQAVSTLIYLYYVISGRSMFSFRFRNCCFSNEIMSE ILKIGIPTLAFQLLTSLSITMINMQAKPYGDSVIAGMGAVTRMISLGSLMVFGFIKGFQP IAGYNYGAGNYERLHEAIRISVVWSTLFCAVFGLTMAVFSGPVISRFTKNDMEMIRIGQK ALKANGLSFLLFGFYTVYSSLFLALGKAGEGFVLGACRQGICFVPIILVFPALWGLNGIL YAQPAADVLSAMTALFMAVHLNRDIGKKGKGDFGMNSGIR >gi|157101615|gb|DS480709.1| GENE 9 8373 - 8939 506 188 aa, chain + ## HITS:1 COG:MA2093 KEGG:ns NR:ns ## COG: MA2093 COG1695 # Protein_GI_number: 20090938 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanosarcina acetivorans str.C2A # 1 170 1 167 189 113 35.0 2e-25 MATIDLIVLGILKKESLSAYDIQKLVEYRNISKWVKISTPSIYKKVIQLEEKGYIKSNMI KEGKMPEKAVYSLTEAGEQAFEGLMLEIASKPIHIFLDFNSVIVNLPSLSPDNQADCLDR IEQNVNILKTYLEENISLKEHLPEIPETGMAVLQQQFILAQAIESWIASLRHTLPESSVH TVPQSRKS >gi|157101615|gb|DS480709.1| GENE 10 8901 - 10118 837 405 aa, chain - ## HITS:1 COG:RSp0310 KEGG:ns NR:ns ## COG: RSp0310 COG0477 # Protein_GI_number: 17548531 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Ralstonia solanacearum # 8 399 45 435 450 127 27.0 6e-29 MEQQFAGWRKKFFVIWTGQGISTLTSSIVQMAIIWYITGKTKSAAVLSFATLAGFLPQAV LGMFIGVLIDQHDRKKIMICSDLVMSSACVALAVSGIWGDIPIWLIFIVLFIRSIGNAFY 
APSLQAIMPSILPKDQLTKYAGYSQSFESASMIASPAIAAVLYSMFPLKAILMLDVLGAF FAVGTLVAVDIPGIFREDKEKTNLLAEVKEGYNVMRSVPGMRELLVIGALYAVIYFPIGT LYPLITMSYFNGGVRESGLVETVFAAGTLVGSVSLGIWGGRIDKIKAIAGSIGIYGAGTL ITGLLPPQGFYIFAFLSFFMGISVPFYFGVQTSIYQIKIQGEYLGRALSLSGSISMAAMP LGLVLSGALAGPLGIENWFLISGGATLGIAVFALTLPALRHCMDG >gi|157101615|gb|DS480709.1| GENE 11 10284 - 10637 417 117 aa, chain - ## HITS:1 COG:STM0466 KEGG:ns NR:ns ## COG: STM0466 COG3695 # Protein_GI_number: 16763847 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Salmonella typhimurium LT2 # 3 98 33 127 129 73 46.0 6e-14 MDFYKRVGITVRTVPEGKVATYGQIALLCGKPKNARQVGYALNRGLAGEVPAHRVVNSQG YLTGAASFEHPDLQRMLLEEEEVLVSAEGRVDMKRDGWKNTLEDALRLKEIFEREGI >gi|157101615|gb|DS480709.1| GENE 12 10862 - 11554 731 230 aa, chain - ## HITS:1 COG:sgaE KEGG:ns NR:ns ## COG: sgaE COG0235 # Protein_GI_number: 16132020 # Func_class: G Carbohydrate transport and metabolism # Function: Ribulose-5-phosphate 4-epimerase and related epimerases and aldolases # Organism: Escherichia coli K12 # 2 228 1 227 228 313 62.0 1e-85 MLEELKQKVYEANMELPRRGLITYTWGNVSGIDREKGLFVIKPSGVDYDVLKPSDMVVMD LEGNKVEGEMNPSSDTATHVELYNAFKEIGGIVHTHSPHATAWAQAGRALPCYGTTHADY FYGEIPCARNLTAEEIEEGYEMNTGRVIIETFQGKNPVYIPAVLCKNHGPFTWGKDAAEA VHNAVVLEEIARMNFMTELINPQAGPAPQCMQDKHFMRKHGPNAYYGQGK >gi|157101615|gb|DS480709.1| GENE 13 11644 - 13254 1859 536 aa, chain - ## HITS:1 COG:CAC1344 KEGG:ns NR:ns ## COG: CAC1344 COG1070 # Protein_GI_number: 15894623 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 1 534 1 533 534 702 62.0 0 MSLDINKIKEQIEHGNTSLGIELGSTRIKAVLICEDHTPIASGSHDWENRYVDHIWTYTL EDIWTGIRDCYGKMAEDVKETYGITLTTIGSIGFSAMMHGYMAFDKEGELLVPFRTWRNT ITGPAAGKLTEVFRYNIPQRWSIAHLYQAILNKEEHVKDIDYLITLEGYVHWKLTGRRVL GIGDVAGMFPVDIHARDFDQKRVEQFDQLVAGENFPWKLRDILPKALVAGEDAGVLTEEG AKLLDASGNLKAGIPMCPPEGDAGTGMAATNSVAVRTGNVSAGTSVFAMVVLEKELEHVY 
PEIDLVTTPAGDMVAMVHCNNCTSDLNAWVNLFKEFSEAMGMKADMNQLFGTLYNKALEG DSDCGGLLAYNYFSGEHITHFEEGRPLFVRMPDSRFNLANFMRVHLYTSLGALKTGMDIL LKKEHVKVDSILGHGGLFKTKGVGQRILAGAMNAPVSVMETAGEGGAWGIALLASYMRCR QEGETLDAYLANKVFAGDKGTRMEPVKEDVEGFEAFMKRYAEGLAIEKAAVEHMKD >gi|157101615|gb|DS480709.1| GENE 14 13403 - 14899 1697 498 aa, chain - ## HITS:1 COG:BH1873 KEGG:ns NR:ns ## COG: BH1873 COG2160 # Protein_GI_number: 15614436 # Func_class: G Carbohydrate transport and metabolism # Function: L-arabinose isomerase # Organism: Bacillus halodurans # 1 498 1 494 497 569 54.0 1e-162 MIAVKNYKFWFCTGSQDLYGDECLAHVAEHSGIIVDSLNKSGILPYEVVWKPTLITNELI RRTFNEANADEECAGVITWMHTFSPAKSWILGLQEYRKPLMHFHTQFNQEIPYDTIDMDF MNENQSAHGDREYGHMVTRMGIERKVIVGHWSDEKVVGRIAAWMRTAVGIMESSHVRVAR FADNMRNVAVTEGDKVEAQMKFGWEVDAYPVNELAEYVKAVPKGDITALVDEYYSKYTIL PEGRDPEEFKRHVAVQAQIEAGLEKFLLEKDYHAIVTHFGDLGELQQLPGLAIQRLMEKG YGFGAEGDWKTAAMVRLMKIMTQGMKDAKGTSFMEDYTYNLVPGKEGILEAHMLEVCPTI ADGEISIKACPLSMGDREDPARLVFTSKTGHGIATSLVDLGTRFRLIINDVECKKTEKPM PKLPVATAFWTPEPNLATGAESWILAGGAHHTAFSYDLTAEQMGDWADAMGIETVYIDKD TSIRALKNELRWNAAAYR >gi|157101615|gb|DS480709.1| GENE 15 14966 - 15259 356 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941627|ref|ZP_02088957.1| ## NR: gi|160941627|ref|ZP_02088957.1| hypothetical protein CLOBOL_06526 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06526 [Clostridium bolteae ATCC BAA-613] # 1 87 1 87 97 150 100.0 3e-35 MGAKTGAKKEENAAKPTQITVILRLAAGGYLVYLAFGLLQEFLKPAGGGKMVQLGCAVLF GAIGAFLAGWSLKKFIKGEYIKYGEIPDDEEEEQTED >gi|157101615|gb|DS480709.1| GENE 16 15259 - 16365 1387 368 aa, chain - ## HITS:1 COG:BS_araR KEGG:ns NR:ns ## COG: BS_araR COG1609 # Protein_GI_number: 16080450 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 7 362 26 384 384 221 35.0 1e-57 MGTQELKYKAVYNWVLENINSGALKVGEKLPSENELSERFGLSRQTVRHAVDILEQQKLV LRVRGSGTYVGGNGKAERQERYMNIAVISTYVDSYIFPPVVRGIERVLSKKGYTTQIAFT GNRVSREQDILNNLIDKDIIDGLIVEPAKSALPNPNLHYYQELKERGIPILFFNSRYPEL 
ELPCVSMNDEQVGKKAVEYLIKNGHRNIGGVFKSDDGQGHLRYKGFLSGMLEAGIKVKDA NVVWLDTEDFLDLDQWADYLFRRLESCTGVVCYNDEVAYVLSGLCEKRGIAIPDQLSVVS IDNSDLATLAGVKLTSFPHPMEALGRKAAENMISMIENPYFDGNYLFDSDIIERDSVKVL KQPHKEEM Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:49:50 2011 Seq name: gi|157101614|gb|DS480710.1| Clostridium bolteae ATCC BAA-613 Scfld_02_51 genomic scaffold, whole genome shotgun sequence Length of sequence - 11994 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 8, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 81 - 1415 1437 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase - Prom 1489 - 1548 9.9 2 2 Tu 1 . - CDS 1626 - 1904 368 ## gi|160941631|ref|ZP_02088960.1| hypothetical protein CLOBOL_06529 - Prom 1945 - 2004 2.7 - Term 1923 - 1952 -0.9 3 3 Tu 1 . - CDS 2006 - 2134 57 ## gi|160941632|ref|ZP_02088961.1| hypothetical protein CLOBOL_06530 - TRNA 2186 - 2274 49.3 # Ser GCT 0 0 - TRNA 2337 - 2427 32.8 # Pseudo GTA 0 0 4 4 Op 1 1/0.000 - CDS 2559 - 3671 1291 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs - Term 4163 - 4193 -0.9 5 4 Op 2 . - CDS 4282 - 5517 1308 ## COG1690 Uncharacterized conserved protein - Prom 5651 - 5710 5.4 + Prom 5741 - 5800 6.1 6 5 Op 1 . + CDS 5838 - 7139 1312 ## CA_C3385 hypothetical protein 7 5 Op 2 . + CDS 7146 - 8135 579 ## CA_C3386 hypothetical protein + Term 8314 - 8382 23.1 8 6 Op 1 . - CDS 8117 - 8368 93 ## gi|160941640|ref|ZP_02088969.1| hypothetical protein CLOBOL_06538 9 6 Op 2 . - CDS 8379 - 8636 226 ## gi|160941641|ref|ZP_02088970.1| hypothetical protein CLOBOL_06539 - Prom 8699 - 8758 5.3 + Prom 8485 - 8544 5.2 10 7 Tu 1 . + CDS 8753 - 9034 205 ## Closa_1116 XRE family transcriptional regulator - Term 9034 - 9081 16.4 11 8 Tu 1 . 
- CDS 9124 - 11736 2844 ## COG1012 NAD-dependent aldehyde dehydrogenases - Prom 11851 - 11910 9.9 Predicted protein(s) >gi|157101614|gb|DS480710.1| GENE 1 81 - 1415 1437 444 aa, chain - ## HITS:1 COG:lin0569 KEGG:ns NR:ns ## COG: lin0569 COG0334 # Protein_GI_number: 16799644 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Listeria innocua # 3 444 17 458 458 577 65.0 1e-164 MSYVDEIYERVVAQNPGEPEFHQAVKEVLDSLKLVIDANEEKYRREGILERFTEPERIVS FRVPWVDDNGAVQVNKGYRIQFNSAIGPYKGGLRFHPSVNQSILKFLGFEQTLKNSLTGL PMGGGKGGSNFDPKGKSDREVMAFCQSFMTELYRHIGKDTDIPAGDIGVGGREVGYLFGQ YKRITGLYEGVLTGKGLTFGGSLARTQATGYGLVYILDEMLKHNGKDMAGKTIVVSGSGN VAIYATEKAQQLGAKVVALSDSNGYIYDKDGIQLDIVKEIKEVRRGRIKEYVDEVPTAVY TEGRGIWSIPCDIALPCATQNELNLEDAQTLLANGCFAVAEGANMPSTREATDLFVEKKI LFMPGKAANAGGVATSGLEQSQNALRMSWSFEEVDDKLHTIMVNIFAKVSEAAERYQVAG NYVAGANIAGFEKVVEAMLGQGIV >gi|157101614|gb|DS480710.1| GENE 2 1626 - 1904 368 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941631|ref|ZP_02088960.1| ## NR: gi|160941631|ref|ZP_02088960.1| hypothetical protein CLOBOL_06529 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06529 [Clostridium bolteae ATCC BAA-613] # 1 92 9 100 100 173 100.0 5e-42 MHQTMFPLLIVNILENHATKERPLSVTEITEFVNREFGPFAMEKEQLMNRSTVTRILDAM EFWTEEGNLLNFKVTQCGSENKKLFCLERPKG >gi|157101614|gb|DS480710.1| GENE 3 2006 - 2134 57 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941632|ref|ZP_02088961.1| ## NR: gi|160941632|ref|ZP_02088961.1| hypothetical protein CLOBOL_06530 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06530 [Clostridium bolteae ATCC BAA-613] # 1 42 1 42 42 67 100.0 3e-10 MAEAVFIKVRLYSYGKSGASSYSCYGSGRIGVTENRNCQSPG >gi|157101614|gb|DS480710.1| GENE 4 2559 - 3671 1291 370 aa, chain - ## HITS:1 COG:CAC3381 KEGG:ns NR:ns ## COG: CAC3381 COG0330 # Protein_GI_number: 15896623 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, 
stomatin/prohibitin homologs # Organism: Clostridium acetobutylicum # 5 365 2 361 365 357 50.0 2e-98 MMIKKYIIKEQECGYLMKDGRFVELLTAGRYSYLNMLGYEVQTVPMTGEVKTCGIPEEIL MKDEKFASRVVKAVLPDECIALRFVNKAYREVITKPETLYWNVFEKNEFRLIDITQPYME DTLPRMYMDLMPSKYYKKIVIKDGETGLLYFDNCYEKKLDTGTYYFWNYGREVTCKVFNM KIQQLDISGQEILTADKVAVRLNIICNYRITNPEKLVQTVEGVASQLYTYVQLKLREYVG RYRLDELLEQKEEIGRFVLDKLKEYQEEYCVEITGAGIKDIILPGEIREIMNTVLMAEKK AQANVIMRREEVASTRSLLNTARLMDENRTLFKLKEMEYLEKICDKVGNISLNGGKGVLE QLAELAGVQD >gi|157101614|gb|DS480710.1| GENE 5 4282 - 5517 1308 411 aa, chain - ## HITS:1 COG:STM3519 KEGG:ns NR:ns ## COG: STM3519 COG1690 # Protein_GI_number: 16766807 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Salmonella typhimurium LT2 # 13 411 13 405 405 243 36.0 5e-64 MFVMYEKEKMKVPVRVWLKEREDLEGSCLEQAYHLSQLPFLHKWVSLMPDTHAGMGMPIG GVIAADGVVIPNAVGVDIGCGMAYTETNIKVADIREVITGNGSLLQAVIGDIMRNVPVGF AHHKTMMPSYTMDGALEEMDRYEEDGELLGQLEAGYYQIGTLGGGNHFIELQEDDDGYLA VMIHSGSRHFGKSVCDYFHYKARQLNQKWFSAVPDEYRLAFLPVDTREGKQYLNWMQLSM DFAKENREKMMLAVKAILEKWIGKYTELSLEFSHDINCHHNYASFENHYGKDVWVHRKGA VSAQNGELAVIPGAMGSYSYVVMGKGNQESFCSSSHGAGRQYSRKGAMAAFSCEEVILDL QKQGVILGKKGKADVAEESRFAYKNIEEVMDNQQDLVVPVKRLKTIGVVKG >gi|157101614|gb|DS480710.1| GENE 6 5838 - 7139 1312 433 aa, chain + ## HITS:1 COG:no KEGG:CA_C3385 NR:ns ## KEGG: CA_C3385 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 1 432 1 411 413 394 48.0 1e-108 MAEFQELIKNFDRIRDYMRQFYVYGFKVRNDFSEKSARTYDNERRRIESWLSRYTKSDYT SKGKHVYINVDSKTIPQNPLYAAWKSKSFTDNDLMLHFFILDLLWDCPEGMSSNTVTDLI SHNYGVVFDTQTVRLKLKEYETLGILVSHKDGKTLSYALAPLMPMETEPQCVNSGDMEPD GTEEGEQNLWQCLMTAVKYFQEAAPFGCIGSTILDRQDECNDLFQFKHHFIVHTLEDGIL ADILTAVREKRMIRFENKSSRSGQVSTLSGVPLKIFVSTQTGRRYLCLYLPERRRFSNSR LDSITRVTLLEPCAFYEEVLRDLEKNKEKCWGVSFGSGTSRMEEVCIKLYIDEEKEPYIL NRLYREGRGGEIMKIRENQFLYTGTFFDTNEMLSWVKTFTGRIMDIQGTNIFSIAKVTRD WEKMYEMYCGNEE >gi|157101614|gb|DS480710.1| GENE 7 7146 - 8135 579 329 aa, chain + ## HITS:1 COG:no 
KEGG:CA_C3386 NR:ns ## KEGG: CA_C3386 # Name: not_defined # Def: hypothetical protein # Organism: C.acetobutylicum # Pathway: not_defined # 1 323 1 321 326 256 39.0 7e-67 MELFHKIYSCYYNVVRHILDEAGHSPITRQDMEDICRAYGFQESALSIIPKLTDSTWALL EEQDKHTFTSRLGHAVPALPLTNLQKAWLKSLIQDPRFQLFFTDRQLELLAGEWDGLPSL YHEDDFYYYDRYRDGDSYDSPSYRKHFQAILKAIREERVLLVAYEGKHGRVHSFETAPYQ LQYSSKDDKFRLCCLQLHRNTFSRSTILNLARIKDCHVTSKSCPPDLESRCFEPIQKASE PVVLKISGERNSLERCMLHFANYEKHTEYDAEMKCWICSIHYDLADETELLIDILSFGPV VRVVGPSSFVRQIRKRVKRQHELFYSEIT >gi|157101614|gb|DS480710.1| GENE 8 8117 - 8368 93 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941640|ref|ZP_02088969.1| ## NR: gi|160941640|ref|ZP_02088969.1| hypothetical protein CLOBOL_06538 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06538 [Clostridium bolteae ATCC BAA-613] # 1 83 1 83 83 155 100.0 7e-37 MLYGCPDVLWYIRTAIQHLFHLSCFFLFLCGAYFCAERSPVLCWAGGAGEEALGKEVLGQ GALRQEALLWFLQSPGVPYVISL >gi|157101614|gb|DS480710.1| GENE 9 8379 - 8636 226 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941641|ref|ZP_02088970.1| ## NR: gi|160941641|ref|ZP_02088970.1| hypothetical protein CLOBOL_06539 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06539 [Clostridium bolteae ATCC BAA-613] # 1 85 2 86 86 159 100.0 9e-38 MPDKSQGGLLARLQDLSGCQYLSDLHSPFYIEDIIYAVRTVSMSSYSMSEWEEAFRYITG VRTEFKSKEELVKNLILRLEENKEL >gi|157101614|gb|DS480710.1| GENE 10 8753 - 9034 205 93 aa, chain + ## HITS:1 COG:no KEGG:Closa_1116 NR:ns ## KEGG: Closa_1116 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 10 93 2 85 89 69 38.0 5e-11 MHIPEEAMDQKLKQDLRIGPNIRKYRMQSHLTQDQVTAKMQLMGINISRSIYSQIESGTY NIRISELAAMKEIFNINYESFFDGIMLNHRDKQ >gi|157101614|gb|DS480710.1| GENE 11 9124 - 11736 2844 870 aa, chain - ## HITS:1 COG:CAP0035_1 KEGG:ns NR:ns ## COG: CAP0035_1 COG1012 # Protein_GI_number: 15004739 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: 
Clostridium acetobutylicum # 19 457 9 448 448 597 68.0 1e-170 MAKKIEQVVPEIIDSVDGLNAKMNAMREAQKAFSTFTQEQVDKIFFAAASAADKMRLPLA KMAVEETGMGVVEDKVIKNNYAAEYIYNAYKDTKTVGVIEEDKEYGIKKIAEPIGLIAAV IPTTNPTSTAIFKTLIALKTRNAIIISPHPRAKNCTIAAAKVVLDAAVKAGAPEGIIGWI DVPSLELTNEVMRDADLILATGGPGMVKAAYSSGKPALGVGAGNTPVIIDDTADIKMAVN SIIHSKTFDNGMICASEQSVTVMESIYSEVRKEFEYRGCYFLKPGEIDKVRKTIIVNGAL NAKIVGQKAATIAKLAGVEVPEDTKILIGEVESVDISEEFAHEKLSPVLAMYKAKTFDEA LDKAERLVADGGYGHTSSLYVNVNEEEKILKHAARMKTCRIVINTPSSHGGIGDLYNFKL APSLTLGCGSWGGNSVSENVGVKHLLNIKTVAERRENMLWFRAPEKVYFKKGCLPVALQE LKDVLGKKRAFIVTDSFLYKNGYTHAITDRLNEMGITYTVFSDVQPDPTLANAQAGAKLM REFEPDVILAMGGGSAMDAGKIMWVLYEHPEVDFMDMAMRFIDIRKRVYTFPKMGEKAYF IAVPTSSGTGSEVTPFAVITDQETGIKYPLADYALLPNMAIVDTDNMMSQPRGLTSASGV DVLTHALEAYASVMATDYTDGLALKAMKNVFDYLPTAYNEPTNVEARQKMADASCMAGMA FANAFLGVCHSMAHKLGAFHHLPHGVANALLISLVVDFNAAENPRKMGTFSQYQYPHTRE RYAECARFCGIQAKDDAEAVQKLIVKIEELKRTVGIKSCIKDYGVDEKDFLDRLDDMVEQ AFDDQCTGANPRYPLMSEIKEMYLKAYYGK Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:50:31 2011 Seq name: gi|157101613|gb|DS480711.1| Clostridium bolteae ATCC BAA-613 Scfld_02_52 genomic scaffold, whole genome shotgun sequence Length of sequence - 17661 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 4, operones - 1 average op.length - 15.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 103 - 162 5.9 1 1 Tu 1 . + CDS 209 - 1657 774 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen + Term 1696 - 1759 2.9 2 2 Tu 1 . - CDS 2043 - 3047 365 ## COG1609 Transcriptional regulators - Prom 3210 - 3269 9.1 + Prom 3004 - 3063 4.1 3 3 Tu 1 . + CDS 3199 - 3330 59 ## + Term 3513 - 3569 0.2 4 4 Op 1 . + CDS 3650 - 3871 181 ## gi|160941650|ref|ZP_02088978.1| hypothetical protein CLOBOL_06547 5 4 Op 2 . 
+ CDS 3884 - 5197 803 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 6 4 Op 3 3/0.000 + CDS 5233 - 6063 847 ## COG1082 Sugar phosphate isomerases/epimerases 7 4 Op 4 . + CDS 6122 - 7162 792 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 8 4 Op 5 . + CDS 7206 - 7643 354 ## LLO_3229 hypothetical protein, conserved cupin barrel domain 9 4 Op 6 . + CDS 7667 - 8227 400 ## COG0279 Phosphoheptose isomerase 10 4 Op 7 2/0.000 + CDS 8249 - 9193 450 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase 11 4 Op 8 16/0.000 + CDS 9233 - 10336 703 ## COG0673 Predicted dehydrogenases and related proteins 12 4 Op 9 . + CDS 10358 - 11323 605 ## COG1082 Sugar phosphate isomerases/epimerases 13 4 Op 10 . + CDS 11359 - 12429 409 ## PROTEIN SUPPORTED gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 14 4 Op 11 . + CDS 12446 - 13360 377 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 15 4 Op 12 16/0.000 + CDS 13353 - 14300 430 ## COG0673 Predicted dehydrogenases and related proteins 16 4 Op 13 16/0.000 + CDS 14315 - 15193 429 ## COG1082 Sugar phosphate isomerases/epimerases 17 4 Op 14 . + CDS 15223 - 16182 411 ## COG0673 Predicted dehydrogenases and related proteins 18 4 Op 15 . 
+ CDS 16207 - 16983 394 ## Mahau_0233 xylose isomerase domain-containing protein Predicted protein(s) >gi|157101613|gb|DS480711.1| GENE 1 209 - 1657 774 482 aa, chain + ## HITS:1 COG:MA2369 KEGG:ns NR:ns ## COG: MA2369 COG2865 # Protein_GI_number: 20091201 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Methanosarcina acetivorans str.C2A # 10 378 13 369 510 201 32.0 3e-51 MALDKLFLGESKSIEYKVDVPGKSEKYMKTVVAFANGRGGRIVFGIDDSTLDVTGMNPDT IFQTIDSITNAISDSCEPRIIPDVTLQTVGDKTVIVVEISSGKMRPYYLKSKGIVDGTFI RVAGTTRLAPDFMLKELILEGQNRYYDSEPCDGLTVTKDDIKKLCDDLKKVALEHALTNA AKAEIKDVTENVLLSWGILWEKDGEIVPTNAYALLTGKNTMQQNIQCAIFKGNDRAYFVD RREFNGPIQEQMEAAYQYVLEKINRGMKIDGIYRQDVYELPIDSIRELIANAVAHRSYLD PGNIQVALYDNRLEITSPGMLMNGVTIEKMKEGYSKIRNRAIANAFSYMKIIEKWGSGIP RIIRECQEYGIPEPELIDFDGDFRINMYRQPLLSQAGTIGTNCGTIGTNSDTIGTINGIK VTSPEERVLTIMKQNAKVTQKELQKELGVSLRTVKRMIAELQDKGYIVRSGNNRSGEWIV KG >gi|157101613|gb|DS480711.1| GENE 2 2043 - 3047 365 334 aa, chain - ## HITS:1 COG:BH2313 KEGG:ns NR:ns ## COG: BH2313 COG1609 # Protein_GI_number: 15614876 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 6 331 3 328 337 265 42.0 8e-71 MNAPTIKDVANAAGVSVATISRVLSNPDLVKPATKEHVLRIIKEMNYQPNVLARQLRTQT TRAVIVIVPNIENSFFHGILSGIGTEAERQGYQMLIADLKSQPSLEKHYIEAIKQRQVDG IISLSANMTQKLETLITEKYPLVMVVQCVPNYKIPSVSIDNMAASKALMTHLIRLGHREI AHLTVSPLQMPYQDRLNGYISALEEHKIPVDNELISYGEPAIKGGYDQMWTLLAKQKKFT AVFAAGDTMAIGAMKALKDQGLRVPEDCAVVGFDDIDLSSVWEPAITTIRQPKEMMGRIA FQKLLALMQNEPIAVSQEYLPYELVIRESCGYFL >gi|157101613|gb|DS480711.1| GENE 3 3199 - 3330 59 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTVIDYNLKKSKQISTKLKLVKSNNNYKIISVKITIDNNDDIH >gi|157101613|gb|DS480711.1| GENE 4 3650 - 3871 181 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941650|ref|ZP_02088978.1| ## NR: gi|160941650|ref|ZP_02088978.1| hypothetical protein CLOBOL_06547 [Clostridium bolteae 
ATCC BAA-613] hypothetical protein CLOBOL_06547 [Clostridium bolteae ATCC BAA-613] # 1 73 1 73 73 110 100.0 3e-23 MRAAINILTIVLLVFMFVISMQFTLGGTMKSTVLRIPRMYFYMSIPMGFGLCIYEYLKVV KIKILTDAASEKE >gi|157101613|gb|DS480711.1| GENE 5 3884 - 5197 803 437 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. PR1] # 1 433 1 430 431 313 40 4e-85 MGEIIALLVILTFIFIIAGIPIYISLMFTGLLSLVALASSTATPLATLVIPQSIFNGIGS LPLLAIPFFMLAGEIMNRGKITEKLIRFAMLLIGRLPASLGLSSIVASMFFGGITGSAQA STSCMGGILIPAMKEEGYPTKEAVGIIAAASTCGPIIPPSIIMVVFATAVGCSVGAMFMG GLIPGILVGLILMIVLLVRNHMYHFPKHDERLSRNEVVKVALDGIIPLGMPVIIVGGIMG GVCTPTEAGAIAVVYSLLVSLFVSRTLRLRDIWSLLLATVNSAAPLLLIIACARVFSYGL TALQMPVIVNDLILSLTSNKYVFLMLVNILLLIMGMFMDGGASVIILAPILAPVAAALGI STIHFGVIMALNLTIGNITPPLGYCLFIGSRIGDITVEEGIKGIIPYLIAEIAALLLITY IPSLVTYVPVMLGYSVV >gi|157101613|gb|DS480711.1| GENE 6 5233 - 6063 847 276 aa, chain + ## HITS:1 COG:TM0416 KEGG:ns NR:ns ## COG: TM0416 COG1082 # Protein_GI_number: 15643182 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Thermotoga maritima # 13 247 21 247 270 107 31.0 2e-23 MKFSACTAAFGMADLKKIIDTSAKLGYDAVELTAALHLPVESTPERRKEVLGWIHDAGIV CSALHYIFDGTIRLLSTDSEMMNQSAAYLKQVVDVAYDMECPTVIVGSGGKTRSFEPDWD REAGVKCMAEVIRQVGLYAQEKGVTLAVEAINRYETNFLNTLEEAVNFVNMVNLPNVRTM ADTYHLNIEEVNPAETIRNYGHTLANLHLADSNRQAPGDGHFDFASVAEALREVDFKGYC SFEVFGLYPWKLWYDTFEESVEHMGNGIKYARSIFG >gi|157101613|gb|DS480711.1| GENE 7 6122 - 7162 792 346 aa, chain + ## HITS:1 COG:BH0189 KEGG:ns NR:ns ## COG: BH0189 COG1063 # Protein_GI_number: 15612752 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Bacillus halodurans # 26 339 24 336 343 149 31.0 6e-36 MRKVALVDYGKLELQENCGDIKTLEPGTVKIDVTACSICGSDIALYRGKRSLENERYFGH EFSGVVVDPGDGANGIKKGMRVASELSRACGHCWNCRNGLPNYCRSMNDALLPGGFSEET 
LVLNTQEYSFISPVPDTIDDITATLLEPTNCAYHVARQANVKHGETVVVFGMGPMGLIAS RIIKSMGVDVVVGVDNSKARLEKVRSLGFIEVIDSNDGNWQDQIFEMCGSKGADVIIELT GALPVLQSAFQVVRPGGRIVVGSVYSGFAEQVELLPIMRKELTIRGSKGPYPHLKTDGTS AAVDILVRLQDDLKKLITVYDYKDALKAFDDMMSGLAIKPVIRFRS >gi|157101613|gb|DS480711.1| GENE 8 7206 - 7643 354 145 aa, chain + ## HITS:1 COG:no KEGG:LLO_3229 NR:ns ## KEGG: LLO_3229 # Name: not_defined # Def: hypothetical protein, conserved cupin barrel domain # Organism: L.longbeachae # Pathway: Tyrosine metabolism [PATH:llo00350]; Metabolic pathways [PATH:llo01100]; Microbial metabolism in diverse environments [PATH:llo01120] # 36 133 46 143 343 79 39.0 4e-14 MTNKEHVLKYGGKFSISGGEVWTMPNGHDVHFALNPKMGCRGANIAVGFHKPGKEFAPHK HPISEEILIIHSGKGECYLYDKWIPVETGDIVFAPPGVLHGTRNPAENTEDFVTLGIATP PQLDLYLRAGYDILEDNSGEYVEEV >gi|157101613|gb|DS480711.1| GENE 9 7667 - 8227 400 186 aa, chain + ## HITS:1 COG:jhp0791 KEGG:ns NR:ns ## COG: jhp0791 COG0279 # Protein_GI_number: 15611858 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoheptose isomerase # Organism: Helicobacter pylori J99 # 3 175 10 180 192 98 31.0 9e-21 MFFDTYSEMLNDTIDGLKREDIQKLFDLIEETRNNGKHLFVLGNGGSAAAASHWVCDFGK GINRGDSKRLKIFSPADNGAIFSALGNDCGYETTFVEQMKNFLEPGDLVLTFSVSGSSSN LVEAHRYAKEIGAKTACVVADKGGKIIGMSDFAMIIPSENYGVVEDIHVILGHAISQQIY ADNVGA >gi|157101613|gb|DS480711.1| GENE 10 8249 - 9193 450 314 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 3 313 5 319 319 177 34 3e-44 MYIAALDIGGTKTIVAILDENGGILIQESFPSIVERYETHLELCVQAMKRLMHQTELQAE DFAGLGVSLPGIVDNEKGILLYAPYANWKNVEVAGYLSKNLGISRVRCENDVNACAIGEK RFGLGNNYTDFIWMTVSTGVGGAVVEGSKLVRGGLGFAGELGHLKVEYKSPAHCPCGQYG CLEAHGSGTALIRETRKRRLTSPAFAKALDEMGLKPDGAGCAALARAGNTDALDILNQIG TYLGRGISYCINILNTQAVVIGGGVAASLDLLLPSIRVSVQQNAFKQMQDIDIVKTPLGY EAALLGAAALVLEQ >gi|157101613|gb|DS480711.1| GENE 11 9233 - 10336 703 367 aa, chain + ## HITS:1 COG:YPO2584 KEGG:ns NR:ns ## COG: YPO2584 COG0673 # Protein_GI_number: 16122797 # Func_class: R General 
function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Yersinia pestis # 1 366 1 373 377 163 30.0 5e-40 MKRLKAAVIGSGFIGAAHVEALKRVPGVDVVALVDIVDPVQKAEELDVPNGFSDYREMIE AVEPDCIHICTPNHTHKEIALYAFKHGIHVICEKPMARNAGEAKEMLEAAKKSNLVHAIN LHNRFYPANHQLRNMILDGALGKIYGVHGAYLQDCFSKESDFNWRMLSDKGGSTRVTSDI GSHWIDLAEYVIGSKVKEVFAEFQTNLPVRKMNLAGDLKDVAVDTEDTSYVMVRFENGAV GSAVFSQVYQGRKNQTTIRVSGSEISAEWDSEAIGDLKLGYRNEPNRLLTKDRGLAHPDT APIITYPGGHAEGFPDAFKQNFIAVYGAIRGVKPINPYADFEDGLHQMQVLDKIFESAKT GRWTSVN >gi|157101613|gb|DS480711.1| GENE 12 10358 - 11323 605 321 aa, chain + ## HITS:1 COG:BH1249 KEGG:ns NR:ns ## COG: BH1249 COG1082 # Protein_GI_number: 15613812 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Bacillus halodurans # 1 321 1 322 322 289 43.0 6e-78 MKIGIVANIMQDKPLAEALDYFRNIGIQTIEPGCGGYAGKSHVNPAVLLGDKEKMDGFKR TIADSGLTISALSCHGNPIHPDRGIAAAYDEDMRDAVRLCKELGLDTITCFSGCAGDGPD SKYPNWVTCPWPDDYLKILDYQWNEVLIPYWKEFAAFAKENGVTKIALELHPGFMVYNSE TLKRLRGEVGREIGVNFDPSHLLWQGMDPVSVIRELKDAIYHVHAKDVKVDSINTAVNGV LDTKHYSNEINRSWIFRTIGYGNDTAYWKNIFSTLRIIGYDGAVSIEHEDSLINRFEGLE KAVRIVKESLIMEDKTEMWWA >gi|157101613|gb|DS480711.1| GENE 13 11359 - 12429 409 356 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 [Alteromonas macleodii ATCC 27126] # 33 334 10 312 327 162 32 2e-39 MKKNLISMLLITATAISLAACGSSKAPASAGSSGTSASKDTAASVNASQVVVKYSVTYPS TGTQAEGALKLGELIEECSDGRMKMEFYPSSQLGDKTATFEGLGNGTIEMTECAATDLSA FNDMWSVFSLPYLWENGQQACATVMDPAVREMLEADAQANGFVIIAWTDIGSRSIMNQKK TVNTPADLTGLKIRCMQDPILADSTNAMGAIATPLGASEIYTGLQQGTIDGLDHTPSVVV ANGWQELAKHFSLTEHFTIPDPVFVSKVWFDGLSAENQEAVLEAGKKFSDVWNNEIWPEA TEDGMKAMKEQGVEIVEVDKALFEEAVKPVVDKFLADASEDQKALYDLLTKTRENY >gi|157101613|gb|DS480711.1| GENE 14 12446 - 13360 377 304 aa, chain + ## HITS:1 COG:VNG0444G KEGG:ns NR:ns ## COG: VNG0444G COG0329 # Protein_GI_number: 15789685 # Func_class: E Amino acid transport and metabolism; M Cell 
wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Halobacterium sp. NRC-1 # 6 290 25 304 313 143 33.0 3e-34 MYEKRLCGVVIPTITPMNEDGSIDDASLANFTRYLLGAGVNALYPNGTNGESLLLTKEER DHVAEVMANVNAHRLPLFIQCGSMTTGETASHAVHAVDIGADGIGIMSPAFFQMDEESLL QYYRDVIKGLPENFPVYIYNIPGCTTNDVKPGLLQKLMEEFPNIVGIKYSSPDLMRVEDY LETDGRKPELLIGCDSLFLQCLMTGGVGTVTGPGSIFHERFTRLYRQYQQGDYAGAAMTQ KKIVETDRKLAGIPGIPALKTLLKLRGVIRTDVCRGPLRPLKEDEKRILEEIYDAYCEEE GIHE >gi|157101613|gb|DS480711.1| GENE 15 13353 - 14300 430 315 aa, chain + ## HITS:1 COG:YPO2042_2 KEGG:ns NR:ns ## COG: YPO2042_2 COG0673 # Protein_GI_number: 16122281 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Yersinia pestis # 5 307 3 302 303 110 26.0 3e-24 MSRNKLRIGVIGVGYVASNNFLPVFPRFDDVELAGIMANHLESAKKAQRLCGAQQVVQSL EELVKLDLDCAFVLTPKACHAEQISFLLKAGIDVYSEKPMATTLKDADHMAELSEQTGRK LMIGFNRRYAPVCQKAKEVYTDILPDVIIGQKNRPATEYHATLENAIHIVDLMRFYGGEV KEVHAISKFTDPDYESFTTAQLLFESGASGMLIADRSSGQWEENIEIHGGNRSVFIKIPD SITITDAEEQHTTEMTPLAMGWARSEDKLGFSYAIRHFLDCVKEDRVPLTNAVDAYKTHE LLDRILRSAGLPGME >gi|157101613|gb|DS480711.1| GENE 16 14315 - 15193 429 292 aa, chain + ## HITS:1 COG:Cgl2502 KEGG:ns NR:ns ## COG: Cgl2502 COG1082 # Protein_GI_number: 19553752 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Corynebacterium glutamicum # 87 263 28 189 226 72 27.0 1e-12 MKIAGHTMGTPEYTVNEAIELFHRIGADGAEIVVQDGYCSGIPCDCDEKSLSIVKRCAEE NEIEISCLTPYNSYFNSLDKEVRQAEIESIRNVIDYCQYLNAHYIRIYGGNLLAGDTQKL EERREKLIESLRYLGNLAAQKDVVLIVENHFNTMAVSAKDSAALIRDIDHPAVRILYDQA NLTFTGNEGYEEAISLQQQYVSYMHVKDLVFLEGKDFVSSDVSHPDESERNVRTRIVGEG ILDWPAILKSVKAHGYDGWLSLEYERRWHPDDIPDASIGMKKSIDYLKSILF >gi|157101613|gb|DS480711.1| GENE 17 15223 - 16182 411 319 aa, chain + ## HITS:1 COG:BMEII0866 KEGG:ns NR:ns ## COG: BMEII0866 COG0673 # Protein_GI_number: 17989211 # Func_class: R General function prediction only # Function: 
Predicted dehydrogenases and related proteins # Organism: Brucella melitensis # 59 311 46 301 317 89 26.0 6e-18 MKIALLGVSHWHLPLYLPGLPENSVVGISDDNIGIAERYAGKYSCPAYQDYRQMVMETKP DFVFAFAPHYRMKDTAEYLLKLGIPFSMEKPAGMNAGEVESLYEFCEKKNGFCSIPFVWR YSDTVNELKNKYLTKPITHMSYRFIAGPPSRYLETSSWMLSQRTAGGGCMTNLGVHFIDM ALFLTDSCSAKTLASAYQYSSPYDIETYATSLLKLPNGTSLLLESGYAYPMSEESKRENR WTIVTENGYYILAENRLEIREYDREPVGIPLNTDSDIYYTVYTKTTLNDWLTGAKPRASL KEMLDVNRILDDMNAKAGL >gi|157101613|gb|DS480711.1| GENE 18 16207 - 16983 394 258 aa, chain + ## HITS:1 COG:no KEGG:Mahau_0233 NR:ns ## KEGG: Mahau_0233 # Name: not_defined # Def: xylose isomerase domain-containing protein # Organism: M.australiensis # Pathway: not_defined # 13 255 36 276 281 95 25.0 2e-18 MRLGGFGRIADYDNIRQAGFDYAELDVPEIEALTENEFRIFCDKVHEIGFPVLTGARALP VAKPWFFTDSFNALEYKAYLEHACHRAKLLGMDRIIIGNGKARLLIDETSIKKENRFIDF MRMFAEIAGNYDVEVILEPLGPMYSNYINSLPEAVRIIREVDMPNLFTMADLRHLVWAKE PLTDIVSYSDYVHHIHMDYPISYPERPFPNANDDFDYTEFLDVLKESGYQGTLTIEADIP EDWKRAYRDTAVILQAFL Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:51:46 2011 Seq name: gi|157101612|gb|DS480712.1| Clostridium bolteae ATCC BAA-613 Scfld_02_53 genomic scaffold, whole genome shotgun sequence Length of sequence - 193277 bp Number of predicted genes - 218, with homology - 209 Number of transcription units - 92, operones - 46 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 21/0.000 - CDS 3 - 777 834 ## COG1386 Predicted transcriptional regulator containing the HTH domain 2 1 Op 2 . - CDS 791 - 1612 1005 ## COG1354 Uncharacterized conserved protein 3 1 Op 3 . - CDS 1637 - 2509 788 ## COG1408 Predicted phosphohydrolases 4 1 Op 4 . - CDS 2554 - 3240 741 ## COG0637 Predicted phosphatase/phosphohexomutase - Prom 3340 - 3399 2.7 5 2 Tu 1 . - CDS 3401 - 4027 783 ## COG1592 Rubrerythrin - Prom 4158 - 4217 5.3 + Prom 4102 - 4161 5.8 6 3 Tu 1 . 
+ CDS 4224 - 4634 385 ## Closa_2512 XRE family transcriptional regulator + Term 4715 - 4751 -0.5 - Term 4647 - 4694 11.5 7 4 Op 1 . - CDS 4701 - 5480 649 ## COG2159 Predicted metal-dependent hydrolase of the TIM-barrel fold 8 4 Op 2 . - CDS 5490 - 5939 352 ## COG0698 Ribose 5-phosphate isomerase RpiB 9 4 Op 3 18/0.000 - CDS 5971 - 6690 250 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 10 4 Op 4 19/0.000 - CDS 6677 - 7441 193 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 11 4 Op 5 24/0.000 - CDS 7395 - 8378 824 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 12 4 Op 6 20/0.000 - CDS 8388 - 9254 906 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 13 4 Op 7 . - CDS 9311 - 10558 1115 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component 14 4 Op 8 . - CDS 10491 - 10700 85 ## - Prom 10868 - 10927 3.5 + Prom 10630 - 10689 6.5 15 5 Tu 1 . + CDS 10868 - 11896 826 ## COG1609 Transcriptional regulators + Term 11931 - 11987 9.1 + Prom 11981 - 12040 5.3 16 6 Op 1 7/0.000 + CDS 12090 - 13220 1561 ## COG1840 ABC-type Fe3+ transport system, periplasmic component 17 6 Op 2 17/0.000 + CDS 13399 - 14499 1406 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 18 6 Op 3 . + CDS 14496 - 16160 1922 ## COG1178 ABC-type Fe3+ transport system, permease component 19 6 Op 4 7/0.000 + CDS 16236 - 17963 1814 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 20 6 Op 5 . + CDS 17956 - 18756 978 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain - Term 18615 - 18652 7.1 21 7 Op 1 . - CDS 18746 - 19516 885 ## COG0666 FOG: Ankyrin repeat 22 7 Op 2 1/0.182 - CDS 19527 - 20672 1082 ## COG2199 FOG: GGDEF domain - Prom 20693 - 20752 1.5 23 7 Op 3 . 
- CDS 20756 - 21589 876 ## COG0789 Predicted transcriptional regulators - Prom 21669 - 21728 5.9 + Prom 21584 - 21643 3.7 24 8 Tu 1 . + CDS 21706 - 23034 1461 ## COG0534 Na+-driven multidrug efflux pump + Term 23061 - 23114 5.3 25 9 Op 1 7/0.000 - CDS 23194 - 23973 1047 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 26 9 Op 2 1/0.182 - CDS 23958 - 25670 1771 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 27 9 Op 3 2/0.091 - CDS 25667 - 26830 1068 ## COG1840 ABC-type Fe3+ transport system, periplasmic component 28 9 Op 4 8/0.000 - CDS 26865 - 27941 1449 ## COG3839 ABC-type sugar transport systems, ATPase components 29 9 Op 5 21/0.000 - CDS 28065 - 29774 2030 ## COG1178 ABC-type Fe3+ transport system, permease component - Prom 29855 - 29914 4.7 - Term 29859 - 29910 10.4 30 9 Op 6 1/0.182 - CDS 29970 - 31172 401 ## PROTEIN SUPPORTED gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 - Prom 31199 - 31258 5.0 31 10 Op 1 10/0.000 - CDS 31312 - 35148 3857 ## COG0642 Signal transduction histidine kinase - Prom 35209 - 35268 10.4 - Term 35227 - 35267 7.8 32 10 Op 2 40/0.000 - CDS 35295 - 36515 1108 ## COG0642 Signal transduction histidine kinase 33 10 Op 3 . - CDS 36515 - 37201 869 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 37241 - 37300 1.7 - Term 37228 - 37289 13.0 34 11 Op 1 24/0.000 - CDS 37326 - 38264 988 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 35 11 Op 2 5/0.000 - CDS 38308 - 39321 1114 ## COG1131 ABC-type multidrug transport system, ATPase component 36 11 Op 3 . - CDS 39358 - 40566 1199 ## COG1470 Predicted membrane protein - Prom 40606 - 40665 4.1 - Term 40695 - 40736 4.4 37 12 Op 1 . - CDS 40821 - 42230 1244 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases - Prom 42422 - 42481 4.3 38 12 Op 2 . 
- CDS 42522 - 43010 428 ## COG1309 Transcriptional regulator - Prom 43132 - 43191 9.1 + Prom 43094 - 43153 6.0 39 13 Tu 1 . + CDS 43203 - 43574 257 ## LEUM_0832 hypothetical protein 40 14 Tu 1 . + CDS 43678 - 43896 309 ## gi|160941707|ref|ZP_02089034.1| hypothetical protein CLOBOL_06603 41 15 Op 1 . + CDS 44337 - 44492 89 ## CLL_A2163 putative acetyltransferase + Prom 44549 - 44608 2.5 42 15 Op 2 . + CDS 44628 - 45041 324 ## Dhaf_2644 rifampin ADP-ribosylating transferase + Term 45104 - 45154 18.8 - Term 45081 - 45152 29.4 43 16 Op 1 11/0.000 - CDS 45159 - 46448 654 ## PROTEIN SUPPORTED gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 44 16 Op 2 11/0.000 - CDS 46448 - 46930 601 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 45 16 Op 3 . - CDS 46971 - 48032 319 ## PROTEIN SUPPORTED gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 - Prom 48151 - 48210 7.3 + Prom 48211 - 48270 7.5 46 17 Tu 1 . + CDS 48310 - 49320 1023 ## COG1609 Transcriptional regulators 47 18 Op 1 . - CDS 49317 - 50402 956 ## COG1482 Phosphomannose isomerase 48 18 Op 2 . - CDS 50434 - 51453 539 ## PROTEIN SUPPORTED gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 49 18 Op 3 7/0.000 - CDS 51516 - 52745 1376 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 50 18 Op 4 . - CDS 52758 - 54161 1554 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 54213 - 54272 4.7 - Term 54234 - 54268 6.2 51 19 Tu 1 . - CDS 54303 - 55922 945 ## gi|160941720|ref|ZP_02089047.1| hypothetical protein CLOBOL_06616 - Prom 55965 - 56024 8.7 52 20 Tu 1 . - CDS 56063 - 56476 189 ## Clole_2554 hypothetical protein - Prom 56557 - 56616 6.1 - Term 56671 - 56730 15.3 53 21 Tu 1 . - CDS 56765 - 58021 1227 ## COG1686 D-alanyl-D-alanine carboxypeptidase 54 22 Tu 1 . + CDS 57876 - 58427 242 ## gi|160941723|ref|ZP_02089050.1| hypothetical protein CLOBOL_06619 - Term 58431 - 58466 -0.5 55 23 Tu 1 . 
- CDS 58667 - 58846 158 ## gi|160941727|ref|ZP_02089054.1| hypothetical protein CLOBOL_06623 - Prom 58869 - 58928 8.1 56 24 Tu 1 . - CDS 58960 - 59994 945 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Term 60029 - 60067 10.9 57 25 Tu 1 . - CDS 60116 - 61570 1842 ## COG0516 IMP dehydrogenase/GMP reductase - Prom 61655 - 61714 4.7 - Term 61740 - 61781 -1.0 58 26 Op 1 1/0.182 - CDS 61798 - 63192 1597 ## COG0277 FAD/FMN-containing dehydrogenases 59 26 Op 2 29/0.000 - CDS 63192 - 64382 1080 ## COG2025 Electron transfer flavoprotein, alpha subunit 60 26 Op 3 . - CDS 64410 - 65198 842 ## COG2086 Electron transfer flavoprotein, beta subunit - Term 65207 - 65256 10.6 61 26 Op 4 . - CDS 65293 - 66204 924 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase - Prom 66255 - 66314 11.6 - Term 66318 - 66356 -0.4 62 27 Op 1 18/0.000 - CDS 66420 - 67130 198 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 63 27 Op 2 19/0.000 - CDS 67145 - 67927 248 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 64 27 Op 3 24/0.000 - CDS 67932 - 68918 1209 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 65 27 Op 4 20/0.000 - CDS 68934 - 69800 1096 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 66 27 Op 5 . - CDS 69881 - 71086 1499 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component 67 27 Op 6 . - CDS 71126 - 72205 1150 ## COG0371 Glycerol dehydrogenase and related enzymes - Prom 72243 - 72302 8.0 68 28 Op 1 . - CDS 72390 - 72554 105 ## 69 28 Op 2 . - CDS 72573 - 73220 676 ## COG1802 Transcriptional regulators - Prom 73241 - 73300 6.0 + Prom 73265 - 73324 7.2 70 29 Tu 1 . + CDS 73506 - 73751 335 ## Closa_1693 hypothetical protein + Term 73767 - 73822 17.6 - Term 73751 - 73815 18.9 71 30 Op 1 . - CDS 73844 - 74341 617 ## Closa_1008 hypothetical protein 72 30 Op 2 . 
- CDS 74388 - 74840 318 ## COG1321 Mn-dependent transcriptional regulator - Prom 74905 - 74964 3.0 + Prom 74808 - 74867 2.7 73 31 Op 1 22/0.000 + CDS 74915 - 75196 274 ## COG1918 Fe2+ transport system protein A 74 31 Op 2 . + CDS 75290 - 76774 1504 ## COG0370 Fe2+ transport system protein B 75 31 Op 3 . + CDS 76735 - 76980 111 ## - Term 76898 - 76944 7.1 76 32 Tu 1 . - CDS 76951 - 77160 133 ## gi|160941748|ref|ZP_02089075.1| hypothetical protein CLOBOL_06644 - Prom 77183 - 77242 4.2 77 33 Tu 1 . - CDS 77475 - 77870 220 ## COG2337 Growth inhibitor - Prom 78086 - 78145 4.0 + Prom 78746 - 78805 11.1 78 34 Op 1 . + CDS 78834 - 79094 187 ## TDE0291 hypothetical protein 79 34 Op 2 . + CDS 79084 - 79401 105 ## Dtox_0629 plasmid stabilization system + Term 79453 - 79502 12.1 - Term 79440 - 79488 11.1 80 35 Op 1 . - CDS 79509 - 80012 356 ## Dhaf_2435 RNA polymerase, sigma-24 subunit, ECF subfamily 81 35 Op 2 . - CDS 80102 - 80299 63 ## gi|160941754|ref|ZP_02089081.1| hypothetical protein CLOBOL_06650 82 35 Op 3 . - CDS 80343 - 80447 103 ## - Prom 80467 - 80526 3.9 83 36 Op 1 . - CDS 80544 - 81212 435 ## LDBND_1390 transposase dde domain 84 36 Op 2 . - CDS 81200 - 81832 153 ## LDBND_0402 transposase dde domain - Prom 81868 - 81927 5.0 85 37 Op 1 2/0.091 - CDS 82314 - 82778 149 ## COG3344 Retron-type reverse transcriptase 86 37 Op 2 . - CDS 82859 - 83194 251 ## COG3344 Retron-type reverse transcriptase + Prom 83830 - 83889 6.6 87 38 Op 1 21/0.000 + CDS 84092 - 84886 236 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 88 38 Op 2 16/0.000 + CDS 84864 - 85820 824 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 89 38 Op 3 . + CDS 85864 - 87000 1109 ## COG1879 ABC-type sugar transport system, periplasmic component 90 38 Op 4 . + CDS 87003 - 87809 406 ## COG2313 Uncharacterized enzyme involved in pigment biosynthesis + Term 87999 - 88038 2.2 - Term 87987 - 88023 3.3 91 39 Tu 1 . 
- CDS 88033 - 88152 117 ## gi|160941767|ref|ZP_02089094.1| hypothetical protein CLOBOL_06663 - Prom 88220 - 88279 3.8 - Term 88415 - 88457 -0.7 92 40 Op 1 5/0.000 - CDS 88462 - 88941 224 ## COG3436 Transposase and inactivated derivatives 93 40 Op 2 5/0.000 - CDS 88961 - 89620 166 ## COG3436 Transposase and inactivated derivatives - Prom 89642 - 89701 7.5 94 40 Op 3 . - CDS 89707 - 90048 214 ## PROTEIN SUPPORTED gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 95 40 Op 4 . - CDS 90128 - 90382 58 ## gi|325262201|ref|ZP_08128939.1| hypothetical protein HMPREF0240_01185 96 40 Op 5 . - CDS 90442 - 90585 90 ## gi|160941773|ref|ZP_02089100.1| hypothetical protein CLOBOL_06669 97 41 Tu 1 . - CDS 90752 - 91735 531 ## COG5519 Superfamily II helicase and inactivated derivatives 98 42 Tu 1 . - CDS 92087 - 92395 161 ## ELI_3221 hypothetical protein - Prom 92567 - 92626 9.2 + Prom 92601 - 92660 9.6 99 43 Tu 1 . + CDS 92693 - 93046 58 ## gi|160941778|ref|ZP_02089105.1| hypothetical protein CLOBOL_06674 + Term 93161 - 93197 0.3 100 44 Op 1 . - CDS 93667 - 94065 424 ## USA300HOU_0862 hypothetical protein 101 44 Op 2 . - CDS 94134 - 94286 88 ## gi|160941783|ref|ZP_02089110.1| hypothetical protein CLOBOL_06679 102 44 Op 3 . - CDS 94297 - 94575 177 ## gi|160941784|ref|ZP_02089111.1| hypothetical protein CLOBOL_06680 103 45 Tu 1 . - CDS 94961 - 95215 290 ## PROTEIN SUPPORTED gi|148996730|ref|ZP_01824448.1| 30S ribosomal protein S9 - Prom 95274 - 95333 4.2 104 46 Op 1 . - CDS 95849 - 96058 167 ## 105 46 Op 2 . - CDS 96138 - 96938 565 ## ELI_1005 hypothetical protein 106 46 Op 3 . - CDS 96952 - 97413 76 ## Cthe_0528 hypothetical protein - Prom 97441 - 97500 5.5 107 47 Tu 1 . + CDS 97779 - 98192 202 ## ELI_1003 hypothetical protein 108 48 Tu 1 . + CDS 98304 - 98699 240 ## ELI_1002 phage protein + Prom 98709 - 98768 2.5 109 49 Op 1 . + CDS 98887 - 99831 680 ## ELI_1000 plasmid recombination protein 110 49 Op 2 . 
+ CDS 99875 - 100030 167 ## Tresu_1927 hypothetical protein + Term 100056 - 100109 9.2 111 50 Tu 1 . + CDS 100125 - 101636 1250 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 101809 - 101839 4.3 - Term 101621 - 101658 -0.7 112 51 Tu 1 . - CDS 101871 - 102140 63 ## gi|160941799|ref|ZP_02089126.1| hypothetical protein CLOBOL_06695 - Prom 102231 - 102290 2.5 - Term 102155 - 102197 6.8 113 52 Tu 1 . - CDS 102308 - 102424 144 ## gi|160941800|ref|ZP_02089127.1| hypothetical protein CLOBOL_06696 114 53 Op 1 . - CDS 102537 - 102680 70 ## gi|160941801|ref|ZP_02089128.1| hypothetical protein CLOBOL_06697 115 53 Op 2 . - CDS 102694 - 102912 61 ## gi|160941802|ref|ZP_02089129.1| hypothetical protein CLOBOL_06698 116 53 Op 3 . - CDS 102939 - 103028 61 ## 117 53 Op 4 14/0.000 - CDS 103067 - 104422 1327 ## COG1653 ABC-type sugar transport system, periplasmic component 118 53 Op 5 38/0.000 - CDS 104496 - 105323 670 ## COG0395 ABC-type sugar transport system, permease component 119 53 Op 6 10/0.000 - CDS 105314 - 106246 678 ## COG1175 ABC-type sugar transport systems, permease components 120 53 Op 7 5/0.000 - CDS 106227 - 107123 641 ## COG3839 ABC-type sugar transport systems, ATPase components 121 53 Op 8 . - CDS 107143 - 107325 187 ## COG3839 ABC-type sugar transport systems, ATPase components 122 53 Op 9 . - CDS 107346 - 107906 388 ## COG5418 Predicted secreted protein - Prom 107971 - 108030 7.2 123 54 Op 1 44/0.000 - CDS 108048 - 109028 671 ## COG4608 ABC-type oligopeptide transport system, ATPase component 124 54 Op 2 44/0.000 - CDS 109030 - 110067 773 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 125 54 Op 3 49/0.000 - CDS 110067 - 110966 854 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 126 54 Op 4 38/0.000 - CDS 110981 - 111931 774 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Term 111965 - 112003 7.3 127 54 Op 5 . 
- CDS 112018 - 113652 1536 ## COG0747 ABC-type dipeptide transport system, periplasmic component - Prom 113720 - 113779 1.8 128 55 Op 1 5/0.000 + CDS 114115 - 115149 269 ## PROTEIN SUPPORTED gi|225084369|ref|YP_002657150.1| ribosomal protein S16 129 55 Op 2 3/0.000 + CDS 115146 - 115949 493 ## COG4587 ABC-type uncharacterized transport system, permease component 130 55 Op 3 . + CDS 115952 - 116731 348 ## COG3694 ABC-type uncharacterized transport system, permease component - Term 116865 - 116912 1.2 131 56 Op 1 . - CDS 116958 - 117215 102 ## Ccel_2737 hypothetical protein 132 56 Op 2 . - CDS 117229 - 119034 1076 ## COG3505 Type IV secretory pathway, VirD4 components 133 56 Op 3 . - CDS 119039 - 120502 596 ## DSY4633 hypothetical protein 134 56 Op 4 . - CDS 120515 - 122470 866 ## HM1_0619 hypothetical protein 135 56 Op 5 . - CDS 122485 - 122892 297 ## DSY4635 hypothetical protein - Term 122977 - 123012 2.0 136 56 Op 6 . - CDS 123046 - 123255 251 ## COG1598 Uncharacterized conserved protein - Prom 123326 - 123385 8.8 137 57 Tu 1 . - CDS 123435 - 123719 257 ## gi|160941824|ref|ZP_02089151.1| hypothetical protein CLOBOL_06720 - Prom 123749 - 123808 5.8 138 58 Op 1 . - CDS 124213 - 124497 187 ## Dtox_1510 hypothetical protein 139 58 Op 2 . - CDS 124533 - 125516 442 ## Closa_0805 hypothetical protein 140 59 Op 1 . - CDS 125727 - 125945 250 ## gi|160941830|ref|ZP_02089157.1| hypothetical protein CLOBOL_06726 141 59 Op 2 . - CDS 125967 - 127124 726 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 142 59 Op 3 . - CDS 127146 - 128651 499 ## Ccel_2752 hypothetical protein 143 59 Op 4 . - CDS 128684 - 130447 1264 ## COG3451 Type IV secretory pathway, VirB4 components 144 59 Op 5 . - CDS 130527 - 131126 474 ## HM1_0602 hypothetical protein 145 59 Op 6 . - CDS 131126 - 131386 301 ## Closa_3123 hypothetical protein 146 59 Op 7 . 
- CDS 131409 - 131582 159 ## gi|160941836|ref|ZP_02089163.1| hypothetical protein CLOBOL_06732 - Prom 131679 - 131738 3.3 147 60 Op 1 . - CDS 131777 - 132679 680 ## DSY4551 hypothetical protein 148 60 Op 2 . - CDS 132719 - 133135 221 ## Dtox_1496 hypothetical protein 149 60 Op 3 . - CDS 133182 - 133295 58 ## 150 60 Op 4 . - CDS 133292 - 133828 136 ## gi|160941840|ref|ZP_02089167.1| hypothetical protein CLOBOL_06736 151 60 Op 5 . - CDS 133833 - 134168 141 ## CKR_3426 hypothetical protein - Prom 134290 - 134349 3.8 - Term 134341 - 134375 1.1 152 61 Tu 1 . - CDS 134388 - 135977 437 ## HM1_0594 hypothetical protein - Prom 136057 - 136116 3.6 - Term 136091 - 136120 0.5 153 62 Tu 1 . - CDS 136169 - 136765 307 ## HM1_0593 hypothetical protein - Prom 136833 - 136892 5.7 154 63 Op 1 . - CDS 136906 - 137397 407 ## DSY4557 hypothetical protein 155 63 Op 2 . - CDS 137445 - 137837 391 ## HM1_0591 hypothetical protein 156 63 Op 3 . - CDS 137854 - 138726 540 ## LM5578_1887 hypothetical protein 157 63 Op 4 . - CDS 138728 - 139657 508 ## CKL_3887 hypothetical protein 158 63 Op 5 . - CDS 139654 - 141279 1048 ## COG4962 Flp pilus assembly protein, ATPase CpaF 159 63 Op 6 . - CDS 141273 - 142082 763 ## LM5578_1890 hypothetical protein 160 63 Op 7 . - CDS 142075 - 142911 664 ## LM5578_1891 pilus assembly protein CpaB - Prom 142976 - 143035 1.8 161 64 Op 1 . - CDS 143068 - 146139 1314 ## COG4886 Leucine-rich repeat (LRR) protein 162 64 Op 2 . - CDS 146176 - 146592 99 ## Closa_3106 peptidase A24A prepilin type IV 163 64 Op 3 . - CDS 146603 - 146986 334 ## COG2088 Uncharacterized protein, involved in the regulation of septum location 164 64 Op 4 . - CDS 147008 - 147220 146 ## HM1_0583 hypothetical protein - Prom 147245 - 147304 4.2 - Term 147249 - 147292 5.3 165 65 Tu 1 . - CDS 147308 - 147559 187 ## gi|160941856|ref|ZP_02089183.1| hypothetical protein CLOBOL_06752 166 66 Op 1 . - CDS 148267 - 148932 441 ## FN1197 hypothetical protein 167 66 Op 2 . 
- CDS 148922 - 150271 792 ## COG1106 Predicted ATPases - Prom 150329 - 150388 7.5 - Term 150514 - 150564 11.5 168 67 Tu 1 . - CDS 150611 - 150808 190 ## Closa_3551 hypothetical protein - Prom 150847 - 150906 3.3 169 68 Tu 1 . - CDS 151070 - 152686 1142 ## Closa_3096 hypothetical protein 170 69 Tu 1 . - CDS 152805 - 152966 161 ## gi|160941863|ref|ZP_02089190.1| hypothetical protein CLOBOL_06759 - Prom 152987 - 153046 1.6 171 70 Op 1 . - CDS 153071 - 153175 106 ## 172 70 Op 2 . - CDS 153205 - 153813 413 ## HM1_0579 hypothetical protein 173 70 Op 3 . - CDS 153794 - 154084 272 ## gi|160941866|ref|ZP_02089193.1| hypothetical protein CLOBOL_06762 174 70 Op 4 . - CDS 154088 - 154375 249 ## Closa_3101 hypothetical protein 175 70 Op 5 . - CDS 154335 - 154784 334 ## DSY4571 hypothetical protein 176 70 Op 6 . - CDS 154797 - 155726 625 ## DSY4669 hypothetical protein 177 70 Op 7 . - CDS 155742 - 156356 411 ## LM5578_1902 hypothetical protein 178 70 Op 8 . - CDS 156432 - 156962 318 ## gi|160941871|ref|ZP_02089198.1| hypothetical protein CLOBOL_06767 179 70 Op 9 . - CDS 157036 - 157419 223 ## gi|160941872|ref|ZP_02089199.1| hypothetical protein CLOBOL_06768 - Term 157988 - 158027 4.5 180 71 Tu 1 . - CDS 158108 - 158371 153 ## gi|160941874|ref|ZP_02089201.1| hypothetical protein CLOBOL_06770 - Prom 158587 - 158646 6.8 - Term 158606 - 158664 8.1 181 72 Tu 1 . - CDS 158671 - 159195 255 ## COG4422 Bacteriophage protein gp37 - Prom 159385 - 159444 3.9 182 73 Tu 1 . - CDS 159458 - 159883 434 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 159975 - 160034 4.9 183 74 Tu 1 . - CDS 160069 - 160494 250 ## Dhaf_2445 hypothetical protein - Prom 160653 - 160712 3.6 184 75 Tu 1 . - CDS 160733 - 162235 571 ## HM1_0571 hypothetical protein - Prom 162332 - 162391 5.8 185 76 Op 1 . - CDS 162448 - 162636 89 ## gi|160941880|ref|ZP_02089207.1| hypothetical protein CLOBOL_06776 186 76 Op 2 . 
- CDS 162708 - 162791 60 ## - Prom 162947 - 163006 6.7 - Term 162974 - 163015 1.2 187 77 Op 1 . - CDS 163082 - 164065 474 ## COG0535 Predicted Fe-S oxidoreductases 188 77 Op 2 . - CDS 164069 - 164401 262 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit 189 77 Op 3 . - CDS 164416 - 164844 230 ## Closa_1175 C_GCAxxG_C_C family protein 190 77 Op 4 . - CDS 164869 - 165444 109 ## Closa_1173 iron-sulfur binding reductase - Prom 165553 - 165612 3.3 191 78 Tu 1 . - CDS 165836 - 166552 424 ## COG0398 Uncharacterized conserved protein - Term 166563 - 166606 7.7 192 79 Op 1 . - CDS 166618 - 167022 204 ## Bmur_0644 YciC 193 79 Op 2 . - CDS 167035 - 167676 534 ## Bmur_0645 hypothetical protein 194 79 Op 3 . - CDS 167724 - 168458 301 ## COG0398 Uncharacterized conserved protein 195 80 Tu 1 . - CDS 169031 - 169285 175 ## CCV52592_2220 prevent-host-death family protein - Prom 169371 - 169430 5.9 + Prom 169236 - 169295 5.8 196 81 Op 1 31/0.000 + CDS 169402 - 170082 671 ## COG0765 ABC-type amino acid transport system, permease component 197 81 Op 2 16/0.000 + CDS 170128 - 171027 709 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 198 81 Op 3 1/0.182 + CDS 171066 - 171803 630 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 171821 - 171852 3.2 199 82 Tu 1 . + CDS 171876 - 173099 491 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase - Term 173076 - 173105 0.4 200 83 Op 1 . - CDS 173148 - 173558 383 ## EUBREC_0966 N-acetylmuramoyl-L-alanine amidase domain protein 201 83 Op 2 . - CDS 173558 - 174040 414 ## EUBREC_0966 N-acetylmuramoyl-L-alanine amidase domain protein 202 83 Op 3 . - CDS 174052 - 175062 426 ## COG3943 Virulence protein - Prom 175108 - 175167 7.4 + Prom 175379 - 175438 8.0 203 84 Op 1 . + CDS 175592 - 175750 149 ## gi|160941902|ref|ZP_02089229.1| hypothetical protein CLOBOL_06798 204 84 Op 2 . 
+ CDS 175754 - 177379 856 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 205 84 Op 3 . + CDS 177424 - 178167 764 ## COG0370 Fe2+ transport system protein B + Term 178205 - 178259 15.0 - Term 178193 - 178247 15.0 206 85 Op 1 41/0.000 - CDS 178287 - 179909 1631 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 207 85 Op 2 . - CDS 179953 - 180237 518 ## COG0234 Co-chaperonin GroES (HSP10) - Prom 180375 - 180434 4.2 + Prom 180339 - 180398 14.2 208 86 Tu 1 . + CDS 180500 - 181111 733 ## Closa_1005 hypothetical protein + Term 181156 - 181217 11.2 - Term 181151 - 181195 7.2 209 87 Op 1 . - CDS 181271 - 182218 956 ## Closa_0990 hypothetical protein - Prom 182258 - 182317 5.2 210 87 Op 2 . - CDS 182336 - 183739 1196 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs - Prom 183792 - 183851 9.1 + Prom 183810 - 183869 6.3 211 88 Tu 1 . + CDS 183999 - 184469 473 ## gi|160941911|ref|ZP_02089238.1| hypothetical protein CLOBOL_06807 + Term 184543 - 184589 8.2 - Term 184529 - 184575 8.2 212 89 Tu 1 . - CDS 184634 - 186328 2121 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Prom 186367 - 186426 7.8 + Prom 186266 - 186325 4.7 213 90 Op 1 8/0.000 + CDS 186470 - 187201 859 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) 214 90 Op 2 . + CDS 187205 - 187534 367 ## COG1687 Predicted branched-chain amino acid permeases (azaleucine resistance) + Term 187576 - 187639 10.1 - Term 187573 - 187616 4.5 215 91 Op 1 . - CDS 187676 - 190246 2379 ## COG2206 HD-GYP domain 216 91 Op 2 . - CDS 190261 - 191895 1478 ## Dehly_1104 GH3 auxin-responsive promoter 217 91 Op 3 . - CDS 191902 - 192432 186 ## DhcVS_1159 hypothetical protein - Prom 192672 - 192731 5.2 - Term 192817 - 192848 3.4 218 92 Tu 1 . 
- CDS 192890 - 193249 371 ## BBR47_42170 hypothetical protein Predicted protein(s) >gi|157101612|gb|DS480712.1| GENE 1 3 - 777 834 258 aa, chain - ## HITS:1 COG:CAC2060 KEGG:ns NR:ns ## COG: CAC2060 COG1386 # Protein_GI_number: 15895330 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing the HTH domain # Organism: Clostridium acetobutylicum # 33 230 3 198 202 121 34.0 1e-27 MSSKDRIEALEAGAGVVETRTGKLAGSEKREAKEQEGQMELAFPPTEDEKEQEAVIEAVL FTMGRSVELRQLAASIGQPEEVARKAVERLIKRYRSARSGMEITQLEDSYQMCTKAAYYE NLIRVASAPKKQVLTEVVLETLSIIAYKQPVTKMEIEKIRGVKSDHAVNRLVEYNLVYEV GRLDAPGRPALFATTEEFLRRFGVGSVQDLPDLGPEQEAEIKAEVEEELQLKLEELTVEA ASGEAAETAETAMETAET >gi|157101612|gb|DS480712.1| GENE 2 791 - 1612 1005 273 aa, chain - ## HITS:1 COG:CAC2061 KEGG:ns NR:ns ## COG: CAC2061 COG1354 # Protein_GI_number: 15895331 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 10 252 2 243 249 142 40.0 6e-34 MPLTGERMETISYKLEHFEGPLDLLLHLIEKNKINIYDIPIVEITAQYLDYVRHMEREDL NIVSEFLVMAATLLDIKAKMLLPKEVDEEGEEIDPRAELVMRLLEYKKYRLMADELADRE DGALKHLYKPSTLPPEVAKYEPPVDLDKLLDGLTLAKLQRIFEQVMKRKEDKIDPIRSNF GTIHREPVSLEQKIGNVLLYARRKRRFSFRELLEGQPDKLEVVVTFLAVLELMKIGKIHL SQEETFGDMDIETLEQEGEEEELDLAQLGDFEG >gi|157101612|gb|DS480712.1| GENE 3 1637 - 2509 788 290 aa, chain - ## HITS:1 COG:lin0757 KEGG:ns NR:ns ## COG: lin0757 COG1408 # Protein_GI_number: 16799831 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Listeria innocua # 22 288 21 285 293 138 34.0 1e-32 MWKEAVSLAAAAGAACMLRSEYEKKHFSVEVTEIISPKLERDRNLIFLSDLHNNEFGRDN GELVAAIHRLNPDAVLSGGDMMVCKGKRDIKVPLKLFRQLAASYPVYCGNGNHENRMVWE RSLYGDLYEEYRDAMTAMGVSYLEDSCAFLGRDLRIAGLDLAPCFYRKALYQKVPPMPPG YLERKLGKGPSDCKEEPFTILLAHSPLYFDSYARWGADLTLAGHFHGGTIRIPGLGGVMT PQYQFFLPWCAGDFERDGKRMIVSRGLGTHSINIRLNNRPQLVLIRLRRA >gi|157101612|gb|DS480712.1| GENE 4 2554 - 3240 741 228 aa, chain - ## HITS:1 COG:VCA0102 KEGG:ns NR:ns ## COG: VCA0102 COG0637 
# Protein_GI_number: 15600873 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Vibrio cholerae # 6 218 7 215 219 130 36.0 1e-30 MPNPIQAVIFDQDGLMFDTESLAATSWFEVGPKYGIHVDGNFLRGIRGCKPDKVKQVCTQ QFGEEAMKDYDRFREEKRQYSYRWIAEHGVPVKKGLKELLIYLKDHNIKTAVATASSESW TQGNVRGAGVEKYFDDYIYGDMVKEAKPNPAIFLLAARRLGVDPGACVVLEDSFNGIKAA AAGGFNPVMIPDQDQPDEEIRNLLTACCDSLTDVIGLFENGSLSLKQP >gi|157101612|gb|DS480712.1| GENE 5 3401 - 4027 783 208 aa, chain - ## HITS:1 COG:MTH756 KEGG:ns NR:ns ## COG: MTH756 COG1592 # Protein_GI_number: 15678781 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Methanothermobacter thermautotrophicus # 5 190 3 193 197 145 41.0 8e-35 MTDFKTSETAKNLMRAFAGESQARNRYTFAAGLAKEQKMPMVEAVFRYTADQEKEHAEIF YDYLKPLDKETIFIDGGYPVDLEKNTLAQLNAAAHNEYEEHDVVYKSFGDIAQKEGFSQI AATFRMIAEIEKTHGERFQYLADLLGAGNLYVSPQKETWICTNCGYIYEGEEPPQYCPVC RHDQGYFARKELAPWYLQDNMQKNKQAK >gi|157101612|gb|DS480712.1| GENE 6 4224 - 4634 385 136 aa, chain + ## HITS:1 COG:no KEGG:Closa_2512 NR:ns ## KEGG: Closa_2512 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 87 1 88 92 83 51.0 3e-15 MIRYDRLWATMEAQGMTQYRLIKHYNFSAGQIGRLKKNMHVSTHTLDTLCTLLGCDISDI IEYIPEGSPAPEEPSVPEAGSAEKETSSKSRKSDSPKGDKPKKSEPKKSAKSKKGDKKAA KEGKEGKKGKEGKKNK >gi|157101612|gb|DS480712.1| GENE 7 4701 - 5480 649 259 aa, chain - ## HITS:1 COG:MK0426 KEGG:ns NR:ns ## COG: MK0426 COG2159 # Protein_GI_number: 20093864 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the TIM-barrel fold # Organism: Methanopyrus kandleri AV19 # 31 254 26 242 243 85 32.0 8e-17 MEKRYIFDADTHVSPYSNFDKSINACQWEERMERAGVAHAIVWLLPQGVVDVTESNRYIG REAQKNKRIVPFGWANIREGAEKAREDARLCLEEYGCAGVKLNGAQNEYYIDCPEAMSVM EVVAARNGMIAFHIGADYPDFTDPIRAERAARAFPDTKFLMVHMGGAGEPDVSQRVIETA ARNPNMYLVGSAIPAAKVKAAIDTLGPDRILFGSDTPFYDAADVIREYDRMLSGYDEEVR RKIMGENAGRLFGLPVQSV 
>gi|157101612|gb|DS480712.1| GENE 8 5490 - 5939 352 149 aa, chain - ## HITS:1 COG:SMb20371 KEGG:ns NR:ns ## COG: SMb20371 COG0698 # Protein_GI_number: 16264105 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Sinorhizobium meliloti # 1 149 1 148 148 154 51.0 7e-38 MVISVGSDHAGYGLKVKMIPYLESLGHTIIDNGNNGAEDVVFFPEVARAVCRPVLEGTAE RGIMFCGTGVGASVACNKIPGIRASIVHDYQCAHQCVEHDHVNVMCIGEKVVGEWLAKDL LKAFLEAAGDTDERTAHVIQLLAQMDQRQ >gi|157101612|gb|DS480712.1| GENE 9 5971 - 6690 250 239 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 227 1 229 245 100 28 4e-20 MLKIENLIANYGRIEALHGISLEVREHEIVALIGSNGAGKTTLLNSVSGHVTKEGKIELA GENISKLAPDKIARKGVLHVPEGRHVFPGLTVEQNLLVGTAANKGVRLSAGNNKGDLDYV YDIFPRLLERRKQLAWSLSGGEQQMLAIGRAIMGHPRLLMLDEPSMGLAPLVIDELFEKI VEINKSGTPVLLVEQNAKLALKVSDRAYILERGNISLSGKSKDLINDKRVTEAYLGKSR >gi|157101612|gb|DS480712.1| GENE 10 6677 - 7441 193 254 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 1 235 1 230 312 79 26 1e-13 MPDTLKEEQAVLRLDNVSKNFGGVVAASDICIQLFRGEVFGLIGPNGAGKTTLLNLITGI YMADGGDIYLDGGKITRFATHRRAKMGIARTFQHPRLLNRCDIRTNIFMGVDLAARRRGK NKNVEQELAQLLACAGLDGVDLTDQVDQLAYGQQKMLEIVRAILCEPVVLLLDEPAAGLN HKEMDYIVALIDKAVEKKIAVLLIEHAMDLVMSVCDRITVLNFGCQIATGTPKEIQMNEA VITAYLGGGGSAEN >gi|157101612|gb|DS480712.1| GENE 11 7395 - 8378 824 327 aa, chain - ## HITS:1 COG:YPO3806 KEGG:ns NR:ns ## COG: YPO3806 COG4177 # Protein_GI_number: 16123940 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Yersinia pestis # 27 306 104 406 428 119 28.0 8e-27 MLEGIRSVKKPYILTLMAIGLGCSILPAVSSSYMMRVFNITLITYLCVLSVYVLLGLCGQ NSFAQAGLWGVGAYITANMVLKLGLGSAAAFIIATTGTAAFAFILGFAFFRLKQYYFTFA SVGLMTILNGLFMNWVPVTGGALGISDIPVFSIAGFAANTENRKFFLILIVCMLASLGIR 
ILFHSPLGRSFMAIRDNEMAANCLGINSLLTKSIAFAISGALCGAAGAMYAFLSGYLSYQ TFTYQQSTMYLIMIMLGGTMSPVGAIIGTLIISLLQEWVRPLQNYMQFIYGTGIMILMIV QPEGIIGGGKEFYARYIKGRTGRFTVR >gi|157101612|gb|DS480712.1| GENE 12 8388 - 9254 906 288 aa, chain - ## HITS:1 COG:RSc2440 KEGG:ns NR:ns ## COG: RSc2440 COG0559 # Protein_GI_number: 17547159 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Ralstonia solanacearum # 3 288 5 311 311 136 37.0 4e-32 MNVQLLVAGVSMGIIYGLIAIGMVLIFRSVGVMNFAQGEFLMFGGYFCYTFNKILNLPIV LSLVLASLSMGVVGILFMRTSYWPLRSAQAKAIIVSAMGASIVFKEGARLIWGSIPVTMD RVIQGTAHLGQATIQWQNVAIIAISAIIMALVYMLLEKTFVGNIMQATAQDQYMASLTGV PVIVSIGLTFALSAMITGIGGGLLAPIFFLNNTMGATAGAKAFAAIVIGGFGSVPGAVLG GLIVGLVEAFGGAYISSTYQLVLIYVVLIAVLMFRPQGIFGEKIQEKA >gi|157101612|gb|DS480712.1| GENE 13 9311 - 10558 1115 415 aa, chain - ## HITS:1 COG:FN1432 KEGG:ns NR:ns ## COG: FN1432 COG0683 # Protein_GI_number: 19704764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Fusobacterium nucleatum # 1 409 1 382 383 118 26.0 2e-26 MKKMLAVTLAVLMAGIVAGCGTKTESSAAAETETAAVKSEASDTEQAQESQEPSASGDPI RIGLTTVLTGDRSLEGEYASNAAKIITEEINTQGGVLGRPIEIVIEDALGTDVGAVNAYR KLASDDSIVAIVGPDSSNDGMAQSPSALEFKILTTAQGSSPTLKNMCNNDNPYLFQLRAC DDTLCAALIKYSVEELGIKSYAVMHDTETSSADQARLFTEALKSYGIEPVVTVPFTTGTK DFSSHIAQVQASGADAIIGAAFQTEAAILIQQIRSLGIEAPILGSNGFADPVTIQLAGEL NDNVYCASAWVPNTPNPKGAALAEKYKELYGDDCGKAAAQIYDHISVICEAITLAGSTDR EAVREAMNTIGDYQGAITKYDCRTNGDCGRGGLLVKVVKGKAEIISEITSEKVIE >gi|157101612|gb|DS480712.1| GENE 14 10491 - 10700 85 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYEMHKNITFNTNKLTYTIKYIIISRVNDYTKRDVNQGKNIRIWRYFYEENVSSNIGRVN GGDSGRMRH >gi|157101612|gb|DS480712.1| GENE 15 10868 - 11896 826 342 aa, chain + ## HITS:1 COG:PA1949 KEGG:ns NR:ns ## COG: PA1949 COG1609 # Protein_GI_number: 15597145 # Func_class: K Transcription # Function: Transcriptional regulators # 
Organism: Pseudomonas aeruginosa # 5 338 3 333 337 163 31.0 6e-40 MAKITIRDIAKQSGVSTATVSRVLNNSPKVSPEVRSTILRVIEELGYHPSQAARELKNKS SKTIALVIADGTNEYYFQIAQAIIHAIQEEGYTLFICNSFNNSEIERNYLMMLSERKIDG LVLNSCGGNNELIVSLSQKIPTVLIHRRIDHPGFVGDFANADFGISTYEMTMELIRNRHT KIGIICGPTQFSSAAERLQQFKRAMASIHITVDRDYRYFLEGPHDSAFGYEALEQFLRLP HPPTALVAAHNETCIGALRYCRDHNIPIPETLSIVAPCNVNLCDLFYVQPTYALPDTWAL GFRIGQMLLERIQSENQLLNREAIYIPRIIQGNSVSCPGDPR >gi|157101612|gb|DS480712.1| GENE 16 12090 - 13220 1561 376 aa, chain + ## HITS:1 COG:SP0243 KEGG:ns NR:ns ## COG: SP0243 COG1840 # Protein_GI_number: 15900178 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 49 369 12 329 338 254 45.0 2e-67 MRKFQKATAVIMAAALAATGCSSGGAKTSDTQQPSGTEAAKTEAAGDAAKTEAAKDAASG DEEPVLVVYTARSEALNNAVIPEFEKDTGIKVEVVVAGTGELLKRAQSEKDNPLGDIFWV ADQTMLSASKDLFMEYVSPEDENMLDAFRNTTGYFTPAFADPTVMIVNKDLKGDMKIEGF EDLLNPELKGKIAFGDPVNSSSAFQSLLAMLYGMGKDGDPLSDEAWAYVDQFIANLDGKM ANSSSQVYKGVAEGEYVVGLTWEDPAANYVKEGAAVEVVFPKEGAIFPGESVQILKDCKH PENAKKFVDYMLSEKIQNAVGSNLTVRPLRKDAKLADYMTPQSEIKLFDNYDEGWVAEHK TEITNLFSEHMETSMD >gi|157101612|gb|DS480712.1| GENE 17 13399 - 14499 1406 366 aa, chain + ## HITS:1 COG:FN0376 KEGG:ns NR:ns ## COG: FN0376 COG3842 # Protein_GI_number: 19703718 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Fusobacterium nucleatum # 1 360 1 357 371 408 58.0 1e-113 MSVAISIENVIKRFGKDTIINGLSLDVRPGEFFTLLGPSGCGKTTLLRMIIGFNSIEGGE IKIDGKVINNIPTNKRNMGMVFQNYAIFPHMSVRDNVAFGLKNRKVPQDQIESQVDEILK VVKIDHLKKRMPSKLSGGQQQRVALARAIVIHPEVLLMDEPLSNLDAKLRVEMRNAIKRI QQQVGITTIYVTHDQEEALAVSDRIAVMNGGVIQQIDTPKNVYQRPSNIFVSTFIGLSNI MDGRIEARDGQTLLHIGDYSVPMENLSRSAQDGQAVKVSIRPEEFIISREDSGGIPAVVR SSVFLGITTHYFVETADGREMEVIQNSDIWDIIPDHTPIRLGVQPHKINVFTEDGSQSLI VRRDHS >gi|157101612|gb|DS480712.1| GENE 18 14496 - 16160 1922 554 aa, chain + ## HITS:1 COG:FN0377 KEGG:ns NR:ns 
## COG: FN0377 COG1178 # Protein_GI_number: 19703719 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Fusobacterium nucleatum # 7 552 6 548 550 463 50.0 1e-130 MRYAGEKKLNIWVAMALGILALFLLFVVYPLVLVLYKSVLSEDGNLSFAYFTKFFARKYY WSTLVNSFKVTIVSTVVAAVLGLAMAYVLRSVQIKGSKYLNILIVISYLSPPFIGAYAWI QLLGRNGFFTKILNSLFHIKLGGIYGFAGIVLVFSLQSFPLVYMYISGALKNLDNSLNEA AESLGCTAFQRVMQIIVPLVMPTMLASSLLVFMRVFSDFGTPMLIGEGYKTFPVLIYSQF MGEVSTDDHFAAALCVIVIGITLVLFFLQRYLGSRFTYSMTALKPMVAEKCTGTRNILSH LFVYAVVLIAILPQITVIVTSFLATGGGTVYTGGFSLENYRNTLFSKNNNTAIFNTYLFG ICAIAIVVVFGILISYLTVRKRSFLTGILDTVTMFPYIIPGSVLGISFLYAFNSKPLLLS GTAIIIIISLSIRRMPYTIRSSTAIIGQISPSVEEAAISLGCTETKSFVKVTVPMMMSGV LSGAIMSWITLISELSSSIILYTSKTQTLTVAIYAEVIRSNFGNAAAYSTILTLTSILSL LVFFKVTGSNDISI >gi|157101612|gb|DS480712.1| GENE 19 16236 - 17963 1814 575 aa, chain + ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 308 566 322 591 602 151 32.0 3e-36 MAGNQYRDYIKRSFLKYACSIISLLFVLMGLFLFINVQWISVGSNRKNNALLASALDSQV SSYKEGLEHFSTDYRFLDALKPGNADAATQVNRLLYGFTTSQPIRSTFALTDTEGHMVSS SLFEGNREIFLNSAVYKSLVEKMQEAPDLTHTMPSRLNFAHGQEGDLLMARPVADADGIC GYLFFDLMDEQIYEAVRECPLDDVVITDRYDNLVFAVGRQNADPMEKYPAGKYRMDWQKG NIVKVNNKHYHIHKEVLPESSLILYTLVSMEFQRSLVTYGILFLGLAGILMVIMIPPLTL RITKKNLQAIDELQASVEEMGRGNMDYRLKSQVFDEFRMLNDAFRNMVIQREELMLHNSE LAERKRTMEIKQLEEQFNPHFIFNVLETLRYEIAIDSQKASEMVMAFANLMRYSVYYGST IVPLQTDIEYINDYLLLQKMRYNRRLNYHIDIPEELMDLMIPKLLLQPVVENSLVHGMKD KHCVSVTITARKKDSTLVLSVEDNGSGISPEKLKELREALDQEDIYREHIGLYNSHRVVR LLYGPSYGLTIESSPGNGTRVSVTLPADMEEDTYV >gi|157101612|gb|DS480712.1| GENE 20 17956 - 18756 978 266 aa, chain + ## HITS:1 COG:SPy1062 KEGG:ns NR:ns ## COG: SPy1062 COG4753 # Protein_GI_number: 15675054 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing 
CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pyogenes M1 GAS # 1 258 1 249 262 155 35.0 1e-37 MYKVLIVEDEDIIRKGLLFMVNWQEADCVVVGEAVDGLEGLEKIRETNPDIVVVDINMPV KDGLSMLEDSIEEYGYDAVIVSGYSDFEYARKAIRLGVTEYLLKPVNFNELYDALRKIKD KQQAAARLRHELKQLDMEKRKLGILDEPETGDSPEGNRHVEYMMDCIRNHYSSHLSLTDI SEQCGMSCTYLNSKFKSFTGYTFNDYLNRYRMQKAVDLLKENKYKVYEIADLVGFSDYKY FIKVFKKYVGCSPVKFMESGGAGPTP >gi|157101612|gb|DS480712.1| GENE 21 18746 - 19516 885 256 aa, chain - ## HITS:1 COG:CAC1645 KEGG:ns NR:ns ## COG: CAC1645 COG0666 # Protein_GI_number: 15894922 # Func_class: R General function prediction only # Function: FOG: Ankyrin repeat # Organism: Clostridium acetobutylicum # 46 250 194 392 399 108 31.0 9e-24 MGLLDKLFKKGPKADEVSKGGSPIYRYEDKENEGWRPPEAYGVYAEEINAHFQGLFPNRE EFVFHELLSDLVHIDVNIMRPDETHPYYVMYTTGMSDMPMTLPEEIQDREDLRYGELYMF LPKEWNPGEAGQINSDIAQEEYWPIGLIKYLARFPHEYSTWLGWGHTIPNGPDYEPLAPD TGMGGVVLVQTGGDMGSMEAKDGRKVNFYMVIPAYREEIEYKLEYGMEALDKRFSEGNLP MVLDIHRPNLCADFKE >gi|157101612|gb|DS480712.1| GENE 22 19527 - 20672 1082 381 aa, chain - ## HITS:1 COG:AGl3214_2 KEGG:ns NR:ns ## COG: AGl3214_2 COG2199 # Protein_GI_number: 15891726 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 230 376 14 160 189 84 36.0 4e-16 MKPSIWRRVDIQVSLFTAIVVALLTFSIFWFQYRITYNDTLISLRDQAEAIYGYVEKRLD KSTFDQVRTREDMEGDVYKQAHEAFQRIREISGVRYLYTAKMNEDGEFVYLIDCLDQSEP DFRYPGDLIEHEIYPDMQRALDGQKVLPDRIKDTEWGKIFITYMPIYDGEQVLGVVGIEF EAGHQYDTYRSLRLLLPLFILLFSLGACLASRLLFRRISNPFYRDMSNTDYLTQLKNRNA YQLDIGNRIAGKAQEGTGFILLDLNGLKHVNDTLGHDAGDVYITCVSKAYLNMRTREAVM YRIGGDEFVVVMPDADRDKIQDFMRRFGDMFNREASMPEFTFSWGYSIYDPKADADLYST CRRADKNMYMRKQQYYKNKNT >gi|157101612|gb|DS480712.1| GENE 23 20756 - 21589 876 277 aa, chain - ## HITS:1 COG:BS_bltR_1 KEGG:ns NR:ns ## COG: BS_bltR_1 COG0789 # Protein_GI_number: 16079711 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 1 108 1 108 124 
101 44.0 2e-21 MEKQSELYFTTGEFARILGVRKHTLFHYDEIGLFSPALKEENGYRYYFVWQMDMFEVIRA LQKLGMSLGEIKEYMENRSPERFMSMMDGKKRQIDEEIRRLKNMKRFILHEEESVQLAMK AALDEPRLVERDREYLLVSDISAGSERKAAVEIAEHVRMQEKYHDTVGAVGSIYLGEDMD RGIYDRYVKVYTRLDKKTASRRVQSRPEGVYVELYSRGHLWDMEKPYRLISAFAGERGIR LGQMWYEDLMLDELTVKEYEQYIVKVMVPVESKVINP >gi|157101612|gb|DS480712.1| GENE 24 21706 - 23034 1461 442 aa, chain + ## HITS:1 COG:BH4045 KEGG:ns NR:ns ## COG: BH4045 COG0534 # Protein_GI_number: 15616607 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 1 442 3 447 447 348 48.0 2e-95 MSHSLSKKFTFGSLLLFALPTTIMMVVMSLYTIVDGVFVSRFVSTNALSSVNIVYPVINI VLGISVMLSTGSNAIVAKKMGEGRPDSARETFTTIILLNIIIGLVFAVLGNLAAVPLSRM LGASDLLLKDCVTYLRWQLAFAPSLMLQIQFQMFFVTEGKPGIGLFLTLLAGIANAVLDY VLIVPMGMGIAGAAIATVTGYSIPALIGLIYFAAARKSLWFVRPRIVKKELGETCLNGSS EMVTNLSSGVITFLFNLLMMHFAGEDGVAAITIIQYSQFLLNALFMGFSQGVSPVISFNY GSQNHQQLRQVFKTSLIFTAASSLAVFLMAQLGGSLVVEIFARRGTPVYELARHGFMIFA CSFLFSGFTIFASALFTALSDGRISAIISFVRTFGLIITSLLVLPFIIGMDGVWLAIPIA EFGGILLCLYFLRKYRTKYHYA >gi|157101612|gb|DS480712.1| GENE 25 23194 - 23973 1047 259 aa, chain - ## HITS:1 COG:SA0215 KEGG:ns NR:ns ## COG: SA0215 COG4753 # Protein_GI_number: 15925926 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Staphylococcus aureus N315 # 1 256 1 249 252 141 32.0 1e-33 MLRVLIAEDEEMIRKGLVYTTDWLSMDCVVVAEAADGQDGYEKILEHRPDVVITDICMPF MDGIEMIKKASEQVRFKSILLTSYADFEYARRAIEARVCEYLLKPVDEEALAGIMERLGE EIVSSRQVEHVMEQAEIEGGNLNLEYYMQLDLSENQYVSRAITAIREDFARKLSIESISD DLGVSASYLSRKFKEVTGQTFLDFLNKYRVQQAIVMLGTRQYRISEISEATGFTDYKHFC SVFKKYTSKSPTKFIKGVS >gi|157101612|gb|DS480712.1| GENE 26 23958 - 25670 1771 570 aa, chain - ## HITS:1 COG:SPy1061 KEGG:ns NR:ns ## COG: SPy1061 COG2972 # Protein_GI_number: 15675053 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: 
Streptococcus pyogenes M1 GAS # 170 531 168 523 549 146 29.0 1e-34 MKKERYFRHELERMFIAYAIVPAVIFTMVCGMLFMTVLLHGKRSGNIEHNRYVAGLLDNV LARYEGGVEEMASQPGCFDNIRDPDGKAELFGIFYGITNSLGLEGDLFVMDSSRNILLSS RSDLPQFLDVPEDVSWGIFDDMDREPHKTALRLAEGWKSGDSTIAIGRSVMDGNRKLGYI LCSIPSSQFKALVDKPDIQTIIADPFGWAYVSSNYSFLNSSNQVIKVLRESGRYLNHEKQ MYLVSRQQVPRGGFTVYSISDIQNIVISLGLGSGLVITALVLMTLWMLLNSRKVTEKKTE DFYKILDVMENARDGDLEGVIRIQSDNEFKIIADAYNETIQSLRIQMENNKKMAELVSMA QNKQLESQFNPHFLYNTLENIRYMCKLEPATAERMVFSLSNLLRYSLDGSKAEVTLKEDL EHLESYLTILKRRFGERFRCRIDMEPEAMDLRIPKLVLQPMIENAVKYGFGKQLNLTVEL KAYIHEGKLIMICRDDGVGIPQGTLSELTALLEQKENPRRHSGLYNIHRRIDILYGRPYG VEIRSAEGHGTTLVVTLPVKREESDGCCVC >gi|157101612|gb|DS480712.1| GENE 27 25667 - 26830 1068 387 aa, chain - ## HITS:1 COG:SP0243 KEGG:ns NR:ns ## COG: SP0243 COG1840 # Protein_GI_number: 15900178 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 60 373 20 334 338 192 34.0 7e-49 MRMKKSWSIYIAVIGMSLCLAGCAGSRTSAAAGHRAGADRAASTEHGAGTDSGAGTEHGA GRDHAGDEELVIYCPHPLEFINPIVAEFEGRTGIRVYVQTGGTGELLKMAEEGSDPPCDI FWGGSLSTTSPKRELFEAYISCNEDMVRDEFKNREGNMSRFTDIPSVIMVNTNLIGDVKV EGYEDLLQPELKGRIAMCDPATSSSAYEHLINMLYAMGDGEPEQGWDYVEDFCRNLDGIL LGGSGEVYQGVAQGKYAVGLTFEEAAAHYVADRGPVKLVYMKEGVISTPDSVCIVKGSRH MKEAREFVDFVTGRDAQTVISMRLDRRSVRMDVEEPSYLPDKEDIHIIYSREDEVNSRKG MWLEHFASIYSRVLEEDGGMSGEVGSQ >gi|157101612|gb|DS480712.1| GENE 28 26865 - 27941 1449 358 aa, chain - ## HITS:1 COG:APE1732 KEGG:ns NR:ns ## COG: APE1732 COG3839 # Protein_GI_number: 14601591 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Aeropyrum pernix # 7 329 3 326 358 301 48.0 2e-81 MEKQKKGVRLEHISKIYTDPKTGKEFYAVQDTTLDIEPGSFVTLLGPSGCGKTTTLRMIA GFESPDEGDIYLGDEAINALTPNKRDTAMVFQSYALLPHYNVFDNVAYGLKIRKLPKEEI HERVMNILKLVEMEGMESRMTNQLSGGQQQRVALARALVIEPSVLLFDEPLSNLDAKLRV TMRTEIRKIQQKVGITAIYVTHDQSEAMSISDKIIIMSKGKVEQIGTPREIYYHPNSRFV 
ADFIGEANFLEARVRSASGEKAVIDVVGQELTVNNFAKAGPGDDAVLVIRPEGATLAEQG LLEGTVTLSTFMGSYQYYQVMVGNMEIQITDYNPVNRRIYEVGEKAYLDFDPKGVYIL >gi|157101612|gb|DS480712.1| GENE 29 28065 - 29774 2030 569 aa, chain - ## HITS:1 COG:SMb20364 KEGG:ns NR:ns ## COG: SMb20364 COG1178 # Protein_GI_number: 16264098 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+ transport system, permease component # Organism: Sinorhizobium meliloti # 10 569 89 663 681 313 33.0 7e-85 MARKQSAELERKKFFSDPILVSMIVVLLVFLALFILYPLLMLLVDSFYSEGKITLEVFGR VLSLERFQKAFKNTLVLGMITGIASTAIGLLFAYVEVYVKLKTKVLEKLFAMVSMLPVVS PPFVLSLSMIMLFGRSGIITRSLLGIYDSNVYGLKGIAIVQTLTFFPVCYLMLKGLLKNI DPSLEEATRDMGASRWKVFTSVTFPLLLPGLGNAFLVTFIESVADFANPMLIGGSYDTLA TTIYLQVTGAYDSTGAAAMSVVLLSLTVVLFLIQKYYLEKKTAATLTGKASRMRMLIEDK SVTVPLTFFCGAVSAFVMLMYIMVPFGALFTLWGRDYSLSLKWFQYMFKTSGLKAFKDSF VLSLIAAPLTAFLSMVISYLVVKKRFKAKGFIEFVSMLAMAVPGTVLGIGYIRGYANGLF RTGVMSGLYGTGLILIIVFIVRSLPTGTRSGISALRQIDKSIEESAYDMGANSAKVFMTV TLPLIKDSFFSGLVTAFVRSITAISAIILLVTPDFLLITCQINEQAEKGNYGVACAYATV LIIITYGAVLIMNTLMKFFGVSRKIKQEN >gi|157101612|gb|DS480712.1| GENE 30 29970 - 31172 401 400 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167854980|ref|ZP_02477755.1| 50S ribosomal protein L13 [Haemophilus parasuis 29755] # 72 367 26 322 346 159 32 1e-37 MKKKLALLLACSMALSLTACGGGSKPAETTAAPETTTAAASEAAKEESKEDAAETTAAEA AGGDMDALIEAAKAEGELTVYGSCEEEYLSAACQNFEKLYGIKTKYQRLSTSEVYTKISE EAGKPSADVWFGGTTDPYNEAVADGLLMEYNAINAKNLIGDQYKDPTNCWYGIYKGILGF MVNTEELDRLGLEAPKTWDDLTKEEYKGLIMLSNPNTAGTAKLLINTMVQMKGHDEAMEY FKALDKNISQYTKSGSGPSKMVGPGECVIGIGFLHDGIYQILQGYDNIQLIVPEDGTSFE VGATAIFNGCAHPNAAKLWIEYALSPDCVDHAKENGSYQFLVLSNATQPEEATKFGLDMN NTIDYDFEDAKENSAKYVEDFFAALGSSSDSADSSRFLTE >gi|157101612|gb|DS480712.1| GENE 31 31312 - 35148 3857 1278 aa, chain - ## HITS:1 COG:CC2501_1 KEGG:ns NR:ns ## COG: CC2501_1 COG0642 # Protein_GI_number: 16126740 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Caulobacter vibrioides # 899 1182 259 538 538 194 
40.0 1e-48 MGKQPRDAENTTRAFFKEYMVNRNVEATLDWLMDDVQWIGTGKGEYSCGKEQVLEALEQE FSLDPDAYQLIWEDIRENSLTPDCAVMQGRLTVVRILPDGENFAMGVRVTSTCVRTAEGF KVSSIHASVPADFQEEGEFFPLSFAESVSEEYARRMGRSALELLGKTIPGGMLGTYFEPG FPLYYVNDRMLDCLGYTYEEFSQDSQGMVGNCIHPQDRERVYGVVRDAFGKVRDYGVRYR IRKKNGRYIHVYETGNFAQAEDGRDICLSVVRDISAEVEAEERLKQENREKERQAGRYDQ LFQSVLCGIMQWKPLDRERVVFKNANRESIRILGYEPEEFWSHGVWQWKELIAEADYQMK LDQLDNLEKAGDSMNFQYRVRQKDGTSCWILGKLEMVEDGDGELIAQSVFLDIDIRKRTE HQNQQLKELAEANSAILNMALEHTSLCEFYYYPDRRTCVIPDRTSARYQCGSRYVNMPDS FAQEMVVPGGRRDFIRMYEDIHSGKRTASAQFLTVKDSWCRVTMSAVEYGENGQNGTVLG IIEDITKEQTMALALEEAKSKDLLTGLWNREAGTRIAQEYMAQKPLGERCALMLLDMDNF TRINQEEGVAFANAVLQEVADILRAETEPDDIRVRMGGDEFMLVLKGRGKAGATVTGPRV AARVRELLLQSDKDISISVSIGMCATEVVDEYSGLYRCAESTLKYVKENCQGNAACYLDT SNELGEMLTDLYTEKHQLNTIEQGLTEQREDLVSFALDLLGKARNLNDAVQLLFARLGKT FGLDRVSLLDVDRDYLTCRFSYQWARDKTDLMMHKTFYITREQYEQWPGQFDPSGLNRHA VYDESSMSCLQAAIWNQGMFAGILSFEVKTDGYSWNDEQRKLLEELSKIIPSFVMKARAD AVSQAKTDFLSRMSHEIRTPLNAIVGMTSIARNVADDRDRVLECLDKLETSNQYLISLIN DILDMSRIESGKMELNVQPMDMEDFVRSLEGMMRPQAEQKGLRFIVENRCCQGLALVTDR LRLEQVLINIIGNAVKFTGEGGDVIFSITPEEGSSGGQRLTFSVKDTGIGIASEAMDSIF NAFEQAEKNTSVKYGGTGLGLAISSRLVQMMGGTLGVRSVLGEGSEFYFTLTLPIGKLDG QRPRSREPEHHDFHGKRLLVVEDNLLNQEIAQSLLEMEGFLVETAENGQAALDAFGNHEP GYYNAVLMDIRMPVMDGIEATRRIRTMERPDSRTIPIIAMTANAFDQDSRKSLDSGMNGH LSKPIRVEELLRMLDACL >gi|157101612|gb|DS480712.1| GENE 32 35295 - 36515 1108 406 aa, chain - ## HITS:1 COG:BH1945 KEGG:ns NR:ns ## COG: BH1945 COG0642 # Protein_GI_number: 15614508 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 158 392 223 449 462 115 34.0 2e-25 MREIGRLRLKFIIYNMLVVTAVIGITFCAVTSVVKHRVNDQGSLALSRVVSGEEQSLIFA AMPPVRMPYFTVLVGEDGVVSLKEGYGAYWEPEFLEQMAVLGMNGEEDMGILDAYHLRYL RISQPAGYMIAFADTTYEDSLRNGIMKYGGMACVAVWLGFFALSYFFSRWAVKPVEESVR MQKQFVADASHELKTPLTVITANAELLQERYAGISAEADKWMEHVNQECREMRALVESLL LLARNDSYVPGKGEFTRFSLSELVMEKILTFEPVFYQEEKVLEYDIEDDVLMMGNPCRMG 
QLIKALMDNAVKYCVPRGRAQIRLEKTGRSRARLWVCSQGEPIPEDKRTLIFRRFYRDDS ARSSTSGYGLGLAIAAETARSHRARIGVEYKDGMNCFYVTVKRKRG >gi|157101612|gb|DS480712.1| GENE 33 36515 - 37201 869 228 aa, chain - ## HITS:1 COG:CAC0653 KEGG:ns NR:ns ## COG: CAC0653 COG0745 # Protein_GI_number: 15893941 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 219 1 220 221 146 36.0 2e-35 MRVLIVEDNRRLAESIRDILKHWYDCDICGDGNTGCWLLKDGGYDAAVLDLMLPGMDGIT LLREARKMGCQVPVLILTAKSELEDRVTGLESGADYYLTKPFDMQELLAVMKTILRRRGE MIPENLNLGNVSLSQADYTLRGPHKSIQLGKKEYDIMRILMVNRDMVVSKDTLLSKVWGN DAEAVENNVEIYISFLRKKLDFLKADVMIVTIRKLGYRIVEKDGREVL >gi|157101612|gb|DS480712.1| GENE 34 37326 - 38264 988 312 aa, chain - ## HITS:1 COG:BH3213 KEGG:ns NR:ns ## COG: BH3213 COG1277 # Protein_GI_number: 15615775 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Bacillus halodurans # 6 311 33 344 345 243 44.0 3e-64 MGGMAALYRKEMSDHIRSKRFLIVLGLILLTSCASIYGALTGLSDAVASDSSYIFLKLFT LSGSSIPSFTSFIALLGPFVGLSLGFDAINSERSEGTLNRLVAQPIYRDAVINGKFLAGA TIIFLMVFSMGLIAGSVGLLATGVPPSAEEAGRVLVLLFFTSVYICFWLALSILCSVLCR HAATSAMIVIALWIFFALFMTLVVSIVANALYPMGQAASAGQILDNYSCQMSLNRLSPYY LYSEAVSTIMNPMVRSTNIILPQQLSGAISGYLSLGQSCLLVWPHLTGLLALTAVVFAAS YISFMRREIRSR >gi|157101612|gb|DS480712.1| GENE 35 38308 - 39321 1114 337 aa, chain - ## HITS:1 COG:BH3214 KEGG:ns NR:ns ## COG: BH3214 COG1131 # Protein_GI_number: 15615776 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Bacillus halodurans # 5 320 2 318 318 315 50.0 1e-85 MDYEEEMMIRTEHLSKSYGEKEAVCDLTLEIRRGEIFGLLGPNGAGKTTTTLMLLGLTEP TGGSAYIDGKDCARQAIDVKRMVGYLPDNVGFYSDMTGRENLRFTGQLNGLPSDVTDRRM EELLEKVGMTEAADQKAGTYSRGMRQRLGIADVLMKDPKVVIMDEPTLGIDPEGMRELMT 
LIRSLAEDDKRTILISSHQLYQIQQICDRVGLFVDGRLIACGRIDELAGQMRREGHYALE AAAFPDDSGFEELLRSLGNVEQVERTGDFYVVHSRSDLCRELNRRTLEKGYTLRHLRQRG NDLDEIYRRYFEKGEQKNGGLNQKKNKGFLSFGREQR >gi|157101612|gb|DS480712.1| GENE 36 39358 - 40566 1199 402 aa, chain - ## HITS:1 COG:BH3215 KEGG:ns NR:ns ## COG: BH3215 COG1470 # Protein_GI_number: 15615777 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 18 402 6 385 385 201 34.0 1e-51 MEEAMVIRRDGTGKRFGVLAVLMAAVFLVFMGTHTVYGAGGLDLNTDYPGISVKPGDSLN IPVTLENYTGASLNADLEITAMPEGWEGYLQGGSYQVSRVHVKNGEEGAQVTLHVTVPKE LTEGTYTVEVKASAGGGLTASLPVSFMVNEMNAGKGSFTSEYPQQEGTTGTSFSFSTTLI NNGLKTQSYSLSSNAPSGWGVSFTPTGETAKVAALEVESGASKGMTVSVTPPEGVAAGEY DISCSAVSAEETLDTSLKVVITGSYGLKVSTPDGRLSFDAYAGRSSDVTLNITNTGNVDL ENVTLNSSLPTGWTVTYNTENNMIPSIPAGSTTEVIAHVKPSSEAITGDYVNTFTASCEQ TQSNADFRVSVKTTTIWGVVAVVIILCTAGGLGYVFRKYGRR >gi|157101612|gb|DS480712.1| GENE 37 40821 - 42230 1244 469 aa, chain - ## HITS:1 COG:RSc2119 KEGG:ns NR:ns ## COG: RSc2119 COG0402 # Protein_GI_number: 17546838 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Ralstonia solanacearum # 5 444 12 448 474 362 44.0 1e-100 MKGTLLVKNVKHLVTCDGDDRLLDGVDVFIRDGVIAGIGQEAGTLESGTLESGTAEDVID ASNMVMYPGLINTHHHLYQTFSRNLPQVQNMELFPWLKTLYEIWKHVDEDVVCYSALTGM GELLKTGCTTCLDHHYVFPGSAGDGLLDAQFGAADALGIRFHATRGSMDLSVKDGGLPPD SVVQTVDQILKDSERAVKKFHDKSRYSMHQVALAPCSPFSVTGTLLKESAQLARELGVRL HTHLAETKDEERFTTERFGMRPLEYMESLGWMGEDVWFAHGIHFTEDELKRLAETGTGVA HCPISNMKLSSGVALVPKMLELGVPLGLAVDGSASNDGSNLLEEMRVAYLLHRLWWSRQA PSAYDILKIATRGSARVLGRDDLGQIAVGMAADFFLVDMNRMELTGAQFDPKSMLCTVGL KGSVDYTVVNGEIVVKEGRLVRVDEERTVEKANLAVKEYMGHSSCSVPL >gi|157101612|gb|DS480712.1| GENE 38 42522 - 43010 428 162 aa, chain - ## HITS:1 COG:BH0719 KEGG:ns NR:ns ## COG: BH0719 COG1309 # Protein_GI_number: 15613282 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 1 
144 23 176 188 65 27.0 3e-11 MLQTNELNEISVSDICKLAGLNRTTFYANYTDIYGLADSIRDKLEHNLSDLYQNEITRGF NSNDYLKLFRHIKDNQIFYKTYFKLGYDDSYKIITYDINLARQHFENRFIEYHMEFFKSG ITKIIKLWLQNGCKESPEDMYEIIKSEYQGRADSGKECSDTL >gi|157101612|gb|DS480712.1| GENE 39 43203 - 43574 257 123 aa, chain + ## HITS:1 COG:no KEGG:LEUM_0832 NR:ns ## KEGG: LEUM_0832 # Name: not_defined # Def: hypothetical protein # Organism: L.mesenteroides # Pathway: not_defined # 1 123 1 123 123 73 39.0 2e-12 MKKSSFAAMISGTVSGVLFALGMCMALIPEWGAFKPGLALGSAGLALALITVLIWRRMEH KEPVRFSGKVILSIVIGIAGAMAFGVGMCFSMIWGKMTAGIVIGLAGIVILLCLIPLAIG IKE >gi|157101612|gb|DS480712.1| GENE 40 43678 - 43896 309 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941707|ref|ZP_02089034.1| ## NR: gi|160941707|ref|ZP_02089034.1| hypothetical protein CLOBOL_06603 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06603 [Clostridium bolteae ATCC BAA-613] # 1 72 1 72 72 97 100.0 3e-19 MATSKLVKTNEKIAEALTEVFFNIEHGVVDRYIKIEDTFVETYLAKEGETTAEAKERLLQ ERAQRKQAQREG >gi|157101612|gb|DS480712.1| GENE 41 44337 - 44492 89 51 aa, chain + ## HITS:1 COG:no KEGG:CLL_A2163 NR:ns ## KEGG: CLL_A2163 # Name: not_defined # Def: putative acetyltransferase # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 5 51 213 259 355 75 72.0 5e-13 MADKVEDIPNLEKDLNCSYPAEPMEGLFLKMVKGQLEKTEKNPNDYLWNSF >gi|157101612|gb|DS480712.1| GENE 42 44628 - 45041 324 137 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_2644 NR:ns ## KEGG: Dhaf_2644 # Name: not_defined # Def: rifampin ADP-ribosylating transferase # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 6 136 7 137 139 189 64.0 3e-47 MSNSVIFSPGTFFHGTRADLSVGDLLVPGYLSNYDENRVANYVYFTGTLDAAIWGAELAS GDGKQRIFVVEPTGEFEDDPNVTDKKFPGNPTKSYRTTQPLKIVAEAMGWKGHSPEVLQG MRDHLKQLDEMGIKAVE >gi|157101612|gb|DS480712.1| GENE 43 45159 - 46448 654 429 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646729|ref|ZP_01719239.1| Ribosomal protein L16 [Algoriphagus sp. 
PR1] # 8 423 13 428 431 256 35 5e-67 MLSAVGFFLILLFLGMPVAFAIAVSGFSFFLMRPEIPWTILVQKSLSTTQSFTMLAIPLF IFAGNLMNNTGITSRLVKLANVLTGHMYGSIGQVSVVLSTLMGGVSGSAVADAAMECRIL GPEMTKRGYAPGWAAAINCLSGLIVATIPPSMGLILYGTVGEVSIGRLFVAGFLPGIIMC IFMMIATSWSARHYGYQPDHDKPSPPGVIIKSCFESIWALMFPILLIVLIRFGVMTPSES GAFAVFYALFVGIFIYKELNLEKFKKCIVDSVKDLSVITMILAFSGIFSYGVVFDQLPKT LTTLLVGLTSNKYVLLLLILIMLTVFGMFMETTVITLIVTPILIPIITQYGIDPVHFGII MMTIVTMGCSTPPVGVALYTCSNIMDCSVQETTKYSLPLFIAIMATLAICVFFPDLVLFL PNMIYGTSM >gi|157101612|gb|DS480712.1| GENE 44 46448 - 46930 601 160 aa, chain - ## HITS:1 COG:Cgl2288 KEGG:ns NR:ns ## COG: Cgl2288 COG3090 # Protein_GI_number: 19553538 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Corynebacterium glutamicum # 18 143 5 129 173 64 25.0 8e-11 MKKISAVILKTEELLAGAMLCMIAVLVFWSAVARTIGMPVNWAQDVSLLAFGWLTFIGSD IIIKSGGLIRIDMLSNRFPKAVQKTLMLVFDVFMLLFLLILIVYGFLLVSQSWNRTFNTL KMSYAWCTLAVPVGSLLMFFSMIGKMLGDIRKPMKEWGVN >gi|157101612|gb|DS480712.1| GENE 45 46971 - 48032 319 353 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 [Alteromonas macleodii ATCC 27126] # 51 334 28 310 327 127 28 4e-28 MKKRVLLAGVLTAAMMAVTACNGMPGMGNAQPAATQAKEGDSSEAAAGDDKVYELKISTS QTEQALITRNYQKLADVLNEKSGGRLKVSVFPSGQLGSDEDVLEQAIQGANVAVNTDASR MGQYVKDFSILMMGYFADNYEECYKITQTDTFKGWTDSLSKDHGIKILSFTFYDGPRHFM TNKPINTPEDLKGMRIRTIGQEVCTETIQAMGATPISMSWGEVYNGIQSKALEGCEAQNT STYPSRLYEVCKYQSKTGHFQLMQGLICGQSWFESLPEDLQTLLVETSIEVGQETAADVM TEADEAEKAMVEAGMTIVEPDLAPFKAAVDPVYEKLGYAELRDKLYKEIGKTN >gi|157101612|gb|DS480712.1| GENE 46 48310 - 49320 1023 336 aa, chain + ## HITS:1 COG:ECs3570 KEGG:ns NR:ns ## COG: ECs3570 COG1609 # Protein_GI_number: 15832824 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 330 2 332 337 163 31.0 4e-40 MVTIKDVAKRCGYSVCTVSRALSGKGYLKEETKQKILETVAELGYRPNTLAVNLKTGRRQ 
ALALVLPSLTNIYYTKLEQYLEACAAEKGYIIYLNNTEYSLEKEKQILENLIGMDIAGVI ITPVTSQHDHIMNLAKYDIPYVYLNRSFEDDIEHCLRLDNKKAAYEAVSYMIKLGHKNIG GIFQSFDNLTYRDRYDGMAQALKEHGLDCRPSHMLFDINPEDMGTTPGRILQLLQKPDRP EAIFACNDMTAFNIYRACYILGLRIPDHLSVFGYDDCIMANFVTPPLSTVSIPVRKLAGT AIDFIHGYLESGVRAELPILKASLIIRNSVRNLNLD >gi|157101612|gb|DS480712.1| GENE 47 49317 - 50402 956 361 aa, chain - ## HITS:1 COG:SP0736 KEGG:ns NR:ns ## COG: SP0736 COG1482 # Protein_GI_number: 15900631 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Streptococcus pneumoniae TIGR4 # 98 354 81 314 314 82 27.0 1e-15 MATGLMKLPENRVRRNYTGGAGIDRLHGKARQEDNNMPEEWVGSMVEASNPGMEPIAHEG MAVVETADGPRFLRDIIDDDREFYLGAAGREGGWRLSFLLKILDSAMRLHVQAHPSTQFA NEVMGMPYGKLECYYILHVRDGISPYIRLGFQHTPGRDGWREIVEKQDKARMDGCFEKIP VQVGEVWYIPGGMPHAIGEGITMLEIMEPSDLVVRCEFEREGIVVPEDGRFMGRGLDFCL DIFDYTEYSKEEIMEKCRIEPRVLEATDAFRRVRLVDGTLTSCFFVEKLEVNGPALVGHN RKFNLGVVCAGSCTMEESGQVIRLKAGDSFLIAAGVKAYQIRPEGSVQMVMVYPGKDMDK L >gi|157101612|gb|DS480712.1| GENE 48 50434 - 51453 539 339 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239995924|ref|ZP_04716448.1| ribosomal protein L22 [Alteromonas macleodii ATCC 27126] # 40 336 31 325 327 212 37 1e-53 MKKQWLIGAAAAALIAGLAGGYGILENRGAAGADNEPEIVLCYGEVNPEGHVLTDSAQYF ADRVSELTGGKVMVEIYPSGQLGDDARCYQSMEMGALDLYRGNSMSLVDSGNPMMSALVL PYVFRDREHFWKVCSSDLGREILDNIQDCTGMIGLAYLDEGARNFFTTDRPVCRLEDMKG LKIRVQVASMMGDTVEALGAEAVPIAYAELYTALESGTIDGAENPPVSYYYNKFYKVAPY YVKDRHTYTPGVILVSKITWKNLKKEYQDALVQAAKETQEYNRTAIEEADQKAYEALEKE GVTILEPEDPQAWSQAMEPVYRKYGGEYLDLIEKIRQIR >gi|157101612|gb|DS480712.1| GENE 49 51516 - 52745 1376 409 aa, chain - ## HITS:1 COG:FN0189 KEGG:ns NR:ns ## COG: FN0189 COG4753 # Protein_GI_number: 19703534 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Fusobacterium nucleatum # 1 129 1 129 261 92 38.0 1e-18 MLRVILADDEQYERNYLEKVIKESYPGLLEIVYKAVDGMDLMEKLEECKPQIVLLDIKMP 
RMNGLKTAEEVRKRHPDIQIVLISAYSDFSFAKQAIKLGVTDYLLKPYLDSELRETLDKV IARVREREDTLSMLSYSNYLVEDGKSAFDFYRDLEKDLLWDVFFGRRDRNKLEKEFAMWG VGPGWFKVVLISSPALISMGGFSQEVLKNYFHMAGVTVLNSIWMDQMVICLYSQQQDGFT ELTGCITRARDYLAEERKIPVACGVSGAYEGMERLTEAYGEAASFIREYSAQEARLAFDS MTEGMKRICGLEEQIIKSLSLEDQDLSCELLTEMIEELETLLGYQDVSVKLNFGRSLMTI IRGINQIPGIRVKTSEAARRFEKLGQLNFNGDNLKYHVEFFSEMKIQRE >gi|157101612|gb|DS480712.1| GENE 50 52758 - 54161 1554 467 aa, chain - ## HITS:1 COG:BH3841 KEGG:ns NR:ns ## COG: BH3841 COG2972 # Protein_GI_number: 15616403 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 13 453 17 464 481 166 26.0 7e-41 MKSIRKEMTRNHVILLLVAFIFIASNALVNVGSSRRYNGLLQDYQQVNGLLAINNRRQTY FKLYSKSHDEGMLKQYYDECELFDSQLRGLDEKMRNDRKCKMMYRIVGQVAEHRREMAES YIRPDGDYYPSLMADLDEVDLEIERCLNQLMSQYLEYLNTAFASHSRTLGLTNSILIVFF MAASLFGMVLNHQMSTSILNSIKRLTDAAREIMNNNLEAADIEETPYAELNQVSATFDQM KRQIRTMITELHETHQMKERLAEAKIRELQMQMNPHFLFNTLSLVIRSIQLGERDTSIQL VKAISKILRSSIEINTVSIPLDAEIELLQSYLYIQKLHLKGRVTFCLDVRKSFMDEDVMI PPLTIQPLVENSIQHGLKDRVNGGKVDILITEKPDYIEAVVADNGVGFPEDPSQPARRDT PIPKTSIGLKNVEERLRLFYRKEDVLHIQRTEGITKITLKLYKSDRH >gi|157101612|gb|DS480712.1| GENE 51 54303 - 55922 945 539 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941720|ref|ZP_02089047.1| ## NR: gi|160941720|ref|ZP_02089047.1| hypothetical protein CLOBOL_06616 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06616 [Clostridium bolteae ATCC BAA-613] # 1 539 19 557 557 1018 100.0 0 MKRSTISILMTAGLLAVLLSGCRLGGGTEGEAAGSGTRENVSKPVAGTAQEPVPESKVNQ ETEEDAQGKQETQGQQEPGVNQVPKSQGQVSGLPYQFASKYGIIQSEYPVYELDAVVESK LPQKEKTISLTSALHQKQELLVSMVLDDYGEVQKIPAGGEPPADSRYFTLSDGTMIVSEK YQDELWKSGEGLFLTGPGIPEGGIKPVESVYADYSAYFEEYGNIRYIIEARFELPSAPVS EELLSGYAIWLFDFEKPVEFALKSVPEYGTLEELAKEENGSIDTHDGISIISMGEKVKEG ILISWYVYSDEGRPSIIIGYKTPSLDTDTPPLDMPTISGSGRQYEIKELSANPYWDNMGH YRLSDIKQYGRRLRCLFDVPQEEQNGSFRIHIPGITFLNSEESEPVTLPVPEDYKELEET 
IPWKDGSVRILGITRMKPQTIESEDGQGNAKVTERPAVYIDVEAVHEERELALKGLLCQR KLRWGRWERERYDFDEKGVLSGFRIFYEEGDTEVTLKFQGAAFYWIQPYVMEIGADKIA >gi|157101612|gb|DS480712.1| GENE 52 56063 - 56476 189 137 aa, chain - ## HITS:1 COG:no KEGG:Clole_2554 NR:ns ## KEGG: Clole_2554 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 134 1 134 139 160 52.0 1e-38 MYERMLDKQNKPAFDDMAAYCGKGMELFIRINEWLSDTCSTVQEITFPYGNKYGWAVAHR KKKKLICNVFAENGAFTVMIRLSNAQFHLVYDQVEKETQECIDNKYPCGDGGWIHYRVAG EAHFCDIQKLLEMKCLS >gi|157101612|gb|DS480712.1| GENE 53 56765 - 58021 1227 418 aa, chain - ## HITS:1 COG:BH2877 KEGG:ns NR:ns ## COG: BH2877 COG1686 # Protein_GI_number: 15615440 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 32 279 45 284 395 184 43.0 2e-46 MGSPSQNPSASQAAGTFDQTAAASVQKPEIASEGAVLMDAATGTVLFSKNGDKQFYPASI TKLMTALLVAENCSLDDKVTFSATATTNLEAGAVSINMTAGDVMTVRQCLYALLLKSANE VGNALAEHVAGSNAKFAEMMNAKAAALGCTNTHFTNPHGLNDSNHYTTPHDMALIARAAF QNDVVKTVASTRTYTLPATIKNSSGLTVTIGHKMLNPNDARYYPGVIGGKTGYTSKAGNT LVTAVEKDGVRLIAVIMKSKSTHYTDTKALFDYGYELVKAGALSGSAGSTGSSDNAGTTG SSSGTGNAGPGTSSAKGWVQDSNGWYYVKDNGAKAANEWVTADGVSYWIDSNTYMAKGWR QVNGKWYFLRSNGAMARNQWEKVEENGQWFYLGADGAMLTNTTTPDGSRVDESGAWVQ >gi|157101612|gb|DS480712.1| GENE 54 57876 - 58427 242 183 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941723|ref|ZP_02089050.1| ## NR: gi|160941723|ref|ZP_02089050.1| hypothetical protein CLOBOL_06619 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06619 [Clostridium bolteae ATCC BAA-613] # 12 183 1 172 172 302 100.0 7e-81 MENRTVPVAASMSTAPSEAISGFCTEAAAVWSKVPAAWLADGFCDGLPNKEGLSDGEGVP DVEGLPEVDGLPDGDGLPEVDGLPDEDGLPDGDGLPDEDGPSDEMTAPSLPDSEPGPWDD TIPWSEVWFPDWKEDTWLSSKGVMAAGLAEAAILINRVQLVHIHKNDFHFLSLIISLPYS RSF >gi|157101612|gb|DS480712.1| GENE 55 58667 - 58846 158 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941727|ref|ZP_02089054.1| ## NR: gi|160941727|ref|ZP_02089054.1| hypothetical protein 
CLOBOL_06623 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06623 [Clostridium bolteae ATCC BAA-613] # 1 59 1 59 59 123 100.0 5e-27 MRCPKCSGQMEINRIRTHLVWVDELDRIPDSKNQLGPTIAYVCTECGYIEFYRDLTSVN >gi|157101612|gb|DS480712.1| GENE 56 58960 - 59994 945 344 aa, chain - ## HITS:1 COG:FN1791_1 KEGG:ns NR:ns ## COG: FN1791_1 COG0494 # Protein_GI_number: 19705096 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 7 157 2 152 158 166 52.0 5e-41 MEQKSRLTTLCYIEKAGAYLMLHRISKKNDVNKDKWIGVGGHFEGDESPEDCLLREVREE TGLTLTSWRFRGLVTFIAEGWDTEYMCLYTADKYEGEMIPCNEGTLEWVKKEDVLSLNLW EGDKIFFKLLNEDAPFFSLKLAYQGDRLSEAVLNGKQLELFDVLDADGKKTGVVRERSLV HMDGVPHRTAHIWVVRKNKDKTYDLLLQKRSRGKDSYPGCYDISSAGHVQAGDEFLPSAI RELKEELGIEAGEEDLEFAGYHKGYMEEVFYGRMFRDSEVSAVYVYSKPVDADRLTLQKE EVESVMWMGLGQCVEAVRNNGIPNCIYLDELEMIEGYLCRSQEK >gi|157101612|gb|DS480712.1| GENE 57 60116 - 61570 1842 484 aa, chain - ## HITS:1 COG:CAC2701_3 KEGG:ns NR:ns ## COG: CAC2701_3 COG0516 # Protein_GI_number: 15895958 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Clostridium acetobutylicum # 205 484 1 280 280 396 70.0 1e-110 MGTIIGEGITFDDVLLVPAFSEVIPNQVDLTTHLTKKIKLNIPLMSAGMDTVTEHRMAIA MARQGGIGIIHKNMSIEAQAEEVDRVKRSENGVITDPFFLSPEHTLKDANDLMAKFRISG VPITEGRKLVGIITNRDLKFEEDFDRPIKECMTTKNLVTAREGVTLKEAKAILAKARVEK LPIVDDDFNLKGLITIKDIEKQIKYPLSAKDAQGRLLCGAAVGITANVLERVGALVDAKV DVVVLDSAHGHSANVIRCVKMIKEAYPDLQVVAGNVATAEATRALIEAGADSVKVGIGPG SICTTRVVAGIGVPQVTAVMNCYSVAKEYGVPIIADGGIKYSGDVTKAIAAGGSVCMMGS IFAGCDESPGTFELYQGRKYKVYRGMGSIAAMENGSKDRYFQTDAKKLVPEGVEGRVAYK GMVEDTVFQLLGGLRSGMGYCGAQDITTLQETAQFVKISAASLKESHPHDIHITKEAPNY SIEE >gi|157101612|gb|DS480712.1| GENE 58 61798 - 63192 1597 464 aa, chain - ## HITS:1 COG:CAC2542 KEGG:ns NR:ns ## COG: CAC2542 COG0277 # Protein_GI_number: 15895804 # Func_class: C Energy production and 
conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Clostridium acetobutylicum # 3 463 5 465 467 623 63.0 1e-178 MGYRKLDERDIAYLKKVTEDGRVLTGSDISQDYSHDELGGVERAPEALVRVLSTGEVSDI MAYASREGIPVVTRGSGTGLVGAAVAIHGGIMLETTQMNHILELDRRNLTVTVEPGVLLM DLAQYVEENGLFYPPDPGEKSATIGGNISTNAGGMRAVKYGVTRDYVRALTVVLPTGEVV EFGGKTVKNSSGYSLLNLMIGSEGTLGVITRAVLKLVPLPGKTVSLLVPFADMEQAIEMV PVIVEAQIQATAIEFMERETILFAEEYLGKRFPDTRNDAYILLTFDGRNAEQVRAEYQEV ADLCLEHGALDVFIVDTQERKQSVWNARGAFLEAIKASTTQMDECDVVVPRDRVADFIKY THEVAGILNLRIPSFGHAGDGNLHVYLCRDDLAEDEWKEKLSLGFEMMYARAREYGGAVS GEHGIGYAKKEYLKEQLGDTQIGLMRRIKKAFDPNLILNPDKVV >gi|157101612|gb|DS480712.1| GENE 59 63192 - 64382 1080 396 aa, chain - ## HITS:1 COG:CAC2543_2 KEGG:ns NR:ns ## COG: CAC2543_2 COG2025 # Protein_GI_number: 15895805 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Clostridium acetobutylicum # 66 393 5 330 332 410 60.0 1e-114 MAELRIHQEQLDEDSIRQLIDICPFGAISCRDGKVEIDSGCRMCRLCVKKGPGGAVEYME TQEPGINKDEWRAVAVYVEYNGKAVHPVTFELIGKARELAAVTGHPVYALFMGYRVGEKA EELLAYGVDQVFVYDRKELEHFSVLTYANVFADFITRIRPSSILVGATNAGRSLAPRVAA RFRTGLTADCTVLEMKENTDLVQIRPAFGGNIMAQIVTPKNRPQFCTVRYKIFSAPPVTK DPSGTIRTMEVTDGMIDRRIRVEEIIHKPKEIDISEAEVIVAVGRGVKSQADLELIRELA AALDAQLACTRPLIECGWFDARRQIGLSGRTVKPKLIITIGISGSVQFAAGMRGAECIIA INTDRKAPVFDIANIGLVGDWYEILPRLLKTVKEGT >gi|157101612|gb|DS480712.1| GENE 60 64410 - 65198 842 262 aa, chain - ## HITS:1 COG:CAC2544 KEGG:ns NR:ns ## COG: CAC2544 COG2086 # Protein_GI_number: 15895806 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Clostridium acetobutylicum # 1 262 1 262 262 313 57.0 2e-85 MKCIVCVKQVPDTANVEVDPVTGVLKRDGTQSKLNPYDLYAMESALGLKEQLGGTVDVIT MGPGQAREALKECMCMGADRAGLISDRKFAGADVVATSYTLSQGIRRMGEYDLILCGKQT TDGDTAQVGPEVAQWLGIPHASNVLEIRETGEGRIRVRMNMDGYEQIQAMDLPCLITMDK DVNTPRLPSYKREKELREGYLREFSLTDLEDQDESHYGLSGSPTQVERIFPPEKSECREM FDGSPDVLADKMYHVLRETKFI >gi|157101612|gb|DS480712.1| GENE 61 
65293 - 66204 924 303 aa, chain - ## HITS:1 COG:PA0223 KEGG:ns NR:ns ## COG: PA0223 COG0329 # Protein_GI_number: 15595420 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Pseudomonas aeruginosa # 11 302 6 292 293 132 27.0 8e-31 MKVKDTIKKPYGICPASMTIWNADQTYNKKGMEKYLTWLLDNGAQSISICGSTGENIAMN MEEQREIIEHVSGFLGGQVPLVCGTGRYDTLNTIKMSKFAQDHGADCVMVILPFYLNPHK KAVMQHFRDLREALDIEIMVYNNPWFAGYELSVQEVKELVDDGTIQCIKAAHGDPNRVHE LKYHCKDDLTVFYGHDYAAMEGLLAGADGWLSGFPAVLPKQCRRLQDACFAKDVDAAIAA QNNIQPYIDYFFYDKVNGVPHWQEICKYTLQAQGLDVGLPRKPLGELDDANKKKIEKLLA DME >gi|157101612|gb|DS480712.1| GENE 62 66420 - 67130 198 236 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 2 234 5 233 305 80 28 4e-14 MLKVNGIDAYYGDVCALHDISLEIGDDEIISIIGSNGAGKTTLMNCIVGWVKVKKGNVEF NGQDITNLPTHSITRTGVVQIPEGREIFSNMTVLENLEMGSYSIKRSKKEMNAKIEEMYD LFPRLKERMGQKAGSLSGGEQQMLAIARGLMGDPKLLMCDEPSLGLAPVIVDDMFDIFLR INKEKHLPILIVEQNAFMALEVSSRCYVLENGRMVITASSEELAKSDTIKKSYLGG >gi|157101612|gb|DS480712.1| GENE 63 67145 - 67927 248 260 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 5 239 9 227 309 100 27 6e-20 MESLLKVTGITKKFQGLVAVDNVSIDMKEHTIHALIGPNGAGKTTTVNLITGVLPLTSGK VEFGGKDITGMTTHDIARQGIGRTFQNIKVFPTMTLLENVMVGGHQQAPMGMMRGIFDIR GSRAEEKRLKEKAEEVLKRIGMYELRGSLMKNLPYGRQKISEIGRAMMNDPKLILLDEPA AGLNPSERAELVDVIYKAYEDGYSFFMIEHNMDVVMKISDYITVLSFGRMIAEGTPKEVQ SNEEVIRAYLGSRYVEQDSR >gi|157101612|gb|DS480712.1| GENE 64 67932 - 68918 1209 328 aa, chain - ## HITS:1 COG:FN1430 KEGG:ns NR:ns ## COG: FN1430 COG4177 # Protein_GI_number: 19704762 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 39 302 2 257 285 138 35.0 2e-32 
MKKKLKLKSLHFAVGALVIGAFLPIFIKNDYYMMVLNRVLINIIVVLGLNFITGLTGQMN LGTAGIFALGAYSSALFTRYTNLTPWLGLIAAVIMGILIGRGLGYPSLRVRGVYLSLTTI GFGEVVRLLISNTPKFTGGTQGIRNIRPYNIGSYQIQSQKEMYYLFLVFTAIAFFIAWRI AYSKWGRIFKSLRDNVEAVEMSGVDIASCKIKAFTVASIFGTVAGAMYAHYMGYINPSTF NLDLSTNYVVMLMVGGLGSVVGTIFGSAIVTILPEMLRFLGNYYQIVFCSIILLGAIFFP DGWVSAATGLFMKMYQRISGKSYAEGGE >gi|157101612|gb|DS480712.1| GENE 65 68934 - 69800 1096 288 aa, chain - ## HITS:1 COG:AGc4402 KEGG:ns NR:ns ## COG: AGc4402 COG0559 # Protein_GI_number: 15889697 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 286 1 294 300 122 30.0 6e-28 MGTILQLLISGIAMGFIYALVAIEYTLIWNSTGLLNFSHDKIITVGAYIFAATFIIKFNM NKPLAIVLTFMLMGILGSVIASGIFNPLRHMRSNIFAVMGTMMLGRILAEAIRLIYGPVP FTIENWLSGTIKIGTLMVTKANVIICVVAILVVAALQLFMNFTKIGKAMRCVAQNKKAAA LMGINVAQNICVTTAVSSIICSVIGMLIIPLFNISTTMAASIGLKGFSAGVIGGFGYLPG AILGGLFIGVVENLAVVVLPAVYKDIVSFVLLIIFLLVHPSGFLGKKA >gi|157101612|gb|DS480712.1| GENE 66 69881 - 71086 1499 401 aa, chain - ## HITS:1 COG:FN1432 KEGG:ns NR:ns ## COG: FN1432 COG0683 # Protein_GI_number: 19704764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Fusobacterium nucleatum # 1 389 1 372 383 146 31.0 9e-35 MLKKWAAVTLSVSLALSLAACGGQQGASSGSAGSQAGDTQASSGGEVSADEGDVIKIGLI SMMTGDNPLNGERMNQGVQMAVDEYNAAGGINGKQIELTIVDDQTLQDMAVTCANKLIGE GVVGIVGPHRSTNALAVEAVVKQAGVAAFTGGTTPQISTLDNDYLFRCRASDSIFAEAAA AYAKELGCKKLGLFFNNDDFGTGAEGVIKQYCADNGMEYVEEGHNTGDKDFTPQVMKMKD AGVDTVIIWTHDAELAIHARQIYELGLDVQVVSSPGVTMQQVIDMCKAEYIEGWYGVTDY VSTSDDPTLVAFKENFEKKYEIEPELYAASYYGGAVALLEGIKAAGTTDRKAVMEAVKGL KDLKVPVGVITCDENNDLVHNINVARIENKIPKFIKAVSVE >gi|157101612|gb|DS480712.1| GENE 67 71126 - 72205 1150 359 aa, chain - ## HITS:1 COG:SP0253 KEGG:ns NR:ns ## COG: SP0253 COG0371 # Protein_GI_number: 15900188 # Func_class: C Energy production and 
conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Streptococcus pneumoniae TIGR4 # 13 341 9 331 362 117 26.0 3e-26 MFEGKHIQVGAGRYIQCHGALKRAGEEMVFFGKKAYLLYGDDVVKGKSSQVLEESLRKSG ITFQSEIFEGPSTEKNFGEVAARVRESGAEIVAGIGGGRIIDIAKAAGDMADASIFTIPT SAATCAAYAVLYVVYGEDGNVDHSGFLNHEISGVIVDMDLVVNDCPRRYFVSGIVDAMAK KPEFSFTMAHLKDEGMIATSEIATKIADFTYSKYMRDTRQALKDFDDGKDSMLIDDMVNM NIMLTGMVSDLSTGGKQLAVAHNFYDAVCCMYKDVRRKYLHGEIVAMALPLQLAVNGSPE AEIEELKAFLREVGIPVSIREAGVPQDGLEELITYVHRVTIPDDMELLEQIRDGFKYIM >gi|157101612|gb|DS480712.1| GENE 68 72390 - 72554 105 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQSSKTQQTKKQQTKTQQTKKQQTKTRQSKTQQTKTRSAPGRSSLKNSNLYAVI >gi|157101612|gb|DS480712.1| GENE 69 72573 - 73220 676 215 aa, chain - ## HITS:1 COG:SMc02036 KEGG:ns NR:ns ## COG: SMc02036 COG1802 # Protein_GI_number: 15966301 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 14 204 17 204 235 65 28.0 6e-11 MRVKKNLDKIIYDQIIDSFILGEYKMDDAISLDVLCEKYEVSRTPVVQAVKLLVNDGVLE KLSNGRVVVPTFDQNQIKNICDVRLMIEKHAIEHYLSAAADVSSLLDQLEICAKKCEEYL DKGDFLELSKTDLRFHRVLVSGARNEYLSELYKRIQGQFLVANYLVLPLRNRNFRSTVDD HYKLLEYLRNKDMEQSEKMIAGHINNIFYIITNQK >gi|157101612|gb|DS480712.1| GENE 70 73506 - 73751 335 81 aa, chain + ## HITS:1 COG:no KEGG:Closa_1693 NR:ns ## KEGG: Closa_1693 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 79 1 80 95 97 63.0 1e-19 MKLENISDVKAFFEAVDQCKGKIELVSPEGDRINLKSKLSQYLSIANMCSNGYIRELELV VHEKEDMDRLIEFMLSGENLK >gi|157101612|gb|DS480712.1| GENE 71 73844 - 74341 617 165 aa, chain - ## HITS:1 COG:no KEGG:Closa_1008 NR:ns ## KEGG: Closa_1008 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 164 2 166 169 149 44.0 5e-35 MNDDNYVEWLVKRKDPAYAVPVKILMIVVCIFSVLSALQTVFGVIVMTIAGVATYFVFLN LSVEFEYLFAEGGLSVDRILGKAKRKKIFDCEKDDIQIVAPADSFVLKDYEKQGMKIKDF SSGRKDAKVYALIYQKGPDHFKVLIEPNERMIGAMRRTFPRKLVM >gi|157101612|gb|DS480712.1| GENE 72 
74388 - 74840 318 150 aa, chain - ## HITS:1 COG:CAC2616 KEGG:ns NR:ns ## COG: CAC2616 COG1321 # Protein_GI_number: 15895874 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 5 139 3 137 157 87 36.0 8e-18 MVTDKDGFYTMKGYERSASDEMTSSMEDYLEMVCRMEEEGEPIRVSSLAASLHVRPSSAS KMLDNLRKAGYIDFKKYGSIMVTDKGYEEGRYLLHRHRVLQDFFCTLNHTEDELELVEKI EHFINHRTVENMEQALSFLRHKNDMTGKMG >gi|157101612|gb|DS480712.1| GENE 73 74915 - 75196 274 93 aa, chain + ## HITS:1 COG:CAC0447 KEGG:ns NR:ns ## COG: CAC0447 COG1918 # Protein_GI_number: 15893738 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Clostridium acetobutylicum # 5 76 1 72 75 58 37.0 2e-09 MKEFLCLNDIKPGNRARVKELTSTGSIRRRLLDIGLVENTEVECLGQSPLGDPCAYLIRG AVIAIRSEDCRGILVQPCTGFSEQEVMEGCTSL >gi|157101612|gb|DS480712.1| GENE 74 75290 - 76774 1504 494 aa, chain + ## HITS:1 COG:CAC1031 KEGG:ns NR:ns ## COG: CAC1031 COG0370 # Protein_GI_number: 15894318 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Clostridium acetobutylicum # 28 473 7 445 683 255 33.0 1e-67 MGLTKQSVGMGAVDSGLEIKKQTPDDKIIAIAGNPNVGKSTLFNNLTGMNQHTGNWPGKT VTNAQGYCKTKEHSYVLVDIPGTYSLMAHSAEEEVARNFICFGGSDGVVVVCDATCLERN LNLVLQTMEISDRVLVCVNLMDEAARKNITIDLKGLSEKLGVPVAGTIARKKRSLDQLMK QLDDLVEGTQTASPYQVEYSPIIEQAIAMAEPAVKARVEGKVSSRWLTLKLLDSDPSLMK ELREYLNEDILEVPEISLALSQAREHLAKYGITNEILKDRIVAALVASAEKICRDTVQFH KNGYNEGDRKLDKILTSRFTGYPVMIGMLAVVFWLTITGANYPSQMLADALFRLQDQLTK AFMAAGAPQWLHGLLILGVYRVLAWVVSVMLPPMAIFFPLFTLLEDSGYLPRIAYNLDKP FKSCHACGKQALTMCMGFGCNAAGIVGCRIIDSPRERLIAMITNNFVPCNGRFPPPYKGK QKCSSALTMKKQPN >gi|157101612|gb|DS480712.1| GENE 75 76735 - 76980 111 81 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFLGADNEETAELILPIYGFEDWALKLKAYRKKNHITQQELAQLMDVKHLTLRSWEQKQA KPPYNVWRLHKHLFDDSIKLT >gi|157101612|gb|DS480712.1| GENE 76 76951 - 77160 133 69 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|160941748|ref|ZP_02089075.1| ## NR: gi|160941748|ref|ZP_02089075.1| hypothetical protein CLOBOL_06644 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06644 [Clostridium bolteae ATCC BAA-613] # 17 69 1 53 53 80 100.0 4e-14 MKRNCVQNVIVHIPDNMDLHALSDKINEFHLQVVERRLNSSNLTTDDKIAVIDKILDNLK SRELDGIIK >gi|157101612|gb|DS480712.1| GENE 77 77475 - 77870 220 131 aa, chain - ## HITS:1 COG:lin0887 KEGG:ns NR:ns ## COG: lin0887 COG2337 # Protein_GI_number: 16799960 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Listeria innocua # 1 109 2 110 115 139 61.0 2e-33 MLCRGDFYYADLSPVVGCEQGGVRPVLIVQNNVGNRYSPTVIVAAVTSRMEKHPLPTHVF INRNSVLQKDSIIMLEQIRTIDKSRLLEYIGHLDCRDMRAVDVSLRISLDMGTDTSGTFW EERQKQGLVLT >gi|157101612|gb|DS480712.1| GENE 78 78834 - 79094 187 86 aa, chain + ## HITS:1 COG:no KEGG:TDE0291 NR:ns ## KEGG: TDE0291 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 1 86 3 88 88 105 56.0 8e-22 MPQIIPIKDLKNTSDISDLCHTSQEPIFITKNGYGDMVIMSMETYENTVWKSSLYRGLEV SEEQIHSGRTKDARASLDELKDKYGL >gi|157101612|gb|DS480712.1| GENE 79 79084 - 79401 105 105 aa, chain + ## HITS:1 COG:no KEGG:Dtox_0629 NR:ns ## KEGG: Dtox_0629 # Name: not_defined # Def: plasmid stabilization system # Organism: D.acetoxidans # Pathway: not_defined # 3 101 2 98 102 69 38.0 4e-11 MDYKLHITEQAVEQLDWIISYLVSNLKNPTAARKLLSEIEIIYSYLESDPHIYPQCDDPF LRTKGYHKATVNHFQYVILFLIDDKSRTVYISGLFHEKENYGQKL >gi|157101612|gb|DS480712.1| GENE 80 79509 - 80012 356 167 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2435 NR:ns ## KEGG: Dhaf_2435 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 32 167 3 139 139 81 34.0 1e-14 MKDRRYYIRKNGILSEVTYDEYQKYVSHKKLEGNVYFVCIQNSLMEVTEDMYMKHYRVIR RKKYLTREVPYEILHYQSLDTDEMLGEELLYDDSGLSIEDKAIHNIYTDKLYPALVKLTD EERDLLYRLYYQGVSERVLAKQLGISQAAVHKRKEKILKQLRIWMKL >gi|157101612|gb|DS480712.1| GENE 81 80102 - 80299 63 65 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|160941754|ref|ZP_02089081.1| ## NR: gi|160941754|ref|ZP_02089081.1| hypothetical protein CLOBOL_06650 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06650 [Clostridium bolteae ATCC BAA-613] # 1 65 1 65 65 102 100.0 1e-20 MVSAVILKGIYRTETTIFQFNAQVIGITEIHIEEEQVDGLLKVHSVRMRFTMIRDGLHIS QVIVG >gi|157101612|gb|DS480712.1| GENE 82 80343 - 80447 103 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLMTGNCLSNRILQLLDEVWKTTGHVSVINKENG >gi|157101612|gb|DS480712.1| GENE 83 80544 - 81212 435 222 aa, chain - ## HITS:1 COG:no KEGG:LDBND_1390 NR:ns ## KEGG: LDBND_1390 # Name: not_defined # Def: transposase dde domain # Organism: L.delbrueckii_bulgaricus_ND02 # Pathway: not_defined # 1 221 209 432 463 253 54.0 3e-66 MLELIASAQAVGLSAKYVLFDSWFSSPKTLISLKEKCHMDTIALIKKNKTKYLYEGQNLN IKQIYSKNRKRRGRSKYLLSVNVTLKKDDEALPARIVCVRNKSNRKDWLALISTDTNLSE EELIRVYGKRWDIEVFFKSCKSYLKLVKECRSISYDALNAHVAIVFTRYMLLSVAKRRNE DDKTICELFYCLMDELEDITFSQSMQIILEALLDTVMEFFHI >gi|157101612|gb|DS480712.1| GENE 84 81200 - 81832 153 210 aa, chain - ## HITS:1 COG:no KEGG:LDBND_0402 NR:ns ## KEGG: LDBND_0402 # Name: not_defined # Def: transposase dde domain # Organism: L.delbrueckii_bulgaricus_ND02 # Pathway: not_defined # 1 185 1 186 463 182 47.0 6e-45 MSSIPYNRHKENEIFDFVSKFINEFQIGKLLFQCNAGKEKGIPVMTIFRYLLCLLFSDRS HYMQRKTKTFEEGFSKNTLYRFLNSVKTNWQRFTVLLASKIINDFMKTLTDEKRKDVFII DDSLFERSRSKKTEFLANVFDHCSMKYKKGYRMFTLGWSDGNSFVPVTYCLLLAAENKNL ICPAKNSMAVPLLENAAANCAGKRLMSCWN >gi|157101612|gb|DS480712.1| GENE 85 82314 - 82778 149 154 aa, chain - ## HITS:1 COG:MA2681 KEGG:ns NR:ns ## COG: MA2681 COG3344 # Protein_GI_number: 20091502 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Methanosarcina acetivorans str.C2A # 1 153 83 235 266 150 49.0 7e-37 MIEDFLKPRRLTLSKEKTVITHIDEGFDFLRWIFRKFKGKLIIMPSHKSMCKITQKVSEL IKNSQTIKQEILISRVNEIIRGWANYHHATWAKKSFSTIDHRIWKMLWRWAKRRHPNKGK QWIVNKYRKTHKGRKWTYMTEKNILFLMMDMPIV >gi|157101612|gb|DS480712.1| GENE 86 82859 - 83194 
251 111 aa, chain - ## HITS:1 COG:BH0039 KEGG:ns NR:ns ## COG: BH0039 COG3344 # Protein_GI_number: 15612602 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Bacillus halodurans # 4 95 1 92 418 107 57.0 6e-24 MSELLEKILDNRNMNEAYEKVCANKGAGGVDGMEIKELDGYIRENWDSIREQIRKREYHP QPVRRVEIPKPNGSRRKLGIPTVMDRVIQQEIAGVGETGIAVCEVCGLYRS >gi|157101612|gb|DS480712.1| GENE 87 84092 - 84886 236 264 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 16 249 1 229 245 95 27 2e-18 MIDFLQIGVPALNKLMIQMEGIVKRFGSITALDNMNFHAYGGEITAIVGDNGSGKSTLIQ ILTGCLKPDQGVIQIKDRVFHSLTAKEAMEQGITAVYQDLALDNCKDCAGNLFLGRELLH FGIFLDHRAMYQETEKLLRQFHIRIPDIRQPVEKMSGGQRQAVAIARAVYQNGDIMIFDE PTSAMGMKETERIMNLFRTLKKQGKTIILISHNLFQVYDISDRICVIKNGRQVDFFLTRD SSPEQLYQSIIEREEQADDEKMDE >gi|157101612|gb|DS480712.1| GENE 88 84864 - 85820 824 318 aa, chain + ## HITS:1 COG:YPO2499 KEGG:ns NR:ns ## COG: YPO2499 COG1172 # Protein_GI_number: 16122720 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Yersinia pestis # 19 309 38 328 330 174 36.0 1e-43 MMKKWMSKYSQQLFLSAVLLIICAVLAGKSEYFFTWKNIRNILEAASYRMILAVGMTFVI ASGVMDLSAGSIVSLCGVLTALALHGGIEIWAAVLTGIAAGGLMGALNGALIHCTGINFF VLTLASGGMLRGISLMITDGKPITKFGKAFMQLGIGRIGEFQFPVLFALTLILLSSPLMF HVKWGNYVQALGGSETALKRSGVRTGFYRISVHVFLGITAAVTAVIITARLNSAEPNAGM NMELDAITAVIMGGTPIRGGNASIAGTVLAVLILGIIRNGLTLLSVSSDFQQWVTGFLLL ASVLISEIRVWNTRALIK >gi|157101612|gb|DS480712.1| GENE 89 85864 - 87000 1109 378 aa, chain + ## HITS:1 COG:PM1325 KEGG:ns NR:ns ## COG: PM1325 COG1879 # Protein_GI_number: 15603190 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 79 331 19 271 314 133 34.0 6e-31 MRKFITGLLAAGCIAAVLTGCSSSGGAKETSAPETSAAAAKETESEPAKEAVSATVAETE 
SADKAGLLLADVEKRMNEALGELPKSGQGEKIGVLISSTSNEFWGTMKTRYEEAAEDLGI EIRVFEADAEDDTQGQLDALNTMVTMGFDAIILSPIDGTNLIPGIVAANEAEIPVINLGP GVDAEALADAGGHLDGKITVNFEEQGSTVANDMISRMEDGGKVAILAGLEGAGQSVGRTN GAKTVFENTEGVELVAAQACDWDTEKAYEATKDILTAHPDLKGIFACNDNMALAAVQALQ EMGNKDVMVYGVDYTTDAKAAIEDGTMMGSMTYSSAIYTKAAEEMAMLIVQGKTFKDPVY LPLTLVTQDNVADFEGWK >gi|157101612|gb|DS480712.1| GENE 90 87003 - 87809 406 268 aa, chain + ## HITS:1 COG:mll0028 KEGG:ns NR:ns ## COG: mll0028 COG2313 # Protein_GI_number: 13470351 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Uncharacterized enzyme involved in pigment biosynthesis # Organism: Mesorhizobium loti # 5 267 29 304 305 69 26.0 6e-12 MKYLIETALLTHGLRSLTNNYLLENWTLPEALIAWIYKGEIKIGTITKYLSFRTEAQSAI RIDCSILENALKNGVSGALTASGTMAVCKKLGIPAAVTCGMGGIGEIKGEELCPDLPALA EIPIILISTGPKDMLERKATVSWLTSHGVQVIGANRGFCTGYLFHGEPVKLQGVLRSAGC RTKAGMPEGELCPPLLIINEIPEKLRVSDRQILTEAIIEGKRAEQEGRYYHPAANGKIDE MTNGYSSEIQLASLLDNARLAGQIESSR >gi|157101612|gb|DS480712.1| GENE 91 88033 - 88152 117 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941767|ref|ZP_02089094.1| ## NR: gi|160941767|ref|ZP_02089094.1| hypothetical protein CLOBOL_06663 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06663 [Clostridium bolteae ATCC BAA-613] # 1 39 1 39 39 69 100.0 6e-11 MEQYDKNVTAILEFLVAKNFSASVMRVYYLINGLKGIPL >gi|157101612|gb|DS480712.1| GENE 92 88462 - 88941 224 159 aa, chain - ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 47 144 203 298 450 75 37.0 4e-14 MALRSAGPPLPTRTSTAHRTISSRCMITSTGNVQVLNEEGRKAQSQSFMWLFRSGEDGYP ALILYGYSPTRSGSQAKEFLEGYSGYLETDGYQGYNSLPGIKRCSCWAHIRRYFIDAVPK GKQYDYSQPAVQGVQYCSRLFVIEDSINKKYPGDYEKRK >gi|157101612|gb|DS480712.1| GENE 93 88961 - 89620 166 219 aa, chain - ## HITS:1 COG:Z1162 KEGG:ns NR:ns ## COG: Z1162 COG3436 # Protein_GI_number: 
15800683 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli O157:H7 EDL933 # 1 219 6 201 285 63 27.0 4e-10 MASSAKNIQLRELKDTILQLNTMVSEQTELIKSLRLMIDEKSSREKALQEQVDYLTQKLF GPSSERRADDIPGQQNLFDEAEIEQDPSLLEEETVIRENTRKKKAAHDDLFKGLRVEKVV IPLPEEEQVCPVCGTQMVLIGEEYVRRELEFVPATCKVIEYYSQSYGCPSCKEGHGDTEK PVIIKSQVPATMIGKGPASASTVAWTMYQKYANGLPLYR >gi|157101612|gb|DS480712.1| GENE 94 89707 - 90048 214 113 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP3-BS71] # 1 104 1 103 107 87 39 5e-16 MLGDISELKKIYIVCGYTDMRKSIDGLCAIIEDQLKMDPLSSALFLFYGRRRDRIKALFR EADGFVLIYKRLAIHGGYQWPRKQSEVRNLSWREFDWLMSGIDIDQPKALKAE >gi|157101612|gb|DS480712.1| GENE 95 90128 - 90382 58 84 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|325262201|ref|ZP_08128939.1| ## NR: gi|325262201|ref|ZP_08128939.1| hypothetical protein HMPREF0240_01185 [Clostridium sp. D5] hypothetical protein HMPREF0240_01185 [Clostridium sp. 
D5] # 1 38 1 38 136 75 94.0 1e-12 MSVTRKTRVLMPEQIRRINECRQSGMTDADWCRENGIATVILRIRVRNKTSFPLTLYRIL FRNNMSHQRCRKRTLTIHIRSKLL >gi|157101612|gb|DS480712.1| GENE 96 90442 - 90585 90 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941773|ref|ZP_02089100.1| ## NR: gi|160941773|ref|ZP_02089100.1| hypothetical protein CLOBOL_06669 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06669 [Clostridium bolteae ATCC BAA-613] # 1 47 50 96 96 99 100.0 1e-19 MEAEVSQAERAYQFIIDWIAANENLFDPMYSNKVSAKYSTGFHSPPV >gi|157101612|gb|DS480712.1| GENE 97 90752 - 91735 531 327 aa, chain - ## HITS:1 COG:SA1828 KEGG:ns NR:ns ## COG: SA1828 COG5519 # Protein_GI_number: 15927596 # Func_class: L Replication, recombination and repair # Function: Superfamily II helicase and inactivated derivatives # Organism: Staphylococcus aureus N315 # 18 281 40 305 569 76 25.0 7e-14 MDAPIQLRCEDWICNKEGVYKWVQGKKDTDPPVLIDVSYQQILPVGLTKNIETGEQKYDI AFSVRRNGKFMWKDIKVEPAECCSKTKIVTLANLGVVVNDQKAKNLVNYISDMYRINEDS LPVTKAVSHFAWIGKQFFPYVKDIMFDGDNAQAKTVQAVGPRGSFDTWQKGCMEYRKNLF VQMLMDASLASILIKKINCLCFVLHLWGAFGTGKTVAFMVAASIWGIPDELILSVDSTIN YCTSRAVLMKSLPVFVDETQLSRGNLEKLIYAMTEGKEKGRLLRNSSERLIIRYLQILHM SWRLSGNIMDMLERSLCGMFKGLRTQN >gi|157101612|gb|DS480712.1| GENE 98 92087 - 92395 161 102 aa, chain - ## HITS:1 COG:no KEGG:ELI_3221 NR:ns ## KEGG: ELI_3221 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 11 102 8 96 893 93 45.0 2e-18 MLKDYSLRQYALAYAKVGMAVFPLVPKSKNPATQHGFQDATTDFNQIDKWWMKNPNYNIG IATGQVSGGLIVIDLDIDKEKGKHGNETLRDWEAEQGQLPDT >gi|157101612|gb|DS480712.1| GENE 99 92693 - 93046 58 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941778|ref|ZP_02089105.1| ## NR: gi|160941778|ref|ZP_02089105.1| hypothetical protein CLOBOL_06674 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06674 [Clostridium bolteae ATCC BAA-613] # 1 117 1 117 117 202 100.0 7e-51 METRILIVSKDSYLRNQLYQILYTASYDCYLASTSLAASVICKMFTPDVIMIDIILPFRD 
IAELLKMVPADCSCNVIAIKPNLFPQPLIDYDIKYNLSQPLDYSSTMNLFSTLPNQP >gi|157101612|gb|DS480712.1| GENE 100 93667 - 94065 424 132 aa, chain - ## HITS:1 COG:no KEGG:USA300HOU_0862 NR:ns ## KEGG: USA300HOU_0862 # Name: not_defined # Def: hypothetical protein # Organism: S.aureus_USA300_TCH1516 # Pathway: not_defined # 1 111 1 110 114 97 38.0 2e-19 MNKEEIKKIVVDYINKNESASYAELQWLFEKKGYDYKGKLLSCSDVCEHVVFWSGWNAEA FDLMTELLHEGVAYREPAHPLRYLMDGAALTLPKVQRAVQYKTDHWAPVVFVKGLDPELM TSQVKIKAEGTE >gi|157101612|gb|DS480712.1| GENE 101 94134 - 94286 88 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941783|ref|ZP_02089110.1| ## NR: gi|160941783|ref|ZP_02089110.1| hypothetical protein CLOBOL_06679 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06679 [Clostridium bolteae ATCC BAA-613] # 1 50 1 50 50 90 100.0 4e-17 MKDITVKHITPEELPSILCFQLSYKHHTELGKDTSEEMKLLNQLKNEWGV >gi|157101612|gb|DS480712.1| GENE 102 94297 - 94575 177 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941784|ref|ZP_02089111.1| ## NR: gi|160941784|ref|ZP_02089111.1| hypothetical protein CLOBOL_06680 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06680 [Clostridium bolteae ATCC BAA-613] # 1 92 1 92 92 174 100.0 2e-42 MLESYDKAMEEIDDLKGLLKTVLENTGKVVPVQQKWDIEHATGKDDMVKENLINVIVVFL NGRYVNAVWNVYYFILGMRQVRGRGTSKGMRG >gi|157101612|gb|DS480712.1| GENE 103 94961 - 95215 290 84 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148996730|ref|ZP_01824448.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP11-BS70] # 1 77 1 77 77 116 67 8e-25 IIEAQCMPDHIHMLVSIPPKYSVSEVVGYLKGKSSLMIFEKHANLKYKYGNRHFWCRGYY VDTVGKNTAAIKAYIQNQIKEDLE >gi|157101612|gb|DS480712.1| GENE 104 95849 - 96058 167 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRILYELSWLWETLVIALAVAAVCVVCAMLICKAVKKNLSKKTAAVIGAAAFAGAVLAVI VIARTPMPL >gi|157101612|gb|DS480712.1| GENE 105 96138 - 96938 565 266 aa, chain - ## HITS:1 COG:no KEGG:ELI_1005 NR:ns ## KEGG: ELI_1005 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 266 
1 266 266 427 87.0 1e-118 MALTIQERLKDLRVERGLTLEQLAEQTHLSKSALGSYEAEDFKDISHYALIKLAKFYSVT ADYLLGLSETKNHPNADLADLRVSDDMIELLKSGLVDNFLLCELAVHPDFPRLMADLEIY VNGTAVKQVQSANAIVDIMSATIMKQHNPGLSDPQLRQLIAAHIDDDSFCRYVIQQDING IALDLREAHKDDFFSVPEDNPLEDFLQAAEAASTPGSDPEQAAMAFICKRLKLNYGKLSE EERKWLKRIAQKSDLLKNPKPQRGRK >gi|157101612|gb|DS480712.1| GENE 106 96952 - 97413 76 153 aa, chain - ## HITS:1 COG:no KEGG:Cthe_0528 NR:ns ## KEGG: Cthe_0528 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 26 150 1 124 131 119 53.0 4e-26 MYNLNWSRSDLTRQKLGTFCEYYAKMSLASYGVSIYTSEVDDHGIDFIAESKRGFLKFQV KAIRKGTGYVFMREEYFDISDQSLYLFLLLLNDGEHPIEYLIPATTWDNDSSNTFVYHSY EGKKSKPEYGLNISAKNIPQLERFKLENMITAI >gi|157101612|gb|DS480712.1| GENE 107 97779 - 98192 202 137 aa, chain + ## HITS:1 COG:no KEGG:ELI_1003 NR:ns ## KEGG: ELI_1003 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 137 56 192 192 234 89.0 1e-60 MRDNPYKDLPPLERRPDGSLYRMTPAQRKQANALIRRECCCYEDGNCMFLDDGDTCTCPQ TVSFSVCCKWFRWAVLPLDKTLEAEIFRDKDLKRCAVCGGVFVPKSNRAKYCPDCAARVH RRQKTESERKRRSTVDS >gi|157101612|gb|DS480712.1| GENE 108 98304 - 98699 240 131 aa, chain + ## HITS:1 COG:no KEGG:ELI_1002 NR:ns ## KEGG: ELI_1002 # Name: not_defined # Def: phage protein # Organism: E.limosum # Pathway: not_defined # 1 131 1 131 131 261 98.0 5e-69 MANTIYIHQPEKAVSFTRLPNFLFEAPTFTPLSNEAKVLYAFILRRTDLSRKNGWADEYG RIYLYYPINEVVELLHCGRQKAVNTLRELQYAGLVEIQKQGCGKPNRIYPKSYEAVSNTD FKKSGYGTPED >gi|157101612|gb|DS480712.1| GENE 109 98887 - 99831 680 314 aa, chain + ## HITS:1 COG:no KEGG:ELI_1000 NR:ns ## KEGG: ELI_1000 # Name: not_defined # Def: plasmid recombination protein # Organism: E.limosum # Pathway: not_defined # 1 314 1 313 313 531 92.0 1e-149 MAQHAILRFEKHKGNPARPLEAHHERQKEQYASNPDIDTSRSKYNFHIVKPESRYYHFIQ NRIEQAGCRTRKDSTRFVDTLITASPEFFKKKPPKEIQEFFHRAADFLIGRVGRENIVSA VVHMDEKTPHLHLVFVPLTEDNRLCAKEIIGNRANLSKWQDDFHAYMVEKYPDLERGESA SKTGRKHIPTRLFKQAVNLSKQARAIEATLDGINPLNAGKKKEEALSMLKKWFPQMGNFS 
GQLKKYKVTINDLLAENEKLEARAKASEKGKMKDAMERAKLKSELDNLQRLVDRIPPDIL AELKRQQRQHGKER >gi|157101612|gb|DS480712.1| GENE 110 99875 - 100030 167 51 aa, chain + ## HITS:1 COG:no KEGG:Tresu_1927 NR:ns ## KEGG: Tresu_1927 # Name: not_defined # Def: hypothetical protein # Organism: T.succinifaciens # Pathway: not_defined # 1 50 20 69 70 68 94.0 6e-11 MTFQGREITLENLSPVFTPEQEAAKRRELEQQLYEVFRKYADKRQSEEAGA >gi|157101612|gb|DS480712.1| GENE 111 100125 - 101636 1250 503 aa, chain + ## HITS:1 COG:CAC1956 KEGG:ns NR:ns ## COG: CAC1956 COG1961 # Protein_GI_number: 15895229 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Clostridium acetobutylicum # 7 451 4 471 531 192 30.0 1e-48 MNNRIDAIYARQSVDKKDSISIESQIEFCKYELKGGNCKEYTDKGYSGKNTDRPKFQELV RDIKRGLIAKVVVYKLDRISRSILDFANMMELFQQYNVEFVSSTEKFDTSTPMGRAMLNI CIVFAQLERETIQKRVTDAYYSRSQRGFKMGGKAPYGFHTEPIKMDGINTKKLVVNPDEA ANIRLMFEMYAQPTTSYGDITRYFAEQGILFNGKELIRPTLAQMLRNPVYVQADLDVYEF FKSQGTVIVNDAADFTGMNGCYLYQGRDVKPSKKNDLKDQMLVLAPHEGIVPSDIWLTCR KKLMNNMKIQSARKATHTWLAGKIKCGNCGYALMSIFNPSGKQYLRCTKRLDNKSCPGCG KIITSELEAVVYQQMIKKLASYKTLTGRKKAAKANPKIAALQVELLHVDSEIEKLVDSLT GANNVLLSYVNVKIAELDGRKQELVKQIAELTVEAISPEQVNQISGYLDTWDNVSFDDKR RVVDLMITTVAATSDSLNITWKI >gi|157101612|gb|DS480712.1| GENE 112 101871 - 102140 63 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941799|ref|ZP_02089126.1| ## NR: gi|160941799|ref|ZP_02089126.1| hypothetical protein CLOBOL_06695 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06695 [Clostridium bolteae ATCC BAA-613] # 1 89 1 89 89 153 100.0 4e-36 MKKRVLVVGATKETTFSVHRKCLLYILFYLLASNPWHMFFREQFFEQFLDGEYISYDNSL YSYLPDKGGQLCRGKNKNKSKQANRASQS >gi|157101612|gb|DS480712.1| GENE 113 102308 - 102424 144 38 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941800|ref|ZP_02089127.1| ## NR: gi|160941800|ref|ZP_02089127.1| hypothetical protein CLOBOL_06696 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06696 [Clostridium bolteae ATCC BAA-613] 
# 1 38 1 38 38 65 100.0 1e-09 MKINKRRKELREDTGTDFQTEPLYLAGATEVLAGQGKY >gi|157101612|gb|DS480712.1| GENE 114 102537 - 102680 70 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941801|ref|ZP_02089128.1| ## NR: gi|160941801|ref|ZP_02089128.1| hypothetical protein CLOBOL_06697 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06697 [Clostridium bolteae ATCC BAA-613] # 1 47 1 47 47 85 100.0 2e-15 MGKDRKGYSKTDLEATFMRMKEKHMFNCQLVMVEGTAAYPDGEQRTG >gi|157101612|gb|DS480712.1| GENE 115 102694 - 102912 61 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941802|ref|ZP_02089129.1| ## NR: gi|160941802|ref|ZP_02089129.1| hypothetical protein CLOBOL_06698 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06698 [Clostridium bolteae ATCC BAA-613] # 1 72 1 72 72 129 100.0 7e-29 MAELINWFSNHTLIEIDRCSPLELLKLQENLKRTTERDGIRFVYGKGKQKTEIQKLYDEL ELCGSRLMEYKE >gi|157101612|gb|DS480712.1| GENE 116 102939 - 103028 61 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MICSFLVPGNGKTHSEESAIRDIIKYANK >gi|157101612|gb|DS480712.1| GENE 117 103067 - 104422 1327 451 aa, chain - ## HITS:1 COG:TM0595 KEGG:ns NR:ns ## COG: TM0595 COG1653 # Protein_GI_number: 15643361 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Thermotoga maritima # 49 447 18 416 419 210 32.0 5e-54 MRKMMLGFMAAAMVCSMAACGSKQETAAPAGTPAQTQSEAQKQENAGDSQVEITWWHAMS GVNEEAIQKIADDFMAGRPDVKVTLVNQGGYLDLFDKLMASAKADQLPTMAQVYSNRLSW YISKDLVEDLTPYMQAEGAGFIQEDYDDIPKMFFDDGIWDGKQYAMPFNKSMMVLYYNEG MLEEAGVEVPTTWEEWEAANKKLTKDTDGDGEPDVYGTVFANNLSTDIAPWLKQCGGRTM NEETNELYFDTPETKEAIQFLNSMFQAKTVRFAGEDKNANVPVQQGRAAMCVASTSALPY IENDTLEGIMINAAALPAYKTNDQLYYGTNITVFNTGSKEQKQAAWDFIKFLTNTENTAY FAAQTGYIPVRRSAQHAPVFAAVLAEKPIKQLCFDNMDNGFQGTRNIGGINALDALGEQL DLVFSGEKDLDSALKDAQAIGEKAMEEARNN >gi|157101612|gb|DS480712.1| GENE 118 104496 - 105323 670 275 aa, chain - ## HITS:1 COG:TM0598 KEGG:ns NR:ns ## COG: TM0598 COG0395 # Protein_GI_number: 15643364 # Func_class: G 
Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Thermotoga maritima # 54 274 57 277 277 228 52.0 7e-60 MGLKKRAARAGGYCILLAGAAFVLLPFVWMLLTSLKPSREVLMMPPKCFPSVIMWENYVD AYRAAPFAVYFFNSAIVSLAITAGELVTTILAAYAFSQMDFKGRDALFMLLVATMMVPGE ILIIPNYVTLADLGWINTYKALILPWCASVFSIFLLKQQFSSIPAAYYKAARIDGCRRFR YLITVMVPMSKPTIVAIALLKLINSWNSYMWPLIVTNTNEMRTLPVALAAFSTEAGVQYN TLMAFSLMIISPILLVYFFARKYIIRGVSSAGLKG >gi|157101612|gb|DS480712.1| GENE 119 105314 - 106246 678 310 aa, chain - ## HITS:1 COG:TM0596 KEGG:ns NR:ns ## COG: TM0596 COG1175 # Protein_GI_number: 15643362 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Thermotoga maritima # 68 305 175 414 420 203 45.0 4e-52 MGELGNKKDTGLKRGQKTRTWGEMSAWLYLAPALAILLTFQVYPIYKSFMMGFYTRFDYL TDTVYEWGLGNFRYILCDENFRIAMKNTCVLVFFAAPLSIGSGLLFALLLNSNIRFKPFF QSVYFLPFVTSMVAVSIVWSWMLNKDYGLVNGVIRFFGGKKIAWMTDRHMTMPILVALCI WKGLGYRIIIFLAALQGIDKRYYSAARVDGARTMNRFWHVTIPMLKPILIFLSITTVIGC FKTFDEVYVMYNHKPGPSRSGLTIVYYIFNKFYEHWEFSIAAAAAFFLFLLIFAITLLQF LLSRRRKSWD >gi|157101612|gb|DS480712.1| GENE 120 106227 - 107123 641 298 aa, chain - ## HITS:1 COG:PAB2232 KEGG:ns NR:ns ## COG: PAB2232 COG3839 # Protein_GI_number: 14520406 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Pyrococcus abyssi # 1 289 70 354 362 247 45.0 2e-65 MEAEDRDIGMVFQNYALYPHMTVGRNIEFPLRMRHIAKAQREERVRAVAGMMQIEDLLNR RPSQLSGGQQQRVAIARALVKRPKILLLDEPFSNLDARLRIELRDEIRSLQKQLGITTIF VTHDQEEAMSISDRILLLKEGVMQQYSTPKDMYTHPANRFTAGFLGNPPINFLTCRMVGG KLMVQDCGHGSMELPVKKCVWEQTGLPEEQADLEVGVRPEDIYVVASREHSDFRGRVIAV QTLGKEIYLKVETWGHRLTACVSWDHNYGIGDVINLAVKKIYLFPEKGETECNGRAGK >gi|157101612|gb|DS480712.1| GENE 121 107143 - 107325 187 60 aa, chain - ## HITS:1 COG:PAB2336 KEGG:ns NR:ns ## COG: PAB2336 COG3839 # Protein_GI_number: 14520241 # Func_class: G Carbohydrate transport and metabolism # Function: 
ABC-type sugar transport systems, ATPase components # Organism: Pyrococcus abyssi # 4 60 6 62 364 78 63.0 4e-15 MIILKDITKKFRNTTAVKGLSVEIREGELVCLLGPSGCGKSTTLSMIAGLEPPTKGDIFF >gi|157101612|gb|DS480712.1| GENE 122 107346 - 107906 388 186 aa, chain - ## HITS:1 COG:MA1976 KEGG:ns NR:ns ## COG: MA1976 COG5418 # Protein_GI_number: 20090824 # Func_class: S Function unknown # Function: Predicted secreted protein # Organism: Methanosarcina acetivorans str.C2A # 2 169 47 189 193 73 30.0 2e-13 MQKILFVSHCILNTAAKVVRYGDTGKKEEESRLEFVVKTVEQGIQLVQLPCPEFTLYGPG RWGHTKEQFDNPFFREHCRKILEPILTQMKAYMAPGEQGRFSVLGVVGIDGSPSCGVSRT CSGCWGGEFSGRTDLQEVLSTCHSAPGTGVMMEVLKSMMEEEGIQLPVEGLKPGDLQAVM SLVAGT >gi|157101612|gb|DS480712.1| GENE 123 108048 - 109028 671 326 aa, chain - ## HITS:1 COG:BH3645 KEGG:ns NR:ns ## COG: BH3645 COG4608 # Protein_GI_number: 15616207 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Bacillus halodurans # 1 318 1 319 322 329 50.0 5e-90 MKTPLLEMKNLQKRYPVYGPLGKLFPPRTYMRAVSNMCLALYEGETYGLVGESGCGKSTT GRSILGLVKPDSGEILYKGQDLTRLSDSEFRPLRRDLQMVFQDTLSSLNPRSRVGALLEE PLIVQGMKDGEERRQKVMEILGMVGLSDDYYFRYPHELSGGQVQRLGIARALIIDPKLII CDEPVSALDVSIQSQILNMLTSLQREMNLSLLFISHDIGVVRYISSRIGVMYLGTLVEEA GTDELFSHPLHPYTQALFASIPDFNKKRTVSLKGELPMYTDEFKGCVFHTRCPYTCDQCR RSAPELEEIYPGHRVACHRMGGIAMK >gi|157101612|gb|DS480712.1| GENE 124 109030 - 110067 773 345 aa, chain - ## HITS:1 COG:BH0028 KEGG:ns NR:ns ## COG: BH0028 COG0444 # Protein_GI_number: 15612591 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Bacillus halodurans # 13 342 4 332 347 327 50.0 2e-89 MKTDEMQNIIQDKMTDEAGTLLEIRNLKVRFTARKKVLTAVDGVSLSIRPGEIVGVVGES GSGKSVMSQSILRLREYDSDVNYEGQVLFEGKDLLKVSQAEMRGVRGNRIAVIFQDPMTS LSPVHTVGRQLSEVLILHKKMTKQQAEERCGELLHLTGIPNPQDCMTKYPYELSGGMQQR 
VMIAMALSCEPKLLIADEPTTALDVTIQEQILNLIAELNERMGMSVLFITHDLGAMAQIC HSVRVMYLGNLVEEAQAEALFARPLHPYTQGLLDCIPRLDSDRSQPLHVISGVVPPLSNV PEGCRFCTRCPYADQRCMLQNPPSVEVYPGHKAKCWKYVTEQEVM >gi|157101612|gb|DS480712.1| GENE 125 110067 - 110966 854 299 aa, chain - ## HITS:1 COG:FN0398 KEGG:ns NR:ns ## COG: FN0398 COG1173 # Protein_GI_number: 19703740 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 64 292 57 286 289 209 47.0 7e-54 MKISNTATDCQTSEVILREQRQICRERFFSNPGLIIGLTVFILIALAAIIIPAAVNVDPN EMAIADRLKGPSLNHPFGTDEFGRDLMIRVLYGARVSLLVGGAVALLSCVLGTVTGIYAS YFKILDHILMRICEGLIAIPGVLLAIALMAALGASNLNVIIALTIVYTPSVARIVRSSAL VTRGLPYVEAAKVQGAGSGRILWKLILPSVISPLTVQASFIFAQAIISEASLSFLGAGIP APAASWGNILQGSKQVLQKAPWTVLFPGLAVVLCVLSLNLLGDGLRDYLDPRTAGRKKG >gi|157101612|gb|DS480712.1| GENE 126 110981 - 111931 774 316 aa, chain - ## HITS:1 COG:AGl3101 KEGG:ns NR:ns ## COG: AGl3101 COG0601 # Protein_GI_number: 15891665 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 312 1 312 313 241 45.0 2e-63 MLSYIIKRILSLIPTLFVVSVVVFLVVYMIPGGPATALLGMEATPEAIADLNAELGFDRP FFVQYADWFINMLHGDWGQSYFLQTTVLDAIGEYFAPTFSLAVFAQLISLVIAVPLGILA AYRRGTCVDMITVSASLLGIAIPGFLLSMFLMLFFGVHLKWLPVAGYVQMSQGILEHLRY LLLPALSLGVVQAAYITRMSRSAMLDVLYMNYIRTARAKGLKESAVVLVYGLKNAGPTIL TVVGQSFGSLVTGTIVTETLFNIPGLGMLTMTSITRRDVFVIQGVVLFVTFIYVLVNLAV DILYGFVDPRIQLGNK >gi|157101612|gb|DS480712.1| GENE 127 112018 - 113652 1536 544 aa, chain - ## HITS:1 COG:AGl3099 KEGG:ns NR:ns ## COG: AGl3099 COG0747 # Protein_GI_number: 15891664 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 
65 542 38 515 517 244 32.0 5e-64 MKHRKWFRLLALALSAVMLTACGQSGGAKEEATSAVAGDTAKGTGNTQGTETAQGTGETE YKTELNVAITANPPTLDVHGVNSNIVGGIGTHIYEPLFAMDANYEPAGVLADSYTVSEDG KVYTIKLRQGVKFHNGQEMTADDVVASMSRWLELSGKAKALLEGTVFEKADDYTITMTLP EAYADAMTVVAASIQFPAVYPKSVIDSAGTEGITEYIGTGPYKLDEWKQDQYIHLVKYED YAQPAGVSSGFTGEKLAATPDIYFRVVTDDSTRVAGVQTGQYDIAEEMPQERYQELSGDS SLKFYVKTAGTINLFFNTTKGIMTNEQMRQAIMACLNCDDIMLASYGDPNLYVLNPGWCN PDDAMWGSDAGSEYYNQNNIEKAKELLTEAGYNNEEIVLVTTPDYGEMYNATLVVQAELQ SAGINAVVESYDFATFMEHRANPDQFSLYITSNRYNLNPVQLSVLTKDWAGLDRPEVDQG ISAIRMAASAEESSEKWDDLQEFLYEYGAASVLGHYSGLMATGAGVEGFNFFDFPLYWNV KVPV >gi|157101612|gb|DS480712.1| GENE 128 114115 - 115149 269 344 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225084369|ref|YP_002657150.1| ribosomal protein S16 [gamma proteobacterium NOR51-B] # 48 341 24 307 309 108 29 2e-22 MPYRLSAEDKNMITLEHICKTYRVARRNSGLSAAFASLFHREYETIHALEDISFHIREGE IVGYIGPNGAGKSSTIKIMSGILVPDSGQCHINGMNPWKQRKEYVKDIGVVFGQRSQLWW DVPVADSFELLKDIYRLSDQSYHRNLSYLIQLLDIGTIIRTPVRQLSLGQRMRCELAASL LHSPRILFLDEPTIGLDAVSKVAVRNIIREINRDQGTTIILTTHDTQDIEALTNRILLIG KGRLLLDGDLQMLKKHYSSAKHISIDYTPAPLCGRASDLLDILENGLSVIKDIPGHLEVL ADTTTVSVSQVISILNSRLEINDLSITGTTMDEMVVSLYQEYSI >gi|157101612|gb|DS480712.1| GENE 129 115146 - 115949 493 267 aa, chain + ## HITS:1 COG:DR0203 KEGG:ns NR:ns ## COG: DR0203 COG4587 # Protein_GI_number: 15805239 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Deinococcus radiodurans # 4 267 19 275 275 91 27.0 1e-18 MKKYASFFKIRFIQGLQYRTAAYAGILTQFMWGAMEILLFSAFYRSEPEAFPMGFSQLAS YYWLQQAFLALFMTWLMEEEIFQAISNGTVAYELCRPVDIYSMWFVKSLANRASKAVLRC MPILLTALLLPKPYGLALPPDVKTAVLFLLSMILGFLVVVAFSMLIYIATFFTLSPLGIR VISLSLVEFMSGGIIPLPFLPGRIRGIMELLPFASMQNAPYRIYSANIAGAAAVSTLSLQ LFWVIALTVLGHYLMSLALKHAVIQGG >gi|157101612|gb|DS480712.1| GENE 130 115952 - 116731 348 259 aa, chain + ## HITS:1 COG:DR0204 KEGG:ns NR:ns ## COG: DR0204 COG3694 # Protein_GI_number: 15805240 # Func_class: R General function prediction only # 
Function: ABC-type uncharacterized transport system, permease component # Organism: Deinococcus radiodurans # 2 259 13 277 277 115 33.0 8e-26 MLRLYCRYFSIQLKSALEYKVSFLLTALGQFFVSFAGFLSVFFMFQQFHTVDGFTYQQVL LCFSVVLSAFSIAELFAAGFKGFSHVVSNGEFDRIMLRPAGTLFQTFASKIEFSSVGRLL QAVVILAYAIWSSGIRWTIQGVLLLLFMILCGIFLFMGLYIIYATLCFFTIEGLEFMNIF TDGGREFGRYPAGIYGSSILKFLTFIVPLACFQYYPLLILLGKSSRKLYAAAPIVCLIFF ALCCIFWRAGVSRYKSAGS >gi|157101612|gb|DS480712.1| GENE 131 116958 - 117215 102 85 aa, chain - ## HITS:1 COG:no KEGG:Ccel_2737 NR:ns ## KEGG: Ccel_2737 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 81 1 81 86 116 72.0 3e-25 MGYFRKLYQSGLPHRAISVYMYLKDRTDKNGTCYPSIGTISKELVLSRNTVKRGIDDLVK AGFIQKETRLRDNGGRSSNLYRLLK >gi|157101612|gb|DS480712.1| GENE 132 117229 - 119034 1076 601 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 11 514 75 546 591 110 25.0 8e-24 MDGSGVTLLIVIGFFMFAVISGIAMLSCHYTLDGIKSKTVGDGQHGTARFATKQEIRRTF THVEYLPEAWRRGSRLPEAQGLVVGGVFKGKRVTALVDTDDVHTLMIGSAGVGKTAYFLY PNIEYACASGMSFLATDTKGDLFRNYAGIAKKYYGYEAAVIDLRNPTRSDGNNLLHLVNQ YMDQYQKDTSNLAAKAKAEKYAKITSKTIINSSGGDAASYGQNAYFYDAAEGLLTAVILL IAEYCPPEKRHIISVFKLIQDLLAPSGVKGKNQFQLLMDKLPPDHKAKWFAGSALNTGDQ AMASVMSTALSRLNAFLDSELEQILCFDTVIDAERFCNTKSAIFVVLPEEDATKYFMASL IVQQLYRELLAVADEQGGKLKNRVMFYLDEIGTIPKIESLEMAFSAVRSRRCSIVAIIQS FAQLQKNYGKEGSEIITDNCQDGIFGGFAPNSESAEVLSKHMGNRTVLSGYVSRGRNDPS QSLQMIARPLMTPDEIKALKKGTFIVTKTGAHPMQTTLKLFLKWGITFEEPYELQERAAR KVSYADKLELEMEIMNRQLAVNLSAMEVMSEIETRCSERMLVKKEHGRPGTIGGKLPVRV D >gi|157101612|gb|DS480712.1| GENE 133 119039 - 120502 596 487 aa, chain - ## HITS:1 COG:no KEGG:DSY4633 NR:ns ## KEGG: DSY4633 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 454 1 451 477 370 44.0 1e-101 
MEYVIFTEEEKERAGAANLVEFLLRQGEELEPSGSEWRWKRHDSVTVRNNTWYRHSCNYG GSTIQFLQEFYDMTYVEAMKCLLEGNYQPVIRETGDGSTCLKAKSRKEFQLPEPYKNEKR VLAYLSRTRFVDYEILMFFIERGAIYEEMRRHNAVFVGFDNQGKARHAHMKGAYTNGESF RLNVEGSDPAYGFGYCGEGPRLYVFEAAIDLLSFLTLYPLEWKKQSYICLDGLSEHAMLQ MLHSHPWLQEVILCMDHDPAGIEGCYRLKEILEGEGYKNVSRLSSRNKDWNEDLKEMHGI HPAPAREHPGLLTCMKLCLWLKNQEDKLCTDEEALKQIQDSFHHLSMAMERQVTYGEDHG AEMLMCLEDMSASGFHLLLENHRNEEPSCTKEQLANQLFKSYQPHRDRGRMRVKSEELKM TVESICPLDMQKCTKEDMETGTDSFRTFIMACMKCYIQISKITKEQQQESDRNVTVTCVE GIQEQEV >gi|157101612|gb|DS480712.1| GENE 134 120515 - 122470 866 651 aa, chain - ## HITS:1 COG:no KEGG:HM1_0619 NR:ns ## KEGG: HM1_0619 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 11 648 1 582 594 482 43.0 1e-134 MGYFWNRGDWMPRIILKCRYLKHEKAHLAHLVRYIATREGVEMARDTHDHLPATARQKQL IDELLRALPDTQESYEYRDYQANPTMGNASAFIAISIEQNMDLVAKKENYVDYIASRPGV EKRGTHGLFTDAGVPVILSRVQEEVARHEGNVWTFILSIRREDAVRLGYDNVKRWEALLR GKRMQMAEAMKMSPENLVWYAAFHQAGHHPHVHMIAYSRDPGKGFLTEKGIEQMRAMYAK EIFHQDMYEMYQNQTVQRNELVRASADPLMKLFGQDQGRWEESSKLEQLMTQLAADLKQT KGRKVYGYLPPRVKQQVDRIVNVLSTNPAIAECYQKWYESRLEILHIYSDQTPEPPPLSK QKEFKQIKNMIIREAMEWEAYGECISEQPMDDVSLMIAEDMEIMDGTEFTDDVVADDDMS CIVEFIEDGYAKWTDEYRQTRNILYGTDGEKPDVGTANQMIYSEAEQGNTFAMCDLRQML HYGRGCEPDPEVSQAWYNRAFRVFLNAETVKETAYLEYRLGKLYYDDLYMEKNLGASVYW LNLGAGHGNTYAQYLLGKLYLFEPTVRDDESGIFWLQNCADQGNPYAVYLLEHKDEWGRI RLRSGVIRLFHHLSALFNEQENEDHNKMHEWIDRKRGRKLKEKKIAQGIRG >gi|157101612|gb|DS480712.1| GENE 135 122485 - 122892 297 135 aa, chain - ## HITS:1 COG:no KEGG:DSY4635 NR:ns ## KEGG: DSY4635 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 135 1 135 135 149 51.0 4e-35 MEDDTKKRIPVWLYPKTLKRMDECLDLGNCKSRSEYIENAVSFYNGYLLSKESSPYLPVI LTSAMSGIVEASENRTSRLLFKLAVELSMLMNLYAAQSDVGEEILEKLRGKCIQDVKHTN GSVNLESVMAYQKGK >gi|157101612|gb|DS480712.1| GENE 136 123046 - 123255 251 69 aa, chain - ## HITS:1 COG:asl4361 KEGG:ns NR:ns ## COG: asl4361 COG1598 # Protein_GI_number: 17231853 # Func_class: S 
Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 3 69 4 70 70 78 59.0 3e-15 MHRYERIVFWSGEDQKWIVDVPELSGCMADGATPVEALENVETIIDEWIETARAIGREIP EPKGRLLYA >gi|157101612|gb|DS480712.1| GENE 137 123435 - 123719 257 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941824|ref|ZP_02089151.1| ## NR: gi|160941824|ref|ZP_02089151.1| hypothetical protein CLOBOL_06720 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06720 [Clostridium bolteae ATCC BAA-613] # 1 94 1 94 94 155 100.0 8e-37 MKQKTGKTKDEILAFRVEILDSQWQENKTPEERIIQQKVNERLDGWLKQISGEQRDGIED CIDDMLEENWECQLYLYEEGVKDGIRLMKMIYGL >gi|157101612|gb|DS480712.1| GENE 138 124213 - 124497 187 94 aa, chain - ## HITS:1 COG:no KEGG:Dtox_1510 NR:ns ## KEGG: Dtox_1510 # Name: not_defined # Def: hypothetical protein # Organism: D.acetoxidans # Pathway: not_defined # 1 60 1 60 85 63 60.0 3e-09 MKKAIIQLKYDEEKLSALEKYMIKKEVDLEVELLHTLQKLYEKYVPPAVREYIESRETCE GRTGVRSQERRNTANRANEGILETGEGTWSTENS >gi|157101612|gb|DS480712.1| GENE 139 124533 - 125516 442 327 aa, chain - ## HITS:1 COG:no KEGG:Closa_0805 NR:ns ## KEGG: Closa_0805 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 301 1 308 321 135 29.0 3e-30 MVFQEFISTVAKRMEERYKEMGQDYSVTIKNRERVNRENSSSIVVKKPGQLQQPGIVMNS FYADFSEGKKKMDEIIDQLMKTVEEENLGIQLNGSYFTKDRILKNIRMELINADRNRELL SHVPHRMVADLAVIYKYIIQDTGEGIIGATLTNHTIAGMGIHEDELFQMAMDNRRENAPY VIGNLEDVIHGLNESLGMEETPEEDGLDSFLDCSVLTSQDGLFGASCILYPEAVKELAAA KGGNFFLLPSSIHEWLVIPDDGIYDGLEDIVRFVNQNEVGEEEVLSDHVYYYDVQKQELS IHQAGEEIEMDADRNGGPDVGRMDFLS >gi|157101612|gb|DS480712.1| GENE 140 125727 - 125945 250 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941830|ref|ZP_02089157.1| ## NR: gi|160941830|ref|ZP_02089157.1| hypothetical protein CLOBOL_06726 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06726 [Clostridium bolteae ATCC BAA-613] # 1 72 3 74 74 109 100.0 7e-23 MAKKNITTGFPEEKLNVLLYFLNREQQNLEPLLEEQLDRLYNRAVPKQTRDFIQFQMTGE LATETEENQEAG 
>gi|157101612|gb|DS480712.1| GENE 141 125967 - 127124 726 385 aa, chain - ## HITS:1 COG:Cgl2139 KEGG:ns NR:ns ## COG: Cgl2139 COG0791 # Protein_GI_number: 19553389 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Corynebacterium glutamicum # 239 341 227 317 339 67 43.0 5e-11 MAAPVLAAIAKASILAAADESHRKRMATIFGLILSPVLVVILLIMGMLSGMTEHNQNAVR TVFENGKLPEKTPEEYRAFIVGMQNSFNRIGNVIDEVTGMIEEGEIDGMMVKSIFYSLFY DYEGLPVSLSVIQEFVDCFVKYEERTREETDDDGNTTEETYLVAVPLTSQTEIYSNIRSR LQAVNDIDQANAIEIYYRMVYGVGAPTEGDGFVEWSEWNGTLSPEAYEAVCSDLPEGEAG SLAVQLAMTRLGDPYSQAKRGQGDYTDCSYLTLWAYRQLGVYLPGTAAEQAQYCVERGLV VTRASLVPGDLIFWSYKPNGRYMNITHVGMYAGNGMVVDASSSRGKVVYRNLFDADKQVL YGRPALLSTAGKSEQENQRNEHKSI >gi|157101612|gb|DS480712.1| GENE 142 127146 - 128651 499 501 aa, chain - ## HITS:1 COG:no KEGG:Ccel_2752 NR:ns ## KEGG: Ccel_2752 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 333 3 342 342 451 65.0 1e-125 MREQRFGIEIELTGLTRRAAADIIGDYMGTTPVYIGGFYYVYEIPDREGRQWRVVLDNSI KVECKSGIANDEYKVEVVSPICRYPDIPDIQEIVRQLRHGGAIANKSCGIHIHVDATPHN ARTLRNITNIMASKEDLIYKALQVEVARKHQYCRPVDETFLDEMNRKKPRNMDEVSLIWY RGRSRRSKHYDKTRYHCLNLHSVFQKGTIEFRLFNSTTHAGKVKAYIHLCLAISYQALIQ KCASRRKTTSTNEKYTFRTWLLRLGLIGEEFANTRKHLLEHLDGCIAWKDPAQAERQKER LRARREMEENANQEGKENLEGRDVSNPMTKTGMEEKMEEGKLYVAYGSNLSLTQMRRRCP TARVVGLAELMDYELLFRGRRENAFATIEPKQGSCVPVMIWKIQGDDELALDRYEGYPHL YEKKQVEVSMGREQMEAMVYVMTPGQSFGRPSSRYLNIIREGYRNAGFAPVVLEAAVNDS VRLAMQEDAQQDGMTGMDSLR >gi|157101612|gb|DS480712.1| GENE 143 128684 - 130447 1264 587 aa, chain - ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 45 532 324 798 853 126 24.0 1e-28 MVSPSIIQFEIDHYVCGNTYRCVWALREYPTSTEEQAILRNLGEKDGVTLRIYTRQVSPA 
EERKIISNAANKNRMRQGSTENLQDTITAGNNLQDVADIVSQMHRNKEPLLHTAVYIELT AADPEQLKLLQTEVLTELVRSKLNVDRLLIRQKQGFLSVMPSGWNVFREQYERVLPASSV ANLYPFHYSGKTDPNGFYLGRDKFGSNIIVDFNRRAEDKTNANILILGNSGQGKSYLLKL ILCIMREAGMDVICLDPEGEYQDLAENLGGCYIDLMTGEYKINLLQIRCWDENGSPDDTG APMAFRQTSRLSQHISFLKDFFRSYKDFTDQHIDAIEIMVGRLYERFGFTDRTDFKRLNP TDYPILSDLYQLIEGEYQGFDDTRRQLFTAELLQEILLGLHSMCRGAEAPFFDGHTNITD NHFIVFGVKGLLQASKNIRNAMLFNILSYMSDALLTNGRTAASIDEFYLFLSNLTAVEYV RNFMKRVRKKDSAVILSSQNLEDYNLPGIAEYTKPLFAIPTHSFLFNAGAVDAGFYMDTL QLEKSEYNLIQYPQRGVCLYKCGNERYNLMVQAPPYKARLFGKAGGR >gi|157101612|gb|DS480712.1| GENE 144 130527 - 131126 474 199 aa, chain - ## HITS:1 COG:no KEGG:HM1_0602 NR:ns ## KEGG: HM1_0602 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 5 198 3 196 196 256 68.0 3e-67 MKKEKKDKKTGQKKEGTRQLIGIQEITDYSLRTMEHGELVFFLIHPSNLSVLSESSLAAR IYGLMTVLKGMAELEMLCLNSKENFEDNKKFIQDRVEEEENPAIRKLLEADLKFLDQIQV QMATAREFILVVRLPGEEEKEVSPYLKRIEKTLNDQGFTTRRANGDDIRRVLAVYFEQNV TTERFEESDGERWVIFHDH >gi|157101612|gb|DS480712.1| GENE 145 131126 - 131386 301 86 aa, chain - ## HITS:1 COG:no KEGG:Closa_3123 NR:ns ## KEGG: Closa_3123 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 85 2 84 92 105 61.0 6e-22 MTYLYPDHLRARATLWLWDLRDIGFIGIGAVFSIACFSVSGILVPMVLTTVYGFLSIQFD GTRIRDFILYACAFFVTQQQTYEWRR >gi|157101612|gb|DS480712.1| GENE 146 131409 - 131582 159 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941836|ref|ZP_02089163.1| ## NR: gi|160941836|ref|ZP_02089163.1| hypothetical protein CLOBOL_06732 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06732 [Clostridium bolteae ATCC BAA-613] # 1 57 67 123 123 115 100.0 1e-24 MEVCDCVWICGEHISAGMKQELAFAKGIGKEVRFVGRNEITASVNERQEMLIRMEIG >gi|157101612|gb|DS480712.1| GENE 147 131777 - 132679 680 300 aa, chain - ## HITS:1 COG:no KEGG:DSY4551 NR:ns ## KEGG: DSY4551 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 295 15 309 315 375 
66.0 1e-102 MFIWDFVVDTVFEDIIEWFYGQLIGFLSSFFGMMGNMGAEIFDLVWVQGIVHFFVYLAWA LYVTGLVVAVFECAIEYQTGRGSVKDTALNAIKGFMAVGLFSSVPVELYRLAITLQVQVS AAISGQGVGVTELAAGIMDNLSSAGSLKEADLSFLYGGLDPAGNGIFTIFLIIMMGYAII KVFFANLKRGGILLTQITVGSLYMFSVPRGYVDGFMNWCKQVIGLCLTAFLQATLLTAGL LVIREHALLGLGIMLASGEIPRIAGQFGMDTSTRANLMGSVYAAQSAINLTRTVVMLIPK >gi|157101612|gb|DS480712.1| GENE 148 132719 - 133135 221 138 aa, chain - ## HITS:1 COG:no KEGG:Dtox_1496 NR:ns ## KEGG: Dtox_1496 # Name: not_defined # Def: hypothetical protein # Organism: D.acetoxidans # Pathway: not_defined # 1 131 1 131 145 155 49.0 4e-37 MTFFEQELKKLFADDTAFMDKRFIGNACYGRLDNNIRIKIRFTTCGVADQYEALKVTLLN RNEGEIDSMMLHFHDLWGIKKTGNPNFGEGISPHIWRYREKTEWYVYQPNKDDYQKLADA VRAYVETFQEPIQGQQMC >gi|157101612|gb|DS480712.1| GENE 149 133182 - 133295 58 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTDLIGRLDELLLLVDSQVELVNIKEQECISVGQNVS >gi|157101612|gb|DS480712.1| GENE 150 133292 - 133828 136 178 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941840|ref|ZP_02089167.1| ## NR: gi|160941840|ref|ZP_02089167.1| hypothetical protein CLOBOL_06736 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06736 [Clostridium bolteae ATCC BAA-613] # 35 178 1 144 144 286 100.0 4e-76 MIRLYHDTTTEAYCSILKNGFCHKDVVWTCSDSNMPYFYSDELIAKEFELDEQEAINKCC EMALQSAAFSAAVTNSTYRDFYVFEFLIDEEYWYIIKTDGSMKKSSACSVCIEANLLKKL THNIYCAPEHYTSRLAIFYLSMLEQDYLAELGTHKRNHKIGSGAVGSRKRIGRAGRNI >gi|157101612|gb|DS480712.1| GENE 151 133833 - 134168 141 111 aa, chain - ## HITS:1 COG:no KEGG:CKR_3426 NR:ns ## KEGG: CKR_3426 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 5 111 39 147 147 137 65.0 1e-31 MKWKKRFGTVVMTSWLLSFLFCMISYADGNVSGAIEETWKTALIQIRAVVDHVVFPAIDL VLAVFFFGKLGTAYFDFRKHGQFEWSGPAILFACLVFTLTAPTYIWQILGM >gi|157101612|gb|DS480712.1| GENE 152 134388 - 135977 437 529 aa, chain - ## HITS:1 COG:no KEGG:HM1_0594 NR:ns ## KEGG: HM1_0594 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 8 525 
44 559 562 622 59.0 1e-176 MSQGTSQKNKWSVGDEGVRISIIRSSDQAVISKSIDYTNKSPDIQLHFGTVNKITYRNGT RLAVDTTPYVYARPAIKLPKIVTSASGHSSIEEIKRYFCSEYMIQLISQDTGIPFETITN GDYKILLEPVAYVTYNGIRMAMTATEAAYYSRMGTNIRDMLPNVLFKNVPLSMFLEIPDL GYPAWSGPRDQLVTEDQIISSLGLGIIHFKEHEPEQGEVVTYDYVYRVNTQVITSVTVRG GQSDPDYPASVRFNIKGKNYTVNQVYYPEGSEQLAWIRWTTPETPQDIIIQVSGGGGGYP SKGTIHVKVVDLDKNPPPNPVADDRNDSYVKPTSLPSNSHMKEAKWSIWRPWWHEYWVYH DGGEDEDGYWEDEGWWEFDQDWYQASLSGGMDIFPDEKNPTATGDIIKSGYGINQEVTAD VSTTLSSAVTGAQNAVTYFPEFHYQTYWRLLERISGDYHARFEFKENEYSTFSRRTHFTP IWMPDGTYTAYTWLLDCWTPKGMLSLDLSDDVDIQGDLWLDWHIGPVSP >gi|157101612|gb|DS480712.1| GENE 153 136169 - 136765 307 198 aa, chain - ## HITS:1 COG:no KEGG:HM1_0593 NR:ns ## KEGG: HM1_0593 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 3 198 4 184 184 83 38.0 4e-15 MKLDDKQKKWMIIGSAVIVCMILVLLIAGQFKSEPAAVQETTQADQTTVAEPMIESSEEV RETVKVETKAETTAAPAKDQTDETVQQIQPEVTKPAKPKDEDRHDPTKKPNGETVTPTNP AVVEPQPTPETEPPVVTEPQPVEQPTEPPTEAPTEATESAPQEGMIYVPGFGWVTPSSSI GIDVDSDGDINKQVGIMD >gi|157101612|gb|DS480712.1| GENE 154 136906 - 137397 407 163 aa, chain - ## HITS:1 COG:no KEGG:DSY4557 NR:ns ## KEGG: DSY4557 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 163 20 183 183 186 57.0 4e-46 MAVASALVVLFLFCGISEYARLNLIAAGVRDAVQEAILSTVNDNYDDVYHGVREGYSGGY YPSGGGWDESLDYGDVYGFLDDLLGMEDYGSYHVKRMDGGRKEEYRISDLDVTIQNVSLT SDSSERFLADATFLLRVPVHFGGSRLPDMQIRMKVQAAYTPVF >gi|157101612|gb|DS480712.1| GENE 155 137445 - 137837 391 130 aa, chain - ## HITS:1 COG:no KEGG:HM1_0591 NR:ns ## KEGG: HM1_0591 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 4 130 5 131 131 185 70.0 4e-46 MNIIRNENGEGYIDVCVLVLCVFLMLGLAVNVLPVYVAKNQLDTFAQELCREAEMAGRIG PETSRRETVLREKTGLTPEVEWSEGGPFQLNEEVEVTLTMRMDIGLFGGFGTYPVTLHAK AQGKGEVYWK >gi|157101612|gb|DS480712.1| GENE 156 137854 - 138726 540 290 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1887 NR:ns ## KEGG: LM5578_1887 # Name: not_defined # 
Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 290 1 290 290 362 63.0 2e-98 MISLLFVFGLSLAFGFYFLAAGVLRLPSMGAARAIAERGSRKQKAAEFLEDGMDQLSVLL ASHITLDEYRKRRMGNILKAAGMGMTPEVHAASCMVKAGVTAIPILPCLLFLPLVTPVPV FAAVAVYFRESRRPEERLKEKRDRVEQELPRFVATIEQDLKSSRDVLSILENFKKNTSDT LRAELDILTADMRSGSYEAALTRFEARMNSPMLSDVVRGLIGVLRGDDSAAYFQLLSHDF KQLEMQRLKKEAQKIPPKIRIFSFLMLLCFLMTYLAIIGYEIIKSLGNMF >gi|157101612|gb|DS480712.1| GENE 157 138728 - 139657 508 309 aa, chain - ## HITS:1 COG:no KEGG:CKL_3887 NR:ns ## KEGG: CKL_3887 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri # Pathway: not_defined # 1 309 1 309 309 415 64.0 1e-114 MNMVNILACTGMVTGFFLILRLTPMEFTDGVFRHFLKSPDSIKSEIQKTQKRKKPSILQR EVELCQEVLALTGKGDRISLICAAALGLFMAGGILAVMLGNVFLIPVLATGFLFLPFWYI RLMEHHYKKAVASELETALSIITTAYLRSEDILTAVNENIHYLNPPASSAFQDFLVRLKV INPDMEAALKELRKKIRNDVFEEWVDAIGDCQYDRSLKSTLTPIVNKLSDMRVVNGELEN LVFEPRKEFITMVIFVIGNIPLMYALNHDWYQTLMHTLFGQAILAMTAVVIFVSTAAVIR LTKPIEYRR >gi|157101612|gb|DS480712.1| GENE 158 139654 - 141279 1048 541 aa, chain - ## HITS:1 COG:RSp1085 KEGG:ns NR:ns ## COG: RSp1085 COG4962 # Protein_GI_number: 17549306 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Ralstonia solanacearum # 139 491 51 399 450 115 28.0 2e-25 MLGEKRGESLFFQAEKAAEPKIVEMGPQTVERGIQAVEFHQPDHSGISATGTLHHPQKES KRPEMPTDRKNGGVEVQEEEWEDDTTVGFVPGNGGPRNQGLFFSAREGREFAQVLQEVQE YVSSKYATLLTGHGDADVKAQIKRYITKYIQDYRIAVQGMTMVQLVDGLYTEMAEFSFLT KYIFGTGIEEVDVNGWNDIEVQYSDGRNEKLEEHFDSPEHAINVIRRMLHVSGMVLDNAS PAVLGHLSKNIRIAVLKTPLVDEDVGICASIRIVNPQSMTQADFVKGGTATGEMLDFLAE CIRYGISVCVAGATSSGKTTLAGWILTTVPDSKRIFTIENGSRELDLVRVKEGKVMNRVI HTLTRDSENEKQTVDQDDLLDMALRFNPDIVCVGEMRSSEAYSAQEAARTGHTVLTTIHS NSCESTYRRMVTLCKRKYDISDETLMSLVTEAFPIIVFSKQLEDRKRKVMEIMECEILPD GNRNYRTLYHYNIVENRLDGDRFIIHGGHERVQGISESLKKRFLENGMPKEQLVRWEGEG A >gi|157101612|gb|DS480712.1| GENE 159 141273 - 142082 763 269 aa, chain - ## HITS:1 COG:no 
KEGG:LM5578_1890 NR:ns ## KEGG: LM5578_1890 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 269 1 271 271 413 71.0 1e-114 MLNIKKHSIFSREEKGNCQEYQDRGGVLAVWGSPGSGKTTVAVKLAKYLADRKRNVSLLL CDMTAPMLPCICPPGDLEFEHSLGSVLAATHVTQNLVRQNCVVHKHQDYLMILGMQKGEN EYTYPPHSASQATELIDCLRELTPYVIIDCSSYIVNDILSAIALMEADSVLRLVGCDLKS ISYLSSQLPLLQDHEWDADKQYRVASNIRPNEASEHVERVLGNVTFRIPHSAEVEEQILA GNLFKELSLKDSRGFRKEIERIVREVFGC >gi|157101612|gb|DS480712.1| GENE 160 142075 - 142911 664 278 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1891 NR:ns ## KEGG: LM5578_1891 # Name: not_defined # Def: pilus assembly protein CpaB # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 242 1 242 284 294 63.0 4e-78 MKLLKNRTVLGILCIVLSLAICFLATPLFNQAISEKTDIVRVQKAIKAGDEITKDKVQMV EVGGYNLPENVMKDLGSVVGAYATADLLPGDYVLYSKISVQPPGADTYLYQLDGTKQAIS VTVKSFAAGVSGKLRSGDIVSILAPDYRKMGETVIPQELQYVQVIAVTASTGVDANTAGG DKEKEKSLPATLTLLATPIQCKVLAELETEGNLHAALVYRGKAETAGQFIVAQEQLLERL YPPEEATEGGEGGTETQDRKDTEGEENVETEEMEEQNA >gi|157101612|gb|DS480712.1| GENE 161 143068 - 146139 1314 1023 aa, chain - ## HITS:1 COG:lin0372 KEGG:ns NR:ns ## COG: lin0372 COG4886 # Protein_GI_number: 16799449 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Listeria innocua # 399 567 413 570 656 69 32.0 4e-11 MNKKKLCILLSMAMLSHQVAYASDQSPSGDAGPSGQWDVVIAMPAQEKEGMYTDEEAVDA NVEGNGTSCSDVTVWEATPSEAVHMEDTKTGESDSEAGELKEVQGSYASPRSLLRAAYGA EYYQGDWDPDANQNAVRIPKVYPAQSANYLPSLPAAAVTSSLYKTYGPNGCADVQIQVGS GSNGSLNRSEVTVNLKTGQLLILRPMGASKHSYFHLSYINDNKGRRFSDFAKLYHPSDSY YSRYKYAWLMRVTCDMAGNWDAHADGWNTVNNEGDRVYYGPAHWTMHLNVAHDYGPWTTA SEPSCTVQGTQRRTCRDCGSVQTQSTSVLGHAFSDGYYMGANDGTYFKRCTRPGCAARTD VKYNPYTIVFDANGGSGNMNSQPSVYQTPVVLPVNQYTRDYHTFQGWNSQPDGGGTAYGD GQSVLNLTKIYGGTVTLYAQWLPNTYIITFADGMDGGQNVQKGLLYTGKLGTLPEFIRKG YTLLGFYTEPAGGTRITEETDVPHEDTTYHAHWSANKYRITFHTRDAHCDIDGKAVIYDK TIGILPVPVLEDYAFLGWYAQPYGEEKEEGIMYGEALPEPGQKIVSAYEYTVDRDMDAYA 
YFTLVFRDLGDGTNRRPGKDEAIGTEDDNLYLNGPDGVAGTRDDRKIYEGVDGQYGTEDD FYLDNEGRKYFPGPDRTFGTEDDYRDDGNGWNTRPGPDCIFGTQDDITVSNGWDGIPGTA DDWVAHSESYPSTNRRPGNDGIFGTEEDEIWSNGPDCTPGTGDDVRIHPGLDGVYGTRDD WYDNQNSYPSTNVRPGADGEFWTEDDEIWFNGPDRIPGNEDDLILIPGPDGIYGTEDDCY DNRREQEGTNIRPGQDGIFGTKDDELWTNGPDRIPGTEDDEKYIPRHSGGGTGGRHATGK RAYRPYYPFQDWTALTVQQEPTMEPLIEPGTPVLWGPVAKEDVRQTQVKSECDIPVASAN NVTGKEAEAIKKGTASASEPGKEGGNKEADQGKAVLAAVLLFIIIILVLYLISKVSSHAR KQE >gi|157101612|gb|DS480712.1| GENE 162 146176 - 146592 99 138 aa, chain - ## HITS:1 COG:no KEGG:Closa_3106 NR:ns ## KEGG: Closa_3106 # Name: not_defined # Def: peptidase A24A prepilin type IV # Organism: C.saccharolyticum # Pathway: not_defined # 5 131 8 134 139 92 44.0 4e-18 MYRHLELTLWIGLMLGASYIDFRHRMVPDWLNLCIAACSLLGLRSGTLAGVLCAIPFLLA AGCWGGIGGGDIKFMAACGMHLGMYGGLRAAILGTASLLVFHMGYLLWCSWKAIRPPTSY PMVPFLTMGCLAVLCAGG >gi|157101612|gb|DS480712.1| GENE 163 146603 - 146986 334 127 aa, chain - ## HITS:1 COG:SA0456 KEGG:ns NR:ns ## COG: SA0456 COG2088 # Protein_GI_number: 15926175 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein, involved in the regulation of septum location # Organism: Staphylococcus aureus N315 # 23 100 13 90 108 71 38.0 4e-13 MKAQQAQSKTQPEGQAGNKMNIDVKINSISMEGNILATASINLNQCLAIRNVRLMNGEKG MFLSMPSYRTGNGEFRDICFPITAEFRTQMTEALTSAYKEALTIQQGKMTQMAESPVPQM APTGLSM >gi|157101612|gb|DS480712.1| GENE 164 147008 - 147220 146 70 aa, chain - ## HITS:1 COG:no KEGG:HM1_0583 NR:ns ## KEGG: HM1_0583 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 29 70 35 76 76 64 78.0 1e-09 MNRALYQPNYWHYLTGDFLNRVYPRFRKAIRILMAVVIDALLLAGLYKLFADTVLTTLPQ RVMEMFNYSG >gi|157101612|gb|DS480712.1| GENE 165 147308 - 147559 187 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941856|ref|ZP_02089183.1| ## NR: gi|160941856|ref|ZP_02089183.1| hypothetical protein CLOBOL_06752 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06752 [Clostridium bolteae ATCC BAA-613] # 1 83 
140 222 222 162 100.0 5e-39 MYYYEADMINNLFINPGLASFKMEKSHKSHYIDSLRGFSSDLYNPHKLFAIEEKISDDFI DDEEAILLRAKELYQKYKDMLSE >gi|157101612|gb|DS480712.1| GENE 166 148267 - 148932 441 221 aa, chain - ## HITS:1 COG:no KEGG:FN1197 NR:ns ## KEGG: FN1197 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 8 220 2 210 213 138 38.0 1e-31 MANKSGGKPSDRRAGKRKDRNQRMGQRVPELGYYLIVTDTEETEKNYFEGLRDSIPAELK DRLIIKVEKAKTVELVKRALELVGKESQYRIPWIVFDRDQVKGFDEIIWTAEKNGVHAGW SNPCFEIWMYAYFGEMPAIRESYTCCDRFADKFEKVIGQKYYKNDRDVYRKLIQYGDFEQ AVVLAERSLKKCVEDGKKLPSEMWPACMVQRLVAEILKKIS >gi|157101612|gb|DS480712.1| GENE 167 148922 - 150271 792 449 aa, chain - ## HITS:1 COG:FN1198 KEGG:ns NR:ns ## COG: FN1198 COG1106 # Protein_GI_number: 19704533 # Func_class: R General function prediction only # Function: Predicted ATPases # Organism: Fusobacterium nucleatum # 19 436 1 414 420 301 41.0 2e-81 MAYNRIKSVKECEMKESAMLIQFNFKNFKSFRDEVSLDLSATKITEHEDHVVELANDKLL KVAAIYGANASGKSNIYDAFKFMSYYVEESFKFGGESDSRQKTDSDYIRVSPFLFDSKSR GMESTFEVFFVDNSENTGKIYQYGFALQNDEVVEEWFYSKAKTARNRYKTIFYRKKGEEL EVNGLPKSSTQNIKVALEKETLIVSLGAKLKISKLKKVRDWFLNNEVVDFGNPAENFFRS QVLPKDFTTSRKIQSNVVNYFASFDNAICDFKVEEVQREIEKESDLSYKIDALHKMVDCD EFESIPLKSESSGTLKMFALYPSLKEVLDNGGTLFVDELNARLHPLLVRNIILTFLSPEI NTNHAQMIFTTHDIWQLSNELLRRDEIWLVDKNQDGVSDLYSLADFKDEDGNKVRRDEAL AKNYLTGSYGAIPALRPMEMLKGRSTDGE >gi|157101612|gb|DS480712.1| GENE 168 150611 - 150808 190 65 aa, chain - ## HITS:1 COG:no KEGG:Closa_3551 NR:ns ## KEGG: Closa_3551 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 3 65 32 94 94 79 66.0 5e-14 MIQIRTKLVQIMADKRGEGFVDTAIKILMAVVIGALLLAGLYKLFADTVLPTLTQRVTEM FNYSG >gi|157101612|gb|DS480712.1| GENE 169 151070 - 152686 1142 538 aa, chain - ## HITS:1 COG:no KEGG:Closa_3096 NR:ns ## KEGG: Closa_3096 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 369 537 1 168 172 164 52.0 7e-39 
MLNITIMRGDNGQEAALKLPGSFDESEETLSALEAICSRDITTAIISADTSIESLDRILP GMEIDEQELNELGFLSRRIGGLCDRERNILTGILELEPSKSLRDIVNLSCNLHKFERLPG VKDADAMGRHLLKERGIDVPGSIADYIAYDQVANRYLADHPGVFTDSGFVFRKDEPLEIV YDGEKLPEPYCNPNGIITVRLSHKGKPNRVTLTFPATEECIEAVLAELGVKDLEDCYASC TSSIPGLLERLPNYDGLECLNLGAAYVGVLEEQYRLFMAVMEAEVPGSMFELPEIADALC QYHLLPEDIKTPEDYAKYLMDEGDICCVDGITAEFVNFDALGKAMMKENGVVLTSFGIVE RYDHPIRQLPEELTMLRLFSPLYTQVFTRDEYGDLNWEPEEMSAHEVCGYREEILRAIAD RRIDSEGDRGLAVYLKNELLARKVYSMNPTVEAWDGRLWGVLEVQTRGELSEGELRELTE AWSGQCSDGWGEGFEQQEIKIDQEELHVSFWHSGGDFVILTEAELKNQTEQQNGMIMQ >gi|157101612|gb|DS480712.1| GENE 170 152805 - 152966 161 53 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941863|ref|ZP_02089190.1| ## NR: gi|160941863|ref|ZP_02089190.1| hypothetical protein CLOBOL_06759 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06759 [Clostridium bolteae ATCC BAA-613] # 1 53 1 53 53 65 100.0 1e-09 MFFESREIREKFDQSVLCAYELQMNPEHEMDREPIEEDRDMEWEEGDEPEMGM >gi|157101612|gb|DS480712.1| GENE 171 153071 - 153175 106 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQETMERNTILFFQSIITNLQPEIGILATAEVRI >gi|157101612|gb|DS480712.1| GENE 172 153205 - 153813 413 202 aa, chain - ## HITS:1 COG:no KEGG:HM1_0579 NR:ns ## KEGG: HM1_0579 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 1 171 1 171 186 176 49.0 4e-43 MAGIAVKNNLILYYGNMAGYVEDGKAVLDPMFKNGYLIRFLQEKKGLEPCWQDGVYDRLV HGKVDLTGEVKPFRNCRLHQLRPEVPPEERFLDFEDQVSRHGMPDSSRYEVVYDCQIETD DLEAIYTKFEREFPQRTTGHPISISDVLELYDRQGSEFYYVDHYGFQKIGFIQGQSLDIQ GTDHMKGEDVSQRQPIQEMKEA >gi|157101612|gb|DS480712.1| GENE 173 153794 - 154084 272 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941866|ref|ZP_02089193.1| ## NR: gi|160941866|ref|ZP_02089193.1| hypothetical protein CLOBOL_06762 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06762 [Clostridium bolteae ATCC BAA-613] # 1 96 1 96 96 179 100.0 4e-44 MDEVKSRERLLCAYMAGRDGMVKLRMPFDEEQLEALMAQETLPELIVFLDSHAAPVLIMR GGNIQFCAEDGLGEKLDCLLKKRRYGKERGDGRYCG 
>gi|157101612|gb|DS480712.1| GENE 174 154088 - 154375 249 95 aa, chain - ## HITS:1 COG:no KEGG:Closa_3101 NR:ns ## KEGG: Closa_3101 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 5 94 4 92 94 90 48.0 2e-17 MAGLKRTYRIVDGNGRIYLPKSVREEAGMHPGDIIRLEADNGGWIGLMKVELIEAGDQSP EAMEAYVRVAVRQMPDKSRVSLLAELAELIQKDEG >gi|157101612|gb|DS480712.1| GENE 175 154335 - 154784 334 149 aa, chain - ## HITS:1 COG:no KEGG:DSY4571 NR:ns ## KEGG: DSY4571 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 135 25 161 164 65 35.0 5e-10 MRLKELNVRPEADGRVIIPAEVLESIGTDADGTVYVTYLAEEDGGGPEVFVSPLEVGELT PDQLYEDGTQLTVPEELLEEAGIQPDEDLEVVFEDRRITILPARTGGIVPREILELCSEI GINPEKVRIIMEMEGLDGRFEEDVPNRGR >gi|157101612|gb|DS480712.1| GENE 176 154797 - 155726 625 309 aa, chain - ## HITS:1 COG:no KEGG:DSY4669 NR:ns ## KEGG: DSY4669 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: Mismatch repair [PATH:dsy03430] # 1 303 1 302 311 425 67.0 1e-117 MNSIISWIGGKKALRELIYQRFPLSYGRYIEVFGGGGWVLFGKPRDRMEVYNDYNSDLTN LFRCVRDRPLALIQELGFFPLNGREEFDYLKRFLNREEFMAPGLWYQDEGHLAEQCLTPE ACGEIRPILEERAGMYDVKRAACFYQLIRYSYGSGCISYGCQPFDIRRTFGLLWEGSRRL AGTVVENKDFEALIRQYDREDAFFYCDPPYYKTEGHYEVVFRREDHVRLRDALGGIQGKW LLSYNDCAYIRELYEGYRIETVKRLNNLAQRYNQGEEYAEVFISNYDTACRSRELPEQME LSDVYGFYP >gi|157101612|gb|DS480712.1| GENE 177 155742 - 156356 411 204 aa, chain - ## HITS:1 COG:no KEGG:LM5578_1902 NR:ns ## KEGG: LM5578_1902 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 80 195 80 195 202 80 28.0 3e-14 MPAGRFLSGSSLEAFEVLMQESHPCLPDRKEQEMAVSCFFCRDTKRSGYQKMLKELRFRI PDSQFIERVEETKSCMEGPVYRNREHEERYRGLMGHRQILALDQKASYACALYLLAADGY LWDKARDAITMSQVIFPDIQLGGINVKGYILYHLAKDLYYRAGCVKVSDLTDRSLVDQGM FAVLLTGCLLREHGLKRMEQIGMV >gi|157101612|gb|DS480712.1| GENE 178 156432 - 156962 318 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941871|ref|ZP_02089198.1| ## NR: 
gi|160941871|ref|ZP_02089198.1| hypothetical protein CLOBOL_06767 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06767 [Clostridium bolteae ATCC BAA-613] # 1 176 8 183 183 351 99.0 2e-95 MDCPTHIQFGDPMYFEHFEDQKLERLVVDCNVTKNFVARVVLQEQPIEELPGEKLDTMTL YMAPERTISTYMDGYCYKGQDVEQKEIGVDTATYLFEVDGRYEEFKTEGDGYWGESREFS RIWDGRSIIDAAVITVCMPEPQGFEDMRRLVHYFFQGAQLLETGQNSQMGLQEPVQ >gi|157101612|gb|DS480712.1| GENE 179 157036 - 157419 223 127 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941872|ref|ZP_02089199.1| ## NR: gi|160941872|ref|ZP_02089199.1| hypothetical protein CLOBOL_06768 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06768 [Clostridium bolteae ATCC BAA-613] # 1 127 197 323 323 273 100.0 4e-72 MLFETCTIKGKRICNPIVDWLDRDIWDYIQSERIPVNPLYEWGFHRVGCIGCPMAAKNRW TEFRIFPSYQRAYLRAFGVMLTYIQEQGIATRWKDAGDVFAWWMEDKNTKGQISLSDLEL WRAENGE >gi|157101612|gb|DS480712.1| GENE 180 158108 - 158371 153 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941874|ref|ZP_02089201.1| ## NR: gi|160941874|ref|ZP_02089201.1| hypothetical protein CLOBOL_06770 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06770 [Clostridium bolteae ATCC BAA-613] # 1 87 55 141 141 174 100.0 2e-42 MGTYEEAMPHIGEAISTVKAEKNCDSFNAAFQCVCQLGGRAFLTDIFSNDRIPLNMRQSA QSTIVQEDKLSLDTEEAKLKNDSAISM >gi|157101612|gb|DS480712.1| GENE 181 158671 - 159195 255 174 aa, chain - ## HITS:1 COG:MT2803.2 KEGG:ns NR:ns ## COG: MT2803.2 COG4422 # Protein_GI_number: 15842273 # Func_class: S Function unknown # Function: Bacteriophage protein gp37 # Organism: Mycobacterium tuberculosis CDC1551 # 16 150 101 229 284 57 31.0 9e-09 MTSDFFLDEADGWRAEAWDIIRRRSDLNFVIITKRIHRFEVGLPGDWGSGYENVTICCTC ENQNRADYRLPVFLELPIKHRTVIHEPMLEQIDIRKYLATGKIEGVTCGGESGPDARVCD FAWILDSMEQCVEYDVPFWFKQTGAKFKKGNKVYLIDRKAQMSQAQKAGINYKC >gi|157101612|gb|DS480712.1| GENE 182 159458 - 159883 434 141 aa, chain - ## HITS:1 COG:CAC1468 KEGG:ns NR:ns ## COG: CAC1468 COG0454 # Protein_GI_number: 15894747 # Func_class: K Transcription; R General 
function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 135 1 138 142 95 38.0 3e-20 MVRLFEFQDLDKIMDIWLQGNLEAHSFIDAEYWKKNFDSVKSVLPNAEVYVYEEDGEILG FIGMDAEYIAGIFVAAGHKGQGIGHQLIETVKKKKRLSLHVFDKNTGAMAFYLKEGFTVR ERMTEKETGESECLMVYESGE >gi|157101612|gb|DS480712.1| GENE 183 160069 - 160494 250 141 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_2445 NR:ns ## KEGG: Dhaf_2445 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 3 137 4 145 192 133 51.0 2e-30 MEITIDELRRMKHKDGLILQGCGGDIQEWVDGINERLTQEDILLEGTKFGNVRSFFYEGL TNLLFPFEDVKLNFGRLAIWRLCSHETYGGMWLSDYVPNRLGGFLDEGSAQKTEERQQSE LCQGISAQENEEIGLGDISMR >gi|157101612|gb|DS480712.1| GENE 184 160733 - 162235 571 500 aa, chain - ## HITS:1 COG:no KEGG:HM1_0571 NR:ns ## KEGG: HM1_0571 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 335 496 6 175 175 77 31.0 1e-12 MHSSGMIPFVAGVDSDVPSLAECLEDCTVFANGHIELLNELAREMVVWGKDEKARFGAAL MWDHPDTIETVMDVAGNLDRYILDDRIKNWEAAGRLELEKRGIQIDKRIEPFLDLESIGR EYIYTKGCMTPMGYVTRKQSVPCSSKEQGYEPHRGCLLSVHMTDEKWNKKIFQLPLTEQK KSSLSVSKWNAYAIQYTSGYLGELPCYLPPGITLSELDQVADVINREVAKRGILEKEKLY AILEAGLPETIGEACEIIQGYENYVYVPADKMGRKTMARMLALQYARIVLPEELKPFCRF EEYIESFPDRYMVTTSTGHVFHPTRDYSRRLSESRTIRLYSPLTVSLFRHDRESFMPEIL SGEGLIKRKTVIEATIKRHLPSDEECMGVFLNNQLLRAKVERMVPGLEVYDGQVWCALEV RTIGILTEAEQEELKRDWLRQMKEGWGLSLIEYPVHTEGGEMHIGLWDEDYGTALCIKTE EELKRPDFPEQEPGRMELYM >gi|157101612|gb|DS480712.1| GENE 185 162448 - 162636 89 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941880|ref|ZP_02089207.1| ## NR: gi|160941880|ref|ZP_02089207.1| hypothetical protein CLOBOL_06776 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06776 [Clostridium bolteae ATCC BAA-613] # 1 62 1 62 62 123 100.0 3e-27 MSWTHLSFHFFQKAGIDTDDPSFTYFDFQGYGERYIQKNGVAMAPMVLYSELGRRLCGYS PG >gi|157101612|gb|DS480712.1| GENE 186 162708 - 162791 60 27 aa, chain - ## HITS:0 COG:no KEGG:no 
NR:no MISREREVKDLSCDVIARYILGNTDKT >gi|157101612|gb|DS480712.1| GENE 187 163082 - 164065 474 327 aa, chain - ## HITS:1 COG:alr3097 KEGG:ns NR:ns ## COG: alr3097 COG0535 # Protein_GI_number: 17230589 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Nostoc sp. PCC 7120 # 14 324 13 333 335 252 43.0 9e-67 MNESIVQLNQMEHIQAFESRIEERFQYTENNLDVLQVNVGKLCNLSCKHCHVEAGPNRKE IMRREVMEACLSVCREQKVKTADITGGAPEMNPHFEWFVEEICKACSHVIVRSNLVILTE EGYRHLPKFFADHKIEVVCSLPYYQAKVMDRVRGDGSFDKAIRVIQELNSLGYGREDDLV LNMVYNPSGAFFPPVQRAMEKEYKAKLLGDYGIVFNHLFTITNNPTGRFFYFLQNSCNLN SYMRKLCGSFNQATVEGLMCRYQVSVGYDGRIYDCDFNQASDLPVSTSETIFDLVGKPYR ARKICFGNHCYGCTAGQGSSCGGATEA >gi|157101612|gb|DS480712.1| GENE 188 164069 - 164401 262 110 aa, chain - ## HITS:1 COG:TVN1348 KEGG:ns NR:ns ## COG: TVN1348 COG0599 # Protein_GI_number: 13542179 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Thermoplasma volcanium # 1 106 1 107 120 63 29.0 9e-11 MDTYYNLEDLEKFASIGEESPRLAEKFFDYYNSVFEEGALTAREKSLIALAVAHTVQCPY CIDAYTNDCLNKGVSQDQMLEAIHVAAAIRGGASLVHGVQMKNIIKKLEL >gi|157101612|gb|DS480712.1| GENE 189 164416 - 164844 230 142 aa, chain - ## HITS:1 COG:no KEGG:Closa_1175 NR:ns ## KEGG: Closa_1175 # Name: not_defined # Def: C_GCAxxG_C_C family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 141 9 149 150 186 64.0 3e-46 MTEEEIKDLFMQGIDCSQVVAGRFADELGIEESLLRKMSACFGGGMQCGETCGAVTGALM VIGLRYGHSENNDSKQKEIMGEKTSEFKRLFADRYKSCMCRELLGYDISKAGEMEQVLEQ GLLFDFCPCVVRDVIEILEKMV >gi|157101612|gb|DS480712.1| GENE 190 164869 - 165444 109 191 aa, chain - ## HITS:1 COG:no KEGG:Closa_1173 NR:ns ## KEGG: Closa_1173 # Name: not_defined # Def: iron-sulfur binding reductase # Organism: C.saccharolyticum # Pathway: not_defined # 2 190 138 326 329 190 48.0 2e-47 MGVLFDCCGKPVSELGLEEQEEKIIRRINERLHVSGIQEIVMLCPNCYYFLKDKLEIQVL SIYEKLSQLGIGRKVEGEGRLFLPCPDREEQQWVSWLNPFFQQEPEIIKGTQCCGLGGCA 
GGKEPDLARDMACHVAEEGYKKVYTYCASCAGNLTRNGCKGVSHLLTEILDFHEQPDVAR SMINRMMTKFW >gi|157101612|gb|DS480712.1| GENE 191 165836 - 166552 424 238 aa, chain - ## HITS:1 COG:ydjX KEGG:ns NR:ns ## COG: ydjX COG0398 # Protein_GI_number: 16129704 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 9 218 21 220 252 91 30.0 1e-18 MKTVFHTYKKLILFIGMIAAILVCNHIFGWSAVLSGTGSLNLLRRMAEENLVNAIFLYTV ITVVGCVALTLPGVTFAILAGLLFGPVLGTICCSLATMIGAMAAFLAGRFFLKDSIKPVV MKNRYLKKWLFDEAGKNELFVLMITRLVPLFPYNLQNFAYGVTDIPFSTYSIFSLIFMLP GTAMYTIGTSGLADKENRVLYISIAVILAVIVMGMGMFFKKKYVVQSERMEEINGQES >gi|157101612|gb|DS480712.1| GENE 192 166618 - 167022 204 134 aa, chain - ## HITS:1 COG:no KEGG:Bmur_0644 NR:ns ## KEGG: Bmur_0644 # Name: not_defined # Def: YciC # Organism: B.murdochii # Pathway: not_defined # 1 121 1 123 134 70 38.0 2e-11 MKLFKIVYNSFLWAMTIAILCFKNEWLQMRVNTGYIFGGLLILSTMVVLFVFRKQEVVFN SLFTAGNLIICSVIGLLMYGGERMKVVPAALVREGIHQTRIPFSKINLILCIITAAGILI IGLNDILKIIQADR >gi|157101612|gb|DS480712.1| GENE 193 167035 - 167676 534 213 aa, chain - ## HITS:1 COG:no KEGG:Bmur_0645 NR:ns ## KEGG: Bmur_0645 # Name: not_defined # Def: hypothetical protein # Organism: B.murdochii # Pathway: not_defined # 8 213 15 228 231 219 52.0 8e-56 MVATAMVFSLAACGGKSNTVESKAEEKAAVETMTVDKDKKEIIMLCEVNGTYFTEPTRHG IVYKGGSNGEKAVLRGLADEKEFYQALLDIGAKAGDNLTAADMKAGPDNGKAVEGDKLDV FVKWEGQEEIPYQDIIKCTEDYTMDLRFGGNIESAKENNTGCVLCLDSCATGIVSDAAWP TGTTQNDIAKFYGDKDVLPEDGTQVTVIFRLAK >gi|157101612|gb|DS480712.1| GENE 194 167724 - 168458 301 244 aa, chain - ## HITS:1 COG:ECs2458 KEGG:ns NR:ns ## COG: ECs2458 COG0398 # Protein_GI_number: 15831712 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli O157:H7 # 1 231 1 233 235 229 54.0 2e-60 MEKKKSRKIGKYLVIALIVLGMLVYFFVPPVNATMNNILKMFVSGDFTVVRQFVESYGAY AALVSFLLMILQSIAAPLPAFLLTFANANLFGWWKGAILSWSSAMAGAAVCFYIARLLGR DAVEKLTSRTSLKQIDEFFEKHGRLSILIARLLPFISFDIVSYAAGLTSMSFGSFFIATG 
IGQLPATIIYSYVGGMLTGGAKMFVTALLLLFALSALIVLFRQLYVEKQKQKNSDKSVDI TDNI >gi|157101612|gb|DS480712.1| GENE 195 169031 - 169285 175 84 aa, chain - ## HITS:1 COG:no KEGG:CCV52592_2220 NR:ns ## KEGG: CCV52592_2220 # Name: not_defined # Def: prevent-host-death family protein # Organism: C.curvus # Pathway: not_defined # 6 82 9 85 87 80 51.0 2e-14 MNDVINVRKTNEISDICHKSQEPIFVTKNGYGALVLMSIEAYESLITDTQIDAAIAEAET EMEGGDILMDAREALSGLRRKYAK >gi|157101612|gb|DS480712.1| GENE 196 169402 - 170082 671 226 aa, chain + ## HITS:1 COG:CAC0878 KEGG:ns NR:ns ## COG: CAC0878 COG0765 # Protein_GI_number: 15894165 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Clostridium acetobutylicum # 6 221 1 216 221 164 42.0 9e-41 MKGYGMKFNWSYFFQTLPYIASKLNVTLTLTIISAVFSLVIGMVFAIISYYKVKGVSQVL KIWISFVRGTPVAAQLFFFYYGLANLSSLILNMSPTVAVAVIMSLNMGAFVSESIRGALI SVDEGQREAAMAFGMTGFQMTIRIVIPQAIRVALPPLFNDLINLFKMSSLAFLVGVRDVM GAAKIEGANTFQFFECYACVMLIYWVITLILTAIQKVLEKRCNAIF >gi|157101612|gb|DS480712.1| GENE 197 170128 - 171027 709 299 aa, chain + ## HITS:1 COG:BS_yxeM KEGG:ns NR:ns ## COG: BS_yxeM COG0834 # Protein_GI_number: 16081001 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Bacillus subtilis # 50 287 23 259 264 110 27.0 5e-24 MKNKKITALIISAALAAVSMAGCTGQSPETAGSAKTTQTSQSEEAKTDDSGNDTKQAESR TVIVGTAGTGEPYSMVDDKGNWTGAEADLWAEIEARTGWTIEMKQVSDMASLFGELNTGR VDVAANCFAITEARLKTYIASDPIYADAQVIIVQPDSEYQTLDDLRGHTIGVTAGQAAQT TVENMAPDYDWKIVTYEDSNAGFQDCSLGRIDCYANTVTNIEKAEKAQNLEFRMLDQKLF GNNVGWWFADTEEGTGLRDEVNVVLKEMHEDGTVSEIITKWFYEDLSQLISDEWLTATH >gi|157101612|gb|DS480712.1| GENE 198 171066 - 171803 630 245 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 243 1 242 245 247 51 3e-64 
MVTVENLSKTFGTDTEVLKDINLNIRKNEIVAILGPSGTGKSTLLRCLNFLCVPTTGIIQ IGNARVNAASYTSKEVRNLRRQSAMVFQGYNLFKNKTALENVMEALVTVQNRPKAEAEET SLRLLEKVGMLDRKDFYPAKMSGGQQQRVAIARALAVNPDVLLFDEPTSALDPELVGEVL NTIRQLAEEDSTMVLVTHEIKFARNVATRILFMDNGMIAADGSPEEIIDNPENPRLRQFL NFVSK >gi|157101612|gb|DS480712.1| GENE 199 171876 - 173099 491 407 aa, chain + ## HITS:1 COG:TVN0740 KEGG:ns NR:ns ## COG: TVN0740 COG1473 # Protein_GI_number: 13541571 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Thermoplasma volcanium # 12 404 7 395 396 283 42.0 4e-76 MLTNQILTEQIKEEAEQCREELIQFRRELHQHPELSMEEYKTSHKIAEKLRCLKGMEVIT GAAGGTGVIGILKGTKGPGKTVLLRADIDALPIEEKTGLPYASETAGVMHACGHDGHAAW LLGSAMILSKLRSHFPGNVEFVFQPGEENGLGGKKMVEEDHILENPAVDAVFAAHAWPGA AAGELHVAKRCAFGYPGSFEIQVTGRGGHGSWPHECVDPIAVANQIYSGLQQIVSRRLPE TAPRVLSVCTIHSGPQDKRNIIPDCCTMTGTLRADKLEVMEQMEDDIKRIAQGIAAANGA FARVTTGHGRAVINSPDAVLFCLQSAAGILGKEHVKLDTKPHLGGEDFSEYVSRVPGAYV YAGIATEKNNGTFGLHSSNFMLEESMIPKMAAVFAQFAVDFLEKGGF >gi|157101612|gb|DS480712.1| GENE 200 173148 - 173558 383 136 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0966 NR:ns ## KEGG: EUBREC_0966 # Name: not_defined # Def: N-acetylmuramoyl-L-alanine amidase domain protein # Organism: E.rectale # Pathway: not_defined # 1 134 214 347 347 166 63.0 2e-40 MIRESDDVQLDNIARTVIGNNASNCHIALHWDSITNNKGAFYMSVPDVESYRNMEPVKSH WQQHNALGENLVSGLKGAGVKIFSGGSMAMDLTQTSFSTVPSVDIELGDKASAHSSATLT TLADGLVAGVDQYFGQ >gi|157101612|gb|DS480712.1| GENE 201 173558 - 174040 414 160 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0966 NR:ns ## KEGG: EUBREC_0966 # Name: not_defined # Def: N-acetylmuramoyl-L-alanine amidase domain protein # Organism: E.rectale # Pathway: not_defined # 48 160 95 212 347 114 60.0 1e-24 MSIIKISISSNHQKGEQLTRIDYNGTPGYISSEYITTQVPETTALQPQACVDSGEITMNP SWKYAEFSKIHSGAAVLYRSEAASPKGITVCVNVEHGIKGGASVKTQCHPDGTPKVTGGT TMAAAVSGGMTFADGTPESKVTLSMAKILKDKLLVAGYDV >gi|157101612|gb|DS480712.1| GENE 202 174052 - 175062 426 336 aa, chain - ## HITS:1 COG:STM3755 
KEGG:ns NR:ns ## COG: STM3755 COG3943 # Protein_GI_number: 16767039 # Func_class: R General function prediction only # Function: Virulence protein # Organism: Salmonella typhimurium LT2 # 1 329 1 331 345 244 39.0 2e-64 MADKDFQIRNSTIDFLVFTKQNSTDSIQVRVHDENVWLTQKSMAQLFECSIDNISLHLKN IFRDGELVPEAVIEESSATASDGKQYKTKFYNLDAVISVGYRINSLRATQFRQWATKVLR TYTVQGYVLDKKRLENGQIFDEEYFEHLLDEIREIRASERKFYQKITDIYATAVDYSKDA VTTKEFFKTVQNKLHYAIHGHTAAEVIVERADHKKEHMGLTTWKNAPGGKIVKADVSIAK NYLSQSELQDLNQFVSMYLDYAERQAKRHIPMTMEDWATKLEVFLKMNDEDILLDKGKVT HEIAKAFAESEFEKYRVIQDQLYESDFDKLVMKVDL >gi|157101612|gb|DS480712.1| GENE 203 175592 - 175750 149 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941902|ref|ZP_02089229.1| ## NR: gi|160941902|ref|ZP_02089229.1| hypothetical protein CLOBOL_06798 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06798 [Clostridium bolteae ATCC BAA-613] # 1 52 4 55 55 80 98.0 4e-14 MAVSKNNIRVPITIPKELKQQLDKVAKEDKRTFSNLCAKILSDYVEQKKDGE >gi|157101612|gb|DS480712.1| GENE 204 175754 - 177379 856 541 aa, chain + ## HITS:1 COG:CAC1956 KEGG:ns NR:ns ## COG: CAC1956 COG1961 # Protein_GI_number: 15895229 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Clostridium acetobutylicum # 1 536 1 523 531 318 38.0 1e-86 MRIYIYSRKSVYTGKGESVENQIEMCREYISAKIPESDKAEIVVYEDEGFSAKNTNRPQF QKMLRDIKLNKPEYVVCYRLDRISRNVSDFSSLIEDLNSYNISFVCIKEEFDTSKPMGKA MMYIASVFAQLERETIAERVRDNMLMLARTGRWLGGTTPTGFTSEKIQEIIIDGKIKTSC KLKDNPEELKTIDCILEKFLELRSISGVSKYLIKQGIKSRTGKIFSLLGIKEILQNPVYC IADEDAWEYFTEHHSDVCFEKAECSNKYGLLTYNKRDYRKKNAPRQGMDKWIVAIGKHKG RISGKKWVAIQNILKDNIPTGTKPAIMHNDYSLLSGLIFCDKCQSRMFAKQRSGKGGNRE LYDYICNNKLRGGLKLCDCQNLNGQQADDIVCEYLMQYTDENSGIYKLLEKLKQDLQGQT QKSPLSIIDDKITKCNSEMDNLINTLSHGNLGTAFIERVNARIMELDKELASMKEEKKRL QKDVSVITDREIQVDMLSTALSSFKDNFHTLAIEEKRTLIKLMVKKIVWDGKDLHIFIDG E >gi|157101612|gb|DS480712.1| GENE 205 177424 - 178167 764 247 aa, chain + ## HITS:1 COG:lin2209 KEGG:ns NR:ns ## COG: lin2209 COG0370 # 
Protein_GI_number: 16801274 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Listeria innocua # 23 247 443 664 664 89 31.0 5e-18 MAGGAFPTLIAIISMFFVGVSGGAFDSMLSALLLTLFIVLGVCMTFVVSKALSMTVLKGI PSSFTLELPPYRRPQIGKVIVRSIFDRTLFVLGRAIAVAAPAGLLIWVMANVQVNGITLL AHCSGFLDPFARLLGMDGVILMAFILGLPANEIVIPIIIMAYMAQGSLLEFDSLAQLREL LVNNGWTWITAVSTMLFSLMHWPCTTTLLTIRKESGSFKWTVMSFLVPTVCGIVLCFAFT AVARLFV >gi|157101612|gb|DS480712.1| GENE 206 178287 - 179909 1631 540 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 539 3 543 547 632 60 1e-180 MAKQIKYGVEARKALEAGVNQLADTVRVTLGPKGRNVVLDKSFGAPLITNDGVTIAKEIE LQDPYENMGAQLIKEVASKTNDVAGDGTTTATVLAQAMVNEGMKNLAAGANPIVLRKGMK KATDAAVEAIKKMSKPINGKEQIARVAAISASDDEVGTMVADAMEKVSKDGVITIEESKT MKTELDLVEGMQFDRGYLSAYMCTDMDKMEANLDDPYVLITDKKISNIQDLLPLLEQVVK MGARLLIIAEDVEGEALTTLIVNKLRGTFNVVAVKAPGYGDRRKEMLQDIAILTGGTVIS EELGLDLKDATMEQLGRAKSVKVQKENTIIVDGMGDKDAISARVSQIRKQIEETTSDFDR EKLQERLAKLAGGVAVIRVGAATETEMKEAKLRMEDALNATRAAVEEGIIAGGGSAYVHA AGEVKSVADGLEGDEKTGARIILKALEAPLYHIVANAGLEGSVIVNKVKESSVGHGFDAY KEEYVDMIEAGILDPVKVTRSALQNATSVASTLLTTETVVANIKEPAPAGPAAPDMGGMY >gi|157101612|gb|DS480712.1| GENE 207 179953 - 180237 518 94 aa, chain - ## HITS:1 COG:CAC2704 KEGG:ns NR:ns ## COG: CAC2704 COG0234 # Protein_GI_number: 15895961 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Clostridium acetobutylicum # 1 94 1 94 95 91 68.0 3e-19 MKLVPLFDKVVLKQLVAEETTKSGIVLPGAAKEKPQQAEVIAVGPGGVIDGKEVTMQVKA GDKVIYSKYSGTEVEIEDEKYVIVKQNDILAVVE >gi|157101612|gb|DS480712.1| GENE 208 180500 - 181111 733 203 aa, chain + ## HITS:1 COG:no KEGG:Closa_1005 NR:ns ## KEGG: Closa_1005 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 138 1 145 214 81 44.0 3e-14 MAKKGWGKFVAFAAVTGAVAAGVSYVLQYKTYHRELEKDFREFEDGDEDGTDRTIDTRSL 
DRNYISLSSSKDEFKVAAKDMAQATKNVLKDAGSLLSDTAHEAVSAAVDTAQIALQTVKT KKADFMDEHSRDKDDDSDLFEDEGFLDDDYVDEDDLYDYRRMDGSSSYDSGDLTEDDYDD SSEEPEKSEAGRSTAIIEEDTLE >gi|157101612|gb|DS480712.1| GENE 209 181271 - 182218 956 315 aa, chain - ## HITS:1 COG:no KEGG:Closa_0990 NR:ns ## KEGG: Closa_0990 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 98 315 74 288 289 137 38.0 6e-31 MKGRNNTWTDMAAAAAFAAVLGISAFWGQHDRAYDQAGTAGSAYGRMYNQAGPADEDGGP VGKAGAYPGTAGAAQAAEGAGEDLARTGGKDAEILPDVTESQKAVLDKIINALETRDLET AAQVMDSGEEELVTLFYEVMDGARYLYDGRSFSQSMEGEGLVLTMPKTLYYGTFKGGRPE GECTALQVVELDAPRYDYSQGLWKDGKMEGLGHTGYCYYETSPEGEARDVCKTGRFSGDR LEGEVTYTTLNQEGETSTWKLEVKDGTVQLDDRWIYIEERGEYQLMSQEDDSHAYIMDEK LADQPVWINLLAWEE >gi|157101612|gb|DS480712.1| GENE 210 182336 - 183739 1196 467 aa, chain - ## HITS:1 COG:BH0578 KEGG:ns NR:ns ## COG: BH0578 COG1167 # Protein_GI_number: 15613141 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Bacillus halodurans # 1 465 1 468 469 357 40.0 4e-98 MELIVPFDSQSESPLYEQIYQYIKNEIRQGKLESGSRLPSTRILARNLRLSRSTTQMAYD QLLSEGYIEALPCKGYFVCKIEELVEVRQKGGGSFVEPGDTGKDRYEVDFSPRGIDLDSF PFNTWRKISRNTLVDDNKEMFAAGDPQGERALRTAIGDYLHSARGVDCRPEQILIGAGSE YLLMLLSQILGNGRKIAMENPTYKQAYRVLKGEGYPVIPVDMDRYGMDVQRLSRSSADVA YVMPSHQYPTGIVMPVKRRQELLAWAYSGADRYLIEDDYDSEFRYKGKPIPALQGMDRGG RVIYMGTFSKSIAPAIRVGFMVLPEHLLEAYRERAGFYLSTVSRIDQNILYQFITQGYYE RHLNRMRALYKGKHDALMAGLKELEDRFLIRGEYAGLHVLLTHRQGETEESLVARAAELG VKVYGISGCFIQPEEKLFDSTVMLGYASLSEEEIRNGTKLLSKAWTF >gi|157101612|gb|DS480712.1| GENE 211 183999 - 184469 473 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941911|ref|ZP_02089238.1| ## NR: gi|160941911|ref|ZP_02089238.1| hypothetical protein CLOBOL_06807 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06807 [Clostridium bolteae ATCC BAA-613] # 1 156 3 158 158 263 100.0 
4e-69 MSEQTLNMLKKLSPKKEHSLAFRCFIGAAAIVLLTVSVHILVFTAIEKHSEIRNLPEWYQ LLAGLSVDALYPLFIVSLASIIKRVFRIPVHRLERKVFKFTYLCLGLIAASRISMYVGLY VFFMALFAAGTAIYIIIGIRFLYELFKYRFISCRYY >gi|157101612|gb|DS480712.1| GENE 212 184634 - 186328 2121 564 aa, chain - ## HITS:1 COG:RSc0791 KEGG:ns NR:ns ## COG: RSc0791 COG0008 # Protein_GI_number: 17545510 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Ralstonia solanacearum # 19 560 16 578 580 622 54.0 1e-178 MADKDTVMETEEKEVVSRNFIEQEIDKDLAEGVYDHVQTRFPPEPNGYLHIGHAKSILLN YGLAQKYGGKFNLRFDDTNPTKEKTEFVESIMDDVKWLGADFEDRLFFASNYFDQMYECA VFLIKKGKAFVCDLTADQIREYRGDFTTPGKESPYRNRSVEENLALFEEMKEGKYQDGEK VLRAKIDMASPNINMRDPVIYRVARMSHHNTGDKWCIYPMYDFAHPIEDAIEHITHSICT LEFEDHRPLYDWVVKECEFENPPRQIEFAKLYLTNVITGKRYIKKLVEDNIVDGWDDPRL VSIAALRRRGYTPEAIKMFVEMVGVSKANSSVDYAMLEYCIREDLKLKKPRMMAILNPVK LVIDNYPEGQVEMLDVPNNLENPELGDRKVPFGRELYIEREDFMEEPPKKYFRMFPGNEV RLMGAYFVKCTGCEKDADGNITVVHGTYDPETKSGSGFEGRKVKGTIHWVAVPTARQVEC RLYENIVDEEKGKLNEDGSLNLNPNSLVVLKECYVEPALAEGEAYDSYQFVRNGFFCVDC KDSTPQKPVFNRIVSLKSSFKLPK >gi|157101612|gb|DS480712.1| GENE 213 186470 - 187201 859 243 aa, chain + ## HITS:1 COG:BS_azlC KEGG:ns NR:ns ## COG: BS_azlC COG1296 # Protein_GI_number: 16079724 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine resistance) # Organism: Bacillus subtilis # 14 238 28 249 254 155 35.0 6e-38 MYNYEEIYQTMKKAAKFALIKTIPIFLGYLFLGIAFGLLLQKSGLGVFWAFLISTLVYAG SMQFALVGILTGGLSYVTTAVMTLFINSRHAFYGLTFIERFKEMRKTYPYMVFSLTDETY SLLCSMARPSDFTDREWKAATFFTSFFDQCYWIAGSVLGALMGELITFDTTGIDFAMTAL FVVICVDQWKAAKTHIPAVTGFICGALFLVLIRSSNFILPALAAAVAMLLFLRRTVEQKM EED >gi|157101612|gb|DS480712.1| GENE 214 187205 - 187534 367 109 aa, chain + ## HITS:1 COG:PM0423 KEGG:ns NR:ns ## COG: PM0423 COG1687 # Protein_GI_number: 15602288 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permeases (azaleucine resistance) 
# Organism: Pasteurella multocida # 8 109 7 108 109 89 47.0 2e-18 MTQSNYVLITILVCALCTQVTRWLPFLLFGGKKEVPGLVRYLGTILPAAIMAVLVVYCLK GITPLAYPYGLPELISVAVVVGLHLWKGNTLASIALGTLCYMMLVQLVF >gi|157101612|gb|DS480712.1| GENE 215 187676 - 190246 2379 856 aa, chain - ## HITS:1 COG:CAC3410 KEGG:ns NR:ns ## COG: CAC3410 COG2206 # Protein_GI_number: 15896651 # Func_class: T Signal transduction mechanisms # Function: HD-GYP domain # Organism: Clostridium acetobutylicum # 649 833 67 253 265 157 43.0 1e-37 MRSYVFISITALYFYTFLMLAFMSAKKSRLIRDFIAVLGAMILWTGGSLLMRLRAWPSYE LWFHLSLAGISLVPCAFFCFIRDFCGHKAKSSHRMWLVLILLVNLYNIWTGNLVQPPAIE WNGASAVFVYHMDWRASIMYGLCFLVIIHTAAIIWSSRKNRALKAKVVPLLSGILLLFAG NMAVPLFNGFPIDILSGVLNAGVMFYTLYSRHVFRMTLLVSRGNCYIIAMAVSALAFYNV AQTLDGIIRRDIPMLAPFSVMIVATMTMLVTLLIYMVAKGFFDRIFIKEEVSQTERLSEY TAIVSQSLRLTEILNALVNVIGNTLHTRKIYICIRDSQGNYPAVFSSSPLDDKNFSLAGS APLINWLKSHDSCLLLKDFRRTVEYKSMWEEEKYQMEVLKIECILPLKDGNDLAGLVLLG GREKRGKMRTQGYSEEDIIFLNSIESVSSIAVKNSRLYEKAYEEARTDELTGLLNRKYFC QTLEENYEKCRRTSLALVIFNVDDFKLYNQLYGNHEGDKALVHIARIIQGTVGDNGYCGR YSGKEFAVILPDYDIYSAQNLAENISRQIQNMNLDCTDTYLKPLTVSCGICAAPYAAASM NELISNADCAVYHVKRSGKNGIRVYSDGIIGVRDTDEGLAKKHRSMYSEYAPTIYALTAA IDAKDHYTFQHSKNVAYYAHAMGQALKTSEEYQEILKESALLHDIGKIGIPENILNKTGK LTDEEYSTMKRHVEASVEIIRHLPSMDYVIPAVLGHHERYDGRGYPRRIAGKDIPLAARI LCIADSFDAMVSKRSYKPSMSVEFAVNELERGAGTQFDPELVPVFIGLLKDGMIRPALND GNDVCSREPEGTDAEI >gi|157101612|gb|DS480712.1| GENE 216 190261 - 191895 1478 544 aa, chain - ## HITS:1 COG:no KEGG:Dehly_1104 NR:ns ## KEGG: Dehly_1104 # Name: not_defined # Def: GH3 auxin-responsive promoter # Organism: D.lykanthroporepellens # Pathway: not_defined # 5 543 6 534 543 367 37.0 1e-99 MTFEEKLTNQEYDRIWQEYCGFLDLDMASYMKIQRRLLEEQMGLWCASPLGKKILKDKRP ENIEEFRAMVPLTTYEDYADVLLLKKEDMLPDKPIIWIQTTWEGGKHPIKVAPYTSGMLK TYKNNILTCLMMATSKGRGSFDIGIGDTFLYGLAPLPYITGLIPLGLSDELHIEFLPPVD EAVNMTFSERNKKGFKMGLGKGIDFFFGMGSVAYFVSMSIASLSEGGKKGGSFLKKLFTM PPSMTLRYLKAKKQCEKEGRGLMPKDLFRLKGFLCAGTDNRCYKDDLEELWGIRPVEVFA 
GTEPSFIGTETWNRNGLYFFPDACFYEFIPEAEMYKNLDDPSYVPRTICMDEVAAGEVYE IVLTVLKGGAFARYRVGDVYRCLGLTSKEDETRIPRFEYIDRVPDIIDIAGFTRISENSI KNVISLSGIDVTDWAALKEFTEDRGRPYLHMYVELAPGCVVNRAVSRELLKEHLTIYFKY VDQDYHDLKRILGMDPLEITVLRCGTFASYREKTGKTLRHINPSIHDIQEMIKMQAAFDG HRRY >gi|157101612|gb|DS480712.1| GENE 217 191902 - 192432 186 176 aa, chain - ## HITS:1 COG:no KEGG:DhcVS_1159 NR:ns ## KEGG: DhcVS_1159 # Name: not_defined # Def: hypothetical protein # Organism: Dehalococcoides_VS # Pathway: not_defined # 7 170 6 173 196 64 30.0 2e-09 MDWGFGLGAFGMFFLYDWNRVFLKKKWFLPLFAAGNLLLAVVGGRMVYACVSSGIKEQSV WLLPGAFFLAVLIYTLYFALPFDNTYCQEADRHKVCRSGMYGWCRHPGIWWFFGCFFCTG LSTGELNRVMLSLCLSLLNLLYAWYQDRLIFVEEFSDYREYQEQVPFLLPRLWKRR >gi|157101612|gb|DS480712.1| GENE 218 192890 - 193249 371 119 aa, chain - ## HITS:1 COG:no KEGG:BBR47_42170 NR:ns ## KEGG: BBR47_42170 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 2 119 3 120 125 137 52.0 1e-31 MLQTQYEFTLPKGYMDEEGNLHKNGIMRLATAMDEIRAMRDPRVMQNPDYAAIIILSHVI IKLGSLPLVTVETIEKLFASDLKFLQEMYETLNGLEGPVVRVTCPYCGKEFTTEMDFKG Prediction of potential genes in microbial genomes Time: Thu Jun 30 19:59:39 2011 Seq name: gi|157101611|gb|DS480713.1| Clostridium bolteae ATCC BAA-613 Scfld_02_54 genomic scaffold, whole genome shotgun sequence Length of sequence - 7996 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 3, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 392 382 ## COG3464 Transposase and inactivated derivatives + Term 619 - 671 10.3 - Term 607 - 659 11.1 2 2 Op 1 . - CDS 737 - 3856 1299 ## COG3587 Restriction endonuclease 3 2 Op 2 . - CDS 3853 - 4404 129 ## gi|160941924|ref|ZP_02089250.1| hypothetical protein CLOBOL_06819 4 2 Op 3 . - CDS 4415 - 6310 687 ## COG2189 Adenine specific DNA methylase Mod 5 2 Op 4 . - CDS 6303 - 6584 158 ## gi|160941926|ref|ZP_02089252.1| hypothetical protein CLOBOL_06821 - Prom 6810 - 6869 6.6 6 3 Op 1 . 
- CDS 7417 - 7539 102 ## gi|160941928|ref|ZP_02089254.1| hypothetical protein CLOBOL_06823 7 3 Op 2 . - CDS 7545 - 7754 106 ## gi|160941929|ref|ZP_02089255.1| hypothetical protein CLOBOL_06824 8 3 Op 3 . - CDS 7669 - 7881 72 ## gi|160941930|ref|ZP_02089256.1| hypothetical protein CLOBOL_06825 - Prom 7924 - 7983 4.2 Predicted protein(s) >gi|157101611|gb|DS480713.1| GENE 1 3 - 392 382 129 aa, chain + ## HITS:1 COG:BH0270 KEGG:ns NR:ns ## COG: BH0270 COG3464 # Protein_GI_number: 15612833 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 2 129 272 406 441 64 31.0 4e-11 TWAIENVRKRLQRSMPVSLRKYYKRSRKLILTRYKKLKDENKQACDLMLHYSEELRLAHR MKEWFYDICQMEAYRQQQREFDDWIANAQSCGIKEFEACAKTYRAWRKEILNAFKYGLTN GPTEGFTTR >gi|157101611|gb|DS480713.1| GENE 2 737 - 3856 1299 1039 aa, chain - ## HITS:1 COG:FN0417 KEGG:ns NR:ns ## COG: FN0417 COG3587 # Protein_GI_number: 19703759 # Func_class: V Defense mechanisms # Function: Restriction endonuclease # Organism: Fusobacterium nucleatum # 1 646 1 659 997 514 45.0 1e-145 MKLKFKLQGFQTDAVNAVCALFDGQQRETSTFSMEQSGGQMELFANLGVANTLLLNDGKI IENMREVQRKNVLPQTFDLQGRQFSVEMETGTGKTYVYTKTIYELNHLYGFTKFIIVVPS VAIREGVYKSLQVTQEHFSALYDNASCRYFIYNSAKLSDVRQFATSANIEIMIINIDAFR KAENVINQQQDKLNGDAAINYIQATRPIVIIDEPQSVDNTQKAKAAIASLAPLCIFRYSA THREKVNLLYRLTPVDAYQMGLVKQICVSSNQAAGGFNKPYIKLLSVSNEDGFKARLEID VQGKDGKVSRKAVTVKPGADLHKLSGYRSLYENYVVSGIDCTPEMEQIELSNTDVVRLGH AIGDVDEKLIKRMQIRRTIEAHLDKELRYTEKGIKVLSLFFIDEVKKYRNPEGGKGIYAQ MFEELYAELMGLPKYAPLREHFTIDASRVHDGYFSQDKKGNYKNTKGDTQDDTNTYNTIM KDKEWLLSFDCPLRFIFSHSALKEGWDNPNVFQVCTLIEQKSTLTARQKVGRGLRLCVNQ DGERIEDRNINLLHVMANENFAEFANTLQKEIEEETGVKFGILQLDLFSGMVFEETKTEE MPLTEQQTAMLLSHLAGEGFIKTDEPMPEVTKLPQELEAAKTKAVELIAREGDITPATVT QLTVQKTVVVQSALTYDDAQQLMAHFKQNGYVTKSGRIKDSMKAAIQTGTLDLPPHLEGA RVQIENMVRRIDTKPPIRDASRDVTVHLKKEVTASPEFLELWNKIKQKTTYRVQIDEDEL IRRSVKGLRDMEPIPKARIITQTADIQIDNPGVTYTERGIKTATLSDSYLSLPNILTIVG 
SQTLIKRATVAEILKQSGRLGDFLNNPQMFIENTTQIILDVRRTLAIDGISYKKLYGAEY YVQEIFDSAELIANLDRNAVAVEHSVYDYIVYDSTTIEKPFALDLDDDPDVKMFFKLPSR FKVDTPIGTYNPDWAVYVELEGMKKLYFILETKGKTNELDLRGREDLKIYCGKEHFKAID SGAELHVASKWKDFKVRNI >gi|157101611|gb|DS480713.1| GENE 3 3853 - 4404 129 183 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941924|ref|ZP_02089250.1| ## NR: gi|160941924|ref|ZP_02089250.1| hypothetical protein CLOBOL_06819 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06819 [Clostridium bolteae ATCC BAA-613] # 1 183 1 183 183 337 100.0 3e-91 MIEWITNNKEWIFSGIGITLIVGVIATFCYLKKKLYRKKYIWKPQTLMRNKSIAKDISTH PKDIAFDIEKGDIQLEYYVSGLDAFIQLLQKFILTERRKYPIYGNTYGIDESISIFTEQD VVEFQRQCNNIELHLIDYFKEWIEEIYQIRRNGNHLTNELKVSGKAETVKCIVPNRHKEG REK >gi|157101611|gb|DS480713.1| GENE 4 4415 - 6310 687 631 aa, chain - ## HITS:1 COG:FN0416 KEGG:ns NR:ns ## COG: FN0416 COG2189 # Protein_GI_number: 19703758 # Func_class: L Replication, recombination and repair # Function: Adenine specific DNA methylase Mod # Organism: Fusobacterium nucleatum # 114 626 1 519 525 407 46.0 1e-113 MIDKLDGLSMDIEAENKNKLKSVFPECFIEGRLDIDKLLSLCGEYITDDFEKYEFKWKGK SDCLRLAQRRSTATLRPCPEESVNFDTTQNLYIEGDNLEVLKLLQKSYFCGVKMIYIDPP YNTGNDFVYEDDFADPMRRYMEVTQQTTKSNPETMGRYHTNWLNMMYPRLRLAANLLRDD GVIFISINDSEITNLRKLCDETFGEENFVVDLIWTNKEGGGSSDSKLFRIKHEHILCYCK NLELIEVNGVVISNEERYKSSDEYEEIRGKYYLQKLGMGSIQYSTSLDYPITAPDGTEIM PADNNNGKKACWRWSQDKFKWGQSNGFVEIKKDPNDIWTVYTKQYLNCDNEGNIIKRTQR PMSVIDKFSSTQASKLLQNLFDGKVFDYSKPVDLIIYLMQRVLKEKSNDLILDFFSGSAT TAHAIMQLNAKDGGNRRFIMVQLPEVCDDKSEAYKAGYTNICEIGKERIRRIGKKVLEAD GGQTSLDGDKPSIDVGFKVFKLDTSNLKLWEDTPIEDGDVATLFNRIDKHIDGLKPDRSD EDLIYEILLKMGYSLTADLAALDVDGLTVYKVDNGEMLICLQDGITAEHIEQMAAFAPQK IVVADNSLADNSAMSNAHYLLENKGIELKIV >gi|157101611|gb|DS480713.1| GENE 5 6303 - 6584 158 93 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941926|ref|ZP_02089252.1| ## NR: gi|160941926|ref|ZP_02089252.1| hypothetical protein CLOBOL_06821 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06821 [Clostridium bolteae ATCC 
BAA-613] # 1 93 23 115 115 158 100.0 1e-37 MRYKISQKDYDKIEELVKCALSAYNKAEAKNYIQQISYVPYNLAGAANNILGELISAIET ASGQVQDKERLKYFAEISLYKLKSFIEEDAPND >gi|157101611|gb|DS480713.1| GENE 6 7417 - 7539 102 40 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941928|ref|ZP_02089254.1| ## NR: gi|160941928|ref|ZP_02089254.1| hypothetical protein CLOBOL_06823 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06823 [Clostridium bolteae ATCC BAA-613] # 1 40 1 40 40 74 100.0 3e-12 MKHVDGISGNDCIICGERIPVSRNKKKELMELLIQYMNEG >gi|157101611|gb|DS480713.1| GENE 7 7545 - 7754 106 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941929|ref|ZP_02089255.1| ## NR: gi|160941929|ref|ZP_02089255.1| hypothetical protein CLOBOL_06824 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06824 [Clostridium bolteae ATCC BAA-613] # 1 69 1 69 69 129 100.0 6e-29 MKKAVQTSISVPTKDGLQKLAVSDIYYIESQGHDTCYRTARGEFLSRITLKELEDSMGGY GIFVVEKEI >gi|157101611|gb|DS480713.1| GENE 8 7669 - 7881 72 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941930|ref|ZP_02089256.1| ## NR: gi|160941930|ref|ZP_02089256.1| hypothetical protein CLOBOL_06825 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06825 [Clostridium bolteae ATCC BAA-613] # 1 70 1 70 70 127 100.0 2e-28 MNSGQKKGINFDLDTNALKIYYTEGDWRNAYHDVRNFLRKWPDEKGSADIYIGSYEGRSA EAGSFGYLLY Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:00:16 2011 Seq name: gi|157101610|gb|DS480714.1| Clostridium bolteae ATCC BAA-613 Scfld_02_55 genomic scaffold, whole genome shotgun sequence Length of sequence - 8978 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 2, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 94 - 150 3.1 1 1 Op 1 47/0.000 - CDS 233 - 607 535 ## PROTEIN SUPPORTED gi|240143815|ref|ZP_04742416.1| 50S ribosomal protein L7/L12 2 1 Op 2 43/0.000 - CDS 706 - 1206 756 ## PROTEIN SUPPORTED gi|239623242|ref|ZP_04666273.1| ribosomal protein L10 - Prom 
1229 - 1288 2.1 - Term 1592 - 1636 0.5 3 1 Op 3 55/0.000 - CDS 1686 - 2381 1023 ## PROTEIN SUPPORTED gi|238916246|ref|YP_002929763.1| large subunit ribosomal protein L1 4 1 Op 4 45/0.000 - CDS 2541 - 2966 643 ## PROTEIN SUPPORTED gi|240143818|ref|ZP_04742419.1| ribosomal protein L11 5 1 Op 5 . - CDS 3064 - 3582 568 ## COG0250 Transcription antiterminator 6 1 Op 6 . - CDS 3639 - 3899 347 ## gi|160941940|ref|ZP_02089265.1| hypothetical protein CLOBOL_06834 7 1 Op 7 . - CDS 3926 - 4075 239 ## PROTEIN SUPPORTED gi|160881814|ref|YP_001560782.1| ribosomal protein L33 - Prom 4129 - 4188 6.3 8 2 Op 1 . - CDS 4300 - 5025 440 ## COG1040 Predicted amidophosphoribosyltransferases 9 2 Op 2 . - CDS 5048 - 7339 2384 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 10 2 Op 3 . - CDS 7424 - 8467 1290 ## COG1077 Actin-like ATPase involved in cell morphogenesis - Prom 8614 - 8673 8.6 Predicted protein(s) >gi|157101610|gb|DS480714.1| GENE 1 233 - 607 535 124 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240143815|ref|ZP_04742416.1| 50S ribosomal protein L7/L12 [Roseburia intestinalis L1-82] # 1 124 1 125 125 210 90 3e-54 MAKLTTAEFIEAIKELSVLELNELVKACEEEFGVSAAAGVVVAAAGAGAAAAEEKTEFDV ELTEVGPNKVKVIKVVREVTGLGLKEAKDVVDGAPKVVKQGASKEEAEDVKAKLEAEGAK VTLK >gi|157101610|gb|DS480714.1| GENE 2 706 - 1206 756 166 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623242|ref|ZP_04666273.1| ribosomal protein L10 [Clostridiales bacterium 1_7_47_FAA] # 1 166 1 166 166 295 93 6e-80 MAKVELKQPVVDEIKAMLEGAAGAVIVDYRGLTVEQDTQLRKQLREAGVAYKVYKNTLIK RAAEGTDFAALAPQLEGPTALAVSKEDATAPARILANFAKTAPKLELKASVVEGTYYDQA GTQVIATIPSREELLGKLLGSIQSPITNLARVLNQIAEQQGGAAEA >gi|157101610|gb|DS480714.1| GENE 3 1686 - 2381 1023 231 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916246|ref|YP_002929763.1| large subunit ribosomal protein L1 [Eubacterium eligens ATCC 27750] # 1 230 1 230 231 398 83 1e-111 MKSGKRYAEAMKNVDRAALYDIADAITLVKKNASAKFDETVELHIRTGCDGRHADQQIRG 
AVVLPHGTGKTVRILVFAKGPKADEAQAAGADYVGAEELIPRIQNDGWLDFDVVVATPDM MGVVGRLGRVLGPKGLMPNPKAGTVTMDVTKAINDIKAGKIEYRLDKTNIIHVPVGKASF TEEQLADNFQTLIDAIMKAKPSTVKGAYLKSVALTSTMGPGVKLNVAKLVN >gi|157101610|gb|DS480714.1| GENE 4 2541 - 2966 643 141 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240143818|ref|ZP_04742419.1| ribosomal protein L11 [Roseburia intestinalis L1-82] # 1 141 1 141 141 252 89 8e-67 MAKKVTGYIKLQIPAGKATPAPPVGPALGQHGVNIVQFTKEFNARTADQGDMIIPVVITV YADRSFSFITKTPPAAVLIKKACNIKSGSGVPNKTKVAVLKKADLQKIAETKMPDLNAAS LEAAMSMVAGTARSMGVTIEE >gi|157101610|gb|DS480714.1| GENE 5 3064 - 3582 568 172 aa, chain - ## HITS:1 COG:CAC3149 KEGG:ns NR:ns ## COG: CAC3149 COG0250 # Protein_GI_number: 15896397 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Clostridium acetobutylicum # 1 172 1 172 173 188 60.0 4e-48 MSEAHWYVVHTYSGYENKVKVDIEKTIENRNLQDQILEVSVPMLPVVELKNGVEKKADKK MFPGYVLINMVMNDDTWYVVRNTRGVTGFVGPGSKPVPLTEEEMASLGFHREEDVLVDFE VGDMVVVISGAWKDTVGAIKAINDSKKTITMHVEMFGRETPVELGFAEVKKM >gi|157101610|gb|DS480714.1| GENE 6 3639 - 3899 347 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941940|ref|ZP_02089265.1| ## NR: gi|160941940|ref|ZP_02089265.1| hypothetical protein CLOBOL_06834 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06834 [Clostridium bolteae ATCC BAA-613] # 1 86 1 86 86 106 100.0 7e-22 MEENTVNAVETTEKTAKADKVEKKAAKKDKKPSFFHGLKKEYKKIVFADKETVAKQTVAV LFMSIAIGVMIAVLDFVMKFGLSFIL >gi|157101610|gb|DS480714.1| GENE 7 3926 - 4075 239 49 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881814|ref|YP_001560782.1| ribosomal protein L33 [Clostridium phytofermentans ISDg] # 1 49 1 49 49 96 81 6e-20 MRTKITLACTECKQRNYNMTKDKKTHPDRMETKKYCRFCKKHTMHKETK >gi|157101610|gb|DS480714.1| GENE 8 4300 - 5025 440 241 aa, chain - ## HITS:1 COG:HI0434 KEGG:ns NR:ns ## COG: HI0434 COG1040 # Protein_GI_number: 16272382 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Haemophilus influenzae # 15 230 4 221 
229 104 31.0 1e-22 MKIIPQGIDFMTDILFPRRCPVCGGIVLPKGDLICPGCMTKLSWVRRPVCKKCGKEVLDE TIEYCYDCTRHKRSFDYGLSLINYDDTASRSMARIKYNNRREYLDFYSEAMVRKMGKRIR FMDGDALIPVPVHPSRRRERGFNQAEELARRLSGPLGIPVNTSILKRTRKTAPQKSLDSG GRLKNLEQAFTASVLPSGMKNIILVDDIYTTGSTIEACTRALRKAGAEHVYFVTVFIGHG Q >gi|157101610|gb|DS480714.1| GENE 9 5048 - 7339 2384 763 aa, chain - ## HITS:1 COG:CAC2854 KEGG:ns NR:ns ## COG: CAC2854 COG0507 # Protein_GI_number: 15896108 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Clostridium acetobutylicum # 1 755 1 732 739 570 43.0 1e-162 MATITGFIERIKFRNEENGYTVMSVTDQSDGEEVIMVGNLAYAAEGDMIEASGHMTQHPV YGEQLQIESYEMKTPQDELSMERYLGSGAIKGIGAALAARIVRHFKADTFRIMEEEPERL SEVKGVSEKMAMAIAEQVEDKKDMRQAMMFLQNYGITMNLAAKIYQEYGPTLYGIIRENP YRLADDIPGVGFKMADEIAERVGIFTDSDYRIKAGILYTLLQATANGHTYLPQTELFAQA SELLKVEPSAMEKHLMDMQIDRKVVVKPSAQGPLVYSSHYYYLELNAARMLHDLNIKGDI PQVEIKANLLKIQQQESIHLDELQEKAVYEAVNSGLLVITGGPGTGKTTTINTIIRYFEM EGMEILLAAPTGRAAKRMTEATGYEARTIHRMLELVPVGMPGNDAPVIPGAPRSGKGAQS MGMHFDRNEENPLDADVIIIDEMSMVDISLMHSLLRAVNVGTRLILVGDVDQLPSVGPGN VLRDIIESGSFNVVKLTHIFRQAAQSDIVVNAHKINAGEPVDLAKRSQDFLFIRRDNPDA VISAAITLIQKKLPDYVHANAFDIQVMTPMRKGALGVERLNQIMQNYLNPPDKSKKEKES GGTIYRVGDKVMQIKNNYQMEWEVRNKYGIPVDKGAGIFNGDVGIIREINDFAELLTVEF DEGKMIEYSFKQLEELELAYAITIHKSQGSEYPAVIIPMYNGPRMLMTRNLIYTAVTRAR SCVCLVGQPDAFYTMASNCVEQKRYSGLKSRIGEIYAQPDSIG >gi|157101610|gb|DS480714.1| GENE 10 7424 - 8467 1290 347 aa, chain - ## HITS:1 COG:CAC2858 KEGG:ns NR:ns ## COG: CAC2858 COG1077 # Protein_GI_number: 15896112 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Clostridium acetobutylicum # 1 338 1 338 340 407 62.0 1e-113 MFSMMSNDIGIDLGTASILVYIKGKGVVLKEPSVVAIDRDTNKIMAIGEEARLMIGRTPG NIVAVRPLRQGVISDYTITEKMLKYFISKAVGKKTLRKPRISVCIPSGATEVEKKAVEDA TYQAGAREVAIIEEPVAAAIGAGIDIGKACGNMIVDIGGGTADIAVISLGGPVVSTSIKI 
AGDDFDEALVRYMRKKHNLLIGERTAEEIKINIGAAYRRPEVLTMEVRGRNLVTGLPKTI VVTSDETLEALREPALQIVDAVHNVLERTPPELAADIFDRGIVMTGGGSLLSGLDALIEE KTGITTMIAEDPLTAVAIGTGKFIEFAHGMNMAMEASMGVGGGREDY Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:00:29 2011 Seq name: gi|157101609|gb|DS480715.1| Clostridium bolteae ATCC BAA-613 Scfld_02_56 genomic scaffold, whole genome shotgun sequence Length of sequence - 9087 bp Number of predicted genes - 6, with homology - 5 Number of transcription units - 5, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 100 - 1287 1425 ## COG0192 S-adenosylmethionine synthetase 2 2 Tu 1 . - CDS 1645 - 3489 1763 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases - Prom 3548 - 3607 6.0 - Term 3577 - 3624 2.1 3 3 Op 1 . - CDS 3676 - 4704 907 ## COG5279 Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain 4 3 Op 2 . - CDS 4812 - 6104 1445 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase - Prom 6199 - 6258 9.6 - Term 6142 - 6175 -0.8 5 4 Tu 1 . - CDS 6268 - 8748 2115 ## Closa_3992 hypothetical protein - Prom 8784 - 8843 6.4 6 5 Tu 1 . 
- CDS 8962 - 9087 117 ## Predicted protein(s) >gi|157101609|gb|DS480715.1| GENE 1 100 - 1287 1425 395 aa, chain - ## HITS:1 COG:BS_metK KEGG:ns NR:ns ## COG: BS_metK COG0192 # Protein_GI_number: 16080107 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Bacillus subtilis # 3 392 5 394 400 613 73.0 1e-175 MERRLFTSESVTEGHPDKMCDAISDAILDALMEQDPMSRVACETATTTGLVLVMGEITTN AYVDIQKLVRETVREIGYDRAKYGFDCDTCGVITAIDEQSQDIALGVNKALEAKENKMSD QELDAIGAGDQGMMFGFATNETEEFMPYPISLAHKLAKRLTEVRKNGTLKYLRPDGKSQV TVEYDENDKPVRLDAVVLSTQHDESVTQEQIHADVKKYIFDEILPQHMIDGNTKFFINPT GRFVIGGPQGDSGLTGRKIIVDTYGGYARHGGGAFSGKDCTKVDRSAAYAARYVAKNIVA AGLADKCEIQLSYAIGVAHPTSIMVDTYGTGKLSNEKLVDIIRSNFDLRPAGIIKMLDLR RPIYKQTAAYGHFGRNDLDLPWERLDKVELLKSYL >gi|157101609|gb|DS480715.1| GENE 2 1645 - 3489 1763 614 aa, chain - ## HITS:1 COG:TM1878 KEGG:ns NR:ns ## COG: TM1878 COG0737 # Protein_GI_number: 15644621 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Thermotoga maritima # 42 571 22 469 508 158 30.0 2e-38 MKNKSFKWLRSLFLAIVLTTVTVAGLPLTGGPVTALAADKDIVVLYTNDVHCGVDDNIGY AGLALYKKEMQQQTPYVTLVDTGDAIQGAPIGTLSDGGYLIDIMNYVGYDFAVPGNHEFD YGMSRFLALAGKLNCGYYSCNFIDSATGAPVFAPYKMFTYGATQVAFVGVTTPESFTKST PAYFQDSQGNYIYSFCEDESGQKLYDQVQASVDAARTAGADYVILAGHLGENGITQKWSS ASVIANTTGIDACIDGHSHETVPSENVKNKNGQNVVLTQTGTKLNHIGKLTISADGSIRT ELVSEVPAADLDREYTVQEHDSLSRIAKRELGSYNRWIDIYNSNLDKIKNADVIPVGLNI VIPGKSYINPDGKAADYGTYQFIQSIENQYNETLKTVLGTTPYELTVNDPATGNRIIRNA ETNLGDLTADAYRAELGADIGLSNGGGIRSVIKPGNITYNDTLAVFPYGNMGCVIEATGQ QIKDALEMASRNCPEESGGFLQVSGLTYTIDTSVKSGVQTDDKGNFTGVSGAYRVMDIKV GGEPIDLKKTYTVASHNYMLKQGGDGMTMFKGCNVIRDEVMVDVDILSSYIRRMGGSVTS EYANPGGQGRISIR >gi|157101609|gb|DS480715.1| GENE 3 3676 - 4704 907 342 aa, chain - ## HITS:1 COG:SPy0210 KEGG:ns NR:ns ## COG: SPy0210 COG5279 # Protein_GI_number: 15674407 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: 
Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain # Organism: Streptococcus pyogenes M1 GAS # 1 271 52 354 410 77 21.0 5e-14 MKQEFYYQQMNKAQQNAYRAMLDGFESLSPEFPVLRLDGKELSDIFFRLRLDHPAIFYVE GFHYRFAQNSEYVQMIPEYMFEKKKIKEMKIALESRISRLVRQAEGMAPEEKEKYVHDFI CSNVTYDKLKKQYSHEIIGPLQQGIGVCEGIAKTVKILCDRMGLECLIAISESAPDRGIR YRHAWNLVRLKNTWYHLDATFDNSLGRYGQKRFDYFNLDDKMIFKDHQPLVYKVPACTDG GRFYYKENRLSLTKLEDVAGRFKAVLRKKQPYYVFHWRGGYLTRDVLEQIAGIASEAARE KGKYVRMSVNYSQAVMELAVTDYQSGQEICREEANEGELACE >gi|157101609|gb|DS480715.1| GENE 4 4812 - 6104 1445 430 aa, chain - ## HITS:1 COG:CAC3539 KEGG:ns NR:ns ## COG: CAC3539 COG0766 # Protein_GI_number: 15896775 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Clostridium acetobutylicum # 1 415 1 415 418 442 58.0 1e-124 MEQYIIKGGNPLVGDVTISGAKNAALGILAASIMTDDDVVIDNLPDVRDINVLLEAIQEI GARVDRIDRHTVKINGSNISEVSVDDEYIRRIRASYYFIGALLGKYKSAQVPLPGGCNIG SRPIDQHIKGFRALGAEVTIERGAVIAHAIDLVASHIYLDVVSVGATINIMMAAALAEGQ TILENAAKEPHIVDVANFLNSMGANIKGAGTDTIRIRGVNRLHGTEYSIIPDQIEAGTFM CAAAATRGDIMVKNVIPKHLEAISAKLTEIGCEVIEFDDAVRVVGKPAQRHTDIKTLPYP GFPTDMQPQMSVALVLANGTSMVTESIFENRFKYVDELARMGSNIKVEGNVAVIDGVKGL TGAQVNAPDLRAGAALVIAGLAADGYTVVDEIGYIQRGYECFEEKLQGLGAMIEKVDSDR EVQKFKLRVG >gi|157101609|gb|DS480715.1| GENE 5 6268 - 8748 2115 826 aa, chain - ## HITS:1 COG:no KEGG:Closa_3992 NR:ns ## KEGG: Closa_3992 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 826 1 825 825 811 48.0 0 MKKILQKAGILFLIFIGALVVYFISARNTMEKESAVYTSMEEPSLPVVYTQLGGQEINCL HGYMQDMGNQAARESISVLPEDRGLNIRIEEYGNTITGISYEVRNLTLDRLVERTEVEDW VSGDGSVSAVLPIQNLLARNETYLLSITVSTGEKELHYYTRIMWPDNAYASDMVRLAQEF TRKSLDYNQARDLVSYLETNDTEDNSSLGHVTIRASFSHLTWDGLDVEMVGEPLMTLQEF DGIMGQIQIRYQVAINEEDGTRSMVDAEDNFTMKWNEQRIYLMNYERNANEVFDGGHQSF SGKKILLGITNDNKVRTMKSPKSKYVAFKTGGDLWCYDYDDKQAVCVFSFRSNSDDGVRS NYDRHDIKILSMQDDGSMDFLVYGYMNRGKYEGRMGVVYYHYDKEQDTVQEKFFLPASES 
YDMVKADIDKLSYLSENDMMYIMLQGTVYGIDLKSNESLVVAQGLTEGSYAVSGDASRFA WQEGQNLYESEKVHVMDFNTSQKQEIVGEVNDYVRVLGFVGNDLIYGLSSSKDKWIVNGR MKGMPMYAMYIVDTQMQVESEYRKDGIYITDVVAQDGRIHLKRLVPLGENQYLYQNEDTI VCNQRVEKDPLEGIGWFASQDKGKVYFVQADSEIHGNEVRTSAPKAFSYEYTSVLDTGTS ASASSDNSMIFRAYGGGHYLGSSRTFSQAVEMAYGQMGYVTDSSQHIVWDRINRQPIRNI KSPVDEARKVTKYLDSFDGSRVYEDGLILIDAGGCSLSQILYYIDKGIPVIAYVESGQYV LLSGYDQYNVTLYDPQTQETQKMGLNDATEYFKNLQNDFLCALAVE >gi|157101609|gb|DS480715.1| GENE 6 8962 - 9087 117 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no AEANAPAPAETAEATATQANTSPSETANAAQTNAAQTGSAT Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:00:46 2011 Seq name: gi|157101608|gb|DS480716.1| Clostridium bolteae ATCC BAA-613 Scfld_02_57 genomic scaffold, whole genome shotgun sequence Length of sequence - 8108 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 5, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 118 - 164 13.2 1 1 Op 1 . - CDS 220 - 501 371 ## gi|160941956|ref|ZP_02089279.1| hypothetical protein CLOBOL_06848 2 1 Op 2 . - CDS 527 - 2947 2627 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase - Prom 3073 - 3132 7.5 - Term 3087 - 3152 14.3 3 2 Op 1 . - CDS 3250 - 4530 622 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 4 2 Op 2 . - CDS 4533 - 5591 925 ## COG0457 FOG: TPR repeat 5 2 Op 3 . - CDS 5602 - 6117 437 ## COG0756 dUTPase - Prom 6330 - 6389 4.2 + Prom 6243 - 6302 8.6 6 3 Tu 1 . + CDS 6361 - 6624 307 ## gi|160941964|ref|ZP_02089287.1| hypothetical protein CLOBOL_06856 + Prom 6663 - 6722 3.9 7 4 Tu 1 . + CDS 6745 - 7089 400 ## DSY2751 hypothetical protein - Term 7278 - 7307 -0.4 8 5 Tu 1 . 
- CDS 7336 - 7962 703 ## Closa_2167 hypothetical protein - Prom 7982 - 8041 6.3 Predicted protein(s) >gi|157101608|gb|DS480716.1| GENE 1 220 - 501 371 93 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160941956|ref|ZP_02089279.1| ## NR: gi|160941956|ref|ZP_02089279.1| hypothetical protein CLOBOL_06848 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06848 [Clostridium bolteae ATCC BAA-613] # 1 93 6 98 98 177 100.0 2e-43 MFMGALCLAEGICLFTGKDFLMFMGSTKKEDYNLDKVFQVEKWIFLVDAACSLGVGMNRF PDVAENILLVVFGVTLAAHVYVFKSKKFRKENE >gi|157101608|gb|DS480716.1| GENE 2 527 - 2947 2627 806 aa, chain - ## HITS:1 COG:PA1920 KEGG:ns NR:ns ## COG: PA1920 COG1328 # Protein_GI_number: 15597116 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Pseudomonas aeruginosa # 4 655 13 659 675 716 51.0 0 MIQVVKRDGEVAEFSLNKITEAIKKAFRATSKDYNNEILELLSLRVTADFQNKMKDGRIS VEEIQDSVEHVLEQTGYTDVAKAYILYRKQREKIRNMNSTILDYKDVVNSYVKVEDWRVK ENSTVTYSVGGLILSNSGAVTANYWLSEIYDQEIAEAHRNADIHIHDLSMLTGYCAGWSL KQLLMEGLGGITGKITSSPAKHLSVLCNQMVNFLGIMQNEWAGAQAFSSFDTYLAPFVKA DNLSYPEVKKCIESFIYGVNTPSRWGTQAPFSNITLDWTVPDDMAEMNAIVGGRETDFKY KDCKKEMDLINKAFIETMIEGDANGRGFQYPIPTYSITNEFDWSDTENNRLLFEMTSKYG TPYFSNYINSDMQPSDVRSMCCRLRLDLRELRKKTGGFFGSGESTGSVGVVTINMPRIAY QAKDEADFYARLDHMMDVSARSLKTKRQVITKLLNQGLYPYTKRYLGTFENHFSTIGLIG MNEAGLNARWVRKDMTHRECQEFTKNVLNHMRERLSDYQELYGDLYNLEATPAESTTYRL AKHDKQRYPDIITAGEEGDTPYYTNSSHLPVDYTEDVFDALDIQDELQTLYTSGTVFHAF LGEKMPDWKAAANLVRTVAENYKLPYYTLSPTYSICKCHGYLIGEHFTCPICGEKAEVYS RITGYYRPVQNWNEGKTQEYKNRTNYNIGQSQLKHGVSRITADTRERTDIARTESSHQGK EKGGALTHLYLFTTKTCPNCVSAKEFLKGQDYQIIDAEERPDLAEKFGIMQAPTLVIVSD GIVQKFANASNIRRFVEQNHTSTVKA >gi|157101608|gb|DS480716.1| GENE 3 3250 - 4530 622 426 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. 
AzwK-3b] # 20 408 20 407 425 244 38 2e-64 MAELIDLSELQEKVILVAVSTAEEEDTPASLDELEELASTAGAVTVARIIQNRERVHPGT YLGKGKIDEVRELLWDTRATGVICDDELSPAQLRNLEDALDTKVMDRTMVILDIFASRAS TREGKIQVELAQLKYRAVRLVGMRNSLSRLGGGIGTRGPGEKKLETDRRLIHQRIGQLKE ELADVKRHREVTRQQREKNFALSAAIVGYTNAGKSTLLNRLTGAGILAEDKLFATLDPTT RSYTLEDGQQILLTDTVGFIRKLPHHLIEAFKSTLEEARYSDIVLHVVDCSNPQMDMQMH VVKETLEELEIVDKTIVTVFNKVDRFRELEAGENSGVMQIPRDFSSDYQVRISARTGEGL EELQKVLQAIIRSRRILLEKVFPYSQAGRIQTIRKYGQLLEEEYQEDGIAVKAYVPAELF GKLYSN >gi|157101608|gb|DS480716.1| GENE 4 4533 - 5591 925 352 aa, chain - ## HITS:1 COG:MA1613 KEGG:ns NR:ns ## COG: MA1613 COG0457 # Protein_GI_number: 20090471 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Methanosarcina acetivorans str.C2A # 83 247 1601 1754 1885 61 27.0 3e-09 MGSRGQRSYSSGRRPQSKGQHPGYGGKRPVSNAARRRRRRNRIIRAVIAWAVCIFLVGLI AAGTFRLVAHMTTSKKRQFRAEGIEKLEAGDYAGAIGSFDTALEKSGKGAEDFNRDVLLY RADAEFLLKDYNAAIHTYDLLLEMKPDTPEYMYRQSSCYARLGDTDNALERYQEAKALDK KDKPVPGRQEALLAAGSACVDAKEYDKAMALYEDALKDGMEHGEIYNQMGLCQMAAEDYQ SAYDSFDKGYQVAAAAQAVALQEKDRKTGKETDKKETKDGDAGEGAGGENAPAVAAEADG FRELLKELSYNRAAACEHLQQYDKALALFEDFVKEFGSNEDAEHEIAFLKTR >gi|157101608|gb|DS480716.1| GENE 5 5602 - 6117 437 171 aa, chain - ## HITS:1 COG:L181168 KEGG:ns NR:ns ## COG: L181168 COG0756 # Protein_GI_number: 15672158 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Lactococcus lactis # 30 170 7 150 150 100 41.0 2e-21 MQRIAKFHKVSFEQFKADWADAFPDTSEEEIIHIYEGIRLPARATSGSAGYDFFTPVPLT LAPGESMKVPTGVRAEIDTGWVLQIFPRSSLGFKFRLQLNNTVGIIDSDYFYSDNEGHMF IKVTNDSREGKTVELGQGAGFAQGIFIPYGITADDECSDVRNGGFGSTDAR >gi|157101608|gb|DS480716.1| GENE 6 6361 - 6624 307 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941964|ref|ZP_02089287.1| ## NR: gi|160941964|ref|ZP_02089287.1| hypothetical protein CLOBOL_06856 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06856 [Clostridium bolteae ATCC BAA-613] # 1 87 1 87 87 160 100.0 4e-38 
MKISYNSQPQRNLIGNQIFKLRKSRKLSQKEMAAQLQVQGYYFSELTILRIEKGKRLVTD IELKILCDYFRITPNDLLGYGDGGRLV >gi|157101608|gb|DS480716.1| GENE 7 6745 - 7089 400 114 aa, chain + ## HITS:1 COG:no KEGG:DSY2751 NR:ns ## KEGG: DSY2751 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 12 92 11 91 110 73 41.0 3e-12 MVVKNMMSYELERIEELCRERGWSHYRLALEMDTSPNNIGNLFRRTTVPSIPTIRRICEV MGITMAQFYSTDGIQVTLNEQQRRIMDLYDKLDPLDKSKAEAYMEGLSAKAKDI >gi|157101608|gb|DS480716.1| GENE 8 7336 - 7962 703 208 aa, chain - ## HITS:1 COG:no KEGG:Closa_2167 NR:ns ## KEGG: Closa_2167 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 5 206 5 206 206 191 53.0 2e-47 MLGSLILVFALCTDTFVASLAYGANRVHVGWGKVALLNGICSGCLGLALGLGNVISSVLP GDVTRIICFVSLFLLGFIKFLDYSIKAYINRHCCIRRNLSFSLSGLKVIVNIYGDPMAAD VDGSQSLGWKETVFLALAMSIDSLVAGTMAALLDIPVAATLALSFAVGVCMMYGGLWLGR KVASRFRCDLSWISGVLLMVLALMKVLH Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:01:06 2011 Seq name: gi|157101607|gb|DS480717.1| Clostridium bolteae ATCC BAA-613 Scfld_02_58 genomic scaffold, whole genome shotgun sequence Length of sequence - 7454 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 3, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 127 136 ## COG1592 Rubrerythrin + Term 134 - 158 -0.3 + Prom 261 - 320 4.5 2 2 Op 1 . + CDS 359 - 1876 1256 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins 3 2 Op 2 . + CDS 1869 - 3101 845 ## COG0019 Diaminopimelate decarboxylase 4 2 Op 3 . + CDS 3135 - 3362 408 ## EUBELI_00116 hypothetical protein 5 2 Op 4 . + CDS 3374 - 5089 1452 ## COG1696 Predicted membrane protein involved in D-alanine export 6 2 Op 5 . + CDS 5076 - 6122 794 ## EUBELI_00118 hypothetical protein + Term 6176 - 6205 -0.1 + Prom 6522 - 6581 8.4 7 3 Tu 1 . 
+ CDS 6738 - 7283 657 ## COG1592 Rubrerythrin Predicted protein(s) >gi|157101607|gb|DS480717.1| GENE 1 2 - 127 136 41 aa, chain + ## HITS:1 COG:CAC3597 KEGG:ns NR:ns ## COG: CAC3597 COG1592 # Protein_GI_number: 15896831 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 40 141 180 181 60 77.0 8e-10 DLAKRAKALNLDAIHDTVHEMAKDEARHGKAFEGLLKRYFG >gi|157101607|gb|DS480717.1| GENE 2 359 - 1876 1256 505 aa, chain + ## HITS:1 COG:Cj1307 KEGG:ns NR:ns ## COG: Cj1307 COG1020 # Protein_GI_number: 15792630 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Campylobacter jejuni # 1 503 1 501 502 358 37.0 1e-98 MIQNILSFLEESAEAYPRKTAFSDEHTSLTYRGLSDLSMRIGSVLAQKTGPRRPVPVLTE KNVFTLGAFMGIVRAGCFYVCLDANQPAERLNRILDTLKADLMIADKESLDLAVGISFSG EILFLEDLKSLPDDALNSSLLASIRRQAADTDPLYGIFTSGSTGVPKGVVVSHRSVIDFI QHFTDLFHITQDDVIGNQAPFDFDVSVKDIYSALKTGATMEIIPKKLFSIPTGLLDYLCD RKITTAIWAVSALCIITTLKGFTYRIPSHLNKILFSGEVMPVKHLNIWRSYLPDALFVNL YGPTEITCNCTYYVVDREFEPGTPLPIGVPFPNETVFLLDESNRLVTEPGQNGELCVAGT ALALGYYRDPKRTATAFVQNPLNPCYPEIIYRTGDLACYGRDGLLYFNGRKDFQIKHMGH RIELGEIEAALETVEGVDRAVCLFLEEKNKIAAFYEGAAVKRDIAVGLEHLLPRYMFPAL YRNLTELPRTKNGKVDRQALRRLYG >gi|157101607|gb|DS480717.1| GENE 3 1869 - 3101 845 410 aa, chain + ## HITS:1 COG:AGc3079 KEGG:ns NR:ns ## COG: AGc3079 COG0019 # Protein_GI_number: 15888979 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 15 368 57 421 440 131 27.0 2e-30 MDNHILQEAARTYNTPCYIFDLDVFISRIRRMEQILGKQVNIYYAMKANPFLTAAAAQAA GGLEVCSPGEYAICRRARVPAQKIVLSGVNKEESHIRSVMAEQGAGTYTAESLNQLRLLE ACAGAAGRTGVRVLLRLTSGNQFGMDEEDILTSIRNRDSYPHLDLAGLQYYSGTQKKGME KIEKELVKLDSFMDFVREHLDFEFRELEYGPGFQIPYFEGQEQADEEILLQEFKKALDSL NFKGIIFLEAGRFLAAPCGSYLTRVADVKRSQGQNYCIVDGGINHVNYYGQTMAMKVPAY 
RYIPQDNPARPPASQTSGDRWTVCGSLCTSGDILVKNLPLDGLMTGDLLVFDRIGAYSVT EGIYLFLSRKLPVVLTCTQEHGLSLVRDALPTDILNDGSAMEIYQSQSNS >gi|157101607|gb|DS480717.1| GENE 4 3135 - 3362 408 75 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00116 NR:ns ## KEGG: EUBELI_00116 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 74 18 91 91 88 60.0 1e-16 MDELLELLEDIKPTVDFRTCTGLIDDGYLDSFDILSIVSELNDAFGIEISPVDIIPENFN SAQALWNMVERLKDN >gi|157101607|gb|DS480717.1| GENE 5 3374 - 5089 1452 571 aa, chain + ## HITS:1 COG:FN1672 KEGG:ns NR:ns ## COG: FN1672 COG1696 # Protein_GI_number: 19704993 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Fusobacterium nucleatum # 143 571 68 481 486 219 33.0 1e-56 MSFLSMKFLLFLAAAVAGYYVIPRQLQWVWLLIFSYIFYLASGPAAVVFILTTTATTFLG GLWLEHTDRALACALDRTPARDPDRTPDCGPDHTLDCGPDHTLDCGPDHMLDCGPDRLRR PANPQQPLSPDEKKALKGRFKQRKKWIVALVLFVNFGILAALKYRNFAADNMNLLFGTHF SSAKLLLPLGISFYTFQSMGYLIDVYRGKYAPDRNPFRFALFVSFFPQILQGPIGRYDRL ASQLYGQKSFSLTRIERGLQLMLWGYFKKIVIADRAAVVVSEVFGNYQSYGGILVMAGVL CYSLQLYGDFSGGMDVIMGASECFGISLDANFKRPYFAQSISDFWHRWHITLGTWMKDYV FYPFSLSKGMNKFGKFCKKHFGKHVSRVLPVCIANLLVFFLVGVWHGPAWKFIVYGLYNG IIIAASNLFAPFYGEMARKLHIPVESRPWMAVRILRTFLLVNISWYFDMAVSLGAALTMM KNTVTGFSLAALSDGSLLRLGLDLKDYAALALSCTVLLAVSLLQENHVNIRDTLSAKPLA ARWCVYLMLLFSIPLLGQITMTGGGFIYAQF >gi|157101607|gb|DS480717.1| GENE 6 5076 - 6122 794 348 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00118 NR:ns ## KEGG: EUBELI_00118 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 13 345 7 344 358 168 28.0 3e-40 MPNSNQGAGRFHKWLPQWLKAILFLSLCICITAALNYGFIPPSSVRINLHNLRNGETYDT IFIGTSHGQYGIDPFSVDAASGGSSVSLCMADAYPDDMYHMLRLACETQSPSRVVYELDP SYWMNEQRSGSTQIFFYQEFPASRSKFFYFLDKIINLDFRSALAPWSYYRNLIGQVPDHI RRKQSTAYKNYDPSVLEIPGGHYAGRGFIYRDRVAGEDKGTFNNIPWDESQVKSKAELYF NRIAAFCRKNGIELTVITTPVPQETLDKFAGSYDQSHLYFKKLLEGLGLEYYNFNYMKPE LMDRSMEGYWDYDGHMDGVKAQEFSTLLGSFLNSLDQGSLTHSDYFIN 
>gi|157101607|gb|DS480717.1| GENE 7 6738 - 7283 657 181 aa, chain + ## HITS:1 COG:CAC3598 KEGG:ns NR:ns ## COG: CAC3598 COG1592 # Protein_GI_number: 15896832 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 180 1 180 181 224 66.0 5e-59 MKKWVCTVCGYVYEGENAPEKCPQCGVPASKFKEQASEGMTWACEHEVGVAQGSPEDIMM DLRANFEGECSEVGMYLAMSRVAHREGYPEIGMYWAEAAFEEAEHAAKFAELLGEVVTSS TKKNLEMRVAAENGATAGKFDLAKRAKALNLDAIHDTVHEMAKDEARHGKAFEGLLNRYF G Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:01:16 2011 Seq name: gi|157101606|gb|DS480718.1| Clostridium bolteae ATCC BAA-613 Scfld_02_59 genomic scaffold, whole genome shotgun sequence Length of sequence - 3263 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 51 - 125 86.4 # Pro TGG 0 0 + TRNA 153 - 224 75.4 # Gly GCC 0 0 1 1 Op 1 . + CDS 430 - 1266 558 ## Closa_1057 hypothetical protein 2 1 Op 2 . + CDS 1263 - 1913 534 ## COG1309 Transcriptional regulator + Term 1923 - 1960 1.0 3 2 Op 1 . + CDS 1995 - 2336 204 ## COG5470 Uncharacterized conserved protein 4 2 Op 2 . 
+ CDS 2333 - 3193 364 ## COG0491 Zn-dependent hydrolases, including glyoxylases Predicted protein(s) >gi|157101606|gb|DS480718.1| GENE 1 430 - 1266 558 278 aa, chain + ## HITS:1 COG:no KEGG:Closa_1057 NR:ns ## KEGG: Closa_1057 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 38 139 18 122 336 83 38.0 1e-14 MKIKDKRAVYKRAADKSTAGRKKCWMMVIAVLLATVALSMSAFAGQWRQDDTGWWYQNED GTYVKSEWSWIDGRCYYFNEDGYCLQNTQTPDGYSVDMQGAWIEDGVVQMQQEVQAQMQE QAEVPQQELTGVQEQGQDSGAIKQIGNVLLTVPEEFVFHSSDERGIYMTSVDGSSLICIM SESLADFGGYADLVYTYQESVLDAAVEAELGTPQTKTTSQLSNGLWYSYHYMDASSLGIP GFLKVYARINGDRLQMVMFGGSISSLDTEGIMNSLTFL >gi|157101606|gb|DS480718.1| GENE 2 1263 - 1913 534 216 aa, chain + ## HITS:1 COG:BS_yvkB KEGG:ns NR:ns ## COG: BS_yvkB COG1309 # Protein_GI_number: 16080573 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus subtilis # 26 215 1 187 189 91 29.0 8e-19 MTGSLDREMGLLYDASVHLRFGEGDMDQTGQNIIDAAMELVVERGYTATTTKDIAKRAGV NECTIFRKFKGKKEIVLQAMGQKRWHPDLEPDDFMIETGNLTEDLCQFARIYMKKVTPEF VKLSLGLRTPELAEDTREGILAIPQVFKTGVTAYFRKMYEKGKLISDDYESMAMMFLSLN FGFVFFKASFGSGLTEMKADEYIIKMVDAFVHGVAK >gi|157101606|gb|DS480718.1| GENE 3 1995 - 2336 204 113 aa, chain + ## HITS:1 COG:MA2290 KEGG:ns NR:ns ## COG: MA2290 COG5470 # Protein_GI_number: 20091128 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 16 97 2 83 88 65 36.0 2e-11 MSCYFIISVFLEPGKDRADYDDYIKRVRPIVEEHGGRYVVRSEGIEYIGKDWHPDRFIMI WFPDRESIDRCFSSEKYRKIMSKRENTVDSQAIIVTAEEMYGDNSGKRERRME >gi|157101606|gb|DS480718.1| GENE 4 2333 - 3193 364 286 aa, chain + ## HITS:1 COG:SSO1157 KEGG:ns NR:ns ## COG: SSO1157 COG0491 # Protein_GI_number: 15898013 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Sulfolobus solfataricus # 23 206 10 198 213 70 25.0 4e-12 MKINDRVHQLRVQFNITPEITRYVYVYIITGKKGCYLVDAGVKGCDKKIEEYIHKIGKNR 
ENINAVLLTHSHPDHIGALKTIKEMYSCCVYASRGERDWIEDIDLQYEKRPIPGFYSLVD GSVTVDRILSDGDLLCLEPGITIQVIGAPGHSRESLCFYYREQKILFTGDSIPEPSDVLI YDDSHASEETLKRLERLEEVDVYCPAWDKVYDKRTGIAMIHEGLSRISKIRRCTEGYRGG MGKESTQRLFELICEACNMAEYKKNPLFMRSVMSDLSRLNQGNHRD Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:01:21 2011 Seq name: gi|157101605|gb|DS480719.1| Clostridium bolteae ATCC BAA-613 Scfld_02_60 genomic scaffold, whole genome shotgun sequence Length of sequence - 3457 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 17 - 76 3.7 1 1 Tu 1 . + CDS 298 - 1125 449 ## COG0491 Zn-dependent hydrolases, including glyoxylases 2 2 Op 1 . + CDS 1354 - 2016 419 ## BDI_1568 hypothetical protein 3 2 Op 2 . + CDS 2052 - 2219 113 ## + Prom 2254 - 2313 6.2 4 3 Tu 1 . + CDS 2548 - 2703 111 ## gi|160941987|ref|ZP_02089307.1| hypothetical protein CLOBOL_06878 + Term 2798 - 2853 10.1 Predicted protein(s) >gi|157101605|gb|DS480719.1| GENE 1 298 - 1125 449 275 aa, chain + ## HITS:1 COG:SSO1537 KEGG:ns NR:ns ## COG: SSO1537 COG0491 # Protein_GI_number: 15898363 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Sulfolobus solfataricus # 1 272 1 266 269 97 27.0 2e-20 MEKTRFTLLDLGDFDAYKAEHLTLTEQEKKKDGVRRIVWPIPCYCVLVQHPVLGNVLYDT GIDDGYEARWPKNLLEEYPVKRFHRLEDRLGELGLCPYDIDILILSHMHFDHAGNLRLFC GTKAGKHVVIQEEEARHGFTLSNIYDCQEIRYRHDGYVRHEFNGLDGIAFDLVRGDVKLA DDLELLHLPGHTPGTMGMVVRTETFGTAVFPSDAVYNAINYGPPAVLPGMCARPEEFGGS IETCRKLAEREKGTVFFSHDMAGYQTYKKSPEWYV >gi|157101605|gb|DS480719.1| GENE 2 1354 - 2016 419 220 aa, chain + ## HITS:1 COG:no KEGG:BDI_1568 NR:ns ## KEGG: BDI_1568 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 218 69 284 424 254 53.0 1e-66 MVDPVLSGFDMPLLIEMPITEDEVPAVDGILLTHCDNDHYGKQTCSRLAGKTGAFHTTEY 
VNTLLAKELHINGHGHKIGDYFTIGSIKVTLTPADHAWQNESPKHHTRDFHARDFCGFWL DTPDGSIWAVGDSRLLETQLKMPKPNAVLFDFSDSRWHIGLDGAEKFAKAYPDTPLILWH WGSVDAPEWNEFNGNPDVLCSRIVNPGRVVVLAPGEKYRI >gi|157101605|gb|DS480719.1| GENE 3 2052 - 2219 113 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQRGAANLARCLENRNTAVVKISTARMKPSKKQMANYTEDGHVNLCVDWLKCLIE >gi|157101605|gb|DS480719.1| GENE 4 2548 - 2703 111 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160941987|ref|ZP_02089307.1| ## NR: gi|160941987|ref|ZP_02089307.1| hypothetical protein CLOBOL_06878 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06878 [Clostridium bolteae ATCC BAA-613] # 1 51 1 51 51 82 100.0 8e-15 MFSDSDMMKQVMESVGTVLYERLKKKYTYPTQIPELSMVAEETVKYSGNRN Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:01:36 2011 Seq name: gi|157101604|gb|DS480720.1| Clostridium bolteae ATCC BAA-613 Scfld_02_61 genomic scaffold, whole genome shotgun sequence Length of sequence - 2858 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 1 - 216 101 ## Pedsa_0472 L-carnitine dehydratase/bile acid-inducible protein F 2 1 Op 2 . + CDS 194 - 646 464 ## COG2030 Acyl dehydratase - Term 657 - 706 8.3 3 2 Op 1 . - CDS 724 - 1332 587 ## Tresu_0483 formate/nitrite transporter 4 2 Op 2 . - CDS 1336 - 2178 980 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase - Term 2208 - 2249 7.8 5 2 Op 3 . 
- CDS 2258 - 2764 409 ## Closa_1139 hypothetical protein Predicted protein(s) >gi|157101604|gb|DS480720.1| GENE 1 1 - 216 101 71 aa, chain + ## HITS:1 COG:no KEGG:Pedsa_0472 NR:ns ## KEGG: Pedsa_0472 # Name: not_defined # Def: L-carnitine dehydratase/bile acid-inducible protein F # Organism: P.saltans # Pathway: not_defined # 1 58 318 375 382 64 48.0 2e-09 WDKLFKTDGFKELDMLQTVRTPGGTEFATTRCPITIDREHYKSAKAAPAIGQHNNQYIYY KENQHGTDYLF >gi|157101604|gb|DS480720.1| GENE 2 194 - 646 464 150 aa, chain + ## HITS:1 COG:SMb21110 KEGG:ns NR:ns ## COG: SMb21110 COG2030 # Protein_GI_number: 16264437 # Func_class: I Lipid transport and metabolism # Function: Acyl dehydratase # Organism: Sinorhizobium meliloti # 2 146 3 147 153 188 61.0 4e-48 MEQTIYFEEYEIGSVRESIGRTITETDIVLHAGQTGDFFPHHMDAEFAKTTEFGKRIAHG TLTFSIAVGMTANLINPAAFSYGYDHMRFIHPVFIGDTIHVKVTLKEKRENPKHPDRGFI TEALEIFNQNGQLVMVCEHIMSCLKKTADA >gi|157101604|gb|DS480720.1| GENE 3 724 - 1332 587 202 aa, chain - ## HITS:1 COG:no KEGG:Tresu_0483 NR:ns ## KEGG: Tresu_0483 # Name: not_defined # Def: formate/nitrite transporter # Organism: T.succinifaciens # Pathway: not_defined # 9 197 14 207 214 159 45.0 6e-38 MGEHVKTFLDAFLAGIVIGIGGVVYLGVENRIAGSLLFTVGLYAIVLNQFALYTGKIGYA LGRPLAYMAEMGVVWFGNLCGTFTTGTLVSFTRLKTARAAAGICAVKLEDSLSGILILAF FCGMLMYIAVDGYKSKGNPVILFLGVSVFILAGFEHCIANMFYFTVAGVWSLKALVYILV MTLGNSAGGMFIPFIRKVGQRT >gi|157101604|gb|DS480720.1| GENE 4 1336 - 2178 980 280 aa, chain - ## HITS:1 COG:BS_folD KEGG:ns NR:ns ## COG: BS_folD COG0190 # Protein_GI_number: 16079487 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Bacillus subtilis # 4 279 6 282 283 222 42.0 6e-58 MKEIKGSDVARQMKEQMTVQIEELKGYIPHLAIIRVGERPDDISYEKGATKRAQRLGLRC ISFLFPAEISHDQFVEEFQKINRNPDVDGILMLRPLPKQIDEKEIESLIDVKKDIDCISP INLAKVFAGDSDAYAPCTAEAVMEMLAYEGIELKGKRVTVVGRSLVVGKPLAMLMLGQHA TVTICHTRTADFTGTCRNAEILVAAAGKARMIKAEHVAEGAVVIDVGINVDEDGSLCGDV 
DFDQVKTKAGVLTPVPGGVGSVTTLVLLKHLIRSAKEKIG >gi|157101604|gb|DS480720.1| GENE 5 2258 - 2764 409 168 aa, chain - ## HITS:1 COG:no KEGG:Closa_1139 NR:ns ## KEGG: Closa_1139 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 146 17 159 178 140 53.0 2e-32 MTKTKHMVWMGILIAVSIVLSRFLSFSAWNVKIGFAFIPIVIGAVLFGPVQGGIAAAAAD FLGAILFPIGMYFPGFTVTAFLTGLTYGILLHKNRSMFRIACAVLIVQLVYGLLLNTCWI SLLYGAPYLALLSTRIVQYVVLIPVQFVIIARMYVLGSKKYHILQENS Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:01:47 2011 Seq name: gi|157101603|gb|DS480721.1| Clostridium bolteae ATCC BAA-613 Scfld_02_62 genomic scaffold, whole genome shotgun sequence Length of sequence - 2773 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1462 1332 ## Closa_0534 cell wall binding repeat-containing protein + Term 1495 - 1557 17.0 + Prom 1516 - 1575 5.3 2 2 Tu 1 . 
+ CDS 1607 - 2677 380 ## SPCG_1256 choline binding protein PcpA + Term 2701 - 2757 12.0 Predicted protein(s) >gi|157101603|gb|DS480721.1| GENE 1 2 - 1462 1332 486 aa, chain + ## HITS:1 COG:no KEGG:Closa_0534 NR:ns ## KEGG: Closa_0534 # Name: not_defined # Def: cell wall binding repeat-containing protein # Organism: C.saccharolyticum # Pathway: not_defined # 5 485 109 569 570 280 42.0 1e-73 GEDNEPDHYWYYFQANGKALKNGDNDKVALKTVNGKKYAFDDEGKMLYGWVAADNAERLD DTDGDGFKEGDYYFGGEDDGAMTVGWLQMDITYDEATNDNEIAPVFNEDEDQSRWFYFKS NGKKIKAEDGDVQKGKTINGKKYEFDQYGAMTAEWSLDVESASKAGVRNQNSNVTTNSVP AKYAQQWRYFQDVENGARVSKGWFKVVAAEYLNYDKNNDDEDAWYYADGSGNLYAGEFKT IKGKKYAFRNDGRMVDGLKFILDSNGSLDVKADDDDDYPFDTEDDFIENAPYYENANYKC YYFGDGDDGSMKTNKINIDIDGDKFNFYFEKSGSKKGAGKTGEKDDKYYQSGMLLKAGKD EKYQVVKTLDPTIDKNDAGEGLKGYKKLDDVKTFLEELGMDTTSAYTTDPSKVSLSKLNI TKSADNIDELYVIPEKDDNNQAVKGKYFIVNTSGKVIDSKSRNKDGNDYYYATGSKGEIL AIYTEK >gi|157101603|gb|DS480721.1| GENE 2 1607 - 2677 380 356 aa, chain + ## HITS:1 COG:no KEGG:SPCG_1256 NR:ns ## KEGG: SPCG_1256 # Name: not_defined # Def: choline binding protein PcpA # Organism: S.pneumoniae_CGSP14 # Pathway: not_defined # 170 300 261 392 393 68 36.0 6e-10 MIRKTWMIFCLSATLLSVNPIVSLADTTTDASVTTMESCGAYLFEWRPVDCGNGHYFAIL VGGNSITERDVILNRGYDYAYTFNPSMYPRQWGEVPTLVNVDGVWAIPENQPLLPEGTQS TLQIVLYTNNGKLPNKERYIDVVRLPSNVDTSTLPADVRKYLINVDGSDAGAYEGTKSSG WVIEADGRYRYRKPDGTFVSGGWLNVDDNLYYMDEEGYMLSDTIAPDGSYVNASGAKQKY MPGWFQNERGWKYVLKNGYFAASTWVQDTDGKYYYFDIGGYMRTDYDTPDGYHVGPDGVW DGAAATGEYGQNPGPGGVVSSNTESGVAEENTQEEPQTEAVEETNNSSEGNAIDNN Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:02:01 2011 Seq name: gi|157101602|gb|DS480722.1| Clostridium bolteae ATCC BAA-613 Scfld_02_63 genomic scaffold, whole genome shotgun sequence Length of sequence - 2652 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 260 - 311 8.5 1 1 Tu 1 . 
- CDS 368 - 1708 927 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases + Prom 1805 - 1864 11.6 2 2 Tu 1 . + CDS 2091 - 2606 504 ## COG5263 FOG: Glucan-binding domain (YG repeat) Predicted protein(s) >gi|157101602|gb|DS480722.1| GENE 1 368 - 1708 927 446 aa, chain - ## HITS:1 COG:CAC0695 KEGG:ns NR:ns ## COG: CAC0695 COG0246 # Protein_GI_number: 15893983 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Clostridium acetobutylicum # 7 444 16 482 482 420 45.0 1e-117 MSTGVKETVIQFGEGGFLRGFVDYFFHKLQEKGLYDGKIVVVQPIEKGMCQMLSDQNCEY NLFLRGIDNGQVVNEHTHVTSISRALNPYTQYEEYIALAENPDLRVLVSNTTEAGIEYLG TENLTDMPPKSYPAKLTQFLYKRYKAGLRGLILLPCELIDDNGDNLKKCVLQYAKLWELE DGFTTWLEDENDFCSTLVDRIVTGYPRDEVEELTKQIGYEDKLIDTAEIFHLWVIGGHHE DELPFQKAGYNIVWTDDVHPYKKRKVRILNGGHTSMVLGAYLYGLETVGECMKDEKVSAF LKKCIFEEIIPTIGDTEDNRKFGAAVLERFSNPFIKHQLLSIALNSVSKFQVRVLPTILE YKEKFGNYPKALTFSMAALISFYRTDKANDGDEIMQFMKTASVEEIMKREDYWHADLCEM IPMVKEYYELIQSKGMAEAYGVILGM >gi|157101602|gb|DS480722.1| GENE 2 2091 - 2606 504 171 aa, chain + ## HITS:1 COG:SP2190 KEGG:ns NR:ns ## COG: SP2190 COG5263 # Protein_GI_number: 15901997 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 22 162 508 643 693 99 39.0 3e-21 MRKHTKALATIMAAATLAMASAVTSLANTGWLNVQSKWYYYDEAGNSVRNQWVENDKDLF WLKDNGEMAIQSWVNYDNSWYWVNAQGAMVTDSWIPITNNWYYVGQDGKMLADTWFEYNH RTYYLTESGAAAKGWVELDGKWYYFDRNNGDMKRGTTIDGYTVNEDGVYVK Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:02:59 2011 Seq name: gi|157101601|gb|DS480723.1| Clostridium bolteae ATCC BAA-613 Scfld_02_64 genomic scaffold, whole genome shotgun sequence Length of sequence - 187711 bp Number of predicted genes - 164, with homology - 160 Number of transcription units - 73, operones - 38 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 156 - 215 10.2 1 1 Op 1 . 
+ CDS 425 - 1276 561 ## COG1737 Transcriptional regulators 2 1 Op 2 . + CDS 1333 - 2727 1081 ## COG1757 Na+/H+ antiporter 3 1 Op 3 . + CDS 2714 - 4108 1231 ## COG2379 Putative glycerate kinase 4 1 Op 4 . + CDS 4123 - 5415 1210 ## COG0148 Enolase + Term 5433 - 5498 12.2 + Prom 5456 - 5515 9.3 5 2 Tu 1 . + CDS 5544 - 6554 884 ## COG1609 Transcriptional regulators + Term 6582 - 6625 3.5 + Prom 6593 - 6652 12.2 6 3 Op 1 . + CDS 6787 - 7401 462 ## COG0655 Multimeric flavodoxin WrbA + Term 7466 - 7505 -0.9 + Prom 7446 - 7505 3.8 7 3 Op 2 . + CDS 7636 - 8964 389 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Term 8995 - 9036 6.1 8 4 Op 1 . + CDS 9064 - 10062 814 ## COG0095 Lipoate-protein ligase A 9 4 Op 2 11/0.000 + CDS 10081 - 10581 479 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 10 4 Op 3 . + CDS 10582 - 12864 2191 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 11 4 Op 4 . + CDS 12861 - 13589 837 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 12 4 Op 5 . + CDS 13653 - 14612 1054 ## COG3252 Methenyltetrahydromethanopterin cyclohydrolase 13 4 Op 6 28/0.000 + CDS 14656 - 15639 1089 ## COG1071 Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, alpha subunit 14 4 Op 7 24/0.000 + CDS 15639 - 16616 836 ## COG0022 Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit 15 4 Op 8 . + CDS 16652 - 18004 1031 ## COG0508 Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes 16 4 Op 9 . + CDS 18059 - 19459 1435 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 17 4 Op 10 . + CDS 19470 - 20477 1028 ## Spirs_2674 pyrimidine reductase, riboflavin biosynthesis 18 4 Op 11 . 
+ CDS 20495 - 21913 908 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 19 4 Op 12 . + CDS 21935 - 22282 324 ## gi|160942026|ref|ZP_02089341.1| hypothetical protein CLOBOL_06912 20 4 Op 13 . + CDS 22282 - 23043 393 ## gi|160942027|ref|ZP_02089342.1| hypothetical protein CLOBOL_06913 21 4 Op 14 . + CDS 23064 - 23702 660 ## COG4869 Propanediol utilization protein 22 4 Op 15 . + CDS 23759 - 25087 1301 ## COG0534 Na+-driven multidrug efflux pump + Term 25154 - 25198 3.2 + TRNA 25192 - 25263 64.4 # Gln TTG 0 0 + Prom 25190 - 25249 76.8 23 5 Tu 1 . + CDS 25404 - 26471 1324 ## COG3191 L-aminopeptidase/D-esterase 24 6 Tu 1 . - CDS 26468 - 28579 2129 ## COG3283 Transcriptional regulator of aromatic amino acids metabolism - Prom 28603 - 28662 9.5 + Prom 28589 - 28648 4.8 25 7 Op 1 . + CDS 28669 - 28734 58 ## 26 7 Op 2 38/0.000 + CDS 28815 - 30500 1944 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 30510 - 30554 11.2 + Prom 30504 - 30563 6.9 27 8 Op 1 49/0.000 + CDS 30600 - 31520 1029 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 28 8 Op 2 44/0.000 + CDS 31537 - 32415 1153 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 29 8 Op 3 44/0.000 + CDS 32427 - 33422 1110 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 30 8 Op 4 . + CDS 33422 - 34414 1156 ## COG4608 ABC-type oligopeptide transport system, ATPase component 31 8 Op 5 . + CDS 34411 - 35211 929 ## COG2362 D-aminopeptidase 32 8 Op 6 . + CDS 35240 - 36328 1291 ## COG1363 Cellulase M and related proteins 33 8 Op 7 . + CDS 36342 - 37085 684 ## COG2071 Predicted glutamine amidotransferases 34 9 Op 1 . + CDS 37196 - 37807 399 ## COG2357 Uncharacterized protein conserved in bacteria 35 9 Op 2 . + CDS 37797 - 38717 934 ## COG2421 Predicted acetamidase/formamidase + Term 38722 - 38751 -0.9 + Prom 38779 - 38838 2.4 36 10 Op 1 . 
+ CDS 38973 - 39866 835 ## COG1091 dTDP-4-dehydrorhamnose reductase 37 10 Op 2 . + CDS 39856 - 41136 1492 ## COG1301 Na+/H+-dicarboxylate symporters + Prom 41141 - 41200 1.5 38 11 Tu 1 . + CDS 41267 - 41602 359 ## gi|160942046|ref|ZP_02089361.1| hypothetical protein CLOBOL_06933 + Term 41646 - 41697 0.3 - Term 41634 - 41685 2.8 39 12 Tu 1 . - CDS 41718 - 42299 326 ## Closa_1713 hypothetical protein - Prom 42320 - 42379 3.3 - TRNA 42554 - 42627 70.7 # Arg ACG 0 0 + Prom 42657 - 42716 6.8 40 13 Op 1 . + CDS 42922 - 44280 1282 ## COG1114 Branched-chain amino acid permeases 41 13 Op 2 . + CDS 44316 - 46046 1998 ## COG0608 Single-stranded DNA-specific exonuclease + Term 46113 - 46150 9.5 - Term 46100 - 46138 8.1 42 14 Tu 1 . - CDS 46155 - 46526 626 ## gi|160942052|ref|ZP_02089367.1| hypothetical protein CLOBOL_06940 - Prom 46647 - 46706 6.5 + Prom 46701 - 46760 7.0 43 15 Tu 1 . + CDS 46796 - 48043 1504 ## COG0112 Glycine/serine hydroxymethyltransferase + Term 48095 - 48148 7.4 44 16 Op 1 . + CDS 48206 - 49366 1311 ## COG3835 Sugar diacid utilization regulator + Prom 49437 - 49496 6.0 45 16 Op 2 . + CDS 49528 - 50673 1020 ## COG1929 Glycerate kinase + Term 50732 - 50791 17.0 + Prom 50736 - 50795 9.9 46 17 Op 1 . + CDS 50987 - 51115 147 ## gi|160942056|ref|ZP_02089371.1| hypothetical protein CLOBOL_06944 47 17 Op 2 . + CDS 51067 - 51339 168 ## gi|160942057|ref|ZP_02089372.1| hypothetical protein CLOBOL_06945 48 17 Op 3 . + CDS 51332 - 51412 60 ## 49 18 Op 1 . - CDS 51631 - 52806 1159 ## COG1228 Imidazolonepropionase and related amidohydrolases 50 18 Op 2 . - CDS 52848 - 53852 1190 ## CDR20291_2766 2-keto-3-deoxygluconate permease - Prom 54065 - 54124 10.2 51 19 Tu 1 . + CDS 53767 - 53973 98 ## + Prom 54093 - 54152 9.0 52 20 Tu 1 . + CDS 54211 - 56259 1732 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Prom 56470 - 56529 3.3 53 21 Op 1 . 
+ CDS 56570 - 57574 373 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 54 21 Op 2 . + CDS 57593 - 58042 465 ## SpiBuddy_2979 hypothetical protein 55 21 Op 3 1/0.190 + CDS 58058 - 59563 1871 ## COG3333 Uncharacterized protein conserved in bacteria + Prom 59579 - 59638 3.9 56 22 Tu 1 . + CDS 59681 - 60673 1236 ## COG3181 Uncharacterized protein conserved in bacteria + Prom 60812 - 60871 3.2 57 23 Op 1 . + CDS 61006 - 61842 874 ## COG1082 Sugar phosphate isomerases/epimerases 58 23 Op 2 . + CDS 61891 - 62550 767 ## SpiBuddy_2976 Asp/Glu racemase 59 23 Op 3 12/0.000 + CDS 62553 - 63380 869 ## COG3959 Transketolase, N-terminal subunit 60 23 Op 4 . + CDS 63380 - 64330 1357 ## COG3958 Transketolase, C-terminal subunit 61 23 Op 5 . + CDS 64408 - 65319 989 ## SpiBuddy_0718 xylose isomerase + Prom 65330 - 65389 4.7 62 24 Op 1 . + CDS 65413 - 66315 1227 ## Spirs_2575 xylose isomerase + Prom 66328 - 66387 5.3 63 24 Op 2 . + CDS 66410 - 67369 1141 ## COG1052 Lactate dehydrogenase and related dehydrogenases 64 25 Op 1 . + CDS 67513 - 69414 2010 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains 65 25 Op 2 . + CDS 69508 - 70149 955 ## COG0698 Ribose 5-phosphate isomerase RpiB + Term 70188 - 70236 14.1 - Term 70173 - 70225 4.4 66 26 Tu 1 . - CDS 70290 - 71633 1123 ## COG1472 Beta-glucosidase-related glycosidases - Prom 71780 - 71839 5.5 + Prom 71731 - 71790 3.3 67 27 Op 1 . + CDS 71884 - 73122 1096 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 68 27 Op 2 1/0.190 + CDS 73137 - 74468 1632 ## COG2252 Permeases 69 27 Op 3 . + CDS 74514 - 76292 1949 ## COG1001 Adenine deaminase + Term 76323 - 76395 20.8 - Term 76311 - 76383 19.2 70 28 Tu 1 . - CDS 76451 - 80782 4377 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) + Prom 81118 - 81177 7.5 71 29 Op 1 . 
+ CDS 81206 - 82147 1124 ## COG0598 Mg2+ and Co2+ transporters 72 29 Op 2 . + CDS 82183 - 83013 708 ## Closa_0669 hypothetical protein 73 29 Op 3 . + CDS 83091 - 84527 1761 ## COG0469 Pyruvate kinase + Term 84557 - 84614 8.2 74 30 Tu 1 . + CDS 84637 - 85905 1570 ## COG0019 Diaminopimelate decarboxylase + Term 85936 - 85982 9.3 75 31 Op 1 . + CDS 86011 - 86808 908 ## COG0345 Pyrroline-5-carboxylate reductase 76 31 Op 2 . + CDS 86849 - 87445 868 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 77 31 Op 3 . + CDS 87538 - 88080 556 ## Closa_3613 hypothetical protein 78 31 Op 4 . + CDS 88120 - 88983 809 ## COG2510 Predicted membrane protein 79 31 Op 5 . + CDS 89006 - 90403 1094 ## COG1075 Predicted acetyltransferases and hydrolases with the alpha/beta hydrolase fold + Prom 90518 - 90577 7.0 80 32 Op 1 . + CDS 90633 - 90941 121 ## Closa_3612 integrase family protein 81 32 Op 2 . + CDS 90880 - 91515 665 ## COG4974 Site-specific recombinase XerD + Term 91541 - 91577 4.0 82 33 Tu 1 . - CDS 91585 - 92691 1212 ## COG3359 Predicted exonuclease - Prom 92752 - 92811 3.5 + Prom 92639 - 92698 4.0 83 34 Op 1 29/0.000 + CDS 92910 - 93539 841 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 84 34 Op 2 . + CDS 93581 - 94588 1167 ## COG2255 Holliday junction resolvasome, helicase subunit + Prom 94632 - 94691 3.1 85 35 Op 1 . + CDS 94925 - 95797 693 ## Closa_2236 hypothetical protein + Term 95810 - 95855 6.3 86 35 Op 2 . + CDS 95880 - 98240 1980 ## COG0826 Collagenase and related proteases 87 35 Op 3 19/0.000 + CDS 98246 - 99601 1617 ## COG0772 Bacterial cell division membrane protein 88 35 Op 4 . + CDS 99601 - 101067 1832 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 - Term 100910 - 100967 4.1 89 36 Tu 1 . 
- CDS 101072 - 101986 1027 ## COG0583 Transcriptional regulator - Prom 102027 - 102086 7.7 + Prom 101991 - 102050 6.8 90 37 Op 1 1/0.190 + CDS 102269 - 103147 1002 ## COG0010 Arginase/agmatinase/formimionoglutamate hydrolase, arginase family 91 37 Op 2 . + CDS 103173 - 104609 1388 ## COG3314 Uncharacterized protein conserved in bacteria + Term 104628 - 104678 7.5 + Prom 104622 - 104681 3.1 92 38 Op 1 . + CDS 104716 - 105510 864 ## Closa_0658 molybdopterin dehydrogenase FAD-binding protein 93 38 Op 2 11/0.000 + CDS 105507 - 105965 589 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 94 38 Op 3 . + CDS 105970 - 108408 2689 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs + Prom 108477 - 108536 5.1 95 39 Tu 1 . + CDS 108669 - 109121 454 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 109142 - 109179 8.5 96 40 Tu 1 . - CDS 109222 - 110238 776 ## COG1609 Transcriptional regulators - Prom 110397 - 110456 8.9 + Prom 110310 - 110369 8.5 97 41 Op 1 3/0.048 + CDS 110444 - 111286 765 ## COG1082 Sugar phosphate isomerases/epimerases 98 41 Op 2 16/0.000 + CDS 111332 - 112438 817 ## COG1879 ABC-type sugar transport system, periplasmic component + Term 112451 - 112485 4.0 99 41 Op 3 21/0.000 + CDS 112499 - 113980 181 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 100 41 Op 4 3/0.048 + CDS 113988 - 114950 888 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components 101 41 Op 5 16/0.000 + CDS 114962 - 115996 1022 ## COG0673 Predicted dehydrogenases and related proteins 102 41 Op 6 . + CDS 116007 - 116921 812 ## COG1082 Sugar phosphate isomerases/epimerases + Term 117004 - 117049 12.3 + Prom 117144 - 117203 4.5 103 42 Tu 1 . + CDS 117255 - 119792 2749 ## COG0426 Uncharacterized flavoproteins + Term 119848 - 119914 12.1 - Term 119835 - 119896 15.1 104 43 Tu 1 . 
- CDS 119903 - 121258 337 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - Prom 121437 - 121496 6.5 + Prom 121460 - 121519 8.7 105 44 Op 1 5/0.000 + CDS 121747 - 122721 796 ## COG1609 Transcriptional regulators + Term 122790 - 122844 8.5 + Prom 122772 - 122831 4.4 106 44 Op 2 1/0.190 + CDS 122953 - 124611 1298 ## COG1621 Beta-fructosidases (levanase/invertase) 107 44 Op 3 35/0.000 + CDS 124629 - 125930 1547 ## COG1653 ABC-type sugar transport system, periplasmic component 108 44 Op 4 38/0.000 + CDS 125984 - 126871 1042 ## COG1175 ABC-type sugar transport systems, permease components 109 44 Op 5 . + CDS 126887 - 127729 886 ## COG0395 ABC-type sugar transport system, permease component + Term 127809 - 127850 1.0 + Prom 127771 - 127830 7.2 110 45 Tu 1 . + CDS 127864 - 128724 657 ## COG2340 Uncharacterized protein with SCP/PR1 domains + Term 128740 - 128785 11.1 + Prom 129058 - 129117 10.7 111 46 Op 1 6/0.000 + CDS 129286 - 130347 926 ## COG1609 Transcriptional regulators 112 46 Op 2 35/0.000 + CDS 130344 - 131705 1428 ## COG1653 ABC-type sugar transport system, periplasmic component 113 46 Op 3 38/0.000 + CDS 131735 - 132649 633 ## COG1175 ABC-type sugar transport systems, permease components 114 46 Op 4 . + CDS 132667 - 133506 606 ## COG0395 ABC-type sugar transport system, permease component 115 46 Op 5 1/0.190 + CDS 133536 - 134906 1291 ## COG4948 L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily 116 46 Op 6 . 
+ CDS 134923 - 135831 864 ## COG0684 Demethylmenaquinone methyltransferase
+ Term 136003 - 136069 6.1
- Term 135993 - 136047 8.4
117 47 Op 1 22/0.000 - CDS 136058 - 136621 720 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit
118 47 Op 2 23/0.000 - CDS 136624 - 137376 758 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit
119 47 Op 3 8/0.000 - CDS 137378 - 138439 1236 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit
120 47 Op 4 . - CDS 138443 - 138661 244 ## COG1146 Ferredoxin
- Prom 138716 - 138775 7.5
+ Prom 138774 - 138833 5.5
121 48 Tu 1 . + CDS 138921 - 139685 895 ## COG1414 Transcriptional regulator
+ Term 139817 - 139860 2.0
+ Prom 139724 - 139783 6.5
122 49 Op 1 . + CDS 139927 - 140988 1136 ## COG3426 Butyrate kinase
123 49 Op 2 . + CDS 141021 - 142463 1509 ## COG1757 Na+/H+ antiporter
124 49 Op 3 . + CDS 142513 - 143418 1062 ## COG0280 Phosphotransacetylase
125 49 Op 4 11/0.000 + CDS 143434 - 144753 1454 ## COG0477 Permeases of the major facilitator superfamily
+ Term 144779 - 144823 3.5
126 49 Op 5 . + CDS 144854 - 145555 592 ## COG1309 Transcriptional regulator
+ Term 145560 - 145609 -0.4
+ Prom 145557 - 145616 7.8
127 50 Op 1 . + CDS 145754 - 146770 403 ## Cphy_0392 NAD-dependent epimerase/dehydratase
128 50 Op 2 . + CDS 146859 - 147839 637 ## SpiBuddy_2189 dihydrodipicolinate synthetase
129 50 Op 3 . + CDS 147863 - 149110 739 ## COG1914 Mn2+ and Fe2+ transporters of the NRAMP family
+ Prom 149113 - 149172 4.2
130 51 Tu 1 . + CDS 149316 - 150821 1723 ## COG2317 Zn-dependent carboxypeptidase
+ Term 150863 - 150914 7.4
- Term 150851 - 150902 7.4
131 52 Tu 1 . - CDS 150978 - 152675 299 ## COG4928 Predicted P-loop ATPase
- Prom 152846 - 152905 7.3
+ Prom 152802 - 152861 7.5
132 53 Tu 1 . + CDS 153033 - 154040 1148 ## COG1363 Cellulase M and related proteins
+ Term 154288 - 154335 4.6
+ Prom 154396 - 154455 4.0
133 54 Tu 1 . + CDS 154508 - 155698 1333 ## Closa_1647 electron transport complex, RnfABCDGE type, G subunit
+ Term 155797 - 155844 7.0
+ Prom 155736 - 155795 4.8
134 55 Op 1 . + CDS 155956 - 156909 1094 ## COG2385 Sporulation protein and related proteins
135 55 Op 2 . + CDS 156950 - 157705 950 ## Closa_2891 Peptidase M23
+ Term 157752 - 157801 12.3
+ Prom 157878 - 157937 5.2
136 56 Op 1 35/0.000 + CDS 158028 - 159374 1609 ## COG1653 ABC-type sugar transport system, periplasmic component
137 56 Op 2 38/0.000 + CDS 159614 - 160513 1103 ## COG1175 ABC-type sugar transport systems, permease components
138 56 Op 3 . + CDS 160513 - 161343 936 ## COG0395 ABC-type sugar transport system, permease component
139 56 Op 4 . + CDS 161354 - 164977 2376 ## Mahau_1687 hypothetical protein
- Term 165139 - 165168 -0.5
140 57 Tu 1 . - CDS 165295 - 165960 295 ## SpiBuddy_2052 phosphoesterase PA-phosphatase related protein
- Prom 165982 - 166041 1.6
- Term 166053 - 166091 6.0
141 58 Tu 1 . - CDS 166113 - 166727 165 ## COG1396 Predicted transcriptional regulators
142 59 Tu 1 . + CDS 167106 - 167564 496 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases
+ Term 167699 - 167752 1.2
+ TRNA 167853 - 167925 46.9 # His GTG 0 0
+ Prom 167854 - 167913 75.7
143 60 Op 1 . + CDS 168000 - 168197 234 ## gi|288871602|ref|ZP_06118187.2| toxin-antitoxin system, antitoxin component, Xre family
+ TRNA 168221 - 168293 67.4 # Lys TTT 0 0
+ Prom 168220 - 168279 75.6
144 60 Op 2 . + CDS 168363 - 168797 395 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases
+ Term 168855 - 168902 0.3
+ Prom 168935 - 168994 5.7
145 61 Tu 1 . + CDS 169016 - 170443 1512 ## COG0534 Na+-driven multidrug efflux pump
+ Prom 170504 - 170563 4.1
146 62 Op 1 . + CDS 170601 - 171266 709 ## COG0110 Acetyltransferase (isoleucine patch superfamily)
147 62 Op 2 . + CDS 171299 - 171751 156 ## Caci_3891 hypothetical protein
+ Term 171901 - 171949 1.3
148 63 Op 1 . - CDS 171825 - 172010 87 ## gi|160942184|ref|ZP_02089499.1| hypothetical protein CLOBOL_07074
149 63 Op 2 . - CDS 172010 - 172522 672 ## COG1528 Ferritin-like protein
- Prom 172573 - 172632 5.7
150 64 Tu 1 . - CDS 172671 - 174026 1280 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
- Prom 174079 - 174138 5.8
151 65 Tu 1 . + CDS 174445 - 176448 2008 ## COG1193 Mismatch repair ATPase (MutS family)
+ Term 176468 - 176515 8.3
152 66 Tu 1 . - CDS 176455 - 177675 936 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
- Prom 177725 - 177784 5.6
+ Prom 177767 - 177826 6.6
153 67 Op 1 . + CDS 177886 - 179391 1684 ## COG2211 Na+/melibiose symporter and related transporters
154 67 Op 2 . + CDS 179432 - 181675 2070 ## COG1472 Beta-glucosidase-related glycosidases
155 68 Op 1 . - CDS 181965 - 182387 494 ## CDR20291_1132 hypothetical protein
156 68 Op 2 3/0.048 - CDS 182360 - 183289 874 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily
157 68 Op 3 . - CDS 183279 - 184139 771 ## COG2207 AraC-type DNA-binding domain-containing proteins
- Prom 184164 - 184223 5.7
158 69 Tu 1 . + CDS 184035 - 184226 116 ##
159 70 Tu 1 . - CDS 184303 - 184524 176 ## gi|160942195|ref|ZP_02089510.1| hypothetical protein CLOBOL_07085
- Prom 184551 - 184610 8.3
+ Prom 184552 - 184611 6.5
160 71 Tu 1 . + CDS 184653 - 185036 390 ## gi|160942194|ref|ZP_02089509.1| hypothetical protein CLOBOL_07084
+ Term 185108 - 185139 -0.7
- Term 185039 - 185085 2.2
161 72 Op 1 . - CDS 185116 - 185433 355 ## gi|160942196|ref|ZP_02089511.1| hypothetical protein CLOBOL_07086
162 72 Op 2 . - CDS 185487 - 185951 233 ## CLB_2021 hypothetical protein
- Prom 186046 - 186105 5.5
+ Prom 186023 - 186082 13.2
163 73 Op 1 . + CDS 186329 - 187249 921 ## COG0714 MoxR-like ATPases
164 73 Op 2 .
+ CDS 187239 - 187710 290 ## Acfer_0396 von Willebrand factor type A Predicted protein(s) >gi|157101601|gb|DS480723.1| GENE 1 425 - 1276 561 283 aa, chain + ## HITS:1 COG:BH2940 KEGG:ns NR:ns ## COG: BH2940 COG1737 # Protein_GI_number: 15615502 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 13 274 15 275 284 127 27.0 2e-29 MKNIIDEINKRYENLTKSQKIIADFLLENYALVPFETLNEIAHRIGLSTTSLIRFSRTMG YKDYTDLKKNIQDLIKVKVSLPSRFGELSEQGGTPNDLLLRCRDIAVNNIEATLRLQDTE KLDLAVDWISSANHNYVLGLRTCFSPAFYLAVVLSQIKKKVRLIQGIASTYPEEITDVEE GDVCTVFSFPRFYKETVNIAHFMRRRGAKVIIITGKDIRPVKPYGDIIIPCSAVGPSFKD SYVAVHFLIDYIISSIAAKNKDEGIEVVSKIEEILSQNYFIGF >gi|157101601|gb|DS480723.1| GENE 2 1333 - 2727 1081 464 aa, chain + ## HITS:1 COG:BS_mleN KEGG:ns NR:ns ## COG: BS_mleN COG1757 # Protein_GI_number: 16079413 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Bacillus subtilis # 12 448 9 446 468 248 32.0 2e-65 MSSNERKKPSFLFALLIMAAVCVVVIVPYMKWGVSMAATFFLSWLFVIPACMSLGFTYEE LEKAAVSYCGKIITSVFILLSVGGMIGAWIAAGTVSAIVDVGLRMITPKIFLLVTFIVGA LYAMACGTSWGTLGTIGIAMSAVGLGLGVPPAMTAGAVVSAAFLGDGFSPMSDSPNLASA VTGVKLMDHVRHLAKIQFPAFIISAVLYTILGFVYAGGEVQNETTLMIIKTLGDNYNVGL IAFLPALIVIILLLLKKSAIISILISAATGIGVAVIYQGKSLAYVLTCFWSGVKSDTGME LVDTLLSRGGVTSLFSSASLYLITFGLIGILTQAGVLDAVVAPIVNKVKTGFQLLITTII TGFLGDAVGCSGSFAYLFAGNLMKPLYKKMNVSDLDLTRNLACSVTPLGPLIPWNMNAVI ALDLLGVSCFQYAPYCFQAYVMPVLLIVNYLLFNKRGGQQNGEN >gi|157101601|gb|DS480723.1| GENE 3 2714 - 4108 1231 464 aa, chain + ## HITS:1 COG:PAB1021 KEGG:ns NR:ns ## COG: PAB1021 COG2379 # Protein_GI_number: 14521745 # Func_class: G Carbohydrate transport and metabolism # Function: Putative glycerate kinase # Organism: Pyrococcus abyssi # 34 452 17 420 435 222 34.0 1e-57 MGKIKNTESLLNHGDRESRKKITELLERMWAKVDAYHLIKELMSVDGNCLTIGTKRWDMD KLGNIYLFGAGKACNAMAMAVCDVLKEKLTEGVISVKIAEDNDTYINTRVYVGGHPLPNE EGYRAAEDIIQMIDKGKPGDLFISVISGGSSALLTCPKEGITLQEEIQAQDMLLRSGAKI 
EEINAVRRHISRTNGGRLAEQVLKNGAEIVNIIVGDGVGVKPTVDFKAPFQFFGTPVAPD KTTIQDARDCIRNYQLEDKLPESILKYLYEDNPEHETPKAFGEGVTHFCLNNVPDSCEAA VEAAREMGISSMVFSTFIEGESREAGYFFASMAREIQANHRPIKAPCFVFCSGETTTKVE DGSRGTGGPSHELALGFAHGARYTAGAALASVDTEGTDGTTRYAGALTDSQTLRMLSEKG INIFDVFRTHDCGGALECTQDSILTGNTGTNLCDFNVMYVPEAE >gi|157101601|gb|DS480723.1| GENE 4 4123 - 5415 1210 430 aa, chain + ## HITS:1 COG:MK1647 KEGG:ns NR:ns ## COG: MK1647 COG0148 # Protein_GI_number: 20095083 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Methanopyrus kandleri AV19 # 5 411 3 412 427 305 43.0 9e-83 MNGNRIKEVKARQVLDSKGRPVVETEIYTENGKMGRAGASTGTSVGKNESYVLRDNDKEL FGGLSVFKAVKNIEEIIGPALVGMDVTDQQAVDHRMIELDGTRYKTNLGGNAIYSVSCAA ARAAAASVGKPLWRYLAEEEPERVFAPAYNMINGGTYGNYTLAFQEFIVIPKNVSTFYEG SRIGVEIFQKMGDIIKKYNNGKPPVMGNYSGYGAPSDDPFVLFDMLMEAASSLGYEDKII FSMDCAASEIYDEARDAYLYKGTYIDRDEMISLLSRIAHKYPIGFIEDALQEEDFEGFKK ARKEIQAVLIGDDFICSSIDRAKKAIDMDAIQGMILKPNQIGTLTEAIETVRFMKQHDLL VVASGRAGGVIDDPNAELAIALGLPIMKTGAPRSGERTIFTNTGLRVEEQLKPGKTMADV LKVPGFERLG >gi|157101601|gb|DS480723.1| GENE 5 5544 - 6554 884 336 aa, chain + ## HITS:1 COG:RSc1014 KEGG:ns NR:ns ## COG: RSc1014 COG1609 # Protein_GI_number: 17545733 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Ralstonia solanacearum # 3 325 4 337 347 154 31.0 3e-37 MNIRQIAYLAGVSVATVSRCINQPEKVSEETRNHIIGVMKRVDYIPNPSAKSLSTGYTKT ILCVIPTLCNEFFNQLVEGSQAVLREADYKVLVFSTDSIEDFWRRVDQRSVDGMIISGSG FINSEELRMAKIQVPYVLIENMEQSESPKDVTYVYSDDYKGVQMALEYLYREGNRVFGVI STSDTYLVTERRRKAVYDFFMEKKDCRLCMVKAHYADLQDAYDACGHLMRLEERPTAIFA FNDMMAIAAMRYLVLHGIRVPEDMEIMGFDDNPVASFVLPSLSTVVAPNYQLGESAARLL LEKLRGNTESKSVLFPVELKLRETTRNNVKEYRDKL >gi|157101601|gb|DS480723.1| GENE 6 6787 - 7401 462 204 aa, chain + ## HITS:1 COG:MA3740 KEGG:ns NR:ns ## COG: MA3740 COG0655 # Protein_GI_number: 20092538 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 2 198 1 195 221 131 38.0 7e-31 
MVKIIGICGSPRRKSSYAALREALDGALETGDVEVELIELRARKLNFCIHCNKCLREDAV RCTVYEDDMTPLYEKVYQADGMIIASPVYEMNITAQLAAFFNRFRPTWNIIKKDPTFFNR KVGAAIAVGGTRNGGQEGALNAILGFYHTQGMVVCNGGAGTYAGASLWNPGDGSGEMDDP EGLKRARALGAKMADLTKRLNLDN >gi|157101601|gb|DS480723.1| GENE 7 7636 - 8964 389 442 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 6 431 7 428 447 154 29 3e-36 MMSYGIEDKPRLAVAFPLAFQHVLSAFTGQIAVGMLLAMGFGLSVKETALLIQCALFITG IATIIQSFGIGKHIGARLPIVSGGSFTLITPMVVAANNPNIGIGGAFGAALVGSGVLFIL GPIAIRYLHKYFSPTVTGAVVLTVGNCIGATAFGNFNVESETAMTELAIAVMVVVVIVLL NFFGKGLIKQCCILIGMVIGYVVAAALGMVDFSSIGEASWFSMVKPFAWGVNFNVGAILT ICAVHIGTLMEMVGDTTGLVTAVSNRLPTSEELSRTIRGGGITGMFASAFNGAPVITGSA NVGIVAMTGVGSRFVTGLGGIIFLAFAVFPKLAQLLALIPNCVMAGAVLVMCGQISASGL RVISMGDFTERNTTILAIALAIGIGGNFGVANLAFLPSTVTTIFTGIPGTALVALVLNMI LPKAAEDEVTDVKIEGEIPAVQ >gi|157101601|gb|DS480723.1| GENE 8 9064 - 10062 814 332 aa, chain + ## HITS:1 COG:BH0683 KEGG:ns NR:ns ## COG: BH0683 COG0095 # Protein_GI_number: 15613246 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase A # Organism: Bacillus halodurans # 1 323 1 326 330 257 40.0 2e-68 MIYVETNSIIGSENLAFEEYFLKKEDIREPVLMLWRNRPTIVVGAFQNTHEEICEEFVKA NKIDVVRRTSGGGAVYHDLGNLCFSFIMEHGDFTNTDYSAFLKPVVEALRGMGIQAGING RNDLVLGNAKISGSAVRIYKNRVLFHGTLLFSSDLEILSRALRVKQDKLVSKGIKSVRSR VTTIAEHLSYDMDVLEFRERLLQALFGESGGKEYEISLNERREIMCLARDKYESFLWTYG KNPPADLTHGKRFGGGSITVKASLKNSRIEQIRFEGDFLGCMEMKDIEQLLNGVVYERDA VSRVFEELDIRPYFGTITKEEVVGCIMGDDRI >gi|157101601|gb|DS480723.1| GENE 9 10081 - 10581 479 166 aa, chain + ## HITS:1 COG:SSO2433 KEGG:ns NR:ns ## COG: SSO2433 COG2080 # Protein_GI_number: 15899181 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sulfolobus solfataricus # 3 154 14 163 171 159 53.0 2e-39 MEQVSLDINGKKTQVSIEKNWTLLFVLREVLGLTGAKCGCSTGDCGACKVIIDGEAVNSC 
LVLARNLEGKKIETIEGLSNGTDLHPIQQAFIDAGAVQCGYCTPGMVMSAKALLDRIENP TEEEIRNNGISNNLCRCTGYVKIVEAIRLAARRMMEAKAGGKKEER >gi|157101601|gb|DS480723.1| GENE 10 10582 - 12864 2191 760 aa, chain + ## HITS:1 COG:SMb20132 KEGG:ns NR:ns ## COG: SMb20132 COG1529 # Protein_GI_number: 16263880 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Sinorhizobium meliloti # 3 751 18 770 772 471 36.0 1e-132 MGIIGQSIPVRDAAMKVTGQFKYTADLEFPGMLHAKILFSPVPHARIKKIDTSRAEALEG VRGVVCYLNAPDTKYNSCGEEIDGHKTERVFDDTVRYVGDKVAAVAADTVKIAEQAVKLI QVEYEELPFYTDPGEALKEGAYPIHGESNIIFPVDMGAGDVEAGFKAADYIYEDTYSTPA IHHSAIETHASIAVYDSSGKLTVYTPSQDVFGYRKNLSRIFGLPMSRVRVVNPGMGGGFG GKIDTITEPVTALLAMKTGRPVKLVYNRREDIVSSRTRHGMEIKIKTGVKKDGTIISQDM QVIINAGAYAGGTMSIVWAMSGKFFKNHKTPNIRFRATPVYTNTPIAGAMRGFGSPQEFF AQQCQMNKIARDLHISIIDMQLKNLVEPDGFDQRNGQRHGNARPIDCVKKGMELFGWEEA VKEQEESGAAKGRYRIGVGMAAATHGNGVYGVCPDTTGVILKMNEDGSVVMFTGVSDMGN GSVTTQAQAVAQELGISMDRIECIQADTDATMYDLGNYSSRGTYVSCNAAVKAAGKIRQE LLKEAAQLLEEEQQNLELKDNGVCVRNNPEKKASLEEVITHARKVNQRDICCADTFASYA LAMSYGAHFVKVSVDTRTGGVKVLEYTAVHDVGKALNPMSVEGQIEGAVQMGLGYALSEG IIIDSGGKVKNTTFKQYHIMNAGEMPPIKVGLVEETEPTGPYGAKSIGECSVVPSAGAIA NAVANAIGCEVHRLPLKPDTVLELLAREHETVERRDGITL >gi|157101601|gb|DS480723.1| GENE 11 12861 - 13589 837 242 aa, chain + ## HITS:1 COG:TM1724 KEGG:ns NR:ns ## COG: TM1724 COG1028 # Protein_GI_number: 15644471 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Thermotoga maritima # 1 239 1 242 246 130 33.0 2e-30 MRFEDKVAAITGAGQGLGAGFAKAFAKEGAVAVLIGRTGGKLQAVAEEIRKEGGRALVKV CDIAVPEQVEQCFREIEQEAGSVDVLVNNAAYHKSVPVVETSNEEWLSQISINLNGTFFC TKAVLPAMIRNRYGKIINISSSAAKHFFPGFGAYAASKGGIVSFTHTLSEEVKQYGINVN AIYLGMTNTEHTRERMDSDQAVTIDLDSMLQVEDVAEVVTFLASDAAAPIMGAAIDVFGK KS 
>gi|157101601|gb|DS480723.1| GENE 12 13653 - 14612 1054 319 aa, chain + ## HITS:1 COG:MK0625 KEGG:ns NR:ns ## COG: MK0625 COG3252 # Protein_GI_number: 20094063 # Func_class: H Coenzyme transport and metabolism # Function: Methenyltetrahydromethanopterin cyclohydrolase # Organism: Methanopyrus kandleri AV19 # 2 317 1 313 315 219 35.0 6e-57 MLNINAKAMKVVREIIEDADALGCKVIKMDCGATLIDMGINCEGSWKAGVLFTRASMGDM STVTLGEFRLNEDYTFASVEVLVNKPLIACMASQIAGWKLGDGEFATIGSGPARSIAHVD SDWYFEMTDYREENNEAVLCLQDVKYPTDQMAMEVAKACNVKPEDTYILISDSTCVVASI QVSGRMLEQTCHKMFEKGFDAGQIVMIRGTAPIAPIVKDEMKSMGRINDALIYGSSVEIW VDASDEAIEKVIPGLVGKTSSPCYGQLFEDVFEKAGRDFFYVDHDVHSLGRVQIHNINTG KAFCGGEINYKVLEKSFLY >gi|157101601|gb|DS480723.1| GENE 13 14656 - 15639 1089 327 aa, chain + ## HITS:1 COG:SSO1525 KEGG:ns NR:ns ## COG: SSO1525 COG1071 # Protein_GI_number: 15898353 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, alpha subunit # Organism: Sulfolobus solfataricus # 3 321 5 329 332 241 39.0 1e-63 MALSKERLLKVYRDMVMIRRFEEVIEEYAANGTIPGFVHLSIGQEACQAGVVDALKKTDY KFPDHRGHGAIALCGTDPKLVMAEIFAKETGINHGRGGSMHVNDLECRNMGFNGIQGSTM VTCLGTAFASVYNGTDDVTAVFLGDGTLGEGTCHESMNMAATWKLPIIYCLVNNGYAIST RYEEAHPQKELKTWGEGYEVPSFRLDGNDIEAVIEAVEKAADRARKGEGPTVLEFMTYRW QGHFAGDPAAYRPEDEVSYWVNDRDPLKLTKAILLEREQVEPAVLQAVEEEEEKHVQEML KFSLESEYPGIETATTYTYADREVELR >gi|157101601|gb|DS480723.1| GENE 14 15639 - 16616 836 325 aa, chain + ## HITS:1 COG:SSO1526 KEGG:ns NR:ns ## COG: SSO1526 COG0022 # Protein_GI_number: 15898354 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dehydrogenase (E1) component, eukaryotic type, beta subunit # Organism: Sulfolobus solfataricus # 3 323 2 323 324 298 46.0 1e-80 MGRKLSYGMAINEALHQMMGADERVFILGEDVAKMGGDFGITKGIWEKWPNRIKDTALSE SAILGLSCGAAVCGLKPVPEIMFADFLGVCFDQLTNNAAKLNFMYQGKAHCGITVRAVQG GGIRCAYHHSACVESWFMNTPGLVVVCPTTPYEAKGMLISAIKSDNPVLFLEHKTLYNVK 
GEVPQEMYEIPLYEAEVEREGSDITIVATQIMLDKAHKAADIMAKEGVSVEIIDPRTIYP YDKDTIQKSVAKTGRIILAQEGPKCGGWGAELSAMISEDVFEYLCAPIKRVTSLDSPVPY APVLEDYVLPQLDDLVKTCRELMEY >gi|157101601|gb|DS480723.1| GENE 15 16652 - 18004 1031 450 aa, chain + ## HITS:1 COG:BH0778 KEGG:ns NR:ns ## COG: BH0778 COG0508 # Protein_GI_number: 15613341 # Func_class: C Energy production and conversion # Function: Pyruvate/2-oxoglutarate dehydrogenase complex, dihydrolipoamide acyltransferase (E2) component, and related enzymes # Organism: Bacillus halodurans # 1 445 1 429 436 209 33.0 7e-54 MVTKVIMPKFGLSMETGVLGSWLVEEGASVTKGSALAEITTDKITNTCEAPKDGILRKIL LPEGEEAACGEAIAVLADTADEDISAECGGGQSEGGAFADSAEQAAAASPVPEKAAPADI KITPRAKKVAEEQGLEYSHIQGTGLLGAITISDLKKHGIPRKDPASAASVSAAPSSASVP SSASGPASASGLASASAPASGPSTASAPAAAPKAVFDTRPCEGEDVIVKMSTMETAIAKA MQNSLLTTAQATIATEAEITELVRVYKQLKGKYTNAGVKLSYTAMLIKAVAMALENHKAL RSTMADETHIKISSRIHIGVAVDIPGGLIVPVIRDANMKDLRTICLELSDLTQRAKDNKL TSDQLGGATITITNLGMFGITYFTPVLNVPESAILGVGAIIEKLMVKDGGFYPASVMNFS LTHDHRIVNGAPAARFLKEVTASLQDFKWV >gi|157101601|gb|DS480723.1| GENE 16 18059 - 19459 1435 466 aa, chain + ## HITS:1 COG:MA1276 KEGG:ns NR:ns ## COG: MA1276 COG0402 # Protein_GI_number: 20090140 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Methanosarcina acetivorans str.C2A # 4 447 12 437 442 268 35.0 2e-71 MKKADVMIKNAYIITMDHDRNIISNGCIVIDKDKITAVGGGELASCYEASRVVDAKGKFV FPGMISTHSHLFQTMLKGLGRDKLLFDWLDSSVRTALHRFDGEMCYYAALTGCMEAIQSG TTTLLDYMYCHTSPGLSDYVTQAMEDIGIRGIYGRGFTNTANFPPEFKVAHHDTEQDMFD DVRRLYKKYEGHSRMSVALAPGIIWDNTDDGYREMRKMADEMHIPLTMHVLESEDDDKYC REVRGGRTIPHLERLGFIGPDFIAVHCVCMEEEDFDIFKQYDVKVSHNPVSNMILASGVA PVERMVKEGLTVSLACDGSASNDTQDMMEVLKTTALLQKVHLRDAAAMPASRVLELATLG GAKAVMREGDLGAIAAGMKADLVIYDPFHGRSIPVHDPVSAIVYSSSQANIESVMVDGVF VMEHKRMTMIDEEKVLYATQKAAEKLINEVGLGNTQWGRKIGNLGL >gi|157101601|gb|DS480723.1| GENE 17 19470 - 20477 1028 335 aa, chain + ## HITS:1 COG:no KEGG:Spirs_2674 NR:ns ## 
KEGG: Spirs_2674 # Name: not_defined # Def: pyrimidine reductase, riboflavin biosynthesis # Organism: S.smaragdinae # Pathway: not_defined # 1 335 1 339 339 329 46.0 1e-88 MNELVFFEIPKQNLKIQIVKKSNEILEQIRKESAETAVMPDVAQYYTDIYFPKAPSDRPY TFSSIVLSSDGKMAYGDNPSGPLIAKNNFLDPDGSLGDFWVLNVLRAYADGIIIGARTLL SEPGITCHVYEERLTRQRREVLGKKYQPCGVIVSLDGTDIPFDHYIFDVDPKEEYKLVIA TSPRGAEYIMANSPLKHPVIGPFKTIEDVDHADLGELYTDFNAFPVIVTGQGENPDTKVL MYALKKWGLEKLCIEAPSYCTHLLQQGMLDEYFINYSMVFAGGQKSPGYASEFGHMDHPH ADFLTLGIHESNFIFTRQKIRYGVTNETDLSGYKY >gi|157101601|gb|DS480723.1| GENE 18 20495 - 21913 908 472 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 5 466 1 458 458 354 39 2e-96 MGGTMAGTDLVVIGAGPGGYVAAIRAAQLGMKVVIAEKDACGGTCLNYGCIPTKALYKNA QVMGYMDHSREFGIEIDGYRLDMEQVQARKNKIVKTLTGGVEYLLKSNKVAIEKGCAKII KAGLVEVTGKDGTVKRLETKRILIASGSKSSRLPIEGMDLEGVITSKEALDMKTVPEEIV IIGGGVIGIEFAGIYQSFGAKVTVVEFMPHIIPNVDVEITARLKSLLEKRGISIMTGSKV EKIEKKGNNLSVQVDAGGKKQVLSCGQVLVSTGREMDADGLNLDGAGVRYDRKGIKVDEN YETNVPGIYAIGDVTGRVMLAHVASEEGKTAVERMAGENTEVDYSLIPNSIFTFPDVSSI GLSEEQAKEQGIEYITSKYQFSGNGKALTMGDAEGMVKVIAAKDKSRLLGVHIIGPNASD LIAEAAIAMNGMFTVEEAAGVMHGHPTLSEAFDEAVSNLLGKAIHMPPVKNR >gi|157101601|gb|DS480723.1| GENE 19 21935 - 22282 324 115 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942026|ref|ZP_02089341.1| ## NR: gi|160942026|ref|ZP_02089341.1| hypothetical protein CLOBOL_06912 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06912 [Clostridium bolteae ATCC BAA-613] # 1 115 1 115 115 227 100.0 2e-58 MELRVNEPGLFIEELVFQIRPEQVKRYVELEYEIMAKELAGLDGFCGWQIWVSETNPGRV TSLYFWKDHASYRNLDQEWLSGKKDEITRAFGAENMTFVRVGHETDRRYQMRSLA >gi|157101601|gb|DS480723.1| GENE 20 22282 - 23043 393 253 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942027|ref|ZP_02089342.1| ## NR: gi|160942027|ref|ZP_02089342.1| hypothetical protein CLOBOL_06913 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06913 [Clostridium bolteae ATCC BAA-613] # 1 253 1 253 253 
530 100.0 1e-149 MGTIHITLVANAGLLLESGGRKLLLDGLHSQENGMFSPVPPCIREQIMKRIPPFDGIRWV AFTHFHEDHFDLDLVNRYLQEAAPEMLVMPEDTPNGVSSKLKAGVCSTEAIELPMGQCRS IRLWETGSLTAFPSRHAGREYETVSHLCYVLEADGKKVLILGDSDYDSVFFKQMAGAMEF DAVIANPLFLHLPRGRRVFTQSIRTKEIIFCHLPFAEDDQIHFRNMMSEDMRQYETILPP MKALTDPLQKVSI >gi|157101601|gb|DS480723.1| GENE 21 23064 - 23702 660 212 aa, chain + ## HITS:1 COG:TM0375 KEGG:ns NR:ns ## COG: TM0375 COG4869 # Protein_GI_number: 15643143 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Propanediol utilization protein # Organism: Thermotoga maritima # 5 181 23 200 210 187 53.0 1e-47 MGQNILIEISARHIHLSKEHLEILFGEGYELTVKKMLSQPGQFACEERVSVIGPKGKFSM SVLGPVRERTQVEISLTDARTIGVSAPIRESGDVEGSGSCVLEGPRGTVILKEGVIAAKR HIHTTPEDAERLGVTDRETVAFEVNINDRSLIFKDMVVRVSEHYSTAAHIDTDEANAVAM SGTVYGTIVKLRGNGLADSGLTDSDLADSSIA >gi|157101601|gb|DS480723.1| GENE 22 23759 - 25087 1301 442 aa, chain + ## HITS:1 COG:MA1121 KEGG:ns NR:ns ## COG: MA1121 COG0534 # Protein_GI_number: 20089987 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 1 427 15 442 475 197 30.0 2e-50 MGVMPVNRLLVTMALPIMISMIIQAMYNIVDSLFVARISEQALTAVSLAYSVQNLLIGVA VGTGVGVNSLLSRYLGEQNREGVNKTAVNGIFIMLVTYLFFLVFGLFFTRTFFKLQTDSE EIIELGVQYLSICLIFSVGVLEQVVLERILQGTGKTIYTMITQGAGAVINIILDPILIFG LLGFPRMGISGAAIATVTGQLAGMGMNVYFNKRVNHEVQFVFRGFRPDMGIIKQIYRVGI PAIVMQSIASVMTFGLNQILISYTATAAAVMGVYFKIQSFIFMPVFGLNNGMIPIIGYNL GAGRPGRIRKAIHYSLGYATAIMAAGFVLSQVFAVEIMAMFNASEHMMSIGVTALRVISF GYLSASVSVIYSAVFQALGRGMESLVISFFRQLIVLLPAACILAAMFGLDALWWSFPISE TMAALLAWVWWKHIRRQMAGQV >gi|157101601|gb|DS480723.1| GENE 23 25404 - 26471 1324 355 aa, chain + ## HITS:1 COG:mlr2477 KEGG:ns NR:ns ## COG: mlr2477 COG3191 # Protein_GI_number: 13472247 # Func_class: E Amino acid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: L-aminopeptidase/D-esterase # Organism: Mesorhizobium loti # 6 349 5 338 343 274 46.0 1e-73 
MERKRIRDFGIVPGVMKTGPKNKITDVPGVLVGHKTVRNGDNKTGVTVIVPGPGNVFERK FIAAGYVHNGFGKTCGLVQVEELGTLETPIALTNTLNVGLAADALVEYTIRQCEKDQVEV TSVSPVVGECNDCRINRIQHRAVSMEDVMEAFSGAGEDFEEGDVGAGTGTVCYGLKGGIG SASRVMRIGGVNYTLGVLVQSNFGATEDLVIDGIRAGQRILEEKERRRKMATAEQDQGSI MTIIATDLPVSDRQLKRIIRRAGVGIARTGAYTGHGSGEVMIGFTTAGRLQGKDSPEIMT STCIREDLINVAFKAAAEAVNEAILNSMTAAGRTGGLAGEVYYSLSEFLPQILKH >gi|157101601|gb|DS480723.1| GENE 24 26468 - 28579 2129 703 aa, chain - ## HITS:1 COG:YPO2344 KEGG:ns NR:ns ## COG: YPO2344 COG3283 # Protein_GI_number: 16122568 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulator of aromatic amino acids metabolism # Organism: Yersinia pestis # 341 597 200 428 525 128 34.0 5e-29 MDRSQPKYHVLIVSINSSIRLEFQSYLHKILGAHIAFDTIGLDQVQRPEQIKGYQCILFS SPLVQSEFPISIPKEVMQIICTRTFNHACLDQIIRIPPGERVYIVNDTYDSIVTIMDQLK EAGVVQYRFEPFYPGCIQADESIHYAITVGEPQLVPSHITNVIDIGNRIIDISTVNELCE YFHLPASLSNQITKSYVNSIMQIAKLTSAYYQDYIYSRQLLQTVISNLPIGLCLLSVRGE INMVNRRFSMDLELPETGTTGRNLSQFLPQPCKGMDFCHSADYKVMNRSGKHLLLSSLEL LLPNHEPLYLLNSTPCPVLADGTGDYPPLPDPVFSDSRYADSMFTSFTTASEPMKNILEY ARRLSLYDFPVLIQGESGTQKKLLARAIHKTSSRRQHPFVCLNMPMGLNAAARINGVSGP NEVSGPDSAPGPGVLSDPGGTSCLSGAALTVSDPLSQMLGLFKEANHGTLLIDSAEYLSP EIQGLLIHILQDNRTGNVFGTRFSPFDVRIIATAGQDLYGMVKQGTFREELFFLLNAASI DTLPLRKRREDIPLLMDQFFRELFHNKSLMARDLLSPSLFDFMMDYDYPGNVQELYNLTR YFFSHYAAHPLILAQLPSYIRNQMRKSDQDHSPVRLRVLAAIAASPRIGRGAIKNALAEA GVELSDGRLRGLLKELSEEGLIRINRTRGGCEITEQGLMSLRH >gi|157101601|gb|DS480723.1| GENE 25 28669 - 28734 58 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLYGKKNDLFDKAAQKRVKLL >gi|157101601|gb|DS480723.1| GENE 26 28815 - 30500 1944 561 aa, chain + ## HITS:1 COG:ECs0909 KEGG:ns NR:ns ## COG: ECs0909 COG0747 # Protein_GI_number: 15830163 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Escherichia coli O157:H7 # 63 559 29 511 512 351 39.0 2e-96 MKKSGKKLCALALAAAMLGLTACGSSSGTTDTTTAAAGGAAATEETTAAAAVAEGESVSQ 
ETDIIGAVNVDFTTLDPLDTSDTLSGGVQRLIMDGLFGFDDEMKIIPMLATEYTANDSAT EFTIKLREGITFSDGTPWNAEAAKANFDRWADKSLGLKRTTLLCNILDTTTVQDDYTIVV KLSSPFGAFIETLAHPACVVMSPKVIEQGSTACAEAPIGTGQYTFKEWIQGDHLTVELNK DWWGYDADICGGTALADKDAGFKSITLKPVPESATRVAMIQSGDAQLIWPVPDENLAGLK SDPNVIVGQDQGLVVRYLMMNNQKKPFNDKKVRQAINYAINKEAYVAVVKNGNSEPGTSV IAPKLQYYKGNDPYPYDIEKAKSLLAEAGYPDGFKTTLIFANTSANMKQGEFLKQQLAEV GIDVELISLESAIVNEKVQDTDAPGAEAEVDMYIIGWSSSTGDADWGIRPLLAKESEPPM SYNICYFENEELDGYLKDGLNTADRDKRAEAYAKAQDLIWEESPLVCLATDFNSWATSNK MVNVKMFPDNCLNIRNGKMTK >gi|157101601|gb|DS480723.1| GENE 27 30600 - 31520 1029 306 aa, chain + ## HITS:1 COG:RSc1381 KEGG:ns NR:ns ## COG: RSc1381 COG0601 # Protein_GI_number: 17546100 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Ralstonia solanacearum # 1 306 1 306 307 344 57.0 1e-94 MLRYICKRVVGVIPTLIIVVTFVFFFVRLIPGDPARQVAGPQATKEDVELVREQLGLNKP LPQQYISYVTGLVKGDLGTSLRTKRPVLTEVQGRYMNTVQLTLLSLTWSVIAGVLIGVWS GRNRSKWQDYTGMTVAVSGISMPSFWIGFMLIMVFAVKYRLLPTTGAGSFKNLILPSITL GVSIAAIIARFTRSSVIEVLKEDYVRTARAKGLREKTVIWKHVFRNCMISVVTVVGLQFG FLLGGSVVTETVFAFPGLGSLLIESVNYRDYPAIQSLILIFSLHFVLINLVVDVLYAVLN PEIQLS >gi|157101601|gb|DS480723.1| GENE 28 31537 - 32415 1153 292 aa, chain + ## HITS:1 COG:RSc1382 KEGG:ns NR:ns ## COG: RSc1382 COG1173 # Protein_GI_number: 17546101 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Ralstonia solanacearum # 13 290 18 296 299 331 60.0 1e-90 MAKNKVLTEKTDINTPFSEFVRKFKKQKTAMVAMVFLVFLVVLAFISYKIAPYGINEYDY SAIMQPPTAAHLFGTDEFGRDLFSRVICGTRISLSVGLFAVTVGMLVGTIMGLLAGYYGG IVDSIIMRICDVLLAFPGLILAIAIVAILGSGLYNVVIAVAVFNIPKFARLVRGNTLETK NSVFVQAARNIGAGDMRILFKHILPSAIPNIIVQYTMSIGTSIITASSLSFLGMGAQPPT PEWGLLLSNGRNYMLTSWHITLFPGLAIFFTVLCFNLLGDGLRDALDPKLTD >gi|157101601|gb|DS480723.1| GENE 29 32427 - 33422 
1110 331 aa, chain + ## HITS:1 COG:DR1568 KEGG:ns NR:ns ## COG: DR1568 COG0444 # Protein_GI_number: 15806577 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Deinococcus radiodurans # 2 321 27 360 386 387 57.0 1e-107 MENTNNNILEIKNLHTYFYTDSGVIKSVDGVDIELREGTTLGIVGESGSGKSVTALSVMG LLMGTTGKVAEGEILFEGRDLTKLDDEERRKMRGEKISMIFQEPMTSLNPVMKIGDQITE CILMHNNISKQEAWDKAVEMLKLTGVPRVERMMKEYPFQLSGGQRQRVMIAMALVCKPKI LIADEPTTALDVTIQAQILDLMENLKQKTGTSILFITHDLGVVAEVCDDVVVMYSGRVVE KGDVRSIFASPSHPYTKGLLASIPKLGECAEELESIPGNVPNPKYMPQGCKFAPRCSCAF DKCREEEPGFYDVGEGHMSRCWLCEKKGGDA >gi|157101601|gb|DS480723.1| GENE 30 33422 - 34414 1156 330 aa, chain + ## HITS:1 COG:BH3645 KEGG:ns NR:ns ## COG: BH3645 COG4608 # Protein_GI_number: 15616207 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Bacillus halodurans # 1 322 1 322 322 388 58.0 1e-108 MADILLNVEGMKVYYPVKGDFGKGREFVKAVDGVSFEVRKGEVFGIVGESGCGKSTLGRG ICKLETPTAGKIVLSGEDISSYGRKQMRSVRKKVQMVFQDPYASLNPRMSIFDIIAEPLI IHGLTKSKKELEDRVMELLRKVGLDDYHANRYPHEFSGGQRQRIGIARALAVEPELIIAD EPVSALDVSIQAQVLNLLHQLQKEFNLTYIFVAHDLSVVEHISDRVGVMYLGNFVEVGDK RKLYSNPLHPYTQALLSAVPVPDPTAKKDRIILEGSIPSALNPPSGCKFHTRCPRCMDIC KKKAPERYQVSDDHYVYCHLYDDKIKENKG >gi|157101601|gb|DS480723.1| GENE 31 34411 - 35211 929 266 aa, chain + ## HITS:1 COG:mll6661 KEGG:ns NR:ns ## COG: mll6661 COG2362 # Protein_GI_number: 13475561 # Func_class: E Amino acid transport and metabolism # Function: D-aminopeptidase # Organism: Mesorhizobium loti # 1 265 1 265 265 208 43.0 7e-54 MKLFISADIEGCAGVALAYETHKNEAAYGEFAKQMTKEVVAACEAAHEAGADEIVVKDGH GDATNIDPLCMPDYVTLIRGKSGHPYNMMSGLDDSFDGVMYIGYHAPAGNPGFAISHTST GNSLYIRLNGSCMSEFMLNSYTAASHKVPVLFLSGDSTICGLAREMVPDITTAVTKTGLG ASTYCKAPGQVEESIRQGVKKALAGNLSRCSVELPETFTYEVTYKDWKKAYQMSFYPGMR AVDTFTNRLETARWMDVVTAHCFVIY >gi|157101601|gb|DS480723.1| GENE 32 35240 
- 36328 1291 362 aa, chain + ## HITS:1 COG:BS_ysdC KEGG:ns NR:ns ## COG: BS_ysdC COG1363 # Protein_GI_number: 16079934 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Bacillus subtilis # 3 359 8 361 361 182 34.0 8e-46 MDIEMFGKISDAFGPSGFEEEVVRTVAGYCGKYQVENDAMNNLYVRMPGRAGDKPVVQLD AHLDECGFMVQCIHDNGCLGILMLGGFHLTNLPAHTVIIKTRAGRKIRGIIMSKPVHFMN DKERASLELCIEKLYVDIGATNRREVEEDFGVSIGDPMVPEVSFSYDEEHGLCFGKAFDN RAGCACIVDTMDKVYDSRNELAVDVVGAFAAQEEVGMRGATVTTQVVKPDLAILFEGSPS DDFFFSATQAQGRMKNGVQIRRMDKSYISNPVFMEYAMELAGKFGIRYQEAVRRGGSTNA GRISLIGKAVPVLVLGVPSRYVHSHYNFCAKDDLEAATELAAQVIKGLDGERIRHILRQD IL >gi|157101601|gb|DS480723.1| GENE 33 36342 - 37085 684 247 aa, chain + ## HITS:1 COG:FN0505 KEGG:ns NR:ns ## COG: FN0505 COG2071 # Protein_GI_number: 19703840 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Fusobacterium nucleatum # 1 239 1 240 243 167 37.0 2e-41 MERPVIGIMGNTYMTQPGMFDSMERAYQNSYYVDAVMKNGGIPVILPASAVMEQTEEIMG ICDGILFPGGEDMTPSYYGEDPHPAIQVYKPEIDEALMRAGRYALEHKKPMLGICKGNQL LNVLMGGSLYQDLSLKGPDCIRHLQLGRRDYLTHQIRVEEGTRLSKLLGSGVCMTNSMHH QSVKELGKGLRASAYANDGIIEAIEDQEGMIVGVQWHPESLLESAPAMNHLFSDLCGRAL ERKAGSL >gi|157101601|gb|DS480723.1| GENE 34 37196 - 37807 399 203 aa, chain + ## HITS:1 COG:CAC3340 KEGG:ns NR:ns ## COG: CAC3340 COG2357 # Protein_GI_number: 15896583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 188 20 203 217 151 42.0 1e-36 MTNEQYYELIKPYEDASRVLSTRLEILNHSLYGDRSDAGPIHNIQGRIKEKKSMESKLVR LGFTAGVTNARNHLQDIAGLRIICYFVEDIYNLTAALKKQSDLVIIKEKDYIRNAKPNGY RSYHIVLGIPVYFLDTMEYFPVEVQLRTMAMDFWASMEHRVCYKKQPRNRERLEQDFCRY ARILEEIEGEFETHNERRGSDGG >gi|157101601|gb|DS480723.1| GENE 35 37797 - 38717 934 306 aa, chain + ## HITS:1 COG:BH0025 KEGG:ns NR:ns ## COG: BH0025 COG2421 # Protein_GI_number: 15612588 # Func_class: C Energy production and conversion # Function: Predicted 
acetamidase/formamidase # Organism: Bacillus halodurans # 20 305 9 294 300 230 43.0 2e-60 MEVKQAETRRTEGKRVEGKTIFAMSSSAVPVCRAGSYETLTFVTKDCFDNQFTEEGDVLD SLNWDCINPATGPVYVEGAMPGDVLKVKIEDIRVASWGTMAAIPDNGVLGDCVSRGTVKR IPVRDGVAWFNQDIGIPCAPMIGVIGVAPEKGEIPCGEPGSHGGNMDNTKIKAGATLYLP VFHEGALLAMGDVHACMGDGEIMVTGLEIPAEVTVTLEVLKGISIDNPMLEDGEACYTIA SHENVETAVYTAVKAMTDILMRELGMSLEDAGMLLSAMGNLQFCQVVDPKRTVRMEMKKT VLKKLF >gi|157101601|gb|DS480723.1| GENE 36 38973 - 39866 835 297 aa, chain + ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 2 276 1 275 280 214 37.0 2e-55 MLKIWICGAGGRVGRKMTDILASRPVELLLTDADSVDITDSEAVMEYAHINRPHYIVNCA GLTDAAACEDCPEEAYRVNALGARNLSVAARMGKSRLVQMSTDDVFDGRSLVPYTEFDPV SPRTVYGKSKMAGENFVREFCNRHIIVRSSWIFGDGSPYLERILEMAGQGKTIRAASDQM ASPTGADGLAAKIIELMEHGEDGLYHVTGQGCCSRYELAKEAVRLAGYQVPVEPVNASED TLSSMRPSYSVLDNMMLRISNMRLLPHWKVMLEEYMKERRAGSRRRAEGKDGIGYGK >gi|157101601|gb|DS480723.1| GENE 37 39856 - 41136 1492 426 aa, chain + ## HITS:1 COG:VCA0088 KEGG:ns NR:ns ## COG: VCA0088 COG1301 # Protein_GI_number: 15600859 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Vibrio cholerae # 1 423 4 424 424 335 47.0 7e-92 MENRKRPGLTTAIFIGLALGAAAGILLHYAVPEGDIRDKLLIDGLFYLIGNGFLRLMQML VVPLVFCSLVCGSMSMGDTKKLGKIGIKTLGFYLATTALAITAAILTAHVINPGAGLDLS SLETAQVEIGEKEGLADTLLAIIPVNPVRALADGDMLSIILFALIVGIILAGLGEAVQTV GNLFSQFNDVMMQMTVMVMKLAPYGVFCLTARTFSGIGFDAFLPLLKYMAGVMTALAIQG LVVYMAILKGFTGLSPVRFLKKFLPVMGFAFSTATSNATIPMSIDTLHRKMGVSRRISSF TIPLGATINMDGTAIMQGVAVVFASQAFGIHLSVADYLTVILTATLASVGTAGVPGVGLI TLSMVFASVGLPVEAIALIMGIDRILDMSRTAVNITGDAVCTTVVAFQDGAVDKAVFDDM EAGRGG >gi|157101601|gb|DS480723.1| GENE 38 41267 - 41602 359 111 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942046|ref|ZP_02089361.1| ## NR: gi|160942046|ref|ZP_02089361.1| hypothetical protein CLOBOL_06933 
[Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06933 [Clostridium bolteae ATCC BAA-613] # 2 111 1 110 110 162 99.0 8e-39 MVPEQEGNMNIFGFEIKSKEEREQEEREYLHRIFPGGTAQKAAVEQQLKEKLPKEDKKAV MLYYILVKDAMTAGNGMSFEEAVGKVSKKQRILKLTPVMLEKVREVMEDNQ >gi|157101601|gb|DS480723.1| GENE 39 41718 - 42299 326 193 aa, chain - ## HITS:1 COG:no KEGG:Closa_1713 NR:ns ## KEGG: Closa_1713 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 40 190 25 174 177 126 49.0 4e-28 MYKGNFENNFNFENKDNNDTFEHRNNLENIECQEPPRYCCEEKKCVLRPNRTFMKCSVPV TTNIPMTTASGTTFNLANLNIDTSKFHKPCIKFEFTSNILTAAGSLTFNFQINKQCKYQP TSFPIGPVWTFSRLASSVGENDTFNFSVCDCDICDDECCNFSVVVTIESLETQGGITVNN ATLSAIVVDNPFC >gi|157101601|gb|DS480723.1| GENE 40 42922 - 44280 1282 452 aa, chain + ## HITS:1 COG:CAC1610 KEGG:ns NR:ns ## COG: CAC1610 COG1114 # Protein_GI_number: 15894888 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid permeases # Organism: Clostridium acetobutylicum # 3 442 2 430 440 333 46.0 4e-91 MKQKLSKQNLLLVGFTLFSMFFGAGNLIFPPFLGAQAGVMTWKAMSGFLLSAVGLPVLGV AAVALSGGISKLAGRVHPAFAFLFTLLIYLSIGPCLAIPRTSSTSFEMTVFPVLEQMGVS LNSTVSGTGFTVQTMAQFGYSVVFFTAAMAVAFRPEKLTDRLGKILCPTLLVLISVIFIG CLVWPMGHYGDPGSVYKSGPAVAGFLEGYQTMDTIAALNFGIIIAINIRAKGVEQEGAIV RETIKAGIIAGILLALIYAALAHIGAPAGAAARGMDNGARILTYVAGNLFGNAGMMILGL IFLIACFNTCVGLLSCCSQYFNSIIPVIGYRVWVFLFALISLIISSAGLNKILAVSVPVL NAIYPIAIVLIVLAFLGPLTKRWPAMYPAAILFTGAVSVVYALEQSKFVIPFLTQATTFL PGYGAGLGWIVPAFAGMAAGIIYSSVMGDQEE >gi|157101601|gb|DS480723.1| GENE 41 44316 - 46046 1998 576 aa, chain + ## HITS:1 COG:CAC2232 KEGG:ns NR:ns ## COG: CAC2232 COG0608 # Protein_GI_number: 15895500 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Clostridium acetobutylicum # 5 574 3 587 587 566 47.0 1e-161 MTEARWMLYAKKADFDRIAEQYHISPVTARIIRNRGLEDMEDIGRYLYGSVEQLHDPHLL PDMDRAVEILGEKAAEGRRIRIVGDYDIDGVCSTYLLYRALKRIGAEVDYEIPDRIKDGY 
GINELIIEAAAEDGIDTILTCDNGIAAISQIARAKELGMTVVVTDHHDILTEAGETPEGR EILPPADAVVNPKRRDSLYPFPDICGGMVAYKLVQVLYEVHRVPREEWLSLLEIAAIATV GDVMKLQGENRILVKEGLSRLGHTSNLGLRKLIEKNNLQAGTITAYHIGFVIGPCLNASG RLQTAKIALGLLLCEDEGEADRMAEELKALNDQRKDMTQEGIEQAAAMVDELYREDKVLV VFLPDCHESLAGIVAGRLRERYNKPSYVLTRAEGCVKGSGRSIEAYHMFQSLVEVQDLLL KFGGHPMAAGFSLEEKDVDEFRRRLNENARERLTEEDFIPKVWIDVAMPFEYITEPFIEE LELLEPYGQGNEKPQFAQKGLFVRSARVMGRNRNVVRISLVNDRGTAMDGVIFTDGDLFM EEKGDCRVMDIIYYPGINEYNGNRNLQIVVKNWKFH >gi|157101601|gb|DS480723.1| GENE 42 46155 - 46526 626 123 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942052|ref|ZP_02089367.1| ## NR: gi|160942052|ref|ZP_02089367.1| hypothetical protein CLOBOL_06940 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06940 [Clostridium bolteae ATCC BAA-613] # 1 123 1 123 123 99 100.0 9e-20 MKKTIAKSMTAIMAAAMIAGSLTACGGSDNKTESTTAVETTVEASESVEETTAEAADAEE TTEAADAEESTEAADTEESTEAADAEESTEAADAEESTEASKEEAASEANAEVAEETTEA AQN >gi|157101601|gb|DS480723.1| GENE 43 46796 - 48043 1504 415 aa, chain + ## HITS:1 COG:SA1915 KEGG:ns NR:ns ## COG: SA1915 COG0112 # Protein_GI_number: 15927687 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Staphylococcus aureus N315 # 6 415 1 412 412 508 62.0 1e-144 MVNEVMDFITGYDKEVGEAIQAECARQRRNLELIASENIVSEPVMMAMGTVLTNKYAEGY SGKRYYGGCQCVDVVETLAIERAKKLFGCDYANVQPHSGAQANMAVFVAMLKAGDTVMGM NLDHGGHLTHGSPVNFSGLYFNIVPYGVNDQGFIDYDELERIAKEARPKLIIAGASAYAR TIDFKRFREIADEVGAYLMVDMAHIAGLVAAGEHPSPIPYADVVTTTTHKTLRGPRGGMI LANKEAAEKFNFNKAIFPGTQGGPLEHVIAGKAVCFAEALKPEFKAYQHQVAANAKALAQ ALKDEGFKLLTDGTDNHLMLVDLRGMEVSGKELQNRCDEVYITLNKNTVPNDPRSPFVTS GVRIGTPAITTRGLKEEDMPKIARCIWLAATDFENKADYIRSEVTKLCERYPIYQ >gi|157101601|gb|DS480723.1| GENE 44 48206 - 49366 1311 386 aa, chain + ## HITS:1 COG:BH2731 KEGG:ns NR:ns ## COG: BH2731 COG3835 # Protein_GI_number: 15615294 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Sugar diacid utilization regulator # Organism: Bacillus halodurans # 15 
379 7 367 371 138 27.0 2e-32 MELTGQALRLSRQGAQAIVEEVGAIVGQHINMMDPEGYVIASTDPERIGTLHEGARKIIR EHLDELYVRDEEATLTSRPGLNLPVYQAGSIAGVIGITGRYEDVAGYGQIVKKMVEILIR ENAEQDERRMGQRVLNRFLEDWVLGSGLLQPRILEERGFALGIDISVPRRVMVSGVRNFQ EYASTASGQKLLEQAEGEAASLAGAGCLVFRNAGRQIFLVQGASDRHMEQLAGRIRSCVK KKFGIGLVMGIGGRARDIHVAYGQANKAWRSASMSKRDVMTYDSVTLELFTEDVAESVKA EYLRKVFKNCGYEELCQWMGLLEAYFSAEGSLKAASDAMHIHKNTLTYKLRKLEELTGYD VRVLSQSAVFYMAMVFFRDVKEMMQD >gi|157101601|gb|DS480723.1| GENE 45 49528 - 50673 1020 381 aa, chain + ## HITS:1 COG:STM0525 KEGG:ns NR:ns ## COG: STM0525 COG1929 # Protein_GI_number: 16763905 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Salmonella typhimurium LT2 # 1 381 1 382 382 288 44.0 1e-77 MKFLFASDSFKGTLSSERITQLLEAAAEQTFPGCETAGVPVADGGEGTIDAVISALNGTY RRLTVHGPLMEEAQSFYGEFEGDSAIIEMAAASGLPMVPADRRNPLNTTTYGTGELIKDA LDRGYRKISIAIGGSATNDGGMGAMRALGVRFLDDQGNELEGRGRDLTRVADIDVSRIHP AVSQSRFTVMCDVTNPLTGPDGATYTFGKQKGGTPEILDVLEGGMIRYAALLKEKLGVDV DKIPGAGAAGGLGAAFCVFLKANLKSGIDTVLDLVHFDSLLEGVDLVITGEGRMDWQSAF GKVPSGIGKRCRERGIPVAAIVGGMGDKAETIFDYGIDSIITTINGAMDIEEALKRAEEL YISAADRTFRMLKAGMSLGRK >gi|157101601|gb|DS480723.1| GENE 46 50987 - 51115 147 42 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942056|ref|ZP_02089371.1| ## NR: gi|160942056|ref|ZP_02089371.1| hypothetical protein CLOBOL_06944 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06944 [Clostridium bolteae ATCC BAA-613] # 1 42 1 42 42 73 100.0 3e-12 MKKMNDPGGPVMKPLAELTLLDRFLFACVMEDRGTMELCRRE >gi|157101601|gb|DS480723.1| GENE 47 51067 - 51339 168 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942057|ref|ZP_02089372.1| ## NR: gi|160942057|ref|ZP_02089372.1| hypothetical protein CLOBOL_06945 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_06945 [Clostridium bolteae ATCC BAA-613] # 1 90 1 90 90 154 100.0 3e-36 MCYGGPRDHGTVPEGVSGELIELLHYIEHTSETEAAGSSSPRIKELHRRVSQVKASEEIG VRYMQEWEERMYQLQDAKAEGRENRRTDIG >gi|157101601|gb|DS480723.1| GENE 48 51332 - 51412 60 26 aa, 
chain + ## HITS:0 COG:no KEGG:no NR:no MDKLVRMFSLSRDTAGEYLKRYKGTP >gi|157101601|gb|DS480723.1| GENE 49 51631 - 52806 1159 391 aa, chain - ## HITS:1 COG:BH2935 KEGG:ns NR:ns ## COG: BH2935 COG1228 # Protein_GI_number: 15615497 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Bacillus halodurans # 1 386 1 386 394 149 29.0 1e-35 MKRTILNAGMILCGEQLTPLENGSLIIEDGIIREILPRKAREALYLEDAKEISLPHMTLM PGMIECHNHLCIDAALPEHLELLAWSNECQLTLLALKGLKEDLMSGVTTARCMGDKFYID VTLKKLIEDKKVDGPRLLAAGIGMKGSHGAGHIGMPHCGAGEIRHTCRENLKKGADLLKL FITPGVPDPASTFVPSFLSLEEISAAVSEGARLGIPTAAHCIGGQGLKDCIDGGVDVIEH MYMASEQDVERLAASRCVVDLTSGIFLDPSREEFLSPANALKVRLNRSRVRENVARIIKA RLPFVLGTDAYHGFLYREVGYAVELGADPVTAIQGVTSRAARVCRLEDRIGKLENGMEAD IIAVEGNPLEDVSRLSQVGLVMKGGVVYRQP >gi|157101601|gb|DS480723.1| GENE 50 52848 - 53852 1190 334 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2766 NR:ns ## KEGG: CDR20291_2766 # Name: kdgT # Def: 2-keto-3-deoxygluconate permease # Organism: C.difficile_R20291 # Pathway: not_defined # 1 304 7 310 324 274 50.0 5e-72 MKRFPAGVMVIPLLLGCAMNTFFPNALTIGGFTSGLFKNGVPTLIGLFLFCSGATIDVKM AGSTVWKGVVLTALKFFIGFGLGLLLNALFGEAGFLGLAPLAVIGAVTNSNGVIYATLAG EFGDETDVGATSILALNDGPFFTMIALGASGMGNFPITDIIASIIPMVIGFIIGNLDHEW RKILATGMILLPPFNGFALGAGMNFNNILRAGISGIVLGLLTVLATGLLTFFLYSALRRK ADPMGAAIGTTAGVATTTPTAVAMADPSFQPQVETATAQTAAAVVITAVLCPLLVSWLSK VSRKWNEAHGIAADIPVPEQEGCETQPAATADLD >gi|157101601|gb|DS480723.1| GENE 51 53767 - 53973 98 68 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVRALGKKVFIAHPRSRGITITPAGKRFIVANILINLFSFTWASMLTGHVRCVSRWLILH AIAMLLTD >gi|157101601|gb|DS480723.1| GENE 52 54211 - 56259 1732 682 aa, chain + ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 45 555 47 554 627 363 
39.0 1e-100 MIKIVFFAPYPDILPVIEQVFRERPESEAIQYEVVQDFYNNRLEQVEADVVIARGFTAHT LKRKNIPCAELKSTGYDVMKAVNECMQKGNIERIAVVGAFNMVYGAEQMCHLYPNLGLGC YAEDDETKLEMAVHQAMKDGNQAIVGGYSTVQIAERLGMPAAMIHSGIEAVNHAISEAIA VAHLTRYERQKRDEIANIMNYSFQGIISVNRKGIITLANTCCHTYMKDRKTSLAGEHIKD FFPDIDFDSVIRDKQKILSEVCRFGGKQVLVNCVPVAGDSEEFGCVLTFQGTEQIQAEEG KLRKRMHSDGFTARYDFSHILYRDSCMEAVISQAVKFSYSDSNILIHGETGTGKELFAQS IHNSSRRRKGPFVAINCAALPENLLESELFGYVEGAFTGASRGGKMGFFEIAHKGTIFLD EIGDISPKLQSRLLRVIQEREIIRLGNDTVIPIDVRVICATNRDLKKEVSRGNFREDLLY RLDVLELNLPPLRKRKQDILYLADRMVRFEHERTGSRLEAITQEGRELLMRYNWPGNVRE MRNFCERICILCEKTRAGAEDVLQALPGEWEHGEGADSQYADSQYAESRHAESRYAESRY AESRYAESRHADSGSGSADFPASGSHVPAEAAGRGPRLEEAQRQAVKDALELCGYHRGRT AAYLGIDKSTLWRKMKKYGIEG >gi|157101601|gb|DS480723.1| GENE 53 56570 - 57574 373 334 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 3 327 5 326 346 148 31 2e-34 MSGKPILGILLGDAAGVGPEIIVKLAASRFYDEYCRPVVIGDVRVFERGARICGLEIPVQ VIDKVEEADWTKGMPVLDQRDVDPEQVPFANISIESGRACLNMLKTAIELYQAGRIDGFC FAPLNKQAMIQAGCPFESEHHYMAHLFGHTEPFGEINVLGDLWTTRTTSHIPISKVSDSL TVDTIMRAIRLANVSLKNSGIEKPRLALAALNPHCGEGGKCGREEIDVIAPAIEKAREMG IDANGPFPSDILFIKAFNGDFDGVVTMYHDQGQIALKLKGFDQGITIAGGLPAPIVTCAH GTAYDIAGKGIVKTSAFENAVKMAAKMAAHLKGC >gi|157101601|gb|DS480723.1| GENE 54 57593 - 58042 465 149 aa, chain + ## HITS:1 COG:no KEGG:SpiBuddy_2979 NR:ns ## KEGG: SpiBuddy_2979 # Name: not_defined # Def: hypothetical protein # Organism: Spirochaeta_Buddy # Pathway: not_defined # 2 144 4 145 152 105 46.0 5e-22 MKKIGTKQIVPLVLAVFAVVFAVVGFTQLGFWEDVDGPQPGFFPAIMAIVMFLASIASFF QSLKEEKAARYERDEMMVIAGGAGIIAGSFIIGLLPSCYLFVILWLKVFEKTGWKETLIV LAVCMAISIGVFRMWLGVHFPMGLFEAFL >gi|157101601|gb|DS480723.1| GENE 55 58058 - 59563 1871 501 aa, chain + ## HITS:1 COG:RSc0793 KEGG:ns NR:ns ## COG: RSc0793 COG3333 # Protein_GI_number: 17545512 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 1 496 1 492 500 
310 36.0 3e-84 MENIQLLMHGFSTLFTFSNVGAALLGAVLGLVVGAMPGIGSLAGVALLLPLTYKFNPTTA IIMLGALYYSNMYGGAFSAILLNIPGDSPAICTAMDGYPMASKKKRPGQALFTANMASFI GGTIGIIILTFTGPALADIGLKFGPSEMTALLLIAMTSISWLVGENPIKGVVITMLGILL ASIGMDTLSGSPRYDFGSMYLLGGIPFTPFIIGTVGFSQVIKLVMERSNEVKSEPHMKLT IRGSILTKHDFKRLLPPALRSGVMGTFVGVLPGAGATTGAFMGYAAQKKFKNEEELGTGA IEGIAASEAANNAAAAGAFAPLLALGIPGSGTGAVLLGGLMMWGLNPGPLLFTNEPEFTW GLIASLFLSNIFALVVAIGVIPFLIQILSVPVKYMIPIITIVCVVGSYSSSYSMYGVLIM FLSGILGYLLVKNDYPTAPMLLSFVLAKLLESNMRKAFIISGGSLGIFFTRPITCVLMLI FLAFICTPVVKAVLKKARASK >gi|157101601|gb|DS480723.1| GENE 56 59681 - 60673 1236 330 aa, chain + ## HITS:1 COG:SMb20771 KEGG:ns NR:ns ## COG: SMb20771 COG3181 # Protein_GI_number: 16265211 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 33 292 24 286 328 88 28.0 2e-17 MKKRSIVMAMAALMMVSAVTGCTSKAVSSDGTFTPKETINWTVTSSPGGGSDIYTRMISD IMTKENLVNGQPIIVTNKTDGSGEIGRNEVATTKGSKADYTLLTFNSGDLMPMVQNTKNR SSNFRILAIMAVDKQLIFKGEQTKYADFKEAIEAAKNGTKIVIGGSKGDDIATYDAMLEE IGISKDVMSYITYDSTGDAITAALGGHVEFVISKPAAASEYVTAGSLIPVLALSTERYTG NLADAPTLSEIGDYENVEVPVWRGVAAPAAMSDAAAAYWSEQLGKVAETDTWKNDYLEKN KLIGNYMDAAGATEYVTAYEKDFMAANGIQ >gi|157101601|gb|DS480723.1| GENE 57 61006 - 61842 874 278 aa, chain + ## HITS:1 COG:TM0416 KEGG:ns NR:ns ## COG: TM0416 COG1082 # Protein_GI_number: 15643182 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Thermotoga maritima # 1 269 1 265 270 160 35.0 4e-39 MKLCYQVATPDVAIADSVTAYQGSLEKSFGDLGRLGYDGVELMTLNPEKLDWNEVKETAE KNGLSVILVCTGEIFGQLGLSYTNPREEIRKEAIRRSREIIDFAGFLGANINIGRVRGQY CGELSREETENLAVEAFRELADYGAPRNVNIALETVTIMQTNFINTLAEGAAMVDRVDRP NFRLMMDIFHLNLEEKNIYEAIRRYSSYNIHVHLADNNRRYPGHCGLDFEKILTTFKECG YDGNFCTEIFQIPSMEEAAAGAIRHLRPIADRVYGRTV >gi|157101601|gb|DS480723.1| GENE 58 61891 - 62550 767 219 aa, chain + ## HITS:1 COG:no KEGG:SpiBuddy_2976 NR:ns ## KEGG: SpiBuddy_2976 # Name: not_defined # Def: Asp/Glu racemase # Organism: 
Spirochaeta_Buddy # Pathway: not_defined # 1 219 1 219 219 231 53.0 2e-59 MKSIALIHTVKSVASGFDDSLRNYLGYEVKIHNLWDDFLANNPNEIGEFTIQNRNRLFCD MKAQELTGADIIVTTCSTLTPAVELIRPFIQVPVIAIDDAMARRSVLYGKRIMVMATAES TVGPTVSKIKAEADKAGREADISTCVCMDAFFAMKAMDMKKHDMILREKAGKMKGYDCIV LAQASMAHLEDETASITGCPVLTSPRLCMEEIKKKLEEI >gi|157101601|gb|DS480723.1| GENE 59 62553 - 63380 869 275 aa, chain + ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 9 272 7 270 270 278 51.0 1e-74 MTNERKEELQRQCSKFRNDLIDLLYSIQTGHPGGSLSCTEILTSLYFEIMNVNPEDPEME GRDHLILSKGHAAPMLYLVLAEKGFFPKEELKTLRQMDSMLQGHPCVHKTPGVELSTGPL GLGLSAGLGMAMADRIKGLDSYTYVVMGDGEIEEGCIWEAAMSASKFGADHLIGILDNNG VQLDGTLEEIMPMGDIGAKWKAFGWNVIPCDGHDVEDFCRAVEEAKKTKGCPSLILAATV KGKGVSFMEGKNTWHGKAINDNEYAQAKAELGGAR >gi|157101601|gb|DS480723.1| GENE 60 63380 - 64330 1357 316 aa, chain + ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 6 293 5 293 309 265 49.0 6e-71 MGENIAIRDAYGAALKELGEQNEKIVGLEADVASSTKSGIFGKAFPERYFNVGISELDMV SMSAGFAREGLIPYVNTFAVFLTTRGADPIQSLIAYDKLNVKLCGTYCGLSDSYDGASHQ AITDLAFVRAIPNMTVITVADAVETKKAVFAIAEHQGPVYLRLSRAAAPVFYPEDMKFEI GRGITVREGGDVTIITTGTVLHKALAAAELLEAKGIRARVVDMHTIKPIDEELIIECARE TGAIVTVEEHSVCGGLGSAVAEVLAEHMPVPMTRIGATDFAESGDYEQLLVKYGYGPESI AEKCEKVMKRKQVQKK >gi|157101601|gb|DS480723.1| GENE 61 64408 - 65319 989 303 aa, chain + ## HITS:1 COG:no KEGG:SpiBuddy_0718 NR:ns ## KEGG: SpiBuddy_0718 # Name: not_defined # Def: xylose isomerase # Organism: Spirochaeta_Buddy # Pathway: not_defined # 1 303 1 302 307 383 58.0 1e-105 MEQSMRRFMKVGIILHVSYPQLGGGEGPILECLERICGDDYFEAVEVARMKDGQVRKKAA EMIRAAHMVSAYGGQSRTLSAGLNINDLDETRRAMAVDTLKEGIDEAYEMGCAGFSFLCG 
RYDEGRKEQAFEQLLKSTRQLCGYAGEKGNMPICCEVFDYDIDKRALIGPAALAARYAGE IRRDYGNFGLLVDLSHIPMLHETIEESILPVKDYIIHAHMGNTVIKSPDCEAYGDNHPRF GFPDSENDVEELAHYLRTLMEIGFLGEKKRPIVSFEVKPWKDESPEVIIANAKRTLNLAW ELV >gi|157101601|gb|DS480723.1| GENE 62 65413 - 66315 1227 300 aa, chain + ## HITS:1 COG:no KEGG:Spirs_2575 NR:ns ## KEGG: Spirs_2575 # Name: not_defined # Def: xylose isomerase # Organism: S.smaragdinae # Pathway: not_defined # 1 300 71 368 368 437 66.0 1e-121 MKDPIQKYFQVGTIQWMTHPPVNYPILDSVKTICCDEYFNALEITHIEDQETKDKVRDML AQSHMKVCYGAQPRLLGPKLNPNDLDEEGRKKAEAVLMDSIDEARYMGAKGIAFLAGKWE PETKDQAYAQLLKTTRAVCSYAATKGMMVELEVFDFDMDKAALIGPAPYAARFAADMRTT HNNFGLLVDLSHFPTTYETSRFVIQTLRPYITHLHIGNAVVKEGFEAYGDQHPRFGFPDS ANDTEQLVDFFTVLKEEGFFNKENPYVLSLEVKPWADEDGDIILANTKRVINRAWALVED >gi|157101601|gb|DS480723.1| GENE 63 66410 - 67369 1141 319 aa, chain + ## HITS:1 COG:CAC2945 KEGG:ns NR:ns ## COG: CAC2945 COG1052 # Protein_GI_number: 15896198 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Clostridium acetobutylicum # 1 319 1 324 324 390 58.0 1e-108 MKIVVLDGYTENPGDLSWEGLERLGELTVYDRTPADKIAERIGDAEAVYTNKTPISAQTI GQCPNLKFIGVLATGYNVIDTAAAKAAGVIVSNIPTYGTDAVAQYAIALLLELCHHIGEH SDCVKAGEWTHNADWCFWKHPLVELAGKTFGVIGFGRIGQGTAKIAEALGMKVLAYDEYP NKALETDNCKYASLDQLLAQADVISLHCPLFPSTEGIINRDSIAKMKDGVKIINTSRGPL IVEKDLREALDSGKVSGAAVDVVSTEPIREDNPLLGAKNMIITPHIAWAPRESRQRLMDI AVDNLKHFVDGAPQNVVNK >gi|157101601|gb|DS480723.1| GENE 64 67513 - 69414 2010 633 aa, chain + ## HITS:1 COG:CAC0459 KEGG:ns NR:ns ## COG: CAC0459 COG3829 # Protein_GI_number: 15893750 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Clostridium acetobutylicum # 49 622 49 625 627 306 33.0 7e-83 MVTVRMFIPYMNMKSRFEAAVSRLEPQDDVRVELLHVFGTPESLSRYGDADILVARGMTY 
DRLRYLFPEKHVVEIQLSSFDILKALICARQEFHPKKIALCVRYMDESAVSELEKLCQAE IAYYTVHDEASTLEAIHSARANGADVFVGAGTMCGLCDKEALNRVHIHTKDIAIEQALKQ AMDAARTINMERARSKMTSTILNTSSDALIAVSGSGLIQALNNQAYRTFQLSSQADYTGR PVEEVCPALKWKDVVETGREREEVIQWKDRKLYTEYRPVLVDKMGKGAIIVARYTEQIIE AETKIRQSLAKQGLTAKYSFDDIIGSSPAIRENILMAKRYSRVDSNVLIVGETGTGKELF AHSIHRESRRSAEPFVALNCAALPENLLESELFGYEPGAFSGASKNGKTGLFELAHKGTI FLDEIGEIPISLQAKLLRVLQEREIRRIGSNRVQPVDVRVISATNINIEEKIQEGQFRAD LYYRLNLLDITIPPLRERGDDIREMVDFYLTRFACEMGKPIPRLSKEAVDLMTHYGWPGN VRELRNICERLIVLSDTTEIGLREIQMLKIFRKKEGPLPGTQTPDKETDQPETGIVYANL KPRKKKQDIAKELGVSRTTLWRMEKMAREQKKQ >gi|157101601|gb|DS480723.1| GENE 65 69508 - 70149 955 213 aa, chain + ## HITS:1 COG:CAC2606 KEGG:ns NR:ns ## COG: CAC2606 COG0698 # Protein_GI_number: 15895864 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Clostridium acetobutylicum # 1 213 1 212 213 304 71.0 6e-83 MRIALINENSQGAKNGMIYNSLKKVADQYGFEVDNYGMYTAEDEAQLTYVQVGILAAALL NAKAADYVITGCGTGEGAMLACNSFPGVICGHVEDALDAYTFAQINDGNAIAIPFAKGFG WGGDLNLEYIFEKLFCEPSGQGYPRERAVPEQRNKRILDDVRKASFREDLVEIFKGLDQE LVKGAFSGEKFGELFFGKCKDEKLAAYIRELIG >gi|157101601|gb|DS480723.1| GENE 66 70290 - 71633 1123 447 aa, chain - ## HITS:1 COG:BH0675 KEGG:ns NR:ns ## COG: BH0675 COG1472 # Protein_GI_number: 15613238 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Bacillus halodurans # 98 440 131 475 686 226 38.0 5e-59 MTTRGHSRKKRKQTGIKWLYITAAGLVCAVILAAILFKIRAGSMDSRRPAGTDASQETGA AGDRAGDGTGDPPNSENQPDPLSLSDKTNADNTDNADIRAVIRQMSLEQKAAQLFIITPE ALTGFKTVTQAGDATYKALQEYPVGGLIYFPQNLQNPGQLKTMTAKTQEFASGLNQPPLF LSIDEEGGKVARIANHQGFDLPKVDTMAEIGKTGDTSRAYEAGSVIGGYLEEFGINLDFA PDADVLTNPDNTVVLDRSFGSDPVLVTQMVKAYMKGLEEHHVYGTPKHFPGHGATEGDSH KGFAYTYKTWDELEQAELVPFAGLIQDNTPFIMAGHISLPQVTGDDTPSSLSSQVLTDYL RETMGYNGIIITDALNMGAIQDNYPPDRAAVMALQAGADLLLMPADFKEAYNGVLDAVKT GELTEERIDQSLTRILGLKLTLPRHKR >gi|157101601|gb|DS480723.1| GENE 67 71884 - 73122 1096 412 aa, chain + ## 
HITS:1 COG:BH0761 KEGG:ns NR:ns ## COG: BH0761 COG0624 # Protein_GI_number: 15613324 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Bacillus halodurans # 1 405 5 413 414 283 39.0 4e-76 MDINQDRLWSDIQELGRIGRSPDGGITRLAFTPQDRQAQDWLEARMRDAGLAVREDAAGN LIGELCGSRPEKPCVMCGSHYDTVPGGGQFDGTLGILSALEAVRRIREQGTVTERTIRLA AFKDEEGSRFGYGMVGSKSICGILDPEGLTSVDKDGISLEQAMADYGCRPGQLASCKMED VGTYLELHIEQGKVLEDHGASIGVVSGIAGLVRYTVEIRGESGHAGATPMKARKDPVPAM CRWIDRVTELAGARESCVATVGSITAYPGARNVICERVVFSLDLRSIRQEDLREIAEEME RYGAELSSLHGVEVSLRPDQALPPCRCHEPLREAIKNICRSEGYSFMELMSGAGHDCMNF KDVCPTAMIFVPSEGGVSHRKEEFTSREDCVRGAGVLLGLLLETAGVSDKID >gi|157101601|gb|DS480723.1| GENE 68 73137 - 74468 1632 443 aa, chain + ## HITS:1 COG:CAC2820 KEGG:ns NR:ns ## COG: CAC2820 COG2252 # Protein_GI_number: 15896075 # Func_class: R General function prediction only # Function: Permeases # Organism: Clostridium acetobutylicum # 4 442 2 428 429 306 46.0 7e-83 MEKLEEFFKLKERGTSVKIEVLAGITTFMTMAYVLVVQPGAIIGFGDAVSFTDINGLVIT KEAIAVTCAVISALITLLMGFYANLPFALATGMGTNFMFGALLQSNTLSFGAIMAITLIS GLIFVVLTIFGVRDLIVKAIPKNIKVAIGSAIGFYIAYLGFKNSGIGTFTNGIGMGNFKE PAVFIALLGLVIIAVLTAYRVNGAILIGIVAVTVMGIPLGVTTLPDTFAKIPDVSSWGNI VFCFDFKGLLSFQAIIYVFIAFCGDFFSTLGTVLGVAGKAGMLDEDGNMPGIQKPFLVDA IGTCVGACTGNTTITTFVESTSGVEAGGRTGLTSIVVAVLFAVMVFFSPLILMIPDAATG PALIFVGFLMVSGIRNIDFSDFTEAFGPFVMIMFVIFCGSIAAGIAAGILAYIIIKTATG KFKELHPVMYVLAVPLVMYFVFK >gi|157101601|gb|DS480723.1| GENE 69 74514 - 76292 1949 592 aa, chain + ## HITS:1 COG:BH0640 KEGG:ns NR:ns ## COG: BH0640 COG1001 # Protein_GI_number: 15613203 # Func_class: F Nucleotide transport and metabolism # Function: Adenine deaminase # Organism: Bacillus halodurans # 12 583 9 571 585 359 38.0 1e-98 MKIQPRDKRALLRAALGEIKSDLAIKNVQLVNVITGEIYPASVYVYDGFISHVEYKEPGR EEQFAKEVVDGQGKYLIPGLIDAHEHIESSMMTPRNFAKAVIPKGTTTVITDPHEIGNVW GIEGVRYMHEASEGLPMRQLIDIPSCVPAVPGLEHAGAEFGPAQIEELAKLERVVGLAEV 
MDYLDVIHGGDRMMDIIRTAEEHGLYLQGHAPFVEGRMLSAYLCGGPNTCHESRTAEEAL EKMRSGMRVDARDSSITKNVEAIWSGVKDFRFFDNFCLCTDDREADDILHNGHINDVVRA AIRYGMEPVAAIKSATLNSAREAGLQNLGAVAPGYAADMLLVDDLRELTPSHVFYAGKLV AQEGRLLAEIEDKSYPLESANSVHVRKLAAEDFTIHPPVSQGKVKVNLMKYYDMNLSTTD IVCEEVWVKDGRIDISGDQDLKFVAVVNRYEGNDNIALGLVRGFGTKTGALASTVSHDSH NLTIVYDNPEEALMAAEELARTGGGMCAVKEGKILHTLALPLAGLMSLKPAEELAEDSAV MKDANRALGLTNMENPLLRIVTLALPVIPDAKMSDMGLVQVTAKRIVPLFPE >gi|157101601|gb|DS480723.1| GENE 70 76451 - 80782 4377 1443 aa, chain - ## HITS:1 COG:CAC2401_1 KEGG:ns NR:ns ## COG: CAC2401_1 COG1924 # Protein_GI_number: 15895667 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Clostridium acetobutylicum # 8 667 5 663 663 826 60.0 0 MKLHTNTLTLGIDIGSTTVKVALLNQASHIVFSDYERHYANIQETLAGLLKKAREITGPI QVIPVITGSGGLTLSSHLEVPFVQEVVAVASALQDFAPQTDVAIELGGEDAKIIYFTGGI DQRMNGICAGGTGSFIDQMAALLQTDASGLNDYAEHYKAIYPIAARCGVFAKSDIQPLIN EGATREDLAASIFQAVVNQTISGLACGKPIRGNVAFLGGPLHFLPQLRESFIRTLRLTPE QVIAPDHSHLFAAVGAAMNPQDNQEAIPLETLIQRLSSGIKMDFEVKRMEPLFKDQEDYD RFCARHASNHVKSGDLSSYEGCCYLGIDAGSTTTKAALVGEDGTLLYRFYDNNNGSPLAT AIRAIREIKEQLPETARIAYSCSTGYGEALLKAALMLDEGEVETISHYYAAAFFEPDVDC ILDIGGQDMKCIKIKDGTVDSVQLNEACSSGCGSFIETFARSLNYNVEDFAREALFAKNP TDLGTRCTVFMNSNVKQAQKEGATVADISAGLAYSVIKNALFKVIKITNASDLGEHVVVQ GGTFYNDAVLRSFEKISGCPAVRPDIAGIMGAFGAALIARERYHMASEPETSMLSLERII GLKYSTSMTRCKGCNNHCVLTINQFGSGRRFISGNRCERGLGLEKAKKEIPNLFDYKYHR MFDYESLPMDQADRGTVGIPRVLNMYENFPFWAVFFKKLRYHVVLSPQSTRQLYELGIES IPSESECYPAKLVHGHVTWLIKQGVKFIFYPCIPYERNESPDAGNHYNCPMVTSYAENIK NNVEALAEENVRFMNPFMAFTNEEILTDTLVKEFSRTFQIPAAEVRMAAHAAWQELTASR DDLMKKGEETLAWLKEHGKRGIVLAGRPYHVDPEIHHGIPELITSYGFAVLTEDSISHLG RVERPLVVTDQWMYHSRLYAAASFVKTQDNLDLIQLNSFGCGLDAVTTDQVADILTHSGK IYTVLKIDEVNNLGAARIRIRSLIAALKVRDQRHFERRVVSSAYHRVVFTKEMKKDYTLL CPQMSPIHFDLLEPAIRSFGYKIEVLQNHNRNAVDVGLQYVNNDACYPSLIVIGQIMDAL LSGKYDLDHTAVLMSQTGGGCRASNYIGFIRRALEKAGMPQIPVISVNANGMETNPGFTF 
TVPLLTKAMQAIVYGDIFMRVLYATRPYEAEPGSANALHMKWKARCIASLSKKNAAMIEF GRNIKGIIRDFDALPRLDVKKPKVGIVGEILVKFSPLANNHIVELLESEGAEAVMPDLMD FLLYSFYNSNFKAEHLGSKKSTARLCNAGIALLEYFRRTARKELTASMHFTPPSAINELA QMAKGFVSLGNQTGEGWFLTGEMLELIHSGVNNIICTQPFGCLPNHIVGKGVIKELRRNY PLSNIIAVDYDPGASEVNQLNRIKLMLATAQKNLKTESSREDRGNGRRPVAAYSPCLGTM SHT >gi|157101601|gb|DS480723.1| GENE 71 81206 - 82147 1124 313 aa, chain + ## HITS:1 COG:CAC0294 KEGG:ns NR:ns ## COG: CAC0294 COG0598 # Protein_GI_number: 15893586 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Clostridium acetobutylicum # 1 313 1 315 315 307 53.0 2e-83 MIRIFKTEDGAMHEKEEMQPGCWIALTNPTASEIIDIADTYQIDPDHLRAPLDEEERSRI EVEDEYTLILVDIPSIEERNGKDWFVTIPLAIITTKDVLITVCLEETPVLTSFMDGRVRD FHTFMKTRFILQILYKNATQFLQYLRIIDKKSEVIERKLHQSQKNEELIELLELEKSLVY FTTSLRSNEVVLEKLLRIEKIKKYPEDTDLLEDVIVENKQAIEMANIYSGILSGTMDAFA SVISNNLNIVMKFLATVTIVLSIPTMIASFYGMNVNSHGMPFADSPYGFAIVLGLTLLLS LFVAYIFNKKDLF >gi|157101601|gb|DS480723.1| GENE 72 82183 - 83013 708 276 aa, chain + ## HITS:1 COG:no KEGG:Closa_0669 NR:ns ## KEGG: Closa_0669 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 272 1 273 281 283 54.0 4e-75 MRFFNKLERRFGRYAIHNLMYYIIIMYVMGFVMMYMDPMFFTRYLSLDAEAILHGQVWRL VTFLLYPPTLSPFWIIFAALMYYSLGRSLEAVWGAFRFNVFFFMGAIGIILAEFIVYFIW GWDMHLNTSYLSMSIFLAYAVMFPEEYFYIYFILAIKAKWLALFDAFFLVFGFVYGSLPQ RIAIFLSLANFIIFFLMTKDFKKYSPKEVKRKQDFKVKTMRPVNRTHHKCAVCGRTDEES PGMEFRYCSKCEGSYEYCMDHLYTHQHVKKSDSPKS >gi|157101601|gb|DS480723.1| GENE 73 83091 - 84527 1761 478 aa, chain + ## HITS:1 COG:CAC0518 KEGG:ns NR:ns ## COG: CAC0518 COG0469 # Protein_GI_number: 15893809 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Clostridium acetobutylicum # 1 478 1 473 473 431 48.0 1e-120 MKKTKIICTMGPNTSDKNIMMELARNGMDVARFNFSHGDYNEHQGRLELLKEVRKELDIP VAALLDTKGPEIRTGQLKDGKKVTLKEGQTYTLTTRELVGDDTIGYINYSGLNEDVAAGN 
RILIDDGLIELEVRQVKDTDIVCEVINGGELGEKKGVNVPNVKIKLPALTDKDKEDIRFG IRQGFDFIAASFVRTADCIKEIKAMLDEQGSSMKVIAKIENAEGIENLDAIIEAADGIMV ARGDMGVEIPAEKVPHIQKKIIRKCNEACKIVITATQMLDSMIRNPRPTRAEVTDVANAV YDGTDAVMLSGETAMGKYPVDALKMMVSIALETEMHLDYAGYRQRKVTEQNMKNVSNAVC FASVSTAHDLDADVIIAPSITGFTTQMLSKWRPGARIIGMSPSMATVRQMQLQWGVVPVW SRRAESTDELIENSVEELKNRGLVEEGELAVITAGVVTYARRHEAATQTNIMRVINIE >gi|157101601|gb|DS480723.1| GENE 74 84637 - 85905 1570 422 aa, chain + ## HITS:1 COG:SP1978 KEGG:ns NR:ns ## COG: SP1978 COG0019 # Protein_GI_number: 15901801 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Streptococcus pneumoniae TIGR4 # 3 414 2 414 416 552 64.0 1e-157 MDKKPFATLEQLREIERTYPTPFYLYDEKGIRENAARLKQAFSWNRGYKEYFAVKATPNP FLLNILKDMGCGTDCSSMTELMMSRACGFSGSDIMFSSNDTPPEEFAYAEKLGAIINLDD ITHIQCLEETLGHIPETISCRFNPGGLFKISNDIMDNPGDSKYGMTAEQIGQAFKILKEK GAKHFGIHAFLASNTVTNEYYPMLAKILFELAVKLKEETGVHIAFINLSGGIGIPYRPDQ EPNDILAIGDGVRRVYEEILVPAGMDDVALCTELGRFMMGPYGALVTKAIHEKHTYKEYV GVDACAVNLMRPAMYGAYHHITVMGREDEPCTRMYDVVGSLCENNDKFAIDRMLPEIKKG DLLFIHDAGAHGFAMGYNYNGKLKSAELLLKEDGTVEMIRRAETPEDYFATFDFCDILRN ID >gi|157101601|gb|DS480723.1| GENE 75 86011 - 86808 908 265 aa, chain + ## HITS:1 COG:CAC3252 KEGG:ns NR:ns ## COG: CAC3252 COG0345 # Protein_GI_number: 15896497 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Clostridium acetobutylicum # 3 264 4 267 270 229 48.0 4e-60 MVKIGFIGMGNMGNAILNGLLKTHRPEDMIFSAAHQDKMEAVTARTKVPHAGSNRECAKA VKYLILAVKPQYFDAVFSEIRDVVTPEQVVISLAPGVTISNITERLGGNVRVVRAMPNTP AMLGEGMTGISCREGSCTEEEKETVRDIFSSCGRVEMVEERLMDAVGCVSGSSPAFVYMF IEALADGGVKYGLPRKTAYAMAAQTVLGSAKMILETGKHPGQLKDEVCSPGGTTIAGVSA LEEHGLRNALIKAADACYEKTQSMK >gi|157101601|gb|DS480723.1| GENE 76 86849 - 87445 868 198 aa, chain + ## HITS:1 COG:VCA0764 KEGG:ns NR:ns ## COG: VCA0764 COG0494 # Protein_GI_number: 15601519 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP 
pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Vibrio cholerae # 24 179 24 176 185 122 44.0 5e-28 MEDKEKNTQIPVIRLDRQLKYEGNILKIYEDKVLANGHEARWDFIHHDGAAAVLPVADDG KILMVRQYRNALDRYTLEIPAGKLDAPGEPKVECAFRELEEETGYRVESPENLEYLMSLT TTVAFCDEAIDIFVAHNLIPSHQNLDEDEVINVVPCSLGELEDMIYTGKITDGKTIAAIM AYARKYCSEDKKDTMVKG >gi|157101601|gb|DS480723.1| GENE 77 87538 - 88080 556 180 aa, chain + ## HITS:1 COG:no KEGG:Closa_3613 NR:ns ## KEGG: Closa_3613 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 16 175 51 214 220 73 34.0 3e-12 MGMRWLRLRLGTLTYEDWLLFCFTAGIFCGTAAALAFGGPTVQGCILGAAGSFGSGNRGV KEYITVFRQRALETCGGWLAGVTVCSQMLFGFLTFYAGMSLAVVLSVLTIRKGLLGIAAF LCTVLPHGLVYLLVWYVLSGWSGQIQKKLHILPGLLLLIITGAGAFLEVFVSPFLWGLFP >gi|157101601|gb|DS480723.1| GENE 78 88120 - 88983 809 287 aa, chain + ## HITS:1 COG:NMA0616 KEGG:ns NR:ns ## COG: NMA0616 COG2510 # Protein_GI_number: 15793606 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Neisseria meningitidis Z2491 # 148 287 2 141 143 113 49.0 5e-25 MEWLLYAFGSAVFAGLTAVLSKIGVRDTDSDLATAVRTAVVLVFSWIMVFLTGSAGSIGT ISAKTCLFLVLSGVATGASWLFYFRALQLGDVNKVASIDKSSTILTMFLAALILGEGLGG MKILCMVLIGTGTLMMIQKRNQGVEKTGSRRWLGAALLSAAFASLTAILGKIGIQGVESN LGTAIRTCVVLAMAWLLVFLQGKQKHLDRIAPKSWCFLVISGFATGASWLCYYRALQEGP ASVVVPIDKLSILVTILFSRIFLGERLDRKSFAGLVLLTAGTLLLLL >gi|157101601|gb|DS480723.1| GENE 79 89006 - 90403 1094 465 aa, chain + ## HITS:1 COG:CAC1028 KEGG:ns NR:ns ## COG: CAC1028 COG1075 # Protein_GI_number: 15894315 # Func_class: R General function prediction only # Function: Predicted acetyltransferases and hydrolases with the alpha/beta hydrolase fold # Organism: Clostridium acetobutylicum # 63 465 78 479 479 417 52.0 1e-116 MKKKKTGRICLKRLFLALYIPYAVNIPLAACVWAGGIWPPEGAVSPAVRTGLLAVGLACT WLVWMAYNIMPRKKDFFASWRVTIMEGGRSLCYSALYGFAAQAAVLLWLYPKAYRAVHDQ RVLWINGIYSAIMLFILLWNGILRIFFTSVRLRLKYRILMLLAMWIPGLNLGVLLYAMRI VHGEYDFACYKESVRQVRAQSQICSTRYPLLLVHGVGFRDLRYFNYWGRIPRELARYGAD 
IYYGNQEAFATVAYNAGDIYRKIQEICRETGCEKVNIIAHSKGGLDSRYAISRLGAAPMV ASLTTINTPHRGCRFVDYACRLPEGLYRAIAGVFDLWFSRFGDSHPDFYTATHQFSTRAS ICFNEEVADAPGVYYQSYASVMRDFLSDPLLWFPYLIIRAVDGKNDGLVTPESAMWGEFK GTIANRKHRGISHGDMIDLKREDYKGFDVTEFYVELVKTLKEKGF >gi|157101601|gb|DS480723.1| GENE 80 90633 - 90941 121 102 aa, chain + ## HITS:1 COG:no KEGG:Closa_3612 NR:ns ## KEGG: Closa_3612 # Name: not_defined # Def: integrase family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 62 1 62 295 86 66.0 3e-16 MTSDIKSFVSYLRDVKKTSRNTEISYQRDLMQLKSFLEDKGITEVEKVTKTSLNSYILFW KRGEGHDHHIQGAGIHEGFFQFHVQRRTNPQGPGGAPESAQD >gi|157101601|gb|DS480723.1| GENE 81 90880 - 91515 665 211 aa, chain + ## HITS:1 COG:BS_ripX KEGG:ns NR:ns ## COG: BS_ripX COG4974 # Protein_GI_number: 16079408 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus subtilis # 1 210 85 295 296 197 47.0 1e-50 MFKEGRIRKDPAELLKAPKIEKKAPVILSVNEVNALLDQPGMTTSKEIRDKAMLELLYAT GIRVSELIGLELSDLNMQVGYITCRDGQKERMVPFGKPAKQALSVYLEKGRSQLLKGKES QWLFTNCSGKAMSRQGFWKIIKYYGEKAGIQADITPHTLRHSFAAHLIRGGADIHAVQAM LGHSDSTTTHMYAAYSGNTVGETYRAALPRK >gi|157101601|gb|DS480723.1| GENE 82 91585 - 92691 1212 368 aa, chain - ## HITS:1 COG:CAC0978 KEGG:ns NR:ns ## COG: CAC0978 COG3359 # Protein_GI_number: 15894265 # Func_class: L Replication, recombination and repair # Function: Predicted exonuclease # Organism: Clostridium acetobutylicum # 22 194 31 203 274 112 34.0 9e-25 MITIKHPVDLPDTYPLDRIGPLKDLLFFDIETTGFSGDTSQLYLIGCTYHDGFGWKLIQW FADSRESERELLTAFFTFMERFKILVHFNGDGFDIPYLLKCCRRLDLPYNFDRIKSVDIY KKIKPYRKLLGLENMKQKSIEQFLGLAREDKYNGGQLIEVYREYLMTHESFLYDLLILHN EDDLKGMPSILPILSYADMMEAPFSLESQQLFTYDDMAGNACPFLSLTWKSRCAIPVPLA YDSAPVSLELNRDTLVCNIALYQGELKYFYSNYKDYYYLPLEDTAIHKSVGEYVDRDART KATARTCYTKKQGLFLPQFGALWSPALMQEYKASLTYAEYEPEMLTPDSMADAYARQILS YVSQAKEL >gi|157101601|gb|DS480723.1| GENE 83 92910 - 93539 841 209 aa, chain + ## HITS:1 COG:ECs2571 KEGG:ns NR:ns ## COG: ECs2571 COG0632 # Protein_GI_number: 15831825 # 
Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Escherichia coli O157:H7 # 1 204 1 200 203 138 40.0 9e-33 MISYVKGPLMAIEEDVIVVEAGHVGLAIHVPVSLLPELPGMGKEVTVYTYFQVREDAMTL YGFLHSQDRDMFRQLLGVNGIGPKGALGILSVLRPDDLRLAIVSEDVKALAKAPGIGNKT AQRLILDLKDKISMDDVLGNMAGGAGMGAADTGASVAGLVEAAKEAVQALVALGYTNTEA SRAVKQVEVTDGMTSEDVLKASLKHLSFL >gi|157101601|gb|DS480723.1| GENE 84 93581 - 94588 1167 335 aa, chain + ## HITS:1 COG:BS_ruvBm KEGG:ns NR:ns ## COG: BS_ruvBm COG2255 # Protein_GI_number: 16081161 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Bacillus subtilis # 1 329 3 331 336 409 62.0 1e-114 MERRIITTEVTAEDERIETTLRPQCLRDYVGQEKIKSTLNIYIEAAKTRGEPLDHVLFYG PPGLGKTTLAGIIANEMGTNMKVTSGPAIEKPGEMAAILNNLQEGDVLFVDEIHRLNRQV EEVLYPAMEDFAIDIMLGKDSTARSIRLDLPHFTLVGATTRAGLLTAPLRDRFGVIQRLE FYTPEELKIIILHSARVLGVEIEAGGAMELARRSRGTPRLANRLLKRVRDFAQVKYDGVI TREVTDFALDIMDVDKLGLDQNDRNILLMIIEKFSGGPVGLDTLAAALGEDSGTLEEVYE PYLLMNGLINRTPRGRMATETACRHLGLEMQKPPC >gi|157101601|gb|DS480723.1| GENE 85 94925 - 95797 693 290 aa, chain + ## HITS:1 COG:no KEGG:Closa_2236 NR:ns ## KEGG: Closa_2236 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 117 1 117 122 119 58.0 1e-25 MDTKNYTQVLIDGKVYTLGGSEDESYLQKAASYVNEKNSAMRKVPGFTKQSADYQMVMTE LNIADDYFKAVEWGEGMERQKNDMEKETYSLKHELVSTQMKLEAVLKDLEERQRELDRLN RRTAQLEGELKEARENLQNLRNNPSADTENNAVVQAVEEIKETVEVEAGITLNASRTEYV SGSGARESSQPEAVPQTPEADVSAQQPDTPAVQTAQKQPAPSLRVLNPASEAGRGMTAAS AQAAATAAPPAAPVAEPIAAAPAGGMTDEELARKALQAARKAGSHKGGRR >gi|157101601|gb|DS480723.1| GENE 86 95880 - 98240 1980 786 aa, chain + ## HITS:1 COG:CAC2341 KEGG:ns NR:ns ## COG: CAC2341 COG0826 # Protein_GI_number: 15895608 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Clostridium acetobutylicum # 5 571 3 497 787 310 36.0 1e-83 
MEKRQVEILAPAGSFDSMKAAVAAGADAVYMGGSRFGARAYADNPDEKGLLEAIDYVHLH GRRLYMTVNTLFKEDELEELENYMRPYCDRGLDGVIVQDLGALAFMRRHFPGLELHSSTQ MTVTSVYGARMMKEMGCSRVVTAREMSLEEIRRIHDETDIEIESFVHGALCYCYSGQCLM SSLIGGRSGNRGRCAQPCRLPYTVYEPGDLSRQKDKHCAPEQAGQSGARSRIIEQTGKKN RNQKPAASGRTGPSPLNKSSERYVLSLKDLCTLDILPDIIEAGVYSLKIEGRMKSPRYTA GVVSIYRKYVDRYLEQGRDGYFVEPEDRKMLLDLFDRGGFTDGYYARHNGRGMVALKEKP EFREGNQKLFEYLDKTYVVAELKEKVRGHVELAEGEPSRLTLESRGEKVQVLGQAPQAAE HQPMTREKVLKQLNKTGGSPFSFETLTAQIEGDLFLPVQALNELRRTGFQELEKKLTGAR VLTGEGGIGAQFRPVPTKTAAPQSQSVLTAFLEETAQLSPVLARGEISVVYLDADGFNPD QWRDIADRCHDRGKQCWLALPQIFRSHAQRYLGANRHLLCQAGFDGVLIRALEEAVWLKD LMEQENQKTSLPFGMDASVYGWNSRSAEVLASMGTSLLTMPWELNSREIEPVLGACRGLG MASELIIYGNAPMMVSAQCITNTVKGCTHKRGTLMMKDRTGALLPVKNHCSFCYNTIYNP APLSLLGSEKLVGRLMPDRLRLQFTVEGTEQTEKVLDAFIGSFVYGRDVEAPFRDFTRGH FKRGVE >gi|157101601|gb|DS480723.1| GENE 87 98246 - 99601 1617 451 aa, chain + ## HITS:1 COG:MT0020 KEGG:ns NR:ns ## COG: MT0020 COG0772 # Protein_GI_number: 15839391 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Mycobacterium tuberculosis CDC1551 # 110 431 91 437 469 177 33.0 5e-44 MTNLIVEISKYLMILLMAVYTYANFRFFSFPDMERKRRVCARQNRAMFAIHFLAYLVMFL KTEDEGMQAMLLAFYGAQVVFFLCYIYLYRLLYRNVSRLLVNNACMLLCVGFIMLTRLSM AKGLDKALRQFAIVVISAALAWLVPYIMERVWQLYKLQWVYAGAGLLILLVVWAAGNESF GAQLSLTIAGVSIQPSEFVKLTFVFFVASMFYQSTDFKTIFLTTAVAGAHVLVLVLSKDL GSALIFFVTYLLMLFVATGSWVYLITGSALGTGAALAAYQLFDHVRRRVAAWSNPWADIE NKGYQITQSLFAIGTGGWFGMGLCQGMPGKIPVVEKDFIFSAVSEEMGAIFAICVLLICL GCFIQFMMIAARMQAVFYKLIAFGLGVEYIVQVFLTVGGVTKFIPSTGVTLPFVSYGGSS ILGTFLLFGIIQGLYILKRNDEEETGEEVMP >gi|157101601|gb|DS480723.1| GENE 88 99601 - 101067 1832 488 aa, chain + ## HITS:1 COG:CAC0506 KEGG:ns NR:ns ## COG: CAC0506 COG0768 # Protein_GI_number: 15893797 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 25 466 7 460 482 253 34.0 4e-67 
MKKADAGPKPNPKPNPKPNSKSNRSILGITYVVVALFLGLAAYLGYFLQVKSEDVINNSY NARLDSFSDRIVRGRILASDGTVLAQTQMDGGGNETRVYPFGDIFDHAVGYSTKGKTGIE ALANFYLLTSHVNLMEQVGNELTGNKNPGDDVYTTLDTELQQAAYAALGARKGVVIAMEP DTGKILAMVSKPGYDPNSLLGDWEALTAQDNKEGQLLNRATQGLYPPGSTFKIVTALEYM REHPADYKNYQFDCGGVYENGEYRIKCYHSTAHGHQDFTLAFANSCNGAFASLGLTMDLG KWNSTAGDLLFNGPLPLSGIPYKESAYNMKPGAGTWEILQTSIGQGTTQITPLHNAMITA AIANGGTLMKPYFLDSVVSAGDETIKKFMPAAYGSLMSAKEAAGLTELMRAVVTEGTGSA VRTDAYTVAAKTGSAEFETGKETHAWFTGFAPAESPRLVVTVLVEEGGSGGKAAAPIARQ LFDIYMSR >gi|157101601|gb|DS480723.1| GENE 89 101072 - 101986 1027 304 aa, chain - ## HITS:1 COG:SPy0898 KEGG:ns NR:ns ## COG: SPy0898 COG0583 # Protein_GI_number: 15674920 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 1 293 1 296 301 107 28.0 3e-23 MDIKYFEYVIEIVECGSINKAAQNLQMLQPNLSVCIKNLEQELGFPIFRRQHSGIRLTNE GELFLKSARKIATELETIHNIPSMFSHKDNLSISCTYSFDFMNHFLKFKKKKPPTACEDS FKETGLIQTIRDVVEQRYRMSLFYCFDTVSDTYYALAKKHNLKLIPIARNRPLILLASKK NPLSRKKEIPFDSITDYKFIMYENFKFDEWLKILNFKNDNNILYVFDRGGLIDAIRQSRY VTVMMKRFTDTYSEDCVEISIVDAPFGMDAYCLYHASYTMNSREKQFIRELKELFADHPA DRPI >gi|157101601|gb|DS480723.1| GENE 90 102269 - 103147 1002 292 aa, chain + ## HITS:1 COG:PA1421 KEGG:ns NR:ns ## COG: PA1421 COG0010 # Protein_GI_number: 15596618 # Func_class: E Amino acid transport and metabolism # Function: Arginase/agmatinase/formimionoglutamate hydrolase, arginase family # Organism: Pseudomonas aeruginosa # 8 285 36 313 319 244 42.0 1e-64 MQKVDSAEGLDFAIAGAPFDTASSFRSGSRFGPNAIRNISAMMKPNNVIMQVNIMDGLKG GDIGDFNVTPGYIHPTYQAIEEGVANILKENACPIVLGGDHSITLAELRAVAKKYGPVAL VHFDSHSDLCDEVFGQKYNHGTPFRRALEENLIDASHSIQVGMRGSLYDPDEHKMAAELG MKLIPAHKVREMGLETLIKTILERVGDKPCFLTFDIDFVDPAYAPGTGTPEVGGFTSLEA LDLVRKIKDLNFVGFDLVEVLPAYDHGEITAYLAANIVFEYLSILAVKKKAK >gi|157101601|gb|DS480723.1| GENE 91 103173 - 104609 1388 478 aa, chain + ## HITS:1 COG:BH3947 KEGG:ns NR:ns ## COG: BH3947 COG3314 # Protein_GI_number: 15616509 # Func_class: S Function unknown # Function: Uncharacterized 
protein conserved in bacteria # Organism: Bacillus halodurans # 35 478 26 460 460 200 31.0 4e-51 MENDNNVNGNLNDNSVLDALVLTREQWRSGIVKAVIFTLIAVIIFFVPVTVRGSTDVTFG IIYKNIKAGLGLAGLWMGGIIIMGNGLASVYGKYFCKNKDSAVYRYYQEDSVLHPLFYLL GSIFALILMLHYTFPSFEGPAAIVSPDIGETVYSIATDVAWIIPVSAVFMPFLLNYGIVD FIGSLMEPLMRPVFKIPGRSAVNAIASFVSSASVGVLITSKLYQKGIYTKKEAALIATGF SAVSVGFAYKVIETADLSEYFLPIYFIALLVTLIVSFFMARIPPLSRKASVFADGRQQTK EEILAERVPARAVLKTGASRAVKKAATAPNLLKEIRSSLLDSCYVLPKVISLLTAVGIIA MLIATYTPVFNWIGKLFEPLLVLLQVPNAAEIAPSLPVGIAEMFLPVLMIADKVGTLAVG ARYMVVTVSMVQIIFFSETIVVMMSTKIPVSLKELIVCFFERTIVAIPISALFMHLLF >gi|157101601|gb|DS480723.1| GENE 92 104716 - 105510 864 264 aa, chain + ## HITS:1 COG:no KEGG:Closa_0658 NR:ns ## KEGG: Closa_0658 # Name: not_defined # Def: molybdopterin dehydrogenase FAD-binding protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 256 1 253 258 317 62.0 2e-85 MFRAEDYVKVDSLSEAYELCQKRSSLVVGGMVWLKMTSVTKRTIVDLSGLGLDKIEETKE EFRIGAMCTLRQLETDQGLDQYFGGVFKECTRHIVGVQMRNLATVGGSIYSRFGFSDILT CLMALDTYVELYHGGIVPLEEFAKRPVRRDDRDILVRIIIKKDGRKAAYTTQRNSETDFP VIACCVSRWGNDWYVAVGARPGRAQVVKIRDDGYDSLAQLAGEAADSFTYGSNTRASGQY RRQLAAVYTRRLMERLTGRSVSKG >gi|157101601|gb|DS480723.1| GENE 93 105507 - 105965 589 152 aa, chain + ## HITS:1 COG:SSO2433 KEGG:ns NR:ns ## COG: SSO2433 COG2080 # Protein_GI_number: 15899181 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sulfolobus solfataricus # 1 143 13 157 171 134 45.0 8e-32 MNIEFRLNGKNVQAEAEPDDLLLDLVRSLGCYSVKRGCETANCGLCTVWLDGKPVLSCSI LAVRADGRHVTTLEGLQKEAEEFGMFLAAEGGEQCGFCSPGFIMNVLAMERELKNPDEEA IKEYLAGNLCRCSGYMSQLRAVKKYLESKQEV >gi|157101601|gb|DS480723.1| GENE 94 105970 - 108408 2689 812 aa, chain + ## HITS:1 COG:Z4220_2 KEGG:ns NR:ns ## COG: Z4220_2 COG1529 # Protein_GI_number: 15803418 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia 
coli O157:H7 EDL933 # 48 808 2 787 790 441 35.0 1e-123 MGYTDNKTSDSTCGTCAGTDGTCGSICGTCGSTCGTCGSAGGAYAGRNYRVVGAPFIKKD ARALVTGKPVFTDDLALKDCLVVKVLRSPYAHAMVKEIDCGVAAKIPGIECILTWKDVPQ SRFTMAGQTYPEPSPYDRLILDRRLRFAGDAVAIVAGRDEAVVDHAMRLIRVKYEVLEPV LDMHDAKDAKILVHPEEDWKSLCDVGADNRRNLCASGIEGHGDLEQAFAGCTHVIERVYH TKANQQAMMETFRAYTHLDVYGRLNVVASTQIPFHTRRILAHALDIPKSRIRVIKPRIGG GFGAKQTVVAEVYPAIVTWKTGKPAKIIYSRYESQIASSPRHEMEVRVKLGCDDNGILKA MDVYTLSNTGAYGEHGPTTVGLSGHKSIPLYHTPEAFRFSYDVVYTNRMSAGAYRGYGAT QGIFAVESAVSELAAELGMDPVKFRELNMVKEGDVMPAYYGETADSCALDRCVARVKDMI GWDGKYPRRQVSGSKVRGVGLAMAMQGSSISGVDVAGATIKINDDGFYNLIIGAADMGTG CDTTLAQVAAECLGCELDDIVVFGADTDISPYDSGSYASSTAYLTGTAVVKTCEALKEKI LHKAAKYLETGADELEFDGKRVYKLACEADGTRKEITIKDLANRVQCANEDALQVTQSHS SPISPPPFMAGAAEVEVDLETGQVELIKFAAAVDCGTPLNENLARVQTEGGLVQGIGMAL YEDVSYSKDGKVHENSFMQYKVPSRLDVGRVQVEFESSFEPTGPFGAKSIGEVVINTPSP AIANAVYNAVGVRIRELPITAEKVFKGMNGLD >gi|157101601|gb|DS480723.1| GENE 95 108669 - 109121 454 150 aa, chain + ## HITS:1 COG:aq_158 KEGG:ns NR:ns ## COG: aq_158 COG0494 # Protein_GI_number: 15605731 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Aquifex aeolicus # 1 131 1 125 134 69 37.0 2e-12 MIEATSCGGVVIFRGKILVLYKNYKNKYEGWVLPKGTVEAGEEYKETALREVKEETGVSA SIIKYIGKSQYSFNTPQDVVEKDVHWYLMMADSYYSKPQREEYFLDSGYYKFYEAYHLLK FSNEKQILEKAYNEYLDLKKSNLWGSKKYF >gi|157101601|gb|DS480723.1| GENE 96 109222 - 110238 776 338 aa, chain - ## HITS:1 COG:BH2313 KEGG:ns NR:ns ## COG: BH2313 COG1609 # Protein_GI_number: 15614876 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 7 332 3 326 337 194 32.0 3e-49 MATKPTISQIADMSGVSIATVSRFINNTTAVKGSTAKKIMEAIHSLNYPLPESFITTNTK KDGKVILINIPSVSNPFYNDVIKGAQDAALRHDYYTLVNVQHLNSSTESTFFNLLNNINL CGAIILNAMDQGYFNSLYHTIPFVQCCEYTENQDLVSYTSIDDVAAAKTMMEYILSTGHT KIAFLCGPDRYKYARQRKAGYIQTLEAAHIPLNKDWIIQLPELDFDMALSSTHQLMSQKE 
HPTAVFAVSDVLAAAAIRGCRLAGLSVPSDIIVTGFDNVNISQATSPPITTINQPKYQMG YMACELLIEKLNTPGAKPRQVLLNTELIIRESTQRQRT >gi|157101601|gb|DS480723.1| GENE 97 110444 - 111286 765 280 aa, chain + ## HITS:1 COG:sll1304 KEGG:ns NR:ns ## COG: sll1304 COG1082 # Protein_GI_number: 16330235 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Synechocystis # 1 278 7 282 287 181 35.0 2e-45 MKFGVSSYVWVSPFSNHTTEQLKHAKELGFDIYEIGVEDPSSFDPAYVKQEADMAGIQIN ICGAFGPERDISSDDSRYRDKGMEYIRTLIDMASVFGSPYVAGPMYAATGKARMASEEER KRQRSYAVDNLRDLAGYAATKGIRLALEPLNRFETDFLNTVDQGVELLDEIGCDNVGLLL DTFHMNIEEKSLGQAIRRAGSRLFDFHACSNDRGTPGQDHIDWDEISKALRDVGYDGAVV IESFTTDITEIAKAVSLWRPLAASQDVLALEGLKFLKENL >gi|157101601|gb|DS480723.1| GENE 98 111332 - 112438 817 368 aa, chain + ## HITS:1 COG:BS_rbsB KEGG:ns NR:ns ## COG: BS_rbsB COG1879 # Protein_GI_number: 16080649 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 80 332 57 299 305 74 25.0 3e-13 MKKGLLALMVVSGMTLMTACSGSSQTETTTPAQTAAKEETSQAEQASPKEAEESSAESAG GKVYGYITPGPDTWYQRNVEGFQMGAEKDGNKVVILNSDYDVSKEVSNIDSMINQGVDGL CIFSFNESGAKIAAEKCAKAGIPLVSTDSVGTALDNNGNDIVAAIDFDWTEMGNNYGQWF ADNCPGENIFVITGNFESVPCQRINEAMQAKVDELGKNKIIDIQEGEYNPSKAITVAQDA IASGKDFSVIFVMNEDMAAGIIQMLDSQGVSDQYKVVSQNGSPAGLPLIKNGKLDYTISS SPGWEGLVSYQVLNQYATGASTAVQQQVELPIMPVDQSNIDDKTKVVPWEVDECYWDLTK EYFPELMQ >gi|157101601|gb|DS480723.1| GENE 99 112499 - 113980 181 493 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 269 476 38 246 329 74 26 4e-12 MGNILSIDGICKSFSGIQVLKDVSLELEAGQIICLAGENGAGKSTLIKILSGAEKPDKGK ITIFGTTYERMTPGQAMHLGIATIYQDADLVSSLTVSDNIFLGNERLRGRAFVNSREQEE RTRKLLQSLSMDMKPSMLVEELSPGQKQNLQIAKALHQEARILIMDEPTASLGEEETASL MNLVEQLRKKGLGIIYISHYLEEMFRLGDMAYVLKDGVMVKRLILKETNQDELIKAMVGR DASNFYQKGSFSSGDRALEAVHYSGNGIVRDVSFSVSRGEVFGLGGLVGAGRTELVRMIY 
GADKRDSGTLTLDGKDITPTTPKSAVKRGVFLVSEDRKGEGLFLIRSARENLTISKNDNS FFLNLRKEKGIVEKSIQQLKIKVFSQEQEVGNLSGGNQQKVVISRWLLEDGDVYIFDEPT KGVDVGAREEIYKLIEKLAKEQKIVIMVSSNMPELISMSDRIGVMREGRLVRILDSADAT EDRLIKEYIGVKE >gi|157101601|gb|DS480723.1| GENE 100 113988 - 114950 888 320 aa, chain + ## HITS:1 COG:BS_rbsC KEGG:ns NR:ns ## COG: BS_rbsC COG1172 # Protein_GI_number: 16080648 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Bacillus subtilis # 16 307 25 314 322 169 40.0 7e-42 MNDRNNLKYKILSQQWLILLLIILLISVLTALKNPRFLMLNNVMNIFEQIAVLGLVAAGA TILIISGNFDISVGAVIGLSSCLMAMMMNAGIPIPAAAAAGILTAMFCTFFNGVLAILFK APSFIISLATTSVYTGAALYLTKGVIQTVYGKFDTMNRFRLFGAIPLIFIISLMGCVVVG IILRYTQIGRRIFAIGSNEKAAFLSGIKVTRNKLIFFALNGFFVGLAAVLLLSRVGSALP STGAGYELQAMGAVVIGGAPINGGKGNIVGTFFGVLLMGLISNVLNILGVNSYLQEIASG ALIIISLGISAIRVHMLTKN >gi|157101601|gb|DS480723.1| GENE 101 114962 - 115996 1022 344 aa, chain + ## HITS:1 COG:BH1248 KEGG:ns NR:ns ## COG: BH1248 COG0673 # Protein_GI_number: 15613811 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 4 343 6 340 340 350 50.0 2e-96 MKSIGIIGCGRIAQVRHIPEYADHRDARLDAFYDLNPERARELAGTYGGRVYETYEELLE DEKIDAVSICVPNEFHAKISIAALKAGKDVLCEKPMAVTLEECRQMAEAAEKTGQCLMIG QNQRLAKGHALARKLIERGDIGKVITFKTTFGHGGPESWSIDPGSNTWFFDKKKAAMGAM ADLGIHKTDLIQYLMGEYVAETTAVITTLDKKDGSGNLIGVDDNAICIYKMESGAVGTMT ASWTYYGQEDNSTILYGSKGIMRIYEDPEYCIKIIKPDNEAILYNVDQIQTNDKQTKSGV IDAFMDCIVHGKEAQLSGKSVLSAMEAVFASIESAKTGKTIKIG >gi|157101601|gb|DS480723.1| GENE 102 116007 - 116921 812 304 aa, chain + ## HITS:1 COG:mlr1887 KEGG:ns NR:ns ## COG: mlr1887 COG1082 # Protein_GI_number: 13471795 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Mesorhizobium loti # 1 303 1 303 304 248 39.0 1e-65 MKLGFVSAILDGWTFEEMMVTAAGLGYECVEAACWPQGKAERRYAGVSHIDVDNDSRGYV 
AHIHETCKKTGVEISALAYYPNTMDGDFQKREDAIRHLKKVIAFSEKLGIGMVTTFVGRD QTKSVEENLEAFGRIWPPIIQYAEEHGVRIAIENCPMLFGREQWPGGHNLFTSPALWRKM FDIIPSKNFGINYDPSHFVWQMMDYIQPIYEFRERIFHVHYKDIKVYPQKLKDVGIMAYP LEYMEPKLPGLGDVDWGRYVSALTDIGYDGFTCVEVEDRAFESDRERILDSLKLSKRYIE QFVI >gi|157101601|gb|DS480723.1| GENE 103 117255 - 119792 2749 845 aa, chain + ## HITS:1 COG:CAC1027 KEGG:ns NR:ns ## COG: CAC1027 COG0426 # Protein_GI_number: 15894314 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Clostridium acetobutylicum # 1 389 1 384 392 330 42.0 6e-90 MEAIKLRDGFCWTGIIDDKLRVFDIVMYTEFGTTYNSYVMKTGDKVVLFETAKARFFDDY LEKLKEVIDVTKIDYLVTSHTEPDHAGSVERLLDYSPQMKILATPCAISFLKEIVNRDFV SIAVKDDQRMTIGKRTLHFMLVPNLHWPDTMYTFIEEEQILVTCDSFGSHYCLKEVVSDK IQNEDDYLKALRYYFDCIIGPYKPFMLKALDRVESLDISMVCTGHGPVLVGDRIKQVMAL YREWSTVVNPNRKKTVIIPYVSAYGYTGMLAEKIAQGIADSGDIDVRSYDMVTADAAKVQ EELQFADGMLFGTPTIIAEALRPIWDLTLGMFSVTHGGKYAGAFGSYGWSGEGVPHITER LKQLKMKVVDGFKVRFKPSDADLVGAYEFGYQFGCLVQSKKPGTAAAKGGRRLVKCLVCG EIFDSSIEICPVCGVGRENFVPVEDAANDFTNNTANDYLILGNGAAGFNAAKAIRERDAT GRITMVSEEPYPSYNRPMLTKSLVAGLEPEQIAMVDAPWYEENQVRQMLGKRVESVDTDA RQALLDDGTKLRFTKLIYALGSECFIPPMEGSGLPEVAAIRRLSDVRRVEALMKSTEKAV VIGGGVLGLEAAWELKKAGLEVTVLEMAPSLMGRQMDESSGEQLKIIASKAGVVIRTGVN VEAIEGDGHVSGVRLKTGEVVAAGMVIVSAGIRANIELAKKMGLETKKGVVVNERMETSV SGIYACGDCAQYHDGNYGIWPEAVEQGKTAGANAAGDSLEYTPVPAALTFHGMNTALFAA GDNGRNPNLYYKTVEFRDMGKEQYRKYYFLNNRMSGVILMGDLSRMAAMTEAMENHATYQ EVMEN >gi|157101601|gb|DS480723.1| GENE 104 119903 - 121258 337 451 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 23 439 15 421 447 134 26 3e-30 MSQKNTATSTSKFQLDGRPSMKEAVPLGLQHVLAMFVGNIVPMILVASVANLSVQDANML LQCCMLGAAISTAIQLYPIRIGSIQIGSGLPCMMGLTYVFLPACLSIAGSKGIPYIFGAQ IAAGVIAISYGTLLKKMSSFFPPVVTGTIVMTVGVSIFNLAIGNIAGGTSSPDFGAPINW AVGVFVVLVVLGFNVFGKGLPKVAAILIGMTAGYILSVILGLVSFSGIGEAAWFAVPKPF AFGVKFDLGYILMFVLLYFIVAVQMIGDFNVSCMGGLDREPTPDEIGGGATANGITSIIS 
AVLNSFPTATYSQNSGIVALTKVCSRFVILVGCGILFLAGLCPKVGAVLSTIPNCVIGGG TVVVFAMIATSGMKLLAKAGYSNRNCLIIGVSLAFGLGTQFCKGASAQFPAGIAAMLEEN SVILTSLLAIILNLVFPKDPVVAAEKEGKAQ >gi|157101601|gb|DS480723.1| GENE 105 121747 - 122721 796 324 aa, chain + ## HITS:1 COG:SP1725 KEGG:ns NR:ns ## COG: SP1725 COG1609 # Protein_GI_number: 15901558 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 1 324 2 321 321 209 34.0 7e-54 MASIRDVAKKAGVGVGTVSRALNGTGYVAEDTKAKILAMAAELDYQPNELARNLFRNRTG IIGIVVPDMENPFFSKLLKHMEIQLYKNGYKAMICNTIEISNREQDFIDMLRQNVMDGII TGAHSLQDYAYLNLNKPVVAMDRNLGPDIPIIHSDHKAGGRMAARLLLDAGCRNVLNFGG SFRVHTPSNDRHQEFNRVMEENGASVKTIEMAWNMMEYDYYCRIMEQYMDIYRDIDGVFT TDIGALYCLNIANQRGVKVPEELHIVGYDAVDMTRLFTPKLTSIAQDIPGLATACVDTMM DLLDGKKVKMEQILPVSIQKGGTI >gi|157101601|gb|DS480723.1| GENE 106 122953 - 124611 1298 552 aa, chain + ## HITS:1 COG:TM1414 KEGG:ns NR:ns ## COG: TM1414 COG1621 # Protein_GI_number: 15644166 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Thermotoga maritima # 105 548 3 431 432 187 30.0 5e-47 MLEFDNGEYESVCIFAEKTDDRDGWICLEGPGGKEQQVHMNDDWYRLVQLALPEKGIYRI SRKGVSFTQMYLSGGPDLMDRGIRFLDPDSGDEMELGKWYDTSVREQYHFNPFMNWVNDP NGLCWFKGYYHLFYQSNPFGQEWNDMYWGHAVSRDLIHWTHMPYVLEPQPDLWRDKEHKG GAFSGSAQVEGEQMHLYLTRHHGPQEDGEETREWQTEAVCWDGIHVEKERPCITERPEGA SFDFRDPKVQRIEGMDYMVLGSSLDGVPSILLYVRENGAWSFKGPLLQEHEPGIRTFECP DFFELDGKYVAAGAWMCHRDEAGRYQMTRCYTGTFDGDRFKTEHQQWYDFGSNFYAVQSF EHGGRRIAIGWISDFYGEHRVKPTGACGSFSLPRELHMEQGRLFTEPVKECYGLLKERIF AVSGQDVPPVIVPGNSFFVKIKLGDDRDFLVTLAREGDDALYLERKNGVTSLVSTRKEVS EVRFPSDVSAVRYVEIFMDRRVAEVYLNHGEAAGTKLFYQESTRGHLEADFAAGSLKRLE VWTMESIWNHKS >gi|157101601|gb|DS480723.1| GENE 107 124629 - 125930 1547 433 aa, chain + ## HITS:1 COG:AGpA379 KEGG:ns NR:ns ## COG: AGpA379 COG1653 # Protein_GI_number: 16119492 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: 
Agrobacterium tumefaciens strain C58 (Cereon) # 86 430 69 416 422 118 28.0 2e-26 MKNKRLACMTLAAVLAACALSGCGGSSKSGGGKTGSDGAKLVTIWSPTDEPAIEEWWVEK INAFNEEHKGEIELKREAIVRADSYAYEDKINAAVTSNDLPDILFVDGPNISNYAADGII VPIDSYFTEEDLSDFVESIKVQGTYDGKLYALGATESSVALYYNKDMTDAAGITMPDKME DALTWSEFADIAGKLTTSDVAGTNIIMDKGEGLPYVLEQFWISNGTDFVSEDGSKADGYV NSEKGIEAANFLNSLIQDGYANIDPMKQEFHNGKCATMLGGSWEVATLEQSFPDLNWGVT YFPVADNGGINTSPTGDWAACITRNADAEAAGVVISYLMNKENVTSYAQAIAKLPTRASS YDTLTEYNEYPRSLFKEQSLNTGHPRPRTPGYTVLSPGFSEAMMNMFTGADVKESLDQLA ADFDSNYQKNYAE >gi|157101601|gb|DS480723.1| GENE 108 125984 - 126871 1042 295 aa, chain + ## HITS:1 COG:lin0760 KEGG:ns NR:ns ## COG: lin0760 COG1175 # Protein_GI_number: 16799834 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Listeria innocua # 8 295 10 293 296 203 41.0 4e-52 MKYGTKNKVHERRAAFLFILPAVVLLAAFLILPAVSTVRYAFTDYNILRPDKIKFCGLDN FVELFGDRDFKKSLVNTIYFTVVVVPFQCILALLLAMLISSRRKGVSIFRAAYFSPNITS MVVVAILWSVLYNPNPSTGLLNAFLTKLGFAPCGFLTDPKTSMNSIIFMSAWQAAGYQMM IFLAGLQAIPGDQYEAASIDGAGAFQKFLYVTLPGLKNVIKYVVMITMIQAMKLFTQPYV MTQGGPQNSTRTLVYYIYEQGFQSRNFGYACSVATVFFVIVVTMSLMLKRVIKAD >gi|157101601|gb|DS480723.1| GENE 109 126887 - 127729 886 280 aa, chain + ## HITS:1 COG:lin0761 KEGG:ns NR:ns ## COG: lin0761 COG0395 # Protein_GI_number: 16799835 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Listeria innocua # 3 279 4 291 291 177 35.0 2e-44 MGLKEKDILKRVLFYGGNAIVALIFVSPLLWMIAASLKPEAQIFSDMSNIRTFWPTAATL GNYVEVFTRVNMMKFILNSLFYVFVIVILDLAVNSVCGYALAKFNFPGKELLLTVVISLM VLPMEAIMLPLYKEVASLGWVNTWAGLIVPFVGKCFSIYMFRQFFLDIPDDLLEAAAIDG CGPIKTFFTIVMPISGTVYATIFILDFVAHWNDFMWPMLVVTGEDMRTIQLAIQTFFGSK PIHYGAIMASLTISAIPMLLMFVFLQKYYVEGIASTGIKG >gi|157101601|gb|DS480723.1| GENE 110 127864 - 128724 657 286 aa, chain + ## HITS:1 COG:CAC2230 KEGG:ns NR:ns ## COG: CAC2230 COG2340 # Protein_GI_number: 15895498 # Func_class: S Function unknown # 
Function: Uncharacterized protein with SCP/PR1 domains # Organism: Clostridium acetobutylicum # 169 286 55 174 175 118 51.0 1e-26 MRKISMYTLSAAAAMLVSAAIPVTSLAAVSTYQIPFANGSAYVIGVGGQNCLPGNIGQNG NWGQNGGWGQNNNGNQNNGWGAGPVLPDNSLPQVTPPDFIFPTPELPGPEMPDSSLPDNS LPDSPDSSLPDNSLPGEPDNPGTAPEAPGDTVPGDPDNGGSQDAFADAVVELVNAERAKA GLSPLSVHEGVAEAANKRAQEIKGTFSHTRPDGSNFSTVLTQAGISYRSVGENIAYGQNS PEAVMQSWMNSSGHRANILNRDFTSIGVGHYQDASGTDYWTQLFIK >gi|157101601|gb|DS480723.1| GENE 111 129286 - 130347 926 353 aa, chain + ## HITS:1 COG:CAC0393 KEGG:ns NR:ns ## COG: CAC0393 COG1609 # Protein_GI_number: 15893684 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 2 281 5 276 344 101 28.0 2e-21 MKNATISDVAKLSGVSKSTVSNYLNGNFERMSDETKKKIKNAIDELGYTPSLSARRLSSK NHSKTVCLAIPRNISHLNDTMYYPVIFSALGEEAKKFDYNTLIYSMDSDDINKDTEYLKS LSSSLVDGILLYDLEEDSLAFREFERAGIPYVCVGKILSLENYNYVASDHGKAMKDVLEY FYNLEHKKVAIVTEGKSNSVVQTVRSTAFNEFMSERKEEKLDYSYIRINQNESSSDIKLL FNELLNPANRPTAVGIISCFMNQFMDVVKGYGIRIPEDLSVMILEYYKNSNVDEEYQDFT CVESKAEKVTRIAMKKLVKLIDSGKPFESELVGLKVCVKNSTSKPKKRGGRKL >gi|157101601|gb|DS480723.1| GENE 112 130344 - 131705 1428 453 aa, chain + ## HITS:1 COG:PM1762 KEGG:ns NR:ns ## COG: PM1762 COG1653 # Protein_GI_number: 15603627 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 4 451 2 444 451 108 25.0 3e-23 MKKRYTKIVSLVLGFGLAASMVAGCSVKSSENVVTTAAPTAAKAETTASAESGGSAAEQK KMVFWDKSEYVEGYNTMMKAKVDEFASENHVDVDYVVIPAADMKQKLMAAIEAGNAPDLI VGDDTLVGQFVSMQQIAECSDIFEAVDFTENSKILGTFNGKPYLVPLAFTAPGMYLRTDQ WEKTGMDIPTTWEELKEGAKLMNDPANGFYGLGFAMGASGGGDAEGFCRTIILDWGGIPV DENGKVTVNSPETLEALKFIASLYEEGLIPPDAITGDDSWNNNAYLAGTVGVITNSGSVV SSMKEEKPDLLANTQIIAYPAGPTGEAFTLGGANVFGIFETGKNTETAKEFVKYYFEDLD NYNAMIEAMGAMWQPVVNGVDDTDFWKDPVNAGWLANSKNTYKTYYPAPSDERATVSFTN QLCVKAVQEIVVNKADPQEALDHLEAEFNKIYN >gi|157101601|gb|DS480723.1| GENE 113 131735 - 132649 633 304 aa, chain + ## HITS:1 COG:SMb20659 
KEGG:ns NR:ns ## COG: SMb20659 COG1175 # Protein_GI_number: 16265114 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Sinorhizobium meliloti # 22 294 23 295 310 177 36.0 3e-44 MAGKKSEKRKYSHSQQLLDKRLGLLLMVPIAAVIFGITGIPFIRALYLSFTNKVVGVPEQ FIGFDNYLALFGDKIYWKSLYNTIIYTVGCITAKLALGLLLAVILNQKFRGKAFFRTALL IPWALPGMVAATTWRWMYDSTYGIINSLLLKAGLISLPIPWLSNPDITLLSTMIVNVWRG VPFFMFSLLGALQTLDGQIFEAAYVDGAGMFKRFWYITLPGISSVLGISTLLSTIWTFND FENVFLITGGGPIYSSSVISTYTYDLAFIQNSFGRALSVAVSVIPLMLILILVSQKVING DRNE >gi|157101601|gb|DS480723.1| GENE 114 132667 - 133506 606 279 aa, chain + ## HITS:1 COG:SMb20658 KEGG:ns NR:ns ## COG: SMb20658 COG0395 # Protein_GI_number: 16265113 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Sinorhizobium meliloti # 11 262 28 282 299 164 35.0 1e-40 MVMKKRSRIMRTTATYGILMIALLWTIFPIYWMIKSSLTVNEEMYVARPPLFSNVITFDH YIDLIYNTSFMHNVWNSFVIAAITTIICLAIGILGSYAMTRLKYPGRSFFRNSIIISYLM PTAVLFVPMYVFVSSLGFYDNKYSLLIIYPTFVVPYCCYMLISYFKAIPYALEEAALIDG CNRLQTLWYIIMPIALPGIAVVATFAFTMAWNEYLYAMIMTTSNVQKTATVAISGFKYAD SAIWGRIMSASVVCSLPVTLLYIVAQSMLISGKYEGSVK >gi|157101601|gb|DS480723.1| GENE 115 133536 - 134906 1291 456 aa, chain + ## HITS:1 COG:mll2322 KEGG:ns NR:ns ## COG: mll2322 COG4948 # Protein_GI_number: 13472125 # Func_class: M Cell wall/membrane/envelope biogenesis; R General function prediction only # Function: L-alanine-DL-glutamate epimerase and related enzymes of enolase superfamily # Organism: Mesorhizobium loti # 9 456 11 452 452 591 65.0 1e-168 MEDRDDMLMDHVNTNSRPGDLKITDIRVADIVGAPMHCTLVKVYTNQGLVGYGEVRDGGS RNYVLALKSRLIGENPCNIDKLFRRIKQFGGPARQGGGVCGIELALWDLAGKAYGVPVYQ MLGGKFRDRIRMYCDTDIEGKHTGKEMGEALKKRMEKGFTFLKMDLGIDLLYDIPGALCA PLGMVGDLNQSSRTYEAISRRLTPEKRSLRNRMYDMQNIPHPFTGVQVTEYGYDILEQYV KDVRSVIGYEIPLAIDHFGHIGVESCIRLARRIEKYNIAWMEDLIPWQLTDQYVRLRNST TVPICTGEDIYLKENFRPLMERGGVSVIHPDILTSGGILENKKIGDMAQEYGVAMAVHMA 
ESPIACMAAVHSIAATENFLALEFHSVDVPWWSDLVTGLPNPIVDNGYITVPDAPGLGIE NLEDEVIREHLHPDYPDMWADTSDWDNDYSHDRTWS >gi|157101601|gb|DS480723.1| GENE 116 134923 - 135831 864 302 aa, chain + ## HITS:1 COG:mlr2332 KEGG:ns NR:ns ## COG: mlr2332 COG0684 # Protein_GI_number: 13472133 # Func_class: H Coenzyme transport and metabolism # Function: Demethylmenaquinone methyltransferase # Organism: Mesorhizobium loti # 67 241 49 219 249 63 31.0 4e-10 MIYYTRDQIIELTSRWKGERFEDGRPRVPDYLLDKLRTMTIEEIWLPLFLKDYKFQFEGE MKKLHDGLKLVGRAVTAVFMPTRPDLMEAVRTEGDARGYQGTCNQWVVDHLQEGDVVVAD MFDKVWNGTFVGGNLTTAINVRTRTGGAVIWGGVRDIEQMQKIDTQVYYRGTDPTPIREC LMTSYNGPAKIGKAVCMPGDIVMGTTSGILFIPAYLVEELVNSAEKSHAKDIFGFAMLEQ GTYSAAEIDSTVWPEEMVDKMIDFIEHDSSCEKYRGLDWSLEINAARGDQEAVDELMKGY LV >gi|157101601|gb|DS480723.1| GENE 117 136058 - 136621 720 187 aa, chain - ## HITS:1 COG:MA2909_2 KEGG:ns NR:ns ## COG: MA2909_2 COG1014 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Methanosarcina acetivorans str.C2A # 7 173 13 177 186 94 35.0 9e-20 MKEIVFAGSGGQGVLTAGLIISDIAATEGINVTWVPSYGSAMRGGTANCTVKYCENTIYN PSQEQPDLLLAMNSPSFHKFLPLVAPGGVVVIGDLVEIPEDARKDVTYVRVPATRISTEL NNPKGANIVMTGAIVKLMGDFTKEAAIAAMIHMFEKKGKSRFNEANEKAFNAGYDAVETL ALDSCVR >gi|157101601|gb|DS480723.1| GENE 118 136624 - 137376 758 250 aa, chain - ## HITS:1 COG:MA2909_1 KEGG:ns NR:ns ## COG: MA2909_1 COG1013 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Methanosarcina acetivorans str.C2A # 20 245 37 261 296 172 37.0 5e-43 MSASNKERKVPVLVDKPMDFCPGCGHGIISRLIMECLDELEQSDNIIFPIGVGCSSNLGA GLECDRLHCSHGRAGAVATGMKRVNPDVLIVSYQGDGDAYNIGIAETFNAAYRNENITVI VVNNTNFGMTGGQMSLTTMEGQKTSTSVHGRDCSVTGYPLRFPEMVAREFPGAAYAARGT VTSPSKINQLKGYIKNGLQAQMNHEGYSVIEVLSPCPVNWGLTPVKAMERIESELVPYYP LGEFKKREVR 
>gi|157101601|gb|DS480723.1| GENE 119 137378 - 138439 1236 353 aa, chain - ## HITS:1 COG:TM1759 KEGG:ns NR:ns ## COG: TM1759 COG0674 # Protein_GI_number: 15644505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Thermotoga maritima # 4 309 8 321 356 268 45.0 9e-72 MAEMMKGNEAIAEAAVRAGVKFYAGYPITPSSEIMEYLSWRLPEVGGSFVQAESELAGIN MVIGAAASGVRALTASSGPGISLKQEGVSTLSDEGLSAVVITQVRYGNGLGTLLSAQCDY NRETRGGGHGDYRCLVFAPASVQEFVDLMRPAYELAEKYRVVSVMMGEATLGQMMEPVTL PDFVEAERTPWALNGHYTYKKIGIFERDSMKEAVELVDKYNTIRENEQRWEDEGLEDADY VFVSYGVPGRSTLGAVRELRAQGEKVGIIRPITVWPFPEKAFYKINPNVKGIITVEANAT GQLVDDAALYTKKALKERNVPAYALTYVFGTPTIRNIKKDYFEVKSGSKKEVY >gi|157101601|gb|DS480723.1| GENE 120 138443 - 138661 244 72 aa, chain - ## HITS:1 COG:TM1758 KEGG:ns NR:ns ## COG: TM1758 COG1146 # Protein_GI_number: 15644504 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Thermotoga maritima # 3 58 5 60 77 57 50.0 8e-09 MKVTVNKDRCKECGLCIHHCPRKAISKCDVLNANGYYPVQVDDEACIACGMCYITCPDGV FHVSGNIEGEVK >gi|157101601|gb|DS480723.1| GENE 121 138921 - 139685 895 254 aa, chain + ## HITS:1 COG:ECs4936 KEGG:ns NR:ns ## COG: ECs4936 COG1414 # Protein_GI_number: 15834190 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Escherichia coli O157:H7 # 5 251 37 280 287 131 29.0 1e-30 MADTIQALDRALELLIYLNKQGREVGITQIASDMGVYKSTVFRTLSTLEARGFVSRNPET EKYWLGSQLFVLGKSVENKMGIQEVIRPYARRLHQKYHEVVNVSILEHNPGDVYHSVIIL KEMSDQQLLTVNPPVGSLSECHCSAVGKCLLAFGDNIDWAPYEERELTRYTVNTILSYED LKEEIDKVRRRGYAMDHEEQEMGLTCIGAPILDRNKKAVAAISLSGPTSRMNSSDLEDRI EAVCNTARDISGNF >gi|157101601|gb|DS480723.1| GENE 122 139927 - 140988 1136 353 aa, chain + ## HITS:1 COG:CAC1660 KEGG:ns NR:ns ## COG: CAC1660 COG3426 # Protein_GI_number: 15894937 # Func_class: C Energy production and conversion # Function: Butyrate kinase # Organism: Clostridium acetobutylicum # 2 353 4 356 356 354 51.0 2e-97 
MKLLIINPGGTSTKIAVFEDEEQVLKKNIIHTQEELSGFSHVFDQFGYRKGLILKTLEEE GVDIHELSGVAGRGGLMNPIPGGTYRVNERMLEDLEHAVHGEHPSNLGAALARSIGDQIG VPSFVVDPVSVDELMPKARISGISDLERPSWFHALNHKAVARWAAERIGKKYEESSLIIA HLGSGNSVVAHKNGQMIDGSGGRTNGPFSPERSGGLPTYPLVELCYSGKYTREEMVAKIS STGGMYDYLGTKDAGEVERRMGLGDETAKLVYEAFVYMVAKEICSYAAVFEGKVDCIVLT GGIAHSRYVTDEIRRMVGFLAPVEVVAGEFEMTALALGALRVLKGEEEPREYE >gi|157101601|gb|DS480723.1| GENE 123 141021 - 142463 1509 480 aa, chain + ## HITS:1 COG:FN1422 KEGG:ns NR:ns ## COG: FN1422 COG1757 # Protein_GI_number: 19704754 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 26 466 10 450 473 292 37.0 1e-78 MSEDHKEKRQPGAVMSFLVFGAAIAVLLLGVVLLDYDIHIVLLCALAVVCIASVPLGYSF MDLVDCMKKSLGQALAAMIIFIFIGIIISSLIFSGTVPALIYYGLQYIIPDYFLPIGLLL CSLTSISIGTSWGTVGTMGLAMMGIGAGMGIPAPLTAGMVISGAFFGDKMSSISDSTNLA PAAAGTTLYAHIGAMWKTTFPSYIITLAVFTVLGLSYHGGNFSMDTVNLFLDTIGGRFHI SPLVLMPMAVLLILNLMKYPAVPSMAIGTVLAVLVAVFYQGCPMSEVVAGLNYGYAESTG VELVDKLLLRGGIQSMMYTFSLSCIAISFGGVMEHVGYLPQIVKVIIKRVRSDRMMVPVV IASTTFGTLTMGEVFLSIVVNGSLYKEAFRERGLRPEMLSRLLEEGGTLTQVFIPWSTSG VFILSTLGVGVSQYWKYAVLNYVNPVLSIVLAMFGIYVLKVGAEEKKSLCHRLIHSHKLS >gi|157101601|gb|DS480723.1| GENE 124 142513 - 143418 1062 301 aa, chain + ## HITS:1 COG:CAC3076 KEGG:ns NR:ns ## COG: CAC3076 COG0280 # Protein_GI_number: 15896327 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 1 297 1 298 301 218 40.0 2e-56 MIKNFAELQERVRSSEPLVVSVAQAADQEVLLSVKAAADQGFIKPVLTGDRERIEELCGH IGLKPLEILQAGTEEEAVERAVRAVHDGSAQVLMKGLVNTSVYMRGILNREWGLRTGRLL SMMAVYEAPGYHKLLFCSDSGINVAPNLEQKKDILKNLLYAVRNMGIQNPKVAVLTANEM VDPKVVSTTDAAGLVEAVKNEEGFLPCIIEGPIAFDVAFDPKSAAHKKIDSRITGDVDLV IFPNIEAGNIMGKSWIHLCRSPWGGIVLGASNPVILGSRSDTAEIKLNSIALACLAAQSG K >gi|157101601|gb|DS480723.1| GENE 125 143434 - 144753 1454 439 aa, chain + ## HITS:1 COG:YPO1668 KEGG:ns NR:ns ## COG: YPO1668 COG0477 # Protein_GI_number: 16121932 # Func_class: G Carbohydrate transport and 
metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 18 424 9 409 411 147 28.0 6e-35 MSNSKVSITGKQLLIVFFMGLAFAVVYATPFVQYVFYDDLAGALHATNTQLGFLIAIFGI GNLLAPFGGALSDKFNTKKVYLLGMFITCALNFLLAMNMSYTFAIFIWAGLAVAGLILYF PAHTKLVRLVGDEESQGTIFGFTESACGLASVIINFIALGLYAKVAVGAMGGVAGLKAVI ISYGAVGLIATIALVFLLPDPEKAGKGQDGGDSGQETKLSMKEWAGVFKDPRTWFAGIAV FATYSTYQTLSYFTPYFTNVLGATVVGSGAIAIIRTYGIRIVGAPLGGYMGDRIHSVSSV IATVLACGAVITLIFMFMPAGVPSVLLTVMTLVIGFMVHIARGAMFAVPSEVKIPRRYAA STAGVVCAIGFCPDLFQAAMYGHWLDTCGNAGYTRIFIYIIAIMVVGVINGVATVVYKKK YVAAGLVPVEGGTEAAQAQ >gi|157101601|gb|DS480723.1| GENE 126 144854 - 145555 592 233 aa, chain + ## HITS:1 COG:RSc1007 KEGG:ns NR:ns ## COG: RSc1007 COG1309 # Protein_GI_number: 17545726 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Ralstonia solanacearum # 37 231 40 235 239 120 33.0 3e-27 MNQCIDIAAILLYNITNQLVFEKEEFMTKQEIKSTQTKQAILKAAEEEFSEKGIYGARVD EIAAKAKINKGMIYQYFGNKEELYKTVLKNVYDRLGDSEDLVICNQKNCISEITDLVRTY FYFLRDNPSYVRMVMWENLNYGYYFQEKELGDVKNPIRLELGKIMEEGKKAGEISEQVCE EDIFQTLIACSFNYFSNRWTLKQILKKDLDKNDEIERRIKQVTKMVVSYIKKD >gi|157101601|gb|DS480723.1| GENE 127 145754 - 146770 403 338 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0392 NR:ns ## KEGG: Cphy_0392 # Name: not_defined # Def: NAD-dependent epimerase/dehydratase # Organism: C.phytofermentans # Pathway: not_defined # 1 338 1 338 338 421 58.0 1e-116 MWTEKKLNDVLTEPTLAMVEDMKRIDGDILVLGAGGKMGHTICVLASKAMERAGIHKKVI AVSRFHDPEVRKYLEENHVEMIQADLQDLKQLENLPEVPNVIYMAGRKFGTDGQEWMTWG VNSVLPAFVGEKYKKSRIVVFSSGNIYPMVPLASGGCTEKDKVAPVGEYAQSCLARERTF EYYSSRYGTKVFMYRLNFAVDLRYGVLADIADKIMNQTPISVTTPCFNCIWQGSAAEIGV RGLLHASNPPKIMNVTGPEVISVRKAAEKIGGFLDREPIFEGEEGNDAYLNNASQAMEQF GYPAVPADTLIRWQAEWVADGGRNLGLPTHFEERGGRY >gi|157101601|gb|DS480723.1| GENE 128 146859 - 147839 637 326 aa, chain + ## HITS:1 COG:no KEGG:SpiBuddy_2189 NR:ns ## KEGG: SpiBuddy_2189 
# Name: not_defined # Def: dihydrodipicolinate synthetase # Organism: Spirochaeta_Buddy # Pathway: not_defined # 2 326 31 355 355 419 58.0 1e-115 MDEKTQKILIRYYLDAGVGGIATAVHSTQFAIRNPEHRFLERILQLVSEEIGRFEEKHKK TIVKISGVCGETDQAVSEAVLAKRMGYDAVLLSPGGLGHLTEEELLERTKAVAEIMPVIG FYLQTAVGGRRFTYRYWENICRIPNVVAIKCASFNRYSTVDVMRAVAFSGREEKIALYTG NDDNLLVDLLTPYRFTVDGVERNIGFRGGLLGHWCVWTKKAVELFEKVKHIQGQDCIPAD LLTQAAELTDINGAFFDAAHEFGGCIPGLHEILRRQGIFKNTLCLDPKETISQEQIEEIN RIYRMYPQWNDDIFVAGNLDRWGLDI >gi|157101601|gb|DS480723.1| GENE 129 147863 - 149110 739 415 aa, chain + ## HITS:1 COG:CAC0628 KEGG:ns NR:ns ## COG: CAC0628 COG1914 # Protein_GI_number: 15893916 # Func_class: P Inorganic ion transport and metabolism # Function: Mn2+ and Fe2+ transporters of the NRAMP family # Organism: Clostridium acetobutylicum # 22 409 24 413 417 81 23.0 3e-15 MGNRDKKTDREGGWLRQLISSLGPGLIIAATVFGAGSIITASKAGASAGYSYLWVLALAA VFMITFTKIAAKIGCVGEKSMLGHIEDNYSRGMAVLFGLCCSLICTGFQAGNNINTGVAL NALFPFMGIKVWIIVSFLIIMVLIWRSSSFYQLLEKIMTVMVLIMLVCFAGNIVFFFGRI HYGELALGFIPGKVSGWQIIVSMSATTFSIAGAACQSYLVQGKGWTMDNYDKAGRDSTAG IIILSCITAIILITAATILPKGAEITSVLSIAALLKPLLGEFANGLFLLGFFAAVFSSVI ANAVIGGTFLADSLKLGKTINDFWVKMFSSIIIALGACVGLIFGSNPIQLTIMAQGATII GAPLVTVMLMLLSSKESVLGKHKNSKVTTVIGWVAVLWVVFLSVNQILLWNGIAL >gi|157101601|gb|DS480723.1| GENE 130 149316 - 150821 1723 501 aa, chain + ## HITS:1 COG:FN0061 KEGG:ns NR:ns ## COG: FN0061 COG2317 # Protein_GI_number: 19703413 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent carboxypeptidase # Organism: Fusobacterium nucleatum # 1 495 1 494 496 421 43.0 1e-117 MSKAYGELQTYMDKAMAIKTAMTLFEWDNETLAPREAGELTSKVIGVLSGEYFQAVTCDE MKKLLEKCRGDKSLTTAEAANVRELLEEREQICPIPQDEYQEFARLTARATSVWARAKKD QDFEAFAPTLEKVIGFQKKFAGYRAKDGKKLYDVMLDDYEKGFSMENLDRFFSVLKKQLV PFLKKVMEEGKMIEDSFLKGDYPEEKQEELGRFLAEYVGFDFDRGVMAVSAHPFTTNLHN KDVRITTHYTDCVDSSLFSVIHEAGHAIYELGIGDDLTLTPVGQGASMGMHESQSRFFEN IIGRSRSFWVPIYDRVQAMFPEQLGKVNLDQFVEAVNKVTPGLIRTEADELSYSLHVLIR 
YEIEKMLIEEDLDVEKLPRLWADKYEEYLGVRPENPAEGVLQDIHWSQGSFGYFPSYALG SAFGAQLYYHMKEIMDFDSLLKDGRVDVIRDYLREHIHQYGKLKDSRQILKDVTGEDFNP EYYVRYLKEKYSRLYGLESKG >gi|157101601|gb|DS480723.1| GENE 131 150978 - 152675 299 565 aa, chain - ## HITS:1 COG:TM1189 KEGG:ns NR:ns ## COG: TM1189 COG4928 # Protein_GI_number: 15643945 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase # Organism: Thermotoga maritima # 1 232 43 231 718 73 26.0 1e-12 MTIAIQGEWGSGKTSIMNMVREQLDHTKNFKYKSIWFNTWQYAQFDSASRLNIMFITDLI EQVFEGEDVCSSQSLNKADPVKDIYKLLKVISSNYMDRVFQEKTGVNNITNIIANTLSKD NEWDYQKYLNTESQSIRKFKDTFESFVDYSCYQNKYDRIVFFIDDLDRIDPCRAVELMEI IKNYLDCKKCIFVLAIDYDVVIRGVKAKFGETRESKARSFFEKIIQLPFMVPTNYYNVEK YVTNMLGRFGITVNDSELSDNLFKLMKTCTTNNPRAIKRLLNVYALNSMVNCTQGINSTP DFKNILLLGVLGIQLSYPKLYQFVLINAKYLSMTVEGKQVCLLEYITNPEYQDELANTDA DIYLYKAPKNIPELYSLKEQLEEAGISDEWDELWGVLETYTKLLKDSKNSIPTGTLEIFI NILNLSELSSYGSQKKDVSKRSIPLDYEIYGEKRHADSAIEMYLYVMEKILTKNAAPLDR QQIRDLYTTVDCFYDCSKLIANGNFDGNTEEMVQFIKKKGKRRQSSSQFGRTNRLKIPPG KICRNWKYKCLYRCFTEPAAYAQLY >gi|157101601|gb|DS480723.1| GENE 132 153033 - 154040 1148 335 aa, chain + ## HITS:1 COG:CAC0690 KEGG:ns NR:ns ## COG: CAC0690 COG1363 # Protein_GI_number: 15893978 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Clostridium acetobutylicum # 5 333 12 339 343 293 46.0 3e-79 MEQAKNLLAIDSPSGYGREVTEYLLKELEALGVQARRTVKGGVIADFGGRNKEDGILLEA HCDTLGGMVAQIKENGRLKLTNIGGMKPANAEAENVRIITKADGIYEGTCQLVDASVHVN GSYEKTERTWDTVEVLVDEDVTCREDAVKLGIMPGDYVCFEPRTRITPSGYIKSRFLDDK LSAAILMGYAKYLKDESVTPERRVYAHFTIYEEVGHGGSASVPEGVTEAWSVDMGCVGEG LQCTEREVSICAKDSGGPYNYDILCRLVELAKEHGIGYAVDVYPHYGSDVETTLKSGHDV RHALIGPGVYASHGYERSHRDGAENTFRLIQAYIG >gi|157101601|gb|DS480723.1| GENE 133 154508 - 155698 1333 396 aa, chain + ## HITS:1 COG:no KEGG:Closa_1647 NR:ns ## KEGG: Closa_1647 # Name: not_defined # Def: electron transport complex, RnfABCDGE type, G subunit # Organism: C.saccharolyticum # Pathway: not_defined # 216 390 29 203 204 
169 51.0 3e-40 MLIAGVLVSTGCSGSRVKAPERVFEPEEAFFEIPHSSFEAEWLEGGMKDEDGHVYEEVYD FVLAMDRAVIPYTVSGKLIQTCEYNPSEGSWNVERRTEDVTQEMDLTKYRWKLENDPVYE YTETFIKTGPNTFETGEPDHPTGIYFTVDILSGGTNLKDESGRTNLPGGQWTARLYLESD NLLGGNFKGEAAIDLLGGFYGYNDLRWVVTIDELKRVEKELNTTDEERANELQEGFRKSF PEADSFEVMDFELIQDCNKELASSDFGDVGVDAAAAAKDRDGALMGWVVNAHSKDSYMGN VAVSVAFESGGTIRGLEFLVLEDTPGLGMRASEDVFKEQFEGKGREALTVTDSGNPGDSQ IDAISGATITSKAVTNAVNGAMYYVHHYTEAGEQAL >gi|157101601|gb|DS480723.1| GENE 134 155956 - 156909 1094 317 aa, chain + ## HITS:1 COG:CAC2861 KEGG:ns NR:ns ## COG: CAC2861 COG2385 # Protein_GI_number: 15896115 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Clostridium acetobutylicum # 46 312 65 338 345 127 30.0 3e-29 MNKRIIWAALTGLMVPYVGTLAWTGTIRGEELRYEQQKEISGRRRILLDRTSGSYYMDME EYLPGVIARQIPVEYEPEALKAQAIIARTYICRQMEGTGDGEEIAESALDMDYLEADQLK KLWGSSRFPEYYKKLEDAVKATSGIVMTYEGKCIDPMYCRATAGMTRQGDFTHPYLQTVD CPGDVEAEGYMQMLSFSRESFVSDINSIPAGESGSRTIEASQVPDTIQIVSRDDAGYVDT VQIGSAGYTGEEVQYALGLASSCFVLEPYEDGIRAVVKGIGHGYGLSQATANEKAKEGWK AEDILGYFYKNIAFISE >gi|157101601|gb|DS480723.1| GENE 135 156950 - 157705 950 251 aa, chain + ## HITS:1 COG:no KEGG:Closa_2891 NR:ns ## KEGG: Closa_2891 # Name: not_defined # Def: Peptidase M23 # Organism: C.saccharolyticum # Pathway: not_defined # 1 251 1 253 253 282 63.0 1e-74 MKEKVNQMFKDKLFLVMLVLGLLTIVAAAGVITVQRGKGNETNPYLEVPQPEMIAQETAP EMPQVAGASDAAKETQKSEPAPTKAAVAQNANQGNDLAAEAGAGKDAADALVLNFTDTSK MEWPVKGNVLLDYSMDQTIYFPTLDQYKCNPGLVIQSDVSTPVGAPANARILEVGSNEEI GNYVVMDLGNEYTATCGQLKEVCAAEGEYLKKGQTLGYVSEPTKYYSVEGVNVFFELKHQ DKTVDPLDYME >gi|157101601|gb|DS480723.1| GENE 136 158028 - 159374 1609 448 aa, chain + ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 3 418 2 394 422 100 22.0 5e-21 
MNKKMIALLTCAAMSASAMTGCASSNAGASGEGNTAAAGDGSGKVTIKVLDNYGGMVEEM ISVYEKLNPEVNIEYEYVPSDSYYSKFTALNVANELPDVVMTNSQFIGDQVQSGLLVDLT DAVENGQNYEKDANWGETINPTLLSNCKATLKAIGSDYADKLYGVPFTMTTVAVIYDKNV YKELGLNVPSTWEEFEKNCEAIKAAGKIPVSMQQQNMDWWPRIFWDQYCREELDANPNAF EDGSMTFSSESVKKGLEKFKYMWDQGWFPESGLTGNRETMQQLFVQKELVQVMLQPNYLS YLTENVPEGVELASYALPGVEGKPARCLGGSSSIWAVTNSSKHQEEAIEFVKFMTSKTAF SEDYAKFINPGLTNFELTSDNEAMQGYIDAGKNGFIPDIYVPVNITTEIQNTFLTDLEPN YLLGTYDIDYVTDQLNQLYEETYLSNLN >gi|157101601|gb|DS480723.1| GENE 137 159614 - 160513 1103 299 aa, chain + ## HITS:1 COG:mlr7001 KEGG:ns NR:ns ## COG: mlr7001 COG1175 # Protein_GI_number: 13475831 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Mesorhizobium loti # 8 284 24 302 317 172 34.0 5e-43 MKSSQTLKERKEMMTALALIAPTIIVMVIFLYIPFVNALQTSFYKYNGLGALEHFVGLKN YAKVLADPKFVRSLWNTFYLIMVSFLAIPIGFVFAYILYVGVPGKKIFNAGLFIPYLISM VVVGCIWRIIYDPTIGPVDQFLKMAGLGKYAKAWLSRPETALWAIAVTWIWRSQPFNMLI MYANITKMPEDFLEAAQIDGANFRQKLFYIIIPYLKPTFAVLAMLTVTNGLRLFDLIWVM TQGGPGGASDVMTSYIYTKAFTNRDFGAGTAASVILMLIMVAIMAVKTIVQKKRAGRAL >gi|157101601|gb|DS480723.1| GENE 138 160513 - 161343 936 276 aa, chain + ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 11 276 37 300 300 182 36.0 6e-46 MNKKNVYKLFLFLFMLLAFFISVYPLLWVVIQSLKTETEFLKSIWTLPARLNFQNYATAW NDAGMSRYFMNSVVVTLVTTAVNLVFVTCAGYAFAKLTFPGKTFFYYMIIFNLLIPTAII LLPMFTMVNRMHLVNTLPALVFPYFQGFAPMGLIICRNYFEDLPDELMEAGKLDGCSNMQ VFRKIMLPLAKPILATMAILSAMQVWNEYLWALTSITDESKYTLSVGVALFNKKTETVGY TPVFAALSISALVIVVVYLCMQKNFVKSIAAGAVKG >gi|157101601|gb|DS480723.1| GENE 139 161354 - 164977 2376 1207 aa, chain + ## HITS:1 COG:no KEGG:Mahau_1687 NR:ns ## KEGG: Mahau_1687 # Name: not_defined # Def: hypothetical protein # Organism: M.australiensis # Pathway: not_defined # 1 557 1 525 1018 255 31.0 
1e-65 MDLIRDFKDPGREYSVLAFWFLNGELKKERLAWQIGQMVEKGVYGGFMHPRAYLKTPYLE DEWWDAVGVCVEESRKQGFAPWLYDEYAWPSGTAGSTFDYGFQKPSRILSRGRDNMAKGL WAVKKDAGNRGCGNEGAGNEGGGNQGSGNQGGANPDAANPEDKGQSGANPSSMPYRVVKR DGFIYEFYEKVLEKAVDYLNPETIASFIKLTHEEYRKRWGEDFGKLIPGIFFDEIYMMGN PLPWTDRLPGRFRETYGYDILDELPSLVDGVSERDKQVRKDYFSLVTAMYEEAFFRQISD WCGKYGLKLTGHTEEFLWEHPRRQGDYFKTMRHLMVPGSDCHDYRYRYPRRITYCEPKYS VSVARIYGKERAMSEALGGAGWNCTMEEFKKGINTLAAMGTGMFILHGFYYECDHQGSQS DWPTSFFYQNPYWDYFKIFADYIRRLSFMNSQGNPVVDYAILYPIGDMDENMENGEENPA GQAINNGFHQALNCMIEHQLDTDMVDEESILNAHICDGKLCLGQQRFKVLLLPEGGSLME ETVAKLEAWTKAGGSVLFYRTGLAHENRSGENRAGGDRANGCSVNGYWDTVLRGNVHSIS RIPQAAAGLAAPSASVIYGNTGNIFLNHRTAGNVDYYLVANSSDERRNLVLSFSHASTPV LLDIEKGDMVQAVYTQAGTAQMCGGRQAGSGVLIYMDLEPGEAVYVLFGLEEDTIRQAKP ALTGDIRWEEEWITGKWDFLPLSPSGPNHGDNAHNEESTDLEIPIAEFTSDVSSGTETIR ICNQSGEEGSCGRHMSLWNGRWITRRRSWNDQLDASDLYFRKTVFLEHAPEKAEFCAAAV DSFECFINGTMVYKGISNGEPVVFGDKGQLTQGENILAFHVTNRNPLHDVYVCSAEELPP DRFISLLMEGIIVQGTNVEVVKSDSTWIVNDSLIKGWEQPDSDGRFTAAGFDVRKVLNFN YTGLEHVWLKAWERGKPPLKPWGDLPLFGQTLTYPRKLWYTVTIPAGASVLYEPVTTGAA VCMLDGKEVHWENGVHILPDNERIHILTIQITAGGCSDGLKQPIRVTMKAVAVNLSDWRM LGLPWFSGRCRYTNTWSVEQLEGTYMLELGGVNHCAQIWINGRLADTRLWRPYRADITSL LRPGENEITILVSNLASNERRHMLVDEGMALGWNRYWNEDNMDRDSRNYVSGLLGPVRLL HMISPKP >gi|157101601|gb|DS480723.1| GENE 140 165295 - 165960 295 221 aa, chain - ## HITS:1 COG:no KEGG:SpiBuddy_2052 NR:ns ## KEGG: SpiBuddy_2052 # Name: not_defined # Def: phosphoesterase PA-phosphatase related protein # Organism: Spirochaeta_Buddy # Pathway: not_defined # 28 213 24 208 214 169 50.0 6e-41 MQKKSKKCIITTGILFLIFMLFTVIIKTIDVQPVGPEQSTIGLASLNQFVFNLFGVNLLW YNITDWLGIVAIVIALGFAILGLIQLIQRKSIWNVDPRILLLGAFYFIVIIIYVFFEIVI INYRPIILSQSLEASFPSSHTMIVICIMSTAMLQFHYYLRDKKVCLWTIDIASVLMIAVT VIGRLISGVHWFTDIVAGILLSSALVALYYSTLKYIEEKKG >gi|157101601|gb|DS480723.1| GENE 141 166113 - 166727 165 204 aa, chain - ## HITS:1 COG:SPy1834 KEGG:ns NR:ns ## COG: SPy1834 COG1396 # Protein_GI_number: 15675661 # Func_class: K Transcription # Function: Predicted transcriptional 
regulators # Organism: Streptococcus pyogenes M1 GAS # 1 65 1 65 195 60 40.0 2e-09 MEFNEKLQQLRIGKNLTQQQLAEQLYVSRTAISKWESGKGYPNIESLKCISRFFSITIDE LLSSEELITLAEAENHSNVNRIHNIISGIIDVLAIALIVLPIYGEPQGSHFYHVNLLSIT HLSNIDIGIYWAIYLLIIGFGIARLAFIFLGKERLCGIISKASAATGAAAICLFAAAREP YVTILLFLFFAVKIFLLLQQSRTK >gi|157101601|gb|DS480723.1| GENE 142 167106 - 167564 496 152 aa, chain + ## HITS:1 COG:CAC2491 KEGG:ns NR:ns ## COG: CAC2491 COG0454 # Protein_GI_number: 15895756 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 148 1 144 155 95 36.0 3e-20 MRFEKVGMEKLEDLVKTRIQVLRAANKLDGSADMGQTEQESRTYYEKSLEDGSHVAYLVY DGDLIIGTGGISFYQVMPTYHNPTGMKAYIMNMYTDPEYRRKGIAMRTLDLLVQEAWGRG IRFIALEATGMGRPLYERYGFCRMEDEMYLPE >gi|157101601|gb|DS480723.1| GENE 143 168000 - 168197 234 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|288871602|ref|ZP_06118187.2| ## NR: gi|288871602|ref|ZP_06118187.2| toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] toxin-antitoxin system, antitoxin component, Xre family [Clostridium hathewayi DSM 13479] # 2 55 32 85 87 76 64.0 7e-13 MRKSKGLSVPYLRDYFGFSTTNAIYKWLRGDTLPSLDNMFALSLLLGESVNDILVAECGD ERTAG >gi|157101601|gb|DS480723.1| GENE 144 168363 - 168797 395 144 aa, chain + ## HITS:1 COG:phnO KEGG:ns NR:ns ## COG: phnO COG0454 # Protein_GI_number: 16131919 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Escherichia coli K12 # 2 138 6 142 144 94 40.0 7e-20 MIRYATQKDENAIYELLCELEGMSLDKEGFHEAYRDYMKDDKMHCLVAEMDKEVAGVLNL RIGTMLCRCGKIGEIVELAVRNSMRSQGIGHGLFQEAYHIAREAGCVRFEVSTNRTRLGA HRFYEREGMERSHYRFIRYFSIEK >gi|157101601|gb|DS480723.1| GENE 145 169016 - 170443 1512 475 aa, chain + ## HITS:1 COG:FN0944 KEGG:ns NR:ns ## COG: FN0944 COG0534 # Protein_GI_number: 19704279 # Func_class: V Defense mechanisms # Function: 
Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 14 441 16 443 455 320 42.0 3e-87 MNQNTNQYMAEESIGKLMLRFSIPCIMSLLVSALYNIVDQIFIGRGIGFLGNGATNVVFP VTVIALSLSLLVGDGCAAYLSICQGRRDGEHAHRSVGNAVVFITASGIVLTLLYAFFRDS ILWGFGATENNIGYAREYFLYLIPGIPFFMFANAMNSVIRADGSPQFAMFSTLIGCALNV VLDPVAIFVLGWGMKGAALATITGQIVSALLAVYYLFRPKSFRLKRVSFKPDAEILKHVL PLGISSFLTQVSIVVIMAVMNNVLVIYGAGSKYGADIPMTVVGIVMKVFQIVISVVVGIA AGAQPIVGYNYGAGLWRRVKLIFRTMMAAEFSVGLVSLICFEVFPVQIISVFGSEDGLYN EFAVLAFRVFLGGIVLCCIQKSCSIFLQSMGKPALSMLLSLLRDFVLSVPLTLVLPRFFG VTGALYSGPAADVISFGAAVLCMAAVFRKLNRLEDGEVALSGASQGGLAVERKIC >gi|157101601|gb|DS480723.1| GENE 146 170601 - 171266 709 221 aa, chain + ## HITS:1 COG:ECs0395 KEGG:ns NR:ns ## COG: ECs0395 COG0110 # Protein_GI_number: 15829649 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Escherichia coli O157:H7 # 23 220 4 200 203 204 46.0 1e-52 METERNQQRNQDGRTDGVRDGDSAAKRMEEGRLYFPGDEDILKEQMECMELLYDYNATRP RESARRTSLLEQMFGSIGRDCYIEPPLHSNWGGRHVYMGDFVYANFNLTLVDDGEIYIGS HCMIGPNVTIATAGHPVEPGLRRKGIQFNMPVHIGENVWIGAGAVVVPGVTIGDNSVIGA GSVVTRDIPANVVAVGNPCRVLREIGERDRIYYYKDRKIDM >gi|157101601|gb|DS480723.1| GENE 147 171299 - 171751 156 150 aa, chain + ## HITS:1 COG:no KEGG:Caci_3891 NR:ns ## KEGG: Caci_3891 # Name: not_defined # Def: hypothetical protein # Organism: C.acidiphila # Pathway: not_defined # 31 121 40 127 155 70 38.0 1e-11 MSDIIYKIFPRYYYPKYTDDQIKRAAGMLKLTSNGRITFTNYKSVQFIDCAGELEHIFCP WCGREVDREFWQKAMDTAYDGSAFKYLGIRMPCCHQPSSLEELIYVKPCGFATFVIEIWN PQEMPSVVELHEMGKCFGNVHFFRMISACI >gi|157101601|gb|DS480723.1| GENE 148 171825 - 172010 87 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942184|ref|ZP_02089499.1| ## NR: gi|160942184|ref|ZP_02089499.1| hypothetical protein CLOBOL_07074 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07074 [Clostridium bolteae ATCC BAA-613] # 1 61 1 61 61 79 100.0 7e-14 MGKDKYICPCFKVTKDDIKKAIEEGADSFKKVKKATHLAAGCGHCKCRAKKYTKKRLGKI K 
>gi|157101601|gb|DS480723.1| GENE 149 172010 - 172522 672 170 aa, chain - ## HITS:1 COG:CAC0845 KEGG:ns NR:ns ## COG: CAC0845 COG1528 # Protein_GI_number: 15894132 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Clostridium acetobutylicum # 13 170 13 170 170 189 60.0 2e-48 MLDKKVAELINTQVNKEFYSAYLYLDFANYYKDAELNGFSNWYQVQAQEERDHAMLFIQY LQNNGEKITLEAIAKPDKVFEDFRGPLTAGLEHENYVTGLIHDIYDAAYSVKDFRTMQFL DWFVKEQGEEEKNASELVKRFDLFGHDPKGLYMLDSELAARVYAAPSLVL >gi|157101601|gb|DS480723.1| GENE 150 172671 - 174026 1280 451 aa, chain - ## HITS:1 COG:BS_yveR KEGG:ns NR:ns ## COG: BS_yveR COG0463 # Protein_GI_number: 16080483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 2 259 4 259 344 154 34.0 4e-37 MPFVSLIIPVYNAEKYLRRCLNSAMEQTFQDMEIIVVNDGSQDTSLEICREYEKMDRRFR VINKENTGVSDSRNQAIETARGEYLQFMDSDDWLTPDATESFVHAARKFDCDLVVSDFYR VDGAVFTEKQHIRERGLLTRAQYAEYMMREPADFYYGVLWNKLYRRSIVQEHQLKMDEDL RWCEDFLFNLNFIRYAGRFTAIQTPVYYYMKRKGSLVSTDWKKANTVKLKLRLLKDYKEL YQSMDLYEENKLKINAFAVSIAKDGGIGAPMSRLRKKLSEEDYIEDELPAGYTRVCHTLG PIYDEASRILILGSFPSVKSREQNFYYGHPQNRFWKLMARLFEEPVPETTEDKKGLLMRH HIALWDVVAACDIKGSSDRSIRNVIPTDLNRILRTARIETILANGDTAFQLYRKYCRETT GREAVKCPSTSPANAFFTLDRLAEAWGRELF >gi|157101601|gb|DS480723.1| GENE 151 174445 - 176448 2008 667 aa, chain + ## HITS:1 COG:CAP0099 KEGG:ns NR:ns ## COG: CAP0099 COG1193 # Protein_GI_number: 15004802 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 16 655 14 614 629 365 33.0 1e-100 MNGTFEAYKTVAFDIIREQLADLANSAEAREMARGLVPCMEEGELRRSMRDTTQARQMLE LAGTPPMPAMEHTAEFVARSVRGELLTPEEMEEIGMFLAAVRRVRSYLDKGMSYGIPLSC YGENLRLMTELKEEIEGAIRHGRVDDRASNTLRDIRRDLQLLEEKIKDKAEALLKSQKKF MAESFLVTRNGRLCLPVKKEYKSKIPGSTIDRSSSGSTVFIEPETIARMQDEIEGLRIEE DCEERRILYTLMDRIAMEEDGFKENLALLAKLDFVFAKGKLSAQMDAREPAVNTQGIICL 
KEARHPLLPGDSNVPLDFKLGGETRGMVITGPNTGGKTVAIKTVGLFVLMACSGLHLPCR EADIAMRNLVLCDIGDGQDILDNLSTFSAHITNVLEILKRATGESLVILDELGSGTDPAE GMGIAIAILEQLRLRGCLYLVTTHYPEVKTYAGRHAELISARMAFDRENLRPLYRLEMGK TGDSCALYIAKRLGMPEDMIRTAASEAYGDGGELEKPLEKTREKTLYKTREKPQERSQEK PLEKSPGRQWPGGLERILVPCIVPIKRKPAAAGPEEGYTRGDSVEVLPKGEIGIVVRPAD QTGNVLVQVKREKRLVNQKRLKLKVRASELYPEDYDFSIVFDSVENRKARHQMERKYVGN MEVRAEE >gi|157101601|gb|DS480723.1| GENE 152 176455 - 177675 936 406 aa, chain - ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 191 401 316 516 530 84 28.0 4e-16 MEEHEKALISRQFIEVMLKVPVILCPGRIEAIETSIKEYFPDFIKNSFPFFITRSMAESL QTGTLYHITDFINLHFNLFLYGGGTQLLIAGPYLSHPADTAFCEQILQDNGKNLSLLVPF SQFCLSLPVVGSSQMLEVMRTAMRVITGTDREIPYLHYEPQRHYTQEISNLSAEGVDEAS MELLEHRYYYEKLLLQEVAQGSQDLALKYYRQFSQESRSIVRTEDPLRTAKNLGFSLNTM LRKSAETTGIHPVYLDIISSNFAMLIENANSIREAEECKSQMISAYCRFVRKNRLDQYSP LVRKAVTYIRIHLADQLTLAGIAKGIRVSPSYLSRLFNRETGDSVSNFITKARVEKAAGL LSFSGMSIQNIAFYVGFGDLNYFSRCFKKYKKMTPTEYRASSSIGV >gi|157101601|gb|DS480723.1| GENE 153 177886 - 179391 1684 501 aa, chain + ## HITS:1 COG:STM0149 KEGG:ns NR:ns ## COG: STM0149 COG2211 # Protein_GI_number: 16763539 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Salmonella typhimurium LT2 # 35 479 28 447 468 126 23.0 1e-28 MEKNKSVVKQYNNAKLWQIGFFSLNNCATNIAMFLMMQYSYFTQNVLGLAAAIVGLIATG TRIFDAVTDPLVGFLVDRTNGRFGKFRPYMLVGNIIIWCSLIVIFNTPSGWGISRKYMFT TVFYVIYIIGYTCQTVVTKAGQAVLTNNPKQRPIFAGFDSVLTQMASALVPMLITTILAE KYSVGEFAGENGKGLGMINPAMWKEAAVILAVISFVFTILAIIGISAKDRPEFFGQAGSQ PKVKFRDYKDIVMHNRPIQMLIISAATDKLGQLLMSGTMTYLFANMLLNSSLQGVFSSIL IAPLVAISIGGVFVSRRFGLKRTFLVGTWGSMIMLAIMFVVGPDPKVPAIFLGMFLVQKC LASMGNSGIIPMIADCTDYEMYRSGRFVPGMMGTMFSFIDKIISSFSAMIQGFALTLAGV 
GNVVIKPNEPVNSTFNMAIMVCFCIVPILGHIATIIAMRWYYLDKVKMAEIQDELENRKE ALAGKGTRPDGAEEAAAGAVS >gi|157101601|gb|DS480723.1| GENE 154 179432 - 181675 2070 747 aa, chain + ## HITS:1 COG:Cgl0317 KEGG:ns NR:ns ## COG: Cgl0317 COG1472 # Protein_GI_number: 19551567 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Corynebacterium glutamicum # 17 509 6 487 548 337 40.0 8e-92 MNKASANIRFYKNENGPVIGVTVRPVIEKDGLYFRDLTGDGRLAPYKDWRNTPAVRAASL AAELSADEKIGMLFVNSWKMGIYQEDRTKVDESGLLNEEIVEQDESIFNVEKTYGTTYTL KEMGIRHFILRQNPGPGELADWINQLNMVAEGTGHALPVMVLSNSRNEHGEIVFGMNDAA GVFAAWPGTLGIAAAVRGSGPELIDSFAECIRREWDAVGMKKGYMYMADVMTDPRWQRSY GTFGEDPGTVCTIMERLIPGIQGSSRGVTRDGVAVTIKHFPGGGARENGFDPHYVQGQWN VYQTEDSLRTYHLPAFRTAIEKKASSIMPYYAKPSAAKSRPQYGPGGDVMEMEPVGFAFN RSFIQGLLREQMGFEGYVNSDSGISNKMAWGVEELDVPSRIALAVNTGVDIISGSLDVFS AREAYERGRNGYYTAQGHPVPQGYRAGDLVLTDEALTRAVTRTLKEKFELGMFDNPYRDP ETAKQVVATKKHWEDAYRVHQQSVVLLKNKEGLIPLDREKTAGRKVYVECFGCEAEAAAR ETDAVRTSFAGRFQAELTEDYREADFAILFIRPSSGEYFHSTKGYLELDICENKQVRDVD SQGRPADSFHQETTLLGAGRIREIYESLHSRGGKVISNINFTLAWEVGNVEPCADALLAG FDTYTDAVLDVIMGRCAPTGRMPITLPRNDSVIRVDRDGICISRNDVPGYHKDKYMPESM KDENGKAYAYRDSEGNYYELDFGLTLE >gi|157101601|gb|DS480723.1| GENE 155 181965 - 182387 494 140 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1132 NR:ns ## KEGG: CDR20291_1132 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 140 1 140 140 189 67.0 3e-47 MDHQNEKCVMVIDEELPTGIIANTAGIMGITLGKKLPETVGPDVSDRNGRSHLGIIAIPV PILKASKEKLKELRLKLYEPSFSELTVVDFSNVAQSCNDYDEFASKAAQTDGDSFQYFGV GICGAKKLVNKLTGNLPLLR >gi|157101601|gb|DS480723.1| GENE 156 182360 - 183289 874 309 aa, chain - ## HITS:1 COG:BS_ybfH KEGG:ns NR:ns ## COG: BS_ybfH COG0697 # Protein_GI_number: 16077290 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis 
# 1 296 1 296 306 271 55.0 1e-72 MNHNRTAAGHLAAFVTILIWGTTFISTKVLLRTFSPVEILFIRFVMGYLALWLACPRSLR LTSARQEGLYAAAGFCGVTMYYLLENIALTYTLASNVGVIISIAPFFTAILGWMFLGGER PRFRFFAGFLLAMAGISLISFGNGAALSLNPTGDLLAVAAAVIWAVYSTLTKKISALGHG TVQSTRRTFFYGILFMVPALAFMDFYVTPAQFTDMKNILNLLFLGLGASALCFVTWNTGV KILGPVKTSVYIYMVPVITTLTSALILKEPVTIPAALGIIMTLAGLFLSEQKTTKKGEPK IWTTKMKNV >gi|157101601|gb|DS480723.1| GENE 157 183279 - 184139 771 286 aa, chain - ## HITS:1 COG:BS_ybfI KEGG:ns NR:ns ## COG: BS_ybfI COG2207 # Protein_GI_number: 16077291 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus subtilis # 4 274 2 272 275 259 47.0 5e-69 MIQEQEQRRVYYDKDLKIEAYNLSGVVQKFPNHFHEYYVIGFVEGGKRHLWSRNKEYDVS AGDLILFNPRDNHYCAPIDGEPLDYRAVNIDPSVMEQAVREITGREFTPRFTRNVVYRSD ITQSVSALYGAILRRATRLEKEEALFFLLEQVLQEYAAPFKEEDISRPDDRIQNLCTYME DHFSENISLDELLSMTTFGKSYLLRSFTKQVGVSPYRYLQTIRLDKAKKFLEQGMPPVEA AAMAGFSDQSHFTNFFKEFIGLTPRQYQKIFLSEPDEEQKEHYHES >gi|157101601|gb|DS480723.1| GENE 158 184035 - 184226 116 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEMIGEFLYNPAQIVGFYFKVFIIIDPALFLLLYHFAAPFKLLFCYFNTMGRENLARYDT RMD >gi|157101601|gb|DS480723.1| GENE 159 184303 - 184524 176 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942195|ref|ZP_02089510.1| ## NR: gi|160942195|ref|ZP_02089510.1| hypothetical protein CLOBOL_07085 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07085 [Clostridium bolteae ATCC BAA-613] # 1 73 47 119 119 140 100.0 2e-32 MHIEIDLERLLKEKHKSKNFVCEECGLQRTQFNNYCKNKVSRVDLTILAKMCACLDCTPN DILKIVKETAKDE >gi|157101601|gb|DS480723.1| GENE 160 184653 - 185036 390 127 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942194|ref|ZP_02089509.1| ## NR: gi|160942194|ref|ZP_02089509.1| hypothetical protein CLOBOL_07084 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07084 [Clostridium bolteae ATCC BAA-613] # 1 127 1 127 127 248 100.0 1e-64 MHASLILMALCIYLQWGRGVSVRNTISSSRDGKRTEHITIVANKLYIGDKEAFAGDMIET 
CINNEFRDVRFSYDMGYPVEITMDIYTNDTARRLGLRCCEVRYAQAEEDRYRYNVKDDRE RFVMTVK >gi|157101601|gb|DS480723.1| GENE 161 185116 - 185433 355 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942196|ref|ZP_02089511.1| ## NR: gi|160942196|ref|ZP_02089511.1| hypothetical protein CLOBOL_07086 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07086 [Clostridium bolteae ATCC BAA-613] # 1 105 1 105 105 166 100.0 7e-40 MEAMFYALAIKFLDKTELRKVKEKINMTLLGQMLMEDGIEKGEMTKLVSLVMKKMKKGLS PEETAELLEESPDCVSRIYAAVQANPGLDENGIFEFMKKKDREKD >gi|157101601|gb|DS480723.1| GENE 162 185487 - 185951 233 154 aa, chain - ## HITS:1 COG:no KEGG:CLB_2021 NR:ns ## KEGG: CLB_2021 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 24 153 3 132 271 89 34.0 5e-17 MFTGKGGSCSNTRYPPNHQIQDTPNQLEDKVLKTAAMFFGQDLLPYLGVRGRITGLVPTE QIHLELRRMEEDFNYMMEDGSLRHLEFESDSITSRDLRRFREYEAYLSLIYNCPVITTVL CTSHVRRIKQELVTGINVYRIQVIRIKDRNADKN >gi|157101601|gb|DS480723.1| GENE 163 186329 - 187249 921 306 aa, chain + ## HITS:1 COG:BS_yojN KEGG:ns NR:ns ## COG: BS_yojN COG0714 # Protein_GI_number: 16078999 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Bacillus subtilis # 27 289 32 298 304 136 32.0 5e-32 MTVIEFLKEEKINSSLIEGIMEFRASHPVREEDRGRIPVPRYLYYGRDVWEQAVAALLCG QNLLLTGPKATGKNILAQNLAAAFGRPVWDISLHVNMDASGLIGTDTFENGAVVFRPGPV YRCGISGGFGVLDEINMARNEALAVLHGILDFRRTIDVPGYDVVRMDEGTRFIATMNYGY AGTRELNEALTSRFSCLQMPAITGENLEKLIDREYPGLKRTYARQFVQLFLDIERKCQSA QLSTRPLDLRGLLDAVRLMERGLDAGAALDMGLTNKSFDPYEQTLVRDLIRARIPKKLER EKLFAD >gi|157101601|gb|DS480723.1| GENE 164 187239 - 187710 290 157 aa, chain + ## HITS:1 COG:no KEGG:Acfer_0396 NR:ns ## KEGG: Acfer_0396 # Name: not_defined # Def: von Willebrand factor type A # Organism: A.fermentans # Pathway: not_defined # 9 157 5 153 574 106 34.0 3e-22 MQTDIKMQEYDGEERRALNMIWTAAKDHSFRPEFMAFDRYGRADLYLNSIIGYVHRWYDG GKVSEMFGAFQGTALQDIYDTIFWLGLECGAYEKEREGRPGLEELRREYWAQVLEESKWS AQEKLVQSLQTGWGRMVLGEKPGVTPWERGILSGLSF 
Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:06:46 2011 Seq name: gi|157101600|gb|DS480724.1| Clostridium bolteae ATCC BAA-613 Scfld_02_65 genomic scaffold, whole genome shotgun sequence Length of sequence - 84067 bp Number of predicted genes - 94, with homology - 93 Number of transcription units - 34, operones - 24 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 244 - 303 8.3 1 1 Op 1 . + CDS 411 - 839 374 ## Apre_1718 transcriptional regulator, MarR family 2 1 Op 2 35/0.000 + CDS 887 - 2617 209 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 3 1 Op 3 . + CDS 2614 - 4416 214 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 4462 - 4511 12.9 - Term 4441 - 4504 14.1 4 2 Op 1 24/0.000 - CDS 4527 - 5351 974 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 5 2 Op 2 . - CDS 5344 - 6111 218 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 - Prom 6207 - 6266 2.3 - Term 6188 - 6236 10.2 6 3 Tu 1 . - CDS 6269 - 7735 1308 ## Closa_2429 polysaccharide deacetylase - Prom 7822 - 7881 6.4 + Prom 7861 - 7920 4.2 7 4 Op 1 . + CDS 7951 - 8391 519 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases + Prom 8425 - 8484 2.0 8 4 Op 2 . + CDS 8539 - 9891 1300 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Term 9966 - 10003 6.2 - Term 10282 - 10343 12.1 9 5 Tu 1 . - CDS 10496 - 11500 1142 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 11607 - 11666 6.6 + Prom 11468 - 11527 5.3 10 6 Op 1 1/0.000 + CDS 11601 - 12275 385 ## COG0500 SAM-dependent methyltransferases 11 6 Op 2 . + CDS 12245 - 14734 1489 ## COG1061 DNA or RNA helicases of superfamily II + Term 14765 - 14825 17.6 + Prom 14910 - 14969 5.3 12 7 Op 1 . + CDS 14991 - 15794 877 ## CDR20291_2752 hypothetical protein 13 7 Op 2 . 
+ CDS 15808 - 16497 882 ## CDR20291_2751 hypothetical protein + Prom 16513 - 16572 2.1 14 8 Op 1 . + CDS 16718 - 17824 1114 ## COG2195 Di- and tripeptidases 15 8 Op 2 . + CDS 17849 - 19186 1309 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 16 8 Op 3 . + CDS 19236 - 20549 1398 ## CDR20291_0449 putative transcriptional regulator + Prom 20551 - 20610 6.3 17 9 Op 1 . + CDS 20670 - 21479 1050 ## COG0561 Predicted hydrolases of the HAD superfamily + Prom 21546 - 21605 7.7 18 9 Op 2 . + CDS 21693 - 22013 447 ## COG4496 Uncharacterized protein conserved in bacteria + Term 22039 - 22083 11.0 - Term 22027 - 22070 3.2 19 10 Tu 1 . - CDS 22150 - 22767 774 ## Closa_1554 hypothetical protein - Prom 22802 - 22861 6.5 + Prom 22933 - 22992 8.6 20 11 Op 1 . + CDS 23026 - 25368 2650 ## COG0210 Superfamily I DNA and RNA helicases + Prom 25509 - 25568 8.9 21 11 Op 2 . + CDS 25619 - 25936 477 ## Closa_1728 hypothetical protein + Term 26011 - 26046 7.2 + Prom 26026 - 26085 6.2 22 12 Tu 1 . + CDS 26105 - 27538 1053 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase + Term 27581 - 27645 7.4 + Prom 27623 - 27682 5.3 23 13 Tu 1 . + CDS 27770 - 28615 469 ## gi|160942250|ref|ZP_02089559.1| hypothetical protein CLOBOL_07136 + Term 28644 - 28683 3.3 + Prom 28628 - 28687 3.0 24 14 Op 1 . + CDS 28709 - 29137 225 ## Dhaf_2445 hypothetical protein 25 14 Op 2 . + CDS 29210 - 29593 320 ## gi|160942252|ref|ZP_02089561.1| hypothetical protein CLOBOL_07138 + Term 29666 - 29711 8.1 + Prom 29656 - 29715 3.5 26 15 Op 1 . + CDS 29748 - 30458 246 ## LM5578_1902 hypothetical protein 27 15 Op 2 . + CDS 30452 - 32086 994 ## Closa_3096 hypothetical protein 28 15 Op 3 . + CDS 32149 - 32364 110 ## gi|160942255|ref|ZP_02089564.1| hypothetical protein CLOBOL_07141 29 15 Op 4 . + CDS 32383 - 33564 955 ## DSY0069 hypothetical protein 30 15 Op 5 . + CDS 33601 - 34530 667 ## COG0338 Site-specific DNA methylase 31 15 Op 6 . 
+ CDS 34517 - 34900 368 ## LM5578_1900 hypothetical protein 32 15 Op 7 . + CDS 34900 - 35328 327 ## CKR_3439 hypothetical protein 33 15 Op 8 . + CDS 35321 - 35611 348 ## CKL_3895 hypothetical protein 34 15 Op 9 . + CDS 35601 - 36176 346 ## CKR_3438 hypothetical protein 35 16 Op 1 . + CDS 36341 - 37366 740 ## gi|160942262|ref|ZP_02089571.1| hypothetical protein CLOBOL_07148 36 16 Op 2 . + CDS 37303 - 37497 110 ## gi|160942263|ref|ZP_02089572.1| hypothetical protein CLOBOL_07149 37 17 Op 1 . + CDS 37632 - 38189 576 ## LM5578_1896 hypothetical protein + Term 38202 - 38232 1.0 38 17 Op 2 . + CDS 38245 - 38475 225 ## HM1_0583 hypothetical protein 39 17 Op 3 . + CDS 38517 - 38858 407 ## COG2088 Uncharacterized protein, involved in the regulation of septum location 40 17 Op 4 . + CDS 38859 - 39254 191 ## CKR_3435 hypothetical protein 41 17 Op 5 . + CDS 39268 - 39375 89 ## 42 17 Op 6 . + CDS 39377 - 39754 189 ## gi|160942270|ref|ZP_02089579.1| hypothetical protein CLOBOL_07156 43 17 Op 7 . + CDS 39773 - 40642 907 ## LM5578_1891 pilus assembly protein CpaB 44 17 Op 8 . + CDS 40643 - 41455 704 ## HM1_0586 hypothetical protein 45 17 Op 9 . + CDS 41449 - 43086 1380 ## COG4962 Flp pilus assembly protein, ATPase CpaF 46 17 Op 10 . + CDS 43083 - 44012 793 ## LM5578_1888 hypothetical protein 47 17 Op 11 . + CDS 44018 - 44890 992 ## LM5578_1887 hypothetical protein 48 17 Op 12 . + CDS 44918 - 45286 145 ## Sgly_0061 hypothetical protein 49 17 Op 13 . + CDS 45328 - 45678 315 ## Closa_3112 hypothetical protein 50 17 Op 14 . + CDS 45675 - 46220 419 ## LM5578_1885 hypothetical protein + Prom 46235 - 46294 3.3 51 18 Op 1 . + CDS 46326 - 46943 505 ## LM5578_1884 hypothetical protein + Term 46960 - 46987 -0.8 52 18 Op 2 . + CDS 46990 - 48696 1021 ## LM5578_1883 hypothetical protein + Term 48741 - 48771 1.8 53 19 Op 1 . + CDS 48803 - 49123 275 ## CKR_3426 hypothetical protein 54 19 Op 2 . 
+ CDS 49171 - 49758 399 ## gi|160942282|ref|ZP_02089591.1| hypothetical protein CLOBOL_07168 55 19 Op 3 . + CDS 49802 - 50194 204 ## gi|160942283|ref|ZP_02089592.1| hypothetical protein CLOBOL_07169 56 19 Op 4 . + CDS 50225 - 50503 188 ## gi|160942284|ref|ZP_02089593.1| hypothetical protein CLOBOL_07170 57 19 Op 5 . + CDS 50527 - 50895 209 ## gi|160942285|ref|ZP_02089594.1| hypothetical protein CLOBOL_07171 58 19 Op 6 . + CDS 50964 - 51866 1062 ## CKL_3881 hypothetical protein 59 19 Op 7 . + CDS 51889 - 52290 217 ## Dtox_1498 hypothetical protein 60 19 Op 8 . + CDS 52313 - 52825 495 ## gi|160942288|ref|ZP_02089597.1| hypothetical protein CLOBOL_07174 61 19 Op 9 . + CDS 52836 - 53096 241 ## HM1_0601 hypothetical protein 62 19 Op 10 . + CDS 53093 - 53695 530 ## CKR_3422 hypothetical protein + Prom 53740 - 53799 5.5 63 20 Op 1 . + CDS 53836 - 54600 520 ## COG3177 Uncharacterized conserved protein + Term 54626 - 54664 7.1 64 20 Op 2 . + CDS 54685 - 56514 1644 ## COG3451 Type IV secretory pathway, VirB4 components 65 20 Op 3 . + CDS 56531 - 57592 752 ## CKL_3875 hypothetical protein 66 20 Op 4 . + CDS 57606 - 58121 367 ## LM5578_1874 hypothetical protein + Term 58128 - 58173 7.3 - Term 58116 - 58159 6.1 67 21 Tu 1 . - CDS 58162 - 58536 297 ## ELI_2970 hypothetical protein - Prom 58596 - 58655 5.0 + Prom 58555 - 58614 9.1 68 22 Op 1 . + CDS 58707 - 59711 665 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) + Term 59757 - 59787 -0.9 69 22 Op 2 . + CDS 59807 - 60085 401 ## gi|160942298|ref|ZP_02089607.1| hypothetical protein CLOBOL_07184 + Term 60107 - 60151 1.4 70 23 Op 1 . + CDS 60174 - 61112 607 ## LM5578_1872 hypothetical protein 71 23 Op 2 . + CDS 61127 - 61402 286 ## Closa_3132 hypothetical protein 72 23 Op 3 . + CDS 61399 - 61662 274 ## Sgly_0038 hypothetical protein + Term 61872 - 61909 1.1 73 24 Op 1 . + CDS 61939 - 62427 452 ## CKR_3415 hypothetical protein + Prom 62434 - 62493 4.6 74 24 Op 2 . 
+ CDS 62516 - 62650 120 ## gi|160942303|ref|ZP_02089612.1| hypothetical protein CLOBOL_07189 + Prom 62790 - 62849 4.6 75 25 Op 1 . + CDS 63019 - 65940 1682 ## COG0790 FOG: TPR repeat, SEL1 subfamily 76 25 Op 2 . + CDS 65909 - 66937 414 ## Sgly_0035 hypothetical protein 77 25 Op 3 . + CDS 67003 - 68796 1121 ## COG3505 Type IV secretory pathway, VirD4 components 78 25 Op 4 . + CDS 68812 - 69069 113 ## HM1_0623 hypothetical protein + Prom 69186 - 69245 2.3 79 26 Tu 1 . + CDS 69357 - 69809 531 ## COG1349 Transcriptional regulators of sugar metabolism + Term 69870 - 69918 3.1 - Term 69851 - 69912 7.7 80 27 Tu 1 . - CDS 69970 - 71211 482 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 71373 - 71432 9.3 + Prom 71352 - 71411 10.2 81 28 Op 1 . + CDS 71549 - 72013 65 ## COG1959 Predicted transcriptional regulator 82 28 Op 2 . + CDS 72091 - 76536 3292 ## COG5263 FOG: Glucan-binding domain (YG repeat) + Term 76566 - 76612 7.2 83 29 Op 1 . + CDS 76624 - 77091 165 ## gi|160942314|ref|ZP_02089623.1| hypothetical protein CLOBOL_07200 84 29 Op 2 1/0.000 + CDS 77111 - 77389 138 ## COG0776 Bacterial nucleoid DNA-binding protein 85 29 Op 3 . + CDS 77454 - 78782 854 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases + Prom 78813 - 78872 2.1 86 30 Op 1 . + CDS 78940 - 79290 222 ## Closa_2032 hypothetical protein 87 30 Op 2 . + CDS 79387 - 79809 271 ## gi|160942318|ref|ZP_02089627.1| hypothetical protein CLOBOL_07204 88 30 Op 3 . + CDS 79851 - 80270 188 ## gi|160942319|ref|ZP_02089628.1| hypothetical protein CLOBOL_07205 + Prom 80328 - 80387 1.8 89 31 Op 1 . + CDS 80416 - 80655 188 ## gi|160942320|ref|ZP_02089629.1| hypothetical protein CLOBOL_07206 90 31 Op 2 . + CDS 80705 - 81214 241 ## COG1528 Ferritin-like protein 91 32 Op 1 . + CDS 81609 - 81974 216 ## EUBELI_10051 hypothetical protein 92 32 Op 2 . + CDS 81940 - 82080 106 ## gi|160942323|ref|ZP_02089632.1| hypothetical protein CLOBOL_07209 + Term 82153 - 82201 1.5 + Prom 82132 - 82191 4.3 93 33 Tu 1 . 
+ CDS 82229 - 82426 322 ## gi|160942324|ref|ZP_02089633.1| hypothetical protein CLOBOL_07210 + Prom 82635 - 82694 7.3 94 34 Tu 1 . + CDS 82714 - 83877 873 ## COG3328 Transposase and inactivated derivatives Predicted protein(s) >gi|157101600|gb|DS480724.1| GENE 1 411 - 839 374 142 aa, chain + ## HITS:1 COG:no KEGG:Apre_1718 NR:ns ## KEGG: Apre_1718 # Name: not_defined # Def: transcriptional regulator, MarR family # Organism: A.prevotii # Pathway: not_defined # 3 101 8 107 159 63 31.0 3e-09 MAFFNFKDQWFQELRKVNWLVRECVEEKCRPFGITPEQGRVLNHLFLADGQVNLTQLSRS LHVTKGNCSMFCRRLERAGHITMVKNDKDARFINVALTEKGRSLVQGMIEQMGSHSEKDT MSKEDLETILKGLKCLEKYLQS >gi|157101600|gb|DS480724.1| GENE 2 887 - 2617 209 576 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 333 557 131 354 398 85 25 1e-15 MKKIFAELNAFRPELLCVLVSVAAGVGATLGLPTYLSDIINRGIADKDMNYILHTGVIML GIAILGMVCNITTGFFASRIALGLGRNVRSRIFTKVEYFSQAEIDTFSTASLITRTNNDI TQVQNFMVMFLRVILTAPIMCVVGIMLAYSKNPKMSSILVVSMPVMVLIISLIGRRAMPL SRKMQTRIDRINLIMREKLSGIRVIRAFGTEDYEEKRFDGANKDLMNNAMKMMHAMSLLG PSLILILNLTVVGLLWRAGQGIGTEPVMPGDILAIIQYVMQIMMSVTMLSMIFVMYPRCA ASADRICEVLDTENSIGDPAVPKLSDVQRGYLTFRDVSFYFPGAKEPAVSHVSFEAKPGE TTAIIGSTGSGKTALVGLIPRFYDVQEGEVLVDGVNVKDYDRRTLRRKIGYVPQKALLFK GTIMENIRFGDDSASDERVKEAAAIAQSTDFIEDKPDGFDSSISQGGSNVSGGQRQRLAI ARAIVRKPEIYVFDDSFSALDFKTDSALRQALSKETGNATVVIVAQRVSTIMNADRIIVM DEGKVMGIGTHRELLNTCGTYQEIVRSQLSEEEMSA >gi|157101600|gb|DS480724.1| GENE 3 2614 - 4416 214 600 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 362 579 8 226 245 87 29 3e-16 MKQMNVRAADKKINTKGTMGRLFRFMKPYRIRIILMVACLVMGAVFTTQGPYTLGRAMDA LVAVAVDSAGVLQGFRTFITVLIQLGCVYVLAFLFNYSGQYIVAGVAERTMHDLRMAVDK KIRRLPLAYFDSNTFGDVLSRVTNDVDTIDTSLQQSISQVITAVCTMVFIFVMMLVVSPV LTLIGICVIPICGLVSMKVVKHSQRYFQGQQSALGDLTGYVEEMYNGQNVIAAFGKEEDV IENFEDINNRLYNNGWRAQFSSSIIMPLTQALTNIGYVGVAVVSGWLCINGRLSIGMIQS 
FIQYLRQFSQPINQVSNIANIMQATMAAAQRVFEFLDAEEEVPEAEKPQFPESREGNVDF SHVRFGYTEDRTLIHDLDLHVHAGDKIAIVGPTGAGKTTLVNLILRFYDVNGGSITIDGV DVRDMKREALRSMIGMVLQDTWLFAGTIKENIRYGRLDATDQEVTDAARAAHANGFIMSM PGGYDMELHEGASNIAQGQRQLLTIARAFLSDPEILILDEATSSVDTRTEVAIQKAMNKL MEGRTSFVIAHRLSTIKDAELIVYMEHGDIKEVGNHRELLAKGGYYAALYNSQFAAENAG >gi|157101600|gb|DS480724.1| GENE 4 4527 - 5351 974 274 aa, chain - ## HITS:1 COG:CAC0618 KEGG:ns NR:ns ## COG: CAC0618 COG0600 # Protein_GI_number: 15893906 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Clostridium acetobutylicum # 13 273 3 263 264 202 40.0 6e-52 MNRDKQPVSACNMSVAQQEYVAREIRRHRQVTICRVMLLVLLLALWELAADMHWIDSFIF SSPVMIAKCLVSMAKDGSLFLHTGVTLMETLISFAVCTIFGLACALLLWSSKSVAQVLEP FLVLLNSLPKSALAPLLIVWLGNNMRTIVVAAVSVAVFGSILTLYTGFSQMDTEQIKLIY SLGGGRKDVLLKVLLPGSLPLVISNMKVNIGLCLVGVIIGEFLSAKAGLGYLITYGSQTF AMTMVVTSIVILCIVSGVLYQIIAYAELKIKKRQ >gi|157101600|gb|DS480724.1| GENE 5 5344 - 6111 218 255 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 210 1 210 305 88 30 1e-16 MIPKLEVNGVSYSYHSLDGETLALDHISFDVSPGEFLAIVGPSGCGKSTLLSMLCGLTKP EGGTITIDGVPLTKSPSAIGYMLQKDHLFEWRNIFSNISLGLEIQKRLDEPARQELLEMM AAYGLAGFEYAKPSELSGGMRQRAALIRTLALKPDLLLLDEPFSALDYQTRLSVCDDISS IIRQTHKTAILVTHDLSEAISVADRILVLSPRPGRVKKILSIDFPENCIRSLDRRNCPEF STYFNMVWKELKTYE >gi|157101600|gb|DS480724.1| GENE 6 6269 - 7735 1308 488 aa, chain - ## HITS:1 COG:no KEGG:Closa_2429 NR:ns ## KEGG: Closa_2429 # Name: not_defined # Def: polysaccharide deacetylase # Organism: C.saccharolyticum # Pathway: not_defined # 63 482 62 466 469 507 59.0 1e-142 MSRHLPTFILLLAITVLVGGGIFALYRFVLAKQPDGTVQVIASSGNGSPNPDEGSNSDRY PDSGSGPASGGDSSNQPFAEIPQDDISRLIAQADRIAMGYDYDKAAELINSSGLDLNDSR MKEALARYESQKAALVPADMNAVTHIFFHSLIMDTSKAFDGDSDSANYNSVMTTKDEFLK ILEDMYQKGYVLVRIHDVAYEAPDENGNVRFVKGSVMLPEGKKPFVMSQDDVCYYPYMDG 
DGFAKRVVIGEDGKPTCEMVMDDGTTSTGSYDLIPLLENFIQEHPDFSYKGARAIIAFTG YEGILGYRTASSYQDSPTYEADREQAARVAQCLKDNGWELASHSWGHLHLGVADDPEAGF AIDEERFRADTDKWEAEVESLIGPTDIILYPYGNDIADWRPYHQDNPRFAYLESKGFRYF CNVDASKPYWTQMGTNYFRMARRNLDGYRLYEDLIQTDPSKKRLTDLFDAAQVFDSARPT PVSWSYQK >gi|157101600|gb|DS480724.1| GENE 7 7951 - 8391 519 146 aa, chain + ## HITS:1 COG:SP0628 KEGG:ns NR:ns ## COG: SP0628 COG0537 # Protein_GI_number: 15900535 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Streptococcus pneumoniae TIGR4 # 10 142 13 143 167 61 32.0 5e-10 MEDMVKDPNCAYCMQGELVAKFGYPVCEMKTGFLYVFKEQSKKGRVVLAHRKHVSELIDL TDEERNDFFAEVAQVARAVHKVFQPDKVNYGAYGDTGHHLHFHIVPKYKGGEEWGGTFEM NSGRTMLTDAEYEKMAEDLRQALKEV >gi|157101600|gb|DS480724.1| GENE 8 8539 - 9891 1300 450 aa, chain + ## HITS:1 COG:BH1535 KEGG:ns NR:ns ## COG: BH1535 COG1686 # Protein_GI_number: 15614098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 80 445 24 383 387 261 42.0 3e-69 MEENMRRIAAILCSAAILASYPGIAALGQEPAIAIPVGAGVSGSSLYGQEDGWTGEDVQA AGDAQAVDVQTAGDAQAAPADQSQAAVQIAAPSAILMEASTGQVIYEKDADEKRSPASVT KVMTLILIFDALQSGKIQLTDEVVTSAHAKSMGGSQVFLEEGEKQTVETLIKCIVIASGN DASVAMAEYIAGSEDEFVRMMNERAASLGMSNTHFVDCCGLTESPDHYTTARDIAIMSRE LINKYPQIHNYSTIWMENITHVTKQGTKEFGLSNTNKLLKMATNFTVTGLKTGSTSIAKY CLSATAEKDGVRLIAAIMAAPDFKARFADAQTLLNYGYANCKLYEDKEHLPLPQMPVTGG VEDEVGLTYEGTFSYLSLKGEDLGAIEKKLVLLESVPAPVEPGQKAGVLEYSLGGKKLGE VNVLTNGSIREAGYMDYLKKLVGAWKLNRQ >gi|157101600|gb|DS480724.1| GENE 9 10496 - 11500 1142 334 aa, chain - ## HITS:1 COG:CAC0620 KEGG:ns NR:ns ## COG: CAC0620 COG0715 # Protein_GI_number: 15893908 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Clostridium acetobutylicum # 18 334 20 
335 338 255 41.0 1e-67 MKKAISILLAAAMGAAALTGCSKSSGSSSSGNTPLVLNEVAHSIFYAPQYAAIELGYFED EGIDLTLVNGAGADKVMTALISGDAQIGFMGSEASIYVYQEGSQDYAVNFAQLTQRAGNF LVGRQPEDNFKWESLKGKKVLGGRAGGMPQMVFEYILKKHGMDPKTDLSIDQSINFGLTA AAFTSDDADYTVEFEPFATTLESEGSGYVVASLGTESGYVPYTAYCARKSYVEQNPEIIQ KFTNAIQKGMDYVNSHSAEEIAKTIQPQFKETPVDKLAVIVGRYKDQDTWKDDTVFEKES FELLENILEEAGELSARVPYEDLVTTQFSEQAAK >gi|157101600|gb|DS480724.1| GENE 10 11601 - 12275 385 224 aa, chain + ## HITS:1 COG:VC0813 KEGG:ns NR:ns ## COG: VC0813 COG0500 # Protein_GI_number: 15640831 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Vibrio cholerae # 33 224 2 192 192 163 42.0 2e-40 MMSKNDMIDSSHVNRGQRPDYLGKEAFVFMNDSLQYYNQHAKEFSDSTRDVEFKEMQERF LSYLKPGARILDFGCGSGRDTKYFLDRGFEADAADGSEELVRIASGYTGIEVRLMYFQDL DEKEAYDGIWACSSILHLAYGELEDVFIKMARALTSHGILYTSFKYGTAEGERNGRYFTD MTEEKMDKLLKDVNSFDILEMWVTSDVRPGRAEEKWLNMILRKK >gi|157101600|gb|DS480724.1| GENE 11 12245 - 14734 1489 829 aa, chain + ## HITS:1 COG:CAC2824_2 KEGG:ns NR:ns ## COG: CAC2824_2 COG1061 # Protein_GI_number: 15896079 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Clostridium acetobutylicum # 221 829 3 608 616 592 51.0 1e-168 MAEHDIEEKIESLRQIQMEIGPCEVITGGRDKKRFLLYQLELSMLKAERIDMIVSFLMES GVRMILDDLEQALNRGVKIRILTGNYLGITQPGALCLIKERLGDRVDLRFYNEKGRSFHP KAYIFRRKDFGEIYIGSSNVSRSALTSGIEWNYHFDERRDSRNFHQFCGTFEDLFYNHSI VINDEVLKQYSKEWKRPAVIRDMERYDSLDDETDGEVVFQPRGAQIEALYALKNSRLEGS EKALICAATGIGKTYLAAFDSQPYERVLFVAHREEILKQAAESFRNVRHSDDYGFFYGKE KTRDKSVIFASVSSLGKTDYLNESYFARDYFDYIVIDEFHHAVTDQYKKIMEYFKPRFLL GLTATPERMDGRNIYALCDYNVPYEIGLKDAINKGMLVPFHYYGILDETVDYSKIRIVKG KYDEEELTKEFIKGRQYELIYKYYMKYRSKRALGFCCSRTHAEQMAKEFCRRNIPAAAVY SDSDGEYAVERSEAIEKLKRGELKVIFSVDMFNEGLDISEIDMVMFLRPTESPVVFLQQL GRGLRKSRDKEYLNVLDFIGNYEKAGSAPFFLSGRSYSSAEAGRMQQQDFEYPDDCLVDF 
DLRLIDLFHEMAKKRAKKEERIRSEYFRIKEMLGGRVPARMDLFTYMEEEILQICNGLQN SPFKHYLRYLNSLGELGDRNRAIFNSIGNDFLEMLETTQMTKSYKMPVLLAFYNNGNVKT QIDEDDIYRGHKEFYYTANNWKDLAKDKSTSDFRTWDKKRCTREALKNPVRFLLQSGKGF FVRKEGYVLALNPQLYEAVEMDGFSQQMKDVIDYRTMDYYRKRYEGKTE >gi|157101600|gb|DS480724.1| GENE 12 14991 - 15794 877 267 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2752 NR:ns ## KEGG: CDR20291_2752 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 29 264 1 231 231 204 49.0 2e-51 MDMTETRKSKRLQLLERGITKEVGMNESVNYLDYANSPLMWAAAALAVAVVVFQSILFMK RSIKAASESALTTDQVHKAIKSSVISSIGPSVVILVTMISLLVTMGAPVAWMRLSFIGSV NYEAMAAGFGAQAMGKTLQTMDTMAFACGVWTMVCGSLGWLIFTFLFCDKMDKVNHLMSK GNAKMVPIISAGAMLGAFANLASGNFFTAEGSLTFTNAPAIATIAGCVIMMVLVKLSRAR DIAWLREWAFAIAMFSGMFIGYAVSVI >gi|157101600|gb|DS480724.1| GENE 13 15808 - 16497 882 229 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2751 NR:ns ## KEGG: CDR20291_2751 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 6 226 8 223 225 190 51.0 4e-47 MKNDYYSGTYIPGIVKWGKITLLLGIFAGFLPALVMAFRGYMPPVSAIIAGTLMQISVSG AFYIVEPISYFPILGIPGTYLTFLSGNTSNMRVPCSSVAQEAAGVEMGTEEGSIISTIGI ATSILVNVVILTVGAVAGSVILNILPAPVKEALNFLLPALFGAVFGQFAATRPKLGVVAA IIAIGMNWLMKLGFLSFIPGNPSYVVILVAVFGSIYAGKLLYKKELQVD >gi|157101600|gb|DS480724.1| GENE 14 16718 - 17824 1114 368 aa, chain + ## HITS:1 COG:BH1469 KEGG:ns NR:ns ## COG: BH1469 COG2195 # Protein_GI_number: 15614032 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Bacillus halodurans # 1 362 1 362 372 241 36.0 1e-63 MLSRERMQERFLELVQIYSPSGGEMEQCQWLMDYFKERGIEASIDEAGKAYGGNGGNIIA HVKGEPCNPPFCFVAHLDQIEPCKDVRPVVDGHIIRTDGTTTLGGDDKGAVASILEIVED IAETNRPHKEFYIMFTVSEETSMQGTKHMDPSRLPCKNMVIADATGPAGIIAYKAPAMEA IRCTVRGRKAHAGIEPEKGINAVVAASRAISRMHIGRIDQETTSNIGRIEGGAATNIVTD EVTFTAEIRSHSMDKLRDEAAHMEECLKAACLEMGAAYEMEHELAYPSLEVSLDSDLYRM TAQAMEKEGIEPRPMVIGGGSDGNILAGYGCSSLILSVGMMDVHTVQEALDMDELWKATR VMSRMTEL >gi|157101600|gb|DS480724.1| 
GENE 15 17849 - 19186 1309 445 aa, chain + ## HITS:1 COG:FN1186 KEGG:ns NR:ns ## COG: FN1186 COG1473 # Protein_GI_number: 19704521 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Fusobacterium nucleatum # 37 365 2 295 359 105 27.0 1e-22 MDKEELKQRCLNVIEEHRDEIIALGKEAYKTPELGFKEFRTGKLMEEAFCKLGLEPETGV SYTGCRVSSGPKGNGPRVAVMGELDCIMCDSHPDAAEGGMVHACGHNVQLANMYGAAIGI LTSGVMEHLGGAADFIAIPAEECVDYEYRNRLMSQGTIHYLGGKQELMYRGGLDDTDMVL QCHMMEMEPGKCCILDTKGNGFISKTVHFLGKAAHAGFAPEQGINALNMAELAMNNIHAL RETFRDEDKVRVSIVIKEGGGLVNVVPERVTMEIMVRAFTIDAMEDASHKVNRSMKAAAM ALGGKVEIHDCIGYLPLNTDRQIARLYRDNMMAYEHAGEDAFVEDWETAGSTDLGDISQI MPCMHIWAGGIKGGLHTENYRMDDPYTAYIVPAKMMALTIIDLLWDEGAKGKEIIRDFRP ALTKEEYLNLLKDHQVVDLYDASDL >gi|157101600|gb|DS480724.1| GENE 16 19236 - 20549 1398 437 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_0449 NR:ns ## KEGG: CDR20291_0449 # Name: not_defined # Def: putative transcriptional regulator # Organism: C.difficile_R20291 # Pathway: not_defined # 1 428 1 428 428 361 41.0 4e-98 MIVGVVGPRESCVIIRKAIQQIDQSHEVRLYTRELVREAVDVIDTCEQECDGVIFTGSGI CDYVLGHHQMQKPYTYIRRSASSIAEVFLKMVREGREIDTFSIDVVEKQIIEDILDAFHI QPRNIYTCPLLAGVAEEEYVAWHMELLEKKMTDVAVTAYVAVYHMIEEKGGRVFYLEPKR SQVRDALADLEKGWALAQAEYSQIAVELIQMSNFPEKEDQYYSAMASKAEFEVELIRYVR GIQGSVFPFGRDEYVIYANSGVVKGKKNHKDLRILQREGKNLGLVVNAGLGMGLTAYQAE MHARKALEYSLGKGEGGIFQIDEEEKIQGPLGLEQQLQYSMISFDPKIQTISEKTGLSME SILKIQAIADVRKSYVFDASELAECLGITVRSARRILSKIQAAGLGTVCARESAAKGGRP RALVELNFGLRDKTHKD >gi|157101600|gb|DS480724.1| GENE 17 20670 - 21479 1050 269 aa, chain + ## HITS:1 COG:CAC0629 KEGG:ns NR:ns ## COG: CAC0629 COG0561 # Protein_GI_number: 15893917 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 3 269 2 267 268 191 42.0 9e-49 MDYKIIVLDLDGTLTNRDKIITPRTKEALMNAQEAGNVVVLASGRPTAGVEPLARELELS RFGSYILSYNGGMITNCKTGETVFSSLLPRESNQKIIGLAEEHRVDILTYEGEEIITNNV 
ECPYAIAESNINHLPLRQVEDMKSYVDFKVPKFLMLDDGDYLVTVEPKVKAAMGRDFSVY RSEPYFLEIMPKGIDKARSLARLLEVLGLDREQMIACGDGYNDLTMIKYAGLGVAMENAV LPVRQAADYITASNNHDGVGLVVEKFMLS >gi|157101600|gb|DS480724.1| GENE 18 21693 - 22013 447 106 aa, chain + ## HITS:1 COG:lin1875 KEGG:ns NR:ns ## COG: lin1875 COG4496 # Protein_GI_number: 16800941 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 3 94 4 95 98 115 58.0 2e-26 MNKKIKTEAVDHLFDAILTLKTPDECYKFFEDVCTINELLSLSQRYEVAKMLREGKTYLE IAEKTGASTATISRVNRSLNYGSDGYDMVFARLTGGTQADKGDNES >gi|157101600|gb|DS480724.1| GENE 19 22150 - 22767 774 205 aa, chain - ## HITS:1 COG:no KEGG:Closa_1554 NR:ns ## KEGG: Closa_1554 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 205 1 204 206 242 62.0 9e-63 MKTNEERLQDLFSYLGNLSHVKADEIPDIDLYMDQVTTFMDSHLKDMRRHPGDKVLTKTM INNYAKNNLLPPPVKKKYSKDHMLMLVFIYYFKNLLSFSDIEELFRPITSKHFSGQTSQL SLEDIYNEVFTLEKGEMNNLKSDVAAKFARAQDTFSDAPVDDKDREYLKLFAFICELSFD VYLKTQMIEMIADQLREDAPAPRKK >gi|157101600|gb|DS480724.1| GENE 20 23026 - 25368 2650 780 aa, chain + ## HITS:1 COG:SA1721 KEGG:ns NR:ns ## COG: SA1721 COG0210 # Protein_GI_number: 15927479 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Staphylococcus aureus N315 # 2 779 3 727 730 654 46.0 0 MSIYDTLNPMQKEAVLQTEGPLLILAGAGSGKTRVLTHRVAYLIEEKQVNPWNILAITFT NKAAGEMRERVDQLVGFGAESIWVSTFHSTCVRILRRHIEYLGYTTNFSIYDSDDQKTLM KQVFKAMDVDTKQFKERSVLGTISSAKDKLTGPEEFLLNAGQDFRQRRIGEIYKEYQKRL KKNNALDFDDLIVKTVELFQNNSEVLNYYQERFKYIMVDEYQDTNLAQFKLVSLLASKYR NLCVVGDDDQSIYRFRGADIGNILSFEEMFPGAKVIKLEQNYRSTQNILNAANGVIRHNR GRKDKTLWTANGEGDLIRFKQFDTAREEADFVAREIRDSGYAYQEQAVLYRTNAQSRLLE ERCIFYNVPYRLVGGVNFYQRKEIKDILAYLKTIANGVDDLSVLRIINVPKRGIGATTMG KVTIFASEHGMSLYDALREARQIPGLGKAAEKIGAFIGQMESFRARAQSDDYTIQDLIEG IMDETGYQQELEAEGEVESQTRLENIEELVNKAVSYEEDSEHPSLDEFLEQVALVADIDN MDESENRVTLMTLHSAKGLEFPKVYLVGLEDGLFPSMMSINSDDKTDMEEERRLCYVGIT 
RAKNELVITSARQRMVNGETRYCKPSRFMEEVPGELLEEERLEPVLGSYGSRNNGAGAGG FGKTGISGEAGLPWNQPAAGNTRTSTFGKGYNAYASGSSQPLSGLGTGSTAGNPGFGKAF TVQKAATLDYSEGDRVRHIKFGEGTVKSIRDGGKDYEVTVVFDGAGQKKMFASFAKLKKV >gi|157101600|gb|DS480724.1| GENE 21 25619 - 25936 477 105 aa, chain + ## HITS:1 COG:no KEGG:Closa_1728 NR:ns ## KEGG: Closa_1728 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 7 75 2 69 96 76 79.0 3e-13 MNRFDRDRFDQMGKEALSRVEELMNSSRINELLHKREEDEKKKNCILWVLAIIGAVAAVA GIAYAVYRFFTPDYLEDFEDDFDDDFDDDFFEDEEPEEKKTEKAE >gi|157101600|gb|DS480724.1| GENE 22 26105 - 27538 1053 477 aa, chain + ## HITS:1 COG:CAC1435 KEGG:ns NR:ns ## COG: CAC1435 COG2265 # Protein_GI_number: 15894714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Clostridium acetobutylicum # 11 470 3 452 456 429 48.0 1e-120 MPKQRAERRYFMKKGDIYEGTVEKIEFPNKGILHIDGERVVVKNAIPGQKIQCVITKRRK GKSEARTLEILAHSPAEKPEEACPHFGICGGCIYQSLPYEEQLNIKKEQVKSLLDTVVED GTYEFQGIKASPISSGYRNKMEFSFGDEVKDGPLALGMHKRGSFYDIVTTPECRIVHPDF CRILVATKEYFEELGVAFYKKLQHQGYLRHLLVRRAVKTGEILVDLVTSTQRDYLGERTE EEVLAEWTKKLVGLPLDGKLAGILHTENDSLADVVQSDATHILYGQDYFYEELLGLRFRI SPFSFFQTNSLGAEVLYDTARGFVGDTKDRVVFDLYSGTGTIAQILAPVAKKVVGVEIVE EAVRSAQVNARLNGLENCEFLAGDVLKVIDELQDKPDLIVVDPPRDGIHPKALGKIIDFG VDRIVYISCKPTSLVRDLVVLQDRGYRVEKVCGVDMFPGTGNVEVVIMMRDCGLEGK >gi|157101600|gb|DS480724.1| GENE 23 27770 - 28615 469 281 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942250|ref|ZP_02089559.1| ## NR: gi|160942250|ref|ZP_02089559.1| hypothetical protein CLOBOL_07136 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07136 [Clostridium bolteae ATCC BAA-613] # 1 281 1 281 281 563 100.0 1e-159 MFKIYLSRTVSPGVGISLPATIEEMREAYSLLNGTDTVPLETATAYVESSIPNLRQYLYE VPVSEKRLEELNYLAYRVKWMDSQDEAVFGTVIEMMKPETLQDMINLSCNMDKFRYLPSA TTEVKLGEYLLKGNADMAMEEQAARSNYEGIGKDYIKKHGGMFHAFGYTSGIQEELEPIY 
RGEELPDPDFKQTCSFKVWIYKGNPYDNYTLSLPATESKMDALKSAMGISNWSECKQLAI QCRVPTLWDGLPEYGSIEELNDLVTEHCQSMENQQAPVLEM >gi|157101600|gb|DS480724.1| GENE 24 28709 - 29137 225 142 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_2445 NR:ns ## KEGG: Dhaf_2445 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 2 112 3 113 192 155 66.0 5e-37 MQITTDELRRMTNQEGLILQGCGGDAQEWIDGINELLTQEGILLEGTEFKHVKSFCHEGL TNLLFPFEDVKLDLGKLAIWRIKSHETFGGTWLSDYVPNRLGGFLDEAPAQKTEEIQQPG EHPSHSAWEGEEAGLGGMSMKQ >gi|157101600|gb|DS480724.1| GENE 25 29210 - 29593 320 127 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942252|ref|ZP_02089561.1| ## NR: gi|160942252|ref|ZP_02089561.1| hypothetical protein CLOBOL_07138 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07138 [Clostridium bolteae ATCC BAA-613] # 1 127 6 132 132 238 99.0 1e-61 MKDYTVNTAITFHTGFDDRECNCLMYEGMKEKIKHDIQTAFLSDESLKGYITSDLTLRFL DGYKVRVEYEFSCYDENKQEAEGFSNYCVKGVQSRLEELGYRIESISSKAEEMGMGWLDE LESMVFR >gi|157101600|gb|DS480724.1| GENE 26 29748 - 30458 246 236 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1902 NR:ns ## KEGG: LM5578_1902 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 6 228 1 198 202 149 37.0 1e-34 MTVRFMKGTSIAELEREMQQVPGFRPRKSGIIRSRQAGETAGIGSCRYVNVWRAEKEQGR DGHLYTAADEEMPSAFYQILLRLLTEELPGSALTVRVGKLREEGEKGIMLFKSEEHRRRF RALMRCGLYPGLVTEPGFAAAVFLLSSEERLWEKAGPFVQERTIDYSRFRLRGADLDSYA TFCVAKELCTGKPYVSLSELGDPELFHDGLLRLIIHAILISRHGIGTVLNEEEVEC >gi|157101600|gb|DS480724.1| GENE 27 30452 - 32086 994 544 aa, chain + ## HITS:1 COG:no KEGG:Closa_3096 NR:ns ## KEGG: Closa_3096 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 384 527 8 158 172 82 38.0 6e-14 MLEIRIGYGDMVSELMRIPMSASDSEAVKKILDEGKTGNQTVTVRSLVSPVQPLSEMITG LDYGDETRRKELDFLDSRLRHMTRREQDIFSVALKLEAPGTLKEIINLSFNLDSYDLLED ISDPGRTAAELLRRGKQIEIPEVLYPMLDFERIRDSYFADHKGDYCPSGLVLKREDTVYT 
EVYQNHFPDPGYDKNSLFLLHLRRPSQNGMVNVSLAIPTDKEQMELAKKELGIQEFSECQ WNQYGGPLDELRHYLPVGGKVDELNRFAWFLKEKVLDGTEQIVEKLKAVLTAECPRSLDE AIGILNDLDHYQVLSEFRTPEDYARWRMKDDGSLYVSRFCEAYLDWNSLGKALMKRDGTR WSEQGTVLRDEWHCNELSENASVIRLYSPLLAELVDEDGYSSSLSSVGLISYEEDIRDAI AADDILKTEKGLADYLDNQLLKQRVISMCPTVERYQYNLWGVLEIKSHGELNQEELEVLK REWAGQACDGWGECFSQEGIECDSDCDLYVYFGHSKFRVQTEQELKGIVKEQELTGMEGM GGIS >gi|157101600|gb|DS480724.1| GENE 28 32149 - 32364 110 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942255|ref|ZP_02089564.1| ## NR: gi|160942255|ref|ZP_02089564.1| hypothetical protein CLOBOL_07141 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07141 [Clostridium bolteae ATCC BAA-613] # 1 71 1 71 71 95 100.0 8e-19 MMHYGRVRSDLQQAERTISMALRSNIVSETEKRALEEALNLVQEAAEKCRLAQAESVRKI FSQGMSHSEGR >gi|157101600|gb|DS480724.1| GENE 29 32383 - 33564 955 393 aa, chain + ## HITS:1 COG:no KEGG:DSY0069 NR:ns ## KEGG: DSY0069 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 19 371 28 394 418 97 25.0 8e-19 MNQSMQVWIYACQFIDDPAYQGSALSLPAHREAVRDAFQRARLKEGEPYHLERGRGWPGF IESVLDRYNHTLEELNLLVYKLAQMTEKQIEVYEGVLKALPERNMKQVINGLYNLERFEF LPGIQCDNEIGEMTIDNDLDPILKDLPGDIYPLLDTEKVGAYIREKENGVFTSKGYCYRV SDQWQEVYDGKQLPEQVEIHRALFSLYLVPAGTNTEDLGVGAWLELPYDEAAKNQVLERL GLELFSHCGIAKIHAAAPLFEQLAGQDHDIDRWNHLAKKLASMTEKELLKYKAVSEFEGC SSIEQAAELTDMLEQYTFDPVQVTYATYGRECLEQMGADLEAQAFRRFDFEGYGKELYAQ TGRKLTAYGAVSKEPVLEEVQSEDHAMMTQGMG >gi|157101600|gb|DS480724.1| GENE 30 33601 - 34530 667 309 aa, chain + ## HITS:1 COG:lin0088 KEGG:ns NR:ns ## COG: lin0088 COG0338 # Protein_GI_number: 16799166 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Listeria innocua # 2 295 9 265 270 66 23.0 7e-11 MGGKKALRNTVYELSPASFERYIEVFGGGGWVLFGRPPDPKVMEVYNDFNSNLTNLYACV KNHTMEFLRQLGFLPLNGRDEFLVLRRFLEGQEFAGTFLAQELELAMHNLPPLEFEELRS LLLERAAPGDVQRAAAFYKVIRYSYGSGCTSFGCQPFDVRKTFHLIWQAADRLANTVIEN KDFEALIRQYDRENAFLYCDPPYFMTEDHYEVEFPRQDHVRLRETLAGCQGKWMVSYNDC 
DYIRGLYEGCHITAVSRINNLAQRYDRGCEFPEVIITNYDPQERKKWGPQQMDLFSAGLL GGETEDGGD >gi|157101600|gb|DS480724.1| GENE 31 34517 - 34900 368 127 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1900 NR:ns ## KEGG: LM5578_1900 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 118 1 119 128 77 32.0 2e-13 MEGIEKLWWAKRQLEEQTEGKQRLITRTGGHLFVKVDDSCLAACYFAMVRNQKTGQYHAD VKGYLRTFSGYCNGTRLEQLSEEISGLSALVRELEAAQLSVSEEELQEFIRELEQQDETV EGAERKA >gi|157101600|gb|DS480724.1| GENE 32 34900 - 35328 327 142 aa, chain + ## HITS:1 COG:no KEGG:CKR_3439 NR:ns ## KEGG: CKR_3439 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 142 1 142 143 97 41.0 2e-19 MRNVKINGRVAEDGSIILPPGVLESMCLNPGNLVTLEFAPTSPVKEANGYGPRFQVETGG CETDIIADEEEAALVVPHDLLNDAHIPMDTGLVVQTIPGAILIGDTDPMAAVQAPLLEIL SLFGISEEEAYKAIEEGGYYDE >gi|157101600|gb|DS480724.1| GENE 33 35321 - 35611 348 96 aa, chain + ## HITS:1 COG:no KEGG:CKL_3895 NR:ns ## KEGG: CKL_3895 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri # Pathway: not_defined # 2 91 3 92 94 117 66.0 2e-25 MSKPVYKVMDSKGRVLIPRELRLAAGMCGGDILALQLQEGKVSVRKVEIVEVGDQSPEAV EAYVRAAIKTMPDAVRLSLISDLSGLVREKEGSHVG >gi|157101600|gb|DS480724.1| GENE 34 35601 - 36176 346 191 aa, chain + ## HITS:1 COG:no KEGG:CKR_3438 NR:ns ## KEGG: CKR_3438 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 179 1 175 185 229 61.0 5e-59 MSDNRIYTDRDCVETGSGCSLKGKVVVLKESALEAGFGRQLYYCTGGNGANGNALGKSVF LVNLKNGEFERCTRDHVLGVLKPELLPDEEKLQLSQIRPPGALPLENHEPQYSGYSFLED GRYAAGVWLCNEKEAMEYVEMQKPYQHRIMLCDRNDFCVWEVRCGMQVYPPQEKLDEMRE GLVENPGPMQL >gi|157101600|gb|DS480724.1| GENE 35 36341 - 37366 740 341 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942262|ref|ZP_02089571.1| ## NR: gi|160942262|ref|ZP_02089571.1| hypothetical protein CLOBOL_07148 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07148 [Clostridium bolteae ATCC BAA-613] # 1 341 1 341 341 
640 100.0 0 MSKYVDKMNLPQYRFYNHMVGDGEERKELISDLLTAEMILERGSKKERSQALITLATKKK RLADFIYSQEEDRKRCLKDLTGRDMKQFRKRLDHYLDILLMIPFFQGQSEKKRRYVDEKS ISSIPPEQRSRRFLYDREKGKYYTEYDCVPKLLLINERLYEEYGCYEMTETDQENLDRLV ADDVHYIINIQGSSGVDCAYLVEKDEDLCIKLLDRLKGSGKIPIENLVGKTDGVTAVFLS LLDYLQEMELAEQRRRKYLEEHPRKSEYAGKAIQSQRFVDKNSIKVFDMKQEGAAPDSIG AVYFLARRIAEEADLGGRGMKCCLTPGEDITGNTRTRKPSM >gi|157101600|gb|DS480724.1| GENE 36 37303 - 37497 110 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942263|ref|ZP_02089572.1| ## NR: gi|160942263|ref|ZP_02089572.1| hypothetical protein CLOBOL_07149 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07149 [Clostridium bolteae ATCC BAA-613] # 21 64 1 44 44 67 97.0 3e-10 MLPHTRRGHYRKYKNEKTVYVKAAIIHKEKYEGIQSAHRMNQSGGKNRLEQEAGEAAFLQ GMSL >gi|157101600|gb|DS480724.1| GENE 37 37632 - 38189 576 185 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1896 NR:ns ## KEGG: LM5578_1896 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 185 1 182 182 210 56.0 2e-53 MRGIRVENGRIVYFGNPAGYIAGSQAVVDPIFKGKELEAYLERQGGIEAVVWKGGVYDRL MNGQAETQGCEPLKNCRIWQLKPDVNIHMKFIGYDTLVTRFGEPDPQNYHMVYDGEIETN ELERIYDKFDGGQEVPGYTGHSLSMSDVIELYDEEASEFYYVDYRDFKPVVFGGPEPVQS QVLQL >gi|157101600|gb|DS480724.1| GENE 38 38245 - 38475 225 76 aa, chain + ## HITS:1 COG:no KEGG:HM1_0583 NR:ns ## KEGG: HM1_0583 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 1 76 1 76 76 82 59.0 5e-15 MREKMEHVKHAAEQKMWKVRAVLVDRSGENFIDSAIKILMAVVIGALLLAGLYALFSENV LPTLSRRITEMFNYAG >gi|157101600|gb|DS480724.1| GENE 39 38517 - 38858 407 113 aa, chain + ## HITS:1 COG:AF1778 KEGG:ns NR:ns ## COG: AF1778 COG2088 # Protein_GI_number: 11499367 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Uncharacterized protein, involved in the regulation of septum location # Organism: Archaeoglobus fulgidus # 18 85 16 83 85 63 44.0 1e-10 MPTVDVRITSMYPPGEVGSIRGYASATIDSCLAIRGIKVVEGGERGLFVSMPSRKTTEGY 
KEVCFPVTAEFREQLHNVVLSSYQQALEQSVAQQTAPQPEAPEQAGEEPGLQM >gi|157101600|gb|DS480724.1| GENE 40 38859 - 39254 191 131 aa, chain + ## HITS:1 COG:no KEGG:CKR_3435 NR:ns ## KEGG: CKR_3435 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 131 8 139 139 89 44.0 6e-17 MQGVLFFFLLLASSFVDLKRREIPDWVSGSIAALTLLHFRPEYLLGLIPALFFLAAAVRG GFGGGDVKLAAACGLVLGLPAALMGTILGLLLQLLFHLGALCVLPLFKRQVWSAYPMAPF LAIGYAVAYYV >gi|157101600|gb|DS480724.1| GENE 41 39268 - 39375 89 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKKILSLLVLIGCVSVFCLRFRKKAGYLQCGKGN >gi|157101600|gb|DS480724.1| GENE 42 39377 - 39754 189 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942270|ref|ZP_02089579.1| ## NR: gi|160942270|ref|ZP_02089579.1| hypothetical protein CLOBOL_07156 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07156 [Clostridium bolteae ATCC BAA-613] # 1 125 1 125 125 251 100.0 1e-65 MAVERANAFTMYYEPFELEGKRLYGTGYTVDHDTVPAGLYCYDVWNDGIEDSEEHIFISR NPLSENISGAVISDESIDFHGENQMSMKGIQFLDDEPLVSLERLLEQKADGPVAAETAEF CMVQG >gi|157101600|gb|DS480724.1| GENE 43 39773 - 40642 907 289 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1891 NR:ns ## KEGG: LM5578_1891 # Name: not_defined # Def: pilus assembly protein CpaB # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 243 1 242 284 353 76.0 5e-96 MNFFKNRTVIGLLCIVLSLVICFAITPMFNRTISEKTEIVRVTKNIRIGDEITKDMVKPV EVGAYNLPESVVRNIDEVIGKYASADMAPGDYIIRSKVAEEPAAENAYLYNLNGEKQAIS VSVKAFANGLSGKLVSGDIVSVIAPDYRKQGSTVIPPELQYVEVIAVTANSGYDANTGER GDEETEKELPGTVTLLVTPDQGRVLAELEADGKLHLSLVYRGTKENARKFVEAQEIMLEE LYPLESEEMEENTEKESGTEETETVEEAGEETVQGEETEATAGSESGVE >gi|157101600|gb|DS480724.1| GENE 44 40643 - 41455 704 270 aa, chain + ## HITS:1 COG:no KEGG:HM1_0586 NR:ns ## KEGG: HM1_0586 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 1 269 1 270 271 405 72.0 1e-112 MLNFKKGSIFDRSAKAQEDFMDEPGNQVLAVWGSPGSGKSTVAVKLAKYLVGKKRNVVLL 
TCDMTAPMLPCICPPADLECEHSLGSVLAAAQVTQSLVRHNLITHKKYDYLTLMGMLRGE NEYTYPPYNAKQATELITCLREMAPYVIIDCGSYIANDILSAIALMESDAVLRLVNCDLK SISYLSSQLPLLRDNKWDADKQYKVASNIKPNEASEHVEQVLGSVTFKLPYCAELAAQSL AGDLLADLSLRESKGFRKAIEAISREVFGC >gi|157101600|gb|DS480724.1| GENE 45 41449 - 43086 1380 545 aa, chain + ## HITS:1 COG:RSp1085 KEGG:ns NR:ns ## COG: RSp1085 COG4962 # Protein_GI_number: 17549306 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Ralstonia solanacearum # 112 495 22 400 450 120 28.0 6e-27 MLGEKRSGDIFGPGPKREIGGQAAMAEATGDDQEVELQKYSDSKRREIEYEDESAQRGES EAAVPEKQPVLEELSLEEISRTEEEDEERLLFQESQAPRAHNLFFSPQGEGREFTDVLKE VQEYISSKYSTLIIDQGSGDVKEQIKRYITKYVQDYRIAVKGMDRQELVDTIYTEMAEFS FLTKYVFGTGIEEIDINSWRDIEVQYSGGVTKKLTERFESPQHAINVIRRMLHTSGMVLD NASPAVLGHLSKNIRIAAMKTPLVDEDVGIAASIRIVNPQSMKQEDFIKGGTATQPMMDF LTECLRYGISICVAGATSSGKTTLLGWLLTTIPDNKRIYTIESGSRELALVREKEGVVTN SVIHTLTRDSENERQRVDQIALLDMALRFNPDIIVVGEMRGPEANAAQEAARTGVAVVTT IHSNSCEATYRRMVSLCKRAVDMSDETLLSYVTEAYPIVAFCKQLENRERRLMEIQECEI RPDGTRSYRPLFQYKITENRVEDGKFIIEGRHEQVHTISDGLAKRLLENGMPGEALKQLR GGVNV >gi|157101600|gb|DS480724.1| GENE 46 43083 - 44012 793 309 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1888 NR:ns ## KEGG: LM5578_1888 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 309 1 309 309 462 71.0 1e-129 MTLMQLLACAGMITGAFLILGLKPMKFTDGLFGFLMQKPRTIKEEINEATSRKKPGVFRR EIRAAQEILAMTGRESRFSMICAASLSLFCLGGSLAILMGNYFLAPVLAVGFLFFPFWYV RLTAGHYKKNVAAELETALSIITTAYLRNEDILTAVEENLHYLNPPVRNVFQEFSTQVRM VNPDVEAGLQALRGRIENDVFEEWCNALCDCQYDRSLKTTLTPIVSKLSDMRIVNAELEL LVTEPRKEFITMVILVIGNIPLMYFLNRSWYETLMFSYMGKLILAGSAALIFVSTACVIR LTKPLEYRR >gi|157101600|gb|DS480724.1| GENE 47 44018 - 44890 992 290 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1887 NR:ns ## KEGG: LM5578_1887 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 290 1 290 290 406 73.0 1e-112 
MLGLLVCFGGLLAAGLFFLAADLLRLPYLKTSKAMINTGRENRKAAKSLETYLLSLAVKL APYIHMDEYKRGRQKNILKASGLNMEPEVYQAYVISKAGLVLLGIIPCLLVFPLLAVIVV VLAVMVYFKEQERADELLAKKRGELEGELPRFVSTIEQELKNSRDVLSIVENYKKNAGEE FANELEILAADMRSSSYEAALTRFEARINSPMLSDVVRGLIGVLRGDDGAMYFQMLSHDF KQMELQRLKKEAQKIPPKIRVFSFLMLVCFIVTYLAIIVFEILKSMGSMF >gi|157101600|gb|DS480724.1| GENE 48 44918 - 45286 145 122 aa, chain + ## HITS:1 COG:no KEGG:Sgly_0061 NR:ns ## KEGG: Sgly_0061 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 12 107 109 205 218 70 42.0 2e-11 MSSVKNQTGMAEKEVENGSGQKLSGTVREFLERHPKEPVFMMTPGGYVYLVPEQIGNLLA GGAVSGSPYSWRECVQIKAEELLQQMVDSVNHADGTWYVLTRLYEPEVIPARTSEEGMVC HV >gi|157101600|gb|DS480724.1| GENE 49 45328 - 45678 315 116 aa, chain + ## HITS:1 COG:no KEGG:Closa_3112 NR:ns ## KEGG: Closa_3112 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 116 16 131 131 175 73.0 4e-43 MAVLILCVMLVLALSVKILPVFIAKQQLDTFATELCREAEIAGRIGSETNRRAAVLREKT GLNPEIHWSASGRIQLNEEVTVQLTYRYNLGLFGGFGSFPITLRAAATGKSEVYWK >gi|157101600|gb|DS480724.1| GENE 50 45675 - 46220 419 181 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1885 NR:ns ## KEGG: LM5578_1885 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 181 1 181 181 262 64.0 4e-69 MKKWRACLKDQRGTAFPLVIAVTLACLLILCGIMEFFRLSLIASGVKEALQDAVVVVVND NYANVYHGVREGYSGGYYTEGYGFEEAVDTGDIYYHLDETLGTRREGGNRVKYTGGVLEY KITGLDVEIRNAPLAPSDPANAQRFEADAVVWLEVPVRFAGKTFPSMKMKLKVQAGYIEV F >gi|157101600|gb|DS480724.1| GENE 51 46326 - 46943 505 205 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1884 NR:ns ## KEGG: LM5578_1884 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 1 205 1 188 188 130 44.0 3e-29 MKLNERMKRRLAVLGCVVVGAVLIAAIGSQFRGEAQGSNHAEARTEQTEEVTVAEIPTET TEAATTEEETEPTKPDIVVRTEPATEPSRTGTTEPTEALPAQTDRTEQAVQPAPEKPTAP PEEVLKNPTQKPDGETVEGTPEAIPHEEVVQPSEAPTQAGEPQYGDTQNGKIYVPGFGWI DEIGEGQGTVAEDMYENGNKIGIMD 
>gi|157101600|gb|DS480724.1| GENE 52 46990 - 48696 1021 568 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1883 NR:ns ## KEGG: LM5578_1883 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 23 567 22 565 566 720 67.0 0 MCRDRAKILAVLAMALMLMSPLTAYAVGEGNLDGGGGSMGQGTSQNKWTPGMEGVRVTVI LADTRTPVTQPIDFTNKRPTNIQLHFGKVSKLSYNKGTALSASGEAYTFINPGQSLPRII SSQSLGAASIEAIKSYFTDEQVIRSIANLTGMDFEVLTNGKYKLMLEPIAYVVFQGVNVA MTATEAALYDQTINGAMRAMLPTVAFQNLPLSMFLETADLGYPAWSGPKSGVRSNQEIIT SLGLGIVRFNEITTPPAVDAFDYEYRVNTEVITAVTVRGGQSDPDHPVSVTFRVAGKTMR VDHVYYPEGDSQIAWIRWTTPSTPQTMTIHVTVSGGGRTDKNTITANIVDLSKNPPPDPN ADDRNDGFTIPTVPGREQVTRADWGVWRPWWYSYWVWHSGDDDDDGYWCDHGWWEFDYDR YHARLTAEMEIHPDEKSPTASGNTLKSGYGIQEIVTAGVSTNQSHAVTEAQNAITYFPEF DYQSYWRVLERMGRGYQTRFEFEENPFSTYERRTHFLPIWYPDGSYTPYTWLIDCWTPAG MLSMNLTDSVRVQGNLWQDWHISPQKPR >gi|157101600|gb|DS480724.1| GENE 53 48803 - 49123 275 106 aa, chain + ## HITS:1 COG:no KEGG:CKR_3426 NR:ns ## KEGG: CKR_3426 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 13 106 54 147 147 135 68.0 5e-31 MIALIVVLLLTAMFSMTAFATNTGNVAGAVEGTWKAASSQIKTVVNNVVFPAIDLVLAVL FFVKVATAYMDYRKHGQIEWAPAAILFAGLVFSLFAPMYVWQIVGI >gi|157101600|gb|DS480724.1| GENE 54 49171 - 49758 399 195 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942282|ref|ZP_02089591.1| ## NR: gi|160942282|ref|ZP_02089591.1| hypothetical protein CLOBOL_07168 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07168 [Clostridium bolteae ATCC BAA-613] # 1 195 1 195 195 359 100.0 6e-98 MGCWGITAFESDEGLDAVEFIRESIPEDGKLELETLVETLKRDSWNAPPEVETAGSHTSV MVLAELMLKFVDGDVKSLDSEEDWNREEHKFCSVTSFTASKETVHWIRNYLFDTLHHVKK HAGSQTEDGGKWGGWFEEKNWIGWQEHMTELVSRLDTLLESSEAEIELITLQRPKYQEME NNREAEHPLQNQFGL >gi|157101600|gb|DS480724.1| GENE 55 49802 - 50194 204 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942283|ref|ZP_02089592.1| ## NR: gi|160942283|ref|ZP_02089592.1| hypothetical protein CLOBOL_07169 [Clostridium bolteae ATCC BAA-613] 
hypothetical protein CLOBOL_07169 [Clostridium bolteae ATCC BAA-613] # 1 130 11 140 140 221 100.0 1e-56 MASNDRQDKLLMETCIKHLIQYAATIKISRGAQGDESIGRLRKIIGEMEAYWNLSDRKGR VEQFDKTLRRAVQTGRTNGVSEEQKIAAVNGLYRYASEMISAQGAEAADRIKEVQSVIRE LADGWGMDKE >gi|157101600|gb|DS480724.1| GENE 56 50225 - 50503 188 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942284|ref|ZP_02089593.1| ## NR: gi|160942284|ref|ZP_02089593.1| hypothetical protein CLOBOL_07170 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07170 [Clostridium bolteae ATCC BAA-613] # 1 92 1 92 92 165 100.0 1e-39 MVGNKEQQPSETREESRFFKDISTVAEKMGFEVKGVKNGLLQLYLEGSRVAQVDESGSLL YHPYPEVFKLMDAIEAWKGREENLQMQDGLQM >gi|157101600|gb|DS480724.1| GENE 57 50527 - 50895 209 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942285|ref|ZP_02089594.1| ## NR: gi|160942285|ref|ZP_02089594.1| hypothetical protein CLOBOL_07171 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07171 [Clostridium bolteae ATCC BAA-613] # 1 122 3 124 124 246 100.0 4e-64 MKSLSEYLAYYDPMRESNERYLPEDQATLRYSRVSVIADGKVIGASLYPDHVLDLAFLET PFMRQLCRDYQYARKLKVRIECYEHGGEGESRGLVGGEFTLFCMGALDLKVVSIRHICLW EE >gi|157101600|gb|DS480724.1| GENE 58 50964 - 51866 1062 300 aa, chain + ## HITS:1 COG:no KEGG:CKL_3881 NR:ns ## KEGG: CKL_3881 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri # Pathway: not_defined # 1 300 14 313 313 407 77.0 1e-112 MFIWDFVADTVLGQIVDWIYGQMIGFLGNFFSLMGQMGAELFELEWIQAIVLFFSQLAWV LYVVGLVVAVFECAIECQSGRGSVRDTALNALKGFMAASLFSVVPVELYKFSVSLQGEIT RGITGLGGDFGSMAGNIIDSLKNSGTLQDAMVSGVFGGINTITNPILMIFILFMMGYSTI KVFFANLKRGGILLIQIAVGSLYLFSVPRGYMDGFTNWCKQVIGICLTAFLQATMLIAGL MVLSKNALLGLGIMLAAGEVPRIAGQFGLDTSTRANLMSSVYAAQSAVNLTRTIVQAAVK >gi|157101600|gb|DS480724.1| GENE 59 51889 - 52290 217 133 aa, chain + ## HITS:1 COG:no KEGG:Dtox_1498 NR:ns ## KEGG: Dtox_1498 # Name: not_defined # Def: hypothetical protein # Organism: D.acetoxidans # Pathway: not_defined # 1 105 3 107 134 125 58.0 6e-28 
MIYICSPYAGNTEENTAFARQACGYAIRQGAVPLAPHLLYPQILNDSVPEEREIGIRLGL DILERCEELWICGDRMSAGMKRETAYAKARGIPVRRIPVCEIMGNQTVQELGKGIREGPE TSQSYQKNVQLGM >gi|157101600|gb|DS480724.1| GENE 60 52313 - 52825 495 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942288|ref|ZP_02089597.1| ## NR: gi|160942288|ref|ZP_02089597.1| hypothetical protein CLOBOL_07174 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07174 [Clostridium bolteae ATCC BAA-613] # 1 170 15 184 184 333 99.0 3e-90 MHLYFHTNTEMALYYIKALLRDGDEHPMSEIYDYVMENCKGHAVMGEPMRATVISSALWR LANTDGSGYCRLRRGVYQMGDPKKTMRETPSLYERAIRIVSRAERELAGSFYVNLDQGMD LKTSIEAQKAGREVLDKLGQIKAELAAMQQEFTNEKECGQDQTSGMSMEL >gi|157101600|gb|DS480724.1| GENE 61 52836 - 53096 241 86 aa, chain + ## HITS:1 COG:no KEGG:HM1_0601 NR:ns ## KEGG: HM1_0601 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 1 84 1 85 100 113 65.0 3e-24 MYIYPEHLKARAVMWLWQLRDLTVIGVGVLFSVLAAVQTGVIIPALITAAYAFLTIRFED TSILDFISYACAYFFRQQFFEWRMTR >gi|157101600|gb|DS480724.1| GENE 62 53093 - 53695 530 200 aa, chain + ## HITS:1 COG:no KEGG:CKR_3422 NR:ns ## KEGG: CKR_3422 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 3 197 6 200 203 256 71.0 3e-67 MSRKGKREGKTTAKQRLSTRQLMNTRRITEYSLETYDGDELVYFMIRPTNLSVLSESSVG ARVYALMNVLKGVAEIEMLCLNSRENFEENKGFLRRRAEEERNPVIRQLLERDQIFLDRI QVQMATAREFLILIRLRNQKGKDVFSYLDRIEKSLKEQGFDSRRADEEDIKRILAVYYEQ NVTSEKFEDFDGARWIIPDD >gi|157101600|gb|DS480724.1| GENE 63 53836 - 54600 520 254 aa, chain + ## HITS:1 COG:MA2133 KEGG:ns NR:ns ## COG: MA2133 COG3177 # Protein_GI_number: 20090976 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 20 241 46 265 326 153 39.0 4e-37 MNYPELLRKRDLYQSGRASIPELTLQSYEQAFEIEYTHNSTAIEGNTLTLMETKVLLEDG ITIGGKRLREIYETVNHQKAYRYVKECIAKEQPLDEKIIKEIHALLMENIFVGGIYRNVD VYISGAQHTPPSPGEMYRQVKDFYADLTWKGKEMNPIELAAWTHAEFVRIHPFPDGNGRT 
SRLIMNYQLLANGFPAVSIAKESRLDYFNALEAYAVEGNLAPFAEMVADLVDHQMDRYLG MIVPSQNMQQNPRM >gi|157101600|gb|DS480724.1| GENE 64 54685 - 56514 1644 609 aa, chain + ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 78 578 332 826 853 125 25.0 4e-28 MKHAKTAPEAEKEDVRIQEFLDMIAPSVMKFETDYFICGNTYRCVWALREYPTSTEEQAI LKRLGEKDGVTLKIYTRHVSAAEERKIISNAASKNRLNQSSTNDLQQTVQAESNLQDVAM IVARTHRNREPLLHTAVYMELTANEYDRLKLLQTEVLTELIRSKLNVDRLLLRQQQGFQC VMPSGRNVFRDQFERVLPASSVANLYPFNYSGKTDPRGFYVGRDKFGSNILVDFNQRADD KTNGSILILGNSGQGKSYLLKLILCNLREAGLNVICLDPEMEYEDLTNNLGGCFIDLMGG RFLINPLEPKVWDEGEGLEDPDAPETFRIRSRLSQHISFLKDFFRSYKDFTDREIDVIEI MVQRLYARCGLTDETDFQMLGSKDYPTLSELYGLIEAEYKGFDSGERQLYTAELLQNILL GLHSMCRGPEAKFFNGHTNITDSSFVTFGVKGVLQASRNLRDALLFNTLSFMSNRLLTVG NTVAGLDEFYLFLSNLTAVEYVRNFMKRVRKKNSSVILASQNLEDFNIEGIREYTKPLFS IPTHQFLFNAGAVDEKFYTDTLQLEASEFRLIRYPQRGVCLYKCGNERYNLMVTVPDYKA KLFGKAGGR >gi|157101600|gb|DS480724.1| GENE 65 56531 - 57592 752 353 aa, chain + ## HITS:1 COG:no KEGG:CKL_3875 NR:ns ## KEGG: CKL_3875 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri # Pathway: not_defined # 1 349 1 345 347 510 74.0 1e-143 MDMKQQRFGIEIELTGIARKRGAEIVAAYFGTQSYYAGTYYDIYAALDPQGREWKFMSDS SIKPERKEGKNRVGASDSFQTEMVSPICRYEDIETIQELVRKLREAGALANSSCGIHVHI DASPFDARTLRNITNIMAAKEDLIYKALQVSVARQNRWCKPVEERFLEELNQKKPGTLEE VRQIWYNGASRQREHYNNSRYHCLNLHSVFQKGTIEFRLFNGTTHAGKIKAYIQLCLAIG AQALNQTSASQRKTQTTNEKYTFRTWLLRLGLNGDEFKTARLHLLKHLDGCIAWKDPAQA EAQKERLRMKREKELQMQEGRVPAVESETREQEETMYVEAEAQSPAFSMMTGM >gi|157101600|gb|DS480724.1| GENE 66 57606 - 58121 367 171 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1874 NR:ns ## KEGG: LM5578_1874 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 10 162 2 159 167 197 63.0 1e-49 
MARRTGKKAAQDRLYIAYGSNLNLPQMEQRCPYAKVVGASEIKNYELLFRGVATVEPKEG ATVPVLLWKIEPLDEAALDRYEGWPHLYRKEMIDVELEGKTVSAMVYVMNDVRSLGMPSE VYYRIIEEGYHSIGFDTAVLEHALARTEEMMEQENSQYHQQGFDDMDGFHL >gi|157101600|gb|DS480724.1| GENE 67 58162 - 58536 297 124 aa, chain - ## HITS:1 COG:no KEGG:ELI_2970 NR:ns ## KEGG: ELI_2970 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 11 95 1 85 110 71 47.0 1e-11 MISCMKRGEHMTEHNAKEIGLRIRRQREALGYSRERLAELSEISNSFLSDIERGDRGFSV ALLGRLSRVLGLSADYILFGTEQATDISDITDMLSGLDGKYIPKLKELLGAYLKTITLAE KQNH >gi|157101600|gb|DS480724.1| GENE 68 58707 - 59711 665 334 aa, chain + ## HITS:1 COG:BS_yddH_2 KEGG:ns NR:ns ## COG: BS_yddH_2 COG0791 # Protein_GI_number: 16077564 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus subtilis # 210 331 4 123 124 108 44.0 1e-23 MAAPAAILAVKAAIAVATDKRGRTVIASVVAAVLLPFILVIVMLLSIMDGASSHNVSAVA QVFWEGAISSQVPEEYRKYIVDMQTSFASLEDLIADIDHVEDRELDIEWVKSVFYAMYFG SAQPSLLAQKEFVDCFVEYEEREDGDGDSYTAAIPITDLGTVYANMRQRLGLEIGVDQEA NAQRIYTVAVYGPAVPGGMAAGSAMGDGSYQALLTEATKYIGFPYRWGGSNPQTSFDCSG YICWIYTQSGTYQLPRTSAQGIFDQCAVIPREEAKPGDLVFFTKTYASSTPVSHVGLYIG GNQMLHCGDPIGYANIDSNYWRSHFYAFARLPAA >gi|157101600|gb|DS480724.1| GENE 69 59807 - 60085 401 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942298|ref|ZP_02089607.1| ## NR: gi|160942298|ref|ZP_02089607.1| hypothetical protein CLOBOL_07184 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07184 [Clostridium bolteae ATCC BAA-613] # 1 92 1 92 92 142 100.0 1e-32 MRELAEQAEELMVKRVQELRDEFGMSRSKTYDLEMKEYVARMERILHTLSEEDREWLDNQ LIDKFCVSETECRELYMNGFRDAMRLIMAVGL >gi|157101600|gb|DS480724.1| GENE 70 60174 - 61112 607 312 aa, chain + ## HITS:1 COG:no KEGG:LM5578_1872 NR:ns ## KEGG: LM5578_1872 # Name: not_defined # Def: hypothetical protein # Organism: L.monocytogenes_08-5578 # Pathway: not_defined # 2 297 3 298 314 274 49.0 4e-72 
MIQAVFERKPDFRLRDVVIETVTRLPKEEYEQFLSSPCDSYEFIEKNSKSMLMDEKNGVF YCMLVTGEGYRDGVLVEAEGYPYARYASYVPDATALCYESLSKVNEILAKAVEEIVKEGT NMTTTGNWMTDRSKVETLLGEGQSENPRLWTLLQDMLGERPEVAQVDRMDEGLDIYYYLD FCPNYIPEEGEAAVQEAGADVKSPRLKDILCTRWENIHLVHTEVDNVPHTIAELDSGTLT EAGKKVWADVLNAKVERVYQGLYGLQMELSGVKPSRLDAFSGMLGGYCSEQEYETWVKEP EKEPVSPQLNNS >gi|157101600|gb|DS480724.1| GENE 71 61127 - 61402 286 91 aa, chain + ## HITS:1 COG:no KEGG:Closa_3132 NR:ns ## KEGG: Closa_3132 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 75 1 75 82 85 64.0 6e-16 MKNTTLTLTFNTERLDALAYHMGKKEADLKEELSDYLQKMYEKYVPQTTREYLDDKIARE GAAKPARPRRQAEESHVRSQTGQEMGQAGSA >gi|157101600|gb|DS480724.1| GENE 72 61399 - 61662 274 87 aa, chain + ## HITS:1 COG:no KEGG:Sgly_0038 NR:ns ## KEGG: Sgly_0038 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 1 77 2 79 90 87 58.0 1e-16 MRYQDKHGNEITEGMYLRFEDGSVEQVYAWQTGDESDLGINASNEAYLQAHGLGEFPQEL YPLSEFDLSEVEICEPEMEPWQLGMMG >gi|157101600|gb|DS480724.1| GENE 73 61939 - 62427 452 162 aa, chain + ## HITS:1 COG:no KEGG:CKR_3415 NR:ns ## KEGG: CKR_3415 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 160 1 165 167 182 58.0 3e-45 MLVEPKDAKKMPIAGFFMNTEGGKALAGRAKKQRIGVYMEKELVERADEMVSYVGARSRN EFVAEAVKFYIGFLNSKKAENYLLQSLSSVLTSTVHDSENRLARMDFKLAVEISKLAHVI AYSHEVDEDALKKLHLKCVEEVKRVNGAVEFEDAYKYQKREV >gi|157101600|gb|DS480724.1| GENE 74 62516 - 62650 120 44 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942303|ref|ZP_02089612.1| ## NR: gi|160942303|ref|ZP_02089612.1| hypothetical protein CLOBOL_07189 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07189 [Clostridium bolteae ATCC BAA-613] # 1 44 22 65 65 73 100.0 6e-12 MGKEQIMEDAVRDLTELALQEHRETTGEAEQKMLKRDADPVQVI >gi|157101600|gb|DS480724.1| GENE 75 63019 - 65940 1682 973 aa, chain + ## HITS:1 COG:ECU11g0430 KEGG:ns NR:ns ## COG: ECU11g0430 COG0790 # Protein_GI_number: 19074843 # Func_class: 
R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Encephalitozoon_cuniculi # 457 914 156 582 590 145 29.0 3e-34 MPKIIFTSRYLRDAPPEQLGNYVKYIGTREGVEKIDESKGHLPATAAQKKLIAQLLRDLP KARAMLEYEDYRLHPTRRNASEFISTALEWNLDLLSKRENYVDYLANRPHVERIGEHGLF TDAGKPVVIARVQEEVKAHKGPVWTHVVSLKREDAARLGYDSGKQWMELLRSKRAMFCKQ MKIDSENLRWYAAFHNESYHPHVHVMVYSAKDHDGFLTEPAIEAMRSELAHDIFRQDFAN LYGVQNAAREGLKKEAEQTVKRLIQEIQSGTCQNQKIEKQIHRLSKRLQRTNGKKVYGYL KADLKQMVDRIVDELEQEPHIKELYQSWGTAREKIWQTYSDQPIPLAPLSQQKELKRIKN MVITEAMKIGSHHFLLEESSAEETDIWIDRAVTEMGSIAGDERRGESERRDAADEPEDNE LPGDNDIDFHAEWSDPYKVACQCLYGSDEKEPDFKEAFRLFSGEAEEGNALAMFDLGRMY ADGLGREADPAKAHEWYGKALTAFVAAEQCAEERQRPYLQYRIGKMYAAGLGYELPVVPD EEGGDGKAVRDYEKAVAWFSRAVSADHKYAQYSLGGLYYRGQGVTQNYSQAFNLYQRSAE QGNPYASYEMAKMYRDGIGVAVNVENAESCFEQAFSGFCRLEAQSHDDKLQYRLGQMLHT GTGTVKDDRVAEAYWERAAQLGNMNAQYALGKLWLENGTGDQKQAVAWLEKAAEAEHASA QYALANIYLAGEAVAKDVTKATELFTRAAKQGHDYAAYQLGKQFLQGEETEKDVEAAIKW LKQSAAANNQYAQYSLGKLYLDGEKVEKDIRTAITYLKKSAAQNNAFAEYRLGRFYLLGE DVEADVKEAVQWLEQSASQGNQYAQYALGKLYLCGHEVPRNKEKALPYLEASAAQGNIYA QFLLDHLDSFYEPSVFLATTRLMHRLAQMFEEEKWKAGGSSMQVEGKLRSRIREKKKAMG HKSDDETPRQNLS >gi|157101600|gb|DS480724.1| GENE 76 65909 - 66937 414 342 aa, chain + ## HITS:1 COG:no KEGG:Sgly_0035 NR:ns ## KEGG: Sgly_0035 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 18 328 4 313 472 385 58.0 1e-105 MMKLRVRIYHEEVSDMSYVYFTEEQKERANSIDLVDFLQRQGEKMLPSGRDKRLSSDHSI TVRGNSWYDHAIEKGGLAIDFVKYFYGKTYPDAVSMLLDGEQGIPYIQSQKQEKEPSAPF SLPPKNRDMRRTFAYLLKVRYLDKQIVTDFAKADLLYESLEDSQDGTRQYHNAIFVGRDK EGVARHAHKKGIYTYGSSFRGNLTSSDPKYSFRWVGTSDTLYVFEAPIDLLSYVSLHKEG WKQHSYVALCCVGSKPIFQLLQDFPNMKKIWLCLDHDIPGMKAAERIKKQLLEMGYGDTE TDLSQYKDWNEDLKALHGQLAIPAEELPGNQQETKQELLKME >gi|157101600|gb|DS480724.1| GENE 77 67003 - 68796 1121 597 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # 
Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 37 533 108 571 591 116 25.0 1e-25 MVLLICAGAGMFILMGVIPLLANNYSLNGIKSKTVGDGQYGTARFASEAEIKRTFAQVPY EPALWRQGKHLPKVQGIVVGGLFSRTRVTALVDSGDIHLLMIGAAGVGKTANFLYPCIEY ACASGMSWLCTDTKGDLFRNYAGIAQKYYGYQTSILDLRNPTCSDGDNILGMVNKYMDLY LENPENLAVKAKAEKYAKITAKTIISSGGGDASSYGQNAFFYDAAEGLLTSVILLISEYC PKEQRHIVSVFKLIQDLLAPSPVKGKNQFQLLMQKLPADHKAKWFAGAALNTAEQAMASV LSTAMSRLNAFLDSEMEQILCFDSTMDTETFCNSKSAIFIVLPEEDSTKYFMVSLFLQQI YREMLSIADEHGGKLPNRVVIFGDEIGTIPKIESLEMLFSAGRSRRISMVPIIQSFAQLQ KNYGKDGSEIIVDNCQGVLFGGFAPNSESAEILSKALGNKTVLSGSISRGKNDPSQSLQM IGRPLMTPDELKSLPKGHFILAKTGCHPMRVELRLFFKWGIEFEEEYEVEQHVARPVSYA ERLEVEQEIIRRQFDEDGEADFDEPSGIEGTSGGGLAHVQMPNRPAAPETRQPLRTD >gi|157101600|gb|DS480724.1| GENE 78 68812 - 69069 113 85 aa, chain + ## HITS:1 COG:no KEGG:HM1_0623 NR:ns ## KEGG: HM1_0623 # Name: not_defined # Def: hypothetical protein # Organism: H.modesticaldum # Pathway: not_defined # 1 83 1 83 84 108 62.0 1e-22 MGYYDSVYSEELPHRAISVYMYLKERADKKSQCYPAMSTIAEELNLSRRTIQRAVADLEK AGFIKTEQRFRRKGGKSSLLYTVLK >gi|157101600|gb|DS480724.1| GENE 79 69357 - 69809 531 150 aa, chain + ## HITS:1 COG:SA0653 KEGG:ns NR:ns ## COG: SA0653 COG1349 # Protein_GI_number: 15926375 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Staphylococcus aureus N315 # 14 117 80 183 253 65 28.0 4e-11 MAPKNMKDEVQLYRKIIARYAAAFVSENDTIFINTSSNALQILEYVECNNVTVITNNGKA IGREYCSGVNVVLTGGKLRHPKDAMVGDFTLRNLQHVYAKKAFVGCSGISAELGMTTEIF NDTDEKAFDDALNAIREKGIQVYQVHKSDF >gi|157101600|gb|DS480724.1| GENE 80 69970 - 71211 482 413 aa, chain - ## HITS:1 COG:CAC2608 KEGG:ns NR:ns ## COG: CAC2608 COG2207 # Protein_GI_number: 15895866 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 287 386 179 278 284 62 32.0 2e-09 MKNKLLKLIKSIHVISHLPIYVFNQDFHLEHLFVADRVQVLPYDFTQYYGSDSSSTFLFS 
GILDEAFITYSHSDTILIIGPFTTSRLTDTVIQNRVYGYTQDPNLQVFLSQYLELLPYFS LGDVRDFLIHLNFLFSGSSECPYSDDLHLQVRKNKFLLTQEKLKEHDWRAFQPDYYTYRY EEQILALVQSGDTQQLRESLAELSNSVIPDNAQSPLRSEKNYTIIILEKLSSLAIQSGYD ISDSYRLRNFYIRLIEEKEKLMDVLYVRDCAIIHFTELMHHFVNKDFSPLVKSVMQHIGL NLYTVLKVSDISKSFFVSEATLSSRFKKETGLSVMEYIHKRKVSEAKLLLRAGLSPSEVA TTLCYYDYAHFSHTFKKITGVTPKAFQLHPKVLYPNFLPPHQQTLEDNETPLI >gi|157101600|gb|DS480724.1| GENE 81 71549 - 72013 65 154 aa, chain + ## HITS:1 COG:TM1527 KEGG:ns NR:ns ## COG: TM1527 COG1959 # Protein_GI_number: 15644275 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Thermotoga maritima # 9 128 11 127 140 76 35.0 2e-14 MQLTMTTDYALRCLLYLAGKEGVSSSPEIGKAVGINNIFVQKVLRVLRDAGFVSSLKGGT GGYWLAKKPEEIVLLDIILLFEKTMKINRCLEAEGCCERYETCPMHVYYAEVQETLEGYF GQATLQDVINHGKIKRKEMIDDAKEKACDRKNSM >gi|157101600|gb|DS480724.1| GENE 82 72091 - 76536 3292 1481 aa, chain + ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 1306 1478 478 621 621 70 31.0 3e-11 MKNSLEQKRRKLKKTGRQACAFALSLAVMVPSMGLPSYAAVPYRSNGVSIGDKGMDTTSH FTANGIEESNLRYATTTFDKVDADGTIRLVHNKWIVTNGASEFATDASNQFAGKYILSFS NDMFYKQIDRVTVDDTSMEKVDDGALWMIPAVGNLKTGLIGVNTNHDIKIMLKNGQTLEK LGLADKEISFNSLWIKSNGAIAEESVSNGFILQNNPNVKNEKQTDFTNGRVTQRVVFDAK TMSLKSIHTFKPIENYLQTDYNWVVYVKERIPVELLQYIDRDEVYIYCSDYQGEAQKNRK IFRVTIDDAGNIDTSEVPELSIVGNDTSTQLSAARSNTNEIFWGTLGQSRNYTISYKLRD DVTLDQFAREMNDYVTAKNARVLFEHWMEADYLNARDGGATPQQLQDSYTNAYLDTNDTD KDGLFDFVEWQIGTDTRKVDTDGDGVPDGKEFMEDKTNPSDAKDYLVSIPTTNITSFDPT KAVTITGTVPKPLQKDPADETKLINITSQDAGDAIVKLQGYDDASKSYTQDEYGTAKIPF DNLVNGEFTMSVPANTVPEGTKVVLVAYSPNGMNPAMGTPFEFKQGDAEKYEATGGVLDK EYGETATASEIIGKVTTTAPDDKVKSKEVVGDIPTEGKGQKVKVKVTYADGSIDEAEVTV NYETATEKYTAVGQDVSVGKGDTPEASEGIKNKNDLPTGTTYEWKQPVDTSNPGTQTGRI IVTYPDQTKDEVEVTVQIKDSKNDAETYDVSGGVLNKEYGDKATKDEIIEKVTTTAPDNK 
VRSKEVVGTIPETGKGQKVKVKVTYVDGSVDEAEVTVNYGSASDKYEPEVEDETVKTGSD IDLTDNVTNLDELPTGTTVKDVTDTPIDTSVPGDYTGKVEITYPDGSKDIVEVPVKVVDK TDAERYRPVTERELIEEGQTYDLTDNVKNMGSLPTGTTVEDVTPEGEIDPDTPGNYTGTI KVIYPDGSSETVKVKVKVKKRTPDAKKYDPEVVPEIIYAGELADLTDNVVNLEDELPEGT IVTDITEYGEDGVNLDRPGKYKGRIEIEYPDGSTKELTVPIRVLKDADTDTATPSEPGKA TPSEPDKATDSEAEKTDASKYKPKPNPIVIDQGETFEPEDGIKNKDELPEGTEYSDETPD KVDTSKDYTAIIVVTYPDGSKDKVKVPVTVRPGGSEKTDADKYDPKPNPIVIDKGGTFEP EDAIKNKDELPNGTEYRDETPDNVDKTKDYTAIIVVKYPDGSEDKVKVPVTVKPVTSGSG GSGSSGSSGGSSGGHSSGGSGGGSGSRGSNVSNDRIYANPDMSVTTGSLRGTWTLVDAEN HKWTYTTSSGVMAKDGWMFIGNPYAKDEEGRFSWFKFDANGIMEFGWIKSQNGKWYHTHA VSDGNLGILHKGWYHEPMDGKWYYLDEKTGAMLDGWVSLSGKYYYFTEAPLVPEQTYFQR ENGYWYYDNHNRRPYGSMYQNEMTPDNYFVDQNGVWDGKNH >gi|157101600|gb|DS480724.1| GENE 83 76624 - 77091 165 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942314|ref|ZP_02089623.1| ## NR: gi|160942314|ref|ZP_02089623.1| hypothetical protein CLOBOL_07200 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07200 [Clostridium bolteae ATCC BAA-613] # 1 155 1 155 155 306 100.0 3e-82 MRNKVGSVIGRVEKCGGRNVFPWLKRSCIYIAVLSGDTDTLWYYRDKLCDFAERNHYKIE VVLTTSKGALLQSIRPFNRKPDFIIVAAPLNLYQYQMFIEQLEGRRSRAKVFFMGTGEVL KGYEKKVISVANKAKLEWYLEREMDWLQRKRLWRW >gi|157101600|gb|DS480724.1| GENE 84 77111 - 77389 138 92 aa, chain + ## HITS:1 COG:BS_hbs KEGG:ns NR:ns ## COG: BS_hbs COG0776 # Protein_GI_number: 16079336 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Bacillus subtilis # 1 92 1 92 92 77 44.0 5e-15 MNKRQLVDRMSSGSGLTMKQSEKALNGLMDVIHSELSSGGNVSLIGFGVFSVADRKARMA RNPKTGEEMEVKARKIPVFKAGSTLKRAVNSR >gi|157101600|gb|DS480724.1| GENE 85 77454 - 78782 854 442 aa, chain + ## HITS:1 COG:L196579 KEGG:ns NR:ns ## COG: L196579 COG0446 # Protein_GI_number: 15672373 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Lactococcus lactis # 1 442 1 446 446 545 61.0 1e-155 
MKTVIVGANHGGIAAANTLLDHYPDHRVVMIDQNTNISYLGCGTALWVGRQIDSYQDLFY TKKEDFEKKGASVHMETTVRKIDFERKVVFCEKTDGRTFEESYDKLILATGSLPIAPDLP GQNLEHISFLKLFQDGQNVDHLIGRMDVQHVAVIGAGYIGVEIAEAAIRRGKKVKLFDIA DTSLASYYDEWFTKDMDRLLEEKGIETHFKEQVLAFLGTTSVERIVTDHGEYKTDLVISA IGFRPNTVLGMEHLKLFKNGAYCVDRHQRTSDPDVYAVGDCAAIYSNALKRKTYIALATN AVRSGIVAGHNVGGTALEETGVQGSNGISVFGYHMVSTGLSVKAAQKSGLDVTYTDFEDL QRPGFMKKNAKVKVRIVYERNSRRIVGAQMASTEDISMGIHMFSLAIEQEVTIDQLKLLD LFFLPHFNQPYNYITMAALSAK >gi|157101600|gb|DS480724.1| GENE 86 78940 - 79290 222 116 aa, chain + ## HITS:1 COG:no KEGG:Closa_2032 NR:ns ## KEGG: Closa_2032 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 40 111 77 148 172 68 43.0 6e-11 MLSEIMDRLNLWHYEDADDREERAGLFGNRMEENEGNFLLNIVTYKPRTMDDVQEICERF LDGDAVILSLEQAEGKNEERIVDFLSGAIYSQNGNILKISEHIYAISPEDVGIFER >gi|157101600|gb|DS480724.1| GENE 87 79387 - 79809 271 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942318|ref|ZP_02089627.1| ## NR: gi|160942318|ref|ZP_02089627.1| hypothetical protein CLOBOL_07204 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07204 [Clostridium bolteae ATCC BAA-613] # 1 140 1 140 140 275 100.0 1e-72 MQFKATGEVKINIAVLAYSNKLLWDCRNFLYAFRGKTKCNLGITLTKNVDVIFRGMKFMP GGFDLVVVADVFCEQKYDEFVNTITKDPNHTKVFFFGEEHKNVKNMGSMITVSSKEQLEK QLSKEITGLLFGASDITYKG >gi|157101600|gb|DS480724.1| GENE 88 79851 - 80270 188 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942319|ref|ZP_02089628.1| ## NR: gi|160942319|ref|ZP_02089628.1| hypothetical protein CLOBOL_07205 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07205 [Clostridium bolteae ATCC BAA-613] # 1 139 16 154 154 289 100.0 4e-77 MGERILRDRLETLKMKDLDKFDRAAIVLLCIHGVSAHGVYGDIYNHNFKENIPFNGIGDM VLKIDQICNWLNTPRATTELRFFSREMEKQYRESYSDPPEQNMKADGEVGGGGIAYERVA NAKELLVIWVKYRQNASLQ >gi|157101600|gb|DS480724.1| GENE 89 80416 - 80655 188 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942320|ref|ZP_02089629.1| ## NR: gi|160942320|ref|ZP_02089629.1| hypothetical protein 
CLOBOL_07206 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07206 [Clostridium bolteae ATCC BAA-613] # 1 79 1 79 79 132 100.0 1e-29 MKPNKRAKNARKQISDIYLEVVKLSDFLSELQLQAEREFVNQEDSHELADEYIILFQGIT KAIEQCRQVGGKLHVLVEE >gi|157101600|gb|DS480724.1| GENE 90 80705 - 81214 241 169 aa, chain + ## HITS:1 COG:CAC0845 KEGG:ns NR:ns ## COG: CAC0845 COG1528 # Protein_GI_number: 15894132 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Clostridium acetobutylicum # 11 169 12 170 170 168 53.0 5e-42 MNKKIADLLNNQINQEFYSAYLYLDIANFYTKKGLDGFANWYQIQAREEQDHAMLVYKYL HNNDMDVALGTIGKSEKIFVTLIDPLKFSLEHEKYVTELINEIYLEAQKVNDFRTMQFLN WFIKEQGEEEKSSSEQITKMELYGSDPRSLYMLNSELAGRVYRAPSLVL >gi|157101600|gb|DS480724.1| GENE 91 81609 - 81974 216 121 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_10051 NR:ns ## KEGG: EUBELI_10051 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 120 1 119 121 126 64.0 3e-28 MGRAERRRNAKNERKEKKATYNLTREQLNHMVHERVEDELDHMRQEAMEEAINTAMLLLL TLPLKVLMDHYWNKSYTKRMPEFINYVLSYYEQWQKGELDMDELRKELWEYGGVRLEEVE D >gi|157101600|gb|DS480724.1| GENE 92 81940 - 82080 106 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942323|ref|ZP_02089632.1| ## NR: gi|160942323|ref|ZP_02089632.1| hypothetical protein CLOBOL_07209 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07209 [Clostridium bolteae ATCC BAA-613] # 3 46 1 44 44 69 97.0 9e-11 MVVSAWKKWRTDYDELNEQIKEIVGNITGNNVLRYMWQTDVKEKSD >gi|157101600|gb|DS480724.1| GENE 93 82229 - 82426 322 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942324|ref|ZP_02089633.1| ## NR: gi|160942324|ref|ZP_02089633.1| hypothetical protein CLOBOL_07210 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07210 [Clostridium bolteae ATCC BAA-613] # 1 65 1 65 65 76 100.0 7e-13 MVDKGLTDQLKGKTKKIVGDLTGDTVTKAEGWLEQGVGKTKKAIADVTDIIEEVSEKATE KLDKK >gi|157101600|gb|DS480724.1| GENE 94 82714 - 83877 873 387 aa, chain + ## HITS:1 COG:YPO0011 KEGG:ns 
NR:ns ## COG: YPO0011 COG3328 # Protein_GI_number: 16120364 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 47 387 34 374 402 322 45.0 6e-88 MAREKKSVHKVQMTDGKRNIIRQLLEEYEIESAQDIQDALKDLLGGTIKEMMESEMDEHL GYRKSERSDCDDYRNGYKTKQVNSSYGSMKVEVPQDRNSTFEPQVVKKRQKDISDIDHKI ISMYAKGMTTRQISETLEDIYGFEASEGFISDVTDKLLPQIEDWQNRPLSDVYPVLYIDA IHYSVRDNGVIRKLAAYVVLGINSDGLKEVLTIEVGENESAKYWLSVLNGLKNRGVKDIL LLCADGLTGIKEAIAAAFPKTEYQRCIVHQVRNTLKYVSDKDRKLFAADLKTIYQAPTEE KALEALERGTKKWSEKYPNSMKSWHQNWDAIVPIFKFSTTVRKVIYTTNAIESLNATYRK LNRQRSVFPSDTALLKALYLSTFEATK Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:12:21 2011 Seq name: gi|157101599|gb|DS480725.1| Clostridium bolteae ATCC BAA-613 Scfld_02_66 genomic scaffold, whole genome shotgun sequence Length of sequence - 75975 bp Number of predicted genes - 76, with homology - 74 Number of transcription units - 29, operones - 15 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 1 - 946 685 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase 2 1 Op 2 2/0.000 - CDS 943 - 2103 803 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase 3 1 Op 3 . - CDS 2100 - 3248 935 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 3373 - 3432 7.9 + Prom 3433 - 3492 8.0 4 2 Op 1 2/0.000 + CDS 3574 - 4587 900 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 4622 - 4681 4.6 5 2 Op 2 . + CDS 4710 - 5792 1039 ## COG0673 Predicted dehydrogenases and related proteins 6 2 Op 3 1/0.167 + CDS 5829 - 6707 790 ## COG3618 Predicted metal-dependent hydrolase of the TIM-barrel fold 7 2 Op 4 . + CDS 6726 - 7049 140 ## COG3254 Uncharacterized conserved protein 8 2 Op 5 3/0.000 + CDS 7063 - 7938 697 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase + Term 7961 - 8005 10.4 + Prom 7980 - 8039 6.2 9 2 Op 6 . 
+ CDS 8072 - 8773 611 ## COG2186 Transcriptional regulators + Term 8795 - 8833 4.4 + Prom 8846 - 8905 3.8 10 3 Op 1 . + CDS 8969 - 10876 1566 ## COG1151 6Fe-6S prismane cluster-containing protein 11 3 Op 2 . + CDS 10869 - 12224 725 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain + Term 12465 - 12512 4.9 + Prom 12610 - 12669 3.0 12 4 Tu 1 . + CDS 12755 - 13639 956 ## Cbei_4049 hypothetical protein 13 5 Tu 1 . - CDS 13636 - 14421 804 ## COG0450 Peroxiredoxin - Prom 14455 - 14514 3.3 + Prom 14411 - 14470 8.6 14 6 Tu 1 . + CDS 14605 - 14775 239 ## gi|160942341|ref|ZP_02089649.1| hypothetical protein CLOBOL_07226 + Term 14884 - 14933 7.3 + Prom 14883 - 14942 4.4 15 7 Op 1 2/0.000 + CDS 14966 - 15172 236 ## COG3666 Transposase and inactivated derivatives 16 7 Op 2 2/0.000 + CDS 15194 - 15709 267 ## COG3666 Transposase and inactivated derivatives + Prom 15788 - 15847 1.9 17 7 Op 3 . + CDS 15893 - 16447 89 ## COG3666 Transposase and inactivated derivatives + Term 16448 - 16490 8.2 - Term 16443 - 16469 -1.0 18 8 Op 1 . - CDS 16499 - 16957 512 ## COG3193 Uncharacterized protein, possibly involved in utilization of glycolate and propanediol 19 8 Op 2 2/0.000 - CDS 16974 - 17519 681 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 20 8 Op 3 . - CDS 17536 - 18888 1058 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC - Prom 18925 - 18984 8.1 + Prom 18881 - 18940 8.8 21 9 Op 1 4/0.000 + CDS 19115 - 19495 407 ## COG4810 Ethanolamine utilization protein 22 9 Op 2 . + CDS 19527 - 20036 440 ## COG4917 Ethanolamine utilization protein 23 9 Op 3 . + CDS 19990 - 20562 595 ## COG0693 Putative intracellular protease/amidase + Prom 20582 - 20641 2.3 24 10 Op 1 11/0.000 + CDS 20725 - 23295 2388 ## COG1882 Pyruvate-formate lyase 25 10 Op 2 . + CDS 23332 - 24249 572 ## COG1180 Pyruvate-formate lyase-activating enzyme 26 10 Op 3 4/0.000 + CDS 24277 - 25071 917 ## COG4816 Ethanolamine utilization protein 27 10 Op 4 . 
+ CDS 25116 - 25394 336 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein + Term 25441 - 25477 8.9 28 11 Op 1 . + CDS 25489 - 26253 343 ## Cbei_4057 hypothetical protein 29 11 Op 2 4/0.000 + CDS 26275 - 26934 506 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 30 11 Op 3 4/0.000 + CDS 27016 - 27279 382 ## COG4576 Carbon dioxide concentrating mechanism/carboxysome shell protein 31 11 Op 4 . + CDS 27322 - 28635 888 ## PROTEIN SUPPORTED gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P 32 11 Op 5 . + CDS 28662 - 29846 846 ## COG0192 S-adenosylmethionine synthetase 33 11 Op 6 2/0.000 + CDS 29892 - 30170 366 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein + Term 30189 - 30217 1.3 34 11 Op 7 1/0.167 + CDS 30236 - 30871 515 ## COG4869 Propanediol utilization protein 35 11 Op 8 4/0.000 + CDS 30905 - 31723 931 ## COG4820 Ethanolamine utilization protein, possible chaperonin 36 11 Op 9 6/0.000 + CDS 31745 - 33160 884 ## PROTEIN SUPPORTED gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P 37 11 Op 10 . + CDS 33179 - 34294 1003 ## COG1454 Alcohol dehydrogenase, class IV + Prom 34390 - 34449 6.7 38 12 Op 1 . + CDS 34477 - 35808 810 ## COG4936 Predicted sensor domain 39 12 Op 2 . + CDS 35811 - 36857 743 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 36957 - 36994 7.8 + Prom 37081 - 37140 12.6 40 13 Tu 1 . + CDS 37201 - 39708 1751 ## COG1882 Pyruvate-formate lyase + Term 39761 - 39799 1.3 + Prom 39761 - 39820 6.2 41 14 Op 1 . + CDS 39844 - 41163 1028 ## COG4936 Predicted sensor domain 42 14 Op 2 . + CDS 41181 - 42182 692 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Term 42201 - 42246 6.0 + Prom 42538 - 42597 6.6 43 15 Op 1 . + CDS 42763 - 43455 515 ## COG4816 Ethanolamine utilization protein 44 15 Op 2 . 
+ CDS 43485 - 43574 90 ## + Term 43591 - 43634 3.7 + Prom 43629 - 43688 3.4 45 16 Tu 1 . + CDS 43822 - 43977 63 ## gi|160942375|ref|ZP_02089683.1| hypothetical protein CLOBOL_07260 + Prom 43982 - 44041 6.7 46 17 Op 1 . + CDS 44095 - 44532 224 ## PCC7424_0111 hypothetical protein 47 17 Op 2 . + CDS 44533 - 44727 171 ## gi|160942378|ref|ZP_02089686.1| hypothetical protein CLOBOL_07263 + Term 44780 - 44844 13.2 - Term 45026 - 45073 0.3 48 18 Tu 1 . - CDS 45167 - 47089 752 ## COG3711 Transcriptional antiterminator - Prom 47171 - 47230 5.6 + Prom 47070 - 47129 6.5 49 19 Tu 1 . + CDS 47314 - 48591 660 ## COG1455 Phosphotransferase system cellobiose-specific component IIC + Prom 48615 - 48674 5.5 50 20 Op 1 . + CDS 48705 - 49010 334 ## Smon_0197 phosphotransferase system PTS lactose/cellobiose-specific IIA subunit 51 20 Op 2 10/0.000 + CDS 49015 - 49335 265 ## COG1440 Phosphotransferase system cellobiose-specific component IIB 52 20 Op 3 . + CDS 49348 - 50625 593 ## COG1455 Phosphotransferase system cellobiose-specific component IIC 53 20 Op 4 . + CDS 50648 - 51460 230 ## BDI_1568 hypothetical protein 54 20 Op 5 . + CDS 51477 - 52517 496 ## gi|160942388|ref|ZP_02089696.1| hypothetical protein CLOBOL_07273 55 20 Op 6 . + CDS 52508 - 53422 283 ## gi|160942389|ref|ZP_02089697.1| hypothetical protein CLOBOL_07274 56 20 Op 7 . + CDS 53467 - 54507 827 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 57 20 Op 8 . + CDS 54599 - 55903 1055 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 55917 - 55972 8.0 + Prom 55943 - 56002 3.2 58 21 Tu 1 . + CDS 56041 - 57366 752 ## COG0534 Na+-driven multidrug efflux pump + Prom 57392 - 57451 3.4 59 22 Tu 1 . + CDS 57510 - 58466 864 ## COG1250 3-hydroxyacyl-CoA dehydrogenase 60 23 Tu 1 . - CDS 58463 - 60469 1451 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains - Prom 60512 - 60571 8.8 + Prom 60580 - 60639 5.5 61 24 Op 1 . 
+ CDS 60683 - 61588 859 ## Cmaq_1396 hypothetical protein 62 24 Op 2 . + CDS 61588 - 62568 729 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases 63 24 Op 3 2/0.000 + CDS 62581 - 63690 544 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase + Prom 63741 - 63800 5.0 64 24 Op 4 . + CDS 63854 - 65080 994 ## COG1804 Predicted acyl-CoA transferases/carnitine dehydratase 65 24 Op 5 . + CDS 65093 - 65692 520 ## gi|160942400|ref|ZP_02089708.1| hypothetical protein CLOBOL_07285 66 24 Op 6 . + CDS 65710 - 66528 615 ## COG3836 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase 67 24 Op 7 5/0.000 + CDS 66555 - 67343 293 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 68 24 Op 8 5/0.000 + CDS 67381 - 68700 1196 ## COG2610 H+/gluconate symporter and related permeases 69 24 Op 9 . + CDS 68737 - 69570 714 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) + Term 69591 - 69630 5.1 + Prom 69826 - 69885 6.7 70 25 Tu 1 . + CDS 69916 - 70413 284 ## BC1003_2069 carboxymuconolactone decarboxylase + Prom 70491 - 70550 2.7 71 26 Tu 1 . + CDS 70605 - 70721 88 ## - Term 70791 - 70825 1.1 72 27 Tu 1 . - CDS 70871 - 72037 341 ## COG3547 Transposase and inactivated derivatives + Prom 72043 - 72102 5.1 73 28 Op 1 . + CDS 72173 - 72310 79 ## gi|160936491|ref|ZP_02083859.1| hypothetical protein CLOBOL_01382 74 28 Op 2 . + CDS 72313 - 73248 628 ## COG2030 Acyl dehydratase 75 28 Op 3 . + CDS 73279 - 75189 1353 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family - Term 75247 - 75281 1.1 76 29 Tu 1 . 
- CDS 75327 - 75974 108 ## COG3547 Transposase and inactivated derivatives Predicted protein(s) >gi|157101599|gb|DS480725.1| GENE 1 1 - 946 685 315 aa, chain - ## HITS:1 COG:mll7158 KEGG:ns NR:ns ## COG: mll7158 COG1804 # Protein_GI_number: 13475961 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Mesorhizobium loti # 3 315 10 323 388 369 53.0 1e-102 MKPLEGITILDFSQFLSAPSATLRLADFGARVIKVERPDGGDICRTLYVSNLIIENDSSL FHSINRNKYGVSIDLKNPASRDILWSLIGKADVMVVNFRPGVVEKLGLDYASVKQHCPSM IYGEITGYGKKGPWSPMPGQDLLVQSLSGLAWLNGNQNQPPTPMGLSVADLFAGQHLAQG ILAALIKRASCGEGSYVHVSMLESVMDMQFEVFTTYLNDGGQSPERSAVNNANGYINAPY GIYQTLDGYLAIAMVPVPVLGKLISCDSLIGYTDSETWCTRRDEIKSLLADHLKTRTTEY WLSKLEPADVWCADV >gi|157101599|gb|DS480725.1| GENE 2 943 - 2103 803 386 aa, chain - ## HITS:1 COG:mll7158 KEGG:ns NR:ns ## COG: mll7158 COG1804 # Protein_GI_number: 13475961 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Mesorhizobium loti # 3 380 10 388 388 390 48.0 1e-108 MKPLENMIVLDFSQYLAGPSAALRLADMGARVIKIERPGSGDGSRQMKLHNLDSMGDSVL FQTINRNKQSFCADLKSPEDMNMVKKLIRRSDVMIENFRPGVMKKIGLDYETVKSINPRM VYGTVTGYGATGPWNKKPGQDLLVQSISGLAWLNGDDGQPPMPFALSLADSYTGIHLAEG IMACLFRRFSTGLGGLVEVSLLESILDMQFEVITTFLNDGHKPPKRSAFHNAHAYLGAPY GIYRTEDGYIALAMGSIPELGRLIECPALSSYDNQEEWFTKRDEIKQLIGSHLLHRTTAS WLKALNAGGYWCSDVYSWDQLLQSEQFHTLDMLLNIRRPGGNDIFTTRCPISFGDGPIKN ETHAPLLGEHTLEIIEEFHLKEGEEA >gi|157101599|gb|DS480725.1| GENE 3 2100 - 3248 935 382 aa, chain - ## HITS:1 COG:mll7157 KEGG:ns NR:ns ## COG: mll7157 COG1653 # Protein_GI_number: 13475960 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 8 377 1 368 381 238 38.0 1e-62 MSIQLNGITWSHSRGYTSITAVSQRFCELHPDVSIHWEKRSLQEFADAPIENLAKSYDLL VIDHPWAGFAATTGILLPLEKYLSKEYLDDQLANSVGASHLSYNFNGFQSALAIDAATPV 
AVCRPDYFRNGEHPLPADFKDVMDLARKGHVAFAGIPLNLLMDFYMYCATSTEHFFDDPD KIVETSLGVSILEDMRQLASYCSKSIFQWDPIQVHEALAEDNDLYYCPYAYGYTNYSRRG YCSHPLKAMDLVTYRGRMLRSVLGGTGLAVSSLCENRETALRFAEYAASPLIQTTLYTEN GGQPGHRKAWLDEINNNMTMNFFKDTLNTLDHSYLRPRYNGYLYFQDHAGDYIQDYIMNG GSPAHVLDQLNQLYRKSREESK >gi|157101599|gb|DS480725.1| GENE 4 3574 - 4587 900 337 aa, chain + ## HITS:1 COG:rhaT KEGG:ns NR:ns ## COG: rhaT COG0697 # Protein_GI_number: 16131747 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Escherichia coli K12 # 22 333 26 340 344 79 24.0 8e-15 MASAFFVLFLACIFQGSFGICFKKYQPFSWEAFWVLFSFIGVLCIPHIWCMVEVPHYLSY ITATPVPTLIVGALSGFFWGISSIWYSKAIDMIGVSLVTGINLGLSNLLGSFVPMIILGT YPPARVLVVLLLGQLILLGGVIVLSKAGFMKNGNNEGTKTAKEKGTSSLFITGLIMALAS GAGSAAINIGATAANYPVALAVKEGVNPTSASLLSWVVVFAGGFLANFVYAISKLIKNKT YTDYTKPGCAKAYFKVVLTSIVWFSALAIYGKATALLGTLGPVVGWVAFNGLALIIANAW GFLDGEWKGFEKAKRVAIYGNAVIIAALVVVGISNGL >gi|157101599|gb|DS480725.1| GENE 5 4710 - 5792 1039 360 aa, chain + ## HITS:1 COG:mll7146 KEGG:ns NR:ns ## COG: mll7146 COG0673 # Protein_GI_number: 13475951 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Mesorhizobium loti # 15 351 18 349 358 310 45.0 2e-84 MANRIEELNYVPEMPEKARNIVIVGAGGIVSGSHLPAYKMAGYPVVGIYDLDHEKAEKAA KEFGIPNAVPELEQLIELGVREDAVFDIAVPASKTASILRLLPDNAAVLMQKPMGESIEQ AKEILDICHAKNLTAGVNFQLRQAPYMIAARKLIQDGVIGDIVDIDWRVVELHPWHLWSF LFGLERMEINYHSIHYIDCIRSFVGNPKGVYCKTMQHPKMQALSQTRTAIIMDYGDTLRV NLHINHNHDYAPDYQESTMKIEGIKGAIRIQLGLILDYPTGRPDKVEYITEDGKGWRELE VKGSWFNEAFIGTMGGLMKKLEDPSYHYMNSVEDAYHTMCVVEACYKSSEAGGLNVDYAQ >gi|157101599|gb|DS480725.1| GENE 6 5829 - 6707 790 292 aa, chain + ## HITS:1 COG:BMEI1718 KEGG:ns NR:ns ## COG: BMEI1718 COG3618 # Protein_GI_number: 17988001 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase of the 
TIM-barrel fold # Organism: Brucella melitensis # 3 292 2 282 282 148 33.0 1e-35 MRIIDTHLHIWNKEEFSLPWLDGEGEVLNRTYSLEDYKNALEPDKGYEVEAAVYVEVDCA RRDKEKENLFIAGCCANPGQMFAGACISGYLNEEGFKEYIDKYASEYVKGVRQVLHVPEA LPGTCLGQLFVENVRYLGKKGLVFEGCVRNAELGDLYETAKACPDTTIVLNHMGIVDPDI ISVQTPSEEEQQYKERWILNIRKLASLSNVVCKISGLNPKGEWFAETLKPAVDIALDYFG EDRVMFASNYPVCNIATGLSPWIEALMKITEGRGREFQNKLFRENAKRIYGL >gi|157101599|gb|DS480725.1| GENE 7 6726 - 7049 140 107 aa, chain + ## HITS:1 COG:Cj0488 KEGG:ns NR:ns ## COG: Cj0488 COG3254 # Protein_GI_number: 15791852 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 4 107 1 105 105 101 52.0 4e-22 MSDVKRYGSISRLTPEKADYYKKLHAAPWKQINDMIKECNIRNYSIYCHGEYLFSYYEYT GDNYEADMAKMAADSETQRWWGECVPCMNPLSADGPWMDMQEIYHLD >gi|157101599|gb|DS480725.1| GENE 8 7063 - 7938 697 291 aa, chain + ## HITS:1 COG:SA0304 KEGG:ns NR:ns ## COG: SA0304 COG0329 # Protein_GI_number: 15926017 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Staphylococcus aureus N315 # 1 287 5 291 293 167 33.0 2e-41 MKKLYVASITPFDGNNQINDSAMRQLWDRNLAEGADGFFIGGSSGECFLLSREERVHTFE LAGEYTDRTEMFAHVGAISTEEAIFYARKAKEMGIQNIAATPPFYFGFSDKEIAGYYYDI AEAVDMPVLYYNIPSSTHRELNPSQPDLAALLKSGAVGAIKHTNLNLYQMERIHDVNPDI RCYGGFENCMVAFLAFGCDGFIGSNFNFMLPQYKKVMELYLSHHDDEARILQSKSNCILD AVLKNGLCASLKYIASCQGIQAGEVRKPMCPLTENQKAEIDRVIGMYMEIK >gi|157101599|gb|DS480725.1| GENE 9 8072 - 8773 611 233 aa, chain + ## HITS:1 COG:CAC2546 KEGG:ns NR:ns ## COG: CAC2546 COG2186 # Protein_GI_number: 15895808 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 5 145 4 144 231 80 32.0 3e-15 MECEPLKKVTLTEQIMEQIAGMITSGQLKPGDKLPNERDLAELFGVTRSRIREALRALSL VGMLTIKPGDGSFVSEEGTKVPEETVMWMYYQEMNKHDEIYAARNLIETEVYLTCYDNRT PEILERLDGFRDRLLDVETEEITAEAFCQLLTDIDTYVGDVCGNGIYARLMQIMVLMRKE 
SAMKILSLASSKESAVFYRCKILNSFHQDDRNKVKKCLNDFFKYSIREISIQE >gi|157101599|gb|DS480725.1| GENE 10 8969 - 10876 1566 635 aa, chain + ## HITS:1 COG:CAC0116 KEGG:ns NR:ns ## COG: CAC0116 COG1151 # Protein_GI_number: 15893412 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 14 634 9 629 629 706 55.0 0 MDTSIKEGHMDKTMCISADKRLQEFLESVPEDHSHRRMVQQEVKCGFGLDGVCCRLCSNG PCRISKARPKGVCGADRDTIAARNFLRSVAAGSGCYIHVMEQAAKELKSAALAGRKLKGV DALHRLCDELGIGEADDSQMALRLSEAVLEDLRRPYEEPMHLTEKLAYAKRTGVWKNAGL LPGGAKDEIFNAIVKTSTNLNSDPMDMLTQCLRLGISTGIYGLVLVNRLNDILMGEPELG FDPVGMRVIDPECINIMVTGHQHSMFGSLTGMMEHPEIKEMADRAGARGIRIVGCTCVGQ DFQARSRCYEKVFCGHAGNNYSSEAVLLTGCIDLVVSEFNCSLPGIEPICEELSIPMLCL DDVAKKKGARHLPYHPEDREDVSVNVIAAAIASYVRRSRAGKRQNPMQDHGCNEAVTGVT ERTLKGALGGSFAPLIDLIAAGSIKGVAAVVGCSNLRAKGHDVFTVELVKKLIARDILVV SAGCTCGGLENCGLMSRDAAGLAGPGLSQVCRSLGIPPVLNFGPCLAIGRIEAVAGELAG ALGVDLPKLPVVVSAPQWLEEQALADGAFALALGFHLHLGLAPFVMGSPVILDVLCNQME GITGGKLMVETEIDAAAEKIAAVISDKRRGLGLNE >gi|157101599|gb|DS480725.1| GENE 11 10869 - 12224 725 451 aa, chain + ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 86 449 12 365 372 285 46.0 1e-76 MSERYPKSVPVSPDNPAIEKDEDICANCSHCLAVCAEEIGVAQPCKVTGADDFTCIHCGQ CAAACPERAIRAKSQWREAAMAVNDQQKIVIFSTAPSVRVGLGECFGGKPGDFVESYMVS ALRKLGADFVLDVAFSADLTIMEEGTEFLKRFIADSHPLPQFTSCCPAWVKYIETFHPER IPCLSSTKSPISMQGALIKTYFSKEKGIDPRRIVNIAVAPCTAKKFEISREELCSAGAYL GIPGLRDNDYVLTTRELAEWIREAGIGFYDLTPSAFDSLMGEGSGAGVIFGNTGGVMEAA LRMAYSVVNGIEPPDLLMQYQPVRGLEQVKESVVSLGGRDVRTAVVYGTRAAEQLIATGA VDDYDFVEVMTCPGGCIGGGGQPDNGILPVPDQLRMARIRSLYLADQERSFRNSLENQEI KHLYETFIGEPMSKMAQLLLHTSYHSRKRGI >gi|157101599|gb|DS480725.1| GENE 12 12755 - 13639 956 294 aa, chain + ## HITS:1 COG:no KEGG:Cbei_4049 NR:ns ## KEGG: Cbei_4049 # Name: not_defined # Def: 
hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 287 1 288 289 197 48.0 5e-49 MNLFALVASFGGGIIGAYMGALPAFILTGVIAIAGAVAAMAGGADMTVGFIAFGSYLGPH IAFAGGVAASAYAGKTKKLGSGTDILSCLNGLADPATLLVGGVFGVIGFLIHYVIGAVLH LNTDLPGFTVIISAVIARLVFGSSGLIGKAAPDETREYFTGGKGMLCNVILGLGIGTTTG FVYQALVDGGASAASIGSYPVLCFGIAAVSLIFAQTGFAMPATHHIALISALAAVTAQNP VMGIVFGILTSLFGDFIGKTFNSCCDTHIDPPAFTIFIFTFIVNVLFGSGFFSV >gi|157101599|gb|DS480725.1| GENE 13 13636 - 14421 804 261 aa, chain - ## HITS:1 COG:BS_ykuU KEGG:ns NR:ns ## COG: BS_ykuU COG0450 # Protein_GI_number: 16078486 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Bacillus subtilis # 3 173 4 180 180 187 52.0 1e-47 MNRIIGKKAPDFHLKAVSGDGEDFFDVSLDTYKGHWLVLFFYPMDFTFVCPTEITTFSND LYLFDEADAKLLAVSTDSEYTHQAWIRNGLGRIGYPLGADKTLSMAADYGVLLEDQGVAL RGLFIIDPDQTIRYSVIHDNNIGRNTQEVLRVLKALQTGSLCGANWTIGSQPLKEDRPSL PDSFASDKTVRIYTLPGCSYCRQVKEFLADNGISYEEIDLDSDMQGQAFMDSRGYTALPV TVIGSHEISGFRLDKIKELLL >gi|157101599|gb|DS480725.1| GENE 14 14605 - 14775 239 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942341|ref|ZP_02089649.1| ## NR: gi|160942341|ref|ZP_02089649.1| hypothetical protein CLOBOL_07226 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07226 [Clostridium bolteae ATCC BAA-613] # 1 56 1 56 56 71 100.0 2e-11 MSDKKYSGNFYKTDKIEETPDDCPNCLYQNKKAEKDPDECPNCYYIPNKKKKEGQE >gi|157101599|gb|DS480725.1| GENE 15 14966 - 15172 236 68 aa, chain + ## HITS:1 COG:SPy2013 KEGG:ns NR:ns ## COG: SPy2013 COG3666 # Protein_GI_number: 15675797 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 4 68 5 69 401 76 49.0 1e-14 MMTQNKDKKRGKIQIFCMDDMVPQDHLLRIIDKAIDWNFIYGLVVDKYSPDNGRPSMDPV MLIKLPFI >gi|157101599|gb|DS480725.1| GENE 16 15194 - 15709 267 171 aa, chain + ## HITS:1 COG:SPy2013 KEGG:ns NR:ns ## COG: SPy2013 COG3666 # Protein_GI_number: 15675797 # Func_class: L Replication, 
recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 1 171 78 237 401 203 56.0 1e-52 MRQTVKEIEVNVAYRWFLGLEMMDKVPHFSTFGKNYTRRFKDTGLFEQIFSHILQECYKF KLIDPSEVFVDSTHVKARANNKKMQKRIAQEEALFFEDLLKKEINEDREAHGKRPLKEKD DDSNPPSGPSGGKEEKTIKTSTSDPESGWFHKGEHKSVFAYAVQTACDKNG >gi|157101599|gb|DS480725.1| GENE 17 15893 - 16447 89 184 aa, chain + ## HITS:1 COG:SPy2013 KEGG:ns NR:ns ## COG: SPy2013 COG3666 # Protein_GI_number: 15675797 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 3 77 302 376 401 102 57.0 3e-22 MTKKGFFKKYEYVYDEYYDCYICPNNQVLSYRTTNRDGYREYKSSGTVCENCPYLEQCTQ SKDHVKVVTRYIWEPYLETCEDIRHTLGMKELYSQRKETIERVFASAKENHGFRYTQMYG KARMEMKAGLTFACMNLKKLAKIKAKWGLLEEGEPHKISILYTIRIIREKWLWDTNPRAA LSTV >gi|157101599|gb|DS480725.1| GENE 18 16499 - 16957 512 152 aa, chain - ## HITS:1 COG:PA1329 KEGG:ns NR:ns ## COG: PA1329 COG3193 # Protein_GI_number: 15596526 # Func_class: R General function prediction only # Function: Uncharacterized protein, possibly involved in utilization of glycolate and propanediol # Organism: Pseudomonas aeruginosa # 23 151 10 138 143 79 36.0 3e-15 MNETHTAKEITAALLDALKHQGLCLKEALQMIEKGEEMAEIIHVPMVITVVDEGGNTVAM HRMDDSLLASISISYSKAYTAAALRAPTGEAARDILPGQSLYGLQQTHPGKFCIFGGGIP IMKNGRCIGGLGVSGGTVEQDMTVARHALSLD >gi|157101599|gb|DS480725.1| GENE 19 16974 - 17519 681 181 aa, chain - ## HITS:1 COG:lin1107 KEGG:ns NR:ns ## COG: lin1107 COG4577 # Protein_GI_number: 16800176 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Listeria innocua # 1 181 3 183 184 150 53.0 1e-36 MITIGFLELNSIAKGILAADMMLKAAEIKLVSARPSCPGKYQILITGEVSAVESALRIGE ESAKTNVVDRLLIPRVNPQVIEAINMSSMPSSLKALGILEFFSVTGAIIAADAAAKAANV SLIEIRLGTGIGGKSFVTFTGDVGAVEESVEAGAKTAENSGALLEKVVIAHPDKELYRSL L 
>gi|157101599|gb|DS480725.1| GENE 20 17536 - 18888 1058 450 aa, chain - ## HITS:1 COG:lin1106 KEGG:ns NR:ns ## COG: lin1106 COG4656 # Protein_GI_number: 16800175 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Listeria innocua # 1 445 4 449 454 397 48.0 1e-110 MTLLEQIKEAGIVGCGGAGFPTHVKLNCTVEYLIINAAECEPLLRTDRFLMLHKAREIVT AAGMIGDMVQAGKRYIVLKETYSEEIAALEAAITELHSPVKLYKMKNFYPAGDEQIMVCD VTGRTVPPSGIPLDVGAVVSNLATVYSIYNASQGQPFTEKYLTVTGAVNSPCIVRAPLGT SFAECLLLAGGSSLSSFHVIAGGPMMGKCYRKEEAAGLTVTKTTSGYIIVADDTPLVEKH NIPISVSLKRAKMCCIQCSYCTQMCPRYLTGHPLKPHMIMRKLAYAQSPEEVLEDEHVRQ AMICSECGLCETYACPMGLQPRQVNIYVKNLLRQNKYRYPKPSETFTQLEERSYRKAPSK RMAVRLGVDQYYDYHIDSCKAADPATVRISLRQHIGAPSQPVVQTGDTVSCGQLIGAIPA EALGANIHASISGTVVQVTDTEIVISADRQ >gi|157101599|gb|DS480725.1| GENE 21 19115 - 19495 407 126 aa, chain + ## HITS:1 COG:ECs3324 KEGG:ns NR:ns ## COG: ECs3324 COG4810 # Protein_GI_number: 15832578 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Escherichia coli O157:H7 # 12 126 26 135 135 110 47.0 5e-25 MEIKNWNGSQGDLMRIIQESVPGKQITMAHIIASPDEVIYKKLGLDPKVDYHRSAIGVLT LTPSETAIITADLAIKASNIQIGFIDRFSGTLIITGTISDVNIAFERILEYTSSELDFTV CRITKA >gi|157101599|gb|DS480725.1| GENE 22 19527 - 20036 440 169 aa, chain + ## HITS:1 COG:STM2056 KEGG:ns NR:ns ## COG: STM2056 COG4917 # Protein_GI_number: 16765386 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Salmonella typhimurium LT2 # 1 142 1 143 150 109 38.0 2e-24 MKKIMLVGRSGAGKTTLTQAMKGRKITYHKTQYINNYDVIIDTPGEYAENKTLARALILY SYEADIVGLLMSAIEDYSLYSPNIVSMATREVIGIVTQIDQPEARPDLATMWLELTGCKK IFYVSSVTGEGVGDILNYLREEGDTMPWEREEETDEASSDAGGRLYGGL >gi|157101599|gb|DS480725.1| GENE 23 19990 - 20562 595 190 aa, chain + ## HITS:1 COG:PA4336 KEGG:ns NR:ns ## COG: PA4336 COG0693 # Protein_GI_number: 15599532 # Func_class: R General function prediction only # Function: Putative 
intracellular protease/amidase # Organism: Pseudomonas aeruginosa # 2 190 5 193 194 230 58.0 1e-60 MKRVLMLAGDYTEDYEVMVPYQALLMAGVAVDVVCPDKTAGDTIRTAIHDFEGDQTYSEK PGHNFVLNASFQYLKLEQYDGLFLTGGRAPEYLRLNSRVIDMVHYFMDLKKPVAAICHGV QILTAARVLKGKKVTAYPAVRPEVLAAGGIFMEKEPYEAVIFENLVTSPAWPGNVEILKG FLKLLGVTIS >gi|157101599|gb|DS480725.1| GENE 24 20725 - 23295 2388 856 aa, chain + ## HITS:1 COG:STM4114 KEGG:ns NR:ns ## COG: STM4114 COG1882 # Protein_GI_number: 16767379 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Salmonella typhimurium LT2 # 10 852 2 761 765 549 37.0 1e-156 MIAKGFTEPTERVKRLKRAIVDAIPYVESERAVLVTESYKETEGLSPIMRRAKAAEKIFN NLPITIHDDELVVGAITKNLRSTEICPEFSYDWVEKEFETMGTRVADPFQIPKDTAAELH EAFKYWEGKTTSALADSYMSQETKDCIANGVFTVGNYFYGGVGHVCVDYGKVLDIGFTGI IKQVIETMEKLDTSDPEYIKKKNFYEAIVITYTAAINFAHRYAAKAREMAASCPDPVRKA ELLQIAANCDRVPERGATNFYEACQAFWFVQILLQIEANGHSISPGRFDQYMYPHLAADK NICPEFAQELVDCIWVKLNDVNKTRDEVSAQAFAGYAVFQNLIVGGQTEDGLDATNDVSY MCMEAVAHVALPAPSFSIRVHQNTPDEFLYRACEVTRLGLGVPAMYNDEVIIPALCNRGV SLADARSYCIIGCVEPQCPHKTEGWHDAAFFNIAKVLEITLNNGKVGDKQLGPQTGDMTS FTSIEDIFAAYKKQMEYFVYHLAEADNCVDFAHAERAPLPFLSALVDDCIGRGKSVQEGG AIYNFTGPQAFGVADSGDSLCAIKKHVFESKEVTMAQLKEAMANNFGYACNASAPAATAD ECTDEARIYEAVKRILSNNGSINLADLQAQLAGPAQACRWPSPAEPAKTEPACVNPDYAH IKRLMENTPWFGNDIDEVDMIARRCGQIYSYEVEKYTNPRGGQFQAGCYPVSANVLFGKD VSALPDGRLAKTPLADGVSPRQGKDTNGPTAAAMSVAKLDHANYSNGTLYNQKFLPDALA GDEGLKRFASVVRAYFDHKGMHVQFNVIDRATLLAAQEHPEDYKDLVVRVAGYSAQFTVL AKEVQDDIISRTEQTF >gi|157101599|gb|DS480725.1| GENE 25 23332 - 24249 572 305 aa, chain + ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 7 305 3 300 302 257 41.0 2e-68 MDSINYNLMGTVFDIQRFSLHDGPGIRTIVFLKGCPLSCQWCSNPESQSIKPVIMYKEDD CLHCGRCITACKRGAISPENKTWINRELCSGCGECVNACPAGALVLKGKTMSIQQVIREL 
KKDATTYRRSGGGITLSGGEPLMQYEFASELLQACKGQGWNTAIETTGVGSREAIEKVIP YVDTVLLDIKHMDGEQHKKFTGVSNDLVIKNAPEICKISNTVIRVPVIPGFNYSVEDIKA IAEFARTLVGIRTIHLLPYHSFGENKYGLLGQDYTLKQIKPLAPEDLEECKAVVESYGFQ CIIGG >gi|157101599|gb|DS480725.1| GENE 26 24277 - 25071 917 264 aa, chain + ## HITS:1 COG:lin1116 KEGG:ns NR:ns ## COG: lin1116 COG4816 # Protein_GI_number: 16800185 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 4 264 6 267 267 293 63.0 2e-79 MDQQLIEKVMKQVSDELGIGKEECKAEAPKEECCGNIGLTEFVGTAIGDTIGLVIASVDP MLVDVMKLGKYRSIGIVGGRTGAGPQIWAVDEAVKATNTEIISVELPRDTKGGAGHGSLI YIGADDVSDARRAVEIALQVLPKYFGDVYGNDAGHLEFQYTARASYCLEKALGAPVGRAF GMTCAGPAAIGTVLADIAVKAANVEVVGYCSPGNGGTSYSNEVILTFTGDSGAVRQAVKA SIEAGKKMLGALGDEPKSTTTPYI >gi|157101599|gb|DS480725.1| GENE 27 25116 - 25394 336 92 aa, chain + ## HITS:1 COG:lin1123 KEGG:ns NR:ns ## COG: lin1123 COG4577 # Protein_GI_number: 16800192 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Listeria innocua # 4 89 3 88 91 107 87.0 7e-24 MNADALGMVETKGLVGAIEAADAMVKAANVTLVGYEKIGSGLVTVMVRGDVGAVKAATDA GAASAAAVGEVVSTHVIPRPHTDVEKILPHLE >gi|157101599|gb|DS480725.1| GENE 28 25489 - 26253 343 254 aa, chain + ## HITS:1 COG:no KEGG:Cbei_4057 NR:ns ## KEGG: Cbei_4057 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 251 1 251 257 176 36.0 8e-43 MKISEEQLKELVLRVLGELEAERKADNANHPGQKLYMLCVSPWSDRYMEFLREMETSDQY EIVPVIPSSWKDMGYEARLMETKACSHILHRSCEKPADLETAITVMPVVPRDVLVKTALC ISDTYETSWISECMDKGSRVVLLRSGLARFSGKEKPAYRNRVMNYYRQILEYGVEICSVD EFTGRAPYESPGHSPAGETASGNDTQHPDTKKKRVISSSNVEQFASGGVIFLQQGDIVTD LAKDRAKFLNIVFK >gi|157101599|gb|DS480725.1| GENE 29 26275 - 26934 506 219 aa, chain + ## HITS:1 COG:lin1122 KEGG:ns NR:ns ## COG: lin1122 COG4577 # Protein_GI_number: 16800191 # Func_class: Q Secondary metabolites 
biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Listeria innocua # 1 117 1 116 145 84 45.0 1e-16 MKQALGLVEISGLSTAVVVADTMAKAANVRILEIENTKGLGYMTIKIIGDVGAVNAAVNA GKQIGTANGKLVSWKVIPRPSDYVDQTFCCPEPPVPPSPPKKEAQEETEAEVEVKAETME AVKAGEPEETGLEEIAPEDTEQETEAEPAEAGPEPKIEAEPIEIKSEPLVEAEPTRTGTV DPAEPEVPDSPETPPERPKKTAKRTTSAKSTSTTRHKKN >gi|157101599|gb|DS480725.1| GENE 30 27016 - 27279 382 87 aa, chain + ## HITS:1 COG:lin1127 KEGG:ns NR:ns ## COG: lin1127 COG4576 # Protein_GI_number: 16800196 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Listeria innocua # 1 86 1 86 87 77 52.0 5e-15 MRLAKIIGTVVATRKDNSLVGYKLMIIRRIDGHGNFIDSEEVAVDYVGAGIGETVLIGSG SSVRVDQSKREAVIDMAIIGIVDTMDI >gi|157101599|gb|DS480725.1| GENE 31 27322 - 28635 888 437 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P [Lactobacillus reuteri DSM 20016] # 5 437 39 474 477 346 41 2e-94 MKSSIYSSVTDAVAAAKGAYSRYSKLTLNERQEILEGVKKALRPIANELAAMTVSETGMG NVCDKAQKIKLAIEKTPGVEDLITEVNTGDHGMTLYELSPFGIVCAVQPCTNPCATLISN TIGMLAAGNAIIHCPHPRAVKVSQFLTTIISETIRSISGIDNLVVTLHVSLMGFTTEVMS HPDVDLVVCTGGSGSLRQAMTSGKKVIGAGPANPVAIVDATANFEKAARDIVRGASFDNN LMCTSEKCMVVEEKAADRFIFELLKNGVYYVKNEEEIQKLLDAAVTGDFVINKKLEGRSA NEILEYAGIPCNGTIKLIVAETERIHPFVTLELLMPLVPLVKAADFEEALEIALDVEQGY KHTATIHSESIEHLNQAAKEMQTSVFVKNGPSFMGIGFDKEGHTSFTIATTTGEGTTTAR HFARRRRCTLTSGFSIR >gi|157101599|gb|DS480725.1| GENE 32 28662 - 29846 846 394 aa, chain + ## HITS:1 COG:BS_metK KEGG:ns NR:ns ## COG: BS_metK COG0192 # Protein_GI_number: 16080107 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Bacillus subtilis # 13 394 9 398 400 414 54.0 1e-115 MMEGRDDRFYYVTSESVTEGHPDKVCDQISDGILDAYLGKDPKSRVAIEAMASANTLMLA 
GEVTSSGHVDVVAKAREIIRNIGYTEHGKGFDADTCMIFTNIHNQSPDISMGVTRSAGTG KEVLGGGDQGIMYGYAVNETESLMPLSCHLANRLAERLAYVRKVQKADFLYPDGKTQVTM KYNEAGKPVCIHSIVVSTQHSENISHKLLEAFIRCTVIEPVIDPQWITPKTRIHVNPTGR FVIGGPAGDTGVTGRKIMVDTYGTVGKHGGGAFSGKDPTKVDRSAAYMARYVAKNIVAAE LADRCEVALAYVIGGIGPEAITVNTFGTGRIPDSSIEELVKSVFSFGVADYISELNLRTP QFLKTAAYGHFGRDDQGFRWEETDKAWLLKQLAF >gi|157101599|gb|DS480725.1| GENE 33 29892 - 30170 366 92 aa, chain + ## HITS:1 COG:lin1123 KEGG:ns NR:ns ## COG: lin1123 COG4577 # Protein_GI_number: 16800192 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Listeria innocua # 3 92 2 91 91 110 83.0 7e-25 MSNEALGMVETKGLVGAIEAADAMTKAANVALIGYEKIGSGLVTVMVRGDVGAVKASVDA GAAAAANVGQVVSQHVIPRPHTDVEKILPKSL >gi|157101599|gb|DS480725.1| GENE 34 30236 - 30871 515 211 aa, chain + ## HITS:1 COG:TM0375 KEGG:ns NR:ns ## COG: TM0375 COG4869 # Protein_GI_number: 15643143 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Propanediol utilization protein # Organism: Thermotoga maritima # 9 209 1 207 210 222 58.0 5e-58 MVNEELLKLITERVMEKVVNYNTYKIPVGVSNRHVHVTREDLETLFGKGYELTVKGELKQ PGQFASNETVTIRGPKGEFERVRILGPVRKQSQIEISKTDSFRLGVKALIRESGDLSGTP GLELVGPKGTVLLPQGAIVALRHIHMTPQQAEAMGVWDKDIVEVETFGERHGVFGDVLIR VSDQFELEMHVDVDEANACALKNNDYVIIRH >gi|157101599|gb|DS480725.1| GENE 35 30905 - 31723 931 272 aa, chain + ## HITS:1 COG:FN1783 KEGG:ns NR:ns ## COG: FN1783 COG4820 # Protein_GI_number: 19705088 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin # Organism: Fusobacterium nucleatum # 6 264 10 268 274 265 55.0 5e-71 MTKEVLSEFAELIRENKCNPYSGNLKTGVDLGTANIVIAVTDEDNHPVAGATAVSKVVKD GIVVDFVGAMQTVRRLKAQLEELLGVTLETAATAVPPGIMDGNVKCISNVVEGAGFEVIN VIDEPTAAASVLEITDGAVVDVGGGTTGISILKDGKVIFTADEPTGGTHMTLVLAGYYGI PIEEAEKLKKDKDKENDVFPIIKPVVEKMASIVKRFLSGYDVDCVYVVGGASCFDEFEQT 
FEKALGITTVKTNDALLVTPLGIAWNAAADCA >gi|157101599|gb|DS480725.1| GENE 36 31745 - 33160 884 471 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P [Lactobacillus reuteri DSM 20016] # 36 471 38 474 477 345 42 5e-94 MDMDIKVIEQLVEQALKEIKAEQPLKFTAPKLERYGVFKTMDEAIAASEEAQKKLLFSKI SDRQKYVDVIRSTIIKRENLELISRLSVEETEIGDYEHKLIKNRLAAEKTPGTEDLLTEA ITGDNGLTLVEYCPFGVIGAITPTTNPTETIINNSISMIAGGNTVVFSPHPRAKKVSQMT VKMLNKALIDNGAPPNLITMVEEPSIENTNKMIDNPSVRLLVATGGPSIVKKVLSSGKKA IGAGAGNPPVVVDETADIDKAAKDIVDGCSFDNNVPCIAEKEVFAVDSICDYLIHHMKEN GAYQITDPMLLEQLVALVTTEKGGPKTSFVGKSARYILDKLGITVDASVRVIIMEVPKDH LLVQEEMMMPILPVVRVSDVDTAIEYAHQAEHGNRHTAMMHSKNVEKLSKMAKIMETTIF VKNAPSYAGIGVGGEGYTTFTIAGPTGEGLTSPRTFCRKRKCVMTDAFSIR >gi|157101599|gb|DS480725.1| GENE 37 33179 - 34294 1003 371 aa, chain + ## HITS:1 COG:lin1130 KEGG:ns NR:ns ## COG: lin1130 COG1454 # Protein_GI_number: 16800199 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Listeria innocua # 1 370 1 371 372 344 46.0 2e-94 MSNVYIKTKIRSGNYAMEYLRKIRKKRILVVCDKFLAESGAINYILDSLDSSNSVELFDQ AIPDPTIEIVGKGLGVLCRVRPDIVIGFGGGSAIDTAKAMIYFAGVRELVPKPRFVTIPT TSGTGSEVTSAAVITDSETKSKHLISSEDILADVAILDPRLTLSVPPAITANTGMDVLTH ALEAYVANGCNVFSDALSEKATELLIKALLKCYRDGKNLEARTLMHEASTLAGMAFNTAG LGLNHSIAHQLGGTFHIPHGLANAMLLNKVIEYNSVNHDIMNKYATLAYKVQLVGMNEAP EMAVKTLEELIRSLMCCMDMPASIRELGISREAYEGQLQNMTDNALMDNCLISNPREAGT EAIKEILLSIY >gi|157101599|gb|DS480725.1| GENE 38 34477 - 35808 810 443 aa, chain + ## HITS:1 COG:CAC0432_1 KEGG:ns NR:ns ## COG: CAC0432_1 COG4936 # Protein_GI_number: 15893723 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Predicted sensor domain # Organism: Clostridium acetobutylicum # 44 197 20 169 177 82 30.0 2e-15 MAEYHKMMHMDEVEPDALDEYLSINEEDRINELDIIHLFGKEKLETIQESLSKATGLAFI TVDFKGDPITSATSFSRYCEKVRNNPVAMERCKSSDAFGSIQAAVTQKTNVYFCPCGLLE VAIPIIVRGHYLGGFIGGQIQCNDAPDSVSRLSSVMHAAKADEAVVQYKSLLDEIPVYSY 
EKFLDIANLVFLVINQLSENEISQHLQEDILKKRIKKIQTVNHGYAKEIVQKNKRVQELE MRGNQYELLDMLTSLLNLTIIEDAPRTNELLSSFIDYIKYKYMEKSTFVHISGELEHAEQ YLLFQKKKTGDRLEYSIQVPKNLHMQKMPSDVLMPFIQNAFYNGVMLKKEGGQITVTGYV RNGNVVLEINDTGPGLTEAELDIKYETYKDKHEGYYIKMGMEYAREKMKRLFGEEYSIIS ESYRNKGSKTILLWPEHYEERTE >gi|157101599|gb|DS480725.1| GENE 39 35811 - 36857 743 348 aa, chain + ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 237 343 423 529 530 73 29.0 4e-13 MYRVLIADDDILMREALNIMIAKDSAFKVVHMTGTGEGAVRACKEDVIDIVFMDMLMPGM TGLEASRIIHQSNPEINIYILSAHAANILIRSAAHSQVKDFLEKPITYNFLKKILENYKT EHEDSVQNQLETLASYLRNKNFGDFYGGIAGVIDEIYGIAGEDSVRLIKLFTYLGQNLLD TRSMYGDSGNISELFPINEVLILDRKTSELWLFRVMDFLFQQNSIGRYPLLENIFIYIEK HIKEDITLNSIIENCAISQGYLSRIFREQFQVSVTEYLHMKKLHLAKGYFYFTEDSIAEV AFRLGYNESSYFSKVFKKFENMTVKQYKNKIRQSSPGKKTETGRCGNR >gi|157101599|gb|DS480725.1| GENE 40 37201 - 39708 1751 835 aa, chain + ## HITS:1 COG:pflD KEGG:ns NR:ns ## COG: pflD COG1882 # Protein_GI_number: 16131789 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 10 835 2 765 765 574 38.0 1e-163 MIAKGFTEPTERIKRLKKAFVAAVPEVESERAVLVTESYRETEGMPILIRRAKALERILN HLPIVIRDDELVVGVITKNLRSAQIYPEFSFEWVEKEFDTMDKRSADPFKISEAAKSELK EAFKYWKGKTTSDLASAYMSEATKEAISNGVFTVGNYFYGGIGHVSADYGKVLELGFSGL IKLIVETMEKLDQNDPDYVKKRTFYESMIICYNASIQFARRYAVKAREMAASCSCAVRKA ELLRIAQVCDRVPEHGATNFYEACQAFWFTQILIQIESSGHSISPGRFDQYMYPYLKMDH NISEEFAQELLDCLWVKFNGLSKIRDEVSAQAFAGYGGFQNLIVGGQTGDGLDATNDVSY MCMVAAAHVQLPQPSFSIRVHQNTPEEFLYRACEVVRLGTGVPAMYNDETIIPALCNRGV SLADARNYGMIGCVEPQSPHKTDGWHDSAFVNVAKILELTLNGGKNNGHQVGPDTGDVTS FKSIEDVWRAFEKQIQYFVHHVVEACNSVDYAHTERTPLPFLSGLVDDCIGKGLSLQHGG AVYNFTGPQAFGVADCGDSIYAMQTHVFENKELTMAQLKDALDHNFGCGCCETASDDEAR IYEAVKRILSSNGSIDVSALQSQLSASQTANAGDGEYARIKRIMENTTWYGNDVDDVDLL 
ARRCGQIYSREVEKYKNPRGGVYQAGCYPVSANVLFGKDVGALPDGRLAKQPLADGVSPR QGKDTNGPTAAAMSVAKLDHENYSNGTLYNQKFLPSALAGDEGLKRFSAVVRSYFDHKGM HIQFNVIDKNILLDAQAHPELYKDLVVRVAGYSAQFTVLAKEVQDDIISRTEQAL >gi|157101599|gb|DS480725.1| GENE 41 39844 - 41163 1028 439 aa, chain + ## HITS:1 COG:lin1114_1 KEGG:ns NR:ns ## COG: lin1114_1 COG4936 # Protein_GI_number: 16800183 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Predicted sensor domain # Organism: Listeria innocua # 45 184 12 147 149 81 36.0 4e-15 MEELDRIFEDTGPLASTDQEERDTGETSRGIGSIDIIRLFGRDKLEHIQESLSKATGLAF VTVDYKGEPVTDMTFFSDFCRKFREDKELRQNCRTSDAMGSIQAAVTKKTHIYFCPCGLI EMAIPIVVEGNYLGGFLGGQFRCEDAPETVRQIEPTVETERFRRLREESRQRFEELPVYS YEKILDMANLVSLVITQLSENRANQYYQEDMLKKRIKKIQETNQKHLKEKAELEVRFHEL EAQSNPYSYLNMLIALQNLGTVEQAERTTELADLFISHVKYGIADKGTFVHFAGELEYVE QYLKFQKIRLGSQFNYTLHFPREMQMMKLPAGVLFPFVQNAFYHGVMLSPEGGSISLNGY LKQGNFVLEIINTGSGFSNEELKIRFEPYKNNHEGYYIQLGMECAKRKLAQLFGDEYTIT VEINKNKGCRCEIIFPEYK >gi|157101599|gb|DS480725.1| GENE 42 41181 - 42182 692 333 aa, chain + ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 237 331 423 517 530 67 30.0 3e-11 MYRILIVDNDMLSQEALEAMIAGYEQFEIVDKVTTGEKAVKVCRSCAVDLVFVEMQLPGI SGAETGRLIRQFCPDTVIYGMTVCTSAGLIESIVPGTVKAMIEKPVMAKNLNTIFRNYKT EKEDTSNKQVEELVEILKERDFQMFYQRLPGIIDCIYENAGPEPDRLVKAFSYIGQQLTD TQNLYDEKNTIIELFPINKSLILEKKTSELWLFRVVDYLFQQSSMKRYPLLDNIFLYIDQ HIKEEITLNKIIENCAVSQGYLSRIFRSQFHVSVTEYLHMKKMHLAKGYFYFTEDSIGEV AFRLGYSESSYFSRVFKKYENMTVKEYKNKIKK >gi|157101599|gb|DS480725.1| GENE 43 42763 - 43455 515 230 aa, chain + ## HITS:1 COG:lin1116 KEGG:ns NR:ns ## COG: lin1116 COG4816 # Protein_GI_number: 16800185 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 6 222 41 259 267 224 54.0 1e-58 
MTDHSQITEFVGTAIGDTIGLVIPGIDPSLAPFMKLGEYRSIGIFGGRVAVGPQVIAVDE AVKAADVELISMEASRDSKGGAGCGTLFYIGANDVSDARHAVEIALEQLPKQFGEVYVND AGHLSFSYTARAGQCCVKTLGAVEGKAFGIICCGPAAIGCLVSDLAVKSAPVTVIGYHHP LDNCFANEVVLCFTGESAAVRQALRTAIHAGKQLLGSLGEEPHSLGNPWI >gi|157101599|gb|DS480725.1| GENE 44 43485 - 43574 90 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVNYVAMPIMVDVGDSPVIVIDVDHYEKI >gi|157101599|gb|DS480725.1| GENE 45 43822 - 43977 63 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942375|ref|ZP_02089683.1| ## NR: gi|160942375|ref|ZP_02089683.1| hypothetical protein CLOBOL_07260 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07260 [Clostridium bolteae ATCC BAA-613] # 1 51 1 51 51 77 100.0 4e-13 MVLSIKFSSEDQTALDYRGSQKSFGTVFYYAYTKLEDFKRFFQEMELGKKA >gi|157101599|gb|DS480725.1| GENE 46 44095 - 44532 224 145 aa, chain + ## HITS:1 COG:no KEGG:PCC7424_0111 NR:ns ## KEGG: PCC7424_0111 # Name: not_defined # Def: hypothetical protein # Organism: Cyanothece_PCC7424 # Pathway: not_defined # 3 127 16 139 173 111 47.0 8e-24 MRIYLDNCCYNRPFDDQTQERIHLESEAILTILKRGQLNVYEIVGSEMLELEMERMGDET KKRKVKELYQVTHMHVAYTDEIKKRSEDIMKISKIRTFDSLHIAAAESANADVLLTTDDK MSNMASKLELKVKVQNPLQFAWEVF >gi|157101599|gb|DS480725.1| GENE 47 44533 - 44727 171 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942378|ref|ZP_02089686.1| ## NR: gi|160942378|ref|ZP_02089686.1| hypothetical protein CLOBOL_07263 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07263 [Clostridium bolteae ATCC BAA-613] # 1 64 1 64 64 120 100.0 4e-26 MLNVDLNNPADVRNAGMKVLREALGPVGMVKFIKQFDLGYGDYTEEKYERPDISLDEIDA ILKR >gi|157101599|gb|DS480725.1| GENE 48 45167 - 47089 752 640 aa, chain - ## HITS:1 COG:lin0919_1 KEGG:ns NR:ns ## COG: lin0919_1 COG3711 # Protein_GI_number: 16799990 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Listeria innocua # 1 506 1 493 505 148 25.0 3e-35 MDKRLELLEYLFAQKRNVSNRELQTALGISKRTVINYVKDLNSLAPNIIHSSNQGYFCPN PSAIQGTIAKLQSTQLFDTYSKRRDYLFKELLLKATHPSFEELAEQMYISTVTLNNELVR 
LRNELKEHSLYLKTKNNRLFIIGQDRDKRRFTMLLLNQELRQSHFKLENMQKFFQNVTIN DISDMVRQTLRNHSYFLDDFSFLNYVLHLAICIETTLSRREVPDTINSVMDNYSFTPHIL EIVKDIHKNLTDIYHTDISFTQVLDASVLMTTRIRSVNLSSLNFEQIHTILPAQVCGILF KIITAVHSIYGIDLKADNFMVRFAFHLKNLMDRAANSLVVPENYFITIKDDYPFLYVIAE YIASIIGQEIHAALPENEISYIALHLGVLMEEKNNAATKINCVMVLYDYYNMGVTIFEHI KQMAAPINLYGIVTSYEQIQNLSSVDLIITTLPEESDLGIPTIKIHLLPTRQDYESILLK ATELTQNIHHRELARQIQQLFKKELFFPHTNFLSREEVIQFLCERMEASNYVDKAFHQEI LYHEQVAPSAYRNVALPHPLTADESLTRVSSIAVAINDHPIPWDTNQVDFIFLLSLRGED RQLFKDVFTIISSFLQSDSNCRLLKSCDSFDTFVNLIISA >gi|157101599|gb|DS480725.1| GENE 49 47314 - 48591 660 425 aa, chain + ## HITS:1 COG:BS_licC KEGG:ns NR:ns ## COG: BS_licC COG1455 # Protein_GI_number: 16080909 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 5 424 1 437 452 201 29.0 2e-51 MGRVLEKVTDKISDVMTPVSAKLMSVQFIAAISETMQAVMPIILIGSFGCLVAYLDVGPW QSIVTGVPYLVDICGKLQLITLGLFSLYVVIILTYIYASKLNMKESVACIPIAAAVFMMV TPCDWNGVPVEWLGTSGLFSALMIGAMVPRLMQLMIRKKICIRMPNVVPKFVEDSFKILI PGLILLAFSGVLDTILEHSSLGCLHSIIYSVIQTPVSNIGLSLGGHIIICILAATIMWCG LHAGTIGNLVTPLLIAASAENLAAYTADQEIPNIITQEFFNLTLPGMAGCLLIPAVLMAF VCRSKQLKSVGRLAVIPAVFGIGEPVLFGAPVMLNPLLYLPMILSIIVNQVLTYTAIVTG LIGRSTGVVLPWTTPPVLYPLLANSTPVRAAVLNVVIILIDIIIWLPFVKAYDRNTVLKE MEAEK >gi|157101599|gb|DS480725.1| GENE 50 48705 - 49010 334 101 aa, chain + ## HITS:1 COG:no KEGG:Smon_0197 NR:ns ## KEGG: Smon_0197 # Name: not_defined # Def: phosphotransferase system PTS lactose/cellobiose-specific IIA subunit # Organism: S.moniliformis # Pathway: Phosphotransferase system (PTS) [PATH:smf02060] # 10 99 13 103 103 67 41.0 2e-10 MDWDIIVCTIVANAGESRSASLAAIEAAGDDDFEAAERLMQESEKAYLLAHEAHMEILKR AAGDDPVPMNFLIVHAATHLSNAEVTKDLARQMISILKKRR >gi|157101599|gb|DS480725.1| GENE 51 49015 - 49335 265 106 aa, chain + ## HITS:1 COG:lin2472 KEGG:ns NR:ns ## COG: lin2472 COG1440 # Protein_GI_number: 16801534 # Func_class: G Carbohydrate transport and metabolism # Function: 
Phosphotransferase system cellobiose-specific component IIB # Organism: Listeria innocua # 1 105 1 104 104 87 42.0 6e-18 MLNIVLLCQFGASTGMLAERIKTAAAKRGEEAIVNAYSFSRIRDVIDGADVILLGPQIRF QKAALEKEYADKGVPFTVIDTQAYGMLNGDRIYSDAKARIEEYNKG >gi|157101599|gb|DS480725.1| GENE 52 49348 - 50625 593 425 aa, chain + ## HITS:1 COG:BS_ywbA KEGG:ns NR:ns ## COG: BS_ywbA COG1455 # Protein_GI_number: 16080890 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system cellobiose-specific component IIC # Organism: Bacillus subtilis # 1 420 1 437 444 171 26.0 3e-42 MEKFANRIERILTPFGAKLQNVTFMQAISESLQATLPILIVGSFATLISSLDVGPWQGIV KSIPGLIPVCQKITSFTSGGFTLLMLIALAQLYCKKIEVKEYISCTALTVAVFLILSEVT DGNIASSALGMRGMILALFVGILVPKSVKALIDRNVRIRMPNSVPKFVEEGFTILLPACI ICIAASVINYFFTLLGYSSFQTFFYSTIQIPLQGIGLSFGGFVFISCLGALVMWPGLHAS TISAFIQPLALTASAANLEAWNLGKTLPNVIEYSFMQSAKFGGGANLLLPCLIALIFCKS KQLKSVAGLGIVPAVFCIGEPILFGFPMMLNPMMLIPMILSTITTYTIYYAGIVSGMVGQ FTGVVLPWTTPPILNVALSSSTPLRAIIMQLAAMAVGILIYYPFVKGYDNQLVEKEKQAE QMISE >gi|157101599|gb|DS480725.1| GENE 53 50648 - 51460 230 270 aa, chain + ## HITS:1 COG:no KEGG:BDI_1568 NR:ns ## KEGG: BDI_1568 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 2 268 26 290 424 288 49.0 2e-76 MEPGIRRAPNTQFFGKEAFSASNDTVIRWMGVGAAMVNCHGTCIMIDPMLKDAKMPMLID FPITVEQIPYLDGILVTHIDRDHYSLPTCLAVHDRCKEFHTTQYVASEMQGDGINGIGHD IGGRFTIGPMHILVTPADHCWQNQDKRYKNKRLYVPEDYCGFYIDTPDGTFWDVGDSKLM DEQLRLPEPDMILFDFSDNEWHITLDGAYKLAAAYPNSELVLIHWGSVNRPDITACNANP MDLLENVINPGRVRIVAPGEPVKLKRISER >gi|157101599|gb|DS480725.1| GENE 54 51477 - 52517 496 346 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942388|ref|ZP_02089696.1| ## NR: gi|160942388|ref|ZP_02089696.1| hypothetical protein CLOBOL_07273 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07273 [Clostridium bolteae ATCC BAA-613] # 1 346 1 346 346 737 100.0 0 MEGFYTGENRRQIHQQFVLLIQMIRDRDFTNVRQVLTEGCEIHFSTAGYHKGIENICRAL 
MWPGIPMDIRKAQILNFVARSRGENAVQTAYIQCIEAKDDGVNVFPFLFGGEYANRLVKE DGVWKFSEIRFDLCYAKGNTIFVKDRWTLMDESAYCGHEPMINPDIDNPWKVIPVDDEPQ GNEEKIFELLYKYAYAFDHNDYKFMTEFTTDNFYIAARFEEGVLWSDKPDTADLVGYRQV SDFLRSKFHKEAKMMHSCRMGRILFKEDMAVAYMPRGEDHRLKNRVMNRDNVHCICTTAI HTVYAKFDRYTNQWNMFKYNVNPIPTIFTQAEDDCLEFHEYFGGCL >gi|157101599|gb|DS480725.1| GENE 55 52508 - 53422 283 304 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942389|ref|ZP_02089697.1| ## NR: gi|160942389|ref|ZP_02089697.1| hypothetical protein CLOBOL_07274 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07274 [Clostridium bolteae ATCC BAA-613] # 1 304 1 304 304 632 100.0 1e-180 MLVKNYRNVMSDFLICYELGNWDNCKELVNDDIVLESTLAGTHRGIEEIKSALQLPKEHN ISNMVITNLIHREADGHDIVWCYMHHMTGVYQGRQLFPLIYGGKLYFDICNNRITNIKFE LGYEYGNTWAMNGIWKFYEETKNTHLISADALRLGGETRGLRDLVNRFFWSVDHMDRALF QKLCSRDVEIIKSGIDSSIYKKDGVFDVDRYLRFEKDFYEQNLYSIHIVDESDIDPFHKK ITCWHLSPGRPGTKHYGANTIYMQFYDEIIDVEFHRDKDGWKIGKVIFTAKSNQAEYSPK FLML >gi|157101599|gb|DS480725.1| GENE 56 53467 - 54507 827 346 aa, chain + ## HITS:1 COG:lin2113 KEGG:ns NR:ns ## COG: lin2113 COG0667 # Protein_GI_number: 16801179 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Listeria innocua # 1 324 1 327 331 327 50.0 2e-89 MKYTKLGCSDLTVSRICMGCMGFGDPGKGQHSWTLDEGRSREIIKHGLEQGVNFYDTAVG YQGGTSEQYVGRALRDFAGREEVVVATKFTPRTKEEAEKGISGQEHIERMLNRSLKNLGM DYVDLYIYHMWDYQTPLYEIMDGLDRMVKAGKARYIGISNCYAWQLAKANALAEKEGFAK FISVQGHYNLMFREEEREMIPYCREENIALTPYSALAGGRLSKRPGEISKRLEEDSYARL KYDGTARQDQVVIGRVVELADKYGVSMTEVALAWLLEKVTAPVVGMTKIHHMENAVKAVD FNLTSGETEYLEEAYVPHRLVGVMAQNTREASGKEHVWSVKEDYVL >gi|157101599|gb|DS480725.1| GENE 57 54599 - 55903 1055 434 aa, chain + ## HITS:1 COG:CAP0010 KEGG:ns NR:ns ## COG: CAP0010 COG2723 # Protein_GI_number: 15004715 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Clostridium acetobutylicum # 1 430 35 466 469 
555 60.0 1e-158 MSVQDIYETPEGITDFKAASDHYHHMREDVKLMADLGMKAYRFSIAWTRVIPDGDGDVNA EGIKFYNDLINELVSYDIVPVVTMYHFDLPLKLHKKGGWGNRKTIDAFERYAKILFENFG DRVKYWLTINEQNVMINHPNAMNPGRIPSKKELYQQCHHMFLAGAKATLLCHKMCEGGKI GPAPNITAVYPEKCNPADVIAADNWESIRCWMYLDMAVYGRYNSLAWSYLAEKGFTPVME DGDMEILKKAKPDFLGVNYYATATVSASKNDGMDCQPRNGDQQVMIGEEGVYRAGDNPYL EKTEYGWMVDSVGLRVTLRRIYDRYQLPLLITENGLGASDVLNEDGTVHDPYRIVYLERH FKQAQLAITDGVDLIGYCPWSFMDLVSTHQGYGKRYGFVYVNRDEADLKDMGRIKKDSFY WYHQVIITNGAILD >gi|157101599|gb|DS480725.1| GENE 58 56041 - 57366 752 441 aa, chain + ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 5 435 92 527 547 150 27.0 6e-36 MFSNKDLQKLIAPLIVEQILLMLVGIADTVMISYAGEAGISGVALVDMVNYLIITVLSAV ATGGAVVVSQYLGSEERENANRTASQLFSIAFLISVGITALCLLFCREILGALFGGVEAE VMRAAQDYLLITSCSFPFLGIYNSSSALFRSMNRTKIIMYVSFLMNVINVAGNAVGIFIF HAGVAGVAVPTLLSRVVAAIIMLILSMQSKYEICITWKNLVSWNQGIAVRVLKIAVPNGV ENGLFALGRVLVTSIVALFGTTQIAANGVAGSIDQIASIATNAMNLAIIPVVGQCIGAGE YDQATYYTKKFMKITYVLIGGMGGIVIVILPFILGFYELSEETCRISVLLITLHNVLAFL LHPTSFVLANSLRAAGDVKITMYIGIGSMVVFRLGTAVLFGIIFRMGVAGVWAAMGMDWF ARTIAFSIRYKSGKWRCVKVI >gi|157101599|gb|DS480725.1| GENE 59 57510 - 58466 864 318 aa, chain + ## HITS:1 COG:AF1122_1 KEGG:ns NR:ns ## COG: AF1122_1 COG1250 # Protein_GI_number: 11498722 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Archaeoglobus fulgidus # 10 290 2 276 395 185 38.0 1e-46 MRNQDVSQWKVAVLGAGSMGQCIAQFMAMNSHQVFLYNRTPSNLANALVQMENNLQTLVS LGQMKAGDIPDIMNRIAGSSDLKGSVEQADIVIENLAESEEVKKNIFTQLDEYCDKDVIL SSDTSTMDIFSFLEVSHPERLVITHFFNPAHVMPLVELVRGPETSDEVAGATKHFLEDTG KQVAVLNKCIPGFILNRITLSVFREAAHLVESGVATPEDIDKAVTSTFGPRYTFEGPFGL SDFAGIDIYERLATLLPPVLCSDTECPKLLHKMVQEGKLGVKSGEGFYRYDDEKAARRDR DSKIMKMLAAIQEVNKRG >gi|157101599|gb|DS480725.1| GENE 60 58463 - 60469 1451 668 aa, chain - ## HITS:1 COG:BH1879 KEGG:ns NR:ns ## COG: BH1879 COG3829 # 
Protein_GI_number: 15614442 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Bacillus halodurans # 301 548 240 490 555 256 48.0 1e-67 MPNENFDYVPYETLSIERKKLSRLYTFIWRYIGKTISISHGSRYAMGLFSRDGLLLDLFA HENYTLERFMNDGIRAGSNWVNIGYNAVRQGIIGRESLCTIGEENEFPLLKKYAVYYAPI SILSPYEPYDQMEECGLGIIASLDNAWDDYLTMIRGTAHDMMITLQFNNIATMYYERSGK GILSIDNMMSTSGRNLATYYNDELFKVLETPPINIYYEPAENLIDPLPANKELWDIVQNH RTIANQPLEVTCNGKTVEVIASTDAFNQPMINAHGVVFYFTTPQKMTAELSRQVANGAIK TFHDIIGENADLKKLIQKAQRMAQTNSNIMILGESGTGKDVFAQAIHNASARREKPFIAL NCGALPRDLVESELFGYETGAFTGARKNGNIGKFELANGGTIFLDEIGEMPLELQAKLLR VTETKQLMRLGSSKNIRVDVRIIAATNANIDDMIAQKLFRADLYYRLSTMKLYLMPLRER RDDIVPLAEHFIRSISRRIDKPNIMRFSENAKELLQQMDWFGNVRELQNLIECIVQLYPG DIILPEYILDNVSERYYPKNALKNISSVSALREAFPKEDTAAAVPWSTGAPSVPLPEAAF NIPSGTEAPLDPSNPRHPKSGKRKVITKEDIYEALAACSNNRTAASQYLEISRRTFYRKL EEFGIETD >gi|157101599|gb|DS480725.1| GENE 61 60683 - 61588 859 301 aa, chain + ## HITS:1 COG:no KEGG:Cmaq_1396 NR:ns ## KEGG: Cmaq_1396 # Name: not_defined # Def: hypothetical protein # Organism: C.maquilingensis # Pathway: not_defined # 13 291 2 296 297 171 31.0 3e-41 MAVRNKPLGRIDVNVIDDIWKGSIDMHIHPGPDPLAERPVDSIQAAQMAERSKMGGIVLK SFSYNTVSDAYLIEKNLTSGLRVFGSVVIGYTTTGGLSHASETIEAMAKIGCKVVWFPAM DSKWCRSYLGKEGGISILKKDGILKDEVLDILELVKQYDMVVCSGHMSYEESTKMFDAAI QKGITKMVATHPLAELSRFTLDQIQELAAKGVYIEHVYGTLMPRLGSMDPSDYVDCVKLV GADRCIMGTDLAQVWDMTPADGMRHFIAMMIQFGCTPAEVEAMSKKNPAKLLGMEEGQEE Q >gi|157101599|gb|DS480725.1| GENE 62 61588 - 62568 729 326 aa, chain + ## HITS:1 COG:BH2634 KEGG:ns NR:ns ## COG: BH2634 COG2084 # Protein_GI_number: 15615197 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Bacillus halodurans # 7 280 3 255 299 150 34.0 3e-36 MSDLKDKKFGFLGWGAIGFPICHGLVKSGYTVYLPVYRRKSAQRHGFSGLVPDEKSKTEA IDWMLGNGGIAAASQRELLENSDILVFSLPKSQQVEEVVLGKDGVMEVCKPGTIIIDMTS 
ADAVSTKKLAALLEKKGIELLDAPVSGGTSGAAAQTLTIMCGGKEDTFLAVKPVLDVMGA PDKVTYMGPNGSGDMIKCANNFLSACCAAATTEAVAVCAKAGIDPHLAVQVIGTSGGTNH AATMKFPNIVFPGKNWNFSLGLMSKDVGLFNSASKDMGIPSLFGNLTAQILAIPKAEEGD DADCIQVQKLYERWAKVELCGIDNEK >gi|157101599|gb|DS480725.1| GENE 63 62581 - 63690 544 369 aa, chain + ## HITS:1 COG:BMEII1019 KEGG:ns NR:ns ## COG: BMEII1019 COG1804 # Protein_GI_number: 17989364 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Brucella melitensis # 1 349 1 368 415 158 28.0 2e-38 MALKPLDGIRIVEMGTTEAEGIAALLLSDYGAEVIRLEFPKEEKEENSQDYRVCDRGKMR VRFRPDEPDDRKWLEKLLTRVDAVVTSVPDIRMSQWNLDVDTLCSRNPGMVYTSVTGYGQ TGPYGNGRLYDEATIQAESGFMSITGPEHGDPVRSGSDFATFAAGANACIGTLMALIDAQ RTGRGRRVDVSMMDSVLYGLENQFSVYLKSEHIPERLGNHYALSAPVGDFTCRDGKKIMI SVATGAQWENFADVMGHPEWLENPDYRNVGARLKNHHMLEKEVSTAFAEYTSGEWMKRLQ TQKCIYGRINDFKEVARHEQVQARHMFIEITAPDGTKMTVPSNPLVMDGEKNAGTVVNDT MMCHDCAVD >gi|157101599|gb|DS480725.1| GENE 64 63854 - 65080 994 408 aa, chain + ## HITS:1 COG:SMb21182 KEGG:ns NR:ns ## COG: SMb21182 COG1804 # Protein_GI_number: 16264596 # Func_class: C Energy production and conversion # Function: Predicted acyl-CoA transferases/carnitine dehydratase # Organism: Sinorhizobium meliloti # 1 380 1 387 394 233 33.0 7e-61 MSKPLEGFRILDFSQFMSGPMCTLLLSDFGAEVIKIENPPIGDNTRYGNIIDHEVSSHFA TRNRGKKSIVLNMKDEAHKKLFLELVKTADAVIENYKPGTMEKFGINYEVLKEINPGIVF TSISGYGQDGPYASHAAFDQTVQAESGMMSITGPEGGEPVKSGGSIADYSGGLMACIGTL MGILEAQKTGHGRRVDVSMMDSLIFCLENQFSSYLMTGKVPRPMGNSYASSAPIGAYKCK DGKSLMITVGTDAQWKTFCEALNQPQWYANEGWATMTQRAADYKAIDAEVNRVFSRYDSD EVADMLFKGKCVYGRINDFEAVANHPQVRHRRTIVNANYANGVTFRVPGNPILMSDLERE TDYTAAELGENTFEVLGEVADQETLHRLFDPVLEKAAEAQKAIYSKSK >gi|157101599|gb|DS480725.1| GENE 65 65093 - 65692 520 199 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160942400|ref|ZP_02089708.1| ## NR: gi|160942400|ref|ZP_02089708.1| hypothetical protein CLOBOL_07285 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07285 [Clostridium bolteae ATCC BAA-613] 
# 1 199 1 199 199 413 100.0 1e-114 MARLIESGKIEEVVFARFEPGEDILKALYDICREKDIRTGIILDGSGSAVDFTYQHFPIN PKLCPTNVSIGTMHGKCEISLQGTIGTTVCNLAEGDTPDTVVERMFPTIPGLWEGPVDKW PMVGCEYGPGTPYIHAHCVASNADYTVCGHLMPGTRVASGDPSKPSHFTVIIAKISGVEL QATIDHKIGFYQNLVRVDE >gi|157101599|gb|DS480725.1| GENE 66 65710 - 66528 615 272 aa, chain + ## HITS:1 COG:SA0118 KEGG:ns NR:ns ## COG: SA0118 COG3836 # Protein_GI_number: 15925826 # Func_class: G Carbohydrate transport and metabolism # Function: 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase # Organism: Staphylococcus aureus N315 # 13 251 10 238 259 96 28.0 4e-20 MRDLFEKNPFLEKVSAGKAPVGFFNYLRDTAVLDIAGTVGFDFVVIDNEHTGMERETTEK LILAAELNQVVPFVRVPELIPYLMRNYMEMGARGVLVPHIRSGAECRMAQEALRYPPYGN ASCCRSNHADGFEAKNWMKYLDHVQELVFVPMIEDPEAIGHLDEILDELKPGRDMVMFGK ADYGQACGSLNADGTFTSEVNEAYLKVMERCRARNIGFMACPSAGPAGQTAADVKKVIEE GCSAVVLNTDQLTLAAAFGNIVGPCLALDIKK >gi|157101599|gb|DS480725.1| GENE 67 66555 - 67343 293 262 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 8 248 4 240 242 117 32 2e-25 MDLNLAGKVVAITGGSEGIGYAMAEAFAKEGCRVAICSRSQEKLDKAKAEFQEKGFDLYV RSVDVSDSNRLYAFVEEVKNDLGRIDIWINNTATTVVKSILELELEEWNRIMNTNLNSYF VGTQAAGRVMKEQGGGVIVNVASFGGLMPPMYRSAYCTSKYGINGLTRMSAAEFAPFGIR VNAVAPGSINTAMQAAAGRTEEDVKRVAKNFALHRAGEPEEVANVALFLASDMSSYIDGV VLECSGGKFLAQDCDTAWQQRR >gi|157101599|gb|DS480725.1| GENE 68 67381 - 68700 1196 439 aa, chain + ## HITS:1 COG:BH3897 KEGG:ns NR:ns ## COG: BH3897 COG2610 # Protein_GI_number: 15616459 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Bacillus halodurans # 4 424 1 412 427 196 30.0 1e-49 MLGVIGLILAIIFLMVFAYKGLGALPLAMAGALIAVIFNHMPVYETFTGSFVPGYASAFT SYMLLFVSSAFYAKLMDISGCATAIGNKFTEWFGIGHVVLVGIVIVGILTYGGISLFVVI YAAAPILYTMLKKANLPRHLTVICFSAGSSTFSMTCLPGSPQLSNLVATEYLGTSLNAAP IMGITCGIAMFVMCALYGEWASKKARERKEEWSYPANVDPKNYEVDASQLPAAWKGFLTI 
VSLILIIIVGSHKGINATLIAVAAMVVGSIMVIVLNADRFFGKVKWLKLFTEGMTSGVTS IGGIAAILGFGAVVKATPAYQNIIDWALGLNMNPLLLAVVVTCIVCAIAGGSSSGQRIMY ETMAPTFLASGANLPVLHRLVAIASGSLDTLPHSSGLFLVYDLLGLTHKNAYRHSFATSV VIPLFVTIVATTICVATGI >gi|157101599|gb|DS480725.1| GENE 69 68737 - 69570 714 277 aa, chain + ## HITS:1 COG:MA0107 KEGG:ns NR:ns ## COG: MA0107 COG1028 # Protein_GI_number: 20089006 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Methanosarcina acetivorans str.C2A # 25 274 7 251 255 137 34.0 3e-32 MSEHRIDETCRYEEDFTIRNFNKMMDGKVVIVTGASYGMGKKMAQLMVEQGAKVFVTARG KEKLDAAIADIKEKTGGDITGITADSSRPEECKKVFEKALEAYGDIDVLINNAGKGELIT IDAVSDEVIDQVVDTNMKGPLYLCREAVRYFLKKDSGNIINISSVNGIRPMCGAAYCATK GGLNTLTKNVAIRCTGTGVRCNAVAPGHTPSPTAYAHDDGGPQTLAEGVMNEVRNTRTVR SVHPWEIDQAYAVLFLASDMSRCINGEVLSVDCGCYL >gi|157101599|gb|DS480725.1| GENE 70 69916 - 70413 284 165 aa, chain + ## HITS:1 COG:no KEGG:BC1003_2069 NR:ns ## KEGG: BC1003_2069 # Name: not_defined # Def: carboxymuconolactone decarboxylase # Organism: Burkholderia_CCGE1003 # Pathway: not_defined # 6 117 11 120 261 84 35.0 2e-15 MEQKWSEYQEAAKQKFIEARGAWKWSAMWDTILELDPGIVASYASYSSVPDKRNYLDIRM RKFICIAIDSVTGCLNAGGLKNHMKHALQLGIDVKEILEVLEITCMAGASTHRMSVPIPC GMNLHHKGNSALAARGFGMGRQKALKKGTILPNMAGYWSDEKEIF >gi|157101599|gb|DS480725.1| GENE 71 70605 - 70721 88 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDVLHIVSTLGIHAVTMGVPILKEAMKEVEAEQNKSEK >gi|157101599|gb|DS480725.1| GENE 72 70871 - 72037 341 388 aa, chain - ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 388 1 388 391 350 45.0 3e-96 MIYVGIDIAKLNHFAAAISSDGEILIEPFKFSNNYDGFYLLLSHLAPLDQNSIIIGLEST 
AHYGDNLVRFLISKGFKVCVLNPIQTSFMRKNNVRKTKTDKVDTFVIAKTLMMQDSLRFM ALEDLDYIELKELGRFRQKLVKQRTRLKIQLTSYVDQAFPELQYFFKSGLHQNSVYAVLK EAPTPNAIASMHLTHLAHTLEVASHGHFGKDKARELRVLAQKSVGVNDSSLSIQITHTIE QIELLDSQLFSTELEMANLVTCLHSVIMTIPGIGVVNGGMILGEIGDIHRFSNPKKLLAF AGLDPTVYQSGNFQAHRTRMSKRGSKVLRYALMNAAHNVVKNNATFKAYYDAKRAEGRTH YNALGHCAGKLVRVIWKMLTDEVAFNLE >gi|157101599|gb|DS480725.1| GENE 73 72173 - 72310 79 45 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|160936491|ref|ZP_02083859.1| ## NR: gi|160936491|ref|ZP_02083859.1| hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_01382 [Clostridium bolteae ATCC BAA-613] # 1 45 1 45 454 89 97.0 6e-17 MAIHSCRVVEATKKHQKSQYLNVLNHETEKERKYVVFASNNIIQI >gi|157101599|gb|DS480725.1| GENE 74 72313 - 73248 628 311 aa, chain + ## HITS:1 COG:YKR009c_2 KEGG:ns NR:ns ## COG: YKR009c_2 COG2030 # Protein_GI_number: 6322861 # Func_class: I Lipid transport and metabolism # Function: Acyl dehydratase # Organism: Saccharomyces cerevisiae # 30 305 2 266 278 103 27.0 6e-22 MANIEKVDWHTRKTPDFDSIIGKGMGPRYFEYTWRDVILYALGVGCTKDDLAYVWEGCPD GFKALPSFVLTAYLNAITMQPQRRVPYAPNEIPGDLIIEALDGYIPNRLHMGVDFIMHRP IDPLCGKILTEDSVEEVFDWGEKGVVNQCKMEMYDIAGNPVCTLRSQHYIAAFGNNGRPK YVSNKMNYPGREPDFECDDHIADNLAALYRLTGDTYTTHIDPEVGKGYGYKGAFMPGLCS AGFAARLAIQAVIPYQPERVTHVATQLRSVTFPDTYVKFQAWKIEEGKLIYRMLNKETGK AIVDNCLLEYK >gi|157101599|gb|DS480725.1| GENE 75 73279 - 75189 1353 636 aa, chain + ## HITS:1 COG:lin2337_1 KEGG:ns NR:ns ## COG: lin2337_1 COG1902 # Protein_GI_number: 16801400 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Listeria innocua # 1 342 1 348 361 259 40.0 2e-68 MKYKKLLSPGKIGSLELKNRAIMAPMSAALGNPDGTVSDPLIAYLTARANGGVGMIITEY CFVTEDGRSSDHQISITDDDKIPGLKKLVDAVHAHGAKICLQLQHGGRRSMVRVMAPSAI MKQTDRVTPYEMTTQDVHDLIHAFIAAAVRAKKAGFDMVEVHCSHGYLLNDFVSPSANRR TDEFGGGITGRAKVPVMIIEGIKKVCGEDYPVSVRLNGDDMVSDGNLGRDSAALAMLMEE AGADLLDVSGGMNGVGYGIAPAAVKTGYNTDPAEEIRRVVSIPVAVAGRINEPEYAESLL 
RKGDVEFITLGRALFADPEFVNKAAQGKEEEICPCVGCLQRCYGSYGHGGQFRGCMVNPF SMRETVLKIKPAETKKKVVVVGAGIAGMEAAWTAAARGHAVELFEQGTYPGGQFRIAAIP PHKQMLARACVYYSNMCKKYGVHMHYNTKADRELIESCNPDVVVVATGGNPLVPGIPGLR ESGYIANREILLGRIAPGNKSLILGGGLQGAETADFMAEHGYEVTVVEMRNGIAVDDHPA TQKLLLERLASNHVGLITSATVKTVYPDGVDYEKDGEIIGLRGFDSIILAFGTRPEHTLA EELEGMDAEVVTVGDAAKAGNAVEAIYRGAVLGTTI >gi|157101599|gb|DS480725.1| GENE 76 75327 - 75974 108 215 aa, chain - ## HITS:1 COG:FN1357 KEGG:ns NR:ns ## COG: FN1357 COG3547 # Protein_GI_number: 19704692 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 6 215 179 388 391 184 42.0 2e-46 NTLCRLKEAPTPNAIASMHLTHLAHTLEVASHGHFGKDKARELRVLAQKSVGVNDSSLSI QITHPIEQIELLDSQLFSTELEMANLVTCLHSVIMTIPGIGVVNGGMILGEIGDIHRFSN PKKLLAFAGLDPTVYQSGNFQAHRTRMSKRGSKVLRYALMNAAHNVVKNNATFKAYYDAK RAEGRTHYNALGHCAGKLVRVIWKMLTDEVAFNLE Prediction of potential genes in microbial genomes Time: Thu Jun 30 20:14:09 2011 Seq name: gi|157101598|gb|DS480726.1| Clostridium bolteae ATCC BAA-613 Scfld_02_67 genomic scaffold, whole genome shotgun sequence Length of sequence - 36739 bp Number of predicted genes - 27, with homology - 26 Number of transcription units - 6, operones - 5 average op.length - 5.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 283 71 ## COG3344 Retron-type reverse transcriptase 2 1 Op 2 . - CDS 336 - 461 57 ## - Term 715 - 763 7.0 3 2 Tu 1 . - CDS 799 - 1125 310 ## gi|160942440|ref|ZP_02089746.1| hypothetical protein CLOBOL_07323 - Prom 1186 - 1245 10.6 - Term 1237 - 1270 -0.1 4 3 Op 1 . - CDS 1308 - 2141 449 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 5 3 Op 2 . - CDS 2200 - 4725 1950 ## SH0040 hypothetical protein 6 3 Op 3 . - CDS 4747 - 6594 240 ## PROTEIN SUPPORTED gi|90020820|ref|YP_526647.1| ribosomal protein L19 7 3 Op 4 . - CDS 6617 - 7606 792 ## Tint_1853 NHL repeat containing protein 8 3 Op 5 . 
- CDS 7607 - 8476 441 ## gi|160942445|ref|ZP_02089751.1| hypothetical protein CLOBOL_07328
9 3 Op 6 . - CDS 8506 - 8778 369 ## gi|160942446|ref|ZP_02089752.1| hypothetical protein CLOBOL_07329
10 3 Op 7 5/0.000 - CDS 8816 - 9223 512 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase)
11 3 Op 8 . - CDS 9224 - 10432 691 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit
12 3 Op 9 . - CDS 10456 - 11241 646 ## COG5263 FOG: Glucan-binding domain (YG repeat)
13 3 Op 10 . - CDS 11281 - 14247 2772 ## gi|160942450|ref|ZP_02089756.1| hypothetical protein CLOBOL_07333
14 3 Op 11 . - CDS 14250 - 16679 1414 ## gi|160942451|ref|ZP_02089757.1| hypothetical protein CLOBOL_07334
15 4 Op 1 . - CDS 16780 - 17100 395 ## gi|160942453|ref|ZP_02089759.1| hypothetical protein CLOBOL_07336
16 4 Op 2 . - CDS 17151 - 18269 623 ## gi|160942454|ref|ZP_02089760.1| hypothetical protein CLOBOL_07337
17 4 Op 3 . - CDS 18274 - 18567 124 ## gi|160942455|ref|ZP_02089761.1| hypothetical protein CLOBOL_07338
18 4 Op 4 . - CDS 18579 - 19295 756 ## gi|160942456|ref|ZP_02089762.1| hypothetical protein CLOBOL_07339
19 4 Op 5 . - CDS 19336 - 22047 2938 ## COG3451 Type IV secretory pathway, VirB4 components
20 4 Op 6 . - CDS 22044 - 24113 1619 ## MBOVPG45_0484 membrane protein
21 4 Op 7 . - CDS 24129 - 25979 1638 ## gi|160942459|ref|ZP_02089765.1| hypothetical protein CLOBOL_07342
22 4 Op 8 . - CDS 25992 - 27704 1225 ## gi|160942460|ref|ZP_02089766.1| hypothetical protein CLOBOL_07343
23 4 Op 9 . - CDS 27701 - 29788 2040 ## COG3505 Type IV secretory pathway, VirD4 components
- Prom 29889 - 29948 5.6
- Term 29917 - 29958 9.1
24 5 Op 1 9/0.000 - CDS 29959 - 31836 1410 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain
25 5 Op 2 9/0.000 - CDS 31853 - 32572 763 ## COG3279 Response regulator of the LytR/AlgR family
- Prom 32703 - 32762 3.6
- Term 32652 - 32704 9.2
26 6 Op 1 .
- CDS 32773 - 34692 1186 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 27 6 Op 2 . - CDS 34689 - 36737 1018 ## gi|160942466|ref|ZP_02089772.1| hypothetical protein CLOBOL_07349 Predicted protein(s) >gi|157101598|gb|DS480726.1| GENE 1 1 - 283 71 94 aa, chain - ## HITS:1 COG:CAC3514 KEGG:ns NR:ns ## COG: CAC3514 COG3344 # Protein_GI_number: 15896751 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Clostridium acetobutylicum # 3 94 13 107 470 58 40.0 2e-09 MGTENKDSCSQRDSAEREGYVRAHRSFRRIWKERDSAQPELLEEILEKGNLNRAFKRVKA NKGAPGIDGITVEEIGAYLRENQKELIERIRRGK >gi|157101598|gb|DS480726.1| GENE 2 336 - 461 57 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MERKGRVVPGEVCAECPERDNHYRKIVLNVQKSAEAVVVGR >gi|157101598|gb|DS480726.1| GENE 3 799 - 1125 310 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942440|ref|ZP_02089746.1| ## NR: gi|160942440|ref|ZP_02089746.1| hypothetical protein CLOBOL_07323 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07323 [Clostridium bolteae ATCC BAA-613] # 1 108 14 121 121 202 100.0 5e-51 MRIEALKYQTDKKEDIIIFVDYNEVYSEGYHVQWSIADIAYRRPPSRNYIFLSDTYRDDS EYYILSPEEKTAYALKRQKEFAGEVKLKEALVSAWNIIRPDTDSILGM >gi|157101598|gb|DS480726.1| GENE 4 1308 - 2141 449 277 aa, chain - ## HITS:1 COG:sll1036 KEGG:ns NR:ns ## COG: sll1036 COG1235 # Protein_GI_number: 16329358 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Synechocystis # 6 277 4 276 292 150 32.0 4e-36 MNNNIKVTCWGSRGSGPAPYTEQMEYGGNTSCFTIETGQMLLVLDGGTGLAALGREMMKN PDDKRDIHIFISHLHLDHIIGIPLFKPLFQTGRRICFYGESRNGRSLKQQLNLFIGPPYW PVSLSQCSASIEYYEVEAGKILSLPLPQSVQILPIRSNHPDQTVLYRIQLEETMVFYGLD CELNQEMTDVLSNYAENAGLLICDVQYSPEDYIFHKGWGHSTWEAALKLAEASSAKRIWL THFDWEADSSTLFCLERQATLCNPNCEYVREGMTVWL >gi|157101598|gb|DS480726.1| GENE 5 2200 - 4725 1950 841 aa, chain - ## HITS:1 COG:no KEGG:SH0040 NR:ns ## KEGG: SH0040 # Name: 
not_defined # Def: hypothetical protein # Organism: S.haemolyticus # Pathway: not_defined # 558 814 97 358 1563 103 28.0 3e-20 MRERGRNLIRCGAAFAVSCCLLAGNVFPAAADTSVSGATMRLSSREGTVNVSGKGGKAVA AFDNMRLQSGYYLETEAASYAGLALDDVKAVKMDELSKGEIRKKGKQLELLLSEGSLLCD VTAPLKGEESLSIRTSTMITGIRGTVVYAEVVDSNTSRVCVLEGQVLVRPVNPVEGDRNG QYIEKGRQALIIGPDRMERKGEIQTLPLTETDIAPYVLWEIHENEELARRLREAGWDVDW MIENAERLLREEEDKNSQWAENGQGTSGSKPGGKDIHTPLFNGGDSGSDSGTGTQPGRPT VELTMPVSPETIEEKLRTSDVVLHSAADSNPLIIDNSLTVPAGSRLELAGGMGLVVGTSA ALQVDGTLIVDGNLENRGTVTNTSMNTLDIKGDYVGSGTLNNKGRVAANGSFMQEGGSFQ TIGQGRLEVERKAVIDNAAFDFGTGTTLFHGDLTVRNADGEFGDNLCVEGMLSLEDSSIL VTGGFYGGGITAVKKQDGENITYEELNLEGGKVLAERGVKGVIFAENYKLNVQSGILERS DDGIPFITGTGILEMDGEEYKIEELTESMFESEETETGWVLTGIVQEQKKQAEVHTSIPL QADAEKLPDVNEEEKELQEEENLPEEGEPEEGGNQPEEGSPEAGGNRPEEGEPEEGGNQP EEGEPEEGGNQPEEGEPDEGGNRPEEGEPEEGGSQPEEGIPEEGGNRPEEGEPGEGGNRP EEGSPEEGANRPEEGRLEEGGNRPEEGSPEESGNRPEEGSPEEGANRPEEGSPEEGTNQP EEGSPEEGANQPEEGSPEESRNQPEEGSSEEGANRPVEGIPEKGKNQPATGKPEEYGNLP G >gi|157101598|gb|DS480726.1| GENE 6 4747 - 6594 240 615 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020820|ref|YP_526647.1| ribosomal protein L19 [Saccharophagus degradans 2-40] # 386 595 195 405 460 97 30 1e-19 MDKKLIKAYLPWGLCAGIFTIAAASGLFYEMDNRLSDALYQERRGTDGRIAIVGIDQRSL EALGPFGTWDRRILAQTVSYLNQGDIRPAVIGLDVLLAGETGTEGDLLLAEAAGTGENVV TAAAGTFGSSLVTAADGTFYLDDFTVKSFDEPYKALRDVTAQGHINAMYDKDGILRHQLL ELGLPDGRVFPSFALTIARKYIQQEGGREAGLPPVDSRGFWYLNYSGLPEDYSEFISIVD LLEENIPPEYFADRIVLIGPYAAGLGDQYITSADRGDPMYGVEIQANAIHALLEGDYKEE PGAGIQLAVLFLVLAAAGAWFKSSRLAGGGAVWLAGSFAYVIIAKAAYEKGMVLRILWIP LGLTLFFGASVMIHYGKAVLEKQRVTATFKRYAAPEIVDEILRQGTQSLELGGKLTQLAV LFVDIRGFTSMSEALEPGQVVEVLNSYLTLVSSSIKNHGGTLDKFIGDAAMAFWGAPLAQ EDYVMKAVLAALDMAKEADRLGDQLEARFGRRLTFGTGIHTGPAVVGNVGAPDRMDYTAI GDTVNTASRLESNAPGGKIYISAAVANALEGRINAKAIGSIRLKGKAGEFQVLELEGLTE GSSGRMPVSEGKKAD >gi|157101598|gb|DS480726.1| GENE 7 6617 - 7606 792 329 aa, chain - ## HITS:1 COG:no KEGG:Tint_1853 NR:ns ## KEGG: Tint_1853 # Name: not_defined # Def: NHL repeat containing 
protein # Organism: T.intermedia # Pathway: not_defined # 6 304 3 330 366 98 33.0 5e-19 MKTIWKRHGMTGCGTKALGFVLSLVCSGCILGVSLSGAAPFFDNTGDGFSPRVMVQGREG CFLAADGFNKVIWQVTEEGRSIYAGQKGAAGKYGEPRGGYRDGSLGEMLIGEPWDMIPYL DGYALSDRSNHVVRYISPEGAMTAAGTGKAGYEDNLASKALFAAPTGLAAGPDGILYIAD TDNDVVRSLSLSGKVDTYLRGLSAPTGICFYEGALYVADTGNNRIVKAMDGAVVWSAGTG EDGFADGPVSQAMFSGPQRITAAEDGALYVSDTGNSVVRKIWGDNVSTLPAADAADPDLT IASPEGMMVMGDHIYLCDSFLRKVFVLPR >gi|157101598|gb|DS480726.1| GENE 8 7607 - 8476 441 289 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942445|ref|ZP_02089751.1| ## NR: gi|160942445|ref|ZP_02089751.1| hypothetical protein CLOBOL_07328 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07328 [Clostridium bolteae ATCC BAA-613] # 1 289 12 300 300 533 99.0 1e-150 MTWQQRVWKWIGDFVYGIRPLLLYLLLPPAISAFGLFIAGLKEHIEFFLWRSGQFYRTLG LILVFYLMRRKCRKREVSIWDETTLYWRGVNAKKLLLLMGAGAGLNLFMSAALTVIPFPD FLIGPYHEMTYEVFGKVDKILAILSVMFLAPLVEELIFRGYVLNRFLRRFESQTFAVLLC TGIFAACHVTPLWVAYGLFMGLLLAHVSIVEDNIIYAVALHMGYNIMTLPVWFANESQQI SMVLFASPALIGVYGVIGISVAVLCLRQYPLDIRQLAEYFPRIKRKGER >gi|157101598|gb|DS480726.1| GENE 9 8506 - 8778 369 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942446|ref|ZP_02089752.1| ## NR: gi|160942446|ref|ZP_02089752.1| hypothetical protein CLOBOL_07329 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07329 [Clostridium bolteae ATCC BAA-613] # 1 90 12 101 101 137 100.0 2e-31 MTEVKTAVKISVAQAPELEKALKSVLETEEKEIVVNMEDTIYMSSVGLRVFASIQKKINA SGKTMVLKNVRPQMKEIFEVTGFVGVLSFE >gi|157101598|gb|DS480726.1| GENE 10 8816 - 9223 512 135 aa, chain - ## HITS:1 COG:CPn0670 KEGG:ns NR:ns ## COG: CPn0670 COG2172 # Protein_GI_number: 15618580 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Chlamydophila pneumoniae CWL029 # 3 131 7 138 144 91 39.0 3e-19 MLEKQVPAGMEYMDGVLEALREYLEEVNCPKKTALHLEIALEELFTNIASYAYGGEPGPV RVYCGLEGGELVLEVEDKGISFNPLDREAPDLQLPLEERPVGGLGIYMVRQFADSVEYQY RDGCNILRLKICIRP 
>gi|157101598|gb|DS480726.1| GENE 11 9224 - 10432 691 402 aa, chain - ## HITS:1 COG:FN1091 KEGG:ns NR:ns ## COG: FN1091 COG2208 # Protein_GI_number: 19704426 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Fusobacterium nucleatum # 144 394 202 446 447 130 32.0 6e-30 MDIDRLLALKEVRELFPADEKGKLPEGSCLLLSAMEELDFPPGRDIVTWGQAGEDGMYIL MEGSAQVLDCHGEEINTLMGPGSIVGEMALIRDEPRGATVRAITRVRAARLSRVQFEEAA GSNRRLYGELLNVVYKKTTSLVREQARLHSELDIAARIQTGLLRRDFGGLEEQLGLRAWA LMEPAKEVGGDFYDIFPAAPGKACIVIADVSGKGVPAALFMTMAKTHIKNYAMLDMPLEE LAFRVNNRLCEDNPEEMFVTAFIGMLDTEQGILTFVNAGHNHPYVSSEQGPFHQAACHSD FVLGLWEDMKYREQKMELGKQGTIFLYTDGVTEAENPEEEMFGEQRLKAVLDRCAPEEDP KGIISRISMAVKDFSVKKGAVQTDDITMLCVRALCRKEGNER >gi|157101598|gb|DS480726.1| GENE 12 10456 - 11241 646 261 aa, chain - ## HITS:1 COG:SP2136 KEGG:ns NR:ns ## COG: SP2136 COG5263 # Protein_GI_number: 15901950 # Func_class: R General function prediction only # Function: FOG: Glucan-binding domain (YG repeat) # Organism: Streptococcus pneumoniae TIGR4 # 34 89 565 619 621 68 50.0 2e-11 MKAGKLTAGFCLTVLLTAAAAVTSMAAPGTSGDWRQEQDGRWYYYDEGGTMCTGWIEVGG KHYYLYEDGHCAMNEITPDGFRVDADGAWYEAKRELLGQEFPLPAGFQNSISLGDGWGGA QAALNGAAGKISADSGGKRKLLIDSTAIEYKSAVKSAKDDARYMGLYKDSTTGGYRLDIF ILLDRNSETQSLGEALDYEVFKAMLTTVSSTPDQLEEAVYSSWQGDNSWNISRAAITSVG DCSVIYEAGQGYGRYYLRPSG >gi|157101598|gb|DS480726.1| GENE 13 11281 - 14247 2772 988 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942450|ref|ZP_02089756.1| ## NR: gi|160942450|ref|ZP_02089756.1| hypothetical protein CLOBOL_07333 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07333 [Clostridium bolteae ATCC BAA-613] # 1 988 1 988 988 1736 100.0 0 MAANNGFKIVQFSPELIQALTQMGELRTRINDYAVEEFKNYKGEEKLDHRDFVEMMNLSP EKSEKEQKEYVRNFVRDYTSGDPQLRKKCLDDIYDRDDSFNPASLDLSCLKGTDTGPREP GGLTGSEKDFMSLIGAFRREQALSVKLKENPGYAQERYGSDSRKRAEFEARRTAMMNGLI SYTDNMLRNNGFNERLHRGNVAPMAGDGEYVAADLAVNLYRNSMDRLQGGRTGENRLNFS 
LGPENGNIYGLSRQLAAGNLTNQMKTNAAAAFDTSLATLYRSGLPATTMNEMGVDALDLI HVDGLSVRERYGNKYANLSEWEQNKMIKAEVMGAIMDGQRVDAVTLGLGARGRMQAVTHT LEADLHAYDQMEKRSEHSGFRRLFDWGPGKIRTKADNVDRKNREDRQMEERTGLITNSIA SRLEARESKQYLMENLGMSPKPVTPQIQRQEEKLRGIEGRIGDIEKQNFSSYEEIGNISR ELERRKVYEGARERFNKAKSKLDTTKAVMDNGNLLEVDIREVMDPIIPDYGPEQKEKLGN SYDNDFLPGYYKLDEIRTARDLSQEKLEARRDRLNASLDQGQKKNEYRLEAVNRLLDAKT PEEREKVIGDTLEEFKKGAQAHLEQCEDTLNSHPCGEMSPEQIQEKEKRLQVLEEGLTPV PQELLQERAMEEARLNQLKEMNGEDRTPQIKKFNQAELLHPDYIPPATKEEALARIGELD AKYGPEIQYLDRREMVKNFRAQDLSTIDETKPYWAKEQAAMTLRYMDSIPIDTSGLSDKA KQTLAEGRTAMEAIAGAKAGTVSLAGSNAKGIGISIAYEEALKTDPACAKAAVKGVEYAE SFSAQWEMRQLLMDDLKELVKKEPVKNMKAEAEKSGPEKAGAEKAGAEKAEAEKAGAEKA EKEKSGPEKAEAEKAGPEKSGGAKGKTEVTFVELELSEKEAKAAGAAAKPQSSWERHDLS PFHKESSRSFQKESGIGSHRQDNKKPAK >gi|157101598|gb|DS480726.1| GENE 14 14250 - 16679 1414 809 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942451|ref|ZP_02089757.1| ## NR: gi|160942451|ref|ZP_02089757.1| hypothetical protein CLOBOL_07334 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07334 [Clostridium bolteae ATCC BAA-613] # 1 809 17 825 825 1589 100.0 0 MKRKLRNCLAILTGLSLFFTGCKNQSVNLYMETLVPESASPPQRPVETTGTPASLPDVEK EVGEQASVYLEQRFQELTGPRAFFFPENHSLSHTDIKVFQVLGLLEGRFYYYYVTKEINT GQEVHALASYEYETERYKSLYENKVDPGAAGTDTDSQYSFYGHLAADKLGRLRISLYDGG NLFILDGEGKVVFSRTSEDEGEDFLYLIDSIFGVGGTNPSYFSKDITNVTTNGRYCFYVP ITLYKGDYTREVEIGEEEYILQYAYSPIGAGTFPNTEYLLQEDLNWNQRVLEWKAIADSG NSERDPEEDWKDALKKYPVKWGPYYVSTGYDWAPAFLYRLMEWKNQKFRNFVEGKKGIST ADSKENEGYAELFAIPDEESIFTDVWELKTGQNLMKFFFLSPGEGYCPLVGKVGQRSDIL KTETRTYTITKTNSKGESEEIEIEDEIQAVIRRKALFAEGAAVEQFYVLDNKAGIGVTAG PVWRSGNYSMGINIGPILLCIDYFKDQTANEAVFFPAVTQWEMQMGEQYEGNLSFCILGG RAGDSYPMAHNLEKEIISIYAGDGKYITIKTDDLNASGYGQTSADQELLSEVEKRKEYRE EKGFNKNSTYKIDKENAYTNPDRYGRDNILVIKNGDSKNLLFTSFYNGMLYFETERYSMS TELSTGKGYHLSEWTLYQAWQTGEDEITCIGFQRTDAVYYYMDMAMARVYKLSLKEFKEQ AAGYQIPVERKAPPENEAREPSLDEGVTLPALPQDMERIKQDIKENSIPPTSSASREEME TLPDVTITETGAVGDRMWKDAQERMRGEE >gi|157101598|gb|DS480726.1| GENE 15 16780 - 17100 395 106 aa, chain - ## 
HITS:1 COG:no KEGG:no NR:gi|160942453|ref|ZP_02089759.1| ## NR: gi|160942453|ref|ZP_02089759.1| hypothetical protein CLOBOL_07336 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07336 [Clostridium bolteae ATCC BAA-613] # 1 106 1 106 106 184 100.0 2e-45 MNVPVMLLNSTGGNPFEQVSKPVVELLNMALTPALGIVGALGAIYCIFLGAKLAKAEEPQ DREKAKNSLKNAIIGFVLIFVLIVVLKIGMDAMSNWMNSAMSNVTP >gi|157101598|gb|DS480726.1| GENE 16 17151 - 18269 623 372 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942454|ref|ZP_02089760.1| ## NR: gi|160942454|ref|ZP_02089760.1| hypothetical protein CLOBOL_07337 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07337 [Clostridium bolteae ATCC BAA-613] # 1 372 1 372 372 703 100.0 0 MKRIRSPLLLAAMALLAAGLTLWGSADRSFAGQTDQGYGAGGEARELDKEAQESMFSYQV EGEPESPYGEGSQELERLLEALGMEVPTLLPEEAAGMEKVVDPPMNFLVSKTEGLLDYTL PDGSRFSASVPQGMTVTDPVTFKPVENAILTVLRNGSGIETSRDGIYSQPGKYHIRMVIL PTSSEGGSRLLEVNFRFTILPREVSLLNLLKAPEGFAIEKMELDGRGILPDNPRWHFLHG DGRYKVRFAQENGGLFYDISFKRDTQAPLFTISPVPDKKEMKDTVFLELTEDMTSLEMYY NGRMAPIPIKRLELAGLYQFRLWDRAGNERYYQIRIAERLRPPSPKTIIITILGVGAAAG WFFYQRRHPRFL >gi|157101598|gb|DS480726.1| GENE 17 18274 - 18567 124 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942455|ref|ZP_02089761.1| ## NR: gi|160942455|ref|ZP_02089761.1| hypothetical protein CLOBOL_07338 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07338 [Clostridium bolteae ATCC BAA-613] # 1 97 5 101 101 139 100.0 1e-31 MAYFDSPKNRVKWEIELEELRQQKEDFAAGRLHDTGEIKRGAMAEGQHRKPVTFEQLERE EASASRRREAGSRDMSPGRKREVSKPSPSPSIGPKGG >gi|157101598|gb|DS480726.1| GENE 18 18579 - 19295 756 238 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942456|ref|ZP_02089762.1| ## NR: gi|160942456|ref|ZP_02089762.1| hypothetical protein CLOBOL_07339 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07339 [Clostridium bolteae ATCC BAA-613] # 1 238 18 255 255 443 100.0 1e-123 MKGIVRKITVAAVLLWILPGLSGCGDKDRVDEIKAQGRLVAAVPEEGERSAWDQAYAREL EQQVLEEMASDIGVTLRIERVKEQDLVPAVKQGRAHVAVGVPVLPGRQDEGYSLAYGRRA 
SYLAVADGTVLDSWGALADGIVGYSVNTGPAVLRQLNTLSGIELKEVETGDAVSGLAKGD ITAYVCGETEAREFMAAEGIQIQELPGLWPETYTFYTGELQYRLLGLANQGITDKLTE >gi|157101598|gb|DS480726.1| GENE 19 19336 - 22047 2938 903 aa, chain - ## HITS:1 COG:MYPU_3830 KEGG:ns NR:ns ## COG: MYPU_3830 COG3451 # Protein_GI_number: 15828854 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Mycoplasma pulmonis # 444 876 423 827 853 176 28.0 2e-43 MRIIPKKTRVSMEFFKGIELLDIVIAAAGVTLTLFMLLSNLPLRWMAALIVFIVFSASII PLDDEKAYKSLYYAIRYAMSYKEFVKHPEKKGQIPVAGVTPFTGISDMFIEYGTSYLGVV VEIPSIEFRFLTEPRQNQLIDQVYGSILRTVNDTDSAAMVKLDRPVLYDSFIEGEEKKME DLKAAYIRGLMTDEELTVRIGIIQDRMSQLELFNNKETVYLPFHYMVFFGRDRGRLTEQA QNMVDTLGPHGIECRILKEQELAIFLKYNYSGVFDEREAWKLTPDQYMDWILPDKLAVTS RTVAYDGLVTHNLRVTDYPIVVPNAWGHALFNRPDVRVTLKMRPIDRYKGIKQIDRAIDE LREQGASTGKTSRLMELGSHIDTLAEVLSLLQGDNEILMDVNIFITAYDYEASPELLGPG YRPPGQGIGMKRQIRRELSEWGFKSSDMFMRQFDAYASGHISAFDAFSKDGRGIHSGSVA AAFPYVYKVMMEKKGICLGKSAGRPVFLDFFARNKERVNSNMVVIGKSGSGKSYATKSIL ANLAAENSKIFILDPENEYLGLARSLKGKIIDVGSATEGRLNPFHIITGLSDEEDELDGD EEENQIPGAKVSFNMHMQFLEEFYRQILPGIEADALEYLNNITIRMYEAKGIDAETDLSG LTPGDYPTFDDLYEKILNDFQMSTGDYSKKNLTVLLNYISKFATGGRNAGLWNGEASIST QENFIVFNFQSLLANKNNTVANAQMLLVLKWLDNEIIKNRDYNLRYGASRKIIIVIDEAH VFIDSKYPVALDFMYQMAKRIRKYNGMQIIITQNIKDFVGTEELARKSTAVINACQYSFI FPLSPNDMHDLCRLYEKAGAINESEQDEIINNGRGRAFVVTSPSERSSIDIETPKDIERL FGI >gi|157101598|gb|DS480726.1| GENE 20 22044 - 24113 1619 689 aa, chain - ## HITS:1 COG:no KEGG:MBOVPG45_0484 NR:ns ## KEGG: MBOVPG45_0484 # Name: not_defined # Def: membrane protein # Organism: M.bovis_PG45 # Pathway: not_defined # 161 463 94 351 745 68 23.0 1e-09 MEMSYLGIYEWVYEKVLSSIVTPVFDFVVSIINIAWGYVFQYILGPILLPVLQVVFYWVI RVVEVLGAFMNYLQFIGYLMVVDCLEKAFDIFIGLEPVTYYPNGREGAAVTGSLLEVLFT STAISEAFKVITLGGVGIALLLTIFGVAKSSFDLDFENKRPVSHVLKQLFKSVLGAFLIP FAVWFMIQLSMIILTTLNTAVSFDDNGGRSTLGSTIFTIASVNAAWDSENNMSKWGTVEL DEGFGFKVLDKANPDFDITKEPRKNFYLNDTYERYTNIPNVVLNFDLSRFDYLMGGWTAI 
FLCVILFLCFLTVIRRIFEILLLYLVSPYFVAMMPLDDGEKFKQWFGMFMGKLFTGYGTV VLMKIYLMLMPLILRGGITAEGMGGESRIVMAALYVMGGAWTVYKAGSLLTGLVNEAAGR LEQESNRDAYGAAMLAAAPFRAIKNRLEQKVSGKLEEKITGAGKDVGTRLQGGDPKGSGA RSEVRTSRLRSKPPKQPKFKKAVIYDELKNSSGFTIGGHRPARGRSRQLEPVSKAALKNV KKSTLKHINFENDKRPFKYGYDQFRKDNAFIRNKDGTVDFMYPAKRKSAVLAQRNEIAKV REEALSYSDPKERSRYVTNWLQHQSDDKNSLMYFNPADRMRLERTFGCKIDKTPISSGIN KGQIAFYPEKPKLPNGTRLNPDKSKGDKK >gi|157101598|gb|DS480726.1| GENE 21 24129 - 25979 1638 616 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942459|ref|ZP_02089765.1| ## NR: gi|160942459|ref|ZP_02089765.1| hypothetical protein CLOBOL_07342 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07342 [Clostridium bolteae ATCC BAA-613] # 1 616 1 616 616 1022 100.0 0 MEVAKGFTDVLYTTMVLLFNEIFVPVMTEILSLFYEKAMYVLSHSVSILAYFFLITVCKI LDSLETMIGIFAGTENVIVDNKPMTFMEMVFNQDAVSKGFLMITLVAVGICFLFTIFAIV RSMSSSTLENKNPISQVLKDAMKACVSFMLVPFLCIALMQLSSALIKQTKAVIQAESGTA GDPSMGLYIFLTASLRAGRADFEIPVLQETADVAEGLGDKFTYEILTSLDSAFRVEYLPP SMSDDLRRSYLNGDKDYLDFRDSGSDFSAAKIDYLLGFSASIFMIIMLAGVLIQFIRRLL ELVLLYLVAPFFVATIPLDRGDMFKRWREMFIAKFLAGFGVIYALKIFMLLLPLIFSDNL SLGSSFVMDRLYAGGVYDQTKDKFFDNLREQTGGIEIPGSGNPAYQEEVNRMEEAVDFAW EKTGMSYMENHSAMDSVLVNMGLVDRNSPSLEESMIDSILKTIFFLGGMFAVYKSCAMFL EILNPKAAEDAKTSTMLAMGYTVGAARKAASTGVQLGVAAGVGLGTGLATGGAGAAATAG AAGAKAAATAGAKAAATAGAKAVGTAAKTAGSTAKNAVKSAAKSTAKNAVGAAKQGVQGA ASGAAGAAKDSGDRDG >gi|157101598|gb|DS480726.1| GENE 22 25992 - 27704 1225 570 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942460|ref|ZP_02089766.1| ## NR: gi|160942460|ref|ZP_02089766.1| hypothetical protein CLOBOL_07343 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07343 [Clostridium bolteae ATCC BAA-613] # 1 570 1 570 570 991 100.0 0 MNRGIKKGLLLFAGCCMAAGLPDRALAASYNDDTADSYLSRIYTEAEDWISEGGAEDVFN GPGAAQTALEEDSALAGPRIHEVSMGEKYHSDYDLYEHSIDGRYFIYSNVSNGGITDRPV YVEIPVDVFCTAEKDGVPMNYTSGQLLSDRGTYVMRLTAVYDKNVPLSQQEEYRTVFRFR IDEKRQDASGAVSGAAGALARQIDGLTGSDTAGLLNQITSESGIAGDDIAAALAGQLFSE 
SGDKGEAGEGSNDGTDNEAGNTDQGLHASSEGVLGQGEDSGQEGAGQEKDSGQDGAGQEG DSGQNGSDQGRGSGREGREVSGRSQVYLEDEKMYLVTLENGFTFRSSVPEGMLINQSVAI YPGEGQYTLYRGEEETALNERQEAAEYGIYRLVSGEYEFTFEISNTYVNRDTFSAPEGTR ISAASFNGNSLDTGEGTLLTMEADGSYLVTLKGDYGETLEVELNRDTEPPAVEVEIEGQT ARIIYSSTDLASVTLAKNGGEPREFNGFTITETGSYVLTVTDRAGNQTLKEFSLRYKVNM YGVIAIAAIVISITAGAVMLVRKKRNLTVR >gi|157101598|gb|DS480726.1| GENE 23 27701 - 29788 2040 695 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 335 667 255 562 591 105 28.0 4e-22 MKAIKKAVIVRDFMDRRHMEHIRKREGPMIKKIAGYILGGIVVVMVLGYLTAMMLTELVP AILTRDWGRLWNPGLLLKPAVWIYGAVVTVMVYMLYLVFHVGTVKRAKRLMKAKEDSIES SLENSRFMTDKEKDDNFSPKRFTRLKEEKKDGIPVYAVYNKKKKELNINIAKPMHGLVIG ATGSGKTTSFINPMIQILGKSSAGSSMICTDPKGELFQLHSGMLAEQGYKVMVLDLRDPY SSFRWNPLGDIYDRYQLYLKAGQGIFQCDTDVRQSGLELVNEADQYKDVWYEYKGKAYAV RRELLTVIKVEKQKIYDEIYEDLNDLISVICPIESQDDPVWEKGARSIIMAVCLAMLEDS EYPELEMTREKFCFYNVNKAISNSEDEFRALKEYFMGRPKLSKSVGLSRQVLSAADTTLA SYMSIAFDKLSMFNDEGLCSLTSETDIDPQQFAAEPTALFLKIPDEKDTRHALAAVFVLC IYKALIKVASAREDLSLPRNVYFILDEFGNMPKIDKFDKMITVGRSRKIWFNMVVQSYAQ LDNVYGQTISNIIKSNCGLKMFIGSNDIETCEEFSKLCGNKTVSTQSVSSTLGDKTGNIN VSSQVQTRPLIYPSELQKLNNKNSTGNAIIVTFGNYPLKTQFTPSYKCPLYEMKTMDLSQ VRMNAFFGDEVYYDLDERNYIIASRQEKEMEVSAE >gi|157101598|gb|DS480726.1| GENE 24 29959 - 31836 1410 625 aa, chain - ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 379 611 202 441 452 93 28.0 1e-18 MVSLRPIIVITASAILSVLFILILYYADNKYTVGGKQPELGRLELTLGDMEENPYRYLVT GWEFYPDLLLGPDAFDGAEKPAGVTAAVGKYGQMRRNQRGTKGTYRLVLELPEKGGPFAL ELPEIEGASRVYIGGELVREWGSIEDVRSGMKRFVFPLTQGRDTELIIQAAALGAFHSQN 
FSPPLLGVYDAVTGIRDVGLLLKVMVLGLAVSAAGLSLHLAVKIRWWRGFLFCLFCLCFV GYQLWPLARERIQLGLQPWYGIQVFCFYAMLWLMVILENDLYRIKGGKVSAAMGIFCILA LVYGCYAQYGTAAAADLFSYITQWYKFAIACHLILVAYMAMAEGMERSQTLLVIAVVLVS ALFMEELLPLYEPIIGGSFLTIGCVVIMVGMLSILWQDMADAFRARTFFVTETGRMNRQL ALQRDHYRQLNARIEETRRIRHDMRHHIRLLQAYAAEGNLEHVKAYLGQLSPAMDSMGPV TFTSNYALDAVLCHYTALAEKEKIETDIRVMVPGRTVLPDDELCVVMGNLLENAVEACKR HESGRKFIFLRCLQDDSRLSIVLDNSFMGDVQYERGYFRSSKRESVGIGVESVKAIVKRH GGIGTFVPEEKVFKVSIILPLKDEG >gi|157101598|gb|DS480726.1| GENE 25 31853 - 32572 763 239 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 3 232 4 229 234 78 25.0 1e-14 MIIAVVDDTMQDRELLKGMLDQYFSGHQTVPEIQEFESGEAFLEAYEPGHYDIIFFDIYM GGITGMETASLVYERDKDCRLVFFSSSADYAVDSYRVRAAYYLMKPLSYEDLALALDCFT RELLEAGRGLTVMVKGGVEAVVPFSKLLYVDSLKRMVRIHTSDTVLEAAESFMAVSGRLM PDSRFLCCNRGIYVNMDWIQEVGDSVIRLKNGETLPVRVRGRAQVKKEYMEYALKDLRR >gi|157101598|gb|DS480726.1| GENE 26 32773 - 34692 1186 639 aa, chain - ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 390 613 190 414 433 65 24.0 4e-10 MMKVREKQAFLLITAVILLSFAFFYYCYEKDNKYTRPRPYSRDGVIWLEEQWYSRHPMFY LTDGWSFYQDKLLSPDEIENHTPDAYFYIGRYGGFDLGDREASPHGKGTYRTVILTEGHE QEYGLELTPVYSRWRLWVNGKLKQSVGMDGNINGDMGPRPPVSMVTFTAKDRIEIVVEAE DDSHFYSGMVYPPAFGVPQLVSRTSELRLLIHGAAIMAAFLIGAMCLCLGAVRRFSRPYG AMAFLCFCFCGSTAWPLFQVMSESMYWVLLAERFCYYGMFFSMMWIQWQICGLPGRVYVP ACAAGFLICLSVLIQPLIPAETAGALYTYGNILGLYKWFTAFWLLATSGWAVCKKRPCCL AVLAGNSVFMCALVAGKMFPLHEPVLTGWFAELAGGVIILITGGIMIYDVDRTYKESLRL KMEQQLSQVQLEARARYGALQQEYIRGTRKQLHETRNRLTLIKHYLDTGETDKLKEYLKG LVSSTVDMENGEYTGHILMDSILAVELGTARSQGIYVEAEGDRLPGSLAVHDAHLTALLM 
NLLDNAIEACSRLPEDGEKWISLEIGRVGTGLTILCTNSSLPREEGRSTSKEDSRAHGYG MELMSQVVREYRGTFETTWYTDSFSARIFLPDAVKDGEL >gi|157101598|gb|DS480726.1| GENE 27 34689 - 36737 1018 682 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160942466|ref|ZP_02089772.1| ## NR: gi|160942466|ref|ZP_02089772.1| hypothetical protein CLOBOL_07349 [Clostridium bolteae ATCC BAA-613] hypothetical protein CLOBOL_07349 [Clostridium bolteae ATCC BAA-613] # 1 682 1 682 682 1233 100.0 0 ENGKGENPPSSDREDIEGENPEHPDSGQTGHNQEELENPGDIRQPEIQLQMIEIPLDEEV QCGTVEELLTLLETGEEICLTGDILVAGSVDLNIAPEDEAVVNMNGYSIVVSDKGNLFVD GPIRFQGAAGDKALFQISGKTTFKNGAGVWAEGDGAVAVEISETKNGKYALQTDNAYIGV SGDNSIALKWTGKGDCSLQYAHIEAEGENSVGIEADTPVELKLCLVEAEGSSVVTDKVLI VDTSQVTPLPEQSEVITREIYLDGRLDENGVSVLAGTDCQGLYDKLAPSASYVFCDPTGE RKMWTRDFTLNWKDLPEELSEPGVYRAMVEAEGYPDWFFIEIPEIEIEIYVVDPKTVHIE EIFRMEDMAVLYFLNPVEDAQRIELFCSADGGDTWQDADSMPGCSVQIDPLTITVEGLEF NHTYRFCVAVTGGSMEGLSSVKSFGYYENDSDRFGHGDRDGDDRDDQGDLPPWNSAPPPG DEFGGSGDPDSDGRGTAEADGTSGTSDGDGTLSLEPSAVKCAGIPEPCTADRIGTQRLST TAAADTQEPSAAAGSDPNGPLAADKIRRQEPSIEEGMAARDASGAGAGDRPGASGSALGE DGRAYGGGEAEDSFHEDGLRTGNLLEENATENREEPAGEGMDADSGRLQRDNPSGFIFRT VLGAAAGIAVLSWYKRRSRRKP